Mailing List Archive

replication, high-availability, and load balancing
Hello Masters and Gurus,

I have been migrating a single-server-with-hot-spare system to a
dual-server-autofailover/cluster system. So far I have a glusterfs to
store data, and a samba file server, dovecot imap, haproxy, mysql, smtp,
etc, all working well in production. Servers work well in unison, and
if I shut off one server the other keeps working and nobody can tell;
which means I don't have to be there in case something goes wrong. The
first step in reverse engineering my evil plan to get some actual
vacation time.

The next service I need to get working on this system is calendars.

I was surprised to find out that postgres doesn't support master-master
replication natively, at least not in the context that mysql does. I
have found possible alternative solutions, such as postgres-BDR,
postgres-LX, and Bucardo, but it starts to look like I may run into
limitations with regard to unique keys, or specifically the lack
thereof, in some of the davical tables in the case conflict resolution
becomes necessary.

While I was digging around, it occurred to me that maybe someone has
already invented this wheel and might be willing to share some insight?
Would actually be interested to hear any thoughts at all on the topic of
migrating into this new clustered-server setup...



--
Bob Miller
Cell: 867-334-7117
Office: 867-633-3760
www.computerisms.ca

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Davical-general mailing list
Davical-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/davical-general
Re: replication, high-availability, and load balancing [ In reply to ]
Hi Folks,

not sure if I can say I have mastered this, but I did get it working, so
thought I would share my notes on how I did it. no promises there
aren't some small mistakes in these notes; if you follow these notes and
find any, I am sure those who follow you would like to know what they are.


-I use gluster shared storage to have a single config to cross 2 servers:
vi /sharedstorage/configs/pg_hba.conf
<pre>
local all postgres peer
local all bucardo md5
local all postgres md5
local all all peer
host all all 127.0.0.1/32 md5
host all postgres 192.168.120.70/32 md5
host all postgres 192.168.120.80/32 md5
host all bucardo 192.168.120.70/32 md5
host all bucardo 192.168.120.80/32 md5
host all all ::1/128 md5
</pre>


-On both servers:
apt-get install bucardo
*deal with any postgres upgrades that happened
cd /etc/postgresql/9.6/main/ && mv pg_hba.conf pg_hba.conf.orig
ln -s /sharedstorage/configs/pg_hba.conf .
vi postgresql.conf ## Set listen_addresses = '*'
/etc/init.d/postgresql restart

-On primary server(.70):
chgrp postgres /etc/bucardorc
su - postgres
bucardo install
*set:
1. Host: /var/run/postgresql
2. Port: 5432 <= note: check conf file
3. User: postgres
4. Database: postgres
5. PID directory: /var/run/bucardo
*Press P to install it.


su - postgres
psql
alter user postgres with password 'SomethingRidiculouslyHard';
alter user bucardo with password 'SomethingElse';
\q
exit



-On secondary server(.80):
*note this echo spans 4 lines:
echo '192.168.120.70:*:postgres:SomethingRidiculouslyHard
192.168.120.70:*:bucardo:SomethingElse
192.168.120.80:*:postgres:SomethingRidiculouslyHard
192.168.120.80:*:bucardo:SomethingElse' > /var/lib/postgresql/.pgpass
chown postgres:postgres /var/lib/postgresql/.pgpass
chmod 0600 /var/lib/postgresql/.pgpass
export PGPASSFILE='/var/lib/postgresql/.pgpass'
su - postgres
psql -h 192.168.120.70 -p 5432 -U postgres
postgres=# \q <== success
psql -h 192.168.120.70 -p 5432 -U bucardo
postgres=# \q <== success
psql
alter user postgres with password 'SomethingRidiculouslyHard';
\q
exit


-On primary server(.70):
*note this echo spans 4 lines:
echo '192.168.120.70:*:postgres:SomethingRidiculouslyHard
192.168.120.70:*:bucardo:SomethingElse
192.168.120.80:*:postgres:SomethingRidiculouslyHard
192.168.120.80:*:bucardo:SomethingElse' > /var/lib/postgresql/.pgpass

chown postgres:postgres /var/lib/postgresql/.pgpass
chmod 0600 /var/lib/postgresql/.pgpass
export PGPASSFILE='/var/lib/postgresql/.pgpass'
su - postgres
psql -h 192.168.120.80 -p 5432 -U postgres
postgres=# \q <==Success
exit
cp -ra /var/lib/postgresql/.pgpass /var/lib/bucardo/
chown bucardo:bucardo /var/lib/bucardo/.pgpass
vi /etc/bucardorc
<pre>
dbport = 5432
dbhost = 192.168.120.70
dbname = bucardo
dbuser = bucardo
#dbpass = bucardo <== Make sure this is commented
<pre>


**Still on primary server, time for the real magic:
**Modify the davical database to be bucardo friendly:
##Table access_ticket - pkey text field. no action, potential
conflicts here
# no action
##Table addressbook_address_adr - add repl_id
su -c "psql -d davical -c \"alter table addressbook_address_adr add
column repl_id serial;\"" postgres
su -c "psql -d davical -c \"alter sequence
addressbook_address_adr_repl_id_seq restart with 100000 increment by
10;\"" postgres
su -c "psql -d davical -c \"update addressbook_address_adr set repl_id
= DEFAULT;\"" postgres
##Table addressbook_address_email - add repl_id
su -c "psql -d davical -c \"alter table addressbook_address_email add
column repl_id serial;\"" postgres
su -c "psql -d davical -c \"alter sequence
addressbook_address_email_repl_id_seq restart with 100000 increment by
10;\"" postgres
su -c "psql -d davical -c \"update addressbook_address_email set
repl_id = DEFAULT;\"" postgres
##Table addressbook_address_tel - add repl_id
su -c "psql -d davical -c \"alter table addressbook_address_tel add
column repl_id serial;\"" postgres
su -c "psql -d davical -c \"alter sequence
addressbook_address_tel_repl_id_seq restart with 100000 increment by
10;\"" postgres
su -c "psql -d davical -c \"update addressbook_address_tel set repl_id
= DEFAULT;\"" postgres
##Table addressbook_resource - modify dav_id seq instead
su -c "psql -d davical -c \"alter sequence dav_id_seq restart with
100000 increment by 10;\"" postgres
##Table awl_db_revision - add repl_id
su -c "psql -d davical -c \"alter table awl_db_revision add column
repl_id serial;\"" postgres
su -c "psql -d davical -c \"alter sequence awl_db_revision_repl_id_seq
restart with 100000 increment by 10;\"" postgres
su -c "psql -d davical -c \"update awl_db_revision set repl_id =
DEFAULT;\"" postgres
##Table caldav_data - modify user_no seq instead
su -c "psql -d davical -c \"alter sequence usr_user_no_seq restart
with 100000 increment by 10;\"" postgres
##Table calendar_alarm - no pkey, but has dav_id. dav_id is unique? not
required. needs repl_id
su -c "psql -d davical -c \"alter table calendar_alarm add column
repl_id serial;\"" postgres
su -c "psql -d davical -c \"alter sequence calendar_alarm_repl_id_seq
restart with 100000 increment by 10;\"" postgres
su -c "psql -d davical -c \"update calendar_alarm set repl_id =
DEFAULT;\"" postgres
##Table calendar_attendee - modify dav_id seq instead
#done
##Table calendar_attendee_email_status - has pkey, not seq. static
table? no action, conflicts possible but unlikely?
#no action
##Table calendar_item - modify user_no seq instead
#done
##Table collection - pkey is collection_id, and that uses dav_id seq.
modify dav_id seq instead
#done
##Table dav_binding - modify dav_id seq instead
#done
##Table freebusy_ticket - pkey is a text field. no action, potential
conflicts here
#no action
##Table grants - add repl_id
su -c "psql -d davical -c \"alter table grants add column repl_id
serial;\"" postgres
su -c "psql -d davical -c \"alter sequence grants_repl_id_seq restart
with 100000 increment by 10;\"" postgres
su -c "psql -d davical -c \"update grants set repl_id = DEFAULT;\""
postgres
##Table group_member - no pkey, convert unique key to a pkey? try it.
su -c "psql -d davical -c \"drop index group_member_pk;\"" postgres
su -c "psql -d davical -c \"alter table group_member add constraint
group_member_pk primary key (group_id, member_id);\"" postgres
##Table locks - no pkey, convert unique text constraint to pkey? try it.
su -c "psql -d davical -c \"alter table locks drop constraint
locks_opaquelocktoken_key;\"" postgres
su -c "psql -d davical -c \"alter table locks add constraint
locks_opaquelocktoken_key primary key (opaquelocktoken);\"" postgres
##Table principal - modify dav_id instead
#done
##Table principal_type - modify principal_type_id seq instead
su -c "psql -d davical -c \"alter sequence
principal_type_principal_type_id_seq restart with 100000 increment by
10;\"" postgres
##Table property - pkey has 2 text values. No action, possible source
of conflicts
#no action
##Table relationship - has pkey, rt_id turns out to be a sequence,
modify that instead.
su -c "psql -d davical -c \"alter sequence relationship_type_rt_id_seq
restart with 100000 increment by 10;\"" postgres
##Table relationship_type - modify rt_id seq instead.
#done
##Table role_member - no pkey, needs repl_id
su -c "psql -d davical -c \"alter table role_member add column repl_id
serial;\"" postgres
su -c "psql -d davical -c \"alter sequence role_member_repl_id_seq
restart with 100000 increment by 10;\"" postgres
su -c "psql -d davical -c \"update role_member set repl_id =
DEFAULT;\"" postgres
##Table roles - modify role_no seq instead
su -c "psql -d davical -c \"alter sequence roles_role_no_seq restart
with 100000 increment by 10;\"" postgres
##Table session - modify session_id seq instead
su -c "psql -d davical -c \"alter sequence session_session_id_seq
restart with 100000 increment by 10;\"" postgres
##Table supported_locales - pkey is text, static table? no action,
conflicts possible but unlikely?
#no action
##Table sync_changes - add repl_id
su -c "psql -d davical -c \"alter table sync_changes add column
repl_id serial;\"" postgres
su -c "psql -d davical -c \"alter sequence sync_changes_repl_id_seq
restart with 100000 increment by 10;\"" postgres
su -c "psql -d davical -c \"update sync_changes set repl_id =
DEFAULT;\"" postgres
##Table sync_tokens - modify sync_token seq instead
su -c "psql -d davical -c \"alter sequence sync_tokens_sync_token_seq
restart with 100000 increment by 10;\"" postgres
##Table timezones - modify our_tzno seq instead
su -c "psql -d davical -c \"alter sequence timezones_our_tzno_seq
restart with 100000 increment by 10;\"" postgres
##Table tmp_password - add repl_id
su -c "psql -d davical -c \"alter table tmp_password add column
repl_id serial;\"" postgres
su -c "psql -d davical -c \"alter sequence tmp_password_repl_id_seq
restart with 100000 increment by 10;\"" postgres
su -c "psql -d davical -c \"update tmp_password set repl_id =
DEFAULT;\"" postgres
##Table tz_aliases - add repl_id
su -c "psql -d davical -c \"alter table tz_aliases add column repl_id
serial;\"" postgres
su -c "psql -d davical -c \"alter sequence tz_aliases_repl_id_seq
restart with 100000 increment by 10;\"" postgres
su -c "psql -d davical -c \"update tz_aliases set repl_id =
DEFAULT;\"" postgres
##Table tz_localnames - needs repl_id
su -c "psql -d davical -c \"alter table tz_localnames add column
repl_id serial;\"" postgres
su -c "psql -d davical -c \"alter sequence tz_localnames_repl_id_seq
restart with 100000 increment by 10;\"" postgres
su -c "psql -d davical -c \"update tz_localnames set repl_id =
DEFAULT;\"" postgres
##Table usr - modify user_no seq instead
#done
##Table usr_setting - modify user_no seq instead
#done


**STill on primary server, setup the replicated database:
su -c "pg_dump -d davical --schema-only > /tmp/davical.psql" postgres

su -c "psql -h 192.168.120.80 -d postgres -U postgres -c \"create user
davical_dba\";" postgres

su -c "psql -h 192.168.120.80 -d postgres -U postgres -c \"create user
davical_app\";" postgres

su -c "psql -h 192.168.120.80 -d postgres -U postgres -c \"create
database davical owner=davical_dba\";" postgres

su -c "psql \"host=192.168.120.80 dbname=davical user=postgres\" -f
/tmp/davical.psql" postgres

su -c "bucardo add db davical_chief dbname=davical dbuser=postgres
dbhost=192.168.120.70" bucardo

su -c "bucardo add db davical_council dbname=davical dbuser=postgres
dbhost=192.168.120.80" bucardo

su -c "bucardo add dbgroup davical_bucardogroup davical_chief:source
davical_council:source" bucardo

su -c "bucardo add sync davical_sync tables=all
dbs=davical_bucardogroup" bucardo

su -c "pg_dump -U postgres -h 192.168.120.70 --data-only -N bucardo
davical | PGOPTIONS='-c session_replication_role=replica' psql -U
postgres -h 192.168.120.80 -d davical" postgres

for k in addressbook_address_adr_repl_id_seq
addressbook_address_email_repl_id_seq
addressbook_address_tel_repl_id_seq dav_id_seq
awl_db_revision_repl_id_seq usr_user_no_seq calendar_alarm_repl_id_seq
grants_repl_id_seq principal_type_principal_type_id_seq
relationship_type_rt_id_seq role_member_repl_id_seq roles_role_no_seq
session_session_id_seq sync_changes_repl_id_seq
sync_tokens_sync_token_seq timezones_our_tzno_seq
tmp_password_repl_id_seq tz_aliases_repl_id_seq
tz_localnames_repl_id_seq metrics_count_acl metrics_count_bind
metrics_count_delete metrics_count_delticket metrics_count_get
metrics_count_head metrics_count_lock metrics_count_mkcalendar
metrics_count_mkcol metrics_count_mkticket metrics_count_move
metrics_count_post metrics_count_propfind metrics_count_proppatch
metrics_count_put metrics_count_report metrics_count_unknown
metrics_count_unlock metrics_count_options; do su -c "psql -h
192.168.120.80 -d davical -c \"alter sequence $k restart with 100001
increment by 10;\"" postgres; done

mkdir /var/run/bucardo
chown bucardo:bucardo /var/run/bucardo/
su -c "bucardo start" bucardo

**Troubleshooting
Every once in a while, bucardo stops syncing the databases and I need to
run a bucardo kick to get it going again. The symptom is that events
are disappearing in the client, but in reality, the client is connecting
to the server that doesn't hold the events. This can be seen by running
select count(*) from calendar_item on both servers and the numbers
returned will be different. It is on my list to script this check and
run the bucardo kick automatically, but it hasn't been enough of a
problem that it has prioritized itself to the top of my todo list.

**In production
I used this to migrate all the calendars on my hosting server to a
gluster-based highly-available/load balanced/fail over system, but in
the end gluster just could not cut it for a busy webserver type of
environment. but I did get about 2 months out of it before I gave up on
a pair of servers with well over 100 domains; and the postgres
replication was certainly not a deciding factor in having to give it up,
nor was davical as one doesn't see the slowness in the same way you see
it with web pages.

I have also got this on a client's cluster of business servers, along
with Samba AD and a bunch of other things (where gluster is running
successfully, I might add). I only moved davical from a stand-alone VM
onto 2 of the clustered servers a couple weeks ago, but so far it has
been smooth sailing, no reported issues from the end-users and no
problems server side to date. The configuration above should allow in
theory up to 10 servers in the highly-available cluster by adjusting the
offsets in subsequent servers.

**More notes:
-You can put /usr/share/davical on a gluster volume. Based on my
experience, I wouldn't personally do that on a busy web server unless
the server was only serving davical, but mileage might vary. I would
expect a link to a customized index.php or something wouldn't be a
problem though.
-I do keep the /etc/davcial folder linked to a folder on the gluster
volume, though, means I only have to update one file to update all servers.
-One limitation to this setup is that bucardo itself is not replicated.
This won't stop auto-failover from happening the case of 2 servers and
the bucardo server dying; rebuild/restore and when it comes online sync
it and away it goes. But in the case of 3 or more servers, if the
server with bucardo fails, the others will not stay in sync while you
rebuild/restore.
-The servers these notes are based on are behind firewalls and in a
somewhat trusted environment, I will leave it as an exercise to the
reader to harden the webserver and postgres configs for internet-facing
servers, and I might suggest something like libreswan for inter-server
communication.

Q'apla!

On 2017-10-26 1:53 p.m., Computerisms Corporation wrote:
> Hello Masters and Gurus,
>
> I have been migrating a single-server-with-hot-spare system to a
> dual-server-autofailover/cluster system.  So far I have a glusterfs to
> store data, and a samba file server, dovecot imap, haproxy, mysql, smtp,
> etc, all working well in production.  Servers work well in unison, and
> if I shut off one server the other keeps working and nobody can tell;
> which means I don't have to be there in case something goes wrong.  The
> first step in reverse engineering my evil plan to get some actual
> vacation time.
>
> The next service I need to get working on this system is calendars.
>
> I was surprised to find out that postgres doesn't support master-master
> replication natively, at least not in the context that mysql does.  I
> have found possible alternative solutions, such as postgres-BDR,
> postgres-LX, and Bucardo, but it starts to look like I may run into
> limitations with regard to unique keys, or specifically the lack
> thereof, in some of the davical tables in the case conflict resolution
> becomes necessary.
>
> While I was digging around, it occurred to me that maybe someone has
> already invented this wheel and might be willing to share some insight?
> Would actually be interested to hear any thoughts at all on the topic of
> migrating into this new clustered-server setup...
>
>
>


_______________________________________________
Davical-general mailing list
Davical-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/davical-general