Mailing List Archive: Better way to change master in 3 node pgsql cluster

Hi

I have 3 node postgresql cluster.
It work well. But I have some trobule with change master.

For now, if I need change master, I must:
1) Stop PGSQL on each node and cluster service
2) Start Setup new manual PGSQL replication
3) Change attributes on each node for point to new master
4) Stop PGSQL on each node
5) Celanup resource and start cluster service

It take a lot of time. Is it exist better way to change master?

This is my cluster service status:
Node Attributes:
* Node a.geocluster.e-autopay.com:
+ master-pgsql:0 : 1000
+ pgsql-data-status : LATEST
+ pgsql-master-baseline : 000000002F000090
+ pgsql-status : PRI
* Node c.geocluster.e-autopay.com:
+ master-pgsql:0 : 1000
+ pgsql-data-status : SYNC
+ pgsql-status : STOP
* Node b.geocluster.e-autopay.com:
+ master-pgsql:0 : 1000
+ pgsql-data-status : SYNC
+ pgsql-status : STOP

I was use http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster for my 3
nodes cluster without hard stik.
Now I got strange situation all nodes stay slave:
============
Last updated: Sat Dec 7 04:33:47 2013
Last change: Sat Dec 7 12:56:23 2013 via crmd on a
Stack: openais
Current DC: c - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
5 Nodes configured, 3 expected votes
4 Resources configured.
============

Online: [ a c b ]

Master/Slave Set: msPostgresql [pgsql]
Slaves: [ a c b ]

My config is:
node a \
attributes pgsql-data-status="DISCONNECT"
node b \
attributes pgsql-data-status="DISCONNECT"
node c \
attributes pgsql-data-status="DISCONNECT"
primitive pgsql ocf:heartbeat:pgsql \
params pgctl="/usr/lib/postgresql/9.3/bin/pg_ctl" psql="/usr/bin/psql"
pgdata="/var/lib/postgresql/9.3/main" start_opt="-p 5432" rep_mode="sync"
node_list="a b c" restore_command="cp /var/lib/postgresql/9.3/pg_archive/%f
%p" master_ip="192.168.10.200" restart_on_promote="true"
config="/etc/postgresql/9.3/main/postgresql.conf" \
op start interval="0s" timeout="60s" on-fail="restart" \
op monitor interval="4s" timeout="60s" on-fail="restart" \
op monitor interval="3s" role="Master" timeout="60s" on-fail="restart" \
op promote interval="0s" timeout="60s" on-fail="restart" \
op demote interval="0s" timeout="60s" on-fail="stop" \
op stop interval="0s" timeout="60s" on-fail="block" \
op notify interval="0s" timeout="60s"
primitive pgsql-master-ip ocf:heartbeat:IPaddr2 \
params ip="192.168.10.200" nic="peervpn0" \
op start interval="0s" timeout="60s" on-fail="restart" \
op monitor interval="10s" timeout="60s" on-fail="restart" \
op stop interval="0s" timeout="60s" on-fail="block"
group master pgsql-master-ip
ms msPostgresql pgsql \
meta master-max="1" master-node-max="1" clone-max="3" clone-node-max="1"
notify="true"
colocation set_ip inf: master msPostgresql:Master
order ip_down 0: msPostgresql:demote master:stop symmetrical=false
order ip_up 0: msPostgresql:promote master:start symmetrical=false
property $id="cib-bootstrap-options" \
dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
cluster-infrastructure="openais" \
expected-quorum-votes="3" \
no-quorum-policy="ignore" \
stonith-enabled="false" \
crmd-transition-delay="0" \
last-lrm-refresh="1386404222"
rsc_defaults $id="rsc-options" \
resource-stickiness="100" \
migration-threshold="1"
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

On 13 Jan 2014, at 8:32 pm, Andrey Rogovsky <a.rogovsky@gmail.com> wrote:

> Hi
>
> I have 3 node postgresql cluster.
> It work well. But I have some trobule with change master.
>
> For now, if I need change master, I must:
> 1) Stop PGSQL on each node and cluster service
> 2) Start Setup new manual PGSQL replication
> 3) Change attributes on each node for point to new master
> 4) Stop PGSQL on each node
> 5) Celanup resource and start cluster service
>
> It take a lot of time. Is it exist better way to change master?

Newer versions support:

crm_resource --resource msPostgresql --ban --master --host a.geocluster.e-autopay.com

>
>
>
> This is my cluster service status:
> Node Attributes:
> * Node a.geocluster.e-autopay.com:
> + master-pgsql:0 : 1000
> + pgsql-data-status : LATEST
> + pgsql-master-baseline : 000000002F000090
> + pgsql-status : PRI
> * Node c.geocluster.e-autopay.com:
> + master-pgsql:0 : 1000
> + pgsql-data-status : SYNC
> + pgsql-status : STOP
> * Node b.geocluster.e-autopay.com:
> + master-pgsql:0 : 1000
> + pgsql-data-status : SYNC
> + pgsql-status : STOP
>
> I was use http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster for my 3
> nodes cluster without hard stik.
> Now I got strange situation all nodes stay slave:
> ============
> Last updated: Sat Dec 7 04:33:47 2013
> Last change: Sat Dec 7 12:56:23 2013 via crmd on a
> Stack: openais
> Current DC: c - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 5 Nodes configured, 3 expected votes
> 4 Resources configured.
> ============
>
> Online: [ a c b ]
>
> Master/Slave Set: msPostgresql [pgsql]
> Slaves: [ a c b ]
>
> My config is:
> node a \
> attributes pgsql-data-status="DISCONNECT"
> node b \
> attributes pgsql-data-status="DISCONNECT"
> node c \
> attributes pgsql-data-status="DISCONNECT"
> primitive pgsql ocf:heartbeat:pgsql \
> params pgctl="/usr/lib/postgresql/9.3/bin/pg_ctl" psql="/usr/bin/psql"
> pgdata="/var/lib/postgresql/9.3/main" start_opt="-p 5432" rep_mode="sync"
> node_list="a b c" restore_command="cp /var/lib/postgresql/9.3/pg_archive/%f
> %p" master_ip="192.168.10.200" restart_on_promote="true"
> config="/etc/postgresql/9.3/main/postgresql.conf" \
> op start interval="0s" timeout="60s" on-fail="restart" \
> op monitor interval="4s" timeout="60s" on-fail="restart" \
> op monitor interval="3s" role="Master" timeout="60s" on-fail="restart" \
> op promote interval="0s" timeout="60s" on-fail="restart" \
> op demote interval="0s" timeout="60s" on-fail="stop" \
> op stop interval="0s" timeout="60s" on-fail="block" \
> op notify interval="0s" timeout="60s"
> primitive pgsql-master-ip ocf:heartbeat:IPaddr2 \
> params ip="192.168.10.200" nic="peervpn0" \
> op start interval="0s" timeout="60s" on-fail="restart" \
> op monitor interval="10s" timeout="60s" on-fail="restart" \
> op stop interval="0s" timeout="60s" on-fail="block"
> group master pgsql-master-ip
> ms msPostgresql pgsql \
> meta master-max="1" master-node-max="1" clone-max="3" clone-node-max="1"
> notify="true"
> colocation set_ip inf: master msPostgresql:Master
> order ip_down 0: msPostgresql:demote master:stop symmetrical=false
> order ip_up 0: msPostgresql:promote master:start symmetrical=false
> property $id="cib-bootstrap-options" \
> dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
> cluster-infrastructure="openais" \
> expected-quorum-votes="3" \
> no-quorum-policy="ignore" \
> stonith-enabled="false" \
> crmd-transition-delay="0" \
> last-lrm-refresh="1386404222"
> rsc_defaults $id="rsc-options" \
> resource-stickiness="100" \
> migration-threshold="1"
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems

Mailing List Archive

Mailing List Archive

Attached Files: