Mailing List Archive

Master-Slave role stickiness
Hi,

I've got a master-slave resource and I'd like to achieve the following
behavior with it:

* Only ever run (as master or slave) on 2 specific nodes (out of N
possible nodes). These nodes are predetermined and are specified at
resource creation time.
* Prefer one specific node (of the 2 selected for running the resource)
for starting in the Master role.
* Upon failover event, promote the secondary node to master.
* Do not re-promote the failed node back to master, should it come back
online.

The last requirement is the one I'm currently struggling with. I can
force the resource to run on only the 2 nodes I want (out of 3 possible
nodes), but I can't get it to "stick" on the secondary node as master
after a failover and recovery. That is, when I take the original
master offline, the resource promotes correctly on the secondary, but if
I bring the origin node back online, the resource is demoted on the
secondary and promotes back to master on the origin. I'd like to avoid
that last bit.

Here are the relevant bits of my CRM configuration:

primitive NIMHA-01 ocf:heartbeat:nimha \
op start interval="0" timeout="60s" \
op monitor interval="30s" role="Master" \
op stop interval="0" timeout="60s" \
op monitor interval="45s" role="Slave"
ms NIMMS-01 NIMHA-01 \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started" is-managed="true"
location prefer-elmy-inf NIMMS-01 5: elmyra
location prefer-elmyra-ms NIMMS-01 \
rule $id="prefer-elmyra-rule" $role="Master" 10: #uname eq elmyra
location prefer-pres-inf NIMMS-01 5: president
location prefer-president-ms NIMMS-01 \
rule $id="prefer-president-rule" $role="Master" 5: #uname eq president
property $id="cib-bootstrap-options" \
dc-version="1.1.10-42f2063" \
cluster-infrastructure="corosync" \
stonith-enabled="false" \
no-quorum-policy="ignore" \
last-lrm-refresh="1421798334" \
default-resource-stickiness="200" \
symmetric-cluster="false"


I've set symmetric-cluster="false" to achieve an "opt-in" behavior, per
the Pacemaker docs. From my understanding, these location constraints
should allow the resource to run on the two nodes, preferring 'elmyra'
initially as Master. My question then becomes: is there a way to apply
stickiness to the Master role? I've tried adding explicit stickiness
settings (high numbers and INFINITY) to default-resource-stickiness, to
the actual "ms" resource, and to the primitive, all to no avail.
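
For reference, these are roughly the commands I used for the explicit
attempts (re-typed from memory, so treat them as a sketch rather than
exact history):

```shell
# cluster-wide default via rsc_defaults (the non-deprecated route):
crm configure rsc_defaults resource-stickiness=200

# per-resource meta attribute, set directly on the ms resource:
crm_resource --resource NIMMS-01 --meta \
    --set-parameter resource-stickiness --parameter-value INFINITY
```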

Does anyone have any ideas on how to achieve stickiness on the Master
role in such a configuration?

Thanks for any and all help in advance,

brook

PS: Please ignore/forgive the no-quorum-policy and stonith-enabled
settings in my configuration... I know they're not best practice. I
don't think they should affect the answer to the above question, though,
based on my understanding of the system.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: Master-Slave role stickiness
On Wed, Jan 21, 2015 at 11:06 PM, brook davis <brook.davis@nimboxx.com> wrote:
> Hi,
>
> I've got a master-slave resource and I'd like to achieve the following
> behavior with it:
>
> * Only ever run (as master or slave) on 2 specific nodes (out of N possible
> nodes). These nodes are predetermined and are specified at resource
> creation time.
> * Prefer one specific node (of the 2 selected for running the resource) for
> starting in the Master role.
> * Upon failover event, promote the secondary node to master.
> * Do not re-promote the failed node back to master, should it come back
> online.
>
> The last requirement is the one I'm currently struggling with. I can force
> the resource to run on only the 2 nodes I want (out of 3 possible nodes),
> but I can't get it to "stick" on the secondary node as master after a
> failover and recovery. That is, when I take the original master offline,
> the resource promotes correctly on the secondary, but if I bring the origin
> node back online, the resource is demoted on the secondary and promotes back
> to master on the origin. I'd like to avoid that last bit.
>

It sounds like default-resource-stickiness does not kick in; with the
default resource-stickiness=1 that is expected (10 > 6). The
documentation says default-resource-stickiness is deprecated, so maybe
it is ignored in your version altogether? What does "ptest -L -s" show?

> < snip >

Re: Master-Slave role stickiness
< snip >
> It sounds like default-resource-stickiness does not kick in; with the
> default resource-stickiness=1 that is expected (10 > 6). The
> documentation says default-resource-stickiness is deprecated, so maybe
> it is ignored in your version altogether? What does "ptest -L -s" show?

I see now that default-resource-stickiness has been marked deprecated.
Thanks for the tip on ptest, that's helpful... though it looks like the
Ubuntu 14.04 I'm using ships with crm_simulate instead, so I'm using
that.

I've seemingly managed to set the default stickiness using the
crm_attribute command, and also set it in the resource defaults section,
as you can see in my updated config here:

root@elmyra:~# crm configure show
node $id="168430537" elmyra \
attributes standby="off"
node $id="168430539" president \
attributes standby="off" maintenance="off"
primitive NIMHA-01 ocf:heartbeat:nimha \
op start interval="0" timeout="60s" \
op monitor interval="30s" role="Master" \
op stop interval="0" timeout="60s" \
op monitor interval="45s" role="Slave"
ms NIMMS-01 NIMHA-01 \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started" is-managed="true"
location prefer-elmy-inf NIMMS-01 5: elmyra
location prefer-elmyra-ms NIMMS-01 \
rule $id="prefer-elmyra-rule" $role="Master" 10: #uname eq elmyra
location prefer-pres-inf NIMMS-01 5: president
location prefer-president-ms NIMMS-01 \
rule $id="prefer-president-rule" $role="Master" 5: #uname eq president
property $id="cib-bootstrap-options" \
dc-version="1.1.10-42f2063" \
cluster-infrastructure="corosync" \
stonith-enabled="false" \
no-quorum-policy="ignore" \
last-lrm-refresh="1421964175" \
default-resource-stickiness="200" \
symmetric-cluster="false"
rsc_defaults $id="rsc_defaults-options" \
resource-stickiness="200"
root@elmyra:~#


And here's the output of ptest/crm_simulate:

root@elmyra:~# crm_simulate -L -s

Current cluster status:
Online: [ elmyra president ]

Master/Slave Set: NIMMS-01 [NIMHA-01]
Masters: [ elmyra ]
Slaves: [ president ]

Allocation scores:
clone_color: NIMMS-01 allocation score on elmyra: 5
clone_color: NIMMS-01 allocation score on president: 5
clone_color: NIMHA-01:0 allocation score on elmyra: 205
clone_color: NIMHA-01:0 allocation score on president: 5
clone_color: NIMHA-01:1 allocation score on elmyra: 5
clone_color: NIMHA-01:1 allocation score on president: 205
native_color: NIMHA-01:0 allocation score on elmyra: 205
native_color: NIMHA-01:0 allocation score on president: 5
native_color: NIMHA-01:1 allocation score on elmyra: -INFINITY
native_color: NIMHA-01:1 allocation score on president: 205
NIMHA-01:0 promotion score on elmyra: 14
NIMHA-01:1 promotion score on president: 9

Transition Summary:
root@elmyra:~#


So, am I correct in my assessment that stickiness does not apply to the
promotion score? The 200 value I set for the default resource
stickiness seems to be taking effect. I'm not sure I entirely
understand the scoring, though, or at least the way crm_simulate is
representing it.
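
If I'm reading the numbers right (this is just my own interpretation of
the output, so take it with a grain of salt), the allocation scores
combine the location score with the stickiness, while the promotion
scores (14 and 9) don't include the 200 anywhere:

```python
# Sanity check of my reading of the crm_simulate scores above.
# Assumption: an instance's allocation score on the node it is already
# running on is the plain location score plus the stickiness.
location_score = 5   # from prefer-elmy-inf / prefer-pres-inf
stickiness = 200     # from rsc_defaults resource-stickiness

# 5 + 200 = 205, matching NIMHA-01:0 on elmyra (and :1 on president)
allocation_on_current_node = location_score + stickiness
print(allocation_on_current_node)  # -> 205
```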

Any insights, ideas, thoughts, help would be much appreciated.

Thanks,

brook




Re: Master-Slave role stickiness
> On 23 Jan 2015, at 9:13 am, brook davis <brook.davis@nimboxx.com> wrote:
>
> < snip >
>> It sounds like default-resource-stickiness does not kick in; with the
>> default resource-stickiness=1 that is expected (10 > 6). The
>> documentation says default-resource-stickiness is deprecated, so maybe
>> it is ignored in your version altogether? What does "ptest -L -s" show?
>
> < snip >
> So, am I correct in my assessment that stickiness does not apply to the promotion score?

You are correct for the version you have, but I'm reasonably sure
stickiness does apply to the promotion score in later versions.
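
If upgrading isn't an option, one possible workaround (just a sketch,
and it assumes you can modify the resource agent) is to have the agent
itself bump the master preference on whichever node currently holds the
Master role, so a recovered node can't outbid it:

```shell
# Hypothetical sketch: these run from inside the resource agent's
# actions (crm_master needs the resource context the RA environment
# provides). -l sets the attribute lifetime, -v the value, -D deletes.

# in the promote action: claim a high master preference on this node
crm_master -l reboot -v 100

# in the demote/stop action: drop the preference again
crm_master -l reboot -D
```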


