Mailing List Archive

Issues Migrating from 12.04 to 14.04 with resource-stickiness
Good Day/Evening All

I am in the process of migrating a cluster from Ubuntu 12.04 to 14.04.

The config below works on my existing 12.04 cluster, but on 14.04 all my resource primitives end up with an allocation score of -INFINITY. Any suggestions on what I am doing wrong would help.

node $id="740056600" fhl-r1atdbl1
node $id="740056601" fhl-r3atdbl1
node $id="740056606" fhl-r1atwebl1
node $id="740056607" fhl-r3atwebl1
node $id="740056608" fhl-r1atbl1
node $id="740056609" fhl-r3atbl1
primitive bamboovip ocf:heartbeat:IPaddr2 \
op monitor interval="60s" \
params ip="XXX" cidr_netmask="24"
primitive confluencevip ocf:heartbeat:IPaddr2 \
op monitor interval="60s" \
params ip="XXX" cidr_netmask="24"
primitive databasevip ocf:heartbeat:IPaddr2 \
op monitor interval="60s" \
params ip="XXX" cidr_netmask="24"
primitive glustervip ocf:heartbeat:IPaddr2 \
op monitor interval="60s" \
params ip="XXX" cidr_netmask="24"
primitive jiravip ocf:heartbeat:IPaddr2 \
op monitor interval="60s" \
params ip="XXX" cidr_netmask="24"
primitive nginx_web01 ocf:heartbeat:nginx \
op monitor timeout="60s" interval="10s" \
op start timeout="60s" interval="0" on-fail="restart" \
params port="81"
primitive nginx_web02 ocf:heartbeat:nginx \
op monitor timeout="60s" interval="10s" \
op start timeout="60s" interval="0" on-fail="restart" \
params port="81"
primitive stashvip ocf:heartbeat:IPaddr2 \
op monitor interval="60s" \
params ip="XXX" cidr_netmask="24"
location bamboovip_service_location_1 bamboovip 0: fhl-r1atwebl1
location bamboovip_service_location_2 bamboovip 200: fhl-r3atwebl1
location confluencevip_service_location_1 confluencevip 200: fhl-r1atwebl1
location confluencevip_service_location_2 confluencevip 0: fhl-r3atwebl1
location databasevip_service_location_1 databasevip 201: fhl-r1atdbl1
location databasevip_service_location_2 databasevip 200: fhl-r1atdbl1
location glustervip_service_location_1 glustervip 201: fhl-r1atbl1
location glustervip_service_location_2 glustervip 200: fhl-r3atbl1
location jiravip_service_location_1 jiravip 200: fhl-r1atwebl1
location jiravip_service_location_2 jiravip 0: fhl-r3atwebl1
location nginx_web01_service_location nginx_web01 100: fhl-r1atwebl1
location nginx_web02_service_location nginx_web02 100: fhl-r3atbl1
location stashvip_service_location_1 stashvip 0: fhl-r1atwebl1
location stashvip_service_location_2 stashvip 200: fhl-r3atwebl1
property $id="cib-bootstrap-options" \
dc-version="1.1.10-42f2063" \
cluster-infrastructure="corosync" \
default-resource-stickiness="100" \
no-quorum-policy="stop" \
stonith-enabled="false" \
symmetric-cluster="false"

I have tried setting the below, but it did not help:
rsc_defaults $id="rsc-options" \
resource-stickiness="INFINITY"
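As far as I can tell, default-resource-stickiness as a cluster property is deprecated in Pacemaker 1.1 in favour of rsc_defaults, so the value the policy engine actually uses can be checked with something like the below (assuming these query forms are right):

# query the resource default directly:
crm_attribute --type rsc_defaults --name resource-stickiness --query
# or show the rsc-options section as crmsh sees it:
crm configure show rsc-options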


# crm_simulate -L -s

Current cluster status:
Online: [ fhl-r1atbl1 fhl-r1atdbl1 fhl-r1atwebl1 fhl-r3atbl1 fhl-r3atdbl1 fhl-r3atwebl1 ]

bamboovip (ocf::heartbeat:IPaddr2): Stopped
confluencevip (ocf::heartbeat:IPaddr2): Stopped
nginx_web01 (ocf::heartbeat:nginx): Stopped
databasevip (ocf::heartbeat:IPaddr2): Stopped
jiravip (ocf::heartbeat:IPaddr2): Stopped
stashvip (ocf::heartbeat:IPaddr2): Stopped
nginx_web02 (ocf::heartbeat:nginx): Stopped
glustervip (ocf::heartbeat:IPaddr2): Stopped

Allocation scores:
native_color: bamboovip allocation score on fhl-r1atwebl1: -INFINITY
native_color: bamboovip allocation score on fhl-r3atwebl1: -INFINITY
native_color: confluencevip allocation score on fhl-r1atwebl1: -INFINITY
native_color: confluencevip allocation score on fhl-r3atwebl1: -INFINITY
native_color: nginx_web01 allocation score on fhl-r1atwebl1: -INFINITY
native_color: databasevip allocation score on fhl-r1atdbl1: -INFINITY
native_color: jiravip allocation score on fhl-r1atwebl1: -INFINITY
native_color: jiravip allocation score on fhl-r3atwebl1: -INFINITY
native_color: stashvip allocation score on fhl-r1atwebl1: -INFINITY
native_color: stashvip allocation score on fhl-r3atwebl1: -INFINITY
native_color: nginx_web02 allocation score on fhl-r3atbl1: -INFINITY
native_color: nginx_web02 allocation score on fhl-r3atwebl1: -INFINITY
native_color: glustervip allocation score on fhl-r1atbl1: -INFINITY
native_color: glustervip allocation score on fhl-r3atbl1: -INFINITY
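The full score table is noisy, so filtering it per resource can help (same --show-scores output as above, just grepped; sketch):

# show only one resource's scores from the live CIB:
crm_simulate -sL | grep jiravip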

[root@fhl-r3atbl1:/root]# crm_verify -LVVVV
info: main: =#=#=#=#= Getting XML =#=#=#=#=
info: main: Reading XML from: live cluster
info: validate_with_relaxng: Creating RNG parser context
info: determine_online_status: Node fhl-r1atbl1 is online
info: determine_online_status: Node fhl-r3atdbl1 is online
info: determine_online_status: Node fhl-r3atbl1 is online
info: determine_online_status: Node fhl-r1atwebl1 is online
info: determine_online_status: Node fhl-r3atwebl1 is online
info: determine_online_status: Node fhl-r1atdbl1 is online
info: native_print: bamboovip (ocf::heartbeat:IPaddr2): Stopped
info: native_print: confluencevip (ocf::heartbeat:IPaddr2): Stopped
info: native_print: nginx_web01 (ocf::heartbeat:nginx): Stopped
info: native_print: databasevip (ocf::heartbeat:IPaddr2): Stopped
info: native_print: jiravip (ocf::heartbeat:IPaddr2): Stopped
info: native_print: stashvip (ocf::heartbeat:IPaddr2): Stopped
info: native_print: nginx_web02 (ocf::heartbeat:nginx): Stopped
info: native_print: glustervip (ocf::heartbeat:IPaddr2): Stopped
info: get_failcount_full: nginx_web02 has failed INFINITY times on fhl-r3atwebl1
warning: common_apply_stickiness: Forcing nginx_web02 away from fhl-r3atwebl1 after 1000000 failures (max=1000000)
info: native_color: Resource bamboovip cannot run anywhere
info: native_color: Resource confluencevip cannot run anywhere
info: native_color: Resource nginx_web01 cannot run anywhere
info: native_color: Resource databasevip cannot run anywhere
info: native_color: Resource jiravip cannot run anywhere
info: native_color: Resource stashvip cannot run anywhere
info: native_color: Resource nginx_web02 cannot run anywhere
info: native_color: Resource glustervip cannot run anywhere
info: LogActions: Leave bamboovip (Stopped)
info: LogActions: Leave confluencevip (Stopped)
info: LogActions: Leave nginx_web01 (Stopped)
info: LogActions: Leave databasevip (Stopped)
info: LogActions: Leave jiravip (Stopped)
info: LogActions: Leave stashvip (Stopped)
info: LogActions: Leave nginx_web02 (Stopped)
info: LogActions: Leave glustervip (Stopped)
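The nginx_web02 ban above ("failed INFINITY times on fhl-r3atwebl1") should be clearable with something like the below (crmsh syntax, assuming I have it right), though that would only explain one of the eight resources:

# clear the fail history so the ban on fhl-r3atwebl1 is lifted:
crm resource cleanup nginx_web02 fhl-r3atwebl1
# check the remaining fail count:
crm resource failcount nginx_web02 show fhl-r3atwebl1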

I am running:
ii corosync 2.3.3-1ubuntu1
ii crmsh 1.2.5+hg1034-1ubuntu4
ii libcorosync-common4 2.3.3-1ubuntu1
ii pacemaker 1.1.10+git20130802-1ubuntu2.2
ii pacemaker-cli-utils 1.1.10+git20130802-1ubuntu2.2

I can make this work by setting:
symmetric-cluster="false"

And then pointing location constraints like the one below at the nodes where I do not want the resource to run:
location stashvip_service_location_2 stashvip 200: node_this_should_not_run_on
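For comparison, my understanding of opt-in clusters is that a resource should only need constraints on the nodes where it is allowed to run, which is what my config above already has, e.g.:

# expected opt-in form (names from my config); score 0 permits, 200 prefers:
location stashvip_service_location_1 stashvip 0: fhl-r1atwebl1
location stashvip_service_location_2 stashvip 200: fhl-r3atwebl1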

If I delete all my location constraints and set symmetric-cluster="false", all my VIPs start up, but on the wrong hosts.

I have seen this in the logs, but I am not sure what it means:
Feb 12 18:39:42 fhl-r3atdbl1 pengine[1552]: error: common_apply_stickiness: jiravip1[fhl-r1atdbl1] = -1000000
Feb 12 18:39:42 fhl-r3atdbl1 pengine[1552]: error: common_apply_stickiness: jiravip1[fhl-r3atdbl1] = -1000000
Feb 12 18:39:42 fhl-r3atdbl1 pengine[1552]: error: common_apply_stickiness: jiravip1[fhl-r3atbl1] = -1000000
Feb 12 18:39:42 fhl-r3atdbl1 pengine[1552]: error: common_apply_stickiness: jiravip1[fhl-r1atbl1] = -1000000
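Two things stand out to me here that I cannot explain: the logs name jiravip1, which does not appear in my config above, and -1000000 is, as far as I know, how Pacemaker represents -INFINITY internally. The fail count for that name can be queried with something like (crmsh syntax assumed):

# query the fail count for the resource named in the logs:
crm resource failcount jiravip1 show fhl-r1atdbl1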

If you need more info, please let me know. I have been yak shaving all day and I'm sure it's something simple :(

Kind Regards
Merritt

Re: Issues Migrating from 12.04 to 14.04 with resource-stickiness
It looks like jiravip1 has failed in a lot of places.
Is this the complete configuration? Given the behaviour, I would have expected some colocation constraints.
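For example, if the VIPs are meant to follow the nginx instances, something along these lines (hypothetical names):

colocation jiravip_with_web01 inf: jiravip nginx_web01
order jiravip_after_web01 inf: nginx_web01 jiravip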

Also, do you understand what symmetric-cluster="false" does?
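To spell it out: with symmetric-cluster="false" the cluster is opt-in, so every resource scores -INFINITY on every node by default and can only run where a location constraint explicitly allows it (a score of 0 is enough to permit); with "true", the default, resources may run anywhere they are not explicitly banned. Roughly (sketch, not your exact config):

# opt-out (default): allowed everywhere, ban where needed
property symmetric-cluster="true"
location ban-web nginx_web01 -inf: fhl-r3atdbl1
# opt-in: banned everywhere, allow where needed
property symmetric-cluster="false"
location allow-web nginx_web01 0: fhl-r1atwebl1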

> On 13 Feb 2015, at 4:29 am, Krakowitzer, Merritt <mkrakowitzer@fnb.co.za> wrote:
> [snip]


_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org