Mailing List Archive

Intermittent Failovers: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
Hey Team,

I'm seeing strange intermittent failovers on a two-node cluster, roughly once every week or two. When this happens, both nodes are unavailable: one node is marked offline and the other is shown as unclean. Any help on this would be massively appreciated. Thanks.

Running Ubuntu 12.04 (64-bit)
Pacemaker 1.1.6-2ubuntu3.3
Corosync 1.4.2-2ubuntu0.2

Here are the logs:
Nov 08 14:26:26 corosync [pcmk ] info: pcmk_ipc_exit: Client crmd (conn=0x12bebe0, async-conn=0x12bebe0) left
Nov 08 14:26:26 corosync [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
Nov 08 14:26:27 corosync [pcmk ] info: pcmk_ipc_exit: Client attrd (conn=0x12d0230, async-conn=0x12d0230) left
Nov 08 14:26:32 corosync [pcmk ] info: pcmk_ipc_exit: Client cib (conn=0x12c7d80, async-conn=0x12c7d80) left
Nov 08 14:26:32 corosync [pcmk ] info: pcmk_ipc_exit: Client stonith-ng (conn=0x12c3a20, async-conn=0x12c3a20) left
Nov 08 14:26:32 corosync [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
Nov 08 14:26:32 corosync [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
Nov 08 14:26:32 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12bebe0 for stonith-ng/0
Nov 08 14:26:32 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12c2f40 for attrd/0
Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12c72a0 for cib/0
Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Sending membership update 12 to cib
Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12cb600 for crmd/0
Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Sending membership update 12 to crmd
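
As a reading aid for the excerpt above, here is a minimal Python sketch that groups the pcmk_ipc_exit and pcmk_ipc "Recorded connection" lines by client, so the order in which each daemon left and connected again is easier to see. The regular expressions are assumptions based only on the lines shown here, and client_timeline is a hypothetical helper name, not part of Pacemaker or Corosync.

```python
import re
from collections import defaultdict

# Assumption: this pattern is derived only from the log lines quoted in this
# post; other corosync/pacemaker versions may format their lines differently.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) corosync \[pcmk\s*\] "
    r"(?P<level>\w+): (?P<func>\w+): (?P<msg>.*)$"
)
EXIT_RE = re.compile(r"^Client (?P<client>[\w-]+) \(")
JOIN_RE = re.compile(r"^Recorded connection \S+ for (?P<client>[\w-]+)/\d+")


def client_timeline(lines):
    """Return {client: [(timestamp, 'left' | 'connected'), ...]} in log order."""
    timeline = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a [pcmk] plugin line in the expected shape
        if m["func"] == "pcmk_ipc_exit":
            hit, event = EXIT_RE.match(m["msg"]), "left"
        elif m["func"] == "pcmk_ipc":
            hit, event = JOIN_RE.match(m["msg"]), "connected"
        else:
            continue  # e.g. route_ais_message warnings
        if hit:
            timeline[hit["client"]].append((m["ts"], event))
    return dict(timeline)


# A few lines copied from the excerpt above.
SAMPLE = """
Nov 08 14:26:26 corosync [pcmk ] info: pcmk_ipc_exit: Client crmd (conn=0x12bebe0, async-conn=0x12bebe0) left
Nov 08 14:26:26 corosync [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
Nov 08 14:26:32 corosync [pcmk ] info: pcmk_ipc_exit: Client cib (conn=0x12c7d80, async-conn=0x12c7d80) left
Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12c72a0 for cib/0
Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Sending membership update 12 to cib
Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12cb600 for crmd/0
"""

for client, events in client_timeline(SAMPLE.splitlines()).items():
    print(client, events)
```

On the sample lines this shows crmd leaving at 14:26:26 with a new crmd connection recorded at 14:26:33, and cib leaving at 14:26:32 with a new connection at 14:26:33.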

Output of crm configure show:
node p-sbc3 \
    attributes standby="off"
node p-sbc4 \
    attributes standby="off"
primitive fs lsb:FSSofia \
    op monitor interval="2s" enabled="true" timeout="10s" on-fail="standby" \
    meta target-role="Started"
primitive fs-ip ocf:heartbeat:IPaddr2 \
    params ip="10.100.0.90" nic="eth0:0" cidr_netmask="24" \
    op monitor interval="10s"
primitive fs-ip2 ocf:heartbeat:IPaddr2 \
    params ip="10.100.0.99" nic="eth0:1" cidr_netmask="24" \
    op monitor interval="10s"
group cluster_services fs-ip fs-ip2 fs \
    meta target-role="Started"
property $id="cib-bootstrap-options" \
    dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    last-lrm-refresh="1348755080" \
    no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
    resource-stickiness="100"
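
If it helps to compare the cluster-wide settings between the two nodes or over time, the property stanza above can be pulled into a dictionary. This is only a rough sketch: cluster_properties is a hypothetical helper, and it assumes the backslash-continued layout that crm configure show printed here.

```python
import re


def cluster_properties(config_text):
    """Collect key="value" pairs from the 'property' stanza of the output."""
    props = {}
    in_block = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("property "):
            in_block = True
        if in_block:
            # findall with two groups yields (key, value) tuples.
            props.update(re.findall(r'([\w$-]+)="([^"]*)"', stripped))
            # The stanza continues only while a line ends with a backslash.
            if not stripped.endswith("\\"):
                in_block = False
    return props


# Tail of the configuration shown above.
CONFIG = r"""
property $id="cib-bootstrap-options" \
    dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    last-lrm-refresh="1348755080" \
    no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
    resource-stickiness="100"
"""

for key, value in cluster_properties(CONFIG).items():
    print(f"{key} = {value}")
```

The rsc_defaults stanza is deliberately left out, since the previous line does not end with a backslash and so closes the property block.
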
Re: Intermittent Failovers: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
> On 11 Nov 2014, at 1:32 am, Zach Wolf <ZWolf@doublepositive.com> wrote:
>
> Hey Team,
>
> I’m seeing strange intermittent failovers on a two-node cluster, roughly once every week or two. When this happens, both nodes are unavailable: one node is marked offline and the other is shown as unclean. Any help on this would be massively appreciated. Thanks.
>
> Running Ubuntu 12.04 (64-bit)
> Pacemaker 1.1.6-2ubuntu3.3
> Corosync 1.4.2-2ubuntu0.2
>
> Here are the logs:
> Nov 08 14:26:26 corosync [pcmk ] info: pcmk_ipc_exit: Client crmd (conn=0x12bebe0, async-conn=0x12bebe0) left
> Nov 08 14:26:26 corosync [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
> Nov 08 14:26:27 corosync [pcmk ] info: pcmk_ipc_exit: Client attrd (conn=0x12d0230, async-conn=0x12d0230) left
> Nov 08 14:26:32 corosync [pcmk ] info: pcmk_ipc_exit: Client cib (conn=0x12c7d80, async-conn=0x12c7d80) left
> Nov 08 14:26:32 corosync [pcmk ] info: pcmk_ipc_exit: Client stonith-ng (conn=0x12c3a20, async-conn=0x12c3a20) left
> Nov 08 14:26:32 corosync [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
> Nov 08 14:26:32 corosync [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)

Nothing at all from the crmd, cib, attrd or stonith-ng processes?
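
One way to check, as a rough sketch: filter the system log for lines written by those four daemons around the time of the failover. The tag format ("name:" or "name[pid]:") is an assumption about how the daemons label their syslog lines on this system, and daemon_lines is a hypothetical helper. The log file path is passed on the command line because the excerpt does not say where these messages are written.

```python
import re
import sys

DAEMONS = ("crmd", "cib", "attrd", "stonith-ng")


def daemon_lines(lines, daemons=DAEMONS, around="14:26"):
    """Keep lines that contain `around` and look like they come from a daemon.

    Assumption: a daemon tags its own lines as "name:" or "name[pid]:".
    The corosync [pcmk] lines that merely mention a daemon do not match.
    """
    names = "|".join(re.escape(d) for d in daemons)
    tag = re.compile(r"\b(" + names + r")(\[\d+\])?:")
    return [ln for ln in lines if around in ln and tag.search(ln)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python daemon_lines.py <path-to-log-file>
    with open(sys.argv[1], errors="replace") as fh:
        for ln in daemon_lines(fh):
            print(ln.rstrip())
```

The default "14:26" matches the minute of the first pcmk_ipc_exit line in the quoted logs; widen or change it for other incidents.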

> Nov 08 14:26:32 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12bebe0 for stonith-ng/0
> Nov 08 14:26:32 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12c2f40 for attrd/0
> Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12c72a0 for cib/0
> Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Sending membership update 12 to cib
> Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12cb600 for crmd/0
> Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Sending membership update 12 to crmd
>
> Output of crm configure show:
> node p-sbc3 \
>     attributes standby="off"
> node p-sbc4 \
>     attributes standby="off"
> primitive fs lsb:FSSofia \
>     op monitor interval="2s" enabled="true" timeout="10s" on-fail="standby" \
>     meta target-role="Started"
> primitive fs-ip ocf:heartbeat:IPaddr2 \
>     params ip="10.100.0.90" nic="eth0:0" cidr_netmask="24" \
>     op monitor interval="10s"
> primitive fs-ip2 ocf:heartbeat:IPaddr2 \
>     params ip="10.100.0.99" nic="eth0:1" cidr_netmask="24" \
>     op monitor interval="10s"
> group cluster_services fs-ip fs-ip2 fs \
>     meta target-role="Started"
> property $id="cib-bootstrap-options" \
>     dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="2" \
>     stonith-enabled="false" \
>     last-lrm-refresh="1348755080" \
>     no-quorum-policy="ignore"
> rsc_defaults $id="rsc-options" \
>     resource-stickiness="100"
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

