Mailing List Archive

stonith
Hi list,

I have a Pacemaker/Corosync 2 setup with 4 nodes, with stonith configured over the IPMI interface. My problem is that sometimes the wrong node is stonithed.

For example: I have 4 servers: node1, node2, node3, node4. I trigger a hardware reset on node1, but both node1 and node3 are stonithed.

In cluster.log, I found the following entries:

Apr 17 11:02:41 [20473] node2 stonithd: debug: stonith_action_create: Initiating action reboot for agent fence_legacy (target=node1)
Apr 17 11:02:41 [20473] node2 stonithd: debug: make_args: Performing reboot action for node 'node1' as 'port=node1'
Apr 17 11:02:41 [20473] node2 stonithd: debug: internal_stonith_action_execute: forking
Apr 17 11:02:41 [20473] node2 stonithd: debug: internal_stonith_action_execute: sending args
Apr 17 11:02:41 [20473] node2 stonithd: debug: stonith_device_execute: Operation reboot for node node1 on p_stonith_node3 now running with pid=113092, timeout=60s

So node1 is reset using the stonith primitive of node3. Why?

My stonith config:

primitive p_stonith_node1 stonith:external/ipmi \
params hostname=node1 ipaddr=10.100.0.2 passwd_method=file passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus priv=OPERATOR \
op monitor interval=3s timeout=20s \
meta target-role=Started failure-timeout=30s
primitive p_stonith_node2 stonith:external/ipmi \
op monitor interval=3s timeout=20s \
params hostname=node2 ipaddr=10.100.0.4 passwd_method=file passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus priv=OPERATOR \
meta target-role=Started failure-timeout=30s
primitive p_stonith_node3 stonith:external/ipmi \
op monitor interval=3s timeout=20s \
params hostname=node3 ipaddr=10.100.0.6 passwd_method=file passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus priv=OPERATOR \
meta target-role=Started failure-timeout=30s
primitive p_stonith_node4 stonith:external/ipmi \
op monitor interval=3s timeout=20s \
params hostname=node4 ipaddr=10.100.0.8 passwd_method=file passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus priv=OPERATOR \
meta target-role=Started failure-timeout=30s

Can somebody help me?

Thanks!

Regards,
Thomas