Hi All,
One of our users ran the cibadmin command incorrectly.
This operation error causes crmd to dump core and be restarted.
Step 1) Start a cluster.
[root@rh70-node1 ~]# crm_mon -1 -Af
Last updated: Wed Nov 5 10:26:51 2014
Last change: Wed Nov 5 10:23:39 2014
Stack: corosync
Current DC: rh70-node1 (3232238160) - partition WITHOUT quorum
Version: 1.1.12-85c093e
1 Nodes configured
0 Resources configured
Online: [ rh70-node1 ]
Node Attributes:
* Node rh70-node1:
Migration summary:
* Node rh70-node1:
Step 2) A user adds a node with an incorrect specification (a non-numeric id).
cibadmin -C -o nodes -X '<node id="hpg604" type="normal" uname="hpg604"/>'
crmd dumps core and is respawned by pacemakerd.
----------------------------
Nov 5 10:28:17 rh70-node1 cib[2167]: info: cib_process_request: Forwarding cib_create operation for section nodes to master (origin=local/cibadmin/2)
Nov 5 10:28:17 rh70-node1 cib[2167]: info: cib_perform_op: Diff: --- 0.2.7 2
Nov 5 10:28:17 rh70-node1 cib[2167]: info: cib_perform_op: Diff: +++ 0.3.0 92153f86c58ed569196d946612f0dab8
Nov 5 10:28:17 rh70-node1 cib[2167]: info: cib_perform_op: + /cib: @epoch=3, @num_updates=0
Nov 5 10:28:17 rh70-node1 cib[2167]: info: cib_perform_op: ++ /cib/configuration/nodes: <node id="hpg604" type="normal" uname="hpg604"/>
Nov 5 10:28:17 rh70-node1 cib[2167]: info: cib_process_request: Completed cib_create operation for section nodes: OK (rc=0, origin=rh70-node1/cibadmin/2, version=0.3.0)
Nov 5 10:28:17 rh70-node1 crmd[2172]: error: crm_int_helper: Characters left over after parsing 'hpg604': 'hpg604'
Nov 5 10:28:17 rh70-node1 crmd[2172]: error: crm_abort: crm_find_peer: Triggered fatal assert at membership.c:338 : id > 0 || uname != NULL
Nov 5 10:28:17 rh70-node1 cib[2223]: info: write_cib_contents: Archived previous version as /var/lib/pacemaker/cib/cib-2.raw
Nov 5 10:28:17 rh70-node1 cib[2223]: info: write_cib_contents: Wrote version 0.3.0 of the CIB to disk (digest: fd92fe00a0f0478246b1c9f1d2be83a8)
Nov 5 10:28:17 rh70-node1 cib[2223]: info: retrieveCib: Reading cluster configuration from: /var/lib/pacemaker/cib/cib.CARj72 (digest: /var/lib/pacemaker/cib/cib.XK4ybJ)
Nov 5 10:28:17 rh70-node1 abrt-hook-ccpp: Saved core dump of pid 2172 (/usr/libexec/pacemaker/crmd) to /var/tmp/abrt/ccpp-2014-11-05-10:28:17-2172 (18141184 bytes)
Nov 5 10:28:18 rh70-node1 abrt-server: Executable '/usr/libexec/pacemaker/crmd' doesn't belong to any package and ProcessUnpackaged is set to 'no'
Nov 5 10:28:18 rh70-node1 abrt-server: 'post-create' on '/var/tmp/abrt/ccpp-2014-11-05-10:28:17-2172' exited with 1
Nov 5 10:28:18 rh70-node1 abrt-server: Deleting problem directory '/var/tmp/abrt/ccpp-2014-11-05-10:28:17-2172'
Nov 5 10:28:18 rh70-node1 pacemakerd[2166]: error: child_waitpid: Managed process 2172 (crmd) dumped core
Nov 5 10:28:18 rh70-node1 pacemakerd[2166]: error: pcmk_child_exit: The crmd process (2172) terminated with signal 6 (core=1)
Nov 5 10:28:18 rh70-node1 pacemakerd[2166]: notice: pcmk_process_exit: Respawning failed child process: crmd
Nov 5 10:28:18 rh70-node1 pacemakerd[2166]: info: start_child: Using uid=992 and group=990 for process crmd
Nov 5 10:28:18 rh70-node1 pacemakerd[2166]: info: start_child: Forked child 2228 for process crmd
Nov 5 10:28:18 rh70-node1 crmd[2228]: info: crm_log_init: Changed active directory to /usr/var/lib/heartbeat/cores/hacluster
Nov 5 10:28:18 rh70-node1 crmd[2228]: notice: main: CRM Git Version: 85c093e
Nov 5 10:28:18 rh70-node1 crmd[2228]: info: do_log: FSA: Input I_STARTUP from crmd_init() received in state S_STARTING
Nov 5 10:28:18 rh70-node1 crmd[2228]: info: get_cluster_type: Verifying cluster type: 'corosync'
----------------------------
This is an operation error by the user, but it is not desirable for crmd to abort and restart because of it.
We request an improvement so that crmd does not restart in this case.
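Until such an improvement exists, a wrapper on the operator side could reject the bad input before it reaches the CIB. The following is only a minimal sketch: the function names is_numeric_node_id and add_node_checked are hypothetical, and it assumes that the corosync stack expects a positive numeric node id, as the crm_mon output in Step 1 and the crm_int_helper error in the log suggest.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pacemaker): succeeds only if "$1" is a
# positive decimal integer without a leading zero.
is_numeric_node_id() {
    case "$1" in
        ''|*[!0-9]*|0*) return 1 ;;   # empty, contains a non-digit, or starts with 0
        *) return 0 ;;
    esac
}

# Hypothetical wrapper: runs the same cibadmin command as in Step 2, but
# only after the id has passed the numeric check above.
add_node_checked() {
    id=$1
    uname=$2
    if ! is_numeric_node_id "$id"; then
        echo "refusing to add node: id '$id' is not a positive integer" >&2
        return 1
    fi
    cibadmin -C -o nodes -X "<node id=\"$id\" type=\"normal\" uname=\"$uname\"/>"
}
```

With this in place, calling add_node_checked hpg604 hpg604 returns an error without invoking cibadmin, so the entry from Step 2 never enters the nodes section. This only guards one entry path and does not replace a fix in crmd itself.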
Best Regards,
Hideo Yamauchi.
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org