Mailing List Archive

Re: STONITH errors during split-brain
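Fukuda-san,

Good evening, this is Yamauchi.

It looks like nothing has changed...

For now, sometime around tomorrow, I will check (on RHEL, though) whether stonith-helper runs with the combination of

Heartbeat 3.0.6
the latest Pacemaker

using a similar configuration (the resources will be Dummy, and external/xen0 will be replaced by external/ssh).

# If we could see the output of stonith-helper run with -x, it would be easier to narrow the problem down...


That is all.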
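
The -x trace mentioned above can also be collected by running the plugin by hand instead of through the cluster. What follows is only a minimal sketch, not something taken from this thread. It assumes the usual cluster-glue convention that an external STONITH plugin receives the action name (for example status) as its only argument and reads its parameters from environment variables. trace_plugin and the default trace file path are hypothetical names.

```shell
#!/bin/bash
# Sketch: run an external STONITH plugin by hand under "bash -x".
# Assumption: the plugin takes the action as $1 and reads its
# parameters (hostlist, dead_check_target, ...) from the environment.
# trace_plugin and the default trace file are hypothetical names.

trace_plugin() {
    local plugin action trace rc
    plugin=$1
    action=$2
    trace=${3:-/tmp/stonith-plugin-trace.log}

    if [ ! -r "$plugin" ]; then
        echo "plugin not readable: $plugin" >&2
        return 127
    fi

    # bash -x writes its trace to stderr. Redirect stderr (the trace
    # plus anything the plugin itself prints there) to a file so it
    # can be attached to a mail. The plugin's stdout is left alone.
    bash -x "$plugin" "$action" 2>"$trace"
    rc=$?

    echo "exit code: $rc (trace in $trace)"
    return "$rc"
}
```

A possible invocation, using parameter values from the configuration quoted further down:

    ( export hostlist=lbv2.beta.com run_online_check=yes
      trace_plugin /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper status )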



----- Original Message -----
>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>Date: 2015/3/17, Tue 21:24
>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>
>
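>Yamauchi-san,
>
>Good evening, this is Fukuda.
>Thank you for the information on the latest version.
>
>I installed it right away.
>
>Here is the state after startup.
>
>The failed actions appear to be unchanged.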
>
>
>
># crm_mon -rfA
>Last updated: Tue Mar 17 21:03:49 2015
>Last change: Tue Mar 17 20:30:58 2015
>Stack: heartbeat
>Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>Version: 1.1.12-e32080b
>2 Nodes configured
>8 Resources configured
>
>
>Online: [ lbv1.beta.com lbv2.beta.com ]
>
>Full list of resources:
>
> Resource Group: HAvarnish
>     vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>     varnishd   (lsb:varnish):  Started lbv1.beta.com
> Resource Group: grpStonith1
>     Stonith1-1 (stonith:external/stonith-helper):      Stopped
>     Stonith1-2 (stonith:external/xen0):        Stopped
> Resource Group: grpStonith2
>     Stonith2-1 (stonith:external/stonith-helper):      Stopped
>     Stonith2-2 (stonith:external/xen0):        Stopped
> Clone Set: clone_ping [ping]
>     Started: [ lbv1.beta.com lbv2.beta.com ]
>
>Node Attributes:
>* Node lbv1.beta.com:
>    + default_ping_set                  : 100
>* Node lbv2.beta.com:
>    + default_ping_set                  : 100
>
>Migration summary:
>* Node lbv1.beta.com:
>   Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
>* Node lbv2.beta.com:
>   Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
>
>Failed actions:
>    Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
>    Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
>
>
>
>
>Here are the logs.
>
>
># less /var/log/ha-debug
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
>Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
>Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist=
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109  gid 113 (pid 4250)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109  gid 113 (pid 4246)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109  gid 113 (pid 4249)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109  gid 113 (pid 4245)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0  gid 0 (pid 4248)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 4247)
>Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname: lbv1.beta.com
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client ccm is set to 1024
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client attrd is set to 1024
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client stonith-ng is set to 1024
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status active
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client cib is set to 1024
>Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [15:17]
>Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [19:21]
>Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client crmd is set to 1024
>Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [24:26]
>Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [26:28]
>Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [30:32]
>Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>
>
>
># less /var/log/error
>
>Mar 17 21:02:47 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>Mar 17 21:02:48 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>
># cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
>Mar 17 21:03:24 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
>Mar 17 21:03:27 lbv1 crmd[4250]:   notice: run_graph: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
>Mar 17 21:03:29 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
>Mar 17 21:03:34 lbv1 crmd[4250]:   notice: run_graph: Transition 1 (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
>Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>Mar 17 21:03:37 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
>Mar 17 21:03:39 lbv1 stonith-ng[4247]:   notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
>Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
>Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed:  2 ]
>Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>Mar 17 21:03:40 lbv1 crmd[4250]:   notice: run_graph: Transition 2 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
>Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>Mar 17 21:03:42 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
>Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>Mar 17 21:03:47 lbv1 crmd[4250]:   notice: run_graph: Transition 3 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
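
The stonith-ng lines above report: failed to exec "stonith", followed by "failed:  2". One possible reading, which is an assumption and is not confirmed anywhere in this thread, is that the stonith command itself could not be found when stonith-ng tried to run it. A minimal sketch for checking where a command lives is shown below. locate_cmd is a hypothetical helper, and the extra directories in the usage line are guesses based on the /usr/local/heartbeat prefix seen in the logs.

```shell
#!/bin/bash
# Sketch: report where a command can be found.
# locate_cmd NAME [DIR...]
#   returns 0 and prints the path if NAME resolves on the current PATH
#   returns 2 and prints the path if NAME only exists in one of DIR...
#   returns 1 if NAME is not found at all
# locate_cmd is a hypothetical helper, not part of Pacemaker.

locate_cmd() {
    local name dir
    name=$1
    shift

    # First choice: whatever the current PATH resolves.
    if command -v "$name" >/dev/null 2>&1; then
        command -v "$name"
        return 0
    fi

    # Otherwise look in the extra directories given by the caller.
    for dir in "$@"; do
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            echo "note: $name exists in $dir but is not on PATH" >&2
            echo "$dir/$name"
            return 2
        fi
    done

    echo "$name not found" >&2
    return 1
}
```

For example: locate_cmd stonith /usr/local/heartbeat/sbin /usr/local/sbin /usr/sbin. Note that the PATH of an interactive root shell may differ from the PATH the daemons inherit from heartbeat, so a result of 0 here does not by itself rule this reading out.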
>
>Thank you in advance.
>
>That is all.
>
>
>
>On 17 March 2015 at 18:31, <renayama19661014@ybb.ne.jp> wrote:
>
>Fukuda-san,
>>
>>Good evening, this is Yamauchi.
>>
>>Since it has not been tagged, today's latest version is
>>
>> * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
>>
>>
>>You can download it from [Download ZIP] on the right-hand side.
>>
>>That is all.
>>
>>
>>----- Original Message -----
>>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>
>>>To: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>Date: 2015/3/17, Tue 18:07
>>>Subject: STONITH errors during split-brain
>>>
>>>
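>>>Yamauchi-san,
>>>
>>>
>>>Hello, this is Fukuda.
>>>
>>>
>>>I looked at this page:
>>>https://github.com/ClusterLabs/pacemaker/tags
>>>
>>>
>>>
>>>pacemaker 1.1.12 561c4cf seems to be the latest there.
>>>Sorry to trouble you, but could you tell me where I can find a newer version than this?
>>>
>>>
>>>Thank you in advance.
>>>
>>>
>>>That is all.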
>>>
>>>
>>>
>>>On Tuesday, 17 March 2015, <renayama19661014@ybb.ne.jp> wrote:
>>>
>>>Fukuda-san,
>>>>
>>>>Hello, this is Yamauchi.
>>>>
>>>>Yes, it is old.
>>>>
>>>>Pacemaker only started supporting Heartbeat 3.0.6 fairly recently.
>>>>Please install something newer. (You will have to build it from source again, though...)
>>>>
>>>>
>>>>
>>>>It is available from the upstream GitHub repository:
>>>> * https://github.com/ClusterLabs/pacemaker
>>>>
>>>>
>>>>The latest master sometimes produces errors; if that happens, I suggest working back through older revisions.
>>>>
>>>>That is all.
>>>>
>>>>
>>>>
>>>>----- Original Message -----
>>>>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>Date: 2015/3/17, Tue 16:06
>>>>>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>
>>>>>
>>>>>Yamauchi-san,
>>>>>
>>>>>Hello, this is Fukuda.
>>>>>
>>>>>In an earlier email you advised me to install the latest versions of heartbeat and pacemaker.
>>>>>So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12.
>>>>>
>>>>>heartbeat configuration: Version = "3.0.6"
>>>>>pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
>>>>>
>>>>>Does this mean that pacemaker is still too old?
>>>>>
>>>>>Sorry to trouble you, and thank you in advance.
>>>>>
>>>>>That is all.
>>>>>
>>>>>
>>>>>
>>>>>On 17 March 2015 at 14:59, <renayama19661014@ybb.ne.jp> wrote:
>>>>>
>>>>>Fukuda-san,
>>>>>>
>>>>>>Hello, this is Yamauchi.
>>>>>>
>>>>>>It just occurred to me: in an earlier email in this exchange I replied as follows. Is that requirement met on your side?
>>>>>>
>>>>>>
>>>>>>>>>>>> 2) Heartbeat 3.0.6 + latest Pacemaker: OK
>>>>>>>>>>>>
>>>>>>>>>>>> It appears that the latest Heartbeat, 3.0.6, also has to be used in this combination.
>>>>>>>>>>>>  * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
>>>>>>
>>>>>>Judging from the crm_mon version below, you are running 1.1.12.
>>>>>>Combining it with Heartbeat 3.0.6 requires a fairly recent Pacemaker.
>>>>>>
>>>>>>># crm_mon -rfA
>>>>>>>
>>>>>>>Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>Last change: Tue Mar 17 14:01:43 2015
>>>>>>>Stack: heartbeat
>>>>>>>Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>Version: 1.1.12-561c4cf
>>>>>>
>>>>>>I think you probably need at least the following change or later.
>>>>>>
>>>>>>https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
>>>>>>
>>>>>>
>>>>>>
>>>>>>That is all.
>>>>>>
>>>>>>
>>>>>>
>>>>>>----- Original Message -----
>>>>>>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>
>>>>>>>Date: 2015/3/17, Tue 14:38
>>>>>>>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>
>>>>>>>
>>>>>>>Yamauchi-san,
>>>>>>>
>>>>>>>Hello, this is Fukuda.
>>>>>>>
>>>>>>>Do you mean I should add -x to the shebang line of stonith-helper?
>>>>>>>I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
>>>>>>>
>>>>>>>crm_mon shows no change from before.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>># crm_mon -rfA
>>>>>>>
>>>>>>>Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>Last change: Tue Mar 17 14:01:43 2015
>>>>>>>Stack: heartbeat
>>>>>>>Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>Version: 1.1.12-561c4cf
>>>>>>>2 Nodes configured
>>>>>>>8 Resources configured
>>>>>>>
>>>>>>>Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>
>>>>>>>Full list of resources:
>>>>>>>
>>>>>>> Resource Group: HAvarnish
>>>>>>>     vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>     varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>> Resource Group: grpStonith1
>>>>>>>     Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>     Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>> Resource Group: grpStonith2
>>>>>>>     Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>     Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>> Clone Set: clone_ping [ping]
>>>>>>>     Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>
>>>>>>>Node Attributes:
>>>>>>>* Node lbv1.beta.com:
>>>>>>>    + default_ping_set                  : 100
>>>>>>>* Node lbv2.beta.com:
>>>>>>>    + default_ping_set                  : 100
>>>>>>>
>>>>>>>Migration summary:
>>>>>>>* Node lbv2.beta.com:
>>>>>>>   Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
>>>>>>>* Node lbv1.beta.com:
>>>>>>>   Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
>>>>>>>
>>>>>>>Failed actions:
>>>>>>>    Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
>>>>>>>    Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
>>>>>>>
>>>>>>>I also looked through the other logs.
>>>>>>>
>>>>>>>This is from heartbeat startup.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>># less /var/log/pm_logconv.out
>>>>>>>Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
>>>>>>>Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
>>>>>>>Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
>>>>>>>Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
>>>>>>>Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
>>>>>>>Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
>>>>>>>Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
>>>>>>>Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
>>>>>>>
>>>>>>>
>>>>>>># less /var/log/error
>>>>>>>Mar 17 14:12:20 lbv1 crmd[13269]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
>>>>>>>
>>>>>>>
>>>>>>>These are the lines matching stonith, grepped from syslog:
>>>>>>>
>>>>>>>Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>>>>>Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 13266)
>>>>>>>Mar 17 14:11:34 lbv1 stonithd[13266]:   notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
>>>>>>>Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
>>>>>>>Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: setup_cib: Watching for stonith topology changes
>>>>>>>Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: unpack_config: On loss of CCM Quorum: Ignore
>>>>>>>Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
>>>>>>>Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
>>>>>>>Mar 17 14:12:04 lbv1 stonithd[13266]:   notice: xml_patch_version_check: Versions did not change in patch 0.5.0
>>>>>>>Mar 17 14:12:20 lbv1 stonithd[13266]:   notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>>>>>Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
>>>>>>>Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
>>>>>>>Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed:  2 ]
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>Thank you in advance.
>>>>>>>
>>>>>>>That is all.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>On 17 March 2015 at 13:32, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>
>>>>>>>Fukuda-san,
>>>>>>>>
>>>>>>>>Hello, this is Yamauchi.
>>>>>>>>
>>>>>>>>So it seems the problem lies in the start of stonith-helper.
>>>>>>>>
>>>>>>>>If you put
>>>>>>>>
>>>>>>>>#!/bin/bash -x
>>>>>>>>
>>>>>>>>
>>>>>>>>at the top of stonith-helper and start the cluster, you may learn something.
>>>>>>>>
>>>>>>>>Incidentally, I would expect stonith-helper's own log output to appear somewhere as well...
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>That is all.
>>>>>>>>
>>>>>>>>----- Original Message -----
>>>>>>>>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>
>>>>>>>>>Date: 2015/3/17, Tue 12:31
>>>>>>>>>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Yamauchi-san,
>>>>>>>>>cc: Matsushima-san
>>>>>>>>>
>>>>>>>>>Hello, this is Fukuda.
>>>>>>>>>
>>>>>>>>>xen0 is in the same directory.
>>>>>>>>>
>>>>>>>>># pwd
>>>>>>>>>/usr/local/heartbeat/lib/stonith/plugins/external
>>>>>>>>>
>>>>>>>>># ls
>>>>>>>>>drac5           ibmrsa          kdumpcheck  riloe          vmware
>>>>>>>>>dracmc-telnet  ibmrsa-telnet  libvirt      ssh          xen0
>>>>>>>>>hetzner        ipmi          nut      stonith-helper  xen0-ha
>>>>>>>>>hmchttp        ippower9258    rackpdu      vcenter
>>>>>>>>>
>>>>>>>>>Thank you in advance.
>>>>>>>>>
>>>>>>>>>That is all.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>2015-03-17 10:53 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>>>>>>>
>>>>>>>>>Fukuda-san,
>>>>>>>>>>cc: Matsushima-san
>>>>>>>>>>
>>>>>>>>>>Hello, this is Yamauchi.
>>>>>>>>>>
>>>>>>>>>>>There was nothing on standard output or standard error.
>>>>>>>>>>>
>>>>>>>>>>>Could something be wrong with stonith-helper?
>>>>>>>>>>>Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>>>>>>stonith-helper is located here:
>>>>>>>>>>>/usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>
>>>>>>>>>>Is xen0 in this directory as well?
>>>>>>>>>>If it is not, that is a problem, so please try copying the stonith-helper file, keeping its attributes as they are, into the same directory as xen0.
>>>>>>>>>>
>>>>>>>>>>If it works after that, it means there is a problem with the pm_extras installation.
>>>>>>>>>>
>>>>>>>>>>That is all.
>>>>>>>>>>
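
The copy suggested above, keeping the file's attributes, could look like the sketch below. copy_plugin is a hypothetical helper introduced only for illustration. It relies on cp -p, which preserves mode and timestamps, and ownership where permitted.

```shell
#!/bin/bash
# Sketch: copy a plugin script into another plugin directory while
# keeping its attributes, then show the result.
# copy_plugin is a hypothetical helper name.

copy_plugin() {
    local src destdir
    src=$1
    destdir=$2

    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    [ -d "$destdir" ] || { echo "no such directory: $destdir" >&2; return 1; }

    # -p keeps mode, timestamps and (where permitted) ownership.
    cp -p "$src" "$destdir/" || return 1

    # List the copy so its permissions can be compared with xen0.
    ls -l "$destdir/$(basename "$src")"
}
```

It would be called as copy_plugin SOURCE_FILE TARGET_DIRECTORY, where the target is the directory that contains xen0.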
>>>>>>>>>>----- Original Message -----
>>>>>>>>>>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>
>>>>>>>>>>>Date: 2015/3/17, Tue 10:31
>>>>>>>>>>>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Yamauchi-san,
>>>>>>>>>>>cc: Matsushima-san
>>>>>>>>>>>
>>>>>>>>>>>Good morning, this is Fukuda.
>>>>>>>>>>>Thank you for the crm example.
>>>>>>>>>>>
>>>>>>>>>>>I adapted it to our environment right away.
>>>>>>>>>>>
>>>>>>>>>>>$ cat test.crm
>>>>>>>>>>>### Cluster Option ###
>>>>>>>>>>>property \
>>>>>>>>>>>    no-quorum-policy="ignore" \
>>>>>>>>>>>    stonith-enabled="true" \
>>>>>>>>>>>    startup-fencing="false" \
>>>>>>>>>>>    stonith-timeout="710s" \
>>>>>>>>>>>    crmd-transition-delay="2s"
>>>>>>>>>>>
>>>>>>>>>>>### Resource Default ###
>>>>>>>>>>>rsc_defaults \
>>>>>>>>>>>    resource-stickiness="INFINITY" \
>>>>>>>>>>>    migration-threshold="1"
>>>>>>>>>>>
>>>>>>>>>>>### Group Configuration ###
>>>>>>>>>>>group HAvarnish \
>>>>>>>>>>>    vip_208 \
>>>>>>>>>>>    varnishd
>>>>>>>>>>>
>>>>>>>>>>>group grpStonith1 \
>>>>>>>>>>>    Stonith1-1 \
>>>>>>>>>>>    Stonith1-2
>>>>>>>>>>>
>>>>>>>>>>>group grpStonith2 \
>>>>>>>>>>>    Stonith2-1 \
>>>>>>>>>>>    Stonith2-2
>>>>>>>>>>>
>>>>>>>>>>>### Clone Configuration ###
>>>>>>>>>>>clone clone_ping \
>>>>>>>>>>>    ping
>>>>>>>>>>>
>>>>>>>>>>>### Fencing Topology ###
>>>>>>>>>>>fencing_topology \
>>>>>>>>>>>    lbv1.beta.com: Stonith1-1 Stonith1-2 \
>>>>>>>>>>>    lbv2.beta.com: Stonith2-1 Stonith2-2
>>>>>>>>>>>
>>>>>>>>>>>### Primitive Configuration ###
>>>>>>>>>>>primitive vip_208 ocf:heartbeat:IPaddr2 \
>>>>>>>>>>>    params \
>>>>>>>>>>>        ip="192.168.17.208" \
>>>>>>>>>>>        nic="eth0" \
>>>>>>>>>>>        cidr_netmask="24" \
>>>>>>>>>>>    op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>    op monitor interval="5s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>
>>>>>>>>>>>primitive varnishd lsb:varnish \
>>>>>>>>>>>    op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>    op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>
>>>>>>>>>>>primitive ping ocf:pacemaker:ping \
>>>>>>>>>>>    params \
>>>>>>>>>>>        name="default_ping_set" \
>>>>>>>>>>>        host_list="192.168.17.254" \
>>>>>>>>>>>        multiplier="100" \
>>>>>>>>>>>        dampen="1" \
>>>>>>>>>>>    op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>    op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>
>>>>>>>>>>>primitive Stonith1-1 stonith:external/stonith-helper \
>>>>>>>>>>>    params \
>>>>>>>>>>>        pcmk_reboot_retries="1" \
>>>>>>>>>>>        pcmk_reboot_timeout="40s" \
>>>>>>>>>>>        hostlist="lbv1.beta.com" \
>>>>>>>>>>>        dead_check_target="192.168.17.132 10.0.17.132" \
>>>>>>>>>>>        standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>        run_online_check="yes" \
>>>>>>>>>>>    op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>
>>>>>>>>>>>primitive Stonith1-2 stonith:external/xen0 \
>>>>>>>>>>>    params \
>>>>>>>>>>>        pcmk_reboot_timeout="60s" \
>>>>>>>>>>>        hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
>>>>>>>>>>>        dom0="xen0.beta.com" \
>>>>>>>>>>>    op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>
>>>>>>>>>>>primitive Stonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>    params \
>>>>>>>>>>>        pcmk_reboot_retries="1" \
>>>>>>>>>>>        pcmk_reboot_timeout="40s" \
>>>>>>>>>>>        hostlist="lbv2.beta.com" \
>>>>>>>>>>>        dead_check_target="192.168.17.133 10.0.17.133" \
>>>>>>>>>>>        standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>        run_online_check="yes" \
>>>>>>>>>>>    op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>
>>>>>>>>>>>primitive Stonith2-2 stonith:external/xen0 \
>>>>>>>>>>>    params \
>>>>>>>>>>>        pcmk_reboot_timeout="60s" \
>>>>>>>>>>>        hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
>>>>>>>>>>>        dom0="xen0.beta.com" \
>>>>>>>>>>>    op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>
>>>>>>>>>>>### Resource Location ###
>>>>>>>>>>>location HA_location-1 HAvarnish \
>>>>>>>>>>>    rule 200: #uname eq lbv1.beta.com \
>>>>>>>>>>>    rule 100: #uname eq lbv2.beta.com
>>>>>>>>>>>
>>>>>>>>>>>location HA_location-2 HAvarnish \
>>>>>>>>>>>    rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
>>>>>>>>>>>
>>>>>>>>>>>location HA_location-3 grpStonith1 \
>>>>>>>>>>>    rule -INFINITY: #uname eq lbv1.beta.com
>>>>>>>>>>>
>>>>>>>>>>>location HA_location-4 grpStonith2 \
>>>>>>>>>>>    rule -INFINITY: #uname eq lbv2.beta.com
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>After loading this, the messages differ from yesterday's.
>>>>>>>>>>>The ping messages are gone.
>>>>>>>>>>>
>>>>>>>>>>># crm_mon -rfA
>>>>>>>>>>>Last updated: Tue Mar 17 10:21:28 2015
>>>>>>>>>>>Last change: Tue Mar 17 10:21:09 2015
>>>>>>>>>>>Stack: heartbeat
>>>>>>>>>>>Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>Version: 1.1.12-561c4cf
>>>>>>>>>>>2 Nodes configured
>>>>>>>>>>>8 Resources configured
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>
>>>>>>>>>>>Full list of resources:
>>>>>>>>>>>
>>>>>>>>>>> Resource Group: HAvarnish
>>>>>>>>>>>     vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>>>>     varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>>>> Resource Group: grpStonith1
>>>>>>>>>>>     Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>     Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>> Resource Group: grpStonith2
>>>>>>>>>>>     Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>     Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>> Clone Set: clone_ping [ping]
>>>>>>>>>>>     Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>
>>>>>>>>>>>Node Attributes:
>>>>>>>>>>>* Node lbv1.beta.com:
>>>>>>>>>>>    + default_ping_set                  : 100
>>>>>>>>>>>* Node lbv2.beta.com:
>>>>>>>>>>>    + default_ping_set                  : 100
>>>>>>>>>>>
>>>>>>>>>>>Migration summary:
>>>>>>>>>>>* Node lbv2.beta.com:
>>>>>>>>>>>   Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>* Node lbv1.beta.com:
>>>>>>>>>>>   Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>
>>>>>>>>>>>Failed actions:
>>>>>>>>>>>    Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
>>>>>>>>>>>    Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>This is the log from /var/log/ha-debug.
>>>>>>>>>>>
>>>>>>>>>>>IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
>>>>>>>>>>>IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
>>>>>>>>>>>IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>>>>>>>>>
>>>>>>>>>>>There was nothing on standard output or standard error.
>>>>>>>>>>>
>>>>>>>>>>>Could something be wrong with stonith-helper?
>>>>>>>>>>>Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>>>>>>stonith-helper is located here:
>>>>>>>>>>>/usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Thank you in advance.
>>>>>>>>>>>
>>>>>>>>>>>That is all.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>>>>>>>>>
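>>>>>>>>>>>Fukuda-san,
>>>>>>>>>>>>
>>>>>>>>>>>>Good morning, this is Yamauchi.
>>>>>>>>>>>>
>>>>>>>>>>>>Just in case, I am sending an excerpt of an example I have on hand that uses multiple stonith devices.
>>>>>>>>>>>>(In practice, please watch out for the line breaks.)
>>>>>>>>>>>>
>>>>>>>>>>>>The example below is a configuration for the PM 1.1 series:
>>>>>>>>>>>>for nodea, stonith is executed in the order prmStonith1-1, prmStonith1-2;
>>>>>>>>>>>>for nodeb, stonith is executed in the order prmStonith2-1, prmStonith2-2.
>>>>>>>>>>>>
>>>>>>>>>>>>The stonith devices themselves are helper and ssh.
>>>>>>>>>>>>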
>>>>>>>>>>>>
>>>>>>>>>>>>(snip)
>>>>>>>>>>>>### Group Configuration ###
>>>>>>>>>>>>group grpStonith1 \
>>>>>>>>>>>>prmStonith1-1 \
>>>>>>>>>>>>prmStonith1-2
>>>>>>>>>>>>
>>>>>>>>>>>>group grpStonith2 \
>>>>>>>>>>>>prmStonith2-1 \
>>>>>>>>>>>>prmStonith2-2
>>>>>>>>>>>>
>>>>>>>>>>>>### Fencing Topology ###
>>>>>>>>>>>>fencing_topology \
>>>>>>>>>>>>nodea: prmStonith1-1 prmStonith1-2 \
>>>>>>>>>>>>nodeb: prmStonith2-1 prmStonith2-2
>>>>>>>>>>>>(snip)
>>>>>>>>>>>>primitive prmStonith1-1 stonith:external/stonith-helper \
>>>>>>>>>>>>params \
>>>>>>>>>>>>pcmk_reboot_retries="1" \
>>>>>>>>>>>>pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>hostlist="nodea" \
>>>>>>>>>>>>dead_check_target="192.168.28.60 192.168.28.70" \
>>>>>>>>>>>>standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>run_online_check="yes" \
>>>>>>>>>>>>op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>>primitive prmStonith1-2 stonith:external/ssh \
>>>>>>>>>>>>params \
>>>>>>>>>>>>pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>hostlist="nodea" \
>>>>>>>>>>>>op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>>primitive prmStonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>>params \
>>>>>>>>>>>>pcmk_reboot_retries="1" \
>>>>>>>>>>>>pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>hostlist="nodeb" \
>>>>>>>>>>>>dead_check_target="192.168.28.61 192.168.28.71" \
>>>>>>>>>>>>standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>run_online_check="yes" \
>>>>>>>>>>>>op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>>primitive prmStonith2-2 stonith:external/ssh \
>>>>>>>>>>>>params \
>>>>>>>>>>>>pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>hostlist="nodeb" \
>>>>>>>>>>>>op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>(snip)
>>>>>>>>>>>>location rsc_location-grpStonith1-2 grpStonith1 \
>>>>>>>>>>>>rule -INFINITY: #uname eq nodea
>>>>>>>>>>>>location rsc_location-grpStonith2-3 grpStonith2 \
>>>>>>>>>>>>rule -INFINITY: #uname eq nodeb
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>That is all.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>--
>>>>>>>>>>>
>>>>>>>>>>>ELF Systems
>>>>>>>>>>>Masamichi Fukuda
>>>>>>>>>>>mail to: masamichi_fukuda@elf-systems.com
>>>>>>>>>>>

_______________________________________________
Linux-ha-japan mailing list
Linux-ha-japan@lists.sourceforge.jp
http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
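Re: STONITH errors during split-brain [ In reply to ]
Fukuda-san,

Good evening. This is Yamauchi.

By the way, if it is possible, checking what happens when you remove external/stonith-helper
and use only external/xen0 might help isolate the problem.

That is all.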
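
A minimal sketch of the xen0-only variant suggested above, assuming the resource name, hostlist, and dom0 values from the test.crm quoted further down in this thread. The shell function `emit_xen0_only` is a hypothetical helper (it is not part of Pacemaker or crmsh); it only prints a crm configuration fragment so that it can be reviewed before loading.

```shell
#!/bin/sh
# Hypothetical helper: print a crm fragment that fences one node with
# external/xen0 only, with no helper primitive in front of it.
#   $1 = resource name   $2 = node to fence
#   $3 = Xen domain config path on dom0   $4 = dom0 host
emit_xen0_only() {
    rsc="$1"; node="$2"; cfg="$3"; dom0="$4"
    # Unquoted here-document: "\\" prints one backslash (crm line continuation).
    cat <<EOF
primitive $rsc stonith:external/xen0 \\
    params \\
        pcmk_reboot_timeout="60s" \\
        hostlist="$node:$cfg" \\
        dom0="$dom0" \\
    op start interval="0s" timeout="60s" on-fail="restart" \\
    op monitor interval="3600s" timeout="60s" on-fail="restart" \\
    op stop interval="0s" timeout="60s" on-fail="ignore"
EOF
}

# Values for lbv1 as they appear in the quoted test.crm.
emit_xen0_only Stonith1-2 lbv1.beta.com /etc/xen/lbv1.cfg xen0.beta.com
```

For the test to be meaningful, the group and fencing_topology entries would presumably also have to list only the xen0 resources (for example `lbv1.beta.com: Stonith1-2`); that part is left out here because it depends on the rest of the configuration.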



----- Original Message -----
> From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
> To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> Cc:
> Date: 2015/3/17, Tue 22:28
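> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>
> Fukuda-san,
>
> Good evening. This is Yamauchi.
>
> It looks like nothing has changed...
>
> For now, probably tomorrow, I will check on RHEL whether stonith-helper works with the combination
>
> Heartbeat 3.0.6
> the latest Pacemaker
>
> using a similar configuration (the resources will be Dummy, and external/xen0 will become external/ssh).
>
> # If we could see the output of stonith-helper run with -x, it would be easier to narrow down the problem...
>
>
> That is all.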
>
>
>
> ----- Original Message -----
>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> Date: 2015/3/17, Tue 21:24
>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>
>>
>> Yamauchi-san,
>>
>> Good evening. This is Fukuda.
>> Thank you for the information about the latest version.
>>
>> I installed it right away.
>>
>> Here is the state after startup.
>>
>> The failed actions do not seem to have changed.
>>
>>
>>
>> # crm_mon -rfA
>> Last updated: Tue Mar 17 21:03:49 2015
>> Last change: Tue Mar 17 20:30:58 2015
>> Stack: heartbeat
>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>> Version: 1.1.12-e32080b
>> 2 Nodes configured
>> 8 Resources configured
>>
>>
>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>
>> Full list of resources:
>>
>>  Resource Group: HAvarnish
>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>  Resource Group: grpStonith1
>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>  Resource Group: grpStonith2
>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>  Clone Set: clone_ping [ping]
>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>
>> Node Attributes:
>> * Node lbv1.beta.com:
>>     + default_ping_set                  : 100
>> * Node lbv2.beta.com:
>>     + default_ping_set                  : 100
>>
>> Migration summary:
>> * Node lbv1.beta.com:
>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
>> * Node lbv2.beta.com:
>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
>>
>> Failed actions:
>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
>>
>>
>>
>>
>> Here are the logs.
>>
>>
>> # less /var/log/ha-debug
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist=
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109  gid 113 (pid 4250)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109  gid 113 (pid 4246)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109  gid 113 (pid 4249)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109  gid 113 (pid 4245)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0  gid 0 (pid 4248)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 4247)
>> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname: lbv1.beta.com
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client ccm is set to 1024
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client attrd is set to 1024
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client stonith-ng is set to 1024
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status active
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client cib is set to 1024
>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [15:17]
>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [19:21]
>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client crmd is set to 1024
>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [24:26]
>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [26:28]
>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [30:32]
>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>
>>
>>
>> # less /var/log/error
>>
>> Mar 17 21:02:47 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>> Mar 17 21:02:48 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>> Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>> Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>> Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>>
>> # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
>> Mar 17 21:03:24 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
>> Mar 17 21:03:27 lbv1 crmd[4250]:   notice: run_graph: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
>> Mar 17 21:03:29 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
>> Mar 17 21:03:34 lbv1 crmd[4250]:   notice: run_graph: Transition 1 (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
>> Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>> Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>> Mar 17 21:03:37 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:   notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed:  2 ]
>> Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>> Mar 17 21:03:40 lbv1 crmd[4250]:   notice: run_graph: Transition 2 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>> Mar 17 21:03:42 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
>> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>> Mar 17 21:03:47 lbv1 crmd[4250]:   notice: run_graph: Transition 3 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
>>
>> Thank you for your help.
>>
>> That is all.
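
The syslog excerpt above contains `failed to exec "stonith"`, which suggests that the `stonith` command itself could not be run. One possible cause, which is an assumption and not something confirmed in this thread, is that the command is not on the search path seen by the daemon when everything is installed under a non-default prefix such as /usr/local/heartbeat. The sketch below uses a hypothetical helper, `find_in_path`, to check whether a command resolves in a given colon-separated directory list; the "minimal" list in the example is also an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print the first executable file named $1 found in
# the colon-separated directory list $2; return 1 if there is none.
find_in_path() {
    cmd="$1"
    old_ifs="$IFS"
    IFS=:
    for dir in $2; do
        if [ -x "$dir/$cmd" ] && [ ! -d "$dir/$cmd" ]; then
            IFS="$old_ifs"
            printf '%s\n' "$dir/$cmd"
            return 0
        fi
    done
    IFS="$old_ifs"
    return 1
}

# Compare the interactive PATH with a minimal one a daemon might inherit.
find_in_path stonith "$PATH" || echo "stonith: not found in current PATH"
find_in_path stonith "/sbin:/bin:/usr/sbin:/usr/bin" || echo "stonith: not found in minimal PATH"
```

If the command resolves only in the interactive PATH, that would be consistent with the log message, though it would not prove the cause.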
>>
>>
>>
>> 2015-03-17 18:31 <renayama19661014@ybb.ne.jp>:
>>
>> Fukuda-san,
>>>
>>> Good evening. This is Yamauchi.
>>>
>>> Because it has not been tagged, today's latest version is
>>>
>>>  * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
>>>
>>>
>>> You can download it with [Download ZIP] on the right-hand side.
>>>
>>> That is all.
>>>
>>>
>>> ----- Original Message -----
>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>> To: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> Date: 2015/3/17, Tue 18:07
>>>> Subject: STONITH errors during split-brain
>>>>
>>>>
>>>> Yamauchi-san,
>>>>
>>>>
>>>> Hello. This is Fukuda.
>>>>
>>>>
>>>> I looked here:
>>>> https://github.com/ClusterLabs/pacemaker/tags
>>>>
>>>>
>>>>
>>>> pacemaker 1.1.12 561c4cf appears to be the latest.
>>>> Sorry to trouble you, but could you tell me where to find a version newer than this?
>>>>
>>>>
>>>> Thank you for your help.
>>>>
>>>>
>>>> That is all.
>>>>
>>>>
>>>>
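>>>> On Tuesday, 17 March 2015, <renayama19661014@ybb.ne.jp> wrote:
>>>>
>>>> Fukuda-san,
>>>>>
>>>>> Hello. This is Yamauchi.
>>>>>
>>>>> Yes, it is old.
>>>>>
>>>>> Pacemaker gained support for Heartbeat 3.0.6 surprisingly recently.
>>>>> Please install something newer. (You will need to build it from source again, though...)
>>>>>
>>>>>
>>>>>
>>>>> It is available from the upstream GitHub repository.
>>>>>  * https://github.com/ClusterLabs/pacemaker
>>>>>
>>>>>
>>>>> In some cases the latest master may produce errors and the like; if that happens,
>>>>> I think it is best to work back through older revisions.
>>>>>
>>>>> That is all.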
>>>>>
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>> Date: 2015/3/17, Tue 16:06
>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>
>>>>>>
>>>>>> Yamauchi-san,
>>>>>>
>>>>>> Hello. This is Fukuda.
>>>>>>
>>>>>> In an earlier email you advised me to install the latest versions of heartbeat and pacemaker.
>>>>>> So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12.
>>>>>>
>>>>>> heartbeat configuration: Version = "3.0.6"
>>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
>>>>>>
>>>>>> Does this mean that pacemaker is still too old?
>>>>>>
>>>>>> Sorry to trouble you, and thank you for your help.
>>>>>>
>>>>>> That is all.
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2015-03-17 14:59 <renayama19661014@ybb.ne.jp>:
>>>>>>
>>>>>> Fukuda-san,
>>>>>>>
>>>>>>> Hello. This is Yamauchi.
>>>>>>>
>>>>>>> Something just occurred to me: in an earlier email exchange I answered as follows. Is this all right on your side?
>>>>>>>
>>>>>>>
>>>>>>>>>>>>> 2) Heartbeat 3.0.6 + latest Pacemaker: OK
>>>>>>>>>>>>>
>>>>>>>>>>>>> It seems that Heartbeat also has to be the latest version, 3.0.6.
>>>>>>>>>>>>>  * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
>>>>>>>
>>>>>>> Judging from the crm_mon version below, you are running 1.1.12.
>>>>>>> Combining it with Heartbeat 3.0.6 requires a fairly recent Pacemaker.
>>>>>>>
>>>>>>>> # crm_mon -rfA
>>>>>>>>
>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
>>>>>>>> Stack: heartbeat
>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>
>>>>>>> You probably need at least everything from the following change onward.
>>>>>>>
>>>>>>> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> That is all.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>> Date: 2015/3/17, Tue 14:38
>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>
>>>>>>>>
>>>>>>>> Yamauchi-san,
>>>>>>>>
>>>>>>>> Hello. This is Fukuda.
>>>>>>>>
>>>>>>>> Is it enough to add -x to the shebang line of stonith-helper?
>>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
>>>>>>>>
>>>>>>>> crm_mon shows no change from before.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> # crm_mon -rfA
>>>>>>>>
>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
>>>>>>>> Stack: heartbeat
>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>> 2 Nodes configured
>>>>>>>> 8 Resources configured
>>>>>>>>
>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>
>>>>>>>> Full list of resources:
>>>>>>>>
>>>>>>>>  Resource Group: HAvarnish
>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>  Resource Group: grpStonith1
>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>  Resource Group: grpStonith2
>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>  Clone Set: clone_ping [ping]
>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>
>>>>>>>> Node Attributes:
>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>     + default_ping_set                  : 100
>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>
>>>>>>>> Migration summary:
>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
>>>>>>>>
>>>>>>>> Failed actions:
>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
>>>>>>>>
>>>>>>>> I looked for other logs.
>>>>>>>>
>>>>>>>> This is from when heartbeat started.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> # less /var/log/pm_logconv.out
>>>>>>>> Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
>>>>>>>> Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
>>>>>>>>
>>>>>>>>
>>>>>>>> # less /var/log/error
>>>>>>>> Mar 17 14:12:20 lbv1 crmd[13269]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
>>>>>>>>
>>>>>>>>
>>>>>>>> These are the lines matching stonith, grepped from syslog:
>>>>>>>>
>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 13266)
>>>>>>>> Mar 17 14:11:34 lbv1 stonithd[13266]:   notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: setup_cib: Watching for stonith topology changes
>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: unpack_config: On loss of CCM Quorum: Ignore
>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
>>>>>>>> Mar 17 14:12:04 lbv1 stonithd[13266]:   notice: xml_patch_version_check: Versions did not change in patch 0.5.0
>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:   notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed:  2 ]
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thank you for your help.
>>>>>>>>
>>>>>>>> That is all.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2015-03-17 13:32 <renayama19661014@ybb.ne.jp>:
>>>>>>>>
>>>>>>>> Fukuda-san,
>>>>>>>>>
>>>>>>>>> Hello. This is Yamauchi.
>>>>>>>>>
>>>>>>>>> So it seems there is a problem with the start of stonith-helper.
>>>>>>>>>
>>>>>>>>> If you put
>>>>>>>>>
>>>>>>>>> #!/bin/bash -x
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> at the top of stonith-helper and start the cluster, you may learn something.
>>>>>>>>>
>>>>>>>>> Incidentally, I believe the stonith-helper log is also being written somewhere...
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> That is all.
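
When a script is started by a daemon rather than from a terminal, the `-x` trace goes to the script's standard error, which may not end up anywhere visible; no trace output appears anywhere in this thread. A minimal sketch, assuming it is acceptable to edit the script temporarily, that sends standard error to a file before turning tracing on. The function name `enable_trace` and the log path in the comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append this shell's stderr to the file given as $1
# and switch on command tracing, so the trace is kept even when the
# calling process discards stderr.
enable_trace() {
    exec 2>>"$1"
    set -x
}

# Intended use near the top of the script under investigation, e.g.:
#   enable_trace /tmp/stonith-helper.trace    # path is an assumption
```

The redirection stays in effect for the rest of the script, so the extra line should be removed again once the trace has been collected.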
>>>>>>>>>
>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>> Date: 2015/3/17, Tue 12:31
>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yamauchi-san,
>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>
>>>>>>>>>> Hello. This is Fukuda.
>>>>>>>>>>
>>>>>>>>>> xen0 was in the same directory.
>>>>>>>>>>
>>>>>>>>>> # pwd
>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external
>>>>>>>>>>
>>>>>>>>>> # ls
>>>>>>>>>> drac5          ibmrsa         kdumpcheck  riloe           vmware
>>>>>>>>>> dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
>>>>>>>>>> hetzner        ipmi           nut         stonith-helper  xen0-ha
>>>>>>>>>> hmchttp        ippower9258    rackpdu     vcenter
>>>>>>>>>>
>>>>>>>>>> Thank you for your help.
>>>>>>>>>>
>>>>>>>>>> That is all.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2015-03-17 10:53 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>>>>>>>>
>>>>>>>>>> Fukuda-san,
>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>
>>>>>>>>>>> Hello. This is Yamauchi.
>>>>>>>>>>>
>>>>>>>>>>>> There was no standard output or standard error output.
>>>>>>>>>>>>
>>>>>>>>>>>> Is something wrong with stonith-helper?
>>>>>>>>>>>> Because stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>>>>>>> stonith-helper is located here:
>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>
>>>>>>>>>>> Is xen0 also in this directory?
>>>>>>>>>>> If it is not, that is a problem, so please try copying the stonith-helper file, with its attributes unchanged,
>>>>>>>>>>> into the same directory as xen0.
>>>>>>>>>>>
>>>>>>>>>>> If it works after that, it means there is a problem with the pm_extras installation.
>>>>>>>>>>>
>>>>>>>>>>> That is all.
>>>>>>>>>>>
>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>> Date: 2015/3/17, Tue 10:31
>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>
>>>>>>>>>>>> Good morning. This is Fukuda.
>>>>>>>>>>>> Thank you for the crm example.
>>>>>>>>>>>>
>>>>>>>>>>>> I adapted it to our environment right away.
>>>>>>>>>>>>
>>>>>>>>>>>> $ cat test.crm
>>>>>>>>>>>> ### Cluster Option ###
>>>>>>>>>>>> property \
>>>>>>>>>>>>     no-quorum-policy="ignore" \
>>>>>>>>>>>>     stonith-enabled="true" \
>>>>>>>>>>>>     startup-fencing="false" \
>>>>>>>>>>>>     stonith-timeout="710s" \
>>>>>>>>>>>>     crmd-transition-delay="2s"
>>>>>>>>>>>>
>>>>>>>>>>>> ### Resource Default ###
>>>>>>>>>>>> rsc_defaults \
>>>>>>>>>>>>     resource-stickiness="INFINITY" \
>>>>>>>>>>>>     migration-threshold="1"
>>>>>>>>>>>>
>>>>>>>>>>>> ### Group Configuration ###
>>>>>>>>>>>> group HAvarnish \
>>>>>>>>>>>>     vip_208 \
>>>>>>>>>>>>     varnishd
>>>>>>>>>>>>
>>>>>>>>>>>> group grpStonith1 \
>>>>>>>>>>>>     Stonith1-1 \
>>>>>>>>>>>>     Stonith1-2
>>>>>>>>>>>>
>>>>>>>>>>>> group grpStonith2 \
>>>>>>>>>>>>     Stonith2-1 \
>>>>>>>>>>>>     Stonith2-2
>>>>>>>>>>>>
>>>>>>>>>>>> ### Clone Configuration ###
>>>>>>>>>>>> clone clone_ping \
>>>>>>>>>>>>     ping
>>>>>>>>>>>>
>>>>>>>>>>>> ### Fencing Topology ###
>>>>>>>>>>>> fencing_topology \
>>>>>>>>>>>>     lbv1.beta.com: Stonith1-1 Stonith1-2 \
>>>>>>>>>>>>     lbv2.beta.com: Stonith2-1 Stonith2-2
>>>>>>>>>>>>
>>>>>>>>>>>> ### Primitive Configuration ###
>>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
>>>>>>>>>>>>     params \
>>>>>>>>>>>>         ip="192.168.17.208" \
>>>>>>>>>>>>         nic="eth0" \
>>>>>>>>>>>>         cidr_netmask="24" \
>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>     op monitor interval="5s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>
>>>>>>>>>>>> primitive varnishd lsb:varnish \
>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>
>>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
>>>>>>>>>>>>     params \
>>>>>>>>>>>>         name="default_ping_set" \
>>>>>>>>>>>>         host_list="192.168.17.254" \
>>>>>>>>>>>>         multiplier="100" \
>>>>>>>>>>>>         dampen="1" \
>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>
>>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
>>>>>>>>>>>>     params \
>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>         hostlist="lbv1.beta.com" \
>>>>>>>>>>>>         dead_check_target="192.168.17.132 10.0.17.132" \
>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>>         run_online_check="yes" \
>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
>>>>>>>>>>>>     params \
>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>         hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
>>>>>>>>>>>>         dom0="xen0.beta.com" \
>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>>     params \
>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>         hostlist="lbv2.beta.com" \
>>>>>>>>>>>>         dead_check_target="192.168.17.133 10.0.17.133" \
>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>>         run_online_check="yes" \
>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>> primitive Stonith2-2
> stonith:external/xen0 \
>>>>>>>>>>>>     params \
>>>>>>>>>>>>        
> pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>        
> hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
>>>>>>>>>>>>        
> dom0="xen0.beta.com" \
>>>>>>>>>>>>     op start interval="0s"
> timeout="60s" on-fail="restart" \
>>>>>>>>>>>>     op monitor
> interval="3600s" timeout="60s" on-fail="restart"
> \
>>>>>>>>>>>>     op stop interval="0s"
> timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>> ### Resource Location ###
>>>>>>>>>>>> location HA_location-1 HAvarnish
> \
>>>>>>>>>>>>     rule 200: #uname eq
> lbv1.beta.com \
>>>>>>>>>>>>     rule 100: #uname eq
> lbv2.beta.com
>>>>>>>>>>>>
>>>>>>>>>>>> location HA_location-2 HAvarnish
> \
>>>>>>>>>>>>     rule -INFINITY: not_defined
> default_ping_set or default_ping_set lt 100
>>>>>>>>>>>>
>>>>>>>>>>>> location HA_location-3 grpStonith1
> \
>>>>>>>>>>>>     rule -INFINITY: #uname eq
> lbv1.beta.com
>>>>>>>>>>>>
>>>>>>>>>>>> location HA_location-4 grpStonith2
> \
>>>>>>>>>>>>     rule -INFINITY: #uname eq
> lbv2.beta.com
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> After loading this configuration, the messages differ from yesterday's.
>>>>>>>>>>>> The ping messages are gone.
>>>>>>>>>>>>
>>>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>>> Last updated: Tue Mar 17 10:21:28 2015
>>>>>>>>>>>> Last change: Tue Mar 17 10:21:09 2015
>>>>>>>>>>>> Stack: heartbeat
>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>>>> 2 Nodes configured
>>>>>>>>>>>> 8 Resources configured
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>
>>>>>>>>>>>> Full list of resources:
>>>>>>>>>>>>
>>>>>>>>>>>>  Resource Group: HAvarnish
>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>>>>>  Resource Group: grpStonith1
>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>  Resource Group: grpStonith2
>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>  Clone Set: clone_ping [ping]
>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>
>>>>>>>>>>>> Node Attributes:
>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>
>>>>>>>>>>>> Migration summary:
>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>>
>>>>>>>>>>>> Failed actions:
>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Here is the log from /var/log/ha-debug.
>>>>>>>>>>>>
>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>>>>>>>>>>
>>>>>>>>>>>> There was no output on stdout or stderr.
>>>>>>>>>>>>
>>>>>>>>>>>> Could something be wrong with stonith-helper?
>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>>>>>>> stonith-helper is located here:
>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>>
>>>>>>>>>>>> That is all.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>>>>>>>>>>
>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Good morning, this is Yamauchi.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Just in case, here is an excerpt from an example I have on hand that uses multiple stonith devices.
>>>>>>>>>>>>> (In the actual configuration, watch out for line breaks.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> The example below is a configuration for the Pacemaker 1.1 series:
>>>>>>>>>>>>> for nodea, stonith runs prmStonith1-1 first, then prmStonith1-2;
>>>>>>>>>>>>> for nodeb, stonith runs prmStonith2-1 first, then prmStonith2-2.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The stonith plugins themselves are helper and ssh.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>> ### Group Configuration ###
>>>>>>>>>>>>> group grpStonith1 \
>>>>>>>>>>>>>     prmStonith1-1 \
>>>>>>>>>>>>>     prmStonith1-2
>>>>>>>>>>>>>
>>>>>>>>>>>>> group grpStonith2 \
>>>>>>>>>>>>>     prmStonith2-1 \
>>>>>>>>>>>>>     prmStonith2-2
>>>>>>>>>>>>>
>>>>>>>>>>>>> ### Fencing Topology ###
>>>>>>>>>>>>> fencing_topology \
>>>>>>>>>>>>>     nodea: prmStonith1-1 prmStonith1-2 \
>>>>>>>>>>>>>     nodeb: prmStonith2-1 prmStonith2-2
>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>> primitive prmStonith1-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>         hostlist="nodea" \
>>>>>>>>>>>>>         dead_check_target="192.168.28.60 192.168.28.70" \
>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>>         run_online_check="yes" \
>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>
>>>>>>>>>>>>> primitive prmStonith1-2 stonith:external/ssh \
>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>         hostlist="nodea" \
>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>
>>>>>>>>>>>>> primitive prmStonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>         hostlist="nodeb" \
>>>>>>>>>>>>>         dead_check_target="192.168.28.61 192.168.28.71" \
>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>>         run_online_check="yes" \
>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>
>>>>>>>>>>>>> primitive prmStonith2-2 stonith:external/ssh \
>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>         hostlist="nodeb" \
>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>> location rsc_location-grpStonith1-2 grpStonith1 \
>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodea
>>>>>>>>>>>>> location rsc_location-grpStonith2-3 grpStonith2 \
>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodeb
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> That is all.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>>
>>>>>>>>>>>> ELF Systems
>>>>>>>>>>>> Masamichi Fukuda
>>>>>>>>>>>> mail to:
> masamichi_fukuda@elf-systems.com
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Linux-ha-japan mailing list
>>>>>>>>>>> Linux-ha-japan@lists.sourceforge.jp
>>>>>>>>>>> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
>>>>>>>>>>>

Re: STONITH errors during split-brain [ In reply to ]
Yamauchi-san,

Good evening, this is Fukuda.

I wonder whether I am going about the -x setting for stonith-helper the wrong way.
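
One way to obtain an `-x` trace without editing a script's shebang line is to run the script through `bash -x` and redirect the trace, which bash writes to stderr, into a file. The following is only a minimal sketch: `demo_plugin.sh` is a hypothetical stand-in created in a temporary directory, not the real stonith-helper, and the `status` argument is an assumption used for illustration.

```shell
#!/bin/bash
# Minimal sketch: capture an xtrace of a shell script without editing its shebang.
# demo_plugin.sh is a hypothetical stand-in, NOT the real stonith-helper.
workdir=$(mktemp -d)
cat > "$workdir/demo_plugin.sh" <<'EOF'
#!/bin/sh
# Stand-in plugin: succeed for "status", fail for anything else.
case "$1" in
  status) exit 0 ;;
  *)      exit 1 ;;
esac
EOF

# trace_script SCRIPT TRACEFILE [ARGS...]
# Runs SCRIPT under "bash -x"; the trace goes to TRACEFILE and the
# script's exit code is returned unchanged.
trace_script() {
  script=$1
  tracefile=$2
  shift 2
  bash -x "$script" "$@" 2> "$tracefile"
}

trace_script "$workdir/demo_plugin.sh" "$workdir/trace.log" status
rc=$?
echo "exit code: $rc"
cat "$workdir/trace.log"
```

When a daemon rather than an interactive shell starts the script, stderr may not end up anywhere visible, so writing the trace to a file as above tends to be easier to inspect than relying on the daemon's log.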

I removed stonith-helper and started the cluster with only xen0.

# crm_mon -rfA
Last updated: Tue Mar 17 23:38:53 2015
Last change: Tue Mar 17 23:30:34 2015
Stack: heartbeat
Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
Version: 1.1.12-e32080b
2 Nodes configured
6 Resources configured


Online: [ lbv1.beta.com lbv2.beta.com ]

Full list of resources:

 Stonith1-2 (stonith:external/xen0):        Stopped
 Stonith2-2 (stonith:external/xen0):        Stopped
 Resource Group: HAvarnish
     vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
     varnishd   (lsb:varnish):  Started lbv1.beta.com
 Clone Set: clone_ping [ping]
     Started: [ lbv1.beta.com lbv2.beta.com ]

Node Attributes:
* Node lbv1.beta.com:
    + default_ping_set                  : 100
* Node lbv2.beta.com:
    + default_ping_set                  : 100

Migration summary:
* Node lbv1.beta.com:
   Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:34 2015'
* Node lbv2.beta.com:
   Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:27 2015'

Failed actions:
    Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32 2015', queued=0ms, exec=1061ms
    Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25 2015', queued=0ms, exec=1342ms
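
When comparing several runs like this, it can help to reduce the output to just the failed operation and the node it failed on. The sketch below does that with awk; the sample text is the "Failed actions" section from the output above with the terminal line wraps joined, and the function name `failed_actions` is made up for this example.

```shell
#!/bin/sh
# Sketch: list "<operation> <node>" for each entry under "Failed actions:".
# The sample is copied from the crm_mon output above, with line wraps joined.
sample=$(cat <<'EOF'
Failed actions:
    Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32 2015', queued=0ms, exec=1061ms
    Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25 2015', queued=0ms, exec=1342ms
EOF
)

# failed_actions: read crm_mon-style text on stdin, print "operation node" pairs.
failed_actions() {
  awk '
    /^Failed actions:/   { in_failed = 1; next }  # section header
    in_failed && NF == 0 { in_failed = 0 }        # a blank line ends the section
    in_failed && $2 == "on" { print $1, $3 }      # "<op> on <node> ..."
  '
}

result=$(printf '%s\n' "$sample" | failed_actions)
echo "$result"
```

On a live node the same function could presumably be fed from a one-shot `crm_mon -1`, although the exact layout of the "Failed actions" section varies between Pacemaker versions, so the field positions used here are an assumption tied to the output shown in this thread.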


The same failed actions appear as when stonith-helper was present.

Thank you in advance.

That is all.
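
The syslog quoted further down in this thread contains the line `failed to exec "stonith"`, which may mean (this is not confirmed anywhere in the thread) that stonith-ng could not find the `stonith` command on its PATH. A minimal sketch of how such a lookup can be checked follows; the helper name `has_command` and the stand-in executable are hypothetical and exist only to make the example self-contained.

```shell
#!/bin/sh
# Sketch: report whether a command resolves under a given PATH value.
# has_command NAME PATHVALUE -> exit status 0 if NAME is found in PATHVALUE.
has_command() {
  # The subshell keeps the PATH change from leaking into the caller.
  ( PATH=$2; command -v "$1" >/dev/null 2>&1 )
}

# Self-contained demo using a stand-in executable (NOT the real stonith binary).
demo_dir=$(mktemp -d)
printf '#!/bin/sh\nexit 0\n' > "$demo_dir/stonith"
chmod +x "$demo_dir/stonith"

if has_command stonith "$demo_dir"; then found_with=yes; else found_with=no; fi
if has_command stonith "/nonexistent-dir"; then found_without=yes; else found_without=no; fi
echo "with demo dir: $found_with, without: $found_without"
```

On a cluster node, calling `has_command stonith "$PATH"` from the environment the daemons are started in would show whether the command resolves at all. With a build installed under the /usr/local/heartbeat prefix seen in this thread, the binary could live in a directory such as /usr/local/heartbeat/sbin, but that location is an assumption and should be verified on the actual system.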


2015-03-17 22:38 <renayama19661014@ybb.ne.jp>:

> Fukuda-san,
>
> Good evening, this is Yamauchi.
>
> By the way, if possible, checking what happens when you remove external/stonith-helper
> and leave only external/xen0 may help isolate the problem.
>
> That is all.
>
>
>
> ----- Original Message -----
> > From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
> > To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> > Cc:
> > Date: 2015/3/17, Tue 22:28
> > Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >
> > Fukuda-san,
> >
> > Good evening, this is Yamauchi.
> >
> > It looks like nothing has changed...
> >
> > For now, probably tomorrow, I will check on RHEL, with the combination of
> >
> > Heartbeat3.0.6
> > the latest Pacemaker
> >
> > whether stonith-helper works under a similar configuration (the resources will be Dummy, and external/xen0 will be replaced by external/ssh).
> >
> > # If we could see the output of stonith-helper with -x specified, it would be easier to narrow down the problem...
> >
> >
> > That is all.
> >
> >
> >
> > ----- Original Message -----
> >> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >> Date: 2015/3/17, Tue 21:24
> >> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>
> >>
> >> Yamauchi-san,
> >>
> >> Good evening, this is Fukuda.
> >> Thank you for the information about the latest version.
> >>
> >> I installed it right away.
> >>
> >> Here is the state after startup.
> >>
> >> The failed actions appear to be unchanged.
> >>
> >>
> >>
> >> # crm_mon -rfA
> >> Last updated: Tue Mar 17 21:03:49 2015
> >> Last change: Tue Mar 17 20:30:58 2015
> >> Stack: heartbeat
> >> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
> >> Version: 1.1.12-e32080b
> >> 2 Nodes configured
> >> 8 Resources configured
> >>
> >>
> >> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>
> >> Full list of resources:
> >>
> >>  Resource Group: HAvarnish
> >>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
> >>      varnishd   (lsb:varnish):  Started lbv1.beta.com
> >>  Resource Group: grpStonith1
> >>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
> >>      Stonith1-2 (stonith:external/xen0):        Stopped
> >>  Resource Group: grpStonith2
> >>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
> >>      Stonith2-2 (stonith:external/xen0):        Stopped
> >>  Clone Set: clone_ping [ping]
> >>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >>
> >> Node Attributes:
> >> * Node lbv1.beta.com:
> >>     + default_ping_set                  : 100
> >> * Node lbv2.beta.com:
> >>     + default_ping_set                  : 100
> >>
> >> Migration summary:
> >> * Node lbv1.beta.com:
> >>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
> >> * Node lbv2.beta.com:
> >>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
> >>
> >> Failed actions:
> >>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
> >>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
> >>
> >>
> >>
> >>
> >> Here are the logs.
> >>
> >>
> >> # less /var/log/ha-debug
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
> >> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
> >> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
> >> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist=
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109 gid 113 (pid 4250)
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109 gid 113 (pid 4246)
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109 gid 113 (pid 4249)
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109 gid 113 (pid 4245)
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0 gid 0 (pid 4248)
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0 (pid 4247)
> >> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname: lbv1.beta.com
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client ccm is set to 1024
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client attrd is set to 1024
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client stonith-ng is set to 1024
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status active
> >> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client cib is set to 1024
> >> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [15:17]
> >> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [19:21]
> >> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client crmd is set to 1024
> >> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [24:26]
> >> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [26:28]
> >> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [30:32]
> >> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >>
> >>
> >>
> >> # less /var/log/error
> >>
> >> Mar 17 21:02:47 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >> Mar 17 21:02:48 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
> >>
> >> # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
> >> Mar 17 21:03:24 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
> >> Mar 17 21:03:27 lbv1 crmd[4250]: notice: run_graph: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
> >> Mar 17 21:03:29 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
> >> Mar 17 21:03:34 lbv1 crmd[4250]: notice: run_graph: Transition 1 (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
> >> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
> >> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
> >> Mar 17 21:03:37 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
> >> Mar 17 21:03:39 lbv1 stonith-ng[4247]: notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
> >> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
> >> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
> >> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ failed: 2 ]
> >> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
> >> Mar 17 21:03:40 lbv1 crmd[4250]: notice: run_graph: Transition 2 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
> >> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
> >> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
> >> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
> >> Mar 17 21:03:42 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
> >> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
> >> Mar 17 21:03:47 lbv1 crmd[4250]: notice: run_graph: Transition 3 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
> >>
> >> Thank you in advance.
> >>
> >> That is all.
> >>
> >>
> >>
> >> 2015-03-17 18:31 <renayama19661014@ybb.ne.jp>:
> >>
> >> Fukuda-san,
> >>>
> >>> Good evening, this is Yamauchi.
> >>>
> >>> It has not been tagged, so the latest version as of today is:
> >>>
> >>> * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
> >>>
> >>>
> >>> You can download it via [Download ZIP] on the right-hand side.
> >>>
> >>> That is all.
> >>>
> >>>
> >>> ----- Original Message -----
> >>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>> To: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>> Date: 2015/3/17, Tue 18:07
> >>>> Subject: STONITH errors during split-brain
> >>>>
> >>>>
> >>>> Yamauchi-san,
> >>>>
> >>>>
> >>>> Hello, this is Fukuda.
> >>>>
> >>>>
> >>>> I looked at this page:
> >>>> https://github.com/ClusterLabs/pacemaker/tags
> >>>>
> >>>>
> >>>>
> >>>> pacemaker 1.1.12 561c4cf appears to be the latest there.
> >>>> Sorry to trouble you, but could you tell me where to find a version newer than that?
> >>>>
> >>>>
> >>>> Thank you in advance.
> >>>>
> >>>>
> >>>> That is all.
> >>>>
> >>>>
> >>>>
> >>>> On Tuesday, 2015-03-17, <renayama19661014@ybb.ne.jp> wrote:
> >>>>
> >>>> Fukuda-san,
> >>>>>
> >>>>> Hello, this is Yamauchi.
> >>>>>
> >>>>> Yes, it is old.
> >>>>>
> >>>>> Pacemaker gained support for Heartbeat 3.0.6 surprisingly recently.
> >>>>> Please install something newer. (You will need to build it from source again, though...)
> >>>>>
> >>>>>
> >>>>>
> >>>>> It is available from the upstream GitHub repository:
> >>>>> * https://github.com/ClusterLabs/pacemaker
> >>>>>
> >>>>>
> >>>>> The latest master can sometimes produce errors; if that happens, I suggest
> >>>>> stepping back through older revisions.
> >>>>>
> >>>>> That is all.
> >>>>>
> >>>>>
> >>>>>
> >>>>> ----- Original Message -----
> >>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>> Date: 2015/3/17, Tue 16:06
> >>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>
> >>>>>>
> >>>>>> Yamauchi-san,
> >>>>>>
> >>>>>> Hello, this is Fukuda.
> >>>>>>
> >>>>>> In an earlier email you advised that I should install the latest versions of heartbeat and pacemaker.
> >>>>>> So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12.
> >>>>>>
> >>>>>> heartbeat configuration: Version = "3.0.6"
> >>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
> >>>>>>
> >>>>>> Does this mean pacemaker is still too old?
> >>>>>>
> >>>>>> Sorry to trouble you; thank you in advance.
> >>>>>>
> >>>>>> That is all.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> 2015-03-17 14:59 <renayama19661014@ybb.ne.jp>:
> >>>>>>
> >>>>>> Fukuda-san,
> >>>>>>>
> >>>>>>> Hello, this is Yamauchi.
> >>>>>>>
> >>>>>>> It just occurred to me: in an earlier email in this exchange I answered as follows. Has that been taken care of?
> >>>>>>>
> >>>>>>>
> >>>>>>>>>>>>> 2) Heartbeat3.0.6 + latest Pacemaker : OK
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> It appears that Heartbeat also needs to be the latest version, 3.0.6.
> >>>>>>>>>>>>> * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
> >>>>>>>
> >>>>>>> Judging from the crm_mon version below, it appears to be 1.1.12.
> >>>>>>> Combining it with Heartbeat 3.0.6 requires a fairly recent Pacemaker.
> >>>>>>>
> >>>>>>>> # crm_mon -rfA
> >>>>>>>>
> >>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
> >>>>>>>> Last change: Tue Mar 17 14:01:43 2015
> >>>>>>>> Stack: heartbeat
> >>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
> >>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>>
> >>>>>>> I think you need at least a version from the following change onward.
> >>>>>>>
> >>>>>>> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> That is all.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> ----- Original Message -----
> >>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>> Date: 2015/3/17, Tue 14:38
> >>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Yamauchi-san,
> >>>>>>>>
> >>>>>>>> Hello, this is Fukuda.
> >>>>>>>>
> >>>>>>>> Should I add -x to the shebang line of stonith-helper?
> >>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
> >>>>>>>>
> >>>>>>>> crm_mon shows no change from before.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> # crm_mon -rfA
> >>>>>>>>
> >>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
> >>>>>>>> Last change: Tue Mar 17 14:01:43 2015
> >>>>>>>> Stack: heartbeat
> >>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
> >>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>>> 2 Nodes configured
> >>>>>>>> 8 Resources configured
> >>>>>>>>
> >>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>
> >>>>>>>> Full list of resources:
> >>>>>>>>
> >>>>>>>>  Resource Group: HAvarnish
> >>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
> >>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
> >>>>>>>>  Resource Group: grpStonith1
> >>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
> >>>>>>>>  Resource Group: grpStonith2
> >>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
> >>>>>>>>  Clone Set: clone_ping [ping]
> >>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>
> >>>>>>>> Node Attributes:
> >>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>     + default_ping_set                  : 100
> >>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>     + default_ping_set                  : 100
> >>>>>>>>
> >>>>>>>> Migration summary:
> >>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
> >>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
> >>>>>>>>
> >>>>>>>> Failed actions:
> >>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
> >>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
> >>>>>>>>
> >>>>>>>> I looked for other logs.
> >>>>>>>>
> >>>>>>>> This is from heartbeat startup.
> >>>>>>>>
> >>>>>>>> # less /var/log/pm_logconv.out
> >>>>>>>> Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
> >>>>>>>> Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
> >>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
> >>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
> >>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
> >>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
> >>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
> >>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> # less /var/log/error
> >>>>>>>> Mar 17 14:12:20 lbv1 crmd[13269]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Here is what grepping syslog for stonith returns:
> >>>>>>>>
> >>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0 (pid 13266)
> >>>>>>>> Mar 17 14:11:34 lbv1 stonithd[13266]: notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
> >>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
> >>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: notice: setup_cib: Watching for stonith topology changes
> >>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: notice: unpack_config: On loss of CCM Quorum: Ignore
> >>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
> >>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
> >>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]: notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
> >>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]: notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
> >>>>>>>> Mar 17 14:12:04 lbv1 stonithd[13266]: notice: xml_patch_version_check: Versions did not change in patch 0.5.0
> >>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
> >>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
> >>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
> >>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ failed: 2 ]
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thank you in advance.
> >>>>>>>>
> >>>>>>>> That is all.
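The `failed to exec "stonith"` line in the stonithd log above suggests that stonithd could not run the `stonith` command-line tool when it attempted `stonith -t external/stonith-helper -S`. A minimal sketch for checking whether the tool can be found, assuming (the thread does not confirm this) that it was installed under the same `/usr/local/heartbeat` prefix. The `sbin` subdirectory and the helper name `find_in_path` are hypothetical:

```shell
#!/bin/sh
# find_in_path CMD DIRS: print the first executable file named CMD found in
# the colon-separated directory list DIRS; fail if there is none.
# The body runs in a subshell, so the IFS change does not leak to the caller.
find_in_path() (
    cmd=$1
    IFS=:
    for dir in $2; do
        if [ -n "$dir" ] && [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            printf '%s\n' "$dir/$cmd"
            exit 0
        fi
    done
    exit 1
)

# Is "stonith" visible on the PATH of the current shell?
find_in_path stonith "$PATH" || echo "stonith: not found in PATH"

# Assumed (hypothetical) location for a build installed under /usr/local/heartbeat.
find_in_path stonith /usr/local/heartbeat/sbin || echo "stonith: not found in /usr/local/heartbeat/sbin"
```

The PATH of an interactive shell can differ from the one the daemon inherited from heartbeat, so a hit here does not by itself prove that stonithd can see the tool.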
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On March 17, 2015 at 13:32, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>>>
> >>>>>>>> Fukuda-san,
> >>>>>>>>>
> >>>>>>>>> Hello, this is Yamauchi.
> >>>>>>>>>
> >>>>>>>>> In that case, the problem seems to be in the start operation of stonith-helper.
> >>>>>>>>>
> >>>>>>>>> If you put
> >>>>>>>>>
> >>>>>>>>> #!/bin/bash -x
> >>>>>>>>>
> >>>>>>>>> at the top of stonith-helper and start the cluster, we may learn something.
> >>>>>>>>>
> >>>>>>>>> Incidentally, I believe stonith-helper also writes its own log somewhere...
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> That is all.
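Because stonithd runs the plugin itself, the `-x` trace written to standard error may never reach a terminal or an obvious log file. A minimal sketch of one way to capture it from inside the script; the log path `/tmp/stonith-helper.trace`, the `TRACE_FILE` variable, and the function name `enable_trace` are all hypothetical, and the lines are meant to sit near the top of a copy of the script being debugged:

```shell
#!/bin/sh
# enable_trace FILE: from this point on, append everything the script writes
# to stderr, including the "set -x" command trace, to FILE.
enable_trace() {
    exec 2>>"$1"
    PS4='+ line ${LINENO}: '
    set -x
}

# Hypothetical log location; TRACE_FILE can override it.
enable_trace "${TRACE_FILE:-/tmp/stonith-helper.trace}"

# ... the rest of the script being debugged would follow here ...
```

Redirecting inside the script keeps the trace regardless of how the caller handles the plugin's stderr, at the cost of leaving a file that has to be cleaned up afterwards.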
> >>>>>>>>>
> >>>>>>>>> ----- Original Message -----
> >>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>> Date: 2015/3/17, Tue 12:31
> >>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Yamauchi-san,
> >>>>>>>>>> cc: Matsushima-san
> >>>>>>>>>>
> >>>>>>>>>> Hello, this is Fukuda.
> >>>>>>>>>>
> >>>>>>>>>> xen0 was in the same directory.
> >>>>>>>>>>
> >>>>>>>>>> # pwd
> >>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external
> >>>>>>>>>>
> >>>>>>>>>> # ls
> >>>>>>>>>> drac5          ibmrsa         kdumpcheck  riloe           vmware
> >>>>>>>>>> dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
> >>>>>>>>>> hetzner        ipmi           nut         stonith-helper  xen0-ha
> >>>>>>>>>> hmchttp        ippower9258    rackpdu     vcenter
> >>>>>>>>>>
> >>>>>>>>>> Thank you in advance.
> >>>>>>>>>>
> >>>>>>>>>> That is all.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> 2015-03-17 10:53 GMT+09:00 <renayama19661014@ybb.ne.jp>:
> >>>>>>>>>>
> >>>>>>>>>> Fukuda-san,
> >>>>>>>>>>> cc: Matsushima-san
> >>>>>>>>>>>
> >>>>>>>>>>> Hello, this is Yamauchi.
> >>>>>>>>>>>
> >>>>>>>>>>>> There was no standard output or standard error output.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Is something wrong with stonith-helper?
> >>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
> >>>>>>>>>>>> stonith-helper is located here:
> >>>>>>>>>>>>
> >>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >>>>>>>>>>>
> >>>>>>>>>>> Is xen0 in this directory as well?
> >>>>>>>>>>>
> >>>>>>>>>>> If it is not, that is a problem, so please try copying the stonith-helper file,
> >>>>>>>>>>> with its attributes preserved, into the same directory as xen0.
> >>>>>>>>>>>
> >>>>>>>>>>> If it works after that, the problem lies in the pm_extras installation.
> >>>>>>>>>>>
> >>>>>>>>>>> That is all.
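The check described above (are `xen0` and `stonith-helper` in the same directory, and are both executable) can be sketched as a small shell function. The directory is the one quoted in this thread; the function name `check_plugins` is hypothetical:

```shell
#!/bin/sh
# check_plugins DIR NAME...: report, for each NAME, whether DIR/NAME exists
# and is executable. Fails if any plugin is missing or not executable.
check_plugins() (
    dir=$1
    shift
    rc=0
    for name in "$@"; do
        if [ ! -e "$dir/$name" ]; then
            echo "$name: missing"
            rc=1
        elif [ ! -x "$dir/$name" ]; then
            echo "$name: not executable"
            rc=1
        else
            echo "$name: ok"
        fi
    done
    exit "$rc"
)

check_plugins /usr/local/heartbeat/lib/stonith/plugins/external stonith-helper xen0 || true
```

When copying a plugin between directories, `cp -p` preserves the mode and ownership, which matches the request to keep the file's attributes as they are.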
> >>>>>>>>>>>
> >>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>>>> Date: 2015/3/17, Tue 10:31
> >>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yamauchi-san,
> >>>>>>>>>>>> cc: Matsushima-san
> >>>>>>>>>>>>
> >>>>>>>>>>>> Good morning, this is Fukuda.
> >>>>>>>>>>>> Thank you for the crm example.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I adapted it to our environment right away.
> >>>>>>>>>>>>
> >>>>>>>>>>>> $ cat test.crm
> >>>>>>>>>>>> ### Cluster Option ###
> >>>>>>>>>>>> property \
> >>>>>>>>>>>>     no-quorum-policy="ignore" \
> >>>>>>>>>>>>     stonith-enabled="true" \
> >>>>>>>>>>>>     startup-fencing="false" \
> >>>>>>>>>>>>     stonith-timeout="710s" \
> >>>>>>>>>>>>     crmd-transition-delay="2s"
> >>>>>>>>>>>>
> >>>>>>>>>>>> ### Resource Default ###
> >>>>>>>>>>>> rsc_defaults \
> >>>>>>>>>>>>     resource-stickiness="INFINITY" \
> >>>>>>>>>>>>     migration-threshold="1"
> >>>>>>>>>>>>
> >>>>>>>>>>>> ### Group Configuration ###
> >>>>>>>>>>>> group HAvarnish \
> >>>>>>>>>>>>     vip_208 \
> >>>>>>>>>>>>     varnishd
> >>>>>>>>>>>>
> >>>>>>>>>>>> group grpStonith1 \
> >>>>>>>>>>>>     Stonith1-1 \
> >>>>>>>>>>>>     Stonith1-2
> >>>>>>>>>>>>
> >>>>>>>>>>>> group grpStonith2 \
> >>>>>>>>>>>>     Stonith2-1 \
> >>>>>>>>>>>>     Stonith2-2
> >>>>>>>>>>>>
> >>>>>>>>>>>> ### Clone Configuration ###
> >>>>>>>>>>>> clone clone_ping \
> >>>>>>>>>>>>     ping
> >>>>>>>>>>>>
> >>>>>>>>>>>> ### Fencing Topology ###
> >>>>>>>>>>>> fencing_topology \
> >>>>>>>>>>>>     lbv1.beta.com: Stonith1-1 Stonith1-2 \
> >>>>>>>>>>>>     lbv2.beta.com: Stonith2-1 Stonith2-2
> >>>>>>>>>>>>
> >>>>>>>>>>>> ### Primitive Configuration ###
> >>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
> >>>>>>>>>>>>     params \
> >>>>>>>>>>>>         ip="192.168.17.208" \
> >>>>>>>>>>>>         nic="eth0" \
> >>>>>>>>>>>>         cidr_netmask="24" \
> >>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>>>>>>>     op monitor interval="5s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>>>>>>>
> >>>>>>>>>>>> primitive varnishd lsb:varnish \
> >>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>>>>>>>
> >>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
> >>>>>>>>>>>>     params \
> >>>>>>>>>>>>         name="default_ping_set" \
> >>>>>>>>>>>>         host_list="192.168.17.254" \
> >>>>>>>>>>>>         multiplier="100" \
> >>>>>>>>>>>>         dampen="1" \
> >>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>>>>>>>
> >>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>     params \
> >>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>         hostlist="lbv1.beta.com" \
> >>>>>>>>>>>>         dead_check_target="192.168.17.132 10.0.17.132" \
> >>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>
> >>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
> >>>>>>>>>>>>     params \
> >>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>         hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
> >>>>>>>>>>>>         dom0="xen0.beta.com" \
> >>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>
> >>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>     params \
> >>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>         hostlist="lbv2.beta.com" \
> >>>>>>>>>>>>         dead_check_target="192.168.17.133 10.0.17.133" \
> >>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>
> >>>>>>>>>>>> primitive Stonith2-2 stonith:external/xen0 \
> >>>>>>>>>>>>     params \
> >>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>         hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
> >>>>>>>>>>>>         dom0="xen0.beta.com" \
> >>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>
> >>>>>>>>>>>> ### Resource Location ###
> >>>>>>>>>>>> location HA_location-1 HAvarnish \
> >>>>>>>>>>>>     rule 200: #uname eq lbv1.beta.com \
> >>>>>>>>>>>>     rule 100: #uname eq lbv2.beta.com
> >>>>>>>>>>>>
> >>>>>>>>>>>> location HA_location-2 HAvarnish \
> >>>>>>>>>>>>     rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
> >>>>>>>>>>>>
> >>>>>>>>>>>> location HA_location-3 grpStonith1 \
> >>>>>>>>>>>>     rule -INFINITY: #uname eq lbv1.beta.com
> >>>>>>>>>>>>
> >>>>>>>>>>>> location HA_location-4 grpStonith2 \
> >>>>>>>>>>>>     rule -INFINITY: #uname eq lbv2.beta.com
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> After loading this configuration, the messages differ from yesterday's.
> >>>>>>>>>>>> The ping messages are gone.
> >>>>>>>>>>>>
> >>>>>>>>>>>> # crm_mon -rfA
> >>>>>>>>>>>> Last updated: Tue Mar 17 10:21:28 2015
> >>>>>>>>>>>> Last change: Tue Mar 17 10:21:09 2015
> >>>>>>>>>>>> Stack: heartbeat
> >>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
> >>>>>>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>>>>>>> 2 Nodes configured
> >>>>>>>>>>>> 8 Resources configured
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>>>>>
> >>>>>>>>>>>> Full list of resources:
> >>>>>>>>>>>>
> >>>>>>>>>>>>  Resource Group: HAvarnish
> >>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
> >>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
> >>>>>>>>>>>>  Resource Group: grpStonith1
> >>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
> >>>>>>>>>>>>  Resource Group: grpStonith2
> >>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
> >>>>>>>>>>>>  Clone Set: clone_ping [ping]
> >>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>>>>>
> >>>>>>>>>>>> Node Attributes:
> >>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>>>>     + default_ping_set                  : 100
> >>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>>>>     + default_ping_set                  : 100
> >>>>>>>>>>>>
> >>>>>>>>>>>> Migration summary:
> >>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
> >>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
> >>>>>>>>>>>>
> >>>>>>>>>>>> Failed actions:
> >>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
> >>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is the log from /var/log/ha-debug.
> >>>>>>>>>>>>
> >>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
> >>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
> >>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
> >>>>>>>>>>>>
> >>>>>>>>>>>> There was no standard output or standard error output.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Is something wrong with stonith-helper?
> >>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
> >>>>>>>>>>>> stonith-helper is located here:
> >>>>>>>>>>>>
> >>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thank you in advance.
> >>>>>>>>>>>>
> >>>>>>>>>>>> That is all.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Fukuda-san,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Good morning, this is Yamauchi.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Just in case, here is an excerpt from an example I have on hand that uses multiple stonith devices.
> >>>>>>>>>>>>> (When you actually use it, watch out for the line breaks.)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The example below is a configuration for the Pacemaker 1.1 series.
> >>>>>>>>>>>>> For nodea, stonith runs prmStonith1-1 and then prmStonith1-2, in that order.
> >>>>>>>>>>>>> For nodeb, stonith runs prmStonith2-1 and then prmStonith2-2, in that order.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The stonith plugins themselves are helper and ssh.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>> ### Group Configuration ###
> >>>>>>>>>>>>> group grpStonith1 \
> >>>>>>>>>>>>>     prmStonith1-1 \
> >>>>>>>>>>>>>     prmStonith1-2
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> group grpStonith2 \
> >>>>>>>>>>>>>     prmStonith2-1 \
> >>>>>>>>>>>>>     prmStonith2-2
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> ### Fencing Topology ###
> >>>>>>>>>>>>> fencing_topology \
> >>>>>>>>>>>>>     nodea: prmStonith1-1 prmStonith1-2 \
> >>>>>>>>>>>>>     nodeb: prmStonith2-1 prmStonith2-2
> >>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>> primitive prmStonith1-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>         hostlist="nodea" \
> >>>>>>>>>>>>>         dead_check_target="192.168.28.60 192.168.28.70" \
> >>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> primitive prmStonith1-2 stonith:external/ssh \
> >>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>         hostlist="nodea" \
> >>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> primitive prmStonith2-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>         hostlist="nodeb" \
> >>>>>>>>>>>>>         dead_check_target="192.168.28.61 192.168.28.71" \
> >>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> primitive prmStonith2-2 stonith:external/ssh \
> >>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>         hostlist="nodeb" \
> >>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>> location rsc_location-grpStonith1-2 grpStonith1 \
> >>>>>>>>>>>>>     rule -INFINITY: #uname eq nodea
> >>>>>>>>>>>>> location rsc_location-grpStonith2-3 grpStonith2 \
> >>>>>>>>>>>>>     rule -INFINITY: #uname eq nodeb
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> That is all.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> --
> >>>>>>>>>>>>
> >>>>>>>>>>>> ELF Systems
> >>>>>>>>>>>> Masamichi Fukuda
> >>>>>>>>>>>> mail to:
> > masamichi_fukuda@elf-systems.com
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> _______________________________________________
> >>>>>>>>>>> Linux-ha-japan mailing list
> >>>>>>>>>>> Linux-ha-japan@lists.sourceforge.jp
> >>>>>>>>>>> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>>>>>>>>>



--
ELF Systems
Masamichi Fukuda
mail to: *masamichi_fukuda@elf-systems.com <elfsystems.com@gmail.com>*
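Re: STONITH errors during split-brain [ In reply to ]
Fukuda-san,

Good evening, this is Yamauchi.

In that case, I think neither xen0 nor stonith-helper is located on the path that Pacemaker manages for its stonith plugins.

I suspect the installation of Reusable and pm_extras.

I will let you know if I find out anything more.

That is all.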
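One way to see whether more than one stonith plugin tree exists on a machine (for example one under `/usr/local/heartbeat` and another under a distribution prefix) is to search for directories whose path ends in `stonith/plugins/external`. A minimal sketch; the function name `list_plugin_dirs` and the search roots are assumptions, and nothing in this thread confirms that a second tree is present:

```shell
#!/bin/sh
# list_plugin_dirs ROOT...: print every directory below the given roots whose
# path ends in stonith/plugins/external, one per line.
list_plugin_dirs() {
    find "$@" -type d -path '*/stonith/plugins/external' 2>/dev/null
}

# Assumed search roots; /usr also covers /usr/local.
list_plugin_dirs /usr /opt || true
```

If two trees turn up, comparing their contents shows which one actually holds `xen0` and `stonith-helper`.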



----- Original Message -----
>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>Date: 2015/3/17, Tue 23:46
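>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>
>
>Yamauchi-san,
>
>Good evening, this is Fukuda.
>
>I wonder whether I am applying the -x setting to stonith-helper the wrong way.
>
>I removed stonith-helper and started the cluster with xen0 only.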
>
># crm_mon -rfA
>
>Last updated: Tue Mar 17 23:38:53 2015
>Last change: Tue Mar 17 23:30:34 2015
>Stack: heartbeat
>Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>Version: 1.1.12-e32080b
>2 Nodes configured
>6 Resources configured
>
>
>Online: [ lbv1.beta.com lbv2.beta.com ]
>
>Full list of resources:
>
>Stonith1-2      (stonith:external/xen0):        Stopped
>Stonith2-2      (stonith:external/xen0):        Stopped
> Resource Group: HAvarnish
>     vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>     varnishd   (lsb:varnish):  Started lbv1.beta.com
> Clone Set: clone_ping [ping]
>     Started: [ lbv1.beta.com lbv2.beta.com ]
>
>Node Attributes:
>* Node lbv1.beta.com:
>    + default_ping_set                  : 100
>* Node lbv2.beta.com:
>    + default_ping_set                  : 100
>
>Migration summary:
>* Node lbv1.beta.com:
>   Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:34 2015'
>* Node lbv2.beta.com:
>   Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:27 2015'
>
>Failed actions:
>    Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32 2015', queued=0ms, exec=1061ms
>    Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25 2015', queued=0ms, exec=1342ms
>
>
>
>
>The same failed actions appear as when stonith-helper was present.
>
>
>Thank you in advance.
>
>That is all.
>
>
>
>
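>On March 17, 2015 at 22:38, <renayama19661014@ybb.ne.jp> wrote:
>
>Fukuda-san,
>>
>>Good evening, this is Yamauchi.
>>
>>By the way, if possible, checking what happens when you remove external/stonith-helper
>>and use only external/xen0 might help isolate the problem.
>>
>>That is all.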
>>
>>
>>
>>----- Original Message -----
>>
>>> From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
>>> To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>> Cc:
>>> Date: 2015/3/17, Tue 22:28
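>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>
>>> Fukuda-san,
>>>
>>> Good evening, this is Yamauchi.
>>>
>>> It looks like nothing has changed...
>>>
>>> For now, around tomorrow, I will check on RHEL whether stonith-helper works with the combination of
>>>
>>> Heartbeat 3.0.6
>>> the latest Pacemaker
>>>
>>> using a similar configuration (the resources will be Dummy, and external/ssh will stand in for external/xen0).
>>>
>>> # If we could see the output from the -x setting on stonith-helper, it would be easier to narrow down the problem...
>>>
>>>
>>> That is all.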
>>>
>>>
>>>
>>> ----- Original Message -----
>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> Date: 2015/3/17, Tue 21:24
>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>
>>>>
>>>> Yamauchi-san,
>>>>
>>>> Good evening, this is Fukuda.
>>>> Thank you for the information about the latest version.
>>>>
>>>> I installed it right away.
>>>>
>>>> This is the state after startup.
>>>>
>>>> The failed actions appear to be unchanged.
>>>>
>>>>
>>>>
>>>> # crm_mon -rfA
>>>> Last updated: Tue Mar 17 21:03:49 2015
>>>> Last change: Tue Mar 17 20:30:58 2015
>>>> Stack: heartbeat
>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>>>> Version: 1.1.12-e32080b
>>>> 2 Nodes configured
>>>> 8 Resources configured
>>>>
>>>>
>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>
>>>> Full list of resources:
>>>>
>>>>  Resource Group: HAvarnish
>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>  Resource Group: grpStonith1
>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>  Resource Group: grpStonith2
>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>  Clone Set: clone_ping [ping]
>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>
>>>> Node Attributes:
>>>> * Node lbv1.beta.com:
>>>>     + default_ping_set                  : 100
>>>> * Node lbv2.beta.com:
>>>>     + default_ping_set                  : 100
>>>>
>>>> Migration summary:
>>>> * Node lbv1.beta.com:
>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
>>>> * Node lbv2.beta.com:
>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
>>>>
>>>> Failed actions:
>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
>>>>
>>>>
>>>>
>>>>
>>>> Here is the log.
>>>>
>>>>
>>>> # less /var/log/ha-debug
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support:
>>> yes
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File
>>> /etc/ha.d//haresources exists.
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used
>>> because pacemaker is enabled
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of:
>>> /usr/local/heartbeat/libexec/heartbeat/ccm
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of:
>>> /usr/local/heartbeat/libexec/pacemaker/cib
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of:
>>> /usr/local/heartbeat/libexec/pacemaker/stonithd
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of:
>>> /usr/local/heartbeat/libexec/pacemaker/lrmd
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of:
>>> /usr/local/heartbeat/libexec/pacemaker/attrd
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of:
>>> /usr/local/heartbeat/libexec/pacemaker/crmd
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be
>>> lost if multiple dumps occur.
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting
>>> non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum
>>> supportability
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting
>>> /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is
>>> disabled --enabling logging daemon is recommended
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info:
>>> **************************
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration
>>> validated. Starting heartbeat 3.0.6
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist=
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109  gid 113 (pid 4250)
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109  gid 113 (pid 4246)
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109  gid 113 (pid 4249)
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109  gid 113 (pid 4245)
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0  gid 0 (pid 4248)
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 4247)
>>>> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname: lbv1.beta.com
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client ccm is set to 1024
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client attrd is set to 1024
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client stonith-ng is set to 1024
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status active
>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client cib is set to 1024
>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [15:17]
>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [19:21]
>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client crmd is set to 1024
>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [24:26]
>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [26:28]
>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [30:32]
>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>
>>>>
>>>>
>>>> # less /var/log/error
>>>>
>>>> Mar 17 21:02:47 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>> Mar 17 21:02:48 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>> Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>>>>
>>>> # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
>>>> Mar 17 21:03:24 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
>>>> Mar 17 21:03:27 lbv1 crmd[4250]:   notice: run_graph: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
>>>> Mar 17 21:03:29 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
>>>> Mar 17 21:03:34 lbv1 crmd[4250]:   notice: run_graph: Transition 1 (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
>>>> Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>> Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>> Mar 17 21:03:37 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:   notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed:  2 ]
>>>> Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>>>> Mar 17 21:03:40 lbv1 crmd[4250]:   notice: run_graph: Transition 2 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
>>>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>>>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>>>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>> Mar 17 21:03:42 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
>>>> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>> Mar 17 21:03:47 lbv1 crmd[4250]:   notice: run_graph: Transition 3 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
>>>>
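The `failed to exec "stonith"` lines in the log above may mean that the process launching the plugin could not find a `stonith` executable on its PATH. The thread does not confirm this at this point, so treat it as one thing worth ruling out. A minimal sketch of such a check follows; the helper name `find_in_path` is hypothetical, and the extra directories are guesses based on the /usr/local/heartbeat install prefix seen in the logs.

```shell
#!/bin/sh
# Hypothetical helper: print the first entry of a colon-separated search
# path that contains an executable file with the given name.
find_in_path() {
    name=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Look on the current PATH first, then in a few guessed locations.
# The second list is an assumption, not something stated in the thread.
find_in_path stonith "$PATH" \
    || find_in_path stonith "/usr/local/heartbeat/sbin:/usr/local/sbin:/usr/sbin" \
    || echo "stonith not found in the searched directories"
```

Note that the PATH of an interactive root shell can differ from the PATH that heartbeat hands to its child daemons, so a hit here does not by itself prove that stonithd sees the same command.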
>>>> Thank you in advance.
>>>>
>>>> That is all.
>>>>
>>>> On March 17, 2015 at 18:31, <renayama19661014@ybb.ne.jp> wrote:
>>>>
>>>> Fukuda-san,
>>>>>
>>>>> Good evening, this is Yamauchi.
>>>>>
>>>>> Nothing newer has been tagged, so today's latest version is:
>>>>>
>>>>>  * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
>>>>>
>>>>> You can download it with the [Download ZIP] button on the right-hand side.
>>>>>
>>>>> That is all.
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>> To: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>> Date: 2015/3/17, Tue 18:07
>>>>>> Subject: About STONITH errors during split-brain
>>>>>>
>>>>>>
>>>>>> Yamauchi-san,
>>>>>>
>>>>>> Hello, this is Fukuda.
>>>>>>
>>>>>> I looked at this page:
>>>>>> https://github.com/ClusterLabs/pacemaker/tags
>>>>>>
>>>>>> There, pacemaker 1.1.12 (561c4cf) appears to be the latest.
>>>>>> Sorry to trouble you, but could you tell me where to find a version newer than that?
>>>>>>
>>>>>> Thank you in advance.
>>>>>>
>>>>>> That is all.
>>>>>>
>>>>>> On Tuesday, March 17, 2015, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>
>>>>>> Fukuda-san,
>>>>>>>
>>>>>>> Hello, this is Yamauchi.
>>>>>>>
>>>>>>> Yes, it is old.
>>>>>>>
>>>>>>> Pacemaker only gained support for Heartbeat 3.0.6 fairly recently.
>>>>>>> Please install something newer. (You will have to build it from source again, though...)
>>>>>>>
>>>>>>> It is available from the upstream GitHub repository:
>>>>>>>  * https://github.com/ClusterLabs/pacemaker
>>>>>>>
>>>>>>> The latest master can sometimes produce errors. If that happens, I suggest stepping back
>>>>>>> through older revisions until you find one that works.
>>>>>>>
>>>>>>> That is all.
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>> Date: 2015/3/17, Tue 16:06
>>>>>>>> Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>>>>>>>>
>>>>>>>>
>>>>>>>> Yamauchi-san,
>>>>>>>>
>>>>>>>> Hello, this is Fukuda.
>>>>>>>>
>>>>>>>> In an earlier mail you advised me to install the latest versions of heartbeat and pacemaker,
>>>>>>>> so this time I installed heartbeat 3.0.6 and pacemaker 1.1.12.
>>>>>>>>
>>>>>>>> heartbeat configuration: Version = "3.0.6"
>>>>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
>>>>>>>>
>>>>>>>> Does this mean that pacemaker is still too old?
>>>>>>>>
>>>>>>>> Sorry for the trouble, and thank you in advance.
>>>>>>>>
>>>>>>>> That is all.
>>>>>>>>
>>>>>>>> On March 17, 2015 at 14:59, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>>
>>>>>>>> Fukuda-san,
>>>>>>>>>
>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>
>>>>>>>>> Something just occurred to me. In an earlier exchange I answered as follows; is this condition met on your side?
>>>>>>>>>
>>>>>>>>>>>>>>>  2) Heartbeat 3.0.6 + latest Pacemaker: OK
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It seems that Heartbeat also has to be the latest version, 3.0.6.
>>>>>>>>>>>>>>>  * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
>>>>>>>>>
>>>>>>>>> Judging from the version in the crm_mon output below, you are running 1.1.12.
>>>>>>>>> Combining it with Heartbeat 3.0.6 requires a fairly recent Pacemaker.
>>>>>>>>>
>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>
>>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
>>>>>>>>>> Stack: heartbeat
>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>
>>>>>>>>> You probably need at least a build that includes the following change:
>>>>>>>>>
>>>>>>>>> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
>>>>>>>>>
>>>>>>>>> That is all.
>>>>>>>>>
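To compare a running build with the commit linked above, the short build hash can be pulled out of the `Version:` line that crm_mon prints. The following is only a sketch: the function name `build_hash` is hypothetical, and it assumes the line has the `Version: <number>-<hash>` shape seen in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read crm_mon output on stdin and print the short
# build hash from a line such as "Version: 1.1.12-561c4cf".
build_hash() {
    sed -n 's/^Version: *[0-9.]*-\([0-9a-f][0-9a-f]*\).*$/\1/p'
}

# Against a live cluster one would pipe one-shot output into it:
#   crm_mon -1 | build_hash
# Here a line copied from the thread is used instead.
printf 'Version: 1.1.12-561c4cf\n' | build_hash
```

With a git checkout of the Pacemaker repository at hand, `git merge-base --is-ancestor f2302da063d08719d28367d8e362b8bfb0f85bf3 <hash>` exits with status 0 when the build at `<hash>` already contains that change.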
>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>> Date: 2015/3/17, Tue 14:38
>>>>>>>>>> Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>
>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>
>>>>>>>>>> Do you mean that I should add -x to the shebang line of stonith-helper?
>>>>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
>>>>>>>>>>
>>>>>>>>>> crm_mon shows no change from before.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>
>>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
>>>>>>>>>> Stack: heartbeat
>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>> 2 Nodes configured
>>>>>>>>>> 8 Resources configured
>>>>>>>>>>
>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>
>>>>>>>>>> Full list of resources:
>>>>>>>>>>
>>>>>>>>>>  Resource Group: HAvarnish
>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>>>  Resource Group: grpStonith1
>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>  Resource Group: grpStonith2
>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>  Clone Set: clone_ping [ping]
>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>
>>>>>>>>>> Node Attributes:
>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>
>>>>>>>>>> Migration summary:
>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
>>>>>>>>>>
>>>>>>>>>> Failed actions:
>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
>>>>>>>>>>
>>>>>>>>>> I also looked for other logs.
>>>>>>>>>>
>>>>>>>>>> This is from when heartbeat started:
>>>>>>>>>>
>>>>>>>>>> # less /var/log/pm_logconv.out
>>>>>>>>>> Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
>>>>>>>>>> Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
>>>>>>>>>>
>>>>>>>>>> # less /var/log/error
>>>>>>>>>> Mar 17 14:12:20 lbv1 crmd[13269]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
>>>>>>>>>>
>>>>>>>>>> These are the lines matching "stonith" grepped from syslog:
>>>>>>>>>>
>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 13266)
>>>>>>>>>> Mar 17 14:11:34 lbv1 stonithd[13266]:   notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: setup_cib: Watching for stonith topology changes
>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: unpack_config: On loss of CCM Quorum: Ignore
>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
>>>>>>>>>> Mar 17 14:12:04 lbv1 stonithd[13266]:   notice: xml_patch_version_check: Versions did not change in patch 0.5.0
>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:   notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed:  2 ]
>>>>>>>>>>
>>>>>>>>>> Thank you in advance.
>>>>>>>>>>
>>>>>>>>>> That is all.
>>>>>>>>>>
>>>>>>>>>> On March 17, 2015 at 13:32, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>>>>
>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>
>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>
>>>>>>>>>>> In that case, the problem seems to be in the start of stonith-helper.
>>>>>>>>>>>
>>>>>>>>>>> If you put
>>>>>>>>>>>
>>>>>>>>>>> #!/bin/bash -x
>>>>>>>>>>>
>>>>>>>>>>> on the first line of stonith-helper and then start the cluster, you may learn something.
>>>>>>>>>>>
>>>>>>>>>>> By the way, I would expect stonith-helper's own log output to show up somewhere as well...
>>>>>>>>>>>
>>>>>>>>>>> That is all.
>>>>>>>>>>>
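As a complement to editing the shebang, the plugin can also be traced by hand, which keeps the `-x` output in a file instead of relying on wherever the cluster daemons send stderr. This is a sketch under assumptions: `trace_plugin` is a hypothetical helper, and the idea that external plugins take the action as their first argument and their parameters (such as `hostlist`) from the environment is my understanding of the external plugin convention, not something stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper: run a shell-script plugin under "bash -x" and store
# the trace (written to stderr) in a file.
# Usage: trace_plugin PLUGIN ACTION TRACE_FILE [VAR=value ...]
trace_plugin() {
    plugin=$1
    action=$2
    trace_file=$3
    shift 3
    # Remaining arguments are VAR=value pairs handed to the plugin's environment.
    env "$@" bash -x "$plugin" "$action" 2>"$trace_file"
}

# Example with the path and hostlist value from the thread; the "status"
# action is an assumption:
#   trace_plugin /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper \
#       status /tmp/stonith-helper.trace hostlist="lbv2.beta.com"
```

If the manual run produces a trace but the cluster-driven start still fails without one, that would suggest the script is never reached when the cluster starts it.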
>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>> Date: 2015/3/17, Tue 12:31
>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>
>>>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>>>
>>>>>>>>>>>> xen0 is in the same directory.
>>>>>>>>>>>>
>>>>>>>>>>>> # pwd
>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external
>>>>>>>>>>>>
>>>>>>>>>>>> # ls
>>>>>>>>>>>> drac5          ibmrsa         kdumpcheck  riloe           vmware
>>>>>>>>>>>> dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
>>>>>>>>>>>> hetzner        ipmi           nut         stonith-helper  xen0-ha
>>>>>>>>>>>> hmchttp        ippower9258    rackpdu     vcenter
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>>
>>>>>>>>>>>> That is all.
>>>>>>>>>>>>
>>>>>>>>>>>> 2015-03-17 10:53 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>>>>>>>>>>
>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> There was no output on standard output or standard error.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is something wrong with stonith-helper?
>>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>>>>>>>>> stonith-helper is located here:
>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is xen0 in this directory as well?
>>>>>>>>>>>>> If it is not, that is a problem, so please try copying the stonith-helper file, with its attributes preserved, into the same directory as xen0.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If it works after that, the problem lies in the pm_extras installation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> That is all.
>>>>>>>>>>>>>
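Copying a plugin "with its attributes preserved" into the directory that holds xen0 can be sketched as below. The function name `copy_beside` is hypothetical; `cp -p` keeps the mode and timestamps, and the ownership where the caller is permitted to set it.

```shell
#!/bin/sh
# Hypothetical helper: copy a plugin script into the directory that contains
# a reference plugin, preserving its attributes with "cp -p".
copy_beside() {
    src=$1
    reference=$2
    cp -p "$src" "$(dirname "$reference")/"
}

# Example (the source path is a placeholder; the destination is the plugin
# directory shown in the thread):
#   copy_beside /path/to/stonith-helper \
#       /usr/local/heartbeat/lib/stonith/plugins/external/xen0
```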
>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>>>> Date: 2015/3/17, Tue 10:31
>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Good morning, this is Fukuda.
>>>>>>>>>>>>>> Thank you for the crm example.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I adapted it to my environment right away.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> $ cat test.crm
>>>>>>>>>>>>>> ### Cluster Option ###
>>>>>>>>>>>>>> property \
>>>>>>>>>>>>>>     no-quorum-policy="ignore" \
>>>>>>>>>>>>>>     stonith-enabled="true" \
>>>>>>>>>>>>>>     startup-fencing="false" \
>>>>>>>>>>>>>>     stonith-timeout="710s" \
>>>>>>>>>>>>>>     crmd-transition-delay="2s"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ### Resource Default ###
>>>>>>>>>>>>>> rsc_defaults \
>>>>>>>>>>>>>>     resource-stickiness="INFINITY" \
>>>>>>>>>>>>>>     migration-threshold="1"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ### Group Configuration ###
>>>>>>>>>>>>>> group HAvarnish \
>>>>>>>>>>>>>>     vip_208 \
>>>>>>>>>>>>>>     varnishd
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> group grpStonith1 \
>>>>>>>>>>>>>>     Stonith1-1 \
>>>>>>>>>>>>>>     Stonith1-2
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> group grpStonith2 \
>>>>>>>>>>>>>>     Stonith2-1 \
>>>>>>>>>>>>>>     Stonith2-2
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ### Clone Configuration ###
>>>>>>>>>>>>>> clone clone_ping \
>>>>>>>>>>>>>>     ping
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ### Fencing Topology ###
>>>>>>>>>>>>>> fencing_topology \
>>>>>>>>>>>>>>     lbv1.beta.com: Stonith1-1 Stonith1-2 \
>>>>>>>>>>>>>>     lbv2.beta.com: Stonith2-1 Stonith2-2
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ### Primitive Configuration ###
>>>>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>         ip="192.168.17.208" \
>>>>>>>>>>>>>>         nic="eth0" \
>>>>>>>>>>>>>>         cidr_netmask="24" \
>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>>>     op monitor interval="5s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> primitive varnishd lsb:varnish \
>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>         name="default_ping_set" \
>>>>>>>>>>>>>>         host_list="192.168.17.254" \
>>>>>>>>>>>>>>         multiplier="100" \
>>>>>>>>>>>>>>         dampen="1" \
>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>>         hostlist="lbv1.beta.com" \
>>>>>>>>>>>>>>         dead_check_target="192.168.17.132 10.0.17.132" \
>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>>>>         run_online_check="yes" \
>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>>         hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>>         hostlist="lbv2.beta.com" \
>>>>>>>>>>>>>>         dead_check_target="192.168.17.133 10.0.17.133" \
>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>>>>         run_online_check="yes" \
>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> primitive Stonith2-2 stonith:external/xen0 \
>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>>         hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ### Resource Location ###
>>>>>>>>>>>>>> location HA_location-1 HAvarnish \
>>>>>>>>>>>>>>     rule 200: #uname eq lbv1.beta.com \
>>>>>>>>>>>>>>     rule 100: #uname eq lbv2.beta.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> location HA_location-2 HAvarnish \
>>>>>>>>>>>>>>     rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> location HA_location-3 grpStonith1 \
>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv1.beta.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> location HA_location-4 grpStonith2 \
>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv2.beta.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> After loading this, the messages are different from yesterday's.
>>>>>>>>>>>>>> The ping messages are gone.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>>>>> Last updated: Tue Mar 17 10:21:28 2015
>>>>>>>>>>>>>> Last change: Tue Mar 17 10:21:09 2015
>>>>>>>>>>>>>> Stack: heartbeat
>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>>>>>> 2 Nodes configured
>>>>>>>>>>>>>> 8 Resources configured
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Full list of resources:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Resource Group: HAvarnish
>>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>>>>>>>  Resource Group: grpStonith1
>>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>>>  Resource Group: grpStonith2
>>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
>>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Node Attributes:
>>>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Migration summary:
>>>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Failed actions:
>>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
>>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is the log from /var/log/ha-debug:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There was no output on standard output or standard error.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is something wrong with stonith-helper?
>>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>>>>>>>>> stonith-helper is located here:
>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That is all.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Good morning, this is Yamauchi.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Just in case, here is an excerpt from an example I have on hand that uses multiple stonith devices.
>>>>>>>>>>>>>>> (Watch out for the line breaks when you actually use it.)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The example below is a configuration for the PM 1.1 series.
>>>>>>>>>>>>>>> For nodea, stonith is executed in the order prmStonith1-1, prmStonith1-2.
>>>>>>>>>>>>>>> For nodeb, stonith is executed in the order prmStonith2-1, prmStonith2-2.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The stonith plugins themselves are helper and ssh.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>> ### Group Configuration ###
>>>>>>>>>>>>>>> group grpStonith1 \
>>>>>>>>>>>>>>> prmStonith1-1 \
>>>>>>>>>>>>>>> prmStonith1-2
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> group grpStonith2 \
>>>>>>>>>>>>>>> prmStonith2-1 \
>>>>>>>>>>>>>>> prmStonith2-2
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ### Fencing Topology ###
>>>>>>>>>>>>>>> fencing_topology \
>>>>>>>>>>>>>>> nodea: prmStonith1-1
>>> prmStonith1-2 \
>>>>>>>>>>>>>>> nodeb: prmStonith2-1
>>> prmStonith2-2
>>>>>>>>>>>>>>> (snp)
>>>>>>>>>>>>>>> primitive prmStonith1-1
>>> stonith:external/stonith-helper \
>>>>>>>>>>>>>>> params \
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> pcmk_reboot_retries="1"
>>> \
>>>>>>>>>>>>>>> pcmk_reboot_timeout="40s"
>>> \
>>>>>>>>>>>>>>> hostlist="nodea" \
>>>>>>>>>>>>>>> dead_check_target="192.168.28.60
>>> 192.168.28.70" \
>>>>>>>>>>>>>>> standby_check_command="/usr/sbin/crm_resource
>>> -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>>>> run_online_check="yes"
>>> \
>>>>>>>>>>>>>>> op start interval="0s"
>>> timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>> op stop interval="0s"
>>> timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> primitive prmStonith1-2
>>> stonith:external/ssh \
>>>>>>>>>>>>>>> params \
>>>>>>>>>>>>>>> pcmk_reboot_timeout="60s"
>>> \
>>>>>>>>>>>>>>> hostlist="nodea" \
>>>>>>>>>>>>>>> op start interval="0s"
>>> timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>> op monitor
>>> interval="3600s" timeout="60s" on-fail="restart"
>>> \
>>>>>>>>>>>>>>> op stop interval="0s"
>>> timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> primitive prmStonith2-1
>>> stonith:external/stonith-helper \
>>>>>>>>>>>>>>> params \
>>>>>>>>>>>>>>> pcmk_reboot_retries="1"
>>> \
>>>>>>>>>>>>>>> pcmk_reboot_timeout="40s"
>>> \
>>>>>>>>>>>>>>> hostlist="nodeb" \
>>>>>>>>>>>>>>> dead_check_target="192.168.28.61
>>> 192.168.28.71" \
>>>>>>>>>>>>>>> standby_check_command="/usr/sbin/crm_resource
>>> -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>>>> run_online_check="yes"
>>> \
>>>>>>>>>>>>>>> op start interval="0s"
>>> timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>> op stop interval="0s"
>>> timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> primitive prmStonith2-2
>>> stonith:external/ssh \
>>>>>>>>>>>>>>> params \
>>>>>>>>>>>>>>> pcmk_reboot_timeout="60s"
>>> \
>>>>>>>>>>>>>>> hostlist="nodeb" \
>>>>>>>>>>>>>>> op start interval="0s"
>>> timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>> op monitor
>>> interval="3600s" timeout="60s" on-fail="restart"
>>> \
>>>>>>>>>>>>>>> op stop interval="0s"
>>> timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>> location
>>> rsc_location-grpStonith1-2 grpStonith1 \
>>>>>>>>>>>>>>> rule -INFINITY: #uname eq nodea
>>>>>>>>>>>>>>> location
>>> rsc_location-grpStonith2-3 grpStonith2 \
>>>>>>>>>>>>>>> rule -INFINITY: #uname eq nodeb
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That's all.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ELF Systems
>>>>>>>>>>>>>> Masamichi Fukuda
>>>>>>>>>>>>>> mail to:
>>> masamichi_fukuda@elf-systems.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> Linux-ha-japan mailing list
>>>>>>>>>>>>> Linux-ha-japan@lists.sourceforge.jp
>>>>>>>>>>>>> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
>>>>>>>>>>>>>

_______________________________________________
Linux-ha-japan mailing list
Linux-ha-japan@lists.sourceforge.jp
http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
Re: STONITH errors during split-brain (スプリットブレイン時のSTONITHエラーについて) [ In reply to ]
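Yamauchi-san,

Good morning, this is Fukuda.

> That means neither xen0 nor stonith-helper is likely in a path that
> Pacemaker manages as its stonith plugin directory.
>
> I suspect the installation of Reusable and pm_extras.

Is there a problem with my Pacemaker installation?
Also, is Reusable something that has to be installed separately?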
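
The `failed to exec "stonith"` lines in the logs quoted further down suggest that stonith-ng may be unable to find the `stonith` command itself, which could explain why every plugin type fails in the same way. A minimal sketch for checking this, assuming a POSIX shell; the function name `find_in_path` is hypothetical, and the `/usr/local/heartbeat/sbin` directory in the usage comment is an assumed location that this thread does not confirm:

```shell
#!/bin/sh
# find_in_path NAME [SEARCH_PATH]
# Print the first executable file called NAME found in the colon-separated
# SEARCH_PATH (default: the current $PATH); return 1 if there is none.
find_in_path() {
    name=$1
    search=${2:-$PATH}
    old_ifs=$IFS
    IFS=:
    for dir in $search; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example (the sbin directory below is an assumption, adjust to the real prefix):
#   find_in_path stonith || echo "stonith not found on PATH"
#   find_in_path stonith "/usr/local/heartbeat/sbin:$PATH"
```

If the command is found, `stonith -L` should list the plugin types it can see, which would show whether external/xen0 and external/stonith-helper are visible to it. Note that the PATH of the daemon started by heartbeat may differ from that of an interactive root shell.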

I removed stonith-helper and started the cluster with only external/ssh,
but the state shown by crm_mon did not change.

Last updated: Wed Mar 18 08:07:42 2015
Last change: Wed Mar 18 08:04:48 2015
Stack: heartbeat
Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
Version: 1.1.12-e32080b
2 Nodes configured
6 Resources configured


Online: [ lbv1.beta.com lbv2.beta.com ]

Full list of resources:

 Stonith1-2 (stonith:external/ssh): Stopped
 Stonith2-2 (stonith:external/ssh): Stopped
 Resource Group: HAvarnish
     vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
     varnishd (lsb:varnish): Started lbv1.beta.com
 Clone Set: clone_ping [ping]
     Started: [ lbv1.beta.com lbv2.beta.com ]

Node Attributes:
* Node lbv1.beta.com:
    + default_ping_set : 100
* Node lbv2.beta.com:
    + default_ping_set : 100

Migration summary:
* Node lbv2.beta.com:
   Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:07:32 2015'
* Node lbv1.beta.com:
   Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:05:53 2015'

Failed actions:
    Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:07:30 2015', queued=0ms, exec=1061ms
    Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:05:51 2015', queued=0ms, exec=1062ms
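
To compare the failures between runs without reading the whole status screen, the failed operations can be pulled out of one-shot `crm_mon` output. A small sketch, assuming the output has the same layout as the status shown above; `failed_actions` is a hypothetical helper name, not part of Pacemaker:

```shell
#!/bin/sh
# failed_actions: read crm_mon output on stdin and print
# "<operation> <node>" for each entry listed under "Failed actions:".
failed_actions() {
    awk '
        /Failed actions:/ { in_failed = 1; next }   # start of the section
        in_failed && $2 == "on" { print $1, $3 }    # "<op> on <node> ..." lines
    '
}

# Example: crm_mon -1 -rfA | failed_actions
```

The filter only looks at lines whose second field is `on`, so continuation lines produced by terminal wrapping are ignored.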

Thank you in advance.

That's all.


On 17 March 2015 at 23:51, <renayama19661014@ybb.ne.jp> wrote:

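> Fukuda-san,
>
> Good evening, this is Yamauchi.
>
>
> That means neither xen0 nor stonith-helper is likely in a path that Pacemaker manages as its stonith plugin directory.
>
> I suspect the installation of Reusable and pm_extras.
>
> I will let you know if I find out anything more.
>
> That's all.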
>
>
>
> ----- Original Message -----
> >From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >To: 山内英生 <renayama19661014@ybb.ne.jp>; "
> linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >Date: 2015/3/17, Tue 23:46
> >Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
> >
> >
> >Yamauchi-san,
> >
> >Good evening, this is Fukuda.
> >
> >Perhaps I am specifying -x for stonith-helper the wrong way.
> >
> >I removed stonith-helper and started the cluster with only xen0.
> >
> ># crm_mon -rfA
> >
> >Last updated: Tue Mar 17 23:38:53 2015
> >Last change: Tue Mar 17 23:30:34 2015
> >Stack: heartbeat
> >Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - parti
> >tion with quorum
> >Version: 1.1.12-e32080b
> >2 Nodes configured
> >6 Resources configured
> >
> >
> >Online: [ lbv1.beta.com lbv2.beta.com ]
> >
> >Full list of resources:
> >
> >Stonith1-2 (stonith:external/xen0): Stopped
> >Stonith2-2 (stonith:external/xen0): Stopped
> > Resource Group: HAvarnish
> > vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> > varnishd (lsb:varnish): Started lbv1.beta.com
> > Clone Set: clone_ping [ping]
> > Started: [ lbv1.beta.com lbv2.beta.com ]
> >
> >Node Attributes:
> >* Node lbv1.beta.com:
> > + default_ping_set : 100
> >* Node lbv2.beta.com:
> > + default_ping_set : 100
> >
> >Migration summary:
> >* Node lbv1.beta.com:
> > Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue
> Mar 17
> > 23:38:34 2015'
> >* Node lbv2.beta.com:
> > Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue
> Mar 17
> > 23:38:27 2015'
> >
> >Failed actions:
> > Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, st
> >atus=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32
> 2015', queue
> >d=0ms, exec=1061ms
> > Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, st
> >atus=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25
> 2015', queue
> >d=0ms, exec=1342ms
> >
> >
> >
> >
> >The same failed actions appear as when stonith-helper was configured.
> >
> >
> >Thank you in advance.
> >
> >That's all.
> >
> >
> >
> >
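> >On 17 March 2015 at 22:38, <renayama19661014@ybb.ne.jp> wrote:
> >
> >Fukuda-san,
> >>
> >>Good evening, this is Yamauchi.
> >>
> >>By the way, if possible, checking what happens with external/stonith-helper removed and only external/xen0 configured
> >>might help isolate the problem.
> >>
> >>That's all.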
> >>
> >>
> >>
> >>----- Original Message -----
> >>
> >>> From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
> >>> To: "linux-ha-japan@lists.sourceforge.jp" <
> linux-ha-japan@lists.sourceforge.jp>
> >>> Cc:
> >>> Date: 2015/3/17, Tue 22:28
> >>> Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
> >>>
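> >>> Fukuda-san,
> >>>
> >>> Good evening, this is Yamauchi.
> >>>
> >>> It looks like nothing has changed...
> >>>
> >>> For now, around tomorrow, I will check, on RHEL, whether stonith-helper works with the combination of
> >>>
> >>> Heartbeat 3.0.6
> >>> the latest Pacemaker
> >>>
> >>> and a similar configuration (the resources will be Dummy, and external/ssh will stand in for external/xen0).
> >>>
> >>> # If we could see the output of stonith-helper run with -x, the problem would be easier to narrow down...
> >>>
> >>>
> >>> That's all.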
> >>>
> >>>
> >>>
> >>> ----- Original Message -----
> >>>> From: Masamichi Fukuda - elf-systems
> >>> <masamichi_fukuda@elf-systems.com>
> >>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
> >>> "linux-ha-japan@lists.sourceforge.jp"
> >>> <linux-ha-japan@lists.sourceforge.jp>
> >>>> Date: 2015/3/17, Tue 21:24
> >>>> Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
> >>>>
> >>>>
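> >>>> Yamauchi-san,
> >>>>
> >>>> Good evening, this is Fukuda.
> >>>> Thank you for the information about the latest version.
> >>>>
> >>>> I installed it right away.
> >>>>
> >>>> This is the state after startup.
> >>>>
> >>>> The failed actions appear unchanged.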
> >>>>
> >>>>
> >>>>
> >>>> # crm_mon -rfA
> >>>> Last updated: Tue Mar 17 21:03:49 2015
> >>>> Last change: Tue Mar 17 20:30:58 2015
> >>>> Stack: heartbeat
> >>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) -
> parti
> >>>> tion with quorum
> >>>> Version: 1.1.12-e32080b
> >>>> 2 Nodes configured
> >>>> 8 Resources configured
> >>>>
> >>>>
> >>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>
> >>>> Full list of resources:
> >>>>
> >>>> Resource Group: HAvarnish
> >>>> vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> >>>> varnishd (lsb:varnish): Started lbv1.beta.com
> >>>> Resource Group: grpStonith1
> >>>> Stonith1-1 (stonith:external/stonith-helper): Stopped
> >>>> Stonith1-2 (stonith:external/xen0): Stopped
> >>>> Resource Group: grpStonith2
> >>>> Stonith2-1 (stonith:external/stonith-helper): Stopped
> >>>> Stonith2-2 (stonith:external/xen0): Stopped
> >>>> Clone Set: clone_ping [ping]
> >>>> Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>
> >>>> Node Attributes:
> >>>> * Node lbv1.beta.com:
> >>>> + default_ping_set : 100
> >>>> * Node lbv2.beta.com:
> >>>> + default_ping_set : 100
> >>>>
> >>>> Migration summary:
> >>>> * Node lbv1.beta.com:
> >>>> Stonith2-1: migration-threshold=1 fail-count=1000000
> >>> last-failure='Tue Mar 17
> >>>> 21:03:39 2015'
> >>>> * Node lbv2.beta.com:
> >>>> Stonith1-1: migration-threshold=1 fail-count=1000000
> >>> last-failure='Tue Mar 17
> >>>> 21:03:32 2015'
> >>>>
> >>>> Failed actions:
> >>>> Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1):
> >>> call=31, st
> >>>> atus=Error, exit-reason='none', last-rc-change='Tue Mar 17
> >>> 21:03:37 2015', queue
> >>>> d=0ms, exec=1085ms
> >>>> Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1):
> >>> call=18, st
> >>>> atus=Error, exit-reason='none', last-rc-change='Tue Mar 17
> >>> 21:03:30 2015', queue
> >>>> d=0ms, exec=1061ms
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> Here are the logs.
> >>>>
> >>>>
> >>>> # less /var/log/ha-debug
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker
> support:
> >>> yes
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File
> >>> /etc/ha.d//haresources exists.
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is
> not used
> >>> because pacemaker is enabled
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking
> access of:
> >>> /usr/local/heartbeat/libexec/heartbeat/ccm
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking
> access of:
> >>> /usr/local/heartbeat/libexec/pacemaker/cib
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking
> access of:
> >>> /usr/local/heartbeat/libexec/pacemaker/stonithd
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking
> access of:
> >>> /usr/local/heartbeat/libexec/pacemaker/lrmd
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking
> access of:
> >>> /usr/local/heartbeat/libexec/pacemaker/attrd
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking
> access of:
> >>> /usr/local/heartbeat/libexec/pacemaker/crmd
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps
> could be
> >>> lost if multiple dumps occur.
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider
> setting
> >>> non-default value in /proc/sys/kernel/core_pattern (or equivalent) for
> maximum
> >>> supportability
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider
> setting
> >>> /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum
> supportability
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging
> daemon is
> >>> disabled --enabling logging daemon is recommended
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info:
> >>> **************************
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration
> >>> validated. Starting heartbeat 3.0.6
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat:
> version
> >>> 3.0.6
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat
> generation:
> >>> 1423534116
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is
> -1702799346
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast:
> write
> >>> socket priority set to IPTOS_LOWDELAY on eth1
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast:
> bound
> >>> send socket to device: eth1
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast:
> set
> >>> SO_REUSEADDR
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast:
> bound
> >>> receive socket to device: eth1
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast:
> started
> >>> on port 694 interface eth1 to 10.0.17.133
> >>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status
> now set
> >>> to: 'up'
> >>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link
> >>> lbv2.beta.com:eth1 up.
> >>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update
> for
> >>> node lbv2.beta.com: status up
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up():
> >>> updating status to active
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status
> now set
> >>> to: 'active'
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting
> child client
> >>> "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting
> child client
> >>> "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting
> child client
> >>> "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting
> child client
> >>> "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting
> child client
> >>> "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting
> child client
> >>> "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug:
> get_delnodelist:
> >>> delnodelist=
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting
> >>> "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109 gid 113 (pid
> >>> 4250)
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting
> >>> "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109 gid 113 (pid
> >>> 4246)
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting
> >>> "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109 gid 113
> >>> (pid 4249)
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting
> >>> "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109 gid 113 (pid
> >>> 4245)
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting
> >>> "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0 gid 0 (pid
> >>> 4248)
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting
> >>> "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0 (pid
> >>> 4247)
> >>>> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname:
> lbv1.beta.com
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send
> queue length
> >>> from heartbeat to client ccm is set to 1024
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send
> queue length
> >>> from heartbeat to client attrd is set to 1024
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send
> queue length
> >>> from heartbeat to client stonith-ng is set to 1024
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update
> for
> >>> node lbv2.beta.com: status active
> >>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send
> queue length
> >>> from heartbeat to client cib is set to 1024
> >>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost
> packet(s) for
> >>> [lbv2.beta.com] [15:17]
> >>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts
> missing from
> >>> lbv2.beta.com!
> >>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost
> packet(s) for
> >>> [lbv2.beta.com] [19:21]
> >>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts
> missing from
> >>> lbv2.beta.com!
> >>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send
> queue length
> >>> from heartbeat to client crmd is set to 1024
> >>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost
> packet(s) for
> >>> [lbv2.beta.com] [24:26]
> >>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts
> missing from
> >>> lbv2.beta.com!
> >>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost
> packet(s) for
> >>> [lbv2.beta.com] [26:28]
> >>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts
> missing from
> >>> lbv2.beta.com!
> >>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost
> packet(s) for
> >>> [lbv2.beta.com] [30:32]
> >>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts
> missing from
> >>> lbv2.beta.com!
> >>>>
> >>>>
> >>>>
> >>>> # less /var/log/error
> >>>>
> >>>> Mar 17 21:02:47 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored
> >>> incoming message. Please set_msg_callback on hbclstat
> >>>> Mar 17 21:02:48 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored
> >>> incoming message. Please set_msg_callback on hbclstat
> >>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch:
> Ignored
> >>> incoming message. Please set_msg_callback on hbclstat
> >>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch:
> Ignored
> >>> incoming message. Please set_msg_callback on hbclstat
> >>>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event:
> Operation
> >>> Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4,
> cib-update=42,
> >>> confirmed=true) Error
> >>>>
> >>>> # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep
> >>> 'heartbeat|stonith|pacemaker|error'
> >>>> Mar 17 21:03:24 lbv1 pengine[4253]: notice: process_pe_message:
> Calculated
> >>> Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
> >>>> Mar 17 21:03:27 lbv1 crmd[4250]: notice: run_graph: Transition 0
> >>> (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2,
> >>> Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
> >>>> Mar 17 21:03:29 lbv1 pengine[4253]: notice: process_pe_message:
> Calculated
> >>> Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
> >>>> Mar 17 21:03:34 lbv1 crmd[4250]: notice: run_graph: Transition 1
> >>> (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1,
> >>> Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
> >>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure:
> >>> Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown
> error (1)
> >>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure:
> >>> Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown
> error (1)
> >>>> Mar 17 21:03:37 lbv1 pengine[4253]: notice: process_pe_message:
> Calculated
> >>> Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
> >>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: notice: log_operation:
> Operation
> >>> 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic
> >>> Pacemaker error)
> >>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation:
> >>> Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
> >>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation:
> >>> Stonith2-1:4377 [ failed to exec "stonith" ]
> >>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation:
> >>> Stonith2-1:4377 [ failed: 2 ]
> >>>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event:
> Operation
> >>> Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4,
> cib-update=42,
> >>> confirmed=true) Error
> >>>> Mar 17 21:03:40 lbv1 crmd[4250]: notice: run_graph: Transition 2
> >>> (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0,
> >>> Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
> >>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure:
> >>> Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown
> error (1)
> >>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure:
> >>> Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown
> error (1)
> >>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure:
> >>> Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown
> error (1)
> >>>> Mar 17 21:03:42 lbv1 pengine[4253]: notice: process_pe_message:
> Calculated
> >>> Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
> >>>> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO:
> >>> /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p
> >>> /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208
> auto
> >>> not_used not_used
> >>>> Mar 17 21:03:47 lbv1 crmd[4250]: notice: run_graph: Transition 3
> >>> (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> >>> Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
> >>>>
> >>>> Thank you in advance.
> >>>>
> >>>> That's all.
> >>>>
> >>>>
> >>>>
> >>>> On 17 March 2015 at 18:31, <renayama19661014@ybb.ne.jp> wrote:
> >>>>
> >>>> Fukuda-san,
> >>>>>
> >>>>> Good evening, this is Yamauchi.
> >>>>>
> >>>>> No tag has been created for it, so today's latest version is
> >>>>>
> >>>>> * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
> >>>>>
> >>>>>
> >>>>> You can download it from [Download ZIP] on the right-hand side.
> >>>>>
> >>>>> That's all.
> >>>>>
> >>>>>
> >>>>> ----- Original Message -----
> >>>>>> From: Masamichi Fukuda - elf-systems
> >>> <masamichi_fukuda@elf-systems.com>
> >>>>>
> >>>>>> To: "renayama19661014@ybb.ne.jp"
> >>> <renayama19661014@ybb.ne.jp>;
> >>> "linux-ha-japan@lists.sourceforge.jp"
> >>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>> Date: 2015/3/17, Tue 18:07
> >>>>>> Subject: スプリットブレイン時のSTONITHエラーについて
> >>>>>>
> >>>>>>
> >>>>>> Yamauchi-san,
> >>>>>>
> >>>>>>
> >>>>>> Hello, this is Fukuda.
> >>>>>>
> >>>>>>
> >>>>>> I looked here:
> >>>>>> https://github.com/ClusterLabs/pacemaker/tags
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> pacemaker 1.1.12 561c4cf appears to be the latest.
> >>>>>> Sorry to trouble you, but could you tell me where to find a newer version than this?
> >>>>>>
> >>>>>>
> >>>>>> Thank you in advance.
> >>>>>>
> >>>>>>
> >>>>>> That's all.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tuesday, 17 March 2015, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>
> >>>>>> Fukuda-san,
> >>>>>>>
> >>>>>>> Hello, this is Yamauchi.
> >>>>>>>
> >>>>>>> Yes, it is old.
> >>>>>>>
> >>>>>>> Pacemaker gained support for Heartbeat 3.0.6 only fairly recently.
> >>>>>>> Please install something newer. (Again, you will need to build it from source...)
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> It is available from the upstream GitHub repository.
> >>>>>>> * https://github.com/ClusterLabs/pacemaker
> >>>>>>>
> >>>>>>>
> >>>>>>> In some cases the latest master may produce errors; if so, I suggest stepping back
> >>>>>>> through older revisions.
> >>>>>>>
> >>>>>>> That's all.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> ----- Original Message -----
> >>>>>>>> From: Masamichi Fukuda - elf-systems
> >>> <masamichi_fukuda@elf-systems.com>
> >>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
> >>> "linux-ha-japan@lists.sourceforge.jp"
> >>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>> Date: 2015/3/17, Tue 16:06
> >>>>>>>> Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Yamauchi-san,
> >>>>>>>>
> >>>>>>>> Hello, this is Fukuda.
> >>>>>>>>
> >>>>>>>> In an earlier mail you advised installing the latest versions of heartbeat and pacemaker.
> >>>>>>>> So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12.
> >>>>>>>>
> >>>>>>>> heartbeat configuration: Version = "3.0.6"
> >>>>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
> >>>>>>>>
> >>>>>>>> Does this mean that pacemaker is still too old?
> >>>>>>>>
> >>>>>>>> Sorry to trouble you, and thank you in advance.
> >>>>>>>>
> >>>>>>>> That's all.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 17 March 2015 at 14:59, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>>>
> >>>>>>>> Fukuda-san,
> >>>>>>>>>
> >>>>>>>>> Hello, this is Yamauchi.
> >>>>>>>>>
> >>>>>>>>> It just occurred to me: in an earlier mail in this exchange I answered as follows. Has that been taken care of?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>>>>>> 2) Heartbeat 3.0.6 + latest Pacemaker: OK
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> It appears that Heartbeat also needs to be the latest version, 3.0.6.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
> >>>>>>>>>
> >>>>>>>>> Judging from the crm_mon version below, it is 1.1.12.
> >>>>>>>>> Combining with Heartbeat 3.0.6 requires a fairly recent Pacemaker.
> >>>>>>>>>
> >>>>>>>>>> # crm_mon -rfA
> >>>>>>>>>>
> >>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
> >>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
> >>>>>>>>>> Stack: heartbeat
> >>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
> >>>>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>>>>
> >>>>>>>>> I think you need at least a version that includes the following change.
> >>>>>>>>>
> >>>>>>>>> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> That's all.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> ----- Original Message -----
> >>>>>>>>>> From: Masamichi Fukuda - elf-systems
> >>> <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
> >>> "linux-ha-japan@lists.sourceforge.jp"
> >>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>
> >>>>>>>>>> Date: 2015/3/17, Tue 14:38
> >>>>>>>>>> Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Yamauchi-san,
> >>>>>>>>>>
> >>>>>>>>>> Hello, this is Fukuda.
> >>>>>>>>>>
> >>>>>>>>>> Do you mean I should add -x to the shebang line of stonith-helper?
> >>>>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
> >>>>>>>>>>
> >>>>>>>>>> crm_mon shows no change from before.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> # crm_mon -rfA
> >>>>>>>>>>
> >>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
> >>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
> >>>>>>>>>> Stack: heartbeat
> >>>>>>>>>> Current DC: lbv2.beta.com
> >>> (82ffc36f-1ad8-8686-7db0-35686465c624) - parti
> >>>>>>>>>> tion with quorum
> >>>>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>>>>> 2 Nodes configured
> >>>>>>>>>> 8 Resources configured
> >>>>>>>>>>
> >>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>>>
> >>>>>>>>>> Full list of resources:
> >>>>>>>>>>
> >>>>>>>>>> Resource Group: HAvarnish
> >>>>>>>>>> vip_208 (ocf::heartbeat:IPaddr2):
> >>> Started lbv1.beta.com
> >>>>>>>>>> varnishd (lsb:varnish): Started
> >>> lbv1.beta.com
> >>>>>>>>>> Resource Group: grpStonith1
> >>>>>>>>>> Stonith1-1
> >>> (stonith:external/stonith-helper): Stopped
> >>>>>>>>>> Stonith1-2 (stonith:external/xen0):
> >>> Stopped
> >>>>>>>>>> Resource Group: grpStonith2
> >>>>>>>>>> Stonith2-1
> >>> (stonith:external/stonith-helper): Stopped
> >>>>>>>>>> Stonith2-2 (stonith:external/xen0):
> >>> Stopped
> >>>>>>>>>> Clone Set: clone_ping [ping]
> >>>>>>>>>> Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>>>
> >>>>>>>>>> Node Attributes:
> >>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>> + default_ping_set : 100
> >>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>> + default_ping_set : 100
> >>>>>>>>>>
> >>>>>>>>>> Migration summary:
> >>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>> Stonith1-1: migration-threshold=1
> >>> fail-count=1000000 last-failure='Tue Mar 17
> >>>>>>>>>> 14:12:16 2015'
> >>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>> Stonith2-1: migration-threshold=1
> >>> fail-count=1000000 last-failure='Tue Mar 17
> >>>>>>>>>> 14:12:21 2015'
> >>>>>>>>>>
> >>>>>>>>>> Failed actions:
> >>>>>>>>>> Stonith1-1_start_0 on lbv2.beta.com 'unknown
> >>> error' (1): call=31, st
> >>>>>>>>>> atus=Error, last-rc-change='Tue Mar 17 14:12:14
> >>> 2015', queued=0ms, exec=1065ms
> >>>>>>>>>> Stonith2-1_start_0 on lbv1.beta.com 'unknown
> >>> error' (1): call=26, st
> >>>>>>>>>> atus=Error, last-rc-change='Tue Mar 17 14:12:19
> >>> 2015', queued=0ms, exec=1081ms
> >>>>>>>>>>
> >>>>>>>>>> I looked through the other logs.
> >>>>>>>>>>
> >>>>>>>>>> This is from heartbeat startup.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> # less /var/log/pm_logconv.out
> >>>>>>>>>> Mar 17 14:11:28 lbv1.beta.com info: Starting
> >>> Heartbeat 3.0.6.
> >>>>>>>>>> Mar 17 14:11:33 lbv1.beta.com info: Link
> >>> lbv2.beta.com:eth1 is up.
> >>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start
> >>> "ccm" process. (pid=13264)
> >>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start
> >>> "lrmd" process. (pid=13267)
> >>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start
> >>> "attrd" process. (pid=13268)
> >>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start
> >>> "stonithd" process. (pid=13266)
> >>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start
> >>> "cib" process. (pid=13265)
> >>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start
> >>> "crmd" process. (pid=13269)
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> # less /var/log/error
> >>>>>>>>>> Mar 17 14:12:20 lbv1 crmd[13269]: error:
> >>> process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com,
> call=26,
> >>> status=4, cib-update=19, confirmed=true) Error
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> This is syslog grepped for stonith.
> >>>>>>>>>>
> >>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info:
> >>> Starting child client
> >>> "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13266]: info:
> >>> Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0
> >>> gid 0 (pid 13266)
> >>>>>>>>>> Mar 17 14:11:34 lbv1 stonithd[13266]: notice:
> >>> crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
> >>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the
> >>> send queue length from heartbeat to client stonithd is set to 1024
> >>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: notice:
> >>> setup_cib: Watching for stonith topology changes
> >>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: notice:
> >>> unpack_config: On loss of CCM Quorum: Ignore
> >>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: warning:
> >>> handle_startup_fencing: Blind faith: not fencing unseen nodes
> >>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: warning:
> >>> handle_startup_fencing: Blind faith: not fencing unseen nodes
> >>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]: notice:
> >>> stonith_device_register: Added 'Stonith2-1' to the device list (1
> active
> >>> devices)
> >>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]: notice:
> >>> stonith_device_register: Added 'Stonith2-2' to the device list (2
> active
> >>> devices)
> >>>>>>>>>> Mar 17 14:12:04 lbv1 stonithd[13266]: notice:
> >>> xml_patch_version_check: Versions did not change in patch 0.5.0
> >>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: notice:
> >>> log_operation: Operation 'monitor' [13386] for device
> >>> 'Stonith2-1' returned: -201 (Generic Pacemaker error)
> >>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning:
> >>> log_operation: Stonith2-1:13386 [ Performing: stonith -t
> external/stonith-helper
> >>> -S ]
> >>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning:
> >>> log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
> >>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning:
> >>> log_operation: Stonith2-1:13386 [ failed: 2 ]
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Thank you in advance.
> >>>>>>>>>>
> >>>>>>>>>> That's all.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 17 March 2015 at 13:32, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Fukuda-san,
> >>>>>>>>>>>
> >>>>>>>>>>> Hello, this is Yamauchi.
> >>>>>>>>>>>
> >>>>>>>>>>> So it seems there is a problem with the start of stonith-helper.
> >>>>>>>>>>>
> >>>>>>>>>>> If you put
> >>>>>>>>>>>
> >>>>>>>>>>> #!/bin/bash -x
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> at the top of stonith-helper and start the cluster, we may learn something.
> >>>>>>>>>>>
> >>>>>>>>>>> Incidentally, I would expect stonith-helper's own log to be written somewhere as well...
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> That's all.
> >>>>>>>>>>>
> >>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>> From: Masamichi Fukuda - elf-systems
> >>> <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
> >>> "linux-ha-japan@lists.sourceforge.jp"
> >>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>>>
> >>>>>>>>>>>> Date: 2015/3/17, Tue 12:31
> >>>>>>>>>>>> Subject: Re: [Linux-ha-jp]
> >>> スプリットブレイン時のSTONITHエラーについて
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yamauchi-san,
> >>>>>>>>>>>> cc: Matsushima-san
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hello, this is Fukuda.
> >>>>>>>>>>>>
> >>>>>>>>>>>> xen0 is in the same directory.
> >>>>>>>>>>>>
> >>>>>>>>>>>> # pwd
> >>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external
> >>>>>>>>>>>>
> >>>>>>>>>>>> # ls
> >>>>>>>>>>>> drac5          ibmrsa         kdumpcheck  riloe           vmware
> >>>>>>>>>>>> dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
> >>>>>>>>>>>> hetzner        ipmi           nut         stonith-helper  xen0-ha
> >>>>>>>>>>>> hmchttp        ippower9258    rackpdu     vcenter
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thank you in advance.
> >>>>>>>>>>>>
> >>>>>>>>>>>> That's all.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2015-03-17 10:53 GMT+09:00
> >>> <renayama19661014@ybb.ne.jp>:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Fukuda-san,
> >>>>>>>>>>>>> cc: Matsushima-san
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hello, this is Yamauchi.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> There was no standard output or standard error output.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Is something wrong with stonith-helper?
> >>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
> >>>>>>>>>>>>>> stonith-helper is located here:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Is xen0 also in this directory?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> If it is not, that is a problem, so please try copying the stonith-helper file,
> >>>>>>>>>>>>> keeping its attributes as they are, into the same directory as xen0.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> If it works after that, then the pm_extras installation is at fault.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> That's all.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems
> >>> <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>>>>>> To: 山内英生
> >>> <renayama19661014@ybb.ne.jp>;
> >>> "linux-ha-japan@lists.sourceforge.jp"
> >>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Date: 2015/3/17, Tue 10:31
> >>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp]
> >>> スプリットブレイン時のSTONITHエラーについて
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Yamauchi-san,
> >>>>>>>>>>>>>> cc: Matsushima-san
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Good morning, this is Fukuda.
> >>>>>>>>>>>>>> Thank you for the crm example.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I adapted it to my environment right away.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> $ cat test.crm
> >>>>>>>>>>>>>> ### Cluster Option ###
> >>>>>>>>>>>>>> property \
> >>>>>>>>>>>>>>     no-quorum-policy="ignore" \
> >>>>>>>>>>>>>>     stonith-enabled="true" \
> >>>>>>>>>>>>>>     startup-fencing="false" \
> >>>>>>>>>>>>>>     stonith-timeout="710s" \
> >>>>>>>>>>>>>>     crmd-transition-delay="2s"
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ### Resource Default ###
> >>>>>>>>>>>>>> rsc_defaults \
> >>>>>>>>>>>>>>     resource-stickiness="INFINITY" \
> >>>>>>>>>>>>>>     migration-threshold="1"
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ### Group Configuration ###
> >>>>>>>>>>>>>> group HAvarnish \
> >>>>>>>>>>>>>>     vip_208 \
> >>>>>>>>>>>>>>     varnishd
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> group grpStonith1 \
> >>>>>>>>>>>>>>     Stonith1-1 \
> >>>>>>>>>>>>>>     Stonith1-2
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> group grpStonith2 \
> >>>>>>>>>>>>>>     Stonith2-1 \
> >>>>>>>>>>>>>>     Stonith2-2
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ### Clone Configuration ###
> >>>>>>>>>>>>>> clone clone_ping \
> >>>>>>>>>>>>>>     ping
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ### Fencing Topology ###
> >>>>>>>>>>>>>> fencing_topology \
> >>>>>>>>>>>>>>     lbv1.beta.com: Stonith1-1 Stonith1-2 \
> >>>>>>>>>>>>>>     lbv2.beta.com: Stonith2-1 Stonith2-2
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ### Primitive Configuration ###
> >>>>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
> >>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>         ip="192.168.17.208" \
> >>>>>>>>>>>>>>         nic="eth0" \
> >>>>>>>>>>>>>>         cidr_netmask="24" \
> >>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>>>>>>>>>     op monitor interval="5s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> primitive varnishd lsb:varnish \
> >>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
> >>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>         name="default_ping_set" \
> >>>>>>>>>>>>>>         host_list="192.168.17.254" \
> >>>>>>>>>>>>>>         multiplier="100" \
> >>>>>>>>>>>>>>         dampen="1" \
> >>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>>         hostlist="lbv1.beta.com" \
> >>>>>>>>>>>>>>         dead_check_target="192.168.17.132 10.0.17.132" \
> >>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
> >>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>>         hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
> >>>>>>>>>>>>>>         dom0="xen0.beta.com" \
> >>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>>         hostlist="lbv2.beta.com" \
> >>>>>>>>>>>>>>         dead_check_target="192.168.17.133 10.0.17.133" \
> >>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> primitive Stonith2-2 stonith:external/xen0 \
> >>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>>         hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
> >>>>>>>>>>>>>>         dom0="xen0.beta.com" \
> >>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ### Resource Location ###
> >>>>>>>>>>>>>> location HA_location-1 HAvarnish \
> >>>>>>>>>>>>>>     rule 200: #uname eq lbv1.beta.com \
> >>>>>>>>>>>>>>     rule 100: #uname eq lbv2.beta.com
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> location HA_location-2 HAvarnish \
> >>>>>>>>>>>>>>     rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> location HA_location-3 grpStonith1 \
> >>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv1.beta.com
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> location HA_location-4 grpStonith2 \
> >>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv2.beta.com
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> After loading this, the messages differ from yesterday's.
> >>>>>>>>>>>>>> The ping messages are gone.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> # crm_mon -rfA
> >>>>>>>>>>>>>> Last updated: Tue Mar 17 10:21:28 2015
> >>>>>>>>>>>>>> Last change: Tue Mar 17 10:21:09 2015
> >>>>>>>>>>>>>> Stack: heartbeat
> >>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
> >>>>>>>>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>>>>>>>>> 2 Nodes configured
> >>>>>>>>>>>>>> 8 Resources configured
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Full list of resources:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>  Resource Group: HAvarnish
> >>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
> >>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
> >>>>>>>>>>>>>>  Resource Group: grpStonith1
> >>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
> >>>>>>>>>>>>>>  Resource Group: grpStonith2
> >>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
> >>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
> >>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Node Attributes:
> >>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>>>>>>     + default_ping_set                  : 100
> >>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>>>>>>     + default_ping_set                  : 100
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Migration summary:
> >>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
> >>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Failed actions:
> >>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
> >>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This is the log from /var/log/ha-debug.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
> >>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
> >>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> There was no standard output or standard error output.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Is something wrong with stonith-helper?
> >>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
> >>>>>>>>>>>>>> stonith-helper is located here:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thank you in advance.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> That's all.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 2015-03-17 9:45 GMT+09:00
> >>> <renayama19661014@ybb.ne.jp>:
> >>>>>>>>>>>>>>
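> >>>>>>>>>>>>>> Fukuda-san,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Good morning, this is Yamauchi.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Just in case, I am sending an excerpt of an example I have on hand that uses multiple stonith devices.
> >>>>>>>>>>>>>>> (In practice, please watch out for the line breaks.)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The example below is a configuration for the PM 1.1 series:
> >>>>>>>>>>>>>>> for nodea, stonith is executed in the order prmStonith1-1, then prmStonith1-2;
> >>>>>>>>>>>>>>> for nodeb, stonith is executed in the order prmStonith2-1, then prmStonith2-2.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The stonith devices themselves are helper and ssh.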
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>>>> ### Group Configuration ###
> >>>>>>>>>>>>>>> group grpStonith1 \
> >>>>>>>>>>>>>>>     prmStonith1-1 \
> >>>>>>>>>>>>>>>     prmStonith1-2
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> group grpStonith2 \
> >>>>>>>>>>>>>>>     prmStonith2-1 \
> >>>>>>>>>>>>>>>     prmStonith2-2
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> ### Fencing Topology ###
> >>>>>>>>>>>>>>> fencing_topology \
> >>>>>>>>>>>>>>>     nodea: prmStonith1-1 prmStonith1-2 \
> >>>>>>>>>>>>>>>     nodeb: prmStonith2-1 prmStonith2-2
> >>>>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>>>> primitive prmStonith1-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>>>         hostlist="nodea" \
> >>>>>>>>>>>>>>>         dead_check_target="192.168.28.60 192.168.28.70" \
> >>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> primitive prmStonith1-2 stonith:external/ssh \
> >>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>>>         hostlist="nodea" \
> >>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> primitive prmStonith2-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>>>         hostlist="nodeb" \
> >>>>>>>>>>>>>>>         dead_check_target="192.168.28.61 192.168.28.71" \
> >>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> primitive prmStonith2-2 stonith:external/ssh \
> >>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>>>         hostlist="nodeb" \
> >>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>>>> location rsc_location-grpStonith1-2 grpStonith1 \
> >>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodea
> >>>>>>>>>>>>>>> location rsc_location-grpStonith2-3 grpStonith2 \
> >>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodeb
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> That's all.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ELF Systems
> >>>>>>>>>>>>>> Masamichi Fukuda
> >>>>>>>>>>>>>> mail to:
> >>> masamichi_fukuda@elf-systems.com
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>>> Linux-ha-japan mailing list
> >>>>>>>>>>>>> Linux-ha-japan@lists.sourceforge.jp
> >>>>>>>>>>>>> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>>>>>>>>>>>



--
ELF Systems
Masamichi Fukuda
mail to: *masamichi_fukuda@elf-systems.com <elfsystems.com@gmail.com>*
Re: スプリットブレイン時のSTONITHエラーについて [ In reply to ]
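Fukuda-san,

Good morning, this is Yamauchi.

My wording was unclear.
By "Reusable" I meant glue.

There may be a problem with the Pacemaker installation, but I cannot tell at this point.


>I removed stonith-helper and started the cluster with only external/ssh, but
>the status in crm_mon did not change.


That is what I expected: I think the agents under external are not being recognized, so they never start.

# I don't think I can say more until I find the time to build a Debian 7 environment based on Matsushima-san's information.

That's all.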
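
The hypothesis above (agents under external are not being recognized) can be checked by hand with a couple of small tests. What follows is only a minimal sketch under stated assumptions, not something taken from the thread: the function names `check_plugin` and `on_path` are hypothetical, the plugin directory in the usage comment is the one reported earlier in this thread, and the second check is motivated by the `failed to exec "stonith"` line in the stonithd log quoted earlier.

```shell
#!/bin/bash
# Hypothetical helper: report whether a plugin file exists and is executable
# in the given directory. Prints "ok", "not-executable" or "missing".
check_plugin() {
    local dir="$1" name="$2"
    if [ ! -e "$dir/$name" ]; then
        echo "missing"
    elif [ ! -x "$dir/$name" ]; then
        echo "not-executable"
    else
        echo "ok"
    fi
}

# Hypothetical helper: report whether a command can be found on a given
# search path. The lookup runs in a subshell so the caller's PATH is untouched.
on_path() {
    local cmd="$1" path="$2"
    if ( PATH="$path"; command -v "$cmd" >/dev/null 2>&1 ); then
        echo "found"
    else
        echo "not-found"
    fi
}

# Example usage (directory as reported in this thread; run on a cluster node):
#   check_plugin /usr/local/heartbeat/lib/stonith/plugins/external stonith-helper
#   on_path stonith "$PATH"
```

If `on_path stonith "$PATH"` prints `not-found` for the PATH that the cluster daemons run with, that would be consistent with the `failed to exec "stonith"` message; note that an interactive shell's PATH may differ from the one a daemon inherits, so this is a hint rather than proof.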


----- Original Message -----
>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>Date: 2015/3/18, Wed 08:12
>Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
>
>
>Yamauchi-san,
>
>Good morning, this is Fukuda.
>
>> So I think that means neither xen0 nor stonith-helper is on a path that
>> Pacemaker manages for stonith plugins.
>>
>> The installation of Reusable and pm_extras looks suspect.
>
>Is there a problem with the Pacemaker installation?
>Also, does Reusable need to be installed separately?
>
>I removed stonith-helper and started the cluster with only external/ssh, but
>the status in crm_mon did not change.
>
>Last updated: Wed Mar 18 08:07:42 2015
>Last change: Wed Mar 18 08:04:48 2015
>Stack: heartbeat
>Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>Version: 1.1.12-e32080b
>2 Nodes configured
>6 Resources configured
>
>
>Online: [ lbv1.beta.com lbv2.beta.com ]
>
>Full list of resources:
>
>Stonith1-2      (stonith:external/ssh): Stopped
>Stonith2-2      (stonith:external/ssh): Stopped
> Resource Group: HAvarnish
>     vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>     varnishd   (lsb:varnish):  Started lbv1.beta.com
> Clone Set: clone_ping [ping]
>     Started: [ lbv1.beta.com lbv2.beta.com ]
>
>Node Attributes:
>* Node lbv1.beta.com:
>    + default_ping_set                  : 100
>* Node lbv2.beta.com:
>    + default_ping_set                  : 100
>
>Migration summary:
>* Node lbv2.beta.com:
>   Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:07:32 2015'
>* Node lbv1.beta.com:
>   Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:05:53 2015'
>
>Failed actions:
>    Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:07:30 2015', queued=0ms, exec=1061ms
>    Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:05:51 2015', queued=0ms, exec=1062ms
>
>Thank you in advance.
>
>That's all.
>
>
>
>
>
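>On 17 March 2015 at 23:51, <renayama19661014@ybb.ne.jp> wrote:
>
>Fukuda-san,
>>
>>Good evening, this is Yamauchi.
>>
>>So I think that means neither xen0 nor stonith-helper is on a path that Pacemaker manages for stonith plugins.
>>
>>The installation of Reusable and pm_extras looks suspect.
>>
>>I will let you know if I find out anything else.
>>
>>That's all.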
>>
>>
>>
>>----- Original Message -----
>>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>
>>>Date: 2015/3/17, Tue 23:46
>>>Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
>>>
>>>
>>>Yamauchi-san,
>>>
>>>Good evening, this is Fukuda.
>>>
>>>Perhaps I am specifying -x for stonith-helper the wrong way.
>>>
>>>I removed stonith-helper and started the cluster with only xen0.
>>>
>>># crm_mon -rfA
>>>
>>>Last updated: Tue Mar 17 23:38:53 2015
>>>Last change: Tue Mar 17 23:30:34 2015
>>>Stack: heartbeat
>>>Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>>>Version: 1.1.12-e32080b
>>>2 Nodes configured
>>>6 Resources configured
>>>
>>>
>>>Online: [ lbv1.beta.com lbv2.beta.com ]
>>>
>>>Full list of resources:
>>>
>>>Stonith1-2      (stonith:external/xen0):        Stopped
>>>Stonith2-2      (stonith:external/xen0):        Stopped
>>> Resource Group: HAvarnish
>>>     vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>     varnishd   (lsb:varnish):  Started lbv1.beta.com
>>> Clone Set: clone_ping [ping]
>>>     Started: [ lbv1.beta.com lbv2.beta.com ]
>>>
>>>Node Attributes:
>>>* Node lbv1.beta.com:
>>>    + default_ping_set                  : 100
>>>* Node lbv2.beta.com:
>>>    + default_ping_set                  : 100
>>>
>>>Migration summary:
>>>* Node lbv1.beta.com:
>>>   Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:34 2015'
>>>* Node lbv2.beta.com:
>>>   Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:27 2015'
>>>
>>>Failed actions:
>>>    Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32 2015', queued=0ms, exec=1061ms
>>>    Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25 2015', queued=0ms, exec=1342ms
>>>
>>>
>>>The same failed actions appear as when stonith-helper was configured.
>>>
>>>
>>>Thank you in advance.
>>>
>>>That's all.
>>>
>>>
>>>
>>>
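>>>On 17 March 2015 at 22:38, <renayama19661014@ybb.ne.jp> wrote:
>>>
>>>Fukuda-san,
>>>>
>>>>Good evening, this is Yamauchi.
>>>>
>>>>Incidentally, if possible, checking what happens when you remove external/stonith-helper
>>>>and leave only external/xen0 might help isolate the problem.
>>>>
>>>>That's all.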
>>>>
>>>>
>>>>
>>>>----- Original Message -----
>>>>
>>>>> From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
>>>>> To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>> Cc:
>>>>> Date: 2015/3/17, Tue 22:28
>>>>> Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
>>>>>
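>>>>> Fukuda-san,
>>>>>
>>>>> Good evening, this is Yamauchi.
>>>>>
>>>>> It looks like nothing has changed...
>>>>>
>>>>> For now, around tomorrow, I will check on RHEL, with the combination of
>>>>>
>>>>> Heartbeat 3.0.6
>>>>> the latest Pacemaker
>>>>>
>>>>> whether stonith-helper works under a similar configuration (the resources will be Dummy, and external/ssh will stand in for external/xen0).
>>>>>
>>>>> # If we could see the output of stonith-helper with -x specified, it would be easier to narrow the problem down...
>>>>>
>>>>>
>>>>> That's all.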
>>>>>
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: Masamichi Fukuda - elf-systems
>>>>> <masamichi_fukuda@elf-systems.com>
>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
>>>>> "linux-ha-japan@lists.sourceforge.jp"
>>>>> <linux-ha-japan@lists.sourceforge.jp>
>>>>>> Date: 2015/3/17, Tue 21:24
>>>>>> Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
>>>>>>
>>>>>>
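>>>>>> Yamauchi-san,
>>>>>>
>>>>>> Good evening, this is Fukuda.
>>>>>> Thank you for the information on the latest version.
>>>>>>
>>>>>> I installed it right away.
>>>>>>
>>>>>> This is the state after startup.
>>>>>>
>>>>>> The failed actions appear to be unchanged.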
>>>>>>
>>>>>>
>>>>>>
>>>>>> # crm_mon -rfA
>>>>>> Last updated: Tue Mar 17 21:03:49 2015
>>>>>> Last change: Tue Mar 17 20:30:58 2015
>>>>>> Stack: heartbeat
>>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>>>>>> Version: 1.1.12-e32080b
>>>>>> 2 Nodes configured
>>>>>> 8 Resources configured
>>>>>>
>>>>>>
>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>
>>>>>> Full list of resources:
>>>>>>
>>>>>>  Resource Group: HAvarnish
>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>  Resource Group: grpStonith1
>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>  Resource Group: grpStonith2
>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>  Clone Set: clone_ping [ping]
>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>
>>>>>> Node Attributes:
>>>>>> * Node lbv1.beta.com:
>>>>>>     + default_ping_set                  : 100
>>>>>> * Node lbv2.beta.com:
>>>>>>     + default_ping_set                  : 100
>>>>>>
>>>>>> Migration summary:
>>>>>> * Node lbv1.beta.com:
>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
>>>>>> * Node lbv2.beta.com:
>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
>>>>>>
>>>>>> Failed actions:
>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
>>>>>>
>>>>>>
>>>>>> This is the log.
>>>>>>
>>>>>>
>>>>>> # less /var/log/ha-debug
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist=
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109  gid 113 (pid 4250)
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109  gid 113 (pid 4246)
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109  gid 113 (pid 4249)
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109  gid 113 (pid 4245)
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0  gid 0 (pid 4248)
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 4247)
>>>>>> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname: lbv1.beta.com
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client ccm is set to 1024
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client attrd is set to 1024
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client stonith-ng is set to 1024
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status active
>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client cib is set to 1024
>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [15:17]
>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [19:21]
>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client crmd is set to 1024
>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [24:26]
>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [26:28]
>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for
>>>>> [lbv2.beta.com] [30:32]
>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from
>>>>> lbv2.beta.com!
>>>>>>
>>>>>>
>>>>>>
>>>>>> # less /var/log/error
>>>>>>
>>>>>> Mar 17 21:02:47 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>>>> Mar 17 21:02:48 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>>>>>>
>>>>>> # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
>>>>>> Mar 17 21:03:24 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
>>>>>> Mar 17 21:03:27 lbv1 crmd[4250]:   notice: run_graph: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
>>>>>> Mar 17 21:03:29 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
>>>>>> Mar 17 21:03:34 lbv1 crmd[4250]:   notice: run_graph: Transition 1 (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:   notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed:  2 ]
>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>>>>>> Mar 17 21:03:40 lbv1 crmd[4250]:   notice: run_graph: Transition 2 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
>>>>>> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>>>> Mar 17 21:03:47 lbv1 crmd[4250]:   notice: run_graph: Transition 3 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
>>>>>>
>>>>>> Thank you in advance.
>>>>>>
>>>>>> That is all.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 17 March 2015 at 18:31, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>
>>>>>> Fukuda-san,
>>>>>>>
>>>>>>> Good evening, this is Yamauchi.
>>>>>>>
>>>>>>> No tag has been created for it yet, so the latest version as of today is:
>>>>>>>
>>>>>>>  * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
>>>>>>>
>>>>>>> You can download it with the [Download ZIP] button on the right of that page.
>>>>>>>
>>>>>>> That is all.
>>>>>>>
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>> To: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>> Date: 2015/3/17, Tue 18:07
>>>>>>>> Subject: STONITH errors during split-brain
>>>>>>>>
>>>>>>>>
>>>>>>>> Yamauchi-san,
>>>>>>>>
>>>>>>>> Hello, this is Fukuda.
>>>>>>>>
>>>>>>>> I looked at this page:
>>>>>>>> https://github.com/ClusterLabs/pacemaker/tags
>>>>>>>>
>>>>>>>> It looks as though pacemaker 1.1.12 (561c4cf) is the latest there.
>>>>>>>> Sorry to ask, but could you tell me where I can find a version newer than that?
>>>>>>>>
>>>>>>>> Thank you in advance.
>>>>>>>>
>>>>>>>> That is all.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tuesday, 17 March 2015, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>>
>>>>>>>> Fukuda-san,
>>>>>>>>>
>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>
>>>>>>>>> Yes, it is old.
>>>>>>>>>
>>>>>>>>> Pacemaker only gained support for Heartbeat 3.0.6 fairly recently.
>>>>>>>>> Please install something newer. (You will need to build it from source again, though...)
>>>>>>>>>
>>>>>>>>> It is available from the upstream GitHub repository:
>>>>>>>>>  * https://github.com/ClusterLabs/pacemaker
>>>>>>>>>
>>>>>>>>> The latest master can sometimes produce errors; if that happens, I suggest working back through older revisions.
>>>>>>>>>
>>>>>>>>> That is all.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>> Date: 2015/3/17, Tue 16:06
>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>
>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>
>>>>>>>>>> In an earlier mail you advised me to install the latest versions of heartbeat and pacemaker.
>>>>>>>>>> So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12:
>>>>>>>>>>
>>>>>>>>>> heartbeat configuration: Version = "3.0.6"
>>>>>>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
>>>>>>>>>>
>>>>>>>>>> Does this mean that pacemaker is still too old?
>>>>>>>>>>
>>>>>>>>>> Sorry for the trouble, and thank you in advance.
>>>>>>>>>>
>>>>>>>>>> That is all.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 17 March 2015 at 14:59, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>>>>
>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>
>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>
>>>>>>>>>>> It just occurred to me: in an earlier mail in this exchange I answered as follows. Is everything fine on that point?
>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  2) Heartbeat 3.0.6 + latest Pacemaker: OK
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It appears that Heartbeat also has to be the latest version, 3.0.6, in this combination.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
>>>>>>>>>>>
>>>>>>>>>>> Judging from the crm_mon output below, you are running 1.1.12.
>>>>>>>>>>> Pairing with Heartbeat 3.0.6 requires a considerably newer Pacemaker.
>>>>>>>>>>>
>>>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>>>
>>>>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
>>>>>>>>>>>> Stack: heartbeat
>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>>>
>>>>>>>>>>> You probably need at least a version that includes the following change:
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
>>>>>>>>>>>
>>>>>>>>>>> That is all.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>> Date: 2015/3/17, Tue 14:38
>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>>
>>>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you mean I should add -x to the shebang line of stonith-helper?
>>>>>>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
>>>>>>>>>>>>
>>>>>>>>>>>> crm_mon shows nothing different from before.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>>>
>>>>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
>>>>>>>>>>>> Stack: heartbeat
>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>>>> 2 Nodes configured
>>>>>>>>>>>> 8 Resources configured
>>>>>>>>>>>>
>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>
>>>>>>>>>>>> Full list of resources:
>>>>>>>>>>>>
>>>>>>>>>>>>  Resource Group: HAvarnish
>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>>>>>  Resource Group: grpStonith1
>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>  Resource Group: grpStonith2
>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>  Clone Set: clone_ping [ping]
>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>
>>>>>>>>>>>> Node Attributes:
>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>
>>>>>>>>>>>> Migration summary:
>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
>>>>>>>>>>>>
>>>>>>>>>>>> Failed actions:
>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
>>>>>>>>>>>>
>>>>>>>>>>>> I also looked for other logs.
>>>>>>>>>>>>
>>>>>>>>>>>> These are from heartbeat startup.
>>>>>>>>>>>>
>>>>>>>>>>>> # less /var/log/pm_logconv.out
>>>>>>>>>>>> Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
>>>>>>>>>>>> Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
>>>>>>>>>>>>
>>>>>>>>>>>> # less /var/log/error
>>>>>>>>>>>> Mar 17 14:12:20 lbv1 crmd[13269]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
>>>>>>>>>>>>
>>>>>>>>>>>> This is syslog grepped for stonith:
>>>>>>>>>>>>
>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 13266)
>>>>>>>>>>>> Mar 17 14:11:34 lbv1 stonithd[13266]:   notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: setup_cib: Watching for stonith topology changes
>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: unpack_config: On loss of CCM Quorum: Ignore
>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
>>>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
>>>>>>>>>>>> Mar 17 14:12:04 lbv1 stonithd[13266]:   notice: xml_patch_version_check: Versions did not change in patch 0.5.0
>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:   notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed:  2 ]
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>>
>>>>>>>>>>>> That is all.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 17 March 2015 at 13:32, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In that case, the problem appears to be in the start operation of stonith-helper.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you put
>>>>>>>>>>>>>
>>>>>>>>>>>>> #!/bin/bash -x
>>>>>>>>>>>>>
>>>>>>>>>>>>> at the top of stonith-helper and then start the cluster, it may reveal something.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Incidentally, I would expect stonith-helper's own log output to be written somewhere as well...
>>>>>>>>>>>>>
>>>>>>>>>>>>> That is all.
>>>>>>>>>>>>>
>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>>>> Date: 2015/3/17, Tue 12:31
>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> xen0 is in the same directory.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # pwd
>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # ls
>>>>>>>>>>>>>> drac5          ibmrsa         kdumpcheck  riloe           vmware
>>>>>>>>>>>>>> dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
>>>>>>>>>>>>>> hetzner        ipmi           nut         stonith-helper  xen0-ha
>>>>>>>>>>>>>> hmchttp        ippower9258    rackpdu     vcenter
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That is all.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2015-03-17 10:53 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There was no standard output or standard error output.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
>>>>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>>>>>>>>>>> stonith-helper is located here:
>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is xen0 in this directory as well?
>>>>>>>>>>>>>>> If it is not, that is a problem, so please try copying the stonith-helper file, preserving its attributes, into the same directory as xen0.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If it works after that, the problem lies in the pm_extras installation.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That is all.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>>>>>> Date: 2015/3/17, Tue 10:31
>>>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Good morning, this is Fukuda.
>>>>>>>>>>>>>>>> Thank you for the crm example.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I adapted it to my environment right away.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> $ cat test.crm
>>>>>>>>>>>>>>>> ### Cluster Option ###
>>>>>>>>>>>>>>>> property \
>>>>>>>>>>>>>>>>     no-quorum-policy="ignore" \
>>>>>>>>>>>>>>>>     stonith-enabled="true" \
>>>>>>>>>>>>>>>>     startup-fencing="false" \
>>>>>>>>>>>>>>>>     stonith-timeout="710s" \
>>>>>>>>>>>>>>>>     crmd-transition-delay="2s"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ### Resource Default ###
>>>>>>>>>>>>>>>> rsc_defaults \
>>>>>>>>>>>>>>>>     resource-stickiness="INFINITY" \
>>>>>>>>>>>>>>>>     migration-threshold="1"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ### Group Configuration ###
>>>>>>>>>>>>>>>> group HAvarnish \
>>>>>>>>>>>>>>>>     vip_208 \
>>>>>>>>>>>>>>>>     varnishd
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> group grpStonith1 \
>>>>>>>>>>>>>>>>     Stonith1-1 \
>>>>>>>>>>>>>>>>     Stonith1-2
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> group grpStonith2 \
>>>>>>>>>>>>>>>>     Stonith2-1 \
>>>>>>>>>>>>>>>>     Stonith2-2
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ### Clone Configuration ###
>>>>>>>>>>>>>>>> clone clone_ping \
>>>>>>>>>>>>>>>>     ping
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ### Fencing Topology ###
>>>>>>>>>>>>>>>> fencing_topology \
>>>>>>>>>>>>>>>>     lbv1.beta.com: Stonith1-1 Stonith1-2 \
>>>>>>>>>>>>>>>>     lbv2.beta.com: Stonith2-1 Stonith2-2
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ### Primitive Configuration ###
>>>>>>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>         ip="192.168.17.208" \
>>>>>>>>>>>>>>>>         nic="eth0" \
>>>>>>>>>>>>>>>>         cidr_netmask="24" \
>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>>>>>     op monitor interval="5s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> primitive varnishd lsb:varnish \
>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>         name="default_ping_set" \
>>>>>>>>>>>>>>>>         host_list="192.168.17.254" \
>>>>>>>>>>>>>>>>         multiplier="100" \
>>>>>>>>>>>>>>>>         dampen="1" \
>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com" \
>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.132 10.0.17.132" \
>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>>>>>>         run_online_check="yes" \
>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com" \
>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.133 10.0.17.133" \
>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>>>>>>         run_online_check="yes" \
>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> primitive Stonith2-2 stonith:external/xen0 \
>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ### Resource Location ###
>>>>>>>>>>>>>>>> location HA_location-1 HAvarnish \
>>>>>>>>>>>>>>>>     rule 200: #uname eq lbv1.beta.com \
>>>>>>>>>>>>>>>>     rule 100: #uname eq lbv2.beta.com
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> location HA_location-2 HAvarnish \
>>>>>>>>>>>>>>>>     rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> location HA_location-3 grpStonith1 \
>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv1.beta.com
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> location HA_location-4 grpStonith2 \
>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv2.beta.com
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> After loading this configuration, the messages are different from yesterday's.
>>>>>>>>>>>>>>>> The ping messages are gone.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>>>>>>> Last updated: Tue Mar 17 10:21:28 2015
>>>>>>>>>>>>>>>> Last change: Tue Mar 17 10:21:09 2015
>>>>>>>>>>>>>>>> Stack: heartbeat
>>>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>>>>>>>> 2 Nodes configured
>>>>>>>>>>>>>>>> 8 Resources configured
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Full list of resources:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  Resource Group: HAvarnish
>>>>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>>>>>>>>>  Resource Group: grpStonith1
>>>>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>>>>>  Resource Group: grpStonith2
>>>>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
>>>>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Node Attributes:
>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Migration summary:
>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Failed actions:
>>>>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
>>>>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This is the log from /var/log/ha-debug:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There was no standard output or standard error output.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
>>>>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>>>>>>>>>>> stonith-helper is located here:
>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> That is all.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Good morning, this is Yamauchi.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Just in case, here is an excerpt of an example I have on hand that uses multiple stonith devices.
>>>>>>>>>>>>>>>>> (When you actually use it, watch out for the line breaks.)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The example below is a configuration for the PM 1.1 series:
>>>>>>>>>>>>>>>>> for nodea, stonith runs prmStonith1-1 and then prmStonith1-2;
>>>>>>>>>>>>>>>>> for nodeb, stonith runs prmStonith2-1 and then prmStonith2-2.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The stonith plugins themselves are helper and ssh.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>>>> ### Group Configuration ###
>>>>>>>>>>>>>>>>> group grpStonith1 \
>>>>>>>>>>>>>>>>> prmStonith1-1 \
>>>>>>>>>>>>>>>>> prmStonith1-2
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> group grpStonith2 \
>>>>>>>>>>>>>>>>> prmStonith2-1 \
>>>>>>>>>>>>>>>>> prmStonith2-2
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ### Fencing Topology ###
>>>>>>>>>>>>>>>>> fencing_topology \
>>>>>>>>>>>>>>>>> nodea: prmStonith1-1
>>>>> prmStonith1-2 \
>>>>>>>>>>>>>>>>> nodeb: prmStonith2-1
>>>>> prmStonith2-2
>>>>>>>>>>>>>>>>> (snp)
>>>>>>>>>>>>>>>>> primitive prmStonith1-1
>>>>> stonith:external/stonith-helper \
>>>>>>>>>>>>>>>>> params \
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> pcmk_reboot_retries="1"
>>>>> \
>>>>>>>>>>>>>>>>> pcmk_reboot_timeout="40s"
>>>>> \
>>>>>>>>>>>>>>>>> hostlist="nodea" \
>>>>>>>>>>>>>>>>> dead_check_target="192.168.28.60
>>>>> 192.168.28.70" \
>>>>>>>>>>>>>>>>> standby_check_command="/usr/sbin/crm_resource
>>>>> -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>>>>>> run_online_check="yes"
>>>>> \
>>>>>>>>>>>>>>>>> op start interval="0s"
>>>>> timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>> op stop interval="0s"
>>>>> timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> primitive prmStonith1-2
>>>>> stonith:external/ssh \
>>>>>>>>>>>>>>>>> params \
>>>>>>>>>>>>>>>>> pcmk_reboot_timeout="60s"
>>>>> \
>>>>>>>>>>>>>>>>> hostlist="nodea" \
>>>>>>>>>>>>>>>>> op start interval="0s"
>>>>> timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>> op monitor
>>>>> interval="3600s" timeout="60s" on-fail="restart"
>>>>> \
>>>>>>>>>>>>>>>>> op stop interval="0s"
>>>>> timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> primitive prmStonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>>>>> params \
>>>>>>>>>>>>>>>>> pcmk_reboot_retries="1" \
>>>>>>>>>>>>>>>>> pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>>>>> hostlist="nodeb" \
>>>>>>>>>>>>>>>>> dead_check_target="192.168.28.61 192.168.28.71" \
>>>>>>>>>>>>>>>>> standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>>>>>> run_online_check="yes" \
>>>>>>>>>>>>>>>>> op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>> op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> primitive prmStonith2-2 stonith:external/ssh \
>>>>>>>>>>>>>>>>> params \
>>>>>>>>>>>>>>>>> pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>>>>> hostlist="nodeb" \
>>>>>>>>>>>>>>>>> op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>> op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>> op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>>>> location rsc_location-grpStonith1-2 grpStonith1 \
>>>>>>>>>>>>>>>>> rule -INFINITY: #uname eq nodea
>>>>>>>>>>>>>>>>> location rsc_location-grpStonith2-3 grpStonith2 \
>>>>>>>>>>>>>>>>> rule -INFINITY: #uname eq nodeb
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> That is all.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ELF Systems
>>>>>>>>>>>>>>>> Masamichi Fukuda
>>>>>>>>>>>>>>>> mail to:
>>>>> masamichi_fukuda@elf-systems.com
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Linux-ha-japan mailing list
>>>>>>>>>>>>>>> Linux-ha-japan@lists.sourceforge.jp
>>>>>>>>>>>>>>> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
>>>>>>>>>>>>>>>

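Re: STONITH errors during split-brain [ In reply to ]
Yamauchi-san,

Hello. This is Fukuda.

>By Reusable, I meant glue.

Understood. You mean Cluster-glue.

>This is as expected; I believe the agents under external are not being
>recognized, so the resources are not started.

For what it is worth, stonith -L does list the plugins.
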
# /usr/local/heartbeat/sbin/stonith -L
apcmaster
apcsmart
baytech
cyclades
external/drac5
external/dracmc-telnet
external/hetzner
external/hmchttp
external/ibmrsa
external/ibmrsa-telnet
external/ipmi
external/ippower9258
external/kdumpcheck
external/libvirt
external/nut
external/rackpdu
external/riloe
external/ssh
external/stonith-helper
external/vcenter
external/vmware
external/xen0
external/xen0-ha
ibmhmc
meatware
null
nw_rpc100s
rcd_serial
rps10
ssh
suicide
wti_nps
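
The listing above was produced by calling stonith with its full path, while the syslog quoted further down this thread shows stonith-ng reporting `failed to exec "stonith"`. One thing that may be worth ruling out, as an assumption rather than a confirmed diagnosis, is that the `stonith` binary is not on the search path the daemon uses. The following Python sketch checks two things: that a plugin listing contains the agents the configuration needs, and whether a command resolves on a given search path. The function names and the example search path are hypothetical and only for illustration.

```python
import shutil


def missing_plugins(listing, required):
    """Return the required plugin names that do not appear in `stonith -L` output."""
    available = {line.strip() for line in listing.splitlines() if line.strip()}
    return [name for name in required if name not in available]


def resolve_on_path(command, search_path):
    """Return the full path that `command` resolves to on `search_path`, or None."""
    return shutil.which(command, path=search_path)


if __name__ == "__main__":
    # Abbreviated copy of the listing above.
    sample = "external/ssh\nexternal/stonith-helper\nexternal/xen0\nssh\n"
    print(missing_plugins(sample, ["external/stonith-helper", "external/xen0"]))
    # Hypothetical daemon search path that omits /usr/local/heartbeat/sbin.
    print(resolve_on_path("stonith", "/usr/sbin:/usr/bin:/sbin:/bin"))
```

If the second call returns None for the daemon's search path but a path is found once /usr/local/heartbeat/sbin is appended, that would be consistent with the plugins being installed yet not executable by name from stonith-ng.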


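>#I think we cannot say anything definite until I find time to build a
>Debian 7 environment based on Matsushima-san's information.

I am sorry to trouble you when you are busy.
I will also review the installation on my side.

Thank you in advance.

That is all.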
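
The syslog excerpt quoted further down this thread contains stonith-ng `log_operation` lines whose bracketed text is the output of the stonith agent itself, for example `[ failed to exec "stonith" ]`. The following Python sketch pulls that bracketed output out per device so it is easier to compare runs. It assumes the mail-wrapped log lines have first been rejoined into single syslog lines; the pattern is written against the lines shown in this thread and is not a general Pacemaker log parser.

```python
import re

# Matches lines such as:
#   ... stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ failed: 2 ]
LOG_OPERATION = re.compile(
    r"stonith-ng\[\d+\]:\s+\w+:\s+log_operation:\s+"
    r"(?P<device>[^:\s]+):(?P<pid>\d+)\s+\[\s*(?P<output>.*?)\s*\]\s*$"
)


def agent_output(lines):
    """Return (device, output) pairs for each stonith-ng agent output line."""
    results = []
    for line in lines:
        match = LOG_OPERATION.search(line)
        if match:
            results.append((match.group("device"), match.group("output")))
    return results


if __name__ == "__main__":
    sample = [
        'Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: '
        'Stonith2-1:4377 [ failed to exec "stonith" ]',
    ]
    for device, output in agent_output(sample):
        print(device, "->", output)
```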


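On 18 March 2015 at 9:02, <renayama19661014@ybb.ne.jp> wrote:

> Fukuda-san,
>
> Good morning. This is Yamauchi.
>
> My wording was unclear.
> By Reusable, I meant glue.
>
> There may be a problem with the pacemaker installation, but I cannot tell at this point.
>
>
> >I removed stonith-helper and started the cluster with only external/ssh,
> >but the status in crm_mon did not change.
>
>
> This is as expected; I believe the agents under external are not being recognized, so the resources are not started.
>
> #I think we cannot say anything definite until I find time to build a Debian 7 environment based on Matsushima-san's information.
>
> That is all.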
>
>
> ----- Original Message -----
> >From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >Date: 2015/3/18, Wed 08:12
> >Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >
> >
> >Yamauchi-san,
> >
> >Good morning. This is Fukuda.
> >
> >> That would mean that probably neither xen0 nor stonith-helper is in a
> >> path that Pacemaker manages for stonith plugins.
> >>
> >> I suspect the installation of Reusable and pm_extras.
> >
> >Is there a problem with the pacemaker installation?
> >Also, does Reusable need to be installed separately?
> >
> >I removed stonith-helper and started the cluster with only external/ssh,
> >but the status in crm_mon did not change.
> >
> >Last updated: Wed Mar 18 08:07:42 2015
> >Last change: Wed Mar 18 08:04:48 2015
> >Stack: heartbeat
> >Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - parti
> >tion with quorum
> >Version: 1.1.12-e32080b
> >2 Nodes configured
> >6 Resources configured
> >
> >
> >Online: [ lbv1.beta.com lbv2.beta.com ]
> >
> >Full list of resources:
> >
> >Stonith1-2 (stonith:external/ssh): Stopped
> >Stonith2-2 (stonith:external/ssh): Stopped
> > Resource Group: HAvarnish
> > vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> > varnishd (lsb:varnish): Started lbv1.beta.com
> > Clone Set: clone_ping [ping]
> > Started: [ lbv1.beta.com lbv2.beta.com ]
> >
> >Node Attributes:
> >* Node lbv1.beta.com:
> > + default_ping_set : 100
> >* Node lbv2.beta.com:
> > + default_ping_set : 100
> >
> >Migration summary:
> >* Node lbv2.beta.com:
> > Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Wed
> Mar 18
> > 08:07:32 2015'
> >* Node lbv1.beta.com:
> > Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Wed
> Mar 18
> > 08:05:53 2015'
> >
> >Failed actions:
> > Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, st
> >atus=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:07:30
> 2015', queue
> >d=0ms, exec=1061ms
> > Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, st
> >atus=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:05:51
> 2015', queue
> >d=0ms, exec=1062ms
> >
> >Thank you in advance.
> >
> >That is all.
> >
> >
> >
> >
> >
> >On 17 March 2015 at 23:51, <renayama19661014@ybb.ne.jp> wrote:
> >
> >Fukuda-san,
> >>
> >>Good evening. This is Yamauchi.
> >>
>
> >>That would mean that probably neither xen0 nor stonith-helper is in a path that Pacemaker manages for stonith plugins.
> >>
> >>I suspect the installation of Reusable and pm_extras.
> >>
> >>I will let you know if I find out anything more.
> >>
> >>That is all.
> >>
> >>
> >>
> >>----- Original Message -----
> >>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>
> >>>Date: 2015/3/17, Tue 23:46
> >>>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>
> >>>
> >>>Yamauchi-san,
> >>>
> >>>Good evening. This is Fukuda.
> >>>
> >>>I wonder whether I am specifying -x for stonith-helper the wrong way.
> >>>
> >>>I removed stonith-helper and started the cluster with only xen0.
> >>>
> >>># crm_mon -rfA
> >>>
> >>>Last updated: Tue Mar 17 23:38:53 2015
> >>>Last change: Tue Mar 17 23:30:34 2015
> >>>Stack: heartbeat
> >>>Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) -
> parti
> >>>tion with quorum
> >>>Version: 1.1.12-e32080b
> >>>2 Nodes configured
> >>>6 Resources configured
> >>>
> >>>
> >>>Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>
> >>>Full list of resources:
> >>>
> >>>Stonith1-2 (stonith:external/xen0): Stopped
> >>>Stonith2-2 (stonith:external/xen0): Stopped
> >>> Resource Group: HAvarnish
> >>> vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> >>> varnishd (lsb:varnish): Started lbv1.beta.com
> >>> Clone Set: clone_ping [ping]
> >>> Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>
> >>>Node Attributes:
> >>>* Node lbv1.beta.com:
> >>> + default_ping_set : 100
> >>>* Node lbv2.beta.com:
> >>> + default_ping_set : 100
> >>>
> >>>Migration summary:
> >>>* Node lbv1.beta.com:
> >>> Stonith2-2: migration-threshold=1 fail-count=1000000
> last-failure='Tue Mar 17
> >>> 23:38:34 2015'
> >>>* Node lbv2.beta.com:
> >>> Stonith1-2: migration-threshold=1 fail-count=1000000
> last-failure='Tue Mar 17
> >>> 23:38:27 2015'
> >>>
> >>>Failed actions:
> >>> Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23,
> st
> >>>atus=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32
> 2015', queue
> >>>d=0ms, exec=1061ms
> >>> Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23,
> st
> >>>atus=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25
> 2015', queue
> >>>d=0ms, exec=1342ms
> >>>
> >>>
> >>>
> >>>
> >>>The same failed actions appear as when stonith-helper was present.
> >>>
> >>>
> >>>Thank you in advance.
> >>>
> >>>That is all.
> >>>
> >>>
> >>>
> >>>
> >>>On 17 March 2015 at 22:38, <renayama19661014@ybb.ne.jp> wrote:
> >>>
> >>>Fukuda-san,
> >>>>
> >>>>Good evening. This is Yamauchi.
> >>>>
> >>>>By the way, if possible, checking what happens when you remove external/stonith-helper
> >>>>and leave only external/xen0 may help isolate the problem.
> >>>>
> >>>>That is all.
> >>>>
> >>>>
> >>>>
> >>>>----- Original Message -----
> >>>>
> >>>>> From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
> >>>>> To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>> Cc:
> >>>>> Date: 2015/3/17, Tue 22:28
> >>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>
> >>>>> Fukuda-san,
> >>>>>
> >>>>> Good evening. This is Yamauchi.
> >>>>>
> >>>>> It looks like nothing has changed...
> >>>>>
> >>>>> For now, around tomorrow, I will check, albeit on RHEL, whether stonith-helper works with the combination of
> >>>>>
> >>>>> Heartbeat3.0.6
> >>>>> the latest Pacemaker
> >>>>>
> >>>>> and a similar configuration (the resources will be Dummy, and external/xen0 will be replaced by external/ssh).
> >>>>>
> >>>>> #If we could see the output of stonith-helper run with -x, it would be easier to narrow down the problem...
> >>>>>
> >>>>>
> >>>>> That is all.
> >>>>>
> >>>>>
> >>>>>
> >>>>> ----- Original Message -----
> >>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>> Date: 2015/3/17, Tue 21:24
> >>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>
> >>>>>>
> >>>>>> Yamauchi-san,
> >>>>>>
> >>>>>> Good evening. This is Fukuda.
> >>>>>> Thank you for the information on the latest version.
> >>>>>>
> >>>>>> I installed it right away.
> >>>>>>
> >>>>>> Here is the status after startup.
> >>>>>>
> >>>>>> The failed actions appear to be unchanged.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> # crm_mon -rfA
> >>>>>> Last updated: Tue Mar 17 21:03:49 2015
> >>>>>> Last change: Tue Mar 17 20:30:58 2015
> >>>>>> Stack: heartbeat
> >>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) -
> parti
> >>>>>> tion with quorum
> >>>>>> Version: 1.1.12-e32080b
> >>>>>> 2 Nodes configured
> >>>>>> 8 Resources configured
> >>>>>>
> >>>>>>
> >>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>
> >>>>>> Full list of resources:
> >>>>>>
> >>>>>> Resource Group: HAvarnish
> >>>>>> vip_208 (ocf::heartbeat:IPaddr2): Started
> lbv1.beta.com
> >>>>>> varnishd (lsb:varnish): Started lbv1.beta.com
> >>>>>> Resource Group: grpStonith1
> >>>>>> Stonith1-1 (stonith:external/stonith-helper): Stopped
> >>>>>> Stonith1-2 (stonith:external/xen0): Stopped
> >>>>>> Resource Group: grpStonith2
> >>>>>> Stonith2-1 (stonith:external/stonith-helper): Stopped
> >>>>>> Stonith2-2 (stonith:external/xen0): Stopped
> >>>>>> Clone Set: clone_ping [ping]
> >>>>>> Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>
> >>>>>> Node Attributes:
> >>>>>> * Node lbv1.beta.com:
> >>>>>> + default_ping_set : 100
> >>>>>> * Node lbv2.beta.com:
> >>>>>> + default_ping_set : 100
> >>>>>>
> >>>>>> Migration summary:
> >>>>>> * Node lbv1.beta.com:
> >>>>>> Stonith2-1: migration-threshold=1 fail-count=1000000
> >>>>> last-failure='Tue Mar 17
> >>>>>> 21:03:39 2015'
> >>>>>> * Node lbv2.beta.com:
> >>>>>> Stonith1-1: migration-threshold=1 fail-count=1000000
> >>>>> last-failure='Tue Mar 17
> >>>>>> 21:03:32 2015'
> >>>>>>
> >>>>>> Failed actions:
> >>>>>> Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1):
> >>>>> call=31, st
> >>>>>> atus=Error, exit-reason='none', last-rc-change='Tue Mar 17
> >>>>> 21:03:37 2015', queue
> >>>>>> d=0ms, exec=1085ms
> >>>>>> Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1):
> >>>>> call=18, st
> >>>>>> atus=Error, exit-reason='none', last-rc-change='Tue Mar 17
> >>>>> 21:03:30 2015', queue
> >>>>>> d=0ms, exec=1061ms
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Here are the logs.
> >>>>>>
> >>>>>>
> >>>>>> # less /var/log/ha-debug
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker
> support:
> >>>>> yes
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File
> >>>>> /etc/ha.d//haresources exists.
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file
> is not used
> >>>>> because pacemaker is enabled
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking
> access of:
> >>>>> /usr/local/heartbeat/libexec/heartbeat/ccm
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking
> access of:
> >>>>> /usr/local/heartbeat/libexec/pacemaker/cib
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking
> access of:
> >>>>> /usr/local/heartbeat/libexec/pacemaker/stonithd
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking
> access of:
> >>>>> /usr/local/heartbeat/libexec/pacemaker/lrmd
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking
> access of:
> >>>>> /usr/local/heartbeat/libexec/pacemaker/attrd
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking
> access of:
> >>>>> /usr/local/heartbeat/libexec/pacemaker/crmd
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps
> could be
> >>>>> lost if multiple dumps occur.
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider
> setting
> >>>>> non-default value in /proc/sys/kernel/core_pattern (or equivalent)
> for maximum
> >>>>> supportability
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider
> setting
> >>>>> /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum
> supportability
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging
> daemon is
> >>>>> disabled --enabling logging daemon is recommended
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info:
> >>>>> **************************
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info:
> Configuration
> >>>>> validated. Starting heartbeat 3.0.6
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat:
> version
> >>>>> 3.0.6
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat
> generation:
> >>>>> 1423534116
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is
> -1702799346
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib:
> ucast: write
> >>>>> socket priority set to IPTOS_LOWDELAY on eth1
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib:
> ucast: bound
> >>>>> send socket to device: eth1
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib:
> ucast: set
> >>>>> SO_REUSEADDR
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib:
> ucast: bound
> >>>>> receive socket to device: eth1
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib:
> ucast: started
> >>>>> on port 694 interface eth1 to 10.0.17.133
> >>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local
> status now set
> >>>>> to: 'up'
> >>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link
> >>>>> lbv2.beta.com:eth1 up.
> >>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status
> update for
> >>>>> node lbv2.beta.com: status up
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info:
> Comm_now_up():
> >>>>> updating status to active
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local
> status now set
> >>>>> to: 'active'
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting
> child client
> >>>>> "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting
> child client
> >>>>> "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting
> child client
> >>>>> "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting
> child client
> >>>>> "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting
> child client
> >>>>> "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting
> child client
> >>>>> "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug:
> get_delnodelist:
> >>>>> delnodelist=
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting
> >>>>> "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109 gid 113
> (pid
> >>>>> 4250)
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting
> >>>>> "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109 gid 113 (pid
> >>>>> 4246)
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting
> >>>>> "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109 gid 113
> >>>>> (pid 4249)
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting
> >>>>> "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109 gid 113 (pid
> >>>>> 4245)
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting
> >>>>> "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0 gid 0 (pid
> >>>>> 4248)
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting
> >>>>> "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0
> (pid
> >>>>> 4247)
> >>>>>> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname:
> lbv1.beta.com
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send
> queue length
> >>>>> from heartbeat to client ccm is set to 1024
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send
> queue length
> >>>>> from heartbeat to client attrd is set to 1024
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send
> queue length
> >>>>> from heartbeat to client stonith-ng is set to 1024
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status
> update for
> >>>>> node lbv2.beta.com: status active
> >>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send
> queue length
> >>>>> from heartbeat to client cib is set to 1024
> >>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost
> packet(s) for
> >>>>> [lbv2.beta.com] [15:17]
> >>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts
> missing from
> >>>>> lbv2.beta.com!
> >>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost
> packet(s) for
> >>>>> [lbv2.beta.com] [19:21]
> >>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts
> missing from
> >>>>> lbv2.beta.com!
> >>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send
> queue length
> >>>>> from heartbeat to client crmd is set to 1024
> >>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost
> packet(s) for
> >>>>> [lbv2.beta.com] [24:26]
> >>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts
> missing from
> >>>>> lbv2.beta.com!
> >>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost
> packet(s) for
> >>>>> [lbv2.beta.com] [26:28]
> >>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts
> missing from
> >>>>> lbv2.beta.com!
> >>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost
> packet(s) for
> >>>>> [lbv2.beta.com] [30:32]
> >>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts
> missing from
> >>>>> lbv2.beta.com!
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> # less /var/log/error
> >>>>>>
> >>>>>> Mar 17 21:02:47 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored
> >>>>> incoming message. Please set_msg_callback on hbclstat
> >>>>>> Mar 17 21:02:48 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored
> >>>>> incoming message. Please set_msg_callback on hbclstat
> >>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch:
> Ignored
> >>>>> incoming message. Please set_msg_callback on hbclstat
> >>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch:
> Ignored
> >>>>> incoming message. Please set_msg_callback on hbclstat
> >>>>>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event:
> Operation
> >>>>> Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4,
> cib-update=42,
> >>>>> confirmed=true) Error
> >>>>>>
> >>>>>> # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep
> >>>>> 'heartbeat|stonith|pacemaker|error'
> >>>>>> Mar 17 21:03:24 lbv1 pengine[4253]: notice: process_pe_message:
> Calculated
> >>>>> Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
> >>>>>> Mar 17 21:03:27 lbv1 crmd[4250]: notice: run_graph: Transition 0
> >>>>> (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2,
> >>>>> Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
> >>>>>> Mar 17 21:03:29 lbv1 pengine[4253]: notice: process_pe_message:
> Calculated
> >>>>> Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
> >>>>>> Mar 17 21:03:34 lbv1 crmd[4250]: notice: run_graph: Transition 1
> >>>>> (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1,
> >>>>> Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
> >>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure:
> >>>>> Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown
> error (1)
> >>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure:
> >>>>> Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown
> error (1)
> >>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: notice: process_pe_message:
> Calculated
> >>>>> Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
> >>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: notice: log_operation:
> Operation
> >>>>> 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic
> >>>>> Pacemaker error)
> >>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation:
> >>>>> Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
> >>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation:
> >>>>> Stonith2-1:4377 [ failed to exec "stonith" ]
> >>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation:
> >>>>> Stonith2-1:4377 [ failed: 2 ]
> >>>>>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event:
> Operation
> >>>>> Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4,
> cib-update=42,
> >>>>> confirmed=true) Error
> >>>>>> Mar 17 21:03:40 lbv1 crmd[4250]: notice: run_graph: Transition 2
> >>>>> (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0,
> >>>>> Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
> >>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure:
> >>>>> Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown
> error (1)
> >>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure:
> >>>>> Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown
> error (1)
> >>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure:
> >>>>> Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown
> error (1)
> >>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: notice: process_pe_message:
> Calculated
> >>>>> Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
> >>>>>> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO:
> >>>>> /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p
> >>>>> /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208
> auto
> >>>>> not_used not_used
> >>>>>> Mar 17 21:03:47 lbv1 crmd[4250]: notice: run_graph: Transition 3
> >>>>> (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> >>>>> Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
> >>>>>>
> >>>>>> Thank you in advance.
> >>>>>>
> >>>>>> That is all.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 17 March 2015 at 18:31, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>
> >>>>>> Fukuda-san,
> >>>>>>>
> >>>>>>> Good evening. This is Yamauchi.
> >>>>>>>
> >>>>>>> Since it has not been tagged, the latest version as of today is
> >>>>>>>
> >>>>>>> * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
> >>>>>>>
> >>>>>>>
> >>>>>>> You can download it from [Download ZIP] on the right-hand side.
> >>>>>>>
> >>>>>>> That is all.
> >>>>>>>
> >>>>>>>
> >>>>>>> ----- Original Message -----
> >>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>>
> >>>>>>>> To: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>> Date: 2015/3/17, Tue 18:07
> >>>>>>>> Subject: STONITH errors during split-brain
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Yamauchi-san,
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Hello. This is Fukuda.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I looked at this page:
> >>>>>>>> https://github.com/ClusterLabs/pacemaker/tags
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> pacemaker 1.1.12 561c4cf appears to be the latest there.
> >>>>>>>> Could you tell me where I can find anything newer than this?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thank you in advance.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> That is all.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Tuesday, 17 March 2015, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>>>
> >>>>>>>> Fukuda-san,
> >>>>>>>>>
> >>>>>>>>> Hello. This is Yamauchi.
> >>>>>>>>>
> >>>>>>>>> Yes, it is old.
> >>>>>>>>>
> >>>>>>>>> Pacemaker gained support for Heartbeat3.0.6 surprisingly recently.
> >>>>>>>>> Please install something newer. (You will need to build it from source again, though...)
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> It is available from the upstream GitHub repository.
> >>>>>>>>> * https://github.com/ClusterLabs/pacemaker
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> In some cases the latest master may produce errors; if so, I suggest working
> >>>>>>>>> backwards to older versions.
> >>>>>>>>>
> >>>>>>>>> That is all.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> ----- Original Message -----
> >>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>> Date: 2015/3/17, Tue 16:06
> >>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Yamauchi-san,
> >>>>>>>>>>
> >>>>>>>>>> Hello. This is Fukuda.
> >>>>>>>>>>
> >>>>>>>>>> In an earlier email you advised me to install the latest versions of heartbeat and pacemaker.
> >>>>>>>>>> So this time I installed heartbeat3.0.6 and pacemaker1.1.12.
> >>>>>>>>>>
> >>>>>>>>>> heartbeat configuration: Version = "3.0.6"
> >>>>>>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
> >>>>>>>>>>
> >>>>>>>>>> Does this mean that pacemaker is still too old?
> >>>>>>>>>>
> >>>>>>>>>> Sorry for the trouble, and thank you in advance.
> >>>>>>>>>>
> >>>>>>>>>> That is all.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 17 March 2015 at 14:59, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Fukuda-san,
> >>>>>>>>>>>
> >>>>>>>>>>> Hello. This is Yamauchi.
> >>>>>>>>>>>
> >>>>>>>>>>> It just occurred to me: earlier in this exchange I gave the answer below. Has that been taken care of?
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 2) Heartbeat3.0.6 + latest Pacemaker: OK
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> It appears that Heartbeat also has to be the latest version, 3.0.6, in this combination.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
> >>>>>>>>>>>
> >>>>>>>>>>> Judging by the crm_mon version below, you are running 1.1.12.
> >>>>>>>>>>> Combining with Heartbeat3.0.6 requires a fairly recent Pacemaker.
> >>>>>>>>>>>
> >>>>>>>>>>>> # crm_mon -rfA
> >>>>>>>>>>>>
> >>>>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
> >>>>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
> >>>>>>>>>>>> Stack: heartbeat
> >>>>>>>>>>>> Current DC: lbv2.beta.com
> >>>>> (82ffc36f-1ad8-8686-7db0-35686465c624) - parti
> >>>>>>>>>>>> tion with quorum
> >>>>>>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>>>>>>
> >>>>>>>>>>> You probably need at least a version that includes the following change.
> >>>>>>>>>>>
> >>>>>>>>>>> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> That is all.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>>>
> >>>>>>>>>>>> Date: 2015/3/17, Tue 14:38
> >>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yamauchi-san,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hello. This is Fukuda.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Should I add -x to the shebang line of stonith-helper?
> >>>>>>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
> >>>>>>>>>>>>
> >>>>>>>>>>>> crm_mon shows no change from before.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> # crm_mon -rfA
> >>>>>>>>>>>>
> >>>>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
> >>>>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
> >>>>>>>>>>>> Stack: heartbeat
> >>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
> >>>>>>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>>>>>>> 2 Nodes configured
> >>>>>>>>>>>> 8 Resources configured
> >>>>>>>>>>>>
> >>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>>>>>
> >>>>>>>>>>>> Full list of resources:
> >>>>>>>>>>>>
> >>>>>>>>>>>>  Resource Group: HAvarnish
> >>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
> >>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
> >>>>>>>>>>>>  Resource Group: grpStonith1
> >>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
> >>>>>>>>>>>>  Resource Group: grpStonith2
> >>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
> >>>>>>>>>>>>  Clone Set: clone_ping [ping]
> >>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>>>>>
> >>>>>>>>>>>> Node Attributes:
> >>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>>>>     + default_ping_set                  : 100
> >>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>>>>     + default_ping_set                  : 100
> >>>>>>>>>>>>
> >>>>>>>>>>>> Migration summary:
> >>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
> >>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
> >>>>>>>>>>>>
> >>>>>>>>>>>> Failed actions:
> >>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
> >>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
> >>>>>>>>>>>>
> >>>>>>>>>>>> I looked through the other logs.
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is from heartbeat startup.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> # less /var/log/pm_logconv.out
> >>>>>>>>>>>> Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
> >>>>>>>>>>>> Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
> >>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
> >>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
> >>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
> >>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
> >>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
> >>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> # less /var/log/error
> >>>>>>>>>>>> Mar 17 14:12:20 lbv1 crmd[13269]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> These are the lines matching stonith, grepped from syslog:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0 (pid 13266)
> >>>>>>>>>>>> Mar 17 14:11:34 lbv1 stonithd[13266]: notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
> >>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
> >>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: notice: setup_cib: Watching for stonith topology changes
> >>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: notice: unpack_config: On loss of CCM Quorum: Ignore
> >>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
> >>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
> >>>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]: notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
> >>>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]: notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
> >>>>>>>>>>>> Mar 17 14:12:04 lbv1 stonithd[13266]: notice: xml_patch_version_check: Versions did not change in patch 0.5.0
> >>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
> >>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
> >>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
> >>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ failed: 2 ]
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thank you for your help.
> >>>>>>>>>>>>
> >>>>>>>>>>>> That is all.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 17 March 2015 at 13:32, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Fukuda-san,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hello, this is Yamauchi.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> So it looks like the problem is in the start operation of stonith-helper.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> If you put
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> #!/bin/bash -x
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> at the top of stonith-helper and start the cluster, we may learn something.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> By the way, I would expect stonith-helper's own log to be written somewhere as well...
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> That is all.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>>>>>> Date: 2015/3/17, Tue 12:31
> >>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Yamauchi-san,
> >>>>>>>>>>>>>> cc: Matsushima-san
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hello, this is Fukuda.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> xen0 is in the same directory.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> # pwd
> >>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> # ls
> >>>>>>>>>>>>>> drac5          ibmrsa         kdumpcheck  riloe           vmware
> >>>>>>>>>>>>>> dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
> >>>>>>>>>>>>>> hetzner        ipmi           nut         stonith-helper  xen0-ha
> >>>>>>>>>>>>>> hmchttp        ippower9258    rackpdu     vcenter
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thank you for your help.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> That is all.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 2015-03-17 10:53 GMT+09:00 <renayama19661014@ybb.ne.jp>:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Fukuda-san,
> >>>>>>>>>>>>>>> cc: Matsushima-san
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hello, this is Yamauchi.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> There was no standard output or standard error output.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
> >>>>>>>>>>>>>>>> stonith-helper is a shell script, so I had not paid much attention to how it was installed.
> >>>>>>>>>>>>>>>> stonith-helper is located here:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Is xen0 also in this directory?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If it is not, that is a problem, so please try copying the stonith-helper file,
> >>>>>>>>>>>>>>> with its attributes unchanged, into the same directory as xen0.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If it works after that, the pm_extras installation is at fault.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> That is all.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>>>>>>>> Date: 2015/3/17, Tue 10:31
> >>>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Yamauchi-san,
> >>>>>>>>>>>>>>>> cc: Matsushima-san
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Good morning, this is Fukuda.
> >>>>>>>>>>>>>>>> Thank you for the crm example.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I adapted it to our environment right away.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> $ cat test.crm
> >>>>>>>>>>>>>>>> ### Cluster Option ###
> >>>>>>>>>>>>>>>> property \
> >>>>>>>>>>>>>>>>     no-quorum-policy="ignore" \
> >>>>>>>>>>>>>>>>     stonith-enabled="true" \
> >>>>>>>>>>>>>>>>     startup-fencing="false" \
> >>>>>>>>>>>>>>>>     stonith-timeout="710s" \
> >>>>>>>>>>>>>>>>     crmd-transition-delay="2s"
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ### Resource Default ###
> >>>>>>>>>>>>>>>> rsc_defaults \
> >>>>>>>>>>>>>>>>     resource-stickiness="INFINITY" \
> >>>>>>>>>>>>>>>>     migration-threshold="1"
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ### Group Configuration ###
> >>>>>>>>>>>>>>>> group HAvarnish \
> >>>>>>>>>>>>>>>>     vip_208 \
> >>>>>>>>>>>>>>>>     varnishd
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> group grpStonith1 \
> >>>>>>>>>>>>>>>>     Stonith1-1 \
> >>>>>>>>>>>>>>>>     Stonith1-2
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> group grpStonith2 \
> >>>>>>>>>>>>>>>>     Stonith2-1 \
> >>>>>>>>>>>>>>>>     Stonith2-2
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ### Clone Configuration ###
> >>>>>>>>>>>>>>>> clone clone_ping \
> >>>>>>>>>>>>>>>>     ping
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ### Fencing Topology ###
> >>>>>>>>>>>>>>>> fencing_topology \
> >>>>>>>>>>>>>>>>     lbv1.beta.com: Stonith1-1 Stonith1-2 \
> >>>>>>>>>>>>>>>>     lbv2.beta.com: Stonith2-1 Stonith2-2
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ### Primitive Configuration ###
> >>>>>>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
> >>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>         ip="192.168.17.208" \
> >>>>>>>>>>>>>>>>         nic="eth0" \
> >>>>>>>>>>>>>>>>         cidr_netmask="24" \
> >>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>>>>>>>>>>>     op monitor interval="5s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> primitive varnishd lsb:varnish \
> >>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
> >>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>         name="default_ping_set" \
> >>>>>>>>>>>>>>>>         host_list="192.168.17.254" \
> >>>>>>>>>>>>>>>>         multiplier="100" \
> >>>>>>>>>>>>>>>>         dampen="1" \
> >>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com" \
> >>>>>>>>>>>>>>>>         dead_check_target="192.168.17.132 10.0.17.132" \
> >>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
> >>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
> >>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
> >>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com" \
> >>>>>>>>>>>>>>>>         dead_check_target="192.168.17.133 10.0.17.133" \
> >>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> primitive Stonith2-2 stonith:external/xen0 \
> >>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
> >>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
> >>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ### Resource Location ###
> >>>>>>>>>>>>>>>> location HA_location-1 HAvarnish \
> >>>>>>>>>>>>>>>>     rule 200: #uname eq lbv1.beta.com \
> >>>>>>>>>>>>>>>>     rule 100: #uname eq lbv2.beta.com
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> location HA_location-2 HAvarnish \
> >>>>>>>>>>>>>>>>     rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> location HA_location-3 grpStonith1 \
> >>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv1.beta.com
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> location HA_location-4 grpStonith2 \
> >>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv2.beta.com
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> After loading this, the messages differ from yesterday's.
> >>>>>>>>>>>>>>>> The ping messages are gone.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> # crm_mon -rfA
> >>>>>>>>>>>>>>>> Last updated: Tue Mar 17 10:21:28 2015
> >>>>>>>>>>>>>>>> Last change: Tue Mar 17 10:21:09 2015
> >>>>>>>>>>>>>>>> Stack: heartbeat
> >>>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
> >>>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>>>>>>>>>>> 2 Nodes configured
> >>>>>>>>>>>>>>>> 8 Resources configured
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Full list of resources:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>  Resource Group: HAvarnish
> >>>>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
> >>>>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
> >>>>>>>>>>>>>>>>  Resource Group: grpStonith1
> >>>>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
> >>>>>>>>>>>>>>>>  Resource Group: grpStonith2
> >>>>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
> >>>>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
> >>>>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Node Attributes:
> >>>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>>>>>>>>     + default_ping_set                  : 100
> >>>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>>>>>>>>     + default_ping_set                  : 100
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Migration summary:
> >>>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
> >>>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Failed actions:
> >>>>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
> >>>>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> This is the /var/log/ha-debug log.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
> >>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
> >>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> There was no standard output or standard error output.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
> >>>>>>>>>>>>>>>> stonith-helper is a shell script, so I had not paid much attention to how it was installed.
> >>>>>>>>>>>>>>>> stonith-helper is located here:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thank you for your help.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> That is all.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> 2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Fukuda-san,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Good morning, this is Yamauchi.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Just in case, here is an excerpt from an example I have on hand that uses multiple stonith devices.
> >>>>>>>>>>>>>>>>> (When you actually use it, watch out for the line breaks.)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> The example below is a configuration for the PM 1.1 series:
> >>>>>>>>>>>>>>>>> for nodea, stonith is executed in the order prmStonith1-1, then prmStonith1-2;
> >>>>>>>>>>>>>>>>> for nodeb, stonith is executed in the order prmStonith2-1, then prmStonith2-2.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> The stonith plugins themselves are helper and ssh.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>>>>>> ### Group Configuration ###
> >>>>>>>>>>>>>>>>> group grpStonith1 \
> >>>>>>>>>>>>>>>>>     prmStonith1-1 \
> >>>>>>>>>>>>>>>>>     prmStonith1-2
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> group grpStonith2 \
> >>>>>>>>>>>>>>>>>     prmStonith2-1 \
> >>>>>>>>>>>>>>>>>     prmStonith2-2
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> ### Fencing Topology ###
> >>>>>>>>>>>>>>>>> fencing_topology \
> >>>>>>>>>>>>>>>>>     nodea: prmStonith1-1 prmStonith1-2 \
> >>>>>>>>>>>>>>>>>     nodeb: prmStonith2-1 prmStonith2-2
> >>>>>>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>>>>>> primitive prmStonith1-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>>>>>         hostlist="nodea" \
> >>>>>>>>>>>>>>>>>         dead_check_target="192.168.28.60 192.168.28.70" \
> >>>>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> primitive prmStonith1-2 stonith:external/ssh \
> >>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>>>>>         hostlist="nodea" \
> >>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> primitive prmStonith2-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>>>>>         hostlist="nodeb" \
> >>>>>>>>>>>>>>>>>         dead_check_target="192.168.28.61 192.168.28.71" \
> >>>>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> primitive prmStonith2-2 stonith:external/ssh \
> >>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>>>>>         hostlist="nodeb" \
> >>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>>>>>> location rsc_location-grpStonith1-2 grpStonith1 \
> >>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodea
> >>>>>>>>>>>>>>>>> location rsc_location-grpStonith2-3 grpStonith2 \
> >>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodeb
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> That is all.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ELF Systems
> >>>>>>>>>>>>>>>> Masamichi Fukuda
> >>>>>>>>>>>>>>>> mail to:
> >>>>> masamichi_fukuda@elf-systems.com
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>>>>> Linux-ha-japan mailing list
> >>>>>>>>>>>>>>> Linux-ha-japan@lists.sourceforge.jp
> >>>>>>>>>>>>>>>
> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>>>>>>>>>>>>>



--
ELF Systems
Masamichi Fukuda
mail to: *masamichi_fukuda@elf-systems.com <elfsystems.com@gmail.com>*
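Re: STONITH errors during split-brain [ In reply to ]
Fukuda-san,

Hello, this is Yamauchi.

>stonith -L does at least seem to list the plugins.
>
># /usr/local/heartbeat/sbin/stonith -L

That command should be one that ships with the Heartbeat source, so I think this means there is no problem in the relationship between Heartbeat and glue.

So it seems more likely that the problem lies in the Pacemaker installation.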
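
One possible way to probe this, given that the syslog excerpt quoted further down shows `failed to exec "stonith"` while the binary itself lives at /usr/local/heartbeat/sbin/stonith, is to check whether a bare `stonith` resolves under the PATH that the daemon is likely to see. The following is only a rough sketch: `on_path` is a hypothetical helper name, and the two PATH values are assumptions for illustration, not the actual environment of stonithd.

```shell
#!/bin/sh
# Sketch: report whether a command resolves under a given PATH value.
# on_path is a hypothetical helper, not part of Pacemaker or cluster-glue.
on_path() {
    # $1 = command name, $2 = PATH value to test (run in a subshell so
    # the caller's PATH is left untouched)
    ( PATH="$2"; command -v "$1" ) >/dev/null 2>&1
}

# Assumed candidate PATH values; the /usr/local/heartbeat prefix is the
# install location mentioned in this thread.
for p in "/usr/sbin:/usr/bin:/sbin:/bin" \
         "/usr/local/heartbeat/sbin:/usr/sbin:/usr/bin:/sbin:/bin"; do
    if on_path stonith "$p"; then
        echo "stonith found with PATH=$p"
    else
        echo "stonith NOT found with PATH=$p"
    fi
done
```

If `stonith` is only found with the second value, that would be consistent with the exec failure in the log, though it does not by itself prove that this is the cause.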

Either way, I will find some time and try building the environment here as well.
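
Because the earlier attempt in this thread to run stonith-helper with `#!/bin/bash -x` produced no visible trace, it may help when rebuilding to send the trace to a file rather than relying on standard error being captured. The lines below are a minimal sketch assuming bash 4.1 or later (for `BASH_XTRACEFD`); the trace file path and the stand-in command are hypothetical, and the idea is that similar lines would be placed near the top of the plugin script.

```shell
#!/bin/bash
# Sketch: write bash xtrace output to a file instead of stderr.
# BASH_XTRACEFD requires bash 4.1+. The trace path is a hypothetical choice.
TRACE_FILE="${TRACE_FILE:-/tmp/stonith-helper.trace.$$}"
exec 9>>"$TRACE_FILE"   # open file descriptor 9 for appending
BASH_XTRACEFD=9         # tell bash to send xtrace lines to fd 9
set -x

# The plugin's existing logic would follow here; a stand-in command:
echo "status check" >/dev/null

set +x
```

After a start attempt, the trace file (if it was written at all) shows how far the script got, which also tells you whether the script was executed in the first place.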

That is all.


----- Original Message -----
>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>Date: 2015/3/18, Wed 09:33
>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>
>
>Yamauchi-san,
>
>Hello, this is Fukuda.
>
>>By Reusable I meant glue.
>
>Understood. You mean Cluster-glue.
>
>>That is as expected: I think the agents under external are not being recognized,
>>so they are never started.
>
>stonith -L does at least seem to list the plugins.
>
># /usr/local/heartbeat/sbin/stonith -L
>apcmaster
>apcsmart
>baytech
>cyclades
>external/drac5
>external/dracmc-telnet
>external/hetzner
>external/hmchttp
>external/ibmrsa
>external/ibmrsa-telnet
>external/ipmi
>external/ippower9258
>external/kdumpcheck
>external/libvirt
>external/nut
>external/rackpdu
>external/riloe
>external/ssh
>external/stonith-helper
>external/vcenter
>external/vmware
>external/xen0
>external/xen0-ha
>ibmhmc
>meatware
>null
>nw_rpc100s
>rcd_serial
>rps10
>ssh
>suicide
>wti_nps
>
>
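>>#I do not think I can say anything definite until I find time to build a
>>Debian 7 environment based on Matsushima-san's information.
>
>Sorry to trouble you when you are busy.
>I will review the installation on my side as well.
>
>Thank you for your help.
>
>That is all.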
>
>
>
>
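>On 18 March 2015 at 9:02, <renayama19661014@ybb.ne.jp> wrote:
>
>Fukuda-san,
>>
>>Good morning, this is Yamauchi.
>>
>>My wording was poor.
>>By Reusable I meant glue.
>>
>>There may be a problem with the Pacemaker installation, but I cannot tell at this point.
>>
>>
>>>I removed stonith-helper and started with only external/ssh,
>>>but the state shown by crm_mon did not change.
>>
>>
>>That is as expected: I think the agents under external are not being recognized, so they are never started.
>>
>>#I do not think I can say anything definite until I find time to build a Debian 7 environment based on Matsushima-san's information.
>>
>>That is all.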
>>
>>
>>----- Original Message -----
>>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>
>>>Date: 2015/3/18, Wed 08:12
>>>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>
>>>
>>>Yamauchi-san,
>>>
>>>Good morning, this is Fukuda.
>>>
>>>> In that case, I think neither xen0 nor stonith-helper is in a path that
>>>> Pacemaker manages as a location for stonith plugins.
>>>>
>>>> I suspect the installation of Reusable and pm_extras.
>>>
>>>Is there a problem with the Pacemaker installation?
>>>Also, does Reusable need to be installed separately?
>>>
>>>I removed stonith-helper and started with only external/ssh,
>>>but the state shown by crm_mon did not change.
>>>
>>>Last updated: Wed Mar 18 08:07:42 2015
>>>Last change: Wed Mar 18 08:04:48 2015
>>>Stack: heartbeat
>>>Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>>>Version: 1.1.12-e32080b
>>>2 Nodes configured
>>>6 Resources configured
>>>
>>>
>>>Online: [ lbv1.beta.com lbv2.beta.com ]
>>>
>>>Full list of resources:
>>>
>>>Stonith1-2      (stonith:external/ssh): Stopped
>>>Stonith2-2      (stonith:external/ssh): Stopped
>>> Resource Group: HAvarnish
>>>     vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>     varnishd   (lsb:varnish):  Started lbv1.beta.com
>>> Clone Set: clone_ping [ping]
>>>     Started: [ lbv1.beta.com lbv2.beta.com ]
>>>
>>>Node Attributes:
>>>* Node lbv1.beta.com:
>>>    + default_ping_set                  : 100
>>>* Node lbv2.beta.com:
>>>    + default_ping_set                  : 100
>>>
>>>Migration summary:
>>>* Node lbv2.beta.com:
>>>   Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:07:32 2015'
>>>* Node lbv1.beta.com:
>>>   Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:05:53 2015'
>>>
>>>Failed actions:
>>>    Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:07:30 2015', queued=0ms, exec=1061ms
>>>    Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:05:51 2015', queued=0ms, exec=1062ms
>>>
>>>Thank you for your help.
>>>
>>>That is all.
>>>
>>>
>>>
>>>
>>>
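>>>On 17 March 2015 at 23:51, <renayama19661014@ybb.ne.jp> wrote:
>>>
>>>Fukuda-san,
>>>>
>>>>Good evening, this is Yamauchi.
>>>>
>>>>In that case, I think neither xen0 nor stonith-helper is in a path that Pacemaker manages as a location for stonith plugins.
>>>>
>>>>I suspect the installation of Reusable and pm_extras.
>>>>
>>>>I will let you know if I find out anything more.
>>>>
>>>>That is all.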
>>>>
>>>>
>>>>
>>>>----- Original Message -----
>>>>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>
>>>>>Date: 2015/3/17, Tue 23:46
>>>>>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>
>>>>>
>>>>>Yamauchi-san,
>>>>>
>>>>>Good evening, this is Fukuda.
>>>>>
>>>>>I wonder whether I am doing something wrong with the -x setting for stonith-helper.
>>>>>
>>>>>I removed stonith-helper and started with only xen0.
>>>>>
>>>>># crm_mon -rfA
>>>>>
>>>>>Last updated: Tue Mar 17 23:38:53 2015
>>>>>Last change: Tue Mar 17 23:30:34 2015
>>>>>Stack: heartbeat
>>>>>Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>>>>>Version: 1.1.12-e32080b
>>>>>2 Nodes configured
>>>>>6 Resources configured
>>>>>
>>>>>
>>>>>Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>
>>>>>Full list of resources:
>>>>>
>>>>>Stonith1-2      (stonith:external/xen0):        Stopped
>>>>>Stonith2-2      (stonith:external/xen0):        Stopped
>>>>> Resource Group: HAvarnish
>>>>>     vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>     varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>> Clone Set: clone_ping [ping]
>>>>>     Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>
>>>>>Node Attributes:
>>>>>* Node lbv1.beta.com:
>>>>>    + default_ping_set                  : 100
>>>>>* Node lbv2.beta.com:
>>>>>    + default_ping_set                  : 100
>>>>>
>>>>>Migration summary:
>>>>>* Node lbv1.beta.com:
>>>>>   Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:34 2015'
>>>>>* Node lbv2.beta.com:
>>>>>   Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:27 2015'
>>>>>
>>>>>Failed actions:
>>>>>    Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32 2015', queued=0ms, exec=1061ms
>>>>>    Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25 2015', queued=0ms, exec=1342ms
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>The same failed actions appear as when stonith-helper was present.
>>>>>
>>>>>
>>>>>Thank you for your help.
>>>>>
>>>>>That is all.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>On 17 March 2015 at 22:38, <renayama19661014@ybb.ne.jp> wrote:
>>>>>
>>>>>Fukuda-san,
>>>>>>
>>>>>>Good evening, this is Yamauchi.
>>>>>>
>>>>>>By the way, if possible, checking what happens with external/stonith-helper removed
>>>>>>and only external/xen0 configured might help isolate the problem.
>>>>>>
>>>>>>That is all.
>>>>>>
>>>>>>
>>>>>>
>>>>>>----- Original Message -----
>>>>>>
>>>>>>> From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
>>>>>>> To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>> Cc:
>>>>>>> Date: 2015/3/17, Tue 22:28
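>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>
>>>>>>> Fukuda-san,
>>>>>>>
>>>>>>> Good evening, this is Yamauchi.
>>>>>>>
>>>>>>> It looks like nothing has changed...
>>>>>>>
>>>>>>> For now, probably tomorrow, and on RHEL rather than Debian, I will take the combination of
>>>>>>>
>>>>>>> Heartbeat3.0.6
>>>>>>> the latest Pacemaker
>>>>>>>
>>>>>>> and check whether stonith-helper works with a similar configuration (the resources will be Dummy, and external/ssh will stand in for external/xen0).
>>>>>>>
>>>>>>> #If we could see the output from running stonith-helper with -x, it would be easier to narrow the problem down...
>>>>>>>
>>>>>>>
>>>>>>> That is all.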
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>>> From: Masamichi Fukuda - elf-systems
>>>>>>> <masamichi_fukuda@elf-systems.com>
>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>> Date: 2015/3/17, Tue 21:24
>>>>>>>> Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
>>>>>>>>
>>>>>>>>
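>>>>>>>> Yamauchi-san,
>>>>>>>>
>>>>>>>> Good evening, this is Fukuda.
>>>>>>>> Thank you for the information about the latest version.
>>>>>>>>
>>>>>>>> I installed it right away.
>>>>>>>>
>>>>>>>> This is the state after startup.
>>>>>>>>
>>>>>>>> The failed actions appear to be unchanged.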
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> # crm_mon -rfA
>>>>>>>> Last updated: Tue Mar 17 21:03:49 2015
>>>>>>>> Last change: Tue Mar 17 20:30:58 2015
>>>>>>>> Stack: heartbeat
>>>>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>>>>>>>> Version: 1.1.12-e32080b
>>>>>>>> 2 Nodes configured
>>>>>>>> 8 Resources configured
>>>>>>>>
>>>>>>>>
>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>
>>>>>>>> Full list of resources:
>>>>>>>>
>>>>>>>>  Resource Group: HAvarnish
>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>  Resource Group: grpStonith1
>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>  Resource Group: grpStonith2
>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>  Clone Set: clone_ping [ping]
>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>
>>>>>>>> Node Attributes:
>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>     + default_ping_set                  : 100
>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>
>>>>>>>> Migration summary:
>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
>>>>>>>>
>>>>>>>> Failed actions:
>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Here are the logs.
>>>>>>>>
>>>>>>>>
>>>>>>>> # less /var/log/ha-debug
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist=
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109  gid 113 (pid 4250)
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109  gid 113 (pid 4246)
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109  gid 113 (pid 4249)
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109  gid 113 (pid 4245)
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0  gid 0 (pid 4248)
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 4247)
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname: lbv1.beta.com
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client ccm is set to 1024
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client attrd is set to 1024
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client stonith-ng is set to 1024
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status active
>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client cib is set to 1024
>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [15:17]
>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [19:21]
>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client crmd is set to 1024
>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [24:26]
>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [26:28]
>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [30:32]
>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> # less /var/log/error
>>>>>>>>
>>>>>>>> Mar 17 21:02:47 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>>>>>> Mar 17 21:02:48 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>>>>>>>>
>>>>>>>> # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
>>>>>>>> Mar 17 21:03:24 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
>>>>>>>> Mar 17 21:03:27 lbv1 crmd[4250]:   notice: run_graph: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
>>>>>>>> Mar 17 21:03:29 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
>>>>>>>> Mar 17 21:03:34 lbv1 crmd[4250]:   notice: run_graph: Transition 1 (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:   notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed:  2 ]
>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>>>>>>>> Mar 17 21:03:40 lbv1 crmd[4250]:   notice: run_graph: Transition 2 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
>>>>>>>> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>>>>>> Mar 17 21:03:47 lbv1 crmd[4250]:   notice: run_graph: Transition 3 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
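The syslog excerpt above shows stonith-ng reporting `failed to exec "stonith"` while running `stonith -t external/stonith-helper -S`. One thing that could be checked (an assumption about a possible cause, not something confirmed in this thread) is whether the `stonith` command resolves on the PATH the daemon runs with. A minimal sketch; `find_on_path` is a hypothetical helper name:

```shell
# Hypothetical helper: report whether a command resolves on a given PATH value.
# Prints the resolved path and returns 0 if found; prints nothing and
# returns non-zero otherwise.
find_on_path() {
    cmd_name="$1"
    search_path="$2"
    # Run the lookup in a subshell so the caller's PATH is left untouched.
    ( PATH="$search_path"; command -v "$cmd_name" )
}
```

For example, `find_on_path stonith "/usr/sbin:/usr/bin:/sbin:/bin"` could be compared with `find_on_path stonith "$PATH"` from a root login shell. Which PATH stonith-ng actually inherits on this system is not shown in the thread, so the value passed in is a guess to be replaced with the real one.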
>>>>>>>>
>>>>>>>> Thank you for your help.
>>>>>>>>
>>>>>>>> That's all.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On March 17, 2015 at 18:31, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>>
>>>>>>>> Fukuda-san,
>>>>>>>>>
>>>>>>>>> Good evening, this is Yamauchi.
>>>>>>>>>
>>>>>>>>> Since it has not been tagged, today's latest version is:
>>>>>>>>>
>>>>>>>>>  * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> You can download it with [Download ZIP] on the right-hand side of the page.
>>>>>>>>>
>>>>>>>>> That's all.
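For fetching that untagged revision from the command line instead of the web page, a minimal sketch follows. It assumes GitHub's usual `archive/<commit>.zip` URL layout; `commit_zip_url` is a hypothetical helper name, not part of any tool mentioned in this thread:

```shell
# Hypothetical helper: build the GitHub ZIP archive URL for one commit.
# Assumes the https://github.com/<owner>/<repo>/archive/<sha>.zip layout.
commit_zip_url() {
    repo="$1"   # owner/name, e.g. ClusterLabs/pacemaker
    sha="$2"    # full commit hash
    printf 'https://github.com/%s/archive/%s.zip\n' "$repo" "$sha"
}
```

A possible use is `curl -L -o pacemaker.zip "$(commit_zip_url ClusterLabs/pacemaker e32080b460f81486b85d08ec958582b3e72d858c)"`, after which the source is built as before.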
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: Masamichi Fukuda - elf-systems
>>>>>>> <masamichi_fukuda@elf-systems.com>
>>>>>>>>>
>>>>>>>>>> To: "renayama19661014@ybb.ne.jp"
>>>>>>> <renayama19661014@ybb.ne.jp>;
>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>> Date: 2015/3/17, Tue 18:07
>>>>>>>>>> Subject: スプリットブレイン時のSTONITHエラーについて
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I looked at this page:
>>>>>>>>>> https://github.com/ClusterLabs/pacemaker/tags
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> pacemaker 1.1.12 561c4cf appears to be the latest there.
>>>>>>>>>> Sorry to trouble you, but could you tell me where to find the newer versions after that one?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thank you for your help.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> That's all.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tuesday, March 17, 2015, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>>>>
>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>
>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>
>>>>>>>>>>> Yes, it is old.
>>>>>>>>>>>
>>>>>>>>>>> Pacemaker gained support for Heartbeat 3.0.6 surprisingly recently.
>>>>>>>>>>> Please install something newer. (You will need to build it from source again, though...)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It is available from the upstream GitHub repository.
>>>>>>>>>>>  * https://github.com/ClusterLabs/pacemaker
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> In some cases the latest master may produce errors; if that happens, I think it is best to step back
>>>>>>>>>>> through older revisions.
>>>>>>>>>>>
>>>>>>>>>>> That's all.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems
>>>>>>> <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>> Date: 2015/3/17, Tue 16:06
>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>>
>>>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>>>
>>>>>>>>>>>> In an earlier mail you advised me to install the latest versions of heartbeat and pacemaker.
>>>>>>>>>>>> So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12.
>>>>>>>>>>>>
>>>>>>>>>>>> heartbeat configuration: Version = "3.0.6"
>>>>>>>>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
>>>>>>>>>>>>
>>>>>>>>>>>> Does this mean pacemaker is still too old?
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry to trouble you, and thank you for your help.
>>>>>>>>>>>>
>>>>>>>>>>>> That's all.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On March 17, 2015 at 14:59, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It just occurred to me: in an earlier mail in this exchange I replied as follows. Is that condition met on your side?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>  2) Heartbeat 3.0.6 + latest Pacemaker: OK
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It appears that Heartbeat also has to be the latest version, 3.0.6.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>  * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
>>>>>>>>>>>>>
>>>>>>>>>>>>> Judging from the version in the crm_mon output below, you are running 1.1.12.
>>>>>>>>>>>>> Combining it with Heartbeat 3.0.6 requires a fairly recent Pacemaker.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
>>>>>>>>>>>>>> Stack: heartbeat
>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think you probably need at least the following change, or anything later.
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> That's all.
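To compare the running build against that commit, the build hash can be pulled out of the `Version:` line that crm_mon prints. A minimal sketch, assuming the `Version: <release>-<hash>` layout seen in the output quoted in this thread; `pacemaker_build` is a hypothetical helper name:

```shell
# Hypothetical helper: read crm_mon output on stdin and print the build hash
# from a line such as "Version: 1.1.12-561c4cf".
pacemaker_build() {
    # Keep only the hexadecimal suffix after the release number.
    sed -n 's/^Version: [0-9.]*-\([0-9a-f]*\).*$/\1/p'
}
```

With a one-shot run such as `crm_mon -1 | pacemaker_build`, the resulting hash could then be checked inside a clone of the pacemaker repository with `git merge-base --is-ancestor f2302da063d08719d28367d8e362b8bfb0f85bf3 <hash>`, which exits 0 when the required change is already included in that build.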
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems
>>>>>>> <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Date: 2015/3/17, Tue 14:38
>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is it enough to add -x to the shebang line of stonith-helper?
>>>>>>>>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> crm_mon shows no change from before.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
>>>>>>>>>>>>>> Stack: heartbeat
>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>>>>>> 2 Nodes configured
>>>>>>>>>>>>>> 8 Resources configured
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Full list of resources:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Resource Group: HAvarnish
>>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>>>>>>>  Resource Group: grpStonith1
>>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>>>  Resource Group: grpStonith2
>>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
>>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Node Attributes:
>>>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Migration summary:
>>>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
>>>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Failed actions:
>>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
>>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also looked through the other logs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These are from heartbeat startup.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # less /var/log/pm_logconv.out
>>>>>>>>>>>>>> Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
>>>>>>>>>>>>>> Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # less /var/log/error
>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 crmd[13269]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These are the syslog lines that match a grep for stonith:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 13266)
>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 stonithd[13266]:   notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: setup_cib: Watching for stonith topology changes
>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: unpack_config: On loss of CCM Quorum: Ignore
>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
>>>>>>>>>>>>>> Mar 17 14:12:04 lbv1 stonithd[13266]:   notice: xml_patch_version_check: Versions did not change in patch 0.5.0
>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:   notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed:  2 ]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you for your help.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That's all.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On March 17, 2015 at 13:32, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So that means the problem is in the start of stonith-helper.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you put
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> #!/bin/bash -x
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> at the top of stonith-helper and start the cluster, you may learn something.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Incidentally, I think the stonith-helper log should also be written somewhere...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That's all.
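Since the `-x` trace is written to stderr, which may not be easy to find when the script is launched by the cluster daemons, one option is to redirect it to a dedicated file. A minimal sketch; `enable_trace` and the trace file path are hypothetical and are not part of stonith-helper:

```shell
# Hypothetical helper: append this shell's stderr (where `set -x` writes its
# trace) to a file, then turn tracing on for every command that follows.
enable_trace() {
    trace_file="$1"
    # `exec` with only a redirection changes stderr for the current shell.
    exec 2>>"$trace_file"
    set -x
}
```

Defining this near the top of the script and calling, for example, `enable_trace /tmp/stonith-helper.trace` would leave a trace that can be read after the failed start. Note that every other stderr message from the script ends up in the same file.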
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems
>>>>>>> <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Date: 2015/3/17, Tue 12:31
>>>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp]
>>>>>>> スプリットブレイン時のSTONITHエラーについて
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> xen0 is in the same directory.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> # pwd
>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> # ls
>>>>>>>>>>>>>>>> drac5          ibmrsa         kdumpcheck  riloe           vmware
>>>>>>>>>>>>>>>> dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
>>>>>>>>>>>>>>>> hetzner        ipmi           nut         stonith-helper  xen0-ha
>>>>>>>>>>>>>>>> hmchttp        ippower9258    rackpdu     vcenter
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you for your help.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> That's all.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2015-03-17 10:53 GMT+09:00
>>>>>>> <renayama19661014@ybb.ne.jp>:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> There was no standard output or standard error output.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
>>>>>>>>>>>>>>>>>> stonith-helper is a shell script, so I had not paid much attention to how it was installed.
>>>>>>>>>>>>>>>>>> stonith-helper is located here:
>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is xen0 also in this directory?
>>>>>>>>>>>>>>>>> If it is not, that is a problem, so please try copying the stonith-helper file, with its attributes unchanged, into the same directory
>>>>>>>>>>>>>>>>> as xen0.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If it works after that, the problem lies in the pm_extras installation.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> That's all.
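Copying the plugin while keeping its attributes, as suggested above, can be done with `cp -p`, which preserves mode and timestamps (and ownership where permitted). A minimal sketch; `copy_plugin` is a hypothetical helper name and the directories in the usage note are placeholders:

```shell
# Hypothetical helper: copy a plugin file into a target directory while
# preserving its mode and timestamps (and ownership where permitted).
copy_plugin() {
    src="$1"
    dest_dir="$2"
    cp -p "$src" "$dest_dir/"
}
```

For instance, `copy_plugin /path/to/stonith-helper /path/to/dir-containing-xen0`, followed by `ls -l` in the destination to confirm the executable bit survived.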
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems
>>>>>>> <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>>>>>>>> To: 山内英生
>>>>>>> <renayama19661014@ybb.ne.jp>;
>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Date: 2015/3/17, Tue 10:31
>>>>>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp]
>>>>>>> スプリットブレイン時のSTONITHエラーについて
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Good morning, this is Fukuda.
>>>>>>>>>>>>>>>>>> Thank you for the crm example.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I adapted it to my environment right away.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> $ cat test.crm
>>>>>>>>>>>>>>>>>> ### Cluster Option ###
>>>>>>>>>>>>>>>>>> property \
>>>>>>>>>>>>>>>>>>     no-quorum-policy="ignore" \
>>>>>>>>>>>>>>>>>>     stonith-enabled="true" \
>>>>>>>>>>>>>>>>>>     startup-fencing="false" \
>>>>>>>>>>>>>>>>>>     stonith-timeout="710s" \
>>>>>>>>>>>>>>>>>>     crmd-transition-delay="2s"
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ### Resource Default ###
>>>>>>>>>>>>>>>>>> rsc_defaults \
>>>>>>>>>>>>>>>>>>     resource-stickiness="INFINITY" \
>>>>>>>>>>>>>>>>>>     migration-threshold="1"
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ### Group Configuration ###
>>>>>>>>>>>>>>>>>> group HAvarnish \
>>>>>>>>>>>>>>>>>>     vip_208 \
>>>>>>>>>>>>>>>>>>     varnishd
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> group grpStonith1 \
>>>>>>>>>>>>>>>>>>     Stonith1-1 \
>>>>>>>>>>>>>>>>>>     Stonith1-2
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> group grpStonith2 \
>>>>>>>>>>>>>>>>>>     Stonith2-1 \
>>>>>>>>>>>>>>>>>>     Stonith2-2
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ### Clone Configuration ###
>>>>>>>>>>>>>>>>>> clone clone_ping \
>>>>>>>>>>>>>>>>>>     ping
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ### Fencing Topology ###
>>>>>>>>>>>>>>>>>> fencing_topology \
>>>>>>>>>>>>>>>>>>     lbv1.beta.com: Stonith1-1 Stonith1-2 \
>>>>>>>>>>>>>>>>>>     lbv2.beta.com: Stonith2-1 Stonith2-2
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ### Primitive Configuration ###
>>>>>>>>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
>>>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>>>         ip="192.168.17.208" \
>>>>>>>>>>>>>>>>>>         nic="eth0" \
>>>>>>>>>>>>>>>>>>         cidr_netmask="24" \
>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>     op monitor interval="5s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> primitive varnishd lsb:varnish \
>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
>>>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>>>         name="default_ping_set" \
>>>>>>>>>>>>>>>>>>         host_list="192.168.17.254" \
>>>>>>>>>>>>>>>>>>         multiplier="100" \
>>>>>>>>>>>>>>>>>>         dampen="1" \
>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com" \
>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.132 10.0.17.132" \
>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
>>>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
>>>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com" \
>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.133 10.0.17.133" \
>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> primitive Stonith2-2
>>>>>>> stonith:external/xen0 \
>>>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>>>        
>>>>>>> pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>>>>>>        
>>>>>>> hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
>>>>>>>>>>>>>>>>>>        
>>>>>>> dom0="xen0.beta.com" \
>>>>>>>>>>>>>>>>>>     op start interval="0s"
>>>>>>> timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>     op monitor
>>>>>>> interval="3600s" timeout="60s" on-fail="restart"
>>>>>>> \
>>>>>>>>>>>>>>>>>>     op stop interval="0s"
>>>>>>> timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ### Resource Location ###
>>>>>>>>>>>>>>>>>> location HA_location-1 HAvarnish \
>>>>>>>>>>>>>>>>>>     rule 200: #uname eq lbv1.beta.com \
>>>>>>>>>>>>>>>>>>     rule 100: #uname eq lbv2.beta.com
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> location HA_location-2 HAvarnish \
>>>>>>>>>>>>>>>>>>     rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> location HA_location-3 grpStonith1 \
>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv1.beta.com
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> location HA_location-4 grpStonith2 \
>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv2.beta.com
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> After loading this configuration, the messages differ from yesterday's.
>>>>>>>>>>>>>>>>>> The ping messages are gone.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>>>>>>>>> Last updated: Tue Mar 17 10:21:28 2015
>>>>>>>>>>>>>>>>>> Last change: Tue Mar 17 10:21:09 2015
>>>>>>>>>>>>>>>>>> Stack: heartbeat
>>>>>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>>>>>>>>>> 2 Nodes configured
>>>>>>>>>>>>>>>>>> 8 Resources configured
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Full list of resources:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>  Resource Group: HAvarnish
>>>>>>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>>>>>>>>>>>  Resource Group: grpStonith1
>>>>>>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>>>>>>>  Resource Group: grpStonith2
>>>>>>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
>>>>>>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Node Attributes:
>>>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Migration summary:
>>>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Failed actions:
>>>>>>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
>>>>>>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Here is the log from /var/log/ha-debug.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> There was no output on stdout or stderr.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
>>>>>>>>>>>>>>>>>> Since stonith-helper is a shell script, I had not paid much attention to how it was installed.
>>>>>>>>>>>>>>>>>> stonith-helper is located here:
>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> That is all.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Good morning, this is Yamauchi.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Just in case, here is an excerpt of an example I have on hand that uses multiple stonith devices.
>>>>>>>>>>>>>>>>>>> (In practice, please watch out for the line breaks.)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The example below is a configuration for the PM 1.1 series:
>>>>>>>>>>>>>>>>>>> for nodea, stonith is executed in the order prmStonith1-1, then prmStonith1-2;
>>>>>>>>>>>>>>>>>>> for nodeb, stonith is executed in the order prmStonith2-1, then prmStonith2-2.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The stonith devices themselves are helper and ssh.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>>>>>> ### Group Configuration ###
>>>>>>>>>>>>>>>>>>> group grpStonith1 \
>>>>>>>>>>>>>>>>>>> prmStonith1-1 \
>>>>>>>>>>>>>>>>>>> prmStonith1-2
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> group grpStonith2 \
>>>>>>>>>>>>>>>>>>> prmStonith2-1 \
>>>>>>>>>>>>>>>>>>> prmStonith2-2
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ### Fencing Topology ###
>>>>>>>>>>>>>>>>>>> fencing_topology \
>>>>>>>>>>>>>>>>>>> nodea: prmStonith1-1 prmStonith1-2 \
>>>>>>>>>>>>>>>>>>> nodeb: prmStonith2-1 prmStonith2-2
>>>>>>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>>>>>> primitive prmStonith1-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>>>>>>> params \
>>>>>>>>>>>>>>>>>>> pcmk_reboot_retries="1" \
>>>>>>>>>>>>>>>>>>> pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>>>>>>> hostlist="nodea" \
>>>>>>>>>>>>>>>>>>> dead_check_target="192.168.28.60 192.168.28.70" \
>>>>>>>>>>>>>>>>>>> standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>>>>>>>> run_online_check="yes" \
>>>>>>>>>>>>>>>>>>> op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>> op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> primitive prmStonith1-2 stonith:external/ssh \
>>>>>>>>>>>>>>>>>>> params \
>>>>>>>>>>>>>>>>>>> pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>>>>>>> hostlist="nodea" \
>>>>>>>>>>>>>>>>>>> op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>> op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>> op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> primitive prmStonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>>>>>>> params \
>>>>>>>>>>>>>>>>>>> pcmk_reboot_retries="1" \
>>>>>>>>>>>>>>>>>>> pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>>>>>>> hostlist="nodeb" \
>>>>>>>>>>>>>>>>>>> dead_check_target="192.168.28.61 192.168.28.71" \
>>>>>>>>>>>>>>>>>>> standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>>>>>>>> run_online_check="yes" \
>>>>>>>>>>>>>>>>>>> op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>> op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> primitive prmStonith2-2 stonith:external/ssh \
>>>>>>>>>>>>>>>>>>> params \
>>>>>>>>>>>>>>>>>>> pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>>>>>>> hostlist="nodeb" \
>>>>>>>>>>>>>>>>>>> op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>> op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>> op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>>>>>> location rsc_location-grpStonith1-2 grpStonith1 \
>>>>>>>>>>>>>>>>>>> rule -INFINITY: #uname eq nodea
>>>>>>>>>>>>>>>>>>> location rsc_location-grpStonith2-3 grpStonith2 \
>>>>>>>>>>>>>>>>>>> rule -INFINITY: #uname eq nodeb
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> That is all.
>>>>>>>>>>>>>>>>>>>
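The ordering described in the example above (each node's stonith devices tried one after another) can be modelled with a short Python sketch. This is only an illustration, not Pacemaker code: `parse_fencing_topology` and `fence` are hypothetical names, and the sketch assumes that the space-separated devices after a node name are tried in the listed order, with a later device used only when the earlier one does not report success.

```python
from typing import Callable, Dict, List, Optional


def parse_fencing_topology(text: str) -> Dict[str, List[str]]:
    """Parse a simple "fencing_topology" stanza into {node: [devices in order]}.

    Hypothetical helper: it only handles the "node: dev1 dev2" form
    that appears in this thread.
    """
    # Join backslash-continued lines into a single logical line.
    logical = " ".join(
        line.strip().rstrip("\\").strip() for line in text.strip().splitlines()
    )
    tokens = logical.split()
    if not tokens or tokens[0] != "fencing_topology":
        raise ValueError("not a fencing_topology stanza")
    topology: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for tok in tokens[1:]:
        if tok.endswith(":"):
            # A token ending in ":" starts the device list of a new node.
            current = tok[:-1]
            topology[current] = []
        elif current is None:
            raise ValueError(f"device {tok!r} appears before any node name")
        else:
            topology[current].append(tok)
    return topology


def fence(
    node: str,
    topology: Dict[str, List[str]],
    run_device: Callable[[str, str], bool],
) -> Optional[str]:
    """Try each device for `node` in order; return the first that succeeds, else None."""
    for device in topology.get(node, []):
        if run_device(device, node):
            return device
    return None


EXAMPLE = r"""
fencing_topology \
    nodea: prmStonith1-1 prmStonith1-2 \
    nodeb: prmStonith2-1 prmStonith2-2
"""

if __name__ == "__main__":
    topo = parse_fencing_topology(EXAMPLE)
    # Assumed scenario: the first device declines and the second one succeeds.
    used = fence("nodea", topo, lambda dev, node: dev.endswith("-2"))
    print(topo["nodea"], "->", used)
```

The callback `run_device` stands in for whatever actually executes a stonith device; keeping it as a parameter lets the ordering logic be checked without a cluster.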
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ELF Systems
>>>>>>>>>>>>>>>>>> Masamichi Fukuda
>>>>>>>>>>>>>>>>>> mail to: masamichi_fukuda@elf-systems.com
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>> Linux-ha-japan mailing list
>>>>>>>>>>>>>>>>> Linux-ha-japan@lists.sourceforge.jp
>>>>>>>>>>>>>>>>> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
>>>>>>>>>>>>>>>>>

_______________________________________________
Linux-ha-japan mailing list
Linux-ha-japan@lists.sourceforge.jp
http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
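Re: STONITH errors during split-brain [ In reply to ]
Fukuda-san,

Thank you for your work. This is Yamauchi.

Before I build the environment myself, I would like to reconfirm one thing: did you follow the build procedure that Matsushima-san summarized at the following URL?

 * https://gist.github.com/takehironet/1469bd7123f63d61f843

If anything differs, please let me know once more.

# In particular, I recall that the apt-get step for the initial build packages did not go well when I briefly tried it, so that part concerns me.


That is all.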


----- Original Message -----
> From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
> To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> Cc:
> Date: 2015/3/18, Wed 09:53
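> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>
> Fukuda-san,
>
> Thank you for your work. This is Yamauchi.
>
>> stonith -L does at least seem to list the plugins.
>>
>> # /usr/local/heartbeat/sbin/stonith -L
>
> That command should be one that is included in the Heartbeat source, so I think this means there is no problem between Heartbeat and glue.
>
> So it seems more likely that the problem lies in the Pacemaker installation.
>
> Either way, I will find time to build the environment here as well.
>
> That is all.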
>
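One way to follow up on this reasoning is to check where the external plugins actually sit on disk and whether they are executable. The Python sketch below is a hedged illustration and not part of Heartbeat, glue, or Pacemaker: `find_plugin` is a hypothetical helper, and apart from /usr/local/heartbeat/lib/stonith/plugins (the location reported earlier in this thread) the candidate directories are assumed examples that may not apply to a given build.

```python
import os
from typing import Iterable, List, Tuple

# The first entry comes from this thread; the other two are assumed examples.
CANDIDATE_PLUGIN_DIRS = [
    "/usr/local/heartbeat/lib/stonith/plugins",
    "/usr/lib/stonith/plugins",
    "/usr/lib64/stonith/plugins",
]


def find_plugin(name: str, plugin_dirs: Iterable[str]) -> List[Tuple[str, bool]]:
    """Return (path, is_executable) for every copy of plugin `name` that exists.

    `name` is the agent name as written in the configuration,
    for example "external/stonith-helper".
    """
    found = []
    for base in plugin_dirs:
        path = os.path.join(base, name)
        if os.path.isfile(path):
            found.append((path, os.access(path, os.X_OK)))
    return found


if __name__ == "__main__":
    for agent in ("external/stonith-helper", "external/xen0", "external/ssh"):
        hits = find_plugin(agent, CANDIDATE_PLUGIN_DIRS)
        if not hits:
            print(f"{agent}: not found in any candidate directory")
        for path, executable in hits:
            print(f"{agent}: {path} (executable: {executable})")
```

Finding the same agent under more than one prefix, or only under a prefix that the running daemons were not built with, would be consistent with the suspicion that the installation paths do not match.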
>
> ----- Original Message -----
>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> Date: 2015/3/18, Wed 09:33
>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>
>>
>> Yamauchi-san,
>>
>> Thank you for your work. This is Fukuda.
>>
>>> "Reusable" refers to glue.
>>
>> Understood. You mean Cluster-glue.
>>
>>> This is as expected: I believe the agents under external are not being recognized, so they are not started.
>>
>> stonith -L does at least seem to list the plugins.
>>
>> # /usr/local/heartbeat/sbin/stonith -L
>> apcmaster
>> apcsmart
>> baytech
>> cyclades
>> external/drac5
>> external/dracmc-telnet
>> external/hetzner
>> external/hmchttp
>> external/ibmrsa
>> external/ibmrsa-telnet
>> external/ipmi
>> external/ippower9258
>> external/kdumpcheck
>> external/libvirt
>> external/nut
>> external/rackpdu
>> external/riloe
>> external/ssh
>> external/stonith-helper
>> external/vcenter
>> external/vmware
>> external/xen0
>> external/xen0-ha
>> ibmhmc
>> meatware
>> null
>> nw_rpc100s
>> rcd_serial
>> rps10
>> ssh
>> suicide
>> wti_nps
>>
>>
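>>> # I don't think I can say anything definite until I find time to set up a Debian 7 environment based on Matsushima-san's information.
>>
>> I'm sorry to trouble you when you are busy.
>> I will review my installation as well.
>>
>> Thank you in advance.
>>
>> That is all.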
>>
>>
>>
>>
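>> 2015-03-18 9:02 <renayama19661014@ybb.ne.jp>:
>>
>> Fukuda-san,
>>>
>>> Good morning, this is Yamauchi.
>>>
>>> My wording was poor.
>>> "Reusable" refers to glue.
>>>
>>> There may be a problem with the Pacemaker installation, but I cannot tell at this point.
>>>
>>>
>>>> I removed stonith-helper and started with only external/ssh,
>>>> but the state shown by crm_mon did not change.
>>>
>>>
>>> This is as expected: I believe the agents under external are not being recognized, so they are not started.
>>>
>>> # I don't think I can say anything definite until I find time to set up a Debian 7 environment based on Matsushima-san's information.
>>>
>>> That is all.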
>>>
>>>
>>> ----- Original Message -----
>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> Date: 2015/3/18, Wed 08:12
>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>
>>>>
>>>> Yamauchi-san,
>>>>
>>>> Good morning, this is Fukuda.
>>>>
>>>>> That probably means that neither xen0 nor stonith-helper is in the path managed as Pacemaker's stonith plugins.
>>>>>
>>>>> I suspect the installation of Reusable and pm_extras.
>>>>
>>>> Is there a problem with the Pacemaker installation?
>>>> Also, does Reusable need to be installed separately?
>>>>
>>>> I removed stonith-helper and started with only external/ssh,
>>>> but the state shown by crm_mon did not change.
>>>>
>>>> Last updated: Wed Mar 18 08:07:42 2015
>>>> Last change: Wed Mar 18 08:04:48 2015
>>>> Stack: heartbeat
>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>>>> Version: 1.1.12-e32080b
>>>> 2 Nodes configured
>>>> 6 Resources configured
>>>>
>>>>
>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>
>>>> Full list of resources:
>>>>
>>>> Stonith1-2      (stonith:external/ssh): Stopped
>>>> Stonith2-2      (stonith:external/ssh): Stopped
>>>>  Resource Group: HAvarnish
>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>  Clone Set: clone_ping [ping]
>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>
>>>> Node Attributes:
>>>> * Node lbv1.beta.com:
>>>>     + default_ping_set                  : 100
>>>> * Node lbv2.beta.com:
>>>>     + default_ping_set                  : 100
>>>>
>>>> Migration summary:
>>>> * Node lbv2.beta.com:
>>>>    Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:07:32 2015'
>>>> * Node lbv1.beta.com:
>>>>    Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:05:53 2015'
>>>>
>>>> Failed actions:
>>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:07:30 2015', queued=0ms, exec=1061ms
>>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:05:51 2015', queued=0ms, exec=1062ms
>>>>
>>>> Thank you in advance.
>>>>
>>>> That is all.
>>>>
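Because crm_mon wraps long lines at the terminal width, the "Failed actions" entries in this thread are easier to compare between runs once each entry is on one line and split into fields. The Python sketch below shows one way to do that. `parse_failed_action` is a hypothetical helper, and the regular expression assumes the line layout seen in the 1.1.12 output quoted here; other Pacemaker versions may print these lines differently.

```python
import re
from typing import Dict, Optional

# Layout assumed from this thread:
#   <resource>_<operation>_<interval> on <node> '<reason>' (<rc>): key=value, ...
FAILED_ACTION_RE = re.compile(
    r"^\s*(?P<resource>\S+)_(?P<operation>[a-z]+)_(?P<interval>\d+)"
    r" on (?P<node>\S+) '(?P<reason>[^']*)' \((?P<rc>\d+)\): (?P<details>.*)$"
)

# Matches key=value pairs where the value is either quoted or a bare token.
DETAIL_RE = re.compile(r"([\w-]+)=(?:'([^']*)'|([^,\s]+))")


def parse_failed_action(line: str) -> Optional[Dict[str, str]]:
    """Split one already-rejoined "Failed actions" line into a dict of fields.

    Returns None when the line does not look like a failed-action entry.
    """
    match = FAILED_ACTION_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    details = record.pop("details")
    for key, quoted, bare in DETAIL_RE.findall(details):
        record[key] = quoted or bare
    return record


if __name__ == "__main__":
    sample = (
        "    Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): "
        "call=23, status=Error, exit-reason='none', "
        "last-rc-change='Wed Mar 18 08:07:30 2015', queued=0ms, exec=1061ms"
    )
    print(parse_failed_action(sample))
```

With entries in this form it is straightforward to confirm, for example, that every run in the thread fails in the start operation with return code 1 after roughly one second.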
>>>>
>>>>
>>>>
>>>>
>>>> 2015-03-17 23:51 <renayama19661014@ybb.ne.jp>:
>>>>
>>>> Fukuda-san,
>>>>>
>>>>> Good evening, this is Yamauchi.
>>>>>
>>>>> That probably means that neither xen0 nor stonith-helper is in the path managed as Pacemaker's stonith plugins.
>>>>>
>>>>> I suspect the installation of Reusable and pm_extras.
>>>>>
>>>>> I will let you know if I find out anything else.
>>>>>
>>>>> That is all.
>>>>>
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>> Date: 2015/3/17, Tue 23:46
>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>
>>>>>>
>>>>>> Yamauchi-san,
>>>>>>
>>>>>> Good evening, this is Fukuda.
>>>>>>
>>>>>> I wonder whether I am specifying -x for stonith-helper the wrong way.
>>>>>>
>>>>>> I removed stonith-helper and started with only xen0.
>>>>>> # crm_mon -rfA
>>>>>>
>>>>>> Last updated: Tue Mar 17 23:38:53 2015
>>>>>> Last change: Tue Mar 17 23:30:34 2015
>>>>>> Stack: heartbeat
>>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>>>>>> Version: 1.1.12-e32080b
>>>>>> 2 Nodes configured
>>>>>> 6 Resources configured
>>>>>>
>>>>>>
>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>
>>>>>> Full list of resources:
>>>>>>
>>>>>> Stonith1-2      (stonith:external/xen0):        Stopped
>>>>>> Stonith2-2      (stonith:external/xen0):        Stopped
>>>>>>  Resource Group: HAvarnish
>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>  Clone Set: clone_ping [ping]
>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>
>>>>>> Node Attributes:
>>>>>> * Node lbv1.beta.com:
>>>>>>     + default_ping_set                  : 100
>>>>>> * Node lbv2.beta.com:
>>>>>>     + default_ping_set                  : 100
>>>>>>
>>>>>> Migration summary:
>>>>>> * Node lbv1.beta.com:
>>>>>>    Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:34 2015'
>>>>>> * Node lbv2.beta.com:
>>>>>>    Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:27 2015'
>>>>>>
>>>>>> Failed actions:
>>>>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32 2015', queued=0ms, exec=1061ms
>>>>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25 2015', queued=0ms, exec=1342ms
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> The same failed actions appear as when stonith-helper was present.
>>>>>>
>>>>>>
>>>>>> Thank you in advance.
>>>>>>
>>>>>> That is all.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2015-03-17 22:38 <renayama19661014@ybb.ne.jp>:
>>>>>>
>>>>>> Fukuda-san,
>>>>>>>
>>>>>>> Good evening, this is Yamauchi.
>>>>>>>
>>>>>>> By the way, if possible, removing external/stonith-helper and checking what happens
>>>>>>> with only external/xen0 may help isolate the problem.
>>>>>>>
>>>>>>> That is all.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>>
>>>>>>>> From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
>>>>>>>> To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>> Cc:
>>>>>>>> Date: 2015/3/17, Tue 22:28
>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>
>>>>>>>> Fukuda-san,
>>>>>>>>
>>>>>>>> Good evening, this is Yamauchi.
>>>>>>>>
>>>>>>>> It looks like nothing has changed...
>>>>>>>>
>>>>>>>> For now, around tomorrow, I will check on RHEL whether stonith-helper works with the combination of
>>>>>>>>
>>>>>>>> Heartbeat 3.0.6
>>>>>>>> the latest Pacemaker
>>>>>>>>
>>>>>>>> and a similar configuration (the resources will be Dummy, and external/xen0 will be replaced by external/ssh).
>>>>>>>>
>>>>>>>> # If we could see the output of stonith-helper with -x specified, it would be easier to narrow down the problem...
>>>>>>>>
>>>>>>>>
>>>>>>>> That is all.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>> Date: 2015/3/17, Tue 21:24
>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yamauchi-san,
>>>>>>>>>
>>>>>>>>> Good evening, this is Fukuda.
>>>>>>>>> Thank you for the information on the latest version.
>>>>>>>>>
>>>>>>>>> I installed it right away.
>>>>>>>>>
>>>>>>>>> Here is the state after startup.
>>>>>>>>>
>>>>>>>>> The failed actions appear unchanged.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> # crm_mon -rfA
>>>>>>>>> Last updated: Tue Mar 17 21:03:49 2015
>>>>>>>>> Last change: Tue Mar 17 20:30:58 2015
>>>>>>>>> Stack: heartbeat
>>>>>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>>>>>>>>> Version: 1.1.12-e32080b
>>>>>>>>> 2 Nodes configured
>>>>>>>>> 8 Resources configured
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>
>>>>>>>>> Full list of resources:
>>>>>>>>>
>>>>>>>>>  Resource Group: HAvarnish
>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>>  Resource Group: grpStonith1
>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>>  Resource Group: grpStonith2
>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>>  Clone Set: clone_ping [ping]
>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>
>>>>>>>>> Node Attributes:
>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>
>>>>>>>>> Migration summary:
>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
>>>>>>>>>
>>>>>>>>> Failed actions:
>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Here is the log.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> # less /var/log/ha-debug
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
>>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
>>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
>>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist=
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109  gid 113 (pid 4250)
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109  gid 113 (pid 4246)
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109  gid 113 (pid 4249)
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109  gid 113 (pid 4245)
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0  gid 0 (pid 4248)
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 4247)
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname: lbv1.beta.com
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client ccm is set to 1024
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client attrd is set to 1024
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client stonith-ng is set to 1024
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status active
>>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client cib is set to 1024
>>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [15:17]
>>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [19:21]
>>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client crmd is set to 1024
>>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [24:26]
>>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [26:28]
>>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat:
> [4236]: WARN: 1 lost packet(s) for
>>>>>>>> [lbv2.beta.com] [30:32]
>>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat:
> [4236]: info: No pkts missing from
>>>>>>>> lbv2.beta.com!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> # less /var/log/error
>>>>>>>>>
>>>>>>>>> Mar 17 21:02:47 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>>>>>>> Mar 17 21:02:48 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>>>>>>>>>
>>>>>>>>> # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
>>>>>>>>> Mar 17 21:03:24 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
>>>>>>>>> Mar 17 21:03:27 lbv1 crmd[4250]:   notice: run_graph: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
>>>>>>>>> Mar 17 21:03:29 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
>>>>>>>>> Mar 17 21:03:34 lbv1 crmd[4250]:   notice: run_graph: Transition 1 (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
>>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
>>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:   notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
>>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
>>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed:  2 ]
>>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>>>>>>>>> Mar 17 21:03:40 lbv1 crmd[4250]:   notice: run_graph: Transition 2 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
>>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
>>>>>>>>> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>>>>>>> Mar 17 21:03:47 lbv1 crmd[4250]:   notice: run_graph: Transition 3 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
>>>>>>>>>
>>>>>>>>> Thank you in advance.
>>>>>>>>>
>>>>>>>>> That is all.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On March 17, 2015 at 18:31, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>>>
>>>>>>>>> Fukuda-san,
>>>>>>>>>>
>>>>>>>>>> Good evening, this is Yamauchi.
>>>>>>>>>>
>>>>>>>>>> No tag has been created for it, so the latest version as of today is this commit:
>>>>>>>>>>
>>>>>>>>>>  * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
>>>>>>>>>>
>>>>>>>>>> You can download it with the [Download ZIP] button on the right-hand side of that page.
>>>>>>>>>>
>>>>>>>>>> That is all.
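
For readers who prefer a command line to the [Download ZIP] button, here is a minimal sketch of fetching the same snapshot. It assumes bash and curl are available and that GitHub's `/archive/<commit>.zip` URL pattern applies to this repository. The function names `archive_url` and `fetch_snapshot` are illustrative and do not come from the thread.

```shell
#!/bin/bash
# Repository and commit quoted in the mail above.
REPO_URL="https://github.com/ClusterLabs/pacemaker"
COMMIT="e32080b460f81486b85d08ec958582b3e72d858c"

# archive_url REPO COMMIT: print the ZIP snapshot URL for one commit.
# Assumption: GitHub serves snapshots at <repo>/archive/<commit>.zip.
archive_url() {
    printf '%s/archive/%s.zip\n' "$1" "$2"
}

# fetch_snapshot REPO COMMIT: download the snapshot into the current directory.
# Needs network access, so it is defined here but not called.
fetch_snapshot() {
    curl -fL -o "pacemaker-${2:0:7}.zip" "$(archive_url "$1" "$2")"
}

# Show the URL that fetch_snapshot would download.
archive_url "$REPO_URL" "$COMMIT"
```

Calling `fetch_snapshot "$REPO_URL" "$COMMIT"` on a machine with network access would then save the archive under a name built from the first seven characters of the commit ID.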
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>> To: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>> Date: 2015/3/17, Tue 18:07
>>>>>>>>>>> Subject: STONITH errors during split-brain
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>
>>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>>
>>>>>>>>>>> I looked at the tags page:
>>>>>>>>>>> https://github.com/ClusterLabs/pacemaker/tags
>>>>>>>>>>>
>>>>>>>>>>> Pacemaker 1.1.12 (561c4cf) appears to be the latest version listed there.
>>>>>>>>>>> Could you tell me where I can find anything newer than that?
>>>>>>>>>>>
>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>
>>>>>>>>>>> That is all.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tuesday, March 17, 2015, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>
>>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, it is old.
>>>>>>>>>>>>
>>>>>>>>>>>> Pacemaker gained support for Heartbeat 3.0.6 surprisingly recently.
>>>>>>>>>>>> Please install something newer. (You will have to build it from source again, though...)
>>>>>>>>>>>>
>>>>>>>>>>>> It is available from the upstream GitHub repository:
>>>>>>>>>>>>  * https://github.com/ClusterLabs/pacemaker
>>>>>>>>>>>>
>>>>>>>>>>>> The latest master can sometimes produce errors. If that happens, I suggest stepping back through older revisions until you reach one that works.
>>>>>>>>>>>>
>>>>>>>>>>>> That is all.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>>> Date: 2015/3/17, Tue 16:06
>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In an earlier mail you advised me to install the latest versions of Heartbeat and Pacemaker.
>>>>>>>>>>>>> That is why I installed Heartbeat 3.0.6 and Pacemaker 1.1.12 this time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> heartbeat configuration: Version = "3.0.6"
>>>>>>>>>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does this mean that this Pacemaker is still too old?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry for the trouble, and thank you in advance.
>>>>>>>>>>>>>
>>>>>>>>>>>>> That is all.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On March 17, 2015 at 14:59, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Something just occurred to me. In an earlier exchange I answered as follows. Is this condition met in your environment?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 2) Heartbeat 3.0.6 + latest Pacemaker: OK
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> It seems that the latest Heartbeat, 3.0.6, has to be used in the combination as well.
>>>>>>>>>>>>>>>>>>>>  * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Judging from the crm_mon output below, your Pacemaker is version 1.1.12.
>>>>>>>>>>>>>> Combining it with Heartbeat 3.0.6 requires a considerably newer Pacemaker.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
>>>>>>>>>>>>>>> Stack: heartbeat
>>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think you probably need at least a revision that includes the following change:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That is all.
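
One way to check whether a given source tree already contains that change is to ask git whether the commit is an ancestor of the revision being built. This is a rough sketch, assuming a local clone of the Pacemaker repository and a git new enough to support `-C` and `merge-base --is-ancestor`. The function names `includes_commit` and `report` are made up for illustration.

```shell
#!/bin/bash
# Commit named in the mail above as the minimum required change.
REQUIRED=f2302da063d08719d28367d8e362b8bfb0f85bf3

# includes_commit CLONE_DIR REQUIRED_COMMIT BUILD_COMMIT
# Succeeds when REQUIRED_COMMIT is an ancestor of (or equal to) BUILD_COMMIT.
# Any git failure, such as an unknown commit, also counts as "not included".
includes_commit() {
    git -C "$1" merge-base --is-ancestor "$2" "$3"
}

# report CLONE_DIR REQUIRED_COMMIT BUILD_COMMIT: print a one-line verdict.
report() {
    if includes_commit "$1" "$2" "$3"; then
        echo "$3 includes $2"
    else
        echo "$3 does not include $2 (or the check failed)"
    fi
}

# Example call, left commented out because it needs a local clone:
# report ./pacemaker "$REQUIRED" HEAD
```

The sketch says nothing about whether any particular revision mentioned in this thread passes the check. That has to be run against a real clone.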
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>>>>> Date: 2015/3/17, Tue 14:38
>>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Do you mean that I should add -x to the shebang line of stonith-helper?
>>>>>>>>>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The crm_mon output looks the same as before.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
>>>>>>>>>>>>>>> Stack: heartbeat
>>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>>>>>>> 2 Nodes configured
>>>>>>>>>>>>>>> 8 Resources configured
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Full list of resources:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  Resource Group: HAvarnish
>>>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>>>>>>>>  Resource Group: grpStonith1
>>>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>>>>  Resource Group: grpStonith2
>>>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
>>>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Node Attributes:
>>>>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Migration summary:
>>>>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
>>>>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Failed actions:
>>>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
>>>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I also looked through the other logs.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is from Heartbeat startup:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # less /var/log/pm_logconv.out
>>>>>>>>>>>>>>> Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
>>>>>>>>>>>>>>> Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
>>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
>>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
>>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
>>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
>>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
>>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # less /var/log/error
>>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 crmd[13269]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> These are the syslog lines that match a grep for "stonith":
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 13266)
>>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 stonithd[13266]:   notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
>>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
>>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: setup_cib: Watching for stonith topology changes
>>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: unpack_config: On loss of CCM Quorum: Ignore
>>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
>>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
>>>>>>>>>>>>>>> Mar 17 14:12:04 lbv1 stonithd[13266]:   notice: xml_patch_version_check: Versions did not change in patch 0.5.0
>>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:   notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
>>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
>>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed:  2 ]
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That is all.
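
The log lines in this message report `failed to exec "stonith"`, which is what one might see if the `stonith` command cannot be found on the PATH that the fencing daemon runs with. That reading is an assumption and is not confirmed anywhere in the thread. The sketch below shows one way to test how a command name resolves under a given PATH. The helper name `resolve_on_path` is invented for illustration, and the demo uses a throwaway directory rather than the real installation.

```shell
#!/bin/bash
# resolve_on_path NAME PATHLIST
# Print where NAME resolves when PATH is set to PATHLIST; fail if it does not.
# The subshell keeps the caller's own PATH untouched.
resolve_on_path() {
    ( PATH="$2"; command -v "$1" )
}

# Demo with a temporary directory holding a dummy executable named "stonith".
demo_dir=$(mktemp -d)
printf '#!/bin/sh\nexit 0\n' > "$demo_dir/stonith"
chmod +x "$demo_dir/stonith"

if resolve_on_path stonith "/nonexistent-dir" >/dev/null; then
    echo "unexpected: found stonith on an empty search path"
else
    echo "not found when the directory is missing from PATH"
fi

echo "found at: $(resolve_on_path stonith "$demo_dir")"
```

On a cluster node, the same helper could be called with the PATH the daemon is believed to use, and again with the installation's own binary directory prepended, to compare the results. Where that directory is depends on how Heartbeat and cluster-glue were built, so it is not spelled out here.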
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On March 17, 2015 at 13:32, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In that case, the problem seems to be in the start of stonith-helper.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If you put
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> #!/bin/bash -x
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> on the first line of stonith-helper and then start the cluster, you may learn something.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> By the way, I would expect stonith-helper to be writing its own log somewhere as well...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> That is all.
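
To illustrate what the `-x` suggestion produces, here is a small sketch that runs a stand-in script under `bash -x` and captures the trace. The script `demo_plugin` is hypothetical and is not the real stonith-helper. Where the cluster daemons send a plugin's standard error is not stated in the thread, so the sketch simply redirects it to a file.

```shell
#!/bin/bash
# Build a throwaway plugin-style script (a stand-in, NOT the real stonith-helper).
workdir=$(mktemp -d)
cat > "$workdir/demo_plugin" <<'EOF'
#!/bin/bash
# Hypothetical plugin: accepts a "status" request and rejects anything else.
case "$1" in
    status) exit 0 ;;
    *)      exit 1 ;;
esac
EOF
chmod +x "$workdir/demo_plugin"

# Running the script through `bash -x` has the same effect as giving it a
# `#!/bin/bash -x` first line: each command is written to standard error
# before it is executed.
bash -x "$workdir/demo_plugin" status 2> "$workdir/trace.log"
rc=$?

echo "exit code: $rc"
echo "--- trace ---"
cat "$workdir/trace.log"
```

Reading the last traced command before a non-zero exit is usually the quickest way to see which check inside a shell plugin failed.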
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>>>>>>> Date: 2015/3/17, Tue 12:31
>>>>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> xen0 is in the same directory.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> # pwd
>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> # ls
>>>>>>>>>>>>>>>>> drac5          ibmrsa         kdumpcheck  riloe           vmware
>>>>>>>>>>>>>>>>> dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
>>>>>>>>>>>>>>>>> hetzner        ipmi           nut         stonith-helper  xen0-ha
>>>>>>>>>>>>>>>>> hmchttp        ippower9258    rackpdu     vcenter
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> That is all.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2015-03-17 10:53 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> There was no output on standard output or standard error.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
>>>>>>>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>>>>>>>>>>>>>> stonith-helper is located here:
>>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is xen0 in this directory as well?
>>>>>>>>>>>>>>>>>> If it is not, that is a problem. In that case, please try copying the stonith-helper file, with its attributes unchanged, into the same directory as xen0.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If it works after that, the problem lies in the pm_extras installation.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> That is all.
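
The check and copy described in this mail could look roughly like the sketch below. It uses temporary directories as stand-ins for the real plugin locations, so the paths and the helper name `same_dir_has_both` are assumptions made for the demo. The one detail taken from the mail is that the copy should keep the file's attributes, which `cp -p` does for mode and timestamps (and for ownership where permitted).

```shell
#!/bin/bash
# Temporary stand-ins for the real directories (hypothetical, for the demo only).
src_dir=$(mktemp -d)   # where stonith-helper happened to be installed
dst_dir=$(mktemp -d)   # the plugin directory that contains xen0

printf '#!/bin/bash\nexit 0\n' > "$src_dir/stonith-helper"
chmod 755 "$src_dir/stonith-helper"
printf '#!/bin/bash\nexit 0\n' > "$dst_dir/xen0"
chmod 755 "$dst_dir/xen0"

# same_dir_has_both DIR: succeed if both plugins are executable files in DIR.
same_dir_has_both() {
    [ -x "$1/xen0" ] && [ -x "$1/stonith-helper" ]
}

if same_dir_has_both "$dst_dir"; then
    echo "both plugins already share $dst_dir"
else
    # -p preserves mode and timestamps, and ownership where permitted.
    cp -p "$src_dir/stonith-helper" "$dst_dir/"
    echo "copied stonith-helper next to xen0"
fi

ls -l "$dst_dir"
```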
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>>>>>>>>>>>>>> Date: 2015/3/17, Tue 10:31
>>>>>>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Good morning, this is Fukuda.
>>>>>>>>>>>>>>>>>>> Thank you for the crm example.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I adapted it to my environment right away.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> $ cat test.crm
>>>>>>>>>>>>>>>>>>> ### Cluster Option ###
>>>>>>>>>>>>>>>>>>> property \
>>>>>>>>>>>>>>>>>>>     no-quorum-policy="ignore" \
>>>>>>>>>>>>>>>>>>>     stonith-enabled="true" \
>>>>>>>>>>>>>>>>>>>     startup-fencing="false" \
>>>>>>>>>>>>>>>>>>>     stonith-timeout="710s" \
>>>>>>>>>>>>>>>>>>>     crmd-transition-delay="2s"
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ### Resource Default ###
>>>>>>>>>>>>>>>>>>> rsc_defaults \
>>>>>>>>>>>>>>>>>>>     resource-stickiness="INFINITY" \
>>>>>>>>>>>>>>>>>>>     migration-threshold="1"
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ### Group Configuration ###
>>>>>>>>>>>>>>>>>>> group HAvarnish \
>>>>>>>>>>>>>>>>>>>     vip_208 \
>>>>>>>>>>>>>>>>>>>     varnishd
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> group grpStonith1 \
>>>>>>>>>>>>>>>>>>>     Stonith1-1 \
>>>>>>>>>>>>>>>>>>>     Stonith1-2
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> group grpStonith2 \
>>>>>>>>>>>>>>>>>>>     Stonith2-1 \
>>>>>>>>>>>>>>>>>>>     Stonith2-2
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ### Clone Configuration ###
>>>>>>>>>>>>>>>>>>> clone clone_ping \
>>>>>>>>>>>>>>>>>>>     ping
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ### Fencing Topology ###
>>>>>>>>>>>>>>>>>>> fencing_topology \
>>>>>>>>>>>>>>>>>>>     lbv1.beta.com: Stonith1-1 Stonith1-2 \
>>>>>>>>>>>>>>>>>>>     lbv2.beta.com: Stonith2-1 Stonith2-2
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ### Primitive Configuration ###
>>>>>>>>>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
>>>>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>>>>         ip="192.168.17.208" \
>>>>>>>>>>>>>>>>>>>         nic="eth0" \
>>>>>>>>>>>>>>>>>>>         cidr_netmask="24" \
>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>>     op monitor interval="5s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> primitive varnishd lsb:varnish \
>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
>>>>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>>>>         name="default_ping_set" \
>>>>>>>>>>>>>>>>>>>         host_list="192.168.17.254" \
>>>>>>>>>>>>>>>>>>>         multiplier="100" \
>>>>>>>>>>>>>>>>>>>         dampen="1" \
>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com" \
>>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.132 10.0.17.132" \
>>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
>>>>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
>>>>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com" \
>>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.133 10.0.17.133" \
>>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> primitive Stonith2-2 stonith:external/xen0 \
>>>>>>>>>>>>>>>>>>>     params \
>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
>>>>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ### Resource Location ###
>>>>>>>>>>>>>>>>>>> location HA_location-1 HAvarnish \
>>>>>>>>>>>>>>>>>>>     rule 200: #uname eq lbv1.beta.com \
>>>>>>>>>>>>>>>>>>>     rule 100: #uname eq lbv2.beta.com
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> location HA_location-2 HAvarnish \
>>>>>>>>>>>>>>>>>>>     rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> location HA_location-3 grpStonith1 \
>>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv1.beta.com
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> location HA_location-4 grpStonith2 \
>>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv2.beta.com
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> After loading this configuration, the messages differ from yesterday's.
>>>>>>>>>>>>>>>>>>> The ping messages are gone.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>>>>>>>>>> Last updated: Tue Mar 17 10:21:28 2015
>>>>>>>>>>>>>>>>>>> Last change: Tue Mar 17 10:21:09 2015
>>>>>>>>>>>>>>>>>>> Stack: heartbeat
>>>>>>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>>>>>>>>>>> 2 Nodes configured
>>>>>>>>>>>>>>>>>>> 8 Resources configured
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Full list of resources:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>  Resource Group: HAvarnish
>>>>>>>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>>>>>>>>>>>>  Resource Group: grpStonith1
>>>>>>>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>>>>>>>>  Resource Group: grpStonith2
>>>>>>>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
>>>>>>>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Node Attributes:
>>>>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Migration summary:
>>>>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Failed actions:
>>>>>>>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
>>>>>>>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This is the log from /var/log/ha-debug:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
>>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
>>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> There was no output on standard output or standard error.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
>>>>>>>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>>>>>>>>>>>>>> stonith-helper is located here:
>>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> That is all.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
> 2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>
>> Fukuda-san,
>>
>> Good morning. This is Yamauchi.
>>
>> Just in case, here is an excerpt of an example I have on hand that uses multiple stonith resources.
>> (In actual use, be careful with the line breaks.)
>>
>> The example below is a configuration for the Pacemaker 1.1 series, in which:
>> for nodea, stonith is executed in the order prmStonith1-1, then prmStonith1-2;
>> for nodeb, stonith is executed in the order prmStonith2-1, then prmStonith2-2.
>>
>> The stonith plugins themselves are helper and ssh.
>>
>> (snip)
>> ### Group Configuration ###
>> group grpStonith1 \
>>         prmStonith1-1 \
>>         prmStonith1-2
>>
>> group grpStonith2 \
>>         prmStonith2-1 \
>>         prmStonith2-2
>>
>> ### Fencing Topology ###
>> fencing_topology \
>>         nodea: prmStonith1-1 prmStonith1-2 \
>>         nodeb: prmStonith2-1 prmStonith2-2
>> (snip)
>> primitive prmStonith1-1 stonith:external/stonith-helper \
>>         params \
>>                 pcmk_reboot_retries="1" \
>>                 pcmk_reboot_timeout="40s" \
>>                 hostlist="nodea" \
>>                 dead_check_target="192.168.28.60 192.168.28.70" \
>>                 standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>                 run_online_check="yes" \
>>         op start interval="0s" timeout="60s" on-fail="restart" \
>>         op stop interval="0s" timeout="60s" on-fail="ignore"
>>
>> primitive prmStonith1-2 stonith:external/ssh \
>>         params \
>>                 pcmk_reboot_timeout="60s" \
>>                 hostlist="nodea" \
>>         op start interval="0s" timeout="60s" on-fail="restart" \
>>         op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>         op stop interval="0s" timeout="60s" on-fail="ignore"
>>
>> primitive prmStonith2-1 stonith:external/stonith-helper \
>>         params \
>>                 pcmk_reboot_retries="1" \
>>                 pcmk_reboot_timeout="40s" \
>>                 hostlist="nodeb" \
>>                 dead_check_target="192.168.28.61 192.168.28.71" \
>>                 standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>                 run_online_check="yes" \
>>         op start interval="0s" timeout="60s" on-fail="restart" \
>>         op stop interval="0s" timeout="60s" on-fail="ignore"
>>
>> primitive prmStonith2-2 stonith:external/ssh \
>>         params \
>>                 pcmk_reboot_timeout="60s" \
>>                 hostlist="nodeb" \
>>         op start interval="0s" timeout="60s" on-fail="restart" \
>>         op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>         op stop interval="0s" timeout="60s" on-fail="ignore"
>> (snip)
>> location rsc_location-grpStonith1-2 grpStonith1 \
>>         rule -INFINITY: #uname eq nodea
>> location rsc_location-grpStonith2-3 grpStonith2 \
>>         rule -INFINITY: #uname eq nodeb
>>
>> That is all.
>>
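
The `standby_check_command` value in the example configuration is an ordinary shell pipeline, so what it reports is the exit status of the final `grep -qi`. The following is a minimal sketch of that behaviour only. `sample_locate_output` and `runs_on_host` are hypothetical helpers, and the sample output line is an assumption standing in for what `crm_resource -r prmRES -W` would print on a live cluster. How stonith-helper acts on the exit status is not shown in this thread.

```shell
# Hypothetical stand-in for `crm_resource -r prmRES -W`; the real command
# queries the cluster. The sample line is illustrative only.
sample_locate_output() {
    printf 'resource prmRES is running on: %s\n' "$1"
}

# Same shape as the standby_check_command pipeline: succeed (exit 0) only
# when the output mentions the given host name, ignoring case.
# $1 = host reported by the stand-in, $2 = host name to look for
runs_on_host() {
    sample_locate_output "$1" | grep -qi -- "$2"
}

if runs_on_host nodea NodeA; then echo "match"; else echo "no match"; fi
if runs_on_host nodeb nodea; then echo "match"; else echo "no match"; fi
```

Because `grep` treats the host name as a regular expression, a name containing dots matches slightly more loosely than a literal comparison would; adding `-F` would make the match literal.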
>
> --
> ELF Systems
> Masamichi Fukuda
> mail to: masamichi_fukuda@elf-systems.com
>
> _______________________________________________
> Linux-ha-japan mailing list
> Linux-ha-japan@lists.sourceforge.jp
> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan

_______________________________________________
Linux-ha-japan mailing list
Linux-ha-japan@lists.sourceforge.jp
http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
Re: STONITH errors during split-brain
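Yamauchi-san,

Hello, this is Fukuda.

In my environment, the following had been installed from packages,
so I first removed them with apt-get remove:

heartbeat, libheartbeat2, pacemaker, corosync, resource-agents

Also, the hacluster user and the haclient group had already been
created when the packages were installed.

So, from Matsushima-san's procedure, I carried out everything from

Preparation
apt-get install build-essential mercurial git \

onward. The rest of the steps were exactly the same.

Thank you for your help.

That is all.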

2015-03-18 10:06 <renayama19661014@ybb.ne.jp>:
>
> Fukuda-san,
>
> Hello, this is Yamauchi.
>
> Before I build an environment on my side, let me double-check: did your build follow
> Matsushima-san's procedure, summarized at the link below, exactly?
>
> * https://gist.github.com/takehironet/1469bd7123f63d61f843
>
> If there are any differences, please let me know.
>
> # In particular, the apt-get of the build packages at the very start concerns me, because I recall it did not go well when I briefly tried it.
>
>
> That is all.
>
>
> ----- Original Message -----
> > From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
> > To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> > Cc:
> > Date: 2015/3/18, Wed 09:53
> > Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >
> > Fukuda-san,
> >
> > Hello, this is Yamauchi.
> >
> >> With stonith -L, the list of plugins does at least seem to be displayed.
> >>
> >> # /usr/local/heartbeat/sbin/stonith -L
> >
> > That command should be one included in the Heartbeat source, so I think this means there is no problem as far as Heartbeat and glue are concerned.
> >
> > It therefore seems more likely that the problem lies in the Pacemaker installation.
> >
> > Either way, I will find some time and try building it here as well.
> >
> > That is all.
> >
> >
> > ----- Original Message -----
> >> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >> Date: 2015/3/18, Wed 09:33
> >> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>
> >>
> >> Yamauchi-san,
> >>
> >> Hello, this is Fukuda.
> >>
> >>> By Reusable I meant glue.
> >>
> >> Understood. You mean Cluster-glue.
> >>
> >>> This is as expected; I think the agents under external are not recognized and therefore do not start.
> >>
> >> With stonith -L, the list of plugins does at least seem to be displayed.
> >>
> >> # /usr/local/heartbeat/sbin/stonith -L
> >> apcmaster
> >> apcsmart
> >> baytech
> >> cyclades
> >> external/drac5
> >> external/dracmc-telnet
> >> external/hetzner
> >> external/hmchttp
> >> external/ibmrsa
> >> external/ibmrsa-telnet
> >> external/ipmi
> >> external/ippower9258
> >> external/kdumpcheck
> >> external/libvirt
> >> external/nut
> >> external/rackpdu
> >> external/riloe
> >> external/ssh
> >> external/stonith-helper
> >> external/vcenter
> >> external/vmware
> >> external/xen0
> >> external/xen0-ha
> >> ibmhmc
> >> meatware
> >> null
> >> nw_rpc100s
> >> rcd_serial
> >> rps10
> >> ssh
> >> suicide
> >> wti_nps
> >>
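
The listing above can also be checked mechanically rather than by eye. The following is a minimal sketch, assuming the plugin names arrive one per line on standard input as in the `stonith -L` output quoted here. `has_plugin` is a hypothetical helper, and `external/sbd` is used only as an example of a name that is absent from the quoted listing.

```shell
# Succeeds when the plugin name given as $1 appears as a whole line on stdin,
# for example in the output of `stonith -L`.
has_plugin() {
    grep -qxF -- "$1"
}

# Sample input copied from part of the listing above; on a real node the
# input would come from: /usr/local/heartbeat/sbin/stonith -L
sample_list='external/ssh
external/stonith-helper
external/xen0
ssh'

for plugin in external/stonith-helper external/xen0 external/sbd; do
    if printf '%s\n' "$sample_list" | has_plugin "$plugin"; then
        echo "$plugin: listed"
    else
        echo "$plugin: not listed"
    fi
done
```

The `-x` and `-F` options make the match a whole-line, literal comparison, so `ssh` is not mistaken for `external/ssh`. A plugin being listed only shows that the `stonith` command itself can see it, which is the distinction the replies in this thread draw.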
> >>
> >>> # I do not think I can say anything definite until I find time to build a Debian 7 environment based on Matsushima-san's information.
> >>
> >> Sorry to trouble you when you are busy.
> >> I will also review my installation.
> >>
> >> Thank you for your help.
> >>
> >> That is all.
> >>
> >>
> >> 2015-03-18 9:02 <renayama19661014@ybb.ne.jp>:
> >>
> >>> Fukuda-san,
> >>>
> >>> Good morning. This is Yamauchi.
> >>>
> >>> I worded that badly.
> >>> By Reusable I meant glue.
> >>>
> >>> There may be a problem with the Pacemaker installation, but I cannot tell at this point.
> >>>
> >>>> I removed stonith-helper and started with only external/ssh, but
> >>>> the status shown by crm_mon did not change.
> >>>
> >>> This is as expected; I think the agents under external are not recognized and therefore do not start.
> >>>
> >>> # I do not think I can say anything definite until I find time to build a Debian 7 environment based on Matsushima-san's information.
> >>>
> >>> That is all.
> >>>
> >>>
> >>> ----- Original Message -----
> >>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>> Date: 2015/3/18, Wed 08:12
> >>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>
> >>>>
> >>>> Yamauchi-san,
> >>>>
> >>>> Good morning, this is Fukuda.
> >>>>
> >>>>> That suggests that neither xen0 nor stonith-helper is in the path that Pacemaker manages for its stonith plugins.
> >>>>>
> >>>>> I suspect the installation of Reusable and pm_extras.
> >>>>
> >>>> Is there a problem with the Pacemaker installation?
> >>>> Also, is Reusable something that has to be installed separately?
> >>>>
> >>>> I removed stonith-helper and started with only external/ssh, but
> >>>> the status shown by crm_mon did not change.
> >>>>
> >>>> Last updated: Wed Mar 18 08:07:42 2015
> >>>> Last change: Wed Mar 18 08:04:48 2015
> >>>> Stack: heartbeat
> >>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
> >>>> Version: 1.1.12-e32080b
> >>>> 2 Nodes configured
> >>>> 6 Resources configured
> >>>>
> >>>>
> >>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>
> >>>> Full list of resources:
> >>>>
> >>>>  Stonith1-2 (stonith:external/ssh): Stopped
> >>>>  Stonith2-2 (stonith:external/ssh): Stopped
> >>>>  Resource Group: HAvarnish
> >>>>      vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> >>>>      varnishd (lsb:varnish): Started lbv1.beta.com
> >>>>  Clone Set: clone_ping [ping]
> >>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>
> >>>> Node Attributes:
> >>>> * Node lbv1.beta.com:
> >>>>     + default_ping_set : 100
> >>>> * Node lbv2.beta.com:
> >>>>     + default_ping_set : 100
> >>>>
> >>>> Migration summary:
> >>>> * Node lbv2.beta.com:
> >>>>    Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:07:32 2015'
> >>>> * Node lbv1.beta.com:
> >>>>    Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:05:53 2015'
> >>>>
> >>>> Failed actions:
> >>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:07:30 2015', queued=0ms, exec=1061ms
> >>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:05:51 2015', queued=0ms, exec=1062ms
> >>>>
> >>>> Thank you for your help.
> >>>>
> >>>> That is all.
> >>>>
> >>>>
> >>>> 2015-03-17 23:51 <renayama19661014@ybb.ne.jp>:
> >>>>
> >>>>> Fukuda-san,
> >>>>>
> >>>>> Good evening, this is Yamauchi.
> >>>>>
> >>>>> That suggests that neither xen0 nor stonith-helper is in the path that Pacemaker manages for its stonith plugins.
> >>>>>
> >>>>> I suspect the installation of Reusable and pm_extras.
> >>>>>
> >>>>> I will let you know if I find out anything more.
> >>>>>
> >>>>> That is all.
> >>>>>
> >>>>>
> >>>>> ----- Original Message -----
> >>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>> Date: 2015/3/17, Tue 23:46
> >>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>
> >>>>>>
> >>>>>> Yamauchi-san,
> >>>>>>
> >>>>>> Good evening, this is Fukuda.
> >>>>>>
> >>>>>> I wonder whether I am doing something wrong with the -x option for stonith-helper.
> >>>>>>
> >>>>>> I removed stonith-helper and started with only xen0.
> >>>>>>
> >>>>>> # crm_mon -rfA
> >>>>>>
> >>>>>> Last updated: Tue Mar 17 23:38:53 2015
> >>>>>> Last change: Tue Mar 17 23:30:34 2015
> >>>>>> Stack: heartbeat
> >>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
> >>>>>> Version: 1.1.12-e32080b
> >>>>>> 2 Nodes configured
> >>>>>> 6 Resources configured
> >>>>>>
> >>>>>>
> >>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>
> >>>>>> Full list of resources:
> >>>>>>
> >>>>>>  Stonith1-2 (stonith:external/xen0): Stopped
> >>>>>>  Stonith2-2 (stonith:external/xen0): Stopped
> >>>>>>  Resource Group: HAvarnish
> >>>>>>      vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> >>>>>>      varnishd (lsb:varnish): Started lbv1.beta.com
> >>>>>>  Clone Set: clone_ping [ping]
> >>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>
> >>>>>> Node Attributes:
> >>>>>> * Node lbv1.beta.com:
> >>>>>>     + default_ping_set : 100
> >>>>>> * Node lbv2.beta.com:
> >>>>>>     + default_ping_set : 100
> >>>>>>
> >>>>>> Migration summary:
> >>>>>> * Node lbv1.beta.com:
> >>>>>>    Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:34 2015'
> >>>>>> * Node lbv2.beta.com:
> >>>>>>    Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:27 2015'
> >>>>>>
> >>>>>> Failed actions:
> >>>>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32 2015', queued=0ms, exec=1061ms
> >>>>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25 2015', queued=0ms, exec=1342ms
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> The same failed actions appear as when stonith-helper was configured.
> >>>>>>
> >>>>>>
> >>>>>> Thank you for your help.
> >>>>>>
> >>>>>> That is all.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
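
Each crm_mon report in this thread shows `fail-count=1000000` for the stonith resources; 1000000 is the value Pacemaker uses for INFINITY. When comparing several such reports, it can help to pull the per-resource counts out of the Migration summary. The following is a minimal sketch under the assumption that the lines have the shape quoted above; `failcounts` is a hypothetical helper, not a Pacemaker tool.

```shell
# Print "<resource> <fail-count>" for every crm_mon migration-summary line on stdin.
failcounts() {
    sed -n 's/^[[:space:]]*\([^[:space:]:]\{1,\}\): migration-threshold=[0-9]\{1,\} fail-count=\([0-9]\{1,\}\).*/\1 \2/p'
}

# Sample lines copied from the Migration summary quoted above.
sample="* Node lbv1.beta.com:
   Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:34 2015'
* Node lbv2.beta.com:
   Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:27 2015'"

printf '%s\n' "$sample" | failcounts
```

The `* Node ...` header lines do not match the pattern and are skipped, so only the resource lines are printed.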
> >>>>>> 2015-03-17 22:38 <renayama19661014@ybb.ne.jp>:
> >>>>>>
> >>>>>>> Fukuda-san,
> >>>>>>>
> >>>>>>> Good evening, this is Yamauchi.
> >>>>>>>
> >>>>>>> By the way, if possible, checking what happens when you remove external/stonith-helper
> >>>>>>> and use only external/xen0 might help isolate the problem.
> >>>>>>>
> >>>>>>> That is all.
> >>>>>>>
> >>>>>>>
> >>>>>>> ----- Original Message -----
> >>>>>>>> From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
> >>>>>>>> To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>> Cc:
> >>>>>>>> Date: 2015/3/17, Tue 22:28
> >>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>>>
> >>>>>>>> Fukuda-san,
> >>>>>>>>
> >>>>>>>> Good evening, this is Yamauchi.
> >>>>>>>>
> >>>>>>>> It looks like nothing has changed...
> >>>>>>>>
> >>>>>>>> For now, around tomorrow, I will check (on RHEL, though) whether stonith-helper works with the combination of
> >>>>>>>>
> >>>>>>>> Heartbeat 3.0.6
> >>>>>>>> the latest Pacemaker
> >>>>>>>>
> >>>>>>>> and a similar configuration (the resource will be Dummy, and external/xen0 will become external/ssh).
> >>>>>>>>
> >>>>>>>> # If we could see the output of stonith-helper with -x specified, it would be a little easier to narrow down the problem...
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> That is all.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ----- Original Message -----
> >>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>> Date: 2015/3/17, Tue 21:24
> >>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Yamauchi-san,
> >>>>>>>>>
> >>>>>>>>> Good evening, this is Fukuda.
> >>>>>>>>> Thank you for the information about the latest version.
> >>>>>>>>>
> >>>>>>>>> I installed it right away.
> >>>>>>>>>
> >>>>>>>>> This is the state after startup.
> >>>>>>>>>
> >>>>>>>>> The failed actions do not seem to have changed.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> # crm_mon -rfA
> >>>>>>>>> Last updated: Tue Mar 17 21:03:49 2015
> >>>>>>>>> Last change: Tue Mar 17 20:30:58 2015
> >>>>>>>>> Stack: heartbeat
> >>>>>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
> >>>>>>>>> Version: 1.1.12-e32080b
> >>>>>>>>> 2 Nodes configured
> >>>>>>>>> 8 Resources configured
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>>
> >>>>>>>>> Full list of resources:
> >>>>>>>>>
> >>>>>>>>>  Resource Group: HAvarnish
> >>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
> >>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
> >>>>>>>>>  Resource Group: grpStonith1
> >>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
> >>>>>>>>>  Resource Group: grpStonith2
> >>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
> >>>>>>>>>  Clone Set: clone_ping [ping]
> >>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>>
> >>>>>>>>> Node Attributes:
> >>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>     + default_ping_set                  : 100
> >>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>     + default_ping_set                  : 100
> >>>>>>>>>
> >>>>>>>>> Migration summary:
> >>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
> >>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
> >>>>>>>>>
> >>>>>>>>> Failed actions:
> >>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
> >>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Here are the logs.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> # less /var/log/ha-debug
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
> >>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
> >>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist=
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109 gid 113 (pid 4250)
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109 gid 113 (pid 4246)
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109 gid 113 (pid 4249)
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109 gid 113 (pid 4245)
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0 gid 0 (pid 4248)
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0 (pid 4247)
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname: lbv1.beta.com
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client ccm is set to 1024
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client attrd is set to 1024
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client stonith-ng is set to 1024
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status active
> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client cib is set to 1024
> >>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [15:17]
> >>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [19:21]
> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client crmd is set to 1024
> >>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [24:26]
> >>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [26:28]
> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [30:32]
> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> # less /var/log/error
> >>>>>>>>>
> >>>>>>>>> Mar 17 21:02:47 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >>>>>>>>> Mar 17 21:02:48 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
> >>>>>>>>>
> >>>>>>>>> # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
> >>>>>>>>> Mar 17 21:03:24 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
> >>>>>>>>> Mar 17 21:03:27 lbv1 crmd[4250]: notice: run_graph: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
> >>>>>>>>> Mar 17 21:03:29 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
> >>>>>>>>> Mar 17 21:03:34 lbv1 crmd[4250]: notice: run_graph: Transition 1 (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ failed: 2 ]
> >>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
> >>>>>>>>> Mar 17 21:03:40 lbv1 crmd[4250]: notice: run_graph: Transition 2 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
> >>>>>>>>> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
> >>>>>>>>> Mar 17 21:03:47 lbv1 crmd[4250]: notice: run_graph: Transition 3 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
> >>>>>>>>>
> >>>>>>>>> Thank you in advance.
> >>>>>>>>>
> >>>>>>>>> That is all.
> >>>>>>>>>
> >>>>>>>>>
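The stonith-ng lines in the log above report `failed to exec "stonith"` followed by `failed: 2`. One possible reading is that the `stonith` command itself could not be found by the daemon, although the thread does not confirm this. A minimal sketch for checking where the command lives; the helper name `find_exec` and the extra directory `/usr/local/heartbeat/sbin` are assumptions for illustration, not paths taken from the thread:

```shell
#!/bin/sh
# Hypothetical helper: print the location of an executable named "$1".
# It searches PATH first, then any extra directories given as further arguments.
find_exec() {
    name="$1"; shift
    if found=$(command -v "$name" 2>/dev/null); then
        echo "$found"
        return 0
    fi
    for dir in "$@"; do
        if [ -x "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Example call; the extra directory is an assumed install location.
find_exec stonith /usr/local/heartbeat/sbin || echo "stonith not found" >&2
```

If the command turns up only in one of the extra directories, the PATH seen by the cluster daemons would be worth a look.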
> >>>>>>>>>
> >>>>>>>>> On 17 March 2015 at 18:31, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>>>>
> >>>>>>>>> Fukuda-san,
> >>>>>>>>>>
> >>>>>>>>>> Good evening, this is Yamauchi.
> >>>>>>>>>>
> >>>>>>>>>> No tag has been created for it, so the latest version as of today is:
> >>>>>>>>>>
> >>>>>>>>>> * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
> >>>>>>>>>>
> >>>>>>>>>> You can download it with the [Download ZIP] button on the right-hand side of that page.
> >>>>>>>>>>
> >>>>>>>>>> That is all.
> >>>>>>>>>>
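The untagged snapshot mentioned above can also be fetched from the command line instead of the [Download ZIP] button. A minimal sketch, assuming GitHub's usual `archive/<commit>.zip` URL layout; the helper name `pacemaker_zip_url` is hypothetical, and the download itself is left as a comment because it needs network access:

```shell
#!/bin/sh
# Hypothetical helper: print the GitHub archive URL for a given Pacemaker commit.
pacemaker_zip_url() {
    echo "https://github.com/ClusterLabs/pacemaker/archive/$1.zip"
}

# Commit hash quoted in the message above.
commit=e32080b460f81486b85d08ec958582b3e72d858c
url=$(pacemaker_zip_url "$commit")
echo "$url"

# To download (requires network access):
#   curl -L -o "pacemaker-$commit.zip" "$url"
```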
> >>>>>>>>>>
> >>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>> From: Masamichi Fukuda - elf-systems
> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>>
> >>>>>>>>>>> To:
> > "renayama19661014@ybb.ne.jp"
> >>>>>>>> <renayama19661014@ybb.ne.jp>;
> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>>> Date: 2015/3/17, Tue 18:07
> >>>>>>>>>>> Subject: STONITH errors during split-brain
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Yamauchi-san,
> >>>>>>>>>>>
> >>>>>>>>>>> Hello, this is Fukuda.
> >>>>>>>>>>>
> >>>>>>>>>>> I looked at this page:
> >>>>>>>>>>>
> >>>>>>>>>>> https://github.com/ClusterLabs/pacemaker/tags
> >>>>>>>>>>>
> >>>>>>>>>>> pacemaker 1.1.12 (561c4cf) appears to be the latest tag there.
> >>>>>>>>>>> Could you tell me where I can find versions newer than that?
> >>>>>>>>>>>
> >>>>>>>>>>> Thank you in advance.
> >>>>>>>>>>>
> >>>>>>>>>>> That is all.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Tuesday, 17 March 2015, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Fukuda-san,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hello, this is Yamauchi.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yes, it is too old.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Pacemaker gained support for Heartbeat 3.0.6 only fairly recently.
> >>>>>>>>>>>> Please install something newer. (Again, you will have to build it from source...)
> >>>>>>>>>>>>
> >>>>>>>>>>>> It is available from the upstream GitHub repository:
> >>>>>>>>>>>> * https://github.com/ClusterLabs/pacemaker
> >>>>>>>>>>>>
> >>>>>>>>>>>> The latest master sometimes produces errors; if that happens, I suggest working back through older revisions until you find one that works.
> >>>>>>>>>>>>
> >>>>>>>>>>>> That is all.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>>> From: Masamichi Fukuda -
> > elf-systems
> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>>>>> To: 山内英生
> > <renayama19661014@ybb.ne.jp>;
> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>>>>> Date: 2015/3/17, Tue 16:06
> >>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Yamauchi-san,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hello, this is Fukuda.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> In an earlier mail you advised me to install the latest versions of heartbeat and pacemaker.
> >>>>>>>>>>>>> So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> heartbeat configuration: Version = "3.0.6"
> >>>>>>>>>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Does this mean that pacemaker is still too old?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Sorry for the trouble, and thank you in advance.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> That is all.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 17 March 2015 at 14:59, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Fukuda-san,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hello, this is Yamauchi.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> It just occurred to me: in our earlier exchange I answered as follows. Is that all right on your side?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> 2) Heartbeat 3.0.6 + latest Pacemaker: OK
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> It seems that Heartbeat also has to be the latest version, 3.0.6.
> >>>>>>>>>>>>>>>>>>>> * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Judging by the crm_mon output below, your Pacemaker is 1.1.12.
> >>>>>>>>>>>>>> Combining it with Heartbeat 3.0.6 requires a considerably newer Pacemaker.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> # crm_mon -rfA
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Last updated: Tue Mar
> > 17 14:14:39 2015
> >>>>>>>>>>>>>>> Last change: Tue Mar 17
> > 14:01:43 2015
> >>>>>>>>>>>>>>> Stack: heartbeat
> >>>>>>>>>>>>>>> Current DC:
> > lbv2.beta.com
> >>>>>>>> (82ffc36f-1ad8-8686-7db0-35686465c624) - parti
> >>>>>>>>>>>>>>> tion with quorum
> >>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I think you probably need at least a version that includes the following change:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> That is all.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
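Whether a running build already contains the commit named above can be checked in two steps: read the build hash from the `Version:` line of crm_mon, then ask git whether the required commit is an ancestor of that hash. A minimal sketch, assuming a local clone of the Pacemaker repository; the helper names `pcmk_build_hash` and `build_includes_commit` are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read crm_mon output on stdin and print the build hash
# from a line such as "Version: 1.1.12-561c4cf".
pcmk_build_hash() {
    sed -n 's/^Version: [0-9.]*-\([0-9a-f]\{1,\}\).*$/\1/p'
}

# Hypothetical helper: succeed if commit "$2" is an ancestor of build "$3"
# in the git clone at "$1". Needs a full clone of the repository.
build_includes_commit() {
    git -C "$1" merge-base --is-ancestor "$2" "$3"
}

# Example with the version line quoted in this thread.
printf 'Stack: heartbeat\nVersion: 1.1.12-561c4cf\n' | pcmk_build_hash
```

With a clone at an assumed path such as `~/src/pacemaker`, `build_includes_commit ~/src/pacemaker f2302da063d08719d28367d8e362b8bfb0f85bf3 561c4cf` would exit with status 0 only if that build includes the change.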
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ----- Original Message
> > -----
> >>>>>>>>>>>>>>> From: Masamichi Fukuda
> > - elf-systems
> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>>>>>>> To: 山内英生
> > <renayama19661014@ybb.ne.jp>;
> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Date: 2015/3/17, Tue
> > 14:38
> >>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Yamauchi-san,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hello, this is Fukuda.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Do you mean I should add -x to the shebang line of stonith-helper?
> >>>>>>>>>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> crm_mon shows no change from before.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> # crm_mon -rfA
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Last updated: Tue Mar
> > 17 14:14:39 2015
> >>>>>>>>>>>>>>> Last change: Tue Mar 17
> > 14:01:43 2015
> >>>>>>>>>>>>>>> Stack: heartbeat
> >>>>>>>>>>>>>>> Current DC:
> > lbv2.beta.com
> >>>>>>>> (82ffc36f-1ad8-8686-7db0-35686465c624) - parti
> >>>>>>>>>>>>>>> tion with quorum
> >>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>>>>>>>>>> 2 Nodes configured
> >>>>>>>>>>>>>>> 8 Resources configured
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Online: [ lbv1.beta.com
> > lbv2.beta.com ]
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Full list of resources:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Resource Group:
> > HAvarnish
> >>>>>>>>>>>>>>> vip_208
> > (ocf::heartbeat:IPaddr2):
> >>>>>>>> Started lbv1.beta.com
> >>>>>>>>>>>>>>> varnishd
> > (lsb:varnish): Started
> >>>>>>>> lbv1.beta.com
> >>>>>>>>>>>>>>> Resource Group:
> > grpStonith1
> >>>>>>>>>>>>>>> Stonith1-1
> >>>>>>>> (stonith:external/stonith-helper): Stopped
> >>>>>>>>>>>>>>> Stonith1-2
> > (stonith:external/xen0):
> >>>>>>>> Stopped
> >>>>>>>>>>>>>>> Resource Group:
> > grpStonith2
> >>>>>>>>>>>>>>> Stonith2-1
> >>>>>>>> (stonith:external/stonith-helper): Stopped
> >>>>>>>>>>>>>>> Stonith2-2
> > (stonith:external/xen0):
> >>>>>>>> Stopped
> >>>>>>>>>>>>>>> Clone Set: clone_ping
> > [ping]
> >>>>>>>>>>>>>>> Started: [
> > lbv1.beta.com lbv2.beta.com ]
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Node Attributes:
> >>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>>>>>>> +
> > default_ping_set : 100
> >>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>>>>>>> +
> > default_ping_set : 100
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Migration summary:
> >>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>>>>>>>>>> Stonith1-1:
> > migration-threshold=1
> >>>>>>>> fail-count=1000000 last-failure='Tue Mar 17
> >>>>>>>>>>>>>>> 14:12:16 2015'
> >>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>>>>>>>>>> Stonith2-1:
> > migration-threshold=1
> >>>>>>>> fail-count=1000000 last-failure='Tue Mar 17
> >>>>>>>>>>>>>>> 14:12:21 2015'
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Failed actions:
> >>>>>>>>>>>>>>> Stonith1-1_start_0
> > on lbv2.beta.com 'unknown
> >>>>>>>> error' (1): call=31, st
> >>>>>>>>>>>>>>> atus=Error,
> > last-rc-change='Tue Mar 17 14:12:14
> >>>>>>>> 2015', queued=0ms, exec=1065ms
> >>>>>>>>>>>>>>> Stonith2-1_start_0
> > on lbv1.beta.com 'unknown
> >>>>>>>> error' (1): call=26, st
> >>>>>>>>>>>>>>> atus=Error,
> > last-rc-change='Tue Mar 17 14:12:19
> >>>>>>>> 2015', queued=0ms, exec=1081ms
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I also looked through the other logs.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> This is from heartbeat startup:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> # less
> > /var/log/pm_logconv.out
> >>>>>>>>>>>>>>> Mar 17 14:11:28
> > lbv1.beta.com info: Starting
> >>>>>>>> Heartbeat 3.0.6.
> >>>>>>>>>>>>>>> Mar 17 14:11:33
> > lbv1.beta.com info: Link
> >>>>>>>> lbv2.beta.com:eth1 is up.
> >>>>>>>>>>>>>>> Mar 17 14:11:34
> > lbv1.beta.com info: Start
> >>>>>>>> "ccm" process. (pid=13264)
> >>>>>>>>>>>>>>> Mar 17 14:11:34
> > lbv1.beta.com info: Start
> >>>>>>>> "lrmd" process. (pid=13267)
> >>>>>>>>>>>>>>> Mar 17 14:11:34
> > lbv1.beta.com info: Start
> >>>>>>>> "attrd" process. (pid=13268)
> >>>>>>>>>>>>>>> Mar 17 14:11:34
> > lbv1.beta.com info: Start
> >>>>>>>> "stonithd" process. (pid=13266)
> >>>>>>>>>>>>>>> Mar 17 14:11:34
> > lbv1.beta.com info: Start
> >>>>>>>> "cib" process. (pid=13265)
> >>>>>>>>>>>>>>> Mar 17 14:11:34
> > lbv1.beta.com info: Start
> >>>>>>>> "crmd" process. (pid=13269)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> # less /var/log/error
> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
> > crmd[13269]: error:
> >>>>>>>> process_lrm_event: Operation Stonith2-1_start_0
> > (node=lbv1.beta.com, call=26,
> >>>>>>>> status=4, cib-update=19, confirmed=true) Error
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The following is syslog filtered with grep for stonith:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1
> > heartbeat: [13255]: info:
> >>>>>>>> Starting child client
> >>>>>>>>
> > "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1
> > heartbeat: [13266]: info:
> >>>>>>>> Starting
> > "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0
> >>>>>>>> gid 0 (pid 13266)
> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1
> > stonithd[13266]: notice:
> >>>>>>>> crm_cluster_connect: Connecting to cluster
> > infrastructure: heartbeat
> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1
> > heartbeat: [13255]: info: the
> >>>>>>>> send queue length from heartbeat to client stonithd
> > is set to 1024
> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1
> > stonithd[13266]: notice:
> >>>>>>>> setup_cib: Watching for stonith topology changes
> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1
> > stonithd[13266]: notice:
> >>>>>>>> unpack_config: On loss of CCM Quorum: Ignore
> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1
> > stonithd[13266]: warning:
> >>>>>>>> handle_startup_fencing: Blind faith: not fencing
> > unseen nodes
> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1
> > stonithd[13266]: warning:
> >>>>>>>> handle_startup_fencing: Blind faith: not fencing
> > unseen nodes
> >>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1
> > stonithd[13266]: notice:
> >>>>>>>> stonith_device_register: Added 'Stonith2-1'
> > to the device list (1 active
> >>>>>>>> devices)
> >>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1
> > stonithd[13266]: notice:
> >>>>>>>> stonith_device_register: Added 'Stonith2-2'
> > to the device list (2 active
> >>>>>>>> devices)
> >>>>>>>>>>>>>>> Mar 17 14:12:04 lbv1
> > stonithd[13266]: notice:
> >>>>>>>> xml_patch_version_check: Versions did not change in
> > patch 0.5.0
> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
> > stonithd[13266]: notice:
> >>>>>>>> log_operation: Operation 'monitor' [13386]
> > for device
> >>>>>>>> 'Stonith2-1' returned: -201 (Generic
> > Pacemaker error)
> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
> > stonithd[13266]: warning:
> >>>>>>>> log_operation: Stonith2-1:13386 [ Performing:
> > stonith -t external/stonith-helper
> >>>>>>>> -S ]
> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
> > stonithd[13266]: warning:
> >>>>>>>> log_operation: Stonith2-1:13386 [ failed to exec
> > "stonith" ]
> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
> > stonithd[13266]: warning:
> >>>>>>>> log_operation: Stonith2-1:13386 [ failed: 2 ]
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thank you in advance.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> That is all.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
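The Migration summary in the message above shows fail-count=1000000 for both stonith-helper resources. Once the underlying start failure has been fixed, such counters normally have to be cleared before Pacemaker tries the resource again. A minimal sketch, assuming crmsh is installed so that `crm resource cleanup` is available; the helper names are hypothetical, and the script only prints the commands rather than running them:

```shell
#!/bin/sh
# Hypothetical helper: read `crm_mon -rfA` style output on stdin and print the
# name of every resource that has a fail-count entry in the Migration summary.
failed_resources() {
    awk '/fail-count=/ { sub(/:$/, "", $1); print $1 }'
}

# Print (not run) one cleanup command per failed resource.
print_cleanup_commands() {
    failed_resources | while read -r rsc; do
        echo "crm resource cleanup $rsc"
    done
}
```

A possible use is `crm_mon -1 -rfA | print_cleanup_commands`, reviewing the printed commands before running any of them. The sketch expects each Migration summary entry on one line, as crm_mon prints it on a terminal, rather than wrapped as in this mail.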
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 17 March 2015 at 13:32, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Fukuda-san,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hello, this is Yamauchi.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> So it looks as if the problem is in the start operation of stonith-helper.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> If you put
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> #!/bin/bash -x
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> at the top of stonith-helper and start the cluster, you may learn something.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Incidentally, I would expect stonith-helper's own log to be written somewhere as well...
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> That is all.
> >>>>>>>>>>>>>>>>
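The `-x` trace suggested above is written to standard error, so it can be hard to find when the script is started by a cluster daemon rather than from a terminal. The thread does not say where that output ends up. A minimal sketch of one way to capture it in a file instead; the helper name `trace_to_file` and the path `/tmp/stonith-helper.trace` are assumptions for illustration:

```shell
#!/bin/sh
# Hypothetical helper: run a command with shell tracing enabled and append the
# trace to the log file given as the first argument.
# The function body is a subshell, so the redirection and `set -x` do not leak
# into the caller.
trace_to_file() (
    log="$1"; shift
    exec 2>>"$log"   # `set -x` writes its trace to stderr, so send stderr to the log
    set -x
    "$@"
)

# Example: trace a harmless command into an assumed log location.
trace_to_file /tmp/stonith-helper.trace echo "hello" || true
```

The same two lines (`exec 2>>FILE` followed by `set -x`) could be placed temporarily near the top of a copy of the script being debugged, which avoids depending on how the caller handles stderr.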
> >>>>>>>>>>>>>>>> ----- Original
> > Message -----
> >>>>>>>>>>>>>>>>> From: Masamichi
> > Fukuda - elf-systems
> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>>>>>>>>> To: 山内英生
> > <renayama19661014@ybb.ne.jp>;
> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Date:
> > 2015/3/17, Tue 12:31
> >>>>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Yamauchi-san,
> >>>>>>>>>>>>>>>>> cc: Matsushima-san
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hello, this is Fukuda.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> xen0 is in the same directory.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> # pwd
> >>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> # ls
> >>>>>>>>>>>>>>>>> drac5          ibmrsa         kdumpcheck  riloe           vmware
> >>>>>>>>>>>>>>>>> dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
> >>>>>>>>>>>>>>>>> hetzner        ipmi           nut         stonith-helper  xen0-ha
> >>>>>>>>>>>>>>>>> hmchttp        ippower9258    rackpdu     vcenter
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thank you in advance.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> That is all.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 2015-03-17
> > 10:53 GMT+09:00
> >>>>>>>> <renayama19661014@ybb.ne.jp>:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Fukuda-san,
> >>>>>>>>>>>>>>>>>> cc: Matsushima-san
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hello, this is Yamauchi.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> There was nothing on standard output or standard error.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
> >>>>>>>>>>>>>>>>>>> stonith-helper is a shell script, so I did not pay much attention to how it was installed.
> >>>>>>>>>>>>>>>>>>> stonith-helper is located here:
> >>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Is xen0 also in this directory?
> >>>>>>>>>>>>>>>>>> If it is not, that is a problem. In that case, please copy the stonith-helper file, preserving its attributes, into the same directory as xen0.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> If it works after that, it means something is wrong with the pm_extras installation.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> That is all.
> >>>>>>>>>>>>>>>>>>
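The check described above (is the plugin file present in the external plugin directory, with usable attributes?) can be scripted. A minimal sketch; the helper name `check_plugin` is hypothetical, and the directory is the one quoted in this thread:

```shell
#!/bin/sh
# Hypothetical helper: report whether plugin "$2" exists in directory "$1"
# and has its execute permission set.
check_plugin() {
    dir="$1"; name="$2"
    if [ ! -e "$dir/$name" ]; then
        echo "missing"
        return 1
    fi
    if [ ! -x "$dir/$name" ]; then
        echo "not executable"
        return 1
    fi
    echo "ok"
}

# Example: compare stonith-helper with xen0 in the directory named in the thread.
plugin_dir=/usr/local/heartbeat/lib/stonith/plugins/external
for p in stonith-helper xen0; do
    printf '%s: ' "$p"
    check_plugin "$plugin_dir" "$p" || true
done
```

Running `ls -l` on both files as well would show any difference in owner or mode between them.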
> >>>>>>>>>>>>>>>>>> -----
> > Original Message -----
> >>>>>>>>>>>>>>>>>>> From:
> > Masamichi Fukuda - elf-systems
> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >>>>>>>>>>>>>>>>>>> To:
> > 山内英生
> >>>>>>>> <renayama19661014@ybb.ne.jp>;
> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Date:
> > 2015/3/17, Tue 10:31
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Yamauchi-san,
> >>>>>>>>>>>>>>>>>>> cc: Matsushima-san
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Good morning, this is Fukuda.
> >>>>>>>>>>>>>>>>>>> Thank you for the example crm configuration.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I adapted it to my environment right away.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> $ cat test.crm
> >>>>>>>>>>>>>>>>>>> ### Cluster Option ###
> >>>>>>>>>>>>>>>>>>> property \
> >>>>>>>>>>>>>>>>>>>     no-quorum-policy="ignore" \
> >>>>>>>>>>>>>>>>>>>     stonith-enabled="true" \
> >>>>>>>>>>>>>>>>>>>     startup-fencing="false" \
> >>>>>>>>>>>>>>>>>>>     stonith-timeout="710s" \
> >>>>>>>>>>>>>>>>>>>     crmd-transition-delay="2s"
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> ### Resource Default ###
> >>>>>>>>>>>>>>>>>>> rsc_defaults \
> >>>>>>>>>>>>>>>>>>>     resource-stickiness="INFINITY" \
> >>>>>>>>>>>>>>>>>>>     migration-threshold="1"
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> ### Group Configuration ###
> >>>>>>>>>>>>>>>>>>> group HAvarnish \
> >>>>>>>>>>>>>>>>>>>     vip_208 \
> >>>>>>>>>>>>>>>>>>>     varnishd
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> group grpStonith1 \
> >>>>>>>>>>>>>>>>>>>     Stonith1-1 \
> >>>>>>>>>>>>>>>>>>>     Stonith1-2
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> group grpStonith2 \
> >>>>>>>>>>>>>>>>>>>     Stonith2-1 \
> >>>>>>>>>>>>>>>>>>>     Stonith2-2
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> ### Clone Configuration ###
> >>>>>>>>>>>>>>>>>>> clone clone_ping \
> >>>>>>>>>>>>>>>>>>>     ping
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> ### Fencing Topology ###
> >>>>>>>>>>>>>>>>>>> fencing_topology \
> >>>>>>>>>>>>>>>>>>>     lbv1.beta.com: Stonith1-1 Stonith1-2 \
> >>>>>>>>>>>>>>>>>>>     lbv2.beta.com: Stonith2-1 Stonith2-2
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> ### Primitive Configuration ###
> >>>>>>>>>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
> >>>>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>>>         ip="192.168.17.208" \
> >>>>>>>>>>>>>>>>>>>         nic="eth0" \
> >>>>>>>>>>>>>>>>>>>         cidr_netmask="24" \
> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>     op monitor interval="5s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> primitive varnishd lsb:varnish \
> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
> >>>>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>>>         name="default_ping_set" \
> >>>>>>>>>>>>>>>>>>>         host_list="192.168.17.254" \
> >>>>>>>>>>>>>>>>>>>         multiplier="100" \
> >>>>>>>>>>>>>>>>>>>         dampen="1" \
> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com" \
> >>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.132 10.0.17.132" \
> >>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
> >>>>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
> >>>>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com" \
> >>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.133 10.0.17.133" \
> >>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> primitive Stonith2-2 stonith:external/xen0 \
> >>>>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
> >>>>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> ### Resource Location ###
> >>>>>>>>>>>>>>>>>>> location HA_location-1 HAvarnish \
> >>>>>>>>>>>>>>>>>>>     rule 200: #uname eq lbv1.beta.com \
> >>>>>>>>>>>>>>>>>>>     rule 100: #uname eq lbv2.beta.com
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> location HA_location-2 HAvarnish \
> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> location HA_location-3 grpStonith1 \
> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv1.beta.com
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> location HA_location-4 grpStonith2 \
> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv2.beta.com
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> After loading this configuration, the messages differ from yesterday's.
> >>>>>>>>>>>>>>>>>>> The ping messages are gone.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> #
> > crm_mon -rfA
> >>>>>>>>>>>>>>>>>>> Last
> > updated: Tue Mar 17 10:21:28
> >>>>>>>> 2015
> >>>>>>>>>>>>>>>>>>> Last
> > change: Tue Mar 17 10:21:09
> >>>>>>>> 2015
> >>>>>>>>>>>>>>>>>>> Stack:
> > heartbeat
> >>>>>>>>>>>>>>>>>>> Current
> > DC: lbv2.beta.com
> >>>>>>>> (82ffc36f-1ad8-8686-7db0-35686465c624) - parti
> >>>>>>>>>>>>>>>>>>> tion
> > with quorum
> >>>>>>>>>>>>>>>>>>>
> > Version: 1.1.12-561c4cf
> >>>>>>>>>>>>>>>>>>> 2 Nodes
> > configured
> >>>>>>>>>>>>>>>>>>> 8
> > Resources configured
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Online:
> > [ lbv1.beta.com
> >>>>>>>> lbv2.beta.com ]
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Full
> > list of resources:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> > Resource Group: HAvarnish
> >>>>>>>>>>>>>>>>>>>
> > vip_208
> >>>>>>>> (ocf::heartbeat:IPaddr2): Started
> > lbv1.beta.com
> >>>>>>>>>>>>>>>>>>>
> > varnishd (lsb:varnish):
> >>>>>>>> Started lbv1.beta.com
> >>>>>>>>>>>>>>>>>>>
> > Resource Group: grpStonith1
> >>>>>>>>>>>>>>>>>>>
> > Stonith1-1
> >>>>>>>> (stonith:external/stonith-helper): Stopped
> >>>>>>>>>>>>>>>>>>>
> > Stonith1-2
> >>>>>>>> (stonith:external/xen0): Stopped
> >>>>>>>>>>>>>>>>>>>
> > Resource Group: grpStonith2
> >>>>>>>>>>>>>>>>>>>
> > Stonith2-1
> >>>>>>>> (stonith:external/stonith-helper): Stopped
> >>>>>>>>>>>>>>>>>>>
> > Stonith2-2
> >>>>>>>> (stonith:external/xen0): Stopped
> >>>>>>>>>>>>>>>>>>> Clone
> > Set: clone_ping [ping]
> >>>>>>>>>>>>>>>>>>>
> > Started: [ lbv1.beta.com
> >>>>>>>> lbv2.beta.com ]
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Node
> > Attributes:
> >>>>>>>>>>>>>>>>>>> * Node
> > lbv1.beta.com:
> >>>>>>>>>>>>>>>>>>> +
> >>>>>>>> default_ping_set : 100
> >>>>>>>>>>>>>>>>>>> * Node
> > lbv2.beta.com:
> >>>>>>>>>>>>>>>>>>> +
> >>>>>>>> default_ping_set : 100
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> > Migration summary:
> >>>>>>>>>>>>>>>>>>> * Node
> > lbv2.beta.com:
> >>>>>>>>>>>>>>>>>>>
> > Stonith1-1: migration-threshold=1
> >>>>>>>> fail-count=1000000 last-failure='Tue Mar 17
> >>>>>>>>>>>>>>>>>>>
> > 10:21:17 2015'
> >>>>>>>>>>>>>>>>>>> * Node
> > lbv1.beta.com:
> >>>>>>>>>>>>>>>>>>>
> > Stonith2-1: migration-threshold=1
> >>>>>>>> fail-count=1000000 last-failure='Tue Mar 17
> >>>>>>>>>>>>>>>>>>>
> > 10:21:17 2015'
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Failed
> > actions:
> >>>>>>>>>>>>>>>>>>>
> > Stonith1-1_start_0 on
> >>>>>>>> lbv2.beta.com 'unknown error' (1): call=31,
> > st
> >>>>>>>>>>>>>>>>>>>
> > atus=Error, last-rc-change='Tue
> >>>>>>>> Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
> >>>>>>>>>>>>>>>>>>>
> > Stonith2-1_start_0 on
> >>>>>>>> lbv1.beta.com 'unknown error' (1): call=31,
> > st
> >>>>>>>>>>>>>>>>>>>
> > atus=Error, last-rc-change='Tue
> >>>>>>>> Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> This is the log from /var/log/ha-debug:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> > IPaddr2(vip_208)[7851]:
> >>>>>>>> 2015/03/17_10:21:22 INFO: Adding inet address
> > 192.168.17.208/24 with broadcast
> >>>>>>>> address 192.168.17.255 to device eth0
> >>>>>>>>>>>>>>>>>>>
> > IPaddr2(vip_208)[7851]:
> >>>>>>>> 2015/03/17_10:21:22 INFO: Bringing device eth0 up
> >>>>>>>>>>>>>>>>>>>
> > IPaddr2(vip_208)[7851]:
> >>>>>>>> 2015/03/17_10:21:22 INFO:
> > /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p
> >>>>>>>> /var/run/resource-agents/send_arp-192.168.17.208
> > eth0 192.168.17.208 auto
> >>>>>>>> not_used not_used
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> There was nothing on standard output or standard error.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
> >>>>>>>>>>>>>>>>>>> stonith-helper is a shell script, so I did not pay much attention to how it was installed.
> >>>>>>>>>>>>>>>>>>> stonith-helper is located here:
> >>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Thank you in advance.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> That is all.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> > 2015-03-17 9:45 GMT+09:00
> >>>>>>>> <renayama19661014@ybb.ne.jp>:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Fukuda-san,
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Good morning, this is Yamauchi.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Just in case, here is an excerpt from an example I have on hand that uses multiple stonith devices.
> >>>>>>>>>>>>>>>>>>>> (When actually using it, be careful with the line breaks.)
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> The example below is a configuration for the PM 1.1 series.
> >>>>>>>>>>>>>>>>>>>> For nodea, stonith is executed in the order prmStonith1-1, then prmStonith1-2.
> >>>>>>>>>>>>>>>>>>>> For nodeb, stonith is executed in the order prmStonith2-1, then prmStonith2-2.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> The stonith plugins themselves are helper and ssh.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>>>>>>>>> ### Group Configuration ###
> >>>>>>>>>>>>>>>>>>>> group grpStonith1 \
> >>>>>>>>>>>>>>>>>>>>     prmStonith1-1 \
> >>>>>>>>>>>>>>>>>>>>     prmStonith1-2
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> group grpStonith2 \
> >>>>>>>>>>>>>>>>>>>>     prmStonith2-1 \
> >>>>>>>>>>>>>>>>>>>>     prmStonith2-2
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> ### Fencing Topology ###
> >>>>>>>>>>>>>>>>>>>> fencing_topology \
> >>>>>>>>>>>>>>>>>>>>     nodea: prmStonith1-1 prmStonith1-2 \
> >>>>>>>>>>>>>>>>>>>>     nodeb: prmStonith2-1 prmStonith2-2
> >>>>>>>>>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>>>>>>>>> primitive prmStonith1-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>>>>>>>>         hostlist="nodea" \
> >>>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.28.60 192.168.28.70" \
> >>>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >>>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> primitive prmStonith1-2 stonith:external/ssh \
> >>>>>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>>>>>>>>         hostlist="nodea" \
> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> primitive prmStonith2-1 stonith:external/stonith-helper \
> >>>>>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>>>>>>>>>>>>>>>         hostlist="nodeb" \
> >>>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.28.61 192.168.28.71" \
> >>>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >>>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> primitive prmStonith2-2 stonith:external/ssh \
> >>>>>>>>>>>>>>>>>>>>     params \
> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>>>>>>>>>>>>>>>         hostlist="nodeb" \
> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>>>>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>>>>>>>>> location rsc_location-grpStonith1-2 grpStonith1 \
> >>>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodea
> >>>>>>>>>>>>>>>>>>>> location rsc_location-grpStonith2-3 grpStonith2 \
> >>>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodeb
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> > 以上です。
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> _______________________________________________
> Linux-ha-japan mailing list
> Linux-ha-japan@lists.sourceforge.jp
> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan




--
ELF Systems
Masamichi Fukuda
mail to: masamichi_fukuda@elf-systems.com
Re: STONITH errors during split-brain [ In reply to ]
Fukuda-san,

Hello, this is Yamauchi.

Understood.
Thank you for letting me know.

That is all.



----- Original Message -----
>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>Date: 2015/3/18, Wed 10:23
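>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>
>
>Yamauchi-san,
>
>Hello, this is Fukuda.
>
>In my environment the following had been installed from packages,
>so I first removed them with apt-get remove:
>
>heartbeat, libheartbeat2, pacemaker, corosync, resource-agents
>
>The hacluster user and the haclient group had also already been
>created when the packages were installed.
>
>So I started from this step of Matsushima-san's procedure:
>
>Preparation
>apt-get install build-essential mercurial git \
>
>and ran everything from there on. The rest of the procedure is exactly the same.
>
>Thank you in advance.
>
>That is all.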
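
The message above relies on the hacluster user and haclient group surviving the package removal. A minimal Python sketch for confirming that on a node is shown below; the function name `missing_cluster_accounts` is hypothetical, and it only reads the local account databases through the standard `pwd` and `grp` modules.

```python
import grp
import pwd


def missing_cluster_accounts(user="hacluster", group="haclient"):
    """Return descriptions of the accounts that do not exist on this host."""
    missing = []
    try:
        pwd.getpwnam(user)          # raises KeyError if the user is unknown
    except KeyError:
        missing.append(f"user {user}")
    try:
        grp.getgrnam(group)         # raises KeyError if the group is unknown
    except KeyError:
        missing.append(f"group {group}")
    return missing


if __name__ == "__main__":
    gaps = missing_cluster_accounts()
    print("missing: " + ", ".join(gaps) if gaps else "hacluster/haclient are present")
```

An empty result only shows that the accounts exist. It says nothing about whether their uid/gid match what the source build expects.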
>
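>On 18 March 2015 at 10:06, <renayama19661014@ybb.ne.jp> wrote:
>>
>> Fukuda-san,
>>
>> Hello, this is Yamauchi.
>>
>> As a double-check before I build the environment myself: did you follow
>> Matsushima-san's procedure, summarized at the link below, exactly?
>>
>>  * https://gist.github.com/takehironet/1469bd7123f63d61f843
>>
>> If there are any differences, please let me know again.
>>
>> # I am particularly concerned about the apt-get step for the initial build packages, because I recall it not working well when I briefly tried it.
>>
>>
>> That is all.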
>>
>>
>> ----- Original Message -----
>> > From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
>> > To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> > Cc:
>> > Date: 2015/3/18, Wed 09:53
>> > Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >
>> > Fukuda-san,
>> >
>> > Hello, this is Yamauchi.
>> >
>> >> stonith -L does at least list the plugins.
>> >>
>> >> # /usr/local/heartbeat/sbin/stonith -L
>> >
>> > That command should be part of the Heartbeat source, so this suggests there is no problem between Heartbeat and glue.
>> >
>> > It is therefore more likely that the problem lies in the Pacemaker installation.
>> >
>> > Either way, I will find time to build the environment here as well.
>> >
>> > That is all.
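
The check discussed above (do the needed plugins appear in the `stonith -L` listing?) can be scripted. The sketch below is an illustration only: `missing_plugins` is a hypothetical helper, and `SAMPLE` is an abbreviated copy of the listing posted in this thread, not live command output.

```python
def missing_plugins(stonith_list_output, required):
    """Return the required plugin names that are absent from `stonith -L` output."""
    available = {
        line.strip()
        for line in stonith_list_output.splitlines()
        if line.strip()
    }
    return [name for name in required if name not in available]


# Abbreviated from the plugin list quoted in this thread.
SAMPLE = """\
external/ssh
external/stonith-helper
external/xen0
ssh
suicide
"""

if __name__ == "__main__":
    needed = ["external/stonith-helper", "external/xen0", "external/ssh"]
    print(missing_plugins(SAMPLE, needed))
```

A plugin appearing in this list shows only that the `stonith` binary run by full path can see it. It does not show that Pacemaker's fencing daemon can run that binary.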
>> >
>> >
>> > ----- Original Message -----
>> >> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>> >> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> >> Date: 2015/3/18, Wed 09:33
>> >> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>
>> >>
>> >> Yamauchi-san,
>> >>
>> >> Hello, this is Fukuda.
>> >>
>> >>> By "Reusable" I meant glue.
>> >>
>> >> Understood. You mean Cluster-glue.
>> >>
>> >>> That is as expected; I think the agents under external are not being
>> >>> recognized, so the resources never start.
>> >>
>> >> stonith -L does at least list the plugins.
>> >>
>> >> # /usr/local/heartbeat/sbin/stonith -L
>> >> apcmaster
>> >> apcsmart
>> >> baytech
>> >> cyclades
>> >> external/drac5
>> >> external/dracmc-telnet
>> >> external/hetzner
>> >> external/hmchttp
>> >> external/ibmrsa
>> >> external/ibmrsa-telnet
>> >> external/ipmi
>> >> external/ippower9258
>> >> external/kdumpcheck
>> >> external/libvirt
>> >> external/nut
>> >> external/rackpdu
>> >> external/riloe
>> >> external/ssh
>> >> external/stonith-helper
>> >> external/vcenter
>> >> external/vmware
>> >> external/xen0
>> >> external/xen0-ha
>> >> ibmhmc
>> >> meatware
>> >> null
>> >> nw_rpc100s
>> >> rcd_serial
>> >> rps10
>> >> ssh
>> >> suicide
>> >> wti_nps
>> >>
>> >>
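>> >>> # I don't think I can say anything definite until I find time to build
>> >>> a Debian 7 environment based on Matsushima-san's information.
>> >>
>> >> Sorry to trouble you when you are busy.
>> >> I will also review the installation on my side.
>> >>
>> >> Thank you in advance.
>> >>
>> >> That is all.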
>> >>
>> >>
>> >>
>> >>
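>> >> On 18 March 2015 at 9:02, <renayama19661014@ybb.ne.jp> wrote:
>> >>
>> >> Fukuda-san,
>> >>>
>> >>> Good morning, this is Yamauchi.
>> >>>
>> >>> I worded that badly.
>> >>> By "Reusable" I meant glue.
>> >>>
>> >>> There may be a problem with the Pacemaker installation, but I cannot tell at this point.
>> >>>
>> >>>
>> >>>> I removed stonith-helper and started with external/ssh only,
>> >>>> but the status in crm_mon did not change.
>> >>>
>> >>>
>> >>> That is as expected; I think the agents under external are not being recognized, so the resources never start.
>> >>>
>> >>> # I don't think I can say anything definite until I find time to build a Debian 7 environment based on Matsushima-san's information.
>> >>>
>> >>> That is all.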
>> >>>
>> >>>
>> >>> ----- Original Message -----
>> >>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>> >>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> >>>> Date: 2015/3/18, Wed 08:12
>> >>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>>
>> >>>>
>> >>>> Yamauchi-san,
>> >>>>
>> >>>> Good morning, this is Fukuda.
>> >>>>
>> >>>>>  That would mean that neither xen0 nor stonith-helper is, most likely,
>> >>>>>  in the path that Pacemaker manages for its stonith plugins.
>> >>>>>
>> >>>>>  I suspect the installation of Reusable and pm_extras.
>> >>>>
>> >>>> Is there a problem with the Pacemaker installation?
>> >>>> Also, does Reusable need to be installed separately?
>> >>>>
>> >>>> I removed stonith-helper and started with external/ssh only,
>> >>>> but the status in crm_mon did not change.
>> >>>>
>> >>>> Last updated: Wed Mar 18 08:07:42 2015
>> >>>> Last change: Wed Mar 18 08:04:48 2015
>> >>>> Stack: heartbeat
>> >>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>> >>>> Version: 1.1.12-e32080b
>> >>>> 2 Nodes configured
>> >>>> 6 Resources configured
>> >>>>
>> >>>>
>> >>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>> >>>>
>> >>>> Full list of resources:
>> >>>>
>> >>>> Stonith1-2      (stonith:external/ssh): Stopped
>> >>>> Stonith2-2      (stonith:external/ssh): Stopped
>> >>>>  Resource Group: HAvarnish
>> >>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>> >>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>> >>>>  Clone Set: clone_ping [ping]
>> >>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>> >>>>
>> >>>> Node Attributes:
>> >>>> * Node lbv1.beta.com:
>> >>>>     + default_ping_set                  : 100
>> >>>> * Node lbv2.beta.com:
>> >>>>     + default_ping_set                  : 100
>> >>>>
>> >>>> Migration summary:
>> >>>> * Node lbv2.beta.com:
>> >>>>    Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:07:32 2015'
>> >>>> * Node lbv1.beta.com:
>> >>>>    Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:05:53 2015'
>> >>>>
>> >>>> Failed actions:
>> >>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:07:30 2015', queued=0ms, exec=1061ms
>> >>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:05:51 2015', queued=0ms, exec=1062ms
>> >>>>
>> >>>> Thank you in advance.
>> >>>>
>> >>>> That is all.
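
The Migration summary above shows fail-count=1000000 against migration-threshold=1. 1000000 is the value Pacemaker uses for INFINITY, so each STONITH resource has used up its retries on that node. Below is a minimal Python sketch that pulls those entries out of `crm_mon -rfA` text. The helper name `blocked_resources` and the regular expressions are assumptions made for illustration, and the parser expects each entry on one unwrapped line.

```python
import re

NODE_RE = re.compile(r"\* Node (\S+):")
FAIL_RE = re.compile(r"(\S+): migration-threshold=(\d+) fail-count=(\d+)")


def blocked_resources(crm_mon_text):
    """Return (resource, node, fail_count) where fail-count >= migration-threshold."""
    blocked = []
    node = None
    for raw in crm_mon_text.splitlines():
        line = raw.strip()
        node_match = NODE_RE.match(line)
        if node_match:
            node = node_match.group(1)
            continue
        fail_match = FAIL_RE.search(line)
        if fail_match:
            resource, threshold, count = fail_match.groups()
            if int(count) >= int(threshold):
                blocked.append((resource, node, int(count)))
    return blocked


# Taken from the Migration summary quoted above, with line wraps joined.
SAMPLE = """\
Migration summary:
* Node lbv2.beta.com:
   Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:07:32 2015'
* Node lbv1.beta.com:
   Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:05:53 2015'
"""

if __name__ == "__main__":
    for resource, node, count in blocked_resources(SAMPLE):
        print(f"{resource} on {node}: fail-count={count}")
```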
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On 17 March 2015 at 23:51, <renayama19661014@ybb.ne.jp> wrote:
>> >>>>
>> >>>> Fukuda-san,
>> >>>>>
>> >>>>> Good evening, this is Yamauchi.
>> >>>>>
>> >>>>> That would mean that neither xen0 nor stonith-helper is, most likely, in the path that Pacemaker manages for its stonith plugins.
>> >>>>>
>> >>>>> I suspect the installation of Reusable and pm_extras.
>> >>>>>
>> >>>>> I will let you know if I find out anything more.
>> >>>>>
>> >>>>> That is all.
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> ----- Original Message -----
>> >>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>> >>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> >>>>>> Date: 2015/3/17, Tue 23:46
>> >>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>>>>
>> >>>>>>
>> >>>>>> Yamauchi-san,
>> >>>>>>
>> >>>>>> Good evening, this is Fukuda.
>> >>>>>>
>> >>>>>> I wonder whether I am doing something wrong with the -x option for stonith-helper.
>> >>>>>>
>> >>>>>> I removed stonith-helper and started with xen0 only.
>> >>>>>>
>> >>>>>> # crm_mon -rfA
>> >>>>>>
>> >>>>>> Last updated: Tue Mar 17 23:38:53 2015
>> >>>>>> Last change: Tue Mar 17 23:30:34 2015
>> >>>>>> Stack: heartbeat
>> >>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>> >>>>>> Version: 1.1.12-e32080b
>> >>>>>> 2 Nodes configured
>> >>>>>> 6 Resources configured
>> >>>>>>
>> >>>>>>
>> >>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>> >>>>>>
>> >>>>>> Full list of resources:
>> >>>>>>
>> >>>>>> Stonith1-2      (stonith:external/xen0):        Stopped
>> >>>>>> Stonith2-2      (stonith:external/xen0):        Stopped
>> >>>>>>  Resource Group: HAvarnish
>> >>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>> >>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>> >>>>>>  Clone Set: clone_ping [ping]
>> >>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>> >>>>>>
>> >>>>>> Node Attributes:
>> >>>>>> * Node lbv1.beta.com:
>> >>>>>>     + default_ping_set                  : 100
>> >>>>>> * Node lbv2.beta.com:
>> >>>>>>     + default_ping_set                  : 100
>> >>>>>>
>> >>>>>> Migration summary:
>> >>>>>> * Node lbv1.beta.com:
>> >>>>>>    Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:34 2015'
>> >>>>>> * Node lbv2.beta.com:
>> >>>>>>    Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:27 2015'
>> >>>>>>
>> >>>>>> Failed actions:
>> >>>>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32 2015', queued=0ms, exec=1061ms
>> >>>>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25 2015', queued=0ms, exec=1342ms
>> >>>>>>
>> >>>>>>
>> >>>>>> The same failed actions appear as when stonith-helper was configured.
>> >>>>>>
>> >>>>>>
>> >>>>>> Thank you in advance.
>> >>>>>>
>> >>>>>> That is all.
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On 17 March 2015 at 22:38, <renayama19661014@ybb.ne.jp> wrote:
>> >>>>>>
>> >>>>>> Fukuda-san,
>> >>>>>>>
>> >>>>>>> Good evening, this is Yamauchi.
>> >>>>>>>
>> >>>>>>> By the way, if possible, checking what happens when you remove external/stonith-helper
>> >>>>>>> and use only external/xen0 might help isolate the problem.
>> >>>>>>>
>> >>>>>>> That is all.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> ----- Original Message -----
>> >>>>>>>
>> >>>>>>>>  From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
>> >>>>>>>>  To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> >>>>>>>>  Cc:
>> >>>>>>>>  Date: 2015/3/17, Tue 22:28
>> >>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>>>>>>
>> >>>>>>>>  Fukuda-san,
>> >>>>>>>>
>> >>>>>>>>  Good evening, this is Yamauchi.
>> >>>>>>>>
>> >>>>>>>>  It looks like nothing has changed...
>> >>>>>>>>
>> >>>>>>>>  For now, around tomorrow, I will check (on RHEL, though) with the combination of
>> >>>>>>>>
>> >>>>>>>>  Heartbeat3.0.6
>> >>>>>>>>  the latest Pacemaker
>> >>>>>>>>
>> >>>>>>>>  whether stonith-helper works with a similar configuration (the resource will be Dummy, and external/xen0 will be replaced by external/ssh).
>> >>>>>>>>
>> >>>>>>>>  # If we could see the output of stonith-helper with -x specified, it would be a little easier to narrow down the problem...
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>  That is all.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>  ----- Original Message -----
>> >>>>>>>>>  From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>> >>>>>>>>>  To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> >>>>>>>>>  Date: 2015/3/17, Tue 21:24
>> >>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>  Yamauchi-san,
>> >>>>>>>>>
>> >>>>>>>>>  Good evening, this is Fukuda.
>> >>>>>>>>>  Thank you for the information about the latest version.
>> >>>>>>>>>
>> >>>>>>>>>  I installed it right away.
>> >>>>>>>>>
>> >>>>>>>>>  This is the state after startup.
>> >>>>>>>>>
>> >>>>>>>>>  The failed actions appear to be unchanged.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>  # crm_mon -rfA
>> >>>>>>>>>  Last updated: Tue Mar 17 21:03:49 2015
>> >>>>>>>>>  Last change: Tue Mar 17 20:30:58 2015
>> >>>>>>>>>  Stack: heartbeat
>> >>>>>>>>>  Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>> >>>>>>>>>  Version: 1.1.12-e32080b
>> >>>>>>>>>  2 Nodes configured
>> >>>>>>>>>  8 Resources configured
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>  Online: [ lbv1.beta.com lbv2.beta.com ]
>> >>>>>>>>>
>> >>>>>>>>>  Full list of resources:
>> >>>>>>>>>
>> >>>>>>>>>   Resource Group: HAvarnish
>> >>>>>>>>>       vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>> >>>>>>>>>       varnishd   (lsb:varnish):  Started lbv1.beta.com
>> >>>>>>>>>   Resource Group: grpStonith1
>> >>>>>>>>>       Stonith1-1 (stonith:external/stonith-helper):      Stopped
>> >>>>>>>>>       Stonith1-2 (stonith:external/xen0):        Stopped
>> >>>>>>>>>   Resource Group: grpStonith2
>> >>>>>>>>>       Stonith2-1 (stonith:external/stonith-helper):      Stopped
>> >>>>>>>>>       Stonith2-2 (stonith:external/xen0):        Stopped
>> >>>>>>>>>   Clone Set: clone_ping [ping]
>> >>>>>>>>>       Started: [ lbv1.beta.com lbv2.beta.com ]
>> >>>>>>>>>
>> >>>>>>>>>  Node Attributes:
>> >>>>>>>>>  * Node lbv1.beta.com:
>> >>>>>>>>>      + default_ping_set                  : 100
>> >>>>>>>>>  * Node lbv2.beta.com:
>> >>>>>>>>>      + default_ping_set                  : 100
>> >>>>>>>>>
>> >>>>>>>>>  Migration summary:
>> >>>>>>>>>  * Node lbv1.beta.com:
>> >>>>>>>>>     Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
>> >>>>>>>>>  * Node lbv2.beta.com:
>> >>>>>>>>>     Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
>> >>>>>>>>>
>> >>>>>>>>>  Failed actions:
>> >>>>>>>>>      Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
>> >>>>>>>>>      Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>  Here are the logs.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>  # less /var/log/ha-debug
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
>> >>>>>>>>>  Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
>> >>>>>>>>>  Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist=
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109  gid 113 (pid 4250)
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109  gid 113 (pid 4246)
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109  gid 113 (pid 4249)
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109  gid 113 (pid 4245)
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0  gid 0 (pid 4248)
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 4247)
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname: lbv1.beta.com
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client ccm is set to 1024
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client attrd is set to 1024
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client stonith-ng is set to 1024
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status active
>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client cib is set to 1024
>> >>>>>>>>>  Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [15:17]
>> >>>>>>>>>  Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>> >>>>>>>>>  Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [19:21]
>> >>>>>>>>>  Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>> >>>>>>>>>  Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client crmd is set to 1024
>> >>>>>>>>>  Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [24:26]
>> >>>>>>>>>  Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>> >>>>>>>>>  Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [26:28]
>> >>>>>>>>>  Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>> >>>>>>>>>  Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [30:32]
>> >>>>>>>>>  Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>  # less /var/log/error
>> >>>>>>>>>
>> >>>>>>>>>  Mar 17 21:02:47 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>> >>>>>>>>>  Mar 17 21:02:48 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>> >>>>>>>>>  Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>> >>>>>>>>>  Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>> >>>>>>>>>  Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>> >>>>>>>>>
>> >>>>>>>>>  # cat syslog|egrep 'Mar 17 21:03|Mar 17
>> > 21:02' |egrep
>> >>>>>>>>  'heartbeat|stonith|pacemaker|error'
>> >>>>>>>>>  Mar 17 21:03:24 lbv1 pengine[4253]:   notice:
>> > process_pe_message: Calculated
>> >>>>>>>>  Transition 0:
>> > /var/lib/pacemaker/pengine/pe-input-115.bz2
>> >>>>>>>>>  Mar 17 21:03:27 lbv1 crmd[4250]:   notice:
>> > run_graph: Transition 0
>> >>>>>>>>  (Complete=15, Pending=0, Fired=0, Skipped=16,
>> > Incomplete=2,
>> >>>>>>>>
>> > Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
>> >>>>>>>>>  Mar 17 21:03:29 lbv1 pengine[4253]:   notice:
>> > process_pe_message: Calculated
>> >>>>>>>>  Transition 1:
>> > /var/lib/pacemaker/pengine/pe-input-116.bz2
>> >>>>>>>>>  Mar 17 21:03:34 lbv1 crmd[4250]:   notice:
>> > run_graph: Transition 1
>> >>>>>>>>  (Complete=8, Pending=0, Fired=0, Skipped=12,
>> > Incomplete=1,
>> >>>>>>>>
>> > Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
>> >>>>>>>>>  Mar 17 21:03:37 lbv1 pengine[4253]:  warning:
>> > unpack_rsc_op_failure:
>> >>>>>>>>  Processing failed op start for Stonith1-1 on
>> > lbv2.beta.com: unknown error (1)
>> >>>>>>>>>  Mar 17 21:03:37 lbv1 pengine[4253]:  warning:
>> > unpack_rsc_op_failure:
>> >>>>>>>>  Processing failed op start for Stonith1-1 on
>> > lbv2.beta.com: unknown error (1)
>> >>>>>>>>>  Mar 17 21:03:37 lbv1 pengine[4253]:   notice:
>> > process_pe_message: Calculated
>> >>>>>>>>  Transition 2:
>> > /var/lib/pacemaker/pengine/pe-input-117.bz2
>> >>>>>>>>>  Mar 17 21:03:39 lbv1 stonith-ng[4247]:  
>> > notice: log_operation: Operation
>> >>>>>>>>  'monitor' [4377] for device
>> > 'Stonith2-1' returned: -201 (Generic
>> >>>>>>>>  Pacemaker error)
>> >>>>>>>>>  Mar 17 21:03:39 lbv1 stonith-ng[4247]:
>> > warning: log_operation:
>> >>>>>>>>  Stonith2-1:4377 [ Performing: stonith -t
>> > external/stonith-helper -S ]
>> >>>>>>>>>  Mar 17 21:03:39 lbv1 stonith-ng[4247]:
>> > warning: log_operation:
>> >>>>>>>>  Stonith2-1:4377 [ failed to exec
>> > "stonith" ]
>> >>>>>>>>>  Mar 17 21:03:39 lbv1 stonith-ng[4247]:
>> > warning: log_operation:
>> >>>>>>>>  Stonith2-1:4377 [ failed:  2 ]
>> >>>>>>>>>  Mar 17 21:03:39 lbv1 crmd[4250]:    error:
>> > process_lrm_event: Operation
>> >>>>>>>>  Stonith2-1_start_0 (node=lbv1.beta.com, call=31,
>> > status=4, cib-update=42,
>> >>>>>>>>  confirmed=true) Error
>> >>>>>>>>>  Mar 17 21:03:40 lbv1 crmd[4250]:   notice:
>> > run_graph: Transition 2
>> >>>>>>>>  (Complete=12, Pending=0, Fired=0, Skipped=3,
>> > Incomplete=0,
>> >>>>>>>>
>> > Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
>> >>>>>>>>>  Mar 17 21:03:42 lbv1 pengine[4253]:  warning:
>> > unpack_rsc_op_failure:
>> >>>>>>>>  Processing failed op start for Stonith2-1 on
>> > lbv1.beta.com: unknown error (1)
>> >>>>>>>>>  Mar 17 21:03:42 lbv1 pengine[4253]:  warning:
>> > unpack_rsc_op_failure:
>> >>>>>>>>  Processing failed op start for Stonith2-1 on
>> > lbv1.beta.com: unknown error (1)
>> >>>>>>>>>  Mar 17 21:03:42 lbv1 pengine[4253]:  warning:
>> > unpack_rsc_op_failure:
>> >>>>>>>>  Processing failed op start for Stonith1-1 on
>> > lbv2.beta.com: unknown error (1)
>> >>>>>>>>>  Mar 17 21:03:42 lbv1 pengine[4253]:   notice:
>> > process_pe_message: Calculated
>> >>>>>>>>  Transition 3:
>> > /var/lib/pacemaker/pengine/pe-input-118.bz2
>> >>>>>>>>>  Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]:
>> > INFO:
>> >>>>>>>>  /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p
>> >>>>>>>>  /var/run/resource-agents/send_arp-192.168.17.208
>> > eth0 192.168.17.208 auto
>> >>>>>>>>  not_used not_used
>> >>>>>>>>>  Mar 17 21:03:47 lbv1 crmd[4250]:   notice:
>> > run_graph: Transition 3
>> >>>>>>>>  (Complete=10, Pending=0, Fired=0, Skipped=0,
>> > Incomplete=0,
>> >>>>>>>>
>> > Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
>> >>>>>>>>>
>> >>>>>>>>>  Thank you in advance.
>> >>>>>>>>>
>> >>>>>>>>>  That is all.
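
The log lines above report `failed to exec "stonith"` while running `stonith -t external/stonith-helper -S`. One possible reading is that the daemon could not find the `stonith` binary on its search path. The following is a minimal sketch for checking that. Both PATH values and the `/usr/local/heartbeat/sbin` directory are assumptions chosen for illustration; they are not confirmed anywhere in this thread.

```python
import os
import shutil


def find_executable(name, search_path):
    """Return the full path of `name` if it is executable somewhere on
    `search_path` (an os.pathsep-separated string), otherwise None."""
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    # Hypothetical search paths: the real daemon environment may differ.
    daemon_path = "/usr/sbin:/usr/bin:/sbin:/bin"
    extended_path = "/usr/local/heartbeat/sbin" + os.pathsep + daemon_path
    for label, path in (("daemon PATH", daemon_path),
                        ("extended PATH", extended_path)):
        print(label, "->", find_executable("stonith", path))
```

If the binary is found only on the extended path, the daemon's environment would be the next thing to look at.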
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>  On 2015-03-17 at 18:31,
>> > <renayama19661014@ybb.ne.jp> wrote:
>> >>>>>>>>>
>> >>>>>>>>>  Fukuda-san,
>> >>>>>>>>>>
>> >>>>>>>>>>  Good evening, this is Yamauchi.
>> >>>>>>>>>>
>> >>>>>>>>>>  No release tag has been created yet, so the latest version as of today is this commit:
>> >>>>>>>>>>
>> >>>>>>>>>>   *
>> >>>>>>>>
>> > https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>  You can download it with the [Download ZIP] button on the right-hand side of that page.
>> >>>>>>>>>>
>> >>>>>>>>>>  That is all.
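
On a build host without a browser, the same snapshot can usually be fetched from a direct archive URL. The sketch below only builds that URL. It assumes GitHub's usual `/archive/<ref>.zip` pattern, and the helper name `github_archive_url` is made up for this example.

```python
def github_archive_url(owner, repo, ref, fmt="zip"):
    """Build a source-archive URL for a commit, branch, or tag.

    Assumes the https://github.com/<owner>/<repo>/archive/<ref>.<fmt>
    layout; verify the URL before relying on it in a build script.
    """
    if fmt not in ("zip", "tar.gz"):
        raise ValueError("fmt must be 'zip' or 'tar.gz'")
    return "https://github.com/{}/{}/archive/{}.{}".format(owner, repo, ref, fmt)


if __name__ == "__main__":
    # Commit hash taken from the message above.
    print(github_archive_url(
        "ClusterLabs", "pacemaker",
        "e32080b460f81486b85d08ec958582b3e72d858c"))
```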
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>  ----- Original Message -----
>> >>>>>>>>>>>  From: Masamichi Fukuda - elf-systems
>> >>>>>>>>  <masamichi_fukuda@elf-systems.com>
>> >>>>>>>>>>
>> >>>>>>>>>>>  To:
>> > "renayama19661014@ybb.ne.jp"
>> >>>>>>>>  <renayama19661014@ybb.ne.jp>;
>> >>>>>>>>  "linux-ha-japan@lists.sourceforge.jp"
>> >>>>>>>>  <linux-ha-japan@lists.sourceforge.jp>
>> >>>>>>>>>>>  Date: 2015/3/17, Tue 18:07
>> >>>>>>>>>>>  Subject: STONITH errors during split-brain
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>  Yamauchi-san,
>> >>>>>>>>>>>
>> >>>>>>>>>>>  Hello, this is Fukuda.
>> >>>>>>>>>>>
>> >>>>>>>>>>>  I looked at this page:
>> >>>>>>>>>>>
>> > https://github.com/ClusterLabs/pacemaker/tags
>> >>>>>>>>>>>
>> >>>>>>>>>>>  There, pacemaker 1.1.12 (561c4cf) appears to be the latest.
>> >>>>>>>>>>>  Sorry to trouble you, but could you tell me where to find anything newer than that?
>> >>>>>>>>>>>
>> >>>>>>>>>>>  Thank you in advance.
>> >>>>>>>>>>>
>> >>>>>>>>>>>  That is all.
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> > On Tuesday, 17 March 2015, <renayama19661014@ybb.ne.jp> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>  Fukuda-san,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>  Hello, this is Yamauchi.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>  Yes, it is too old.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>  Pacemaker gained support for Heartbeat 3.0.6 only fairly recently.
>> >>>>>>>>>>>>  Please install something newer. (You will have to build it from source again, though...)
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>  It is available from the upstream GitHub repository:
>> >>>>>>>>>>>>   *
>> > https://github.com/ClusterLabs/pacemaker
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>  The latest master can occasionally produce errors; if that happens, I suggest working back through older revisions.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>  That is all.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>  ----- Original Message -----
>> >>>>>>>>>>>>>  From: Masamichi Fukuda -
>> > elf-systems
>> >>>>>>>>  <masamichi_fukuda@elf-systems.com>
>> >>>>>>>>>>>>>  To: 山内英生
>> > <renayama19661014@ybb.ne.jp>;
>> >>>>>>>>  "linux-ha-japan@lists.sourceforge.jp"
>> >>>>>>>>  <linux-ha-japan@lists.sourceforge.jp>
>> >>>>>>>>>>>>>  Date: 2015/3/17, Tue 16:06
>> >>>>>>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>  Yamauchi-san,
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>  Hello, this is Fukuda.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>  In an earlier mail you advised me to install the latest versions of heartbeat and pacemaker.
>> >>>>>>>>>>>>>  So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>  heartbeat configuration: Version = "3.0.6"
>> >>>>>>>>>>>>>  pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>  Does this mean that pacemaker is still too old?
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>  Sorry for the trouble, and thank you in advance.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>  That is all.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>  On 2015-03-17 at 14:59,
>> > <renayama19661014@ybb.ne.jp> wrote:
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>  Fukuda-san,
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>  Hello, this is Yamauchi.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>  It just occurred to me: in an earlier mail in this exchange I answered as follows. Is that condition met on your side?
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  2) Heartbeat 3.0.6 + latest Pacemaker: OK
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  It appears that Heartbeat also has to be the latest version, 3.0.6, for this combination.
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>  Judging from the crm_mon output below, your version is 1.1.12.
>> >>>>>>>>>>>>>>  Pairing it with Heartbeat 3.0.6 requires a fairly recent Pacemaker.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  # crm_mon -rfA
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Last updated: Tue Mar 17 14:14:39 2015
>> >>>>>>>>>>>>>>>  Last change: Tue Mar 17 14:01:43 2015
>> >>>>>>>>>>>>>>>  Stack: heartbeat
>> >>>>>>>>>>>>>>>  Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>> >>>>>>>>>>>>>>>  Version: 1.1.12-561c4cf
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>  You probably need at least the following change, or something later:
>> >>>>>>>>>>>>>>
>> > https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>  That is all.
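
The release number alone does not say whether a given fix is present: this thread mentions 1.1.12 with build 561c4cf and also a later snapshot, e32080b. The sketch below pulls both parts out of crm_mon output so they can be recorded together. It assumes the `Version: <release>-<hash>` layout shown in this thread, and the function name is made up for this example.

```python
import re

# Matches lines such as "Version: 1.1.12-561c4cf".
VERSION_RE = re.compile(r"^Version:\s*(\d+(?:\.\d+)*)-([0-9a-f]+)$")


def parse_crm_mon_version(output):
    """Return (release, build_hash) from crm_mon text, or None if absent."""
    for line in output.splitlines():
        match = VERSION_RE.match(line.strip())
        if match:
            return match.group(1), match.group(2)
    return None


if __name__ == "__main__":
    sample = "Stack: heartbeat\nVersion: 1.1.12-561c4cf\n2 Nodes configured\n"
    print(parse_crm_mon_version(sample))
```

Whether a given build hash actually contains the commit linked above still has to be checked against the repository history; the hash alone does not order two builds.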
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>  ----- Original Message
>> > -----
>> >>>>>>>>>>>>>>>  From: Masamichi Fukuda
>> > - elf-systems
>> >>>>>>>>  <masamichi_fukuda@elf-systems.com>
>> >>>>>>>>>>>>>>>  To: 山内英生
>> > <renayama19661014@ybb.ne.jp>;
>> >>>>>>>>  "linux-ha-japan@lists.sourceforge.jp"
>> >>>>>>>>  <linux-ha-japan@lists.sourceforge.jp>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Date: 2015/3/17, Tue
>> > 14:38
>> >>>>>>>>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Yamauchi-san,
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Hello, this is Fukuda.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Do you mean that I should add -x to the shebang line of stonith-helper?
>> >>>>>>>>>>>>>>>  I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  crm_mon shows no change from before.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  # crm_mon -rfA
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Last updated: Tue Mar 17 14:14:39 2015
>> >>>>>>>>>>>>>>>  Last change: Tue Mar 17 14:01:43 2015
>> >>>>>>>>>>>>>>>  Stack: heartbeat
>> >>>>>>>>>>>>>>>  Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>> >>>>>>>>>>>>>>>  Version: 1.1.12-561c4cf
>> >>>>>>>>>>>>>>>  2 Nodes configured
>> >>>>>>>>>>>>>>>  8 Resources configured
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Online: [ lbv1.beta.com lbv2.beta.com ]
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Full list of resources:
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>   Resource Group: HAvarnish
>> >>>>>>>>>>>>>>>       vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>> >>>>>>>>>>>>>>>       varnishd   (lsb:varnish):  Started lbv1.beta.com
>> >>>>>>>>>>>>>>>   Resource Group: grpStonith1
>> >>>>>>>>>>>>>>>       Stonith1-1 (stonith:external/stonith-helper):      Stopped
>> >>>>>>>>>>>>>>>       Stonith1-2 (stonith:external/xen0):        Stopped
>> >>>>>>>>>>>>>>>   Resource Group: grpStonith2
>> >>>>>>>>>>>>>>>       Stonith2-1 (stonith:external/stonith-helper):      Stopped
>> >>>>>>>>>>>>>>>       Stonith2-2 (stonith:external/xen0):        Stopped
>> >>>>>>>>>>>>>>>   Clone Set: clone_ping [ping]
>> >>>>>>>>>>>>>>>       Started: [ lbv1.beta.com lbv2.beta.com ]
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Node Attributes:
>> >>>>>>>>>>>>>>>  * Node lbv1.beta.com:
>> >>>>>>>>>>>>>>>      + default_ping_set                  : 100
>> >>>>>>>>>>>>>>>  * Node lbv2.beta.com:
>> >>>>>>>>>>>>>>>      + default_ping_set                  : 100
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Migration summary:
>> >>>>>>>>>>>>>>>  * Node lbv2.beta.com:
>> >>>>>>>>>>>>>>>     Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
>> >>>>>>>>>>>>>>>  * Node lbv1.beta.com:
>> >>>>>>>>>>>>>>>     Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Failed actions:
>> >>>>>>>>>>>>>>>      Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
>> >>>>>>>>>>>>>>>      Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
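
In the Migration summary above, each failed stonith resource shows fail-count=1000000 against migration-threshold=1. 1000000 is the value Pacemaker uses for INFINITY, so the resource stays off that node until the failure is cleared. The sketch below lists such resources from crm_mon text. The regular expression assumes the line layout shown in this thread, and the function name is made up for this example.

```python
import re

INFINITY = 1000000  # the numeric value Pacemaker uses for INFINITY

SUMMARY_RE = re.compile(
    r"(?P<rsc>\S+): migration-threshold=(?P<threshold>\d+) "
    r"fail-count=(?P<fails>\d+)"
)


def resources_over_threshold(crm_mon_output):
    """Return (resource, threshold, fail_count) tuples whose fail count
    has reached the migration threshold."""
    hits = []
    for match in SUMMARY_RE.finditer(crm_mon_output):
        threshold = int(match.group("threshold"))
        fails = int(match.group("fails"))
        # A threshold of 0 means "no limit", so it never triggers a ban.
        if threshold > 0 and fails >= threshold:
            hits.append((match.group("rsc"), threshold, fails))
    return hits


if __name__ == "__main__":
    sample = (
        "* Node lbv2.beta.com:\n"
        "   Stonith1-1: migration-threshold=1 fail-count=1000000 "
        "last-failure='Tue Mar 17 14:12:16 2015'\n"
    )
    for rsc, threshold, fails in resources_over_threshold(sample):
        note = " (INFINITY)" if fails >= INFINITY else ""
        print(rsc, threshold, str(fails) + note)
```

Once the underlying start failure is fixed, the fail count still has to be cleared (for example with a resource cleanup) before the resource is tried again.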
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  I also looked through the other logs.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  These are from heartbeat startup.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  # less /var/log/pm_logconv.out
>> >>>>>>>>>>>>>>>  Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
>> >>>>>>>>>>>>>>>  Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  # less /var/log/error
>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 crmd[13269]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  This is syslog grepped for stonith:
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0 (pid 13266)
>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1 stonithd[13266]:   notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
>> >>>>>>>>>>>>>>>  Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: setup_cib: Watching for stonith topology changes
>> >>>>>>>>>>>>>>>  Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: unpack_config: On loss of CCM Quorum: Ignore
>> >>>>>>>>>>>>>>>  Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>> >>>>>>>>>>>>>>>  Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>> >>>>>>>>>>>>>>>  Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
>> >>>>>>>>>>>>>>>  Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
>> >>>>>>>>>>>>>>>  Mar 17 14:12:04 lbv1 stonithd[13266]:   notice: xml_patch_version_check: Versions did not change in patch 0.5.0
>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 stonithd[13266]:   notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed:  2 ]
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Thank you in advance.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  That is all.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  On 2015-03-17 at 13:32,
>> > <renayama19661014@ybb.ne.jp> wrote:
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>  Fukuda-san,
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>  Hello, this is Yamauchi.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>  In that case, the problem seems to be in the start operation of stonith-helper.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>  If you put
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>  #!/bin/bash -x
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>  at the top of stonith-helper and start the cluster, it may reveal something.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>  Incidentally, I would expect stonith-helper's own log output to be written somewhere as well...
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>  That is all.
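
A similar trace can be collected by hand, without editing the shebang line, by running the plugin under `bash -x`. The sketch below only prepares the command line and environment. It assumes that external stonith plugins take the action as their first argument and read their parameters from environment variables; the helper name and the parameter values in the demo are illustrative.

```python
import os


def build_trace_invocation(plugin_path, action, params, base_env=None):
    """Return (argv, env) for running a shell plugin with xtrace enabled.

    `params` is a dict of plugin parameters exported as environment
    variables (an assumption about the plugin interface).
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update(params)
    argv = ["bash", "-x", plugin_path, action]
    return argv, env


if __name__ == "__main__":
    argv, env = build_trace_invocation(
        "/usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper",
        "status",
        {"hostlist": "lbv2.beta.com", "run_online_check": "yes"},
    )
    print(" ".join(argv))
```

Passing `argv` and `env` to `subprocess.run(argv, env=env, capture_output=True, text=True)` would then run it; bash writes the xtrace output to standard error.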
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>  ----- Original
>> > Message -----
>> >>>>>>>>>>>>>>>>>  From: Masamichi
>> > Fukuda - elf-systems
>> >>>>>>>>  <masamichi_fukuda@elf-systems.com>
>> >>>>>>>>>>>>>>>>>  To: 山内英生
>> > <renayama19661014@ybb.ne.jp>;
>> >>>>>>>>  "linux-ha-japan@lists.sourceforge.jp"
>> >>>>>>>>  <linux-ha-japan@lists.sourceforge.jp>
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>  Date:
>> > 2015/3/17, Tue 12:31
>> >>>>>>>>>>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>  Yamauchi-san,
>> >>>>>>>>>>>>>>>>>  cc: Matsushima-san
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>  Hello, this is Fukuda.
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>  xen0 is present in the same directory.
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>  # pwd
>> >>>>>>>>>>>>>>>>>  /usr/local/heartbeat/lib/stonith/plugins/external
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>  # ls
>> >>>>>>>>>>>>>>>>>  drac5          ibmrsa         kdumpcheck  riloe           vmware
>> >>>>>>>>>>>>>>>>>  dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
>> >>>>>>>>>>>>>>>>>  hetzner        ipmi           nut         stonith-helper  xen0-ha
>> >>>>>>>>>>>>>>>>>  hmchttp        ippower9258    rackpdu     vcenter
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>  Thank you in advance.
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>  That is all.
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>  2015-03-17
>> > 10:53 GMT+09:00
>> >>>>>>>>  <renayama19661014@ybb.ne.jp>:
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>  Fukuda-san,
>> >>>>>>>>>>>>>>>>>>  cc: Matsushima-san
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>  Hello, this is Yamauchi.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  There was no standard output or standard error output.
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  Is something wrong with stonith-helper?
>> >>>>>>>>>>>>>>>>>>>  stonith-helper is a shell script, so I did not pay much attention to how it was installed.
>> >>>>>>>>>>>>>>>>>>>  stonith-helper is located here:
>> >>>>>>>>>>>>>>>>>>>  /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>  Is xen0 in this directory as well?
>> >>>>>>>>>>>>>>>>>>  If it is not, that is a problem; please try copying the stonith-helper file, with its attributes preserved, into the same directory as xen0.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>  If it works after that, the pm_extras installation is at fault.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>  That is all.
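
The directory check suggested above can be scripted so that it also reports a missing execute bit, which a plain `ls` does not show. This is a minimal sketch; the function name and the three status strings are made up for this example.

```python
import os


def check_plugins(plugin_dir, names):
    """Map each plugin name to 'ok', 'not executable', or 'missing'."""
    status = {}
    for name in names:
        path = os.path.join(plugin_dir, name)
        if not os.path.isfile(path):
            status[name] = "missing"
        elif not os.access(path, os.X_OK):
            status[name] = "not executable"
        else:
            status[name] = "ok"
    return status


if __name__ == "__main__":
    # Directory and plugin names as they appear in this thread.
    plugin_dir = "/usr/local/heartbeat/lib/stonith/plugins/external"
    for name, state in check_plugins(plugin_dir, ["stonith-helper", "xen0"]).items():
        print(name, "->", state)
```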
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>  -----
>> > Original Message -----
>> >>>>>>>>>>>>>>>>>>>  From:
>> > Masamichi Fukuda - elf-systems
>> >>>>>>>>  <masamichi_fukuda@elf-systems.com>
>> >>>>>>>>>>>>>>>>>>>  To:
>> > 山内英生
>> >>>>>>>>  <renayama19661014@ybb.ne.jp>;
>> >>>>>>>>  "linux-ha-japan@lists.sourceforge.jp"
>> >>>>>>>>  <linux-ha-japan@lists.sourceforge.jp>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  Date:
>> > 2015/3/17, Tue 10:31
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  Yamauchi-san,
>> >>>>>>>>>>>>>>>>>>>  cc: Matsushima-san
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  Good morning, this is Fukuda.
>> >>>>>>>>>>>>>>>>>>>  Thank you for the crm example.
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  I adapted it to my environment right away.
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  $ cat test.crm
>> >>>>>>>>>>>>>>>>>>>  ### Cluster Option ###
>> >>>>>>>>>>>>>>>>>>>  property \
>> >>>>>>>>>>>>>>>>>>>      no-quorum-policy="ignore" \
>> >>>>>>>>>>>>>>>>>>>      stonith-enabled="true" \
>> >>>>>>>>>>>>>>>>>>>      startup-fencing="false" \
>> >>>>>>>>>>>>>>>>>>>      stonith-timeout="710s" \
>> >>>>>>>>>>>>>>>>>>>      crmd-transition-delay="2s"
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  ### Resource Default ###
>> >>>>>>>>>>>>>>>>>>>  rsc_defaults \
>> >>>>>>>>>>>>>>>>>>>      resource-stickiness="INFINITY" \
>> >>>>>>>>>>>>>>>>>>>      migration-threshold="1"
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  ### Group Configuration ###
>> >>>>>>>>>>>>>>>>>>>  group HAvarnish \
>> >>>>>>>>>>>>>>>>>>>      vip_208 \
>> >>>>>>>>>>>>>>>>>>>      varnishd
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  group grpStonith1 \
>> >>>>>>>>>>>>>>>>>>>      Stonith1-1 \
>> >>>>>>>>>>>>>>>>>>>      Stonith1-2
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  group grpStonith2 \
>> >>>>>>>>>>>>>>>>>>>      Stonith2-1 \
>> >>>>>>>>>>>>>>>>>>>      Stonith2-2
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  ### Clone Configuration ###
>> >>>>>>>>>>>>>>>>>>>  clone clone_ping \
>> >>>>>>>>>>>>>>>>>>>      ping
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  ### Fencing Topology ###
>> >>>>>>>>>>>>>>>>>>>  fencing_topology \
>> >>>>>>>>>>>>>>>>>>>      lbv1.beta.com: Stonith1-1 Stonith1-2 \
>> >>>>>>>>>>>>>>>>>>>      lbv2.beta.com: Stonith2-1 Stonith2-2
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  ### Primitive Configuration ###
>> >>>>>>>>>>>>>>>>>>>  primitive vip_208 ocf:heartbeat:IPaddr2 \
>> >>>>>>>>>>>>>>>>>>>      params \
>> >>>>>>>>>>>>>>>>>>>          ip="192.168.17.208" \
>> >>>>>>>>>>>>>>>>>>>          nic="eth0" \
>> >>>>>>>>>>>>>>>>>>>          cidr_netmask="24" \
>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="90s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>      op monitor interval="5s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="100s" on-fail="fence"
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  primitive varnishd lsb:varnish \
>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="90s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>      op monitor interval="10s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="100s" on-fail="fence"
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  primitive ping ocf:pacemaker:ping \
>> >>>>>>>>>>>>>>>>>>>      params \
>> >>>>>>>>>>>>>>>>>>>          name="default_ping_set" \
>> >>>>>>>>>>>>>>>>>>>          host_list="192.168.17.254" \
>> >>>>>>>>>>>>>>>>>>>          multiplier="100" \
>> >>>>>>>>>>>>>>>>>>>          dampen="1" \
>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="90s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>      op monitor interval="10s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="100s" on-fail="fence"
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  primitive Stonith1-1 stonith:external/stonith-helper \
>> >>>>>>>>>>>>>>>>>>>      params \
>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_retries="1" \
>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="40s" \
>> >>>>>>>>>>>>>>>>>>>          hostlist="lbv1.beta.com" \
>> >>>>>>>>>>>>>>>>>>>          dead_check_target="192.168.17.132 10.0.17.132" \
>> >>>>>>>>>>>>>>>>>>>          standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>> >>>>>>>>>>>>>>>>>>>          run_online_check="yes" \
>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  primitive Stonith1-2 stonith:external/xen0 \
>> >>>>>>>>>>>>>>>>>>>      params \
>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="60s" \
>> >>>>>>>>>>>>>>>>>>>          hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
>> >>>>>>>>>>>>>>>>>>>          dom0="xen0.beta.com" \
>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>      op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  primitive Stonith2-1 stonith:external/stonith-helper \
>> >>>>>>>>>>>>>>>>>>>      params \
>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_retries="1" \
>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="40s" \
>> >>>>>>>>>>>>>>>>>>>          hostlist="lbv2.beta.com" \
>> >>>>>>>>>>>>>>>>>>>          dead_check_target="192.168.17.133 10.0.17.133" \
>> >>>>>>>>>>>>>>>>>>>          standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>> >>>>>>>>>>>>>>>>>>>          run_online_check="yes" \
>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  primitive Stonith2-2 stonith:external/xen0 \
>> >>>>>>>>>>>>>>>>>>>      params \
>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="60s" \
>> >>>>>>>>>>>>>>>>>>>          hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
>> >>>>>>>>>>>>>>>>>>>          dom0="xen0.beta.com" \
>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>      op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  ### Resource Location ###
>> >>>>>>>>>>>>>>>>>>>  location HA_location-1 HAvarnish \
>> >>>>>>>>>>>>>>>>>>>      rule 200: #uname eq lbv1.beta.com \
>> >>>>>>>>>>>>>>>>>>>      rule 100: #uname eq lbv2.beta.com
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  location HA_location-2 HAvarnish \
>> >>>>>>>>>>>>>>>>>>>      rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  location HA_location-3 grpStonith1 \
>> >>>>>>>>>>>>>>>>>>>      rule -INFINITY: #uname eq lbv1.beta.com
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  location HA_location-4 grpStonith2 \
>> >>>>>>>>>>>>>>>>>>>      rule -INFINITY: #uname eq lbv2.beta.com
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  After loading this configuration, the messages differ from yesterday's.
>> >>>>>>>>>>>>>>>>>>>  The ping messages are gone.
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  # crm_mon -rfA
>> >>>>>>>>>>>>>>>>>>>  Last updated: Tue Mar 17 10:21:28 2015
>> >>>>>>>>>>>>>>>>>>>  Last change: Tue Mar 17 10:21:09 2015
>> >>>>>>>>>>>>>>>>>>>  Stack: heartbeat
>> >>>>>>>>>>>>>>>>>>>  Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>> >>>>>>>>>>>>>>>>>>>  Version: 1.1.12-561c4cf
>> >>>>>>>>>>>>>>>>>>>  2 Nodes configured
>> >>>>>>>>>>>>>>>>>>>  8 Resources configured
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  Online: [ lbv1.beta.com lbv2.beta.com ]
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  Full list of resources:
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>   Resource Group: HAvarnish
>> >>>>>>>>>>>>>>>>>>>       vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>> >>>>>>>>>>>>>>>>>>>       varnishd   (lsb:varnish):  Started lbv1.beta.com
>> >>>>>>>>>>>>>>>>>>>   Resource Group: grpStonith1
>> >>>>>>>>>>>>>>>>>>>       Stonith1-1 (stonith:external/stonith-helper):      Stopped
>> >>>>>>>>>>>>>>>>>>>       Stonith1-2 (stonith:external/xen0):        Stopped
>> >>>>>>>>>>>>>>>>>>>   Resource Group: grpStonith2
>> >>>>>>>>>>>>>>>>>>>       Stonith2-1 (stonith:external/stonith-helper):      Stopped
>> >>>>>>>>>>>>>>>>>>>       Stonith2-2 (stonith:external/xen0):        Stopped
>> >>>>>>>>>>>>>>>>>>>   Clone Set: clone_ping [ping]
>> >>>>>>>>>>>>>>>>>>>       Started: [ lbv1.beta.com lbv2.beta.com ]
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  Node Attributes:
>> >>>>>>>>>>>>>>>>>>>  * Node lbv1.beta.com:
>> >>>>>>>>>>>>>>>>>>>      + default_ping_set                  : 100
>> >>>>>>>>>>>>>>>>>>>  * Node lbv2.beta.com:
>> >>>>>>>>>>>>>>>>>>>      + default_ping_set                  : 100
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  Migration summary:
>> >>>>>>>>>>>>>>>>>>>  * Node lbv2.beta.com:
>> >>>>>>>>>>>>>>>>>>>     Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>> >>>>>>>>>>>>>>>>>>>  * Node lbv1.beta.com:
>> >>>>>>>>>>>>>>>>>>>     Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  Failed actions:
>> >>>>>>>>>>>>>>>>>>>      Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
>> >>>>>>>>>>>>>>>>>>>      Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  This is the log from /var/log/ha-debug:
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
>> >>>>>>>>>>>>>>>>>>>  IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
>> >>>>>>>>>>>>>>>>>>>  IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  There was no standard output or standard error output.
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  Is something wrong with stonith-helper?
>> >>>>>>>>>>>>>>>>>>>  stonith-helper is a shell script, so I did not pay much attention to how it was installed.
>> >>>>>>>>>>>>>>>>>>>  stonith-helper is located here:
>> >>>>>>>>>>>>>>>>>>>  /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  Thank you in advance.
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  That is all.
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> > 2015-03-17 9:45 GMT+09:00
>> >>>>>>>>  <renayama19661014@ybb.ne.jp>:
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>  Fukuda-san,
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  Good morning, this is Yamauchi.
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  Just in case, here is an excerpt from an example I have on hand that uses multiple stonith devices.
>> >>>>>>>>>>>>>>>>>>>>  (Please watch the line breaks when you actually use it.)
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  The example below is a configuration for the PM 1.1 series.
>> >>>>>>>>>>>>>>>>>>>>  For nodea, stonith is executed in the order prmStonith1-1, then prmStonith1-2.
>> >>>>>>>>>>>>>>>>>>>>  For nodeb, stonith is executed in the order prmStonith2-1, then prmStonith2-2.
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  The stonith devices themselves are helper and ssh.
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  (snip)
>> >>>>>>>>>>>>>>>>>>>>  ### Group Configuration ###
>> >>>>>>>>>>>>>>>>>>>>  group grpStonith1 \
>> >>>>>>>>>>>>>>>>>>>>      prmStonith1-1 \
>> >>>>>>>>>>>>>>>>>>>>      prmStonith1-2
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  group grpStonith2 \
>> >>>>>>>>>>>>>>>>>>>>      prmStonith2-1 \
>> >>>>>>>>>>>>>>>>>>>>      prmStonith2-2
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  ### Fencing Topology ###
>> >>>>>>>>>>>>>>>>>>>>  fencing_topology \
>> >>>>>>>>>>>>>>>>>>>>      nodea: prmStonith1-1 prmStonith1-2 \
>> >>>>>>>>>>>>>>>>>>>>      nodeb: prmStonith2-1 prmStonith2-2
>> >>>>>>>>>>>>>>>>>>>>  (snip)
>> >>>>>>>>>>>>>>>>>>>>  primitive prmStonith1-1 stonith:external/stonith-helper \
>> >>>>>>>>>>>>>>>>>>>>      params \
>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_retries="1" \
>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="40s" \
>> >>>>>>>>>>>>>>>>>>>>          hostlist="nodea" \
>> >>>>>>>>>>>>>>>>>>>>          dead_check_target="192.168.28.60 192.168.28.70" \
>> >>>>>>>>>>>>>>>>>>>>          standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>> >>>>>>>>>>>>>>>>>>>>          run_online_check="yes" \
>> >>>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  primitive prmStonith1-2 stonith:external/ssh \
>> >>>>>>>>>>>>>>>>>>>>      params \
>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="60s" \
>> >>>>>>>>>>>>>>>>>>>>          hostlist="nodea" \
>> >>>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>>      op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  primitive prmStonith2-1 stonith:external/stonith-helper \
>> >>>>>>>>>>>>>>>>>>>>      params \
>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_retries="1" \
>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="40s" \
>> >>>>>>>>>>>>>>>>>>>>          hostlist="nodeb" \
>> >>>>>>>>>>>>>>>>>>>>          dead_check_target="192.168.28.61 192.168.28.71" \
>> >>>>>>>>>>>>>>>>>>>>          standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>> >>>>>>>>>>>>>>>>>>>>          run_online_check="yes" \
>> >>>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  primitive prmStonith2-2 stonith:external/ssh \
>> >>>>>>>>>>>>>>>>>>>>      params \
>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="60s" \
>> >>>>>>>>>>>>>>>>>>>>          hostlist="nodeb" \
>> >>>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>>      op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>>>>>>>>>>>>>>>>>>  (snip)
>> >>>>>>>>>>>>>>>>>>>>
>> > location
>> >>>>>>>>  rsc_location-grpStonith1-2 grpStonith1 \
>> >>>>>>>>>>>>>>>>>>>>
>> > rule -INFINITY: #uname eq nodea
>> >>>>>>>>>>>>>>>>>>>>
>> > location
>> >>>>>>>>  rsc_location-grpStonith2-3 grpStonith2 \
>> >>>>>>>>>>>>>>>>>>>>
>> > rule -INFINITY: #uname eq nodeb
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> > 以上です。
>
>
>
>
>--
>ELF Systems
>Masamichi Fukuda
>mail to: masamichi_fukuda@elf-systems.com
>
>

_______________________________________________
Linux-ha-japan mailing list
Linux-ha-japan@lists.sourceforge.jp
http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
Re: STONITH errors during split-brain [ In reply to ]
Yamauchi-san,

Hello, this is Fukuda.

I installed a fresh Debian 7.8 on VirtualBox and then
installed heartbeat + pacemaker on it.

I did not install heartbeat, pacemaker, or related components from packages.

heartbeat starts, but loading the crm file produces errors.

# crm configure load update test1.crm
ERROR: crmd:metadata: got no meta-data, does this RA exist?
ERROR: cib-bootstrap-options: attribute no-quorum-policy does not exist
ERROR: cib-bootstrap-options: attribute stonith-enabled does not exist
ERROR: cib-bootstrap-options: attribute crmd-transition-delay does not exist
ERROR: pengine:metadata: got no meta-data, does this RA exist?

Could this be related to the issue where the agents under external are not recognized?

Thank you in advance.

That is all.
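
A minimal sketch for narrowing down the "got no meta-data" errors quoted above. It assumes that crmsh obtains this meta-data by running the crmd and pengine daemons with a "metadata" argument, and that the daemons live under /usr/local/heartbeat/libexec/pacemaker (the prefix that appears in the ha-debug log quoted in this thread). Both points are assumptions to verify on the actual system, and check_metadata and DAEMON_DIR are hypothetical names introduced only for illustration.

```shell
#!/bin/sh
# Assumption: daemon directory taken from the ha-debug log in this thread.
# Override it if Pacemaker was installed under a different prefix.
DAEMON_DIR="${DAEMON_DIR:-/usr/local/heartbeat/libexec/pacemaker}"

# check_metadata DIR NAME
# Returns 0 if DIR/NAME is executable and "DIR/NAME metadata" prints something.
# Assumption: the daemon accepts a "metadata" argument, as crmsh appears to expect.
check_metadata() {
    dir=$1
    name=$2
    if [ ! -x "$dir/$name" ]; then
        echo "$name: not found or not executable in $dir"
        return 1
    fi
    out=$("$dir/$name" metadata 2>/dev/null)
    if [ -z "$out" ]; then
        echo "$name: no meta-data returned"
        return 1
    fi
    echo "$name: meta-data OK"
    return 0
}

# Check the two daemons named in the crm error messages.
rc=0
for d in crmd pengine; do
    check_metadata "$DAEMON_DIR" "$d" || rc=1
done
```

If either daemon is reported as missing, the directory that crmsh searches probably differs from the one Pacemaker was installed into, which would point at the build prefix rather than at the stonith agents themselves.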


On 18 March 2015 at 12:13, <renayama19661014@ybb.ne.jp> wrote:

> Fukuda-san,
>
> Hello, this is Yamauchi.
>
> Understood.
> Thank you for letting me know.
>
> That is all.
>
>
>
> ----- Original Message -----
> >From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >Date: 2015/3/18, Wed 10:23
> >Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >
> >
> >Yamauchi-san,
> >
> >Hello, this is Fukuda.
> >
> >In my environment the following had been installed from packages,
> >so I first removed them with apt-get remove:
> >
> >heartbeat, libheartbeat2, pacemaker, corosync, resource-agents
> >
> >The hacluster user and the haclient group had also already been
> >created when the packages were installed.
> >
> >So I ran Matsushima-san's procedure starting from
> >
> >Preparation
> >apt-get install build-essential mercurial git \
> >
> >onward. Everything after that follows exactly the same procedure.
> >
> >Thank you in advance.
> >
> >That is all.
> >
> >On 18 March 2015 at 10:06, <renayama19661014@ybb.ne.jp> wrote:
> >>
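> >> Fukuda-san,
> >>
> >> Hello, this is Yamauchi.
> >>
> >> By the way, to double-check before I build an environment myself: did you
> >> follow Matsushima-san's procedure, summarized at the link below, exactly?
> >>
> >> * https://gist.github.com/takehironet/1469bd7123f63d61f843
> >>
> >> If there are any differences, please let me know.
> >>
> >> # In particular, I recall that the apt-get step for the initial build packages did not work well when I briefly tried it, so I am curious about that part.
> >>
> >>
> >> That is all.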
> >>
> >>
> >> ----- Original Message -----
> >> > From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
> >> > To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >> > Cc:
> >> > Date: 2015/3/18, Wed 09:53
> >> > Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >> >
> >> > Fukuda-san,
> >> >
> >> > Hello, this is Yamauchi.
> >> >
> >> >> stonith -L does at least seem to list the plugins.
> >> >>
> >> >> # /usr/local/heartbeat/sbin/stonith -L
> >> >
> >> >
> >> > That command should be part of the Heartbeat source, so I think this means there is no problem between Heartbeat and glue.
> >> >
> >> > So it seems more likely that the problem lies in the pacemaker installation.
> >> >
> >> > Either way, I will find time to build the environment on my side as well.
> >> >
> >> > That is all.
> >> >
> >> >
> >> > ----- Original Message -----
> >> >> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >> >> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >> >> Date: 2015/3/18, Wed 09:33
> >> >> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >> >>
> >> >>
> >> >> Yamauchi-san,
> >> >>
> >> >> Hello, this is Fukuda.
> >> >>
> >> >>> By Reusable I mean glue.
> >> >>
> >> >> Understood. You mean Cluster-glue.
> >> >>
> >> >>> This is as expected; I believe the agents under external are not
> >> >>> being recognized, so they do not start.
> >> >>
> >> >> stonith -L does at least seem to list the plugins.
> >> >>
> >> >> # /usr/local/heartbeat/sbin/stonith -L
> >> >> apcmaster
> >> >> apcsmart
> >> >> baytech
> >> >> cyclades
> >> >> external/drac5
> >> >> external/dracmc-telnet
> >> >> external/hetzner
> >> >> external/hmchttp
> >> >> external/ibmrsa
> >> >> external/ibmrsa-telnet
> >> >> external/ipmi
> >> >> external/ippower9258
> >> >> external/kdumpcheck
> >> >> external/libvirt
> >> >> external/nut
> >> >> external/rackpdu
> >> >> external/riloe
> >> >> external/ssh
> >> >> external/stonith-helper
> >> >> external/vcenter
> >> >> external/vmware
> >> >> external/xen0
> >> >> external/xen0-ha
> >> >> ibmhmc
> >> >> meatware
> >> >> null
> >> >> nw_rpc100s
> >> >> rcd_serial
> >> >> rps10
> >> >> ssh
> >> >> suicide
> >> >> wti_nps
> >> >>
> >> >>
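> >> >>> # I don't think I can say anything definite until I find time to build
> >> >>> a Debian 7 environment based on Matsushima-san's information.
> >> >>
> >> >> Sorry to trouble you when you are busy.
> >> >> I will review my installation as well.
> >> >>
> >> >> Thank you in advance.
> >> >>
> >> >> That is all.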
> >> >>
> >> >>
> >> >>
> >> >>
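> >> >> On 18 March 2015 at 9:02, <renayama19661014@ybb.ne.jp> wrote:
> >> >>
> >> >>> Fukuda-san,
> >> >>>
> >> >>> Good morning, this is Yamauchi.
> >> >>>
> >> >>> I worded that badly.
> >> >>> By Reusable I mean glue.
> >> >>>
> >> >>> There may be a problem with the pacemaker installation, but I cannot tell at this point.
> >> >>>
> >> >>>
> >> >>>> I removed stonith-helper and started with only external/ssh,
> >> >>>> but the status shown by crm_mon did not change.
> >> >>>
> >> >>>
> >> >>> This is as expected; I believe the agents under external are not being recognized, so they do not start.
> >> >>>
> >> >>> # I don't think I can say anything definite until I find time to build a Debian 7 environment based on Matsushima-san's information.
> >> >>>
> >> >>> That is all.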
> >> >>>
> >> >>>
> >> >>> ----- Original Message -----
> >> >>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >> >>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >> >>>> Date: 2015/3/18, Wed 08:12
> >> >>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >> >>>>
> >> >>>>
> >> >>>> Yamauchi-san,
> >> >>>>
> >> >>>> Good morning, this is Fukuda.
> >> >>>>
> >> >>>>> In that case, I think it means that neither xen0 nor stonith-helper is
> >> >>>>> located in the path that Pacemaker manages for stonith plugins.
> >> >>>>>
> >> >>>>> I suspect the installation of Reusable and pm_extras.
> >> >>>>
> >> >>>> Is there a problem with the pacemaker installation?
> >> >>>> Also, is Reusable something that has to be installed separately?
> >> >>>>
> >> >>>> I removed stonith-helper and started with only external/ssh,
> >> >>>> but the status shown by crm_mon did not change.
> >> >>>>
> >> >>>> Last updated: Wed Mar 18 08:07:42 2015
> >> >>>> Last change: Wed Mar 18 08:04:48 2015
> >> >>>> Stack: heartbeat
> >> >>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
> >> >>>> Version: 1.1.12-e32080b
> >> >>>> 2 Nodes configured
> >> >>>> 6 Resources configured
> >> >>>>
> >> >>>>
> >> >>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >> >>>>
> >> >>>> Full list of resources:
> >> >>>>
> >> >>>>  Stonith1-2 (stonith:external/ssh): Stopped
> >> >>>>  Stonith2-2 (stonith:external/ssh): Stopped
> >> >>>>  Resource Group: HAvarnish
> >> >>>>      vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> >> >>>>      varnishd (lsb:varnish): Started lbv1.beta.com
> >> >>>>  Clone Set: clone_ping [ping]
> >> >>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >> >>>>
> >> >>>> Node Attributes:
> >> >>>> * Node lbv1.beta.com:
> >> >>>>     + default_ping_set : 100
> >> >>>> * Node lbv2.beta.com:
> >> >>>>     + default_ping_set : 100
> >> >>>>
> >> >>>> Migration summary:
> >> >>>> * Node lbv2.beta.com:
> >> >>>>    Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:07:32 2015'
> >> >>>> * Node lbv1.beta.com:
> >> >>>>    Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:05:53 2015'
> >> >>>>
> >> >>>> Failed actions:
> >> >>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:07:30 2015', queued=0ms, exec=1061ms
> >> >>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:05:51 2015', queued=0ms, exec=1062ms
> >> >>>>
> >> >>>> Thank you in advance.
> >> >>>>
> >> >>>> That is all.
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> On 17 March 2015 at 23:51, <renayama19661014@ybb.ne.jp> wrote:
> >> >>>>
> >> >>>>> Fukuda-san,
> >> >>>>>
> >> >>>>> Good evening, this is Yamauchi.
> >> >>>>>
> >> >>>>> In that case, I think it means that neither xen0 nor stonith-helper is located in the path that Pacemaker manages for stonith plugins.
> >> >>>>>
> >> >>>>> I suspect the installation of Reusable and pm_extras.
> >> >>>>>
> >> >>>>> I will let you know if I find out anything more.
> >> >>>>>
> >> >>>>> That is all.
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> ----- Original Message -----
> >> >>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >> >>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >> >>>>>> Date: 2015/3/17, Tue 23:46
> >> >>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> Yamauchi-san,
> >> >>>>>>
> >> >>>>>> Good evening, this is Fukuda.
> >> >>>>>>
> >> >>>>>> I wonder whether I am doing something wrong with the -x option for stonith-helper.
> >> >>>>>>
> >> >>>>>> I removed stonith-helper and started with only xen0.
> >> >>>>>>
> >> >>>>>> # crm_mon -rfA
> >> >>>>>>
> >> >>>>>> Last updated: Tue Mar 17 23:38:53 2015
> >> >>>>>> Last change: Tue Mar 17 23:30:34 2015
> >> >>>>>> Stack: heartbeat
> >> >>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
> >> >>>>>> Version: 1.1.12-e32080b
> >> >>>>>> 2 Nodes configured
> >> >>>>>> 6 Resources configured
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >> >>>>>>
> >> >>>>>> Full list of resources:
> >> >>>>>>
> >> >>>>>>  Stonith1-2 (stonith:external/xen0): Stopped
> >> >>>>>>  Stonith2-2 (stonith:external/xen0): Stopped
> >> >>>>>>  Resource Group: HAvarnish
> >> >>>>>>      vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> >> >>>>>>      varnishd (lsb:varnish): Started lbv1.beta.com
> >> >>>>>>  Clone Set: clone_ping [ping]
> >> >>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >> >>>>>>
> >> >>>>>> Node Attributes:
> >> >>>>>> * Node lbv1.beta.com:
> >> >>>>>>     + default_ping_set : 100
> >> >>>>>> * Node lbv2.beta.com:
> >> >>>>>>     + default_ping_set : 100
> >> >>>>>>
> >> >>>>>> Migration summary:
> >> >>>>>> * Node lbv1.beta.com:
> >> >>>>>>    Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:34 2015'
> >> >>>>>> * Node lbv2.beta.com:
> >> >>>>>>    Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:27 2015'
> >> >>>>>>
> >> >>>>>> Failed actions:
> >> >>>>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32 2015', queued=0ms, exec=1061ms
> >> >>>>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25 2015', queued=0ms, exec=1342ms
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> The same failed actions appear as when stonith-helper was present.
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> Thank you in advance.
> >> >>>>>>
> >> >>>>>> That is all.
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> On 17 March 2015 at 22:38, <renayama19661014@ybb.ne.jp> wrote:
> >> >>>>>>
> >> >>>>>>> Fukuda-san,
> >> >>>>>>>
> >> >>>>>>> Good evening, this is Yamauchi.
> >> >>>>>>>
> >> >>>>>>> By the way, if possible, checking what happens with external/stonith-helper removed
> >> >>>>>>> and only external/xen0 configured may help isolate the problem.
> >> >>>>>>>
> >> >>>>>>> That is all.
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> ----- Original Message -----
> >> >>>>>>>
> >> >>>>>>>> From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
> >> >>>>>>>> To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >> >>>>>>>> Cc:
> >> >>>>>>>> Date: 2015/3/17, Tue 22:28
> >> >>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >> >>>>>>>>
> >> >>>>>>>> Fukuda-san,
> >> >>>>>>>>
> >> >>>>>>>> Good evening, this is Yamauchi.
> >> >>>>>>>>
> >> >>>>>>>> It looks like nothing has changed...
> >> >>>>>>>>
> >> >>>>>>>> For now, around tomorrow, I will check on RHEL, with the combination of
> >> >>>>>>>>
> >> >>>>>>>> Heartbeat 3.0.6
> >> >>>>>>>> the latest Pacemaker
> >> >>>>>>>>
> >> >>>>>>>> whether stonith-helper works with a similar configuration (the resources will be Dummy, and external/xen0 will be replaced by external/ssh).
> >> >>>>>>>>
> >> >>>>>>>> # If we could see the output with stonith-helper's -x option, the problem would be somewhat easier to narrow down...
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> That is all.
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> ----- Original Message -----
> >> >>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >> >>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >> >>>>>>>>> Date: 2015/3/17, Tue 21:24
> >> >>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> Yamauchi-san,
> >> >>>>>>>>>
> >> >>>>>>>>> Good evening, this is Fukuda.
> >> >>>>>>>>> Thank you for the information about the latest version.
> >> >>>>>>>>>
> >> >>>>>>>>> I installed it right away.
> >> >>>>>>>>>
> >> >>>>>>>>> Here is the state after startup.
> >> >>>>>>>>>
> >> >>>>>>>>> The failed actions appear unchanged.
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> # crm_mon -rfA
> >> >>>>>>>>> Last updated: Tue Mar 17 21:03:49 2015
> >> >>>>>>>>> Last change: Tue Mar 17 20:30:58 2015
> >> >>>>>>>>> Stack: heartbeat
> >> >>>>>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
> >> >>>>>>>>> Version: 1.1.12-e32080b
> >> >>>>>>>>> 2 Nodes configured
> >> >>>>>>>>> 8 Resources configured
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >> >>>>>>>>>
> >> >>>>>>>>> Full list of resources:
> >> >>>>>>>>>
> >> >>>>>>>>>  Resource Group: HAvarnish
> >> >>>>>>>>>      vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> >> >>>>>>>>>      varnishd (lsb:varnish): Started lbv1.beta.com
> >> >>>>>>>>>  Resource Group: grpStonith1
> >> >>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper): Stopped
> >> >>>>>>>>>      Stonith1-2 (stonith:external/xen0): Stopped
> >> >>>>>>>>>  Resource Group: grpStonith2
> >> >>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper): Stopped
> >> >>>>>>>>>      Stonith2-2 (stonith:external/xen0): Stopped
> >> >>>>>>>>>  Clone Set: clone_ping [ping]
> >> >>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >> >>>>>>>>>
> >> >>>>>>>>> Node Attributes:
> >> >>>>>>>>> * Node lbv1.beta.com:
> >> >>>>>>>>>     + default_ping_set : 100
> >> >>>>>>>>> * Node lbv2.beta.com:
> >> >>>>>>>>>     + default_ping_set : 100
> >> >>>>>>>>>
> >> >>>>>>>>> Migration summary:
> >> >>>>>>>>> * Node lbv1.beta.com:
> >> >>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
> >> >>>>>>>>> * Node lbv2.beta.com:
> >> >>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
> >> >>>>>>>>>
> >> >>>>>>>>> Failed actions:
> >> >>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
> >> >>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> Here are the logs.
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> # less /var/log/ha-debug
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
> >> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
> >> >>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
> >> >>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist=
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109 gid 113 (pid 4250)
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109 gid 113 (pid 4246)
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109 gid 113 (pid 4249)
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109 gid 113 (pid 4245)
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0 gid 0 (pid 4248)
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0 (pid 4247)
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname: lbv1.beta.com
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client ccm is set to 1024
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client attrd is set to 1024
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client stonith-ng is set to 1024
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status active
> >> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client cib is set to 1024
> >> >>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [15:17]
> >> >>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [19:21]
> >> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client crmd is set to 1024
> >> >>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [24:26]
> >> >>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [26:28]
> >> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [30:32]
> >> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> # less /var/log/error
> >> >>>>>>>>>
> >> >>>>>>>>> Mar 17 21:02:47 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >> >>>>>>>>> Mar 17 21:02:48 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >> >>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >> >>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >> >>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
> >> >>>>>>>>>
> >> >>>>>>>>> # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
> >> >>>>>>>>> Mar 17 21:03:24 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
> >> >>>>>>>>> Mar 17 21:03:27 lbv1 crmd[4250]: notice: run_graph: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
> >> >>>>>>>>> Mar 17 21:03:29 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
> >> >>>>>>>>> Mar 17 21:03:34 lbv1 crmd[4250]: notice: run_graph: Transition 1 (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
> >> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
> >> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
> >> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
> >> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
> >> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
> >> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
> >> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ failed: 2 ]
> >> >>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
> >> >>>>>>>>> Mar 17 21:03:40 lbv1 crmd[4250]: notice: run_graph: Transition 2 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
> >> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
> >> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
> >> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
> >> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
> >> >>>>>>>>> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
> >> >>>>>>>>> Mar 17 21:03:47 lbv1 crmd[4250]: notice: run_graph: Transition 3 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
> >> >>>>>>>>>
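The stonith-ng lines in this log report `failed to exec "stonith"`. One thing that may be worth ruling out is whether the `stonith` command can be resolved on the PATH at all. The following is only a minimal sketch: `check_cmd` is a hypothetical helper, not part of Heartbeat or Pacemaker, and a shell started by hand may see a different PATH than the cluster daemons do.

```shell
#!/bin/sh
# check_cmd is a hypothetical helper (not part of Heartbeat/Pacemaker).
# It reports whether a command name can be resolved via the current PATH.
check_cmd() {
    if cmd_path=$(command -v "$1" 2>/dev/null); then
        echo "found: $cmd_path"
    else
        echo "not found on PATH: $1"
        return 1
    fi
}

# Example: check the command named in the log message.
check_cmd stonith || true
```

If the command is reported as missing, comparing the install prefix (the plugins in this thread live under /usr/local/heartbeat) with the PATH seen by the daemons would be a reasonable next step.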
> >> >>>>>>>>> Thank you for your help.
> >> >>>>>>>>>
> >> >>>>>>>>> That is all.
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> 2015-03-17 18:31
> >> > <renayama19661014@ybb.ne.jp>:
> >> >>>>>>>>>
> >> >>>>>>>>> Fukuda-san,
> >> >>>>>>>>>>
> >> >>>>>>>>>> Good evening, this is Yamauchi.
> >> >>>>>>>>>>
> >> >>>>>>>>>> No tag has been created for it yet, so as of today the latest version is this commit:
> >> >>>>>>>>>>
> >> >>>>>>>>>> * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
> >> >>>>>>>>>>
> >> >>>>>>>>>> You can download it with the [Download ZIP] button on the right-hand side of that page.
> >> >>>>>>>>>>
> >> >>>>>>>>>> That is all.
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>> ----- Original Message -----
> >> >>>>>>>>>>> From: Masamichi Fukuda - elf-systems
> >> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >> >>>>>>>>>>
> >> >>>>>>>>>>> To:
> >> > "renayama19661014@ybb.ne.jp"
> >> >>>>>>>> <renayama19661014@ybb.ne.jp>;
> >> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >> >>>>>>>>>>> Date: 2015/3/17, Tue 18:07
> >> >>>>>>>>>>> Subject: STONITH errors during split-brain
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Yamauchi-san,
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Hello, this is Fukuda.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> I looked at this page:
> >> >>>>>>>>>>> https://github.com/ClusterLabs/pacemaker/tags
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> pacemaker 1.1.12 (561c4cf) appears to be the latest tag there.
> >> >>>>>>>>>>> Could you tell me where I can find versions newer than that?
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Thank you for your help.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> That is all.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>
> >> > On Tuesday, March 17, 2015, <renayama19661014@ybb.ne.jp> wrote:
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Fukuda-san,
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Hello, this is Yamauchi.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Yes, it is too old.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Pacemaker gained support for Heartbeat 3.0.6 only fairly recently.
> >> >>>>>>>>>>>> Please install something newer. (That does mean building from source again, though.)
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> It is available from the upstream GitHub repository:
> >> >>>>>>>>>>>> * https://github.com/ClusterLabs/pacemaker
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> The latest master sometimes produces errors; if that happens, I suggest stepping back through older revisions until you find one that works.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> That is all.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> ----- Original Message -----
> >> >>>>>>>>>>>>> From: Masamichi Fukuda -
> >> > elf-systems
> >> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >> >>>>>>>>>>>>> To: 山内英生
> >> > <renayama19661014@ybb.ne.jp>;
> >> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >> >>>>>>>>>>>>> Date: 2015/3/17, Tue 16:06
> >> >>>>>>>>>>>>> Subject: Re: [Linux-ha-jp]
> >> > STONITH errors during split-brain
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> Yamauchi-san,
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> Hello, this is Fukuda.
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> In an earlier mail you advised me to install the latest versions of heartbeat and pacemaker.
> >> >>>>>>>>>>>>> So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12:
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> heartbeat configuration: Version = "3.0.6"
> >> >>>>>>>>>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> Does this mean that pacemaker is still too old?
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> Sorry for the trouble, and thank you for your help.
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> That is all.
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> 2015-03-17 14:59
> >> > <renayama19661014@ybb.ne.jp>:
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> Fukuda-san,
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> Hello, this is Yamauchi.
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> Something just occurred to me: in an earlier mail in this thread I answered as follows. Has that been taken care of?
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> 2) Heartbeat 3.0.6 + latest Pacemaker: OK
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> It appears that the latest Heartbeat, 3.0.6, also has to be used in combination.
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> Judging from the version in the crm_mon output below, you are running 1.1.12.
> >> >>>>>>>>>>>>>> Combining it with Heartbeat 3.0.6 requires a fairly recent Pacemaker.
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> # crm_mon -rfA
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Last updated: Tue Mar
> >> > 17 14:14:39 2015
> >> >>>>>>>>>>>>>>> Last change: Tue Mar 17
> >> > 14:01:43 2015
> >> >>>>>>>>>>>>>>> Stack: heartbeat
> >> >>>>>>>>>>>>>>> Current DC:
> >> > lbv2.beta.com
> >> >>>>>>>> (82ffc36f-1ad8-8686-7db0-35686465c624) - parti
> >> >>>>>>>>>>>>>>> tion with quorum
> >> >>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> You probably need at least the following change, or anything later:
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> That is all.
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> ----- Original Message
> >> > -----
> >> >>>>>>>>>>>>>>> From: Masamichi Fukuda
> >> > - elf-systems
> >> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >> >>>>>>>>>>>>>>> To: 山内英生
> >> > <renayama19661014@ybb.ne.jp>;
> >> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Date: 2015/3/17, Tue
> >> > 14:38
> >> >>>>>>>>>>>>>>> Subject: Re:
> >> > [Linux-ha-jp] STONITH errors during split-brain
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Yamauchi-san,
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Hello, this is Fukuda.
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Do you mean that I should add -x to the shebang line of stonith-helper?
> >> >>>>>>>>>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> The crm_mon output looks the same as before.
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> # crm_mon -rfA
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
> >> >>>>>>>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
> >> >>>>>>>>>>>>>>> Stack: heartbeat
> >> >>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
> >> >>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
> >> >>>>>>>>>>>>>>> 2 Nodes configured
> >> >>>>>>>>>>>>>>> 8 Resources configured
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Full list of resources:
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>  Resource Group: HAvarnish
> >> >>>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
> >> >>>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
> >> >>>>>>>>>>>>>>>  Resource Group: grpStonith1
> >> >>>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
> >> >>>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
> >> >>>>>>>>>>>>>>>  Resource Group: grpStonith2
> >> >>>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
> >> >>>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
> >> >>>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
> >> >>>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Node Attributes:
> >> >>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >> >>>>>>>>>>>>>>>     + default_ping_set                  : 100
> >> >>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >> >>>>>>>>>>>>>>>     + default_ping_set                  : 100
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Migration summary:
> >> >>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >> >>>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
> >> >>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >> >>>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Failed actions:
> >> >>>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
> >> >>>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> I also looked through the other logs.
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> This is from when heartbeat was started.
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> # less /var/log/pm_logconv.out
> >> >>>>>>>>>>>>>>> Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
> >> >>>>>>>>>>>>>>> Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
> >> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
> >> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
> >> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
> >> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
> >> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
> >> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> # less /var/log/error
> >> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 crmd[13269]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Here are the syslog lines that match stonith (grep):
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0 (pid 13266)
> >> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 stonithd[13266]: notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
> >> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
> >> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: notice: setup_cib: Watching for stonith topology changes
> >> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: notice: unpack_config: On loss of CCM Quorum: Ignore
> >> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
> >> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
> >> >>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]: notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
> >> >>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]: notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
> >> >>>>>>>>>>>>>>> Mar 17 14:12:04 lbv1 stonithd[13266]: notice: xml_patch_version_check: Versions did not change in patch 0.5.0
> >> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
> >> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
> >> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
> >> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ failed: 2 ]
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Thank you for your help.
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> That is all.
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> 2015-03-17 13:32
> >> > <renayama19661014@ybb.ne.jp>:
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Fukuda-san,
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> Hello, this is Yamauchi.
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> In that case, the problem seems to be in the start operation of stonith-helper.
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> If you put
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> #!/bin/bash -x
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> at the top of stonith-helper and start the cluster, it may reveal something.
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> Incidentally, I would expect stonith-helper to be writing its own log somewhere as well...
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> That is all.
> >> >>>>>>>>>>>>>>>>
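When a script is started by a daemon rather than from a terminal, the trace produced by `-x` is written to standard error and may not show up anywhere obvious. As a rough sketch of one way to capture it (the function name `trace_demo` and the log path are hypothetical, chosen only for illustration, and this is not how stonith-helper itself is written), standard error can be redirected to a file before tracing is switched on:

```shell
#!/bin/bash
# trace_demo is a hypothetical stand-in for the body of a script
# such as stonith-helper; $1 is the file that receives the trace.
trace_demo() {
    (
        exec 2>>"$1"   # xtrace output goes to stderr, so append stderr to the file
        set -x         # enable the same tracing that "bash -x" turns on
        echo hello
    )
}

trace_demo /tmp/trace_demo.log   # hypothetical log path
```

The subshell keeps the redirection and the tracing local to the function, so the caller's own standard error is left untouched.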
> >> >>>>>>>>>>>>>>>> ----- Original
> >> > Message -----
> >> >>>>>>>>>>>>>>>>> From: Masamichi
> >> > Fukuda - elf-systems
> >> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >> >>>>>>>>>>>>>>>>> To: 山内英生
> >> > <renayama19661014@ybb.ne.jp>;
> >> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> Date:
> >> > 2015/3/17, Tue 12:31
> >> >>>>>>>>>>>>>>>>> Subject: Re:
> >> > [Linux-ha-jp]
> >> >>>>>>>> STONITH errors during split-brain
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> Yamauchi-san,
> >> >>>>>>>>>>>>>>>>> cc: Matsushima-san
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> Hello, this is Fukuda.
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> xen0 is in the same directory.
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> # pwd
> >> >>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> # ls
> >> >>>>>>>>>>>>>>>>> drac5          ibmrsa         kdumpcheck  riloe           vmware
> >> >>>>>>>>>>>>>>>>> dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
> >> >>>>>>>>>>>>>>>>> hetzner        ipmi           nut         stonith-helper  xen0-ha
> >> >>>>>>>>>>>>>>>>> hmchttp        ippower9258    rackpdu     vcenter
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> Thank you for your help.
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> That is all.
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> 2015-03-17
> >> > 10:53 GMT+09:00
> >> >>>>>>>> <renayama19661014@ybb.ne.jp>:
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> Fukuda-san,
> >> >>>>>>>>>>>>>>>>>> cc: Matsushima-san
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> Hello, this is Yamauchi.
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> There was no output on standard output or standard error.
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
> >> >>>>>>>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
> >> >>>>>>>>>>>>>>>>>>> stonith-helper is located here:
> >> >>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> Is xen0 in that directory as well?
> >> >>>>>>>>>>>>>>>>>> If it is not, that is a problem, so please try copying the stonith-helper file, with its attributes preserved, into the same directory as xen0.
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> If it works after that, the problem lies in the pm_extras installation.
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> That is all.
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> -----
> >> > Original Message -----
> >> >>>>>>>>>>>>>>>>>>> From:
> >> > Masamichi Fukuda - elf-systems
> >> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >> >>>>>>>>>>>>>>>>>>> To:
> >> > 山内英生
> >> >>>>>>>> <renayama19661014@ybb.ne.jp>;
> >> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> Date:
> >> > 2015/3/17, Tue 10:31
> >> >>>>>>>>>>>>>>>>>>>
> >> > Subject: Re: [Linux-ha-jp]
> >> >>>>>>>> STONITH errors during split-brain
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> Yamauchi-san,
> >> >>>>>>>>>>>>>>>>>>> cc: Matsushima-san
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> Good morning, this is Fukuda.
> >> >>>>>>>>>>>>>>>>>>> Thank you for the example crm configuration.
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> I adapted it to our environment right away.
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> $ cat test.crm
> >> >>>>>>>>>>>>>>>>>>> ### Cluster Option ###
> >> >>>>>>>>>>>>>>>>>>> property \
> >> >>>>>>>>>>>>>>>>>>>     no-quorum-policy="ignore" \
> >> >>>>>>>>>>>>>>>>>>>     stonith-enabled="true" \
> >> >>>>>>>>>>>>>>>>>>>     startup-fencing="false" \
> >> >>>>>>>>>>>>>>>>>>>     stonith-timeout="710s" \
> >> >>>>>>>>>>>>>>>>>>>     crmd-transition-delay="2s"
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> ### Resource Default ###
> >> >>>>>>>>>>>>>>>>>>> rsc_defaults \
> >> >>>>>>>>>>>>>>>>>>>     resource-stickiness="INFINITY" \
> >> >>>>>>>>>>>>>>>>>>>     migration-threshold="1"
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> ### Group Configuration ###
> >> >>>>>>>>>>>>>>>>>>> group HAvarnish \
> >> >>>>>>>>>>>>>>>>>>>     vip_208 \
> >> >>>>>>>>>>>>>>>>>>>     varnishd
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> group grpStonith1 \
> >> >>>>>>>>>>>>>>>>>>>     Stonith1-1 \
> >> >>>>>>>>>>>>>>>>>>>     Stonith1-2
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> group grpStonith2 \
> >> >>>>>>>>>>>>>>>>>>>     Stonith2-1 \
> >> >>>>>>>>>>>>>>>>>>>     Stonith2-2
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> ### Clone Configuration ###
> >> >>>>>>>>>>>>>>>>>>> clone clone_ping \
> >> >>>>>>>>>>>>>>>>>>>     ping
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> ### Fencing Topology ###
> >> >>>>>>>>>>>>>>>>>>> fencing_topology \
> >> >>>>>>>>>>>>>>>>>>>     lbv1.beta.com: Stonith1-1 Stonith1-2 \
> >> >>>>>>>>>>>>>>>>>>>     lbv2.beta.com: Stonith2-1 Stonith2-2
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> ### Primitive Configuration ###
> >> >>>>>>>>>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
> >> >>>>>>>>>>>>>>>>>>>     params \
> >> >>>>>>>>>>>>>>>>>>>         ip="192.168.17.208" \
> >> >>>>>>>>>>>>>>>>>>>         nic="eth0" \
> >> >>>>>>>>>>>>>>>>>>>         cidr_netmask="24" \
> >> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>     op monitor interval="5s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> primitive varnishd lsb:varnish \
> >> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
> >> >>>>>>>>>>>>>>>>>>>     params \
> >> >>>>>>>>>>>>>>>>>>>         name="default_ping_set" \
> >> >>>>>>>>>>>>>>>>>>>         host_list="192.168.17.254" \
> >> >>>>>>>>>>>>>>>>>>>         multiplier="100" \
> >> >>>>>>>>>>>>>>>>>>>         dampen="1" \
> >> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
> >> >>>>>>>>>>>>>>>>>>>     params \
> >> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >> >>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com" \
> >> >>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.132 10.0.17.132" \
> >> >>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >> >>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
> >> >>>>>>>>>>>>>>>>>>>     params \
> >> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >> >>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
> >> >>>>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
> >> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
> >> >>>>>>>>>>>>>>>>>>>     params \
> >> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >> >>>>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com" \
> >> >>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.133 10.0.17.133" \
> >> >>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >> >>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> primitive Stonith2-2 stonith:external/xen0 \
> >> >>>>>>>>>>>>>>>>>>>     params \
> >> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >> >>>>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
> >> >>>>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
> >> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> ### Resource Location ###
> >> >>>>>>>>>>>>>>>>>>> location HA_location-1 HAvarnish \
> >> >>>>>>>>>>>>>>>>>>>     rule 200: #uname eq lbv1.beta.com \
> >> >>>>>>>>>>>>>>>>>>>     rule 100: #uname eq lbv2.beta.com
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> location HA_location-2 HAvarnish \
> >> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> location HA_location-3 grpStonith1 \
> >> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv1.beta.com
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> location HA_location-4 grpStonith2 \
> >> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv2.beta.com
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> After loading this, the messages are different from yesterday's.
> >> >>>>>>>>>>>>>>>>>>> The ping-related messages are gone.
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> # crm_mon -rfA
> >> >>>>>>>>>>>>>>>>>>> Last updated: Tue Mar 17 10:21:28 2015
> >> >>>>>>>>>>>>>>>>>>> Last change: Tue Mar 17 10:21:09 2015
> >> >>>>>>>>>>>>>>>>>>> Stack: heartbeat
> >> >>>>>>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
> >> >>>>>>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
> >> >>>>>>>>>>>>>>>>>>> 2 Nodes configured
> >> >>>>>>>>>>>>>>>>>>> 8 Resources configured
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> Full list of resources:
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>  Resource Group: HAvarnish
> >> >>>>>>>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
> >> >>>>>>>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
> >> >>>>>>>>>>>>>>>>>>>  Resource Group: grpStonith1
> >> >>>>>>>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
> >> >>>>>>>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
> >> >>>>>>>>>>>>>>>>>>>  Resource Group: grpStonith2
> >> >>>>>>>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
> >> >>>>>>>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
> >> >>>>>>>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
> >> >>>>>>>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> Node Attributes:
> >> >>>>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >> >>>>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
> >> >>>>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >> >>>>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> Migration summary:
> >> >>>>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >> >>>>>>>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
> >> >>>>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >> >>>>>>>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> Failed actions:
> >> >>>>>>>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
> >> >>>>>>>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> Here is the log from /var/log/ha-debug.
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
> >> >>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
> >> >>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> There was no output on standard output or standard error.
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
> >> >>>>>>>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
> >> >>>>>>>>>>>>>>>>>>> stonith-helper is located here:
> >> >>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> Thank you in advance.
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> That is all.
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>> 2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
> >> >>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> Fukuda-san,
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> Good morning, this is Yamauchi.
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> Just in case, here is an excerpt of an example I have on hand that uses multiple stonith resources.
> >> >>>>>>>>>>>>>>>>>>>> (When you actually use it, watch the line breaks.)
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> The example below is a configuration for the PM 1.1 series:
> >> >>>>>>>>>>>>>>>>>>>> for nodea, stonith is executed in the order prmStonith1-1, then prmStonith1-2;
> >> >>>>>>>>>>>>>>>>>>>> for nodeb, stonith is executed in the order prmStonith2-1, then prmStonith2-2.
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> The stonith plugins themselves are helper and ssh.
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> (snip)
> >> >>>>>>>>>>>>>>>>>>>> ### Group Configuration ###
> >> >>>>>>>>>>>>>>>>>>>> group grpStonith1 \
> >> >>>>>>>>>>>>>>>>>>>>     prmStonith1-1 \
> >> >>>>>>>>>>>>>>>>>>>>     prmStonith1-2
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> group grpStonith2 \
> >> >>>>>>>>>>>>>>>>>>>>     prmStonith2-1 \
> >> >>>>>>>>>>>>>>>>>>>>     prmStonith2-2
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> ### Fencing Topology ###
> >> >>>>>>>>>>>>>>>>>>>> fencing_topology \
> >> >>>>>>>>>>>>>>>>>>>>     nodea: prmStonith1-1 prmStonith1-2 \
> >> >>>>>>>>>>>>>>>>>>>>     nodeb: prmStonith2-1 prmStonith2-2
> >> >>>>>>>>>>>>>>>>>>>> (snip)
> >> >>>>>>>>>>>>>>>>>>>> primitive prmStonith1-1 stonith:external/stonith-helper \
> >> >>>>>>>>>>>>>>>>>>>>     params \
> >> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >> >>>>>>>>>>>>>>>>>>>>         hostlist="nodea" \
> >> >>>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.28.60 192.168.28.70" \
> >> >>>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >> >>>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> primitive prmStonith1-2 stonith:external/ssh \
> >> >>>>>>>>>>>>>>>>>>>>     params \
> >> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >> >>>>>>>>>>>>>>>>>>>>         hostlist="nodea" \
> >> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> primitive prmStonith2-1 stonith:external/stonith-helper \
> >> >>>>>>>>>>>>>>>>>>>>     params \
> >> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >> >>>>>>>>>>>>>>>>>>>>         hostlist="nodeb" \
> >> >>>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.28.61 192.168.28.71" \
> >> >>>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >> >>>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> primitive prmStonith2-2 stonith:external/ssh \
> >> >>>>>>>>>>>>>>>>>>>>     params \
> >> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >> >>>>>>>>>>>>>>>>>>>>         hostlist="nodeb" \
> >> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >> >>>>>>>>>>>>>>>>>>>> (snip)
> >> >>>>>>>>>>>>>>>>>>>> location rsc_location-grpStonith1-2 grpStonith1 \
> >> >>>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodea
> >> >>>>>>>>>>>>>>>>>>>> location rsc_location-grpStonith2-3 grpStonith2 \
> >> >>>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodeb
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>>> That is all.



--
ELF Systems
Masamichi Fukuda
mail to: *masamichi_fukuda@elf-systems.com <elfsystems.com@gmail.com>*
Re: STONITH errors during split-brain [ In reply to ]
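Fukuda-san,

Good evening, this is Yamauchi.

The same situation occurred on my side as well.
It appears that the rng files of the newer Pacemaker (after Pacemaker 1.1.12) are the cause.
However, I have not yet found a workaround for this.

Incidentally, with the combination of Pacemaker 1.1.12 and Heartbeat 3.0.6, for which it is not known whether it works properly in the first place, I confirmed on a single node that stonith-helper starts.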

root@debian7-1:~# crm_mon -1 -Af
Last updated: Wed Mar 18 17:43:37 2015
Last change: Wed Mar 18 17:43:29 2015
Stack: heartbeat
Current DC: debian7-1 (d20c7df5-519e-4a4c-9b4b-1b88fc203133) - partition with quorum
Version: 1.1.12-561c4cf
1 Nodes configured
3 Resources configured


Online: [ debian7-1 ]

 prmDummy (ocf::pacemaker:Dummy): Started debian7-1
 Resource Group: grpStonith2
     Stonith2-1 (stonith:external/stonith-helper): Started debian7-1

Node Attributes:
* Node debian7-1:

Migration summary:
* Node debian7-1: 

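Some steps in Matsushima-san's procedure did not work for me (probably because I am not used to Debian), but I installed with the same build options and copied only the stonith-helper included in the latest pm_extras_1.0 into the same directory as xen0.
# If there is a problem with, for example, the execute permission of stonith-helper, set it correctly.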
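
The placement and permission check described above could be sketched roughly as follows. This is only an illustration: the check_plugin function and the PLUGIN_DIR variable are hypothetical names, and the default directory is the location Fukuda-san reported earlier in the thread, which may differ on other installations.

```shell
#!/bin/sh
# Sketch: confirm that the external stonith plugins exist and are executable.
# PLUGIN_DIR is an assumption based on the path reported earlier in this thread.
PLUGIN_DIR="${PLUGIN_DIR:-/usr/local/heartbeat/lib/stonith/plugins/external}"

# check_plugin FILE: print a status line; return 0 if FILE is a regular
# executable file, 1 if it is missing, 2 if it lacks execute permission.
check_plugin() {
    if [ ! -f "$1" ]; then
        echo "missing: $1"
        return 1
    elif [ ! -x "$1" ]; then
        echo "not executable: $1"
        return 2
    fi
    echo "ok: $1"
}

# Check xen0 and stonith-helper side by side; keep going on failure.
for name in xen0 stonith-helper; do
    check_plugin "$PLUGIN_DIR/$name" || true
done
```

If stonith-helper is reported as not executable, `chmod 755` on that file would be the usual fix.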

As for Fukuda-san's stonith-helper failing on start, I think the cause is probably that the stonith command is not on the PATH.

root@debian7-1:~# find / -name stonith -print
/usr/local/heartbeat/sbin/stonith

root@debian7-1:~# echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/heartbeat/sbin/


After adding /usr/local/heartbeat/sbin to PATH and starting heartbeat again, the result was the crm_mon display shown above.
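
A minimal sketch of that PATH adjustment, assuming the stonith command lives in /usr/local/heartbeat/sbin as in the find output above. The path_contains helper and the STONITH_DIR variable are hypothetical names. The change only helps if it is made in the environment from which heartbeat is started, and where to persist it depends on the installation.

```shell
#!/bin/sh
# Sketch: make sure the directory holding the stonith command is on PATH.
# STONITH_DIR is an assumption taken from the find output in this message.
STONITH_DIR="${STONITH_DIR:-/usr/local/heartbeat/sbin}"

# path_contains DIR: succeed if DIR (with or without a trailing slash)
# is already one of the PATH entries.
path_contains() {
    case ":$PATH:" in
        *":${1%/}:"*|*":${1%/}/:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Append the directory only when it is not there yet.
if ! path_contains "$STONITH_DIR"; then
    PATH="$PATH:$STONITH_DIR"
    export PATH
fi

# Report whether the shell can now resolve the command.
if command -v stonith >/dev/null 2>&1; then
    echo "stonith found: $(command -v stonith)"
else
    echo "stonith still not found on PATH" >&2
fi
```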

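However, the problem with the combination with the latest PM is not yet resolved, and whether this configuration (PM 1.1.12 + Heartbeat 3.0.6) works correctly is a separate question.
# It seems to be running, but I expect problems to appear.

Below is the crm file I loaded as a trial.
(The values of parameters such as dead_check_target and standby_check_command were only used to confirm startup, so in this configuration they are meaningless in practice.)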

### Cluster Option ###
property \
    no-quorum-policy="ignore" \
    stonith-enabled="true" \
    startup-fencing="false"

### Resource Default ###
rsc_defaults \
    resource-stickiness="INFINITY" \
    migration-threshold="1"

### Fencing Topology ###
fencing_topology \
    debian7-1: Stonith1-1 \
    debian7-2: Stonith2-1

group grpStonith1 \
    Stonith1-1

group grpStonith2 \
    Stonith2-1

primitive prmDummy ocf:pacemaker:Dummy \
    op start interval="0s" timeout="60s" on-fail="restart" \
    op monitor interval="3600s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="60s" on-fail="ignore"

primitive Stonith1-1 stonith:external/stonith-helper \
    params \
        pcmk_reboot_retries="1" \
        pcmk_reboot_timeout="40s" \
        hostlist="debian7-1" \
        dead_check_target="192.168.3.1" \
        standby_wait_time="10" \
        standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
    op start interval="0s" timeout="60s" on-fail="restart" \
    op monitor interval="3600s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="60s" on-fail="ignore"

primitive Stonith2-1 stonith:external/stonith-helper \
    params \
        pcmk_reboot_retries="1" \
        pcmk_reboot_timeout="40s" \
        hostlist="debian7-2" \
        dead_check_target="192.168.3.1" \
        standby_wait_time="10" \
        standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
    op start interval="0s" timeout="60s" on-fail="restart" \
    op monitor interval="3600s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="60s" on-fail="ignore"


location HA_location-3 grpStonith1 \
   rule -INFINITY: #uname eq debian7-1

location HA_location-4 grpStonith2 \
   rule -INFINITY: #uname eq debian7-2


I will let you know if I find out anything more.

That is all.

----- Original Message -----
>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>Date: 2015/3/18, Wed 15:09
>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>
>
>Yamauchi-san,
>
>Hello, this is Fukuda.
>
>I installed a fresh Debian 7.8 on VirtualBox and then
>installed heartbeat + pacemaker on it.
>
>
>I did not install heartbeat, pacemaker, and so on from packages.
>
>
>heartbeat starts, but loading the crm file produced errors.
>
>
># crm configure load update test1.crm
>
>ERROR: crmd:metadata: got no meta-data, does this RA exist?
>ERROR: cib-bootstrap-options: attribute no-quorum-policy does not exist
>ERROR: cib-bootstrap-options: attribute stonith-enabled does not exist
>ERROR: cib-bootstrap-options: attribute crmd-transition-delay does not exist
>ERROR: pengine:metadata: got no meta-data, does this RA exist?
>
>Could this be related to the agents under external not being recognized?
>
>Thank you in advance.
>
>That is all.
>
>
>
>
>
>2015-03-18 12:13 <renayama19661014@ybb.ne.jp>:
>
>>Fukuda-san,
>>
>>Hello, this is Yamauchi.
>>
>>Understood.
>>Thank you for letting me know.
>>
>>That is all.
>>
>>
>>
>>----- Original Message -----
>>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>
>>>Date: 2015/3/18, Wed 10:23
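>>>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>
>>>
>>>Yamauchi-san,
>>>
>>>Hello, this is Fukuda.
>>>
>>>In my environment the following had been installed from packages,
>>>so I first removed them with apt-get remove:
>>>
>>>heartbeat, libheartbeat2, pacemaker, corosync, resource-agents
>>>
>>>The hacluster user and the haclient group had already been created
>>>when the packages were installed.
>>>
>>>So, from Matsushima-san's procedure, I ran everything from
>>>
>>>Preparation
>>>apt-get install build-essential mercurial git \
>>>
>>>onward. The rest is exactly the same procedure.
>>>
>>>Thank you in advance.
>>>
>>>That is all.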
>>>
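>>>2015-03-18 10:06 <renayama19661014@ybb.ne.jp>:
>>>>
>>>> Fukuda-san,
>>>>
>>>> Hello, this is Yamauchi.
>>>>
>>>> Before I build an environment myself, let me double-check: did you follow
>>>> Matsushima-san's procedure summarized at the link below?
>>>>
>>>>  * https://gist.github.com/takehironet/1469bd7123f63d61f843
>>>>
>>>> If there are any differences, please let me know.
>>>>
>>>> # In particular, I recall that the apt-get step for the initial build packages did not go well when I briefly tried it, so that part concerns me.
>>>>
>>>>
>>>> That is all.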
>>>>
>>>>
>>>> ----- Original Message -----
>>>> > From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
>>>> > To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> > Cc:
>>>> > Date: 2015/3/18, Wed 09:53
>>>> > Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>> >
>>>> > Fukuda-san,
>>>> >
>>>> > Hello, this is Yamauchi.
>>>> >
>>>> >> stonith -L does at least display the list of plugins.
>>>> >>
>>>> >> # /usr/local/heartbeat/sbin/stonith -L
>>>> >
>>>> > That command should be part of the Heartbeat source, so I think this means there is no problem between Heartbeat and glue.
>>>> >
>>>> > So it seems more likely that the problem lies in the pacemaker installation.
>>>> >
>>>> > Either way, I will find time to build an environment here as well.
>>>> >
>>>> > That is all.
>>>> >
>>>> >
>>>> > ----- Original Message -----
>>>> >> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>> >> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> >> Date: 2015/3/18, Wed 09:33
>>>> >> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>> >>
>>>> >>
>>>> >> Yamauchi-san,
>>>> >>
>>>> >> Hello, this is Fukuda.
>>>> >>
>>>> >>> By Reusable I mean glue.
>>>> >>
>>>> >> Understood. You mean Cluster-glue.
>>>> >>
>>>> >>> This is as expected; I think the agents under external are not
>>>> >>> recognized and therefore do not start.
>>>> >>
>>>> >> stonith -L does at least display the list of plugins.
>>>> >>
>>>> >> # /usr/local/heartbeat/sbin/stonith -L
>>>> >> apcmaster
>>>> >> apcsmart
>>>> >> baytech
>>>> >> cyclades
>>>> >> external/drac5
>>>> >> external/dracmc-telnet
>>>> >> external/hetzner
>>>> >> external/hmchttp
>>>> >> external/ibmrsa
>>>> >> external/ibmrsa-telnet
>>>> >> external/ipmi
>>>> >> external/ippower9258
>>>> >> external/kdumpcheck
>>>> >> external/libvirt
>>>> >> external/nut
>>>> >> external/rackpdu
>>>> >> external/riloe
>>>> >> external/ssh
>>>> >> external/stonith-helper
>>>> >> external/vcenter
>>>> >> external/vmware
>>>> >> external/xen0
>>>> >> external/xen0-ha
>>>> >> ibmhmc
>>>> >> meatware
>>>> >> null
>>>> >> nw_rpc100s
>>>> >> rcd_serial
>>>> >> rps10
>>>> >> ssh
>>>> >> suicide
>>>> >> wti_nps
>>>> >>
>>>> >>
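>>>> >>> # I do not think I can say anything definite until I find time to
>>>> >>> build a Debian 7 environment based on Matsushima-san's information.
>>>> >>
>>>> >> Sorry to trouble you when you are busy.
>>>> >> I will review the installation on my side as well.
>>>> >>
>>>> >> Thank you in advance.
>>>> >>
>>>> >> That is all.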
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> 2015-03-18 9:02 <renayama19661014@ybb.ne.jp>:
>>>> >>
>>>> >>> Fukuda-san,
>>>> >>>
>>>> >>> Good morning, this is Yamauchi.
>>>> >>>
>>>> >>> My wording was poor.
>>>> >>> By Reusable I mean glue.
>>>> >>>
>>>> >>> There may be a problem with the pacemaker installation, but I cannot tell at this point.
>>>> >>>
>>>> >>>
>>>> >>>> I removed stonith-helper and started with external/ssh only,
>>>> >>>> but the status in crm_mon did not change.
>>>> >>>
>>>> >>>
>>>> >>> This is as expected; I think the agents under external are not recognized and therefore do not start.
>>>> >>>
>>>> >>> # I do not think I can say anything definite until I find time to build a Debian 7 environment based on Matsushima-san's information.
>>>> >>>
>>>> >>> That is all.
>>>> >>>
>>>> >>>
>>>> >>> ----- Original Message -----
>>>> >>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>> >>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> >>>> Date: 2015/3/18, Wed 08:12
>>>> >>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>> >>>>
>>>> >>>>
>>>> >>>> Yamauchi-san,
>>>> >>>>
>>>> >>>> Good morning, this is Fukuda.
>>>> >>>>
>>>> >>>>>  That suggests that neither xen0 nor stonith-helper is in a path
>>>> >>>>>  managed by Pacemaker as a stonith plugin.
>>>> >>>>>
>>>> >>>>>  I suspect the installation of Reusable and pm_extras.
>>>> >>>>
>>>> >>>> Is there a problem with the pacemaker installation?
>>>> >>>> Also, does something called Reusable need to be installed separately?
>>>> >>>>
>>>> >>>> I removed stonith-helper and started with external/ssh only,
>>>> >>>> but the status in crm_mon did not change.
>>>> >>>>
>>>> >>>> Last updated: Wed Mar 18 08:07:42 2015
>>>> >>>> Last change: Wed Mar 18 08:04:48 2015
>>>> >>>> Stack: heartbeat
>>>> >>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>>>> >>>> Version: 1.1.12-e32080b
>>>> >>>> 2 Nodes configured
>>>> >>>> 6 Resources configured
>>>> >>>>
>>>> >>>>
>>>> >>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>> >>>>
>>>> >>>> Full list of resources:
>>>> >>>>
>>>> >>>> Stonith1-2      (stonith:external/ssh): Stopped
>>>> >>>> Stonith2-2      (stonith:external/ssh): Stopped
>>>> >>>>  Resource Group: HAvarnish
>>>> >>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>> >>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>> >>>>  Clone Set: clone_ping [ping]
>>>> >>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>> >>>>
>>>> >>>> Node Attributes:
>>>> >>>> * Node lbv1.beta.com:
>>>> >>>>     + default_ping_set                  : 100
>>>> >>>> * Node lbv2.beta.com:
>>>> >>>>     + default_ping_set                  : 100
>>>> >>>>
>>>> >>>> Migration summary:
>>>> >>>> * Node lbv2.beta.com:
>>>> >>>>    Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:07:32 2015'
>>>> >>>> * Node lbv1.beta.com:
>>>> >>>>    Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:05:53 2015'
>>>> >>>>
>>>> >>>> Failed actions:
>>>> >>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:07:30 2015', queued=0ms, exec=1061ms
>>>> >>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:05:51 2015', queued=0ms, exec=1062ms
>>>> >>>>
>>>> >>>> Thank you in advance.
>>>> >>>>
>>>> >>>> That is all.
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>> 2015-03-17 23:51 <renayama19661014@ybb.ne.jp>:
>>>> >>>>
>>>> >>>>> Fukuda-san,
>>>> >>>>>
>>>> >>>>> Good evening, this is Yamauchi.
>>>> >>>>>
>>>> >>>>> That suggests that neither xen0 nor stonith-helper is in a path managed by Pacemaker as a stonith plugin.
>>>> >>>>>
>>>> >>>>> I suspect the installation of Reusable and pm_extras.
>>>> >>>>>
>>>> >>>>> I will let you know if I find out anything more.
>>>> >>>>>
>>>> >>>>> That is all.
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> ----- Original Message -----
>>>> >>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>> >>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> >>>>>> Date: 2015/3/17, Tue 23:46
>>>> >>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> Yamauchi-san,
>>>> >>>>>>
>>>> >>>>>> Good evening, this is Fukuda.
>>>> >>>>>>
>>>> >>>>>> I wonder whether I am doing something wrong with the -x option for stonith-helper.
>>>> >>>>>>
>>>> >>>>>> I removed stonith-helper and started with xen0 only.
>>>> >>>>>>
>>>> >>>>>> # crm_mon -rfA
>>>> >>>>>>
>>>> >>>>>> Last updated: Tue Mar 17 23:38:53 2015
>>>> >>>>>> Last change: Tue Mar 17 23:30:34 2015
>>>> >>>>>> Stack: heartbeat
>>>> >>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>>>> >>>>>> Version: 1.1.12-e32080b
>>>> >>>>>> 2 Nodes configured
>>>> >>>>>> 6 Resources configured
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>> >>>>>>
>>>> >>>>>> Full list of resources:
>>>> >>>>>>
>>>> >>>>>> Stonith1-2      (stonith:external/xen0):        Stopped
>>>> >>>>>> Stonith2-2      (stonith:external/xen0):        Stopped
>>>> >>>>>>  Resource Group: HAvarnish
>>>> >>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>> >>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>> >>>>>>  Clone Set: clone_ping [ping]
>>>> >>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>> >>>>>>
>>>> >>>>>> Node Attributes:
>>>> >>>>>> * Node lbv1.beta.com:
>>>> >>>>>>     + default_ping_set                  : 100
>>>> >>>>>> * Node lbv2.beta.com:
>>>> >>>>>>     + default_ping_set                  : 100
>>>> >>>>>>
>>>> >>>>>> Migration summary:
>>>> >>>>>> * Node lbv1.beta.com:
>>>> >>>>>>    Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:34 2015'
>>>> >>>>>> * Node lbv2.beta.com:
>>>> >>>>>>    Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:27 2015'
>>>> >>>>>>
>>>> >>>>>> Failed actions:
>>>> >>>>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32 2015', queued=0ms, exec=1061ms
>>>> >>>>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25 2015', queued=0ms, exec=1342ms
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> The same failed actions appear as when stonith-helper is present.
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> Thank you in advance.
>>>> >>>>>>
>>>> >>>>>> That is all.
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> 2015-03-17 22:38 <renayama19661014@ybb.ne.jp>:
>>>> >>>>>>
>>>> >>>>>>> Fukuda-san,
>>>> >>>>>>>
>>>> >>>>>>> Good evening, this is Yamauchi.
>>>> >>>>>>>
>>>> >>>>>>> If possible, checking what happens when you remove external/stonith-helper and use only external/xen0
>>>> >>>>>>> might help isolate the problem.
>>>> >>>>>>>
>>>> >>>>>>> That is all.
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> ----- Original Message -----
>>>> >>>>>>>
>>>> >>>>>>>>  From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
>>>> >>>>>>>>  To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> >>>>>>>>  Cc:
>>>> >>>>>>>>  Date: 2015/3/17, Tue 22:28
>>>> >>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>> >>>>>>>>
>>>> >>>>>>>>  Fukuda-san,
>>>> >>>>>>>>
>>>> >>>>>>>>  Good evening, this is Yamauchi.
>>>> >>>>>>>>
>>>> >>>>>>>>  It looks like nothing has changed...
>>>> >>>>>>>>
>>>> >>>>>>>>  For now, around tomorrow, I will check, on RHEL though, whether stonith-helper works with the combination of
>>>> >>>>>>>>
>>>> >>>>>>>>  Heartbeat 3.0.6
>>>> >>>>>>>>  the latest Pacemaker
>>>> >>>>>>>>
>>>> >>>>>>>>  under a similar configuration (the resource will be Dummy, and external/xen0 will be replaced by external/ssh).
>>>> >>>>>>>>
>>>> >>>>>>>>  # If we could see the output of stonith-helper with -x specified, it would be easier to narrow down the problem...
>>>> >>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>>  That is all.
>>>> >>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>>  ----- Original Message -----
>>>> >>>>>>>>>  From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>> >>>>>>>>>  To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> >>>>>>>>>  Date: 2015/3/17, Tue 21:24
>>>> >>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>  Yamauchi-san,
>>>> >>>>>>>>>
>>>> >>>>>>>>>  Good evening, this is Fukuda.
>>>> >>>>>>>>>  Thank you for the information on the latest version.
>>>> >>>>>>>>>
>>>> >>>>>>>>>  I installed it right away.
>>>> >>>>>>>>>
>>>> >>>>>>>>>  This is the state after startup.
>>>> >>>>>>>>>
>>>> >>>>>>>>>  The failed actions appear unchanged.
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>  # crm_mon -rfA
>>>> >>>>>>>>>  Last updated: Tue Mar 17 21:03:49 2015
>>>> >>>>>>>>>  Last change: Tue Mar 17 20:30:58 2015
>>>> >>>>>>>>>  Stack: heartbeat
>>>> >>>>>>>>>  Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>>>> >>>>>>>>>  Version: 1.1.12-e32080b
>>>> >>>>>>>>>  2 Nodes configured
>>>> >>>>>>>>>  8 Resources configured
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>  Online: [ lbv1.beta.com lbv2.beta.com ]
>>>> >>>>>>>>>
>>>> >>>>>>>>>  Full list of resources:
>>>> >>>>>>>>>
>>>> >>>>>>>>>   Resource Group: HAvarnish
>>>> >>>>>>>>>       vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>> >>>>>>>>>       varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>> >>>>>>>>>   Resource Group: grpStonith1
>>>> >>>>>>>>>       Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>> >>>>>>>>>       Stonith1-2 (stonith:external/xen0):        Stopped
>>>> >>>>>>>>>   Resource Group: grpStonith2
>>>> >>>>>>>>>       Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>> >>>>>>>>>       Stonith2-2 (stonith:external/xen0):        Stopped
>>>> >>>>>>>>>   Clone Set: clone_ping [ping]
>>>> >>>>>>>>>       Started: [ lbv1.beta.com lbv2.beta.com ]
>>>> >>>>>>>>>
>>>> >>>>>>>>>  Node Attributes:
>>>> >>>>>>>>>  * Node lbv1.beta.com:
>>>> >>>>>>>>>      + default_ping_set                  : 100
>>>> >>>>>>>>>  * Node lbv2.beta.com:
>>>> >>>>>>>>>      + default_ping_set                  : 100
>>>> >>>>>>>>>
>>>> >>>>>>>>>  Migration summary:
>>>> >>>>>>>>>  * Node lbv1.beta.com:
>>>> >>>>>>>>>     Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
>>>> >>>>>>>>>  * Node lbv2.beta.com:
>>>> >>>>>>>>>     Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
>>>> >>>>>>>>>
>>>> >>>>>>>>>  Failed actions:
>>>> >>>>>>>>>      Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
>>>> >>>>>>>>>      Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>  Here is the log.
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>  # less /var/log/ha-debug
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>> > [4236]: info: Heartbeat generation:
>>>> >>>>>>>>  1423534116
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>> > [4236]: info: seed is -1702799346
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>> > [4236]: info: glib: ucast: write
>>>> >>>>>>>>  socket priority set to IPTOS_LOWDELAY on eth1
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>> > [4236]: info: glib: ucast: bound
>>>> >>>>>>>>  send socket to device: eth1
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>> > [4236]: info: glib: ucast: set
>>>> >>>>>>>>  SO_REUSEADDR
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>> > [4236]: info: glib: ucast: bound
>>>> >>>>>>>>  receive socket to device: eth1
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>> > [4236]: info: glib: ucast: started
>>>> >>>>>>>>  on port 694 interface eth1 to 10.0.17.133
>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>> > [4236]: info: Local status now set
>>>> >>>>>>>>  to: 'up'
>>>> >>>>>>>>>  Mar 17 21:02:46 lbv1.beta.com heartbeat:
>>>> > [4236]: info: Link
>>>> >>>>>>>>  lbv2.beta.com:eth1 up.
>>>> >>>>>>>>>  Mar 17 21:02:46 lbv1.beta.com heartbeat:
>>>> > [4236]: info: Status update for
>>>> >>>>>>>>  node lbv2.beta.com: status up
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: info: Comm_now_up():
>>>> >>>>>>>>  updating status to active
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: info: Local status now set
>>>> >>>>>>>>  to: 'active'
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: info: Starting child client
>>>> >>>>>>>>
>>>> > "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: info: Starting child client
>>>> >>>>>>>>
>>>> > "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: info: Starting child client
>>>> >>>>>>>>
>>>> > "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: info: Starting child client
>>>> >>>>>>>>
>>>> > "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: info: Starting child client
>>>> >>>>>>>>
>>>> > "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: info: Starting child client
>>>> >>>>>>>>
>>>> > "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: debug: get_delnodelist:
>>>> >>>>>>>>  delnodelist=
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4250]: info: Starting
>>>> >>>>>>>>
>>>> > "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109  gid 113 (pid
>>>> >>>>>>>>  4250)
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4246]: info: Starting
>>>> >>>>>>>>
>>>> > "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109  gid 113 (pid
>>>> >>>>>>>>  4246)
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4249]: info: Starting
>>>> >>>>>>>>
>>>> > "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109  gid 113
>>>> >>>>>>>>  (pid 4249)
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4245]: info: Starting
>>>> >>>>>>>>
>>>> > "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109  gid 113 (pid
>>>> >>>>>>>>  4245)
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4248]: info: Starting
>>>> >>>>>>>>
>>>> > "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0  gid 0 (pid
>>>> >>>>>>>>  4248)
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4247]: info: Starting
>>>> >>>>>>>>
>>>> > "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid
>>>> >>>>>>>>  4247)
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com ccm: [4245]:
>>>> > info: Hostname: lbv1.beta.com
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: info: the send queue length
>>>> >>>>>>>>  from heartbeat to client ccm is set to 1024
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: info: the send queue length
>>>> >>>>>>>>  from heartbeat to client attrd is set to 1024
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: info: the send queue length
>>>> >>>>>>>>  from heartbeat to client stonith-ng is set to 1024
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: info: Status update for
>>>> >>>>>>>>  node lbv2.beta.com: status active
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>> > [4236]: info: the send queue length
>>>> >>>>>>>>  from heartbeat to client cib is set to 1024
>>>> >>>>>>>>>  Mar 17 21:02:51 lbv1.beta.com heartbeat:
>>>> > [4236]: WARN: 1 lost packet(s) for
>>>> >>>>>>>>  [lbv2.beta.com] [15:17]
>>>> >>>>>>>>>  Mar 17 21:02:51 lbv1.beta.com heartbeat:
>>>> > [4236]: info: No pkts missing from
>>>> >>>>>>>>  lbv2.beta.com!
>>>> >>>>>>>>>  Mar 17 21:02:52 lbv1.beta.com heartbeat:
>>>> > [4236]: WARN: 1 lost packet(s) for
>>>> >>>>>>>>  [lbv2.beta.com] [19:21]
>>>> >>>>>>>>>  Mar 17 21:02:52 lbv1.beta.com heartbeat:
>>>> > [4236]: info: No pkts missing from
>>>> >>>>>>>>  lbv2.beta.com!
>>>> >>>>>>>>>  Mar 17 21:02:52 lbv1.beta.com heartbeat:
>>>> > [4236]: info: the send queue length
>>>> >>>>>>>>  from heartbeat to client crmd is set to 1024
>>>> >>>>>>>>>  Mar 17 21:02:53 lbv1.beta.com heartbeat:
>>>> > [4236]: WARN: 1 lost packet(s) for
>>>> >>>>>>>>  [lbv2.beta.com] [24:26]
>>>> >>>>>>>>>  Mar 17 21:02:53 lbv1.beta.com heartbeat:
>>>> > [4236]: info: No pkts missing from
>>>> >>>>>>>>  lbv2.beta.com!
>>>> >>>>>>>>>  Mar 17 21:02:54 lbv1.beta.com heartbeat:
>>>> > [4236]: WARN: 1 lost packet(s) for
>>>> >>>>>>>>  [lbv2.beta.com] [26:28]
>>>> >>>>>>>>>  Mar 17 21:02:54 lbv1.beta.com heartbeat:
>>>> > [4236]: info: No pkts missing from
>>>> >>>>>>>>  lbv2.beta.com!
>>>> >>>>>>>>>  Mar 17 21:02:54 lbv1.beta.com heartbeat:
>>>> > [4236]: WARN: 1 lost packet(s) for
>>>> >>>>>>>>  [lbv2.beta.com] [30:32]
>>>> >>>>>>>>>  Mar 17 21:02:54 lbv1.beta.com heartbeat:
>>>> > [4236]: info: No pkts missing from
>>>> >>>>>>>>  lbv2.beta.com!
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>  # less /var/log/error
>>>> >>>>>>>>>
>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>> >>>>>>>>>  Mar 17 21:02:48 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>> >>>>>>>>>  Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>> >>>>>>>>>  Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>>>> >>>>>>>>>  Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>>>> >>>>>>>>>
>>>> >>>>>>>>>  # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
>>>> >>>>>>>>>  Mar 17 21:03:24 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
>>>> >>>>>>>>>  Mar 17 21:03:27 lbv1 crmd[4250]:   notice: run_graph: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
>>>> >>>>>>>>>  Mar 17 21:03:29 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
>>>> >>>>>>>>>  Mar 17 21:03:34 lbv1 crmd[4250]:   notice: run_graph: Transition 1 (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
>>>> >>>>>>>>>  Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>> >>>>>>>>>  Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>> >>>>>>>>>  Mar 17 21:03:37 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
>>>> >>>>>>>>>  Mar 17 21:03:39 lbv1 stonith-ng[4247]:   notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>> >>>>>>>>>  Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
>>>> >>>>>>>>>  Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
>>>> >>>>>>>>>  Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed:  2 ]
>>>> >>>>>>>>>  Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>>>> >>>>>>>>>  Mar 17 21:03:40 lbv1 crmd[4250]:   notice: run_graph: Transition 2 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
>>>> >>>>>>>>>  Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>>>> >>>>>>>>>  Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>>>> >>>>>>>>>  Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>>>> >>>>>>>>>  Mar 17 21:03:42 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
>>>> >>>>>>>>>  Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>> >>>>>>>>>  Mar 17 21:03:47 lbv1 crmd[4250]:   notice: run_graph: Transition 3 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
>>>> >>>>>>>>>
>>>> >>>>>>>>>  Thank you in advance.
>>>> >>>>>>>>>
>>>> >>>>>>>>>  That's all.
>>>> >>>>>>>>>
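The syslog excerpt above shows stonith-ng running `stonith -t external/stonith-helper -S` and then logging `failed to exec "stonith"` and `failed:  2`. One possible reading, which the thread does not confirm, is that the daemon could not find the `stonith` command itself (error number 2 is ENOENT on Linux). The sketch below is a hypothetical helper, not part of Heartbeat or Pacemaker, that reports where a command would be resolved from; the extra directory in the example call is a guess based on the /usr/local/heartbeat prefix seen in the logs.

```shell
#!/bin/sh
# find_exec NAME [DIR...]: print the first executable called NAME found on
# PATH or in one of the extra directories; return 1 if none is found.
find_exec() {
    name=$1
    shift
    # First ask the shell how it would resolve the command.
    if found=$(command -v "$name" 2>/dev/null); then
        printf '%s\n' "$found"
        return 0
    fi
    # Then try the extra directories given by the caller.
    for dir in "$@"; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Example call. The sbin directory is an assumption derived from the
# install prefix in the logs, not a confirmed location.
find_exec stonith /usr/local/heartbeat/sbin || echo "stonith not found"
```

If the command is only found in an extra directory and not on PATH, the environment that the cluster daemons are started with would be the next thing to compare.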
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>  March 17, 2015, 18:31 <renayama19661014@ybb.ne.jp>:
>>>> >>>>>>>>>
>>>> >>>>>>>>>  Fukuda-san,
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>  Good evening, this is Yamauchi.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>  No tag has been created for it, so the latest version as of today is:
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>   * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>  You can download it with the [Download ZIP] button on the right.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>  That's all.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>  ----- Original Message -----
>>>> >>>>>>>>>>>  From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>> >>>>>>>>>>>  To: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> >>>>>>>>>>>  Date: 2015/3/17, Tue 18:07
>>>> >>>>>>>>>>>  Subject: STONITH errors during split-brain
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>  Yamauchi-san,
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>  Hello, this is Fukuda.
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>  I looked at this page:
>>>> >>>>>>>>>>>  https://github.com/ClusterLabs/pacemaker/tags
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>  pacemaker 1.1.12 (561c4cf) appears to be the latest one listed there.
>>>> >>>>>>>>>>>  Sorry to trouble you, but could you tell me where I can find anything newer than this?
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>  Thank you in advance.
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>  That's all.
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>  On Tuesday, March 17, 2015, <renayama19661014@ybb.ne.jp> wrote:
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>  Fukuda-san,
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>  Hello, this is Yamauchi.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>  Yes, it is old.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>  Pacemaker gained support for Heartbeat 3.0.6 surprisingly recently.
>>>> >>>>>>>>>>>>  Please install something newer. (You will need to build it from source again, though...)
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>  It is available from the upstream GitHub repository:
>>>> >>>>>>>>>>>>   * https://github.com/ClusterLabs/pacemaker
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>  In some cases the latest master may produce errors; if that happens, I think it is best to work backward through older revisions.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>  That's all.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>  ----- Original Message -----
>>>> >>>>>>>>>>>>>  From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>> >>>>>>>>>>>>>  To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> >>>>>>>>>>>>>  Date: 2015/3/17, Tue 16:06
>>>> >>>>>>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>  Yamauchi-san,
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>  Hello, this is Fukuda.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>  In an earlier email you advised me to install the latest versions of heartbeat and pacemaker.
>>>> >>>>>>>>>>>>>  So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>  heartbeat configuration: Version = "3.0.6"
>>>> >>>>>>>>>>>>>  pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>  Does this mean that pacemaker is still too old?
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>  Sorry to trouble you, and thank you in advance.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>  That's all.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>  March 17, 2015, 14:59 <renayama19661014@ybb.ne.jp>:
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>  Fukuda-san,
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>  Hello, this is Yamauchi.
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>  It just occurred to me: in an earlier email in this exchange I gave the answer below. Is that point covered on your side?
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  2) Heartbeat 3.0.6 + latest Pacemaker: OK
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  It appears that Heartbeat also has to be the latest version, 3.0.6, for this combination.
>>>> >>>>>>>>>>>>>>>>>>>>   * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>  Judging from the crm_mon version below, you are running 1.1.12.
>>>> >>>>>>>>>>>>>>  Combining it with Heartbeat 3.0.6 requires a fairly recent Pacemaker.
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  # crm_mon -rfA
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  Last updated: Tue Mar 17 14:14:39 2015
>>>> >>>>>>>>>>>>>>>  Last change: Tue Mar 17 14:01:43 2015
>>>> >>>>>>>>>>>>>>>  Stack: heartbeat
>>>> >>>>>>>>>>>>>>>  Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>> >>>>>>>>>>>>>>>  Version: 1.1.12-561c4cf
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>  You probably need at least everything from the following change onward:
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>  https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>  That's all.
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>
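The point made above is that the running build (the hash after the dash in the crm_mon `Version:` line) has to include a particular upstream commit. A minimal sketch of how that could be checked, assuming a local clone of the Pacemaker repository is at hand: the function names and the clone path are hypothetical, while `git merge-base --is-ancestor` is a standard git subcommand.

```shell
#!/bin/sh
# build_hash: read crm_mon output on stdin and print the hash that follows
# the dash in the "Version:" line (561c4cf for "Version: 1.1.12-561c4cf").
build_hash() {
    sed -n 's/^Version: [0-9.]*-\([0-9a-f][0-9a-f]*\).*$/\1/p'
}

# contains_commit REPO REQUIRED BUILD: succeed if commit REQUIRED is an
# ancestor of (or the same as) commit BUILD in the git clone at REPO.
contains_commit() {
    git -C "$1" merge-base --is-ancestor "$2" "$3"
}

# Example (assumes a clone in ./pacemaker; the path is hypothetical):
#   hash=$(crm_mon -1 | build_hash)
#   contains_commit ./pacemaker f2302da063d08719d28367d8e362b8bfb0f85bf3 "$hash" \
#       && echo "build includes the required change"
```

The hash comparison only says whether the change is present in the source the build came from; it does not by itself show that the installed binaries were built from that source.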
>>>> >>>>>>>>>>>>>>  ----- Original Message -----
>>>> >>>>>>>>>>>>>>>  From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>> >>>>>>>>>>>>>>>  To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> >>>>>>>>>>>>>>>  Date: 2015/3/17, Tue 14:38
>>>> >>>>>>>>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  Yamauchi-san,
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  Hello, this is Fukuda.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  Do you mean that I should add -x to the shebang line of stonith-helper?
>>>> >>>>>>>>>>>>>>>  I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  crm_mon shows no change from before.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  # crm_mon -rfA
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  Last updated: Tue Mar 17 14:14:39 2015
>>>> >>>>>>>>>>>>>>>  Last change: Tue Mar 17 14:01:43 2015
>>>> >>>>>>>>>>>>>>>  Stack: heartbeat
>>>> >>>>>>>>>>>>>>>  Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>> >>>>>>>>>>>>>>>  Version: 1.1.12-561c4cf
>>>> >>>>>>>>>>>>>>>  2 Nodes configured
>>>> >>>>>>>>>>>>>>>  8 Resources configured
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  Online: [ lbv1.beta.com lbv2.beta.com ]
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  Full list of resources:
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>   Resource Group: HAvarnish
>>>> >>>>>>>>>>>>>>>       vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>> >>>>>>>>>>>>>>>       varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>> >>>>>>>>>>>>>>>   Resource Group: grpStonith1
>>>> >>>>>>>>>>>>>>>       Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>> >>>>>>>>>>>>>>>       Stonith1-2 (stonith:external/xen0):        Stopped
>>>> >>>>>>>>>>>>>>>   Resource Group: grpStonith2
>>>> >>>>>>>>>>>>>>>       Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>> >>>>>>>>>>>>>>>       Stonith2-2 (stonith:external/xen0):        Stopped
>>>> >>>>>>>>>>>>>>>   Clone Set: clone_ping [ping]
>>>> >>>>>>>>>>>>>>>       Started: [ lbv1.beta.com lbv2.beta.com ]
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  Node Attributes:
>>>> >>>>>>>>>>>>>>>  * Node lbv1.beta.com:
>>>> >>>>>>>>>>>>>>>      + default_ping_set                  : 100
>>>> >>>>>>>>>>>>>>>  * Node lbv2.beta.com:
>>>> >>>>>>>>>>>>>>>      + default_ping_set                  : 100
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  Migration summary:
>>>> >>>>>>>>>>>>>>>  * Node lbv2.beta.com:
>>>> >>>>>>>>>>>>>>>     Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
>>>> >>>>>>>>>>>>>>>  * Node lbv1.beta.com:
>>>> >>>>>>>>>>>>>>>     Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  Failed actions:
>>>> >>>>>>>>>>>>>>>      Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
>>>> >>>>>>>>>>>>>>>      Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  I looked for other logs.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  This is from heartbeat startup.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  # less /var/log/pm_logconv.out
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  # less /var/log/error
>>>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 crmd[13269]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  These are the syslog lines that match a grep for stonith:
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 13266)
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1 stonithd[13266]:   notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: setup_cib: Watching for stonith topology changes
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: unpack_config: On loss of CCM Quorum: Ignore
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
>>>> >>>>>>>>>>>>>>>  Mar 17 14:12:04 lbv1 stonithd[13266]:   notice: xml_patch_version_check: Versions did not change in patch 0.5.0
>>>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 stonithd[13266]:   notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
>>>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
>>>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed:  2 ]
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  Thank you in advance.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  That's all.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  March 17, 2015, 13:32 <renayama19661014@ybb.ne.jp>:
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>  Fukuda-san,
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>  Hello, this is Yamauchi.
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>  In that case, the problem seems to be in the start operation of stonith-helper.
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>  If you put
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>  #!/bin/bash -x
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>  at the top of stonith-helper and start the cluster, you may learn something.
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>  By the way, I think stonith-helper's own log is also being written somewhere...
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>  That's all.
>>>> >>>>>>>>>>>>>>>>
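With `-x` on the shebang line, bash prints each command to standard error before running it. When the script is launched by a cluster daemon rather than from a terminal, that stream may not end up anywhere easy to find; this is an assumption about the setup, since the thread does not say where the plug-in's stderr goes. A minimal sketch that sends the trace to a file instead (the function name and the log path are hypothetical):

```shell
#!/bin/sh
# trace_script LOG SCRIPT [ARGS...]: run SCRIPT under "bash -x" and append
# the xtrace output (which bash writes to stderr) to LOG. Stdout is untouched.
trace_script() {
    log=$1
    script=$2
    shift 2
    bash -x "$script" "$@" 2>>"$log"
}

# Alternative inside the script itself: redirect stderr first, then turn
# tracing on, so no wrapper is needed. The log path is only an example.
#   exec 2>>/tmp/stonith-helper.trace
#   set -x
```

The in-script variant has the advantage that it also works when something else decides how the script is invoked, which is the situation for a plug-in started by the cluster stack.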
>>>> >>>>>>>>>>>>>>>>  ----- Original Message -----
>>>> >>>>>>>>>>>>>>>>>  From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>> >>>>>>>>>>>>>>>>>  To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> >>>>>>>>>>>>>>>>>  Date: 2015/3/17, Tue 12:31
>>>> >>>>>>>>>>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>  Yamauchi-san,
>>>> >>>>>>>>>>>>>>>>>  cc: Matsushima-san
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>  Hello, this is Fukuda.
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>  xen0 is in the same directory.
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>  # pwd
>>>> >>>>>>>>>>>>>>>>>  /usr/local/heartbeat/lib/stonith/plugins/external
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>  # ls
>>>> >>>>>>>>>>>>>>>>>  drac5          ibmrsa         kdumpcheck  riloe           vmware
>>>> >>>>>>>>>>>>>>>>>  dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
>>>> >>>>>>>>>>>>>>>>>  hetzner        ipmi           nut         stonith-helper  xen0-ha
>>>> >>>>>>>>>>>>>>>>>  hmchttp        ippower9258    rackpdu     vcenter
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>  Thank you in advance.
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>  That's all.
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>  2015-03-17 10:53 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>  Fukuda-san,
>>>> >>>>>>>>>>>>>>>>>>  cc: Matsushima-san
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>  Hello, this is Yamauchi.
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  There was no standard output or standard error output.
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  Could something be wrong with stonith-helper?
>>>> >>>>>>>>>>>>>>>>>>>  Since stonith-helper is a shell script, I had not paid much attention to how it was installed.
>>>> >>>>>>>>>>>>>>>>>>>  stonith-helper is located here:
>>>> >>>>>>>>>>>>>>>>>>>  /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>  Is xen0 also in this directory?
>>>> >>>>>>>>>>>>>>>>>>  If it is not, that is a problem, so please try copying the stonith-helper file, with its attributes left as they are, into the same directory as xen0.
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>  If it works after that, the problem is in the pm_extras installation.
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>  That's all.
>>>> >>>>>>>>>>>>>>>>>>
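The check described above (are both plug-ins in the same directory, with usable attributes?) can be sketched as a small helper. This is an illustrative function, not something shipped with pm_extras; the directory in the example is the one quoted in the thread, and the copy source path is a placeholder.

```shell
#!/bin/sh
# check_plugins DIR NAME...: report, for each NAME, whether DIR/NAME exists
# as a regular executable file. Returns non-zero if any plug-in is unusable.
check_plugins() {
    dir=$1
    shift
    rc=0
    for name in "$@"; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "ok: $dir/$name"
        else
            echo "missing or not executable: $dir/$name"
            rc=1
        fi
    done
    return $rc
}

# Example with the directory quoted in the thread:
#   check_plugins /usr/local/heartbeat/lib/stonith/plugins/external stonith-helper xen0
# Copying while keeping mode, ownership and timestamps (source path is a placeholder):
#   cp -p /path/to/stonith-helper /usr/local/heartbeat/lib/stonith/plugins/external/
```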
>>>> >>>>>>>>>>>>>>>>>>  ----- Original Message -----
>>>> >>>>>>>>>>>>>>>>>>>  From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>> >>>>>>>>>>>>>>>>>>>  To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>> >>>>>>>>>>>>>>>>>>>  Date: 2015/3/17, Tue 10:31
>>>> >>>>>>>>>>>>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  Yamauchi-san,
>>>> >>>>>>>>>>>>>>>>>>>  cc: Matsushima-san
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  Good morning, this is Fukuda.
>>>> >>>>>>>>>>>>>>>>>>>  Thank you for the crm example.
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  I adapted it to our environment right away.
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  $ cat test.crm
>>>> >>>>>>>>>>>>>>>>>>>  ### Cluster Option ###
>>>> >>>>>>>>>>>>>>>>>>>  property \
>>>> >>>>>>>>>>>>>>>>>>>      no-quorum-policy="ignore" \
>>>> >>>>>>>>>>>>>>>>>>>      stonith-enabled="true" \
>>>> >>>>>>>>>>>>>>>>>>>      startup-fencing="false" \
>>>> >>>>>>>>>>>>>>>>>>>      stonith-timeout="710s" \
>>>> >>>>>>>>>>>>>>>>>>>      crmd-transition-delay="2s"
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  ### Resource Default ###
>>>> >>>>>>>>>>>>>>>>>>>  rsc_defaults \
>>>> >>>>>>>>>>>>>>>>>>>      resource-stickiness="INFINITY" \
>>>> >>>>>>>>>>>>>>>>>>>      migration-threshold="1"
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  ### Group Configuration ###
>>>> >>>>>>>>>>>>>>>>>>>  group HAvarnish \
>>>> >>>>>>>>>>>>>>>>>>>      vip_208 \
>>>> >>>>>>>>>>>>>>>>>>>      varnishd
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  group grpStonith1 \
>>>> >>>>>>>>>>>>>>>>>>>      Stonith1-1 \
>>>> >>>>>>>>>>>>>>>>>>>      Stonith1-2
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  group grpStonith2 \
>>>> >>>>>>>>>>>>>>>>>>>      Stonith2-1 \
>>>> >>>>>>>>>>>>>>>>>>>      Stonith2-2
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  ### Clone Configuration ###
>>>> >>>>>>>>>>>>>>>>>>>  clone clone_ping \
>>>> >>>>>>>>>>>>>>>>>>>      ping
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  ### Fencing Topology ###
>>>> >>>>>>>>>>>>>>>>>>>  fencing_topology \
>>>> >>>>>>>>>>>>>>>>>>>      lbv1.beta.com: Stonith1-1 Stonith1-2 \
>>>> >>>>>>>>>>>>>>>>>>>      lbv2.beta.com: Stonith2-1 Stonith2-2
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  ### Primitive Configuration ###
>>>> >>>>>>>>>>>>>>>>>>>  primitive vip_208 ocf:heartbeat:IPaddr2 \
>>>> >>>>>>>>>>>>>>>>>>>      params \
>>>> >>>>>>>>>>>>>>>>>>>          ip="192.168.17.208" \
>>>> >>>>>>>>>>>>>>>>>>>          nic="eth0" \
>>>> >>>>>>>>>>>>>>>>>>>          cidr_netmask="24" \
>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="90s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>      op monitor interval="5s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="100s" on-fail="fence"
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  primitive varnishd lsb:varnish \
>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="90s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>      op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="100s" on-fail="fence"
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  primitive ping ocf:pacemaker:ping \
>>>> >>>>>>>>>>>>>>>>>>>      params \
>>>> >>>>>>>>>>>>>>>>>>>          name="default_ping_set" \
>>>> >>>>>>>>>>>>>>>>>>>          host_list="192.168.17.254" \
>>>> >>>>>>>>>>>>>>>>>>>          multiplier="100" \
>>>> >>>>>>>>>>>>>>>>>>>          dampen="1" \
>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="90s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>      op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="100s" on-fail="fence"
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  primitive Stonith1-1 stonith:external/stonith-helper \
>>>> >>>>>>>>>>>>>>>>>>>      params \
>>>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_retries="1" \
>>>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="40s" \
>>>> >>>>>>>>>>>>>>>>>>>          hostlist="lbv1.beta.com" \
>>>> >>>>>>>>>>>>>>>>>>>          dead_check_target="192.168.17.132 10.0.17.132" \
>>>> >>>>>>>>>>>>>>>>>>>          standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>> >>>>>>>>>>>>>>>>>>>          run_online_check="yes" \
>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  primitive Stonith1-2 stonith:external/xen0 \
>>>> >>>>>>>>>>>>>>>>>>>      params \
>>>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="60s" \
>>>> >>>>>>>>>>>>>>>>>>>          hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
>>>> >>>>>>>>>>>>>>>>>>>          dom0="xen0.beta.com" \
>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>      op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  primitive Stonith2-1 stonith:external/stonith-helper \
>>>> >>>>>>>>>>>>>>>>>>>      params \
>>>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_retries="1" \
>>>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="40s" \
>>>> >>>>>>>>>>>>>>>>>>>          hostlist="lbv2.beta.com" \
>>>> >>>>>>>>>>>>>>>>>>>          dead_check_target="192.168.17.133 10.0.17.133" \
>>>> >>>>>>>>>>>>>>>>>>>          standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>> >>>>>>>>>>>>>>>>>>>          run_online_check="yes" \
>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  primitive Stonith2-2 stonith:external/xen0 \
>>>> >>>>>>>>>>>>>>>>>>>      params \
>>>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="60s" \
>>>> >>>>>>>>>>>>>>>>>>>          hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
>>>> >>>>>>>>>>>>>>>>>>>          dom0="xen0.beta.com" \
>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>      op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  ### Resource Location ###
>>>> >>>>>>>>>>>>>>>>>>>  location HA_location-1 HAvarnish \
>>>> >>>>>>>>>>>>>>>>>>>      rule 200: #uname eq lbv1.beta.com \
>>>> >>>>>>>>>>>>>>>>>>>      rule 100: #uname eq lbv2.beta.com
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  location HA_location-2 HAvarnish \
>>>> >>>>>>>>>>>>>>>>>>>      rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  location HA_location-3 grpStonith1 \
>>>> >>>>>>>>>>>>>>>>>>>      rule -INFINITY: #uname eq lbv1.beta.com
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  location HA_location-4 grpStonith2 \
>>>> >>>>>>>>>>>>>>>>>>>      rule -INFINITY: #uname eq lbv2.beta.com
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  After loading this, the messages differ from yesterday's.
>>>> >>>>>>>>>>>>>>>>>>>  The ping messages are gone.
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  # crm_mon -rfA
>>>> >>>>>>>>>>>>>>>>>>>  Last updated: Tue Mar 17 10:21:28 2015
>>>> >>>>>>>>>>>>>>>>>>>  Last change: Tue Mar 17 10:21:09 2015
>>>> >>>>>>>>>>>>>>>>>>>  Stack: heartbeat
>>>> >>>>>>>>>>>>>>>>>>>  Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>> >>>>>>>>>>>>>>>>>>>  Version: 1.1.12-561c4cf
>>>> >>>>>>>>>>>>>>>>>>>  2 Nodes configured
>>>> >>>>>>>>>>>>>>>>>>>  8 Resources configured
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  Online: [ lbv1.beta.com lbv2.beta.com ]
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  Full list of resources:
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>   Resource Group: HAvarnish
>>>> >>>>>>>>>>>>>>>>>>>       vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>> >>>>>>>>>>>>>>>>>>>       varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>> >>>>>>>>>>>>>>>>>>>   Resource Group: grpStonith1
>>>> >>>>>>>>>>>>>>>>>>>       Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>> >>>>>>>>>>>>>>>>>>>       Stonith1-2 (stonith:external/xen0):        Stopped
>>>> >>>>>>>>>>>>>>>>>>>   Resource Group: grpStonith2
>>>> >>>>>>>>>>>>>>>>>>>       Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>> >>>>>>>>>>>>>>>>>>>       Stonith2-2 (stonith:external/xen0):        Stopped
>>>> >>>>>>>>>>>>>>>>>>>   Clone Set: clone_ping [ping]
>>>> >>>>>>>>>>>>>>>>>>>       Started: [ lbv1.beta.com lbv2.beta.com ]
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  Node Attributes:
>>>> >>>>>>>>>>>>>>>>>>>  * Node lbv1.beta.com:
>>>> >>>>>>>>>>>>>>>>>>>      + default_ping_set                  : 100
>>>> >>>>>>>>>>>>>>>>>>>  * Node lbv2.beta.com:
>>>> >>>>>>>>>>>>>>>>>>>      + default_ping_set                  : 100
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  Migration summary:
>>>> >>>>>>>>>>>>>>>>>>>  * Node lbv2.beta.com:
>>>> >>>>>>>>>>>>>>>>>>>     Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>> >>>>>>>>>>>>>>>>>>>  * Node lbv1.beta.com:
>>>> >>>>>>>>>>>>>>>>>>>     Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  Failed actions:
>>>> >>>>>>>>>>>>>>>>>>>      Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
>>>> >>>>>>>>>>>>>>>>>>>      Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  Here is the log from /var/log/ha-debug.
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
>>>> >>>>>>>>>>>>>>>>>>>  IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
>>>> >>>>>>>>>>>>>>>>>>>  IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  There was no standard output or standard error output.
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  Could something be wrong with stonith-helper?
>>>> >>>>>>>>>>>>>>>>>>>  Since stonith-helper is a shell script, I had not paid much attention to how it was installed.
>>>> >>>>>>>>>>>>>>>>>>>  stonith-helper is located here:
>>>> >>>>>>>>>>>>>>>>>>>  /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  Thank you for your help.
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  Regards,
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  Fukuda-san
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  Good morning, this is Yamauchi.
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  Just in case, here is an excerpt of an example I have on hand that uses multiple stonith devices.
>>>> >>>>>>>>>>>>>>>>>>>>  (In actual use, please be careful with the line breaks.)
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  The example below is a configuration for the PM 1.1 series:
>>>> >>>>>>>>>>>>>>>>>>>>  for nodea, stonith is executed in the order prmStonith1-1, then prmStonith1-2;
>>>> >>>>>>>>>>>>>>>>>>>>  for nodeb, stonith is executed in the order prmStonith2-1, then prmStonith2-2.
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  The stonith devices themselves are helper and ssh.
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  (snip)
>>>> >>>>>>>>>>>>>>>>>>>>  ### Group Configuration ###
>>>> >>>>>>>>>>>>>>>>>>>>  group grpStonith1 \
>>>> >>>>>>>>>>>>>>>>>>>>      prmStonith1-1 \
>>>> >>>>>>>>>>>>>>>>>>>>      prmStonith1-2
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  group grpStonith2 \
>>>> >>>>>>>>>>>>>>>>>>>>      prmStonith2-1 \
>>>> >>>>>>>>>>>>>>>>>>>>      prmStonith2-2
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  ### Fencing Topology ###
>>>> >>>>>>>>>>>>>>>>>>>>  fencing_topology \
>>>> >>>>>>>>>>>>>>>>>>>>      nodea: prmStonith1-1 prmStonith1-2 \
>>>> >>>>>>>>>>>>>>>>>>>>      nodeb: prmStonith2-1 prmStonith2-2
>>>> >>>>>>>>>>>>>>>>>>>>  (snip)
>>>> >>>>>>>>>>>>>>>>>>>>  primitive prmStonith1-1 stonith:external/stonith-helper \
>>>> >>>>>>>>>>>>>>>>>>>>      params \
>>>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_retries="1" \
>>>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="40s" \
>>>> >>>>>>>>>>>>>>>>>>>>          hostlist="nodea" \
>>>> >>>>>>>>>>>>>>>>>>>>          dead_check_target="192.168.28.60 192.168.28.70" \
>>>> >>>>>>>>>>>>>>>>>>>>          standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>> >>>>>>>>>>>>>>>>>>>>          run_online_check="yes" \
>>>> >>>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  primitive prmStonith1-2 stonith:external/ssh \
>>>> >>>>>>>>>>>>>>>>>>>>      params \
>>>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="60s" \
>>>> >>>>>>>>>>>>>>>>>>>>          hostlist="nodea" \
>>>> >>>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>>      op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  primitive prmStonith2-1 stonith:external/stonith-helper \
>>>> >>>>>>>>>>>>>>>>>>>>      params \
>>>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_retries="1" \
>>>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="40s" \
>>>> >>>>>>>>>>>>>>>>>>>>          hostlist="nodeb" \
>>>> >>>>>>>>>>>>>>>>>>>>          dead_check_target="192.168.28.61 192.168.28.71" \
>>>> >>>>>>>>>>>>>>>>>>>>          standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>> >>>>>>>>>>>>>>>>>>>>          run_online_check="yes" \
>>>> >>>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  primitive prmStonith2-2 stonith:external/ssh \
>>>> >>>>>>>>>>>>>>>>>>>>      params \
>>>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="60s" \
>>>> >>>>>>>>>>>>>>>>>>>>          hostlist="nodeb" \
>>>> >>>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>>      op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>> >>>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>>>> >>>>>>>>>>>>>>>>>>>>  (snip)
>>>> >>>>>>>>>>>>>>>>>>>>  location rsc_location-grpStonith1-2 grpStonith1 \
>>>> >>>>>>>>>>>>>>>>>>>>      rule -INFINITY: #uname eq nodea
>>>> >>>>>>>>>>>>>>>>>>>>  location rsc_location-grpStonith2-3 grpStonith2 \
>>>> >>>>>>>>>>>>>>>>>>>>      rule -INFINITY: #uname eq nodeb
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>  That is all.
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>  --
>>>> >>>>>>>>>>>>>>>>>>>  ELF Systems
>>>> >>>>>>>>>>>>>>>>>>>  Masamichi Fukuda
>>>> >>>>>>>>>>>>>>>>>>>  mail to: masamichi_fukuda@elf-systems.com
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>  _______________________________________________
>>>> >>>>>>>>>>>>>>>>>>  Linux-ha-japan mailing list
>>>> >>>>>>>>>>>>>>>>>>  Linux-ha-japan@lists.sourceforge.jp
>>>> >>>>>>>>>>>>>>>>>>  http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan

_______________________________________________
Linux-ha-japan mailing list
Linux-ha-japan@lists.sourceforge.jp
http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
Re: スプリットブレイン時のSTONITHエラーについて [ In reply to ]
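Yamauchi-san

Good evening, this is Fukuda.
Thank you for testing on Debian.

> It appears that the new Pacemaker rng files (those after Pacemaker 1.1.12)
> are the cause.
> However, I have not yet found a workaround for this.

Please let me know once you find a workaround.

> However, the problem with the combination with the latest PM is not yet
> resolved, so whether this configuration (PM1.1.12+Heartbeat3.0.6) works
> correctly is a separate matter.
> #It probably appears to run, but I expect problems to show up.

I will revert to PM1.1.12 for now and repeat the same procedure.
First I will check whether stonith-helper runs.

> As for your stonith-helper failing on start, the likely cause is that
> the stonith command is not on the PATH.

I am embarrassed; it seems to be an elementary mistake.
I will try this as well.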
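
A minimal sketch of the PATH check discussed above, assuming the stonith binary lives in /usr/local/heartbeat/sbin as in the quoted find output. The helper name append_path_dir is hypothetical and is not part of Heartbeat or Pacemaker; the script only reports and prints a candidate PATH, it changes nothing.

```shell
#!/bin/sh
# Hypothetical helper (not part of Heartbeat/Pacemaker): print the PATH-like
# string $1 with directory $2 appended, unless $2 is already listed.
# A trailing slash on the directory is ignored when comparing.
append_path_dir() {
    current=$1
    dir=${2%/}
    case ":$current:" in
        *":$dir:"*|*":$dir/:"*) printf '%s\n' "$current" ;;
        *) printf '%s\n' "${current:+$current:}$dir" ;;
    esac
}

# Assumption: directory taken from the find output quoted in this thread.
STONITH_DIR=/usr/local/heartbeat/sbin

if command -v stonith >/dev/null 2>&1; then
    echo "stonith found: $(command -v stonith)"
else
    echo "stonith is not on PATH; a candidate PATH would be:"
    append_path_dir "$PATH" "$STONITH_DIR"
fi
```

The PATH that matters is presumably the one inherited by the heartbeat process, so the directory would need to be set in the environment heartbeat is started from, followed by a restart of heartbeat as described in the quoted message.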

Thank you for your help.

Regards,


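2015-03-18 17:56 <renayama19661014@ybb.ne.jp>:

> Fukuda-san
>
> Good evening, this is Yamauchi.
>
> The same situation occurred on my side as well.
> It appears that the new Pacemaker rng files (those after Pacemaker 1.1.12) are the cause.
> However, I have not yet found a workaround for this.
>
>
> Incidentally, with the combination of Pacemaker 1.1.12 and Heartbeat 3.0.6, which is not actually known to work properly, I confirmed on a single node that stonith-helper starts.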
>
> root@debian7-1:~# crm_mon -1 -Af
> Last updated: Wed Mar 18 17:43:37 2015
> Last change: Wed Mar 18 17:43:29 2015
> Stack: heartbeat
> Current DC: debian7-1 (d20c7df5-519e-4a4c-9b4b-1b88fc203133) - partition with quorum
> Version: 1.1.12-561c4cf
> 1 Nodes configured
> 3 Resources configured
>
>
> Online: [ debian7-1 ]
>
>  prmDummy (ocf::pacemaker:Dummy): Started debian7-1
>  Resource Group: grpStonith2
>      Stonith2-1 (stonith:external/stonith-helper): Started debian7-1
>
> Node Attributes:
> * Node debian7-1:
>
> Migration summary:
> * Node debian7-1:
>
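> There were some places where Matsushima-san's procedure did not work for me (probably because I am unfamiliar with Debian), but I used the same build options, installed everything, and copied only the stonith-helper included in the latest pm_extras_1.0 into the same directory as xen0.
> #If there is a problem with, for example, the execute permission of stonith-helper, please set it correctly.
>
>
> As for your stonith-helper failing on start, the likely cause is that the stonith command is not on the PATH.
>
> root@debian7-1:~# find / -name stonith -print
> /usr/local/heartbeat/sbin/stonith
>
> root@debian7-1:~# echo $PATH
> /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/heartbeat/sbin/
>
>
> After adding /usr/local/heartbeat/sbin to PATH and starting heartbeat again, I got the crm_mon display shown above.
>
> However, the problem with the combination with the latest PM is not yet resolved, so whether this configuration (PM1.1.12+Heartbeat3.0.6) works correctly is a separate matter.
> #It probably appears to run, but I expect problems to show up.
>
> Below is the crm file I loaded as a trial.
>
> (The values of parameters such as dead_check_target and standby_check_command were only there to confirm startup, so in this configuration they are actually meaningless.)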
>
> ### Cluster Option ###
> property \
> no-quorum-policy="ignore" \
> stonith-enabled="true" \
> startup-fencing="false"
>
> ### Resource Default ###
> rsc_defaults \
> resource-stickiness="INFINITY" \
> migration-threshold="1"
>
> ### Fencing Topology ###
> fencing_topology \
> debian7-1: Stonith1-1 \
> debian7-2: Stonith2-1
>
> group grpStonith1 \
> Stonith1-1
>
> group grpStonith2 \
> Stonith2-1
>
> primitive prmDummy ocf:pacemaker:Dummy \
> op start interval="0s" timeout="60s" on-fail="restart" \
> op monitor interval="3600s" timeout="60s" on-fail="restart" \
> op stop interval="0s" timeout="60s" on-fail="ignore"
>
> primitive Stonith1-1 stonith:external/stonith-helper \
> params \
> pcmk_reboot_retries="1" \
> pcmk_reboot_timeout="40s" \
> hostlist="debian7-1" \
> dead_check_target="192.168.3.1" \
> standby_wait_time="10" \
> standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> op start interval="0s" timeout="60s" on-fail="restart" \
> op monitor interval="3600s" timeout="60s" on-fail="restart" \
> op stop interval="0s" timeout="60s" on-fail="ignore"
>
> primitive Stonith2-1 stonith:external/stonith-helper \
> params \
> pcmk_reboot_retries="1" \
> pcmk_reboot_timeout="40s" \
> hostlist="debian7-2" \
> dead_check_target="192.168.3.1" \
> standby_wait_time="10" \
> standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> op start interval="0s" timeout="60s" on-fail="restart" \
> op monitor interval="3600s" timeout="60s" on-fail="restart" \
> op stop interval="0s" timeout="60s" on-fail="ignore"
>
>
> location HA_location-3 grpStonith1 \
> rule -INFINITY: #uname eq debian7-1
>
> location HA_location-4 grpStonith2 \
> rule -INFINITY: #uname eq debian7-2
>
>
> I will let you know if I find out anything more.
>
> That is all.
>
>
>
>
>
>
>
>
> ----- Original Message -----
> >From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >To: 山内英生 <renayama19661014@ybb.ne.jp>; "
> linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >Date: 2015/3/18, Wed 15:09
> >Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
> >
> >
> >Yamauchi-san
> >
> >Hello, this is Fukuda.
> >
> >I installed a fresh Debian 7.8 on VirtualBox and
> >installed heartbeat + pacemaker on it.
> >
> >
> >I did not install heartbeat, pacemaker, etc. from packages.
> >
> >
> >heartbeat starts, but loading the crm file produced errors.
> >
> >
> ># crm configure load update test1.crm
> >
> >ERROR: crmd:metadata: got no meta-data, does this RA exist?
> >ERROR: cib-bootstrap-options: attribute no-quorum-policy does not exist
> >ERROR: cib-bootstrap-options: attribute stonith-enabled does not exist
> >ERROR: cib-bootstrap-options: attribute crmd-transition-delay does not exist
> >ERROR: pengine:metadata: got no meta-data, does this RA exist?
> >
> >Could this be related to the agents under external not being recognized?
> >
> >Thank you for your help.
> >
> >Regards,
> >
> >
> >
> >
> >
> >2015-03-18 12:13 <renayama19661014@ybb.ne.jp>:
> >
> >Fukuda-san
> >>
> >>Hello, this is Yamauchi.
> >>
> >>Understood.
> >>Thank you for letting me know.
> >>
> >>That is all.
> >>
> >>
> >>
> >>----- Original Message -----
> >>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "
> linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>
> >>>Date: 2015/3/18, Wed 10:23
> >>>Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
> >>>
> >>>
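> >>>Yamauchi-san
> >>>
> >>>Hello, this is Fukuda.
> >>>
> >>>In our environment the following had been installed from packages,
> >>>so I first removed them with apt-get remove:
> >>>
> >>>heartbeat, libheartbeat2, pacemaker, corosync, resource-agents
> >>>
> >>>Also, the hacluster user and the haclient group had already been
> >>>created when the packages were installed.
> >>>
> >>>So I carried out Matsushima-san's procedure from
> >>>
> >>>Preparation
> >>>apt-get install build-essential mercurial git \
> >>>
> >>>onward. Everything after that is exactly the same procedure.
> >>>
> >>>Thank you for your help.
> >>>
> >>>Regards,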
> >>>
> >>>2015-03-18 10:06 <renayama19661014@ybb.ne.jp>:
> >>>>
> >>>> Fukuda-san
> >>>>
> >>>> Hello, this is Yamauchi.
> >>>>
> >>>> By the way, to double-check before I build it on my side as well: did you follow
> >>>> Matsushima-san's procedure as summarized below?
> >>>>
> >>>> * https://gist.github.com/takehironet/1469bd7123f63d61f843
> >>>>
> >>>> If there are any differences, please let me know again.
> >>>>
> >>>> #In particular, I recall that the initial apt-get of the build packages did not go well when I briefly tried it, so that part concerns me.
> >>>>
> >>>>
> >>>> That is all.
> >>>>
> >>>>
> >>>> ----- Original Message -----
> >>>> > From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
> >>>> > To: "linux-ha-japan@lists.sourceforge.jp" <
> linux-ha-japan@lists.sourceforge.jp>
> >>>> > Cc:
> >>>> > Date: 2015/3/18, Wed 09:53
> >>>> > Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
> >>>> >
> >>>> > Fukuda-san
> >>>> >
> >>>> > Hello, this is Yamauchi.
> >>>> >
> >>>> >> stonith -L does at least display the list of plugins.
> >>>> >>
> >>>> >> # /usr/local/heartbeat/sbin/stonith -L
> >>>> >
> >>>> > This should be a command included in the Heartbeat source, so I think it means there is no problem between Heartbeat and glue.
> >>>> >
> >>>> > So it seems more likely that the problem is in the pacemaker installation.
> >>>> >
> >>>> > Either way, I will find some time and try building it here as well.
> >>>> >
> >>>> > That is all.
> >>>> >
> >>>> >
> >>>> > ----- Original Message -----
> >>>> >> From: Masamichi Fukuda - elf-systems
> >>>> > <masamichi_fukuda@elf-systems.com>
> >>>> >> To: 山内英生 <renayama19661014@ybb.ne.jp>;
> >>>> > "linux-ha-japan@lists.sourceforge.jp"
> >>>> > <linux-ha-japan@lists.sourceforge.jp>
> >>>> >> Date: 2015/3/18, Wed 09:33
> >>>> >> Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
> >>>> >>
> >>>> >>
> >>>> >> Yamauchi-san
> >>>> >>
> >>>> >> Hello, this is Fukuda.
> >>>> >>
> >>>> >>> Reusable means glue.
> >>>> >>
> >>>> >> Understood. You mean Cluster-glue.
> >>>> >>
> >>>> >>> This is as expected; I think the agents under external are not recognized
> >>>> >>> and so they are not started.
> >>>> >>
> >>>> >> stonith -L does at least display the list of plugins.
> >>>> >>
> >>>> >> # /usr/local/heartbeat/sbin/stonith -L
> >>>> >> apcmaster
> >>>> >> apcsmart
> >>>> >> baytech
> >>>> >> cyclades
> >>>> >> external/drac5
> >>>> >> external/dracmc-telnet
> >>>> >> external/hetzner
> >>>> >> external/hmchttp
> >>>> >> external/ibmrsa
> >>>> >> external/ibmrsa-telnet
> >>>> >> external/ipmi
> >>>> >> external/ippower9258
> >>>> >> external/kdumpcheck
> >>>> >> external/libvirt
> >>>> >> external/nut
> >>>> >> external/rackpdu
> >>>> >> external/riloe
> >>>> >> external/ssh
> >>>> >> external/stonith-helper
> >>>> >> external/vcenter
> >>>> >> external/vmware
> >>>> >> external/xen0
> >>>> >> external/xen0-ha
> >>>> >> ibmhmc
> >>>> >> meatware
> >>>> >> null
> >>>> >> nw_rpc100s
> >>>> >> rcd_serial
> >>>> >> rps10
> >>>> >> ssh
> >>>> >> suicide
> >>>> >> wti_nps
> >>>> >>
> >>>> >>
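> >>>> >>> #I think I cannot say anything definite until I find time to build a
> >>>> >>> Debian 7 environment based on Matsushima-san's information.
> >>>> >>
> >>>> >> I am sorry to trouble you when you are busy.
> >>>> >> I will also review the installation on my side.
> >>>> >>
> >>>> >> Thank you for your help.
> >>>> >>
> >>>> >> Regards,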
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> 2015-03-18 9:02 <renayama19661014@ybb.ne.jp>:
> >>>> >>
> >>>> >> Fukuda-san
> >>>> >>>
> >>>> >>> Good morning, this is Yamauchi.
> >>>> >>>
> >>>> >>> I worded that badly.
> >>>> >>> Reusable means glue.
> >>>> >>>
> >>>> >>> There may be a problem with the pacemaker installation, but I cannot judge that at this point.
> >>>> >>>
> >>>> >>>
> >>>> >>>> I removed stonith-helper and started with only external/ssh,
> >>>> >>>> but the state in crm_mon did not change.
> >>>> >>>
> >>>> >>>
> >>>> >>> This is as expected; I think the agents under external are not recognized and so they are not started.
> >>>> >>>
> >>>> >>> #I think I cannot say anything definite until I find time to build a Debian 7 environment based on Matsushima-san's information.
> >>>> >>>
> >>>> >>> That is all.
> >>>> >>>
> >>>> >>>
> >>>> >>> ----- Original Message -----
> >>>> >>>> From: Masamichi Fukuda - elf-systems
> >>>> > <masamichi_fukuda@elf-systems.com>
> >>>> >>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
> >>>> > "linux-ha-japan@lists.sourceforge.jp"
> >>>> > <linux-ha-japan@lists.sourceforge.jp>
> >>>> >>>
> >>>> >>>> Date: 2015/3/18, Wed 08:12
> >>>> >>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>> >>>>
> >>>> >>>>
> >>>> >>>> Yamauchi-san,
> >>>> >>>>
> >>>> >>>> Good morning, this is Fukuda.
> >>>> >>>>
> >>>> >>>>> That suggests that neither xen0 nor stonith-helper is in the path that
> >>>> >>>>> Pacemaker manages for stonith plugins.
> >>>> >>>>>
> >>>> >>>>> I suspect the installation of Reusable and pm_extras.
> >>>> >>>>
> >>>> >>>> Is there a problem with the pacemaker installation?
> >>>> >>>> Also, is Reusable something that has to be installed separately?
> >>>> >>>>
> >>>> >>>> I removed stonith-helper and started the cluster with only external/ssh,
> >>>> >>>> but the status shown by crm_mon did not change.
> >>>> >>>>
> >>>> >>>> Last updated: Wed Mar 18 08:07:42 2015
> >>>> >>>> Last change: Wed Mar 18 08:04:48 2015
> >>>> >>>> Stack: heartbeat
> >>>> >>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
> >>>> >>>> Version: 1.1.12-e32080b
> >>>> >>>> 2 Nodes configured
> >>>> >>>> 6 Resources configured
> >>>> >>>>
> >>>> >>>>
> >>>> >>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>> >>>>
> >>>> >>>> Full list of resources:
> >>>> >>>>
> >>>> >>>> Stonith1-2 (stonith:external/ssh): Stopped
> >>>> >>>> Stonith2-2 (stonith:external/ssh): Stopped
> >>>> >>>> Resource Group: HAvarnish
> >>>> >>>> vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> >>>> >>>> varnishd (lsb:varnish): Started lbv1.beta.com
> >>>> >>>> Clone Set: clone_ping [ping]
> >>>> >>>> Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>> >>>>
> >>>> >>>> Node Attributes:
> >>>> >>>> * Node lbv1.beta.com:
> >>>> >>>> + default_ping_set : 100
> >>>> >>>> * Node lbv2.beta.com:
> >>>> >>>> + default_ping_set : 100
> >>>> >>>>
> >>>> >>>> Migration summary:
> >>>> >>>> * Node lbv2.beta.com:
> >>>> >>>> Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:07:32 2015'
> >>>> >>>> * Node lbv1.beta.com:
> >>>> >>>> Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:05:53 2015'
> >>>> >>>>
> >>>> >>>> Failed actions:
> >>>> >>>> Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:07:30 2015', queued=0ms, exec=1061ms
> >>>> >>>> Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:05:51 2015', queued=0ms, exec=1062ms
> >>>> >>>>
> >>>> >>>> Thank you in advance.
> >>>> >>>>
> >>>> >>>> Regards
> >>>> >>>>
> >>>> >>>>
> >>>> >>>> On 17 March 2015 at 23:51, <renayama19661014@ybb.ne.jp> wrote:
> >>>> >>>>
> >>>> >>>> Fukuda-san,
> >>>> >>>>>
> >>>> >>>>> Good evening, this is Yamauchi.
> >>>> >>>>>
> >>>> >>>>> That suggests that neither xen0 nor stonith-helper is in the path that Pacemaker manages for stonith plugins.
> >>>> >>>>>
> >>>> >>>>> I suspect the installation of Reusable and pm_extras.
> >>>> >>>>>
> >>>> >>>>> I will let you know if I find out anything more.
> >>>> >>>>>
> >>>> >>>>> That's all.
> >>>> >>>>>
> >>>> >>>>>
> >>>> >>>>>
> >>>> >>>>> ----- Original Message -----
> >>>> >>>>>> From: Masamichi Fukuda - elf-systems
> >>>> > <masamichi_fukuda@elf-systems.com>
> >>>> >>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
> >>>> > "linux-ha-japan@lists.sourceforge.jp"
> >>>> > <linux-ha-japan@lists.sourceforge.jp>
> >>>> >>>>>
> >>>> >>>>>> Date: 2015/3/17, Tue 23:46
> >>>> >>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>> >>>>>>
> >>>> >>>>>>
> >>>> >>>>>> Yamauchi-san,
> >>>> >>>>>>
> >>>> >>>>>> Good evening, this is Fukuda.
> >>>> >>>>>>
> >>>> >>>>>> Perhaps I am doing something wrong with the -x option for stonith-helper.
> >>>> >>>>>>
> >>>> >>>>>> I removed stonith-helper and started the cluster with only xen0.
> >>>> >>>>>>
> >>>> >>>>>> # crm_mon -rfA
> >>>> >>>>>>
> >>>> >>>>>> Last updated: Tue Mar 17 23:38:53 2015
> >>>> >>>>>> Last change: Tue Mar 17 23:30:34 2015
> >>>> >>>>>> Stack: heartbeat
> >>>> >>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
> >>>> >>>>>> Version: 1.1.12-e32080b
> >>>> >>>>>> 2 Nodes configured
> >>>> >>>>>> 6 Resources configured
> >>>> >>>>>>
> >>>> >>>>>>
> >>>> >>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>> >>>>>>
> >>>> >>>>>> Full list of resources:
> >>>> >>>>>>
> >>>> >>>>>> Stonith1-2 (stonith:external/xen0): Stopped
> >>>> >>>>>> Stonith2-2 (stonith:external/xen0): Stopped
> >>>> >>>>>> Resource Group: HAvarnish
> >>>> >>>>>> vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> >>>> >>>>>> varnishd (lsb:varnish): Started lbv1.beta.com
> >>>> >>>>>> Clone Set: clone_ping [ping]
> >>>> >>>>>> Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>> >>>>>>
> >>>> >>>>>> Node Attributes:
> >>>> >>>>>> * Node lbv1.beta.com:
> >>>> >>>>>> + default_ping_set : 100
> >>>> >>>>>> * Node lbv2.beta.com:
> >>>> >>>>>> + default_ping_set : 100
> >>>> >>>>>>
> >>>> >>>>>> Migration summary:
> >>>> >>>>>> * Node lbv1.beta.com:
> >>>> >>>>>> Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:34 2015'
> >>>> >>>>>> * Node lbv2.beta.com:
> >>>> >>>>>> Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:27 2015'
> >>>> >>>>>>
> >>>> >>>>>> Failed actions:
> >>>> >>>>>> Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32 2015', queued=0ms, exec=1061ms
> >>>> >>>>>> Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25 2015', queued=0ms, exec=1342ms
> >>>> >>>>>>
> >>>> >>>>>>
> >>>> >>>>>>
> >>>> >>>>>>
> >>>> >>>>>> The same failed actions appear as when stonith-helper was configured.
> >>>> >>>>>>
> >>>> >>>>>>
> >>>> >>>>>> Thank you in advance.
> >>>> >>>>>>
> >>>> >>>>>> Regards
> >>>> >>>>>>
> >>>> >>>>>>
> >>>> >>>>>>
> >>>> >>>>>>
> >>>> >>>>>> On 17 March 2015 at 22:38, <renayama19661014@ybb.ne.jp> wrote:
> >>>> >>>>>>
> >>>> >>>>>> Fukuda-san,
> >>>> >>>>>>>
> >>>> >>>>>>> Good evening, this is Yamauchi.
> >>>> >>>>>>>
> >>>> >>>>>>> By the way, if possible, checking what happens with external/stonith-helper removed
> >>>> >>>>>>> and only external/xen0 configured may help isolate the problem.
> >>>> >>>>>>>
> >>>> >>>>>>> That's all.
> >>>> >>>>>>>
> >>>> >>>>>>>
> >>>> >>>>>>>
> >>>> >>>>>>> ----- Original Message -----
> >>>> >>>>>>>
> >>>> >>>>>>>> From: "renayama19661014@ybb.ne.jp"
> >>>> > <renayama19661014@ybb.ne.jp>
> >>>> >>>>>>>> To: "linux-ha-japan@lists.sourceforge.jp"
> >>>> > <linux-ha-japan@lists.sourceforge.jp>
> >>>> >>>>>>>> Cc:
> >>>> >>>>>>>> Date: 2015/3/17, Tue 22:28
> >>>> >>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>> >>>>>>>>
> >>>> >>>>>>>> Fukuda-san,
> >>>> >>>>>>>>
> >>>> >>>>>>>> Good evening, this is Yamauchi.
> >>>> >>>>>>>>
> >>>> >>>>>>>> It looks like nothing has changed...
> >>>> >>>>>>>>
> >>>> >>>>>>>> For now, around tomorrow, I will check on RHEL whether stonith-helper works with the combination of
> >>>> >>>>>>>>
> >>>> >>>>>>>> Heartbeat 3.0.6
> >>>> >>>>>>>> the latest Pacemaker
> >>>> >>>>>>>>
> >>>> >>>>>>>> using a similar configuration (with Dummy resources, and external/ssh in place of external/xen0).
> >>>> >>>>>>>>
> >>>> >>>>>>>> # If we could see the -x trace output from stonith-helper, it would be easier to narrow down the problem...
> >>>> >>>>>>>>
> >>>> >>>>>>>>
> >>>> >>>>>>>> That's all.
> >>>> >>>>>>>>
> >>>> >>>>>>>>
> >>>> >>>>>>>>
> >>>> >>>>>>>> ----- Original Message -----
> >>>> >>>>>>>>> From: Masamichi Fukuda - elf-systems
> >>>> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >>>> >>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
> >>>> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >>>> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >>>> >>>>>>>>> Date: 2015/3/17, Tue 21:24
> >>>> >>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Yamauchi-san,
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Good evening, this is Fukuda.
> >>>> >>>>>>>>> Thank you for the information about the latest version.
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> I installed it right away.
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Here is the state after startup.
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> The failed actions appear unchanged.
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> # crm_mon -rfA
> >>>> >>>>>>>>> Last updated: Tue Mar 17 21:03:49 2015
> >>>> >>>>>>>>> Last change: Tue Mar 17 20:30:58 2015
> >>>> >>>>>>>>> Stack: heartbeat
> >>>> >>>>>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
> >>>> >>>>>>>>> Version: 1.1.12-e32080b
> >>>> >>>>>>>>> 2 Nodes configured
> >>>> >>>>>>>>> 8 Resources configured
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Full list of resources:
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Resource Group: HAvarnish
> >>>> >>>>>>>>> vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> >>>> >>>>>>>>> varnishd (lsb:varnish): Started lbv1.beta.com
> >>>> >>>>>>>>> Resource Group: grpStonith1
> >>>> >>>>>>>>> Stonith1-1 (stonith:external/stonith-helper): Stopped
> >>>> >>>>>>>>> Stonith1-2 (stonith:external/xen0): Stopped
> >>>> >>>>>>>>> Resource Group: grpStonith2
> >>>> >>>>>>>>> Stonith2-1 (stonith:external/stonith-helper): Stopped
> >>>> >>>>>>>>> Stonith2-2 (stonith:external/xen0): Stopped
> >>>> >>>>>>>>> Clone Set: clone_ping [ping]
> >>>> >>>>>>>>> Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Node Attributes:
> >>>> >>>>>>>>> * Node lbv1.beta.com:
> >>>> >>>>>>>>> + default_ping_set : 100
> >>>> >>>>>>>>> * Node lbv2.beta.com:
> >>>> >>>>>>>>> + default_ping_set : 100
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Migration summary:
> >>>> >>>>>>>>> * Node lbv1.beta.com:
> >>>> >>>>>>>>> Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
> >>>> >>>>>>>>> * Node lbv2.beta.com:
> >>>> >>>>>>>>> Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Failed actions:
> >>>> >>>>>>>>> Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
> >>>> >>>>>>>>> Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Here are the logs.
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> # less /var/log/ha-debug
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
> >>>> >>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
> >>>> >>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist=
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109 gid 113 (pid 4250)
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109 gid 113 (pid 4246)
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109 gid 113 (pid 4249)
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109 gid 113 (pid 4245)
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0 gid 0 (pid 4248)
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0 (pid 4247)
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname: lbv1.beta.com
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client ccm is set to 1024
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client attrd is set to 1024
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client stonith-ng is set to 1024
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status active
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client cib is set to 1024
> >>>> >>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [15:17]
> >>>> >>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >>>> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [19:21]
> >>>> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >>>> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client crmd is set to 1024
> >>>> >>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [24:26]
> >>>> >>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >>>> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [26:28]
> >>>> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >>>> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [30:32]
> >>>> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> # less /var/log/error
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >>>> >>>>>>>>> Mar 17 21:02:48 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >>>> >>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >>>> >>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >>>> >>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
> >>>> >>>>>>>>> Mar 17 21:03:24 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
> >>>> >>>>>>>>> Mar 17 21:03:27 lbv1 crmd[4250]: notice: run_graph: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
> >>>> >>>>>>>>> Mar 17 21:03:29 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
> >>>> >>>>>>>>> Mar 17 21:03:34 lbv1 crmd[4250]: notice: run_graph: Transition 1 (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
> >>>> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
> >>>> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
> >>>> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
> >>>> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
> >>>> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
> >>>> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
> >>>> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ failed: 2 ]
> >>>> >>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
> >>>> >>>>>>>>> Mar 17 21:03:40 lbv1 crmd[4250]: notice: run_graph: Transition 2 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
> >>>> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
> >>>> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
> >>>> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
> >>>> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
> >>>> >>>>>>>>> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
> >>>> >>>>>>>>> Mar 17 21:03:47 lbv1 crmd[4250]: notice: run_graph: Transition 3 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
> >>>> >>>>>>>>>
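The syslog excerpt above reports that stonith-ng failed to exec "stonith", while elsewhere in this thread the binary is invoked by its full path, /usr/local/heartbeat/sbin/stonith. One possibility, offered as an assumption rather than a confirmed diagnosis, is that this directory is not on the PATH seen by stonith-ng. A minimal sketch of a lookup over a PATH-style string follows; the helper name find_in_path and the example PATH value are hypothetical.

```shell
# Search a colon-separated PATH-style string for an executable file.
# Prints the full path and returns 0 if found; returns 1 otherwise.
find_in_path() {
    name=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        # Accept only executable non-directories.
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example with an assumed daemon PATH (not taken from the logs).
find_in_path stonith "/sbin:/bin:/usr/sbin:/usr/bin" \
    || echo "stonith not found on the assumed daemon PATH"
```

If the lookup fails for the PATH the daemon actually uses, adding the install prefix's sbin directory to that PATH, or linking the binary into a standard directory, would be worth trying.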
> >>>> >>>>>>>>> Thank you in advance.
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Regards
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> On 17 March 2015 at 18:31, <renayama19661014@ybb.ne.jp> wrote:
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Fukuda-san,
> >>>> >>>>>>>>>>
> >>>> >>>>>>>>>> Good evening, this is Yamauchi.
> >>>> >>>>>>>>>>
> >>>> >>>>>>>>>> Nothing newer has been tagged, so the latest version as of today is
> >>>> >>>>>>>>>>
> >>>> >>>>>>>>>> * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
> >>>> >>>>>>>>>>
> >>>> >>>>>>>>>> You can download it with the [Download ZIP] button on the right.
> >>>> >>>>>>>>>>
> >>>> >>>>>>>>>> That's all.
> >>>> >>>>>>>>>>
> >>>> >>>>>>>>>>
> >>>> >>>>>>>>>> ----- Original Message -----
> >>>> >>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>> >>>>>>>>>>> To: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>> >>>>>>>>>>> Date: 2015/3/17, Tue 18:07
> >>>> >>>>>>>>>>> Subject: STONITH errors during split-brain
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>> Yamauchi-san,
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>> Hello, this is Fukuda.
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>> I looked at this page:
> >>>> >>>>>>>>>>> https://github.com/ClusterLabs/pacemaker/tags
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>> pacemaker 1.1.12 561c4cf appears to be the latest there.
> >>>> >>>>>>>>>>> Could you tell me where to find anything newer than that?
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>> Thank you in advance.
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>> Regards
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>> On Tuesday, 17 March 2015, <renayama19661014@ybb.ne.jp> wrote:
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>> Fukuda-san,
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>> Hello, this is Yamauchi.
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>> Yes, it is old.
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>> Pacemaker gained support for Heartbeat 3.0.6 only fairly recently.
> >>>> >>>>>>>>>>>> Please install something newer (you will have to build from source again, though...).
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>> It is available from the upstream GitHub repository:
> >>>> >>>>>>>>>>>> * https://github.com/ClusterLabs/pacemaker
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>> The latest master may occasionally produce errors; in that case, I suggest
> >>>> >>>>>>>>>>>> stepping back through older revisions.
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>> That's all.
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>> ----- Original Message -----
> >>>> >>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>> >>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>> >>>>>>>>>>>>> Date: 2015/3/17, Tue 16:06
> >>>> >>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>> >>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>> Yamauchi-san,
> >>>> >>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>> Hello, this is Fukuda.
> >>>> >>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>> In an earlier mail you advised me to install the latest versions of heartbeat and pacemaker.
> >>>> >>>>>>>>>>>>> So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12:
> >>>> >>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>> heartbeat configuration: Version = "3.0.6"
> >>>> >>>>>>>>>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
> >>>> >>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>> Does that mean pacemaker is still too old?
> >>>> >>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>> Thank you in advance.
> >>>> >>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>> Regards
> >>>> >>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>> On 17 March 2015 at 14:59, <renayama19661014@ybb.ne.jp> wrote:
> >>>> >>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>> Fukuda-san,
> >>>> >>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>> Hello, this is Yamauchi.
> >>>> >>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>> It just occurred to me: in our earlier exchange I answered as follows. Has this been taken care of?
> >>>> >>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>> 2) Heartbeat 3.0.6 + latest Pacemaker: OK
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>> It appears that Heartbeat also has to be the latest version, 3.0.6.
> >>>> >>>>>>>>>>>>>>>>>>>> * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
> >>>> >>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>> Judging from the crm_mon output below, the version is 1.1.12.
> >>>> >>>>>>>>>>>>>> Combining it with Heartbeat 3.0.6 requires a fairly recent Pacemaker.
> >>>> >>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> # crm_mon -rfA
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
> >>>> >>>>>>>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
> >>>> >>>>>>>>>>>>>>> Stack: heartbeat
> >>>> >>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
> >>>> >>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
> >>>> >>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>> You probably need at least a revision from the following change onward:
> >>>> >>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
> >>>> >>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>> That's all.
> >>>> >>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>> ----- Original Message -----
> >>>> >>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>> >>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>> >>>>>>>>>>>>>>> Date: 2015/3/17, Tue 14:38
> >>>> >>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Yamauchi-san,
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Hello, this is Fukuda.
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Should I add -x to the shebang line of stonith-helper?
> >>>> >>>>>>>>>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> crm_mon shows no change from before.
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> # crm_mon -rfA
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
> >>>> >>>>>>>>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
> >>>> >>>>>>>>>>>>>>> Stack: heartbeat
> >>>> >>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
> >>>> >>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
> >>>> >>>>>>>>>>>>>>> 2 Nodes configured
> >>>> >>>>>>>>>>>>>>> 8 Resources configured
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Full list of resources:
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>  Resource Group: HAvarnish
> >>>> >>>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
> >>>> >>>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
> >>>> >>>>>>>>>>>>>>>  Resource Group: grpStonith1
> >>>> >>>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
> >>>> >>>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
> >>>> >>>>>>>>>>>>>>>  Resource Group: grpStonith2
> >>>> >>>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
> >>>> >>>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
> >>>> >>>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
> >>>> >>>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Node Attributes:
> >>>> >>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>> >>>>>>>>>>>>>>>     + default_ping_set                  : 100
> >>>> >>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>> >>>>>>>>>>>>>>>     + default_ping_set                  : 100
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Migration summary:
> >>>> >>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>> >>>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
> >>>> >>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>> >>>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Failed actions:
> >>>> >>>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
> >>>> >>>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> I also looked through the other logs.
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> This is from Heartbeat startup:
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> # less /var/log/pm_logconv.out
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> # less /var/log/error
> >>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 crmd[13269]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> These are the syslog lines that match a grep for stonith:
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0 (pid 13266)
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 stonithd[13266]: notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: notice: setup_cib: Watching for stonith topology changes
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: notice: unpack_config: On loss of CCM Quorum: Ignore
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]: notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]: notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
> >>>> >>>>>>>>>>>>>>> Mar 17 14:12:04 lbv1 stonithd[13266]: notice: xml_patch_version_check: Versions did not change in patch 0.5.0
> >>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
> >>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
> >>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
> >>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ failed: 2 ]
> >>>> >>>>>>>>>>>>>>>
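The last stonithd lines in the log above show `stonith -t external/stonith-helper -S` being attempted and then `failed to exec "stonith"` with `failed: 2`. If that 2 is an errno value it would be ENOENT, which would suggest the `stonith` command is not on the PATH that stonithd sees; this reading is an assumption, not something confirmed in the thread. A minimal sketch for checking it follows. `find_cmd_in_path` is a hypothetical helper written for this illustration only.

```shell
#!/bin/sh
# find_cmd_in_path NAME PATHLIST
# Prints the first executable regular file called NAME found in the
# colon-separated PATHLIST and returns 0; returns 1 if none is found.
# (Hypothetical helper, not part of Heartbeat or Pacemaker.)
find_cmd_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -n "$dir" ] && [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Look for "stonith" in the PATH of the current shell.
if found=$(find_cmd_in_path stonith "$PATH"); then
    echo "stonith found: $found"
else
    echo "stonith is not on PATH: $PATH"
fi
```

The PATH of an interactive root shell can differ from the PATH inherited by daemons started by heartbeat, so a "found" result here does not by itself rule the problem out.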
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Thank you in advance.
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Regards,
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> 2015-03-17 13:32 <renayama19661014@ybb.ne.jp>:
> >>>> >>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>> Fukuda-san,
> >>>> >>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>> Hello, this is Yamauchi.
> >>>> >>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>> In that case, the problem appears to be in the start operation of stonith-helper.
> >>>> >>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>> If you put
> >>>> >>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>> #!/bin/bash -x
> >>>> >>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>> at the top of stonith-helper and start the cluster, it may tell us something.
> >>>> >>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>> Incidentally, I would expect stonith-helper's own log output to appear somewhere as well...
> >>>> >>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>> That is all.
> >>>> >>>>>>>>>>>>>>>>
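With `#!/bin/bash -x`, the trace is written to standard error, which may not end up anywhere visible when the script is launched by a daemon rather than from a terminal. A minimal sketch, assuming it is acceptable to add two lines near the top of the script temporarily, that appends the trace to a file instead. The path in `TRACE_FILE` is a hypothetical choice, and the `echo` is only a stand-in for the real script body.

```shell
#!/bin/bash
# Hypothetical trace file; any location writable by the calling user works.
TRACE_FILE="${TRACE_FILE:-/tmp/stonith-helper.trace}"

# Append stderr (where the shell writes its xtrace output) to the file,
# then switch tracing on for everything that follows.
exec 2>>"$TRACE_FILE"
set -x

# Stand-in for the script's real work.
echo "plugin body would run here"
```

After a failed start, the trace file should show each command the script ran up to the point where it exited.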
> >>>> >>>>>>>>>>>>>>>> ----- Original Message -----
> >>>> >>>>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>> >>>>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>> >>>>>>>>>>>>>>>>> Date: 2015/3/17, Tue 12:31
> >>>> >>>>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH error during split-brain
> >>>> >>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>> Yamauchi-san,
> >>>> >>>>>>>>>>>>>>>>> cc: Matsushima-san
> >>>> >>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>> Hello, this is Fukuda.
> >>>> >>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>> xen0 was in the same directory.
> >>>> >>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>> # pwd
> >>>> >>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external
> >>>> >>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>> # ls
> >>>> >>>>>>>>>>>>>>>>> drac5          ibmrsa         kdumpcheck  riloe           vmware
> >>>> >>>>>>>>>>>>>>>>> dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
> >>>> >>>>>>>>>>>>>>>>> hetzner        ipmi           nut         stonith-helper  xen0-ha
> >>>> >>>>>>>>>>>>>>>>> hmchttp        ippower9258    rackpdu     vcenter
> >>>> >>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>> Thank you in advance.
> >>>> >>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>> Regards,
> >>>> >>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>> 2015-03-17 10:53 GMT+09:00 <renayama19661014@ybb.ne.jp>:
> >>>> >>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>> Fukuda-san,
> >>>> >>>>>>>>>>>>>>>>>> cc: Matsushima-san
> >>>> >>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>> Hello, this is Yamauchi.
> >>>> >>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> There was no standard output or standard error output.
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
> >>>> >>>>>>>>>>>>>>>>>>> stonith-helper is a shell script, so I did not pay much attention to how it was installed.
> >>>> >>>>>>>>>>>>>>>>>>> stonith-helper is located here:
> >>>> >>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >>>> >>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>> Is xen0 in this directory as well?
> >>>> >>>>>>>>>>>>>>>>>> If it is not, that is a problem, so please try copying the stonith-helper file, with its attributes unchanged, into the same directory as xen0.
> >>>> >>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>> If it works after that, the problem lies in the pm_extras installation.
> >>>> >>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>> That is all.
> >>>> >>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>> ----- Original Message -----
> >>>> >>>>>>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>> >>>>>>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>> >>>>>>>>>>>>>>>>>>> Date: 2015/3/17, Tue 10:31
> >>>> >>>>>>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH error during split-brain
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> Yamauchi-san,
> >>>> >>>>>>>>>>>>>>>>>>> cc: Matsushima-san
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> Good morning, this is Fukuda.
> >>>> >>>>>>>>>>>>>>>>>>> Thank you for the crm example.
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> I adapted it to my environment right away.
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> $ cat test.crm
> >>>> >>>>>>>>>>>>>>>>>>> ### Cluster Option ###
> >>>> >>>>>>>>>>>>>>>>>>> property \
> >>>> >>>>>>>>>>>>>>>>>>>     no-quorum-policy="ignore" \
> >>>> >>>>>>>>>>>>>>>>>>>     stonith-enabled="true" \
> >>>> >>>>>>>>>>>>>>>>>>>     startup-fencing="false" \
> >>>> >>>>>>>>>>>>>>>>>>>     stonith-timeout="710s" \
> >>>> >>>>>>>>>>>>>>>>>>>     crmd-transition-delay="2s"
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> ### Resource Default ###
> >>>> >>>>>>>>>>>>>>>>>>> rsc_defaults \
> >>>> >>>>>>>>>>>>>>>>>>>     resource-stickiness="INFINITY" \
> >>>> >>>>>>>>>>>>>>>>>>>     migration-threshold="1"
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> ### Group Configuration ###
> >>>> >>>>>>>>>>>>>>>>>>> group HAvarnish \
> >>>> >>>>>>>>>>>>>>>>>>>     vip_208 \
> >>>> >>>>>>>>>>>>>>>>>>>     varnishd
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> group grpStonith1 \
> >>>> >>>>>>>>>>>>>>>>>>>     Stonith1-1 \
> >>>> >>>>>>>>>>>>>>>>>>>     Stonith1-2
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> group grpStonith2 \
> >>>> >>>>>>>>>>>>>>>>>>>     Stonith2-1 \
> >>>> >>>>>>>>>>>>>>>>>>>     Stonith2-2
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> ### Clone Configuration ###
> >>>> >>>>>>>>>>>>>>>>>>> clone clone_ping \
> >>>> >>>>>>>>>>>>>>>>>>>     ping
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> ### Fencing Topology ###
> >>>> >>>>>>>>>>>>>>>>>>> fencing_topology \
> >>>> >>>>>>>>>>>>>>>>>>>     lbv1.beta.com: Stonith1-1 Stonith1-2 \
> >>>> >>>>>>>>>>>>>>>>>>>     lbv2.beta.com: Stonith2-1 Stonith2-2
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> ### Primitive Configuration ###
> >>>> >>>>>>>>>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
> >>>> >>>>>>>>>>>>>>>>>>>     params \
> >>>> >>>>>>>>>>>>>>>>>>>         ip="192.168.17.208" \
> >>>> >>>>>>>>>>>>>>>>>>>         nic="eth0" \
> >>>> >>>>>>>>>>>>>>>>>>>         cidr_netmask="24" \
> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="5s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> primitive varnishd lsb:varnish \
> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
> >>>> >>>>>>>>>>>>>>>>>>>     params \
> >>>> >>>>>>>>>>>>>>>>>>>         name="default_ping_set" \
> >>>> >>>>>>>>>>>>>>>>>>>         host_list="192.168.17.254" \
> >>>> >>>>>>>>>>>>>>>>>>>         multiplier="100" \
> >>>> >>>>>>>>>>>>>>>>>>>         dampen="1" \
> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
> >>>> >>>>>>>>>>>>>>>>>>>     params \
> >>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>> >>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com" \
> >>>> >>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.132 10.0.17.132" \
> >>>> >>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>>> >>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
> >>>> >>>>>>>>>>>>>>>>>>>     params \
> >>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>> >>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
> >>>> >>>>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
> >>>> >>>>>>>>>>>>>>>>>>>     params \
> >>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>> >>>>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com" \
> >>>> >>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.133 10.0.17.133" \
> >>>> >>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>>> >>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> primitive Stonith2-2 stonith:external/xen0 \
> >>>> >>>>>>>>>>>>>>>>>>>     params \
> >>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>> >>>>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
> >>>> >>>>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> ### Resource Location ###
> >>>> >>>>>>>>>>>>>>>>>>> location HA_location-1 HAvarnish \
> >>>> >>>>>>>>>>>>>>>>>>>     rule 200: #uname eq lbv1.beta.com \
> >>>> >>>>>>>>>>>>>>>>>>>     rule 100: #uname eq lbv2.beta.com
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> location HA_location-2 HAvarnish \
> >>>> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> location HA_location-3 grpStonith1 \
> >>>> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv1.beta.com
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> location HA_location-4 grpStonith2 \
> >>>> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv2.beta.com
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> After loading this configuration, the messages differ from yesterday's.
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> The ping messages are gone.
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> # crm_mon -rfA
> >>>> >>>>>>>>>>>>>>>>>>> Last updated: Tue Mar 17 10:21:28 2015
> >>>> >>>>>>>>>>>>>>>>>>> Last change: Tue Mar 17 10:21:09 2015
> >>>> >>>>>>>>>>>>>>>>>>> Stack: heartbeat
> >>>> >>>>>>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
> >>>> >>>>>>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
> >>>> >>>>>>>>>>>>>>>>>>> 2 Nodes configured
> >>>> >>>>>>>>>>>>>>>>>>> 8 Resources configured
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> Full list of resources:
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>  Resource Group: HAvarnish
> >>>> >>>>>>>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
> >>>> >>>>>>>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
> >>>> >>>>>>>>>>>>>>>>>>>  Resource Group: grpStonith1
> >>>> >>>>>>>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
> >>>> >>>>>>>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
> >>>> >>>>>>>>>>>>>>>>>>>  Resource Group: grpStonith2
> >>>> >>>>>>>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
> >>>> >>>>>>>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
> >>>> >>>>>>>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
> >>>> >>>>>>>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> Node Attributes:
> >>>> >>>>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>> >>>>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
> >>>> >>>>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>> >>>>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> Migration summary:
> >>>> >>>>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>> >>>>>>>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
> >>>> >>>>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>> >>>>>>>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> Failed actions:
> >>>> >>>>>>>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
> >>>> >>>>>>>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> This is the log from /var/log/ha-debug:
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
> >>>> >>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
> >>>> >>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> There was no standard output or standard error output.
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
> >>>> >>>>>>>>>>>>>>>>>>> stonith-helper is a shell script, so I did not pay much attention to how it was installed.
> >>>> >>>>>>>>>>>>>>>>>>> stonith-helper is located here:
> >>>> >>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> Thank you in advance.
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> Regards,
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> 2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> Fukuda-san,
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>> Good morning, this is Yamauchi.
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>> Just in case, here is an excerpt of an example I have on hand that uses multiple stonith devices.
> >>>> >>>>>>>>>>>>>>>>>>>> (Watch the line breaks when you actually use it.)
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>> The example below is a configuration for the Pacemaker 1.1 series:
> >>>> >>>>>>>>>>>>>>>>>>>> for nodea, stonith is executed in the order prmStonith1-1, then prmStonith1-2;
> >>>> >>>>>>>>>>>>>>>>>>>> for nodeb, stonith is executed in the order prmStonith2-1, then prmStonith2-2.
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>> The stonith plugins themselves are helper and ssh.
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>> (snip)
> >>>> >>>>>>>>>>>>>>>>>>>> ### Group Configuration ###
> >>>> >>>>>>>>>>>>>>>>>>>> group grpStonith1 \
> >>>> >>>>>>>>>>>>>>>>>>>>     prmStonith1-1 \
> >>>> >>>>>>>>>>>>>>>>>>>>     prmStonith1-2
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>> group grpStonith2 \
> >>>> >>>>>>>>>>>>>>>>>>>>     prmStonith2-1 \
> >>>> >>>>>>>>>>>>>>>>>>>>     prmStonith2-2
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>> ### Fencing Topology ###
> >>>> >>>>>>>>>>>>>>>>>>>> fencing_topology \
> >>>> >>>>>>>>>>>>>>>>>>>>     nodea: prmStonith1-1 prmStonith1-2 \
> >>>> >>>>>>>>>>>>>>>>>>>>     nodeb: prmStonith2-1 prmStonith2-2
> >>>> >>>>>>>>>>>>>>>>>>>> (snip)
> >>>> >>>>>>>>>>>>>>>>>>>> primitive prmStonith1-1 stonith:external/stonith-helper \
> >>>> >>>>>>>>>>>>>>>>>>>>     params \
> >>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>> >>>>>>>>>>>>>>>>>>>>         hostlist="nodea" \
> >>>> >>>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.28.60 192.168.28.70" \
> >>>> >>>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >>>> >>>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>> primitive prmStonith1-2 stonith:external/ssh \
> >>>> >>>>>>>>>>>>>>>>>>>>     params \
> >>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>> >>>>>>>>>>>>>>>>>>>>         hostlist="nodea" \
> >>>> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>> primitive prmStonith2-1 stonith:external/stonith-helper \
> >>>> >>>>>>>>>>>>>>>>>>>>     params \
> >>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>> >>>>>>>>>>>>>>>>>>>>         hostlist="nodeb" \
> >>>> >>>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.28.61 192.168.28.71" \
> >>>> >>>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >>>> >>>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>> primitive prmStonith2-2 stonith:external/ssh \
> >>>> >>>>>>>>>>>>>>>>>>>>     params \
> >>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>> >>>>>>>>>>>>>>>>>>>>         hostlist="nodeb" \
> >>>> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>> >>>>>>>>>>>>>>>>>>>> (snip)
> >>>> >>>>>>>>>>>>>>>>>>>> location rsc_location-grpStonith1-2 grpStonith1 \
> >>>> >>>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodea
> >>>> >>>>>>>>>>>>>>>>>>>> location rsc_location-grpStonith2-3 grpStonith2 \
> >>>> >>>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodeb
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>> That is all.
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> --
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>> ELF Systems
> >>>> >>>>>>>>>>>>>>>>>>> Masamichi Fukuda
> >>>> >>>>>>>>>>>>>>>>>>> mail to: masamichi_fukuda@elf-systems.com
> >>>> >>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>> _______________________________________________
> >>>> >>>>>>>>>>>>>>>>>> Linux-ha-japan mailing list
> >>>> >>>>>>>>>>>>>>>>>> Linux-ha-japan@lists.sourceforge.jp
> >>>> >>>>>>>>>>>>>>>>>> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>> >>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> --
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> ELF Systems
> >>>> >>>>>>>>> Masamichi Fukuda
> >>>> >>>>>>>>> mail to: masamichi_fukuda@elf-systems.com
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>
> >>>> >>>>>>>> _______________________________________________
> >>>> >>>>>>>> Linux-ha-japan mailing list
> >>>> >>>>>>>> Linux-ha-japan@lists.sourceforge.jp
> >>>> >>>>>>>>
> >>>> > http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>> >>>>>>>>
> >>>> >>>>>>>
> >>>> >>>>>>> _______________________________________________
> >>>> >>>>>>> Linux-ha-japan mailing list
> >>>> >>>>>>> Linux-ha-japan@lists.sourceforge.jp
> >>>> >>>>>>> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>> >>>>>>>
> >>>> >>>>>>
> >>>> >>>>>>
> >>>> >>>>>> --
> >>>> >>>>>>
> >>>> >>>>>> ELF Systems
> >>>> >>>>>> Masamichi Fukuda
> >>>> >>>>>> mail to: masamichi_fukuda@elf-systems.com
> >>>> >>>>>>
> >>>> >>>>>>
> >>>> >>>>>
> >>>> >>>>> _______________________________________________
> >>>> >>>>> Linux-ha-japan mailing list
> >>>> >>>>> Linux-ha-japan@lists.sourceforge.jp
> >>>> >>>>> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>> >>>>>
> >>>> >>>>
> >>>> >>>>
> >>>> >>>> --
> >>>> >>>>
> >>>> >>>> ELF Systems
> >>>> >>>> Masamichi Fukuda
> >>>> >>>> mail to: masamichi_fukuda@elf-systems.com
> >>>> >>>>
> >>>> >>>>
> >>>> >>>
> >>>> >>> _______________________________________________
> >>>> >>> Linux-ha-japan mailing list
> >>>> >>> Linux-ha-japan@lists.sourceforge.jp
> >>>> >>> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>> >>>
> >>>> >>
> >>>> >>
> >>>> >> --
> >>>> >>
> >>>> >> ELF Systems
> >>>> >> Masamichi Fukuda
> >>>> >> mail to: masamichi_fukuda@elf-systems.com
> >>>> >>
> >>>> >>
> >>>> >
> >>>> > _______________________________________________
> >>>> > Linux-ha-japan mailing list
> >>>> > Linux-ha-japan@lists.sourceforge.jp
> >>>> > http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>> >
> >>>>
> >>>> _______________________________________________
> >>>> Linux-ha-japan mailing list
> >>>> Linux-ha-japan@lists.sourceforge.jp
> >>>> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>
> >>>
> >>>
> >>>
> >>>--
> >>>ELF Systems
> >>>Masamichi Fukuda
> >>>mail to: masamichi_fukuda@elf-systems.com
> >>>
> >>>
> >>
> >>_______________________________________________
> >>Linux-ha-japan mailing list
> >>Linux-ha-japan@lists.sourceforge.jp
> >>http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>
> >
> >
> >--
> >
> >ELF Systems
> >Masamichi Fukuda
> >mail to: masamichi_fukuda@elf-systems.com
> >
> >
>
> _______________________________________________
> Linux-ha-japan mailing list
> Linux-ha-japan@lists.sourceforge.jp
> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
>



--
ELF Systems
Masamichi Fukuda
mail to: masamichi_fukuda@elf-systems.com <elfsystems.com@gmail.com>
Re: STONITH errors during split-brain [ In reply to ]
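Yamauchi-san,

Hello, this is Fukuda.

In my environment, after switching PM 1.1.12 back and forth between build e32080b and build 561c4cf several times, the nodes started rebooting repeatedly.
So I did another clean install of Debian 7.8 and installed PM 1.1.12 build 561c4cf.
Also, once I added the path you pointed out to PATH, I was able to confirm here as well that stonith-helper starts.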
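
As a rough illustration of the PATH fix mentioned above, here is a minimal shell sketch. It assumes the stonith command lives in /usr/local/heartbeat/sbin, as in the find output quoted further down in this thread; path_append is a hypothetical helper name and is not part of Heartbeat or Pacemaker.

```shell
#!/bin/sh
# Append a directory to a PATH-style string unless it is already listed.
# Usage: path_append "$PATH" /some/dir   (hypothetical helper)
path_append() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;      # already present: return unchanged
        *)        printf '%s\n' "$1:$2" ;;   # missing: append at the end
    esac
}

# Assumption: install location taken from the find result quoted in this thread.
STONITH_DIR=/usr/local/heartbeat/sbin

PATH=$(path_append "$PATH" "$STONITH_DIR")
export PATH

# Report whether the stonith command can now be resolved.
if command -v stonith >/dev/null 2>&1; then
    echo "stonith found: $(command -v stonith)"
else
    echo "stonith not found in PATH" >&2
fi
```

The setting only helps if it is in effect in the environment that starts heartbeat; in the report quoted below, heartbeat was started again after PATH had been extended.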

Last updated: Fri Mar 20 16:26:47 2015
Last change: Fri Mar 20 16:22:01 2015
Stack: heartbeat
Current DC: deb64 (71e563fb-34e1-919f-7515-868014cb501d) - partition with quorum

Version: 1.1.12-561c4cf
2 Nodes configured
10 Resources configured


Online: [ deb63 deb64 ]

Full list of resources:

Resource Group: HAvarnish
vip_208 (ocf::heartbeat:IPaddr2): Started deb63
varnishd (lsb:varnish): Started deb63
Resource Group: grpStonith1
Stonith1-1 (stonith:external/stonith-helper): Started deb64
Stonith1-2 (stonith:external/ssh): Started deb64
Stonith1-3 (stonith:meatware): Started deb64
Resource Group: grpStonith2
Stonith2-1 (stonith:external/stonith-helper): Started deb63
Stonith2-2 (stonith:external/ssh): Started deb63
Stonith2-3 (stonith:meatware): Started deb63
Clone Set: clone_ping [ping]
Started: [ deb63 deb64 ]

Node Attributes:
* Node deb63:
+ default_ping_set : 100
* Node deb64:
+ default_ping_set : 100

Migration summary:
* Node deb64:
* Node deb63:
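
To check a status like the one above without reading it by eye, the text can be filtered. This is only a sketch: stonith_not_started and failed_resources are hypothetical helper names, and both filters assume the crm_mon text layout shown in this thread.

```shell
#!/bin/sh
# Print every stonith resource line whose state is not "Started".
# Reads crm_mon-style text on stdin (hypothetical helper).
stonith_not_started() {
    grep 'stonith:' | grep -v 'Started' || true
}

# Print every migration-summary line that carries a fail-count.
# Reads crm_mon-style text on stdin (hypothetical helper).
failed_resources() {
    grep 'fail-count=' || true
}
```

On a cluster node, `crm_mon -1 -rfA` could be piped into either helper. Empty output from both corresponds to the state shown above, where every stonith resource is Started and the migration summary lists no failures.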

Thank you in advance.

That is all.



On 18 March 2015 at 18:32, Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com> wrote:

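> Yamauchi-san,
>
> Good evening, this is Fukuda.
> Thank you for testing this on Debian.
>
> > It appears that the new Pacemaker rng files (later than Pacemaker 1.1.12) are involved.
> > However, I have not found a workaround yet.
>
> Please let me know once you find a workaround.
>
> > However, the problem with the combination with the latest PM is not resolved yet,
> > so whether this configuration (PM 1.1.12 + Heartbeat 3.0.6) works correctly is a separate matter.
> > # It probably appears to work, but I expect problems to show up.
>
> For now I will go back to PM 1.1.12 and try the same procedure.
> First I will check whether stonith-helper runs.
>
> > As for your stonith-helper failing on start, the cause is probably that the stonith command is not in PATH.
>
> That looks like a beginner's mistake on my part, which is embarrassing.
> I will try that here as well.
>
> Thank you in advance.
>
> That is all.
>
>
> On 18 March 2015 at 17:56, <renayama19661014@ybb.ne.jp> wrote: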
>
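>> Fukuda-san,
>>
>> Good evening, this is Yamauchi.
>>
>> I have reproduced the same situation on my side.
>> It appears that the new Pacemaker rng files (later than Pacemaker 1.1.12) are involved.
>> However, I have not found a workaround yet.
>>
>>
>> Incidentally, with the combination of Pacemaker 1.1.12 and Heartbeat 3.0.6, which is not actually known to work, I confirmed on a single node that stonith-helper starts.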
>>
>> root@debian7-1:~# crm_mon -1 -Af
>> Last updated: Wed Mar 18 17:43:37 2015
>> Last change: Wed Mar 18 17:43:29 2015
>> Stack: heartbeat
>> Current DC: debian7-1 (d20c7df5-519e-4a4c-9b4b-1b88fc203133) - partition with quorum
>> Version: 1.1.12-561c4cf
>> 1 Nodes configured
>> 3 Resources configured
>>
>>
>> Online: [ debian7-1 ]
>>
>> prmDummy(ocf::pacemaker:Dummy):Started debian7-1
>> Resource Group: grpStonith2
>> Stonith2-1(stonith:external/stonith-helper):Started debian7-1
>>
>> Node Attributes:
>> * Node debian7-1:
>>
>> Migration summary:
>> * Node debian7-1:
>>
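>> There were some places where Matsushima-san's procedure did not work for me (probably because I am not used to Debian), but I installed with the same build options and copied only the stonith-helper from the latest pm_extras_1.0 into the same directory as xen0.
>> # If there is any problem with, for example, the execute permission on stonith-helper, please set it correctly.
>>
>>
>> As for your stonith-helper failing on start, the cause is probably that the stonith command is not in PATH.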
>>
>> root@debian7-1:~# find / -name stonith -print
>> /usr/local/heartbeat/sbin/stonith
>>
>> root@debian7-1:~# echo $PATH
>>
>> /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/heartbeat/sbin/
>>
>>
>>
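>> After adding /usr/local/heartbeat/sbin to PATH and starting heartbeat again, I got the crm_mon output shown above.
>>
>> However, the problem with the combination with the latest PM is not resolved yet, so whether this configuration (PM 1.1.12 + Heartbeat 3.0.6) works correctly is a separate matter.
>> # It probably appears to work, but I expect problems to show up.
>>
>> Below is the crm file I loaded as a trial.
>>
>> (The values of parameters such as dead_check_target and standby_check_command were only there to confirm startup, so in this configuration they are in practice completely meaningless.)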
>>
>> ### Cluster Option ###
>> property \
>> no-quorum-policy="ignore" \
>> stonith-enabled="true" \
>> startup-fencing="false"
>>
>> ### Resource Default ###
>> rsc_defaults \
>> resource-stickiness="INFINITY" \
>> migration-threshold="1"
>>
>> ### Fencing Topology ###
>> fencing_topology \
>> debian7-1: Stonith1-1 \
>> debian7-2: Stonith2-1
>>
>> group grpStonith1 \
>> Stonith1-1
>>
>> group grpStonith2 \
>> Stonith2-1
>>
>> primitive prmDummy ocf:pacemaker:Dummy \
>> op start interval="0s" timeout="60s" on-fail="restart" \
>> op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> op stop interval="0s" timeout="60s" on-fail="ignore"
>>
>> primitive Stonith1-1 stonith:external/stonith-helper \
>> params \
>> pcmk_reboot_retries="1" \
>> pcmk_reboot_timeout="40s" \
>> hostlist="debian7-1" \
>> dead_check_target="192.168.3.1" \
>> standby_wait_time="10" \
>> standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>> op start interval="0s" timeout="60s" on-fail="restart" \
>> op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> op stop interval="0s" timeout="60s" on-fail="ignore"
>>
>> primitive Stonith2-1 stonith:external/stonith-helper \
>> params \
>> pcmk_reboot_retries="1" \
>> pcmk_reboot_timeout="40s" \
>> hostlist="debian7-2" \
>> dead_check_target="192.168.3.1" \
>> standby_wait_time="10" \
>> standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>> op start interval="0s" timeout="60s" on-fail="restart" \
>> op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> op stop interval="0s" timeout="60s" on-fail="ignore"
>>
>>
>> location HA_location-3 grpStonith1 \
>> rule -INFINITY: #uname eq debian7-1
>>
>> location HA_location-4 grpStonith2 \
>> rule -INFINITY: #uname eq debian7-2
>>
>>
>> I will get in touch again if I find anything.
>>
>> That is all.
>>
>>
>>
>>
>>
>>
>>
>>
>> ----- Original Message -----
>> >From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>> >To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> >Date: 2015/3/18, Wed 15:09
>> >Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >
>> >
>> >Yamauchi-san,
>> >
>> >Hello, this is Fukuda.
>> >
>> >I installed a fresh Debian 7.8 on VirtualBox
>> >and then installed heartbeat + pacemaker on it.
>> >
>> >
>> >I did not install heartbeat, pacemaker, etc. from packages.
>> >
>> >
>> >heartbeat starts, but loading the crm file produced errors.
>> >
>> >
>> ># crm configure load update test1.crm
>> >
>> >ERROR: crmd:metadata: got no meta-data, does this RA exist?
>> >ERROR: cib-bootstrap-options: attribute no-quorum-policy does not exist
>> >ERROR: cib-bootstrap-options: attribute stonith-enabled does not exist
>> >ERROR: cib-bootstrap-options: attribute crmd-transition-delay does not exist
>> >ERROR: pengine:metadata: got no meta-data, does this RA exist?
>> >
>> >Could this be related to the agents under external not being recognized?
>> >
>> >Thank you in advance.
>> >
>> >That is all.
>> >
>> >
>> >
>> >
>> >
>> >On 18 March 2015 at 12:13, <renayama19661014@ybb.ne.jp> wrote:
>> >
>> >Fukuda-san,
>> >>
>> >>Hello, this is Yamauchi.
>> >>
>> >>Understood.
>> >>Thank you for letting me know.
>> >>
>> >>That is all.
>> >>
>> >>
>> >>
>> >>----- Original Message -----
>> >>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>> >>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> >>>Date: 2015/3/18, Wed 10:23
>> >>>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>
>> >>>
>> >>>Yamauchi-san,
>> >>>
>> >>>Hello, this is Fukuda.
>> >>>
>> >>>In my environment the following had been installed from packages,
>> >>>so I first removed them with apt-get remove:
>> >>>
>> >>>heartbeat, libheartbeat2, pacemaker, corosync, resource-agents
>> >>>
>> >>>The hacluster user and the haclient group had also already been created
>> >>>when the packages were installed.
>> >>>
>> >>>So I ran Matsushima-san's procedure from the following step onward:
>> >>>
>> >>>Preparation (下準備)
>> >>>apt-get install build-essential mercurial git \
>> >>>
>> >>>Everything after that followed exactly the same procedure.
>> >>>
>> >>>Thank you in advance.
>> >>>
>> >>>That is all.
>> >>>
>> >>>On 18 March 2015 at 10:06, <renayama19661014@ybb.ne.jp> wrote:
>> >>>>
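>> >>>> Fukuda-san,
>> >>>>
>> >>>> Hello, this is Yamauchi.
>> >>>>
>> >>>> Just to double-check before I build this myself: did you follow
>> >>>> Matsushima-san's procedure, summarized at the link below, exactly?
>> >>>>
>> >>>> * https://gist.github.com/takehironet/1469bd7123f63d61f843
>> >>>>
>> >>>> If there are any differences, please let me know again.
>> >>>>
>> >>>> # In particular, I recall that the apt-get step for the initial build packages did not go well when I briefly tried it, so that part concerns me.
>> >>>>
>> >>>>
>> >>>> That is all.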
>> >>>>
>> >>>>
>> >>>> ----- Original Message -----
>> >>>> > From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
>> >>>> > To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> >>>> > Cc:
>> >>>> > Date: 2015/3/18, Wed 09:53
>> >>>> > Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>> >
>> >>>> > Fukuda-san,
>> >>>> >
>> >>>> > Hello, this is Yamauchi.
>> >>>> >
>> >>>> >> stonith -L does at least display the list of plugins.
>> >>>> >>
>> >>>> >> # /usr/local/heartbeat/sbin/stonith -L
>> >>>> >
>> >>>> > That command should be part of the Heartbeat sources, so this means there is no problem between Heartbeat and glue.
>> >>>> >
>> >>>> > So it is more likely that the problem lies in the Pacemaker installation.
>> >>>> >
>> >>>> > Either way, I will find some time to build it here as well.
>> >>>> >
>> >>>> > That is all.
>> >>>> >
>> >>>> >
>> >>>> > ----- Original Message -----
>> >>>> >> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>> >>>> >> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> >>>> >> Date: 2015/3/18, Wed 09:33
>> >>>> >> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>> >>
>> >>>> >>
>> >>>> >> Yamauchi-san,
>> >>>> >>
>> >>>> >> Hello, this is Fukuda.
>> >>>> >>
>> >>>> >>> Reusable means glue.
>> >>>> >>
>> >>>> >> Understood. You mean Cluster-glue.
>> >>>> >>
>> >>>> >>> This is as expected: I believe the agents under external are not being recognized, so they are not started.
>> >>>> >>
>> >>>> >> stonith -L does at least display the list of plugins.
>> >>>> >>
>> >>>> >> # /usr/local/heartbeat/sbin/stonith -L
>> >>>> >> apcmaster
>> >>>> >> apcsmart
>> >>>> >> baytech
>> >>>> >> cyclades
>> >>>> >> external/drac5
>> >>>> >> external/dracmc-telnet
>> >>>> >> external/hetzner
>> >>>> >> external/hmchttp
>> >>>> >> external/ibmrsa
>> >>>> >> external/ibmrsa-telnet
>> >>>> >> external/ipmi
>> >>>> >> external/ippower9258
>> >>>> >> external/kdumpcheck
>> >>>> >> external/libvirt
>> >>>> >> external/nut
>> >>>> >> external/rackpdu
>> >>>> >> external/riloe
>> >>>> >> external/ssh
>> >>>> >> external/stonith-helper
>> >>>> >> external/vcenter
>> >>>> >> external/vmware
>> >>>> >> external/xen0
>> >>>> >> external/xen0-ha
>> >>>> >> ibmhmc
>> >>>> >> meatware
>> >>>> >> null
>> >>>> >> nw_rpc100s
>> >>>> >> rcd_serial
>> >>>> >> rps10
>> >>>> >> ssh
>> >>>> >> suicide
>> >>>> >> wti_nps
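
A quick way to confirm that the plugins used in this thread appear in a list like the one above is sketched below. missing_plugins is a hypothetical helper; it only inspects the text of the list, and, as noted elsewhere in the thread, a plugin showing up in stonith -L does not by itself show that Pacemaker can start it.

```shell
#!/bin/sh
# Print each required plugin name that is absent from the plugin list on stdin.
# Usage: stonith -L | missing_plugins external/stonith-helper external/xen0
missing_plugins() {
    list=$(cat)    # capture the full plugin list once
    for plugin in "$@"; do
        # -x: whole-line match, -F: fixed string, so external/xen0
        # does not accidentally match external/xen0-ha
        printf '%s\n' "$list" | grep -qxF "$plugin" || printf '%s\n' "$plugin"
    done
}
```

With the list shown above, `stonith -L | missing_plugins external/stonith-helper external/xen0 external/ssh` would be expected to print nothing, since all three names are listed.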
>> >>>> >>
>> >>>> >>
>> >>>> >>> # I do not think I can say anything definite until I find time to build a Debian 7 environment based on Matsushima-san's information.
>> >>>> >>
>> >>>> >> Sorry to trouble you when you are busy.
>> >>>> >> I will also review my installation.
>> >>>> >>
>> >>>> >> Thank you in advance.
>> >>>> >>
>> >>>> >> That is all.
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >> On 18 March 2015 at 9:02, <renayama19661014@ybb.ne.jp> wrote:
>> >>>> >>
>> >>>> >> Fukuda-san,
>> >>>> >>>
>> >>>> >>> Good morning, this is Yamauchi.
>> >>>> >>>
>> >>>> >>> I phrased that badly.
>> >>>> >>> Reusable means glue.
>> >>>> >>>
>> >>>> >>> There may be a problem with the Pacemaker installation, but I cannot tell at this point.
>> >>>> >>>
>> >>>> >>>
>> >>>> >>>> I removed stonith-helper and started with only external/ssh,
>> >>>> >>>> but the state shown by crm_mon did not change.
>> >>>> >>>
>> >>>> >>>
>> >>>> >>> This is as expected: I believe the agents under external are not being recognized, so they are not started.
>> >>>> >>>
>> >>>> >>> # I do not think I can say anything definite until I find time to build a Debian 7 environment based on Matsushima-san's information.
>> >>>> >>>
>> >>>> >>> That is all.
>> >>>> >>>
>> >>>> >>>
>> >>>> >>> ----- Original Message -----
>> >>>> >>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>> >>>> >>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> >>>> >>>> Date: 2015/3/18, Wed 08:12
>> >>>> >>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>> >>>>
>> >>>> >>>>
>> >>>> >>>> Yamauchi-san,
>> >>>> >>>>
>> >>>> >>>> Good morning, this is Fukuda.
>> >>>> >>>>
>> >>>> >>>>> In that case, neither xen0 nor stonith-helper is probably in a path that Pacemaker manages for stonith plugins.
>> >>>> >>>>>
>> >>>> >>>>> I suspect the installation of Reusable and pm_extras.
>> >>>> >>>>
>> >>>> >>>> Is there a problem with the Pacemaker installation?
>> >>>> >>>> Also, does Reusable need to be installed separately?
>> >>>> >>>>
>> >>>> >>>> I removed stonith-helper and started with only external/ssh,
>> >>>> >>>> but the state shown by crm_mon did not change.
>> >>>> >>>>
>> >>>> >>>> Last updated: Wed Mar 18 08:07:42 2015
>> >>>> >>>> Last change: Wed Mar 18 08:04:48 2015
>> >>>> >>>> Stack: heartbeat
>> >>>> >>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>> >>>> >>>> Version: 1.1.12-e32080b
>> >>>> >>>> 2 Nodes configured
>> >>>> >>>> 6 Resources configured
>> >>>> >>>>
>> >>>> >>>>
>> >>>> >>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>> >>>> >>>>
>> >>>> >>>> Full list of resources:
>> >>>> >>>>
>> >>>> >>>> Stonith1-2 (stonith:external/ssh): Stopped
>> >>>> >>>> Stonith2-2 (stonith:external/ssh): Stopped
>> >>>> >>>> Resource Group: HAvarnish
>> >>>> >>>> vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
>> >>>> >>>> varnishd (lsb:varnish): Started lbv1.beta.com
>> >>>> >>>> Clone Set: clone_ping [ping]
>> >>>> >>>> Started: [ lbv1.beta.com lbv2.beta.com ]
>> >>>> >>>>
>> >>>> >>>> Node Attributes:
>> >>>> >>>> * Node lbv1.beta.com:
>> >>>> >>>> + default_ping_set : 100
>> >>>> >>>> * Node lbv2.beta.com:
>> >>>> >>>> + default_ping_set : 100
>> >>>> >>>>
>> >>>> >>>> Migration summary:
>> >>>> >>>> * Node lbv2.beta.com:
>> >>>> >>>> Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:07:32 2015'
>> >>>> >>>> * Node lbv1.beta.com:
>> >>>> >>>> Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:05:53 2015'
>> >>>> >>>>
>> >>>> >>>> Failed actions:
>> >>>> >>>> Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:07:30 2015', queued=0ms, exec=1061ms
>> >>>> >>>> Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:05:51 2015', queued=0ms, exec=1062ms
>> >>>> >>>>
>> >>>> >>>> Thank you in advance.
>> >>>> >>>>
>> >>>> >>>> That is all.
>> >>>> >>>>
>> >>>> >>>>
>> >>>> >>>>
>> >>>> >>>>
>> >>>> >>>>
>> >>>> >>>> On 17 March 2015 at 23:51, <renayama19661014@ybb.ne.jp> wrote:
>> >>>> >>>>
>> >>>> >>>> Fukuda-san,
>> >>>> >>>>>
>> >>>> >>>>> Good evening, this is Yamauchi.
>> >>>> >>>>>
>> >>>> >>>>> In that case, neither xen0 nor stonith-helper is probably in a path that Pacemaker manages for stonith plugins.
>> >>>> >>>>>
>> >>>> >>>>> I suspect the installation of Reusable and pm_extras.
>> >>>> >>>>>
>> >>>> >>>>> I will get in touch again if I find anything.
>> >>>> >>>>>
>> >>>> >>>>> That is all.
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>> >>>>> ----- Original Message -----
>> >>>> >>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>> >>>> >>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> >>>> >>>>>> Date: 2015/3/17, Tue 23:46
>> >>>> >>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>> >>>>>>
>> >>>> >>>>>>
>> >>>> >>>>>> Yamauchi-san,
>> >>>> >>>>>>
>> >>>> >>>>>> Good evening, this is Fukuda.
>> >>>> >>>>>>
>> >>>> >>>>>> Perhaps I am doing something wrong with the -x option for stonith-helper.
>> >>>> >>>>>>
>> >>>> >>>>>> I removed stonith-helper and started with only xen0.
>> >>>> >>>>>>
>> >>>> >>>>>> # crm_mon -rfA
>> >>>> >>>>>>
>> >>>> >>>>>> Last updated: Tue Mar 17 23:38:53 2015
>> >>>> >>>>>> Last change: Tue Mar 17 23:30:34 2015
>> >>>> >>>>>> Stack: heartbeat
>> >>>> >>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>> >>>> >>>>>> Version: 1.1.12-e32080b
>> >>>> >>>>>> 2 Nodes configured
>> >>>> >>>>>> 6 Resources configured
>> >>>> >>>>>>
>> >>>> >>>>>>
>> >>>> >>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>> >>>> >>>>>>
>> >>>> >>>>>> Full list of resources:
>> >>>> >>>>>>
>> >>>> >>>>>> Stonith1-2 (stonith:external/xen0): Stopped
>> >>>> >>>>>> Stonith2-2 (stonith:external/xen0): Stopped
>> >>>> >>>>>> Resource Group: HAvarnish
>> >>>> >>>>>> vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
>> >>>> >>>>>> varnishd (lsb:varnish): Started lbv1.beta.com
>> >>>> >>>>>> Clone Set: clone_ping [ping]
>> >>>> >>>>>> Started: [ lbv1.beta.com lbv2.beta.com ]
>> >>>> >>>>>>
>> >>>> >>>>>> Node Attributes:
>> >>>> >>>>>> * Node lbv1.beta.com:
>> >>>> >>>>>> + default_ping_set : 100
>> >>>> >>>>>> * Node lbv2.beta.com:
>> >>>> >>>>>> + default_ping_set : 100
>> >>>> >>>>>>
>> >>>> >>>>>> Migration summary:
>> >>>> >>>>>> * Node lbv1.beta.com:
>> >>>> >>>>>> Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:34 2015'
>> >>>> >>>>>> * Node lbv2.beta.com:
>> >>>> >>>>>> Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:27 2015'
>> >>>> >>>>>>
>> >>>> >>>>>> Failed actions:
>> >>>> >>>>>> Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32 2015', queued=0ms, exec=1061ms
>> >>>> >>>>>> Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25 2015', queued=0ms, exec=1342ms
>> >>>> >>>>>>
>> >>>> >>>>>>
>> >>>> >>>>>>
>> >>>> >>>>>>
>> >>>> >>>>>> The same failed actions appear as when stonith-helper is present.
>> >>>> >>>>>>
>> >>>> >>>>>>
>> >>>> >>>>>> Thank you in advance.
>> >>>> >>>>>>
>> >>>> >>>>>> That is all.
>> >>>> >>>>>>
>> >>>> >>>>>>
>> >>>> >>>>>>
>> >>>> >>>>>>
>> >>>> >>>>>> On 17 March 2015 at 22:38, <renayama19661014@ybb.ne.jp> wrote:
>> >>>> >>>>>>
>> >>>> >>>>>> Fukuda-san,
>> >>>> >>>>>>>
>> >>>> >>>>>>> Good evening, this is Yamauchi.
>> >>>> >>>>>>>
>> >>>> >>>>>>> Incidentally, if possible, checking what happens when you remove external/stonith-helper and use only external/xen0
>> >>>> >>>>>>> might help isolate the problem.
>> >>>> >>>>>>>
>> >>>> >>>>>>> That is all.
>> >>>> >>>>>>>
>> >>>> >>>>>>>
>> >>>> >>>>>>>
>> >>>> >>>>>>> ----- Original Message -----
>> >>>> >>>>>>>
>> >>>> >>>>>>>> From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
>> >>>> >>>>>>>> To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> >>>> >>>>>>>> Cc:
>> >>>> >>>>>>>> Date: 2015/3/17, Tue 22:28
>> >>>> >>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>> >>>>>>>>
>> >>>> >>>>>>>> Fukuda-san,
>> >>>> >>>>>>>>
>> >>>> >>>>>>>> Good evening, this is Yamauchi.
>> >>>> >>>>>>>>
>> >>>> >>>>>>>> So nothing has changed...
>> >>>> >>>>>>>>
>> >>>> >>>>>>>> For now, sometime around tomorrow, I will check (on RHEL, though) whether stonith-helper runs with the combination of
>> >>>> >>>>>>>>
>> >>>> >>>>>>>> Heartbeat 3.0.6
>> >>>> >>>>>>>> the latest Pacemaker
>> >>>> >>>>>>>>
>> >>>> >>>>>>>> and a similar configuration (the resource will be Dummy, and external/xen0 will be replaced by external/ssh).
>> >>>> >>>>>>>>
>> >>>> >>>>>>>> # If we could see the output of stonith-helper run with -x, it would be a bit easier to narrow down the problem...
>> >>>> >>>>>>>>
>> >>>> >>>>>>>>
>> >>>> >>>>>>>> That is all.
>> >>>> >>>>>>>>
>> >>>> >>>>>>>>
>> >>>> >>>>>>>>
>> >>>> >>>>>>>> ----- Original Message -----
>> >>>> >>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>> >>>> >>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>> >>>> >>>>>>>>> Date: 2015/3/17, Tue 21:24
>> >>>> >>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> Yamauchi-san,
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> Good evening, this is Fukuda.
>> >>>> >>>>>>>>> Thank you for the information about the latest version.
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> I installed it right away.
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> Here is the state after startup.
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> The failed actions appear to be unchanged.
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> # crm_mon -rfA
>> >>>> >>>>>>>>> Last updated: Tue Mar 17 21:03:49 2015
>> >>>> >>>>>>>>> Last change: Tue Mar 17 20:30:58 2015
>> >>>> >>>>>>>>> Stack: heartbeat
>> >>>> >>>>>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>> >>>> >>>>>>>>> Version: 1.1.12-e32080b
>> >>>> >>>>>>>>> 2 Nodes configured
>> >>>> >>>>>>>>> 8 Resources configured
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> Full list of resources:
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> Resource Group: HAvarnish
>> >>>> >>>>>>>>> vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
>> >>>> >>>>>>>>> varnishd (lsb:varnish): Started lbv1.beta.com
>> >>>> >>>>>>>>> Resource Group: grpStonith1
>> >>>> >>>>>>>>> Stonith1-1 (stonith:external/stonith-helper): Stopped
>> >>>> >>>>>>>>> Stonith1-2 (stonith:external/xen0): Stopped
>> >>>> >>>>>>>>> Resource Group: grpStonith2
>> >>>> >>>>>>>>> Stonith2-1 (stonith:external/stonith-helper): Stopped
>> >>>> >>>>>>>>> Stonith2-2 (stonith:external/xen0): Stopped
>> >>>> >>>>>>>>> Clone Set: clone_ping [ping]
>> >>>> >>>>>>>>> Started: [ lbv1.beta.com lbv2.beta.com ]
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> Node Attributes:
>> >>>> >>>>>>>>> * Node lbv1.beta.com:
>> >>>> >>>>>>>>> + default_ping_set : 100
>> >>>> >>>>>>>>> * Node lbv2.beta.com:
>> >>>> >>>>>>>>> + default_ping_set : 100
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> Migration summary:
>> >>>> >>>>>>>>> * Node lbv1.beta.com:
>> >>>> >>>>>>>>> Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
>> >>>> >>>>>>>>> * Node lbv2.beta.com:
>> >>>> >>>>>>>>> Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> Failed actions:
>> >>>> >>>>>>>>> Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
>> >>>> >>>>>>>>> Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> Here is the log.
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> # less /var/log/ha-debug
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
>> >>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
>> >>>> >>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
>> >>>> >>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist=
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109 gid 113 (pid 4250)
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109 gid 113 (pid 4246)
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109 gid 113 (pid 4249)
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109 gid 113 (pid 4245)
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0 gid 0 (pid 4248)
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
>> >>>> > [4247]: info: Starting
>> >>>> >>>>>>>>
>> >>>> > "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0
>> (pid
>> >>>> >>>>>>>> 4247)
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]:
>> >>>> > info: Hostname: lbv1.beta.com
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
>> >>>> > [4236]: info: the send queue length
>> >>>> >>>>>>>> from heartbeat to client ccm is set to 1024
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
>> >>>> > [4236]: info: the send queue length
>> >>>> >>>>>>>> from heartbeat to client attrd is set to 1024
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
>> >>>> > [4236]: info: the send queue length
>> >>>> >>>>>>>> from heartbeat to client stonith-ng is set to 1024
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
>> >>>> > [4236]: info: Status update for
>> >>>> >>>>>>>> node lbv2.beta.com: status active
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
>> >>>> > [4236]: info: the send queue length
>> >>>> >>>>>>>> from heartbeat to client cib is set to 1024
>> >>>> >>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat:
>> >>>> > [4236]: WARN: 1 lost packet(s) for
>> >>>> >>>>>>>> [lbv2.beta.com] [15:17]
>> >>>> >>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat:
>> >>>> > [4236]: info: No pkts missing from
>> >>>> >>>>>>>> lbv2.beta.com!
>> >>>> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat:
>> >>>> > [4236]: WARN: 1 lost packet(s) for
>> >>>> >>>>>>>> [lbv2.beta.com] [19:21]
>> >>>> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat:
>> >>>> > [4236]: info: No pkts missing from
>> >>>> >>>>>>>> lbv2.beta.com!
>> >>>> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat:
>> >>>> > [4236]: info: the send queue length
>> >>>> >>>>>>>> from heartbeat to client crmd is set to 1024
>> >>>> >>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat:
>> >>>> > [4236]: WARN: 1 lost packet(s) for
>> >>>> >>>>>>>> [lbv2.beta.com] [24:26]
>> >>>> >>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat:
>> >>>> > [4236]: info: No pkts missing from
>> >>>> >>>>>>>> lbv2.beta.com!
>> >>>> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat:
>> >>>> > [4236]: WARN: 1 lost packet(s) for
>> >>>> >>>>>>>> [lbv2.beta.com] [26:28]
>> >>>> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat:
>> >>>> > [4236]: info: No pkts missing from
>> >>>> >>>>>>>> lbv2.beta.com!
>> >>>> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat:
>> >>>> > [4236]: WARN: 1 lost packet(s) for
>> >>>> >>>>>>>> [lbv2.beta.com] [30:32]
>> >>>> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat:
>> >>>> > [4236]: info: No pkts missing from
>> >>>> >>>>>>>> lbv2.beta.com!
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> # less /var/log/error
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> Mar 17 21:02:47 lbv1 attrd[4249]: error:
>> >>>> > ha_msg_dispatch: Ignored
>> >>>> >>>>>>>> incoming message. Please set_msg_callback on
>> >>>> > hbclstat
>> >>>> >>>>>>>>> Mar 17 21:02:48 lbv1 attrd[4249]: error:
>> >>>> > ha_msg_dispatch: Ignored
>> >>>> >>>>>>>> incoming message. Please set_msg_callback on
>> >>>> > hbclstat
>> >>>> >>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]:
>> >>>> > error: ha_msg_dispatch: Ignored
>> >>>> >>>>>>>> incoming message. Please set_msg_callback on
>> >>>> > hbclstat
>> >>>> >>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]:
>> >>>> > error: ha_msg_dispatch: Ignored
>> >>>> >>>>>>>> incoming message. Please set_msg_callback on
>> >>>> > hbclstat
>> >>>> >>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]: error:
>> >>>> > process_lrm_event: Operation
>> >>>> >>>>>>>> Stonith2-1_start_0 (node=lbv1.beta.com, call=31,
>> >>>> > status=4, cib-update=42,
>> >>>> >>>>>>>> confirmed=true) Error
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> # cat syslog|egrep 'Mar 17 21:03|Mar 17
>> >>>> > 21:02' |egrep
>> >>>> >>>>>>>> 'heartbeat|stonith|pacemaker|error'
>> >>>> >>>>>>>>> Mar 17 21:03:24 lbv1 pengine[4253]: notice:
>> >>>> > process_pe_message: Calculated
>> >>>> >>>>>>>> Transition 0:
>> >>>> > /var/lib/pacemaker/pengine/pe-input-115.bz2
>> >>>> >>>>>>>>> Mar 17 21:03:27 lbv1 crmd[4250]: notice:
>> >>>> > run_graph: Transition 0
>> >>>> >>>>>>>> (Complete=15, Pending=0, Fired=0, Skipped=16,
>> >>>> > Incomplete=2,
>> >>>> >>>>>>>>
>> >>>> > Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
>> >>>> >>>>>>>>> Mar 17 21:03:29 lbv1 pengine[4253]: notice:
>> >>>> > process_pe_message: Calculated
>> >>>> >>>>>>>> Transition 1:
>> >>>> > /var/lib/pacemaker/pengine/pe-input-116.bz2
>> >>>> >>>>>>>>> Mar 17 21:03:34 lbv1 crmd[4250]: notice:
>> >>>> > run_graph: Transition 1
>> >>>> >>>>>>>> (Complete=8, Pending=0, Fired=0, Skipped=12,
>> >>>> > Incomplete=1,
>> >>>> >>>>>>>>
>> >>>> > Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
>> >>>> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning:
>> >>>> > unpack_rsc_op_failure:
>> >>>> >>>>>>>> Processing failed op start for Stonith1-1 on
>> >>>> > lbv2.beta.com: unknown error (1)
>> >>>> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning:
>> >>>> > unpack_rsc_op_failure:
>> >>>> >>>>>>>> Processing failed op start for Stonith1-1 on
>> >>>> > lbv2.beta.com: unknown error (1)
>> >>>> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: notice:
>> >>>> > process_pe_message: Calculated
>> >>>> >>>>>>>> Transition 2:
>> >>>> > /var/lib/pacemaker/pengine/pe-input-117.bz2
>> >>>> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:
>> >>>> > notice: log_operation: Operation
>> >>>> >>>>>>>> 'monitor' [4377] for device
>> >>>> > 'Stonith2-1' returned: -201 (Generic
>> >>>> >>>>>>>> Pacemaker error)
>> >>>> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:
>> >>>> > warning: log_operation:
>> >>>> >>>>>>>> Stonith2-1:4377 [ Performing: stonith -t
>> >>>> > external/stonith-helper -S ]
>> >>>> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:
>> >>>> > warning: log_operation:
>> >>>> >>>>>>>> Stonith2-1:4377 [ failed to exec
>> >>>> > "stonith" ]
>> >>>> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]:
>> >>>> > warning: log_operation:
>> >>>> >>>>>>>> Stonith2-1:4377 [ failed: 2 ]
>> >>>> >>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]: error:
>> >>>> > process_lrm_event: Operation
>> >>>> >>>>>>>> Stonith2-1_start_0 (node=lbv1.beta.com, call=31,
>> >>>> > status=4, cib-update=42,
>> >>>> >>>>>>>> confirmed=true) Error
>> >>>> >>>>>>>>> Mar 17 21:03:40 lbv1 crmd[4250]: notice:
>> >>>> > run_graph: Transition 2
>> >>>> >>>>>>>> (Complete=12, Pending=0, Fired=0, Skipped=3,
>> >>>> > Incomplete=0,
>> >>>> >>>>>>>>
>> >>>> > Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
>> >>>> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning:
>> >>>> > unpack_rsc_op_failure:
>> >>>> >>>>>>>> Processing failed op start for Stonith2-1 on
>> >>>> > lbv1.beta.com: unknown error (1)
>> >>>> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning:
>> >>>> > unpack_rsc_op_failure:
>> >>>> >>>>>>>> Processing failed op start for Stonith2-1 on
>> >>>> > lbv1.beta.com: unknown error (1)
>> >>>> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning:
>> >>>> > unpack_rsc_op_failure:
>> >>>> >>>>>>>> Processing failed op start for Stonith1-1 on
>> >>>> > lbv2.beta.com: unknown error (1)
>> >>>> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: notice:
>> >>>> > process_pe_message: Calculated
>> >>>> >>>>>>>> Transition 3:
>> >>>> > /var/lib/pacemaker/pengine/pe-input-118.bz2
>> >>>> >>>>>>>>> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]:
>> >>>> > INFO:
>> >>>> >>>>>>>> /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p
>> >>>> >>>>>>>> /var/run/resource-agents/send_arp-192.168.17.208
>> >>>> > eth0 192.168.17.208 auto
>> >>>> >>>>>>>> not_used not_used
>> >>>> >>>>>>>>> Mar 17 21:03:47 lbv1 crmd[4250]: notice:
>> >>>> > run_graph: Transition 3
>> >>>> >>>>>>>> (Complete=10, Pending=0, Fired=0, Skipped=0,
>> >>>> > Incomplete=0,
>> >>>> >>>>>>>>
>> >>>> > Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
>> >>>> >>>>>>>>>
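The `failed to exec "stonith"` lines in the syslog excerpt above suggest that stonith-ng could not execute a command named `stonith`. One possible cause, which the thread does not confirm, is that the command is not on the PATH seen by the daemon. The following is a minimal sketch for checking whether a command name resolves through PATH. The helper name `check_cmd` is hypothetical, and the result reflects only the PATH of the shell it runs in, not necessarily that of stonith-ng.

```shell
#!/bin/sh
# check_cmd is a hypothetical helper, not part of Pacemaker or cluster-glue.
# It reports whether a bare command name can be resolved through PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $(command -v "$1")"
    else
        echo "missing: $1"
    fi
}

# "sh" is expected to resolve on any POSIX system; "stonith" depends on
# where cluster-glue was installed and on the current PATH.
check_cmd sh
check_cmd stonith
```

Running it as the same user and with the same environment as the daemon gives the most meaningful result, since an interactive root shell may have a different PATH.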
>> >>>> >>>>>>>>> Thank you in advance.
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> That is all.
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> On March 17, 2015 at 18:31,
>> >>>> > <renayama19661014@ybb.ne.jp> wrote:
>> >>>> >>>>>>>>>
>> >>>> >>>>>>>>> Fukuda-san,
>> >>>> >>>>>>>>>>
>> >>>> >>>>>>>>>> Good evening, this is Yamauchi.
>> >>>> >>>>>>>>>>
>> >>>> >>>>>>>>>> No tag has been created yet, so the latest version as of today is:
>> >>>> >>>>>>>>>>
>> >>>> >>>>>>>>>> * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
>> >>>> >>>>>>>>>>
>> >>>> >>>>>>>>>> You can download it via [Download ZIP] on the right-hand side of that page.
>> >>>> >>>>>>>>>>
>> >>>> >>>>>>>>>> That is all.
>> >>>> >>>>>>>>>>
>> >>>> >>>>>>>>>>
>> >>>> >>>>>>>>>> ----- Original Message -----
>> >>>> >>>>>>>>>>> From: Masamichi Fukuda - elf-systems
>> >>>> >>>>>>>> <masamichi_fukuda@elf-systems.com>
>> >>>> >>>>>>>>>>
>> >>>> >>>>>>>>>>> To:
>> >>>> > "renayama19661014@ybb.ne.jp"
>> >>>> >>>>>>>> <renayama19661014@ybb.ne.jp>;
>> >>>> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
>> >>>> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
>> >>>> >>>>>>>>>>> Date: 2015/3/17, Tue 18:07
>> >>>> >>>>>>>>>>> Subject: Regarding STONITH errors during split-brain
>> >>>> >>>>>>>>>>>
>> >>>> >>>>>>>>>>>
>> >>>> >>>>>>>>>>> Yamauchi-san,
>> >>>> >>>>>>>>>>>
>> >>>> >>>>>>>>>>> Hello, this is Fukuda.
>> >>>> >>>>>>>>>>>
>> >>>> >>>>>>>>>>> I looked at the following page:
>> >>>> >>>>>>>>>>>
>> >>>> > https://github.com/ClusterLabs/pacemaker/tags
>> >>>> >>>>>>>>>>>
>> >>>> >>>>>>>>>>> There, pacemaker 1.1.12 (561c4cf) appears to be the latest.
>> >>>> >>>>>>>>>>> Sorry to trouble you, but could you tell me where to find a newer version than this?
>> >>>> >>>>>>>>>>>
>> >>>> >>>>>>>>>>> Thank you in advance.
>> >>>> >>>>>>>>>>>
>> >>>> >>>>>>>>>>> That is all.
>> >>>> >>>>>>>>>>>
>> >>>> >>>>>>>>>>>
>> >>>> >>>>>>>>>>>
>> >>>> >>>>>>>>>>>
>> >>>> > On Tuesday, March 17, 2015, <renayama19661014@ybb.ne.jp> wrote:
>> >>>> >>>>>>>>>>>
>> >>>> >>>>>>>>>>> Fukuda-san,
>> >>>> >>>>>>>>>>>>
>> >>>> >>>>>>>>>>>> Hello, this is Yamauchi.
>> >>>> >>>>>>>>>>>>
>> >>>> >>>>>>>>>>>> Yes, it is old.
>> >>>> >>>>>>>>>>>>
>> >>>> >>>>>>>>>>>> Pacemaker only gained support for Heartbeat 3.0.6 fairly recently.
>> >>>> >>>>>>>>>>>> Please install something newer. (You will need to build it from source again, though...)
>> >>>> >>>>>>>>>>>>
>> >>>> >>>>>>>>>>>> It is available from the upstream GitHub repository:
>> >>>> >>>>>>>>>>>> * https://github.com/ClusterLabs/pacemaker
>> >>>> >>>>>>>>>>>>
>> >>>> >>>>>>>>>>>> The latest master sometimes produces errors; if that happens,
>> >>>> >>>>>>>>>>>> I suggest stepping back through older revisions.
>> >>>> >>>>>>>>>>>>
>> >>>> >>>>>>>>>>>> That is all.
>> >>>> >>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>
>> >>>> >>>>>>>>>>>> ----- Original Message -----
>> >>>> >>>>>>>>>>>>> From: Masamichi Fukuda -
>> >>>> > elf-systems
>> >>>> >>>>>>>> <masamichi_fukuda@elf-systems.com>
>> >>>> >>>>>>>>>>>>> To: 山内英生
>> >>>> > <renayama19661014@ybb.ne.jp>;
>> >>>> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
>> >>>> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
>> >>>> >>>>>>>>>>>>> Date: 2015/3/17, Tue 16:06
>> >>>> >>>>>>>>>>>>> Subject: Re: [Linux-ha-jp]
>> >>>> > Regarding STONITH errors during split-brain
>> >>>> >>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>> Yamauchi-san,
>> >>>> >>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>> Hello, this is Fukuda.
>> >>>> >>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>> In an earlier email you advised me to install the latest versions of heartbeat and pacemaker.
>> >>>> >>>>>>>>>>>>> So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12.
>> >>>> >>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>> heartbeat configuration: Version = "3.0.6"
>> >>>> >>>>>>>>>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
>> >>>> >>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>> Does this mean that my pacemaker is still too old?
>> >>>> >>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>> Sorry to trouble you, and thank you in advance.
>> >>>> >>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>> That is all.
>> >>>> >>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>> On March 17, 2015 at 14:59,
>> >>>> > <renayama19661014@ybb.ne.jp> wrote:
>> >>>> >>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>> Fukuda-san,
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>> Hello, this is Yamauchi.
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>> Something just occurred to me: in an earlier email in this thread I answered as follows. Is that all right on your side?
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> 2) Heartbeat 3.0.6 + latest Pacemaker: OK
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> Apparently Heartbeat also has to be the latest version, 3.0.6, for this combination.
>> >>>> >>>>>>>>>>>>>>>>>>>> * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>> Judging from the crm_mon output below, you are running 1.1.12.
>> >>>> >>>>>>>>>>>>>> Combining it with Heartbeat 3.0.6 requires a fairly recent Pacemaker.
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> # crm_mon -rfA
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Last updated: Tue Mar
>> >>>> > 17 14:14:39 2015
>> >>>> >>>>>>>>>>>>>>> Last change: Tue Mar 17
>> >>>> > 14:01:43 2015
>> >>>> >>>>>>>>>>>>>>> Stack: heartbeat
>> >>>> >>>>>>>>>>>>>>> Current DC:
>> >>>> > lbv2.beta.com
>> >>>> >>>>>>>> (82ffc36f-1ad8-8686-7db0-35686465c624) - parti
>> >>>> >>>>>>>>>>>>>>> tion with quorum
>> >>>> >>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>> You probably need at least a revision that includes the following change.
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>
>> >>>> >
>> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>> That is all.
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>> ----- Original Message
>> >>>> > -----
>> >>>> >>>>>>>>>>>>>>> From: Masamichi Fukuda
>> >>>> > - elf-systems
>> >>>> >>>>>>>> <masamichi_fukuda@elf-systems.com>
>> >>>> >>>>>>>>>>>>>>> To: 山内英生
>> >>>> > <renayama19661014@ybb.ne.jp>;
>> >>>> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
>> >>>> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
>> >>>> >>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Date: 2015/3/17, Tue
>> >>>> > 14:38
>> >>>> >>>>>>>>>>>>>>> Subject: Re:
>> >>>> > [Linux-ha-jp] Regarding STONITH errors during split-brain
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Yamauchi-san,
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Hello, this is Fukuda.
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Is it enough to add -x to the shebang line of stonith-helper?
>> >>>> >>>>>>>>>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> The crm_mon output appears to be unchanged from before.
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> # crm_mon -rfA
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Last updated: Tue Mar
>> >>>> > 17 14:14:39 2015
>> >>>> >>>>>>>>>>>>>>> Last change: Tue Mar 17
>> >>>> > 14:01:43 2015
>> >>>> >>>>>>>>>>>>>>> Stack: heartbeat
>> >>>> >>>>>>>>>>>>>>> Current DC:
>> >>>> > lbv2.beta.com
>> >>>> >>>>>>>> (82ffc36f-1ad8-8686-7db0-35686465c624) - parti
>> >>>> >>>>>>>>>>>>>>> tion with quorum
>> >>>> >>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
>> >>>> >>>>>>>>>>>>>>> 2 Nodes configured
>> >>>> >>>>>>>>>>>>>>> 8 Resources configured
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Online: [ lbv1.beta.com
>> >>>> > lbv2.beta.com ]
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Full list of resources:
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Resource Group:
>> >>>> > HAvarnish
>> >>>> >>>>>>>>>>>>>>> vip_208
>> >>>> > (ocf::heartbeat:IPaddr2):
>> >>>> >>>>>>>> Started lbv1.beta.com
>> >>>> >>>>>>>>>>>>>>> varnishd
>> >>>> > (lsb:varnish): Started
>> >>>> >>>>>>>> lbv1.beta.com
>> >>>> >>>>>>>>>>>>>>> Resource Group:
>> >>>> > grpStonith1
>> >>>> >>>>>>>>>>>>>>> Stonith1-1
>> >>>> >>>>>>>> (stonith:external/stonith-helper): Stopped
>> >>>> >>>>>>>>>>>>>>> Stonith1-2
>> >>>> > (stonith:external/xen0):
>> >>>> >>>>>>>> Stopped
>> >>>> >>>>>>>>>>>>>>> Resource Group:
>> >>>> > grpStonith2
>> >>>> >>>>>>>>>>>>>>> Stonith2-1
>> >>>> >>>>>>>> (stonith:external/stonith-helper): Stopped
>> >>>> >>>>>>>>>>>>>>> Stonith2-2
>> >>>> > (stonith:external/xen0):
>> >>>> >>>>>>>> Stopped
>> >>>> >>>>>>>>>>>>>>> Clone Set: clone_ping
>> >>>> > [ping]
>> >>>> >>>>>>>>>>>>>>> Started: [
>> >>>> > lbv1.beta.com lbv2.beta.com ]
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Node Attributes:
>> >>>> >>>>>>>>>>>>>>> * Node lbv1.beta.com:
>> >>>> >>>>>>>>>>>>>>> +
>> >>>> > default_ping_set : 100
>> >>>> >>>>>>>>>>>>>>> * Node lbv2.beta.com:
>> >>>> >>>>>>>>>>>>>>> +
>> >>>> > default_ping_set : 100
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Migration summary:
>> >>>> >>>>>>>>>>>>>>> * Node lbv2.beta.com:
>> >>>> >>>>>>>>>>>>>>> Stonith1-1:
>> >>>> > migration-threshold=1
>> >>>> >>>>>>>> fail-count=1000000 last-failure='Tue Mar 17
>> >>>> >>>>>>>>>>>>>>> 14:12:16 2015'
>> >>>> >>>>>>>>>>>>>>> * Node lbv1.beta.com:
>> >>>> >>>>>>>>>>>>>>> Stonith2-1:
>> >>>> > migration-threshold=1
>> >>>> >>>>>>>> fail-count=1000000 last-failure='Tue Mar 17
>> >>>> >>>>>>>>>>>>>>> 14:12:21 2015'
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Failed actions:
>> >>>> >>>>>>>>>>>>>>> Stonith1-1_start_0
>> >>>> > on lbv2.beta.com 'unknown
>> >>>> >>>>>>>> error' (1): call=31, st
>> >>>> >>>>>>>>>>>>>>> atus=Error,
>> >>>> > last-rc-change='Tue Mar 17 14:12:14
>> >>>> >>>>>>>> 2015', queued=0ms, exec=1065ms
>> >>>> >>>>>>>>>>>>>>> Stonith2-1_start_0
>> >>>> > on lbv1.beta.com 'unknown
>> >>>> >>>>>>>> error' (1): call=26, st
>> >>>> >>>>>>>>>>>>>>> atus=Error,
>> >>>> > last-rc-change='Tue Mar 17 14:12:19
>> >>>> >>>>>>>> 2015', queued=0ms, exec=1081ms
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> I looked through the other logs.
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> This is from heartbeat startup.
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> # less
>> >>>> > /var/log/pm_logconv.out
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:28
>> >>>> > lbv1.beta.com info: Starting
>> >>>> >>>>>>>> Heartbeat 3.0.6.
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:33
>> >>>> > lbv1.beta.com info: Link
>> >>>> >>>>>>>> lbv2.beta.com:eth1 is up.
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34
>> >>>> > lbv1.beta.com info: Start
>> >>>> >>>>>>>> "ccm" process. (pid=13264)
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34
>> >>>> > lbv1.beta.com info: Start
>> >>>> >>>>>>>> "lrmd" process. (pid=13267)
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34
>> >>>> > lbv1.beta.com info: Start
>> >>>> >>>>>>>> "attrd" process. (pid=13268)
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34
>> >>>> > lbv1.beta.com info: Start
>> >>>> >>>>>>>> "stonithd" process. (pid=13266)
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34
>> >>>> > lbv1.beta.com info: Start
>> >>>> >>>>>>>> "cib" process. (pid=13265)
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34
>> >>>> > lbv1.beta.com info: Start
>> >>>> >>>>>>>> "crmd" process. (pid=13269)
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> # less /var/log/error
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
>> >>>> > crmd[13269]: error:
>> >>>> >>>>>>>> process_lrm_event: Operation Stonith2-1_start_0
>> >>>> > (node=lbv1.beta.com, call=26,
>> >>>> >>>>>>>> status=4, cib-update=19, confirmed=true) Error
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> > These are the syslog lines that match a grep for stonith:
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1
>> >>>> > heartbeat: [13255]: info:
>> >>>> >>>>>>>> Starting child client
>> >>>> >>>>>>>>
>> >>>> > "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1
>> >>>> > heartbeat: [13266]: info:
>> >>>> >>>>>>>> Starting
>> >>>> > "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0
>> >>>> >>>>>>>> gid 0 (pid 13266)
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1
>> >>>> > stonithd[13266]: notice:
>> >>>> >>>>>>>> crm_cluster_connect: Connecting to cluster
>> >>>> > infrastructure: heartbeat
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1
>> >>>> > heartbeat: [13255]: info: the
>> >>>> >>>>>>>> send queue length from heartbeat to client stonithd
>> >>>> > is set to 1024
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1
>> >>>> > stonithd[13266]: notice:
>> >>>> >>>>>>>> setup_cib: Watching for stonith topology changes
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1
>> >>>> > stonithd[13266]: notice:
>> >>>> >>>>>>>> unpack_config: On loss of CCM Quorum: Ignore
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1
>> >>>> > stonithd[13266]: warning:
>> >>>> >>>>>>>> handle_startup_fencing: Blind faith: not fencing
>> >>>> > unseen nodes
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1
>> >>>> > stonithd[13266]: warning:
>> >>>> >>>>>>>> handle_startup_fencing: Blind faith: not fencing
>> >>>> > unseen nodes
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1
>> >>>> > stonithd[13266]: notice:
>> >>>> >>>>>>>> stonith_device_register: Added 'Stonith2-1'
>> >>>> > to the device list (1 active
>> >>>> >>>>>>>> devices)
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1
>> >>>> > stonithd[13266]: notice:
>> >>>> >>>>>>>> stonith_device_register: Added 'Stonith2-2'
>> >>>> > to the device list (2 active
>> >>>> >>>>>>>> devices)
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:12:04 lbv1
>> >>>> > stonithd[13266]: notice:
>> >>>> >>>>>>>> xml_patch_version_check: Versions did not change in
>> >>>> > patch 0.5.0
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
>> >>>> > stonithd[13266]: notice:
>> >>>> >>>>>>>> log_operation: Operation 'monitor' [13386]
>> >>>> > for device
>> >>>> >>>>>>>> 'Stonith2-1' returned: -201 (Generic
>> >>>> > Pacemaker error)
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
>> >>>> > stonithd[13266]: warning:
>> >>>> >>>>>>>> log_operation: Stonith2-1:13386 [ Performing:
>> >>>> > stonith -t external/stonith-helper
>> >>>> >>>>>>>> -S ]
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
>> >>>> > stonithd[13266]: warning:
>> >>>> >>>>>>>> log_operation: Stonith2-1:13386 [ failed to exec
>> >>>> > "stonith" ]
>> >>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
>> >>>> > stonithd[13266]: warning:
>> >>>> >>>>>>>> log_operation: Stonith2-1:13386 [ failed: 2 ]
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Thank you in advance.
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> That is all.
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> On March 17, 2015 at 13:32,
>> >>>> > <renayama19661014@ybb.ne.jp> wrote:
>> >>>> >>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>> Fukuda-san,
>> >>>> >>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>> Hello, this is Yamauchi.
>> >>>> >>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>> In that case, the problem appears to be in the start operation of stonith-helper.
>> >>>> >>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>> If you put the following line at the top of stonith-helper
>> >>>> >>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>> #!/bin/bash -x
>> >>>> >>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>> and then start the cluster, you may learn something.
>> >>>> >>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>> Incidentally, I would expect stonith-helper's own log to be written somewhere as well...
>> >>>> >>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>> That is all.
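The `-x` suggestion above relies on the shell's trace output, which goes to standard error and may not end up anywhere visible when the script is started by a daemon. The following is a minimal sketch of one way to capture that trace in a file instead. The file location `TRACE_FILE` and the helper `run_traced` are hypothetical names chosen for illustration; neither appears in the thread or in stonith-helper itself.

```shell
#!/bin/sh
# Hypothetical trace location; not taken from the thread.
TRACE_FILE="${TRACE_FILE:-/tmp/stonith-helper-trace.$$}"

# Run a command with xtrace enabled, appending the trace (written to
# stderr by the shell) to TRACE_FILE. The subshell keeps the redirection
# and the -x setting from leaking into the caller.
run_traced() {
    (
        exec 2>>"$TRACE_FILE"
        set -x
        "$@"
    )
}

# Example: the command's normal output still goes to stdout, while the
# trace line for it is appended to TRACE_FILE.
run_traced echo "status check"
```

The same two lines, `exec 2>>file` followed by `set -x`, could be placed near the top of a script under test as an alternative to editing the shebang line.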
>> >>>> >>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>> ----- Original
>> >>>> > Message -----
>> >>>> >>>>>>>>>>>>>>>>> From: Masamichi
>> >>>> > Fukuda - elf-systems
>> >>>> >>>>>>>> <masamichi_fukuda@elf-systems.com>
>> >>>> >>>>>>>>>>>>>>>>> To: 山内英生
>> >>>> > <renayama19661014@ybb.ne.jp>;
>> >>>> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
>> >>>> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
>> >>>> >>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>> Date:
>> >>>> > 2015/3/17, Tue 12:31
>> >>>> >>>>>>>>>>>>>>>>> Subject: Re:
>> >>>> > [Linux-ha-jp]
>> >>>> >>>>>>>> Regarding STONITH errors during split-brain
>> >>>> >>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>> Yamauchi-san,
>> >>>> >>>>>>>>>>>>>>>>> cc: Matsushima-san
>> >>>> >>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>> Hello, this is Fukuda.
>> >>>> >>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>> xen0 is present in the same directory.
>> >>>> >>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>> # pwd
>> >>>> >>>>>>>>>>>>>>>>>
>> >>>> > /usr/local/heartbeat/lib/stonith/plugins/external
>> >>>> >>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>> # ls
>> >>>> >>>>>>>>>>>>>>>>> drac5
>> >>>> > ibmrsa kdumpcheck
>> >>>> >>>>>>>> riloe vmware
>> >>>> >>>>>>>>>>>>>>>>> dracmc-telnet
>> >>>> > ibmrsa-telnet libvirt
>> >>>> >>>>>>>> ssh xen0
>> >>>> >>>>>>>>>>>>>>>>> hetzner
>> >>>> > ipmi nut
>> >>>> >>>>>>>> stonith-helper xen0-ha
>> >>>> >>>>>>>>>>>>>>>>> hmchttp
>> >>>> > ippower9258 rackpdu
>> >>>> >>>>>>>> vcenter
>> >>>> >>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>> Thank you in advance.
>> >>>> >>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>> That is all.
>> >>>> >>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>> 2015-03-17
>> >>>> > 10:53 GMT+09:00
>> >>>> >>>>>>>> <renayama19661014@ybb.ne.jp>:
>> >>>> >>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>> Fukuda-san,
>> >>>> >>>>>>>>>>>>>>>>>> cc: Matsushima-san
>> >>>> >>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>> Hello, this is Yamauchi.
>> >>>> >>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> There was no standard output or standard error output.
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
>> >>>> >>>>>>>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>> >>>> >>>>>>>>>>>>>>>>>>> stonith-helper is located here:
>> >>>> >>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>> >>>> >>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>> Is xen0 also in this directory?
>> >>>> >>>>>>>>>>>>>>>>>> If it is not, that is a problem; please try copying the stonith-helper file,
>> >>>> >>>>>>>>>>>>>>>>>> with its attributes preserved, into the same directory as xen0.
>> >>>> >>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>> If it works after that, the pm_extras installation is at fault.
>> >>>> >>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>> That is all.
>> >>>> >>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>> -----
>> >>>> > Original Message -----
>> >>>> >>>>>>>>>>>>>>>>>>> From:
>> >>>> > Masamichi Fukuda - elf-systems
>> >>>> >>>>>>>> <masamichi_fukuda@elf-systems.com>
>> >>>> >>>>>>>>>>>>>>>>>>> To:
>> >>>> > 山内英生
>> >>>> >>>>>>>> <renayama19661014@ybb.ne.jp>;
>> >>>> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
>> >>>> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
>> >>>> >>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> Date:
>> >>>> > 2015/3/17, Tue 10:31
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> > Subject: Re: [Linux-ha-jp]
>> >>>> >>>>>>>> Regarding STONITH errors during split-brain
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> Yamauchi-san,
>> >>>> >>>>>>>>>>>>>>>>>>> cc: Matsushima-san
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> Good morning, this is Fukuda.
>> >>>> >>>>>>>>>>>>>>>>>>> Thank you for the crm example.
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> I have adapted it to my environment right away.
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> $ cat
>> >>>> > test.crm
>> >>>> >>>>>>>>>>>>>>>>>>> ###
>> >>>> > Cluster Option ###
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> > property \
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>> no-quorum-policy="ignore" \
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> > stonith-enabled="true"
>> >>>> >>>>>>>> \
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>> startup-fencing="false" \
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> > stonith-timeout="710s"
>> >>>> >>>>>>>> \
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>> crmd-transition-delay="2s"
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> ###
>> >>>> > Resource Default ###
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> > rsc_defaults \
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>> resource-stickiness="INFINITY" \
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>> migration-threshold="1"
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> ###
>> >>>> > Group Configuration ###
>> >>>> >>>>>>>>>>>>>>>>>>> group
>> >>>> > HAvarnish \
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> > vip_208 \
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> > varnishd
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> group
>> >>>> > grpStonith1 \
>> >>>> >>>>>>>>>>>>>>>>>>>     Stonith1-1 \
>> >>>> >>>>>>>>>>>>>>>>>>>     Stonith1-2
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> group grpStonith2 \
>> >>>> >>>>>>>>>>>>>>>>>>>     Stonith2-1 \
>> >>>> >>>>>>>>>>>>>>>>>>>     Stonith2-2
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> ### Clone Configuration ###
>> >>>> >>>>>>>>>>>>>>>>>>> clone clone_ping \
>> >>>> >>>>>>>>>>>>>>>>>>>     ping
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> ### Fencing Topology ###
>> >>>> >>>>>>>>>>>>>>>>>>> fencing_topology \
>> >>>> >>>>>>>>>>>>>>>>>>>     lbv1.beta.com: Stonith1-1 Stonith1-2 \
>> >>>> >>>>>>>>>>>>>>>>>>>     lbv2.beta.com: Stonith2-1 Stonith2-2
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> ### Primitive Configuration ###
>> >>>> >>>>>>>>>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
>> >>>> >>>>>>>>>>>>>>>>>>>     params \
>> >>>> >>>>>>>>>>>>>>>>>>>         ip="192.168.17.208" \
>> >>>> >>>>>>>>>>>>>>>>>>>         nic="eth0" \
>> >>>> >>>>>>>>>>>>>>>>>>>         cidr_netmask="24" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="5s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> primitive varnishd lsb:varnish \
>> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
>> >>>> >>>>>>>>>>>>>>>>>>>     params \
>> >>>> >>>>>>>>>>>>>>>>>>>         name="default_ping_set" \
>> >>>> >>>>>>>>>>>>>>>>>>>         host_list="192.168.17.254" \
>> >>>> >>>>>>>>>>>>>>>>>>>         multiplier="100" \
>> >>>> >>>>>>>>>>>>>>>>>>>         dampen="1" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
>> >>>> >>>>>>>>>>>>>>>>>>>     params \
>> >>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>> >>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>> >>>> >>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com" \
>> >>>> >>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.132 10.0.17.132" \
>> >>>> >>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>> >>>> >>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
>> >>>> >>>>>>>>>>>>>>>>>>>     params \
>> >>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>> >>>> >>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
>> >>>> >>>>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
>> >>>> >>>>>>>>>>>>>>>>>>>     params \
>> >>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>> >>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>> >>>> >>>>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com" \
>> >>>> >>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.133 10.0.17.133" \
>> >>>> >>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>> >>>> >>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> primitive Stonith2-2 stonith:external/xen0 \
>> >>>> >>>>>>>>>>>>>>>>>>>     params \
>> >>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>> >>>> >>>>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
>> >>>> >>>>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> ### Resource Location ###
>> >>>> >>>>>>>>>>>>>>>>>>> location HA_location-1 HAvarnish \
>> >>>> >>>>>>>>>>>>>>>>>>>     rule 200: #uname eq lbv1.beta.com \
>> >>>> >>>>>>>>>>>>>>>>>>>     rule 100: #uname eq lbv2.beta.com
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> location HA_location-2 HAvarnish \
>> >>>> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> location HA_location-3 grpStonith1 \
>> >>>> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv1.beta.com
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> location HA_location-4 grpStonith2 \
>> >>>> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv2.beta.com
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> After loading this configuration, the messages differ from yesterday's.
>> >>>> >>>>>>>>>>>>>>>>>>> The ping messages no longer appear.
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> # crm_mon -rfA
>> >>>> >>>>>>>>>>>>>>>>>>> Last updated: Tue Mar 17 10:21:28 2015
>> >>>> >>>>>>>>>>>>>>>>>>> Last change: Tue Mar 17 10:21:09 2015
>> >>>> >>>>>>>>>>>>>>>>>>> Stack: heartbeat
>> >>>> >>>>>>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>> >>>> >>>>>>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
>> >>>> >>>>>>>>>>>>>>>>>>> 2 Nodes configured
>> >>>> >>>>>>>>>>>>>>>>>>> 8 Resources configured
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> Full list of resources:
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>  Resource Group: HAvarnish
>> >>>> >>>>>>>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>> >>>> >>>>>>>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>> >>>> >>>>>>>>>>>>>>>>>>>  Resource Group: grpStonith1
>> >>>> >>>>>>>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
>> >>>> >>>>>>>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
>> >>>> >>>>>>>>>>>>>>>>>>>  Resource Group: grpStonith2
>> >>>> >>>>>>>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
>> >>>> >>>>>>>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
>> >>>> >>>>>>>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
>> >>>> >>>>>>>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> Node Attributes:
>> >>>> >>>>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
>> >>>> >>>>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
>> >>>> >>>>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
>> >>>> >>>>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> Migration summary:
>> >>>> >>>>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
>> >>>> >>>>>>>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>> >>>> >>>>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
>> >>>> >>>>>>>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> Failed actions:
>> >>>> >>>>>>>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
>> >>>> >>>>>>>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
>> >>>> >>>>>>>>>>>>>>>>>>>
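The migration summary in this crm_mon output reports fail-count=1000000 for Stonith1-1 and Stonith2-1, which with migration-threshold=1 would keep them from being started again on those nodes. As a rough sketch only, the affected resources could be pulled out of saved crm_mon output as below. The function name list_failed_resources is made up for illustration, and the pattern assumes the "migration-threshold=... fail-count=..." line format seen in this thread.

```shell
#!/usr/bin/env bash
# list_failed_resources (hypothetical helper): read crm_mon text on stdin and
# print "<resource> <fail-count>" for every migration-summary line.
# Leading whitespace or mail quote markers before the resource name are skipped.
list_failed_resources() {
    sed -n 's/^[^A-Za-z0-9]*\([A-Za-z0-9_-]*\): migration-threshold=[0-9]* fail-count=\([0-9]*\).*/\1 \2/p'
}
```

It might be used as `crm_mon -1 -f | list_failed_resources`. Once the underlying cause is fixed, the counts would typically be cleared with `crm_resource -C -r <resource>`; that step is not shown in the thread.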
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> This is the log from /var/log/ha-debug:
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
>> >>>> >>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
>> >>>> >>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> There was no output on standard output or standard error.
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
>> >>>> >>>>>>>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>> >>>> >>>>>>>>>>>>>>>>>>> stonith-helper is located here:
>> >>>> >>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> Thank you in advance.
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> That is all.
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> 2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> Fukuda-san,
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> Good morning, this is Yamauchi.
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> Just in case, I am sending an excerpt of an example I have on hand that uses multiple STONITH devices.
>> >>>> >>>>>>>>>>>>>>>>>>>> (In the actual file, take care with the line breaks.)
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> The example below is a configuration for the PM 1.1 series:
>> >>>> >>>>>>>>>>>>>>>>>>>> for nodea, STONITH is executed in the order prmStonith1-1, prmStonith1-2;
>> >>>> >>>>>>>>>>>>>>>>>>>> for nodeb, STONITH is executed in the order prmStonith2-1, prmStonith2-2.
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> The STONITH devices themselves are helper and ssh.
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> (snip)
>> >>>> >>>>>>>>>>>>>>>>>>>> ### Group Configuration ###
>> >>>> >>>>>>>>>>>>>>>>>>>> group grpStonith1 \
>> >>>> >>>>>>>>>>>>>>>>>>>>     prmStonith1-1 \
>> >>>> >>>>>>>>>>>>>>>>>>>>     prmStonith1-2
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> group grpStonith2 \
>> >>>> >>>>>>>>>>>>>>>>>>>>     prmStonith2-1 \
>> >>>> >>>>>>>>>>>>>>>>>>>>     prmStonith2-2
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> ### Fencing Topology ###
>> >>>> >>>>>>>>>>>>>>>>>>>> fencing_topology \
>> >>>> >>>>>>>>>>>>>>>>>>>>     nodea: prmStonith1-1 prmStonith1-2 \
>> >>>> >>>>>>>>>>>>>>>>>>>>     nodeb: prmStonith2-1 prmStonith2-2
>> >>>> >>>>>>>>>>>>>>>>>>>> (snip)
>> >>>> >>>>>>>>>>>>>>>>>>>> primitive prmStonith1-1 stonith:external/stonith-helper \
>> >>>> >>>>>>>>>>>>>>>>>>>>     params \
>> >>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>> >>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>> >>>> >>>>>>>>>>>>>>>>>>>>         hostlist="nodea" \
>> >>>> >>>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.28.60 192.168.28.70" \
>> >>>> >>>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>> >>>> >>>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
>> >>>> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> primitive prmStonith1-2 stonith:external/ssh \
>> >>>> >>>>>>>>>>>>>>>>>>>>     params \
>> >>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>> >>>> >>>>>>>>>>>>>>>>>>>>         hostlist="nodea" \
>> >>>> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> primitive prmStonith2-1 stonith:external/stonith-helper \
>> >>>> >>>>>>>>>>>>>>>>>>>>     params \
>> >>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
>> >>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
>> >>>> >>>>>>>>>>>>>>>>>>>>         hostlist="nodeb" \
>> >>>> >>>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.28.61 192.168.28.71" \
>> >>>> >>>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>> >>>> >>>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
>> >>>> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> primitive prmStonith2-2 stonith:external/ssh \
>> >>>> >>>>>>>>>>>>>>>>>>>>     params \
>> >>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
>> >>>> >>>>>>>>>>>>>>>>>>>>         hostlist="nodeb" \
>> >>>> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> >>>> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>> >>>>>>>>>>>>>>>>>>>> (snip)
>> >>>> >>>>>>>>>>>>>>>>>>>> location rsc_location-grpStonith1-2 grpStonith1 \
>> >>>> >>>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodea
>> >>>> >>>>>>>>>>>>>>>>>>>> location rsc_location-grpStonith2-3 grpStonith2 \
>> >>>> >>>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodeb
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>> That is all.
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>> --
>> >>>> >>>>>>>>>>>>>>>>>>> ELF Systems
>> >>>> >>>>>>>>>>>>>>>>>>> Masamichi Fukuda
>> >>>> >>>>>>>>>>>>>>>>>>> mail to: masamichi_fukuda@elf-systems.com
>> >>>> >>>>>>>>>>>>>>>>>>>
>> >>>> >>>>>>>>>>>>>>>>>>>



--
ELF Systems
Masamichi Fukuda
mail to: *masamichi_fukuda@elf-systems.com <elfsystems.com@gmail.com>*
Re: スプリットブレイン時のSTONITHエラーについて [ In reply to ]
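Fukuda-san,


Hello, this is Yamauchi.

>In my environment, after switching PM1.1.12 back and forth between build:e32080b and build:561c4cf several times, the nodes started rebooting repeatedly.

In that situation, before switching builds, it is better to:
1) run make uninstall in the source directory of the version you were using, and
2) delete the contents of the /var/lib/pacemaker/cib and /var/lib/pacemaker/pengine directories.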
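
As a minimal sketch of step 2 (the function name clear_pacemaker_state and its optional root-directory argument are hypothetical, added only so the step can be tried against a scratch directory first), the two state directories could be emptied like this. It assumes the cluster stack has been stopped beforehand, which the thread does not state explicitly.

```shell
#!/usr/bin/env bash
# clear_pacemaker_state [STATE_ROOT] (hypothetical helper)
# Empties STATE_ROOT/cib and STATE_ROOT/pengine but keeps the directories
# themselves. STATE_ROOT defaults to /var/lib/pacemaker, the path named above.
clear_pacemaker_state() {
    local state_root="${1:-/var/lib/pacemaker}"
    local sub dir
    for sub in cib pengine; do
        dir="$state_root/$sub"
        # Skip silently if this build never created the directory.
        [ -d "$dir" ] || continue
        # -mindepth 1 protects the directory itself; dotfiles are removed too.
        find "$dir" -mindepth 1 -delete
    done
}
```

Step 1 would come first, for example `(cd /path/to/pacemaker-source && make uninstall)`, where the source path is a placeholder, followed by `clear_pacemaker_state` run as root.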

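>So I did a clean install of debian7.8 again and installed PM1.1.12 build:561c4cf.
>Also, after adding the path you pointed out, I was able to confirm here as well that stonith-helper starts.


I see. That is good news.
That said, it is still a problem if build:e32080b does not work.

If I can find time over the weekend, I will try it here as well.
I will let you know if there is any progress.

That is all.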



----- Original Message -----
>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>Date: 2015/3/20, Fri 16:36
>Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
>
>
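>Yamauchi-san,
>
>Hello, this is Fukuda.
>
>In my environment, after switching PM1.1.12 back and forth between build:e32080b and build:561c4cf several times, the nodes started rebooting repeatedly.
>So I did a clean install of debian7.8 again and installed PM1.1.12 build:561c4cf.
>Also, after adding the path you pointed out, I was able to confirm here as well that stonith-helper starts.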
>
>Last updated: Fri Mar 20 16:26:47 2015
>Last change: Fri Mar 20 16:22:01 2015
>Stack: heartbeat
>Current DC: deb64 (71e563fb-34e1-919f-7515-868014cb501d) - partition with quorum
>
>Version: 1.1.12-561c4cf
>2 Nodes configured
>10 Resources configured
>
>
>Online: [ deb63 deb64 ]
>
>Full list of resources:
>
> Resource Group: HAvarnish
>     vip_208    (ocf::heartbeat:IPaddr2):       Started deb63
>     varnishd   (lsb:varnish):  Started deb63
> Resource Group: grpStonith1
>     Stonith1-1 (stonith:external/stonith-helper):      Started deb64
>     Stonith1-2 (stonith:external/ssh): Started deb64
>     Stonith1-3 (stonith:meatware):     Started deb64
> Resource Group: grpStonith2
>     Stonith2-1 (stonith:external/stonith-helper):      Started deb63
>     Stonith2-2 (stonith:external/ssh): Started deb63
>     Stonith2-3 (stonith:meatware):     Started deb63
> Clone Set: clone_ping [ping]
>     Started: [ deb63 deb64 ]
>
>Node Attributes:
>* Node deb63:
>    + default_ping_set                  : 100
>* Node deb64:
>    + default_ping_set                  : 100
>
>Migration summary:
>* Node deb64:
>* Node deb63:
>
>Thank you in advance.
>
>That is all.
>
>
>
>
>
>
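>2015-03-18 18:32 Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>:
>
>>Yamauchi-san,
>>
>>Good evening, this is Fukuda.
>>Thank you for verifying this on Debian.
>>
>>> It seems the rng files of the newer Pacemaker (later than Pacemaker 1.1.12) are the cause.
>>> However, I do not yet know a workaround for this.
>>
>>Please let me know once you find a workaround.
>>
>>> However, the problem with the combination with the latest PM is not yet resolved,
>>> so whether this configuration (PM1.1.12+Heartbeat3.0.6) works correctly is a separate matter.
>>> # It probably appears to work, but I expect problems to show up.
>>
>>I will go back to PM1.1.12 for now and try the same procedure.
>>First I will check whether stonith-helper runs.
>>
>>> As for your stonith-helper failing on start, I think the cause is probably that the stonith command is not on the PATH.
>>
>>That looks like a beginner's mistake on my part; I am embarrassed.
>>I will try this in the same way here.
>>
>>Thank you in advance.
>>
>>That is all.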
>>
>>
>>
>>
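>>2015-03-18 17:56 <renayama19661014@ybb.ne.jp>:
>>
>>>Fukuda-san,
>>>
>>>Good evening, this is Yamauchi.
>>>
>>>The same situation occurred on my side.
>>>It seems the rng files of the newer Pacemaker (later than Pacemaker 1.1.12) are the cause.
>>>However, I do not yet know a workaround for this.
>>>
>>>Incidentally, with the combination of Pacemaker 1.1.12 and Heartbeat 3.0.6, which is not really known to work, I confirmed on a single node that stonith-helper starts.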
>>>
>>>root@debian7-1:~# crm_mon -1 -Af
>>>Last updated: Wed Mar 18 17:43:37 2015
>>>Last change: Wed Mar 18 17:43:29 2015
>>>Stack: heartbeat
>>>Current DC: debian7-1 (d20c7df5-519e-4a4c-9b4b-1b88fc203133) - partition with quorum
>>>Version: 1.1.12-561c4cf
>>>1 Nodes configured
>>>3 Resources configured
>>>
>>>
>>>Online: [ debian7-1 ]
>>>
>>> prmDummy(ocf::pacemaker:Dummy):Started debian7-1 
>>> Resource Group: grpStonith2
>>>     Stonith2-1(stonith:external/stonith-helper):Started debian7-1 
>>>
>>>Node Attributes:
>>>* Node debian7-1:
>>>
>>>Migration summary:
>>>* Node debian7-1: 
>>>
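>>>Some steps in Matsushima-san's procedure did not work for me (probably because I am not used to Debian), but I kept the same build options,
>>>installed everything, and copied only the stonith-helper included in the latest pm_extras_1.0 into the same directory as xen0.
>>># If there is a problem with, for example, the execute permission of stonith-helper, set it correctly.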
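
A small sketch of the permission check mentioned above, under the assumption that the plugin lives in the external plugin directory reported earlier in this thread. The function name check_plugin and its return codes are illustrative, not part of cluster-glue.

```shell
#!/usr/bin/env bash
# check_plugin DIR NAME (hypothetical helper)
# Returns 0 if DIR/NAME is a regular, executable file,
# 1 if it is missing, 2 if it exists but is not executable.
check_plugin() {
    local plugin="$1/$2"
    if [ ! -f "$plugin" ]; then
        echo "missing: $plugin"
        return 1
    fi
    if [ ! -x "$plugin" ]; then
        echo "not executable: $plugin"
        return 2
    fi
    echo "ok: $plugin"
}
```

For the layout in this thread the call would be `check_plugin /usr/local/heartbeat/lib/stonith/plugins/external stonith-helper`; comparing the mode with that of the neighbouring xen0 plugin (`ls -l`) is one way to decide what to pass to chmod.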
>>>
>>>As for your stonith-helper failing on start, I think the cause is probably that the stonith command is not on the PATH.
>>>
>>>root@debian7-1:~# find / -name stonith -print
>>>/usr/local/heartbeat/sbin/stonith
>>>
>>>root@debian7-1:~# echo $PATH
>>>/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/heartbeat/sbin/
>>>
>>>
>>>After adding /usr/local/heartbeat/sbin to PATH and starting heartbeat again, the crm_mon display became as shown above.
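
One way to make that PATH change repeatable is sketched below. The helper name path_append is made up for this example; where it should be called (a shell profile, the heartbeat init script, or elsewhere) depends on how heartbeat is started and is not specified in the thread.

```shell
#!/usr/bin/env bash
# path_append DIR (hypothetical helper)
# Appends DIR to PATH unless it is already listed, with or without a
# trailing slash.
path_append() {
    local dir="${1%/}"                # drop one trailing slash for comparison
    case ":$PATH:" in
        *":$dir:"*|*":$dir/:"*) ;;    # already present, nothing to do
        *) PATH="${PATH:+$PATH:}$dir" ;;
    esac
    export PATH
}
```

After `path_append /usr/local/heartbeat/sbin`, `command -v stonith` should print the full path of the stonith command if the directory is the right one.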
>>>
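>>>However, the problem with the combination with the latest PM is not yet resolved, so whether this configuration (PM1.1.12+Heartbeat3.0.6) works correctly is a separate matter.
>>># It probably appears to work, but I expect problems to show up.
>>>
>>>Below is the crm file I loaded as a trial.
>>>(The values of parameters such as dead_check_target and standby_check_command were only meant to confirm startup, so in this configuration they are effectively meaningless.)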
>>>
>>>### Cluster Option ###
>>>property \
>>>    no-quorum-policy="ignore" \
>>>    stonith-enabled="true" \
>>>    startup-fencing="false"
>>>
>>>### Resource Default ###
>>>rsc_defaults \
>>>    resource-stickiness="INFINITY" \
>>>    migration-threshold="1"
>>>
>>>### Fencing Topology ###
>>>fencing_topology \
>>>    debian7-1: Stonith1-1 \
>>>    debian7-2: Stonith2-1
>>>
>>>group grpStonith1 \
>>>    Stonith1-1
>>>
>>>group grpStonith2 \
>>>    Stonith2-1
>>>
>>>primitive prmDummy ocf:pacemaker:Dummy \
>>>    op start interval="0s" timeout="60s" on-fail="restart" \
>>>    op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>    op stop interval="0s" timeout="60s" on-fail="ignore"
>>>
>>>primitive Stonith1-1 stonith:external/stonith-helper \
>>>    params \
>>>        pcmk_reboot_retries="1" \
>>>        pcmk_reboot_timeout="40s" \
>>>        hostlist="debian7-1" \
>>>        dead_check_target="192.168.3.1" \
>>>        standby_wait_time="10" \
>>>        standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>    op start interval="0s" timeout="60s" on-fail="restart" \
>>>    op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>    op stop interval="0s" timeout="60s" on-fail="ignore"
>>>
>>>primitive Stonith2-1 stonith:external/stonith-helper \
>>>    params \
>>>        pcmk_reboot_retries="1" \
>>>        pcmk_reboot_timeout="40s" \
>>>        hostlist="debian7-2" \
>>>        dead_check_target="192.168.3.1" \
>>>        standby_wait_time="10" \
>>>        standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>    op start interval="0s" timeout="60s" on-fail="restart" \
>>>    op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>    op stop interval="0s" timeout="60s" on-fail="ignore"
>>>
>>>
>>>location HA_location-3 grpStonith1 \
>>>   rule -INFINITY: #uname eq debian7-1
>>>
>>>location HA_location-4 grpStonith2 \
>>>   rule -INFINITY: #uname eq debian7-2
>>>
>>>
>>>I will let you know if I find out anything more.
>>>
>>>That is all.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>----- Original Message -----
>>>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>
>>>>Date: 2015/3/18, Wed 15:09
>>>>Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
>>>>
>>>>
>>>>Yamauchi-san,
>>>>
>>>>Hello, this is Fukuda.
>>>>
>>>>I installed a fresh debian7.8 on VirtualBox
>>>>and installed heartbeat + pacemaker on it.
>>>>
>>>>
>>>>I did not install heartbeat, pacemaker, etc. from packages.
>>>>
>>>>
>>>>heartbeat starts, but loading the crm file produced errors.
>>>>
>>>>
>>>># crm configure load update test1.crm
>>>>
>>>>ERROR: crmd:metadata: got no meta-data, does this RA exist?
>>>>ERROR: cib-bootstrap-options: attribute no-quorum-policy does not exist
>>>>ERROR: cib-bootstrap-options: attribute stonith-enabled does not exist
>>>>ERROR: cib-bootstrap-options: attribute crmd-transition-delay does not exist
>>>>ERROR: pengine:metadata: got no meta-data, does this RA exist?
>>>>
>>>>Could this be related to the agents under external not being recognized?
>>>>
>>>>Thank you in advance.
>>>>
>>>>That is all.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>2015-03-18 12:13 <renayama19661014@ybb.ne.jp>:
>>>>
>>>>>Fukuda-san,
>>>>>
>>>>>Hello, this is Yamauchi.
>>>>>
>>>>>Understood.
>>>>>Thank you for letting me know.
>>>>>
>>>>>That is all.
>>>>>
>>>>>
>>>>>
>>>>>----- Original Message -----
>>>>>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>
>>>>>>Date: 2015/3/18, Wed 10:23
>>>>>>Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
>>>>>>
>>>>>>
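>>>>>>Yamauchi-san,
>>>>>>
>>>>>>Hello, this is Fukuda.
>>>>>>
>>>>>>In my environment the following had been installed from packages,
>>>>>>so I first removed them with apt-get remove:
>>>>>>
>>>>>>heartbeat, libheartbeat2, pacemaker, corosync, resource-agents
>>>>>>
>>>>>>Also, the hacluster user and the haclient group had already been
>>>>>>created when the packages were installed.
>>>>>>
>>>>>>So, in Matsushima-san's procedure, I carried out everything from
>>>>>>
>>>>>>Preparation
>>>>>>apt-get install build-essential mercurial git \
>>>>>>
>>>>>>onward. The rest of the procedure is exactly the same.
>>>>>>
>>>>>>Thank you in advance.
>>>>>>
>>>>>>That is all.
>>>>>>
>>>>>>2015-03-18 10:06 <renayama19661014@ybb.ne.jp>: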
>>>>>>>
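>>>>>>> Fukuda-san,
>>>>>>>
>>>>>>> Hello, this is Yamauchi.
>>>>>>>
>>>>>>> By the way, to double-check before I build it myself: did you follow
>>>>>>> Matsushima-san's procedure, summarized at the link below, exactly?
>>>>>>>
>>>>>>>  * https://gist.github.com/takehironet/1469bd7123f63d61f843
>>>>>>>
>>>>>>> If there are any differences, please let me know once more.
>>>>>>>
>>>>>>> # In particular, I recall that the apt-get step for the initial build packages did not go well when I briefly tried it, so that part concerns me.
>>>>>>>
>>>>>>>
>>>>>>> That is all.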
>>>>>>>
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>> > From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
>>>>>>> > To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>> > Cc:
>>>>>>> > Date: 2015/3/18, Wed 09:53
>>>>>>> > Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>> >
>>>>>>> > Fukuda-san,
>>>>>>> >
>>>>>>> > Hello, this is Yamauchi.
>>>>>>> >
>>>>>>> >> At least stonith -L does list the plugins.
>>>>>>> >>
>>>>>>> >> # /usr/local/heartbeat/sbin/stonith -L
>>>>>>> >
>>>>>>> > That command should be part of the Heartbeat source, so it suggests there is no problem between Heartbeat and glue.
>>>>>>> >
>>>>>>> > So a problem with the pacemaker installation seems the more likely cause.
>>>>>>> >
>>>>>>> > Either way, I will find time to build the environment here as well.
>>>>>>> >
>>>>>>> > That is all.
>>>>>>> >
>>>>>>> >
>>>>>>> > ----- Original Message -----
>>>>>>> >> From: Masamichi Fukuda - elf-systems
>>>>>>> > <masamichi_fukuda@elf-systems.com>
>>>>>>> >> To: 山内英生 <renayama19661014@ybb.ne.jp>;
>>>>>>> > "linux-ha-japan@lists.sourceforge.jp"
>>>>>>> > <linux-ha-japan@lists.sourceforge.jp>
>>>>>>> >> Date: 2015/3/18, Wed 09:33
>>>>>>> >> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> Yamauchi-san,
>>>>>>> >>
>>>>>>> >> Hello, this is Fukuda.
>>>>>>> >>
>>>>>>> >>> By "Reusable" I meant glue.
>>>>>>> >>
>>>>>>> >> Understood. You mean Cluster-glue.
>>>>>>> >>
>>>>>>> >>> This is as expected; I think the agents under external/ are not being recognized,
>>>>>>> >>> so the resources are not starting.
>>>>>>> >>
>>>>>>> >> At least stonith -L does list the plugins.
>>>>>>> >>
>>>>>>> >> # /usr/local/heartbeat/sbin/stonith -L
>>>>>>> >> apcmaster
>>>>>>> >> apcsmart
>>>>>>> >> baytech
>>>>>>> >> cyclades
>>>>>>> >> external/drac5
>>>>>>> >> external/dracmc-telnet
>>>>>>> >> external/hetzner
>>>>>>> >> external/hmchttp
>>>>>>> >> external/ibmrsa
>>>>>>> >> external/ibmrsa-telnet
>>>>>>> >> external/ipmi
>>>>>>> >> external/ippower9258
>>>>>>> >> external/kdumpcheck
>>>>>>> >> external/libvirt
>>>>>>> >> external/nut
>>>>>>> >> external/rackpdu
>>>>>>> >> external/riloe
>>>>>>> >> external/ssh
>>>>>>> >> external/stonith-helper
>>>>>>> >> external/vcenter
>>>>>>> >> external/vmware
>>>>>>> >> external/xen0
>>>>>>> >> external/xen0-ha
>>>>>>> >> ibmhmc
>>>>>>> >> meatware
>>>>>>> >> null
>>>>>>> >> nw_rpc100s
>>>>>>> >> rcd_serial
>>>>>>> >> rps10
>>>>>>> >> ssh
>>>>>>> >> suicide
>>>>>>> >> wti_nps
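
To check only the agents discussed in this thread instead of scanning the whole list by eye, the output can be filtered. This is a small sketch, assuming the list format shown above (one plugin name per line); has_plugin and sample_list are hypothetical helper names, not part of cluster-glue.

```shell
#!/bin/sh
# Succeed if the plugin list on stdin contains NAME as an exact, whole line.
has_plugin() {
    grep -Fxq -- "$1"
}

# Stand-in for the real list; on a cluster node, pipe in the output of
# /usr/local/heartbeat/sbin/stonith -L (path as used in this thread) instead.
sample_list() {
    printf '%s\n' external/ssh external/stonith-helper external/xen0 meatware
}

for plugin in external/stonith-helper external/xen0; do
    if sample_list | has_plugin "$plugin"; then
        echo "listed: $plugin"
    else
        echo "missing: $plugin"
    fi
done
```

A plugin appearing in this list only shows that the stonith command itself can see it. As the reply in this thread points out, that does not by itself show that Pacemaker can use it.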
>>>>>>> >>
>>>>>>> >>
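>>>>>>> >>> #I do not think I can say much until I find time to build a Debian 7 environment
>>>>>>> >>> based on Matsushima-san's information.
>>>>>>> >>
>>>>>>> >> I am sorry to trouble you when you are busy.
>>>>>>> >> I will also review the installation on my side.
>>>>>>> >>
>>>>>>> >> Thank you in advance.
>>>>>>> >>
>>>>>>> >> That is all.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On 18 March 2015 at 9:02, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>> >>
>>>>>>> >> Fukuda-san,
>>>>>>> >>>
>>>>>>> >>> Good morning, this is Yamauchi.
>>>>>>> >>>
>>>>>>> >>> I worded that badly.
>>>>>>> >>> By "Reusable" I meant glue.
>>>>>>> >>>
>>>>>>> >>> There may be a problem with the pacemaker installation, but I cannot tell at this point.
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>> I removed stonith-helper and started with only external/ssh,
>>>>>>> >>>> but the status shown by crm_mon did not change.
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>> This is as expected; I think the agents under external/ are not being recognized, so the resources are not starting.
>>>>>>> >>>
>>>>>>> >>> #I do not think I can say much until I find time to build a Debian 7 environment based on Matsushima-san's information.
>>>>>>> >>>
>>>>>>> >>> That is all.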
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>> ----- Original Message -----
>>>>>>> >>>> From: Masamichi Fukuda - elf-systems
>>>>>>> > <masamichi_fukuda@elf-systems.com>
>>>>>>> >>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
>>>>>>> > "linux-ha-japan@lists.sourceforge.jp"
>>>>>>> > <linux-ha-japan@lists.sourceforge.jp>
>>>>>>> >>>
>>>>>>> >>>> Date: 2015/3/18, Wed 08:12
>>>>>>> >>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>> >>>>
>>>>>>> >>>>
>>>>>>> >>>> Yamauchi-san,
>>>>>>> >>>>
>>>>>>> >>>> Good morning, this is Fukuda.
>>>>>>> >>>>
>>>>>>> >>>>>  That means that neither xen0 nor stonith-helper is probably in a path
>>>>>>> >>>>>  that Pacemaker manages for stonith plugins.
>>>>>>> >>>>>
>>>>>>> >>>>>  I suspect the installation of Reusable and pm_extras.
>>>>>>> >>>>
>>>>>>> >>>> Is there a problem with the pacemaker installation?
>>>>>>> >>>> Also, does Reusable need to be installed separately?
>>>>>>> >>>>
>>>>>>> >>>> I removed stonith-helper and started with only external/ssh,
>>>>>>> >>>> but the status shown by crm_mon did not change.
>>>>>>> >>>>
>>>>>>> >>>> Last updated: Wed Mar 18 08:07:42 2015
>>>>>>> >>>> Last change: Wed Mar 18 08:04:48 2015
>>>>>>> >>>> Stack: heartbeat
>>>>>>> >>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) -
>>>>>>> > parti
>>>>>>> >>>> tion with quorum
>>>>>>> >>>> Version: 1.1.12-e32080b
>>>>>>> >>>> 2 Nodes configured
>>>>>>> >>>> 6 Resources configured
>>>>>>> >>>>
>>>>>>> >>>>
>>>>>>> >>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>> >>>>
>>>>>>> >>>> Full list of resources:
>>>>>>> >>>>
>>>>>>> >>>> Stonith1-2      (stonith:external/ssh): Stopped
>>>>>>> >>>> Stonith2-2      (stonith:external/ssh): Stopped
>>>>>>> >>>>  Resource Group: HAvarnish
>>>>>>> >>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started
>>>>>>> > lbv1.beta.com
>>>>>>> >>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>> >>>>  Clone Set: clone_ping [ping]
>>>>>>> >>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>> >>>>
>>>>>>> >>>> Node Attributes:
>>>>>>> >>>> * Node lbv1.beta.com:
>>>>>>> >>>>     + default_ping_set                  : 100
>>>>>>> >>>> * Node lbv2.beta.com:
>>>>>>> >>>>     + default_ping_set                  : 100
>>>>>>> >>>>
>>>>>>> >>>> Migration summary:
>>>>>>> >>>> * Node lbv2.beta.com:
>>>>>>> >>>>    Stonith1-2: migration-threshold=1 fail-count=1000000
>>>>>>> > last-failure='Wed Mar 18
>>>>>>> >>>>  08:07:32 2015'
>>>>>>> >>>> * Node lbv1.beta.com:
>>>>>>> >>>>    Stonith2-2: migration-threshold=1 fail-count=1000000
>>>>>>> > last-failure='Wed Mar 18
>>>>>>> >>>>  08:05:53 2015'
>>>>>>> >>>>
>>>>>>> >>>> Failed actions:
>>>>>>> >>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1):
>>>>>>> > call=23, st
>>>>>>> >>>> atus=Error, exit-reason='none', last-rc-change='Wed Mar
>>>>>>> > 18 08:07:30 2015', queue
>>>>>>> >>>> d=0ms, exec=1061ms
>>>>>>> >>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1):
>>>>>>> > call=23, st
>>>>>>> >>>> atus=Error, exit-reason='none', last-rc-change='Wed Mar
>>>>>>> > 18 08:05:51 2015', queue
>>>>>>> >>>> d=0ms, exec=1062ms
>>>>>>> >>>>
>>>>>>> >>>> Thank you in advance.
>>>>>>> >>>>
>>>>>>> >>>> That is all.
>>>>>>> >>>>
>>>>>>> >>>>
>>>>>>> >>>>
>>>>>>> >>>>
>>>>>>> >>>>
>>>>>>> >>>> On 17 March 2015 at 23:51, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>> >>>>
>>>>>>> >>>> Fukuda-san,
>>>>>>> >>>>>
>>>>>>> >>>>> Good evening, this is Yamauchi.
>>>>>>> >>>>>
>>>>>>> >>>>> That means that neither xen0 nor stonith-helper is probably in a path that Pacemaker manages for stonith plugins.
>>>>>>> >>>>>
>>>>>>> >>>>> I suspect the installation of Reusable and pm_extras.
>>>>>>> >>>>>
>>>>>>> >>>>> I will let you know if I find out anything more.
>>>>>>> >>>>>
>>>>>>> >>>>> That is all.
>>>>>>> >>>>>
>>>>>>> >>>>>
>>>>>>> >>>>>
>>>>>>> >>>>> ----- Original Message -----
>>>>>>> >>>>>> From: Masamichi Fukuda - elf-systems
>>>>>>> > <masamichi_fukuda@elf-systems.com>
>>>>>>> >>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
>>>>>>> > "linux-ha-japan@lists.sourceforge.jp"
>>>>>>> > <linux-ha-japan@lists.sourceforge.jp>
>>>>>>> >>>>>
>>>>>>> >>>>>> Date: 2015/3/17, Tue 23:46
>>>>>>> >>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>> >>>>>>
>>>>>>> >>>>>>
>>>>>>> >>>>>> Yamauchi-san,
>>>>>>> >>>>>>
>>>>>>> >>>>>> Good evening, this is Fukuda.
>>>>>>> >>>>>>
>>>>>>> >>>>>> Perhaps I am using the -x option for stonith-helper the wrong way.
>>>>>>> >>>>>>
>>>>>>> >>>>>> I removed stonith-helper and started with only xen0.
>>>>>>> >>>>>>
>>>>>>> >>>>>> # crm_mon -rfA
>>>>>>> >>>>>>
>>>>>>> >>>>>> Last updated: Tue Mar 17 23:38:53 2015
>>>>>>> >>>>>> Last change: Tue Mar 17 23:30:34 2015
>>>>>>> >>>>>> Stack: heartbeat
>>>>>>> >>>>>> Current DC: lbv1.beta.com
>>>>>>> > (38b0f200-83ea-8633-6f37-047d36cd39c6) - parti
>>>>>>> >>>>>> tion with quorum
>>>>>>> >>>>>> Version: 1.1.12-e32080b
>>>>>>> >>>>>> 2 Nodes configured
>>>>>>> >>>>>> 6 Resources configured
>>>>>>> >>>>>>
>>>>>>> >>>>>>
>>>>>>> >>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>> >>>>>>
>>>>>>> >>>>>> Full list of resources:
>>>>>>> >>>>>>
>>>>>>> >>>>>> Stonith1-2      (stonith:external/xen0):        Stopped
>>>>>>> >>>>>> Stonith2-2      (stonith:external/xen0):        Stopped
>>>>>>> >>>>>>  Resource Group: HAvarnish
>>>>>>> >>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started
>>>>>>> > lbv1.beta.com
>>>>>>> >>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>> >>>>>>  Clone Set: clone_ping [ping]
>>>>>>> >>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>> >>>>>>
>>>>>>> >>>>>> Node Attributes:
>>>>>>> >>>>>> * Node lbv1.beta.com:
>>>>>>> >>>>>>     + default_ping_set                  : 100
>>>>>>> >>>>>> * Node lbv2.beta.com:
>>>>>>> >>>>>>     + default_ping_set                  : 100
>>>>>>> >>>>>>
>>>>>>> >>>>>> Migration summary:
>>>>>>> >>>>>> * Node lbv1.beta.com:
>>>>>>> >>>>>>    Stonith2-2: migration-threshold=1 fail-count=1000000
>>>>>>> > last-failure='Tue Mar 17
>>>>>>> >>>>>>  23:38:34 2015'
>>>>>>> >>>>>> * Node lbv2.beta.com:
>>>>>>> >>>>>>    Stonith1-2: migration-threshold=1 fail-count=1000000
>>>>>>> > last-failure='Tue Mar 17
>>>>>>> >>>>>>  23:38:27 2015'
>>>>>>> >>>>>>
>>>>>>> >>>>>> Failed actions:
>>>>>>> >>>>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown
>>>>>>> > error' (1): call=23, st
>>>>>>> >>>>>> atus=Error, exit-reason='none',
>>>>>>> > last-rc-change='Tue Mar 17 23:38:32 2015', queue
>>>>>>> >>>>>> d=0ms, exec=1061ms
>>>>>>> >>>>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown
>>>>>>> > error' (1): call=23, st
>>>>>>> >>>>>> atus=Error, exit-reason='none',
>>>>>>> > last-rc-change='Tue Mar 17 23:38:25 2015', queue
>>>>>>> >>>>>> d=0ms, exec=1342ms
>>>>>>> >>>>>>
>>>>>>> >>>>>>
>>>>>>> >>>>>>
>>>>>>> >>>>>>
>>>>>>> >>>>>> The same failed actions appear as when stonith-helper is configured.
>>>>>>> >>>>>>
>>>>>>> >>>>>>
>>>>>>> >>>>>> Thank you in advance.
>>>>>>> >>>>>>
>>>>>>> >>>>>> That is all.
>>>>>>> >>>>>>
>>>>>>> >>>>>>
>>>>>>> >>>>>>
>>>>>>> >>>>>>
>>>>>>> >>>>>> On 17 March 2015 at 22:38, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>> >>>>>>
>>>>>>> >>>>>> Fukuda-san,
>>>>>>> >>>>>>>
>>>>>>> >>>>>>> Good evening, this is Yamauchi.
>>>>>>> >>>>>>>
>>>>>>> >>>>>>> By the way, if possible, removing external/stonith-helper and checking what happens
>>>>>>> >>>>>>> with only external/xen0 may help isolate the problem.
>>>>>>> >>>>>>>
>>>>>>> >>>>>>> That is all.
>>>>>>> >>>>>>>
>>>>>>> >>>>>>>
>>>>>>> >>>>>>>
>>>>>>> >>>>>>> ----- Original Message -----
>>>>>>> >>>>>>>
>>>>>>> >>>>>>>>  From: "renayama19661014@ybb.ne.jp"
>>>>>>> > <renayama19661014@ybb.ne.jp>
>>>>>>> >>>>>>>>  To: "linux-ha-japan@lists.sourceforge.jp"
>>>>>>> > <linux-ha-japan@lists.sourceforge.jp>
>>>>>>> >>>>>>>>  Cc:
>>>>>>> >>>>>>>>  Date: 2015/3/17, Tue 22:28
>>>>>>> >>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>>  Fukuda-san,
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>>  Good evening, this is Yamauchi.
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>>  It does not seem to have changed...
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>>  For now, around tomorrow, although it will be on RHEL, I will use the combination of
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>>  Heartbeat3.0.6
>>>>>>> >>>>>>>>  the latest Pacemaker
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>>  with a similar configuration (the resources will be Dummy, and external/xen0 will become external/ssh) to check whether stonith-helper works.
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>>  #If we could see the output of stonith-helper run with -x, it would be easier to narrow down the problem...
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>>  That is all.
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>>  ----- Original Message -----
>>>>>>> >>>>>>>>>  From: Masamichi Fukuda - elf-systems
>>>>>>> >>>>>>>>  <masamichi_fukuda@elf-systems.com>
>>>>>>> >>>>>>>>>  To: 山内英生 <renayama19661014@ybb.ne.jp>;
>>>>>>> >>>>>>>>  "linux-ha-japan@lists.sourceforge.jp"
>>>>>>> >>>>>>>>  <linux-ha-japan@lists.sourceforge.jp>
>>>>>>> >>>>>>>>>  Date: 2015/3/17, Tue 21:24
>>>>>>> >>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  Yamauchi-san,
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  Good evening, this is Fukuda.
>>>>>>> >>>>>>>>>  Thank you for the information about the latest version.
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  I installed it right away.
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  This is the state after startup.
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  The failed actions do not seem to have changed.
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  # crm_mon -rfA
>>>>>>> >>>>>>>>>  Last updated: Tue Mar 17 21:03:49 2015
>>>>>>> >>>>>>>>>  Last change: Tue Mar 17 20:30:58 2015
>>>>>>> >>>>>>>>>  Stack: heartbeat
>>>>>>> >>>>>>>>>  Current DC: lbv1.beta.com
>>>>>>> > (38b0f200-83ea-8633-6f37-047d36cd39c6) - parti
>>>>>>> >>>>>>>>>  tion with quorum
>>>>>>> >>>>>>>>>  Version: 1.1.12-e32080b
>>>>>>> >>>>>>>>>  2 Nodes configured
>>>>>>> >>>>>>>>>  8 Resources configured
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  Full list of resources:
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>   Resource Group: HAvarnish
>>>>>>> >>>>>>>>>       vip_208    (ocf::heartbeat:IPaddr2):      
>>>>>>> > Started lbv1.beta.com
>>>>>>> >>>>>>>>>       varnishd   (lsb:varnish):  Started
>>>>>>> > lbv1.beta.com
>>>>>>> >>>>>>>>>   Resource Group: grpStonith1
>>>>>>> >>>>>>>>>       Stonith1-1
>>>>>>> > (stonith:external/stonith-helper):      Stopped
>>>>>>> >>>>>>>>>       Stonith1-2 (stonith:external/xen0):      
>>>>>>> > Stopped
>>>>>>> >>>>>>>>>   Resource Group: grpStonith2
>>>>>>> >>>>>>>>>       Stonith2-1
>>>>>>> > (stonith:external/stonith-helper):      Stopped
>>>>>>> >>>>>>>>>       Stonith2-2 (stonith:external/xen0):      
>>>>>>> > Stopped
>>>>>>> >>>>>>>>>   Clone Set: clone_ping [ping]
>>>>>>> >>>>>>>>>       Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  Node Attributes:
>>>>>>> >>>>>>>>>  * Node lbv1.beta.com:
>>>>>>> >>>>>>>>>      + default_ping_set                  : 100
>>>>>>> >>>>>>>>>  * Node lbv2.beta.com:
>>>>>>> >>>>>>>>>      + default_ping_set                  : 100
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  Migration summary:
>>>>>>> >>>>>>>>>  * Node lbv1.beta.com:
>>>>>>> >>>>>>>>>     Stonith2-1: migration-threshold=1
>>>>>>> > fail-count=1000000
>>>>>>> >>>>>>>>  last-failure='Tue Mar 17
>>>>>>> >>>>>>>>>   21:03:39 2015'
>>>>>>> >>>>>>>>>  * Node lbv2.beta.com:
>>>>>>> >>>>>>>>>     Stonith1-1: migration-threshold=1
>>>>>>> > fail-count=1000000
>>>>>>> >>>>>>>>  last-failure='Tue Mar 17
>>>>>>> >>>>>>>>>   21:03:32 2015'
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  Failed actions:
>>>>>>> >>>>>>>>>      Stonith2-1_start_0 on lbv1.beta.com
>>>>>>> > 'unknown error' (1):
>>>>>>> >>>>>>>>  call=31, st
>>>>>>> >>>>>>>>>  atus=Error, exit-reason='none',
>>>>>>> > last-rc-change='Tue Mar 17
>>>>>>> >>>>>>>>  21:03:37 2015', queue
>>>>>>> >>>>>>>>>  d=0ms, exec=1085ms
>>>>>>> >>>>>>>>>      Stonith1-1_start_0 on lbv2.beta.com
>>>>>>> > 'unknown error' (1):
>>>>>>> >>>>>>>>  call=18, st
>>>>>>> >>>>>>>>>  atus=Error, exit-reason='none',
>>>>>>> > last-rc-change='Tue Mar 17
>>>>>>> >>>>>>>>  21:03:30 2015', queue
>>>>>>> >>>>>>>>>  d=0ms, exec=1061ms
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  Here are the logs.
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  # less /var/log/ha-debug
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: info: Pacemaker support:
>>>>>>> >>>>>>>>  yes
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: WARN: File
>>>>>>> >>>>>>>>  /etc/ha.d//haresources exists.
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: WARN: This file is not used
>>>>>>> >>>>>>>>  because pacemaker is enabled
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: debug: Checking access of:
>>>>>>> >>>>>>>>  /usr/local/heartbeat/libexec/heartbeat/ccm
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: debug: Checking access of:
>>>>>>> >>>>>>>>  /usr/local/heartbeat/libexec/pacemaker/cib
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: debug: Checking access of:
>>>>>>> >>>>>>>>  /usr/local/heartbeat/libexec/pacemaker/stonithd
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: debug: Checking access of:
>>>>>>> >>>>>>>>  /usr/local/heartbeat/libexec/pacemaker/lrmd
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: debug: Checking access of:
>>>>>>> >>>>>>>>  /usr/local/heartbeat/libexec/pacemaker/attrd
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: debug: Checking access of:
>>>>>>> >>>>>>>>  /usr/local/heartbeat/libexec/pacemaker/crmd
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: WARN: Core dumps could be
>>>>>>> >>>>>>>>  lost if multiple dumps occur.
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: WARN: Consider setting
>>>>>>> >>>>>>>>  non-default value in /proc/sys/kernel/core_pattern
>>>>>>> > (or equivalent) for maximum
>>>>>>> >>>>>>>>  supportability
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: WARN: Consider setting
>>>>>>> >>>>>>>>  /proc/sys/kernel/core_uses_pid (or equivalent) to 1
>>>>>>> > for maximum supportability
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: WARN: Logging daemon is
>>>>>>> >>>>>>>>  disabled --enabling logging daemon is recommended
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: info:
>>>>>>> >>>>>>>>  **************************
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4235]: info: Configuration
>>>>>>> >>>>>>>>  validated. Starting heartbeat 3.0.6
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: heartbeat: version
>>>>>>> >>>>>>>>  3.0.6
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: Heartbeat generation:
>>>>>>> >>>>>>>>  1423534116
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: seed is -1702799346
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: glib: ucast: write
>>>>>>> >>>>>>>>  socket priority set to IPTOS_LOWDELAY on eth1
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: glib: ucast: bound
>>>>>>> >>>>>>>>  send socket to device: eth1
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: glib: ucast: set
>>>>>>> >>>>>>>>  SO_REUSEADDR
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: glib: ucast: bound
>>>>>>> >>>>>>>>  receive socket to device: eth1
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: glib: ucast: started
>>>>>>> >>>>>>>>  on port 694 interface eth1 to 10.0.17.133
>>>>>>> >>>>>>>>>  Mar 17 21:02:39 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: Local status now set
>>>>>>> >>>>>>>>  to: 'up'
>>>>>>> >>>>>>>>>  Mar 17 21:02:46 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: Link
>>>>>>> >>>>>>>>  lbv2.beta.com:eth1 up.
>>>>>>> >>>>>>>>>  Mar 17 21:02:46 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: Status update for
>>>>>>> >>>>>>>>  node lbv2.beta.com: status up
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: Comm_now_up():
>>>>>>> >>>>>>>>  updating status to active
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: Local status now set
>>>>>>> >>>>>>>>  to: 'active'
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: Starting child client
>>>>>>> >>>>>>>>
>>>>>>> > "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: Starting child client
>>>>>>> >>>>>>>>
>>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: Starting child client
>>>>>>> >>>>>>>>
>>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: Starting child client
>>>>>>> >>>>>>>>
>>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: Starting child client
>>>>>>> >>>>>>>>
>>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: Starting child client
>>>>>>> >>>>>>>>
>>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: debug: get_delnodelist:
>>>>>>> >>>>>>>>  delnodelist=
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4250]: info: Starting
>>>>>>> >>>>>>>>
>>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109  gid 113 (pid
>>>>>>> >>>>>>>>  4250)
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4246]: info: Starting
>>>>>>> >>>>>>>>
>>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109  gid 113 (pid
>>>>>>> >>>>>>>>  4246)
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4249]: info: Starting
>>>>>>> >>>>>>>>
>>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109  gid 113
>>>>>>> >>>>>>>>  (pid 4249)
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4245]: info: Starting
>>>>>>> >>>>>>>>
>>>>>>> > "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109  gid 113 (pid
>>>>>>> >>>>>>>>  4245)
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4248]: info: Starting
>>>>>>> >>>>>>>>
>>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0  gid 0 (pid
>>>>>>> >>>>>>>>  4248)
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4247]: info: Starting
>>>>>>> >>>>>>>>
>>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid
>>>>>>> >>>>>>>>  4247)
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com ccm: [4245]:
>>>>>>> > info: Hostname: lbv1.beta.com
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: the send queue length
>>>>>>> >>>>>>>>  from heartbeat to client ccm is set to 1024
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: the send queue length
>>>>>>> >>>>>>>>  from heartbeat to client attrd is set to 1024
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: the send queue length
>>>>>>> >>>>>>>>  from heartbeat to client stonith-ng is set to 1024
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: Status update for
>>>>>>> >>>>>>>>  node lbv2.beta.com: status active
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: the send queue length
>>>>>>> >>>>>>>>  from heartbeat to client cib is set to 1024
>>>>>>> >>>>>>>>>  Mar 17 21:02:51 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: WARN: 1 lost packet(s) for
>>>>>>> >>>>>>>>  [lbv2.beta.com] [15:17]
>>>>>>> >>>>>>>>>  Mar 17 21:02:51 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: No pkts missing from
>>>>>>> >>>>>>>>  lbv2.beta.com!
>>>>>>> >>>>>>>>>  Mar 17 21:02:52 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: WARN: 1 lost packet(s) for
>>>>>>> >>>>>>>>  [lbv2.beta.com] [19:21]
>>>>>>> >>>>>>>>>  Mar 17 21:02:52 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: No pkts missing from
>>>>>>> >>>>>>>>  lbv2.beta.com!
>>>>>>> >>>>>>>>>  Mar 17 21:02:52 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: the send queue length
>>>>>>> >>>>>>>>  from heartbeat to client crmd is set to 1024
>>>>>>> >>>>>>>>>  Mar 17 21:02:53 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: WARN: 1 lost packet(s) for
>>>>>>> >>>>>>>>  [lbv2.beta.com] [24:26]
>>>>>>> >>>>>>>>>  Mar 17 21:02:53 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: No pkts missing from
>>>>>>> >>>>>>>>  lbv2.beta.com!
>>>>>>> >>>>>>>>>  Mar 17 21:02:54 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: WARN: 1 lost packet(s) for
>>>>>>> >>>>>>>>  [lbv2.beta.com] [26:28]
>>>>>>> >>>>>>>>>  Mar 17 21:02:54 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: No pkts missing from
>>>>>>> >>>>>>>>  lbv2.beta.com!
>>>>>>> >>>>>>>>>  Mar 17 21:02:54 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: WARN: 1 lost packet(s) for
>>>>>>> >>>>>>>>  [lbv2.beta.com] [30:32]
>>>>>>> >>>>>>>>>  Mar 17 21:02:54 lbv1.beta.com heartbeat:
>>>>>>> > [4236]: info: No pkts missing from
>>>>>>> >>>>>>>>  lbv2.beta.com!
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  # less /var/log/error
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  Mar 17 21:02:47 lbv1 attrd[4249]:    error:
>>>>>>> > ha_msg_dispatch: Ignored
>>>>>>> >>>>>>>>  incoming message. Please set_msg_callback on
>>>>>>> > hbclstat
>>>>>>> >>>>>>>>>  Mar 17 21:02:48 lbv1 attrd[4249]:    error:
>>>>>>> > ha_msg_dispatch: Ignored
>>>>>>> >>>>>>>>  incoming message. Please set_msg_callback on
>>>>>>> > hbclstat
>>>>>>> >>>>>>>>>  Mar 17 21:02:53 lbv1 stonith-ng[4247]:  
>>>>>>> > error: ha_msg_dispatch: Ignored
>>>>>>> >>>>>>>>  incoming message. Please set_msg_callback on
>>>>>>> > hbclstat
>>>>>>> >>>>>>>>>  Mar 17 21:02:53 lbv1 stonith-ng[4247]:  
>>>>>>> > error: ha_msg_dispatch: Ignored
>>>>>>> >>>>>>>>  incoming message. Please set_msg_callback on
>>>>>>> > hbclstat
>>>>>>> >>>>>>>>>  Mar 17 21:03:39 lbv1 crmd[4250]:    error:
>>>>>>> > process_lrm_event: Operation
>>>>>>> >>>>>>>>  Stonith2-1_start_0 (node=lbv1.beta.com, call=31,
>>>>>>> > status=4, cib-update=42,
>>>>>>> >>>>>>>>  confirmed=true) Error
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  # cat syslog|egrep 'Mar 17 21:03|Mar 17
>>>>>>> > 21:02' |egrep
>>>>>>> >>>>>>>>  'heartbeat|stonith|pacemaker|error'
>>>>>>> >>>>>>>>>  Mar 17 21:03:24 lbv1 pengine[4253]:   notice:
>>>>>>> > process_pe_message: Calculated
>>>>>>> >>>>>>>>  Transition 0:
>>>>>>> > /var/lib/pacemaker/pengine/pe-input-115.bz2
>>>>>>> >>>>>>>>>  Mar 17 21:03:27 lbv1 crmd[4250]:   notice:
>>>>>>> > run_graph: Transition 0
>>>>>>> >>>>>>>>  (Complete=15, Pending=0, Fired=0, Skipped=16,
>>>>>>> > Incomplete=2,
>>>>>>> >>>>>>>>
>>>>>>> > Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
>>>>>>> >>>>>>>>>  Mar 17 21:03:29 lbv1 pengine[4253]:   notice:
>>>>>>> > process_pe_message: Calculated
>>>>>>> >>>>>>>>  Transition 1:
>>>>>>> > /var/lib/pacemaker/pengine/pe-input-116.bz2
>>>>>>> >>>>>>>>>  Mar 17 21:03:34 lbv1 crmd[4250]:   notice:
>>>>>>> > run_graph: Transition 1
>>>>>>> >>>>>>>>  (Complete=8, Pending=0, Fired=0, Skipped=12,
>>>>>>> > Incomplete=1,
>>>>>>> >>>>>>>>
>>>>>>> > Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
>>>>>>> >>>>>>>>>  Mar 17 21:03:37 lbv1 pengine[4253]:  warning:
>>>>>>> > unpack_rsc_op_failure:
>>>>>>> >>>>>>>>  Processing failed op start for Stonith1-1 on
>>>>>>> > lbv2.beta.com: unknown error (1)
>>>>>>> >>>>>>>>>  Mar 17 21:03:37 lbv1 pengine[4253]:  warning:
>>>>>>> > unpack_rsc_op_failure:
>>>>>>> >>>>>>>>  Processing failed op start for Stonith1-1 on
>>>>>>> > lbv2.beta.com: unknown error (1)
>>>>>>> >>>>>>>>>  Mar 17 21:03:37 lbv1 pengine[4253]:   notice:
>>>>>>> > process_pe_message: Calculated
>>>>>>> >>>>>>>>  Transition 2:
>>>>>>> > /var/lib/pacemaker/pengine/pe-input-117.bz2
>>>>>>> >>>>>>>>>  Mar 17 21:03:39 lbv1 stonith-ng[4247]:  
>>>>>>> > notice: log_operation: Operation
>>>>>>> >>>>>>>>  'monitor' [4377] for device
>>>>>>> > 'Stonith2-1' returned: -201 (Generic
>>>>>>> >>>>>>>>  Pacemaker error)
>>>>>>> >>>>>>>>>  Mar 17 21:03:39 lbv1 stonith-ng[4247]:
>>>>>>> > warning: log_operation:
>>>>>>> >>>>>>>>  Stonith2-1:4377 [ Performing: stonith -t
>>>>>>> > external/stonith-helper -S ]
>>>>>>> >>>>>>>>>  Mar 17 21:03:39 lbv1 stonith-ng[4247]:
>>>>>>> > warning: log_operation:
>>>>>>> >>>>>>>>  Stonith2-1:4377 [ failed to exec
>>>>>>> > "stonith" ]
>>>>>>> >>>>>>>>>  Mar 17 21:03:39 lbv1 stonith-ng[4247]:
>>>>>>> > warning: log_operation:
>>>>>>> >>>>>>>>  Stonith2-1:4377 [ failed:  2 ]
>>>>>>> >>>>>>>>>  Mar 17 21:03:39 lbv1 crmd[4250]:    error:
>>>>>>> > process_lrm_event: Operation
>>>>>>> >>>>>>>>  Stonith2-1_start_0 (node=lbv1.beta.com, call=31,
>>>>>>> > status=4, cib-update=42,
>>>>>>> >>>>>>>>  confirmed=true) Error
>>>>>>> >>>>>>>>>  Mar 17 21:03:40 lbv1 crmd[4250]:   notice:
>>>>>>> > run_graph: Transition 2
>>>>>>> >>>>>>>>  (Complete=12, Pending=0, Fired=0, Skipped=3,
>>>>>>> > Incomplete=0,
>>>>>>> >>>>>>>>
>>>>>>> > Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
>>>>>>> >>>>>>>>>  Mar 17 21:03:42 lbv1 pengine[4253]:  warning:
>>>>>>> > unpack_rsc_op_failure:
>>>>>>> >>>>>>>>  Processing failed op start for Stonith2-1 on
>>>>>>> > lbv1.beta.com: unknown error (1)
>>>>>>> >>>>>>>>>  Mar 17 21:03:42 lbv1 pengine[4253]:  warning:
>>>>>>> > unpack_rsc_op_failure:
>>>>>>> >>>>>>>>  Processing failed op start for Stonith2-1 on
>>>>>>> > lbv1.beta.com: unknown error (1)
>>>>>>> >>>>>>>>>  Mar 17 21:03:42 lbv1 pengine[4253]:  warning:
>>>>>>> > unpack_rsc_op_failure:
>>>>>>> >>>>>>>>  Processing failed op start for Stonith1-1 on
>>>>>>> > lbv2.beta.com: unknown error (1)
>>>>>>> >>>>>>>>>  Mar 17 21:03:42 lbv1 pengine[4253]:   notice:
>>>>>>> > process_pe_message: Calculated
>>>>>>> >>>>>>>>  Transition 3:
>>>>>>> > /var/lib/pacemaker/pengine/pe-input-118.bz2
>>>>>>> >>>>>>>>>  Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]:
>>>>>>> > INFO:
>>>>>>> >>>>>>>>  /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p
>>>>>>> >>>>>>>>  /var/run/resource-agents/send_arp-192.168.17.208
>>>>>>> > eth0 192.168.17.208 auto
>>>>>>> >>>>>>>>  not_used not_used
>>>>>>> >>>>>>>>>  Mar 17 21:03:47 lbv1 crmd[4250]:   notice:
>>>>>>> > run_graph: Transition 3
>>>>>>> >>>>>>>>  (Complete=10, Pending=0, Fired=0, Skipped=0,
>>>>>>> > Incomplete=0,
>>>>>>> >>>>>>>>
>>>>>>> > Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
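
The stonith-ng lines in the log above show the fence operation running stonith by bare name and then reporting failed to exec "stonith", while elsewhere in this thread the same tool runs by full path as /usr/local/heartbeat/sbin/stonith. One plausible reading (an assumption; nothing in this part of the thread confirms it) is that the binary is installed but is not on the PATH seen by the daemon. A minimal shell sketch for checking that follows; locate_cmd is a hypothetical helper name, and the fallback directory is the install prefix quoted in this thread.

```shell
#!/bin/sh
# Report where a command is found: on PATH, in a fallback directory, or nowhere.
# Usage: locate_cmd NAME [FALLBACK_DIR]
locate_cmd() {
    name=$1
    fallback_dir=${2:-}
    if found=$(command -v "$name" 2>/dev/null); then
        echo "on PATH: $found"
    elif [ -n "$fallback_dir" ] && [ -x "$fallback_dir/$name" ]; then
        echo "not on PATH, but present at: $fallback_dir/$name"
    else
        echo "not found: $name"
    fi
}

# The install prefix below is taken from this thread; adjust it for other builds.
locate_cmd stonith /usr/local/heartbeat/sbin
```

The result is only meaningful when run in the same environment as the cluster daemons (the log shows stonithd started as uid 0), since an interactive shell may have a different PATH.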
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  Thank you in advance.
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  That is all.
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  On 17 March 2015 at 18:31, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>>  Fukuda-san,
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>  Good evening, this is Yamauchi.
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>  Nothing newer has been tagged, so today's latest version is the following:
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>   * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>  You can download it with [Download ZIP] on the right.
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>  That is all.
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>  ----- Original Message -----
>>>>>>> >>>>>>>>>>>  From: Masamichi Fukuda - elf-systems
>>>>>>> >>>>>>>>  <masamichi_fukuda@elf-systems.com>
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>>  To:
>>>>>>> > "renayama19661014@ybb.ne.jp"
>>>>>>> >>>>>>>>  <renayama19661014@ybb.ne.jp>;
>>>>>>> >>>>>>>>  "linux-ha-japan@lists.sourceforge.jp"
>>>>>>> >>>>>>>>  <linux-ha-japan@lists.sourceforge.jp>
>>>>>>> >>>>>>>>>>>  Date: 2015/3/17, Tue 18:07
>>>>>>> >>>>>>>>>>>  Subject: STONITH errors during split-brain
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>  Yamauchi-san,
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>  Hello, this is Fukuda.
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>  I looked at this page:
>>>>>>> >>>>>>>>>>>  https://github.com/ClusterLabs/pacemaker/tags
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>  pacemaker 1.1.12 561c4cf appears to be the latest there.
>>>>>>> >>>>>>>>>>>  Could you tell me where I can find a newer version than this?
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>  Thank you in advance.
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>  That is all.
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>  On Tuesday, 17 March 2015, <renayama19661014@ybb.ne.jp> wrote:
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>  Fukuda-san,
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>  Hello, this is Yamauchi.
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>  Yes, it is old.
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>  Pacemaker gained support for Heartbeat3.0.6 surprisingly recently.
>>>>>>> >>>>>>>>>>>>  Please install something newer. (Again, you will need to build it from source...)
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>  It is available from the upstream GitHub repository.
>>>>>>> >>>>>>>>>>>>   * https://github.com/ClusterLabs/pacemaker
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>  In some cases the latest master may produce errors; if so, I suggest working back through older versions.
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>  That is all.
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>  ----- Original Message -----
>>>>>>> >>>>>>>>>>>>>  From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>> >>>>>>>>>>>>>  To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>> >>>>>>>>>>>>>  Date: 2015/3/17, Tue 16:06
>>>>>>> >>>>>>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH error during split-brain
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>  Yamauchi-san,
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>  Hello, this is Fukuda.
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>  In an earlier mail you advised installing the latest versions of heartbeat and pacemaker.
>>>>>>> >>>>>>>>>>>>>  So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12:
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>  heartbeat configuration: Version = "3.0.6"
>>>>>>> >>>>>>>>>>>>>  pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>  Does this mean pacemaker is still too old?
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>  Sorry for the trouble, and thank you in advance.
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>  That is all.
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>  2015-03-17 14:59 <renayama19661014@ybb.ne.jp>:
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>  Fukuda-san,
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>  Hello, this is Yamauchi.
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>  It just occurred to me: in an earlier exchange I gave the answer quoted below. Is that requirement met on your side?
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>  2) Heartbeat 3.0.6 + latest Pacemaker: OK
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>  It appears that the latest Heartbeat, 3.0.6, also has to be part of the combination.
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>   * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>  Judging from the crm_mon output below, your Pacemaker is 1.1.12.
>>>>>>> >>>>>>>>>>>>>>  Combining it with Heartbeat 3.0.6 requires a considerably newer Pacemaker.
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  # crm_mon -rfA
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Last updated: Tue Mar 17 14:14:39 2015
>>>>>>> >>>>>>>>>>>>>>>  Last change: Tue Mar 17 14:01:43 2015
>>>>>>> >>>>>>>>>>>>>>>  Stack: heartbeat
>>>>>>> >>>>>>>>>>>>>>>  Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>> >>>>>>>>>>>>>>>  Version: 1.1.12-561c4cf
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>  I think you need, at a minimum, a build that includes the following change:
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>  https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>  That is all.
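When comparing a running build against a specific upstream commit, it can help to pull the build hash out of the `Version:` line that crm_mon prints. The following is only a sketch: `extract_build` is a hypothetical helper name, and it assumes the `Version: <release>-<hash>` layout seen in the output quoted in this thread.

```shell
#!/bin/bash
# Sketch: extract the build hash from a crm_mon "Version:" line.
# extract_build is a hypothetical helper; it reads crm_mon output on stdin
# and assumes the "Version: 1.1.12-561c4cf" layout shown in this thread.
extract_build() {
    sed -n 's/^Version: [0-9.]*-\([0-9a-f]*\).*$/\1/p'
}

# Example with the line quoted above (on a live node, pipe crm_mon output instead).
printf 'Version: 1.1.12-561c4cf\n' | extract_build
```

With a clone of the Pacemaker repository at hand, the extracted hash could then be checked against the commit mentioned above using `git merge-base --is-ancestor <commit> <build-hash>`, which exits with status 0 when the build contains that commit.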
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>  ----- Original Message -----
>>>>>>> >>>>>>>>>>>>>>>  From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>> >>>>>>>>>>>>>>>  To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>> >>>>>>>>>>>>>>>  Date: 2015/3/17, Tue 14:38
>>>>>>> >>>>>>>>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH error during split-brain
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Yamauchi-san,
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Hello, this is Fukuda.
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Do you mean I should add -x to the shebang line of stonith-helper?
>>>>>>> >>>>>>>>>>>>>>>  I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  The crm_mon output appears to be unchanged from before.
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  # crm_mon -rfA
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Last updated: Tue Mar 17 14:14:39 2015
>>>>>>> >>>>>>>>>>>>>>>  Last change: Tue Mar 17 14:01:43 2015
>>>>>>> >>>>>>>>>>>>>>>  Stack: heartbeat
>>>>>>> >>>>>>>>>>>>>>>  Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>> >>>>>>>>>>>>>>>  Version: 1.1.12-561c4cf
>>>>>>> >>>>>>>>>>>>>>>  2 Nodes configured
>>>>>>> >>>>>>>>>>>>>>>  8 Resources configured
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Full list of resources:
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>   Resource Group: HAvarnish
>>>>>>> >>>>>>>>>>>>>>>       vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>> >>>>>>>>>>>>>>>       varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>> >>>>>>>>>>>>>>>   Resource Group: grpStonith1
>>>>>>> >>>>>>>>>>>>>>>       Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>> >>>>>>>>>>>>>>>       Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>> >>>>>>>>>>>>>>>   Resource Group: grpStonith2
>>>>>>> >>>>>>>>>>>>>>>       Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>> >>>>>>>>>>>>>>>       Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>> >>>>>>>>>>>>>>>   Clone Set: clone_ping [ping]
>>>>>>> >>>>>>>>>>>>>>>       Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Node Attributes:
>>>>>>> >>>>>>>>>>>>>>>  * Node lbv1.beta.com:
>>>>>>> >>>>>>>>>>>>>>>      + default_ping_set                  : 100
>>>>>>> >>>>>>>>>>>>>>>  * Node lbv2.beta.com:
>>>>>>> >>>>>>>>>>>>>>>      + default_ping_set                  : 100
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Migration summary:
>>>>>>> >>>>>>>>>>>>>>>  * Node lbv2.beta.com:
>>>>>>> >>>>>>>>>>>>>>>     Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
>>>>>>> >>>>>>>>>>>>>>>  * Node lbv1.beta.com:
>>>>>>> >>>>>>>>>>>>>>>     Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Failed actions:
>>>>>>> >>>>>>>>>>>>>>>      Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
>>>>>>> >>>>>>>>>>>>>>>      Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  I also looked through the other logs.
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  This is from heartbeat startup:
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  # less /var/log/pm_logconv.out
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  # less /var/log/error
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 crmd[13269]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Here is the result of grepping syslog for stonith:
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0 (pid 13266)
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1 stonithd[13266]:   notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: setup_cib: Watching for stonith topology changes
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: unpack_config: On loss of CCM Quorum: Ignore
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:12:04 lbv1 stonithd[13266]:   notice: xml_patch_version_check: Versions did not change in patch 0.5.0
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 stonithd[13266]:   notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
>>>>>>> >>>>>>>>>>>>>>>  Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed:  2 ]
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Thank you in advance.
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  That is all.
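One possible reading of the `failed to exec "stonith"` line is that the `stonith` command could not be found on the search path of the process that tried to run it. The thread does not confirm this, so treat it as an assumption. A minimal sketch for checking whether a command resolves on the current PATH (`find_cmd` is a hypothetical helper name):

```shell
#!/bin/bash
# Sketch: report whether a command can be resolved on the current PATH.
# find_cmd is a hypothetical helper name used only for this illustration.
find_cmd() {
    local name="$1" path
    if path=$(command -v "$name"); then
        echo "found: $path"
    else
        echo "not found on PATH: $name"
        return 1
    fi
}

# Check the command named in the log; print PATH when it is missing.
find_cmd stonith || echo "PATH is: $PATH"
```

Note that an interactive root shell and a daemon started by heartbeat may not share the same PATH, so a positive result here does not by itself rule the hypothesis out.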
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  2015-03-17 13:32 <renayama19661014@ybb.ne.jp>:
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>  Fukuda-san,
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>  Hello, this is Yamauchi.
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>  In that case, the problem seems to be in the start of stonith-helper.
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>  If you put
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>  #!/bin/bash -x
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>  at the top of stonith-helper and start the cluster, that may tell us something.
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>  Incidentally, I would expect stonith-helper's own log output to appear somewhere as well...
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>  That is all.
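A `-x` on the shebang line only takes effect when the script is executed directly, and the trace goes to stderr, which may be hard to find when a daemon runs the script. As a sketch of an alternative, the trace can be redirected to a file from inside the script. `TRACE_FILE` and `enable_trace` are hypothetical names chosen for this illustration, not part of stonith-helper.

```shell
#!/bin/bash
# Sketch: write the `set -x` trace of a script to a file instead of stderr.
# TRACE_FILE and enable_trace are hypothetical names for this example.
TRACE_FILE="${TRACE_FILE:-/tmp/stonith-helper.trace}"

enable_trace() {
    # bash writes the -x trace to stderr, so append stderr to the trace file.
    exec 2>>"$TRACE_FILE"
    set -x
}

# Demonstration in a subshell so the caller's stderr is left untouched.
(
    enable_trace
    echo "plugin body would run here" >/dev/null
)
```

In a real plugin the two lines inside `enable_trace` would sit near the top of the script; the subshell is used here only to keep the demonstration self-contained.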
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>  ----- Original Message -----
>>>>>>> >>>>>>>>>>>>>>>>>  From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>> >>>>>>>>>>>>>>>>>  To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>> >>>>>>>>>>>>>>>>>  Date: 2015/3/17, Tue 12:31
>>>>>>> >>>>>>>>>>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH error during split-brain
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>  Yamauchi-san,
>>>>>>> >>>>>>>>>>>>>>>>>  cc: Matsushima-san
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>  Hello, this is Fukuda.
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>  xen0 is present in the same directory.
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>  # pwd
>>>>>>> >>>>>>>>>>>>>>>>>  /usr/local/heartbeat/lib/stonith/plugins/external
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>  # ls
>>>>>>> >>>>>>>>>>>>>>>>>  drac5          ibmrsa         kdumpcheck  riloe           vmware
>>>>>>> >>>>>>>>>>>>>>>>>  dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
>>>>>>> >>>>>>>>>>>>>>>>>  hetzner        ipmi           nut         stonith-helper  xen0-ha
>>>>>>> >>>>>>>>>>>>>>>>>  hmchttp        ippower9258    rackpdu     vcenter
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>  Thank you in advance.
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>  That is all.
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>  2015-03-17 10:53 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>  Fukuda-san,
>>>>>>> >>>>>>>>>>>>>>>>>>  cc: Matsushima-san
>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>  Hello, this is Yamauchi.
>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  There was no standard output or standard error output.
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  Is something wrong with stonith-helper?
>>>>>>> >>>>>>>>>>>>>>>>>>>  Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>> >>>>>>>>>>>>>>>>>>>  stonith-helper is placed here:
>>>>>>> >>>>>>>>>>>>>>>>>>>  /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>  Is xen0 also in this directory?
>>>>>>> >>>>>>>>>>>>>>>>>>  If it is not, that is a problem, so please try copying the stonith-helper file,
>>>>>>> >>>>>>>>>>>>>>>>>>  with its attributes preserved, into the same directory as xen0.
>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>  If it works after that, the problem lies in the pm_extras installation.
>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>  That is all.
>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>  ----- Original Message -----
>>>>>>> >>>>>>>>>>>>>>>>>>>  From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>>>>>>> >>>>>>>>>>>>>>>>>>>  To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
>>>>>>> >>>>>>>>>>>>>>>>>>>  Date: 2015/3/17, Tue 10:31
>>>>>>> >>>>>>>>>>>>>>>>>>>  Subject: Re: [Linux-ha-jp] STONITH error during split-brain
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  Yamauchi-san,
>>>>>>> >>>>>>>>>>>>>>>>>>>  cc: Matsushima-san
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  Good morning, this is Fukuda.
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  Thank you for the crm example.
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  I adapted it to our environment right away.
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  $ cat test.crm
>>>>>>> >>>>>>>>>>>>>>>>>>>  ### Cluster Option ###
>>>>>>> >>>>>>>>>>>>>>>>>>>  property \
>>>>>>> >>>>>>>>>>>>>>>>>>>      no-quorum-policy="ignore" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      stonith-enabled="true" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      startup-fencing="false" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      stonith-timeout="710s" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      crmd-transition-delay="2s"
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  ### Resource Default ###
>>>>>>> >>>>>>>>>>>>>>>>>>>  rsc_defaults \
>>>>>>> >>>>>>>>>>>>>>>>>>>      resource-stickiness="INFINITY" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      migration-threshold="1"
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  ### Group Configuration ###
>>>>>>> >>>>>>>>>>>>>>>>>>>  group HAvarnish \
>>>>>>> >>>>>>>>>>>>>>>>>>>      vip_208 \
>>>>>>> >>>>>>>>>>>>>>>>>>>      varnishd
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  group grpStonith1 \
>>>>>>> >>>>>>>>>>>>>>>>>>>      Stonith1-1 \
>>>>>>> >>>>>>>>>>>>>>>>>>>      Stonith1-2
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  group grpStonith2 \
>>>>>>> >>>>>>>>>>>>>>>>>>>      Stonith2-1 \
>>>>>>> >>>>>>>>>>>>>>>>>>>      Stonith2-2
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  ### Clone Configuration ###
>>>>>>> >>>>>>>>>>>>>>>>>>>  clone clone_ping \
>>>>>>> >>>>>>>>>>>>>>>>>>>      ping
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  ### Fencing Topology ###
>>>>>>> >>>>>>>>>>>>>>>>>>>  fencing_topology \
>>>>>>> >>>>>>>>>>>>>>>>>>>      lbv1.beta.com: Stonith1-1 Stonith1-2 \
>>>>>>> >>>>>>>>>>>>>>>>>>>      lbv2.beta.com: Stonith2-1 Stonith2-2
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  ### Primitive Configuration ###
>>>>>>> >>>>>>>>>>>>>>>>>>>  primitive vip_208 ocf:heartbeat:IPaddr2 \
>>>>>>> >>>>>>>>>>>>>>>>>>>      params \
>>>>>>> >>>>>>>>>>>>>>>>>>>          ip="192.168.17.208" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          nic="eth0" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          cidr_netmask="24" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op monitor interval="5s" timeout="60s" on-fail="restart" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  primitive varnishd lsb:varnish \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  primitive ping ocf:pacemaker:ping \
>>>>>>> >>>>>>>>>>>>>>>>>>>      params \
>>>>>>> >>>>>>>>>>>>>>>>>>>          name="default_ping_set" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          host_list="192.168.17.254" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          multiplier="100" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          dampen="1" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  primitive Stonith1-1 stonith:external/stonith-helper \
>>>>>>> >>>>>>>>>>>>>>>>>>>      params \
>>>>>>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_retries="1" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="40s" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          hostlist="lbv1.beta.com" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          dead_check_target="192.168.17.132 10.0.17.132" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          run_online_check="yes" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  primitive Stonith1-2 stonith:external/xen0 \
>>>>>>> >>>>>>>>>>>>>>>>>>>      params \
>>>>>>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="60s" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          dom0="xen0.beta.com" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  primitive Stonith2-1 stonith:external/stonith-helper \
>>>>>>> >>>>>>>>>>>>>>>>>>>      params \
>>>>>>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_retries="1" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="40s" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          hostlist="lbv2.beta.com" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          dead_check_target="192.168.17.133 10.0.17.133" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          run_online_check="yes" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  primitive Stonith2-2 stonith:external/xen0 \
>>>>>>> >>>>>>>>>>>>>>>>>>>      params \
>>>>>>> >>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="60s" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
>>>>>>> >>>>>>>>>>>>>>>>>>>          dom0="xen0.beta.com" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>> >>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  ### Resource Location ###
>>>>>>> >>>>>>>>>>>>>>>>>>>  location HA_location-1 HAvarnish \
>>>>>>> >>>>>>>>>>>>>>>>>>>      rule 200: #uname eq lbv1.beta.com \
>>>>>>> >>>>>>>>>>>>>>>>>>>      rule 100: #uname eq lbv2.beta.com
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  location HA_location-2 HAvarnish \
>>>>>>> >>>>>>>>>>>>>>>>>>>      rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  location HA_location-3 grpStonith1 \
>>>>>>> >>>>>>>>>>>>>>>>>>>      rule -INFINITY: #uname eq lbv1.beta.com
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  location HA_location-4 grpStonith2 \
>>>>>>> >>>>>>>>>>>>>>>>>>>      rule -INFINITY: #uname eq lbv2.beta.com
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  After loading this, the messages differ from yesterday's.
>>>>>>> >>>>>>>>>>>>>>>>>>>  The ping messages are gone.
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  # crm_mon -rfA
>>>>>>> >>>>>>>>>>>>>>>>>>>  Last updated: Tue Mar 17 10:21:28 2015
>>>>>>> >>>>>>>>>>>>>>>>>>>  Last change: Tue Mar 17 10:21:09 2015
>>>>>>> >>>>>>>>>>>>>>>>>>>  Stack: heartbeat
>>>>>>> >>>>>>>>>>>>>>>>>>>  Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>> >>>>>>>>>>>>>>>>>>>  Version: 1.1.12-561c4cf
>>>>>>> >>>>>>>>>>>>>>>>>>>  2 Nodes configured
>>>>>>> >>>>>>>>>>>>>>>>>>>  8 Resources configured
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  Full list of resources:
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>   Resource Group: HAvarnish
>>>>>>> >>>>>>>>>>>>>>>>>>>       vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>> >>>>>>>>>>>>>>>>>>>       varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>> >>>>>>>>>>>>>>>>>>>   Resource Group: grpStonith1
>>>>>>> >>>>>>>>>>>>>>>>>>>       Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>> >>>>>>>>>>>>>>>>>>>       Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>> >>>>>>>>>>>>>>>>>>>   Resource Group: grpStonith2
>>>>>>> >>>>>>>>>>>>>>>>>>>       Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>> >>>>>>>>>>>>>>>>>>>       Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>> >>>>>>>>>>>>>>>>>>>   Clone Set: clone_ping [ping]
>>>>>>> >>>>>>>>>>>>>>>>>>>       Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  Node Attributes:
>>>>>>> >>>>>>>>>>>>>>>>>>>  * Node lbv1.beta.com:
>>>>>>> >>>>>>>>>>>>>>>>>>>      + default_ping_set                  : 100
>>>>>>> >>>>>>>>>>>>>>>>>>>  * Node lbv2.beta.com:
>>>>>>> >>>>>>>>>>>>>>>>>>>      + default_ping_set                  : 100
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  Migration summary:
>>>>>>> >>>>>>>>>>>>>>>>>>>  * Node lbv2.beta.com:
>>>>>>> >>>>>>>>>>>>>>>>>>>     Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>> >>>>>>>>>>>>>>>>>>>  * Node lbv1.beta.com:
>>>>>>> >>>>>>>>>>>>>>>>>>>     Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  Failed actions:
>>>>>>> >>>>>>>>>>>>>>>>>>>      Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
>>>>>>> >>>>>>>>>>>>>>>>>>>      Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  This is the log from /var/log/ha-debug:
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
>>>>>>> >>>>>>>>>>>>>>>>>>>  IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
>>>>>>> >>>>>>>>>>>>>>>>>>>  IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  There was no standard output or standard error output.
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  Is something wrong with stonith-helper?
>>>>>>> >>>>>>>>>>>>>>>>>>>  Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>> >>>>>>>>>>>>>>>>>>>  stonith-helper is placed here:
>>>>>>> >>>>>>>>>>>>>>>>>>>  /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  Thank you in advance.
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  That is all.
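The `standby_check_command` in the configuration above pipes `crm_resource -r varnishd -W` into `grep -q` with the local host name, so it succeeds only when the resource is reported on this node. The sketch below imitates that test on a sample string. `runs_on_host` is a hypothetical helper, and `sample_output` is an assumed stand-in for the real `crm_resource` output, whose exact wording may differ between versions.

```shell
#!/bin/bash
# Sketch of the test that standby_check_command performs.
# runs_on_host is a hypothetical helper; sample_output is an assumed
# stand-in for what `crm_resource -r varnishd -W` prints.
runs_on_host() {
    local output="$1" host="$2"
    # -F matches the host name as a fixed string (dots are not wildcards),
    # -i ignores case, -q suppresses output and reports only via exit status.
    printf '%s\n' "$output" | grep -qiF -- "$host"
}

sample_output="resource varnishd is running on: lbv1.beta.com"

if runs_on_host "$sample_output" "lbv1.beta.com"; then
    echo "active here"
else
    echo "not active here"
fi
```

The match depends on `hostname` printing a name that appears in the `crm_resource` output, so it is worth confirming what `hostname` returns on each node.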
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  Fukuda-san,
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>  Good morning, this is Yamauchi.
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>  Just in case, here is an excerpt of an example I have on hand that uses multiple stonith devices.
>>>>>>> >>>>>>>>>>>>>>>>>>>>  (Please watch the line breaks when you actually use it.)
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>  The example below is a configuration for the PM 1.1 series:
>>>>>>> >>>>>>>>>>>>>>>>>>>>  for nodea, stonith is executed in the order prmStonith1-1, prmStonith1-2;
>>>>>>> >>>>>>>>>>>>>>>>>>>>  for nodeb, stonith is executed in the order prmStonith2-1, prmStonith2-2.
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>  The stonith devices themselves are helper and ssh.
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>  (snip)
>>>>>>> >>>>>>>>>>>>>>>>>>>>  ### Group Configuration ###
>>>>>>> >>>>>>>>>>>>>>>>>>>>  group grpStonith1 \
>>>>>>> >>>>>>>>>>>>>>>>>>>>      prmStonith1-1 \
>>>>>>> >>>>>>>>>>>>>>>>>>>>      prmStonith1-2
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>  group grpStonith2 \
>>>>>>> >>>>>>>>>>>>>>>>>>>>      prmStonith2-1 \
>>>>>>> >>>>>>>>>>>>>>>>>>>>      prmStonith2-2
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>  ### Fencing Topology ###
>>>>>>> >>>>>>>>>>>>>>>>>>>>  fencing_topology \
>>>>>>> >>>>>>>>>>>>>>>>>>>>      nodea: prmStonith1-1 prmStonith1-2 \
>>>>>>> >>>>>>>>>>>>>>>>>>>>      nodeb: prmStonith2-1 prmStonith2-2
>>>>>>> >>>>>>>>>>>>>>>>>>>>  (snip)
>>>>>>> >>>>>>>>>>>>>>>>>>>>  primitive prmStonith1-1 stonith:external/stonith-helper \
>>>>>>> >>>>>>>>>>>>>>>>>>>>      params \
>>>>>>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_retries="1" \
>>>>>>> >>>>>>>>>>>>>>>>>>>>          pcmk_reboot_timeout="40s" \
>>>>>>> >>>>>>>>>>>>>>>>>>>>          hostlist="nodea" \
>>>>>>> >>>>>>>>>>>>>>>>>>>>          dead_check_target="192.168.28.60 192.168.28.70" \
>>>>>>> >>>>>>>>>>>>>>>>>>>>          standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>>>>> >>>>>>>>>>>>>>>>>>>>          run_online_check="yes" \
>>>>>>> >>>>>>>>>>>>>>>>>>>>      op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>> >>>>>>>>>>>>>>>>>>>>      op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > primitive prmStonith1-2
>>>>>>> >>>>>>>>  stonith:external/ssh \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > params \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > pcmk_reboot_timeout="60s"
>>>>>>> >>>>>>>>  \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > hostlist="nodea" \
>>>>>>> >>>>>>>>>>>>>>>>>>>>  op
>>>>>>> > start interval="0s"
>>>>>>> >>>>>>>>  timeout="60s" on-fail="restart"
>>>>>>> > \
>>>>>>> >>>>>>>>>>>>>>>>>>>>  op
>>>>>>> > monitor
>>>>>>> >>>>>>>>  interval="3600s" timeout="60s"
>>>>>>> > on-fail="restart"
>>>>>>> >>>>>>>>  \
>>>>>>> >>>>>>>>>>>>>>>>>>>>  op
>>>>>>> > stop interval="0s"
>>>>>>> >>>>>>>>  timeout="60s" on-fail="ignore"
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > primitive prmStonith2-1
>>>>>>> >>>>>>>>  stonith:external/stonith-helper \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > params \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > pcmk_reboot_retries="1"
>>>>>>> >>>>>>>>  \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > pcmk_reboot_timeout="40s"
>>>>>>> >>>>>>>>  \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > hostlist="nodeb" \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > dead_check_target="192.168.28.61
>>>>>>> >>>>>>>>  192.168.28.71" \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > standby_check_command="/usr/sbin/crm_resource
>>>>>>> >>>>>>>>  -r prmRES -W | grep -qi `hostname`" \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > run_online_check="yes"
>>>>>>> >>>>>>>>  \
>>>>>>> >>>>>>>>>>>>>>>>>>>>  op
>>>>>>> > start interval="0s"
>>>>>>> >>>>>>>>  timeout="60s" on-fail="restart"
>>>>>>> > \
>>>>>>> >>>>>>>>>>>>>>>>>>>>  op
>>>>>>> > stop interval="0s"
>>>>>>> >>>>>>>>  timeout="60s" on-fail="ignore"
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > primitive prmStonith2-2
>>>>>>> >>>>>>>>  stonith:external/ssh \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > params \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > pcmk_reboot_timeout="60s"
>>>>>>> >>>>>>>>  \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > hostlist="nodeb" \
>>>>>>> >>>>>>>>>>>>>>>>>>>>  op
>>>>>>> > start interval="0s"
>>>>>>> >>>>>>>>  timeout="60s" on-fail="restart"
>>>>>>> > \
>>>>>>> >>>>>>>>>>>>>>>>>>>>  op
>>>>>>> > monitor
>>>>>>> >>>>>>>>  interval="3600s" timeout="60s"
>>>>>>> > on-fail="restart"
>>>>>>> >>>>>>>>  \
>>>>>>> >>>>>>>>>>>>>>>>>>>>  op
>>>>>>> > stop interval="0s"
>>>>>>> >>>>>>>>  timeout="60s" on-fail="ignore"
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > (snip)
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > location
>>>>>>> >>>>>>>>  rsc_location-grpStonith1-2 grpStonith1 \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > rule -INFINITY: #uname eq nodea
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > location
>>>>>>> >>>>>>>>  rsc_location-grpStonith2-3 grpStonith2 \
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > rule -INFINITY: #uname eq nodeb
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> > 以上です。
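
The ordering described in the quoted message (prmStonith1-1 first, then prmStonith1-2) can be pictured with a toy model. This is only a sketch of the "try each level in order, stop at the first success" idea; the two functions are hypothetical stand-ins with hard-coded results, not real fencing devices, and nothing here calls Pacemaker.

```shell
#!/bin/sh
# Toy model only: these functions stand in for fencing devices.
# Return 0 = fencing reported success, non-zero = failure.
# Names use "_" because "-" is not portable in shell function names.
prmStonith1_1() { return 1; }   # hypothetical result: first level fails
prmStonith1_2() { return 0; }   # hypothetical result: second level succeeds

# Try each device (one per topology level) in the order given and
# stop at the first one that reports success.
fence_in_order() {
    for dev in "$@"; do
        if "$dev"; then
            echo "fenced by $dev"
            return 0
        fi
        echo "$dev failed, trying next level"
    done
    echo "all levels failed"
    return 1
}

fence_in_order prmStonith1_1 prmStonith1_2
```

In the real cluster the order comes from the fencing_topology entry, not from a script like this.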
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>  --
>>>>>>> >>>>>>>>>>>>>>>>>>>  ELF Systems
>>>>>>> >>>>>>>>>>>>>>>>>>>  Masamichi Fukuda
>>>>>>> >>>>>>>>>>>>>>>>>>>  mail to: masamichi_fukuda@elf-systems.com
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>  _______________________________________________
>>>>>>> >>>>>>>>>>>>>>>>>>  Linux-ha-japan mailing list
>>>>>>> >>>>>>>>>>>>>>>>>>  Linux-ha-japan@lists.sourceforge.jp
>>>>>>> >>>>>>>>>>>>>>>>>>  http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
>>>>>>> >>>>>>>>>>>>>>>>>>

_______________________________________________
Linux-ha-japan mailing list
Linux-ha-japan@lists.sourceforge.jp
http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
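Re: STONITH errors during split-brain [ In reply to ]
Yamauchi-san,

Hello, this is Fukuda.

> Before switching back and forth like that, it is better to:
> 1) run make uninstall in the source directory of the version you were using, and
> 2) delete the contents of the /var/lib/pacemaker/cib and /var/lib/pacemaker/pengine directories.

I see. I will be careful about that next time.

Sorry for the trouble, and thank you for your help.

That is all.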
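
The two cleanup steps quoted above could be scripted roughly as follows. This is a sketch under assumptions: SRC_DIR is a hypothetical location for the Pacemaker source tree, STATE_ROOT defaults to the /var/lib/pacemaker path named in the thread, and the script only prints what it would do unless DRY_RUN=0 is set. It relies on GNU make (`-C`) and GNU find (`-mindepth`, `-delete`).

```shell
#!/bin/sh
# Sketch: clean up a source-built Pacemaker before switching builds.
# SRC_DIR is a hypothetical path; STATE_ROOT is the directory named in the thread.
SRC_DIR="${SRC_DIR:-$HOME/src/pacemaker}"
STATE_ROOT="${STATE_ROOT:-/var/lib/pacemaker}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

cleanup_pacemaker() {
    # 1) make uninstall in the source directory of the build in use
    run make -C "$SRC_DIR" uninstall
    # 2) empty the cib and pengine directories, keeping the directories themselves
    for d in cib pengine; do
        run find "$STATE_ROOT/$d" -mindepth 1 -delete
    done
}

cleanup_pacemaker
```

The thread does not say whether the cluster stack has to be stopped first; stopping heartbeat before running this with DRY_RUN=0 seems a reasonable precaution, but that is an assumption.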

2015-03-20 17:16 <renayama19661014@ybb.ne.jp>:

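> Fukuda-san,
>
> Hello, this is Yamauchi.
>
> >In my environment, after switching back and forth several times between PM1.1.12 build:e32080b and build:561c4cf, the nodes started rebooting repeatedly.
>
> Before switching back and forth like that, it is better to:
> 1) run make uninstall in the source directory of the version you were using, and
> 2) delete the contents of the /var/lib/pacemaker/cib and /var/lib/pacemaker/pengine directories.
>
> >So I did a clean install of debian7.8 again and installed PM1.1.12 build:561c4cf.
> >Also, after adding the path you pointed out, I was able to confirm here as well that stonith-helper starts.
>
> I see. I am glad to hear that.
> That said, it is still a problem if build:e32080b does not work.
>
> If I can find time over the weekend, I will try it here as well.
> I will let you know if there is any progress.
>
> That is all.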
>
>
>
> ----- Original Message -----
> >From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >Date: 2015/3/20, Fri 16:36
> >Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >
> >
> >Yamauchi-san,
> >
> >Hello, this is Fukuda.
> >
> >In my environment, after switching back and forth several times between PM1.1.12 build:e32080b and build:561c4cf, the nodes started rebooting repeatedly.
> >So I did a clean install of debian7.8 again and installed PM1.1.12 build:561c4cf.
> >Also, after adding the path you pointed out, I was able to confirm here as well that stonith-helper starts.
> >
> >Last updated: Fri Mar 20 16:26:47 2015
> >Last change: Fri Mar 20 16:22:01 2015
> >Stack: heartbeat
> >Current DC: deb64 (71e563fb-34e1-919f-7515-868014cb501d) - partition with quorum
> >
> >Version: 1.1.12-561c4cf
> >2 Nodes configured
> >10 Resources configured
> >
> >
> >Online: [ deb63 deb64 ]
> >
> >Full list of resources:
> >
> > Resource Group: HAvarnish
> > vip_208 (ocf::heartbeat:IPaddr2): Started deb63
> > varnishd (lsb:varnish): Started deb63
> > Resource Group: grpStonith1
> > Stonith1-1 (stonith:external/stonith-helper): Started deb64
> > Stonith1-2 (stonith:external/ssh): Started deb64
> > Stonith1-3 (stonith:meatware): Started deb64
> > Resource Group: grpStonith2
> > Stonith2-1 (stonith:external/stonith-helper): Started deb63
> > Stonith2-2 (stonith:external/ssh): Started deb63
> > Stonith2-3 (stonith:meatware): Started deb63
> > Clone Set: clone_ping [ping]
> > Started: [ deb63 deb64 ]
> >
> >Node Attributes:
> >* Node deb63:
> > + default_ping_set : 100
> >* Node deb64:
> > + default_ping_set : 100
> >
> >Migration summary:
> >* Node deb64:
> >* Node deb63:
> >
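> >Thank you in advance.
> >
> >That is all.
> >
> >
> >2015-03-18 18:32 Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>:
> >
> >>Yamauchi-san,
> >>
> >>Good evening, this is Fukuda.
> >>Thank you for testing on debian.
> >>
> >>> It appears that the rng files of the newer Pacemaker (later than Pacemaker1.1.12) are the cause.
> >>> However, I do not yet know a workaround.
> >>
> >>Please let me know once you find a workaround.
> >>
> >>> However, the problem with the combination with the latest PM is not yet resolved, so whether this configuration (PM1.1.12 + Heartbeat3.0.6) works correctly is a separate question.
> >>> # It probably appears to run, but I expect problems to show up.
> >>
> >>I will go back to PM1.1.12 for now and try the same procedure.
> >>First I will check whether stonith-helper runs.
> >>
> >>> As for your stonith-helper failing on start, I think the cause is probably that the stonith command is not on the PATH.
> >>
> >>That looks like a beginner's mistake on my part, which is embarrassing.
> >>I will try the same thing here.
> >>
> >>Thank you in advance.
> >>
> >>That is all.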
> >>
> >>
> >>
> >>
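> >>2015-03-18 17:56 <renayama19661014@ybb.ne.jp>:
> >>
> >>>Fukuda-san,
> >>>
> >>>Good evening, this is Yamauchi.
> >>>
> >>>I ran into the same situation here.
> >>>It appears that the rng files of the newer Pacemaker (later than Pacemaker1.1.12) are the cause.
> >>>However, I do not yet know a workaround.
> >>>
> >>>Incidentally, with the combination of Pacemaker1.1.12 and Heartbeat3.0.6, which is not really known to work, I confirmed on a single node that stonith-helper starts.
> >>>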
> >>>root@debian7-1:~# crm_mon -1 -Af
> >>>Last updated: Wed Mar 18 17:43:37 2015
> >>>Last change: Wed Mar 18 17:43:29 2015
> >>>Stack: heartbeat
> >>>Current DC: debian7-1 (d20c7df5-519e-4a4c-9b4b-1b88fc203133) - partition with quorum
> >>>Version: 1.1.12-561c4cf
> >>>1 Nodes configured
> >>>3 Resources configured
> >>>
> >>>
> >>>Online: [ debian7-1 ]
> >>>
> >>> prmDummy(ocf::pacemaker:Dummy):Started debian7-1
> >>> Resource Group: grpStonith2
> >>> Stonith2-1(stonith:external/stonith-helper):Started debian7-1
> >>>
> >>>Node Attributes:
> >>>* Node debian7-1:
> >>>
> >>>Migration summary:
> >>>* Node debian7-1:
> >>>
> >>>Some steps in Matsushima-san's procedure did not work for me (probably because I am not used to debian), but I installed with the same build options and copied only the stonith-helper included in the latest pm_extras_1.0 into the same directory as xen0.
> >>>#If there is a problem with stonith-helper's execute permission or the like, please set it correctly.
> >>>
> >>>As for your stonith-helper failing on start, I think the cause is probably that the stonith command is not on the PATH.
> >>>
> >>>root@debian7-1:~# find / -name stonith -print
> >>>/usr/local/heartbeat/sbin/stonith
> >>>
> >>>root@debian7-1:~# echo $PATH
> >>>/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/heartbeat/sbin/
> >>>
> >>>
> >>>After adding /usr/local/heartbeat/sbin to PATH and starting heartbeat again, I got the crm_mon output shown above.
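
A quick way to check the PATH diagnosis above is sketched here. It assumes the install prefix shown in the thread (/usr/local/heartbeat/sbin, overridable through the hypothetical HB_SBIN variable) and uses only `stonith -L`, the plugin listing that appears elsewhere in this thread. The helper `path_with` is an illustrative name, not part of any tool.

```shell
#!/bin/sh
# Sketch: make sure the stonith command from a source build is on PATH.
# HB_SBIN defaults to the directory shown in the thread; adjust for other prefixes.
HB_SBIN="${HB_SBIN:-/usr/local/heartbeat/sbin}"

# Print PATH with $1 appended, unless $1 is already one of its entries.
path_with() {
    case ":$PATH:" in
        *":$1:"*) printf '%s\n' "$PATH" ;;
        *)        printf '%s\n' "$PATH:$1" ;;
    esac
}

PATH="$(path_with "$HB_SBIN")"
export PATH

if command -v stonith >/dev/null 2>&1; then
    # stonith -L lists the plugin types that the command can find.
    if stonith -L | grep -qx 'external/stonith-helper'; then
        echo "external/stonith-helper is listed"
    else
        echo "stonith found, but external/stonith-helper is not listed"
    fi
else
    echo "stonith is not on PATH even after adding $HB_SBIN"
fi
```

This only changes PATH for the current shell. The thread says heartbeat was restarted after the change, so the setting has to be in effect in whatever environment starts heartbeat; where to put it (profile, init script) is not stated in the thread.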
> >>>
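> >>>However, the problem with the combination with the latest PM is not yet resolved, so whether this configuration (PM1.1.12 + Heartbeat3.0.6) works correctly is a separate question.
> >>>#It probably appears to run, but I expect problems to show up.
> >>>
> >>>Below is the crm file I loaded as a test.
> >>>(Since I was only confirming startup, parameter values such as dead_check_target and standby_check_command are meaningless in practice in this configuration.)
> >>>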
> >>>### Cluster Option ###
> >>>property \
> >>> no-quorum-policy="ignore" \
> >>> stonith-enabled="true" \
> >>> startup-fencing="false"
> >>>
> >>>### Resource Default ###
> >>>rsc_defaults \
> >>> resource-stickiness="INFINITY" \
> >>> migration-threshold="1"
> >>>
> >>>### Fencing Topology ###
> >>>fencing_topology \
> >>> debian7-1: Stonith1-1 \
> >>> debian7-2: Stonith2-1
> >>>
> >>>group grpStonith1 \
> >>> Stonith1-1
> >>>
> >>>group grpStonith2 \
> >>> Stonith2-1
> >>>
> >>>primitive prmDummy ocf:pacemaker:Dummy \
> >>> op start interval="0s" timeout="60s" on-fail="restart" \
> >>> op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>> op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>
> >>>primitive Stonith1-1 stonith:external/stonith-helper \
> >>> params \
> >>> pcmk_reboot_retries="1" \
> >>> pcmk_reboot_timeout="40s" \
> >>> hostlist="debian7-1" \
> >>> dead_check_target="192.168.3.1" \
> >>> standby_wait_time="10" \
> >>> standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>> op start interval="0s" timeout="60s" on-fail="restart" \
> >>> op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>> op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>
> >>>primitive Stonith2-1 stonith:external/stonith-helper \
> >>> params \
> >>> pcmk_reboot_retries="1" \
> >>> pcmk_reboot_timeout="40s" \
> >>> hostlist="debian7-2" \
> >>> dead_check_target="192.168.3.1" \
> >>> standby_wait_time="10" \
> >>> standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>> op start interval="0s" timeout="60s" on-fail="restart" \
> >>> op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>> op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>
> >>>
> >>>location HA_location-3 grpStonith1 \
> >>> rule -INFINITY: #uname eq debian7-1
> >>>
> >>>location HA_location-4 grpStonith2 \
> >>> rule -INFINITY: #uname eq debian7-2
> >>>
> >>>
> >>>I will contact you again if I learn anything more.
> >>>
> >>>That is all.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>----- Original Message -----
> >>>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>Date: 2015/3/18, Wed 15:09
> >>>>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>
> >>>>
> >>>>Yamauchi-san,
> >>>>
> >>>>Hello, this is Fukuda.
> >>>>
> >>>>I freshly installed debian7.8 on VirtualBox and then installed heartbeat + pacemaker.
> >>>>
> >>>>I did not install heartbeat, pacemaker, etc. from packages.
> >>>>
> >>>>heartbeat starts, but loading the crm file produced errors.
> >>>>
> >>>># crm configure load update test1.crm
> >>>>
> >>>>ERROR: crmd:metadata: got no meta-data, does this RA exist?
> >>>>ERROR: cib-bootstrap-options: attribute no-quorum-policy does not exist
> >>>>ERROR: cib-bootstrap-options: attribute stonith-enabled does not exist
> >>>>ERROR: cib-bootstrap-options: attribute crmd-transition-delay does not exist
> >>>>ERROR: pengine:metadata: got no meta-data, does this RA exist?
> >>>>
> >>>>Could this be related to the agents under external not being recognized?
> >>>>
> >>>>Thank you in advance.
> >>>>
> >>>>That is all.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>2015-03-18 12:13 <renayama19661014@ybb.ne.jp>:
> >>>>
> >>>>>Fukuda-san,
> >>>>>
> >>>>>Hello, this is Yamauchi.
> >>>>>
> >>>>>Understood.
> >>>>>Thank you for letting me know.
> >>>>>
> >>>>>That is all.
> >>>>>
> >>>>>
> >>>>>
> >>>>>----- Original Message -----
> >>>>>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>Date: 2015/3/18, Wed 10:23
> >>>>>>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>
> >>>>>>
> >>>>>>Yamauchi-san,
> >>>>>>
> >>>>>>Hello, this is Fukuda.
> >>>>>>
> >>>>>>In my environment the following had been installed from packages, so I first removed them with apt-get remove:
> >>>>>>
> >>>>>>heartbeat, libheartbeat2, pacemaker, corosync, resource-agents
> >>>>>>
> >>>>>>Also, the hacluster user and the haclient group had already been created when the packages were installed.
> >>>>>>
> >>>>>>So I ran Matsushima-san's procedure from this step onward:
> >>>>>>
> >>>>>>Preparation
> >>>>>>apt-get install build-essential mercurial git \
> >>>>>>
> >>>>>>Everything after that followed exactly the same procedure.
> >>>>>>
> >>>>>>Thank you in advance.
> >>>>>>
> >>>>>>That is all.
> >>>>>>
> >>>>>>2015-03-18 10:06 <renayama19661014@ybb.ne.jp>:
> >>>>>>>
> >>>>>>> Fukuda-san,
> >>>>>>>
> >>>>>>> Hello, this is Yamauchi.
> >>>>>>>
> >>>>>>> By the way, as a double-check before I build this here too: did you follow Matsushima-san's procedure, summarized at the link below?
> >>>>>>>
> >>>>>>> * https://gist.github.com/takehironet/1469bd7123f63d61f843
> >>>>>>>
> >>>>>>> If there are any differences, please let me know.
> >>>>>>>
> >>>>>>> #In particular, I recall the initial apt-get of the build packages not going well when I briefly tried it, so that part concerns me.
> >>>>>>>
> >>>>>>>
> >>>>>>> That is all.
> >>>>>>>
> >>>>>>>
> >>>>>>> ----- Original Message -----
> >>>>>>> > From: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
> >>>>>>> > To: "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>> > Cc:
> >>>>>>> > Date: 2015/3/18, Wed 09:53
> >>>>>>> > Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>> >
> >>>>>>> > Fukuda-san,
> >>>>>>> >
> >>>>>>> > Hello, this is Yamauchi.
> >>>>>>> >
> >>>>>>> >> stonith -L does at least seem to list the plugins.
> >>>>>>> >>
> >>>>>>> >> # /usr/local/heartbeat/sbin/stonith -L
> >>>>>>> >
> >>>>>>> > That command should be part of the Heartbeat source, so I think this means there is no problem between Heartbeat and glue.
> >>>>>>> >
> >>>>>>> > So it seems more likely that the problem is in the pacemaker installation.
> >>>>>>> >
> >>>>>>> > Either way, I will find time to build it here as well.
> >>>>>>> >
> >>>>>>> > That is all.
> >>>>>>> >
> >>>>>>> >
> >>>>>>> > ----- Original Message -----
> >>>>>>> >> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>> >> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>> >> Date: 2015/3/18, Wed 09:33
> >>>>>>> >> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> Yamauchi-san,
> >>>>>>> >>
> >>>>>>> >> Hello, this is Fukuda.
> >>>>>>> >>
> >>>>>>> >>> By Reusable I meant glue.
> >>>>>>> >>
> >>>>>>> >> Understood. You mean Cluster-glue.
> >>>>>>> >>
> >>>>>>> >>> That is as expected; I think the agents under external are not being recognized and so are not starting.
> >>>>>>> >>
> >>>>>>> >> stonith -L does at least seem to list the plugins.
> >>>>>>> >>
> >>>>>>> >> # /usr/local/heartbeat/sbin/stonith -L
> >>>>>>> >> apcmaster
> >>>>>>> >> apcsmart
> >>>>>>> >> baytech
> >>>>>>> >> cyclades
> >>>>>>> >> external/drac5
> >>>>>>> >> external/dracmc-telnet
> >>>>>>> >> external/hetzner
> >>>>>>> >> external/hmchttp
> >>>>>>> >> external/ibmrsa
> >>>>>>> >> external/ibmrsa-telnet
> >>>>>>> >> external/ipmi
> >>>>>>> >> external/ippower9258
> >>>>>>> >> external/kdumpcheck
> >>>>>>> >> external/libvirt
> >>>>>>> >> external/nut
> >>>>>>> >> external/rackpdu
> >>>>>>> >> external/riloe
> >>>>>>> >> external/ssh
> >>>>>>> >> external/stonith-helper
> >>>>>>> >> external/vcenter
> >>>>>>> >> external/vmware
> >>>>>>> >> external/xen0
> >>>>>>> >> external/xen0-ha
> >>>>>>> >> ibmhmc
> >>>>>>> >> meatware
> >>>>>>> >> null
> >>>>>>> >> nw_rpc100s
> >>>>>>> >> rcd_serial
> >>>>>>> >> rps10
> >>>>>>> >> ssh
> >>>>>>> >> suicide
> >>>>>>> >> wti_nps
> >>>>>>> >>
> >>>>>>> >>
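> >>>>>>> >>> #I think I cannot say anything definite until I find time to build a Debian7 environment based on Matsushima-san's information.
> >>>>>>> >>
> >>>>>>> >> Sorry to trouble you when you are busy.
> >>>>>>> >> I will also review my installation.
> >>>>>>> >>
> >>>>>>> >> Thank you in advance.
> >>>>>>> >>
> >>>>>>> >> That is all.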
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> 2015-03-18 9:02 <renayama19661014@ybb.ne.jp>:
> >>>>>>> >>
> >>>>>>> >>> Fukuda-san,
> >>>>>>> >>>
> >>>>>>> >>> Good morning. This is Yamauchi.
> >>>>>>> >>>
> >>>>>>> >>> I worded that poorly.
> >>>>>>> >>> By Reusable I meant glue.
> >>>>>>> >>>
> >>>>>>> >>> There may be a problem with the pacemaker installation, but I cannot tell at this point.
> >>>>>>> >>>
> >>>>>>> >>>> I removed stonith-helper and started with only external/ssh, but the state shown by crm_mon did not change.
> >>>>>>> >>>
> >>>>>>> >>> That is as expected; I think the agents under external are not being recognized and so are not starting.
> >>>>>>> >>>
> >>>>>>> >>> #I think I cannot say anything definite until I find time to build a Debian7 environment based on Matsushima-san's information.
> >>>>>>> >>>
> >>>>>>> >>> That is all.
> >>>>>>> >>>
> >>>>>>> >>>
> >>>>>>> >>> ----- Original Message -----
> >>>>>>> >>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>> >>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>> >>>> Date: 2015/3/18, Wed 08:12
> >>>>>>> >>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>> >>>>
> >>>>>>> >>>>
> >>>>>>> >>>> Yamauchi-san,
> >>>>>>> >>>>
> >>>>>>> >>>> Good morning, this is Fukuda.
> >>>>>>> >>>>
> >>>>>>> >>>>> That suggests that neither xen0 nor stonith-helper is in a path that Pacemaker manages as stonith plugins.
> >>>>>>> >>>>>
> >>>>>>> >>>>> I suspect the installation of Reusable and pm_extras.
> >>>>>>> >>>>
> >>>>>>> >>>> Is there a problem with the pacemaker installation?
> >>>>>>> >>>> Also, does something called Reusable need to be installed separately?
> >>>>>>> >>>>
> >>>>>>> >>>> I removed stonith-helper and started with only external/ssh, but the state shown by crm_mon did not change.
> >>>>>>> >>>>
> >>>>>>> >>>> Last updated: Wed Mar 18 08:07:42 2015
> >>>>>>> >>>> Last change: Wed Mar 18 08:04:48 2015
> >>>>>>> >>>> Stack: heartbeat
> >>>>>>> >>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
> >>>>>>> >>>> Version: 1.1.12-e32080b
> >>>>>>> >>>> 2 Nodes configured
> >>>>>>> >>>> 6 Resources configured
> >>>>>>> >>>>
> >>>>>>> >>>>
> >>>>>>> >>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>> >>>>
> >>>>>>> >>>> Full list of resources:
> >>>>>>> >>>>
> >>>>>>> >>>> Stonith1-2 (stonith:external/ssh): Stopped
> >>>>>>> >>>> Stonith2-2 (stonith:external/ssh): Stopped
> >>>>>>> >>>> Resource Group: HAvarnish
> >>>>>>> >>>>     vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> >>>>>>> >>>>     varnishd (lsb:varnish): Started lbv1.beta.com
> >>>>>>> >>>> Clone Set: clone_ping [ping]
> >>>>>>> >>>>     Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>> >>>>
> >>>>>>> >>>> Node Attributes:
> >>>>>>> >>>> * Node lbv1.beta.com:
> >>>>>>> >>>>     + default_ping_set : 100
> >>>>>>> >>>> * Node lbv2.beta.com:
> >>>>>>> >>>>     + default_ping_set : 100
> >>>>>>> >>>>
> >>>>>>> >>>> Migration summary:
> >>>>>>> >>>> * Node lbv2.beta.com:
> >>>>>>> >>>>     Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:07:32 2015'
> >>>>>>> >>>> * Node lbv1.beta.com:
> >>>>>>> >>>>     Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Wed Mar 18 08:05:53 2015'
> >>>>>>> >>>>
> >>>>>>> >>>> Failed actions:
> >>>>>>> >>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:07:30 2015', queued=0ms, exec=1061ms
> >>>>>>> >>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Wed Mar 18 08:05:51 2015', queued=0ms, exec=1062ms
> >>>>>>> >>>>
> >>>>>>> >>>> Thank you in advance.
> >>>>>>> >>>>
> >>>>>>> >>>> That is all.
> >>>>>>> >>>>
> >>>>>>> >>>>
> >>>>>>> >>>>
> >>>>>>> >>>>
> >>>>>>> >>>>
> >>>>>>> >>>> 2015-03-17 23:51 <renayama19661014@ybb.ne.jp>:
> >>>>>>> >>>>
> >>>>>>> >>>>> Fukuda-san,
> >>>>>>> >>>>>
> >>>>>>> >>>>> Good evening, this is Yamauchi.
> >>>>>>> >>>>>
> >>>>>>> >>>>> That suggests that neither xen0 nor stonith-helper is in a path that Pacemaker manages as stonith plugins.
> >>>>>>> >>>>>
> >>>>>>> >>>>> I suspect the installation of Reusable and pm_extras.
> >>>>>>> >>>>>
> >>>>>>> >>>>> I will contact you again if I learn anything more.
> >>>>>>> >>>>>
> >>>>>>> >>>>> That is all.
> >>>>>>> >>>>>
> >>>>>>> >>>>>
> >>>>>>> >>>>>
> >>>>>>> >>>>> ----- Original Message -----
> >>>>>>> >>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>> >>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>> >>>>>> Date: 2015/3/17, Tue 23:46
> >>>>>>> >>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>> >>>>>>
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> Yamauchi-san,
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> Good evening, this is Fukuda.
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> I wonder whether I am using the -x option for stonith-helper the wrong way.
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> I removed stonith-helper and started with only xen0.
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> # crm_mon -rfA
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> Last updated: Tue Mar 17 23:38:53 2015
> >>>>>>> >>>>>> Last change: Tue Mar 17 23:30:34 2015
> >>>>>>> >>>>>> Stack: heartbeat
> >>>>>>> >>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
> >>>>>>> >>>>>> Version: 1.1.12-e32080b
> >>>>>>> >>>>>> 2 Nodes configured
> >>>>>>> >>>>>> 6 Resources configured
> >>>>>>> >>>>>>
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> Full list of resources:
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> Stonith1-2 (stonith:external/xen0): Stopped
> >>>>>>> >>>>>> Stonith2-2 (stonith:external/xen0): Stopped
> >>>>>>> >>>>>> Resource Group: HAvarnish
> >>>>>>> >>>>>>     vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> >>>>>>> >>>>>>     varnishd (lsb:varnish): Started lbv1.beta.com
> >>>>>>> >>>>>> Clone Set: clone_ping [ping]
> >>>>>>> >>>>>>     Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> Node Attributes:
> >>>>>>> >>>>>> * Node lbv1.beta.com:
> >>>>>>> >>>>>>     + default_ping_set : 100
> >>>>>>> >>>>>> * Node lbv2.beta.com:
> >>>>>>> >>>>>>     + default_ping_set : 100
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> Migration summary:
> >>>>>>> >>>>>> * Node lbv1.beta.com:
> >>>>>>> >>>>>>     Stonith2-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:34 2015'
> >>>>>>> >>>>>> * Node lbv2.beta.com:
> >>>>>>> >>>>>>     Stonith1-2: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 23:38:27 2015'
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> Failed actions:
> >>>>>>> >>>>>>     Stonith2-2_start_0 on lbv1.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:32 2015', queued=0ms, exec=1061ms
> >>>>>>> >>>>>>     Stonith1-2_start_0 on lbv2.beta.com 'unknown error' (1): call=23, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 23:38:25 2015', queued=0ms, exec=1342ms
> >>>>>>> >>>>>>
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> The same failed actions appear as when stonith-helper is present.
> >>>>>>> >>>>>>
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> Thank you in advance.
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> That is all.
> >>>>>>> >>>>>>
> >>>>>>> >>>>>>
> >>>>>>> >>>>>>
> >>>>>>> >>>>>>
> >>>>>>> >>>>>> 2015-03-17 22:38 <renayama19661014@ybb.ne.jp>:
> >>>>>>> >>>>>>
> >>>>>>> >>>>>>> Fukuda-san,
> >>>>>>> >>>>>>>
> >>>>>>> >>>>>>> Good evening, this is Yamauchi.
> >>>>>>> >>>>>>>
> >>>>>>> >>>>>>> If possible, checking what happens with external/stonith-helper removed and only external/xen0 configured might help isolate the problem.
> >>>>>>> >>>>>>>
> >>>>>>> >>>>>>> That is all.
> >>>>>>> >>>>>>>
> >>>>>>> >>>>>>>
> >>>>>>> >>>>>>>
> >>>>>>> >>>>>>> ----- Original Message -----
> >>>>>>> >>>>>>>
> >>>>>>> >>>>>>>> From: "renayama19661014@ybb.ne.jp"
> >>>>>>> > <renayama19661014@ybb.ne.jp>
> >>>>>>> >>>>>>>> To: "linux-ha-japan@lists.sourceforge.jp"
> >>>>>>> > <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>> >>>>>>>> Cc:
> >>>>>>> >>>>>>>> Date: 2015/3/17, Tue 22:28
> >>>>>>> >>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>> >>>>>>>>
> >>>>>>> >>>>>>>> Fukuda-san,
> >>>>>>> >>>>>>>>
> >>>>>>> >>>>>>>> Good evening, this is Yamauchi.
> >>>>>>> >>>>>>>>
> >>>>>>> >>>>>>>> It seems nothing has changed...
> >>>>>>> >>>>>>>>
> >>>>>>> >>>>>>>> For now, around tomorrow, I will check (on RHEL, though) whether stonith-helper works with the combination of
> >>>>>>> >>>>>>>>
> >>>>>>> >>>>>>>> Heartbeat 3.0.6
> >>>>>>> >>>>>>>> the latest Pacemaker
> >>>>>>> >>>>>>>>
> >>>>>>> >>>>>>>> and a similar configuration (the resources will be Dummy, and external/xen0 will be replaced by external/ssh).
> >>>>>>> >>>>>>>>
> >>>>>>> >>>>>>>> # If we could see the output of stonith-helper run with -x, the problem would be somewhat easier to narrow down...
> >>>>>>> >>>>>>>>
> >>>>>>> >>>>>>>>
> >>>>>>> >>>>>>>> That is all.
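
As a rough illustration of the -x output mentioned above: with -x, bash writes each command it executes to stderr, prefixed with "+". The sketch below uses a hypothetical stand-in script (demo_helper.sh, not the real stonith-helper) and captures only that trace. Where stonith-ng sends a plugin's stderr is not shown in this thread, so the capture step is an assumption for illustration only.

```shell
# Sketch: capture the trace that `bash -x` writes to stderr.
# demo_helper.sh is a hypothetical stand-in, NOT the real stonith-helper.
tmpdir=$(mktemp -d)
cat > "$tmpdir/demo_helper.sh" <<'EOF'
#!/bin/bash
target="lbv2.beta.com"
echo "checking $target"
EOF

trace_script() {
    # $1 = script path. Trace lines go to stderr, so stderr is routed
    # to the caller's stdout first and the script's own stdout is dropped.
    bash -x "$1" 2>&1 >/dev/null
}

trace_output=$(trace_script "$tmpdir/demo_helper.sh")
printf '%s\n' "$trace_output"
rm -rf "$tmpdir"
```

Each traced line shows the command after variable expansion, which is what makes this kind of output useful for narrowing down where a helper script fails.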
> >>>>>>> >>>>>>>>
> >>>>>>> >>>>>>>>
> >>>>>>> >>>>>>>>
> >>>>>>> >>>>>>>> ----- Original Message -----
> >>>>>>> >>>>>>>>> From: Masamichi Fukuda - elf-systems
> >>>>>>> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >>>>>>> >>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>;
> >>>>>>> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >>>>>>> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>> >>>>>>>>> Date: 2015/3/17, Tue 21:24
> >>>>>>> >>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> Yamauchi-san,
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> Good evening, this is Fukuda.
> >>>>>>> >>>>>>>>> Thank you for the information on the latest version.
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> I installed it right away.
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> This is the state after startup.
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> The failed actions appear to be unchanged.
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> # crm_mon -rfA
> >>>>>>> >>>>>>>>> Last updated: Tue Mar 17 21:03:49 2015
> >>>>>>> >>>>>>>>> Last change: Tue Mar 17 20:30:58 2015
> >>>>>>> >>>>>>>>> Stack: heartbeat
> >>>>>>> >>>>>>>>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
> >>>>>>> >>>>>>>>> Version: 1.1.12-e32080b
> >>>>>>> >>>>>>>>> 2 Nodes configured
> >>>>>>> >>>>>>>>> 8 Resources configured
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> Full list of resources:
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> Resource Group: HAvarnish
> >>>>>>> >>>>>>>>> vip_208 (ocf::heartbeat:IPaddr2): Started lbv1.beta.com
> >>>>>>> >>>>>>>>> varnishd (lsb:varnish): Started lbv1.beta.com
> >>>>>>> >>>>>>>>> Resource Group: grpStonith1
> >>>>>>> >>>>>>>>> Stonith1-1 (stonith:external/stonith-helper): Stopped
> >>>>>>> >>>>>>>>> Stonith1-2 (stonith:external/xen0): Stopped
> >>>>>>> >>>>>>>>> Resource Group: grpStonith2
> >>>>>>> >>>>>>>>> Stonith2-1 (stonith:external/stonith-helper): Stopped
> >>>>>>> >>>>>>>>> Stonith2-2 (stonith:external/xen0): Stopped
> >>>>>>> >>>>>>>>> Clone Set: clone_ping [ping]
> >>>>>>> >>>>>>>>> Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> Node Attributes:
> >>>>>>> >>>>>>>>> * Node lbv1.beta.com:
> >>>>>>> >>>>>>>>> + default_ping_set : 100
> >>>>>>> >>>>>>>>> * Node lbv2.beta.com:
> >>>>>>> >>>>>>>>> + default_ping_set : 100
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> Migration summary:
> >>>>>>> >>>>>>>>> * Node lbv1.beta.com:
> >>>>>>> >>>>>>>>> Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
> >>>>>>> >>>>>>>>> * Node lbv2.beta.com:
> >>>>>>> >>>>>>>>> Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> Failed actions:
> >>>>>>> >>>>>>>>> Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
> >>>>>>> >>>>>>>>> Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> Here are the logs.
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> # less /var/log/ha-debug
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: info: Pacemaker support:
> >>>>>>> >>>>>>>> yes
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: WARN: File
> >>>>>>> >>>>>>>> /etc/ha.d//haresources exists.
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: WARN: This file is not used
> >>>>>>> >>>>>>>> because pacemaker is enabled
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: debug: Checking access of:
> >>>>>>> >>>>>>>> /usr/local/heartbeat/libexec/heartbeat/ccm
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: debug: Checking access of:
> >>>>>>> >>>>>>>> /usr/local/heartbeat/libexec/pacemaker/cib
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: debug: Checking access of:
> >>>>>>> >>>>>>>> /usr/local/heartbeat/libexec/pacemaker/stonithd
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: debug: Checking access of:
> >>>>>>> >>>>>>>> /usr/local/heartbeat/libexec/pacemaker/lrmd
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: debug: Checking access of:
> >>>>>>> >>>>>>>> /usr/local/heartbeat/libexec/pacemaker/attrd
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: debug: Checking access of:
> >>>>>>> >>>>>>>> /usr/local/heartbeat/libexec/pacemaker/crmd
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: WARN: Core dumps could be
> >>>>>>> >>>>>>>> lost if multiple dumps occur.
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: WARN: Consider setting
> >>>>>>> >>>>>>>> non-default value in /proc/sys/kernel/core_pattern
> >>>>>>> > (or equivalent) for maximum
> >>>>>>> >>>>>>>> supportability
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: WARN: Consider setting
> >>>>>>> >>>>>>>> /proc/sys/kernel/core_uses_pid (or equivalent) to 1
> >>>>>>> > for maximum supportability
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: WARN: Logging daemon is
> >>>>>>> >>>>>>>> disabled --enabling logging daemon is recommended
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: info:
> >>>>>>> >>>>>>>> **************************
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4235]: info: Configuration
> >>>>>>> >>>>>>>> validated. Starting heartbeat 3.0.6
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: heartbeat: version
> >>>>>>> >>>>>>>> 3.0.6
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: Heartbeat generation:
> >>>>>>> >>>>>>>> 1423534116
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: seed is -1702799346
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: glib: ucast: write
> >>>>>>> >>>>>>>> socket priority set to IPTOS_LOWDELAY on eth1
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: glib: ucast: bound
> >>>>>>> >>>>>>>> send socket to device: eth1
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: glib: ucast: set
> >>>>>>> >>>>>>>> SO_REUSEADDR
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: glib: ucast: bound
> >>>>>>> >>>>>>>> receive socket to device: eth1
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: glib: ucast: started
> >>>>>>> >>>>>>>> on port 694 interface eth1 to 10.0.17.133
> >>>>>>> >>>>>>>>> Mar 17 21:02:39 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: Local status now set
> >>>>>>> >>>>>>>> to: 'up'
> >>>>>>> >>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: Link
> >>>>>>> >>>>>>>> lbv2.beta.com:eth1 up.
> >>>>>>> >>>>>>>>> Mar 17 21:02:46 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: Status update for
> >>>>>>> >>>>>>>> node lbv2.beta.com: status up
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: Comm_now_up():
> >>>>>>> >>>>>>>> updating status to active
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: Local status now set
> >>>>>>> >>>>>>>> to: 'active'
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: Starting child client
> >>>>>>> >>>>>>>>
> >>>>>>> > "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: Starting child client
> >>>>>>> >>>>>>>>
> >>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: Starting child client
> >>>>>>> >>>>>>>>
> >>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: Starting child client
> >>>>>>> >>>>>>>>
> >>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: Starting child client
> >>>>>>> >>>>>>>>
> >>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: Starting child client
> >>>>>>> >>>>>>>>
> >>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: debug: get_delnodelist:
> >>>>>>> >>>>>>>> delnodelist=
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4250]: info: Starting
> >>>>>>> >>>>>>>>
> >>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109 gid
> 113 (pid
> >>>>>>> >>>>>>>> 4250)
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4246]: info: Starting
> >>>>>>> >>>>>>>>
> >>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109 gid 113
> (pid
> >>>>>>> >>>>>>>> 4246)
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4249]: info: Starting
> >>>>>>> >>>>>>>>
> >>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109 gid
> 113
> >>>>>>> >>>>>>>> (pid 4249)
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4245]: info: Starting
> >>>>>>> >>>>>>>>
> >>>>>>> > "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109 gid 113
> (pid
> >>>>>>> >>>>>>>> 4245)
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4248]: info: Starting
> >>>>>>> >>>>>>>>
> >>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0 gid 0
> (pid
> >>>>>>> >>>>>>>> 4248)
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4247]: info: Starting
> >>>>>>> >>>>>>>>
> >>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid
> 0 (pid
> >>>>>>> >>>>>>>> 4247)
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]:
> >>>>>>> > info: Hostname: lbv1.beta.com
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: the send queue length
> >>>>>>> >>>>>>>> from heartbeat to client ccm is set to 1024
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: the send queue length
> >>>>>>> >>>>>>>> from heartbeat to client attrd is set to 1024
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: the send queue length
> >>>>>>> >>>>>>>> from heartbeat to client stonith-ng is set to 1024
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: Status update for
> >>>>>>> >>>>>>>> node lbv2.beta.com: status active
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: the send queue length
> >>>>>>> >>>>>>>> from heartbeat to client cib is set to 1024
> >>>>>>> >>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: WARN: 1 lost packet(s) for
> >>>>>>> >>>>>>>> [lbv2.beta.com] [15:17]
> >>>>>>> >>>>>>>>> Mar 17 21:02:51 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: No pkts missing from
> >>>>>>> >>>>>>>> lbv2.beta.com!
> >>>>>>> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: WARN: 1 lost packet(s) for
> >>>>>>> >>>>>>>> [lbv2.beta.com] [19:21]
> >>>>>>> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: No pkts missing from
> >>>>>>> >>>>>>>> lbv2.beta.com!
> >>>>>>> >>>>>>>>> Mar 17 21:02:52 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: the send queue length
> >>>>>>> >>>>>>>> from heartbeat to client crmd is set to 1024
> >>>>>>> >>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: WARN: 1 lost packet(s) for
> >>>>>>> >>>>>>>> [lbv2.beta.com] [24:26]
> >>>>>>> >>>>>>>>> Mar 17 21:02:53 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: No pkts missing from
> >>>>>>> >>>>>>>> lbv2.beta.com!
> >>>>>>> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: WARN: 1 lost packet(s) for
> >>>>>>> >>>>>>>> [lbv2.beta.com] [26:28]
> >>>>>>> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: No pkts missing from
> >>>>>>> >>>>>>>> lbv2.beta.com!
> >>>>>>> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: WARN: 1 lost packet(s) for
> >>>>>>> >>>>>>>> [lbv2.beta.com] [30:32]
> >>>>>>> >>>>>>>>> Mar 17 21:02:54 lbv1.beta.com heartbeat:
> >>>>>>> > [4236]: info: No pkts missing from
> >>>>>>> >>>>>>>> lbv2.beta.com!
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> # less /var/log/error
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> Mar 17 21:02:47 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >>>>>>> >>>>>>>>> Mar 17 21:02:48 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >>>>>>> >>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >>>>>>> >>>>>>>>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
> >>>>>>> >>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
> >>>>>>> >>>>>>>>> Mar 17 21:03:24 lbv1 pengine[4253]: notice:
> >>>>>>> > process_pe_message: Calculated
> >>>>>>> >>>>>>>> Transition 0:
> >>>>>>> > /var/lib/pacemaker/pengine/pe-input-115.bz2
> >>>>>>> >>>>>>>>> Mar 17 21:03:27 lbv1 crmd[4250]: notice:
> >>>>>>> > run_graph: Transition 0
> >>>>>>> >>>>>>>> (Complete=15, Pending=0, Fired=0, Skipped=16,
> >>>>>>> > Incomplete=2,
> >>>>>>> >>>>>>>>
> >>>>>>> > Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
> >>>>>>> >>>>>>>>> Mar 17 21:03:29 lbv1 pengine[4253]: notice:
> >>>>>>> > process_pe_message: Calculated
> >>>>>>> >>>>>>>> Transition 1:
> >>>>>>> > /var/lib/pacemaker/pengine/pe-input-116.bz2
> >>>>>>> >>>>>>>>> Mar 17 21:03:34 lbv1 crmd[4250]: notice:
> >>>>>>> > run_graph: Transition 1
> >>>>>>> >>>>>>>> (Complete=8, Pending=0, Fired=0, Skipped=12,
> >>>>>>> > Incomplete=1,
> >>>>>>> >>>>>>>>
> >>>>>>> > Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
> >>>>>>> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning:
> >>>>>>> > unpack_rsc_op_failure:
> >>>>>>> >>>>>>>> Processing failed op start for Stonith1-1 on
> >>>>>>> > lbv2.beta.com: unknown error (1)
> >>>>>>> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: warning:
> >>>>>>> > unpack_rsc_op_failure:
> >>>>>>> >>>>>>>> Processing failed op start for Stonith1-1 on
> >>>>>>> > lbv2.beta.com: unknown error (1)
> >>>>>>> >>>>>>>>> Mar 17 21:03:37 lbv1 pengine[4253]: notice:
> >>>>>>> > process_pe_message: Calculated
> >>>>>>> >>>>>>>> Transition 2:
> >>>>>>> > /var/lib/pacemaker/pengine/pe-input-117.bz2
> >>>>>>> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
> >>>>>>> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
> >>>>>>> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
> >>>>>>> >>>>>>>>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ failed: 2 ]
> >>>>>>> >>>>>>>>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
> >>>>>>> >>>>>>>>> Mar 17 21:03:40 lbv1 crmd[4250]: notice:
> >>>>>>> > run_graph: Transition 2
> >>>>>>> >>>>>>>> (Complete=12, Pending=0, Fired=0, Skipped=3,
> >>>>>>> > Incomplete=0,
> >>>>>>> >>>>>>>>
> >>>>>>> > Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
> >>>>>>> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning:
> >>>>>>> > unpack_rsc_op_failure:
> >>>>>>> >>>>>>>> Processing failed op start for Stonith2-1 on
> >>>>>>> > lbv1.beta.com: unknown error (1)
> >>>>>>> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning:
> >>>>>>> > unpack_rsc_op_failure:
> >>>>>>> >>>>>>>> Processing failed op start for Stonith2-1 on
> >>>>>>> > lbv1.beta.com: unknown error (1)
> >>>>>>> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: warning:
> >>>>>>> > unpack_rsc_op_failure:
> >>>>>>> >>>>>>>> Processing failed op start for Stonith1-1 on
> >>>>>>> > lbv2.beta.com: unknown error (1)
> >>>>>>> >>>>>>>>> Mar 17 21:03:42 lbv1 pengine[4253]: notice:
> >>>>>>> > process_pe_message: Calculated
> >>>>>>> >>>>>>>> Transition 3:
> >>>>>>> > /var/lib/pacemaker/pengine/pe-input-118.bz2
> >>>>>>> >>>>>>>>> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]:
> >>>>>>> > INFO:
> >>>>>>> >>>>>>>> /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p
> >>>>>>> >>>>>>>> /var/run/resource-agents/send_arp-192.168.17.208
> >>>>>>> > eth0 192.168.17.208 auto
> >>>>>>> >>>>>>>> not_used not_used
> >>>>>>> >>>>>>>>> Mar 17 21:03:47 lbv1 crmd[4250]: notice:
> >>>>>>> > run_graph: Transition 3
> >>>>>>> >>>>>>>> (Complete=10, Pending=0, Fired=0, Skipped=0,
> >>>>>>> > Incomplete=0,
> >>>>>>> >>>>>>>>
> >>>>>>> > Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> Thank you in advance.
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> That is all.
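
The syslog excerpt in this message reports `failed to exec "stonith"` followed by `failed: 2`. One possible reading (an assumption, not something confirmed in this thread) is that the stonith command could not be found when stonith-ng tried to run it. A minimal sketch for checking whether a command resolves through PATH follows; the helper name resolve_cmd is hypothetical, and the PATH of an interactive shell may differ from the one the daemon runs with.

```shell
# Sketch: report whether a command can be resolved through PATH.
# resolve_cmd is a hypothetical helper name used only for illustration.
resolve_cmd() {
    # Prints the resolved location and returns 0 if found, returns non-zero otherwise.
    command -v "$1" 2>/dev/null
}

if stonith_path=$(resolve_cmd stonith); then
    echo "stonith found at: $stonith_path"
else
    echo "stonith not found in PATH=$PATH"
fi
```

If the command is missing from PATH on a source-built installation, comparing the result with the install prefix seen in the logs (/usr/local/heartbeat) would be a natural next check.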
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> On March 17, 2015 at 18:31, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>> >>>>>>>>>
> >>>>>>> >>>>>>>>> Fukuda-san,
> >>>>>>> >>>>>>>>>>
> >>>>>>> >>>>>>>>>> Good evening, this is Yamauchi.
> >>>>>>> >>>>>>>>>>
> >>>>>>> >>>>>>>>>> It has not been tagged, so as of today the latest version is:
> >>>>>>> >>>>>>>>>>
> >>>>>>> >>>>>>>>>> * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
> >>>>>>> >>>>>>>>>>
> >>>>>>> >>>>>>>>>> You can download it via [Download ZIP] on the right-hand side.
> >>>>>>> >>>>>>>>>>
> >>>>>>> >>>>>>>>>> That is all.
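
For reference, the same commit can also be fetched from the command line. The sketch below assumes GitHub's /archive/<commit>.zip URL layout; the function name commit_archive_url is made up for illustration, and the download line is left commented out because it needs network access.

```shell
# Sketch: build the ZIP archive URL for one specific commit.
# commit_archive_url is a hypothetical helper; the /archive/<sha>.zip
# layout is an assumption about GitHub's URL scheme.
commit_archive_url() {
    # $1 = repository URL, $2 = full commit hash
    printf '%s/archive/%s.zip\n' "$1" "$2"
}

url=$(commit_archive_url "https://github.com/ClusterLabs/pacemaker" \
    "e32080b460f81486b85d08ec958582b3e72d858c")
echo "$url"

# To download (needs network access), something like:
#   curl -L -o pacemaker-e32080b.zip "$url"
```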
> >>>>>>> >>>>>>>>>>
> >>>>>>> >>>>>>>>>>
> >>>>>>> >>>>>>>>>> ----- Original Message -----
> >>>>>>> >>>>>>>>>>> From: Masamichi Fukuda - elf-systems
> >>>>>>> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >>>>>>> >>>>>>>>>>
> >>>>>>> >>>>>>>>>>> To:
> >>>>>>> > "renayama19661014@ybb.ne.jp"
> >>>>>>> >>>>>>>> <renayama19661014@ybb.ne.jp>;
> >>>>>>> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >>>>>>> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>> >>>>>>>>>>> Date: 2015/3/17, Tue 18:07
> >>>>>>> >>>>>>>>>>> Subject: STONITH errors during split-brain
> >>>>>>> >>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>
> >>>>>>> >>>>>>>>>>> Yamauchi-san,
> >>>>>>> >>>>>>>>>>>
> >>>>>>> >>>>>>>>>>> Hello, this is Fukuda.
> >>>>>>> >>>>>>>>>>>
> >>>>>>> >>>>>>>>>>> I looked at this page:
> >>>>>>> >>>>>>>>>>>
> >>>>>>> >>>>>>>>>>> https://github.com/ClusterLabs/pacemaker/tags
> >>>>>>> >>>>>>>>>>>
> >>>>>>> >>>>>>>>>>> and pacemaker 1.1.12 561c4cf appears to be the latest.
> >>>>>>> >>>>>>>>>>> Sorry to trouble you, but could you tell me where a version newer than this can be found?
> >>>>>>> >>>>>>>>>>>
> >>>>>>> >>>>>>>>>>> Thank you in advance.
> >>>>>>> >>>>>>>>>>>
> >>>>>>> >>>>>>>>>>> That is all.
> >>>>>>> >>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>
> >>>>>>> >>>>>>>>>>> On Tuesday, March 17, 2015, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>> >>>>>>>>>>>
> >>>>>>> >>>>>>>>>>> Fukuda-san,
> >>>>>>> >>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>> Hello, this is Yamauchi.
> >>>>>>> >>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>> Yes, it is old.
> >>>>>>> >>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>> Pacemaker gained support for Heartbeat 3.0.6 surprisingly recently.
> >>>>>>> >>>>>>>>>>>> Please install something newer. (Again, you will need to build it from source...)
> >>>>>>> >>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>> It is available from the upstream GitHub repository.
> >>>>>>> >>>>>>>>>>>> * https://github.com/ClusterLabs/pacemaker
> >>>>>>> >>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>> In some cases the latest master may produce errors; if that happens, I think it is best to work
> >>>>>>> >>>>>>>>>>>> back toward older versions.
> >>>>>>> >>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>> That is all.
> >>>>>>> >>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>> ----- Original Message -----
> >>>>>>> >>>>>>>>>>>>> From: Masamichi Fukuda -
> >>>>>>> > elf-systems
> >>>>>>> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >>>>>>> >>>>>>>>>>>>> To: 山内英生
> >>>>>>> > <renayama19661014@ybb.ne.jp>;
> >>>>>>> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >>>>>>> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>> >>>>>>>>>>>>> Date: 2015/3/17, Tue 16:06
> >>>>>>> >>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>> >>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>> Yamauchi-san,
> >>>>>>> >>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>> Hello, this is Fukuda.
> >>>>>>> >>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>> In an earlier email you advised me to install the latest versions of heartbeat and pacemaker.
> >>>>>>> >>>>>>>>>>>>> So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12.
> >>>>>>> >>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>> heartbeat configuration: Version = "3.0.6"
> >>>>>>> >>>>>>>>>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
> >>>>>>> >>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>> Does this mean that pacemaker is still too old?
> >>>>>>> >>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>> Sorry to trouble you, and thank you in advance.
> >>>>>>> >>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>> That is all.
> >>>>>>> >>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>> On March 17, 2015 at 14:59, <renayama19661014@ybb.ne.jp> wrote:
> >>>>>>> >>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>> Fukuda-san,
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>> Hello, this is Yamauchi.
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>> It just occurred to me: in an earlier email in this exchange I replied as follows. Is that point covered on your side?
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> 2) Heartbeat 3.0.6 + latest Pacemaker: OK
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> It appears that Heartbeat also has to be the latest version, 3.0.6, in this combination.
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>> Judging from the crm_mon output below, the version appears to be 1.1.12.
> >>>>>>> >>>>>>>>>>>>>> Combining it with Heartbeat 3.0.6 requires a fairly recent Pacemaker.
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> # crm_mon -rfA
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Last updated: Tue Mar
> >>>>>>> > 17 14:14:39 2015
> >>>>>>> >>>>>>>>>>>>>>> Last change: Tue Mar 17
> >>>>>>> > 14:01:43 2015
> >>>>>>> >>>>>>>>>>>>>>> Stack: heartbeat
> >>>>>>> >>>>>>>>>>>>>>> Current DC:
> >>>>>>> > lbv2.beta.com
> >>>>>>> >>>>>>>> (82ffc36f-1ad8-8686-7db0-35686465c624) - parti
> >>>>>>> >>>>>>>>>>>>>>> tion with quorum
> >>>>>>> >>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>> You probably need a version that includes at least the following change:
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >
> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>> That is all.
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>> ----- Original Message -----
> >>>>>>> >>>>>>>>>>>>>>> From: Masamichi Fukuda
> >>>>>>> > - elf-systems
> >>>>>>> >>>>>>>> <masamichi_fukuda@elf-systems.com>
> >>>>>>> >>>>>>>>>>>>>>> To: 山内英生
> >>>>>>> > <renayama19661014@ybb.ne.jp>;
> >>>>>>> >>>>>>>> "linux-ha-japan@lists.sourceforge.jp"
> >>>>>>> >>>>>>>> <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>> >>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Date: 2015/3/17, Tue
> >>>>>>> > 14:38
> >>>>>>> >>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Yamauchi-san,
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Hello, this is Fukuda.
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Should I add -x to the shebang line of stonith-helper?
> >>>>>>> >>>>>>>>>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> In crm_mon, nothing appears to have changed from before.
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> # crm_mon -rfA
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Last updated: Tue Mar
> >>>>>>> > 17 14:14:39 2015
> >>>>>>> >>>>>>>>>>>>>>> Last change: Tue Mar 17
> >>>>>>> > 14:01:43 2015
> >>>>>>> >>>>>>>>>>>>>>> Stack: heartbeat
> >>>>>>> >>>>>>>>>>>>>>> Current DC:
> >>>>>>> > lbv2.beta.com
> >>>>>>> >>>>>>>> (82ffc36f-1ad8-8686-7db0-35686465c624) - parti
> >>>>>>> >>>>>>>>>>>>>>> tion with quorum
> >>>>>>> >>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>> >>>>>>>>>>>>>>> 2 Nodes configured
> >>>>>>> >>>>>>>>>>>>>>> 8 Resources configured
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Online: [ lbv1.beta.com
> >>>>>>> > lbv2.beta.com ]
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Full list of resources:
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Resource Group:
> >>>>>>> > HAvarnish
> >>>>>>> >>>>>>>>>>>>>>> vip_208
> >>>>>>> > (ocf::heartbeat:IPaddr2):
> >>>>>>> >>>>>>>> Started lbv1.beta.com
> >>>>>>> >>>>>>>>>>>>>>> varnishd
> >>>>>>> > (lsb:varnish): Started
> >>>>>>> >>>>>>>> lbv1.beta.com
> >>>>>>> >>>>>>>>>>>>>>> Resource Group:
> >>>>>>> > grpStonith1
> >>>>>>> >>>>>>>>>>>>>>> Stonith1-1
> >>>>>>> >>>>>>>> (stonith:external/stonith-helper): Stopped
> >>>>>>> >>>>>>>>>>>>>>> Stonith1-2
> >>>>>>> > (stonith:external/xen0):
> >>>>>>> >>>>>>>> Stopped
> >>>>>>> >>>>>>>>>>>>>>> Resource Group:
> >>>>>>> > grpStonith2
> >>>>>>> >>>>>>>>>>>>>>> Stonith2-1
> >>>>>>> >>>>>>>> (stonith:external/stonith-helper): Stopped
> >>>>>>> >>>>>>>>>>>>>>> Stonith2-2
> >>>>>>> > (stonith:external/xen0):
> >>>>>>> >>>>>>>> Stopped
> >>>>>>> >>>>>>>>>>>>>>> Clone Set: clone_ping
> >>>>>>> > [ping]
> >>>>>>> >>>>>>>>>>>>>>> Started: [
> >>>>>>> > lbv1.beta.com lbv2.beta.com ]
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Node Attributes:
> >>>>>>> >>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>> >>>>>>>>>>>>>>> +
> >>>>>>> > default_ping_set : 100
> >>>>>>> >>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>> >>>>>>>>>>>>>>> +
> >>>>>>> > default_ping_set : 100
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Migration summary:
> >>>>>>> >>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>> >>>>>>>>>>>>>>> Stonith1-1:
> >>>>>>> > migration-threshold=1
> >>>>>>> >>>>>>>> fail-count=1000000 last-failure='Tue Mar 17
> >>>>>>> >>>>>>>>>>>>>>> 14:12:16 2015'
> >>>>>>> >>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>> >>>>>>>>>>>>>>> Stonith2-1:
> >>>>>>> > migration-threshold=1
> >>>>>>> >>>>>>>> fail-count=1000000 last-failure='Tue Mar 17
> >>>>>>> >>>>>>>>>>>>>>> 14:12:21 2015'
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Failed actions:
> >>>>>>> >>>>>>>>>>>>>>> Stonith1-1_start_0
> >>>>>>> > on lbv2.beta.com 'unknown
> >>>>>>> >>>>>>>> error' (1): call=31, st
> >>>>>>> >>>>>>>>>>>>>>> atus=Error,
> >>>>>>> > last-rc-change='Tue Mar 17 14:12:14
> >>>>>>> >>>>>>>> 2015', queued=0ms, exec=1065ms
> >>>>>>> >>>>>>>>>>>>>>> Stonith2-1_start_0
> >>>>>>> > on lbv1.beta.com 'unknown
> >>>>>>> >>>>>>>> error' (1): call=26, st
> >>>>>>> >>>>>>>>>>>>>>> atus=Error,
> >>>>>>> > last-rc-change='Tue Mar 17 14:12:19
> >>>>>>> >>>>>>>> 2015', queued=0ms, exec=1081ms
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> I looked for other logs.
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> This is from heartbeat startup.
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> # less
> >>>>>>> > /var/log/pm_logconv.out
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:28
> >>>>>>> > lbv1.beta.com info: Starting
> >>>>>>> >>>>>>>> Heartbeat 3.0.6.
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:33
> >>>>>>> > lbv1.beta.com info: Link
> >>>>>>> >>>>>>>> lbv2.beta.com:eth1 is up.
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:34
> >>>>>>> > lbv1.beta.com info: Start
> >>>>>>> >>>>>>>> "ccm" process. (pid=13264)
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:34
> >>>>>>> > lbv1.beta.com info: Start
> >>>>>>> >>>>>>>> "lrmd" process. (pid=13267)
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:34
> >>>>>>> > lbv1.beta.com info: Start
> >>>>>>> >>>>>>>> "attrd" process. (pid=13268)
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:34
> >>>>>>> > lbv1.beta.com info: Start
> >>>>>>> >>>>>>>> "stonithd" process. (pid=13266)
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:34
> >>>>>>> > lbv1.beta.com info: Start
> >>>>>>> >>>>>>>> "cib" process. (pid=13265)
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:34
> >>>>>>> > lbv1.beta.com info: Start
> >>>>>>> >>>>>>>> "crmd" process. (pid=13269)
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> # less /var/log/error
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
> >>>>>>> > crmd[13269]: error:
> >>>>>>> >>>>>>>> process_lrm_event: Operation Stonith2-1_start_0
> >>>>>>> > (node=lbv1.beta.com, call=26,
> >>>>>>> >>>>>>>> status=4, cib-update=19, confirmed=true) Error
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> These are the syslog lines that match a grep for stonith:
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1
> >>>>>>> > heartbeat: [13255]: info:
> >>>>>>> >>>>>>>> Starting child client
> >>>>>>> >>>>>>>>
> >>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1
> >>>>>>> > heartbeat: [13266]: info:
> >>>>>>> >>>>>>>> Starting
> >>>>>>> > "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0
> >>>>>>> >>>>>>>> gid 0 (pid 13266)
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1
> >>>>>>> > stonithd[13266]: notice:
> >>>>>>> >>>>>>>> crm_cluster_connect: Connecting to cluster
> >>>>>>> > infrastructure: heartbeat
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:34 lbv1
> >>>>>>> > heartbeat: [13255]: info: the
> >>>>>>> >>>>>>>> send queue length from heartbeat to client stonithd
> >>>>>>> > is set to 1024
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1
> >>>>>>> > stonithd[13266]: notice:
> >>>>>>> >>>>>>>> setup_cib: Watching for stonith topology changes
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1
> >>>>>>> > stonithd[13266]: notice:
> >>>>>>> >>>>>>>> unpack_config: On loss of CCM Quorum: Ignore
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1
> >>>>>>> > stonithd[13266]: warning:
> >>>>>>> >>>>>>>> handle_startup_fencing: Blind faith: not fencing
> >>>>>>> > unseen nodes
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:40 lbv1
> >>>>>>> > stonithd[13266]: warning:
> >>>>>>> >>>>>>>> handle_startup_fencing: Blind faith: not fencing
> >>>>>>> > unseen nodes
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1
> >>>>>>> > stonithd[13266]: notice:
> >>>>>>> >>>>>>>> stonith_device_register: Added 'Stonith2-1'
> >>>>>>> > to the device list (1 active
> >>>>>>> >>>>>>>> devices)
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:11:41 lbv1
> >>>>>>> > stonithd[13266]: notice:
> >>>>>>> >>>>>>>> stonith_device_register: Added 'Stonith2-2'
> >>>>>>> > to the device list (2 active
> >>>>>>> >>>>>>>> devices)
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:12:04 lbv1
> >>>>>>> > stonithd[13266]: notice:
> >>>>>>> >>>>>>>> xml_patch_version_check: Versions did not change in
> >>>>>>> > patch 0.5.0
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
> >>>>>>> > stonithd[13266]: notice:
> >>>>>>> >>>>>>>> log_operation: Operation 'monitor' [13386]
> >>>>>>> > for device
> >>>>>>> >>>>>>>> 'Stonith2-1' returned: -201 (Generic
> >>>>>>> > Pacemaker error)
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
> >>>>>>> > stonithd[13266]: warning:
> >>>>>>> >>>>>>>> log_operation: Stonith2-1:13386 [ Performing:
> >>>>>>> > stonith -t external/stonith-helper
> >>>>>>> >>>>>>>> -S ]
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
> >>>>>>> > stonithd[13266]: warning:
> >>>>>>> >>>>>>>> log_operation: Stonith2-1:13386 [ failed to exec
> >>>>>>> > "stonith" ]
> >>>>>>> >>>>>>>>>>>>>>> Mar 17 14:12:20 lbv1
> >>>>>>> > stonithd[13266]: warning:
> >>>>>>> >>>>>>>> log_operation: Stonith2-1:13386 [ failed: 2 ]
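The log above ends with `failed to exec "stonith"`. One thing that message may point at is that the `stonith` command is not on the search path of the stonithd process; the thread does not confirm this, so treat it as an assumption to check. A minimal sketch of such a check follows. The function name `find_in_path` and the directory `/usr/local/heartbeat/sbin` are illustrative guesses based on the install prefix seen in the thread, not paths taken from it.

```sh
#!/bin/sh
# Print the first directory in a colon-separated search path that holds an
# executable regular file named $1; return 1 if no directory does.
find_in_path() {
    name=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example: search the current PATH plus a guessed sbin directory (assumption).
find_in_path stonith "$PATH:/usr/local/heartbeat/sbin" \
    || echo "stonith not found on the search path"
```

Note that the PATH of an interactive shell can differ from the PATH that heartbeat hands to its child processes, so a hit here is only a hint.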
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> Thank you for your help.
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> That is all.
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>> 2015-03-17 13:32 <renayama19661014@ybb.ne.jp>:
> >>>>>>> >>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>> Fukuda-san,
> >>>>>>> >>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>> Thank you for your work. This is Yamauchi.
> >>>>>>> >>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>> In that case, the problem appears to be in the start operation of stonith-helper.
> >>>>>>> >>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>> If you put
> >>>>>>> >>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>> #!/bin/bash -x
> >>>>>>> >>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>> at the top of stonith-helper and start the cluster, you may learn something.
> >>>>>>> >>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>> Incidentally, I would expect stonith-helper's own log to be written somewhere as well...
> >>>>>>> >>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>> That is all.
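The `-x` suggestion only helps if the trace output lands somewhere readable. Below is a minimal sketch of one way to capture it, assuming the daemon that runs the plugin may not pass the plugin's stderr through to a log. The file name `/tmp/stonith-helper.trace` and the function name `enable_trace` are made up for illustration and do not appear in the thread.

```sh
#!/bin/sh
# Hypothetical trace file; any path writable by the cluster user would do.
TRACE_FILE="${TRACE_FILE:-/tmp/stonith-helper.trace}"

enable_trace() {
    # `set -x` writes each executed command to stderr, so point stderr
    # at the trace file before switching tracing on.
    exec 2>>"$TRACE_FILE"
    set -x
}

# In a plugin script, enable_trace would be called near the top,
# before any other work is done.
```

Appending (`>>`) rather than truncating keeps traces from successive start and monitor calls in one file, at the cost of having to clear it by hand.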
> >>>>>>> >>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>> ----- Original Message -----
> >>>>>>> >>>>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>> >>>>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>> >>>>>>>>>>>>>>>>> Date: 2015/3/17, Tue 12:31
> >>>>>>> >>>>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
> >>>>>>> >>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>> Yamauchi-san,
> >>>>>>> >>>>>>>>>>>>>>>>> cc: Matsushima-san
> >>>>>>> >>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>> Hello, this is Fukuda.
> >>>>>>> >>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>> xen0 was in the same directory.
> >>>>>>> >>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>> # pwd
> >>>>>>> >>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external
> >>>>>>> >>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>> # ls
> >>>>>>> >>>>>>>>>>>>>>>>> drac5          ibmrsa         kdumpcheck  riloe           vmware
> >>>>>>> >>>>>>>>>>>>>>>>> dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
> >>>>>>> >>>>>>>>>>>>>>>>> hetzner        ipmi           nut         stonith-helper  xen0-ha
> >>>>>>> >>>>>>>>>>>>>>>>> hmchttp        ippower9258    rackpdu     vcenter
> >>>>>>> >>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>> Thank you for your help.
> >>>>>>> >>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>> That is all.
> >>>>>>> >>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>> 2015-03-17 10:53 GMT+09:00 <renayama19661014@ybb.ne.jp>:
> >>>>>>> >>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>> Fukuda-san,
> >>>>>>> >>>>>>>>>>>>>>>>>> cc: Matsushima-san
> >>>>>>> >>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>> Thank you for your work. This is Yamauchi.
> >>>>>>> >>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> There was no standard output or standard error output.
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
> >>>>>>> >>>>>>>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
> >>>>>>> >>>>>>>>>>>>>>>>>>> stonith-helper is located here:
> >>>>>>> >>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >>>>>>> >>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>> Is xen0 also in this directory?
> >>>>>>> >>>>>>>>>>>>>>>>>> If it is not, that is a problem, so please try copying the stonith-helper file, with its attributes unchanged, into the same directory as xen0.
> >>>>>>> >>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>> If it works after that, it means the pm_extras installation has a problem.
> >>>>>>> >>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>> That is all.
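Copying a plugin "with its attributes unchanged" matters because a plugin that loses its execute bit cannot be run. A minimal sketch, assuming `cp -p` is an acceptable way to preserve mode and timestamps; `copy_plugin` is an illustrative name and the paths in the usage comment are placeholders, not locations from the thread.

```sh
#!/bin/sh
# Copy a plugin into a target directory, keeping mode and timestamps (cp -p),
# then confirm that the copy is executable. Returns non-zero on any failure.
copy_plugin() {
    src=$1
    dest_dir=$2
    cp -p "$src" "$dest_dir/" || return 1
    [ -x "$dest_dir/$(basename "$src")" ]
}

# Usage (placeholder paths):
#   copy_plugin /path/to/stonith-helper /path/to/plugins/external
```

The final test only checks the execute bit for the current user; ownership is preserved by `cp -p` only when run with sufficient privileges.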
> >>>>>>> >>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>> ----- Original Message -----
> >>>>>>> >>>>>>>>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
> >>>>>>> >>>>>>>>>>>>>>>>>>> To: 山内英生 <renayama19661014@ybb.ne.jp>; "linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp>
> >>>>>>> >>>>>>>>>>>>>>>>>>> Date: 2015/3/17, Tue 10:31
> >>>>>>> >>>>>>>>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] スプリットブレイン時のSTONITHエラーについて
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> Yamauchi-san,
> >>>>>>> >>>>>>>>>>>>>>>>>>> cc: Matsushima-san
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> Good morning, this is Fukuda.
> >>>>>>> >>>>>>>>>>>>>>>>>>> Thank you for the crm example.
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> I adapted it to my environment right away.
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> $ cat test.crm
> >>>>>>> >>>>>>>>>>>>>>>>>>> ### Cluster Option ###
> >>>>>>> >>>>>>>>>>>>>>>>>>> property \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     no-quorum-policy="ignore" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     stonith-enabled="true" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     startup-fencing="false" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     stonith-timeout="710s" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     crmd-transition-delay="2s"
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> ### Resource Default ###
> >>>>>>> >>>>>>>>>>>>>>>>>>> rsc_defaults \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     resource-stickiness="INFINITY" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     migration-threshold="1"
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> ### Group Configuration ###
> >>>>>>> >>>>>>>>>>>>>>>>>>> group HAvarnish \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     vip_208 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     varnishd
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> group grpStonith1 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     Stonith1-1 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     Stonith1-2
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> group grpStonith2 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     Stonith2-1 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     Stonith2-2
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> ### Clone Configuration ###
> >>>>>>> >>>>>>>>>>>>>>>>>>> clone clone_ping \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     ping
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> ### Fencing Topology ###
> >>>>>>> >>>>>>>>>>>>>>>>>>> fencing_topology \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     lbv1.beta.com: Stonith1-1 Stonith1-2 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     lbv2.beta.com: Stonith2-1 Stonith2-2
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> ### Primitive Configuration ###
> >>>>>>> >>>>>>>>>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     params \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         ip="192.168.17.208" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         nic="eth0" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         cidr_netmask="24" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="5s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> primitive varnishd lsb:varnish \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     params \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         name="default_ping_set" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         host_list="192.168.17.254" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         multiplier="100" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         dampen="1" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="90s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="100s" on-fail="fence"
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     params \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.132 10.0.17.132" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     params \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     params \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.17.133 10.0.17.133" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> primitive Stonith2-2 stonith:external/xen0 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     params \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>         dom0="xen0.beta.com" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> ### Resource Location ###
> >>>>>>> >>>>>>>>>>>>>>>>>>> location HA_location-1 HAvarnish \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     rule 200: #uname eq lbv1.beta.com \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     rule 100: #uname eq lbv2.beta.com
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> location HA_location-2 HAvarnish \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> location HA_location-3 grpStonith1 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv1.beta.com
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> location HA_location-4 grpStonith2 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq lbv2.beta.com
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> After loading this configuration, the messages differ from yesterday's.
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> The ping messages are gone.
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> # crm_mon -rfA
> >>>>>>> >>>>>>>>>>>>>>>>>>> Last updated: Tue Mar 17 10:21:28 2015
> >>>>>>> >>>>>>>>>>>>>>>>>>> Last change: Tue Mar 17 10:21:09 2015
> >>>>>>> >>>>>>>>>>>>>>>>>>> Stack: heartbeat
> >>>>>>> >>>>>>>>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
> >>>>>>> >>>>>>>>>>>>>>>>>>> Version: 1.1.12-561c4cf
> >>>>>>> >>>>>>>>>>>>>>>>>>> 2 Nodes configured
> >>>>>>> >>>>>>>>>>>>>>>>>>> 8 Resources configured
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> Full list of resources:
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>  Resource Group: HAvarnish
> >>>>>>> >>>>>>>>>>>>>>>>>>>      vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
> >>>>>>> >>>>>>>>>>>>>>>>>>>      varnishd   (lsb:varnish):  Started lbv1.beta.com
> >>>>>>> >>>>>>>>>>>>>>>>>>>  Resource Group: grpStonith1
> >>>>>>> >>>>>>>>>>>>>>>>>>>      Stonith1-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>> >>>>>>>>>>>>>>>>>>>      Stonith1-2 (stonith:external/xen0):        Stopped
> >>>>>>> >>>>>>>>>>>>>>>>>>>  Resource Group: grpStonith2
> >>>>>>> >>>>>>>>>>>>>>>>>>>      Stonith2-1 (stonith:external/stonith-helper):      Stopped
> >>>>>>> >>>>>>>>>>>>>>>>>>>      Stonith2-2 (stonith:external/xen0):        Stopped
> >>>>>>> >>>>>>>>>>>>>>>>>>>  Clone Set: clone_ping [ping]
> >>>>>>> >>>>>>>>>>>>>>>>>>>      Started: [ lbv1.beta.com lbv2.beta.com ]
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> Node Attributes:
> >>>>>>> >>>>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>> >>>>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
> >>>>>>> >>>>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>> >>>>>>>>>>>>>>>>>>>     + default_ping_set                  : 100
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> Migration summary:
> >>>>>>> >>>>>>>>>>>>>>>>>>> * Node lbv2.beta.com:
> >>>>>>> >>>>>>>>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
> >>>>>>> >>>>>>>>>>>>>>>>>>> * Node lbv1.beta.com:
> >>>>>>> >>>>>>>>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> Failed actions:
> >>>>>>> >>>>>>>>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
> >>>>>>> >>>>>>>>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
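In the `crm_mon -rfA` output above, the failed start shows up in the migration summary as `fail-count=1000000`. A small sketch for pulling the resource name and fail-count out of such text, assuming the line layout shown in this thread; `list_fail_counts` is an illustrative name, not a Pacemaker tool.

```sh
#!/bin/sh
# Read crm_mon text on stdin and print "<resource> <fail-count>" for every
# migration-summary line, i.e. every line carrying a fail-count= field.
list_fail_counts() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^fail-count=/) {
                name = $1
                sub(/:$/, "", name)            # drop the colon after the resource name
                count = $i
                sub(/^fail-count=/, "", count) # keep only the value
                print name, count
            }
        }
    }'
}

# Usage on a live cluster: crm_mon -1 -rfA | list_fail_counts
```

Because it keys only on the `fail-count=` field, the filter ignores the node header lines and the rest of the status output.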
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> This is the log from /var/log/ha-debug.
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
> >>>>>>> >>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
> >>>>>>> >>>>>>>>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> There was no standard output or standard error output.
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> Could something be wrong with stonith-helper?
> >>>>>>> >>>>>>>>>>>>>>>>>>> Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
> >>>>>>> >>>>>>>>>>>>>>>>>>> stonith-helper is located here:
> >>>>>>> >>>>>>>>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> Thank you for your help.
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> That is all.
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> 2015-03-17 9:45 GMT+09:00 <renayama19661014@ybb.ne.jp>:
> >>>>>>> >>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> Fukuda-san,
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> Good morning. This is Yamauchi.
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> Just in case, I am sending an excerpt of an example I have on hand that uses multiple stonith devices.
> >>>>>>> >>>>>>>>>>>>>>>>>>>> (When you actually use it, please watch out for the line breaks.)
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> The example below is a configuration for the Pacemaker 1.1 series:
> >>>>>>> >>>>>>>>>>>>>>>>>>>> for nodea, stonith is executed in the order prmStonith1-1, then prmStonith1-2;
> >>>>>>> >>>>>>>>>>>>>>>>>>>> for nodeb, stonith is executed in the order prmStonith2-1, then prmStonith2-2.
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> The stonith plugins themselves are helper and ssh.
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> (snip)
> >>>>>>> >>>>>>>>>>>>>>>>>>>> ### Group Configuration ###
> >>>>>>> >>>>>>>>>>>>>>>>>>>> group grpStonith1 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     prmStonith1-1 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     prmStonith1-2
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> group grpStonith2 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     prmStonith2-1 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     prmStonith2-2
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> ### Fencing Topology ###
> >>>>>>> >>>>>>>>>>>>>>>>>>>> fencing_topology \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     nodea: prmStonith1-1 prmStonith1-2 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     nodeb: prmStonith2-1 prmStonith2-2
> >>>>>>> >>>>>>>>>>>>>>>>>>>> (snip)
> >>>>>>> >>>>>>>>>>>>>>>>>>>> primitive prmStonith1-1 stonith:external/stonith-helper \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     params \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         hostlist="nodea" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.28.60 192.168.28.70" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> primitive prmStonith1-2 stonith:external/ssh \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     params \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         hostlist="nodea" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> primitive prmStonith2-1 stonith:external/stonith-helper \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     params \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_retries="1" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="40s" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         hostlist="nodeb" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         dead_check_target="192.168.28.61 192.168.28.71" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         run_online_check="yes" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> primitive prmStonith2-2 stonith:external/ssh \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     params \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         pcmk_reboot_timeout="60s" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>         hostlist="nodeb" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     op stop interval="0s" timeout="60s" on-fail="ignore"
> >>>>>>> >>>>>>>>>>>>>>>>>>>> (snip)
> >>>>>>> >>>>>>>>>>>>>>>>>>>> location rsc_location-grpStonith1-2 grpStonith1 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodea
> >>>>>>> >>>>>>>>>>>>>>>>>>>> location rsc_location-grpStonith2-3 grpStonith2 \
> >>>>>>> >>>>>>>>>>>>>>>>>>>>     rule -INFINITY: #uname eq nodeb
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>> That is all.
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>>> --
> >>>>>>> >>>>>>>>>>>>>>>>>>> ELF Systems
> >>>>>>> >>>>>>>>>>>>>>>>>>> Masamichi Fukuda
> >>>>>>> >>>>>>>>>>>>>>>>>>> mail to: masamichi_fukuda@elf-systems.com
> >>>>>>> >>>>>>>>>>>>>>>>>>
> >>>>>>> >>>>>>>>>>>>>>>>>> _______________________________________________
> >>>>>>> >>>>>>>>>>>>>>>>>> Linux-ha-japan mailing list
> >>>>>>> >>>>>>>>>>>>>>>>>> Linux-ha-japan@lists.sourceforge.jp
> >>>>>>> >>>>>>>>>>>>>>>>>> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> --
> >>>>>>> >>
> >>>>>>> >> ELF Systems
> >>>>>>> >> Masamichi Fukuda
> >>>>>>> >> mail to: masamichi_fukuda@elf-systems.com
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >
> >>>>>>> > _______________________________________________
> >>>>>>> > Linux-ha-japan mailing list
> >>>>>>> > Linux-ha-japan@lists.sourceforge.jp
> >>>>>>> > http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>>>>> >
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> Linux-ha-japan mailing list
> >>>>>>> Linux-ha-japan@lists.sourceforge.jp
> >>>>>>> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>--
> >>>>>>ELF Systems
> >>>>>>Masamichi Fukuda
> >>>>>>mail to: masamichi_fukuda@elf-systems.com
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>_______________________________________________
> >>>>>Linux-ha-japan mailing list
> >>>>>Linux-ha-japan@lists.sourceforge.jp
> >>>>>http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>>>
> >>>>
> >>>>
> >>>>--
> >>>>
> >>>>ELF Systems
> >>>>Masamichi Fukuda
> >>>>mail to: masamichi_fukuda@elf-systems.com
> >>>>
> >>>>
> >>>
> >>>_______________________________________________
> >>>Linux-ha-japan mailing list
> >>>Linux-ha-japan@lists.sourceforge.jp
> >>>http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
> >>>
> >>
> >>
> >>--
> >>
> >>ELF Systems
> >>Masamichi Fukuda
> >>mail to: masamichi_fukuda@elf-systems.com
> >
> >
> >--
> >
> >ELF Systems
> >Masamichi Fukuda
> >mail to: masamichi_fukuda@elf-systems.com
> >
> >
>
> _______________________________________________
> Linux-ha-japan mailing list
> Linux-ha-japan@lists.sourceforge.jp
> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
>



--
ELF Systems
Masamichi Fukuda
mail to: *masamichi_fukuda@elf-systems.com <elfsystems.com@gmail.com>*
Re: STONITH errors during split-brain [ In reply to ]
Yamauchi-san,

Hello, this is Fukuda.

I checked with crm_mon, and it looks like Pacemaker 1.1.12 build e32080b is the version that is running.

# crm_mon -1
Last updated: Tue Mar 24 14:12:39 2015
Last change: Tue Mar 24 11:37:06 2015
Stack: heartbeat
Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
Version: 1.1.12-e32080b
2 Nodes configured
10 Resources configured


Online: [ lbv1.beta.com lbv2.beta.com ]


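On the VirtualBox node where I installed Pacemaker 1.1.12 build e32080b,
crm configure could not load the crm file.

This machine is a test host on Xen, and here crm configure can load the crm file.
I installed build 561c4cf first and then installed build e32080b.
Could some files have been left behind from the earlier build?

Thank you for your help.

That is all.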

On March 20, 2015 at 20:35, Masamichi Fukuda - elf-systems <
masamichi_fukuda@elf-systems.com> wrote:

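> Yamauchi-san,
>
> Hello, this is Fukuda.
>
> > When you do this, before switching builds it is better to:
> > 1) run make uninstall in the source directory of the version you were using
> > 2) delete the contents of the /var/lib/pacemaker/cib and
> >    /var/lib/pacemaker/pengine directories
>
> I see. I will be careful next time.
>
> Sorry for the trouble, and thank you for your help.
>
> That is all.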
>
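> On March 20, 2015 at 17:16, <renayama19661014@ybb.ne.jp> wrote:
>
>> Fukuda-san,
>>
>>
>> Hello, this is Yamauchi.
>>
>>
>> >In my environment, after reverting from PM 1.1.12 build e32080b to build 561c4cf several times, the system started rebooting repeatedly.
>>
>> When you do this, before switching builds it is better to:
>> 1) run make uninstall in the source directory of the version you were using
>> 2) delete the contents of the /var/lib/pacemaker/cib and /var/lib/pacemaker/pengine directories
>>
>> >So I did a clean install of Debian 7.8 again and installed PM 1.1.12 build 561c4cf.
>> >Also, after setting the path you pointed out, I was able to confirm here as well that stonith-helper starts.
>>
>>
>> I see. That is good news.
>> That said, it is still a problem if build e32080b does not work.
>>
>> If I can find time over the weekend, I will try it here too.
>> I will let you know if there is any progress.
>>
>> That is all.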
>>
>>
>>
>>
>
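The cleanup quoted above (run make uninstall in the old source tree, then empty /var/lib/pacemaker/cib and /var/lib/pacemaker/pengine) could be scripted roughly as follows. This is only a sketch under assumptions not stated in the thread: Pacemaker is stopped on the node, the script runs as root, and the function name clear_pacemaker_state and the source path in the usage comment are made up for illustration. It relies on the -mindepth and -delete options of find, which GNU find on Debian provides.

```shell
#!/bin/sh
# Sketch: empty Pacemaker's CIB and policy-engine state directories
# before installing a different build.
#
# Hypothetical usage (as root, cluster stopped on this node):
#   (cd /path/to/pacemaker-source && make uninstall)   # path is a placeholder
#   clear_pacemaker_state
#
# The optional argument overrides the state root so the function can be
# tried on a scratch directory first; the default is the path named in
# the thread.
clear_pacemaker_state() {
    state_root="${1:-/var/lib/pacemaker}"
    for sub in cib pengine; do
        dir="$state_root/$sub"
        # Skip subdirectories that do not exist on this host.
        [ -d "$dir" ] || continue
        # Remove everything inside the directory but keep the directory.
        find "$dir" -mindepth 1 -delete
    done
}
```

Keeping the directories themselves, and deleting only their contents, matches the wording of the advice and avoids having to recreate them with the right owner and permissions.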
--
ELF Systems
Masamichi Fukuda
mail to: *masamichi_fukuda@elf-systems.com <elfsystems.com@gmail.com>*
