Yamauchi-san,
Hello, this is Fukuda.
As a first step, I upgraded pacemaker.
The version is Version: 1.1.10+git20130802-4.1.
To begin with, I upgraded only node 1 and checked its state.
With pacemaker stopped on node 2, which has not been upgraded,
crm_mon shows the following.
# crm_mon -rfA -1
Could not establish cib_ro connection: Connection refused (111)
Connection to cluster failed: Transport endpoint is not connected
After I start pacemaker on node 2, crm_mon shows a pending state.
Node lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6): pending
Online: [ lbv2.beta.com ]
After a few minutes in this state, the upgraded node 1 starts rebooting repeatedly.
Even after deleting the crm configuration and starting from a completely clean state, it goes down in the same way.
The following message is output at reboot.
2015 Mar 5 17:25:39 lbv1 [1854]: EMERG: Rebooting system. Reason:
/usr/lib/heartbeat/crmd
The log is as follows.
Mar 05 16:45:30 [3019] lbv1.beta.com crmd: info:
crm_timer_popped: Wait Timer (I_NULL) just popped (2000ms)
Mar 05 16:45:30 [3019] lbv1.beta.com crmd: info:
lrmd_ipc_connect: Connecting to lrmd
Mar 05 16:45:30 [3019] lbv1.beta.com crmd: info: crm_ipc_connect:
Could not establish lrmd connection: Connection refused (111)
Mar 05 16:45:30 [3019] lbv1.beta.com crmd: warning: do_lrm_control:
Failed to sign on to the LRM 29 (30 max) times
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info:
crm_timer_popped: Wait Timer (I_NULL) just popped (2000ms)
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info:
lrmd_ipc_connect: Connecting to lrmd
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info: crm_ipc_connect:
Could not establish lrmd connection: Connection refused (111)
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: error: do_lrm_control:
Failed to sign on to the LRM 30 (max) times
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info:
register_fsa_error_adv: Resetting the current action list
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: error: do_log: FSA:
Input I_ERROR from do_lrm_control() received in state S_STARTING
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: notice:
do_state_transition: State transition S_STARTING -> S_RECOVERY [
input=I_ERROR cause=C_FSA_INTERNAL origin=do_lrm_control ]
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: warning: do_recover:
Fast-tracking shutdown in response to errors
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info: do_ccm_control:
CCM connection established... waiting for first callback
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: error: do_started:
Start cancelled... S_RECOVERY
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: error: do_log: FSA:
Input I_TERMINATE from do_recover() received in state S_RECOVERY
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info:
do_state_transition: State transition S_RECOVERY -> S_TERMINATE [
input=I_TERMINATE cause=C_FSA_INTERNAL origin=do_recover ]
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info: do_shutdown: All
subsystems stopped, continuing
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info: do_lrm_control:
Disconnecting from the LRM
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info:
lrmd_api_disconnect: Disconnecting from lrmd service
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info:
lrmd_api_disconnect: Disconnecting from lrmd service
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: notice: do_lrm_control:
Disconnected from the LRM
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info:
crm_cluster_disconnect: Disconnecting from cluster infrastructure:
heartbeat
Mar 05 16:45:32 lbv1.beta.com ccm: [3014]: info: client (pid=3019) removed
from ccm
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info:
crm_cluster_disconnect: Disconnected from heartbeat
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info: do_ha_control:
Disconnected from the cluster
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info: do_cib_control:
Disconnecting CIB
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info:
crmd_cib_connection_destroy: Connection to the CIB terminated...
Mar 05 16:45:32 [3020] lbv1.beta.com pengine: info:
crm_signal_dispatch: Invoking handler for signal 15: Terminated
Mar 05 16:45:32 [3020] lbv1.beta.com pengine: info:
qb_ipcs_us_withdraw: withdrawing server sockets
Mar 05 16:45:32 [3020] lbv1.beta.com pengine: info: crm_xml_cleanup:
Cleaning up memory from libxml2
Mar 05 16:45:32 [3015] lbv1.beta.com cib: info:
crm_client_destroy: Destroying 0 events
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info: stop_subsystem:
Sent -TERM to pengine: [3020]
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info: do_exit:
Performing A_EXIT_0 - gracefully exiting the CRMd
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info: do_exit:
[crmd] stopped (0)
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info: crmd_exit:
Dropping I_TERMINATE: [ state=S_TERMINATE cause=C_FSA_INTERNAL
origin=do_stop ]
Mar 05 16:45:32 lbv1.beta.com heartbeat: [1862]: WARN: Managed
/usr/lib/heartbeat/crmd process 3019 killed by signal 11 [SIGSEGV -
Segmentation violation].
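
For anyone reading a crmd log like the one above, a small script can pull out the two facts that matter here: how many times crmd tried to sign on to the LRM, and which state transitions followed. This is only a minimal sketch and not part of pacemaker. The function name summarize and both regular expressions are assumptions written against the message formats visible in this log.

```python
import re

# Patterns written against the messages shown in the log above (assumed formats).
SIGN_ON = re.compile(r"Failed to sign on to the LRM (\d+) \((?:(\d+) )?max\) times")
TRANSITION = re.compile(r"State transition (\S+) -> (\S+)")


def summarize(log_text):
    """Return (highest sign-on attempt seen, list of (from, to) state transitions)."""
    # The mail client wrapped long records, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    attempts = [int(m.group(1)) for m in SIGN_ON.finditer(flat)]
    transitions = TRANSITION.findall(flat)
    return (max(attempts) if attempts else 0), transitions


sample = """\
Mar 05 16:45:30 [3019] lbv1.beta.com crmd: warning: do_lrm_control:
Failed to sign on to the LRM 29 (30 max) times
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: error: do_lrm_control:
Failed to sign on to the LRM 30 (max) times
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: notice:
do_state_transition: State transition S_STARTING -> S_RECOVERY [
input=I_ERROR cause=C_FSA_INTERNAL origin=do_lrm_control ]
Mar 05 16:45:32 [3019] lbv1.beta.com crmd: info:
do_state_transition: State transition S_RECOVERY -> S_TERMINATE [
input=I_TERMINATE cause=C_FSA_INTERNAL origin=do_recover ]
"""

worst, transitions = summarize(sample)
print(worst, transitions)
```

On the excerpt above this shows that crmd used up all 30 sign-on attempts and then moved from S_STARTING through S_RECOVERY to S_TERMINATE, which matches the lrmd "Connection refused" lines.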
If you can tell where the cause might lie, please let me know.
Thank you in advance.
That is all.
On March 4, 2015 at 13:26, Masamichi Fukuda - elf-systems <
masamichi_fukuda@elf-systems.com> wrote:
> Yamauchi-san,
>
> Hello, this is Fukuda.
>
> Thank you for the information at the URL below.
> I will take a look.
>
> I would also like to look into the corosync option.
>
> Thank you in advance.
>
> That is all.
>
> On March 4, 2015 at 13:18, <renayama19661014@ybb.ne.jp> wrote:
>
> Fukuda-san,
>>
>> Hello, this is Yamauchi.
>>
>> I am afraid I am not familiar with Debian, but the following page also seems to be available.
>>
>> https://packages.qa.debian.org/p/pacemaker.html
>>
>>
>> It is also linked from the official pacemaker site.
>> This one appears to be 1.1.10.
>>
>> If you are thinking of upgrading within the 1.1 series, please also consider combining it with corosync.
>>
>> That is all.
>>
>>
>> ----- Original Message -----
>> >From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>
>> >To: "renayama19661014@ybb.ne.jp" <renayama19661014@ybb.ne.jp>
>> >Cc: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>; "
>> linux-ha-japan@lists.sourceforge.jp" <linux-ha-japan@lists.sourceforge.jp
>> >
>> >Date: 2015/3/4, Wed 12:35
>> >Subject: Re: STONITH errors during split brain
>> >
>> >
>> >Yamauchi-san,
>> >
>> >
>> >Hello, this is Fukuda.
>> >Thank you for verifying this so quickly.
>> >
>> >
>> >We will discuss internally whether to upgrade pacemaker or to write our own plugin.
>> >
>> >
>> >Sorry to ask one more question.
>> >In our current setup, pacemaker was installed from the Debian packages.
>> >
>> >
>> >If we upgrade pacemaker, will we need to install it from source?
>> >We are going to use Debian 7.8 as the OS.
>> >
>> >
>> >Thank you in advance.
>> >
>> >That is all.
>> >
>> >
>> >
>> >
>> >On Wednesday, March 4, 2015, <renayama19661014@ybb.ne.jp> wrote:
>> >
>> >
>> >>Fukuda-san,
>> >>
>> >>
>> >>Hello, this is Yamauchi.
>> >>
>> >>My environment is different (corosync + PM 1.1.7), but I checked the failure behavior with stonith-helper and ssh,
>> >>and the first device, stonith-helper, is likewise executed in a loop.
>> >>In 1.1.7 there is no parameter to control this.
>> >>
>> >>The fix for this went in around Pacemaker 1.1.9; in 1.1.7 this behavior results in a loop.
>> >>
>> >>From around 1.1.9, the number of executions can be specified with parameters such as pcmk_reboot_retries
>> >>in order to control this loop.
>> >>
>> >>To get a normal failover (stonith execution followed by reboot), it seems you will need either to upgrade pacemaker or
>>
>> >>to write your own stonith plugin that combines the three stonith devices (helper, ssh (xen0 in your case), and meatware).
>> >>
>> >>
>> >># That is, the custom plugin would run all of the plugins and return the result.
>> >>
>> >>That is all.
>> >>
>> >>----- Original Message -----
>> >>>From: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com
>> >
>> >>>To: 山内英生 <renayama19661014@ybb.ne.jp>
>> >>>Cc: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>;
>> "linux-ha-japan@lists.sourceforge.jp" <
>> linux-ha-japan@lists.sourceforge.jp>
>> >>>Date: 2015/3/4, Wed 11:09
>> >>>Subject: Re: [Linux-ha-jp] STONITH errors during split brain
>> >>>
>> >>>
>> >>>Yamauchi-san,
>> >>>
>> >>>Thank you for your help. This is Fukuda.
>> >>>Thank you for checking.
>> >>>
>> >>>>So this means that xen0 by itself works without problems.
>> >>>
>> >>>I am glad to hear that xen0 works correctly.
>> >>>
>> >>>>I believe this is the log from when the command is not run by hand; apparently
>> >>>>the problem is that nothing after external/stonith-helper is executed when going through pacemaker.
>> >>>>
>> >>>>Because the parameters differ from those of newer pacemaker versions, my configuration will not be identical,
>> >>>>but I will check what happens with a combination of stonith-helper, ssh, and so on.
>> >>>
>> >>>Sorry for the trouble, and thank you in advance.
>> >>>
>> >>>That is all.
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>On March 4, 2015 at 10:41, <renayama19661014@ybb.ne.jp> wrote:
>> >>>
>> >>>Fukuda-san,
>> >>>>
>> >>>>Good morning. This is Yamauchi.
>> >>>>
>> >>>>>I tried the STONITH command right away.
>> >>>>>When I ran the command below on the active side (lbv1), the standby node (lbv2) was rebooted.
>> >>>>>
>> >>>>># stonith -t external/xen0 hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg"
>> dom0="dom0.xxxx.com" reset_method="reboot" -T reset lbv2.beta.com
>> >>>>
>> >>>>So this means that xen0 by itself works without problems.
>> >>>>
>> >>>>>As for the state afterwards, messages keep appearing on the rebooted lbv2.
>> >>>>>
>> >>>>>2015 Mar 4 09:56:56 lbv2 [3387]: CRIT: external_reset_req:
>> 'stonith-helper reset' for host lbv1.beta.com failed with rc 1
>> >>>>>2015 Mar 4 09:57:11 lbv2 [3508]: CRIT: external_reset_req:
>> 'stonith-helper reset' for host lbv1.beta.com failed with rc 1
>> >>>>>2015 Mar 4 09:57:26 lbv2 [3629]: CRIT: external_reset_req:
>> 'stonith-helper reset' for host lbv1.beta.com failed with rc 1
>> >>>>>
>> >>>>
>>
>> >>>>I believe this is the log from when the command is not run by hand; apparently the problem is that nothing after external/stonith-helper is executed when going through pacemaker.
>> >>>>
>>
>> >>>>Because the parameters differ from those of newer pacemaker versions, my configuration will not be identical, but I will check what happens with a combination of stonith-helper, ssh, and so on.
>> >>>>
>> >>>>
>> >>>>That is all.
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>----- Original Message -----
>> >>>>>From: Masamichi Fukuda - elf-systems <
>> masamichi_fukuda@elf-systems.com>
>> >>>>>To: 山内英生 <renayama19661014@ybb.ne.jp>
>> >>>>>Cc: Masamichi Fukuda - elf-systems <masamichi_fukuda@elf-systems.com>;
>> "linux-ha-japan@lists.sourceforge.jp" <
>> linux-ha-japan@lists.sourceforge.jp>
>> >>>>
>> >>>>>Date: 2015/3/4, Wed 10:16
>> >>>>>Subject: Re: [Linux-ha-jp] STONITH errors during split brain
>> >>>>>
>> >>>>>
>> >>>>>Yamauchi-san,
>> >>>>>
>> >>>>>Good morning, this is Fukuda.
>> >>>>>Thank you for getting in touch.
>> >>>>>
>> >>>>>I tried the STONITH command right away.
>> >>>>>When I ran the command below on the active side (lbv1), the standby node (lbv2) was rebooted.
>> >>>>>
>> >>>>># stonith -t external/xen0 hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg"
>> dom0="dom0.xxxx.com" reset_method="reboot" -T reset lbv2.beta.com
>> >>>>>
>> >>>>>As for the state afterwards, messages keep appearing on the rebooted lbv2.
>> >>>>>
>> >>>>>2015 Mar 4 09:56:56 lbv2 [3387]: CRIT: external_reset_req:
>> 'stonith-helper reset' for host lbv1.beta.com failed with rc 1
>> >>>>>2015 Mar 4 09:57:11 lbv2 [3508]: CRIT: external_reset_req:
>> 'stonith-helper reset' for host lbv1.beta.com failed with rc 1
>> >>>>>2015 Mar 4 09:57:26 lbv2 [3629]: CRIT: external_reset_req:
>> 'stonith-helper reset' for host lbv1.beta.com failed with rc 1
>> >>>>>
>> >>>>>
>> >>>>>Log on the active (lbv1) side:
>> >>>>>Mar 04 09:54:40 lbv1.beta.com info: Node lbv2.beta.com is member.
>> >>>>>Mar 04 09:55:56 lbv1.beta.com info: Set DC node to lbv1.beta.com.
>> >>>>>Mar 04 09:58:15 lbv1.beta.com info: Start "pengine" process.
>> >>>>>Mar 04 09:58:19 lbv1.beta.com info: Set DC node to lbv1.beta.com.
>> >>>>>
>> >>>>>Log on the standby (lbv2) side:
>> >>>>>Mar 04 09:54:32 lbv2.beta.com info: Starting Heartbeat 3.0.5.
>> >>>>>Mar 04 09:54:32 lbv2.beta.com info: Link lbv1.beta.com:eth1 is up.
>> >>>>>Mar 04 09:54:39 lbv2.beta.com info: Start "crmd" process. (pid=2938)
>> >>>>>Mar 04 09:54:39 lbv2.beta.com info: Start "cib" process. (pid=2934)
>> >>>>>Mar 04 09:54:39 lbv2.beta.com info: Start "lrmd" process. (pid=2935)
>> >>>>>Mar 04 09:54:39 lbv2.beta.com info: Start "attrd" process.
>> (pid=2937)
>> >>>>>Mar 04 09:54:39 lbv2.beta.com info: Start "ccm" process. (pid=2933)
>> >>>>>Mar 04 09:54:39 lbv2.beta.com info: Start "stonithd" process.
>> (pid=2936)
>> >>>>>Mar 04 09:54:39 lbv2.beta.com info: Start "ipfail" process.
>> (pid=2932)
>> >>>>>Mar 04 09:54:39 lbv2.beta.com WARN: Managed "ipfail" process
>> exited. (pid=2932, rc=100)
>> >>>>>Mar 04 09:56:15 lbv2.beta.com info: Start "pengine" process.
>> >>>>>Mar 04 09:56:19 lbv2.beta.com info: Set DC node to lbv2.beta.com.
>> >>>>>Mar 04 09:56:23 lbv2.beta.com ERROR: Start to fail-over.
>> >>>>>Mar 04 09:56:25 lbv2.beta.com info: Resource Stonith1-1 started.
>> (rc=0)
>> >>>>>Mar 04 09:56:26 lbv2.beta.com info: Resource Stonith1-2 started.
>> (rc=0)
>> >>>>>Mar 04 09:56:26 lbv2.beta.com info: Resource Stonith1-3 started.
>> (rc=0)
>> >>>>>
>> >>>>>In crm_mon, both nodes show lbv2 in the pending state.
>> >>>>>
>> >>>>>crm_mon on the active (lbv1) side (excerpt):
>> >>>>>Node lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624): pending
>> >>>>>Online: [ lbv1.beta.com ]
>> >>>>>
>> >>>>>crm_mon on the standby (lbv2) side (excerpt):
>> >>>>>Node lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624): pending
>> >>>>>Online: [ lbv1.beta.com ]
>> >>>>>
>> >>>>>Thank you in advance.
>> >>>>>
>> >>>>>That is all.
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>On March 4, 2015 at 9:05, <renayama19661014@ybb.ne.jp> wrote:
>> >>>>>
>> >>>>>Fukuda-san,
>> >>>>>>
>> >>>>>>Good morning. This is Yamauchi.
>> >>>>>>
>> >>>>>>One thing first: here is the stonith command I would like you to try.
>> >>>>>>
>> >>>>>>Since you say xen0 may not be working, I suggest running xen0 on its own, using the following as a reference.
>> >>>>>>
>> >>>>>>* Example stonith command (the example uses libvirt)
>> >>>>>>stonith -t external/libvirt hostlist="xx01" hypervisor_uri="xxxxx"
>> reset_method="reboot" -T reset ap01
>> >>>>>>
>> >>>>>>I expect this works on PM 1.1.7 as well. The command line takes the following form:
>> >>>>>>
>> >>>>>> stonith -t <stonith plugin to run> <parameter 1> ... <parameter N> -T <action> <host to fence>
>> >>>>>>
>> >>>>>>Even with xen0 on its own, you can use this command from the host that performs stonith to act on the peer host (the one assumed to have failed).
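
The command-line form described above can be written out as a small helper. This is a minimal Python sketch. The function name build_stonith_argv is hypothetical, and the plugin name and parameter values are the placeholders from the libvirt example in this message. It only builds the argument list and does not run anything, because actually invoking stonith would fence a node.

```python
import shlex


def build_stonith_argv(plugin, params, action, target):
    """Assemble: stonith -t <plugin> <name=value>... -T <action> <target>."""
    argv = ["stonith", "-t", plugin]
    # Plugin parameters are passed as name=value words, in the order given.
    argv += [f"{name}={value}" for name, value in params.items()]
    argv += ["-T", action, target]
    return argv


argv = build_stonith_argv(
    "external/libvirt",
    {"hostlist": "xx01", "hypervisor_uri": "xxxxx", "reset_method": "reboot"},
    "reset",
    "ap01",
)
# shlex.join (Python 3.8+) quotes words only where the shell would need it.
print(shlex.join(argv))
```

Keeping the arguments as a list avoids shell-quoting mistakes when a parameter value contains spaces, as a hostlist with several entries might.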
>> >>>>>>
>> >>>>>>First, please check that xen0 works.
>> >>>>>>
>> >>>>>>That is all.
>> >>>>>>
>> >>>>>>
>> >>>>>>----- Original Message -----
>> >>>>>>>From: Masamichi Fukuda - elf-systems <
>> masamichi_fukuda@elf-systems.com>
>> >>>>>>>To: 山内英生 <renayama19661014@ybb.ne.jp>
>> >>>>>>>Cc: Masamichi Fukuda - elf-systems <
>> masamichi_fukuda@elf-systems.com>; "linux-ha-japan@lists.sourceforge.jp"
>> <linux-ha-japan@lists.sourceforge.jp>
>> >>>>>>
>> >>>>>>>Date: 2015/3/3, Tue 10:43
>> >>>>>>>Subject: Re: [Linux-ha-jp] STONITH errors during split brain
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>Yamauchi-san,
>> >>>>>>>
>> >>>>>>>Thank you for your help. This is Fukuda.
>> >>>>>>>
>> >>>>>>>Sorry to trouble you when you are busy, and thank you in advance.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>On March 3, 2015 at 9:27, <renayama19661014@ybb.ne.jp> wrote:
>> >>>>>>>
>> >>>>>>>Fukuda-san,
>> >>>>>>>>
>> >>>>>>>>Hello, this is Yamauchi.
>> >>>>>>>>
>> >>>>>>>>I do not remember the details offhand, so I will get back to you tomorrow or so, but...
>> >>>>>>>>
>> >>>>>>>>A stonith module can be run on its own with the stonith command,
>> >>>>>>>>so it would be a good idea to try running xen0 that way with its parameters specified.
>> >>>>>>>>
>> >>>>>>>>I will also check the contents of the configuration file you sent, tomorrow or so,
>> >>>>>>>>and get back to you.
>> >>>>>>>>
>> >>>>>>>>That is all.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>----- Original Message -----
>> >>>>>>>>>From: Masamichi Fukuda - elf-systems <
>> masamichi_fukuda@elf-systems.com>
>> >>>>>>>>
>> >>>>>>>>>To: 山内英生 <renayama19661014@ybb.ne.jp>
>> >>>>>>>>>Cc: Masamichi Fukuda - elf-systems <
>> masamichi_fukuda@elf-systems.com>; "linux-ha-japan@lists.sourceforge.jp"
>> <linux-ha-japan@lists.sourceforge.jp>
>> >>>>>>>>>Date: 2015/3/2, Mon 12:10
>> >>>>>>>>>Subject: Re: [Linux-ha-jp] STONITH errors during split brain
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>Yamauchi-san,
>> >>>>>>>>>
>> >>>>>>>>>Hello, this is Fukuda.
>> >>>>>>>>>
>> >>>>>>>>>As before, I brought down the interconnect LAN interface,
>> >>>>>>>>>but the next stonith module (xen0) still does not appear to be executed.
>> >>>>>>>>>
>> >>>>>>>>>When I bring down the service LAN interface, the resources fail over to node 2.
>> >>>>>>>>>
>> >>>>>>>>>The crm configuration file is as follows.
>> >>>>>>>>>
>> >>>>>>>>>### Cluster Option ###
>> >>>>>>>>>property \
>> >>>>>>>>> no-quorum-policy="ignore" \
>> >>>>>>>>> stonith-enabled="true" \
>> >>>>>>>>> startup-fencing="false" \
>> >>>>>>>>> stonith-timeout="710s" \
>> >>>>>>>>> crmd-transition-delay="2s"
>> >>>>>>>>>
>> >>>>>>>>>### Resource Default ###
>> >>>>>>>>>rsc_defaults \
>> >>>>>>>>> resource-stickiness="INFINITY" \
>> >>>>>>>>> migration-threshold="1"
>> >>>>>>>>>
>> >>>>>>>>>### Group Configuration ###
>> >>>>>>>>>group HAvarnish \
>> >>>>>>>>> vip_208 \
>> >>>>>>>>> varnishd
>> >>>>>>>>>
>> >>>>>>>>>group grpStonith1 \
>> >>>>>>>>> Stonith1-1 \
>> >>>>>>>>> Stonith1-2 \
>> >>>>>>>>> Stonith1-3
>> >>>>>>>>>
>> >>>>>>>>>group grpStonith2 \
>> >>>>>>>>> Stonith2-1 \
>> >>>>>>>>> Stonith2-2 \
>> >>>>>>>>> Stonith2-3
>> >>>>>>>>>
>> >>>>>>>>>### Clone Configuration ###
>> >>>>>>>>>clone clone_ping \
>> >>>>>>>>> ping
>> >>>>>>>>>
>> >>>>>>>>>### Primitive Configuration ###
>> >>>>>>>>>primitive vip_208 ocf:heartbeat:IPaddr2 \
>> >>>>>>>>> params \
>> >>>>>>>>> ip="192.168.17.208" \
>> >>>>>>>>> nic="eth0" \
>> >>>>>>>>> cidr_netmask="24" \
>> >>>>>>>>> op start interval="0s" timeout="90s" on-fail="restart" \
>> >>>>>>>>> op monitor interval="5s" timeout="60s" on-fail="restart" \
>> >>>>>>>>> op stop interval="0s" timeout="100s" on-fail="fence"
>> >>>>>>>>>
>> >>>>>>>>>primitive varnishd lsb:varnish \
>> >>>>>>>>> op start interval="0s" timeout="90s" on-fail="restart" \
>> >>>>>>>>> op monitor interval="10s" timeout="60s" on-fail="restart" \
>> >>>>>>>>> op stop interval="0s" timeout="100s" on-fail="fence"
>> >>>>>>>>>
>> >>>>>>>>>primitive ping ocf:pacemaker:ping \
>> >>>>>>>>> params \
>> >>>>>>>>> name="default_ping_set" \
>> >>>>>>>>> host_list="192.168.17.254" \
>> >>>>>>>>> multiplier="100" \
>> >>>>>>>>> dampen="1" \
>> >>>>>>>>> op start interval="0s" timeout="90s" on-fail="restart" \
>> >>>>>>>>> op monitor interval="10s" timeout="60s" on-fail="restart" \
>> >>>>>>>>> op stop interval="0s" timeout="100s" on-fail="fence"
>> >>>>>>>>>
>> >>>>>>>>>primitive Stonith1-1 stonith:external/stonith-helper \
>> >>>>>>>>> params \
>> >>>>>>>>> priority="1" \
>> >>>>>>>>> stonith-timeout="40" \
>> >>>>>>>>> hostlist="lbv1.beta.com" \
>> >>>>>>>>> dead_check_target="192.168.17.132 10.0.17.132" \
>> >>>>>>>>> standby_wait_time="10" \
>> >>>>>>>>> standby_check_command="/usr/sbin/crm_resource -r
>> varnishd -W | grep -q `hostname`" \
>> >>>>>>>>> op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>> stonith-timeout="300" \
>> >>>>>>>>> hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
>> >>>>>>>>> dom0="dom0.xxxx.com" \
>> >>>>>>>>> op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>> op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> >>>>>>>>> op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>>>>>>>
>> >>>>>>>>>primitive Stonith1-3 stonith:meatware \
>> >>>>>>>>> params \
>> >>>>>>>>> priority="3" \
>> >>>>>>>>> stonith-timeout="600" \
>> >>>>>>>>> hostlist="lbv1.beta.com" \
>> >>>>>>>>> op start interval="0s" timeout="60s" \
>> >>>>>>>>> op monitor interval="3600s" timeout="60s" \
>> >>>>>>>>> op stop interval="0s" timeout="60s"
>> >>>>>>>>>
>> >>>>>>>>>primitive Stonith2-1 stonith:external/stonith-helper \
>> >>>>>>>>> params \
>> >>>>>>>>> priority="1" \
>> >>>>>>>>> stonith-timeout="40" \
>> >>>>>>>>> hostlist="lbv2.beta.com" \
>> >>>>>>>>> dead_check_target="192.168.17.133 10.0.17.133" \
>> >>>>>>>>> standby_wait_time="10" \
>> >>>>>>>>> standby_check_command="/usr/sbin/crm_resource -r
>> varnishd -W | grep -q `hostname`" \
>> >>>>>>>>> op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>> op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> >>>>>>>>> op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>>>>>>>
>> >>>>>>>>>primitive Stonith2-2 stonith:external/xen0 \
>> >>>>>>>>> params \
>> >>>>>>>>> priority="2" \
>> >>>>>>>>> stonith-timeout="300" \
>> >>>>>>>>> hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
>> >>>>>>>>> dom0="dom0.xxxx.com" \
>> >>>>>>>>> op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>> op monitor interval="3600s" timeout="60s" on-fail="restart" \
>> >>>>>>>>> op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>>>>>>>
>> >>>>>>>>>primitive Stonith2-3 stonith:meatware \
>> >>>>>>>>> params \
>> >>>>>>>>> priority="3" \
>> >>>>>>>>> stonith-timeout="600" \
>> >>>>>>>>> hostlist="lbv2.beta.com" \
>> >>>>>>>>> op start interval="0s" timeout="60s" \
>> >>>>>>>>> op monitor interval="3600s" timeout="60s" \
>> >>>>>>>>> op stop interval="0s" timeout="60s"
>> >>>>>>>>>
>> >>>>>>>>>### Resource Location ###
>> >>>>>>>>>location HA_location-1 HAvarnish \
>> >>>>>>>>> rule 200: #uname eq lbv1.beta.com \
>> >>>>>>>>> rule 100: #uname eq lbv2.beta.com
>> >>>>>>>>>
>> >>>>>>>>>location HA_location-2 HAvarnish \
>> >>>>>>>>> rule -INFINITY: not_defined default_ping_set or
>> default_ping_set lt 100
>> >>>>>>>>>
>> >>>>>>>>>location HA_location-3 grpStonith1 \
>> >>>>>>>>> rule -INFINITY: #uname eq lbv1.beta.com
>> >>>>>>>>>
>> >>>>>>>>>location HA_location-4 grpStonith2 \
>> >>>>>>>>> rule -INFINITY: #uname eq lbv2.beta.com
>> >>>>>>>>>
>> >>>>>>>>>From the DomUs (lbv1 and lbv2), root can log in to Dom0 over ssh without a password.
>> >>>>>>>>>
>> >>>>>>>>>Is anything missing from the xen0 parameters?
>> >>>>>>>>>
>> >>>>>>>>>Thank you in advance.
>> >>>>>>>>>
>> >>>>>>>>>That is all.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>On March 1, 2015 at 16:54, <renayama19661014@ybb.ne.jp> wrote:
>> >>>>>>>>>
>> >>>>>>>>>Fukuda-san,
>> >>>>>>>>>>
>> >>>>>>>>>>Hello, this is Yamauchi.
>> >>>>>>>>>>
>> >>>>>>>>>>The flow itself is normal.
>> >>>>>>>>>>However, the stonith module that follows helper (xen0) does not appear to be executed, and that is a problem.
>> >>>>>>>>>>
>> >>>>>>>>>>That said, as I wrote earlier, the question is how fencing_topology is handled in your pacemaker version.
>> >>>>>>>>>># I am not certain whether it can be used in the 1.1.7 you are running.
>> >>>>>>>>>>
>> >>>>>>>>>>In addition, stonith modules let you set things such as the retry count and timeouts through parameters,
>> >>>>>>>>>>so it may be worth reviewing those as well.
>> >>>>>>>>>>
>> >>>>>>>>>># Without fencing_topology, around 1.1.12 you should not be able to control the stonith execution order either...
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>First please try it, and it may also be better if you could show the whole crm file, as far as you are able to disclose it.
>> >>>>>>>>>>
>> >>>>>>>>>>Also, if possible, it may be better to consider using a version around 1.1.12.
>> >>>>>>>>>>
>> >>>>>>>>>># Sorry, for personal reasons I may be slow to respond to mail until around Wednesday.
>> >>>>>>>>>>
>> >>>>>>>>>>That is all.
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>----- Original Message -----
>> >>>>>>>>>>>From: Masamichi Fukuda - elf-systems <
>> masamichi_fukuda@elf-systems.com>
>> >>>>>>>>>>
>> >>>>>>>>>>>To: renayama19661014@ybb.ne.jp;
>> linux-ha-japan@lists.sourceforge.jp
>> >>>>>>>>>>>Date: 2015/3/1, Sun 12:09
>> >>>>>>>>>>>Subject: Re: [Linux-ha-jp] STONITH errors during split brain
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>Yamauchi-san,
>> >>>>>>>>>>>
>> >>>>>>>>>>>This is Fukuda.
>> >>>>>>>>>>>Thank you for your answer.
>> >>>>>>>>>>>
>> >>>>>>>>>>>So the current state is normal.
>> >>>>>>>>>>>Then tomorrow I will try cutting the service network.
>> >>>>>>>>>>>
>> >>>>>>>>>>>> I think you should review the fencing_topology setting in the crm configuration file.
>> >>>>>>>>>>>
>> >>>>>>>>>>>I had not yet added a fencing_topology setting.
>> >>>>>>>>>>>Will it not work correctly without it?
>> >>>>>>>>>>>
>> >>>>>>>>>>>Thank you in advance.
>> >>>>>>>>>>>
>> >>>>>>>>>>>That is all.
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>On February 28, 2015 at 7:41, <renayama19661014@ybb.ne.jp> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>Fukuda-san,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>Good morning. This is Yamauchi.
>> >>>>>>>>>>>>
>>
>> >>>>>>>>>>>>I believe the interconnect (10.0.17.X) is down while the service network (192.168.17.X) is still up,
>> >>>>>>>>>>>>so stonith-helper should be returning 1 and failing (a correct detection).
>> >>>>>>>>>>>>After stonith-helper fails, execution should then continue with xen0 and meatware, in that order...
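
The escalation described above (stonith-helper fails, then xen0 runs, then meatware) can be modeled in a few lines. This is a toy Python sketch of the intended order only, not Pacemaker's implementation. The function fence and the lambda return codes are assumptions for illustration: stonith-helper returns 1 as in the message, and xen0 is assumed to succeed.

```python
def fence(devices, target):
    """Try devices in ascending priority; stop at the first one that returns 0."""
    tried = []
    for name, _priority, run in sorted(devices, key=lambda d: d[1]):
        rc = run(target)
        tried.append((name, rc))
        if rc == 0:
            # A device reported success, so lower-priority devices are skipped.
            return True, tried
    return False, tried


# Stand-in devices: (name, priority, callable returning an exit code).
devices = [
    ("meatware", 3, lambda host: 0),
    ("stonith-helper", 1, lambda host: 1),  # service network still up, so rc 1
    ("xen0", 2, lambda host: 0),            # hypothetical success
]

ok, tried = fence(devices, "lbv2.beta.com")
print(ok, tried)
```

In the logs posted in this thread, only the first step is seen repeating, which is the difference from this intended order.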
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>I think you should review the fencing_topology setting in the crm configuration file.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>It is possible that fencing_topology could not be used around pacemaker 1.1.7...
>> >>>>>>>>>>>>
>>
>> >>>>>>>>>>>>The fencing_topology handling received quite a few fixes up to pacemaker 1.1.12 before it worked properly,
>> >>>>>>>>>>>>so a pacemaker upgrade may also be necessary.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>That is all.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>----- Original Message -----
>> >>>>>>>>>>>>>From: Masamichi Fukuda - elf-systems <
>> masamichi_fukuda@elf-systems.com>
>> >>>>>>>>>>>>>To: linux-ha-japan@lists.sourceforge.jp
>> >>>>>>>>>>>>>Date: 2015/2/27, Fri 21:04
>> >>>>>>>>>>>>>Subject: [Linux-ha-jp] STONITH errors during split brain
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>Hello, my name is Fukuda.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>I am building and testing a two-node cluster on Debian Xen.
>> >>>>>>>>>>>>>I would like to ask about an error that occurs when using stonith on Xen.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>Environment:
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>Dom0: debian7.7, Xen 4.1.4-3+deb7u3
>> >>>>>>>>>>>>>DomU: debian7.8, pacemaker 1.1.7-1, heartbeat 1:3.0.5-3
>> >>>>>>>>>>>>>Both cluster nodes are built on the same Dom0.
>> >>>>>>>>>>>>>pacemaker and heartbeat are installed from the Debian packages.
>> >>>>>>>>>>>>>The stonith-helper, xen0, and meatware plugins are used.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>I bring down the interconnect LAN interface on node 1 (active)
>> >>>>>>>>>>>>>to cause a split brain and have STONITH carried out.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>crm_mon on both nodes shows the other node as unclean, as below.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>Node 1 side
>> >>>>>>>>>>>>>Node lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624): UNCLEAN (offline)
>> >>>>>>>>>>>>>Online: [ lbv1.beta.com ]
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>Node 2 side
>> >>>>>>>>>>>>>Node lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6): UNCLEAN (offline)
>> >>>>>>>>>>>>>Online: [ lbv2.beta.com ]
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>However, the following error messages are output.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>Node 1 side
>> >>>>>>>>>>>>>lbv1 [12657]: CRIT: external_reset_req: 'stonith-helper
>> reset' for host lbv2.beta.com failed with rc 1
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>Node 2 side
>> >>>>>>>>>>>>>lbv2 [22225]: CRIT: external_reset_req: 'stonith-helper
>> reset' for host lbv1.beta.com failed with rc 1
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>Question
>> >>>>>>>>>>>>>In this state, is STONITH not working, and are my stonith-helper parameters wrong?
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>The parameters are set as follows.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>primitive Stonith1-1 stonith:external/stonith-helper \
>> >>>>>>>>>>>>> params \
>> >>>>>>>>>>>>> priority="1" \
>> >>>>>>>>>>>>> stonith-timeout="40" \
>> >>>>>>>>>>>>> hostlist="lbv1.beta.com" \
>> >>>>>>>>>>>>> dead_check_target="192.168.17.132 10.0.17.132" \
>> >>>>>>>>>>>>> standby_wait_time="10" \
>> >>>>>>>>>>>>> standby_check_command="/usr/sbin/crm_resource -r
>> varnishd -W | grep -q `hostname`" \
>> >>>>>>>>>>>>> op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>> op monitor interval="3600s" timeout="60s"
>> on-fail="restart" \
>> >>>>>>>>>>>>> op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>primitive Stonith2-1 stonith:external/stonith-helper \
>> >>>>>>>>>>>>> params \
>> >>>>>>>>>>>>> priority="1" \
>> >>>>>>>>>>>>> stonith-timeout="40" \
>> >>>>>>>>>>>>> hostlist="lbv2.beta.com" \
>> >>>>>>>>>>>>> dead_check_target="192.168.17.133 10.0.17.133" \
>> >>>>>>>>>>>>> standby_wait_time="10" \
>> >>>>>>>>>>>>> standby_check_command="/usr/sbin/crm_resource -r
>> varnishd -W | grep -q `hostname`" \
>> >>>>>>>>>>>>> op start interval="0s" timeout="60s" on-fail="restart" \
>> >>>>>>>>>>>>> op monitor interval="3600s" timeout="60s"
>> on-fail="restart" \
>> >>>>>>>>>>>>> op stop interval="0s" timeout="60s" on-fail="ignore"
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>192.168.17.0 is the subnet used for service traffic and 10.0.17.0 is the subnet used for the interconnect.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>The log is as follows.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>Feb 27 19:29:04 lbv1.beta.com stonith: [18566]: CRIT:
>> external_reset_req
>> >>>>>>>>>>>>>: 'stonith-helper reset' for host lbv2.beta.com failed with
>> rc 1
>> >>>>>>>>>>>>>Feb 27 19:29:04 lbv1.beta.com stonith-ng: [2815]: ERROR:
>> log_operation:
>> >>>>>>>>>>>>>Operation 'reboot' [18565] (call 0 from
>> d2acf6a5-ef8d-4249-aaab-25a8686d6647) fo
>> >>>>>>>>>>>>>r host 'lbv2.beta.com' with device 'Stonith2-1' returned: -2
>> >>>>>>>>>>>>>Feb 27 19:29:04 lbv1.beta.com stonith-ng: [2815]: ERROR:
>> log_operation:
>> >>>>>>>>>>>>>Stonith2-1: Performing: stonith -t external/stonith-helper
>> -T reset lbv2.
>> >>>>>>>>>>>>>-beta.com
>> >>>>>>>>>>>>>Feb 27 19:29:04 lbv1.beta.com stonith-ng: [2815]: ERROR:
>> log_operation:
>> >>>>>>>>>>>>>Stonith2-1: failed: lbv2.beta.com 5
>> >>>>>>>>>>>>>Feb 27 19:29:05 lbv1.beta.com stonith-ng: [2815]: info:
>> call_remote_ston
>> >>>>>>>>>>>>>ith: Requesting that lbv1.beta.com perform op reboot
>> lbv2.beta.c
>> >>>>>>>>>>>>>om
>> >>>>>>>>>>>>>Feb 27 19:29:05 lbv1.beta.com stonith-ng: [2815]: info:
>> can_fence_host_w
>> >>>>>>>>>>>>>ith_device: Stonith2-1 can fence lbv2.beta.com: dynamic-list
>> >>>>>>>>>>>>>Feb 27 19:29:05 lbv1.beta.com stonith-ng: [2815]: info:
>> can_fence_host_w
>> >>>>>>>>>>>>>ith_device: Stonith2-2 can fence lbv2.beta.com: dynamic-list
>> >>>>>>>>>>>>>Feb 27 19:29:05 lbv1.beta.com stonith-ng: [2815]: info:
>> can_fence_host_w
>> >>>>>>>>>>>>>ith_device: Stonith2-3 can fence lbv2.beta.com: dynamic-list
>> >>>>>>>>>>>>>Feb 27 19:29:05 lbv1.beta.com stonith-ng: [2815]: info:
>> stonith_fence: F
>> >>>>>>>>>>>>>ound 3 matching devices for 'lbv2.beta.com'
>> >>>>>>>>>>>>>Feb 27 19:29:05 lbv1.beta.com stonith-ng: [2815]: info:
>> stonith_command:
>> >>>>>>>>>>>>> Processed st_fence from lbv1.beta.com: rc=-1
>> >>>>>>>>>>>>>Feb 27 19:29:08 lbv1.beta.com crm_resource: [18790]: info:
>> Invoked: /usr
>> >>>>>>>>>>>>>/sbin/crm_resource -r varnishd -W
>> >>>>>>>>>>>>>Feb 27 19:29:09 lbv1.beta.com stonith: [18706]: CRIT:
>> external_reset_req
>> >>>>>>>>>>>>>: 'stonith-helper reset' for host lbv2.beta.com failed with
>> rc 1
>> >>>>>>>>>>>>>Feb 27 19:29:09 lbv1.beta.com stonith-ng: [2815]: ERROR:
>> log_operation:
>> >>>>>>>>>>>>>Operation 'reboot' [18705] (call 0 from
>> d2acf6a5-ef8d-4249-aaab-25a8686d6647) fo
>> >>>>>>>>>>>>>r host 'lbv2.beta.com' with device 'Stonith2-1' returned: -2
>> >>>>>>>>>>>>>Feb 27 19:29:09 lbv1.beta.com stonith-ng: [2815]: ERROR:
>> log_operation:
>> >>>>>>>>>>>>>Stonith2-1: Performing: stonith -t external/stonith-helper
>> -T reset lbv2.
>> >>>>>>>>>>>>>-beta.com
>> >>>>>>>>>>>>>Feb 27 19:29:09 lbv1.beta.com stonith-ng: [2815]: ERROR:
>> log_operation:
>> >>>>>>>>>>>>>Stonith2-1: failed: lbv2.beta.com 5
>> >>>>>>>>>>>>>Feb 27 19:29:10 lbv1.beta.com stonith-ng: [2815]: info:
>> call_remote_ston
>> >>>>>>>>>>>>>ith: Requesting that lbv1.beta.com perform op reboot
>> lbv2.beta.c
>> >>>>>>>>>>>>>om
>> >>>>>>>>>>>>>Feb 27 19:29:10 lbv1.beta.com stonith-ng: [2815]: info:
>> can_fence_host_w
>> >>>>>>>>>>>>>ith_device: Stonith2-1 can fence lbv2.beta.com: dynamic-list
>> >>>>>>>>>>>>>Feb 27 19:29:10 lbv1.beta.com stonith-ng: [2815]: info:
>> can_fence_host_w
>> >>>>>>>>>>>>>ith_device: Stonith2-2 can fence lbv2.beta.com: dynamic-list
>> >>>>>>>>>>>>>Feb 27 19:29:10 lbv1.beta.com stonith-ng: [2815]: info:
>> can_fence_host_w
>> >>>>>>>>>>>>>ith_device: Stonith2-3 can fence lbv2.beta.com: dynamic-list
>> >>>>>>>>>>>>>Feb 27 19:29:10 lbv1.beta.com stonith-ng: [2815]: info:
>> stonith_fence: F
>> >>>>>>>>>>>>>ound 3 matching devices for 'lbv2.beta.com'
>> >>>>>>>>>>>>>Feb 27 19:29:10 lbv1.beta.com stonith-ng: [2815]: info:
>> stonith_command:
>> >>>>>>>>>>>>> Processed st_fence from lbv1.beta.com: rc=-1
>> >>>>>>>>>>>>>Feb 27 19:29:13 lbv1.beta.com crm_resource: [18953]: info:
>> Invoked: /usr
>> >>>>>>>>>>>>>/sbin/crm_resource -r varnishd -W
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>Thank you in advance.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>--
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>ELF Systems
>> >>>>>>>>>>>>>Masamichi Fukuda
>> >>>>>>>>>>>>>mail to: masamichi_fukuda@elf-systems.com
>> >>>>>>>>>>>>>_______________________________________________
>> >>>>>>>>>>>>>Linux-ha-japan mailing list
>> >>>>>>>>>>>>>Linux-ha-japan@lists.sourceforge.jp
>> >>>>>>>>>>>>>http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
--
ELF Systems
Masamichi Fukuda
mail to: *masamichi_fukuda@elf-systems.com <elfsystems.com@gmail.com>*