Mailing List Archive: Re: [Linux-HA] Mysql RA issue: Heartbeat/Pacemaker stops switching Master/Slave after killing mysql processes of Master many times (3 times)

On Fri, Jan 18, 2013 at 9:06 PM, Thai Nguyen <nqthai@tma.com.vn> wrote:
> Hello all,
>
> I am running Heartbeat/Pacemaker with MySql Master/Slave Replication on my
> servers.
>
> I am facing an issue involving the MySQL RA, as follows:
>
>
>
> Steps to reproduce:
>
> Step 1: Kill mysql processes of Master.
>
> Step 2: Wait until Heartbeat/Pacemaker switches Master/Slave.
>
> Step 3: Repeat step 1 and step 2 two more times.
>
> Step 4: Observe Master/Slave status.
>
>
>
> Expected result: Heartbeat/Pacemaker switches Master/Slave successfully.
>
> Actual result: Heartbeat/Pacemaker stops switching Master/Slave.
>
>
>
> After killing the Master the 2nd time, I checked the new Master's log (ha-log);
> the message "MySQL monitor succeeded (master)" did not show up in the log. Then I
> killed the mysql processes of the new Master (3rd time), and
> Heartbeat/Pacemaker stopped switching Master/Slave. To work around this issue,
> I need to restart Heartbeat.

You could have also just run "crm resource cleanup ms_MySQL" to clear
out the failures.
If that doesn't work, some logs would make it easier to comment.
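
For reference, a minimal sketch of how the failures could be inspected and
cleared from the shell. It assumes crmsh and Pacemaker's crm_mon are installed.
The resource and node names (ms_MySQL, p_mysql, ares, apollo) are taken from the
configuration quoted below; the helper names run and clear_failures and the
DRY_RUN variable are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Sketch only: show fail counts, then clean up the master/slave resource.
# DRY_RUN, run and clear_failures are illustrative names, not part of crmsh.

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = master/slave resource, $2 = primitive, remaining args = node names
clear_failures() {
    ms_rsc=$1
    prim=$2
    shift 2
    run crm_mon -1 -f                       # one-shot status including fail counts
    for node in "$@"; do
        run crm resource failcount "$prim" show "$node"
    done
    run crm resource cleanup "$ms_rsc"      # clear failed operations and fail counts
}
```

Calling "clear_failures ms_MySQL p_mysql ares apollo" only prints the commands;
setting DRY_RUN=0 runs them against the live cluster.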

>
>
>
> And this is my Pacemaker config:
>
> node $id="fabe2f8e-9ba2-4f85-a644-fa16fe492830" ares \
>     attributes apollo-log-file-p_mysql="mysql-bin.000067" apollo-log-pos-p_mysql="107"
> node $id="fd5a954a-aadc-450e-9dda-ca2c18e980c2" apollo
> primitive MailTo ocf:heartbeat:MailTo \
>     params email="nqthai@gmail.com"
> primitive p_mysql ocf:heartbeat:mysql \
>     params config="/etc/mysql/my.cnf" pid="/var/run/mysqld/mysqld.pid" socket="/var/run/mysqld/mysqld.sock" binary="/usr/bin/mysqld_safe" replication_user="root" replication_passwd="nec" test_user="root" test_passwd="nec" max_slave_lag="10" evict_outdated_slaves="false" \
>     op monitor interval="1s" role="Master" timeout="120s" \
>     op monitor interval="3s" timeout="120s" \
>     op start interval="0" role="Stopped" timeout="120s" on-fail="restart" \
>     op stop interval="0" timeout="120s" \
>     meta is-managed="true"
> primitive virtualIP ocf:heartbeat:IPaddr \
>     params ip="192.168.103.223" cidr_netmask="255.255.255.0" \
>     op monitor interval="1s" \
>     meta is-managed="true"
> ms ms_MySQL p_mysql \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false" target-role="Master" is-managed="true"
> colocation mysql_co_ip inf: virtualIP ms_MySQL:Master
> order my_MySQL_promote_before_vip inf: ms_MySQL:promote virtualIP:start
> property $id="cib-bootstrap-options" \
>     dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>     cluster-infrastructure="Heartbeat" \
>     stonith-enabled="false" \
>     default-action-timeout="30" \
>     cluster-recheck-interval="30s" \
>     no-quorum-policy="ignore"
> property $id="mysql_replication" \
>     p_mysql_REPL_INFO="ares|mysql-bin.000034|107"
> rsc_defaults $id="rsc-options" \
>     resource-stickiness="1" \
>     migration-threshold="1" \
>     failure-timeout="15s"
>
>
>
> Best regards,
>
> Thai Nguyen
>
>
>
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems