Hi all,
I ran into a problem where a Master/Slave resource colocated
with a ping resource cannot fail over after an HDD crash.
After the HDD crash, the stop operation of the ping resource loops,
and so does the notify operation of the Master/Slave resource.
I configured "op stop on-fail=ignore" on the ping resource, but
the ping resource returns OCF_ERR_INSTALLED(5) and that error is not ignored.
How can I configure the resource so that an operation error is ignored
even when OCF_ERR_INSTALLED is returned?
(Or so that the node is fenced only when OCF_ERR_INSTALLED is returned?)
Of course, on-fail=fence makes fail-over possible.
However, I do not want the node to be fenced when OCF_ERR_GENERIC(1) is
returned by the stop operation of the ping resource.
By the way, this feature was introduced in Pacemaker-1.1.11:
https://github.com/ClusterLabs/pacemaker/commit/767213e4e47e122d3ae89c06bc7b5b670aa26f4d
It seems that retrying the operation is pointless in this case,
because it will fail every time.
If we treat a missing agent (rc-code="5" op-status="5")
as a hard error, then fencing is a better response than retrying.
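
To make the request concrete, here is a rough sketch in Python of the
policy I would like for a failed stop of the ping resource. This is not
Pacemaker code: the function name and the action strings are made up for
illustration, and treating every other non-zero code like
OCF_ERR_GENERIC is my assumption. Only the two OCF return code values
come from the OCF specification.

```python
# OCF return codes referenced in this thread.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_INSTALLED = 5


def desired_stop_failure_action(rc):
    """Hypothetical policy for the result of a stop operation.

    Returns one of "none", "ignore" or "fence" (illustrative labels,
    not values read from any Pacemaker API).
    """
    if rc == OCF_SUCCESS:
        # The stop succeeded, so nothing needs to be handled.
        return "none"
    if rc == OCF_ERR_INSTALLED:
        # The agent cannot run at all (e.g. the disk is gone), so a
        # retry cannot succeed; fence the node so fail-over can proceed.
        return "fence"
    # Assumption: any other failure, including OCF_ERR_GENERIC, is
    # handled as on-fail=ignore would handle it.
    return "ignore"


if __name__ == "__main__":
    for rc in (OCF_SUCCESS, OCF_ERR_GENERIC, OCF_ERR_INSTALLED):
        print(rc, desired_stop_failure_action(rc))
```

In other words, I would like on-fail=ignore to apply to
OCF_ERR_GENERIC while OCF_ERR_INSTALLED escalates to fencing.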
Best regards,
Kazutomo NAKAHIRA
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org