Slight bending of OCF specs: Re: Issues found in Apache resource agent

Sep 4, 2012, 6:20 PM

Hi Dejan,

If the resource is not running correctly, it needs to be
restarted. My memory says that OCF_ERR_GENERIC will not cause that
behavior. I believe the spec says you should exit with OCF_NOT_RUNNING
if it is not functioning correctly (but I didn't check, and my memory
isn't that clear on this point).
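
To make the distinction concrete, here is a minimal sketch of returning
an explicit OCF code instead of passing grep's exit status through. The
helper name check_body and its argument order are hypothetical; the
numeric values are the standard OCF exit codes. Which code is right on a
mismatch is the open question, so the sketch takes it as a parameter.

```shell
#!/bin/sh
# Standard OCF exit codes.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical helper: test a page body against a regex and return an
# explicit OCF code rather than grep's raw exit status.
#   $1 = page body
#   $2 = extended regex (matched case-insensitively)
#   $3 = OCF code to return when there is no match
#        (defaults to OCF_ERR_GENERIC)
check_body() {
    if printf '%s\n' "$1" | grep -Ei -e "$2" >/dev/null; then
        return $OCF_SUCCESS
    fi
    # Any non-zero grep status (no match, or a grep error such as a
    # bad regex) is mapped to the code the caller chose.
    return "${3:-$OCF_ERR_GENERIC}"
}
```

A caller would write something like
check_body "$page" "$TESTREGEX" "$OCF_NOT_RUNNING", so the choice of
code is visible at the call site instead of being an accident of grep.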

I will likely write a monitor-only resource agent for web servers. What
would you think about calling it from the other web resource agents?

This resource agent will not look at any config files, and will require
everything explicitly in parameters, and will not know how to start or
stop anything. This would be for my new monitoring project, of course
;-). But it could then be called by all the HTTP resource agents - or
used directly - for example by the Assimilation project.

This would be a slight but useful bending of the OCF resource agent API.
We could create some new metadata to document it, and also leave start
and stop out of the actions section of the metadata. Or just the latter.
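
A rough sketch of what such a monitor-only agent could look like, under
several assumptions: the agent name "webmon", the parameter names "url"
and "regex", all function names, and the use of curl as the HTTP client
are hypothetical. The failure code returned by monitor is a placeholder
only, since which code is correct is the point under discussion.

```shell
#!/bin/sh
# Standard OCF exit codes.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_ERR_UNIMPLEMENTED=3
OCF_ERR_CONFIGURED=6

# Metadata lists only monitor and meta-data: no start, no stop.
webmon_metadata() {
    cat <<'EOF'
<?xml version="1.0"?>
<resource-agent name="webmon">
  <version>0.1</version>
  <longdesc lang="en">Monitor-only check of a web server URL.</longdesc>
  <shortdesc lang="en">Monitor-only web check</shortdesc>
  <parameters>
    <parameter name="url" required="1">
      <longdesc lang="en">URL to fetch.</longdesc>
      <shortdesc lang="en">URL to fetch</shortdesc>
      <content type="string"/>
    </parameter>
    <parameter name="regex" required="1">
      <longdesc lang="en">Regex the page must match.</longdesc>
      <shortdesc lang="en">Expected regex</shortdesc>
      <content type="string"/>
    </parameter>
  </parameters>
  <actions>
    <action name="monitor" timeout="20s" interval="10s"/>
    <action name="meta-data" timeout="5s"/>
  </actions>
</resource-agent>
EOF
}

# Fetch a URL and print the body (assumes curl is installed).
webmon_fetch() {
    curl -fsS "$1"
}

# Everything comes from explicit parameters; no config files are read.
webmon_monitor() {
    if [ -z "${OCF_RESKEY_url:-}" ] || [ -z "${OCF_RESKEY_regex:-}" ]; then
        return $OCF_ERR_CONFIGURED
    fi
    if webmon_fetch "$OCF_RESKEY_url" |
        grep -Ei -e "$OCF_RESKEY_regex" >/dev/null; then
        return $OCF_SUCCESS
    fi
    return $OCF_ERR_GENERIC   # placeholder failure code
}

# Dispatch: anything other than monitor and meta-data is unimplemented.
webmon_main() {
    case "${1:-}" in
        meta-data) webmon_metadata; return $OCF_SUCCESS ;;
        monitor)   webmon_monitor ;;
        *)         return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}
```

A standalone script would end with webmon_main "$@"; exit $?, while
another HTTP resource agent could source the file and call
webmon_monitor from its own monitor action.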

What do you think?

On 08/29/2012 05:31 AM, Dejan Muhamedagic wrote:
> Hi Alan,
>
> On Mon, Aug 27, 2012 at 10:51:15AM -0600, Alan Robertson wrote:
>> Hi,
>>
>> I was recently using the Apache resource agent, and discovered a few
>> problems:
>>
>> The exit code from grep was used directly as an OCF exit code.
>> It is NOT an OCF exit code, and should not be directly used
>> in this way.
> I guess you mean the greps in monitor_apache_extended and
> monitor_apache_basic? These lines:
>
> 267 $whattorun "$test_url" | grep -Ei "$test_regex" > /dev/null
> 277 ${ourhttpclient}_func "$STATUSURL" | grep -Ei "$TESTREGEX" > /dev/null
>
>> This caused a "not running" error to become a generic error.
> These lines are invoked _only_ in case it was previously
> established that the apache server is running. So, they should
> return OCF_ERR_GENERIC if the test fails. grep exits with code 1
> which matches OCF_ERR_GENERIC. But indeed the OCF error code
> should be returned explicitly.
>
>> Pacemaker reacts very differently to the two kinds of errors.
>>
>> This code occurred in two places.
>>
>> The resource agent used OCF_CHECK_LEVEL improperly.
>>
>> The specification says that if you receive an OCF_CHECK_LEVEL which you
>> do not support, you are required to interpret it as the next lower
>> supported value for OCF_CHECK_LEVEL.
>>
>> In effect, there are no invalid OCF_CHECK_LEVEL values. The Apache
>> agent declared all values but one to be errors. This is not the correct
>> behavior.
> OK. That somehow slipped past me when I was reading the OCF standard.
>
> BTW, it'd be great if nginx shared some code with apache. The
> latter has already been split into three scripts.
>
> Cheers,
>
> Dejan
>
>> --
>> Alan Robertson <alanr@unix.sh> - @OSSAlanR
>>
>> "Openness is the foundation and preservative of friendship... Let me claim from you at all times your undisguised opinions." - William Wilberforce
>> _______________________________________________________
>> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>> Home Page: http://linux-ha.org/
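
The OCF_CHECK_LEVEL rule quoted above (an unsupported level is
interpreted as the next lower supported one) can be sketched roughly as
follows. The helper name effective_check_level is hypothetical, and
treating an empty or non-numeric request as level 0 is my assumption,
not something taken from the spec.

```shell
#!/bin/sh
# Hypothetical helper: print the highest supported check level that
# does not exceed the requested one.
#   $1      = requested OCF_CHECK_LEVEL
#   $2 ...  = supported levels, in ascending order
effective_check_level() {
    requested=$1
    shift
    # Assumption: an empty or non-numeric request is treated as 0.
    case "$requested" in
        ''|*[!0-9]*) requested=0 ;;
    esac
    chosen=$1   # the lowest supported level is the floor
    for level in "$@"; do
        if [ "$level" -le "$requested" ]; then
            chosen=$level
        fi
    done
    echo "$chosen"
}
```

An agent that supports levels 0 and 10 could then use
level=$(effective_check_level "${OCF_CHECK_LEVEL:-0}" 0 10), so a
request for level 5 runs the level 0 check and a request for level 20
runs the level 10 check, with no value rejected as an error.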

--
Alan Robertson <alanr@unix.sh> - @OSSAlanR

"Openness is the foundation and preservative of friendship... Let me claim from you at all times your undisguised opinions." - William Wilberforce