Hi, I've joined the list
Hi everybody,

some of you might already know me from "linux-ha", but I was advised to subscribe here.
I am developing my first OCF RA for SLES11 SP1, and along the way I found a lot of outdated, incompatible, and incomplete documentation, which made the job quite difficult.

First, there are two "RA-API-1" interface definitions, one with "verify-all", the other with "validate-all" (just to name one example). Also, the DTD lacks any serious comments (such as origin and release date, not to mention the semantics). Imagine if the SNMP MIBs consisted of just the ASN.1 without any comments. That is how I felt about the RA-API.

When testing my "saprouter" RA, I was using this script:
---
TESTDIR=~/src/scripts/OCF/testenv/saprouter
export LD_LIBRARY_PATH=$TESTDIR/exe
if [ "$1" = "manual" ]; then
    # Manual mode: run a single RA action directly, with shell tracing
    shift
    OCF_ROOT=/usr/lib/ocf OCF_RESOURCE_INSTANCE=saprouter \
    OCF_RESKEY_exe_dir="$TESTDIR/exe" \
    OCF_RESKEY_conf_dir="$TESTDIR/conf" \
    OCF_RESKEY_log_dir="$TESTDIR/log" \
    OCF_RESKEY_source_ip="127.0.0.1" \
    sh -x ./saprouter "$@"
    echo "Exit status is $?"
else
    # Default mode: let ocf-tester exercise the RA with the same parameters
    /usr/sbin/ocf-tester -n test \
        -o exe_dir="$TESTDIR/exe" \
        -o conf_dir="$TESTDIR/conf" \
        -o log_dir="$TESTDIR/log" \
        -o source_ip="127.0.0.1" \
        ./saprouter
fi
---

Since ocf-tester complained, I used "manual start", "manual status", and "manual stop" to debug the script. As it turned out, the application, not the RA, had the most serious problem: it would start with an invalid parameter, but could not be terminated with the same invalid parameter...
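
When debugging by hand like this, the bare number printed as the exit status is easier to read with its symbolic name next to it. The following is only a minimal sketch: the function name ocf_rc_name is made up for illustration and is not part of ocf-tester or the resource-agents shell functions; the numeric values are the usual OCF return codes.

```shell
#!/bin/sh
# Hypothetical helper (not part of any OCF tool): translate a numeric
# OCF exit status into its symbolic name for easier manual debugging.
ocf_rc_name() {
    case "$1" in
        0) echo "OCF_SUCCESS" ;;
        1) echo "OCF_ERR_GENERIC" ;;
        2) echo "OCF_ERR_ARGS" ;;
        3) echo "OCF_ERR_UNIMPLEMENTED" ;;
        4) echo "OCF_ERR_PERM" ;;
        5) echo "OCF_ERR_INSTALLED" ;;
        6) echo "OCF_ERR_CONFIGURED" ;;
        7) echo "OCF_NOT_RUNNING" ;;
        8) echo "OCF_RUNNING_MASTER" ;;
        9) echo "OCF_FAILED_MASTER" ;;
        *) echo "unknown ($1)" ;;
    esac
}

# Possible use in the "manual" branch of a test wrapper, right after
# the RA has been run:
#   sh -x ./saprouter "$@"
#   rc=$?
#   echo "Exit status is $rc ($(ocf_rc_name "$rc"))"
```

With something like this, a "manual status" run against a stopped resource would be expected to report 7 (OCF_NOT_RUNNING) rather than just a number to look up.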

Greetings from real life,
Ulrich


_______________________________________________
ha-wg-technical mailing list
ha-wg-technical@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/ha-wg-technical
Re: Hi, I've joined the list
On 07/06/2011 05:51 PM, Ulrich Windl wrote:
> Hi everybody,
>
> some of you might already know me from "linux-ha", but I was advised to subscribe here.
> I am developing my first OCF RA for SLES11 SP1, and along the way I found a lot of outdated, incompatible, and incomplete documentation, which made the job quite difficult.
>
> First, there are two "RA-API-1" interface definitions, one with "verify-all", the other with "validate-all" (just to name one example). Also, the DTD lacks any serious comments (such as origin and release date, not to mention the semantics). Imagine if the SNMP MIBs consisted of just the ASN.1 without any comments. That is how I felt about the RA-API.

Your criticism of the RA API DTD being poorly documented is entirely
valid, and it's also valid to criticize the spec for being outdated.

As an aside, though, I guess you will agree that it's a bit, how should
I say, brave to try to come up with implementation ideas for a resource
agent from a DTD.

Now, I trust you have read the OCF RA Developer's Guide and followed the
best practices outlined there. I must conclude from your frustration
that the Guide is inadequate. So as the author of said guide, I would
ask you to suggest improvements in any way you deem appropriate. Your
feedback is much appreciated.

> When testing my "saprouter" RA, I was using this script:
> ---
> TESTDIR=~/src/scripts/OCF/testenv/saprouter
> export LD_LIBRARY_PATH=$TESTDIR/exe
> if [ "$1" = "manual" ]; then
>     # Manual mode: run a single RA action directly, with shell tracing
>     shift
>     OCF_ROOT=/usr/lib/ocf OCF_RESOURCE_INSTANCE=saprouter \
>     OCF_RESKEY_exe_dir="$TESTDIR/exe" \
>     OCF_RESKEY_conf_dir="$TESTDIR/conf" \
>     OCF_RESKEY_log_dir="$TESTDIR/log" \
>     OCF_RESKEY_source_ip="127.0.0.1" \
>     sh -x ./saprouter "$@"
>     echo "Exit status is $?"
> else
>     # Default mode: let ocf-tester exercise the RA with the same parameters
>     /usr/sbin/ocf-tester -n test \
>         -o exe_dir="$TESTDIR/exe" \
>         -o conf_dir="$TESTDIR/conf" \
>         -o log_dir="$TESTDIR/log" \
>         -o source_ip="127.0.0.1" \
>         ./saprouter
> fi
> ---
>
> Since ocf-tester complained, I used "manual start", "manual status", and "manual stop" to debug the script. As it turned out, the application, not the RA, had the most serious problem: it would start with an invalid parameter, but could not be terminated with the same invalid parameter...

So, as you say yourself, there was application misbehavior involved.
I'm afraid I don't follow what you conclude from this with regard to
the OCF spec, your resource agent, or ocf-tester. Can you elaborate,
please?

Cheers,
Florian