Hi there,
After days chasing this bug, Luis Claudio and I found that some scripts,
and especially one small program, behave strangely when called by heartbeat.
When we got tired of seeing messages from heartbeat that "datadisk
something start" had failed, and even "httpd startup failed", we started
debugging some initscripts and drbd's datadisk script as well.
It turns out that anything that uses the "action" and "daemon" functions
defined in /etc/rc.d/init.d/functions ends up running a little program
called initlog, which runs a command and logs its output.
When run from heartbeat, this program works fine but seems to always
return 255 (or -1, if that suits you better), which causes the action or
daemon call to return that value. We use the function "action" in drbd's
datadisk, so it seems to always fail when run from heartbeat.
Can anybody confirm this, or even shed some light on it? I browsed the
sources of this initlog program, and there are lots of points where it
returns -1 when it finds an error. That can be interpreted as 255 if seen
as an unsigned char.
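To make the -1 versus 255 point concrete, here is a minimal sketch in plain sh. It assumes the usual Unix behaviour that only the low 8 bits of an exit value reach the parent; the helper name seen_by_parent is made up for this illustration and is not part of initlog or heartbeat.

```shell
#!/bin/sh
# Hypothetical helper (not from initlog): show what a parent process
# would see in $? when a child passes the given value to exit().
# Only the low 8 bits of the value survive.
seen_by_parent() {
    echo $(( $1 & 255 ))
}

as_signed=-1
as_status=$(seen_by_parent "$as_signed")
echo "exit($as_signed) is reported as $as_status"

# Cross-check with a real child process exiting with that status.
child_status=0
sh -c "exit $as_status" || child_status=$?
echo "child status: $child_status"
```

So a "return -1" inside the program and a 255 seen by the caller are the same failure, just read as signed or unsigned.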
What shouldn't be happening is having it work fine and return 0 when run
by hand in a shell, and return 255 when run by heartbeat... I feel really
messed up after trying to figure this out for a whole afternoon, so I
can't go on exploring this now... :)
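One way to narrow this down might be a small wrapper that records the context a command runs in, so the same call can be compared when made by hand and when made by heartbeat. This is only a sketch: the function name run_logged, the LOG variable, and the /tmp path are all assumptions for illustration, and the things it records (tty on stdin, PATH) are just guesses at what might differ between the two cases.

```shell
#!/bin/sh
# Hypothetical debugging wrapper: run a command, record some context
# and the exit status in a log file. The log path is an assumption.
LOG=${LOG:-/tmp/run-context.$$.log}

run_logged() {
    {
        echo "command: $*"
        if [ -t 0 ]; then echo "stdin: tty"; else echo "stdin: not a tty"; fi
        echo "PATH: $PATH"
    } >>"$LOG"
    rc=0
    "$@" || rc=$?          # keep the real exit status of the command
    echo "exit status: $rc" >>"$LOG"
    return "$rc"
}

# Example runs: one command that succeeds, one that exits with 255.
run_logged true && echo "true succeeded"
run_logged sh -c 'exit 255' || echo "child failed with $?"
```

Wrapping the failing call this way once from a shell and once from heartbeat, then comparing the two log files, should at least show whether the environment differs between the two runs.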
See ya!
Fábio
( Fábio Olivé Leite -* ConectivaLinux *- olive@conectiva.com[.br] )
( PPGC/UFRGS MSc candidate -*- Advisor: Taisy Silva Weber )
( Linux - Distributed Systems - Fault Tolerance - Security - /etc )