Hi there,
) Most of what I do regarding signal handling is normal for daemon
) processes. The only thing I can do that might be funny would be that
) I ignore SIGCHLD. This could be the problem. You could try and
) change it before it starts the scripts, and see if that fixes the
) problem.
This is exactly the problem. The ignored SIGCHLD is inherited by all
children of heartbeat (an ignored signal stays ignored across fork and
exec), and anything that relies on the default disposition, for example
a script that waits for its own children, can misbehave. I really believe
that every special setup heartbeat does to itself should be undone before
forking unrelated children, because they may be affected by the changes
in a bad way.
Luis Claudio and I solved this problem by calling signal(SIGCHLD, SIG_DFL)
in the two functions where heartbeat starts other programs: notify_world
and req_our_resources. Now even the httpd startup messages are ok, and of
course drbd's datadisk is also happy. :)
Patch follows:
---8<---Cut---Here---8<---
--- heartbeat-0.4.7b/heartbeat/heartbeat.c Fri May 12 16:19:49 2000
+++ linux-ha.new/heartbeat/heartbeat.c Wed May 24 15:13:55 2000
@@ -1199,6 +1316,7 @@
case 0: { /* Child */
int j;
make_normaltime();
+ signal(SIGCHLD, SIG_DFL);
for (j=0; j < msg->nfields; ++j) {
char ename[64];
sprintf(ename, "HA_%s", msg->names[j]);
@@ -1759,6 +1967,7 @@
break;
}
+ signal(SIGCHLD, SIG_DFL);
ha_log(LOG_INFO, "Requesting our resources.");
sprintf(cmd, HALIB "/ResourceManager listkeys %s", curnode->nodename);
---8<---Cut---Here---8<---
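For anyone who wants to see the effect outside heartbeat, here is a minimal
sketch, not heartbeat code. The names spawn_like_daemon and run_and_wait are
made up for the illustration. It assumes the usual POSIX rule that waitpid()
fails with ECHILD while SIGCHLD is set to SIG_IGN, which is what I would
expect on Linux. The "daemon" ignores SIGCHLD, forks a child, and the child
then does what a typical init script does: start a subcommand and wait for it.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run a trivial subcommand and collect its status.
 * Returns 0 if the status was collected, the errno value if fork() or
 * waitpid() failed, or -1 if the status was not the expected one. */
static int run_and_wait(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return errno;
    if (pid == 0)
        _exit(7);                       /* the "subcommand" */
    if (waitpid(pid, &status, 0) < 0)
        return errno;                   /* ECHILD when SIGCHLD is ignored */
    return (WIFEXITED(status) && WEXITSTATUS(status) == 7) ? 0 : -1;
}

/* Hypothetical stand-in for heartbeat starting a script. The parent ignores
 * SIGCHLD the way a daemon might; the child optionally restores the default
 * before doing its work. Returns the child's run_and_wait() result, passed
 * back through a pipe, or -1 if the setup itself failed. */
int spawn_like_daemon(int restore_default)
{
    int fds[2];
    int result = -1;
    void (*old)(int);
    pid_t pid;

    if (pipe(fds) < 0)
        return -1;
    old = signal(SIGCHLD, SIG_IGN);     /* daemon-style setup */

    pid = fork();
    if (pid == 0) {                     /* child: inherits SIG_IGN */
        int r;
        close(fds[0]);
        if (restore_default)
            signal(SIGCHLD, SIG_DFL);   /* the fix from the patch */
        r = run_and_wait();
        if (write(fds[1], &r, sizeof r) != (ssize_t)sizeof r)
            _exit(1);
        _exit(0);
    }

    close(fds[1]);
    if (pid > 0) {
        if (read(fds[0], &result, sizeof result) != (ssize_t)sizeof result)
            result = -1;
        /* With SIGCHLD ignored this only blocks until the child is gone;
         * the ECHILD it reports is expected and deliberately ignored. */
        waitpid(pid, NULL, 0);
    }
    close(fds[0]);
    if (old != SIG_ERR)
        signal(SIGCHLD, old);           /* leave the caller's setup intact */
    return result;
}
```

Calling spawn_like_daemon(0) should hand back ECHILD, because the child
inherited the ignored SIGCHLD and could not collect its subcommand's status,
while spawn_like_daemon(1) should hand back 0. The result goes through a pipe
because the parent, still ignoring SIGCHLD, cannot get it from waitpid().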
Cheers!
Fábio
( Fábio Olivé Leite -* ConectivaLinux *- olive@conectiva.com[.br] )
( PPGC/UFRGS MSc candidate -*- Advisor: Taisy Silva Weber )
( Linux - Distributed Systems - Fault Tolerance - Security - /etc )