Mailing List Archive

inherited sockets
I've played around with heartbeat for some days now and found some strange
behaviour. For testing I use httpd as my resource, so heartbeat starts the
httpd process when it comes up. The problem is: the httpd process kind of
inherits the udp-socket used for the exchange of the heartbeats. This is not a
problem if everything goes fine and heartbeat is taken down with
/etc/rc.d/init.d/heartbeat stop, because in this case also httpd is taken down
and the socket is free. But if for some reason heartbeat dies without taking
httpd down (e.g. kill -9) the socket (1001) can still be found allocated for
httpd (do a socklist | grep 1001) and heartbeat refuses to start up again
(because it can't open a socket on port 1001).
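
[Editor's sketch] The restart failure described above can be reproduced in miniature. The following is only an illustration, not heartbeat code: the helper names bind_udp() and second_bind_errno() are hypothetical, and a second socket in the same process stands in for the descriptor that httpd inherited. As long as any open descriptor still holds the bound UDP port, a new bind() to that port is expected to fail with EADDRINUSE.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a fresh UDP socket to 127.0.0.1:port
 * (port in host byte order; 0 lets the kernel pick a free port).
 * Returns the descriptor, or -1 with errno set. */
static int bind_udp(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        int saved = errno;      /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Hypothetical helper: bind one socket, then try to bind a second
 * one to the same port while the first is still open.
 * Returns the errno of the second bind, 0 if it unexpectedly
 * succeeded, or -1 if the setup itself failed. */
static int second_bind_errno(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int first, second, err;

    first = bind_udp(0);
    if (first < 0)
        return -1;
    if (getsockname(first, (struct sockaddr *)&addr, &len) < 0) {
        close(first);
        return -1;
    }
    second = bind_udp(ntohs(addr.sin_port));
    err = (second < 0) ? errno : 0;
    if (second >= 0)
        close(second);
    close(first);
    return err;
}
```

In the real scenario the "first" descriptor lives on in the httpd process after heartbeat has been killed, which is why the port only becomes free again once httpd exits.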

I solved this problem by setting the close-on-exec bit for the socket in udp.c
using fcntl(). It looks as if I didn't destroy anything else with this change
and it works fine now for me.
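
[Editor's sketch] For readers unfamiliar with the close-on-exec bit, here is a minimal illustration of the technique, assuming a POSIX system. It is not the change made to udp.c: the helper names set_cloexec() and is_cloexec() are hypothetical, and the read-modify-write of the descriptor flags (F_GETFD followed by F_SETFD) is a common idiom rather than something taken from this thread.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: mark fd so that it is closed automatically
 * when the process calls one of the exec() functions.
 * Returns 0 on success, -1 on error (errno set by fcntl). */
static int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);     /* read current descriptor flags */

    if (flags == -1)
        return -1;
    /* keep whatever flags were already set and add FD_CLOEXEC */
    if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1)
        return -1;
    return 0;
}

/* Hypothetical helper: returns 1 if FD_CLOEXEC is set on fd, else 0. */
static int is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    return flags != -1 && (flags & FD_CLOEXEC) != 0;
}
```

A caller would invoke set_cloexec(sockfd) right after socket() and before any resource script is started via fork() and exec(), so that the child (httpd in this report) never holds the heartbeat port.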

If you like I can send you a patch (or do you disagree with my changes?)

Christoph
--
Christoph Jäger mailto:cja@gams.at
g.a.m.s. edv dienstleistungen gmbh +43 1 895 84 99-25
stiegergasse 15-17 ; 1150 Wien http://www.gams.at
inherited sockets
"Christoph Jäger" wrote:
>
> I've played around with heartbeat for some days now and found some strange
> behaviour. For testing I use httpd as my resource, so heartbeat starts the
> httpd process when it comes up. The problem is: the httpd process kind of
> inherits the udp-socket used for the exchange of the heartbeats. This is not a
> problem if everything goes fine and heartbeat is taken down with
> /etc/rc.d/init.d/heartbeat stop, because in this case also httpd is taken down
> and the socket is free. But if for some reason heartbeat dies without taking
> httpd down (e.g. kill -9) the socket (1001) can still be found allocated for
> httpd (do a socklist | grep 1001) and heartbeat refuses to start up again
> (because it can't open a socket on port 1001).
>
> I solved this problem by setting the close-on-exec bit for the socket in udp.c
> using fcntl(). It looks as if I didn't destroy anything else with this change
> and it works fine now for me.
>
> If you like I can send you a patch (or do you disagree with my changes?)

Send the patch!

Thanks!!

-- Alan Robertson
alanr@suse.com
inherited sockets
Ok, so here is the patch:

*** udp.c.orig Sun Oct 10 22:12:58 1999
--- udp.c Wed May 17 07:50:47 2000
***************
*** 26,31 ****
--- 26,32 ----
  #include <errno.h>
  #include <string.h>
  #include <ctype.h>
+ #include <fcntl.h>
  #include <sys/types.h>
  #include <sys/socket.h>
  #include <netinet/in.h>
***************
*** 292,297 ****
--- 293,304 ----
          }
      }
  #endif
+     if (fcntl(sockfd,F_SETFD,FD_CLOEXEC)) {
+         ha_perror("Error setting the close-on-exec flag");
+         close(sockfd);
+         return(-1);
+     }
+
      return(sockfd);
  }

***************
*** 358,363 ****
--- 365,377 ----
          close(sockfd);
          return(-1);
      }
+
+     if (fcntl(sockfd,F_SETFD,FD_CLOEXEC)) {
+         ha_perror("Error setting the close-on-exec flag");
+         close(sockfd);
+         return(-1);
+     }
+
      return(sockfd);
  }

Hope this helps,

Christoph
--
Christoph Jäger mailto:cja@gams.at
g.a.m.s. edv dienstleistungen gmbh +43 1 895 84 99-25
stiegergasse 15-17 ; 1150 Wien http://www.gams.at
inherited sockets
"Christoph Jäger" wrote:
>
> Ok, so here is the patch:

<snip>

> + if (fcntl(sockfd,F_SETFD,FD_CLOEXEC)) {
> + ha_perror("Error setting the close-on-exec flag");
> + close(sockfd);
> + return(-1);
> + }
> +
> return(sockfd);

<snip>

> Hope this helps,
>
> Christoph


Yep. It did.

I decided not to perform the close in the error leg, and applied
corresponding changes to the ppp-udp code and serial code as well.
These changes are now in CVS.

Thanks!!

-- Alan Robertson
alanr@suse.com
inherited sockets
Alan Robertson <alanr@suse.com> wrote:
> "Christoph Jäger" wrote:
>>
>> Ok, so here is the patch:

> <snip>

>> + if (fcntl(sockfd,F_SETFD,FD_CLOEXEC)) {
>> + ha_perror("Error setting the close-on-exec flag");
>> + close(sockfd);
>> + return(-1);
>> + }
>> +
>> return(sockfd);

> <snip>

>> Hope this helps,
>>
>> Christoph


> Yep. It did.

> I decided not to perform the close in the error leg, and applied
> corresponding changes to the ppp-udp code and serial code as well.
> These changes are now in CVS.

Is this CVS repository publicly accessible (pserver or anoncvs)?

t

--
thomas.graichen@innominate.de
innominate AG
networking people
fon: +49.30.308806-13 fax: -77 web: http://innominate.de pgp: /pgp/tg
inherited sockets
Thomas Graichen wrote:
>
> Alan Robertson <alanr@suse.com> wrote:
> > "Christoph Jäger" wrote:
> >>
> >> Ok, so here is the patch:
>
> > <snip>
>
> >> + if (fcntl(sockfd,F_SETFD,FD_CLOEXEC)) {
> >> + ha_perror("Error setting the close-on-exec flag");
> >> + close(sockfd);
> >> + return(-1);
> >> + }
> >> +
> >> return(sockfd);
>
> > <snip>
>
> >> Hope this helps,
> >>
> >> Christoph
>
> > Yep. It did.
>
> > I decided not to perform the close in the error leg, and applied
> > corresponding changes to the ppp-udp code and serial code as well.
> > These changes are now in CVS.
>
> Is this CVS repository publicly accessible (pserver or anoncvs)?

The directions are on the web site. I'm offline right now, but I think
they're on either the main page http://linux-ha.org/ or the contact page
http://linux-ha.org/contact/. I suppose the directions ought to be on
the download page as well.

-- Alan Robertson
alanr@suse.com