Mailing List Archive

SIGHUP on connect
Hi,

I've been writing a simple client and server for cluster computing this weekend. At first everything appeared to work just fine, but soon enough I ran into some
inexplicable bind errors. I've tried to make sure that the client closes its sockets before the server closes its sockets, to prevent linger, but that did
not help. Now I think I have found the problem.

Please do have a look at the code. It looks like the SIGHUP is sent to the server not on close or exit, but on the connect instead.

Greetz,
Mischa.
Re: SIGHUP on connect
Hi all!

On 25/10/2020 16:11, Michael J. Baars wrote:
[...]
> I've been writing a simple client and server for cluster computing this weekend. At first everything appeared to work just fine, but soon enough I found some
> inexplicable bind errors. I've tried to make sure that the client closes its sockets before the server closes its sockets, to prevent linger, but trying did

Which errors exactly?
Please post the original error text (in English) ...

Also, the close() and shutdown() syscalls don't avoid
the FIN_WAIT2 timeout on a closed socket.
Just set the SO_REUSEADDR socket option on the listening socket.

> not help. Now I think I found the problem.

Then solve it.

> Please do have a look at the code. It looks like the SIGHUP is sent to the server not on close or exit, but on the connect instead.

Too lazy to save and uncompress the file ...

MfG,
Bernd
--
There is no cloud, just other people's computers.
-- https://static.fsf.org/nosvn/stickers/thereisnocloud.svg
Re: SIGHUP on connect
On Mon, 2020-10-26 at 17:12 +0000, Bernd Petrovitsch wrote:
> Hi all!
>
> On 25/10/2020 16:11, Michael J. Baars wrote:
> [...]
> > I've been writing a simple client and server for cluster computing this weekend. At first everything appeared to work just fine, but soon enough I found some
> > inexplicable bind errors. I've tried to make sure that the client closes its sockets before the server closes its sockets, to prevent linger, but trying did
>
> Which errors exactly?
> Please post the original error text (in English) ...
>
> Also, the close() and shutdown() syscalls don't avoid
> the FIN_WAIT2 timeout on a closed socket.
> Just set the SO_REUSEADDR socket option on the listening socket.
>
> > not help. Now I think I found the problem.
>
> Then solve it.
>
> > Please do have a look at the code. It looks like the SIGHUP is sent to the server not on close or exit, but on the connect instead.
>
> Too lazy to save and uncompress the file ...
>
> MfG,
> Bernd

Oh, I see the difference.

I forgot to mention that in my setup, there's only one client and numerous servers that do the computational work. So in my case, it would have been better to
have the SIGHUPs on sock[0]. In other cases, probably most cases, the SIGHUPs should be sent on sock[1].

Best regards,
Mischa.
Re: SIGHUP on connect

And I think this was sort of part of the question:

On the server side, sock[0] gets 1 SIGHUP on the connect.
On the server side, sock[1] gets about 7 SIGHUPs on the close.

Why not send these 6 or 7 SIGHUPs on sock[0], so that the SIGHUP handler has to be installed only once?

Regards,
Mischa.
Re: SIGHUP on connect
Hi Bernd,

According to the manual page socket(7), SO_REUSEADDR allows local addresses to be reused for binding. I've tested this socket option with the WAN address, and it
appears the problem is solved for both local and non-local connections.
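
A minimal sketch of the SO_REUSEADDR setup discussed above, for readers who want to try it. It is written in Python, whose socket module wraps the same setsockopt() call as C. The helper name make_listener and its defaults (loopback address, port 0 so that the kernel picks a free port) are assumptions for illustration; they are not taken from the code attached to this thread.

```python
import socket


def make_listener(host="127.0.0.1", port=0, reuse=True):
    """Return a listening TCP socket, optionally with SO_REUSEADDR set.

    make_listener is a hypothetical helper; port 0 asks the kernel
    for any free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse:
            # Must be set before bind(): it lets bind() succeed even if
            # old connections on this address are still winding down.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(16)
    except OSError:
        # Do not leak the descriptor if bind() or listen() fails.
        sock.close()
        raise
    return sock
```

A server would call make_listener("0.0.0.0", some_port) once at startup. The important detail is the ordering: the option only affects a bind() that happens after it has been set.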

I also found the SO_LINGER socket option to be useful in some way. By default, SO_LINGER is off (l_onoff = 0), so you would think that lingering connections were out
of the question. However, enabling linger with l_onoff = 1 and l_linger = 0 seems to work a lot better than leaving it disabled with l_onoff = 0 and
l_linger = 0.
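
A small sketch of the two SO_LINGER settings being compared, again in Python and under stated assumptions: the helper names set_linger and get_linger are hypothetical, and the packing assumes that struct linger consists of two C ints, as it does on Linux. With l_onoff = 1 and l_linger = 0, close() on a TCP socket normally discards any unsent data and resets the connection instead of going through the usual FIN handshake, so the closing side leaves no TIME_WAIT entry behind. That may be why this setting appeared to work better here, at the price of possibly losing data still in flight.

```python
import socket
import struct

# Assumption: struct linger { int l_onoff; int l_linger; } as on Linux.
LINGER_FORMAT = "ii"


def set_linger(sock, onoff, seconds):
    """Set SO_LINGER on sock (hypothetical helper)."""
    value = struct.pack(LINGER_FORMAT, onoff, seconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, value)


def get_linger(sock):
    """Return the current (l_onoff, l_linger) pair of sock."""
    size = struct.calcsize(LINGER_FORMAT)
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, size)
    return struct.unpack(LINGER_FORMAT, raw)
```

Calling set_linger(sock, 1, 0) before close() selects the abortive close; set_linger(sock, 0, 0) restores the default, where close() returns immediately and the kernel finishes the shutdown in the background.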

Which option would you use?

Mischa.

Re: SIGHUP on connect
Hi all!

On 29/10/2020 14:10, Michael J. Baars wrote:
[...]
> According to the manual page socket(7), SO_REUSEADDR allows local addresses to be reused for binding. I've tested this socket option with the WAN address, and it
> appears the problem is solved for both local and non-local connections.

Yup.

> I also found the SO_LINGER socket option to be useful in some way. By default, SO_LINGER is off (l_onoff = 0), so you would think that lingering connections were out
> of the question. However, enabling linger with l_onoff = 1 and l_linger = 0 seems to work a lot better than leaving it disabled with l_onoff = 0 and
> l_linger = 0.
>
> Which option would you use?

I never used SO_LINGER before.

From the description in `man 7 socket`, active SO_LINGER just
makes shutdown() block if not all data has been transmitted
(and ACKed?).
close() on a socket calls shutdown() automatically (unless
shutdown() has already been called).

The timeout which your application runs into
applies after shutting down/closing the connection.

MfG,
Bernd
--
Bernd Petrovitsch Email : bernd@petrovitsch.priv.at
There is no cloud, just other people's computers. - FSFE
LUGA : http://www.luga.at
Re: SIGHUP on connect
Here, I think I've got it :)

Just run 'make; make run'

Thank you for your help.

On Thu, 2020-10-29 at 21:48 +0100, Bernd Petrovitsch wrote:
> Hi all!
>
> On 29/10/2020 14:10, Michael J. Baars wrote:
> [...]
> > According to the manual page socket(7), SO_REUSEADDR allows local addresses to be reused for binding. I've tested this socket option with the WAN address, and it
> > appears the problem is solved for both local and non-local connections.
>
> Yup.
>
> > I also found the SO_LINGER socket option to be useful in some way. By default, SO_LINGER is off (l_onoff = 0), so you would think that lingering connections were out
> > of the question. However, enabling linger with l_onoff = 1 and l_linger = 0 seems to work a lot better than leaving it disabled with l_onoff = 0 and
> > l_linger = 0.
> >
> > Which option would you use?
>
> I never used SO_LINGER before.
>
> From the description in `man 7 socket`, active SO_LINGER just
> makes shutdown() block if not all data has been transmitted
> (and ACKed?).
> close() on a socket calls shutdown() automatically (unless
> shutdown() has already been called).
>
> The timeout which your application runs into
> applies after shutting down/closing the connection.
>
> MfG,
> Bernd