Mailing List Archive

A safer way to kill pseudo-forked processes on Windows?
Currently it is rather difficult to cleanly terminate a Perl program
using fork() emulation on Windows:

The Perl process will only terminate once the main thread *and* all
forked children have terminated.

So if the child process might be waiting in a blocked system call, we
may end up with a deadlock.

The standard "solution" is to use kill(9, $child), as this is the
only mechanism that will terminate a blocked thread.

However, using kill(9, $pid) may destabilize the process itself if the
child process is not blocked, but actively doing something, like
allocating memory from the shared heap.

I've been wondering if we shouldn't allow one of the other signals to
mark the child as "killed", even though the signal might not be delivered
if the thread is blocked. That way the main thread could exit without
having to wait for these children, and ExitProcess() will eventually
terminate all other threads anyways.

The question then becomes: Which signal should get this special treatment?
In some ways it would be most convenient to use SIGTERM, as that means
the same code would work on Unix and Windows (which is the main point
behind the fork() emulation). However, this would mean that a child
process would no longer be guaranteed to be able to catch SIGTERM and
do its own cleanup; it could get reaped at any time after catching the
signal.

Given that currently a pseudo-forked process is not guaranteed to receive
a SIGTERM signal at all, if it is blocked on a system call, this seems
like a sensible compromise to me.

Or would it be better to use a different signal, e.g. the Windows specific
SIGBREAK?

Unless I see any better suggestion (or reasons why doing this is a bad
idea), I'll implement this for SIGTERM in a day or two.

For more background, please check out these bug reports for Test::TCP:

https://rt.cpan.org/Ticket/Display.html?id=66016
https://rt.cpan.org/Ticket/Display.html?id=66437

My recent change to always surrender the remainder of the timeslice after
calling TerminateThread() on a child process seems to have reduced the
frequency of receiving the wrong exit code, but unfortunately the race
conditions inherent in terminating a thread in an unknown state can still
be observed.

Cheers,
-Jan
Re: A safer way to kill pseudo-forked processes on Windows?
Jan Dubois wrote:
> However, this would mean that a child
>process would no longer be guaranteed to be able to catch SIGTERM and
>do its own cleanup; it could get reaped at any time after catching the
>signal.

That would be bad. Catching SIGTERM and handling it in a program-specific
way is an important aspect of Unix programming, and it really would not
wash for Perl to make it impossible.

-zefram
RE: A safer way to kill pseudo-forked processes on Windows?
On Tue, 15 Mar 2011, Zefram wrote:
> Jan Dubois wrote:
> > However, this would mean that a child
> >process would no longer be guaranteed to be able to catch SIGTERM and
> >do its own cleanup; it could get reaped at any time after catching the
> >signal.
>
> That would be bad. Catching SIGTERM and handling it in a program-specific
> way is an important aspect of Unix programming, and it really would not
> wash for Perl to make it impossible.

Note that I'm *only* talking about Windows here, which doesn't have real
signals. Signals to pseudo-threads are emulated by posting a message to
a queue, and will only be checked whenever Perl checks for safe signals.
They will never interrupt a blocking system call (which makes the
signal emulation mostly useless already).

So any child that is waiting for a socket connection, or waiting in a
blocking socket read will never receive the SIGTERM signal anyways; they
already aren't able to do any program-specific cleanup in this (I think
most common) case.

This means you currently cannot rely on sending a SIGTERM to the child
and exiting the parent. Both threads could now hang forever, as the child
is still blocked, and the parent process has to wait for the child to
finish, because they are really just 2 threads within the same process.

So either you accept the chance of a deadlock, or you wait in your
application code with a timeout, and then kill(9,$child) anyways, in
which case you are no better off than with the proposed change.
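
For illustration, the "wait with a timeout, then kill(9, $child)" workaround
could look roughly like the sketch below. The helper name stop_child and its
$timeout parameter are hypothetical, and the sketch assumes that waitpid()
with WNOHANG accepts pseudo-process IDs on the Perl build in use. The final
KILL still carries the TerminateThread() risk described in this thread.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);
use Time::HiRes qw(sleep);

# Hypothetical helper: ask the child to stop with TERM, poll for up to
# $timeout seconds, then fall back to KILL. Returns the name of the signal
# that was in effect when the child was finally reaped.
sub stop_child {
    my ($pid, $timeout) = @_;

    kill 'TERM', $pid;
    my $deadline = time + $timeout;

    while (time < $deadline) {
        # waitpid() returns 0 while the child is still running.
        my $reaped = waitpid($pid, WNOHANG);
        return 'TERM' if $reaped != 0;
        sleep 0.1;
    }

    # Last resort. For a pseudo-forked child on Windows this is the
    # unsafe path that can leave the process in an unknown state.
    kill 'KILL', $pid;
    waitpid($pid, 0);
    return 'KILL';
}
```

A parent would call something like stop_child($pid, 5) just before it exits,
accepting that a child blocked in a system call ends up on the KILL path.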

So while the change may be bad, I think the current situation is even
worse.

At least with the change, you have the option to waitpid() on the child
yourself and have the same behavior as right now. But without the change
you currently have no mechanism that guarantees to safely terminate the
parent process unless the child voluntarily terminates itself (or is
written in a way to never block on I/O without a timeout, which doesn't
seem very common).

Cheers,
-Jan
Re: A safer way to kill pseudo-forked processes on Windows?
* Jan Dubois <jand@activestate.com> [2011-03-15 17:50]:
> On Tue, 15 Mar 2011, Zefram wrote:
> > Jan Dubois wrote:
> > > However, this would mean that
> > >a child process would no longer be guaranteed to be able to
> > >catch SIGTERM and do its own cleanup; it could get reaped at
> > >any time after catching the signal.
> >
> > That would be bad. Catching SIGTERM and handling it in
> > a program-specific way is an important aspect of Unix
> > programming, and it really would not wash for Perl to make it
> > impossible.
>
> Note that I'm *only* talking about Windows here, which doesn't
> have real signals. Signals to pseudo-threads are emulated by
> posting a message to a queue, and will only be checked whenever
> Perl checks for safe signals. They will never interrupt
> a blocking system call (which makes the signal emulation mostly
> useless already).

But the implication is “if you want to write cross-platform code
you have to avoid SIGTERM because it won’t be catchable everywhere.”
If I understand correctly, SIGTERM currently simply does nothing
on Windows, so at least you don’t have to avoid it in Unix (or
add conditionals to your code about whether or not to avoid it
based on platform). You just have to write some extra mechanism
to make Windows work.

So if my premises are correct, your proposal would penalise Unix
users in order to help Windows users.

Why does this SIGTERM have to be uncatchable on Windows anyway? If
signals are pushed to a queue, why is it not possible to call
their handlers? (Only eventually, not right away, sure. So what?)
If it can be made to trigger the appropriate handlers, then that
would be the best way to go about it: the same code you write on
either platform would work on the other.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
Re: A safer way to kill pseudo-forked processes on Windows?
On Tue, Mar 15, 2011 at 5:02 PM, Aristotle Pagaltzis <pagaltzis@gmx.de> wrote:
> But the implication is “if you want to write cross-platform code
> you have to avoid SIGTERM because it won’t be catchable everywhere.”

If you want to write cross-platform code, you probably shouldn't use
signals, full stop.

-- David
Re: A safer way to kill pseudo-forked processes on Windows?
* David Golden <xdaveg@gmail.com> [2011-03-15 22:30]:
> On Tue, Mar 15, 2011 at 5:02 PM, Aristotle Pagaltzis <pagaltzis@gmx.de> wrote:
> >But the implication is “if you want to write cross-platform
> >code you have to avoid SIGTERM because it won’t be catchable
> >everywhere.”
>
> If you want to write cross-platform code, you probably
> shouldn't use signals, full stop.

That’s not the point. The point is that as I understand it, the
current situation with SIGTERM is thus:

Unix: can be handled as expected
Win32: does nothing

So you can use signals to cover certain cases as long as you are
not solely reliant on them in order to behave correctly.

Afterwards it would be:

Unix: can be handled as expected
Win32: always kills your thread

So then you explicitly have to keep signals away from your code.
Case “Unix” is penalised indirectly because of the mere existence
of case “Win32”.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
RE: A safer way to kill pseudo-forked processes on Windows?
On Tue, 15 Mar 2011, Aristotle Pagaltzis wrote:
>
> But the implication is “if you want to write cross-platform code
> you have to avoid SIGTERM because it won’t be catchable everywhere.”

Yes, as David Golden already said: "If you want to write cross-platform
code, you probably shouldn't use signals, full stop."

And once you read `perldoc perlfork` you'll see that there are
plenty of other limitations with the fork() emulation too.

> If I understand correctly, SIGTERM currently simply does nothing
> on Windows, so at least you don’t have to avoid it in Unix (or
> add conditionals to your code about whether or not to avoid it
> based on platform). You just have to write some extra mechanism
> to make Windows work.

No, SIGTERM works as well as any of the other emulated "signals":

#!perl
use strict;
use warnings;

$|=1;
if (my $pid = fork) {
    sleep 1;
    kill 'TERM', $pid;
    warn 1;
}
else {
    require IO::Socket::INET;
    # IO::Socket::INET->new(Listen => 5)->accept;
    sleep 5;
    warn 2;
}
__END__

C:\tmp>perl fork.pl
1 at fork.pl line 9.
Terminating on signal SIGTERM(15)

The sleep() function is specifically implemented to return early when we
receive an emulated signal (including a fake SIGALRM from our alarm()
emulation).

But if you uncomment the accept() call, then the program will just
hang forever instead (unless someone actually connects to the
socket, of course).
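
As a rough sketch of the alternative mentioned earlier (a child that never
blocks on I/O without a timeout), the accept() could be wrapped in a polling
loop. The helper name accept_with_polling and its parameters are
hypothetical, and the sketch assumes that pending emulated signals are
dispatched between Perl ops once can_read() returns, as with the sleep()
case above.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# Hypothetical helper: accept one connection, but wake up every $poll
# seconds so that pending (emulated) signals get a chance to run.
# $keep_going is a code ref that returns false once we should give up.
sub accept_with_polling {
    my ($listener, $poll, $keep_going) = @_;
    my $select = IO::Select->new($listener);

    while ($keep_going->()) {
        # can_read() returns an empty list when the timeout expires.
        next unless $select->can_read($poll);
        return $listener->accept;
    }
    return;    # asked to stop before anyone connected
}

# Possible use in the child:
#   my $running = 1;
#   $SIG{TERM} = sub { $running = 0 };
#   my $client = accept_with_polling($listener, 1, sub { $running });
```

The cost is that every blocking call in the child has to be rewritten this
way, which is the part that does not seem very common in existing code.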

> So if my premises are correct, your proposal would penalise Unix
> users in order to help Windows users.
>
> Why does this SIGTERM have to be uncatchable on Windows anyway? If
> signals are pushed to a queue, why is it not possible to call
> their handlers? (Only eventually, not right away, sure. So what?)

What if "eventually" is actually *never*, see above?

> If it can be made to trigger the appropriate handlers, then that
> would be the best way to go about it: the same code you write on
> either platform would work on the other.

Yes, but given that Windows is neither a POSIX system, nor even
a Unix system, this will be an unattainable goal. And I think
by no longer making the parent wait on a child that has been sent
SIGTERM we actually do improve the chances of Unix code just working
on Windows. At least for all those cases where the parent wants to
kill a child that is waiting for another incoming connection, which
seems to be quite common.

And that code currently doesn't work at all, unless it contains
something like this (from Test::TCP):

# process does not die when received SIGTERM, on win32.
my $TERMSIG = $^O eq 'MSWin32' ? 'KILL' : 'TERM';

And using KILL (which internally calls TerminateThread()) is never
a safe option; it can sometimes crash or hang, if the thread is
killed in the middle of an "atomic" operation.

With my SIGTERM patch in place I could remove the Windows specific
code from Test::TCP and all tests ran successfully anyways...

Cheers,
-Jan
RE: A safer way to kill pseudo-forked processes on Windows?
On Tue, 15 Mar 2011, Aristotle Pagaltzis wrote:
> That’s not the point. The point is that as I understand it, the
> current situation with SIGTERM is thus:
>
> Unix: can be handled as expected
> Win32: does nothing

No, the only difference is that the signal does not interrupt a
blocking system call. It is otherwise handled as expected.

Of course if the system call never unblocks, then the result is
quite different...

> So you can use signals to cover certain cases as long as you are
> not solely reliant on them in order to behave correctly.
>
> Afterwards it would be:
>
> Unix: can be handled as expected
> Win32: always kills your thread

No, it will not always kill your thread, it will still trigger your
signal handler (unless you are blocked in a system call...).

The difference now is that when the parent terminates, it will
no longer wait for the child to terminate as well. Which means
that the OS will eventually terminate the child process
when the parent interpreter is gone.

> So then you explicitly have to keep signals away from your code.
> Case “Unix” is penalised indirectly because of the mere existence
> of case “Win32”.

This is already the case with the current way things work.

Cheers,
-Jan
Re: A safer way to kill pseudo-forked processes on Windows?
* Jan Dubois <jand@activestate.com> [2011-03-15 23:10]:
> On Tue, 15 Mar 2011, Aristotle Pagaltzis wrote:
> >So then you explicitly have to keep signals away from your
> >code. Case “Unix” is penalised indirectly because of the mere
> >existence of case “Win32”.
>
> This is already the case with the current way things work.

OK. If it doesn’t introduce any new downsides, and just brings
things a smidgen closer to “it works the same everywhere” (even
if it never fully can), then that sounds good to me.

Thanks for arguing the case,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>