Mailing List Archive

Strange segfault in Python threads and linux kernel 2.6
G'day,

I've Cc'ed this to zope-coders as it might affect other Zope developers
and it had me stumped for ages. I couldn't find anything on it anywhere,
so I figured it would be good to get something into google :-).

We are developing a Zope 2.7 application on Debian GNU/Linux that uses
fop to generate PDFs from XSL-FO data. fop is a java thing, and we run
it with popen2.Popen3(), with the pipes in non-blocking mode and a
select loop to write stdin and read stdout/stderr. This was all working
fine.
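
For anyone who wants to try the same approach, here is a rough sketch
of that kind of non-blocking select loop. It is not our Zope code: it
uses the subprocess module (popen2 was removed in Python 3), and the
run_with_select name and its parameters are made up for illustration.
It assumes a Unix system and Python 3.5 or later.

```python
import os
import select
import subprocess


def run_with_select(argv, input_bytes=b"", timeout=1.0):
    """Feed input_bytes to the child's stdin and collect its stdout/stderr."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    in_fd = proc.stdin.fileno()
    out_fd = proc.stdout.fileno()
    err_fd = proc.stderr.fileno()
    # Non-blocking mode on all three pipes so no single read/write can stall.
    for fd in (in_fd, out_fd, err_fd):
        os.set_blocking(fd, False)

    pending = memoryview(input_bytes)   # input not yet written to the child
    chunks = {out_fd: [], err_fd: []}   # output collected per descriptor
    readers = [out_fd, err_fd]
    writers = [in_fd]

    while readers or writers:
        if writers and not pending:
            # All input delivered: close stdin so the child sees EOF.
            proc.stdin.close()
            writers = []
            continue
        rlist, wlist, _ = select.select(readers, writers, [], timeout)
        for fd in wlist:
            try:
                written = os.write(fd, pending[:select.PIPE_BUF])
            except BlockingIOError:
                written = 0             # pipe filled up again; retry later
            except BrokenPipeError:
                written = len(pending)  # child closed stdin; drop the rest
            pending = pending[written:]
        for fd in rlist:
            data = os.read(fd, 65536)
            if data:
                chunks[fd].append(data)
            else:
                readers.remove(fd)      # EOF on this pipe

    returncode = proc.wait()
    proc.stdout.close()
    proc.stderr.close()
    return returncode, b"".join(chunks[out_fd]), b"".join(chunks[err_fd])
```

Something like run_with_select(["cat"], b"hello\n") should hand back
the input unchanged on stdout. Writing at most PIPE_BUF bytes per pass
keeps each write small enough that it either completes or is retried.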

Then over the Christmas chaos, various things on my development system
were apt-get updated, and I noticed that java/fop had started
segfaulting. I tried running fop with the exact same input data from the
command line; it worked. I wrote a python script that invoked fop in
exactly the same way as we were invoking it inside zope; it worked. It
only segfaulted when invoked inside Zope.

I googled and tried everything... switched from j2re1.4 to kaffe, rolled
back to a previous version of python, re-built Zope, upgraded Zope from
2.7.2 to 2.7.4, nothing helped. Then I went back from a linux 2.6.8
kernel to a 2.4.27 kernel; it worked!

After googling around, I found references to recent attempts to resolve
some signal handling problems in Python threads. There was one post that
mentioned subtle differences between how Linux 2.4 and Linux 2.6 did
signals to threads.

So it seems this is a problem with Python threads and Linux kernel 2.6.
The attached program demonstrates that it has nothing to do with Zope.
Running "fop-test /usr/bin/fop </dev/null" on a Debian box with fop
installed shows the segfault. Running the same thing on a machine with
a 2.4 kernel gets the fop "usage" message instead. It is not a generic
fop/java problem with 2.6, because the commented-out un-threaded line
works fine. Not every command segfaults, either... "cat -" works OK, so
something about java must be contributing.
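
The comparison described above (same command, once from the main
thread and once from a worker thread) can be sketched roughly as
follows. This is not the attached program, just a minimal illustration
using the modern subprocess module; the function names are
hypothetical. It relies on subprocess reporting a negative return code
when the child is killed by a signal.

```python
import subprocess
import threading


def run_command(argv):
    """Run argv with stdin at EOF; return (returncode, stdout, stderr)."""
    proc = subprocess.run(argv, stdin=subprocess.DEVNULL,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.returncode, proc.stdout, proc.stderr


def run_in_thread(argv):
    """Run the same command, but start it from a worker thread."""
    result = {}

    def worker():
        result["value"] = run_command(argv)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result["value"]


def describe(returncode):
    """Turn a subprocess return code into a readable message."""
    if returncode < 0:
        # On POSIX, -N means the child was terminated by signal N.
        return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode
```

Printing describe(run_command(argv)[0]) next to
describe(run_in_thread(argv)[0]) for the same argv would show whether
only the threaded run dies on a signal.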

After searching the Python bugs, the closest I could find was #971213
<http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=971213>.
Is this the same bug? Should I submit a new bug report? Is there any
other way I can help resolve this?

BTW, built-in file objects really could use better non-blocking
support... I've got a half-drafted PEP for it... anyone interested in
it?

--
Donovan Baarda <abo@minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/
Re: Strange segfault in Python threads and linux kernel 2.6
On Wed, Jan 19, 2005 at 04:16:09PM +1100, Donovan Baarda wrote:
[...]

| After googling around, I found references to recent attempts to resolve
| some signal handling problems in Python threads. There was one post that
| mentioned subtle differences between how Linux 2.4 and Linux 2.6 did
| signals to threads.
|
| So it seems this is a problem with Python threads and Linux kernel 2.6.

Maybe this information will help you:

http://marc.theaimsgroup.com/?l=vim-dev&m=100827301719235&w=2
http://marc.theaimsgroup.com/?l=vim-dev&m=100803941216634&w=2
http://lists.debian.org/debian-user/2002/12/msg03171.html


Quoting the last paragraph of section 6.6 of "Programming with
POSIX Threads" by David R. Butenhof:

It is always best to avoid using signals in conjunction
with threads. At the same time, it is often not possible or
practical to keep them separate. When signals and threads
meet, beware. If at all possible, use only pthread_sigmask
to mask signals in the main thread, and sigwait to handle
signals synchronously within a single thread dedicated to
that purpose. If you must use sigaction (or equivalent) to
handle synchronous signals (such as SIGSEGV) within threads,
be especially cautious. Do as little work as possible within
the signal-catching function.
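
In Python terms, the pattern Butenhof recommends might look something
like the sketch below. It assumes Python 3.3 or later on a Unix
system, where the signal module provides pthread_sigmask and sigwait.
The wait_for_signal_in_thread helper and its trigger argument are
hypothetical, just there to keep the example self-contained.

```python
import os
import signal
import threading


def wait_for_signal_in_thread(signals, trigger):
    """Block `signals`, then receive one of them in a dedicated thread."""
    # Block the signals in the calling thread first. Threads started
    # afterwards inherit this mask, so no thread gets them asynchronously.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signals)
    received = []

    def waiter():
        # sigwait() accepts a pending signal synchronously: no handler
        # runs, it simply returns the signal number.
        received.append(signal.sigwait(signals))

    thread = threading.Thread(target=waiter, daemon=True)
    thread.start()
    try:
        trigger()       # something that causes the signal to be sent
        thread.join()
    finally:
        # Restore the caller's original signal mask.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return received[0]
```

For example, passing {signal.SIGUSR1} and a trigger that calls
os.kill(os.getpid(), signal.SIGUSR1) should return signal.SIGUSR1 from
the waiting thread without any handler being installed.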


Good luck!

-D

--
\begin{humor}
Disclaimer:
If I receive a message from you, you are agreeing that:
1. I am by definition, "the intended recipient"
2. All information in the email is mine to do with as I see fit and make
such financial profit, political mileage, or good joke as it lends
itself to. In particular, I may quote it on USENET or the WWW.
3. I may take the contents as representing the views of your company.
4. This overrides any disclaimer or statement of confidentiality that may
be included on your message
\end{humor}

www: http://dman13.dyndns.org/~dman/ jabber: dman@dman13.dyndns.org