Mailing List Archive

Leaking files / file descriptors in HTTP::Body..
Switching this to the dev list as it is a bit more involved.

On 6 Oct 2008, at 15:30, James R. Leu wrote:
>
<snip>
> ... or running out of file descriptors. Do a 'lsof -p <pid>' where
> PID
> is that of the httpd children. I see a file descriptor leak in
> Catalyst::Engine::finalize_body in my cat app.
>
> See this post for more info:
>
> http://lists.scsys.co.uk/pipermail/catalyst/2007-November/
> 015817.html

and to quote that:

> The contents of the
> files is the JSON data for the API calls I'm making from the
client back
> to the server. I've traced the origins of the files back to
Catalyst::Engine
> (more specifically HTTP::Body's use of File::Temp). If I add the
following
> 'hack' to the end of Catalyst::Engine::finalize_body the files in /
tmp
> get closed and I no longer experience the file handle exhaustion:
>
> if (defined($c->request->{_body})) {
> delete $c->request->{_body};
> }
>
>
> So I have a couple of questions:
> - is there a better way to 'fix' this?
> - has anyone else seen this?

I'm guessing from the above that you're also getting a
HTTP::Body::OctetStream object?

I've seen something similar to this. My processes don't appear to
leak file descriptors, but I _do_ leak temporary files (i.e. they
don't get cleaned up).

As this is happening for me, I thought I'd knock up some tests of the
unlink behavior (attached). They're not quite perfect yet, but a good
start at testing this specific case.

This test however steadfastly refuses to fail (on my laptop, haven't
tested at work yet), however if I alter the File::Temp->new call in
HTTP::Body::OctetStream to add UNLINK => 0, it fails as I'd expect
(so proving at least that it's testing what I think it is testing).

The only things I can think of which could be causing this are:
* Really old File::Temp versions (is UNLINK new behavior) - I'll
check this out when I get into work tomorrow.
* $File::Temp::KEEP_ALL = 1 - would it be sane to localise this
variable to 0 in HTTP::Body to stop users shooting themselves in the
foot?

Anyone else got any suggestions or ideas about what may be causing this?

Cheers
t0m
Re: Leaking files / file descriptors in HTTP::Body.. [ In reply to ]
On Oct 6, 2008, at 5:49 PM, Tomas Doran wrote:

> Switching this to the dev list as it is a bit more involved.
>
> On 6 Oct 2008, at 15:30, James R. Leu wrote:
>>
> <snip>
>> ... or running out of file descriptors. Do a 'lsof -p <pid>' where
>> PID
>> is that of the httpd children. I see a file descriptor leak in
>> Catalyst::Engine::finalize_body in my cat app.
>>
>> See this post for more info:
>>
>> http://lists.scsys.co.uk/pipermail/catalyst/2007-November/015817.html
>
> and to quote that:
>
> > The contents of the
> > files is the JSON data for the API calls I'm making from the
> client back
> > to the server. I've traced the origins of the files back to
> Catalyst::Engine
> > (more specifically HTTP::Body's use of File::Temp). If I add the
> following
> > 'hack' to the end of Catalyst::Engine::finalize_body the files in /
> tmp
> > get closed and I no longer experience the file handle exhaustion:
> >
> > if (defined($c->request->{_body})) {
> > delete $c->request->{_body};
> > }
> >
> >
> > So I have a couple of questions:
> > - is there a better way to 'fix' this?
> > - has anyone else seen this?
>
> I'm guessing from the above that you're also getting a
> HTTP::Body::OctetStream object?
>
> I've seen something similar to this. My processes don't appear to
> leak file descriptors, but I _do_ leak temporary files (i.e. they
> don't get cleaned up).
>
> As this is happening for me, I thought I'd knock up some tests of
> the unlink behavior (attached). They're not quite perfect yet, but a
> good start at testing this specific case.
>
> This test however steadfastly refuses to fail (on my laptop, haven't
> tested at work yet), however if I alter the File::Temp->new call in
> HTTP::Body::OctetStream to add UNLINK => 0, it fails as I'd expect
> (so proving at least that it's testing what I think it is testing).
>
> The only things I can think of which could be causing this are:
> * Really old File::Temp versions (is UNLINK new behavior) - I'll
> check this out when I get into work tomorrow.
> * $File::Temp::KEEP_ALL = 1 - would it be sane to localise this
> variable to 0 in HTTP::Body to stop users shooting themselves in the
> foot?
>
> Anyone else got any suggestions or ideas about what may be causing
> this?

Currently you need to manually clean up any temp files HTTP::Body
creates that are not the result of a form upload. I think we'll
probably change this at some point, since it really should not be your
problem to clean up the files, but Catalyst's.

_______________________________________________
Catalyst-dev mailing list
Catalyst-dev@lists.scsys.co.uk
http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst-dev
Re: Leaking files / file descriptors in HTTP::Body.. [ In reply to ]
On 7 Oct 2008, at 00:17, Andy Grundman wrote:
>
> On Oct 6, 2008, at 5:49 PM, Tomas Doran wrote:
>>
>> As this is happening for me, I thought I'd knock up some tests of
>> the unlink behavior (attached). They're not quite perfect yet, but
>> a good start at testing this specific case.
>>
>> This test however steadfastly refuses to fail (on my laptop,
>> haven't tested at work yet), however if I alter the File::Temp-
>> >new call in HTTP::Body::OctetStream to add UNLINK => 0, it fails
>> as I'd expect (so proving at least that it's testing what I think
>> it is testing).
>>
>> The only things I can think of which could be causing this are:
>> * Really old File::Temp versions (is UNLINK new behavior) - I'll
>> check this out when I get into work tomorrow.
>> * $File::Temp::KEEP_ALL = 1 - would it be sane to localise this
>> variable to 0 in HTTP::Body to stop users shooting themselves in
>> the foot?
>>
>> Anyone else got any suggestions or ideas about what may be causing
>> this?
>
> Currently you need to manually clean up any temp files HTTP::Body
> creates that are not the result of a form upload. I think we'll
> probably change this at some point, since it really should not be
> your problem to clean up the files, but Catalyst's.
>

The temp files created by HTTP::Body::OctetStream _are_ correctly
cleaned up at the end of the request in my test case (as they go out
of scope at the end of the request), at least for the PUT use-case,
which is what I care about.

I'm pretty damn sure that this also happens correctly in real
applications (I'm going to do some more testing of this today).

_However_ - I have some production systems which are steadfastly
refusing to clean up temp files, and so there is something wiggy
going on. My current suspects (as previously noted) are really old
versions of some CPAN library in /usr/local, or some other piece of
code doing something nasty (like $File::Temp::KEEPALL = 1), which
ruins this behavior.

My current personal problems with retarded Debian boxen aside - there
is some logical inconsistency - some of the HTTP::Body sub-classes
say UNLINK => 0, and rely on Catalyst to clean up, whereas some say
nothing, rely on the default cleanup behavior. I guess that this
should in some way be unified.

I'm happy to supply a patch & tests to unify this behavior - but
which is the correct way to go?

If we allow the File::Temp objects to auto-cleanup, then if the user
decides (for some mad reason) to keep the File::Temp object around,
then the file doesn't get deleted (and that's obviously their
problem, as they've probably had too many hits on the crack pipe if
they want to do this) - behavior that won't work/works inconsistently
currently, as Catalyst will come along and unlink your file for you
(but only for certain request types)... Alternatively - we should
make _all_ the File::Temp objects with UNLINK => 0, but get Catalyst
to unlink the files manually (as it does currently for form uploads)
- this approach also means you're safe from $File::Temp::KEEPALL=1.

I don't have a strong opinion about which of these is better,
although the second feels less dangerous as the user is _forced_ to
copy / hard link the file if they want it preserved, which I think is
probably better behavior (i.e. less easy to shot your own foot).
Letting File::Temp in some way do the cleanup may be superior, as
when you look at it's code, it is dealing with a load of edge cases
to get the cleanup right for you - but I can't think of any cases of
the user having a legitimate reason to keep a reference to the
temporary object around after the end of a hit...

Cheers
t0m


_______________________________________________
Catalyst-dev mailing list
Catalyst-dev@lists.scsys.co.uk
http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst-dev