Mailing List Archive

Move FILE buckets on setaside (like/as a morphing bucket)?
For FILE buckets, the behaviour of apr_bucket_setaside() is to take
*full* ownership of the underlying apr_file, that is: allocate/move
the file/cleanup to the new pool AND set the old file's fd to -1 (see
apr_file_setaside, [1]).

This can be problematic in an output filters chain whenever a FILE
bucket gets splitted. Suppose that an FTYPE_RESOURCE output filter
splits a FILE bucket in two and sends the first one through the chain,
and then due to network congestion the core output filter needs to set
the FILE bucket aside. It happens that the second FILE bucket becomes
a dangling bucket (with fd == -1).
The case I'm facing is a bit different but similar. A handler is
saving large bodies into a file and inserts an input filter for
mod_proxy to use the file bucket as request body (much like
mod_request). The file is still needed/used after mod_proxy (in an
output filter), so I don't expect the file to be invalidated by
mod_proxy's output filtering.
It seems to be an issue for mod_http2 too, judging by this workaround [2].

I don't think the caller should have to handle this kind of things,
and for the output filters splitting case it's really too much to ask.

I can only think of one way to address this: move file buckets from
the user brigade to the setaside brigade in
ap_filter_setaside_brigade(), assuming correct lifetime for the bucket
and apr_file (like we do for buckets with length == -1 already).

After a look at our codebase, this would work as is, all of the FILE
buckets we pass through output filters are from r->pool, so the
lifetime is within the EOR scope (I'm not totally sure about the http2
beam case, Stefan?).

So, how about we do that?
Would look like the attached patch (which also removes the misleading
AP_BUCKET_IS_MORPHING() macro, as discussed previouly, and
talks/comments about opaque buckets for ->length == -1, and morphing
buckets for both opaque and FILE buckets, as an attempt to clarify
things..).

Regards,
Yann.

[1] https://github.com/apache/apr/blob/trunk/file_io/unix/filedup.c#L155
[2] https://github.com/apache/httpd/blob/trunk/modules/http2/h2_bucket_beam.c#L1019
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On 4/26/20 5:26 PM, Yann Ylavic wrote:

> So, how about we do that?
> Would look like the attached patch (which also removes the misleading

Did I miss the attachment? :-)
Or even better, can we have this as a PR against trunk on github? This ensure that it has run already through all the Travis stuff.

Regards

Rüdiger
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
> Am 26.04.2020 um 17:26 schrieb Yann Ylavic <ylavic.dev@gmail.com>:
>
> After a look at our codebase, this would work as is, all of the FILE
> buckets we pass through output filters are from r->pool, so the
> lifetime is within the EOR scope (I'm not totally sure about the http2
> beam case, Stefan?).

Lacking a name to describe the pool situtation in mod_http2, yet, let's use "crossing conntections" for now. The "main" connection (to the h2 client) and the one processing the h2 streams (requests) are side-by-side, living mostly in separate threads even. Buckets in request brigades need to "cross" these connections.

In my delusion of having understood all about pools, threads and filters, I made h2_bucket_beam for the purpose of avoiding copying bucket data, especially files,
across connections. And for files, it needs to apr_file_setaside() into the other connection. Which, I guess, only works for non-split file buckets. But I am no longer certain.

In an alternate universe, were I to do it again, I would not place HTTP/2 into the httpd process itself. Not only because it is tricky, but it also affects further optimizations of the HTTP/1.x case more difficult. As to be seen from this discussion.

- Stefan
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On Sun, Apr 26, 2020 at 05:26:02PM +0200, Yann Ylavic wrote:
> For FILE buckets, the behaviour of apr_bucket_setaside() is to take
> *full* ownership of the underlying apr_file, that is: allocate/move
> the file/cleanup to the new pool AND set the old file's fd to -1 (see
> apr_file_setaside, [1]).
>
> This can be problematic in an output filters chain whenever a FILE
> bucket gets splitted. Suppose that an FTYPE_RESOURCE output filter
> splits a FILE bucket in two and sends the first one through the chain,
> and then due to network congestion the core output filter needs to set
> the FILE bucket aside. It happens that the second FILE bucket becomes
> a dangling bucket (with fd == -1).

Hang on, I don't follow this, how does data->fd end up as -1 in this
case? The buckets split from the original FILE bucket have a common
->data since it is a refcounted bucket type. If any of them are
setaside the common ->data->fd is replaced with a new apr_file_t
allocated from a new (longer-lived) pool. No?

> The case I'm facing is a bit different but similar. A handler is
> saving large bodies into a file and inserts an input filter for
> mod_proxy to use the file bucket as request body (much like
> mod_request). The file is still needed/used after mod_proxy (in an
> output filter), so I don't expect the file to be invalidated by
> mod_proxy's output filtering.

I do agree that it is surprising behaviour that the apr_file_t * is
potentially invalidated after being wrapped in a FILE bucket though. Not
sure how to avoid it, IIRC this is some optimisation to avoid excessive
dup2() calls?

Regards, Joe
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On Mon, Apr 27, 2020 at 9:24 AM Ruediger Pluem <rpluem@apache.org> wrote:
>
> On 4/26/20 5:26 PM, Yann Ylavic wrote:
>
> > So, how about we do that?
> > Would look like the attached patch (which also removes the misleading
>
> Did I miss the attachment? :-)

Argh :/

> Or even better, can we have this as a PR against trunk on github? This ensure that it has run already through all the Travis stuff.

https://github.com/apache/httpd/pull/117

Thanks,
Yann.
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On Mon, Apr 27, 2020 at 11:11 AM Joe Orton <jorton@redhat.com> wrote:
>
> On Sun, Apr 26, 2020 at 05:26:02PM +0200, Yann Ylavic wrote:
> > For FILE buckets, the behaviour of apr_bucket_setaside() is to take
> > *full* ownership of the underlying apr_file, that is: allocate/move
> > the file/cleanup to the new pool AND set the old file's fd to -1 (see
> > apr_file_setaside, [1]).
> >
> > This can be problematic in an output filters chain whenever a FILE
> > bucket gets splitted. Suppose that an FTYPE_RESOURCE output filter
> > splits a FILE bucket in two and sends the first one through the chain,
> > and then due to network congestion the core output filter needs to set
> > the FILE bucket aside. It happens that the second FILE bucket becomes
> > a dangling bucket (with fd == -1).
>
> Hang on, I don't follow this, how does data->fd end up as -1 in this
> case? The buckets split from the original FILE bucket have a common
> ->data since it is a refcounted bucket type. If any of them are
> setaside the common ->data->fd is replaced with a new apr_file_t
> allocated from a new (longer-lived) pool. No?

The call chain is:
apr_bucket_setaside(bucket, newpool)

And since bucket->data is shared, all other buckets with the same data
get their filedes set to -1.

>
> > The case I'm facing is a bit different but similar. A handler is
> > saving large bodies into a file and inserts an input filter for
> > mod_proxy to use the file bucket as request body (much like
> > mod_request). The file is still needed/used after mod_proxy (in an
> > output filter), so I don't expect the file to be invalidated by
> > mod_proxy's output filtering.
>
> I do agree that it is surprising behaviour that the apr_file_t * is
> potentially invalidated after being wrapped in a FILE bucket though. Not
> sure how to avoid it, IIRC this is some optimisation to avoid excessive
> dup2() calls?

Yes I think so, that plus a new fd to take into account for ulimit nofile.
Alternatively we could play with apr_os_file_get()+apr_os_file_set(),
which will leave alone the fd and cleanup from the original pool, but
at that point I think we'd better let the caller responsible for the
lifetime..

Regards,
Yann.

>
> Regards, Joe
>
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On 4/27/20 2:29 PM, Yann Ylavic wrote:
> On Mon, Apr 27, 2020 at 11:11 AM Joe Orton <jorton@redhat.com> wrote:
>>
>> On Sun, Apr 26, 2020 at 05:26:02PM +0200, Yann Ylavic wrote:
>>> For FILE buckets, the behaviour of apr_bucket_setaside() is to take
>>> *full* ownership of the underlying apr_file, that is: allocate/move
>>> the file/cleanup to the new pool AND set the old file's fd to -1 (see
>>> apr_file_setaside, [1]).
>>>
>>> This can be problematic in an output filters chain whenever a FILE
>>> bucket gets splitted. Suppose that an FTYPE_RESOURCE output filter
>>> splits a FILE bucket in two and sends the first one through the chain,
>>> and then due to network congestion the core output filter needs to set
>>> the FILE bucket aside. It happens that the second FILE bucket becomes
>>> a dangling bucket (with fd == -1).
>>
>> Hang on, I don't follow this, how does data->fd end up as -1 in this
>> case? The buckets split from the original FILE bucket have a common
>> ->data since it is a refcounted bucket type. If any of them are
>> setaside the common ->data->fd is replaced with a new apr_file_t
>> allocated from a new (longer-lived) pool. No?
>
> The call chain is:
> apr_bucket_setaside(bucket, newpool)
> => file_bucket_setaside(bucket, newpool)
> => apr_file_setaside(&newfd, bucket->data->fd, newpool)
> => copy bucket->data->fd to newfd, move cleanup to newpool
> => bucket->data->fd->filedes = -1
> => bucket->data->fd = newfd

But isn't bucket->data->fd the same for the old and the new one?
So after bucket->data->fd = newfd bucket->data->fd(aka newfd)->filedes will have the correct value
for all buckets?

Regards

Rüdiger
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On Mon, Apr 27, 2020 at 02:29:55PM +0200, Yann Ylavic wrote:
> The call chain is:
> apr_bucket_setaside(bucket, newpool)
> => file_bucket_setaside(bucket, newpool)
> => apr_file_setaside(&newfd, bucket->data->fd, newpool)
> => copy bucket->data->fd to newfd, move cleanup to newpool
> => bucket->data->fd->filedes = -1
> => bucket->data->fd = newfd
>
> And since bucket->data is shared, all other buckets with the same data
> get their filedes set to -1.

You mentioned this was a problem for split buckets, which I don't get.
For a split FILE bucket bucket->data is common across copies, so this
seems fine, the last "bucket->data->fd = newfd;" updates _all_ the split
buckets.

I can see this is a problem *outside* of _split, so if you do something
like:

e1 = apr_brigade_insert_file(bb, fd, ...);
e2 = apr_brigade_insert_file(bb, fd, ...);

and then a later apr_bucket_setaside(e1) will break e2 as you describe,
because e1 and e2 _don't_ have a common ->data.

I can see in the list archive a discussion of apr_file_setaside() where
cliffw suggested apr_file_setaside() should work like apr_mmap_dup()
which doesn't appear to have this problem.

> > > The case I'm facing is a bit different but similar. A handler is
> > > saving large bodies into a file and inserts an input filter for
> > > mod_proxy to use the file bucket as request body (much like
> > > mod_request). The file is still needed/used after mod_proxy (in an
> > > output filter), so I don't expect the file to be invalidated by
> > > mod_proxy's output filtering.
> >
> > I do agree that it is surprising behaviour that the apr_file_t * is
> > potentially invalidated after being wrapped in a FILE bucket though. Not
> > sure how to avoid it, IIRC this is some optimisation to avoid excessive
> > dup2() calls?
>
> Yes I think so, that plus a new fd to take into account for ulimit nofile.
> Alternatively we could play with apr_os_file_get()+apr_os_file_set(),
> which will leave alone the fd and cleanup from the original pool, but
> at that point I think we'd better let the caller responsible for the
> lifetime..

I think within the constraint of the current APR/httpd buckets/filtering
API it is probably necessary to say that you cannot insert a FILE bucket
referring to the same apr_file_t * twice. Or you should dup2 inbetween.

That is clearly the implication of the apr_file_setaside() API warning
which extends to apr_bucket_setaside(). Alternatively you could say the
apr_file_setaside() implementation is broken and should work like
apr_mmap_dup().

(I think treating setaside as special for FILE is a bad road to start
down - I should be able copy and paste the FILE implementation as a new
MYFILE bucket type and have the core behave correctly, if not optimally)

Regards, Joe
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On Mon, Apr 27, 2020 at 3:04 PM Joe Orton <jorton@redhat.com> wrote:
>
> On Mon, Apr 27, 2020 at 02:29:55PM +0200, Yann Ylavic wrote:
> > The call chain is:
> > apr_bucket_setaside(bucket, newpool)
> > => file_bucket_setaside(bucket, newpool)
> > => apr_file_setaside(&newfd, bucket->data->fd, newpool)
> > => copy bucket->data->fd to newfd, move cleanup to newpool
> > => bucket->data->fd->filedes = -1
> > => bucket->data->fd = newfd
> >
> > And since bucket->data is shared, all other buckets with the same data
> > get their filedes set to -1.
>
> You mentioned this was a problem for split buckets, which I don't get.
> For a split FILE bucket bucket->data is common across copies, so this
> seems fine, the last "bucket->data->fd = newfd;" updates _all_ the split
> buckets.

Ah yes indeed, you are obviously correct.
I just extrapolated my issue (the file mangled by mod_proxy), and
thought the same could happen with splitted file buckets..

Sorry for the noise and taking your time, and thank you/Rüdiger for
your patience ;)

>
> That is clearly the implication of the apr_file_setaside() API warning
> which extends to apr_bucket_setaside(). Alternatively you could say the
> apr_file_setaside() implementation is broken and should work like
> apr_mmap_dup().

Actually what I did for my proxy case is an apr_os_file_get() +
apr_os_file_set() to create a file that mod_proxy can mangle at wish.
The goal was also that sendfile() be finally used.

>
> (I think treating setaside as special for FILE is a bad road to start
> down - I should be able copy and paste the FILE implementation as a new
> MYFILE bucket type and have the core behave correctly, if not optimally)

+1


Thanks,
Yann.
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On Mon, Apr 27, 2020 at 2:09 PM Yann Ylavic <ylavic.dev@gmail.com> wrote:
>
> https://github.com/apache/httpd/pull/117

I closed it, how about the second patch there though?

https://github.com/apache/httpd/pull/117/commits/cdf4c4a3c61340f061267cc70c456d9469e9faac
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On Mon, Apr 27, 2020 at 3:43 PM Yann Ylavic <ylavic.dev@gmail.com> wrote:
>
> On Mon, Apr 27, 2020 at 2:09 PM Yann Ylavic <ylavic.dev@gmail.com> wrote:
> >
> > https://github.com/apache/httpd/pull/117
>
> I closed it, how about the second patch there though?
>
> https://github.com/apache/httpd/pull/117/commits/cdf4c4a3c61340f061267cc70c456d9469e9faac

Actually the attached one (with no mention of morphing since we are
not talking about FILE buckets anymore).
I have chosen the term "opaque" because it's concise (e.g.
opaque_buckets_in_brigade), but it's just in comments now (and
accompanied by "(length == -1)" to specify what it means). The goal
being to axe AP_BUCKET_IS_MORPHING() which I introduced recently..
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On Mon, Apr 27, 2020 at 3:57 PM Yann Ylavic <ylavic.dev@gmail.com> wrote:
>
> On Mon, Apr 27, 2020 at 3:43 PM Yann Ylavic <ylavic.dev@gmail.com> wrote:
> >
> > On Mon, Apr 27, 2020 at 2:09 PM Yann Ylavic <ylavic.dev@gmail.com> wrote:
> > >
> > > https://github.com/apache/httpd/pull/117
> >
> > I closed it, how about the second patch there though?
> >
> > https://github.com/apache/httpd/pull/117/commits/cdf4c4a3c61340f061267cc70c456d9469e9faac
>
> Actually the attached one (with no mention of morphing since we are
> not talking about FILE buckets anymore).
> I have chosen the term "opaque" because it's concise (e.g.
> opaque_buckets_in_brigade), but it's just in comments now (and
> accompanied by "(length == -1)" to specify what it means). The goal
> being to axe AP_BUCKET_IS_MORPHING() which I introduced recently..

Argh.. now attached.
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On Mon, Apr 27, 2020 at 03:57:37PM +0200, Yann Ylavic wrote:
> On Mon, Apr 27, 2020 at 3:43 PM Yann Ylavic <ylavic.dev@gmail.com> wrote:
> >
> > On Mon, Apr 27, 2020 at 2:09 PM Yann Ylavic <ylavic.dev@gmail.com> wrote:
> > >
> > > https://github.com/apache/httpd/pull/117
> >
> > I closed it, how about the second patch there though?
> >
> > https://github.com/apache/httpd/pull/117/commits/cdf4c4a3c61340f061267cc70c456d9469e9faac
>
> Actually the attached one (with no mention of morphing since we are
> not talking about FILE buckets anymore).
> I have chosen the term "opaque" because it's concise (e.g.
> opaque_buckets_in_brigade), but it's just in comments now (and
> accompanied by "(length == -1)" to specify what it means). The goal
> being to axe AP_BUCKET_IS_MORPHING() which I introduced recently..

+1 from me for using the term "opaque buckets" as a synonym for
"e->length == (apr_size_t)-1" aka "buckets with indeterminate length".

Regards, Joe
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On Mon, Apr 27, 2020 at 4:11 PM Joe Orton <jorton@redhat.com> wrote:
>
> +1 from me for using the term "opaque buckets" as a synonym for
> "e->length == (apr_size_t)-1" aka "buckets with indeterminate length".

OK thanks, just committed "something" in r1877077 because I keep
messing up with my attachments today.
Let's discuss from "that" if needed..
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On Mon, Apr 27, 2020 at 04:27:56PM +0200, Yann Ylavic wrote:
> On Mon, Apr 27, 2020 at 4:11 PM Joe Orton <jorton@redhat.com> wrote:
> >
> > +1 from me for using the term "opaque buckets" as a synonym for
> > "e->length == (apr_size_t)-1" aka "buckets with indeterminate length".
>
> OK thanks, just committed "something" in r1877077 because I keep
> messing up with my attachments today.
> Let's discuss from "that" if needed..

That change is definitely better. Isn't the code still making the
assumption that FILE is the only possible non-opaque morphing bucket
type?

http://svn.apache.org/viewvc/httpd/httpd/trunk/server/util_filter.c?revision=1877077&view=markup#l1108

...and by "non-opaque morphing bucket type" the distinction intended
here is... "representing a determinate length bucket type which is not
fully mapped into RAM until you try read()ing it until exhaustion".

(!)
Re: Move FILE buckets on setaside (like/as a morphing bucket)? [ In reply to ]
On Mon, Apr 27, 2020 at 6:17 PM Joe Orton <jorton@redhat.com> wrote:
>
> On Mon, Apr 27, 2020 at 04:27:56PM +0200, Yann Ylavic wrote:
> > On Mon, Apr 27, 2020 at 4:11 PM Joe Orton <jorton@redhat.com> wrote:
> > >
> > > +1 from me for using the term "opaque buckets" as a synonym for
> > > "e->length == (apr_size_t)-1" aka "buckets with indeterminate length".
> >
> > OK thanks, just committed "something" in r1877077 because I keep
> > messing up with my attachments today.
> > Let's discuss from "that" if needed..
>
> That change is definitely better. Isn't the code still making the
> assumption that FILE is the only possible non-opaque morphing bucket
> type?
>
> http://svn.apache.org/viewvc/httpd/httpd/trunk/server/util_filter.c?revision=1877077&view=markup#l1108
>
> ...and by "non-opaque morphing bucket type" the distinction intended
> here is... "representing a determinate length bucket type which is not
> fully mapped into RAM until you try read()ing it until exhaustion".

I wish you don't want me to find a variable name for this :)

But yes, the code assumes that non-opaque buckets can be passed to
ap_save_brigade() without mapping more than necessary to memory (by
more than necessary I mean e.g. TRANSIENT to HEAP is fine, but
morphing an fd to HEAP isn't). So the assumption is mainly that each
bucket (besides opaque) implements ->setaside() in a "sane" manner.
Are you aware of a non-opaque bucket type for which this is an issue?

Note that there is nothing special about FILE buckets here, it's just
that we know that they don't consume memory until read, so we don't
account for their ->length against memory limits
(flush_max_threshold). We don't do that for other buckets types, so
their ->length do count.

Regards,
Yann.