Mailing List Archive: svn commit: r523499 - /perl/modperl/docs/trunk/src/docs/2.0/user/handlers/filters.pod

Author: stas
Date: Wed Mar 28 16:02:30 2007
New Revision: 523499

URL: http://svn.apache.org/viewvc?view=rev&rev=523499
Log:
more edits from the book

Modified:
perl/modperl/docs/trunk/src/docs/2.0/user/handlers/filters.pod

Modified: perl/modperl/docs/trunk/src/docs/2.0/user/handlers/filters.pod
URL: http://svn.apache.org/viewvc/perl/modperl/docs/trunk/src/docs/2.0/user/handlers/filters.pod?view=diff&rev=523499&r1=523498&r2=523499
==============================================================================
--- perl/modperl/docs/trunk/src/docs/2.0/user/handlers/filters.pod (original)
+++ perl/modperl/docs/trunk/src/docs/2.0/user/handlers/filters.pod Wed Mar 28 16:02:30 2007
@@ -16,7 +16,7 @@

-=head1 Your First Filter
+=head1 Introducing Filters

You certainly already know how filters work, because you
encounter filters so often in real life. If you are unfortunate to
@@ -33,7 +33,8 @@
<img src="filter_life_cigarrette.jpg" width="95" height="116"
align="middle" alt="cigarrette filter"><br><br>

-If you are a coffee gourmand, you have certainly tried a filter coffee:
+If you are a coffee gourmand, you have certainly tried a filter
+coffee:

=for html
<img src="filter_life_coffee.jpg" width="179" height="190"
@@ -140,24 +141,26 @@
}
1;

-Next we configure Apache to apply the C<MyApache2::FilterObfuscate>
-filter to all requests that get mapped to files with an I<".html">
-extension:
+The directives below configure Apache to apply the
+C<MyApache2::FilterObfuscate> filter to all requests that get mapped
+to files with an I<".html"> extension:

<Files ~ "\.html">
PerlOutputFilterHandler MyApache2::FilterObfuscate
</Files>

-Filter handlers are similar to HTTP handlers, they are expected to
-return C<Apache2::Const::OK> or C<Apache2::Const::DECLINED>, but instead of receiving
-C<$r> (the request object) as the first argument, they receive C<$f>
-(the filter object).
+Filters are expected to return C<Apache2::Const::OK> or
+C<Apache2::Const::DECLINED>. But instead of receiving C<$r> (the
+request object) as the first argument, they receive C<$f> (the filter
+object). The filter object is described later in this chapter.

-The filter starts by unsetting of the C<Content-Length> response
+The filter starts by unsetting the C<Content-Length> response
header, because it modifies the length of the response body (shrinks
-it). If the response handler had set the C<Content-Length> header and
-the filter hasn't unset it, the client may have problems receiving the
-response since it'd expect more data than it was sent.
+it). If the response handler sets the C<Content-Length> header and
+the filter doesn't unset it, the client may have problems receiving the
+response since it will expect more data than it was sent. I<Setting the
+Content-Length Header> below describes how to set the Content-Length
+header if you need to.

The core of this filter is a read-modify-print expression in a while
loop. The logic is very simple: read at most C<BUFF_LEN> characters of
@@ -166,17 +169,17 @@
data may come from a response handler, or from an upstream filter. The
output data goes to the next filter in the output chain. Even though
in this example we haven't configured any more filters, internally
-Apache by itself uses several core filters to manipulate the data and
+Apache itself uses several core filters to manipulate the data and
send it out to the client.

-As we are going to explain in great detail in the next sections, the
+As we are going to explain in detail in the following sections, the
same filter may be called many times during a single request, every
-time receiving a chunk of data. For example if the POSTed request data
-is 64k long, an input filter could be invoked 8 times, each time
-receiving 8k of data. The same may happen during response phase, where
-an upstream filter may split 64k output in 8 8k chunks. The while loop
-that we just saw is going to read each of these 8k in 8 calls, since
-it requests 1k on every C<read()> call.
+time receiving a subsequent chunk of data. For example if the POSTed
+request data is 64k long, an input filter could be invoked 8 times,
+each time receiving 8k of data. The same may happen during the
+response phase, where an upstream filter may split 64k of output into
+eight 8k chunks. The while loop that we just saw is going to read each of
+these 8k chunks in 8 calls, since it requests 1k on every C<read()> call.

Since it's enough to unset the C<Content-Length> header when the
filter is called the first time, we need to have some flag telling us
@@ -188,44 +191,31 @@
$f->ctx(1);
}

-the C<unset()> call will be made only on the first filter call for
-each request. Of course you can store any kind of a Perl data
+The C<unset()> call will be made only on the first filter call for
+each request. You can store any kind of Perl data
structure in C<L<$f-E<gt>ctx|docs::2.0::api::Apache2::Filter/C_ctx_>>
and retrieve it later in subsequent filter invocations of the same
-request. We will show plenty of examples using this method in the
+request. There are several examples using this method in the
following sections.

-Of course the C<MyApache2::FilterObfuscate> filter logic should take
-into account situations where removing new line characters will break
-the correct rendering, as is the case if there are multi-line
-C<E<lt>preE<gt>>...C<E<lt>/preE<gt>> entries, but since it escalates
-the complexity of the filter, we will disregard this requirement for
-now.
-
-A positive side effect of this obfuscation algorithm is in shortening
-the amount of the data sent to the client. If you want to look at the
-production ready implementation, which takes into account the HTML
-markup specifics, the C<Apache::Clean> module, available from CPAN,
-does just that.
+To be truly useful, the C<MyApache2::FilterObfuscate> filter logic
+should take into account situations where removing new line characters
+will make the document render incorrectly in the browser. As we
+mentioned above, this is the case if there are multi-line
+C<E<lt>preE<gt>>...C<E<lt>/preE<gt>> entries. Since this increases the
+complexity of the filter, we will disregard this requirement for now.
+
+A positive side-effect of this obfuscation algorithm is that it
+reduces the amount of data sent to the client. The
+C<Apache::Clean> module, available from the CPAN, provides a
+production-ready implementation of this technique which takes into
+account the HTML markup specifics.

-mod_perl I/O filtering follows the Perl's principle of making simple
+mod_perl I/O filtering follows the Perl principle of making simple
things easy and difficult things possible. You have seen that it's
-trivial to write simple filters. As you read through this tutorial you
-will see that much more difficult things are possible, even though a
-more elaborated code will be needed.
-
-
-
-
-
-
-
-
-
-
-
-
-
+trivial to write simple filters. As you read through this chapter you
+will see that much more difficult things are possible, though the
+code required is more elaborate.

=head1 I/O Filtering Concepts
@@ -235,15 +225,17 @@

=head2 Two Methods for Manipulating Data

-Apache 2.0 considers all incoming and outgoing data as chunks of
-information, disregarding their kind and source or storage
+mod_perl provides two interfaces to filtering: a direct bucket
+brigade manipulation interface and a simpler, stream-oriented
+interface. Apache 2.0 considers all incoming and outgoing data as
+chunks of information, disregarding their kind and source or storage
methods. These data chunks are stored in I<buckets>, which form
L<bucket
brigades|docs::2.0::user::handlers::intro/Bucket_Brigades>. Input and
-output filters massage the data in I<bucket brigades>. Response and
-protocol handlers also receive and send data using bucket brigades,
-though in most cases this is hidden behind wrappers, such as C<read()>
-and C<print()>.
+output filters massage the data in these I<bucket brigades>. Response
+and protocol handlers also receive and send data using bucket
+brigades, though in most cases this is hidden behind wrappers, such as
+C<read()> and C<print()>.

mod_perl 2.0 filters can directly manipulate the bucket brigades or
use the simplified streaming interface where the filter object acts
@@ -271,94 +263,95 @@
=head2 HTTP Request Versus Connection Filters

HTTP request filters are applied when Apache serves an HTTP request.
-
HTTP request input filters get invoked on the body of the HTTP request
only if the body is consumed by the content handler. HTTP request
-headers are not passed through the HTTP request input filters.
+headers are not passed through the HTTP request input filters. HTTP
+response output filters get invoked on the body of the HTTP response
+if the content handler has generated one. HTTP response headers are
+not passed through the HTTP response output filters.

-HTTP response output filters get invoked on the body of the HTTP
-response if the content handler has generated one. HTTP response
-headers are not passed through the HTTP response output filters.
-
-Connection level filters are applied at the connection level.
-
-A connection may be configured to serve one or more HTTP requests, or
+It is also possible to apply filters at the connection level. A
+connection may be configured to serve one or more HTTP requests, or
handle other protocols. Connection filters see all the incoming and
outgoing data. If an HTTP request is served, connection filters can
modify the HTTP headers and the body of request and response. If a
-different protocol is served over connection (e.g. IMAP), the data
-could have a completely different pattern, than the HTTP protocol
-(headers + body).
+different protocol is served over the connection (e.g., IMAP), the
+data could have a completely different pattern than the HTTP protocol
+(headers + body). Thus, the only difference between connection filters
+and request filters is that connection filters see everything from the
+request, i.e., the headers and the body, whereas request filters see
+only the body.
+

-Apache supports several other filter types, which mod_perl 2.0 may
-support in the future.
+mod_perl 2.0 may
+support several other Apache filter types in the future.

=head2 Multiple Invocations of Filter Handlers

Unlike other Apache handlers, filter handlers may get invoked more
than once during the same request. Filters get invoked as many times
-as the number of bucket brigades sent from an upstream filter or
-a content provider.
+as the number of bucket brigades sent from an upstream filter or a
+content provider.

-For example if a content generation handler sends a string, then
-forces a flush, and then sends more data:
+For example, a content generation handler may send a string, then
+force a flush, and then send more data:

# assuming buffered STDOUT ($|==0)
$r->print("foo");
$r->rflush;
$r->print("bar");

-Apache will generate one bucket brigade with two buckets (there are
-several types of buckets which contain data, one of them is
-I<transient>):
+In this case, Apache will generate one bucket brigade with two
+buckets. There are several types of buckets which contain data; in
+this example, the data bucket is of type I<transient>:

bucket type data
----------------------
1st transient foo
2nd flush

-and send it to the filter chain. Then assuming that no more data was
-sent after C<print("bar")>, it will create a last bucket brigade
-containing data:
+Apache sends this bucket brigade to the filter chain. Then, assuming
+no more data is sent after C<print("bar")>, it will create a last
+bucket brigade, with one bucket, containing data:

bucket type data
----------------------
1st transient bar

-and send it to the filter chain. Finally it'll send yet another bucket
-brigade with the EOS bucket indicating that there will be no more data
-sent:
+and send it to the filter chain. Finally it will send yet another
+bucket brigade with the EOS bucket indicating that there will be no
+more data sent:

bucket type data
----------------------
1st eos

-META: EOS buckets are valid for Request filters. For Connection
-filters, you will get one only in the response filters only at the end
-of the connection. See the trick how to workaround this in
-C<Apache2::Filter::HTTPHeadersFixup>. Need to mention that in a few
-other places in this doc.
-
-Notice that the EOS bucket may come attached to the last bucket
-brigade with data, instead of coming in its its own bucket brigade.
-Filters should never make an assumption that the EOS bucket is
-arriving alone in a bucket brigade. Therefore the first output filter
-will be invoked two or three times (three times if EOS is coming in
-its own brigade), depending on the number of bucket brigades sent by
-the response handler.
-
-A user may install an upstream filter, and that filter may decide to
-insert extra bucket brigades or collect all the data in all bucket
-brigades passing through it and send it all down in one brigade.
-What's important to remember is when coding a filter, one should never
-assume that the filter is always going to be invoked once, or a fixed
-number of times. Neither one can make assumptions on the way the data
-is going to come in. Therefore a typical filter handler may need to
-split its logic in three parts.
-
-Jumping ahead we will show some pseudo-code that represents all three
-parts. This is how a typical stream-oriented filter handler looks
-like:
+N<EOS buckets are valid for request filters. For connection filters,
+you will get one only in the response filters and only at the end of
+the connection. You can see a sample workaround for this situation in
+the module C<Apache2::Filter::HTTPHeadersFixup> available on the
+CPAN.>
+
+Note that the EOS bucket may come attached to the last bucket brigade
+with data, instead of coming in its own bucket brigade. The location
+depends on the other Apache modules manipulating the buckets and can
+vary. Filters should never assume that the EOS bucket is arriving
+alone in a bucket brigade. Therefore the first output filter will be
+invoked two or three times (three times if EOS is coming in its own
+brigade), depending on the number of bucket brigades sent by the
+response handler.
+
+An upstream filter can modify the bucket brigades, by inserting extra
+bucket brigades or even by collecting the data from multiple bucket
+brigades and sending it along in just one brigade. Therefore, when
+coding a filter, never assume that the filter is always going to be
+invoked once, or any fixed number of times. Neither can you assume how
+the data is going to come in. To accommodate these situations, a
+typical filter handler may need to split its logic into three parts.
+
+To illustrate, below is some pseudo-code that represents all three
+parts, i.e., initialization, processing, and finalization. This is a
+typical stream-oriented filter handler.

sub handler {
my $f = shift;
@@ -395,12 +388,12 @@

=item 1 Initialization

-During the initialization, the filter runs all the code that should be
-performed only once across multiple invocations of the filter (this is
-during a single request). The filter context is used to accomplish
-that task. For each new request the filter context is created before
-the filter is called for the first time and its destroyed at the end
-of the request.
+During initialization, the filter runs code that you want executed
+only once, even if there are multiple invocations of the filter (that
+is, during a single request). The filter context (C<$f-E<gt>ctx>) is used
+as a flag to accomplish this task. For each new request the filter
+context is created before the filter is called for the first time, and
+it is destroyed at the end of the request.

unless ($f->ctx) {
init($f);
@@ -410,8 +403,8 @@
When the filter is invoked for the first time
C<L<$f-E<gt>ctx|docs::2.0::api::Apache2::Filter/C_ctx_>> returns
C<undef> and the custom function init() is called. This function
-could, for example, retrieve some configuration data, set in
-I<httpd.conf> or initialize some datastructure to its default value.
+could, for example, retrieve some configuration data set in
+I<httpd.conf> or initialize some data structure to a default value.

To make sure that init() won't be called on the following invocations,
we must set the filter context before the first invocation is
@@ -419,9 +412,11 @@

$f->ctx(1);

-In practice, the context is not just served as a flag, but used to
-store real data. For example the following filter handler counts the
-number of times it was invoked during a single request:
+In practice, the context is used not just as a flag, but to store real
+data. You can use it to hold any data structure and pass it between
+successive filter invocations. For example, the following filter
+handler counts the number of times it was invoked during a single
+request:

sub handler {
my $f = shift;
@@ -435,14 +430,14 @@
}

Since this filter handler doesn't consume the data from the upstream
-filter, it's important that this handler returns
+filter, it's important that this handler return
C<Apache2::Const::DECLINED>, in which case mod_perl passes the current
bucket brigade to the next filter. If this handler returns
-C<Apache2::Const::OK>, the data will be simply lost. And if that data
-included a special EOS token, this may wreck havoc.
+C<Apache2::Const::OK>, the data will be lost, and if that data
+included a special EOS token, this may cause problems.

Unsetting the C<Content-Length> header for filters that modify the
-response body length is a good example of the code to be used in the
+response body length is a good example of code to run in the
initialization phase:

unless ($f->ctx) {
@@ -450,7 +445,7 @@
$f->ctx(1);
}

-We will see more of initialization examples later in this chapter.
+We will see more initialization examples later in this chapter.

=item 2 Processing

@@ -472,24 +467,24 @@
}

Here the filter operates only on a single bucket brigade. Since it
-manipulates every character separately the logic is really simple.
+manipulates every character separately, the logic is simple.

-In more complicated filters the filters may need to buffer data first
-before the transformation can be applied. For example if the filter
-operates on html tokens (e.g., 'E<lt>img src="me.jpg"E<gt>'), it's
+In more complicated situations, a filter may need to buffer data
+before the transformation can be applied. For example, if the filter
+operates on HTML tokens (e.g., 'E<lt>img src="me.jpg"E<gt>'), it's
possible that one brigade will include the beginning of the token
('E<lt>img ') and the remainder of the token ('src="me.jpg"E<gt>')
will come in the next bucket brigade (on the next filter
-invocation). In certain cases it may involve more than two bucket
-brigades to get the whole token. In such a case the filter will have
-to store the remainder of unprocessed data in the filter context and
-then reuse it on the next invocation. Another good example is a filter
-that performs data compression (compression is usually effective only
-when applied to relatively big chunks of data), so if a single bucket
+invocation). To operate on the token as a whole, you would need to
+capture each piece over several invocations. To do so, you can store
+the unprocessed data in the filter context and then access it again on
+the next invocation.
+
+Another good example of the need to buffer data is a filter that
+performs data compression, because compression is usually effective
+only when applied to relatively big chunks of data. If a single bucket
brigade doesn't contain enough data, the filter may need to buffer the
-data in the filter context till it collects enough of it.
-
-We will see the implementation examples in this chapter.
+data in the filter context until it collects enough to compress it.

=item 3 Finalization

@@ -498,33 +493,33 @@
data. As mentioned earlier, Apache indicates this event by a special
end of stream "token", represented by a bucket of type C<EOS>. If the
filter is using the streaming interface, rather than manipulating the
-bucket brigades directly, and it was calling read() in a while loop,
-it can check whether this is the last time it's invoked, using the
-C<$f-E<gt>seen_eos> method:
+bucket brigades directly, and it was calling C<read()> in a while
+loop, it can check for the EOS token using the C<$f-E<gt>seen_eos>
+method:

if ($f->seen_eos) {
finalize($f);
}

-This check should be done at the end of the filter handler, because
-sometimes the EOS "token" comes attached to the tail of data (the last
-invocation gets both the data and EOS) and sometimes it comes all
-alone (the last invocation gets only EOS). So if this test is
+This check should be done at the end of the filter handler because the
+EOS token can come attached to the tail of some data or all alone such
+that the last invocation gets only the EOS token. If this test is
performed at the beginning of the handler and the EOS bucket was sent
-in together with the data, the EOS event may be missed and filter
+in together with the data, the EOS event may be missed and the filter
won't function properly.

-Jumping ahead, filters, directly manipulating bucket brigades, have to
-look for a bucket whose type is C<EOS> to accomplish this. We will see
-examples later in the chapter.
+Filters that directly manipulate bucket brigades must manually look
+for a bucket whose type is C<EOS>. There are examples of this method
+later in the chapter.

=back

-Some filters may need to deploy all three parts of the described
-logic, others will need to do only initialization and processing, or
-processing and finalization, while the simplest filters might perform
-only the normal processing (as we saw in the example of the filter
-handler that lowers the case of the characters going through it).
+While not all filters need to perform all of these steps, this is a
+good model to keep in mind while working on your filter handlers.
+Since filters are called multiple times per request, you will likely
+use these steps, with initialization, processing, and finalization, on
+all but the simplest filters.
+

=head2 Blocking Calls

@@ -535,16 +530,16 @@
upstream filter or when the bucket brigade is passed to the downstream
filter.

-First of all, the input and output filters differ in the ways they
-acquire the bucket brigades (which includes the data that they
-filter). Even though when a streaming API is used the difference can't
-be seen, it's important to understand how things work
-underneath. Therefore we are going to show examples of transparent
-filters, which pass data through them unmodified. Instead of reading
-the data in and printing it out the bucket brigades are now passed as
-is.
+Input and output filters differ in the ways they acquire the bucket
+brigades, and thus in how blocking is handled. Each type is described
+separately below. Although you can't see the difference when using the
+streaming API, it's important to understand how things work
+underneath. Therefore the examples below are transparent filters,
+passing data through them unmodified. Instead of reading the data in
+and printing it out, the bucket brigades are passed as is. This makes
+it easier to observe the blocking behavior.

-Here is a code for a transparent input filter:
+The first example is a transparent input filter:

#file:MyApache2/FilterTransparent.pm (first part)
#-----------------------------------------------
@@ -567,56 +562,59 @@
When the input filter I<in()> is invoked, it first asks the upstream
filter for the next bucket brigade (using the C<get_brigade()>
call). That upstream filter is in turn going to ask for the bucket
-brigade from the next upstream filter in chain, etc., till the last
-filter (called C<core_in>), that reads from the network is
-reached. The C<core_in> filter reads, using a socket, a portion of the
-incoming data from the network, processes it and sends it to its
-downstream filter, which will process the data and send it to its
-downstream filter, etc., till it reaches the very first filter who has
-asked for the data. (In reality some other handler triggers the
-request for the bucket brigade, e.g., an HTTP response handler, or a
-protocol module, but for our discussion it's good enough to assume
+brigade from the next upstream filter and so on up the chain, until
+the last filter (called C<core_in>), the one that reads from the
+network, is reached. The C<core_in> filter reads, using a socket, a
+portion of the incoming data from the network, processes it, and sends
+it to its downstream filter. That filter processes the data and sends
+it to its downstream filter, etc., until it reaches the first filter
+that requested the data. (In reality some other handler triggers the
+request for the bucket brigade, such as an HTTP response handler or a
+protocol module, but for this discussion it's sufficient to assume
that it's the first filter that issues the C<get_brigade()> call.)

-The following diagram depicts a typical input filters chain data flow
+The following diagram depicts a typical input filter data flow
in addition to the program control flow.

=for html
<img src="in_filter_stream.gif" width="659" height="275"
align="middle" alt="input filter data flow"><br><br>

-The black- and white-headed arrows show when the control is switched
-from one filter to another. In addition the black-headed arrows show
-the actual data flow. The diagram includes some pseudo-code, both for
-in Perl for the mod_perl filters and in C for the internal Apache
-filters. You don't have to understand C to understand this
-diagram. What's important to understand is that when input filters are
-invoked they first call each other via the C<get_brigade()> call and
-then block (notice the brick wall on the diagram), waiting for the
-call to return. When this call returns all upstream filters have
-already completed finishing their filtering task.
+The black- and white-headed arrows show when the control is passed
+from one filter to another. In addition, the black-headed arrows show
+the actual data flow. The diagram includes some pseudo-code, in Perl
+for the mod_perl filters and in C for the internal Apache filters. You
+don't have to understand C to understand this diagram. What's
+important to understand is that when input filters are invoked, they
+first call each other via the C<get_brigade()> call and then block
+(notice the brick wall on the diagram), waiting for the call to
+return. When this call returns, all upstream filters have already
+completed their filtering task on the bucket brigade.

-As mentioned earlier, the streaming interface hides these details,
-however the first C<$f-E<gt>read()> call will block, as underneath it
+As mentioned earlier, the streaming interface hides the details,
+but the first C<$f-E<gt>read()> call will block as the layer under it
performs the C<get_brigade()> call.

-The diagram shows a part of the actual input filter chain for an HTTP
-request, the C<...> shows that there are more filters in between the
-mod_perl filter and C<http_in>.
+The diagram shows only part of the actual input filter chain for an
+HTTP request. The C<...> indicates that there are more filters in
+between the mod_perl filter and C<http_in>.

Now let's look at what happens in the output filters chain. Here the
-first filter acquires the bucket brigades containing the response
-data, from the content handler (or another protocol handler if we
-aren't talking HTTP), it then may apply some modification and pass the
-data to the next filter (using the C<pass_brigade()> call), which in
-turn applies its modifications and sends the bucket brigade to the
-next filter, etc., all the way down to the last filter (called
-C<core>) which writes the data to the network, via the socket the
-client is listening to. Even though the output filters don't have to
-wait to acquire the bucket brigade (since the upstream filter passes
-it to them as an argument), they still block in a similar fashion to
-input filters, since they have to wait for the C<pass_brigade()> call
-to return.
+first filter acquires the bucket brigades containing the response data
+from the content handler (or another protocol handler if we aren't
+talking HTTP). It may then make some modification and pass the data to
+the next filter (using the C<pass_brigade()> call), which in turn
+applies its modifications and sends the bucket brigade to the next
+filter, etc. This continues all the way down to the last filter
+(called C<core>) which writes the data to the network via the socket
+the client is listening to.
+
+Even though the output filters don't have to wait to acquire the
+bucket brigade (since the upstream filter passes it to them as an
+argument), they still block in a similar fashion to input filters,
+since they have to wait for the C<pass_brigade()> call to return. In
+this case, they are waiting to pass the data along rather than waiting
+to receive it.

Here is an example of a transparent output filter:

@@ -632,13 +630,13 @@
}
1;

-The I<out()> filter passes C<$bb> to the downstream filter unmodified
-and if you add debug prints before and after the C<pass_brigade()>
-call and configure the same filter twice, the debug print will show
-the blocking call.
+The I<out()> filter passes C<$bb> to the downstream filter unmodified.
+If you add print statements before and after the C<pass_brigade()>
+call and configure the same filter twice, the output will show the
+blocking call.

-The following diagram depicts a typical output filters chain data flow
-in addition to the program control flow:
+The following diagram depicts a typical output filter data flow in
+addition to the program control flow:

=for html
<img src="out_filter_stream.gif" width="575" height="261"
@@ -646,18 +644,17 @@

Similar to the input filters chain diagram, the arrows show the
program control flow and in addition the black-headed arrows show the
-data flow. Again, it uses a Perl pseudo-code for the mod_perl filter
-and C pseudo-code for the Apache filters, similarly the brick walls
-represent the waiting. And again, the diagram shows a part of the real
-HTTP response filters chain, where C<...> stands for the omitted
-filters.
+data flow. Again, it uses Perl pseudo-code for the mod_perl filter and
+C pseudo-code for the Apache filters, and the brick walls represent the
+waiting. The diagram shows only part of the real HTTP response filters
+chain, where C<...> stands for the omitted filters.

=head1 mod_perl Filters Declaration and Configuration

-Now let's see how mod_perl filters are declared and configured.
-
+Now that we have laid out some basic concepts involved in filter use,
+we can look at how mod_perl filters are declared and configured.

=head2 Filter Priority Types

@@ -670,16 +667,15 @@
pages. Numerical definitions of priority types, such as
C<AP_FTYPE_CONTENT_SET> and C<AP_FTYPE_RESOURCE>, can be found in the
Apache source distribution in I<include/util_filter.h>.
-I<include/util_filter.h>.

As of this writing Apache comes with two core filters: C<DEFLATE> and
-C<INCLUDES>. With the following directives:
+C<INCLUDES>. Regardless of the order of your configuration directives, e.g.:

SetOutputFilter DEFLATE
SetOutputFilter INCLUDES

-the C<DEFLATE> filter will be inserted in the filters chain after the
-C<INCLUDES> filter, even though it was configured before it. This is
+the C<INCLUDES> filter will be inserted in the filters chain before
+the C<DEFLATE> filter, even though it was configured after it. This is
because the C<DEFLATE> filter is of type C<AP_FTYPE_CONTENT_SET> (20),
whereas the C<INCLUDES> filter is of type C<AP_FTYPE_RESOURCE> (10).

@@ -692,15 +688,15 @@
FilterRequestHandler AP_FTYPE_RESOURCE 10
FilterConnectionHandler AP_FTYPE_PROTOCOL 30

-Therefore C<FilterRequestHandler> filters (10) will be always invoked
+Therefore C<FilterRequestHandler> filters (10) will always be invoked
before the C<DEFLATE> filter (20), whereas C<FilterConnectionHandler>
-filters (30) after it. The C<INCLUDES> filter (10) has the same
-priority as C<FilterRequestHandler> filters (10), and therefore it'll
-be inserted according to the configuration order, when
+filters (30) will be invoked after it. When two filters have the same
+priority (e.g., the C<INCLUDES> filter (10) has the same priority as
+C<FilterRequestHandler> filters (10)), they are run in the order they
+are configured. Therefore filters are inserted according to the
+configuration order when
C<L<PerlSetOutputFilter|/PerlSetOutputFilter>> or
-C<L<PerlSetInputFilter|/PerlSetInputFilter>> is used.
-
-
+C<L<PerlSetInputFilter|/PerlSetInputFilter>> are used.
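
The ordering rule described above can be modeled in a few lines of plain Perl. This is only a sketch of the rule as stated in the text, not Apache's or mod_perl's actual insertion code. The C<insertion_order()> helper and the record layout are assumptions made for illustration. The numeric types (10 and 20) are the ones quoted above, and the tie-break by configuration order assumes the C<PerlSet*Filter> directives are used.

```perl
use strict;
use warnings;

# Hypothetical helper: order filters the way the text describes,
# by priority type first and by configuration order within a type.
sub insertion_order {
    my @configured = @_;    # list of [name, numeric type] in config order
    my $i = 0;
    my @indexed = map { [ @$_, $i++ ] } @configured;
    return map { $_->[0] }
           sort { $a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] } @indexed;
}

my @order = insertion_order(
    [ 'DEFLATE',                    20 ],  # AP_FTYPE_CONTENT_SET
    [ 'INCLUDES',                   10 ],  # AP_FTYPE_RESOURCE
    [ 'MyApache2::FilterOutputFoo', 10 ],  # a FilterRequestHandler filter
);
print join(' -> ', @order), "\n";
```

Here C<INCLUDES> sorts ahead of C<DEFLATE> even though it was listed after it, and the two type-10 filters keep their configuration order.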

=head2 C<PerlInputFilterHandler>
@@ -720,8 +716,8 @@
they need to be compiled before L<the filter
attributes|/HTTP_Request_vs__Connection_Filters> can be accessed.
Therefore if the filter handler subroutine is not called C<handler>,
-you must preload the module containing the filter subroutine at the
-server startup. A filter handler can be configured not to be
+you must preload the module containing the filter subroutine at server
+startup. A filter handler can be configured not to be
C<L<AutoLoad|docs::2.0::user::config::config/C_AutoLoad_>>ed, using
the C<-> prefix. For example:

@@ -746,6 +742,10 @@
The handler's configuration scope is
C<L<DIR|docs::2.0::user::config::config/item_DIR>>.

+Similar to the C<PerlInputFilterHandler> handlers,
+C<PerlOutputFilterHandler> handlers are automatically
+C<AutoLoad>ed.
+
The following sections include several examples that use the
C<PerlOutputFilterHandler> handler.

@@ -761,7 +761,7 @@
=head2 C<PerlSetInputFilter>

The C<SetInputFilter> directive, documented at
-I<http://httpd.apache.org/docs-2.0/mod/core.html#setinputfilter> sets
+I<http://httpd.apache.org/docs-2.0/mod/core.html#setinputfilter>, sets
the filter or filters which will process client requests and POST
input when they are received by the server (in addition to any filters
configured earlier).
@@ -774,20 +774,20 @@
SetInputFilter FILTER_FOO
PerlInputFilterHandler MyApache2::FilterInputFoo

-will add both filters, however the order of their invocation might be
-not the one that you've expected. To make the invocation order the
-same as the insertion order replace C<SetInputFilter> with
+will add both filters. However, the order of their invocation might
+not be what you expect. To make the invocation order the
+same as the insertion order, replace C<SetInputFilter> with
C<PerlSetInputFilter>, like so:

PerlSetInputFilter FILTER_FOO
PerlInputFilterHandler MyApache2::FilterInputFoo

-now C<FILTER_FOO> filter will be always executed before the
+Now the C<FILTER_FOO> filter will always be executed before the
C<MyApache2::FilterInputFoo> filter, since it was configured before
C<MyApache2::FilterInputFoo> (i.e., it'll apply its transformations on
-the incoming data last). Here is a diagram input filters chain and the
-data flow from the network to the response handler for the presented
-configuration:
+the incoming data last). The diagram below shows the input filters
+chain and the data flow from the network to the response handler for
+the presented configuration:

response handler
/\
@@ -804,7 +804,7 @@
network

As explained in the section L<Filter Priority
-Types|/Filter_Priority_Types> this directive won't affect filters of
+Types|/Filter_Priority_Types> this directive won't affect filters of
different priority. For example assuming that
C<MyApache2::FilterInputFoo> is a C<FilterRequestHandler> filter, the
configurations:
@@ -818,12 +818,12 @@
PerlInputFilterHandler MyApache2::FilterInputFoo

are equivalent, because mod_deflate's C<DEFLATE> filter has a higher
-priority than C<MyApache2::FilterInputFoo>, thefore it'll always be
+priority than C<MyApache2::FilterInputFoo>. Therefore, it will always be
inserted into the filter chain after C<MyApache2::FilterInputFoo>,
(i.e. the C<DEFLATE> filter will apply its transformations on the
-incoming data first). Here is a diagram input filters chain and the
-data flow from the network to the response handler for the presented
-configuration:
+incoming data first). The diagram below shows the input filters chain
+and the data flow from the network to the response handler for the
+presented configuration:

response handler
/\
@@ -849,11 +849,6 @@
C<FILTER_FOO> and finally by C<FILTER_BAR> (again, assuming that all
three filters have the same priority).

-The C<PerlSetInputFilter> directives's configuration scope is
-C<L<DIR|docs::2.0::user::config::config/item_DIR>>.
-
-
-

=head2 C<PerlSetOutputFilter>

@@ -870,18 +865,17 @@
SetOutputFilter INCLUDES
PerlOutputFilterHandler MyApache2::FilterOutputFoo

-will add all two filters to the filter chain, however the order of
-their invocation might be not the one that you've expected. To
+As with input filters, to
preserve the insertion order replace C<SetOutputFilter> with
C<PerlSetOutputFilter>, like so:

PerlSetOutputFilter INCLUDES
PerlOutputFilterHandler MyApache2::FilterOutputFoo

-now mod_include's C<INCLUDES> filter will be always executed before
-the C<MyApache2::FilterOutputFoo> filter. Here is a diagram input
-filters chain and the data flow from the response handler to the
-network for the presented configuration:
+Now mod_include's C<INCLUDES> filter will always be executed before
+the C<MyApache2::FilterOutputFoo> filter. The diagram below shows the
+output filters chain and the data flow from the response handler to
+the network for the presented configuration:

response handler
||
@@ -907,7 +901,7 @@
C<INCLUDES> and finally by C<FILTER_FOO> (again, assuming that all
three filters have the same priority).

-Just as explained in the C<L<PerlSetInputFilter|/PerlSetInputFilter>>
+As explained in the C<L<PerlSetInputFilter|/PerlSetInputFilter>>
section, if filters have different priorities, the insertion order
might be different. For example in the following configuration:

@@ -939,8 +933,6 @@
\/
network

-The C<PerlSetOutputFilter> directives's configuration scope is
-C<L<DIR|docs::2.0::user::config::config/item_DIR>>.

@@ -958,7 +950,7 @@
PerlFixupHandler MyApache2::AddFilterDyn
</Files>

-And the corresponding module:
+And the corresponding module is:

#file:MyApache2/AddFilterDyn.pm
#------------------------------
@@ -981,7 +973,7 @@

You can also add connection filters dynamically. For more information
refer to the C<L<Apache2::Filter|docs::2.0::api::Apache2::Filter>>
-manpage:
+manpages
C<L<add_input_filter|docs::2.0::api::Apache2::Filter/C_add_input_filter_>>
and
C<L<add_output_filter|docs::2.0::api::Apache2::Filter/C_add_output_filter_>>.
@@ -1021,7 +1013,7 @@

use Apache2::Filter ();

-The request filters are usually configured in the
+Request filters are usually configured in the
C<E<lt>LocationE<gt>> or equivalent sections:

PerlModule MyApache2::FilterRequestFoo
@@ -1054,9 +1046,10 @@

1;

-This time the configuration must be done outside the
-C<E<lt>LocationE<gt>> or equivalent sections, usually within the
-C<E<lt>VirtualHostE<gt>> or the global server configuration:
+With connection filters, unlike request filters, the configuration
+must be done outside the C<E<lt>LocationE<gt>> or equivalent sections,
+usually within the C<E<lt>VirtualHostE<gt>> or the global server
+configuration:

Listen 8005
<VirtualHost _default_:8005>
@@ -1072,17 +1065,13 @@

</VirtualHost>

-This accomplishes the configuration of the connection input and output
+This configures the connection input and output
filters.

-Notice that for HTTP requests the only difference between connection
-filters and request filters is that the former see everything: the
-headers and the body, whereas the latter see only the body.
-
-mod_perl provides two interfaces to filtering: a direct bucket
-brigades manipulation interface and a simpler, stream-oriented
-interface. The examples in the following sections will help you to
-understand the difference between the two interfaces.
+As can be seen from the above examples, the only difference between
+connection filters and request filters is that connection filters
+see everything from the request, i.e., the headers and the body,
+whereas request filters see only the body.

@@ -1095,7 +1084,7 @@

There is one more callback in the filter framework. And that's
C<FilterInitHandler>. This I<init> callback runs immediately after the
-filter handler is inserted into the filter chain, before it was
+filter handler is inserted into the filter chain, before it is
invoked for the first time. Here is a skeleton of an init handler:

sub init : FilterInitHandler {
@@ -1105,9 +1094,7 @@
}

The attribute C<FilterInitHandler> marks the Perl function as suitable
-to be used as a filter initialization callback, which is called
-immediately after a filter is inserted to the filter chain and before
-it's actually called.
+to be used as a filter initialization callback.

For example you may decide to dynamically remove a filter before it
had a chance to run, if some condition is true:
@@ -1120,21 +1107,20 @@

Not all C<Apache2::Filter> methods can be used in the init handler,
because it's not a filter. Hence you can use methods that L<operate on
-the filter
-itself|docs::2.0::api::Apache2::Filter/Common_Filter_API>, such as
-C<L<remove()|docs::2.0::api::Apache2::Filter/C_remove_>> and
+the filter itself|docs::2.0::api::Apache2::Filter/Common_Filter_API>,
+such as C<L<remove()|docs::2.0::api::Apache2::Filter/C_remove_>> and
C<L<ctx()|docs::2.0::api::Apache2::Filter/C_ctx_>> or retrieve request
-information, such as C<L<r()|docs::2.0::api::Apache2::Filter/C_r_>> and
-C<L<c()|docs::2.0::api::Apache2::Filter/C_c_>>. But not methods that
-operate on data, such as
+information, such as C<L<r()|docs::2.0::api::Apache2::Filter/C_r_>>
+and C<L<c()|docs::2.0::api::Apache2::Filter/C_c_>>. You cannot use
+methods that operate on data, such as
C<L<read()|docs::2.0::api::Apache2::Filter/C_read_>> and
C<L<print()|docs::2.0::api::Apache2::Filter/C_print_>>.

In order to hook an init filter handler, the real filter has to assign
-this callback using the C<FilterHasInitHandler> which accepts a
-reference to the callback function, similar to C<push_handlers()>. The
-used callback function has to have the C<FilterInitHandler>
-attribute. For example:
+this callback using the C<FilterHasInitHandler> attribute, which
+accepts a reference to the callback function, similar to
+C<push_handlers()>. The callback function referred to must have the
+C<FilterInitHandler> attribute. For example:

package MyApache2::FilterBar;
use base qw(Apache2::Filter);
@@ -1145,8 +1131,8 @@
return Apache2::Const::OK;
}

-While attributes are parsed during the code compilation (it's really a
-sort of source filter), the argument to the C<FilterHasInitHandler()>
+While attributes are parsed during compilation (it's really a sort of
+source filter), the argument to the C<FilterHasInitHandler()>
attribute is compiled at a later stage once the module is compiled.

The argument to C<FilterHasInitHandler()> can be any Perl code which
@@ -1171,13 +1157,10 @@

$init_sub = eval "package MyApache2::FilterBar; get_pre_handler()";

-though, this is done in C, using the C<eval_pv()> C call.
-
-META: currently only one initialization callback can be registered per
-filter handler. If the need to register more than one arises it should
-be very easy to extend the functionality.
-
+This part is actually done in C, using the C<eval_pv()> C call.

+Currently only one initialization callback can be registered per
+filter handler.
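
Putting the pieces of this section together, a filter with an init handler might look like the following sketch. The package name C<MyApache2::FilterInitDemo> and the C<should_remove_filter()> helper are hypothetical and exist only for illustration. The attributes and the C<remove()> call are the ones described above, and the filter body is a plain streaming pass-through.

```perl
package MyApache2::FilterInitDemo;    # hypothetical package name

use strict;
use warnings;

use base qw(Apache2::Filter);

use Apache2::Const -compile => qw(OK);

use constant BUFF_LEN => 1024;

# Hypothetical condition; replace with a real check.
sub should_remove_filter { 0 }

# Runs once, right after the filter is inserted into the chain
# and before the filter itself is invoked for the first time.
sub init : FilterInitHandler {
    my $f = shift;
    $f->remove() if should_remove_filter();
    return Apache2::Const::OK;
}

# The real filter; it names its init callback via the attribute.
sub handler : FilterRequestHandler FilterHasInitHandler(\&init) {
    my $f = shift;

    # Streaming pass-through: forward the data unchanged.
    while ($f->read(my $buffer, BUFF_LEN)) {
        $f->print($buffer);
    }

    return Apache2::Const::OK;
}

1;
```

Assuming the module is on C<@INC>, it could be enabled as an output filter with C<PerlOutputFilterHandler MyApache2::FilterInitDemo>. If C<should_remove_filter()> returns true, the filter is removed before it ever sees data.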

@@ -1189,7 +1172,7 @@
develop the C<MyApache2::FilterSnoop> handler which can snoop on
request and connection filters, in input and output modes.

-But first let's develop a simple response handler that simply dumps
+First we create a simple response handler that dumps
the request's I<args> and I<content> as strings:

#file:MyApache2/Dump.pm
@@ -1349,39 +1332,41 @@
}
1;

-This package provides two filter handlers, one for connection and
+Recall that there are two types of filter handlers, one for connection and
another for request filtering:

sub connection : FilterConnectionHandler { snoop("connection", @_) }
sub request : FilterRequestHandler { snoop("request", @_) }

-Both handlers forward their arguments to the C<snoop()> function that
-does the real job. We needed to add these two subroutines in order to
-assign the two different attributes. Plus the functions pass the
-filter type to C<snoop()> as the first argument, which gets shifted
-off C<@_> and the rest of the C<@_> are the arguments that were
+Both handlers forward their arguments to the C<snoop()> function,
+which does the real work. These two subroutines are added in order to
+assign the two different attributes. In addition, the functions pass
+the filter type to C<snoop()> as the first argument, which gets
+shifted off C<@_>. The rest of C<@_> are the arguments that were
originally passed to the filter handler.

It's easy to know whether a filter handler is running in the input or
-the output mode. The arguments C<$f> and C<$bb> are always
-passed, whereas the arguments C<$mode>, C<$block>, and C<$readbytes>
+the output mode. Although the arguments C<$f> and C<$bb> are always
+passed, the arguments C<$mode>, C<$block>, and C<$readbytes>
are passed only to input filter handlers.

-If we are in the input mode, in the same call we retrieve the bucket
+If we are in input mode, in the same call we retrieve the bucket
brigade from the previous filter on the input filters stack and
immediately link it to the C<$bb> variable which makes the bucket
brigade available to the next input filter when the filter handler
returns. If we forget to perform this linking our filter will become a
-black hole in which data simply disappears. Next we call C<bb_dump()>
-which dumps the type of the filter and the contents of the bucket
-brigade to C<STDERR>, without influencing the normal data flow.
-
-If we are in the output mode, the C<$bb> variable already points to
-the current bucket brigade. Therefore we can read the contents of the
-brigade right away. After that we pass the brigade to the next filter.
+black hole into which data simply disappears. Next we call
+C<bb_dump()> which dumps the type of the filter and the contents of
+the bucket brigade to C<STDERR>, without influencing the normal data
+flow.
+
+If we are in output mode, the C<$bb> variable already points to the
+current bucket brigade. Therefore we can read the contents of the
+brigade right away, and then we pass the brigade to the next filter.
+
+Let's snoop on connection and request filter levels in both directions
+by applying the following configuration:

-Let's snoop on connection and request filter levels in both
-directions by applying the following configuration:

Listen 8008
<VirtualHost _default_:8008>
@@ -1409,10 +1394,11 @@

% echo "mod_perl rules" | POST 'http://localhost:8008/dump?foo=1&bar=2'

-We get the same response, when using C<MyApache2::FilterSnoop>, because
-our snooping filter didn't change anything. Though there was a lot of
-output printed to I<error_log>. We present it all here, since it helps
-a lot to understand how filters work.
+we get the same response as before we installed
+C<MyApache2::FilterSnoop> because our snooping filter didn't change
+anything. There was, however, new information printed to the
+I<error_log>. We present it all here, since it helps to show how
+filters work.

First we can see the connection input filter at work, as it processes
the HTTP headers. We can see that for this request each header is put
@@ -1462,8 +1448,8 @@
Here the HTTP header has been terminated by a double new line. So far
all the buckets were of the I<HEAP> type, meaning that they were
allocated from the heap memory. Notice that the HTTP request input
-filters will never see the bucket brigades with HTTP headers, as it has
-been consumed by the last core connection filter.
+filters will never see the bucket brigades with HTTP headers because
+they are consumed by the last core connection filter.

The following two entries are generated when
C<MyApache2::Dump::handler> reads the POSTed content:
@@ -1478,10 +1464,10 @@
o bucket 2: EOS
[]

-as we saw earlier on the diagram, the connection input filter is run
-before the request input filter. Since our connection input filter was
-passing the data through unmodified and no other custom connection
-input filter was configured, the request input filter sees the same
+As shown earlier, the connection input filter is run before the
+request input filter. Since our connection input filter was passing
+the data through unmodified and no other custom connection input
+filter was configured, the request input filter sees the same
data. The last bucket in the brigade received by the request input
filter is of type I<EOS>, meaning that all the input data from the
current request has been received.
@@ -1498,7 +1484,7 @@
mod_perl rules
]

-This happens because Apache hasn't sent yet the response HTTP headers
+This happens because Apache hasn't yet sent the response HTTP headers
to the client. The request filter sees a bucket brigade with a single
bucket of type I<TRANSIENT> which is allocated from the stack memory.

@@ -1520,7 +1506,7 @@

]

-and followed by the first response body's brigade:
+This is followed by the first response body's brigade:

>>> connection output filter
o bucket 1: TRANSIENT
@@ -1538,14 +1524,13 @@
]

If the response is large, the request and connection filters will
-filter chunks of the response one by one.
-
-META: what's the size of the chunks? 8k?
+filter chunks of the response one by one. These chunks are typically
+8k in size, but this size can vary.

-Finally, Apache sends a series of the bucket brigades to finish off
-the response, including the end of stream meta-bucket to tell filters
-that they shouldn't expect any more data, and flush buckets to flush
-the data, to make sure that any buffered output is sent to the client:
+Finally, Apache sends a series of bucket brigades to finish off the
+response, including the end of stream meta-bucket to tell filters that
+they shouldn't expect any more data, and flush buckets to flush the
+data, to make sure that any buffered output is sent to the client:

>>> connection output filter
o bucket 1: IMMORTAL
@@ -1563,13 +1548,11 @@
o bucket 1: FLUSH
[]

-This module helps to understand that each filter handler can be called
-many time during each request and connection. It's called for each
-bucket brigade.
-
-Also it's important to mention that HTTP request input filters are
-invoked only if there is some POSTed data to read and it's consumed by
-a content handler.
+This module helps to illustrate that each filter handler can be called
+many times during each request and connection. It is called for each
+bucket brigade. Also it is important to mention that HTTP request
+input filters are invoked only if there is some POSTed data to read
+and it's consumed by a content handler.

@@ -1580,19 +1563,15 @@
=head1 Input Filters

mod_perl supports L<Connection|/Connection_Input_Filters> and L<HTTP
-Request|/HTTP_Request_Input_Filters> input filters:
-
-
-
-
-
+Request|/HTTP_Request_Input_Filters> input filters. In the following
+sections we will look at each of these in turn.

=head2 Connection Input Filters

Let's say that we want to test how our handlers behave when they are
-requested as C<HEAD> requests, rather than C<GET>. We can alter the
-request headers at the incoming connection level transparently to all
-handlers.
+requested as C<HEAD> requests, rather than C<GET> requests. We can
+alter the request headers at the incoming connection level, so that
+the alteration is transparent to all handlers.

This example's filter handler looks for data like:

@@ -1641,16 +1620,18 @@
}
}

- Apache2::Const::OK;
+ return Apache2::Const::OK;
}
1;

-The filter handler is called for each bucket brigade, which in turn
+The filter handler is called for each bucket brigade, which in turn
includes buckets with data. The gist of any input filter handler is to
-request the bucket brigade from the upstream filter, and return it
-downstream filter using the second argument C<$bb>. It's important to
-remember that you can call methods on this argument, but you shouldn't
-assign to this argument, or the chain will be broken. You have two
+request the bucket brigade from the upstream filter, and return it to
+the downstream filter using the second argument C<$bb>. It's important
+to remember that you can call methods on this argument, but you
+shouldn't assign to this argument, or the chain will be broken.
+
+There are two
techniques to choose from to retrieve-modify-return bucket brigades:

=over
@@ -1659,14 +1640,15 @@

Create a new empty bucket brigade C<$ctx_bb>, pass it to the upstream
filter via C<get_brigade()> and wait for this call to return. When it
-returns, C<$ctx_bb> is populated with buckets. Now the filter should
-move the bucket from C<$ctx_bb> to C<$bb>, on the way modifying the
-buckets if needed. Once the buckets are moved, and the filter returns,
-the downstream filter will receive the populated bucket brigade.
+returns, C<$ctx_bb> will be populated with buckets. Now the filter
+should move the buckets from C<$ctx_bb> to C<$bb>, on the way modifying
+the buckets if needed. Once the buckets are moved, and the filter
+returns, the downstream filter will receive the populated bucket
+brigade.

=item 2

-Pass C<$bb> to C<get_brigade()> to the upstream filter, so it will be
+Pass C<$bb> to the upstream filter using C<get_brigade()> so it will be
populated with buckets. Once C<get_brigade()> returns, the filter can
go through the buckets and modify them in place, or it can do nothing
and just return (in which case, the downstream filter will receive the
@@ -1689,9 +1671,9 @@
required substitution, though once it completes the job it sets the
context to 1.

-To optimize the speed, the filter immediately returns
-C<Apache2::Const::DECLINED> when it's invoked after the substitution
-job has been done:
+Using the information stored in the context, the filter can
+immediately return C<Apache2::Const::DECLINED> when it's invoked after
+the substitution job has been done:

return Apache2::Const::DECLINED if $f->ctx;

@@ -1723,21 +1705,21 @@
If the job wasn't done yet, the filter calls C<get_brigade>, which
populates the C<$bb> bucket brigade. Next, the filter steps through
the buckets looking for the bucket that matches the regex:
-C</^GET/>. If that happens, a new bucket is created with the modified
-data (C<s/^GET/HEAD/>. Now it has to be inserted in place of the old
-bucket. In our example we insert the new bucket after the bucket that
-we have just modified and immediately remove that bucket that we don't
-need anymore:
+C</^GET/>. If a match is found, a new bucket is created with the
+modified data (C<s/^GET/HEAD/>). Now it has to be inserted in place of
+the old bucket. In our example we insert the new bucket after the
+matching bucket and immediately remove the old bucket, which we no
+longer need:

$b->insert_after($nb);
$b->remove; # no longer needed

Finally we set the context to 1, so we know not to apply the
-substitution on the following data and break from the I<for> loop.
+substitution on the following data, and break from the I<for> loop.

-The handler returns C<Apache2::Const::OK> indicating that everything was
-fine. The downstream filter will receive the bucket brigade with one
-bucket modified.
+The handler returns C<Apache2::Const::OK> indicating that everything
+was fine. The downstream filter will receive the bucket brigade with
+one bucket modified.

Now let's check that the handler works properly. For example, consider
the following response handler:
@@ -1763,16 +1745,16 @@
$r->set_content_length(length $response);
$r->print($response);

- Apache2::Const::OK;
+ return Apache2::Const::OK;
}

1;

-which returns to the client the request type it has issued. In the
-case of the C<HEAD> request Apache will discard the response body, but
-it'll will still set the correct C<Content-Length> header, which will
-be 24 in case of the C<GET> request and 25 for C<HEAD>. Therefore if
-this response handler is configured as:
+This handler returns to the client the request type it has issued. For
+a C<HEAD> request Apache will discard the response body, but it will
+still set the correct C<Content-Length> header, which will be 24 for a
+C<GET> request and 25 for a C<HEAD> request. Therefore, if this
+response handler is configured as:

Listen 8005
<VirtualHost _default_:8005>
@@ -1789,14 +1771,13 @@
print $r->headers->content_length . ": ". $r->content'
24: the request type was GET

-where the response's body is:
+the response body will be:

the request type was GET

-And the C<Content-Length> header is set to 24.
-
-However if we enable the C<MyApache2::InputFilterGET2HEAD> input
-connection filter:
+and the C<Content-Length> header will be set to 24. This is what we
+would expect since the request was processed normally. However, if we
+enable the C<MyApache2::InputFilterGET2HEAD> input connection filter:

Listen 8005
<VirtualHost _default_:8005>
@@ -1808,19 +1789,19 @@
</Location>
</VirtualHost>

-And issue the same C<GET> request, we get only:
+and issue the same C<GET> request, we get only:

25:

-which means that the body was discarded by Apache, because our filter
-turned the C<GET> request into a C<HEAD> request and if Apache wasn't
-discarding the body on C<HEAD>, the response would be:
+This means the body was discarded by Apache, because our filter turned
+the C<GET> request into a C<HEAD> request. If Apache wasn't discarding
+the body on C<HEAD>, the response would be:

the request type was HEAD

-that's why the content length is reported as 25 and not 24 as in the
-real GET request.
-
+That's why the content length is reported as 25 and not 24 as in the
+real GET request. So the content length of 25 and lack of body text in
+the response confirm that our filter is acting as we expected.
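
The two C<Content-Length> values can be checked with plain Perl. This sketch assumes the response body is exactly the text shown in the outputs above, with no trailing newline. The C<body_length()> helper is a hypothetical name used only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: length of the body the response handler
# would send for a given request method (no trailing newline assumed).
sub body_length {
    my ($method) = @_;
    my $body = "the request type was $method";
    return length $body;
}

printf "%-4s => %d\n", $_, body_length($_) for qw(GET HEAD);
```

The one-character difference between C<GET> and C<HEAD> accounts for the 24 versus 25 reported above.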

=head2 HTTP Request Input Filters
@@ -1831,8 +1812,10 @@

=head2 Bucket Brigade-based Input Filters

-Let's look at the request input filter that lowers the case of the
-request's body: C<MyApache2::InputRequestFilterLC>:
+As we have seen, filters can be either bucket brigade-based or
+stream-oriented. Here we look at a request input filter that
+lowercases the request's body by directly manipulating the bucket
+brigade: C<MyApache2::InputRequestFilterLC>.

#file:MyApache2/InputRequestFilterLC.pm
#-------------------------------------
@@ -1873,7 +1856,7 @@
$bb->insert_tail($b);
}

- Apache2::Const::OK;
+ return Apache2::Const::OK;
}

1;
@@ -1881,27 +1864,28 @@
As promised, in this filter handler we have used the first technique
of bucket brigade modification. The handler creates a temporary bucket
brigade (C<ctx_bb>), populates it with data using C<get_brigade()>,
-and then moves buckets from it to the bucket brigade C<$bb>, which is
-then retrieved by the downstream filter when our handler returns.
+and then moves buckets from it to the bucket brigade C<$bb>. This
+bucket brigade is then retrieved by the downstream filter when our
+handler returns.

This filter doesn't need to know whether it was invoked for the first
-time or whether it has already done something. It's a state-less
+time or whether it has already done something. It's a stateless
handler, since it has to lower case everything that passes through
it. Notice that this filter can't be used as the connection filter for
-HTTP requests, since it will invalidate the incoming request headers;
-for example the first header line:
+HTTP requests, since it will invalidate the incoming request headers.
+For example the first header line:

GET /perl/TEST.pl HTTP/1.1

-will become:
+becomes:

get /perl/test.pl http/1.1

-which messes up the request method, the URL and the protocol.
+which invalidates the request method, the URL and the protocol.

-Now if we use the C<MyApache2::Dump> response handler, we have
-developed before in this chapter, which dumps the query string and the
-content body as a response, and configure the server as follows:
+To test, we can use the C<MyApache2::Dump> response handler, presented
+earlier, which dumps the query string and the content body as a
+response. Configure the server as follows:

<Location /lc_input>
SetHandler modperl
@@ -1909,20 +1893,40 @@
PerlInputFilterHandler +MyApache2::InputRequestFilterLC
</Location>

-When issuing a POST request:
+Now when issuing a POST request:

% echo "mOd_pErl RuLeS" | POST 'http://localhost:8002/lc_input?FoO=1&BAR=2'

-we get the response:
+we get a response:

args:
FoO=1&BAR=2
content:
mod_perl rules

-As before, we see that our filter has lowercased the POSTed body
-before the content handler received it and the query string wasn't
-changed.
+We can see that our filter has lowercased the POSTed body before the
+content handler received it. And you can see that the query string
+wasn't changed.
+
+We have devoted so much attention to bucket brigade filters, even
+though the stream-oriented interface is simpler to use, because it is
+important to understand how the filters work underneath. This
+understanding is essential when you need to debug filters or to
+optimize them. There
+are cases when a bucket brigade filter may be more efficient than the
+stream-oriented version. For example if the filter applies a
+transformation to selected buckets, certain buckets may contain open
+filehandles or pipes, rather than real data. When you call C<read()>,
+as shown above, the buckets will be forced to read that data in. But
+if you didn't want to modify these buckets you could pass them as they
+are and let Apache perform faster techniques for sending data from the
+file handles or pipes.
+
+N< The call to $b-E<gt>read(), or any other operation that internally
+forces the bucket to read the information into memory (like the
+length() op), makes the data handling less efficient because it
+creates more work. Therefore care should be taken not to read the
+data in unless it's really necessary, and sometimes you can gain this
+efficiency only by working with the bucket brigades.>

=head2 Stream-oriented Input Filters
@@ -1954,34 +1958,22 @@
}
1;

-Now you probably ask yourself why did we have to go through the bucket
-brigades filters when this all can be done so much simpler. The reason
-is that we wanted you to understand how the filters work underneath,
-which will assist a lot when you will need to debug filters or
-optimize their speed. In certain cases a bucket brigade filter may be
-more efficient than the stream-oriented. For example if the filter
-applies transformation to selected buckets, certain buckets may
-contain open filehandles or pipes, rather than real data. And when you
-call C<read()> the buckets will be forced to read that data in. But if
-you didn't want to modify these buckets you could pass them as they
-are and let Apache do faster techniques for sending data from the file
-handles or pipes.

-The logic is very simple here, the filter reads in loop, and prints
+The logic is very simple here. The filter reads in a loop and prints
the modified data, which at some point will be sent to the next
-filter. This point happens every time the internal mod_perl buffer is
-full or when the filter returns.
+filter. The data transmission is triggered every time the internal
+mod_perl buffer is filled or when the filter returns.

C<read()> populates C<$buffer> to a maximum of C<BUFF_LEN> characters
(1024 in our example). Assuming that the current bucket brigade
contains 2050 chars, C<read()> will get the first 1024 characters,
-then 1024 characters more and finally the remaining 2
-characters. Notice that even though the response handler may have sent
-more than 2050 characters, every filter invocation operates on a
-single bucket brigade so you have to wait for the next invocation to
-get more input. In one of the earlier examples we have shown that you
-can force the generation of several bucket brigades in the content
-handler by using C<rflush()>. For example:
+then 1024 characters more and finally the remaining 2 characters. Note
+that even though the response handler may have sent more than 2050
+characters, every filter invocation operates on a single bucket
+brigade so you have to wait for the next invocation to get more
+input. Earlier we showed that you can force the generation of several
+bucket brigades in the content handler by using C<rflush()>. For
+example:

$r->print("string");
$r->rflush();
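
The chunking described above (2050 characters arriving as 1024, 1024 and 2) can be reproduced outside Apache. The following is only a sketch: it uses Perl's built-in C<read()> on an in-memory filehandle as a stand-in for C<$f-E<gt>read()>, and the 2050-character string is made up for illustration.

```perl
use strict;
use warnings;

use constant BUFF_LEN => 1024;

# Stand-in for the data carried by one bucket brigade (assumption:
# 2050 characters, as in the text above).
my $data = 'x' x 2050;

# An in-memory filehandle lets the built-in read() play the role of
# $f->read() in a stream-oriented filter.
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";

my @sizes;
while (read($fh, my $buffer, BUFF_LEN)) {
    push @sizes, length $buffer;    # record the size of each chunk
}
close $fh;

print "@sizes\n";    # prints "1024 1024 2"
```

In a real filter the loop ends when the current brigade is exhausted, and the next brigade arrives with the next invocation of the handler.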
@@ -1989,10 +1981,11 @@

It's only possible to get more than one bucket brigade from the same
filter handler invocation if the filter is not using the streaming
-interface and by simply calling C<get_brigade()> as many times as
-needed or till EOS is received.
+interface. In that case you can call C<get_brigade()> as many times as
+needed or until EOS is received.

-The configuration section is pretty much identical:
+The configuration section is nearly identical for the two types of
+filters:

<Location /lc_input2>
SetHandler modperl
@@ -2004,26 +1997,35 @@

% echo "mOd_pErl RuLeS" | POST 'http://localhost:8002/lc_input2?FoO=1&BAR=2'

-we get a response:
+we get the response:

args:
FoO=1&BAR=2
content:
mod_perl rules

-indeed we can see that our filter has lowercased the POSTed body,
-before the content handler received it. You can see that the query
-string wasn't changed.
+As before, we see that our filter has lowercased the POSTed body
+before the content handler received it and the query string wasn't
+changed.

=head1 Output Filters

+
+As discussed above in the section L<HTTP Request vs. Connection
+Filters>, mod_perl supports L<Connection|/Connection_Output_Filters>
+and L<HTTP Request|/HTTP_Request_Output_Filters> output filters. In
+the following sections we will look at each of these in turn.
+
+
mod_perl supports L<Connection|/Connection_Output_Filters> and L<HTTP
-Request|/HTTP_Request_Output_Filters> output filters:
+Request|/HTTP_Request_Output_Filters> output filters. The differences
+between connection filters and HTTP request filters are described
+above in the section L<HTTP Request vs. Connection Filters>.

=head2 Connection Output Filters

Connection filters filter B<all> the data that is going through the
-server. Therefore if the connection is of HTTP request type,
+server. Therefore if the connection is of the HTTP request type,
connection output filters see the headers and the body of the
response, whereas request output filters see only the response body.

@@ -2035,11 +2037,14 @@

=head2 HTTP Request Output Filters

-As mentioned earlier output filters can be written using the bucket
+As mentioned earlier, output filters can be written using the bucket
brigades manipulation or the simplified stream-oriented interface.
+This section will show examples of both.

-First let's develop a response handler that sends two lines of output:
-numerals 1234567890 and the English alphabet in a single string:
+In order to generate output that can be manipulated by the two types
+of output filters, we will first develop a response handler that sends
+two lines of output: numerals 1234567890 and the English alphabet in a
+single string:

#file:MyApache2/SendAlphaNum.pm
#-------------------------------
@@ -2061,21 +2066,20 @@
$r->print(1..9, "0\n");
$r->print('a'..'z', "\n");

- Apache2::Const::OK;
+ return Apache2::Const::OK;
}
1;

-The purpose of our filter handler is to reverse every line of the
-response body, preserving the new line characters in their
+In the examples below, we'll create a filter handler to reverse every
+line of the response body, preserving the new line characters in their
places. Since we want to reverse characters only in the response body,
without breaking the HTTP headers, we will use the HTTP request output
-filter.
-
+filter rather than a connection output filter.

=head3 Stream-oriented Output Filters

-The first filter implementation is using the stream-oriented filtering
+The first filter implementation uses the stream-oriented filtering
API:

#file:MyApache2/FilterReverse1.pm
@@ -2101,7 +2105,7 @@
}
}

- Apache2::Const::OK;
+ return Apache2::Const::OK;
}
1;

@@ -2115,13 +2119,13 @@
PerlOutputFilterHandler MyApache2::FilterReverse1
</Location>

-Now when a request to I</reverse1> is made, the response handler
+Now when a request to I</reverse1> is sent, the response handler
C<MyApache2::SendAlphaNum::handler()> sends:

1234567890
abcdefghijklmnopqrstuvwxyz

-as a response and the output filter handler
+as a response. The output filter handler
C<MyApache2::FilterReverse1::handler> reverses the lines, so the client
gets:

@@ -2135,21 +2139,21 @@
the I<readline()> mode in chunks up to the buffer length (1024 in our
example), and then prints each line reversed while preserving the new
line control characters at the end of each line. Behind the scenes
-C<$f-E<gt>read()> retrieves the incoming brigade and gets the
-data from it, and C<$f-E<gt>print()> appends to the new brigade
-which is then sent to the next filter in the stack. C<read()> breaks
-the I<while> loop, when the brigade is emptied or the end of stream is
+C<$f-E<gt>read()> retrieves the incoming brigade and gets the data
+from it, and C<$f-E<gt>print()> appends to the new brigade which is
+then sent to the next filter in the stack. C<read()> breaks the
+I<while> loop when the brigade is emptied or the end of stream is
received.

-In order not to distract the reader from the purpose of the example
-the used code is oversimplified and won't handle correctly input lines
-which are longer than 1024 characters and possibly using a different
-line termination token (could be "\n", "\r" or "\r\n" depending on a
-platform). Moreover a single line may be split between across two or
+While this code is simple and easy to explain, there are cases it
+won't handle correctly. For example, it will have problems if the
+input lines are longer than 1,024 characters. It also doesn't account
+for the different line terminators on different platforms (e.g., "\n",
+"\r", or "\r\n"). Moreover a single line may be split across two or
even more bucket brigades, so we have to store the unprocessed string
-in the filter context, so it can be used on the following invocations.
-So here is an example of a more complete handler, which does takes
-care of these issues:
+in the filter context so it can be used on the following invocations.
+Below is an example of a more complete handler, which takes care of
+these issues:

sub handler {
my $f = shift;
@@ -2178,13 +2182,14 @@
as long as it fails to assemble a complete line or there is an
incomplete line following the new line token. On the next handler
invocation this data is then prepended to the next chunk that is
-read. When the filter is invoked on the last time, it unconditionally
-reverses and flushes any remaining data.
+read. When the filter is invoked for the last time, signaled by the
+C<$f-E<gt>seen_eos> method, it unconditionally reverses and sends the
+data down the stream, which is then flushed down to the client.
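
The leftover technique can be tried without a running server. The sketch below is not the handler from this document: C<reverse_lines_in_chunks()> is a hypothetical helper that receives the chunks as a plain list, keeps the incomplete tail in a lexical variable instead of C<$f-E<gt>ctx>, and treats the end of the list as the end of stream.

```perl
use strict;
use warnings;

# Hypothetical helper: reverse every line of a stream that arrives in
# arbitrary chunks, preserving the line terminators.
sub reverse_lines_in_chunks {
    my @chunks   = @_;
    my $leftover = '';    # plays the role of the filter context
    my $out      = '';

    for my $chunk (@chunks) {
        # prepend whatever was left unprocessed by the previous chunk
        my $buffer = $leftover . $chunk;

        # consume only complete lines, i.e. those followed by a terminator
        while ($buffer =~ /\G([^\r\n]*)(\r\n|\r|\n)/gc) {
            $out .= scalar(reverse $1) . $2;
        }

        # everything after the last terminator waits for the next chunk
        $leftover = substr($buffer, pos($buffer) // 0);
    }

    # end of stream: reverse and send whatever remains
    $out .= scalar reverse $leftover;
    return $out;
}

# The second line is split across two chunks on purpose.
print reverse_lines_in_chunks("1234567890\nabcdefghijklm", "nopqrstuvwxyz\n");
# prints:
# 0987654321
# zyxwvutsrqponmlkjihgfedcba
```

A "\r\n" pair that happens to be split between two chunks is seen as two terminators with an empty line in between, which leaves the output unchanged because reversing an empty line yields an empty line.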

=head3 Bucket Brigade-based Output Filters

-The following filter implementation is using the bucket brigades API
+The following filter implementation uses the bucket brigades API
to accomplish exactly the same task as the first filter.

#file:MyApache2/FilterReverse2.pm
@@ -2229,11 +2234,11 @@
my $rv = $f->next->pass_brigade($bb_ctx);
return $rv unless $rv == APR::Const::SUCCESS;

- Apache2::Const::OK;
+ return Apache2::Const::OK;
}
1;

-and the corresponding configuration:
+Below is the corresponding configuration for I<httpd.conf>:

PerlModule MyApache2::FilterReverse2
PerlModule MyApache2::SendAlphaNum
@@ -2252,25 +2257,26 @@

The bucket brigades output filter version is just a bit more
complicated than the stream-oriented one. The handler receives the
-incoming bucket brigade C<$bb> as its second argument. Since when the
-handler is completed it must pass a brigade to the next filter in the
-stack, we create a new bucket brigade into which we are going to put
-the modified buckets and which eventually we pass to the next filter.
-
-The core of the handler is in removing buckets from the head of the
-bucket brigade C<$bb> while there are some, reading the data from the
-buckets, reversing and putting it into a newly created bucket which is
-inserted to the end of the new bucket brigade. If we see a bucket
-which designates the end of stream, we insert that bucket to the tail
-of the new bucket brigade and break the loop. Finally we pass the
-created brigade with modified data to the next filter and return.
+incoming bucket brigade C<$bb> as its second argument. When the
+handler finishes it must pass a brigade to the next filter in the
+stack, so we create a new bucket brigade into which we put the
+modified buckets and which eventually we pass to the next filter.
+
+The core of the handler removes buckets from the head of the bucket
+brigade C<$bb>, while buckets are available, reads the data from the
+buckets, then reverses and puts the data into a newly created bucket.
+The new bucket is then inserted on the end of the new bucket brigade
+using the C<insert_tail()> method. If we see a bucket which designates
+the end of stream, we insert that bucket on the tail of the new bucket
+brigade and break the loop. Finally we pass the created brigade with
+modified data to the next filter and return.

-Similarly to the original version of
+Similar to the original version of
C<MyApache2::FilterReverse1::handler>, this filter is not smart enough
-to handle incomplete lines. However the exercise of making the filter
-foolproof should be trivial by porting a better matching rule and
-using the C<$leftover> buffer from the previous section is trivial and
-left as an exercise to the reader.
+to handle incomplete lines. The filter could be made more foolproof by
+building a better matching rule and using the C<$leftover> buffer as
+demonstrated in the previous section. This is left as an exercise for
+the reader.

@@ -2285,20 +2291,21 @@

Sometimes filters need to read at least N bytes before they can apply
their transformation. It's quite possible that reading one bucket
-brigade is not enough. But two or more are needed. This situation is
-sometimes referred to as an I<underrun>.
+brigade is not enough, but that two or more are needed. This situation
+is sometimes referred to as an I<underrun>.

Let's take an input filter as an example. When the filter realizes
that it doesn't have enough data in the current bucket brigade, it can
store the read data in the filter context, and wait for the next
-invocation of itself, which may or may not satisfy its
-needs. Meanwhile it must return an empty bb to the upstream input
-filter. This is not the most efficient technique to resolve underruns.
-
-Instead of returning an empty bb, the input filter can initiate the
-retrieval of extra bucket brigades, until the underrun condition gets
-resolved. Notice that this solution is absolutely transparent to any
-filters before or after the current filter.
+invocation of itself, which may or may not satisfy its needs. While it
+is gathering the data from the bucket brigades, it must return an
+empty bucket brigade to the upstream input filter. However, this is
+not the most efficient technique to resolve underruns.
+
+Instead of returning an empty bucket brigade, the input filter can
+request extra bucket brigades until the underrun condition gets
+resolved. Note that this solution is transparent to any filters before
+or after the current filter.

Consider this HTTP request:

@@ -2314,14 +2321,16 @@
to get 5 brigades of 8kb, and one brigade with just a few bytes (a
total of 6 bucket brigades).

-Now let's say that the filter needs to have 1024*16 + 5 bytes to have
-a complete token and then it can start its processing. The extra 5
-bytes are just so we don't perfectly fit into 8bk bucket brigades,
-making the example closer to real situations. Having 40975 bytes of
-input and a token size of 16389 bytes, we will have 2 full tokens and
-8197 remainder.
+Now let's assume our example filter needs to have 1024*16 + 5 bytes to
+have a complete token before it can start its processing. The extra 5
+bytes are just so we don't perfectly fit into 8kb bucket brigades,
+making the example closer to real situations. Having 40,975 bytes of
+input and a token size of 16,389 bytes, we will have 2 full tokens and
+a remainder of 8,197 bytes.
+
+Before showing any code, let's look at the filter debug output to
+better explain what we expect to happen:

-Jumping ahead let's look at the filter debug output:

filter called
asking for a bb
@@ -2338,29 +2347,28 @@
asking for a bb
seen eos, flushing the remaining: 8197 bytes

-So we can see that the filter was invoked three times. The first time
-it has consumed three bucket brigades, collecting one full token of
-16389 bytes and has a remainder of 7611 bytes to be processed on the
+We can see that the filter was invoked three times. The first time it
+consumed three bucket brigades, collecting one full token of
+16,389 bytes with a remainder of 7,611 bytes to be processed on the
next invocation. The second time it needed only two more bucket
-brigades and this time after completing the second token, 7222 bytes
-have remained. Finally on the third invocation it has consumed the
-last bucket brigade (total of six, just as we have expected), however
-it didn't have enough for the third token and since EOS has been seen
-(no more data expected), it has flushed the remaining 8197 bytes as we
-have calculated earlier.
+brigades and this time, after completing the second token, 7,222 bytes
+remained. Finally, on the third invocation it consumed the last bucket
+brigade for a total of six, just as we expected. However, it didn't
+have enough for the third token, and since EOS had been seen (no more
+data expected), it flushed the remaining 8,197 bytes as we
+calculated earlier.
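
The numbers in the debug output can be checked with a small simulation. This is only a sketch of the bookkeeping, not the filter itself: C<simulate_underrun()> is a hypothetical helper, the brigade sizes (five of 8000 bytes and one of 975 bytes) are inferred from the remainders shown above, and running out of brigades stands in for seeing EOS.

```perl
use strict;
use warnings;

use constant TOKEN_SIZE => 1024 * 16 + 5;    # 16,389 bytes, as in the text

# Hypothetical helper: takes the byte sizes of the incoming brigades
# and returns one record per filter invocation.
sub simulate_underrun {
    my @sizes    = @_;
    my $leftover = 0;
    my @invocations;

    while (@sizes) {
        my ($have, $fetched) = ($leftover, 0);

        # keep asking for brigades until a full token is available
        # or the input is exhausted (standing in for EOS)
        do {
            $have += shift @sizes;
            $fetched++;
        } while ($have < TOKEN_SIZE && @sizes);

        my $eos    = !@sizes;
        my $tokens = int($have / TOKEN_SIZE);
        $leftover  = $have - $tokens * TOKEN_SIZE;

        push @invocations, {
            brigades => $fetched,
            tokens   => $tokens,
            leftover => $eos ? 0         : $leftover,
            flushed  => $eos ? $leftover : 0,
        };
        $leftover = 0 if $eos;    # nothing is kept once the remainder is flushed
    }
    return @invocations;
}

my @calls = simulate_underrun((8000) x 5, 975);    # 40,975 bytes in total
printf "brigades=%d tokens=%d leftover=%d flushed=%d\n",
    @{$_}{qw(brigades tokens leftover flushed)} for @calls;
# prints:
# brigades=3 tokens=1 leftover=7611 flushed=0
# brigades=2 tokens=1 leftover=7222 flushed=0
# brigades=1 tokens=0 leftover=0 flushed=8197
```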

It is clear from the debugging output that the filter was invoked only
three times, instead of six times (there were six bucket
-brigades). Notice that the upstread input filter (if any) wasn't aware
-that there were six bucket brigades, since it saw only three. Our
-example filter didn't do much with those tokens, so it has only
-repackaged data from 8kb per bucket brigade, to 16389 bytes per bucket
-brigade. But of course in real world some transformation is applied on
-these tokens.
-
-Now you understand what did we want from the filter, it's time for the
-implementation details. First let's look at the C<response()> handler
-(the first part of the module):
+brigades). Notice that the upstream input filter, if there is one,
+isn't aware that there were six bucket brigades, since it saw only
+three. Our example filter didn't do much with those tokens; it only
+repackaged the data from 8kb per bucket brigade to 16,389 bytes per
+bucket brigade. But of course in a real implementation some
+transformation would be applied to these tokens.
+
+Now let's look at the implementation details. First let's look at the
+C<response()> handler, which is the first part of the module:

#file:MyApache2/Underrun.pm
#-------------------------
@@ -2443,7 +2451,7 @@
my $data;
warn "\nfilter called\n";

- # fetch and consume bucket brigades untill we have at least TOKEN_SIZE
+ # fetch and consume bucket brigades until we have at least TOKEN_SIZE
# bytes to work with
do {
my $tbb = APR::Brigade->new($f->r->pool, $ba);
@@ -2510,23 +2518,22 @@

1;

-The filter calls C<get_brigade()> in a do-while loop till it reads
-enough data or sees EOS. Notice that it may get underruns for several
+The filter calls C<get_brigade()> in a do-while loop until it reads
+enough data or sees EOS. Notice that it may get underruns several
times, and then suddenly receive a lot of data at once, which will be
-enough for more than one minimal size token, so we have to take care
-this into an account. Once the underrun condition is satisfied (we
-have at least one complete token) the tokens are put into a bucket
-brigade and returned to the upstream filter for processing, keeping
-any remainders in the filter context, for the next invocations or
-flushing all the remaining data if EOS has been seen.
-
-Notice that this won't be possible with streaming filters where every
-invocation gives the filter exactly one bucket brigade to work with
-and provides not facilities to fetch extra brigades. (META: however
-this can be fixed, by providing a method which will fetch the next
-bucket brigade, so the read in a while loop can be repeated)
+enough for more than one minimal size token, so we have to take this
+into account. Once the underrun condition is satisfied (we have at
+least one complete token) the tokens are put into a bucket brigade and
+returned to the upstream filter for processing, keeping any remainders
+in the filter context for the next invocations or flushing all the
+remaining data if EOS is seen.
+
+Note that this example cannot be implemented with streaming filters
+because each invocation gives the filter exactly one bucket brigade to
+work with. The streaming interface does not currently provide a
+facility to fetch extra brigades.

-And here is the configuration for this setup:
+Here is the Apache configuration for this example:

PerlModule MyApache2::Underrun
<Location />
@@ -2543,10 +2550,12 @@
Earlier we have stated that a filter that modifies the content's
length must unset the Content-Length HTTP header. However sometimes
it's desirable to have this header set, for example when dealing with
-proxies. Since the headers are sent before the data, all the data must
-be first buffered and processed. You cannot accomplish this task with
-the streaming filter API since it passes FLUSH buckets through. As
-soon as the FLUSH bucket is received by the core filter that sends the
+proxies.
+
+Since the headers are sent before the data, all the data must first be
+buffered and processed. You cannot accomplish this task with the
+streaming filter API since it passes FLUSH buckets through. As soon as
+the FLUSH bucket is received by the core filter that sends the
headers, it generates the headers and sends those out. Therefore the
bucket brigade API must be used here to have a complete control over
what's going through. Here is a possible implementation:
@@ -2656,14 +2665,14 @@

Whenever a new HTTP request is processed, request filters get their
context (C<L<$f-E<gt>ctx|docs::2.0::api::Apache2::Filter/C_ctx_>>)
-reset. This is also true for connection filters context, as long as
-the connection is not
-C<L<keepalive|docs::2.0::api::Apache2::Connection/C_keepalive_>>). When
-the connection is kept alive, there could be many requests processed
-during a single connection and the same filter context will persist
-through all of them, until the maximum number of KeepAlive requests
-over the same connection is reached or the client breaks the
-connection.
+reset. This is also true for the connection filter context, as long as
+the connection is not a
+C<L<keepalive|docs::2.0::api::Apache2::Connection/C_keepalive_>>
+connection. When the connection is kept alive, there could be many
+requests processed during a single connection and the same filter
+context will persist through all of them, until the maximum number of
+KeepAlive requests over the same connection is reached or until the
+client breaks the connection.

Sometimes it's desirable to reset the whole context or parts of it
before a HTTP request is processed. For example
@@ -2671,13 +2680,13 @@
start and stop processing HTTP headers. It keeps the state in the
filter's context. The problem is that whenever a new HTTP request is
coming in, it needs to be able to reset the state machine. If it
-doesn't, it'll process the HTTP headers of the first request and miss
-the rest of the requests.
+doesn't, it will process the HTTP headers of the first request and
+miss the rest of the requests.

So let's say we have a hypothetical module
C<MyApache2::Filter::StateMachine> which implements an input
-connection filter, which processes incoming data as long as the
-I<state> flag is down. Once that flag goes up the filter switches to
+connection filter that processes incoming data as long as the
+I<state> flag is down. Once that flag goes up, the filter switches to
the pass-through-unmodified mode. Here is a skeleton of the module:

#file:MyApache2/Filter/StateMachine.pm
@@ -2722,9 +2731,9 @@
}
1;

-In order to make this module work properly over KeepAlive connections,
-we want to reset the I<state> flag at the very beginning of the new
-request. To accomplish that all we need to do is to change the
+To make this module work properly over KeepAlive connections, we want
+to reset the I<state> flag at the very beginning of the new
+request. To accomplish this, all we need to do is to change the
C<context> wrapper to be:

sub context {
@@ -2753,13 +2762,13 @@
}

The only difference from the previous implementation is that we
-maintain one more state, which stores the number of requests, served
+maintain one more state, which stores the number of requests served
over the current connection. When Apache reports more served requests
than we have in the context that means that we have a new request
coming in. So we reset the I<state> flag and store the new value of
the served connections.

-For a concrete implementation examples see:
+For a more complete real-world implementation, see:
http://search.cpan.org/dist/Apache-Filter-HTTPHeadersFixup/

@@ -2829,11 +2838,11 @@

[

-talk about issues like not losing meta-buckets. e.g. if the filter runs
-a switch statement and propagates buckets types that were known at the
-time of writing, it may drop buckets of new types which may be added
-later, so it's important to ensure that there is a default cause where
-the bucket is passed as is.
+talk about issues like not losing meta-buckets, e.g. if the filter
+runs a switch statement and propagates bucket types that were known
+at the time of writing, it may drop buckets of new types which may be
+added later, so it's important to ensure that there is a default case
+where the bucket is passed as is.

of course mention the fact where things like EOS buckets must be
passed, or the whole chain will be broken. Or if some filter decides
@@ -2866,13 +2875,9 @@

=head1 Writing Efficient Filters

-META: to be written
-
-[
-
-As of this writing the network input filter reads in 8000B chunks (not
-8192B!), and making each bucket 8000B in size, so it seems that the
-most efficient reading technique is:
+As of this writing, the Apache network input filter reads in 8000B
+chunks (not 8192B) and makes each bucket 8000B in size. Based on this,
+the most efficient reading technique is:

use constant BUFF_LEN => 8000;
while ($f->read(my $buffer, BUFF_LEN)) {
@@ -2909,7 +2914,10 @@

=head1 CPAN Modules

-Some of the CPAN modules that implement mod_perl 2.0 filters:
+Several modules are available on the CPAN that implement mod_perl 2.0
+filters. As with all code on the CPAN, the source code is fully
+available, so you can download these modules and see exactly how they
+work.

=over
