Mailing List Archive

crash in process_lingering_close
I noticed a crash in process_lingering_close

The crash happens here:

#0 0x00007f5d4af33b28 in apr_socket_timeout_set (sock=sock@entry=0x0, t=t@entry=2000000) at network_io/unix/sockopt.c:86
stat = <optimized out>
#1 0x00000000004706a4 in process_lingering_close (cs=0x7f5c88000ed0) at event.c:1466
csd = 0x0
dummybuf = '\000' <repeats 30216 times>...
nbytes = 0
rv = <optimized out>
q = <optimized out>
#2 0x000000000047154b in worker_thread (thd=0x1a31480, dummy=<optimized out>) at event.c:2142
csd = 0x7f5c88000cd0
cs = 0x0
te = 0x0
ptrans = 0x7f5c88000c58
ti = <optimized out>
process_slot = 3
thread_slot = 1
rv = 0
is_idle = 0
#3 0x00007f5d4a48015a in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4 0x00007f5d49fadf73 in clone () from /lib64/libc.so.6


The reason is that in this case a third party module that has hooked into the pre_connection hook returns an error which causes
the hook to stop running. This prevents core_pre_connection from running which is APR_HOOK_REALLY_LAST.
Hence the socket is not set into c->conn_config and hence ap_get_conn_socket returns NULL.
The quick fix that prevents that from happening is

Index: server/mpm/event/event.c
===================================================================
--- server/mpm/event/event.c (revision 1893352)
+++ server/mpm/event/event.c (working copy)
@@ -1040,6 +1040,16 @@
ap_log_cerror(APLOG_MARK, APLOG_DEBUG, 0, c, APLOGNO(00469)
"process_socket: connection aborted");
c->aborted = 1;
+ /*
+ * In case we errored, the pre_connection hook of the core
+ * module maybe did not run (it is APR_HOOK_REALLY_LAST) and
+ * hence the socket might not have been set in c->conn_config
+ * and the lingering close process cannot fetch it from there
+ * via ap_get_conn_socket. Hence set it in this case.
+ */
+ if (!ap_get_conn_socket(c)) {
+ ap_set_core_module_config(c->conn_config, sock);
+ }
}

/**


But I think it is incomplete. I think we should do all that core_pre_connection does.
We cannot call it from event.c as it is a static in core.c. If we agree that we want to execute what it does I see two ways forward:

1. Make core_pre_connection part of the public API and have the pre_connection hook of core just call it.
2. Create a wrapper around ap_run_pre_connection as a public API and if ap_run_pre_connection returns != OK && != DONE do
things like setting c->aborted to 1 and call core_pre_connection or do a subset of it.

I would be in favour of 2. as 1. seems to provide an API function for only a very specific case and of limited use. 2. seems to
deliver a more "save" version of ap_run_pre_connection that could be used in several locations as a drop in replacement for
ap_run_pre_connection.

Regards

Rüdiger
Re: crash in process_lingering_close [ In reply to ]
> Am 21.09.2021 um 11:48 schrieb Ruediger Pluem <rpluem@apache.org>:
>
> I noticed a crash in process_lingering_close
>
> The crash happens here:
>
> #0 0x00007f5d4af33b28 in apr_socket_timeout_set (sock=sock@entry=0x0, t=t@entry=2000000) at network_io/unix/sockopt.c:86
> stat = <optimized out>
> #1 0x00000000004706a4 in process_lingering_close (cs=0x7f5c88000ed0) at event.c:1466
> csd = 0x0
> dummybuf = '\000' <repeats 30216 times>...
> nbytes = 0
> rv = <optimized out>
> q = <optimized out>
> #2 0x000000000047154b in worker_thread (thd=0x1a31480, dummy=<optimized out>) at event.c:2142
> csd = 0x7f5c88000cd0
> cs = 0x0
> te = 0x0
> ptrans = 0x7f5c88000c58
> ti = <optimized out>
> process_slot = 3
> thread_slot = 1
> rv = 0
> is_idle = 0
> #3 0x00007f5d4a48015a in start_thread () from /lib64/libpthread.so.0
> No symbol table info available.
> #4 0x00007f5d49fadf73 in clone () from /lib64/libc.so.6
>
>
> The reason is that in this case a third party module that has hooked into the pre_connection hook returns an error which causes
> the hook to stop running. This prevents core_pre_connection from running which is APR_HOOK_REALLY_LAST.
> Hence the socket is not set into c->conn_config and hence ap_get_conn_socket returns NULL.
> The quick fix that prevents that from happening is
>
> Index: server/mpm/event/event.c
> ===================================================================
> --- server/mpm/event/event.c (revision 1893352)
> +++ server/mpm/event/event.c (working copy)
> @@ -1040,6 +1040,16 @@
> ap_log_cerror(APLOG_MARK, APLOG_DEBUG, 0, c, APLOGNO(00469)
> "process_socket: connection aborted");
> c->aborted = 1;
> + /*
> + * In case we errored, the pre_connection hook of the core
> + * module maybe did not run (it is APR_HOOK_REALLY_LAST) and
> + * hence the socket might not have been set in c->conn_config
> + * and the lingering close process cannot fetch it from there
> + * via ap_get_conn_socket. Hence set it in this case.
> + */
> + if (!ap_get_conn_socket(c)) {
> + ap_set_core_module_config(c->conn_config, sock);
> + }
> }
>
> /**
>
>
> But I think it is incomplete. I think we should do all that core_pre_connection does.
> We cannot call it from event.c as it is a static in core.c. If we agree that we want to execute what it does I see two ways forward:
>
> 1. Make core_pre_connection part of the public API and have the pre_connection hook of core just call it.
> 2. Create a wrapper around ap_run_pre_connection as a public API and if ap_run_pre_connection returns != OK && != DONE do
> things like setting c->aborted to 1 and call core_pre_connection or do a subset of it.
>
> I would be in favour of 2. as 1. seems to provide an API function for only a very specific case and of limited use. 2. seems to
> deliver a more "save" version of ap_run_pre_connection that could be used in several locations as a drop in replacement for
> ap_run_pre_connection.

+1 for approach 2
Re: crash in process_lingering_close [ In reply to ]
On Tue, Sep 21, 2021 at 11:52 AM stefan@eissing.org <stefan@eissing.org> wrote:
>
>
>
> > Am 21.09.2021 um 11:48 schrieb Ruediger Pluem <rpluem@apache.org>:
> >
> > I noticed a crash in process_lingering_close
> >
> > The crash happens here:
> >
> > #0 0x00007f5d4af33b28 in apr_socket_timeout_set (sock=sock@entry=0x0, t=t@entry=2000000) at network_io/unix/sockopt.c:86
> > stat = <optimized out>
> > #1 0x00000000004706a4 in process_lingering_close (cs=0x7f5c88000ed0) at event.c:1466
> > csd = 0x0
> > dummybuf = '\000' <repeats 30216 times>...
> > nbytes = 0
> > rv = <optimized out>
> > q = <optimized out>
> > #2 0x000000000047154b in worker_thread (thd=0x1a31480, dummy=<optimized out>) at event.c:2142
> > csd = 0x7f5c88000cd0
> > cs = 0x0
> > te = 0x0
> > ptrans = 0x7f5c88000c58
> > ti = <optimized out>
> > process_slot = 3
> > thread_slot = 1
> > rv = 0
> > is_idle = 0
> > #3 0x00007f5d4a48015a in start_thread () from /lib64/libpthread.so.0
> > No symbol table info available.
> > #4 0x00007f5d49fadf73 in clone () from /lib64/libc.so.6
> >
> >
> > The reason is that in this case a third party module that has hooked into the pre_connection hook returns an error which causes
> > the hook to stop running. This prevents core_pre_connection from running which is APR_HOOK_REALLY_LAST.
> > Hence the socket is not set into c->conn_config and hence ap_get_conn_socket returns NULL.
> > The quick fix that prevents that from happening is
> >
> > Index: server/mpm/event/event.c
> > ===================================================================
> > --- server/mpm/event/event.c (revision 1893352)
> > +++ server/mpm/event/event.c (working copy)
> > @@ -1040,6 +1040,16 @@
> > ap_log_cerror(APLOG_MARK, APLOG_DEBUG, 0, c, APLOGNO(00469)
> > "process_socket: connection aborted");
> > c->aborted = 1;
> > + /*
> > + * In case we errored, the pre_connection hook of the core
> > + * module maybe did not run (it is APR_HOOK_REALLY_LAST) and
> > + * hence the socket might not have been set in c->conn_config
> > + * and the lingering close process cannot fetch it from there
> > + * via ap_get_conn_socket. Hence set it in this case.
> > + */
> > + if (!ap_get_conn_socket(c)) {
> > + ap_set_core_module_config(c->conn_config, sock);
> > + }
> > }
> >
> > /**
> >
> >
> > But I think it is incomplete. I think we should do all that core_pre_connection does.
> > We cannot call it from event.c as it is a static in core.c. If we agree that we want to execute what it does I see two ways forward:
> >
> > 1. Make core_pre_connection part of the public API and have the pre_connection hook of core just call it.
> > 2. Create a wrapper around ap_run_pre_connection as a public API and if ap_run_pre_connection returns != OK && != DONE do
> > things like setting c->aborted to 1 and call core_pre_connection or do a subset of it.
> >
> > I would be in favour of 2. as 1. seems to provide an API function for only a very specific case and of limited use. 2. seems to
> > deliver a more "save" version of ap_run_pre_connection that could be used in several locations as a drop in replacement for
> > ap_run_pre_connection.
>
> +1 for approach 2

+1
Re: crash in process_lingering_close [ In reply to ]
On 9/21/21 12:03 PM, Yann Ylavic wrote:
> On Tue, Sep 21, 2021 at 11:52 AM stefan@eissing.org <stefan@eissing.org> wrote:
>>
>>
>>
>>> Am 21.09.2021 um 11:48 schrieb Ruediger Pluem <rpluem@apache.org>:
>>>

>>>
>>> But I think it is incomplete. I think we should do all that core_pre_connection does.
>>> We cannot call it from event.c as it is a static in core.c. If we agree that we want to execute what it does I see two ways forward:
>>>
>>> 1. Make core_pre_connection part of the public API and have the pre_connection hook of core just call it.
>>> 2. Create a wrapper around ap_run_pre_connection as a public API and if ap_run_pre_connection returns != OK && != DONE do
>>> things like setting c->aborted to 1 and call core_pre_connection or do a subset of it.
>>>
>>> I would be in favour of 2. as 1. seems to provide an API function for only a very specific case and of limited use. 2. seems to
>>> deliver a more "save" version of ap_run_pre_connection that could be used in several locations as a drop in replacement for
>>> ap_run_pre_connection.
>>
>> +1 for approach 2
>
> +1
>

How about the below?

Index: include/ap_mmn.h
===================================================================
--- include/ap_mmn.h (revision 1893352)
+++ include/ap_mmn.h (working copy)
@@ -680,6 +680,7 @@
* 20210531.3 (2.5.1-dev) Add hook child_stopping to get informed that a child
* is being shut down.
* 20210531.4 (2.5.1-dev) Add ap_create_connection
+ * 20210531.5 (2.5.1-dev) Add ap_pre_connection
*/

#define MODULE_MAGIC_COOKIE 0x41503235UL /* "AP25" */
@@ -687,7 +688,7 @@
#ifndef MODULE_MAGIC_NUMBER_MAJOR
#define MODULE_MAGIC_NUMBER_MAJOR 20210531
#endif
-#define MODULE_MAGIC_NUMBER_MINOR 4 /* 0...n */
+#define MODULE_MAGIC_NUMBER_MINOR 5 /* 0...n */

/**
* Determine if the server's current MODULE_MAGIC_NUMBER is at least a
Index: include/http_connection.h
===================================================================
--- include/http_connection.h (revision 1893352)
+++ include/http_connection.h (working copy)
@@ -135,7 +135,23 @@
*/
AP_DECLARE_HOOK(int,pre_close_connection,(conn_rec *c))

+
/**
+ * This is a wrapper around ap_run_pre_connection. In case that
+ * ap_run_pre_connection returns an error it marks the connection as
+ * aborted and ensures that the basic connection setup normally done
+ * by the core module is done in case it was not done so far.
+ * @param c The connection on which the request has been received.
+ * Same as for the pre_connection hook.
+ * @param csd The mechanism on which this connection is to be read.
+ * Most times this will be a socket, but it is up to the module
+ * that accepts the request to determine the exact type.
+ * Same as for the pre_connection hook.
+ * @return The result of ap_run_pre_connection
+ */
+AP_DECLARE(int) ap_pre_connection(conn_rec *c, void *csd);
+
+/**
* Create a new server/incoming or client/outgoing/proxy connection
* @param p The pool from which to allocate the connection record
* @param server The server record to create the connection too.
Index: server/core.c
===================================================================
--- server/core.c (revision 1893352)
+++ server/core.c (working copy)
@@ -5557,6 +5557,31 @@
return DONE;
}

+AP_DECLARE(int) ap_pre_connection(conn_rec *c, void *csd)
+{
+ int rc = OK;
+
+ rc = ap_run_pre_connection(c, csd);
+ if (rc != OK && rc != DONE) {
+ c->aborted = 1;
+ /*
+ * In case we errored, the pre_connection hook of the core
+ * module maybe did not run (it is APR_HOOK_REALLY_LAST) and
+ * hence we missed to
+ *
+ * - Put the socket in c->conn_config
+ * - Setup core output and input filters
+ * - Set socket options and timeouts
+ *
+ * Hence call it in this case.
+ */
+ if (!ap_get_conn_socket(c)) {
+ core_pre_connection(c, csd);
+ }
+ }
+ return rc;
+}
+
AP_CORE_DECLARE(conn_rec *) ap_create_slave_connection(conn_rec *c)
{
apr_pool_t *pool;
Index: server/mpm/event/event.c
===================================================================
--- server/mpm/event/event.c (revision 1893352)
+++ server/mpm/event/event.c (working copy)
@@ -1035,11 +1035,10 @@

ap_update_vhost_given_ip(c);

- rc = ap_run_pre_connection(c, sock);
+ rc = ap_pre_connection(c, sock);
if (rc != OK && rc != DONE) {
ap_log_cerror(APLOG_MARK, APLOG_DEBUG, 0, c, APLOGNO(00469)
"process_socket: connection aborted");
- c->aborted = 1;
}

/**


Regards

Rüdiger
Re: crash in process_lingering_close [ In reply to ]
On Tue, Sep 21, 2021 at 1:38 PM Ruediger Pluem <rpluem@apache.org> wrote:
>
> How about the below?

+1, we possibly need to change all the calls to
ap_run_pre_connection() by ap_pre_connection() then, not only in
mpm_event?


Cheers;
Yann.
Re: crash in process_lingering_close [ In reply to ]
On 9/21/21 2:26 PM, Yann Ylavic wrote:
> On Tue, Sep 21, 2021 at 1:38 PM Ruediger Pluem <rpluem@apache.org> wrote:
>>
>> How about the below?
>
> +1, we possibly need to change all the calls to
> ap_run_pre_connection() by ap_pre_connection() then, not only in
> mpm_event?

I am not sure if this is needed everywhere, but yes we should check.

Regards

Rüdiger