module threads and mpm lifecycle
One thing that is currently missing is a way to shut down/reap/join module threads when the MPM exits.

Example:

mod_watchdog creates several threads in post_config, which are only joined on pool destruction.

Problem:

Pool destruction is not ordered between modules, and the dependencies on that order are not
fully known. This leads to crashes in OpenSSL, for example, when mod_ssl is destroyed
before all watchdogs using OpenSSL are joined (OpenSSL 3.x seems to do better on this,
but the point remains valid).

There have been attempts by Yann to write a patch that makes the OpenSSL termination
happen much later (by adding it to a 'higher' pool destruction). But that would only solve
this particular problem and not the same problem for any other 3rd party dependency.

Proposal:

Add a hook 'child_exiting(int graceful)' where modules can register
and do their own thread join/reap.


On a graceful shutdown, this would allow watchdogs to fully complete any ongoing task
before things start to disappear in the process.
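
The idea can be sketched outside of httpd. The following is a minimal, hypothetical model using plain POSIX threads, not the httpd or APR API: `watchdog_t`, `watchdog_start` and `watchdog_child_exiting` are names invented for illustration. A worker thread keeps touching a shared resource, and a hook-style function asks it to stop and joins it, so the resource can be released afterwards without the thread still running.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical module state; none of these names exist in httpd. */
typedef struct {
    pthread_t thread;
    atomic_int stop;        /* set to 1 to ask the worker to finish */
    atomic_ulong ticks;     /* number of completed tasks */
    unsigned long *shared;  /* stands in for state owned by someone else */
    int joined;             /* 1 once the worker has been joined */
} watchdog_t;

static void *watchdog_main(void *arg)
{
    watchdog_t *wd = arg;
    /* The stop flag is only checked between tasks, so a task that has
     * started always runs to completion. */
    while (!atomic_load(&wd->stop)) {
        *wd->shared += 1;                 /* uses the shared resource */
        atomic_fetch_add(&wd->ticks, 1);
    }
    return NULL;
}

/* Starts the worker; returns 0 on success (pthread_create's result). */
int watchdog_start(watchdog_t *wd, unsigned long *shared)
{
    assert(wd != NULL && shared != NULL);
    atomic_init(&wd->stop, 0);
    atomic_init(&wd->ticks, 0);
    wd->shared = shared;
    wd->joined = 0;
    return pthread_create(&wd->thread, NULL, watchdog_main, wd);
}

/* Hook-style callback, meant to run before any cleanup of shared state.
 * A real module might cut long tasks short when graceful == 0; this
 * sketch treats both cases the same. */
void watchdog_child_exiting(watchdog_t *wd, int graceful)
{
    (void)graceful;
    atomic_store(&wd->stop, 1);
    if (pthread_join(wd->thread, NULL) == 0)
        wd->joined = 1;
}
```

Once `watchdog_child_exiting()` has returned, the worker no longer exists, so whatever `shared` points to can be destroyed in any order relative to the module's own cleanup.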

Thoughts?

Kind Regards,
Stefan
Re: module threads and mpm lifecycle
On 2/23/22 11:39 AM, Stefan Eissing wrote:
> One thing that is currently missing is a way to shutdown/reap/join module threads when the mpm exits.
>
> Example:
>
> mod_watchdog creates several threads post_config which are only joined on pool destruction.
>
> Problem:
>
> pool destruction is not ordered between modules and dependencies on order are not
> fully known. This leads to crashes in OpenSSL, for example, when mod_ssl is destructed
> before all watchdogs using OpenSSL are joined (OpenSSL 3.x seems to do better on this,
> but the point remains valid).
>
> There has been attempts by Yann to make a patch that make the OpenSSL termination
> way later (adding it to a 'higher' pool destruction). But that would only solve
> this particular problem and not any other 3rd party dependency.
>
> Proposal:
>
> Add a hook 'child_exiting(int graceful)' where modules can register
> and do their own thread join/reap.

How does that differ from the child_stopping hook? Or is child_stopping only used when we shut down the whole httpd, not just one child?

Regards

Rüdiger
Re: module threads and mpm lifecycle
> On 23.02.2022, at 11:52, Ruediger Pluem <rpluem@apache.org> wrote:
>
>
>
> On 2/23/22 11:39 AM, Stefan Eissing wrote:
>> One thing that is currently missing is a way to shutdown/reap/join module threads when the mpm exits.
>>
>> Example:
>>
>> mod_watchdog creates several threads post_config which are only joined on pool destruction.
>>
>> Problem:
>>
>> pool destruction is not ordered between modules and dependencies on order are not
>> fully known. This leads to crashes in OpenSSL, for example, when mod_ssl is destructed
>> before all watchdogs using OpenSSL are joined (OpenSSL 3.x seems to do better on this,
>> but the point remains valid).
>>
>> There has been attempts by Yann to make a patch that make the OpenSSL termination
>> way later (adding it to a 'higher' pool destruction). But that would only solve
>> this particular problem and not any other 3rd party dependency.
>>
>> Proposal:
>>
>> Add a hook 'child_exiting(int graceful)' where modules can register
>> and do their own thread join/reap.
>
> How does that differ from the child_stopping hook? Or is child_stopping only used when we shutdown the whole httpd not just one child?

child_is_stopping is called when the shutdown is initiated.
child_is_exiting would be called before pool destruction begins.

The difference is that between those two, ongoing requests are still being served on graceful shutdowns.
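
The ordering described here can be written down as a small trace. This is only a sketch of the sequence as stated in this message, not code from any MPM: `trace_t` and `simulate_child_shutdown` are hypothetical, and the event labels are just strings.

```c
#include <assert.h>
#include <string.h>

#define MAX_EVENTS 16

/* Records the order in which lifecycle events happen. */
typedef struct {
    const char *events[MAX_EVENTS];
    int count;
} trace_t;

static void record(trace_t *t, const char *what)
{
    assert(t->count < MAX_EVENTS);
    t->events[t->count++] = what;
}

/* Hypothetical model of one child's shutdown sequence. */
void simulate_child_shutdown(trace_t *t, int graceful, int inflight_requests)
{
    assert(t != NULL);
    record(t, "child_stopping");          /* shutdown is initiated */
    if (graceful) {
        /* On a graceful shutdown, ongoing requests are still served. */
        while (inflight_requests-- > 0)
            record(t, "request_served");
    }
    record(t, "child_exiting");           /* proposed hook: join threads */
    record(t, "pool_destroy");            /* only now do pools go away */
}
```

In this model, a module that joins its threads in the second hook never races with pool destruction, and on a graceful stop it does so only after the remaining requests have finished.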

>
> Regards
>
> Rüdiger
Re: module threads and mpm lifecycle
> On 23.02.2022, at 13:24, Stefan Eissing <stefan@eissing.org> wrote:
>
>
>
>> On 23.02.2022, at 11:52, Ruediger Pluem <rpluem@apache.org> wrote:
>>
>>
>>
>> On 2/23/22 11:39 AM, Stefan Eissing wrote:
>>> One thing that is currently missing is a way to shutdown/reap/join module threads when the mpm exits.
>>>
>>> Example:
>>>
>>> mod_watchdog creates several threads post_config which are only joined on pool destruction.
>>>
>>> Problem:
>>>
>>> pool destruction is not ordered between modules and dependencies on order are not
>>> fully known. This leads to crashes in OpenSSL, for example, when mod_ssl is destructed
>>> before all watchdogs using OpenSSL are joined (OpenSSL 3.x seems to do better on this,
>>> but the point remains valid).
>>>
>>> There has been attempts by Yann to make a patch that make the OpenSSL termination
>>> way later (adding it to a 'higher' pool destruction). But that would only solve
>>> this particular problem and not any other 3rd party dependency.
>>>
>>> Proposal:
>>>
>>> Add a hook 'child_exiting(int graceful)' where modules can register
>>> and do their own thread join/reap.
>>
>> How does that differ from the child_stopping hook? Or is child_stopping only used when we shutdown the whole httpd not just one child?
>
> child_is_stopping is called when the shutdown is initiated.
> child_is_exiting would be called before pool destruction begins.
>
> The difference is that between those two, ongoing requests are still being served on graceful shutdowns.

Opening the bike shed: child_has_stopped?

>
>>
>> Regards
>>
>> Rüdiger
Re: module threads and mpm lifecycle
On Wed, 23 Feb 2022, Stefan Eissing wrote:

>>> On 2/23/22 11:39 AM, Stefan Eissing wrote:
>>>> One thing that is currently missing is a way to shutdown/reap/join module threads when the mpm exits.
>>>>
>>>> Example:
>>>>
>>>> mod_watchdog creates several threads post_config which are only joined on pool destruction.
>>>>
>>>> Problem:
>>>>
>>>> pool destruction is not ordered between modules and dependencies on order are not
>>>> fully known. This leads to crashes in OpenSSL, for example, when mod_ssl is destructed
>>>> before all watchdogs using OpenSSL are joined (OpenSSL 3.x seems to do better on this,
>>>> but the point remains valid).
>>>>
>>>> There has been attempts by Yann to make a patch that make the OpenSSL termination
>>>> way later (adding it to a 'higher' pool destruction). But that would only solve
>>>> this particular problem and not any other 3rd party dependency.
>>>>
>>>> Proposal:
>>>>
>>>> Add a hook 'child_exiting(int graceful)' where modules can register
>>>> and do their own thread join/reap.
>>>
>>> How does that differ from the child_stopping hook? Or is child_stopping only used when we shutdown the whole httpd not just one child?
>>
>> child_is_stopping is called when the shutdown is initiated.
>> child_is_exiting would be called before pool destruction begins.
>>
>> The difference is that between those two, ongoing requests are still being served on graceful shutdowns.

FWIW, I'm +1 to the idea. The current situation is no fun for our
home-grown cache module (i.e. it has threads that crash on pool
destruction now and then).


> Opening the bike shed: child_has_stopped?

Neither name is fully self-explanatory; I'd need to look at the
descriptive function header to figure out when to use it. However, I
don't know if a name such as child_pre_pooldestroy or somesuch would
be any better...

As long as it's not elvis_leaving_building I'm happy ;)

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | nikke@acc.umu.se
---------------------------------------------------------------------------
Gene Roddenberry showed us the future...Make It So!
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: module threads and mpm lifecycle
Added in r1898369 on trunk as `child_stopped`.

Kind Regards,
Stefan

Re: module threads and mpm lifecycle
On 23 Feb 2022, at 12:39, Stefan Eissing <stefan@eissing.org> wrote:

> One thing that is currently missing is a way to shutdown/reap/join module threads when the mpm exits.
>
> Example:
>
> mod_watchdog creates several threads post_config which are only joined on pool destruction.
>
> Problem:
>
> pool destruction is not ordered between modules and dependencies on order are not
> fully known. This leads to crashes in OpenSSL, for example, when mod_ssl is destructed
> before all watchdogs using OpenSSL are joined (OpenSSL 3.x seems to do better on this,
> but the point remains valid).

A note on OpenSSL termination - in old versions of OpenSSL, there was this wrong assumption that an application would only ever initialise OpenSSL once, and in turn shut down OpenSSL once, but this scenario doesn’t exist in the real world (like ours, where modules that have no idea about each other independently use OpenSSL).

OpenSSL 3 (and I think earlier versions, I don’t remember) has gone further than “do better”: it has (to my knowledge) 100% fixed this problem by turning all OpenSSL init functions into no-ops and using reference counting internally to ensure sanity is maintained at all times.

If any of this doesn't work, we need to let OpenSSL know.

> There has been attempts by Yann to make a patch that make the OpenSSL termination
> way later (adding it to a 'higher' pool destruction). But that would only solve
> this particular problem and not any other 3rd party dependency.
>
> Proposal:
>
> Add a hook 'child_exiting(int graceful)' where modules can register
> and do their own thread join/reap.
>
>
> On a graceful shutdown, this would allow watchdogs to fully complete any ongoing task
> before things start to disappear in the process.

+1 - will be very useful.

Regards,
Graham
Re: module threads and mpm lifecycle
> On 25.02.2022, at 14:15, Graham Leggett <minfrin@sharp.fm> wrote:
>
> On 23 Feb 2022, at 12:39, Stefan Eissing <stefan@eissing.org> wrote:
>
>> One thing that is currently missing is a way to shutdown/reap/join module threads when the mpm exits.
>>
>> Example:
>>
>> mod_watchdog creates several threads post_config which are only joined on pool destruction.
>>
>> Problem:
>>
>> pool destruction is not ordered between modules and dependencies on order are not
>> fully known. This leads to crashes in OpenSSL, for example, when mod_ssl is destructed
>> before all watchdogs using OpenSSL are joined (OpenSSL 3.x seems to do better on this,
>> but the point remains valid).
>
> A note on OpenSSL termination - in old versions of OpenSSL, there was this wrong assumption that an application would only ever initialise OpenSSL once, and in turn shut down OpenSSL once, but this scenario doesn’t exist in the real world (like ours, where modules that have no idea about each other independently use OpenSSL).
>
> OpenSSL 3 (and I think earlier versions, I don’t remember) have gone further than “do better”, they have (to my knowledge) 100% fixed this problem by turning all OpenSSL init functions into noops and using reference counting internally to ensure sanity is maintained at all times.
>
> If any of this doesn't work, we need to let OpenSSL know.
>
>> There has been attempts by Yann to make a patch that make the OpenSSL termination
>> way later (adding it to a 'higher' pool destruction). But that would only solve
>> this particular problem and not any other 3rd party dependency.
>>
>> Proposal:
>>
>> Add a hook 'child_exiting(int graceful)' where modules can register
>> and do their own thread join/reap.
>>
>>
>> On a graceful shutdown, this would allow watchdogs to fully complete any ongoing task
>> before things start to disappear in the process.
>
> +1 - will be very useful.

Thanks. I added this and support in mod_watchdog. Results from mpm_event seem to
indicate that this works.

However, I get strange failures/crashes with mpm_prefork, seemingly when clean_child_exit()
is invoked directly from the signal handler. I have not been able to wrap my head around
what is causing this.

I disabled calling the child_stopping/child_stopped hooks when this is invoked by the signal
handler. This should revert mpm_prefork to the state before and have it passing again in our
Travis CI. If this still fails, I screwed up somewhere else.
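
The shape of that workaround can be sketched as a guard around the hook run. This is a hypothetical model, not the mpm_prefork code: `hook_list_t`, `run_exit_hooks` and `count_hook` are invented for illustration. It does not explain the crashes; it only shows hooks being skipped on the signal-handler path. One general reason for caution there is that POSIX only guarantees a small set of async-signal-safe functions inside a handler, and thread joins are not among them.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOOKS 4

typedef void (*exit_hook_fn)(int graceful);

/* Hypothetical list of registered exit hooks. */
typedef struct {
    exit_hook_fn hooks[MAX_HOOKS];
    int count;
} hook_list_t;

/* Example hook that only counts how often it was called. */
int hook_calls = 0;

void count_hook(int graceful)
{
    (void)graceful;
    hook_calls++;
}

/* Runs the hooks unless the exit path was entered from a signal handler.
 * Returns the number of hooks that were run. */
int run_exit_hooks(const hook_list_t *list, int graceful,
                   int from_signal_handler)
{
    assert(list != NULL && list->count <= MAX_HOOKS);
    if (from_signal_handler)
        return 0;   /* previous behaviour: no hooks on this path */
    for (int i = 0; i < list->count; i++)
        list->hooks[i](graceful);
    return list->count;
}
```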

Kind Regards,
Stefan

>
> Regards,
> Graham
> —