Mailing List Archive

Re: svn commit: r1894285 - in /httpd/httpd/trunk: changes-entries/event-kill_at_total_daemons_limit.txt server/mpm/event/event.c
On Fri, Oct 15, 2021 at 6:29 AM <ylavic@apache.org> wrote:
>
> Author: ylavic
> Date: Fri Oct 15 10:29:00 2021
> New Revision: 1894285
>
> URL: http://svn.apache.org/viewvc?rev=1894285&view=rev
> Log:
> mpm_event: Restart stopping of idle children after a load peak. PR 65626.
>
> r1770752 added an heuristic to avoid stopping children when the load triggers
> MaxSpareThreads but children take some time to shut down until the point where
> active_daemons_limit/ServerLimit is reached (scoreboard full) and no child gets
> created to handle incoming connections.
>
> However when this happens there is nothing to stop children again when the load
> settles down (besides MaxRequestsPerChild, which may be 0) so let's restart to
> stop children again if/when idle_thread_count reaches max_workers / 4.

Wouldn't the slow-to-exit (but eventually exiting) gracefully shutting
down processes eventually bring total_daemons down and start killing
more processes again?
It seems like if they are not slow but effectively never exiting, the
change could do more harm by filling the scoreboard quicker.

But then I can't explain the PR this came from
https://bz.apache.org/bugzilla/show_bug.cgi?id=65626
Re: svn commit: r1894285 - in /httpd/httpd/trunk: changes-entries/event-kill_at_total_daemons_limit.txt server/mpm/event/event.c [ In reply to ]
On 11/30/21 2:53 PM, Eric Covener wrote:
> On Fri, Oct 15, 2021 at 6:29 AM <ylavic@apache.org> wrote:
>>
>> Author: ylavic
>> Date: Fri Oct 15 10:29:00 2021
>> New Revision: 1894285
>>
>> URL: http://svn.apache.org/viewvc?rev=1894285&view=rev
>> Log:
>> mpm_event: Restart stopping of idle children after a load peak. PR 65626.
>>
>> r1770752 added an heuristic to avoid stopping children when the load triggers
>> MaxSpareThreads but children take some time to shut down until the point where
>> active_daemons_limit/ServerLimit is reached (scoreboard full) and no child gets
>> created to handle incoming connections.
>>
>> However when this happens there is nothing to stop children again when the load
>> settles down (besides MaxRequestsPerChild, which may be 0) so let's restart to
>> stop children again if/when idle_thread_count reaches max_workers / 4.
>
> Wouldn't the slow-to-exit (but eventually exiting) gracefully shutting
> down processes eventually bring total_daemons down and start killing
> more processes again?
> It seems like if they are not slow but effectively never exiting, the
> change could do more harm by filling the scoreboard quicker.
>
> But then I can't explain the PR this came from
> https://bz.apache.org/bugzilla/show_bug.cgi?id=65626

The issue is that in the PR ServerLimit == MaxRequestWorkers / ThreadsPerChild.
Once you reached ServerLimit and there are no dying children, child processes will not be shutdown any longer.
From a quick glance this can only happen if ServerLimit == MaxRequestWorkers / ThreadsPerChild. If it is larger
shutdown should start again some time as some children must be in the process of dying.

Regards

Rüdiger
Re: svn commit: r1894285 - in /httpd/httpd/trunk: changes-entries/event-kill_at_total_daemons_limit.txt server/mpm/event/event.c [ In reply to ]
On Wed, Dec 1, 2021 at 9:16 AM Ruediger Pluem <rpluem@apache.org> wrote:
>
>
>
> On 11/30/21 2:53 PM, Eric Covener wrote:
> > On Fri, Oct 15, 2021 at 6:29 AM <ylavic@apache.org> wrote:
> >>
> >> Author: ylavic
> >> Date: Fri Oct 15 10:29:00 2021
> >> New Revision: 1894285
> >>
> >> URL: http://svn.apache.org/viewvc?rev=1894285&view=rev
> >> Log:
> >> mpm_event: Restart stopping of idle children after a load peak. PR 65626.
> >>
> >> r1770752 added an heuristic to avoid stopping children when the load triggers
> >> MaxSpareThreads but children take some time to shut down until the point where
> >> active_daemons_limit/ServerLimit is reached (scoreboard full) and no child gets
> >> created to handle incoming connections.
> >>
> >> However when this happens there is nothing to stop children again when the load
> >> settles down (besides MaxRequestsPerChild, which may be 0) so let's restart to
> >> stop children again if/when idle_thread_count reaches max_workers / 4.
> >
> > Wouldn't the slow-to-exit (but eventually exiting) gracefully shutting
> > down processes eventually bring total_daemons down and start killing
> > more processes again?
> > It seems like if they are not slow but effectively never exiting, the
> > change could do more harm by filling the scoreboard quicker.
> >
> > But then I can't explain the PR this came from
> > https://bz.apache.org/bugzilla/show_bug.cgi?id=65626
>
> The issue is that in the PR ServerLimit == MaxRequestWorkers / ThreadsPerChild.
> Once you reached ServerLimit and there are no dying children, child processes will not be shutdown any longer.
> From a quick glance this can only happen if ServerLimit == MaxRequestWorkers / ThreadsPerChild. If it is larger
> shutdown should start again some time as some children must be in the process of dying.

Thanks, that makes sense.
I wonder if the patch here could maybe key off of whether there is
slack space in ServerLimit or if
retained->total_daemons == active_daemons_limit to avoid creating more
slow to exit children when
the new condition is not the only way out?
Re: svn commit: r1894285 - in /httpd/httpd/trunk: changes-entries/event-kill_at_total_daemons_limit.txt server/mpm/event/event.c [ In reply to ]
On 12/1/21 3:59 PM, Eric Covener wrote:
> On Wed, Dec 1, 2021 at 9:16 AM Ruediger Pluem <rpluem@apache.org> wrote:
>>
>>
>>
>> On 11/30/21 2:53 PM, Eric Covener wrote:
>>> On Fri, Oct 15, 2021 at 6:29 AM <ylavic@apache.org> wrote:
>>>>
>>>> Author: ylavic
>
>>>
>>> But then I can't explain the PR this came from
>>> https://bz.apache.org/bugzilla/show_bug.cgi?id=65626
>>
>> The issue is that in the PR ServerLimit == MaxRequestWorkers / ThreadsPerChild.
>> Once you reached ServerLimit and there are no dying children, child processes will not be shutdown any longer.
>> From a quick glance this can only happen if ServerLimit == MaxRequestWorkers / ThreadsPerChild. If it is larger
>> shutdown should start again some time as some children must be in the process of dying.
>
> Thanks, that makes sense.
> I wonder if the patch here could maybe key off of whether there is
> slack space in ServerLimit or if
> retained->total_daemons == active_daemons_limit to avoid creating more
> slow to exit children when
> the new condition is not the only way out?

You mean something like:

Index: server/mpm/event/event.c
===================================================================
--- server/mpm/event/event.c (revision 1895463)
+++ server/mpm/event/event.c (working copy)
@@ -3201,7 +3201,8 @@
* exits by itself (i.e. MaxRequestsPerChild reached), so the below
* test makes sure that the situation unblocks when the load falls
* significantly (regardless of MaxRequestsPerChild, e.g. 0) */
- || idle_thread_count > max_workers/4 / num_buckets) {
+ || (active_daemons_limit == server_limit
+ && idle_thread_count > max_workers/4 / num_buckets)) {
/* Kill off one child */
ap_mpm_podx_signal(retained->buckets[child_bucket].pod,
AP_MPM_PODX_GRACEFUL);


Regards

Rüdiger
Re: svn commit: r1894285 - in /httpd/httpd/trunk: changes-entries/event-kill_at_total_daemons_limit.txt server/mpm/event/event.c [ In reply to ]
On Wed, Dec 1, 2021 at 5:19 PM Ruediger Pluem <rpluem@apache.org> wrote:
>
> On 12/1/21 3:59 PM, Eric Covener wrote:
> >
> > I wonder if the patch here could maybe key off of whether there is
> > slack space in ServerLimit or if
> > retained->total_daemons == active_daemons_limit to avoid creating more
> > slow to exit children when
> > the new condition is not the only way out?
>
> You mean something like:
>
> Index: server/mpm/event/event.c
> ===================================================================
> --- server/mpm/event/event.c (revision 1895463)
> +++ server/mpm/event/event.c (working copy)
> @@ -3201,7 +3201,8 @@
> * exits by itself (i.e. MaxRequestsPerChild reached), so the below
> * test makes sure that the situation unblocks when the load falls
> * significantly (regardless of MaxRequestsPerChild, e.g. 0) */
> - || idle_thread_count > max_workers/4 / num_buckets) {
> + || (active_daemons_limit == server_limit
> + && idle_thread_count > max_workers/4 / num_buckets)) {
> /* Kill off one child */
> ap_mpm_podx_signal(retained->buckets[child_bucket].pod,
> AP_MPM_PODX_GRACEFUL);

Or this:
--- server/mpm/event/event.c (revision 1895394)
+++ server/mpm/event/event.c (working copy)
@@ -3195,13 +3195,15 @@ static void perform_idle_server_maintenance(int ch
* XXX depending on server load, later be able to resurrect them
* or kill them
*/
- if ((retained->total_daemons <= active_daemons_limit
- && retained->total_daemons < server_limit)
+ if (retained->total_daemons < active_daemons_limit
/* The above test won't transition from true to false until a child
- * exits by itself (i.e. MaxRequestsPerChild reached), so the below
- * test makes sure that the situation unblocks when the load falls
- * significantly (regardless of MaxRequestsPerChild, e.g. 0) */
- || idle_thread_count > max_workers/4 / num_buckets) {
+ * exits by itself (i.e. old generation or MaxRequestsPerChild), so
+ * the below test makes sure that the situation unblocks when the
+ * load drops significantly and there are no child of an old gen
+ * occupying the scoreboard already (above active_daemons_limit) so
+ * to avoid another squatter potentially slow to exit too. */
+ || (retained->total_daemons == active_daemons_limit
+ && idle_thread_count > max_workers/4 / num_buckets)) {
/* Kill off one child */
ap_mpm_podx_signal(retained->buckets[child_bucket].pod,
AP_MPM_PODX_GRACEFUL);
?
It would unblock the situation only if/when there are no old gen
processes in the scoreboard, though it'd works only if ServerLimit >
active_daemons_limit (if they are equals, we can't/don't account for
quiescing processes).

Or maybe we could simply:
--- server/mpm/event/event.c (revision 1895394)
+++ server/mpm/event/event.c (working copy)
@@ -3186,8 +3186,8 @@ static void perform_idle_server_maintenance(int ch
* requests. If the server load changes many times, many such
* gracefully finishing processes may accumulate, filling up the
* scoreboard. To avoid running out of scoreboard entries, we
- * don't shut down more processes when the total number of processes
- * is high, until there are more than max_workers/4 idle threads.
+ * don't shut down more processes if there are quiescing ones
+ * already (i.e. retained->total_daemons > active_daemons).
*
* XXX It would be nice if we could
* XXX - kill processes without keepalive connections first
@@ -3195,13 +3195,7 @@ static void perform_idle_server_maintenance(int ch
* XXX depending on server load, later be able to resurrect them
* or kill them
*/
- if ((retained->total_daemons <= active_daemons_limit
- && retained->total_daemons < server_limit)
- /* The above test won't transition from true to false until a child
- * exits by itself (i.e. MaxRequestsPerChild reached), so the below
- * test makes sure that the situation unblocks when the load falls
- * significantly (regardless of MaxRequestsPerChild, e.g. 0) */
- || idle_thread_count > max_workers/4 / num_buckets) {
+ if (retained->total_daemons == active_daemons) {
/* Kill off one child */
ap_mpm_podx_signal(retained->buckets[child_bucket].pod,
AP_MPM_PODX_GRACEFUL);
?
And then kill processes only if there are no stopping ones already..


Cheers;
Yann.
Re: svn commit: r1894285 - in /httpd/httpd/trunk: changes-entries/event-kill_at_total_daemons_limit.txt server/mpm/event/event.c [ In reply to ]
On 12/2/21 4:32 PM, Yann Ylavic wrote:
> On Wed, Dec 1, 2021 at 5:19 PM Ruediger Pluem <rpluem@apache.org> wrote:
>>
>> On 12/1/21 3:59 PM, Eric Covener wrote:
>>>
>>> I wonder if the patch here could maybe key off of whether there is
>>> slack space in ServerLimit or if
>>> retained->total_daemons == active_daemons_limit to avoid creating more
>>> slow to exit children when
>>> the new condition is not the only way out?
>>
>> You mean something like:
>>
>> Index: server/mpm/event/event.c
>> ===================================================================
>> --- server/mpm/event/event.c (revision 1895463)
>> +++ server/mpm/event/event.c (working copy)
>> @@ -3201,7 +3201,8 @@
>> * exits by itself (i.e. MaxRequestsPerChild reached), so the below
>> * test makes sure that the situation unblocks when the load falls
>> * significantly (regardless of MaxRequestsPerChild, e.g. 0) */
>> - || idle_thread_count > max_workers/4 / num_buckets) {
>> + || (active_daemons_limit == server_limit
>> + && idle_thread_count > max_workers/4 / num_buckets)) {
>> /* Kill off one child */
>> ap_mpm_podx_signal(retained->buckets[child_bucket].pod,
>> AP_MPM_PODX_GRACEFUL);
>
> Or this:
> --- server/mpm/event/event.c (revision 1895394)
> +++ server/mpm/event/event.c (working copy)
> @@ -3195,13 +3195,15 @@ static void perform_idle_server_maintenance(int ch
> * XXX depending on server load, later be able to resurrect them
> * or kill them
> */
> - if ((retained->total_daemons <= active_daemons_limit
> - && retained->total_daemons < server_limit)
> + if (retained->total_daemons < active_daemons_limit
> /* The above test won't transition from true to false until a child
> - * exits by itself (i.e. MaxRequestsPerChild reached), so the below
> - * test makes sure that the situation unblocks when the load falls
> - * significantly (regardless of MaxRequestsPerChild, e.g. 0) */
> - || idle_thread_count > max_workers/4 / num_buckets) {
> + * exits by itself (i.e. old generation or MaxRequestsPerChild), so
> + * the below test makes sure that the situation unblocks when the
> + * load drops significantly and there are no child of an old gen
> + * occupying the scoreboard already (above active_daemons_limit) so
> + * to avoid another squatter potentially slow to exit too. */
> + || (retained->total_daemons == active_daemons_limit
> + && idle_thread_count > max_workers/4 / num_buckets)) {
> /* Kill off one child */
> ap_mpm_podx_signal(retained->buckets[child_bucket].pod,
> AP_MPM_PODX_GRACEFUL);
> ?
> It would unblock the situation only if/when there are no old gen
> processes in the scoreboard, though it'd works only if ServerLimit >
> active_daemons_limit (if they are equals, we can't/don't account for
> quiescing processes).
>
> Or maybe we could simply:
> --- server/mpm/event/event.c (revision 1895394)
> +++ server/mpm/event/event.c (working copy)
> @@ -3186,8 +3186,8 @@ static void perform_idle_server_maintenance(int ch
> * requests. If the server load changes many times, many such
> * gracefully finishing processes may accumulate, filling up the
> * scoreboard. To avoid running out of scoreboard entries, we
> - * don't shut down more processes when the total number of processes
> - * is high, until there are more than max_workers/4 idle threads.
> + * don't shut down more processes if there are quiescing ones
> + * already (i.e. retained->total_daemons > active_daemons).
> *
> * XXX It would be nice if we could
> * XXX - kill processes without keepalive connections first
> @@ -3195,13 +3195,7 @@ static void perform_idle_server_maintenance(int ch
> * XXX depending on server load, later be able to resurrect them
> * or kill them
> */
> - if ((retained->total_daemons <= active_daemons_limit
> - && retained->total_daemons < server_limit)
> - /* The above test won't transition from true to false until a child
> - * exits by itself (i.e. MaxRequestsPerChild reached), so the below
> - * test makes sure that the situation unblocks when the load falls
> - * significantly (regardless of MaxRequestsPerChild, e.g. 0) */
> - || idle_thread_count > max_workers/4 / num_buckets) {
> + if (retained->total_daemons == active_daemons) {

Wouldn't this effectively disable shutting down processes until we hit the limit even if shrinking is desired far before we reach
this?
Example:

MaxRequestWorkers 1000
ThreadsPerChild 10
MinSpareThreads 10
MaxSpareThreads 30
Serverlimit 105
StartServers 2

In case the server gets 40 connections in parallel after startup it would spin up to 5 processes if it gets idle afterwards it
would not shutdown children any longer.


Regards

Rüdiger
Re: svn commit: r1894285 - in /httpd/httpd/trunk: changes-entries/event-kill_at_total_daemons_limit.txt server/mpm/event/event.c [ In reply to ]
On Thu, Dec 2, 2021 at 5:50 PM Ruediger Pluem <rpluem@apache.org> wrote:
>
> On 12/2/21 4:32 PM, Yann Ylavic wrote:
> >
> > Or maybe we could simply:
> > --- server/mpm/event/event.c (revision 1895394)
> > +++ server/mpm/event/event.c (working copy)
> > @@ -3186,8 +3186,8 @@ static void perform_idle_server_maintenance(int ch
> > * requests. If the server load changes many times, many such
> > * gracefully finishing processes may accumulate, filling up the
> > * scoreboard. To avoid running out of scoreboard entries, we
> > - * don't shut down more processes when the total number of processes
> > - * is high, until there are more than max_workers/4 idle threads.
> > + * don't shut down more processes if there are quiescing ones
> > + * already (i.e. retained->total_daemons > active_daemons).
> > *
> > * XXX It would be nice if we could
> > * XXX - kill processes without keepalive connections first
> > @@ -3195,13 +3195,7 @@ static void perform_idle_server_maintenance(int ch
> > * XXX depending on server load, later be able to resurrect them
> > * or kill them
> > */
> > - if ((retained->total_daemons <= active_daemons_limit
> > - && retained->total_daemons < server_limit)
> > - /* The above test won't transition from true to false until a child
> > - * exits by itself (i.e. MaxRequestsPerChild reached), so the below
> > - * test makes sure that the situation unblocks when the load falls
> > - * significantly (regardless of MaxRequestsPerChild, e.g. 0) */
> > - || idle_thread_count > max_workers/4 / num_buckets) {
> > + if (retained->total_daemons == active_daemons) {
>
> Wouldn't this effectively disable shutting down processes until we hit the limit even if shrinking is desired far before we reach
> this?

Possibly not because we are still under:
if (idle_thread_count > max_spare_threads / num_buckets)
?

> Example:
>
> MaxRequestWorkers 1000
> ThreadsPerChild 10
> MinSpareThreads 10
> MaxSpareThreads 30
> Serverlimit 105
> StartServers 2
>
> In case the server gets 40 connections in parallel after startup it would spin up to 5 processes if it gets idle afterwards it
> would not shutdown children any longer.

We are above MaxSpareThreads in this case so it would trigger as soon
as it gets idle (unless a graceful was asked during the load, but
immediately after all the old gen processes have stopped otherwise).

Regards;
Yann.
Re: svn commit: r1894285 - in /httpd/httpd/trunk: changes-entries/event-kill_at_total_daemons_limit.txt server/mpm/event/event.c [ In reply to ]
On Thu, Dec 2, 2021 at 4:32 PM Yann Ylavic <ylavic.dev@gmail.com> wrote:
>
> Or maybe we could simply:
> --- server/mpm/event/event.c (revision 1895394)
> +++ server/mpm/event/event.c (working copy)
> @@ -3186,8 +3186,8 @@ static void perform_idle_server_maintenance(int ch
> * requests. If the server load changes many times, many such
> * gracefully finishing processes may accumulate, filling up the
> * scoreboard. To avoid running out of scoreboard entries, we
> - * don't shut down more processes when the total number of processes
> - * is high, until there are more than max_workers/4 idle threads.
> + * don't shut down more processes if there are quiescing ones
> + * already (i.e. retained->total_daemons > active_daemons).
> *
> * XXX It would be nice if we could
> * XXX - kill processes without keepalive connections first
> @@ -3195,13 +3195,7 @@ static void perform_idle_server_maintenance(int ch
> * XXX depending on server load, later be able to resurrect them
> * or kill them
> */
> - if ((retained->total_daemons <= active_daemons_limit
> - && retained->total_daemons < server_limit)
> - /* The above test won't transition from true to false until a child
> - * exits by itself (i.e. MaxRequestsPerChild reached), so the below
> - * test makes sure that the situation unblocks when the load falls
> - * significantly (regardless of MaxRequestsPerChild, e.g. 0) */
> - || idle_thread_count > max_workers/4 / num_buckets) {
> + if (retained->total_daemons == active_daemons) {
> /* Kill off one child */
> ap_mpm_podx_signal(retained->buckets[child_bucket].pod,
> AP_MPM_PODX_GRACEFUL);
> ?
> And then kill processes only if there are no stopping ones already..

Full patch attached, we need to retain active_daemons too for this to
work actually.

Cheers;
Yann.
Re: svn commit: r1894285 - in /httpd/httpd/trunk: changes-entries/event-kill_at_total_daemons_limit.txt server/mpm/event/event.c [ In reply to ]
On 12/2/21 6:32 PM, Yann Ylavic wrote:
> On Thu, Dec 2, 2021 at 5:50 PM Ruediger Pluem <rpluem@apache.org> wrote:
>>
>> On 12/2/21 4:32 PM, Yann Ylavic wrote:
>>>
>>> Or maybe we could simply:
>>> --- server/mpm/event/event.c (revision 1895394)
>>> +++ server/mpm/event/event.c (working copy)
>>> @@ -3186,8 +3186,8 @@ static void perform_idle_server_maintenance(int ch
>>> * requests. If the server load changes many times, many such
>>> * gracefully finishing processes may accumulate, filling up the
>>> * scoreboard. To avoid running out of scoreboard entries, we
>>> - * don't shut down more processes when the total number of processes
>>> - * is high, until there are more than max_workers/4 idle threads.
>>> + * don't shut down more processes if there are quiescing ones
>>> + * already (i.e. retained->total_daemons > active_daemons).
>>> *
>>> * XXX It would be nice if we could
>>> * XXX - kill processes without keepalive connections first
>>> @@ -3195,13 +3195,7 @@ static void perform_idle_server_maintenance(int ch
>>> * XXX depending on server load, later be able to resurrect them
>>> * or kill them
>>> */
>>> - if ((retained->total_daemons <= active_daemons_limit
>>> - && retained->total_daemons < server_limit)
>>> - /* The above test won't transition from true to false until a child
>>> - * exits by itself (i.e. MaxRequestsPerChild reached), so the below
>>> - * test makes sure that the situation unblocks when the load falls
>>> - * significantly (regardless of MaxRequestsPerChild, e.g. 0) */
>>> - || idle_thread_count > max_workers/4 / num_buckets) {
>>> + if (retained->total_daemons == active_daemons) {
>>
>> Wouldn't this effectively disable shutting down processes until we hit the limit even if shrinking is desired far before we reach
>> this?
>
> Possibly not because we are still under:
> if (idle_thread_count > max_spare_threads / num_buckets)
> ?
>
>> Example:
>>
>> MaxRequestWorkers 1000
>> ThreadsPerChild 10
>> MinSpareThreads 10
>> MaxSpareThreads 30
>> Serverlimit 105
>> StartServers 2
>>
>> In case the server gets 40 connections in parallel after startup it would spin up to 5 processes if it gets idle afterwards it
>> would not shutdown children any longer.
>
> We are above MaxSpareThreads in this case so it would trigger as soon

My bad. The patch says

retained->total_daemons == active_daemons

but I read

retained->total_daemons == active_daemons_limit

So yes this could work.

Regards

Rüdiger
Re: svn commit: r1894285 - in /httpd/httpd/trunk: changes-entries/event-kill_at_total_daemons_limit.txt server/mpm/event/event.c [ In reply to ]
On 12/2/21 7:31 PM, Yann Ylavic wrote:
> On Thu, Dec 2, 2021 at 4:32 PM Yann Ylavic <ylavic.dev@gmail.com> wrote:
>>
>> Or maybe we could simply:
>> --- server/mpm/event/event.c (revision 1895394)
>> +++ server/mpm/event/event.c (working copy)
>> @@ -3186,8 +3186,8 @@ static void perform_idle_server_maintenance(int ch
>> * requests. If the server load changes many times, many such
>> * gracefully finishing processes may accumulate, filling up the
>> * scoreboard. To avoid running out of scoreboard entries, we
>> - * don't shut down more processes when the total number of processes
>> - * is high, until there are more than max_workers/4 idle threads.
>> + * don't shut down more processes if there are quiescing ones
>> + * already (i.e. retained->total_daemons > active_daemons).
>> *
>> * XXX It would be nice if we could
>> * XXX - kill processes without keepalive connections first
>> @@ -3195,13 +3195,7 @@ static void perform_idle_server_maintenance(int ch
>> * XXX depending on server load, later be able to resurrect them
>> * or kill them
>> */
>> - if ((retained->total_daemons <= active_daemons_limit
>> - && retained->total_daemons < server_limit)
>> - /* The above test won't transition from true to false until a child
>> - * exits by itself (i.e. MaxRequestsPerChild reached), so the below
>> - * test makes sure that the situation unblocks when the load falls
>> - * significantly (regardless of MaxRequestsPerChild, e.g. 0) */
>> - || idle_thread_count > max_workers/4 / num_buckets) {
>> + if (retained->total_daemons == active_daemons) {
>> /* Kill off one child */
>> ap_mpm_podx_signal(retained->buckets[child_bucket].pod,
>> AP_MPM_PODX_GRACEFUL);
>> ?
>> And then kill processes only if there are no stopping ones already..
>
> Full patch attached, we need to retain active_daemons too for this to
> work actually.
>

Why do we need to retain active_daemons? When we get back after a restart there should be no active daemons any longer. They
should be all gone or shutting down. The only active daemons we should have then should be the ones of the new generation.

Regards

Rüdiger
Re: svn commit: r1894285 - in /httpd/httpd/trunk: changes-entries/event-kill_at_total_daemons_limit.txt server/mpm/event/event.c [ In reply to ]
On Thu, Dec 2, 2021 at 9:07 PM Ruediger Pluem <rpluem@apache.org> wrote:
>
> On 12/2/21 7:31 PM, Yann Ylavic wrote:
> >
> > Full patch attached, we need to retain active_daemons too for this to
> > work actually.
> >
>
> Why do we need to retain active_daemons? When we get back after a restart there should be no active daemons any longer. They
> should be all gone or shutting down. The only active daemons we should have then should be the ones of the new generation.

Retaining active_daemons probably does not belong to the same patch
than the MaxSpareThreads changes (like I did), but it looks like
restarting (gracefully or not) with active_daemons = 0 is not correct
because the first call to perform_idle_server_maintenance() will find
the stopping processes old gen and do:
ps = &ap_scoreboard_image->parent[i];
if (ps->pid != 0) {
int child_threads_active = 0;
if (ps->quiescing == 1) {
ps->quiescing = 2;
active_daemons--;
}
...
}
Thus doubly account for the stopping daemons and leaving
active_daemons as a negative value?

I think that since r1893520 and the correct accounting of
active_daemons throughout perform_idle_server_maintenance() we'd
better let perform_idle_server_maintenance() handle
retained->active_daemons entirely.

Am I missing something?

Cheers;
Yann.
Re: svn commit: r1894285 - in /httpd/httpd/trunk: changes-entries/event-kill_at_total_daemons_limit.txt server/mpm/event/event.c [ In reply to ]
On 12/3/21 1:18 AM, Yann Ylavic wrote:
> On Thu, Dec 2, 2021 at 9:07 PM Ruediger Pluem <rpluem@apache.org> wrote:
>>
>> On 12/2/21 7:31 PM, Yann Ylavic wrote:
>>>
>>> Full patch attached, we need to retain active_daemons too for this to
>>> work actually.
>>>
>>
>> Why do we need to retain active_daemons? When we get back after a restart there should be no active daemons any longer. They
>> should be all gone or shutting down. The only active daemons we should have then should be the ones of the new generation.
>
> Retaining active_daemons probably does not belong to the same patch
> than the MaxSpareThreads changes (like I did), but it looks like
> restarting (gracefully or not) with active_daemons = 0 is not correct
> because the first call to perform_idle_server_maintenance() will find
> the stopping processes old gen and do:
> ps = &ap_scoreboard_image->parent[i];
> if (ps->pid != 0) {
> int child_threads_active = 0;
> if (ps->quiescing == 1) {
> ps->quiescing = 2;
> active_daemons--;
> }
> ...
> }
> Thus doubly account for the stopping daemons and leaving
> active_daemons as a negative value?
>
> I think that since r1893520 and the correct accounting of
> active_daemons throughout perform_idle_server_maintenance() we'd
> better let perform_idle_server_maintenance() handle
> retained->active_daemons entirely.
>
> Am I missing something?

So the idea is that once we ran through perform_idle_server_maintenance the first time the retained active_daemons number will be
correct again?
It looks like we don't rely on active_daemons to be correct before this. Hence this should be working. More so: As you state if we
don't retain active_daemons the counter will become negative and wrong. Good catch.

Regards

Rüdiger