Mailing List Archive

mod_systemd suggestion
Hi all,

triggered by the new mod_systemd I drafted a patch to enhance the
monitoring data it provides during the monitor hook run.

Currently it publishes important data, like idle and busy slots and
total request count, but also not so useful info like requests/second
and bytes/second as a long-term average (since start). These two figures
tend to become near constant after a longer time of operation.

Since the monitor hook of the module always seems to run in the same
(parent) process, it is easy to remember the previous request and byte
count data and average only over the last monitor hook interval. This
should give more meaningful data, and it is a change local to mod_systemd.

In addition we have a third metric available in the scoreboard, namely
the total request duration. From that we can get the average request
duration and the average request concurrency. This part also needs a
change to the sload structure. Maybe we need a minor MMN bump for that.
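
As a rough illustration of how both values could follow from the total
request duration, assume that over one monitor interval of length
\Delta t the request count grows by \Delta R and the total duration by
\Delta D, with \Delta D and \Delta t in the same time unit. These symbols
are introduced here for illustration only.

```latex
\bar{d} = \frac{\Delta D}{\Delta R}
\qquad\text{(average duration per request)}
\\[4pt]
\bar{c} = \frac{\Delta D}{\Delta t}
\qquad\text{(average number of requests being processed concurrently)}
```

For example, 200 requests that add 30 seconds of total duration during a
10 second interval give an average duration of 0.15 seconds per request
and an average concurrency of 3. If durations are only added to the total
when a request completes, a long request is attributed entirely to the
interval in which it ends, so both values are approximations.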

I sketched a patch under

home.apache.org/~rjung/patches/httpd-trunk-mod_systemd-interval-stats.patch

Any comments, likes or dislikes?

Thanks and regards,

Rainer
Re: mod_systemd suggestion
Thinking further: I think it would make sense to have a module or core
implement the monitor hook to generate that derived data (requests/sec,
bytes/sec, durationMs/request, avgConcurrency) in the last monitor
interval and to provide that data to consumers like mod_systemd or - new
- mod_status - instead of the long term averages since start. It could
probably be added to the code that already provides "sload". That way
mod_status would also profit from the more precise average values (taken
over the last monitor interval).

Regards,

Rainer

On 23.04.2020 at 21:29, Rainer Jung wrote:
> Hi all,
>
> triggered by the new mod_systemd I drafted a patch to enhance the
> monitoring data it provides during the monitor hook run.
>
> Currently it publishes important data, like idle and busy slots and
> total request count, but also not so useful info like requests/second
> and bytes/second as a long-term average (since start). These two figures
> tend to become near constant after a longer time of operation.
>
> Since the monitor hook of the module always seems to run in the same
> (parent) process, it is easy to remember the previous request and byte
> count data and average only over the last monitor hook interval. This
> should give more meaningful data, and it is a change local to mod_systemd.
>
> In addition we have a third metric available in the scoreboard, namely
> the total request duration. From that we can get the average request
> duration and the average request concurrency. This part also needs a
> change to the sload structure. Maybe we need a minor MMN bump for that.
>
> I sketched a patch under
>
> home.apache.org/~rjung/patches/httpd-trunk-mod_systemd-interval-stats.patch
>
> Any comments, likes or dislikes?
>
> Thanks and regards,
>
> Rainer
Re: mod_systemd suggestion
On Fri, Apr 24, 2020 at 12:17:19PM +0200, Rainer Jung wrote:
> Thinking further: I think it would make sense to have a module or core
> implement the monitor hook to generate that derived data (requests/sec,
> bytes/sec, durationMs/request, avgConcurrency) in the last monitor interval
> and to provide that data to consumers like mod_systemd or - new - mod_status
> - instead of the long term averages since start. It could probably be added
> to the code that already provides "sload". That way mod_status would also
> profit from the more precise average values (taken over the last monitor
> interval).

I definitely like the patch, it has bothered me that the "per second"
stats are not very useful but wasn't sure how to make it better.

This is also an interesting idea.

So you would suggest having a new monitor hook which runs REALLY_LAST in
the order, calls ap_get_sload() and stores it in a global, and then we'd
have an ap_get_cached_sload() (or whatever) which gives you the cached
data from the last iteration? Or are you thinking of a more
sophisticated API which does the "diff" between intervals internally?

Regards, Joe

Re: mod_systemd suggestion
On 24.04.2020 at 16:21, Joe Orton wrote:
> On Fri, Apr 24, 2020 at 12:17:19PM +0200, Rainer Jung wrote:
>> Thinking further: I think it would make sense to have a module or core
>> implement the monitor hook to generate that derived data (requests/sec,
>> bytes/sec, durationMs/request, avgConcurrency) in the last monitor interval
>> and to provide that data to consumers like mod_systemd or - new - mod_status
>> - instead of the long term averages since start. It could probably be added
>> to the code that already provides "sload". That way mod_status would also
>> profit from the more precise average values (taken over the last monitor
>> interval).
>
> I definitely like the patch, it has bothered me that the "per second"
> stats are not very useful but wasn't sure how to make it better.
>
> This is also an interesting idea.
>
> So you would suggest having a new monitor hook which runs REALLY_LAST in
> the order, calls ap_get_sload() and stores it in a global, and then we'd
> have an ap_get_cached_sload() (or whatever) which gives you the cached
> data from the last iteration? Or are you thinking of a more
> sophisticated API which does the "diff" between intervals internally?

Thanks for the positive feedback.

The averaged metrics IMHO only make sense as cached data, updated in
regular intervals and provided for use by various modules (probably only
mod_systemd and mod_status).

I would like to provide the already averaged data in a struct, each
metric as a float or double. The bytes/request value should probably not
be pre-scaled into human-readable units, because that makes it less
flexible to use. Since we already have the absolute counters at that
point, we can easily add them to the same struct as 32 or 64 bit counters
and return a consistent set of data (five old values, five new values,
five averages and two time stamps): [idle(32), busy(32), requests(64),
bytes(64), duration(64); req/s, bytes/s, bytes/req, dur/s, dur/req].
That way, consumers needing a consistent view can get it.

Even more so since the absolute metrics are currently not cheap to
access. We collect all of them by iterating over the scoreboard and
summing up. By adding them to the cached data, the consuming code could
decide whether such near-time data is good enough or whether it needs to
acquire new and current counters. For mod_systemd, cached data (10 second
interval) might be OK.

For some modules - like mod_status - cached averages are fine, but I
think the counters should be correct for the point in time the status
request was handled by the module. So the scoreboard statistics code in
mod_status unfortunately would not go away, but the data quality for
the averages would become better.

Implementation wise I am thinking about adding

ap_hook_monitor(mon_avg_monitor, NULL, NULL, ???);

to server/util.c, which calculates the new averages and

ap_get_mon_avg(ap_mon_avg_t *ma)

which returns the four averages in a struct similar to the existing
ap_get_loadavg() and ap_get_sload().

It might take a little effort to make the statistics update
atomic/thread-safe (e.g. two instances of the internal data struct, so
that we only need to make the switch between them after the new
calculation atomic).

About REALLY_LAST: why last? If other modules collect data via this API
and want to do it in the monitor hook as well, shouldn't the caching of
data run REALLY_FIRST, so that they get the new averages?

I'll try to draft and test something along these lines later today. Fun
stuff. And more comments are very welcome.

And I like that mod_status will profit by showing better averages as well.

Regards,

Rainer
