Mailing List Archive

balancers bybusyness, bytraffic and byrequest thread/process safe issues
Hi,

All the balancers have thread/process safety issues, but with
bybusyness the effect is worse: a worker may keep a busy count greater
than zero even though no request is being processed.

busy is displayed by balancer_handler(), so users/customers will notice
that the value doesn't return to zero...

If you run a load test, the busy value will increase over time on all
the workers.

When using bybusyness, a peak in load followed by a period of little
load means the workers with the lowest busy values get used while the
ones with a wrongly elevated value do not.

In a test with 3 workers, I end up with busy:
worker1: 3
worker2: 0
worker3: 2
Running the load test several times, the busy values increase on all
workers.

I am wondering if we could end up with something like:
worker1: 1000
worker2: 0
worker3: 1000

In that case bybusyness would send all the load to worker2 until we
reach 1000 simultaneous requests on worker2... Obviously that looks bad.

How do we fix that?
1 - Reset busy via a watchdog when elected (or transferred+read) has
been unchanged for some time (using one of the timeouts we have on
workers).
2 - Warn in the docs that bybusyness is not the best choice for load
balancing.
3 - Create another balancer that just chooses a worker at random.

--
Cheers

Jean-Frederic
Re: balancers bybusyness, bytraffic and byrequest thread/process safe issues
Hi JFC,

I have not checked your current code, but the topic reminds me of our
history in mod_jk land. There we switched the counters to atomics where
available. The other problematic part could be how to handle
process-local counters versus global counters.

Busyness was especially problematic for mod_jk as well: we never
decremented below zero if we lost increments, but if we lost decrements
the counters stayed elevated. I think we no longer have such problems
there.

Best regards,

Rainer

Am 30.08.23 um 17:19 schrieb jean-frederic clere:
> [...]
Re: balancers bybusyness, bytraffic and byrequest thread/process safe issues
On 8/30/23 17:33, Rainer Jung wrote:
> Hi JFC,
>
> I have not checked your current code, but the topic reminds me of our
> history in mod_jk land. There we switched the counters to atomics where
> available. The other problematic part could be how to handle
> process-local counters versus global counters.

Using apr_atomic_inc32()/apr_atomic_dec32() on the apr_size_t busy
field won't work? Actually, apr_size_t is probably overkill for busy;
does using apr_atomic_add64() and apr_atomic_dec64() make sense here?

Anyway, I will give it a try.
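For the decrement side, the safeguard mentioned for mod_jk (never going below zero) could be sketched as a CAS loop; this uses C11 atomics as a stand-in for apr_atomic_cas32(), and busy_dec_floor is a hypothetical name, not the actual patch:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Decrement *busy by one, but never below zero, so a lost increment
 * cannot make an unsigned counter wrap around.  Returns the value
 * actually stored. */
uint32_t busy_dec_floor(_Atomic uint32_t *busy)
{
    uint32_t old = atomic_load(busy);
    while (old > 0) {
        /* Try to swap old -> old - 1; on failure 'old' is reloaded. */
        if (atomic_compare_exchange_weak(busy, &old, old - 1))
            return old - 1;
    }
    return 0; /* already zero: leave it alone */
}
```

The CAS loop makes the check-then-decrement a single logical step, which a plain `if (busy > 0) busy--;` is not under concurrency.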

> [...]

--
Cheers

Jean-Frederic
Re: balancers bybusyness, bytraffic and byrequest thread/process safe issues
Isn't the call to find the best balancer mutex protected?

> On Aug 31, 2023, at 7:44 AM, jean-frederic clere <jfclere@gmail.com> wrote:
>
> On 8/30/23 17:33, Rainer Jung wrote:
>> Hi JFC,
>> I have not checked your current code, but the topic reminds me of our history in mod_jk land. There we switched the counters to atomics where available. The other problematic part could be how to handle process-local counters versus global counters.
>
> Using apr_atomic_inc32()/apr_atomic_dec32() on the apr_size_t busy field won't work?
> Actually, apr_size_t is probably overkill for busy; does using apr_atomic_add64() and apr_atomic_dec64() make sense here?
>
> Anyway I will give it a try.
>
>> [...]
Re: balancers bybusyness, bytraffic and byrequest thread/process safe issues
IIRC, the goal of having an "aging" function was to handle this exact kind of thing, where values could be normalized over a long period of time so that old entries that may skew results are not weighted as heavily as new ones.

> On Aug 30, 2023, at 11:19 AM, jean-frederic clere <jfclere@gmail.com> wrote:
>
> [...]
Re: balancers bybusyness, bytraffic and byrequest thread/process safe issues
Hi there,

mod_jk, for example, uses such aging, but only for the non-busyness
cases. Busyness is meant to show the number of currently in-flight
requests, so aging isn't a good fit there; old load numbers are never
part of busyness. But busyness is the mode that is most sensitive to
the number skew effects that JFC observed, hence the attempt to count
more precisely there.

Aging makes sense for byrequests and bytraffic though. But in mod_jk we
use a different byrequests algorithm: not the original
count-and-decrement system that Mladen introduced, but a count-and-age
system.

The aging for byrequests and bytraffic could be hooked onto
mod_watchdog, which is nice because we would not need to run it as part
of normal request handling.
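A count-and-age scheme of that kind could be sketched as a periodic decay step run from a watchdog (a hypothetical illustration, not the mod_jk or httpd code; the halving factor and the name age_counter are assumptions):

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical aging step for a byrequests-style counter: a watchdog
 * would call this periodically so that old traffic weighs less than
 * recent traffic.  Halving is an arbitrary decay factor. */
uint64_t age_counter(_Atomic uint64_t *count)
{
    uint64_t old = atomic_load(count);
    /* CAS loop so concurrent increments are not lost while aging. */
    while (!atomic_compare_exchange_weak(count, &old, old / 2))
        ;
    return old / 2;
}
```

Because the decay runs out of band, the hot request path only ever does a single atomic increment.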

Another thing that comes to my mind is (graceful) restart handling and
bybusyness. It might make sense to clear the numbers in case of such an
event.

Best regards,

Rainer

Am 31.08.23 um 18:23 schrieb Jim Jagielski:
> IIRC, the goal of having an "aging" function was to handle this exact kind of thing, where values could be normalized over a long period of time so that old entries that may skew results are not weighted as heavily as new ones.
>
>> [...]
Re: balancers bybusyness, bytraffic and byrequest thread/process safe issues
On 8/31/23 18:46, Rainer Jung wrote:
> Hi there,
>
> mod_jk, for example, uses such aging, but only for the non-busyness
> cases. Busyness is meant to show the number of currently in-flight
> requests, so aging isn't a good fit there; old load numbers are never
> part of busyness. But busyness is the mode that is most sensitive to
> the number skew effects that JFC observed, hence the attempt to count
> more precisely there.

Based on the mod_jk code, I have a PR:
https://github.com/apache/httpd/pull/383

>
> Aging makes sense for byrequests and bytraffic though. But in mod_jk we
> use a different byrequests algorithm: not the original
> count-and-decrement system that Mladen introduced, but a count-and-age
> system.
>
> The aging for byrequests and bytraffic could be hooked onto
> mod_watchdog, which is nice because we would not need to run it as part
> of normal request handling.

I will look at age() and the other hooks to see how to use them with
byrequests and bytraffic.

> [...]

--
Cheers

Jean-Frederic
Re: balancers bybusyness, bytraffic and byrequest thread/process safe issues
On 8/31/23 18:23, Jim Jagielski wrote:
> IIRC, the goal of having an "aging" function was to handle this exact kind of thing, where values could be normalized over a long period of time so that old entries that may skew results are not weighted as heavily as new ones.

So reset() and age() are meant for that?
struct proxy_balancer_method {
    const char *name;            /* name of the load balancer method */
    proxy_worker *(*finder)(proxy_balancer *balancer,
                            request_rec *r);
    void *context;               /* general purpose storage */
    apr_status_t (*reset)(proxy_balancer *balancer, server_rec *s);
    apr_status_t (*age)(proxy_balancer *balancer, server_rec *s);
    apr_status_t (*updatelbstatus)(proxy_balancer *balancer,
                                   proxy_worker *elected, server_rec *s);
};

There is not much documentation nor any example of use, correct?
Where can I find something?


>
>> [...]

--
Cheers

Jean-Frederic
Re: balancers bybusyness, bytraffic and byrequest thread/process safe issues
On 8/31/23 18:20, Jim Jagielski wrote:
> Isn't the call to find the best balancer mutex protected?

Looking at apr_atomic_cas32() in the APR code (1.7.x), I noted that we
don't test the return value of __atomic_compare_exchange_n():

+++
APR_DECLARE(apr_uint32_t) apr_atomic_cas32(volatile apr_uint32_t *mem,
                                           apr_uint32_t val,
                                           apr_uint32_t cmp)
{
#if HAVE__ATOMIC_BUILTINS
    __atomic_compare_exchange_n(mem, &cmp, val, 0, __ATOMIC_SEQ_CST,
                                __ATOMIC_SEQ_CST);
    return cmp;
#else
    return __sync_val_compare_and_swap(mem, cmp, val);
#endif
}
+++

and https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
says:

    Otherwise, false is returned and memory is affected according to
    failure_memorder. This memory order cannot be __ATOMIC_RELEASE nor
    __ATOMIC_ACQ_REL. It also cannot be a stronger order than that
    specified by success_memorder.

We use __ATOMIC_SEQ_CST, so we can't fail... or am I missing something?

>
>> [...]
>

--
Cheers

Jean-Frederic
Re: balancers bybusyness, bytraffic and byrequest thread/process safe issues
On Wed, Sep 6, 2023 at 1:17 PM jean-frederic clere <jfclere@gmail.com> wrote:
>
> On 8/31/23 18:20, Jim Jagielski wrote:
> > Isn't the call to find the best balancer mutex protected?
>
> Looking at apr_atomic_cas32() in the APR code (1.7.x), I noted that we
> don't test the return value of __atomic_compare_exchange_n():

IIUC we don't need to since apr_atomic_cas32() does not return it, see below.

>
> [...]
>
> and https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> says:
>
>     Otherwise, false is returned and memory is affected according to
>     failure_memorder. This memory order cannot be __ATOMIC_RELEASE nor
>     __ATOMIC_ACQ_REL. It also cannot be a stronger order than that
>     specified by success_memorder.
>
> We use __ATOMIC_SEQ_CST, so we can't fail... or am I missing something?

These docs (also) say:
"""
bool __atomic_compare_exchange_n (type *ptr, type *expected, type desired, ...)
This compares the contents of *ptr with the contents of *expected. If
equal, the operation is a read-modify-write operation that writes
desired into *ptr. If they are not equal, the operation is a read and
the current contents of *ptr are written into *expected.
"""
So __atomic_compare_exchange_n() guarantees that *expected is always
set to the value *ptr had before the call (whether or not it changed);
that is all we care about in the returned value.

Typical usage of apr_atomic_cas32() is:

    ret = apr_atomic_cas32(mem, val, cmp);
    if (ret == cmp) {
        /* *mem changed to val, old value was ret (== cmp) */
    }
    else {
        /* *mem unchanged, current value is ret (!= cmp) */
    }

meaning that done/true (or noop/false) is given by ret == cmp
(respectively ret != cmp); no explicit bool is returned. The CAS
guarantees that if *mem equals cmp then it is set to val, so checking
that ret == cmp (i.e. the old value was cmp) is enough to assume
success.
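That convention can be demonstrated stand-alone; here cas32() is a hypothetical local helper built on the same builtin, mirroring (but not being) APR's apr_atomic_cas32():

```c
#include <stdint.h>

/* Local stand-in mirroring APR's apr_atomic_cas32(): returns the old
 * value of *mem; the swap happened iff that old value equals cmp. */
static uint32_t cas32(volatile uint32_t *mem, uint32_t val, uint32_t cmp)
{
    __atomic_compare_exchange_n(mem, &cmp, val, 0, __ATOMIC_SEQ_CST,
                                __ATOMIC_SEQ_CST);
    return cmp; /* on failure, cmp was overwritten with the old *mem */
}

uint32_t demo_cas(void)
{
    volatile uint32_t mem = 5;
    uint32_t ret1 = cas32(&mem, 6, 5); /* succeeds: ret1 == 5, mem == 6 */
    uint32_t ret2 = cas32(&mem, 9, 5); /* fails:    ret2 == 6, mem == 6 */
    return ret1 * 100 + ret2 * 10 + mem; /* encodes the triple 5, 6, 6 */
}
```

Note that the bool returned by the builtin is indeed discarded: success is inferred entirely from ret == cmp.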

As for the memory orders on success/failure, they have nothing to do
with the likelihood of success/failure (which solely depends on the
current value of *ptr compared to the one in *expected); they are a way
to tell which memory ordering applies to the CAS operation w.r.t.
simultaneous atomic operations on the same *ptr memory (for more
details, look up the acquire, release and acquire+release semantics of
the C++ memory model used by the __atomic builtins). Roughly, this is a
way to tell how reads/writes on different CPUs should synchronize with
each other (reads waiting or not for in-flight writes to finish, and
vice versa).
APR uses __ATOMIC_SEQ_CST for all its __atomic operations, which is the
strongest memory ordering (a full barrier), since there is no way
(yet?) for users to specify their own (i.e. a faster/weaker one that is
still safe enough for their usage, which APR can't guess).


Regards;
Yann.
Re: balancers bybusyness, bytraffic and byrequest thread/process safe issues
On Wed, Sep 6, 2023 at 6:29 PM Yann Ylavic <ylavic.dev@gmail.com> wrote:
>
> As for the memory orders on success/failure, they have nothing to do
> with the likeliness of success/failure

Well, the memory orderings specified can certainly influence the
likelihood of success/failure, since a weaker ordering implies less
synchronization and thus probably more concurrency; what I meant is
that they don't influence the *correctness* of the returned values!


Regards;
Yann.
Re: balancers bybusyness, bytraffic and byrequest thread/process safe issues
On 9/6/23 18:40, Yann Ylavic wrote:
> On Wed, Sep 6, 2023 at 6:29 PM Yann Ylavic <ylavic.dev@gmail.com> wrote:
>>
>> As for the memory orders on success/failure, they have nothing to do
>> with the likeliness of success/failure
>
> Well, the memory orderings specified can certainly influence the
> likelihood of success/failure, since a weaker ordering implies less
> synchronization and thus probably more concurrency; what I meant is
> that they don't influence the *correctness* of the returned values!

OK thanks for your help.
>
>
> Regards;
> Yann.

--
Cheers

Jean-Frederic