
attrd and repeated changes
Hello,

When new data is sent constantly via attrd, the changes never reach the CIB.


Example:
    while sleep 1
    do
        attrd_updater -l reboot -d 5 -n rep_chg -U try$SECONDS
        cibadmin -Ql | grep rep_chg
    done

This always returns the same value: the last one that was followed by a
pause of more than 5 seconds, so that the dampen interval was not
interrupted by the next change.
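
The effect can be reproduced without a cluster. The following is only a
minimal sketch, assuming that every update restarts the dampen timer (which
is what the observed behaviour suggests; it is not taken from the attrd
source). The function name simulate_dampen, the integer clock, and the two
mode names are hypothetical.

```shell
# simulate_dampen MODE DAMPEN UPDATE_TIME...
#   MODE=restart: every update restarts the dampen timer (assumed behaviour)
#   MODE=keep:    a pending timer is left alone
# Prints one line per simulated write to the CIB.
# Variables are global to stay within plain POSIX sh.
simulate_dampen() {
    mode=$1 dampen=$2
    shift 2
    deadline=-1 pending=
    for t in "$@"; do
        # Fire the timer if it expired before (or exactly when) this update arrived.
        if [ "$deadline" -ge 0 ] && [ "$deadline" -le "$t" ]; then
            echo "t=$deadline value=$pending"
            deadline=-1
        fi
        pending="try$t"
        # Arm the timer; in restart mode an already pending timer is re-armed too.
        if [ "$mode" = restart ] || [ "$deadline" -lt 0 ]; then
            deadline=$(( t + dampen ))
        fi
    done
    # No further updates: the last pending timer finally expires.
    if [ "$deadline" -ge 0 ]; then
        echo "t=$deadline value=$pending"
    fi
}
```

With updates at seconds 1 to 8 and a dampen of 5, "simulate_dampen restart 5
1 2 3 4 5 6 7 8" reports a single write, and only after the updates stop. In
"keep" mode a write happens once per dampen interval.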


I've attached two draft patches; one for allowing the _first_ value in a
dampen interval to be used (effectively ignoring changes until this value is
written), and one for using the _last_ value in the dampen interval (by not
changing the dampen timer). [1]


*** Note: they are for discussion only!
*** I didn't test them, not even for compilation.


Perhaps this "bug" [2] was introduced with one of these changes (the hashes
are the Git commit IDs):

High: crmd: Bug lf#2528 - Introduce a slight delay when
creating a transition to allow attrd time to perform its updates
e7f5da92490844d190609931f434e08c0440da0f

Low: attrd: Indicate when attrd clients are updating fields
69b49b93ff6fd25ac91f589d8149f2e71a5114c5


What is the correct way to handle multiple updates within the dampen
interval?


Thank you for any hints or ideas.


Regards,

Phil


Ad [1]: I even tried to provide an average, median, or some other derived
value here, but a) this isn't necessarily valid (for boolean or integer
values), and b) as the data is stored and transmitted as a string, it
wouldn't easily work anyway.
So the only remaining question is whether to use the first or the last
value, IMO.

Ad [2]: That is, if this _is_ a bug. I'd certainly argue that it is, as
_not_ propagating a change is worse than a changing value.
Re: attrd and repeated changes
On 10/20/2011 03:41 AM, Philipp Marek wrote:
> What is the correct way to handle multiple updates within the dampen
> interval?
Personally, I'd vote for the last value. I agree with you about this
being a bug.


--
Alan Robertson<alanr@unix.sh>

"Openness is the foundation and preservative of friendship... Let me claim from you at all times your undisguised opinions." - William Wilberforce
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
Re: attrd and repeated changes
On Thu, Oct 20, 2011 at 08:48:36AM -0600, Alan Robertson wrote:
> On 10/20/2011 03:41 AM, Philipp Marek wrote:
> > What is the correct way to handle multiple updates within the dampen
> > interval?

> Personally, I'd vote for the last value. I agree with you about this
> being a bug.

If the attribute is used to track connectivity changes (ping resource
agent) or similar, and connectivity is flaky or flapping, it would be
useful to have a "max" or "min" consolidation function for incoming
values during a dampen interval.

Otherwise, I get + + - + + -|+ + + +
and if the dampen interval just happened to expire
where I put the | above, it would have pushed a - to the cib,
where I'd rather have kept it at +.

We likely want to add an option to attrd_updater (and to the ipc
messages it sends to attrd, and to the rest of the chain involved),
which can specify the consolidation function to be used.

The initial set I suggest would be
generic:
  oldest
  latest (default?)
for values assumed to be numeric:
  max (also a candidate for default behaviour)
  min
  avg (with a printf like template for rounding, %.2f or similar,
       so we could even average "boolean" values)
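
To make the proposed set concrete, here is a rough sketch of what such
consolidation functions would compute over the values collected during one
dampen interval. This is not attrd code: the function name consolidate and
its argument order are made up for illustration, max and min assume integer
values, and avg relies on awk for the arithmetic.

```shell
# consolidate FUNC VALUE...        (FUNC: oldest, latest, max, min)
# consolidate avg FORMAT VALUE...  (FORMAT: printf template such as %.2f)
# Hypothetical helper; expects at least one VALUE.
consolidate() {
    func=$1
    shift
    case $func in
        oldest)
            printf '%s\n' "$1" ;;
        latest)
            # Walk the list; the last assignment wins.
            for v in "$@"; do result=$v; done
            printf '%s\n' "$result" ;;
        max|min)
            result=$1
            for v in "$@"; do
                if [ "$func" = max ] && [ "$v" -gt "$result" ]; then result=$v; fi
                if [ "$func" = min ] && [ "$v" -lt "$result" ]; then result=$v; fi
            done
            printf '%s\n' "$result" ;;
        avg)
            fmt=$1
            shift
            # One value per line; awk sums them and applies the template.
            printf '%s\n' "$@" |
                awk -v fmt="$fmt" '{ sum += $1 } END { if (NR) printf(fmt "\n", sum / NR) }' ;;
        *)
            echo "unknown consolidation function: $func" >&2
            return 1 ;;
    esac
}
```

Writing the flapping sequence above as 1 1 0 1 1 0, "consolidate latest 1 1 0
1 1 0" yields 0 while "consolidate max 1 1 0 1 1 0" yields 1, which is the
difference the example is about.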

I suggest this behaviour:

 * If different updates request a different consolidation function,
   the last one (within the respective dampen interval) wins.

 * update with the _same_ value: Do not start or modify any timer.
   If a timer is pending, still add the value to the list of values to
   be processed by the consolidation function (relevant for avg,
   possibly not yet listed others).

 * update with a different value:
   Start a new timer, unless one is pending already.
   Do not restart/modify an already pending timer.
   Add to the list of values for the consolidation function.

 * Flush message received: expire timer. See below.

 * Timer expires:
   Apply consolidation function to list of values. If list is empty
   (probably flush message without pending timer), use current value.
   Send that result to the cib.
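
The rules above can be sketched as a small state machine for a single
attribute. This is an illustration under assumptions, not an
implementation: all names are hypothetical, "same value" is interpreted as
"equal to the value last sent to the cib", values must not contain
whitespace, max and min assume integers, and the timer is reduced to a flag
that the caller expires by hand.

```shell
current=          # value last sent to the cib
values=           # space-separated values collected in this interval
func=latest       # consolidation function requested most recently
timer_pending=0   # 1 while a dampen timer is running
written=          # space-separated log of results "sent to the cib"

on_update() {     # on_update FUNC VALUE
    func=$1       # differing requests: the most recent one wins
    if [ "$2" = "$current" ]; then
        # Same value: leave the timer alone; record only if one is pending.
        if [ "$timer_pending" -eq 1 ]; then values="$values $2"; fi
    else
        # Different value: start a timer unless one is pending, never restart it.
        timer_pending=1
        values="$values $2"
    fi
}

on_timer_expired() {
    result=$current   # empty list: fall back to the current value
    first=1
    for v in $values; do
        if [ "$first" -eq 1 ]; then result=$v; first=0; continue; fi
        case $func in
            latest) result=$v ;;
            max) if [ "$v" -gt "$result" ]; then result=$v; fi ;;
            min) if [ "$v" -lt "$result" ]; then result=$v; fi ;;
            oldest) ;;   # keep the first value
        esac
    done
    current=$result
    written="$written $result"
    values=
    timer_pending=0
}

on_flush() { on_timer_expired; }   # flush: expire the timer immediately
```

Starting from current=1, the updates 0, 1, 0 with "max" followed by
on_timer_expired leave current at 1, so the short dips are never pushed out.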

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
Re: attrd and repeated changes
On Sat, Oct 22, 2011 at 7:14 AM, Lars Ellenberg
<lars.ellenberg@linbit.com> wrote:
> On Thu, Oct 20, 2011 at 08:48:36AM -0600, Alan Robertson wrote:
>> On 10/20/2011 03:41 AM, Philipp Marek wrote:
>> > What is the correct way to handle multiple updates within the dampen
>> > interval?
>
>> Personally, I'd vote for the last value.  I agree with you about this
>> being a bug.
>
> If the attribute is used to check connectivity changes (ping resource
> agent), or similar, and we have a "flaky", "flapping" connectivity,
> it would be useful to have a "max" or "min" consolidation function
> for incoming values during a dampen interval.
>
> Otherwise, I get + + - + + -|+ + + +
> and if the dampen interval just "happened" to expire
> where I put the | above, it would have pushed a - to the cib,
> where I'd rather kept it at +.

That's why dampen should typically be a multiple of the monitor interval.

>
> We likely want to add an option to attrd_updater (and to the ipc
> messages it sends to attrd, and to the rest of the chain involved),
> which can specify the consolidation function to be used.
>
> The initial set I suggest would be
> generic:
>  oldest
>  latest (default?)
> for values assumed to be numeric:
>  max (also a candidate for default behaviour)
>  min
>  avg (with a printf like template for rounding, %.2f or similar,
>      so we could even average "boolean" values)

For avg you'd need to specify how many values to remember.
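
One way to bound that is a fixed-size window of the most recent values. The
sketch below is only an illustration: window_size, remember, and window_avg
are hypothetical names, values must not contain whitespace, and awk does the
arithmetic.

```shell
window_size=3     # hypothetical: how many values to remember
window=           # space-separated, oldest first

remember() {      # remember VALUE
    # Re-split the window into positional parameters and append the new value.
    set -- $window "$1"
    # Drop the oldest entries until the window fits again.
    while [ "$#" -gt "$window_size" ]; do shift; done
    window=$*
}

window_avg() {    # window_avg FORMAT   (printf template such as %.2f)
    [ -n "$window" ] || return 1
    printf '%s\n' $window |
        awk -v fmt="$1" '{ sum += $1 } END { printf(fmt "\n", sum / NR) }'
}
```

After remembering 1, 0, 0, 1, 1 in that order, only the last three values
remain in the window and "window_avg %.2f" averages just those.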

>
> I suggest this behaviour:
>
>  * If different updates request a different consolidation function,
>   the last one (within the respective dampen interval) wins.
>
>  * update with the _same_ value: Do not start or modify any timer.
>   If a timer is pending, still add the value to the list of values to
>   be processed by the consolidation function (relevant for avg,
>   possibly not yet listed others).
>
>  * update with a different value:
>   Start a new timer, unless one is pending already.
>   Do not restart/modify an already pending timer.
>   Add to the list of values for the consolidation function.
>
>  * Flush message received: expire timer. See below.
>
>  * Timer expires:
>   Apply consolidation function to list of values.  If list is empty
>   (probably flush message without pending timer), use current value.
>   Send that result to the cib.

Sounds reasonable, but there's no way I'm going to be able to implement
it any time soon.

If someone else wants to implement it, I think it would be useful to
have it be part of a larger rework that ensures atomicity of the
updates, i.e. have all nodes send their values to a designated instance
which does all the updates.