
drbd-reactor v1.0.0-rc.1
Dear DRBD users,

this is the first RC for the upcoming 1.0.0 release of drbd-reactor.
This version contains an improvement for the promoter plugin. I will
just quote the commit log below.

Just a word on the version number: I was told that there are people who
wait for a 1.0.0 before they use the software in production. Let's do
them a favor and call it 1.0.0. The version number does not mean
anything. It is just a number, like those of all earlier releases and
all the ones that will follow.

The important commit:
promoter: try to restart target periodically

With the defaults (target-as="Requires") the generated target unit
behaves as follows:

- start with failing service => target active
- systemctl stop service => target+services stop
- kill pid of a service => target active

Until now, if the service failed for whatever reason, we did not detect
that, because systemd considers the target started. We also did not
properly check the target status... This might not be desirable and can
be improved by setting target-as="BindsTo", which results in the
following behavior:

- start with failing service => target inactive
- systemctl stop service => target+services stop
- kill pid of a service => target+services stop
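
To make the difference between the two settings easier to see, here is a small sketch that just encodes the two lists above as data and reports where they disagree. The scenario names are made up for this illustration; nothing here is drbd-reactor or systemd code.

```python
# Outcomes exactly as listed above, keyed by (target-as value, scenario).
# The scenario identifiers are hypothetical labels for this sketch.
OUTCOMES = {
    ("Requires", "start_with_failing_service"): "target active",
    ("Requires", "systemctl_stop_service"): "target+services stop",
    ("Requires", "kill_service_pid"): "target active",
    ("BindsTo", "start_with_failing_service"): "target inactive",
    ("BindsTo", "systemctl_stop_service"): "target+services stop",
    ("BindsTo", "kill_service_pid"): "target+services stop",
}


def differing_scenarios():
    """Return the scenarios where Requires and BindsTo behave differently."""
    scenarios = sorted({scenario for (_, scenario) in OUTCOMES})
    return [
        s for s in scenarios
        if OUTCOMES[("Requires", s)] != OUTCOMES[("BindsTo", s)]
    ]
```

Only the two failure scenarios differ; an explicit "systemctl stop" behaves the same with both settings.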

The problem is that this generates a start-stop loop if the service
fails to start. The target unit will not be started successfully, so it
gets stopped. This triggers a "may_promote", which triggers a start
attempt and so on, until systemd rate limiting kicks in.
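
The loop described above can be sketched as a toy simulation. This is only a model under stated assumptions: the service always fails, every stop produces exactly one may_promote event, and the rate limit is a simple burst counter whose value (start_limit_burst) is an arbitrary parameter of the sketch, not a claim about any particular systemd setting.

```python
def simulate_start_stop_loop(start_limit_burst=5, max_events=100):
    """Count start attempts until a burst-style rate limit refuses a start.

    Toy model: the service always fails, so every start ends in a stop,
    and every stop produces a new may_promote event asking for a start.
    """
    attempts = 0
    log = []
    pending_may_promote = True  # the initial may_promote event
    for _ in range(max_events):
        if not pending_may_promote:
            break  # nothing left to react to
        pending_may_promote = False
        if attempts >= start_limit_burst:
            # No start means no stop, which means no further event.
            log.append("start refused (rate limited)")
            continue
        attempts += 1
        log.append("start -> fails -> target stopped")
        pending_may_promote = True  # the stop triggers may_promote again
    return attempts, log
```

In this model the loop ends as soon as one start is refused, because nothing generates a further event afterwards.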

We can improve the situation by checking if the target unit is started
and trying to start it at a saner interval ourselves. In a future
version we might even check if all services in a target are started
(which shouldn't be necessary if BindsTo is used).

Having rate limiting and keeping the rest as it was would not be good
enough. There would be a start, a stop, a new may_promote, which would
then be rate limited, so no new start. And then there wouldn't be any
new may_promote event and things would starve. To avoid that, we can use
a ticker that periodically checks for the target state. Both the
existing may_promote mechanism and the ticker are subject to the same
global rate limit.
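
A minimal sketch of that idea, using simulated time in whole seconds. It is not the drbd-reactor implementation; RateLimiter, run, and all parameters are hypothetical names for this illustration. It assumes the service fails to start before service_healthy_from and succeeds afterwards, and that an event refused by the limiter is simply dropped.

```python
class RateLimiter:
    """Allow at most one action per min_interval seconds (simulated time)."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.last = None

    def allow(self, now):
        if self.last is not None and now - self.last < self.min_interval:
            return False
        self.last = now
        return True


def run(duration, tick_every, min_interval, service_healthy_from):
    """Return (times of start attempts, whether the target ended up active).

    tick_every=None disables the ticker, leaving only may_promote events.
    """
    limiter = RateLimiter(min_interval)
    target_active = False
    may_promote = True  # the initial event
    attempts = []
    for now in range(duration):
        tick = tick_every is not None and now % tick_every == 0
        triggered = may_promote or tick
        may_promote = False  # a pending event is consumed either way
        if target_active or not triggered:
            continue
        # Both triggers go through the same global limiter.
        if not limiter.allow(now):
            continue  # dropped; only a later tick can retry
        attempts.append(now)
        if now >= service_healthy_from:
            target_active = True
        else:
            may_promote = True  # failed start -> stop -> new may_promote
    return attempts, target_active
```

Calling run(60, None, 10, 30) models the starvation case: one attempt at time 0, then the follow-up event is rate limited and nothing ever retries. With a ticker, run(60, 5, 10, 30) keeps retrying at the limiter's pace until the service becomes healthy.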

Please test and report any regressions/bugs. The final release is
planned for about a week from now.

Regards, rck


[Roland Kammerer]
* core: improve module version check
* promoter: try to restart target periodically