Weird systemd / pacemaker stop crash issue
DRBD 9.0.23, utils 9.13, RHEL 8.

I've found a really odd issue where stopping pacemaker causes
'systemctl stop drbd' to hang. This is with drbd configured as a systemd
resource and with no DRBD resources yet configured (in DRBD or
pacemaker). The global-common.conf is standard (as created when installed).

In short:

While pacemaker is running, the DRBD resource can be stopped and
started fine. However, if the DRBD resource is running when you stop the
cluster, it hangs. Once this hang happens, even outside pacemaker, you
can't stop drbd ('systemctl stop drbd' hangs when called in another
terminal).

Now here is where it gets really weird...

If you stop the DRBD resource in pacemaker, then stop the cluster,
it's fine. Moreover, the crash never happens again. You can start the
cluster back up, then stop it with DRBD running and the daemon stops
cleanly.

To recreate the crash, I destroy the pacemaker cluster and reconfigure
systemd:drbd. The crash then returns and keeps happening until the
cluster is stopped once with DRBD already stopped. After that, the crash
doesn't happen again.
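
For anyone who wants to try this, the two sequences can be sketched
roughly as below. This is only a sketch under assumptions that are not
stated above: that pcs is used to manage the cluster, that the resource
is named "drbd" (a hypothetical name), and that cluster setup and fencing
are already in place. By default it only prints the commands; set RUN=1
to actually execute them.

```shell
#!/bin/sh
# Sketch of the sequences described above. Assumptions (not from the
# report itself): pcs manages the cluster, and the resource is named
# "drbd". Commands are printed only, unless RUN=1 is set.

RES="${RES:-drbd}"   # hypothetical resource name

# Print a command, and run it only when RUN=1.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Stop the cluster while the systemd:drbd resource is still running.
# This is the case that hangs.
repro_hang() {
    step pcs resource create "$RES" systemd:drbd
    step pcs cluster stop --all
}

# Stop the DRBD resource first, then stop the cluster. After one such
# stop, later cluster stops with DRBD running complete cleanly.
workaround() {
    step pcs resource disable "$RES"
    step pcs cluster stop --all
    step pcs cluster start --all
    step pcs resource enable "$RES"
    step pcs cluster stop --all
}

# Optional dispatch: "repro" or "workaround" as the first argument.
case "${1:-}" in
    repro)      repro_hang ;;
    workaround) workaround ;;
esac
```

Saved under any name (say repro.sh, also hypothetical), running it as
'sh repro.sh repro' prints the hanging sequence and 'sh repro.sh
workaround' prints the sequence that stops cleanly.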

Below are links to a series of pacemaker.log files (with debugging
on), one pair per test. The first pair runs from the initial config of
pacemaker up to the crash on shutdown. The second covers fixing a stonith
issue and then repeating the crash. The third is a clean start through
to the crash. The fourth shows that drbd can be stopped and started
within pacemaker and still crash when the cluster is stopped. The last is
where drbd is already stopped when pacemaker stops, which doesn't crash.

After this, everything works fine from then on.

Initial pacemaker config to DRBD crash:
node 1 - https://pastebin.com/raw/7QQGHW5g
node 2 - https://pastebin.com/raw/a1BYVzM7

Fix fence issue, repeat test, DRBD crash:
node 1 - https://pastebin.com/raw/1fZH1SSP
node 2 - https://pastebin.com/raw/p7i3absC

Fresh cluster start, crash on stop:
node 1 - https://pastebin.com/raw/gYNiMHDC
node 2 - https://pastebin.com/raw/VzkaPyEb

DRBD resource stopped, started, stop pacemaker, crash:
node 1 - https://pastebin.com/raw/LDTpmncY
node 2 - https://pastebin.com/raw/ryj2J6Qt

DRBD resource stopped, stop cluster, start cluster, stop cluster OK:
node 1 - https://pastebin.com/raw/haECJz8y
node 2 - https://pastebin.com/raw/tBSD0ZyJ

--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
_______________________________________________
drbd-user mailing list
drbd-user@lists.linbit.com
https://lists.linbit.com/mailman/listinfo/drbd-user