Announcing: heartbeat 0.4.5e
Hi,

Bill Bacher sent me an email noting that the 0.4.5d version of heartbeat could
cause a node to declare itself dead after the other node had been down for a
couple of minutes.

This happened because I didn't test the CTS/RTS flow control code well enough:
I never left the other machine down for a full two minutes during my testing,
and it takes that long for the TTY queues and FIFOs to fill up and cause things
to hang. If you have the watchdog timer enabled, the hang would cause the
machine to reboot every couple of minutes. Not too cool :-(
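
For anyone curious about the failure mode, here is a minimal sketch of one way
to keep a sender from hanging when the peer stops draining a flow-controlled
serial line. This is not the actual heartbeat code or the actual 0.4.5e fix.
The function names (set_nonblocking, enable_rtscts, send_packet_nowait) are
made up for illustration, and the approach (non-blocking writes, dropping the
packet when the queue is full) is an assumption about what a reasonable fix
could look like.

```c
#define _DEFAULT_SOURCE   /* exposes CRTSCTS on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: turn on RTS/CTS hardware flow control.
 * Fails with -1 if fd is not a TTY. */
int enable_rtscts(int fd)
{
    struct termios tio;
    if (tcgetattr(fd, &tio) == -1)
        return -1;
    tio.c_cflag |= CRTSCTS;
    return tcsetattr(fd, TCSANOW, &tio);
}

/* Hypothetical helper: try to send one packet without ever blocking.
 * Returns the number of bytes written (> 0), 0 if the output queue is
 * full and the packet was dropped, or -1 on any other error.
 * With a dead peer, CTS stays deasserted and the queue never drains;
 * a blocking write() would hang here once the queue fills. */
ssize_t send_packet_nowait(int fd, const char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    for (;;) {
        ssize_t n = write(fd, buf, len);
        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return 0;               /* queue full: drop, don't hang */
        return -1;
    }
}
```

Dropping a packet is harmless in this setting, since the next heartbeat
supersedes it anyway. A caller on a real TTY would also want to handle short
writes, which this sketch just reports through the return value.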

In any case, I think this is now fixed in 0.4.5e.

Since I didn't add any unrelated bug fixes into the mix, it should be OK :-)

I also changed it so that if a node declares itself dead, it shuts down
heartbeat gracefully...


You'll find the new code in the usual place at:
http://linux-ha.org/download/

and in the CVS repository...

Please let me know of any further problems...

Thanks for your continued patience and help with testing!

-- Alan Robertson
alanr@bell-labs.com