Mailing List Archive

Re: failback
There is not much to add to Olive's statement.
Only one thing:
Yes, I (and most probably others too) want to add this meta-data
management to drbd, but unfortunately our resources are limited.

The next things I will do are:
*) man pages (nearly finished)
*) faster resynchronisation (will be sponsored by CUBiT)

-Philipp

* Fabio Olive Leite <olive@example.com> [010419 14:57]:
> Hi!
>
> On Tue, Apr 17, 2001 at 09:34:22AM -0300, Ricardo Alexandre Mattar wrote:
> ) I've got a question about failback on a drbd based disk replication
> ) system:
> ) I noticed that when I shut down one node of my HA system and let the
> ) other one take over and become the primary drbd system (or just continue
> ) in its original role, if the node shut down was the secondary), and then,
> ) with just one node running, I perform writes on the device (/dev/nb0) as
> ) a mounted filesystem, and afterwards I shut down the remaining node, when
> ) I bring up the two nodes they don't know that they need to resync the
> ) disks. Worse, I need to look at the two disks to know which should be
> ) the primary and perform a resync manually!
>
> That is expected and correct, given the initial goals of DRBD and the way
> it is implemented. The current implementation of DRBD uses a bitmap in RAM
> (on the primary) to mark dirty blocks that should be resynced to the
> secondary when it comes back. If you lose both machines, the bitmap in RAM
> obviously goes away with them, and the next incarnation of DRBD cannot
> possibly know what was written before. A full sync is the way to go. I
> said this is expected because DRBD is intended to tolerate a single
> fault. Multiple faults would certainly need something a little bit
> different.
>
> Also, this has been the subject of long debate on this list for ages, so
> the list archives might help enlighten the subject.
>
> ) Splitting my pain in some steps:
> )
> ) -HA system with two drbd nodes running.
>
> ... that is meant to tolerate single faults.
>
> ) -Master node is shut down.
>
> Your quota of single faults fills up.
>
> ) -Secondary becomes master.
>
> ... and allocates a bitmap in RAM to mark dirty blocks as such and be able
> to resync them later.
>
> ) -Writing is performed on /dev/nb0 which is now mounted by heartbeat
> ) somewhere on the secondary node's filesystem.
>
> This marks some blocks as dirty (written to) in the bitmap in RAM.
>
> ) -Secondary node is also shut down.
>
> Bitmap in RAM goes south.
>
> ) -The two nodes are turned on.
>
> And the primary allocates a fresh new bitmap. :)
>
> ) Questions:
> )
> ) What happens drbd starts on the two nodes?
>
> Nothing special. It can't tell that it had problems earlier; somebody else
> has to tell it. That somebody, at the moment, is the administrator. A
> full-fledged cluster resource manager should help in the future.
>
> ) Must I resync the nodes manually?
>
> Yep!
>
> ) In other words: what happens when I bring up two drbd nodes that are not
> ) synchronized?
>
> Pain. The primary will happily send written blocks to the secondary,
> slowly overwriting whatever was written only there and most likely leaving
> it highly inconsistent.
>
> ) Is there are way to log which node has the more recently updated
> ) filesystem in order to have a fully automated failback or even to help
> ) the system administrator decide which is the 'best' node?
>
> Not with DRBD alone. If you subject something to a situation it was not
> meant to survive, there are good chances it won't survive. :)
>
> ) If someone has any clue, please...
>
> I hope the clues above fit. :)
>
> Regards
> Fábio