I've just successfully set up DRBD 0.6.1-pre3 and heartbeat on two new
Athlon boxes to provide failover NFS service. It's succeeded beyond my
dreams so far - so congratulations all.
While setting it up, I found what look like a couple of errors in the
"datadisk" script: it calls drbdsetup with the arguments "pri" and
"sec" rather than "primary" and "secondary", which caused it to fail on
my setup.
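
For anyone hitting the same thing, here is a rough Python sketch of the change I mean. It is only an illustration: the function name is made up, and I am assuming the abbreviations appear as separate words on the lines that call drbdsetup, which may not match how your copy of the script is laid out.

```python
import re

# Abbreviations that failed for me, mapped to the full words that worked.
FULL_WORDS = {"pri": "primary", "sec": "secondary"}

def expand_drbdsetup_args(script_text: str) -> str:
    """Expand 'pri'/'sec' to full words, only on lines that mention drbdsetup.

    Hypothetical helper: it assumes the abbreviations appear as whole words.
    """
    out = []
    for line in script_text.splitlines(keepends=True):
        if "drbdsetup" in line:
            # \b keeps already-expanded words like "primary" untouched.
            line = re.sub(r"\b(pri|sec)\b",
                          lambda m: FULL_WORDS[m.group(1)], line)
        out.append(line)
    return "".join(out)
```

Lines that don't mention drbdsetup are left alone, so an unrelated "sec" elsewhere in the script is not rewritten.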
I have a few questions that I need answered before moving this onto
any kind of "production" box, though, and I hope someone will be able to
help.
Firstly, what (2.4) kernel versions are recommended, if any, to work
best with DRBD?
Secondly, I don't really have a good understanding of the different
states that the system can be in. Where can I look for documentation /
code on this? For example: clearly, if host1 is primary and host2 is
secondary, and if host1 goes down, then datadisk will make host2
primary, and then when host1 comes back up it will become secondary and
do a SyncAll. But what happens if both machines go down? Will they
both come back up in secondary state? When is manual intervention
required? What happens if host1 and host2 are primary/secondary as
before, host1 goes down, comes back up and starts synchronising, but
then host2 goes down while this sync is occurring? Is there some sort
of state diagram available somewhere? (there are clearly several
possibilities I haven't mentioned here, and I'd like to have a good
understanding of what might happen before deciding how to test and
benchmark my setup)
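
To make the question concrete, here is a small Python sketch of the one scenario I think I understand. All the names are mine, nothing here comes from the DRBD code, and the transitions are only my assumptions about what datadisk does. The cases I am asking about are deliberately left as errors.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    state: str = "secondary"   # "primary", "secondary" or "down"
    syncing: bool = False      # True while this node is doing a SyncAll

def fail(node: Node, peer: Node) -> None:
    """node goes down. Assumed: datadisk promotes a healthy surviving peer."""
    node.state = "down"
    node.syncing = False
    if peer.state == "down":
        return  # both down, nothing left to promote
    if peer.syncing:
        # Peer was still synchronising: this is one of my open questions.
        raise NotImplementedError("peer was mid-sync: behaviour unknown")
    peer.state = "primary"

def recover(node: Node, peer: Node) -> None:
    """node comes back. Assumed: it rejoins as secondary and does a SyncAll."""
    if peer.state != "primary":
        # Both machines were down: another open question.
        raise NotImplementedError("no primary to sync from: behaviour unknown")
    node.state = "secondary"
    node.syncing = True
```

With host1 primary and host2 secondary, fail(host1, host2) followed by recover(host1, host2) gives the sequence I described. Calling fail(host2, host1) while host1 is still syncing hits the first unknown case.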
Lastly, I'm getting a lot of errors in the syslog from drbd:
    drbd0: transferlog too small!!
    drbd0: Epoch set size wrong!!found=512 reported=1535
I know from the mailing list archive that the first one isn't a problem
- although I have tried to remove it by setting tl-size=1024, with no
success. But what does the second message mean? Should I be worried?
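
In case the numbers matter, here is a rough Python sketch of how the second message could be pulled out of the log to see how found and reported vary. It assumes the message always has exactly the form quoted above; the function name is made up.

```python
import re

# Matches the message exactly as it appears in my syslog (format assumed stable).
PATTERN = re.compile(
    r"(drbd\d+): Epoch set size wrong!!found=(\d+) reported=(\d+)"
)

def epoch_mismatches(lines):
    """Return (device, found, reported) for each matching log line."""
    hits = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            hits.append((m.group(1), int(m.group(2)), int(m.group(3))))
    return hits
```

It takes any iterable of lines, so an open log file can be passed in directly.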
Hope someone can help!
Many thanks
Jack Bertram