I've started playing around with the latest DRBD. The meta-data stuff
is great; it cleanly addresses my major concern with the older DRBD.
Well done!
I do have some questions about how this interfaces with heartbeat.
When two nodes are coming up, DRBD negotiates its states to determine
which one has the latest data. That node is made primary, and it
initiates a quick-sync.
When heartbeat comes up on the node where DRBD is already primary, it
complains:
    2001/10/11_11:11:12 ERROR: Resource datadisk::drbd0 is active, and should not be.
    2001/10/11_11:11:12 WARNING: Non-idle resources will affect resource takeback
Heartbeat doesn't run the "datadisk start" script because datadisk
reported that it was already "running". So even though DRBD is
primary, the file system is never mounted above it.
Things get even worse if heartbeat decides to make the node where DRBD
is secondary the primary node in the cluster:
kernel: drbd0: incompatible states
Now one of the nodes is "WFConnection st:Primary/Unknown" and the
other is "StandAlone st:Primary/Unknown".
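To make the failure concrete, here is a minimal sketch that checks two status strings of the form quoted above and reports when both nodes claim to be primary. The function names are my own, and the only input format assumed is the "<connection state> st:<local>/<peer>" shape shown in this message; it does not read anything from DRBD itself.

```python
def parse_state(status):
    """Split a status string such as 'WFConnection st:Primary/Unknown'
    into (connection_state, local_role, peer_role)."""
    conn, roles = status.split(" st:")
    local, peer = roles.split("/")
    return conn.strip(), local.strip(), peer.strip()


def both_primary(status_a, status_b):
    """Return True when each node reports itself as Primary."""
    local_a = parse_state(status_a)[1]
    local_b = parse_state(status_b)[1]
    return local_a == "Primary" and local_b == "Primary"


# The two states observed after heartbeat promoted the secondary node.
node_a = "WFConnection st:Primary/Unknown"
node_b = "StandAlone st:Primary/Unknown"
print(both_primary(node_a, node_b))
```

With the two states above, the check reports that both sides consider themselves primary, which is the condition the "incompatible states" message appears to be complaining about.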
It seems to me that heartbeat should be the sole entity to make a DRBD
node primary. I wonder if it would be better to have DRBD do its
meta-data negotiation, decide who should be primary, do the quick-sync
and then *go back to secondary* before allowing heartbeat to start.
Once the disks are quick-synced, then heartbeat can make whichever node
it wants primary.
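The ordering I am proposing can be sketched as a toy model. This is only an illustration of the sequence, not DRBD or heartbeat code: the Node class, the "generation" counter standing in for the meta-data, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    generation: int          # hypothetical stand-in for DRBD's meta-data
    role: str = "Secondary"


def startup(a, b):
    """Proposed boot sequence: negotiate, quick-sync, then demote."""
    # Negotiation: the node with the newer data becomes the sync source.
    source, target = (a, b) if a.generation >= b.generation else (b, a)
    source.role = "Primary"                 # temporary, only for the sync
    target.generation = source.generation   # stand-in for the quick-sync
    source.role = "Secondary"               # step back before heartbeat starts
    return source


def heartbeat_takeover(node, peer):
    """Heartbeat is the only caller that leaves a node Primary."""
    if peer.role == "Primary":
        raise RuntimeError("peer is already primary")
    node.role = "Primary"


cuda1 = Node("cuda1", generation=7)
cuda2 = Node("cuda2", generation=5)
startup(cuda1, cuda2)
# Both nodes are now in sync and Secondary, so heartbeat may pick either one.
heartbeat_takeover(cuda2, cuda1)
print(cuda1.role, cuda2.role)
```

The point of the model is that after startup() neither node is primary, so heartbeat's choice can never conflict with one DRBD already made.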
Is this reasonable? Am I missing something?
Thanks (my configuration is below).
Tony.
Two RH 6.1 systems.
DRBD: version: 0.6.1-pre3 (api:58/proto:58).
reiserfs over DRBD.
Heartbeat: version 0.4.9.0d.
drbd.conf:
resource drbd0 {
  protocol=B
  fsckcmd=fsck -p -y
  disk {
    do-panic
    disk-size=2048728
  }
  net {
    sync-rate=5000
    skip-sync
    tl-size=256
    timeout=60
    connect-int=10
    ping-int=10
  }
  on cuda1 {
    device=/dev/nb0
    disk=/dev/hda5
    address=192.0.2.1
    port=7788
  }
  on cuda2 {
    device=/dev/nb0
    disk=/dev/hda5
    address=192.0.2.2
    port=7788
  }
}
--
Tony Willoughby tonyw@example.com
"I don't want knowledge, I want certainty"
-David Bowie