I now have my NFS cluster almost working (Red Hat 7.1 base, kernel
2.4.10-ac11 patched for ext3, drbd 0.6.1-pre6). I am testing right now by
leaving the secondary node up and simply rebooting the primary
node. The failover goes fine: drbd switches to primary on node2 and
the services start without a hitch. When the primary node comes back up,
though, it wants to do a SyncAll most of the time - I have seen it do the
quick sync occasionally, but usually it's a SyncAll. This causes the
datadisk scripts to fail to mount /dev/nb0 and /dev/nb1, which in turn
causes most of the NFS services to fail, since they try to export /home
(nb0) and the lock directories are symlinked to nb1.
It seems to me that drbd should either always do the quick sync
(assuming both nodes didn't go down at the same time), or else datadisk
should at least wait and retry mounting the partitions until the SyncAll
completes (an hour later).
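For the second option, a minimal sketch of what I have in mind - a generic
wait-and-retry wrapper that datadisk could use instead of a bare mount. The
function name, tries, and delay values are just placeholders, not anything
from the actual datadisk script:

```shell
#!/bin/sh
# retry TRIES DELAY CMD [ARGS...]
# Run CMD repeatedly until it succeeds or TRIES attempts are used up,
# sleeping DELAY seconds between attempts. Returns 0 on success, 1 on
# giving up. (Hypothetical helper, not part of the drbd scripts.)
retry() {
    tries="$1"
    delay="$2"
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        "$@" && return 0        # command succeeded, we're done
        i=$((i + 1))
        sleep "$delay"          # e.g. wait for SyncAll to finish
    done
    echo "retry: giving up on '$*' after $tries attempts" >&2
    return 1
}

# In datadisk, instead of a plain mount, something like:
#   retry 60 60 mount /dev/nb0 /home
# would keep trying for about an hour before failing for real.
```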
Any suggestions?