Mailing List Archive

Blksize / mount bug ???
Hi,

I found some really curious behavior of drbd (latest CVS version).

On the primary node I did:
insmod drbd
drbdsetup /dev/nb0 /dev/hda6 C 192.168.10.72 192.168.10.73 -r 1000 -d 40131
drbdsetup /dev/nb0 PRI

On the secondary:
insmod drbd
drbdsetup /dev/nb0 /dev/hda6 C 192.168.10.73 192.168.10.72 -r 1000 -d 40131
drbdsetup /dev/nb0 SEC

Again on the primary:
drbdsetup /dev/nb0 REPL

Everything went fine, logfile said:
primary:
Dec 14 23:03:32 sunder2 kernel: drbd: module initialised. Version: 58
Dec 14 23:03:56 sunder2 kernel: drbd0: user provided size = 40131 KB
Dec 14 23:03:56 sunder2 kernel: drbd : vmallocing 1254 B for bitmap. @c802001c
Dec 14 23:04:24 sunder2 kernel: drbd0: Connection established.
Dec 14 23:04:24 sunder2 kernel: drbd0: size=40131 KB / blksize=4096 B
Dec 14 23:04:24 sunder2 kernel: drbd0: Synchronisation started blks=3 int=1
Dec 14 23:04:24 sunder2 kernel: drbd0: Synchronisation done. 0 blks sent
Dec 14 23:04:57 sunder2 kernel: drbd0: Synchronisation started blks=3 int=1
Dec 14 23:07:00 sunder2 kernel: drbd0: Synchronisation done. 10032 blks sent

Fine, 123 s reconstruction time -> ~326 KB/s, which seems okay, because the
secondary is only a 486/33 with 8 MB RAM, and I have cheap Realtek 10 Mbit NICs
connected via BNC cable.
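
For reference, a minimal sketch of how that figure follows from the primary's log: the elapsed time comes from the two "Synchronisation" timestamps, and the volume from the block count times the reported blksize. The function names are made up for illustration, and the time parsing assumes both stamps fall on the same day.

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two HH:MM:SS syslog stamps (same day assumed)."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


def throughput_kb_per_s(blocks: int, blksize_bytes: int, seconds: int) -> float:
    """Average transfer rate in KB/s for a sync of `blocks` blocks."""
    return blocks * blksize_bytes / 1024 / seconds


# Values taken from the primary's log of the first run.
secs = elapsed_seconds("23:04:57", "23:07:00")
rate = throughput_kb_per_s(10032, 4096, secs)
print(secs, "s,", round(rate), "KB/s")
```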

secondary:
Dec 14 23:06:10 sunder3 kernel: drbd: module initialised. Version: 58
Dec 14 23:06:56 sunder3 kernel: drbd0: user provided size = 40131 KB
Dec 14 23:06:56 sunder3 kernel: drbd : vmallocing 1254 B for bitmap. @c101101c
Dec 14 23:06:56 sunder3 kernel: drbd0: Connection established.
Dec 14 23:06:56 sunder3 kernel: drbd0: size=40131 KB / blksize=4096 B

Sorry, as I see now, the clocks on the two machines are not synchronized.

/proc/drbd on primary:
version : 58

0: cs:Connected st:Primary/Secondary ns:10032 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary/Unknown ns:0 nr:0 dw:0 dr:0 of:0

secondary:
version : 58

0: cs:Connected st:Secondary/Primary ns:0 nr:10032 dw:10032 dr:0 of:0
1: cs:Unconfigured st:Secondary/Unknown ns:0 nr:0 dw:0 dr:0 of:0
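
A small sketch for comparing the two status lines mechanically. It only splits the `key:value` tokens shown above; reading `ns` and `nr` as the blocks sent and received over the network is my interpretation of the counter names, and `parse_status_line` is a made-up helper, not part of drbd.

```python
def parse_status_line(line: str):
    """Split a /proc/drbd device line into (minor number, field dict)."""
    minor, rest = line.split(":", 1)
    fields = {}
    for token in rest.split():
        key, value = token.split(":", 1)
        # Counters become ints; states such as "Connected" stay strings.
        fields[key] = int(value) if value.isdigit() else value
    return int(minor), fields


primary = parse_status_line(
    "0: cs:Connected st:Primary/Secondary ns:10032 nr:0 dw:0 dr:0 of:0")
secondary = parse_status_line(
    "0: cs:Connected st:Secondary/Primary ns:0 nr:10032 dw:10032 dr:0 of:0")

# If the interpretation holds, what the primary sent is what the secondary got.
print(primary[1]["ns"] == secondary[1]["nr"])
```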


Completely different try (I took drbd down and unloaded the modules):
On the primary node I did:
everything as above, but additionally:
mount /dev/nb0 /mnt

everything else as above

But now, the logfile said:
primary:
Dec 14 23:12:12 sunder2 kernel: drbd0: user provided size = 40131 KB
Dec 14 23:12:12 sunder2 kernel: drbd : vmallocing 1254 B for bitmap. @c802001c
Dec 14 23:12:26 sunder2 kernel: drbd0: Connection established.
Dec 14 23:12:26 sunder2 kernel: drbd0: size=40131 KB / blksize=4096 B
Dec 14 23:12:26 sunder2 kernel: drbd0: Synchronisation started blks=3 int=1
Dec 14 23:12:26 sunder2 kernel: drbd0: Synchronisation done. 0 blks sent
Dec 14 23:12:56 sunder2 kernel: drbd0: blksize=1024 B
Dec 14 23:13:17 sunder2 kernel: drbd0: Synchronisation started blks=15 int=1
Dec 14 23:21:02 sunder2 kernel: drbd0: Synchronisation done. 40131 blks sent

Hmm, the reconstruction took 465 s -> ~86 KB/s; this is definitely too slow, isn't it?
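
To put the two runs side by side, here is a rough sketch that computes both KB/s and blocks/s from the logged block counts, block sizes and durations. The helper name `rates` is hypothetical. With these numbers the blocks-per-second figure comes out nearly the same in both runs (roughly 82 vs. 86), which would be consistent with a per-block cost dominating, though that is only a guess from two data points.

```python
def rates(blocks: int, blksize_bytes: int, seconds: int):
    """Return (KB per second, blocks per second) for one sync run."""
    kb_per_s = blocks * blksize_bytes / 1024 / seconds
    blocks_per_s = blocks / seconds
    return kb_per_s, blocks_per_s


# First run: 10032 blocks of 4096 B in 123 s (device not mounted).
run_4k = rates(10032, 4096, 123)
# Second run: 40131 blocks of 1024 B in 465 s (device mounted).
run_1k = rates(40131, 1024, 465)

print("4096 B run: %.1f KB/s, %.1f blocks/s" % run_4k)
print("1024 B run: %.1f KB/s, %.1f blocks/s" % run_1k)
```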

secondary:
Dec 14 23:14:54 sunder3 kernel: drbd: module initialised. Version: 58
Dec 14 23:14:58 sunder3 kernel: drbd0: user provided size = 40131 KB
Dec 14 23:14:58 sunder3 kernel: drbd : vmallocing 1254 B for bitmap. @c101101c
Dec 14 23:14:58 sunder3 kernel: drbd0: Connection established.
Dec 14 23:14:58 sunder3 kernel: drbd0: size=40131 KB / blksize=4096 B
Dec 14 23:15:28 sunder3 kernel: drbd0: blksize=1024 B
Again, sorry for the unsynchronized clocks.

/proc/drbd on primary:
version : 58

0: cs:Connected st:Primary/Secondary ns:40132 nr:0 dw:1 dr:0 of:0
1: cs:Unconfigured st:Secondary/Unknown ns:0 nr:0 dw:0 dr:0 of:0

secondary:
version : 58

0: cs:Connected st:Secondary/Primary ns:0 nr:40132 dw:40132 dr:0 of:0
1: cs:Unconfigured st:Secondary/Unknown ns:0 nr:0 dw:0 dr:0 of:0

I noticed that the blocksize changes when I mount the device, but I have no
clue which blocksize is the right one. According to dumpe2fs the blocksize of my
filesystem is 1k (but what does the filesystem have to do with it?). Judging by
the replication speed, 4096 seems to be the right value.
Has any data been lost? (I didn't find any errors on the fs in either case, but
that may be misleading, because I didn't alter the data and both disks were in
sync before.)
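
One way to check for lost data independently of fsck would be to checksum the lower-level partition on both nodes and compare the results. The following is only a sketch under assumptions: `device_digest` is a made-up helper, it must be run as root against the backing device (here that would be /dev/hda6) while nothing is writing to it, and the byte limit is there because the user-provided size (40131 KB) may be smaller than the partition itself.

```python
import hashlib


def device_digest(path: str, limit_bytes=None, chunk_size: int = 65536) -> str:
    """MD5 hex digest of the first `limit_bytes` of `path` (all if None)."""
    digest = hashlib.md5()
    remaining = limit_bytes
    with open(path, "rb") as dev:
        while remaining is None or remaining > 0:
            want = chunk_size if remaining is None else min(chunk_size, remaining)
            chunk = dev.read(want)
            if not chunk:
                break  # end of device or file reached
            digest.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return digest.hexdigest()
```

Calling `device_digest("/dev/hda6", limit_bytes=40131 * 1024)` on each node and comparing the two strings would show whether the replicated region is identical; a mismatch says nothing about where the difference is.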

By the way, the kernel is a clean 2.2.16 with RAID patches, but drbd is not
running on top of RAID (there's no RAID running).

Thanks,
Sven