Hello,
Mysterious death of the [drbdd_x] thread:
Here is the test I ran:
Node 1 is primary. Processes on node 1 write synchronously to drbd0 and
drbd1.
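
For anyone who wants to generate a similar load: the source of the writer is not part of this report, so the following is only a hypothetical stand-in, a minimal sketch assuming that "synchronous" means each block is flushed to the device before the next one is written. The function name and its parameters are made up for illustration.

```python
import os
import tempfile


def write_synchronously(path, block, count):
    """Write `block` to `path` `count` times, flushing after each block.

    Hypothetical stand-in for a synchronous writer; not the real test tool.
    """
    # O_SYNC is not defined on every platform, so fall back to 0 and rely
    # on the explicit fsync() below.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_SYNC", 0)
    fd = os.open(path, flags, 0o600)
    written = 0
    try:
        for _ in range(count):
            view = memoryview(block)
            # os.write() may write fewer bytes than requested, so loop.
            while len(view) > 0:
                n = os.write(fd, view)
                view = view[n:]
                written += n
            # Force the data out before starting the next block.
            os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

Pointing `path` at a file on the mounted DRBD device (for example somewhere under /mnt/drbd0) would give roughly the kind of write pattern described above.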
Periodically, node 2 is hard-reset.
This morning I observed that [drbdd_0] had died. The syslog on node 1 says:
Oct 25 11:25:44 CNODE-1-110 kernel: drbd0: Connection established.
Oct 25 11:25:44 CNODE-1-110 kernel: drbd0: size=52384 KB / blksize=4096 B
Oct 25 11:25:44 CNODE-1-110 kernel: drbd0: Synchronisation started blks=15 int=1
Oct 25 11:25:44 CNODE-1-110 kernel: drbd1: Connection established.
Oct 25 11:25:44 CNODE-1-110 kernel: drbd1: size=102784 KB / blksize=4096 B
Oct 25 11:25:44 CNODE-1-110 kernel: drbd1: Synchronisation started blks=15 int=1
Oct 25 11:25:46 CNODE-1-110 kernel: drbd0: Synchronisation done.
Oct 25 11:25:46 CNODE-1-110 kernel: drbd1: Synchronisation done.
Oct 25 11:25:47 CNODE-1-110 kernel: TCP: peer 136.10.15.120:1024/7788 shrinks window 557600581:6849:557620853. Bad, what else can I say?
Oct 25 11:25:47 CNODE-1-110 kernel: drbd0: magic?? m: 0 c: 0 l: 0
Oct 25 11:25:47 CNODE-1-110 kernel: drbd0: Connection lost.(pc=6,uc=0)
Oct 25 11:25:47 CNODE-1-110 kernel: drbd0: asender terminated
Oct 25 11:25:47 CNODE-1-110 kernel: Unable to handle kernel paging request at virtual address 4f3e3680
Oct 25 11:25:47 CNODE-1-110 kernel: printing eip:
Oct 25 11:25:47 CNODE-1-110 kernel: c88b60b5
Oct 25 11:25:47 CNODE-1-110 kernel: *pde = 00000000
Oct 25 11:25:47 CNODE-1-110 kernel: Oops: 0000
Oct 25 11:25:47 CNODE-1-110 kernel: CPU: 0
cat /proc/drbd on node 1:
version: 0.6.1-pre6 (api:58/proto:60)
0: cs:Unconnected st:Primary/Unknown ns:93116 nr:0 dw:94172 dr:90244 pe:4 ua:0
1: cs:Connected st:Primary/Secondary ns:143760 nr:0 dw:144800 dr:140956 pe:0 ua:0
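
To spot this state automatically, the status lines can be checked with a small script. This is a minimal sketch that assumes the /proc/drbd layout is exactly the one shown above (a minor number followed by key:value fields); the function names are made up, and reading `pe` as the count of requests still pending towards the peer is my assumption.

```python
import re

SAMPLE = """\
version: 0.6.1-pre6 (api:58/proto:60)
0: cs:Unconnected st:Primary/Unknown ns:93116 nr:0 dw:94172 dr:90244
pe:4 ua:0
1: cs:Connected st:Primary/Secondary ns:143760 nr:0 dw:144800 dr:140956
pe:0 ua:0
"""


def parse_proc_drbd(text):
    """Return {minor: {field: value}} for the layout shown in this report."""
    devices = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        start = re.match(r"^(\d+):\s+(.*)$", line)
        if start:
            # A new device line such as "0: cs:Unconnected ...".
            current = int(start.group(1))
            devices[current] = {}
            rest = start.group(2)
        elif current is not None and line:
            # A wrapped continuation line such as "pe:4 ua:0".
            rest = line
        else:
            # Version header or blank line before the first device.
            continue
        for key, value in re.findall(r"(\w+):(\S+)", rest):
            devices[current][key] = int(value) if value.isdigit() else value
    return devices


def stalled_devices(devices):
    """Minors that are not connected but still have pending requests."""
    return sorted(
        minor
        for minor, fields in devices.items()
        if fields.get("cs") != "Connected" and fields.get("pe", 0) > 0
    )
```

Run against the output above, `stalled_devices(parse_proc_drbd(SAMPLE))` flags only device 0, the one whose thread is gone.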
ps -ef |grep drbd on node 1:
root 103 1 0 11:12 ? 00:00:07 [drbdd_1]
root 455 454 0 11:17 ? 00:00:00 /sbin/boswrite.exe 5000000 600 5 10000 20000 /mnt/drbd0
root 480 455 0 11:17 ? 00:00:00 /sbin/boswrite.exe 5000000 600 5 10000 20000 /mnt/drbd0
root 481 480 0 11:17 ? 00:00:00 /sbin/boswrite.exe 5000000 600 5 10000 20000 /mnt/drbd0
root 488 480 0 11:17 ? 00:00:00 /sbin/boswrite.exe 5000000 600 5 10000 20000 /mnt/drbd0
root 492 491 0 11:17 ? 00:00:00 /sbin/boswrite.exe 5000000 600 5 10000 20000 /mnt/drbd1
root 493 492 0 11:17 ? 00:00:00 /sbin/boswrite.exe 5000000 600 5 10000 20000 /mnt/drbd1
root 494 493 0 11:17 ? 00:00:00 /sbin/boswrite.exe 5000000 600 5 10000 20000 /mnt/drbd1
root 495 493 0 11:17 ? 00:00:00 /sbin/boswrite.exe 5000000 600 5 10000 20000 /mnt/drbd1
root 1053 103 0 11:27 ? 00:00:00 [drbd_asender_1]
root 1054 103 0 11:27 ? 00:00:00 [drbd_syncer_1 <defunct>]
(The boswrite processes are mine...)
-> [drbdd_0] is not there anymore!!
Bye,
--
Jean-Yves BOUET
EADS Defence and Security Networks
jean-yves.bouet@example.com
01 34 60 86 36