Red Hat 7.2 running DRBD 0.6.1-pre6: oops and lost connections
Dual 1 GHz / 1 GB RAM / 50 GB partition / dedicated 100 Mbit connection, on both primary and secondary
/var/log/messages:
Oct 26 10:23:10 nfs1 kernel: drbd: initialised. Version: 0.6.1-pre6 (api:58/proto:60)
Oct 26 10:23:10 nfs1 kernel: drbd0: Connection established.
Oct 26 10:23:10 nfs1 kernel: drbd0: size=51199155 KB / blksize=4096 B
Oct 26 10:23:10 nfs1 kernel: drbd0: Synchronisation started blks=15 int=1
Oct 26 10:27:59 nfs1 kernel: drbd0: blksize=1024 B
Oct 26 10:27:59 nfs1 kernel: drbd0: blksize=4096 B
Oct 26 10:28:08 nfs1 kernel: kjournald starting. Commit interval 5 seconds
Oct 26 10:28:08 nfs1 kernel: EXT3 FS 2.4-0.9.8, 25 Aug 2001 on drbd(43,0), internal journal
Oct 26 10:28:08 nfs1 kernel: EXT3-fs: recovery complete.
Oct 26 10:28:08 nfs1 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Oct 26 10:29:15 nfs1 login(pam_unix)[911]: session opened for user root by LOGIN(uid=0)
Oct 26 10:29:15 nfs1 -- root[911]: ROOT LOGIN ON tty2
Oct 26 12:36:07 nfs1 kernel: drbd0: sock_sendmsg returned -14
Oct 26 12:36:07 nfs1 kernel: drbd0: Connection lost.(pc=0,uc=0)
Oct 26 12:36:08 nfs1 kernel: drbd0: syncer send failed!!
Oct 26 12:36:08 nfs1 kernel: drbd0: Syncer aborted.
Oct 26 12:36:08 nfs1 kernel: drbd0: asender terminated
Oct 26 12:36:11 nfs1 kernel: drbd0: Connection established.
Oct 26 12:36:11 nfs1 kernel: drbd0: size=51199155 KB / blksize=4096 B
Oct 26 12:36:11 nfs1 kernel: drbd0: Synchronisation started blks=15 int=1
Oct 26 12:36:15 nfs1 kernel: drbd0: sock_sendmsg returned -14
Oct 26 12:36:15 nfs1 kernel: drbd0: Connection lost.(pc=35,uc=0)
Oct 26 12:36:15 nfs1 kernel: drbd0: syncer send failed!!
Oct 26 12:36:15 nfs1 kernel: drbd0: Syncer send failed.
Oct 26 12:36:15 nfs1 kernel: drbd0: asender terminated
Oct 26 12:36:15 nfs1 kernel: Unable to handle kernel paging request at virtual address 4f3e3380
Oct 26 12:36:15 nfs1 kernel: printing eip:
Oct 26 12:36:15 nfs1 kernel: f894618f
Oct 26 12:36:15 nfs1 kernel: *pde = 00000000
Oct 26 12:36:15 nfs1 kernel: Oops: 0000
Oct 26 12:36:15 nfs1 kernel: CPU: 1
Oct 26 12:36:15 nfs1 kernel: EIP: 0010:[eepro100:__insmod_eepro100_O/lib/modules/2.4.7-10smp/kernel/drivers/+-65137/96]
Oct 26 12:36:15 nfs1 kernel: EIP: 0010:[<f894618f>]
Oct 26 12:36:23 nfs1 kernel: EFLAGS: 00010093
Oct 26 12:36:27 nfs1 kernel: eax: 4f3e333c ebx: f6c68400 ecx: 0000000c edx: f6610000
Oct 26 12:36:27 nfs1 kernel: esi: f661770c edi: 00000001 ebp: f6627ec0 esp: f6627e98
Oct 26 12:36:27 nfs1 kernel: ds: 0018 es: 0018 ss: 0018
Oct 26 12:36:27 nfs1 kernel: Process drbdd_0 (pid: 968, stackpage=f6627000)
Oct 26 12:36:27 nfs1 kernel: Stack: 0000000c 00000001 00000297 00000000 00000286 00000001 00000286 00000000
Oct 26 12:36:27 nfs1 kernel: f6c68400 00000000 f6627fb8 f8949cf1 f6c68400 0000000a f6627f14 00000001
Oct 26 12:36:27 nfs1 kernel: 00002b00 f68fd6a0 00000803 00000000 f6627f44 00000048 00000000 00000000
Oct 26 12:36:27 nfs1 kernel: Call Trace: [eepro100:__insmod_eepro100_O/lib/modules/2.4.7-10smp/kernel/drivers/+-49935/96] [eepro100:__insmod_eepro100_O/lib/modules/2.4.7-10smp/kernel/drivers/+-54135/96] [eepro100:__insmod_eepro100_O/lib/modules/2.4.7-10smp/kernel/drivers/+-49269/96] [eepro100:__insmod_eepro100_O/lib/modules/2.4.7-10smp/kernel/drivers/+-64733/96] [kernel_thread+38/48]
Oct 26 12:36:27 nfs1 kernel: Call Trace: [<f8949cf1>] [<f8948c89>] [<f8949f8b>] [<f8946323>] [<c0105836>]
Oct 26 12:36:27 nfs1 kernel: [eepro100:__insmod_eepro100_O/lib/modules/2.4.7-10smp/kernel/drivers/+-64820/96]
Oct 26 12:36:28 nfs1 kernel: [<f89462cc>]
Oct 26 12:36:28 nfs1 kernel:
Oct 26 12:36:28 nfs1 kernel: Code: 8b 40 44 83 e9 09 d3 e8 50 ff b3 3c 02 00 00 e8 f5 18 00 00
[root@fs source]#
drbd.conf:
[root@nfs1 root]# cd /etc/
[root@nfs1 etc]# cat drbd.conf
#
# Comment lines.
#
resource drbd0 {
protocol=B
fsckcmd=fsck -p -y
inittimeout=10
disk {
do-panic
disk-size=51199155
}
net {
sync-rate=12500
# skip-sync
tl-size=10240
timeout=60
connect-int=10
ping-int=10
}
on nfs1 {
device=/dev/nb0
disk=/dev/sda3
address=10.1.1.20
port=7788
}
on nfs2 {
device=/dev/nb0
disk=/dev/sda2
address=10.1.1.21
port=7788
}
}
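As a rough sanity check on the config above, here is a small sketch that compares the configured sync-rate with the link speed and estimates how long a full sync would take. It assumes sync-rate is expressed in KB/s (not confirmed for this DRBD version) and that 1 KB = 1000 bytes when converting the link speed; the function names are made up for illustration.

```python
# Values copied from drbd.conf and the hardware description.
DISK_SIZE_KB = 51199155   # disk-size
SYNC_RATE_KB_S = 12500    # sync-rate, ASSUMED to be in KB/s
LINK_MBIT = 100           # dedicated link speed in Mbit/s


def link_capacity_kb_s(mbit):
    """Theoretical link capacity in KB/s (1 KB = 1000 bytes, no protocol overhead)."""
    return mbit * 1_000_000 / 8 / 1000


def full_sync_seconds(size_kb, rate_kb_s):
    """Time for a full sync if the configured rate were sustained throughout."""
    return size_kb / rate_kb_s


capacity = link_capacity_kb_s(LINK_MBIT)
seconds = full_sync_seconds(DISK_SIZE_KB, SYNC_RATE_KB_S)
print("link capacity: %.0f KB/s, configured sync-rate: %d KB/s" % (capacity, SYNC_RATE_KB_S))
print("full sync at configured rate: %.1f minutes" % (seconds / 60))
```

Under those assumptions the configured sync-rate equals the theoretical capacity of the 100 Mbit link, so the syncer and normal replication traffic (for example from bonnie++) would compete for the same bandwidth.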
I have been able to reproduce this every time. Currently it happens during sync-all with bonnie++ running on the primary. I am now trying with the device unmounted, just to see whether the sync-all completes.
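For anyone reading the log, the -14 returned by sock_sendmsg is a negated errno value. A minimal sketch for decoding it, assuming standard Linux errno numbering (the helper name decode_kernel_errno is made up for illustration):

```python
import errno
import os


def decode_kernel_errno(ret):
    """Map a negative kernel return value to (symbolic name, message)."""
    code = -ret
    # errno.errorcode maps numeric codes to names such as 'EFAULT'.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


name, message = decode_kernel_errno(-14)
print("%s: %s" % (name, message))
```

On Linux, errno 14 is EFAULT ("Bad address"), meaning the call was handed an address it could not access. That may be related to the paging-request oops that follows, but that connection is only a guess.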
Thanks,
John Hamlik