Mailing List Archive

Crash
I just had a nice crash:

Both nodes were up and fully synchronized.
smbd was running on node riclu_a. I copied some files from a Windows client to a share; everything worked fine, and
the files seemed to be replicated to node riclu_b as well.

A minute later, I switched off node riclu_a. The other node, riclu_b, crashed completely instead of taking over, so
no high availability. There was a message on the screen about an error in some C file and a kernel panic, but I
failed to write it down exactly and it was not saved in /var/log/messages. Too bad.

Node riclu_b was completely dead: there was no way to switch terminals with Alt-F2 or the like, so I had to reset it.
After that it did not want to become primary. I first had to reboot node riclu_a; then riclu_b came up as primary
and a full sync started. To be clear: riclu_b could not take over the service and crashed, and after a reboot it did
not want to become primary. drbd came up only after a reboot of riclu_a (the former primary), but then as primary;
the former primary is now secondary, and they are synchronising.

There are some error messages about heartbeat in /var/log/messages:

ERROR: ha_msg_add_nv: line doesn't contain '='
ERROR: s>>>

and often:

ERROR: controlfifo2msg: cannot create message
ERROR: control_process: NULL message

Here is /var/log/messages:

Nov 12 22:20:00 riclu_b heartbeat[1232]: WARN: node riclu_a: is dead
Nov 12 22:20:00 riclu_b heartbeat[1232]: info: Link riclu_a:/dev/ttyS0 dead.
Nov 12 22:20:00 riclu_b heartbeat[1232]: info: Link riclu_a:eth0 dead.
Nov 12 22:20:00 riclu_b heartbeat: info: Running /etc/ha.d/rc.d/status status
Nov 12 22:20:00 riclu_b heartbeat: info: Running /etc/ha.d/rc.d/ifstat ifstat
Nov 12 22:20:00 riclu_b heartbeat: info: Running /etc/ha.d/rc.d/ifstat ifstat
Nov 12 22:20:00 riclu_b heartbeat: info: Taking over resource group 192.168.1.23
Nov 12 22:20:00 riclu_b heartbeat: info: Acquiring resource group: riclu_a 192.168.1.23 datadisk::drbd1 smb
Nov 12 22:20:00 riclu_b heartbeat: info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.23 start
Nov 12 22:20:00 riclu_b heartbeat: info: ifconfig eth1:0 192.168.1.23 netmask 255.255.255.0^Ibroadcast 192.168.1.255
Nov 12 22:20:00 riclu_b heartbeat: info: Sending Gratuitous Arp for 192.168.1.23 on eth1:0 [eth1]
Nov 12 22:20:00 riclu_b heartbeat: info: Running /etc/ha.d/resource.d/datadisk drbd1 start
Nov 12 22:20:00 riclu_b kernel: drbd1: blksize=1024 B
Nov 12 22:20:00 riclu_b kernel: drbd1: blksize=4096 B
Nov 12 22:20:04 riclu_b kernel: drbd0: ping ack did not arrive
Nov 12 22:20:04 riclu_b kernel: drbd0: Connection lost.(pc=0,uc=0)
Nov 12 22:20:04 riclu_b kernel: drbd0: asender terminated
Nov 12 22:20:05 riclu_b kernel: drbd1: ping ack did not arrive
Nov 12 22:20:05 riclu_b kernel: drbd1: Connection lost.(pc=24,uc=0)
Nov 12 22:20:05 riclu_b kernel: drbd1: asender terminated
Nov 12 22:20:05 riclu_b kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000004
Nov 12 22:20:05 riclu_b kernel: printing eip:
Nov 12 22:20:05 riclu_b kernel: c01d5016
Nov 12 22:20:05 riclu_b kernel: *pde = 00000000
Nov 12 22:20:05 riclu_b kernel: Oops: 0002
Nov 12 22:20:05 riclu_b kernel: CPU: 0
Nov 12 22:20:05 riclu_b kernel: EIP: 0010:[wait_for_tcp_memory+166/624]
Nov 12 22:20:05 riclu_b kernel: EIP: 0010:[<c01d5016>]
Nov 12 22:20:05 riclu_b kernel: EFLAGS: 00010246
Nov 12 22:20:05 riclu_b kernel: eax: 00000000 ebx: c02ae808 ecx: 00000000 edx: c02ae808
Nov 12 22:20:05 riclu_b kernel: esi: f35380c0 edi: 7fffffff ebp: 00000000 esp: f2695a50
Nov 12 22:20:05 riclu_b kernel: ds: 0018 es: 0018 ss: 0018
Nov 12 22:20:05 riclu_b kernel: Process mount (pid: 2029, stackpage=f2695000)
Nov 12 22:20:05 riclu_b kernel: Stack: f2694000 00000000 00000000 f2694000 00000000 00000000 00000000
f2694000
Nov 12 22:20:05 riclu_b kernel: f3836184 f3836184 00000000 f344cd80 00000000 f35381f8 c01d6af5
f35380c0
Nov 12 22:20:05 riclu_b kernel: f2695adc 00000000 c2187ec0 c02e1b20 c019e836 00000000 c02e1b20
f2695ad8
Nov 12 22:20:05 riclu_b kernel: Call Trace: [tcp_sendmsg+3349/4432] [do_rw_disk+246/736]
[ide_set_handler+88/96] [ide_dmaproc+309/528] [ide_dma_intr+0/192]
Nov 12 22:20:05 riclu_b kernel: Call Trace: [<c01d6af5>] [<c019e836>] [<c018ad78>] [<c01954e5>]
[<c0194d10>]
Nov 12 22:20:05 riclu_b kernel: [dma_timer_expiry+0/96] [inet_sendmsg+53/64] [sock_sendmsg+108/144]
[ide_wait_stat+202/272] [start_request+416/528] [<f8935cf6>]
Nov 12 22:20:05 riclu_b kernel: [<c0195350>] [<c01eee05>] [<c01b8bec>] [<c018bb4a>] [<c018be40>]
[<f8935cf6>]
Nov 12 22:20:05 riclu_b kernel: [<f89359cc>] [<f89358a3>] [ide_set_handler+88/96] [<f893aae6>]
[generic_make_request+242/272] [generic_block_bmap+46/64]
Nov 12 22:20:05 riclu_b kernel: [<f89359cc>] [<f89358a3>] [<c018ad78>] [<f893aae6>] [<c017fd92>]
[<c01380ce>]
Nov 12 22:20:05 riclu_b kernel: [submit_bh+78/112] [write_locked_buffers+52/96]
[write_unlocked_buffers+171/272] [sync_buffers+20/64] [fsync_no_super+14/32]
[8139too:__insmod_8139too_O/lib/modules/2.4.7-10/kernel/drivers/net/+-793243/96]
Nov 12 22:20:05 riclu_b kernel: [<c017fdfe>] [<c01357b4>] [<c013588b>] [<c0135994>] [<c01359ce>]
[<f8803565>]
Nov 12 22:20:05 riclu_b kernel: [8139too:__insmod_8139too_O/lib/modules/2.4.7-10/kernel/drivers/net/+-783322/96]
[8139too:__insmod_8139too_O/lib/modules/2.4.7-10/kernel/drivers/net/+-714014/96]
[8139too:__insmod_8139too_O/lib/modules/2.4.7-10/kernel/drivers/net/+-715596/96]
[8139too:__insmod_8139too_O/lib/modules/2.4.7-10/kernel/drivers/net/+-694460/96] [read_super+98/176]
[get_sb_bdev+320/416]
Nov 12 22:20:05 riclu_b kernel: [<f8805c26>] [<f8816ae2>] [<f88164b4>] [<f881b744>] [<c0139dd2>]
[<c0139fd0>]
Nov 12 22:20:05 riclu_b kernel: [8139too:__insmod_8139too_O/lib/modules/2.4.7-10/kernel/drivers/net/+-694460/96]
[cached_lookup+16/80] [do_add_mount+243/560]
[8139too:__insmod_8139too_O/lib/modules/2.4.7-10/kernel/drivers/net/+-694460/96]
[8139too:__insmod_8139too_O/lib/modules/2.4.7-10/kernel/drivers/net/+-694460/96] [do_page_fault+0/1168]
Nov 12 22:20:05 riclu_b kernel: [<f881b744>] [<c013ed00>] [<c013a883>] [<f881b744>] [<f881b744>]
[<c0113450>]
Nov 12 22:20:05 riclu_b kernel: [error_code+56/64] [do_mount+246/272] [__alloc_pages+110/608]
[copy_mount_options+76/160] [sys_mount+124/192] [system_call+51/56]
Nov 12 22:20:05 riclu_b kernel: [<c0107028>] [<c013ab56>] [<c012e5be>] [<c013aa0c>] [<c013abec>]
[<c0106f2b>]
Nov 12 22:20:05 riclu_b kernel:
Nov 12 22:20:05 riclu_b kernel: Code: 0f ba 68 04 00 8b 04 24 c7 00 01 00 00 00 8b 86 e4 02 00 00
Nov 12 22:20:05 riclu_b kernel: <1>Unable to handle kernel NULL pointer dereference at virtual address 000004bf
Nov 12 22:20:05 riclu_b kernel: printing eip:
Nov 12 22:20:05 riclu_b kernel: c0124e97
Nov 12 22:20:05 riclu_b kernel: *pde = 00000000
Nov 12 22:20:05 riclu_b kernel: Oops: 0000
Nov 12 22:20:05 riclu_b kernel: CPU: 0
Nov 12 22:20:05 riclu_b kernel: EIP: 0010:[find_vma+103/128]
Nov 12 22:20:05 riclu_b kernel: EIP: 0010:[<c0124e97>]
Nov 12 22:20:05 riclu_b kernel: EFLAGS: 00010206
Nov 12 22:20:05 riclu_b kernel: eax: f2695ba0 ebx: f6708b40 ecx: 000004bf edx: 000004b7
Nov 12 22:20:05 riclu_b kernel: esi: 00000000 edi: c0113450 ebp: 000004bf esp: f2695c08
Nov 12 22:20:05 riclu_b kernel: ds: 0018 es: 0018 ss: 0018
Nov 12 22:20:05 riclu_b kernel: Process ResourceManager (pid: 2042, stackpage=f2695000)
Nov 12 22:20:05 riclu_b kernel: Stack: f6708b40 c01134d9 f6708b40 000004bf 00000000 f2694000 f2694000
00000300
Nov 12 22:20:05 riclu_b kernel: 00000000 00030001 00000018 ffffffff c0204aeb 00000010 00010212
00000286
Nov 12 22:20:05 riclu_b kernel: 00000758 40016540 f2694000 00000002 c0113450 0804f3cc c0107028
f2695c6c
Nov 12 22:20:05 riclu_b kernel: Call Trace: [do_page_fault+137/1168] [clear_user+43/64]
[do_page_fault+0/1168] [error_code+56/64] [do_page_fault+0/1168]
Nov 12 22:20:05 riclu_b kernel: Call Trace: [<c01134d9>] [<c0204aeb>] [<c0113450>] [<c0107028>]
[<c0113450>]
Nov 12 22:20:05 riclu_b kernel: [error_code+56/64] [do_page_fault+0/1168] [find_vma+103/128]
[do_page_fault+137/1168] [ide_end_request+130/144] [ide_dma_intr+112/192]
Nov 12 22:20:05 riclu_b kernel: [<c0107028>] [<c0113450>] [<c0124e97>] [<c01134d9>] [<c018ad12>]
[<c0194d80>]
Nov 12 22:20:05 riclu_b kernel: [ide_intr+292/336] [account_io_end+60/80] [do_page_fault+0/1168]
[error_code+56/64] [do_page_fault+0/1168] [find_vma+103/128]
Nov 12 22:20:05 riclu_b kernel: [<c018c664>] [<c017f2fc>] [<c0113450>] [<c0107028>] [<c0113450>]
[<c0124e97>]
Nov 12 22:20:05 riclu_b kernel: [do_page_fault+137/1168] [deliver_signal+29/96] [wake_up_parent+37/64]
[do_notify_parent+166/176] [pipe_write_release+14/32] [fput+116/192]
Nov 12 22:20:05 riclu_b kernel: [<c01134d9>] [<c011ea9d>] [<c011ef15>] [<c011efd6>] [<c013e50e>]
[<c0135534>]
Nov 12 22:20:05 riclu_b kernel: [do_page_fault+0/1168] [error_code+56/64] [setup_frame+244/528]
[handle_signal+125/256] [do_signal+591/672] [filp_open+54/96]
Nov 12 22:20:05 riclu_b kernel: [<c0113450>] [<c0107028>] [<c01066e4>] [<c0106b1d>] [<c0106def>]
[<c0134186>]
Nov 12 22:20:05 riclu_b kernel: [getname+94/160] [sys_open+125/176] [do_page_fault+0/1168]
[signal_return+20/24]
Nov 12 22:20:05 riclu_b kernel: [<c013eade>] [<c01344bd>] [<c0113450>] [<c0106f6c>]
Nov 12 22:20:05 riclu_b kernel:
Nov 12 22:20:05 riclu_b kernel: Code: 39 4a 08 76 f4 89 d0 39 48 04 77 e3 85 c0 74 03 89 43 08 5b
Nov 12 22:20:05 riclu_b heartbeat: ERROR: Cannot locate resource script smb
Nov 12 22:20:05 riclu_b heartbeat: info: mach_down takeover complete.
Nov 12 22:22:38 riclu_b syslogd 1.4.1: restart.
Nov 12 22:22:38 riclu_b syslog: Starten von syslogd succeeded
Nov 12 22:22:38 riclu_b kernel: klogd 1.4.1, log source = /proc/kmsg started.
Nov 12 22:22:38 riclu_b kernel: Inspecting /boot/System.map-2.4.7-10
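
For anyone who wants to boil a log like the one above down to the essentials, here is a minimal Python sketch. It is only an illustration and not part of drbd or heartbeat; the names unwrap and summarize_oopses are made up for this example. It assumes every real syslog entry starts with a "Mon DD HH:MM:SS" timestamp and treats any other line as a mail-wrapped continuation of the previous entry.

```python
import re

# Assumption: a genuine syslog entry begins with a timestamp like "Nov 12 22:20:05 ".
TIMESTAMP_RE = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d ")
FAULT_RE = re.compile(r"Unable to handle kernel .*? at virtual address ([0-9a-f]{8})")
# Matches only the symbolic EIP line, e.g. "EIP: 0010:[find_vma+103/128]".
EIP_RE = re.compile(r"EIP:\s+[0-9a-f]{4}:\[(\w+)\+\d+/\d+\]")
PROC_RE = re.compile(r"Process (\S+) \(pid: (\d+)")


def unwrap(lines):
    """Re-join wrapped continuation lines onto the entry they belong to."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if TIMESTAMP_RE.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def summarize_oopses(lines):
    """Return one dict per oops: fault address, faulting symbol, process, pid."""
    oopses = []
    for entry in unwrap(lines):
        fault = FAULT_RE.search(entry)
        if fault:
            oopses.append({"address": fault.group(1), "symbol": None,
                           "process": None, "pid": None})
            continue
        if not oopses:
            continue  # nothing to attach details to yet
        current = oopses[-1]
        eip = EIP_RE.search(entry)
        if eip and current["symbol"] is None:
            current["symbol"] = eip.group(1)
        proc = PROC_RE.search(entry)
        if proc and current["process"] is None:
            current["process"] = proc.group(1)
            current["pid"] = int(proc.group(2))
    return oopses


if __name__ == "__main__":
    sample = [
        "Nov 12 22:20:05 riclu_b kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000004",
        "Nov 12 22:20:05 riclu_b kernel: EIP: 0010:[wait_for_tcp_memory+166/624]",
        "Nov 12 22:20:05 riclu_b kernel: Process mount (pid: 2029, stackpage=f2695000)",
        "Nov 12 22:20:05 riclu_b kernel: <1>Unable to handle kernel NULL pointer dereference at virtual address",
        "000004bf",
        "Nov 12 22:20:05 riclu_b kernel: EIP: 0010:[find_vma+103/128]",
        "Nov 12 22:20:05 riclu_b kernel: Process ResourceManager (pid: 2042, stackpage=f2695000)",
    ]
    for oops in summarize_oopses(sample):
        print(f"{oops['process']} (pid {oops['pid']}): "
              f"fault at {oops['address']} in {oops['symbol']}")
```

Run on the excerpt above, this reports two oopses at the same timestamp: one in wait_for_tcp_memory in the mount process (pid 2029) and one in find_vma in ResourceManager (pid 2042).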







Regards, ar

--
mailto:andreas@example.com
http://www.rittershofer.de
PGP-Public-Key http://www.rittershofer.de/ari.htm
Re: Crash
All the crashes are happening somewhere in the memory management subsystem...

What about another kernel? Maybe 2.4.13 or 2.4.14?
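
A quick way to check whether a node still runs a kernel older than the suggested one is sketched below in Python. It is only an illustration: the helper names are hypothetical, and it assumes the release string looks like the one in the log (2.4.7-10), that is, three dot-separated numbers followed by an optional vendor suffix.

```python
import re


def parse_kernel_release(release):
    """Return (major, minor, patch) from a release string such as '2.4.7-10'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def is_older_than(release, minimum):
    """True if 'release' is a lower kernel version than 'minimum'."""
    return parse_kernel_release(release) < parse_kernel_release(minimum)


if __name__ == "__main__":
    # The crashed node logged System.map-2.4.7-10.
    print(is_older_than("2.4.7-10", "2.4.13"))  # True
```

The versions are compared as integer tuples because a plain string comparison would rank "2.4.7" above "2.4.13". On a live node, platform.release() from the standard library returns the running kernel's release string.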

-philipp

Re: Crash
On 14 Nov 01, at 21:00, Philipp Reisner wrote:

> All the crashes are happening somewhere in the memory management

That was ONE crash of ONE machine, not many.

> subsystem...
>
> What about another kernel? Maybe 2.4.13 or 2.4.14?

I'm VERY interested in a kernel without these crashes, so that I can really use my cluster.

> > /proc/kmsg started.
> > Nov 12 22:22:38 riclu_b kernel: Inspecting /boot/System.map-2.4.7-10



Regards, ar

--
mailto:andreas@example.com
http://www.rittershofer.de
PGP-Public-Key http://www.rittershofer.de/ari.htm