Mailing List Archive

RE: Badness in softirq.c / no modules loaded / related to network interface
> On Fri, May 06, 2005 at 06:40:33PM +0200, sebastian@gutweiler.net wrote:
> > I have a problem with the unprivileged kernel. When I start my domain
> > using the unprivileged kernel I get the message:
> >
> > Badness in local_bh_enable at kernel/softirq.c:140
> > [<c011f3c0>] local_bh_enable+0x80/0x90
> > [<c0223089>] skb_checksum+0x129/0x2a0
> > [<c026736c>] udp_poll+0x9c/0x150
> > [<c021e409>] sock_poll+0x29/0x40
> > [<c016ba3e>] do_select+0x25e/0x2d0
> > [<c016b630>] __pollwait+0x0/0xd0
> > [<c016bd9f>] sys_select+0x2bf/0x4d0
> > [<c01093f4>] syscall_call+0x7/0xb

Are these messages coming out on the dom0 or domU console?

Can you repeat on 2.0-testing or unstable?

Does anyone see this that's using a NIC other than an e1000?

Thanks,
Ian

> We are still having this problem too. We only have it on
> some hosts, not others, so it's related to some activity in the
> domUs - we haven't figured out what though. The backtrace
> suggests UDP traffic.
>
> I'm pretty sure we didn't miscompile our modules - I
> disassembled them to check for cli/sti. The kernel was compiled using
>
> make-kpkg --arch xen --append_to_version -xen --revision=2.6.11 kernel_image
>
> We see this with the e1000 driver, using kernel 2.6.11.7 +
> debian patches + xen stable 2.0.5, on Dell PowerEdge 750 hardware.
>
> No functionality appears to be affected, other than 1000s of
> these messages in the log.
>
> --
> Nick Craig-Wood <nick@craig-wood.com> --
> http://www.craig-wood.com/nick
>
> _______________________________________________
> Xen-users mailing list
> Xen-users@lists.xensource.com
> http://lists.xensource.com/xen-users
>

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Re: Badness in softirq.c / no modules loaded / related to network interface
On Mon, May 09, 2005 at 08:09:38PM +0100, Ian Pratt wrote:
> > On Fri, May 06, 2005 at 06:40:33PM +0200,
> > sebastian@gutweiler.net wrote:
> > > I have a problem with the unprivileged kernel. When I start my domain
> > > using the unprivileged kernel I get the message:
> > >
> > > Badness in local_bh_enable at kernel/softirq.c:140 [<c011f3c0>]
>
> Are these messages coming out on the dom0 or domU console?

They are seen in the dom0 dmesg output.

> Can you repeat on 2.0-testing or unstable?

This is a production machine unfortunately. It's possible we may be
able to schedule some downtime, I'll ask our sysadmin.

> Does anyone see this that's using a NIC other than an e1000?

Only one of our machines does this and it has an e1000 chipset.

--
Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick

Re: Badness in softirq.c / no modules loaded / related to network interface
On Mon, May 09, 2005 at 10:16:47PM +0100, Nick Craig-Wood wrote:
> On Mon, May 09, 2005 at 08:09:38PM +0100, Ian Pratt wrote:
> > Can you repeat on 2.0-testing or unstable?
>
> This is a production machine unfortunately. It's possible we may be
> able to schedule some downtime, I'll ask our sysadmin.
>
> > Does anyone see this that's using a NIC other than an e1000?
>
> Only one of our machines does this and it has an e1000 chipset.

Since upgrading from 2.6.10 / xen 2.0.5 to 2.6.11 / xen 2.0-testing
25/05/05 I am also seeing the following in dom0 console output.

kernel: Badness in local_bh_enable at kernel/softirq.c:140
kernel: [local_bh_enable+130/144] local_bh_enable+0x82/0x90
kernel: [skb_checksum+317/704] skb_checksum+0x13d/0x2c0
kernel: [udp_poll+154/352] udp_poll+0x9a/0x160
kernel: [sock_poll+41/64] sock_poll+0x29/0x40
kernel: [do_pollfd+149/160] do_pollfd+0x95/0xa0
kernel: [do_poll+106/208] do_poll+0x6a/0xd0
kernel: [sys_poll+353/576] sys_poll+0x161/0x240
kernel: [sys_gettimeofday+60/144] sys_gettimeofday+0x3c/0x90
kernel: [__pollwait+0/208] __pollwait+0x0/0xd0
kernel: [syscall_call+7/11] syscall_call+0x7/0xb

e1000 compiled in, and this is a 1U rackmount with onboard nics so
changing nic is not really an option. This did not happen with
2.6.10/2.0.5.

Any other info that would be useful?
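
For comparing traces between hosts, here is a minimal sketch that pulls the symbol, offset and size out of each frame line. The sample lines are copied from the trace above; the function name and regular expression are illustrative only. Note that a stack dump like this can include stale entries, so the list is not necessarily a clean call chain.

```python
import re

# Sample lines copied from the trace in this message.
TRACE = """\
kernel: Badness in local_bh_enable at kernel/softirq.c:140
kernel: [local_bh_enable+130/144] local_bh_enable+0x82/0x90
kernel: [skb_checksum+317/704] skb_checksum+0x13d/0x2c0
kernel: [udp_poll+154/352] udp_poll+0x9a/0x160
kernel: [sock_poll+41/64] sock_poll+0x29/0x40
"""

# Matches the trailing "symbol+0xoffset/0xsize" part of a frame line.
FRAME = re.compile(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)\s*$")


def parse_frames(text):
    """Return (symbol, offset, size) for every frame line in text."""
    frames = []
    for line in text.splitlines():
        m = FRAME.search(line)
        if m:
            frames.append((m.group(1), int(m.group(2), 16), int(m.group(3), 16)))
    return frames


frames = parse_frames(TRACE)
for sym, off, size in frames:
    print(f"{sym:20s} offset {off:4d} of {size}")
```

The header line ("Badness in ...") has no offset field, so it is skipped rather than treated as a frame.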
Re: Badness in softirq.c / no modules loaded / related to network interface
>Since upgrading from 2.6.10 / xen 2.0.5 to 2.6.11 / xen 2.0-testing
>25/05/05 I am also seeing the following in dom0 console output.
>
>kernel: Badness in local_bh_enable at kernel/softirq.c:140
>kernel: [local_bh_enable+130/144] local_bh_enable+0x82/0x90
>kernel: [skb_checksum+317/704] skb_checksum+0x13d/0x2c0
>kernel: [udp_poll+154/352] udp_poll+0x9a/0x160
>kernel: [sock_poll+41/64] sock_poll+0x29/0x40
>kernel: [do_pollfd+149/160] do_pollfd+0x95/0xa0
>kernel: [do_poll+106/208] do_poll+0x6a/0xd0
>kernel: [sys_poll+353/576] sys_poll+0x161/0x240
>kernel: [sys_gettimeofday+60/144] sys_gettimeofday+0x3c/0x90
>kernel: [__pollwait+0/208] __pollwait+0x0/0xd0
>kernel: [syscall_call+7/11] syscall_call+0x7/0xb
>
>e1000 compiled in, and this is a 1U rackmount with onboard nics so
>changing nic is not really an option. This did not happen with
>2.6.10/2.0.5.
>
>Any other info that would be useful?

Hmm - are you using CONFIG_HIGHMEM? This seems to be the required
path and may have something to do with it (we've certainly seen
nothing like this on our own E1000 boxes).


cheers,

S.



Re: Badness in softirq.c / no modules loaded / related to network interface
On Mon, May 30, 2005 at 02:12:14AM +0100, Steven Hand wrote:
>
> >Since upgrading from 2.6.10 / xen 2.0.5 to 2.6.11 / xen 2.0-testing
> >25/05/05 I am also seeing the following in dom0 console output.
> >
> >kernel: Badness in local_bh_enable at kernel/softirq.c:140

[...]

> Hmm - are you using CONFIG_HIGHMEM? This seems to be the required
> path and may have something to do with it (we've certainly seen
> nothing like this on our own E1000 boxes).

CONFIG_HIGHMEM4G=y
CONFIG_HIGHMEM=y

The machine currently has 2G and is unlikely to ever go above 4G.
Do I need those settings?

Rest of the config file is attached.
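
For anyone wanting to check their own kernel, a rough sketch that lists the HIGHMEM-related options in a kernel config. The two "=y" lines are the ones quoted above; the "is not set" line and the function name are only there for illustration and are not taken from the attached config.

```python
SAMPLE_CONFIG = """\
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
CONFIG_HIGHMEM=y
"""
# The first line above is illustrative; only the two "=y" lines come from
# the message.

NOT_SET = " is not set"


def highmem_options(config_text):
    """Map each HIGHMEM-related option to its value ('n' if not set)."""
    options = {}
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET):
            # "# CONFIG_FOO is not set" -> name "CONFIG_FOO", value "n"
            name = line[2:-len(NOT_SET)]
            value = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
        else:
            continue
        if "HIGHMEM" in name:
            options[name] = value
    return options


print(highmem_options(SAMPLE_CONFIG))
```

To run it against a real file, pass the result of open(path).read() for wherever the kernel's .config lives on your system.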
Re: Badness in softirq.c / no modules loaded / related to network interface
>
>On Mon, May 30, 2005 at 02:12:14AM +0100, Steven Hand wrote:
>>
>> >Since upgrading from 2.6.10 / xen 2.0.5 to 2.6.11 / xen 2.0-testing
>> >25/05/05 I am also seeing the following in dom0 console output.
>> >
>> >kernel: Badness in local_bh_enable at kernel/softirq.c:140
>
>[...]
>
>> Hmm - are you using CONFIG_HIGHMEM? This seems to be the required
>> path and may have something to do with it (we've certainly seen
>> nothing like this on our own E1000 boxes).
>
>CONFIG_HIGHMEM4G=y
>CONFIG_HIGHMEM=y
>
>The machine currently has 2G and is unlikely to ever go above 4G.
>Do I need those settings?

If you want dom0 to be able to use the memory, yes. Not if you just
want to share it e.g. as 3 x 640MB chunks between 3 domains...

Error path seems to be:

udp_poll():1334 does spin_lock_irq() which disables interrupts (__cli())
then line 1336 does udp_checksum_complete() which calls skb_checksum()
which calls local_bh_enable() which barfs if irqs are disabled.

This only happens under CONFIG_HIGHMEM.
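
The ordering described above can be sketched as a toy model. This is not kernel code: the class and its fields are hypothetical, the local_bh_disable()/local_bh_enable() pairing inside skb_checksum() is an assumption about how the highmem mapping is done, and so is the idea that only fragmented packets take that path.

```python
class Cpu:
    """Toy stand-in for per-CPU state; a model, not real kernel code."""

    def __init__(self, highmem):
        self.highmem = highmem          # models CONFIG_HIGHMEM
        self.irqs_disabled = False
        self.bh_disable_count = 0
        self.warnings = []

    def spin_lock_irq(self):
        self.irqs_disabled = True       # models __cli()

    def spin_unlock_irq(self):
        self.irqs_disabled = False

    def local_bh_disable(self):
        self.bh_disable_count += 1

    def local_bh_enable(self):
        # Models the check that logs "Badness in local_bh_enable".
        if self.irqs_disabled:
            self.warnings.append("Badness in local_bh_enable")
        self.bh_disable_count -= 1

    def skb_checksum(self, fragmented):
        # Assumption: only fragment pages need the highmem mapping, and
        # the mapping is bracketed by bh disable/enable.
        if fragmented and self.highmem:
            self.local_bh_disable()
            # ... map the fragment and checksum it ...
            self.local_bh_enable()

    def udp_poll(self, fragmented):
        self.spin_lock_irq()
        self.skb_checksum(fragmented)   # via udp_checksum_complete()
        self.spin_unlock_irq()
        return list(self.warnings)


print(Cpu(highmem=True).udp_poll(fragmented=True))
print(Cpu(highmem=False).udp_poll(fragmented=True))
```

In the model the warning appears only when the highmem branch runs while interrupts are still off, which is the combination described above.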

However - at first look this seems like it should happen under vanilla
linux too ... so I may be missing something. Will take a more detailed
look....

In the meantime you could try to see if you can reproduce w/out
CONFIG_HIGHMEM...

cheers,

S.

p.s. what is the app that's doing UDP polling? may be able to
reproduce locally which would help...




Re: Badness in softirq.c / no modules loaded / related to network interface
On Mon, May 30, 2005 at 02:36:01AM +0100, Steven Hand wrote:
> Error path seems to be:
>
> udp_poll():1334 does spin_lock_irq() which disables interrupts (__cli())
> then line 1336 does udp_checksum_complete() which calls skb_checksum()
> which calls local_bh_enable() which barfs if irqs are disabled.
>
> This only happens under CONFIG_HIGHMEM.
>
> However - at first look this seems like it should happen under vanilla
> linux too ... so I may be missing something. Will take a more detailed
> look....
>
> In the meantime you could try to see if you can reproduce w/out
> CONFIG_HIGHMEM...

I will give that a go at some point, thank you.

> p.s. what is the app that's doing UDP polling? may be able to
> reproduce locally which would help...

Aside from the ping results I reported earlier, heavy access to
snmpd in dom0 produces it too.
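
For reference, a minimal sketch of what "an app doing UDP polling" looks like from user space: bind a UDP socket, then poll() it for readability. It sends one datagram to itself over loopback only so that it is self-contained; loopback traffic is not expected to reproduce the warning, and the function name and defaults are made up for the example.

```python
import select
import socket


def poll_udp_once(payload=b"ping", timeout_ms=1000):
    """Send one datagram to ourselves and poll() the socket for it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind(("127.0.0.1", 0))            # any free loopback port
        sock.sendto(payload, sock.getsockname())
        poller = select.poll()
        poller.register(sock.fileno(), select.POLLIN)
        # On the kernels in this thread the trace shows this going
        # through sys_poll -> sock_poll -> udp_poll.
        events = poller.poll(timeout_ms)
        if not events:
            return None                        # nothing arrived in time
        data, _addr = sock.recvfrom(2048)
        return data
    finally:
        sock.close()


print(poll_udp_once())
```

A real test would need UDP traffic arriving from the network while something like this polls in a loop.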
Re: Badness in softirq.c / no modules loaded / related to network interface
On Mon, May 30, 2005 at 04:35:28AM +0000, Andy Smith wrote:
> On Mon, May 30, 2005 at 02:36:01AM +0100, Steven Hand wrote:
> > In the meantime you could try to see if you can reproduce w/out
> > CONFIG_HIGHMEM...
>
> I will give that a go at some point, thank you.

After removing the CONFIG_HIGHMEM option I can no longer reproduce
this.

Hopefully I won't need to give a guest more than 600 or so MB of RAM
until this is fixed. :)

Thanks for your help.
Re: Badness in softirq.c / no modules loaded / related to network interface
> On Mon, May 30, 2005 at 04:35:28AM +0000, Andy Smith wrote:
> > On Mon, May 30, 2005 at 02:36:01AM +0100, Steven Hand wrote:
> > > In the meantime you could try to see if you can reproduce w/out
> > > CONFIG_HIGHMEM...
> >
> > I will give that a go at some point, thank you.
>
> After removing the CONFIG_HIGHMEM option I can no longer reproduce
> this.

Excellent.

> Hopefully I won't need to give a guest more than 600 or so MB of RAM
> until this is fixed. :)

Just an update: this turns out to be a bug in linux rather than one
in Xen/XenLinux, and so we've reported it upstream. Hopefully it'll
get fixed in a forthcoming version -- the only risk is that it's
hard to trigger on non-Xen boxes (requires fragmented skb's being
received) so may not be top priority for the mainstream guys.

We'll roll in any fix as soon as it's available.

cheers,

S.

Re: Badness in softirq.c / no modules loaded / related to network interface
>> Hopefully I won't need to give a guest more than 600 or so MB of RAM
>> until this is fixed. :)
>
>Just an update: this turns out to be a bug in linux rather than one
>in Xen/XenLinux, and so we've reported it upstream. Hopefully it'll
>get fixed in a forthcoming version -- the only risk is that it's
>hard to trigger on non-Xen boxes (requires fragmented skb's being
>received) so may not be top priority for the mainstream guys.
>
>We'll roll in any fix as soon as it's available.

I've checked in an upstream fix and pushed to bkbits - can you
please check that this solves your problem even with CONFIG_HIGHMEM?

cheers,

S.
