Mailing List Archive

2.6.17-rc1-mm2: badness in 3w_xxxx driver
The patch x86-kmap_atomic-debugging.patch exposed a badness in the
3w_xxxx driver. I'm getting a lot of:

Apr 9 13:00:04 nickolas kernel: kmap_atomic: local irqs are enabled while using KM_IRQn
Apr 9 13:00:04 nickolas kernel: <c0104103> show_trace+0x13/0x20 <c010412e> dump_stack+0x1e/0x20
Apr 9 13:00:04 nickolas kernel: <c01159c9> kmap_atomic+0x79/0xe0 <c028b885> tw_transfer_internal+0x85/0xa0
Apr 9 13:00:04 nickolas kernel: <c028ca7e> tw_interrupt+0x3fe/0x820 <c0143b9e> handle_IRQ_event+0x3e/0x80
Apr 9 13:00:04 nickolas kernel: <c0143c70> __do_IRQ+0x90/0x100 <c01057a6> do_IRQ+0x26/0x40
Apr 9 13:00:04 nickolas kernel: <c010396e> common_interrupt+0x1a/0x20 <c0101cdd> cpu_idle+0x4d/0xb0
Apr 9 13:00:04 nickolas kernel: <c010f2cc> start_secondary+0x24c/0x4b0 <00000000> 0x0
Apr 9 13:00:04 nickolas kernel: <c214ffb4> 0xc214ffb4

I'm running a 32-bit kernel on an AMD64 X2 with HIGHMEM enabled.
I think this is an old bug, since 3w-xxxx.c has not been changed for
a long time (at least since 2.6.16-rc1-mm4).

Please let me know if you want me to try some patches.

--
With best wishes,
Nick Orlov.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
Nick Orlov <bugfixer@list.ru> wrote:
>
> The patch x86-kmap_atomic-debugging.patch exposed a badness in the
> 3w_xxxx driver.

Sweet, thanks.

> I'm getting a lot of:
>
> Apr 9 13:00:04 nickolas kernel: kmap_atomic: local irqs are enabled while using KM_IRQn
> Apr 9 13:00:04 nickolas kernel: <c0104103> show_trace+0x13/0x20 <c010412e> dump_stack+0x1e/0x20
> Apr 9 13:00:04 nickolas kernel: <c01159c9> kmap_atomic+0x79/0xe0 <c028b885> tw_transfer_internal+0x85/0xa0
> Apr 9 13:00:04 nickolas kernel: <c028ca7e> tw_interrupt+0x3fe/0x820 <c0143b9e> handle_IRQ_event+0x3e/0x80
> Apr 9 13:00:04 nickolas kernel: <c0143c70> __do_IRQ+0x90/0x100 <c01057a6> do_IRQ+0x26/0x40
> Apr 9 13:00:04 nickolas kernel: <c010396e> common_interrupt+0x1a/0x20 <c0101cdd> cpu_idle+0x4d/0xb0
> Apr 9 13:00:04 nickolas kernel: <c010f2cc> start_secondary+0x24c/0x4b0 <00000000> 0x0
> Apr 9 13:00:04 nickolas kernel: <c214ffb4> 0xc214ffb4
>
> I'm running a 32-bit kernel on an AMD64 X2 with HIGHMEM enabled.
> I think this is an old bug, since 3w-xxxx.c has not been changed for
> a long time (at least since 2.6.16-rc1-mm4).
>
> Please let me know if you want me to try some patches.
>


From: Andrew Morton <akpm@osdl.org>

We must disable local IRQs while holding KM_IRQ0 or KM_IRQ1. Otherwise, an
IRQ handler could use those kmap slots while this code is using them,
resulting in memory corruption.

Thanks to Nick Orlov <bugfixer@list.ru> for reporting.

Cc: <linuxraid@amcc.com>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

drivers/scsi/3w-xxxx.c | 3 +++
1 files changed, 3 insertions(+)

diff -puN drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix drivers/scsi/3w-xxxx.c
--- devel/drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix 2006-04-09 11:28:08.000000000 -0700
+++ devel-akpm/drivers/scsi/3w-xxxx.c 2006-04-09 11:29:21.000000000 -0700
@@ -1508,10 +1508,12 @@ static void tw_transfer_internal(TW_Devi
 	struct scsi_cmnd *cmd = tw_dev->srb[request_id];
 	void *buf;
 	unsigned int transfer_len;
+	unsigned long flags = 0;

 	if (cmd->use_sg) {
 		struct scatterlist *sg =
 			(struct scatterlist *)cmd->request_buffer;
+		local_irq_save(flags);
 		buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;
 		transfer_len = min(sg->length, len);
 	} else {
@@ -1526,6 +1528,7 @@ static void tw_transfer_internal(TW_Devi

 		sg = (struct scatterlist *)cmd->request_buffer;
 		kunmap_atomic(buf - sg->offset, KM_IRQ0);
+		local_irq_restore(flags);
 	}
 }

_
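
To illustrate the changelog above, here is a minimal userspace sketch in C of
why the slot has to be protected. It is not kernel code: the single `slot`
pointer merely stands in for a per-CPU KM_IRQ0 mapping, and every name in it
(`copy_via_slot`, `irq_handler`, `maybe_deliver_irq`, `PAGE_BYTES`) is
hypothetical. The assumption is that an interrupt arrives between the map and
the memcpy and that its handler reuses the same slot for its own page.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_BYTES 16

/* One fixed mapping slot, standing in for a per-CPU KM_IRQ0 slot. */
static char *slot;
static bool irqs_enabled = true;   /* models the local IRQ flag */
static bool irq_pending = false;   /* an interrupt waiting to be delivered */
static char irq_page[PAGE_BYTES];  /* page the simulated handler maps */

/*
 * Hypothetical interrupt handler: it reuses the same slot for its own page.
 * A real handler would also use and unmap the page; only the remapping
 * matters for this illustration.
 */
static void irq_handler(void)
{
    slot = irq_page;
}

/* Deliver a pending interrupt, but only while local IRQs are enabled. */
static void maybe_deliver_irq(void)
{
    if (irqs_enabled && irq_pending) {
        irq_pending = false;
        irq_handler();
    }
}

/*
 * Copy data into a page through the shared slot while an interrupt arrives
 * between the map and the memcpy. Returns true if the data landed in the
 * intended page.
 */
static bool copy_via_slot(char *page, const char *data, size_t len,
                          bool disable_irqs)
{
    bool saved = irqs_enabled;

    assert(len <= PAGE_BYTES);

    if (disable_irqs)
        irqs_enabled = false;       /* plays the role of local_irq_save() */

    slot = page;                    /* plays the role of kmap_atomic() */
    irq_pending = true;             /* an interrupt fires right here */
    maybe_deliver_irq();            /* runs now only if IRQs are enabled */
    memcpy(slot, data, len);        /* the write goes wherever the slot points */
    slot = NULL;                    /* plays the role of kunmap_atomic() */

    if (disable_irqs)
        irqs_enabled = saved;       /* plays the role of local_irq_restore() */
    maybe_deliver_irq();            /* a deferred interrupt runs here, harmlessly */

    return memcmp(page, data, len) == 0;
}
```

Calling copy_via_slot() with disable_irqs set to false sends the data into
irq_page instead of the caller's page, which is the kind of corruption the
changelog describes. With disable_irqs set to true, the interrupt is deferred
until after the unmap and the copy lands where it was intended.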

Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
Andrew Morton wrote:
> Nick Orlov <bugfixer@list.ru> wrote:
>> The patch x86-kmap_atomic-debugging.patch exposed a badness in the
>> 3w_xxxx driver.
>
> Sweet, thanks.
>
>> I'm getting a lot of:
>>
>> Apr 9 13:00:04 nickolas kernel: kmap_atomic: local irqs are enabled while using KM_IRQn
>> Apr 9 13:00:04 nickolas kernel: <c0104103> show_trace+0x13/0x20 <c010412e> dump_stack+0x1e/0x20
>> Apr 9 13:00:04 nickolas kernel: <c01159c9> kmap_atomic+0x79/0xe0 <c028b885> tw_transfer_internal+0x85/0xa0
>> Apr 9 13:00:04 nickolas kernel: <c028ca7e> tw_interrupt+0x3fe/0x820 <c0143b9e> handle_IRQ_event+0x3e/0x80
>> Apr 9 13:00:04 nickolas kernel: <c0143c70> __do_IRQ+0x90/0x100 <c01057a6> do_IRQ+0x26/0x40
>> Apr 9 13:00:04 nickolas kernel: <c010396e> common_interrupt+0x1a/0x20 <c0101cdd> cpu_idle+0x4d/0xb0
>> Apr 9 13:00:04 nickolas kernel: <c010f2cc> start_secondary+0x24c/0x4b0 <00000000> 0x0
>> Apr 9 13:00:04 nickolas kernel: <c214ffb4> 0xc214ffb4
>>
>> I'm running a 32-bit kernel on an AMD64 X2 with HIGHMEM enabled.
>> I think this is an old bug, since 3w-xxxx.c has not been changed for
>> a long time (at least since 2.6.16-rc1-mm4).
>>
>> Please let me know if you want me to try some patches.
>>
>
>
> From: Andrew Morton <akpm@osdl.org>
>
> We must disable local IRQs while holding KM_IRQ0 or KM_IRQ1. Otherwise, an
> IRQ handler could use those kmap slots while this code is using them,
> resulting in memory corruption.
>
> Thanks to Nick Orlov <bugfixer@list.ru> for reporting.
>
> Cc: <linuxraid@amcc.com>
> Cc: James Bottomley <James.Bottomley@SteelEye.com>
> Signed-off-by: Andrew Morton <akpm@osdl.org>
> ---
>
> drivers/scsi/3w-xxxx.c | 3 +++
> 1 files changed, 3 insertions(+)
>
> diff -puN drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix drivers/scsi/3w-xxxx.c
> --- devel/drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix 2006-04-09 11:28:08.000000000 -0700
> +++ devel-akpm/drivers/scsi/3w-xxxx.c 2006-04-09 11:29:21.000000000 -0700
> @@ -1508,10 +1508,12 @@ static void tw_transfer_internal(TW_Devi
> struct scsi_cmnd *cmd = tw_dev->srb[request_id];
> void *buf;
> unsigned int transfer_len;
> + unsigned long flags = 0;
>
> if (cmd->use_sg) {
> struct scatterlist *sg =
> (struct scatterlist *)cmd->request_buffer;
> + local_irq_save(flags);
> buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;
> transfer_len = min(sg->length, len);
> } else {
> @@ -1526,6 +1528,7 @@ static void tw_transfer_internal(TW_Devi
>
> sg = (struct scatterlist *)cmd->request_buffer;
> kunmap_atomic(buf - sg->offset, KM_IRQ0);
> + local_irq_restore(flags);

ACK.

Though please make sure the active maintainer is CC'd on this... There
is even a helpful MAINTAINERS entry for this driver.

Jeff


Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
On Sun, Apr 09, 2006 at 11:32:40AM -0700, Andrew Morton wrote:
> Nick Orlov <bugfixer@list.ru> wrote:
> >
> > The patch x86-kmap_atomic-debugging.patch exposed a badness in the
> > 3w_xxxx driver.
>
> Sweet, thanks.
>
[[ skipped ]]
>
> From: Andrew Morton <akpm@osdl.org>
>
> We must disable local IRQs while holding KM_IRQ0 or KM_IRQ1. Otherwise, an
> IRQ handler could use those kmap slots while this code is using them,
> resulting in memory corruption.
>
> Thanks to Nick Orlov <bugfixer@list.ru> for reporting.
>
> Cc: <linuxraid@amcc.com>
> Cc: James Bottomley <James.Bottomley@SteelEye.com>
> Signed-off-by: Andrew Morton <akpm@osdl.org>
> ---
>
> drivers/scsi/3w-xxxx.c | 3 +++
> 1 files changed, 3 insertions(+)
>
> diff -puN drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix drivers/scsi/3w-xxxx.c
> --- devel/drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix 2006-04-09 11:28:08.000000000 -0700
> +++ devel-akpm/drivers/scsi/3w-xxxx.c 2006-04-09 11:29:21.000000000 -0700
> @@ -1508,10 +1508,12 @@ static void tw_transfer_internal(TW_Devi
> struct scsi_cmnd *cmd = tw_dev->srb[request_id];
> void *buf;
> unsigned int transfer_len;
> + unsigned long flags = 0;
>
> if (cmd->use_sg) {
> struct scatterlist *sg =
> (struct scatterlist *)cmd->request_buffer;
> + local_irq_save(flags);
> buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;
> transfer_len = min(sg->length, len);
> } else {
> @@ -1526,6 +1528,7 @@ static void tw_transfer_internal(TW_Devi
>
> sg = (struct scatterlist *)cmd->request_buffer;
> kunmap_atomic(buf - sg->offset, KM_IRQ0);
> + local_irq_restore(flags);
> }
> }
>
> _

Confirmed, this patch solves the "badness" problem for me.
I'm still experiencing weird hangs, though (the box just hangs, no
messages on console/syslog, nothing). I'll try to nail it down.

2.6.16-mm2 works like a charm with the same config.
Do you know which patches I should try to revert first?

--
With best wishes,
Nick Orlov.

Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
> > Cc: <linuxraid@amcc.com>
> > ---
> >
> ACK.
>
> Though please make sure the active maintainer is CC'd on this... There
> is even a helpful MAINTAINERS entry for this driver.

I'd say it is ;-)


Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
Nick Orlov <bugfixer@list.ru> wrote:
>
> Confirmed, this patch solves the "badness" problem for me.

yup, thanks.

> I'm still experiencing weird hangs, though (the box just hangs, no
> messages on console/syslog, nothing). I'll try to nail it down.
>
> 2.6.16-mm2 works like a charm with the same config.
> Do you know which patches I should try to revert first?

Gee, 2.6.16-mm2 was a long time ago.

Tried sysrq?

echo 1 > /proc/sys/kernel/sysrq
<wait for hang>
ALT-SYSRQ-P or ALT-SYSRQ-T

Is the NMI watchdog enabled? Boot with `nmi_watchdog=1' and make sure that
the NMI counts are incrementing in /proc/interrupts.

Failing all that, testing 2.6.17-rc1 would be interesting.
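
The NMI check above can also be scripted. Below is a minimal sketch in C that
sums the per-CPU counters on the "NMI:" line of /proc/interrupts. Sample it
twice a few seconds apart: if the total grows, the watchdog is ticking. The
function names `sum_nmi_counts` and `read_nmi_total` are hypothetical. The
parser assumes the usual layout of the label followed by one decimal counter
per CPU, and the 4096-byte line buffer is an assumption that may be too small
on machines with very many CPUs.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sum the per-CPU counters on an "NMI:" line of /proc/interrupts.
 * Returns -1 if the line is not an NMI line.
 */
static long long sum_nmi_counts(const char *line)
{
    const char *p;
    long long total = 0;

    assert(line != NULL);

    while (*line == ' ' || *line == '\t')
        line++;
    if (strncmp(line, "NMI:", 4) != 0)
        return -1;

    p = line + 4;
    for (;;) {
        char *end;
        long long n = strtoll(p, &end, 10);

        if (end == p)   /* no further number: trailing text or end of line */
            break;
        total += n;
        p = end;
    }
    return total;
}

/*
 * Read the given file (normally /proc/interrupts) and return the NMI total,
 * or -1 if the file cannot be opened or has no NMI line.
 */
static long long read_nmi_total(const char *path)
{
    char line[4096];
    long long total = -1;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        long long n = sum_nmi_counts(line);

        if (n >= 0) {
            total = n;
            break;
        }
    }
    fclose(f);
    return total;
}
```

A caller would invoke read_nmi_total("/proc/interrupts"), sleep for a few
seconds, call it again, and compare the two totals.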
Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
On Sun, Apr 09, 2006 at 12:43:01PM -0700, Andrew Morton wrote:
> Nick Orlov <bugfixer@list.ru> wrote:
> >
> > Confirmed, this patch solves the "badness" problem for me.
>
> yup, thanks.
>
> > I'm still experiencing weird hangs, though (the box just hangs, no
> > messages on console/syslog, nothing). I'll try to nail it down.
> >
> > 2.6.16-mm2 works like a charm with the same config.
> > Do you know which patches I should try to revert first?
>
> Gee, 2.6.16-mm2 was a long time ago.
>
> Tried sysrq?
>
> echo 1 > /proc/sys/kernel/sysrq
> <wait for hang>
> ALT-SYSRQ-P or ALT-SYSRQ-T
>
> Is the NMI watchdog enabled? Boot with `nmi_watchdog=1' and make sure that
> the NMI counts are incrementing in /proc/interrupts.
>
> Failing all that, testing 2.6.17-rc1 would be interesting.

2.6.17-rc1 fails in the same fashion: it hangs "randomly".
The good news is that I've found the pattern and a solution:
it always happens when two applications open /dev/dsp simultaneously.

Applying the following patches published by Takashi Iwai solves the
problem:
http://marc.theaimsgroup.com/?l=linux-kernel&m=114423578508165&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=114424198614019&w=2

I'm not sure whether the first one alone is enough.

I would recommend putting them into the hot-fixes, since this is
likely to frustrate many people.
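
For anyone who wants to exercise the "two applications open the device at
once" pattern described above, here is a rough sketch in C. It is an
assumption about how one might provoke the race, not the test that was
actually run for this report, and the function name `open_simultaneously` is
hypothetical. Pointing it at /dev/dsp on an affected kernel may hang the
machine, so try it on a harmless path such as /dev/null first.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork nprocs children that all block on a pipe, then release them together
 * so that they call open(path) at nearly the same time. Returns the number of
 * children whose open() succeeded, or -1 if the setup failed.
 */
static int open_simultaneously(const char *path, int nprocs)
{
    int gate[2];
    int started = 0;
    int ok = 0;
    int failed = 0;

    assert(path != NULL && nprocs > 0);

    if (pipe(gate) != 0)
        return -1;

    for (int i = 0; i < nprocs; i++) {
        pid_t pid = fork();

        if (pid < 0) {
            failed = 1;
            break;
        }
        if (pid == 0) {
            char c;
            int fd;

            close(gate[1]);
            /* Blocks until the parent closes the write end (read returns 0). */
            if (read(gate[0], &c, 1) < 0)
                _exit(2);
            fd = open(path, O_WRONLY);
            if (fd >= 0)
                close(fd);
            _exit(fd >= 0 ? 0 : 1);
        }
        started++;
    }

    /* Closing the write end delivers EOF to every waiting child at once. */
    close(gate[0]);
    close(gate[1]);

    for (int i = 0; i < started; i++) {
        int status;

        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return failed ? -1 : ok;
}
```

Calling open_simultaneously("/dev/dsp", 2) corresponds to the two-application
case. The children close the device immediately after opening it, so the
sketch exercises only the racing open() calls, not sustained playback.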

--
With best wishes,
Nick Orlov.
