Mailing List Archive

2.6.16-rc5 pppd oops on disconnects
I searched the mailing list and saw a similar report but that was back
in January and it looked to be resolved. I get the following oops in
pppd when I'm connected with my tethered cell phone and accidentally
unplug the usb cable. Happens every time. I'm running Linus' git
0d514f040ac6629311974889d5b96bcf21c6461a (I think).

PPP Deflate Compression module registered
usb 1-2: USB disconnect, address 4
Unable to handle kernel paging request at virtual address 6b6b6bfb
printing eip:
c027a4f6
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: ppp_deflate zlib_deflate bsd_comp ppp_async
crc_ccitt ppp_generic slhc i915 binfmt_misc parport_pc lp parport
video thermal processor fan button battery ac af_packet nls_iso8859_1
nls_cp437 vfat fat dm_mod fuse evdev usb_storage ide_cd sr_mod
scsi_mod cdrom i810 drm cdc_acm pcmcia crc32 eth1394 ipw2100 tg3
ieee80211 ieee80211_crypt 8250_pci 8250 serial_core yenta_socket
rsrc_nonstatic pcmcia_core ohci1394 ieee1394 snd_intel8x0 ehci_hcd
snd_intel8x0m snd_ac97_codec uhci_hcd snd_ac97_bus usbcore intel_agp
agpgart unix
CPU: 0
EIP: 0060:[<c027a4f6>] Not tainted VLI
EFLAGS: 00210046 (2.6.16-rc5 #16)
EIP is at __mutex_lock_slowpath+0x70/0x286
eax: cf549e20 ebx: cf548000 ecx: 00000000 edx: 00000054
esi: 6b6b6bdb edi: cea6f030 ebp: c017592e esp: cf549e20
ds: 007b es: 007b ss: 0068
Process pppd (pid: 4076, threadinfo=cf548000 task=cea6f030)
Stack: <0>cf549e20 cf549e20 11111111 11111111 cf549e20 cce32c60
cf609cc4 cf609ccc
cf609c60 c017592e ccabf7a0 cce5466c 6b6b6b6b cce32c60 cf609cc4 cf609ccc
cf609c60 c01e756e cce5466c ccabf7a0 cd619aac c029fd21 00000000 ccabf7a0
Call Trace:
[<c017592e>] sysfs_hash_and_remove+0x34/0x10a
[<c01e756e>] class_device_del+0xa0/0x11c
[<c01e75f5>] class_device_unregister+0xb/0x16
[<d01f81f3>] acm_tty_unregister+0x1d/0x63 [cdc_acm]
[<d01f8baa>] acm_tty_close+0x9d/0xac [cdc_acm]
[<c01d6c1c>] release_dev+0x1a9/0x5b7
[<c01d7e37>] opost+0x1bb/0x1d3
[<c0146d9d>] __fput+0x74/0x132
[<c01d72e8>] tty_release+0x9/0xc
[<c0146dc8>] __fput+0x9f/0x132
[<c0144c06>] filp_close+0x4e/0x57
[<c0145584>] sys_close+0x56/0x63
[<c01029c9>] syscall_call+0x7/0xb
Code: dc dd 27 c0 85 d2 0f 44 c2 68 ef 59 28 c0 c7 05 98 48 2c c0 00
00 00 00 a3 8c 33 2c c0 e8 be c1 e9 ff e8 f4 95 e8 ff 83 c4 10 fa <39>
76 20 74 49 83 3d 98 48 2c c0 00 74 40 8b 15 8c 33 2c c0 b8
<3>Debug: sleeping function called from invalid context at
include/linux/rwsem.h:43
in_atomic():0, irqs_disabled():1
[<c0116c69>] exit_mm+0x28/0xe8
[<c011773a>] do_exit+0x17f/0x619
[<c0103d85>] do_simd_coprocessor_error+0x0/0x14f
[<c0111cf7>] do_page_fault+0x389/0x4c0
[<c011196e>] do_page_fault+0x0/0x4c0
[<c017592e>] sysfs_hash_and_remove+0x34/0x10a
[<c010345f>] error_code+0x4f/0x54
[<c017592e>] sysfs_hash_and_remove+0x34/0x10a
[<c027a4f6>] __mutex_lock_slowpath+0x70/0x286
[<c017592e>] sysfs_hash_and_remove+0x34/0x10a
[<c01e756e>] class_device_del+0xa0/0x11c
[<c01e75f5>] class_device_unregister+0xb/0x16
[<d01f81f3>] acm_tty_unregister+0x1d/0x63 [cdc_acm]
[<d01f8baa>] acm_tty_close+0x9d/0xac [cdc_acm]
[<c01d6c1c>] release_dev+0x1a9/0x5b7
[<c01d7e37>] opost+0x1bb/0x1d3
[<c0146d9d>] __fput+0x74/0x132
[<c01d72e8>] tty_release+0x9/0xc
[<c0146dc8>] __fput+0x9f/0x132
[<c0144c06>] filp_close+0x4e/0x57
[<c0145584>] sys_close+0x56/0x63
[<c01029c9>] syscall_call+0x7/0xb

-Bob
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.16-rc5 pppd oops on disconnects
On Fri, 2006-03-10 at 09:25 -0500, Bob Copeland wrote:
> Unable to handle kernel paging request at virtual address 6b6b6bfb
> printing eip:
> c027a4f6
> *pde = 00000000
> Oops: 0000 [#1]
> ...
> CPU: 0
> EIP: 0060:[<c027a4f6>] Not tainted VLI
> EFLAGS: 00210046 (2.6.16-rc5 #16)
> EIP is at __mutex_lock_slowpath+0x70/0x286
> eax: cf549e20 ebx: cf548000 ecx: 00000000 edx: 00000054
> esi: 6b6b6bdb edi: cea6f030 ebp: c017592e esp: cf549e20
> ds: 007b es: 007b ss: 0068
> Process pppd (pid: 4076, threadinfo=cf548000 task=cea6f030)
> Stack: <0>cf549e20 cf549e20 11111111 11111111 cf549e20 cce32c60
> cf609cc4 cf609ccc
> cf609c60 c017592e ccabf7a0 cce5466c 6b6b6b6b cce32c60 cf609cc4 cf609ccc
> cf609c60 c01e756e cce5466c ccabf7a0 cd619aac c029fd21 00000000 ccabf7a0
> Call Trace:
> [<c017592e>] sysfs_hash_and_remove+0x34/0x10a
> [<c01e756e>] class_device_del+0xa0/0x11c
> [<c01e75f5>] class_device_unregister+0xb/0x16
> [<d01f81f3>] acm_tty_unregister+0x1d/0x63 [cdc_acm]

This looks more like
http://bugzilla.kernel.org/show_bug.cgi?id=5876

The offset from 6b6b6b6b looks like slab poisoning on
the dentry in sysfs_hash_and_remove.

--
Paul Fulghum
Microgate Systems, Ltd

Re: 2.6.16-rc5 pppd oops on disconnects
> > Call Trace:
> > [<c017592e>] sysfs_hash_and_remove+0x34/0x10a
> > [<c01e756e>] class_device_del+0xa0/0x11c
> > [<c01e75f5>] class_device_unregister+0xb/0x16
> > [<d01f81f3>] acm_tty_unregister+0x1d/0x63 [cdc_acm]
>
> This looks more like
> http://bugzilla.kernel.org/show_bug.cgi?id=5876

Hmm... it looks different from that bug - in that case the root cause
was sysfs_make_dirent failing, presumably when the sysfs node for the
device was being set up, by unplugging and re-plugging the device a
lot. Here it's oopsing when the node is being removed, after it's
been in use a while and unplugged only once. But yes ppp may not have
anything to do with it. I'll try it on an older kernel to see if I
can reproduce there...

-Bob

Re: 2.6.16-rc5 pppd oops on disconnects
On Fri, 2006-03-10 at 13:48 -0500, Bob Copeland wrote:
> > > Call Trace:
> > > [<c017592e>] sysfs_hash_and_remove+0x34/0x10a
> > > [<c01e756e>] class_device_del+0xa0/0x11c
> > > [<c01e75f5>] class_device_unregister+0xb/0x16
> > > [<d01f81f3>] acm_tty_unregister+0x1d/0x63 [cdc_acm]
> >
> > This looks more like
> > http://bugzilla.kernel.org/show_bug.cgi?id=5876
>
> Hmm... it looks different from that bug - in that case the root cause
> was sysfs_make_dirent failing, presumably when the sysfs node for the
> device was being set up, by unplugging and re-plugging the device a
> lot. Here it's oopsing when the node is being removed, after it's
> been in use a while and unplugged only once. But yes ppp may not have
> anything to do with it. I'll try it on an older kernel to see if I
> can reproduce there...

The i_sem to i_mutex change started in the 2.6.16 series.
Running against 2.6.15 would be interesting. Being able
to repeat every time is a plus. I'm not that familiar
with the sysfs stuff, but the slab poisoning is pretty
damning. The dentry was released and then accessed.

I looked at cdc_acm for disconnect and close and
did not see any problems (such as trying to call
tty_unregister_device twice for a device).

--
Paul Fulghum
Microgate Systems, Ltd

Re: 2.6.16-rc5 pppd oops on disconnects
On Fri, Mar 10, 2006 at 01:25:09PM -0600, Paul Fulghum wrote:
> The i_sem to i_mutex change started in the 2.6.16 series.
> Running against 2.6.15 would be interesting. Being able
> to repeat every time is a plus. I'm not that familiar
> with the sysfs stuff, but the slab poisoning is pretty
> damning. The dentry was released and then accessed.
>
> I looked at cdc_acm for disconnect and close and
> did not see any problems (such as trying to call
> tty_unregister_device twice for a device).

Well, back to at least 2.6.15-rc7 I get a similar oops so this looks old
and unrelated to the mutex changes. I don't believe it triggers without
CONFIG_DEBUG_SLAB. Also won't oops without something (ppp) using the
device at the time of disconnect.

Unable to handle kernel paging request at virtual address 6b6b6bdb
printing eip:
c0176da6
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: ppp_deflate zlib_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc cdc_acm i915 binfmt_misc parport_pc lp parport video thermal processor fan button battery ac af_packet nls_iso8859_1 nls_cp437 vfat fat dm_mod fuse usb_storage ide_cd sr_mod scsi_mod cdrom i810 drm eth1394 pcmcia crc32 ipw2100 tg3 ieee80211 ieee80211_crypt joydev 8250_pci 8250 serial_core evdev snd_intel8x0 snd_intel8x0m yenta_socket rsrc_nonstatic pcmcia_core ohci1394 ieee1394 snd_ac97_codec snd_ac97_bus ehci_hcd uhci_hcd intel_agp agpgart usbcore unix
CPU: 0
EIP: 0060:[<c0176da6>] Not tainted VLI
EFLAGS: 00210202 (2.6.15-rc7)
EIP is at sysfs_hash_and_remove+0x2a/0x100
eax: 00200246 ebx: 6b6b6b6b ecx: 00000063 edx: cdea3c1c
esi: cefe68d4 edi: cf608c60 ebp: cf608ccc esp: ce1fbe50
ds: 007b es: 007b ss: 0068
Process pppd (pid: 4128, threadinfo=ce1fa000 task=cef61540)
Stack: c0288916 00000063 6b6b6b6b c64182d0 cefe68d4 cf608c60 cf608ccc c01e9f5f
ced938a0 cefe68d4 cedb611c c02a0f11 c64182d0 00000000 c64182d0 ce970e40
00000000 00000000 c01e9fbf c64182d0 ccf14d68 d03e246e c64182d0 0a600000
Call Trace:
[<c01e9f5f>] class_device_del+0x94/0xe9
[<c01e9fbf>] class_device_unregister+0xb/0x16
[<d03e246e>] acm_tty_unregister+0x16/0x54 [cdc_acm]
[<d03e2533>] acm_tty_close+0x87/0x96 [cdc_acm]
[<c01d7f15>] release_dev+0x1b1/0x5dc
[<c011d8a6>] __group_send_sig_info+0x5d/0x69
[<c011eb1a>] sys_kill+0x4e/0x55
[<c01d8774>] tty_release+0x9/0xd
[<c0146efb>] __fput+0x9f/0x12c
[<c0145b6b>] filp_close+0x4e/0x57
[<c0145bca>] sys_close+0x56/0x63
[<c0102ad5>] syscall_call+0x7/0xb
Code: c3 55 57 56 53 51 8b 44 24 18 8b 40 50 89 04 24 8b 44 24 18 8b 58 08 85 db 0f 84 dc 00 00 00 6a 63 68 16 89 28 c0 e8 32 d0 f9 ff <ff> 4b 70 0f 88 cd 00 00 00 8b 44 24 08 8b 68 0c 58 5a 83 ed 04

--
Bob Copeland %% www.bobcopeland.com

Re: 2.6.16-rc5 pppd oops on disconnects
On Sat, 2006-03-11 at 10:09 -0500, Bob Copeland wrote:
> Well, back to at least 2.6.15-rc7 I get a similar oops so this looks old
> and unrelated to the mutex changes. I don't believe it triggers without
> CONFIG_DEBUG_SLAB. Also won't oops without something (ppp) using the
> device at the time of disconnect.

OK, try this patch with CONFIG_DEBUG_SLAB on and post the
debug output with the oops.

I do see one problem with the cdc-acm driver (not setting
acm->tty to NULL on the last close, where tty is released).

Thanks,
Paul

--
Paul Fulghum
Microgate Systems, Ltd

--- linux-2.6.16-rc5/drivers/usb/class/cdc-acm.c 2006-02-27 09:24:29.000000000 -0600
+++ b/drivers/usb/class/cdc-acm.c 2006-03-11 11:50:40.000000000 -0600
@@ -258,6 +258,7 @@ static void acm_ctrl_irq(struct urb *urb

if (acm->tty && !acm->clocal && (acm->ctrlin & ~newctrl & ACM_CTRL_DCD)) {
dbg("calling hangup");
+ printk("acm_ctrl_irq tty_hangup(%p)\n", acm->tty);
tty_hangup(acm->tty);
}

@@ -443,6 +444,8 @@ static int acm_tty_open(struct tty_struc
tty->driver_data = acm;
acm->tty = tty;

+ printk("acm_tty_open tty=%p acm=%p acm->used=%p\n", tty, acm, acm->used);
+
/* force low_latency on so that our tty_push actually forces the data through,
otherwise it is scheduled, and with high data rates data can get lost. */
tty->low_latency = 1;
@@ -504,6 +507,10 @@ static void acm_tty_close(struct tty_str
struct acm *acm = tty->driver_data;
int i;

+ printk("acm_tty_close tty=%p filp=%p acm=%p\n", tty, filp, acm);
+ if (acm)
+ printk("acm_tty_close acm->used=%d acm->dev=%p\n", acm->used, acm->dev);
+
if (!acm || !acm->used)
return;

@@ -517,6 +524,7 @@ static void acm_tty_close(struct tty_str
usb_kill_urb(acm->ru[i].urb);
} else
acm_tty_unregister(acm);
+ /* need to set acm->tty = NULL here */
}
up(&open_sem);
}
@@ -1008,6 +1016,10 @@ static void acm_disconnect(struct usb_in
struct usb_device *usb_dev = interface_to_usbdev(intf);
int i;

+ printk("acm_disconnect intf=%p acm=%p usb_dev=%p\n", intf, acm, usb_dev);
+ if (acm)
+ printk("acm_disconnect acm->used=%d acm->dev=%p acm->tty=%p\n", acm->used, acm->dev, acm->tty);
+
if (!acm || !acm->dev) {
dbg("disconnect on nonexisting interface");
return;


Re: 2.6.16-rc5 pppd oops on disconnects
On 3/11/06, Paul Fulghum <paulkf@microgate.com> wrote:
> OK, try this patch with CONFIG_DEBUG_SLAB on and post the
> debug output with the oops.
>
> I do see one problem with the cdc-acm driver (not setting
> acm->tty to NULL on the last close, where tty is released).
>
> Thanks,
> Paul

dmesg follows...

usb 1-2: new full speed USB device using uhci_hcd and address 6
usb 1-2: configuration #1 chosen from 1 choice
drivers/usb/class/cdc-acm.c: Ignoring extra header, type -3, length 4
cdc_acm 1-2:1.1: ttyACM0: USB ACM device
usbcore: registered new driver cdc_acm
drivers/usb/class/cdc-acm.c: v0.25:USB Abstract Control Model driver
for USB modems and ISDN adapters
CSLIP: code copyright 1989 Regents of the University of California
PPP generic driver version 2.4.2
acm_tty_open tty=cd254af8 acm=ce24eca0 acm->used=00000000
PPP BSD Compression module registered
PPP Deflate Compression module registered
usb 1-2: USB disconnect, address 6
acm_disconnect intf=ce99f720 acm=ce24eca0 usb_dev=ce24f4b8
acm_disconnect acm->used=1 acm->dev=ce24f4b8 acm->tty=cd254af8
acm_disconnect intf=cefb6760 acm=00000000 usb_dev=ce24f4b8
acm_tty_close tty=cd254af8 filp=ce96511c acm=ce24eca0
acm_tty_close acm->used=1 acm->dev=00000000
Unable to handle kernel paging request at virtual address 6b6b6bfb
printing eip:
c0279db6
*pde = 00000000
Oops: 0000 [#2]
Modules linked in: ppp_deflate zlib_deflate bsd_comp ppp_async
crc_ccitt ppp_generic slhc cdc_acm i915 binfmt_misc parport_pc lp
parport video thermal processor fan button battery ac af_packet
nls_iso8859_1 nls_cp437 vfat fat dm_mod fuse evdev ide_cd i810 drm
sr_mod cdrom usb_storage pcmcia crc32 scsi_mod eth1394 8250_pci 8250
serial_core ipw2100 yenta_socket tg3 snd_intel8x0 snd_intel8x0m
ieee80211 ieee80211_crypt rsrc_nonstatic pcmcia_core snd_ac97_codec
snd_ac97_bus ohci1394 ieee1394 ehci_hcd uhci_hcd usbcore intel_agp
agpgart unix
CPU: 0
EIP: 0060:[<c0279db6>] Not tainted VLI
EFLAGS: 00210046 (2.6.16-rc5 #18)
EIP is at __mutex_lock_slowpath+0x70/0x286
eax: c6065e20 ebx: c6064000 ecx: 00000000 edx: 00000054
esi: 6b6b6bdb edi: c64f7550 ebp: c017585e esp: c6065e20
ds: 007b es: 007b ss: 0068
Process pppd (pid: 4237, threadinfo=c6064000 task=c64f7550)
Stack: <0>c6065e20 c6065e20 11111111 11111111 c6065e20 c6223534
cf609cc4 cf609ccc
cf609c60 c017585e c6ae0ae4 cee17ccc 6b6b6b6b c6223534 cf609cc4 cf609ccc
cf609c60 c01e749e cee17ccc c6ae0ae4 c6e2266c c029ecfc 00000000 c6ae0ae4
Call Trace:
[<c017585e>] sysfs_hash_and_remove+0x34/0x10a
[<c01e749e>] class_device_del+0xa0/0x11c
[<c01e7525>] class_device_unregister+0xb/0x16
[<d036320d>] acm_tty_unregister+0x1d/0x63 [cdc_acm]
[<d0363c14>] acm_tty_close+0xc5/0xd4 [cdc_acm]
[<c01d6b4c>] release_dev+0x1a9/0x5b7
[<c011e186>] __group_complete_signal+0x17c/0x20d
[<c011edf1>] sys_kill+0xd8/0xe7
[<c0146d9d>] __fput+0x74/0x132
[<c01d7218>] tty_release+0x9/0xc
[<c0146dc8>] __fput+0x9f/0x132
[<c0144c06>] filp_close+0x4e/0x57
[<c0145584>] sys_close+0x56/0x63
[<c01029c9>] syscall_call+0x7/0xb
Code: fc cd 27 c0 85 d2 0f 44 c2 68 0f 4a 28 c0 c7 05 98 38 2c c0 00
00 00 00 a3 8c 23 2c c0 e8 fe c8 e9 ff e8 34 9d e8 ff 83 c4 10 fa <39>
76 20 74 49 83 3d 98 38 2c c0 00 74 40 8b 15 8c 23 2c c0 b8
<3>Debug: sleeping function called from invalid context at
include/linux/rwsem.h:43
in_atomic():0, irqs_disabled():1
[<c0116c69>] exit_mm+0x28/0xe8
[<c011773a>] do_exit+0x17f/0x619
[<c0103d85>] do_simd_coprocessor_error+0x0/0x14f
[<c0111cf7>] do_page_fault+0x389/0x4c0
[<c011196e>] do_page_fault+0x0/0x4c0
[<c017585e>] sysfs_hash_and_remove+0x34/0x10a
[<c010345f>] error_code+0x4f/0x54
[<c017585e>] sysfs_hash_and_remove+0x34/0x10a
[<c0279db6>] __mutex_lock_slowpath+0x70/0x286
[<c017585e>] sysfs_hash_and_remove+0x34/0x10a
[<c01e749e>] class_device_del+0xa0/0x11c
[<c01e7525>] class_device_unregister+0xb/0x16
[<d036320d>] acm_tty_unregister+0x1d/0x63 [cdc_acm]
[<d0363c14>] acm_tty_close+0xc5/0xd4 [cdc_acm]
[<c01d6b4c>] release_dev+0x1a9/0x5b7
[<c011e186>] __group_complete_signal+0x17c/0x20d
[<c011edf1>] sys_kill+0xd8/0xe7
[<c0146d9d>] __fput+0x74/0x132
[<c01d7218>] tty_release+0x9/0xc
[<c0146dc8>] __fput+0x9f/0x132
[<c0144c06>] filp_close+0x4e/0x57
[<c0145584>] sys_close+0x56/0x63
[<c01029c9>] syscall_call+0x7/0xb

Re: 2.6.16-rc5 pppd oops on disconnects
Bob Copeland wrote:

> dmesg follows...
>
> usb 1-2: USB disconnect, address 6
> acm_disconnect intf=ce99f720 acm=ce24eca0 usb_dev=ce24f4b8
> acm_disconnect acm->used=1 acm->dev=ce24f4b8 acm->tty=cd254af8
> acm_disconnect intf=cefb6760 acm=00000000 usb_dev=ce24f4b8
> acm_tty_close tty=cd254af8 filp=ce96511c acm=ce24eca0
> acm_tty_close acm->used=1 acm->dev=00000000
> Unable to handle kernel paging request at virtual address 6b6b6bfb

> ...

> [<c017585e>] sysfs_hash_and_remove+0x34/0x10a


OK, the cdc-acm driver disconnect/close seems to behave correctly
as I first thought. tty_unregister_device is only called once. The reference
counting is correct. acm->tty still needs to be set to NULL on the final
close, but that is not the problem you are seeing.

I'm looking again at the sysfs stuff as both acm_disconnect
and tty_unregister_device (called from acm_tty_close) remove sysfs entries.
There may be some interaction of entries (name space
collision?) such that acm_disconnect releases a sysfs entry
that tty_unregister_device tries to release again (hence the slab poisoning
flagging a reference to already released memory). I'm not
familiar with this so it may take me a while.

Feel free to bug others about this, I don't mean to interfere
if someone else has a better idea.

Thanks for your persistence,
Paul

Re: 2.6.16-rc5 pppd oops on disconnects
On Fri, Mar 10, 2006 at 09:25:35AM -0500, Bob Copeland wrote:
> I searched the mailing list and saw a similar report but that was back
> in January and it looked to be resolved. I get the following oops in
> pppd when I'm connected with my tethered cell phone and accidentally
> unplug the usb cable. Happens every time. I'm running Linus' git
> 0d514f040ac6629311974889d5b96bcf21c6461a (I think).

Previous versions of bugs with this driver were fixed in the past, but
unfortunately, you are one of the lucky owners of a phone that causes
this problem :(

There is an open Novell bug for this issue, but the USB developers that
have tried to fix this in the past have had no luck at all (probably
because we don't have the device to test with).

What exact model of phone is this?

thanks,

greg k-h


(oops left below for linux-usb-devel people to see; for more details,
see the full lkml thread.)


>
> PPP Deflate Compression module registered
> usb 1-2: USB disconnect, address 4
> Unable to handle kernel paging request at virtual address 6b6b6bfb
> printing eip:
> c027a4f6
> *pde = 00000000
> Oops: 0000 [#1]
> Modules linked in: ppp_deflate zlib_deflate bsd_comp ppp_async
> crc_ccitt ppp_generic slhc i915 binfmt_misc parport_pc lp parport
> video thermal processor fan button battery ac af_packet nls_iso8859_1
> nls_cp437 vfat fat dm_mod fuse evdev usb_storage ide_cd sr_mod
> scsi_mod cdrom i810 drm cdc_acm pcmcia crc32 eth1394 ipw2100 tg3
> ieee80211 ieee80211_crypt 8250_pci 8250 serial_core yenta_socket
> rsrc_nonstatic pcmcia_core ohci1394 ieee1394 snd_intel8x0 ehci_hcd
> snd_intel8x0m snd_ac97_codec uhci_hcd snd_ac97_bus usbcore intel_agp
> agpgart unix
> CPU: 0
> EIP: 0060:[<c027a4f6>] Not tainted VLI
> EFLAGS: 00210046 (2.6.16-rc5 #16)
> EIP is at __mutex_lock_slowpath+0x70/0x286
> eax: cf549e20 ebx: cf548000 ecx: 00000000 edx: 00000054
> esi: 6b6b6bdb edi: cea6f030 ebp: c017592e esp: cf549e20
> ds: 007b es: 007b ss: 0068
> Process pppd (pid: 4076, threadinfo=cf548000 task=cea6f030)
> Stack: <0>cf549e20 cf549e20 11111111 11111111 cf549e20 cce32c60
> cf609cc4 cf609ccc
> cf609c60 c017592e ccabf7a0 cce5466c 6b6b6b6b cce32c60 cf609cc4 cf609ccc
> cf609c60 c01e756e cce5466c ccabf7a0 cd619aac c029fd21 00000000 ccabf7a0
> Call Trace:
> [<c017592e>] sysfs_hash_and_remove+0x34/0x10a
> [<c01e756e>] class_device_del+0xa0/0x11c
> [<c01e75f5>] class_device_unregister+0xb/0x16
> [<d01f81f3>] acm_tty_unregister+0x1d/0x63 [cdc_acm]
> [<d01f8baa>] acm_tty_close+0x9d/0xac [cdc_acm]
> [<c01d6c1c>] release_dev+0x1a9/0x5b7
> [<c01d7e37>] opost+0x1bb/0x1d3
> [<c0146d9d>] __fput+0x74/0x132
> [<c01d72e8>] tty_release+0x9/0xc
> [<c0146dc8>] __fput+0x9f/0x132
> [<c0144c06>] filp_close+0x4e/0x57
> [<c0145584>] sys_close+0x56/0x63
> [<c01029c9>] syscall_call+0x7/0xb
> Code: dc dd 27 c0 85 d2 0f 44 c2 68 ef 59 28 c0 c7 05 98 48 2c c0 00
> 00 00 00 a3 8c 33 2c c0 e8 be c1 e9 ff e8 f4 95 e8 ff 83 c4 10 fa <39>
> 76 20 74 49 83 3d 98 48 2c c0 00 74 40 8b 15 8c 33 2c c0 b8
> <3>Debug: sleeping function called from invalid context at
> include/linux/rwsem.h:43
> in_atomic():0, irqs_disabled():1
> [<c0116c69>] exit_mm+0x28/0xe8
> [<c011773a>] do_exit+0x17f/0x619
> [<c0103d85>] do_simd_coprocessor_error+0x0/0x14f
> [<c0111cf7>] do_page_fault+0x389/0x4c0
> [<c011196e>] do_page_fault+0x0/0x4c0
> [<c017592e>] sysfs_hash_and_remove+0x34/0x10a
> [<c010345f>] error_code+0x4f/0x54
> [<c017592e>] sysfs_hash_and_remove+0x34/0x10a
> [<c027a4f6>] __mutex_lock_slowpath+0x70/0x286
> [<c017592e>] sysfs_hash_and_remove+0x34/0x10a
> [<c01e756e>] class_device_del+0xa0/0x11c
> [<c01e75f5>] class_device_unregister+0xb/0x16
> [<d01f81f3>] acm_tty_unregister+0x1d/0x63 [cdc_acm]
> [<d01f8baa>] acm_tty_close+0x9d/0xac [cdc_acm]
> [<c01d6c1c>] release_dev+0x1a9/0x5b7
> [<c01d7e37>] opost+0x1bb/0x1d3
> [<c0146d9d>] __fput+0x74/0x132
> [<c01d72e8>] tty_release+0x9/0xc
> [<c0146dc8>] __fput+0x9f/0x132
> [<c0144c06>] filp_close+0x4e/0x57
> [<c0145584>] sys_close+0x56/0x63
> [<c01029c9>] syscall_call+0x7/0xb
>
> -Bob

Re: 2.6.16-rc5 pppd oops on disconnects
Bob, if you are still willing, please try this patch
with slab debugging turned on and see if it still oopses.

Thanks,
Paul

--
Paul Fulghum
Microgate Systems, Ltd

--- linux-2.6.16-rc5/drivers/usb/class/cdc-acm.c 2006-02-27 09:24:29.000000000 -0600
+++ b/drivers/usb/class/cdc-acm.c 2006-03-12 10:22:21.000000000 -0600
@@ -980,7 +980,7 @@ skip_normal_probe:
usb_driver_claim_interface(&acm_driver, data_interface, acm);

usb_get_intf(control_interface);
- tty_register_device(acm_tty_driver, minor, &control_interface->dev);
+ tty_register_device(acm_tty_driver, minor, NULL);

acm_table[minor] = acm;
usb_set_intfdata (intf, acm);


Re: 2.6.16-rc5 pppd oops on disconnects
On 3/12/06, Greg KH <greg@kroah.com> wrote:
> There is an open Novell bug for this issue, but the USB developers that
> have tried to fix this in the past have had no luck at all (probably
> because we don't have the device to test with).
>
> What exact model of phone is this?

It's a Nokia 6230. I'm happy to test patches, and I'll also take a
glance at the code when I get some time to see what I can figure out.

-Bob

Re: 2.6.16-rc5 pppd oops on disconnects
On 3/12/06, Paul Fulghum <paulkf@microgate.com> wrote:
> --- linux-2.6.16-rc5/drivers/usb/class/cdc-acm.c 2006-02-27 09:24:29.000000000 -0600
> +++ b/drivers/usb/class/cdc-acm.c 2006-03-12 10:22:21.000000000 -0600
> @@ -980,7 +980,7 @@ skip_normal_probe:
> usb_driver_claim_interface(&acm_driver, data_interface, acm);
>
> usb_get_intf(control_interface);
> - tty_register_device(acm_tty_driver, minor, &control_interface->dev);
> + tty_register_device(acm_tty_driver, minor, NULL);
>
> acm_table[minor] = acm;
> usb_set_intfdata (intf, acm);
>

Paul,

No oops with the above patch.

thanks!
-Bob

Re: 2.6.16-rc5 pppd oops on disconnects
Bob Copeland wrote:
> On 3/12/06, Paul Fulghum <paulkf@microgate.com> wrote:
>
>>--- linux-2.6.16-rc5/drivers/usb/class/cdc-acm.c 2006-02-27 09:24:29.000000000 -0600
>>+++ b/drivers/usb/class/cdc-acm.c 2006-03-12 10:22:21.000000000 -0600
>>@@ -980,7 +980,7 @@ skip_normal_probe:
>> usb_driver_claim_interface(&acm_driver, data_interface, acm);
>>
>> usb_get_intf(control_interface);
>>- tty_register_device(acm_tty_driver, minor, &control_interface->dev);
>>+ tty_register_device(acm_tty_driver, minor, NULL);
>>
>> acm_table[minor] = acm;
>> usb_set_intfdata (intf, acm);
>>
>
>
> Paul,
>
> No oops with the above patch.
>
> thanks!
> -Bob

I think what is happening is that control_interface->dev is used
to back two sysfs entries (one usb, and one tty). When the usb
device is disconnected, the usb sysfs entries are removed and
the backing device is released. But the tty sysfs entry is
not removed until later, after the tty is closed. That removal oopses
because the backing device (or some sysfs entity associated with
the backing device) has already been freed. The slab poisoning
is needed to catch this. That's my theory, but I'm no expert
on USB or sysfs.

The above change does not associate the device
with the tty object, and no tty sysfs entry is made that
references the device. No function is lost, but some info
is not exported to userland.

I guess a more thorough approach would be to somehow not release
the usb device until the tty close completes. But that sounds
kind of messy, as the usb code would need to know about any
other class sysfs entries besides usb. (tty, maybe storage, etc)

Greg or the USB folks are more qualified to decide the details.

Thanks for your help Bob.

--
Paul

