Mailing List Archive

3.10.9: Oops at elf_core_dump()
Hi,
I just got this stacktrace. Not sure whom to send it to; poking through the MAINTAINERS
file and looking for ELF gave me nothing. ;-)

[105670.434336] BUG: unable to handle kernel NULL pointer dereference at (null)
[105670.434366] IP: [<ffffffff812f7b42>] strlen+0x2/0x20
[105670.434385] PGD 18c8e5067 PUD 2b547e067 PMD 0
[105670.434401] Oops: 0000 [#1] SMP
[105670.434413] Modules linked in: iwldvm iwlwifi
[105670.434432] CPU: 0 PID: 7497 Comm: emerge Not tainted 3.10.9-default-pciehp #8
[105670.434451] Hardware name: Dell Inc. Vostro 3550/, BIOS A11 08/03/2012
[105670.434468] task: ffff88037df42f70 ti: ffff88018683c000 task.ti: ffff88018683c000
[105670.434487] RIP: 0010:[<ffffffff812f7b42>] [<ffffffff812f7b42>] strlen+0x2/0x20
[105670.434509] RSP: 0018:ffff88018683d9f0 EFLAGS: 00010246
[105670.434523] RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffff88037df42f70
[105670.434542] RDX: 00000000016e3610 RSI: 0000000000000000 RDI: 0000000000000000
[105670.434560] RBP: ffff88018683da08 R08: 0000000000000000 R09: 0000000000000000
[105670.434579] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88018683db00
[105670.434598] R13: 00007ffffffff000 R14: 0000000000000004 R15: 0000000000000000
[105670.434617] FS: 00007f89b0989740(0000) GS:ffff88041d800000(0000) knlGS:0000000000000000
[105670.434637] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[105670.434652] CR2: 0000000000000000 CR3: 00000002b4c06000 CR4: 00000000000407f0
[105670.434671] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[105670.434690] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[105670.434708] Stack:
[105670.434715] ffffffff811bb0a3 0000000000000004 00000000000003d8 ffff88018683dc08
[105670.434738] ffffffff811bbcbd ffffffff811bb913 0000000000000000 ffff88018683db28
[105670.434762] ffff88037df42f70 ffff88018683ffff 0000000000000246 000f424200000000
[105670.434785] Call Trace:
[105670.434795] [<ffffffff811bb0a3>] ? notesize.isra.11+0x13/0x30
[105670.434812] [<ffffffff811bbcbd>] elf_core_dump+0xbfd/0x1570
[105670.434828] [<ffffffff811bb913>] ? elf_core_dump+0x853/0x1570
[105670.434845] [<ffffffff811c2bc5>] ? do_coredump+0xe25/0xff0
[105670.434861] [<ffffffff810eb85d>] ? trace_hardirqs_on+0xd/0x10
[105670.434878] [<ffffffff8116e0ff>] ? __sb_start_write+0xdf/0x1b0
[105670.434894] [<ffffffff811c2bc5>] ? do_coredump+0xe25/0xff0
[105670.434911] [<ffffffff81097209>] ? unshare_files+0x29/0xa0
[105670.434926] [<ffffffff811c289c>] do_coredump+0xafc/0xff0
[105670.434943] [<ffffffff810a63c8>] ? __sigqueue_free+0x38/0x40
[105670.434960] [<ffffffff810a9961>] get_signal_to_deliver+0x1c1/0x5c0
[105670.434977] [<ffffffff810a8721>] ? do_send_sig_info+0x61/0x90
[105670.434994] [<ffffffff81002303>] do_signal+0x53/0x8e0
[105670.435008] [<ffffffff810a8cd0>] ? kill_pgrp+0x60/0x60
[105670.435025] [<ffffffff810c2b9e>] ? finish_task_switch+0x7e/0xe0
[105670.435043] [<ffffffff81799750>] ? sysret_signal+0x5/0x47
[105670.435058] [<ffffffff81002bef>] do_notify_resume+0x5f/0x70
[105670.435074] [<ffffffff812fc71e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[105670.435092] [<ffffffff817999e2>] int_signal+0x12/0x17
[105670.435106] Code: 48 89 e5 f6 82 40 c6 84 81 20 74 15 0f 1f 44 00 00 48 83 c0 01 0f b6 10 f6 82 40 c6 84 81 20 75 f0 5d c3 66 0f 1f 44 00 00 31 c0 <80> 3f 00 55 48 89 e5 74 11 48 89 f8 66 90 48 83 c0 01 80 38 00
[105670.435238] RIP [<ffffffff812f7b42>] strlen+0x2/0x20
[105670.435254] RSP <ffff88018683d9f0>
[105670.435843] CR2: 0000000000000000
[105670.439699] ---[ end trace 9d67aee555e92d75 ]---



Martin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 3.10.9: Oops at elf_core_dump()
On Thu, Aug 29, 2013 at 11:46:18PM +0200, Martin MOKREJŠ wrote:
> Hi,
> I just got this stacktrace. Not sure whom to send it to; poking through the MAINTAINERS
> file and looking for ELF gave me nothing. ;-)
>
> [105670.434336] BUG: unable to handle kernel NULL pointer dereference at (null)
> [105670.434366] IP: [<ffffffff812f7b42>] strlen+0x2/0x20
> [105670.434385] PGD 18c8e5067 PUD 2b547e067 PMD 0
> [105670.434401] Oops: 0000 [#1] SMP
> [105670.434413] Modules linked in: iwldvm iwlwifi
> [105670.434432] CPU: 0 PID: 7497 Comm: emerge Not tainted 3.10.9-default-pciehp #8

Is this reproducible?

thanks,

greg k-h
Re: 3.10.9: Oops at elf_core_dump()
Got it for the first time. Actually, I am doing something really unusual
(http://bugs.python.org/issue18843).

I am looking for an answer to why I suffer memory corruption in Python applications.
So I installed DUMA from http://duma.sourceforge.net and tried to recompile and reinstall
the failing python. In a previous attempt it exited, and per the README instructions
I increased the vm.max_map_count value.


# export LD_PRELOAD=/usr/lib64/libduma.so.0.0.0
# sysctl -w vm.max_map_count=1000000
# emerge dev-lang/python:2.7
DUMA 2.5.15 (shared library, NO_LEAKDETECTION)
Copyright (C) 2006 Michael Eddington <meddington@gmail.com>
Copyright (C) 2002-2008 Hayati Ayguen <h_ayguen@web.de>, Procitec GmbH
Copyright (C) 1987-1999 Bruce Perens <bruce@perens.com>


* IMPORTANT: 11 news items need reading for repository 'gentoo'.
* Use eselect news to read news items.


* IMPORTANT: config file '5 (shared library, NO_LEAKDETECTION)
Copyright (C) 2006 Michael Eddington <meddington@gmail.com>
Copyright (C) 2002-2008 Hayati Ayguen <h_ayguen@web.de>, Procitec GmbH
Copyright (C) 1987-1999 Bruce Perens <bruce@perens.com>

' needs updating.
* See the CONFIGURATION FILES section of the emerge
* man page to learn how to update config files.
Calculating dependencies |
DUMA Aborting: mprotect() failed: Cannot allocate memory.
Check README section 'MEMORY USAGE AND EXECUTION SPEED'
if your (Linux) system may limit the number of different page mappings per process


[and it crashed; Ctrl+C did not work]



Sorry, I do not know what more to say. I just crashed the kernel, but apart from
the Oops it works so far. The core file size is zero.
Martin



--
Martin Mokrejs, Ph.D.
Bioinformatics
Donovalska 1658
149 00 Prague
Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs
Re: 3.10.9: Oops at elf_core_dump()
So it happened again:

$ export LD_PRELOAD=/usr/lib64/libduma.so.0.0.0
$ python memory-corruption-test.py
DUMA 2.5.15 (shared library, NO_LEAKDETECTION)
Copyright (C) 2006 Michael Eddington <meddington@gmail.com>
Copyright (C) 2002-2008 Hayati Ayguen <h_ayguen@web.de>, Procitec GmbH
Copyright (C) 1987-1999 Bruce Perens <bruce@perens.com>

DUMA 2.5.15 (shared library, NO_LEAKDETECTION)
Copyright (C) 2006 Michael Eddington <meddington@gmail.com>
Copyright (C) 2002-2008 Hayati Ayguen <h_ayguen@web.de>, Procitec GmbH
Copyright (C) 1987-1999 Bruce Perens <bruce@perens.com>

Finished one record
Finished one record
Finished one record
Finished one record
Finished one record
[cut]


DUMA Aborting: mprotect() failed: Cannot allocate memory.
Check README section 'MEMORY USAGE AND EXECUTION SPEED'
if your (Linux) system may limit the number of different page mappings per process
Fatal Python error: Illegal instruction

Current thread 0x00007fc9c4803740:
File "/usr/lib64/python2.7/site-packages/Bio/Blast/NCBIXML.py", line 106 in endElement
File "/mnt/1TB/var/tmp/portage/dev-lang/python-2.7.5-r2/work/Python-2.7.5/Modules/pyexpat.c", line 618 in EndElement
File "/usr/lib64/python2.7/site-packages/Bio/Blast/NCBIXML.py", line 654 in parse
File "memory-corruption-test.py", line 55 in doparse
File "memory-corruption-test.py", line 104 in main
File "memory-corruption-test.py", line 109 in <module>



The stacktrace is a little different, but I think I need to find out which resource limit
to raise so that DUMA can keep running while watching the python binary.


[112567.987073] BUG: unable to handle kernel NULL pointer dereference at (null)
[112567.987684] IP: [<ffffffff812f7b42>] strlen+0x2/0x20
[112567.988282] PGD 28be2c067 PUD 3a7744067 PMD 0
[112567.988879] Oops: 0000 [#2] SMP
[112567.989468] Modules linked in: iwldvm iwlwifi
[112567.990057] CPU: 0 PID: 8822 Comm: python2.7 Tainted: G D 3.10.9-default-pciehp #8
[112567.990655] Hardware name: Dell Inc. Vostro 3550/, BIOS A11 08/03/2012
[112567.991249] task: ffff8803b5eb0fd0 ti: ffff8803b36d4000 task.ti: ffff8803b36d4000
[112567.991845] RIP: 0010:[<ffffffff812f7b42>] [<ffffffff812f7b42>] strlen+0x2/0x20
[112567.992443] RSP: 0018:ffff8803b36d59f0 EFLAGS: 00010246
[112567.993039] RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffff8803b5eb0fd0
[112567.993643] RDX: 00000000016e3610 RSI: 0000000000000000 RDI: 0000000000000000
[112567.994249] RBP: ffff8803b36d5a08 R08: 00000000fffffffa R09: 0000000000000000
[112567.994854] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8803b36d5b00
[112567.995459] R13: 00007ffffffff000 R14: 0000000000000004 R15: 0000000000000000
[112567.996060] FS: 00007fc9c4803740(0000) GS:ffff88041d800000(0000) knlGS:0000000000000000
[112567.996664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[112567.997268] CR2: 0000000000000000 CR3: 00000003a3b84000 CR4: 00000000000407f0
[112567.997884] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[112567.998495] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
[112567.999100] Stack:
[112567.999698] ffffffff811bb0a3 ffff8803b36d5e60 00000000000003d8 ffff8803b36d5c08
[112568.000315] ffffffff811bbcbd ffffffff811bb913 0000000000000002 0000000000000000
[112568.000931] ffff8803b5eb0fd0 ffff8803b36dffff 0000000000000246 000f424200000000
[112568.001548] Call Trace:
[112568.002156] [<ffffffff811bb0a3>] ? notesize.isra.11+0x13/0x30
[112568.002774] [<ffffffff811bbcbd>] elf_core_dump+0xbfd/0x1570
[112568.003392] [<ffffffff811bb913>] ? elf_core_dump+0x853/0x1570
[112568.004012] [<ffffffff81097209>] ? unshare_files+0x29/0xa0
[112568.004629] [<ffffffff811c289c>] do_coredump+0xafc/0xff0
[112568.005247] [<ffffffff810a63c8>] ? __sigqueue_free+0x38/0x40
[112568.005865] [<ffffffff810a9961>] get_signal_to_deliver+0x1c1/0x5c0
[112568.006488] [<ffffffff810b5d90>] ? pid_vnr+0x30/0x30
[112568.007108] [<ffffffff81002303>] do_signal+0x53/0x8e0
[112568.007725] [<ffffffff81002bef>] do_notify_resume+0x5f/0x70
[112568.008342] [<ffffffff812fc71e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[112568.008965] [<ffffffff817999e2>] int_signal+0x12/0x17
[112568.009583] Code: 48 89 e5 f6 82 40 c6 84 81 20 74 15 0f 1f 44 00 00 48 83 c0 01 0f b6 10 f6 82 40 c6 84 81 20 75 f0 5d c3 66 0f 1f 44 00 00 31 c0 <80> 3f 00 55 48 89 e5 74 11 48 89 f8 66 90 48 83 c0 01 80 38 00
[112568.011042] RIP [<ffffffff812f7b42>] strlen+0x2/0x20
[112568.011748] RSP <ffff8803b36d59f0>
[112568.012445] CR2: 0000000000000000
[112568.013155] ---[ end trace 9d67aee555e92d76 ]---



--
Martin Mokrejs, Ph.D.
Bioinformatics
Donovalska 1658
149 00 Prague
Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs
Re: 3.10.9: Oops at elf_core_dump()
On Thu, Aug 29, 2013 at 03:05:50PM -0700, Greg KH wrote:
> On Thu, Aug 29, 2013 at 11:46:18PM +0200, Martin MOKREJŠ wrote:
> > Hi,
> > I just got this stacktrace. Not sure whom to send it to; poking through the MAINTAINERS
> > file and looking for ELF gave me nothing. ;-)
> >
> > [105670.434336] BUG: unable to handle kernel NULL pointer dereference at (null)
> > [105670.434366] IP: [<ffffffff812f7b42>] strlen+0x2/0x20
> > [105670.434385] PGD 18c8e5067 PUD 2b547e067 PMD 0
> > [105670.434401] Oops: 0000 [#1] SMP
> > [105670.434413] Modules linked in: iwldvm iwlwifi
> > [105670.434432] CPU: 0 PID: 7497 Comm: emerge Not tainted 3.10.9-default-pciehp #8
>
> Is this reproducible?

Yes, and here is my analysis:

fill_files_note(&info->files) exits early when there are too many VM areas or
under memory pressure (vmalloc failing). That leaves info->files with a NULL
name string, and notesize() then crashes in strlen() on it.

as root do:

echo 300000 > /proc/sys/vm/max_map_count

then, as a regular user:

ulimit -c unlimited
gcc prog.c -o prog
./prog

prog.c:
-------
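#include <sys/mman.h>   /* mmap(), PROT_*, MAP_* */

int main(int argc, char *argv[])
{
    char *p, *t;
    int i;

    /* Arbitrary, otherwise unused region of the address space. */
    p = (void *)0x444400000000;

    /*
     * Map 200000 single pages, each 0x2000 apart, so that no two
     * mappings are adjacent and every one becomes its own VM area.
     */
    for (i = 0; i < 200000; i++) {
        t = mmap(p, 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC,
                 MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE,
                 -1, 0);
        p = &p[0x2000];
    }

    /* Write through a NULL pointer to get SIGSEGV and a core dump. */
    *((char *)0x0) = 0;

    return 0;
}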

And the result:

user@guestvm:~$ c[ 380.520865] BUG: unable to handle kernel NULL pointer dereference at 0000000000000086
[ 380.523196] IP: [<ffffffff812ee180>] strim+0x80/0x80
[ 380.524477] PGD 3abc6067 PUD 3c7b4067 PMD 0
[ 380.525974] Oops: 0000 [#1] SMP

Entering kdb (current=0xffff880033ee8000, pid 1716) on processor 0 Oops: (null)
due to oops @ 0xffffffff812ee180
dCPU: 0 PID: 1716 Comm: a.out Not tainted 3.10.9-mod-nodbg+ #1
dHardware name: Bochs Bochs, BIOS Bochs 01/01/2011
dtask: ffff880033ee8000 ti: ffff880034eec000 task.ti: ffff880034eec000
dRIP: 0010:[<ffffffff812ee180>] [<ffffffff812ee180>] strim+0x80/0x80
dRSP: 0000:ffff880034eeda30 EFLAGS: 00010292
dRAX: 0000000000c353c0 RBX: 00000000ffff8800 RCX: ffff880033ee8000
dRDX: 0000000000493f78 RSI: 00000000ffff8800 RDI: 0000000000000086
dRBP: ffff880034eeda48 R08: 00000000fffffffd R09: 0000000000000000
dR10: 0000000000000000 R11: ffffffff812e6c4e R12: ffff880034eedb78
dR13: 00007ffffffff000 R14: 0000000000000000 R15: ffffffff81802708
dFS: 00007fd9f8bf7740(0000) GS:ffff88003f200000(0000) knlGS:0000000000000000
dCS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
dCR2: 0000000000000086 CR3: 000000003ae91000 CR4: 00000000001407f0
dDR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
dDR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
dStack:
ffffffff811d29a5 ffff880033ee8000 00000000000003d8 ffff880034eedc38
ffffffff811d3579 ffff880034eeda88 ffffffff8108625a ffff88003a522300
ffff880033ee8000 0000000000493f78 0000ffff00030d51 ffff880000493f78
dCall Trace:
d [<ffffffff811d29a5>] ? notesize.isra.9+0x15/0x30
d [<ffffffff811d3579>] elf_core_dump+0xbb9/0x1460
d [<ffffffff8108625a>] ? finish_task_switch+0x4a/0x100
d [<ffffffff8164054d>] ? schedule+0x5d/0x60
d [<ffffffff81084a23>] ? __wake_up+0x53/0x70
d [<ffffffff811dbaee>] do_coredump+0xb8e/0xef0
d [<ffffffff8106632d>] ? __sigqueue_free+0x3d/0x50
d [<ffffffff81069bcf>] get_signal_to_deliver+0x53f/0x5d0
d [<ffffffff81637c03>] ? bad_area+0x44/0x4c
d [<ffffffff810123c7>] do_signal+0x57/0x570
d [<ffffffff8108cf0d>] ? __dequeue_entity+0x3d/0x50
d [<ffffffff81637eda>] ? printk+0x61/0x63
d [<ffffffff8108625a>] ? finish_task_switch+0x4a/0x100
d [<ffffffff8164030b>] ? __schedule+0x6bb/0x800
d [<ffffffff8101291e>] do_notify_resume+0x3e/0x90
d [<ffffffff81641b3c>] retint_signal+0x48/0x8c

Some systems really do need a very large max_map_count, so keeping the limit
low is not a way to avoid this. binfmt_elf.c should be fixed instead.

--
Dan Aloni