Mailing List Archive

[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3
New submission from Howard A. Landman <howard@riverrock.org>:

I have a program qtd.py that reliably dies with free(): invalid pointer after about 13 hours of runtime (on a RPi3B+). This is hard to debug because (1) try:except: will not catch SIGABRT
(2) signal.signal(signal.SIGABRT, sigabrt_handler) also fails to catch SIGABRT
(3) python3-dbg does not work with libraries like RPi.GPIO and spidev()
This happens under both Buster and Stretch, so I don't think it's OS-dependent.

I managed to get a core dump and gdb back trace gives:
warning: core file may not match specified executable file.
[New LWP 10277]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `python3 qtd.py'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x76c5e308 in __GI_abort () at abort.c:100
#2 0x76cae51c in __libc_message (action=action@entry=do_abort,
fmt=<optimized out>) at ../sysdeps/posix/libc_fatal.c:181
#3 0x76cb5044 in malloc_printerr (str=<optimized out>) at malloc.c:5341
#4 0x76cb6d50 in _int_free (av=0x76d917d4 <main_arena>,
p=0x43417c <small_ints.lto_priv+72>, have_lock=<optimized out>)
at malloc.c:4165
#5 0x001b3bb4 in list_ass_item (a=<optimized out>, i=<optimized out>,
v=<optimized out>, a=<optimized out>, i=<optimized out>, v=<optimized out>)
at ../Objects/listobject.c:739
Backtrace stopped: Cannot access memory at address 0x17abeff8

So, as far as I can tell, it looks like list_ass_item() is where the error happens. I'm willing to help debug and maybe even fix this, but is there any way to remove any of the roadblocks (1-3) above?

NOTE: qtd.py will exit unless there is a Texas Instruments TDC7201 chip attached to the RPI's SPI pins.

----------
components: Interpreter Core
hgrepos: 389
messages: 373911
nosy: Howard_Landman
priority: normal
severity: normal
status: open
title: free(): invalid pointer in list_ass_item() in Python 3.7.3
type: crash
versions: Python 3.7

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Change by Howard A. Landman <howard@riverrock.org>:


----------
hgrepos: -389

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Howard A. Landman <howard@riverrock.org> added the comment:

This is not a memory leak problem. "Top" reports VIRT 21768 RES 13516 for the whole run, and Python internal resource reporting says 13564 kb for the whole run. So that's less than 1 kb leaked in 118.6M measurement cycles; most likely zero.

----------

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Steve Stagg <stestagg@gmail.com> added the comment:

It's possible that RPi.GPIO (or another C extension) is mis-managing reference counts, resulting in it freeing or trying to free an object at the wrong time.

If that object is a common value, or the branch that causes the refcount issue is uncommon, it may take many millions of cycles for the problem to occur.

----------
nosy: +stestagg

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Christian Heimes <lists@cheimes.de> added the comment:

int_free() and small_ints in the traceback likely point to a reference counting issue for small integers in range of -5 to +256. These integers are cached and should never get freed in the middle of a process.

----------
nosy: +christian.heimes

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Christian Heimes <lists@cheimes.de> added the comment:

Correction, _int_free() is an internal function of malloc().

small_ints.lto_priv+72 looks like deallocation of an integer object at location +72. This is either int 4 (on 64bit platforms) or int 13 (on 32bit platforms):

>>> NSMALLNEGINTS = 5
>>> (4 + NSMALLNEGINTS) * 8
72
>>> (13 + NSMALLNEGINTS) * 4
72

Could you please report the issue with RPi package owner? This looks like a bug in a 3rd party module.

----------

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Howard A. Landman <howard@riverrock.org> added the comment:

After a quick glance at the source code for the spidev library, I think it is unlikely but not impossible to be the home for the bug. It does do malloc() and free(), but only for data that is greater than 256 bytes. Short tx and rx data is kept in static local buffers. Also, these calls do not match the partial stack trace I got. There is a small amount of allocating and deallocating Python objects, however, including calls to PyList_New(), PyList_SET_ITEM(), and Py_TYPE(self)->tp_free((PyObject *)self), so it's possible that the bug is buried under one of those.

----------
nosy: -christian.heimes, stestagg

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Howard A. Landman <howard@riverrock.org> added the comment:

I'm running under 32-bit Raspbian, so let's assume the magic number is 13. There are only two places in my own code where the number 13 appears:

(1) My result_list is 14 items long, i.e. 0 to 13. Relevant code from qtd.py:
cum_results = [.0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
...
while batches != 0:
...
result_list = [.0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
for m in range(ITERS):
...
result = tdc.measure(...)
result_list[result] += 1
...
for i in range(len(result_list)):
cum_results[i] += result_list[i]

I notice that result_list is getting thrown away each cycle, and thus must be garbage-collected. I could try changing that and see if it has any effect.

(2) in the tdc7201 library, 13 occurs as a possible (error) result code from measure(). However several of the failing runs had zero errors of this class, which means the code with 13 in it never got executed even once. I am not using pin 13 of the RPi's header, so I never send that number to RPi.GPIO. GPIO13 is pin 33 of the header, and I am not using that pin either, so I don't think RPi.GPIO is translating one of my pin number arguments to 13 internally. (But I should check Broadcom Mode numbers.)

----------

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Howard A. Landman <howard@riverrock.org> added the comment:

Made some progress. By running it under gdb and breaking in malloc_printerr(), I got a better stack trace:

Breakpoint 1, malloc_printerr (str=0x76e028f8 "free(): invalid pointer")
at malloc.c:5341
5341 malloc.c: No such file or directory.
(gdb) bt
#0 malloc_printerr (str=0x76e028f8 "free(): invalid pointer") at malloc.c:5341
#1 0x76d44d50 in _int_free (av=0x76e1f7d4 <main_arena>,
p=0x43417c <small_ints.lto_priv+72>, have_lock=<optimized out>)
at malloc.c:4165
#2 0x001b4f40 in list_dealloc (op=0x760ef288) at ../Objects/listobject.c:324
#3 0x001be784 in frame_dealloc (
f=Frame 0x765fcc30, for file /home/pi/src/QTD/src/tdc7201/tdc7201/__init__.py, line 719, in read_regs24 (i=40, reg=28)) at ../Objects/frameobject.c:470
#4 function_code_fastcall (globals=<optimized out>, nargs=<optimized out>,
args=<optimized out>, co=<optimized out>) at ../Objects/call.c:291
#5 _PyFunction_FastCallKeywords (func=<optimized out>, stack=<optimized out>,
nargs=<optimized out>, kwnames=<optimized out>) at ../Objects/call.c:408
#6 0x0011f984 in call_function (kwnames=0x0, oparg=<optimized out>,
pp_stack=0x7effed40) at ../Python/ceval.c:4554
#7 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>)
at ../Python/ceval.c:3110
#8 0x0011d63c in PyEval_EvalFrameEx (throwflag=0,
f=Frame 0x760e8960, for file /home/pi/src/QTD/src/tdc7201/tdc7201/__init__.py, line 962, in measure (self=<TDC7201(_spi=<SpiDev at remote 0x7618d2f0>, reg1=[.130, 66, 0, 7, 255, 255, 6, 64, 0, 1, 65535, 1600, Breakpoint 1, malloc_printerr (str=0x76e028f8 "free(): invalid pointer")
at malloc.c:5341
5341 malloc.c: No such file or directory.
(gdb) bt
#0 malloc_printerr (str=0x76e028f8 "free(): invalid pointer") at malloc.c:5341
#1 0x76d44d50 in _int_free (av=0x76e1f7d4 <main_arena>,
p=0x43417c <small_ints.lto_priv+72>, have_lock=<optimized out>)
at malloc.c:4165
#2 0x001b4f40 in list_dealloc (op=0x760ef288) at ../Objects/listobject.c:324
#3 0x001be784 in frame_dealloc (
f=Frame 0x765fcc30, for file /home/pi/src/QTD/src/tdc7201/tdc7201/__init__.py, line 719, in read_regs24 (i=40, reg=28)) at ../Objects/frameobject.c:470
#4 function_code_fastcall (globals=<optimized out>, nargs=<optimized out>,
args=<optimized out>, co=<optimized out>) at ../Objects/call.c:291
#5 _PyFunction_FastCallKeywords (func=<optimized out>, stack=<optimized out>,
nargs=<optimized out>, kwnames=<optimized out>) at ../Objects/call.c:408
#6 0x0011f984 in call_function (kwnames=0x0, oparg=<optimized out>,
pp_stack=0x7effed40) at ../Python/ceval.c:4554
#7 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>)
at ../Python/ceval.c:3110
#8 0x0011d63c in PyEval_EvalFrameEx (throwflag=0,
f=Frame 0x760e8960, for file /home/pi/src/QTD/src/tdc7201/tdc7201/__init__.py, line 962, in measure (self=<TDC7201(_spi=<SpiDev at remote 0x7618d2f0>, reg1=[.130, 66, 0, 7, 255, 255, 6, 64, 0, 1, 65535, 1600, 1, None, None, None, 1284, 322, 2371, 0, 0, 0, 0, 0, 0, 0, 0, 2322, 23172, None], chip_select=1, sclk=23, miso=21, mosi=19, cs1=24, cs2=26, enable=12, osc_enable=16, trig1=7, int1=37, trig2=None, int2=None, start=18, stop=22, meas_mode=2, cal_pers=10, cal_count=<float at remote 0x76225e50>, norm_lsb=<float at remote 0x760ee8f0>, tof1=<float at remote 0x760ee900>, tof2=<float at remote 0x760ee920>, tof3=0, tof4=0, tof5=0) at remote 0x76191750>, simulate=True, error_prefix='59835 ', log_file=<_io.TextIOWrapper at remote 0x761df4b0>, cf1=130, timeout=<float at remote 0x76225d00>, n_stop=3, threshold=<float at remote 0x760ee880>, pulse=2))
at ../Python/ceval.c:3930
1, None, None, None, 1284, 322, 2371, 0, 0, 0, 0, 0, 0, 0, 0, 2322, 23172, None], chip_select=1, sclk=23, miso=21, mosi=19, cs1=24, cs2=26, enable=12, osc_enable=16, trig1=7, int1=37, trig2=None, int2=None, start=18, stop=22, meas_mode=2, cal_pers=10, cal_count=<float at remote 0x76225e50>, norm_lsb=<float at remote 0x760ee8f0>, tof1=<float at remote 0x760ee900>, tof2=<float at remote 0x760ee920>, tof3=0, tof4=0, tof5=0) at remote 0x76191750>, simulate=True, error_prefix='59835 ', log_file=<_io.TextIOWrapper at remote 0x761df4b0>, cf1=130, timeout=<float at remote 0x76225d00>, n_stop=3, threshold=<float at remote 0x760ee880>, pulse=2))
at ../Python/ceval.c:3930
...

read_regs24 (i=40, reg=28) performs an SPI read of 13 24-bit registers (so 39 bytes); the corresponding Python code is:

def read_regs24(self):
"""Read all 24-bit chip registers, using auto-increment feature."""
result24 = self._spi.xfer([.self.MINREG24|self._AI,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00])
#print("AI read 24-bits =", result24)
#print("length =", len(result24))
i = 1
for reg in range(self.MINREG24, self.MAXREG24+1):
# Data comes in MSB first.
self.reg1[reg] = (result24[i] << 16) | (result24[i+1] << 8) | result24[i+2]
i += 3

The zero padding is necessary to make sure that enough SPI clock cycles are sent for the data bytes to get clocked back in. The _AI flag turns on auto-increment, so each byte is from the next address.

So it looks like the bug is either in the spidev library, or in Python deallocation of this frame. I'll try pulling the sent data list outside the method, since it is always the same; that will avoid the list alloc/dealloc, and may even slightly speed up the program. If that fixes the problem, it's in Python; if it doesn't, it's in spidev. Probably.

----------

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Howard A. Landman <howard@riverrock.org> added the comment:

Getting closer to isolating this. A small program that does nothing but call read_regs24() over and over dies with the same error after about 107.4M iterations. So it seems to be definitely in that method. But that only takes a few hours, not half a day. Going to try to separate the SPI operation from the remaining Python code next.

----------

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Howard A. Landman <howard@riverrock.org> added the comment:

It appears to be in the spidev library xfer method, when used for reading. Calling that 107.4M times (using the same code as inside my read_regs24() method) causes the free() error.

Breakpoint 1, malloc_printerr (str=0x76e028f8 "free(): invalid pointer")
at malloc.c:5341
5341 malloc.c: No such file or directory.
(gdb) bt
#0 malloc_printerr (str=0x76e028f8 "free(): invalid pointer") at malloc.c:5341
#1 0x76d44d50 in _int_free (av=0x76e1f7d4 <main_arena>,
p=0x43417c <small_ints.lto_priv+72>, have_lock=<optimized out>) at malloc.c:4165
#2 0x001b4f40 in list_dealloc (op=0x766d0f58) at ../Objects/listobject.c:324
#3 0x7669b660 in ?? ()
from /usr/lib/python3/dist-packages/spidev.cpython-37m-arm-linux-gnueabihf.so
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Should I be worried about the "corrupt stack" message?

----------

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Howard A. Landman <howard@riverrock.org> added the comment:

OK, this has been filed against the spidev library:
https://github.com/doceme/py-spidev/issues/107

Do you want it closed, or left open until that gets resolved?

----------
resolution: -> third party

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Change by Ned Deily <nad@python.org>:


----------
stage: -> resolved
status: open -> closed

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Christian Heimes <lists@cheimes.de> added the comment:

Yeah, my bet is on SpiDev_xfer(), https://github.com/doceme/py-spidev/blob/master/spidev_module.c#L458

----------
nosy: +christian.heimes

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3 [ In reply to ]
Howard A. Landman <howard@riverrock.org> added the comment:

As far as we can tell, this is a known Py_DECREF problem with spidev==3.4. Testing on spidev==3.5 has not triggered the bug so far, so it appears to be already fixed.

Under 3.4, changing the list to a tuple did not affect the behavior.

----------

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41335>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com