New submission from Gregory P. Smith <greg@krypto.org>:
The attached script simply loops building a list of tuples. It has
horrible performance as the list gets larger compared to something
appending simple objects like ints to the list.
% python tuple_gc_hell.py ~
...
1000000 1.3615500927
2000000 2.95893096924
3000000 4.53531980515
4000000 5.62795209885
5000000 7.70974612236
...
The time it takes to execute grows linearly with the number of tuples
already appended to the list.
Turning on gc.debug(gc.DEBUG_STATS) shows why as does running python
under a profiler:
% cumulative self self total
time seconds seconds calls s/call s/call name
27.85 115.84 115.84 14270 0.01 0.02 collect
21.19 204.01 88.17 1115120314 0.00 0.00 tupletraverse
13.14 258.65 54.64 1655624731 0.00 0.00 visit_reachable
10.72 303.25 44.60 1655624731 0.00 0.00 visit_decref
5.97 328.10 24.85 338 0.07 1.10 PyEval_EvalFrame
It appears the cyclic gc is rechecking all of these tuples repeatedly.
disabling gc during the loop or using a very high gc.set_threshold hides
the problem.
----------
components: Interpreter Core
files: tuple_gc_hell.py
messages: 74512
nosy: gregory.p.smith
priority: normal
severity: normal
status: open
title: Building a list of tuples has non-linear performance
type: performance
versions: Python 2.5, Python 2.6, Python 2.7
Added file: http://bugs.python.org/file11743/tuple_gc_hell.py
_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4074>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
The attached script simply loops building a list of tuples. It has
horrible performance as the list gets larger compared to something
appending simple objects like ints to the list.
% python tuple_gc_hell.py ~
...
1000000 1.3615500927
2000000 2.95893096924
3000000 4.53531980515
4000000 5.62795209885
5000000 7.70974612236
...
The time it takes to execute grows linearly with the number of tuples
already appended to the list.
Turning on gc.debug(gc.DEBUG_STATS) shows why as does running python
under a profiler:
% cumulative self self total
time seconds seconds calls s/call s/call name
27.85 115.84 115.84 14270 0.01 0.02 collect
21.19 204.01 88.17 1115120314 0.00 0.00 tupletraverse
13.14 258.65 54.64 1655624731 0.00 0.00 visit_reachable
10.72 303.25 44.60 1655624731 0.00 0.00 visit_decref
5.97 328.10 24.85 338 0.07 1.10 PyEval_EvalFrame
It appears the cyclic gc is rechecking all of these tuples repeatedly.
disabling gc during the loop or using a very high gc.set_threshold hides
the problem.
----------
components: Interpreter Core
files: tuple_gc_hell.py
messages: 74512
nosy: gregory.p.smith
priority: normal
severity: normal
status: open
title: Building a list of tuples has non-linear performance
type: performance
versions: Python 2.5, Python 2.6, Python 2.7
Added file: http://bugs.python.org/file11743/tuple_gc_hell.py
_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4074>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com