Mailing List Archive

[issue41431] Optimize dict_merge for copy
New submission from Inada Naoki <songofacandy@gmail.com>:

Although there are dict.copy() and PyDict_Copy(), dict_merge can also be used to copy a dict:

* d={}; d.update(orig)
* d=dict(orig)
* d=orig.copy() # orig has many dummy keys.
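
For reference, the three forms listed above can be written out as a short Python sketch. The variable names d1, d2 and d3 and the 8-key test dict are illustrative choices, not part of the report. The sketch only shows that each form yields an equal but separate dict; it says nothing about which C code path is taken.

```python
# Illustrative source dict (the size is an arbitrary choice).
orig = dict.fromkeys(range(8))

# 1. Create an empty dict, then merge the original into it.
d1 = {}
d1.update(orig)

# 2. Pass the original to the dict constructor.
d2 = dict(orig)

# 3. Call the copy method.
d3 = orig.copy()

copies = [d1, d2, d3]
all_equal = all(c == orig for c in copies)        # same contents
all_distinct = all(c is not orig for c in copies)  # separate objects
print(all_equal, all_distinct)
```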

----------
components: Interpreter Core
messages: 374550
nosy: inada.naoki
priority: normal
severity: normal
status: open
title: Optimize dict_merge for copy
versions: Python 3.10

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue41431>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue41431] Optimize dict_merge for copy
Change by Inada Naoki <songofacandy@gmail.com>:


----------
keywords: +patch
pull_requests: +20813
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/21669

[issue41431] Optimize dict_merge for copy
Inada Naoki <songofacandy@gmail.com> added the comment:

Microbenchmark for commit cf4f61ce50e07f7ccd3aef991647050c8da058f9.

# timeit -s 'd=dict.fromkeys(range(8))' -- 'dict(d)'
Mean +- std dev: [master] 311 ns +- 2 ns -> [patched] 144 ns +- 1 ns: 2.16x faster (-54%)

# timeit -s 'd=dict.fromkeys(range(1000))' -- 'dict(d)'
Mean +- std dev: [master] 21.6 us +- 0.2 us -> [patched] 7.67 us +- 0.09 us: 2.81x faster (-64%)

# timeit -s 'd=dict.fromkeys(range(8))' -- '{}.update(d)'
Mean +- std dev: [master] 301 ns +- 5 ns -> [patched] 149 ns +- 1 ns: 2.01x faster (-50%)

# timeit -s 'd=dict.fromkeys(range(1000))' -- '{}.update(d)'
Mean +- std dev: [master] 21.4 us +- 0.2 us -> [patched] 7.64 us +- 0.07 us: 2.80x faster (-64%)
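
A rough way to repeat these four measurements on a single interpreter is sketched below with the standard library timeit module. This is only an approximation of the runs above: it reports the best per-loop time rather than a mean and standard deviation, and the helper name best_time and the loop counts are assumptions made for this sketch.

```python
import timeit


def best_time(stmt, setup, number=2000, repeat=5):
    """Return the best observed per-loop time, in seconds, for stmt."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number


# The same statement/setup pairs as in the benchmark lines above.
cases = [
    ("dict(d)", "d=dict.fromkeys(range(8))"),
    ("dict(d)", "d=dict.fromkeys(range(1000))"),
    ("{}.update(d)", "d=dict.fromkeys(range(8))"),
    ("{}.update(d)", "d=dict.fromkeys(range(1000))"),
]

results = {(stmt, setup): best_time(stmt, setup) for stmt, setup in cases}
for (stmt, setup), seconds in results.items():
    print(f"{setup:32} {stmt:14} {seconds * 1e9:12.1f} ns")
```

To compare two builds, the script would have to be run once under each interpreter and the numbers compared by hand.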

----------
stage: patch review ->

[issue41431] Optimize dict_merge for copy
Inada Naoki <songofacandy@gmail.com> added the comment:

To reduce code size, I am considering removing clone_combined_dict. I will check how performance-critical PyDict_Copy() is.

These are the microbenchmark results for d.copy() and dict(d).

$ ./python -m pyperf timeit --compare-to ./python-master -s 'd=dict.fromkeys(range(1000))' -- 'd.copy()'
python-master: ..................... 4.36 us +- 0.07 us
python: ..................... 5.96 us +- 0.10 us

Mean +- std dev: [python-master] 4.36 us +- 0.07 us -> [python] 5.96 us +- 0.10 us: 1.37x slower (+37%)

$ ./python -m pyperf timeit --compare-to ./python-master -s 'd=dict.fromkeys(range(1000))' -- 'dict(d)'
python-master: ..................... 21.6 us +- 0.2 us
python: ..................... 6.01 us +- 0.09 us

Mean +- std dev: [python-master] 21.6 us +- 0.2 us -> [python] 6.01 us +- 0.09 us: 3.59x faster (-72%)
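
The "1.37x slower (+37%)" and "3.59x faster (-72%)" figures can be recomputed from the rounded means shown above. The helper below is a small sketch written for this purpose; the name compare is hypothetical and this is not pyperf's own code. Because it starts from rounded means, the last digit may differ from pyperf's output in other cases.

```python
def compare(ref, new):
    """Describe how the timing `new` relates to the reference timing `ref`."""
    if new >= ref:
        ratio, label = new / ref, "slower"
    else:
        ratio, label = ref / new, "faster"
    # Relative change of the new timing against the reference, in percent.
    percent = (new - ref) / ref * 100
    return f"{ratio:.2f}x {label} ({percent:+.0f}%)"


# d.copy(): 4.36 us on python-master, 5.96 us with the patch.
copy_summary = compare(4.36, 5.96)
# dict(d): 21.6 us on python-master, 6.01 us with the patch.
dict_summary = compare(21.6, 6.01)
print(copy_summary)
print(dict_summary)
```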

----------

[issue41431] Optimize dict_merge for copy
Inada Naoki <songofacandy@gmail.com> added the comment:

PyDict_Copy() is not used in the eval loop or in function calls, so removing clone_combined_dict() is a viable option.

Another option is to use clone_combined_dict() in dict_merge, instead of adding dict_copy2().

Pros: No performance regression. PyDict_Copy() is as fast as before.
Cons: Cannot "fast copy" split dicts or dirty dicts.

I suppose most dicts passed to `dict(d)` or `dict.update(d)` are clean and combined, so I will implement the second option.
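
The condition implied by the second option can be pictured with a toy Python model. Everything here is an assumption made for illustration: DictState, its fields and can_fast_copy are hypothetical names, and "clean" is taken to mean "no dummy entries left by deletions". The model also assumes the target must be empty for a merge to count as a copy. It is not the actual C code.

```python
from dataclasses import dataclass


@dataclass
class DictState:
    """Hypothetical, simplified view of a dict's internal layout."""
    used: int       # number of live items
    nentries: int   # entry slots consumed so far, including dummies
    is_split: bool  # True for a split (key-sharing) table

    @property
    def is_clean(self) -> bool:
        # No dummy entries: every consumed slot still holds a live item.
        return self.used == self.nentries


def can_fast_copy(target: DictState, source: DictState) -> bool:
    """True when merging source into target is a plain copy of a clean, combined dict."""
    target_is_empty = target.used == 0
    return target_is_empty and not source.is_split and source.is_clean


empty = DictState(used=0, nentries=0, is_split=False)
clean = DictState(used=1000, nentries=1000, is_split=False)
dirty = DictState(used=500, nentries=1000, is_split=False)
split = DictState(used=3, nentries=3, is_split=True)

fast_clean = can_fast_copy(empty, clean)  # clean and combined: fast path
fast_dirty = can_fast_copy(empty, dirty)  # has dummy entries: generic merge
fast_split = can_fast_copy(empty, split)  # split table: generic merge
print(fast_clean, fast_dirty, fast_split)
```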

----------

[issue41431] Optimize dict_merge for copy
Change by Inada Naoki <songofacandy@gmail.com>:


----------
pull_requests: +20819
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/21674

[issue41431] Optimize dict_merge for copy
Change by Serhiy Storchaka <storchaka+cpython@gmail.com>:


----------
nosy: +serhiy.storchaka, yselivanov

[issue41431] Optimize dict_merge for copy
Inada Naoki <songofacandy@gmail.com> added the comment:


New changeset db6d9a50cee92c0ded7c5cb87331c5f0b1008698 by Inada Naoki in branch 'master':
bpo-41431: Optimize dict_merge for copy (GH-21674)
https://github.com/python/cpython/commit/db6d9a50cee92c0ded7c5cb87331c5f0b1008698


----------

[issue41431] Optimize dict_merge for copy
Change by Inada Naoki <songofacandy@gmail.com>:


----------
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
type: -> performance
