Mailing List Archive

[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor
New submission from Daniel Diniz <ajaksu@gmail.com>:

As suggested by Paolo Giarrusso, add a "likely()" to Py_DECREF, telling
GCC it doesn't call the destructor in the common path.

This optimization seems to interact well with #4896 and to be slower on
top of #4753 on my system. Patching ceval.c instead of object.h seems to
give the same result, FWIW. Benchmarks most welcome :)

Paolo Giarrusso wrote:
"Anyway, yesterday I was looking at the assembly and I realized that
GCC optimizes the case where Py_DECREF calls the destructor! Since I
don't have time for this, can you hint GCC for the better by adding
unlikely() to Py_DECREF, at least (or maybe only) for ceval.c, or
maybe even only for the main interpreter?"

pybench:
Rounds: 10
Warp: 10
Timer: time.time

Machine Details:
Platform ID: Linux-2.6.24-23-generic-i686-with-debian-lenny-sid
Processor: Intel Celeron M 410

Python:
Implementation: CPython
Executable: /home/ajaksu/py3k/python
Version: 3.1.0
Compiler: GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)
Bits: 32bit
Build: Jan 13 2009 23:40:05 (#py3k:68571M)
Unicode: UCS2


Test minimum run-time average run-time
this other diff this other
diff
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 147ms 163ms -9.7% 148ms 164ms
-10.0%
BuiltinMethodLookup: 138ms 154ms -10.3% 139ms 155ms
-10.6%
CompareFloats: 185ms 191ms -3.4% 185ms 192ms
-3.4%
CompareFloatsIntegers: 304ms 294ms +3.1% 304ms 296ms
+2.7%
CompareIntegers: 287ms 281ms +2.0% 288ms 281ms
+2.3%
CompareInternedStrings: 189ms 217ms -12.7% 192ms 217ms
-11.7%
CompareLongs: 167ms 162ms +3.3% 168ms 162ms
+3.3%
CompareStrings: 152ms 178ms -14.2% 156ms 179ms
-12.7%
ComplexPythonFunctionCalls: 192ms 202ms -5.1% 193ms 206ms
-6.2%
ConcatStrings: 325ms 327ms -0.7% 326ms 330ms
-1.1%
CreateInstances: 200ms 209ms -4.4% 204ms 210ms
-2.9%
CreateNewInstances: 149ms 156ms -4.4% 151ms 161ms
-5.8%
CreateStringsWithConcat: 255ms 272ms -6.1% 258ms 275ms
-6.3%
DictCreation: 145ms 151ms -4.0% 146ms 154ms
-5.3%
DictWithFloatKeys: 209ms 224ms -6.8% 209ms 224ms
-6.7%
DictWithIntegerKeys: 170ms 185ms -8.0% 171ms 187ms
-8.3%
DictWithStringKeys: 155ms 172ms -10.1% 156ms 173ms
-10.2%
ForLoops: 162ms 183ms -11.7% 162ms 183ms
-11.4%
IfThenElse: 189ms 217ms -12.7% 190ms 219ms
-13.1%
ListSlicing: 156ms 156ms -0.3% 156ms 158ms
-1.0%
NestedForLoops: 199ms 219ms -9.1% 201ms 220ms
-8.6%
NestedListComprehensions: 222ms 245ms -9.5% 228ms 251ms
-9.2%
NormalClassAttribute: 352ms 376ms -6.2% 353ms 378ms
-6.6%
NormalInstanceAttribute: 188ms 212ms -11.1% 191ms 215ms
-11.3%
PythonFunctionCalls: 173ms 192ms -9.6% 174ms 193ms
-9.7%
PythonMethodCalls: 213ms 226ms -5.6% 214ms 227ms
-5.8%
Recursion: 275ms 295ms -7.0% 276ms 296ms
-7.0%
SecondImport: 158ms 158ms +0.2% 160ms 160ms
+0.1%
SecondPackageImport: 164ms 168ms -2.2% 166ms 170ms
-2.8%
SecondSubmoduleImport: 221ms 226ms -2.4% 223ms 229ms
-2.9%
SimpleComplexArithmetic: 143ms 166ms -13.5% 144ms 166ms
-13.4%
SimpleDictManipulation: 269ms 296ms -9.2% 270ms 297ms
-9.0%
SimpleFloatArithmetic: 150ms 177ms -15.3% 151ms 196ms
-23.2%
SimpleIntFloatArithmetic: 200ms 226ms -11.4% 203ms 232ms
-12.4%
SimpleIntegerArithmetic: 200ms 226ms -11.4% 202ms 235ms
-13.9%
SimpleListComprehensions: 186ms 211ms -12.0% 193ms 218ms
-11.5%
SimpleListManipulation: 161ms 180ms -10.6% 161ms 181ms
-10.9%
SimpleLongArithmetic: 137ms 159ms -14.0% 137ms 162ms
-15.2%
SmallLists: 197ms 209ms -5.6% 201ms 210ms
-4.2%
SmallTuples: 215ms 234ms -8.2% 216ms 236ms
-8.2%
SpecialClassAttribute: 544ms 566ms -4.0% 549ms 571ms
-3.7%
SpecialInstanceAttribute: 188ms 211ms -11.0% 190ms 214ms
-11.0%
StringMappings: 432ms 439ms -1.6% 433ms 441ms
-1.9%
StringPredicates: 183ms 198ms -7.8% 183ms 199ms
-7.9%
StringSlicing: 304ms 317ms -4.2% 310ms 321ms
-3.3%
TryExcept: 138ms 154ms -10.5% 139ms 155ms
-10.4%
TryFinally: 149ms 156ms -4.8% 151ms 159ms
-5.1%
TryRaiseExcept: 76ms 78ms -2.3% 77ms 82ms
-6.3%
TupleSlicing: 254ms 254ms -0.2% 254ms 257ms
-1.0%
WithFinally: 204ms 207ms -1.1% 206ms 209ms
-1.1%
WithRaiseExcept: 187ms 190ms -2.0% 188ms 192ms
-2.2%
-------------------------------------------------------------------------------
Totals: 10557ms 11266ms -6.3% 10647ms 11397ms
-6.6%

(this=likely_object.pybench, other=classic_opcode.pybench)

----------
components: Interpreter Core
files: likely_decref.diff
keywords: patch
messages: 79821
nosy: ajaksu2, blaisorblade, pitrou
severity: normal
status: open
title: Tell GCC Py_DECREF is unlikely to call the destructor
type: performance
versions: Python 3.0, Python 3.1
Added file: http://bugs.python.org/file12731/likely_decref.diff

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Antoine Pitrou <pitrou@free.fr> added the comment:

I get a slowdown here (results attached). Many objects are shortlived in
Python, so I'm not sure the approach is valid anyway.

Added file: http://bugs.python.org/file12732/pybench.txt

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Paolo 'Blaisorblade' Giarrusso <p.giarrusso@gmail.com> added the comment:

If speedup on other machines are confirmed, a slowdown of less than 2%
should be acceptable (it's also similar to the statistic noise I have on
my machine - it's a pity pybench does not calculate the standard
deviation of the execution time).

It is true that many objects are short lived, and that makes
generational GC work, but the patch is about another point. Do most
refcount decrements lead to garbage collection? That's a completely
different question. And I don't think that popping a function call
argument usually leads to a destructor invocation, for instance, unless
the argument is a temporary. The same reasoning holds for most other
opcodes.

The simplest way to check this is to look at the compilation output with
Profile-Guided Optimization active and verify which branch is optimized.

Having said that, the slowdown you get (on your Athlon X2 3600+) might
be due to a higher cost in case of failed static prediction, since that
leads to 2 jumps.

This code leads to 0 or 1 jump:

if (likely(cond)) {
destructor
}

while with unlikely, one runs through either 0 or 2 jumps. The second
jump is unconditional, so it has a much smaller delay (the decode unit
has to tell the fetch unit to fetch another instruction). However, in
that case one has to invoke anyway the destructor through a virtual
call, which is much more expensive.

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Martin v. Löwis <martin@v.loewis.de> added the comment:

I dislike about this patch that there are two different else cases
(giving the trivial macro definition); there should only be one such case.

----------
nosy: +loewis

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Paolo 'Blaisorblade' Giarrusso <p.giarrusso@gmail.com> added the comment:

Also, GCC 2.95 does not support the construct, GCC 2.96 is required.
So, I'd suggest defining likely/unlikely unconditionally and using this,
which leads to less code overall:

# if (__GNUC__ < 2) || ((__GNUC__ == 2) && (__GNUC_MINOR__ <= 95))
# define __builtin_expect(exp, prediction) (exp)
# endif

This is the simplest way to code this, assuming that ICC defines
__GNUC__, and my reference guide tells "yes, it defines __GNUC__ and
__INTEL_COMPILER".

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Changes by Collin Winter <collinw@gmail.com>:


----------
nosy: +collinwinter, jyasskin

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Antoine Pitrou <pitrou@free.fr> added the comment:

You could also test this principle with a few opcodes in ceval.c.
Especially, the error cases in LOAD_FAST, LOAD_GLOBAL etc. could be
flagged as unlikely.

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Paolo 'Blaisorblade' Giarrusso <p.giarrusso@gmail.com> added the comment:

Yep, agreed. It could be quite cool to rely on __attribute__((cold)) to
mark format_exc_check_arg(), but that only works with newer gcc's. I
guess I'll add many likely() by hand - in some cases GCC might already
get it right, but the heuristics used to choose the likely path
statically are quite arguable and subject to change. Also, on other
archs such hints might have a bigger impact on the generated code.

I'll give a look at that not earlier than February, or you're welcome to
try this. However, at least on x86, an "if (successCondition) {
success(); dispatch(); } failure();" will be probably compiled to
assembly like follows, equivalent to adding likely to the test:

evaluate successCondition
if_false goto err:
call to success();
dispatch next instrutions.

err:
call to failure();

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Changes by Daniel Diniz <ajaksu@gmail.com>:


Removed file: http://bugs.python.org/file12731/likely_decref.diff

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Daniel Diniz <ajaksu@gmail.com> added the comment:

New version of the patch, as well as one that doesn't touch Py_DECREF
but defines likely and unlikely. Then, three ceval patches that result
in small speedups (2% to 8% here), but don't play well together (neither
with the Py_DECREF one).

Since I get speedups for some changes and slowdowns when more than one
is applied, I suspect my system is hitting some limitation (register
stavation?) and contributing more noise that signal :/

Added file: http://bugs.python.org/file12813/likely_decref.diff

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Changes by Daniel Diniz <ajaksu@gmail.com>:


Added file: http://bugs.python.org/file12814/likely_object.diff

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Changes by Daniel Diniz <ajaksu@gmail.com>:


Added file: http://bugs.python.org/file12815/ceval_exception.diff

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Changes by Daniel Diniz <ajaksu@gmail.com>:


Added file: http://bugs.python.org/file12816/ceval_tuple_unpack.diff

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Changes by Daniel Diniz <ajaksu@gmail.com>:


Added file: http://bugs.python.org/file12817/ceval_function.diff

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Jeffrey Yasskin <jyasskin@gmail.com> added the comment:

Minor comments on likely_object.diff:
* Py_LIKELY() and Py_UNLIKELY() would be better spellings than likely()
and unlikely().
* The definitions should go in pyport.h instead of object.h
* Usually don't put spaces after the "#".
* Probably #if the definitions of Py_LIKELY and Py_UNLIKELY instead of
__builtin_expect so new compilers can easily add their own definitions.

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue4941] Tell GCC Py_DECREF is unlikely to call the destructor [ In reply to ]
Paolo 'Blaisorblade' Giarrusso <p.giarrusso@gmail.com> added the comment:

>Probably #if the definitions of Py_LIKELY and Py_UNLIKELY instead of
__builtin_expect so new compilers can easily add their own definitions.

This was done in the first version, but with the currently supported
compilers it's simpler to do like that, because both GCC and ICC support
the same __builtin_expect syntax, so you get less code this way.

Anyway, the code was "inspired" from the Linux kernel which only
supports those two compilers, so anybody more knowledgeable is welcome
to suggest how to express this with other compilers.

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue4941>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com