Performance benchmarks for 3.9
Hi!

I have updated the branch benchmarks on the pyperformance server and they
now include 3.9. Some benchmarks are faster, but on the other hand some are
substantially slower, pointing at a possible performance regression in 3.9
in some areas. In particular, some tests like "unpack_sequence" are almost
20% slower. As there are other tests where 3.9 is faster, it is not fair to
conclude that 3.9 is slower overall, but this is something we should look
into in my opinion.
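
For reference, pyperformance is built on top of pyperf, so a micro-benchmark
of this kind can be approximated locally. A minimal sketch; the unpacking
statement below is my illustration, not necessarily the exact code the suite
runs:

    # unpack_bench.py -- rough local stand-in for the unpack_sequence
    # micro-benchmark (illustrative statement, not the suite's exact code).
    import pyperf

    runner = pyperf.Runner()
    runner.timeit(
        "unpack_sequence",
        stmt="a, b, c, d, e, f, g, h, i, j = t",
        setup="t = tuple(range(10))",
    )

Running this script under python3.8 and python3.9 (with "-o py38.json" and
"-o py39.json") produces result files that pyperf can compare directly.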

You can check the benchmarks I am talking about as follows:

* Go to https://speed.python.org/comparison/
* In the left bar, select "lto-pgo latest in branch '3.9'" and "lto-pgo
latest in branch '3.8'"
* To make the plot easier to read, I recommend selecting a "Normalization"
to the 3.8 branch (this is in the top part of the page) and checking the
"horizontal" checkbox.

These benchmarks are very stable: I have executed them several times over
the weekend, yielding the same results, and, more importantly, they are
executed on a server specially prepared for running reproducible benchmarks:
CPU affinity, CPU isolation, CPU pinning for NUMA nodes, fixed CPU
frequency, CPU governor set to performance mode, IRQ affinity disabled for
the benchmarking CPU nodes, etc., so you can trust these numbers.
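
If you want to approximate such an environment on your own machine, pyperf
ships a helper that applies several of these tunings (assuming a Linux host
and root privileges; the exact effects depend on the pyperf version):

    # Apply benchmarking tunings (performance governor, disabled Turbo,
    # IRQ affinity, etc.) on Linux; requires root.
    python -m pyperf system tune

    # Inspect the current state of the relevant system knobs.
    python -m pyperf system show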

I kindly suggest that everyone interested in improving 3.9 (and master)
performance review these benchmarks, try to identify the problems and fix
them, or find which changes introduced the regressions in the first place.
All benchmarks are the ones executed by the pyperformance suite
(https://github.com/python/pyperformance), so you can execute them locally
if you need to.
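
For example, a local comparison between 3.8 and 3.9 could look roughly like
this (file names are illustrative; see "pyperformance run --help" for the
exact options):

    # Run the suite (or a subset via -b) under each interpreter.
    pyperformance run --python=python3.8 -o py38.json
    pyperformance run --python=python3.9 -o py39.json

    # Compare the two result files.
    python -m pyperf compare_to py38.json py39.json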

---

On a related note, I am also working on the speed.python.org server to
provide more automation and, ideally, some integration with GitHub to
detect performance regressions. For now, I have done the following:

* Recomputed the benchmarks for all branches using the same version of
pyperformance (except master) so they can be compared with each other. This
can only be seen in the "Comparison" tab:
https://speed.python.org/comparison/
* I am setting up daily builds of the master branch so we can detect
performance regressions with daily granularity. These daily builds will be
located in the "Changes" and "Timeline" tabs
(https://speed.python.org/timeline/).
* Once the daily builds are working as expected, I plan to work on
automatically commenting on PRs or on bpo if we detect that a commit has
introduced some notable performance regression (a rough sketch of such a
check follows this list).
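
The detection itself can start out simple: load two pyperf result files and
flag benchmarks whose mean got notably slower. A rough sketch using the
pyperf API, with an arbitrary 10% threshold for illustration (not the value
the server will use):

    # compare_runs.py -- sketch of a regression check between two
    # pyperf/pyperformance result files.
    import pyperf

    THRESHOLD = 1.10  # flag benchmarks more than 10% slower (illustrative)

    ref = pyperf.BenchmarkSuite.load("py38.json")
    new = pyperf.BenchmarkSuite.load("py39.json")

    ref_means = {b.get_name(): b.mean() for b in ref.get_benchmarks()}
    for bench in new.get_benchmarks():
        name = bench.get_name()
        if name in ref_means and bench.mean() > ref_means[name] * THRESHOLD:
            pct = (bench.mean() / ref_means[name] - 1.0) * 100.0
            print(f"possible regression: {name} is {pct:.1f}% slower")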

Regards from sunny London,
Pablo Galindo Salgado.
Re: Performance benchmarks for 3.9
On 10/14/2020 9:16 AM, Pablo Galindo Salgado wrote:

> You can check the benchmarks I am talking about as follows:
>
> * Go to https://speed.python.org/comparison/
> * In the left bar, select "lto-pgo latest in branch '3.9'" and "lto-pgo
> latest in branch '3.8'"

At the moment, there are only results for 'speed-python', none for
'Broadwell-EP'. What do those terms mean?

If one leaves all 5 versions checked, they are mis-ordered 3.9, 3.7,
3.8, 3.6, master. The correct sequence, 3.6 to master, would be easier
to read and interpret. Then pick colors to maximize contrast between
adjacent bars.


> * To make the plot easier to read, I recommend selecting a "Normalization"
> to the 3.8 branch (this is in the top part of the page)

Or either end of whatever sequence one includes.

> and checking the "horizontal" checkbox.

Overall, there have been many substantial improvements since 3.6.

--
Terry Jan Reedy
