On Tue, Oct 20, 2020 at 5:59 AM Mark Shannon <mark@hotpy.org> wrote:
> Hi everyone,
>
> CPython is slow. We all know that, yet little is done to fix it.
>
> I'd like to change that.
> I have a plan to speed up CPython by a factor of five over the next few
> years. But it needs funding.
>
> I am aware that there have been several promised speed ups in the past
> that have failed. You might wonder why this is different.
>
> Here are three reasons:
> 1. I already have working code for the first stage.
> 2. I'm not promising a silver bullet. I recognize that this is a
> substantial amount of work and needs funding.
> 3. I have extensive experience in VM implementation, not to mention a
> PhD in the subject.
>
> My ideas for possible funding, as well as the actual plan of
> development, can be found here:
>
> https://github.com/markshannon/faster-cpython
>
> I'd love to hear your thoughts on this.
>
> Cheers,
> Mark.
>
+1
Overall I think you are making quite a reasonable proposal. It sounds
like you'd effectively be bringing your hotpy2 concepts into the CPython
interpreter, with the intent to help maintain them over the long term.
People worried that you are doing this out of self-interest may not know
who you are. Sure, you want to be paid to do work that you appear to love
and have been mulling over for a decade+. There is nothing wrong with
that. Payment is proposed as on-delivery, per phase. I like the sound of
that, nice!
Challenges I expect we'll face, which seem tractable to me, are mostly
around the roadblocks that we (existing python-committers, and our
ultimate decider, the Steering Council) might introduce, intentionally or
not, that prevent landing such work. Payment on delivery helps that a lot:
if we opt out of some work, the loss falls on both sides. One potential
outcome is that you'd burn out and go away if we didn't accept something,
meaning payment wasn't going to happen. That already happens among core
devs today, so I don't have a problem with this, even though it isn't what
we'd rightfully want to happen. Middle grounds could be reached through
quite reasonable renegotiation. The deciders on this would be the PSF
(because money), and the PSF would presumably involve the Steering Council
in decisions of terms and judgements.
Some background material, for those who don't already know the past work
of Mark's along these lines that this proposal grew out of:
https://sites.google.com/site/makingcpythonfast/ (hotpy)
and the associated presentation from 2012:
https://www.youtube.com/watch?v=c6PYnZUMF7o

The amount of money seems entirely reasonable to me. Were it to be taken
on, part of the PSF's job is to drum up the money. This would be a contract
with outcomes that could effectively be sold to funders in order to do so.
There are many companies who use CPython a lot that we could solicit
funding from, many of whom have employees among our core devs already. Will
they bite? It doesn't even have to be from a single source or be all
proposed phases up front, that is what the PSF exists to decide and
coordinate on.
We've been discussing on and off over the past many years how to pay
people for focused work on CPython and the ecosystem, and how to balance
that with being an open source community project. We already have some
people employed along these lines; this would become more of that, and in
many ways it just makes sense to me.
Summarizing some responses to points others have brought up:
Performance estimates:
* Don't fret about claimed speedups of each phase. We're all going to
doubt different things or expect others to be better. The proof is
ultimately in the future pudding.
JIT topics:
* JITs rarely stand alone. The phase 1+2 improved interpreter will always
exist. It is normal to start with an interpreter, for fast startup and
initial tracing, before performing JIT compilation, and to keep it as a
fallback mechanism when the JIT isn't appropriate or available. (My
background: Transmeta. We had an interpreter and at least two levels of
translators behind our x86-compatible CPUs; all were necessary.)
* Sometimes you simply want to turn the tracing and JIT machinery off to
save memory. That knob always winds up existing. If nothing else, it is
normal to run our test suite with such a knob in multiple positions for
proper coverage.
* It is safe to assume any later-phase actual JIT would target at least
one important platform (e.g. amd64 or aarch64) and, if successful, should
easily elicit contributions supporting others, either as work or as
funding to create it.
"*Why this, why not fund XyZ?*" whataboutism:
* This conversation is separate from other projects. Attracting funding
for a project can involve spelling out exactly what it is for. It isn't my
decision, but I'd be amazed if anything beyond maybe phase 1 came solely
out of a general, no-obligation PSF fund. CPython is the most-used Python
VM in the world; a small amount of funding is not going to get maintainers
and users to switch to PyPy. A major either/or situation is unlikely here.
Unladen Swallow:
* That was a fixed-time, one-year attempt at speeding up CPython by
Google. IIRC CPython's computed-goto support came out of that(?), as did a
ton of improvements to LLVM internals that we don't see in python-dev
land, as LLVM was not yet anywhere near ready to take on dynamic-language
VMs at the time. In the end, the LLVM-backed side was not deemed
maintainable or necessarily a win, so it was not accepted by us and was
shelved. It wasn't a clear win, and it carried a very complicated
cross-project maintenance burden. I still work with many of the people
involved in that project, at least one of whom works full time on LLVM
today. Nobody involved that I'm aware of is bitter about it. It was a
fixed-time experiment, and a few projects got some good out of it. Another
reason it did not continue: the motivating internal application we
attempted Unladen Swallow for ultimately found they were more memory- than
CPU-constrained in terms of their compute resource planning... 5-6 years
ago, an attempt at getting the same internal application up and running on
PyPy (which led to many contributions to PyPy's cpyext) ran in part into
that memory constraint. (*There were more issues with PyPy, cpyext vs.
performance being but one; this isn't the place for that and I'm not the
right person to ask.*)
meta: i've written too many words and edited so often i can't see my own
typos and misedits anymore. i'll stop now. :)
-gps