Mailing List Archive

Slowly bend the C API towards the limited API to get a stable ABI for everyone
Hi,

There is a reason why I'm bothering C extensions maintainers and
Python core developers with my incompatible C API changes since Python
3.8. Let me share my plan with you :-)


In 2009 (Python 3.2), Martin v. Löwis did an amazing job with the PEP
384 "Defining a Stable ABI" to provide a "limited C API" and a "stable
ABI" for C extensions: build an extension once, use it on multiple
Python versions. Some projects like PyQt5 and cryptography use it, but
it is just a drop in the PyPI ocean (353,084 projects). I'm trying to
bend the "default" C API towards this "limited C API" to make it
possible tomorrow to build *more* C extensions for the stable ABI.

My goal is for the stable ABI to become the default, with only a
minority of C extensions opting out because they need access to
more functions for best performance.

The basic problem is that at the ABI level, C extensions must only
call functions, rather than getting and setting structure members
directly. Structures change frequently in Python (look at the changes
between Python 3.2 and Python 3.11), and any minor structure change
breaks the ABI. The limited C API hides structures and only uses
function calls to solve this problem.
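
To make the ABI problem concrete, here is a minimal sketch. It is a
toy model, not CPython code: the toy_object_v1/toy_object_v2 structs
and both functions are made-up names. It shows why an offset compiled
into an extension goes stale when a member is inserted, while an
accessor rebuilt with the runtime keeps working.

```c
#include <stddef.h>

/* Toy model only: hypothetical stand-ins, not CPython's structures. */

/* Layout used by "runtime version 1". */
typedef struct {
    long refcnt;
    const char *type_name;
} toy_object_v1;

/* Layout used by "runtime version 2": a member was inserted. */
typedef struct {
    long refcnt;
    const char *doc;        /* new member */
    const char *type_name;  /* now lives at a different offset */
} toy_object_v2;

/* What an extension built against v1 effectively does for
   "ob->type_name": it reads at the v1 offset, fixed at compile time. */
static const char *ext_type_name_direct(const void *ob)
{
    const char *base = (const char *)ob;
    return *(const char *const *)(base + offsetof(toy_object_v1, type_name));
}

/* Accessor that lives in the v2 runtime: it is rebuilt together with
   the runtime, so it always matches the current layout. */
static const char *runtime_type_name_v2(const toy_object_v2 *ob)
{
    return ob->type_name;
}
```

Passing a v2 object to ext_type_name_direct() yields the "doc" member
instead of the type name, whereas the accessor returns the right
member. That is the kind of silent breakage a function-only ABI avoids.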


Since 2020, I have been modifying the C API, one function at a time, to
slowly hide implementation details (preparing the API to make structures
opaque). I focused on the following structures:

* PyObject and PyVarObject (bpo-39573)
* PyTypeObject (bpo-40170)
* PyFrameObject (bpo-40421)
* PyThreadState (bpo-39947)

The majority of C extensions use functions and macros; they don't
access structure members directly. A few members are sometimes
accessed directly, which prevents making these structures opaque. For
example, some old C extensions use obj->ob_type rather than
Py_TYPE(obj). Fixing this minority of C extensions should benefit the
majority, which may then become compatible with the stable ABI.

I am also converting macros to static inline functions to fix their
API: define parameter types and the result type, and avoid surprising
macro side effects ("macro pitfalls"). I wrote PEP 670 "Convert macros
to functions in the Python C API" for these changes.
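
As an illustration of one classic macro pitfall, here is a small
sketch. TOY_MAX_MACRO, toy_max_func and toy_next are made-up names for
this example, not part of the Python C API. The macro evaluates its
argument twice, the static inline function evaluates it once and has
typed parameters.

```c
/* Hypothetical macro, for illustration only: "a" or "b" is evaluated
   twice, once in the comparison and once in the selected branch. */
#define TOY_MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* Same operation as a static inline function: each argument is
   evaluated exactly once, and the types are part of the signature. */
static inline int toy_max_func(int a, int b)
{
    return a > b ? a : b;
}

/* Helper with a visible side effect, used to count evaluations. */
static int toy_calls = 0;

static int toy_next(void)
{
    toy_calls += 1;
    return toy_calls;
}
```

With toy_calls reset to 0, TOY_MAX_MACRO(toy_next(), 0) calls
toy_next() twice, while toy_max_func(toy_next(), 0) calls it once.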


I wrote the upgrade_pythoncapi.py tool in my pythoncapi_compat project (*)
which modifies C code to use Py_TYPE(), Py_SIZE() and Py_REFCNT() rather
than accessing PyObject and PyVarObject members directly.

(*) https://github.com/pythoncapi/pythoncapi_compat

In this tool, I also added "Borrow" variants of functions like
PyFrame_GetCode(), which returns a strong reference, so that
frame->f_code can be replaced with _PyFrame_GetCodeBorrow(). In Python
3.11, you cannot use the frame->f_code member anymore, since it has
been removed! You must call PyFrame_GetCode() (or the pythoncapi_compat
_PyFrame_GetCodeBorrow() variant).


There are also a few macros which can be used as l-values, like
Py_TYPE(): "Py_TYPE(type1) = type2" must now be written
"Py_SET_TYPE(type1, type2)" to avoid setting the ob_type member
directly at the ABI level. I proposed PEP 674 "Disallow using Py_TYPE()
and Py_SIZE() macros as l-values" to solve these issues.
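
The l-value issue can be sketched with a toy model. The names toy_obj,
TOY_TYPE_OLD, toy_type and toy_set_type are invented for this example
and only mimic the shape of Py_TYPE() and Py_SET_TYPE().

```c
/* Toy object with a single member; not CPython's PyObject. */
typedef struct {
    int type_id;
} toy_obj;

/* Old style: the macro expands to the member itself, so
   "TOY_TYPE_OLD(ob) = 2" compiles and writes the member straight from
   extension code. */
#define TOY_TYPE_OLD(ob) ((ob)->type_id)

/* New style: the getter is a function, and a function call result is
   not an l-value, so "toy_type(ob) = 2" is rejected by the compiler.
   Writes have to go through the explicit setter. */
static inline int toy_type(const toy_obj *ob)
{
    return ob->type_id;
}

static inline void toy_set_type(toy_obj *ob, int type_id)
{
    ob->type_id = type_id;
}
```

Once all writes go through a setter, the implementation behind the
setter can change without touching extension source code.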


Currently, many "functions" are still implemented as macros or static
inline functions, so C extensions still access structure members at
the ABI level for best Python performance. Converting these to regular
functions has an impact on performance and I would prefer to first
write a PEP giving the rationale for that.


Today, it is not yet possible to build numpy for the stable ABI. The
gap is just too large for this big C extension. But step by step, the
C API is getting closer to the limited API, and more and more code is
ready to be built for the stable ABI.


Well, these C API changes have other advantages, like preparing Python
for further optimizations, easing Python maintenance, clarifying the
separation between the limited C API and the default C API, etc. ;-)

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/DN6JAK62ZXZUXQK4MTGYOFEC67XFQYI5/
Code of Conduct: http://python.org/psf/codeofconduct/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
Wait, where is the HPy project in that plan? :-) The HPy project
(brand new C API) is a good solution for the long term!

My concern about HPy right now is that, in short, CPython has to
continue supporting the C API for a few more years, and we cannot
evolve CPython until it becomes reasonable to consider removing
the "legacy" C API.

I explained that in detail in PEP 674 (Disallow using Py_TYPE()
and Py_SIZE() macros as l-values):
https://www.python.org/dev/peps/pep-0674/#relationship-with-the-hpy-project

In parallel, we should continue promoting the usage of Cython, cffi,
pybind11 and HPy, rather than using the C API directly.

Victor
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TEC4SRHT36KAHB4GB6FEXVGGWXK4KXTI/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
Does HPy have any clear guidance or assistance for their users to keep
it up to date?

I'm concerned that if we simply substitute "support the C API for
everyone" with "support the C API for every version of HPy" we're no
better off.

I think it can be done with clear communication from the HPy project
(and us when we endorse it) that they will *never* break compatibility
and it's *always* safe (and indeed, essential) for their users to use
the latest version. But that's a big commitment that I can't sign them
up for.

Cython seems to manage it okay. I can't remember the last compat issue I
had there that wasn't on our (C-API) side.

Thoughts?

Cheers,
Steve

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KHJ3IBDAM7OJNECT33FIZBDN3N5HMWYN/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On Jan 28, 2022, at 09:00, Steve Dower <steve.dower@python.org> wrote:
>
> Does HPy have any clear guidance or assistance for their users to keep it up to date?
>
> I'm concerned that if we simply substitute "support the C API for everyone" with "support the C API for every version of HPy" we're no better off.

Will it ever make sense to pull HPy into the CPython repo so that they evolve together? I can see advantages and disadvantages. If there’s a point in the future where we can just start promoting HPy as an official alternative C API, then it will likely get more traction over time. The disadvantage is that HPy would evolve at the same annual pace as CPython.

-Barry
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
I think we will get *one* chance in the next decade to get it right.
Whether that's HPy or evolution of the C API I'm not sure.

Victor, am I right that the (some) stable ABI will remain important because
projects don't have resources to build wheels for every Python release? If
a project does R releases per year for P platforms that need to support V
versions of Python, they would normally have to build R * P * V wheels.
With a stable ABI, they could reduce that to R * P. That's the key point,
right?
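
The arithmetic can be written down as a tiny sketch. The function
names and the sample numbers used with them are made up for
illustration; only the R * P * V and R * P formulas come from the
paragraph above.

```c
/* Wheels to build per year when every Python version needs its own
   binary: R releases * P platforms * V Python versions. */
static unsigned wheels_version_specific(unsigned r, unsigned p, unsigned v)
{
    return r * p * v;
}

/* Wheels to build per year when one stable-ABI binary per platform
   covers all supported Python versions: R releases * P platforms. */
static unsigned wheels_stable_abi(unsigned r, unsigned p)
{
    return r * p;
}
```

For example, with 4 releases per year, 3 platforms and 5 supported
Python versions, that is 60 wheels versus 12.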

Can HPy do that?



--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On 1/28/2022 5:15 PM, Barry Warsaw wrote:
> On Jan 28, 2022, at 09:00, Steve Dower <steve.dower@python.org> wrote:
>>
>> Does HPy have any clear guidance or assistance for their users to keep it up to date?
>>
>> I'm concerned that if we simply substitute "support the C API for everyone" with "support the C API for every version of HPy" we're no better off.
>
> Will it ever make sense to pull HPy into the CPython repo so that they evolve together? I can see advantages and disadvantages. If there’s a point in the future where we can just start promoting HPy as an official alternative C API, then it will likely get more traction over time. The disadvantage is that HPy would evolve at the same annual pace as CPython.

Possibly, but we'd have to be really careful to not actually *evolve*
HPy. It would essentially be a new stable API, but ideally one that uses
all the preprocessor tricks we can (and perhaps runtime tricks) to
compile against any CPython version rather than just the one that it
comes with.

PSF "ownership" is probably enough to make it official (for those people
who need everything to be "official"). I don't think that's necessary,
but it does smooth the path for some people to be willing to use it.

Cheers,
Steve
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/XCJH743KWSZ6436PU7RLG4NCX62ELTPR/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
> Does HPy have any clear guidance or assistance for their users to keep
> it up to date?

Not right now, because we are still somewhat in alpha mode and sometimes we redesign the API and/or break compatibility. But the plan is of course to stabilize at some point.

> I think it can be done with clear communication from the HPy project
> (and us when we endorse it) that they will *never* break compatibility
> and it's *always* safe (and indeed, essential) for their users to use
> the latest version. But that's a big commitment that I can't sign them
> up for.

I think this will be doable once HPy is mature enough, and I also agree that any kind of official endorsement from CPython and/or the PSF will help the adoption of HPy a lot.

ciao,
Antonio
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Y3F2IIZLS7ETQZM3NQMJL47EEGVU3S2R/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
> If a project does R releases per year for P platforms that need to support V
> versions of Python, they would normally have to build R * P * V wheels.
> With a stable ABI, they could reduce that to R * P. That's the key point,
> right?

> Can HPy do that?

Actually, it can do even better than that. When you compile an HPy extension you can choose which ABI to target:
- CPython ABI: in this modality, all HPy_* calls are statically translated (using static inline functions) into the corresponding Py_* call, and it generates modules like foo.cpython-38-x86_64-linux-gnu.so, indistinguishable from a "normal" module

- HPy universal ABI: in this modality, it generates something like foo.hpy-x86_64-linux-gnu.so: all API calls are done through the HPyContext (which is basically a giant vtable): this module can be loaded by any implementation which supports the HPy universal ABI, including CPython, PyPy and GraalPython.

The main drawback of the universal ABI is that it's slightly slower because it goes through the vtable indirection for every call, in particular HPy_Dup/HPy_Close which are mapped to Py_INCREF/Py_DECREF. Some early benchmarks indicate a 5-10% slowdown. We haven't benchmarked it against the stable ABI though.
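
To give a feeling for the "context as a vtable" idea, here is a toy
sketch. It is not the real HPy API: toy_ctx, ext_sum3 and the two
impl_* contexts are invented names. The point is only that the
extension code is compiled once and every call goes through a function
pointer supplied by whichever implementation loads it.

```c
/* Toy model of a context "vtable"; not the real HPyContext. */
typedef struct toy_ctx toy_ctx;

struct toy_ctx {
    const char *impl_name;
    long (*add)(toy_ctx *ctx, long a, long b);
};

/* Extension code: compiled once, it only ever calls through the
   context and never touches implementation internals. */
static long ext_sum3(toy_ctx *ctx, long a, long b, long c)
{
    return ctx->add(ctx, ctx->add(ctx, a, b), c);
}

/* Two hypothetical implementations, each supplying its own function. */
static long impl_a_add(toy_ctx *ctx, long a, long b)
{
    (void)ctx;
    return a + b;
}

static long impl_b_add(toy_ctx *ctx, long a, long b)
{
    (void)ctx;
    return b + a;
}

static toy_ctx impl_a_ctx = { "impl-a", impl_a_add };
static toy_ctx impl_b_ctx = { "impl-b", impl_b_add };
```

The same ext_sum3() works with either context. The extra pointer
dereference on each call is the indirection cost mentioned above.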

Of course, in order to be fully usable, the HPy universal ABI will need special support by PyPI/pip/etc, because at the moment it is impossible to package it inside a wheel, AFAIK.

ciao,
Antonio
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/VUUDJLP5NHM3XJJGDFPMLFEEVAEYYXH2/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On 1/28/2022 6:17 PM, Antonio Cuni wrote:
> Of course, in order to be fully usable, the HPy universal ABI will need special support by PyPI/pip/etc, because at the moment it is impossible to package it inside a wheel, AFAIK.

It's totally possible, it's just that none of the existing tools will
automatically generate the tags you need. (These are most critical in
the filename itself, and also appear in 1-2 bits of metadata that
currently are unused AFAIK.)

Basically, instead of just "cp310" (or "abi3", etc.), you'll want to use
dots to separate each supported version ("cp38.cp39.cp310"). That will
match the wheel to any of those versions.
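
As a rough sketch of how such a dotted tag set can be matched
(tag_set_contains is a made-up helper, and real installers do more
than this, e.g. they combine Python, ABI and platform tags):

```c
#include <stdbool.h>
#include <string.h>

/* Return true if "tag" is one of the dot-separated entries in
   "tag_set", e.g. "cp39" in "cp38.cp39.cp310". Entries must match
   exactly: "cp31" does not match "cp310". */
static bool tag_set_contains(const char *tag_set, const char *tag)
{
    size_t tag_len = strlen(tag);
    const char *p = tag_set;

    while (*p != '\0') {
        const char *dot = strchr(p, '.');
        size_t len = dot ? (size_t)(dot - p) : strlen(p);

        if (len == tag_len && strncmp(p, tag, len) == 0) {
            return true;
        }
        if (dot == NULL) {
            break;          /* last entry checked */
        }
        p = dot + 1;        /* move past the dot to the next entry */
    }
    return false;
}
```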

You can even do the same with OS platforms if you prefer fewer/bigger
wheels over more platform-specific ones.

Python on all platforms since IIRC 3.6 (maybe 3.5?) also has version
and platform-specific tags in extension modules. These do not support
combining tags as in wheels (and unfortunately do not match wheel tags
at all), but do allow you to have version/platform-specific
.pyd/.dylib/.so files in a single wheel. Again, it's just that none of
the current build backends will help you do it.

Cheers,
Steve
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/MLJXBKETEY7YJIPBRKKOVWKS6HKN2PMG/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On Fri, Jan 28, 2022 at 6:28 PM Guido van Rossum <guido@python.org> wrote:
> I think we will get *one* chance in the next decade to get it right. Whether that's HPy or evolution of the C API I'm not sure.

Would you mind elaborating? Which risks do you expect from switching
to HPy and from fixing the C API (introducing incompatible C API
changes)?

For me, promoting HPy and evolving the C API are complementary: they
can and must be done in parallel. As I explained in PEP 674, while
HPy does help C extension writers, it doesn't solve any problem for
CPython right now. CPython is still blocked by implementation details
leaked through the C API that we must still maintain for a few more
years.


> Victor, am I right that the (some) stable ABI will remain important because projects don't have resources to build wheels for every Python release? If a project does R releases per year for P platforms that need to support V versions of Python, they would normally have to build R * P * V wheels. With a stable ABI, they could reduce that to R * P. That's the key point, right?

There are different use cases.

1) First, my main worry is that we put a lot of pressure on maintainers
of the most important Python dependencies before the release of a new
Python version, because we want them to handle the flow of incompatible
C API changes before the final Python 3.x version is released, so that
their packages are available when Python 3.x final is released.

It annoys core developers who cannot change things in Python without
getting an increasing number of complaints about a large number of
broken packages, sometimes with a request to revert.

It annoys C extension maintainers who have to care about Python alpha
and beta releases which are not convenient to use (ex: not available
in Linux distributions). Moreover, it has become common to ask for
multiple changes and multiple releases before a Python final release,
since more incompatible changes are introduced in Python (before the
beta1).

2) Second, as you said, the stable ABI reduces the number of binary
packages which have to be built. Small projects with a small team
(ex: a single person) don't have the resources to set up a CI and
maintain it to build all these packages. It's doable, but it isn't free.

--

The irony of the situation is that we must break the C API (hiding
structures is technically an incompatible change)... to make the C API
stable. Breaking it now to make it stable later.

We already broke the C API many times in the past. The difference here
is that changes are done for the purpose of bending it towards the
limited C API and the stable ABI.

My expectation is that replacing frame->f_code with PyFrame_GetCode()
only has to be done exactly once: this API is not going to change.
Sadly, the changes are not limited to frame->f_code; more changes are
needed. For example, for PyFrameObject, accesses to every structure
member must go through a function call (getter or setter
function). Hopefully, only a small number of members are used by C
extensions.

The tricky part is to think about the high level API ("use cases")
rather than just adding functions doing "return struct->member" and
"struct->member = new_value". The PyThreadState_EnterTracing() and
PyThreadState_LeaveTracing() functions added to Python 3.11 are a good
example: the API is "generic" and the implementation changes 2
structure members, not a single one.
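
A toy sketch of the difference between a "use case" API and plain
setters follows. The toy_thread_state struct, its three members and
both functions are hypothetical; they only mimic the shape of the
enter/leave tracing pair and are not the CPython implementation.

```c
/* Toy thread state; the member names are assumptions for this sketch. */
typedef struct {
    int tracing;         /* nesting depth of "tracing is suspended" */
    int use_tracing;     /* fast flag checked by the eval loop */
    int has_trace_func;  /* whether a trace function is installed */
} toy_thread_state;

/* One high-level call expresses the intent "suspend tracing" and
   updates two members consistently. */
static void toy_enter_tracing(toy_thread_state *ts)
{
    ts->tracing += 1;
    ts->use_tracing = 0;
}

/* The matching call restores the fast flag from the remaining state. */
static void toy_leave_tracing(toy_thread_state *ts)
{
    ts->tracing -= 1;
    ts->use_tracing = ts->has_trace_func;
}
```

With two separate setters, every extension would have to know that
both members must change together. With the pair above, that knowledge
stays inside the runtime and can change later.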

In practice, what I have done since Python 3.8 is to introduce a small
number of C API changes per Python version. We tried the "fix all the
things at once" approach (!!!) with Python 3, and it... didn't go
well. All C extensions had to suddenly write their own compatibility
layer for a large number of C API functions (ex: replace PyInt_xxx
with PyLong_xxx, without losing Python 2 support!). The changes that
I'm introducing in the C API usually impact less than 100 extensions
in total (usually, I would say between 10 and 25 per Python version,
but it's hard to measure exactly).


> Can HPy do that?

I wish more projects were incrementally rewritten with Cython, cffi,
pybind11 and HPy, and so slowly moved away from using the C API directly.

Yes, HPy supports a "universal build" mode which allows building
a C extension only once, and using it on multiple *CPython* versions *and*
(that's the big news!) multiple *PyPy* versions! I even heard that it
also brings GraalPython support for free ;-)

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4I7N3SCBWIJCYXE3WGCQBMRTATVGIHY6/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On 28. 01. 22 16:04, Victor Stinner wrote:
> The basic problem is that at the ABI level, C extensions must only
> call functions, rather than getting and setting structure members
> directly. Structures change frequently in Python (look at the changes
> between Python 3.2 and Python 3.11), and any minor structure change
> breaks the ABI. The limited C API hides structures and only uses
> function calls to solve this problem.

This is not true. The limited C API does include some structs that are
not opaque, including some fields of PyObject.
Your effort is not only bending the "regular" C API towards the limited
API, but it's *also* bending the limited API towards a struct-less future.

This will be a better future if we get there, but getting there has its
downsides. One downside is that making incompatible changes to the
limited API could make it very hard to support and test the stable ABI.

For example, stable ABI extensions that do `obj->ob_type` must continue
to work*, even if we make it impossible to do this in new extensions (by
making PyObject opaque).
Making PyObject opaque is possible (the limited API is not stable), but
not easy to do correctly (e.g. remember to add tests for the newly
"unreachable" parts of the stable ABI).


(* we could also break the stable ABI, and we could even do it
reasonably safely over a long period of time, but that's a whole
different discussion.)


Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/SRL2MCOQPMC6PSB6BJ37PLGSFUXHYE2A/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On Mon, Jan 31, 2022 at 1:48 PM Petr Viktorin <encukou@gmail.com> wrote:
> (* we could also break the stable ABI, and we could even do it
> reasonably safely over a long period of time, but that's a whole
> different discussion.)

IMO the stable ABI must be changed in the long term: it still leaks too
many implementation details. But right now, I haven't gathered enough
data about the problematic APIs and what exactly must be changed. I
would prefer to only do it once the work is really blocked and there
is no other choice.

Right now, I'm focused on fixing the *API*. It doesn't require
breaking the stable ABI.

If we change the stable ABI, I would prefer to fix multiple issues at
once. Examples:

* No longer return borrowed references (ex: PyDict_GetItem is part of
the stable ABI) and no longer steal references (ex:
PyModule_AddObject)

* Disallow getting direct access into an object's data without a
function to "release" the data. For example, PyBytes_AsString() gives
direct access into the string, but Python doesn't know when the C
extension is done with it, and when it's safe to delete the object.
Such an API prevents moving Python objects in memory (implementing a
moving garbage collector in Python).

* Disallow dereferencing a PyObject* pointer: most structures must be
opaque. It indirectly means that accessing structure members directly
must also be disallowed. PEP 670 and PEP 674 are partially fixing the
issues.
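
The "release" idea from the second item can be sketched as a toy
acquire/release pair. All names here (toy_buf and its three functions)
are made up. In the real C API, PyObject_GetBuffer() and
PyBuffer_Release() already follow this general pattern.

```c
#include <stdbool.h>

/* Toy byte string that tracks how many callers hold a raw pointer. */
typedef struct {
    char data[16];
    int pins;
} toy_buf;

/* Hand out a raw pointer and record that the data is in use. */
static const char *toy_buf_acquire(toy_buf *b)
{
    b->pins += 1;
    return b->data;
}

/* The caller promises not to use the raw pointer after this call. */
static void toy_buf_release(toy_buf *b)
{
    b->pins -= 1;
}

/* The runtime may move or free the data only when nobody holds it. */
static bool toy_buf_can_move(const toy_buf *b)
{
    return b->pins == 0;
}
```

An API that returns the pointer without any release call gives the
runtime no way to know when toy_buf_can_move() would become true.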

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/AFQVC6V4EXOOWV7LK7BHIIX2WPV5H2WX/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On 31. 01. 22 15:40, Victor Stinner wrote:
> On Mon, Jan 31, 2022 at 1:48 PM Petr Viktorin <encukou@gmail.com> wrote:
>> (* we could also break the stable ABI, and we could even do it
>> reasonably safely over a long period of time, but that's a whole
>> different discussion.)
>
> IMO the stable ABI must be changed in the long term: it still leaks too
> many implementation details. But right now, I haven't gathered enough
> data about the problematic APIs and what exactly must be changed. I
> would prefer to only do it once the work is really blocked and there
> is no other choice.
>
> Right now, I'm focused on fixing the *API*. It doesn't require
> breaking the stable ABI.
>
> If we change the stable ABI, I would prefer to fix multiple issues at
> once. Examples:
>
> * No longer return borrowed references (ex: PyDict_GetItem is part of
> the stable ABI) and no longer steal references (ex:
> PyModule_AddObject)
>
> * Disallow getting direct access into an object's data without a
> function to "release" the data. For example, PyBytes_AsString() gives
> direct access into the string, but Python doesn't know when the C
> extension is done with it, and when it's safe to delete the object.
> Such an API prevents moving Python objects in memory (implementing a
> moving garbage collector in Python).
>
> * Disallow dereferencing a PyObject* pointer: most structures must be
> opaque. It indirectly means that accessing structure members directly
> must also be disallowed. PEP 670 and PEP 674 are partially fixing the
> issues.

All of these can be changed in the API. Not easily -- I mentioned the
problem with testing the ABI, but there might be others -- but fixing
these in the API first is probably the way to go.

The ABI can then be changed to align with the current API.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/5A6A5QQ2HWVP7BHY36SFK3TP7A4MMX6I/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On Mon, Jan 31, 2022 at 4:03 PM Petr Viktorin <encukou@gmail.com> wrote:
> > If we change the stable ABI, I would prefer to fix multiple issues at
> > once. Examples:
> >
> > * No longer return borrowed references (ex: PyDict_GetItem is part of
> > the stable ABI) and no longer steal references (ex:
> > PyModule_AddObject)
> >
> > * Disallow getting direct access to an object's data without a
> > function to "release" the data. For example, PyBytes_AsString() gives
> > direct access to the string, but Python doesn't know when the C
> > extension is done with it, and when it's safe to delete the object.
> > Such an API prevents moving Python objects in memory (implementing a
> > moving garbage collector in Python).
> >
> > * Disallow dereferencing a PyObject* pointer: most structures must be
> > opaque. It indirectly means that directly accessing structure members
> > must also be disallowed. PEP 670 and PEP 674 are partially fixing these
> > issues.
>
> (...) fixing these in the API first is probably the way to go.

That's what I already did in the past and what I plan to do in the future.

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/JYZFBZPHBYNOR7RZXLFFEHL7WZGNY5EA/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
I'm sorry, I was overwhelmed and didn't find the time until now to answer
this. A lot was already said about this, so I'll just briefly explain below
(inline).

On Sat, Jan 29, 2022 at 2:38 AM Victor Stinner <vstinner@python.org> wrote:

> On Fri, Jan 28, 2022 at 6:28 PM Guido van Rossum <guido@python.org> wrote:
> > I think we will get *one* chance in the next decade to get it right.
> Whether that's HPy or evolution of the C API I'm not sure.
>
> Would you mind elaborating? Which risks do you expect from switching
> to HPy and from fixing the C API (introducing incompatible C API
> changes)?
>

IMO users would benefit if we recommended one solution and started
deprecating the rest. We currently have too many choices: Stable ABI,
limited API (not everybody sees those two as the same thing), CPython API
(C API), Cython (for many this is how they interact with the interpreter),
HPy... And I think you have another thing in the works, a library that
"backfills" (I think that's the word) APIs for older CPython versions so
that users can pretend to use the latest C API but are able to compile/link
for older versions.

To me, that's too many choices -- at the very least it should be clearer
how these relate to each other (e.g. the C API is a superset of the Limited
API, the Stable ABI is based on the limited API (explain how), and HPy is a
wrapper around the C API (or is it?)).

Such an explanation (of the relationships) would help users understand the
consequences of choosing one or the other for their code -- how will future
CPython versions affect them, how portable is their code to other Python
implementations (PyPy, GraalPython, Jython). Users can't be expected to
understand these consequences without a lot of help (honestly, many of
these I couldn't explain myself :-( ).


> For me, promoting HPy and evolving the C API are complementary:
> they can and must be done in parallel. As I explained in PEP 674, while
> HPy does help C extension writers, it doesn't solve any problem for
> CPython right now. CPython is still blocked by implementation details
> leaked through the C API that we must still maintain for a few more
> years.
>

I understand that CPython is stuck supporting the de-facto standard C API
for a long time. But unless we pick a "north star" (as people call it
nowadays) of what we want to support in say 5-10 years, the situation will
never improve.

My point about "getting one chance to get it right in the next decade" is
that we have to pick that north star, so we can tell users which horse to
bet on. If the north star we pick is HPy, things will be clear. If it is
evolving the C API things will also be clear. But I think we have to pick
one, and stick to it so users (i.e., package maintainers/developers) have
clarity.

I understand that HPy is currently implemented on top of the C API, but
hopefully it's not stuck on that. And it only helps a small group of
extension writers -- those who don't need the functionality that HPy is
still missing (they keep saying they're not ready for prime time) and who
value portability to other Python implementations, and for whom the
existing C API hacks in PyPy aren't sufficient. So it's mostly
aspirational. But if it stays that for too long, it will just die for lack
of motivation.


>
>
> > Victor, am I right that the (some) stable ABI will remain important
> because projects don't have resources to build wheels for every Python
> release? If a project does R releases per year for P platforms that need to
> support V versions of Python, they would normally have to build R * P * V
> wheels. With a stable ABI, they could reduce that to R * P. That's the key
> point, right?
>
> There are different use cases.
>
> 1) First, my main worry is that we put high pressure on maintainers
> of the most important Python dependencies before the release of a new
> Python version, because we want them to handle the flow of incompatible
> C API changes before the final Python 3.x version is released, so that
> their packages are available when Python 3.x final is released.
>

Hm, maybe we should reduce the flow. And e.g. reject PEP 674...


> It annoys core developers who cannot change things in Python without
> getting an increasing number of complaints about a large number of
> broken packages, sometimes with a request to revert.
>

You are mostly talking about yourself here, right? Since the revert
requests were mostly aimed at you. :-)


> It annoys C extensions maintainers who have to care about Python alpha
> and beta releases which are not convenient to use (ex: not available
> in Linux distributions).


I don't use Linux much, so I am not familiar with the inconvenience of
Python alpha/beta releases being unavailable. I thought that the Linux
philosophy was that you could always just build from source?


> Moreover, it became common to ask multiple
> changes and multiple releases before a Python final release, since
> more incompatible changes are introduced in Python (before the beta1).
>

Sorry, your grammar confuses me. Who is asking whom to do what here?

Is the complaint just that things change between alphas? Maybe we should
just give up on alphas and instead do nightlies (fully automated)?

>
> 2) Second, as you said, the stable ABI reduces the number of binary
> packages which have to be built. Small projects with a small team
> (ex: a single person) don't have resources to set up a CI and maintain
> it to build all these packages. It's doable, but it isn't free.
>

Maybe we need to help there. For example IIRC conda-forge will build conda
packages -- maybe we should offer a service like that for wheels?


> --
>
> The irony of the situation is that we must break the C API (hiding
> structures is technically an incompatible change)... to make the C API
> stable. Breaking it now to make it stable later.
>

The question is whether that will ever be enough. Unless we manage to get
rid of the INCREF/DECREF macros completely (from the public C API anyway)
we still can't change object layout.


> We already broke the C API many times in the past. The difference here
> is that changes are made with the purpose of bending it towards the
> limited C API and the stable ABI.
>
> My expectation is that replacing frame->f_code with PyFrame_GetCode()
> only has to be done exactly once: this API is not going to change.
> Sadly, the changes are not limited to frame->f_code, more changes are
> needed. For example, for PyFrameObject, accesses to every structure
> member must go through a function call (getter or setter
> function). Hopefully, only a small number of members are used by C
> extensions.
>

Is this worth it? Maybe we should just declare those structs and APIs
*unstable* and tell people who use them that they can expect to be broken
by each alpha release. As you say, hopefully this doesn't affect most
people. Likely it'll affect Cython dramatically but Cython is such a
special case that trying to evolve the C API will never satisfy them. We'll
have to deal with it separately. (Debuggers are a more serious concern. We
may need to provide higher-level APIs for debuggers to do the things they
need to do. Mark's PEP 669 should help here.)


> The tricky part is to think about the high level API ("use cases")
> rather than just adding functions doing "return struct->member" and
> "struct->member = new_value". The PyThreadState_EnterTracing() and
> PyThreadState_LeaveTracing() functions added to Python 3.11 are a good
> example: the API is "generic" and the implementation changes 2
> structure members, not a single one.
>

Right.
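
As a rough illustration of the "use case" style of API discussed above, the following sketch models a pair of enter/leave functions that update two structure members together, so callers never touch either member directly. ToyThreadState and its fields are hypothetical and only loosely inspired by the description above; they are not the real PyThreadState layout or the actual CPython implementation.

```c
#include <assert.h>

/* Toy stand-in for a thread state; the fields are illustrative only. */
typedef struct {
    int tracing;        /* nesting depth of "tracing suspended" sections */
    int use_tracing;    /* 1 when trace callbacks should fire */
    int has_tracefunc;  /* 1 when a trace function is installed */
} ToyThreadState;

/* One call expresses the intent "suspend tracing" and updates both
 * members consistently. */
void toy_enter_tracing(ToyThreadState *ts) {
    ts->tracing++;
    ts->use_tracing = 0;
}

/* The matching call restores the derived flag from the other members. */
void toy_leave_tracing(ToyThreadState *ts) {
    assert(ts->tracing > 0);
    ts->tracing--;
    ts->use_tracing = (ts->tracing == 0 && ts->has_tracefunc);
}

/* Returns 1 if tracing was suspended inside the section and restored after. */
int toy_tracing_roundtrip(void) {
    ToyThreadState ts = { 0, 1, 1 };
    toy_enter_tracing(&ts);
    int suspended = (ts.use_tracing == 0);
    toy_leave_tracing(&ts);
    return suspended && ts.use_tracing == 1 && ts.tracing == 0;
}
```

If the implementation later stores this state differently, only the two functions change; code written against plain getters and setters for each member would have to be revisited.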


> In practice, what I did since Python 3.8 is to introduce a small
> number of C API changes per Python versions. We tried the "fix all the
> things at once" approach (!!!) with Python 3, and it... didn't go
> well. All C extensions had to suddenly write their own compatibility
> layer for a large number of C API functions (ex: replace PyInt_xxx
> with PyLong_xxx, without losing Python 2 support!). The changes that
> I'm introducing in the C API usually impact less than 100 extensions
> in total (usually, I would say between 10 and 25 per Python version,
> but it's hard to measure exactly).
>

How *do* you count this? Try to compile the top 5000 PyPI packages? That
might severely undercount a long tail of proprietary extensions.


>
>
> > Can HPy do that?
>
> I wish more projects were incrementally rewritten with Cython, cffi,
> pybind11 and HPy, and so slowly moved away from using the C API directly.
>
> Yes, HPy supports a "universal build" mode which allows building
> a C extension only once, and using it on multiple *CPython* versions *and*
> (that's the big news!) multiple *PyPy* versions! I even heard that it
> also brings GraalPython support for free ;-)
>
> Victor
> --
> Night gathers, and now my watch begins. It shall not end until my death.
>


--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
Hi Guido,

My "north star", as you say, is the HPy "design" (not the actual HPy
API). I would like to convert PyObject* to opaque handles:
dereferencing a PyObject* pointer would simply fail with a compiler
error.
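
A minimal sketch of what "opaque" means at the C level, under the assumption that a single file can stand in for the usual header/implementation split. All names (ToyObject, toy_get_value, and so on) are hypothetical. The "extension" function is compiled while the structure is still an incomplete type, so it can only go through function calls.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- "public header": the type is declared but not defined ---- */
typedef struct ToyObject ToyObject;
ToyObject *toy_new(int value);
int toy_get_value(const ToyObject *obj);
void toy_free(ToyObject *obj);

/* ---- "extension code": sees only the declarations above ---- */
int extension_double(const ToyObject *obj) {
    /* Writing obj->value here would be a compile error, because
     * ToyObject is an incomplete type at this point. */
    return 2 * toy_get_value(obj);
}

/* ---- "implementation": free to reorder or add members ---- */
struct ToyObject {
    long internal_flags;
    int value;
};

ToyObject *toy_new(int value) {
    ToyObject *obj = malloc(sizeof(*obj));
    if (obj != NULL) {
        obj->internal_flags = 0;
        obj->value = value;
    }
    return obj;
}

int toy_get_value(const ToyObject *obj) {
    assert(obj != NULL);
    return obj->value;
}

void toy_free(ToyObject *obj) { free(obj); }

/* Returns 42 on success, -1 if allocation failed. */
int extension_demo(void) {
    ToyObject *obj = toy_new(21);
    if (obj == NULL) return -1;
    int result = extension_double(obj);
    toy_free(obj);
    return result;
}
```

Because the extension never sees the member layout, adding or moving members in the implementation does not require recompiling it, which is the property the stable ABI depends on.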

I'm working bottom-to-top: prepare PyObject and PyVarObject to become
opaque, *and* top-to-bottom: prepare subclasses (structures
"inheriting" from PyObject and PyVarObject) to become opaque like
PyFrameObject.

IMO if PyObject* becomes a handle, the migration to the HPy API should
be much easier.

So far, a large part of the work has been identifying which APIs
prevent moving the current C API to opaque handles. The second
part has been fixing these APIs one by one, starting with changes
which don't break the C API.

In the last 5 years, I fixed many issues of the C API. Very few
projects were impacted since I avoided incompatible changes on
purpose.

Now I have reached the end of my (current) queue, with the most
controversial changes: incompatible C API changes. I wrote PEP 670 and
674 to share my overall rationale and communicate the reasons why I
consider that these changes will make our lives easier in the coming years.

I'm well aware that in the short term, the benefits are very limited.
But we cannot jump immediately to opaque PyObject* handles. This work
must be done incrementally.


On Thu, Feb 3, 2022 at 1:40 AM Guido van Rossum <guido@python.org> wrote:
>> 1) First, my main worry is that we put high pressure on maintainers
>> of the most important Python dependencies before the release of a new
>> Python version, because we want them to handle the flow of incompatible
>> C API changes before the final Python 3.x version is released, so that
>> their packages are available when Python 3.x final is released.
>
> Hm, maybe we should reduce the flow. And e.g. reject PEP 674...

We need to find the right limit when introducing C API changes, so as not
to break too many C extensions "per Python release". Yes, PEP 674 is an
example of an incompatible C API change, but Python 3.11 already got many
others which are unrelated to that PEP.

For example, the optimization work done by your Microsoft team broke a
bunch of projects using PyThreadState and PyFrameObject APIs. See the
related What's New in Python 3.11 entries:
https://docs.python.org/dev/whatsnew/3.11.html#id2
(I didn't keep track of which projects are affected by these changes.)

I would like to ensure that it remains possible to optimize Python,
but for that, we need to reduce the friction related to C API changes.
What's the best approach for that remains an open question. I'm
proposing some solutions, we are discussing advantages and drawbacks
;-)


>> It annoys core developers who cannot change things in Python without
>> getting an increasing number of complaints about a large number of
>> broken packages, sometimes with a request to revert.
>
> You are mostly talking about yourself here, right? Since the revert requests were mostly aimed at you. :-)

Latest requests for revert are not about C API changes that I made.

Cython and the C API exception change:
https://mail.python.org/archives/list/python-dev@python.org/thread/RS2C53LDZPXHRR2VCY2G2YSPDVA4LNQU/

There are other requests for revert in Python 3.11 related to Python
changes, not to the C API.

So far, I only had to revert 2 changes about my C API work:

* PyType_HasFeature(): the change caused a performance regression on
macOS, where sadly Python cannot be built with LTO. With LTO (all
platforms but macOS), my change doesn't affect performance.

* Py_TYPE() / Py_SIZE() change: I reverted my change to have more time
to prepare affected projects. Two years later, I consider that this
preparation work is now done (all affected projects are ready) and so
I submitted the PEP 674. Affected projects (merged and pending fixes):
https://www.python.org/dev/peps/pep-0674/#backwards-compatibility


>> Moreover, it became common to ask multiple
>> changes and multiple releases before a Python final release, since
>> more incompatible changes are introduced in Python (before the beta1).
>
> Sorry, your grammar confuses me. Who is asking whom to do what here?

Cython is the best example. During the development cycle of a new
Python version, my Fedora team adapts Cython to the next Python and
then requests a release, to get an official release supporting an alpha
Python version.

A release is not strictly required by us (we can apply downstream
patches), but it's more convenient for us and for people who cannot
use our work-in-progress "COPR" (repository of RPM packages specific
to Fedora). When we ask a project (using Cython) to merge our pull
request, maintainers want to test it on the current Python alpha
version, and it's not convenient when Cython is unusable.

Between Python 3.x alpha1 and Python 3.x final, there might be
multiple Cython releases, each handling a batch of incompatible C API
changes. So far, there has been no coordination between Python and
Cython to group incompatible changes: they land between alpha1
and beta1 in an irregular fashion.


>> 2) Second, as you said, the stable ABI reduces the number of binary
>> packages which have to be built. Small projects with a small team
>> (ex: a single person) don't have resources to set up a CI and maintain
>> it to build all these packages. It's doable, but it isn't free.
>
> Maybe we need to help there. For example IIRC conda-forge will build conda packages -- maybe we should offer a service like that for wheels?

Tooling to automate the creation of wheel binary packages targeting
the stable ABI would help. C extensions likely need to change a few
function calls which are not supported by the limited C API. Someone
has to do this work.

My hope is that the quantity of changes is small (ex: modify 2 or 3
function calls) for small C extensions. I haven't tried this in practice yet.


>> The irony of the situation is that we must break the C API (hiding
>> structures is technically an incompatible change)... to make the C API
>> stable. Breaking it now to make it stable later.
>
> The question is whether that will ever be enough. Unless we manage to get rid of the INCREF/DECREF macros completely (from the public C API anyway) we still can't change object layout.

I'm open to ideas if someone has a better plan ;-)

Keeping reference counting for consumers of the C API (C extensions)
doesn't prevent Python from dropping reference counting internally
(switching to a completely different GC implementation). PyPy cpyext is a
concrete example of that.

The nogil fork keeps Py_INCREF()/Py_DECREF() functions, but changes
their implementation. If we have to (ex: if we merge nogil), we can
convert Py_INCREF() / Py_DECREF() static inline functions to regular
functions tomorrow (and then change their implementation) without
breaking the API.

Adding abstractions (getter and setter functions) on PyObject gives
more freedom to consider different options to evolve Python tomorrow.
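
A small sketch of that freedom, with hypothetical names throughout. The caller below depends only on function names, while the implementation stores the count in two fields. The two-field layout is an arbitrary choice made for illustration; it is not a description of how the nogil fork or CPython actually store reference counts.

```c
#include <assert.h>

/* ---- what callers see: an opaque type and three functions ---- */
typedef struct ToyObject ToyObject;
void toy_incref(ToyObject *obj);
void toy_decref(ToyObject *obj);
long toy_refcount(const ToyObject *obj);

/* Caller code: written once, independent of how the count is stored. */
long caller_hold_and_release(ToyObject *obj) {
    toy_incref(obj);
    long during = toy_refcount(obj);
    toy_decref(obj);
    return during;
}

/* ---- implementation: the counter is split across two fields ---- */
struct ToyObject {
    long local_refs;
    long shared_refs;
};

long toy_refcount(const ToyObject *obj) {
    return obj->local_refs + obj->shared_refs;
}

void toy_incref(ToyObject *obj) { obj->local_refs++; }

void toy_decref(ToyObject *obj) {
    assert(toy_refcount(obj) > 0);
    obj->local_refs--;
}

/* One existing reference, recorded in shared_refs; the caller briefly
 * adds a second one, so the count observed in the middle is 2. */
long incref_demo(void) {
    ToyObject obj = { 0, 1 };
    return caller_hold_and_release(&obj);
}
```

Had the caller incremented a structure member directly, as a macro or static inline function compiled into the extension would, the split layout could not have been introduced without rebuilding every extension.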


> Is this worth it? Maybe we should just declare those structs and APIs *unstable* and tell people who use them that they can expect to be broken by each alpha release. As you say, hopefully this doesn't affect most people. Likely it'll affect Cython dramatically but Cython is such a special case that trying to evolve the C API will never satisfy them. We'll have to deal with it separately. (Debuggers are a more serious concern. We may need to provide higher-level APIs for debuggers to do the things they need to do. Mark's PEP 669 should help here.)

IMO we only need to add 5 to 10 functions to cover most use cases
involving PyThreadState and PyFrameObject. The remaining, less common
usages can continue to require changes at each Python release. The
short term goal is only to reduce the number of required changes per
Python release.

Yes, I'm talking about Cython, debuggers and profilers.

Another example is that Cython currently calls PyCode_New() to create
a fake frame object with a filename and line number. IMO it's the
wrong abstraction level: Python should provide a function to create a
frame with a filename and line number, so the caller doesn't have to
bother about the complex PyCode_New() API and frequent PyCodeObject
changes. (Correct me if this problem has already been solved in
Python.)


>> In practice, what I did since Python 3.8 is to introduce a small
>> number of C API changes per Python versions. (...)
>
> How *do* you count this? Try to compile the top 5000 PyPI packages?

I'm using code search in the source code of top 5000 PyPI packages and
I'm looking at broken packages in Fedora when we update Python. Also,
sometimes people add a comment on an issue to mention that their
project is broken by a change.

> That might severely undercount a long tail of proprietary extensions.

Right.

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/YY5QCAIZI2YA7P7T2CPVPMRKLJ32NO4Q/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On Thu, Feb 3, 2022 at 9:27 AM Victor Stinner <vstinner@python.org> wrote:

> Hi Guido,
>

[SNIP]


>
> On Thu, Feb 3, 2022 at 1:40 AM Guido van Rossum <guido@python.org> wrote:
>
>
[SNIP]


> >
> > Maybe we need to help there. For example IIRC conda-forge will build
> conda packages -- maybe we should offer a service like that for wheels?
>
> Tooling to automate the creation of wheel binary packages targeting
> the stable ABI would help. C extensions likely need changing a few
> function calls which are not supported by the limited C API. Someone
> has to do this work.
>

The idea of having a service to build wheels for folks is an old one that I
think a ton of people would benefit from.

Currently, people typically get pointed to
https://pypi.org/project/cibuildwheel/ as the per-project solution. But
designing a safe way to build wheels from any sdist on PyPI, keeping such a
service up, having the free processing to do it, etc. is unfortunately a
big enough project that no one has stepped forward to try and tackle it.
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On Wed, Feb 2, 2022 at 4:46 PM Guido van Rossum <guido@python.org> wrote:

A few notes on this:


> Maybe we need to help there. For example IIRC conda-forge will build conda
> packages -- maybe we should offer a service like that for wheels?
>

Yes, conda-forge uses a complex CI system to build binary conda packages
for a variety of Python versions. And it does have some support for
development versions. Once a "feedstock" is developed, it's remarkably
painless to get all the binaries up and available.

I imagine someone could borrow a bunch of that code to make a system to
build wheels.

In fact, there is the MacPython org:

https://github.com/MacPython

Which began as a place to share building scripts for Mac binaries, but has
expanded to build wheels for multiple platforms for the scipy stack. I
don't know how it works these days -- I am no longer involved since I
discovered conda, but they seem to have some nice stuff there -- perhaps it
could be leveraged for more projects.

However: one of the challenges for building C extensions is that they often
depend on external C libs -- and that is exactly the problem that conda was
built to address. So in a sense, a conda-forge-like auto-build system is
inherently easier for conda packages than binary wheels.

Which doesn't mean it couldn't be done -- just that the challenge of third
party libs would need to be addressed.

In any case, someone would have to do the work, as usual.

-CHB


--
Christopher Barker, PhD (Chris)

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On 2/3/2022 12:15 PM, Victor Stinner wrote:
> I'm working bottom-to-top: prepare PyObject and PyVarObject to become
> opaque, *and* top-to-bottom: prepare subclasses (structures
> "inheriting" from PyObject and PyVarObject) to become opaque like
> PyFrameObject.
>
> IMO if PyObject* becomes a handle, the migration to the HPy API should
> be much easier.

It seems to me that moving PyObject* to be a handle leaves you in a
place very similar to HPy. So why not just focus on making HPy suitable
for developing C extensions, leave the existing C API alone, and
eventually abandon the existing C API?

Eric


Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BWCNBP26BZU2SYLCBVCXXVBMYUTSHE27/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On Fri, Feb 4, 2022 at 12:55 AM Eric V. Smith <eric@trueblade.com> wrote:

> It seems to me that moving PyObject* to be a handle leaves you in a
> place very similar to HPy. So why not just focus on making HPy suitable
> for developing C extensions, leave the existing C API alone, and
> eventually abandon the existing C API?
>

I agree (but I'm biased :)), but I think there is also an important point
which is easy to miss/overlook: it is not enough to declare that now you
have handles instead of refcounting, you also need a way to enforce/check
that the handles are used correctly.

CPython might declare that object references are now handles and that each
handle must be closed individually: this would work formally, but as long
as handles are internally implemented on top of refcounting, things like
closing the same handle twice would just continue to work if by chance the
total refcount is still correct. This means that we will have extensions
which will be formally incorrect but will work well on CPython, and
horribly break as soon as you try to load them on e.g. PyPy.

That's the biggest selling point of the HPy debug mode: in debug mode, HPy
actively checks that handles are closed properly, and it warns you if you
close a handle twice or forget to close a handle, even on CPython.
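
The kind of checking described above can be sketched with a toy handle table. This is not HPy's API or its debug-mode implementation: ToyDebugCtx, toy_open and the other names are hypothetical, and the table simply never reuses a slot so that a stale handle stays detectable.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_HANDLES 8

typedef int ToyHandle;   /* index into the table; -1 means "no handle" */

typedef struct {
    bool open[TOY_MAX_HANDLES];
    int next;            /* slots are never reused in this toy model */
    int double_closes;   /* number of misuse events detected so far */
} ToyDebugCtx;

ToyHandle toy_open(ToyDebugCtx *ctx) {
    if (ctx->next >= TOY_MAX_HANDLES) return -1;
    ctx->open[ctx->next] = true;
    return ctx->next++;
}

void toy_close(ToyDebugCtx *ctx, ToyHandle h) {
    assert(h >= 0 && h < ctx->next);
    if (!ctx->open[h]) {
        ctx->double_closes++;   /* a refcount-based scheme may miss this */
        return;
    }
    ctx->open[h] = false;
}

/* Number of handles that were opened but never closed. */
int toy_leaks(const ToyDebugCtx *ctx) {
    int leaks = 0;
    for (int i = 0; i < ctx->next; i++) {
        if (ctx->open[i]) leaks++;
    }
    return leaks;
}

/* A deliberately buggy "extension": closes a twice and forgets b.
 * Returns double_closes * 10 + leaks. */
int buggy_extension_report(void) {
    ToyDebugCtx ctx = { { false }, 0, 0 };
    ToyHandle a = toy_open(&ctx);
    ToyHandle b = toy_open(&ctx);
    (void)b;                    /* never closed: reported as a leak */
    toy_close(&ctx, a);
    toy_close(&ctx, a);         /* second close: reported as misuse */
    return ctx.double_closes * 10 + toy_leaks(&ctx);
}
```

With plain reference counting, the same two mistakes would cancel out whenever a and b refer to the same object, which is why per-handle bookkeeping is needed to catch them.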

ciao,
Antonio
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On Thu, Feb 3, 2022 at 3:53 PM Eric V. Smith <eric@trueblade.com> wrote:

> On 2/3/2022 12:15 PM, Victor Stinner wrote:
> > I'm working bottom-to-top: prepare PyObject and PyVarObject to become
> > opaque, *and* top-to-bottom: prepare subclasses (structures
> > "inheriting" from PyObject and PyVarObject) to become opaque like
> > PyFrameObject.
> >
> > IMO if PyObject* becomes a handle, the migration to the HPy API should
> > be much easier.
>
> It seems to me that moving PyObject* to be a handle leaves you in a
> place very similar to HPy. So why not just focus on making HPy suitable
> for developing C extensions, leave the existing C API alone, and
> eventually abandon the existing C API?
>

I think that's a possibility. I think it's a question for the team here
whether that's the long-term goal that we want. If so we can make all of
our work head towards that and help HPy out as best we can.
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On Fri, Feb 4, 2022 at 12:52 AM Eric V. Smith <eric@trueblade.com> wrote:
>
> On 2/3/2022 12:15 PM, Victor Stinner wrote:
> >
> > IMO if PyObject* becomes a handle, the migration to the HPy API should
> > be much easier.
>
> It seems to me that moving PyObject* to be a handle leaves you in a
> place very similar to HPy. So why not just focus on making HPy suitable
> for developing C extensions, leave the existing C API alone, and
> eventually abandon the existing C API?

I tried to explain the reasons why HPy doesn't solve all problems in
the PEP 674:
https://www.python.org/dev/peps/pep-0674/#the-c-api-is-here-is-stay-for-a-few-more-years

One problem is to provide a better C API to users: HPy is great for that!

Another problem is the inability to evolve Python because the C API
leaks implementation details: HPy doesn't solve this problem because
Python must continue supporting the C API for a few more years.

My approach is to (slowly) bend the C API towards HPy design/API to
ease the migration to HPy *and* (slowly) allow changing more Python
internals (without affecting the public C API).

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BIAGMMRJP45FR3R5DS772TZZU6AQVO2V/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
Trying to cut things short, there's one thing I'd like to correct:

On Thu, Feb 3, 2022 at 9:15 AM Victor Stinner <vstinner@python.org> wrote:

> [...]
>
> Another example is that Cython currently calls PyCode_New() to create
> a fake frame object with a filename and line number. IMO it's the
> wrong abstraction level: Python should provide a function to create a
> frame with a filename and line number, so the caller doesn't have to
> bother about the complex PyCode_New() API and frequent PyCodeObject
> changes. (Correct me if this problem has already been solved in
> Python.)
>

That was solved quite a while ago, with the PyCode_NewEmpty() API. Sadly
Cython doesn't call it (or at least not always), because it takes a C
string which is turned into a unicode object, and Cython already has the
unicode object in hand. I don't want to proliferate APIs.

--
--Guido van Rossum (python.org/~guido)
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone [ In reply to ]
On 31. 01. 22 16:14, Victor Stinner wrote:
> On Mon, Jan 31, 2022 at 4:03 PM Petr Viktorin <encukou@gmail.com> wrote:
>>> If we change the stable ABI, I would prefer to fix multiple issues at
>>> once. Examples:
>>>
>>> * No longer return borrowed references (ex: PyDict_GetItem is part of
>>> the stable ABI) and no longer steal references (ex:
>>> PyModule_AddObject)
>>>
>>> * Disallow getting direct access to an object's data without a
>>> function to "release" the data. For example, PyBytes_AsString() gives
>>> direct access to the string, but Python doesn't know when the C
>>> extension is done with it, and when it's safe to delete the object.
>>> Such an API prevents moving Python objects in memory (implementing a
>>> moving garbage collector in Python).
>>>
>>> * Disallow dereferencing a PyObject* pointer: most structures must be
>>> opaque. It indirectly means that directly accessing structure members
>>> must also be disallowed. PEP 670 and PEP 674 are partially fixing these
>>> issues.
>>
>> (...) fixing these in the API first is probably the way to go.
>
> That's what I already did in the past and what I plan to do in the future.

I see a problem in the subject of this thread: "Slowly bend the C API
towards the limited API to get a stable ABI for everyone" is not a good
summary of the proposed changes -- that is, bend *all* API (both the
general public API and the limited API) to make most structs opaque, etc.

If the summary doesn't match what's actually proposed, it's hard to
discuss the proposal. Especially if the concrete plan changes often.


Anyway, I propose a different plan than what I think you are proposing:

- Add new "good" API where the current API is lacking. For
example, PyModule_AddObjectRef is the "good" alternative to
PyModule_AddObject -- it doesn't steal a reference. You've done a lot
of great work here, and the API is much better for it.

- "Soft"-deprecate the "bad" API: document that it's only there for
existing working code.
Why not remove it? The issue is that it *is* possible to use the
existing API correctly, and many extension authors have spent a lot of
time and effort to do just that. If we force them to use a new API that
makes writing correct code easier, it won't actually make their job
easier if they've already found the caveats and fixed them.

- Remove the "bad" API from newer versions of the limited API. Extension
authors can "opt in" to this new version, gaining new features of the
limited API but losing the deprecated parts.
(Here is the part where we should make sure the removals are
well-documented, provide tools to "modernize" code, etc.)

- Proactively work with popular/important projects (top PyPI packages,
distro packages) to move to the latest API. The benefit for CPython
devs here is that we can see the pain points and unforeseen use cases,
test any modernization tools, get a "reality check" on how invasive
the changes actually are, and help HPy & other implementations
succeed even if they don't implement deprecated API.

- Agree with HPy and other implementations of the limited API that it's
not necessary for them to support the deprecated parts.

- When (and only when) a deprecated API is actually harmful -- i.e. it
blocks new improvements that benefit actual users in the short term --
it should be deprecated and removed. (Even better: if, instead of being
removed, it could be replaced by e.g. a function that's 3x slower or
leaks memory on exit, then it should be.)
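
To make the first point above concrete, here is a toy model of the two
calling conventions. It is only a sketch: toy_obj, toy_module and every
toy_* / use_* function are hypothetical stand-ins invented for this
illustration, not CPython API. The one thing taken from the real API is
the ownership rule: PyModule_AddObject steals the caller's reference
only on success, while PyModule_AddObjectRef never steals it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy reference-counted object; NOT CPython's PyObject. */
typedef struct { int refcnt; } toy_obj;

static void toy_incref(toy_obj *o) { o->refcnt++; }
static void toy_decref(toy_obj *o) { assert(o->refcnt > 0); o->refcnt--; }

/* Toy module with a single slot; replacing an old value is out of scope. */
typedef struct { toy_obj *slot; } toy_module;

/* Models PyModule_AddObject: steals the reference, but only on success.
 * `fail` is a hypothetical switch used to simulate an error. */
static int toy_add_steal(toy_module *m, toy_obj *value, int fail) {
    if (fail) return -1;   /* nothing stolen: the caller still owns value */
    m->slot = value;       /* the caller's reference now belongs to m */
    return 0;
}

/* Models PyModule_AddObjectRef: the module takes its own reference. */
static int toy_add_ref(toy_module *m, toy_obj *value, int fail) {
    if (fail) return -1;
    toy_incref(value);
    m->slot = value;
    return 0;
}

/* Correct use of the stealing variant: release only on failure. */
static int use_steal(toy_module *m, toy_obj *value, int fail) {
    if (toy_add_steal(m, value, fail) < 0) {
        toy_decref(value);
        return -1;
    }
    return 0;
}

/* Correct use of the non-stealing variant: always release. */
static int use_ref(toy_module *m, toy_obj *value, int fail) {
    int res = toy_add_ref(m, value, fail);
    toy_decref(value);
    return res;
}
```

Both helpers end up with the same reference counts, which is the point
about existing code: the stealing variant can be used correctly, but
only by remembering the failure-path release, whereas the non-stealing
variant has a single cleanup path.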



Basically, instead of "We'll remove this API now because it prevents
moving to a hypothetical moving garbage collector", it should be "Here
is a moving garbage collector that speeds Python up by 30%, but to add
it we need to remove these 30 deprecated APIs". The deprecation can be
proactive, but not the removal.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IXXPO47EZ4QLYWZ4I7RZ5OQVJY7J3FLB/
Code of Conduct: http://python.org/psf/codeofconduct/
Re: Slowly bend the C API towards the limited API to get a stable ABI for everyone
On Mon, Feb 7, 2022 at 2:08 PM Petr Viktorin <encukou@gmail.com> wrote:
> Basically, instead of "We'll remove this API now because it prevents
> moving to a hypothetical moving garbage collector", it should be "Here
> is a moving garbage collector that speeds Python up by 30%, but to add
> it we need to remove these 30 deprecated APIs". The deprecation can be
> proactive, but not the removal.

PEP 674 gives 3 concrete examples of issues already affecting the
CPython nogil fork, HPy and GraalPython. They are not hypothetical.

CPython is also affected by these issues, but the benefits of PEP 674
(alone) are too indirect, so I chose to avoid mentioning CPython
issues directly, to avoid confusion.

It's possible to work around them: more or less copy/paste CPython's
inefficient code, as PyPy did years ago. The problem is that the
workaround is inefficient and so PyPy cpyext remains slow. Well, HPy
addresses the cpyext performance problem for PyPy and GraalPython ;-)

I don't think that the question is whether there is a real problem or
not. The question is what's the best migration plan to move existing C
extensions towards a better API which doesn't suffer from these
problems.

Even once 95% of C extensions use the limited C API, we would still
not be able to change Python internals, because of the remaining 5% of
C extensions which are stuck on the legacy C API.

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
