Mailing List Archive

PEP 684: A Per-Interpreter GIL
I'd really appreciate feedback on this new PEP about making the GIL
per-interpreter.

The PEP targets 3.11, but we'll see if that is too close. I don't
mind waiting one more release, though I'd prefer 3.11 (obviously).
Regardless, I have no intention of rushing this through at the
expense of cutting corners. Hence, we'll see how it goes.

The PEP text is included inline below. Thanks!

-eric

===================================================

PEP: 684
Title: A Per-Interpreter GIL
Author: Eric Snow <ericsnowcurrently@gmail.com>
Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 08-Mar-2022
Python-Version: 3.11
Post-History: 08-Mar-2022
Resolution:

Abstract
========

Since Python 1.5 (1997), CPython users have been able to run multiple
interpreters in the same process. However, interpreters in the same
process have always shared a significant amount of global state.
This is a source of bugs, with a growing impact as more and more
people use the feature. Furthermore,
sufficient isolation would facilitate true multi-core parallelism,
where interpreters no longer share the GIL. The changes outlined in
this proposal will result in that level of interpreter isolation.


High-Level Summary
==================

At a high level, this proposal changes CPython in the following ways:

* stops sharing the GIL between interpreters, given sufficient isolation
* adds several new interpreter config options for isolation settings
* adds some public C-API for fine-grained control when creating interpreters
* keeps incompatible extensions from causing problems

The GIL
-------

The GIL protects concurrent access to most of CPython's runtime state.
So all that GIL-protected global state must move to each interpreter
before the GIL can.

(In a handful of cases, other mechanisms can be used to ensure
thread-safe sharing instead, such as locks or "immortal" objects.)

CPython Runtime State
---------------------

Properly isolating interpreters requires that most of CPython's
runtime state be stored in the ``PyInterpreterState`` struct. Currently,
only a portion of it is; the rest is found either in global variables
or in ``_PyRuntimeState``. Most of that will have to be moved.

This directly coincides with an ongoing effort (of many years) to greatly
reduce internal use of C global variables and consolidate the runtime
state into ``_PyRuntimeState`` and ``PyInterpreterState``.
(See `Consolidating Runtime Global State`_ below.) That project has
`significant merit on its own <Benefits to Consolidation_>`_
and has faced little controversy. So, while a per-interpreter GIL
relies on the completion of that effort, that project should not be
considered a part of this proposal--only a dependency.

Other Isolation Considerations
------------------------------

CPython's interpreters must be strictly isolated from each other, with
few exceptions. To a large extent they already are. Each interpreter
has its own copy of all modules, classes, functions, and variables.
The CPython C-API docs `explain further <caveats_>`_.

.. _caveats: https://docs.python.org/3/c-api/init.html#bugs-and-caveats

However, aside from what has already been mentioned (e.g. the GIL),
there are a couple of ways in which interpreters still share some state.

First of all, some process-global resources (e.g. memory,
file descriptors, environment variables) are shared. There are no
plans to change this.

Second, some isolation is faulty due to bugs or implementations that
did not take multiple interpreters into account. This includes
CPython's runtime and the stdlib, as well as extension modules that
rely on global variables. Bugs should be opened in these cases,
as some already have been.

Depending on Immortal Objects
-----------------------------

:pep:`683` introduces immortal objects as a CPython-internal feature.
With immortal objects, we can share any otherwise immutable global
objects between all interpreters. Consequently, this PEP does not
need to address how to deal with the various objects
`exposed in the public C-API <capi objects_>`_.
It also simplifies the question of what to do about the builtin
static types. (See `Global Objects`_ below.)

Both issues have alternate solutions, but everything is simpler with
immortal objects. If PEP 683 is not accepted then this one will be
updated with the alternatives. This lets us reduce noise in this
proposal.


Motivation
==========

The fundamental problem we're solving here is a lack of true multi-core
parallelism (for Python code) in the CPython runtime. The GIL is the
cause. While it usually isn't a problem in practice, at the very least
it makes Python's multi-core story murky, which makes the GIL
a consistent distraction.

Isolated interpreters are also an effective mechanism to support
certain concurrency models. :pep:`554` discusses this in more detail.

Indirect Benefits
-----------------

Most of the effort needed for a per-interpreter GIL has benefits that
make those tasks worth doing anyway:

* makes multiple-interpreter behavior more reliable
* has led to fixes for long-standing runtime bugs that otherwise
  hadn't been prioritized
* has been exposing (and inspiring fixes for) previously unknown runtime bugs
* has driven cleaner runtime initialization (:pep:`432`, :pep:`587`)
* has driven cleaner and more complete runtime finalization
* has led to structural layering of the C-API (e.g. ``Include/internal``)
* also see `Benefits to Consolidation`_ below

Furthermore, much of that work benefits other CPython-related projects:

* performance improvements ("faster-cpython")
* pre-fork application deployment (e.g. Instagram)
* extension module isolation (see :pep:`630`, etc.)
* embedding CPython

Existing Use of Multiple Interpreters
-------------------------------------

The C-API for multiple interpreters has been used for many years.
However, until relatively recently the feature wasn't widely known,
nor extensively used (with the exception of mod_wsgi).

In the last few years use of multiple interpreters has been increasing.
Here are some of the public projects using the feature currently:

* `mod_wsgi <https://github.com/GrahamDumpleton/mod_wsgi>`_
* `OpenStack Ceph <https://github.com/ceph/ceph/pull/14971>`_
* `JEP <https://github.com/ninia/jep>`_
* `Kodi <https://github.com/xbmc/xbmc>`_

Note that, with :pep:`554`, multiple interpreter usage would likely
grow significantly (via Python code rather than the C-API).

PEP 554
-------

:pep:`554` is strictly about providing a minimal stdlib module
to give users access to multiple interpreters from Python code.
In fact, it specifically avoids proposing any changes related to
the GIL. Consider, however, that users of that module would benefit
from a per-interpreter GIL, which makes PEP 554 more appealing.


Rationale
=========

During initial investigations in 2014, a variety of possible solutions
for multi-core Python were explored, but each had its drawbacks
without simple solutions:

* the existing practice of releasing the GIL in extension modules
  * doesn't help with Python code
* other Python implementations (e.g. Jython, IronPython)
  * CPython dominates the community
* remove the GIL (e.g. gilectomy, "no-gil")
  * too much technical risk (at the time)
* Trent Nelson's "PyParallel" project
  * incomplete; Windows-only at the time
* ``multiprocessing``

  * too much work to make it effective enough;
    high penalties in some situations (at large scale, Windows)

* other parallelism tools (e.g. dask, ray, MPI)
  * not a fit for the stdlib
* give up on multi-core (e.g. async, do nothing)
  * this can only end in tears

Even in 2014, it was fairly clear that a solution using isolated
interpreters did not have a high level of technical risk and that
most of the work was worth doing anyway.
(The downside was the volume of work to be done.)


Specification
=============

As `summarized above <High-Level Summary_>`__, this proposal involves the
following changes, in the order they must happen:

1. `consolidate global runtime state <Consolidating Runtime Global State_>`_
   (including objects) into ``_PyRuntimeState``
2. move nearly all of the state down into ``PyInterpreterState``
3. finally, move the GIL down into ``PyInterpreterState``
4. everything else
   * add to the public C-API
   * implement restrictions in ``ExtensionFileLoader``

   * work with popular extension maintainers to help
     with multi-interpreter support
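
The ordering above can be pictured with a toy Python model. It is only a
sketch under stated assumptions: ``RuntimeState`` and ``InterpreterState``
are simplified stand-ins for ``_PyRuntimeState`` and ``PyInterpreterState``,
the ``warnings_filters`` field and ``new_interpreter()`` method are
hypothetical, and a ``threading.Lock`` stands in for the GIL.

.. code-block:: python

    import threading
    from dataclasses import dataclass, field


    @dataclass
    class InterpreterState:
        """Toy stand-in for PyInterpreterState (illustration only)."""
        # Step 3: each interpreter owns its lock instead of sharing one.
        gil: object = field(default_factory=threading.Lock)
        # Step 2: formerly process-global, GIL-protected state lives here.
        warnings_filters: list = field(default_factory=list)


    @dataclass
    class RuntimeState:
        """Toy stand-in for _PyRuntimeState (step 1: one home for globals)."""
        interpreters: list = field(default_factory=list)

        def new_interpreter(self) -> InterpreterState:
            # Hypothetical helper: create and track a new interpreter.
            interp = InterpreterState()
            self.interpreters.append(interp)
            return interp

Because each ``InterpreterState`` in this model has its own lock, holding
one interpreter's lock does not block a thread that needs another's.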

Per-Interpreter State
---------------------

The following runtime state will be moved to ``PyInterpreterState``:

* all global objects that are not safely shareable (fully immutable)
* the GIL
* mutable, currently protected by the GIL
* mutable, currently protected by some other per-interpreter lock
* mutable, may be used independently in different interpreters
* all other mutable (or effectively mutable) state
  not otherwise excluded below

Furthermore, a number of parts of the global state have already been
moved to the interpreter, such as GC, warnings, and atexit hooks.

The following state will not be moved:

* global objects that are safely shareable, if any
* immutable, often ``const``
* treated as immutable
* related to CPython's ``main()`` execution
* related to the REPL
* set during runtime init, then treated as immutable
* mutable, protected by some global lock
* mutable, atomic

Note that currently the allocators (see ``Objects/obmalloc.c``) are shared
between all interpreters, protected by the GIL. They will need to move
to each interpreter (or a global lock will be needed). This is the
highest risk part of the work to isolate interpreters and may require
more than just moving fields down from ``_PyRuntimeState``. Some of
the complexity is reduced if CPython switches to a thread-safe
allocator like mimalloc.

.. _proposed capi:

C-API
-----

The following private API will be made public:

* ``_PyInterpreterConfig``
* ``_Py_NewInterpreter()`` (as ``Py_NewInterpreterEx()``)

The following fields will be added to ``PyInterpreterConfig``:

* ``own_gil`` - (bool) create a new interpreter lock
  (instead of sharing with the main interpreter)
* ``strict_extensions`` - fail import in this interpreter for
  incompatible extensions (see `Restricting Extension Modules`_)

Restricting Extension Modules
-----------------------------

Extension modules have many of the same problems as the runtime when
state is stored in global variables. :pep:`630` covers all the details
of what extensions must do to support isolation, and thus safely run in
multiple interpreters at once. This includes dealing with their globals.

Extension modules that do not implement isolation will only run in
the main interpreter. In all other interpreters, the import will
raise ``ImportError``. This will be done through
``importlib._bootstrap_external.ExtensionFileLoader``.
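
The rule above can be illustrated with a small pure-Python model. This is
only a sketch: ``FakeExtension``, its ``supports_multiple_interpreters``
flag, and ``check_extension()`` are hypothetical names, and this PEP does
not specify how an extension indicates support or how
``ExtensionFileLoader`` will implement the check.

.. code-block:: python

    from dataclasses import dataclass


    @dataclass
    class FakeExtension:
        """Stand-in for an extension module (hypothetical)."""
        name: str
        # Assumption: some flag tells the loader that isolation is implemented.
        supports_multiple_interpreters: bool = False


    def check_extension(ext: FakeExtension, in_main_interpreter: bool) -> None:
        """Raise ImportError if ext may not be imported in this interpreter."""
        if in_main_interpreter:
            # The main interpreter keeps today's behavior: everything imports.
            return
        if not ext.supports_multiple_interpreters:
            raise ImportError(
                f"module {ext.name!r} does not support multiple interpreters"
            )

In this model an extension without the flag imports normally in the main
interpreter and fails with ``ImportError`` everywhere else.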

We will work with popular extensions to help them support use in
multiple interpreters. This may involve adding to CPython's public C-API,
which we will address on a case-by-case basis.

Extension Module Compatibility
''''''''''''''''''''''''''''''

As noted in `Extension Modules`_, many extensions work fine in multiple
interpreters without needing any changes. The import system will still
fail if such a module doesn't explicitly indicate support. At first,
not many extension modules will, so this is a potential source
of frustration.

We will address this by adding a context manager to temporarily disable
the check on multiple interpreter support:
``importlib.util.allow_all_extensions()``.
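
A context manager of this kind could look roughly like the following.
This is a sketch, not the proposed implementation: the module-level
``_check_enabled`` switch and the ``import_allowed()`` helper are
assumptions made for illustration, and the real mechanism behind
``allow_all_extensions()`` is not specified here.

.. code-block:: python

    import contextlib

    # Hypothetical switch consulted by the import check.
    _check_enabled = True


    @contextlib.contextmanager
    def allow_all_extensions():
        """Temporarily disable the multiple-interpreter support check."""
        global _check_enabled
        previous = _check_enabled
        _check_enabled = False
        try:
            yield
        finally:
            # Restore the earlier setting even if the import raised.
            _check_enabled = previous


    def import_allowed(supports_multiple_interpreters: bool) -> bool:
        """Model of the loader's decision inside a non-main interpreter."""
        return supports_multiple_interpreters or not _check_enabled

An import placed inside ``with allow_all_extensions():`` would then skip
the check, and the check would apply again once the block exits.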

Documentation
-------------

The "Sub-interpreter support" section of ``Doc/c-api/init.rst`` will be
updated with the added API.


Impact
======

Backwards Compatibility
-----------------------

No behavior or APIs are intended to change due to this proposal,
with one exception noted in `the next section <Extension Modules_>`_.
The existing C-API for managing interpreters will preserve its current
behavior, with new behavior exposed through new API. No other API
or runtime behavior is meant to change, including compatibility with
the stable ABI.

See `Objects Exposed in the C-API`_ below for related discussion.

Extension Modules
'''''''''''''''''

Currently the most common usage of Python, by far, is with the main
interpreter running by itself. This proposal has zero impact on
extension modules in that scenario. Likewise, for better or worse,
there is no change in behavior under multiple interpreters created
using the existing ``Py_NewInterpreter()``.

Keep in mind that some extensions already break when used in multiple
interpreters, due to keeping module state in global variables. They
may crash or, worse, experience inconsistent behavior. That was part
of the motivation for :pep:`630` and friends, so this is not a new
situation nor a consequence of this proposal.

In contrast, when the `proposed API <proposed capi_>`_ is used to
create multiple interpreters, the default behavior will change for
some extensions. In that case, importing an extension will fail
(outside the main interpreter) if it doesn't indicate support for
multiple interpreters. For extensions that already break in
multiple interpreters, this will be an improvement.

Now we get to the break in compatibility mentioned above. Some
extensions are safe under multiple interpreters, even though they
haven't indicated that. Unfortunately, there is no reliable way for
the import system to infer that such an extension is safe, so
importing them will still fail. This case is addressed in
`Extension Module Compatibility`_ above.

Extension Module Maintainers
----------------------------

One related consideration is that a per-interpreter GIL will likely
drive increased use of multiple interpreters, particularly if :pep:`554`
is accepted. Some maintainers of large extension modules have expressed
concern about the increased burden they anticipate due to increased
use of multiple interpreters.

Specifically, enabling support for multiple interpreters will require
substantial work for some extension modules. To add that support,
the maintainer(s) of such a module (often volunteers) would have to
set aside their normal priorities and interests to focus on
compatibility (see :pep:`630`).

Of course, extension maintainers are free to not add support for use
in multiple interpreters. However, users will increasingly demand
such support, especially if the feature grows
in popularity.

Either way, the situation can be stressful for maintainers of such
extensions, particularly when they are doing the work in their spare
time. The concerns they have expressed are understandable, and
`Restricting Extension Modules`_ above describes a partial solution.

Alternate Python Implementations
--------------------------------

Other Python implementations are not required to provide support for
multiple interpreters in the same process (though some do already).

Security Implications
---------------------

There is no known impact to security with this proposal.

Maintainability
---------------

On the one hand, this proposal has already motivated a number of
improvements that make CPython *more* maintainable. That is expected
to continue. On the other hand, the underlying work has already
exposed various pre-existing defects in the runtime that have had
to be fixed. That is also expected to continue as multiple interpreters
receive more use. Otherwise, there shouldn't be a significant impact
on maintainability, so the net effect should be positive.

Performance
-----------

The work to consolidate globals has already provided a number of
improvements to CPython's performance, both speeding it up and using
less memory, and this should continue. Performance benefits to a
per-interpreter GIL have not been explored. At the very least, it is
not expected to make CPython slower (as long as interpreters are
sufficiently isolated).


How to Teach This
=================

This is an advanced feature for users of the C-API. There is no
expectation that this will be taught.

That said, if it were taught then it would boil down to the following:

In addition to ``Py_NewInterpreter()``, you can use ``Py_NewInterpreterEx()``
to create an interpreter. The config you pass it indicates how you
want that interpreter to behave.


Reference Implementation
========================

<TBD>


Open Issues
===========

* What are the risks/hurdles involved with moving the allocators?
* Is ``allow_all_extensions`` the best name for the context manager?


Deferred Functionality
======================

* ``PyInterpreterConfig`` option to always run the interpreter in a new thread
* ``PyInterpreterConfig`` option to assign a "main" thread to the interpreter
  and only run in that thread


Rejected Ideas
==============

<TBD>


Extra Context
=============

Sharing Global Objects
----------------------

We are sharing some global objects between interpreters.
This is an implementation detail and relates more to
`globals consolidation <Consolidating Runtime Global State_>`_
than to this proposal, but it is a significant enough detail
to explain here.

The alternative is to share no objects between interpreters, ever.
To accomplish that, we'd have to sort out the fate of all our static
types, as well as deal with compatibility issues for the many objects
`exposed in the public C-API <capi objects_>`_.

That approach introduces a meaningful amount of extra complexity
and higher risk, though prototyping has demonstrated valid solutions.
Also, it would likely result in a performance penalty.

`Immortal objects <Depending on Immortal Objects_>`_ allow us to
share the otherwise immutable global objects. That way we avoid
the extra costs.

.. _capi objects:

Objects Exposed in the C-API
''''''''''''''''''''''''''''

The C-API (including the limited API) exposes all the builtin types,
including the builtin exceptions, as well as the builtin singletons.
The exceptions are exposed as ``PyObject *`` but the rest are exposed
as the static values rather than pointers. This was one of the few
non-trivial problems we had to solve for per-interpreter GIL.

With immortal objects this is a non-issue.


Consolidating Runtime Global State
----------------------------------

As noted in `CPython Runtime State`_ above, there is an active effort
(separate from this PEP) to consolidate CPython's global state into the
``_PyRuntimeState`` struct. Nearly all the work involves moving that
state from global variables. The project is particularly relevant to
this proposal, so below is some extra detail.

Benefits to Consolidation
'''''''''''''''''''''''''

Consolidating the globals has a variety of benefits:

* greatly reduces the number of C globals (best practice for C code)
* the move draws attention to runtime state that is unstable or broken
* encourages more consistency in how runtime state is used
* makes multiple-interpreter behavior more reliable
* leads to fixes for long-standing runtime bugs that otherwise
haven't been prioritized
* exposes (and inspires fixes for) previously unknown runtime bugs
* facilitates cleaner runtime initialization and finalization
* makes it easier to discover/identify CPython's runtime state
* makes it easier to statically allocate runtime state in a consistent way
* better memory locality for runtime state
* structural layering of the C-API (e.g. ``Include/internal``)

Furthermore, much of that work benefits other CPython-related projects:

* performance improvements ("faster-cpython")
* pre-fork application deployment (e.g. Instagram)
* extension module isolation (see :pep:`630`, etc.)
* embedding CPython

Scale of Work
'''''''''''''

The number of global variables to be moved is large enough to matter,
but most are Python objects that can be dealt with in large groups
(like ``Py_IDENTIFIER``). In nearly all cases, moving these globals
to the interpreter is highly mechanical. That doesn't require
cleverness but instead requires someone to put in the time.

State To Be Moved
'''''''''''''''''

The remaining global variables can be categorized as follows:

* global objects
  * static types (incl. exception types)
  * non-static types (incl. heap types, structseq types)
  * singletons (static)
  * singletons (initialized once)
  * cached objects
* non-objects
  * will not (or unlikely to) change after init
  * only used in the main thread
  * initialized lazily
  * pre-allocated buffers
  * state

Those globals are spread between the core runtime, the builtin modules,
and the stdlib extension modules.

For a breakdown of the remaining globals, run:

.. code-block:: bash

    ./python Tools/c-analyzer/table-file.py Tools/c-analyzer/cpython/globals-to-fix.tsv

Already Completed Work
''''''''''''''''''''''

As mentioned, this work has been going on for many years. Here are some
of the things that have already been done:

* cleanup of runtime initialization (see :pep:`432` / :pep:`587`)
* extension module isolation machinery (see :pep:`384` / :pep:`3121` /
  :pep:`489`)
* isolation for many builtin modules
* isolation for many stdlib extension modules
* addition of ``_PyRuntimeState``
* no more ``_Py_IDENTIFIER()``
* statically allocated:

  * empty string
  * string literals
  * identifiers
  * latin-1 strings
  * length-1 bytes
  * empty tuple

Tooling
'''''''

As already indicated, there are several tools to help identify the
globals and reason about them.

* ``Tools/c-analyzer/cpython/globals-to-fix.tsv`` - the list of
  remaining globals
* ``Tools/c-analyzer/c-analyzer.py``
  * ``analyze`` - identify all the globals
  * ``check`` - fail if there are any unsupported globals that aren't ignored
* ``Tools/c-analyzer/table-file.py`` - summarize the known globals

Also, the check for unsupported globals is incorporated into CI so that
no new globals are accidentally added.

Global Objects
''''''''''''''

Global objects that are safe to be shared (without a GIL) between
interpreters can stay on ``_PyRuntimeState``. Not only must the object
be effectively immutable (e.g. singletons, strings), but not even the
refcount can change for it to be safe. Immortality (:pep:`683`)
provides that. (The alternative is that no objects are shared, which
adds significant complexity to the solution, particularly for the
objects `exposed in the public C-API <capi objects_>`_.)
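
To see why a fixed refcount matters, consider this toy model. It is not
how CPython or :pep:`683` represents immortal objects; ``ToyObject``, the
``IMMORTAL`` sentinel, ``incref()``, and ``decref()`` are all hypothetical
names used only for illustration.

.. code-block:: python

    IMMORTAL = -1  # hypothetical sentinel, not the real representation


    class ToyObject:
        """An object with an explicit reference count."""

        def __init__(self, value, immortal=False):
            self.value = value
            self.refcount = IMMORTAL if immortal else 1


    def incref(obj):
        # Immortal objects are never written to, so interpreters running
        # under different locks cannot race on the count.
        if obj.refcount != IMMORTAL:
            obj.refcount += 1


    def decref(obj):
        if obj.refcount != IMMORTAL:
            obj.refcount -= 1

In this model an ordinary object's count changes on every reference
operation, which would need a lock if shared, while an immortal object's
count is only ever read.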

Builtin static types are a special case of global objects that will be
shared. They are effectively immutable except for one part:
``__subclasses__`` (AKA ``tp_subclasses``). We expect that nothing
else on a builtin type will change, even the content
of ``__dict__`` (AKA ``tp_dict``).

``__subclasses__`` for the builtin types will be dealt with by making
it a getter that does a lookup on the current ``PyInterpreterState``
for that type.
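
A minimal Python sketch of that lookup follows. The ``InterpreterState``
class here is a toy stand-in for ``PyInterpreterState``, and
``register_subclass()`` and ``get_subclasses()`` are hypothetical helpers;
the actual C implementation is not described in this PEP.

.. code-block:: python

    class InterpreterState:
        """Toy stand-in for PyInterpreterState (illustration only)."""

        def __init__(self):
            # Maps a shared builtin type to the subclasses created
            # in this interpreter.
            self.builtin_subclasses = {}


    def register_subclass(interp, base, sub):
        """Record that ``sub`` was derived from ``base`` in ``interp``."""
        interp.builtin_subclasses.setdefault(base, []).append(sub)


    def get_subclasses(interp, base):
        """What a per-interpreter ``__subclasses__`` getter would return."""
        return list(interp.builtin_subclasses.get(base, []))

The shared type object itself is never mutated in this model; only the
per-interpreter mapping changes, so one interpreter's subclasses are
invisible to another.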


References
==========

Related:

* :pep:`384`
* :pep:`432`
* :pep:`489`
* :pep:`554`
* :pep:`573`
* :pep:`587`
* :pep:`630`
* :pep:`683`
* :pep:`3121`


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CF7B7FMACFYDAHU6NPBEVEY6TOSGICXU/
Code of Conduct: http://python.org/psf/codeofconduct/
Re: PEP 684: A Per-Interpreter GIL
On 09. 03. 22 4:38, Eric Snow wrote:
> I'd really appreciate feedback on this new PEP about making the GIL
> per-interpreter.

Yay! Thank you!


>
> The PEP targets 3.11, but we'll see if that is too close. I don't
> mind waiting one more release, though I'd prefer 3.11 (obviously).
> Regardless, I have no intention of rushing this through at the
> expense of cutting corners. Hence, we'll see how it goes.

How mature is the implementation?

If it ends up in 3.12, I'd consider asking the release manager for an
extra alpha release so people can start playing with the feature early.

(With my Fedora hat on: I'd love to test it with thousands of packages!)


> The PEP text is included inline below. Thanks!
>
> -eric
>
> ===================================================
>
> PEP: 684
> Title: A Per-Interpreter GIL
> Author: Eric Snow <ericsnowcurrently@gmail.com>
> Discussions-To: python-dev@python.org
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst

This iteration of the PEP should also have `Requires: 683` (Immortal
Objects).

> Created: 08-Mar-2022
> Python-Version: 3.11
> Post-History: 08-Mar-2022
> Resolution:


>
> Abstract
> ========
>
> Since Python 1.5 (1997), CPython users can run multiple interpreters
> in the same process. However, interpreters in the same process
> have always shared a significant
> amount of global state. This is a source of bugs, with a growing
> impact as more and more people use the feature. Furthermore,
> sufficient isolation would facilitate true multi-core parallelism,
> where interpreters no longer share the GIL. The changes outlined in
> this proposal will result in that level of interpreter isolation.
>
>
> High-Level Summary
> ==================
>
> At a high level, this proposal changes CPython in the following ways:
>
> * stops sharing the GIL between interpreters, given sufficient isolation
> * adds several new interpreter config options for isolation settings
> * adds some public C-API for fine-grained control when creating interpreters
> * keeps incompatible extensions from causing problems
>
> The GIL
> -------
>
> The GIL protects concurrent access to most of CPython's runtime state.
> So all that GIL-protected global state must move to each interpreter
> before the GIL can.
>
> (In a handful of cases, other mechanisms can be used to ensure
> thread-safe sharing instead, such as locks or "immortal" objects.)
>
> CPython Runtime State
> ---------------------
>
> Properly isolating interpreters requires that most of CPython's
> runtime state be stored in the ``PyInterpreterState`` struct. Currently,
> only a portion of it is; the rest is found either in global variables
> or in ``_PyRuntimeState``. Most of that will have to be moved.
>
> This directly coincides with an ongoing effort (of many years) to greatly
> reduce internal use of C global variables and consolidate the runtime
> state into ``_PyRuntimeState`` and ``PyInterpreterState``.
> (See `Consolidating Runtime Global State`_ below.) That project has
> `significant merit on its own <Benefits to Consolidation_>`_
> and has faced little controversy. So, while a per-interpreter GIL
> relies on the completion of that effort, that project should not be
> considered a part of this proposal--only a dependency.
>
> Other Isolation Considerations
> ------------------------------
>
> CPython's interpreters must be strictly isolated from each other, with
> few exceptions. To a large extent they already are. Each interpreter
> has its own copy of all modules, classes, functions, and variables.
> The CPython C-API docs `explain further <caveats_>`_.
>
> .. _caveats: https://docs.python.org/3/c-api/init.html#bugs-and-caveats
>
> However, aside from what has already been mentioned (e.g. the GIL),
> there are a couple of ways in which interpreters still share some state.
>
> First of all, some process-global resources (e.g. memory,
> file descriptors, environment variables) are shared. There are no
> plans to change this.
>
> Second, some isolation is faulty due to bugs or implementations that
> did not take multiple interpreters into account. This includes
> CPython's runtime and the stdlib, as well as extension modules that
> rely on global variables. Bugs should be opened in these cases,
> as some already have been.
>
> Depending on Immortal Objects
> -----------------------------
>
> :pep:`683` introduces immortal objects as a CPython-internal feature.
> With immortal objects, we can share any otherwise immutable global
> objects between all interpreters. Consequently, this PEP does not
> need to address how to deal with the various objects
> `exposed in the public C-API <capi objects_>`_.
> It also simplifies the question of what to do about the builtin
> static types. (See `Global Objects`_ below.)
>
> Both issues have alternate solutions, but everything is simpler with
> immortal objects. If PEP 683 is not accepted then this one will be
> updated with the alternatives. This lets us reduce noise in this
> proposal.
>
>
> Motivation
> ==========
>
> The fundamental problem we're solving here is a lack of true multi-core
> parallelism (for Python code) in the CPython runtime. The GIL is the
> cause. While it usually isn't a problem in practice, at the very least
> it makes Python's multi-core story murky, which makes the GIL
> a consistent distraction.
>
> Isolated interpreters are also an effective mechanism to support
> certain concurrency models. :pep:`554` discusses this in more detail.
>
> Indirect Benefits
> -----------------
>
> Most of the effort needed for a per-interpreter GIL has benefits that
> make those tasks worth doing anyway:
>
> * makes multiple-interpreter behavior more reliable
> * has led to fixes for long-standing runtime bugs that otherwise
>   hadn't been prioritized
> * has been exposing (and inspiring fixes for) previously unknown runtime bugs
> * has driven cleaner runtime initialization (:pep:`432`, :pep:`587`)
> * has driven cleaner and more complete runtime finalization
> * led to structural layering of the C-API (e.g. ``Include/internal``)
> * also see `Benefits to Consolidation`_ below

Do you want to dig up some bpo examples, to make these more convincing
to the casual reader?

>
> Furthermore, much of that work benefits other CPython-related projects:
>
> * performance improvements ("faster-cpython")
> * pre-fork application deployment (e.g. Instagram)

Maybe say “e.g. with Instagram's Cinder” – both the household name and
the project you can link to?

> * extension module isolation (see :pep:`630`, etc.)
> * embedding CPython
>
> Existing Use of Multiple Interpreters
> -------------------------------------
>
> The C-API for multiple interpreters has been used for many years.
> However, until relatively recently the feature wasn't widely known,
> nor extensively used (with the exception of mod_wsgi).
>
> In the last few years use of multiple interpreters has been increasing.
> Here are some of the public projects using the feature currently:
>
> * `mod_wsgi <https://github.com/GrahamDumpleton/mod_wsgi>`_
> * `OpenStack Ceph <https://github.com/ceph/ceph/pull/14971>`_
> * `JEP <https://github.com/ninia/jep>`_
> * `Kodi <https://github.com/xbmc/xbmc>`_
>
> Note that, with :pep:`554`, multiple interpreter usage would likely
> grow significantly (via Python code rather than the C-API).
>
> PEP 554
> -------

Please spell out "PEP 554 (Multiple Interpreters in the Stdlib)", for
people who don't remember the magic numbers but want to skim the table
of contents.

> :pep:`554` is strictly about providing a minimal stdlib module
> to give users access to multiple interpreters from Python code.
> In fact, it specifically avoids proposing any changes related to
> the GIL. Consider, however, that users of that module would benefit
> from a per-interpreter GIL, which makes PEP 554 more appealing.
>
>
> Rationale
> =========
>
> During initial investigations in 2014, a variety of possible solutions
> for multi-core Python were explored, but each had its drawbacks
> without simple solutions:
>
> * the existing practice of releasing the GIL in extension modules
>   * doesn't help with Python code
> * other Python implementations (e.g. Jython, IronPython)
>   * CPython dominates the community
> * remove the GIL (e.g. gilectomy, "no-gil")
>   * too much technical risk (at the time)
> * Trent Nelson's "PyParallel" project
>   * incomplete; Windows-only at the time
> * ``multiprocessing``
>
>   * too much work to make it effective enough;
>     high penalties in some situations (at large scale, Windows)
>
> * other parallelism tools (e.g. dask, ray, MPI)
>   * not a fit for the stdlib
> * give up on multi-core (e.g. async, do nothing)
>   * this can only end in tears

This list doesn't render correctly in ReST, you need blank lines everywhere.

>
> Even in 2014, it was fairly clear that a solution using isolated
> interpreters did not have a high level of technical risk and that
> most of the work was worth doing anyway.
> (The downside was the volume of work to be done.)
>
>
> Specification
> =============
>
> As `summarized above <High-Level Summary_>`__, this proposal involves the
> following changes, in the order they must happen:
>
> 1. `consolidate global runtime state <Consolidating Runtime Global State_>`_
>    (including objects) into ``_PyRuntimeState``
> 2. move nearly all of the state down into ``PyInterpreterState``
> 3. finally, move the GIL down into ``PyInterpreterState``
> 4. everything else
>    * add to the public C-API
>    * implement restrictions in ``ExtensionFileLoader``
>
>    * work with popular extension maintainers to help
>      with multi-interpreter support

And this needs blank lines too.

>
> Per-Interpreter State
> ---------------------
>
> The following runtime state will be moved to ``PyInterpreterState``:
>
> * all global objects that are not safely shareable (fully immutable)
> * the GIL
> * mutable, currently protected by the GIL

Spelling out “mutable state” in these lists would make this clearer,
since “state” isn't elided from all the points.

> * mutable, currently protected by some other per-interpreter lock
> * mutable, may be used independently in different interpreters

This includes extension modules (with multi-phase init), right?

> * all other mutable (or effectively mutable) state
> not otherwise excluded below
>
> Furthermore, a number of parts of the global state have already been
> moved to the interpreter, such as GC, warnings, and atexit hooks.
>
> The following state will not be moved:
>
> * global objects that are safely shareable, if any
> * immutable, often ``const``
> * treated as immutable

Do you have an example for this?

> * related to CPython's ``main()`` execution
> * related to the REPL

Would “only used by” work instead of “related to”?


> * set during runtime init, then treated as immutable

`main()`, REPL and runtime init look like special cases of functionality
that only runs in one interpreter. Maybe generalize this?

> * mutable, protected by some global lock
> * mutable, atomic
>
> Note that currently the allocators (see ``Objects/obmalloc.c``) are shared
> between all interpreters, protected by the GIL. They will need to move
> to each interpreter (or a global lock will be needed). This is the
> highest risk part of the work to isolate interpreters and may require
> more than just moving fields down from ``_PyRuntimeState``. Some of
> the complexity is reduced if CPython switches to a thread-safe
> allocator like mimalloc.
>
> .. _proposed capi:
>
> C-API
> -----
>
> The following private API will be made public:
>
> * ``_PyInterpreterConfig``
> * ``_Py_NewInterpreter()`` (as ``Py_NewInterpreterEx()``)

Since the API is not documented (and _PyInterpreterConfig is not even in
main yet!), it would be good to sketch out the docs (intended behavior)
here.

> The following fields will be added to ``PyInterpreterConfig``:
>
> * ``own_gil`` - (bool) create a new interpreter lock
> (instead of sharing with the main interpreter)
> * ``strict_extensions`` - fail import in this interpreter for
> incompatible extensions (see `Restricting Extension Modules`_)
>
> Restricting Extension Modules
> -----------------------------
>
> Extension modules have many of the same problems as the runtime when
> state is stored in global variables. :pep:`630` covers all the details
> of what extensions must do to support isolation, and thus safely run in
> multiple interpreters at once. This includes dealing with their globals.
>
> Extension modules that do not implement isolation will only run in
> the main interpreter. In all other interpreters, the import will
> raise ``ImportError``. This will be done through
> ``importlib._bootstrap_external.ExtensionFileLoader``.

“Main interpreter” should be defined. Or maybe the term should be
*avoided*, and name a flag that Py_Initialize sets but Py_NewInterpreter
doesn't.

> We will work with popular extensions to help them support use in
> multiple interpreters. This may involve adding to CPython's public C-API,
> which we will address on a case-by-case basis.
>
> Extension Module Compatibility
> ''''''''''''''''''''''''''''''
>
> As noted in `Extension Modules`_, many extensions work fine in multiple
> interpreters without needing any changes. The import system will still
> fail if such a module doesn't explicitly indicate support. At first,
> not many extension modules will, so this is a potential source
> of frustration.
>
> We will address this by adding a context manager to temporarily disable
> the check on multiple interpreter support:
> ``importlib.util.allow_all_extensions()``.
>
> Documentation
> -------------
>
> The "Sub-interpreter support" section of ``Doc/c-api/init.rst`` will be
> updated with the added API.
>
>
> Impact
> ======
>
> Backwards Compatibility
> -----------------------
>
> No behavior or APIs are intended to change due to this proposal,
> with one exception noted in `the next section <Extension Modules_>`_.
> The existing C-API for managing interpreters will preserve its current
> behavior, with new behavior exposed through new API. No other API
> or runtime behavior is meant to change, including compatibility with
> the stable ABI.
>
> See `Objects Exposed in the C-API`_ below for related discussion.
>
> Extension Modules
> '''''''''''''''''
>
> Currently the most common usage of Python, by far, is with the main
> interpreter running by itself. This proposal has zero impact on
> extension modules in that scenario. Likewise, for better or worse,
> there is no change in behavior under multiple interpreters created
> using the existing ``Py_NewInterpreter()``.
>
> Keep in mind that some extensions already break when used in multiple
> interpreters, due to keeping module state in global variables. They
> may crash or, worse, experience inconsistent behavior. That was part
> of the motivation for :pep:`630` and friends, so this is not a new
> situation nor a consequence of this proposal.
>
> In contrast, when the `proposed API <proposed capi_>`_ is used to
> create multiple interpreters, the default behavior will change for
> some extensions. In that case, importing an extension will fail
> (outside the main interpreter) if it doesn't indicate support for
> multiple interpreters. For extensions that already break in
> multiple interpreters, this will be an improvement.
>
> Now we get to the break in compatibility mentioned above. Some
> extensions are safe under multiple interpreters, even though they
> haven't indicated that. Unfortunately, there is no reliable way for
> the import system to infer that such an extension is safe, so
> importing them will still fail. This case is addressed in
> `Extension Module Compatibility`_ above.
>
> Extension Module Maintainers
> ----------------------------
>
> One related consideration is that a per-interpreter GIL will likely
> drive increased use of multiple interpreters, particularly if :pep:`554`
> is accepted. Some maintainers of large extension modules have expressed
> concern about the increased burden they anticipate due to increased
> use of multiple interpreters.
>
> Specifically, enabling support for multiple interpreters will require
> substantial work for some extension modules. To add that support,
> the maintainer(s) of such a module (often volunteers) would have to
> set aside their normal priorities and interests to focus on
> compatibility (see :pep:`630`).
>
> Of course, extension maintainers are free to not add support for use
> in multiple interpreters. However, users will increasingly demand
> such support, especially if the feature grows
> in popularity.
>
> Either way, the situation can be stressful for maintainers of such
> extensions, particularly when they are doing the work in their spare
> time. The concerns they have expressed are understandable, and we address
> the partial solution in `Restricting Extension Modules`_ above.
>
> Alternate Python Implementations
> --------------------------------
>
> Other Python implementations are not required to provide support for
> multiple interpreters in the same process (though some do already).
>
> Security Implications
> ---------------------
>
> There is no known impact to security with this proposal.
>
> Maintainability
> ---------------
>
> On the one hand, this proposal has already motivated a number of
> improvements that make CPython *more* maintainable. That is expected
> to continue. On the other hand, the underlying work has already
> exposed various pre-existing defects in the runtime that have had
> to be fixed. That is also expected to continue as multiple interpreters
> receive more use. Otherwise, there shouldn't be a significant impact
> on maintainability, so the net effect should be positive.
>
> Performance
> -----------
>
> The work to consolidate globals has already provided a number of
> improvements to CPython's performance, both speeding it up and using
> less memory, and this should continue. Performance benefits to a
> per-interpreter GIL have not been explored. At the very least, it is
> not expected to make CPython slower (as long as interpreters are
> sufficiently isolated).
>
>
> How to Teach This
> =================
>
> This is an advanced feature for users of the C-API. There is no
> expectation that this will be taught.
>
> That said, if it were taught then it would boil down to the following:
>
> In addition to Py_NewInterpreter(), you can use Py_NewInterpreterEx()
> to create an interpreter. The config you pass it indicates how you
> want that interpreter to behave.
>
>
> Reference Implementation
> ========================
>
> <TBD>
>
>
> Open Issues
> ===========
>
> * What are the risks/hurdles involved with moving the allocators?
> * Is ``allow_all_extensions`` the best name for the context manager?
>
>
> Deferred Functionality
> ======================
>
> * ``PyInterpreterConfig`` option to always run the interpreter in a new thread
> * ``PyInterpreterConfig`` option to assign a "main" thread to the interpreter
> and only run in that thread
>
>
> Rejected Ideas
> ==============
>
> <TBD>
>
>
> Extra Context
> =============
>
> Sharing Global Objects
> ----------------------
>
> We are sharing some global objects between interpreters.
> This is an implementation detail and relates more to
> `globals consolidation <Consolidating Runtime Global State_>`_
> than to this proposal, but it is a significant enough detail
> to explain here.
>
> The alternative is to share no objects between interpreters, ever.
> To accomplish that, we'd have to sort out the fate of all our static
> types, as well as deal with compatibility issues for the many objects
> `exposed in the public C-API <capi objects_>`_.
>
> That approach introduces a meaningful amount of extra complexity
> and higher risk, though prototyping has demonstrated valid solutions.
> Also, it would likely result in a performance penalty.
>
> `Immortal objects <Depending on Immortal Objects_>`_ allow us to
> share the otherwise immutable global objects. That way we avoid
> the extra costs.
>
> .. _capi objects:
>
> Objects Exposed in the C-API
> ''''''''''''''''''''''''''''
>
> The C-API (including the limited API) exposes all the builtin types,
> including the builtin exceptions, as well as the builtin singletons.
> The exceptions are exposed as ``PyObject *`` but the rest are exposed
> as the static values rather than pointers. This was one of the few
> non-trivial problems we had to solve for per-interpreter GIL.
>
> With immortal objects this is a non-issue.
>
>
> Consolidating Runtime Global State
> ----------------------------------
>
> As noted in `CPython Runtime State`_ above, there is an active effort
> (separate from this PEP) to consolidate CPython's global state into the
> ``_PyRuntimeState`` struct. Nearly all the work involves moving that
> state from global variables. The project is particularly relevant to
> this proposal, so below is some extra detail.
>
> Benefits to Consolidation
> '''''''''''''''''''''''''
>
> Consolidating the globals has a variety of benefits:
>
> * greatly reduces the number of C globals (best practice for C code)
> * the move draws attention to runtime state that is unstable or broken
> * encourages more consistency in how runtime state is used
> * makes multiple-interpreter behavior more reliable
> * leads to fixes for long-standing runtime bugs that otherwise
> haven't been prioritized
> * exposes (and inspires fixes for) previously unknown runtime bugs
> * facilitates cleaner runtime initialization and finalization
> * makes it easier to discover/identify CPython's runtime state
> * makes it easier to statically allocate runtime state in a consistent way
> * better memory locality for runtime state
> * structural layering of the C-API (e.g. ``Include/internal``)
>
> Furthermore, much of that work benefits other CPython-related projects:
>
> * performance improvements ("faster-cpython")
> * pre-fork application deployment (e.g. Instagram)
> * extension module isolation (see :pep:`630`, etc.)
> * embedding CPython
>
> Scale of Work
> '''''''''''''
>
> The number of global variables to be moved is large enough to matter,
> but most are Python objects that can be dealt with in large groups
> (like ``Py_IDENTIFIER``). In nearly all cases, moving these globals
> to the interpreter is highly mechanical. That doesn't require
> cleverness but instead requires someone to put in the time.
>
> State To Be Moved
> '''''''''''''''''
>
> The remaining global variables can be categorized as follows:
>
> * global objects
>   * static types (incl. exception types)
>   * non-static types (incl. heap types, structseq types)
>   * singletons (static)
>   * singletons (initialized once)
>   * cached objects
> * non-objects
>   * will not (or unlikely to) change after init
>   * only used in the main thread
>   * initialized lazily
>   * pre-allocated buffers
>   * state
>
> Those globals are spread between the core runtime, the builtin modules,
> and the stdlib extension modules.
>
> For a breakdown of the remaining globals, run:
>
> .. code-block:: bash
>
>     ./python Tools/c-analyzer/table-file.py Tools/c-analyzer/cpython/globals-to-fix.tsv
>
> Already Completed Work
> ''''''''''''''''''''''
>
> As mentioned, this work has been going on for many years. Here are some
> of the things that have already been done:
>
> * cleanup of runtime initialization (see :pep:`432` / :pep:`587`)
> * extension module isolation machinery (see :pep:`384` / :pep:`3121` /
> :pep:`489`)
> * isolation for many builtin modules
> * isolation for many stdlib extension modules
> * addition of ``_PyRuntimeState``
> * no more ``_Py_IDENTIFIER()``
> * statically allocated:
>
>   * empty string
>   * string literals
>   * identifiers
>   * latin-1 strings
>   * length-1 bytes
>   * empty tuple
>
> Tooling
> '''''''
>
> As already indicated, there are several tools to help identify the
> globals and reason about them.
>
> * ``Tools/c-analyzer/cpython/globals-to-fix.tsv`` - the list of
>   remaining globals
> * ``Tools/c-analyzer/c-analyzer.py``
>   * ``analyze`` - identify all the globals
>   * ``check`` - fail if there are any unsupported globals that aren't ignored
> * ``Tools/c-analyzer/table-file.py`` - summarize the known globals
>
> Also, the check for unsupported globals is incorporated into CI so that
> no new globals are accidentally added.
>
> Global Objects
> ''''''''''''''
>
> Global objects that are safe to be shared (without a GIL) between
> interpreters can stay on ``_PyRuntimeState``. Not only must the object
> be effectively immutable (e.g. singletons, strings), but not even the
> refcount can change for it to be safe. Immortality (:pep:`683`)
> provides that. (The alternative is that no objects are shared, which
> adds significant complexity to the solution, particularly for the
> objects `exposed in the public C-API <capi objects_>`_.)
>
> Builtin static types are a special case of global objects that will be
> shared. They are effectively immutable except for one part:
> ``__subclasses__`` (AKA ``tp_subclasses``). We expect that nothing
> else on a builtin type will change, even the content
> of ``__dict__`` (AKA ``tp_dict``).
>
> ``__subclasses__`` for the builtin types will be dealt with by making
> it a getter that does a lookup on the current ``PyInterpreterState``
> for that type.
>
>
> References
> ==========
>
> Related:
>
> * :pep:`384`
> * :pep:`432`
> * :pep:`489`
> * :pep:`554`
> * :pep:`573`
> * :pep:`587`
> * :pep:`630`
> * :pep:`683`
> * :pep:`3121`
>
>
> Copyright
> =========
>
> This document is placed in the public domain or under the
> CC0-1.0-Universal license, whichever is more permissive.
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-leave@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CF7B7FMACFYDAHU6NPBEVEY6TOSGICXU/
> Code of Conduct: http://python.org/psf/codeofconduct/
_______________________________________________
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LQHDO3JEMSSLIGSQ3SRHJQYFMTF4SQCM/
Re: PEP 684: A Per-Interpreter GIL [ In reply to ]
Oops, I hit Send by mistake! Please disregard the previous message (I
often draft questions I later find answered, so I delete them.)

On Wed, Mar 9, 2022 at 5:53 PM Petr Viktorin <encukou@gmail.com> wrote:
>
> On 09. 03. 22 4:38, Eric Snow wrote:
> > I'd really appreciate feedback on this new PEP about making the GIL
> > per-interpreter.
>
> Yay! Thank you!
>
_______________________________________________
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/RLBJEE2MLXMJNN2R444AFZDN54JDRWI7/
Re: PEP 684: A Per-Interpreter GIL [ In reply to ]
On 09. 03. 22 4:38, Eric Snow wrote:
> I'd really appreciate feedback on this new PEP about making the GIL
> per-interpreter.

Yay! Thank you!
This PEP definitely makes per-interpreter GIL sound possible :)


> The PEP targets 3.11, but we'll see if that is too close. I don't
> mind waiting one more
> release, though I'd prefer 3.11 (obviously). Regardless, I have no
> intention of rushing
> this through at the expense of cutting corners. Hence, we'll see how it goes.
>
> The PEP text is included inline below. Thanks!
>
> -eric
>
> ===================================================
>
> PEP: 684
> Title: A Per-Interpreter GIL
> Author: Eric Snow <ericsnowcurrently@gmail.com>
> Discussions-To: python-dev@python.org
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst

This iteration of the PEP should also have `Requires: 683` (Immortal
Objects).

[...]
>
> Motivation
> ==========
>
> The fundamental problem we're solving here is a lack of true multi-core
> parallelism (for Python code) in the CPython runtime. The GIL is the
> cause. While it usually isn't a problem in practice, at the very least
> it makes Python's multi-core story murky, which makes the GIL
> a consistent distraction.
>
> Isolated interpreters are also an effective mechanism to support
> certain concurrency models. :pep:`554` discusses this in more detail.
>
> Indirect Benefits
> -----------------
>
> Most of the effort needed for a per-interpreter GIL has benefits that
> make those tasks worth doing anyway:
>
> * makes multiple-interpreter behavior more reliable
> * has led to fixes for long-standing runtime bugs that otherwise
> hadn't been prioritized
> * has been exposing (and inspiring fixes for) previously unknown
> runtime bugs
> * has driven cleaner runtime initialization (:pep:`432`, :pep:`587`)
> * has driven cleaner and more complete runtime finalization
> * led to structural layering of the C-API (e.g. ``Include/internal``)
> * also see `Benefits to Consolidation`_ below

Do you want to dig up some bpo examples, to make these more convincing
to the casual reader?

>
> Furthermore, much of that work benefits other CPython-related projects:
>
> * performance improvements ("faster-cpython")
> * pre-fork application deployment (e.g. Instagram)

Maybe say “e.g. with Instagram's Cinder” – both the household name and
the project you can link to?

> * extension module isolation (see :pep:`630`, etc.)
> * embedding CPython

A lot of these points are duplicated in "Benefits to Consolidation" list
below, maybe there'd be, ehm, benefits to consolidating them?

[...]
> PEP 554
> -------

Please spell out "PEP 554 (Multiple Interpreters in the Stdlib)", for
people who don't remember the magic numbers but want to skim the table
of contents.

> :pep:`554` is strictly about providing a minimal stdlib module
> to give users access to multiple interpreters from Python code.
> In fact, it specifically avoids proposing any changes related to
> the GIL. Consider, however, that users of that module would benefit
> from a per-interpreter GIL, which makes PEP 554 more appealing.
>
>
> Rationale
> =========
>
> During initial investigations in 2014, a variety of possible solutions
> for multi-core Python were explored, but each had its drawbacks
> without simple solutions:
>
> * the existing practice of releasing the GIL in extension modules
>   * doesn't help with Python code
> * other Python implementations (e.g. Jython, IronPython)
>   * CPython dominates the community
> * remove the GIL (e.g. gilectomy, "no-gil")
>   * too much technical risk (at the time)
> * Trent Nelson's "PyParallel" project
>   * incomplete; Windows-only at the time
> * ``multiprocessing``
>
>   * too much work to make it effective enough;
>     high penalties in some situations (at large scale, Windows)
>
> * other parallelism tools (e.g. dask, ray, MPI)
>   * not a fit for the stdlib
> * give up on multi-core (e.g. async, do nothing)
>   * this can only end in tears

This list doesn't render correctly in ReST, you need blank lines everywhere.
There are more cases like this below.

[...]
> Per-Interpreter State
> ---------------------
>
> The following runtime state will be moved to ``PyInterpreterState``:
>
> * all global objects that are not safely shareable (fully immutable)
> * the GIL
> * mutable, currently protected by the GIL

Spelling out “mutable state” in these lists would make this clearer,
since “state” isn't elided from all the points.

> * mutable, currently protected by some other per-interpreter lock
> * mutable, may be used independently in different interpreters

This includes extension modules (with multi-phase init), right?

> * all other mutable (or effectively mutable) state
> not otherwise excluded below
>
> Furthermore, a number of parts of the global state have already been
> moved to the interpreter, such as GC, warnings, and atexit hooks.
>
> The following state will not be moved:
>
> * global objects that are safely shareable, if any
> * immutable, often ``const``
> * treated as immutable

Do you have an example for this?

> * related to CPython's ``main()`` execution
> * related to the REPL

Would “only used by” work instead of “related to”?

> * set during runtime init, then treated as immutable

`main()`, REPL and runtime init look like special cases of functionality
that only runs in one interpreter. If it's so, maybe generalize this?

[...]
> C-API
> -----
>
> The following private API will be made public:
>
> * ``_PyInterpreterConfig``
> * ``_Py_NewInterpreter()`` (as ``Py_NewInterpreterEx()``)

Since the API is not documented (and _PyInterpreterConfig is not even in
main yet!), it would be good to sketch out the docs (intended behavior)
here.

> The following fields will be added to ``PyInterpreterConfig``:
>
> * ``own_gil`` - (bool) create a new interpreter lock
> (instead of sharing with the main interpreter)

As a user of the API, what should I consider when setting this flag?
Would the GIL be shared with the *parent* interpreter or the main one?
What are the restrictions/implications of this flag?

> * ``strict_extensions`` - fail import in this interpreter for
> incompatible extensions (see `Restricting Extension Modules`_)

I'm not sure about including a workaround flag in the structure.
Since the Python API will get a context manager for this, maybe the C
API should get a function to set/reset it instead of this flag?


> Restricting Extension Modules
> -----------------------------
>
> Extension modules have many of the same problems as the runtime when
> state is stored in global variables. :pep:`630` covers all the details
> of what extensions must do to support isolation, and thus safely run in
> multiple interpreters at once. This includes dealing with their globals.
>
> Extension modules that do not implement isolation will only run in
> the main interpreter. In all other interpreters, the import will
> raise ``ImportError``. This will be done through
> ``importlib._bootstrap_external.ExtensionFileLoader``.

“Main interpreter” should be defined. (Or maybe the term should be
avoided instead -- always having to spell out “interpreter started by
Py_Initialize rather than Py_NewInterpreter” might push us toward
finding ways to avoid the special case...)

> We will work with popular extensions to help them support use in
> multiple interpreters. This may involve adding to CPython's public C-API,
> which we will address on a case-by-case basis.
>
> Extension Module Compatibility
> ''''''''''''''''''''''''''''''
>
> As noted in `Extension Modules`_, many extensions work fine in multiple
> interpreters without needing any changes. The import system will still
> fail if such a module doesn't explicitly indicate support. At first,
> not many extension modules will, so this is a potential source
> of frustration.
>
> We will address this by adding a context manager to temporarily disable
> the check on multiple interpreter support:
> ``importlib.util.allow_all_extensions()``.
>

I'd prefer a more dangerous-sounding name, to guide code readers (and
autocomplete users) toward checking the warning in the docs.


[...]
> Extension Modules
> '''''''''''''''''
>
> Currently the most common usage of Python, by far, is with the main
> interpreter running by itself. This proposal has zero impact on
> extension modules in that scenario. Likewise, for better or worse,
> there is no change in behavior under multiple interpreters created
> using the existing ``Py_NewInterpreter()``.
>
> Keep in mind that some extensions already break when used in multiple
> interpreters, due to keeping module state in global variables. They
> may crash or, worse, experience inconsistent behavior. That was part
> of the motivation for :pep:`630` and friends, so this is not a new
> situation nor a consequence of this proposal.
>
> In contrast, when the `proposed API <proposed capi_>`_ is used to
> create multiple interpreters, the default behavior will change for
> some extensions. In that case, importing an extension will fail
> (outside the main interpreter) if it doesn't indicate support for
> multiple interpreters. For extensions that already break in
> multiple interpreters, this will be an improvement.
>
> Now we get to the break in compatibility mentioned above. Some
> extensions are safe under multiple interpreters, even though they
> haven't indicated that. Unfortunately, there is no reliable way for
> the import system to infer that such an extension is safe, so
> importing them will still fail. This case is addressed in
> `Extension Module Compatibility`_ above.

Extensions that use multi-phase init should already be compatible with
multiple interpreters. Multi-phase init itself is the flag that
indicates this.
But they might not be compatible with *per-interpreter GIL*. I don't
like how that's conflated with multiple interpreters here.
For example, extension modules can currently support multiple
interpreters, but rely on the GIL to protect calls to a non-threadsafe
library, access shared memory, etc. As an example, the PEP 630 “opt-out”
is not thread-safe.

It seems to me that there should be a separate flag (slot) to indicate
support for per-interpreter GIL, and the `strict_extensions` bit should
work with that.
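
As a rough sketch of that idea, the import decision could be modeled as below. This is one possible reading, not the PEP's specification: `InterpreterConfig` only mirrors the two proposed `PyInterpreterConfig` fields, `ExtensionInfo.per_interpreter_gil` is the hypothetical separate flag (slot) suggested above, and `may_import()` is an invented helper name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterpreterConfig:
    # Hypothetical mirror of the two proposed PyInterpreterConfig fields.
    own_gil: bool = False
    strict_extensions: bool = False


@dataclass(frozen=True)
class ExtensionInfo:
    name: str
    # Module uses multi-phase init (PEP 489), i.e. supports multiple interpreters.
    multi_phase_init: bool = False
    # Hypothetical separate slot: module is also safe without a shared GIL.
    per_interpreter_gil: bool = False


def may_import(ext, config, is_main_interpreter):
    """Return True if the extension may be imported under this config."""
    if is_main_interpreter or not config.strict_extensions:
        return True  # no restriction applies
    if not ext.multi_phase_init:
        return False  # not marked as supporting multiple interpreters
    if config.own_gil:
        # With its own GIL, the interpreter needs the stronger guarantee.
        return ext.per_interpreter_gil
    return True
```

The design point is that the two kinds of support are checked separately: a module that relies on the GIL to guard a non-threadsafe library would still import in a strict interpreter sharing the GIL, but not in one with `own_gil` set.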

[...]
> How to Teach This
> =================
>
> This is an advanced feature for users of the C-API. There is no
> expectation that this will be taught.

Oh, I'm afraid this will need some docs related to making sure an
extension is compatible with per-interpreter GIL.
I'd rather not repeat my mistake of hand-wavingly noting "All modules
created using multi-phase initialization are expected to support
sub-interpreters" in the docs, and only writing PEP 630 much later.

[...]
> References
> ==========
>
> Related:
>
> * :pep:`384`
> * :pep:`432`
> * :pep:`489`
> * :pep:`554`
> * :pep:`573`
> * :pep:`587`
> * :pep:`630`
> * :pep:`683`
> * :pep:`3121`

Please write out the titles here.
_______________________________________________
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/UEUYVZ5IHSE3FYUDJ3INLSDWS7IZITOB/
Re: PEP 684: A Per-Interpreter GIL [ In reply to ]
Thanks for the feedback, Petr! Responses inline below.

-eric

On Wed, Mar 9, 2022 at 10:58 AM Petr Viktorin <encukou@gmail.com> wrote:
> This PEP definitely makes per-interpreter GIL sound possible :)

Oh good. :)

> > PEP: 684
> > Title: A Per-Interpreter GIL
> > Author: Eric Snow <ericsnowcurrently@gmail.com>
> > Discussions-To: python-dev@python.org
> > Status: Draft
> > Type: Standards Track
> > Content-Type: text/x-rst
>
> This iteration of the PEP should also have `Requires: 683` (Immortal
> Objects).

+1

> > Most of the effort needed for a per-interpreter GIL has benefits that
> > make those tasks worth doing anyway:
> >
> > * makes multiple-interpreter behavior more reliable
> > * has led to fixes for long-standing runtime bugs that otherwise
> > hadn't been prioritized
> > * has been exposing (and inspiring fixes for) previously unknown
> > runtime bugs
> > * has driven cleaner runtime initialization (:pep:`432`, :pep:`587`)
> > * has driven cleaner and more complete runtime finalization
> > * led to structural layering of the C-API (e.g. ``Include/internal``)
> > * also see `Benefits to Consolidation`_ below
>
> Do you want to dig up some bpo examples, to make these more convincing
> to the casual reader?

Heh, the casual reader isn't really my target audience. :) I actually
have a stockpile of links but left them all out until they were
needed. Would the decision-makers benefit from the links? I'm trying
to avoid adding to the already sizeable clutter in this PEP. :) I'll
add some links in if you think it matters.

> > Furthermore, much of that work benefits other CPython-related projects:
> >
> > * performance improvements ("faster-cpython")
> > * pre-fork application deployment (e.g. Instagram)
>
> Maybe say “e.g. with Instagram's Cinder” – both the household name and
> the project you can link to?

+1

Note that Instagram isn't exactly using Cinder. I'll have to check if
Cinder uses the pre-fork model.

> > * extension module isolation (see :pep:`630`, etc.)
> > * embedding CPython
>
> A lot of these points are duplicated in "Benefits to Consolidation" list
> below, maybe there'd be, ehm, benefits to consolidating them?

There shouldn't be any direct overlap.

FWIW, the whole "Extra Context" section is essentially a separate PEP
that I inlined (with the caveat that it really isn't worth its own
PEP). I'm still considering yanking it, so the above list should
stand on its own.

> > PEP 554
> > -------
>
> Please spell out "PEP 554 (Multiple Interpreters in the Stdlib)", for
> people who don't remember the magic numbers but want to skim the table
> of contents.

+1

> This list doesn't render correctly in ReST, you need blank lines everywhere.
> There are more cases like this below.

Hmm, I had blank lines and the PEP editor told me I needed to remove them.

> [...]
> > Per-Interpreter State
> > ---------------------
> >
> > The following runtime state will be moved to ``PyInterpreterState``:
> >
> > * all global objects that are not safely shareable (fully immutable)
> > * the GIL
> > * mutable, currently protected by the GIL
>
> Spelling out “mutable state” in these lists would make this clearer,
> since “state” isn't elided from all the points.

+1

> > * mutable, currently protected by some other per-interpreter lock
> > * mutable, may be used independently in different interpreters
>
> This includes extension modules (with multi-phase init), right?

Yep.

> > The following state will not be moved:
> >
> > * global objects that are safely shareable, if any
> > * immutable, often ``const``
> > * treated as immutable
>
> Do you have an example for this?

Strings (PyUnicodeObject) actually cache some info, making them not
strictly immutable, but they are close enough to be treated as such.
I'll add a note to the PEP.
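
To illustrate "treated as immutable", here is a rough Python analogy. It is
a toy class, not how PyUnicodeObject is implemented; the class name and the
choice of the hash as the cached value are assumptions for illustration only.
The object mutates an internal cache on first use, yet its visible value
never changes:

```python
class CachedHashText:
    """Toy stand-in for a string-like object (hypothetical, not PyUnicodeObject)."""

    __slots__ = ("_value", "_hash")

    def __init__(self, value):
        self._value = value
        self._hash = None  # internal cache, filled lazily on first hash()

    @property
    def value(self):
        return self._value

    def __hash__(self):
        # Writing the cache mutates the object internally, but the visible
        # value never changes, so callers can treat the object as immutable.
        if self._hash is None:
            self._hash = hash(self._value)
        return self._hash

    def __eq__(self, other):
        return isinstance(other, CachedHashText) and self._value == other._value


text = CachedHashText("spam")
first = hash(text)   # computes and stores the hash
second = hash(text)  # served from the cache
```

The internal write is invisible to callers, which is the sense in which such
an object is "close enough" to immutable.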

> > * related to CPython's ``main()`` execution
> > * related to the REPL
>
> Would “only used by” work instead of “related to”?

Sure.

> > * set during runtime init, then treated as immutable
>
> `main()`, REPL and runtime init look like special cases of functionality
> that only runs in one interpreter. If it's so, maybe generalize this?

+1

> > * ``_PyInterpreterConfig``
> > * ``_Py_NewInterpreter()`` (as ``Py_NewInterpreterEx()``)
>
> Since the API is not documented (and _PyInterpreterConfig is not even in
> main yet!), it would be good to sketch out the docs (intended behavior)
> here.

+1

> > The following fields will be added to ``PyInterpreterConfig``:
> >
> > * ``own_gil`` - (bool) create a new interpreter lock
> > (instead of sharing with the main interpreter)
>
> As a user of the API, what should I consider when setting this flag?
> Would the GIL be shared with the *parent* interpreter or the main one?

The GIL would be shared with the main interpreter. I state that there
but it looks like I wasn't clear enough.

> What are the restrictions/implications of this flag?

Good point. I'll add a brief explanation of why you would want to
keep sharing the GIL (e.g. the status quo) and what is different if
you don't.
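
As a rough sketch of the intended semantics only (a pure-Python model, not
the C-API; ``InterpreterConfig``, ``Interpreter`` and ``Runtime`` are
hypothetical names, and a ``threading.Lock`` merely stands in for a GIL):

```python
import threading
from dataclasses import dataclass


@dataclass
class InterpreterConfig:
    """Hypothetical Python-level mirror of the two proposed config fields."""
    own_gil: bool = False
    strict_extensions: bool = False


@dataclass
class Interpreter:
    name: str
    gil: object  # a lock standing in for the interpreter's GIL


class Runtime:
    def __init__(self):
        self.main = Interpreter("main", threading.Lock())

    def new_interpreter(self, name, config):
        # Without own_gil the lock is shared with the *main* interpreter,
        # not with whichever interpreter happens to be the parent.
        gil = threading.Lock() if config.own_gil else self.main.gil
        return Interpreter(name, gil)


runtime = Runtime()
shared = runtime.new_interpreter("shared", InterpreterConfig())
isolated = runtime.new_interpreter("isolated", InterpreterConfig(own_gil=True))
```

Here ``shared`` contends with the main interpreter for one lock (the status
quo), while ``isolated`` has its own lock and could run in parallel.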

> > * ``strict_extensions`` - fail import in this interpreter for
> > incompatible extensions (see `Restricting Extension Modules`_)
>
> I'm not sure about including a workaround flag in the structure.
> Since the Python API will get a context manager for this, maybe the C
> API should get a function to set/reset it instead of this flag?

The flag is necessary if we want to be able to preserve the current
behavior of the existing API, Py_NewInterpreter(), which we do.

> > Restricting Extension Modules
> > -----------------------------
> >
> > Extension modules have many of the same problems as the runtime when
> > state is stored in global variables. :pep:`630` covers all the details
> > of what extensions must do to support isolation, and thus safely run in
> > multiple interpreters at once. This includes dealing with their globals.
> >
> > Extension modules that do not implement isolation will only run in
> > the main interpreter. In all other interpreters, the import will
> > raise ``ImportError``. This will be done through
> > ``importlib._bootstrap_external.ExtensionFileLoader``.
>
> “Main interpreter” should be defined.

+1

> (Or maybe the term should be
> avoided instead -- always having to spell out “interpreter started by
> Py_Initialize rather than Py_NewInterpreter” might push us toward
> finding ways to avoid the special case...)

We (me, Nick, Victor, others) have considered this in the past and
have concluded that having a distinct main interpreter is valuable.
That topic is a bit out of scope for this PEP though.

> > We will work with popular extensions to help them support use in
> > multiple interpreters. This may involve adding to CPython's public C-API,
> > which we will address on a case-by-case basis.
> >
> > Extension Module Compatibility
> > ''''''''''''''''''''''''''''''
> >
> > As noted in `Extension Modules`_, many extensions work fine in multiple
> > interpreters without needing any changes. The import system will still
> > fail if such a module doesn't explicitly indicate support. At first,
> > not many extension modules will, so this is a potential source
> > of frustration.
> >
> > We will address this by adding a context manager to temporarily disable
> > the check on multiple interpreter support:
> > ``importlib.util.allow_all_extensions()``.
>
> I'd prefer a more dangerous-sounding name, to guide code readers (and
> autocomplete users) toward checking the warning in the docs.

+1

I had meant to explicitly ask for suggestions for a better name. :)

> > Now we get to the break in compatibility mentioned above. Some
> > extensions are safe under multiple interpreters, even though they
> > haven't indicated that. Unfortunately, there is no reliable way for
> > the import system to infer that such an extension is safe, so
> > importing them will still fail. This case is addressed in
> > `Extension Module Compatibility`_ below.
>
> Extensions that use multi-phase init should already be compatible with
> multiple interpreters. Multi-phase init itself is the flag that
> indicates this.

Correct, and ExtensionFileLoader will use that if
PyInterpreterConfig.strict_extensions is set, regardless of whether or
not we need a second extension module indicator for
I-said-I-was-isolated-but-now-I-really-mean-it (per-interpreter GIL).

> But they might not be compatible with *per-interpreter GIL*. I don't
> like how that's conflated with multiple interpreters here.

Hmm, I suppose in my mind they *have* been the same thing. :)

> For example, extension modules can currently support multiple
> interpreters, but rely on the GIL to protect calls to a non-threadsafe
> library, access shared memory, etc. As an example, the PEP 630 “opt-out”
> is not thread-safe.

So, you are saying that some multi-phase init extensions may still be
relying on the GIL as a lock for some shared state. In the case of
your "opt-out" example, there is a possible (albeit super unlikely)
race on "loaded". So such an extension needs to be able to separately
opt in to being used without a GIL between interpreters. Is all that
correct?

Out of curiosity, do you have any examples of extensions that
implement multi-phase init but need to opt out (like in your example)?
Is it only the case where the maintainer is in the process of
isolating the module, so the opt-out is temporary?

Aside from the unsafe-flag-to-indicate-not-isolated case, do you know
of any other examples where a module is safe for use between
interpreters but still relies on the shared GIL? I'm struggling to
imagine such a scenario, but where they don't also opt out of
multi-interpreter support already.

FWIW, my assumption is that, if an extension has been made isolated
enough for use between multiple interpreters, then it is extremely
likely that it is also isolated enough to use without a GIL shared
between the interpreters.

> It seems to me that there should be a separate flag (slot) to indicate
> support for per-interpreter GIL, and the `strict_extensions` bit should
> work with that.

I think I see what you are saying. My concern is that anything beyond
the default settings is an obstacle for extension maintainers, so
opt-in is especially painful if it ends up being the common case. Of
course, it may be unavoidable in the end.

While we may end up needing a flag to indicate
yes-I'm-isolated-but-not-that-isolated, the following alternative came
to mind:

* multi-phase init extensions should be expected to be isolated, thus
compatible with use in multiple interpreters (already true)
* multi-phase init extensions should be expected to be fully isolated,
thus compatible with per-interpreter GIL
* there would be a new module def slot that a multi-phase init
extension can use to explicitly opt out
* the ExtensionFileLoader would enforce loading the module only once
in the process (and use a dedicated granular lock to prevent races)

Instead of using its own static "loaded" variable, the module in your
opt-out example would use this new slot.
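
A minimal sketch of the load-once enforcement described in the last bullet,
modeled in Python with threads standing in for interpreters. The registry,
the lock and ``load_once()`` are hypothetical names; nothing here is the
actual ExtensionFileLoader code.

```python
import threading

_loaded = set()                  # names of opted-out modules already loaded
_loaded_lock = threading.Lock()  # dedicated granular lock, independent of any GIL


def load_once(name, opted_out):
    """Allow the load, or raise ImportError on a repeat load of an opted-out module."""
    if not opted_out:
        return True  # fully isolated: one copy per interpreter is fine
    with _loaded_lock:
        # Check and update happen under the lock, so two concurrent
        # loaders cannot both see "not loaded yet".
        if name in _loaded:
            raise ImportError(f"{name} may only be loaded once per process")
        _loaded.add(name)
    return True


def try_load(results, index):
    try:
        results[index] = load_once("legacy_ext", opted_out=True)
    except ImportError:
        results[index] = False


results = [None] * 8
threads = [threading.Thread(target=try_load, args=(results, i)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Exactly one of the eight attempts succeeds, which is the guarantee a static
"loaded" flag cannot give once nothing like a shared GIL serializes the check.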

To me, really-truly-fully-isolated is the sensible long-term default
for multi-phase init. Most maintainers will implement multi-phase
init, using PEP 630 to get isolated enough. We can avoid the extra
step for the common case. So our future selves would be much happier
if we go with an explicit opt-out now. :) This follows my earlier
assumption that few extensions will be safe in multi-interpreter but
not per-interpreter GIL.

FWIW, I was going to say perhaps we could get away with treating the
vast majority of extensions as already safe in multiple interpreters,
to avoid requiring extensions to implement multi-phase init. However,
I can already think of a number of relatively common cases where that
isn't true (e.g. static types). :)

Plus, multi-phase init is such a good thing and doesn't require that
much effort (especially if we provide an opt-out slot for
multi-interpreter support). Per-interpreter GIL would be a pretty
good carrot. :)

> [...]
> > How to Teach This
> > =================
> >
> > This is an advanced feature for users of the C-API. There is no
> > expectation that this will be taught.
>
> Oh, I'm afraid this will need some docs related to making sure an
> extension is compatible with per-interpreter GIL.
> I'd rather not repeat my mistake of hand-wavingly noting "All modules
> created using multi-phase initialization are expected to support
> sub-interpreters" in the docs, and only writing PEP 630 much later.

good point

Is PEP 630 the de facto documentation for this sort of thing or is
there something on docs.python.org?

> [...]
> > References
> > ==========
> >
> > Related:
> >
> > * :pep:`384`
> > * :pep:`432`
> > * :pep:`489`
> > * :pep:`554`
> > * :pep:`573`
> > * :pep:`587`
> > * :pep:`630`
> > * :pep:`683`
> > * :pep:`3121`
>
> Please write out the titles here.

will do
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/VH74ERJYZ4VDGQQN52LF5Q56EABACHX3/
Code of Conduct: http://python.org/psf/codeofconduct/
Re: PEP 684: A Per-Interpreter GIL [ In reply to ]
Hi Eric, just one note:

On Wed, Mar 9, 2022 at 7:13 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
> > Maybe say “e.g. with Instagram's Cinder” – both the household name and
> > the project you can link to?
>
> +1
>
> Note that Instagram isn't exactly using Cinder.

This sounds like a misunderstanding somewhere. Instagram server is
"exactly using Cinder" :)

> I'll have to check if Cinder uses the pre-fork model.

It doesn't really make sense to ask whether "Cinder uses the pre-fork
model" -- Cinder is just a CPython variant, it can work with all the
same execution models CPython can. Instagram server uses Cinder with a
pre-fork execution model. Some other workloads use Cinder without
pre-forking.

Carl
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/5A3E6VCEY5XZXEFPGHNGKPM3HXQEJRTX/
Re: PEP 684: A Per-Interpreter GIL [ In reply to ]
On Wed, Mar 9, 2022 at 7:37 PM Carl Meyer <carl@oddbird.net> wrote:
> > Note that Instagram isn't exactly using Cinder.
>
> This sounds like a misunderstanding somewhere. Instagram server is
> "exactly using Cinder" :)

:)

Thanks for clarifying, Carl.

> > I'll have to check if Cinder uses the pre-fork model.
>
> It doesn't really make sense to ask whether "Cinder uses the pre-fork
> model" -- Cinder is just a CPython variant, it can work with all the
> same execution models CPython can. Instagram server uses Cinder with a
> pre-fork execution model. Some other workloads use Cinder without
> pre-forking.

+1

-eric
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ZI6JXJJ2F6DCHTVYUVQFDNPCWEH76J6V/
Re: PEP 684: A Per-Interpreter GIL [ In reply to ]
> Is ``allow_all_extensions`` the best name for the context manager?

Nope. I'm pretty sure that "parallel processing via multiple simultaneous interpreters" won't be the only reason people ever want to exclude certain extensions.

It might be easier to express that through package or module name, but importlib and util aren't specific enough.

For an example of an extension that works with multiple interpreters but only if they share a single GIL ... why wouldn't that apply to any extension designed to work with a Singleton external resource? For example, the interpreters could all share a single database connection, and repurpose the GIL to ensure that there isn't a thread (or interpreter) switch mid-transaction.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/RUDVIEDDCNFDRBIQVQU334GMPW77ZNOK/
Re: PEP 684: A Per-Interpreter GIL [ In reply to ]
That sounds like a horrible idea. The GIL should never be held during an
I/O operation.

On Fri, Mar 11, 2022 at 19:00 Jim J. Jewett <jimjjewett@gmail.com> wrote:

> > Is ``allow_all_extensions`` the best name for the context manager?
>
> Nope. I'm pretty sure that "parallel processing via multiple simultaneous
> interpreters" won't be the only reason people ever want to exclude certain
> extensions.
>
> It might be easier to express that through package or module name, but
> importlib and util aren't specific enough.
>
> For an example of an extension that works with multiple interpreters but
> only if they share a single GIL ... why wouldn't that apply to any
> extension designed to work with a Singleton external resource? For
> example, the interpreters could all share a single database connection, and
> repurpose the GIL to ensure that there isn't a thread (or interpreter)
> switch mid-transaction.
--
--Guido (mobile)
Re: PEP 684: A Per-Interpreter GIL [ In reply to ]
On Thu, 10 Mar 2022, 12:12 pm Eric Snow, <ericsnowcurrently@gmail.com>
wrote:

>
> On Wed, Mar 9, 2022 at 10:58 AM Petr Viktorin <encukou@gmail.com> wrote:
> > This PEP definitely makes per-interpreter GIL sound possible :)
>
>
> >
> > “Main interpreter” should be defined.
>
> +1
>
> > (Or maybe the term should be
> > avoided instead -- always having to spell out “interpreter started by
> > Py_Initialize rather than Py_NewInterpreter” might push us toward
> > finding ways to avoid the special case...)
>
> We (me, Nick, Victor, others) have considered this in the past and
> have concluded that having a distinct main interpreter is valuable.
> That topic is a bit out of scope for this PEP though.
>


The PEP can mostly link to
https://docs.python.org/3/c-api/init.html#sub-interpreter-support for the
explanation of the main interpreter, and just include a partial paraphrase
of those docs to give the gist of the idea. We added that info the last
time we considered whether the main interpreter's "first among equals"
status was necessary and decided it was.

For example, something based on the first and third sentences out of the
docs explanation:

The “main” interpreter is the first one created when the runtime
initializes. It continues to manage unique process-global responsibilities
like signal handling even when other subinterpreters have been started.

Cheers,
Nick.



Re: PEP 684: A Per-Interpreter GIL [ In reply to ]
> That sounds like a horrible idea. The GIL should never be held during an
> I/O operation.

For a greenfield design, I agree that it would be perverse. But I thought we were talking about affordances for transitions from code that was written without consideration of multiple interpreters. In those cases, the GIL can be a way of saying "OK, this is the part where I haven't thought things through yet." Using a more fine-grained lock would be better, but would take a lot more work and be more error-prone.

For a legacy system, I've seen plenty of situations where a blunt (but simple) hammer like "Grab the GIL" would still be a huge improvement over the status quo. And those situations tend to occur with the sort of clients where "Brutally inefficient, but it does work because the fragile parts are guaranteed by an external tool" is the right tradeoff.
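
For comparison, the "more fine-grained lock" alternative might look roughly
like this. It is a sketch with threads standing in for interpreters;
``SharedConnection`` is a hypothetical toy, not a real database API, and a
``threading.Lock`` is assumed to be an adequate stand-in for whatever lock
would actually be shared across interpreters.

```python
import threading


class SharedConnection:
    """Toy stand-in for a singleton external resource (hypothetical)."""

    def __init__(self):
        self.log = []
        self._lock = threading.Lock()  # guards one transaction at a time

    def transaction(self, worker, steps):
        # Holding the dedicated lock keeps the steps of one transaction
        # contiguous without relying on a process-wide GIL.
        with self._lock:
            for step in range(steps):
                self.log.append((worker, step))


conn = SharedConnection()
workers = [threading.Thread(target=conn.transaction, args=(n, 3)) for n in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Each worker's three steps appear back to back in ``conn.log``, whatever
order the workers run in.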
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/AAWSCUNVS2NUXRHVATO736KM6I5M6RK5/