Mailing List Archive

New sys.module_names attribute in Python 3.10: list of all stdlib modules
Hi,

I just added a new sys.module_names attribute, a list (technically a
frozenset) of all stdlib module names:
https://bugs.python.org/issue42955

There are multiple use cases:

* Group stdlib imports when reformatting a Python file,
* Exclude stdlib imports when computing dependencies.
* Exclude stdlib modules when listing extension modules on crash or
fatal error, only list 3rd party extension (already implemented in
master, see bpo-42923 ;-)).
* Exclude stdlib modules when tracing the execution of a program using
the trace module.
* Detect typo and suggest a fix: ImportError("No module named maths.
Did you mean 'math'?",) (test the nice friendly-traceback project!).
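
The typo-suggestion use case can be sketched with difflib. This is only
an illustration, not how CPython or friendly-traceback implements it:
suggest_stdlib_module() and default_stdlib_names() are hypothetical
helpers, and the lookup assumes the attribute is available under the
sys.stdlib_module_names name (Python 3.10+), with a fallback to the
smaller sys.builtin_module_names set otherwise.

```python
import difflib
import sys


def default_stdlib_names():
    # Assumption: Python 3.10+ exposes the set as sys.stdlib_module_names.
    # Older versions only offer sys.builtin_module_names (a smaller set).
    return getattr(sys, "stdlib_module_names",
                   frozenset(sys.builtin_module_names))


def suggest_stdlib_module(name, stdlib_names=None):
    """Return the stdlib module name closest to `name`, or None."""
    if stdlib_names is None:
        stdlib_names = default_stdlib_names()
    # Keep only the single best candidate above a fairly strict cutoff.
    matches = difflib.get_close_matches(name, stdlib_names, n=1, cutoff=0.8)
    return matches[0] if matches else None
```

With an explicit set, suggest_stdlib_module("maths", {"math", "cmath"})
returns "math"; a name with no close match returns None.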

Example:

>>> import sys
>>> 'asyncio' in sys.module_names
True
>>> 'numpy' in sys.module_names
False

>>> len(sys.module_names)
312
>>> type(sys.module_names)
<class 'frozenset'>

>>> sorted(sys.module_names)[:10]
['__future__', '_abc', '_aix_support', '_ast', '_asyncio', '_bisect',
'_blake2', '_bootsubprocess', '_bz2', '_codecs']
>>> sorted(sys.module_names)[-10:]
['xml.dom', 'xml.etree', 'xml.parsers', 'xml.sax', 'xmlrpc', 'zipapp',
'zipfile', 'zipimport', 'zlib', 'zoneinfo']

The list is opinionated and defined by its documentation:

A frozenset of strings containing the names of standard library
modules.

It is the same on all platforms. Modules which are not available on
some platforms and modules disabled at Python build are also listed.
All module kinds are listed: pure Python, built-in, frozen and
extension modules. Test modules are excluded.

For packages, only sub-packages are listed, not sub-modules. For
example, ``concurrent`` package and ``concurrent.futures``
sub-package are listed, but not ``concurrent.futures.base``
sub-module.

See also the :attr:`sys.builtin_module_names` list.
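
Since the set is the same on all platforms, a listed name is not
guaranteed to be importable on the running interpreter. A minimal
sketch of how one might find such names, assuming
importlib.util.find_spec() is an acceptable availability check
(unavailable_modules() is a hypothetical helper, not part of the
proposal):

```python
import importlib.util


def unavailable_modules(names):
    """Return the top-level names from `names` that cannot be found here."""
    missing = []
    for name in sorted(names):
        if "." in name:
            # find_spec() imports parent packages for dotted names;
            # skip them to avoid import side effects.
            continue
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # ValueError is raised when a loaded module has no __spec__.
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

Passing the stdlib name set to this function would list, for example,
Windows-only modules when run on Linux.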

The design (especially the fact of having the same list on all
platforms) comes from the use cases listed above. For example, running
isort should produce the same output on any platform, and should not
depend on whether the Python stdlib was split into multiple packages
on Linux (which is done by most popular Linux distributions).

The list is generated by the Tools/scripts/generate_module_names.py script:
https://github.com/python/cpython/blob/master/Tools/scripts/generate_module_names.py

When you add a new module, you must run "make regen-module-names",
otherwise a pre-commit check will fail on your PR ;-) The list of
Windows extensions is currently hardcoded in the script (contributions
to discover them automatically are welcome; since the list is short and
rarely changes, I didn't feel the need to spend time on that).

Currently (Python 3.10.0a4+), there are 312 names in sys.module_names,
stored in Python/module_names.h:
https://github.com/python/cpython/blob/master/Python/module_names.h

It was decided to include "helper" modules like "_aix_support" which
is used by sysconfig. But test modules like _testcapi are excluded to
make the list shorter (it's rare to run the CPython test suite outside
Python).

There are 83 private modules, whose names start with an underscore
(such as _abc, but also __future__). Excluding them leaves 229 public
names:

>>> len([name for name in sys.module_names if not name.startswith('_')])
229

This new attribute may help to define "what is the Python stdlib" ;-)

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BTX7SH2CR66QCLER2EXAK2GOUAH2U4CL/
Code of Conduct: http://python.org/psf/codeofconduct/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Mon, 25 Jan 2021 14:03:22 +0100
Victor Stinner <vstinner@python.org> wrote:
>
> The list is opinionated and defined by its documentation:

So "the list is opinionated" means there can be false negatives, i.e.
some stdlib modules which are not present in this list?

This will probably make life harder for third-party software that wants
to answer the question "is module XXX a stdlib module or does it need
to be distributed separately?".

Regards

Antoine.

Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Mon, Jan 25, 2021 at 4:18 PM Antoine Pitrou <antoine@python.org> wrote:
>
> On Mon, 25 Jan 2021 14:03:22 +0100
> Victor Stinner <vstinner@python.org> wrote:
> >
> > The list is opinionated and defined by its documentation:
>
> So "the list is opinionated" means there can be false negatives, i.e.
> some stdlib modules which are not present in this list?

Test modules of the stdlib are excluded. Example:

>>> import sys
>>> '_testcapi' in sys.module_names # _testcapi extension
False
>>> 'test' in sys.module_names # Lib/test/ package
False
>>> import _testcapi
>>> _testcapi
<module '_testcapi' from
'/home/vstinner/python/master/build/lib.linux-x86_64-3.10-pydebug/_testcapi.cpython-310d-x86_64-linux-gnu.so'>
>>> import test
>>> test
<module 'test' from '/home/vstinner/python/master/Lib/test/__init__.py'>

It can be changed if it's an issue. That's also why I sent an email to
python-dev, to see if there is something wrong with the definition of
sys.module_names.

Victor
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
Just _names_? There's a recurring error case when a 3rd-party module overrides a standard one if it happens to have the same name. If you
filter such a module out, you're shooting yourself in the foot...

--
Regards,
Ivan
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Mon, Jan 25, 2021 at 06:46:51PM +0300, Ivan Pozdeev via Python-Dev wrote:
> There's a recurring error case when a 3rd-party module
> overrides a standard one if it happens to have the same name.

All arguments and expectations are off in this case. We shouldn't worry
about such scenarios.

--
Senthil
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
Hi Ivan,

On Mon, Jan 25, 2021 at 4:53 PM Ivan Pozdeev via Python-Dev
<python-dev@python.org> wrote:
> Just _names_? There's a recurring error case when a 3rd-party module overrides a standard one if it happens to have the same name. If you
> filter such a module out, you're shooting yourself in the foot...

Overriding stdlib modules has been discussed in the issue.

For example, it was proposed to add an attribute to all stdlib modules
(__stdlib__=True or __author__ = 'PSF'), and then check whether the
attribute exists. The problem is that importing a module to check for
its attribute can cause side effects or fail, and so cannot be used for
some use cases. For example, it would be surprising to open a web
browser window when running isort on Python code containing
"import antigravity". Another problem is that third parties can also
add the attribute to pretend that their code is part of the stdlib.
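
To illustrate why a plain list of names avoids this problem: a tool can
collect the imported names from the source text and compare them with
the set, without importing anything. The sketch below assumes the ast
module is a suitable way to do this; imported_top_level_names() is a
hypothetical helper, not something isort is known to use.

```python
import ast


def imported_top_level_names(source):
    """Return top-level module names imported by `source`, without importing."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.partition(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # node.level > 0 means a relative import (from . import x): skip.
            if node.level == 0 and node.module:
                names.add(node.module.partition(".")[0])
    return names
```

Intersecting the result with the stdlib name set gives the stdlib
imports of a file containing "import antigravity" without opening any
browser window.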

In a previous version of my PR, I added a note about sys.path and
overriding stdlib modules, but I have been asked to remove it. Feel
free to propose a PR to add such a note if you consider that it's
related to sys.module_names.

Please read the discussion at https://bugs.python.org/issue42955 and
https://github.com/python/cpython/pull/24238

Victor
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
Hello,

In general, I love the idea and implementation. I'm not in love with the
name though: it makes it sound like it contains the names of all modules
imported/available. We already have sys.modules, which contains all
imported modules. So without deeper knowledge, sys.module_names looks very
close to sys.modules.keys() or to all available modules. Can we name it
sys.stdlib_module_names instead, to clarify that this is the standard
library subset only and not all modules available to the interpreter?

Thanks,

Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
Hi Bernat,

"stdlib_module_names" was my first idea but it looks too long, so I
chose "module_names". But someone on Twitter and now you asked me why
not "stdlib_module_names", so I wrote a PR to rename module_names to
sys.stdlib_module_names:
https://github.com/python/cpython/pull/24332

At least "stdlib_module_names" better summarizes its definition: "A
frozenset of strings containing the names of standard library
modules".

Victor


On Mon, Jan 25, 2021 at 5:39 PM Bernat Gabor <jokerjokerer@gmail.com> wrote:
>
> Hello,
>
> In general, I love the idea and implementation. I'm not in love with the name though: it makes it sound like it contains the names of all modules imported/available. We already have sys.modules, which contains all imported modules. So without deeper knowledge, sys.module_names looks very close to sys.modules.keys() or to all available modules. Can we name it sys.stdlib_module_names instead, to clarify that this is the standard library subset only and not all modules available to the interpreter?
>
> Thanks,
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On 1/25/21 5:03 AM, Victor Stinner wrote:

> I just added a new sys.module_names attribute, list (technically a
> frozenset) of all stdlib module names

> The list is opinionated and defined by its documentation

> For packages, only sub-packages are listed, not sub-modules. For
> example, ``concurrent`` package and ``concurrent.futures``
> sub-package are listed, but not ``concurrent.futures.base``
> sub-module.

I'm not sure I understand the above. Is it fair to say that any stdlib module, except
for private or test (./Lib/test/*) modules, that can be imported is listed in
`sys.module_names`? My confusion stems from being able to import `concurrent.futures`
but not `concurrent.futures.base`.

--
~Ethan~
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Mon, Jan 25, 2021 at 7:51 AM Ivan Pozdeev via Python-Dev <
python-dev@python.org> wrote:

> Just _names_? There's a recurring error case when a 3rd-party module
> overrides a standard one if it happens to have the same name. If you
> filter such a module out, you're shooting yourself in the foot...


Would another use case be to support issuing a warning if a third-party
module is imported whose name matches a standard one? A related use case
would be to build on this and define a function that accepts an already
imported module and returns whether it is from the standard library. Unlike
the module_names attribute, this function would reflect the reality of the
underlying module, and so not have false positives as with doing a name
check alone.

—Chris



Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Mon, Jan 25, 2021 at 6:39 PM Ethan Furman <ethan@stoneleaf.us> wrote:
> > For packages, only sub-packages are listed, not sub-modules. For
> > example, ``concurrent`` package and ``concurrent.futures``
> > sub-package are listed, but not ``concurrent.futures.base``
> > sub-module.
>
> I'm not sure I understand the above. Is it fair to say that any stdlib module, except
> for private or test (./Lib/test/*) modules,

Private modules are listed: __future__, _abc, _aix_support, etc.

> that can be imported are listed in `sys.module_names`? My confusion stems from being able
> to import `concurrent.futures`
> but not `concurrent.futures.base`.

For packages, I chose to exclude sub-modules just to keep the list
short. ~300 items can be displayed and read manually. If you want to
check if "asyncio.base_events" is a stdlib module, extract the "asyncio"
string and check if "asyncio" is part of the list.
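
The check described above can be sketched as follows. This is a minimal
illustration: is_stdlib_module_name() is a hypothetical helper, and the
default lookup assumes the attribute is exposed as
sys.stdlib_module_names (Python 3.10+), falling back to the smaller
sys.builtin_module_names set.

```python
import sys


def is_stdlib_module_name(name, stdlib_names=None):
    """Check a (possibly dotted) module name against the stdlib name set."""
    if stdlib_names is None:
        # Assumption: Python 3.10+ provides sys.stdlib_module_names.
        stdlib_names = getattr(sys, "stdlib_module_names",
                               frozenset(sys.builtin_module_names))
    # "asyncio.base_events" -> "asyncio"
    top_level = name.partition(".")[0]
    return top_level in stdlib_names
```

Note that this is a name check only: it says nothing about which file
would actually be loaded for that name.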

sys.module_names cannot be used directly if you need to get the
exhaustive list of all modules including sub-modules.
pkgutil.iter_modules() can be used to list the modules of a package:

>>> import asyncio, pkgutil
>>> [mod.name for mod in pkgutil.iter_modules(path=asyncio.__path__)]
['__main__', 'base_events', 'base_futures', 'base_subprocess',
'base_tasks', 'constants', 'coroutines', 'events', 'exceptions',
'format_helpers', 'futures', 'locks', 'log', 'mixins',
'proactor_events', 'protocols', 'queues', 'runners',
'selector_events', 'sslproto', 'staggered', 'streams', 'subprocess',
'tasks', 'threads', 'transports', 'trsock', 'unix_events',
'windows_events', 'windows_utils']

One drawback is that if the stdlib contained packages without an
__init__.py file, a third party project could add a sub-module to them
(ex: inject encodings/myencoding.py in the encodings package). But it
seems like all Lib/ sub-directories contain an __init__.py file, so
it's not an issue in practice.

If we include sub-modules, sys.module_names grows from 312 names to
813 names (2.6x more).

Two examples:

"collections",
+"collections.abc",

"concurrent",
"concurrent.futures",
+"concurrent.futures._base",
+"concurrent.futures.process",
+"concurrent.futures.thread",

Just the encodings package contains 121 sub-modules.

Victor
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Mon, Jan 25, 2021 at 6:37 PM Chris Jerdonek <chris.jerdonek@gmail.com> wrote:
> On Mon, Jan 25, 2021 at 7:51 AM Ivan Pozdeev via Python-Dev <python-dev@python.org> wrote:
>>
>> Just _names_? There's a recurring error case when a 3rd-party module overrides a standard one if it happens to have the same name. If you
>> filter such a module out, you're shooting yourself in the foot...
>
> Would another use case be to support issuing a warning if a third-party module is imported whose name matches a standard one? A related use case would be to build on this and define a function that accepts an already imported module and returns whether it is from the standard library. Unlike the module_names attribute, this function would reflect the reality of the underlying module, and so not have false positives as with doing a name check alone.

This is a different use case which requires a different solution.
sys.module_names solves some specific use cases (the ones that I listed
in my first email).

In Python 3.9, you can already check if a module's __file__ is in the
sysconfig.get_paths()['stdlib'] directory. You don't need to modify
Python for that.

If you would also like to check if an *extension* module comes from
the stdlib, you need to get the "lib-dynload" directory. I failed to
find a programmatic way to get this directory; maybe a new API would be
needed for that.
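
A rough sketch of the __file__ check mentioned above, under some
assumptions: module_file_in_stdlib() and _is_under() are hypothetical
helpers, and site-packages is excluded explicitly because it is often
nested inside the stdlib directory. Built-in modules have no __file__,
and extension modules in "lib-dynload" are not handled specially.

```python
import os
import sysconfig


def _is_under(path, directory):
    """True if `path` is inside `directory` (both are resolved first)."""
    path = os.path.normcase(os.path.realpath(path))
    directory = os.path.normcase(os.path.realpath(directory))
    try:
        return os.path.commonpath([path, directory]) == directory
    except ValueError:
        # Raised on Windows when the paths are on different drives.
        return False


def module_file_in_stdlib(module, stdlib_dir=None, excluded_dirs=None):
    """Check whether module.__file__ lives in the stdlib directory."""
    filename = getattr(module, "__file__", None)
    if not filename:
        # Built-in modules (and some frozen ones) have no __file__.
        return False
    paths = sysconfig.get_paths()
    if stdlib_dir is None:
        stdlib_dir = paths["stdlib"]
    if excluded_dirs is None:
        # site-packages is often a sub-directory of the stdlib directory.
        excluded_dirs = [paths["purelib"], paths["platlib"]]
    if any(_is_under(filename, d) for d in excluded_dirs):
        return False
    return _is_under(filename, stdlib_dir)
```

On a typical installation, module_file_in_stdlib(json) would be expected
to be true after "import json", but a zipped or relocated stdlib can
defeat a path-based check like this one.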

Victor
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
That's not possible.

Stdlib can be arranged any way a user/maintainer wishes (zipped stdlib and virtual environments are just two examples), so there's no way to
tell if the module's location is "right".
Downstream changes are also standard practice so there's no way to verify a module's contents, either.

As such, there's no way to tell if any given module being imported is a standard or a 3rd-party one.

On 25.01.2021 20:33, Chris Jerdonek wrote:
> On Mon, Jan 25, 2021 at 7:51 AM Ivan Pozdeev via Python-Dev <python-dev@python.org> wrote:
>
> > Just _names_? There's a recurring error case when a 3rd-party module overrides a standard one if it happens to have the same name. If you
> > filter such a module out, you're shooting yourself in the foot...
>
>
> Would another use case be to support issuing a warning if a third-party module is imported whose name matches a standard one? A related
> use case would be to build on this and define a function that accepts an already imported module and returns whether it is from the
> standard library. Unlike the module_names attribute, this function would reflect the reality of the underlying module, and so not have
> false positives as with doing a name check alone.
>
> —Chris
>
>
>
>
> On 25.01.2021 16:03, Victor Stinner wrote:
> > Hi,
> >
> > I just added a new sys.module_names attribute, list (technically a
> > frozenset) of all stdlib module names:
> > https://bugs.python.org/issue42955 <https://bugs.python.org/issue42955>
> >
> > There are multiple use cases:
> >
> > * Group stdlib imports when reformatting a Python file,
> > * Exclude stdlib imports when computing dependencies.
> > * Exclude stdlib modules when listing extension modules on crash or
> > fatal error, only list 3rd party extension (already implemented in
> > master, see bpo-42923 ;-)).
> > * Exclude stdlib modules when tracing the execution of a program using
> > the trace module.
> > * Detect typo and suggest a fix: ImportError("No module named maths.
> > Did you mean 'math'?",) (test the nice friendly-traceback project!).
> >
> > Example:
> >
> > >>> 'asyncio' in sys.module_names
> > True
> > >>> 'numpy' in sys.module_names
> > False
> >
> > >>> len(sys.module_names)
> > 312
> > >>> type(sys.module_names)
> > <class 'frozenset'>
> >
> > >>> sorted(sys.module_names)[:10]
> > ['__future__', '_abc', '_aix_support', '_ast', '_asyncio', '_bisect',
> > '_blake2', '_bootsubprocess', '_bz2', '_codecs']
> > >>> sorted(sys.module_names)[-10:]
> > ['xml.dom', 'xml.etree', 'xml.parsers', 'xml.sax', 'xmlrpc', 'zipapp',
> > 'zipfile', 'zipimport', 'zlib', 'zoneinfo']
> >
> > The list is opinionated and defined by its documentation:
> >
> >     A frozenset of strings containing the names of standard library
> >     modules.
> >
> >     It is the same on all platforms. Modules which are not available on
> >     some platforms and modules disabled at Python build are also listed.
> >     All module kinds are listed: pure Python, built-in, frozen and
> >     extension modules. Test modules are excluded.
> >
> >     For packages, only sub-packages are listed, not sub-modules. For
> >     example, ``concurrent`` package and ``concurrent.futures``
> >     sub-package are listed, but not ``concurrent.futures.base``
> >     sub-module.
> >
> >     See also the :attr:`sys.builtin_module_names` list.
> >
> > The design (especially, the fact of having the same list on all
> > platforms) comes from the use cases list above. For example, running
> > isort should produce the same output on any platform, and not depend
> > on whether the Python stdlib was split into multiple packages on Linux
> > (which is done by most popular Linux distributions).
> >
> > The list is generated by the Tools/scripts/generate_module_names.py script:
> > https://github.com/python/cpython/blob/master/Tools/scripts/generate_module_names.py
> >
> > When you add a new module, you must run "make regen-module-names",
> > otherwise a pre-commit check will fail on your PR ;-) The list of
> > Windows extensions is currently hardcoded in the script (contributions
> > to discover them are welcome; since the list is short and evolves
> > rarely, I didn't feel the need to spend time on that).
> >
> > Currently (Python 3.10.0a4+), there are 312 names in sys.module_names,
> > stored in Python/module_names.h:
> > https://github.com/python/cpython/blob/master/Python/module_names.h
> >
> > It was decided to include "helper" modules like "_aix_support" which
> > is used by sysconfig. But test modules like _testcapi are excluded to
> > make the list shorter (it's rare to run the CPython test suite outside
> > Python).
> >
> > There are 83 private modules, whose names start with an underscore
> > (this counts _abc but also __future__):
> >
> > >>> len([name for name in sys.module_names if not name.startswith('_')])
> > 229
> >
> > This new attribute may help to define "what is the Python stdlib" ;-)
> >
> > Victor
>
> --
> Regards,
> Ivan
> Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KCJDHKOKCN5343VVA3DC7RAGNUGWNKZY/
>
--
Regards,
Ivan
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Mon, Jan 25, 2021 at 11:22 PM Ivan Pozdeev via Python-Dev
<python-dev@python.org> wrote:
> That's not possible.
>
> Stdlib can be arranged any way a user/maintainer wishes (zipped stdlib and virtual environments are just two examples), so there's no way to tell if the module's location is "right".
> Downstream changes are also standard practice, so there's no way to verify a module's contents, either.
>
> As such, there's no way to tell if any given module being imported is a standard or a 3rd-party one.

By the way, IMO it's also a legit use case on an old Python version to
override a stdlib module with a patched or more recent version, to get
a bugfix for example ;-) Even if it's an uncommon use case, it can
solve some practical issues.

Victor
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/NEID6HKSVGUSDG7GMHQGGE3QOFYGTGE4/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
Fortunately for you :), this whole argument is not against the feature per se, only against using it to blindly filter module lists for
automated bug reports.

On 26.01.2021 1:34, Victor Stinner wrote:
> On Mon, Jan 25, 2021 at 11:22 PM Ivan Pozdeev via Python-Dev
> <python-dev@python.org> wrote:
>> That's not possible.
>>
>> Stdlib can be arranged any way a user/maintainer wishes (zipped stdlib and virtual environments are just two examples), so there's no way to tell if the module's location is "right".
>> Downstream changes are also standard practice, so there's no way to verify a module's contents, either.
>>
>> As such, there's no way to tell if any given module being imported is a standard or a 3rd-party one.
> By the way, IMO it's also a legit use case on an old Python version to
> override a stdlib module with a patched or more recent version, to get
> a bugfix for example ;-) Even if it's an uncommon use case, it can
> solve some practical issues.
>
> Victor

--
Regards,
Ivan
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/FVVPLGWDAKURT4VSTHD746QJ6LG2MQDR/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Mon, Jan 25, 2021 at 06:17:09PM +0100, Victor Stinner wrote:
> Hi Bernat,
>
> "stdlib_module_names" was my first idea but it looks too long, so I
> chose "module_names". But someone on Twitter and now you asked me why
> not "stdlib_module_names", so I wrote a PR to rename module_names to
> sys.stdlib_module_names:
> https://github.com/python/cpython/pull/24332
>
> At least "stdlib_module_names" better summarizes its definition: "A
> frozenset of strings containing the names of standard library
> modules".

Your first instinct that it is too long is correct. Just call it
"stdlib" or "stdlib_names". The fact that it is a frozen set of module
names will be obvious from just looking at it, and there is no need for
the name to explain everything about it. We have:

* `dir()`, not `sorted_dir_names()`;

* `sys.prefix`, not `sys.site_specific_directory_path_prefix`;

* `sys.audit`, not `sys.raise_audit_hook_event`;

* `sys.exit()`, not `sys.exit_python()`;

* `sys.float_info`, not `sys.float_prec_and_low_level_info`;

etc. Python has very good documentation and excellent introspection
capabilities. Names should act as a short reminder of the meaning; there
is no need to encode a full description into a long and verbose name.


--
Steve
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7BEFXZLBH7L63WIZJZMZPQWHDDYTB3LR/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Mon, Jan 25, 2021 at 2:05 PM Victor Stinner <vstinner@python.org> wrote:

> On Mon, Jan 25, 2021 at 6:37 PM Chris Jerdonek <chris.jerdonek@gmail.com> wrote:
> > On Mon, Jan 25, 2021 at 7:51 AM Ivan Pozdeev via Python-Dev <python-dev@python.org> wrote:
> >>
> >> Just _names_? There's a recurring error case when a 3rd-party module
> >> overrides a standard one if it happens to have the same name. If you
> >> filter such a module out, you're shooting yourself in the foot...
> >
> > Would another use case be to support issuing a warning if a third-party
> > module is imported whose name matches a standard one? A related use case
> > would be to build on this and define a function that accepts an already
> > imported module and returns whether it is from the standard library. Unlike
> > the module_names attribute, this function would reflect the reality of the
> > underlying module, and so not have false positives as with doing a name
> > check alone.
>
> This is a different use case which requires a different solution.
> sys.module_names solve some specific use cases (that I listed in my
> first email).
>
> In Python 3.9, you can already check if a module __file__ is in the
> sysconfig.get_paths()['stdlib'] directory. You don't need to modify
> Python for that.


But to issue a warning when a standard module is being overridden like I
was suggesting, wouldn’t you also need to know whether the name of the
module being imported is a standard name, which is what sys.module_names
provides?
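
A minimal sketch of how such an override check might combine the two pieces, assuming Python 3.10+ where the attribute is available as sys.stdlib_module_names (the name it receives later in this thread). The helper name shadows_stdlib is hypothetical, and the path comparison inherits the limits raised elsewhere in the thread (zipped stdlib, extension modules installed outside the 'stdlib' directory):

```python
import sys
import sysconfig
from pathlib import Path

# Assumption: sys.stdlib_module_names exists (Python 3.10+); otherwise use an empty set.
STDLIB_NAMES = getattr(sys, "stdlib_module_names", frozenset())


def shadows_stdlib(name, file, stdlib_names=STDLIB_NAMES):
    """Return True if `name` is a stdlib name but `file` lies outside the stdlib directory.

    Hypothetical helper: `file` is the module's __file__ (or None for built-ins).
    """
    top_level = name.partition(".")[0]
    if top_level not in stdlib_names or not file:
        # Not a stdlib name, or no file to compare: nothing to report.
        return False
    stdlib_dir = Path(sysconfig.get_paths()["stdlib"]).resolve()
    return stdlib_dir not in Path(file).resolve().parents
```

The name check decides whether a warning is relevant at all; the path check then decides whether the module that was actually found is the standard one.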

—Chris




> If you also would like to check if an *extension* module comes from
> the stdlib, you need to get the "lib-dynload" directory. I failed to
> find a programmatic way to get this directory, maybe new API would be
> needed for that.
>
> Victor
> --
> Night gathers, and now my watch begins. It shall not end until my death.
>
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Tue, 26 Jan 2021 10:36:10 +1100
Steven D'Aprano <steve@pearwood.info> wrote:
> On Mon, Jan 25, 2021 at 06:17:09PM +0100, Victor Stinner wrote:
> > Hi Bernat,
> >
> > "stdlib_module_names" was my first idea but it looks too long, so I
> > chose "module_names". But someone on Twitter and now you asked me why
> > not "stdlib_module_names", so I wrote a PR to rename module_names to
> > sys.stdlib_module_names:
> > https://github.com/python/cpython/pull/24332
> >
> > At least "stdlib_module_names" better summarizes its definition: "A
> > frozenset of strings containing the names of standard library
> > modules".
>
> Your first instinct that it is too long is correct.

Disagreed. This is niche enough that it warrants a long but explicit
name, rather than some ambiguous shortcut.

> Just call it
> "stdlib" or "stdlib_names".

If you call it "stdlib", then you should make it a namedtuple that will
expose various information, such as "sys.stdlib.module_names".

Regards

Antoine.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CJYWXVBIMDHRJCT4HZMOLJ7XMSVNZF6I/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Mon, 25 Jan 2021 23:05:07 +0100
Victor Stinner <vstinner@python.org> wrote:
>
> This is a different use case which requires a different solution.
> sys.module_names solve some specific use cases (that I listed in my
> first email).
>
> In Python 3.9, you can already check if a module __file__ is in the
> sysconfig.get_paths()['stdlib'] directory. You don't need to modify
> Python for that.

Is this reliable? What if the stdlib is zipped or frozen in some way?

> If you also would like to check if an *extension* module comes from
> the stdlib, you need to get the "lib-dynload" directory.

So you're saying the need is already fulfilled, even though it only has
a cryptic (*) and partial solution?

(*) who would think about `sysconfig.get_paths()['stdlib']` on their
own?
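
For reference, a minimal sketch of the check being discussed, using only sysconfig and pathlib. The function names are hypothetical. It also illustrates the reliability concern: a module without a __file__ (a built-in such as sys) or one loaded from a zip archive is reported as "not under the stdlib directory" even when it ships with Python.

```python
import sys
import sysconfig
from pathlib import Path


def path_in_stdlib_dir(file):
    """Return True if `file` lies under sysconfig's 'stdlib' directory."""
    if not file:
        return False
    stdlib_dir = Path(sysconfig.get_paths()["stdlib"]).resolve()
    return stdlib_dir in Path(file).resolve().parents


def is_under_stdlib_dir(module):
    """Apply the path check to an already imported module."""
    # Built-in modules (for example sys) have no __file__ attribute at all.
    return path_in_stdlib_dir(getattr(module, "__file__", None))


# sys is part of the standard library, yet the path-based check says False.
print(is_under_stdlib_dir(sys))
```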

Regards

Antoine.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/2XBORMKGYQX2EYWXEEURSYRWJW3F3VKF/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Tue, Jan 26, 2021 at 12:44 AM Steven D'Aprano <steve@pearwood.info> wrote:
>
> On Mon, Jan 25, 2021 at 06:17:09PM +0100, Victor Stinner wrote:
> > Hi Bernat,
> >
> > "stdlib_module_names" was my first idea but it looks too long, so I
> > chose "module_names". But someone on Twitter and now you asked me why
> > not "stdlib_module_names", so I wrote a PR to rename module_names to
> > sys.stdlib_module_names:
> > https://github.com/python/cpython/pull/24332
> >
> > At least "stdlib_module_names" better summarizes its definition: "A
> > frozenset of strings containing the names of standard library
> > modules".
>
> Your first instinct that it is too long is correct. Just call it
> "stdlib" or "stdlib_names". The fact that it is a frozen set of module
> names will be obvious from just looking at it, and there is no need for
> the name to explain everything about it. We have:

The sys module already has a sys.modules attribute, and so
sys.module_names sounds like "give me the name of all imported
modules": sys.modules.keys(). It's confusing. Just after I announced
the creation of the attribute, at least 3 people told me that they
were confused by the name. Also, my PR was approved quickly 3 times
which confirms that the rename was a good idea ;-)
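
To make the distinction concrete, here is a small sketch contrasting the two, assuming Python 3.10+ and the renamed attribute sys.stdlib_module_names (with an empty-set fallback on older versions). The variable names are only illustrative:

```python
import sys

# Names of the modules imported so far in this process (changes at runtime).
imported_names = set(sys.modules)

# Fixed list of stdlib module names; assumed to exist on Python 3.10+.
stdlib_names = getattr(sys, "stdlib_module_names", frozenset())

# Imported modules whose top-level name is not a stdlib name
# (third-party code, __main__, ...).
non_stdlib_imported = sorted(
    name for name in imported_names
    if name.partition(".")[0] not in stdlib_names
)

# Stdlib names that exist in the list but have not been imported (yet).
not_yet_imported = sorted(stdlib_names - imported_names)
```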

In general, I agree that short names are great ;-) For example, I like
short obj.name() rather than obj.getname() when it doesn't make sense
to set the name.

Naming is a hard problem :-D

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/DKRUV2HDWIN4XWDNXT55BBHNQOUZTOU6/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Tue, Jan 26, 2021 at 1:26 AM Antoine Pitrou <antoine@python.org> wrote:

> On Tue, 26 Jan 2021 10:36:10 +1100
> Steven D'Aprano <steve@pearwood.info> wrote:
> > On Mon, Jan 25, 2021 at 06:17:09PM +0100, Victor Stinner wrote:
> > > Hi Bernat,
> > >
> > > "stdlib_module_names" was my first idea but it looks too long, so I
> > > chose "module_names". But someone on Twitter and now you asked me why
> > > not "stdlib_module_names", so I wrote a PR to rename module_names to
> > > sys.stdlib_module_names:
> > > https://github.com/python/cpython/pull/24332
> > >
> > > At least "stdlib_module_names" better summarizes its definition: "A
> > > frozenset of strings containing the names of standard library
> > > modules".
> >
> > Your first instinct that it is too long is correct.
>
> Disagreed. This is niche enough that it warrants a long but explicit
> name, rather than some ambiguous shortcut.
>

I agree w/ Antoine that a more descriptive name for such a niche (but
useful!) attribute makes sense.

-Brett


>
> > Just call it
> > "stdlib" or "stdlib_names".
>
> If you call it "stdlib", then you should make it a namedtuple that will
> expose various information, such as "sys.stdlib.module_names".
>


>
> Regards
>
> Antoine.
>
> Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CJYWXVBIMDHRJCT4HZMOLJ7XMSVNZF6I/
>
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
If the length of the name is any kind of issue, since the stdlib
only contains modules (and packages), why not just sys.stdlib_names?


On Mon, Jan 25, 2021 at 5:18 PM Victor Stinner <vstinner@python.org> wrote:

> Hi Bernat,
>
> "stdlib_module_names" was my first idea but it looks too long, so I
> chose "module_names". But someone on Twitter and now you asked me why
> not "stdlib_module_names", so I wrote a PR to rename module_names to
> sys.stdlib_module_names:
> https://github.com/python/cpython/pull/24332
>
> At least "stdlib_module_names" better summarizes its definition: "A
> frozenset of strings containing the names of standard library
> modules".
>
> Victor
>
>
> On Mon, Jan 25, 2021 at 5:39 PM Bernat Gabor <jokerjokerer@gmail.com> wrote:
> >
> > Hello,
> >
> > In general, I love the idea and implementation. I'm not in love with the
> > name though, it makes it sound like it contains all module names
> > imported/available. We have sys.modules already containing all modules
> > imported. So without deeper knowledge, sys.module_names is very close to
> > sys.modules.keys() or all available modules. Can we name it instead
> > sys.stdlib_modules_names to clarify that this is a standard-library-only
> > subset and not all available modules for the interpreter?
> >
> > Thanks,
> >
> > On Mon, Jan 25, 2021 at 4:33 PM Victor Stinner <vstinner@python.org> wrote:
> >>
> >> Hi Ivan,
> >>
> >> On Mon, Jan 25, 2021 at 4:53 PM Ivan Pozdeev via Python-Dev
> >> <python-dev@python.org> wrote:
> >> > Just _names_? There's a recurring error case when a 3rd-party module
> >> > overrides a standard one if it happens to have the same name. If you
> >> > filter such a module out, you're shooting yourself in the foot...
> >>
> >> Overriding stdlib modules has been discussed in the issue.
> >>
> >> For example, it was proposed to add an attribute to all stdlib modules
> >> (__stdlib__=True or __author__ = 'PSF'), and then check if the
> >> attribute exists or not. The problem is that importing a module to
> >> check for its attribute can cause side effects or fail, and so cannot be
> >> used for some use cases. For example, it would be surprising to open
> >> a web browser window when running isort on Python code containing
> >> "import antigravity". Another problem is that third parties can also add
> >> the attribute to pretend that their code is part of the stdlib.
> >>
> >> In a previous version of my PR, I added a note about sys.path and
> >> overriding stdlib modules, but I have been asked to remove it. Feel
> >> free to propose a PR to add such note if you consider that it's
> >> related to sys.module_names.
> >>
> >> Please read the discussion at https://bugs.python.org/issue42955 and
> >> https://github.com/python/cpython/pull/24238
> >>
> >> Victor
> >> Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7HMWTGBECAVLINLO3MAEN74YVDHOMZKM/
>
>
>
> --
> Night gathers, and now my watch begins. It shall not end until my death.
> Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/WJMYK2JKZPTXMID7WRMP4KMJ656WEMI5/
>
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On 1/26/2021 8:32 PM, Steve Holden wrote:
> If the length of the name is any kind of issue, since the stdlib
> only contains modules (and packages), why not just sys.stdlib_names?

And since the modules can vary between platforms and builds, why
wouldn't this be sysconfig.stdlib_names rather than sys.stdlib_names?

"Modules that were built into the stdlib" sounds more like sysconfig,
and having an accurate list seems better than one that specifies (e.g.)
distutils, ensurepip, resource or termios when those are absent.

Cheers,
Steve
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7JBPSATSJMONLAGEU5PKTJHZ72MFRXBK/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Tue, Jan 26, 2021 at 10:04 PM Steve Dower <steve.dower@python.org> wrote:
>
> On 1/26/2021 8:32 PM, Steve Holden wrote:
> > If the length of the name is any kind of issue, since the stdlib
> > only contains modules (and packages), why not just sys.stdlib_names?
>
> And since the modules can vary between platforms and builds, why
> wouldn't this be sysconfig.stdlib_names rather than sys.stdlib_names?

The list is the same on all platforms on purpose ;-) Example:

>>> 'winsound' in sys.stdlib_module_names
True
>>> 'ossaudiodev' in sys.stdlib_module_names
True

For example, grouping stdlib imports using sys.stdlib_module_names
gives the same output on any platform, even if there were missing
dependencies when you built Python.
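
A rough sketch of that grouping use case, assuming Python 3.10+ for sys.stdlib_module_names. The function group_imports is hypothetical and is not how isort is implemented; it only shows that the result depends on the fixed list, not on what happens to be installed:

```python
import sys

# Assumption: sys.stdlib_module_names exists (Python 3.10+); otherwise use an empty set.
STDLIB_NAMES = getattr(sys, "stdlib_module_names", frozenset())


def group_imports(module_names, stdlib_names=STDLIB_NAMES):
    """Split imported names into (stdlib, third_party), each sorted."""
    stdlib, third_party = [], []
    for name in module_names:
        # Only the top-level package decides the group ("os.path" -> "os").
        top_level = name.partition(".")[0]
        if top_level in stdlib_names:
            stdlib.append(name)
        else:
            third_party.append(name)
    return sorted(stdlib), sorted(third_party)
```

Because 'winsound' is in the list on every platform, a file importing it is grouped the same way on Linux as on Windows.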

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/3EEZW2MYNFIE4ZE75RLOLYQC5ZUKSN2D/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Mon, Jan 25, 2021 at 10:23 PM Random832 <random832@fastmail.com> wrote:

> On Mon, Jan 25, 2021, at 18:44, Chris Jerdonek wrote:
> > But to issue a warning when a standard module is being overridden like
> > I was suggesting, wouldn’t you also need to know whether the name of
> > the module being imported is a standard name, which is what
> > sys.module_names provides?
>
> I don't think the warning would be only useful for stdlib modules... has
> any thought been given to warning when a module being imported from the
> current directory / script directory is the same as an installed package?
>

Related to this, I wonder if another application of sys.stdlib_module_names
could be for installers: when installing a new package, a warning could be
issued if it would install a top-level name that is already in
sys.stdlib_module_names. I don't know off-hand what happens if one were to
try to do that today.

--Chris



>
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Tue, Jan 26, 2021 at 12:08:03PM -0800, Brett Cannon wrote:
> On Tue, Jan 26, 2021 at 1:26 AM Antoine Pitrou <antoine@python.org> wrote:

[...]
> > Disagreed. This is niche enough that it warrants a long but explicit
> > name, rather than some ambiguous shortcut.
> >
>
> I agree w/ Antoine that a more descriptive name for such a niche (but
> useful!) attribute makes sense.

This descriptive name is *literally incorrect*. By design, it doesn't
list modules. It only lists sub-packages and not sub-modules, to keep
the number of entries more manageable.

(Personally, I don't think an extra hundred or two names makes that much
difference. It's going to be a big list one way or the other.)

So by the current rules, many stdlib modules are not included and the
name is inaccurate.

If you're not going to list all the dotted modules of a package, why
distinguish sub-modules from sub-packages? It is confusing and awkward
to have only some dotted modules listed based on the **implementation**.

(We need a good term for "things you can import that create a module
*object*" regardless of whether they are a *module file* or a *package*.
I'm calling them a dotted module for lack of a better name.)

By the current rules, many stdlib modules are not listed, and you can't
see why unless you know their implementation:


* urllib - listed
* urllib.parse - not listed

* collections - listed
* collections.abc - not listed

* email - listed
* email.parser - not listed
* email.mime - listed # Surprise!


So we have this weird situation where an implementation detail of the
dotted module (whether it is a file `package/module.py` or
`package/module/__init__.py`) determines whether it shows up or not.

And because the file system structure of a module is not part of its
API, that implementation detail could change without warning.
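
The implementation detail in question can be observed from the import system itself. A small sketch, using importlib.util.find_spec (the helper name is_package is only illustrative): a spec whose submodule_search_locations is not None belongs to a package, which is what decides whether a dotted name is listed under the current rules.

```python
from importlib.util import find_spec


def is_package(dotted_name):
    """Return True if the importable name is implemented as a package."""
    # find_spec imports parent packages, but not the named module itself.
    spec = find_spec(dotted_name)
    if spec is None:
        raise ModuleNotFoundError(dotted_name)
    # Packages carry a list of search locations; plain modules have None.
    return spec.submodule_search_locations is not None


for name in ("email.parser", "email.mime", "urllib.parse", "collections.abc"):
    print(name, is_package(name))
```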

I think that either of:

1. list *all* package dotted modules regardless of whether they are
implemented as a sub-module or sub-package;

2. list *no* package dotted modules, only the top-level package;

would be better than this inconsistent hybrid of only listing some
dotted modules.

(Excluding the test modules is fine.)



--
Steve
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GNOAP4TVTHIUKE2GUGZWV6HNVE37KU4Q/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Tue, Jan 26, 2021 at 11:19:13PM +0100, Victor Stinner wrote:
> On Tue, Jan 26, 2021 at 10:04 PM Steve Dower <steve.dower@python.org> wrote:
> >
> > On 1/26/2021 8:32 PM, Steve Holden wrote:
> > > If the length of the name is any kind of issue, since the stdlib
> > > only contains modules (and packages), why not just sys.stdlib_names?
> >
> > And since the modules can vary between platforms and builds, why
> > wouldn't this be sysconfig.stdlib_names rather than sys.stdlib_names?
>
> The list is the same on all platforms on purpose ;-) Example:
>
> >>> 'winsound' in sys.stdlib_module_names
> True

Right. This is (I think) Steve's point: the list is inaccurate, because
the existence of 'winsound' in the stdlib_module_names doesn't mean that
the module 'winsound' exists.
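
A short sketch of how one might separate "is a stdlib name" from "is importable here", assuming the names passed in are top-level module names. The helper split_by_availability is hypothetical; importlib.util.find_spec returns None for a top-level name that cannot be found on the current interpreter.

```python
from importlib.util import find_spec


def split_by_availability(names):
    """Split module names into (available, missing) on this interpreter."""
    available, missing = [], []
    for name in sorted(names):
        # find_spec locates the module without executing it.
        if find_spec(name) is not None:
            available.append(name)
        else:
            missing.append(name)
    return available, missing
```

Calling it on sys.stdlib_module_names (Python 3.10+) would show which listed names, such as 'winsound' on Linux, are absent from the running build.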



--
Steve
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BHKOH3N2FGMTUVXT4YCJMPCOI7SWOTEU/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Tue, Jan 26, 2021 at 10:56:57AM +0100, Victor Stinner wrote:
> On Tue, Jan 26, 2021 at 12:44 AM Steven D'Aprano <steve@pearwood.info> wrote:

[...]
> > Your first instinct that it is too long is correct. Just call it
> > "stdlib" or "stdlib_names". The fact that it is a frozen set of module
> > names will be obvious from just looking at it, and there is no need for
> > the name to explain everything about it. We have:
>
> The sys module already has a sys.modules attribute, and so
> sys.module_names sounds like "give me the name of all imported
> modules": sys.modules.keys().

Then it's a good thing I didn't propose calling it "module_names" :-)



--
Steve
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/K7FP7PSFULJYL2GPZDSOGKDPKZZ5GYKP/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Wed, 27 Jan 2021 11:05:28 +1100
Steven D'Aprano <steve@pearwood.info> wrote:
> On Tue, Jan 26, 2021 at 12:08:03PM -0800, Brett Cannon wrote:
> > On Tue, Jan 26, 2021 at 1:26 AM Antoine Pitrou <antoine@python.org> wrote:
>
> [...]
> > > Disagreed. This is niche enough that it warrants a long but explicit
> > > name, rather than some ambiguous shortcut.
> > >
> >
> > I agree w/ Antoine that a more descriptive name for such a niche (but
> > useful!) attribute makes sense.
>
> This descriptive name is *literally incorrect*. By design, it doesn't
> list modules. It only lists sub-packages and not sub-modules, to keep
> the number of entries more manageable.
>
> (Personally, I don't think an extra hundred or two names makes that much
> difference. It's going to be a big list one way or the other.)
>
> So by the current rules, many stdlib modules are not included and the
> name is inaccurate.

Ok, then "stdlib_package_names"? :-)

Regards

Antoine.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/NMDFLQDCXRQNUBMHTHTH37OMDPCCQYRZ/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
On Wed, Jan 27, 2021 at 10:44:00AM +0100, Antoine Pitrou wrote:

> Ok, then "stdlib_package_names"? :-)

Heh :-)

I see your smiley, and I'm not going to argue about the name any
further. I have my preference, but if the consensus is
stdlib_module_names, so be it.

But I think the inconsistency between sub-modules and sub-packages is
important. We should either list all sub-whatever or none of them,
rather than only some of them.

--
Steve
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4YYPKUGD4HCFXUNBYXFNNTDDOUF7ZRR2/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules [ In reply to ]
Hi Steven,

That makes sense to me: I wrote
https://github.com/python/cpython/pull/24353 to exclude sub-packages.

The change removes 12 sub-packages from sys.stdlib_module_names and
len(sys.stdlib_module_names) becomes 300 :-)

-"concurrent.futures",
-"ctypes.macholib",
-"distutils.command",
-"email.mime",
-"ensurepip._bundled",
-"lib2to3.fixes",
-"lib2to3.pgen2",
-"multiprocessing.dummy",
-"xml.dom",
-"xml.etree",
-"xml.parsers",
-"xml.sax",

With that name, names of sys.stdlib_module_names don't contain "." anymore.

So to check if "email.message" is a stdlib module name, exclude the
part after the first dot, and check if "email" is in
sys.stdlib_module_names. In practice, it is not possible to add a
sub-package or a sub-module to a stdlib module, so this limitation
(excluding sub-packages and sub-modules) sounds reasonable to me.
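
The check described above can be sketched as follows. This is only a minimal illustration: is_stdlib_module and STDLIB_NAMES are hypothetical names, not part of the sys module, and the empty fallback is an assumption added so that the snippet still loads on Python older than 3.10.

```python
import sys

# sys.stdlib_module_names exists on Python 3.10+. The empty fallback is an
# assumption of this sketch so that it still loads on older interpreters.
STDLIB_NAMES = getattr(sys, "stdlib_module_names", frozenset())


def is_stdlib_module(name, stdlib_names=STDLIB_NAMES):
    """Return True if the top-level package of `name` is a stdlib name."""
    # Keep only the part before the first dot: "email.message" -> "email".
    top_level = name.partition(".")[0]
    return top_level in stdlib_names
```

On Python 3.10 or later, is_stdlib_module("email.message") should return True, since "email" is listed, while a third-party name such as "numpy.linalg" should return False.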

Victor


On Wed, Jan 27, 2021 at 1:09 AM Steven D'Aprano <steve@pearwood.info> wrote:
> This descriptive name is *literally incorrect*. By design, it doesn't
> list modules. It only lists sub-packages and not sub-modules, to keep
> the number of entries more manageable.
>
> (Personally, I don't think an extra hundred or two names makes that much
> difference. It's going to be a big list one way or the other.)
>
> So by the current rules, many stdlib modules are not included and the
> name is inaccurate.
>
> If you're not going to list all the dotted modules of a package, why
> distinguish sub-modules from sub-packages? It is confusing and awkward
> to have only some dotted modules listed based on the **implementation**.
>
> (We need a good term for "things you can import that create a module
> *object* regardless of whether they are a *module file* or a *package*".
> I'm calling them a dotted module for lack of a better name.)
>
> By the current rules, many stdlib modules are not listed, and you can't
> see why unless you know their implementation:
>
>
> * urllib - listed
> * urllib.parse - not listed
>
> * collections - listed
> * collections.abc - not listed
>
> * email - listed
> * email.parser - not listed
> * email.mime - listed # Surprise!
>
>
> So we have this weird situation where an implementation detail of the
> dotted module (whether it is a file `package/module.py` or
> `package/module/__init__.py`) determines whether it shows up or not.
>
> And because the file system structure of a module is not part of its
> API, that implementation detail could change without warning.
>
> I think that either of:
>
> 1. list *all* package dotted modules regardless of whether they are
> implemented as a sub-module or sub-package;
>
> 2. list *no* package dotted modules, only the top-level package;
>
> would be better than this inconsistent hybrid of only listing some
> dotted modules.
>
> (Excluding the test modules is fine.)

--
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/UW6F2MRC5RNOLEJJI64BALENK7R7UYA2/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules
On Wed, Jan 27, 2021 at 1:16 AM Steven D'Aprano <steve@pearwood.info> wrote:
> Right. This is (I think) Steve's point: the list is inaccurate, because
> the existence of 'winsound' in the stdlib_module_names doesn't mean that
> the module 'winsound' exists.

This point is addressed by the definition of the list in the
sys.stdlib_module_names documentation:

"It is the same on all platforms. Modules which are not available on
some platforms and modules disabled at Python build are also listed.
All module kinds are listed: pure Python, built-in, frozen and
extension modules. Test modules are excluded."

https://docs.python.org/dev/library/sys.html#sys.stdlib_module_names

As I wrote previously, there are use cases which *require* the list
to be the same on all platforms. Moreover, in practice, it's quite
hard to build a list of available stdlib module names: you need to
build the extension modules, try to import them, then rebuild the list
of modules, which requires rebuilding Python. It's not convenient. Also,
there are different definitions of "available". For example, "import
multiprocessing" can fail on some platforms if there is no lock
implementation available. A module being installed on the system
doesn't guarantee that the import will work.

IMO the only reliable way to check if a module can be imported... is
to import it. And then you hit again the issue of import side effects.

There are different ways to filter the sys.stdlib_module_names list
down to "available" modules: try the import, or use
pkgutil.iter_modules() or pkgutil.walk_packages(). IMO it should
remain out of the scope of sys.stdlib_module_names.
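
As a rough sketch of the pkgutil.iter_modules() approach, the filter could look like this. The function name available_stdlib_modules is hypothetical, and "available" here only means that the module can be located without importing it, so an actual import may still fail for the reasons given above.

```python
import pkgutil
import sys


def available_stdlib_modules(stdlib_names):
    """Return the names from `stdlib_names` that can be located locally.

    Nothing is imported, so there are no import side effects, but a
    located module is not guaranteed to import successfully.
    """
    # Top-level modules and packages found by the finders on sys.path.
    found = {info.name for info in pkgutil.iter_modules()}
    # Built-in modules are compiled into the interpreter and are not
    # reported by iter_modules(), so add them explicitly.
    found.update(sys.builtin_module_names)
    return frozenset(stdlib_names) & found
```

On Python 3.10 or later it would be called as available_stdlib_modules(sys.stdlib_module_names).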

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IKFK6CTTYTWD2VFH36AIN5IGS66KSMFA/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules
I probably wouldn't think of that on my own, but the need is rare enough that having the recipe in the documentation (preferably including the docstring) might be enough. (Or it might not.)
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BFSS7QXGT3PA6TKSC55JLLUFO5AXUTOC/
Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules
I see a bunch of similar -- but not quite the same -- use cases.

I feel like instead of a set, it should be a dict pointing to an object with attributes that describe the module in various ways (top-level vs subpackage, installed on this machine or not, test module or not, etc). I'll understand if this seems like scope creep, but try not to rule it out as a future enhancement. (e.g., don't promise it will be precisely a set, as opposed to the keys of a map.)
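
To make the idea concrete, here is a purely hypothetical sketch of such a mapping. Nothing like StdlibModuleInfo or STDLIB_MODULES exists in the sys module, and the attribute values are illustrative placeholders rather than real data about any machine.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class StdlibModuleInfo:
    """Hypothetical per-module description; the field names are assumptions."""
    top_level: bool
    installed: bool
    test_module: bool


# Read-only mapping from module name to its description. The entries are
# placeholders chosen for illustration only.
STDLIB_MODULES = MappingProxyType({
    "email": StdlibModuleInfo(top_level=True, installed=True, test_module=False),
    "winsound": StdlibModuleInfo(top_level=True, installed=False, test_module=False),
})
```

Membership tests such as "email" in STDLIB_MODULES would keep working as they do with a set, while STDLIB_MODULES["winsound"].installed would expose the extra detail.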
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/VZ5NSCXXHRE63477ANQXJHD3U2YDFU3J/