Mailing List Archive

[issue45582] Rewrite getpath.c in Python
New submission from Steve Dower <steve.dower@python.org>:

As discussed in issue42260, combining the two getpath implementations into a single Python implementation would make it more maintainable and modifiable (particularly where distros need to patch to support alternative layouts).

----------
assignee: steve.dower
components: Interpreter Core
messages: 404841
nosy: eric.snow, ncoghlan, steve.dower, vstinner
priority: normal
severity: normal
stage: needs patch
status: open
title: Rewrite getpath.c in Python
type: enhancement
versions: Python 3.11

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue45582>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[issue45582] Rewrite getpath.c in Python [ In reply to ]
Change by Steve Dower <steve.dower@python.org>:


----------
keywords: +patch
pull_requests: +27452
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/29041

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:

The PR has more work to do, but the overall layout/changes are more or less there, so happy to discuss feedback/etc.

Obviously there are a lot of edge cases here, but they seem to be mostly tested already. And I think we're early enough in alpha to find any major issues (or absorb any necessary minor changes - seems like trailing slashes might change on some paths).

There are also some changes/hacks to the new frozen module support, so that I can freeze getpath.py without turning it into a module. I really just want to execute the bytecode - no reason for any of its contents to stick around - and this works out pretty neatly. But if the changes to frozen modules seem off then maybe we can split it into totally separate freezing support?

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Change by Filipe Laíns <lains@riseup.net>:


----------
nosy: +FFY00

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:

So I think I've found my first completely unavoidable API break: PyConfig_Read(config) has to work before initialisation, but is also supposed to fill out all the fields (including the search path). But because we need at least an interpreter state, we now can't calculate everything.

The only test that seems to be affected here is test_embed.test_init_read_set(), which does a PyConfig_Read() and then inserts new paths into module_search_paths before initialising. With that one skipped, I think everything else can be handled.

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:

Last remaining test failure is one that I can't figure out on my own - the freeze test is rerunning a CPython build (on Linux) and is apparently building getpath.c with the ".c.o" rule rather than the "Modules/getpath.o" rule.

Any tips as to what I should be looking at to figure this one out?

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Eric Snow <ericsnowcurrently@gmail.com> added the comment:

On Wed, Nov 3, 2021 at 1:21 PM Steve Dower <report@bugs.python.org> wrote:
> Last remaining test failure is one that I can't figure out on my own - the freeze test is rerunning a CPython build (on Linux) and is apparently building getpath.c with the ".c.o" rule rather than the "Modules/getpath.o" rule.
>
> Any tips as to what I should be looking at to figure this one out?

That test does an out-of-tree build. Might that be related?

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:

I'm betting the out-of-tree (actually just deeper within the same tree) bit is related, but I just can't see how. Modules/getbuildinfo.c takes extra parameters and they seem to be being used, so I can't tell why getpath.c's are not (those rules are listed right next to each other, but well above the .c.o rule).

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:

Unsurprisingly, it was a bad edit that I made to the Makefile myself. The commit that undoes it is https://github.com/python/cpython/pull/29041/commits/aedebcc45a638f5cf65d17046ae09b5cac97cebf but since I made the initial change as part of this PR, it was never merged in.

Now to find out why the old getpath could somehow locate the stdlib but new getpath cannot... (I'm guessing it is finding the "original" stdlib rather than the fresh clone, since AFAICT there's no reference at all to the original source dir)

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Eric Snow <ericsnowcurrently@gmail.com> added the comment:

On Wed, Nov 3, 2021 at 6:25 PM Steve Dower <report@bugs.python.org> wrote:
> Now to find out why the old getpath could somehow locate the stdlib but new getpath cannot... (I'm guessing it is finding the "original" stdlib rather than the fresh clone, since AFAICT there's no reference at all to the original source dir)

What fresh clone do you mean? test_embed is failing, not test_freeze.
So there is no out-of-tree build involved.

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:

> What fresh clone do you mean? test_embed is failing, not test_freeze.

test_freeze is passing, but it shouldn't be able to locate a valid Lib/ directory to load modules from. So it's somehow managing to do it against the "official" logic (none of the searches are going to find `root/python-build/Lib` starting from `root/python-installation/python`).

I haven't dug into it yet, but I suspect if the root used for this test is moved outside of the main tree then it will fail again.
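
The point about the searches can be illustrated with a minimal sketch of an upward landmark search. This is a simplified, hypothetical helper written for illustration, not the code from the PR; the name `search_up`, its signature, and the choice of `Lib/os.py` as the landmark are assumptions here.

```python
import os


def search_up(start, landmark):
    """Walk from `start` toward the filesystem root and return the first
    directory that contains the file `landmark`, or None if no ancestor does.

    Hypothetical helper for illustration only.
    """
    current = os.path.abspath(start)
    while True:
        if os.path.isfile(os.path.join(current, landmark)):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent
```

Under these assumptions, a search that starts in `root/python-installation` only ever visits that directory and its ancestors, so it cannot land on the sibling `root/python-build`, which is why a passing test is suspicious.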

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:

I'm expecting another dumb error (on my part) or two in the PR, but I'm very close to having this working.

Reviews would be appreciated! Bear in mind that I'm trying to match the current (quirky) behaviour, rather than streamline anything by changing it (yet). We can do those once we know we've got something working.

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Eric Snow <ericsnowcurrently@gmail.com> added the comment:

On Thu, Nov 11, 2021 at 6:27 PM Steve Dower <report@bugs.python.org> wrote:
> rather than streamline anything by changing it (yet). We can do those once we know we've got something working.

+1

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:

I have tests passing now, so reviews would be appreciated.

There's definitely scope for optimising this algorithm both for speed and clarity, but I'd prefer to get the main translation in first so that any further changes have a reliable baseline (especially since we'll likely end up changing the behaviour slightly if we touch anything at all in getpath, so it'd be good to capture those as individual commits/bugs).

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:

Status update on this: I owe everyone a perf comparison of the before/after with this change.

I don't particularly want to block on a regression unless it's significant (honestly still have no idea what to expect), but open to others' thoughts on this point. How big a perf impact is this change worth?

(Obviously once I have some numbers the discussion can be more concrete, but I don't have them yet, and I have to catch up on other issues for a while as this one took so long to get this far.)

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:

It's one data point (well, statistics over 1000 points), but it looks like it's actually a slight improvement in performance over the previous code on Windows :)

        before      after
min     23.103      22.154
25%     25.069      23.59925
50%     25.8125     24.2715
75%     26.65175    24.89575
max     147.567     138.997

Going to run a Linux test as well, since that was a completely different code path, but assuming it's not drastically different then I'll go ahead and merge.
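
A summary of this shape can be produced with a short script. The thread does not say how the measurements were taken or what unit the figures are in, so the following is only a sketch under stated assumptions: it times `python -c pass` launches, reports milliseconds, and uses inclusive quartiles. The names `time_startup` and `summarize` are made up for this example.

```python
import statistics
import subprocess
import sys
import time


def time_startup(runs=20, executable=sys.executable):
    """Return wall-clock times in milliseconds for `runs` interpreter launches."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([executable, "-c", "pass"], check=True)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Return min, quartiles (inclusive method), and max of the samples."""
    q1, q2, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    return {"min": min(samples), "25%": q1, "50%": q2, "75%": q3, "max": max(samples)}
```

Running `summarize(time_startup(executable=...))` once per build would give one column each for a before/after comparison. Process launch times are noisy, so a large number of runs (the message mentions 1000) is more meaningful than the small default used here.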

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:

Basically unchanged on Debian/WSL as well.

A new conflict has arisen, so I'll resolve that and then merge.

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:


New changeset 99fcf1505218464c489d419d4500f126b6d6dc28 by Steve Dower in branch 'main':
bpo-45582: Port getpath[p].c to Python (GH-29041)
https://github.com/python/cpython/commit/99fcf1505218464c489d419d4500f126b6d6dc28


----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:

It's done! Those were some of the hardest memory leaks I've had to track down, but it should be clear now.

As I mentioned in the commit message, this change attempts to preserve every known quirk. However, I think these ought to be streamlined across platforms to improve the code and/or startup performance.

But at least we know now that any changes in the future that require test changes are expected, and not the result of this change :)

----------
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Eric Snow <ericsnowcurrently@gmail.com> added the comment:

Hurray! Thanks, Steve!

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Change by Christian Heimes <lists@cheimes.de>:


----------
nosy: +christian.heimes
nosy_count: 5.0 -> 6.0
pull_requests: +28126
pull_request: https://github.com/python/cpython/pull/29902

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Christian Heimes <lists@cheimes.de> added the comment:


New changeset ccb73a0d50dd03bc8455fe210cb83e41a6dc91d8 by Christian Heimes in branch 'main':
bpo-45582: Fix out-of-tree build issues with new getpath (GH-29902)
https://github.com/python/cpython/commit/ccb73a0d50dd03bc8455fe210cb83e41a6dc91d8


----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Change by neonene <nicesalmon@gmail.com>:


----------
nosy: +neonene
nosy_count: 6.0 -> 7.0
pull_requests: +28130
pull_request: https://github.com/python/cpython/pull/29906

[issue45582] Rewrite getpath.c in Python [ In reply to ]
neonene <nicesalmon@gmail.com> added the comment:

The PGO-instrumented binary does not seem to locate the stdlib directory correctly with PR 29041. I can run it with PYTHONPATH set.


Python path configuration:
PYTHONHOME = 'C:\Py311\'
PYTHONPATH = (not set)
program name = 'C:\Py311\PCbuild\amd64\instrumented\python.exe'
isolated = 0
environment = 1
user site = 1
import site = 1
is in build tree = 1
stdlib dir = 'C:\Py311\PCbuild\Lib'
sys._base_executable = 'C:\\py311\\PCbuild\\amd64\\instrumented\\python.exe'
sys.base_prefix = 'C:\\py311\\'
sys.base_exec_prefix = 'C:\\py311\\'
sys.platlibdir = 'DLLs'
sys.executable = 'C:\\py311\\PCbuild\\amd64\\instrumented\\python.exe'
sys.prefix = 'C:\\py311\\'
sys.exec_prefix = 'C:\\py311\\'
sys.path = [
'C:\\py311\\PCbuild\\amd64\\instrumented\\python311.zip',
'C:\\py311\\PCbuild\\Lib',
'C:\\py311\\PCbuild\\amd64\\instrumented',
]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'
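
One way to read a dump like this is to check which of the listed `sys.path` entries could actually supply the `encodings` package. The helper below is a hypothetical diagnostic sketch, not part of CPython; it only looks at plain directories and ignores zip archives such as `python311.zip`.

```python
import os


def entries_with_encodings(paths):
    """Return the entries of `paths` that contain an `encodings` package.

    Hypothetical diagnostic helper: only plain directories are checked,
    zip archives on the path are not inspected.
    """
    hits = []
    for entry in paths:
        if os.path.isfile(os.path.join(entry, "encodings", "__init__.py")):
            hits.append(entry)
    return hits
```

Passing the `sys.path` list from a failing start-up to this function and getting an empty result would be consistent with the `No module named 'encodings'` error shown above.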

----------

[issue45582] Rewrite getpath.c in Python [ In reply to ]
Steve Dower <steve.dower@python.org> added the comment:


New changeset 7d7c91a8e8c0bb04105a21a17d1061ffc1c04d80 by neonene in branch 'main':
bpo-45582: Add a NOT operator to the condition in getpath_isxfile (GH-29906)
https://github.com/python/cpython/commit/7d7c91a8e8c0bb04105a21a17d1061ffc1c04d80


----------
