Mailing List Archive

Re: PEP 626: Precise line numbers for debugging and other tools.
On Wed, Jul 22, 2020 at 5:19 AM Mark Shannon <mark@hotpy.org> wrote:

>
>
> On 21/07/2020 9:46 pm, Gregory P. Smith wrote:
> >
> >
> > On Fri, Jul 17, 2020 at 8:41 AM Ned Batchelder <ned@nedbatchelder.com
> > <mailto:ned@nedbatchelder.com>> wrote:
> >
> > https://www.python.org/dev/peps/pep-0626/ :)
> >
> > --Ned.
> >
> > On 7/17/20 10:48 AM, Mark Shannon wrote:
> > > Hi all,
> > >
> > > I'd like to announce a new PEP.
> > >
> > > It is mainly codifying that Python should do what you probably
> > already
> > > thought it did :)
> > >
> > > Should be uncontroversial, but all comments are welcome.
> > >
> > > Cheers,
> > > Mark.
> >
> >
> > """When a frame object is created, the f_lineno will be set to the line
> > at which the function or class is defined. For modules it will be set to
> > zero."""
> >
> > Within this PEP it'd be good for us to be very pedantic. f_lineno is a
> > single number. So which number is it given many class and function
> > definition statements can span multiple lines.
> >
> > Is it the line containing the class or def keyword? Or is it the line
> > containing the trailing :?
>
> The line of the `def`/`class`. It wouldn't change from the current
> behavior. I'll add that to the PEP.
>
> >
> > Q: Why can't we have the information about the entire span of lines
> > rather than consider a definition to be a "line"?
>
> Pretty much every profiler, coverage tool, and debugger ever expects
> lines to be natural numbers, not ranges of numbers.
> A lot of tooling would need to be changed.
>
> >
> > I think that question applies to later sections as well. Anywhere we
> > refer to a "line", it could actually mean a span of lines. (especially
> > when you consider \ continuation in situations you might not otherwise
> > think could span lines)
>
> Let's take an example:
> ```
> x = (
>     a,
>     b,
> )
> ```
>
> You would want the BUILD_TUPLE instruction to have a span of lines 1 to
> 4 (inclusive), rather than just line 1?
> If you wanted to break on the BUILD_TUPLE, where would you tell pdb to break?
>
> I don't see that it would add much value, but it would add a lot of
> complexity.
>

We should have the data about the range at bytecode compilation time,
correct? So why not keep it? Sure, most existing tooling would just use
the start of the range as the line number, as it always has. But some
tooling could find the range useful (e.g. semantic code indexing for use in
display, search, editors, and IDEs; rendering lint errors more accurately
instead of just claiming a single line or resorting to parsing hacks to
come up with a range; etc.). The downside is that we'd be storing a second
number in bytecode, making it slightly larger. Though it could be stored
efficiently as a prefixed delta, so it'd likely average out to less than 2
bytes per line number stored. (I don't have a feeling for our current
format to know if that is significant or not. If it is, maybe this idea
just gets nixed.)
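
As a rough illustration that the span is already known before bytecode is
emitted, the sketch below asks the `ast` module (Python 3.8 or later) for the
first and last line of the assignment from Mark's example. The helper name
`statement_spans` is made up for this example; it is not part of CPython or
of the PEP.

```python
import ast

SOURCE = """\
x = (
    a,
    b,
)
"""

def statement_spans(source):
    """Return (first_line, last_line) for each top-level statement."""
    tree = ast.parse(source)
    # end_lineno is available on statement nodes from Python 3.8 onwards.
    return [(node.lineno, node.end_lineno) for node in tree.body]

spans = statement_spans(SOURCE)
print(spans)
```

A tool that only wants a single number can keep using the first element of
each pair, which matches the "start of the range" behaviour described above.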

The reason the range concept was on my mind is due to something not quite
related but involving a changed idea of a line number in our current system
that we recently ran into with pytype during a Python upgrade.

"""in 3.7, if a function body is a plain docstring, the line number of the
RETURN_VALUE opcode corresponds to the docstring, whereas in 3.6 it
corresponds to the function definition.""" (Thanks, Martin & Rebecca!)

```python
def no_op():
    """docstring instead of pass."""
```

So the location of what *was* originally an end-of-line `# pytype:
disable=bad-return-type` comment (to work around an issue not relevant
here) turned awkward and version dependent. pytype is bytecode based, so
that is where its line numbers come from. Metadata comments in source can
only be tied to bytecode via line numbers, which makes end-of-line
directives occasionally hard to match up.

When there is no return statement, this opcode still exists. What line
number does it belong to? 3.6's answer made sense to me. 3.7's seems
wrong: a docstring isn't responsible for a return opcode. I didn't check
what 3.8 and 3.9 do. An alternate answer after this PEP is that it
wouldn't have a line number when there is no return statement (pedantically
correct, I approve! #win).
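
One way to check what a given interpreter does is to compile the
docstring-only function and list where each source line starts in its
bytecode. This is only a sketch: `function_line_starts` is a name invented
here, and the pairs it prints differ between Python versions, which is
exactly the version dependence described above.

```python
import dis
import types

SOURCE = '''\
def no_op():
    """docstring instead of pass."""
'''

def function_line_starts(source):
    """Compile `source` and return (offset, line) pairs for its first function."""
    module_code = compile(source, "<example>", "exec")
    # The function body is stored as a nested code object in co_consts.
    func_code = next(
        const for const in module_code.co_consts
        if isinstance(const, types.CodeType)
    )
    return list(dis.findlinestarts(func_code))

line_starts = function_line_starts(SOURCE)
for offset, line in line_starts:
    print(f"offset {offset}: line {line}")
```

Line 1 is the `def` and line 2 is the docstring, so the line reported for the
final offset shows which of the two the implicit return is attributed to.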

-gps


>
> Cheers,
> Mark.
>
> >
> > -gps
>
Re: PEP 626: Precise line numbers for debugging and other tools.
But on which line is the RETURN opcode if there is more than a docstring?
Doesn’t it make sense to have it attached to the last line of the body?
(Too bad about pytype, that kind of change happens — we had this kind of
thing for mypy too, when line numbers in the AST were fixed.)

On Wed, Jul 22, 2020 at 17:29 Gregory P. Smith <greg@krypto.org> wrote:

> When there is no return statement, this opcode still exists. What line
> number does it belong to? 3.6's answer made sense to me. 3.7's seems
> wrong: a docstring isn't responsible for a return opcode. I didn't check
> what 3.8 and 3.9 do. An alternate answer after this PEP is that it
> wouldn't have a line number when there is no return statement (pedantically
> correct, I approve! #win).
>
--
--Guido (mobile)
Re: PEP 626: Precise line numbers for debugging and other tools.
On 22Jul2020 1319, Mark Shannon wrote:
> On 21/07/2020 9:46 pm, Gregory P. Smith wrote:
>>
>> Q: Why can't we have the information about the entire span of lines
>> rather than consider a definition to be a "line"?
>
> Pretty much every profiler, coverage tool, and debugger ever expects
> lines to be natural numbers, not ranges of numbers.
> A lot of tooling would need to be changed.

As someone who worked on apparently the only debugger that expects
_character_ ranges, rather than a simple line number, I would love to
keep full mapping information somewhere.

We experimented with some stack analysis to see if we could tell the
difference between being inside the list comprehension vs. outside the
comprehension, or which of the nested comprehensions is currently
running. But it turned out to be too much trouble.

An alternative to lnotab that includes the full line/column range for
the expression, presumably taken from a particular type of node in the
AST, would be great. But I think omitting even line ranges at this stage
would be a missed opportunity, since we're breaking non-Python debuggers
anyway.
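
To make the idea of character ranges concrete, here is a small sketch that
reads them from the AST (Python 3.8 or later) for a nested list
comprehension. The function name `comprehension_ranges` and the sample
source are invented for illustration; this is not how any existing debugger
obtains its mapping.

```python
import ast

SOURCE = "result = [[y * 2 for y in row] for row in rows]"

def comprehension_ranges(source):
    """Return (start_col, end_col, text) for every list comprehension."""
    ranges = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ListComp):
            # Offsets are UTF-8 byte offsets; for this single-line ASCII
            # source they coincide with character positions.
            start, end = node.col_offset, node.end_col_offset
            ranges.append((start, end, source[start:end]))
    return sorted(ranges)

for start, end, text in comprehension_ranges(SOURCE):
    print(start, end, text)
```

Both comprehensions sit on the same line, so a line number alone cannot tell
them apart, while the column ranges can.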

Cheers,
Steve
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/MKX3TW2DCQ5XCOWP2C4XBREENQKFIFH3/
Code of Conduct: http://python.org/psf/codeofconduct/
Re: PEP 626: Precise line numbers for debugging and other tools.
In theory, this table could be stored somewhere other than the code object, so that it doesn't actually get paged in or occupy cache unless tracing is on. Whether that saves enough to be worth the extra indirections when tracing is on, I have no intention of volunteering to measure. I will note that in the past, taking out docstrings (not even just moving them to a dict of [code:docstring] -- just taking them out completely) has been considered worthwhile.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/HEXSSC35MFWFKFRK6TO4N5SBJDTZAZOS/
Re: PEP 626: Precise line numbers for debugging and other tools.
I think this example should be in the PEP.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Z6TNMC7HKRQHQMEDHXKM2PAAKE233KUO/
Re: PEP 626: Precise line numbers for debugging and other tools.
> In theory, this table could be stored somewhere other than the code
> object, so that it doesn't actually get paged in or occupy cache unless
> tracing is on.

As some of us mentioned before, that will hurt the ecosystem of profilers
and debugger tools considerably.

On Thu, 23 Jul 2020 at 18:08, Jim J. Jewett <jimjjewett@gmail.com> wrote:

> In theory, this table could be stored somewhere other than the code
> object, so that it doesn't actually get paged in or occupy cache unless
> tracing is on. Whether that saves enough to be worth the extra
> indirections when tracing is on, I have no intention of volunteering to
> measure. I will note that in the past, taking out docstrings (not even
> just moving them to a dict of [code:docstring] -- just taking them out
> completely) has been considered worthwhile.
Re: PEP 626: Precise line numbers for debugging and other tools.
I certainly understand saying "this change isn't important enough to justify a change."

But it sounds as though you are saying the benefit is irrelevant; it is just inherently too expensive to ask programs that are already dealing with internals and trying to optimize performance to make a mechanical change from:
code.magic_attrname
to:
magicdict[code]

What have I missed?
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TDCJFNHIAFEH5NIBEPP2GFP4C2BYR2DP/
Re: PEP 626: Precise line numbers for debugging and other tools.
On Sat, Jul 25, 2020 at 12:17 PM Jim J. Jewett <jimjjewett@gmail.com> wrote:

> I certainly understand saying "this change isn't important enough to
> justify a change."
>
> But it sounds as though you are saying the benefit is irrelevant;


Jim, if you include what you’re replying to in your own message (like I’m
doing here), it will be easier for people to tell who / what you’re
replying to. I wasn’t able to tell what your last few messages were in
reply to.

—Chris


> it is just inherently too expensive to ask programs that are already
> dealing with internals and trying to optimize performance to make a
> mechanical change from:
> code.magic_attrname
> to:
> magicdict[code]
>
> What have I missed?
Re: PEP 626: Precise line numbers for debugging and other tools.
On 25Jul2020 2014, Jim J. Jewett wrote:
> But it sounds as though you are saying the benefit is irrelevant; it is just inherently too expensive to ask programs that are already dealing with internals and trying to optimize performance to make a mechanical change from:
> code.magic_attrname
> to:
> magicdict[code]
>
> What have I missed?

You've missed that debugging and profiling tools that operate purely on
native memory can't execute Python code, so the "magic" has to be easily
representable in C such that it can be copied into whichever language is
being used (whether it's C, C++, C#, Rust, or something else).
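
To show why a flat byte table suits such tools, the sketch below decodes a
table using nothing but integer arithmetic on raw bytes. It assumes the
historical `co_lnotab` layout of (offset increment, signed line increment)
byte pairs used by CPython 3.6 to 3.9; the table bytes and the name
`decode_lnotab` are made up for the example. It is written in Python only to
keep the examples in one language; a real out-of-process tool would run the
same loop in C, C++, C#, or Rust over bytes copied from the target process.

```python
def decode_lnotab(lnotab, first_lineno):
    """Decode co_lnotab-style bytes into (bytecode_offset, line) pairs."""
    pairs = []
    addr = 0
    lineno = first_lineno
    last_lineno = None
    for i in range(0, len(lnotab) - 1, 2):
        addr_incr, line_incr = lnotab[i], lnotab[i + 1]
        if addr_incr:
            # A new bytecode range starts here; record it once per line.
            if lineno != last_lineno:
                pairs.append((addr, lineno))
                last_lineno = lineno
            addr += addr_incr
        # Line increments are signed 8-bit values (Python 3.6 and later).
        if line_incr >= 0x80:
            line_incr -= 0x100
        lineno += line_incr
    if lineno != last_lineno:
        pairs.append((addr, lineno))
    return pairs

# Invented table: +1 line, then 6 bytes and +2 lines, then 4 bytes and -1 line.
RAW_TABLE = bytes([0, 1, 6, 2, 4, 0xFF])
decoded = decode_lnotab(RAW_TABLE, first_lineno=10)
print(decoded)
```

Nothing in the loop needs the interpreter, which is the property the tools
described above rely on.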

Cheers,
Steve
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/PE44CTX6NG6KOUPIJUFRXJHNFSFMN2TK/
Re: PEP 626: Precise line numbers for debugging and other tools.
ah... we may have been talking past each other.

Steve Dower wrote:
> On 25Jul2020 2014, Jim J. Jewett wrote:
> > But it sounds as though you are saying the benefit

[... of storing the line numbers in an external table, I thought,
but perhaps Pablo Galindo Salgado and yourself were
talking only of the switch from an lnotab string to an opaque
co_linetable?]

> > is irrelevant; it is just inherently too expensive to ask programs that are already dealing
> > with internals and trying to optimize performance to make a mechanical change from:
> > code.magic_attrname
> > to:
> > magicdict[code]
> > What have I missed?

> You've missed that debugging and profiling tools that operate purely on
> native memory can't execute Python code, so the "magic" has to be easily
> representable in C such that it can be copied into whichever language is
> being used (whether it's C, C++, C#, Rust, or something else).

Unless you really were talking only of the switch to co_linetable, I'm still
missing the problem. To me, it still looks like a call to:

PyAPI_FUNC(PyObject *) PyObject_GetAttrString(PyObject *, const char *);

with the code object being stepped through and "co_lnotab"
would be replaced by:

PyAPI_FUNC(PyObject *) PyDict_GetItem(PyObject *mp, PyObject *key);

using that same code object as the key, but getting the dict from
some well-known (yet-to-be-defined) location, such as sys.code_to_lnotab.
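
At the Python level, the mechanical change being described might look like
the sketch below. The dictionary `code_to_linetable` is a hypothetical
stand-in for the yet-to-be-defined location; no `sys.code_to_lnotab`
attribute exists in CPython, and `register` is a helper invented for this
example.

```python
import dis

# Hypothetical side table, standing in for the "well-known location"
# floated above. It is keyed by the code object itself.
code_to_linetable = {}

def register(code):
    """Store the line table for `code` outside the code object."""
    code_to_linetable[code] = tuple(dis.findlinestarts(code))

def sample(a, b):
    total = a + b
    return total

register(sample.__code__)

# The lookup changes from an attribute access on the code object
# to a dictionary access keyed by that same object.
table = code_to_linetable[sample.__code__]
print(table)
```

The sketch only covers tools that run inside the interpreter; it says nothing
about tools that read the process memory from outside.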

Mark Shannon and Carl Shapiro had seemed to object to the PEP because
the new structure would make the code object longer, and making it smaller
by a string does seem likely to be good. But if your real objections are
just to replacing the lnotab format with something that needs to be
executed, then I apologize for misunderstanding.

-jJ
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/WUEFHFTPVTOPA3EFHACDECT3ZPLGGTFJ/
Re: PEP 626: Precise line numbers for debugging and other tools.
On Tue, Jul 28, 2020 at 2:12 PM Jim J. Jewett <jimjjewett@gmail.com> wrote:

> Unless you really were talking only of the switch to co_linetable, I'm
> still missing the problem. To me, it still looks like a call to:
>
> PyAPI_FUNC(PyObject *) PyObject_GetAttrString(PyObject *, const char *);
>
> with the code object being stepped through and "co_lnotab"
> would be replaced by:
>
> PyAPI_FUNC(PyObject *) PyDict_GetItem(PyObject *mp, PyObject *key);
>
> using that same code object as the key, but getting the dict from
> some well-known (yet-to-be-defined) location, such as sys.code_to_lnotab.

Introspection of the running CPython process is happening from outside of
the CPython interpreter itself: either from a signal handler or a C/C++
managed thread within the process, or (as Pablo suggested) from outside the
process entirely. Calling CPython APIs is not an option in any of those
situations.

That is why I suggested that the "undocumented" new co_linetable will be
used instead of the disappeared co_lnotab regardless of documentation or
claimed stability guarantees. It sounds like an equivalent read only data
source for this purpose. It doesn't matter to anyone with such a profiler
if it is claimed to be unspecified.

The data is needed, and the format shouldn't change within a stable Python
major.minor release (we'd be unlikely to change it anyway, even without that
guarantee). Given this, I suggest at least specifying valuable properties
of it such as "read only, never mutated", even if the exact format is
intentionally left as an implementation-defined structure, subject to change
between minor releases.

-gps


