Hi all,
Right now, when a debugger is active, the number of local variables can
affect the tracing speed quite a lot.
For instance, with tracing set up as in the program below, the run takes
4.64 seconds. Changing all the variables to have the same name -- i.e.
changing all assignments to `a = 1`, so that there's only a single
variable in the namespace -- brings it down to 1.47 seconds (on my
machine). The more variables there are, the slower the tracing becomes.
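```
import sys
import time

t = time.time()

def call():
    # Six distinct locals; rename them all to `a` to compare timings.
    a = 1
    b = 1
    c = 1
    d = 1
    e = 1
    f = 1

def noop(frame, event, arg):
    # Returning the tracer itself keeps line-level tracing enabled.
    return noop

sys.settrace(noop)
for i in range(1_000_000):
    call()
print('%.2fs' % (time.time() - t,))
```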
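To try the comparison without editing the function by hand, something like the following sketch could be used. The helper names (`make_func`, `time_traced`) are made up for illustration; the number of assignment lines is kept constant so that only the number of distinct local names changes, as in the benchmark above. Timings will vary by machine and Python version, so no expected numbers are given.

```python
import sys
import time


def make_func(n_lines, n_names):
    """Build a function with n_lines assignments spread over n_names locals.

    Both arguments must be >= 1 (hypothetical helper, for illustration only).
    """
    body = "\n".join(f"    v{i % n_names} = 1" for i in range(n_lines))
    namespace = {}
    exec(f"def call():\n{body}\n", namespace)
    return namespace["call"]


def noop(frame, event, arg):
    # Same do-nothing tracer as in the original benchmark.
    return noop


def time_traced(n_lines, n_names, iterations=20_000):
    """Return the seconds spent calling the generated function under tracing."""
    call = make_func(n_lines, n_names)
    previous = sys.gettrace()
    sys.settrace(noop)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            call()
        return time.perf_counter() - start
    finally:
        # Restore whatever tracer was installed before.
        sys.settrace(previous)


for names in (1, 6):
    print("%d name(s): %.3fs" % (names, time_traced(6, names)))
```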
This happens because `PyFrame_FastToLocalsWithError` and
`PyFrame_LocalsToFast` are called inside `call_trampoline`
(https://github.com/python/cpython/blob/master/Python/sysmodule.c#L946)
each time the trace function is invoked.
So, I'd like to simply remove those calls.
Debuggers can call `PyFrame_LocalsToFast` themselves when needed --
otherwise mutating non-current frames doesn't work anyway. As a note,
pydevd already has such a call:
https://github.com/fabioz/PyDev.Debugger/blob/0d4d210f01a1c0a8647178b2e665b53ab113509d/_pydevd_bundle/pydevd_save_locals.py#L57
and PyPy also has a counterpart.
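As a rough sketch of what such an explicit call can look like from Python (this is not the pydevd code linked above, just a minimal CPython-only illustration through `ctypes`; `set_frame_local` and `check` are hypothetical names):

```python
import ctypes
import sys


def set_frame_local(frame, name, value):
    """Hypothetical helper: set local `name` in `frame` and push it back."""
    # Access f_locals only once: on CPython versions where it is a snapshot
    # dict, a second access may re-sync it and discard the change.
    frame.f_locals[name] = value
    try:
        locals_to_fast = ctypes.pythonapi.PyFrame_LocalsToFast
    except AttributeError:
        # Assumption: if the symbol is not exported, f_locals is expected
        # to write through on its own, so there is nothing left to do.
        return
    locals_to_fast.restype = None
    # Second argument is `clear`; 0 keeps locals that are missing from the dict.
    locals_to_fast(ctypes.py_object(frame), ctypes.c_int(0))


def check():
    x = 1
    set_frame_local(sys._getframe(), "x", 2)
    return x


print(check())
```

On recent CPython versions where `f_locals` writes through to the frame directly, the explicit call may be unnecessary; the sketch keeps it because that is the situation discussed here.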
As for `PyFrame_FastToLocalsWithError`, I don't really see any reason to
call it at all.
For example, the code below prints the `a` variable from the `main()`
frame regardless of that call, and I ran all the pydevd tests and nothing
seems to be affected. It seems that accessing `f_locals` already does this:
https://github.com/python/cpython/blob/cb9879b948a19c9434316f8ab6aba9c4601a8173/Objects/frameobject.c#L35
```
import sys

def call():
    frame = sys._getframe()
    # f_back is the caller's frame, i.e. main().
    print(frame.f_back.f_locals)

def main():
    a = 1
    call()

if __name__ == '__main__':
    main()
```
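A small variation on the same idea, as a sketch: reading `f_locals` of the caller twice, with an assignment in between, shows that each access reflects the current value without any tracing involved. `read_caller_local` is a hypothetical helper name.

```python
import sys


def read_caller_local(name):
    # f_back is the caller's frame; reading f_locals picks up the
    # caller's current local values at the time of access.
    return sys._getframe().f_back.f_locals[name]


def main():
    a = 1
    first = read_caller_local("a")
    a = 2
    second = read_caller_local("a")
    return first, second


print(main())  # prints (1, 2)
```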
Does anyone see any issue with this?
If it's uncontroversial, is a PEP needed, or would an issue to track it
be enough to remove those 2 lines?
Thanks,
Fabio