Mailing List Archive

How stackless can Python be?
Hi,

to make the core interpreter stackless is one thing.
Turning functions which call the interpreter
from some deep nesting level into versions,
which return a frame object instead which is
to be called, is possible in many cases.

Internals like apply are rather uncomplicated to convert.
CallObjectWithKeywords is done.

What I have *no* good solution for is map.
Map does an iteration over evaluations and keeps
state while it is running. The same applies to reduce,
but it seems to be not used so much. Map is.

I don't see at the moment if map could be a killer
for Tim's nice mini-thread idea. How must map work,
if, for instance, a map is done with a function
which then begins to switch between threads,
before map is done? Can one imagine a problem?

Maybe it is no issue, but I'd really like to
know wether we need a stateless map.
(without replacing it by a for loop :-)

ciao - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
Re: How stackless can Python be? [ In reply to ]
After a good sleep, I can answer this one by myself.

I wrote:
> to make the core interpreter stackless is one thing.
...
> Internals like apply are rather uncomplicated to convert.
> CallObjectWithKeywords is done.
>
> What I have *no* good solution for is map.
> Map does an iteration over evaluations and keeps
> state while it is running. The same applies to reduce,
> but it seems to be not used so much. Map is.
...

About stackless map,
and this applies to every extension module
which *wants* to be stackless. We don't have to enforce
everybody to be stackless, but there is a couple of
modules which would benefit from it.

The problem with map is, that it needs to keep state,
while repeatedly calling objects which might call
the interpreter. Even if we kept local variables
in the caller's frame, this would still be not
stateless. The info that a map is running is sitting
on the hardware stack, and that's wrong.

Now a solution. In my last post, I argued that I don't
want to replace map by a slower Python function. But
that gave me the key idea to solve this:

C functions which cannot tail-recursively unwound to
return an executable frame object must instead return
themselves as a frame object. That's it! Frames need
again to be a little extended. They have to spell their
interpreter, which normally is the old eval_code loop.

Anatomy of a standard frame invocation:
A new frame is created, parameters are inserted,
the frame is returned to the frame dispatcher,
which runs the inner eval_code loop until it bails out.
On return, special cases of control flow are handled,
as there are exception, returning, and now also calling.
This is an eval_code frame, since eval_code is its
execution handler.

Anatomy of a map frame invocation:
Map has several phases. The first phases to
argument checking and basic setup.
The last phase is iteration over function calls
and building the result. This phase must be split
off as a second function, eval_map.
A new frame is created, with all temporary variables
placed there. eval_map is inserted as the execution
handler.

Now, I think the analogy is obvious.
By building proper frames, it should be possible
to turn any extension function into a stackless function.

The overall protocol is:
A C function which does a simple computation which cannot
cause an interpreter invocation, may simply evaluate
and return a value.
A C function which might cause an interpreter invocation,
should return a freshly created frame as return value.
- This can be done either in a tail-recursive fashion,
if the last action of the C function would basically
be calling the frame.
- If no tail-recursion is possible, the function must
return a new frame for itself, with an executor
for its purpose.

A good stackless candidate is Fredrik's xmlop, which
calls back into the interpreter. If that worked
without the hardware stack, then we could build
ultra-fast XML processors with co-routines!

As a side note:
The frame structure which I sketched
so far is still made for eval_code in the first place,
but it has all necessary flexibilty for pluggable
interpreters. An extension module can now create
its own frame, with its own execution handler, and
throw it back to the frame dispatcher.
In other words: People can create extensions and
test their own VMs if they want.
This was not my primary intent, but comes for free
as a consequence of having a stackless map.

ciao - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home