Mailing List Archive

Generator details
I have a few questions/suggestions about generators.

Tim writes that a suspended generator has exactly one stack frame.
I'm not sure I like that. The Demo/thread/Generator.py version has no
such restriction; anything that has a reference to the generator can
put() the next value. Is the restriction really necessary? I can see
a good use for a recursive generator, e.g. one that generates a tree
traversal:

def inorder(node):
    if node.left: inorder(node.left)
    suspend node
    if node.right: inorder(node.right)

If I understand Tim, this could not work because there's more than one
stack frame involved. On the other hand, he seems to suggest that
something like this *is* allowed when using "modern" coroutines. Am I
missing something? I thought that tree traversal was one of Tim's
first examples of generators; would I really have to use an explicit
stack to create the traversal?

Next, I want more clarity about the initialization and termination
conditions.

The Demo/thread/Generator.py version is very explicit about
initialization: you instantiate the Generator class, passing it a
function that takes a Generator instance as an argument; the function
executes in a new thread. (I guess I would've used a different
interface now -- perhaps inheriting from the Generator class
overriding a run() method.) For termination, the normal way to stop
seems to be for the generator function to return (rather than calling
g.put()), the consumer then gets an EOFError exception the next time
it calls g.get(). There's also a way for either side to call g.kill()
to stop the generator prematurely.
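
As a sketch of that interface (the names get, put, and EOFError are taken
from the description above; kill() is omitted and the real
Demo/thread/Generator.py differs in detail, so treat this as an
illustration, not a reconstruction), here is a minimal thread-backed
generator plus the recursive inorder traversal. Because the producer runs
in its own thread, put() may be called from any frame at any recursion
depth, which is exactly why the threaded version has no one-frame
restriction:

import queue
import threading

class Generator:
    # Sketch only: the producer runs in its own thread and hands
    # values to the consumer one at a time.
    _DONE = object()          # sentinel meaning "the producer returned"

    def __init__(self, func, *args):
        self._channel = queue.Queue(maxsize=1)
        threading.Thread(target=self._run, args=(func,) + args,
                         daemon=True).start()

    def _run(self, func, *args):
        func(self, *args)              # the producer calls self.put(...)
        self._channel.put(self._DONE)

    def put(self, value):              # producer side
        self._channel.put(value)

    def get(self):                     # consumer side
        value = self._channel.get()
        if value is self._DONE:
            raise EOFError             # as described above
        return value

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def inorder(g, node):
    if node.left: inorder(g, node.left)
    g.put(node.value)
    if node.right: inorder(g, node.right)

tree = Node(2, Node(1), Node(3))
g = Generator(inorder, tree)
values = []
while True:
    try:
        values.append(g.get())
    except EOFError:
        break
assert values == [1, 2, 3]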

Let me try to translate that to a threadless implementation. We could
declare a simple generator as follows:

generator reverse(seq):
    i = len(seq)
    while i > 0:
        i = i-1
        suspend seq[i]

This could be translated by the Python translator into the following,
assuming a system class generator which provides the machinery for
generators:

class reverse(generator):
    def run(self, seq):
        i = len(seq)
        while i > 0:
            i = i-1
            self.suspend(seq[i])

(Perhaps the identifiers generator, run and suspend would be spelled
with __...__, but that's just clutter for now.)

Now where Tim was writing examples like this:

for c in reverse("Hello world"):
    print c,
print

I'd like to guess what the underlying machinery would look like. For
argument's sake, let's assume the for loop recognizes that it's using
a generator (or better, it always needs a generator, and when it's not
a generator it silently implies a sequence-iterating generator).
So the translator could generate the following:

g = reverse("Hello world") # instantiate class reverse
while 1:
    try:
        c = g.resume()
    except EOGError: # End Of Generator
        break
    print c,
print

(Where g should really be a unique temporary local variable.)

In this model, the g.resume() and g.suspend() calls have all the magic.
They should not be accessible to the user. They are written in C so
they can play games with frame objects.

I guess that the *first* call to g.resume(), for a particular
generator instance, should start the generator's run() method; run()
is not activated by the instantiation of the generator. Then run()
runs until the first suspend() call, which causes the return from the
resume() call to happen. Subsequent resume() calls know that there
already is a frame (it's stored in the generator instance) and simply
continue its execution where it was. If the run() method returns from
the frame, the resume() call is made to raise EOGError (blah, bogus
name) which signals the end of the loop. (The user may write this
code explicitly if they want to consume the generated elements in a
different way than through a for loop.)
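
To make the protocol concrete, here is a hand-rolled stand-in in plain
Python (a sketch only: the "frame" is reduced to an index stored on the
instance, so it models the resume()/EOGError protocol but none of the
real frame-suspension machinery):

class EOGError(Exception):
    """End Of Generator (blah, bogus name)."""

class reverse:
    def __init__(self, seq):
        self.seq = seq
        self.i = len(seq)

    def resume(self):
        if self.i == 0:
            raise EOGError       # the run() "frame" has returned
        self.i = self.i - 1
        return self.seq[self.i]

# The expanded for loop, written out by hand:
g = reverse("Hello world")
out = []
while 1:
    try:
        c = g.resume()
    except EOGError:             # End Of Generator
        break
    out.append(c)
assert "".join(out) == "dlrow olleH"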

Looking at this machinery, I think the recursive generator that I
wanted could be made to work, by explicitly declaring a generator
subclass (instead of using the generator keyword, which is just
syntactic sugar) and making calls to methods of self, e.g.:

class inorder(generator):
    def run(self, node):
        if node.left: self.run(node.left)
        self.suspend(node)
        if node.right: self.run(node.right)

The generator machinery would (ab)use the fact that Python frames
don't necessarily have to be linked in a strict stack order; the
generator gets a pointer to the frame to resume from resume(), and
there's a "bottom" frame which, when hit, raises the EOGError
exception. All currently active frames belonging to the generator
stay alive while another resume() is possible.

All this is possible by the introduction of an explicit generator
object. I think Tim had an implementation in mind where the standard
return pointer in the frame is the only thing necessary; actually, I
think the return pointer is stored in the calling frame, not in the
called frame (Christian? Is this so in your version?). That
shouldn't make a difference, except that it's not clear to me how to
reference the frame (in the explicitly coded version, which has to
exist at least at the bytecode level).

With classic coroutines, I believe that there's no difference between
the first call and subsequent calls to the coroutine. This works in
the Knuth world where coroutines and recursion don't go together; but
at least for generators I would hope that it's possible for multiple
instances of the same generator to be active simultaneously (e.g. I
could be reversing over a list of files and then reverse each of the
lines in the file; this uses separate instances of the reverse()
generator). So we need a way to reference the generator instance
separately from the generator constructor. The machinery I sketched
above solves this.
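
A sketch of that nesting, spelled with yield standing in for suspend (the
keyword Python later adopted for exactly this); the point is only that
the outer instance sits suspended, holding its own state, while a second,
independent instance runs:

def reverse(seq):
    i = len(seq)
    while i > 0:
        i = i - 1
        yield seq[i]

lines = ["abc", "xyz"]
out = []
for line in reverse(lines):                 # one instance of reverse()
    out.append("".join(reverse(line)))      # a second, independent instance
assert out == ["zyx", "cba"]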

After Tim has refined or rebutted this, I think I'll be able to
suggest what to do for coroutines.

(I'm still baffled by continuations. The question whether the saved
and restored for loop should find itself in the 1st or 5th iteration
surprises me. Doesn't this cleanly map into some Scheme code that
tells us what to do? Or is it unclear because Scheme does all loops
through recursion? I presume that if you save the continuation of the
1st iteration and restore it in the 5th, you'd find yourself back in
the 1st iteration? But this is another thread.)

--Guido van Rossum (home page: http://www.python.org/~guido/)
RE: Generator details
I'm out of time for tonight so will just address the first one:

[Guido van Rossum]
> I have a few questions/suggestions about generators.
>
> Tim writes that a suspended generator has exactly one stack frame.
> I'm not sure I like that. The Demo/thread/Generator.py version has no
> such restriction; anything that has a reference to the generator can
> put() the next value. Is the restriction really necessary?

It can simplify the implementation, and (not coincidentally <wink>) the
user's mental model of how they work.

> I can see a good use for a recursive generator, e.g. one that generates
> a tree traversal:

Definitely; in fact, recursive generators are particularly useful in both
traversals and enumeration of combinatorial objects (permutations, subsets,
and so on).

> def inorder(node):
>     if node.left: inorder(node.left)
>     suspend node
>     if node.right: inorder(node.right)
>
> If I understand Tim, this could not work because there's more than one
> stack frame involved.

That's right. It would be written like this instead:

def inorder(node):
    if node.left:
        suspend inorder(node.left)
    suspend node
    if node.right:
        suspend inorder(node.right)

Now there may be many instances of the "inorder" generator active (as many
as the tree is deep), but each one returns directly to its caller, and all
but the bottom-most one is "the caller" wrt the generator it invokes. This
implies that "suspend expr" treats expr like a generator in much the same
way that "for x in expr" does (or may ...). I realize there's some
muddiness in that.

> On the other hand, he seems to suggest that something like this *is*
> allowed when using "modern" coroutines.

Yes, and then your original version can be made to work, delivering its
results directly to the ultimate consumer instead of (in effect) crawling up
the stack each time there's a result.

> Am I missing something?

Only that I've been pushing generators for almost a decade, and have always
pushed the simplest possible version that's sufficient for my needs.
However, every time I've made a micron's progress in selling this notion,
it's been hijacked by someone else pushing continuations. So I keep pushing
the simplest possible version of generators ("resumable function"), in the
hopes that someday somebody will remember they don't need to turn Python
inside out to get just that much <wink>.

[much worth discussion skipped for now]
> ...
> (I'm still baffled by continuations.

Actually not, I think!

> The question whether the saved and restored for loop should find itself
> in the 1st or 5th iteration surprises me. Doesn't this cleanly map into
> some Scheme code that tells us what to do? Or is it unclear because
> Scheme does all loops through recursion?

Bingo: Scheme has no loops. I can model Python's "for" in Scheme in such a
way that the continuation sees the 1st iteration, or the 5th, but neither
way is obviously right -- or wrong (they both reproduce Python's behavior in
the *absence* of continuations!).

> I presume that if you save the continuation of the 1st iteration and
> restore it in the 5th, you'd find yourself back in the 1st iteration?
> But this is another thread.)

The short course here is just that any way I've tried to model Python's
"for" in *Python* shares the property of the "while 1:" way I posted: the
continuation sees the 5th iteration. And some hours I think it probably
should <wink>, since the bindings of all the locals it sees will be
consistent with the 5th iteration's values but not the 1st's.

could-live-with-it-either-way-but-"correct"-is-debatable-ly y'rs - tim
RE: Generator details
Picking up where we left off, I like Guido's vision of generators fine. The
"one frame" version I've described is in fact what Icon provides, and what
Guido is doing requires using coroutines instead in that language. Guido's
is more flexible, and I'm not opposed to that <wink>.

OTOH, I *have* seen many a person (including me!) confused by the semantics
of coroutines in Icon, so I don't know how much of the additional
flexibility converts into additional confusion. One thing I am sure of:
having debated the fine points of continuations recently, I'm incapable of
judging it harshly today <0.5 wink>.

> ...
> def inorder(node):
>     if node.left: inorder(node.left)
>     suspend node
>     if node.right: inorder(node.right)

The first thing that struck me there is that I'm not sure to whom the
suspend transfers control. In the one-frame flavor of generator, it's
always to the caller of the function that (lexically) contains the
"suspend". Is it possible to keep this all straight if the "suspend" above
is changed to e.g.

pass_it_back(node)

where

def pass_it_back(x):
    suspend x

? I'm vaguely picturing some kind of additional frame state, a pointer to
the topmost frame that's "expecting" to receive a suspend. (I see you
resolve this in a different way later, though.)

> ...
> I thought that tree traversal was one of Tim's first examples of
> generators; would I really have to use an explicit stack to create
> the traversal?

As before, still no <wink>, but the one-frame version does require an
unbroken *implicit* chain back to the intended receiver, with an explicit
"suspend" at every step back to that. Let me rewrite the one-frame version
in a way that assumes less semantics from "suspend", instead building on the
already-assumed new smarts in "for":

def inorder(node):
    if node:
        for child in inorder(node.left):
            suspend child
        suspend node
        for child in inorder(node.right):
            suspend child

I hope this makes it clearer that the one-frame version spawns two *new*
generators for every non-None node, and in purely stack-like fashion (both
"recursing down" and "suspending up").

> Next, I want more clarity about the initialization and termination
> conditions.

Good idea.

> The Demo/thread/Generator.py version is very explicit about
> initialization: you instantiate the Generator class, passing it a
> function that takes a Generator instance as an argument; the function
> executes in a new thread. (I guess I would've used a different
> interface now -- perhaps inheriting from the Generator class
> overriding a run() method.)

I would change my coroutine implementation similarly.

> For termination, the normal way to stop seems to be for the generator
> function to return (rather than calling g.put()), the consumer then gets
> an EOFError exception the next time it calls g.get(). There's also a
> way for either side to call g.kill() to stop the generator prematurely.

A perfectly serviceable interface, but "feels clumsy" in comparison to
normal for loops and e.g. reading lines from a file, where *visible*
exceptions aren't raised at the end. I expect most sequences to terminate
before I do <wink>, so (visible) try/except isn't the best UI here.

> Let me try to translate that to a threadless implementation. We could
> declare a simple generator as follows:
>
> generator reverse(seq):
>     i = len(seq)
>     while i > 0:
>         i = i-1
>         suspend seq[i]
>
> This could be translated by the Python translator into the following,
> assuming a system class generator which provides the machinery for
> generators:
>
> class reverse(generator):
>     def run(self, seq):
>         i = len(seq)
>         while i > 0:
>             i = i-1
>             self.suspend(seq[i])
>
> (Perhaps the identifiers generator, run and suspend would be spelled
> with __...__, but that's just clutter for now.)
>
> Now where Tim was writing examples like this:
>
> for c in reverse("Hello world"):
>     print c,
> print
>
> I'd like to guess what the underlying machinery would look like. For
> argument's sake, let's assume the for loop recognizes that it's using
> a generator (or better, it always needs a generator, and when it's not
> a generator it silently implies a sequence-iterating generator).

In the end I expect these concepts could be unified, e.g. via a new class
__iterate__ method. Then

for i in 42:

could fail simply because ints don't have a value in that slot, while lists
and tuples could inherit from SequenceIterator, pushing the generation of
the index range into the type instead of having it explicitly constructed
by the eval loop.
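
A sketch of that unification (the name __iterate__ is a placeholder; the
slot this sketch uses is spelled __iter__, which is what Python actually
grew, and the Pair class is invented purely for illustration):

class SequenceIterator:
    # Mix-in for anything that supports len() and indexing: the type
    # itself supplies the index-range generation.
    def __iter__(self):
        i = 0
        while i < len(self):
            yield self[i]
            i = i + 1

class Pair(SequenceIterator):
    def __init__(self, a, b):
        self._items = (a, b)
    def __len__(self):
        return len(self._items)
    def __getitem__(self, i):
        return self._items[i]

assert list(Pair("x", "y")) == ["x", "y"]
# "for i in 42" fails simply because ints don't fill that slot:
#     TypeError: 'int' object is not iterable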

> So the translator could generate the following:
>
> g = reverse("Hello world") # instantiate class reverse
> while 1:
>     try:
>         c = g.resume()
>     except EOGError: # End Of Generator
>         break
>     print c,
> print
>
> (Where g should really be a unique temporary local variable.)
>
> In this model, the g.resume() and g.suspend() calls have all the magic.
> They should not be accessible to the user.

This seems at odds with the later:

> (The user may write this code explicitly if they want to consume the
> generated elements in a different way than through a for loop.)

Whether it's at odds or not, I like the latter better. When the machinery
is clean & well-designed, expose it! Else in 2002 we'll be subjected to a
generatorhacks module <wink>.

> They are written in C so they can play games with frame objects.
>
> I guess that the *first* call to g.resume(), for a particular
> generator instance, should start the generator's run() method; run()
> is not activated by the instantiation of the generator.

This can work either way. If it's more convenient to begin run() as part of
instantiation, the code for run() can start with an equivalent of

if self.first_time:
    self.first_time = 0
    return

where self.first_time is set true by the constructor. Then "the frame" will
exist from the start. The first resume() will skip over that block and
launch into the code, while subsequent resume()s will never even see this
block: almost free.

> Then run() runs until the first suspend() call, which causes the return
> from the resume() call to happen. Subsequent resume() calls know that
> there already is a frame (it's stored in the generator instance) and
> simply
> continue its execution where it was. If the run() method returns from
> the frame, the resume() call is made to raise EOGError (blah, bogus
> name) which signals the end of the loop. (The user may write this
> code explicitly if they want to consume the generated elements in a
> different way than through a for loop.)

Yes, that parenthetical comment bears repeating <wink>.

> Looking at this machinery, I think the recursive generator that I
> wanted could be made to work, by explicitly declaring a generator
> subclass (instead of using the generator keyword, which is just
> syntactic sugar) and making calls to methods of self, e.g.:
>
> class inorder(generator):
>     def run(self, node):
>         if node.left: self.run(node.left)
>         self.suspend(node)
>         if node.right: self.run(node.right)

Going way back to the top, this implies the

def pass_it_back(x):
    suspend x

indirection couldn't work -- unless pass_it_back were also a method of
inorder. Not complaining, just trying to understand. Once you generalize,
it's hard to know when to stop.

> The generator machinery would (ab)use the fact that Python frames
> don't necessarily have to be linked in a strict stack order;

If you call *this* abuse, what words remain to vilify what Christian is
doing <wink>?

> the generator gets a pointer to the frame to resume from resume(),

Ah! That addresses my first question. Are you implicitly assuming a
"stackless" eval loop here? Else resuming the receiving frame would appear
to push another C stack frame for each value delivered, ever deeper. The
"one frame" version of generators doesn't have this headache (since a
suspend *returns* to its immediate caller there -- it doesn't *resume* its
caller).

> and there's a "bottom" frame which, when hit, raises the EOGError
> exception.

Although described at the end, this is something set up at the start, right?
To trap a plain return from the topmost invocation of the generator.

> All currently active frames belonging to the generator stay alive
> while another resume() is possible.

And those form a linear chain from the most-recent suspend() back to the
primal resume(). Which appears to address an earlier issue not brought up
in this message: this provides a well-defined & intuitively clear path for
exceptions to follow, yes? I'm not sure about coroutines, but there's
something wrong with a generator implementation if the guy who kicks it off
can't see errors raised by the generator's execution! This doesn't appear
to be a problem here.

> All this is possible by the introduction of an explicit generator
> object. I think Tim had an implementation in mind where the standard
> return pointer in the frame is the only thing necessary; actually, I
> think the return pointer is stored in the calling frame, not in the
> called frame

What I've had in mind is what Majewski implemented 5 years ago, but lost
interest in because it couldn't be extended to those blasted continuations
<wink>. The called frame points back to the calling frame via f->f_back (of
course), and I think that's all the return info the one-frame version needs.
I expect I'm missing your meaning here.

> (Christian? Is this so in your version?). That shouldn't make a
> difference, except that it's not clear to me how to reference the frame
> (in the explicitly coded version, which has to exist at least at the
> bytecode level).

"The" frame being which frame specifically, and refrenced from where?
Regardless, it must be solvable, since if Christian can (& he thinks he can,
& I believe him <wink>) expose a call/cc variant, the generator class could
be coded entirely in Python.

> With classic coroutines, I believe that there's no difference between
> the first call and subsequent calls to the coroutine. This works in
> the Knuth world where coroutines and recursion don't go together;

That's also a world where co-transfers are implemented via funky
self-modifying assembler, custom-crafted for the exact number of coroutines
you expect to be using -- I don't recommend Knuth as a guide to
*implementing* these beasts <0.3 wink>.

That said, yes, provided the coroutines objects all exist, there's nothing
special about the first call. About "provided that": if your coroutine
objects A and B have "run" methods, you dare not invoke A.run() before B has
been constructed (else the first instance of B.transfer() in A chokes --
there's no object to transfer *to*). So, in practice, I think instantiation
is still divorced from initiation. One possibility is to hide all that in a

cobegin(list_of_coroutine_classes_to_instantiate_and_run)

function. But then naming the instances is a puzzle.

> but at least for generators I would hope that it's possible for multiple
> instances of the same generator to be active simultaneously (e.g. I
> could be reversing over a list of files and then reverse each of the
> lines in the file; this uses separate instances of the reverse()
> generator).

Since that's the trick the "one frame" generators *rely* on for recursion,
it's surely not a problem in your stronger version.

Note that my old coroutine implementation did allow for multiple instances
of a coroutine, although the examples posted with it didn't illustrate that.

The weakness of coroutines in practice is (in my experience) the requirement
that you *name* the target of a transfer. This is brittle; e.g., in the
pipeline example I posted, each stage had to know the names of the stages on
either side of it. By adopting a

target.transfer(optional_value)

primitive it's possible to *pass in* the target object as an argument to the
coroutine doing the transfer. Then "the names" are all in the setup, and
don't pollute the bodies of the coroutines (e.g., each coroutine in the
pipeline example could have arguments named "stdin" and "stdout"). I
haven't seen a system that *does* this, but it's so obviously the right
thing to do it's not worth saying any more about <wink>.
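
Just to illustrate the wiring (not true symmetric transfer: plain
generator .send() is caller/callee, standing in here for
target.transfer(optional_value)), a sketch where each stage is handed its
downstream object as an argument, so the bodies never name one another;
all names below are invented for illustration:

def sink(results):
    while True:
        item = yield               # wait for a transfer
        results.append(item)

def doubler(stdout):
    while True:
        item = yield               # receive from upstream
        stdout.send(item * 2)      # transfer downstream

results = []
out = sink(results)
next(out)                          # prime: run to the first yield
stage = doubler(out)               # the wiring happens only in the setup
next(stage)

for x in (1, 2, 3):
    stage.send(x)

assert results == [2, 4, 6]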

> So we need a way to reference the generator instance separately from
> the generator constructor. The machinery I sketched above solves this.
>
> After Tim has refined or rebutted this, I think I'll be able to
> suggest what to do for coroutines.

Please do. Whether or not it's futile, it's fun <wink>.

hmm-haven't-had-enough-of-that-lately!-ly y'rs - tim
Re: Generator details
Guido van Rossum wrote:

[snipped all what's addressed to Tim]

> All this is possible by the introduction of an explicit generator
> object. I think Tim had an implementation in mind where the standard
> return pointer in the frame is the only thing necessary; actually, I
> think the return pointer is stored in the calling frame, not in the
> called frame (Christian? Is this so in your version?). That
> shouldn't make a difference, except that it's not clear to me how to
> reference the frame (in the explicitly coded version, which has to
> exist at least at the bytecode level).

No, it isn't. It is still as it was. I didn't change the frame
machinery at all. The callee finds his caller in its f_back field.

[...]

> (I'm still baffled by continuations. The question whether the saved
> and restored for loop should find itself in the 1st or 5th iteration
> surprises me. Doesn't this cleanly map into some Scheme code that
> tells us what to do? Or is it unclear because Scheme does all loops
> through recursion? I presume that if you save the continuation of the
> 1st iteration and restore it in the 5th, you'd find yourself back in
> the 1st iteration? But this is another thread.)

In Scheme, Python's for-loop would be a tail-recursive expression;
in particular, it would be its own extra lambda. It doesn't fit. Tim is
right when he says that Python isn't Scheme.

Yesterday I built your suggested change to for-loops, and it works
fine. By turning the loop counter into a mutable object, every
reference to it shares the current value, and it behaves like
Tim pointed out it should.

About Tim's reply to this post:

[Gui-do]
> The generator machinery would (ab)use the fact that Python frames
> don't necessarily have to be linked in a strict stack order;

[Tim-bot]
> If you call *this* abuse, what words remain to vilify what Christian is
> doing <wink>?

As a matter of fact, I have been thinking quite long about
this *abuse*. At the moment I do not do this. The frame stack
becomes a frame tree, and you can jump like Tarzan from leaf to
leaf, but I never change the order. Perhaps this can make
sense too, but this is currently where *my* brain explodes.
Right now I'm happy that there is *always* a view of the top
level, and an exception always knows where to wind up.

From that point of view, I'm even more conservative than Guido
(above) and Sam (replacing whole frame chains).
In a sense, since I don't change the frame chain but only
change the current frame, this is like a functional way to
use weak references.

The continuation approach is to build new paths in a tree,
and lose those which are unreachable. Modifying the tree
is not part of my model at the moment. This may be
interesting to study after we know everything about this tree
and we need even more freedom.

ciao - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
Re: Generator details
[Christian]
> The frame stack becomes a frame tree, and you can jump like Tarzan
> from leaf to leaf [...].

Christian, I want to kiss you! (OK, just a hug. We're both
Europeans. :-)

This one remark suddenly made me understand much better what
continuations do -- it was the one missing piece of insight I still
needed after Tim's explanation and skimming the Scheme tutorial a bit.
I'll have to think more about the consequences but this finally made
me understand better how to interpret the mysterious words ``the
continuation represents "the rest of the program"''.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: Generator details
I've been thinking some more about Tim's single-frame generators, and
I think I understand better how to implement them now.

(And yes, it was a mistake of mine to write that the suspend() and
resume() methods shouldn't be accessible to the user! Also thanks for
the clarification of how to write a recursive generator.)

Let's say we have a generator function like this:

generator reverse(l):
    i = len(l)
    while i > 0:
        i = i-1
        suspend l[i]

and a for loop like this:

for i in reverse(range(10)):
    print i

What is the expanded version of the for loop? I think this will work:

__value, __frame = call_generator(reverse, range(10))
while __frame:
    i = __value
    # start of original for loop body
    print i
    # end of original for loop body
    __value, __frame = resume_frame(__frame)

(Note that when the original for loop body contains 'continue', this
should jump to the resume_frame() call. This is just pseudo code.)

Now we must define two new built-in functions: call_generator() and
resume_frame().

- call_generator() is like apply() but it returns a pair (result,
frame) where result is the function result and frame is the frame,
*if* the function returned via suspend. If it returned via return,
call_generator() returns None for the frame.

- resume_frame() does exactly what its name suggests. It has the same
return convention as call_generator().

Note that the for loop throws away the final (non-suspend) return
value of the generator -- this just signals the end of the loop.
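
The two builtins can be emulated today on top of an ordinary generator
object, with the generator-iterator standing in for "the frame" (a sketch
of the calling convention only, not of the proposed C-level machinery):

def call_generator(func, *args):
    frame = func(*args)            # instantiate without running the body
    return resume_frame(frame)     # the first resume starts it

def resume_frame(frame):
    try:
        return next(frame), frame  # suspended: hand back value and frame
    except StopIteration:
        return None, None          # plain return: no frame to resume

def reverse(l):
    i = len(l)
    while i > 0:
        i = i - 1
        yield l[i]                 # plays the part of "suspend l[i]"

out = []
__value, __frame = call_generator(reverse, range(10))
while __frame:
    out.append(__value)            # the original for loop body
    __value, __frame = resume_frame(__frame)
assert out == list(range(9, -1, -1))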

How to translate the generator itself? I've come up with two
versions.

First version: add a new bytecode SUSPEND, which does the same as
RETURN but also marks the frame as resumable. call_generator() then
calls the function using a primitive which allows it to specify the
frame (e.g. a variant of eval_code2 taking a frame argument). When
the call returns, it looks at the resumable bit of the frame to decode
whether to return (value, frame) or (value, None). resume_frame()
simply marks the frame as non-resumable and continues its execution;
upon return it does the same thing as call_generator().

Alternative translation version: introduce a new builtin get_frame()
which returns the current frame. The statement "suspend x" gets
translated to "return x, get_frame()" and the statement "return x"
(including the default "return None" at the end of the function) gets
translated to "return x, None". So our example turns into:

def reverse(l):
    i = len(l)
    while i > 0:
        i = i-1
        return l[i], get_frame()
    return None, None

This of course means that call_generator() can be exactly the same as
apply(), and in fact we better get rid of it, so the for loop
translation becomes:

__value, __frame = reverse(range(10))
while __frame:
    ...same as before...

In a real implementation, get_frame() could be a new bytecode; but it
doesn't have to be (making for easier experimentation).

(get_frame() makes a fine builtin; there's nothing inherently
dangerous about it; in fact people get at it all the time, currently using
horrible hacks!).
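
For the record, the classic "horrible hack" is to raise and catch a
throwaway exception and walk back from its traceback; a sketch (current
CPython also exposes sys._getframe() directly):

import sys

def get_frame():
    try:
        raise Exception
    except Exception:
        # tb_frame is this frame; f_back is the caller's frame.
        return sys.exc_info()[2].tb_frame.f_back

def demo():
    return get_frame().f_code.co_name

assert demo() == "demo"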

I'm not sure which is better; the version without call_generator()
allows you to create your own generator without using the 'generator'
and 'suspend' keywords, calling get_frame() explicitly.

Loose end: what to do when there's a try/finally around a suspend?
E.g.

generator foo(l):
    try:
        for i in l:
            suspend i+1
    finally:
        print "Done"

The second translation variant would cause "Done" to be printed on
each suspend *and* on the final return. This is confusing (and in
fact I think resuming the frame would be a problem since the return
breaks down the try-finally blocks).

So I guess the SUSPEND bytecode is a better implementation -- it can
suspend the frame without going through try-finally clauses.

Then of course we create another loose end: what if the for loop
contains a break? Then the frame will never be resumed and its
finally clause will never be executed! This sounds bad. Perhaps the
destructor of the frame should look at the 'resumable' bit and if set,
resume the frame with a system exception, "Killed", indicating an
abortion? (This is like the kill() call in Generator.py.) We can
increase the likelihood that the frame's destructor is called at the
expected time (right when the for loop terminates), by deleting
__frame at the end of the loop. If the resumed frame raises another
exception, we ignore it. Its return value is ignored. If it suspends
itself again, we resume it with the "Killed" exception again until it
dies (thoughts of the Black Knight come to mind).
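
A sketch of the effect being described, using today's generators: close()
(or the generator's destructor) raises an exception inside the suspended
frame -- spelled GeneratorExit rather than "Killed" -- precisely so that
pending finally clauses get to run:

log = []

def foo(l):
    try:
        for i in l:
            yield i + 1
    finally:
        log.append("Done")

g = foo(range(10))
assert next(g) == 1      # take one value, then abandon the loop ("break")
g.close()                # raise inside the suspended frame
assert log == ["Done"]   # the pending finally clause ran exactly once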

I am beginning to like this idea. (Not that I have time for an
implementation... But it could be done without Christian's patches.)

--Guido van Rossum (home page: http://www.python.org/~guido/)
RE: Generator details
[Christian]
> The frame stack becomes a frame tree, and you can jump like Tarzan
> from leaf to leaf [...].

[Guido]
> Christian, I want to kiss you! (OK, just a hug. We're both
> Europeans. :-)

Not in America, pal -- the only male hugging allowed here is in the two
seconds after your team wins the Superbowl -- and even then only so long as
you haven't yet taken off your helmets.

> This one remark suddenly made me understand much better what
> continuations do -- it was the one missing piece of insight I still
> needed after Tim's explanation and skimming the Scheme tutorial a bit.

It's an insight I was missing too -- continuations are often *invoked* in
general directed-graph fashion, and before Christian said that I hadn't
realized the *implementation* never sees anything worse than a tree.

So next time I see Christian, I'll punch him hard in the stomach, and mumble
"good job" loudly enough so that he hears it, but indistinctly enough so I
can plausibly deny it in case any other guy overhears us. *That's* the
American Way <wink>.

first-it's-hugging-then-it's-song-contests-ly y'rs - tim
RE: Generator details
[Guido, sketches 112 ways to implement one-frame generators today <wink>]

I'm glad you're having fun too! I won't reply in detail here; it's enough
for now to happily agree that adding a one-frame generator isn't much of a
stretch for the current implementation of the PVM.

> Loose end: what to do when there's a try/finally around a suspend?
> E.g.
>
> generator foo(l):
>     try:
>         for i in l:
>             suspend i+1
>     finally:
>         print "Done"
>
> The second translation variant would cause "Done" to be printed on
> each suspend *and* on the final return. This is confusing (and in
> fact I think resuming the frame would be a problem since the return
> breaks down the try-finally blocks).

There are several things to be said about this:

+ A suspend really can't ever go thru today's normal "return" path, because
(among other things) that wipes out the frame's value stack!

while (!EMPTY()) {
    v = POP();
    Py_XDECREF(v);
}

A SUSPEND opcode would let it do what it needs to do without mixing that
into the current return path. So my answer to:

> I'm not sure which is better; the version without call_generator()
> allows you to create your own generator without using the 'generator'
> and 'suspend' keywords, calling get_frame() explicitly.

is "both" <wink>: get_frame() is beautifully clean, but it still needs
something like SUSPEND to keep everything straight. Maybe this just amounts
to setting "why" to a new WHY_SUSPEND and sorting it all out after the eval
loop; OTOH, that code is pretty snaky already.

+ I *expect* the example code to print "Done" len(l)+1 times! The generator
mechanics are the same as the current for/__getitem__ protocol in this
respect: if you have N items to enumerate, the enumeration routine will get
called N+1 times, and that's life. That is, the fact is that the generator
"gets to" execute code N+1 times, and the only reason your original example
seems surprising at first is that it doesn't happen to do anything (except
exit the "try" block) on the last of those times. Change it to

generator foo(l):
    try:
        for i in l:
            suspend i+1
        cleanup() # new line
    finally:
        print "Done"

and then you'd be surprised *not* to see "Done" printed len(l)+1 times. So
I think the easiest thing is also the right thing in this case.

OTOH, the notion that the "finally" clause should get triggered at all the
first len(l) times is debatable. If I picture it as a "resumable function"
then, sure, it should; but if I picture the caller as bouncing control back
& forth with the generator, coroutine style, then suspension is just a
pause in the generator's execution. The latter is probably the more natural
way to picture it, eh?

Which feeds into:

> Then of course we create another loose end: what if the for loop
> contains a break? Then the frame will never be resumed and its
> finally clause will never be executed! This sounds bad. Perhaps the
> destructor of the frame should look at the 'resumable' bit and if set,
> resume the frame with a system exception, "Killed", indicating an
> abortion? (This is like the kill() call in Generator.py.) We can
> increase the likelihood that the frame's destructor is called at the
> expected time (right when the for loop terminates), by deleting
> __frame at the end of the loop. If the resumed frame raises another
> exception, we ignore it. Its return value is ignored. If it suspends
> itself again, we resume it with the "Killed" exception again until it
> dies (thoughts of the Black Knight come to mind).

This may leave another loose end <wink>: what if the for loop doesn't
contain a break, but dies because of an exception in some line unrelated to
the generator? Or someone has used an explicit get_frame() in any case and
that keeps a ref to the frame alive? If the semantic is that the generator
must be shut down no matter what, then the invoker needs code more like

value, frame = generator(args)
try:
    while frame:
        etc
        value, frame = resume_frame(frame)
finally:
    if frame:
        shut_frame_down(frame)

OTOH, the possibility that someone *can* do an explicit get_frame suggests
that "for" shouldn't assume it's the master of the universe <wink>. Perhaps
the user's intent was to generate the first 100 values in a for loop, then
break out, analyze the results, and decide whether to resume it again by
hand (I've done stuff like that ...). So there's also a case to be made for
saying that a "finally" clause wrapping a generator body will only be
executed if the generator body raises an exception or the generator itself
decides it's done; i.e. iff it triggers while the generator is actively
running.

Just complicating things there <wink>. It actually sounds pretty good to
raise a Killed exception in the frame destructor! The destructor has to do
*something* to trigger the code that drains the frame's value stack anyway,
"finally" blocks or not (frame_dealloc doesn't do that now, since there's
currently no way to get out of eval_code2 with a non-empty stack).

> ...
> I am beginning to like this idea. (Not that I have time for an
> implementation... But it could be done without Christian's patches.)

Or with them too <wink>. If stuff is implemented via continuations, the
same concerns about try/finally blocks pop up everywhere a continuation is
invoked: you (probably) leave the current frame, and may or may not ever
come back. So if there's a "finally" clause pending and you don't ever come
back, it's a surprise there too.

So while you thought you were dealing with dirt-simple one-frame generators,
you were *really* thinking about how to make general continuations play nice
<wink>.

solve-one-mystery-and-you-solve-'em-all-ly y'rs - tim
Re: Generator details
[Tim seems to be explaining why len(l)+1 and not len(l) -- but I was
really thinking about len(l)+1 vs. 1.]

> OTOH, the notion that the "finally" clause should get triggered at all the
> first len(l) times is debatable. If I picture it as a "resumable function"
> then, sure, it should; but if I picture the caller as bouncing control back
> & forth with the generator, coroutine style, then suspension is just a
> pause in the generator's execution. The latter is probably the more natural
> way to picture it, eh?

*This* is what I was getting at, and it points in favor of a SUSPEND
opcode since I don't know how to do that in the multiple-return. As
you point out, there can be various things on the various in-frame
stacks (value stack and block stack) that all get discarded by a
return, and that no restart_frame() can restore (unless get_frame()
returns a *copy* of the frame, which seems to be defeating the
purpose).

> OTOH, the possibility that someone *can* do an explicit get_frame suggests
> that "for" shouldn't assume it's the master of the universe <wink>. Perhaps
> the user's intent was to generate the first 100 values in a for loop, then
> break out, analyze the results, and decide whether to resume it again by
> hand (I've done stuff like that ...). So there's also a case to be made for
> saying that a "finally" clause wrapping a generator body will only be
> executed if the generator body raises an exception or the generator itself
> decides it's done; i.e. iff it triggers while the generator is actively
> running.

Hmm... I think that if the generator is started by a for loop, it's
okay for the loop to assume it is the master of the universe -- just
like there's no force in the world (apart from illegal C code :) that
can change the hidden loop counter in a present-day for loop.

--Guido van Rossum (home page: http://www.python.org/~guido/)
RE: Generator details
| value, frame = generator(args)
| try:
|     while frame:
|         etc
|         value, frame = resume_frame(frame)
| finally:
|     if frame:
|         shut_frame_down(frame)

Minor point, but why not make resume() and shutdown() methods on the
frame? Isn't this much cleaner?

value, frame = generator(args)
try:
    while frame:
        etc
        value, frame = frame.resume()
finally:
    if frame:
        frame.shutdown()

-Barry
RE: Generator details
[Guido]
> ...
> Hmm... I think that if the generator is started by a for loop, it's
> okay for the loop to assume it is the master of the universe -- just
> like there's no force in the world (apart from illegal C code :) that
> can change the hidden loop counter in a present-day for loop.

If it comes to a crunch, me too. I think your idea of forcing an exception
in the frame's destructor (to get the stacks cleaned up, and any suspended
"finally" blocks executed) renders this a non-issue, though (it will "just
work", and if people resort to illegal C code, it will *still* work <wink>).

hadn't-noticed-you-can't-spell-"illegal-code"-without-"c"-ly y'rs - tim
RE: Generator details
[Barry]
> Minor point, but why not make resume() and shutdown() methods on the
> frame? Isn't this much cleaner?
>
> value, frame = generator(args)
> try:
>     while frame:
>         etc
>         value, frame = frame.resume()
> finally:
>     if frame:
>         frame.shutdown()

Yes -- and at least it's better than arguing over what to name them <wink>.

btw-tabs-in-email-don't-look-the-way-you-expect-them-to-ly y'rs - tim