
ANN: Stackless Python 0.2
ANNOUNCING:

Stackless Python 0.2
A Python Implementation Which
Does Not Use The C Stack

What is it?
A plug-in replacement for core Python.
It should run any program which runs under Python 1.5.2.
But it does not need space on the C stack.

Why did I write it?
Stackless Python was never written before (afaik), since it
was said to be impossible without major rewrites of core Python.
I am proving the converse:
It is easy to write, just hard to think.

Who needs it?
At the moment, this is only useful for C programmers
who want to try certain new ideas. Hardcore stuff.
It allows you to modify the current execution state by
changing the frame stack chain without restrictions,
and it allows for pluggable interpreters on a per-frame basis.

The possibilities are for instance:

Coroutines, Continuations, Generators

Restartable exceptions and Persistent execution state
might be possible.

Stackless extension modules can be built. The new builtin
stackless "map" function is a small example for this.

Coroutines will be able to run at the speed of
a single C function call, which makes them a considerable
alternative in certain algorithms.

Status of the project:
Stackless-ness has been implemented and tested with pystone.
pystone works correctly and is about 4-5% slower than
with standard Python.

What I need at the moment is
- time to build a sample coroutine extension
- your input, your testing, criticism and hints.

Some still rough documentation is available at
http://www.pns.cc/stackless/stackless.htm

Source code and a VC++6.0 build for Windows can be found
from the document or directly from
ftp://ftp.pns.cc/pub/stackless_990611.zip

cheers - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
ANN: Stackless Python 0.2 [ In reply to ]
In article <37628EAA.C682F16C@appliedbiometrics.com>,
Christian Tismer <tismer@appliedbiometrics.com> wrote:
>ANNOUNCING:
>
> Stackless Python 0.2
>
>The possibilities are for instance:
>
>Coroutines, Continuations, Generators
>
>Coroutines will be able to run at the speed of
>a single C function call, which makes them a considerable
>alternative in certain algorithms.

This is very neat, and you are completely deranged. I know just
enough to know that I should cheer you on, mind, but I'll try to
cheer loudly. Feel free to make an announce when you add coroutine
support. Please? :)


Neel
ANN: Stackless Python 0.2 [ In reply to ]
Neel Krishnaswami wrote:

[Stackless Python 0.2]

> This is very neat, and you are completely deranged. I know just
> enough to know that I should cheer you on, mind, but I'll try to
> cheer loudly. Feel free to make an announce when you add coroutine
> support. Please? :)

Thanks :-)

Sure I will. I'd like to know whether my binary works ok for you.
You might also try a recursive function, it should raise an
exception after 29999 recursions (just to have *a* limit).

I'm just undecided on the design. There are a couple of
different coroutine interfaces. You find them for instance in
Scheme, in Modula-2, some Oberons, Icon, and in Tim Peters'
example which I added to my stackless archive.
Do you have a proposal?

For Stackless 0.3 I'm planning some more non-recursive
builtins, and also proper "invalid opcode" handling,
which makes it possible to extend the standard interpreter
without having to rebuild Python or rewrite eval_code completely.
This will give a new playing ground for Michael Hudson.

Then the Stackless implementation will only undergo changes
if something needs internal support. Everything else will
go into extension modules which can be loaded on demand.

One extension I'm planning is an API engine, which knows all
C API calls and all internal structures. Together with a
small fast engine, this will make it possible to build new fast
functions without a compiler. Maybe this is not for this year.

But first I have to get out of this asylum :-?

lunaticly y'rs - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
ANN: Stackless Python 0.2 [ In reply to ]
In article <376636EA.84240129@appliedbiometrics.com>,
Christian Tismer <tismer@appliedbiometrics.com> wrote:
>
>
>Neel Krishnaswami wrote:
>
>[Stackless Python 0.2]
>
>> This is very neat, and you are completely deranged. I know just
>> enough to know that I should cheer you on, mind, but I'll try to
>> cheer loudly. Feel free to make an announce when you add coroutine
>> support. Please? :)
>
>Thanks :-)
>
>Sure I will. I'd like to know whether my binary works ok for you.
>You might also try a recursive function, it should raise an
>exception after 29999 recursions (just to have *a* limit).

Unfortunately, I couldn't get it to build (I'm on a Linux machine).
I tried copying the files from the stackless distribution into the
appropriate places in the Python source distribution, but I get
the following error:

$ gcc -g -O2 -I./../Include -I.. -DHAVE_CONFIG_H -c myreadline.c -o myreadline.o
In file included from ../Include/Python.h:38,
from myreadline.c:42:
../Include/patchlevel.h:69: parse error before `5'
../Include/patchlevel.h:69: stray '\' in program
../Include/patchlevel.h:70: stray '\' in program
../Include/patchlevel.h:71: stray '\' in program

[plus more cascading errors as a result of this]

Apparently gcc is not liking this:

/* Version as a single 4-byte hex number, e.g. 0x010502B2 == 1.5.2b2.
Use this for numeric comparisons, e.g. #if PY_VERSION_HEX >= ... */
#define PY_VERSION_HEX ((PY_MAJOR_VERSION << 24) | \
(PY_MINOR_VERSION << 16) | \
(PY_MICRO_VERSION << 8) | \
(PY_RELEASE_LEVEL << 4) | \
(PY_RELEASE_SERIAL << 0))

The '5' in the "parse error before '5'" is the PY_MICRO_VERSION; the
macro looks ok to me, but I haven't used C to any great extent in
literally years. Am I making some very naive mistake?

>I'm just undecided on the design. There are a couple of
>different coroutine interfaces. You find them for instance in
>Scheme, in Modula-2, some Oberons, Icon, and in Tim Peters'
>example which I added to my stackless archive.
>Do you have a proposal?

Not really; all I know is a little Scheme, and that part of my brain
is telling me that "coroutines are just closures over continuations."
Fortunately, I'm still able to realize that that's probably the
wrong interface. :)


Neel
ANN: Stackless Python 0.2 [ In reply to ]
neelk@brick.cswv.com (Neel Krishnaswami) writes:

> In article <376636EA.84240129@appliedbiometrics.com>,
> Christian Tismer <tismer@appliedbiometrics.com> wrote:
> >
> >
> >Neel Krishnaswami wrote:
> >
> >[Stackless Python 0.2]
> >
> >> This is very neat, and you are completely deranged. I know just
> >> enough to know that I should cheer you on, mind, but I'll try to
> >> cheer loudly. Feel free to make an announce when you add coroutine
> >> support. Please? :)
> >
> >Thanks :-)
> >
> >Sure I will. I'd like to know whether my binary works ok for you.
> >You might also try a recursive function, it should raise an
> >exception after 29999 recursions (just to have *a* limit).
>
> Unfortunately, I couldn't get it to build (I'm on a Linux machine).
> I tried copying the files from the stackless distribution into the
> appropriate places in the Python source distribution, but I get
> the following error:
>
> $ gcc -g -O2 -I./../Include -I.. -DHAVE_CONFIG_H -c myreadline.c -o myreadline.o
> In file included from ../Include/Python.h:38,
> from myreadline.c:42:
> ../Include/patchlevel.h:69: parse error before `5'
> ../Include/patchlevel.h:69: stray '\' in program
> ../Include/patchlevel.h:70: stray '\' in program
> ../Include/patchlevel.h:71: stray '\' in program
>
> [plus more cascading errors as a result of this]
>
> Apparently gcc is not liking this:
>
> /* Version as a single 4-byte hex number, e.g. 0x010502B2 == 1.5.2b2.
> Use this for numeric comparisons, e.g. #if PY_VERSION_HEX >= ... */
> #define PY_VERSION_HEX ((PY_MAJOR_VERSION << 24) | \
> (PY_MINOR_VERSION << 16) | \
> (PY_MICRO_VERSION << 8) | \
> (PY_RELEASE_LEVEL << 4) | \
> (PY_RELEASE_SERIAL << 0))
>
> The '5' in the "parse error before '5'" is the PY_MICRO_VERSION; the
> macro looks ok to me, but I haven't used C to any great extent in
> literally years. Am I making some very naive mistake?

Ah, got this one; Naughty Chris's archive contains files with CRLF
line endings. This doesn't bother gcc in most places, except where a
backslash is used to continue a line, when it gets the screaming
heebie-jeebies. Eliminate the nasty ^M's and it all gets better. Also
diff stops reporting every single line in the file as changed,
which was confusing me.

It doesn't seem to link yet. Oh well, it's late and I'm going to bed.

I can send you a diff against cvs's latest if that would help.

Yours,
Michael

> >I'm just undecided on the design. There are a couple of
> >different coroutine interfaces. You find them for instance in
> >Scheme, in Modula-2, some Oberons, Icon, and in Tim Peters'
> >example which I added to my stackless archive.
> >Do you have a proposal?
>
> Not really; all I know is a little Scheme, and that part of my brain
> is telling me that "coroutines are just closures over continuations."
> Fortunately, I'm still able to realize that that's probably the
> wrong interface. :)
>
>
> Neel
ANN: Stackless Python 0.2 [ In reply to ]
Sure sounds like some stray whitespace got between the
backslash and the newline it was intended to protect. Possibly
due to different conventions about carriage returns and linefeeds
between Unix and other bogus operating systems?

I've always hated the Backslash-Newline convention! It's one thing to
have *leading* whitespace be significant (a la Python) but when code
doesn't work because of *trailing* whitespace (invisible!) it makes me
mad!
ANN: Stackless Python 0.2 [ In reply to ]
Michael Hudson <mwh21@cam.ac.uk> writes:

> neelk@brick.cswv.com (Neel Krishnaswami) writes:
>
> > In article <376636EA.84240129@appliedbiometrics.com>,
> > Christian Tismer <tismer@appliedbiometrics.com> wrote:
> > >
> > >
> > >Neel Krishnaswami wrote:
> > >
> > >[Stackless Python 0.2]
> > >

[stuff]

> It doesn't seem to link yet. Oh well, it's late and I'm going to bed.

Got this one; PyEval_Frame_Dispatch is defined static in
Python/ceval.c. Take the keyword static out, and it builds and runs
fine (in my limited testing so far).

Michael

> I can send you a diff against cvs's latest if that would help.
>
> Yours,
> Michael
ANN: Stackless Python 0.2 [ In reply to ]
Michael Hudson wrote:

> Ah, got this one; Naughty Chris's archive contains files with CRLF
> line endings. This doesn't bother gcc in most places, except where a
> backslash is used to continue a line, when it gets the screaming
> heebie-jeebies. Eliminate the nasty ^M's and it all gets better. Also
> diff stops reporting that every single line in the file as changed,
> which was confusing me.

Sorry about that. I developed under Windows, which silently
inserts the CR/LFs for (against) me.

The next release (coming soon) will run through a filter
before I post it.
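
(Meanwhile, if you want to fix your copy by hand: any dos2unix-style tool
does it, or a tiny strip-CR filter along the lines of this sketch. The
file name is made up; error handling is omitted.)

/* strip_cr.c -- copy stdin to stdout, dropping carriage returns.
   Only a sketch of the kind of filter meant above. */
#include <stdio.h>

int main(void)
{
    int c;
    while ((c = getchar()) != EOF) {
        if (c != '\r')          /* drop the CR of each CR/LF pair */
            putchar(c);
    }
    return 0;
}

Run it over each source file, e.g. "strip_cr < frameobject.c > frameobject.fixed.c".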

ciao - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
ANN: Stackless Python 0.2 [ In reply to ]
Michael Hudson wrote:

> Got this one; PyEval_Frame_Dispatch is defined static in
> Python/ceval.c. Take the keyword static out, and it builds and runs
> fine (in my limited testing so far).

Ah, thanks. Yes, this function was a local eval_frame_dispatch,
before I realized that it needs to be exposed as an API
function. Ahem, forgot to change the decl.

Someone else told me that in frameobject.h, my forward
type declaration gives problems with some compilers.
What is the standard way to do this?

thanks - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
ANN: Stackless Python 0.2 [ In reply to ]
Christian Tismer <tismer@appliedbiometrics.com> writes:

> Michael Hudson wrote:
>
> > Got this one; PyEval_Frame_Dispatch is defined static in
> > Python/ceval.c. Take the keyword static out, and it builds and runs
> > fine (in my limited testing so far).
>
> Ah, thanks. Yes, this function was a local eval_frame_dispatch,
> before I realized that it needs to be exposed as an API
> function. Ahem, forgot to change the decl.
>
> Someone else told be that in frameobject.h, my forward
> type declaration gives problems with some compilers.
> What is the standard way to do this?

The problem I had with this was that you had code like

typedef struct _frame PyFrameObject;

... stuff ...

typedef struct _frame {
... stuff ...
} PyFrameObject;

which egcs didn't like (the typedef name ends up defined twice); I changed it to

typedef struct _frame PyFrameObject;

... stuff ...

struct _frame {
... stuff ...
};

and everything went smoothly.

> thanks - chris

HTH
Michael
ANN: Stackless Python 0.2 [ In reply to ]
In article <37628EAA.C682F16C@appliedbiometrics.com>,
Christian Tismer <tismer@appliedbiometrics.com> wrote:
>ANNOUNCING:
>
> Stackless Python 0.2
> A Python Implementation Which
> Does Not Use The C Stack
>
>What is it?
>A plug-in replacement for core Python.
>It should run any program which runs under Python 1.5.2.
>But it does not need space on the C stack.
[snip]

I managed to get this to build eventually (thanks to the hints from Michael
Hudson). Looking at it, I wonder whether there's the potential here for
more than coroutines and to write an implementation of threading within
Python. Each thread would presumably need its own Python stack, and a
queueing and locking system would need to be added somehow, but because
the Python and C stacks are no longer intertangled, switching between
threads should be easy (as opposed to impossible <wink>).

This wouldn't be quite as flexible as the current implementations of
threads at the C level - C extensions would be called in a single block
with no way to swap threads until they return. On the other hand, this
would allow every platform to have some sort of threading available,
which would be a boon.

Unfortunately I'm not familiar enough with threads to go out there and
implement it right away, but I thought I'd at least raise it as a
possibility and see what people think and what the pros and cons are.

Corran
ANN: Stackless Python 0.2 [ In reply to ]
Corran Webster wrote:
> In article <37628EAA.C682F16C@appliedbiometrics.com>,
> Christian Tismer <tismer@appliedbiometrics.com> wrote:
> >ANNOUNCING:

> > Stackless Python 0.2
> > A Python Implementation Which
> > Does Not Use The C Stack

This sounds interesting, but how do you deal with the
issue of C extensions which call back to Python code?
Or is this simply disallowed in Stackless Python?

Greg
ANN: Stackless Python 0.2 [ In reply to ]
[Corran Webster, commenting on Christian Tismer's "stackless Python"]
> ...
> I wonder whether there's the potential here for more than coroutines
> and to write an implementation of threading within Python.

It's been mentioned a few times. So far Guido isn't keen on any of this,
but he's been known to buckle after a few short years of incessant whining
<wink>. BTW, a coroutine pretty much *is* a thread in a time-sliced world,
just lacking a "transfer" function that invokes implicitly whenever it
bloody well feels like it <wink>.

> Each thread would presumably need its own Python stack,

Nope! It's a beauty of the implementation already that each code object
knows exactly how much "Python stack space" it needs, and (just) that much
is allocated directly into the code object's runtime frame object. IOW,
there isn't "a Python stack" as such, so there's nothing to change here --
the stack is implicit in the way frames link up to each other.
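
If you want to see that chain from C, you just follow the f_back links.
A sketch (count_depth is made up; f_back and PyThreadState_Get are the
real 1.5.2 names, if I'm reading my headers right):

/* Walk the implicit "Python stack": it's nothing but the f_back chain. */
#include "Python.h"
#include "frameobject.h"

static int
count_depth(PyFrameObject *f)   /* start e.g. from PyThreadState_Get()->frame */
{
    int depth = 0;
    while (f != NULL) {
        depth++;
        f = f->f_back;          /* each frame points at its caller's frame */
    }
    return depth;
}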

> and a queueing and locking system would need to be added somehow,

Yes, and that would require some changes to the core. Sounds doable,
anyway.

> but because the Python and C stacks are no longer intertangled, switching
> between threads should be easy (as opposed to impossible <wink>).

I think Christian's approach moves lots of crazy ideas from impossible to
plausible.

> This wouldn't be quite as flexible as the current implementations of
> threads at the C level - C extensions would be called in a single block
> with no way to swap threads until they return.

That's mostly true today too: Python threads run serially now, one at a
time, and if a thread calling out to C doesn't release the global lock no
other thread will run until it returns.

> On the other hand, this would allow every platform to have some sort
> of threading available, which would be a boon.

It's even quite possible that "fake threads" would-- for programs that
aren't doing true multiprocessing --run significantly more efficiently than
today's scheme of creating OS-level threads and then choking them into
taking strict turns; e.g., because "the Python stack" grows only to the
exact size it needs, there's almost certainly much less memory overhead that
way than by letting the OS allocate a mostly unused Mb (whatever) to each
real thread's stack. This opens the possibility to create thousands &
thousands of fake threads.

> Unfortunately I'm not familiar enough with threads to go out there and
> implement it right away, but I thought I'd at least raise it as a
> possibility and see what people think and what the pros and cons are.

It's sure worth pondering!

write-some-code-anyway-&-get-famous-one-way-or-another<wink>-ly y'rs - tim
ANN: Stackless Python 0.2 [ In reply to ]
Corran Webster wrote:
>
> In article <37628EAA.C682F16C@appliedbiometrics.com>,
> Christian Tismer <tismer@appliedbiometrics.com> wrote:
> >ANNOUNCING:
> >
> > Stackless Python 0.2
> > A Python Implementation Which
> > Does Not Use The C Stack
> >
> >What is it?
> >A plug-in replacement for core Python.
> >It should run any program which runs under Python 1.5.2.
> >But it does not need space on the C stack.
> [snip]
>
> I managed to get this to build eventually (thanks to the hints from Michael
> Hudson). Looking at it, I wonder whether there's the potential here for
> more than coroutines and to write an implementation of threading within
> Python. Each thread would presumably need its own Python stack, and a
> queueing and locking system would need to be added somehow, but because
> the Python and C stacks are no longer intertangled, switching between
> threads should be easy (as opposed to impossible <wink>).

Please be a little patient. :-)

Yesterday I finished an alpha version
of first-class continuations in Python. This allows you to do
everything, and your threads have been on my todo-list from the
start.

> This wouldn't be quite as flexible as the current implementations of
> threads at the C level - C extensions would be called in a single block
> with no way to swap threads until they return. On the other hand, this
> would allow every platform to have some sort of threading available,
> which would be a boon.

Not really. C extensions can be written in a stackless
manner. Whenever they need to call into the interpreter,
they can do so in a conformant way, simply by
using their own frame with their own executor.
You need to change your thinking here: there is no return
in that sense any longer. We just have frames in some chains,
and all local state is kept in frames.
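
In caricature it looks about like this (names made up, not the real
source, just the shape of the idea):

/* Caricature of a frame chain with per-frame executors.
   Not the real Stackless API; just the concept. */
typedef struct frame frame;
struct frame {
    frame *back;                    /* the chain that replaces the C stack */
    frame *(*execute)(frame *);     /* per-frame "pluggable interpreter" */
    /* ... locals, value stack, instruction pointer ... */
};

static void
dispatch(frame *f)
{
    while (f != NULL)
        f = f->execute(f);          /* run one step; the frame says what runs next */
}

A stackless extension just provides its own execute function and its own
frame, instead of calling the interpreter recursively.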

> Unfortunately I'm not familiar enough with threads to go out there and
> implement it right away, but I thought I'd at least raise it as a
> possibility and see what people think and what the pros and cons are.

It will be available soon, and it will be nothing more than
a Python module which tames continuations to behave as threads.

ciao - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
ANN: Stackless Python 0.2 [ In reply to ]
Greg Ewing wrote:
>
> Corran Webster wrote:
> > In article <37628EAA.C682F16C@appliedbiometrics.com>,
> > Christian Tismer <tismer@appliedbiometrics.com> wrote:
> > >ANNOUNCING:
>
> > > Stackless Python 0.2
> > > A Python Implementation Which
> > > Does Not Use The C Stack
>
> This sounds interesting, but how do you deal with the
> issue of C extensions which call back to Python code?
> Or is this simply disallowed in Stackless Python?

No, everything is allowed in Stackless Python.
Just grab the source and look at the implementation
of stackless map. That is exactly what a conformant
extension needs to do.
Extensions which don't are not much of a problem;
they just create incompatible frames which will not
be allowed as jump targets.

ciao - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
ANN: Stackless Python 0.2 [ In reply to ]
Tim Peters (tim_one@email.msn.com) wrote:

>
> That's mostly true today too: Python threads run serially now, one at a
> time, and if a thread calling out to C doesn't release the global lock no
> other thread will run until it returns.
>

I'll bite. How *does* a thread that calls out to C release the global
lock?

Would it have to do this thingy that is at the beginning of the
interpreter loop in python/ceval.c?

if (PyThreadState_Swap(NULL) != tstate)
Py_FatalError("ceval: tstate mix-up");
PyThread_release_lock(interpreter_lock);

Would this work? or would it screw up the interpreter?

Is there a way to do this in a Python script *before* it calls out
to C?

Is there an approved solution to this problem?


- (mystified) Dave
ANN: Stackless Python 0.2 [ In reply to ]
In article <377757D5.25A2AFF6@appliedbiometrics.com>,
Christian Tismer <tismer@appliedbiometrics.com> wrote:
>
>Corran Webster wrote:
>>
>> I managed to get this to build eventually (thanks to the hints from Michael
>> Hudson). Looking at it, I wonder whether there's the potential here for
>> more than coroutines and to write an implementation of threading within
>> Python. Each thread would presumably need its own Python stack, and a
>> queueing and locking system would need to be added somehow, but because
>> the Python and C stacks are no longer intertangled, switching between
>> threads should be easy (as opposed to impossible <wink>).
>
>Please be a little patient. :-)

Apologies if this read as a request for someone else, such as yourself,
to implement this. This was at the idle speculation stage for me, and
I hadn't seen any previous discussion of it, so I thought it worth asking
about. Apparently it has been talked about, which is cool.

>> This wouldn't be quite as flexible as the current implementations of
>> threads at the C level - C extensions would be called in a single block
>> with no way to swap threads until they return. On the other hand, this
>> would allow every platform to have some sort of threading available,
>> which would be a boon.
>
>Not really. C extensions can be written in a stackless
>manner. Whenever they need to call into the interpreter,
>they can do it in a conformant manner. They do so by just
>using their own frame with their own executor.
>You need to change your thinking here: There is no return
>in that sense any longer. We just have frames in some chains,
>and all local state is kept in frames.

Okay - from your reply and Tim Peters', it seems as though my understanding
of the way the Python stack works was somewhat wrong.

>> Unfortunately I'm not familiar enough with threads to go out there and
>> implement it right away, but I thought I'd at least raise it as a
>> possibility and see what people think and what the pros and cons are.
>
>It will be available soon, and it will be nothing more than
>a Python module which tames continuations to behave as threads.

Looking forward to see what you come up with.

Corran
ANN: Stackless Python 0.2 [ In reply to ]
In article <000c01bec11f$8dfbf800$e19e2299@tim>,
Tim Peters <tim_one@email.msn.com> wrote:
>[Corran Webster, commenting on Christian Tismer's "stackless Python"]
>> ...
>> I wonder whether there's the potential here for more than coroutines
>> and to write an implementation of threading within Python.
>
>It's been mentioned a few times. So far Guido isn't keen on any of this,
>but he's been known to buckle after a few short years of incessant whining
><wink>. BTW, a coroutine pretty much *is* a thread in a time-sliced world,
>just lacking a "transfer" function that invokes implicitly whenever it
>bloody well feels like it <wink>.

Indeed - and it was all this talk of coroutines that made me wonder
why not go that extra bit.

>> Each thread would presumably need its own Python stack,
>
>Nope! It's a beauty of the implementation already that each code object
>knows exactly how much "Python stack space" it needs, and (just) that much
>is allocated directly into the code object's runtime frame object. IOW,
>there isn't "a Python stack" as such, so there's nothing to change here --
>the stack is implicit in the way frames link up to each other.

Ah - my impression of the way that the stack works was wrong (hadn't
gotten that far in the reading of the source). So each code object
has a little block of stack space set aside (which is precalculated
to be enough), together with a pointer to its own top of the stack
inside that block?

>> and a queueing and locking system would need to be added somehow,
>
>Yes, and that would require some changes to the core. Sounds doable,
>anyway.

More work than coroutines, I suspect, and I don't think it can
easily be done using the same framework as the current C threads.

>> but because the Python and C stacks are no longer intertangled, switching
>> between threads should be easy (as opposed to impossible <wink>).
>
>I think Christian's approach moves lots of crazy ideas from impossible to
>plausible.

Indeed. A very nice piece of code.

>> This wouldn't be quite as flexible as the current implementations of
>> threads at the C level - C extensions would be called in a single block
>> with no way to swap threads until they return.
>
>That's mostly true today too: Python threads run serially now, one at a
>time, and if a thread calling out to C doesn't release the global lock no
>other thread will run until it returns.

OK. And presumably most extensions don't bother to release the lock
unless they specifically want to take advantage of threading somehow.

>> On the other hand, this would allow every platform to have some sort
>> of threading available, which would be a boon.
>
>It's even quite possible that "fake threads" would-- for programs that
>aren't doing true multiprocessing --run significantly more efficiently than
>today's scheme of creating OS-level threads and then choking them into
>taking strict turns; e.g., because "the Python stack" grows only to the
>exact size it needs, there's almost certainly much less memory overhead that
>way than by letting the OS allocate a mostly unused Mb (whatever) to each
>real thread's stack. This opens the possibility to create thousands &
>thousands of fake threads.

Although any machine which is being used that intensively probably already
has some sort of C level threading available.

>> Unfortunately I'm not familiar enough with threads to go out there and
>> implement it right away, but I thought I'd at least raise it as a
>> possibility and see what people think and what the pros and cons are.
>
>It's sure worth pondering!
>
>write-some-code-anyway-&-get-famous-one-way-or-another<wink>-ly y'rs - tim

Well, I won't be writing it this week<wink> but I have a better idea of the
ins and outs now.

Thanks.

Corran
ANN: Stackless Python 0.2 [ In reply to ]
On Mon, Jun 28, 1999 at 12:34:48AM -0400, Tim Peters wrote:
> Nope! It's a beauty of the implementation already that each code object
> knows exactly how much "Python stack space" it needs, and (just) that much
> is allocated directly into the code object's runtime frame object. IOW,
> there isn't "a Python stack" as such, so there's nothing to change here --
> the stack is implicit in the way frames link up to each other.

This might be completely irrelevant, but during the course of my master's, I
considered doing this kind of thing to Java in order to allow asynchronous
threads to share stack frames (don't ask...). My supervisor complained bitterly
on the grounds that function invocations were orders of magnitude more
common than object creation, and hence associating memory
allocation/deallocation with every call was considered horrendously
inefficient.

It seems that this should affect Stackless Python equally as much. Does anyone
have anything to add on the subject? I would imagine that frames could be
allocated and managed in chunks to alleviate a lot of the memory management
load...

Toby.
ANN: Stackless Python 0.2 [ In reply to ]
[Tim sez Python stores space for each code object's "Python stack" in a
runtime frame object]

[Toby J Sargeant]
> This might be completely irrelevant, but during the course of my
> masters, I considered doing this kind of thing to java in order to
> allow asynchronous threads to share stack frames (don't ask...).

Don't tell <wink>.

> My supervisor complained bitterly on the grounds that function
> invocations were orders of magnitude more common than object creation,
> and hence associating memory allocation/deallocation with every call was
> considered horrendously inefficient.
>
> It seems that this should affect Stackless Python equally as
> much.

No more so than Stackful Python: Python has always worked this way;
Stackless Python doesn't change this aspect.

> Does anyone have anything to add on the subject? I would imagine that
> frames could be allocated and managed in chunks to alleviate a lot of
> the memory management load...

The code is in Objects/frameobject.c, and easy to follow. frameobjects are
recycled via their own free list. Typically the total number of
frame-associated mallocs is proportional to the maximum depth of the call
stack, not to the number of calls made.
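
The idiom, boiled down to a sketch (names made up; the real code is
fancier, e.g. frames vary in size):

/* The free-list idiom, in caricature. */
#include <stdlib.h>

typedef struct fakeframe {
    struct fakeframe *next;     /* doubles as the free-list link when dead */
    /* ... the frame's real fields would live here ... */
} fakeframe;

static fakeframe *free_list = NULL;

static fakeframe *
frame_new(void)
{
    fakeframe *f;
    if (free_list != NULL) {    /* pop a recycled frame: no malloc */
        f = free_list;
        free_list = f->next;
    }
    else                        /* malloc only when the stack grows deeper than before */
        f = (fakeframe *)malloc(sizeof(fakeframe));
    return f;                   /* NULL check left out of the sketch */
}

static void
frame_dealloc(fakeframe *f)
{
    f->next = free_list;        /* push it back instead of free()ing it */
    free_list = f;
}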

Setting up a Python frame remains expensive, but for other reasons.

115-lines-of-code-doesn't-run-in-one-cycle-malloc-or-not<wink>-ly y'rs -
tim
ANN: Stackless Python 0.2 [ In reply to ]
From: "Tim Peters" <tim_one@email.msn.com>

> ...
> Py_BEGIN_ALLOW_THREADS
> ThreadedSpam();
> Py_END_ALLOW_THREADS
>
> it's-only-confusing-if-you-think-about-it-too-much<wink>-ly y'rs - tim

[Robin Becker]
> so do I then have to poll ThreadedSpam myself to see if it's finished or
> is there a python api version of mutexes/locks etc.

You're in C now -- you do anything you need to do, depending on the
specifics of ThreadedSpam (which was presumed to be a pre-existing
thread-safe C routine). The snippet above clearly assumes that ThreadedSpam
"is finished" when it returns from the call to it. If your flavor of
ThreadedSpam doesn't enjoy this property, that's fine too, but then you do
whatever *it* requires you to do. This is like asking me whether the right
answer is bigger than 5, or less than 4: how the heck should I know <wink>?

You can certainly use Python's lock abstraction in your own C code, but it's
unlikely a pre-existing C function is using that too.

python-doesn't-restrict-what-c-code-can-do-except-to-insist-that-it-
acquire-the-lock-before-returning-to-python-ly y'rs - tim
ANN: Stackless Python 0.2 [ In reply to ]
[G. David Kuhlman]
> I'll bite. How *does* a thread that calls out to C release the global
> lock?

This is covered in detail in the Python/C API manual (look in your Python
doc distribution), chapter 8. You typically just use a pair of
Python-supplied bracket macros:

... here you own the global lock

Py_BEGIN_ALLOW_THREADS

... now you do not: other Python threads can run, and you can do
... C stuff here as long as you like, in parallel with them

Py_END_ALLOW_THREADS

... now you own the global lock again, and can return to Python

Many examples of these macros can be found in the distribution's C source
code too.

> Would it have to do this thingy that is at the beginning of the
> interpreter loop in python/ceval.c?
>
> if (PyThreadState_Swap(NULL) != tstate)
> Py_FatalError("ceval: tstate mix-up");
> PyThread_release_lock(interpreter_lock);
>
> Would this work? or would it screw up the interpreter?

It won't even compile. interpreter_lock is not an extern symbol: you can't
get at it directly. That's a feature, of course, to discourage chuckleheads
from trying to subvert the published API <wink>.

> Is there a way to do this in a Python script *before* it calls out
> to C?

No: Python-level code must never run unless the thread running it holds the
global lock. Therefore no Python-level code can ever release the global
lock (the instant it did so, it would be illegal code -- so no possibility
for screwing up this way is provided).

This is no real restriction, though. To run external C code at all you have
to put it in a module known to Python (so that Python can resolve the name
*from* Python), and in the typical case where you're calling some
pre-existing thread-safe C function ThreadedSpam, you're not going to stick
ThreadedSpam in a Python module anyway. Instead you'd write your Python C
module with a trivial wrapper function and tell Python about *that*; its
guts will consist of

Py_BEGIN_ALLOW_THREADS
ThreadedSpam();
Py_END_ALLOW_THREADS
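
Spelled out as a whole module it's just this (a sketch; "spam" and
ThreadedSpam are made up, the rest is stock 1.5.2 extension boilerplate):

/* spammodule.c -- trivial wrapper that releases the global interpreter
   lock around a long-running, thread-safe C routine. */
#include "Python.h"

extern void ThreadedSpam(void);     /* your pre-existing thread-safe C function */

static PyObject *
spam_threadedspam(PyObject *self, PyObject *args)
{
    if (!PyArg_ParseTuple(args, ""))    /* takes no arguments */
        return NULL;
    Py_BEGIN_ALLOW_THREADS              /* give up the lock: other Python threads may run */
    ThreadedSpam();
    Py_END_ALLOW_THREADS                /* take the lock back before touching Python objects */
    Py_INCREF(Py_None);
    return Py_None;
}

static PyMethodDef spam_methods[] = {
    {"threadedspam", spam_threadedspam, METH_VARARGS},
    {NULL, NULL}
};

void
initspam(void)
{
    Py_InitModule("spam", spam_methods);
}

From Python it's then just: import spam; spam.threadedspam()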

it's-only-confusing-if-you-think-about-it-too-much<wink>-ly y'rs - tim
ANN: Stackless Python 0.2 [ In reply to ]
Tim Peters wrote:
>
> [Tim sez Python stores space for each code object's "Python stack" in a
> runtime frame object]
>
> [Toby J Sargeant wondering about frame allocs]
...
> > It seems that this should affect Stackless Python equally as
> > much.
>
> No more so than Stackful Python: Python has always worked this way;
> Stackless Python doesn't change this aspect.

And this was one of the temptations to start this at all.
We already pay some amount of runtime for the frame machinery,
so we should make full use of it. Not doing so would
be a waste.

>
> > Does anyone have anything to add on the subject? I would imagine that
> > frames could be allocated and managed in chunks to alleviate a lot of
> > the memory management load...
>
> The code is in Objects/frameobject.c, and easy to follow. frameobjects are
> recycled via their own free list. Typically the total number of
> frame-associated mallocs is proportional to the maximum depth of the call
> stack, not to the number of calls made.
>
> Setting up a Python frame remains expensive, but for other reasons.

The last word on this isn't spoken yet, since I didn't try
to optimize frames. But I will see what to do (later).

There are two things which at the moment are still too
intermingled. I will try to break them apart more in the
next Stackless version. Currently, a frame is constructed
from a code object by examining the code object's properties.
This is done in frameobject.c. The engine then modifies
the local registers in the frame object, according to
parameters of the given function call.

This is far from optimal. If a function is called repeatedly
in the same context, the frame could stay initialized and be
reused. I believe there are very many contexts where just
one or two arguments of a function call change, and the
frame would only need to update these two fast locals!

Maybe a little caching, by code object and caller
location, would pay off. Will think more about it.

tim-this-is-continuation-prediction - ly y'rs - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
ANN: Stackless Python 0.2 [ In reply to ]
From: Gary Duzan <gduzan@gte.com>

Toby J Sargeant wrote:
>
> On Mon, Jun 28, 1999 at 12:34:48AM -0400, Tim Peters wrote:
> > Nope! It's a beauty of the implementation already that each code object
> > knows exactly how much "Python stack space" it needs, and (just) that much
> > is allocated directly into the code object's runtime frame object. IOW,
> > there isn't "a Python stack" as such, so there's nothing to change here --
> > the stack is implicit in the way frames link up to each other.
>
> This might be completely irrelevant, but during the course of my master's, I
> considered doing this kind of thing to Java in order to allow asynchronous
> threads to share stack frames (don't ask...). My supervisor complained bitterly
> on the grounds that function invocations were orders of magnitude more
> common than object creation, and hence associating memory
> allocation/deallocation with every call was considered horrendously
> inefficient.

This appears to be the conventional wisdom. The optimization class I
took used an ML-like language, and the first thing we did was to move
heap frames, which was the default, to the stack (i.e. closure
optimization). It might be less of an issue in an interpreted language
if it already has heap allocation (or other sorts of) overhead to deal
with when making calls. There may be other allocation tricks which can
be used, as well.

> It seems that this should affect Stackless Python equally as much. Does anyone
> have anything to add on the subject? I would imagine that frames could be
> allocated and managed in chunks to alleviate a lot of the memory management
> load...

I would think that as well (at least at first). Maybe pools of different
frame sizes, to limit the space overhead.

> Toby.

Gary Duzan
GTE Laboratories
ANN: Stackless Python 0.2 [ In reply to ]
In article <000801bec207$142b6360$719e2299@tim>, Tim Peters
<tim_one@email.msn.com> writes
>[G. David Kuhlman]
...
>Py_BEGIN_ALLOW_THREADS
>ThreadedSpam();
>Py_END_ALLOW_THREADS
>
>it's-only-confusing-if-you-think-about-it-too-much<wink>-ly y'rs - tim
so do I then have to poll ThreadedSpam myself to see if it's finished or
is there a python api version of mutexes/locks etc.
--
Robin Becker
