Mailing List Archive

RE: marking shared-ness
On Wed, 19 Apr 2000, Salz, Rich wrote:
> >Consider the case where a programmer forgets to note the sharedness. He
> >passes the object to another thread. At certain points: BAM! The
> >interpreter dumps core.
>
> No. Using the "owning thread" idea prevents coredumps and allows the
> interpreter to throw an exception. Perhaps my note wasn't clear
> enough?

INCREF and DECREF cannot throw exceptions.

Are there other points where you could safely detect erroneous sharing of
objects? (in a guaranteed fashion)

For example: what are all the ways that objects can be transported between
threads? Can you erect tests at each of those points? I believe "no", since
there are too many ways (as a function argument, or as an item in a shared object).

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
Re: marking shared-ness
Greg Stein wrote:
>
> On Wed, 19 Apr 2000, Christian Tismer wrote:
> >...
> > Too bad that we don't have incref/decref as methods.
>
> This would probably impose more overhead than some of the atomic inc/dec
> mechanisms.
>
> > The possible mutables which have to be protected could
>
> Non-mutable objects must be protected, too. An integer can be shared just
> as easily as a list.

Uhh, right. Everything is mutable, since we mutate the refcount :-(

...
> Ah. Neat. "Automatic marking of shared-ness"
>
> Could work. That initial test for the thread id could be expensive,
> though. What is the overhead of getting the current thread id?

Zero if we cache it in the thread state.

> [ ... thinking about the code ... ]
>
> Nope. Won't work at all.

@#$%§!!-| yes-you-are-right - gnnn!

> There is a race condition when an object "becomes shared".
>
> DECREF:
> if ( object is not shared )
> /* whoops! it just became shared! */
> --(op)->ob_refcnt;
> else
> atomic_decrement(op)
>
> To prevent the race, you'd need an interlock which is more expensive than
> an atomic decrement.

Really, sad but true.

Are atomic decrements really so cheap, meaning "are they mapped
to the atomic dec opcode"?
Then this is all ok IMHO.

ciao - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com
Re: marking shared-ness
On Wed, 19 Apr 2000, Christian Tismer wrote:
> Greg Stein wrote:
>...
> > Ah. Neat. "Automatic marking of shared-ness"
> >
> > Could work. That initial test for the thread id could be expensive,
> > though. What is the overhead of getting the current thread id?
>
> Zero if we cache it in the thread state.

You don't have the thread state at incref/decref time.

And don't say "_PyThreadState_Current" or I'll fly to Germany and
personally kick your ass :-)

>...
> > There is a race condition when an object "becomes shared".
> >
> > DECREF:
> > if ( object is not shared )
> > /* whoops! it just became shared! */
> > --(op)->ob_refcnt;
> > else
> > atomic_decrement(op)
> >
> > To prevent the race, you'd need an interlock which is more expensive than
> > an atomic decrement.
>
> Really, sad but true.
>
> Are atomic decrements really so cheap, meaning "are they mapped
> to the atomic dec opcode"?

On some platforms and architectures, they *might* be.

On Win32, we call InterlockedIncrement(). No idea what that does, but I
don't think that it is a macro or compiler-detected thingy to insert
opcodes. I believe there is a function call involved.

pthreads do not define atomic inc/dec, so we must use a critical section +
normal inc/dec operators.

Linux has a kernel macro for atomic inc/dec, but it is only valid if
__SMP__ is defined in your compilation context.

etc.

On platforms that do have an API (as Donn stated: BeOS has one; Win32 has
one), it will be cheaper than an interlock. Therefore, we want to take
advantage of an "atomic inc/dec" semantic when possible (and fall back to
slower stuff when not).

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
Re: Re: marking shared-ness
Greg Stein wrote:
>
> On Wed, 19 Apr 2000, Christian Tismer wrote:
> > Greg Stein wrote:
> >...
> > > Ah. Neat. "Automatic marking of shared-ness"
> > >
> > > Could work. That initial test for the thread id could be expensive,
> > > though. What is the overhead of getting the current thread id?
> >
> > Zero if we cache it in the thread state.
>
> You don't have the thread state at incref/decref time.
>
> And don't say "_PyThreadState_Current" or I'll fly to Germany and
> personally kick your ass :-)

A real temptation to see whether I can really get you to Germany :-))

...

Thanks for all the info.

> Linux has a kernel macro for atomic inc/dec, but it is only valid if
> __SMP__ is defined in your compilation context.

Well, and while it looks cheap, it is for sure expensive
since several caches are flushed, and the system is stalled
until the modified value is written back into the memory bank.

Could it be that we might want to use another thread design
at all? I'm thinking of running different interpreters in
the same process space, but with all objects really disjoint,
invisible between the interpreters. This would perhaps need
some internal changes, in order to make all the builtin
free-lists disjoint as well.
Now each such interpreter would be running in its own thread
without any race condition at all so far.
To make this into threading and not just a flavor of multitasking,
we now of course need shared objects, but only those objects
which we really want to share. This could reduce the cost for
free threading to nearly zero, except for the (hopefully) few
shared objects.
I think, instead of shared globals, it would make more sense
to have some explicit shared resource pool, which controls
every access via mutexes/semas/whateverweneed. Maybe
we would even prefer to copy objects into it over sharing them, in order
to minimize collisions. I hope the need for true sharing
can be minimized to a few variables. Well, I hope.
"freethreads" could even coexist with the current locking threads,
we would not even need a special build for them, but to rethink
threading.
Like "the more free threading is, the more disjoint threads are".

are-you-now-convinced-to-come-and-kick-my-ass-ly y'rs - chris :-)

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com
Re: Re: marking shared-ness
Chris> I think, instead of shared globals, it would make more sense to
Chris> have some explicit shared resource pool, which controls every
Chris> access via mutexes/semas/whateverweneed.

Tuple space, anyone? Check out

http://www.snurgle.org/~pybrenda/

It's a Linda implementation for Python. Linda was developed at Yale by
David Gelernter. Unfortunately, he's better known to the general public as
being one of the Unabomber's targets. You can find out more about Linda at

http://www.cs.yale.edu/Linda/linda.html

Skip
Re: Re: marking shared-ness
Skip Montanaro wrote:
>
> Chris> I think, instead of shared globals, it would make more sense to
> Chris> have some explicit shared resource pool, which controls every
> Chris> access via mutexes/semas/whateverweneed.
>
> Tuple space, anyone? Check out
>
> http://www.snurgle.org/~pybrenda/

Very interesting, indeed.

> It's a Linda implementation for Python. Linda was developed at Yale by
> David Gelernter. Unfortunately, he's better known to the general public as
> being one of the Unabomber's targets. You can find out more about Linda at
>
> http://www.cs.yale.edu/Linda/linda.html

Many broken links. Most activity appears to have stopped
around 94/95; the project looks kinda dead. But this doesn't
mean that we cannot learn from them.

Will think more when the starship problem is over...

ciao - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com
Re: Re: marking shared-ness
>> http://www.cs.yale.edu/Linda/linda.html

Chris> Many broken links. The most activity appears to have stopped
Chris> around 94/95, the project looks kinda dead. But this doesn't mean
Chris> that we cannot learn from them.

Yes, I think Linda mostly lurks under the covers these days. Their Piranha
project, which aims to soak up spare CPU cycles to do parallel computing,
uses Linda. I suspect Linda is probably hidden somewhere inside Lifestreams
as well.

As a correction to my original note, Nicholas Carriero was the other primary
lead on Linda. I no longer recall the details, but he may have been one of
Gelernter's grad students in the late 80's.

Skip
Re: Re: marking shared-ness
> Chris> I think, instead of shared globals, it would make more sense to
> Chris> have some explicit shared resource pool, which controls every
> Chris> access via mutexes/semas/whateverweneed.

> Skip wrote:
> Tuple space, anyone? Check out
> http://www.snurgle.org/~pybrenda/
> It's a Linda implementation for Python. You can find out more about
> Linda at
> http://www.cs.yale.edu/Linda/linda.html

Linda is also the inspiration for Sun's JavaSpaces, an easier-to-use layer
on top of Jini:

http://java.sun.com/products/javaspaces/
http://cseng.aw.com/bookpage.taf?ISBN=0-201-30955-6

On the plus side:

1. It's much (much) easier to use than mutex, semaphore, or monitor
models: students in my parallel programming course could start writing
C-Linda programs after (literally) five minutes of instruction.

2. If you're willing/able to do global analysis of access patterns, its
simplicity doesn't have a significant performance penalty.

3. (Bonus points) It integrates very well with persistence schemes.

On the minus side:

1. Some things that "ought" to be simple (e.g. barrier synchronization)
are surprisingly difficult to get right, efficiently, in vanilla
Linda-like systems. Some VHLL derivatives (based on SETL and Lisp
dialects) solved this in interesting ways.

2. It's different enough from hardware-inspired shared-memory + mutex
models to inspire the same "Huh, that looks weird" reaction as
Scheme's parentheses, or Python's indentation. On the other hand,
Bill Joy and company are now backing it...

Personal opinion: I've felt for 15 years that something like Linda could
be to threads and mutexes what structured loops and conditionals are to
the "goto" statement. Were it not for the "Huh" effect, I'd recommend
hanging "Danger!" signs over threads and mutexes, and making tuple spaces
the "standard" concurrency mechanism in Python.

I'd also recommend calling the system "Carol", after Monty Python regular
Carol Cleveland. The story is that Linda itself was named after the 70s
porn star Linda Lovelace, in response to the DoD naming its language "Ada"
after the other Lovelace...

Greg

p.s. I talk a bit about Linda, and the limitations of the vanilla
approach, in http://mitpress.mit.edu/book-home.tcl?isbn=0262231867.
Re: [Thread-SIG] Re: Re: marking shared-ness
On Thu, 20 Apr 2000, Christian Tismer wrote:

> Skip Montanaro wrote:
> >
> > Tuple space, anyone? Check out
> >
> > http://www.snurgle.org/~pybrenda/
>
> Very interesting, indeed.

*Steps out of the woodwork and bows*

PyBrenda doesn't have a thread implementation, but it could be adapted to
provide one. It might be prudent to eliminate the use of TCP/IP in that case as
well.

In case anyone is interested, I just created a mailing list for PyBrenda
at egroups: http://www.egroups.com/group/pybrenda-users

--
Milton L. Hankins \\ ><> Ephesians 5:2 ><>
http://www.snurgle.org/~mhankins // <mlh@swl.msd.ray.com>
These are my opinions, not Raytheon's. \\ W. W. J. D. ?
Re: Re: marking shared-ness
On Thu, 20 Apr 2000, Christian Tismer wrote:
>...
> > Linux has a kernel macro for atomic inc/dec, but it is only valid if
> > __SMP__ is defined in your compilation context.
>
> Well, and while it looks cheap, it is for sure expensive
> since several caches are flushed, and the system is stalled
> until the modified value is written back into the memory bank.

Yes, Bill mentioned that yesterday. Important fact, but there isn't much
you can do -- they must be atomic.

> Could it be that we might want to use another thread design
> at all? I'm thinking of running different interpreters in
> the same process space, but with all objects really disjoint,
> invisible between the interpreters. This would perhaps need
> some internal changes, in order to make all the builtin
> free-lists disjoint as well.
> Now each such interpreter would be running in its own thread
> without any racing condition at all so far.
> To make this into threading and not just a flavor of multitasking,
> we now need of course shared objects, but only those objects
> which we really want to share. This could reduce the cost for
> free threading to nearly zero, except for the (hopefully) few
> shared objects.
> I think, instead of shared globals, it would make more sense
> to have some explicit shared resource pool, which controls
> every access via mutexes/semas/whateverweneed. Maybe also that
> we would prefer to copy objects into it over sharing, in order
> to minimize collisions. I hope the need for true sharing
> can be minimized to a few variables. Well, I hope.
> "freethreads" could even coexist with the current locking threads,
> we would not even need a special build for them, but to rethink
> threading.
> Like "the more free threading is, the more disjoint threads are".

No. Now you're just talking processes with IPC. Yes, they happen to run in
threads, but you get none of the advantages of a threaded application.

Threading is about sharing an address space.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
Re: Re: marking shared-ness
Greg Stein wrote:
>
> On Thu, 20 Apr 2000, Christian Tismer wrote:
[me, about free threading with less sharing]

> No. Now you're just talking processes with IPC. Yes, they happen to run in
> threads, but you got none of the advantages of a threaded application.

Are you sure that every thread user shares your opinion?
I see many people using threads just in order to have
multiple tasks in parallel, with no or only a few shared
variables.

> Threading is about sharing an address space.

This is part of the truth. There are a number of other
reasons to use threads, too.
Since Python has nothing really private, free threading
in fact implies protecting every single object,
although nobody wants this to happen in the first place.

Other languages have far fewer problems here (I mean
C, C++, Delphi...): they are able to do the right thing
in the right place.
Python is not designed for that. Why do you want to enforce
the impossible, letting every object pay a high penalty
to become completely thread-safe?

Sharing an address space should not mean sharing
everything, but sharing something. If Python does not support
this, we should think of a redesign of its threading
model, instead of losing so much efficiency.
You end up in a situation where all your C extensions
can run free-threaded at high speed, while Python alone is
busy all the time fighting the threading.
That is not Python.

You know that I like to optimize things. For me, an optimization
must give an overall gain, not just in one area where others
get worse. If free threading cannot be optimized in
a way that gives better overall performance, then
it is a wrong optimization to me.

Well, this is all speculative until we take some measurements.
Maybe I'm just complaining about 1-2 percent of performance
loss; then I'd agree to move my complaining into /dev/null :-)

ciao - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com
Re: [Thread-SIG] Re: Re: marking shared-ness
On Fri, 21 Apr 2000, Christian Tismer wrote:

> Are you sure that every thread user shares your opinion?
> I see many people using threads just in order to have
> multiple tasks in parallel, with no or only a few shared
> variables.

About the only time I use threads is when
1) I'm doing something asynchronous in an event loop-driven paradigm
(such as Tkinter) or
2) I'm trying to emulate fork() under win32

> Since Python has nothing really private, free threading
> in fact implies protecting every single object,
> although nobody wants this to happen in the first place.

How does Java solve this problem? (Is this analogous to native vs. green
threads?)

> Python is not designed for that. Why do you want to enforce
> the impossible, letting every object pay a high penalty
> to become completely thread-safe?

Hmm, how about declaring only certain builtins as free-thread safe? Or is
"the impossible" necessary because of the nature of incref/decref?

--
Milton L. Hankins :: ><> Ephesians 5:2 ><>
Software Engineer, Raytheon Systems Company :: <mlh@swl.msd.ray.com>
http://amasts.msd.ray.com/~mlh :: RayComNet 7-225-4728
RE: [Thread-SIG] Re: Re: marking shared-ness
> From: Milton L. Hankins [mailto:mlh@swl.msd.ray.com]
>
> On Fri, 21 Apr 2000, Christian Tismer wrote:
>
> > Are you sure that every thread user shares your opinion?
> > I see many people using threads just in order to have
> > multiple tasks in parallel, with no or only a few shared
> > variables.
>
> About the only time I use threads is when
> 1) I'm doing something asynchronous in an event loop-driven
> paradigm
> (such as Tkinter) or
> 2) I'm trying to emulate fork() under win32
>
3) I'm doing something that would block in an asynchronous FSM. (e.g.
Medusa, or an NT I/O completion port driven system)

> > Since Python has nothing really private, free threading
> > in fact implies protecting every single object,
> > although nobody wants this to happen in the first place.
>
> How does Java solve this problem? (Is this analogous to
> native vs. green
> threads?)
>

Java allows you to specify whether something should be
serialized or not. And no, this doesn't have anything to do with native vs.
green threads.

> > Python is not designed for that. Why do you want to enforce
> > the impossible, letting every object pay a high penalty
> > to become completely thread-safe?
>
> Hmm, how about declaring only certain builtins as free-thread
> safe?

incref/decref are not type-object specific; they're global macros.
Making them methods on the type object would be the sensible thing to do,
but would definitely be non-backward-compatible.

Bill
Re: [Thread-SIG] Re: Re: marking shared-ness
> > Since Python has nothing really private, free threading
> > in fact implies protecting every single object,
> > although nobody wants this to happen in the first place.
>
> How does Java solve this problem? (Is this analogous to native vs. green
> threads?)
>
> > Python is not designed for that. Why do you want to enforce
> > the impossible, letting every object pay a high penalty
> > to become completely thread-safe?
>
> Hmm, how about declaring only certain builtins as free-thread safe? Or is
> "the impossible" necessary because of the nature of incref/decref?

http://www.javacats.com/US/articles/MultiThreading.html

I would like

sync foo:
    block of code here

maybe we could merge in some Occam while we're at it. B^)


sync would be a most excellent operator in python.
Re: [Thread-SIG] Re: Re: marking shared-ness
http://www.cs.bris.ac.uk/~alan/javapp.html

Take a look at the above link. It merges the Occam model with Java and uses
'channel based' interfaces (not sure exactly what this is).

But they seem pretty excited.

I vote for using InterlockedInc/Dec, as it is available as an assembly
instruction on almost every platform. Could we then derive all other locking
semantics from this?

And our portability problem is solved if it comes in the box with gcc.

On Fri, 21 Apr 2000, Sean Jensen_Grey wrote:

> > > Since Python has nothing really private, free threading
> > > in fact implies protecting every single object,
> > > although nobody wants this to happen in the first place.
> >
> > How does Java solve this problem? (Is this analogous to native vs. green
> > threads?)
> >
> > > Python is not designed for that. Why do you want to enforce
> > > the impossible, letting every object pay a high penalty
> > > to become completely thread-safe?
> >
> > Hmm, how about declaring only certain builtins as free-thread safe? Or is
> > "the impossible" necessary because of the nature of incref/decref?
>
> http://www.javacats.com/US/articles/MultiThreading.html
>
> I would like
>
> sync foo:
> block of code here
>
> maybe we could merge in some Occam while we're at it. B^)
>
>
> sync would be a most excellent operator in python.
Re: [Thread-SIG] Re: Re: marking shared-ness
> On Fri, 21 Apr 2000, Sean Jensen_Grey wrote:
> http://www.cs.bris.ac.uk/~alan/javapp.html
> Take a look at the above link. It merges the Occam model with Java and uses
> 'channel based' interfaces (not sure exactly what this is).

Channel-based programming has been called "the revenge of the goto", as
in, "Where the hell does this channel go to?" Programmers must manage
conversational continuity manually (i.e. keep track of the origins of
messages, so that they can be replied to). It also doesn't really help
with the sharing problem that started this thread: if you want a shared
integer, you have to write a little server thread that knows how to act
like a semaphore, and then it read/write requests that are exactly
equivalent to P and V operations (and subject to all the same abuses). Oh,
and did I mention the joys of trying to draw a semi-accurate diagram of
the plumbing in your program after three months of upgrade work?

*shudder*

Greg
Re: Re: marking shared-ness
> It is more than this. In my last shot at this, pystone ran about half as
> fast. There are a few things that will be different this time around, but
> it certainly won't be in the "few percent" range.

Interesting thought: according to patches recently posted to
patches@python.org (but not yet vetted), "turning on" threads on Win32
in regular Python also slows down Pystone considerably. Maybe it's
not so bad? Maybe those patches contain a hint of what we could do?

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: marking shared-ness
On Fri, 21 Apr 2000, Christian Tismer wrote:
>...
> > No. Now you're just talking processes with IPC. Yes, they happen to run in
> > threads, but you got none of the advantages of a threaded application.
>
> Are you sure that every thread user shares your opinion?

Now you're just being argumentative. I won't respond to this.

>...
> Other languages have much fewer problems here (I mean
> C, C++, Delphi...), they are able to do the right thing
> in the right place.
> Python is not designed for that. Why do you want to enforce
> the impossible, letting every object pay a high penalty
> to become completely thread-safe?

Existing Python semantics plus free-threading place us in this scenario.
Many people have asked for free-threading, and the number of inquiries
that I receive has grown over time. (Nobody asked in 1996 when I first
published my patches; I get a query every couple of months now.)

>...
> You know that I like to optimize things. For me, optimization
> mut give an overall gain, not just in one area, where others
> get worse. If free threading cannot be optimized in
> a way that gives better overall performance, then
> it is a wrong optimization to me.
>
> Well, this is all speculative until we did some measures.
> Maybe I'm just complaining about 1-2 percent of performance
> loss, then I'd agree to move my complaining into /dev/null :-)

It is more than this. In my last shot at this, pystone ran about half as
fast. There are a few things that will be different this time around, but
it certainly won't be in the "few percent" range.

Presuming you can keep your lock contention low, your overall
performance *goes up* once you have a multiprocessor machine. Sure, each
processor runs Python (say) 10% slower, but you have *two* of them going.
That is 180% compared to a central-lock Python on an MP machine.

Lock contention: my last patches had really high contention. It didn't
scale well across processors. This round will have more fine-grained locks
than the previous version. But it will be interesting to measure the
contention.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
Re: Re: marking shared-ness
On Fri, 21 Apr 2000, Guido van Rossum wrote:
> > It is more than this. In my last shot at this, pystone ran about half as
> > fast. There are a few things that will be different this time around, but
> > it certainly won't be in the "few percent" range.
>
> Interesting thought: according to patches recently posted to
> patches@python.org (but not yet vetted), "turning on" threads on Win32
> in regular Python also slows down Pystone considerably. Maybe it's
> not so bad? Maybe those patches contain a hint of what we could do?

I think that my tests were threaded vs. free-threaded. It has been so long
ago, though... :-)

Yes, we'll get those patches reviewed and installed. That will at least
help the standard threading case. With more discrete locks (e.g. one per
object or one per code section), then we will reduce lock contention.
Working on improving the lock mechanism itself and the INCREF/DECREF
system will help, too.

But this initial thread was to seek people to assist with some coding to
get stuff into 1.6. The heavy lifting will certainly be after 1.6, but we
can get some good stuff in *today*. We'll examine performance later on,
then start improving it.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
Re: Re: marking shared-ness
Guido van Rossum wrote:
>
> > It is more than this. In my last shot at this, pystone ran about half as
> > fast. There are a few things that will be different this time around, but
> > it certainly won't be in the "few percent" range.
>
> Interesting thought: according to patches recently posted to
> patches@python.org (but not yet vetted), "turning on" threads on Win32
> in regular Python also slows down Pystone considerably. Maybe it's
> not so bad? Maybe those patches contain a hint of what we could do?

I had a rough look at the patches but didn't understand
enough yet. But I tried the sample scriptlet on python 1.5.2
and Stackless Python - see here:

D:\python>python -c "import test.pystone;test.pystone.main()"
Pystone(1.1) time for 10000 passes = 1.96765
This machine benchmarks at 5082.2 pystones/second

D:\python>python spc/threadstone.py
Pystone(1.1) time for 10000 passes = 5.57609
This machine benchmarks at 1793.37 pystones/second

This is even worse than Markovitch's observation.

Now, let's try with Stackless Python:

D:\python>cd spc

D:\python\spc>python -c "import test.pystone;test.pystone.main()"
Pystone(1.1) time for 10000 passes = 1.843
This machine benchmarks at 5425.94 pystones/second

D:\python\spc>python threadstone.py
Pystone(1.1) time for 10000 passes = 3.27625
This machine benchmarks at 3052.27 pystones/second

Isn't that remarkable? Stackless performs nearly 1.8 times
as fast under threads.

Why?
I've optimized the ticker code away for all those
"fast" opcodes which can never cause another interpreter
incarnation. Standard Python does a bit too much here,
dealing the same way with extremely fast opcodes like POP_TOP
as with a function call. Responsiveness is still very good.

Markovitch's example also tells us this story:
Even with his patches, the threading stuff still costs
10 percent. This is the lock that we touch every ten
opcodes. In other words: touching a lock costs about
as much as an opcode costs on average.

ciao - chris

threadstone.py:
import thread
# Start an empty thread to initialise the thread mechanics (and the global lock!)
# This thread finishes immediately, so it won't affect the test results by
# itself; it matters only because it initialises the global lock.
thread.start_new_thread(lambda : 1, ())

import test.pystone
test.pystone.main()

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com
Re: Re: marking shared-ness
Greg,

Greg Stein wrote:
<snip/
> Presuming you can keep your lock contention low, then your overall
> performance *goes up* once you have a multiprocessor machine. Sure, each
> processor runs Python (say) 10% slower, but you have *two* of them going.
> That is 180% compared to a central-lock Python on an MP machine.

Why didn't I think of this?
MP is a very, very good point.
It all makes much more sense to me now.

sorry for being dumb - happy easter - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com
RE: Re: marking shared-ness
[Greg Stein]
> ...
> Many people have asked for free-threading, and the number of inquiries
> that I receive have grown over time. (nobody asked in 1996 when I first
> published my patches; I get a query every couple months now)

Huh! That means people ask me about it more often than they ask you <wink>.

I'll add, though, that you have to dig into the inquiry: almost everyone
who asks me is running on a uniprocessor machine, and is really after one
of two other things:

1. They expect threaded stuff to run faster if free-threaded. "Why?" is
a question I can't answer <0.5 wink>.

2. Dealing with the global lock drives them insane, especially when trying
to call back into Python from a "foreign" C thread.

#2 may be fixable via less radical means (like a streamlined procedure
enabled by some relatively minor core interpreter changes, and clearer
docs).

I'm still a fan of free-threading! It's just one of those things that may
yield a "well, ya, that's what I asked for, but turns out it's not what I
*wanted*" outcome as often as not.

enthusiastically y'rs - tim
RE: Re: marking shared-ness
[Greg Wilson, on Linda and JavaSpaces]
> ...
> Personal opinion: I've felt for 15 years that something like Linda could
> be to threads and mutexes what structured loops and conditionals are to
> the "goto" statement. Were it not for the "Huh" effect, I'd recommend
> hanging "Danger!" signs over threads and mutexes, and making tuple spaces
> the "standard" concurrency mechanism in Python.

There's no question about tuple spaces being easier to learn and to use, but
Python slams into a conundrum here akin to the "floating-point versus
*anything* sane <wink>" one: Python's major real-life use is as a glue
language, and threaded apps (ditto IEEE-754 floating-point apps) are
overwhelmingly what it needs to glue *to*.

So Python has to have a good thread story. Free-threading would be a fine
enhancement of it, tuple spaces (spelled "PyBrenda" or otherwise) would be
a fine alternative to it, but Python can't live without threads too.

And, yes, everyone who goes down Hoare's CSP road gets lost <0.7 wink>.
RE: Re: marking shared-ness
On Tue, 25 Apr 2000, Tim Peters wrote:
> [Greg Stein]
> > ...
> > Many people have asked for free-threading, and the number of inquiries
> > that I receive have grown over time. (nobody asked in 1996 when I first
> > published my patches; I get a query every couple months now)
>
> Huh! That means people ask me about it more often than they ask you <wink>.
>
> I'll add, though, that you have to dig into the inquiry: almost everyone
> who asks me is running on a uniprocessor machine, and is really after one
> of two other things:
>
> 1. They expect threaded stuff to run faster if free-threaded. "Why?" is
> a question I can't answer <0.5 wink>.

Heh. Yes, I definitely see this one. But there are some clueful people out
there, too, so I'm not totally discouraged :-)

> 2. Dealing with the global lock drives them insane, especially when trying
> to call back into Python from a "foreign" C thread.
>
> #2 may be fixable via less radical means (like a streamlined procedure
> enabled by some relatively minor core interpreter changes, and clearer
> docs).

No doubt. I was rather upset with Guido's "Swap" API for the thread state.
Grr. I sent him a very nice (IMO) API that I used for my patches. The Swap
was simply a poor choice on his part. It implies that you are swapping one
thread state for another (specifically: the "current" thread state). Of
course, that is wholly inappropriate in a free-threading environment. All
those calls to _Swap() will be overhead in an FT world.

I liked my "PyThreadState *PyThreadState_Ensure()" function. It would
create the sucker if it didn't exist, then return *this* thread's state to
you. Handy as hell. No monkeying around with "Get. oops. didn't exist.
let's create one now."

> I'm still a fan of free-threading! It's just one of those things that may
> yield a "well, ya, that's what I asked for, but turns out it's not what I
> *wanted*" outcome as often as not.

hehe. Damn straight. :-)

Cheers,
-g

--
Greg Stein, http://www.lyra.org/