Mailing List Archive

RFC - "Saving" perl
This is broken up. General idea at the top; ramblings after that. I
appreciate the consideration on this topic that's been afforded and the
patience all have held during this exercise. Everything below is mine and
mine alone. It expresses no one else's opinions. Just to be "transparent".
Additionally I consulted no one while writing it or before sending it.

I. EXECUTIVE SUMMARY

I'll present the idea, then I'll probably write a ton of stuff to provide
background and context, plus a lot of stuff that is irrelevant. Such is my
burden; here it is. Also note, this is not my idea. I just happen to
recognize its importance and potential to be a tremendous catalyst.

IDEA

I should say upfront, I am under the impression we don't have anything
like what I am about to describe. If I am mistaken and we do, then great;
we're halfway there.

The perl runtime would greatly benefit from a simple (but not too simple)
API or mechanism by which execution contexts may be managed (saved,
resumed, inspected, etc.). The execution context represents the overall
state [0] of the program at the time it's saved. Note, the book in which
[0] is a chapter is a dense read, but I highly recommend all of it.

Or, conversely, we should seek to assist LeoNerd in his current async/await
work to see that a generally usable approach is made available as an
artifact of the implementation (the burden is not on him IMO, it's on *us*).

In time sharing operating systems (Unix), this is known as "context
switching". When using git, it's the "git stash" interface. See [7,8] for
more reads.

SYNOPSIS

If we can't have more than one execution context at a time, let's fake it
in a useful way by providing a way to save a context, resume it, and even
manipulate more than one context at a time (e.g., "merging" - more on this
later).
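
To make "save it, resume it" concrete, here is roughly what that looks
like at the C level with the POSIX ucontext(3) calls, one of the coroutine
mechanisms surveyed in [7]. This is illustrative prior art only, not a
proposed perl API; the names, the stack size, and the two-context setup
are purely for the demo.

    /* Illustration only: save/resume of an execution context using
     * POSIX ucontext(3); not a proposed perl interface. */
    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, task_ctx;
    static char task_stack[64 * 1024];      /* demo-sized stack */

    static void task(void)
    {
        printf("task: started\n");
        swapcontext(&task_ctx, &main_ctx);  /* save self, resume main */
        printf("task: resumed where it left off\n");
    }                                       /* returning goes to uc_link */

    int main(void)
    {
        getcontext(&task_ctx);                  /* start from current state */
        task_ctx.uc_stack.ss_sp   = task_stack; /* its own stack */
        task_ctx.uc_stack.ss_size = sizeof task_stack;
        task_ctx.uc_link          = &main_ctx;  /* where task() returns to */
        makecontext(&task_ctx, task, 0);

        swapcontext(&main_ctx, &task_ctx);      /* save main, run task */
        printf("main: task yielded\n");
        swapcontext(&main_ctx, &task_ctx);      /* resume the saved task */
        printf("main: done\n");
        return 0;
    }

The takeaway is that "save the whole execution state and pick it back up
later" is a small, well understood primitive; the interesting question for
us is what the perl-level equivalent of "the whole execution state" needs
to include.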

I expand a little below, but to make the executive summary complete: it's
my understanding that "80%" of LeoNerd's effort behind async/await is based
solely on context management. Apologies if this point was misunderstood,
but I don't think it was. Nonetheless, it only encouraged what I had
already concluded was very important during that "chat".

<!-- you can stop here if you just wanted the general idea -->

II. LONGMESS

The perl "run time" is a single, monolith execution context. When we fork,
we're admitting this. We're also admitting another thing implicitly; that
fork, while very useful, only creates clones of the parent process. And the
are non-communicating. So, one important and meaningful concept is that
"perl" is great at facilitating and managing any number of
"non-communicating
sequential processes" [5,6].

This has cost untold amounts of time and mental anguish for people who tried
to solve this "problem", were forced to seek alternatives but really wanted
to use perl/Perl, etc. Is it a problem? Not if you admit that a perl process
is a single execution context. It is part of the nature of Perl. This is
where I am.

It wasn't that long ago that single CPUs were normal. The race to get
"faster" was measured in MHz and GHz. People overclocked their single CPUs,
paid way too much for a few MB of RAM, installed FreeBSD or Gentoo so they
could tweak every program they installed, etc. Many remember those days.

It *was* quite a while ago that Unix came on to the scene. Its main selling
point was that it was a "time sharing" operating system. Not to get into
"why
did Unix succeed where others failed"; I will just point this out. The time
sharing aspect of Unix allowed multiple users to be on the machine at the
same time. Though, generally the people "felt" like they had exclusive
access
to it.

Time sharing was so popular, this thing called Linux came about. And
before I get accused of providing a terrible history lesson, I will end by
pointing out that *most* of us first installed Linux (or a BSD) many years
ago on single CPU machines. Some of us might still do that for sport. The
point is that the "time sharing" approach was implemented, and it worked,
on a single CPU.

Getting back to the perl "runtime", what does thinking "perl/Perl is a
uniprocessor operating system" get us? A lot of things, actually.

Operating systems research for a long time was focused on the uniprocessor
model. And there are a lot of things that go into presenting a "time
sharing"
os experience for human beings. Some of those "things" might one day be
considered appropriate for the perl runtime, but one in particular I
believe will help us move ahead right now: the concept of *context
switching*. It was so powerful, in fact, that people felt like they were
the only ones on the computer. MORE than that, all of the many processes
running on the computer felt like *they* were the only process on the
machine. There I go anthropomorphizing computer stuff again. Sorry.

BACKGROUND

Not to be dramatic, but I've had a slowly growing epiphany. And today after
chatting with some *very* patient folks on #p5p, the path forward dawned on
me.

1. formalize the notion that the perl "run time" is a single thread of
execution, i.e., a single context
2. don't fight it
3. embrace it

Most stop digging when they realize they're in a hole. I think our situation
requires us to get better shovels, upgrade to a steam shovel, or even invest
in a few tunneling machines. I suspect we'll bust out on the other side of
the world and be glad we did. It might get hot for a while, but there's
really
no alternative.

What does embracing it look like? It looks like this: if we've got this
amazingly powerful, albeit uniprocess, runtime, then look into the past and
consider how those brilliant folks dealt with having just a single CPU.

How did this come from a conversation with the good folks on #p5p? LeoNerd
was discussing his async/await work. And I had been sufficiently convinced
that no amount of magic could present a "real" asynchronous environment; I
mean, "async" is just like threads but way less powerful, which is why
"futures" and "promises" are a thing. But still, I persisted: HOW is this
possible? Here's how, based on my limited understanding; and btw I think it
is brilliant, and it was the final piece of the puzzle for me:

0. main execution context is running; all the async stuff is in a "run loop"

1. async/await is called

2. execution contexts are then managed using suspension/resumption code
CUSTOM written for this purpose

[LeoNerd, correct me if I am wrong here; I don't mean to mischaracterize
your work]
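
To be clear about the shape of that (and NOT claiming this is how
LeoNerd's code actually works), here is a hand-rolled sketch of the
pattern: a main "run loop" that resumes suspended execution contexts until
they finish, again with ucontext(3). Anything async/await-like ends up
writing some equivalent of this by hand today.

    /* Sketch of a "run loop" resuming suspended contexts; illustrative
     * only, not anyone's actual implementation. */
    #include <stdio.h>
    #include <ucontext.h>

    #define NTASKS 2
    static ucontext_t loop_ctx, task_ctx[NTASKS];
    static char stacks[NTASKS][64 * 1024];
    static int done[NTASKS];
    static int current;                 /* which task the loop is running */

    static void yield_to_loop(void)     /* the "suspend myself" primitive */
    {
        swapcontext(&task_ctx[current], &loop_ctx);
    }

    static void task_body(void)
    {
        int id = current;
        printf("task %d: before suspension\n", id);
        yield_to_loop();                /* saved here; resumed here later */
        printf("task %d: after resumption\n", id);
        done[id] = 1;                   /* returning falls back to loop_ctx */
    }

    int main(void)
    {
        for (int i = 0; i < NTASKS; i++) {
            getcontext(&task_ctx[i]);
            task_ctx[i].uc_stack.ss_sp   = stacks[i];
            task_ctx[i].uc_stack.ss_size = sizeof stacks[i];
            task_ctx[i].uc_link          = &loop_ctx;
            makecontext(&task_ctx[i], task_body, 0);
        }
        for (int remaining = NTASKS; remaining > 0; ) {
            for (int i = 0; i < NTASKS; i++) {    /* the run loop itself */
                if (done[i])
                    continue;
                current = i;
                swapcontext(&loop_ctx, &task_ctx[i]);
                if (done[i])
                    remaining--;
            }
        }
        printf("run loop: all tasks complete\n");
        return 0;
    }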

Then I said something to the effect of, "seems like support for context
switching 'execution' contexts would help you out there". And, at the risk
of taking his comment the wrong way: he said about 80% (EIGHTY PERCENT) of
the code is in place to manage the execution context. WTF.

To his great credit, he also said he suggested the same or a similar
approach
some years ago. I definitely believe that! Furthermore, I believe such an
approach's time has come.

WHY IS THIS IMPORTANT

Let's generalize the implications of having the ability to "context
switch" easily.

The implementation of async/await can be described as, "enabling distributed
programming semantics in an environment that may allow one and only one
'execution context' to be active at any time".

In this view, it doesn't matter if it is async/await, "threading"
(fork/join), or some other concurrency model that is 'hot'. It boils down
to enabling the support of the SEMANTICS of the concurrency paradigms in an
inherently serialized way.

There is a Computer Science concept for doing this "correctly" given the
semantics of the paradigm you are literally faking.

Some people call this concept "linearizability". Others call it "sequential
consistency".

What does "sequential consistency" mean From [1,2]?

    ...the result of any execution is the same as if the
    operations of all the processors were executed in some
    sequential order, and the operations of each individual
    processor appear in this sequence in the order
    specified by its program.

What does "linearizability" from [3.4] mean?

    In concurrent programming, an operation (or set of
    operations) is linearizable if it consists of an
    ordered list of invocation and response events
    (callbacks), that may be extended by adding response
    events such that:

    + The extended list can be re-expressed as a sequential
      history (is serializable).
    + That sequential history is a subset of the original
      unextended list.

    Informally, this means that the unmodified list of
    events is linearizable if and only if its invocations
    were serializable, but some of the responses of the
    serial schedule have yet to return.

(incidentally, the main author from [4] is a very well-known software
transactional memory researcher)

For us, this means: if you're going to fake a multi-process environment in
a uni-process model, make damn sure it behaves the same as it would IRL.
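
To make that concrete, here is a small worked example (a textbook one of
mine, not taken from [1,2]). Two "processes" P1 and P2 share x and y, both
initially 0, and run "x = 1; r1 = y" and "y = 1; r2 = x" respectively. The
little C program below enumerates every interleaving that respects each
program's own order; none of them produce r1 == 0 && r2 == 0, so a faked
multi-process layer that can produce that outcome is not sequentially
consistent.

    /* Enumerate the sequentially consistent interleavings of
     *   P1: x = 1; r1 = y;    P2: y = 1; r2 = x;    (x = y = 0 at start)
     * 'A' is P1's next operation, 'B' is P2's next operation. */
    #include <assert.h>
    #include <stdio.h>

    int main(void)
    {
        const char *schedules[] = { "AABB", "ABAB", "ABBA",
                                    "BAAB", "BABA", "BBAA" };
        for (int s = 0; s < 6; s++) {
            int x = 0, y = 0, r1 = -1, r2 = -1;
            int p1 = 0, p2 = 0;             /* next op index in P1 / P2 */
            for (int k = 0; k < 4; k++) {
                if (schedules[s][k] == 'A') {
                    if (p1++ == 0) x = 1; else r1 = y;
                } else {
                    if (p2++ == 0) y = 1; else r2 = x;
                }
            }
            printf("%s -> r1=%d r2=%d\n", schedules[s], r1, r2);
            assert(!(r1 == 0 && r2 == 0)); /* never both zero under SC */
        }
        return 0;
    }

Whatever context management layer we provide has to make guarantees like
that one practical for the implementer of a concurrency paradigm to uphold.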

The burden of ensuring "sequential consistency" is squarely on whomever is
implementing the required support of the semantics of the concurrent
programming model they are adding, in our case, to perl/Perl.

It also follows that if someone wanted to implement, oh idk, a shared memory
programming environment (e.g., threads; or OpenMP's 'fork'/'join' model);
then they would also have to correctly implement this "sequential
consistency". That's a HUGE burden alone - though a necessary one.

The main point: This is IMPORTANT because extending Perl semantics to
support interesting programming paradigms (which is one of its greatest
strengths) currently places a dual burden on anyone implementing said
semantics to:

a. ensure sequential consistency (rightly so)

b. implement their own way of managing execution contexts (the horror!,
srsly this is an awful situation)

It is my conclusion that by marshaling our brains and resources, the
CORRECT "next" thing to provide is a standard way of managing execution
contexts, specifically for the purpose of enabling any new programming
paradigm - and those all seem to be multi-process. Without this key
feature, we are going to get no more meaningful semantics ever.

For me, and I hope I've convinced the right people, this is a mortal wound
that we need to fix ASAP.

BABBY'S NEXT STEP

It is very fortunate that I do not *think* providing this is such a great
burden. And far be it from me to suggest *what* to do, but I will propose a
straw-man "plan" of action that is hopefully clear and doable, and that
provides a clear high level road map:

0. formalize that the "perl run time" is a uniprocess environment (the OS
analogy may or may not help)

1. provide a minimal, practical, not totally boneheaded way for those
working on semantics for "new" programming model support
(async/await/futures, fork/join, etc) to context switch execution contexts
(a purely hypothetical sketch of what such an interface might look like
follows this list).

2. judiciously and incrementally add to this general API layer based on the
needs of supporting additional concurrent programming model semantics

3. enjoy the newfound path ahead for new features that are consistent with
the perl/Perl we all know and love
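
As promised in step 1 above, here is a purely hypothetical straw-man of
the *shape* such a minimal interface could take. None of these names exist
in the perl source today; they are placeholders for discussion only
(declarations, no implementation implied):

    /* Hypothetical straw-man only; every name below is made up. */
    typedef struct perl_exec_ctx perl_exec_ctx;  /* opaque saved context */

    /* Save the currently running execution context; return a handle. */
    perl_exec_ctx *perl_ctx_save(void);

    /* Suspend the current context and resume a previously saved one. */
    void perl_ctx_switch(perl_exec_ctx *to);

    /* Inspect a saved context without resuming it. */
    int perl_ctx_is_runnable(const perl_exec_ctx *ctx);

    /* Discard a saved context that will never be resumed. */
    void perl_ctx_free(perl_exec_ctx *ctx);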

THE FUTURE

I added this section for additional motivation. With a way to easily
capture, save, and restore execution contexts, there are suddenly a great
many interesting possibilities for perl/Perl. Some relate to real
multi-processing, some relate to new (and consistent!) semantics, and some
relate to solving some people problems that have been created over the
years due to how *badly* so many people want a path forward (outside of the
coolness factor, the latter is actually what I hope the greatest real
benefit is).

In terms of what abilities other than save/restore of execution contexts
such a layer can provide; well...one that I think is particularly exciting
would be the
ability to "merge" execution contexts. This would be required, for example,
if we wished to present software transactional memory type semantics on top
of any simulated "shared memory" we'd use to support actual SMP semantics.
It'd also provide the basis for experimentation with semantics related to
'lock free' data structures. The list gets longer and more exotic the more
one thinks about it.

In addition to this, here are some other things that seem very possible as a
result of taking this first step:

a. once sufficiently encapsulated, this "context management" layer could be
the basis for enabling actual multi-processing - it could be that the "perl
runtime" may not even need to be aware that things are happening
concurrently.

b. #a could also lead to some real collaboration among many individuals who
are highly capable and interested in the area of "run times". That's not my
aim; I am just saying that it could provide an opportunity for some folks
whom we maybe have not seen in a while to work on very interesting things
in that *layer*.

CONCLUSION

I will wrap it up with the following strongly held convictions:

* it will serve us well to look at how uni-process time sharing operating
systems solve things; it may inspire us with new language features or
solutions to sticky situations in terms of perl

* if we want to enable semantic extensions to Perl that uniformly implement
multi-process paradigms in a uni-process environment, we absolutely must
provide standard capabilities for managing context; likely there are other
capabilities needed (e.g., "run loop" support? idk if that's a thing).

III. APPENDIX A - FAQ

1. You went from pushing "real" SMP to "fake" multi-processing by
proposing a
way to save and resume context states

Yes, busted. I feel this is the best way forward. And it could fall out of
the work that LeoNerd is doing now for async/await. It's already being
done; let's try and make it reusable, then actually reuse it (I would).

2. You sound pretty confident that this will solve all of our problems, what
if it doesn't?

It's NOT a silver bullet. But while I do think it's highly likely not to be
"sufficient" for some semantic extensions to Perl, I do think it's
"necessary"; i.e., we will need to do this no matter what IMO.

3. You speak of this as if it's easy.

I don't think this is easy, but I do think that it'd be a substantial force
multiplier for anyone interested in extending Perl into "interesting"
programming model areas. Going back to what I said above. All the "low
hanging fruit" has been picked. Time to put on our adult pants and move
forward.

4. Sounded like you're thinking about this as a "virtual machine" area, are
you?

NO. I can write another 5,000 words on what I observed starting with pugs,
parrot, moarvm, rakudo, and perl 6. If anything I am suggesting the opposite
approach. Provide the minimum functionality one might need to fake
multi-processing in our uni-process model. That doesn't sound like a vm
layer to me; it sounds like actually supporting semantic extensions for new
and unanticipated things.

REFERENCES

[0] Edsger W. Dijkstra, "A Discipline of Programming", Chapter 2, "STATES
AND THEIR CHARACTERIZATIONS"

[1] https://en.wikipedia.org/wiki/Sequential_consistency

[2] Leslie Lamport, "How to Make a Multiprocessor Computer That Correctly
Executes Multiprocess Programs", IEEE Trans. Comput. C-28, 9 (Sept. 1979),
690-691.

[3] https://en.wikipedia.org/wiki/Linearizability

[4] Herlihy, Maurice P.; Wing, Jeannette M. (1990). "Linearizability: A
Correctness Condition for Concurrent Objects". ACM Transactions on
Programming Languages and Systems. 12 (3): 463–492. CiteSeerX
10.1.1.142.5315

[5] http://www.usingcsp.com/cspbook.pdf

[6] https://en.wikipedia.org/wiki/Communicating_sequential_processes

[7] https://en.wikipedia.org/wiki/Coroutine#Implementations_for_C

[8] https://docs.python.org/3/reference/compound_stmts.html#async
Re: RFC - "Saving" perl [ In reply to ]
I think people have a very confused understanding of parallelization.

For example, GPU parallelization is completely different from I/O
parallelization.

But because of the word "parallelization", we tend to think of them as
something similar.

Before people learn the right knowledge, they say "Perl is a bad language
because it doesn't have parallelization."




Re: RFC - "Saving" perl [ In reply to ]
I know that I've contributed to the confusion. The question has changed
directions several times.

I've gone from a quest to find opportunities for "real" concurrency that
might be exposed through Perl semantics, 180 degrees, to what can be
done to support "sequential consistency" (correctness) of concurrent
semantics in a uni-process environment.

The most clearly I can express this is that, in order to support
concurrent semantics properly in perl, we must have fundamental support
that allows us to go from the concurrent order of operations expressed
using Perl (language) semantics to a fully serialized ordering that is
consistent with some ordering of operations that could stem from it.

I know it's confusing.

I can try to put it another way. If we want to support concurrent
semantics in Perl (the language), the perl API needs to provide whoever
is implementing the runtime execution with capabilities for ensuring that
the final execution (necessarily sequential in a uni-process) is "correct"
(i.e., 'sequentially consistent' with the original concurrent
semantics). I have proposed that, minimally, there needs to be high-level
runtime execution context switching and the ability to query stored
contexts.

Another generalization. Rather than figuring out how to make a
uniprocess runtime execute concurrent semantics in parallel for real; I
have reversed this and now believe the right thing to do is to ensure
concurrent semantics are executed correctly in a uniprocess runtime
(i.e., fully sequential). That means we safely remove the parallelism
(aka, "serialize") in execution; the end result is consistent with a
valid end state had there been actual parallel execution.

And finally, instead of going from sequential to parallel; we actually
reduce the parallelism expressed down to a single line of execution and
just make sure it's "correct".

There are other approaches to handling this also. I think context
switching is minimally required. Another thing that could be helpful is
a scheduler, but this is an area ripe for collaborative conflict. For
example, the whole FreeBSD/DillonFreeBSD debacle was over SMP scheduler
changes between FreeBSD 4.x and 5.x (DfBSD forked from 4.x). So I
recommend we avoid that whole area; too many opinions and options there.
I digress.

My current thinking is that, given uniprocess runtime context switching
capabilities, we'll learn enough lessons over the course of
implementations of "concurrent semantic serialization" to know what kind
of things we need and whether a scheduler is needed. It's not rocket
surgery. Or maybe it is.

So for better or for worse, that's my ask: the context switching in a
uniprocess runtime thing.

Today I was thinking of what the "serialized form" of "semantically
expressed concurrency" looked like. For "fork/join" "shared memory"
semantics, I think it looks like loop unrolling. If I come up with some
examples this evening I will post them; for async/await, I'd rather
LeoNerd have time to think over all this stuff and take a crack at it
when he's ready.
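
Roughly, for the fork/join case, the kind of thing I mean looks like this
(in C for concreteness; this is only the shape of the argument, not the
worked Perl examples mentioned above):

    /* Fork/join over a fixed iteration space with no cross-iteration
     * dependencies: the "parallel" semantics and a fully serialized,
     * unrolled execution of the same bodies end in the same state. */
    #include <stdio.h>

    #define N 4

    int main(void)
    {
        int out[N];

        /* The semantics being faked: conceptually these N bodies may run
         * in parallel, and the implicit "join" means all of them have
         * finished before execution continues (think OpenMP parallel for). */
        for (int i = 0; i < N; i++)
            out[i] = i * i;

        /* One sequentially consistent serialization of the same thing:
         * the loop unrolled, bodies run one after another. */
        int unrolled[N];
        unrolled[0] = 0 * 0;
        unrolled[1] = 1 * 1;
        unrolled[2] = 2 * 2;
        unrolled[3] = 3 * 3;

        for (int i = 0; i < N; i++)
            printf("parallel-ish: %d  unrolled: %d\n", out[i], unrolled[i]);
        return 0;
    }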

Again, I know it's super confusing why I was first seeking
parallelization and now seeking serialization. It's all about going
through the runtime; and to do that we necessarily have to present a
uniprocess execution model in our reasoning about "correctness".

Brett
