Mailing List Archive

Stackless & String-processing [ In reply to ]
Tim Peters (tim_one@email.msn.com) wrote:
: Who said anything about agreeing? You two obviously disagree, but
: disagreement doesn't imply misunderstanding. Heck, you've both repeated the
: same unbudging positions so often even *I* understand what you're both
: saying <wink>.
:
: + other equally useful contributions.

What is the point of your post Tim? Getting your jollies from adding
useless material to a thread?

graham

--
The girls is all salty, the boys is all sweet,
the food ain't too shabby
an' they piss in the streets
In France, way down in France, way on down, way on down in France
Stackless & String-processing [ In reply to ]
Graham Matthews <graham@sloth.math.uga.edu> wrote:
> Tim Peters (tim_one@email.msn.com) wrote:
> : Who said anything about agreeing? You two obviously disagree, but
> : disagreement doesn't imply misunderstanding. Heck, you've both repeated the
> : same unbudging positions so often even *I* understand what you're both
> : saying <wink>.
> :
> : + other equally useful contributions.

> What is the point of your post Tim? Getting your jollies from adding
> useless material to a thread?

Actually this makes Tim go up in his political standing in this newsgroup,
if this is possible. :)

Actually, no. This kind of post is actually socially useful. By adding
some lightness to a heated discussion it's sometimes possible to prevent
nastiness from occurring.

Of course I shouldn't give such Grand Usenet Secrets away; my Pythonic
political standing is sure to drop. :)

Regards,

Martijn
Stackless & String-processing [ In reply to ]
If you're trying to parse something using a grammar
that needs heavy transformations to make it tractable,
then obviously using monadic parsers with arbitrary
code embedded in them is going to get in the way
of those transformations.

However, not all parsing problems are like that.
Programming languages are usually designed to be
straightforward to parse efficiently. Monadic
techniques can allow such parsers to be written
very elegantly.

Greg
Stackless & String-processing [ In reply to ]
[Tim]
> Who said anything about agreeing? You two obviously disagree, but
> disagreement doesn't imply misunderstanding. Heck, you've both
> repeated the same unbudging positions so often even *I* understand
> what you're both saying <wink>.

[Graham Matthews]
> : What is the point of your post Tim?

'Twas but a gentle attempt to break the thread out of the loop it appears to
be retracing now for the third time.

> Getting your jollies from adding useless material to a thread?

Sometimes, yes. Don't think it applies in this case, though.

> :
> : + other equally useful contributions.

My sig? There was nothing else of mine in the msg.

sarcasm-doesn't-work-on-imaginary-targets<wink>-ly y'rs - tim
Stackless & String-processing [ In reply to ]
In article <7n27r8$mm4$1@cronkite.cc.uga.edu>, Graham Matthews
<graham@sloth.math.uga.edu> wrote:
> Could you please point out to me where I claimed that "a program could
> optimise your combinator program". If you can't find such a quote then
> what is your point?
You didn't. The point, which is obvious to anyone who has actually done
serious pattern matching and parsing applications operating on
gigabytes of data, is that pattern matchers and parsers *must* be
optimized automatically (in the same way that programs in high-level
languages *must* be optimized automatically) because humans are just
not good enough at doing hand optimization of complex programs. Insofar
as a style of writing pattern-matchers/parsers is incompatible with
automatic optimization, it is only useful for toy examples, or, at
best, for proofs of concept (although in my experience a proof of
concept that doesn't deal with efficiency issues is not much of a
proof).
>
> Fernando Pereira (pereira@research.att.com) wrote:
> : But from my readings (those mentioned in the thread
> : and others) as well as your comments, the whole point of monadic
> : combinators for parsing is to interweave the grammatical structure with
> : other computations, thus making the underlying grammatical structure
> : less accessible to optimization. As I suggested above, it would be
> : interesting to define classes of monadic programs that still allow (an
> : appropriate extension of) those optimizations. But AFAIK that has not
> : been done.
>
> It hasn't been done, it won't likely be done, and I never said it
> could be done. So what is your point?
That it might be interesting to understand under what conditions it
could be done, building on the work I mentioned in my earlier posting
as well as on the work on attribute grammars, which is directly
relevant. Of course, if what is at stake is religion rather than
science, such curiosity may be frowned upon. Oh well... Enough said.

-- F
Stackless & String-processing [ In reply to ]
[Fernando Pereira]
> ...
> But from my readings (those mentioned in the thread and others) as well
> as your comments, the whole point of monadic combinators for parsing is
> to interweave the grammatical structure with other computations,

That was at least part of the point, the rest being the perceived advantages
of using the same notation (language) for both parsing and semantic
processing. Based on your work with weighted automata, I'd guess you'd be
the last person on earth to dismiss the advantages of a unifying framework
<wink>.

> thus making the underlying grammatical structure less accessible to
> optimization.

Whereas your approaches have factored things in such a way as to make them
more accessible to optimization.

If it isn't clear yet <wink>, efficiency isn't driving Graham here. He's
found a cool new approach, and is having fun with it.

Different strokes. Most of my parsing life has been concerned with
artificial languages, and there I've got a different complaint about
intermixing syntax with semantics: the grammar is interesting for its own
sake, and it's a positive burden at times to clutter it up with semantic
annotations!

Case in point: Python's grammar is formally defined by the file
Grammar/Grammar in the source distribution. That's the input to a parser
generator. At 60 non-noise lines, it's quite possible to digest the whole
thing, and minor changes in syntax can be effected without touching anything
else (how minor? exactly what doesn't require changing anything else
<wink>).

This is a joy as it is; even extending it to a conventional attributed
grammar would get in the way of the purposes to which the grammar file is
*usually* put.

I haven't tried using parser combinators, but in reading the papers the
question I had over and over is whether they scale well in practice, whether
measured by speed or maintainability (they eventually answered the "speed"
one: no). It didn't bode well that each new example seemed to require
creating a new combinator, and then you've got the implementation of *that*
to worry about on top of the target grammar and the semantics. It seemed to
carry the flavor of programming in an assembly language for "a real parsing
engine" (as Icon often feels to me in contrast to SNOBOL4 as grammars get
large). "You can have any sequencing you want, but you have to build it
yourself."

> As I suggested above, it would be interesting to define classes of
> monadic programs that still allow (an appropriate extension of) those
> optimizations. But AFAIK that has not been done.

It would. Not for me, though <wink>.

lex/yacc-looking-better-all-the-time-ly y'rs - tim
Stackless & String-processing [ In reply to ]
[Fernando Pereira]
> But from my readings (those mentioned in the thread and others) as well
> as your comments, the whole point of monadic combinators for parsing is
> to interweave the grammatical structure with other computations,
Tim Peters (tim_one@email.msn.com) wrote:
: If it isn't clear yet <wink>, efficiency isn't driving Graham here. He's
: found a cool new approach, and is having fun with it.

No that's not the case. Efficiency drives all my programs *when it's a
necessity*. My point is that using combinators:

a) is a very nice way of structuring a parser since it is very amenable
to change and robust under change.

b) doesn't have to cost anything in efficiency since you can freely mix
combinators with other parsing techniques.

c) is a general programming technique useful in a vast range of programs
outside of just parsing. For example I have used it for type checking
and writing a GUI. The literature has a lot of other examples of this
approach.

Moreover the point of combinators is not to "interweave the grammatical
structure with other computations". Combinators do do that, and I think
it's a great benefit. But the point is to:

a) clearly separate sequencing code from other code. In a combinator
program the precise control flow of the program is clear.

b) allow programmers to program their own sequencing combinators (rather
than providing some fixed set that the language designer hopes will
be enough).
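
As a rough illustration of the style being described here -- this is not
code from the thread, and the names symbol, seq and alt are invented for
the example -- a parser can be an ordinary function from (tokens, position)
to a list of (value, next_position) results, with all sequencing expressed
by plain higher-order functions:

def symbol(tok):
    # Primitive parser: match a single literal token.
    def parse(tokens, i):
        if i < len(tokens) and tokens[i] == tok:
            return [(tok, i + 1)]
        return []
    return parse

def seq(p, q):
    # Sequencing combinator: run p, then run q on whatever remains.
    def parse(tokens, i):
        return [((a, b), k) for a, j in p(tokens, i) for b, k in q(tokens, j)]
    return parse

def alt(p, q):
    # Choice combinator: all parses of p plus all parses of q.
    def parse(tokens, i):
        return p(tokens, i) + q(tokens, i)
    return parse

greeting = seq(symbol("hello"), alt(symbol("world"), symbol("python")))
print(greeting(["hello", "world"], 0))   # -> [(('hello', 'world'), 2)]

All of the control flow lives in seq and alt, and nothing stops a user from
writing further combinators (many, optional, ...) in the same style.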

Tim Peters (tim_one@email.msn.com) wrote:
: Different strokes. Most of my parsing life has been concerned with
: artificial languages, and there I've got a different complaint about
: intermixing syntax with semantics: the grammar is interesting for its own
: sake, and it's a positive burden at times to clutter it up with semantic
: annotations!

I have been using combinators for some time now and never noticed the
"grammar" being cluttered with all these "semantic annotations". What
annotations one requires can easily be separated from the grammar
either by sensible layout, or by the design of clever combinators that
hide semantic annotations away.
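
One way to picture that separation, as a sketch only (apply_to and symbol
are made-up names, not Graham's code): keep the grammar-like part free of
actions and attach the semantics through a single extra combinator.

def symbol(tok):
    # Match one literal token; a parser maps (tokens, i) to [(value, next_i)].
    def parse(tokens, i):
        return [(tok, i + 1)] if i < len(tokens) and tokens[i] == tok else []
    return parse

def apply_to(p, f):
    # Run parser p, then map the semantic action f over each parse result.
    def parse(tokens, i):
        return [(f(value), j) for value, j in p(tokens, i)]
    return parse

number = apply_to(symbol("42"), int)   # the grammar part stays clean; the
print(number(["42"], 0))               # action sits off to the side -> [(42, 1)]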
Tim Peters (tim_one@email.msn.com) wrote:
: Case in point: Python's grammar is formally defined by the file
: Grammar/Grammar in the source distribution. That's the input to a parser
: generator. At 60 non-noise lines, it's quite possible to digest the whole
: thing, and minor changes in syntax can be effected without touching anything
: else (how minor? exactly what doesn't require changing anything else

Combinators are very robust under change. I have a parser written using
combinators and I often tweak the grammar, without requiring large
changes to other code or the parser itself.

Have you ever written a combinator based parser or are you just
"theorising"?

Tim Peters (tim_one@email.msn.com) wrote:
: I haven't tried using parser combinators, but in reading the papers the
: question I had over and over is whether they scale well in practice, whether
: measured by speed or maintainability (they eventually answered the "speed"
: one: no).

Ah there's the rub -- you haven't written a combinator based parser, so
your comments are just what you *think* might happen if you did.

I don't know where you got the idea that speed is a problem! For some
kinds of parsing I wouldn't use combinators; rather, I would use
automaton techniques. But for most kinds of parsing they are plenty
fast enough.

Tim Peters (tim_one@email.msn.com) wrote:
: It didn't bode well that each new example seemed to require
: creating a new combinator, and then you've got the implementation of *that*
: to worry about on top of the target grammar and the semantics. It seemed to
: carry the flavor of programming in an assembly language for "a real parsing
: engine" (as Icon often feels to me in contrast to SNOBOL4 as grammars get
: large). "You can have any sequencing you want, but you have to build it
: yourself."

Let me guess Tim -- you have read two papers on combinators. One is the
Hutton/Meijer paper on "monadic parser combinators", and the other is
the chapter in the old Clean book. Each new example in these papers
requires a different combinator because each new example illustrates
some new sequencing idea -- e.g. determinisation, continuation, ...
The papers are illustrating the generality of the techniques, and hence
you expect to see a lot of different combinators. In practice however
(something you wouldn't know about right?) you don't use all these
combinators. I have written a parser for a new programming language
and I used 10 combinators (5 of which I re-used in the type checker!).
Yes you do have to build your own combinators (that's the point of the
approach!), but once done you stick them in a library and re-use them.

graham
--
The girls is all salty, the boys is all sweet,
the food ain't too shabby
an' they piss in the streets
In France, way down in France, way on down, way on down in France
Stackless & String-processing [ In reply to ]
Hi friends,

My understanding of this thread is limited and reduces
to the following sketch:

Graham Matthews wrote:
> [Fernando Pereira] [arguing]
> [Tim Peters] [digging in]
[Graham] [ elaborating ]
> [Tim Peters] [syntax mixed with sematix]
[Graham] [ elaborating ]
> Tim Peters [insists]
[Graham] [ combinators are robust, is Tim theorising? ]
> Tim Peters [ papers, not tried ]
[Graham] [ made a point, Tim is theorising ]
> Tim Peters [ combinators as assembly language ]
[Graham, now at Hutton/Meijer and the old Clean book]

can't see an end of this and would like to understand more.

I believe this thread would not have evolved if I had not
begun the Stackless Python / continuation/coroutine/generator
talk. Now I'm asking to excuse my limited understanding.

Besides my person, I think there are more interested readers
who would be happy to know what you are talking about at all.
I may assume that you would use private email if this
is not your intent. Also I'm sure that we'll see its
relevance for Python, soon.

Can you please be all a little more concrete?
How would your patterns look in Python?
Please show some small example code pieces.

This would also be valuable input for my implementation,
and would give me more reason to finish it at all.

gimme-a-reason-to-read-then-I'll-give-you-a-reason-to-write - ly y'rs

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
Stackless & String-processing [ In reply to ]
[Tim]
> ...
> If it isn't clear yet <wink>, efficiency isn't driving Graham
> here. He's found a cool new approach, and is having fun with it.

[Graham Matthews]
> No that's not the case.

OK, my mistake: you didn't find it, it's not cool, it's not new, and/or
you're not having fun <wink>.

> Efficiency drives all my programs *when it's a necessity*.

Ack, I'm speaking English here. If efficiency isn't your *primary* concern,
then efficiency isn't "driving" you. Dictionary. Look up. Efficiency is
Fernando's primary concern wrt parsing approaches -- it drives him. That's
how you two differ. It's important to note that efficiency really is a
necessity in his applications (have you read any of his papers? no claim
makes sense out of context).

> My point is that using combinators:
>
> a) is a very nice way of structuring a parser since it is very amenable
> to change and robust under change.

Could well be.

> b) doesn't have to cost anything in efficiency since you can freely mix
> combinators with other parsing techniques.

Like programming in Python doesn't have to cost anything in efficiency since
you can freely mix Python with other languages? It's a true enough
argument, but useless in practice unless 5% of the code accounts for 95% of
the runtime. Which it often *does* in general-purpose Python apps. For
parsing, it's not clear what kind of sequencing you could want that isn't
already covered by the "other parsing techniques" that are presumably doing
95% of the (presumed to be) efficiency-critical computation. Have something
concrete in mind?

> c) is a general programming technique useful in a vast range of programs
> outside of just parsing. For example I have used it for type checking
> and writing a GUI. The literature has a lot of other examples of this
> approach.

I didn't see anyone argue against that. This thread started with string
processing, and everyone else has stuck to that.

> Moreover the point of combinators is not to "interweave the grammatical
> structure with other computations". Combinators do do that, and I think
> it's a great benefit.

Expect that depends on the app: sometimes a benefit, sometimes not. I've
used systems of both kinds, and neither is a pure win.

> But the point is to:
>
> a) clearly separate sequencing code from other code. In a combinator
> program the precise control flow of the program is clear.

In a way deeper than it is in a functional program not using monads? IOW,
is the clarity of "control flow" due to monads specifically or just to
functional semantics in general? AFAIK, "combinator" hasn't sprouted a new
meaning beyond its old "higher-order curried function" one.

> b) allow programmers to program their own sequencing combinators (rather
> than providing some fixed set that the language designer hopes will
> be enough).

WRT parsing specifically, what kinds of desirable sequencing do you find
lacking in other approaches? Any parsing framework caters more-than-less
directly to juxtaposition, choice, optionality, repetition and (at least
conceptual) backtracking. I understand you can model all those with
combinators. What I haven't seen is an example of something clearly useful
(wrt parsing specifically) *beyond* those.

> ...
> I have been using combinators for some time now and never noticed the
> "grammar" being cluttered with all these "semantic annotations".

We often see most clearly that which we hate <wink>. I saw 'em all over the
place in the papers.

> What annotations one requires can easily be separated from the grammar
> either by sensible layout, or by the design of clever combinators that
> hide semantic annotations away.

I mentioned indirectly that what drives me is variously "speed or
maintainability" in parsing systems. "clever" is at odds with the latter --
unless all the cleverness comes built in to the system once & for all.

> ...
> Combinators are very robust under change. I have a parser written using
> combinators and I often tweak the grammar, without requiring large
> changes to other code or the parser itself.

That's encouraging!

> Have you ever written a combinator based parser or are you just
> "theorising"?

You ask that immediately before quoting my

> : I haven't tried using parser combinators, but in reading the papers ...

? Prophetic <wink>.

> Ah there's the rub -- you haven't written a combinator based parser, so
> your comments are just what you *think* might happen if you did.

Oh yes! Parsing with combinators is a Porsche. I've driven all kinds of
cars for thirty years, in various terrains and in all kinds of weather; even
built a few engines from metal I smolt myself <wink>, that other people
relied on. I used functional languages extensively too, although very
lightly this decade. So when I read the Porsche owner's manual it's not
exactly blind imagination driving my concerns. It's certainly no substitute
for a test drive, though. I installed Clean with the aim of doing that, but
had too much fun poking at other aspects of the system to get around to it.

> I don't know where you got the idea that speed is a problem!

*Can* be a problem. Fernando covered that exhaustively, and you actually
haven't disagreed with his points for all your words to the contrary <wink>.
I also have much other experience crafting elegant functional solutions that
had to be tossed in practice for the same kinds of "can't automatically
optimize a huge nest of clever higher-order applications" reasons.

> For some kinds of parsing I wouldn't use combinators; rather, I would use
> automaton techniques. But for most kinds of parsing they are plenty
> fast enough.

"Most" depends on what you're doing. For most things I do for fun, even
SNOBOL4 was plenty fast enough (and I don't doubt for an instant that even a
casually tossed-together combinator-based program is faster than that!).
Half the things I do for work deal with artificial languages that are
specifically designed for one-token lookahead techniques, because they're
designed for peak speed over megabytes of input; the other half is parsing
huge ambiguous grammars on cheap PCs via dynamic programming techniques, and
there we *claw* to save bytes and milliseconds, no holds barred.

Different strokes. I believe I'd enjoy combinator approaches for my own
fun; I still doubt they'd be practical for my work (but I'd rather try that
than continue arguing about it <0.9 wink>).

> ...
> Each new example in these papers requires a different combinator because
> each new example illustrates some new sequencing idea -- e.g.
> determinisation, continuation, ...
> The papers are illustrating the generality of the techniques, and hence
> you expect to see a lot of different combinators.

Yes, and I confess to over-panicking at what I indeed should have expected.

> In practice however (something you wouldn't know about right?) you don't
> use all these combinators.

Which casts some doubt on the great utility of being able to create new
ones. Parsing is an old problem. If only a handful of combinators are of
actual use in parsing problems, freeze the semantics, package them in a
library, and implement a pleasant parser generator on top of them <wink>.

> I have written a parser for a new programming language and I used 10
> combinators (5 of which I re-used in the type checker!).
> Yes you do have to build your own combinators (that's the point of the
> approach!), but once done you stick them in a library and re-use them.

That's fine for a one-man show. "Maintainability" in my life cuts across
programmers, and it's Not Good for each person to reinvent their own basic
machinery -- even to have the ability to do so. One reason I like Python is
that it *doesn't* let the jerk down the hall invent their own control
structures <0.5 wink>.

I will give parser combinators a try; they do sound interesting to me. More
interesting would be to hear about that "new programming language"!

i-believe-you-can-parse-it<wink>-ly y'rs - tim
Stackless & String-processing [ In reply to ]
Tim Peters wrote:
>
> In a way deeper than it is in a functional program not using monads? IOW,
> is the clarity of "control flow" due to monads specifically or just to
> functional semantics in general?

Monads are really only a big deal in the context of
functional languages, where there is no such thing
as control flow. Monads provide an astoundingly
elegant way of writing what is effectively procedural
code in a functional language. If you've ever used
one of the pre-monad systems for doing I/O in a
functional language, you'll appreciate what a
massive achievement this is.

However, Python is already a procedural language,
so there's no need to use monads there to fake
procedural code. It would be as pointless as
putting training wheels on a tricycle.

With regard to parser combinators, there are various
things at work in the functional language context that
don't apply to Python.

Using monads and parser combinators in a functional
language, you can write parsers that look very much
like a BNF grammar or whatever your favourite notation
is -- *without* needing any special tool to compile
and execute them.

This is largely due to the general sparseness of the
syntax, and the ability to define new infix operators,
neither of which apply to Python. You can't translate
the code into Python without losing most of the notational
convenience which accounts for a large part of the
technique's attractiveness.

The idea of composing parsing functions can still
be useful, however. For instance you might have a
function with the signature

def p_separated_list(p_item, p_separator):

where p_item and p_separator are parsers for
list items and separator tokens. Which incidentally
could be an answer to

> What I haven't seen is an example of something clearly useful
> (wrt parsing specifically) *beyond* those.

Note that the implementation of p_separated_list
need not be quite the standard version you might
find in a library. In a parser for Python code, for
example, it would take care of that quirky comma-at-
the-end-of-a-sequence thing.
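
One possible shape for such a function, sketched in Python on the assumption
(purely for illustration; this is not Greg's code) that a parser maps
(tokens, position) to a list of (value, new_position) pairs:

def p_separated_list(p_item, p_separator):
    # Parse: item (separator item)* [separator] -- the optional trailing
    # separator being the quirky Python-style comma mentioned above.
    def parse(tokens, i):
        results = []
        for first, j in p_item(tokens, i):
            items, pos = [first], j
            while True:
                sep = p_separator(tokens, pos)
                if not sep:
                    break
                _, after_sep = sep[0]
                nxt = p_item(tokens, after_sep)
                if not nxt:
                    pos = after_sep      # trailing separator: accept it
                    break
                item, pos = nxt[0]
                items.append(item)
            results.append((items, pos))
        return results
    return parse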

> I will give parser combinators a try; they do sound interesting to me. More
> interesting would be to hear about that "new programming language"!

If you're interested in examples of monad use for
parsing and compiling, I had a fit of insanity last
year in which I implemented a compiler in Hugs using
monads for several different things, including parsing.
If you'd like to see it, I could dig it out and put it
on a web page somewhere.

Greg
Stackless & String-processing [ In reply to ]
In article <3795F4E4.8330755F@appliedbiometrics.com>,
Christian Tismer <tismer@appliedbiometrics.com> wrote:
>
>I believe this thread would not have evolved if I had not
>begun the Stackless Python / continuation/coroutine/generator
>talk. Now I'm asking to excuse my limited understanding.
>
>Can you please be all a little more concrete?
>How would your patterns look in Python?
>Please show some small example code pieces.

I am working through the Hutton/Meijer paper, and (slowly!)
implementing the parsers they describe. Basically, the important part
isn't the "monadic" part, but the "combinator" part. The idea is that
you can/should write primitive parsers, and then write higher-order
functions to weave those little parsers into a bigger parser. It's a
nice way to write recursive descent parsers without sinking into
spaghetti code, basically.

(This is kind of hard to do right in Python, because the lack of
lexical scoping and of easy-to-use metaclasses makes it trickier to mimic
the combinators they describe exactly. But it's still a fun project.)

One of the consequences of this approach is that outputting all the
parses for an ambiguous grammar as a stream of parse trees is very
straightforward. In Haskell or Gofer, you could rely on the laziness
of the language to avoid unnecessary computations, but in Python
it would be not-fun to wait for Python to compute the whole list.

Here's a (simple, rough-draft) combinator I've written:

class Bind:
    def __init__(self, parser1, func):
        self.parser1 = parser1
        self.func = func   # func(v :: AST) -> Parser; e.g. Result

    def __call__(self, input):
        parse_list = self.parser1(input)
        lst = []
        for ast, s in parse_list:
            foo = self.func(ast)(s)
            for elt in foo:
                lst.append(elt)   # <-- a generator would help a lot here!
        return lst
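
For concreteness, one way such a Bind might be used, with stand-ins for the
Hutton/Meijer "result" and "item" primitives (illustrative only, not part of
Neel's code):

class Result:
    # Succeed without consuming input, returning a fixed value.
    def __init__(self, value):
        self.value = value
    def __call__(self, input):
        return [(self.value, input)]

class Item:
    # Consume and return the first character of the input, if any.
    def __call__(self, input):
        return [(input[0], input[1:])] if input else []

# Bind chains parsers: parse one char, then another, then package them up.
two_chars = Bind(Item(), lambda a: Bind(Item(), lambda b: Result((a, b))))
print(two_chars("spam"))   # -> [(('s', 'p'), 'am')]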

I would love to rewrite the __call__ method so that it returns a
generator that returns parses one at a time rather than returning a
list. Note that the combinator approach magnifies the benefits of
generators, since parsers are normally built through composition, and
each sub-parser can potentially generate a sequence of parses.
Laziness is a real win here.
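
For what it's worth, a sketch of how that __call__ might read once
generators exist (purely illustrative; the class name is made up):

class BindLazy:
    def __init__(self, parser1, func):
        self.parser1 = parser1
        self.func = func
    def __call__(self, input):
        # Yield parses one at a time instead of building the whole list,
        # so a consumer can stop after the first successful parse.
        for ast, s in self.parser1(input):
            for elt in self.func(ast)(s):
                yield elt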

Also, it might make sense to add a xreadlines() method to file
objects, which returns the lines of a file one at a time, because then
you can write code in the natural style of:

for line in file.xreadlines():
    blah_blah_blah(line)

without having to read the whole file into memory. (It would
also likely eliminate about 80% of the push for assignment
expressions.)


Neel
Stackless & String-processing [ In reply to ]
[Tim]
: > If it isn't clear yet <wink>, efficiency isn't driving Graham
: > here. He's found a cool new approach, and is having fun with it.
[Graham Matthews]
: > Efficiency drives all my programs *when it's a necessity*.
[Tim]
: Ack, I'm speaking English here. If efficiency isn't your *primary* concern,
: then efficiency isn't "driving" you. Dictionary. Look up.

No Tim I suggest you look up the dictionary. The dictionary definition of
"driven" does not preclude one being driven by more than one concern (and
indeed driven by more than one concern each of equal importance). For
example I am driven by efficiency, correctness, and expressiveness in
roughly equal importance. So efficiency is driving me. Perhaps the motto
here is that he who lives by the dictionary shall die by the dictionary.

At this point I am signing off this thread. It is no longer remotely
relevant to the group. I refer all questions about combinators to
Tim, who, although he has never written a combinator based parser,
appears to know everything about them. Oh, and he can look up words in the
dictionary too, although perhaps not understand the definition he finds.

graham

--
Black velvet in that little boy's smile
Black velvet and a soft southern sky
A new religion that'll bring you to your knees
Black velvet, if you please
Stackless & String-processing [ In reply to ]
Christian Tismer wrote:
>
> def backtrack():
>     a = next_alternative(list)
>     b = next_alternative(otherlist)
>     c = next_alternative(listwhatever)
>     if specialcondition(a, b, c):
>         return 42
>     fail()

This might not do what you seem to think
it will do under Icon semantics. For each value
generated for a, you will get all the values of
b, etc., so you'll get the cross product of
the three lists. If you were expecting a parallel
iteration, that's not what you'll get.
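
In plain Python terms (an illustrative sketch, not Icon), the difference is
nested loops versus a parallel walk:

list1, list2, list3 = [1, 2], [3, 4], [5, 6]

# Icon-style generation: every a with every b with every c -- 8 triples.
cross = [(a, b, c) for a in list1 for b in list2 for c in list3]

# A parallel iteration would only pair elements positionally -- 2 triples.
parallel = list(zip(list1, list2, list3))

print(len(cross), len(parallel))   # -> 8 2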

This illustrates why I don't like this sort of
implicit generator syntax -- often it's not obvious
what it's going to do. Switching languages for a
moment, rewriting the above in Scheme it comes
out as:

(define (backtrack cont)
  (for-each-alternative list
    (lambda (a)
      (for-each-alternative otherlist
        (lambda (b)
          (for-each-alternative listwhatever
            (lambda (c)
              (if (specialcondition a b c)
                  (cont 42)
                  '()))))))))

which makes it perfectly clear that you have
three nested loops.

Currently there is no elegant way to write that
sort of thing in Python. Either you have to split
the thunks off into separate procedures, or put up
with a seriously crippled lambda syntax. Either
way, you get namespace problems. That's why I
suggested an extension, which would allow you to
write

def backtrack(cont):
    for_each_alternative(list) -> (a):
        for_each_alternative(otherlist) -> (b):
            for_each_alternative(listwhatever) -> (c):
                if specialcondition(a, b, c):
                    cont(42)

which is actually one line shorter than the
generator version, and (to my way of thinking)
the control flow is clearer.

Greg
