Mailing List Archive: random_module.3pm (Was: perl)

Aug 28, 1995, 1:50 AM

Post #1 of 8

>From: Kenneth Albanowski <kjahds@kjahds.com>
>On Mon, 28 Aug 1995, matthew green wrote:
>
>>
>> i notice MakeMaker's 'make install' still doesn't do this:
>>
>> pod2man random_module.pm > random_module.3pm
>>
>> i'm not sure if i've said this before, or not, but:
>>
>> shouldn't this be run as part of the 'make' process, not the
>> 'make install' process? it's annoying to 'make' perl, and then,
>> 'make install' perl, and only *then* have it `compile' the
>> manual pages, and, even worse, to `compile' them in to the
>> target directory directly? do you want 'cc -o /usr/bin/perl ...'
>> also? i think not :-)

It's really too late now. MakeMaker is designed to 'compile'
everything away. So to speak, it does 'cc -o /usr/bin/perl ...'. 'make
install' is a recursive call to make with the three key macros
INST_LIB, INST_ARCHLIB, and INST_EXE set to the defaults that were
given at 'perl Makefile.PL' time. With installman3dir I will do just
the same. INST_MAN3 will be ./blib at 'make' time and will become an
overridable $Config{installman3dir} at 'make install' time.

What I really need is a function contains_pod(). I suppose this should
be found in Pod::Parse. What's the status, Kenneth? Can we have
Pod::Parse in the near future?

>I think part of the problem is that cleaning up the files when asked is a
>little bit of a chore. Mightn't there be a file named "something.man"
>that you don't want to delete when you make clean? Just a thought. Perhaps
>not a very good one.

:) When MakeMaker does manify the pods, it will do it properly :) And
people will complain anyway ;-)

>
>> .mrg.
>>
>
>--
>Kenneth Albanowski (kjahds@kjahds.com, CIS: 70705,126)
>
>

andreas

Re: random_module.3pm (Was: perl)

kjahds at kjahds

Aug 28, 1995, 1:47 AM

Post #2 of 8

On Mon, 28 Aug 1995, Andreas Koenig wrote:

> What I really need is a function contains_pod(). I suppose this should
> be found in Pod::Parse. What's the status, Kenneth? Can we have
> Pod::Parse in the near future?

Not likely. I may get to do some work on it soon, but I doubt it can be
beaten into shape. After the latest round of pod2html discussion I've
gotten a bit fed up with POD, to be honest. Maybe I'll get it out of my
system soon enough to do some work. In any case, to quote perldoc:

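sub containspod {
    my($file) = @_;
    local($_);
    # An unreadable file is treated as containing no POD.
    open(TEST,"<$file") || return 0;
    while(<TEST>) {
        # Only =head commands are checked for.
        if(/^=head/) {
            close(TEST);
            return 1;
        }
    }
    close(TEST);
    return 0;
}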
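
A minimal sketch of a slightly broader check, assuming that any line
starting with '=' followed by a letter counts as a POD command (the
containspod routine above only looks for =head). The name
contains_pod_text, and the choice to take the text as a string rather
than a filename, are illustrative assumptions and not code from perldoc:

```perl
use strict;
use warnings;

# Hypothetical helper: true if the given text contains any POD command
# paragraph, i.e. a line starting with '=' followed by a letter.
sub contains_pod_text {
    my ($text) = @_;
    return $text =~ /^=[a-zA-Z]/m ? 1 : 0;
}

print contains_pod_text("package Foo;\n\n=head1 NAME\n\nFoo\n"), "\n";
print contains_pod_text("my \$x = 1;\n"), "\n";
```

Working on a string keeps the check independent of file handling; a
caller that wants the file-based behaviour can slurp the file first.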

> >I think part of the problem is that cleaning up the files when asked is a
> >little bit of a chore. Mightn't there be a file named "something.man"
> >that you don't want to delete when you make clean? Just a thought. Perhaps
> >not a very good one.
>
> :) When MakeMaker does manify the pods, it will do it properly :) And
> people will complain anyway ;-)

Of course. ;-)

> >
> >> .mrg.
> >>
> >
> >--
> >Kenneth Albanowski (kjahds@kjahds.com, CIS: 70705,126)
> >
> >
>
> andreas
>
>
>

--
Kenneth Albanowski (kjahds@kjahds.com, CIS: 70705,126)

Re: random_module.3pm (Was: perl)

lwall at scalpel

Aug 28, 1995, 5:35 PM

Post #3 of 8

: On Mon, 28 Aug 1995, Andreas Koenig wrote:
:
: > What I really need is a function contains_pod(). I suppose this should
: > be found in Pod::Parse. What's the status, Kenneth? Can we have
: > Pod::Parse in the near future?
:
: Not likely. I may get to do some work on it soon, but I doubt it can be
: beaten into shape. After the latest round of pod2html discussion I've
: gotten a bit fed up with POD, to be honest.

Your frustration over the difficulty of it can be taken as a sign of
how extremely useful it'll be when it's done... :-)

: Maybe I'll get it out of my system soon enough to do some work.

Would it help if I apologized? :-)

Larry

Re: random_module.3pm (Was: perl)

kjahds at kjahds

Aug 28, 1995, 7:25 PM

Post #4 of 8

On Mon, 28 Aug 1995, Larry Wall wrote:

> : On Mon, 28 Aug 1995, Andreas Koenig wrote:
> :
> : > What I really need is a function contains_pod(). I suppose this should
> : > be found in Pod::Parse. What's the status, Kenneth? Can we have
> : > Pod::Parse in the near future?
> :
> : Not likely. I may get to do some work on it soon, but I doubt it can be
> : beaten into shape. After the latest round of pod2html discussion I've
> : gotten a bit fed up with POD, to be honest.
>
> Your frustration over the difficulty of it can be taken as a sign of
> how extremely useful it'll be when it's done... :-)

Thanks, I think. ;-)

> : Maybe I'll get it out of my system soon enough to do some work.
>
> Would it help if I apologized? :-)

<Grin>

I suppose so. I just have to keep remembering your original statement of:

Note that I'm not at all claiming this to be
sufficient for producing a book. I'm just trying to
make an idiot-proof common source for nroff, TeX, and
other markup languages, as used for online
documentation. Both pod2html and pod2man translators
exist.

Unfortunately, it appears people have thought differently, and if not
actually trying to produce books, they certainly want nicely done
indexes, intelligent hypertext, and general applicability to numerous
formatters and languages.

Well, Larry, are there any changes you'd like to see in POD, or things
you'd like to see stay the same? I'm thinking about ripping POD down to
its roots, and then building it back up in a slightly different manner --
hopefully a more flexible one.

> Larry

--
Kenneth Albanowski (kjahds@kjahds.com, CIS: 70705,126)

Re: random_module.3pm (Was: perl)

lwall at scalpel

Aug 28, 1995, 8:49 PM

Post #5 of 8

: Unfortunately, it appears people have thought differently, and if not
: actually trying to produce books, they certainly want nicely done
: indexes, intelligent hypertext, and general applicability to numerous
: formatters and languages.

Yes, POD is good at making easy things easy, but fails at making hard
things possible. The pragmas would help there.

: Well, Larry, are there any changes you'd like to see in POD, or things
: you'd like to see stay the same?

The things I like about POD:

Verbatim paragraphs are absolutely verbatim, so I can slurp a program
in after testing it without having to count backslashes. At most
I have to indent it a little, which shouldn't change its meaning.

I can read it, because it almost looks like a Usenet article, because
it's block paragraphed.

Hence, I can reformat text paragraphs by hitting my F7 key.

Most everything else is negotiable. (As you've no doubt noticed. :-)

: I'm thinking about ripping POD down to
: its roots, and then building it back up in a slightly different manner --
: hopefully a more flexible one.

I think the basic problem we're coming up against is that for the thing
to be well-defined, you want to do it in one pass, but then it becomes
difficult to write the heuristics in Perl, which tends to prefer multiple
passes for that sort of thing. But the regexp/substitution semantics aren't
powerful enough to leave enough analytical residue from pass to pass to
keep things from interfering with each other. Hence the 7-bit hack.

The problem is probably solvable with the right data structure, but we
need to think more about applying heuristics to nodes in a syntax tree
than to paragraphs as a whole.

We need to distinguish constructs that every translator needs to deal
with heuristically from those that are peculiar to a particular translator.
Those which are done everywhere should be parsed, and those which aren't
shouldn't. Perhaps this is obvious. But it does mean that we have to
decide whether things like $foo warrant their own node in the syntax tree.
If only pod2man wants to turn $foo into C<$foo>, then it shouldn't. But
if we want to say that $foo is by nature code, and everybody has to
treat it that way, then it probably needs its own syntax tree node.

What would be nice is if we can define things well enough that we can
parse the POD without having to have hooks back to the calling code to
decide on the fly how to transform things heuristically. That is, we
just pass a $foo node back out to pod2man, and later pod2man could turn
that into C<$foo> and call back into POD::Parse to turn that into
syntax tree. (Or just treat $foo directly, but I'm using $foo to
potentially represent more complicated thingies.)

The problem is that the heuristics might depend on the current state.
Unless the $foo syntax node was marked with all the state that was in
effect when it was parsed, we could lose the fact that someone said

=for text translate_variables=0

Here is a $foo that is not to be turned into code.

The example is lame from beginning to end, but it illustrates that we
might be forced either to remember arbitrary state in the syntax tree,
or to process heuristics on the fly using callbacks.

Well, hey, at least the problem looks *interesting*. Sort of.

Larry

Re: random_module.3pm (Was: perl)

kjahds at kjahds

Aug 28, 1995, 9:53 PM

Post #6 of 8

On Mon, 28 Aug 1995, Larry Wall wrote:

> Yes, POD is good at making easy things easy, but fails at making hard
> things possible. The pragmas would help there.

In thinking about changing =over/=back to =begin list/=end list, I think
that's rather a better way to go in general. HTML certainly seems to live
well enough with ignore-me-if-you-don't-understand-me tags.

> : Well, Larry, are there any changes you'd like to see in POD, or things
> : you'd like to see stay the same?
>
> The things I like about POD:
>
> Verbatim paragraphs are absolutely verbatim, so I can slurp a program
> in after testing it without having to count backslashes. At most
> I have to indent it a little, which shouldn't change its meaning.

Absolutely. This is simple, elegant, and works.

> I can read it, because it almost looks like a Usenet article, because
> it's block paragraphed.
>
> Hence, I can reformat text paragraphs by hitting my F7 key.

Paragraphs in general work quite nicely. You can reflow the entire
document and not kill it -- a big benefit if you aren't writing in a
What-You-See-Is-Something-Graphically-Approaching-What-You'll-Get
environment.

> Most everything else is negotiable. (As you've no doubt noticed. :-)

:-)

> : I'm thinking about ripping POD down to
> : its roots, and then building it back up in a slightly different manner --
> : hopefully a more flexible one.
>
> I think the basic problem we're coming up against is that for the thing
> to be well-defined, you want to do it in one pass, but then it becomes
> difficult to write the heuristics in Perl, which tends to prefer multiple
> passes for that sort of thing. But the regexp/substitution semantics aren't
> powerful enough to leave enough analytical residue from pass to pass to
> keep things from interfering with each other. Hence the 7-bit hack.
>
> The problem is probably solvable with the right data structure, but we
> need to think more about applying heuristics to nodes in a syntax tree
> than to paragraphs as a whole.

This mirrors my thoughts. The obvious answer is to break it down into a
list of strings, with nested lists for the complex bits. A very
textish parse-tree, in other words.

Two schemes seem apparent. A paired scheme:

('',"normal text ",'B',"bold text",'',' more normal text',
'L',[list containing link data])

Or something less flat:

("normal text ",["B","bold text"]," more normal text",
["L",link,data,list])

Actually, I think the latter is necessary, because only in it can you say:

(["B","bold text, ",["I","bold & italic text"]," and bold text"])

Which in turn implies that a normal paragraph could be embedded in a
['normal',...] list, or pulled down any amount you wanted. Seems good
enough for a parse-tree.
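
A minimal sketch of the nested-list scheme above, assuming plain
strings are text and array references are [TAG, children...]. The
render_pod routine is a hypothetical helper that simply turns the tree
back into POD-style markup to show that the nesting survives:

```perl
use strict;
use warnings;

# Parse tree as nested lists: plain strings are text, array refs are
# [TAG, children...], mirroring the "less flat" scheme.
my @tree = (
    "normal text ",
    [ "B", "bold text, ", [ "I", "bold & italic text" ], " and bold text" ],
    " more normal text",
);

# Hypothetical renderer: walks the tree recursively and emits TAG<...>
# for every array-ref node.
sub render_pod {
    my (@nodes) = @_;
    my $out = '';
    for my $node (@nodes) {
        if (ref($node) eq 'ARRAY') {
            my ($tag, @children) = @$node;
            $out .= $tag . '<' . render_pod(@children) . '>';
        }
        else {
            $out .= $node;
        }
    }
    return $out;
}

print render_pod(@tree), "\n";
```

A real formatter would replace the TAG<...> output with its own
markup, but the recursive walk would look much the same.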

> We need to distinguish constructs that every translator needs to deal
> with heuristically from those that are peculiar to a particular translator.

This is one of the bits that has given me the most headaches. The thing
is, I rather think there aren't many of these. Sure, pod2man and pod2html
can (and do!) format `$bar' differently, but should they?

> Those which are done everywhere should be parsed, and those which aren't
> shouldn't. Perhaps this is obvious.

I encoded it in my first run as transformation heuristics. In one of the
preprocessing passes, it would turn `$foo' into `C<$foo>'. Obviously the
formatter could make its own transformations if it wanted to, past what
the parser did.

> But it does mean that we have to
> decide whether things like $foo warrant their own node in the syntax tree.
> If only pod2man wants to turn $foo into C<$foo>, then it shouldn't. But
> if we want to say that $foo is by nature code, and everybody has to
> treat it that way, then it probably needs its own syntax tree node.

An _excellent_ point. If we are going to use parse nodes instead of regexps,
then _nothing_ should have to do regexps. (And since regexps can't handle
nested commands and such, this is a win.)

> What would be nice is if we can define things well enough that we can
> parse the POD without having to have hooks back to the calling code to
> decide on the fly how to transform things heuristically. That is, we
> just pass a $foo node back out to pod2man, and later pod2man could turn
> that into C<$foo> and call back into POD::Parse to turn that into
> syntax tree. (Or just treat $foo directly, but I'm using $foo to
> potentially represent more complicated thingies.)

Well, I could see two directions here; either pull out stuff like "$foo"
and put it into a "guess" node (something that probably contains just
normal text, but that might mean something heuristically) and leave the
guess nodes in the parse tree for the formatter to do with as it pleases
at the end. But that means that the formatter couldn't feed the changes back
to the parser. (And if the parser is dealing with indexing and linking,
this could have nasty side-effects.)

The other way would be for the formatter to install a set of callbacks
(perhaps with regexp filters to specify how to generate guess-nodes?) that
can be invoked when a guess-node is generated to see what to do with it.
It would return a parse-tree that would be glued in in place of the
guess-node.
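
A minimal sketch of the callback direction, assuming guess-nodes are
represented as ['guess', text] and that the only heuristic is "looks
like a Perl scalar". The names make_guess_nodes and resolve_guesses are
hypothetical and not part of any existing Pod module:

```perl
use strict;
use warnings;

# Hypothetical parser step: split text into plain strings and
# ['guess', text] nodes wherever something looks like a Perl scalar.
sub make_guess_nodes {
    my ($text) = @_;
    my @nodes;
    # split with a capturing group keeps the matched variables.
    for my $piece (split /(\$\w+)/, $text) {
        next if $piece eq '';
        if ($piece =~ /^\$\w+$/) {
            push @nodes, [ 'guess', $piece ];
        }
        else {
            push @nodes, $piece;
        }
    }
    return @nodes;
}

# Replace each guess-node with whatever the formatter's callback returns.
sub resolve_guesses {
    my ($callback, @nodes) = @_;
    my @out;
    for my $node (@nodes) {
        if (ref($node) eq 'ARRAY' && $node->[0] eq 'guess') {
            push @out, $callback->($node->[1]);
        }
        else {
            push @out, $node;
        }
    }
    return @out;
}

# One formatter treats variables as code, another leaves them alone.
my $as_code  = sub { return [ 'C', $_[0] ] };
my $as_plain = sub { return $_[0] };

my @man  = resolve_guesses($as_code,  make_guess_nodes('Set $foo first.'));
my @html = resolve_guesses($as_plain, make_guess_nodes('Set $foo first.'));
print "man tags the variable as: $man[1][0]\n";
print "html keeps plain text: ", join('', @html), "\n";
```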

> The problem is that the heuristics might depend on the current state.
> Unless the $foo syntax node was marked with all the state that was in
> effect when it was parsed, we could lose the fact that someone said
>
> =for text translate_variables=0

(Side note: maybe that should be changed to =with)

>
> Here is a $foo that is not to be turned into code.

If we keep the entire state around (in globals? in a hash?) we could hand
that to the callback.
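
A minimal sketch of that idea, assuming the state lives in a hash, that
translate_variables defaults to on, and that directives use the
"=for text key=value" form from the example above. The routine name
process_paragraphs is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical driver: walks paragraphs, keeps parser state in a hash,
# and hands a snapshot of that state to the formatter's callback.
sub process_paragraphs {
    my ($callback, @paragraphs) = @_;
    my %state = (translate_variables => 1);    # assumed default
    my @out;
    for my $para (@paragraphs) {
        # Directive paragraphs update the state and produce no output.
        if ($para =~ /^=for\s+text\s+(\w+)=(\S+)/) {
            $state{$1} = $2;
            next;
        }
        my %snapshot = %state;    # copy, so callbacks cannot alter it
        push @out, $callback->($para, \%snapshot);
    }
    return @out;
}

# Example callback: wrap $variables in C<> only when the state allows it.
my $formatter = sub {
    my ($text, $state) = @_;
    $text =~ s/(\$\w+)/C<$1>/g if $state->{translate_variables};
    return $text;
};

print "$_\n" for process_paragraphs(
    $formatter,
    'Here is a $foo that becomes code.',
    '=for text translate_variables=0',
    'Here is a $foo that is not to be turned into code.',
);
```

Passing a copy per paragraph is the simplest way to "remember" the
state in effect at parse time; storing the same snapshot on each node
would be the syntax-tree variant of the same idea.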

> The example is lame from beginning to end, but it illustrates that we
> might be forced either to remember arbitrary state in the syntax tree,
> or to process heuristics on the fly using callbacks.

Yes, the point is made.

> Well, hey, at least the problem looks *interesting*. Sort of.

Well, here's the crucial bit: can you come up with a concrete example of a
heuristic something-or-other that one formatter should want to do
differently from another?

> Larry

--
Kenneth Albanowski (kjahds@kjahds.com, CIS: 70705,126)

Re: random_module.3pm (Was: perl)

lwall at scalpel

Aug 28, 1995, 10:22 PM

Post #7 of 8

: On Mon, 28 Aug 1995, Larry Wall wrote:
:
: > Yes, POD is good at making easy things easy, but fails at making hard
: > things possible. The pragmas would help there.
:
: In thinking about changing =over/=back to =begin list/=end list, I think
: that's rather a better way to go in general. HTML certainly seems to live
: well enough with ignore-me-if-you-don't-understand-me tags.

Except that the semantics of begin/end we were talking about before would
cause it to ignore everything between the =begin and the =end if it doesn't
recognize "list".
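
A minimal sketch of those begin/end semantics, assuming one paragraph
per list element and a hash of block types the translator recognizes.
The routine name filter_blocks is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical filter: drop everything between "=begin X" and "=end X"
# when X is not among the block types this translator recognizes.
sub filter_blocks {
    my ($known, @paragraphs) = @_;
    my @kept;
    my $skipping;    # name of the unrecognized block being skipped
    for my $para (@paragraphs) {
        if (defined $skipping) {
            undef $skipping if $para =~ /^=end\s+\Q$skipping\E\b/;
            next;
        }
        if ($para =~ /^=begin\s+(\w+)/ && !$known->{$1}) {
            $skipping = $1;
            next;
        }
        push @kept, $para;
    }
    return @kept;
}

my @pod        = ('=begin list', '=item one', '=end list', 'After the list.');
my %knows_list = (list => 1);
my %knows_none = ();
print join(' | ', filter_blocks(\%knows_list, @pod)), "\n";
print join(' | ', filter_blocks(\%knows_none, @pod)), "\n";
```

The second call shows the concern: a translator that does not know
"list" loses the list items entirely, not just the list formatting.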

: Well, here's the crucial bit: can you come up with a concrete example of a
: heuristic something-or-other that one formatter should want to do
: differently from another?

Good question. Someone needs to go through and enumerate all the tricks
that the current translators perform. Any vict...er, volunteers?

Larry

Re: random_module.3pm (Was: perl)

kjahds at kjahds

Aug 28, 1995, 10:48 PM

Post #8 of 8

On Mon, 28 Aug 1995, Larry Wall wrote:

> : On Mon, 28 Aug 1995, Larry Wall wrote:
> :
> : > Yes, POD is good at making easy things easy, but fails at making hard
> : > things possible. The pragmas would help there.
> :
> : In thinking about changing =over/=back to =begin list/=end list, I think
> : that's rather a better way to go in general. HTML certainly seems to live
> : well enough with ignore-me-if-you-don't-understand-me tags.
>
> Except that the semantics of begin/end we were talking about before would
> cause it to ignore everything between the =begin and the =end if it doesn't
> recognize "list".

Something more to think about. I'm not sure of the direction I want on
this. Currently over/back is the only one that would qualify for
begin/end so we don't have a good set of examples.

> : Well, here's the crucial bit: can you come up with a concrete example of a
> : heuristic something-or-other that one formatter should want to do
> : differently from another?
>
> Good question. Someone needs to go through and enumerate all the tricks
> that the current translators perform. Any vict...er, volunteers?

Oh well, I've done enough of this, this isn't too much trouble. (Oh, and
I'm not sure who put them in, but the xtended regexps are murder to read,
and things like "[^\s,\051]+" are simply obnoxious ;-)

Pod2man:

In the preprocessing stage:

func() becomes I<func()>
func(\d+) becomes I<func>\|(\d+) where \| is a slim space (half-n?)
$var becomes C<$var>

and some warnings about certain code that don't actually make changes. I'd
include them here, but it's too much trouble to understand the regexps
right now.
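
A minimal sketch of those three preprocessing rules, written as plain
substitutions. The regexps here are simplified assumptions and not the
ones pod2man actually uses, and the routine name preprocess is
hypothetical:

```perl
use strict;
use warnings;

# Hypothetical preprocessing pass modelled on the three rules above.
sub preprocess {
    my ($text) = @_;
    # func(3) style references: italicize the name, then \| and the section.
    $text =~ s/\b(\w+)\((\d+)\)/I<$1>\\|($2)/g;
    # func() style function references.
    $text =~ s/\b(\w+)\(\)/I<$1()>/g;
    # $var style scalar variables.
    $text =~ s/(\$\w+)/C<$1>/g;
    return $text;
}

print preprocess('Call open() before $fh is used; see grep(1).'), "\n";
```

Note that this naive version would wrap a $var that is already inside
C<...> a second time, which is the kind of pass-to-pass interference
the regexp approach runs into.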

In the postprocessing (output) stage:

normal translations of B<> and I<> to \fB, etc. I'll ignore these.
F<...> becomes I<...>
L<grep/(3)> becomes "the I<grep>(3) manpage"
L<page/foo()> becomes "the C<foo()> entry on the I<page> manpage" I think.
L</foo()> becomes something like "the C<foo()> entry elsewhere in this
document"
L<yet/another/variant> becomes "the section on I<$2> in the I<$1> manpage"

(right about now I'm beginning to get that
I-can't-stand-overloaded-overloading headache)

Then Z<>, C<>, and anything. All of this repeats several times to make
sure any progressive changes got fed through.
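
A minimal sketch of the L<> expansions, following the wording listed
above (which is itself approximate). The routine names and the exact
patterns are assumptions for illustration and are not taken from the
pod2man source:

```perl
use strict;
use warnings;

# Hypothetical expansion of the contents of one L<...> link into English.
sub expand_link {
    my ($link) = @_;
    if ($link =~ m{^/(.+\(\))$}) {
        return "the C<$1> entry elsewhere in this document";
    }
    if ($link =~ m{^([^/]+)/(.+\(\))$}) {
        return "the C<$2> entry on the I<$1> manpage";
    }
    if ($link =~ m{^([^/]+)/(.+)$}) {
        return "the section on I<$2> in the I<$1> manpage";
    }
    if ($link =~ m{^(\w+)(\(\w+\))$}) {
        return "the I<$1>$2 manpage";
    }
    return "the I<$link> manpage";
}

# Apply the expansion to every L<...> in a paragraph.
sub expand_links {
    my ($text) = @_;
    $text =~ s/L<([^>]+)>/expand_link($1)/ge;
    return $text;
}

print expand_links('See L<perlfunc/open()> and L<grep(3)>.'), "\n";
```

The English phrases are hard-coded in the expansion, so every output
format and every document language inherits them.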

In pod2html:

Uhm... Now I remember why I don't like pod2html. I'm afraid I can't dig
out the substitution bits for pod2html easily. The two passes are ensconced
within the parsing code in utterly intriguing ways that defy easy
description. I can say for certain that pod2html doesn't do the $var to
C<$var> conversion, for what it's worth.

In pod2latex:

It appears to be mostly the same as pod2html, with a few variations. One
seems to be in understanding that $foo[\d+] should be turned into
C<$foo[\d+]>. The postprocessing changes are a bit different, but mostly
similar.

Anyhow, from what I can see, the main points of contention seem to be on
what to do with `$bar', and the exact output of L<manpage(3f)> style
links. In HTML, it probably makes sense to keep these pristine with a
link to the page if possible.

(Come to think of it, the expansion of "the X section in the Y manpage"
might be a _complete_ misfeature if you are writing a pod in a different
language.)

I don't think I see any major preprocessing differences that would
warrant a full call-back scheme. I guess I'll have a go at writing a
POD-folder, and see how far a simple parse-tree gets me.

> Larry

--
Kenneth Albanowski (kjahds@kjahds.com, CIS: 70705,126)