Mailing List Archive: PDSC comments (was Re: mangled mail again)

On Mon, 09 Oct 1995 11:46:48 MDT, Tom Christiansen wrote:
>
>> By the way, your new Perl Data Structures manual (or whatever it's called)
>> is VERY HELPFUL.
>
>Glad you like it. I now have both the general tips and the
>lists of lists docs expanded into full form. I encourage feedback,
>although I sure haven't gotten much -- I guess it's all obvious stuff
>to folks on this list.
>

Obvious to us or not, these (esp. the expanded versions of the cookbook) are
real gems. I suspect Randal is gonna have to settle for selling only half
what would have been sold of the Camel-yet-unborn (not even counting the
effects of the pun with the Schwartzian Transform bit on the popular Perl
populace ;-). I'd even suggest having two sections for each recipe:

THE EASY WAY            vs.     THE HARD WAY

for $a (@a) {                   grep map { ..etc..
for $b (@b) { ..etc..

Now, calling which as being easy and which hard is open to a raging debate,
I suppose :-)

It's safe to say I haven't seen manpages that rival Perl's--period.

Here is some stuff that might be mentioned in the recursive data structures
page that's not yet there:

* Perl4-ers might still be tempted to go the typeglob way. A typeglob is a
polymorphic reference structure that has much magic attached to it
in special cases (e.g. filehandles). Typeglobs can also be subverted to
do truly strange things. Use straightforward references in preference
to typeglobs wherever you can.

This both makes it easy to understand what's going on and avoids some
mysteries of typeglob handling. (When the concept of references and
lexical scoping has been uniformly extended to FORMATs, FILEHANDLEs and
the rest of the types, maybe we won't need to use typeglobs at all).

* We usually need more than one statement to construct a self-referential
structure, i.e.,

$b = { a => { b => $b }}; # does not work: $b is still undefined here

will have to be written as:

$b = { a => {} };
$b->{a}{b} = $b;

* Not directly relevant to recursive data structs, but $a = $a->{b} will
cause a coredump in 5.001m without the patches.
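
The two-statement construction of a self-referential structure can be sketched as below. This is only an illustration: the variable names $node and $one_shot are made up for the example, and it assumes a reasonably current Perl rather than 5.001m.

```perl
use strict;
use warnings;

# Step 1: build the outer hash with an empty inner hash.
my $node = { a => {} };

# Step 2: now that $node holds a reference, store it inside itself.
$node->{a}{b} = $node;

# The structure is circular: following a -> b leads back to the same hash.
print "circular\n" if $node->{a}{b} == $node;

# The one-statement attempt stores undef, because the right-hand side is
# evaluated before the assignment to $one_shot happens.
my $one_shot;
$one_shot = { a => { b => $one_shot } };
print "inner slot is undef\n" unless defined $one_shot->{a}{b};
```

Comparing two references with == checks whether they point at the same underlying hash, which is what makes the first test meaningful.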

Other general comments (more in the nature of traps) that can be mentioned
in appropriate places:

* Hashes-of-Hashes are humungous memory hogs (well, at least for
now). Unless it is a small application, using lists of lists is to be
preferred. Never use HoH-es in objects that might get instantiated in
the 1000s. In fact, converting a hash-based class implementation to an
array-based one is a useful optimization with current Perl, and this
will also save on all those duplicate literal keys (duh) in them hashes.

* 'my($a) ; $a->[0] = 1;' does not work yet. The lexical has to be
pre-defined, as in 'my($a) = ""; $a->[0] = 1;'.

* The precedence of indirect object syntax just doesn't mix with the
dereferencing of data structures. Guess what this is:

meth $a->{foo} # calls $a->meth and derefs the result: ($a->meth)->{foo}

If you are predisposed to using indirect object syntax, this can be a
really hard one to trace, because it does not produce a warning (yet).
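
A small sketch of the indirect object trap follows. The class Widget and its methods are hypothetical, invented for illustration, and the example assumes a Perl where indirect object syntax is still enabled (the default unless something like `use v5.36` turns it off).

```perl
use strict;
use warnings;

package Widget;    # hypothetical class, for illustration only

sub new { my ($class) = @_; return bless { foo => 'field of object' }, $class }

# meth returns a hash reference that happens to have its own 'foo' key.
sub meth { return { foo => 'field of result' } }

package main;

my $obj = Widget->new;

# Indirect object syntax: parsed as ($obj->meth)->{foo}
my $surprise = meth $obj->{foo};

# What a reader may expect at first glance: the object's own field.
my $field = $obj->{foo};

print "$surprise\n";
print "$field\n";
```

Writing the call with an explicit arrow, as in $obj->meth, leaves no room for this misreading.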

I think there may be more, but those are the ones that have bitten me.

Keep up the amazing work on c.l.p.m, Tom.

- Sarathy.
gsar@engin.umich.edu

Thanks for everything.

> Obvious to us or not, these (esp. the expanded versions of the cookbook) are
> real gems. I suspect Randal is gonna have to settle for selling only half
> what would have been sold of the Camel-yet-unborn

Oh, there are easy ways to fix that. :-)

You know, I've always felt that the data structures are *THE* single most
important thing in the 5.0 release of perl, and it's a crying shame that I
have to answer 13 questions a day about it in email and usenet just
because it isn't written down anywhere. I'll tell you this: I've read
half-a-dozen or more current or about-to-be perl books of late, and not a
single one of them covers this critically important stuff, especially like
this. Neither has it been in any of the many outlines for perl book
proposals I've read. I think perhaps that few outside this forum actually
understand it.

So I've set about doing it myself. I've even in theory drafted Larry into
writing the final section on objects at the end of the PDSC, unless he
makes a whole Camel chapter on objects. I'm making sure that the whole
PDS issue is at last covered in the standard distributed docs in a
definitive way. Whether this opus (or these opera, I guess, if I do all I
want to do, and we take into account my other "Far More Than Everything
You Ever Wanted To Know..." writings) finally makes its way into the Camel
Reborn is of course not really up to me. I've certainly given them my
permission, though.

Um... in case Larry does want to use it, please nobody go off and
publish it on your own: it's supposed to be a gift from me to Larry --
and his wife and kids.

> It's safe to say I haven't seen manpages that rival Perl's--period.

I'm certainly working on it, but I got some complaint mail about them the
other day, claiming they don't work for "users" just "gurus", that they
spend too much time saying what not to do, or you won't understand this
till later, etc. Oh well. Manpages are not user guides. For that we
have large furry creatures that drink scant water. :-)

I think we need a perlio.pod as well -- filehandles and the whole i/o
system need some treatment (I've begun that on the side). Also,
perlre.pod needs a major facelift: one shouldn't have to go to perlop.pod
to find the regexp stuff.

> * Hashes-of-Hashes are humungous memory hogs (well, at least for
> now). Unless it is a small application, using lists of lists is to be
> preferred. Never use HoH-es in objects that might get instantiated in
> the 1000s. In fact, converting a hash-based class implementation to an
> array-based one is a useful optimization with current Perl, and this
> will also save on all those duplicate literal keys (duh) in them hashes.

You know, I thought that, too, but when I converted my mtg from
HoL to HoH, it grew almost not at all. I believe that this was some
bizarre malloc fragmentation leak. I've reported it before, and could
probably dig up the report again if you want.

> * 'my($a) ; $a->[0] = 1;' does not work yet. The lexical has to be
> pre-defined, as in 'my($a) = ""; $a->[0] = 1;'.

That's mentioned in perldsc.pod or perlLoL.pod somewhere, plus Larry's
already fixed it in his working version.
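
On a Perl that includes that fix, which covers the releases in common use today, an undefined lexical autovivifies when it is dereferenced for assignment. A minimal sketch, with $list as an illustrative name:

```perl
use strict;
use warnings;

my $list;              # declared but left undefined
$list->[0] = 1;        # autovivifies: $list becomes a reference to a new array

print ref($list), " with ", scalar(@$list), " element\n";
```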

BTW, I'm planning on fixing up my Struct.pm module and documenting it as a
way of getting slow method-based structs, like

$x = $r->field; # get
$r->field($val); # set

I'm still wondering whether I shouldn't use get_attr and set_attr
method names lest people want to use function member names that conflict
with data member names. But maybe we don't mind having a unified
member namespace. Hey, did anyone just read this paragraph?
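
One way a unified get/set method per field could be written is sketched below. This is not Struct.pm's actual code; the class name Point and the fields x and y are assumptions made up for the example.

```perl
use strict;
use warnings;

package Point;    # hypothetical class name

sub new { my ($class) = @_; return bless {}, $class }

# Generate one combined get/set method per data member.
for my $field (qw(x y)) {
    no strict 'refs';
    *{"Point::$field"} = sub {
        my $self = shift;
        $self->{$field} = shift if @_;   # called with an argument: set
        return $self->{$field};          # always return the current value
    };
}

package main;

my $r = Point->new;
$r->x(3);            # set
my $x = $r->x;       # get
print "x = $x\n";
```

With this layout a data member and a function member of the same name would collide, which is the trade-off weighed above against separate get_attr and set_attr names.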

> * The precedence of indirect object syntax just doesn't mix with the
> dereferencing of data structures. Guess what this is:

> meth $a->{foo} # calls $a->meth and derefs the result: ($a->meth)->{foo}

> If you are predisposed to using indirect object syntax, this can be a
> really hard one to trace, because it does not produce a warning (yet).

I know, I know. It's wicked. Even more wicked than C programmers
translating *x[i] into $$x[$i], which is also wrong. At least $ isn't
EXACTLY the same as * (but it's so close that I understand why they get
confused). But indirect object methods as super tight unary operators
that are tighter than -> is kinda mindblowing.
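
The *x[i] mistranslation can be shown with a short sketch. The variable names and values are made up for illustration; note that @x and $x are two unrelated variables here.

```perl
use strict;
use warnings;

my ($p, $q) = (10, 20);
my @x = (\$p, \$q);        # array of scalar references, like C's int *x[2]
my $x = [ 'a', 'b' ];      # a separate scalar holding an array reference
my $i = 1;

my $c_meaning  = ${ $x[$i] };   # element $i of @x, then dereference: *(x[i])
my $perl_reads = $$x[$i];       # dereference $x, then subscript: same as $x->[$i]

print "$c_meaning $perl_reads\n";
```

In C the subscript binds tighter than the *, while in Perl the leading $ binds to the variable first, so the braces are needed to get the C meaning.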

--tom