Mailing List Archive: Garbage Collection

Garbage Collection

Dec 15, 1995, 5:55 PM

Post #1 of 21 (709 views)

I've been documenting a phenomenal number of things lately;
e.g. perltie is getting to be nearly a FMTEYEWTK.

Here's a section I recently added I thought some of you
might be interested in. I'm probably unintentionally lying
here at some point, but I'm trusting Larry to set me straight.

Any ideas where this should go? L<perlobj> is my current best
place.

--tom

=head2 Two-Phased Garbage Collection

For most purposes, Perl uses a fast and simple reference-based
garbage collection system. For this reason, there's an extra
dereference going on at some level, so if you haven't built
your Perl executable using your C compiler's C<-O> flag, performance
will suffer. If you I<have> built Perl with C<cc -O>, then this
probably won't matter.

A more serious concern is that unreachable memory with a non-zero
reference count will not normally get freed. Therefore, this is a bad
idea:

{ my $a; $a = \$a; }

Even thought $a I<should> go away, it can't. When building recursive data
structures, you'll have to break the self-reference yourself explicitly
if you don't care to leak. For example, here's a self-referential
node such as one might use in a sophisticated tree structure:

sub new_node {
my $node = {};
$node->{LEFT} = $node->{RIGHT} = $node;
$node->{DATA} = [ @_ ];
return $node;
}

If you create nodes like that, they won't go away unless you break their
self reference yourself.

Almost.

When an interpreter thread finally shuts down (usually when your program
exits), then a rather costly but complete mark-and-sweep style of garbage
collection is performed, and everything allocated by that thread gets
destroyed. This is essential to support Perl as an embedded or a
multithreadable language. For example, this program demonstrates Perl's
two-phased garbage collection:

#!/usr/bin/perl
package Subtle;

sub new {
my $test;
$test = \$test;
warn "CREATING " . \$test;
return bless \$test;
}

sub DESTROY {
my $self = shift;
warn "DESTROYING $self";
}

package main;

warn "starting program";
{
my $a = Subtle->new;
my $b = Subtle->new;
$$a = 0; # break selfref
warn "leaving block";
}

warn "just exited block";
warn "time to die...";
exit;

When run as F</tmp/test>, the following output is produced:

starting program at /tmp/test line 18.
CREATING SCALAR(0x8e5b8) at /tmp/test line 7.
CREATING SCALAR(0x8e57c) at /tmp/test line 7.
leaving block at /tmp/test line 23.
DESTROYING Subtle=SCALAR(0x8e5b8) at /tmp/test line 13.
just exited block at /tmp/test line 26.
time to die... at /tmp/test line 27.
DESTROYING Subtle=SCALAR(0x8e57c) during global destruction.

Notice that "global destruction" bit there? That's the thread
garbage collector reaching the unreachable.

Re: Garbage Collection [ In reply to ]

Tim.Bunce at ig

Dec 15, 1995, 7:13 PM

Post #2 of 21 (711 views)

> From: Tom Christiansen <tchrist@mox.perl.com>

> =head2 Two-Phased Garbage Collection
>
> If you create nodes like that, they won't go away unless you break their
> self reference yourself.

Sometimes you want the destruction to happen 'naturally' when the
_application_ has undef'd _its_ last reference.

This is possible by using an extra level of indirection:

A --> B --> C --.
^ |
`---'

When the applications last ref A is undef'd perl will call DESTROY
for B. B's DESTROY can then arrange for C's refs to be destroyed.

This technique is probably worth mentioning somewhere.

Tim.

Re: Garbage Collection [ In reply to ]

Dec 16, 1995, 12:59 AM

Post #3 of 21 (712 views)

On Fri, 15 Dec 1995 17:55:41 MST, Tom Christiansen wrote:
>If you create nodes like that, they won't go away unless you break their
>self reference yourself.
>
>Almost.

I too had thought about this some when tackling the leak reported by Alan
Burlison. Other similar scenarios that are even more difficult to avoid is
when two (or more) temps are pointing at each other.

while (1) { my $a = []; my $b = []; $a->[0] = $b; $b->[0] = $a; }

You could accumulate huge cyclic temp bogosities in loops very easily.

What we need is global-destruction on demand. Just doing it when
the interp shuts down isn't much good in loops that have temps that
refer to themselves.

My initial thoughts were that sv_setsv() (I can hear Larry cursing) can
be made to recognize assignment of RVs to parts within themselves and
taught not to increment the REFCNT, but that will come at a
substantial cost, because it will have to be done via the equivalent
of a grep() through every node in the structure. The alternative is
to implement the equivalent thing to be done at user specified
checkpoints, and readjust REFCNT for cyclic refs. Both approaches
are doable, but the former is expensive, the latter is ugly.

>
>Notice that "global destruction" bit there? That's the thread
>garbage collector reaching the unreachable.
>

Ahh, but you are missing the fact that it is only implemented for
objects. Sorry, no plain refs, no sir.

Raphael was talking about a mark-and-sweep gc for malloc, which seems
the most generic way to go to tackle this.

- Sarathy.
gsar@engin.umich.edu

Re: Garbage Collection [ In reply to ]

Dec 16, 1995, 1:16 AM

Post #4 of 21 (709 views)

On Sat, 16 Dec 1995 02:13:52 GMT, Tim Bunce wrote:
>
>Sometimes you want the destruction to happen 'naturally' when the
>_application_ has undef'd _its_ last reference.
>
>This is possible by using an extra level of indirection:
>
> A --> B --> C --.
> ^ |
> `---'
>
>When the applications last ref A is undef'd perl will call DESTROY
>for B. B's DESTROY can then arrange for C's refs to be destroyed.
>

That should be made block-scope rather than application-scope to be
more useful.

- Sarathy.
gsar@engin.umich.edu

Re: Garbage Collection [ In reply to ]

Dec 18, 1995, 8:39 AM

Post #5 of 21 (712 views)

>>Notice that "global destruction" bit there? That's the thread
>>garbage collector reaching the unreachable.
>>

>Ahh, but you are missing the fact that it is only implemented for
>objects. Sorry, no plain refs, no sir.

How's this:

Notice that "global destruction" bit there? That's the thread garbage
collector reaching the unreachable. Only blessed objects get cleaned
up in global destruction, though, not regular memory. A more complete
garbage collection strategy will be implemented at a future date.

--tom

Re: Garbage Collection [ In reply to ]

lwall at scalpel

Dec 18, 1995, 9:44 AM

Post #6 of 21 (711 views)

: >>Notice that "global destruction" bit there? That's the thread
: >>garbage collector reaching the unreachable.
: >>
:
: >Ahh, but you are missing the fact that it is only implemented for
: >objects. Sorry, no plain refs, no sir.
:
: How's this:
:
: Notice that "global destruction" bit there? That's the thread garbage
: collector reaching the unreachable. Only blessed objects get cleaned
: up in global destruction, though, not regular memory. A more complete
: garbage collection strategy will be implemented at a future date.

Well, that's all very well, other than the fact that it's simply not
true. Plain refs are indeed garbage collected if destruct_level > 0.
GS was, er, being creative.

It's true that objects are always destructed, even when refs aren't.
It's also true that they're destructed in a separate pass before
ordinary refs, to try to prevent object destructors from using refs
that have been destructed.

You can test the higher levels of global destruction by setting the
PERL_DESTRUCT_LEVEL environment variable. (Presuming -DDEBUGGING is
enabled.)

Larry

Re: Garbage Collection [ In reply to ]

lwall at scalpel

Dec 18, 1995, 10:35 AM

Post #7 of 21 (711 views)

: >You can test the higher levels of global destruction by setting the
: PERL_DESTRUCT_LEVEL environment variable. (Presuming -DDEBUGGING is
: >enabled.)
: >
:
: Why should it be tied to -DDEBUGGING?

I mostly put it in for my own use when I was doing Purify runs. I think
if there's a need for a public interface, it should be through a pragma:

use destructlevel 2;

(This would use the max of all specified destruct levels.)

Actually, that would have to be shortened, to get within the 14 char limit.

Larry

Re: Garbage Collection [ In reply to ]

Dec 18, 1995, 10:35 AM

Post #8 of 21 (717 views)

On Mon, 18 Dec 1995 08:39:22 MST, Tom Christiansen wrote:
>
>How's this:
>
[..]
> Only blessed objects get cleaned
> up in global destruction, though, not regular memory. [..]

Since you asked, I'd rather have:
Currently, only objects get cleaned
out during global destruction, not plain references that have been so
"lost". [..]

- Sarathy.
gsar@engin.umich.edu

Re: Garbage Collection [ In reply to ]

Dec 18, 1995, 11:08 AM

Post #9 of 21 (711 views)

On Mon, 18 Dec 1995 09:44:30 PST, Larry Wall wrote:
>
>Well, that's all very well, other than the fact that it's simply not
>true. Plain refs are indeed garbage collected if destruct_level > 0.
>GS was, er, being creative.
>

Er, huh, not the first time, surely, that is. :-)

>
>You can test the higher levels of global destruction by setting the
PERL_DESTRUCT_LEVEL environment variable. (Presuming -DDEBUGGING is
>enabled.)
>

Why should it be tied to -DDEBUGGING?

- Sarathy.
gsar@engin.umich.edu

Re: [PERL] Garbage Collection [ In reply to ]

chip at atlantic

Dec 18, 1995, 12:17 PM

Post #10 of 21 (711 views)

Re: Garbage Collection [ In reply to ]

ram at hptnos02

Dec 18, 1995, 1:52 PM

Post #11 of 21 (712 views)

Quoting Gurusamy Sarathy:
:Raphael was talking about a mark-and-sweep gc for malloc, which seems
:the most generic way to go to tackle this.

Yes, it's still planned. Although I need to defer this to after 5.002
for various reasons (one of them being a code stability issue, but other
reasons are more personal -- understand: critical lack of free time).

The mark & sweep is simple, yet fairly efficient. Copying collectors
are out of the question, given all the impact moving references can
have... The real difficulty is not to code the traversal (mark)
phase, but rather the sweep phase. The mixing of zone and traditional
malloc allocation (SV pool vs. plain allocation) being a pain to sweep.

Raphael

Re: [PERL] Garbage Collection [ In reply to ]

Dec 18, 1995, 2:53 PM

Post #12 of 21 (712 views)

On Mon, 18 Dec 1995 14:17:28 EST, Chip Salzenberg wrote:
>How about:
>
> use destruction level => 2;
>
>? Actually, it would be better to get away from numeric levels and
>get into mnemonic actions.

So why not just:

use destruct qw(objects all verbose);

corresponding to destruct_level of 0, 1 and 2?

- Sarathy.
gsar@engin.umich.edu

Re: Garbage Collection [ In reply to ]

chip at atlantic

Dec 18, 1995, 3:15 PM

Post #13 of 21 (710 views)

Re: Garbage Collection [ In reply to ]

Dec 19, 1995, 7:48 AM

Post #14 of 21 (711 views)

How's this:

Objects are always destructed, even when regular refs aren't and in
fact are destructed in a separate pass before ordinary refs just to
try to prevent object destructors from using refs that have been
themselves destructed. Plain refs are only garbage collected if the
destruct level is greater than 0. You can test the higher levels of
global destruction by setting the PERL_DESTRUCT_LEVEL environment
variable, presuming C<-DDEBUGGING> was enabled during perl build
time.

Sometimes I think my job consists mainnly of following around
Larry's writings and saving them for posterity.

--tom

--tom

Re: Garbage Collection [ In reply to ]

h.b.furuseth at usit

Dec 19, 1995, 8:46 AM

Post #15 of 21 (711 views)

> If you create nodes like that, they

*** currently ***

> won't go away unless you break their self reference yourself.

Please sprinkle a few such words around, so users won't get the idea
that this is a feature they can depend on. Wouldn't it be fun to insert
a real garbage collector some day and then receive questions about how
to disable it...

Regards,

Hallvard

Re: Garbage Collection [ In reply to ]

Dec 19, 1995, 9:12 AM

Post #16 of 21 (711 views)

ok.

If you create nodes like that, they (currently) won't go away unless
you break their self reference yourself. (In other words, this is not
to be construed as a feature, and you shouldn't depend on it.)

--tom

Re: Garbage Collection [ In reply to ]

Dec 19, 1995, 9:17 AM

Post #17 of 21 (711 views)

On Tue, 19 Dec 1995 07:48:12 MST, Tom Christiansen wrote:
>How's this:
>
> Objects are always destructed, even when regular refs aren't and in
> fact are destructed in a separate pass before ordinary refs just to
> try to prevent object destructors from using refs that have been
> themselves destructed. Plain refs are only garbage collected if the
> destruct level is greater than 0. You can test the higher levels of
> global destruction by setting the PERL_DESTRUCT_LEVEL environment
> variable, presuming C<-DDEBUGGING> was enabled during perl build
> time.
>

That looks fine (for perl5.002b1f).

However, here's another half-baked idea. Given that Larry has murmured
something about making it a pragma and without the dependency on
-DDEBUGGING, it might make sense to not document too much of the details
just yet (assuming someone is going to be forthcoming with them patches soon
enough -- me? Shiver.)

- Sarathy.
gsar@engin.umich.edu

Re: Garbage Collection [ In reply to ]

lwall at scalpel

Dec 19, 1995, 9:29 AM

Post #18 of 21 (710 views)

: Sometimes I think my job consists mainnly of following around
: Larry's writings and saving them for posterity.

Appropriate, considering the Subject line. :-)

Larry

Re: Garbage Collection [ In reply to ]

Tim.Bunce at ig

Dec 29, 1995, 11:26 AM

Post #19 of 21 (711 views)

Dredging my mailbox turned up this one:

> From: Gurusamy Sarathy <gsar@engin.umich.edu>
>
> On Sat, 16 Dec 1995 02:13:52 GMT, Tim Bunce wrote:
> >
> >Sometimes you want the destruction to happen 'naturally' when the
> >_application_ has undef'd _its_ last reference.
> >
> >This is possible by using an extra level of indirection:
> >
> > A --> B --> C --.
> > ^ |
> > `---'
> >
> >When the applications last ref A is undef'd perl will call DESTROY
> >for B. B's DESTROY can then arrange for C's refs to be destroyed.
>
> That should be made block-scope rather than application-scope to be
> more useful.

Eh? I don't follow you here Gurusamy.

Tim.

Re: Garbage Collection [ In reply to ]

Dec 29, 1995, 1:45 PM

Post #20 of 21 (726 views)

On Fri, 29 Dec 1995 18:26:17 GMT, Tim Bunce wrote:
>Dredging my mailbox turned up this one:
>
>> From: Gurusamy Sarathy <gsar@engin.umich.edu>
>>
>> On Sat, 16 Dec 1995 02:13:52 GMT, Tim Bunce wrote:
>> >
>> >Sometimes you want the destruction to happen 'naturally' when the
>> >_application_ has undef'd _its_ last reference.
>> >
>> >This is possible by using an extra level of indirection:
>> >
>> > A --> B --> C --.
>> > ^ |
>> > `---'
>> >
>> >When the applications last ref A is undef'd perl will call DESTROY
>> >for B. B's DESTROY can then arrange for C's refs to be destroyed.
>>
>> That should be made block-scope rather than application-scope to be
>> more useful.
>
>Eh? I don't follow you here Gurusamy.
>

I meant s/application/block/g in your description. So that self-referential
lexicals that were *created* in the block don't become "zombies" that hang
around until the application dies, but instead get cleared out as soon as
the block exits.

Like in this much-bandied-about example:

{
my $a = [];
$a->[0] = $a;
}
# $a ought to be destroyed here, but isn't

- Sarathy.
gsar@engin.umich.edu

Re: Garbage Collection [ In reply to ]

Dec 29, 1995, 2:04 PM

Post #21 of 21 (711 views)

On Fri, 29 Dec 1995 15:45:46 EST, I wrote:
>
>Like in this much-bandied-about example:
>

Huh, I kinda left the outermost braces to the imagination there,
didn't I?

{
> {
> my $a = [];
> $a->[0] = $a;
> }
> # $a ought to be destroyed here, but isn't
}

- Sarathy.
gsar@engin.umich.edu