Mailing List Archive

Text about cycle GC
I've written some text for the What's New article on the GCing of
cycles, but it wasn't a topic I paid close attention to at the time,
so I'd like corrections. Here's what I've got; please send me
comments privately.

6. Optional Collection of Cycles

The C implementation of Python uses reference counting to implement
garbage collection. Every Python object maintains a count of the
number of references pointing to itself, and adjusts the count as
references are created or destroyed. Once the reference count reaches
zero, the object is no longer accessible, since you need to have a
reference to an object to access it, and if the count is zero, no
references exist any longer.

Reference counting has some pleasant properties: it's easy to
understand and implement, and the resulting implementation is
portable, fairly fast, and reacts well with other libraries that
implement their own memory handling schemes. The major problem with
reference counting is that it sometimes doesn't realise that objects
are no longer accessible, resulting in a memory leak. This happens
when there are cycles of references.

Consider the simplest possible cycle, a class instance which has a
reference to itself:

instance = SomeClass()
instance.myself = instance

After the above two lines of code have been executed, the reference
count of instance is 2; one reference is from the variable named
"'instance'", and the other is from the "myself"attribute of the
instance.

If the next line of code is del instance, what happens? The reference
count of instance is decreased by 1, so it has a reference count of 1;
the reference in the "myself" attribute still exists. Yet the instance
is no longer accessible through Python code, and it could be deleted.
Several objects can participate in a cycle if they have references to
each other, causing all of the objects to be leaked.

An experimental step has been made toward fixing this problem. When
compiling Python, the -with-cycle-gc (XXX correct option flag?) option
can be specified. This causes a cycle detection algorithm to be
periodically executed, which looks for inaccessible cycles and deletes
the objects involved.

Why isn't this enabled by default? Running the cycle detection
algorithm takes some time, and some tuning will be required to
minimize the overhead cost. It's not yet obvious how much performance
is lost, because benchmarking this is tricky and depends sensitively
on how often the program creates and destroys objects. XXX is this
actually the correct reason? Or is it fear of breaking software that
runs happily while leaving garbage?

Several people worked on this problem. Early versions were written by
XXX1, XXX2. (I vaguely remember several people writing first cuts at
this. Anyone recall who?) The implementation that's in Python 1.6 is a
rewritten version, this time done by Neil Schemenauer. Lots of other
people offered suggestions along the way, such as (in alphabetical
order) Marc-André Lemburg, Tim Peters, Greg Stein, Eric Tiedemann.
(XXX other names? If you feel you should be added, e-mail me). The
March 2000 archives of the python-dev mailing list contain most of the
relevant discussion, especially in the threads titled ``Reference
cycle collection for Python'' and ``Finalization again''.
Re: Text about cycle GC [ In reply to ]
"A.M. Kuchling" wrote:
>
> ...
>
> An experimental step has been made toward fixing this problem. When
> compiling Python, the -with-cycle-gc (XXX correct option flag?) option
> can be specified. This causes a cycle detection algorithm to be
> periodically executed, which looks for inaccessible cycles and deletes
> the objects involved.

I propose (without any investigation into the difficulty of
implementation) that

import cyclicgc

should turn on the -with-cycle-gc flag.

Then you could do a

import cyclicgc
import someDOM

And not worry about cycles. If it's too slow, you remove the line and
take responsibility for cycle checking. The important thing is that you
don't have to tell YOUR USERS to turn on GC in order for your code to
work.

I think that this model will get more people actively using (and
depending upon!) GC and thus give us a better idea about usage patterns,
stability and performance. This is especially important in the beta
period when we are trying to shake out bugs.

Really, this comes back to Greg's point that we should not have
incompatible sub-languages running around. A module depends on the
feature or it doesn't. The end-user should not be responsible for
knowing whether any module in a whole program needs GC.

--
Paul Prescod - Not encumbered by corporate consensus
The calculus and the rich body of mathematical analysis to which it
gave rise made modern science possible, but it was the algorithm that
made the modern world possible.
- The Advent of the Algorithm (pending), by David Berlinski
Re: Text about cycle GC [ In reply to ]
>>>>> "PP" == Paul Prescod <paul@prescod.net> writes:

PP> "A.M. Kuchling" wrote:
>> ...
>>
>> An experimental step has been made toward fixing this
>> problem. When compiling Python, the -with-cycle-gc (XXX correct
>> option flag?) option can be specified. This causes a cycle
>> detection algorithm to be periodically executed, which looks for
>> inaccessible cycles and deletes the objects involved.

PP> I propose (without any investigation into the difficulty of
PP> implementation) that

PP> import cyclicgc

PP> should turn on the -with-cycle-gc flag.

The -with-cycle-gc flag is an option to configure when you build
Python. It can't be turned on or off at runtime.

We hope that everyone will build with the -with-cycle-gc flag during
the beta testing, but we don't expect to have enough confidence in it
by final release to make it anything other than an experimental
option.

Jeremy
Re: Text about cycle GC [ In reply to ]
Jeremy Hylton wrote:
>
> ....
>
> The -with-cycle-gc flag is an option to configure when you build
> Python. It can't be turned on or off at runtime.

Doh! Sorry. Andrew's text does say that but I didn't read carefully and
testing it with my stuff hasn't floated to the top yet (because when I
looked into it, I realized I'd have to recompile...double doh!)
Compiling is really slow on my "transitional" computer.

I'm a little disappointed that it will be so difficult for people to
test with GC on, but I'm not going to claim that there is some easy way
to make it a runtime option because I haven't looked at it at all. There
probably isn't.

--
Paul Prescod - Not encumbered by corporate consensus
The calculus and the rich body of mathematical analysis to which it
gave rise made modern science possible, but it was the algorithm that
made the modern world possible.
- The Advent of the Algorithm (pending), by David Berlinski
Re: Text about cycle GC [ In reply to ]
On Wed, Jun 28, 2000 at 11:09:38AM -0700, Paul Prescod wrote:
> I'm a little disappointed that it will be so difficult for people to
> test with GC on, but I'm not going to claim that there is some easy way
> to make it a runtime option because I haven't looked at it at all. There
> probably isn't.

There isn't. With an interpreter compiled --with-cycle-gc you can
turn off the collection with gc.set_threshold(0) but that's not
the same thing.

Neil
Re: Text about cycle GC [ In reply to ]
> On Wed, Jun 28, 2000 at 11:09:38AM -0700, Paul Prescod wrote:
> > I'm a little disappointed that it will be so difficult for people to
> > test with GC on, but I'm not going to claim that there is some easy way
> > to make it a runtime option because I haven't looked at it at all. There
> > probably isn't.
>
> There isn't. With an interpreter compiled --with-cycle-gc you can
> turn off the collection with gc.set_threshold(0) but that's not
> the same thing.
>
> Neil

I wonder if we should turn this option *on* during beta testing?

That way we gather a lot more experience with its use! Maybe nobody
complains and we can leave it on in the final release...

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: Text about cycle GC [ In reply to ]
On Wed, Jun 28, 2000 at 03:59:46PM -0500, Guido van Rossum wrote:
> > On Wed, Jun 28, 2000 at 11:09:38AM -0700, Paul Prescod wrote:
> > > I'm a little disappointed that it will be so difficult for people to
> > > test with GC on, but I'm not going to claim that there is some easy way
> > > to make it a runtime option because I haven't looked at it at all. There
> > > probably isn't.
> >
> > There isn't. With an interpreter compiled --with-cycle-gc you can
> > turn off the collection with gc.set_threshold(0) but that's not
> > the same thing.
> >
> > Neil
>
> I wonder if we should turn this option *on* during beta testing?
>
> That way we gather a lot more experience with its use! Maybe nobody
> complains and we can leave it on in the final release...

That presumes an adequate number of testers and a wide variety of
deployment/usage scenarios. Given the relatively short timeframe, that may
be a bit aggressive.

I *do* agree with enabling it during the beta. But I'm just not sure about
leaving it on.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
Re: Text about cycle GC [ In reply to ]
On 28 June 2000, Jeremy Hylton said:
> PP> import cyclicgc
>
> PP> should turn on the -with-cycle-gc flag.
>
> The -with-cycle-gc flag is an option to configure when you build
> Python. It can't be turned on or off at runtime.

No, no, *obviously* Paul meant that "import cyclicgc" should configure
and compile a new Python interpreter with "--with-cycle-gc" and 'exec()'
it for you. What else would it mean?

;-)

Greg

(PS. if this were Perl, you could say

use cyclegc;

to turn it on, and then elsewhere in your code

no cyclegc;

to turn it off. Maybe Python needs an "unimport" command?)

(Kidding!)
Re: Text about cycle GC [ In reply to ]
Guido van Rossum wrote:
>
> I wonder if we should turn this option *on* during beta testing?

Why not, as long as --without-gc does not suffer from bugs hidden by
--with-gc.

>
> That way we gather a lot more experience with its use! Maybe nobody
> complains and we can leave it on in the final release...

This is risky. We don't have enough experience with this implementation.
All we know is that things get slower and consume more memory. Without
a clear picture of the impact of this GC implementation, it's probably
a bad idea to enable (i.e. impose) it by default.

The "optional experimental feature" label is good and IMO it complies
with anybody's expectations. Interested people will jump in and will
eventually contribute with improvements, those who don't care won't
and will live happily without it (at least until the day the feature
and its implementation reach their expected maturity).

--
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252
Re: Text about cycle GC [ In reply to ]
Vladimir Marangozov wrote:
>
>...
>
> This is risky. We don't have enough experience with this implementation.
> All we know is that things get slower and consume more memory. Without
> a clear picture of the impact of this GC implementation, it's probably
> a bad idea to enable (i.e. impose) it by default.

Won't we have a clear picture by the end of the beta period?

> The "optional experimental feature" label is good and IMO it complies
> with anybody's expectations. Interested people will jump in and will
> eventually contribute with improvements, those who don't care won't
> and will live happily without it (at least until the day the feature
> and its implementation reach their expected maturity).

A million Window users will live unhappily without it because they don't
know how to recompile to get it!

--
Paul Prescod - Not encumbered by corporate consensus
The calculus and the rich body of mathematical analysis to which it
gave rise made modern science possible, but it was the algorithm that
made the modern world possible.
- The Advent of the Algorithm (pending), by David Berlinski
Re: Text about cycle GC [ In reply to ]
guido wrote:
> I wonder if we should turn this option *on* during beta testing?

+1 from me.

> That way we gather a lot more experience with its use! Maybe nobody
> complains and we can leave it on in the final release...

I won't mind (unless it breaks my code, of course).

</F>
Re: Text about cycle GC [ In reply to ]
> Won't we have a clear picture by the end of the beta period?

Not clear. Unfortunately, few people who run important apps will
download a beta and try it. But they *will* download a new release
labeled "final" and install it without making sure it works for their
app. This has been my experience throughout Python's life. I've
become pretty conservative about staying backwards compatible as a
result...

> > The "optional experimental feature" label is good and IMO it complies
> > with anybody's expectations. Interested people will jump in and will
> > eventually contribute with improvements, those who don't care won't
> > and will live happily without it (at least until the day the feature
> > and its implementation reach their expected maturity).
>
> A million Window users will live unhappily without it because they don't
> know how to recompile to get it!

This can be solved the way we solve everything for Windows users: give
them two distributions to choose from -- or maybe one distribution
with a checkbox to choose which version to use.

--Guido van Rossum (home page: http://www.python.org/~guido/)