Mailing List Archive

Quick-and-dirty weak references
This week again I was bitten by the fact that Python doesn't have any
form of weak references, and while I was toying with some ideas I came
up with the following quick-and-dirty scheme that I thought I'd bounce
off this list. I might even volunteer to implement it, if people agree
it is worth it:-)

We add a new builtin function (or a module with that function)
weak(). This returns a weak reference to the object passed as a
parameter. A weak object has one method: strong(), which returns the
corresponding real object or raises an exception if the object doesn't
exist anymore. For convenience we could add a method exists() that
returns true if the real object still exists.

Now comes the bit that I'm unsure about: to implement this I need to
add a pointer to every object. This pointer is either NULL or points
to the corresponding weak objectt (so for every object there is either no
weak reference object or exactly one). But, for the price of 4 bytes extra
in every object we get the nicety that there is little cpu-overhead:
refcounting macros work identical to the way they do now, the only
thing to take care of is that during object deallocation we have to
zero the weak pointer. (actually: we could make do with a single bit
in every object, with the bit meaning "this object has an associated
weak object". We could then use a global dictionary indexed by object
address to find the weak object)

From here on life is easy: the weak object is a normal refcounted
object with a pointer to the real object as its only data. weak()
creates the weak object if it doesn't exist and returns the existing
(and INCREFfed) weak object if it does. Strong() checks that
self->object->weak == self and returns self->object (INCREFfed) if it
is. This works on all platforms that I'm aware of, but it could break
if there are any (Python) platforms that can have objects at VM
addresses that are later, when the object has been free()d, become
invalid addresses. And even then a vmaddrvalid() function, only needed
in the strong() method, could solve this.

The weak object isn't transparent, because you have to call strong()
before you can do anything with it, but this is an advantage (says he,
aspiring to a career in politics or sales:-): with a transparent weak
object the object could disappear at unexpected moments and with this
scheme it can't, because when you have the object itself in hand you
have a refcount too.
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm
Re: Quick-and-dirty weak references [ In reply to ]
Jack Jansen wrote:
>
> This week again I was bitten by the fact that Python doesn't have any
> form of weak references, and while I was toying with some ideas I came
> up with the following quick-and-dirty scheme that I thought I'd bounce
> off this list. I might even volunteer to implement it, if people agree
> it is worth it:-)

Have you checked the weak reference dictionary implementation
by Dieter Maurer ? It's at:

http://www.handshake.de/~dieter/weakdict.html

While I like the idea of having weak references in the core,
I think 4 extra bytes for *every* object is just a little
too much. The flag bit idea (with the added global dictionary
of weak referenced objects) looks promising though.

BTW, how would this be done in JPython ? I guess it doesn't
make much sense there because cycles are no problem for the
Java VM GC.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 139 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
RE: Quick-and-dirty weak references [ In reply to ]
[Jack Jansen]
> ...

A long time ago, Dianne Hackborn actually implemented a scheme like this,
under the name VREF (for "virtual reference", or some such). IIRC,
differences from your scheme were mainly that:

1) There was an elaborate proxy mechanism to avoid having to explicitly
strengthen the weak.

2) Each object contained a pointer to a linked list of associated weak refs.

This predates DejaNews, so may be a pain to find.

> ...
> We add a new builtin function (or a module with that function)
> weak(). This returns a weak reference to the object passed as a
> parameter. A weak object has one method: strong(), which returns the
> corresponding real object or raises an exception if the object doesn't
> exist anymore.

This interface appears nearly isomorphic to MIT Scheme's "hash" and "unhash"
functions, except that their hash returns an (unbounded) int and guarantees
that hash(o1) != hash(o2) for any distinct objects o1 and o2 (this is a
stronger guarantee than Python's "id", which may return the same int for
objects with disjoint lifetimes; the other reason object address isn't
appropriate for them is that objects can be moved by garbage collection, but
hash is an object invariant).

Of course unhash(hash(o)) is o, unless o has been gc'ed; then unhash raises
an exception. By most accounts (I haven't used it seriously myself), it's a
usable interface.

> ...
> to implement this I need to add a pointer to every object.

That's unattractive, of course.

> ...
> (actually: we could make do with a single bit in every object, with
> the bit meaning "this object has an associated weak object". We could
> then use a global dictionary indexed by object address to find the
> weak object)

Is a single bit actually smaller than a pointer? For example, on most
machines these days

#define PyObject_HEAD \
int ob_refcnt; \
struct _typeobject *ob_type;

is two 4-byte fields packed solid already, and structure padding prevents
adding anything less than a 4-byte increment in reality. I guess on Alpha
there's a 4-byte hole here, but I don't want weak pointers enough to switch
machines <wink>.

OTOH, sooner or later Guido is going to want a mark bit too, so the other
way to view this is that 32 new flag bits are as cheap as one <wink>.

There's one other thing I like about this: it can get rid of the dicey

> Strong() checks that self->object->weak == self and returns
> self->object (INCREFfed) if it is.

check. If object has gone away, you're worried that self->object may (on
some systems) point to a newly-invalid address. But worse than that, its
memory may get reused, and then self->object may point into the *middle* of
some other object where the bit pattern at the "weak" offset just happens to
equal self.

Let's try a sketch in pseduo-Python, where __xxx are secret functions that
do the obvious things (and glossing over thread safety since these are
presumably really implemented in C):

# invariant: __is_weak_bit_set(obj) == id2weak.has_key(id(obj))
# So "the weak bit" is simply an optimization, sparing most objects
# from a dict lookup when they die.
# The invariant is delicate in the presence of threads.

id2weak = {}

class _Weak:
def __init__(self, obj):
self.id = id(obj) # obj's refcount not bumped
__set_weak_bit(obj)
id2weak[self.id] = self
# note that "the system" (see below) sets self.id
# to None if obj dies

def strong(self):
if self.id is None:
raise DeadManWalkingError(self.id)
return __id2obj(self.id) # will bump obj's refcount

def __del__(self):
# this is purely an optimization: if self gets nuked,
# exempt its referent from greater expense when *it*
# dies
if self.id is not None:
__clear_weak_bit(__id2obj(self.id))
del id2weak[self.id]

def weak(obj):
return id2weak.get(id(obj), None) or _Weak(obj)

and then whenever an object of any kind is deleted the system does:

if __is_weak_bit_set(obj):
objid = id(obj)
id2weak[objid].id = None
del id2weak[objid]

In my current over-tired state, I think that's safe (modulo threads),
portable and reasonably fast; I do think the extra bit costs 4 bytes,
though.

> ...
> The weak object isn't transparent, because you have to call strong()
> before you can do anything with it, but this is an advantage (says he,
> aspiring to a career in politics or sales:-): with a transparent weak
> object the object could disappear at unexpected moments and with this
> scheme it can't, because when you have the object itself in hand you
> have a refcount too.

Explicit is better than implicit for me.

[M.-A. Lemburg]
> Have you checked the weak reference dictionary implementation
> by Dieter Maurer ? It's at:
>
> http://www.handshake.de/~dieter/weakdict.html

A project where I work is using it; it blows up a lot <wink/frown>.

While some form of weak dict is what most people want in the end, I'm not
sure Dieter's decision to support weak dicts with only weak values (not weak
keys) is sufficient. For example, the aforementioned project wants to
associate various computed long strings with certain hashable objects, and
for some reason or other (ain't my project ...) these objects can't be
changed. So they can't store the strings in the objects. So they'd like to
map the objects to the strings via assorted dicts. But using the object as
a dict key keeps it (and, via the dicts, also its associated strings)
artificially alive; they really want a weakdict with weak *keys*.

I'm not sure I know of a clear & fast way to implement a weakdict building
only on the weak() function. Jack?

Using weak objects as values (or keys) with an ordinary dict can prevent
their referents from being kept artificially alive, but that doesn't get the
dict itself cleaned up by magic. Perhaps "the system" should notify a weak
object when its referent goes away; that would at least give the WO a chance
to purge itself from structures it knows it's in ...

> ...
> BTW, how would this be done in JPython ? I guess it doesn't
> make much sense there because cycles are no problem for the
> Java VM GC.

Weak refs have many uses beyond avoiding cycles, and Java 1.2 has all of
"hard", "soft", "weak", and "phantom" references. See java.lang.ref for
details. I stopped paying attention to Java, so it's up to you to tell us
what you learn about it <wink>.
Re: Quick-and-dirty weak references [ In reply to ]
Tim Peters wrote:
>
> [M.-A. Lemburg]
> > Have you checked the weak reference dictionary implementation
> > by Dieter Maurer ? It's at:
> >
> > http://www.handshake.de/~dieter/weakdict.html
>
> A project where I work is using it; it blows up a lot <wink/frown>.
>
> While some form of weak dict is what most people want in the end, I'm not
> sure Dieter's decision to support weak dicts with only weak values (not weak
> keys) is sufficient. For example, the aforementioned project wants to
> associate various computed long strings with certain hashable objects, and
> for some reason or other (ain't my project ...) these objects can't be
> changed. So they can't store the strings in the objects. So they'd like to
> map the objects to the strings via assorted dicts. But using the object as
> a dict key keeps it (and, via the dicts, also its associated strings)
> artificially alive; they really want a weakdict with weak *keys*.
>
> I'm not sure I know of a clear & fast way to implement a weakdict building
> only on the weak() function. Jack?
>
> Using weak objects as values (or keys) with an ordinary dict can prevent
> their referents from being kept artificially alive, but that doesn't get the
> dict itself cleaned up by magic. Perhaps "the system" should notify a weak
> object when its referent goes away; that would at least give the WO a chance
> to purge itself from structures it knows it's in ...

Perhaps one could fiddle something out of the Proxy objects
in mxProxy (you know where...). These support a special __cleanup__
protocol that I use a lot to work around circular garbage:
the __cleanup__ method of the referenced object is called prior
to destroying the proxy; even if the reference count on the
object has not yet gone down to 0.

This makes direct circles possible without problems: the parent
can reference a child through the proxy and the child can reference the
parent directly. As soon as the parent is cleaned up, the reference to
the proxy is deleted which then automagically makes the
back reference in the child disappear, allowing the parent
to be deallocated after cleanup without leaving a circular
reference around.

> > ...
> > BTW, how would this be done in JPython ? I guess it doesn't
> > make much sense there because cycles are no problem for the
> > Java VM GC.
>
> Weak refs have many uses beyond avoiding cycles, and Java 1.2 has all of
> "hard", "soft", "weak", and "phantom" references. See java.lang.ref for
> details. I stopped paying attention to Java, so it's up to you to tell us
> what you learn about it <wink>.

Thanks for the reference... but I guess this will remain a
weak one for some time since the latter is currently a
limited resource :-)

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 138 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
RE: Quick-and-dirty weak references [ In reply to ]
[.about weakdicts and the possibility of building them on weak
references; the obvious way doesn't clean up the dict itself by
magic; maybe a weak object should be notified when its referent
goes away
]

[M.-A. Lemburg]
> Perhaps one could fiddle something out of the Proxy objects
> in mxProxy (you know where...). These support a special __cleanup__
> protocol that I use a lot to work around circular garbage:
> the __cleanup__ method of the referenced object is called prior
> to destroying the proxy; even if the reference count on the
> object has not yet gone down to 0.
>
> This makes direct circles possible without problems: the parent
> can reference a child through the proxy and the child can reference the
> parent directly.

What you just wrote is:

parent --> proxy --> child -->+
^ v
+<----------------------------+

Looks like a plain old cycle to me!

> As soon as the parent is cleaned up, the reference to
> the proxy is deleted which then automagically makes the
> back reference in the child disappear, allowing the parent
> to be deallocated after cleanup without leaving a circular
> reference around.

M-A, this is making less sense by the paragraph <wink>: skipping the
middle, this says "as soon as the parent is cleaned up ... allowing the
parent to be deallocated after cleanup". If we presume that the parent gets
cleaned up explicitly (since the reference from the child is keeping it
alive, it's not going to get cleaned up by magic, right?), then the parent
could just as well call the __cleanup__ methods of the things it references
directly without bothering with a proxy. For that matter, if it's the
straightforward

parent <-> child

kind of cycle, the parent's cleanup method can just do

self.__dict__.clear()

and the cycle is broken without writing a __cleanup__ method anywhere
(that's what I usually do, and in this kind of cycle that clears the last
reference to the child, which then goes away, which in turn automagically
clears its back reference to the parent).

So, offhand, I don't see that the proxy protocol could help here. In a
sense, what's really needed is the opposite: notifying the *proxy* when the
*real* object goes away (which makes no sense in the context of what your
proxy objects were designed to do).

[about Java and its four reference strengths]

Found a good introductory writeup at (sorry, my mailer will break this URL,
so I'll break it myself at a sensible place):

http://developer.java.sun.com/developer/
technicalArticles//ALT/RefObj/index.html

They have a class for each of the three "not strong" flavors of references.
For all three you pass the referenced object to the constructor, and all
three accept (optional in two of the flavors) a second ReferenceQueue
argument. In the latter case, when the referenced object goes away the
weak/soft/phantom-ref proxy object is placed on the queue. Which, in turn,
is a thread-safe queue with various put, get, and timeout-limited polling
functions. So you have to write code to look at the queue from time to
time, to find the proxies whose referents have gone away.

The three flavors may (or may not ...) have these motivations:

soft: an object reachable at strongest by soft references can go away at
any time, but the garbage collector strives to keep it intact until it can't
find any other way to get enough memory

weak: an object reachable at strongest by weak references can go away at
any time, and the collector makes no attempt to delay its death

phantom: an object reachable at strongest by phantom references can get
*finalized* at any time, but won't get *deallocated* before its phantom
proxy does something or other (goes away? wasn't clear). This is the flavor
that requires passing a queue argument to the constructor. Seems to be a
major hack to worm around Java's notorious problems with order of
finalization -- along the lines that you give phantom referents trivial
finalizers, and put the real cleanup logic in the phantom proxy. This lets
your program take responsibility for running the real cleanup code in the
order-- and in the thread! --where it makes sense.

Java 1.2 *also* tosses in a WeakHashMap class, which is a dict with
under-the-cover weak keys (unlike Dieter's flavor with weak values), and
where the key+value pairs vanish by magic when the key object goes away.
The details and the implementation of these guys waren't clear to me, but
then I didn't download the code, just scanned the online docs.


Ah, a correction to my last post:

class _Weak:
...
def __del__(self):
# this is purely an optimization: if self gets nuked,
# exempt its referent from greater expense when *it*
# dies
if self.id is not None:
__clear_weak_bit(__id2obj(self.id))
del id2weak[self.id]

Root of all evil: this method is useless, since the id2weak dict keeps each
_Weak object alive until its referent goes away (at which time self.id gets
set to None, so _Weak.__del__ doesn't do anything). Even if it did do
something, it's no cheaper to do it here than in the systemt cleanup code
("greater expense" was wrong).

weakly y'rs - tim


PS: Ooh! Ooh! Fellow at work today was whining about weakdicts, and
called them "limp dicts". I'm not entirely sure it was an innocent Freudian
slut, but it's a funny pun even if it wasn't (for you foreigners, it sounds
like American slang for "flaccid one-eyed trouser snake" ...).
Re: Quick-and-dirty weak references [ In reply to ]
Tim Peters wrote:
>
> [.about weakdicts and the possibility of building them on weak
> references; the obvious way doesn't clean up the dict itself by
> magic; maybe a weak object should be notified when its referent
> goes away
> ]
>
> [M.-A. Lemburg]
> > Perhaps one could fiddle something out of the Proxy objects
> > in mxProxy (you know where...). These support a special __cleanup__
> > protocol that I use a lot to work around circular garbage:
> > the __cleanup__ method of the referenced object is called prior
> > to destroying the proxy; even if the reference count on the
> > object has not yet gone down to 0.
> >
> > This makes direct circles possible without problems: the parent
> > can reference a child through the proxy and the child can reference the
> > parent directly.
>
> What you just wrote is:
>
> parent --> proxy --> child -->+
> ^ v
> +<----------------------------+
>
> Looks like a plain old cycle to me!

Sure :-) That was the intention. I'm using this to implement
acquisition without turning to ExtensionClasses. [Nice picture, BTW]

> > As soon as the parent is cleaned up, the reference to
> > the proxy is deleted which then automagically makes the
> > back reference in the child disappear, allowing the parent
> > to be deallocated after cleanup without leaving a circular
> > reference around.
>
> M-A, this is making less sense by the paragraph <wink>: skipping the
> middle, this says "as soon as the parent is cleaned up ... allowing the
> parent to be deallocated after cleanup". If we presume that the parent gets
> cleaned up explicitly (since the reference from the child is keeping it
> alive, it's not going to get cleaned up by magic, right?), then the parent
> could just as well call the __cleanup__ methods of the things it references
> directly without bothering with a proxy. For that matter, if it's the
> straightforward
>
> parent <-> child
>
> kind of cycle, the parent's cleanup method can just do
>
> self.__dict__.clear()
>
> and the cycle is broken without writing a __cleanup__ method anywhere
> (that's what I usually do, and in this kind of cycle that clears the last
> reference to the child, which then goes away, which in turn automagically
> clears its back reference to the parent).
>
> So, offhand, I don't see that the proxy protocol could help here. In a
> sense, what's really needed is the opposite: notifying the *proxy* when the
> *real* object goes away (which makes no sense in the context of what your
> proxy objects were designed to do).

All true :-). The nice thing about the proxy is that it takes
care of the process automagically. And yes, the parent is used
via a proxy too. So the picture looks like this:

--> proxy --> parent --> proxy --> child -->+
^ v
+<----------------------------+

Since the proxy isn't noticed by the referencing objects (well, at
least if they don't fiddle with internals), the picture for the
objects looks like this:

--> parent --> child -->+
^ v
+<------------------+

You could of course do the same via explicit invokation of
the __cleanup__ method, but the object references involved could be
hidden in some other structure, so they might be hard to find.

And there's another feature about Proxies (as defined in mxProxy):
they allow you to control access in a much more strict way than
Python does. You can actually hide attributes and methods you
don't want exposed in a way that doesn't even let you access them
via some dict or pass me the frame object trick. This is very useful
when you program multi-user application host servers where you don't
want users to access internal structures of the server.

> [about Java and its four reference strengths]
>
> Found a good introductory writeup at (sorry, my mailer will break this URL,
> so I'll break it myself at a sensible place):
>
> http://developer.java.sun.com/developer/
> technicalArticles//ALT/RefObj/index.html

Thanks for the reference... and for the summary ;-)

> They have a class for each of the three "not strong" flavors of references.
> For all three you pass the referenced object to the constructor, and all
> three accept (optional in two of the flavors) a second ReferenceQueue
> argument. In the latter case, when the referenced object goes away the
> weak/soft/phantom-ref proxy object is placed on the queue. Which, in turn,
> is a thread-safe queue with various put, get, and timeout-limited polling
> functions. So you have to write code to look at the queue from time to
> time, to find the proxies whose referents have gone away.
>
> The three flavors may (or may not ...) have these motivations:
>
> soft: an object reachable at strongest by soft references can go away at
> any time, but the garbage collector strives to keep it intact until it can't
> find any other way to get enough memory

So there is a possibility of reviving these objects, right ?

I've just recently added a hackish function to my mxTools which allows
me to regain access to objects via their address (no, not thread safe,
not even necessarily correct).

sys.makeref(id)
Provided that id is a valid address of a Python object (id(object) returns this address),
this function returns a new reference to it. Only objects that are "alive" can be referenced
this way, ones with zero reference count cause an exception to be raised.

You can use this function to reaccess objects lost during garbage collection.

USE WITH CARE: this is an expert-only function since it can cause instant core dumps and
many other strange things -- even ruin your system if you don't know what you're doing !

SECURITY WARNING: This function can provide you with access to objects that are
otherwise not visible, e.g. in restricted mode, and thus be a potential security hole.

I use it for tracking objects via id-key based dictionary and
hooks in the create/del mechanisms of Python instances. It helps
finding those memory eating cycles.

> weak: an object reachable at strongest by weak references can go away at
> any time, and the collector makes no attempt to delay its death
>
> phantom: an object reachable at strongest by phantom references can get
> *finalized* at any time, but won't get *deallocated* before its phantom
> proxy does something or other (goes away? wasn't clear). This is the flavor
> that requires passing a queue argument to the constructor. Seems to be a
> major hack to worm around Java's notorious problems with order of
> finalization -- along the lines that you give phantom referents trivial
> finalizers, and put the real cleanup logic in the phantom proxy. This lets
> your program take responsibility for running the real cleanup code in the
> order-- and in the thread! --where it makes sense.

Wouldn't these flavors be possible using the following setup ? Note
that it's quite similar to your _Weak class except that I use a
proxy without the need to first get a strong reference for the
object and that it doesn't use a weak bit.

--> proxy --> object
^
|
all_managed_objects

all_managed_objects is a dictionary indexed by address (its id)
and keeps a strong reference to the objects. The proxy does
not keep a strong reference to the object, but only the address
as integer and checks the ref-count on the object in the
all_managed_objects dictionary prior to every dereferencing
action. In case this refcount falls down to 1 (only the
all_managed_objects dict references it), the proxy takes
appropriate action, e.g. raises an exceptions and deletes
the reference in all_managed_objects to mimic a weak reference.
The same check is done prior to garbage collection of the
proxy.

Add to this some queues, pepper and salt and place it in an
oven at 220° for 20 minutes... plus take a look every 10 seconds
or so...

The downside is obvious: the zombified object will not get inspected
(and then GCed) until the next time a weak reference to it is used.

> Java 1.2 *also* tosses in a WeakHashMap class, which is a dict with
> under-the-cover weak keys (unlike Dieter's flavor with weak values), and
> where the key+value pairs vanish by magic when the key object goes away.
> The details and the implementation of these guys waren't clear to me, but
> then I didn't download the code, just scanned the online docs.

Would the above help in creating such beasts ?

> Ah, a correction to my last post:
>
> class _Weak:
> ...
> def __del__(self):
> # this is purely an optimization: if self gets nuked,
> # exempt its referent from greater expense when *it*
> # dies
> if self.id is not None:
> __clear_weak_bit(__id2obj(self.id))
> del id2weak[self.id]
>
> Root of all evil: this method is useless, since the id2weak dict keeps each
> _Weak object alive until its referent goes away (at which time self.id gets
> set to None, so _Weak.__del__ doesn't do anything). Even if it did do
> something, it's no cheaper to do it here than in the systemt cleanup code
> ("greater expense" was wrong).
>
> weakly y'rs - tim
>
> PS: Ooh! Ooh! Fellow at work today was whining about weakdicts, and
> called them "limp dicts". I'm not entirely sure it was an innocent Freudian
> slut, but it's a funny pun even if it wasn't (for you foreigners, it sounds
> like American slang for "flaccid one-eyed trouser snake" ...).

:-)

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 136 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
Re: Quick-and-dirty weak references [ In reply to ]
[.about weakdicts and the possibility of building them on weak
references; the obvious way doesn't clean up the dict itself by
magic; maybe a weak object should be notified when its referent
goes away
]

Here is a new version of my Proxy package which includes a
self managing weak reference mechanism without the need to
add extra bits or bytes to all Python objects:

http://starship.skyport.net/~lemburg/mxProxy-pre0.2.0.zip

The docs and an explanation of how the thingie works are
included in the archive's Doc subdir. Basically it builds
upon the idea I posted earlier on on this thread -- with
a few extra kicks to get it right in the end ;-)

Usage is pretty simple:

from Proxy import WeakProxy
object = []
wr = WeakProxy(object)
wr.append(8)
del object

>>> wr[0]
Traceback (innermost last):
File "<stdin>", line 1, in ?
mxProxy.LostReferenceError: object already garbage collected

I have checked the ref counts pretty thoroughly, but before
going public I would like the Python-Dev crowd to run some
tests as well: after all, the point is for the weak references
to be weak and that's sometimes a bit hard to check.

Hope you have as much fun with it as I had writing it ;-)

Ah yes, for the raw details have a look at the code. The code
uses a list of back references to the weak Proxies and notifies them
when the object goes away... would it be useful to add a hook
to the Proxies so that they can apply some other action as well ?

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 135 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
Re: Quick-and-dirty weak references [ In reply to ]
The one thing I'm not thrilled by in mxProxy is that a call to
CheckWeakReferences() is needed before an object is cleaned up. I guess this
boils down to the same problem I had with my weak reference scheme: you
somehow want the Python core to tell the proxy stuff that the object can be
cleaned up (although the details are different: in my scheme this would be
triggered by refcount==0 and in mxProxy by refcount==1). And because objects
are created and destroyed in Python at a tremendous rate you don't want to do
this call for every object, only if you have a hint that the object has a weak
reference (or a proxy).
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm
Re: Quick-and-dirty weak references [ In reply to ]
Vladimir Marangozov wrote:
>
> M.-A. Lemburg wrote:
> > I have checked the ref counts pretty thoroughly, but before
> > going public I would like the Python-Dev crowd to run some
> > tests as well: after all, the point is for the weak references
> > to be weak and that's sometimes a bit hard to check.
>
> It's even harder to implement them without side effects. I used
> the same hack for the __heirs__ class attribute some time ago.
> But I knew that a parent class cannot be garbage collected before
> all of its descendants. That allowed me to keep weak refs in
> the parent class, and preserve the existing strong refs in the
> subclasses. On every dealloc of a subclass, the corresponding
> weak ref in the parent class' __heirs__ is removed.
>
> In your case, the lifetime of the objects cannot be predicted,
> so implementing weak refs by messing with refcounts or checking
> mem pointers is a dead end.

> I don't know whether this is the
> case with mxProxy as I just browsed the code quickly, but here's
> a scenario where your scheme (or implementation) is not working:
>
> >>> from Proxy import WeakProxy
> >>> o = []
> >>> p = WeakProxy(o)
> >>> d = WeakProxy(o)
> >>> p
> <WeakProxy object at 20260138>
> >>> d
> <WeakProxy object at 20261328>
> >>> print p
> []
> >>> print d
> []
> >>> del o
> >>> p
> <WeakProxy object at 20260138>
> >>> d
> <WeakProxy object at 20261328>
> >>> print p
> Illegal instruction (core dumped)

Could you tell me where the core dump originates ? Also, it would
help to compile the package with the -DMAL_DEBUG switch turned
on (edit Setup) and then run the same things using 'python -d'.
The package will then print a pretty complete list of things it
is doing to mxProxy.log, which would help track down errors like
these.

BTW, I get:
>>> print p

Traceback (innermost last):
File "<stdin>", line 1, in ?
mxProxy.LostReferenceError: object already garbage collected
>>>

[.Don't know why the print statement prints an empty line, though.]

Thanks for trying it,
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 135 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
Re: Quick-and-dirty weak references [ In reply to ]
M.-A. Lemburg wrote:
>
> Usage is pretty simple:
>
> from Proxy import WeakProxy
> object = []
> wr = WeakProxy(object)
> wr.append(8)
> del object
>
> >>> wr[0]
> Traceback (innermost last):
> File "<stdin>", line 1, in ?
> mxProxy.LostReferenceError: object already garbage collected
>
> I have checked the ref counts pretty thoroughly, but before
> going public I would like the Python-Dev crowd to run some
> tests as well: after all, the point is for the weak references
> to be weak and that's sometimes a bit hard to check.

It's even harder to implement them without side effects. I used
the same hack for the __heirs__ class attribute some time ago.
But I knew that a parent class cannot be garbage collected before
all of its descendants. That allowed me to keep weak refs in
the parent class, and preserve the existing strong refs in the
subclasses. On every dealloc of a subclass, the corresponding
weak ref in the parent class' __heirs__ is removed.

In your case, the lifetime of the objects cannot be predicted,
so implementing weak refs by messing with refcounts or checking
mem pointers is a dead end. I don't know whether this is the
case with mxProxy as I just browsed the code quickly, but here's
a scenario where your scheme (or implementation) is not working:

>>> from Proxy import WeakProxy
>>> o = []
>>> p = WeakProxy(o)
>>> d = WeakProxy(o)
>>> p
<WeakProxy object at 20260138>
>>> d
<WeakProxy object at 20261328>
>>> print p
[]
>>> print d
[]
>>> del o
>>> p
<WeakProxy object at 20260138>
>>> d
<WeakProxy object at 20261328>
>>> print p
Illegal instruction (core dumped)


--
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252
Re: Quick-and-dirty weak references [ In reply to ]
Jack Jansen wrote:
>
> The one thing I'm not thrilled by in mxProxy is that a call to
> CheckWeakReferences() is needed before an object is cleaned up. I guess this
> boils down to the same problem I had with my weak reference scheme: you
> somehow want the Python core to tell the proxy stuff that the object can be
> cleaned up (although the details are different: in my scheme this would be
> triggered by refcount==0 and in mxProxy by refcount==1). And because objects
> are created and destroyed in Python at a tremendous rate you don't want to do
> this call for every object, only if you have a hint that the object has a weak
> reference (or a proxy).

Well, the check is done prior to every action using a proxy to
the object and also when a proxy to it is deallocated. The
addition checkweakrefs() API is only included to enable additional explicit
checking of the whole weak refs dictionary, e.g. every 10 seconds
or so (just like you would with a mark&sweep GC).

But yes, GC of the phantom object is delayed a bit depending on
how you set up the proxies. Still, I think most usages won't have
this problem, since the proxies themselves are usually
temporary objects.

It may sometimes even make sense to have the phantom object
around as long as possible, e.g. to implement the soft references
Tim quoted from the Java paper.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 135 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
Re: Quick-and-dirty weak references [ In reply to ]
Vladimir Marangozov wrote:
>
> [about mxProxy, WeakProxy]
>
> M.-A. Lemburg wrote:
> >
> > Could you tell me where the core dump originates ? Also, it would
> > help to compile the package with the -DMAL_DEBUG switch turned
> > on (edit Setup) and then run the same things using 'python -d'.
> > The package will then print a pretty complete list of things it
> > is doing to mxProxy.log, which would help track down errors like
> > these.
> >
> > BTW, I get:
> > >>> print p
> >
> > Traceback (innermost last):
> > File "<stdin>", line 1, in ?
> > mxProxy.LostReferenceError: object already garbage collected
> > >>>
> >
> > [.Don't know why the print statement prints an empty line, though.]
> >
>
> The previous example now *seems* to work fine in a freshly launched
> interpreter, so it's not a good example, but this shorter one
> definitely doesn't:
>
> >>> from Proxy import WeakProxy
> >>> o = []
> >>> p = q = WeakProxy(o)
> >>> p = q = WeakProxy(o)
> >>> del o
> >>> print p or q
> Illegal instruction (core dumped)
>
> It crashes in PyDict_DelItem() called from mxProxy_CollectWeakReference().
> I can mail you a complete trace in private, if you still need it.

That would be nice (please also include the log-file), because I get:
>>> print p or q
Traceback (innermost last):
File "<stdin>", line 1, in ?
mxProxy.LostReferenceError: object already garbage collected
>>>

Thank you,
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 135 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
Re: Quick-and-dirty weak references [ In reply to ]
[about mxProxy, WeakProxy]

M.-A. Lemburg wrote:
>
> Could you tell me where the core dump originates ? Also, it would
> help to compile the package with the -DMAL_DEBUG switch turned
> on (edit Setup) and then run the same things using 'python -d'.
> The package will then print a pretty complete list of things it
> is doing to mxProxy.log, which would help track down errors like
> these.
>
> BTW, I get:
> >>> print p
>
> Traceback (innermost last):
> File "<stdin>", line 1, in ?
> mxProxy.LostReferenceError: object already garbage collected
> >>>
>
> [.Don't know why the print statement prints an empty line, though.]
>

The previous example now *seems* to work fine in a freshly launched
interpreter, so it's not a good example, but this shorter one
definitely doesn't:

>>> from Proxy import WeakProxy
>>> o = []
>>> p = q = WeakProxy(o)
>>> p = q = WeakProxy(o)
>>> del o
>>> print p or q
Illegal instruction (core dumped)

Or even shorter:

>>> from Proxy import WeakProxy
>>> o = []
>>> p = q = WeakProxy(o)
>>> p = WeakProxy(o)
>>> del o
>>> print p
Illegal instruction (core dumped)

It crashes in PyDict_DelItem() called from mxProxy_CollectWeakReference().
I can mail you a complete trace in private, if you still need it.

--
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252
Re: Quick-and-dirty weak references [ In reply to ]
[about weak references and a sample implementation in mxProxy]

With the help of Vladimir, I have solved the problem and uploaded
a modified version of the prerelease:

http://starship.skyport.net/~lemburg/mxProxy-pre0.2.0.zip

The archive now also contains a precompiled Win32 PYD file
for those on WinXX platforms. Please give it a try and tell
me what you think.

Cheers,
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 134 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
Re: Quick-and-dirty weak references [ In reply to ]
> In reply to no one in particular:
>
> I've often wished that the instance type object had an (optimized)
> __decref__ slot. With nothing but hand-waving to support it, I'll
> claim that would enable all these games.

Without context, I don't know when this would be called. If you want
this called on all DECREFs (regardless of the refcount value), realize
that this is a huge slowdown because it would mean the DECREF macro
has to inspect the type object, which means several indirections.
This would slow down *every* DECREF operation, not just those on
instances with a __decref__ slot, because the DECREF macro doesn't
know the type of the object!

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: Quick-and-dirty weak references [ In reply to ]
In reply to no one in particular:

I've often wished that the instance type object had an (optimized)
__decref__ slot. With nothing but hand-waving to support it, I'll
claim that would enable all these games.

- Gordon
Re: Quick-and-dirty weak references [ In reply to ]
[me]
> >
> > I've often wished that the instance type object had an (optimized)
> > __decref__ slot. With nothing but hand-waving to support it, I'll
> > claim that would enable all these games.

[Guido]
> Without context, I don't know when this would be called. If you
> want this called on all DECREFs (regardless of the refcount value),
> realize that this is a huge slowdown because it would mean the
> DECREF macro has to inspect the type object, which means several
> indirections. This would slow down *every* DECREF operation, not
> just those on instances with a __decref__ slot, because the DECREF
> macro doesn't know the type of the object!

This was more 2.0-ish speculation, and really thinking of classic C++
ref counting where decref would be a function call, not a macro.
Still a slowdown, of course, but not quite so massive. The upside is
opening up all kinds of tricks at the type object and user class
levels, (such as weak refs and copy on write etc). Worth it? I'd
think so, but I'm not a speed demon.

- Gordon