Mailing List Archive

keeping references to nodes
Hi there,

As if I haven't posted enough to the parsed-xml-dev list yet, here's some more.

What I want to have is an ability to keep a reference to a unique
node in a document, without using a reference to the node itself
directly. This is because I want to store these references in, say, a
session. By using such a reference I can recognize whether I've
run into a node before and take action (such as pull things from
the cache, or render it differently as I've selected that node
for special operations in the GUI, etc.).

With XMLDocument, URLs to nodes were stable enough to make this possible.
I'd simply get the absolute_url to some node and I had my reference.
Unfortunately ParsedXML URLs to nodes are very unstable with respect
to changes to the DOM, so that won't work.

So, I've been pondering a way to get this for ParsedXML. One way would be
to change the way URLs work to be more stable towards changes, but how?

One simple way is to keep a reference to the node directly.
Unfortunately that seems hard to do for sessions and would cause a lot
of growth of the ZODB if references are kept there. Nodes may never
get garbage collected.

Another way is to give each node that is created a unique id. For that
we'd need to change the DOM in some places and we'll have to carry the
unique id around, but that'd certainly be possible.

A hybrid approach I tried but didn't get far with is to keep a weak
value dictionary. The key is some unique id that a manager class can
assign, and the value is the weak ref to the node. This way, if a node
goes away, it won't be in the dictionary anymore either. Unfortunately
I get:

TypeError: 'ExplicitAcquirerWrapper' objects are not weakly referencable

when I try to put nodes in a WeakValueDictionary, so this seems to be
a no-go.
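
The hybrid approach and its failure mode can be reproduced outside Zope. What follows is only a minimal sketch: `NodeRegistry`, `PlainNode` and `SlottedNode` are hypothetical names, and `SlottedNode` is just a stand-in for any type without weak reference support, which is what the TypeError above reports for the acquisition wrapper.

```python
import weakref
from itertools import count


class NodeRegistry:
    """Hypothetical manager: hands out ids and holds nodes only weakly."""

    def __init__(self):
        self._next_id = count(1)
        self._nodes = weakref.WeakValueDictionary()

    def register(self, node):
        node_id = next(self._next_id)
        # Raises TypeError if the node's type cannot be weakly referenced.
        self._nodes[node_id] = node
        return node_id

    def lookup(self, node_id):
        # Returns None once the node has been garbage collected.
        return self._nodes.get(node_id)


class PlainNode:
    """Ordinary class: instances can be weakly referenced."""


class SlottedNode:
    """Stand-in (assumption) for a wrapper type without weakref support."""
    __slots__ = ("name",)


registry = NodeRegistry()
node = PlainNode()
node_id = registry.register(node)
found_while_alive = registry.lookup(node_id) is node

try:
    registry.register(SlottedNode())
    slotted_error = None
except TypeError as exc:
    slotted_error = exc
```

With an ordinary class the entry disappears by itself once the node is collected; with a type that cannot be weakly referenced the registration fails immediately, which matches the behaviour described above.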

Yet another way would be to exploit some properties of the ZODB to get
a unique id to the node object. Since nodes aren't persistent objects
directly, I guess that won't work, right?

Anyway, some feedback on this matter would be welcome. I'm also sending
a cc to the zope-xml list to get some more discussion on this, hopefully.
Perhaps it's a use case others have run into before.

(should I move this kind of discussion in general to the zope-xml list?
Sometimes I don't know where I should be posting my stuff)

Regards,

Martijn
Re: keeping references to nodes
> Another way is to give each node that is created a unique id. For that
> we'd need to change the DOM in some places and we'll have to carry the
> unique id around, but that'd certainly be possible.


This would be my choice, though it also requires a check to make sure
the node is still *there* before you use it.

> (should I move this kind of discussion in general to the zope-xml list?
> Sometimes I don't know where I should be posting my stuff)


I can't speak for anyone else, but I'm certainly willing to see
ParsedXML discussed in zope-xml. In fact, I'd like to see all the
various XML projects within zope discussed in zope-xml: We can work
towards a common vision and all that. Individual lists would still be
OK for specific bug reports, CVS checkin reports, etc., but direction
and strategy would seem to be right at home on this list.


--Dethe

--

Dethe Elza (delza@burningtiger.com)
Chief Mad Scientist
Burning Tiger Technologies (http://burningtiger.com)
Living Code Weblog (http://livingcode.ca)
Re: keeping references to nodes
Dethe Elza wrote:
> >Another way is to give each node that is created a unique id. For that
> >we'd need to change the DOM in some places and we'll have to carry the
> >unique id around, but that'd certainly be possible.
>
> This would be my choice, it also requires a check to make sure the node
> is still *there* before you use it.

Right, although such a reverse mapping isn't part of my requirements
just yet (it brings back the whole issue of keeping references to
nodes, since a dictionary that maps ids to nodes would have to hold
them somehow). What I think is enough for my purposes so far is to store
the id only, and then when I encounter a node, I ask it what its id is
and compare the two. That way I know I've seen it before.
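
A minimal sketch of that id-only comparison, assuming every node receives a unique id when it is created. The `Node` class, the `node_id` attribute and the `seen_before` helper are hypothetical names, not part of ParsedXML.

```python
from itertools import count

_ids = count(1)


class Node:
    """Hypothetical DOM node that carries a unique id from creation."""

    def __init__(self, name):
        self.name = name
        self.node_id = next(_ids)
        self.children = []


def seen_before(node, remembered_ids):
    """Compare the node's own id against the ids kept in the session."""
    return node.node_id in remembered_ids


a, b = Node("a"), Node("b")

# Only plain integers go into the session, never the nodes themselves.
session = {"selected": {a.node_id}}
```

Because the session holds nothing but integers, it never keeps a node alive and nothing node-related has to be stored in the ZODB.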

Resolving the ids to nodes in an efficient way seems harder. Of course you
can hunt the entire DOM tree for it, but that's not efficient. You can
keep a potentially huge mapping of all ids to all nodes, but then we
want that to be a mapping of weak references, and that doesn't seem to
work with the acquisition base classes.

Hmm...what *would* work is the original url mapping thing, though.
You can now make a stable url into the document, using the ids. It's
also reasonably efficient to look up such a node by URL. It's also easy
to verify whether a node doesn't exist anymore in that location. Of
course this is slightly different from finding out whether a node exists
at all anymore (it may have moved somewhere else), but would that be
important?
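
To make the idea concrete, here is a rough sketch of resolving a path built from node ids instead of child positions. The `Node` class and the `resolve` function are hypothetical, and the real URL traversal in ParsedXML may look quite different.

```python
class Node:
    """Hypothetical node with a unique id and a list of children."""

    def __init__(self, node_id, children=()):
        self.node_id = node_id
        self.children = list(children)


def resolve(root, id_path):
    """Follow a path of child ids from root; None if any step is missing."""
    node = root
    for node_id in id_path:
        for child in node.children:
            if child.node_id == node_id:
                node = child
                break
        else:
            return None
    return node


leaf = Node(3)
first = Node(1, [leaf])
second = Node(2)
root = Node(0, [first, second])

path = [1, 3]                         # ids, not positions
before = resolve(root, path)

root.children.insert(0, Node(4))      # the position of `first` shifts
after_insert = resolve(root, path)    # the id path is unaffected

first.children.remove(leaf)
second.children.append(leaf)          # the node moved but still exists
after_move = resolve(root, path)      # no longer found at the old location
```

Each step only scans the children of one node, so a lookup costs roughly the depth of the path times the fan-out rather than a walk over the whole tree. The last case shows the limitation mentioned above: a moved node cannot be told apart from a deleted one.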

> >(should I move this kind of discussion in general to the zope-xml list?
> >Sometimes I don't know where I should be posting my stuff)
>
> I can't speak for anyone else, but I'm certainly willing to see
> ParsedXML discussed in zope-xml. In fact, I'd like to see all the
> various XML projects within zope discussed in zope-xml: We can work
> towards a common vision and all that. Individual lists would still be
> OK for specific bug reports, CVS checkin reports, etc., but direction
> and strategy would seem to be right at home on this list.

Okay, then I'll be posting more often to this list when stuff like this
comes up.

Regards,

Martijn
Re: [Parsed-XML-Dev] keeping references to nodes
Martijn Faassen <faassen@vet.uu.nl> writes:

> What I want to have is an ability to keep a reference to a unique
> node in a document, without using a reference to the node itself
> directly.

> One simple way is to simply keep a reference to the node directly.
> Unfortunately that seems hard to do for sessions and would cause a lot
> of growth of the ZODB if references are kept there. Nodes may never
> get garbage collected.
>
> Another way is to give each node that is created a unique id. For that
> we'd need to change the DOM in some places and we'll have to carry the
> unique id around, but that'd certainly be possible.

The problem with either of these is that the entire subtree is replaced on
a parse. References to the old subtree are worthless, and there isn't
a reliable way to make sure that the new nodes get the same ID as the
old nodes.

If you're using XML IDs the problem is solved, of course, and there
are DOM methods to work with that. I remember before that you were
saying that you didn't want to have to change the XML source to
support this feature, though, and you want it to work for XML without
IDs. That's a valid concern, but I dunno, your problem is really a
drawback of XML. Unless there is an ID attribute expressed in XML,
there just isn't a way to identify an XML node other than by saying
how to traverse to it, so the DOM can't express this either.
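
As an illustration of the XML ID route, here is a small sketch using the standard library's `xml.dom.minidom` rather than ParsedXML. ID-ness normally comes from a DTD; in this sketch every attribute named `id` is flagged as an ID by hand, which is an assumption made for the example, and `parse_with_ids` is a hypothetical helper.

```python
from xml.dom.minidom import parseString

SOURCE = (
    '<doc><section id="intro"><p>Hello</p></section>'
    '<section id="body"/></doc>'
)


def parse_with_ids(text, attr="id"):
    """Parse text and flag every `attr` attribute as an ID (assumed convention)."""
    doc = parseString(text)
    for element in doc.getElementsByTagName("*"):
        if element.hasAttribute(attr):
            element.setIdAttribute(attr)
    return doc


first_parse = parse_with_ids(SOURCE)
second_parse = parse_with_ids(SOURCE)   # simulates the tree being replaced

old_node = first_parse.getElementById("intro")
new_node = second_parse.getElementById("intro")
```

The two parses produce different node objects, but the string "intro" finds the corresponding element in both, so the reference survives a reparse as long as the ID is present in the source.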

> Yet another way would be to exploit some properties of the ZODB to get
> a unique id to the node object. Since nodes aren't persistent objects
> directly, I guess that won't work, right?

yup, they're just attributes of the single persistent object, the
document.

--
Karl Anderson kra@monkey.org http://www.monkey.org/~kra/