OK, so I want to take a step back and lay my thoughts out on this caching
stuff.
We need to cache the BitVector of a Filter on a per-reader basis. That
BitVector should be destroyed when the Reader is.
A Filter generally does not change, but it can, particularly in the case of
a PolyFilter. When the Filter changes, the BitVector should no longer be
used, and, ideally, it should be cleaned up, as it is not likely to be used
again.
That's basically it. Then there's implementation.
Uniquely identifying a particular Filter can be done with Filter class +
hash_code. This holds true even if the Filter is modified, as hash_code
will change. [NB: we still need a way to generate hash_code for
PolyFilter.]
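
To make the class + hash_code idea concrete, here is a minimal sketch. My::RangeFilter, its hash_code implementation, and filter_cache_key are all made up for illustration; the only assumption that matters is that hash_code is derived from the Filter's current settings.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Filter; only hash_code() matters here.
package My::RangeFilter;

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

# Assumption: hash_code is computed from the Filter's current settings,
# so modifying the Filter changes the value.
sub hash_code {
    my $self     = shift;
    my $settings = join "\0", map {"$_=$self->{$_}"} sort keys %$self;
    my $sum      = 0;
    for my $char ( split //, $settings ) {
        $sum = ( $sum * 31 + ord($char) ) % 2**31;
    }
    return $sum;
}

package main;

# Build the cache key from the Filter's class plus its hash_code.
sub filter_cache_key {
    my $filter = shift;
    return join ':', ref($filter), $filter->hash_code;
}
```

An unmodified Filter keeps producing the same key, and a modified one produces a new key. Nothing here tells us which old key the new one replaced, which is the problem described next.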
Caching by Filter class and hash_code does not allow us to destroy old
cached BitVectors if the Filter has changed. One way to perhaps address
this is with a GUID for the Filter; we can check to see if there is an old
BitVector cached for that GUID, and if so, destroy it.
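
A rough sketch of the GUID approach, assuming a per-Reader cache with one slot per Filter GUID. My::BitsCache, fetch, and the evicted counter are hypothetical names, and an array ref stands in for the BitVector:

```perl
use strict;
use warnings;

# Hypothetical per-Reader cache: one slot per Filter GUID.
package My::BitsCache;

sub new { return bless { slots => {}, evicted => 0 }, shift }

# $guid      : stable identifier for the Filter (assumed never to change)
# $hash_code : the Filter's current hash_code (changes when it is modified)
# $make_bits : callback that builds a fresh BitVector stand-in
sub fetch {
    my ( $self, $guid, $hash_code, $make_bits ) = @_;
    my $slot = $self->{slots}{$guid};

    if ( $slot and $slot->{hash_code} ne $hash_code ) {
        # The Filter was modified since we cached: drop the stale bits.
        delete $self->{slots}{$guid};
        $self->{evicted}++;
        undef $slot;
    }

    if ( !$slot ) {
        $slot = { hash_code => $hash_code, bits => $make_bits->() };
        $self->{slots}{$guid} = $slot;
    }
    return $slot->{bits};
}

package main;
```

Because the slot is keyed by GUID rather than by hash_code, a modified Filter overwrites its own old entry instead of leaving it behind.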
I cannot think right now of any other way to destroy old BitVectors for
modified Filters in this implementation. If you cannot uniquely identify
the Filter, you can't know that its BitVector has expired once the
hash_code has changed. (We can't clean up at the moment the Filter is
modified, because we may not know the Reader at that point, such as in
PolyFilter->add.) We either need to use a GUID, come up with another way,
or accept that we may have a leak here.
We could also use weak references. I generally think weak refs are a hack,
but then again, the above is also turning into a hack. :) We would no
longer need hash_code, or unique identification. The BitVector would be
there in the Filter and we could use it until the Reader disappeared, at
which point it would disappear too. We could prevent BitVectors from
accumulating by manually calling some dispose() method on each BitVector
when the Filter is modified:
    # if Filter is modified
    $_->dispose for values %{ $self->{cached_bits} };
    $self->{cached_bits} = {};
However, apart from creating a dispose() method, we would also need to
uniquely identify the *Reader*, instead of the Filter, unless we think the
current method of using a stringified reference is sufficient.
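
Here is one possible reading of the weak-ref approach, as a sketch only. It assumes the Reader holds the strong reference to each BitVector and the Filter keeps a weakened copy, keyed by the Reader's address via Scalar::Util::refaddr (a numeric equivalent of the stringified reference). My::Reader, My::Filter, owned_bits, and bits are hypothetical names, and an array ref stands in for the BitVector:

```perl
use strict;
use warnings;
use Scalar::Util ();

# Hypothetical Reader: owns the strong references to its BitVectors.
package My::Reader;
sub new { return bless { owned_bits => [] }, shift }

# Hypothetical Filter: keeps only weak references, keyed by Reader address.
package My::Filter;
sub new { return bless { cached_bits => {} }, shift }

sub bits {
    my ( $self, $reader ) = @_;
    my $key  = Scalar::Util::refaddr($reader);
    my $bits = $self->{cached_bits}{$key};
    return $bits if defined $bits;

    $bits = [ 1, 0, 1 ];    # stand-in for building a real BitVector
    push @{ $reader->{owned_bits} }, $bits;    # Reader holds the strong ref
    $self->{cached_bits}{$key} = $bits;
    Scalar::Util::weaken( $self->{cached_bits}{$key} );    # Filter's is weak
    return $bits;
}

package main;
```

In this arrangement, if a new Reader happens to reuse a destroyed Reader's address, the old entry should already be undef and is treated as a miss. That only holds as long as nothing else keeps a strong reference to the old BitVector.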
So Marvin, at this point, I have two questions:
* Is the above description essentially correct? Am I missing anything?
* Which method do you prefer (including additional options)?
--
Chris Nandor pudge@pobox.com
http://pudge.net/ Open Source Technology Group pudge@ostg.com
http://ostg.com/