On Nov 20, 2007, at 12:18 PM, Peter Karman wrote:
> Are you finding it makes it easier to do things with XS, C and the
> reference counting?
Every KS object, except those derived from the new, temporary class
KinoSearch::Util::Nat, maintains its own refcount, separate from
Perl's. When a Perl object wrapping a KS object has its SvREFCNT fall
to 0, the DESTROY method which gets called is
KinoSearch::Util::Obj::DESTROY, which simply decrements the KS
object's internal refcount rather than invoking Kino_Obj_Destroy(obj).

    void
    DESTROY(self)
        kino_Obj *self;
    PPCODE:
        REFCOUNT_DEC(self);

We have to do things that way because there are many KS objects which
Perl doesn't know about. For instance, when TopDocCollector's C
constructor TDColl_new() is invoked, it creates its own HitQueue
object without telling Perl anything about it. However, should we
need to deal with that HitQueue from Perl-space, we have to wrap it
in a Perl object. That's what happens here:

    {
        my $hit_queue = $collector->get_hit_queue;
    } # $hit_queue goes out of scope, DESTROY called

Currently, when that $hit_queue goes out of scope, the Perl wrapper
object gets destroyed. However, the interior KS HitQueue object must
not be destroyed, because $collector still needs it.
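
To make that concrete, here is a minimal sketch in plain C, not the actual KS source. The names Obj, obj_new, obj_inc and obj_dec are hypothetical stand-ins, and it assumes that wrapping a KS object takes one reference which the wrapper's DESTROY later gives back.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a KS object that keeps its own refcount. */
typedef struct Obj {
    int  refcount;
    int *destroyed;   /* set to 1 when the object is freed (demo only) */
} Obj;

static Obj *obj_new(int *destroyed) {
    Obj *self = malloc(sizeof(Obj));
    if (self == NULL) { abort(); }
    self->refcount  = 1;          /* the creator holds the first reference */
    self->destroyed = destroyed;
    return self;
}

static Obj *obj_inc(Obj *self) {
    self->refcount++;
    return self;
}

static void obj_dec(Obj *self) {
    if (--self->refcount == 0) {
        *self->destroyed = 1;
        free(self);
    }
}

/* Returns 1 if the queue outlives its wrapper and is freed only when
 * the collector lets go of it. */
int demo_wrapper_scope(void) {
    int destroyed = 0;

    /* The collector creates its queue; the host language knows nothing. */
    Obj *hit_queue = obj_new(&destroyed);      /* refcount 1 */

    /* get_hit_queue: assumed to take a reference for the wrapper. */
    Obj *wrapped = obj_inc(hit_queue);         /* refcount 2 */

    /* Wrapper goes out of scope: DESTROY only decrements. */
    obj_dec(wrapped);                          /* refcount 1 */
    int survived = (destroyed == 0 && hit_queue->refcount == 1);

    /* The collector itself is done with the queue. */
    obj_dec(hit_queue);                        /* refcount 0, freed */
    return survived && destroyed == 1;
}
```

Calling demo_wrapper_scope() from a test harness should return 1: the wrapper's release leaves the queue alive, and only the collector's release frees it.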
As a consequence, a single KS object can reappear wrapped in several
different Perl objects over its lifetime, which is rather strange and
is probably a bug waiting to bite someone. Here's an example of how
things can go wrong: cycling through multiple Perl objects doesn't
work well with the inside-out pattern, because DESTROY gets invoked
over and over again, necessitating a broken hack like this...

    sub DESTROY {
        my $self = shift;
        if ($self->refcount < 2) {
            delete $inside_out_var{$$self};
        }
        $self->SUPER::DESTROY;
    }

That hack doesn't even work reliably: if the final decrement happens
inside KS rather than through a Perl wrapper, the Perl DESTROY method
will never get called and any inside-out vars will leak.
The solution is to cache a Perl object within a KS object, so that
effectively Perl *does* know about it. That's the difference between
Nat and Obj. Under Nat, the refcounting is handled via the cached
Perl object. There are no longer two refcounts.
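
A rough sketch of the cached-object idea, again in plain C rather than real XS. HostObj is a hypothetical stand-in for the Perl object, and NatObj, nat_new, nat_to_host and nat_dec are made-up names; the point is only that the one count lives in the cached host object and that every request for a wrapper hands back the same one.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Perl object; it owns the only refcount. */
typedef struct HostObj {
    int refcount;
} HostObj;

/* Hypothetical Nat-style object: no count of its own, just a cached
 * pointer to its host object. */
typedef struct NatObj {
    HostObj *host;
} NatObj;

static NatObj *nat_new(void) {
    NatObj  *self = malloc(sizeof(NatObj));
    HostObj *host = malloc(sizeof(HostObj));
    if (self == NULL || host == NULL) { abort(); }
    host->refcount = 1;
    self->host     = host;
    return self;
}

/* Handing the object to host-space returns the cached wrapper and
 * bumps the single count. */
static HostObj *nat_to_host(NatObj *self) {
    self->host->refcount++;
    return self->host;
}

/* Returns 1 if this decrement freed the object, 0 otherwise. */
static int nat_dec(NatObj *self) {
    if (--self->host->refcount == 0) {
        free(self->host);
        free(self);
        return 1;
    }
    return 0;
}

/* Returns 1 if both requests yield the same wrapper and the object is
 * freed exactly when the single count reaches zero. */
int demo_cached_wrapper(void) {
    NatObj  *queue = nat_new();            /* count 1 */
    HostObj *a     = nat_to_host(queue);   /* count 2 */
    HostObj *b     = nat_to_host(queue);   /* count 3 */
    int same  = (a == b);
    int count = queue->host->refcount;

    int early = nat_dec(queue);            /* count 2 */
    early    += nat_dec(queue);            /* count 1 */
    int freed = nat_dec(queue);            /* count 0, freed */
    return same && count == 3 && early == 0 && freed == 1;
}
```

Because there is only ever one wrapper, a per-object DESTROY in host-space would fire once, which is what the inside-out pattern needs.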
One drawback of this design, though, is that Perl objects are
heavyweight. That's ok for big stuff like a PostingList, but it's
not-so-great for small stuff like a ByteBuf, a Token, or a TermInfo.
If we were to put a Perl object into every last one of those, I'd be
concerned both about memory usage and performance.
My current plan is to override the refcounting infrastructure for
small classes by basing them off of a "FastObj" class which will use
an integer refcount as Obj does now. The scheme is more complicated
to implement than I'd like, and it will have the one-KS-object-many-
Perl-objects problem for anything that subclasses FastObj. But it
will work in the near term and maybe it won't be so bad.
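
One way the override might look, sketched in plain C under heavy assumptions: the vtable layout, the INC_REF/DEC_REF macros and every other name here are hypothetical, not the planned KS API. Obj-style classes route refcounting to the count held by the cached host object (represented by a plain int), while FastObj-style classes route it to an integer stored in the object itself.

```c
#include <assert.h>

typedef struct Base Base;

/* Hypothetical vtable: each class supplies its own refcount methods. */
typedef struct VTable {
    void (*inc_refcount)(Base *self);
    void (*dec_refcount)(Base *self);
} VTable;

struct Base {
    const VTable *vtable;
    int host_refcount;   /* stand-in for the cached host object's count */
    int fast_refcount;   /* plain integer count used by FastObj classes */
};

/* Obj-style: defer to the host object's count. */
static void obj_inc(Base *self)  { self->host_refcount++; }
static void obj_dec(Base *self)  { self->host_refcount--; }

/* FastObj-style: a lightweight integer, no host object involved. */
static void fast_inc(Base *self) { self->fast_refcount++; }
static void fast_dec(Base *self) { self->fast_refcount--; }

static const VTable OBJ_VTABLE     = { obj_inc,  obj_dec  };
static const VTable FASTOBJ_VTABLE = { fast_inc, fast_dec };

/* Callers use one spelling and the vtable picks the scheme. */
#define INC_REF(o) ((o)->vtable->inc_refcount(o))
#define DEC_REF(o) ((o)->vtable->dec_refcount(o))

/* Returns 1 if each object's count moved only in its own scheme. */
int demo_fastobj_dispatch(void) {
    Base posting_list = { &OBJ_VTABLE,     1, 0 };  /* big: host-backed */
    Base token        = { &FASTOBJ_VTABLE, 0, 1 };  /* small: integer   */

    INC_REF(&posting_list);   /* host count 1 -> 2 */
    INC_REF(&token);          /* fast count 1 -> 2 */
    INC_REF(&token);          /* fast count 2 -> 3 */
    DEC_REF(&token);          /* fast count 3 -> 2 */

    return posting_list.host_refcount == 2
        && posting_list.fast_refcount == 0
        && token.fast_refcount == 2
        && token.host_refcount == 0;
}
```

The sketch leaves out destruction and wrapping entirely; it only shows that small classes could opt out of the host-backed count without callers changing how they take or release references.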
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/