Mailing List Archive

_02: RangeFilter problems
First, RangeFilter cannot be used with BooleanQuery. Seems to me it should
(else how to combine filters?). When I try, I get an error about
'create_weight' not being implemented in RangeFilter.

Second, if I just use it by itself, passing it as a filter directly to
search(), I get a segmentation fault or bus error (the latter along with a
vm_allocate error).

--
Chris Nandor pudge@pobox.com http://pudge.net/
Open Source Technology Group pudge@ostg.com http://ostg.com/
_02: RangeFilter problems [ In reply to ]
On Mar 8, 2007, at 11:34 AM, Chris Nandor wrote:

> First, RangeFilter cannot be used with BooleanQuery. Seems to me
> it should
> (else how to combine filters?). When I try, I get an error about
> 'create_weight' not being implemented in RangeFilter.
>
> Second, if I just use it by itself, passing it as a filter directly to
> search(), I get a segmentation fault or bus error (the latter along
> with a
> vm_allocate error).

I don't understand. Can you please post the code for the search you
are trying to execute?

If you don't supply a Query to Searcher->search, it should throw an
error.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
_02: RangeFilter problems [ In reply to ]
On Mar 8, 2007, at 12:13 PM, Marvin Humphrey wrote:

> If you don't supply a Query to Searcher->search, it should throw an
> error.

I take this back -- it's 100% wrong. There will be no error, and no
warning.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
_02: RangeFilter problems [ In reply to ]
At 12:13 -0800 2007.03.08, Marvin Humphrey wrote:
>I don't understand. Can you please post the code for the search you
>are trying to execute?

I am working to strip it down into a (non-)working snippet. But basically
I followed the docs for RangeFilter.

Will get back to you soon, I hope.

--
Chris Nandor pudge@pobox.com http://pudge.net/
Open Source Technology Group pudge@ostg.com http://ostg.com/
_02: RangeFilter problems [ In reply to ]
OK, here we go. This causes a segfault when $range_filter is used.
Without $range_filter, I get the five results I expect.

#!/usr/bin/perl
use warnings;
use strict;

use Slash::Test;
$::searchtoo->_init('firehose'); # create schema

use Data::Dumper;
use KinoSearch::QueryParser::QueryParser;
use KinoSearch::Search::RangeFilter;
use KinoSearch::Searcher;

my $kdir = '/path/to/invindex';
my $schema = 'Slash::SearchToo::KinoSearch::Schema::firehose';

my $searcher = KinoSearch::Searcher->new(
    invindex => $schema->open($kdir),
);

my $query_parser = KinoSearch::QueryParser::QueryParser->new(
    schema => $schema->new,
    fields => [qw(introtext title toptags)],
    default_boolop => 'AND',
);
my $query = $query_parser->parse('sony OR test');

my $range_filter = KinoSearch::Search::RangeFilter->new(
    field => 'dayssince1970',
    lower_term => 12880,
    upper_term => 13580,
    include_lower => 1,
    include_upper => 1,
);

my $hits = $searcher->search(
    query => $query,
    filter => $range_filter,
);

while (my $hit = $hits->fetch_hit_hashref) {
    print Dumper $hit;
}

print "Done.\n";

--
Chris Nandor pudge@pobox.com http://pudge.net/
Open Source Technology Group pudge@ostg.com http://ostg.com/
_02: RangeFilter problems [ In reply to ]
At 13:38 -0800 2007.03.08, Chris Nandor wrote:
>OK, here we go. This causes a segfault when $range_filter is used.
>Without $range_filter, I get the five results I expect.

Oh, and then there's the first problem, the BooleanQuery thing. Here is
the relevant portion of that code, changed from the last mail:

my $bool_query = KinoSearch::Search::BooleanQuery->new;
$bool_query->add_clause(query => $range_filter, occur => 'MUST');
my $filter = KinoSearch::Search::QueryFilter->new(query => $bool_query);

my $hits = $searcher->search(
    query => $query,
    filter => $filter,
);

I realize that this is probably not the way to do it, but then, how do I
combine my $range_filter with other filters? Is it possible?

--
Chris Nandor pudge@pobox.com http://pudge.net/
Open Source Technology Group pudge@ostg.com http://ostg.com/
_02: RangeFilter problems [ In reply to ]
On Mar 8, 2007, at 2:36 PM, Chris Nandor wrote:

> Oh, and then there's the first problem, the BooleanQuery thing.
> Here is
> the relevant portion of that code, changed from the last mail:
>
> my $bool_query = KinoSearch::Search::BooleanQuery->new;
> $bool_query->add_clause(query => $range_filter, occur => 'MUST');
> my $filter = KinoSearch::Search::QueryFilter->new(query =>
> $bool_query);

That crashes because RangeFilter is not a subclass of Query.

> how do I combine my $range_filter with other filters? Is it possible?

Not presently. I've been contemplating how to make this available
(i.e. procrastinating) while working on a bunch of other problems.
The trick is how ranges should score.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
_02: RangeFilter problems [ In reply to ]
I can't get the RangeFilter to work. Any ideas? Here is a sample:

#!/usr/bin/perl

use strict; use warnings;

use KinoSearch::InvIndexer;
use KinoSearch::Searcher;
use KinoSearch::Search::RangeFilter;

package ShotSchema::Title;
use base qw( KinoSearch::Schema::FieldSpec );
sub vectorized { 0 }

package ShotSchema::Date;
use base qw( KinoSearch::Schema::FieldSpec );
sub analyzed { 0 }

package ShotSchema;
use base qw( KinoSearch::Schema );
use KinoSearch::Analysis::PolyAnalyzer;

our %FIELDS = (
date => 'ShotSchema::Date',
title => 'ShotSchema::Title',
);

sub analyzer {
    return KinoSearch::Analysis::PolyAnalyzer->new( language => 'en' );
}

package main;

my $invindexer = KinoSearch::InvIndexer->new(
    invindex => ShotSchema->clobber('index')
);
$invindexer->add_doc({ date => "2006-01-01", title => "ENGLAND"});
$invindexer->finish;

## done indexing

my $searcher = KinoSearch::Searcher->new(
    invindex => ShotSchema->open('index'),
);

## search without filter

my $hits = $searcher->search( query => 'ENGLAND' );

while ( my $hit = $hits->fetch_hit_hashref ) {
    print "$hit->{date}\n";
}

## search with filter

my $filter = KinoSearch::Search::RangeFilter->new(
    field => 'date',
    lower_term => '2005-01-01',
    upper_term => '2007-01-01',
    include_lower => 1,
    include_upper => 1,
);
$hits = $searcher->search(
    query => 'ENGLAND',
    filter => $filter,
);

while ( my $hit = $hits->fetch_hit_hashref ) {
    print "$hit->{date}\n";
}
_02: RangeFilter problems [ In reply to ]
On Mar 15, 2007, at 10:09 AM, Edward Betts wrote:

> I can't get the RangeFilter to work. Any ideas? Here is a sample:

Edward, bless you for supplying me with these minimal failing example
cases! I have duplicated the problem on my system, and I'll tackle a
fix this weekend.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
_02: RangeFilter problems [ In reply to ]
At 9:39 -0700 2007.03.15, Marvin Humphrey wrote:
>That crashes because RangeFilter is not a subclass of Query.

Right, I figured, but it was the closest thing I could find.


>> how do I combine my $range_filter with other filters? Is it possible?
>
>Not presently. I've been contemplating how to make this available
>(i.e. procrastinating) while working on a bunch of other problems.
>The trick is how ranges should score.

OK. Until then I will revert to how I was doing it before (manually
constructing an OR'd query of terms, which is AND'd to the main query
filter).
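
A minimal sketch of that workaround: expand the range into one clause per term and AND the result onto the main query string. The helper name range_as_or_clause is made up for illustration, and the field:term query syntax is an assumption about what the query parser accepts, so check it before relying on this.

```perl
use strict;
use warnings;

# Hypothetical helper: expand an integer range into an OR'd list of
# field:term clauses. The field:term syntax is an assumption.
sub range_as_or_clause {
    my ( $field, $lower, $upper ) = @_;
    return join ' OR ', map { "${field}:$_" } $lower .. $upper;
}

# A short range for readability; the range in this thread
# (12880 .. 13580) would expand to 701 clauses.
my $range_clause = range_as_or_clause( 'dayssince1970', 12880, 12883 );
my $query_string = "(sony OR test) AND ($range_clause)";
print "$query_string\n";
```

The clause count grows linearly with the width of the range, which is the main reason a real range filter is preferable.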

--
Chris Nandor pudge@pobox.com http://pudge.net/
Open Source Technology Group pudge@ostg.com http://ostg.com/
_02: RangeFilter problems [ In reply to ]
At 9:39 -0700 2007.03.15, Marvin Humphrey wrote:
>On Mar 8, 2007, at 2:36 PM, Chris Nandor wrote:
>> how do I combine my $range_filter with other filters? Is it possible?
>
>Not presently. I've been contemplating how to make this available
>(i.e. procrastinating) while working on a bunch of other problems.
>The trick is how ranges should score.

This is something we need pretty soon; is there anything I can do to help
make it work?

--
Chris Nandor pudge@pobox.com http://pudge.net/
Open Source Technology Group pudge@ostg.com http://ostg.com/
_02: RangeFilter problems [ In reply to ]
On Mar 20, 2007, at 11:55 AM, Chris Nandor wrote:

> At 9:39 -0700 2007.03.15, Marvin Humphrey wrote:
>> On Mar 8, 2007, at 2:36 PM, Chris Nandor wrote:
>>> how do I combine my $range_filter with other filters? Is it
>>> possible?
>>
>> Not presently. I've been contemplating how to make this available
>> (i.e. procrastinating) while working on a bunch of other problems.
>> The trick is how ranges should score.
>
> This is something we need pretty soon; is there anything I can do
> to help
> make it work?

Yes, there is.

QueryFilter needs to be changed to cache BitVector objects in a hash,
keyed per IndexReader. The bits() method should be changed to take
an IndexReader rather than a Searcher as an argument, and so should
make_collector(). Calls to those methods in the library and the test
suite need to be adjusted.

Tests need to be added to t/507-query_filter.t to ensure that...

* The caching mechanism works and we don't keep
generating new BitVectors.
* The correct BitVector is returned by the bits()
method (i.e. not one belonging to another IndexReader).

Ideally, destruction of the cached BitVectors held by a QueryFilter
object would be triggered when the IndexReader gets destroyed, since
they're no longer of any use after that. That's a little harder, and
may require some sort of stupid hack to store references to the
BitVectors in IndexReader along with calling weaken() on the refs
held by the QueryFilter object. The point is that we don't want to
accumulate BitVectors when the Searcher/Reader is being continually
refreshed.
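
As a rough illustration of the per-reader caching and cleanup described above, here is a pure-Perl sketch. It does not use KinoSearch at all: SketchQueryFilter, its compute callback, and the plain hashes standing in for IndexReaders are all hypothetical. It takes a simpler route than storing references in IndexReader: each cache entry keeps a weakened reference to its reader, and stale entries are purged on the next call to bits().

```perl
use strict;
use warnings;

{
    package SketchQueryFilter;    # hypothetical stand-in, not KinoSearch code
    use Scalar::Util qw(refaddr weaken);

    sub new {
        my ( $class, %args ) = @_;
        return bless { compute => $args{compute}, cache => {}, fills => 0 },
            $class;
    }

    sub bits {
        my ( $self, $reader ) = @_;
        my $cache = $self->{cache};

        # Purge entries whose reader is gone; its weak ref is now undef.
        for my $key ( keys %$cache ) {
            delete $cache->{$key} unless defined $cache->{$key}{reader};
        }

        # Fill the cache once per live reader.
        my $key = refaddr($reader);
        if ( !$cache->{$key} ) {
            my $entry
                = { reader => $reader, bits => $self->{compute}->($reader) };
            weaken( $entry->{reader} );
            $cache->{$key} = $entry;
            $self->{fills}++;
        }
        return $cache->{$key}{bits};
    }
}

# Two fake "readers"; the callback stands in for running the query.
my $filter
    = SketchQueryFilter->new( compute => sub { [ @{ $_[0]{docs} } ] } );
my $reader_a = { docs => [ 1, 3 ] };
my $reader_b = { docs => [2] };

my $first  = $filter->bits($reader_a);
my $other  = $filter->bits($reader_b);
my $second = $filter->bits($reader_a);    # served from the cache

undef $reader_a;             # reader destroyed
$filter->bits($reader_b);    # this call purges the stale entry
```

Because the purge runs before the lookup, an address reused by a new reader never picks up bits computed for a destroyed one.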

RangeFilter also needs make_collector() changed to be keyed off of an
IndexReader. That will be straightforward, as the first thing
RangeFilter->make_collector does right now is call get_reader().
Tests and Library calls to the method need to be adjusted, but won't
need any changes to their substance.

RangeFilter then needs a bits() method added to it. It will probably
look like this...

sub bits {
    my ( $self, $reader ) = @_;

    # collect docs that have a value for this field which passes the filter
    my $collector = KinoSearch::Search::HitCollector->new_bit_coll;
    my $searcher = KinoSearch::Searcher->new( reader => $reader );
    my $query = KinoSearch::Search::MatchFieldQuery->new(
        field => $self->{field},
    );
    $searcher->collect(
        query => $query,
        filter => $self,
        collector => $collector,
    );

    return $collector->get_bit_vector;
}

Searcher->collect needs to be created, but that will basically be a
refactoring of Searcher->search_hit_collector which will be trivial
for me and hard for anyone else... so I'll handle that.

MatchFieldQuery (which will be nearly identical to TermQuery) also
needs to be written. Writing tests to ensure that a Searcher returns
correct results when supplied with a MatchFieldQuery will be pretty
straightforward and would be appreciated.

I'd love it if someone else wanted to get involved in writing
MatchFieldQuery itself, but such a person would need to be willing
to absorb some information retrieval theory -- so I'll assume it will
be my sole responsibility (as will MatchFieldScorer) unless someone
expresses an interest.

Finally, we need to create PolyFilter. PolyFilter will have an add()
method which works like this:

$poly_filter->add(
    filter => $filter,
    logic => 'AND',
);

PolyFilter->bits() will call bits() on each of its sub-filters, then
it will combine the BitVectors together. Like QueryFilter, it will
cache filters per-IndexReader.
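
To make the proposed combining step concrete, here is a pure-Perl sketch. Everything in it is hypothetical: SketchPolyFilter and SketchBitsFilter are stand-ins, Perl bit strings take the place of BitVector objects, and per-reader caching is left out. It also assumes that clauses are folded left to right starting from the first filter's bits, so the logic given for the first clause is ignored; the proposal above does not settle that.

```perl
use strict;
use warnings;

{
    package SketchPolyFilter;    # hypothetical; not the KinoSearch class

    # Bit strings stand in for BitVectors; Perl's string bitwise
    # operators do the combining.
    my %COMBINE = (
        AND => sub { $_[0] & $_[1] },
        OR  => sub { $_[0] | $_[1] },
        XOR => sub { $_[0] ^ $_[1] },
    );

    sub new {
        my ($class) = @_;
        return bless { clauses => [] }, $class;
    }

    sub add {
        my ( $self, %args ) = @_;
        die "unknown logic '$args{logic}'" unless $COMBINE{ $args{logic} };
        push @{ $self->{clauses} }, {%args};
        return $self;
    }

    # Fold the sub-filters' bits left to right (an assumed semantics).
    sub bits {
        my ( $self, $reader ) = @_;
        my ( $first, @rest ) = @{ $self->{clauses} };
        die "no filters added" unless $first;
        my $bits = $first->{filter}->bits($reader);
        for my $clause (@rest) {
            $bits = $COMBINE{ $clause->{logic} }
                ->( $bits, $clause->{filter}->bits($reader) );
        }
        return $bits;
    }
}

{
    package SketchBitsFilter;    # hypothetical filter with fixed bits
    sub new { my ( $class, $bits ) = @_; return bless { bits => $bits }, $class }
    sub bits { return $_[0]{bits} }
}

# Helpers for an 8-document toy index.
sub bit_string { my $bits = "\0"; vec( $bits, $_, 1 ) = 1 for @_; return $bits }
sub docs_in { my ($bits) = @_; return grep { vec( $bits, $_, 1 ) } 0 .. 7 }

my $poly = SketchPolyFilter->new;
$poly->add( filter => SketchBitsFilter->new( bit_string( 0 .. 3 ) ), logic => 'AND' );
$poly->add( filter => SketchBitsFilter->new( bit_string( 2 .. 5 ) ), logic => 'AND' );
my @docs_and = docs_in( $poly->bits(undef) );    # docs 2 and 3

$poly->add( filter => SketchBitsFilter->new( bit_string(7) ), logic => 'OR' );
my @docs_or = docs_in( $poly->bits(undef) );     # docs 2, 3 and 7
```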

At present, BitVector only has a logical_and() method; if PolyFilter
is to be able to combine filters using OR, XOR, etc, the appropriate
methods need to be added to BitVector. This is deceptively
difficult. It involves classic C bit-twiddling, but has to be
maximally efficient, and there are a lot of nasty corner cases that
need tests. I'm assuming I'll be handling this one.
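
For anyone writing tests against the new methods, the expected set semantics can be prototyped in pure Perl with vec() and the string bitwise operators. This is only a reference sketch with made-up helper names (bits_from_docs, docs_from_bits); it is not the C implementation and says nothing about its efficiency. It does show one of the corner cases: operands of different capacity.

```perl
use strict;
use warnings;

# Build a bit string of the given capacity with these doc numbers set.
sub bits_from_docs {
    my ( $capacity, @docs ) = @_;
    my $bits = "\0" x int( ( $capacity + 7 ) / 8 );
    vec( $bits, $_, 1 ) = 1 for @docs;
    return $bits;
}

# List the doc numbers whose bit is set.
sub docs_from_bits {
    my ($bits) = @_;
    return grep { vec( $bits, $_, 1 ) } 0 .. 8 * length($bits) - 1;
}

my $x = bits_from_docs( 16, 1, 3, 5, 9 );
my $y = bits_from_docs( 16, 3, 4, 9, 12 );

my @and = docs_from_bits( $x & $y );    # 3 9
my @or  = docs_from_bits( $x | $y );    # 1 3 4 5 9 12
my @xor = docs_from_bits( $x ^ $y );    # 1 4 5 12

# Corner case: unequal capacities. Perl's string & truncates to the
# shorter operand, while | and ^ zero-extend the shorter one.
my $short = bits_from_docs( 8,  1 );
my $long  = bits_from_docs( 16, 1, 9 );
my @and_mixed = docs_from_bits( $short & $long );    # 1
my @or_mixed  = docs_from_bits( $short | $long );    # 1 9
```

A C version has to decide explicitly what happens to the bits past the shorter vector's capacity, which is where tests like the last two earn their keep.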

Still with me? ;)

I also ask that potential hackers agree to contribute their code to
Apache. That way we can use it in Lucy without complication.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
_02: RangeFilter problems [ In reply to ]
Taking one piece at a time :-)

At 13:14 -0700 2007.03.21, Marvin Humphrey wrote:
>QueryFilter needs to be changed to cache BitVector objects in a hash,
>keyed per IndexReader. The bits() method should be changed to take
>an IndexReader rather than a Searcher as an argument, and so should
>make_collector().

Looking at those methods ... most of it looks straightforward, but how
would I handle this (in bits()) without a Searcher?

$searcher->search_hit_collector(
    weight => $self->{query}->to_weight($searcher),
    hit_collector => $collector,
);

--
Chris Nandor pudge@pobox.com http://pudge.net/
Open Source Technology Group pudge@ostg.com http://ostg.com/
_02: RangeFilter problems [ In reply to ]
On Mar 22, 2007, at 3:50 PM, Chris Nandor wrote:

> Looking at those methods ... most of it looks straightforward, but how
> would I handle this (in bits()) without a Searcher?
>
> $searcher->search_hit_collector(
> weight => $self->{query}->to_weight($searcher),
> hit_collector => $collector,
> );

Create a new Searcher. They're cheap. The IndexReader does all the
caching.

You will need to work off of svn trunk, because it has only recently
become possible to do this:

my $searcher = KinoSearch::Searcher->new(
    reader => $reader,
);

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
_02: RangeFilter problems [ In reply to ]
At 16:11 -0700 2007.03.22, Marvin Humphrey wrote:
>On Mar 22, 2007, at 3:50 PM, Chris Nandor wrote:
>
>> Looking at those methods ... most of it looks straightforward, but how
>> would I handle this (in bits()) without a Searcher?
>>
>> $searcher->search_hit_collector(
>> weight => $self->{query}->to_weight($searcher),
>> hit_collector => $collector,
>> );
>
>Create a new Searcher. They're cheap. The IndexReader does all the
>caching.
>
>You will need to work off of svn trunk, because it has only recently
>become possible to do this:
>
> my $searcher = KinoSearch::Searcher->new(
> reader => $reader,
> );

Neat, thanks. I'll write back more later as needed. :-)

Oh, what about the earlier call in bits() to max_doc? Should that be done
on the reader, or the searcher, and does it matter?

--
Chris Nandor pudge@pobox.com http://pudge.net/
Open Source Technology Group pudge@ostg.com http://ostg.com/
_02: RangeFilter problems [ In reply to ]
On Mar 22, 2007, at 4:17 PM, Chris Nandor wrote:

> Oh, what about the earlier call in bits() to max_doc? Should that
> be done
> on the reader, or the searcher, and does it matter?

It doesn't matter. This is Searcher->max_doc:

sub max_doc { shift->{reader}->max_doc }

max_doc() looks different in other searchers, e.g. MultiSearcher, so
if $searcher were opaque, we'd have to call it on $searcher. But
since we know this is really a KinoSearch::Searcher -- because we
just created it -- it doesn't matter.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
_02: RangeFilter problems [ In reply to ]
On Mar 15, 2007, at 10:09 AM, Edward Betts wrote:

> I can't get the RangeFilter to work. Any ideas? Here is a sample:

Edward, thank you for this test. Your bug is fixed in SVN trunk.
The patch is archived here:

http://www.rectangular.com/pipermail/kinosearch-commits/2007-March/000017.html

A word of warning about trunk (Pudge, this is for you, too): right
now, trunk passes its tests on all of my boxen, but a recent valgrind
test turned up some memory errors. I may need to go in and disable
some code temporarily to make it safer to work with, but I'm not sure.

So you may just want to apply that patch on RangeFilter.pm to your
local copy, Edward.

Pudge, I don't think this one relates to your segfault problem. I'm
working on the intermittent MultiTermList test failure right now.
Perhaps that one will nail it down.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
_02: RangeFilter problems [ In reply to ]
> $searcher->search_hit_collector(
> weight => $self->{query}->to_weight($searcher),
> hit_collector => $collector,
> );

I've finished the work refactoring search_hit_collector() into
collect().

This call should now look like...

$searcher->collect(
    query => $self->{query},
    collector => $collector,
);

Three changes: it now takes a Query rather than a Weight, the name of
the method is now collect(), and the argument is now labeled
"collector" rather than "hit_collector".

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
_02: RangeFilter problems [ In reply to ]
On 3/23/07, Marvin Humphrey <marvin@rectangular.com> wrote:
> Edward, thank you for this test. Your bug is fixed in SVN trunk.

Excellent. Thanks very much. Keep up the good work.

--
Edward Betts
_02: RangeFilter problems [ In reply to ]
At 13:14 -0700 2007.03.21, Marvin Humphrey wrote:
>QueryFilter needs to be changed to cache BitVector objects in a hash,
>keyed per IndexReader.

Can this just be moved from the QueryFilter object to the IndexReader
object? That is, just change $self->{cached_bits} to
$reader->{cached_bits}?

Does RangeFilter's to-be-new bits() method need the same caching?

--
Chris Nandor pudge@pobox.com http://pudge.net/
Open Source Technology Group pudge@ostg.com http://ostg.com/
_02: RangeFilter problems [ In reply to ]
At 13:14 -0700 2007.03.21, Marvin Humphrey wrote:
>QueryFilter needs to be changed to cache BitVector objects in a hash,
>keyed per IndexReader. The bits() method should be changed to take
>an IndexReader rather than a Searcher as an argument, and so should
>make_collector(). Calls to those methods in the library and the test
>suite need to be adjusted.

I see no calls in the test suite, or anywhere else, to these two methods;
just one call to make_collector() in Searcher.pm, and the call to bits() in
make_collector().

Which also makes me wonder why RangeFilter needs a bits(), since it
wouldn't be called. Unless I am missing something, which is not unlikely.


>Tests need to be added to t/507-query_filter.t to ensure that...
>
> * The caching mechanism works and we don't keep
> generating new BitVectors.
> * The correct BitVector is returned by the bits()
> method (i.e. not one belonging to another IndexReader).

Is it sufficient to run search() a few times and compare the BitVector objects?

--
Chris Nandor pudge@pobox.com http://pudge.net/
Open Source Technology Group pudge@ostg.com http://ostg.com/
_02: RangeFilter problems [ In reply to ]
On Mar 27, 2007, at 11:57 AM, Chris Nandor wrote:

> At 13:14 -0700 2007.03.21, Marvin Humphrey wrote:
>> QueryFilter needs to be changed to cache BitVector objects in a hash,
>> keyed per IndexReader.
>
> Can this just be moved from the QueryFilter object to the IndexReader
> object? That is, just change $self->{cached_bits} to
> $reader->{cached_bits}?

Then we would need to key the IndexReader's cached bits by
QueryFilter instance.

Lemme think about that for a sec. (But I won't delay sending this
email while I'm thinking.)

In general, we should avoid putting anything in IndexReader unless we
have to. It's a big class, and it's only going to get bigger. I'd
rather do something fairly outrageous in RangeFilter if we can.

> Does RangeFilter's to-be-new bits() method need the same caching?

There's actually some caching going on with regards to RangeFilter,
but it's already stored in the IndexReader. :)

When you first apply a RangeFilter against a field, KS reads through
the terms and docs for the field in question and records the "term
number" associated with each doc into an array. This only has to be
done once per field, per IndexReader -- an IndexReader always
represents an unchanging snapshot of the index in time, so the
field's content is static.

Then, this wrapping hit collector is used to pass hits to the inner
hit collector only for documents which possess a value for the field
which is inside the acceptable range:

static void
HC_RangeColl_collect(HitCollector *self, u32_t doc_num, float score)
{
    RangeCollData *const data = (RangeCollData*)self->data;
    const i32_t locus = IntMap_Get(data->sort_cache, doc_num);

    if (locus >= data->lower_bound && locus <= data->upper_bound) {
        data->inner_coll->collect(data->inner_coll, doc_num, score);
    }
}

So, RangeFilter does benefit from caching, but the caching can be
handled differently from QueryFilter:

* Multiple RangeFilters which operate against the
same field can share the same sort cache.
* Multiple QueryFilters cannot share caches because
each query typically produces a distinct result set.
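
The sort-cache idea can be sketched in a few lines of pure Perl. This is an illustration only: the sample field values, the -1 marker for docs without a value, and the range_bounds helper are all made up, both bounds are treated as inclusive, and the real code is C working on an IntMap rather than Perl arrays.

```perl
use strict;
use warnings;

# Hypothetical data: index = doc number, value = the field's term
# (undef means the doc has no value for the field).
my @field_values
    = ( '2006-01-01', '2004-06-15', undef, '2007-03-08', '2006-01-01' );

# One-time pass per field: number the distinct terms in sorted order,
# then record each doc's term number (-1 for docs without a value).
my %seen  = map { $_ => 1 } grep { defined } @field_values;
my @terms = sort keys %seen;
my %term_num;
@term_num{@terms} = 0 .. $#terms;
my @sort_cache = map { defined $_ ? $term_num{$_} : -1 } @field_values;

# Translate inclusive range endpoints into term-number bounds.
sub range_bounds {
    my ( $lower_term, $upper_term ) = @_;
    my ($lower) = grep { $terms[$_] ge $lower_term } 0 .. $#terms;
    my ($upper) = grep { $terms[$_] le $upper_term } reverse 0 .. $#terms;
    $lower = scalar @terms unless defined $lower;    # nothing in range
    $upper = -1            unless defined $upper;    # nothing in range
    return ( $lower, $upper );
}

# Per-hit check: two integer comparisons, as in the collector above.
my ( $lower, $upper ) = range_bounds( '2005-01-01', '2007-01-01' );
my @matches = grep { $sort_cache[$_] >= $lower && $sort_cache[$_] <= $upper }
    0 .. $#sort_cache;
print "matching docs: @matches\n";    # docs 0 and 4
```

Any other range over the same field reuses @sort_cache and only recomputes the two bounds, which is why RangeFilters on one field can share a cache.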

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
_02: RangeFilter problems [ In reply to ]
At 12:57 -0700 2007.03.27, Marvin Humphrey wrote:
>Then we would need to key the IndexReader's cached bits by
>QueryFilter instance.

Yeah, the initial version of my question asked whether we should cache by
IndexReader or by IndexReader by QueryFilter. That's kinda what I thought,
but I wasn't entirely sure what was going on, but I see my first instinct
was correct.

I still don't see when a bits() method in RangeFilter would be called, though.

--
Chris Nandor pudge@pobox.com http://pudge.net/
Open Source Technology Group pudge@ostg.com http://ostg.com/
_02: RangeFilter problems [ In reply to ]
On Mar 27, 2007, at 12:54 PM, Chris Nandor wrote:

>> QueryFilter needs to be changed to cache BitVector objects in a hash,
>> keyed per IndexReader. The bits() method should be changed to take
>> an IndexReader rather than a Searcher as an argument, and so should
>> make_collector(). Calls to those methods in the library and the test
>> suite need to be adjusted.
>>
>
> I see no calls in the test suite, or anywhere else, to these two
> methods;
> just one call to make_collector() in Searcher.pm, and the call to
> bits() in
> make_collector().

Then only those two cases need to be modified.

> Which also makes me wonder why RangeFilter needs a bits(), since it
> wouldn't be called.

It will be called by PolyFilter. PolyFilter needs to combine
multiple filters and cache the result as a BitVector.

If we tried to modify the cached bits from a QueryFilter by running
its contents through a RangeFilter's hit collector somehow, that
would be awkward, inefficient, and error prone. Better to limit our
AND, OR and XOR operations to BitVectors operating on other BitVectors.

>> Tests need to be added to t/507-query_filter.t to ensure that...
>>
>> * The caching mechanism works and we don't keep
>> generating new BitVectors.
>> * The correct BitVector is returned by the bits()
>> method (i.e. not one belonging to another IndexReader).
>>
>
> Is it sufficient to run search() a few times and compare the
> BitVector objects?

Yeah, that'll work if the readers produce distinct result sets.

Pseudocode...

my $query_filter = QueryFilter->new($query);

my $reader_a = IndexReader->open($invindex_a);
my $reader_b = IndexReader->open($invindex_b);

$searcher = Searcher->new( $reader_a );
verify_results( $query_filter, $searcher, \@results_a );
$searcher = Searcher->new( $reader_b );
verify_results( $query_filter, $searcher, \@results_b );
$searcher = Searcher->new( $reader_a );
verify_results( $query_filter, $searcher, \@results_a );

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
_02: RangeFilter problems [ In reply to ]
OK, sounds good re: tests. I think I basically have this all set. Not including the tests, here is what I have. It is all working fine, though RangeFilter->bits() is incomplete because of MatchFieldQuery.pm. I'll send the tests along later tonight, probably.

Index: lib/KinoSearch/Search/QueryFilter.pm
===================================================================
--- lib/KinoSearch/Search/QueryFilter.pm (revision 2218)
+++ lib/KinoSearch/Search/QueryFilter.pm (working copy)
@@ -24,16 +24,21 @@
}

sub bits {
- my ( $self, $searcher ) = @_;
+ my ( $self, $reader ) = @_;

+ my $cached_bits = $self->{$reader}{cached_bits};
+
# fill the cache
- if ( !defined $self->{cached_bits} ) {
- $self->{cached_bits} = KinoSearch::Util::BitVector->new(
- capacity => $searcher->max_doc );
+ if ( !defined $cached_bits ) {
+ $self->{$reader}{cached_bits} = $cached_bits = KinoSearch::Util::BitVector->new(
+ capacity => $reader->max_doc );

my $collector = KinoSearch::Search::HitCollector->new_bit_coll(
- bit_vector => $self->{cached_bits} );
+ bit_vector => $cached_bits );

+ my $searcher = KinoSearch::Searcher->new(
+ reader => $reader );
+
# perform the search
$searcher->collect(
query => $self->{query},
@@ -41,13 +46,14 @@
);
}

- return $self->{cached_bits};
+ return $cached_bits;
}

sub make_collector {
- my ( $self, $inner_coll, $searcher ) = @_;
+ my ( $self, $inner_coll, $reader ) = @_;
+
return KinoSearch::Search::HitCollector->new_filt_coll(
- filter_bits => $self->bits($searcher),
+ filter_bits => $self->bits($reader),
collector => $inner_coll,
);
}
Index: lib/KinoSearch/Search/RangeFilter.pm
===================================================================
--- lib/KinoSearch/Search/RangeFilter.pm (revision 2218)
+++ lib/KinoSearch/Search/RangeFilter.pm (working copy)
@@ -27,10 +27,27 @@
}
}

+sub bits {
+ my ( $self, $reader ) = @_;
+
+ # collect docs that have a value for this field which passes the filter
+ my $collector = KinoSearch::Search::HitCollector->new_bit_coll;
+ my $searcher = KinoSearch::Searcher->new( reader => $reader );
+# XXX not working yet
+# my $query = KinoSearch::Search::MatchFieldQuery->new(
+# field => $self->{field} );
+
+ $searcher->collect(
+# query => $query,
+ filter => $self,
+ collector => $collector,
+ );
+
+ return $collector->get_bit_vector;
+}
+
sub make_collector {
- my ( $self, $inner_coll, $searcher ) = @_;
- confess("Can't get an reader") unless $searcher->can('get_reader');
- my $reader = $searcher->get_reader;
+ my ( $self, $inner_coll, $reader ) = @_;
my $sort_cache = $reader->fetch_sort_cache( $self->{field} );

my $low_term
Index: lib/KinoSearch/Searcher.pm
===================================================================
--- lib/KinoSearch/Searcher.pm (revision 2218)
+++ lib/KinoSearch/Searcher.pm (working copy)
@@ -148,7 +148,7 @@
# wrap the collector if there's a filter
my $collector = $args{collector};
if ( defined $args{filter} ) {
- $collector = $args{filter}->make_collector( $collector, $self );
+ $collector = $args{filter}->make_collector( $collector, $reader );
}

# process prune_factor if supplied

--
Chris Nandor pudge@pobox.com http://pudge.net/
Open Source Technology Group pudge@ostg.com http://ostg.com/
