Mailing List Archive

Filtering in Lucene
For those of you who have used the BitSet approach to restrict a Lucene
search to a subset of documents: I want to make sure I have this right.
If I have 100 000 documents to search, my bit vector will be 100 000 bits
long, and to save that vector for repeated use I'll have to store it in a
CLOB. Am I thinking about this correctly, or have I misunderstood the
concept?

thanks

Nader S. Henein
Bayt.com , Dubai Internet City
Tel. +9714 3911900
Fax. +9714 3911915
GSM. +9715 05659557
www.bayt.com


--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
Re: Filtering in Lucene
Yes, the BitSet has to be the same size as the number of documents in the
index (one bit per document).

If you have enough memory, you could cache the BitSet in the VM's memory
rather than storing it in a CLOB, though.

Joel
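
For a sense of scale: one bit per document for 100 000 documents comes to
12 500 bytes, so holding the BitSet in memory is cheap. Below is a minimal
sketch of that kind of in-memory cache, using only java.util.BitSet. The
class SubsetFilterCache and its method names are made up for illustration
and are not part of Lucene.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not a Lucene class): keeps one BitSet per named subset in memory.
class SubsetFilterCache {
    private final Map<String, BitSet> cache = new HashMap<>();

    // Bytes needed to hold one bit per document, rounded up to whole bytes.
    static int bytesNeeded(int numDocs) {
        return (numDocs + 7) / 8;
    }

    // Store a copy so later changes by the caller do not alter the cached bits.
    void put(String subsetName, BitSet bits) {
        cache.put(subsetName, (BitSet) bits.clone());
    }

    // Return a copy of the cached bits, or null if nothing is cached under that name.
    BitSet get(String subsetName) {
        BitSet bits = cache.get(subsetName);
        return bits == null ? null : (BitSet) bits.clone();
    }

    public static void main(String[] args) {
        int numDocs = 100_000;
        BitSet subset = new BitSet(numDocs);
        subset.set(0);
        subset.set(42);
        subset.set(99_999);

        SubsetFilterCache cache = new SubsetFilterCache();
        cache.put("my-subset", subset);

        System.out.println(bytesNeeded(numDocs));                  // prints 12500
        System.out.println(cache.get("my-subset").cardinality());  // prints 3
    }
}
```

A plain HashMap is enough for a single-threaded sketch; a cache shared
between search threads would need some form of synchronization.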

----- Original Message -----
From: "Nader S. Henein" <nsh@bayt.net>
To: <lucene-user@jakarta.apache.org>
Sent: Monday, May 13, 2002 11:11 AM
Subject: Filtering in Lucene


RE: Filtering in Lucene
So if I run a search and then filter two minutes later, but sometime in
between the index was automatically updated (let's say a document was
deleted from the middle of the index), my filter will be skewed by one
and therefore wrong?

If that's true, we need to start looking at a different approach. With an
index that changes constantly (I delete documents every 5 minutes and
add/update and optimize documents every 30 minutes), there is no way this
filter will work.
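
The hazard described above can be shown with a toy model. The sketch below
does not use the Lucene API: a plain list stands in for the index and the
list position stands in for the document number, which is an assumption
made only for illustration. Exactly when a real index renumbers its
documents depends on how it handles deletes and merges. The class
StaleFilterDemo and its methods are hypothetical. It shows a saved,
position-based BitSet going stale after a delete, and one possible way
around it: rebuilding the BitSet from a stable key after the index changes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy stand-in for an index: the list position plays the role of the document number.
// Illustration only; this does not use or model the real Lucene API.
class StaleFilterDemo {

    // Build a position-based BitSet from a set of stable document keys.
    static BitSet buildFilter(List<String> docKeys, Set<String> allowedKeys) {
        BitSet bits = new BitSet(docKeys.size());
        for (int docNum = 0; docNum < docKeys.size(); docNum++) {
            if (allowedKeys.contains(docKeys.get(docNum))) {
                bits.set(docNum);
            }
        }
        return bits;
    }

    // Resolve the set bits back to the keys they currently point at,
    // ignoring bits that fall past the end of the (possibly shrunken) index.
    static List<String> select(List<String> docKeys, BitSet bits) {
        List<String> hits = new ArrayList<>();
        for (int i = bits.nextSetBit(0); i >= 0 && i < docKeys.size(); i = bits.nextSetBit(i + 1)) {
            hits.add(docKeys.get(i));
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>(Arrays.asList("a", "b", "c", "d", "e"));
        Set<String> allowed = new HashSet<>(Arrays.asList("d", "e"));

        BitSet saved = buildFilter(index, allowed);  // bits 3 and 4 are set
        index.remove("b");                           // everything after "b" shifts down by one

        // The saved bits now point at the wrong positions.
        System.out.println(select(index, saved));                        // prints [e]
        // Rebuilding from the stable keys gives the intended subset again.
        System.out.println(select(index, buildFilter(index, allowed)));  // prints [d, e]
    }
}
```

The trade-off is that the rebuild costs one pass over the index per change,
so whether this is practical depends on how often the index is updated.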



-----Original Message-----
From: Joel Bernstein [mailto:j.bernstein@ei.org]
Sent: Monday, May 13, 2002 8:49 PM
To: Lucene Users List; nsh@bayt.net
Subject: Re: Filtering in Lucene

