Mailing List Archive: Term ordering for IndexReader.termDocs()

Jan 25, 2002, 8:02 AM

Post #1 of 3

Hello,

I'm creating a filter from a set of terms that are read from
a file, and I find that IndexReader.termDocs(Term(fieldName, valueFromFile))
does this quite well (around 0.1 secs elapsed time in jython code.)

Would it be advantageous to sort the values from the file before
using them in this way? This could help to reduce the number of disk seeks,
but I have no idea about the way the segments are organized on disk.

I have not profiled this yet, because I have only tried it with fewer than
100 terms on a relatively small index. I wonder whether performance
is still as good at, say, 20000 terms.

Thanks in advance,
Ype Kingma

--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>

RE: Term ordering for IndexReader.termDocs()

DCutting at grandcentral

Jan 25, 2002, 9:49 AM

Post #2 of 3

> From: Ype Kingma [mailto:ykingma@xs4all.nl]
>
> I'm creating a filter from a set of terms that are read from
> a file, and I find that IndexReader.termDocs(Term(fieldName,
> valueFromFile))
> does this quite well (around 0.1 secs elapsed time in jython code.)
>
> Would it be advantageous to sort the values from the file before
> using them in this way?

Yes, that would be faster. The term dictionary is sorted and this would
reduce both i/o and computation.
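
A minimal sketch of sorting the values before the lookups, assuming the term dictionary order matches java.lang.String.compareTo() (the thread does not confirm this). The class and method names are hypothetical, and the actual reader.termDocs(...) call is left as a comment so the example runs without Lucene:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortedTermLookup {
    // Returns a sorted copy of the values; the input list is not modified.
    static List<String> sortedValues(List<String> valuesFromFile) {
        List<String> sorted = new ArrayList<>(valuesFromFile);
        Collections.sort(sorted); // natural order, i.e. String.compareTo()
        return sorted;
    }

    public static void main(String[] args) {
        // Stand-in for the values read from the file.
        List<String> valuesFromFile = Arrays.asList("pear", "apple", "orange");

        for (String value : sortedValues(valuesFromFile)) {
            // Assumption: this is where the real code would call
            // reader.termDocs(new Term(fieldName, value)), so that the
            // lookups walk the sorted term dictionary in one direction.
            System.out.println(value);
        }
    }
}
```

Sorting a copy keeps the original file order available in case it is needed elsewhere.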

Doug

RE: Term ordering for IndexReader.termDocs()

ykingma at xs4all

Jan 25, 2002, 11:42 AM

Post #3 of 3

Doug,

> > From: Ype Kingma [mailto:ykingma@xs4all.nl]
> >
> > I'm creating a filter from a set of terms that are read from
> > a file, and I find that IndexReader.termDocs(Term(fieldName,
> > valueFromFile))
> > does this quite well (around 0.1 secs elapsed time in jython code.)
> >
> > Would it be advantageous to sort the values from the file before
> > using them in this way?
>
> Yes, that would be faster. The term dictionary is sorted and this would
> reduce both i/o and computation.

Thanks. I suppose it would be correct to assume that the sort order
is that of java.lang.String.compareTo()?
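
For reference, a small sketch of what String.compareTo() ordering looks like in practice. It only illustrates Java's own string ordering (by UTF-16 code unit, so digits sort before uppercase letters, which sort before lowercase letters); whether the term dictionary uses exactly this order is the open question above. The class and method names are hypothetical:

```java
import java.util.Arrays;

class CompareToOrder {
    // Sorts a copy of the array using the natural ordering of String,
    // which is defined by String.compareTo().
    static String[] sortByCompareTo(String[] values) {
        String[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        String[] values = {"banana", "Apple", "apple", "Zebra", "10", "9"};
        // prints [10, 9, Apple, Zebra, apple, banana]
        System.out.println(Arrays.toString(sortByCompareTo(values)));
    }
}
```

Note that "10" sorts before "9" and "Zebra" before "apple", so values that look numeric or mix case will not come out in numeric or case-insensitive order.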

Regards,
Ype