Mailing List Archive

Document Scoring
Hi,

I've been going through Lucene's code for days now and I'm puzzled as to how the scoring works.

The scoring is ultimately performed by the Scorer which in turn uses the Similarity class.

The Similarity class needs the docFreq(Term t) and maxDoc() methods.

It calls these methods through the Searcher that is passed to it [Similarity.idf(Term t, Searcher searcher)].

Now, the class IndexSearcher, which extends Searcher, refers back to the IndexReader. Here is the puzzling part:

The method docFreq(Term t) and the method maxDoc() are both declared abstract in IndexReader.

Faced with this question, I was obliged to check out how it was done in the demo that comes with Lucene. But the demo still uses IndexSearcher which in turn uses IndexReader to call these methods.

Can anyone please shed some light here?

Melissa
Re: Document Scoring
>
>
>
>Now, the class IndexSearcher, which extends Searcher, refers back to the IndexReader. Here is the puzzling part:
>
> The method docFreq(Term t) and the method maxDoc() are both declared abstract in IndexReader.
>
>Faced with this question, I was obliged to check out how it was done in the demo that comes with Lucene. But the demo still uses IndexSearcher which in turn uses IndexReader to call these methods.
>
There are a couple of kinds of IndexReaders that are supported. One is
called SegmentReader and the other SegmentsReader (notice the plural).
The first one is used to access a single index segment, whereas the
second one is used to search unoptimized indexes that have more
than one segment. These are implementation classes that extend
IndexReader in different ways. Other readers could potentially be
created as well that would implement access to some other kind of
data structure.
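
To make the abstract-method point concrete, here is a minimal sketch of
the pattern. It is not Lucene's actual code: Reader, SingleSegmentReader
and MultiSegmentReader are hypothetical stand-ins for IndexReader,
SegmentReader and SegmentsReader, terms are plain strings, and the idf
formula is the usual log-based form, used here as an assumption. The
caller only ever sees the abstract type, and the concrete subclass
supplies docFreq() and maxDoc() at run time.

```java
import java.util.List;
import java.util.Map;

public class AbstractReaderSketch {

    // Hypothetical stand-in for IndexReader: declares the methods, implements neither.
    static abstract class Reader {
        abstract int docFreq(String term);
        abstract int maxDoc();
    }

    // Hypothetical stand-in for a reader over a single segment.
    static class SingleSegmentReader extends Reader {
        private final Map<String, Integer> docFreqs;
        private final int docCount;

        SingleSegmentReader(Map<String, Integer> docFreqs, int docCount) {
            this.docFreqs = docFreqs;
            this.docCount = docCount;
        }

        @Override
        int docFreq(String term) {
            return docFreqs.getOrDefault(term, 0);
        }

        @Override
        int maxDoc() {
            return docCount;
        }
    }

    // Hypothetical stand-in for a reader over several segments: it sums its parts.
    static class MultiSegmentReader extends Reader {
        private final List<Reader> segments;

        MultiSegmentReader(List<Reader> segments) {
            this.segments = segments;
        }

        @Override
        int docFreq(String term) {
            int total = 0;
            for (Reader r : segments) {
                total += r.docFreq(term);
            }
            return total;
        }

        @Override
        int maxDoc() {
            int total = 0;
            for (Reader r : segments) {
                total += r.maxDoc();
            }
            return total;
        }
    }

    // The caller is written against the abstract type only.
    // Assumed formula: log(maxDoc / (docFreq + 1)) + 1.
    static float idf(String term, Reader reader) {
        return (float) (Math.log(reader.maxDoc() / (double) (reader.docFreq(term) + 1)) + 1.0);
    }

    public static void main(String[] args) {
        Reader a = new SingleSegmentReader(Map.of("lucene", 3), 10);
        Reader b = new SingleSegmentReader(Map.of("lucene", 1), 5);
        Reader all = new MultiSegmentReader(List.of(a, b));

        System.out.println(all.docFreq("lucene")); // prints 4
        System.out.println(all.maxDoc());          // prints 15
        System.out.println(idf("lucene", all));
    }
}
```

The idf() method never needs to know which kind of reader it was given,
which is why the abstract declarations in the base class are enough for
the calling code to compile and run.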

Not sure if this sheds any light on your original question about how
the scoring works, but it should explain the mystery of the abstract
methods. As for the scoring, you will probably find your questions
answered by one of the FAQs. They give the answer in terms of search
theory, which might save you time in trying to figure this out
from code. I would include the answer here, but I'm not that strong in
the theory/math of it, so the FAQ is your best bet. I just know it
works, and that happens to be good enough for me today :).

Dmitry.

>
>
>Can anyone please shed some light here?
>
>Melissa
>
>



