Mailing List Archive

Term hitscores in multiterm searches
Hi gang!

If you do a multiterm query to Lucene, say "foo bar zoo", it gives you a
heap of documents (Hits) as a result and all is well. If you want some
tracking abilities for this query, for instance you want to know that
document X was included in the Hits because "foo" and "bar" matched but
"zoo" did not, how would you go about it? In previous applications, I have done
individual searches for each term and kept track of which term scored the
highest (or not at all), but this is currently not possible since the query
traffic will be overwhelming - we're talking 30-40 words per query, and
hundreds of queries per second (distributed on several servers of course).

So, is there any way to gather this information without doing searches on
the individual terms of the query?

Thanks,
Fredrik
Re: Term hitscores in multiterm searches
On Thursday 23 November 2006 13:57, Fredrik Andersson wrote:
> Hi gang!
>
> If you do a multiterm query to Lucene, say "foo bar zoo", it gives you a
> heap of documents (Hits) as a result and all is well. If you want some
> tracking abilities for this query, for instance you want to know that
> document X was included in the Hits because "foo" and "bar" matched but
> "zoo" did not, how would you go about it? In previous applications, I have done

Have a look at the result of Searcher.explain(query, doc).

Regards,
Paul Elschot

P.S. You will normally get more responses on the java-user list.
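
To illustrate the suggestion above: Searcher.explain(query, doc) returns an Explanation, a tree of nodes that each carry a value, a description, and child details. The sketch below does not call Lucene. It uses a hypothetical stand-in class (Node) with the same three pieces of information, and it assumes that a term clause is described as "weight(field:term in docId)" and that a non-matching clause is either absent from the tree or has a value of 0. Both assumptions should be checked against the Lucene version in use. The class and method names (MatchedTerms, Node, matchedTerms) are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchedTerms {

    /** Hypothetical stand-in for Lucene's Explanation: value, description, child details. */
    public static final class Node {
        final float value;
        final String description;
        final List<Node> details = new ArrayList<>();

        public Node(float value, String description, Node... children) {
            this.value = value;
            this.description = description;
            for (Node child : children) {
                details.add(child);
            }
        }
    }

    // Assumed description format for a term clause: "weight(field:term in docId)..."
    private static final Pattern TERM_CLAUSE =
            Pattern.compile("weight\\((\\w+):(\\w+) in \\d+\\)");

    /** Returns the sorted set of query terms that contributed a positive score. */
    public static Set<String> matchedTerms(Node root) {
        Set<String> terms = new TreeSet<>();
        collect(root, terms);
        return terms;
    }

    // Depth-first walk: record the term of every term clause with a positive value.
    private static void collect(Node node, Set<String> out) {
        Matcher m = TERM_CLAUSE.matcher(node.description);
        if (m.find() && node.value > 0f) {
            out.add(m.group(2));
        }
        for (Node child : node.details) {
            collect(child, out);
        }
    }

    public static void main(String[] args) {
        // Hand-built tree for document 7 and the query "foo bar zoo":
        // "foo" and "bar" matched, "zoo" did not and is therefore absent.
        Node explanation = new Node(0.5f, "sum of:",
                new Node(0.3f, "weight(body:foo in 7), product of:"),
                new Node(0.2f, "weight(body:bar in 7), product of:"));
        System.out.println(matchedTerms(explanation)); // prints [bar, foo]
    }
}
```

With a real Explanation, the same walk would read the value, description, and details from the object returned by explain instead of from Node. Note that the Lucene Javadoc cautions that computing an explanation is costly, so at hundreds of queries per second it may only be practical for a sample of the hits.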