
Scorer#getMinScore()
Hi all,

I was wondering why there is no Scorer#getMinScore() equivalent to
Scorer#getMaxScore() (here
<https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/Scorer.java#L103>).
I think it could be useful for skipping when a scoring function involves a
subtraction.

As a contrived example, say I wrote a SubtractionAndQuery(Query a, Query b)
that matched a conjunction of a and b but the score was a.score() -
b.score(). When creating a scorer, the best getMaxScore() function I could
create would look like this:

float getMaxScore(int upto) throws IOException {
  // Only a's upper bound is usable; b's contribution is ignored.
  return a.getMaxScore(upto);
}

However, this would not give me the tightest upper bound score possible as
I am completely neglecting the "b" term here. Something like this would be
better:

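float getMaxScore(int upto) throws IOException {
  // getMinScore() is the hypothetical API; clamp at 0 since scores cannot be negative.
  return Math.max(a.getMaxScore(upto) - b.getMinScore(upto), 0f);
}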
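
To make the difference between the two bounds concrete, here is a minimal sketch that does not depend on Lucene. The class SubtractionBoundsSketch, the BlockBounds holder and both helper methods are hypothetical names made up for illustration; in particular, Lucene's Scorer has no min-score counterpart today, so minScore below is an assumption.

```java
// Toy model of per-block score bounds; this is not Lucene code.
final class SubtractionBoundsSketch {

    /** Hypothetical per-block bounds for one sub-scorer. */
    static final class BlockBounds {
        final float minScore; // assumed to exist; Lucene only exposes a max
        final float maxScore;

        BlockBounds(float minScore, float maxScore) {
            this.minScore = minScore;
            this.maxScore = maxScore;
        }
    }

    /** Bound available with max scores only: b cannot be used at all. */
    static float looseUpperBound(BlockBounds a, BlockBounds b) {
        return a.maxScore;
    }

    /** Bound using a hypothetical min score for b, clamped at 0. */
    static float tightUpperBound(BlockBounds a, BlockBounds b) {
        return Math.max(a.maxScore - b.minScore, 0f);
    }

    public static void main(String[] args) {
        BlockBounds a = new BlockBounds(1.0f, 4.0f);
        BlockBounds b = new BlockBounds(1.5f, 3.0f);
        System.out.println("loose = " + looseUpperBound(a, b)); // loose = 4.0
        System.out.println("tight = " + tightUpperBound(a, b)); // tight = 2.5
    }
}
```

With these made-up numbers the bound drops from 4.0 to 2.5, so a block whose best possible score falls below the current competitive threshold could be skipped sooner.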

So I was wondering whether leaving out this API was by design (for the same
reason Lucene doesn't allow negative scores for queries), or whether the
extra block-level metadata required to store the min term scores would be
too much. I'm sure there are other issues I'm overlooking as well.

Any answers would be greatly appreciated!

Thanks,
Marc
Re: Scorer#getMinScore()
Your guesses sound right to me:
- A query that does subtractions could yield negative scores, which are
not supported.
- We'd need to store the least competitive impacts for each block of
postings, which would double the amount of CPU and space we spend on
impacts, while min scores would likely be much less frequently useful than
max scores?
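
As a rough illustration of the second point, the sketch below keeps both a maximum and a minimum per block of postings. It is a deliberate simplification: it works on plain per-document scores, whereas Lucene's impacts are (freq, norm) pairs, and the class name, method names and block size here are all hypothetical.

```java
import java.util.Arrays;

// Toy per-block min/max tracking; not how Lucene encodes impacts.
final class BlockMinMaxSketch {

    /** Maximum score in each block of blockSize consecutive entries. */
    static float[] blockMax(float[] scores, int blockSize) {
        float[] out = new float[(scores.length + blockSize - 1) / blockSize];
        Arrays.fill(out, Float.NEGATIVE_INFINITY);
        for (int i = 0; i < scores.length; i++) {
            out[i / blockSize] = Math.max(out[i / blockSize], scores[i]);
        }
        return out;
    }

    /** Minimum score in each block: the extra metadata min scores would need. */
    static float[] blockMin(float[] scores, int blockSize) {
        float[] out = new float[(scores.length + blockSize - 1) / blockSize];
        Arrays.fill(out, Float.POSITIVE_INFINITY);
        for (int i = 0; i < scores.length; i++) {
            out[i / blockSize] = Math.min(out[i / blockSize], scores[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        float[] scores = {0.5f, 2.0f, 1.0f, 3.0f, 0.25f};
        System.out.println(Arrays.toString(blockMax(scores, 2))); // [2.0, 3.0, 0.25]
        System.out.println(Arrays.toString(blockMin(scores, 2))); // [0.5, 1.0, 0.25]
    }
}
```

Even in this toy form, every block carries two values instead of one and needs a second pass of comparisons, which is the doubling of space and CPU described above.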

On Fri, Jun 9, 2023 at 10:10 PM Marc D'Mello <marcd2000@gmail.com> wrote:

> Hi all,
>
> I was wondering why there is no Scorer#getMinScore() equivalent to
> Scorer#getMaxScore() (here
> <https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/Scorer.java#L103>).
> I think it could potentially be useful for skipping when you have scoring
> functions with a subtraction in it.
>
> As a contrived example, say I wrote a SubtractionAndQuery(Query a, Query
> b) that matched a conjunction of a and b but the score was a.score() -
> b.score(). When creating a scorer, the best getMaxScore() function I could
> create would look like this:
>
> float getMaxScore(int upto) {
> return a.getMaxScore(upto);
> }
>
> However, this would not give me the tightest upper bound score possible as
> I am completely neglecting the "b" term here. Something like this would be
> better:
>
> float getMaxScore(int upto) {
> return Math.max(a.getMaxScore(upto) - b.getMinScore(upto), 0);
> }
>
> So I was wondering if not including this API was by design (the same
> reason why Lucene doesn't allow negative scores for queries) or if it was
> because the added block level metadata required to store the min term
> scores would be too much? I'm sure there's some other issues I could be
> overlooking as well.
>
> Any answers would be greatly appreciated!
>
> Thanks,
> Marc
>


--
Adrien