Mailing List Archive

Should Queries be able to throw CollectionTerminationException?
Hi folks-

I've run into a bit of an interesting situation when attempting to
enforce a query evaluation time budget in our Lucene application
(Amazon Product Search), and I'm curious if it's something others have
run into or have thoughts on. There's a reasonable chance this
use-case is fairly specific to our application, but if others have
seen similar use-cases, then maybe there's a general solution worth
pursuing here in Lucene itself?

We'd like to enforce a strict time budget for query evaluation, even
at the cost of potentially missing some matches that have yet to be
seen. One tempting solution is to enforce this in our leaf collectors
each time they see a hit by throwing a CollectionTerminationException,
which is handled nicely by IndexSearcher (we use concurrent search so
this more-or-less enforces an overall time budget). Our queries follow
a two-phase matching approach, and we've run into some interesting
edge-cases where the "approximation" phase may produce a very large
set of match candidates but the "confirmation" phase only confirms
matches on a very small fraction of them. In extreme cases, the entire
index could match in the "approximation" phase and none of the hits
could be "confirmed" in the second phase check.

This creates an interesting issue where the query may evaluate for a
long time before the leaf collectors see hits (or they may never see
hits). This boils down to the BulkScorer running a loop over all
"approximate" candidates and then attempting to "confirm" each before
the leaf collector "sees" anything (it could also happen in a case of
many first phase matches with many of those hits having been deleted).
In these cases, we can run significantly over our time budget.

One solution I've come up with is to create a top-level Query
implementation that enforces the time budget each time it produces
"approximate" matches. This more-or-less works for our use-case, but
has some "rough edges" as a general solution. What I've observed is
that Lucene really only supports collectors / leaf collectors throwing
CollectionTerminationException and doesn't necessarily support Query
implementations doing this. One of the most glaring issues is that the
LRU query caching (if enabled) doesn't handle the exception, so if a
Query were to throw when pre-populating the cache bitsets, it would
terminate the entire search (in a pretty ungraceful way).

I'm also aware of ExitableDirectoryReader but it's trickier to manage
for our use case since we read from the index outside of the main
query evaluation phase for other purposes. I'm sure there's a solution
where we maintain multiple Readers, etc.

So... I'm interested if anyone else has run into a similar use-case.
Does anyone have thoughts on alternative solutions? Is there any
appetite to augment Lucene to allow for queries to signal early
termination by throwing CollectionTerminationExceptions? I suspect
ExitableDirectoryReader probably provides a good enough solution for
others in this situation, but I wanted to raise the topic and see what
other folks here think.

Cheers,
-Greg

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: Should Queries be able to throw CollectionTerminationException? [ In reply to ]
Hi Greg,

Maybe one clean way to make it happen would be to make timeouts an
IndexSearcher feature. Whenever a timeout is set, IndexSearcher could split
the doc ID space into ranges of X docs and check the timeout between every
range. This way, the CollectionTerminatedException wouldn't be raised by a
query, IndexSearcher would be in full control of terminating the query
prematurely based on the configured timeout.

On Tue, Oct 5, 2021 at 3:22 PM Greg Miller <gsmiller@gmail.com> wrote:

> Hi folks-
>
> I've run into a bit of an interesting situation when attempting to
> enforce a query evaluation time budget in our Lucene application
> (Amazon Product Search), and I'm curious if it's something others have
> run into or have thoughts on. There's a reasonable chance this
> use-case is fairly specific to our application, but if others have
> seen similar use-cases, then maybe there's a general solution worth
> pursuing here in Lucene itself?
>
> We'd like to enforce a strict time budget for query evaluation, even
> at the cost of potentially missing some matches that have yet to be
> seen. One tempting solution is to enforce this in our leaf collectors
> each time they see a hit by throwing a CollectionTerminationException,
> which is handled nicely by IndexSearcher (we use concurrent search so
> this more-or-less enforces an overall time budget). Our queries follow
> a two-phase matching approach, and we've run into some interesting
> edge-cases where the "approximation" phase may produce a very large
> set of match candidates but the "confirmation" phase only confirms
> matches on a very small fraction of them. In extreme cases, the entire
> index could match in the "approximation" phase and none of the hits
> could be "confirmed" in the second phase check.
>
> This creates an interesting issue where the query may evaluate for a
> long time before the leaf collectors see hits (or they may never see
> hits). This boils down to the BulkScorer running a loop over all
> "approximate" candidates and then attempting to "confirm" each before
> the leaf collector "sees" anything (it could also happen in a case of
> many first phase matches with many of those hits having been deleted).
> In these cases, we can run significantly over our time budget.
>
> One solution I've come up with is to create a top-level Query
> implementation that enforces the time budget each time it produces
> "approximate" matches. This more-or-less works for our use-case, but
> has some "rough edges" as a general solution. What I've observed is
> that Lucene really only supports collectors / leaf collectors throwing
> CollectionTerminationException and doesn't necessarily support Query
> implementations doing this. One of the most glaring issues is that the
> LRU query caching (if enabled) doesn't handle the exception, so if a
> Query were to throw when pre-populating the cache bitsets, it would
> terminate the entire search (in a pretty ungraceful way).
>
> I'm also aware of ExitableDirectoryReader but it's trickier to manage
> for our use case since we read from the index outside of the main
> query evaluation phase for other purposes. I'm sure there's a solution
> where we maintain multiple Readers, etc.
>
> So... I'm interested if anyone else has run into a similar use-case.
> Does anyone have thoughts on alternative solutions? Is there any
> appetite to augment Lucene to allow for queries to signal early
> termination by throwing CollectionTerminationExceptions? I suspect
> ExitableDirectoryReader probably provides a good enough solution for
> others in this situation, but I wanted to raise the topic and see what
> other folks here think.
>
> Cheers,
> -Greg
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

--
Adrien
Re: Should Queries be able to throw CollectionTerminationException? [ In reply to ]
On Tue, Oct 5, 2021 at 11:13 AM Adrien Grand <jpountz@gmail.com> wrote:

Maybe one clean way to make it happen would be to make timeouts an
> IndexSearcher feature. Whenever a timeout is set, IndexSearcher could split
> the doc ID space into ranges of X docs and check the timeout between every
> range. This way, the CollectionTerminatedException wouldn't be raised by a
> query, IndexSearcher would be in full control of terminating the query
> prematurely based on the configured timeout.
>

+1, I like this idea! It'd make timeouts more of a first class feature,
and then the overhead should be very small if we check only after each
block of partitioned docid space.

Mike McCandless

http://blog.mikemccandless.com
>
>
Re: Should Queries be able to throw CollectionTerminationException? [ In reply to ]
+1 I like this as well. Thanks for the suggestion! If IndexSearcher is
aware of timeouts, I think it can "do the right thing" with respect to
caching as well as properly establish the right "Relation" in
"TotalHits" if it times out (maybe there's some callback or something
allowing IndexSearcher to report a timeout to collectors / collector
managers?).

I went ahead and created an issue to explore this further:
https://issues.apache.org/jira/browse/LUCENE-10151

Thanks again!
-Greg

On Tue, Oct 5, 2021 at 8:20 AM Michael McCandless
<lucene@mikemccandless.com> wrote:
>
> On Tue, Oct 5, 2021 at 11:13 AM Adrien Grand <jpountz@gmail.com> wrote:
>
>> Maybe one clean way to make it happen would be to make timeouts an IndexSearcher feature. Whenever a timeout is set, IndexSearcher could split the doc ID space into ranges of X docs and check the timeout between every range. This way, the CollectionTerminatedException wouldn't be raised by a query, IndexSearcher would be in full control of terminating the query prematurely based on the configured timeout.
>
>
> +1, I like this idea! It'd make timeouts more of a first class feature, and then the overhead should be very small if we check only after each block of partitioned docid space.
>
> Mike McCandless
>
> http://blog.mikemccandless.com

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org