Can the BooleanQuery execution be optimized with same term queries?
Hi All

During my unemployment, the thing I have enjoyed most is diving into the
Lucene source code. Thanks for all the work.

I have a question about the execution of BooleanQuery. Although
BooleanQuery#rewrite already does some work to remove duplicate FILTER and
SHOULD clauses, the same term query can still be executed several times.

I copied some test code from TestBooleanQuery to confirm my assumption.

The unit test code is as follows:



// Top-level query: one FILTER clause and two SHOULD clauses
BooleanQuery.Builder qBuilder = new BooleanQuery.Builder();
qBuilder.add(new TermQuery(new Term("field", "b")), Occur.FILTER);
qBuilder.add(new TermQuery(new Term("field", "a")), Occur.SHOULD);
qBuilder.add(new TermQuery(new Term("field", "d")), Occur.SHOULD);

// Nested query repeating the same three term clauses
BooleanQuery.Builder nestQuery = new BooleanQuery.Builder();
nestQuery.add(new TermQuery(new Term("field", "b")), Occur.FILTER);
nestQuery.add(new TermQuery(new Term("field", "a")), Occur.SHOULD);
nestQuery.add(new TermQuery(new Term("field", "d")), Occur.SHOULD);

qBuilder.add(nestQuery.build(), Occur.SHOULD);
qBuilder.setMinimumNumberShouldMatch(1);

BooleanQuery q = qBuilder.build();
assertSameScoresWithoutFilters(searcher, q);


In this test, the top-level boolean query (qBuilder) contains 4 clauses:
3 simple term queries and 1 nested boolean query that contains the same
3 term queries.

During execution, all 6 term queries are executed (see
TermQuery.TermWeight#getTermsEnum()).

In theory, these executions could be merged to reduce the execution time,
right?


So, is it possible, or worthwhile, for Lucene to merge these executions to
optimize query performance? I know the optimization may be difficult.
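
To make the counting above concrete, here is a minimal sketch in plain Java. It is not Lucene code: the Node class, countLookups and exampleQuery are hypothetical names invented for this illustration, and a "lookup" only stands in for the per-term work done in getTermsEnum(). It walks a query tree shaped like the test query and counts lookups with and without sharing them between identical terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SharedTermLookup {
    /** Hypothetical query node: a term leaf when term is non-null, otherwise a boolean node. */
    static final class Node {
        final String term;        // e.g. "field:b" for a leaf, null for a boolean node
        final List<Node> clauses; // child clauses of a boolean node (empty for a leaf)

        Node(String term, List<Node> clauses) {
            this.term = term;
            this.clauses = clauses;
        }

        static Node term(String field, String text) {
            return new Node(field + ":" + text, new ArrayList<>());
        }

        static Node bool(Node... clauses) {
            return new Node(null, Arrays.asList(clauses));
        }
    }

    /** Walks the tree and returns how many simulated term lookups were made. */
    static int countLookups(Node root, boolean shareLookups) {
        int[] counter = {0};
        walk(root, shareLookups ? new HashSet<>() : null, counter);
        return counter[0];
    }

    private static void walk(Node node, Set<String> seen, int[] counter) {
        if (node.term != null) {
            // Without sharing, every leaf pays for its own lookup.
            // With sharing, only the first occurrence of a term does.
            if (seen == null || seen.add(node.term)) {
                counter[0]++;
            }
            return;
        }
        for (Node clause : node.clauses) {
            walk(clause, seen, counter);
        }
    }

    /** Same shape as the test query: three term clauses plus a nested copy of them. */
    static Node exampleQuery() {
        Node nested = Node.bool(
            Node.term("field", "b"), Node.term("field", "a"), Node.term("field", "d"));
        return Node.bool(
            Node.term("field", "b"), Node.term("field", "a"), Node.term("field", "d"), nested);
    }

    public static void main(String[] args) {
        System.out.println("independent lookups: " + countLookups(exampleQuery(), false));
        System.out.println("shared lookups: " + countLookups(exampleQuery(), true));
    }
}
```

The sketch only models the term lookup. It says nothing about iterating postings, where each clause would presumably still need its own iterator state, or about scoring.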

Re: Can the BooleanQuery execution be optimized with same term queries?
Another thing to check, beyond whether the correct documents are
matched, is whether the correct score is returned. I'm not sure how it
actually works, but I can imagine that a query for "red red wine" would
produce a higher score for documents containing "red red wine" than it
would for documents containing "red wine wine".
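
The scoring concern can be illustrated with a toy model in plain Java. This is a sketch under simplifying assumptions, not Lucene's scoring: it assumes the score is the sum of one fixed unit per matching SHOULD clause and ignores term frequency and similarity entirely. The class and method names are hypothetical. It shows that evaluating a repeated term only once changes the score unless the merged clause is boosted by its repeat count.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateClauseScore {
    /** Adds one unit of score for every clause whose term occurs in the document. */
    static double scoreAllClauses(List<String> clauses, List<String> docTerms) {
        double score = 0.0;
        for (String term : clauses) {
            if (docTerms.contains(term)) {
                score += 1.0;
            }
        }
        return score;
    }

    /** Evaluates each distinct term once, optionally boosted by how often it was repeated. */
    static double scoreMergedClauses(List<String> clauses, List<String> docTerms,
                                     boolean boostByCount) {
        // Collapse duplicates into term -> number of occurrences in the query.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String term : clauses) {
            counts.merge(term, 1, Integer::sum);
        }
        double score = 0.0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (docTerms.contains(entry.getKey())) {
                score += boostByCount ? entry.getValue() : 1.0;
            }
        }
        return score;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("red", "red", "wine");
        List<String> doc = Arrays.asList("red", "grape");
        System.out.println(scoreAllClauses(query, doc));           // 2.0
        System.out.println(scoreMergedClauses(query, doc, false)); // 1.0
        System.out.println(scoreMergedClauses(query, doc, true));  // 2.0
    }
}
```

In this model, merging is only score-preserving when the repeat count is carried along as a boost, so any shared execution would have to keep track of how many clauses each term stands for.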

On Tue, Sep 19, 2023 at 2:37 AM YouPeng Yang <yypvsxf19870706@gmail.com> wrote:
