Mailing List Archive

Can the BooleanQuery execution be optimized with same term queries
Hi All

Sorry to bother you. The happiest thing for me is studying the Lucene source
code. Thank you for all the great work.


I have a question about the execution of BooleanQuery. Although
BooleanQuery#rewrite already does some work to remove duplicate FILTER and
SHOULD clauses, the same term query can still be executed several times.

I adapted test code from TestBooleanQuery to confirm my assumption.

The unit test code is as follows:



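import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;

// Top-level query: one FILTER clause and two SHOULD clauses.
BooleanQuery.Builder qBuilder = new BooleanQuery.Builder();
qBuilder.add(new TermQuery(new Term("field", "b")), Occur.FILTER);
qBuilder.add(new TermQuery(new Term("field", "a")), Occur.SHOULD);
qBuilder.add(new TermQuery(new Term("field", "d")), Occur.SHOULD);

// Nested query that repeats the same three term clauses.
BooleanQuery.Builder nestQuery = new BooleanQuery.Builder();
nestQuery.add(new TermQuery(new Term("field", "b")), Occur.FILTER);
nestQuery.add(new TermQuery(new Term("field", "a")), Occur.SHOULD);
nestQuery.add(new TermQuery(new Term("field", "d")), Occur.SHOULD);

// The nested query becomes a fourth (SHOULD) clause of the top-level query.
qBuilder.add(nestQuery.build(), Occur.SHOULD);
qBuilder.setMinimumNumberShouldMatch(1);

BooleanQuery q = qBuilder.build();
// Helper from TestBooleanQuery; searcher is the test's IndexSearcher (setup not shown).
assertSameScoresWithoutFilters(searcher, q);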


In this test, the top-level boolean query (qBuilder) contains 4 clauses: 3
simple term queries and 1 nested boolean query that contains the same 3 term
queries.

In the underlying execution, all 6 term queries are executed (see
TermQuery.TermWeight#getTermsEnum()).
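
To make the count concrete, here is a minimal sketch that models the query from the test as a plain tree and counts its term leaves. The `Node`, `TermNode` and `BoolNode` types are hypothetical stand-ins written for this illustration, not Lucene classes. The sketch ignores the FILTER/SHOULD distinction and only counts clauses; it says nothing about how Lucene actually executes them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of a query tree (hypothetical types, not Lucene's API).
class QueryTreeSketch {
    interface Node {}

    static final class TermNode implements Node {
        final String field;
        final String text;

        TermNode(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    static final class BoolNode implements Node {
        final List<Node> clauses;

        BoolNode(Node... clauses) {
            this.clauses = Arrays.asList(clauses);
        }
    }

    // Collects every term leaf as "field:text", descending into nested boolean nodes.
    static void collectTerms(Node node, List<String> out) {
        if (node instanceof TermNode) {
            TermNode t = (TermNode) node;
            out.add(t.field + ":" + t.text);
        } else {
            for (Node child : ((BoolNode) node).clauses) {
                collectTerms(child, out);
            }
        }
    }

    // Same shape as the test query: three term clauses plus a nested copy of them.
    static Node buildExample() {
        Node nested = new BoolNode(
            new TermNode("field", "b"),
            new TermNode("field", "a"),
            new TermNode("field", "d"));
        return new BoolNode(
            new TermNode("field", "b"),
            new TermNode("field", "a"),
            new TermNode("field", "d"),
            nested);
    }

    public static void main(String[] args) {
        List<String> leaves = new ArrayList<>();
        collectTerms(buildExample(), leaves);
        Set<String> distinct = new HashSet<>(leaves);
        // Prints: term leaves: 6, distinct terms: 3
        System.out.println("term leaves: " + leaves.size() + ", distinct terms: " + distinct.size());
    }
}
```

The gap between the two numbers (6 leaves, 3 distinct terms) is the redundancy the question is about.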

In theory, these executions could be merged to reduce the execution time,
right?


So, is it possible or necessary for Lucene to merge these executions to
optimize query performance? I know the optimization may be difficult.
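
One way to picture what "merging the execution" could mean is a per-search cache keyed by term, so that the expensive lookup runs once per distinct term no matter how many clauses mention it. The sketch below is only an illustration under that assumption. `TermLookupCache` and `lookup` are hypothetical names, the lookup function is a stand-in rather than a real terms-dictionary seek, and scoring is ignored, even though it differs between FILTER and SHOULD clauses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-search cache: each distinct term is resolved at most once.
class TermLookupCache<V> {
    private final Map<String, V> cache = new HashMap<>();
    private final Function<String, V> expensiveLookup;
    private int lookups = 0;

    TermLookupCache(Function<String, V> expensiveLookup) {
        this.expensiveLookup = expensiveLookup;
    }

    // Returns the cached value, running the expensive lookup only on a miss.
    // Assumes the lookup function never returns null.
    V lookup(String term) {
        V value = cache.get(term);
        if (value == null) {
            lookups++;
            value = expensiveLookup.apply(term);
            cache.put(term, value);
        }
        return value;
    }

    int lookupCount() {
        return lookups;
    }

    public static void main(String[] args) {
        // String::length stands in for an expensive per-term operation.
        TermLookupCache<Integer> cache = new TermLookupCache<>(String::length);
        String[] clauses = {"field:b", "field:a", "field:d", "field:b", "field:a", "field:d"};
        for (String clause : clauses) {
            cache.lookup(clause);
        }
        // Prints: clauses: 6, lookups: 3
        System.out.println("clauses: " + clauses.length + ", lookups: " + cache.lookupCount());
    }
}
```

Whether such sharing would pay off in Lucene depends on details this toy leaves out, such as per-segment state and the different scoring needs of each clause.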
Re: Can the BooleanQuery execution be optimized with same term queries
Hi Yang,

It would be legal for Lucene to perform such optimizations indeed.

On Tue, Sep 19, 2023 at 3:27 PM YouPeng Yang <yypvsxf19870706@gmail.com> wrote:
>
> So, is it possible or necessary for Lucene to merge these executions to
> optimize query performance? I know the optimization may be difficult.


--
Adrien

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: Can the BooleanQuery execution be optimized with same term queries
Hi Adrien
Glad to have your opinion. I am reading your excellent articles on
the Elastic blog.

Best regards


On Tue, Sep 19, 2023 at 21:32, Adrien Grand <jpountz@gmail.com> wrote:

> Hi Yang,
>
> It would be legal for Lucene to perform such optimizations indeed.
Re: Can the BooleanQuery execution be optimized with same term queries
Thanks for letting me know, I'm glad you like them!


On Fri, Sep 22, 2023 at 16:36, YouPeng Yang <yypvsxf19870706@gmail.com> wrote:

> Hi Adrien
> Glad to have your opinion. I am reading your excellent articles on
> the Elastic blog.
>
> Best regards