Hi all,
While I have been unemployed, the thing I have enjoyed most is diving
into the Lucene source code. Thanks for all the work.
I have a question about how BooleanQuery is executed. Although
BooleanQuery#rewrite does some work to remove duplicate FILTER and SHOULD
clauses, the same term query can still be executed several times.
I copied some test code from TestBooleanQuery to verify my assumption.
The unit test code is as follows:
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;

// Top-level query: three term clauses on "field".
BooleanQuery.Builder qBuilder = new BooleanQuery.Builder();
qBuilder.add(new TermQuery(new Term("field", "b")), Occur.FILTER);
qBuilder.add(new TermQuery(new Term("field", "a")), Occur.SHOULD);
qBuilder.add(new TermQuery(new Term("field", "d")), Occur.SHOULD);
// Nested query: the same three term clauses again.
BooleanQuery.Builder nestQuery = new BooleanQuery.Builder();
nestQuery.add(new TermQuery(new Term("field", "b")), Occur.FILTER);
nestQuery.add(new TermQuery(new Term("field", "a")), Occur.SHOULD);
nestQuery.add(new TermQuery(new Term("field", "d")), Occur.SHOULD);
// The nested query becomes the fourth clause of the top-level query.
qBuilder.add(nestQuery.build(), Occur.SHOULD);
qBuilder.setMinimumNumberShouldMatch(1);
BooleanQuery q = qBuilder.build();
// searcher and assertSameScoresWithoutFilters come from TestBooleanQuery.
assertSameScoresWithoutFilters(searcher, q);
In this test, the top-level boolean query (qBuilder) contains 4 clauses:
3 simple term queries and 1 nested boolean query that contains the same
3 term queries.
Under the hood, all 6 term queries are executed (see
TermQuery.TermWeight#getTermsEnum()).
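To make the count concrete, here is a minimal plain-Java sketch of the same query shape. It does not use Lucene at all: Leaf, Bool and countLeaves are hypothetical stand-ins I made up for illustration, and the sketch only counts how often each term appears as a leaf, which is my assumption about where the repeated work comes from. It says nothing about how Lucene actually scores or iterates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateTermCount {
    // Hypothetical stand-ins for TermQuery and BooleanQuery; not Lucene classes.
    interface Node {}

    static final class Leaf implements Node {
        final String field;
        final String text;
        Leaf(String field, String text) { this.field = field; this.text = text; }
    }

    static final class Bool implements Node {
        final List<Node> clauses;
        Bool(List<Node> clauses) { this.clauses = clauses; }
    }

    // Walk the query tree and count how many times each field:text leaf occurs.
    static Map<String, Integer> countLeaves(Node node) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        collect(node, counts);
        return counts;
    }

    private static void collect(Node node, Map<String, Integer> counts) {
        if (node instanceof Leaf) {
            Leaf leaf = (Leaf) node;
            counts.merge(leaf.field + ":" + leaf.text, 1, Integer::sum);
        } else if (node instanceof Bool) {
            for (Node clause : ((Bool) node).clauses) {
                collect(clause, counts);
            }
        }
    }

    // Same shape as the test: terms b, a, d plus a nested query with b, a, d.
    static Node exampleQuery() {
        List<Node> terms = Arrays.<Node>asList(
            new Leaf("field", "b"), new Leaf("field", "a"), new Leaf("field", "d"));
        List<Node> top = new ArrayList<>(terms);
        top.add(new Bool(terms));
        return new Bool(top);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countLeaves(exampleQuery());
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        // prints {field:b=2, field:a=2, field:d=2} total=6
        System.out.println(counts + " total=" + total);
    }
}
```

Each of the three terms shows up twice, which matches the 6 term-query executions I observed for only 3 distinct terms.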
In theory, these executions could be merged to reduce the execution
time, right?
So, is it possible, or worthwhile, for Lucene to merge these executions
to improve query performance? I know the optimization may be difficult.