Mailing List Archive

Accessibility of QueryParserBase::handleBareFuzzy
Hi,

In an effort to prepare Elasticsearch for modularization, we are
investigating and eliminating split packages. The situation has improved
through recent refactoring in Lucene 9.0 [1], but a number of split
packages still remain. This message identifies one such package so that it can
be discussed in isolation, with a view to a potential solution either in
Lucene or possibly within Elasticsearch itself.

Elasticsearch has a query parser, `QueryStringQueryParser` [2], that
builds queries based on mapping information. This parser needs to
override its superclass's `org.apache.lucene.queryparser.classic.QueryParserBase::handleBareFuzzy` [3]
method in order to provide custom handling of fuzzy queries. Because the
method is package-private, overriding it is clearly not "best practice":
it requires effectively (but not literally) injecting a class into a
Lucene package, which is done through `XQueryParser` [4]. We want to
eliminate the need for `XQueryParser`, and hence the split package at
run time.

The obvious, though probably not the right, fix would be to make
`handleBareFuzzy` a protected method in Lucene's `QueryParser` or
`QueryParserBase`; this would satisfy the need of the Elasticsearch
`QueryStringQueryParser`. Failing that, I don't see an alternative that
could be coded in Elasticsearch's `QueryStringQueryParser`, but maybe
there is a different API extension point that could be used, or a new
one could be provided?

-Chris.

[1] https://issues.apache.org/jira/browse/LUCENE-9319
[2] https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/index/search/QueryStringQueryParser.java#L436
[3] https://github.com/apache/lucene/blob/8ac26737913d0c1555019e93bc6bf7db1ab9047e/lucene/queryparser/src/java/org/apache/lucene/queryparser/classic/QueryParserBase.java#L813
[4] https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/apache/lucene/queryparser/classic/XQueryParser.java


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: Accessibility of QueryParserBase::handleBareFuzzy
Hi Chris,

The difference between the Elasticsearch query parser and the built-in Lucene one appears to come down to how they parse fuzziness, so I think the best solution here is to add another protected method, something like this:

  protected float parseFuzzyDistance(String input, float defaultValue) {
    try {
      // input is assumed to be the fuzzy token image, e.g. "~0.8"; skip the leading '~'
      return Float.parseFloat(input.substring(1));
    } catch (@SuppressWarnings("unused") Exception ignored) {
      return defaultValue;
    }
  }

Then handleBareFuzzy() can call out to this, and the ES version can override it and do its own parsing.
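
To make the idea concrete, here is a minimal, self-contained sketch of the hook pattern. `BaseParser` and `CustomParser` are hypothetical stand-ins, not Lucene or Elasticsearch classes; the `~auto` rule and the default of 2 are invented for illustration, and `handleBareFuzzy` here simply returns the distance rather than building a query.

```java
// Hypothetical stand-ins for illustration only; not Lucene or Elasticsearch code.
class FuzzyDistanceSketch {

  /** Plays the role of the base query parser. */
  static class BaseParser {
    // Assumed default distance, used when the token carries no parsable number.
    private final float fuzzyMinSim = 2f;

    /** Extension point: parse a fuzzy token image such as "~0.8". */
    protected float parseFuzzyDistance(String input, float defaultValue) {
      try {
        return Float.parseFloat(input.substring(1)); // skip the leading '~'
      } catch (RuntimeException ignored) {
        return defaultValue; // no number after '~', or an unusable token
      }
    }

    /** Stays package-private; subclasses in other packages only need the hook above. */
    final float handleBareFuzzy(String fuzzySlopImage) {
      return parseFuzzyDistance(fuzzySlopImage, fuzzyMinSim);
    }
  }

  /** Plays the role of a downstream parser with its own fuzziness syntax. */
  static class CustomParser extends BaseParser {
    @Override
    protected float parseFuzzyDistance(String input, float defaultValue) {
      // Invented rule for the sketch: "~auto" maps to a fixed distance of 1.
      if ("~auto".equalsIgnoreCase(input)) {
        return 1f;
      }
      return super.parseFuzzyDistance(input, defaultValue);
    }
  }

  public static void main(String[] args) {
    BaseParser base = new BaseParser();
    BaseParser custom = new CustomParser();
    System.out.println(base.handleBareFuzzy("~0.8"));
    System.out.println(base.handleBareFuzzy("~auto"));
    System.out.println(custom.handleBareFuzzy("~auto"));
  }
}
```

The point of the design is that the subclass never touches `handleBareFuzzy` itself, so it has no reason to live in the base parser's package.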

- A

> On 20 Sep 2021, at 15:39, Chris Hegarty <christopher.hegarty@elastic.co.INVALID> wrote:


Re: Accessibility of QueryParserBase::handleBareFuzzy
Thanks Alan, great suggestion.

I filed the following issue to track this:
https://issues.apache.org/jira/browse/LUCENE-10115

-Chris.

> On 20 Sep 2021, at 16:14, Alan Woodward <romseygeek@gmail.com> wrote:
