In LUCENE-9445 we'd like to add a case-insensitive option to regex queries
in the query parser of the form:
/Foo/i
However, today people can search for:
/foo.com/index.html
and not get an error. The searcher may think this is a query for a URL but
it's actually parsed as a regex "foo.com" ORed with a term query.
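To make the ambiguity concrete, here is a rough sketch of how such a string splits into a regex part and a trailing term. This is a toy tokenizer written for illustration only: it is not Lucene's grammar, and the names TOKEN, tokenize and flag_collisions are made up for this example.

```python
import re

# Toy model (an assumption, not Lucene's actual grammar):
# a regex token is /.../ with backslash escapes allowed inside,
# anything else up to whitespace or a slash is a plain term.
TOKEN = re.compile(r"/(?P<regex>(?:[^/\\]|\\.)*)/|(?P<term>[^\s/]+)")


def tokenize(query):
    """Split a query string into (kind, text) pairs."""
    tokens = []
    for m in TOKEN.finditer(query):
        if m.group("regex") is not None:
            tokens.append(("REGEX", m.group("regex")))
        else:
            tokens.append(("TERM", m.group("term")))
    return tokens


def flag_collisions(query):
    """Return the regex bodies whose closing slash is directly followed by 'i'.

    These are the inputs where a trailing-flag syntax such as /Foo/i
    could change how an existing query is read.
    """
    hits = []
    for m in TOKEN.finditer(query):
        if m.group("regex") is not None and query[m.end():m.end() + 1] == "i":
            hits.append(m.group("regex"))
    return hits


if __name__ == "__main__":
    print(tokenize("/foo.com/index.html"))
    print(flag_collisions("/foo.com/index.html"))
```

In this model the URL-like string yields a regex token "foo.com" followed by a term token "index.html", and flag_collisions reports it because the character right after the closing slash is an "i". What the real parser would do with such input after the change is exactly the question raised here.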
I'd like to draw attention to this proposed change in behaviour because I
think it could affect many existing systems. Arguably it may be a positive
in drawing attention to a number of existing silent failures (unescaped
searches for URLs or file paths) but equally could be seen as a negative
breaking change by some.
What is our BWC policy for changes to the query parser?
Do the benefits of the proposed new regex feature outweigh the costs of the
breakages in your view?
https://issues.apache.org/jira/browse/LUCENE-9445?focusedCommentId=17196793&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17196793