Hi, Lucene dev community:
Our current code is based on Lucene 7.
In one analyzer test case, given the string "*Google's biologist’s*", the
tokenization result is *["google", "biologist"]*.
But after migrating the codebase to Lucene 9,
the result becomes *["googles", "**biologist’s**"]*.
It looks like some behavior has changed between the major versions,
but I cannot find exactly where the root cause is.
Could someone please provide some clues? Maybe some grammar has changed?
The analyzer uses the following three Lucene classes:
org.apache.lucene.analysis.core.FlattenGraphFilter;
org.apache.lucene.analysis.shingle.ShingleFilter;
org.apache.lucene.analysis.synonym.SynonymGraphFilter;
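One detail that may help narrow this down: as typed in this message, the test string contains two different apostrophe characters, an ASCII apostrophe (U+0027) in "Google's" and a right single quotation mark (U+2019) in "biologist’s". Below is a minimal sketch, using only the JDK, for confirming which code points a test string really contains. The class and method names (`ApostropheCheck`, `punctuationCodePoints`) are made up for illustration and are not part of Lucene, and whether this difference is related to the behavior change is only a guess.

```java
import java.util.ArrayList;
import java.util.List;

public class ApostropheCheck {

    /**
     * Returns every code point in the text that is neither a letter,
     * a digit, nor whitespace, formatted as "U+XXXX".
     * (Hypothetical helper, for diagnosis only.)
     */
    public static List<String> punctuationCodePoints(String text) {
        List<String> result = new ArrayList<>();
        text.codePoints()
            .filter(cp -> !Character.isLetterOrDigit(cp) && !Character.isWhitespace(cp))
            .forEach(cp -> result.add(String.format("U+%04X", cp)));
        return result;
    }

    public static void main(String[] args) {
        // \u2019 is the right single quotation mark used in "biologist’s".
        String input = "Google's biologist\u2019s";
        System.out.println(punctuationCodePoints(input));
    }
}
```

Running the same check on the exact string used in the test case would show whether both apostrophes reach the analyzer unchanged.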
Thanks