Mailing List Archive

Re: FuzzyQuery- why is it ignored?
Sorry, I made a mistake when copy-pasting. Let me just correct my previous mail.

> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES".

1. Indexed this text: "MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW
HAMPSHIRE UNITED STATES"

----
As far as I can tell, this query correctly finds the indexed document
(so I have no idea what is wrong with the fuzzy query).
+contentDFLT:mains~2 +contentDFLT:"nashua"
+contentDFLT:"new-hampshire" +contentDFLT:"united states"

I am
- using lucene 8.1.
- using standard analyzer for both indexing and searching.
- using classic query parser for parsing.
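
For reference, the ~2 suffix in mains~2 asks for terms within at most two edits of "mains", and the indexed term "main" is one edit away. The sketch below is plain Java with no Lucene dependency; the class name EditDistanceSketch and its method are hypothetical, chosen only for illustration. It computes the classic Levenshtein distance and is not how Lucene evaluates fuzzy queries internally. It also ignores the fact that FuzzyQuery, by default, counts a swap of two adjacent characters as a single edit.

```java
final class EditDistanceSketch {

    // Classic dynamic-programming Levenshtein distance
    // (insertions, deletions, substitutions; no transpositions).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning the empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into the empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexedTerm = "main"; // the lowercased token the thread expects in the index
        for (String queryTerm : new String[] {"main", "mains", "maink", "nashua"}) {
            int d = levenshtein(queryTerm, indexedTerm);
            System.out.println(queryTerm + " -> distance " + d + ", within ~2: " + (d <= 2));
        }
    }
}
```

By this measure "mains" and "maink" are equally close to "main", so edit distance alone does not explain why only the trailing "s" fails to match.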



On Thu, Jun 13, 2019 at 23:18, <baris.kazar@oracle.com> wrote:
>
> However, the index does not have MAINS but MAIN for the expected entry.
>
> Best regards
>
>
>
> On 6/13/19 10:33 AM, baris.kazar@oracle.com wrote:
> > does it consider it as like plural word? :) :) :)
> > That makes sense.
> >
> > Best regards
> >
> >
> > On 6/13/19 10:31 AM, baris.kazar@oracle.com wrote:
> >> Erick,
> >>
> >> Cool, could You give a simple example with my example please?
> >>
> >> Best regards
> >>
> >>
> >>
> >> On 6/13/19 10:12 AM, Erick Erickson wrote:
> >>> Shot in the dark: stemming. Whenever I see a problem with something
> >>> ending in “s” (or “er” or “ing” or….) my first suspect is that
> >>> stemming is turned on. In that case the token in the index that’s
> >>> actually searched on is somewhat different than you expect.
> >>>
> >>> The test is easy, just ensure your fieldType contains no stemmers.
> >>> PorterStemmer is particularly aggressive, but for this case to test
> >>> I’d just remove all stemming, re-index and see if the results differ.
> >>>
> >>> Best,
> >>> Erick
> >>>
> >>>> On Jun 13, 2019, at 7:26 AM, baris.kazar@oracle.com wrote:
> >>>>
> >>>> Tomoko,-
> >>>>
> >>>> That is strange indeed.
> >>>>
> >>>> Something is wrong when i use mains, but maink, mainl, mainr, mainq and
> >>>> maint all work ok: any consonant at the end except s works in this
> >>>> case.
> >>>>
> >>>> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
> >>>>
> >>>> i am using fuzzy query with ~ from Query.builder and that is not
> >>>> PhraseQuery.
> >>>>
> >>>> Similarly FuzzyQuery with input "mains" (it has to be lowercase
> >>>> since it does not go through StandardAnalyzer) is also not
> >>>> PhraseQuery.
> >>>>
> >>>> can there be a clearer sample case for ComplexPhraseQuery please in
> >>>> the docs?
> >>>>
> >>>> did You also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
> >>>> STATES" the expected output in this case?
> >>>>
> >>>> Thanks for spending time on this, i would like to thank everyone.
> >>>>
> >>>> Best regards
> >>>>
> >>>>
> >>>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
> >>>>> Hi,
> >>>>>
> >>>>>> Ok, i think only this very specific only "mains" has an issue.
> >>>>> It looks strange to me. I did some test locally.
> >>>>>
> >>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>> UNITED STATES".
> >>>>>
> >>>>> 2a. This query string (just copied from your Case #3) worked
> >>>>> correctly
> >>>>> for me as far as I can see.
> >>>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
> >>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
> >>>>>
> >>>>> 2b. However this query string got no results.
> >>>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
> >>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
> >>>>> It is an expected behaviour because the classic query parser does not
> >>>>> support fuzzy query inside phrase query (as far as I know).
> >>>>>
> >>>>> I suspect you use fuzzy query operator (~) inside phrase query
> >>>>> ("), as
> >>>>> the 2b case.
> >>>>>
> >>>>> FYI: there is a special parser for such complex phrase query.
> >>>>> https://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html
> >>>>>
> >>>>>
> >>>>> Tomoko
> >>>>>
> >>>>> On Thu, Jun 13, 2019 at 6:16, <baris.kazar@oracle.com> wrote:
> >>>>>> Ok, i think only this very specific only "mains" has an issue.
> >>>>>>
> >>>>>> all i knew about Lucene was fine :) Great...
> >>>>>>
> >>>>>> i have one more question:
> >>>>>>
> >>>>>> which one is advised to use: FuzzyQuery or the Query.parser with
> >>>>>> search string~ appended?
> >>>>>>
> >>>>>> The second one will go through analyzer and make search string
> >>>>>> lowercase.
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
> >>>>>>
> >>>>>> Hi again,-
> >>>>>>
> >>>>>> this is really interesting and i hope i am missing something.
> >>>>>> The index lowercases all entries, so case sensitivity is not an issue
> >>>>>> i think.
> >>>>>>
> >>>>>> Case #1:
> >>>>>>
> >>>>>> org.apache.lucene.queryparser.classic.QueryParser parser =
> >>>>>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
> >>>>>> Query q1 = null;
> >>>>>> try {
> >>>>>>     q1 = parser.parse("Main");
> >>>>>> } catch (ParseException e) {
> >>>>>>     e.printStackTrace();
> >>>>>> }
> >>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>>
> >>>>>>
> >>>>>> This brings with this:
> >>>>>>
> >>>>>> query plan:
> >>>>>>
> >>>>>> [+contentDFLT:main, +contentDFLT:"nashua",
> >>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> >>>>>>
> >>>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after
> >>>>>> exec finished)
> >>>>>>
> >>>>>> Number of results: 12
> >>>>>> Name: Main Dunstable Rd
> >>>>>> Score: 41.204945
> >>>>>> ID: 12677400
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.72631, -71.50269
> >>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>> UNITED STATES
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 41.204945
> >>>>>> ID: 12681980
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.76416, -71.46681
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 41.204945
> >>>>>> ID: 12681973
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.75045, -71.4607
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 41.204945
> >>>>>> ID: 12681974
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.76019, -71.465
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Main Dunstable Rd
> >>>>>> Score: 41.204945
> >>>>>> ID: 12677399
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.74641, -71.48943
> >>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>> UNITED STATES
> >>>>>>
> >>>>>> Name: S Main St
> >>>>>> Score: 41.204945
> >>>>>> ID: 11893215
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.73412, -71.44797
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 41.204945
> >>>>>> ID: 12681978
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.73492, -71.44951
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: S Main St
> >>>>>> Score: 41.204945
> >>>>>> ID: 11893214
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.73958, -71.45895
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 41.204945
> >>>>>> ID: 12681979
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.76416, -71.46681
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 41.204945
> >>>>>> ID: 12681977
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.747, -71.45957
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Case #2
> >>>>>>
> >>>>>> When i did this it also worked by adding ~ to make it Fuzzy query
> >>>>>> to Main word:
> >>>>>>
> >>>>>> org.apache.lucene.queryparser.classic.QueryParser parser =
> >>>>>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
> >>>>>> Query q1 = null;
> >>>>>> try {
> >>>>>>     q1 = parser.parse("Main~");
> >>>>>> } catch (ParseException e) {
> >>>>>>     e.printStackTrace();
> >>>>>> }
> >>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>>
> >>>>>>
> >>>>>> query plan:
> >>>>>>
> >>>>>> [+contentDFLT:main~2, +contentDFLT:"nashua",
> >>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> >>>>>>
> >>>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging
> >>>>>> stops)
> >>>>>> Number of results: 12
> >>>>>> Name: Main Dunstable Rd
> >>>>>> Score: 41.06405
> >>>>>> ID: 12677400
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.72631, -71.50269
> >>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>> UNITED STATES
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 41.06405
> >>>>>> ID: 12681980
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.76416, -71.46681
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 41.06405
> >>>>>> ID: 12681973
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.75045, -71.4607
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 41.06405
> >>>>>> ID: 12681974
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.76019, -71.465
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Main Dunstable Rd
> >>>>>> Score: 41.06405
> >>>>>> ID: 12677399
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.74641, -71.48943
> >>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>> UNITED STATES
> >>>>>>
> >>>>>> Name: S Main St
> >>>>>> Score: 41.06405
> >>>>>> ID: 11893215
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.73412, -71.44797
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 41.06405
> >>>>>> ID: 12681978
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.73492, -71.44951
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: S Main St
> >>>>>> Score: 41.06405
> >>>>>> ID: 11893214
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.73958, -71.45895
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 41.06405
> >>>>>> ID: 12681979
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.76416, -71.46681
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 41.06405
> >>>>>> ID: 12681977
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.747, -71.45957
> >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Case #3
> >>>>>>
> >>>>>> But why does this not work with fuzzy mode and i misspelled a bit
> >>>>>> (1 edit away) and as You saw the data is there with Main spelling:
> >>>>>>
> >>>>>> org.apache.lucene.queryparser.classic.QueryParser parser =
> >>>>>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
> >>>>>>
> >>>>>> Query q1 = null;
> >>>>>> try {
> >>>>>>     q1 = parser.parse("Mains~"); // 1 edit away
> >>>>>> } catch (ParseException e) {
> >>>>>>     e.printStackTrace();
> >>>>>> }
> >>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>>
> >>>>>> query plan:
> >>>>>>
> >>>>>> [+contentDFLT:mains~2, +contentDFLT:"nashua",
> >>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> >>>>>>
> >>>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging
> >>>>>> stops)
> >>>>>>
> >>>>>> Number of results: 0
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Case #4
> >>>>>>
> >>>>>> Then i changed q1 to SHOULD from MUST above, and i think the fuzzy
> >>>>>> query is ignored here since there is no MAIN in the first 468
> >>>>>> results:
> >>>>>>
> >>>>>> there is no boost for Mains term here.
> >>>>>>
> >>>>>> query plan:
> >>>>>>
> >>>>>> [contentDFLT:mains~2, +contentDFLT:"nashua",
> >>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> >>>>>>
> >>>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging
> >>>>>> stops)
> >>>>>> Number of results: 1794
> >>>>>> Name: Nashua Dr
> >>>>>> Score: 34.186226
> >>>>>> ID: 4974936
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.7636, -71.46063
> >>>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Nashua River Rail Trl
> >>>>>> Score: 34.186226
> >>>>>> ID: 4975508
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.7062, -71.53962
> >>>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>> UNITED STATES
> >>>>>>
> >>>>>> Name: Nashua Rd
> >>>>>> Score: 33.84896
> >>>>>> ID: 4975388
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.78746, -71.92823
> >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: NASHUA
> >>>>>> Score: 33.84896
> >>>>>> ID: 21014865
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.75873, -71.46438
> >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: NASHUA
> >>>>>> Score: 33.84896
> >>>>>> ID: 21014865
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.75873, -71.46438
> >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: NASHUA
> >>>>>> Score: 33.84896
> >>>>>> ID: 21014865
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.75873, -71.46438
> >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: NASHUA
> >>>>>> Score: 33.84896
> >>>>>> ID: 21014865
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.75873, -71.46438
> >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: NASHUA
> >>>>>> Score: 33.84896
> >>>>>> ID: 21014865
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.75873, -71.46438
> >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Nashua St
> >>>>>> Score: 33.84896
> >>>>>> ID: 4975671
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.88471, -70.81687
> >>>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>> Name: Nashua Rd
> >>>>>> Score: 33.84896
> >>>>>> ID: 4975400
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.79014, -71.92364
> >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>
> >>>>>>
> >>>>>> Why is the fuzzy query ignored?
> >>>>>> Even if i have separate fields for street, city,region, country,
> >>>>>> this fuzzy query issue will come into place for words with
> >>>>>> multiple parts like main dunstable etc., right?
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
> >>>>>>
> >>>>>> Tomoko,-
> >>>>>>
> >>>>>> Thank You for Your suggestions. i am trying to understand it
> >>>>>> and i thought i did :)
> >>>>>>
> >>>>>> but it does not work with FuzzyQuery when i used with a *single*
> >>>>>> large TextField like street=...value... city=...value...
> >>>>>> region=...value... country=...value... (with or without quotes
> >>>>>> for the values)
> >>>>>>
> >>>>>> What i knew about Lucene fuzzy queries is not holding now with
> >>>>>> this TextField form. That is why i suspected a bug.
> >>>>>>
> >>>>>> 1. Yes, i saw and have a solid proof on that now.
> >>>>>>
> >>>>>> 2. yes, but FuzzyQuery takes quotes as they are, since they are
> >>>>>> escaped and it is not analyzed.
> >>>>>>
> >>>>>> Stuffing into one textfield vs having separate fields should only
> >>>>>> affect probably the performance but not the outcome in my case.
> >>>>>> But, i have been thinking about this and maybe it is the way to
> >>>>>> go in this case.
> >>>>>>
> >>>>>> My CONTENT field has street names in mixed case and city, region
> >>>>>> country names in UPPERCASE. Can this be a problem?
> >>>>>> i thought index stored them in lowercase since i am using
> >>>>>> StandardAnalyzer.
> >>>>>>
> >>>>>> CONTENT field also has full textfield string with street=...
> >>>>>> city=... region=... country=... (here all values are UPPERCASE).
> >>>>>>
> >>>>>> Why cant the index find the names via FuzzyQuery? i tried both
> >>>>>> FuzzyQuery and Query builder as i showed before.
> >>>>>>
> >>>>>> The last advice in Your previous email would nicely go outside
> >>>>>> the parentheses since it might be very critical :) :) :)
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
> >>>>>>
> >>>>>> I'd suggest to correctly understand the way a software works before
> >>>>>> suspecting its bug :-)
> >>>>>>
> >>>>>> I guess you may miss two points:
> >>>>>>
> >>>>>> 1. the standard analyzer (standard tokenizer) breaks words by double
> >>>>>> quote (U+0022) so quotes are not indexed or searched at all if
> >>>>>> you are
> >>>>>> using standard analyzer. (That is the reason you have same results
> >>>>>> with or without quotes.)
> >>>>>> See:
> >>>>>> https://lucene.apache.org/core/8_1_0/core/org/apache/lucene/analysis/standard/StandardTokenizer.html
> >>>>>> and
> >>>>>> http://unicode.org/reports/tr29/
> >>>>>>
> >>>>>> 2. double quote has special meaning (it's interpreted as phrase
> >>>>>> query)
> >>>>>> with the built-in query parser so you need to escape it if you
> >>>>>> want to
> >>>>>> search double quotes itself.
> >>>>>> See:
> >>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Terms
> >>>>>>
> >>>>>> (My advice would be to create separate fields for each key value
> >>>>>> pairs
> >>>>>> instead of stuffing all pairs into one text field, if you need to
> >>>>>> search them separately.)
> >>>>>>
> >>>>>> On Wed, Jun 12, 2019 at 2:39, <baris.kazar@oracle.com> wrote:
> >>>>>>
> >>>>>> i can say that quotes is not the issue with index as it still
> >>>>>> results in
> >>>>>> same results with quotes or without quotes.
> >>>>>>
> >>>>>> i am starting to feel that this might be a bug maybe??
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
> >>>>>>
> >>>>>> Somehow " is causing an issue as this should return street with
> >>>>>> MAIN:
> >>>>>>
> >>>>>> [.contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
> >>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
> >>>>>> states"] -> this was with fuzzyquery on MAINS
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
> >>>>>>
> >>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> >>>>>> +contentDFLT:"country united states", contentDFLT:street
> >>>>>> contentDFLT:mains]
> >>>>>>
> >>>>>> QueryParser chops it into two pieces from
> >>>>>> parser.parse("street=\"MAINS\"");
> >>>>>>
> >>>>>> Index has a TextField named contentDFLT the following data :
> >>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
> >>>>>> HAMPSHIRE" country="UNITED STATES"
> >>>>>>
> >>>>>>
> >>>>>> When i set street=\"MAINS~\" with parser:
> >>>>>> i get the following
> >>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> >>>>>> +contentDFLT:"country united states", contentDFLT:street
> >>>>>> contentDFLT:mains]
> >>>>>>
> >>>>>> probably " quotations are messing this up as You were saying...
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
> >>>>>>
> >>>>>> Or, " (double quotation) in your query string may affect query
> >>>>>> parsing.
> >>>>>>
> >>>>>> When I parse this string by classic query parser (lucene 8.1),
> >>>>>> street="MAINS~"
> >>>>>> parsed (raw) query is
> >>>>>> text:street text:mains
> >>>>>> (I set the default search field to "text", so text:xxxx is appeared
> >>>>>> here.)
> >>>>>>
> >>>>>> Query parsing is a complex process, so it would be good to check
> >>>>>> parsed raw query string especially when you have (reserved) special
> >>>>>> characters in your query...
> >>>>>>
> >>>>>> On Tue, Jun 11, 2019 at 1:10, Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I noticed one small thing in your previous mail.
> >>>>>>
> >>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
> >>>>>>
> >>>>>> which is good.
> >>>>>>
> >>>>>> To specify a search field, ":" (colon) should be used instead of
> >>>>>> "=".
> >>>>>> See the query parser documentation:
> >>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fields
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> I'm not sure this is related to your problem.
> >>>>>>
> >>>>>> On Tue, Jun 11, 2019 at 0:51, <baris.kazar@oracle.com> wrote:
> >>>>>>
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "city=\"NASHUA\""),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "region=\"NEW HAMPSHIRE\""),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "country=\"UNITED STATES\""),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>>
> >>>>>> org.apache.lucene.queryparser.classic.QueryParser parser =
> >>>>>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
> >>>>>> Query q1 = null;
> >>>>>> try {
> >>>>>>     q1 = parser.parse("MAIN");
> >>>>>> } catch (ParseException e) {
> >>>>>>     e.printStackTrace();
> >>>>>> }
> >>>>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
> >>>>>>
> >>>>>> testQuerySearch2 Time to compute: 0 seconds
> >>>>>> Number of results: 1775
> >>>>>> Name: Main St
> >>>>>> Score: 37.20959
> >>>>>> ID: 12681979
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.76416, -71.46681
> >>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> >>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 37.20959
> >>>>>> ID: 12681977
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.747, -71.45957
> >>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> >>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 37.20959
> >>>>>> ID: 12681978
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.73492, -71.44951
> >>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> >>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> >>>>>>
> >>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same
> >>>>>> results
> >>>>>> which is good.
> >>>>>>
> >>>>>> But when i switch to MAINS~ then fuzzy query does not work.
> >>>>>>
> >>>>>>
> >>>>>> i need to say something with the q1 only in the booleanquery:
> >>>>>> it tries to match the MAIN in street, city, region and country
> >>>>>> which are
> >>>>>> in a single TextField field.
> >>>>>> But i dont want this. that is why i need to street="..." etc when
> >>>>>> searching.
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> just for the basic verification, can you find the document without
> >>>>>> fuzzy query? I mean, does this query work for you?
> >>>>>>
> >>>>>> Query query = parser.parse("MAIN");
> >>>>>>
> >>>>>> Tomoko
> >>>>>>
> >>>>>> On Tue, Jun 11, 2019 at 0:22, <baris.kazar@oracle.com> wrote:
> >>>>>>
> >>>>>> why does the second set not work at all?
> >>>>>>
> >>>>>> it is indexed as Textfield like street="..." city="..." etc.
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
> >>>>>>
> >>>>>> i dont know how to use Fuzzyquery with queryparser but probably
> >>>>>> You
> >>>>>> are suggesting
> >>>>>>
> >>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
> >>>>>> Query query = parser.parse("MAINS~2");
> >>>>>>
> >>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
> >>>>>>
> >>>>>> am i right?
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
> >>>>>>
> >>>>>> I would suggest using a QueryParser for your fuzzy query before
> >>>>>> adding it to the Boolean query. This should weed out any case
> >>>>>> issues.
> >>>>>>
> >>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com
> >>>>>> <mailto:baris.kazar@oracle.com>> wrote:
> >>>>>>
> >>>>>> BooleanQuery.Builder booleanQuery = new BooleanQuery.Builder();
> >>>>>>
> >>>>>> // First set
> >>>>>> booleanQuery.add(new FuzzyQuery(new org.apache.lucene.index.Term(field, "MAINS")),
> >>>>>>     BooleanClause.Occur.SHOULD);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
> >>>>>>     BooleanClause.Occur.MUST);
> >>>>>>
> >>>>>> // Second set
> >>>>>> //booleanQuery.add(new FuzzyQuery(new org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
> >>>>>> //    BooleanClause.Occur.SHOULD);
> >>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "city=\"NASHUA\""),
> >>>>>> //    BooleanClause.Occur.MUST);
> >>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "region=\"NEW HAMPSHIRE\""),
> >>>>>> //    BooleanClause.Occur.MUST);
> >>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "country=\"UNITED STATES\""),
> >>>>>> //    BooleanClause.Occur.MUST);
> >>>>>>
> >>>>>> The first set brings also street with Nashua name.
> >>>>>> (NASHUA).
> >>>>>>
> >>>>>> so, to prevent that and since i also indexed with
> >>>>>> street="..."
> >>>>>> city="..." i did the second set but it does not bring
> >>>>>> anything.
> >>>>>>
> >>>>>> createPhraseQuery builds a Phrasequery with one term
> >>>>>> equal to the
> >>>>>> string
> >>>>>> in the call.
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 10:47 AM, baris.kazar@oracle.com
> >>>>>> <mailto:baris.kazar@oracle.com> wrote:
> >>>>>> > How do i check how it is indexed? lowercase or uppercase?
> >>>>>> >
> >>>>>> > the only way now is by testing.
> >>>>>> >
> >>>>>> > i am using standardanalyzer.
> >>>>>> >
> >>>>>> > Best regards
> >>>>>> >
> >>>>>> >
> >>>>>> > On 6/9/19 11:57 AM, Atri Sharma wrote:
> >>>>>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
> >>>>>> >> <tomoko.uchida.1111@gmail.com
> >>>>>> <mailto:tomoko.uchida.1111@gmail.com>> wrote:
> >>>>>> >>> Hi,
> >>>>>> >>>
> >>>>>> >>> What analyzer do you use for the text field? Is the
> >>>>>> term "Main"
> >>>>>> >>> correctly indexed?
> >>>>>> >> Agreed. Also, it would be good if you could post your
> >>>>>> actual
> >>>>>> code.
> >>>>>> >>
> >>>>>> >> What analyzer are you using? If you are using
> >>>>>> StandardAnalyzer,
> >>>>>> then
> >>>>>> >> all of your terms while indexing will be lowercased,
> >>>>>> AFAIK, but
> >>>>>> your
> >>>>>> >> query will not be analyzed until you run a
> >>>>>> QueryParser on it.
> >>>>>> >>
> >>>>>> >>
> >>>>>> >> Atri
> >>>>>> >>
> >>>>>> >
> >>>>>> >
> >>>>>> >
> >>>>>> ---------------------------------------------------------------------
> >>>>>>
> >>>>>>
> >>>>>> > To unsubscribe, e-mail:
> >>>>>> java-user-unsubscribe@lucene.apache.org
> >>>>>> <mailto:java-user-unsubscribe@lucene.apache.org>
> >>>>>> > For additional commands, e-mail:
> >>>>>> java-user-help@lucene.apache.org
> >>>>>> <mailto:java-user-help@lucene.apache.org>
> >>>>>> >
> >>>>>>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: FuzzyQuery- why is it ignored? [ In reply to ]
I see. I am using an older version, 6.6, and we should switch to your 8.1
version or at least 7.x.

Tomoko, I think I understand: you meant MAIN NASHUA .... for the string :)

Again, I really appreciate all the answers.

Another question: how do we disable or enable stemming while indexing? :)

Best regards
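
For reference, the premise that "mains" is one edit away from the indexed term "main" can be checked outside Lucene. The sketch below is plain Java with no Lucene dependency; `EditDistanceSketch` and `levenshtein` are hypothetical names chosen for illustration, and it computes a plain Levenshtein distance rather than calling FuzzyQuery itself. Assuming the limit of 2 edits shown as `~2` in the query plan, a distance of 1 is within range, so a miss would point at something other than the distance.

```java
// Illustrative only: plain Levenshtein distance, not Lucene's own implementation.
final class EditDistanceSketch {

    // Number of single-character insertions, deletions, or substitutions
    // needed to turn a into b, computed with two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Query terms from the thread, compared against the indexed term "main".
        for (String term : new String[] {"mains", "maink", "maint"}) {
            System.out.println(term + " -> main: " + levenshtein(term, "main"));
        }
    }
}
```

Each of the three terms is a single deletion away from "main", so by distance alone "mains" should behave like "maink" and "maint".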


On 6/13/19 10:40 AM, Tomoko Uchida wrote:
> Sorry, I made a mistake when copypasting. Let me just correct my previous mail.
>
>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES".
> 1. Indexed this text: "MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW
> HAMPSHIRE UNITED STATES"
>
> ----
> As far as I can tell, this query correctly finds the indexed document
> (so I have no idea what is wrong with the fuzzy query).
> +contentDFLT:mains~2 +contentDFLT:"nashua"
> +contentDFLT:"new-hampshire" +contentDFLT:"united states"
>
> I am
> - using lucene 8.1.
> - using standard analyzer for both of indexing and searching.
> - using classic query parser for parsing.
>
>
>
> On Thu, Jun 13, 2019 at 23:18 <baris.kazar@oracle.com> wrote:
>> However, the index does not have MAINS but MAIN for the expected entry.
>>
>> Best regards
>>
>>
>>
>> On 6/13/19 10:33 AM, baris.kazar@oracle.com wrote:
>>> does it consider it as like plural word? :) :) :)
>>> That makes sense.
>>>
>>> Best regards
>>>
>>>
>>> On 6/13/19 10:31 AM, baris.kazar@oracle.com wrote:
>>>> Erick,
>>>>
>>>> Cool, could You give a simple example with my example please?
>>>>
>>>> Best regards
>>>>
>>>>
>>>>
>>>> On 6/13/19 10:12 AM, Erick Erickson wrote:
>>>>> Shot in the dark: stemming. Whenever I see a problem with something
>>>>> ending in “s” (or “er” or “ing” or….) my first suspect is that
>>>>> stemming is turned on. In that case the token in the index that’s
>>>>> actually searched on is somewhat different than you expect.
>>>>>
>>>>> The test is easy, just ensure your fieldType contains no stemmers.
>>>>> PorterStemmer is particularly aggressive, but for this case to test
>>>>> I’d just remove all stemming, re-index and see if the results differ.
>>>>>
>>>>> Best,
>>>>> Erick
>>>>>
>>>>>> On Jun 13, 2019, at 7:26 AM, baris.kazar@oracle.com wrote:
>>>>>>
>>>>>> Tomoko,-
>>>>>>
>>>>>> That is strange indeed.
>>>>>>
>>>>>> Something is wrong when i use mains but maink, mainl, mainr,mainq,
>>>>>> maint all work ok any consonant at the end except s works in this
>>>>>> case.
>>>>>>
>>>>>> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
>>>>>>
>>>>>> i am using fuzzy query with ~ from Query.builder and that is not
>>>>>> PhraseQuery.
>>>>>>
>>>>>> Similarly FuzzyQuery with input "mains" (it has to be lowercase
>>>>>> since it does not go through StandardAnalyzer) is also not
>>>>>> PhraseQuery.
>>>>>>
>>>>>> can there be a clearer sample case for ComplexPhraseQuery please in
>>>>>> the docs?
>>>>>>
>>>>>> did You also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
>>>>>> STATES" the expected output in this case?
>>>>>>
>>>>>> Thanks for spending time on this, i would like to thank everyone.
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>> Ok, i think only this very specific only "mains" has an issue.
>>>>>>> It looks strange to me. I did some test locally.
>>>>>>>
>>>>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>> UNITED STATES".
>>>>>>>
>>>>>>> 2a. This query string (just copied from your Case #3) worked
>>>>>>> correctly
>>>>>>> for me as far as I can see.
>>>>>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
>>>>>>>
>>>>>>> 2b. However this query string got no results.
>>>>>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
>>>>>>> It is an expected behaviour because the classic query parser does not
>>>>>>> support fuzzy query inside phrase query (as far as I know).
>>>>>>>
>>>>>>> I suspect you use fuzzy query operator (~) inside phrase query
>>>>>>> ("), as
>>>>>>> the 2b case.
>>>>>>>
>>>>>>> FYI: there is a special parser for such complex phrase query.
>>>>>>> https://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html
>>>>>>>
>>>>>>>
>>>>>>> Tomoko
>>>>>>>
>>>>>>> On Thu, Jun 13, 2019 at 6:16 <baris.kazar@oracle.com> wrote:
>>>>>>>> Ok, i think only this very specific only "mains" has an issue.
>>>>>>>>
>>>>>>>> all i knew about Lucene was fine :) Great...
>>>>>>>>
>>>>>>>> i have one more question:
>>>>>>>>
>>>>>>>> which one is advised to use: FuzzyQuery or the Query.parser with
>>>>>>>> search string~ appended?
>>>>>>>>
>>>>>>>> The second one will go through analyzer and make search string
>>>>>>>> lowercase.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
>>>>>>>>
>>>>>>>> Hi again,-
>>>>>>>>
>>>>>>>> this is really interesting and i hope i am missing something.
>>>>>>>> Index small cases all entries so case sensitivity is not an issue
>>>>>>>> i think.
>>>>>>>>
>>>>>>>> Case #1:
>>>>>>>>
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser =
>>>>>>>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>>>>>>>> Query q1 = null;
>>>>>>>> try {
>>>>>>>>     q1 = parser.parse("Main");
>>>>>>>> } catch (ParseException e) {
>>>>>>>>     e.printStackTrace();
>>>>>>>> }
>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>>
>>>>>>>> This brings with this:
>>>>>>>>
>>>>>>>> query plan:
>>>>>>>>
>>>>>>>> [+contentDFLT:main, +contentDFLT:"nashua",
>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>
>>>>>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after
>>>>>>>> exec finished)
>>>>>>>>
>>>>>>>> Number of results: 12
>>>>>>>> Name: Main Dunstable Rd
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12677400
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.72631, -71.50269
>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>> UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12681980
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12681973
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75045, -71.4607
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12681974
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76019, -71.465
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main Dunstable Rd
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12677399
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.74641, -71.48943
>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>> UNITED STATES
>>>>>>>>
>>>>>>>> Name: S Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 11893215
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73412, -71.44797
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12681978
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: S Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 11893214
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73958, -71.45895
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12681979
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12681977
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Case #2
>>>>>>>>
>>>>>>>> When i did this it also worked by adding ~ to make it Fuzzy query
>>>>>>>> to Main word:
>>>>>>>>
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser =
>>>>>>>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>>>>>>>> Query q1 = null;
>>>>>>>> try {
>>>>>>>>     q1 = parser.parse("Main~");
>>>>>>>> } catch (ParseException e) {
>>>>>>>>     e.printStackTrace();
>>>>>>>> }
>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>>
>>>>>>>> query plan:
>>>>>>>>
>>>>>>>> [+contentDFLT:main~2, +contentDFLT:"nashua",
>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>
>>>>>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging
>>>>>>>> stops)
>>>>>>>> Number of results: 12
>>>>>>>> Name: Main Dunstable Rd
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12677400
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.72631, -71.50269
>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>> UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12681980
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12681973
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75045, -71.4607
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12681974
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76019, -71.465
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main Dunstable Rd
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12677399
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.74641, -71.48943
>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>> UNITED STATES
>>>>>>>>
>>>>>>>> Name: S Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 11893215
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73412, -71.44797
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12681978
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: S Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 11893214
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73958, -71.45895
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12681979
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12681977
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Case #3
>>>>>>>>
>>>>>>>> But why does this not work with fuzzy mode and i misspelled a bit
>>>>>>>> (1 edit away) and as You saw the data is there with Main spelling:
>>>>>>>>
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser =
>>>>>>>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>>>>>>>>
>>>>>>>> Query q1 = null;
>>>>>>>> try {
>>>>>>>>     q1 = parser.parse("Mains~"); // 1 edit away
>>>>>>>> } catch (ParseException e) {
>>>>>>>>     e.printStackTrace();
>>>>>>>> }
>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>> query plan:
>>>>>>>>
>>>>>>>> [+contentDFLT:mains~2, +contentDFLT:"nashua",
>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>
>>>>>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging
>>>>>>>> stops)
>>>>>>>>
>>>>>>>> Number of results: 0
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Case #4
>>>>>>>>
>>>>>>>> Then I changed q1 from MUST to SHOULD above, and I think the fuzzy
>>>>>>>> query is ignored here since there is no MAIN in the first 468
>>>>>>>> results:
>>>>>>>>
>>>>>>>> there is no boost for Mains term here.
>>>>>>>>
>>>>>>>> query plan:
>>>>>>>>
>>>>>>>> [contentDFLT:mains~2, +contentDFLT:"nashua",
>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>
>>>>>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging
>>>>>>>> stops)
>>>>>>>> Number of results: 1794
>>>>>>>> Name: Nashua Dr
>>>>>>>> Score: 34.186226
>>>>>>>> ID: 4974936
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.7636, -71.46063
>>>>>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Nashua River Rail Trl
>>>>>>>> Score: 34.186226
>>>>>>>> ID: 4975508
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.7062, -71.53962
>>>>>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>> UNITED STATES
>>>>>>>>
>>>>>>>> Name: Nashua Rd
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 4975388
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.78746, -71.92823
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: NASHUA
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 21014865
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: NASHUA
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 21014865
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: NASHUA
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 21014865
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: NASHUA
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 21014865
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: NASHUA
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 21014865
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Nashua St
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 4975671
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.88471, -70.81687
>>>>>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Nashua Rd
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 4975400
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.79014, -71.92364
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>>
>>>>>>>> Why is the fuzzy query ignored?
>>>>>>>> Even if i have separate fields for street, city,region, country,
>>>>>>>> this fuzzy query issue will come into place for words with
>>>>>>>> multiple parts like main dunstable etc., right?
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
>>>>>>>>
>>>>>>>> Tomoko,-
>>>>>>>>
>>>>>>>> Thank You for Your suggestions. i am trying to understand it
>>>>>>>> and i thought i did :)
>>>>>>>>
>>>>>>>> but it does not work with FuzzyQuery when i used with a *single*
>>>>>>>> large TextField like street=...value... city=...value...
>>>>>>>> region=...value... country=...value... (with or without quotes
>>>>>>>> for the values)
>>>>>>>>
>>>>>>>> What I knew about Lucene fuzzy queries does not hold with
>>>>>>>> this TextField form. That is why I suspected a bug.
>>>>>>>>
>>>>>>>> 1. Yes, i saw and have a solid proof on that now.
>>>>>>>>
>>>>>>>> 2. yes but FuzzyQuery takes quotes as they are as they are
>>>>>>>> escaped and it is not analyzed.
>>>>>>>>
>>>>>>>> Stuffing into one textfield vs having separate fields should only
>>>>>>>> affect probably the performance but not the outcome in my case.
>>>>>>>> But, i have been thinking about this and maybe it is the way to
>>>>>>>> go in this case.
>>>>>>>>
>>>>>>>> mY CONTENT field has street names in mixed case and city, region
>>>>>>>> country names in UPPERCASE. Can this be a problem?
>>>>>>>> i thought index stored them in lowercase since i am using
>>>>>>>> StandardAnalyzer.
>>>>>>>>
>>>>>>>> CONTENT field also has full textfield string with street=...
>>>>>>>> city=... region=... country=... (here all values are UPPERCASE).
>>>>>>>>
>>>>>>>> Why cant the index find the names via FuzzyQuery? i tried both
>>>>>>>> FuzzyQuery and Query builder as i showed before.
>>>>>>>>
>>>>>>>> The last piece of advice in your previous email deserves to go
>>>>>>>> outside the parentheses, since it might be very critical :) :) :)
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>>>>>>>>
>>>>>>>> I'd suggest to correctly understand the way a software works before
>>>>>>>> suspecting its bug :-)
>>>>>>>>
>>>>>>>> I guess you may miss two points:
>>>>>>>>
>>>>>>>> 1. the standard analyzer (standard tokenizer) breaks words by double
>>>>>>>> quote (U+0022) so quotes are not indexed or searched at all if
>>>>>>>> you are
>>>>>>>> using standard analyzer. (That is the reason you have same results
>>>>>>>> with or without quotes.)
>>>>>>>> See:
>>>>>>>> https://lucene.apache.org/core/8_1_0/core/org/apache/lucene/analysis/standard/StandardTokenizer.html
>>>>>>>> and
>>>>>>>> http://unicode.org/reports/tr29/
>>>>>>>>
>>>>>>>> 2. double quote has special meaning (it's interpreted as phrase
>>>>>>>> query)
>>>>>>>> with the built-in query parser so you need to escape it if you
>>>>>>>> want to
>>>>>>>> search double quotes itself.
>>>>>>>> See:
>>>>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Terms
>>>>>>>>
>>>>>>>> (My advice would be to create separate fields for each key value
>>>>>>>> pairs
>>>>>>>> instead of stuffing all pairs into one text field, if you need to
>>>>>>>> search them separately.)
>>>>>>>>
>>>>>>>> On Wed, Jun 12, 2019 at 2:39 <baris.kazar@oracle.com> wrote:
>>>>>>>>
>>>>>>>> i can say that quotes is not the issue with index as it still
>>>>>>>> results in
>>>>>>>> same results with quotes or without quotes.
>>>>>>>>
>>>>>>>> i am starting to feel that this might be a bug maybe??
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
>>>>>>>>
>>>>>>>> Somehow " is causing an issue as this should return street with
>>>>>>>> MAIN:
>>>>>>>>
>>>>>>>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
>>>>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
>>>>>>>> states"] -> this was with fuzzyquery on MAINS
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
>>>>>>>>
>>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>>>> contentDFLT:mains]
>>>>>>>>
>>>>>>>> QueryParser chops it into two pieces from
>>>>>>>> parser.parse("street=\"MAINS\"");
>>>>>>>>
>>>>>>>> Index has a TextField named contentDFLT the following data :
>>>>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
>>>>>>>> HAMPSHIRE" country="UNITED STATES"
>>>>>>>>
>>>>>>>>
>>>>>>>> When i set street=\"MAINS~\" with parser:
>>>>>>>> i get the following
>>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>>>> contentDFLT:mains]
>>>>>>>>
>>>>>>>> probably " quotations are messing this up as You were saying...
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>>>>>>>>
>>>>>>>> Or, " (double quotation) in your query string may affect query
>>>>>>>> parsing.
>>>>>>>>
>>>>>>>> When I parse this string by classic query parser (lucene 8.1),
>>>>>>>> street="MAINS~"
>>>>>>>> parsed (raw) query is
>>>>>>>> text:street text:mains
>>>>>>>> (I set the default search field to "text", so text:xxxx is appeared
>>>>>>>> here.)
>>>>>>>>
>>>>>>>> Query parsing is a complex process, so it would be good to check
>>>>>>>> parsed raw query string especially when you have (reserved) special
>>>>>>>> characters in your query...
>>>>>>>>
>>>>>>>> On Tue, Jun 11, 2019 at 1:10 Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I noticed one small thing in your previous mail.
>>>>>>>>
>>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>>>>>>>>
>>>>>>>> which is good.
>>>>>>>>
>>>>>>>> To specify a search field, ":" (colon) should be used instead of
>>>>>>>> "=".
>>>>>>>> See the query parser documentation:
>>>>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fields
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm not sure this is related to your problem.
>>>>>>>>
>>>>>>>> On Tue, Jun 11, 2019 at 0:51 <baris.kazar@oracle.com> wrote:
>>>>>>>>
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>>>> phraseAnalyzer) ;
>>>>>>>> Query q1 = null;
>>>>>>>> try {
>>>>>>>> q1 = parser.parse("MAIN");
>>>>>>>> } catch (ParseException e) {
>>>>>>>>
>>>>>>>> e.printStackTrace();
>>>>>>>> }
>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>>>>>>>
>>>>>>>> testQuerySearch2 Time to compute: 0 seconds
>>>>>>>> Number of results: 1775
>>>>>>>> Name: Main St
>>>>>>>> Score: 37.20959
>>>>>>>> ID: 12681979
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 37.20959
>>>>>>>> ID: 12681977
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 37.20959
>>>>>>>> ID: 12681978
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>
>>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same
>>>>>>>> results
>>>>>>>> which is good.
>>>>>>>>
>>>>>>>> But when i switch to MAINS~ then fuzzy query does not work.
>>>>>>>>
>>>>>>>>
>>>>>>>> i need to say something with the q1 only in the booleanquery:
>>>>>>>> it tries to match the MAIN in street, city, region and country
>>>>>>>> which are
>>>>>>>> in a single TextField field.
>>>>>>>> But i dont want this. that is why i need to street="..." etc when
>>>>>>>> searching.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> just for the basic verification, can you find the document without
>>>>>>>> fuzzy query? I mean, does this query work for you?
>>>>>>>>
>>>>>>>> Query query = parser.parse("MAIN");
>>>>>>>>
>>>>>>>> Tomoko
>>>>>>>>
>>>>>>>> On Tue, Jun 11, 2019 at 0:22 <baris.kazar@oracle.com> wrote:
>>>>>>>>
>>>>>>>> why does the second set not work at all?
>>>>>>>>
>>>>>>>> it is indexed as Textfield like street="..." city="..." etc.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>>>>>>>
>>>>>>>> i dont know how to use Fuzzyquery with queryparser but probably
>>>>>>>> You
>>>>>>>> are suggesting
>>>>>>>>
>>>>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
>>>>>>>> Query query = parser.parse("MAINS~2");
>>>>>>>>
>>>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>>>>>>>
>>>>>>>> am i right?
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>>>>>>
>>>>>>>> I would suggest using a QueryParser for your fuzzy query before
>>>>>>>> adding it to the Boolean query. This should weed out any case
>>>>>>>> issues.
>>>>>>>>
>>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com
>>>>>>>> <mailto:baris.kazar@oracle.com>> wrote:
>>>>>>>>
>>>>>>>> BooleanQuery.Builder booleanQuery = new
>>>>>>>> BooleanQuery.Builder();
>>>>>>>>
>>>>>>>> //First set
>>>>>>>>
>>>>>>>> booleanQuery.add(new FuzzyQuery(new
>>>>>>>> org.apache.lucene.index.Term(field, "MAINS")),
>>>>>>>> BooleanClause.Occur.SHOULD);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>> // Second set
>>>>>>>> //booleanQuery.add(new FuzzyQuery(new
>>>>>>>> org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
>>>>>>>> BooleanClause.Occur.SHOULD);
>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>
>>>>>>>> field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>
>>>>>>>> field, "region=\"NEW HAMPSHIRE\""),
>>>>>>>> BooleanClause.Occur.MUST);
>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>
>>>>>>>> field, "country=\"UNITED STATES\""),
>>>>>>>> BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>> The first set brings also street with Nashua name.
>>>>>>>> (NASHUA).
>>>>>>>>
>>>>>>>> so, to prevent that and since i also indexed with
>>>>>>>> street="..."
>>>>>>>> city="..." i did the second set but it does not bring
>>>>>>>> anything.
>>>>>>>>
>>>>>>>> createPhraseQuery builds a Phrasequery with one term
>>>>>>>> equal to the
>>>>>>>> string
>>>>>>>> in the call.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 10:47 AM, baris.kazar@oracle.com
>>>>>>>> <mailto:baris.kazar@oracle.com> wrote:
>>>>>>>> > How do I check how it is indexed? Lowercase or uppercase?
>>>>>>>> >
>>>>>>>> > The only way now is by testing.
>>>>>>>> >
>>>>>>>> > i am using standardanalyzer.
>>>>>>>> >
>>>>>>>> > Best regards
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>>>>>>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>>>>>>> >> <tomoko.uchida.1111@gmail.com
>>>>>>>> <mailto:tomoko.uchida.1111@gmail.com>> wrote:
>>>>>>>> >>> Hi,
>>>>>>>> >>>
>>>>>>>> >>> What analyzer do you use for the text field? Is the
>>>>>>>> term "Main"
>>>>>>>> >>> correctly indexed?
>>>>>>>> >> Agreed. Also, it would be good if you could post your
>>>>>>>> actual
>>>>>>>> code.
>>>>>>>> >>
>>>>>>>> >> What analyzer are you using? If you are using
>>>>>>>> StandardAnalyzer,
>>>>>>>> then
>>>>>>>> >> all of your terms while indexing will be lowercased,
>>>>>>>> AFAIK, but
>>>>>>>> your
>>>>>>>> >> query will not be analyzed until you run a
>>>>>>>> QueryParser on it.
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >> Atri
>>>>>>>> >>
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: FuzzyQuery- why is it ignored? [ In reply to ]
Hello,
Erick explained how to disable stemming in Solr, but I am using plain Lucene.
I am also researching how to disable it in Lucene, but if you already have instructions for doing so,
I would appreciate it if you could share them here.
Best regards

----- Original Message -----
From: baris.kazar@oracle.com
To: java-user@lucene.apache.org, tomoko.uchida.1111@gmail.com, erickerickson@gmail.com, atri@linux.com, baris.kazar@oracle.com, lucene@mikemccandless.com
Sent: Thursday, June 13, 2019 10:48:47 AM GMT -05:00 US/Canada Eastern
Subject: Re: FuzzyQuery- why is it ignored?

I see. I am using an older version, 6.6, and we should switch to your
version, 8.1, or at least to 7.x.

Tomoko, I think I understand: you meant MAIN NASHUA ... for the string :)

Again, I really appreciate all the answers.

Another question: how do we disable or enable stemming while indexing? :)

Best regards
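
For reference, the edit-distance reasoning behind the "mains" / "MAIN" discussion can be sketched in plain Java with no Lucene dependency. This is only an illustration: the class name EditDistanceSketch is made up here, and the code is a textbook Levenshtein computation, not Lucene's own implementation. Lucene's FuzzyQuery accepts at most 2 edits and by default also counts a transposition as a single edit, which this sketch does not model. It assumes the indexed term is the lowercased "main".

```java
import java.util.Locale;

/** Textbook Levenshtein distance; illustration only, not Lucene code. */
final class EditDistanceSketch {

    /** Counts the single-character insertions, deletions, or substitutions needed to turn a into b. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // empty prefix of a -> first j chars of b: j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // first i chars of a -> empty prefix of b: i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexed = "main"; // assumed form of the term in the index (lowercased)
        System.out.println(distance("mains", indexed));                           // prints 1
        System.out.println(distance("MAINS", indexed));                           // prints 5
        System.out.println(distance("MAINS".toLowerCase(Locale.ROOT), indexed));  // prints 1
    }
}
```

Under these assumptions, "mains" is within the 2-edit limit of "main", while an unanalyzed uppercase "MAINS" is 5 edits away and cannot match. That is consistent with the earlier advice that a term passed directly to FuzzyQuery must already be lowercased.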


Re: FuzzyQuery- why is it ignored? [ In reply to ]
Hi,

You said you are using the standard analyzer. If so, you are not using any
stemmer at all: StandardAnalyzer only tokenizes, lowercases, and optionally
removes stop words (please see the analyzer's Javadocs).
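
To illustrate why this matters for the fuzzy query: with no stemmer in the chain, the indexed term for "MAIN" is just the lowercased token main, and whether a fuzzy term matches depends on its edit distance to that token. The following is a minimal, stdlib-only Java sketch of plain Levenshtein distance. It is not Lucene's implementation (Lucene builds an automaton and, by default, also counts a transposition as a single edit); the class name EditDistanceSketch and its method are hypothetical names chosen for this illustration.

```java
// Hypothetical helper for illustration only; not part of Lucene.
class EditDistanceSketch {

    /** Plain Levenshtein distance: insertions, deletions, substitutions. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexedTerm = "main"; // what StandardAnalyzer would index for "MAIN"
        String[] queryTerms = {"mains", "maink", "MAINS"};
        for (String q : queryTerms) {
            System.out.println(q + " -> " + levenshtein(q, indexedTerm));
        }
    }
}
```

Under these assumptions, mains and maink are each one edit away from main, well within the maximum of 2 edits that FuzzyQuery supports, while an unanalyzed, uppercase MAINS is five edits away. That is consistent with the advice elsewhere in this thread to lowercase the term (or go through the query parser) before building a FuzzyQuery by hand.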

On Sun, Jun 16, 2019 at 11:43, Baris Kazar <baris.kazar@oracle.com> wrote:
>
> Hello,-
> Erick explained how to disable stemming in Solr, but I am using plain Lucene.
> I am also researching how to disable it in Lucene, but if you already have instructions for doing so,
> I would appreciate it if you could share them here.
> Best regards
>
> ----- Original Message -----
> From: baris.kazar@oracle.com
> To: java-user@lucene.apache.org, tomoko.uchida.1111@gmail.com, erickerickson@gmail.com, atri@linux.com, baris.kazar@oracle.com, lucene@mikemccandless.com
> Sent: Thursday, June 13, 2019 10:48:47 AM GMT -05:00 US/Canada Eastern
> Subject: Re: FuzzyQuery- why is it ignored?
>
> I see. I am using an older version, 6.6, and we should switch to your 8.1
> version, or at least 7.x.
>
> Tomoko i think i understood You meant MAIN NASHUA .... for the string :)
>
> Again i really appreciate all answers.
>
> How do we disable or enable stemming while indexing? :) another question.
>
> Best regards
>
>
> On 6/13/19 10:40 AM, Tomoko Uchida wrote:
> > Sorry, I made a mistake when copypasting. Let me just correct my previous mail.
> >
> >> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES".
> > 1. Indexed this text: "MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW
> > HAMPSHIRE UNITED STATES"
> >
> > ----
> > As far as I can tell, this query correctly finds the indexed document
> > (so I have no idea what is wrong with the fuzzy query).
> > +contentDFLT:mains~2 +contentDFLT:"nashua"
> > +contentDFLT:"new-hampshire" +contentDFLT:"united states"
> >
> > I am
> > - using lucene 8.1.
> > - using standard analyzer for both of indexing and searching.
> > - using classic query parser for parsing.
> >
> >
> >
> > On Thu, Jun 13, 2019 at 23:18, <baris.kazar@oracle.com> wrote:
> >> However, the index does not have MAINS but MAIN for the expected entry.
> >>
> >> Best regards
> >>
> >>
> >>
> >> On 6/13/19 10:33 AM, baris.kazar@oracle.com wrote:
> >>> does it consider it as like plural word? :) :) :)
> >>> That makes sense.
> >>>
> >>> Best regards
> >>>
> >>>
> >>> On 6/13/19 10:31 AM, baris.kazar@oracle.com wrote:
> >>>> Erick,
> >>>>
> >>>> Cool, could You give a simple example with my example please?
> >>>>
> >>>> Best regards
> >>>>
> >>>>
> >>>>
> >>>> On 6/13/19 10:12 AM, Erick Erickson wrote:
> >>>>> Shot in the dark: stemming. Whenever I see a problem with something
> >>>>> ending in “s” (or “er” or “ing” or….) my first suspect is that
> >>>>> stemming is turned on. In that case the token in the index that’s
> >>>>> actually searched on is somewhat different than you expect.
> >>>>>
> >>>>> The test is easy, just ensure your fieldType contains no stemmers.
> >>>>> PorterStemmer is particularly aggressive, but for this case to test
> >>>>> I’d just remove all stemming, re-index and see if the results differ.
> >>>>>
> >>>>> Best,
> >>>>> Erick
> >>>>>
> >>>>>> On Jun 13, 2019, at 7:26 AM, baris.kazar@oracle.com wrote:
> >>>>>>
> >>>>>> Tomoko,-
> >>>>>>
> >>>>>> That is strange indeed.
> >>>>>>
> >>>>>> Something is wrong only when I use mains; maink, mainl, mainr, mainq,
> >>>>>> and maint all work OK. Any consonant at the end except s works in this
> >>>>>> case.
> >>>>>>
> >>>>>> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
> >>>>>>
> >>>>>> i am using fuzzy query with ~ from Query.builder and that is not
> >>>>>> PhraseQuery.
> >>>>>>
> >>>>>> Similarly FuzzyQuery with input "mains" (it has to be lowercase
> >>>>>> since it does not go through StandardAnalyzer) is also not
> >>>>>> PhraseQuery.
> >>>>>>
> >>>>>> can there be a clearer sample case for ComplexPhraseQuery please in
> >>>>>> the docs?
> >>>>>>
> >>>>>> did You also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
> >>>>>> STATES" the expected output in this case?
> >>>>>>
> >>>>>> Thanks for spending time on this, i would like to thank everyone.
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>>> Ok, i think only this very specific only "mains" has an issue.
> >>>>>>> It looks strange to me. I did some test locally.
> >>>>>>>
> >>>>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>>> UNITED STATES".
> >>>>>>>
> >>>>>>> 2a. This query string (just copied from your Case #3) worked
> >>>>>>> correctly
> >>>>>>> for me as far as I can see.
> >>>>>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
> >>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
> >>>>>>>
> >>>>>>> 2b. However this query string got no results.
> >>>>>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
> >>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
> >>>>>>> It is an expected behaviour because the classic query parser does not
> >>>>>>> support fuzzy query inside phrase query (as far as I know).
> >>>>>>>
> >>>>>>> I suspect you use fuzzy query operator (~) inside phrase query
> >>>>>>> ("), as
> >>>>>>> the 2b case.
> >>>>>>>
> >>>>>>> FYI: there is a special parser for such complex phrase query.
> >>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_complexPhrase_ComplexPhraseQueryParser.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=ZcXpaSlwS5DegX76mHTb_6DH3P7noan1eeMXc-Vh5M8&s=FoIMlcjDO2b7Gut9XRx-NIBWiBQWItsj8IlylJC7Wkc&e=
> >>>>>>>
> >>>>>>>
> >>>>>>> Tomoko
> >>>>>>>
> >>>>>>> On Thu, Jun 13, 2019 at 6:16, <baris.kazar@oracle.com> wrote:
> >>>>>>>> Ok, i think only this very specific only "mains" has an issue.
> >>>>>>>>
> >>>>>>>> all i knew about Lucene was fine :) Great...
> >>>>>>>>
> >>>>>>>> i have one more question:
> >>>>>>>>
> >>>>>>>> which one is advised to use: FuzzyQuery or the Query.parser with
> >>>>>>>> search string~ appended?
> >>>>>>>>
> >>>>>>>> The second one will go through analyzer and make search string
> >>>>>>>> lowercase.
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
> >>>>>>>>
> >>>>>>>> Hi again,-
> >>>>>>>>
> >>>>>>>> this is really interesting and i hope i am missing something.
> >>>>>>>> Index small cases all entries so case sensitivity is not an issue
> >>>>>>>> i think.
> >>>>>>>>
> >>>>>>>> Case #1:
> >>>>>>>>
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> >>>>>>>> phraseAnalyzer) ;
> >>>>>>>> Query q1 = null;
> >>>>>>>> try {
> >>>>>>>> q1 = parser.parse("Main");
> >>>>>>>> } catch (ParseException e) {
> >>>>>>>> e.printStackTrace();
> >>>>>>>> }
> >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> This brings with this:
> >>>>>>>>
> >>>>>>>> query plan:
> >>>>>>>>
> >>>>>>>> [+contentDFLT:main, +contentDFLT:"nashua",
> >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> >>>>>>>>
> >>>>>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after
> >>>>>>>> exec finished)
> >>>>>>>>
> >>>>>>>> Number of results: 12
> >>>>>>>> Name: Main Dunstable Rd
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12677400
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.72631, -71.50269
> >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>>>> UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12681980
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76416, -71.46681
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12681973
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75045, -71.4607
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12681974
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76019, -71.465
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main Dunstable Rd
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12677399
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.74641, -71.48943
> >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>>>> UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: S Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 11893215
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73412, -71.44797
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12681978
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73492, -71.44951
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: S Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 11893214
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73958, -71.45895
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12681979
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76416, -71.46681
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12681977
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.747, -71.45957
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Case #2
> >>>>>>>>
> >>>>>>>> When i did this it also worked by adding ~ to make it Fuzzy query
> >>>>>>>> to Main word:
> >>>>>>>>
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> >>>>>>>> phraseAnalyzer) ;
> >>>>>>>> Query q1 = null;
> >>>>>>>> try {
> >>>>>>>> q1 = parser.parse("Main~");
> >>>>>>>> } catch (ParseException e) {
> >>>>>>>> e.printStackTrace();
> >>>>>>>> }
> >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> query plan:
> >>>>>>>>
> >>>>>>>> [+contentDFLT:main~2, +contentDFLT:"nashua",
> >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> >>>>>>>>
> >>>>>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging
> >>>>>>>> stops)
> >>>>>>>> Number of results: 12
> >>>>>>>> Name: Main Dunstable Rd
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12677400
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.72631, -71.50269
> >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>>>> UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12681980
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76416, -71.46681
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12681973
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75045, -71.4607
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12681974
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76019, -71.465
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main Dunstable Rd
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12677399
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.74641, -71.48943
> >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>>>> UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: S Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 11893215
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73412, -71.44797
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12681978
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73492, -71.44951
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: S Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 11893214
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73958, -71.45895
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12681979
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76416, -71.46681
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12681977
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.747, -71.45957
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Case #3
> >>>>>>>>
> >>>>>>>> But why does this not work in fuzzy mode when I misspell it slightly
> >>>>>>>> (1 edit away)? As you saw, the data is there with the Main spelling:
> >>>>>>>>
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> >>>>>>>> phraseAnalyzer) ;
> >>>>>>>>
> >>>>>>>> Query q1 = null;
> >>>>>>>> try {
> >>>>>>>> q1 = parser.parse("Mains~"); // 1 edit away
> >>>>>>>> } catch (ParseException e) {
> >>>>>>>> e.printStackTrace();
> >>>>>>>> }
> >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> >>>>>>>>
> >>>>>>>> query plan:
> >>>>>>>>
> >>>>>>>> [+contentDFLT:mains~2, +contentDFLT:"nashua",
> >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> >>>>>>>>
> >>>>>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging
> >>>>>>>> stops)
> >>>>>>>>
> >>>>>>>> Number of results: 0
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Case #4
> >>>>>>>>
> >>>>>>>> Then I changed q1 from MUST to SHOULD above, and I think the fuzzy
> >>>>>>>> query is ignored here, since there is no MAIN in the first 468
> >>>>>>>> results:
> >>>>>>>>
> >>>>>>>> there is no boost for Mains term here.
> >>>>>>>>
> >>>>>>>> query plan:
> >>>>>>>>
> >>>>>>>> [contentDFLT:mains~2, +contentDFLT:"nashua",
> >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> >>>>>>>>
> >>>>>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging
> >>>>>>>> stops)
> >>>>>>>> Number of results: 1794
> >>>>>>>> Name: Nashua Dr
> >>>>>>>> Score: 34.186226
> >>>>>>>> ID: 4974936
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.7636, -71.46063
> >>>>>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Nashua River Rail Trl
> >>>>>>>> Score: 34.186226
> >>>>>>>> ID: 4975508
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.7062, -71.53962
> >>>>>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>>>> UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Nashua Rd
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 4975388
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.78746, -71.92823
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: NASHUA
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 21014865
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75873, -71.46438
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: NASHUA
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 21014865
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75873, -71.46438
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: NASHUA
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 21014865
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75873, -71.46438
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: NASHUA
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 21014865
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75873, -71.46438
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: NASHUA
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 21014865
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75873, -71.46438
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Nashua St
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 4975671
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.88471, -70.81687
> >>>>>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Nashua Rd
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 4975400
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.79014, -71.92364
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Why is the fuzzy query ignored?
> >>>>>>>> Even if i have separate fields for street, city,region, country,
> >>>>>>>> this fuzzy query issue will come into place for words with
> >>>>>>>> multiple parts like main dunstable etc., right?
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
> >>>>>>>>
> >>>>>>>> Tomoko,-
> >>>>>>>>
> >>>>>>>> Thank You for Your suggestions. i am trying to understand it
> >>>>>>>> and i thought i did :)
> >>>>>>>>
> >>>>>>>> but it does not work with FuzzyQuery when i used with a *single*
> >>>>>>>> large TextField like street=...value... city=...value...
> >>>>>>>> region=...value... country=...value... (with or without quotes
> >>>>>>>> for the values)
> >>>>>>>>
> >>>>>>>> What i knew about Lucene fuzzy queries are not holding now with
> >>>>>>>> this Textfield form. That is why i suspected of a bug.
> >>>>>>>>
> >>>>>>>> 1. Yes, i saw and have a solid proof on that now.
> >>>>>>>>
> >>>>>>>> 2. Yes, but FuzzyQuery takes the quotes as they are, since they are
> >>>>>>>> escaped and the term is not analyzed.
> >>>>>>>>
> >>>>>>>> Stuffing into one textfield vs having separate fields should only
> >>>>>>>> affect probably the performance but not the outcome in my case.
> >>>>>>>> But, i have been thinking about this and maybe it is the way to
> >>>>>>>> go in this case.
> >>>>>>>>
> >>>>>>>> My CONTENT field has street names in mixed case and city, region,
> >>>>>>>> and country names in UPPERCASE. Can this be a problem?
> >>>>>>>> I thought the index stored them in lowercase since I am using
> >>>>>>>> StandardAnalyzer.
> >>>>>>>>
> >>>>>>>> CONTENT field also has full textfield string with street=...
> >>>>>>>> city=... region=... country=... (here all values are UPPERCASE).
> >>>>>>>>
> >>>>>>>> Why can't the index find the names via FuzzyQuery? I tried both
> >>>>>>>> FuzzyQuery and Query builder, as I showed before.
> >>>>>>>>
> >>>>>>>> The last piece of advice in your previous email would nicely go outside
> >>>>>>>> the parentheses, since it might be very critical :) :) :)
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
> >>>>>>>>
> >>>>>>>> I'd suggest to correctly understand the way a software works before
> >>>>>>>> suspecting its bug :-)
> >>>>>>>>
> >>>>>>>> I guess you may miss two points:
> >>>>>>>>
> >>>>>>>> 1. the standard analyzer (standard tokenizer) breaks words by double
> >>>>>>>> quote (U+0022) so quotes are not indexed or searched at all if
> >>>>>>>> you are
> >>>>>>>> using standard analyzer. (That is the reason you have same results
> >>>>>>>> with or without quotes.)
> >>>>>>>> See:
> >>>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_core_org_apache_lucene_analysis_standard_StandardTokenizer.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=8E2lp1YIGM-3v3FspeieGl8z8rEBs6qioTudtFNzh8c&e=
> >>>>>>>> and
> >>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__unicode.org_reports_tr29_&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=riCZ_f25XW869CKbHPUqfbLiDU-AukE6la0xTLMw6u8&e=
> >>>>>>>>
> >>>>>>>> 2. double quote has special meaning (it's interpreted as phrase
> >>>>>>>> query)
> >>>>>>>> with the built-in query parser so you need to escape it if you
> >>>>>>>> want to
> >>>>>>>> search double quotes itself.
> >>>>>>>> See:
> >>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Terms&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=t8OYTgidvcwNpAVFuTsqGhDJK5BwUZVCxc0mPHzqCYU&e=
> >>>>>>>>
> >>>>>>>> (My advice would be to create separate fields for each key value
> >>>>>>>> pairs
> >>>>>>>> instead of stuffing all pairs into one text field, if you need to
> >>>>>>>> search them separately.)
> >>>>>>>>
> >>>>>>>> On Wed, Jun 12, 2019 at 2:39, <baris.kazar@oracle.com> wrote:
> >>>>>>>>
> >>>>>>>> i can say that quotes is not the issue with index as it still
> >>>>>>>> results in
> >>>>>>>> same results with quotes or without quotes.
> >>>>>>>>
> >>>>>>>> i am starting to feel that this might be a bug maybe??
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
> >>>>>>>>
> >>>>>>>> Somehow " is causing an issue as this should return street with
> >>>>>>>> MAIN:
> >>>>>>>>
> >>>>>>>> [.contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
> >>>>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
> >>>>>>>> states"] -> this was with fuzzyquery on MAINS
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
> >>>>>>>>
> >>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> >>>>>>>> +contentDFLT:"country united states", contentDFLT:street
> >>>>>>>> contentDFLT:mains]
> >>>>>>>>
> >>>>>>>> QueryParser chops it into two pieces from
> >>>>>>>> parser.parse("street=\"MAINS\"");
> >>>>>>>>
> >>>>>>>> Index has a TextField named contentDFLT the following data :
> >>>>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
> >>>>>>>> HAMPSHIRE" country="UNITED STATES"
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> When i set street=\"MAINS~\" with parser:
> >>>>>>>> i get the following
> >>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> >>>>>>>> +contentDFLT:"country united states", contentDFLT:street
> >>>>>>>> contentDFLT:mains]
> >>>>>>>>
> >>>>>>>> probably " quotations are messing this up as You were saying...
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
> >>>>>>>>
> >>>>>>>> Or, " (double quotation) in your query string may affect query
> >>>>>>>> parsing.
> >>>>>>>>
> >>>>>>>> When I parse this string by classic query parser (lucene 8.1),
> >>>>>>>> street="MAINS~"
> >>>>>>>> parsed (raw) query is
> >>>>>>>> text:street text:mains
> >>>>>>>> (I set the default search field to "text", so text:xxxx is appeared
> >>>>>>>> here.)
> >>>>>>>>
> >>>>>>>> Query parsing is a complex process, so it would be good to check
> >>>>>>>> parsed raw query string especially when you have (reserved) special
> >>>>>>>> characters in your query...
> >>>>>>>>
> >>>>>>>> On Tue, Jun 11, 2019 at 1:10, Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I noticed one small thing in your previous mail.
> >>>>>>>>
> >>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
> >>>>>>>>
> >>>>>>>> which is good.
> >>>>>>>>
> >>>>>>>> To specify a search field, ":" (colon) should be used instead of
> >>>>>>>> "=".
> >>>>>>>> See the query parser documentation:
> >>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Fields&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=u4SeJqH4lePhOazCLwxLEr3WqcMkODtYLv4njiKZ4PM&s=WrNfUXO9gz1PqpczTJw1vD9sWqvr76WRv2Aeo9uWqa4&e=
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I'm not sure this is related to your problem.
> >>>>>>>>
> >>>>>>>> On Tue, Jun 11, 2019 at 0:51, <baris.kazar@oracle.com> wrote:
> >>>>>>>>
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
> >>>>>>>>
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> >>>>>>>> phraseAnalyzer) ;
> >>>>>>>> Query q1 = null;
> >>>>>>>> try {
> >>>>>>>> q1 = parser.parse("MAIN");
> >>>>>>>> } catch (ParseException e) {
> >>>>>>>>
> >>>>>>>> e.printStackTrace();
> >>>>>>>> }
> >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
> >>>>>>>>
> >>>>>>>> testQuerySearch2 Time to compute: 0 seconds
> >>>>>>>> Number of results: 1775
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 37.20959
> >>>>>>>> ID: 12681979
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76416, -71.46681
> >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 37.20959
> >>>>>>>> ID: 12681977
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.747, -71.45957
> >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 37.20959
> >>>>>>>> ID: 12681978
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73492, -71.44951
> >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> >>>>>>>>
> >>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same
> >>>>>>>> results
> >>>>>>>> which is good.
> >>>>>>>>
> >>>>>>>> But when i switch to MAINS~ then fuzzy query does not work.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> i need to say something with the q1 only in the booleanquery:
> >>>>>>>> it tries to match the MAIN in street, city, region and country
> >>>>>>>> which are
> >>>>>>>> in a single TextField field.
> >>>>>>>> But i dont want this. that is why i need to street="..." etc when
> >>>>>>>> searching.
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> just for the basic verification, can you find the document without
> >>>>>>>> fuzzy query? I mean, does this query work for you?
> >>>>>>>>
> >>>>>>>> Query query = parser.parse("MAIN");
> >>>>>>>>
> >>>>>>>> Tomoko
> >>>>>>>>
> >>>>>>>> On Tue, Jun 11, 2019 at 0:22, <baris.kazar@oracle.com> wrote:
> >>>>>>>>
> >>>>>>>> Why does the second set not work at all?
> >>>>>>>>
> >>>>>>>> it is indexed as Textfield like street="..." city="..." etc.
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
> >>>>>>>>
> >>>>>>>> i dont know how to use Fuzzyquery with queryparser but probably
> >>>>>>>> You
> >>>>>>>> are suggesting
> >>>>>>>>
> >>>>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
> >>>>>>>> Query query = parser.parse("MAINS~2");
> >>>>>>>>
> >>>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
> >>>>>>>>
> >>>>>>>> am i right?
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
> >>>>>>>>
> >>>>>>>> I would suggest using a QueryParser for your fuzzy query before
> >>>>>>>> adding it to the Boolean query. This should weed out any case
> >>>>>>>> issues.
> >>>>>>>>
> >>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com
> >>>>>>>> <mailto:baris.kazar@oracle.com>> wrote:
> >>>>>>>>
> >>>>>>>> BooleanQuery.Builder booleanQuery = new
> >>>>>>>> BooleanQuery.Builder();
> >>>>>>>>
> >>>>>>>> //First set
> >>>>>>>>
> >>>>>>>> booleanQuery.add(new FuzzyQuery(new
> >>>>>>>> org.apache.lucene.index.Term(field, "MAINS")),
> >>>>>>>> BooleanClause.Occur.SHOULD);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> >>>>>>>>
> >>>>>>>> // Second set
> >>>>>>>> //booleanQuery.add(new FuzzyQuery(new
> >>>>>>>> org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
> >>>>>>>> BooleanClause.Occur.SHOULD);
> >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> >>>>>>>>
> >>>>>>>> field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
> >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> >>>>>>>>
> >>>>>>>> field, "region=\"NEW HAMPSHIRE\""),
> >>>>>>>> BooleanClause.Occur.MUST);
> >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> >>>>>>>>
> >>>>>>>> field, "country=\"UNITED STATES\""),
> >>>>>>>> BooleanClause.Occur.MUST);
> >>>>>>>>
> >>>>>>>> The first set also returns streets named Nashua (NASHUA).
> >>>>>>>>
> >>>>>>>> To prevent that, and since I also indexed with street="..."
> >>>>>>>> city="...", I tried the second set, but it does not return
> >>>>>>>> anything.
> >>>>>>>>
> >>>>>>>> createPhraseQuery builds a PhraseQuery with one term equal to the
> >>>>>>>> string passed in the call.
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 10:47 AM, baris.kazar@oracle.com
> >>>>>>>> <mailto:baris.kazar@oracle.com> wrote:
> >>>>>>>> > How do I check how it is indexed? Lowercase or uppercase?
> >>>>>>>> >
> >>>>>>>> > The only way right now is by testing.
> >>>>>>>> >
> >>>>>>>> > I am using StandardAnalyzer.
> >>>>>>>> >
> >>>>>>>> > Best regards
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> > On 6/9/19 11:57 AM, Atri Sharma wrote:
> >>>>>>>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
> >>>>>>>> >> <tomoko.uchida.1111@gmail.com
> >>>>>>>> <mailto:tomoko.uchida.1111@gmail.com>> wrote:
> >>>>>>>> >>> Hi,
> >>>>>>>> >>>
> >>>>>>>> >>> What analyzer do you use for the text field? Is the
> >>>>>>>> term "Main"
> >>>>>>>> >>> correctly indexed?
> >>>>>>>> >> Agreed. Also, it would be good if you could post your
> >>>>>>>> actual
> >>>>>>>> code.
> >>>>>>>> >>
> >>>>>>>> >> What analyzer are you using? If you are using
> >>>>>>>> StandardAnalyzer,
> >>>>>>>> then
> >>>>>>>> >> all of your terms while indexing will be lowercased,
> >>>>>>>> AFAIK, but
> >>>>>>>> your
> >>>>>>>> >> query will not be analyzed until you run a
> >>>>>>>> QueryParser on it.
> >>>>>>>> >>
> >>>>>>>> >>
> >>>>>>>> >> Atri
> >>>>>>>> >>
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> ---------------------------------------------------------------------
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> > To unsubscribe, e-mail:
> >>>>>>>> java-user-unsubscribe@lucene.apache.org
> >>>>>>>> <mailto:java-user-unsubscribe@lucene.apache.org>
> >>>>>>>> > For additional commands, e-mail:
> >>>>>>>> java-user-help@lucene.apache.org
> >>>>>>>> <mailto:java-user-help@lucene.apache.org>
> >>>>>>>> >


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: FuzzyQuery- why is it ignored? [ In reply to ]
Tomoko,-
Yes, I noticed that last night while I was researching it, and
thanks for confirming. StandardAnalyzer does not do stemming.
So the MAINS case must have some other cause.
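
One way to double-check the "1 edit away" premise outside Lucene is to compute the plain Levenshtein distance between the query term and the indexed term. The sketch below is only an illustration: the class and method names are hypothetical, and it is not how FuzzyQuery is implemented internally (Lucene uses automata and by default also counts a transposition as one edit). It only confirms that "mains" is within the ~2 limit shown in the query plans; it does not explain why the query returned no hits.

```java
// Hypothetical helper, not part of Lucene: classic dynamic-programming
// Levenshtein distance (insertions, deletions, substitutions only).
final class EditDistanceSketch {

    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // cost of building b's prefix from an empty string
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // cost of deleting the first i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            // The row just computed becomes the previous row for the next pass.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Query term vs. indexed term, both lowercased as the analyzer would do.
        System.out.println(levenshtein("mains", "main"));
        System.out.println(levenshtein("maink", "main"));
    }
}
```

Both calls in main compare a five-letter query term against the indexed term "main", so any difference in search behaviour between "mains" and "maink" would have to come from somewhere other than the edit distance itself.
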
Best regards

----- Original Message -----
From: tomoko.uchida.1111@gmail.com
To: java-user@lucene.apache.org
Sent: Sunday, June 16, 2019 4:39:29 AM GMT -05:00 US/Canada Eastern
Subject: Re: FuzzyQuery- why is it ignored?

Hi,

You said you are using the standard analyzer. If so, you are not using any
stemmer at all (please see the analyzer's Javadocs).
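
To make the "no stemmer" point concrete without a Lucene dependency, here is a rough stand-in for the behaviour described in this thread: lowercase the text and split it at characters that are not letters or digits. This is an approximation under stated assumptions, not StandardAnalyzer itself (the real tokenizer follows the Unicode UAX #29 rules, and stop-word handling is ignored here); the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in, NOT Lucene's StandardAnalyzer: lowercases the input
// and splits on any run of characters that are not letters or digits.
// There is deliberately no stemming step.
final class RoughAnalyzerSketch {

    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) { // skip the empty piece a leading separator produces
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("street=\"MAINS\" city=\"NASHUA\""));
        System.out.println(tokens("MAIN DUNSTABLE NASHUA"));
    }
}
```

Two things to notice: nothing strips a trailing "s", so "MAINS" stays "mains", and the "=" and double quotes never reach the tokens, which matches the street / mains split reported for the parsed query.
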

On Sun, Jun 16, 2019 at 11:43, Baris Kazar <baris.kazar@oracle.com> wrote:
>
> Hello,-
> Erick explained how to disable stemming in Solr, but I am using pure Lucene.
> I am also researching how to disable it in Lucene, but if you already have instructions for doing so,
> I would appreciate it if you could share them here.
> Best regards
>
> ----- Original Message -----
> From: baris.kazar@oracle.com
> To: java-user@lucene.apache.org, tomoko.uchida.1111@gmail.com, erickerickson@gmail.com, atri@linux.com, baris.kazar@oracle.com, lucene@mikemccandless.com
> Sent: Thursday, June 13, 2019 10:48:47 AM GMT -05:00 US/Canada Eastern
> Subject: Re: FuzzyQuery- why is it ignored?
>
> I see. I am using an older version, 6.6, and we should switch to your 8.1
> version, or at least 7.x.
>
> Tomoko i think i understood You meant MAIN NASHUA .... for the string :)
>
> Again i really appreciate all answers.
>
> How do we disable or enable stemming while indexing? :) another question.
>
> Best regards
>
>
> On 6/13/19 10:40 AM, Tomoko Uchida wrote:
> > Sorry, I made a mistake when copy-pasting. Let me just correct my previous mail.
> >
> >> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES".
> > 1. Indexed this text: "MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW
> > HAMPSHIRE UNITED STATES"
> >
> > ----
> > As far as I can tell, this query correctly finds the indexed document
> > (so I have no idea what is wrong with the fuzzy query).
> > +contentDFLT:mains~2 +contentDFLT:"nashua"
> > +contentDFLT:"new-hampshire" +contentDFLT:"united states"
> >
> > I am
> > - using lucene 8.1.
> > - using standard analyzer for both indexing and searching.
> > - using classic query parser for parsing.
> >
> >
> >
> > On Thu, Jun 13, 2019 at 23:18, <baris.kazar@oracle.com> wrote:
> >> However, the index does not have MAINS but MAIN for the expected entry.
> >>
> >> Best regards
> >>
> >>
> >>
> >> On 6/13/19 10:33 AM, baris.kazar@oracle.com wrote:
> >>> Does it treat it like a plural word? :) :) :)
> >>> That makes sense.
> >>>
> >>> Best regards
> >>>
> >>>
> >>> On 6/13/19 10:31 AM, baris.kazar@oracle.com wrote:
> >>>> Erick,
> >>>>
> >>>> Cool, could You give a simple example with my example please?
> >>>>
> >>>> Best regards
> >>>>
> >>>>
> >>>>
> >>>> On 6/13/19 10:12 AM, Erick Erickson wrote:
> >>>>> Shot in the dark: stemming. Whenever I see a problem with something
> >>>>> ending in “s” (or “er” or “ing” or….) my first suspect is that
> >>>>> stemming is turned on. In that case the token in the index that’s
> >>>>> actually searched on is somewhat different than you expect.
> >>>>>
> >>>>> The test is easy, just ensure your fieldType contains no stemmers.
> >>>>> PorterStemmer is particularly aggressive, but for this case to test
> >>>>> I’d just remove all stemming, re-index and see if the results differ.
> >>>>>
> >>>>> Best,
> >>>>> Erick
> >>>>>
> >>>>>> On Jun 13, 2019, at 7:26 AM, baris.kazar@oracle.com wrote:
> >>>>>>
> >>>>>> Tomoko,-
> >>>>>>
> >>>>>> That is strange indeed.
> >>>>>>
> >>>>>> Something is wrong when I use mains, but maink, mainl, mainr, mainq,
> >>>>>> and maint all work OK: any consonant at the end except s works in
> >>>>>> this case.
> >>>>>>
> >>>>>> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
> >>>>>>
> >>>>>> i am using fuzzy query with ~ from Query.builder and that is not
> >>>>>> PhraseQuery.
> >>>>>>
> >>>>>> Similarly FuzzyQuery with input "mains" (it has to be lowercase
> >>>>>> since it does not go through StandardAnalyzer) is also not
> >>>>>> PhraseQuery.
> >>>>>>
> >>>>>> can there be a clearer sample case for ComplexPhraseQuery please in
> >>>>>> the docs?
> >>>>>>
> >>>>>> did You also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
> >>>>>> STATES" the expected output in this case?
> >>>>>>
> >>>>>> Thanks for spending time on this, i would like to thank everyone.
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>>> Ok, i think only this very specific only "mains" has an issue.
> >>>>>>> It looks strange to me. I did some tests locally.
> >>>>>>>
> >>>>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>>> UNITED STATES".
> >>>>>>>
> >>>>>>> 2a. This query string (just copied from your Case #3) worked
> >>>>>>> correctly
> >>>>>>> for me as far as I can see.
> >>>>>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
> >>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
> >>>>>>>
> >>>>>>> 2b. However this query string got no results.
> >>>>>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
> >>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
> >>>>>>> This is expected behaviour, because the classic query parser does not
> >>>>>>> support fuzzy queries inside a phrase query (as far as I know).
> >>>>>>>
> >>>>>>> I suspect you are using the fuzzy query operator (~) inside a phrase
> >>>>>>> query ("), as in the 2b case.
> >>>>>>>
> >>>>>>> FYI: there is a special parser for such complex phrase query.
> >>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_complexPhrase_ComplexPhraseQueryParser.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=ZcXpaSlwS5DegX76mHTb_6DH3P7noan1eeMXc-Vh5M8&s=FoIMlcjDO2b7Gut9XRx-NIBWiBQWItsj8IlylJC7Wkc&e=
> >>>>>>>
> >>>>>>>
> >>>>>>> Tomoko
> >>>>>>>
> >>>>>>> On Thu, Jun 13, 2019 at 6:16, <baris.kazar@oracle.com> wrote:
> >>>>>>>> Ok, i think only this very specific only "mains" has an issue.
> >>>>>>>>
> >>>>>>>> all i knew about Lucene was fine :) Great...
> >>>>>>>>
> >>>>>>>> i have one more question:
> >>>>>>>>
> >>>>>>>> which one is advised to use: FuzzyQuery or the Query.parser with
> >>>>>>>> search string~ appended?
> >>>>>>>>
> >>>>>>>> The second one will go through analyzer and make search string
> >>>>>>>> lowercase.
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
> >>>>>>>>
> >>>>>>>> Hi again,-
> >>>>>>>>
> >>>>>>>> this is really interesting and i hope i am missing something.
> >>>>>>>> The index lowercases all entries, so case sensitivity is not an
> >>>>>>>> issue, I think.
> >>>>>>>>
> >>>>>>>> Case #1:
> >>>>>>>>
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> >>>>>>>> phraseAnalyzer) ;
> >>>>>>>> Query q1 = null;
> >>>>>>>> try {
> >>>>>>>> q1 = parser.parse("Main");
> >>>>>>>> } catch (ParseException e) {
> >>>>>>>> e.printStackTrace();
> >>>>>>>> }
> >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> This brings with this:
> >>>>>>>>
> >>>>>>>> query plan:
> >>>>>>>>
> >>>>>>>> [+contentDFLT:main, +contentDFLT:"nashua",
> >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> >>>>>>>>
> >>>>>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after
> >>>>>>>> exec finished)
> >>>>>>>>
> >>>>>>>> Number of results: 12
> >>>>>>>> Name: Main Dunstable Rd
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12677400
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.72631, -71.50269
> >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>>>> UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12681980
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76416, -71.46681
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12681973
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75045, -71.4607
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12681974
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76019, -71.465
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main Dunstable Rd
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12677399
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.74641, -71.48943
> >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>>>> UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: S Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 11893215
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73412, -71.44797
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12681978
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73492, -71.44951
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: S Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 11893214
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73958, -71.45895
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12681979
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76416, -71.46681
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.204945
> >>>>>>>> ID: 12681977
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.747, -71.45957
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Case #2
> >>>>>>>>
> >>>>>>>> This also worked when I added ~ to the word Main to make it a
> >>>>>>>> fuzzy query:
> >>>>>>>>
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> >>>>>>>> phraseAnalyzer) ;
> >>>>>>>> Query q1 = null;
> >>>>>>>> try {
> >>>>>>>> q1 = parser.parse("Main~");
> >>>>>>>> } catch (ParseException e) {
> >>>>>>>> e.printStackTrace();
> >>>>>>>> }
> >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> query plan:
> >>>>>>>>
> >>>>>>>> [+contentDFLT:main~2, +contentDFLT:"nashua",
> >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> >>>>>>>>
> >>>>>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging
> >>>>>>>> stops)
> >>>>>>>> Number of results: 12
> >>>>>>>> Name: Main Dunstable Rd
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12677400
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.72631, -71.50269
> >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>>>> UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12681980
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76416, -71.46681
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12681973
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75045, -71.4607
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12681974
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76019, -71.465
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main Dunstable Rd
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12677399
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.74641, -71.48943
> >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>>>> UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: S Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 11893215
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73412, -71.44797
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12681978
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73492, -71.44951
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: S Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 11893214
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73958, -71.45895
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12681979
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76416, -71.46681
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 41.06405
> >>>>>>>> ID: 12681977
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.747, -71.45957
> >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Case #3
> >>>>>>>>
> >>>>>>>> But why does this not work in fuzzy mode when I misspell the word
> >>>>>>>> slightly (1 edit away)? As you saw, the data is there with the
> >>>>>>>> spelling Main:
> >>>>>>>>
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> >>>>>>>> phraseAnalyzer) ;
> >>>>>>>>
> >>>>>>>> Query q1 = null;
> >>>>>>>> try {
> >>>>>>>> q1 = parser.parse("Mains~"); // 1 edit away
> >>>>>>>> } catch (ParseException e) {
> >>>>>>>> e.printStackTrace();
> >>>>>>>> }
> >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> >>>>>>>>
> >>>>>>>> query plan:
> >>>>>>>>
> >>>>>>>> [+contentDFLT:mains~2, +contentDFLT:"nashua",
> >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> >>>>>>>>
> >>>>>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging
> >>>>>>>> stops)
> >>>>>>>>
> >>>>>>>> Number of results: 0
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Case #4
> >>>>>>>>
> >>>>>>>> Then I changed q1 from MUST to SHOULD above, and I think the fuzzy
> >>>>>>>> query is ignored here, since there is no MAIN in the first 468
> >>>>>>>> results:
> >>>>>>>>
> >>>>>>>> There is no boost for the Mains term here.
> >>>>>>>>
> >>>>>>>> query plan:
> >>>>>>>>
> >>>>>>>> [contentDFLT:mains~2, +contentDFLT:"nashua",
> >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> >>>>>>>>
> >>>>>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging
> >>>>>>>> stops)
> >>>>>>>> Number of results: 1794
> >>>>>>>> Name: Nashua Dr
> >>>>>>>> Score: 34.186226
> >>>>>>>> ID: 4974936
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.7636, -71.46063
> >>>>>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Nashua River Rail Trl
> >>>>>>>> Score: 34.186226
> >>>>>>>> ID: 4975508
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.7062, -71.53962
> >>>>>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>>>>> UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Nashua Rd
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 4975388
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.78746, -71.92823
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: NASHUA
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 21014865
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75873, -71.46438
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: NASHUA
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 21014865
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75873, -71.46438
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: NASHUA
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 21014865
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75873, -71.46438
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: NASHUA
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 21014865
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75873, -71.46438
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: NASHUA
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 21014865
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.75873, -71.46438
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Nashua St
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 4975671
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.88471, -70.81687
> >>>>>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>> Name: Nashua Rd
> >>>>>>>> Score: 33.84896
> >>>>>>>> ID: 4975400
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.79014, -71.92364
> >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Why is the fuzzy query ignored?
> >>>>>>>> Even if i have separate fields for street, city,region, country,
> >>>>>>>> this fuzzy query issue will come into place for words with
> >>>>>>>> multiple parts like main dunstable etc., right?
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
> >>>>>>>>
> >>>>>>>> Tomoko,-
> >>>>>>>>
> >>>>>>>> Thank You for Your suggestions. i am trying to understand it
> >>>>>>>> and i thought i did :)
> >>>>>>>>
> >>>>>>>> but it does not work with FuzzyQuery when i used with a *single*
> >>>>>>>> large TextField like street=...value... city=...value...
> >>>>>>>> region=...value... country=...value... (with or without quotes
> >>>>>>>> for the values)
> >>>>>>>>
> >>>>>>>> What I knew about Lucene fuzzy queries does not hold with
> >>>>>>>> this TextField form. That is why I suspected a bug.
> >>>>>>>>
> >>>>>>>> 1. Yes, I saw that and have solid proof of it now.
> >>>>>>>>
> >>>>>>>> 2. Yes, but FuzzyQuery takes the quotes as they are, since they are
> >>>>>>>> escaped and the term is not analyzed.
> >>>>>>>>
> >>>>>>>> Stuffing into one textfield vs having separate fields should only
> >>>>>>>> affect probably the performance but not the outcome in my case.
> >>>>>>>> But, i have been thinking about this and maybe it is the way to
> >>>>>>>> go in this case.
> >>>>>>>>
> >>>>>>>> My CONTENT field has street names in mixed case and city, region,
> >>>>>>>> and country names in UPPERCASE. Can this be a problem?
> >>>>>>>> I thought the index stored them in lowercase, since I am using
> >>>>>>>> StandardAnalyzer.
> >>>>>>>>
> >>>>>>>> CONTENT field also has full textfield string with street=...
> >>>>>>>> city=... region=... country=... (here all values are UPPERCASE).
> >>>>>>>>
> >>>>>>>> Why cant the index find the names via FuzzyQuery? i tried both
> >>>>>>>> FuzzyQuery and Query builder as i showed before.
> >>>>>>>>
> >>>>>>>> The last piece of advice in your previous email deserves to go
> >>>>>>>> outside the parentheses, since it might be very critical :) :) :)
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
> >>>>>>>>
> >>>>>>>> I'd suggest understanding how the software works before
> >>>>>>>> suspecting a bug in it :-)
> >>>>>>>>
> >>>>>>>> I guess you may miss two points:
> >>>>>>>>
> >>>>>>>> 1. The standard analyzer (standard tokenizer) breaks words at double
> >>>>>>>> quotes (U+0022), so quotes are not indexed or searched at all if
> >>>>>>>> you are using the standard analyzer. (That is the reason you get
> >>>>>>>> the same results with or without quotes.)
> >>>>>>>> See:
> >>>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_core_org_apache_lucene_analysis_standard_StandardTokenizer.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=8E2lp1YIGM-3v3FspeieGl8z8rEBs6qioTudtFNzh8c&e=
> >>>>>>>> and
> >>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__unicode.org_reports_tr29_&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=riCZ_f25XW869CKbHPUqfbLiDU-AukE6la0xTLMw6u8&e=
> >>>>>>>>
> >>>>>>>> 2. The double quote has a special meaning (it is interpreted as a
> >>>>>>>> phrase query) with the built-in query parser, so you need to escape
> >>>>>>>> it if you want to search for the double quote itself.
> >>>>>>>> See:
> >>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Terms&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=t8OYTgidvcwNpAVFuTsqGhDJK5BwUZVCxc0mPHzqCYU&e=
> >>>>>>>>
> >>>>>>>> (My advice would be to create a separate field for each key-value
> >>>>>>>> pair instead of stuffing all pairs into one text field, if you need
> >>>>>>>> to search them separately.)
> >>>>>>>>
> >>>>>>>> On Wed, Jun 12, 2019 at 2:39, <baris.kazar@oracle.com> wrote:
> >>>>>>>>
> >>>>>>>> I can say that quotes are not the issue with the index, as it still
> >>>>>>>> returns the same results with or without quotes.
> >>>>>>>>
> >>>>>>>> i am starting to feel that this might be a bug maybe??
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
> >>>>>>>>
> >>>>>>>> Somehow " is causing an issue as this should return street with
> >>>>>>>> MAIN:
> >>>>>>>>
> >>>>>>>> [.contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
> >>>>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
> >>>>>>>> states"] -> this was with fuzzyquery on MAINS
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
> >>>>>>>>
> >>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> >>>>>>>> +contentDFLT:"country united states", contentDFLT:street
> >>>>>>>> contentDFLT:mains]
> >>>>>>>>
> >>>>>>>> QueryParser chops it into two pieces for
> >>>>>>>> parser.parse("street=\"MAINS\"");
> >>>>>>>>
> >>>>>>>> The index has a TextField named contentDFLT with the following data:
> >>>>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
> >>>>>>>> HAMPSHIRE" country="UNITED STATES"
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> When i set street=\"MAINS~\" with parser:
> >>>>>>>> i get the following
> >>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> >>>>>>>> +contentDFLT:"country united states", contentDFLT:street
> >>>>>>>> contentDFLT:mains]
> >>>>>>>>
> >>>>>>>> probably " quotations are messing this up as You were saying...
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
> >>>>>>>>
> >>>>>>>> Or, " (double quotation) in your query string may affect query
> >>>>>>>> parsing.
> >>>>>>>>
> >>>>>>>> When I parse this string by classic query parser (lucene 8.1),
> >>>>>>>> street="MAINS~"
> >>>>>>>> parsed (raw) query is
> >>>>>>>> text:street text:mains
> >>>>>>>> (I set the default search field to "text", so text:xxxx appears
> >>>>>>>> here.)
> >>>>>>>>
> >>>>>>>> Query parsing is a complex process, so it is good to check the
> >>>>>>>> parsed raw query string, especially when you have (reserved) special
> >>>>>>>> characters in your query...
> >>>>>>>>
> >>>>>>>> On Tue, Jun 11, 2019 at 1:10, Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I noticed one small thing in your previous mail.
> >>>>>>>>
> >>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
> >>>>>>>>
> >>>>>>>> which is good.
> >>>>>>>>
> >>>>>>>> To specify a search field, ":" (colon) should be used instead of
> >>>>>>>> "=".
> >>>>>>>> See the query parser documentation:
> >>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Fields&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=u4SeJqH4lePhOazCLwxLEr3WqcMkODtYLv4njiKZ4PM&s=WrNfUXO9gz1PqpczTJw1vD9sWqvr76WRv2Aeo9uWqa4&e=
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I'm not sure this is related to your problem.
> >>>>>>>>
> >>>>>>>> On Tue, Jun 11, 2019 at 0:51, <baris.kazar@oracle.com> wrote:
> >>>>>>>>
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
> >>>>>>>>
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> >>>>>>>> phraseAnalyzer) ;
> >>>>>>>> Query q1 = null;
> >>>>>>>> try {
> >>>>>>>> q1 = parser.parse("MAIN");
> >>>>>>>> } catch (ParseException e) {
> >>>>>>>>
> >>>>>>>> e.printStackTrace();
> >>>>>>>> }
> >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
> >>>>>>>>
> >>>>>>>> testQuerySearch2 Time to compute: 0 seconds
> >>>>>>>> Number of results: 1775
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 37.20959
> >>>>>>>> ID: 12681979
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.76416, -71.46681
> >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 37.20959
> >>>>>>>> ID: 12681977
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.747, -71.45957
> >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> >>>>>>>>
> >>>>>>>> Name: Main St
> >>>>>>>> Score: 37.20959
> >>>>>>>> ID: 12681978
> >>>>>>>> Country Code: US
> >>>>>>>> Coordinates: 42.73492, -71.44951
> >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> >>>>>>>>
> >>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same
> >>>>>>>> results
> >>>>>>>> which is good.
> >>>>>>>>
> >>>>>>>> But when i switch to MAINS~ then fuzzy query does not work.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> i need to say something about q1 alone in the booleanquery:
> >>>>>>>> it tries to match MAIN against street, city, region and country,
> >>>>>>>> which are all in a single TextField.
> >>>>>>>> But i dont want this. that is why i need street="..." etc. when
> >>>>>>>> searching.
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> just for the basic verification, can you find the document without
> >>>>>>>> fuzzy query? I mean, does this query work for you?
> >>>>>>>>
> >>>>>>>> Query query = parser.parse("MAIN");
> >>>>>>>>
> >>>>>>>> Tomoko
> >>>>>>>>
> >>>>>>>> On Tue, Jun 11, 2019 at 0:22, <baris.kazar@oracle.com> wrote:
> >>>>>>>>
> >>>>>>>> why does the second set not work at all?
> >>>>>>>>
> >>>>>>>> it is indexed as Textfield like street="..." city="..." etc.
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
> >>>>>>>>
> >>>>>>>> i dont know how to use Fuzzyquery with queryparser but probably
> >>>>>>>> You
> >>>>>>>> are suggesting
> >>>>>>>>
> >>>>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
> >>>>>>>> Query query = parser.parse("MAINS~2");
> >>>>>>>>
> >>>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
> >>>>>>>>
> >>>>>>>> am i right?
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
> >>>>>>>>
> >>>>>>>> I would suggest using a QueryParser for your fuzzy query before
> >>>>>>>> adding it to the Boolean query. This should weed out any case
> >>>>>>>> issues.
> >>>>>>>>
> >>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com
> >>>>>>>> <mailto:baris.kazar@oracle.com>> wrote:
> >>>>>>>>
> >>>>>>>> BooleanQuery.Builder booleanQuery = new
> >>>>>>>> BooleanQuery.Builder();
> >>>>>>>>
> >>>>>>>> //First set
> >>>>>>>>
> >>>>>>>> booleanQuery.add(new FuzzyQuery(new
> >>>>>>>> org.apache.lucene.index.Term(field, "MAINS")),
> >>>>>>>> BooleanClause.Occur.SHOULD);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> >>>>>>>>
> >>>>>>>> // Second set
> >>>>>>>> //booleanQuery.add(new FuzzyQuery(new
> >>>>>>>> //    org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
> >>>>>>>> //    BooleanClause.Occur.SHOULD);
> >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> >>>>>>>> //    field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
> >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> >>>>>>>> //    field, "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
> >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> >>>>>>>> //    field, "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
> >>>>>>>>
> >>>>>>>> The first set also returns streets named Nashua (NASHUA).
> >>>>>>>>
> >>>>>>>> so, to prevent that and since i also indexed with
> >>>>>>>> street="..."
> >>>>>>>> city="..." i did the second set but it does not bring
> >>>>>>>> anything.
> >>>>>>>>
> >>>>>>>> createPhraseQuery builds a Phrasequery with one term
> >>>>>>>> equal to the
> >>>>>>>> string
> >>>>>>>> in the call.
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 6/10/19 10:47 AM, baris.kazar@oracle.com
> >>>>>>>> <mailto:baris.kazar@oracle.com> wrote:
> >>>>>>>> > How do i check how it is indexed? lowercase or uppercase?
> >>>>>>>> >
> >>>>>>>> > the only way now is by testing.
> >>>>>>>> >
> >>>>>>>> > i am using standardanalyzer.
> >>>>>>>> >
> >>>>>>>> > Best regards
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> > On 6/9/19 11:57 AM, Atri Sharma wrote:
> >>>>>>>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
> >>>>>>>> >> <tomoko.uchida.1111@gmail.com
> >>>>>>>> <mailto:tomoko.uchida.1111@gmail.com>> wrote:
> >>>>>>>> >>> Hi,
> >>>>>>>> >>>
> >>>>>>>> >>> What analyzer do you use for the text field? Is the
> >>>>>>>> term "Main"
> >>>>>>>> >>> correctly indexed?
> >>>>>>>> >> Agreed. Also, it would be good if you could post your
> >>>>>>>> actual
> >>>>>>>> code.
> >>>>>>>> >>
> >>>>>>>> >> What analyzer are you using? If you are using
> >>>>>>>> StandardAnalyzer,
> >>>>>>>> then
> >>>>>>>> >> all of your terms while indexing will be lowercased,
> >>>>>>>> AFAIK, but
> >>>>>>>> your
> >>>>>>>> >> query will not be analyzed until you run a
> >>>>>>>> QueryParser on it.
> >>>>>>>> >>
> >>>>>>>> >>
> >>>>>>>> >> Atri
> >>>>>>>> >>
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> ---------------------------------------------------------------------
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> > To unsubscribe, e-mail:
> >>>>>>>> java-user-unsubscribe@lucene.apache.org
> >>>>>>>> <mailto:java-user-unsubscribe@lucene.apache.org>
> >>>>>>>> > For additional commands, e-mail:
> >>>>>>>> java-user-help@lucene.apache.org
> >>>>>>>> <mailto:java-user-help@lucene.apache.org>
> >>>>>>>> >
>
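Two analyzer effects come up repeatedly in the quoted exchange above: indexed terms are lowercased, and punctuation such as = and " does not survive tokenization. The following is a minimal sketch of both effects in plain Java. It is not Lucene code and only approximates StandardAnalyzer (the real tokenizer follows the Unicode UAX #29 word-break rules); the class name AnalyzerSketch and the method analyze are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Rough stand-in for an analyzer that lowercases and splits on punctuation. */
class AnalyzerSketch {

    /** Splits on any run of non-letter, non-digit characters and lowercases each token. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Key/value text stuffed into one field: '=' and '"' vanish, so
        // "street" and "main" end up as ordinary neighbouring tokens.
        System.out.println(analyze("street=\"MAIN\" city=\"NASHUA\""));

        // A term that bypasses analysis (for example one passed straight to
        // a query object) keeps its case and misses the lowercased token.
        List<String> indexed = analyze("MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES");
        System.out.println(indexed.contains("MAIN"));
        System.out.println(indexed.contains("main"));
    }
}
```

Under this approximation, a field value like street="MAIN" contributes the tokens street and main, which is consistent with the parsed query text:street text:mains reported earlier in the thread.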


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org



Re: FuzzyQuery- why is it ignored? [ In reply to ]
i still cannot find the reason why a fuzzy search for MAINS finds nothing
in the Lucene index built with StandardAnalyzer.

MAINZ, MAINK, MAINT ... all work ok.

Any suggestions please?

Best regards
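
For reference, the "1 edit away" claim can be checked independently of Lucene. The sketch below is a plain Java Levenshtein distance, not Lucene's implementation: FuzzyQuery matches terms with an automaton and by default also counts a transposition as a single edit, but none of the inputs here involves a transposition. The class name EditDistanceSketch is hypothetical.

```java
/** Classic dynamic-programming Levenshtein distance, two rows at a time. */
class EditDistanceSketch {

    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Each misspelling is compared against the indexed term "main".
        for (String query : new String[] {"mains", "mainz", "maink", "maint"}) {
            System.out.println(query + " -> main: " + levenshtein(query, "main"));
        }
    }
}
```

All four inputs are exactly one edit from main, so edit distance alone does not separate mains from the variants that work.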


On 6/16/19 9:38 AM, Baris Kazar wrote:
> Tomoko,-
> Yes, i noticed that last night when i was researching it, and
> thanks for confirming. StandardAnalyzer does not do stemming.
> So, the MAINS case has some other reason.
> Best regards
>
> ----- Original Message -----
> From: tomoko.uchida.1111@gmail.com
> To: java-user@lucene.apache.org
> Sent: Sunday, June 16, 2019 4:39:29 AM GMT -05:00 US/Canada Eastern
> Subject: Re: FuzzyQuery- why is it ignored?
>
> Hi,
>
> you said you are using standard analyzer. If so, you are not using any
> stemmer at all (please see the analyzer's Javadocs).
>
> On Sun, Jun 16, 2019 at 11:43, Baris Kazar <baris.kazar@oracle.com> wrote:
>> Hello,-
>> Erick explained how to disable stemming in Solr but i am using Lucene purely.
>> i am also researching how to disable it in Lucene but if You have instructions how to do so already
>> i appreciate if You could share here.
>> Best regards
>>
>> ----- Original Message -----
>> From: baris.kazar@oracle.com
>> To: java-user@lucene.apache.org, tomoko.uchida.1111@gmail.com, erickerickson@gmail.com, atri@linux.com, baris.kazar@oracle.com, lucene@mikemccandless.com
>> Sent: Thursday, June 13, 2019 10:48:47 AM GMT -05:00 US/Canada Eastern
>> Subject: Re: FuzzyQuery- why is it ignored?
>>
>> i see, i am using an older version 6.6 and we should switch to Your 8.1
>> version or at least 7.X.
>>
>> Tomoko i think i understood You meant MAIN NASHUA .... for the string :)
>>
>> Again i really appreciate all answers.
>>
>> How do we disable or enable stemming while indexing? :) another question.
>>
>> Best regards
>>
>>
>> On 6/13/19 10:40 AM, Tomoko Uchida wrote:
>>> Sorry, I made a mistake when copypasting. Let me just correct my previous mail.
>>>
>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES".
>>> 1. Indexed this text: "MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW
>>> HAMPSHIRE UNITED STATES"
>>>
>>> ----
>>> As far as I can tell, this query correctly finds the indexed document
>>> (so I have no idea what is wrong with the fuzzy query).
>>> +contentDFLT:mains~2 +contentDFLT:"nashua"
>>> +contentDFLT:"new-hampshire" +contentDFLT:"united states"
>>>
>>> I am
>>> - using lucene 8.1.
>>> - using standard analyzer for both indexing and searching.
>>> - using classic query parser for parsing.
>>>
>>>
>>>
>>> On Thu, Jun 13, 2019 at 23:18, <baris.kazar@oracle.com> wrote:
>>>> However, the index does not have MAINS but MAIN for the expected entry.
>>>>
>>>> Best regards
>>>>
>>>>
>>>>
>>>> On 6/13/19 10:33 AM, baris.kazar@oracle.com wrote:
>>>>> does it treat it as a plural word? :) :) :)
>>>>> That makes sense.
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>> On 6/13/19 10:31 AM, baris.kazar@oracle.com wrote:
>>>>>> Erick,
>>>>>>
>>>>>> Cool, could You give a simple example with my example please?
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 6/13/19 10:12 AM, Erick Erickson wrote:
>>>>>>> Shot in the dark: stemming. Whenever I see a problem with something
>>>>>>> ending in “s” (or “er” or “ing” or….) my first suspect is that
>>>>>>> stemming is turned on. In that case the token in the index that’s
>>>>>>> actually searched on is somewhat different than you expect.
>>>>>>>
>>>>>>> The test is easy, just ensure your fieldType contains no stemmers.
>>>>>>> PorterStemmer is particularly aggressive, but for this case to test
>>>>>>> I’d just remove all stemming, re-index and see if the results differ.
>>>>>>>
>>>>>>> Best,
>>>>>>> Erick
>>>>>>>
>>>>>>>> On Jun 13, 2019, at 7:26 AM, baris.kazar@oracle.com wrote:
>>>>>>>>
>>>>>>>> Tomoko,-
>>>>>>>>
>>>>>>>> That is strange indeed.
>>>>>>>>
>>>>>>>> Something is wrong when i use mains, but maink, mainl, mainr, mainq,
>>>>>>>> and maint all work ok. Any consonant at the end except s works in this
>>>>>>>> case.
>>>>>>>>
>>>>>>>> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
>>>>>>>>
>>>>>>>> i am using fuzzy query with ~ from Query.builder and that is not
>>>>>>>> PhraseQuery.
>>>>>>>>
>>>>>>>> Similarly FuzzyQuery with input "mains" (it has to be lowercase
>>>>>>>> since it does not go through StandardAnalyzer) is also not
>>>>>>>> PhraseQuery.
>>>>>>>>
>>>>>>>> can there be a clearer sample case for ComplexPhraseQuery please in
>>>>>>>> the docs?
>>>>>>>>
>>>>>>>> did You also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
>>>>>>>> STATES" the expected output in this case?
>>>>>>>>
>>>>>>>> Thanks for spending time on this, i would like to thank everyone.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>>> Ok, i think only this very specific term "mains" has an issue.
>>>>>>>>> It looks strange to me. I did some test locally.
>>>>>>>>>
>>>>>>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>>> UNITED STATES".
>>>>>>>>>
>>>>>>>>> 2a. This query string (just copied from your Case #3) worked
>>>>>>>>> correctly
>>>>>>>>> for me as far as I can see.
>>>>>>>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
>>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
>>>>>>>>>
>>>>>>>>> 2b. However this query string got no results.
>>>>>>>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
>>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
>>>>>>>>> It is an expected behaviour because the classic query parser does not
>>>>>>>>> support fuzzy query inside phrase query (as far as I know).
>>>>>>>>>
>>>>>>>>> I suspect you use fuzzy query operator (~) inside phrase query
>>>>>>>>> ("), as
>>>>>>>>> the 2b case.
>>>>>>>>>
>>>>>>>>> FYI: there is a special parser for such complex phrase query.
>>>>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_complexPhrase_ComplexPhraseQueryParser.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=ZcXpaSlwS5DegX76mHTb_6DH3P7noan1eeMXc-Vh5M8&s=FoIMlcjDO2b7Gut9XRx-NIBWiBQWItsj8IlylJC7Wkc&e=
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Tomoko
>>>>>>>>>
>>>>>>>>> On Thu, Jun 13, 2019 at 6:16, <baris.kazar@oracle.com> wrote:
>>>>>>>>>> Ok, i think only this very specific term "mains" has an issue.
>>>>>>>>>>
>>>>>>>>>> all i knew about Lucene was fine :) Great...
>>>>>>>>>>
>>>>>>>>>> i have one more question:
>>>>>>>>>>
>>>>>>>>>> which one is advised to use: FuzzyQuery or the Query.parser with
>>>>>>>>>> search string~ appended?
>>>>>>>>>>
>>>>>>>>>> The second one will go through analyzer and make search string
>>>>>>>>>> lowercase.
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
>>>>>>>>>>
>>>>>>>>>> Hi again,-
>>>>>>>>>>
>>>>>>>>>> this is really interesting and i hope i am missing something.
>>>>>>>>>> The index lowercases all entries, so case sensitivity is not an issue
>>>>>>>>>> i think.
>>>>>>>>>>
>>>>>>>>>> Case #1:
>>>>>>>>>>
>>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>>>>>> phraseAnalyzer) ;
>>>>>>>>>> Query q1 = null;
>>>>>>>>>> try {
>>>>>>>>>> q1 = parser.parse("Main");
>>>>>>>>>> } catch (ParseException e) {
>>>>>>>>>> e.printStackTrace();
>>>>>>>>>> }
>>>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This returns the following:
>>>>>>>>>>
>>>>>>>>>> query plan:
>>>>>>>>>>
>>>>>>>>>> [+contentDFLT:main, +contentDFLT:"nashua",
>>>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>>>
>>>>>>>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after
>>>>>>>>>> exec finished)
>>>>>>>>>>
>>>>>>>>>> Number of results: 12
>>>>>>>>>> Name: Main Dunstable Rd
>>>>>>>>>> Score: 41.204945
>>>>>>>>>> ID: 12677400
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.72631, -71.50269
>>>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>>>> UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 41.204945
>>>>>>>>>> ID: 12681980
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 41.204945
>>>>>>>>>> ID: 12681973
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.75045, -71.4607
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 41.204945
>>>>>>>>>> ID: 12681974
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.76019, -71.465
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main Dunstable Rd
>>>>>>>>>> Score: 41.204945
>>>>>>>>>> ID: 12677399
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.74641, -71.48943
>>>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>>>> UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: S Main St
>>>>>>>>>> Score: 41.204945
>>>>>>>>>> ID: 11893215
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.73412, -71.44797
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 41.204945
>>>>>>>>>> ID: 12681978
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: S Main St
>>>>>>>>>> Score: 41.204945
>>>>>>>>>> ID: 11893214
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.73958, -71.45895
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 41.204945
>>>>>>>>>> ID: 12681979
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 41.204945
>>>>>>>>>> ID: 12681977
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Case #2
>>>>>>>>>>
>>>>>>>>>> It also worked when i appended ~ to the word Main to make it a
>>>>>>>>>> fuzzy query:
>>>>>>>>>>
>>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>>>>>> phraseAnalyzer) ;
>>>>>>>>>> Query q1 = null;
>>>>>>>>>> try {
>>>>>>>>>> q1 = parser.parse("Main~");
>>>>>>>>>> } catch (ParseException e) {
>>>>>>>>>> e.printStackTrace();
>>>>>>>>>> }
>>>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> query plan:
>>>>>>>>>>
>>>>>>>>>> [+contentDFLT:main~2, +contentDFLT:"nashua",
>>>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>>>
>>>>>>>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging
>>>>>>>>>> stops)
>>>>>>>>>> Number of results: 12
>>>>>>>>>> Name: Main Dunstable Rd
>>>>>>>>>> Score: 41.06405
>>>>>>>>>> ID: 12677400
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.72631, -71.50269
>>>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>>>> UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 41.06405
>>>>>>>>>> ID: 12681980
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 41.06405
>>>>>>>>>> ID: 12681973
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.75045, -71.4607
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 41.06405
>>>>>>>>>> ID: 12681974
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.76019, -71.465
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main Dunstable Rd
>>>>>>>>>> Score: 41.06405
>>>>>>>>>> ID: 12677399
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.74641, -71.48943
>>>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>>>> UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: S Main St
>>>>>>>>>> Score: 41.06405
>>>>>>>>>> ID: 11893215
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.73412, -71.44797
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 41.06405
>>>>>>>>>> ID: 12681978
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: S Main St
>>>>>>>>>> Score: 41.06405
>>>>>>>>>> ID: 11893214
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.73958, -71.45895
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 41.06405
>>>>>>>>>> ID: 12681979
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 41.06405
>>>>>>>>>> ID: 12681977
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Case #3
>>>>>>>>>>
>>>>>>>>>> But why does this not work in fuzzy mode? i misspelled the word slightly
>>>>>>>>>> (1 edit away) and, as You saw, the data is there with the Main spelling:
>>>>>>>>>>
>>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>>>>>> phraseAnalyzer) ;
>>>>>>>>>>
>>>>>>>>>> Query q1 = null;
>>>>>>>>>> try {
>>>>>>>>>> q1 = parser.parse("Mains~"); // 1 edit away
>>>>>>>>>> } catch (ParseException e) {
>>>>>>>>>> e.printStackTrace();
>>>>>>>>>> }
>>>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>>>
>>>>>>>>>> query plan:
>>>>>>>>>>
>>>>>>>>>> [+contentDFLT:mains~2, +contentDFLT:"nashua",
>>>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>>>
>>>>>>>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging
>>>>>>>>>> stops)
>>>>>>>>>>
>>>>>>>>>> Number of results: 0
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Case #4
>>>>>>>>>>
>>>>>>>>>> Then i changed q1 to SHOULD from MUST above, and i think the fuzzy
>>>>>>>>>> query is ignored here since there is no MAIN in the first 468
>>>>>>>>>> results:
>>>>>>>>>>
>>>>>>>>>> there is no boost for the Mains term here.
>>>>>>>>>>
>>>>>>>>>> query plan:
>>>>>>>>>>
>>>>>>>>>> [contentDFLT:mains~2, +contentDFLT:"nashua",
>>>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>>>
>>>>>>>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging
>>>>>>>>>> stops)
>>>>>>>>>> Number of results: 1794
>>>>>>>>>> Name: Nashua Dr
>>>>>>>>>> Score: 34.186226
>>>>>>>>>> ID: 4974936
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.7636, -71.46063
>>>>>>>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Nashua River Rail Trl
>>>>>>>>>> Score: 34.186226
>>>>>>>>>> ID: 4975508
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.7062, -71.53962
>>>>>>>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>>>> UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Nashua Rd
>>>>>>>>>> Score: 33.84896
>>>>>>>>>> ID: 4975388
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.78746, -71.92823
>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: NASHUA
>>>>>>>>>> Score: 33.84896
>>>>>>>>>> ID: 21014865
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: NASHUA
>>>>>>>>>> Score: 33.84896
>>>>>>>>>> ID: 21014865
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: NASHUA
>>>>>>>>>> Score: 33.84896
>>>>>>>>>> ID: 21014865
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: NASHUA
>>>>>>>>>> Score: 33.84896
>>>>>>>>>> ID: 21014865
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: NASHUA
>>>>>>>>>> Score: 33.84896
>>>>>>>>>> ID: 21014865
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Nashua St
>>>>>>>>>> Score: 33.84896
>>>>>>>>>> ID: 4975671
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.88471, -70.81687
>>>>>>>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>> Name: Nashua Rd
>>>>>>>>>> Score: 33.84896
>>>>>>>>>> ID: 4975400
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.79014, -71.92364
>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Why is the fuzzy query ignored?
>>>>>>>>>> Even if i have separate fields for street, city,region, country,
>>>>>>>>>> this fuzzy query issue will come into place for words with
>>>>>>>>>> multiple parts like main dunstable etc., right?
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
>>>>>>>>>>
>>>>>>>>>> Tomoko,-
>>>>>>>>>>
>>>>>>>>>> Thank You for Your suggestions. i am trying to understand it
>>>>>>>>>> and i thought i did :)
>>>>>>>>>>
>>>>>>>>>> but it does not work with FuzzyQuery when i used with a *single*
>>>>>>>>>> large TextField like street=...value... city=...value...
>>>>>>>>>> region=...value... country=...value... (with or without quotes
>>>>>>>>>> for the values)
>>>>>>>>>>
>>>>>>>>>> What i knew about Lucene fuzzy queries is not holding now with
>>>>>>>>>> this TextField form. That is why i suspected a bug.
>>>>>>>>>>
>>>>>>>>>> 1. Yes, i saw and have a solid proof on that now.
>>>>>>>>>>
>>>>>>>>>> 2. yes, but FuzzyQuery takes the quotes as they are (they are
>>>>>>>>>> escaped) and its term is not analyzed.
>>>>>>>>>>
>>>>>>>>>> Stuffing into one textfield vs having separate fields should only
>>>>>>>>>> affect probably the performance but not the outcome in my case.
>>>>>>>>>> But, i have been thinking about this and maybe it is the way to
>>>>>>>>>> go in this case.
>>>>>>>>>>
>>>>>>>>>> My CONTENT field has street names in mixed case and city, region,
>>>>>>>>>> and country names in UPPERCASE. Can this be a problem?
>>>>>>>>>> i thought index stored them in lowercase since i am using
>>>>>>>>>> StandardAnalyzer.
>>>>>>>>>>
>>>>>>>>>> CONTENT field also has full textfield string with street=...
>>>>>>>>>> city=... region=... country=... (here all values are UPPERCASE).
>>>>>>>>>>
>>>>>>>>>> Why cant the index find the names via FuzzyQuery? i tried both
>>>>>>>>>> FuzzyQuery and Query builder as i showed before.
>>>>>>>>>>
>>>>>>>>>> The last advice in Your previous email would nicely go outside
>>>>>>>>>> the parentheses since it might be very critical :) :) :)
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>>>>>>>>>>
>>>>>>>>>> I'd suggest understanding how the software works before
>>>>>>>>>> suspecting a bug in it :-)
>>>>>>>>>>
>>>>>>>>>> I guess you may miss two points:
>>>>>>>>>>
>>>>>>>>>> 1. the standard analyzer (standard tokenizer) breaks words by double
>>>>>>>>>> quote (U+0022) so quotes are not indexed or searched at all if
>>>>>>>>>> you are
>>>>>>>>>> using standard analyzer. (That is the reason you have same results
>>>>>>>>>> with or without quotes.)
>>>>>>>>>> See:
>>>>>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_core_org_apache_lucene_analysis_standard_StandardTokenizer.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=8E2lp1YIGM-3v3FspeieGl8z8rEBs6qioTudtFNzh8c&e=
>>>>>>>>>> and
>>>>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__unicode.org_reports_tr29_&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=riCZ_f25XW869CKbHPUqfbLiDU-AukE6la0xTLMw6u8&e=
>>>>>>>>>>
>>>>>>>>>> 2. double quote has special meaning (it's interpreted as phrase
>>>>>>>>>> query)
>>>>>>>>>> with the built-in query parser so you need to escape it if you
>>>>>>>>>> want to
>>>>>>>>>> search double quotes itself.
>>>>>>>>>> See:
>>>>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Terms&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=t8OYTgidvcwNpAVFuTsqGhDJK5BwUZVCxc0mPHzqCYU&e=
>>>>>>>>>>
>>>>>>>>>> (My advice would be to create separate fields for each key value
>>>>>>>>>> pairs
>>>>>>>>>> instead of stuffing all pairs into one text field, if you need to
>>>>>>>>>> search them separately.)
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 12, 2019 at 2:39, <baris.kazar@oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>> i can say that quotes is not the issue with index as it still
>>>>>>>>>> results in
>>>>>>>>>> same results with quotes or without quotes.
>>>>>>>>>>
>>>>>>>>>> i am starting to feel that this might be a bug maybe??
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
>>>>>>>>>>
>>>>>>>>>> Somehow " is causing an issue as this should return street with
>>>>>>>>>> MAIN:
>>>>>>>>>>
>>>>>>>>>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
>>>>>>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
>>>>>>>>>> states"] -> this was with fuzzyquery on MAINS
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
>>>>>>>>>>
>>>>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>>>>>> contentDFLT:mains]
>>>>>>>>>>
>>>>>>>>>> QueryParser chops it into two pieces from
>>>>>>>>>> parser.parse("street=\"MAINS\"");
>>>>>>>>>>
>>>>>>>>>> Index has a TextField named contentDFLT the following data :
>>>>>>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
>>>>>>>>>> HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> When i set street=\"MAINS~\" with parser:
>>>>>>>>>> i get the following
>>>>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>>>>>> contentDFLT:mains]
>>>>>>>>>>
>>>>>>>>>> probably " quotations are messing this up as You were saying...
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>>>>>>>>>>
>>>>>>>>>> Or, " (double quotation) in your query string may affect query
>>>>>>>>>> parsing.
>>>>>>>>>>
>>>>>>>>>> When I parse this string by classic query parser (lucene 8.1),
>>>>>>>>>> street="MAINS~"
>>>>>>>>>> parsed (raw) query is
>>>>>>>>>> text:street text:mains
>>>>>>>>>> (I set the default search field to "text", so text:xxxx appears
>>>>>>>>>> here.)
>>>>>>>>>>
>>>>>>>>>> Query parsing is a complex process, so it would be good to check
>>>>>>>>>> parsed raw query string especially when you have (reserved) special
>>>>>>>>>> characters in your query...
>>>>>>>>>>
>>>>>>>>>> Tue, Jun 11, 2019 at 1:10 Tomoko Uchida <tomoko.uchida.1111@gmail.com>:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I noticed one small thing in your previous mail.
>>>>>>>>>>
>>>>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>>>>>>>>>>
>>>>>>>>>> which is good.
>>>>>>>>>>
>>>>>>>>>> To specify a search field, ":" (colon) should be used instead of
>>>>>>>>>> "=".
>>>>>>>>>> See the query parser documentation:
>>>>>>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fields
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I'm not sure this is related to your problem.
>>>>>>>>>>
>>>>>>>>>> Tue, Jun 11, 2019 at 0:51 <baris.kazar@oracle.com>:
>>>>>>>>>>
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>>>>>>>
>>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>>>>>> phraseAnalyzer) ;
>>>>>>>>>> Query q1 = null;
>>>>>>>>>> try {
>>>>>>>>>> q1 = parser.parse("MAIN");
>>>>>>>>>> } catch (ParseException e) {
>>>>>>>>>>
>>>>>>>>>> e.printStackTrace();
>>>>>>>>>> }
>>>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>>>>>>>>>
>>>>>>>>>> testQuerySearch2 Time to compute: 0 seconds
>>>>>>>>>> Number of results: 1775
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 37.20959
>>>>>>>>>> ID: 12681979
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 37.20959
>>>>>>>>>> ID: 12681977
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>>
>>>>>>>>>> Name: Main St
>>>>>>>>>> Score: 37.20959
>>>>>>>>>> ID: 12681978
>>>>>>>>>> Country Code: US
>>>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>>
>>>>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same
>>>>>>>>>> results
>>>>>>>>>> which is good.
>>>>>>>>>>
>>>>>>>>>> But when i switch to MAINS~ then fuzzy query does not work.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> i need to say something with the q1 only in the booleanquery:
>>>>>>>>>> it tries to match the MAIN in street, city, region and country
>>>>>>>>>> which are
>>>>>>>>>> in a single TextField field.
>>>>>>>>>> But i dont want this. that is why i need to street="..." etc when
>>>>>>>>>> searching.
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> just for the basic verification, can you find the document without
>>>>>>>>>> fuzzy query? I mean, does this query work for you?
>>>>>>>>>>
>>>>>>>>>> Query query = parser.parse("MAIN");
>>>>>>>>>>
>>>>>>>>>> Tomoko
>>>>>>>>>>
>>>>>>>>>> Tue, Jun 11, 2019 at 0:22 <baris.kazar@oracle.com>:
>>>>>>>>>>
>>>>>>>>>> why does the second set not work at all?
>>>>>>>>>>
>>>>>>>>>> it is indexed as Textfield like street="..." city="..." etc.
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>>>>>>>>>
>>>>>>>>>> i dont know how to use Fuzzyquery with queryparser but probably
>>>>>>>>>> You
>>>>>>>>>> are suggesting
>>>>>>>>>>
>>>>>>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
>>>>>>>>>> Query query = parser.parse("MAINS~2");
>>>>>>>>>>
>>>>>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>>>>>>>>>
>>>>>>>>>> am i right?
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>>>>>>>>
>>>>>>>>>> I would suggest using a QueryParser for your fuzzy query before
>>>>>>>>>> adding it to the Boolean query. This should weed out any case
>>>>>>>>>> issues.
>>>>>>>>>>
>>>>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>> BooleanQuery.Builder booleanQuery = new
>>>>>>>>>> BooleanQuery.Builder();
>>>>>>>>>>
>>>>>>>>>> //First set
>>>>>>>>>>
>>>>>>>>>> booleanQuery.add(new FuzzyQuery(new
>>>>>>>>>> org.apache.lucene.index.Term(field, "MAINS")),
>>>>>>>>>> BooleanClause.Occur.SHOULD);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>>>
>>>>>>>>>> // Second set
>>>>>>>>>> //booleanQuery.add(new FuzzyQuery(new
>>>>>>>>>> org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
>>>>>>>>>> BooleanClause.Occur.SHOULD);
>>>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>> field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>> field, "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>> field, "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>>>>>>>
>>>>>>>>>> The first set also brings streets with the name Nashua (NASHUA).
>>>>>>>>>>
>>>>>>>>>> so, to prevent that, and since i also indexed with street="..."
>>>>>>>>>> city="...", i did the second set but it does not bring anything.
>>>>>>>>>>
>>>>>>>>>> createPhraseQuery builds a PhraseQuery with one term equal to the
>>>>>>>>>> string in the call.
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/10/19 10:47 AM, baris.kazar@oracle.com wrote:
>>>>>>>>>> > How do i check how it is indexed? lowercase or uppercase?
>>>>>>>>>> >
>>>>>>>>>> > the only way now is by testing.
>>>>>>>>>> >
>>>>>>>>>> > i am using standardanalyzer.
>>>>>>>>>> >
>>>>>>>>>> > Best regards
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>> > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>>>>>>>>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>>>>>>>>> >> <tomoko.uchida.1111@gmail.com> wrote:
>>>>>>>>>> >>> Hi,
>>>>>>>>>> >>>
>>>>>>>>>> >>> What analyzer do you use for the text field? Is the term "Main"
>>>>>>>>>> >>> correctly indexed?
>>>>>>>>>> >> Agreed. Also, it would be good if you could post your actual code.
>>>>>>>>>> >>
>>>>>>>>>> >> What analyzer are you using? If you are using StandardAnalyzer, then
>>>>>>>>>> >> all of your terms while indexing will be lowercased, AFAIK, but your
>>>>>>>>>> >> query will not be analyzed until you run a QueryParser on it.
>>>>>>>>>> >>
>>>>>>>>>> >>
>>>>>>>>>> >> Atri
>>>>>>>>>> >>
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>> >



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: FuzzyQuery- why is it ignored? [ In reply to ]
Tomoko,-
may i ask if You could try with these few more data indexed too?

"KEHOE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
"CHESTNUT NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
"JEFFERSON NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
"NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
"NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
"NEW HAMPSHIRE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"

If You could index these entries and still find Main from MAINS query with Lucene 8.1,
that means this is a bug in Lucene 6.6.
Best regards
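
P.S. As a sanity check on the "1 edit away" claim, here is a minimal sketch that computes the plain Levenshtein distance between the query terms tried in this thread and the indexed term "main". It only illustrates the distance arithmetic: it is not Lucene's FuzzyQuery implementation, and the class and method names (EditDistanceSketch, distance) are made up for this example. If the numbers hold, "mains" sits well inside the ~2 limit shown in the query plan, so edit distance alone would not explain the missing hits.

```java
public class EditDistanceSketch {

    // Plain Levenshtein distance (insert, delete, substitute),
    // computed with two rolling rows of the dynamic-programming table.
    public static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Swap rows so prev always holds the last completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexed = "main"; // lowercased form of the indexed street name
        for (String q : new String[] {"main", "mains", "maink", "maint"}) {
            System.out.println(q + " -> " + distance(q, indexed));
        }
    }
}
```

Comparing against the lowercased term matters here, since the terms in the index were lowercased by the analyzer.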

----- Original Message -----
From: baris.kazar@oracle.com
To: java-user@lucene.apache.org, tomoko.uchida.1111@gmail.com, erickerickson@gmail.com, atri@linux.com, baris.kazar@oracle.com, lucene@mikemccandless.com
Sent: Thursday, June 13, 2019 10:49:05 AM GMT -05:00 US/Canada Eastern
Subject: Re: FuzzyQuery- why is it ignored?

i see, i am using an older version 6.6 and we should switch to Your 8.1
version or at least 7.X.

Tomoko i think i understood You meant MAIN NASHUA .... for the string :)

Again i really appreciate all answers.

How do we disable or enable stemming while indexing? :) another question.

Best regards


On 6/13/19 10:40 AM, Tomoko Uchida wrote:
> Sorry, I made a mistake when copypasting. Let me just correct my previous mail.
>
>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES".
> 1. Indexed this text: "MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW
> HAMPSHIRE UNITED STATES"
>
> ----
> As far as I can say, this query correctly finds the indexed document
> (so I have no idea about what is wrong with fuzzy query).
> +contentDFLT:mains~2 +contentDFLT:"nashua"
> +contentDFLT:"new-hampshire" +contentDFLT:"united states"
>
> I am
> - using lucene 8.1.
> - using standard analyzer for both of indexing and searching.
> - using classic query parser for parsing.
>
>
>
> Thu, Jun 13, 2019 at 23:18 <baris.kazar@oracle.com>:
>> However, the index does not have MAINS but MAIN for the expected entry.
>>
>> Best regards
>>
>>
>>
>> On 6/13/19 10:33 AM, baris.kazar@oracle.com wrote:
>>> does it consider it as like plural word? :) :) :)
>>> That makes sense.
>>>
>>> Best regards
>>>
>>>
>>> On 6/13/19 10:31 AM, baris.kazar@oracle.com wrote:
>>>> Erick,
>>>>
>>>> Cool, could You give a simple example with my example please?
>>>>
>>>> Best regards
>>>>
>>>>
>>>>
>>>> On 6/13/19 10:12 AM, Erick Erickson wrote:
>>>>> Shot in the dark: stemming. Whenever I see a problem with something
>>>>> ending in “s” (or “er” or “ing” or….) my first suspect is that
>>>>> stemming is turned on. In that case the token in the index that’s
>>>>> actually searched on is somewhat different than you expect.
>>>>>
>>>>> The test is easy, just ensure your fieldType contains no stemmers.
>>>>> PorterStemmer is particularly aggressive, but for this case to test
>>>>> I’d just remove all stemming, re-index and see if the results differ.
>>>>>
>>>>> Best,
>>>>> Erick
>>>>>
>>>>>> On Jun 13, 2019, at 7:26 AM, baris.kazar@oracle.com wrote:
>>>>>>
>>>>>> Tomoko,-
>>>>>>
>>>>>> That is strange indeed.
>>>>>>
>>>>>> Something is wrong when i use mains, but maink, mainl, mainr, mainq,
>>>>>> maint all work ok: any consonant at the end except s works in this
>>>>>> case.
>>>>>>
>>>>>> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
>>>>>>
>>>>>> i am using fuzzy query with ~ from Query.builder and that is not
>>>>>> PhraseQuery.
>>>>>>
>>>>>> Similarly FuzzyQuery with input "mains" (it has to be lowercase
>>>>>> since it does not go through StandardAnalyzer) is also not
>>>>>> PhraseQuery.
>>>>>>
>>>>>> can there be a clearer sample case for ComplexPhraseQuery please in
>>>>>> the docs?
>>>>>>
>>>>>> did You also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
>>>>>> STATES" the expected output in this case?
>>>>>>
>>>>>> Thanks for spending time on this, i would like to thank everyone.
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>> Ok, i think only this very specific only "mains" has an issue.
>>>>>>> It looks strange to me. I did some test locally.
>>>>>>>
>>>>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>> UNITED STATES".
>>>>>>>
>>>>>>> 2a. This query string (just copied from your Case #3) worked
>>>>>>> correctly
>>>>>>> for me as far as I can see.
>>>>>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
>>>>>>>
>>>>>>> 2b. However this query string got no results.
>>>>>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
>>>>>>> It is an expected behaviour because the classic query parser does not
>>>>>>> support fuzzy query inside phrase query (as far as I know).
>>>>>>>
>>>>>>> I suspect you use fuzzy query operator (~) inside phrase query
>>>>>>> ("), as
>>>>>>> the 2b case.
>>>>>>>
>>>>>>> FYI: there is a special parser for such complex phrase query.
>>>>>>> https://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html
>>>>>>>
>>>>>>>
>>>>>>> Tomoko
>>>>>>>
>>>>>>> Thu, Jun 13, 2019 at 6:16 <baris.kazar@oracle.com>:
>>>>>>>> Ok, i think only this very specific only "mains" has an issue.
>>>>>>>>
>>>>>>>> all i knew about Lucene was fine :) Great...
>>>>>>>>
>>>>>>>> i have one more question:
>>>>>>>>
>>>>>>>> which one is advised to use: FuzzyQuery or the Query.parser with
>>>>>>>> search string~ appended?
>>>>>>>>
>>>>>>>> The second one will go through analyzer and make search string
>>>>>>>> lowercase.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
>>>>>>>>
>>>>>>>> Hi again,-
>>>>>>>>
>>>>>>>> this is really interesting and i hope i am missing something.
>>>>>>>> The index lowercases all entries, so case sensitivity is not an issue,
>>>>>>>> i think.
>>>>>>>>
>>>>>>>> Case #1:
>>>>>>>>
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>>>> phraseAnalyzer) ;
>>>>>>>> Query q1 = null;
>>>>>>>> try {
>>>>>>>> q1 = parser.parse("Main");
>>>>>>>> } catch (ParseException e) {
>>>>>>>> e.printStackTrace();
>>>>>>>> }
>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>>
>>>>>>>> This brings with this:
>>>>>>>>
>>>>>>>> query plan:
>>>>>>>>
>>>>>>>> [+contentDFLT:main, +contentDFLT:"nashua",
>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>
>>>>>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after
>>>>>>>> exec finished)
>>>>>>>>
>>>>>>>> Number of results: 12
>>>>>>>> Name: Main Dunstable Rd
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12677400
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.72631, -71.50269
>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>> UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12681980
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12681973
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75045, -71.4607
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12681974
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76019, -71.465
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main Dunstable Rd
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12677399
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.74641, -71.48943
>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>> UNITED STATES
>>>>>>>>
>>>>>>>> Name: S Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 11893215
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73412, -71.44797
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12681978
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: S Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 11893214
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73958, -71.45895
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12681979
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.204945
>>>>>>>> ID: 12681977
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Case #2
>>>>>>>>
>>>>>>>> When i did this it also worked by adding ~ to make it Fuzzy query
>>>>>>>> to Main word:
>>>>>>>>
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>>>> phraseAnalyzer) ;
>>>>>>>> Query q1 = null;
>>>>>>>> try {
>>>>>>>> q1 = parser.parse("Main~");
>>>>>>>> } catch (ParseException e) {
>>>>>>>> e.printStackTrace();
>>>>>>>> }
>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>>
>>>>>>>> query plan:
>>>>>>>>
>>>>>>>> [+contentDFLT:main~2, +contentDFLT:"nashua",
>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>
>>>>>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging
>>>>>>>> stops)
>>>>>>>> Number of results: 12
>>>>>>>> Name: Main Dunstable Rd
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12677400
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.72631, -71.50269
>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>> UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12681980
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12681973
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75045, -71.4607
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12681974
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76019, -71.465
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main Dunstable Rd
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12677399
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.74641, -71.48943
>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>> UNITED STATES
>>>>>>>>
>>>>>>>> Name: S Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 11893215
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73412, -71.44797
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12681978
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: S Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 11893214
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73958, -71.45895
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12681979
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 41.06405
>>>>>>>> ID: 12681977
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Case #3
>>>>>>>>
>>>>>>>> But why does this not work with fuzzy mode and i misspelled a bit
>>>>>>>> (1 edit away) and as You saw the data is there with Main spelling:
>>>>>>>>
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>>>> phraseAnalyzer) ;
>>>>>>>>
>>>>>>>> Query q1 = null;
>>>>>>>> try {
>>>>>>>> q1 = parser.parse("Mains~"); // 1 edit away
>>>>>>>> } catch (ParseException e) {
>>>>>>>> e.printStackTrace();
>>>>>>>> }
>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>> query plan:
>>>>>>>>
>>>>>>>> [+contentDFLT:mains~2, +contentDFLT:"nashua",
>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>
>>>>>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging
>>>>>>>> stops)
>>>>>>>>
>>>>>>>> Number of results: 0
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Case #4
>>>>>>>>
>>>>>>>> Then i changed q1 to SHOULD from MUST above: and i think fuzzy
>>>>>>>> query is ignored here since there is no MAIN in the first 468
>>>>>>>> results:
>>>>>>>>
>>>>>>>> there is no boost for Mains term here.
>>>>>>>>
>>>>>>>> query plan:
>>>>>>>>
>>>>>>>> [contentDFLT:mains~2, +contentDFLT:"nashua",
>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>
>>>>>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging
>>>>>>>> stops)
>>>>>>>> Number of results: 1794
>>>>>>>> Name: Nashua Dr
>>>>>>>> Score: 34.186226
>>>>>>>> ID: 4974936
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.7636, -71.46063
>>>>>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Nashua River Rail Trl
>>>>>>>> Score: 34.186226
>>>>>>>> ID: 4975508
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.7062, -71.53962
>>>>>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>> UNITED STATES
>>>>>>>>
>>>>>>>> Name: Nashua Rd
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 4975388
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.78746, -71.92823
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: NASHUA
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 21014865
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: NASHUA
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 21014865
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: NASHUA
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 21014865
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: NASHUA
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 21014865
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: NASHUA
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 21014865
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Nashua St
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 4975671
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.88471, -70.81687
>>>>>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>> Name: Nashua Rd
>>>>>>>> Score: 33.84896
>>>>>>>> ID: 4975400
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.79014, -71.92364
>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>
>>>>>>>>
>>>>>>>> Why is the fuzzy query ignored?
>>>>>>>> Even if i have separate fields for street, city,region, country,
>>>>>>>> this fuzzy query issue will come into place for words with
>>>>>>>> multiple parts like main dunstable etc., right?
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
>>>>>>>>
>>>>>>>> Tomoko,-
>>>>>>>>
>>>>>>>> Thank You for Your suggestions. i am trying to understand it
>>>>>>>> and i thought i did :)
>>>>>>>>
>>>>>>>> but it does not work with FuzzyQuery when i used with a *single*
>>>>>>>> large TextField like street=...value... city=...value...
>>>>>>>> region=...value... country=...value... (with or without quotes
>>>>>>>> for the values)
>>>>>>>>
>>>>>>>> What i knew about Lucene fuzzy queries is not holding now with
>>>>>>>> this Textfield form. That is why i suspected a bug.
>>>>>>>>
>>>>>>>> 1. Yes, i saw and have a solid proof on that now.
>>>>>>>>
>>>>>>>> 2. yes, but FuzzyQuery takes the quotes as they are, since they are
>>>>>>>> escaped and the term is not analyzed.
>>>>>>>>
>>>>>>>> Stuffing into one textfield vs having separate fields should only
>>>>>>>> affect probably the performance but not the outcome in my case.
>>>>>>>> But, i have been thinking about this and maybe it is the way to
>>>>>>>> go in this case.
>>>>>>>>
>>>>>>>> My CONTENT field has street names in mixed case and city, region,
>>>>>>>> country names in UPPERCASE. Can this be a problem?
>>>>>>>> i thought index stored them in lowercase since i am using
>>>>>>>> StandardAnalyzer.
>>>>>>>>
>>>>>>>> CONTENT field also has full textfield string with street=...
>>>>>>>> city=... region=... country=... (here all values are UPPERCASE).
>>>>>>>>
>>>>>>>> Why cant the index find the names via FuzzyQuery? i tried both
>>>>>>>> FuzzyQuery and Query builder as i showed before.
>>>>>>>>
>>>>>>>> The last advice in Your previous email would nicely go outside
>>>>>>>> the parentheses since it might be very critical :) :) :)
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>>>>>>>>
>>>>>>>> I'd suggest to correctly understand the way a software works before
>>>>>>>> suspecting its bug :-)
>>>>>>>>
>>>>>>>> I guess you may miss two points:
>>>>>>>>
>>>>>>>> 1. the standard analyzer (standard tokenizer) breaks words at double
>>>>>>>> quotes (U+0022), so quotes are not indexed or searched at all if you
>>>>>>>> are using the standard analyzer. (That is the reason you get the same
>>>>>>>> results with or without quotes.)
>>>>>>>> See:
>>>>>>>> https://lucene.apache.org/core/8_1_0/core/org/apache/lucene/analysis/standard/StandardTokenizer.html
>>>>>>>> and
>>>>>>>> http://unicode.org/reports/tr29/
>>>>>>>>
>>>>>>>> 2. double quote has a special meaning (it's interpreted as a phrase
>>>>>>>> query) with the built-in query parser, so you need to escape it if
>>>>>>>> you want to search for the double quotes themselves.
>>>>>>>> See:
>>>>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Terms
>>>>>>>>
>>>>>>>> (My advice would be to create a separate field for each key-value
>>>>>>>> pair instead of stuffing all pairs into one text field, if you need
>>>>>>>> to search them separately.)
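
The separate-fields advice above could be prototyped roughly as follows. This is only a sketch under assumptions: the class name `KeyValueFields` and the method `split` are hypothetical, it uses plain Java with no Lucene calls, and it assumes the combined string always has the key="value" shape shown elsewhere in this thread. Each resulting entry would then be indexed as its own field instead of one large text field.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeyValueFields {
    // Matches pairs such as street="MAIN" or region="NEW HAMPSHIRE".
    private static final Pattern PAIR = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    /** Splits a combined key="value" string into one entry per key, in input order. */
    public static Map<String, String> split(String combined) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(combined);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        String combined = "street=\"MAIN\" city=\"NASHUA\" municipality=\"HILLSBOROUGH\" "
                + "region=\"NEW HAMPSHIRE\" country=\"UNITED STATES\"";
        // Print one line per key; each pair is a candidate for its own index field.
        split(combined).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

With the values split out like this, a fuzzy clause on the street value no longer competes with city, region, or country tokens in the same field.
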
>>>>>>>>
>>>>>>>> 2019-06-12 (Wed) 2:39 <baris.kazar@oracle.com>:
>>>>>>>>
>>>>>>>> i can say that quotes is not the issue with index as it still
>>>>>>>> results in
>>>>>>>> same results with quotes or without quotes.
>>>>>>>>
>>>>>>>> i am starting to feel that this might be a bug maybe??
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
>>>>>>>>
>>>>>>>> Somehow " is causing an issue as this should return street with
>>>>>>>> MAIN:
>>>>>>>>
>>>>>>>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
>>>>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
>>>>>>>> states"] -> this was with fuzzyquery on MAINS
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
>>>>>>>>
>>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>>>> contentDFLT:mains]
>>>>>>>>
>>>>>>>> QueryParser chops it into two pieces from
>>>>>>>> parser.parse("street=\"MAINS\"");
>>>>>>>>
>>>>>>>> Index has a TextField named contentDFLT the following data :
>>>>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
>>>>>>>> HAMPSHIRE" country="UNITED STATES"
>>>>>>>>
>>>>>>>>
>>>>>>>> When i set street=\"MAINS~\" with parser:
>>>>>>>> i get the following
>>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>>>> contentDFLT:mains]
>>>>>>>>
>>>>>>>> probably " quotations are messing this up as You were saying...
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>>>>>>>>
>>>>>>>> Or, " (double quotation) in your query string may affect query
>>>>>>>> parsing.
>>>>>>>>
>>>>>>>> When I parse this string by classic query parser (lucene 8.1),
>>>>>>>> street="MAINS~"
>>>>>>>> parsed (raw) query is
>>>>>>>> text:street text:mains
>>>>>>>> (I set the default search field to "text", so text:xxxx appears
>>>>>>>> here.)
>>>>>>>>
>>>>>>>> Query parsing is a complex process, so it would be good to check
>>>>>>>> parsed raw query string especially when you have (reserved) special
>>>>>>>> characters in your query...
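
To see why street="MAINS~" ends up as two plain terms, here is a rough stand-in for the tokenization step. It is an assumption-laden sketch, not Lucene itself: the class `RoughTokens` is hypothetical, and it simply lowercases and splits on every character that is not a letter or digit. The real StandardTokenizer follows the UAX #29 word-break rules and differs in the details, and the query parser does its own syntax handling before analysis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class RoughTokens {
    /** Lowercases the input and splits it wherever a non-letter, non-digit character occurs. */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            // split() can yield an empty leading piece when the text starts with a separator.
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The '=', the quotes and the '~' all act as separators and vanish.
        System.out.println(tokenize("street=\"MAINS~\"")); // prints [street, mains]
    }
}
```

The point of the sketch is only that '=', '"' and '~' do not survive as part of any token, which matches the parsed query text:street text:mains reported above.
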
>>>>>>>>
>>>>>>>> 2019-06-11 (Tue) 1:10 Tomoko Uchida <tomoko.uchida.1111@gmail.com>:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I noticed one small thing in your previous mail.
>>>>>>>>
>>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>>>>>>>>
>>>>>>>> which is good.
>>>>>>>>
>>>>>>>> To specify a search field, ":" (colon) should be used instead of
>>>>>>>> "=".
>>>>>>>> See the query parser documentation:
>>>>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fields
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm not sure this is related to your problem.
>>>>>>>>
>>>>>>>> 2019-06-11 (Tue) 0:51 <baris.kazar@oracle.com>:
>>>>>>>>
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>>>> phraseAnalyzer) ;
>>>>>>>> Query q1 = null;
>>>>>>>> try {
>>>>>>>> q1 = parser.parse("MAIN");
>>>>>>>> } catch (ParseException e) {
>>>>>>>>
>>>>>>>> e.printStackTrace();
>>>>>>>> }
>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>>>>>>>
>>>>>>>> testQuerySearch2 Time to compute: 0 seconds
>>>>>>>> Number of results: 1775
>>>>>>>> Name: Main St
>>>>>>>> Score: 37.20959
>>>>>>>> ID: 12681979
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 37.20959
>>>>>>>> ID: 12681977
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 37.20959
>>>>>>>> ID: 12681978
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>
>>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same
>>>>>>>> results
>>>>>>>> which is good.
>>>>>>>>
>>>>>>>> But when i switch to MAINS~ then fuzzy query does not work.
>>>>>>>>
>>>>>>>>
>>>>>>>> i need to say something with the q1 only in the booleanquery:
>>>>>>>> it tries to match the MAIN in street, city, region and country
>>>>>>>> which are
>>>>>>>> in a single TextField field.
>>>>>>>> But i dont want this. That is why i need to use street="..." etc.
>>>>>>>> when searching.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> just for the basic verification, can you find the document without
>>>>>>>> fuzzy query? I mean, does this query work for you?
>>>>>>>>
>>>>>>>> Query query = parser.parse("MAIN");
>>>>>>>>
>>>>>>>> Tomoko
>>>>>>>>
>>>>>>>> 2019-06-11 (Tue) 0:22 <baris.kazar@oracle.com>:
>>>>>>>>
>>>>>>>> why does the second set not work at all?
>>>>>>>>
>>>>>>>> it is indexed as Textfield like street="..." city="..." etc.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>>>>>>>
>>>>>>>> i dont know how to use FuzzyQuery with QueryParser, but probably You
>>>>>>>> are suggesting
>>>>>>>>
>>>>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
>>>>>>>> Query query = parser.parse("MAINS~2");
>>>>>>>>
>>>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>>>>>>>
>>>>>>>> am i right?
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>>>>>>
>>>>>>>> I would suggest using a QueryParser for your fuzzy query before
>>>>>>>> adding it to the Boolean query. This should weed out any case
>>>>>>>> issues.
>>>>>>>>
>>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com
>>>>>>>> <mailto:baris.kazar@oracle.com>> wrote:
>>>>>>>>
>>>>>>>> BooleanQuery.Builder booleanQuery = new
>>>>>>>> BooleanQuery.Builder();
>>>>>>>>
>>>>>>>> //First set
>>>>>>>>
>>>>>>>> booleanQuery.add(new FuzzyQuery(new
>>>>>>>> org.apache.lucene.index.Term(field, "MAINS")),
>>>>>>>> BooleanClause.Occur.SHOULD);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>> // Second set
>>>>>>>> //booleanQuery.add(new FuzzyQuery(new
>>>>>>>> //    org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
>>>>>>>> //    BooleanClause.Occur.SHOULD);
>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>> //    field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>> //    field, "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>> //    field, "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>> The first set also brings back streets named Nashua (NASHUA).
>>>>>>>>
>>>>>>>> So, to prevent that, and since i also indexed with street="..."
>>>>>>>> city="...", i tried the second set, but it does not bring anything.
>>>>>>>>
>>>>>>>> createPhraseQuery builds a PhraseQuery with one term equal to the
>>>>>>>> string in the call.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 10:47 AM, baris.kazar@oracle.com
>>>>>>>> <mailto:baris.kazar@oracle.com> wrote:
>>>>>>>> > How do i check how it is indexed? lowercase or uppercase?
>>>>>>>> >
>>>>>>>> > The only way for now is by testing.
>>>>>>>> >
>>>>>>>> > i am using standardanalyzer.
>>>>>>>> >
>>>>>>>> > Best regards
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>>>>>>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>>>>>>> >> <tomoko.uchida.1111@gmail.com
>>>>>>>> <mailto:tomoko.uchida.1111@gmail.com>> wrote:
>>>>>>>> >>> Hi,
>>>>>>>> >>>
>>>>>>>> >>> What analyzer do you use for the text field? Is the term "Main"
>>>>>>>> >>> correctly indexed?
>>>>>>>> >> Agreed. Also, it would be good if you could post your actual code.
>>>>>>>> >>
>>>>>>>> >> What analyzer are you using? If you are using StandardAnalyzer, then
>>>>>>>> >> all of your terms while indexing will be lowercased, AFAIK, but your
>>>>>>>> >> query will not be analyzed until you run a QueryParser on it.
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >> Atri
>>>>>>>> >>
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>
>>>>>>>>
>>>>>>>> > To unsubscribe, e-mail:
>>>>>>>> java-user-unsubscribe@lucene.apache.org
>>>>>>>> <mailto:java-user-unsubscribe@lucene.apache.org>
>>>>>>>> > For additional commands, e-mail:
>>>>>>>> java-user-help@lucene.apache.org
>>>>>>>> <mailto:java-user-help@lucene.apache.org>
>>>>>>>> >
>>>>>>>>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: FuzzyQuery- why is it ignored? [ In reply to ]
Please send messages to java-user mail list only. It is not
recommended to send questions to someone's private mail address.
(I accidentally sent a reply without including java-user in "To",
because "Reply-To" header was not correctly set in your previous
mail.)

I did not run any test with Lucene 6.6, and won't try it until
reproducible results/conditions are provided.

2019-06-23 (Sun) 10:40 Baris Kazar <baris.kazar@oracle.com>:

>
> Tomoko,-
> i will surely try on my env version 8.1
> but if You could also try then both runs will
> make sure it is bug.
> No problems at all. i will test it.
> I need to ask one thing when you ran the example
> did you have any other entries in the index?
> and i did not understand your statement:
> the fuzzy query worked on 8.1, did you also try 6.6?
> anyways i will know for sure when i test this weekend.
> baris
>
> ----- Original Message -----
> From: tomoko.uchida.1111@gmail.com
> To: baris.kazar@oracle.com
> Sent: Saturday, June 22, 2019 9:14:22 PM GMT -05:00 US/Canada Eastern
> Subject: Re: FuzzyQuery- why is it ignored?
>
> > If You could index these entries and still find Main from MAINS query with Lucene 8.1,
> > that means this is a bug in Lucene 6.6.
>
> No, the fuzzy query working correctly for me does not mean there is a
> difference between 8.1 and 6.6.
> I'd suggest you try Lucene 8.1 on your own with your environments/settings.
>
> Tomoko
>
> 2019-06-23 (Sun) 2:14 Baris Kazar <baris.kazar@oracle.com>:
> >
> > Tomoko,-
> > may i ask if You could try with these few more data indexed too?
> >
> > "KEHOE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
> > "CHESTNUT NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
> > "JEFFERSON NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
> > "NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
> > "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
> > "NEW HAMPSHIRE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
> >
> > If You could index these entries and still find Main from MAINS query with Lucene 8.1,
> > that means this is a bug in Lucene 6.6.
> > Best regards
> >
> > ----- Original Message -----
> > From: baris.kazar@oracle.com
> > To: java-user@lucene.apache.org, tomoko.uchida.1111@gmail.com, erickerickson@gmail.com, atri@linux.com, baris.kazar@oracle.com, lucene@mikemccandless.com
> > Sent: Thursday, June 13, 2019 10:49:05 AM GMT -05:00 US/Canada Eastern
> > Subject: Re: FuzzyQuery- why is it ignored?
> >
> > i see, i am using an older version 6.6 and we should switch to Your 8.1
> > version or at least 7.X.
> >
> > Tomoko i think i understood You meant MAIN NASHUA .... for the string :)
> >
> > Again i really appreciate all answers.
> >
> > How do we disable or enable stemming while indexing? :) another question.
> >
> > Best regards
> >
> >
> > On 6/13/19 10:40 AM, Tomoko Uchida wrote:
> > > Sorry, I made a mistake when copypasting. Let me just correct my previous mail.
> > >
> > >> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES".
> > > 1. Indexed this text: "MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW
> > > HAMPSHIRE UNITED STATES"
> > >
> > > ----
> > > As far as I can tell, this query correctly finds the indexed document
> > > (so I have no idea about what is wrong with fuzzy query).
> > > +contentDFLT:mains~2 +contentDFLT:"nashua"
> > > +contentDFLT:"new-hampshire" +contentDFLT:"united states"
> > >
> > > I am
> > > - using lucene 8.1.
> > > - using standard analyzer for both of indexing and searching.
> > > - using classic query parser for parsing.
> > >
> > >
> > >
> > > 2019-06-13 (Thu) 23:18 <baris.kazar@oracle.com>:
> > >> However, the index does not have MAINS but MAIN for the expected entry.
> > >>
> > >> Best regards
> > >>
> > >>
> > >>
> > >> On 6/13/19 10:33 AM, baris.kazar@oracle.com wrote:
> > >>> does it consider it as like plural word? :) :) :)
> > >>> That makes sense.
> > >>>
> > >>> Best regards
> > >>>
> > >>>
> > >>> On 6/13/19 10:31 AM, baris.kazar@oracle.com wrote:
> > >>>> Erick,
> > >>>>
> > >>>> Cool, could You give a simple example with my example please?
> > >>>>
> > >>>> Best regards
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 6/13/19 10:12 AM, Erick Erickson wrote:
> > >>>>> Shot in the dark: stemming. Whenever I see a problem with something
> > >>>>> ending in “s” (or “er” or “ing” or….) my first suspect is that
> > >>>>> stemming is turned on. In that case the token in the index that’s
> > >>>>> actually searched on is somewhat different than you expect.
> > >>>>>
> > >>>>> The test is easy, just ensure your fieldType contains no stemmers.
> > >>>>> PorterStemmer is particularly aggressive, but for this case to test
> > >>>>> I’d just remove all stemming, re-index and see if the results differ.
> > >>>>>
> > >>>>> Best,
> > >>>>> Erick
> > >>>>>
> > >>>>>> On Jun 13, 2019, at 7:26 AM, baris.kazar@oracle.com wrote:
> > >>>>>>
> > >>>>>> Tomoko,-
> > >>>>>>
> > >>>>>> That is strange indeed.
> > >>>>>>
> > >>>>>> Something is wrong when i use mains but maink, mainl, mainr,mainq,
> > >>>>>> maint all work ok any consonant at the end except s works in this
> > >>>>>> case.
> > >>>>>>
> > >>>>>> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
> > >>>>>>
> > >>>>>> i am using fuzzy query with ~ from Query.builder and that is not
> > >>>>>> PhraseQuery.
> > >>>>>>
> > >>>>>> Similarly FuzzyQuery with input "mains" (it has to be lowercase
> > >>>>>> since it does not go through StandardAnalyzer) is also not
> > >>>>>> PhraseQuery.
> > >>>>>>
> > >>>>>> can there be a clearer sample case for ComplexPhraseQuery please in
> > >>>>>> the docs?
> > >>>>>>
> > >>>>>> did You also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
> > >>>>>> STATES" the expected output in this case?
> > >>>>>>
> > >>>>>> Thanks for spending time on this, i would like to thank everyone.
> > >>>>>>
> > >>>>>> Best regards
> > >>>>>>
> > >>>>>>
> > >>>>>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
> > >>>>>>> Hi,
> > >>>>>>>
> > >>>>>>>> Ok, i think only this very specific only "mains" has an issue.
> > >>>>>>> It looks strange to me. I did some test locally.
> > >>>>>>>
> > >>>>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE
> > >>>>>>> UNITED STATES".
> > >>>>>>>
> > >>>>>>> 2a. This query string (just copied from your Case #3) worked
> > >>>>>>> correctly
> > >>>>>>> for me as far as I can see.
> > >>>>>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
> > >>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
> > >>>>>>>
> > >>>>>>> 2b. However this query string got no results.
> > >>>>>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
> > >>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
> > >>>>>>> It is an expected behaviour because the classic query parser does not
> > >>>>>>> support fuzzy query inside phrase query (as far as I know).
> > >>>>>>>
> > >>>>>>> I suspect you use fuzzy query operator (~) inside phrase query
> > >>>>>>> ("), as
> > >>>>>>> the 2b case.
> > >>>>>>>
> > >>>>>>> FYI: there is a special parser for such complex phrase query.
> > >>>>>>> https://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Tomoko
> > >>>>>>>
> > >>>>>>> 2019-06-13 (Thu) 6:16 <baris.kazar@oracle.com>:
> > >>>>>>>> Ok, i think only this very specific only "mains" has an issue.
> > >>>>>>>>
> > >>>>>>>> all i knew about Lucene was fine :) Great...
> > >>>>>>>>
> > >>>>>>>> i have one more question:
> > >>>>>>>>
> > >>>>>>>> which one is advised to use: FuzzyQuery or the Query.parser with
> > >>>>>>>> search string~ appended?
> > >>>>>>>>
> > >>>>>>>> The second one will go through analyzer and make search string
> > >>>>>>>> lowercase.
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
> > >>>>>>>>
> > >>>>>>>> Hi again,-
> > >>>>>>>>
> > >>>>>>>> this is really interesting and i hope i am missing something.
> > >>>>>>>> The index lowercases all entries, so case sensitivity is not an
> > >>>>>>>> issue, i think.
> > >>>>>>>>
> > >>>>>>>> Case #1:
> > >>>>>>>>
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> > >>>>>>>> phraseAnalyzer) ;
> > >>>>>>>> Query q1 = null;
> > >>>>>>>> try {
> > >>>>>>>> q1 = parser.parse("Main");
> > >>>>>>>> } catch (ParseException e) {
> > >>>>>>>> e.printStackTrace();
> > >>>>>>>> }
> > >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> This brings with this:
> > >>>>>>>>
> > >>>>>>>> query plan:
> > >>>>>>>>
> > >>>>>>>> [+contentDFLT:main, +contentDFLT:"nashua",
> > >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> > >>>>>>>>
> > >>>>>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after
> > >>>>>>>> exec finished)
> > >>>>>>>>
> > >>>>>>>> Number of results: 12
> > >>>>>>>> Name: Main Dunstable Rd
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12677400
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.72631, -71.50269
> > >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> > >>>>>>>> UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12681980
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76416, -71.46681
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12681973
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75045, -71.4607
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12681974
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76019, -71.465
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main Dunstable Rd
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12677399
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.74641, -71.48943
> > >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> > >>>>>>>> UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: S Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 11893215
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73412, -71.44797
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12681978
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73492, -71.44951
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: S Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 11893214
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73958, -71.45895
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12681979
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76416, -71.46681
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12681977
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.747, -71.45957
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Case #2
> > >>>>>>>>
> > >>>>>>>> This also worked when i added ~ to the word Main to make it a
> > >>>>>>>> fuzzy query:
> > >>>>>>>>
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> > >>>>>>>> phraseAnalyzer) ;
> > >>>>>>>> Query q1 = null;
> > >>>>>>>> try {
> > >>>>>>>> q1 = parser.parse("Main~");
> > >>>>>>>> } catch (ParseException e) {
> > >>>>>>>> e.printStackTrace();
> > >>>>>>>> }
> > >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> query plan:
> > >>>>>>>>
> > >>>>>>>> [+contentDFLT:main~2, +contentDFLT:"nashua",
> > >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> > >>>>>>>>
> > >>>>>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging
> > >>>>>>>> stops)
> > >>>>>>>> Number of results: 12
> > >>>>>>>> Name: Main Dunstable Rd
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12677400
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.72631, -71.50269
> > >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> > >>>>>>>> UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12681980
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76416, -71.46681
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12681973
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75045, -71.4607
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12681974
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76019, -71.465
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main Dunstable Rd
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12677399
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.74641, -71.48943
> > >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> > >>>>>>>> UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: S Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 11893215
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73412, -71.44797
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12681978
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73492, -71.44951
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: S Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 11893214
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73958, -71.45895
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12681979
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76416, -71.46681
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12681977
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.747, -71.45957
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Case #3
> > >>>>>>>>
> > >>>>>>>> But why does this not work in fuzzy mode? i misspelled it a bit
> > >>>>>>>> (1 edit away), and as You saw the data is there with the Main spelling:
> > >>>>>>>>
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> > >>>>>>>> phraseAnalyzer) ;
> > >>>>>>>>
> > >>>>>>>> Query q1 = null;
> > >>>>>>>> try {
> > >>>>>>>> q1 = parser.parse("Mains~"); // 1 edit away
> > >>>>>>>> } catch (ParseException e) {
> > >>>>>>>> e.printStackTrace();
> > >>>>>>>> }
> > >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> > >>>>>>>>
> > >>>>>>>> query plan:
> > >>>>>>>>
> > >>>>>>>> [+contentDFLT:mains~2, +contentDFLT:"nashua",
> > >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> > >>>>>>>>
> > >>>>>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging
> > >>>>>>>> stops)
> > >>>>>>>>
> > >>>>>>>> Number of results: 0
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Case #4
> > >>>>>>>>
> > >>>>>>>> Then i changed q1 to SHOULD from MUST above: and i think fuzzy
> > >>>>>>>> query is ignored here since there is no MAIN in the first 468
> > >>>>>>>> results:
> > >>>>>>>>
> > >>>>>>>> there is no boost for Mains term here.
> > >>>>>>>>
> > >>>>>>>> query plan:
> > >>>>>>>>
> > >>>>>>>> [contentDFLT:mains~2, +contentDFLT:"nashua",
> > >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> > >>>>>>>>
> > >>>>>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging
> > >>>>>>>> stops)
> > >>>>>>>> Number of results: 1794
> > >>>>>>>> Name: Nashua Dr
> > >>>>>>>> Score: 34.186226
> > >>>>>>>> ID: 4974936
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.7636, -71.46063
> > >>>>>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Nashua River Rail Trl
> > >>>>>>>> Score: 34.186226
> > >>>>>>>> ID: 4975508
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.7062, -71.53962
> > >>>>>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE
> > >>>>>>>> UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Nashua Rd
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 4975388
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.78746, -71.92823
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: NASHUA
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 21014865
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75873, -71.46438
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: NASHUA
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 21014865
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75873, -71.46438
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: NASHUA
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 21014865
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75873, -71.46438
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: NASHUA
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 21014865
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75873, -71.46438
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: NASHUA
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 21014865
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75873, -71.46438
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Nashua St
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 4975671
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.88471, -70.81687
> > >>>>>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Nashua Rd
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 4975400
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.79014, -71.92364
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Why is the fuzzy query ignored?
> > >>>>>>>> Even if i have separate fields for street, city,region, country,
> > >>>>>>>> this fuzzy query issue will come into place for words with
> > >>>>>>>> multiple parts like main dunstable etc., right?
> > >>>>>>>>
> > >>>>>>>> Best regards
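A note on Case #4: when a query also has MUST clauses, a SHOULD clause is optional. It cannot shrink the result set; it can only raise the score of the documents it matches. The toy model below sketches that behavior in plain Java. It is not Lucene code: the class and method names are hypothetical and the scoring is invented for illustration. Read this way, the 1794 hits are simply what the three MUST clauses match on their own, and mains~2 appears to have matched nothing. The sketch does not explain why it matched nothing.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model of MUST vs SHOULD clauses. Not Lucene code, not Lucene scoring.
class OccurSketch {

    // Split on whitespace and lowercase; a stand-in for real analysis.
    static Set<String> tokens(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+")));
    }

    // A document is a hit only if it contains every MUST term.
    static boolean matches(Set<String> doc, List<String> must) {
        return doc.containsAll(must);
    }

    // Invented score: one point per MUST term, plus one per SHOULD term present.
    static int score(Set<String> doc, List<String> must, List<String> should) {
        int score = must.size();
        for (String term : should) {
            if (doc.contains(term)) {
                score++;
            }
        }
        return score;
    }

    public static void main(String[] args) {
        List<String> must = Arrays.asList("nashua", "hampshire", "states");
        Set<String> mainSt = tokens("MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES");
        Set<String> nashuaDr = tokens("NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES");

        // A SHOULD term that matches a document lifts only that document's score.
        System.out.println(score(mainSt, must, Arrays.asList("main")));
        System.out.println(score(nashuaDr, must, Arrays.asList("main")));

        // A SHOULD term that matches nothing changes neither the hits nor their order.
        System.out.println(matches(nashuaDr, must) + " "
                + score(nashuaDr, must, Arrays.asList("mains")));
    }
}
```

In this model an exact "main" SHOULD term would rank the Main St entry above the Nashua Dr entry, while an unmatched "mains" leaves both at the MUST-only score.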
> > >>>>>>>>
> > >>>>>>>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
> > >>>>>>>>
> > >>>>>>>> Tomoko,-
> > >>>>>>>>
> > >>>>>>>> Thank You for Your suggestions. i am trying to understand it
> > >>>>>>>> and i thought i did :)
> > >>>>>>>>
> > >>>>>>>> but it does not work with FuzzyQuery when i used with a *single*
> > >>>>>>>> large TextField like street=...value... city=...value...
> > >>>>>>>> region=...value... country=...value... (with or without quotes
> > >>>>>>>> for the values)
> > >>>>>>>>
> > >>>>>>>> What i knew about Lucene fuzzy queries are not holding now with
> > >>>>>>>> this Textfield form. That is why i suspected of a bug.
> > >>>>>>>>
> > >>>>>>>> 1. Yes, i saw and have a solid proof on that now.
> > >>>>>>>>
> > >>>>>>>> 2. yes, but FuzzyQuery takes the quotes as they are, since they are
> > >>>>>>>> escaped and the term is not analyzed.
> > >>>>>>>>
> > >>>>>>>> Stuffing into one textfield vs having separate fields should only
> > >>>>>>>> affect probably the performance but not the outcome in my case.
> > >>>>>>>> But, i have been thinking about this and maybe it is the way to
> > >>>>>>>> go in this case.
> > >>>>>>>>
> > >>>>>>>> My CONTENT field has street names in mixed case and city, region,
> > >>>>>>>> and country names in UPPERCASE. Can this be a problem?
> > >>>>>>>> i thought index stored them in lowercase since i am using
> > >>>>>>>> StandardAnalyzer.
> > >>>>>>>>
> > >>>>>>>> CONTENT field also has full textfield string with street=...
> > >>>>>>>> city=... region=... country=... (here all values are UPPERCASE).
> > >>>>>>>>
> > >>>>>>>> Why cant the index find the names via FuzzyQuery? i tried both
> > >>>>>>>> FuzzyQuery and Query builder as i showed before.
> > >>>>>>>>
> > >>>>>>>> The last advice in Your previous email would nicely go outside
> > >>>>>>>> the parentheses since it might be very critical :) :) :)
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
> > >>>>>>>>
> > >>>>>>>> I'd suggest to correctly understand the way a software works before
> > >>>>>>>> suspecting its bug :-)
> > >>>>>>>>
> > >>>>>>>> I guess you may miss two points:
> > >>>>>>>>
> > >>>>>>>> 1. the standard analyzer (standard tokenizer) breaks words at double
> > >>>>>>>> quotes (U+0022), so quotes are not indexed or searched at all if you
> > >>>>>>>> are using the standard analyzer. (That is the reason you get the same
> > >>>>>>>> results with or without quotes.)
> > >>>>>>>> See:
> > >>>>>>>> https://lucene.apache.org/core/8_1_0/core/org/apache/lucene/analysis/standard/StandardTokenizer.html
> > >>>>>>>> and
> > >>>>>>>> http://unicode.org/reports/tr29/
> > >>>>>>>>
> > >>>>>>>> 2. the double quote has a special meaning (it's interpreted as a
> > >>>>>>>> phrase query) with the built-in query parser, so you need to escape
> > >>>>>>>> it if you want to search for the double quote itself.
> > >>>>>>>> See:
> > >>>>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Terms
> > >>>>>>>>
> > >>>>>>>> (My advice would be to create a separate field for each key-value
> > >>>>>>>> pair instead of stuffing all pairs into one text field, if you need
> > >>>>>>>> to search them separately.)
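To make point 1 concrete, here is a rough approximation of what happens to the stuffed field value at analysis time. This is a plain-Java sketch, not the real StandardTokenizer: it splits on anything that is not a letter or digit and lowercases the pieces. The real tokenizer follows the UAX #29 word-break rules and treats apostrophes, decimal points, underscores and so on differently. The class name TokenizeSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Rough stand-in for StandardAnalyzer on plain ASCII input like the one in this thread.
// Assumption: splitting on non-letter/non-digit runs is close enough for this example only.
class TokenizeSketch {

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            // A leading separator yields an empty first piece; skip it.
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("street=\"MAIN\" city=\"NASHUA\" region=\"NEW HAMPSHIRE\""));
    }
}
```

Under this approximation the example prints [street, main, city, nashua, region, new, hampshire]. The "=" and the quotes never reach the index, and "street" becomes an ordinary token in the same field as "main". That is one reason separate fields are the cleaner way to restrict a term to the street name.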
> > >>>>>>>>
> > >>>>>>>> On Wed, Jun 12, 2019 at 2:39 <baris.kazar@oracle.com> wrote:
> > >>>>>>>>
> > >>>>>>>> i can say that the quotes are not the issue with the index, as it
> > >>>>>>>> still gives the same results with or without quotes.
> > >>>>>>>>
> > >>>>>>>> i am starting to feel that this might be a bug maybe??
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
> > >>>>>>>>
> > >>>>>>>> Somehow " is causing an issue as this should return street with
> > >>>>>>>> MAIN:
> > >>>>>>>>
> > >>>>>>>> [.contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
> > >>>>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
> > >>>>>>>> states"] -> this was with fuzzyquery on MAINS
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
> > >>>>>>>>
> > >>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> > >>>>>>>> +contentDFLT:"country united states", contentDFLT:street
> > >>>>>>>> contentDFLT:mains]
> > >>>>>>>>
> > >>>>>>>> QueryParser chops it into two pieces from
> > >>>>>>>> parser.parse("street=\"MAINS\"");
> > >>>>>>>>
> > >>>>>>>> Index has a TextField named contentDFLT the following data :
> > >>>>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
> > >>>>>>>> HAMPSHIRE" country="UNITED STATES"
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> When i set street=\"MAINS~\" with parser:
> > >>>>>>>> i get the following
> > >>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> > >>>>>>>> +contentDFLT:"country united states", contentDFLT:street
> > >>>>>>>> contentDFLT:mains]
> > >>>>>>>>
> > >>>>>>>> probably " quotations are messing this up as You were saying...
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
> > >>>>>>>>
> > >>>>>>>> Or, " (double quotation) in your query string may affect query
> > >>>>>>>> parsing.
> > >>>>>>>>
> > >>>>>>>> When I parse this string by classic query parser (lucene 8.1),
> > >>>>>>>> street="MAINS~"
> > >>>>>>>> parsed (raw) query is
> > >>>>>>>> text:street text:mains
> > >>>>>>>> (I set the default search field to "text", so text:xxxx appears
> > >>>>>>>> here.)
> > >>>>>>>>
> > >>>>>>>> Query parsing is a complex process, so it would be good to check
> > >>>>>>>> parsed raw query string especially when you have (reserved) special
> > >>>>>>>> characters in your query...
> > >>>>>>>>
> > >>>>>>>> On Tue, Jun 11, 2019 at 1:10 Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>> Hi,
> > >>>>>>>>
> > >>>>>>>> I noticed one small thing in your previous mail.
> > >>>>>>>>
> > >>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
> > >>>>>>>>
> > >>>>>>>> which is good.
> > >>>>>>>>
> > >>>>>>>> To specify a search field, ":" (colon) should be used instead of
> > >>>>>>>> "=".
> > >>>>>>>> See the query parser documentation:
> > >>>>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fields
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I'm not sure this is related to your problem.
> > >>>>>>>>
> > >>>>>>>> On Tue, Jun 11, 2019 at 0:51 <baris.kazar@oracle.com> wrote:
> > >>>>>>>>
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
> > >>>>>>>>
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> > >>>>>>>> phraseAnalyzer) ;
> > >>>>>>>> Query q1 = null;
> > >>>>>>>> try {
> > >>>>>>>> q1 = parser.parse("MAIN");
> > >>>>>>>> } catch (ParseException e) {
> > >>>>>>>>
> > >>>>>>>> e.printStackTrace();
> > >>>>>>>> }
> > >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
> > >>>>>>>>
> > >>>>>>>> testQuerySearch2 Time to compute: 0 seconds
> > >>>>>>>> Number of results: 1775
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 37.20959
> > >>>>>>>> ID: 12681979
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76416, -71.46681
> > >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> > >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 37.20959
> > >>>>>>>> ID: 12681977
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.747, -71.45957
> > >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> > >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 37.20959
> > >>>>>>>> ID: 12681978
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73492, -71.44951
> > >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> > >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> > >>>>>>>>
> > >>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same
> > >>>>>>>> results
> > >>>>>>>> which is good.
> > >>>>>>>>
> > >>>>>>>> But when i switch to MAINS~ then fuzzy query does not work.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> i need to say something with the q1 only in the booleanquery:
> > >>>>>>>> it tries to match the MAIN in street, city, region and country
> > >>>>>>>> which are
> > >>>>>>>> in a single TextField field.
> > >>>>>>>> But i dont want this. that is why i need to street="..." etc when
> > >>>>>>>> searching.
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
> > >>>>>>>>
> > >>>>>>>> Hi,
> > >>>>>>>>
> > >>>>>>>> just for the basic verification, can you find the document without
> > >>>>>>>> fuzzy query? I mean, does this query work for you?
> > >>>>>>>>
> > >>>>>>>> Query query = parser.parse("MAIN");
> > >>>>>>>>
> > >>>>>>>> Tomoko
> > >>>>>>>>
> > >>>>>>>> On Tue, Jun 11, 2019 at 0:22 <baris.kazar@oracle.com> wrote:
> > >>>>>>>>
> > >>>>>>>> why does the second set not work at all?
> > >>>>>>>>
> > >>>>>>>> it is indexed as Textfield like street="..." city="..." etc.
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
> > >>>>>>>>
> > >>>>>>>> i don't know how to use FuzzyQuery with QueryParser, but probably
> > >>>>>>>> You are suggesting
> > >>>>>>>>
> > >>>>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
> > >>>>>>>> Query query = parser.parse("MAINS~2");
> > >>>>>>>>
> > >>>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
> > >>>>>>>>
> > >>>>>>>> am i right?
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
> > >>>>>>>>
> > >>>>>>>> I would suggest using a QueryParser for your fuzzy query before
> > >>>>>>>> adding it to the Boolean query. This should weed out any case
> > >>>>>>>> issues.
> > >>>>>>>>
> > >>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com> wrote:
> > >>>>>>>>
> > >>>>>>>> BooleanQuery.Builder booleanQuery = new
> > >>>>>>>> BooleanQuery.Builder();
> > >>>>>>>>
> > >>>>>>>> //First set
> > >>>>>>>>
> > >>>>>>>> booleanQuery.add(new FuzzyQuery(new
> > >>>>>>>> org.apache.lucene.index.Term(field, "MAINS")),
> > >>>>>>>> BooleanClause.Occur.SHOULD);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> > >>>>>>>>
> > >>>>>>>> // Second set
> > >>>>>>>> //booleanQuery.add(new FuzzyQuery(new
> > >>>>>>>> //    org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
> > >>>>>>>> //    BooleanClause.Occur.SHOULD);
> > >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> > >>>>>>>> //    field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
> > >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> > >>>>>>>> //    field, "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
> > >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> > >>>>>>>> //    field, "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
> > >>>>>>>>
> > >>>>>>>> The first set also brings back streets named Nashua (NASHUA).
> > >>>>>>>>
> > >>>>>>>> so, to prevent that, and since i also indexed with street="..."
> > >>>>>>>> city="...", i tried the second set, but it does not return anything.
> > >>>>>>>>
> > >>>>>>>> createPhraseQuery builds a PhraseQuery with one term equal to the
> > >>>>>>>> string in the call.
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 10:47 AM, baris.kazar@oracle.com wrote:
> > >>>>>>>> > How do i check how it is indexed? lowercase or uppercase?
> > >>>>>>>> >
> > >>>>>>>> > the only way now is by testing.
> > >>>>>>>> >
> > >>>>>>>> > i am using StandardAnalyzer.
> > >>>>>>>> >
> > >>>>>>>> > Best regards
> > >>>>>>>> >
> > >>>>>>>> >
> > >>>>>>>> > On 6/9/19 11:57 AM, Atri Sharma wrote:
> > >>>>>>>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
> > >>>>>>>> >> <tomoko.uchida.1111@gmail.com> wrote:
> > >>>>>>>> >>> Hi,
> > >>>>>>>> >>>
> > >>>>>>>> >>> What analyzer do you use for the text field? Is the term "Main"
> > >>>>>>>> >>> correctly indexed?
> > >>>>>>>> >> Agreed. Also, it would be good if you could post your actual code.
> > >>>>>>>> >>
> > >>>>>>>> >> What analyzer are you using? If you are using StandardAnalyzer,
> > >>>>>>>> >> then all of your terms will be lowercased while indexing, AFAIK,
> > >>>>>>>> >> but your query will not be analyzed until you run a QueryParser
> > >>>>>>>> >> on it.
> > >>>>>>>> >>
> > >>>>>>>> >>
> > >>>>>>>> >> Atri
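Atri's point can be checked with a little arithmetic. A FuzzyQuery built directly from a Term is not analyzed, so the uppercase text "MAINS" is compared as-is against the lowercased index term "main". The sketch below uses plain case-sensitive Levenshtein distance as a simplification. Lucene's FuzzyQuery allows at most 2 edits and, by default, also counts a transposition of adjacent characters as one edit, which makes no difference for these two strings. The class name EditDistanceSketch is hypothetical.

```java
// Plain Levenshtein distance (insert, delete, substitute), case-sensitive.
// A simplification of the edit distance FuzzyQuery uses; no Lucene classes involved.
class EditDistanceSketch {

    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1], prev[j]) + 1;
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("MAINS", "main")); // unanalyzed query text vs indexed term
        System.out.println(levenshtein("mains", "main")); // lowercased query text vs indexed term
    }
}
```

The first distance is 5 (four case mismatches plus the extra S), well beyond the 2-edit limit. The second is 1. So a hand-built FuzzyQuery needs lowercase input, or the query text should go through the QueryParser as suggested.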
> > >>>>>>>> >>
> > >>>>>>>> >
> > >>>>>>>> >
> > >>>>>>>> >
> > >>>>>>>> ---------------------------------------------------------------------
> > >>>>>>>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > >>>>>>>> > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >>>>>>>> >
> > >>>>>>>>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: FuzzyQuery- why is it ignored? [ In reply to ]
Oops, sorry about this. i also hit reply automatically and
assumed it went to the list.

i totally agree with your recommendations, and i
agree it should have been sent to the mailing list.

Please accept my apologies for not checking the recipient address.

i will try 8.1 tomorrow in my environment.

baris

----- Original Message -----
From: tomoko.uchida.1111@gmail.com
To: java-user@lucene.apache.org
Sent: Saturday, June 22, 2019 10:35:26 PM GMT -05:00 US/Canada Eastern
Subject: Re: FuzzyQuery- why is it ignored?

Please send messages to the java-user mailing list only. It is not
recommended to send questions to someone's private mail address.
(I accidentally sent a reply without including java-user in "To",
because the "Reply-To" header was not correctly set in your previous
mail.)

I did not run any test with Lucene 6.6, and won't try it until
reproducible results/conditions are provided.

On Sun, Jun 23, 2019 at 10:40 Baris Kazar <baris.kazar@oracle.com> wrote:

>
> Tomoko,-
> i will surely try version 8.1 in my environment,
> but if You could also try, then both runs together will
> confirm whether it is a bug.
> No problems at all. i will test it.
> I need to ask one thing: when you ran the example,
> did you have any other entries in the index?
> and i did not understand your statement:
> the fuzzy query worked on 8.1, did you also try 6.6?
> anyways i will know for sure when i test this weekend.
> baris
>
> ----- Original Message -----
> From: tomoko.uchida.1111@gmail.com
> To: baris.kazar@oracle.com
> Sent: Saturday, June 22, 2019 9:14:22 PM GMT -05:00 US/Canada Eastern
> Subject: Re: FuzzyQuery- why is it ignored?
>
> > If You could index these entries and still find Main from MAINS query with Lucene 8.1,
> > that means this is a bug in Lucene 6.6.
>
> No, even if the fuzzy query works correctly for me, it does not mean
> there is a difference between 8.1 and 6.6.
> I'd suggest you try Lucene 8.1 on your own with your environments/settings.
>
> Tomoko
>
> On Sun, Jun 23, 2019 at 2:14 Baris Kazar <baris.kazar@oracle.com> wrote:
> >
> > Tomoko,-
> > may i ask if You could try with these few more data indexed too?
> >
> > "KEHOE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
> > "CHESTNUT NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
> > "JEFFERSON NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
> > "NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
> > "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
> > "NEW HAMPSHIRE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
> >
> > If You could index these entries and still find Main from MAINS query with Lucene 8.1,
> > that means this is a bug in Lucene 6.6.
> > Best regards
> >
> > ----- Original Message -----
> > From: baris.kazar@oracle.com
> > To: java-user@lucene.apache.org, tomoko.uchida.1111@gmail.com, erickerickson@gmail.com, atri@linux.com, baris.kazar@oracle.com, lucene@mikemccandless.com
> > Sent: Thursday, June 13, 2019 10:49:05 AM GMT -05:00 US/Canada Eastern
> > Subject: Re: FuzzyQuery- why is it ignored?
> >
> > i see, i am using an older version 6.6 and we should switch to Your 8.1
> > version or at least 7.X.
> >
> > Tomoko i think i understood You meant MAIN NASHUA .... for the string :)
> >
> > Again i really appreciate all answers.
> >
> > How do we disable or enable stemming while indexing? :) another question.
> >
> > Best regards
> >
> >
> > On 6/13/19 10:40 AM, Tomoko Uchida wrote:
> > > Sorry, I made a mistake when copypasting. Let me just correct my previous mail.
> > >
> > >> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES".
> > > 1. Indexed this text: "MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW
> > > HAMPSHIRE UNITED STATES"
> > >
> > > ----
> > > As far as I can say, this query correctly find the indexed document
> > > (so I have no idea about what is wrong with fuzzy query).
> > > +contentDFLT:mains~2 +contentDFLT:"nashua"
> > > +contentDFLT:"new-hampshire" +contentDFLT:"united states"
> > >
> > > I am
> > > - using lucene 8.1.
> > > - using standard analyzer for both of indexing and searching.
> > > - using classic query parser for parsing.
> > >
> > >
> > >
> > > On Thu, Jun 13, 2019 at 23:18 <baris.kazar@oracle.com> wrote:
> > >> However, the index does not have MAINS but MAIN for the expected entry.
> > >>
> > >> Best regards
> > >>
> > >>
> > >>
> > >> On 6/13/19 10:33 AM, baris.kazar@oracle.com wrote:
> > >>> does it consider it as like plural word? :) :) :)
> > >>> That makes sense.
> > >>>
> > >>> Best regards
> > >>>
> > >>>
> > >>> On 6/13/19 10:31 AM, baris.kazar@oracle.com wrote:
> > >>>> Erick,
> > >>>>
> > >>>> Cool, could You give a simple example with my example please?
> > >>>>
> > >>>> Best regards
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 6/13/19 10:12 AM, Erick Erickson wrote:
> > >>>>> Shot in the dark: stemming. Whenever I see a problem with something
> > >>>>> ending in “s” (or “er” or “ing” or….) my first suspect is that
> > >>>>> stemming is turned on. In that case the token in the index that’s
> > >>>>> actually searched on is somewhat different than you expect.
> > >>>>>
> > >>>>> The test is easy, just ensure your fieldType contains no stemmers.
> > >>>>> PorterStemmer is particularly aggressive, but for this case to test
> > >>>>> I’d just remove all stemming, re-index and see if the results differ.
> > >>>>>
> > >>>>> Best,
> > >>>>> Erick
> > >>>>>
> > >>>>>> On Jun 13, 2019, at 7:26 AM, baris.kazar@oracle.com wrote:
> > >>>>>>
> > >>>>>> Tomoko,-
> > >>>>>>
> > >>>>>> That is strange indeed.
> > >>>>>>
> > >>>>>> Something is wrong when i use mains, but maink, mainl, mainr, mainq,
> > >>>>>> maint all work ok; any consonant at the end except s works in this
> > >>>>>> case.
> > >>>>>>
> > >>>>>> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
> > >>>>>>
> > >>>>>> i am using fuzzy query with ~ from Query.builder and that is not
> > >>>>>> PhraseQuery.
> > >>>>>>
> > >>>>>> Similarly FuzzyQuery with input "mains" (it has to be lowercase
> > >>>>>> since it does not go through StandardAnalyzer) is also not
> > >>>>>> PhraseQuery.
> > >>>>>>
> > >>>>>> can there be a clearer sample case for ComplexPhraseQuery please in
> > >>>>>> the docs?
> > >>>>>>
> > >>>>>> did You also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
> > >>>>>> STATES" the expected output in this case?
> > >>>>>>
> > >>>>>> Thanks for spending time on this, i would like to thank everyone.
> > >>>>>>
> > >>>>>> Best regards
> > >>>>>>
> > >>>>>>
> > >>>>>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
> > >>>>>>> Hi,
> > >>>>>>>
> > >>>>>>>> Ok, i think only this very specific only "mains" has an issue.
> > >>>>>>> It looks strange to me. I did some test locally.
> > >>>>>>>
> > >>>>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE
> > >>>>>>> UNITED STATES".
> > >>>>>>>
> > >>>>>>> 2a. This query string (just copied from your Case #3) worked
> > >>>>>>> correctly for me as far as I can see.
> > >>>>>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
> > >>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
> > >>>>>>>
> > >>>>>>> 2b. However this query string got no results.
> > >>>>>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
> > >>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
> > >>>>>>> It is an expected behaviour because the classic query parser does not
> > >>>>>>> support fuzzy query inside phrase query (as far as I know).
> > >>>>>>>
> > >>>>>>> I suspect you use the fuzzy query operator (~) inside a phrase query
> > >>>>>>> ("), as in the 2b case.
> > >>>>>>>
> > >>>>>>> FYI: there is a special parser for such complex phrase query.
> > >>>>>>> https://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Tomoko
> > >>>>>>>
> > >>>>>>> On Thu, Jun 13, 2019 at 6:16 <baris.kazar@oracle.com> wrote:
> > >>>>>>>> Ok, i think only this very specific only "mains" has an issue.
> > >>>>>>>>
> > >>>>>>>> all i knew about Lucene was fine :) Great...
> > >>>>>>>>
> > >>>>>>>> i have one more question:
> > >>>>>>>>
> > >>>>>>>> which one is advised to use: FuzzyQuery or the Query.parser with
> > >>>>>>>> search string~ appended?
> > >>>>>>>>
> > >>>>>>>> The second one will go through analyzer and make search string
> > >>>>>>>> lowercase.
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
> > >>>>>>>>
> > >>>>>>>> Hi again,-
> > >>>>>>>>
> > >>>>>>>> this is really interesting and i hope i am missing something.
> > >>>>>>>> The index lowercases all entries, so case sensitivity is not an issue
> > >>>>>>>> i think.
> > >>>>>>>>
> > >>>>>>>> Case #1:
> > >>>>>>>>
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> > >>>>>>>> phraseAnalyzer) ;
> > >>>>>>>> Query q1 = null;
> > >>>>>>>> try {
> > >>>>>>>> q1 = parser.parse("Main");
> > >>>>>>>> } catch (ParseException e) {
> > >>>>>>>> e.printStackTrace();
> > >>>>>>>> }
> > >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> This brings with this:
> > >>>>>>>>
> > >>>>>>>> query plan:
> > >>>>>>>>
> > >>>>>>>> [+contentDFLT:main, +contentDFLT:"nashua",
> > >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> > >>>>>>>>
> > >>>>>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after
> > >>>>>>>> exec finished)
> > >>>>>>>>
> > >>>>>>>> Number of results: 12
> > >>>>>>>> Name: Main Dunstable Rd
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12677400
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.72631, -71.50269
> > >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> > >>>>>>>> UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12681980
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76416, -71.46681
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12681973
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75045, -71.4607
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12681974
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76019, -71.465
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main Dunstable Rd
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12677399
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.74641, -71.48943
> > >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> > >>>>>>>> UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: S Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 11893215
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73412, -71.44797
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12681978
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73492, -71.44951
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: S Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 11893214
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73958, -71.45895
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12681979
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76416, -71.46681
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.204945
> > >>>>>>>> ID: 12681977
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.747, -71.45957
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Case #2
> > >>>>>>>>
> > >>>>>>>> This also worked when i added ~ to the word Main to make it a
> > >>>>>>>> fuzzy query:
> > >>>>>>>>
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> > >>>>>>>> phraseAnalyzer) ;
> > >>>>>>>> Query q1 = null;
> > >>>>>>>> try {
> > >>>>>>>> q1 = parser.parse("Main~");
> > >>>>>>>> } catch (ParseException e) {
> > >>>>>>>> e.printStackTrace();
> > >>>>>>>> }
> > >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> query plan:
> > >>>>>>>>
> > >>>>>>>> [+contentDFLT:main~2, +contentDFLT:"nashua",
> > >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> > >>>>>>>>
> > >>>>>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging
> > >>>>>>>> stops)
> > >>>>>>>> Number of results: 12
> > >>>>>>>> Name: Main Dunstable Rd
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12677400
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.72631, -71.50269
> > >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> > >>>>>>>> UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12681980
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76416, -71.46681
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12681973
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75045, -71.4607
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12681974
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76019, -71.465
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main Dunstable Rd
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12677399
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.74641, -71.48943
> > >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
> > >>>>>>>> UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: S Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 11893215
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73412, -71.44797
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12681978
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73492, -71.44951
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: S Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 11893214
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73958, -71.45895
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12681979
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76416, -71.46681
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 41.06405
> > >>>>>>>> ID: 12681977
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.747, -71.45957
> > >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Case #3
> > >>>>>>>>
> > >>>>>>>> But why does this not work in fuzzy mode when i misspell the word a
> > >>>>>>>> bit (1 edit away)? As You saw, the data is there with the Main spelling:
> > >>>>>>>>
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> > >>>>>>>> phraseAnalyzer) ;
> > >>>>>>>>
> > >>>>>>>> Query q1 = null;
> > >>>>>>>> try {
> > >>>>>>>> q1 = parser.parse("Mains~"); // 1 edit away
> > >>>>>>>> } catch (ParseException e) {
> > >>>>>>>> e.printStackTrace();
> > >>>>>>>> }
> > >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> > >>>>>>>>
> > >>>>>>>> query plan:
> > >>>>>>>>
> > >>>>>>>> [+contentDFLT:mains~2, +contentDFLT:"nashua",
> > >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> > >>>>>>>>
> > >>>>>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging
> > >>>>>>>> stops)
> > >>>>>>>>
> > >>>>>>>> Number of results: 0
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Case #4
> > >>>>>>>>
> > >>>>>>>> Then i changed q1 from MUST to SHOULD above, and i think the fuzzy
> > >>>>>>>> query is ignored here since there is no MAIN in the first 468
> > >>>>>>>> results:
> > >>>>>>>>
> > >>>>>>>> there is no boost for the Mains term here.
> > >>>>>>>>
> > >>>>>>>> query plan:
> > >>>>>>>>
> > >>>>>>>> [contentDFLT:mains~2, +contentDFLT:"nashua",
> > >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
> > >>>>>>>>
> > >>>>>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging
> > >>>>>>>> stops)
> > >>>>>>>> Number of results: 1794
> > >>>>>>>> Name: Nashua Dr
> > >>>>>>>> Score: 34.186226
> > >>>>>>>> ID: 4974936
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.7636, -71.46063
> > >>>>>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Nashua River Rail Trl
> > >>>>>>>> Score: 34.186226
> > >>>>>>>> ID: 4975508
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.7062, -71.53962
> > >>>>>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE
> > >>>>>>>> UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Nashua Rd
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 4975388
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.78746, -71.92823
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: NASHUA
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 21014865
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75873, -71.46438
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: NASHUA
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 21014865
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75873, -71.46438
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: NASHUA
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 21014865
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75873, -71.46438
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: NASHUA
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 21014865
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75873, -71.46438
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: NASHUA
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 21014865
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.75873, -71.46438
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Nashua St
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 4975671
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.88471, -70.81687
> > >>>>>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>> Name: Nashua Rd
> > >>>>>>>> Score: 33.84896
> > >>>>>>>> ID: 4975400
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.79014, -71.92364
> > >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Why is the fuzzy query ignored?
> > >>>>>>>> Even if i have separate fields for street, city,region, country,
> > >>>>>>>> this fuzzy query issue will come into place for words with
> > >>>>>>>> multiple parts like main dunstable etc., right?
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
> > >>>>>>>>
> > >>>>>>>> Tomoko,-
> > >>>>>>>>
> > >>>>>>>> Thank You for Your suggestions. i am trying to understand it
> > >>>>>>>> and i thought i did :)
> > >>>>>>>>
> > >>>>>>>> but it does not work with FuzzyQuery when i used with a *single*
> > >>>>>>>> large TextField like street=...value... city=...value...
> > >>>>>>>> region=...value... country=...value... (with or without quotes
> > >>>>>>>> for the values)
> > >>>>>>>>
> > >>>>>>>> What i knew about Lucene fuzzy queries does not hold with this
> > >>>>>>>> TextField form. That is why i suspected a bug.
> > >>>>>>>>
> > >>>>>>>> 1. Yes, i saw and have a solid proof on that now.
> > >>>>>>>>
> > >>>>>>>> 2. yes, but FuzzyQuery takes the quotes as they are, since they are
> > >>>>>>>> escaped and the term is not analyzed.
> > >>>>>>>>
> > >>>>>>>> Stuffing into one textfield vs having separate fields should only
> > >>>>>>>> affect probably the performance but not the outcome in my case.
> > >>>>>>>> But, i have been thinking about this and maybe it is the way to
> > >>>>>>>> go in this case.
> > >>>>>>>>
> > >>>>>>>> My CONTENT field has street names in mixed case and city, region,
> > >>>>>>>> country names in UPPERCASE. Can this be a problem?
> > >>>>>>>> i thought the index stored them in lowercase since i am using
> > >>>>>>>> StandardAnalyzer.
> > >>>>>>>>
> > >>>>>>>> CONTENT field also has full textfield string with street=...
> > >>>>>>>> city=... region=... country=... (here all values are UPPERCASE).
> > >>>>>>>>
> > >>>>>>>> Why can't the index find the names via FuzzyQuery? i tried both
> > >>>>>>>> FuzzyQuery and Query builder as i showed before.
> > >>>>>>>>
> > >>>>>>>> The last advice in Your previous email would nicely go outside
> > >>>>>>>> the parentheses since it might be very critical :) :) :)
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
> > >>>>>>>>
> > >>>>>>>> I'd suggest correctly understanding how the software works before
> > >>>>>>>> suspecting a bug :-)
> > >>>>>>>>
> > >>>>>>>> I guess you may miss two points:
> > >>>>>>>>
> > >>>>>>>> 1. the standard analyzer (standard tokenizer) breaks words at double
> > >>>>>>>> quotes (U+0022), so quotes are not indexed or searched at all if you
> > >>>>>>>> are using the standard analyzer. (That is the reason you get the same
> > >>>>>>>> results with or without quotes.)
> > >>>>>>>> See:
> > >>>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_core_org_apache_lucene_analysis_standard_StandardTokenizer.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=8E2lp1YIGM-3v3FspeieGl8z8rEBs6qioTudtFNzh8c&e=
> > >>>>>>>> and
> > >>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__unicode.org_reports_tr29_&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=riCZ_f25XW869CKbHPUqfbLiDU-AukE6la0xTLMw6u8&e=
> > >>>>>>>>
> > >>>>>>>> 2. the double quote has a special meaning (it is interpreted as a
> > >>>>>>>> phrase query) in the built-in query parser, so you need to escape it
> > >>>>>>>> if you want to search for the double quote itself.
> > >>>>>>>> See:
> > >>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Terms&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=t8OYTgidvcwNpAVFuTsqGhDJK5BwUZVCxc0mPHzqCYU&e=
> > >>>>>>>>
> > >>>>>>>> (My advice would be to create a separate field for each key-value
> > >>>>>>>> pair instead of stuffing all pairs into one text field, if you need
> > >>>>>>>> to search them separately.)
> > >>>>>>>>
> > >>>>>>>> On Wed, Jun 12, 2019 at 2:39, <baris.kazar@oracle.com> wrote:
> > >>>>>>>>
> > >>>>>>>> i can say that quotes are not the issue with the index, as it still
> > >>>>>>>> returns the same results with or without quotes.
> > >>>>>>>>
> > >>>>>>>> i am starting to feel that this might be a bug maybe??
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
> > >>>>>>>>
> > >>>>>>>> Somehow " is causing an issue as this should return street with
> > >>>>>>>> MAIN:
> > >>>>>>>>
> > >>>>>>>> [.contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
> > >>>>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
> > >>>>>>>> states"] -> this was with fuzzyquery on MAINS
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
> > >>>>>>>>
> > >>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> > >>>>>>>> +contentDFLT:"country united states", contentDFLT:street
> > >>>>>>>> contentDFLT:mains]
> > >>>>>>>>
> > >>>>>>>> QueryParser chops it into two pieces from
> > >>>>>>>> parser.parse("street=\"MAINS\"");
> > >>>>>>>>
> > >>>>>>>> The index has a TextField named contentDFLT with the following data:
> > >>>>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
> > >>>>>>>> HAMPSHIRE" country="UNITED STATES"
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> When i set street=\"MAINS~\" with parser:
> > >>>>>>>> i get the following
> > >>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> > >>>>>>>> +contentDFLT:"country united states", contentDFLT:street
> > >>>>>>>> contentDFLT:mains]
> > >>>>>>>>
> > >>>>>>>> probably " quotations are messing this up as You were saying...
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
> > >>>>>>>>
> > >>>>>>>> Or, " (double quotation) in your query string may affect query
> > >>>>>>>> parsing.
> > >>>>>>>>
> > >>>>>>>> When I parse this string by classic query parser (lucene 8.1),
> > >>>>>>>> street="MAINS~"
> > >>>>>>>> parsed (raw) query is
> > >>>>>>>> text:street text:mains
> > >>>>>>>> (I set the default search field to "text", so text:xxxx appears
> > >>>>>>>> here.)
> > >>>>>>>>
> > >>>>>>>> Query parsing is a complex process, so it would be good to check
> > >>>>>>>> parsed raw query string especially when you have (reserved) special
> > >>>>>>>> characters in your query...
> > >>>>>>>>
> > >>>>>>>> On Tue, Jun 11, 2019 at 1:10, Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>> Hi,
> > >>>>>>>>
> > >>>>>>>> I noticed one small thing in your previous mail.
> > >>>>>>>>
> > >>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
> > >>>>>>>>
> > >>>>>>>> which is good.
> > >>>>>>>>
> > >>>>>>>> To specify a search field, ":" (colon) should be used instead of "=".
> > >>>>>>>> See the query parser documentation:
> > >>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Fields&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=u4SeJqH4lePhOazCLwxLEr3WqcMkODtYLv4njiKZ4PM&s=WrNfUXO9gz1PqpczTJw1vD9sWqvr76WRv2Aeo9uWqa4&e=
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I'm not sure this is related to your problem.
> > >>>>>>>>
> > >>>>>>>> On Tue, Jun 11, 2019 at 0:51, <baris.kazar@oracle.com> wrote:
> > >>>>>>>>
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
> > >>>>>>>>
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> > >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> > >>>>>>>> phraseAnalyzer) ;
> > >>>>>>>> Query q1 = null;
> > >>>>>>>> try {
> > >>>>>>>> q1 = parser.parse("MAIN");
> > >>>>>>>> } catch (ParseException e) {
> > >>>>>>>>
> > >>>>>>>> e.printStackTrace();
> > >>>>>>>> }
> > >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
> > >>>>>>>>
> > >>>>>>>> testQuerySearch2 Time to compute: 0 seconds
> > >>>>>>>> Number of results: 1775
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 37.20959
> > >>>>>>>> ID: 12681979
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.76416, -71.46681
> > >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> > >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 37.20959
> > >>>>>>>> ID: 12681977
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.747, -71.45957
> > >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> > >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> > >>>>>>>>
> > >>>>>>>> Name: Main St
> > >>>>>>>> Score: 37.20959
> > >>>>>>>> ID: 12681978
> > >>>>>>>> Country Code: US
> > >>>>>>>> Coordinates: 42.73492, -71.44951
> > >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> > >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> > >>>>>>>>
> > >>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get the same
> > >>>>>>>> results, which is good.
> > >>>>>>>>
> > >>>>>>>> But when i switch to MAINS~ then the fuzzy query does not work.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> i need to say something about using only q1 in the BooleanQuery:
> > >>>>>>>> it tries to match MAIN in street, city, region and country, which
> > >>>>>>>> are all in a single TextField field.
> > >>>>>>>> But i don't want this. That is why i need street="..." etc. when
> > >>>>>>>> searching.
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
> > >>>>>>>>
> > >>>>>>>> Hi,
> > >>>>>>>>
> > >>>>>>>> just for the basic verification, can you find the document without
> > >>>>>>>> fuzzy query? I mean, does this query work for you?
> > >>>>>>>>
> > >>>>>>>> Query query = parser.parse("MAIN");
> > >>>>>>>>
> > >>>>>>>> Tomoko
> > >>>>>>>>
> > >>>>>>>> On Tue, Jun 11, 2019 at 0:22, <baris.kazar@oracle.com> wrote:
> > >>>>>>>>
> > >>>>>>>> why does the second set not work at all?
> > >>>>>>>>
> > >>>>>>>> it is indexed as a TextField like street="..." city="..." etc.
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
> > >>>>>>>>
> > >>>>>>>> i don't know how to use FuzzyQuery with QueryParser, but probably
> > >>>>>>>> You are suggesting
> > >>>>>>>>
> > >>>>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
> > >>>>>>>> Query query = parser.parse("MAINS~2");
> > >>>>>>>>
> > >>>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
> > >>>>>>>>
> > >>>>>>>> am i right?
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
> > >>>>>>>>
> > >>>>>>>> I would suggest using a QueryParser for your fuzzy query before
> > >>>>>>>> adding it to the Boolean query. This should weed out any case
> > >>>>>>>> issues.
> > >>>>>>>>
> > >>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com> wrote:
> > >>>>>>>>
> > >>>>>>>> BooleanQuery.Builder booleanQuery = new
> > >>>>>>>> BooleanQuery.Builder();
> > >>>>>>>>
> > >>>>>>>> //First set
> > >>>>>>>>
> > >>>>>>>> booleanQuery.add(new FuzzyQuery(new
> > >>>>>>>> org.apache.lucene.index.Term(field, "MAINS")),
> > >>>>>>>> BooleanClause.Occur.SHOULD);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> > >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> > >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> > >>>>>>>>
> > >>>>>>>> // Second set
> > >>>>>>>> //booleanQuery.add(new FuzzyQuery(new
> > >>>>>>>> //    org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
> > >>>>>>>> //    BooleanClause.Occur.SHOULD);
> > >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> > >>>>>>>> //    field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
> > >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> > >>>>>>>> //    field, "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
> > >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> > >>>>>>>> //    field, "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
> > >>>>>>>>
> > >>>>>>>> The first set also returns streets named Nashua (NASHUA).
> > >>>>>>>>
> > >>>>>>>> so, to prevent that, and since i also indexed with street="..."
> > >>>>>>>> city="...", i tried the second set, but it does not return anything.
> > >>>>>>>>
> > >>>>>>>> createPhraseQuery builds a PhraseQuery with one term equal to the
> > >>>>>>>> string in the call.
> > >>>>>>>>
> > >>>>>>>> Best regards
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 6/10/19 10:47 AM, baris.kazar@oracle.com wrote:
> > >>>>>>>> > How do i check how it is indexed? lowercase or uppercase?
> > >>>>>>>> >
> > >>>>>>>> > the only way now is by testing.
> > >>>>>>>> >
> > >>>>>>>> > i am using standardanalyzer.
> > >>>>>>>> >
> > >>>>>>>> > Best regards
> > >>>>>>>> >
> > >>>>>>>> >
> > >>>>>>>> > On 6/9/19 11:57 AM, Atri Sharma wrote:
> > >>>>>>>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
> > >>>>>>>> >> <tomoko.uchida.1111@gmail.com> wrote:
> > >>>>>>>> >>> Hi,
> > >>>>>>>> >>>
> > >>>>>>>> >>> What analyzer do you use for the text field? Is the term "Main"
> > >>>>>>>> >>> correctly indexed?
> > >>>>>>>> >> Agreed. Also, it would be good if you could post your actual code.
> > >>>>>>>> >>
> > >>>>>>>> >> What analyzer are you using? If you are using StandardAnalyzer, then
> > >>>>>>>> >> all of your terms while indexing will be lowercased, AFAIK, but your
> > >>>>>>>> >> query will not be analyzed until you run a QueryParser on it.
> > >>>>>>>> >>
> > >>>>>>>> >>
> > >>>>>>>> >> Atri
> > >>>>>>>> >>
> > >>>>>>>> >
> > >>>>>>>> >
> > >>>>>>>> >
> > >>>>>>>> > ---------------------------------------------------------------------
> > >>>>>>>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > >>>>>>>> > For additional commands, e-mail: java-user-help@lucene.apache.org
Re: FuzzyQuery- why is it ignored? [ In reply to ]
i tested this on Lucene 7.7.2 and got the same result: MAINS cannot find
MAIN, but all other consonant endings can be found.
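
The expectation behind this report can be checked without Lucene. The sketch below is a plain Levenshtein distance in JDK-only Java; the class and method names are hypothetical and it is not Lucene's FuzzyQuery implementation (Lucene by default also counts a transposition as one edit, which does not matter for these inputs). It only confirms the distance arithmetic: every variant, including "mains", is one edit away from "main", so the distance by itself gives no reason for "mains" to behave differently. It says nothing about how Lucene expands or scores the candidate terms.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: plain Levenshtein distance, not Lucene's FuzzyQuery.
class EditDistanceCheck {

    // Classic two-row dynamic programming over insert/delete/substitute.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Distance from each query variant to the indexed term, in call order.
    static Map<String, Integer> distancesTo(String indexedTerm, String... variants) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (String v : variants) {
            out.put(v, levenshtein(v, indexedTerm));
        }
        return out;
    }

    public static void main(String[] args) {
        // The variants are the ones mentioned in this thread.
        distancesTo("main", "mains", "maink", "mainl", "mainr", "mainq", "maint")
            .forEach((variant, d) -> System.out.println(variant + " -> main: " + d));
    }
}
```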

i am now confident that this is a bug with Lucene.

Best regards

PS. Lucene 8.1 has drastic changes, such as StandardFilter being removed
from one of the packages; i asked about this in another thread.
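
A related point from earlier in this thread: a term handed directly to FuzzyQuery is not analyzed, while StandardAnalyzer lowercases terms at index time. The JDK-only sketch below (hypothetical class and method names, plain Levenshtein rather than Lucene's own automaton) shows why a raw uppercase term cannot match under a maximum of 2 edits, assuming the indexed term is the lowercased "main": every character differs by case, so the distance is 5. After lowercasing the query term the distance drops to 1.

```java
import java.util.Locale;

// Illustrative only: models the effect of skipping analysis on a fuzzy term.
class RawTermCaseCheck {

    // Plain Levenshtein distance (insert/delete/substitute), case-sensitive.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // True when the query term is within maxEdits of the indexed term.
    static boolean withinMaxEdits(String queryTerm, String indexedTerm, int maxEdits) {
        return levenshtein(queryTerm, indexedTerm) <= maxEdits;
    }

    public static void main(String[] args) {
        String indexed = "main"; // assumed indexed form after lowercasing
        String raw = "MAINS";    // term as passed to a hand-built fuzzy query

        // Raw term: distance 5, outside a maximum of 2 edits.
        System.out.println(withinMaxEdits(raw, indexed, 2));
        // Lowercased term: distance 1, inside a maximum of 2 edits.
        System.out.println(withinMaxEdits(raw.toLowerCase(Locale.ROOT), indexed, 2));
    }
}
```

This is the case issue that running the term through a query parser (or lowercasing it by hand) avoids; it does not explain the remaining "mains~" result, where the parsed query is already lowercase.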



On 6/23/19 12:29 AM, Baris Kazar wrote:
> oops, sorry about this. i also automatically hit reply and
> assumed it went to the list.
>
> i totally agree about your recommendations and
> totally agree it should have been sent to the forum emailing list.
>
> please accept my apologies for not confirming the sent address.
>
> i will try 8.1 tomorrow on my env.
>
> baris
>
> ----- Original Message -----
> From: tomoko.uchida.1111@gmail.com
> To: java-user@lucene.apache.org
> Sent: Saturday, June 22, 2019 10:35:26 PM GMT -05:00 US/Canada Eastern
> Subject: Re: FuzzyQuery- why is it ignored?
>
> Please send messages to java-user mail list only. It is not
> recommended to send questions to someone's private mail address.
> (I accidentally sent a reply without including java-user in "To",
> because the "Reply-To" header was not correctly set in your previous
> mail.)
>
> I did not run any test with Lucene 6.6, and won't try it until
> reproducible results/conditions are provided.
>
> On Sun, Jun 23, 2019 at 10:40, Baris Kazar <baris.kazar@oracle.com> wrote:
>
>> Tomoko,-
>> i will surely try version 8.1 on my env,
>> but if You could also try, then both runs will
>> confirm whether it is a bug.
>> No problems at all. i will test it.
>> I need to ask one thing when you ran the example
>> did you have any other entries in the index?
>> and i did not understand your statement:
>> the fuzzy query worked on 8.1, did you also try 6.6?
>> anyways i will know for sure when i test this weekend.
>> baris
>>
>> ----- Original Message -----
>> From: tomoko.uchida.1111@gmail.com
>> To: baris.kazar@oracle.com
>> Sent: Saturday, June 22, 2019 9:14:22 PM GMT -05:00 US/Canada Eastern
>> Subject: Re: FuzzyQuery- why is it ignored?
>>
>>> If You could index these entries and still find Main from MAINS query with Lucene 8.1,
>>> that means this is a bug in Lucene 6.6.
>> No, it does not mean there is a difference between 8.1 and 6.6 if the
>> fuzzy query works correctly for me.
>> I'd suggest you try Lucene 8.1 on your own with your environments/settings.
>>
>> Tomoko
>>
>> On Sun, Jun 23, 2019 at 2:14, Baris Kazar <baris.kazar@oracle.com> wrote:
>>> Tomoko,-
>>> may i ask if You could try with these few more data indexed too?
>>>
>>> "KEHOE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
>>> "CHESTNUT NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
>>> "JEFFERSON NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
>>> "NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
>>> "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
>>> "NEW HAMPSHIRE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
>>>
>>> If You could index these entries and still find Main from MAINS query with Lucene 8.1,
>>> that means this is a bug in Lucene 6.6.
>>> Best regards
>>>
>>> ----- Original Message -----
>>> From: baris.kazar@oracle.com
>>> To: java-user@lucene.apache.org, tomoko.uchida.1111@gmail.com, erickerickson@gmail.com, atri@linux.com, baris.kazar@oracle.com, lucene@mikemccandless.com
>>> Sent: Thursday, June 13, 2019 10:49:05 AM GMT -05:00 US/Canada Eastern
>>> Subject: Re: FuzzyQuery- why is it ignored?
>>>
>>> i see, i am using an older version 6.6 and we should switch to Your 8.1
>>> version or at least 7.X.
>>>
>>> Tomoko i think i understood You meant MAIN NASHUA .... for the string :)
>>>
>>> Again i really appreciate all answers.
>>>
>>> How do we disable or enable stemming while indexing? :) another question.
>>>
>>> Best regards
>>>
>>>
>>> On 6/13/19 10:40 AM, Tomoko Uchida wrote:
>>>> Sorry, I made a mistake when copypasting. Let me just correct my previous mail.
>>>>
>>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES".
>>>> 1. Indexed this text: "MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW
>>>> HAMPSHIRE UNITED STATES"
>>>>
>>>> ----
>>>> As far as I can tell, this query correctly finds the indexed document
>>>> (so I have no idea what is wrong with the fuzzy query).
>>>> +contentDFLT:mains~2 +contentDFLT:"nashua"
>>>> +contentDFLT:"new-hampshire" +contentDFLT:"united states"
>>>>
>>>> I am
>>>> - using lucene 8.1.
>>>> - using standard analyzer for both indexing and searching.
>>>> - using classic query parser for parsing.
>>>>
>>>>
>>>>
>>>> 2019-06-13 23:18 <baris.kazar@oracle.com>:
>>>>> However, the index does not have MAINS but MAIN for the expected entry.
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>>
>>>>> On 6/13/19 10:33 AM, baris.kazar@oracle.com wrote:
>>>>>> does it consider it as like plural word? :) :) :)
>>>>>> That makes sense.
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On 6/13/19 10:31 AM, baris.kazar@oracle.com wrote:
>>>>>>> Erick,
>>>>>>>
>>>>>>> Cool, could You give a simple example with my example please?
>>>>>>>
>>>>>>> Best regards
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 6/13/19 10:12 AM, Erick Erickson wrote:
>>>>>>>> Shot in the dark: stemming. Whenever I see a problem with something
>>>>>>>> ending in “s” (or “er” or “ing” or….) my first suspect is that
>>>>>>>> stemming is turned on. In that case the token in the index that’s
>>>>>>>> actually searched on is somewhat different than you expect.
>>>>>>>>
>>>>>>>> The test is easy, just ensure your fieldType contains no stemmers.
>>>>>>>> PorterStemmer is particularly aggressive, but for this case to test
>>>>>>>> I’d just remove all stemming, re-index and see if the results differ.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Erick
>>>>>>>>
>>>>>>>>> On Jun 13, 2019, at 7:26 AM, baris.kazar@oracle.com wrote:
>>>>>>>>>
>>>>>>>>> Tomoko,-
>>>>>>>>>
>>>>>>>>> That is strange indeed.
>>>>>>>>>
>>>>>>>>> Something is wrong when i use mains, but maink, mainl, mainr, mainq
>>>>>>>>> and maint all work ok: any consonant at the end except s works in
>>>>>>>>> this case.
>>>>>>>>>
>>>>>>>>> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
>>>>>>>>>
>>>>>>>>> i am using fuzzy query with ~ from Query.builder and that is not
>>>>>>>>> PhraseQuery.
>>>>>>>>>
>>>>>>>>> Similarly FuzzyQuery with input "mains" (it has to be lowercase
>>>>>>>>> since it does not go through StandardAnalyzer) is also not
>>>>>>>>> PhraseQuery.
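
A small sketch of why the case of the FuzzyQuery term matters, assuming the index holds the lowercased term "main" (as StandardAnalyzer would produce) and that the fuzzy match allows at most 2 edits. EditDistanceDemo is a hypothetical helper, not part of Lucene, and it computes plain Levenshtein distance; Lucene's own fuzzy matching is automaton-based and by default also counts transpositions, so this only illustrates the distances involved.

```java
import java.util.Locale;

// Hypothetical helper: plain Levenshtein distance (insert, delete, substitute).
final class EditDistanceDemo {

    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexed = "main"; // assumed form of the term in the index
        // Lowercase input: 1 edit away, within a 2-edit limit.
        System.out.println(distance("mains", indexed)); // 1
        // Uppercase input compared case-sensitively: every character differs.
        System.out.println(distance("MAINS", indexed)); // 5
        // Lowercasing first brings it back to 1 edit.
        System.out.println(distance("MAINS".toLowerCase(Locale.ROOT), indexed)); // 1
    }
}
```

So a term handed directly to FuzzyQuery as "MAINS" is far outside the edit limit, while "mains" is not. This does not explain the Case #3 result, where the parsed query was already lowercase.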
>>>>>>>>>
>>>>>>>>> can there be a clearer sample case for ComplexPhraseQuery please in
>>>>>>>>> the docs?
>>>>>>>>>
>>>>>>>>> did You also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
>>>>>>>>> STATES" the expected output in this case?
>>>>>>>>>
>>>>>>>>> Thanks for spending time on this, i would like to thank everyone.
>>>>>>>>>
>>>>>>>>> Best regards
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>>> Ok, i think only this very specific only "mains" has an issue.
>>>>>>>>>> It looks strange to me. I did some test locally.
>>>>>>>>>>
>>>>>>>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>>>> UNITED STATES".
>>>>>>>>>>
>>>>>>>>>> 2a. This query string (just copied from your Case #3) worked
>>>>>>>>>> correctly for me as far as I can see.
>>>>>>>>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
>>>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
>>>>>>>>>>
>>>>>>>>>> 2b. However this query string got no results.
>>>>>>>>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
>>>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
>>>>>>>>>> It is an expected behaviour because the classic query parser does not
>>>>>>>>>> support fuzzy query inside phrase query (as far as I know).
>>>>>>>>>>
>>>>>>>>>> I suspect you use the fuzzy query operator (~) inside a phrase
>>>>>>>>>> query ("), as in the 2b case.
>>>>>>>>>>
>>>>>>>>>> FYI: there is a special parser for such complex phrase query.
>>>>>>>>>> https://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Tomoko
>>>>>>>>>>
>>>>>>>>>> 2019-06-13 6:16 <baris.kazar@oracle.com>:
>>>>>>>>>>> Ok, i think only this very specific only "mains" has an issue.
>>>>>>>>>>>
>>>>>>>>>>> all i knew about Lucene was fine :) Great...
>>>>>>>>>>>
>>>>>>>>>>> i have one more question:
>>>>>>>>>>>
>>>>>>>>>>> which one is advised to use: FuzzyQuery or the Query.parser with
>>>>>>>>>>> search string~ appended?
>>>>>>>>>>>
>>>>>>>>>>> The second one will go through analyzer and make search string
>>>>>>>>>>> lowercase.
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi again,-
>>>>>>>>>>>
>>>>>>>>>>> this is really interesting and i hope i am missing something.
>>>>>>>>>>> The index lowercases all entries, so case sensitivity is not an
>>>>>>>>>>> issue i think.
>>>>>>>>>>>
>>>>>>>>>>> Case #1:
>>>>>>>>>>>
>>>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser =
>>>>>>>>>>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>>>>>>>>>>> Query q1 = null;
>>>>>>>>>>> try {
>>>>>>>>>>>     q1 = parser.parse("Main");
>>>>>>>>>>> } catch (ParseException e) {
>>>>>>>>>>>     e.printStackTrace();
>>>>>>>>>>> }
>>>>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This brings with this:
>>>>>>>>>>>
>>>>>>>>>>> query plan:
>>>>>>>>>>>
>>>>>>>>>>> [+contentDFLT:main, +contentDFLT:"nashua",
>>>>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>>>>
>>>>>>>>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after
>>>>>>>>>>> exec finished)
>>>>>>>>>>>
>>>>>>>>>>> Number of results: 12
>>>>>>>>>>> Name: Main Dunstable Rd
>>>>>>>>>>> Score: 41.204945
>>>>>>>>>>> ID: 12677400
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.72631, -71.50269
>>>>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>>>>> UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 41.204945
>>>>>>>>>>> ID: 12681980
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 41.204945
>>>>>>>>>>> ID: 12681973
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.75045, -71.4607
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 41.204945
>>>>>>>>>>> ID: 12681974
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.76019, -71.465
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main Dunstable Rd
>>>>>>>>>>> Score: 41.204945
>>>>>>>>>>> ID: 12677399
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.74641, -71.48943
>>>>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>>>>> UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: S Main St
>>>>>>>>>>> Score: 41.204945
>>>>>>>>>>> ID: 11893215
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.73412, -71.44797
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 41.204945
>>>>>>>>>>> ID: 12681978
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: S Main St
>>>>>>>>>>> Score: 41.204945
>>>>>>>>>>> ID: 11893214
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.73958, -71.45895
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 41.204945
>>>>>>>>>>> ID: 12681979
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 41.204945
>>>>>>>>>>> ID: 12681977
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Case #2
>>>>>>>>>>>
>>>>>>>>>>> This also worked when i added ~ to the word Main to make it a
>>>>>>>>>>> fuzzy query:
>>>>>>>>>>>
>>>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser =
>>>>>>>>>>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>>>>>>>>>>> Query q1 = null;
>>>>>>>>>>> try {
>>>>>>>>>>>     q1 = parser.parse("Main~");
>>>>>>>>>>> } catch (ParseException e) {
>>>>>>>>>>>     e.printStackTrace();
>>>>>>>>>>> }
>>>>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> query plan:
>>>>>>>>>>>
>>>>>>>>>>> [+contentDFLT:main~2, +contentDFLT:"nashua",
>>>>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>>>>
>>>>>>>>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging
>>>>>>>>>>> stops)
>>>>>>>>>>> Number of results: 12
>>>>>>>>>>> Name: Main Dunstable Rd
>>>>>>>>>>> Score: 41.06405
>>>>>>>>>>> ID: 12677400
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.72631, -71.50269
>>>>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>>>>> UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 41.06405
>>>>>>>>>>> ID: 12681980
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 41.06405
>>>>>>>>>>> ID: 12681973
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.75045, -71.4607
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 41.06405
>>>>>>>>>>> ID: 12681974
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.76019, -71.465
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main Dunstable Rd
>>>>>>>>>>> Score: 41.06405
>>>>>>>>>>> ID: 12677399
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.74641, -71.48943
>>>>>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>>>>> UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: S Main St
>>>>>>>>>>> Score: 41.06405
>>>>>>>>>>> ID: 11893215
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.73412, -71.44797
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 41.06405
>>>>>>>>>>> ID: 12681978
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: S Main St
>>>>>>>>>>> Score: 41.06405
>>>>>>>>>>> ID: 11893214
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.73958, -71.45895
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 41.06405
>>>>>>>>>>> ID: 12681979
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 41.06405
>>>>>>>>>>> ID: 12681977
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Case #3
>>>>>>>>>>>
>>>>>>>>>>> But why does this not work in fuzzy mode when i misspell the word
>>>>>>>>>>> a bit (1 edit away)? As You saw, the data is there with the Main
>>>>>>>>>>> spelling:
>>>>>>>>>>>
>>>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser =
>>>>>>>>>>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>>>>>>>>>>>
>>>>>>>>>>> Query q1 = null;
>>>>>>>>>>> try {
>>>>>>>>>>>     q1 = parser.parse("Mains~"); // 1 edit away
>>>>>>>>>>> } catch (ParseException e) {
>>>>>>>>>>>     e.printStackTrace();
>>>>>>>>>>> }
>>>>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>>
>>>>>>>>>>> query plan:
>>>>>>>>>>>
>>>>>>>>>>> [+contentDFLT:mains~2, +contentDFLT:"nashua",
>>>>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>>>>
>>>>>>>>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging
>>>>>>>>>>> stops)
>>>>>>>>>>>
>>>>>>>>>>> Number of results: 0
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Case #4
>>>>>>>>>>>
>>>>>>>>>>> Then i changed q1 from MUST to SHOULD above, and i think the fuzzy
>>>>>>>>>>> query is ignored here since there is no MAIN in the first 468
>>>>>>>>>>> results:
>>>>>>>>>>>
>>>>>>>>>>> there is no boost for Mains term here.
>>>>>>>>>>>
>>>>>>>>>>> query plan:
>>>>>>>>>>>
>>>>>>>>>>> [contentDFLT:mains~2, +contentDFLT:"nashua",
>>>>>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>>>>>>
>>>>>>>>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging
>>>>>>>>>>> stops)
>>>>>>>>>>> Number of results: 1794
>>>>>>>>>>> Name: Nashua Dr
>>>>>>>>>>> Score: 34.186226
>>>>>>>>>>> ID: 4974936
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.7636, -71.46063
>>>>>>>>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Nashua River Rail Trl
>>>>>>>>>>> Score: 34.186226
>>>>>>>>>>> ID: 4975508
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.7062, -71.53962
>>>>>>>>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>>>>>> UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Nashua Rd
>>>>>>>>>>> Score: 33.84896
>>>>>>>>>>> ID: 4975388
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.78746, -71.92823
>>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: NASHUA
>>>>>>>>>>> Score: 33.84896
>>>>>>>>>>> ID: 21014865
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: NASHUA
>>>>>>>>>>> Score: 33.84896
>>>>>>>>>>> ID: 21014865
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: NASHUA
>>>>>>>>>>> Score: 33.84896
>>>>>>>>>>> ID: 21014865
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: NASHUA
>>>>>>>>>>> Score: 33.84896
>>>>>>>>>>> ID: 21014865
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: NASHUA
>>>>>>>>>>> Score: 33.84896
>>>>>>>>>>> ID: 21014865
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.75873, -71.46438
>>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Nashua St
>>>>>>>>>>> Score: 33.84896
>>>>>>>>>>> ID: 4975671
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.88471, -70.81687
>>>>>>>>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>> Name: Nashua Rd
>>>>>>>>>>> Score: 33.84896
>>>>>>>>>>> ID: 4975400
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.79014, -71.92364
>>>>>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Why is the fuzzy query ignored?
>>>>>>>>>>> Even if i have separate fields for street, city,region, country,
>>>>>>>>>>> this fuzzy query issue will come into place for words with
>>>>>>>>>>> multiple parts like main dunstable etc., right?
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
>>>>>>>>>>>
>>>>>>>>>>> Tomoko,-
>>>>>>>>>>>
>>>>>>>>>>> Thank You for Your suggestions. i am trying to understand it
>>>>>>>>>>> and i thought i did :)
>>>>>>>>>>>
>>>>>>>>>>> but it does not work with FuzzyQuery when i used with a *single*
>>>>>>>>>>> large TextField like street=...value... city=...value...
>>>>>>>>>>> region=...value... country=...value... (with or without quotes
>>>>>>>>>>> for the values)
>>>>>>>>>>>
>>>>>>>>>>> What i knew about Lucene fuzzy queries is not holding now with
>>>>>>>>>>> this TextField form. That is why i suspected a bug.
>>>>>>>>>>>
>>>>>>>>>>> 1. Yes, i saw and have a solid proof on that now.
>>>>>>>>>>>
>>>>>>>>>>> 2. yes, but FuzzyQuery takes the quotes as they are, since they
>>>>>>>>>>> are escaped and the term is not analyzed.
>>>>>>>>>>>
>>>>>>>>>>> Stuffing into one textfield vs having separate fields should only
>>>>>>>>>>> affect probably the performance but not the outcome in my case.
>>>>>>>>>>> But, i have been thinking about this and maybe it is the way to
>>>>>>>>>>> go in this case.
>>>>>>>>>>>
>>>>>>>>>>> My CONTENT field has street names in mixed case and city, region
>>>>>>>>>>> and country names in UPPERCASE. Can this be a problem?
>>>>>>>>>>> i thought index stored them in lowercase since i am using
>>>>>>>>>>> StandardAnalyzer.
>>>>>>>>>>>
>>>>>>>>>>> CONTENT field also has full textfield string with street=...
>>>>>>>>>>> city=... region=... country=... (here all values are UPPERCASE).
>>>>>>>>>>>
>>>>>>>>>>> Why cant the index find the names via FuzzyQuery? i tried both
>>>>>>>>>>> FuzzyQuery and Query builder as i showed before.
>>>>>>>>>>>
>>>>>>>>>>> The last advice in Your previous email would nicely go outside
>>>>>>>>>>> the parentheses since it might be very critical :) :) :)
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>>>>>>>>>>>
>>>>>>>>>>> I'd suggest understanding how the software works before
>>>>>>>>>>> suspecting a bug in it :-)
>>>>>>>>>>>
>>>>>>>>>>> I guess you may miss two points:
>>>>>>>>>>>
>>>>>>>>>>> 1. the standard analyzer (standard tokenizer) breaks words at double
>>>>>>>>>>> quotes (U+0022), so quotes are not indexed or searched at all if you
>>>>>>>>>>> are using the standard analyzer. (That is the reason you get the same
>>>>>>>>>>> results with or without quotes.)
>>>>>>>>>>> See:
>>>>>>>>>>> https://lucene.apache.org/core/8_1_0/core/org/apache/lucene/analysis/standard/StandardTokenizer.html
>>>>>>>>>>> and
>>>>>>>>>>> http://unicode.org/reports/tr29/
>>>>>>>>>>>
>>>>>>>>>>> 2. the double quote has a special meaning (it is interpreted as a
>>>>>>>>>>> phrase query) in the built-in query parser, so you need to escape it
>>>>>>>>>>> if you want to search for the double quote itself.
>>>>>>>>>>> See:
>>>>>>>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Terms
>>>>>>>>>>>
>>>>>>>>>>> (My advice would be to create separate fields for each key value
>>>>>>>>>>> pair instead of stuffing all pairs into one text field, if you need
>>>>>>>>>>> to search them separately.)
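
A rough sketch of point 1 above, showing why the quotes and "=" never reach the index. This is not the real StandardTokenizer, which follows the UAX#29 word-break rules; SimpleTokenizeDemo and tokens() are hypothetical names, and splitting on every character that is not a letter or digit, then lowercasing, is only an approximation that happens to fit this sample data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for an analyzer: split on non-letter/non-digit
// characters and lowercase each piece.
final class SimpleTokenizeDemo {

    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            // split() can yield an empty leading piece; skip it.
            if (!piece.isEmpty()) {
                out.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The quotes and '=' act only as separators.
        System.out.println(tokens("street=\"MAIN\" city=\"NASHUA\""));
        // prints [street, main, city, nashua]

        // With or without quotes, the same single token comes out.
        System.out.println(tokens("\"MAIN\"").equals(tokens("MAIN")));
        // prints true
    }
}
```

Under this approximation, "street" and "main" are just two unrelated tokens in one field, which is why a query cannot restrict MAIN to the street part of a combined text field.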
>>>>>>>>>>>
>>>>>>>>>>> 2019-06-12 2:39 <baris.kazar@oracle.com>:
>>>>>>>>>>>
>>>>>>>>>>> i can say that quotes are not the issue with the index, as i
>>>>>>>>>>> still get the same results with or without quotes.
>>>>>>>>>>>
>>>>>>>>>>> i am starting to feel that this might be a bug maybe??
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
>>>>>>>>>>>
>>>>>>>>>>> Somehow " is causing an issue as this should return street with
>>>>>>>>>>> MAIN:
>>>>>>>>>>>
>>>>>>>>>>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
>>>>>>>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
>>>>>>>>>>> states"] -> this was with fuzzyquery on MAINS
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
>>>>>>>>>>>
>>>>>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>>>>>>> contentDFLT:mains]
>>>>>>>>>>>
>>>>>>>>>>> QueryParser chops it into two pieces from
>>>>>>>>>>> parser.parse("street=\"MAINS\"");
>>>>>>>>>>>
>>>>>>>>>>> Index has a TextField named contentDFLT the following data :
>>>>>>>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
>>>>>>>>>>> HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> When i set street=\"MAINS~\" with parser:
>>>>>>>>>>> i get the following
>>>>>>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>>>>>>> contentDFLT:mains]
>>>>>>>>>>>
>>>>>>>>>>> probably " quotations are messing this up as You were saying...
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>>>>>>>>>>>
>>>>>>>>>>> Or, " (double quotation) in your query string may affect query parsing.
>>>>>>>>>>>
>>>>>>>>>>> When I parse this string by classic query parser (lucene 8.1),
>>>>>>>>>>> street="MAINS~"
>>>>>>>>>>> parsed (raw) query is
>>>>>>>>>>> text:street text:mains
>>>>>>>>>>> (I set the default search field to "text", so text:xxxx appears
>>>>>>>>>>> here.)
>>>>>>>>>>>
>>>>>>>>>>> Query parsing is a complex process, so it would be good to check
>>>>>>>>>>> parsed raw query string especially when you have (reserved) special
>>>>>>>>>>> characters in your query...
>>>>>>>>>>>
>>>>>>>>>>> 2019-06-11 1:10 Tomoko Uchida <tomoko.uchida.1111@gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I noticed one small thing in your previous mail.
>>>>>>>>>>>
>>>>>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>>>>>>>>>>>
>>>>>>>>>>> which is good.
>>>>>>>>>>>
>>>>>>>>>>> To specify a search field, ":" (colon) should be used instead of "=".
>>>>>>>>>>> See the query parser documentation:
>>>>>>>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fields
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I'm not sure this is related to your problem.
>>>>>>>>>>>
>>>>>>>>>>> 2019-06-11 0:51 <baris.kazar@oracle.com>:
>>>>>>>>>>>
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "city=\"NASHUA\""),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "region=\"NEW HAMPSHIRE\""),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "country=\"UNITED STATES\""),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>>
>>>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser =
>>>>>>>>>>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>>>>>>>>>>> Query q1 = null;
>>>>>>>>>>> try {
>>>>>>>>>>>     q1 = parser.parse("MAIN");
>>>>>>>>>>> } catch (ParseException e) {
>>>>>>>>>>>     e.printStackTrace();
>>>>>>>>>>> }
>>>>>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>>>>>>>>>>
>>>>>>>>>>> testQuerySearch2 Time to compute: 0 seconds
>>>>>>>>>>> Number of results: 1775
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 37.20959
>>>>>>>>>>> ID: 12681979
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 37.20959
>>>>>>>>>>> ID: 12681977
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>>>
>>>>>>>>>>> Name: Main St
>>>>>>>>>>> Score: 37.20959
>>>>>>>>>>> ID: 12681978
>>>>>>>>>>> Country Code: US
>>>>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>>>
>>>>>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>>>>>>>>>>> which is good.
>>>>>>>>>>>
>>>>>>>>>>> But when i switch to MAINS~ then fuzzy query does not work.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> i need to say something about using only q1 in the booleanquery:
>>>>>>>>>>> it tries to match MAIN in street, city, region and country, which
>>>>>>>>>>> are all in a single TextField field.
>>>>>>>>>>> But i dont want this. that is why i need to use street="..." etc
>>>>>>>>>>> when searching.
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> just for the basic verification, can you find the document without
>>>>>>>>>>> fuzzy query? I mean, does this query work for you?
>>>>>>>>>>>
>>>>>>>>>>> Query query = parser.parse("MAIN");
>>>>>>>>>>>
>>>>>>>>>>> Tomoko
>>>>>>>>>>>
>>>>>>>>>>> 2019-06-11 0:22 <baris.kazar@oracle.com>:
>>>>>>>>>>>
>>>>>>>>>>> why does the second set not work at all?
>>>>>>>>>>>
>>>>>>>>>>> it is indexed as Textfield like street="..." city="..." etc.
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>>>>>>>>>>
>>>>>>>>>>> i dont know how to use Fuzzyquery with queryparser but probably
>>>>>>>>>>> You are suggesting
>>>>>>>>>>>
>>>>>>>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
>>>>>>>>>>> Query query = parser.parse("MAINS~2");
>>>>>>>>>>>
>>>>>>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>>>>>>>>>>
>>>>>>>>>>> am i right?
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>>>>>>>>>
>>>>>>>>>>> I would suggest using a QueryParser for your fuzzy query before
>>>>>>>>>>> adding it to the Boolean query. This should weed out any case issues.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> BooleanQuery.Builder booleanQuery = new BooleanQuery.Builder();
>>>>>>>>>>>
>>>>>>>>>>> // First set
>>>>>>>>>>> booleanQuery.add(new FuzzyQuery(new org.apache.lucene.index.Term(field, "MAINS")),
>>>>>>>>>>>     BooleanClause.Occur.SHOULD);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
>>>>>>>>>>>     BooleanClause.Occur.MUST);
>>>>>>>>>>>
>>>>>>>>>>> // Second set
>>>>>>>>>>> //booleanQuery.add(new FuzzyQuery(new org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
>>>>>>>>>>> //    BooleanClause.Occur.SHOULD);
>>>>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "city=\"NASHUA\""),
>>>>>>>>>>> //    BooleanClause.Occur.MUST);
>>>>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "region=\"NEW HAMPSHIRE\""),
>>>>>>>>>>> //    BooleanClause.Occur.MUST);
>>>>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "country=\"UNITED STATES\""),
>>>>>>>>>>> //    BooleanClause.Occur.MUST);
>>>>>>>>>>>
>>>>>>>>>>> The first set also brings streets with the name Nashua (NASHUA).
>>>>>>>>>>>
>>>>>>>>>>> so, to prevent that and since i also indexed with street="..."
>>>>>>>>>>> city="..." i did the second set but it does not bring anything.
>>>>>>>>>>>
>>>>>>>>>>> createPhraseQuery builds a Phrasequery with one term equal to the
>>>>>>>>>>> string in the call.
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/10/19 10:47 AM, baris.kazar@oracle.com wrote:
>>>>>>>>>>> > How do I check how it is indexed, lowercase or uppercase?
>>>>>>>>>>> >
>>>>>>>>>>> > The only way right now is by testing.
>>>>>>>>>>> >
>>>>>>>>>>> > I am using StandardAnalyzer.
>>>>>>>>>>> >
>>>>>>>>>>> > Best regards
>>>>>>>>>>> >
>>>>>>>>>>> >
>>>>>>>>>>> > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>>>>>>>>>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>>>>>>>>>> >> <tomoko.uchida.1111@gmail.com> wrote:
>>>>>>>>>>> >>> Hi,
>>>>>>>>>>> >>>
>>>>>>>>>>> >>> What analyzer do you use for the text field? Is the term "Main"
>>>>>>>>>>> >>> correctly indexed?
>>>>>>>>>>> >> Agreed. Also, it would be good if you could post your actual code.
>>>>>>>>>>> >>
>>>>>>>>>>> >> What analyzer are you using? If you are using StandardAnalyzer, then
>>>>>>>>>>> >> all of your terms while indexing will be lowercased, AFAIK, but your
>>>>>>>>>>> >> query will not be analyzed until you run a QueryParser on it.
>>>>>>>>>>> >>
>>>>>>>>>>> >>
>>>>>>>>>>> >> Atri
>>>>>>>>>>> >>
>>>>>>>>>>> >
>>>>>>>>>>> >
>>>>>>>>>>> >
>>>>>>>>>>> > ---------------------------------------------------------------------
>>>>>>>>>>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>>>>>>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>>>>>> >



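Note on the case issue raised in the quoted thread: the point that StandardAnalyzer lowercases indexed terms, while a hand-built FuzzyQuery term is not analyzed, can be illustrated without Lucene. The sketch below is only an approximation under stated assumptions: it uses a plain, case-sensitive Levenshtein distance (Lucene's own matching is automaton-based and also counts transpositions by default), the class and method names are hypothetical, and MAX_EDITS = 2 is assumed to mirror FuzzyQuery's default maximum edit distance. It compares the raw query term "MAINS" and its lowercased form against the indexed token "main".

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene: illustrates why an uppercase,
// un-analyzed fuzzy term can miss a lowercased indexed token.
class FuzzyCaseSketch {
    // Assumption: mirrors FuzzyQuery's default maximum of 2 edits.
    static final int MAX_EDITS = 2;

    // Plain case-sensitive Levenshtein distance (insert, delete, substitute).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two row buffers
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // True if the query term is within the assumed fuzzy range of the token.
    static boolean withinFuzzyRange(String queryTerm, String indexedTerm) {
        return levenshtein(queryTerm, indexedTerm) <= MAX_EDITS;
    }

    public static void main(String[] args) {
        String indexed = "main";   // token as a lowercasing analyzer would store it
        String raw = "MAINS";      // term passed directly to the fuzzy query
        String lowered = raw.toLowerCase(Locale.ROOT);
        System.out.println(raw + " vs " + indexed + ": " + levenshtein(raw, indexed));
        System.out.println(lowered + " vs " + indexed + ": " + levenshtein(lowered, indexed));
    }
}
```

Under these assumptions the raw term differs from the indexed token in every character, so it falls outside the two-edit range, while the lowercased term is one edit away. This is consistent with the suggestion in the thread to lowercase the term (or run it through a QueryParser) before building the fuzzy query.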
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
