Mailing List Archive

FuzzyQuery
Hi,
I can't get FuzzyQuery working when searching with a query like Mains~2 to find the word Main in a TextField.
Any suggestions, please?

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: FuzzyQuery [ In reply to ]
Hi,

What analyzer do you use for the text field? Is the term "Main"
correctly indexed?

Re: FuzzyQuery [ In reply to ]
On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
<tomoko.uchida.1111@gmail.com> wrote:
>
> Hi,
>
> What analyzer do you use for the text field? Is the term "Main"
> correctly indexed?

Agreed. Also, it would be good if you could post your actual code.

What analyzer are you using? If you are using StandardAnalyzer, then
all of your terms while indexing will be lowercased, AFAIK, but your
query will not be analyzed until you run a QueryParser on it.
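
To illustrate the point about lowercasing, here is a minimal, Lucene-free sketch. It assumes the index holds the lowercased term "main" (as StandardAnalyzer would produce) and that fuzzy matching allows at most 2 edits, as in Mains~2. The class and method names are made up for this illustration, and it uses plain Levenshtein distance rather than Lucene's own fuzzy matching implementation:

```java
import java.util.Locale;

// Hypothetical demo class; not part of Lucene.
final class FuzzyCaseDemo {

    /** Plain, case-sensitive Levenshtein distance (insert, delete, substitute). */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows so prev always holds the last finished row
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexed = "main"; // assumed indexed form after lowercasing
        String raw = "MAINS";    // term passed directly to a fuzzy query, not analyzed

        System.out.println(levenshtein(raw, indexed));                          // prints 5
        System.out.println(levenshtein(raw.toLowerCase(Locale.ROOT), indexed)); // prints 1
    }
}
```

With the upper-case term the distance is 5, well beyond 2 edits, so nothing would match; once the term is lowercased the distance drops to 1.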


Atri

--
Regards,

Atri
Apache Concerted

Re: FuzzyQuery [ In reply to ]
How do I check how it is indexed, lowercase or uppercase?

Right now the only way is by testing.

I am using StandardAnalyzer.

Best regards


Re: FuzzyQuery [ In reply to ]
I would suggest using a QueryParser for your fuzzy query before adding it
to the Boolean query. This should weed out any case issues.

On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com> wrote:

> BooleanQuery.Builder booleanQuery = new BooleanQuery.Builder();
>
> // First set
> booleanQuery.add(new FuzzyQuery(new org.apache.lucene.index.Term(field, "MAINS")), BooleanClause.Occur.SHOULD);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>
> // Second set
> //booleanQuery.add(new FuzzyQuery(new org.apache.lucene.index.Term(field, "street=\"MAINS\"")), BooleanClause.Occur.SHOULD);
> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>
> The first set also returns streets whose name is Nashua (NASHUA).
>
> To prevent that, and since I also indexed with street="..."
> city="...", I tried the second set, but it does not return anything.
>
> createPhraseQuery builds a PhraseQuery with one term equal to the string
> in the call.
>
> Best regards
Re: FuzzyQuery [ In reply to ]
I am using StandardAnalyzer.

Best regards


Re: FuzzyQuery [ In reply to ]
I don't know how to use FuzzyQuery with QueryParser, but you are probably
suggesting:

QueryParser parser = new QueryParser(field, analyzer);
Query query = parser.parse("MAINS~2");

booleanQuery.add(query, BooleanClause.Occur.SHOULD);

Am I right?
Best regards


Re: FuzzyQuery [ In reply to ]
Hi,

Just as a basic check, can you find the document without the
fuzzy query? I mean, does this query work for you?

Query query = parser.parse("MAIN");

Tomoko

Re: FuzzyQuery [ In reply to ]
Why does the second set not work at all?

It is indexed as a TextField, like street="..." city="..." etc.

Best regards



Re: FuzzyQuery [ In reply to ]
booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);

org.apache.lucene.queryparser.classic.QueryParser parser =
        new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
try {
    // Parse the street term and add it as an optional clause.
    Query q1 = parser.parse("MAIN");
    booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
} catch (ParseException e) {
    // On a parse failure, skip the clause instead of adding a null query.
    e.printStackTrace();
}

testQuerySearch2 Time to compute: 0 seconds
Number of results: 1775
Name: Main St
Score: 37.20959
ID: 12681979
Country Code: US
Coordinates: 42.76416, -71.46681
Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
region="NEW HAMPSHIRE" country="UNITED STATES"

Name: Main St
Score: 37.20959
ID: 12681977
Country Code: US
Coordinates: 42.747, -71.45957
Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
region="NEW HAMPSHIRE" country="UNITED STATES"

Name: Main St
Score: 37.20959
ID: 12681978
Country Code: US
Coordinates: 42.73492, -71.44951
Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
region="NEW HAMPSHIRE" country="UNITED STATES"

When I use q1 = parser.parse("street=\"MAIN\""); I get the same results,
which is good.

But when I switch to MAINS~, the fuzzy query does not work.


I need to point out something about using only q1 in the BooleanQuery:
it tries to match MAIN against street, city, region and country, which are
all in a single TextField.
I don't want this. That is why I need street="..." etc. when searching.

Best regards



Re: FuzzyQuery [ In reply to ]
Hi,

I noticed one small thing in your previous mail.

> when i use q1 = parser.parse("street=\"MAIN\""); i get same results which is good.

To specify a search field, ":" (colon) should be used instead of "=".
See the query parser documentation:
http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fields

I'm not sure this is related to your problem.

Re: FuzzyQuery [ In reply to ]
Or, the " (double quotation mark) in your query string may affect query parsing.

When I parse this string with the classic query parser (Lucene 8.1),
street="MAINS~"
the parsed (raw) query is
text:street text:mains
(I set the default search field to "text", which is why text:xxxx appears here.)

Query parsing is a complex process, so it is a good idea to check the
parsed raw query string, especially when your query contains (reserved)
special characters.
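
As a rough illustration of why the parsed query above contains two separate terms, the sketch below mimics the analysis step with a deliberately simplified tokenizer: split on anything that is not a letter or digit, then lowercase. This is only an approximation of what StandardAnalyzer does, and the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; a simplified stand-in for analysis, not Lucene code.
final class SimpleTokenizerDemo {

    /** Splits on any character that is not a letter or digit and lowercases each token. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString()); // a separator ends the current token
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String searchKey = "street=\"MAIN\" city=\"NASHUA\" region=\"NEW HAMPSHIRE\"";
        List<String> terms = tokenize(searchKey);

        // prints [street, main, city, nashua, region, new, hampshire]
        System.out.println(terms);
        // prints false: no single term carries the street= prefix and quotes
        System.out.println(terms.contains("street=\"MAINS\""));
    }
}
```

Under this approximation no single indexed term looks like street="MAINS", which would be consistent with a FuzzyQuery built directly on that string returning nothing, and with a bare MAIN matching wherever the word occurs in the combined field.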

Re: FuzzyQuery [ In reply to ]
[+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
+contentDFLT:"country united states", contentDFLT:street contentDFLT:mains]

QueryParser chops it into two pieces when I call
parser.parse("street=\"MAINS\"");

The index has a TextField named contentDFLT with the following data:
street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
HAMPSHIRE" country="UNITED STATES"


When I set street=\"MAINS~\" with the parser,
I get the following:
[+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
+contentDFLT:"country united states", contentDFLT:street contentDFLT:mains]

Probably the " quotation marks are messing this up, as you were saying...
Best regards


Re: FuzzyQuery [ In reply to ]
Somehow the " is causing an issue, as this should return the street with MAIN:

[contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
+contentDFLT:"region new-hampshire", +contentDFLT:"country united states"]

Best regards
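
A possible explanation for the empty result is that the term handed directly to FuzzyQuery is not analyzed, and FuzzyQuery only matches indexed terms within a small edit distance (at most 2) of that term. The sketch below is plain Java, not Lucene code; the class name EditDistanceSketch is hypothetical, it assumes the index holds lowercased tokens such as main, and it uses plain Levenshtein distance, whereas Lucene by default also counts a transposition as one edit. For these inputs the difference does not matter.

```java
public class EditDistanceSketch {
    // Classic Levenshtein distance (insert, delete, substitute), case-sensitive.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("mains", "main"));            // 1: within 2 edits
        System.out.println(levenshtein("MAINS", "main"));            // 5: case alone breaks the match
        System.out.println(levenshtein("street=\"MAINS\"", "main")); // 14: far beyond 2 edits
    }
}
```

Under these assumptions, only a lowercased single term such as mains is close enough to the indexed token main; the whole string street="MAINS" used as one term is not.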


Re: FuzzyQuery [ In reply to ]
I can say that the quotes are not the issue with the index, as I still get
the same results with or without quotes.

I am starting to feel that this might be a bug, maybe?

Best regards


On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
> Somehow " is causing an issue as this should return street with MAIN:
>
> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
> states"] -> this was with fuzzyquery on MAINS
>
> Best regards
>
>
>>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>>>>>>> I would suggest using a QueryParser for your fuzzy query before
>>>>>>>>> adding it to the Boolean query. This should weed out any case
>>>>>>>>> issues.
>>>>>>>>>
>>>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com
>>>>>>>>> <mailto:baris.kazar@oracle.com>> wrote:
>>>>>>>>>
>>>>>>>>>       BooleanQuery.Builder booleanQuery = new
>>>>>>>>> BooleanQuery.Builder();
>>>>>>>>>
>>>>>>>>>       //First set
>>>>>>>>>
>>>>>>>>>               booleanQuery.add(new FuzzyQuery(new
>>>>>>>>>       org.apache.lucene.index.Term(field, "MAINS")),
>>>>>>>>>       BooleanClause.Occur.SHOULD);
>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>       "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>       "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>       "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>>
>>>>>>>>>       // Second set
>>>>>>>>>                //booleanQuery.add(new FuzzyQuery(new
>>>>>>>>>       org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
>>>>>>>>>       BooleanClause.Occur.SHOULD);
>>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>
>>>>>>>>>       field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>
>>>>>>>>>       field, "region=\"NEW HAMPSHIRE\""),
>>>>>>>>> BooleanClause.Occur.MUST);
>>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>
>>>>>>>>>       field, "country=\"UNITED STATES\""),
>>>>>>>>> BooleanClause.Occur.MUST);
>>>>>>>>>
>>>>>>>>>       The first set brings also street with Nashua name.
>>>>>>>>> (NASHUA).
>>>>>>>>>
>>>>>>>>>       so, to prevent that and since i also indexed with
>>>>>>>>> street="..."
>>>>>>>>>       city="..." i did the second set but it does not bring
>>>>>>>>> anything.
>>>>>>>>>
>>>>>>>>>       createPhraseQuery builds a Phrasequery with one term
>>>>>>>>> equal to the
>>>>>>>>>       string
>>>>>>>>>       in the call.
>>>>>>>>>
>>>>>>>>>       Best regards
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>       On 6/10/19 10:47 AM, baris.kazar@oracle.com
>>>>>>>>>       <mailto:baris.kazar@oracle.com> wrote:
>>>>>>>>>       > How do i check how it is indexed? lowecase or uppercase?
>>>>>>>>>       >
>>>>>>>>>       > only way is now to by testing.
>>>>>>>>>       >
>>>>>>>>>       > i am using standardanalyzer.
>>>>>>>>>       >
>>>>>>>>>       > Best regards
>>>>>>>>>       >
>>>>>>>>>       >
>>>>>>>>>       > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>>>>>>>>       >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>>>>>>>>       >> <tomoko.uchida.1111@gmail.com
>>>>>>>>> <mailto:tomoko.uchida.1111@gmail.com>> wrote:
>>>>>>>>>       >>> Hi,
>>>>>>>>>       >>>
>>>>>>>>>       >>> What analyzer do you use for the text field? Is the
>>>>>>>>> term "Main"
>>>>>>>>>       >>> correctly indexed?
>>>>>>>>>       >> Agreed. Also, it would be good if you could post your
>>>>>>>>> actual
>>>>>>>>> code.
>>>>>>>>>       >>
>>>>>>>>>       >> What analyzer are you using? If you are using
>>>>>>>>> StandardAnalyzer,
>>>>>>>>>       then
>>>>>>>>>       >> all of your terms while indexing will be lowercased,
>>>>>>>>> AFAIK, but
>>>>>>>>>       your
>>>>>>>>>       >> query will not be analyzed until you run a
>>>>>>>>> QueryParser on it.
>>>>>>>>>       >>
>>>>>>>>>       >>
>>>>>>>>>       >> Atri
>>>>>>>>>       >>
>>>>>>>>>       >
>>>>>>>>>       >
>>>>>>>>>       >


Re: FuzzyQuery [ In reply to ]
I'd suggest making sure you understand how the software works before
suspecting a bug :-)

I think you may be missing two points:

1. The standard analyzer (standard tokenizer) breaks words at the double
quote (U+0022), so quotes are neither indexed nor searched if you are
using the standard analyzer. (That is why you get the same results
with or without quotes.)
See: https://lucene.apache.org/core/8_1_0/core/org/apache/lucene/analysis/standard/StandardTokenizer.html
and http://unicode.org/reports/tr29/

2. The double quote has a special meaning in the built-in query parser
(it is interpreted as a phrase query), so you need to escape it if you
want to search for the double quote itself.
See: http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Terms

(My advice would be to create a separate field for each key-value pair
instead of stuffing all pairs into one text field, if you need to
search them separately.)
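
To illustrate point 1, here is a rough sketch. It is not Lucene's StandardTokenizer (which implements the UAX #29 word-break rules); it is a hypothetical stand-in, with made-up class and method names, that splits on every character that is not a letter or digit and lowercases each token. That is assumed to be close enough for plain ASCII input like the Search Key values in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical approximation of what an analyzer does to the Search Key text.
// NOT the real StandardAnalyzer: it only splits on non-alphanumerics and lowercases.
class RoughAnalyzerSketch {

    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                // Letters and digits extend the current token, lowercased.
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // Anything else (space, '=', '"') ends the token and is dropped.
                out.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // With key="VALUE" syntax, as stored in the single text field.
        System.out.println(tokens("street=\"MAIN\" city=\"NASHUA\" region=\"NEW HAMPSHIRE\""));
        // Same words without '=' and quotes.
        System.out.println(tokens("street MAIN city NASHUA region NEW HAMPSHIRE"));
        // Both lines print: [street, main, city, nashua, region, new, hampshire]
    }
}
```

Under this approximation the two inputs give the same token list, which matches the observation that results are the same with or without quotes. It also suggests why one unanalyzed term such as street="MAINS" has no single indexed token to match.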


Re: FuzzyQuery [ In reply to ]
Tomoko,-

 Thank you for your suggestions. I am trying to understand it, and I
thought I did :)

But FuzzyQuery does not work when I use it with a *single* large
TextField like street=...value... city=...value... region=...value...
country=...value... (with or without quotes around the values).

What I knew about Lucene fuzzy queries does not hold with this
TextField layout. That is why I suspected a bug.

1. Yes, I saw that and have solid proof of it now.

2. Yes, but FuzzyQuery takes the quotes as they are, since they are
escaped and the term is not analyzed.

Stuffing everything into one TextField versus having separate fields
should probably affect only the performance, not the outcome, in my case.
But I have been thinking about this, and maybe separate fields are the
way to go here.

My CONTENT field has street names in mixed case and city, region, and
country names in UPPERCASE. Can this be a problem?
I thought the index stored them in lowercase, since I am using
StandardAnalyzer.

The CONTENT field also holds the full TextField string with street=...
city=... region=... country=... (here all values are UPPERCASE).

Why can't the index find the names via FuzzyQuery? I tried both
FuzzyQuery and the query builder, as I showed before.

The last piece of advice in your previous email could well go outside
the parentheses, since it might be very critical :) :) :)

Best regards
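
One possible explanation for the failing hand-built FuzzyQuery follows from Atri's earlier point that a query is not analyzed unless it goes through a QueryParser. If the index holds the lowercased token main, then the raw term MAINS differs from it in every character, and FuzzyQuery allows at most 2 edits. The sketch below is standalone Java, not Lucene code. It uses plain Levenshtein distance (without the transposition handling FuzzyQuery can apply), and its class and method names are made up for illustration.

```java
// Hypothetical illustration: edit distance between a raw query term and an indexed token.
class FuzzyCaseSketch {

    // Classic Levenshtein distance (insert, delete, substitute), two-row dynamic programming.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexed = "main"; // assumed indexed token after lowercasing
        System.out.println(editDistance("mains", indexed));              // 1: within 2 edits
        System.out.println(editDistance("MAINS", indexed));              // 5: case differs everywhere
        System.out.println(editDistance("street=\"MAINS\"", indexed) > 2); // true: far beyond 2 edits
    }
}
```

If this is what is happening, lowercasing the term before building the FuzzyQuery (or letting the query parser build it from MAINS~2) would bring the distance back to 1.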




Re: FuzzyQuery- why is it ignored? [ In reply to ]
Hi again,-

This is really interesting, and I hope I am missing something. The index
lowercases all entries, so I think case sensitivity is not an issue.

Case #1:

        org.apache.lucene.queryparser.classic.QueryParser parser =
                new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
        Query q1 = null;
        try {
            q1 = parser.parse("Main");
        } catch (ParseException e) {
            e.printStackTrace();
        }
        booleanQuery.add(q1, BooleanClause.Occur.MUST);
        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
                BooleanClause.Occur.MUST);
        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
                BooleanClause.Occur.MUST);
        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
                BooleanClause.Occur.MUST);


This produces the following.

Query plan:

[+contentDFLT:main, +contentDFLT:"nashua",
+contentDFLT:"new-hampshire", +contentDFLT:"united states"]

testQuerySearch1 Time to compute: 0 seconds (copied answer after exec
finished)

Number of results: 12
Name: Main Dunstable Rd
Score: 41.204945
ID: 12677400
Country Code: US
Coordinates: 42.72631, -71.50269
Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.204945
ID: 12681980
Country Code: US
Coordinates: 42.76416, -71.46681
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.204945
ID: 12681973
Country Code: US
Coordinates: 42.75045, -71.4607
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.204945
ID: 12681974
Country Code: US
Coordinates: 42.76019, -71.465
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main Dunstable Rd
Score: 41.204945
ID: 12677399
Country Code: US
Coordinates: 42.74641, -71.48943
Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: S Main St
Score: 41.204945
ID: 11893215
Country Code: US
Coordinates: 42.73412, -71.44797
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.204945
ID: 12681978
Country Code: US
Coordinates: 42.73492, -71.44951
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: S Main St
Score: 41.204945
ID: 11893214
Country Code: US
Coordinates: 42.73958, -71.45895
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.204945
ID: 12681979
Country Code: US
Coordinates: 42.76416, -71.46681
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.204945
ID: 12681977
Country Code: US
Coordinates: 42.747, -71.45957
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES



Case #2

Appending ~ to make Main a fuzzy query also worked:

        org.apache.lucene.queryparser.classic.QueryParser parser =
                new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
        Query q1 = null;
        try {
            q1 = parser.parse("Main~");
        } catch (ParseException e) {
            e.printStackTrace();
        }
        booleanQuery.add(q1, BooleanClause.Occur.MUST);
        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
                BooleanClause.Occur.MUST);
        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
                BooleanClause.Occur.MUST);
        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
                BooleanClause.Occur.MUST);

Query plan:

[+contentDFLT:main~2, +contentDFLT:"nashua",
+contentDFLT:"new-hampshire", +contentDFLT:"united states"]

testQuerySearch1 Time to compute: 24 seconds (due to debugging stops)
Number of results: 12
Name: Main Dunstable Rd
Score: 41.06405
ID: 12677400
Country Code: US
Coordinates: 42.72631, -71.50269
Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.06405
ID: 12681980
Country Code: US
Coordinates: 42.76416, -71.46681
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.06405
ID: 12681973
Country Code: US
Coordinates: 42.75045, -71.4607
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.06405
ID: 12681974
Country Code: US
Coordinates: 42.76019, -71.465
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main Dunstable Rd
Score: 41.06405
ID: 12677399
Country Code: US
Coordinates: 42.74641, -71.48943
Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: S Main St
Score: 41.06405
ID: 11893215
Country Code: US
Coordinates: 42.73412, -71.44797
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.06405
ID: 12681978
Country Code: US
Coordinates: 42.73492, -71.44951
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: S Main St
Score: 41.06405
ID: 11893214
Country Code: US
Coordinates: 42.73958, -71.45895
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.06405
ID: 12681979
Country Code: US
Coordinates: 42.76416, -71.46681
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.06405
ID: 12681977
Country Code: US
Coordinates: 42.747, -71.45957
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES




Case #3

But why does the fuzzy query not work when I misspell the term slightly
(one edit away)? As you saw above, the data is indexed with the spelling Main:

        org.apache.lucene.queryparser.classic.QueryParser parser =
                new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
        Query q1 = null;
        try {
            q1 = parser.parse("Mains~");  // 1 edit away
        } catch (ParseException e) {
            e.printStackTrace();
        }
        booleanQuery.add(q1, BooleanClause.Occur.MUST);
        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
                BooleanClause.Occur.MUST);
        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
                BooleanClause.Occur.MUST);
        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
                BooleanClause.Occur.MUST);

Query plan:

[+contentDFLT:mains~2, +contentDFLT:"nashua",
+contentDFLT:"new-hampshire", +contentDFLT:"united states"]

testQuerySearch1 Time to compute: 23 seconds (due to debugging stops)

Number of results: 0
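
For reference, a minimal sketch in plain Java (no Lucene dependency) that
checks the "1 edit away" claim in Case #3. The class and method names are
made up for illustration, and this is the textbook Levenshtein dynamic
program, not Lucene's own implementation. It only confirms that "mains" and
"main" are one edit apart, which is within the ~2 limit shown in the query
plan. It does not explain why the search returned no results.

```java
// Illustrative only: textbook Levenshtein distance, not Lucene's implementation.
class EditDistanceDemo {

    // Returns the minimum number of single-character insertions,
    // deletions, or substitutions needed to turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Query term from Case #3 (already lowercased) vs. the indexed term.
        System.out.println(levenshtein("mains", "main"));
        // Query term from Case #2 vs. the indexed term.
        System.out.println(levenshtein("main", "main"));
    }
}
```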



Case #4

Then I changed q1 from MUST to SHOULD. I think the fuzzy query is
ignored here, since there is no MAIN in the first 468 results.

There is no boost for the Mains term here.

Query plan:

[contentDFLT:mains~2, +contentDFLT:"nashua",
+contentDFLT:"new-hampshire", +contentDFLT:"united states"]

testQuerySearch1 Time to compute: 125 seconds (due to debugging stops)
Number of results: 1794
Name: Nashua Dr
Score: 34.186226
ID: 4974936
Country Code: US
Coordinates: 42.7636, -71.46063
Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Nashua River Rail Trl
Score: 34.186226
ID: 4975508
Country Code: US
Coordinates: 42.7062, -71.53962
Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Nashua Rd
Score: 33.84896
ID: 4975388
Country Code: US
Coordinates: 42.78746, -71.92823
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: NASHUA
Score: 33.84896
ID: 21014865
Country Code: US
Coordinates: 42.75873, -71.46438
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: NASHUA
Score: 33.84896
ID: 21014865
Country Code: US
Coordinates: 42.75873, -71.46438
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: NASHUA
Score: 33.84896
ID: 21014865
Country Code: US
Coordinates: 42.75873, -71.46438
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: NASHUA
Score: 33.84896
ID: 21014865
Country Code: US
Coordinates: 42.75873, -71.46438
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: NASHUA
Score: 33.84896
ID: 21014865
Country Code: US
Coordinates: 42.75873, -71.46438
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Nashua St
Score: 33.84896
ID: 4975671
Country Code: US
Coordinates: 42.88471, -70.81687
Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES

Name: Nashua Rd
Score: 33.84896
ID: 4975400
Country Code: US
Coordinates: 42.79014, -71.92364
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
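
To make the MUST/SHOULD difference in Case #4 concrete, here is a toy model
in plain Java. It is not Lucene code: the documents, the score method and the
one-point-per-clause scoring are assumptions made for illustration, and fuzzy
expansion is not modeled (the optional term is matched exactly). It shows
only the general Boolean semantics: required clauses decide which documents
match, while an optional clause can only raise the score of documents that
already match. If the optional clause matches no document, the result set
and ordering are the same as if the clause were absent.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of required (MUST) vs. optional (SHOULD) clauses. Not Lucene code.
class BooleanClauseDemo {

    // Returns the score of a document, or -1 if a required term is missing.
    // Scoring is a made-up "one point per matching clause".
    static int score(Set<String> docTerms, List<String> must, List<String> should) {
        int score = 0;
        for (String term : must) {
            if (!docTerms.contains(term)) {
                return -1; // a missing required term excludes the document
            }
            score++;
        }
        for (String term : should) {
            if (docTerms.contains(term)) {
                score++; // optional terms only add to the score
            }
        }
        return score;
    }

    public static void main(String[] args) {
        // Hypothetical documents, reduced to their lowercased terms.
        Set<String> mainSt = new HashSet<>(Arrays.asList("main", "nashua"));
        Set<String> nashuaDr = new HashSet<>(Arrays.asList("nashua"));
        List<String> must = Arrays.asList("nashua");

        // Optional term that matches one document: that document scores higher.
        System.out.println(score(mainSt, must, Arrays.asList("main")));
        System.out.println(score(nashuaDr, must, Arrays.asList("main")));

        // Optional term that matches nothing: both documents still match,
        // and neither gets any extra score.
        System.out.println(score(mainSt, must, Arrays.asList("mains")));
        System.out.println(score(nashuaDr, must, Arrays.asList("mains")));
    }
}
```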


Why is the fuzzy query ignored?
Even if I had separate fields for street, city, region, and country, this
fuzzy query issue would still arise for multi-word values like
"main dunstable", right?

Best regards

Re: FuzzyQuery- why is it ignored?
OK, I think only this one specific term, "mains", has an issue.

Everything I knew about Lucene still holds :) Great...

I have one more question:

Which is recommended: constructing a FuzzyQuery directly, or using
QueryParser with ~ appended to the search string?

The second one goes through the analyzer and lowercases the search string.
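
A small sketch in plain Java (no Lucene dependency) of why that lowercasing
step matters for a fuzzy comparison. The class, the distance method and the
normalize method are hypothetical names used for illustration. normalize is
only a stand-in for the lowercasing the analyzer applies, and distance is the
textbook Levenshtein computation, not Lucene's. The point is that an
uppercase term compared as-is against a lowercased indexed term differs in
every character, so it falls well outside a 2-edit limit, while the
lowercased term is one edit away.

```java
import java.util.Locale;

// Illustrative only: case normalization before a fuzzy comparison.
class CaseNormalizationDemo {

    // Full-matrix Levenshtein distance (textbook version, not Lucene's).
    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i;
        }
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        }
        return d[a.length()][b.length()];
    }

    // Stand-in for the lowercasing an analyzer would apply to the query term.
    static String normalize(String term) {
        return term.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String indexed = "main"; // term as stored by a lowercasing analyzer
        String raw = "MAINS";    // term passed as-is, without analysis

        System.out.println(distance(raw, indexed));            // compared as-is
        System.out.println(distance(normalize(raw), indexed)); // lowercased first
    }
}
```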

Best regards


On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
>> Tomoko,-
>>
>>  Thank You for Your suggestions. i am trying to understand it and i
>> thought i did :)
>>
>> but it does not work with FuzzyQuery when i used with a *single*
>> large TextField like street=...value... city=...value...
>> region=...value... country=...value... (with or without quotes for
>> the values)
>>
>> What i knew about Lucene fuzzy queries does not hold with this
>> TextField form. That is why i suspected a bug.
>>
>> 1. Yes, i saw and have a solid proof on that now.
>>
>> 2. Yes, but FuzzyQuery takes the quotes as they are, since they are
>> escaped and the term is not analyzed.
>>
>> Stuffing into one textfield vs having separate fields should only
>> affect probably the performance but not the outcome in my case.
>> But, i have been thinking about this and maybe it is the way to go in
>> this case.
>>
>> My CONTENT field has street names in mixed case and city, region
>> country names in UPPERCASE. Can this be a problem?
>> i thought index stored them in lowercase since i am using
>> StandardAnalyzer.
>>
>> CONTENT field also has full textfield string with street=... city=...
>> region=... country=... (here all values are UPPERCASE).
>>
>> Why cant the index find the names via FuzzyQuery? i tried both
>> FuzzyQuery and Query builder as i showed before.
>>
>> The last advice in Your previous email would nicely go outside the
>> parentheses since it might be very critical :) :) :)
>>
>> Best regards
>>
>>
>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>>> I'd suggest to correctly understand the way a software works before
>>> suspecting its bug :-)
>>>
>>> I guess you may miss two points:
>>>
>>> 1. the standard analyzer (standard tokenizer) breaks words by double
>>> quote (U+0022) so quotes are not indexed or searched at all if you are
>>> using standard analyzer. (That is the reason you have same results
>>> with or without quotes.)
>>> See:
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_core_org_apache_lucene_analysis_standard_StandardTokenizer.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=8E2lp1YIGM-3v3FspeieGl8z8rEBs6qioTudtFNzh8c&e=
>>> and
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__unicode.org_reports_tr29_&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=riCZ_f25XW869CKbHPUqfbLiDU-AukE6la0xTLMw6u8&e=
>>>
>>> 2. double quote has special meaning (it's interpreted as phrase query)
>>> with the built-in query parser so you need to escape it if you want to
>>> search double quotes itself.
>>> See:
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Terms&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=t8OYTgidvcwNpAVFuTsqGhDJK5BwUZVCxc0mPHzqCYU&e=
>>>
>>> (My advice would be to create separate fields for each key value pairs
>>> instead of stuffing all pairs into one text field, if you need to
>>> search them separately.)
>>>
>>> 2019-06-12 (Wed) 2:39 <baris.kazar@oracle.com>:
>>>> i can say that quotes is not the issue with index as it still
>>>> results in
>>>> same results with quotes or without quotes.
>>>>
>>>> i am starting to feel that this might be a bug maybe??
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
>>>>> Somehow " is causing an issue as this should return street with MAIN:
>>>>>
>>>>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
>>>>> states"] -> this was with fuzzyquery on MAINS
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>> contentDFLT:mains]
>>>>>>
>>>>>> QueryParser chops it into two pieces from
>>>>>> parser.parse("street=\"MAINS\"");
>>>>>>
>>>>>> Index has a TextField named contentDFLT the following data :
>>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
>>>>>> HAMPSHIRE" country="UNITED STATES"
>>>>>>
>>>>>>
>>>>>> When i set street=\"MAINS~\" with parser:
>>>>>> i get the following
>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>> contentDFLT:mains]
>>>>>>
>>>>>> probably " quotations are messing this up as You were saying...
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>>>>>>> Or, " (double quotation) in your query string may affect query
>>>>>>> parsing.
>>>>>>>
>>>>>>> When I parse this string by classic query parser (lucene 8.1),
>>>>>>> street="MAINS~"
>>>>>>> parsed (raw) query is
>>>>>>> text:street text:mains
>>>>>>> (I set the default search field to "text", so text:xxxx is appeared
>>>>>>> here.)
>>>>>>>
>>>>>>> Query parsing is a complex process, so it would be good to check
>>>>>>> parsed raw query string especially when you have (reserved) special
>>>>>>> characters in your query...
>>>>>>>
>>>>>>> 2019-06-11 (Tue) 1:10 Tomoko Uchida <tomoko.uchida.1111@gmail.com>:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I noticed one small thing in your previous mail.
>>>>>>>>
>>>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same
>>>>>>>>> results
>>>>>>>> which is good.
>>>>>>>>
>>>>>>>> To specify a search field, ":" (colon) should be used instead
>>>>>>>> of "=".
>>>>>>>> See the query parser documentation:
>>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Fields&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=u4SeJqH4lePhOazCLwxLEr3WqcMkODtYLv4njiKZ4PM&s=WrNfUXO9gz1PqpczTJw1vD9sWqvr76WRv2Aeo9uWqa4&e=
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm not sure this is related to your problem.
>>>>>>>>
>>>>>>>> 2019-06-11 (Tue) 0:51 <baris.kazar@oracle.com>:
>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>>>>>>
>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>>>>> phraseAnalyzer) ;
>>>>>>>>>            Query q1 = null;
>>>>>>>>>            try {
>>>>>>>>>                q1 = parser.parse("MAIN");
>>>>>>>>>            } catch (ParseException e) {
>>>>>>>>>
>>>>>>>>>                e.printStackTrace();
>>>>>>>>>            }
>>>>>>>>>            booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>>>>>>>>
>>>>>>>>> testQuerySearch2 Time to compute: 0 seconds
>>>>>>>>> Number of results: 1775
>>>>>>>>> Name: Main St
>>>>>>>>> Score: 37.20959
>>>>>>>>> ID: 12681979
>>>>>>>>> Country Code: US
>>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>>> Search Key: street="MAIN" city="NASHUA"
>>>>>>>>> municipality="HILLSBOROUGH"
>>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>
>>>>>>>>> Name: Main St
>>>>>>>>> Score: 37.20959
>>>>>>>>> ID: 12681977
>>>>>>>>> Country Code: US
>>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>>> Search Key: street="MAIN" city="NASHUA"
>>>>>>>>> municipality="HILLSBOROUGH"
>>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>
>>>>>>>>> Name: Main St
>>>>>>>>> Score: 37.20959
>>>>>>>>> ID: 12681978
>>>>>>>>> Country Code: US
>>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>>> Search Key: street="MAIN" city="NASHUA"
>>>>>>>>> municipality="HILLSBOROUGH"
>>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>
>>>>>>>>>     when i use q1 = parser.parse("street=\"MAIN\""); i get same
>>>>>>>>> results
>>>>>>>>> which is good.
>>>>>>>>>
>>>>>>>>> But when i switch to MAINS~ then fuzzy query does not work.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> i need to say something with the q1 only in the booleanquery:
>>>>>>>>> it tries to match the MAIN in street, city, region and country
>>>>>>>>> which are
>>>>>>>>> in a single TextField field.
>>>>>>>>> But i dont want this. that is why i need to street="..." etc when
>>>>>>>>> searching.
>>>>>>>>>
>>>>>>>>> Best regards
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> just for the basic verification, can you find the document
>>>>>>>>>> without
>>>>>>>>>> fuzzy query? I mean, does this query work for you?
>>>>>>>>>>
>>>>>>>>>> Query query = parser.parse("MAIN");
>>>>>>>>>>
>>>>>>>>>> Tomoko
>>>>>>>>>>
>>>>>>>>>> 2019-06-11 (Tue) 0:22 <baris.kazar@oracle.com>:
>>>>>>>>>>> why cant the second set not work at all?
>>>>>>>>>>>
>>>>>>>>>>> it is indexed as Textfield like street="..." city="..." etc.
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>>>>>>>>>>> i dont know how to use Fuzzyquery with queryparser but
>>>>>>>>>>>> probably
>>>>>>>>>>>> You
>>>>>>>>>>>> are suggesting
>>>>>>>>>>>>
>>>>>>>>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
>>>>>>>>>>>> Query query = parser.parse("MAINS~2");
>>>>>>>>>>>>
>>>>>>>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>>>>>>>>>>>
>>>>>>>>>>>> am i right?
>>>>>>>>>>>> Best regards
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>>>>>>>>>>> I would suggest using a QueryParser for your fuzzy query
>>>>>>>>>>>>> before
>>>>>>>>>>>>> adding it to the Boolean query. This should weed out any case
>>>>>>>>>>>>> issues.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com
>>>>>>>>>>>>> <mailto:baris.kazar@oracle.com>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>        BooleanQuery.Builder booleanQuery = new
>>>>>>>>>>>>> BooleanQuery.Builder();
>>>>>>>>>>>>>
>>>>>>>>>>>>>        //First set
>>>>>>>>>>>>>
>>>>>>>>>>>>>                booleanQuery.add(new FuzzyQuery(new
>>>>>>>>>>>>>        org.apache.lucene.index.Term(field, "MAINS")),
>>>>>>>>>>>>>        BooleanClause.Occur.SHOULD);
>>>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer,
>>>>>>>>>>>>> field,
>>>>>>>>>>>>>        "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer,
>>>>>>>>>>>>> field,
>>>>>>>>>>>>>        "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer,
>>>>>>>>>>>>> field,
>>>>>>>>>>>>>        "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>>>>>>
>>>>>>>>>>>>>        // Second set
>>>>>>>>>>>>>                 //booleanQuery.add(new FuzzyQuery(new
>>>>>>>>>>>>>        org.apache.lucene.index.Term(field,
>>>>>>>>>>>>> "street=\"MAINS\"")),
>>>>>>>>>>>>>        BooleanClause.Occur.SHOULD);
>>>>>>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>        field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>        field, "region=\"NEW HAMPSHIRE\""),
>>>>>>>>>>>>> BooleanClause.Occur.MUST);
>>>>>>>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>        field, "country=\"UNITED STATES\""),
>>>>>>>>>>>>> BooleanClause.Occur.MUST);
>>>>>>>>>>>>>
>>>>>>>>>>>>>        The first set brings also street with Nashua name.
>>>>>>>>>>>>> (NASHUA).
>>>>>>>>>>>>>
>>>>>>>>>>>>>        so, to prevent that and since i also indexed with
>>>>>>>>>>>>> street="..."
>>>>>>>>>>>>>        city="..." i did the second set but it does not bring
>>>>>>>>>>>>> anything.
>>>>>>>>>>>>>
>>>>>>>>>>>>>        createPhraseQuery builds a Phrasequery with one term
>>>>>>>>>>>>> equal to the
>>>>>>>>>>>>>        string
>>>>>>>>>>>>>        in the call.
>>>>>>>>>>>>>
>>>>>>>>>>>>>        Best regards
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>        On 6/10/19 10:47 AM, baris.kazar@oracle.com
>>>>>>>>>>>>> <mailto:baris.kazar@oracle.com> wrote:
>>>>>>>>>>>>>        > How do i check how it is indexed? lowercase or
>>>>>>>>>>>>> uppercase?
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        > only way is now to by testing.
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        > i am using standardanalyzer.
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        > Best regards
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>>>>>>>>>>>>        >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>>>>>>>>>>>>        >> <tomoko.uchida.1111@gmail.com
>>>>>>>>>>>>> <mailto:tomoko.uchida.1111@gmail.com>> wrote:
>>>>>>>>>>>>>        >>> Hi,
>>>>>>>>>>>>>        >>>
>>>>>>>>>>>>>        >>> What analyzer do you use for the text field? Is
>>>>>>>>>>>>> the
>>>>>>>>>>>>> term "Main"
>>>>>>>>>>>>>        >>> correctly indexed?
>>>>>>>>>>>>>        >> Agreed. Also, it would be good if you could post
>>>>>>>>>>>>> your
>>>>>>>>>>>>> actual
>>>>>>>>>>>>> code.
>>>>>>>>>>>>>        >>
>>>>>>>>>>>>>        >> What analyzer are you using? If you are using
>>>>>>>>>>>>> StandardAnalyzer,
>>>>>>>>>>>>>        then
>>>>>>>>>>>>>        >> all of your terms while indexing will be
>>>>>>>>>>>>> lowercased,
>>>>>>>>>>>>> AFAIK, but
>>>>>>>>>>>>>        your
>>>>>>>>>>>>>        >> query will not be analyzed until you run a
>>>>>>>>>>>>> QueryParser on it.
>>>>>>>>>>>>>        >>
>>>>>>>>>>>>>        >>
>>>>>>>>>>>>>        >> Atri
>>>>>>>>>>>>>        >>
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        >
>>
>
Re: FuzzyQuery- why is it ignored? [ In reply to ]
Hi,

> Ok, i think only this very specific only "mains" has an issue.

That looks strange to me. I ran some tests locally.

1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES".

2a. This query string (just copied from your Case #3) worked correctly
for me as far as I can see.
+contentDFLT:mains~2 +contentDFLT:"nashua",
+contentDFLT:"new-hampshire", +contentDFLT:"united state"

2b. However this query string got no results.
+contentDFLT:"mains~2", +contentDFLT:"nashua",
+contentDFLT:"new-hampshire", +contentDFLT:"united states"
This is expected behaviour, because the classic query parser does not
support a fuzzy query inside a phrase query (as far as I know).

I suspect you are using the fuzzy query operator (~) inside a phrase
query ("), as in case 2b.
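
As a rough illustration of what the ~2 operator is asked to do, here is a minimal sketch of plain Levenshtein distance in Java. It is not how Lucene matches fuzzy terms internally, and the class and method names (EditDistanceSketch, levenshtein) are made up for this example. It only shows that the lowercased query term "mains" is one edit away from the indexed term "main", whereas an unanalyzed "MAINS" (for example, passed straight to a FuzzyQuery without going through the analyzer) is five edits away and so falls outside a maximum of 2 edits.

```java
import java.util.Locale;

// Illustrative only: plain Levenshtein distance, not Lucene's own fuzzy matching.
final class EditDistanceSketch {

    // Classic dynamic-programming Levenshtein distance using two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning an empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into an empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insertion, deletion
                        prev[j - 1] + cost);                    // substitution or match
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Assumption: the index holds the lowercased term, as StandardAnalyzer would produce.
        String indexed = "main";
        System.out.println(levenshtein("mains", indexed));
        System.out.println(levenshtein("MAINS", indexed));
        System.out.println(levenshtein("MAINS".toLowerCase(Locale.ROOT), indexed));
    }
}
```

Under these assumptions, a term that has gone through the analyzer (or been lowercased by hand) stays within the fuzzy limit, while a raw uppercase term does not.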

FYI: there is a special parser for such complex phrase queries.
https://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html

Tomoko

2019-06-13 (Thu) 6:16 <baris.kazar@oracle.com>:
>
> Ok, i think only this very specific only "mains" has an issue.
>
> all i knew about Lucene was fine :) Great...
>
> i have one more question:
>
> which one is advised: FuzzyQuery, or QueryParser with ~ appended to the search string?
>
> The second one will go through analyzer and make search string lowercase.
>
> Best regards
>
>
> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
>
> Hi again,-
>
> this is really interesting and i hope i am missing something. The index lowercases all entries, so case sensitivity is not an issue, i think.
>
> Case #1:
>
> org.apache.lucene.queryparser.classic.QueryParser parser = new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
> Query q1 = null;
> try {
> q1 = parser.parse("Main");
> } catch (ParseException e) {
> e.printStackTrace();
> }
> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>
>
> This produces the following:
>
> query plan:
>
> [+contentDFLT:main, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>
> testQuerySearch1 Time to compute: 0 seconds (copied answer after exec finished)
>
> Number of results: 12
> Name: Main Dunstable Rd
> Score: 41.204945
> ID: 12677400
> Country Code: US
> Coordinates: 42.72631, -71.50269
> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.204945
> ID: 12681980
> Country Code: US
> Coordinates: 42.76416, -71.46681
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.204945
> ID: 12681973
> Country Code: US
> Coordinates: 42.75045, -71.4607
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.204945
> ID: 12681974
> Country Code: US
> Coordinates: 42.76019, -71.465
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main Dunstable Rd
> Score: 41.204945
> ID: 12677399
> Country Code: US
> Coordinates: 42.74641, -71.48943
> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: S Main St
> Score: 41.204945
> ID: 11893215
> Country Code: US
> Coordinates: 42.73412, -71.44797
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.204945
> ID: 12681978
> Country Code: US
> Coordinates: 42.73492, -71.44951
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: S Main St
> Score: 41.204945
> ID: 11893214
> Country Code: US
> Coordinates: 42.73958, -71.45895
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.204945
> ID: 12681979
> Country Code: US
> Coordinates: 42.76416, -71.46681
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.204945
> ID: 12681977
> Country Code: US
> Coordinates: 42.747, -71.45957
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
>
>
> Case #2
>
> When i did this it also worked by adding ~ to make it Fuzzy query to Main word:
>
> org.apache.lucene.queryparser.classic.QueryParser parser = new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
> Query q1 = null;
> try {
> q1 = parser.parse("Main~");
> } catch (ParseException e) {
> e.printStackTrace();
> }
> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>
>
> query plan:
>
> [+contentDFLT:main~2, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>
> testQuerySearch1 Time to compute: 24 seconds (due to debugging stops)
> Number of results: 12
> Name: Main Dunstable Rd
> Score: 41.06405
> ID: 12677400
> Country Code: US
> Coordinates: 42.72631, -71.50269
> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.06405
> ID: 12681980
> Country Code: US
> Coordinates: 42.76416, -71.46681
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.06405
> ID: 12681973
> Country Code: US
> Coordinates: 42.75045, -71.4607
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.06405
> ID: 12681974
> Country Code: US
> Coordinates: 42.76019, -71.465
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main Dunstable Rd
> Score: 41.06405
> ID: 12677399
> Country Code: US
> Coordinates: 42.74641, -71.48943
> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: S Main St
> Score: 41.06405
> ID: 11893215
> Country Code: US
> Coordinates: 42.73412, -71.44797
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.06405
> ID: 12681978
> Country Code: US
> Coordinates: 42.73492, -71.44951
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: S Main St
> Score: 41.06405
> ID: 11893214
> Country Code: US
> Coordinates: 42.73958, -71.45895
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.06405
> ID: 12681979
> Country Code: US
> Coordinates: 42.76416, -71.46681
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.06405
> ID: 12681977
> Country Code: US
> Coordinates: 42.747, -71.45957
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
>
>
>
> Case #3
>
> But why does this not work in fuzzy mode when i misspell the term slightly (1 edit away)? As You saw, the data is there with the Main spelling:
>
> org.apache.lucene.queryparser.classic.QueryParser parser = new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
>
> Query q1 = null;
> try {
> q1 = parser.parse("Mains~"); // 1 edit away
> } catch (ParseException e) {
> e.printStackTrace();
> }
> booleanQuery.add(q1, BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>
> query plan:
>
> [+contentDFLT:mains~2, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>
> testQuerySearch1 Time to compute: 23 seconds (due to debugging stops)
>
> Number of results: 0
>
>
>
> Case #4
>
> Then i changed q1 from MUST to SHOULD above, and i think the fuzzy query is ignored here, since there is no MAIN in the first 468 results:
>
> there is no boost for Mains term here.
>
> query plan:
>
> [contentDFLT:mains~2, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>
> testQuerySearch1 Time to compute: 125 seconds (due to debugging stops)
> Number of results: 1794
> Name: Nashua Dr
> Score: 34.186226
> ID: 4974936
> Country Code: US
> Coordinates: 42.7636, -71.46063
> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Nashua River Rail Trl
> Score: 34.186226
> ID: 4975508
> Country Code: US
> Coordinates: 42.7062, -71.53962
> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Nashua Rd
> Score: 33.84896
> ID: 4975388
> Country Code: US
> Coordinates: 42.78746, -71.92823
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: NASHUA
> Score: 33.84896
> ID: 21014865
> Country Code: US
> Coordinates: 42.75873, -71.46438
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: NASHUA
> Score: 33.84896
> ID: 21014865
> Country Code: US
> Coordinates: 42.75873, -71.46438
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: NASHUA
> Score: 33.84896
> ID: 21014865
> Country Code: US
> Coordinates: 42.75873, -71.46438
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: NASHUA
> Score: 33.84896
> ID: 21014865
> Country Code: US
> Coordinates: 42.75873, -71.46438
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: NASHUA
> Score: 33.84896
> ID: 21014865
> Country Code: US
> Coordinates: 42.75873, -71.46438
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Nashua St
> Score: 33.84896
> ID: 4975671
> Country Code: US
> Coordinates: 42.88471, -70.81687
> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
>
> Name: Nashua Rd
> Score: 33.84896
> ID: 4975400
> Country Code: US
> Coordinates: 42.79014, -71.92364
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
>
> Why is the fuzzy query ignored?
> Even if i have separate fields for street, city,region, country, this fuzzy query issue will come into place for words with multiple parts like main dunstable etc., right?
>
> Best regards
>
> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
>
> Tomoko,-
>
> Thank You for Your suggestions. i am trying to understand it and i thought i did :)
>
> but it does not work with FuzzyQuery when i used with a *single* large TextField like street=...value... city=...value... region=...value... country=...value... (with or without quotes for the values)
>
> What i knew about Lucene fuzzy queries does not hold with this TextField form. That is why i suspected a bug.
>
> 1. Yes, i saw and have a solid proof on that now.
>
> 2. Yes, but FuzzyQuery takes the quotes as they are, since they are escaped and the term is not analyzed.
>
> Stuffing into one textfield vs having separate fields should only affect probably the performance but not the outcome in my case.
> But, i have been thinking about this and maybe it is the way to go in this case.
>
> My CONTENT field has street names in mixed case and city, region and country names in UPPERCASE. Can this be a problem?
> i thought index stored them in lowercase since i am using StandardAnalyzer.
>
> CONTENT field also has full textfield string with street=... city=... region=... country=... (here all values are UPPERCASE).
>
> Why cant the index find the names via FuzzyQuery? i tried both FuzzyQuery and Query builder as i showed before.
>
> The last advice in Your previous email would nicely go outside the parentheses since it might be very critical :) :) :)
>
> Best regards
>
>
> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>
> I'd suggest to correctly understand the way a software works before
> suspecting its bug :-)
>
> I guess you may miss two points:
>
> 1. the standard analyzer (standard tokenizer) breaks words by double
> quote (U+0022) so quotes are not indexed or searched at all if you are
> using standard analyzer. (That is the reason you have same results
> with or without quotes.)
> See: https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_core_org_apache_lucene_analysis_standard_StandardTokenizer.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=8E2lp1YIGM-3v3FspeieGl8z8rEBs6qioTudtFNzh8c&e=
> and https://urldefense.proofpoint.com/v2/url?u=http-3A__unicode.org_reports_tr29_&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=riCZ_f25XW869CKbHPUqfbLiDU-AukE6la0xTLMw6u8&e=
>
> 2. double quote has special meaning (it's interpreted as phrase query)
> with the built-in query parser so you need to escape it if you want to
> search double quotes itself.
> See: https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Terms&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=t8OYTgidvcwNpAVFuTsqGhDJK5BwUZVCxc0mPHzqCYU&e=
>
> (My advice would be to create separate fields for each key value pairs
> instead of stuffing all pairs into one text field, if you need to
> search them separately.)
>
> 2019-06-12 (Wed) 2:39 <baris.kazar@oracle.com>:
>
> i can say that quotes is not the issue with index as it still results in
> same results with quotes or without quotes.
>
> i am starting to feel that this might be a bug maybe??
>
> Best regards
>
>
> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
>
> Somehow " is causing an issue as this should return street with MAIN:
>
> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
> states"] -> this was with fuzzyquery on MAINS
>
> Best regards
>
>
> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
>
> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> +contentDFLT:"country united states", contentDFLT:street
> contentDFLT:mains]
>
> QueryParser chops it into two pieces from
> parser.parse("street=\"MAINS\"");
>
> Index has a TextField named contentDFLT the following data :
> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
> HAMPSHIRE" country="UNITED STATES"
>
>
> When i set street=\"MAINS~\" with parser:
> i get the following
> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> +contentDFLT:"country united states", contentDFLT:street
> contentDFLT:mains]
>
> probably " quotations are messing this up as You were saying...
> Best regards
>
>
> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>
> Or, " (double quotation) in your query string may affect query parsing.
>
> When I parse this string by classic query parser (lucene 8.1),
> street="MAINS~"
> parsed (raw) query is
> text:street text:mains
> (I set the default search field to "text", so text:xxxx is appeared
> here.)
>
> Query parsing is a complex process, so it would be good to check
> parsed raw query string especially when you have (reserved) special
> characters in your query...
>
> On Tue, Jun 11, 2019 at 1:10 Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>
> Hi,
>
> I noticed one small thing in your previous mail.
>
> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>
> which is good.
>
> To specify a search field, ":" (colon) should be used instead of "=".
> See the query parser documentation:
> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Fields&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=u4SeJqH4lePhOazCLwxLEr3WqcMkODtYLv4njiKZ4PM&s=WrNfUXO9gz1PqpczTJw1vD9sWqvr76WRv2Aeo9uWqa4&e=
>
>
> I'm not sure this is related to your problem.
>
> On Tue, Jun 11, 2019 at 0:51 <baris.kazar@oracle.com> wrote:
>
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>
> org.apache.lucene.queryparser.classic.QueryParser parser = new
> org.apache.lucene.queryparser.classic.QueryParser(field,
> phraseAnalyzer) ;
> Query q1 = null;
> try {
> q1 = parser.parse("MAIN");
> } catch (ParseException e) {
>
> e.printStackTrace();
> }
> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>
> testQuerySearch2 Time to compute: 0 seconds
> Number of results: 1775
> Name: Main St
> Score: 37.20959
> ID: 12681979
> Country Code: US
> Coordinates: 42.76416, -71.46681
> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> region="NEW HAMPSHIRE" country="UNITED STATES"
>
> Name: Main St
> Score: 37.20959
> ID: 12681977
> Country Code: US
> Coordinates: 42.747, -71.45957
> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> region="NEW HAMPSHIRE" country="UNITED STATES"
>
> Name: Main St
> Score: 37.20959
> ID: 12681978
> Country Code: US
> Coordinates: 42.73492, -71.44951
> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> region="NEW HAMPSHIRE" country="UNITED STATES"
>
> when i use q1 = parser.parse("street=\"MAIN\""); i get same
> results
> which is good.
>
> But when i switch to MAINS~ then fuzzy query does not work.
>
>
> i need to say something with the q1 only in the booleanquery:
> it tries to match the MAIN in street, city, region and country
> which are
> in a single TextField field.
> But i dont want this. that is why i need to street="..." etc when
> searching.
>
> Best regards
>
>
>
> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>
> Hi,
>
> just for the basic verification, can you find the document without
> fuzzy query? I mean, does this query work for you?
>
> Query query = parser.parse("MAIN");
>
> Tomoko
>
> On Tue, Jun 11, 2019 at 0:22 <baris.kazar@oracle.com> wrote:
>
> why does the second set not work at all?
>
> it is indexed as Textfield like street="..." city="..." etc.
>
> Best regards
>
>
>
> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>
> i dont know how to use Fuzzyquery with queryparser but probably
> You
> are suggesting
>
> QueryParser parser = new QueryParser(field, analyzer) ;
> Query query = parser.parse("MAINS~2");
>
> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>
> am i right?
> Best regards
>
>
> On 6/10/19 10:47 AM, Atri Sharma wrote:
>
> I would suggest using a QueryParser for your fuzzy query before
> adding it to the Boolean query. This should weed out any case
> issues.
>
> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com
> <mailto:baris.kazar@oracle.com>> wrote:
>
> BooleanQuery.Builder booleanQuery = new
> BooleanQuery.Builder();
>
> //First set
>
> booleanQuery.add(new FuzzyQuery(new
> org.apache.lucene.index.Term(field, "MAINS")),
> BooleanClause.Occur.SHOULD);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> "NASHUA"), BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> "UNITED STATES"), BooleanClause.Occur.MUST);
>
> // Second set
> //booleanQuery.add(new FuzzyQuery(new
> org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
> BooleanClause.Occur.SHOULD);
> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>
> field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>
> field, "region=\"NEW HAMPSHIRE\""),
> BooleanClause.Occur.MUST);
> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>
> field, "country=\"UNITED STATES\""),
> BooleanClause.Occur.MUST);
>
> The first set brings also street with Nashua name.
> (NASHUA).
>
> so, to prevent that and since i also indexed with
> street="..."
> city="..." i did the second set but it does not bring
> anything.
>
> createPhraseQuery builds a Phrasequery with one term
> equal to the
> string
> in the call.
>
> Best regards
>
>
>
> On 6/10/19 10:47 AM, baris.kazar@oracle.com
> <mailto:baris.kazar@oracle.com> wrote:
> > How do i check how it is indexed? lowercase or uppercase?
> >
> > the only way right now is by testing.
> >
> > i am using standardanalyzer.
> >
> > Best regards
> >
> >
> > On 6/9/19 11:57 AM, Atri Sharma wrote:
> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
> >> <tomoko.uchida.1111@gmail.com
> <mailto:tomoko.uchida.1111@gmail.com>> wrote:
> >>> Hi,
> >>>
> >>> What analyzer do you use for the text field? Is the
> term "Main"
> >>> correctly indexed?
> >> Agreed. Also, it would be good if you could post your
> actual
> code.
> >>
> >> What analyzer are you using? If you are using
> StandardAnalyzer,
> then
> >> all of your terms while indexing will be lowercased,
> AFAIK, but
> your
> >> query will not be analyzed until you run a
> QueryParser on it.
> >>
> >>
> >> Atri
> >>

Re: FuzzyQuery- why is it ignored? [ In reply to ]
Shot in the dark: stemming. Whenever I see a problem with something ending in “s” (or “er” or “ing” or….) my first suspect is that stemming is turned on. In that case the token in the index that’s actually searched on is somewhat different from what you expect.

The test is easy: just ensure your fieldType contains no stemmers. PorterStemmer is particularly aggressive, but to test this case I’d just remove all stemming, re-index and see if the results differ.

Best,
Erick
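
[Editor's sketch, not part of Erick's mail] A fuzzy query is matched against whatever token actually sits in the index, so it can help to compute by hand how many edits separate the query term from the token you believe was indexed. The class below is a minimal, hypothetical illustration using plain Levenshtein distance; it is not Lucene's implementation (FuzzyQuery works on automata and, as far as I know, counts a transposition as one edit by default). The names FuzzyDistanceSketch, editDistance and withinEdits are made up for this example.

```java
// Illustrative only: plain Levenshtein distance, NOT Lucene's FuzzyQuery code.
class FuzzyDistanceSketch {

    /** Classic dynamic-programming Levenshtein distance between two strings. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** True if the (assumed) indexed token is within maxEdits of the query term. */
    static boolean withinEdits(String queryTerm, String indexedToken, int maxEdits) {
        return editDistance(queryTerm, indexedToken) <= maxEdits;
    }

    public static void main(String[] args) {
        // Query terms from the thread, compared with the lowercased token "main".
        String[] queries = {"main", "mains", "maink"};
        for (String q : queries) {
            System.out.println(q + " vs main: " + editDistance(q, "main"));
        }
    }
}
```

By this measure "mains" and "maink" are both one edit away from "main", so edit distance alone does not explain why only the "s" variant fails. That is why it is worth checking which token the analysis chain really wrote to the index.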

> On Jun 13, 2019, at 7:26 AM, baris.kazar@oracle.com wrote:
>
> Tomoko,-
>
> That is strange indeed.
>
> Something is wrong when i use mains, but maink, mainl, mainr, mainq and maint all work ok: any consonant at the end except s works in this case.
>
> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
>
> i am using fuzzy query with ~ from Query.builder and that is not PhraseQuery.
>
> Similarly FuzzyQuery with input "mains" (it has to be lowercase since it does not go through StandardAnalyzer) is also not PhraseQuery.
>
> can there be a clearer sample case for ComplexPhraseQuery please in the docs?
>
> did You also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES" the expected output in this case?
>
> Thanks for spending time on this, i would like to thank everyone.
>
> Best regards
>
>
> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
>> Hi,
>>
>>> Ok, i think only this very specific only "mains" has an issue.
>> It looks strange to me. I did some test locally.
>>
>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES".
>>
>> 2a. This query string (just copied from your Case #3) worked correctly
>> for me as far as I can see.
>> +contentDFLT:mains~2 +contentDFLT:"nashua",
>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
>>
>> 2b. However this query string got no results.
>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
>> It is an expected behaviour because the classic query parser does not
>> support fuzzy query inside phrase query (as far as I know).
>>
>> I suspect you use fuzzy query operator (~) inside phrase query ("), as
>> the 2b case.
>>
>> FYI: there is a special parser for such complex phrase query.
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_complexPhrase_ComplexPhraseQueryParser.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=ZcXpaSlwS5DegX76mHTb_6DH3P7noan1eeMXc-Vh5M8&s=FoIMlcjDO2b7Gut9XRx-NIBWiBQWItsj8IlylJC7Wkc&e=
>>
>> Tomoko
>>
>> On Thu, Jun 13, 2019 at 6:16 <baris.kazar@oracle.com> wrote:
>>> Ok, i think only this very specific only "mains" has an issue.
>>>
>>> all i knew about Lucene was fine :) Great...
>>>
>>> i have one more question:
>>>
>>> which one is advised to use: FuzzyQuery or the Query.parser with search string~ appended?
>>>
>>> The second one will go through analyzer and make search string lowercase.
>>>
>>> Best regards
>>>
>>>
>>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
>>>
>>> Hi again,-
>>>
>>> this is really interesting and i hope i am missing something. The index lowercases all entries, so case sensitivity is not an issue, i think.
>>>
>>> Case #1:
>>>
>>> org.apache.lucene.queryparser.classic.QueryParser parser = new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
>>> Query q1 = null;
>>> try {
>>> q1 = parser.parse("Main");
>>> } catch (ParseException e) {
>>> e.printStackTrace();
>>> }
>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>>>
>>>
>>> This brings with this:
>>>
>>> query plan:
>>>
>>> [+contentDFLT:main, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>
>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after exec finished)
>>>
>>> Number of results: 12
>>> Name: Main Dunstable Rd
>>> Score: 41.204945
>>> ID: 12677400
>>> Country Code: US
>>> Coordinates: 42.72631, -71.50269
>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main St
>>> Score: 41.204945
>>> ID: 12681980
>>> Country Code: US
>>> Coordinates: 42.76416, -71.46681
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main St
>>> Score: 41.204945
>>> ID: 12681973
>>> Country Code: US
>>> Coordinates: 42.75045, -71.4607
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main St
>>> Score: 41.204945
>>> ID: 12681974
>>> Country Code: US
>>> Coordinates: 42.76019, -71.465
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main Dunstable Rd
>>> Score: 41.204945
>>> ID: 12677399
>>> Country Code: US
>>> Coordinates: 42.74641, -71.48943
>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: S Main St
>>> Score: 41.204945
>>> ID: 11893215
>>> Country Code: US
>>> Coordinates: 42.73412, -71.44797
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main St
>>> Score: 41.204945
>>> ID: 12681978
>>> Country Code: US
>>> Coordinates: 42.73492, -71.44951
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: S Main St
>>> Score: 41.204945
>>> ID: 11893214
>>> Country Code: US
>>> Coordinates: 42.73958, -71.45895
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main St
>>> Score: 41.204945
>>> ID: 12681979
>>> Country Code: US
>>> Coordinates: 42.76416, -71.46681
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main St
>>> Score: 41.204945
>>> ID: 12681977
>>> Country Code: US
>>> Coordinates: 42.747, -71.45957
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>>
>>>
>>> Case #2
>>>
>>> When i did this it also worked by adding ~ to make it Fuzzy query to Main word:
>>>
>>> org.apache.lucene.queryparser.classic.QueryParser parser = new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
>>> Query q1 = null;
>>> try {
>>> q1 = parser.parse("Main~");
>>> } catch (ParseException e) {
>>> e.printStackTrace();
>>> }
>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>>>
>>>
>>> query plan:
>>>
>>> [+contentDFLT:main~2, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>
>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging stops)
>>> Number of results: 12
>>> Name: Main Dunstable Rd
>>> Score: 41.06405
>>> ID: 12677400
>>> Country Code: US
>>> Coordinates: 42.72631, -71.50269
>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main St
>>> Score: 41.06405
>>> ID: 12681980
>>> Country Code: US
>>> Coordinates: 42.76416, -71.46681
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main St
>>> Score: 41.06405
>>> ID: 12681973
>>> Country Code: US
>>> Coordinates: 42.75045, -71.4607
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main St
>>> Score: 41.06405
>>> ID: 12681974
>>> Country Code: US
>>> Coordinates: 42.76019, -71.465
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main Dunstable Rd
>>> Score: 41.06405
>>> ID: 12677399
>>> Country Code: US
>>> Coordinates: 42.74641, -71.48943
>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: S Main St
>>> Score: 41.06405
>>> ID: 11893215
>>> Country Code: US
>>> Coordinates: 42.73412, -71.44797
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main St
>>> Score: 41.06405
>>> ID: 12681978
>>> Country Code: US
>>> Coordinates: 42.73492, -71.44951
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: S Main St
>>> Score: 41.06405
>>> ID: 11893214
>>> Country Code: US
>>> Coordinates: 42.73958, -71.45895
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main St
>>> Score: 41.06405
>>> ID: 12681979
>>> Country Code: US
>>> Coordinates: 42.76416, -71.46681
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Main St
>>> Score: 41.06405
>>> ID: 12681977
>>> Country Code: US
>>> Coordinates: 42.747, -71.45957
>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>>
>>>
>>>
>>> Case #3
>>>
>>> But why does this not work with fuzzy mode and i misspelled a bit (1 edit away) and as You saw the data is there with Main spelling:
>>>
>>> org.apache.lucene.queryparser.classic.QueryParser parser = new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
>>>
>>> Query q1 = null;
>>> try {
>>> q1 = parser.parse("Mains~"); // 1 edit away
>>> } catch (ParseException e) {
>>> e.printStackTrace();
>>> }
>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>>>
>>> query plan:
>>>
>>> [+contentDFLT:mains~2, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>
>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging stops)
>>>
>>> Number of results: 0
>>>
>>>
>>>
>>> Case #4
>>>
>>> Then i changed q1 to SHOULD from MUST above, and i think the fuzzy query is ignored here since there is no MAIN in the first 468 results:
>>>
>>> there is no boost for Mains term here.
>>>
>>> query plan:
>>>
>>> [contentDFLT:mains~2, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>
>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging stops)
>>> Number of results: 1794
>>> Name: Nashua Dr
>>> Score: 34.186226
>>> ID: 4974936
>>> Country Code: US
>>> Coordinates: 42.7636, -71.46063
>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Nashua River Rail Trl
>>> Score: 34.186226
>>> ID: 4975508
>>> Country Code: US
>>> Coordinates: 42.7062, -71.53962
>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Nashua Rd
>>> Score: 33.84896
>>> ID: 4975388
>>> Country Code: US
>>> Coordinates: 42.78746, -71.92823
>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: NASHUA
>>> Score: 33.84896
>>> ID: 21014865
>>> Country Code: US
>>> Coordinates: 42.75873, -71.46438
>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: NASHUA
>>> Score: 33.84896
>>> ID: 21014865
>>> Country Code: US
>>> Coordinates: 42.75873, -71.46438
>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: NASHUA
>>> Score: 33.84896
>>> ID: 21014865
>>> Country Code: US
>>> Coordinates: 42.75873, -71.46438
>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: NASHUA
>>> Score: 33.84896
>>> ID: 21014865
>>> Country Code: US
>>> Coordinates: 42.75873, -71.46438
>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: NASHUA
>>> Score: 33.84896
>>> ID: 21014865
>>> Country Code: US
>>> Coordinates: 42.75873, -71.46438
>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Nashua St
>>> Score: 33.84896
>>> ID: 4975671
>>> Country Code: US
>>> Coordinates: 42.88471, -70.81687
>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
>>>
>>> Name: Nashua Rd
>>> Score: 33.84896
>>> ID: 4975400
>>> Country Code: US
>>> Coordinates: 42.79014, -71.92364
>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>
>>>
>>> Why is the fuzzy query ignored?
>>> Even if i have separate fields for street, city,region, country, this fuzzy query issue will come into place for words with multiple parts like main dunstable etc., right?
>>>
>>> Best regards
>>>
>>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
>>>
>>> Tomoko,-
>>>
>>> Thank You for Your suggestions. i am trying to understand it and i thought i did :)
>>>
>>> but it does not work with FuzzyQuery when i used with a *single* large TextField like street=...value... city=...value... region=...value... country=...value... (with or without quotes for the values)
>>>
>>> What i knew about Lucene fuzzy queries are not holding now with this Textfield form. That is why i suspected of a bug.
>>>
>>> 1. Yes, i saw and have a solid proof on that now.
>>>
>>> 2. yes but FuzzyQuery takes quotes as they are as they are escaped and it is not analyzed.
>>>
>>> Stuffing into one textfield vs having separate fields should only affect probably the performance but not the outcome in my case.
>>> But, i have been thinking about this and maybe it is the way to go in this case.
>>>
>>> My CONTENT field has street names in mixed case and city, region, country names in UPPERCASE. Can this be a problem?
>>> i thought index stored them in lowercase since i am using StandardAnalyzer.
>>>
>>> CONTENT field also has full textfield string with street=... city=... region=... country=... (here all values are UPPERCASE).
>>>
>>> Why cant the index find the names via FuzzyQuery? i tried both FuzzyQuery and Query builder as i showed before.
>>>
>>> The last advice in Your previous email would nicely go outside the parentheses since it might be very critical :) :) :)
>>>
>>> Best regards
>>>
>>>
>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>>>
>>> I'd suggest to correctly understand the way a software works before
>>> suspecting its bug :-)
>>>
>>> I guess you may miss two points:
>>>
>>> 1. the standard analyzer (standard tokenizer) breaks words by double
>>> quote (U+0022) so quotes are not indexed or searched at all if you are
>>> using standard analyzer. (That is the reason you have same results
>>> with or without quotes.)
>>> See: https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_core_org_apache_lucene_analysis_standard_StandardTokenizer.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=8E2lp1YIGM-3v3FspeieGl8z8rEBs6qioTudtFNzh8c&e=
>>> and https://urldefense.proofpoint.com/v2/url?u=http-3A__unicode.org_reports_tr29_&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=riCZ_f25XW869CKbHPUqfbLiDU-AukE6la0xTLMw6u8&e=
>>>
>>> 2. double quote has special meaning (it's interpreted as phrase query)
>>> with the built-in query parser so you need to escape it if you want to
>>> search double quotes itself.
>>> See: https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Terms&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=t8OYTgidvcwNpAVFuTsqGhDJK5BwUZVCxc0mPHzqCYU&e=
>>>
>>> (My advice would be to create separate fields for each key value pairs
>>> instead of stuffing all pairs into one text field, if you need to
>>> search them separately.)
>>>
>>> On Wed, Jun 12, 2019 at 2:39 <baris.kazar@oracle.com> wrote:
>>>
>>> i can say that quotes is not the issue with index as it still results in
>>> same results with quotes or without quotes.
>>>
>>> i am starting to feel that this might be a bug maybe??
>>>
>>> Best regards
>>>
>>>
>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
>>>
>>> Somehow " is causing an issue as this should return street with MAIN:
>>>
>>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
>>> states"] -> this was with fuzzyquery on MAINS
>>>
>>> Best regards
>>>
>>>
>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
>>>
>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>> +contentDFLT:"country united states", contentDFLT:street
>>> contentDFLT:mains]
>>>
>>> QueryParser chops it into two pieces from
>>> parser.parse("street=\"MAINS\"");
>>>
>>> Index has a TextField named contentDFLT the following data :
>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
>>> HAMPSHIRE" country="UNITED STATES"
>>>
>>>
>>> When i set street=\"MAINS~\" with parser:
>>> i get the following
>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>> +contentDFLT:"country united states", contentDFLT:street
>>> contentDFLT:mains]
>>>
>>> probably " quotations are messing this up as You were saying...
>>> Best regards
>>>
>>>
>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>>>
>>> Or, " (double quotation) in your query string may affect query parsing.
>>>
>>> When I parse this string by classic query parser (lucene 8.1),
>>> street="MAINS~"
>>> parsed (raw) query is
>>> text:street text:mains
>>> (I set the default search field to "text", so text:xxxx is appeared
>>> here.)
>>>
>>> Query parsing is a complex process, so it would be good to check
>>> parsed raw query string especially when you have (reserved) special
>>> characters in your query...
>>>
>>> On Tue, Jun 11, 2019 at 1:10 Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I noticed one small thing in your previous mail.
>>>
>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>>>
>>> which is good.
>>>
>>> To specify a search field, ":" (colon) should be used instead of "=".
>>> See the query parser documentation:
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Fields&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=u4SeJqH4lePhOazCLwxLEr3WqcMkODtYLv4njiKZ4PM&s=WrNfUXO9gz1PqpczTJw1vD9sWqvr76WRv2Aeo9uWqa4&e=
>>>
>>>
>>> I'm not sure this is related to your problem.
>>>
>>> On Tue, Jun 11, 2019 at 0:51 <baris.kazar@oracle.com> wrote:
>>>
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>
>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>> phraseAnalyzer) ;
>>> Query q1 = null;
>>> try {
>>> q1 = parser.parse("MAIN");
>>> } catch (ParseException e) {
>>>
>>> e.printStackTrace();
>>> }
>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>>
>>> testQuerySearch2 Time to compute: 0 seconds
>>> Number of results: 1775
>>> Name: Main St
>>> Score: 37.20959
>>> ID: 12681979
>>> Country Code: US
>>> Coordinates: 42.76416, -71.46681
>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>
>>> Name: Main St
>>> Score: 37.20959
>>> ID: 12681977
>>> Country Code: US
>>> Coordinates: 42.747, -71.45957
>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>
>>> Name: Main St
>>> Score: 37.20959
>>> ID: 12681978
>>> Country Code: US
>>> Coordinates: 42.73492, -71.44951
>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>
>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same
>>> results
>>> which is good.
>>>
>>> But when i switch to MAINS~ then fuzzy query does not work.
>>>
>>>
>>> i need to say something with the q1 only in the booleanquery:
>>> it tries to match the MAIN in street, city, region and country
>>> which are
>>> in a single TextField field.
>>> But i dont want this. that is why i need to street="..." etc when
>>> searching.
>>>
>>> Best regards
>>>
>>>
>>>
>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>>
>>> Hi,
>>>
>>> just for the basic verification, can you find the document without
>>> fuzzy query? I mean, does this query work for you?
>>>
>>> Query query = parser.parse("MAIN");
>>>
>>> Tomoko
>>>
>>> On Tue, Jun 11, 2019 at 0:22 <baris.kazar@oracle.com> wrote:
>>>
>>> why does the second set not work at all?
>>>
>>> it is indexed as Textfield like street="..." city="..." etc.
>>>
>>> Best regards
>>>
>>>
>>>
>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>>
>>> i dont know how to use Fuzzyquery with queryparser but probably
>>> You
>>> are suggesting
>>>
>>> QueryParser parser = new QueryParser(field, analyzer) ;
>>> Query query = parser.parse("MAINS~2");
>>>
>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>>
>>> am i right?
>>> Best regards
>>>
>>>
>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>
>>> I would suggest using a QueryParser for your fuzzy query before
>>> adding it to the Boolean query. This should weed out any case
>>> issues.
>>>
>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com
>>> <mailto:baris.kazar@oracle.com>> wrote:
>>>
>>> BooleanQuery.Builder booleanQuery = new
>>> BooleanQuery.Builder();
>>>
>>> //First set
>>>
>>> booleanQuery.add(new FuzzyQuery(new
>>> org.apache.lucene.index.Term(field, "MAINS")),
>>> BooleanClause.Occur.SHOULD);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>> "NASHUA"), BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>
>>> // Second set
>>> //booleanQuery.add(new FuzzyQuery(new
>>> org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
>>> BooleanClause.Occur.SHOULD);
>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>
>>> field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>
>>> field, "region=\"NEW HAMPSHIRE\""),
>>> BooleanClause.Occur.MUST);
>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>
>>> field, "country=\"UNITED STATES\""),
>>> BooleanClause.Occur.MUST);
>>>
>>> The first set brings also street with Nashua name.
>>> (NASHUA).
>>>
>>> so, to prevent that and since i also indexed with
>>> street="..."
>>> city="..." i did the second set but it does not bring
>>> anything.
>>>
>>> createPhraseQuery builds a Phrasequery with one term
>>> equal to the
>>> string
>>> in the call.
>>>
>>> Best regards
>>>
>>>
>>>
>>> On 6/10/19 10:47 AM, baris.kazar@oracle.com
>>> <mailto:baris.kazar@oracle.com> wrote:
>>> > How do i check how it is indexed? lowercase or uppercase?
>>> >
>>> > the only way right now is by testing.
>>> >
>>> > i am using standardanalyzer.
>>> >
>>> > Best regards
>>> >
>>> >
>>> > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>> >> <tomoko.uchida.1111@gmail.com> wrote:
>>> >>> Hi,
>>> >>>
>>> >>> What analyzer do you use for the text field? Is the term "Main"
>>> >>> correctly indexed?
>>> >> Agreed. Also, it would be good if you could post your actual code.
>>> >>
>>> >> What analyzer are you using? If you are using StandardAnalyzer, then
>>> >> all of your terms while indexing will be lowercased, AFAIK, but your
>>> >> query will not be analyzed until you run a QueryParser on it.
>>> >>
>>> >>
>>> >> Atri
>>> >>


Re: FuzzyQuery- why is it ignored? [ In reply to ]
Tomoko,-

 That is strange indeed.

Something is wrong only when i use mains: maink, mainl, mainr, mainq and maint
all work ok. Any consonant at the end works in this case, except s.
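
For reference, a minimal sketch of the edit-distance arithmetic behind this observation. It is plain Levenshtein distance in standalone Java, not Lucene's own implementation (FuzzyQuery, as far as I know, additionally counts a transposition as a single edit by default), and the class name EditDistanceSketch is made up for illustration. It shows that mains, maink and maint are each one edit away from main, so edit distance alone would not single out the trailing s.

```java
final class EditDistanceSketch {
    // Classic Levenshtein distance (insert, delete, substitute) using two rolling rows.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int viaInsert = curr[j - 1] + 1;
                int viaDelete = prev[j] + 1;
                int viaSubstitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(viaInsert, viaDelete), viaSubstitute);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Compare each misspelling from the mail against the indexed word.
        for (String query : new String[] {"mains", "maink", "maint"}) {
            System.out.println(query + " vs main: " + distance(query, "main"));
        }
    }
}
```

Since all three misspellings sit at the same distance, a difference in behaviour between them has to come from somewhere other than the distance computation itself.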

Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".

i am using fuzzy query with ~ from Query.builder and that is not
PhraseQuery.

Similarly FuzzyQuery with input "mains" (it has to be lowercase since it
does not go through StandardAnalyzer) is also not PhraseQuery.
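
A small standalone sketch of why a term handed directly to FuzzyQuery has to be lowercased by hand. The term set below is a hypothetical stand-in for what a lowercasing analyzer would have stored for one document; it is not read from a real index, and TermCaseSketch is an illustrative name, not a Lucene class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class TermCaseSketch {
    // Assumed index contents for "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"
    // after a lowercasing analyzer has processed it.
    static final Set<String> INDEXED_TERMS = new HashSet<>(Arrays.asList(
            "main", "nashua", "hillsborough", "new", "hampshire", "united", "states"));

    // Term lookup is an exact, case-sensitive comparison.
    static boolean termExists(String term) {
        return INDEXED_TERMS.contains(term);
    }

    // What the caller must do when no analyzer sits between the input and the Term.
    static String normalize(String rawTerm) {
        return rawTerm.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println("MAIN as typed:   " + termExists("MAIN"));
        System.out.println("MAIN normalized: " + termExists(normalize("MAIN")));
    }
}
```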

Could there be a clearer sample case for ComplexPhraseQuery in the docs,
please?

Did You also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
STATES", which is the expected output in this case?

Thanks for spending time on this, i would like to thank everyone.

Best regards


On 6/13/19 12:13 AM, Tomoko Uchida wrote:
> Hi,
>
>> Ok, i think only this very specific only "mains" has an issue.
> It looks strange to me. I did some test locally.
>
> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES".
>
> 2a. This query string (just copied from your Case #3) worked correctly
> for me as far as I can see.
> +contentDFLT:mains~2 +contentDFLT:"nashua",
> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
>
> 2b. However this query string got no results.
> +contentDFLT:"mains~2", +contentDFLT:"nashua",
> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
> It is an expected behaviour because the classic query parser does not
> support fuzzy query inside phrase query (as far as I know).
>
> I suspect you use fuzzy query operator (~) inside phrase query ("), as
> the 2b case.
>
> FYI: there is a special parser for such complex phrase query.
> https://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html
>
> Tomoko
>
> On Thu, Jun 13, 2019 at 6:16, <baris.kazar@oracle.com> wrote:
>> Ok, i think only this very specific only "mains" has an issue.
>>
>> all i knew about Lucene was fine :) Great...
>>
>> i have one more question:
>>
>> which one is advised to use: FuzzyQuery or the Query.parser with search string~ appended?
>>
>> The second one will go through analyzer and make search string lowercase.
>>
>> Best regards
>>
>>
>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
>>
>> Hi again,-
>>
>> this is really interesting and i hope i am missing something. The index lowercases all entries, so case sensitivity is not an issue, i think.
>>
>> Case #1:
>>
>> org.apache.lucene.queryparser.classic.QueryParser parser =
>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>> Query q1 = null;
>> try {
>>     q1 = parser.parse("Main");
>> } catch (ParseException e) {
>>     e.printStackTrace();
>> }
>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>>
>>
>> This returns the following:
>>
>> query plan:
>>
>> [+contentDFLT:main, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>
>> testQuerySearch1 Time to compute: 0 seconds (copied answer after exec finished)
>>
>> Number of results: 12
>> Name: Main Dunstable Rd
>> Score: 41.204945
>> ID: 12677400
>> Country Code: US
>> Coordinates: 42.72631, -71.50269
>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main St
>> Score: 41.204945
>> ID: 12681980
>> Country Code: US
>> Coordinates: 42.76416, -71.46681
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main St
>> Score: 41.204945
>> ID: 12681973
>> Country Code: US
>> Coordinates: 42.75045, -71.4607
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main St
>> Score: 41.204945
>> ID: 12681974
>> Country Code: US
>> Coordinates: 42.76019, -71.465
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main Dunstable Rd
>> Score: 41.204945
>> ID: 12677399
>> Country Code: US
>> Coordinates: 42.74641, -71.48943
>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: S Main St
>> Score: 41.204945
>> ID: 11893215
>> Country Code: US
>> Coordinates: 42.73412, -71.44797
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main St
>> Score: 41.204945
>> ID: 12681978
>> Country Code: US
>> Coordinates: 42.73492, -71.44951
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: S Main St
>> Score: 41.204945
>> ID: 11893214
>> Country Code: US
>> Coordinates: 42.73958, -71.45895
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main St
>> Score: 41.204945
>> ID: 12681979
>> Country Code: US
>> Coordinates: 42.76416, -71.46681
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main St
>> Score: 41.204945
>> ID: 12681977
>> Country Code: US
>> Coordinates: 42.747, -71.45957
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>>
>>
>> Case #2
>>
>> Adding ~ to turn the Main term into a fuzzy query also worked:
>>
>> org.apache.lucene.queryparser.classic.QueryParser parser =
>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>> Query q1 = null;
>> try {
>>     q1 = parser.parse("Main~");
>> } catch (ParseException e) {
>>     e.printStackTrace();
>> }
>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>>
>>
>> query plan:
>>
>> [+contentDFLT:main~2, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>
>> testQuerySearch1 Time to compute: 24 seconds (due to debugging stops)
>> Number of results: 12
>> Name: Main Dunstable Rd
>> Score: 41.06405
>> ID: 12677400
>> Country Code: US
>> Coordinates: 42.72631, -71.50269
>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main St
>> Score: 41.06405
>> ID: 12681980
>> Country Code: US
>> Coordinates: 42.76416, -71.46681
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main St
>> Score: 41.06405
>> ID: 12681973
>> Country Code: US
>> Coordinates: 42.75045, -71.4607
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main St
>> Score: 41.06405
>> ID: 12681974
>> Country Code: US
>> Coordinates: 42.76019, -71.465
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main Dunstable Rd
>> Score: 41.06405
>> ID: 12677399
>> Country Code: US
>> Coordinates: 42.74641, -71.48943
>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: S Main St
>> Score: 41.06405
>> ID: 11893215
>> Country Code: US
>> Coordinates: 42.73412, -71.44797
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main St
>> Score: 41.06405
>> ID: 12681978
>> Country Code: US
>> Coordinates: 42.73492, -71.44951
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: S Main St
>> Score: 41.06405
>> ID: 11893214
>> Country Code: US
>> Coordinates: 42.73958, -71.45895
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main St
>> Score: 41.06405
>> ID: 12681979
>> Country Code: US
>> Coordinates: 42.76416, -71.46681
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Main St
>> Score: 41.06405
>> ID: 12681977
>> Country Code: US
>> Coordinates: 42.747, -71.45957
>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>>
>>
>>
>> Case #3
>>
>> But why does this not work in fuzzy mode when i misspell the word slightly (1 edit away)? As You saw, the data is there with the spelling Main:
>>
>> org.apache.lucene.queryparser.classic.QueryParser parser =
>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>>
>> Query q1 = null;
>> try {
>>     q1 = parser.parse("Mains~"); // 1 edit away
>> } catch (ParseException e) {
>>     e.printStackTrace();
>> }
>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>>
>> query plan:
>>
>> [+contentDFLT:mains~2, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>
>> testQuerySearch1 Time to compute: 23 seconds (due to debugging stops)
>>
>> Number of results: 0
>>
>>
>>
>> Case #4
>>
>> Then i changed q1 from MUST to SHOULD, and i think the fuzzy query is ignored here, since there is no MAIN in the first 468 results:
>>
>> There is no boost for the Mains term here.
>>
>> query plan:
>>
>> [contentDFLT:mains~2, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>
>> testQuerySearch1 Time to compute: 125 seconds (due to debugging stops)
>> Number of results: 1794
>> Name: Nashua Dr
>> Score: 34.186226
>> ID: 4974936
>> Country Code: US
>> Coordinates: 42.7636, -71.46063
>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Nashua River Rail Trl
>> Score: 34.186226
>> ID: 4975508
>> Country Code: US
>> Coordinates: 42.7062, -71.53962
>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Nashua Rd
>> Score: 33.84896
>> ID: 4975388
>> Country Code: US
>> Coordinates: 42.78746, -71.92823
>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: NASHUA
>> Score: 33.84896
>> ID: 21014865
>> Country Code: US
>> Coordinates: 42.75873, -71.46438
>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: NASHUA
>> Score: 33.84896
>> ID: 21014865
>> Country Code: US
>> Coordinates: 42.75873, -71.46438
>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: NASHUA
>> Score: 33.84896
>> ID: 21014865
>> Country Code: US
>> Coordinates: 42.75873, -71.46438
>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: NASHUA
>> Score: 33.84896
>> ID: 21014865
>> Country Code: US
>> Coordinates: 42.75873, -71.46438
>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: NASHUA
>> Score: 33.84896
>> ID: 21014865
>> Country Code: US
>> Coordinates: 42.75873, -71.46438
>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>> Name: Nashua St
>> Score: 33.84896
>> ID: 4975671
>> Country Code: US
>> Coordinates: 42.88471, -70.81687
>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
>>
>> Name: Nashua Rd
>> Score: 33.84896
>> ID: 4975400
>> Country Code: US
>> Coordinates: 42.79014, -71.92364
>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>
>>
>> Why is the fuzzy query ignored?
>> Even if i have separate fields for street, city,region, country, this fuzzy query issue will come into place for words with multiple parts like main dunstable etc., right?
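
A hedged sketch of why Case #4 still returns 1794 hits. In a boolean query that has at least one MUST clause, a SHOULD clause is optional (assuming the default minimum-should-match of 0): it can only raise the score of documents that already satisfy the MUST clauses. The model below uses plain term sets rather than Lucene classes and treats each phrase as a single required term for brevity; OccurSketch and its methods are illustrative names only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class OccurSketch {
    // A document passes when every MUST term is present.
    // SHOULD terms decide the outcome only when there are no MUST terms at all.
    static boolean matches(Set<String> docTerms, List<String> must, List<String> should) {
        if (!must.isEmpty()) {
            return docTerms.containsAll(must);
        }
        return optionalHits(docTerms, should) > 0;
    }

    // Number of SHOULD terms found in the document; a crude stand-in for the score bonus.
    static int optionalHits(Set<String> docTerms, List<String> should) {
        int hits = 0;
        for (String term : should) {
            if (docTerms.contains(term)) {
                hits++;
            }
        }
        return hits;
    }

    // Whitespace split plus lowercasing, standing in for analysis.
    static Set<String> terms(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+")));
    }

    public static void main(String[] args) {
        List<String> must = Arrays.asList("nashua", "hampshire", "states");
        List<String> should = Arrays.asList("main");
        Set<String> nashuaDr = terms("NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES");
        Set<String> mainSt = terms("MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES");
        System.out.println("Nashua Dr matches: " + matches(nashuaDr, must, should)
                + ", optional hits: " + optionalHits(nashuaDr, should));
        System.out.println("Main St matches:   " + matches(mainSt, must, should)
                + ", optional hits: " + optionalHits(mainSt, should));
    }
}
```

In this model both documents pass, and only the optional hit separates them. If the SHOULD clause matches nothing, as Case #3 suggests for mains~2, every document that satisfies the MUST clauses comes back with no Main-based ordering, which would look like the fuzzy clause being ignored.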
>>
>> Best regards
>>
>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
>>
>> Tomoko,-
>>
>> Thank You for Your suggestions. i am trying to understand it and i thought i did :)
>>
>> but it does not work with FuzzyQuery when i used with a *single* large TextField like street=...value... city=...value... region=...value... country=...value... (with or without quotes for the values)
>>
>> What i knew about Lucene fuzzy queries does not hold with this TextField form. That is why i suspected a bug.
>>
>> 1. Yes, i saw and have a solid proof on that now.
>>
>> 2. Yes, but FuzzyQuery takes the quotes as they are, since they are escaped and the term is not analyzed.
>>
>> Stuffing into one textfield vs having separate fields should only affect probably the performance but not the outcome in my case.
>> But, i have been thinking about this and maybe it is the way to go in this case.
>>
>> mY CONTENT field has street names in mixed case and city, region country names in UPPERCASE. Can this be a problem?
>> i thought index stored them in lowercase since i am using StandardAnalyzer.
>>
>> CONTENT field also has full textfield string with street=... city=... region=... country=... (here all values are UPPERCASE).
>>
>> Why cant the index find the names via FuzzyQuery? i tried both FuzzyQuery and Query builder as i showed before.
>>
>> The last advice in Your previous email deserves to go outside the parentheses since it might be very critical :) :) :)
>>
>> Best regards
>>
>>
>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>>
>> I'd suggest understanding how the software works before
>> suspecting a bug :-)
>>
>> I guess you may miss two points:
>>
>> 1. the standard analyzer (standard tokenizer) breaks words by double
>> quote (U+0022) so quotes are not indexed or searched at all if you are
>> using standard analyzer. (That is the reason you have same results
>> with or without quotes.)
>> See: https://lucene.apache.org/core/8_1_0/core/org/apache/lucene/analysis/standard/StandardTokenizer.html
>> and http://unicode.org/reports/tr29/
>>
>> 2. double quote has special meaning (it's interpreted as phrase query)
>> with the built-in query parser so you need to escape it if you want to
>> search double quotes itself.
>> See: http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Terms
>>
>> (My advice would be to create separate fields for each key value pairs
>> instead of stuffing all pairs into one text field, if you need to
>> search them separately.)
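
To make point 1 concrete, here is a rough standalone approximation of what happens to the indexed text. It splits on anything that is not a letter or digit and lowercases the pieces, which is far cruder than the UAX #29 rules StandardTokenizer actually follows, so treat it as an illustration only. TokenizeSketch is a made-up name, and the input string is the sample field value quoted in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class TokenizeSketch {
    // Break text at every run of non-letter, non-digit characters and lowercase each piece.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                out.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The '=' and '"' characters act only as separators; they never become tokens.
        System.out.println(tokens("street=\"MAIN\" city=\"NASHUA\""));
    }
}
```

Under this approximation the key names (street, city) end up as ordinary tokens next to the values, and the quotes and equals signs disappear. That is consistent with the parsed query shown elsewhere in the thread, where city="NASHUA" becomes the phrase "city nashua".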
>>
>> On Wed, Jun 12, 2019 at 2:39, <baris.kazar@oracle.com> wrote:
>>
>> i can say that quotes is not the issue with index as it still results in
>> same results with quotes or without quotes.
>>
>> i am starting to feel that this might be a bug maybe??
>>
>> Best regards
>>
>>
>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
>>
>> Somehow " is causing an issue as this should return street with MAIN:
>>
>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
>> states"] -> this was with fuzzyquery on MAINS
>>
>> Best regards
>>
>>
>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
>>
>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>> +contentDFLT:"country united states", contentDFLT:street
>> contentDFLT:mains]
>>
>> QueryParser chops it into two pieces from
>> parser.parse("street=\"MAINS\"");
>>
>> Index has a TextField named contentDFLT the following data :
>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
>> HAMPSHIRE" country="UNITED STATES"
>>
>>
>> When i set street=\"MAINS~\" with parser:
>> i get the following
>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>> +contentDFLT:"country united states", contentDFLT:street
>> contentDFLT:mains]
>>
>> probably " quotations are messing this up as You were saying...
>> Best regards
>>
>>
>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>>
>> Or, " (double quotation) in your query string may affect query parsing.
>>
>> When I parse this string by classic query parser (lucene 8.1),
>> street="MAINS~"
>> parsed (raw) query is
>> text:street text:mains
>> (I set the default search field to "text", so text:xxxx is appeared
>> here.)
>>
>> Query parsing is a complex process, so it would be good to check
>> parsed raw query string especially when you have (reserved) special
>> characters in your query...
>>
>> On Tue, Jun 11, 2019 at 1:10, Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>>
>> Hi,
>>
>> I noticed one small thing in your previous mail.
>>
>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>>
>> which is good.
>>
>> To specify a search field, ":" (colon) should be used instead of "=".
>> See the query parser documentation:
>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fields
>>
>>
>> I'm not sure this is related to your problem.
>>
>> On Tue, Jun 11, 2019 at 0:51, <baris.kazar@oracle.com> wrote:
>>
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>
>> org.apache.lucene.queryparser.classic.QueryParser parser =
>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>> Query q1 = null;
>> try {
>>     q1 = parser.parse("MAIN");
>> } catch (ParseException e) {
>>     e.printStackTrace();
>> }
>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>
>> testQuerySearch2 Time to compute: 0 seconds
>> Number of results: 1775
>> Name: Main St
>> Score: 37.20959
>> ID: 12681979
>> Country Code: US
>> Coordinates: 42.76416, -71.46681
>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>
>> Name: Main St
>> Score: 37.20959
>> ID: 12681977
>> Country Code: US
>> Coordinates: 42.747, -71.45957
>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>
>> Name: Main St
>> Score: 37.20959
>> ID: 12681978
>> Country Code: US
>> Coordinates: 42.73492, -71.44951
>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>
>> when i use q1 = parser.parse("street=\"MAIN\""); i get same
>> results
>> which is good.
>>
>> But when i switch to MAINS~ then fuzzy query does not work.
>>
>>
>> i need to say something with the q1 only in the booleanquery:
>> it tries to match the MAIN in street, city, region and country
>> which are
>> in a single TextField field.
>> But i dont want this. that is why i need to street="..." etc when
>> searching.
>>
>> Best regards
>>
>>
>>
>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>
>> Hi,
>>
>> just for the basic verification, can you find the document without
>> fuzzy query? I mean, does this query work for you?
>>
>> Query query = parser.parse("MAIN");
>>
>> Tomoko
>>
>> On Tue, Jun 11, 2019 at 0:22, <baris.kazar@oracle.com> wrote:
>>
>> Why does the second set not work at all?
>>
>> it is indexed as Textfield like street="..." city="..." etc.
>>
>> Best regards
>>
>>
>>
>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>
>> i dont know how to use Fuzzyquery with queryparser but probably
>> You
>> are suggesting
>>
>> QueryParser parser = new QueryParser(field, analyzer) ;
>> Query query = parser.parse("MAINS~2");
>>
>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>
>> am i right?
>> Best regards
>>
>>
>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>
>> I would suggest using a QueryParser for your fuzzy query before
>> adding it to the Boolean query. This should weed out any case
>> issues.
>>
>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com
>> <mailto:baris.kazar@oracle.com>> wrote:
>>
>> BooleanQuery.Builder booleanQuery = new BooleanQuery.Builder();
>>
>> // First set
>> booleanQuery.add(new FuzzyQuery(new org.apache.lucene.index.Term(field, "MAINS")),
>>     BooleanClause.Occur.SHOULD);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
>>     BooleanClause.Occur.MUST);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
>>     BooleanClause.Occur.MUST);
>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
>>     BooleanClause.Occur.MUST);
>>
>> // Second set
>> //booleanQuery.add(new FuzzyQuery(new
>> org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
>> BooleanClause.Occur.SHOULD);
>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>
>> field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>
>> field, "region=\"NEW HAMPSHIRE\""),
>> BooleanClause.Occur.MUST);
>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>
>> field, "country=\"UNITED STATES\""),
>> BooleanClause.Occur.MUST);
>>
>> The first set brings also street with Nashua name.
>> (NASHUA).
>>
>> so, to prevent that and since i also indexed with
>> street="..."
>> city="..." i did the second set but it does not bring
>> anything.
>>
>> createPhraseQuery builds a Phrasequery with one term
>> equal to the
>> string
>> in the call.
>>
>> Best regards
>>
>>
>>
>> On 6/10/19 10:47 AM, baris.kazar@oracle.com
>> <mailto:baris.kazar@oracle.com> wrote:
>> > How do i check how it is indexed? Lowercase or uppercase?
>> >
>> > The only way now is by testing.
>> >
>> > i am using standardanalyzer.
>> >
>> > Best regards
>> >
>> >
>> > On 6/9/19 11:57 AM, Atri Sharma wrote:
>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>> >> <tomoko.uchida.1111@gmail.com> wrote:
>> >>> Hi,
>> >>>
>> >>> What analyzer do you use for the text field? Is the term "Main"
>> >>> correctly indexed?
>> >> Agreed. Also, it would be good if you could post your actual code.
>> >>
>> >> What analyzer are you using? If you are using StandardAnalyzer, then
>> >> all of your terms while indexing will be lowercased, AFAIK, but your
>> >> query will not be analyzed until you run a QueryParser on it.
>> >>
>> >>
>> >> Atri
>> >>


Re: FuzzyQuery- why is it ignored? [ In reply to ]
Erick,

Cool, could You give a simple example with my example please?
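
Not a reply from Erick, just a toy sketch of what his stemming hypothesis would look like on this example. The rule below only strips a trailing "s" and is nowhere near the real Porter algorithm; PluralStemSketch is an illustrative name. If a stemmer of this kind were in the analysis chain, both MAIN and a query word like mains would be reduced to the same token. Note that the query plans quoted in this thread show "united states" rather than "united state", which suggests, but does not prove, that no stemmer is active here.

```java
import java.util.Locale;

final class PluralStemSketch {
    // Toy rule: lowercase, then drop one trailing "s" from words longer than three
    // characters, unless the word ends in "ss". Real stemmers apply many more rules.
    static String stem(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        if (lower.length() > 3 && lower.endsWith("s") && !lower.endsWith("ss")) {
            return lower.substring(0, lower.length() - 1);
        }
        return lower;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"MAIN", "mains", "STATES", "maink"}) {
            System.out.println(word + " -> " + stem(word));
        }
    }
}
```

Comparing the tokens the real analyzer emits for a sample document against output like this is one way to confirm or rule out stemming without re-indexing.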

Best regards



On 6/13/19 10:12 AM, Erick Erickson wrote:
> Shot in the dark: stemming. Whenever I see a problem with something ending in “s” (or “er” or “ing” or….) my first suspect is that stemming is turned on. In that case the token in the index that’s actually searched on is somewhat different than you expect.
>
> The test is easy, just ensure your fieldType contains no stemmers. PorterStemmer is particularly aggressive, but to test this case I’d just remove all stemming, re-index and see if the results differ.
>
> Best,
> Erick
>
>> On Jun 13, 2019, at 7:26 AM, baris.kazar@oracle.com wrote:
>>
>> Tomoko,-
>>
>> That is strange indeed.
>>
>> Something is wrong only when i use mains: maink, mainl, mainr, mainq and maint all work ok. Any consonant at the end works in this case, except s.
>>
>> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
>>
>> i am using fuzzy query with ~ from Query.builder and that is not PhraseQuery.
>>
>> Similarly FuzzyQuery with input "mains" (it has to be lowercase since it does not go through StandardAnalyzer) is also not PhraseQuery.
>>
>> Could there be a clearer sample case for ComplexPhraseQuery in the docs, please?
>>
>> Did You also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES", which is the expected output in this case?
>>
>> Thanks for spending time on this, i would like to thank everyone.
>>
>> Best regards
>>
>>
>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
>>> Hi,
>>>
>>>> Ok, i think only this very specific only "mains" has an issue.
>>> It looks strange to me. I did some test locally.
>>>
>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES".
>>>
>>> 2a. This query string (just copied from your Case #3) worked correctly
>>> for me as far as I can see.
>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
>>>
>>> 2b. However this query string got no results.
>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
>>> It is an expected behaviour because the classic query parser does not
>>> support fuzzy query inside phrase query (as far as I know).
>>>
>>> I suspect you use fuzzy query operator (~) inside phrase query ("), as
>>> the 2b case.
>>>
>>> FYI: there is a special parser for such complex phrase query.
>>> https://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html
>>>
>>> Tomoko
>>>
>>> On Thu, Jun 13, 2019 at 6:16, <baris.kazar@oracle.com> wrote:
>>>> Ok, i think only this very specific only "mains" has an issue.
>>>>
>>>> all i knew about Lucene was fine :) Great...
>>>>
>>>> i have one more question:
>>>>
>>>> which one is advised to use: FuzzyQuery or the Query.parser with search string~ appended?
>>>>
>>>> The second one will go through analyzer and make search string lowercase.
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
>>>>
>>>> Hi again,-
>>>>
>>>> this is really interesting and i hope i am missing something. Index small cases all entries so case sensitivity is not an issue i think.
>>>>
>>>> Case #1:
>>>>
>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
>>>> Query q1 = null;
>>>> try {
>>>> q1 = parser.parse("Main");
>>>> } catch (ParseException e) {
>>>> e.printStackTrace();
>>>> }
>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>
>>>>
>>>> This brings with this:
>>>>
>>>> query plan:
>>>>
>>>> [+contentDFLT:main, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>
>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after exec finished)
>>>>
>>>> Number of results: 12
>>>> Name: Main Dunstable Rd
>>>> Score: 41.204945
>>>> ID: 12677400
>>>> Country Code: US
>>>> Coordinates: 42.72631, -71.50269
>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main St
>>>> Score: 41.204945
>>>> ID: 12681980
>>>> Country Code: US
>>>> Coordinates: 42.76416, -71.46681
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main St
>>>> Score: 41.204945
>>>> ID: 12681973
>>>> Country Code: US
>>>> Coordinates: 42.75045, -71.4607
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main St
>>>> Score: 41.204945
>>>> ID: 12681974
>>>> Country Code: US
>>>> Coordinates: 42.76019, -71.465
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main Dunstable Rd
>>>> Score: 41.204945
>>>> ID: 12677399
>>>> Country Code: US
>>>> Coordinates: 42.74641, -71.48943
>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: S Main St
>>>> Score: 41.204945
>>>> ID: 11893215
>>>> Country Code: US
>>>> Coordinates: 42.73412, -71.44797
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main St
>>>> Score: 41.204945
>>>> ID: 12681978
>>>> Country Code: US
>>>> Coordinates: 42.73492, -71.44951
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: S Main St
>>>> Score: 41.204945
>>>> ID: 11893214
>>>> Country Code: US
>>>> Coordinates: 42.73958, -71.45895
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main St
>>>> Score: 41.204945
>>>> ID: 12681979
>>>> Country Code: US
>>>> Coordinates: 42.76416, -71.46681
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main St
>>>> Score: 41.204945
>>>> ID: 12681977
>>>> Country Code: US
>>>> Coordinates: 42.747, -71.45957
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>>
>>>>
>>>> Case #2
>>>>
>>>> It also worked when I appended ~ to the word Main to make it a fuzzy query:
>>>>
>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
>>>> Query q1 = null;
>>>> try {
>>>> q1 = parser.parse("Main~");
>>>> } catch (ParseException e) {
>>>> e.printStackTrace();
>>>> }
>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>
>>>>
>>>> query plan:
>>>>
>>>> [+contentDFLT:main~2, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>
>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging stops)
>>>> Number of results: 12
>>>> Name: Main Dunstable Rd
>>>> Score: 41.06405
>>>> ID: 12677400
>>>> Country Code: US
>>>> Coordinates: 42.72631, -71.50269
>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main St
>>>> Score: 41.06405
>>>> ID: 12681980
>>>> Country Code: US
>>>> Coordinates: 42.76416, -71.46681
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main St
>>>> Score: 41.06405
>>>> ID: 12681973
>>>> Country Code: US
>>>> Coordinates: 42.75045, -71.4607
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main St
>>>> Score: 41.06405
>>>> ID: 12681974
>>>> Country Code: US
>>>> Coordinates: 42.76019, -71.465
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main Dunstable Rd
>>>> Score: 41.06405
>>>> ID: 12677399
>>>> Country Code: US
>>>> Coordinates: 42.74641, -71.48943
>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: S Main St
>>>> Score: 41.06405
>>>> ID: 11893215
>>>> Country Code: US
>>>> Coordinates: 42.73412, -71.44797
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main St
>>>> Score: 41.06405
>>>> ID: 12681978
>>>> Country Code: US
>>>> Coordinates: 42.73492, -71.44951
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: S Main St
>>>> Score: 41.06405
>>>> ID: 11893214
>>>> Country Code: US
>>>> Coordinates: 42.73958, -71.45895
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main St
>>>> Score: 41.06405
>>>> ID: 12681979
>>>> Country Code: US
>>>> Coordinates: 42.76416, -71.46681
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Main St
>>>> Score: 41.06405
>>>> ID: 12681977
>>>> Country Code: US
>>>> Coordinates: 42.747, -71.45957
>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>>
>>>>
>>>>
>>>> Case #3
>>>>
>>>> But why does this not work in fuzzy mode when I misspell the word slightly (1 edit away)? As you saw, the data is there with the spelling Main:
>>>>
>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
>>>>
>>>> Query q1 = null;
>>>> try {
>>>> q1 = parser.parse("Mains~"); // 1 edit away
>>>> } catch (ParseException e) {
>>>> e.printStackTrace();
>>>> }
>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>
>>>> query plan:
>>>>
>>>> [+contentDFLT:mains~2, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>
>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging stops)
>>>>
>>>> Number of results: 0
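
A quick way to sanity-check the "1 edit away" claim in Case #3 is to compute the edit distance directly. The sketch below is plain Levenshtein distance in standard Java, not Lucene code, and the class and method names are made up for illustration. Lucene's FuzzyQuery matches against the indexed terms with automata rather than comparing strings pairwise, so this only confirms that "mains" is within the ~2 budget of "main"; it does not explain why the query returns nothing.

```java
import java.util.List;

// Illustrative only: classic Levenshtein distance, not Lucene's fuzzy matching.
class EditDistanceSketch {

    /** Number of single-character inserts, deletes or substitutions turning a into b. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexedTerm = "main"; // assumed lowercased form in the index
        for (String query : List.of("main", "mains", "maint")) {
            System.out.println(query + " -> " + indexedTerm + ": " + editDistance(query, indexedTerm));
        }
    }
}
```

Both "mains" and "maint" come out one edit from "main", so edit distance alone does not separate the working spellings from the failing one.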
>>>>
>>>>
>>>>
>>>> Case #4
>>>>
>>>> Then I changed q1 from MUST to SHOULD above, and I think the fuzzy query is ignored here, since there is no MAIN in the first 468 results:
>>>>
>>>> There is no boost for the Mains term here.
>>>>
>>>> query plan:
>>>>
>>>> [contentDFLT:mains~2, +contentDFLT:"nashua", +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>
>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging stops)
>>>> Number of results: 1794
>>>> Name: Nashua Dr
>>>> Score: 34.186226
>>>> ID: 4974936
>>>> Country Code: US
>>>> Coordinates: 42.7636, -71.46063
>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Nashua River Rail Trl
>>>> Score: 34.186226
>>>> ID: 4975508
>>>> Country Code: US
>>>> Coordinates: 42.7062, -71.53962
>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Nashua Rd
>>>> Score: 33.84896
>>>> ID: 4975388
>>>> Country Code: US
>>>> Coordinates: 42.78746, -71.92823
>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: NASHUA
>>>> Score: 33.84896
>>>> ID: 21014865
>>>> Country Code: US
>>>> Coordinates: 42.75873, -71.46438
>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: NASHUA
>>>> Score: 33.84896
>>>> ID: 21014865
>>>> Country Code: US
>>>> Coordinates: 42.75873, -71.46438
>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: NASHUA
>>>> Score: 33.84896
>>>> ID: 21014865
>>>> Country Code: US
>>>> Coordinates: 42.75873, -71.46438
>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: NASHUA
>>>> Score: 33.84896
>>>> ID: 21014865
>>>> Country Code: US
>>>> Coordinates: 42.75873, -71.46438
>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: NASHUA
>>>> Score: 33.84896
>>>> ID: 21014865
>>>> Country Code: US
>>>> Coordinates: 42.75873, -71.46438
>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Nashua St
>>>> Score: 33.84896
>>>> ID: 4975671
>>>> Country Code: US
>>>> Coordinates: 42.88471, -70.81687
>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
>>>>
>>>> Name: Nashua Rd
>>>> Score: 33.84896
>>>> ID: 4975400
>>>> Country Code: US
>>>> Coordinates: 42.79014, -71.92364
>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>
>>>>
>>>> Why is the fuzzy query ignored?
>>>> Even if I have separate fields for street, city, region and country, this fuzzy query issue will still arise for names with multiple parts such as main dunstable, right?
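
On the MUST versus SHOULD difference between Case #3 and Case #4: as far as I know, when a BooleanQuery has at least one MUST clause, its SHOULD clauses are optional and can only raise the score of documents that already satisfy every MUST clause. The toy model below illustrates just that rule in standard Java. It is not Lucene code: the class name, the exact-term matching and the one-point-per-clause score are all simplifying assumptions, and fuzzy expansion is not modelled.

```java
import java.util.List;
import java.util.Set;

// Toy model of MUST/SHOULD semantics; names and scoring are hypothetical.
class OccurSketch {

    /** Every MUST term is required; SHOULD terms are required only when there is no MUST term. */
    static boolean matches(Set<String> docTerms, List<String> must, List<String> should) {
        if (!docTerms.containsAll(must)) {
            return false;
        }
        if (must.isEmpty()) {
            return should.stream().anyMatch(docTerms::contains);
        }
        return true;
    }

    /** One point per satisfied clause (a stand-in for real relevance scoring). */
    static int score(Set<String> docTerms, List<String> must, List<String> should) {
        if (!matches(docTerms, must, should)) {
            return 0;
        }
        int score = must.size();
        for (String term : should) {
            if (docTerms.contains(term)) {
                score++;
            }
        }
        return score;
    }

    public static void main(String[] args) {
        Set<String> nashuaDr = Set.of("nashua", "hillsborough");
        Set<String> mainSt = Set.of("main", "nashua", "hillsborough");
        List<String> must = List.of("nashua");

        // A SHOULD term that matches nothing leaves the result set unchanged.
        System.out.println(matches(nashuaDr, must, List.of("mains")));
        // The same term as MUST excludes the document.
        System.out.println(matches(nashuaDr, List.of("nashua", "mains"), List.of()));
        // A SHOULD term that does match only raises the score.
        System.out.println(score(mainSt, must, List.of("main")) + " vs " + score(nashuaDr, must, List.of("main")));
    }
}
```

Under this model, a fuzzy clause that expands to no term present in the candidate documents gives zero hits as MUST and leaves the MUST-only result set untouched as SHOULD, which is consistent with the 0 and 1794 results reported above.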
>>>>
>>>> Best regards
>>>>
>>>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
>>>>
>>>> Tomoko,-
>>>>
>>>> Thank you for your suggestions. I am trying to understand it, and I thought I did :)
>>>>
>>>> But it does not work with FuzzyQuery when I use a *single* large TextField like street=...value... city=...value... region=...value... country=...value... (with or without quotes around the values).
>>>>
>>>> What I knew about Lucene fuzzy queries does not hold with this TextField layout. That is why I suspected a bug.
>>>>
>>>> 1. Yes, I saw that and have solid proof of it now.
>>>>
>>>> 2. Yes, but FuzzyQuery takes the quotes as they are, since they are escaped and the term is not analyzed.
>>>>
>>>> Stuffing everything into one TextField versus having separate fields should probably affect only performance, not the outcome, in my case.
>>>> But I have been thinking about this, and maybe separate fields are the way to go here.
>>>>
>>>> My CONTENT field has street names in mixed case and city, region and country names in UPPERCASE. Can this be a problem?
>>>> I thought the index stored them in lowercase, since I am using StandardAnalyzer.
>>>>
>>>> The CONTENT field also has the full TextField string with street=... city=... region=... country=... (here all values are UPPERCASE).
>>>>
>>>> Why can't the index find the names via FuzzyQuery? I tried both FuzzyQuery and the query builder, as I showed before.
>>>>
>>>> The last piece of advice in your previous email deserves to be outside the parentheses, since it might be very critical :) :) :)
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>>>>
>>>> I'd suggest understanding how the software works before
>>>> suspecting a bug :-)
>>>>
>>>> I guess you may be missing two points:
>>>>
>>>> 1. The standard analyzer (standard tokenizer) splits words at double
>>>> quotes (U+0022), so quotes are not indexed or searched at all if you are
>>>> using the standard analyzer. (That is why you get the same results
>>>> with or without quotes.)
>>>> See: https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_core_org_apache_lucene_analysis_standard_StandardTokenizer.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=8E2lp1YIGM-3v3FspeieGl8z8rEBs6qioTudtFNzh8c&e=
>>>> and https://urldefense.proofpoint.com/v2/url?u=http-3A__unicode.org_reports_tr29_&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=riCZ_f25XW869CKbHPUqfbLiDU-AukE6la0xTLMw6u8&e=
>>>>
>>>> 2. The double quote has a special meaning (it is interpreted as a phrase query)
>>>> in the built-in query parser, so you need to escape it if you want to
>>>> search for the double quote itself.
>>>> See: https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Terms&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=t8OYTgidvcwNpAVFuTsqGhDJK5BwUZVCxc0mPHzqCYU&e=
>>>>
>>>> (My advice would be to create a separate field for each key-value pair
>>>> instead of stuffing all pairs into one text field, if you need to
>>>> search them separately.)
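
To make point 1 concrete, here is a rough stand-in for what happens to the single-field string at index time. It is only an approximation written in standard Java: it splits on anything that is not a letter or digit and lowercases the pieces, whereas the real StandardTokenizer follows the Unicode word-break rules linked above. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Rough approximation of "split on punctuation, then lowercase"; NOT Lucene's StandardTokenizer.
class RoughTokenizerSketch {

    static List<String> roughTokens(String text) {
        List<String> tokens = new ArrayList<>();
        // Treat every run of non-letter, non-digit characters (=, ", spaces) as a separator.
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(roughTokens("street=\"MAIN\" city=\"NASHUA\" region=\"NEW HAMPSHIRE\""));
    }
}
```

With this approximation the field names (street, city, region) become ordinary terms alongside the values, and the = and " characters disappear, which fits the parsed query text:street text:mains reported further down in the thread.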
>>>>
>>>> On Wed, Jun 12, 2019 at 2:39, <baris.kazar@oracle.com> wrote:
>>>>
>>>> I can say that the quotes are not the issue with the index, as I still get
>>>> the same results with or without quotes.
>>>>
>>>> I am starting to feel that this might be a bug, maybe?
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
>>>>
>>>> Somehow " is causing an issue as this should return street with MAIN:
>>>>
>>>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
>>>> states"] -> this was with fuzzyquery on MAINS
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
>>>>
>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>> +contentDFLT:"country united states", contentDFLT:street
>>>> contentDFLT:mains]
>>>>
>>>> QueryParser chops it into two pieces from
>>>> parser.parse("street=\"MAINS\"");
>>>>
>>>> The index has a TextField named contentDFLT with the following data:
>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
>>>> HAMPSHIRE" country="UNITED STATES"
>>>>
>>>>
>>>> When I set street=\"MAINS~\" with the parser,
>>>> I get the following:
>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>> +contentDFLT:"country united states", contentDFLT:street
>>>> contentDFLT:mains]
>>>>
>>>> The " quotation marks are probably messing this up, as you were saying...
>>>> Best regards
>>>>
>>>>
>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>>>>
>>>> Or, " (double quotation) in your query string may affect query parsing.
>>>>
>>>> When I parse this string with the classic query parser (Lucene 8.1),
>>>> street="MAINS~"
>>>> the parsed (raw) query is
>>>> text:street text:mains
>>>> (I set the default search field to "text", so text:xxxx appears
>>>> here.)
>>>>
>>>> Query parsing is a complex process, so it is a good idea to check the
>>>> parsed raw query string, especially when you have (reserved) special
>>>> characters in your query...
>>>>
>>>> On Tue, Jun 11, 2019 at 1:10, Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I noticed one small thing in your previous mail.
>>>>
>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>>>>
>>>> which is good.
>>>>
>>>> To specify a search field, ":" (colon) should be used instead of "=".
>>>> See the query parser documentation:
>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Fields&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=u4SeJqH4lePhOazCLwxLEr3WqcMkODtYLv4njiKZ4PM&s=WrNfUXO9gz1PqpczTJw1vD9sWqvr76WRv2Aeo9uWqa4&e=
>>>>
>>>>
>>>> I'm not sure this is related to your problem.
>>>>
>>>> On Tue, Jun 11, 2019 at 0:51, <baris.kazar@oracle.com> wrote:
>>>>
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>
>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>> phraseAnalyzer) ;
>>>> Query q1 = null;
>>>> try {
>>>> q1 = parser.parse("MAIN");
>>>> } catch (ParseException e) {
>>>>
>>>> e.printStackTrace();
>>>> }
>>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>>>
>>>> testQuerySearch2 Time to compute: 0 seconds
>>>> Number of results: 1775
>>>> Name: Main St
>>>> Score: 37.20959
>>>> ID: 12681979
>>>> Country Code: US
>>>> Coordinates: 42.76416, -71.46681
>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>
>>>> Name: Main St
>>>> Score: 37.20959
>>>> ID: 12681977
>>>> Country Code: US
>>>> Coordinates: 42.747, -71.45957
>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>
>>>> Name: Main St
>>>> Score: 37.20959
>>>> ID: 12681978
>>>> Country Code: US
>>>> Coordinates: 42.73492, -71.44951
>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>
>>>> When I use q1 = parser.parse("street=\"MAIN\""); I get the same
>>>> results, which is good.
>>>>
>>>> But when I switch to MAINS~, the fuzzy query does not work.
>>>>
>>>>
>>>> I need to point out something about q1 alone in the BooleanQuery:
>>>> it tries to match MAIN in street, city, region and country, which
>>>> are all in a single TextField.
>>>> But I don't want this. That is why I need street="..." etc. when
>>>> searching.
>>>>
>>>> Best regards
>>>>
>>>>
>>>>
>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>>>
>>>> Hi,
>>>>
>>>> Just as a basic check, can you find the document without a
>>>> fuzzy query? I mean, does this query work for you?
>>>>
>>>> Query query = parser.parse("MAIN");
>>>>
>>>> Tomoko
>>>>
>>>> On Tue, Jun 11, 2019 at 0:22, <baris.kazar@oracle.com> wrote:
>>>>
>>>> Why does the second set not work at all?
>>>>
>>>> It is indexed as a TextField like street="..." city="..." etc.
>>>>
>>>> Best regards
>>>>
>>>>
>>>>
>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>>>
>>>> I don't know how to use FuzzyQuery with QueryParser, but you
>>>> are probably suggesting
>>>>
>>>> QueryParser parser = new QueryParser(field, analyzer) ;
>>>> Query query = parser.parse("MAINS~2");
>>>>
>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>>>
>>>> Am I right?
>>>> Best regards
>>>>
>>>>
>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>>
>>>> I would suggest using a QueryParser for your fuzzy query before
>>>> adding it to the Boolean query. This should weed out any case
>>>> issues.
>>>>
>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com> wrote:
>>>>
>>>> BooleanQuery.Builder booleanQuery = new BooleanQuery.Builder();
>>>>
>>>> // First set
>>>> booleanQuery.add(new FuzzyQuery(new org.apache.lucene.index.Term(field, "MAINS")), BooleanClause.Occur.SHOULD);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"), BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>
>>>> // Second set
>>>> //booleanQuery.add(new FuzzyQuery(new org.apache.lucene.index.Term(field, "street=\"MAINS\"")), BooleanClause.Occur.SHOULD);
>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>
>>>> The first set also returns streets with Nashua in the name
>>>> (NASHUA).
>>>>
>>>> So, to prevent that, and since I also indexed with street="..."
>>>> city="...", I tried the second set, but it does not return
>>>> anything.
>>>>
>>>> createPhraseQuery builds a PhraseQuery with one term equal to the
>>>> string in the call.
>>>> Best regards
>>>>
>>>>
>>>>
>>>> On 6/10/19 10:47 AM, baris.kazar@oracle.com wrote:
>>>> > How do I check how it is indexed? Lowercase or uppercase?
>>>> >
>>>> > The only way now is by testing.
>>>> >
>>>> > I am using StandardAnalyzer.
>>>> >
>>>> > Best regards
>>>> >
>>>> >
>>>> > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>>> >> <tomoko.uchida.1111@gmail.com> wrote:
>>>> >>> Hi,
>>>> >>>
>>>> >>> What analyzer do you use for the text field? Is the term "Main"
>>>> >>> correctly indexed?
>>>> >> Agreed. Also, it would be good if you could post your actual code.
>>>> >>
>>>> >> What analyzer are you using? If you are using StandardAnalyzer, then
>>>> >> all of your terms while indexing will be lowercased, AFAIK, but your
>>>> >> query will not be analyzed until you run a QueryParser on it.
>>>> >>
>>>> >>
>>>> >> Atri
>>>> >>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: FuzzyQuery- why is it ignored? [ In reply to ]
Does it treat it like a plural word? :) :) :)
That makes sense.

Best regards


On 6/13/19 10:31 AM, baris.kazar@oracle.com wrote:
> Erick,
>
> Cool, could you give a simple example based on my case, please?
>
> Best regards
>
>
>
> On 6/13/19 10:12 AM, Erick Erickson wrote:
>> Shot in the dark: stemming. Whenever I see a problem with something
>> ending in “s” (or “er” or “ing” or….) my first suspect is that
>> stemming is turned on. In that case the token in the index that’s
>> actually searched on is somewhat different than you expect.
>>
>> The test is easy, just ensure your fieldType contains no stemmers.
>> PorterStemmer is particularly aggressive, but for this case to test
>> I’d just remove all stemming, re-index and see if the results differ.
>>
>> Best,
>> Erick
>>
>>> On Jun 13, 2019, at 7:26 AM, baris.kazar@oracle.com wrote:
>>>
>>> Tomoko,-
>>>
>>>   That is strange indeed.
>>>
>>> Something is wrong when I use mains, but maink, mainl, mainr, mainq and
>>> maint all work OK; any consonant at the end except s works in this case.
>>>
>>> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
>>>
>>> I am using a fuzzy query with ~ from Query.builder, and that is not a
>>> PhraseQuery.
>>>
>>> Similarly, FuzzyQuery with the input "mains" (it has to be lowercase,
>>> since it does not go through StandardAnalyzer) is also not a PhraseQuery.
>>>
>>> Could there be a clearer sample case for ComplexPhraseQuery in
>>> the docs, please?
>>>
>>> Did you also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
>>> STATES", the expected output in this case?
>>>
>>> Thanks for spending time on this; I would like to thank everyone.
>>>
>>> Best regards
>>>
>>>
>>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
>>>> Hi,
>>>>
>>>>> OK, I think only this very specific term "mains" has an issue.
>>>> That looks strange to me. I ran some tests locally.
>>>>
>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>> UNITED STATES".
>>>>
>>>> 2a. This query string (just copied from your Case #3) worked correctly
>>>> for me as far as I can see.
>>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
>>>>
>>>> 2b. However this query string got no results.
>>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
>>>> This is expected behaviour, because the classic query parser does not
>>>> support fuzzy queries inside a phrase query (as far as I know).
>>>>
>>>> I suspect you are using the fuzzy query operator (~) inside a phrase
>>>> query ("), as in case 2b.
>>>>
>>>> FYI: there is a special parser for such complex phrase queries.
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_complexPhrase_ComplexPhraseQueryParser.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=ZcXpaSlwS5DegX76mHTb_6DH3P7noan1eeMXc-Vh5M8&s=FoIMlcjDO2b7Gut9XRx-NIBWiBQWItsj8IlylJC7Wkc&e=
>>>>
>>>>
>>>> Tomoko
>>>>
>>>> On Thu, Jun 13, 2019 at 6:16, <baris.kazar@oracle.com> wrote:
>>>>> OK, I think only this very specific term "mains" has an issue.
>>>>>
>>>>> Everything I knew about Lucene was fine :) Great...
>>>>>
>>>>> I have one more question:
>>>>>
>>>>> Which one is recommended: FuzzyQuery, or Query.parser with
>>>>> ~ appended to the search string?
>>>>>
>>>>> The second one goes through the analyzer, which lowercases the
>>>>> search string.
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
>>>>>
>>>>> Hi again,-
>>>>>
>>>>> This is really interesting, and I hope I am missing something.
>>>>> The index lowercases all entries, so case sensitivity should not be
>>>>> an issue, I think.
>>>>>
>>>>> Case #1:
>>>>>
>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>> phraseAnalyzer) ;
>>>>>          Query q1 = null;
>>>>>          try {
>>>>>              q1 = parser.parse("Main");
>>>>>          } catch (ParseException e) {
>>>>>              e.printStackTrace();
>>>>>          }
>>>>>          booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>
>>>>>
>>>>> This produces the following:
>>>>>
>>>>> query plan:
>>>>>
>>>>> [+contentDFLT:main, +contentDFLT:"nashua",
>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>
>>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after
>>>>> exec finished)
>>>>>
>>>>> Number of results: 12
>>>>> Name: Main Dunstable Rd
>>>>> Score: 41.204945
>>>>> ID: 12677400
>>>>> Country Code: US
>>>>> Coordinates: 42.72631, -71.50269
>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>> UNITED STATES
>>>>>
>>>>> Name: Main St
>>>>> Score: 41.204945
>>>>> ID: 12681980
>>>>> Country Code: US
>>>>> Coordinates: 42.76416, -71.46681
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Main St
>>>>> Score: 41.204945
>>>>> ID: 12681973
>>>>> Country Code: US
>>>>> Coordinates: 42.75045, -71.4607
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Main St
>>>>> Score: 41.204945
>>>>> ID: 12681974
>>>>> Country Code: US
>>>>> Coordinates: 42.76019, -71.465
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Main Dunstable Rd
>>>>> Score: 41.204945
>>>>> ID: 12677399
>>>>> Country Code: US
>>>>> Coordinates: 42.74641, -71.48943
>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>> UNITED STATES
>>>>>
>>>>> Name: S Main St
>>>>> Score: 41.204945
>>>>> ID: 11893215
>>>>> Country Code: US
>>>>> Coordinates: 42.73412, -71.44797
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Main St
>>>>> Score: 41.204945
>>>>> ID: 12681978
>>>>> Country Code: US
>>>>> Coordinates: 42.73492, -71.44951
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: S Main St
>>>>> Score: 41.204945
>>>>> ID: 11893214
>>>>> Country Code: US
>>>>> Coordinates: 42.73958, -71.45895
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Main St
>>>>> Score: 41.204945
>>>>> ID: 12681979
>>>>> Country Code: US
>>>>> Coordinates: 42.76416, -71.46681
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Main St
>>>>> Score: 41.204945
>>>>> ID: 12681977
>>>>> Country Code: US
>>>>> Coordinates: 42.747, -71.45957
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>>
>>>>>
>>>>> Case #2
>>>>>
>>>>> It also worked when I appended ~ to the word Main to make it a
>>>>> fuzzy query:
>>>>>
>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>> phraseAnalyzer) ;
>>>>>          Query q1 = null;
>>>>>          try {
>>>>>              q1 = parser.parse("Main~");
>>>>>          } catch (ParseException e) {
>>>>>              e.printStackTrace();
>>>>>          }
>>>>>          booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>
>>>>>
>>>>> query plan:
>>>>>
>>>>> [+contentDFLT:main~2, +contentDFLT:"nashua",
>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>
>>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging stops)
>>>>> Number of results: 12
>>>>> Name: Main Dunstable Rd
>>>>> Score: 41.06405
>>>>> ID: 12677400
>>>>> Country Code: US
>>>>> Coordinates: 42.72631, -71.50269
>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>> UNITED STATES
>>>>>
>>>>> Name: Main St
>>>>> Score: 41.06405
>>>>> ID: 12681980
>>>>> Country Code: US
>>>>> Coordinates: 42.76416, -71.46681
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Main St
>>>>> Score: 41.06405
>>>>> ID: 12681973
>>>>> Country Code: US
>>>>> Coordinates: 42.75045, -71.4607
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Main St
>>>>> Score: 41.06405
>>>>> ID: 12681974
>>>>> Country Code: US
>>>>> Coordinates: 42.76019, -71.465
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Main Dunstable Rd
>>>>> Score: 41.06405
>>>>> ID: 12677399
>>>>> Country Code: US
>>>>> Coordinates: 42.74641, -71.48943
>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>> UNITED STATES
>>>>>
>>>>> Name: S Main St
>>>>> Score: 41.06405
>>>>> ID: 11893215
>>>>> Country Code: US
>>>>> Coordinates: 42.73412, -71.44797
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Main St
>>>>> Score: 41.06405
>>>>> ID: 12681978
>>>>> Country Code: US
>>>>> Coordinates: 42.73492, -71.44951
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: S Main St
>>>>> Score: 41.06405
>>>>> ID: 11893214
>>>>> Country Code: US
>>>>> Coordinates: 42.73958, -71.45895
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Main St
>>>>> Score: 41.06405
>>>>> ID: 12681979
>>>>> Country Code: US
>>>>> Coordinates: 42.76416, -71.46681
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Main St
>>>>> Score: 41.06405
>>>>> ID: 12681977
>>>>> Country Code: US
>>>>> Coordinates: 42.747, -71.45957
>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Case #3
>>>>>
>>>>> But why does this not work in fuzzy mode when I misspell the term a bit
>>>>> (1 edit away)? As you saw, the data is there with the spelling Main:
>>>>>
>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>> phraseAnalyzer) ;
>>>>>
>>>>>          Query q1 = null;
>>>>>          try {
>>>>>              q1 = parser.parse("Mains~");  // 1 edit away
>>>>>          } catch (ParseException e) {
>>>>>              e.printStackTrace();
>>>>>          }
>>>>>          booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>
>>>>> query plan:
>>>>>
>>>>> [+contentDFLT:mains~2, +contentDFLT:"nashua",
>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>
>>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging stops)
>>>>>
>>>>> Number of results: 0
>>>>>
>>>>>
>>>>>
>>>>> Case #4
>>>>>
>>>>> Then I changed q1 from MUST to SHOULD above, and I think the fuzzy
>>>>> query is ignored here, since there is no MAIN in the first 468 results.
>>>>>
>>>>> There is no boost for the Mains term here.
>>>>>
>>>>> query plan:
>>>>>
>>>>> [contentDFLT:mains~2, +contentDFLT:"nashua",
>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>
>>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging
>>>>> stops)
>>>>> Number of results: 1794
>>>>> Name: Nashua Dr
>>>>> Score: 34.186226
>>>>> ID: 4974936
>>>>> Country Code: US
>>>>> Coordinates: 42.7636, -71.46063
>>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Nashua River Rail Trl
>>>>> Score: 34.186226
>>>>> ID: 4975508
>>>>> Country Code: US
>>>>> Coordinates: 42.7062, -71.53962
>>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>> UNITED STATES
>>>>>
>>>>> Name: Nashua Rd
>>>>> Score: 33.84896
>>>>> ID: 4975388
>>>>> Country Code: US
>>>>> Coordinates: 42.78746, -71.92823
>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: NASHUA
>>>>> Score: 33.84896
>>>>> ID: 21014865
>>>>> Country Code: US
>>>>> Coordinates: 42.75873, -71.46438
>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: NASHUA
>>>>> Score: 33.84896
>>>>> ID: 21014865
>>>>> Country Code: US
>>>>> Coordinates: 42.75873, -71.46438
>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: NASHUA
>>>>> Score: 33.84896
>>>>> ID: 21014865
>>>>> Country Code: US
>>>>> Coordinates: 42.75873, -71.46438
>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: NASHUA
>>>>> Score: 33.84896
>>>>> ID: 21014865
>>>>> Country Code: US
>>>>> Coordinates: 42.75873, -71.46438
>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: NASHUA
>>>>> Score: 33.84896
>>>>> ID: 21014865
>>>>> Country Code: US
>>>>> Coordinates: 42.75873, -71.46438
>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Nashua St
>>>>> Score: 33.84896
>>>>> ID: 4975671
>>>>> Country Code: US
>>>>> Coordinates: 42.88471, -70.81687
>>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>> Name: Nashua Rd
>>>>> Score: 33.84896
>>>>> ID: 4975400
>>>>> Country Code: US
>>>>> Coordinates: 42.79014, -71.92364
>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>
>>>>>
>>>>> Why is the fuzzy query ignored?
>>>>> Even if I had separate fields for street, city, region, and country,
>>>>> this fuzzy query issue would still arise for names with
>>>>> multiple parts like main dunstable etc., right?
>>>>>
>>>>> Best regards
>>>>>
>>>>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
>>>>>
>>>>> Tomoko,-
>>>>>
>>>>>   Thank you for your suggestions. I am trying to understand this, and
>>>>> I thought I did :)
>>>>>
>>>>> But FuzzyQuery does not work when I use it with a *single*
>>>>> large TextField like street=...value... city=...value...
>>>>> region=...value... country=...value... (with or without quotes around
>>>>> the values).
>>>>>
>>>>> What I knew about Lucene fuzzy queries does not hold with
>>>>> this TextField form. That is why I suspected a bug.
>>>>>
>>>>> 1. Yes, i saw and have a solid proof on that now.
>>>>>
>>>>> 2. Yes, but FuzzyQuery takes the quotes as they are, since they are
>>>>> escaped and the term is not analyzed.
>>>>>
>>>>> Stuffing everything into one TextField versus having separate fields
>>>>> should probably only affect performance, not the outcome, in my case.
>>>>> But I have been thinking about this, and maybe separate fields are the
>>>>> way to go here.
>>>>>
>>>>> My CONTENT field has street names in mixed case and city, region, and
>>>>> country names in UPPERCASE. Can this be a problem?
>>>>> I thought the index stored them in lowercase since I am using
>>>>> StandardAnalyzer.
>>>>>
>>>>> The CONTENT field also has the full text string with street=...
>>>>> city=... region=... country=... (here all values are UPPERCASE).
>>>>>
>>>>> Why can't the index find the names via FuzzyQuery? I tried both
>>>>> FuzzyQuery and the Query builder, as I showed before.
>>>>>
>>>>> The last piece of advice in your previous email would be better placed
>>>>> outside the parentheses, since it might be very important :) :) :)
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>>>>>
>>>>> I'd suggest understanding how the software works before
>>>>> suspecting a bug in it :-)
>>>>>
>>>>> I guess you may be missing two points:
>>>>>
>>>>> 1. The standard analyzer (standard tokenizer) splits words at double
>>>>> quotes (U+0022), so quotes are not indexed or searched at all if you
>>>>> are using the standard analyzer. (That is the reason you get the same
>>>>> results with or without quotes.)
>>>>> See:
>>>>> https://lucene.apache.org/core/8_1_0/core/org/apache/lucene/analysis/standard/StandardTokenizer.html
>>>>> and
>>>>> http://unicode.org/reports/tr29/
>>>>>
>>>>> 2. The double quote has a special meaning in the built-in query parser
>>>>> (it is interpreted as a phrase query), so you need to escape it if you
>>>>> want to search for the double quote itself.
>>>>> See:
>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Terms
>>>>>
>>>>> (My advice would be to create a separate field for each key-value
>>>>> pair instead of stuffing all pairs into one text field, if you need to
>>>>> search them separately.)
>>>>>
>>>>> On Wed, Jun 12, 2019 at 2:39 <baris.kazar@oracle.com> wrote:
>>>>>
>>>>> I can say that the quotes are not the issue with the index, since I
>>>>> get the same results with or without quotes.
>>>>>
>>>>> I am starting to feel that this might be a bug?
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
>>>>>
>>>>> Somehow " is causing an issue as this should return street with MAIN:
>>>>>
>>>>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
>>>>> states"] -> this was with fuzzyquery on MAINS
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
>>>>>
>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>> contentDFLT:mains]
>>>>>
>>>>> QueryParser chops it into two pieces for
>>>>> parser.parse("street=\"MAINS\"");
>>>>>
>>>>> The index has a TextField named contentDFLT with the following data:
>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
>>>>> HAMPSHIRE" country="UNITED STATES"
>>>>>
>>>>>
>>>>> When I pass street=\"MAINS~\" to the parser,
>>>>> I get the following:
>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>> contentDFLT:mains]
>>>>>
>>>>> The " quotation marks are probably messing this up, as you were saying...
>>>>> Best regards
>>>>>
>>>>>
>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>>>>>
>>>>> Or, the " (double quotation mark) in your query string may affect query
>>>>> parsing.
>>>>>
>>>>> When I parse this string with the classic query parser (Lucene 8.1),
>>>>> street="MAINS~"
>>>>> the parsed (raw) query is
>>>>> text:street text:mains
>>>>> (I set the default search field to "text", so text:xxxx appears
>>>>> here.)
>>>>>
>>>>> Query parsing is a complex process, so it is a good idea to check the
>>>>> parsed raw query string, especially when you have (reserved) special
>>>>> characters in your query...
>>>>>
>>>>> On Tue, Jun 11, 2019 at 1:10 Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I noticed one small thing in your previous mail.
>>>>>
>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>>>>>
>>>>> which is good.
>>>>>
>>>>> To specify a search field, ":" (colon) should be used instead of "=".
>>>>> See the query parser documentation:
>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fields
>>>>>
>>>>>
>>>>>
>>>>> I'm not sure this is related to your problem.
>>>>>
>>>>> On Tue, Jun 11, 2019 at 0:51 <baris.kazar@oracle.com> wrote:
>>>>>
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>>
>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>> phraseAnalyzer) ;
>>>>>             Query q1 = null;
>>>>>             try {
>>>>>                 q1 = parser.parse("MAIN");
>>>>>             } catch (ParseException e) {
>>>>>
>>>>>                 e.printStackTrace();
>>>>>             }
>>>>>             booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>>>>
>>>>> testQuerySearch2 Time to compute: 0 seconds
>>>>> Number of results: 1775
>>>>> Name: Main St
>>>>> Score: 37.20959
>>>>> ID: 12681979
>>>>> Country Code: US
>>>>> Coordinates: 42.76416, -71.46681
>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>
>>>>> Name: Main St
>>>>> Score: 37.20959
>>>>> ID: 12681977
>>>>> Country Code: US
>>>>> Coordinates: 42.747, -71.45957
>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>
>>>>> Name: Main St
>>>>> Score: 37.20959
>>>>> ID: 12681978
>>>>> Country Code: US
>>>>> Coordinates: 42.73492, -71.44951
>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>
>>>>> When I use q1 = parser.parse("street=\"MAIN\""); I get the same
>>>>> results, which is good.
>>>>>
>>>>> But when I switch to MAINS~, the fuzzy query does not work.
>>>>>
>>>>>
>>>>> I need to point out something about using only q1 in the BooleanQuery:
>>>>> it tries to match MAIN against street, city, region, and country,
>>>>> which are all in a single TextField.
>>>>> But I don't want this. That is why I need street="..." etc. when
>>>>> searching.
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>>
>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> just for the basic verification, can you find the document without
>>>>> fuzzy query? I mean, does this query work for you?
>>>>>
>>>>> Query query = parser.parse("MAIN");
>>>>>
>>>>> Tomoko
>>>>>
>>>>> On Tue, Jun 11, 2019 at 0:22 <baris.kazar@oracle.com> wrote:
>>>>>
>>>>> Why does the second set not work at all?
>>>>>
>>>>> It is indexed as a TextField like street="..." city="..." etc.
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>>
>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>>>>
>>>>> I don't know how to use FuzzyQuery with QueryParser, but you
>>>>> are probably suggesting:
>>>>>
>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
>>>>> Query query = parser.parse("MAINS~2");
>>>>>
>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>>>>
>>>>> am i right?
>>>>> Best regards
>>>>>
>>>>>
>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>>>
>>>>> I would suggest using a QueryParser for your fuzzy query before
>>>>> adding it to the Boolean query. This should weed out any case
>>>>> issues.
>>>>>
>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com
>>>>> <mailto:baris.kazar@oracle.com>> wrote:
>>>>>
>>>>>         BooleanQuery.Builder booleanQuery = new
>>>>> BooleanQuery.Builder();
>>>>>
>>>>>         //First set
>>>>>
>>>>>                 booleanQuery.add(new FuzzyQuery(new
>>>>>         org.apache.lucene.index.Term(field, "MAINS")),
>>>>>         BooleanClause.Occur.SHOULD);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>         "NASHUA"), BooleanClause.Occur.MUST);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>         "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>         "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>
>>>>>         // Second set
>>>>>                  //booleanQuery.add(new FuzzyQuery(new
>>>>>         org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
>>>>>         BooleanClause.Occur.SHOULD);
>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>
>>>>>         field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>
>>>>>         field, "region=\"NEW HAMPSHIRE\""),
>>>>> BooleanClause.Occur.MUST);
>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>
>>>>>         field, "country=\"UNITED STATES\""),
>>>>> BooleanClause.Occur.MUST);
>>>>>
>>>>>         The first set also returns streets named Nashua (NASHUA).
>>>>>
>>>>>         So, to prevent that, and since I also indexed with street="..."
>>>>>         city="...", I tried the second set, but it does not return
>>>>>         anything.
>>>>>
>>>>>         createPhraseQuery builds a PhraseQuery with one term equal to the
>>>>>         string in the call.
>>>>>
>>>>>         Best regards
>>>>>
>>>>>
>>>>>
>>>>>         On 6/10/19 10:47 AM, baris.kazar@oracle.com
>>>>>         <mailto:baris.kazar@oracle.com> wrote:
>>>>>         > How do I check how it is indexed? Lowercase or uppercase?
>>>>>         >
>>>>>         > The only way for now is by testing.
>>>>>         >
>>>>>         > i am using standardanalyzer.
>>>>>         >
>>>>>         > Best regards
>>>>>         >
>>>>>         >
>>>>>         > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>>>>         >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>>>>         >> <tomoko.uchida.1111@gmail.com
>>>>> <mailto:tomoko.uchida.1111@gmail.com>> wrote:
>>>>>         >>> Hi,
>>>>>         >>>
>>>>>         >>> What analyzer do you use for the text field? Is the
>>>>> term "Main"
>>>>>         >>> correctly indexed?
>>>>>         >> Agreed. Also, it would be good if you could post your
>>>>> actual
>>>>> code.
>>>>>         >>
>>>>>         >> What analyzer are you using? If you are using
>>>>> StandardAnalyzer,
>>>>>         then
>>>>>         >> all of your terms while indexing will be lowercased,
>>>>> AFAIK, but
>>>>>         your
>>>>>         >> query will not be analyzed until you run a
>>>>> QueryParser on it.
>>>>>         >>
>>>>>         >>
>>>>>         >> Atri
>>>>>         >>
>>>>>         >
>>>>>         >
>>>>>         >


Re: FuzzyQuery- why is it ignored?
However, the index does not have MAINS but MAIN for the expected entry.
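
To make the "1 edit away" reasoning concrete, here is a minimal sketch in plain Java (standard library only, no Lucene). The names EditDistanceSketch and distance are hypothetical, introduced only for illustration, and this is not Lucene's implementation. It computes the ordinary Levenshtein distance; as far as I know, FuzzyQuery by default also counts a swap of two adjacent characters as a single edit, which this sketch does not model. Assuming StandardAnalyzer lowercased the indexed token to main, the sketch shows that the lowercased term mains is one edit away (inside the ~2 limit shown in the query plans), while an unanalyzed uppercase MAINS would be five edits away.

```java
/** Plain Levenshtein distance; a hypothetical helper, not a Lucene class. */
final class EditDistanceSketch {

    /** Returns the number of single-character inserts, deletes, or substitutions needed to turn a into b. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j characters of b takes j inserts.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            // Turning the first i characters of a into the empty string takes i deletes.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Query term after analysis vs. the token assumed to be in the index.
        System.out.println(distance("mains", "main"));
        // Unanalyzed uppercase term vs. the same lowercase token.
        System.out.println(distance("MAINS", "main"));
    }
}
```

Running main prints 1 and then 5. So edit distance alone would not explain why mains~2 misses main, and the cause has to be elsewhere, for example in what token is actually stored in the index.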

Best regards



On 6/13/19 10:33 AM, baris.kazar@oracle.com wrote:
> Does it treat it like a plural word? :) :) :)
> That makes sense.
>
> Best regards
>
>
> On 6/13/19 10:31 AM, baris.kazar@oracle.com wrote:
>> Erick,
>>
>> Cool, could you give a simple example based on my case, please?
>>
>> Best regards
>>
>>
>>
>> On 6/13/19 10:12 AM, Erick Erickson wrote:
>>> Shot in the dark: stemming. Whenever I see a problem with something
>>> ending in “s” (or “er” or “ing” or….) my first suspect is that
>>> stemming is turned on. In that case the token in the index that’s
>>> actually searched on is somewhat different than you expect.
>>>
>>> The test is easy, just ensure your fieldType contains no stemmers.
>>> PorterStemmer is particularly aggressive, but for this case to test
>>> I’d just remove all stemming, re-index and see if the results differ.
>>>
>>> Best,
>>> Erick
>>>
>>>> On Jun 13, 2019, at 7:26 AM, baris.kazar@oracle.com wrote:
>>>>
>>>> Tomoko,-
>>>>
>>>>   That is strange indeed.
>>>>
>>>> Something goes wrong only when I use mains; maink, mainl, mainr, mainq,
>>>> and maint all work fine. Any consonant at the end except s works in
>>>> this case.
>>>>
>>>> Case #3 had +contentDFLT:mains~2, not +contentDFLT:"mains~2".
>>>>
>>>> I am using the fuzzy query with ~ from Query.builder, and that is not a
>>>> PhraseQuery.
>>>>
>>>> Similarly, a FuzzyQuery with the input "mains" (it has to be lowercase,
>>>> since it does not go through StandardAnalyzer) is also not a
>>>> PhraseQuery.
>>>>
>>>> Could there be a clearer sample case for ComplexPhraseQuery in
>>>> the docs, please?
>>>>
>>>> Did you also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
>>>> STATES", the expected output in this case?
>>>>
>>>> Thanks for spending time on this, i would like to thank everyone.
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
>>>>> Hi,
>>>>>
>>>>>> Ok, i think only this very specific only "mains" has an issue.
>>>>> It looks strange to me. I did some test locally.
>>>>>
>>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>> UNITED STATES".
>>>>>
>>>>> 2a. This query string (just copied from your Case #3) worked
>>>>> correctly
>>>>> for me as far as I can see.
>>>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
>>>>>
>>>>> 2b. However this query string got no results.
>>>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
>>>>> It is an expected behaviour because the classic query parser does not
>>>>> support fuzzy query inside phrase query (as far as I know).
>>>>>
>>>>> I suspect you are using the fuzzy query operator (~) inside a phrase
>>>>> query ("), as in the 2b case.
>>>>>
>>>>> FYI: there is a special parser for such complex phrase query.
>>>>> https://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html
>>>>>
>>>>>
>>>>> Tomoko
>>>>>
>>>>> On Thu, Jun 13, 2019 at 6:16 <baris.kazar@oracle.com> wrote:
>>>>>> OK, I think only this very specific term "mains" has an issue.
>>>>>>
>>>>>> Everything else I knew about Lucene was fine :) Great...
>>>>>>
>>>>>> I have one more question:
>>>>>>
>>>>>> Which one is advised: FuzzyQuery, or the query parser with
>>>>>> ~ appended to the search string?
>>>>>>
>>>>>> The second one goes through the analyzer and lowercases the search
>>>>>> string.
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
>>>>>>
>>>>>> Hi again,-
>>>>>>
>>>>>> This is really interesting, and I hope I am missing something.
>>>>>> The index lowercases all entries, so case sensitivity is not an issue,
>>>>>> I think.
>>>>>>
>>>>>> Case #1:
>>>>>>
>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>> phraseAnalyzer) ;
>>>>>>          Query q1 = null;
>>>>>>          try {
>>>>>>              q1 = parser.parse("Main");
>>>>>>          } catch (ParseException e) {
>>>>>>              e.printStackTrace();
>>>>>>          }
>>>>>>          booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>
>>>>>>
>>>>>> This produces the following:
>>>>>>
>>>>>> query plan:
>>>>>>
>>>>>> [+contentDFLT:main, +contentDFLT:"nashua",
>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>
>>>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after
>>>>>> exec finished)
>>>>>>
>>>>>> Number of results: 12
>>>>>> Name: Main Dunstable Rd
>>>>>> Score: 41.204945
>>>>>> ID: 12677400
>>>>>> Country Code: US
>>>>>> Coordinates: 42.72631, -71.50269
>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>> UNITED STATES
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 41.204945
>>>>>> ID: 12681980
>>>>>> Country Code: US
>>>>>> Coordinates: 42.76416, -71.46681
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 41.204945
>>>>>> ID: 12681973
>>>>>> Country Code: US
>>>>>> Coordinates: 42.75045, -71.4607
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 41.204945
>>>>>> ID: 12681974
>>>>>> Country Code: US
>>>>>> Coordinates: 42.76019, -71.465
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Main Dunstable Rd
>>>>>> Score: 41.204945
>>>>>> ID: 12677399
>>>>>> Country Code: US
>>>>>> Coordinates: 42.74641, -71.48943
>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>> UNITED STATES
>>>>>>
>>>>>> Name: S Main St
>>>>>> Score: 41.204945
>>>>>> ID: 11893215
>>>>>> Country Code: US
>>>>>> Coordinates: 42.73412, -71.44797
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 41.204945
>>>>>> ID: 12681978
>>>>>> Country Code: US
>>>>>> Coordinates: 42.73492, -71.44951
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: S Main St
>>>>>> Score: 41.204945
>>>>>> ID: 11893214
>>>>>> Country Code: US
>>>>>> Coordinates: 42.73958, -71.45895
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 41.204945
>>>>>> ID: 12681979
>>>>>> Country Code: US
>>>>>> Coordinates: 42.76416, -71.46681
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 41.204945
>>>>>> ID: 12681977
>>>>>> Country Code: US
>>>>>> Coordinates: 42.747, -71.45957
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>>
>>>>>>
>>>>>> Case #2
>>>>>>
>>>>>> This also worked when I added ~ to the word Main to make it a fuzzy
>>>>>> query:
>>>>>>
>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>> phraseAnalyzer) ;
>>>>>>          Query q1 = null;
>>>>>>          try {
>>>>>>              q1 = parser.parse("Main~");
>>>>>>          } catch (ParseException e) {
>>>>>>              e.printStackTrace();
>>>>>>          }
>>>>>>          booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>
>>>>>>
>>>>>> query plan:
>>>>>>
>>>>>> [+contentDFLT:main~2, +contentDFLT:"nashua",
>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>
>>>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging
>>>>>> stops)
>>>>>> Number of results: 12
>>>>>> Name: Main Dunstable Rd
>>>>>> Score: 41.06405
>>>>>> ID: 12677400
>>>>>> Country Code: US
>>>>>> Coordinates: 42.72631, -71.50269
>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>> UNITED STATES
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 41.06405
>>>>>> ID: 12681980
>>>>>> Country Code: US
>>>>>> Coordinates: 42.76416, -71.46681
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 41.06405
>>>>>> ID: 12681973
>>>>>> Country Code: US
>>>>>> Coordinates: 42.75045, -71.4607
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 41.06405
>>>>>> ID: 12681974
>>>>>> Country Code: US
>>>>>> Coordinates: 42.76019, -71.465
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Main Dunstable Rd
>>>>>> Score: 41.06405
>>>>>> ID: 12677399
>>>>>> Country Code: US
>>>>>> Coordinates: 42.74641, -71.48943
>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>> UNITED STATES
>>>>>>
>>>>>> Name: S Main St
>>>>>> Score: 41.06405
>>>>>> ID: 11893215
>>>>>> Country Code: US
>>>>>> Coordinates: 42.73412, -71.44797
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 41.06405
>>>>>> ID: 12681978
>>>>>> Country Code: US
>>>>>> Coordinates: 42.73492, -71.44951
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: S Main St
>>>>>> Score: 41.06405
>>>>>> ID: 11893214
>>>>>> Country Code: US
>>>>>> Coordinates: 42.73958, -71.45895
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 41.06405
>>>>>> ID: 12681979
>>>>>> Country Code: US
>>>>>> Coordinates: 42.76416, -71.46681
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 41.06405
>>>>>> ID: 12681977
>>>>>> Country Code: US
>>>>>> Coordinates: 42.747, -71.45957
>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Case #3
>>>>>>
>>>>>> But why does this not work in fuzzy mode when I misspell the term a bit
>>>>>> (1 edit away)? As you saw, the data is there with the spelling Main:
>>>>>>
>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>> phraseAnalyzer) ;
>>>>>>
>>>>>>          Query q1 = null;
>>>>>>          try {
>>>>>>              q1 = parser.parse("Mains~");  // 1 edit away
>>>>>>          } catch (ParseException e) {
>>>>>>              e.printStackTrace();
>>>>>>          }
>>>>>>          booleanQuery.add(q1, BooleanClause.Occur.MUST);
>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>> "NASHUA"), BooleanClause.Occur.MUST);
>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>
>>>>>> query plan:
>>>>>>
>>>>>> [+contentDFLT:mains~2, +contentDFLT:"nashua",
>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>
>>>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging
>>>>>> stops)
>>>>>>
>>>>>> Number of results: 0
>>>>>>
>>>>>>
>>>>>>
>>>>>> Case #4
>>>>>>
>>>>>> Then I changed q1 from MUST to SHOULD above, and I think the fuzzy
>>>>>> query is ignored here, since there is no MAIN in the first 468
>>>>>> results.
>>>>>>
>>>>>> There is no boost for the Mains term here.
>>>>>>
>>>>>> query plan:
>>>>>>
>>>>>> [contentDFLT:mains~2, +contentDFLT:"nashua",
>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>>>>>>
>>>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging
>>>>>> stops)
>>>>>> Number of results: 1794
>>>>>> Name: Nashua Dr
>>>>>> Score: 34.186226
>>>>>> ID: 4974936
>>>>>> Country Code: US
>>>>>> Coordinates: 42.7636, -71.46063
>>>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Nashua River Rail Trl
>>>>>> Score: 34.186226
>>>>>> ID: 4975508
>>>>>> Country Code: US
>>>>>> Coordinates: 42.7062, -71.53962
>>>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>> UNITED STATES
>>>>>>
>>>>>> Name: Nashua Rd
>>>>>> Score: 33.84896
>>>>>> ID: 4975388
>>>>>> Country Code: US
>>>>>> Coordinates: 42.78746, -71.92823
>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: NASHUA
>>>>>> Score: 33.84896
>>>>>> ID: 21014865
>>>>>> Country Code: US
>>>>>> Coordinates: 42.75873, -71.46438
>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: NASHUA
>>>>>> Score: 33.84896
>>>>>> ID: 21014865
>>>>>> Country Code: US
>>>>>> Coordinates: 42.75873, -71.46438
>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: NASHUA
>>>>>> Score: 33.84896
>>>>>> ID: 21014865
>>>>>> Country Code: US
>>>>>> Coordinates: 42.75873, -71.46438
>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: NASHUA
>>>>>> Score: 33.84896
>>>>>> ID: 21014865
>>>>>> Country Code: US
>>>>>> Coordinates: 42.75873, -71.46438
>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: NASHUA
>>>>>> Score: 33.84896
>>>>>> ID: 21014865
>>>>>> Country Code: US
>>>>>> Coordinates: 42.75873, -71.46438
>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Nashua St
>>>>>> Score: 33.84896
>>>>>> ID: 4975671
>>>>>> Country Code: US
>>>>>> Coordinates: 42.88471, -70.81687
>>>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>> Name: Nashua Rd
>>>>>> Score: 33.84896
>>>>>> ID: 4975400
>>>>>> Country Code: US
>>>>>> Coordinates: 42.79014, -71.92364
>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>>>>>>
>>>>>>
>>>>>> Why is the fuzzy query ignored?
>>>>>> Even if i have separate fields for street, city, region, country,
>>>>>> this fuzzy query issue will come into play for names with
>>>>>> multiple parts like main dunstable etc., right?
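
On the Case #4 question: in a BooleanQuery that has at least one MUST clause, a SHOULD clause is optional by default. It can raise the score of documents it matches, but it does not filter anything out. So the 1794 hits are the documents that satisfy the three MUST clauses, and a fuzzy clause that matches none of them leaves the ranking unchanged. The toy model below is only a sketch of those semantics. It does not use Lucene, the class and method names are made up for illustration, and it does not explain why mains~2 fails to match in Case #3.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of MUST/SHOULD semantics. Not Lucene code; all names are hypothetical.
final class BooleanClauseSketch {

    // A document is a hit only if it contains every MUST term.
    static boolean isHit(Set<String> docTerms, List<String> must) {
        return docTerms.containsAll(must);
    }

    // SHOULD terms never exclude a document; they only add to a toy score.
    static int bonus(Set<String> docTerms, List<String> should) {
        int n = 0;
        for (String term : should) {
            if (docTerms.contains(term)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        Set<String> nashuaDr = new HashSet<>(Arrays.asList(
            "nashua", "hillsborough", "new", "hampshire", "united", "states"));
        Set<String> mainSt = new HashSet<>(Arrays.asList(
            "main", "nashua", "hillsborough", "new", "hampshire", "united", "states"));
        List<String> must = Arrays.asList("nashua", "new", "hampshire", "united", "states");

        // A SHOULD clause that matches nothing: both documents are still hits.
        List<String> noMatch = Arrays.asList("mains");
        System.out.println(isHit(nashuaDr, must) + " " + bonus(nashuaDr, noMatch));
        System.out.println(isHit(mainSt, must) + " " + bonus(mainSt, noMatch));

        // A SHOULD clause that does match lifts only the matching document.
        List<String> match = Arrays.asList("main");
        System.out.println(bonus(nashuaDr, match) + " " + bonus(mainSt, match));
    }
}
```

If the fuzzy clause were expanding to "main", the Main St documents would be expected to pick up the extra score and move up; the fact that they do not is consistent with the zero hits in Case #3.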
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
>>>>>>
>>>>>> Tomoko,-
>>>>>>
>>>>>>   Thank You for Your suggestions. i am trying to understand it
>>>>>> and i thought i did :)
>>>>>>
>>>>>> but it does not work with FuzzyQuery when i used with a *single*
>>>>>> large TextField like street=...value... city=...value...
>>>>>> region=...value... country=...value... (with or without quotes
>>>>>> for the values)
>>>>>>
>>>>>> What i knew about Lucene fuzzy queries does not hold with
>>>>>> this TextField form. That is why i suspected a bug.
>>>>>>
>>>>>> 1. Yes, i saw and have a solid proof on that now.
>>>>>>
>>>>>> 2. Yes, but FuzzyQuery takes the quotes as they are, since they
>>>>>> are escaped and the term is not analyzed.
>>>>>>
>>>>>> Stuffing everything into one TextField vs having separate fields
>>>>>> should probably only affect performance, not the outcome, in my
>>>>>> case. But i have been thinking about this and maybe it is the way
>>>>>> to go here.
>>>>>>
>>>>>> My CONTENT field has street names in mixed case and city, region,
>>>>>> country names in UPPERCASE. Can this be a problem?
>>>>>> i thought the index stored them in lowercase since i am using
>>>>>> StandardAnalyzer.
>>>>>>
>>>>>> CONTENT field also has full textfield string with street=...
>>>>>> city=... region=... country=... (here all values are UPPERCASE).
>>>>>>
>>>>>> Why can't the index find the names via FuzzyQuery? i tried both
>>>>>> FuzzyQuery and the query builder, as i showed before.
>>>>>>
>>>>>> The last advice in Your previous email would nicely go outside
>>>>>> the parentheses since it might be very critical :) :) :)
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>>>>>>
>>>>>> I'd suggest understanding how the software works before
>>>>>> suspecting a bug in it :-)
>>>>>>
>>>>>> I guess you may miss two points:
>>>>>>
>>>>>> 1. the standard analyzer (standard tokenizer) breaks words by double
>>>>>> quote (U+0022) so quotes are not indexed or searched at all if you
>>>>>> are using standard analyzer. (That is the reason you have same
>>>>>> results with or without quotes.)
>>>>>> See:
>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_core_org_apache_lucene_analysis_standard_StandardTokenizer.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=8E2lp1YIGM-3v3FspeieGl8z8rEBs6qioTudtFNzh8c&e=
>>>>>> and
>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__unicode.org_reports_tr29_&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=riCZ_f25XW869CKbHPUqfbLiDU-AukE6la0xTLMw6u8&e=
>>>>>>
>>>>>> 2. double quote has special meaning (it's interpreted as phrase
>>>>>> query) with the built-in query parser so you need to escape it if
>>>>>> you want to search for double quotes themselves.
>>>>>> See:
>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Terms&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=t8OYTgidvcwNpAVFuTsqGhDJK5BwUZVCxc0mPHzqCYU&e=
>>>>>>
>>>>>> (My advice would be to create separate fields for each key value
>>>>>> pair instead of stuffing all pairs into one text field, if you need
>>>>>> to search them separately.)
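
Point 1 above can be pictured with a rough stand-in for the analyzer. The sketch below is not StandardTokenizer (which follows the UAX #29 word-break rules); it only splits on anything that is not a letter or digit and lowercases the pieces. That is enough to show why the quoted and unquoted forms of the key=value text end up as the same terms. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Rough approximation of "tokenize, then lowercase". Not Lucene code.
final class TokenizeSketch {

    // Split on runs of characters that are neither letters nor digits,
    // drop empty pieces, and lowercase what remains.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                out.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // With quotes, as stored in the single TextField.
        System.out.println(tokens("street=\"MAIN\" city=\"NASHUA\""));
        // Without quotes: the same terms come out.
        System.out.println(tokens("street=MAIN city=NASHUA"));
    }
}
```

Under this approximation both inputs give the terms street, main, city, nashua, so "street" and "main" are separate terms and nothing in the index ties a value to its key.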
>>>>>>
>>>>>> On Jun 12, 2019 at 2:39, <baris.kazar@oracle.com> wrote:
>>>>>>
>>>>>> i can say that quotes are not the issue with the index, as i get
>>>>>> the same results with or without quotes.
>>>>>>
>>>>>> i am starting to feel that this might be a bug, maybe?
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
>>>>>>
>>>>>> Somehow " is causing an issue as this should return street with
>>>>>> MAIN:
>>>>>>
>>>>>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
>>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
>>>>>> states"] -> this was with fuzzyquery on MAINS
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
>>>>>>
>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>> contentDFLT:mains]
>>>>>>
>>>>>> QueryParser chops it into two pieces from
>>>>>> parser.parse("street=\"MAINS\"");
>>>>>>
>>>>>> Index has a TextField named contentDFLT the following data :
>>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
>>>>>> HAMPSHIRE" country="UNITED STATES"
>>>>>>
>>>>>>
>>>>>> When i set street=\"MAINS~\" with parser:
>>>>>> i get the following
>>>>>> [.+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>> contentDFLT:mains]
>>>>>>
>>>>>> probably " quotations are messing this up as You were saying...
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>>>>>>
>>>>>> Or, " (double quotation) in your query string may affect query
>>>>>> parsing.
>>>>>>
>>>>>> When I parse this string with the classic query parser (lucene 8.1),
>>>>>> street="MAINS~"
>>>>>> the parsed (raw) query is
>>>>>> text:street text:mains
>>>>>> (I set the default search field to "text", so text:xxxx appears
>>>>>> here.)
>>>>>>
>>>>>> Query parsing is a complex process, so it would be good to check
>>>>>> parsed raw query string especially when you have (reserved) special
>>>>>> characters in your query...
>>>>>>
>>>>>> On Jun 11, 2019 at 1:10, Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I noticed one small thing in your previous mail.
>>>>>>
>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>>>>>>
>>>>>> which is good.
>>>>>>
>>>>>> To specify a search field, ":" (colon) should be used instead of "=".
>>>>>> See the query parser documentation:
>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Fields&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=u4SeJqH4lePhOazCLwxLEr3WqcMkODtYLv4njiKZ4PM&s=WrNfUXO9gz1PqpczTJw1vD9sWqvr76WRv2Aeo9uWqa4&e=
>>>>>>
>>>>>>
>>>>>>
>>>>>> I'm not sure this is related to your problem.
>>>>>>
>>>>>> On Jun 11, 2019 at 0:51, <baris.kazar@oracle.com> wrote:
>>>>>>
>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "city=\"NASHUA\""),
>>>>>>     BooleanClause.Occur.MUST);
>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "region=\"NEW HAMPSHIRE\""),
>>>>>>     BooleanClause.Occur.MUST);
>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "country=\"UNITED STATES\""),
>>>>>>     BooleanClause.Occur.MUST);
>>>>>>
>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser =
>>>>>>     new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>>>>>> Query q1 = null;
>>>>>> try {
>>>>>>     q1 = parser.parse("MAIN");
>>>>>> } catch (ParseException e) {
>>>>>>     e.printStackTrace();
>>>>>> }
>>>>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>>>>>
>>>>>> testQuerySearch2 Time to compute: 0 seconds
>>>>>> Number of results: 1775
>>>>>> Name: Main St
>>>>>> Score: 37.20959
>>>>>> ID: 12681979
>>>>>> Country Code: US
>>>>>> Coordinates: 42.76416, -71.46681
>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 37.20959
>>>>>> ID: 12681977
>>>>>> Country Code: US
>>>>>> Coordinates: 42.747, -71.45957
>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>
>>>>>> Name: Main St
>>>>>> Score: 37.20959
>>>>>> ID: 12681978
>>>>>> Country Code: US
>>>>>> Coordinates: 42.73492, -71.44951
>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>
>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get the same
>>>>>> results, which is good.
>>>>>>
>>>>>> But when i switch to MAINS~ the fuzzy query does not work.
>>>>>>
>>>>>>
>>>>>> i need to say something about using only q1 in the booleanquery:
>>>>>> it tries to match MAIN against street, city, region and country,
>>>>>> which are all in a single TextField field.
>>>>>> But i don't want this. That is why i need street="..." etc. when
>>>>>> searching.
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> just for the basic verification, can you find the document without
>>>>>> fuzzy query? I mean, does this query work for you?
>>>>>>
>>>>>> Query query = parser.parse("MAIN");
>>>>>>
>>>>>> Tomoko
>>>>>>
>>>>>> On Jun 11, 2019 at 0:22, <baris.kazar@oracle.com> wrote:
>>>>>>
>>>>>> why does the second set not work at all?
>>>>>>
>>>>>> it is indexed as Textfield like street="..." city="..." etc.
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>>>>>
>>>>>> i don't know how to use FuzzyQuery with QueryParser, but probably
>>>>>> You are suggesting
>>>>>>
>>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
>>>>>> Query query = parser.parse("MAINS~2");
>>>>>>
>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>>>>>
>>>>>> am i right?
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>>>>
>>>>>> I would suggest using a QueryParser for your fuzzy query before
>>>>>> adding it to the Boolean query. This should weed out any case
>>>>>> issues.
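
To make the case issue concrete: a FuzzyQuery built directly from a Term, as in the quoted first set (new FuzzyQuery(new Term(field, "MAINS"))), is not analyzed, so the uppercase string is compared against whatever terms the index holds. Assuming StandardAnalyzer stored the street name as "main", the uppercase form is 5 edits away, well past the maximum of 2 that FuzzyQuery supports, while the lowercased form is 1 edit away. The sketch below is plain Java with no Lucene dependency, and the class name is made up. It uses plain Levenshtein distance, whereas FuzzyQuery by default also counts a transposition as one edit; that makes no difference for these strings.

```java
// Standalone sketch (no Lucene dependency): plain Levenshtein distance,
// used only to illustrate why case matters for an unanalyzed fuzzy term.
final class EditDistanceSketch {

    // Classic dynamic-programming distance over insert, delete, substitute.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning an empty prefix of a into b[0..j) takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into an empty string takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexedTerm = "main"; // assumption: what the analyzer stored for "MAIN"
        System.out.println(levenshtein("MAINS", indexedTerm)); // unanalyzed query term
        System.out.println(levenshtein("mains", indexedTerm)); // lowercased query term
    }
}
```

Going through QueryParser lowercases the term before the fuzzy query is built, which is why the query plans elsewhere in this thread show mains~2 rather than MAINS~2.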
>>>>>>
>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com> wrote:
>>>>>>
>>>>>>         BooleanQuery.Builder booleanQuery = new BooleanQuery.Builder();
>>>>>>
>>>>>>         // First set
>>>>>>         booleanQuery.add(new FuzzyQuery(new org.apache.lucene.index.Term(field, "MAINS")),
>>>>>>             BooleanClause.Occur.SHOULD);
>>>>>>         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
>>>>>>             BooleanClause.Occur.MUST);
>>>>>>         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
>>>>>>             BooleanClause.Occur.MUST);
>>>>>>         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
>>>>>>             BooleanClause.Occur.MUST);
>>>>>>
>>>>>>         // Second set
>>>>>>         //booleanQuery.add(new FuzzyQuery(new org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
>>>>>>         //    BooleanClause.Occur.SHOULD);
>>>>>>         //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "city=\"NASHUA\""),
>>>>>>         //    BooleanClause.Occur.MUST);
>>>>>>         //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "region=\"NEW HAMPSHIRE\""),
>>>>>>         //    BooleanClause.Occur.MUST);
>>>>>>         //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "country=\"UNITED STATES\""),
>>>>>>         //    BooleanClause.Occur.MUST);
>>>>>>
>>>>>>         The first set also brings back streets whose name is Nashua
>>>>>>         (NASHUA).
>>>>>>
>>>>>>         So, to prevent that, and since i also indexed with street="..."
>>>>>>         city="...", i tried the second set, but it does not return
>>>>>>         anything.
>>>>>>
>>>>>>         createPhraseQuery builds a PhraseQuery with one term equal to
>>>>>>         the string in the call.
>>>>>>
>>>>>>         Best regards
>>>>>>
>>>>>>
>>>>>>
>>>>>>         On 6/10/19 10:47 AM, baris.kazar@oracle.com wrote:
>>>>>>         > How do i check how it is indexed? lowercase or uppercase?
>>>>>>         >
>>>>>>         > The only way now is by testing.
>>>>>>         >
>>>>>>         > i am using standardanalyzer.
>>>>>>         >
>>>>>>         > Best regards
>>>>>>         >
>>>>>>         >
>>>>>>         > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>>>>>         >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>>>>>         >> <tomoko.uchida.1111@gmail.com> wrote:
>>>>>>         >>> Hi,
>>>>>>         >>>
>>>>>>         >>> What analyzer do you use for the text field? Is the term
>>>>>>         >>> "Main" correctly indexed?
>>>>>>         >> Agreed. Also, it would be good if you could post your actual
>>>>>>         >> code.
>>>>>>         >>
>>>>>>         >> What analyzer are you using? If you are using StandardAnalyzer,
>>>>>>         >> then all of your terms while indexing will be lowercased, AFAIK,
>>>>>>         >> but your query will not be analyzed until you run a QueryParser
>>>>>>         >> on it.
>>>>>>         >>
>>>>>>         >>
>>>>>>         >> Atri
>>>>>>         >>
>>>>>>         >
>>>>>>         >
>>>>>>         >

