Mailing List Archive

Re: Wildcard Searching
Does anyone know anything about this?

Thanks,
Otis

--- Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
> Hello,
>
> This was a thread on lucene-user initially, but I'm copying lucene-dev
> as well. Sorry about the duplicates.
>
> --- Stefan Bergstrand <stefan.bergstrand@polopoly.com> wrote:
> > Doug Cutting <DCutting@grandcentral.com> writes:
> >
> > Just noticed this problem in my program.
> >
> > It seems that the analyzer passed to QueryParser.parse() is never
> > passed on to PrefixQuery (which is what my test case is parsed into).
> >
> > A quick look in QueryParser.jj confirms this:
> >
> > q = new PrefixQuery(new Term(field,
> >         term.image.substring(0, term.image.length()-1)));
>
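
To illustrate the point above, here is a minimal sketch in plain Java with no Lucene dependency. It assumes the index was built with an analyzer that lowercases terms; analyze() and expand() are hypothetical stand-ins, not Lucene APIs. Only prefixOf() mirrors the substring call quoted from QueryParser.jj.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Locale;

public class PrefixAnalyzerSketch {
    // Hypothetical stand-in for an analyzer that lowercases tokens.
    static String analyze(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    // Mirrors the quoted parser code: drop the trailing "*" from the token image.
    static String prefixOf(String image) {
        return image.substring(0, image.length() - 1);
    }

    // Hypothetical expansion step: collect indexed terms that start with the prefix.
    static List<String> expand(String prefix, Collection<String> indexedTerms) {
        List<String> hits = new ArrayList<>();
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Terms as they would be stored after lowercasing analysis (an assumption).
        List<String> indexed = List.of("round", "rough", "square");
        String raw = prefixOf("Rou*");
        System.out.println("raw prefix:      " + expand(raw, indexed));
        System.out.println("analyzed prefix: " + expand(analyze(raw), indexed));
    }
}
```

If the prefix skips analysis, as the quoted parser code suggests, a query typed as Rou* would find nothing in a lowercased index, while the analyzed prefix would find both terms.
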
> I thought that queries such as 'rou?d' are treated as wildcard queries
> by QueryParser.jj, and not as prefix queries, no?
> In the default token definitions in QueryParser.jj I see this:
>
> | <PREFIXTERM: <_TERM_START_CHAR> (<_TERM_CHAR>)* "*" >
> | <WILDTERM: <_TERM_START_CHAR> (<_TERM_CHAR> | ( [ "*", "?" ] ))* >
>
> Then further down in QueryParser.jj we have this:
>
> if (wildcard)
> q = new WildcardQuery(new Term(field, term.image));
>
> So a WildcardQuery is being constructed, not a PrefixQuery, I think.
>
> What I don't understand is why the definition of _TERM_START_CHAR looks
> like this:
>
> | <#_TERM_START_CHAR: ~[ " ", "\t", "+", "-", "!", "(", ")", ":", "^",
>                          "[", "]", "\"", "{", "}", "~", "*" ] >
>
> Maybe the name is misleading, but it seems like _TERM_START_CHAR are
> the characters that a TERM can start with, because later in
> QueryParser.jj we have TERM defined as:
>
> | <TERM: <_TERM_START_CHAR> (<_TERM_CHAR>)* >
>
> and _TERM_CHAR has this definition:
>
> | <#_TERM_CHAR: <_TERM_START_CHAR> >
>
> So how can we have a "*" in _TERM_START_CHAR when terms are not allowed
> to start with a "*"? And if we do have "*", how come we do not have "?"
> as well?
>
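
One thing that may help here: in JavaCC, ~[...] is a negated character list, so every character listed is excluded from _TERM_START_CHAR. The "*" is therefore a character a term may not start with, while "?" is not listed and counts as an ordinary term character. The sketch below models the three quoted token definitions in plain Java to show which ones a given input matches. The class and method names are made up for illustration, and it does not model how the generated tokenizer chooses between two tokens that both match.

```java
public class TokenSketch {
    // Characters excluded by the quoted _TERM_START_CHAR definition.
    static final String EXCLUDED = " \t+-!():^[]\"{}~*";

    // _TERM_CHAR is defined as _TERM_START_CHAR, so one check covers both.
    static boolean isTermChar(char c) {
        return EXCLUDED.indexOf(c) < 0;
    }

    // TERM: <_TERM_START_CHAR> (<_TERM_CHAR>)*
    static boolean matchesTerm(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (char c : s.toCharArray()) {
            if (!isTermChar(c)) {
                return false;
            }
        }
        return true;
    }

    // PREFIXTERM: <_TERM_START_CHAR> (<_TERM_CHAR>)* "*"
    static boolean matchesPrefixTerm(String s) {
        return s.endsWith("*") && matchesTerm(s.substring(0, s.length() - 1));
    }

    // WILDTERM: <_TERM_START_CHAR> (<_TERM_CHAR> | ["*", "?"])*
    static boolean matchesWildTerm(String s) {
        if (s.isEmpty() || !isTermChar(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!(isTermChar(c) || c == '*' || c == '?')) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String q : new String[] {"rou?d", "rou*d", "rou*", "*foo"}) {
            System.out.println(q + " TERM=" + matchesTerm(q)
                    + " PREFIXTERM=" + matchesPrefixTerm(q)
                    + " WILDTERM=" + matchesWildTerm(q));
        }
    }
}
```

With the definitions as quoted, rou?d satisfies both TERM and WILDTERM, whereas rou*d satisfies only WILDTERM. If the tokenizer prefers TERM when both match (an assumption, since the token order is not shown in this thread), rou?d would go through the analyzer as a plain term, which might explain the "rou d" result reported below.
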
> Can somebody correct me wherever I made false statements, assumptions,
> or conclusions?
>
> Thanks,
> Otis
>
> > > > From: Howk, Michael [mailto:MHowk@FSC.Follett.com]
> > > >
> > > > Also, Lucene returns the parsed version of each of our searches.
> > > > When we search by rou*d, Lucene parses it as rou*d (which is what
> > > > we would expect). But when we search by rou?d, Lucene parses it as
> > > > "rou d". It seems to wrap the term in quotes and replace the
> > > > question mark with a space. Any ideas? Or can someone give us an
> > > > idea of how to understand WildcardQuery or WildcardTermEnum?
> > >
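
As a rough aid to understanding the wildcard semantics asked about above: in a wildcard pattern, "?" stands for exactly one character and "*" for zero or more characters. The sketch below is a plain-Java matcher written for illustration only. It is not the WildcardTermEnum implementation, and the class name is invented.

```java
public class WildcardSketch {
    // Returns true if the whole text matches the whole pattern.
    static boolean matches(String pattern, String text) {
        return matches(pattern, 0, text, 0);
    }

    private static boolean matches(String p, int pi, String t, int ti) {
        if (pi == p.length()) {
            // Pattern exhausted: succeed only if the text is exhausted too.
            return ti == t.length();
        }
        char c = p.charAt(pi);
        if (c == '*') {
            // '*' may consume zero or more characters; try every split point.
            for (int k = ti; k <= t.length(); k++) {
                if (matches(p, pi + 1, t, k)) {
                    return true;
                }
            }
            return false;
        }
        if (ti == t.length()) {
            // Pattern still needs a character but the text has none left.
            return false;
        }
        if (c == '?' || c == t.charAt(ti)) {
            // '?' consumes any single character; a literal must match exactly.
            return matches(p, pi + 1, t, ti + 1);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matches("rou?d", "round"));
        System.out.println(matches("rou*d", "rounded"));
    }
}
```

Under these rules rou?d matches "round" but not "roud" or "rounded", while rou*d matches all three.
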
> > > It sounds like the problem is in the query parser. Brian?
> > >
> > > Doug
> > >
> > > --
> > > To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > > For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
> > >
> > >
> >
> > --
> > ---------------------------
> > Stefan Bergstrand
> > Polopoly - Cultivating the information garden
> > Ph: +46 8 506 782 67
> > Cell: +46 704 47 82 67
> > Fax: +46 8 506 782 51
> > stefan.bergstrand@polopoly.com, http://www.polopoly.com
> >



--
To unsubscribe, e-mail: <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>