Mailing List Archive

Escape woes
Hello all. I'm having a problem with escaping Lucene characters, and I
was wondering if anyone here could help.

I've set up a Keyword field for my documents named "host". "host" can
contain names like "orclprod" or "ny-dns-2". I'm pretty sure that I
want it to be a Keyword (rather than Text), so that it isn't tokenized.

The problem is when I need to search against that field. For example:

search string I construct: (host:orclprod)
query as reported by toString(): +host:orclprod

returns the results I expect. Great so far.

search string I construct: (host:ny-dns-2)
query as reported by toString(): +(host:ny -dns -2)

does not return any results. After reading about needing to escape
special characters, I'm not surprised, and move on to...

search string I construct: (host:ny\-dns\-2)
query as reported by toString(): +host:"ny dns 2"

does not return any results. What happened to my hyphens?!?! Is there
any way to get them to show up, as I expect I need them to?

Thanks for any help.

--- Matt
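
For illustration, the escaping step described in this message can be
sketched as below. QueryEscapeSketch and escape() are hypothetical names,
not part of Lucene, and the set of special characters is an assumption
that should be checked against the query syntax documentation for the
Lucene version in use. As the toString() output above suggests, escaping
only gets the hyphen past the query syntax; the analyzer may still split
the term afterwards.

```java
final class QueryEscapeSketch {
    // Assumed set of query-syntax characters; verify it for your Lucene version.
    private static final String SPECIAL = "+-!(){}[]^\"~*?:\\";

    // Hypothetical helper: put a backslash in front of every special character.
    static String escape(String term) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Builds the escaped search string from the message: (host:ny\-dns\-2)
        System.out.println("(host:" + escape("ny-dns-2") + ")");
    }
}
```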

--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
Re: Escape woes
I think I have a workaround, but I'm not sure why it works (since it
doesn't mesh with the docs about escaping special chars).

In the examples below, I was using StandardAnalyzer. If I switch to
using WhitespaceAnalyzer, then one of the examples that fails (in the
note below) becomes:

search string I construct: (host:"ny-dns-2")
query as reported by toString(): +host:ny-dns-2

returns the results I expect.

I just don't understand why.

I didn't have to escape the hyphens. In fact, if I do escape them, I'm
back to no results. I did have to quote the string I wanted to find.

Does anyone have an explanation they would care to share?

Thanks.

--- Matt
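
One way to picture the difference reported here is a toy model of the two
splitting strategies. This is only a sketch: SplitSketch and its methods
are hypothetical and are not Lucene's analyzers. It assumes that one
analyzer breaks text only at whitespace while the other also breaks at
punctuation such as the hyphen. Since a Keyword field is not tokenized,
the query can only match if it ends up as the single term "ny-dns-2".

```java
import java.util.ArrayList;
import java.util.List;

final class SplitSketch {
    // Toy stand-in for a tokenizer that breaks only at whitespace.
    static List<String> splitOnWhitespace(String text) {
        return nonEmpty(text.split("\\s+"));
    }

    // Toy stand-in for a tokenizer that breaks at anything that is
    // not a letter or a digit (so the hyphen acts as a separator).
    static List<String> splitOnNonAlphanumerics(String text) {
        return nonEmpty(text.split("[^\\p{L}\\p{N}]+"));
    }

    private static List<String> nonEmpty(String[] parts) {
        List<String> out = new ArrayList<>();
        for (String p : parts) {
            if (!p.isEmpty()) {
                out.add(p);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(splitOnWhitespace("ny-dns-2"));       // [ny-dns-2]
        System.out.println(splitOnNonAlphanumerics("ny-dns-2")); // [ny, dns, 2]
    }
}
```

In this model the whitespace split leaves the host name as one term, which
lines up with +host:ny-dns-2, while the second split yields three terms,
which lines up with the phrase "ny dns 2" from the first message.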

"Matthew B. Merrill" wrote:
>
> Hello all. I'm having a problem with escaping Lucene characters, and I
> was wondering if anyone here could help.
>
> I've set up a Keyword field for my documents named "host". "host" can
> contain names like "orclprod" or "ny-dns-2". I'm pretty sure that I
> want it to be a Keyword (rather than Text), so that it isn't tokenized.
>
> The problem is when I need to search against that field. For example:
>
> search string I construct: (host:orclprod)
> query as reported by toString(): +host:orclprod
>
> returns the results I expect. Great so far.
>
> search string I construct: (host:ny-dns-2)
> query as reported by toString(): +(host:ny -dns -2)
>
> does not return any results. After reading about needing to escape
> special characters, I'm not surprised, and move on to...
>
> search string I construct: (host:ny\-dns\-2)
> query as reported by toString(): +host:"ny dns 2"
>
> does not return any results. What happened to my hyphens?!?! Is there
> any way to get them to show up, as I expect I need them to?
>
> Thanks for any help.
>
> --- Matt

Re: Escape woes
I had the same problem when indexing date fields, i.e. YYYY-MM-DD.
I use the standard analyzer and I use double quotes, and it works fine.


On Tue, 16 Jul 2002, Matthew B. Merrill wrote:

>
> I think I have a workaround, but I'm not sure why it works (since it
> doesn't mesh with the docs about escaping special chars).
>
> In the examples below, I was using StandardAnalyzer. If I switch to
> using WhitespaceAnalyzer, then one of the examples that fails (in the
> note below) becomes:
>
> search string I construct: (host:"ny-dns-2")
> query as reported by toString(): +host:ny-dns-2
>
> returns the results I expect.
>
> I just don't understand why.
>
> I didn't have to escape the hyphens. In fact, if I do escape them, I'm
> back to no results. I did have to quote the string I wanted to find.
>
> Does anyone have an explanation they would care to share?
>
> Thanks.
>
> --- Matt


Re: Escape woes
I tried double quotes when using StandardAnalyzer; things got *really*
strange then. If I was searching for a string that contained just one
hyphen, I got what I expected. If the string contained *two* hyphens,
then I got things like:

search string I construct: (host:"ny-dns-2")
query as reported by toString(): +host:"ny dns-2"

Note the missing first hyphen.

I don't understand why this seems to work for you but not for me; you're
using v1.2rc5? That is the latest release version, correct?

--- Matt

Keith Gunn wrote:
>
> I had the same problem when indexing date fields, i.e. YYYY-MM-DD.
> I use the standard analyzer and I use double quotes, and it works fine.

Re: Escape woes
I'll bet it has something to do with numbers vs letters around the
hyphen. The example I had that worked with a single hyphen was "#-#",
like "1-1" or "4-13".

Your example is also number-based (YYYY-MM-DD).

Anyone care to weigh in with a detailed treatise on what's happening and
why?

--- Matt
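
The guess above can be turned into a toy rule and checked against the
outputs reported in this thread. This is a sketch of the hypothesis only,
not Lucene's tokenizer grammar: HyphenGuessSketch is a hypothetical name,
and the assumed rule is that hyphen-separated pieces stay joined when
every other piece in the run contains a digit, and are split otherwise.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class HyphenGuessSketch {
    static boolean hasDigit(String s) {
        return s.chars().anyMatch(Character::isDigit);
    }

    // Assumed rule: parts[from..to] stay joined if there are at least two
    // parts and every other part (counting from the first or from the
    // second) contains a digit.
    static boolean joinable(String[] parts, int from, int to) {
        if (to - from < 1) {
            return false;
        }
        boolean evenOk = true;
        boolean oddOk = true;
        for (int k = from; k <= to; k++) {
            if (!hasDigit(parts[k])) {
                if ((k - from) % 2 == 0) {
                    evenOk = false;
                } else {
                    oddOk = false;
                }
            }
        }
        return evenOk || oddOk;
    }

    // Greedy scan: at each position take the longest joinable run,
    // otherwise emit the single piece on its own.
    static List<String> tokenize(String text) {
        String[] parts = text.split("-");
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < parts.length) {
            int end = i;
            for (int j = parts.length - 1; j > i; j--) {
                if (joinable(parts, i, j)) {
                    end = j;
                    break;
                }
            }
            tokens.add(String.join("-", Arrays.copyOfRange(parts, i, end + 1)));
            i = end + 1;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("ny-dns-2")); // [ny, dns-2]
        System.out.println(tokenize("4-13"));     // [4-13]
    }
}
```

Under this assumed rule "ny-dns-2" comes out as "ny" and "dns-2", the same
shape as the +host:"ny dns-2" query reported earlier, while all-numeric
strings such as "1-1" or a date stay in one piece.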

"Matthew B. Merrill" wrote:
>
> I tried double quotes when using StandardAnalyzer; things got *really*
> strange then. If I was searching for a string that contained just one
> hyphen, I got what I expected. If the string contained *two* hyphens,
> then I got things like:
>
> search string I construct: (host:"ny-dns-2")
> query as reported by toString(): +host:"ny dns-2"
>
> Note the missing first hyphen.
>
> I don't understand why this seems to work for you but not for me; you're
> using v1.2rc5? That is the latest release version, correct?
>
> --- Matt
