Mailing List Archive

What punctuation is "legal" in a query?
I have posted several questions/examples regarding QueryParser throwing exceptions when the query string contains various punctuation characters. I have been downloading the nightly builds hoping that my problems would be solved. Unfortunately, only some of the problems I have reported have been addressed, either by fixes to the code or by responses to my postings.

I understand that some punctuation is reserved: '*' for prefix queries and '[' ']' for range queries.

1. Is any other punctuation reserved? Is this documented anywhere?

2. Shouldn't any "non-reserved" punctuation be legal query syntax (i.e. it shouldn't cause a parse error)?

Example:

A document contains a field whose value is a relative directory path: "cooldocs/myfavoritetopics"

Parsing the following query causes a parse error: "relativepath:cooldocs/myfavoritetopics"

Some other punctuation seems to work fine (e.g. '.' and '_'), which is why I'm confused as to which punctuation should or shouldn't work.

I have attached a test case that causes the error. I am using StandardAnalyzer and the Nov. 24 nightly build.

Sorry to be so persistent about this, but query syntax containing punctuation (especially '.', '_', '/') is extremely critical to the product I am working on.

Thanks.
Paul Friedman
Re: What punctuation is "legal" in a query?
> Sorry to be so persistent about this, but query syntax containing punctuation (especially '.', '_', '/') is extremely critical to the product I am working on.

You are of course correct that the syntax should be documented, and
I'm sure in time it will be. We've added individual elements (some at
your request), and I agree that it should be more tolerant (I'm more
used to writing parsers for compilers than for user-level tools).

Bear in mind that the query parser is a convenience, which gives you
an 80% solution for 20% of the work. If you've got specific
requirements, maybe you should use the query classes directly?


--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
RE: What punctuation is "legal" in a query?
Thanks for the suggestion. I was not aware that QueryParser was just a convenience class; I thought it was the recommended way of generating a Query from a user-defined query string. Since I have never used the query classes directly, I have a couple of questions:

1. Do I just parse the query string myself (possibly using my own query syntax), then generate the appropriate query classes and add them to a BooleanQuery?

2. If I added the documents to the index using the StandardAnalyzer, how do I make sure that the terms contained in the query object that I create are "analyzed" properly? When I use QueryParser, I just pass in the same StandardAnalyzer and it takes care of it for me.

Thanks.
Paul Friedman
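
The approach asked about in question 1 above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Lucene code: it assumes a very simple custom syntax of whitespace-separated "field:value" clauses, and the names SimpleQuerySketch, FieldTerm, parse, and normalize are all hypothetical. FieldTerm merely stands in for whatever term-level query object would be built and added to a BooleanQuery. The normalize step only lowercases; in a real application it would have to be replaced by running the text through the same analyzer that was used at index time, which is the concern raised in question 2.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: none of these types are Lucene classes.
class SimpleQuerySketch {

    // Stand-in for one field/term clause (the data a term-level query would hold).
    static final class FieldTerm {
        final String field;
        final String text;

        FieldTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public String toString() {
            return field + ":" + text;
        }
    }

    // Placeholder for index-time analysis. Assumption: lowercasing only.
    // Real code would need to apply the same analyzer used when indexing.
    static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    // Splits the input on whitespace, then splits each clause at its FIRST ':'.
    // Everything after that colon, including '/', '.', and '_', is kept as term text.
    // Clauses without a field prefix fall back to defaultField.
    static List<FieldTerm> parse(String query, String defaultField) {
        List<FieldTerm> clauses = new ArrayList<FieldTerm>();
        for (String clause : query.trim().split("\\s+")) {
            if (clause.isEmpty()) {
                continue; // blank input yields no clauses
            }
            int colon = clause.indexOf(':');
            String field = colon > 0 ? clause.substring(0, colon) : defaultField;
            String text = colon > 0 ? clause.substring(colon + 1) : clause;
            clauses.add(new FieldTerm(field, normalize(text)));
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Prints: relativepath:cooldocs/myfavoritetopics
        for (FieldTerm t : parse("relativepath:cooldocs/myfavoritetopics", "contents")) {
            System.out.println(t);
        }
    }
}
```

Because the clause is split only at the first colon, no punctuation in the value is treated as reserved. Whether the resulting term matches anything still depends on how the field was tokenized when the document was indexed.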


-----Original Message-----
From: Brian Goetz [mailto:brian@quiotix.com]
Sent: Monday, November 26, 2001 11:04 AM
To: Lucene Users List
Subject: Re: What punctuation is "legal" in a query?


