Mailing List Archive

Fw: Exception when search with accents.
Hello! I've been trying Lucene for some time, and
I like it; it seems to be pretty good. I'm trying to build a
web searcher (similar to WebSearch), but I've run into trouble.
If I search for exactly "á" in a Lucene app run from
the command line (like demo/SearchFiles), it throws an exception:

Exception in thread "main" com.lucene.queryParser.TokenMgrError: Lexical error at line 1, column 2. Encountered: <EOF> after : ""
    at com.lucene.queryParser.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:446)
    at com.lucene.queryParser.QueryParser.jj_ntk(QueryParser.java:505)
    at com.lucene.queryParser.QueryParser.Modifiers(QueryParser.java:191)
    at com.lucene.queryParser.QueryParser.Query(QueryParser.java:226)
    at com.lucene.queryParser.QueryParser.parse(QueryParser.java:72)
    at com.lucene.queryParser.QueryParser.parse(QueryParser.java:49)
    at com.i2a.websearch.WebSearcher.search(WebSearcher.java:72)
    at demo.SearchFiles.main(SearchFiles.java:77)


On the other hand, if I try the same from a web app (like WebSearch), it
doesn't throw it.
I've been "playing" with character encodings and locale configs, but I
didn't get anywhere.
Can someone help me?

Thanks.

P.S.: Platform = NT 4.0 Service Pack 6, JDK 1.3


_________________
Toni Ortega
aortega@fihoca.com
_________________


--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
Fw: Exception when search with accents.
Hello Toni,
I don't know whether the message I sent to lucene-user was delivered to you,
because I got a mailer-daemon failure message.
If not, please forward it to the lucene-user list.

peter

-----Original Message-----
From: Halácsy Péter
Sent: Monday, October 29, 2001 2:08 PM
To: 'Lucene Users List'
Subject: RE: Exception when search with accents.



Hello,
This is the same problem that Joanne highlighted on the dev list (a problem
with Finnish characters).

I think it's a bug in the QueryParser.jj file, because there a term is
defined as a character string that begins with an English character.

Modify one line in QueryParser.jj.

Original line:
<TERM: <_IDENTIFIER_CHAR>
       ( ~["\"", " ", "\t", "(", ")", ":", "&", "|", "^", "*",
           "?", "~", "{", "}", "[", "]" ] )* >

Modified line:
<TERM: ( ~["\"", " ", "\t", "(", ")", ":", "&", "|", "^", "*", "?",
           "~", "{", "}", "[", "]" ] )+ >

Then rebuild the package. With this change I managed to search for "á".
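
To illustrate the difference between the two rules, here is a minimal Java sketch that emulates them with java.util.regex instead of JavaCC. It assumes that _IDENTIFIER_CHAR covers ASCII letters, digits and underscore; that definition is not shown in this thread. The class, field and method names are hypothetical and are not part of Lucene.

```java
import java.util.regex.Pattern;

class TermRuleSketch {
    // Characters excluded by both rules: " space tab ( ) : & | ^ * ? ~ { } [ ]
    static final String NOT_SPECIAL = "[^\" \\t():&|^*?~{}\\[\\]]";

    // ASSUMPTION: _IDENTIFIER_CHAR is taken to be [a-zA-Z0-9_].
    // Original rule: one identifier character, then zero or more non-special characters.
    static final Pattern ORIGINAL = Pattern.compile("[a-zA-Z0-9_]" + NOT_SPECIAL + "*");

    // Modified rule: one or more non-special characters, with no constraint on the first.
    static final Pattern MODIFIED = Pattern.compile(NOT_SPECIAL + "+");

    // True when the whole text would be accepted as a single term by the given rule.
    static boolean isTerm(Pattern rule, String text) {
        return rule.matcher(text).matches();
    }

    public static void main(String[] args) {
        String accented = "\u00e1"; // the single character "á"
        System.out.println(isTerm(ORIGINAL, accented)); // false
        System.out.println(isTerm(MODIFIED, accented)); // true
    }
}
```

Under that assumption, the original pattern rejects "á" because its first character is not an ASCII identifier character, while the modified pattern accepts it.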



peter

> -----Original Message-----
> From: Antonio Ortega [mailto:aortega@fihoca.com]
> Sent: Monday, October 29, 2001 1:51 PM
> To: Lucene
> Subject: Exception when search with accents.



Re: Fw: Exception when search with accents.
It's a problem with the query parser, but this is not the correct fix.
I think this fix just masks the problem.

> I think it's a bug in the QueryParser.jj file, because there a term is
> defined as a character string that begins with an English character.
>
> Modify one line in QueryParser.jj.
>
> Original line:
> <TERM: <_IDENTIFIER_CHAR>
>        ( ~["\"", " ", "\t", "(", ")", ":", "&", "|", "^", "*",
>            "?", "~", "{", "}", "[", "]" ] )* >
>
> Modified line:
> <TERM: ( ~["\"", " ", "\t", "(", ")", ":", "&", "|", "^", "*", "?",
>            "~", "{", "}", "[", "]" ] )+ >
>
> Then rebuild the package. With this change I managed to search for "á".
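
This message does not say what the proposed fix masks. As one possible illustration only (an assumption, not something stated in this thread), the relaxed rule no longer constrains the first character of a term, so a regex stand-in for it also accepts text that starts with "+" or "-", which Lucene's query syntax uses as modifiers. The sketch below uses java.util.regex instead of JavaCC, assumes _IDENTIFIER_CHAR is [a-zA-Z0-9_], and uses hypothetical names. Whether the generated lexer is affected in the same way depends on the rest of QueryParser.jj, which is not shown here.

```java
import java.util.regex.Pattern;

class RelaxedTermSketch {
    // Characters excluded by both rules: " space tab ( ) : & | ^ * ? ~ { } [ ]
    static final String NOT_SPECIAL = "[^\" \\t():&|^*?~{}\\[\\]]";

    // ASSUMPTION: _IDENTIFIER_CHAR is taken to be [a-zA-Z0-9_].
    static final Pattern ORIGINAL_TERM = Pattern.compile("[a-zA-Z0-9_]" + NOT_SPECIAL + "*");

    // Stand-in for the modified rule: any run of non-special characters.
    static final Pattern MODIFIED_TERM = Pattern.compile(NOT_SPECIAL + "+");

    // True when the whole text would be accepted as a single term by the given rule.
    static boolean isSingleTerm(Pattern rule, String text) {
        return rule.matcher(text).matches();
    }

    public static void main(String[] args) {
        // "\u00e1" is the single character "á".
        String[] samples = {"\u00e1", "+lucene", "-lucene", "lucene"};
        for (String sample : samples) {
            System.out.println(sample
                    + " original=" + isSingleTerm(ORIGINAL_TERM, sample)
                    + " modified=" + isSingleTerm(MODIFIED_TERM, sample));
        }
    }
}
```

In this emulation, "+lucene" and "-lucene" are rejected as single terms by the original pattern but accepted whole by the modified one.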

