Mailing List Archive

TokenMgrError
Hi,
I am receiving a TokenMgrError when certain characters such as commas, '[',
and '<' are used in a query.
I read a message in the archive where someone was experiencing a similar
problem, and apparently certain characters have special meanings in the
query, and the TokenMgrError is thrown when these characters are not used
correctly.
I am currently taking a query string directly from an input field on a web
site, and so I can't ensure that users will write the query correctly.
Since I think it is common for a web user to enter a comma into their search
query, I am wondering how other people are handling this problem. Has
anyone written a tokenizer that can safely read any query from a web user
without throwing the error? Or is what I am experiencing potentially a bug?
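
As a minimal sketch of one possible workaround (not a confirmed fix for this
Lucene version), the raw input could be stripped of punctuation before it
reaches QueryParser.parse. The class name QuerySanitizer and the method
sanitize are hypothetical, and the sketch assumes it is acceptable to discard
query operators such as quotes and wildcards along with the problem
characters.

```java
public class QuerySanitizer {

    /**
     * Hypothetical helper: replaces every character that is not a letter,
     * digit, or whitespace with a space, then collapses runs of whitespace.
     * Assumption: losing operators such as quotes, '+', and '*' is acceptable.
     */
    public static String sanitize(String rawQuery) {
        if (rawQuery == null) {
            return "";
        }
        StringBuilder cleaned = new StringBuilder(rawQuery.length());
        for (int i = 0; i < rawQuery.length(); i++) {
            char c = rawQuery.charAt(i);
            if (Character.isLetterOrDigit(c) || Character.isWhitespace(c)) {
                cleaned.append(c);
            } else {
                // Commas, brackets, angle brackets, etc. become separators.
                cleaned.append(' ');
            }
        }
        return cleaned.toString().trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // prints "apples pears plums"
        System.out.println(sanitize("apples, [pears] <plums>"));
    }
}
```

The cleaned string would then be passed to the parser in place of the raw
input. Since unexpected input can still slip through, catching TokenMgrError
around the parse call and showing the user a friendly message is probably
still worthwhile.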

In case it is helpful, the stack trace generated when a comma is
entered in the search query is:
org.apache.lucene.queryParser.TokenMgrError: Lexical error at line 1, column 2. Encountered: after : ""
    at org.apache.lucene.queryParser.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:523)
    at org.apache.lucene.queryParser.QueryParser.jj_ntk(QueryParser.java:583)
    at org.apache.lucene.queryParser.QueryParser.Modifiers(QueryParser.java:216)
    at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:251)
    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:72)
    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:49)

I tried tracking the error down in the QueryParserTokenManager to determine
the proper usage, but I was having trouble understanding exactly what the
class was doing since it contained a lot of hard-coded hex and weird method
names.

Any suggestions are greatly appreciated.

Thanks,
Jordan