Mailing List Archive: about my proposed patch for bug 7782

about my proposed patch for bug 7782

Apr 12, 2002, 7:55 PM

Post #1 of 5 (663 views)

I just added a patch to the report for bug 7782. It fixes the problem
in QueryParser.jj where TokenStream.close() is not being called. It
also addresses the problem of the IOExceptions that TokenStream creates
being silently ignored -- the exceptions are now sent back up to the
user of the QueryParser. In the current codebase there's no way for
a user of QueryParser to know that the TokenStream blew up.

This change obviously means that users of QueryParser need to catch
IOException.

What's the review procedure for contributed patches?

Eric

--
To unsubscribe, e-mail: <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>

Re: about my proposed patch for bug 7782 [ In reply to ]

otis_gospodnetic at yahoo

Apr 13, 2002, 1:15 PM

Post #2 of 5 (667 views)

Permalink

Hello,

I just checked out the patch. Thanks for providing it.
I assume you are right about the need to call close() on TokenStream.
Have you encontered problems with the query parser before your changes?
I'm asking because I haven't (yet).
It seems like the change is the addition of a few finally blocks and
some formatting changes. Ah, talking about that - they make the patch
longer and harder to check by just glancing at the code. I think the
formatting changes are sometimes in order, but it's probably better not
to mix them with code/functionality changes. I was once reprimanded
for doing that. :)

I think the change is fine, but since it's not backwards compatible
(IOException) we should, I think, apply it only after the upcoming
release.

Perhaps it would be good to get rid of TokenMgrException and wrap it in
ParseException at that time as well. What do you think, people?
Subsequently we could also apply Hungarian Peter's enhancement that
allows one to speciify whether OR or AND should be used by default by
the query parser.

I hope this answers your questions.
American Peter, when do we release? Did you get the priviledges that
you need from Geir yet?

Croatian Otis

--- "Eric D. Friedman" <eric@conveysoftware.com> wrote:
> I just added a patch to the report for bug 7782. It fixes the
> problem
> in QueryParser.jj where TokenStream.close() is not being called. It
> also addresses the problem of the IOExceptions that TokenStream
> creates
> being silently ignored -- the exceptions are now sent back up to the
> user of the QueryParser. In the current codebase there's no way for
> a user of QueryParser to know that the TokenStream blew up.
>
> This change obviously means that users of QueryParser need to catch
> IOException.
>
> What's the review procedure for contributed patches?
>
> Eric
>
>
> --
> To unsubscribe, e-mail:
> <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail:
> <mailto:lucene-dev-help@jakarta.apache.org>
>

__________________________________________________
Do You Yahoo!?
Yahoo! Tax Center - online filing with TurboTax
http://taxes.yahoo.com/

--
To unsubscribe, e-mail: <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>

Re: about my proposed patch for bug 7782 [ In reply to ]

eric at conveysoftware

Apr 13, 2002, 7:25 PM

Post #3 of 5 (661 views)

Permalink

Hi Otis,

On Sat, 13 Apr 2002, Otis Gospodnetic wrote:

> I assume you are right about the need to call close() on TokenStream.
> Have you encontered problems with the query parser before your changes?
> I'm asking because I haven't (yet).

Yes, because my Analyzers cache TokenStream instances that are
expensive to set up. Calling close is required for the shared instance
to be made available to the next user.

> It seems like the change is the addition of a few finally blocks and
> some formatting changes. Ah, talking about that - they make the patch
> longer and harder to check by just glancing at the code. I think the
> formatting changes are sometimes in order, but it's probably better not
> to mix them with code/functionality changes. I was once reprimanded
> for doing that. :)

Two things here: one, I specified a context level of 5 (cvs diff -c5)
so that the diffs would be both longer and clearer. I'm happy to
upload a shorter diff if that's more helpful.

Two, I made the formatting changes to address inconsistencies introduced
by someone else. That is, the person who wrote the (broken) code for
which my patch applies used a different coding style than is found in
the rest of the file (and in the rest of Lucene, for that matter).

> I think the change is fine, but since it's not backwards compatible
> (IOException) we should, I think, apply it only after the upcoming
> release.

My opinion here is that code which silently swallows exceptions is not
acceptable in a production release and shouldn't be allowed to ship.
As it stands, the QueryParser allows users to unwittingly assemble a
Query that is inconsistent with the input they provided. That is very
bad.

> Perhaps it would be good to get rid of TokenMgrException and wrap it in
> ParseException at that time as well. What do you think, people?
> Subsequently we could also apply Hungarian Peter's enhancement that
> allows one to speciify whether OR or AND should be used by default by
> the query parser.

TokenMgrException and ParseException are two different things. The
former reports a problem in the lexer that prevented it from reading a
valid token; the latter reports a problem encountered in the parser,
which is where the syntax of the query language is enforced.

To make a source code analogy: "return if break;" contains a set of
valid tokens for the Java language, but the syntax is invalid, making a
ParseException the right kind of error to raise. Conversely, "swAtch
(c) { }" contains characters which cannot be recognized by the lexer as
a legal token, so a TokenMgrException is appropriate.

I would strongly urge against blurring the distinction between these
two classes of error, as they really are not the same thing.

Eric

--
To unsubscribe, e-mail: <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>

Re: about my proposed patch for bug 7782 [ In reply to ]

drag0n2 at yahoo

Apr 15, 2002, 7:55 AM

Post #4 of 5 (661 views)

Permalink

Although I agree with you in theory, in practice we need to be backwards
compatable as well is absolutelly correct.

ParseException also does not nesseserally mean "syntax error", it indicates
an error encountered during parsing.

If you would like we can create 3 exception classes:
ParseException
TokenMgrException extends ParseException
SyntaxException extends ParseException

but that would require significant changes thoughout the code, which IMHO
should not be in this release.

For this release we should just throw a ParseException.

>
> To make a source code analogy: "return if break;" contains a set of
> valid tokens for the Java language, but the syntax is invalid, making a
> ParseException the right kind of error to raise. Conversely, "swAtch
> (c) { }" contains characters which cannot be recognized by the lexer as
> a legal token, so a TokenMgrException is appropriate.
>
> I would strongly urge against blurring the distinction between these
> two classes of error, as they really are not the same thing.
>
> Eric
>
>
> --
> To unsubscribe, e-mail:
<mailto:lucene-dev-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail:
<mailto:lucene-dev-help@jakarta.apache.org>

--
To unsubscribe, e-mail: <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>

RE: about my proposed patch for bug 7782 [ In reply to ]

eric at ConveySoftware

Apr 15, 2002, 10:36 AM

Post #5 of 5 (665 views)

Permalink

Hi Eugene,

The problem with folding TokenMgrException into ParseException is that it's
inconsistent with the parser generation framework provided by JavaCC.
Simply put, a lexer error is not a parser error. Indeed, a smart parser may
be able to gracefully recover from lexer errors. Yes, you can hack the
generated classes, but first I'd like to believe that there's a good design
reason for doing so. To my mind, the absence of an "is a" relationship
between these two classes make such a move an abuse of inheritance.

my 2 cents,
Eric

> -----Original Message-----
> From: Eugene Gluzberg [mailto:drag0n2@yahoo.com]
> Sent: Monday, April 15, 2002 7:56 AM
> To: Lucene Developers List
> Subject: Re: about my proposed patch for bug 7782
>
>
> Although I agree with you in theory, in practice we need to
> be backwards
> compatable as well is absolutelly correct.
>
> ParseException also does not nesseserally mean "syntax
> error", it indicates
> an error encountered during parsing.
>
> If you would like we can create 3 exception classes:
> ParseException
> TokenMgrException extends ParseException
> SyntaxException extends ParseException
>
> but that would require significant changes thoughout the
> code, which IMHO
> should not be in this release.
>
> For this release we should just throw a ParseException.
>
> >
> > To make a source code analogy: "return if break;" contains a set of
> > valid tokens for the Java language, but the syntax is
> invalid, making a
> > ParseException the right kind of error to raise.
> Conversely, "swAtch
> > (c) { }" contains characters which cannot be recognized by
> the lexer as
> > a legal token, so a TokenMgrException is appropriate.
> >
> > I would strongly urge against blurring the distinction between these
> > two classes of error, as they really are not the same thing.
> >
> > Eric
> >
> >
> > --
> > To unsubscribe, e-mail:
> <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
> > For additional commands, e-mail:
> <mailto:lucene-dev-help@jakarta.apache.org>
>
>
> --
> To unsubscribe, e-mail:
> <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail:
> <mailto:lucene-dev-help@jakarta.apache.org>
>

--
To unsubscribe, e-mail: <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>