Mailing List Archive: Why exceptions shouldn't be used for flow control [Re: YAS to the "Reading line-by-line" Problem]

Jun 23, 1999, 4:55 PM

Post #1 of 5 (678 views)

William Tanksley wrote:
>
> Both should be handled by exception, not by returning a special
> value.

I don't agree with that. Reaching the end of a file while
reading it is not exceptional enough to warrant requiring
the use of an exception handler to catch it.

What bothers me about using exceptions for flow control
in situations like this is that the effect of an exception
handler is *non-local*, whereas what you're trying to
catch is really only a local concern.

When you write something like

    while 1:
        try:
            line = read_me_a_line()
        except EOFError:
            break

what you really mean by the try-except is "if THIS
PARTICULAR reading operation reaches the end of
the file". But that's not what it does - it catches
any EOFError raised by any operation invoked by
read_me_a_line() that wasn't caught by something
else first.

In my experience, that sort of behaviour tends to
mask bugs. I prefer results which are not errors to
be returned normally, so that I can be sure where
they came from.
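
A minimal sketch of the masking described above. The helper names (`read_header`, `read_all`) and the two-stream setup are made up for illustration; the point is only that the handler cannot tell which operation raised the EOFError.

```python
import io

def read_header(header):
    # Hypothetical helper. The bug: the header stream is already exhausted,
    # so this raises EOFError for a reason unrelated to the data file.
    line = header.readline()
    if not line:
        raise EOFError("header stream is empty")
    return line

def read_me_a_line(data, header):
    # Hypothetical reader that consults the helper before touching the data.
    prefix = read_header(header)
    line = data.readline()
    if not line:
        raise EOFError("data stream is empty")
    return prefix + line

def read_all(data, header):
    lines = []
    while 1:
        try:
            line = read_me_a_line(data, header)
        except EOFError:
            # Intended meaning: "the data file ran out".
            # Actual meaning: "anything below here raised EOFError".
            break
        lines.append(line)
    return lines

data = io.StringIO("a\nb\n")
empty_header = io.StringIO("")
result = read_all(data, empty_header)
unread = data.read()
```

Here the loop ends without reading anything, no traceback appears, and the data stream is still untouched.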

Greg

wtanksle at dolphin

Jun 23, 1999, 6:14 PM

Post #2 of 5 (668 views)

On Thu, 24 Jun 1999 11:55:35 +1200, Greg Ewing wrote:
>William Tanksley wrote:

>> Both should be handled by exception, not by returning a special
>> value.

>I don't agree with that. Reaching the end of a file while
>reading it is not exceptional enough to warrant requiring
>the use of an exception handler to catch it.

"Not exceptional enough". What does that mean? (I ain't never had too
much exception! ;-)

It's an exception to the behavior of a function defined to return the next
entity in a file. It's not an exception to a function which returns the
next entity, or some reserved value otherwise. There's no such thing as a
partial exception; it's a quality, not a quantity.

>What bothers me about using exceptions for flow control

Presumably you're talking about command flow. Okay, but exceptions do
nothing BUT affect flow control. What else can they do?

>in situations like this is that the effect of an exception
>handler is *non-local*, whereas what you're trying to
>catch is really only a local concern.

Is it? Are they? The effects of an exception toss are as local as the
user of the function wants them to be.

If EOF was truly local, then file.read would always be able to decide what
to do with it -- print a message, abort(), pop up a window, or whatever
else it takes.

>When you write something like

>    while 1:
>        try:
>            line = read_me_a_line()
>        except EOFError:
>            break

>what you really mean by the try-except is "if THIS
>PARTICULAR reading operation reaches the end of
>the file". But that's not what it does - it catches
>any EOFError raised by any operation invoked by
>read_me_a_line() that wasn't caught by something
>else first.

I see your problem, but it's pretty hard to sympathize with it, at least
in this case (and it seems to me in any case). Consider that 'for' loops
have been terminated by IndexError exceptions for a long time now, with no
problems reported.
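
For readers who have not met it, the loop termination mentioned above can be sketched as follows. The class `Squares` is made up for illustration; the mechanism (a `for` loop calling `__getitem__` with 0, 1, 2, ... until IndexError is raised) is the legacy sequence protocol, which current Python still falls back to when no `__iter__` is defined.

```python
class Squares:
    """Hypothetical sequence that defines only __getitem__."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, i):
        if i >= self.n:
            # This IndexError is what tells the 'for' loop to stop.
            raise IndexError(i)
        return i * i

collected = []
for x in Squares(4):
    collected.append(x)
```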

If someone else's uncaught EOFError hits this function, something nasty is
almost always going to happen, because of how far this is away from its
correct target. This is robust by a definition of 'robust' with which
some are not perfectly acquainted: if it fails, it does so loudly so that
the tester notices and reports the problem.

In addition, the complexity of the failure required to produce this is
pretty high. How many files do you need to read to get the next line from
this file? What would have happened if you'd ignored the EOF if it had
been returned otherwise?

>In my experience, that sort of behaviour tends to
>mask bugs. I prefer results which are not errors to
>be returned normally, so that I can be sure where
>they came from.

But reading data after EOF _is_ an error condition! A normal return is
precisely what you don't want. A normal return offers permission for the
program to keep going as usual.

What you want is not a normal return, it's an _ab_normal return. Something
like returning (int)-1 when the function's only expected to return char,
or (in Python) returning None. The problem? These things are SO easy to
ignore!!!! C is famous for noiselessly casting the -1 to a char (look at
the old Bash security bug), and Python is even worse -- any return you can
possibly make will almost certainly fit in noiselessly with anything you
do with it. This is Python's ease of use.

The only solution which works in each case is exceptions. This doesn't
mean that using exceptions guarantees bug freedom; nothing can do that.
It simply means that it catches the majority of the bugs the majority of
the time.

You ought to feel lucky. Some respected programming gurus recommend
calling abort() on any odd results, to make misuse of their API as
blatantly obvious as possible.

>Greg

--
-William "Billy" Tanksley
Utinam logica falsa tuam philosophiam totam suffodiant!
:-: May faulty logic undermine your entire philosophy!

tseaver at palladion

Jun 23, 1999, 8:18 PM

Post #3 of 5 (668 views)

William Tanksley wrote:
>
> On Thu, 24 Jun 1999 11:55:35 +1200, Greg Ewing wrote:
> >William Tanksley wrote:
>
> >> Both should be handled by exception, not by returning a special
> >> value.
>
> >I don't agree with that. Reaching the end of a file while
> >reading it is not exceptional enough to warrant requiring
> >the use of an exception handler to catch it.
>
> "Not exceptional enough". What does that mean? (I ain't never had too
> much exception! ;-)
>
> It's an exception to the behavior of a function defined to return the next
> entity in a file. It's not an exception to a function which returns the
> next entity, or some reserved value otherwise. There's no such thing as a
> partial exception; it's a quality, not a quantity.

In Design-by-Contract terms, a service may throw exceptions for two reasons:

* The caller violated the precondition of the contract (passing an
invalid parameter, for instance). Note that ALL such exceptions
are supposed to be avoidable, given enough diligence on the caller's
part.

* The service was unable to fulfill the contract due to the moral
equivalent of force majeure (out of disk space, etc.). No action the
caller could possibly take beforehand would obviate the need to catch
/ propagate these exceptions.

The "EOF" problem in Python acts like one of the second group, because the
file-object protocol (FOP) does not supply a test for EOF: callers who can't
test for EOF can't be blamed for reading off the end. The tradeoff here is that
the FOP is a "bigger tent" than it would be if it provided EOF(): many
FOP-obeying objects literally can't determine EOF without trying to do the read
anyway (sockets, FIFOs, etc.).

Using exceptions for other purposes requires a more definite protocol for them
between caller and service, and thus requires more design justification. Two
cases I see often:

Because they unwind the stack cleanly, exceptions provide an expedient way to
return "success" from deeply-recursive routines, such as graph searches. The
justification is along the following lines: "non-exception implementations will
spend as much effort and code to handle returning the OOB data in line as they
do performing the search".

In Soft-n-GUI ("fewmet") programming: given a suitable framework which catches
the exceptions at the outermost layer and displays them to the user, exceptions
drastically simplify event-handling code: the event handlers check for nasty
bits at the top, and throw if found; the rest of the code is unencumbered with
the kinds of ugly control flow which would otherwise be required.
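
The first of the two cases above, returning "success" from a deep recursion, might look roughly like this. It is a sketch only: the `Found` exception, the function names, and the dictionary-of-lists graph representation are all assumptions made for the example.

```python
class Found(Exception):
    """Hypothetical exception that carries the successful result."""

    def __init__(self, path):
        super().__init__(path)
        self.path = path

def _search(graph, node, target, path, seen):
    path = path + [node]
    if node == target:
        # Unwind every recursion level at once instead of
        # threading a "found" flag back through each return.
        raise Found(path)
    seen.add(node)
    for nxt in graph.get(node, ()):
        if nxt not in seen:
            _search(graph, nxt, target, path, seen)

def find_path(graph, start, target):
    # The try block wraps only the search, so the handler's
    # meaning stays narrow.
    try:
        _search(graph, start, target, [], set())
    except Found as hit:
        return hit.path
    return None

graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
path = find_path(graph, "a", "e")
missing = find_path(graph, "e", "a")
```

Because `Found` is private to this search, nothing else is likely to raise it, which sidesteps the misdirected-exception concern raised earlier in the thread.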

>
> >What bothers me about using exceptions for flow control
>
> Presumably you're talking about command flow. Okay, but exceptions do
> nothing BUT affect flow control. What else can they do?
>
> >in situations like this is that the effect of an exception
> >handler is *non-local*, whereas what you're trying to
> >catch is really only a local concern.
>
> Is it? Are they? The effects of an exception toss are as local as the
> user of the function wants them to be.
>
> If EOF was truly local, then file.read would always be able to decide what
> to do with it -- print a message, abort(), pop up a window, or whatever
> else it takes.

Here's the rub: if read() throws on EOF, it promulgates a policy that EOF is a
"back out" condition: everyone must wrap almost any call to read() in a try
block, even if only to ignore it, or break out of a local loop. Because read()
doesn't throw (and thereby doesn't enforce policy at the mechanism level),
applications which don't need the policy don't pay for it; yet it is trivial
(as the original "YAS" poster shows) to layer the policy on top of the
mechanism, if so desired.

Use-exceptions-daily-in-GUI-code-_except_-when-threads-are-involved'ly

Tres.
--
=========================================================
Tres Seaver tseaver@palladion.com 713-523-6582
Palladion Software http://www.palladion.com

greg.ewing at compaq

Jun 24, 1999, 3:19 PM

Post #4 of 5 (672 views)

William Tanksley wrote:
>
> "Not exceptional enough". What does that mean? (I ain't never had too
> much exception! ;-)

Sorry, I was using "exceptional" in a non-technical sense there.
I meant "not abnormal enough".

I don't agree that reaching EOF is necessarily "abnormal".
If I want to read all the data in a file, and I don't know
in advance how much there is, I write a loop that keeps reading
until EOF. In that case, reaching EOF is not abnormal at all -
it's bound to happen eventually!

On the other hand, if I have reason to believe that there
should be a certain amount of data left, and I hit EOF while
trying to read it, then that is certainly an error, and
raising an exception is a reasonable thing to do.

The question, then, is what a general-purpose library routine
should do on EOF, given that the "right thing" depends on the
circumstances. You would have it always throw an exception,
and require the programmer to catch it if it is an "expected"
exception.

Personally, I prefer it the way it is. If I want an exception
thrown, I can always write a wrapper which does so. Doing it
the other way around - wrapping exception-throwing code into
something which doesn't - is trickier to get right; as I've
shown, it's hard to make sure that it catches only what you
want to catch.
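
Such a wrapper could be as small as the sketch below. The name `readline_or_raise` is made up; it relies only on the documented behaviour that readline() returns an empty string at end of file, while a blank line comes back as "\n".

```python
import io

def readline_or_raise(f):
    # Hypothetical wrapper: turn the "empty string" EOF signal
    # into an exception for callers who consider EOF an error.
    line = f.readline()
    if not line:
        raise EOFError("unexpected end of file")
    return line

f = io.StringIO("one\n\ntwo\n")
first = readline_or_raise(f)
blank = readline_or_raise(f)   # a blank line is "\n", not EOF
third = readline_or_raise(f)
try:
    readline_or_raise(f)
    hit_eof = False
except EOFError:
    hit_eof = True
```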

> In addition, the complexity of the failure required to produce this is
> pretty high. How many files do you need to read to get the next line from
> this file?

It's perhaps not the best example of this kind of problem.
A better one, which I actually have experienced, is from the
days when it was common to catch KeyError as a way of telling
when a key wasn't in a dictionary:

    try:
        v = d[the_key()]
    except KeyError:
        v = some_default_value

The subtle problem with that piece of code is that if there
is a bug which causes the_key() to raise a KeyError (which I
think you'll agree is *not* an unlikely event) it gets
incorrectly caught instead of triggering a traceback.
To guard against that, you have to be very careful what
you put inside the try:

    i = the_key()
    try:
        v = d[i]
    except KeyError:
        v = some_default_value

This is one of the reasons that the get() method was
added to dictionaries. Using it, you can write

v = d.get(the_key(), some_default_value)

which is not only less error-prone but shorter and
clearer as well.
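
To make the difference concrete, here is a small sketch with a deliberately buggy the_key(). The dictionaries and the bug are invented for the example.

```python
d = {"colour": "red"}
aliases = {}  # hypothetical lookup table that the_key() wrongly relies on

def the_key():
    # The bug: this lookup itself raises KeyError.
    return aliases["color"]

# Broad try: the KeyError from the_key() is mistaken for "key not in d".
try:
    v = d[the_key()]
except KeyError:
    v = "default"

# With get(), no handler is involved, so the bug surfaces as a KeyError.
try:
    v2 = d.get(the_key(), "default")
    bug_escaped = False
except KeyError:
    bug_escaped = True
```

In the first form the program carries on with "default" and the bug stays hidden; in the second the KeyError propagates (it is caught here only so the sketch can record that it happened).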

> If someone else's uncaught EOFError hits this function, something nasty is
> almost always going to happen... if it fails, it does so loudly so that
> the tester notices and reports the problem.

But that's exactly what *doesn't* happen! The misdirected exception
silently terminates a loop that it wasn't meant to terminate, and
later on the program fails with some set of symptoms that give little
or no clue as to what the original cause was. I would much rather get
a traceback pinpointing exactly what was thrown and where it was thrown
from.

> C is famous for noiselessly casting the -1 to a char (look at
> the old Bash security bug), and Python is even worse -- any return you can
> possibly make will almost certainly fit in noiselessly with anything you
> do with it.

The design of read() is perhaps not the best here -- it might be
better to return None, which would cause most things expecting a
string to blow up rather more quickly.

> You ought to feel lucky. Some respected programming gurus recommend
> calling abort() on any odd results

Raising an exception is Python's equivalent of calling abort(). But
it only works as such if you can be sure that your exception isn't
going to be caught by something that makes unwarranted assumptions
about its cause. The more you use exceptions for "normal" things,
such as EOF or KeyError or IndexError when they aren't really errors,
the more likely that is to happen. In my experience, anyway.

Greg

greg.ewing at compaq

Jun 27, 1999, 4:17 PM

Post #5 of 5 (666 views)

William Tanksley wrote:
>
>     while not f.eof():
>         line = f.readline()
>         # ...

I'd love to be able to write reading loops that way.
Unfortunately, it's a fact of life that some of the
things read() has to deal with are unable to detect
eof without trying to read something. I think the
existing definition of read() and friends is the best
that can be done in those circumstances.

> It's been a while since I've used read, so I
> don't recall what it actually returns

It returns an empty string if it can't read more
than 0 characters, same as readline().

> in a function designed to get n!=0
> more characters, an EOF really is an error. Unfortunately for me, that's
> not the definition of 'read', and never will be.

Perhaps there should be a 2-parameter version of read:

read(min, max)

which would raise an exception if it couldn't read at
least min characters. Setting min=0 would give the
current behaviour, and setting min=max would allow
reading a fixed-length record without having to check
for errors. Then everyone would be happy!
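
As a rough sketch of that idea, written as a free function rather than a file method: the name `read_between` and the choice of EOFError are assumptions, not an existing API. It also assumes a file-like object whose read(n) only comes up short at end of file; sockets and pipes can return short reads earlier and would need a loop.

```python
import io

def read_between(f, min_chars, max_chars):
    # Hypothetical helper: read up to max_chars, but treat
    # fewer than min_chars as an error.
    data = f.read(max_chars)
    if len(data) < min_chars:
        raise EOFError(
            "wanted at least %d characters, got %d" % (min_chars, len(data))
        )
    return data

f = io.StringIO("abcdefgh")
record = read_between(f, 4, 4)      # fixed-length record: min == max
tail = read_between(f, 0, 10)       # min == 0: same as plain read(10)
at_end = read_between(f, 0, 10)     # empty string at EOF, no exception
try:
    read_between(f, 1, 1)
    short_read_raised = False
except EOFError:
    short_read_raised = True
```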

Greg