Mailing List Archive: Why exceptions shouldn't be used for flow control [Re: YAS to the "Reading line-by-line" Problem]

On Fri, 25 Jun 1999 10:19:05 +1200, Greg Ewing wrote:
>William Tanksley wrote:

>> "Not exceptional enough". What does that mean? (I ain't never had too
>> much exception! ;-)

>Sorry, I was using "exceptional" in a non-technical sense there.
>I meant "not abnormal enough".

>I don't agree that reaching EOF is necessarily "abnormal".
>If I want to read all the data in a file, and I don't know
>in advance how much there is, I write a loop that keeps reading
>until EOF. In that case, reaching EOF is not abnormal at all -
>it's bound to happen eventually!

This is completely true, but a good functional design for an API has each
function doing as little as possible. By that rule, we'd do well to
remove EOF handling from the contract of the read function, and put it
into an eof function.

I'm also aware that doing so goes against the C API, so that's one
overwhelming strike against doing that to the current library (quite aside
from breaking all software ;-).

>The question, then, is what a general-purpose library routine
>should do on EOF, given that the "right thing" depends on the
>circumstances. You would have it always throw an exception,
>and require the programmer to catch it if it is an "expected"
>exception.

Right.

>Personally, I prefer it the way it is. If I want an exception
>thrown, I can always write a wrapper which does so. Doing it
>the other way around - wrapping exception-throwing code into
>something which doesn't - is trickier to get right; as I've
>shown, it's hard to make sure that it catches only what you
>want to catch.

You're entirely correct there. Exception-throwing code depends upon
having a good test system. Without complete tests, it doesn't happen.

OTOH, that particular exception would get thrown only in a truly bizarre
situation, because all code would look something like this:

    while not f.eof():
        line = f.readline()
        # ...

Most programmers wouldn't expect to have to catch an exception any more
than they'd have to test for an exceptional return value.
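A minimal sketch of the loop above, assuming a hypothetical wrapper class (EofReader is not part of the standard library, and Python file objects have no eof() method). It detects end of file by reading one line ahead, so it only suits sources where a read-ahead is acceptable. Calling readline() past the end raises EOFError, which a correctly written loop never triggers.

```python
import io


class EofReader:
    """Hypothetical wrapper that adds an eof() test to a line-oriented file."""

    def __init__(self, f):
        self._f = f
        # Read one line ahead so eof() can answer without consuming data.
        self._next = f.readline()

    def eof(self):
        # readline() returns an empty string only at end of file.
        return self._next == ""

    def readline(self):
        if self._next == "":
            # Reading past the end is a programming error in this design.
            raise EOFError("readline() called at end of file")
        line, self._next = self._next, self._f.readline()
        return line


def read_all_lines(f):
    """Collect every line using the eof()-driven loop."""
    reader = EofReader(f)
    lines = []
    while not reader.eof():
        lines.append(reader.readline())
    return lines
```

With this shape the EOFError can only come from a loop that forgot to test eof(), which is the "truly bizarre situation" the argument relies on.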

>> In addition, the complexity of the failure required to produce this is
>> pretty high. How many files do you need to read to get the next line from
>> this file?

>It's perhaps not the best example of this kind of problem.
>A better one, which I actually have experienced, is from the
>days when it was common to catch KeyError as a way of telling
>when a key wasn't in a dictionary:

>    try:
>        v = d[the_key()]
>    except KeyError:
>        v = some_default_value

Yes, I agree. And I suspect the problem was that my explanation of my
solution was bad.

>The subtle problem with that piece of code is that if there
>is a bug which causes the_key() to raise a KeyError (which I
>think you'll agree is *not* an unlikely event) it gets
>incorrectly caught instead of triggering a traceback.
>To guard against that, you have to be very careful what
>you put inside the try:

>    i = the_key()
>    try:
>        v = d[i]
>    except KeyError:
>        v = some_default_value

>This is one of the reasons that the get() method was
>added to dictionaries. Using it, you can write

> v = d.get(the_key(), some_default_value)

>which is not only less error-prone but shorter and
>clearer as well.

This is a very interesting case of a "bad" API replacing what I would have
called a "good" one. It's obvious that I would have been incorrect. I
think the distinction here is that a dictionary isn't a stream, and
there's a reasonable expectation of atomic access.
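The KeyError problem described above can be reproduced in a few lines. This is an illustrative sketch only: buggy_key, lookup_wide and lookup_get are hypothetical names, and the bug inside buggy_key is contrived so that it raises KeyError.

```python
def buggy_key():
    """Hypothetical key function with a bug: its own lookup fails."""
    settings = {}
    return settings["name"]  # raises KeyError because of the bug


def lookup_wide(d, default):
    """Wide try block: the KeyError from buggy_key() is swallowed too."""
    try:
        return d[buggy_key()]
    except KeyError:
        return default


def lookup_get(d, default):
    """dict.get version: only a missing key in d yields the default."""
    return d.get(buggy_key(), default)
```

Here lookup_wide({"a": 1}, 0) quietly returns 0 and hides the bug, while lookup_get({"a": 1}, 0) lets the KeyError from buggy_key() propagate as a traceback.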

>> If someone else's uncaught EOFError hits this function, something nasty is
>> almost always going to happen... if it fails, it does so loudly so that
>> the tester notices and reports the problem.

>But that's exactly what *doesn't* happen! The misdirected exception
>silently terminates a loop that it wasn't meant to terminate, and
>later on the program fails with some set of symptoms that give little
>or no clue as to what the original cause was. I would much rather get
>a traceback pinpointing exactly what was thrown and where it was thrown
>from.

You're right -- and the solution to this is the same as the solution to
the corresponding problem with exceptional return values (-1 for EOF).
Build the API so that they're never something you'd want to catch.

>> C is famous for noiselessly casting the -1 to a char (look at
>> the old Bash security bug), and Python is even worse -- any return you can
>> possibly make will almost certainly fit in noiselessly with anything you
>> do with it.

>The design of read() is perhaps not the best here -- it might be
>better to return None, which would cause most things expecting a
>string to blow up rather more quickly.

That's also a good idea. It's been a while since I've used read, so I
don't recall what it actually returns (I've used nothing but readline).

>> You ought to feel lucky. Some respected programming gurus recommend
>> calling abort() on any odd results

>Raising an exception is Python's equivalent of calling abort(). But
>it only works as such if you can be sure that your exception isn't
>going to be caught by something that makes unwarranted assumptions
>about its cause. The more you use exceptions for "normal" things,
>such as EOF or KeyError or IndexError when they aren't really errors,
>the more likely that is to happen. In my experience, anyway.

You're entirely right. And so am I -- in a function designed to get n!=0
more characters, an EOF really is an error. Unfortunately for me, that's
not the definition of 'read', and never will be.

And it doesn't really matter -- one writes wrappers for everything anyhow
(read is supposed to be primitive).

>Greg

--
-William "Billy" Tanksley
Utinam logica falsa tuam philosophiam totam suffodiant!
:-: May faulty logic undermine your entire philosophy!

On Mon, 28 Jun 1999 11:17:01 +1200, Greg Ewing wrote:
>William Tanksley wrote:

>>     while not f.eof():
>>         line = f.readline()
>>         # ...

>I'd love to be able to write reading loops that way.
>Unfortunately, it's a fact of life that some of the
>things read() has to deal with are unable to detect
>eof without trying to read something. I think the
>existing definition of read() and friends is the best
>that can be done in those circumstances.

I'd guess that you're talking about things like text file parsing and
such. I would use readline there, but that's fine. Yes, unstructured
data is more difficult to handle.

>> It's been a while since I've used read, so I
>> don't recall what it actually returns

>It returns an empty string if it can't read more
>than 0 characters, same as readline().

Ah. There's a justification for that, as you observed. I don't like the
fact that readline is stuck doing the same thing (record delimiters are
not really part of the record).
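For reference, a small sketch of the conventional loop under the existing definition, where readline() returns an empty string at end of file. The function name count_lines is made up for illustration. It also shows one practical consequence of keeping the delimiter: a blank line comes back as "\n", so it cannot be confused with the empty string that signals EOF.

```python
import io


def count_lines(f):
    """Return (total lines, blank lines) using the empty-string EOF test."""
    total = 0
    blank = 0
    while True:
        line = f.readline()
        if line == "":
            # Empty string means end of file, never a blank line.
            break
        total += 1
        if line == "\n":
            # A blank line still carries its newline delimiter.
            blank += 1
    return total, blank
```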

>> in a function designed to get n!=0
>> more characters, an EOF really is an error. Unfortunately for me, that's
>> not the definition of 'read', and never will be.

>Perhaps there should be a 2-parameter version of read:

> read(min, max)

>which would raise an exception if it couldn't read at
>least min characters. Setting min=0 would give the
>current behaviour, and setting min=max would allow
>reading a fixed-length record without having to check
>for errors. Then everyone would be happy!

A fascinating solution. Well, since I don't use read, I probably wouldn't
use it, and I'm the only person whining. Therefore, it's probably not
worth implementing (even though it is a cool solution).
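A rough sketch of how the two-parameter read could be written as a wrapper around the existing read(). The name read_between and its parameter names are assumptions, not an existing API, and the sketch assumes a text-mode file object. It keeps reading until it has max_chars characters or hits EOF, which may differ from plain read() on sources that return short reads.

```python
import io


def read_between(f, min_chars, max_chars):
    """Hypothetical read(min, max): raise EOFError if fewer than
    min_chars characters are available, return at most max_chars."""
    if not 0 <= min_chars <= max_chars:
        raise ValueError("need 0 <= min_chars <= max_chars")
    chunks = []
    total = 0
    while total < max_chars:
        chunk = f.read(max_chars - total)
        if not chunk:
            # Empty result: the underlying read() reached end of file.
            break
        chunks.append(chunk)
        total += len(chunk)
    data = "".join(chunks)
    if len(data) < min_chars:
        raise EOFError(
            "wanted at least %d characters, got %d" % (min_chars, len(data))
        )
    return data
```

With min_chars=0 a short or empty result is returned as-is; with min_chars equal to max_chars a fixed-length record is either returned whole or reported as an error.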

>Greg

--
-William "Billy" Tanksley