Revision as of 00:56, 6 December 2009

There has long been confusion about the distinction between errors and exceptions,
as shown by recurring threads on Haskell-Cafe and a growing number of packages that handle errors, exceptions, or something in between.
Although the two terms are related and sometimes hard to tell apart, it is important to distinguish them carefully.
The situation resembles the confusion between parallelism and concurrency.

The first problem is that "exception" seems to me to be the historically younger term.
Before, there were only errors, regardless of whether they were programming, I/O, or user errors.
In this article we want to use the term exception for expected but irregular situations at runtime
and the term error for mistakes in the running program that can be resolved only by fixing the program.
We do not want to distinguish between different ways of representing exceptions:
Maybe, Either, exceptions in the IO monad, or return codes,
they all represent exceptions and are worth considering for exception handling.
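
To make the point about representations concrete, here is a minimal sketch of one exceptional situation (malformed user input) expressed once with Maybe and once with Either. The type ParseException and the functions parseAgeMaybe and parseAgeEither are names invented for this illustration.

```haskell
import Text.Read (readMaybe)

-- Hypothetical exception type for this sketch (not part of any library).
data ParseException = EmptyInput | NotANumber String
  deriving (Show, Eq)

-- Maybe: tells the caller that the input was malformed, but not why.
parseAgeMaybe :: String -> Maybe Int
parseAgeMaybe = readMaybe

-- Either: the same exceptional situation, with a description attached.
parseAgeEither :: String -> Either ParseException Int
parseAgeEither "" = Left EmptyInput
parseAgeEither s  = maybe (Left (NotANumber s)) Right (readMaybe s)

main :: IO ()
main = do
  print (parseAgeMaybe "42")
  print (parseAgeEither "forty-two")
```

In both variants the caller is forced by the type to consider the irregular case, which is what matters for exception handling.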

The history may have led to the identifiers we find today in the Haskell language and standard Haskell modules.

However, infinite loops in general cannot be caught, whereas calls to sugared functions like error can.
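
As a small illustration of the last point, the following sketch catches a call to error from within IO. The functions firstElement and probe are made up for this example; try, evaluate and ErrorCall come from Control.Exception.

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)

-- A deliberately buggy function: it calls 'error' on an empty list.
firstElement :: [a] -> a
firstElement (x:_) = x
firstElement []    = error "firstElement: empty list"

-- Force a value to weak head normal form inside IO and report
-- whether a call to 'error' was hit while doing so.
probe :: a -> IO (Either String a)
probe x = do
  r <- try (evaluate x)
  pure $ case r of
    Left (ErrorCall msg) -> Left msg
    Right v              -> Right v

main :: IO ()
main = do
  probe (firstElement ([] :: [Int])) >>= print
  probe (firstElement [1 :: Int, 2]) >>= print
```

That this works does not turn the empty-list case into an exception in the sense of this article: it remains a bug in the caller of firstElement.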

Even more confusion was introduced by the Java programming language,
which uses the term "exceptions" for programming errors like the NullPointerException
and introduces the distinction between
checked and unchecked exceptions.

1 Examples

Let me give some examples to explain the difference between errors and exceptions
and why the distinction is important.

First consider a compiler like GHC.
If you feed it a program that contains syntax or type errors, it emits a descriptive message about the problem.
For GHC these are exceptions.
GHC must expect all of these problems and handles them by generating a useful message for the user.
However, sometimes you "succeed" in making GHC emit something like "Panic! This should not happen: ... Write a bug report to ghc@haskell.org"
Then you have encountered a bug in GHC. For GHC this is an error. It cannot be handled by GHC itself.
A report like "didn't expect TyVar in TyCon after unfolding" is not of much help to the user.
It's the business of the GHC developers to fix the problem.

OK, these are possible reactions to user input.
Now a more difficult question:
How should GHC handle corruption in the files it has generated itself, like the interface (.hi) and object files (.o)?
Such corruption can easily be introduced by the user editing the files in a simple text editor,
or by network problems, or by exchanging files between operating systems or different GHC versions, or by viruses.
Thus GHC must be prepared for it, which means it must generate and handle exceptions here:
it must at least tell the user that there is some problem with the file it read.
Next question: Must GHC also be prepared for corrupt memory or damage to the CPU?
Good question. I don't think it must be prepared for that.

Now we proceed with two examples that show what happens if you try to treat errors like exceptions.
I was involved in the development of a library that was written in C++.
One of the developers told me that the developers were divided into those who like exceptions
and those who prefer return codes.
As it seems to me, the friends of return codes won.
However, I got the impression that they debated the wrong point:
Exceptions and return codes are equally expressive, but neither should be used to describe errors.
In fact, the return codes contained definitions like ARRAY_INDEX_OUT_OF_RANGE.
But I wondered: How shall my function react when it gets this return code from a subroutine?
Shall it send a mail to its programmer?
It could return this code to its caller in turn, but the caller will not know how to cope with it either.
Even worse, since I cannot make assumptions about the implementation of a function,
I have to expect an ARRAY_INDEX_OUT_OF_RANGE from every subroutine.
My conclusion is that ARRAY_INDEX_OUT_OF_RANGE is a (programming) error.
It cannot be handled or fixed at runtime; it can only be fixed by the developer.

The second example is a library for advanced arithmetic in Modula-3.
I decided to use exceptions for signalling problems.
One of the exceptions was VectorSizeMismatch,
which was raised whenever two vectors of different sizes were to be added or combined in a scalar product.
However, I quickly found that almost every function in the library could potentially raise this exception,
and Modula-3 urges you to declare all potential exceptions.
(However, ignoring potential exceptions only yields a compiler warning, which can even be suppressed.)
I also noticed that, due to the way I generated and combined the vectors and matrices,
the sizes would always match.
Thus a mismatch would mean that there is a problem not with user input but with my program.
Consequently, I removed this exception and replaced the checks by ASSERT.
These ASSERTs can be disabled by a compiler switch for efficiency.
A correct program fulfils all ASSERTs and thus it does not make a difference
whether they are present in the compiled program or not.
In a faulty program the presence of ASSERTs only controls the way the program fails:
with a failed assertion if they are present, or otherwise by giving wrong results or segmentation faults.
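
The same idea can be sketched in Haskell with assert from Control.Exception. The Vector type and the function addVectors are hypothetical stand-ins for the Modula-3 library described above. As far as I know, GHC drops such assertions when compiling with -O or -fignore-asserts, which mirrors the compiler switch mentioned above.

```haskell
import Control.Exception (assert)

-- Hypothetical vector type for this sketch: a plain list of Doubles.
type Vector = [Double]

-- The size check documents an invariant of the program.
-- It is not a validation of user input, so it is an assertion
-- and not an exception that callers would have to handle.
addVectors :: Vector -> Vector -> Vector
addVectors xs ys =
  assert (length xs == length ys) (zipWith (+) xs ys)

main :: IO ()
main = print (addVectors [1, 2, 3] [10, 20, 30])
```

Note that the type of addVectors does not mention the mismatch at all, so callers are not burdened with a case that a correct program never reaches.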

My conclusion is that (programming) errors can only be handled by the programmer,
not by the running program.
Thus the term "error handling" sounds contradictory to me.
However, supporting programmers in finding errors (bugs) in their programs is a good thing.
I just wouldn't call it "error handling" but "debugging".

An important example in Haskell is the module Debug.Trace.
It provides the function trace that looks like a non-I/O function
but actually outputs something on the console.
It is natural that debugging functions employ hacks.
For finding a programming error it would be inappropriate to transform the program code
to allow I/O in a set of functions that do not need it otherwise.
The change would only persist until the bug is detected and fixed.
In summary, hacks in debugging functions
are necessary for quickly finding problems without large restructuring of the program,
and they are not problematic, because they only exist until the bug is removed.
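
A minimal sketch of such a temporary hack follows. The function collatzStep is an arbitrary pure function chosen for illustration; only trace is taken from Debug.Trace.

```haskell
import Debug.Trace (trace)

-- A pure function under suspicion. The 'trace' call is a temporary
-- debugging aid: the type stays Int -> Int, no IO appears in it.
collatzStep :: Int -> Int
collatzStep n =
  trace ("collatzStep called with " ++ show n) $
    if even n then n `div` 2 else 3 * n + 1

main :: IO ()
main = print (collatzStep 6)
```

Once the bug is found, the trace line is deleted again and no caller has to change.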

In contrast, exceptions are things you cannot fix in advance.
You will always have to live with files that cannot be found and user input that is malformed.
You can insist that the user does not hit the X key,
you may threaten to send lawyers,
but your program has to be prepared to receive an "X key pressed" message nonetheless.
Thus exceptions belong to the program, and
the program must be adapted to treat exceptional values wherever they can occur.
No hacks can be accepted for exception handling.

2 When exceptions become errors

Another issue that makes the distinction between exceptions and errors difficult is
that sometimes one gets converted into the other.

It is an error to not handle an exception.
If a file cannot be opened you must respect that result.
You can proceed as if the file could be opened, though.
If you do so, you might crash the machine,
or the runtime system might terminate your program.
All of these effects are possible consequences of a (programming) error.
Again, it does not matter whether the exceptional situation is signaled by a return code that you ignore
or by an IO exception for which you did not run a catch.
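
For comparison, here is a sketch of respecting the result of opening a file. The function readConfigOr, the fallback text and the file path are assumptions made up for this example; try and IOException are from Control.Exception.

```haskell
import Control.Exception (IOException, try)

-- Read a file, treating a failure to open it as an expected exception.
-- The fallback value is a made-up default for this sketch.
readConfigOr :: String -> FilePath -> IO String
readConfigOr fallback path = do
  r <- try (readFile path) :: IO (Either IOException String)
  case r of
    Left _         -> pure fallback   -- the exception is handled here
    Right contents -> pure contents

main :: IO ()
main =
  readConfigOr "default settings" "/nonexistent/config.txt" >>= putStrLn
```

Calling readFile directly and ignoring the possibility of failure would be the programming error described above: the runtime system would terminate the program on a missing file.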

3 When errors become exceptions

The distinction between errors and exceptions is often criticized
because there are software architectures
where even programming errors in one part shall not crash a larger piece of software.
Typical examples are:
A process in an operating system shall not crash the whole system if it crashes itself.
A buggy browser plugin shall not terminate the browser.
A corrupt CGI script shall not bring down the web server it runs on.

In these cases errors are handled like exceptions.
But that is no reason to dismiss the distinction between errors and exceptions altogether.
Obviously there are levels, and when crossing level boundaries it is OK to turn an error into an exception.
The part that contains an error cannot do anything to recover from it,
and the next higher level cannot fix it either, but it can restrict the damage.
Within one encapsulated part of an architecture errors and exceptions shall be strictly separated.
(Or put differently: If at one place you think you have to handle an error like an exception,
why not divide the program into two parts at this position? :-) )
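
Such a level boundary might be sketched as follows, with a host that runs a plugin and turns the plugin's errors into an exceptional result. The names Plugin, buggyPlugin and runPlugin are invented for this illustration, and the sketch simplifies a lot: it only forces the spine of the result, and catching SomeException also catches asynchronous exceptions, which a real host would have to treat separately.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical plugin interface: a plugin maps a request to a response.
type Plugin = String -> String

-- A plugin with a bug: its author forgot the empty request.
buggyPlugin :: Plugin
buggyPlugin ""  = error "buggyPlugin: unhandled empty request"
buggyPlugin req = reverse req

-- The host cannot fix the plugin, but it can restrict the damage:
-- at this boundary the plugin's error becomes an exception of the host.
runPlugin :: Plugin -> String -> IO (Either String String)
runPlugin plugin req = do
  let out = plugin req
  -- 'length' forces the spine of the response so that a crash
  -- happens here, inside 'try', and not later in the host.
  r <- try (evaluate (length out)) :: IO (Either SomeException Int)
  pure $ case r of
    Left e  -> Left ("plugin failed: " ++ show e)
    Right _ -> Right out

main :: IO ()
main = do
  runPlugin buggyPlugin "abc" >>= print
  runPlugin buggyPlugin ""    >>= print
```

Inside buggyPlugin the missing case stays a programming error; only the host, one level up, treats "the plugin failed" as an expected situation.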