Lately, I have been converting the code in librsvg that handles XML
from C to Rust. For many technical reasons, the library still uses
libxml2, GNOME's historic XML parsing library, but some of the
callbacks that handle XML events such as start_element, end_element,
and characters are now implemented in Rust. This has meant that I'm
running into all the cases where the original C code in librsvg failed
to handle errors properly; Rust really makes it obvious when that
happens.

In this post I want to talk a bit about propagating errors. You call
a function, it returns an error, and then what?

What can fail?

It turns out that this question is highly context-dependent. Let's
say a program is starting up and tries to read a configuration file.
What could go wrong?

The file doesn't exist. Maybe it is the very first time the program
is run, and so there isn't a configuration file at all? Can the
program provide a default configuration in this case? Or does it
absolutely need a pre-written configuration file to be somewhere?

The file can't be parsed. Should the program warn the user and
exit, or should it revert to a default configuration (and if so,
should it overwrite the file with valid, default values)? Can
the program warn the user at all, or is it a user-less program that at
best can just shout into the void of a server-side log file?

The file can be parsed, but the values are invalid. Same questions
as the case above.

Etcetera.

At each stage, the code will probably see very low-level errors ("file
not found", "I/O error", "parsing failed", "value is out of range").
What the code decides to do, or what it is able to do at any
particular stage, depends both on the semantics you want from the
program and on the structure of the code itself.
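As a small illustration of one such decision, here is a sketch that treats "file not found" as a first run and falls back to defaults, while passing every other I/O error up to the caller. The names load_config_text and DEFAULT_CONFIG, and the contents of the default, are made up for this example; whether a fallback is the right policy depends on the program.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical default contents, used when no configuration file exists yet.
const DEFAULT_CONFIG: &str = "verbose = false\n";

/// Reads the raw configuration text.  A missing file is treated as a
/// first run and yields the defaults; any other I/O error is returned
/// unchanged so that the caller can decide what to do with it.
fn load_config_text(path: &Path) -> Result<String, io::Error> {
    match fs::read_to_string(path) {
        Ok(text) => Ok(text),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(DEFAULT_CONFIG.to_string()),
        Err(e) => Err(e),
    }
}
```

The same low-level error could just as well be fatal in a program that requires a pre-written configuration file; only the match arms would change.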

Structuring the problem

This is an easy but very coarse way of handling things:

    gboolean
    read_configuration (const char *config_file_name)
    {
        /* open the file */

        /* parse it */

        /* set global variables to the configuration values */

        /* return true if success, or false if failure */
    }

What is bad about this? Let's see:

The calling code just gets a success/failure condition. In the case
of failure, it doesn't get to know why things failed.

If the function sets global variables with configuration values as
they get read... and something goes wrong and the function returns
an error... the caller ends up possibly in an inconsistent state,
with a set of configuration variables that are only halfway-set.

If the function finds parse errors, well, do you really want to call
UI code from inside it? The caller might be a better place to make
that decision.

A slightly better structure

Let's add an enumeration to indicate the possible errors, and a
structure of configuration values.

    enum ConfigError {
        ConfigFileDoesntExist,
        ParseError, // config file has bad syntax or something
        ValueError, // config file has an invalid value
    }

    struct ConfigValues {
        // a bunch of fields here with the program's configuration
    }

    fn read_configuration(filename: &Path) -> Result<ConfigValues, ConfigError> {
        // open the file, or return Err(ConfigError::ConfigFileDoesntExist)

        // parse the file; or return Err(ConfigError::ParseError)

        // validate the values, or return Err(ConfigError::ValueError)

        // if everything succeeds, return Ok(ConfigValues)
    }

This is better, in that the caller decides what to do with the
validated ConfigValues: maybe it can just copy them to the
program's global variables for configuration.
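A sketch of that caller-side step, assuming the global configuration lives in a std::sync::OnceLock. The CONFIG static, the install_configuration function, and the verbose field are hypothetical. The point is that the global is only touched after the whole read has succeeded, so it can never end up halfway-set.

```rust
use std::sync::OnceLock;

#[derive(Debug, PartialEq)]
enum ConfigError {
    ConfigFileDoesntExist,
    ParseError,
    ValueError,
}

#[derive(Debug, PartialEq)]
struct ConfigValues {
    // Hypothetical field standing in for the program's real settings.
    verbose: bool,
}

/// Hypothetical global holding the validated configuration.
static CONFIG: OnceLock<ConfigValues> = OnceLock::new();

/// Takes the result of reading the configuration and installs the values
/// globally only on success.  On failure the global is left untouched and
/// the error goes back to the caller.
fn install_configuration(result: Result<ConfigValues, ConfigError>) -> Result<(), ConfigError> {
    let values = result?;
    // Installing twice would be a bug in the program, not a runtime
    // condition to recover from, so it panics.
    CONFIG.set(values).expect("configuration installed twice");
    Ok(())
}
```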

However, this scheme doesn't give the caller all the information it
would like to present a really good error message. For example, the
caller will get to know if there is a parse error, but it doesn't know
specifically what failed during parsing. Similarly, it will just get
to know if there was an invalid value, but not which one.

Ah, so the problem is fractal

We could have new structs to represent the little errors, and then
make them part of the original error enum:
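A minimal sketch of what that could look like. Only the error_reason fields are described in the text; the line and key fields are hypothetical additions for illustration.

```rust
/// Details about a syntax error in the configuration file.
#[derive(Debug, PartialEq)]
struct ParseError {
    line: usize, // hypothetical: where the parser gave up
    error_reason: String,
}

/// Details about a value that parsed fine but is not acceptable.
#[derive(Debug, PartialEq)]
struct ValueError {
    key: String, // hypothetical: which setting was invalid
    error_reason: String,
}

/// The original error enum, now carrying the smaller errors inside it.
#[derive(Debug, PartialEq)]
enum ConfigError {
    ConfigFileDoesntExist,
    ParseError(ParseError),
    ValueError(ValueError),
}
```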

The ParseError and ValueError structs have individual
error_reason fields, which are strings. Presumably, one could have
a ParseError with error_reason = "unexpected token", or a
ValueError with error_reason = "cannot be a negative number".

One problem with this is that if the low-level errors come with error
messages in English, then the caller has to know how to localize them
to the user's language. Also, if they don't have a machine-readable
error code, then the calling code may not have enough information to
decide what to do with the error.

Let's say we had a ParseErrorKind enum with variants like
UnexpectedToken, EndOfFile, etc. This is fine; it lets the
calling code know the reason for the error. Also, there can be a
gimme_localized_error_message() method for that particular type of
error.

How can we expand this? Maybe the ParseErrorKind::UnexpectedToken
variant wants to contain data that indicates which token it got that
was wrong, so it would be UnexpectedToken(String) or something
similar.
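A sketch of such an enum, with the offending token carried inside the variant. The lang parameter and the two hard-coded languages are assumptions made to keep the example self-contained; a real program would more likely go through gettext or a similar translation mechanism.

```rust
#[derive(Debug, Clone, PartialEq)]
enum ParseErrorKind {
    /// Carries the token that the parser did not expect.
    UnexpectedToken(String),
    EndOfFile,
}

impl ParseErrorKind {
    /// Builds a message for the user from the machine-readable kind.
    /// `lang` is a hypothetical language tag; anything other than "es"
    /// falls back to English.
    fn gimme_localized_error_message(&self, lang: &str) -> String {
        match (self, lang) {
            (ParseErrorKind::UnexpectedToken(tok), "es") => format!("token inesperado: '{}'", tok),
            (ParseErrorKind::UnexpectedToken(tok), _) => format!("unexpected token: '{}'", tok),
            (ParseErrorKind::EndOfFile, "es") => "fin de archivo inesperado".to_string(),
            (ParseErrorKind::EndOfFile, _) => "unexpected end of file".to_string(),
        }
    }
}
```

Because the reason is an enum variant and not an English string, the calling code can both branch on it and render it in whatever language it needs.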

But is that useful to the calling code? For our example program,
which is reading a configuration file... it probably only needs to
know if it could parse the file, but maybe it doesn't really need any
additional details on the reason for the parse error, other than
having something useful to present to the user. Whether it is
appropriate to burden the user with the actual details... does the app
expect to make it the user's job to fix broken configuration files?
Yes for a web server, where the user is a sysadmin; probably not for a
random end-user graphical app, where people shouldn't need to write
configuration files by hand in the first place (should those have a
"Details" section in the error message window? I don't know!).

Maybe the low-level parsing/validation code can emit those detailed
errors. But how can we propagate them to something more useful to the
upper layers of the code?

Translation and propagation

Maybe our original read_configuration() function can translate the
low-level errors into high-level ones:
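A minimal sketch of such a translation, assuming a toy file format with a single "timeout = <seconds>" line. The SyntaxError type, the parse_timeout_line and configuration_from_text helpers, and the timeout_seconds field are all hypothetical. Each low-level error (an I/O error, a syntax error, std's ParseIntError) is mapped to one of the coarse ConfigError variants with map_err.

```rust
use std::fs;
use std::path::Path;

#[derive(Debug, PartialEq)]
enum ConfigError {
    ConfigFileDoesntExist,
    ParseError,
    ValueError,
}

#[derive(Debug, PartialEq)]
struct ConfigValues {
    timeout_seconds: u32, // hypothetical setting
}

/// Low-level error from the hypothetical line parser below.
#[derive(Debug)]
struct SyntaxError {
    line: usize,
}

/// Hypothetical low-level parser: looks at the first non-empty line,
/// expects `timeout = <text>`, and returns the raw text of the value.
fn parse_timeout_line(text: &str) -> Result<&str, SyntaxError> {
    let (index, line) = text
        .lines()
        .enumerate()
        .find(|(_, l)| !l.trim().is_empty())
        .ok_or(SyntaxError { line: 1 })?;
    match line.split_once('=') {
        Some((key, value)) if key.trim() == "timeout" => Ok(value.trim()),
        _ => Err(SyntaxError { line: index + 1 }),
    }
}

fn configuration_from_text(text: &str) -> Result<ConfigValues, ConfigError> {
    // A syntax problem becomes the coarse ParseError; the line number is dropped.
    let raw = parse_timeout_line(text).map_err(|_syntax_error| ConfigError::ParseError)?;
    // A value that is not a valid u32 (a ParseIntError) becomes ValueError.
    let timeout_seconds = raw.parse::<u32>().map_err(|_| ConfigError::ValueError)?;
    Ok(ConfigValues { timeout_seconds })
}

fn read_configuration(filename: &Path) -> Result<ConfigValues, ConfigError> {
    // Simplification for this sketch: every I/O failure is reported as a missing file.
    let text = fs::read_to_string(filename).map_err(|_| ConfigError::ConfigFileDoesntExist)?;
    configuration_from_text(&text)
}
```

Here the translation deliberately throws detail away. A layer that wants to keep it would store the low-level error inside the high-level variant instead.
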

Etcetera. It is up to each part of the code to decide what to do with
lower-level errors. Can it recover from them? Should it fail the
whole operation and return a higher-level error? Should it warn the
user right there?

Language facilities

C makes it really easy to ignore errors, and pretty hard to present
detailed errors like the above. One could mimic what Rust is actually
doing with a collection of union and struct and enum, but this
gets very awkward very fast.

Rust provides these facilities at the language level, and the idioms
around Result and error handling are very nice to use. There are
even crates like failure that go a long way towards
automating error translation, propagation, and conversion to strings
for presenting to users.
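To give a feel for those idioms, here is a sketch that uses only the standard library, with no helper crate. The parse_port function and this particular ConfigError are made up for the example. Implementing From lets the ? operator do the translation automatically, and Display provides the conversion to a string.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

#[derive(Debug, PartialEq)]
enum ConfigError {
    /// Keeps the low-level error so that no detail is lost.
    ValueError(ParseIntError),
}

// With this impl in place, `?` converts a ParseIntError into a ConfigError.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::ValueError(e)
    }
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::ValueError(e) => write!(f, "invalid value: {}", e),
        }
    }
}

impl Error for ConfigError {
    // Exposes the underlying low-level error to callers that want it.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ConfigError::ValueError(e) => Some(e),
        }
    }
}

/// Hypothetical validation step: parses a port number from its text form.
fn parse_port(raw: &str) -> Result<u16, ConfigError> {
    let port = raw.trim().parse::<u16>()?; // ParseIntError -> ConfigError via From
    Ok(port)
}
```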

Infinite details

I've been recommending The Error Model to anyone who
comes into a discussion of error handling in programming languages.
It's a long, detailed, but very enlightening read on recoverable
vs. unrecoverable errors, simple error codes vs. exceptions
vs. monadic results, the performance/reliability/ease of use of each
model... Definitely worth a read.