Sunday, July 8, 2012

errors-1.0: Simplified error handling

Update

I just released a quick patch changing a dependency from the EitherT package to the Either package. What this means is that if you already installed version 1.0 and upgrade to 1.1, you will have two packages exporting conflicting modules for Control.Monad.Trans.Either. The fix is very simple. Just type:

ghc-pkg hide EitherT

Which will hide the EitherT package so it does not conflict.

I tried to make this fix as rapidly as possible to mitigate the damage. I apologize, but somebody notified me that either is a higher quality dependency for the EitherT type.

Introduction

This post marks the release of errors-1.0, a package designed to fill all the gaps in type-safe error handling. I target this package specifically towards Haskell programmers that prefer to use Maybe and Either for type-safe exception handling, yet get frustrated by thousands of paper-cuts every time some convenient function is missing.

Now, all those pain points are gone, and instead of an import list that looks something like this:

import Control.Monad.Trans.Either
import Data.Maybe
import Safe
...

You now can just import:

import Control.Error

Utility Functions

Control.Error re-exports Control.Error.Util, which provides some very useful utility functions:

I'm pretty sure every experienced Haskell programmer has desired either the hush or note functions at some point and ended up either in-lining it into their code or defining it in some sort of Util module of their project.

It's quite frustrating, actually. We frequently encounter code that uses a mixture of Maybe or Either and want to combine them both within the same monad, but when we search using Hoogle or Hayoo for a function like hush, we get just one result:

Precis.Utils.ControlOperators.suppress :: Either e a -> Maybe a

This exemplifies one such utility function that the precis package wrote to scratch its own itch. Who would want to make the precis package a dependency for such a trivial function, especially when the precis package has nothing to do with error-handling?

I also couldn't even find an equivalent to the note function, although perhaps that's just because my Hoogle-fu isn't strong enough. However, I find it surprising that such a useful conversion function does not reside in some standard location on Hackage. Well, now it officially resides in the errors package.

Teaching

Also, most Haskell aficionados know that liftMaybe and liftEither are easy to write:

liftMaybe = MaybeT . return
liftEither = EitherT . return

However, many beginners to Haskell don't know how to lift a Maybe to a MaybeT and would benefit from the above function. Imagine a beginner Hoogling for:

(Monad m) => Maybe a -> MaybeT m a

... and instead of getting a bunch of almost-correct matches, they get an exact match and they can click the link to the liftMaybe function, consult the source code and have an "AHA!" moment where they learn a bit more about how monad transformers work. I'll wager there are many Haskell beginners right now that give up on using Maybes within MaybeT because they don't even know it's possible to do so. Well, now they do.

Social Confirmation

I designed the errors library to encourage type-safe error-handling style in Haskell, showing beginners that they can use the type system to handle errors painlessly, without using out-of-band language features like Control.Exception.

While I have nothing against Control.Exception, I'm worried that the proliferation of libraries revolving around it along with the absence of high-quality and instructive Either/EitherT libraries might lead beginners to believe that Haskell does not have a type-safe, simple, and easy way to handle errors. They might even lose interest in the language because they mistakenly feel it does not live up to their expectations of elegance and simplicity. After all, how can functional programming be so great if it can't do something as simple as error handling elegantly?

A recent post on reddit emphasizes that we shouldn't take for granted that beginners even know how to integrate high quality Maybe/Either code in complicated applications. All the expert knowledge of how to seamlessly inter-convert the various error-handling styles types gets locked away in people's pet utility libraries. This package solves that problem by providing easily accessible source code that beginner's can use and learn from.

More importantly, I designed the library to encourage beginners to use the type-safe error-handling style by providing a sort of social confirmation that is lacking on Hackage. The lack of a type-safe error handling ecosystem acts like a social cue to beginners that perhaps they are going down the wrong path and should reconsider. After all, if Either is supposedly the right way to handle errors, why does it seem like nobody cares about it enough to create a proper package covering common use cases? Well, now they know at least one person cares.

Scripting

Control.Error also exports Control.Error.Script, which covers the common use case of simple scripting:

type Script a = EitherT String IO a

The entire module provides a way to convert between Control.Exception and EitherT, so that people who prefer to not use Control.Exception now have a standard way to interface it with EitherT-style code. The most important function in the module is:

{- NOTE: This has been renamed to "scriptIO" and "tryIO" now resides in
Control.Error.Util and only catches IO exceptions -}
tryIO :: IO a -> Script a

... which is like lift except that it also catches exceptions and translates them into Lefts. It also provides convenience functions to bind Maybes and Eithers within the monad:

These seem a little bit cumbersome to use at first, since you'd have to take all your favorite partial functions and convert them to Maybe or Either first and then pass them to tryMaybe/tryEither ... or do you?

Fortunately for you, this library has "batteries included", and Control.Error also exports the Control.Error.Safe module which wraps all your favorite partial functions both in the Either and EitherT monad.

The Either variants close a gaping hole in the safe library, where the only way you could attach a descriptive error message was using exceptions (again, this is the kind of social cue I'm talking about). Also, the EitherT variants are incredibly useful within the Script monad, where you can now seamlessly bind all your favorite error-handling functions of all types in the same monad:

import Control.Error
import System.Environment
main = runScript $ do
as do
n throwT "Usage: MYPROGRAM FILE LINENUM"
str
Notice the pattern? Everything that can be bound in the Script monad begins with the prefix try. A future release of the library will also provide versions with the default error messages (i.e. "Prelude.read: no parse"), but I just haven't decided what to name those versions, yet.

Also, notice that you don't have to import Control.Monad.Trans since for the special case of the Script monad the tryIO function serves the role of lift. The compiler error for the lack of Control.Monad.Trans is a nice reminder that you used lift instead of tryIO by mistake.

Why not ErrorT?

I'm a huge fan of the transformers package, with one major exception: ErrorT.

The first reason is that ErrorT comes with the additional baggage of the Error class, which really has no place in a proper EitherT implementation. I don't know how many times I've tried using that type, got hit by the Error class constraint, and threw up my hands in disgust and went back to EitherT (provided by the appropriately-named EitherT package and re-exported by Control.Error).

However, there is a deeper reason that I'm basing my library on EitherT and not ErrorT, which is the implementation of catchError and throwError.

For the longest time, I never appreciated that throw and catch in any Either-like block are actually return and (>>=) in the dual Either monad (the one with the type variables swapped). I missed this because the types of catchError and throwError I would see in every library were never made fully polymorphic:

So now we have the ability to not only throw and catch exceptional values, but to even change the type of the exceptional value. I like to call this the "success" monad (after ehird's term for it, since he was the one who pointed out this awesome symmetry to me). In this monad, each statement is an exception handler and the monad terminates when you "throw" a successful result:

runEitherRT $ do
e2
If any handler succeeds (by returning a Right result), the entire monad terminates with the successful result.

Laws for throwE and catchE

In fact, since throwE and catchE are just return and (>>=) in disguise, we can use the monad laws to state the behavior that throwE and catchE are expected to satisfy:

If you spend a moment to think about all of those laws they all make intuitive sense. They each say, respectively:

If you throw a value, the catch block processes it

If your catch block just rethrows the error, it's the same as not catching in the first place

Catch blocks are associative

These laws exactly match our intuition for how throwE and catchE should behave!

If all of this excites you, then you'll love Data.EitherR, which is one of the modules exported by Control.Error. It provides all the machinery necessary for working in the EitherR monad and also provides the convenience functions for the generalized throw and catch:

These are just newtype wrappers around the return and (>>=) in the success monad. If all you want is throwE and catchE, then you never need to actually use EitherR directly and the above functions are sufficient.

Also, technically you could just use Either itself to implement the "success" monad by just reversing the convention for the type variables and use throwE and catchE to implement the error monad. However, the main reason you might not want to do that is to not confuse other people who are familiar with the traditional convention for Either. People will understand your code better if you stick to Either for code that terminates on errors and EitherR for code that terminates on successes.

Additionally, if you want to actually switch between both monads in your code and use ordinary do notation for each one, then the EitherR and EitherRT newtypes will be absolutely essential to convert between both monads.

Simplicity

There's one last thing that's nice about Control.Error: it re-exports Data.Either and Data.Maybe. It's just one nice little feature that helps trim down your import list.

I always enjoy coming up with compelling and interesting Haskell code examples in as few lines as possible, and keeping the import list clean is just one of those "nice" things when showing people from other languages how clean and simple Haskell can be. My rule of thumb is that if you really want to impress other people with Haskell, then show them how much power you can fit clearly and expressively into a single 80x24 terminal window, including all imports and extensions.

Conclusions

Hopefully this library will help shrink up a lot of people's utility modules and encourage the Either/EitherT style of error-handling. Let me know if there is a feature that you think is missing from this library, because the goal of this library is to make error handling as simple and painless as possible.

17 comments:

In the Script monad I think you really shouldn't catch AsyncExceptions.

Example code in a server:

servingThread = forever $ serve

And the serve function is written with your Script monad. When the whole server wants to quit, the mainThread will obviously call killThread servingThreadId. Although, this async exception will be caught by the Script monad and nicely handled as if it was something related to the request (unable to parse, wrong values, etc.).

More on this topic:http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Exception.html#g:4

Simon Marlow's tutorial also contains info about handling async exceptions in the right way: community.haskell.org/~simonmar/par-tutorial.pdf

This is a good point. Technically, the Script monad was intended to be a quick solution for very simple programs, which is why it specifies a concrete monad transformer stack and uses Strings to hold exceptional values. However, that seems like a common enough use case to warrant including in its behavior. E-mail me at my gmail address with username Gabriel439 and, if you can, show me the modified runScript function you have in mind.

I've just started using this package, and am wondering if there's a fundamental reason why there's no `MonadState` instance for `EitherT`. Is it safe to just write one myself, or is this left out because it has some semantic problems?

I'm not the author of the `either` library, but my guess is that this is for one of two reasons:

a) `either` is written in the style of `transformers` where you manually lift commands

b) `either` is intended to be Haskell98, and writing `mtl` instances requires extensions like `UndecidableInstances`

I generally prefer the `transformers` style over the `mtl` style. The `mtl` style does have some semantic problems, namely that the behavior changes depending on the order in which the monad transformers are applied.

There is actually a principled way to do what the `mtl` does that lets you still type class monad transformer operations while still preserving the ability to reason about your code, and I will blog about that later.

I just wasted hours trying to switch from ErrorT to EitherT to clean up a 3-line spurious Error instance declaration, only to realize I'd have to paste and adapt tons of difficult code I barely understand to get the same functionality. There should really be a big warning somewhere in the EitherT documentation, because the resulting type error is not helpful for a beginner.

I am actually having trouble understanding the purpose of a transformer that you can't stack...

Are you using the `mtl` or `transformers` (in other words, did you import `Control.Monad.Error` or `Control.Monad.Trans.Error`)? If you are using the `mtl`, just don't substitute the `throwError` function: it is type-classed to work with any monad transformer stack that has `ErrorT` or `EitherT` in it (it works with both). If you are using `transformers`, then just add two `lift`s before it like this:

lift $ lift $ left ...

However, judging from your type error that you pasted it looks like you made a completely different mistake during the refactoring. The type error says that you are trying to lift a list as a monadic action where it expects an `EitherT` action. That's the only part I don't understand. I'd have to see the whole source code for your function to say more.

It looks like I was using mtl for all my stuff (including ErrorT), but your Control.Errors package imports transformers. Does mixing the two create problems? It seems that there is no EitherT in mtl...

The list is used as a regular return type here, not as a nondeterministic monad. I get the right interpretation in ErrorT

Both `transformers` and `mtl` are interchangeable and compatible with each other. `mtl` itself depends on `transformers` and gets its monad transformers from there. `mtl` just adds on the type classes like `MonadError`. So mixing the two packages is completely safe.

If you are doing that, then make sure you import `Control.Monad.Error.Class` which provides the type-classed `throwError` function that you don't have to `lift`. Alternatively, just add two `lift`s in front of your `left` and it should also work.

I usually advise beginners to stick to the `transformers` API until they feel comfortable with it. It also leads to better type errors and type inference. It is a little more verbose, but the reward is easier to maintain code.

Also, if you have a large function that has lots of deeply nested lifts, you can save a lot of time by declaring these at the top of your function:

liftState = id liftReader = lift liftEither = lift . lift ...

Then if you later change your monad transformer stack for that function, all you have to do is change those top-level lift definitions and everything still works.

Another trick is to define your own custom monad transformer newtype that auto-lifts the relevant operations for you. That way you define all the machinery in one place and if you need to modify your monad transformer stack you just change the code in one place and every other use is automatically fixed. However, this is only worth doing if you use the exact same monad transformer stack all over your code base. Think of it as defining your own custom DSL. Haskell lets you pick and choose which language features you build into your DSL (i.e. errors, state, etc.).

by 'transformers API' you mean avoiding the use of MonadTrans and do the lifting manually?

How do you avoid using the "auto-lifting" machinery? By not using functions from the Monad* typeclasses? (In our example, using left instead of throwError). Because even the trasformers from the transformer library have MonadTrans instance declarations, so you'll get autolifted as soon as you use any of the other Monad* typeclasses...

(At this point I'm just curious. I'm way too lazy to lift everything!)

By `transformers` API I mean `lift`ing everything manually. The way you avoid using the auto-lifting machinery is only import modules from the `transformers` library. That means that instead of `Control.Monad.Error` you would import `Control.Monad.Trans.Error` and instead of `Control.Monad.Trans` you would import `Control.Monad.Trans.Class`.