Iteratees Step By Step (Part 1)

As you may be aware, iteratee-based I/O is trendy right now. While I normally avoid trendy things, there are enough smart people talking about how iteratees solve real problems that, just this once, it’s probably worth taking the time to figure them out. What I’ve found, though, is that a lot of the existing resources on these things try to present them in their fully formed state, as a collection of a bunch of weird-looking types and examples, which little introduction to why those choices were made. That’s certainly not how I come to understand things!

This is my attempt to fill in the gap. I present to you a step by step process by which one might have started from a general goal, and arrived at the design of iteratees. For this part, we’re ignoring enumerators, or enumeratees, or general iteratees, and instead just deriving a design for pure iteratees. Believe me, it’s hard enough just to get that far! I do not claim that this is the actual thought process that anyone went through, nor that I even understand any number of subtle issues that may arise. However, that said, I do think that the steps below are plausible, and each of them for the most part motivated by flaws or limitations in the previous stage, such that it’s reasonable to imagine following this thought process.

Warning: This isn’t written by someone who is an expert on these concepts. It may be that there are ideas or important concerns missing here. If so, I hope others may point them out in the comments.

Step 0: Taking Stock of the Situation

What everyone knows about iteratees is that they address the problem of doing incremental, composable I/O without being lazy.

It’s worth taking a moment to familiarize ourselves with the status quo, the thing our iteratees are meant to replace. First of all, they replace explicit use of file handles. Explicit use of file handles is generally not composable, involves manipulation things with global state… in short, not very functional! One of our goals will be to take care of this in one place; a place that doesn’t have to know anything about the application, and therefore can be written once and for all, and reused across applications.

By contrast, lazy I/O is very attractively nice to use. You interact with the file handle only once, and then are able to write purely functional code on lists. Unfortunately, it’s well-known that doing unbounded amounts of lazy I/O in long-running processes just doesn’t work very well. File handles may be left open indefinitely, and you have no way to do anything about it. It’s simply not possible in a long-running application, like a network server of some type, to give up that kind of control completely to non-deterministic subsystem like the garbage collector. We’ll be approaching the problem from here on as if we’re writing a network server where this matters a lot. So, in contrast with lazy I/O, we’d like to make sure we know exactly when we’re done with the file handle, and can pass it off, look for another request in it, etc.

We’re now left with the question of what to do to meet these goals.

Step 1: The Starting Intuition

First, to avoid the problem of lazy I/O, we clearly need some deterministic piece of code running somewhere that opens and closes file handles at well-understood times. Second, though, to avoid the problems of explicit use of file handles in the application code, that piece of code should be separate from the application. So we need some piece of code whose job it is to manage the file handles, read the data, and…. do what? Well, pass it off to some application, which it knows nothing about, and such application in turn knowing nothing for sure about where the data came from! In other words, instead of the application being in control, and calling the I/O layer as a service, we instead have some piece of core server code in control, doing the I/O and then calling the application as a service to handle the result.

As a first approximation, we expect the server code to look something like this:

Of course, this is impractical and oversimplified, but it’s a starting point! Then it’s natural to ask what the application does. Well, it takes a value, and… well… does something!

type Application = Value -> IO ()

This might seem a little unusual if you’re starting out thinking of a console application, which is run first, and then decides to do some I/O. It’s not really all that weird, though, from a server standpoint. Practically every web application framework in every programming language under the sun does something like this: reads the incoming request and passes it off to the application. That’s promising; at least others have settled on the same kind of answer for the high level problem. We’re not off into anything very that unusual yet…

Here’s where we start to get unusual:

type Value = Char
readSomething = hGetChar

Remember all those web application frameworks that we were happy to have as company? We’re parting company now. They do a lot of the work of reading and handling fairly high-level web-related concerns long before they get into any kind of a event-based mode of doing things. The high-level interface between the application server and the application might be a nice callback-based one where the server handles the I/O and then calls the application… but dig down a few inches, and the server implements most of the protocol in good old imperative code. Why? They lack composability.

We’re trying to perform this inversion — the transition from a standard imperative interface to an event-dispatch sort of interface — at a much lower level, by providing data to the “application” a character at a time, so that even something like input parsing is done from within the event-based framework! That’s where this gets a bit radical. But it makes, sense, too. If we truly had a composable way of building applications in this sort of framework, then it should be possible to use it to implement parsing, right?

Well, not yet, but that’s our goal. To illustrate where we are, I’ll write a simple application: it takes a character you type, and repeats it in upper case.

upperEcho :: Application
upperEcho c = putChar (toUpper c)

Okay, pretty basic, but it’s working so far! If it’s not obvious that this code works, you can build and run it in GHCi. You’ll just need to add some imports (Data.Char, System.IO), and run it with stdin as the handle. Server only handles a single client request. That’s okay: in practice, we probably want to handle one request from a client network connection and close it anyway. To get a long-running server here, we can always wrap it with Control.Monad.forever to handle plenty of sequential incoming requests.

Step 2: More Input

There’s one fairly obvious disadvantage to the arrangement we had above. (Okay, there are many, but we have to start somewhere!) It’s great for applications that need to do something with a character, but it’s rather awkward to handle applications that might need to wait for a whole word, or even (gasp!) a sentence before responding. Of course, how characters are grouped into words or sentences is application-specific, so we need to be a bit abstract here. We’d like the application to be able to consume some number of characters, and after each one, decide whether to do something now, or hold on and wait until later.

In fact, we already have an intuition for how this ought to work, simply by remembering how to curry functions of multiple parameters. We need the application to be able to consume the input, and morph itself into a new application that’s only waiting for the tail of the input. Time to modify the application type. This time we need a newtype to avoid a cyclic synonym.

This is the curried version of our application. I can now use it to write applications that expect more than a single character of input. Let’s try using it to build an application that simply reads a word and prints it in reversed order.

That’s not too bad. A far cry from the ease of using lazy I/O, where we’d just use the standard library’s words, then map reverse over the result. But, at least we’re able to keep track of all the incremental state using closures, and it’s sort of fun to build this and play around with it in GHCi using stdin as the stream. (You’ll need the imports for Data.Char and System.IO that I left out again.) Unfortunately, it now never stops… but, we’ll handle that in a bit.

Step 3: Can We Be More Functional?

At this point, we may start to get concerned that the IO monad is playing too strong a role in the code. In particular, we’ve interweaved IO and input processing in a way that really wasn’t done even with lazy I/O. There, one could have just applied the built-in words function to the lazy input stream, and then only worried about IO once that was done. Here, even the breaking up of characters into words is being handled by code running in the IO monad. Let’s see if we can separate the incremental input from the final IO that happens at the end.

This is going to require dividing the Application type into multiple constructors, one indicating the need to (in a purely functional way) partially apply some input, and the other for doing the resulting I/O once enough input has been read. This also gives us termination back, since we have to define when we’re done.

That’s a bit clearer, and the I/O appears only in one place: inside the Enough constructor. I do want to point out one thing, though: we’ve lost a bit versus the previous implementation. In particular, the previous implementation could have evolved its behavior over time, since we built a new application every time. We could have preserved that, by having the Enough constructor provide a new Application and have it not terminate (or only optionally terminate) again. However, most network servers don’t actually want to change their behavior over the long term, so I’m just sticking with this one.

Step 4: Handling End of File

One rather glaring omission of everything we’ve done so far is that we assume that the input stream is infinite. In the real world, input streams end. This is easy to handle: we can expand the vocabulary of inputs from Char to something that includes an EOF indicator. We basically want Maybe, but I’ll define a new type for the ability to use better names.

data Input = EOF | More Char

(A quick note: Oleg’s Iteratee code calls this Stream. I refuse to do so, because it’s not a stream; it’s just one piece of what will eventually be a stream. But there you go, that’s the name used elsewhere.) Then server needs to detect end of file and do the right thing with it. Our server provides an infinite stream of EOFs when the end of file is reached. We hope, though, that the application just finishes with Enough as soon as one of them pops up.

That’s it. I pulled out the I/O stuff into finish, since we’d like to use it from a couple places. Aside from that, it’s completely straight-forward.

Step 5: Chunking the Input

Reading from input streams a character at a time is inefficient. We’d like to modify the code so that we can read a bunch at a time. It would also be very nice to use the ByteString type rather than String, when we choose to do so. All of this presents a small issue, though. Often, chunking in an input stream is rather arbitrary, depending on buffer sizes of various software layers in the local server, remote server, or even network routers in between! It’s entirely possible that the application might be given too much data. So we need to add another quirk to the Enough constructor for the application: in addition to the final action to perform, it should tell us the (possibly empty) left-over data that it didn’t need.

We also want to be able to compare the leftovers with an empty value, so a derived Eq instance is added.

The types are now parameterized over a “chunk” type i, which might be, say, String or ByteString, etc. Instead of the server using hGetChar, we want to be able to read a chunk at a time. Time to pick a representation, since the I/O code can’t very easily be type-agnostic. We use strict ByteStrings, which I’m assuming are imported qualified with the name B.

The interface change for server is worth noting. Running the application might result in a leftover bit of input, so the return type has changed from IO () to IO B.ByteString. We might also want to then run a second application against the remaining stream, so an Input B.ByteString has been added as a parameter as well. This lets us put together multiple application pieces later on.

Finally, we’d like to rewrite the application to make use of this new chunking feature. Fortunately, the ByteString API contains functions analogous to a lot of the list processing API from the prelude.

There it is. The longest yet, but we’ve come a long way in robustness and performance. We now handle incrementally consuming input, EOFs in the stream, and chunked I/O for better speed.

Step 6: Generalizing the Interface

The applications we’ve been building have turned an input stream into an I/O action. But did we really need to turn it into an I/O action? What if all we really wanted was to extract out reversed words for use in some other part of the application? Then in addition to parameterizing our application based on the input type, we also want to change what it might give back. This might also be a good time to change a word. We aren’t really talking about complete applications any more, but rather about ways of getting “things” from a stream of input. We introduce the word Iteratee to represent these mini-application pieces.

The implementation looks almost identical; indeed, there’s only one change. Instead of performing the output as an I/O action and then returning the remaining part of the stream, we instead just get the result value and packing it up with the tail in a tuple. In the spirit of this change, we’ll change wordReverser to return its result as a String rather than printing it out to the console.

Much more general, and with very few changes! Note that now, wordReverser is a component that isn’t tied to where its underlying data comes from, nor what we do with its result. We might begin to see how we might compose this with some other pieces to build a larger application…

Step 7: Combining Iteratees

In this step, we aren’t going to modify any of the preceding code. Instead, we’ll just add a Monad instance for Iteratee. This will give us powerful tools for combining iteratees together, and building up more complex ones from smaller, simpler ones. All we need are the two monad operations, return and bind.

Return is an easy one. If I have a plain value, then the corresponding iteratee consumes no input, produces that value as its result, and the leftover string is empty. I’ll rely on the Monoid type class for a suitable definition of “empty.”

Hmm. First of all, if the first iteratee is Enough, then we can just extract its value, and apply the function to get an Iteratee i b. But what about the leftover piece from the first Iteratee? Well, we’ve already dealt with this once, in the implementation of server from the previous two steps! Back then, we decided what to do: if the second iteratee is also done, then we append the leftovers to its own. If not, then we apply the first level of partial evaluation immediately.

Okay, that’s a bit longer, but the important bit here is that it was built up logically from simpler pieces, all of which can be easily reused.

Step 8: Iteratee Error Handling

So far, so good. It certainly seems much easier to deal with our iteratees now that they can be built compositionally using monads. It was a little awkward, though, to handle the Nothing cases in all of the code above. It would be nice if, instead of having to handle error conditions everywhere they occur, unexpected errors could be propogated for us to the end. In other words, we’d like our iteratees to have an error state, which, once it’s entered, simply remains there so that the error is reported at the end.

An obvious design would be to have constructors Enough, Partial, and Error for the Iteratee type. However, to make a little jump here just to stay with Oleg’s design even when I don’t think it’s so obvious, we’d like to have errors be recoverable. That is, we note that the error has occurred, and the user can choose to fix the error, and then provide new input that succeeds. To do so, an error will be considered to be a Partial, but with an error message.

The second field of Partial is the current error message. If it’s Nothing, then there is no error. If it’s something, then there is an error, which is recoverable by providing an appropriately fixed input chunk. Of course, our generic server can’t really do anything with errors except give up and fail. So we just add a case for it to do so.

The server gives up early, true, but we do want to do the right thing with errors during the plumbing with the Monad instance, so that another, smarter server may still do something more interesting. We do so here. I also inline the named functions from the previous section, for brevity (the inlining related changes are not in bold, breaking my convention, but so that the actually relevant changes are more obvious).

Given this new error handling framework for iteratees, one would hope that we could simplify the code for some of the previous code. Alas, in general, this is not possible, because we didn’t want to treat end of stream as an error; just as a termination condition. If we were willing to treat it as an error, though, we could write some shorter code.

This is the same as the previous code except that the end of file is NOT accepted as a valid terminator for the word, which instead must be followed by a space character. Errors are fully recoverable (but not recovered in the server code above), and propogated automatically through the monad.

Step 9: Underlying Stream Error Handling

This is a minor detail to handle. Currently, we have no idea what to do with errors in the underlying stream. For example, if reading from a file on disk fails not due to end of file, but rather because the disk is corrupt, there’s no way to catch and handle that scenario. In other words, entirely separate from there being processing errors at the iteratee level, there may also be physical errors on the stream. These should mostly be handled like EOF (there’s little point in “trying again” to read from a failed stream.) But it would be nice to keep track of the fact that an actual error occured rather than an EOF.

Then we just need to propogate that around. The Monoid instance just keeps track of errors. The only non-obvious bit is that we sometimes append an empty value to the left, and when we do so, we ought to preserve the error rather than throwing it away.

Those are all of the necessary changes to propogate underlying stream errors around the system.

Step 10: Take a Break!

We’ve done a lot, so this is is where I’ll stop for now. This gets you all the way through pure iteratees, which is the first of a number of topics in the iteratee-based I/O world. I hope to write another installment where we do a bit more, working with some of the other concepts that arise. For example, our server function is a bit of an ad hoc thing that just sort of organically appeared, so we can poke at it to see where it goes (enumerators and enumeratees), and we might be interested in doing incremental effectful stuff as we do the processing of the input, so we would then need to generalize the iteratees to effectful ones. That’s all a task for a later day, though.

We’ve come rather a long way, though. The composability of the code above is the most important thing. Once can easily see a path from something like itChar to something like this: