Sunday, December 30, 2012

The continuation monad is one of the least appreciated monads and in this post I hope to motivate when to use it. This post will first motivate continuations in general and then motivate them in their specific capacity as monads.

Continuations

A Haskell continuation has the following type:

newtype Cont r a = Cont { runCont :: (a -> r) -> r }

A continuation takes a function of type (a -> r) and generates an r, where r can sometimes be a fixed value like Int or IO ().

For example, I might write a long-running process that spawns an action every time the user enters a line of input:

onInput :: (String -> IO ()) -> IO ()
-- i.e. Cont (IO ()) String
onInput f = forever $ do
str
You will recognize this idiom if you've ever used frameworks with callbacks. We supply the framework with a function (i.e. a continuation) and the framework uses that function to do its job.

"Complete me Later"

You generally use continuations when you are programming something, but you want somebody else to complete it. Common reasons include:

You are programming a framework with callbacks that users supply

You are defining a custom map engine for game players to program

You are lazy

I'll use the following hypothetical code segment as an example:

unitAttack :: Target -> IO ()
unitAttack target = do
swingAxeBack 60
valid
Let's imagine you have to package up and compile this code for somebody else (say, a fellow colleague) to use later, but it won't compile yet because you still have the unspecified ??? function. What do you do?

Like all good programming, the best solution is the laziest one. We punt and take the incomplete behavior as a parameter so that whoever finishes the function later on can complete the function by passing the specified behavior in:

unitAttack :: Target -> (Target -> IO ()) -> IO ()
unitAttack target todo = do
swingAxeBack 60
valid
Problem solved! Notice how the right hand side of the type signature resembles the shape of our Cont type. If we just add a newtype, we can wrap it in Cont ourselves:

unitAttack :: Target -> Cont (IO ()) Target
unitAttack target = Cont $ \todo -> do
swingAxeBack 60
valid
... or, even better, we can use ContT instead. The benefit of ContT is that it is also a monad transformer, which comes in handy. ContT has the exact same Monad instance as Cont, so they are otherwise interchangeable:

Fortunately, there is a clean and general solution. Just define a data type that wraps both possible arguments in a sum type, and just define a single continuation that accepts this sum type:

data Hole = Swing Int | Attack Target
unitAttack :: Target -> ContT () IO Hole
unitAttack target = ContT $ \k -> do
k (Swing 60)
valid
Each constructor acts as a place-holder that signals to the continuation which hole it is currently filling. Then somebody else can continue where we left off and just write:

This trick generalizes to n holes with variable arguments per hole. Just define a type with n constructors, one for each hole, where each constructor stores whatever arguments that particular continuation will need:

data Hole = Hole1 Arg1 Arg2 | Hole2 | Hole3 Arg3 | Hole4

Algebraic Data Types

I want to digress for a moment to talk about algebraic data types. If you are not interested, skip to the next section.

It turns out we can elegantly derive the above trick for multiple holes. Type algebra says that if we squint then we can translate the following type constructors to algebraic operators and derive equivalent types from simple algebraic manipulations:

Either a b <=> a + b
(a, b) <=> a * b
a -> b <=> b ^ a

That means that if we have a function with two continuations:

(a1 -> r) -> ((a2 -> r) -> r)

... we just translate it to the equivalent algebraic expression:

(r ^ (r ^ a2)) ^ (r ^ a1)

... and then we can derive equivalent representations just by using the rules of algebra:

This is a Kleisli arrow! That means we can compose it with our previous Kleisli arrow:

unitAttack >=> halfAssedCompletion :: Target -> ContT () IO Int

This composition substitutes in halfAssedCompletion for each hole we left in the unitAttack function. However, halfAssedCompletion left smaller Int holes of its own that somebody else now has to finish up.

Notice how now we originally needed a handler of type:

handler :: Target -> IO ()

... but now we only need a smaller handler of type:

newHandler :: Int -> IO ()

... in other words, halfAssedCompletion acts as an intermediary that transforms handlers of type (Int -> IO ()) into handlers of type (Target -> IO ()).

The Cont monad is all about chaining these kinds of partial completions together until all the holes are finally filled. You could use this abstraction to complete a project in stages and seamlessly hand off work from person to person whenever circumstances require a change in maintainer before completing the project. Alternative, you can use this to condense the callback API of a framework into a single point of entry.

The Kleisli Category

Earlier I said that the key to a monad is its Kleisli arrows. The reason why is that Kleisli arrows are morphisms in the Kleisli category, where (>=>) is Kleisli arrow composition:

Things that obey these laws have nice properties. For example, it guarantees that you can reason about each Kleisli arrow in a composition chain in isolation. Each Kleisli arrow's behavior is completely determined by its input (i.e. domain) and output (i.e. codomain). So let's think about how that modularity translates to the Cont Kleisli category.

When you switch maintainers, you don't have to give the next maintainer a bunch of holes sprawled out over a large code base like this:

largeProgram = do
...
x
Instead you can unify all the holes using a single callback that accepts a single type (the "codomain") unifying all the holes you left:

largeProgram :: () -> ContT () IO Hole
largeProgram () = ContT $ \k -> do
...
x
This give the next person a single point of entry to continue from, because now they only have to write a Kleisli arrow that handles a single Hole input which encompasses all the previous holes:

Then you just use Kleisli composition to connect your code contribution:

largeProgram >=> nextContribution

This cleanly modularizes the first person's contribution so that you can hermetically seal it off from subsequent contributions. By repeating this process, each subsequent contribution to the code base becomes its own modular and composable Kleisli arrow, cleanly separated from other contributions:

This is why frameworks and game custom map makers all use continuations to delimit the interface between the company's code and the user's code. The continuation monad is all about establishing air-tight code boundaries, both internally within a project, and externally for user facing APIs. This lets you isolate responsibilities if you can separate each code contribution into its own Kleisli arrow.

Callback Hell

Frameworks are the canonical example of separating responsibilities, where the framework writer provides some code, but the user is expected to fill in the gap with callbacks of their own. This often results in callback hell in frameworks that take this principle to the extreme, like Node.js.

But it doesn't have to be that way. The continuation monad teaches us that we can always condense a sprawling API filled with callbacks into a single callback that takes a single argument. Even better, we get monadic syntactic sugar for composing multiple layers of callbacks.

I'll use the GLUT package as an example, which requires several callbacks like:

It says the return type is polymorphic, meaning that there is no hole left to fill. The above function just inserts return () in all holes and calls it a day. We can even prove a chain of continuations is done if its final return value type-checks as Void, the empty type:

run only accepts completed programs that have no holes left. We can use run for our previous GLUT example, since the final user callback handler leaves no unfinished holes:

run ((glut >=> userCallbacks) ()) :: IO ()

Conclusion

I hope this post inspires people to use the continuation monad to structure and modularize code completion boundaries. The continuation monad naturally arises at the boundaries between programmers and cleanly abstracts away callback hell into a simple and uniform interface with a single entry point.

Wednesday, December 12, 2012

Introduction

I'm releasing pipes-3.0, which significantly simplifies the entire library. This release began as the misnamed 2.6 branch of my Github repository, but then I finally cleanly solved the polymorphic constraints issue and this solution unlocked several features that I could finally implement.

The large change log includes:

Type-classing the entire Proxy API

Offering both fast and correct base implementations

Fixing type synonyms

Unifying the Pipes and Proxy APIs and dropping Frames

Adding the PFunctor type class (functors over proxies)

Performance improvements

Expanded laws and guarantees

Smaller dependencies

As always, if you want to learn how to use pipes, just consult the tutorial.

Code

I can demonstrate a lot of the new improvements just by taking the take' function from the pipes-2.4 announcement post and showing the difference before and after the new changes.

Control.Proxy now imports useful things from other libraries (like lift).

However, this release includes several more great changes as well.

Polymorphic Constraints

The big issue that held back the library in the wake of the 2.4 release was that I could not type class the request and respond operations. The pipes library really only has three fundamental operations:

(>->)

request

respond

Unfortunately, I could not type class request and respond to work with proxy transformers, meaning any standard libraries would be substantially crippled.

The problem boiled down to my inability to write polymorphic type class contexts like this:

This method has one drawback, which is that sometimes you need to embed the code in a newtype if all the type variables are polymorphic. The IdentityP proxy transformer fits this purpose perfectly, meaning that if you don't use any proxy transformers, you just write:

myPipe () = runIdentityP $ pipeCode

The () and runIdentityP are the only syntactic noise now. The () is essential, as I will describe below, and the runIdentityP avoids OverlappingInstances when all the type variables are polymorphic.

With this change I could then lift request and respond over proxy transformers and consolidate the entire Proxy API into the single Proxy type class. this solution is entirely Haskell98 and no longer requires FlexibleContexts. If you use pipes already you will discover that your new type signatures will become gorgeous now.

Type-classed API

pipes-3.0 type classes the entire Proxy API using the Proxy type class, which defines the following three essential functions:

(>->)

request

respond

The entire library implement EVERYTHING using these three functions and the upcoming prompt finalization library also is built entirely on these three functions. I can safely say that you don't need to use any other functions to write fully featured pipes libraries, which dramatically simplifies the API.

I type-classed all the utilities which means that I can now offer two base proxy implementations:

ProxyFast The fast proxy implementation from pipes-2.5

ProxyCorrect The correct proxy implementation from pipes-2.4

So now you can choose which implementation you prefer. Control.Proxy exports ProxyFast implementation by default, but you can easily switch to the correct implementation. Because all pipes code builds on the Proxy type class you are completely free to pick whichever implementation you prefer and they will both work transparently with all standard libraries.

However, these utilities don't just work with both base proxy implementations; they also now work as proxy transformers, too! This means you can now use these utilities seamlessly within any feature set without any lifting.

Fixing the double request law was an interesting challenge, but it was ultimately very rewarding and led to the correct formulation of the laws in terms of two symmetric proxy composition categories:

Pull-based category: (>->) and idT (the current one)

Push-based category: (>~>) and coidT (the new one)

The pull-based category composes proxies blocked on respond and returns a new proxy blocked on respond. The push-based category composes proxies blocked on request and returns a new proxy blocked on request.

The proxy laws say that these two categories are perfectly symmetric and that your code's behavior never changes under the following functor, which is an isomorphism between the two categories:

(>~>) to (>->)

(>->) to (>~>)

request to respond

respond to request

This means that the choice of composition operator is arbitrary, and I standardize on using the pull-based (>->) operator because the library already uses it and also matches the lazy demand-based expectations of Haskell programmers. Therefore, you never have to use the (>~>) operator, but I include in the library as a theoretical curiosity.

I've already spoken about the practical benefit of centralizing on proxies (duplex channels), but I also standardized on proxies because of their elegance, even more elegant than even pipes. For example:

The Proxy laws have a purely categorical formulation. Pipes do not permit such a formulation.

Proxy Kleisli arrows form morphisms in at least 5 categories, all of which have the same shape. Pipes are morphisms in just two categories (composition and Kleisli categories), and their morphisms do not overlap.

You can assemble most useful proxies purely using composition operators: i.e. (>=>) and (>->), unlike Pipes, which have an impedance mismatch between the two categories.

You cannot implement a push-based category using Pipes.

Type synonyms

Now type synonyms work incredibly well after I parametrized them to take the Proxy instance as a type parameter. This means that you never have to give them up when things get hairy and if you loved your good old-fashioned and simple Pipe, Consumer and Producer types, you can now use them transparently with the entire proxy ecosystem and they mesh perfectly with everything.

There is another reason I really like proxies: The Producer type synonym forbids requests. You can't do that with the unidirectional pipe implementation. I special case the Producer type synonym to close the upstream end to forbid communication upstream, so now you have a stronger guarantee.

I also went back to using () and C for type synonyms instead of Rank2Types to universally quantify unused ends. Rank2Types caused all sorts of problems when you tried to write proxy combinators that accepted pipes as arguments, whereas the simple approach always works and gives clearer types. The only disadvantage is that if you want to insert a pipe with a blocked end within an open one (like a Producer within a Pipe), you must explicitly reopen the end using the unitD and unitU helper functions.

Also, I chose not to include the initial parameter of proxies in the type synonym. The main reason is that you then lose the type synonym any time you define anything other than a composable proxy, which defeats all the other gains I just mentioned. Also, proxies lie along 5 categories, and special-casing the type synonyms for just one of those categories is a bad idea, especially when at least one of those other categories (the Kleisli category) is commonly used as well.

I also empirically experimented with both approaches in my own projects, and I can pretty confidently state that the type synonyms should never include the initial parameter. It does mean that your type signatures will be slightly longer, but it's worth it. Also, it makes it much easier for people less familiar with the pipes library to consume pipes utilities because they don't need to remember what the initial argument for any given type synonym is supposed to be.

API Consolidation

I went to a great deal of trouble to clean up the fragmentation that I began when I released Frames. Frames are gone and I've merged Pipe functionality into Proxys, so the library now only has three operations you ever need to use:

(>->)

request

respond

There is now just one way to compose things, something which a lot of users have requested.

I've deprecated the old Pipe API and included a "Control.Proxy.Pipe" module which helps users transition from Pipes to Proxys. This module not only provides the transition API but also has detailed instructions for how to upgrade your library while still keeping the niceties of the Pipes API.

In addition, I've consolidated all the tutorials into a single coherent tutorial in Control.Proxy.Tutorial with a single logical progression. The tutorial is long, but describes a great deal of the pipe idioms I've collected through my extensive usage of pipes in my own projects. More importantly, the tutorial collects all of the documentation for the library into a single location again so that users don't need to hunt over several tutorial modules or several blog posts of mine to learn about some key concept.

I also spent a lot of effort into guiding people towards a single natural coding style for pipes to make it easier for pipes users to read each other's code. The API exported from Control.Proxy has very little redundancy, and many old redundant API features (like the old await and yield) require you to pay an "import tax" to the tune of one extra import if you want to deviate from the new pipes "style guide". The tutorial also spends a lot of time talking about common idioms to encourage a uniform coding style.

Performance

The pipes library now includes rewrite rules that rewrite unoptimized user code into the equivalent hand-tuned code. Despite type-classing the entire standard library these rewrite rules fire very robustly without any assistance on your part. Just enjoy the extra performance. There are no performance regressions in this release and all the standard library utilities perform just as damn fast as they did in pipes-2.5, despite now being fully polymorphic and written using ordinary do notation.

Like always, "Control.Proxy.Prelude.Base" provides tons of examples of how to write idiomatic and high-performance pipes code, so feel free to consult that if you are learning how to write your own utilities.

Lightweight

I've dropped Frames, which means that I can now drop the index-core dependency. This means that the library only has one dependency: transformers-0.2. pipes is now the lightest dependency for any streaming library.

I will not add any extra dependencies to the main package and I will release all additional features in separate packages that build on top of pipes. I would like to see the core pipes package go as viral as possible. This is quite easy because in addition to having only a single dependency, pipes is:

Quick to compile

Safe Haskell

Light on extensions: Only KindSignatures and Rank2Types (for MFunctor and PFunctor)

Convenience imports

I added some extra imports from other modules to Control.Proxy, like lift and forever. I was pretty conservative in what I added and stuck to things that I considered essential for using pipes idiomatically.

hoist and lift

I renamed MFunctor's mapT to hoist. Sorry for the inconvenience, but mapT was a terrible name for several reasons and instead went with Edward's naming convention which I felt was more tasteful.

One of the really nice things to come out of pipes is a clean mechanism for mixing arbitrary monad transformer stacks and arbitrary proxy transformer stacks. It turns out that if two transformer stacks have the same base monad, then you can always interleave them in any way using the right combination of hoist and lift.

The exact same trick also applies to proxy transformer stacks. You can always interleave them using the right combination of hoistP (from the new PFunctor class) and liftP, which are just the higher-kinded versions of hoist and lift (and they also obey more laws, too).

The tutorial describes this trick in more detail and I find that this works phenomenally well and is very easy to reason about. For this reason I think MFunctor needs to be somewhere in the standard libraries as soon as possible because it is incredibly useful for this purpose.

Stability

I can state confidently that the pipes API can now handle any problem you throw at it and that the core is feature complete. The tutorial shows how you can combine the incredibly small pipes core into all the functionality you need. The only exception is prompt finalization, which I will describe in a separate upcoming library, but finalization similarly builds entirely on this core and does not add any new concepts or primitives.

There are only a few changes I foresee making to the main library in the future, which are:

Moving MFunctor to a better home

Adding a base proxy implementation with strictness annotations (never leaks, but half the performance of the fast implementation)

Note that this doesn't mean that pipes frequently leaks space. I've only found one case of leaking, which is the following code segment:

runProxy $ enumFromS 1

Everything else I have tried does not leak and I currently use pipes in long-running server applications of my own that also do not leak without any strictness annotations. However, if users do encounter leaks then I will go ahead and add the strict base implementation.

I will also be pretty conservative about expanding the core library because I want to stabilize it now. Almost all new functions will go in separate libraries now.

Conclusion

At this point I am very happy with the state of the main pipes library and I'm currently devoting my next efforts to standard libraries. The next libraries I will release will be (in this order):