Saturday, June 14, 2014

Spreadsheet-like programming in Haskell

What if I told you that a spreadsheet could be a library instead of an application? What would that even mean? How do we distill the logic behind spreadsheets into a reusable abstraction? My mvc-updates library answers this question by bringing spreadsheet-like programming to Haskell using an intuitive Applicative interface.

The central abstraction is an Applicative Updatable value, which is just a Fold sitting in front of a Managed Controller:

data Updatable a =
    forall e . On (Fold e a) (Managed (Controller e))

The Managed Controller originates from my mvc library and represents a resource-managed, concurrent source of values of type e. Using Monoid operations, you can interleave these concurrent resources together while simultaneously merging their resource management logic.

The Fold type originates from my foldl library and represents a reified left fold. Using Applicative operations, you can combine multiple folds together in such a way that they still pass over the data set just once without leaking space.
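To make the single-pass claim concrete, here is a minimal sketch of a reified left fold and its Applicative instance, written against base only. It follows the general shape of the Fold type but is not the foldl library's source; the helper names fold, sumF, lengthF, and average are chosen here for illustration.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.List (foldl')

-- A reified left fold: a step function, an initial accumulator, and an
-- extraction function. The accumulator type x is hidden from the caller.
data Fold e a = forall x . Fold (x -> e -> x) x (x -> a)

instance Functor (Fold e) where
    fmap f (Fold step begin done) = Fold step begin (f . done)

-- Strict pair, so the combined accumulator does not pile up thunks.
data Pair a b = Pair !a !b

instance Applicative (Fold e) where
    pure a = Fold (\() _ -> ()) () (\() -> a)

    Fold stepL beginL doneL <*> Fold stepR beginR doneR =
        Fold step (Pair beginL beginR) done
      where
        -- Both accumulators advance on every element.
        step (Pair xL xR) e = Pair (stepL xL e) (stepR xR e)
        done (Pair xL xR)   = doneL xL (doneR xR)

-- Run a fold over a list in one strict pass.
fold :: Fold e a -> [e] -> a
fold (Fold step begin done) es = done (foldl' step begin es)

sumF :: Fold Double Double
sumF = Fold (+) 0 id

lengthF :: Fold e Int
lengthF = Fold (\n _ -> n + 1) 0 id

-- The two folds are fused, so the list is traversed only once.
average :: Fold Double Double
average = (\s n -> s / fromIntegral n) <$> sumF <*> lengthF

main :: IO ()
main = print (fold average [1, 2, 3, 4])
```

Running main prints 2.5. The input list is consumed once, because the combined step function feeds each element to both accumulators before moving on.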

To build an Updatable value, just pair up a Fold with a Managed Controller:
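A minimal sketch of two such values, assuming the stdinLines and tick Controllers from MVC.Prelude and the last and length folds from Control.Foldl. Only the names lastLine and seconds come from this post; reading from standard input and ticking once per second are assumptions made for the sketch.

```haskell
import qualified Control.Foldl as Fold
import MVC.Prelude (stdinLines, tick)
import MVC.Updates (Updatable(On))

-- Remember the most recent line read from standard input
-- (Nothing until the first line arrives). Assumed input source.
lastLine :: Updatable (Maybe String)
lastLine = On Fold.last stdinLines

-- Count the ticks of a Controller that fires once per second.
-- The one-second interval is an assumption.
seconds :: Updatable Int
seconds = On Fold.length (tick 1.0)
```

Building these values performs no work. The Controllers are only acquired once the Updatable value is handed to the rest of the mvc program.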

What's amazing is that when you stick a Fold in front of a Controller, you get a new Applicative. This Applicative instance lets you combine multiple Updatable values into new derived Updatable values. For example, we can combine lastLine and seconds into a single data type that tracks updates to both values:

example will update every time lastLine or seconds updates, caching and reusing portions that do not update. For example, if lastLine updates then only the first field of Example will change. Similarly, if seconds updates then only the second field of Example will change.
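The combination described above might look like the following sketch. The field layout of Example and the helper name combineExample are assumptions; the post's example value would correspond to applying this helper to lastLine and seconds.

```haskell
import Control.Applicative ((<$>), (<*>))
import MVC.Updates (Updatable)

-- One field per input: the latest line and the elapsed seconds.
-- (Assumed layout; the post only says Example has these two fields.)
data Example = Example (Maybe String) Int deriving (Show)

-- Lift the pure constructor over two Updatable inputs.
-- combineExample is a hypothetical helper, not part of mvc-updates.
combineExample
    :: Updatable (Maybe String) -> Updatable Int -> Updatable Example
combineExample line secs = Example <$> line <*> secs
```

Because Example is an ordinary constructor, adding a third input only means adding a field and one more `<*>`.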

When we're done combining Updatable values we can plug them into mvc using the updates function:

updates :: Buffer a -> Updatable a -> Managed (Controller a)

This gives us back a Managed Controller we can feed into our mvc application:
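As a rough sketch of that wiring, the helper below prints every new value of any showable Updatable. It assumes runMVC, asPipe, and asSink from mvc, cat from pipes, and the Unbounded buffer from pipes-concurrency (newer versions of that package may spell it unbounded). The name printUpdates is hypothetical.

```haskell
import MVC
import MVC.Updates (Updatable, updates)
import Pipes (cat)

-- Print each new value of an Updatable until the program is interrupted.
-- printUpdates is a hypothetical helper, not part of mvc or mvc-updates.
printUpdates :: Show a => Updatable a -> IO ()
printUpdates updatable = runMVC () (asPipe cat) viewController
  where
    -- The only buffer in the whole pipeline is the one passed to updates.
    viewController = do
        controller <- updates Unbounded updatable
        return (asSink print, controller)
```

The model here is the identity pipe, so each update flows straight from the Controller to the View. A real program would put its pure logic in that position instead.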

The key feature I want to emphasize is how concise this spreadsheet API is. We provide our user an Applicative input cell builder and a Monoid output cell builder, and we're done. We don't have to explain to the user how to acquire resources, manage threads, or combine updates. The Applicative instance for Updatable handles all of those trivial details for them. Adding extra inputs or outputs is as simple as chaining additional inCell and outCell invocations.

Reactive animations

We don't have to limit ourselves to spreadsheets, though. We can program Updatable graphical scenes using these same principles. For example, let's animate a cloud that orbits around the user's mouse using the sdl library. Just like before, we will begin from a concise interface:

cloudOrbit is defined as a pure function from the current time and mouse coordinates to a Cloud. With the power of Applicatives we can lift this pure function over two Updatable values (mouse and seconds) to create a new Updatable Cloud that we pass intact to our program's View.

Under the hood

mvc-updates distinguishes itself from similar libraries in other languages by not relying on a semantics for concurrency. The Applicative instance for Updatable uses no concurrent operations whatsoever:

In fact, this Applicative instance only assumes that the Controller type is a Monoid, so this trick generalizes to any source that forms a Monoid.
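The idea can be sketched in a simplified pure model that uses base only. Here a plain list of events stands in for Managed (Controller e), since lists also form a Monoid. This is not the library's code: the Pair type, the history function, and the sample events are assumptions made for illustration.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

data Fold e a = forall x . Fold (x -> e -> x) x (x -> a)

-- Simplified model: a list of events replaces Managed (Controller e).
data Updatable a = forall e . On (Fold e a) [e]

-- Strict pair holding the accumulators of both sides.
data Pair a b = Pair !a !b

instance Functor Updatable where
    fmap f (On (Fold step begin done) events) =
        On (Fold step begin (f . done)) events

instance Applicative Updatable where
    -- A constant value folds over the empty source.
    pure a = On (Fold (\() _ -> ()) () (\() -> a)) mempty

    On (Fold stepL beginL doneL) eventsL <*> On (Fold stepR beginR doneR) eventsR =
        On (Fold step (Pair beginL beginR) done) events
      where
        -- Each side only steps on its own events; the other side's
        -- accumulator is reused unchanged.
        step (Pair xL xR) (Left  e) = Pair (stepL xL e) xR
        step (Pair xL xR) (Right e) = Pair xL (stepR xR e)
        done (Pair xL xR)           = doneL xL (doneR xR)
        -- Tag each source, then merge with the Monoid operation.
        events = map Left eventsL <> map Right eventsR

-- Replay the events: the initial value, then the value after each event.
history :: Updatable a -> [a]
history (On (Fold step begin done) events) = map done (scanl step begin events)

lastLine :: Updatable (Maybe String)
lastLine = On (Fold (\_ e -> Just e) Nothing id) ["hello", "world"]

seconds :: Updatable Int
seconds = On (Fold (\n _ -> n + 1) 0 id) [(), ()]

main :: IO ()
main = mapM_ print (history ((,) <$> lastLine <*> seconds))
```

Running main prints five pairs, starting from (Nothing,0) and ending at (Just "world",2), and exactly one component changes between consecutive lines. In this model, list concatenation places all left events before all right events, whereas a real Controller interleaves them in arrival order. Nothing in the instance spawns a thread or allocates a buffer.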

This not only simplifies the proof of the Applicative laws, but it also greatly improves efficiency. This Applicative instance introduces no new threads or buffers. The only thread or buffer you will incur is in the final call to the updates function, but expert users can eliminate even that overhead by inlining the logic of the updates function directly into their mvc program.

The small size of the library is no accident. The Updatable abstraction is an example of a scalable program architecture. When we combine Updatable values together, the end result is a new Updatable value. This keeps the API small since we always end up back where we started and we never need to introduce additional abstractions.

There is no need to distinguish between "primitive" Updatable values or "derived" Updatable values or "sheets" of Updatable values. The Applicative interface lets us unify these three concepts into a single uniform concept. Moreover, the Applicative interface is one of Haskell's widely used type classes inspired by category theory, so we can reuse people's pre-existing intuition for how Applicatives work. This is a common theme in Haskell where once you learn the core set of mathematical type classes they go a very, very long way.

Conclusion

Hopefully this post will get you excited about the power of Applicative programming. If you would like to learn more about Applicatives, I highly recommend the "Applicative Programming with Effects" paper by Conor McBride and Ross Paterson.

I would like to conclude by saying that there are many classes of problems that the mvc-updates library does not solve well, such as:

10 comments:

I'm a little bit familiar with constraint programming, but the main things I look for in a programming paradigm are programming interfaces inspired by category theory or abstract algebra (i.e. monoids, functors, categories, etc.). Are there analogs of that in constraint programming?

I don't use mathematics for the sake of using mathematics. The purpose behind structuring programs mathematically is to compose small bits of mathematical functionality, each of which is correct in isolation, to build larger mathematical structures which are still correct.

Sure, you can always whip up some specialized and non-mathematical solution, but such solutions rarely generalize well to more complex problems. They usually solve some very specific problem very well, but the moment you deviate from the problem they were intended to solve they become very brittle.

Even the very example you give (constraint programming systems) demonstrates this issue. Constraint programming lacks the resource management sophistication of `mvc-updates`, which automatically merges resource management logic as you combine updatable values. It's not clear to me how I would extend constraint programming with this feature, whereas with `mvc-updates` it was trivial because it took the principled approach.

Then there are the million-plus iOS/Mac programmers using AutoLayout and KVO/Bindings. AutoLayout is based on the Cassowary constraint solver, and KVO/Bindings is equivalent to a simple one-way constraint solver (without formulae).

And finally, spreadsheets are the most widespread form of programming, and again, spreadsheets = one-way dataflow constraints.

So you have a funny definition of "nobody" :-)

@Gabriel: what do you mean with "resource management"? Considering the wide variety of constraint systems, are you certain that none have this? In fact, mvc-updates seems quite limited compared to most constraint systems I am aware of.