Friday, April 25, 2014

I'm releasing the mvc library for model-view-controller (MVC) programming in Haskell. I initially designed this library with games and user interfaces in mind, but the larger goal of this library is to provide a mathematically inspired framework for general-purpose component sharing in Haskell.

This library differs in a significant way from other MVC libraries: this API statically enforces in the types that the Model is pure, but with very little loss in power. Using mvc you can refactor many types of concurrent applications into a substantial and pure Model that interfaces with carefully constrained Views and Controllers.

The first half of this post walks through two examples reproduced from the mvc documentation and the second half of this post describes the architecture that enables you to embed large and non-trivial business logic in an entirely pure Model. This post will use a side-by-side mix of theoretical terminology alongside plain English. Even if you don't understand the theoretical half of this post you should still derive benefit from the non-technical half and use that as a bridge towards understanding the underlying theory.

Examples

The mvc library uses four types to represent everything:

The Model is a pure streaming transformation from inputs to outputs

The View handles all outputs from the Model

The Controller supplies all inputs to the Model

The Managed type extends other types with logic for acquiring or releasing resources

There are no other concepts you have to learn. The API is extraordinarily small (4 types, and 8 primitive functions associated with those types).

However, as we extend our application the types will never grow more complex. In fact, the mvc library statically forbids you from increasing the complexity of your types, because the library only provides a single run function of the following type:

There is no other way to consume Models, Views, and Controllers, so runMVC forces you to consolidate all your Views into a single View and consolidate all your Controllers into a single Controller. This creates a single entry point and a single exit point for your Model. Equally important, you cannot mix your Model logic with your View or Controller logic. The mvc library enforces MVC best practices using the type system.

A Controller that interleaves lines from standard input with periodic ticks

A View that writes lines to standard output

A pure Model, which forwards lines until it detects the string "quit"

A Managed type, which abstracts over resource acquisition and release

The Model only has a single streaming entry point (the Controller) and a single streaming exit point (the View).

However, this interface is deceptively simple and can model very complex logic. For example, here's a more elaborate example using the sdl library that displays a white rectangle between every two mouse clicks:

Compile and run this program, which will open up a window, and begin clicking to paint white rectangles to the screen:

Here we package the effectful and concurrent components that we need from the sdl into a self-contained package containing a single View and Controller. Our pure logic is contained entirely within a pure Pipe, meaning that we can feed synthetic input to our program:

These kinds of tests would be impossible to run if we settled for anything less than complete separation of impurity and concurrency from our program's logic.

However, this total separation might seem unrealistic. What happens if we don't have exactly one View or exactly one Controller?

Monoids - Part 1

Views and Controllers are Monoids, meaning that we can combine any number of Views into a single View, and likewise combine any number of Controllers into a single Controller, by using mconcat (short for "Monoid concatenation") from Data.Monoid:

In other words, we can can combine a list of any number of Views into a single View and combine a list of any number of Controllers into a single Controller. We get several benefits for free as a result of this design.

First, combinability centralizes our View logic and Controller logic into a single expression that we feed to runMVC. We can therefore identify all inputs and outputs to our system simply by tracing all sub-expressions that feed into this larger expression. Contrast this with a typical mature code base where locating all relevant inputs and outputs for the system is non-trivial because they are typically not packaged into a single self-contained term.

Second, combinability promotes reuse. If we find ourselves repeatedly using the same set of inputs or outputs we can bundle them into a new derived component that we can share with others.

Third, combinable inputs and outputs are the reason our Model can afford to have a single entry point and a single exit point. This beats having to write callback spaghetti code where we cannot easily reason about our application's control flow.

This is an example of a scalable architecture. The Monoid type class lets us indefinitely grow our inputs and outputs without ever increasing the number of concepts, abstractions or types.

To be more specific, this scalable architecture is a special case of the category design pattern. When combinable components are morphisms in a category, we can connect as many components as we please yet still end up back where we started. In this case the operative category is a monoid, where Views or Controllers are morphisms, (<>) is the composition operator and mempty is the identity morphism.

Functors - Part 1

However, the Monoid type class only lets us combine Views and Controllers that have the same type. For example, suppose we have a Controller for key presses, and a separate Controller for mouse events:

keys :: Controller KeyEvent
clicks :: Controller MouseEvent

If we try to combine these using (<>) (an infix operator for mappend), we will get a type error because their types do not match:

keys <> clicks -- TYPE ERROR!

keys and clicks don't stream the same event type, so how do we reconcile their different types? We use functors!

The functor design pattern specifies that when we have an impedance mismatch between components, we unify them to agree on a common component framework. Here, we unify both of our Controller output types using Either.

Using theoretical terminology, when we have morphisms in diverse categories, we use functors to transform these morphisms to agree on a common category. In this case keys is a morphism in the Controller KeyEvent monoid and clicks is a morphism in the Controller MouseEvent monoid. We use fmap to transform both monoids to agree on the Controller (Either KeyEvent MouseEvent) monoid.

However, in this case fmap is behaving as a functor in a different sense than we are normally accustomed to. We're already familiar with the following functor laws for fmap:

fmap (f . g) = fmap f . fmap g
fmap id = id

However, right now we're not interested in transformations from functions to functions. Instead, we're interested in transformations from monoids to monoids, so we're going to invoke a different set of functor laws for our Controllers:

fmap f (c1 <> c2) = fmap f c1 <> fmap f c2
fmap f mempty = mempty

In other words, fmap f correctly translates monoid operations from one type of Controller to another. This functor between monoids is the operative functor when we transform Controllers to agree on a common type.

Functors - Part 2

We can use the same functor design pattern to unify different types of Views as well. For example, let's assume that we have two separate Views, one that logs Strings to a file, and another that displays video Frames to a screen:

logLines :: View String
frames :: View Frame

Again, we cannot naively combine these using mappend/(<>) because the types don't match:

logLines <> frames -- TYPE ERROR!

However, View does not implement Functor, so how do we unify the types this time?

We still use functors! However, this time we will be using the handles function from mvc, which has the following type:

handles :: Traversal' a b -> View b -> View a

This lets us use Traversals to specify which outgoing values each View should handle:

This latter functor between View monoids is the operative functor when we are unifying Views to agree on a common type.

Applicatives

Alright, but we don't typically work with unadorned Views or Controllers. If you look at the utility section of mvc you will see that most Views or Controllers are Managed, meaning that they must acquire or release some sort of resource before they can begin streaming values in or out. For example, any View or Controller that interacts with a file must initially acquire the file and release the file when done:

runMVC is written this way because some Views and Controllers must acquire and release the same resource. Does this mean that I need to provide a new run function that accepts separate ManagedViews and ManagedControllers?

No! Fortunately, Managed implements the Applicative type class and we can use Applicatives to combined two Managed resources into a single Managed resource:

I can fuse my two Managed resources into a single Managed resource! This is another example of scalable design. We don't complicate our run function by adding special cases for every permutation of ManagedViews and Controllers. Instead, we make Managed layers laterally combinable, which prevents proliferation of functions, types, and concepts.

Monoids - Part 2

Managed implements the Monoid type class, too! Specifically, we can wrap any type that implements Monoid with Managed and we will get back a new derived Monoid:

In fact, this trick works for any Applicative, not just Managed. Applicatives let you extend arbitrary Monoids with new features while still preserving their Monoid interface. There is no limit to how many Applicative extensions you can layer this way.

Conclusion

The documentation for the mvc library is full of theoretical examples like these showing how to architect applications using scalable abstractions inspired by category theory.

The mvc library has certain limitations. Specifically, I did not design the library to handle changing numbers of inputs and outputs over time. This is not because of a deficiency in category theory. Rather, I wanted to introduce this simpler API as a stepping stone towards understanding more general abstractions later on that I will release as separate libraries.

The other reason I'm releasing the mvc library is to test the waters for an upcoming book I will write about architecting programs using category theory. I plan to write one section of the book around an application structured using this mvc style.

Friday, April 4, 2014

Haskell design patterns differ from mainstream design patterns in one important way:

Conventional architecture: Combine a several components together of type A to generate a "network" or "topology" of type B

Haskell architecture: Combine several components together of type A to generate a new component of the same type A, indistinguishable in character from its substituent parts

This distinction affects how the two architectural styles evolve as code bases grow.

The conventional architecture requires layering abstraction on top of abstraction:

Oh no, these Bs are not connectable, so let's make a network of Bs and call that a C.

Well, I want to assemble several Cs, so let's make a network of Cs and call that a D

...

Wash, rinse, and repeat until you have an unmanageable tower of abstractions.

With a Haskell-style architecture, you don't need to keep layering on abstractions to preserve combinability. When you combine things together the result is still itself combinable. You don't distinguish between components and networks of components.

In fact, this principle should be familiar to anybody who knows basic arithmetic. When you combine a bunch of numbers together you get back a number:

3 + 4 + 9 = 16

Zero or more numbers go in and exactly one number comes out. The resulting number is itself combinable. You don't have to learn about "web"s of numbers or "web"s of "web"s of numbers.

If elementary school children can master this principle, then perhaps we can, too. How can we make programming more like addition?

Well, addition is simple because we have (+) and 0. (+) ensures that we can always convert more than one number into exactly number:

(+) :: Int -> Int -> Int

... and 0 ensures that we can always convert less than one number into exactly one number by providing a suitable default:

0 :: Int

This will look familiar to Haskell programmers: these type signatures resemble the methods of the Monoid type class:

In other words, the Monoid type class is the canonical example of this Haskell architectural style. We use mappend and mempty to combine 0 or more ms into exactly 1 m. The resulting m is still combinable.

Not every Haskell abstraction implements Monoid, nor do they have to because category theory takes this basic Monoid idea and generalizes it to more powerful domains. Each generalization retains the same basic principle of preserving combinability.

For example, a Category is just a typed monoid, where not all combinations type-check:

Category theory is full of generalized patterns like these, all of which try to preserve that basic intuition we had for addition. We convert more than one thing into exactly one thing using something that resembles addition and we convert less than one thing into exactly one thing using something that resembles zero. Once you learn to think in terms of these patterns, programming becomes as simple as basic arithmetic: combinable components go in and exactly one combinable component comes out.

These abstractions scale limitlessly because they always preserve combinability, therefore we never need to layer further abstractions on top. This is one reason why you should learn Haskell: you learn how to build flat architectures.

Tuesday, April 1, 2014

Worst practices spread more quickly because they require more attention

The following two sections outline specific manifestations of this problem and how they promote the dissemination of worst practices.

Management

Corollary 1: Teams with the greatest technical debt mentor the most employees.

Scenario 1: You manage a team that ships an important feature but with a poor execution. You can make a compelling case to management that your team needs more head count to continue supporting this feature. More employees reporting to you increases your chances of promotion and you enjoy rewarding opportunities to mentor new software engineers in your specific brand of programming.

Scenario 2: You manage a team that ships an important feature in the exact same amount of time but with an amazing execution. Your code requires little maintenance, so little in fact that your team is now obsolete. If you're lucky you are retasked to work on a new project; if you're unlucky you are fired because your skills are no longer relevant. Training others how to write software of excellent quality is out of the question either way.

In other words, software engineers that produce excellent code cannot easily pass on their wisdom to the next generation of programmers unless they go into academia.

Open source projects

Scenario 1: You write a useful library or programming language, but it is incredibly buggy and problematic. This generates several scathing blog posts criticizing your work, prompting others to clarify in response how to work around those issues. Don't forget to check Stack Overflow, which is full of people asking for help regarding how to use your project. Did you know that people measure popularity in terms of Stack Overflow questions?

Scenario 2: You release a useful library or programming language at the same time, except that through some miracle you get it correct on the first try. Version 1.0 is your first and last release. That means you can't advertise your library through announcement posts for subsequent releases and there are only so many ways that you can say "it works" before people accuse you of shameless advertising.

In other words, well-written open source projects can become victims of their own success, with fewer opportunities to market themselves.

Conclusion

Obviously the above examples are exaggerated hyperboles and reality is somewhere in the middle. There is no perfect software project or team and there is a limit to how much crap a company or user will tolerate.

However, I write this because I value simple and understandable software and I want to start a discussion on how software engineering can culturally reward correctness and reliability. We need to think of ways to encourage programming excellence to spread.