Generalizing APIs

Edit. ddarius pointed out to me that the type families examples were backwards, so I’ve flipped them to be the same as the functional dependencies.

Type functions can be used to do all sorts of neat type-level computation, but perhaps the most basic use is to allow the construction of generic APIs, instead of just relying on the fact that a module exports “mostly the same functions”. How much type trickery you need depends on properties of your API—perhaps most importantly, on the properties of your data types.

Suppose I have a single function on a single data type:

defaultInt :: Int

and I would like to generalize it. I can do so easily by creating a type class:

class Default a where
    def :: a

Abstraction on a single type usually requires nothing more than vanilla type classes.
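
As a minimal sketch of how such a class gets used, here are two instances; the particular default values (0 and False) are my own assumptions, chosen only for illustration.

```haskell
-- A minimal sketch; the instances and their values are illustrative assumptions.
class Default a where
    def :: a

instance Default Int where
    def = 0

instance Default Bool where
    def = False

-- The type signature is what selects the instance.
defaultInt :: Int
defaultInt = def
```

Loading this in GHCi and evaluating defaultInt or (def :: Bool) picks the instance from the requested type alone.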

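Things get more interesting with several data types. If every function mentions all of them (say, a set type c and an element type e), a multiparameter type class such as Set c e will do. If we’re unlucky, some of the functions will not use all of the data types: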

empty :: IntSet

In which case, when we attempt to use the function, GHC will tell us it can’t figure out what instance to use:

No instance for (Set IntSet e)
  arising from a use of `empty'
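
To make the error concrete, here is a rough sketch of the kind of multiparameter class it comes from. The method names insert and member, and the choice of Data.IntSet from the containers package as the backing type, are assumptions for illustration.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
import qualified Data.IntSet as IntSet
import Data.IntSet (IntSet)

-- Both methods mention c and e, so GHC can always find the instance.
class Set c e where
    insert :: e -> c -> c
    member :: e -> c -> Bool
    -- Adding `empty :: c` here is the problematic case:
    -- nothing in that signature mentions e.

instance Set IntSet Int where
    insert = IntSet.insert
    member = IntSet.member

-- Without a functional dependency the element type must be spelled out.
demo :: Bool
demo = member (3 :: Int) (insert (3 :: Int) IntSet.empty)
```

Note the annotations on the literals: with no link between c and e, knowing that the container is an IntSet tells GHC nothing about the element type.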

One thing to do is to introduce a functional dependency between IntSet and Int. A dependency means one thing depends on another, so which type depends on which? We don’t have much choice here: since we’d like to support the function empty, which doesn’t mention Int anywhere in its signature, the dependency has to go from IntSet to Int. That is, given a set (IntSet), I can tell you what it contains (an Int); in the class head this is written c -> e.
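
A sketch of the class with this dependency might look as follows. As before, the method names and the Data.IntSet backing are assumptions, not necessarily what the original listing used.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
import qualified Data.IntSet as IntSet
import Data.IntSet (IntSet)

-- "c -> e": the container type determines the element type.
class Set c e | c -> e where
    empty  :: c
    insert :: e -> c -> c
    member :: e -> c -> Bool

instance Set IntSet Int where
    empty  = IntSet.empty
    insert = IntSet.insert
    member = IntSet.member

-- GHC now resolves the instance from IntSet alone.
emptyInts :: IntSet
emptyInts = empty

-- The literal is inferred to be an Int through the dependency.
oneElem :: IntSet
oneElem = insert 1 emptyInts
```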

Notice that this is still fundamentally a multiparameter type class; we’ve just given GHC a little hint on how to pick the right instance. We can also introduce a fundep in the other direction, if we need to allow a plain e. For pedagogical purposes, let’s assume that our boss really wants a “null” element, which is always a member of a Set and when inserted doesn’t do anything. Its type is just e, so we add a second dependency, e -> c.
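
Here is one way the class could look with dependencies in both directions. Treating 0 as the null element of an IntSet is an assumption made for this sketch, as are the method names.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
import Prelude hiding (null)
import qualified Data.IntSet as IntSet
import Data.IntSet (IntSet)

-- Container determines element, and element determines container.
class Set c e | c -> e, e -> c where
    empty  :: c
    null   :: e
    insert :: e -> c -> c
    member :: e -> c -> Bool

instance Set IntSet Int where
    empty = IntSet.empty
    null  = 0                       -- assumed null element
    insert x s
        | x == 0    = s             -- inserting null does nothing
        | otherwise = IntSet.insert x s
    member x s = x == 0 || IntSet.member x s  -- null is always a member

-- Resolved through the e -> c dependency.
nullInt :: Int
nullInt = null

stillEmpty :: IntSet
stillEmpty = insert nullInt empty
```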

Also notice that whenever we add a functional dependency, we preclude ourselves from offering an alternative instance. The following is illegal with the last typeclass for Set:

instance Set IntSet Int where ...
instance Set IntSet Int32 where ...
instance Set BetterIntSet Int where ...

This will report a “Functional dependencies conflict”: the second instance gives IntSet two element types, and the third gives Int two containers.

Functional dependencies are somewhat maligned because they interact poorly with some other type features. An equivalent feature that was recently added to GHC is associated types (also known as type families or data families).

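Instead of telling GHC how to automatically infer one type from the other (via the dependency), we create an explicit type family (also known as a type function) which provides the mapping. The class now takes only the container c as a parameter and declares an associated data family Elem c.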

Notice that our typeclass is no longer multiparameter: it’s a little as if we had introduced a functional dependency c -> e. But then, how does it know what the type of null should be? Easy: it makes you tell it. Each instance supplies its own definition of Elem, for example data Elem IntSet = IntContainer Int.
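
Putting the class and one instance together, a sketch under the same assumptions as before (hypothetical method names, 0 as the null element, Data.IntSet as the container) could read:

```haskell
{-# LANGUAGE TypeFamilies #-}
import Prelude hiding (null)
import qualified Data.IntSet as IntSet
import Data.IntSet (IntSet)

class Set c where
    data Elem c                     -- associated data family
    empty  :: c
    null   :: Elem c
    insert :: Elem c -> c -> c
    member :: Elem c -> c -> Bool

instance Set IntSet where
    -- The constructor IntContainer identifies which Elem instance is meant.
    data Elem IntSet = IntContainer Int
    empty = IntSet.empty
    null  = IntContainer 0          -- assumed null element
    insert (IntContainer x) s
        | x == 0    = s
        | otherwise = IntSet.insert x s
    member (IntContainer x) s = x == 0 || IntSet.member x s

-- Helper for this sketch only: unwrap an element.
unwrap :: Elem IntSet -> Int
unwrap (IntContainer x) = x

fiveSet :: IntSet
fiveSet = insert (IntContainer 5) empty
```

Because a data family introduces a fresh type per instance, seeing Elem IntSet is enough for GHC to know the container is IntSet.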

A type function can also go the other direction, from the element type to the container (the Container e of the original version of this post). Then we can vary the implementation of the container based on what element type is being used, which may not be one that we own. This is one primary use case of data families, but it’s not directly related to the question of generalizing APIs, so we leave it for now.

IntContainer looks a lot like a newtype, and in fact can be made one:

instance Set IntSet where
    newtype Elem IntSet = IntContainer Int

If you find wrapping and unwrapping newtypes annoying, in some circumstances you can just use a type synonym instead, declaring type Elem c in the class and writing type Elem IntSet = Int in the instance.
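
A sketch of the synonym version follows, with the same hypothetical method names. I have left null out of this class on purpose: with a synonym family, a method whose type mentions only Elem c gives GHC no way to recover c.

```haskell
{-# LANGUAGE TypeFamilies #-}
import qualified Data.IntSet as IntSet
import Data.IntSet (IntSet)

class Set c where
    type Elem c                     -- associated type synonym family
    empty  :: c
    insert :: Elem c -> c -> c
    member :: Elem c -> c -> Bool

instance Set IntSet where
    type Elem IntSet = Int          -- no wrapper: elements are plain Ints
    empty  = IntSet.empty
    insert = IntSet.insert
    member = IntSet.member

-- The result type fixes c, and Elem IntSet reduces to Int.
example :: IntSet
example = insert 5 empty
```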

However, this rules out some functions you might like to write, for example, automatically specializing your generic functions:

x :: Int
x = null

GHC will error:

Couldn't match expected type `Elem c'
       against inferred type `Int'
  NB: `Elem' is a type function, and may not be injective

Since I could have also written:

instance Set BetterIntSet where
    type Elem BetterIntSet = Int

GHC doesn’t know which instance of Set to use for null: IntSet or BetterIntSet? This information has to reach the compiler some other way, and if this happens completely under the hood, you’re a bit out of luck. This is a distinct difference from functional dependencies, which report a conflict as soon as the relation is non-injective.

Another method, if you have the luxury of defining your data type, is to define the data type inside the instance:
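
As a sketch of that idea, the element type below exists only as a data instance. RecordSet, Record, and the fields field1 and field2 are hypothetical names, and backing the set with Data.Map is my own assumption.

```haskell
{-# LANGUAGE TypeFamilies #-}
import qualified Data.Map as Map

class Set c where
    data Elem c
    empty  :: c
    insert :: Elem c -> c -> c
    member :: Elem c -> c -> Bool

-- Hypothetical container: records keyed by their first field.
newtype RecordSet = RecordSet (Map.Map Int Bool)

instance Set RecordSet where
    -- The element type is defined right here, record syntax and all.
    data Elem RecordSet = Record { field1 :: Int, field2 :: Bool }
    empty = RecordSet Map.empty
    insert (Record k v) (RecordSet m) = RecordSet (Map.insert k v m)
    member (Record k v) (RecordSet m) = Map.lookup k m == Just v

sample :: RecordSet
sample = insert (Record 1 True) empty
```

The type of a Record value is then Elem RecordSet, not a standalone Record type, so no separate wrapping step is needed.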

So type families hide implementation details from the type signatures (you only use the associated types you need, as opposed to Set c e => c where the e is required but not used for anything—this is more obvious if you have twenty associated data types). However, they can be a bit more wordy when you need to introduce newtype wrappers for your associated data (Elem). Functional dependencies are great for automatically inferring other types without having to repeat yourself.

(Thanks Edward Kmett for pointing this out.)

What to do from here? We’ve only scratched the surface of type level programming, but for the purpose of generalizing APIs, this is essentially all you need to know! Find an API you’ve written that is duplicated across several modules, each of which provides a different implementation. Figure out what functions and data types are the primitives. If you have many data types, apply the tricks described here to figure out how much type machinery you need. Then go forth, and make thy API generic!

7 Responses to “Generalizing APIs”

“How do we assert that Container a is a Monoid? We have to add it to all of our function signatures, unfortunately”

Actually, you can use a constraint on the class:

class Monoid (Container e) => Set e where
    data Container e :: *
    …

Yes, you can refer to Container like that even though it looks like it would be out of scope; and yes, GHC supports this. It’s equality constraints (Container e ~ Something) in that position which are as of yet unimplemented.

(I’m also not sure whether to call this “class constraints”, “superclass constraints”, or what, but that’s just terminology.)

[…] this sense subtype polymorphism is closed. This post is inspired in part by the excellent article Generalizing APIs by Edward Z. Yang. For this post I will use Scala, my current language of choice for most of the […]