Servant, Type Families, and Type-level Everything

A look at advanced GHC features used in Servant

Servant is a really nice library for building REST APIs in Haskell. However, it uses advanced GHC features which may not be familiar to some Haskell programmers. In this article, I explain type-level strings, type-level lists, type-level operators, and type families. Finally, I use code from servant-server to explain how these features are used in practice.

This article is aimed at people who have a basic familiarity with Haskell. This includes understanding things like typeclasses, applicatives, monads, monad transformers, pointfree style, ghci, etc.

This article will give you insight into how Servant uses these advanced Haskell features, and will hopefully make you more productive when using Servant.

Servant Example

Here is a simple example of using servant-server. This code will be referred to throughout the article.

The Motivation for Servant

Most web frameworks allow the user to write a handler for a specific route as a function. Here is an example of a handler function for a theoretical framework returning a JSON list [1,2,3,4]:

dogNums' :: SomeMonad Value
dogNums' = return $ toJSON [1,2,3,4]

When a user makes a request to /dogs, this function would get called, and the framework would pass the generated JSON back to the user. The type of the handler function is SomeMonad Value. This means it is running in SomeMonad and returning a JSON Value.

This is not bad, but it’s not type safe. All the type signature says is that some kind of JSON is returned.

It would be nice to enforce that a list of Ints is returned. Ideally we would like to write this function like this:

dogNums'' :: SomeMonad [Int]
dogNums'' = return [1,2,3,4]

The framework would be responsible for converting the list of Ints to JSON and returning it to the user.

Servant does this for us.

In our example, there are two handlers for two different routes. Here is the handler for the /dogs route:

dogNums :: EitherT ServantErr IO [Int]
dogNums = return [1,2,3,4]

How does dogNums relate to dogNums''?

SomeMonad would be EitherT ServantErr IO. The list of Ints is the same.

Servant is great because it gives us type safety in the return type of our handlers.

However, two important things are still missing. Servant needs to be told that the handler should be called when the user sends a GET request to /dogs. Servant also needs to be told to convert the Int list returned by the dogNums handler to JSON.

This is really cool! We are able to get back the concrete value of something that only exists on the type level!
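Getting a term-level value back out of a type-level string can be sketched with nothing but base. This is a minimal illustration, not Servant code: Segment and segmentName are hypothetical names made up for this example, while Proxy, KnownSymbol, and symbolVal come from Data.Proxy and GHC.TypeLits.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- The string "dogs" appears only in the type of the Proxy;
-- symbolVal brings it back down to an ordinary String.
dogsPath :: String
dogsPath = symbolVal (Proxy :: Proxy "dogs")

-- Hypothetical helper type: a path segment whose name lives in its type.
data Segment (name :: Symbol) = Segment

-- KnownSymbol is the constraint that lets GHC supply the runtime string.
segmentName :: forall name. KnownSymbol name => Segment name -> String
segmentName _ = symbolVal (Proxy :: Proxy name)
```

Loading this into ghci and evaluating dogsPath gives "dogs", and segmentName (Segment :: Segment "cats") gives "cats".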

How does Servant use this? Recall the MyAPI type defined near the top of this article:

type MyAPI = "dogs" :> Get '[JSON] [Int] :<|> "cats" :> Get '[JSON] [String]

"dogs" and "cats" are type-level strings. At the end of this article we will look at some servant-server code and confirm that it is using symbolVal to get the value of the type-level strings.

Type-Level Lists

Just like type-level strings, type-level lists can also be defined.

First, the DataKinds language extension needs to be enabled.

ghci> :set -XDataKinds
ghci>

Let’s look at the kind of a type-level empty list:

ghci> :kind []
[] :: * -> *
ghci>

No, wait, that’s not right. That’s just the kind of the normal list constructor. How do we write a type-level list?

Take a quick peek at the GHC page on datatype promotion. The first section is pretty interesting, as is the section on the promoted list and tuple types. There is a short example of a heterogeneous list (or HList). A heterogeneous list is a list that has elements of different types. In the example, foo2 represents a heterogeneous list with two elements, Int and Bool.

From the example, you can see that type-level lists can be defined by putting a quote in front of the opening bracket:

ghci> :kind '[]
'[] :: [k]
ghci>

Type-level lists can also be defined with multiple elements:

ghci> :kind '[Int, Bool, String]
'[Int, Bool, String] :: [*]
ghci>

Going back to the MyAPI example from above, Servant is using type-level lists to represent the available content-type encodings of the response.

type MyAPI = "dogs" :> Get '[JSON] [Int] :<|> "cats" :> Get '[JSON] [String]

Servant is only willing to send back responses in JSON, because JSON is the only type in the type-level list.

If additional content types were added to these lists, such as a form-encoded type for /dogs and a plain-text type for /cats, the /dogs route would then return either JSON or form-encoded values, and the /cats route would return either JSON or plain text. (However, to get this to compile, there would need to be an instance of ToFormUrlEncoded [Int].)

I’m not going to go into how type-level lists are used in servant-server, but if you’re interested you may want to start with reading the Get instance for HasServer, which will take you to the methodRouter function, which will take you to the AllCTRender typeclass. The AllCTRender typeclass/instance is where the real magic starts happening.
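To give a rough feel for how a typeclass can walk a type-level list, here is a much-simplified sketch using only base. It is not the AllCTRender code: JSON and PlainText here are empty stand-in types, and HasMediaType and AllMediaTypes are hypothetical classes invented for this example. The idea is one instance for the empty list and one for cons, which recurses on the tail.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))

-- Stand-in content-type tags (hypothetical, not Servant's own types).
data JSON
data PlainText

-- Each tag knows its media type string.
class HasMediaType (ct :: Type) where
  mediaType :: Proxy ct -> String

instance HasMediaType JSON where
  mediaType _ = "application/json"

instance HasMediaType PlainText where
  mediaType _ = "text/plain"

-- Collect the media types of every element of a type-level list.
class AllMediaTypes (cts :: [Type]) where
  allMediaTypes :: Proxy cts -> [String]

-- Base case: the empty type-level list.
instance AllMediaTypes '[] where
  allMediaTypes _ = []

-- Recursive case: handle the head, then recurse on the tail.
instance (HasMediaType ct, AllMediaTypes cts) => AllMediaTypes (ct ': cts) where
  allMediaTypes _ =
    mediaType (Proxy :: Proxy ct) : allMediaTypes (Proxy :: Proxy cts)
```

In ghci, allMediaTypes (Proxy :: Proxy '[JSON, PlainText]) evaluates to ["application/json","text/plain"]. GHC picks the instances purely from the shape of the type-level list.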

Oliver Charles has an interesting post on the generics-sop package where he talks a little about heterogeneous lists.

Type-Level Operators

In the Servant example code above, there are two type-level operators being used: (:>) and (:<|>). Type-level operators are similar to normal data types—they are just composed of symbols instead of letters.

Let’s look at how (:>) and (:<|>) are defined in Servant:

data path :> a
data a :<|> b = a :<|> b

If we didn’t want to write them infix, they could be written like this:

data (:>) path a
data (:<|>) a b = (:<|>) a b

In fact, if these data types were written with letters instead of symbols, they would look something like this:

data Foo path a
data Bar a b = Bar a b

You can see that (:>) and (:<|>) are just normal datatype definitions. They only look weird because they are made of symbols and written infix.

Type operators help when writing long type definitions. They keep the long type definition easy to understand. Take the following API definition:

type MyAPI = "foo" :> "bar" :> Get '[JSON] [Int]

This defines the route /foo/bar. Rewriting this prefix would look like this:

type MyAPI = (:>) "foo" ((:>) "bar" (Get '[JSON] [Int]))

You can see how much easier the infix style is to read!

NOTE: The TypeOperators language extension is needed to use the above code.
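If you want to convince yourself that the infix and prefix spellings really are the same type, a small base-only experiment along these lines should do it. The (:>) operator, Get, and JSON below are local stand-ins declared just for this sketch (they are not imported from Servant), and the fixity is simply one I picked so that (:>) nests to the right.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (Symbol)

-- Local stand-in for Servant's path operator; it has no constructors.
data (path :: Symbol) :> (a :: Type)
infixr 9 :> -- fixity chosen for this sketch

-- Local stand-ins for the endpoint type and the content-type tag.
data Get (contentTypes :: [Type]) (a :: Type)
data JSON

-- The same route, written infix and prefix.
type InfixAPI = "foo" :> "bar" :> Get '[JSON] [Int]
type PrefixAPI = (:>) "foo" ((:>) "bar" (Get '[JSON] [Int]))

-- This only typechecks because the two synonyms expand to the same type.
sameAPI :: Proxy InfixAPI -> Proxy PrefixAPI
sameAPI = id
```

The interesting part is that sameAPI is just id. If the two spellings were different types, GHC would reject the definition.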

You may be thinking, “These type operators are pretty neat, but how are they actually used? They just look like confusing data types!” Well, we’ll get to that in a minute. Before we can jump into the Servant code, we need to get a basic understanding of type families.

Type Families

Type families are a relatively simple addition to Haskell that allow the user to do some computation at the type level. However, if you google for type families, it’s easy to get scared.

The first result is the GHC/Type families article on the Haskell Wiki. This is written with an advanced Haskeller in mind. Don’t worry if it’s too hard. (The other problem is that most of their examples use data families instead of type synonym families–which I introduce below. Most of the real world Haskell code I’ve seen uses type synonym families much more than data families).

The second link is to the type-families page in the GHC manual. It’s good if you already know about type families and just want a refresher, but it’s not good as an introduction to type families.

The third result is an article on FP Complete. It gets points for being about Pokemon, but the setup/motivation for using type families is way too long [3].

The fourth result is an introduction to type families by Oliver Charles. It’s the best of the bunch, but it is slightly hard to follow if you’ve never used MVars, IORefs, etc.

I wrote a super simple tl;dr presentation about type families. Originally I wrote it in Japanese for a Haskell Lightning Talk in Tokyo, but I recently translated it to English at the request of someone in the #haskell room in the functional programming slack community. If you aren’t sure about type families, please read that presentation and then proceed to the next section.
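For a quick taste before moving on, here is about the smallest type synonym family I can think of. Elem, firstElem, and unwrapOr are hypothetical names made up for this sketch. Think of Elem as a function from types to types: GHC rewrites Elem [e] and Elem (Maybe e) to e during type checking.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- A type-level function: given a container type, produce its element type.
type family Elem container

type instance Elem [e] = e
type instance Elem (Maybe e) = e

-- The result type mentions the family; GHC reduces Elem [e] to e.
firstElem :: [e] -> Maybe (Elem [e])
firstElem [] = Nothing
firstElem (x : _) = Just x

-- Here Elem (Maybe e) reduces to e in both argument and result position.
unwrapOr :: Elem (Maybe e) -> Maybe e -> Elem (Maybe e)
unwrapOr def Nothing = def
unwrapOr _ (Just x) = x
```

In ghci, firstElem "abc" evaluates to Just 'a'. Note that both instances map different container types to the same kind of answer, which is exactly why you cannot go backwards from the result of a type family to its argument.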

Servant

Now we come to the interesting section. How does Servant actually use type-level strings, type-level lists, type-operators, and type families? Let’s go back to the example code at the top of this blog post:

Let’s start with the easy things. The serve function returns a Network.Wai.Application. This represents an application that can be served by Warp (i.e. something that can be passed to the run function provided by Warp).

The first argument is Proxy layout. The serve function uses this to figure out what the API type is. You might be asking, “If we are also passing the layout type variable to the Server type constructor, why do we additionally need to pass a Proxy layout? Surely, we don’t need to pass layout twice?”. That will be covered later.

HasServer takes one type parameter, layout. ServerT is a type family that takes two parameters, layout and m.

There is one function in this typeclass, route. It takes a Proxy layout and an IO of a RouteResult of a ServerT with the m parameter specialized to EitherT ServantErr IO. Quite a mouthful. Let’s abbreviate part of the type to make it easier to digest:

route :: Proxy layout -> IO (RouteResult (ServerT layout ...)) -> Router

Basically route takes an IO of a RouteResult of a ServerT and returns a Router. Let’s go back real quick and look at the implementation of the serve function:

It basically calls route with the proxy and the implementation of the API.

Now for the interesting part. Since HasServer is a typeclass, what route function actually gets called? If we look at the HasServer typeclass once again, we can see that the specific route function that gets called depends on the type of layout (which gets passed to route as Proxy layout).

We won’t go into how the route function is implemented here, but if you are interested, you’re welcome to look at the implementation of methodRouter. methodRouter does the actual rendering of the return type. For example, it will turn our [Int] into a JSON blob.

Because methodRouter handles the rendering of the return type, route needs to pass it Proxy contentTypes so that methodRouter knows what type to render.

Wrap-Up

At a very high-level, the HasServer typeclass, ServerT type family, and route function are used to peel away levels of the MyAPI type:

type MyAPI = "dogs" :> Get '[JSON] [Int] :<|> "cats" :> Get '[JSON] [String]

First, (:<|>) is peeled away leaving us with "dogs" :> Get '[JSON] [Int]. Then (:>) is peeled away leaving us with Get '[JSON] [Int]. This gets turned into the actual type of dogNums: EitherT ServantErr IO [Int].

dogNums :: EitherT ServantErr IO [Int]
dogNums = return [1,2,3,4]
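The peeling process can be imitated in a toy model that needs only base. To be clear, this is not servant-server's implementation, and several things are simplified on purpose: the type family is called Server and has no monad parameter, Get has no content-type list, handlers are plain values rather than EitherT ServantErr IO actions, a request is just a list of path segments, and responses are rendered with show instead of JSON.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- Local stand-ins for the API combinators (fixities chosen for this sketch).
data (path :: Symbol) :> (a :: Type)
infixr 9 :>

data a :<|> b = a :<|> b
infixr 8 :<|>

data Get (a :: Type)

-- Toy version of HasServer: Server computes the handler type for a layout,
-- and route dispatches on a list of path segments.
class HasServer layout where
  type Server layout :: Type
  route :: Proxy layout -> Server layout -> [String] -> Maybe String

-- Endpoint: the handler is the value itself; respond once the path is consumed.
instance Show a => HasServer (Get a) where
  type Server (Get a) = a
  route _ handler [] = Just (show handler)
  route _ _ _ = Nothing

-- Peel off one path segment. Server ignores path, so the Proxy is the
-- only way to recover it here.
instance (KnownSymbol path, HasServer sub) => HasServer (path :> sub) where
  type Server (path :> sub) = Server sub
  route _ handler (segment : rest)
    | segment == symbolVal (Proxy :: Proxy path) =
        route (Proxy :: Proxy sub) handler rest
  route _ _ _ = Nothing

-- Peel off an alternative: try the left side, then the right side.
instance (HasServer a, HasServer b) => HasServer (a :<|> b) where
  type Server (a :<|> b) = Server a :<|> Server b
  route _ (left :<|> right) segments =
    case route (Proxy :: Proxy a) left segments of
      Just response -> Just response
      Nothing -> route (Proxy :: Proxy b) right segments

type MyAPI = "dogs" :> Get [Int] :<|> "cats" :> Get [String]

-- Server MyAPI reduces to [Int] :<|> [String].
myAPI :: Server MyAPI
myAPI = [1, 2, 3, 4] :<|> ["long-haired", "short-haired"]
```

In ghci, route (Proxy :: Proxy MyAPI) myAPI ["dogs"] evaluates to Just "[1,2,3,4]", and a request for ["birds"] gives Nothing. Each instance handles exactly one layer of the MyAPI type and hands the rest to the next instance, which is the same shape as the peeling described above.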

Why Pass layout Twice?

In the beginning of the Serve! section, a question was asked about the route function and the HasServer typeclass.

He says that the main reason we need to pass layout twice is that type families, like ServerT, are not injective. An explanation of injectivity is given on the Haskell Wiki page on type families.

If we have ServerT a m and ServerT b m, even if we know that ServerT a m == ServerT b m and m == m, we cannot conclude that a == b. (This is in contrast to a type like Maybe a and Maybe b, where if we know that Maybe a == Maybe b, then we also know that a == b.)

The route function effectively doesn’t get to “see” the layout passed to ServerT. It only “sees” the type that ServerT turns into.

The ServerT type family completely ignores the path argument! In the implementation of the route function, if we didn’t have the Proxy (path :> sublayout) argument, we wouldn’t be able to use the path argument at all! [6]

Conclusion

Thanks

After completing a rough draft of this blog post, I emailed all three main servant developers (Julian K. Arni, Alp Mestanogullari, and Sönke Hahn) asking them if they would review it. Since it’s such a long blog post, and I’m sure they are busy guys, I was expecting maybe one of them to respond, but to my surprise, all three responded within hours of sending the email. They all took the time not only to read through this post, but to give very helpful feedback.

If any of you ever come to Tokyo, dinner is on me!

Footnotes

Technically, this information is partially found in the API type, and partially comes from the fact that you arrange your Server handlers in the same order as the corresponding endpoints in the API type.

Later parts of the article will talk more about this in depth, but you can get an idea for what it means by looking at the myAPI function.

servant requires you to write the handlers for your endpoints in the same order that the corresponding endpoints appear in the API type. It is by relying on this order that servant can cross-reference the information found in the API types and the types of the handlers themselves, in order to check that you’re not returning an Int where a String is expected.

For example, based on the order of the “/dogs” and “/cats” handlers in the MyAPI type, GHC would throw an error if you reversed the order of dogNums and cats in the myAPI function.

This works, but there is still one problem left. What if we want the caller to be able to determine the type? Like we tried to do in ghci above, what if we want to be able to parse Bools as well as Ints? We can use Proxy to do this.

This article is also super long, so I really shouldn’t complain about length.↩

The article A Gentle Introduction to Monad Transformers might be a good place to start if you’re not too familiar with Monad transformers. However, if you’re not too familiar with Monad transformers, the rest of this article will probably be quite challenging.↩

It may be easier to reason about this code using convenient type synonyms. Originally we had this:

In fact, even if we didn’t use path, we would still have to use a Proxy. This is because the arguments to a type family declared inside a typeclass need to be used in a way that makes them unambiguous in functions making use of that type family.

It’s awkward to explain, but it is pretty easy to understand when you see an example.

In the following typeclass, there is one type family and two functions using that type family:

class Baz a where
  type Hoge a
  myGoodFunc :: a -> Hoge a -> Char
  myBadFunc :: Hoge a -> Char

In exampleGood, GHC knows to pick the myGoodFunc from the Baz String instance because the first argument to myGoodFunc is a String.

However, in exampleBad, GHC doesn’t know which myBadFunc to pick. Should it pick myBadFunc from the Baz Text instance, or from the Baz String instance? It doesn’t have enough information to decide. GHC will throw a compilation error.

The HasServer typeclass is also using this Proxy trick. That is why passing a Proxy is necessary.
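Here is a compilable sketch of the Proxy trick in isolation. A few assumptions to flag: myProxyFunc is a name I made up for the Proxy-taking variant, Bool stands in for Text so that the example needs nothing outside base, and the characters returned by the instances are arbitrary. The ambiguous method is left as a comment, since nothing in its type pins down a.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Proxy (Proxy (..))

class Baz a where
  type Hoge a
  -- Fine: the first argument mentions a directly.
  myGoodFunc :: a -> Hoge a -> Char
  -- Also fine: the Proxy carries a without needing a value of type a.
  myProxyFunc :: Proxy a -> Hoge a -> Char
  -- Ambiguous, so left commented out: a appears only under Hoge.
  -- myBadFunc :: Hoge a -> Char

instance Baz String where
  type Hoge String = Int
  myGoodFunc _ _ = 's'
  myProxyFunc _ _ = 's'

-- Bool stands in for Text here (an assumption to stay within base).
instance Baz Bool where
  type Hoge Bool = Int
  myGoodFunc _ _ = 'b'
  myProxyFunc _ _ = 'b'
```

In ghci, myGoodFunc "hello" 1 evaluates to 's' and myProxyFunc (Proxy :: Proxy Bool) 1 evaluates to 'b'. In both calls the second argument is an Int, so it is the first argument alone that tells GHC which instance to use.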

The key takeaway is: when you know something like Maybe a, you know a. When you know Hoge a, you don’t know a. In the two typeclass instances above, Hoge String and Hoge Text become Int, so if all you have is Int, GHC doesn’t know whether you started with Hoge String or Hoge Text. GHC can’t pick the right typeclass instance.↩