Foreword

Back at the beginning of 2011, when I first started doing the experiments in generic programming that would eventually turn into shapeless, I had no idea that five years later it would have evolved into such a widely used library. I am profoundly grateful to the people who have trusted me and added shapeless as a dependency to their own projects: the vote of confidence that this represents is a huge motivator for any open source project. I am also hugely grateful to the many people who have contributed to shapeless over the years: eighty-one at the time of writing. Without their help shapeless would be a far less interesting and useful library.

These positives notwithstanding, shapeless has suffered from one of the common failings of open source projects: a lack of comprehensive, accurate and accessible documentation. The responsibility for this lies squarely at my door: despite acknowledging the lack I have never been able to find the time to do anything about it. To some extent shapeless has been saved from this by Travis Brown’s heroic Stack Overflow performance and also by the many people who have given talks about and run workshops on shapeless (in particular I’d like to highlight Sam Halliday’s “Shapeless for Mortals” workshop).

But Dave Gurnell has changed all that: we now have this wonderful book-length treatment of shapeless’s most important application: type class derivation via generic programming. In doing this he has pulled together fragments of folklore and documentation, has picked my brain, and turned the impenetrable tangle into something which is clear, concise and very practical. With any luck he will be able to make good on my regular claims that at its core shapeless is a very simple library embodying a set of very simple concepts.

Thanks Dave, you’ve done us all a great service.

Miles Sabin
Creator of shapeless

1 Introduction

This book is a guide to using shapeless, a library for generic programming in Scala. Shapeless is a large library, so rather than cover everything it has to offer we will concentrate on a few compelling use cases and use them to build a picture of the tools and patterns available.

Before we start, let’s talk about what generic programming is and why shapeless is so exciting to Scala developers.

1.1 What is generic programming?

Types are helpful because they are specific: they show us how different pieces of code fit together, help us prevent bugs, and guide us toward solutions when we code.

Sometimes, however, types are too specific. There are situations where we want to exploit similarities between types to avoid repetition. For example, consider the following definitions:

These two case classes represent different kinds of data but they have clear similarities: they both contain three fields of the same types. Suppose we want to implement a generic operation such as serializing to a CSV file. Despite the similarity between the two types, we have to write two separate serialization methods:
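The listings for this passage are not reproduced here. As a minimal sketch, assuming two case classes with the hypothetical field names shown (the text only fixes the field types: a String, an Int, and a Boolean), the duplicated serialization code might look like this:

```scala
object SeparateSerializers {
  // Two unrelated case classes that happen to share the same field types.
  // The field names are illustrative assumptions.
  case class Employee(name: String, number: Int, manager: Boolean)
  case class IceCream(name: String, numCherries: Int, inCone: Boolean)

  // One method per type, even though the bodies are structurally identical.
  def employeeCsv(e: Employee): List[String] =
    List(e.name, e.number.toString, e.manager.toString)

  def iceCreamCsv(c: IceCream): List[String] =
    List(c.name, c.numCherries.toString, c.inCone.toString)
}
```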

Generic programming is about overcoming differences like these. Shapeless makes it convenient to convert specific types into generic ones that we can manipulate with common code.

For example, we can use the code below to convert employees and ice creams to values of the same type. Don’t worry if you don’t follow this example yet—we’ll get to grips with the various concepts later on:

Both values are now of the same type. They are both heterogeneous lists (HLists for short) containing a String, an Int, and a Boolean. We’ll look at HLists and the important role they play soon. For now the point is that we can serialize each value with the same function:
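A sketch of the idea, assuming shapeless 2.3.x on the classpath and the same hypothetical case classes as before. The method name genericCsv is an illustrative choice:

```scala
import shapeless._

object SharedSerializer {
  case class Employee(name: String, number: Int, manager: Boolean)
  case class IceCream(name: String, numCherries: Int, inCone: Boolean)

  // Generic[A].to converts a case class instance to its HList representation.
  val genericEmployee = Generic[Employee].to(Employee("Dave", 123, false))
  val genericIceCream = Generic[IceCream].to(IceCream("Sundae", 1, false))

  // A single method now serves every type whose representation has this shape.
  def genericCsv(gen: String :: Int :: Boolean :: HNil): List[String] =
    gen match {
      case s :: i :: b :: HNil => List(s, i.toString, b.toString)
    }
}
```

Both genericEmployee and genericIceCream can be passed to genericCsv because they share the type String :: Int :: Boolean :: HNil.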

This example is basic but it hints at the essence of generic programming. We reformulate problems so we can solve them using generic building blocks, and write small kernels of code that work with a wide variety of types. Generic programming with shapeless allows us to eliminate huge amounts of boilerplate, making Scala applications easier to read, write, and maintain.

Does that sound compelling? Thought so. Let’s jump in!

1.2 About this book

This book is divided into two parts.

In Part I we introduce type class derivation, which allows us to create type class instances for any algebraic data type using only a handful of generic rules. Part I consists of four chapters:

In Chapter 2 we introduce generic representations. We also introduce shapeless’ Generic type class, which can produce a generic encoding for any case class or sealed trait.

In Chapter 3 we use Generic to derive instances of a custom type class. We create an example type class to encode Scala data as Comma Separated Values (CSV), but the techniques we cover can be extended to many situations. We also introduce shapeless’ Lazy type, which lets us handle recursive data like lists and trees.

In Chapter 4 we introduce the theory and programming patterns we need to generalise the techniques from earlier chapters. Specifically we look at dependent types, dependently typed functions, and type level programming. This allows us to access more advanced applications of shapeless.

In Chapter 5 we introduce LabelledGeneric, a variant of Generic that exposes field and type names as part of its generic representations. We also introduce additional theory: literal types, singleton types, phantom types, and type tagging. We demonstrate LabelledGeneric by creating a JSON encoder that preserves field and type names in its output.

In Part II we introduce the “ops type classes” provided in the shapeless.ops package. Ops type classes form an extensive library of tools for manipulating generic representations. Rather than discuss every op in detail, we provide a theoretical primer in three chapters:

In Chapter 6 we discuss the general layout of the ops type classes and provide an example that strings several simple ops together to form a powerful “case class migration” tool.

In Chapter 7 we introduce polymorphic functions, also known as Polys, and show how they are used in ops type classes for mapping, flat mapping, and folding over generic representations.

Finally, in Chapter 8 we introduce the Nat type that shapeless uses to represent natural numbers at the type level. We introduce several related ops type classes, and use Nat to develop our own version of ScalaCheck’s Arbitrary.

1.3 Source code and examples

This book is open source. You can find the Markdown source on GitHub. The book receives constant updates from the community so be sure to check the GitHub repo for the most up-to-date version.

We also maintain a copy of the book on the Underscore web site. If you grab a copy of the book from there we will notify you whenever we release an update.

There are complete implementations of the major examples in an accompanying repo. See the README for installation details. We assume shapeless 2.3.2 and either Typelevel Scala 2.11.8+ or Lightbend Scala 2.11.9+ / 2.12.1+.

Most of the examples in this book are compiled and executed using version 2.12.1 of the Typelevel Scala compiler. Among other niceties this version of Scala introduces infix type printing, which cleans up the console output on the REPL:

Don’t panic! Aside from the printed form of the result (infix versus prefix syntax), these types are the same. If you find the prefix types difficult to read, we recommend upgrading to a newer version of Scala. Simply add the following to your build.sbt, substituting in contemporary version numbers as appropriate:

scalaOrganization := "org.typelevel"
scalaVersion := "2.12.1"

The scalaOrganization setting is only supported in SBT 0.13.13 or later. You can specify an SBT version by writing the following in project/build.properties (create the file if it isn’t there in your project):

sbt.version=0.13.13

1.4 Acknowledgements

Thanks to Miles Sabin, Richard Dallaway, Noel Welsh, and Travis Brown, for their invaluable contributions to this guide.

Special thanks to Sam Halliday for his excellent workshop Shapeless for Mortals, which provided the initial inspiration and skeleton, and to Rob Norris and his fellow contributors for the awesome Tut, which keeps our examples compiling correctly.

Finally, thanks to everyone who has contributed on GitHub. Here is an alphabetical list of contributors:

A separately maintained French translation of this book is also available on GitHub. Thanks to etienne and fellow contributors for producing and maintaining it!

If you spot an error or potential improvement, please raise an issue or submit a PR on the GitHub page.

2 Algebraic data types and generic representations

The main idea behind generic programming is to solve problems for a wide variety of types by writing a small amount of generic code. Shapeless provides two sets of tools to this end:

a set of generic data types that can be inspected, traversed, and manipulated at the type level;

automatic mapping between algebraic data types (ADTs), encoded in Scala as case classes and sealed traits, and these generic representations.

In this chapter we will start with a recap of the theory of algebraic data types and why they might be familiar to Scala developers. Then we will look at generic representations used by shapeless and discuss how they map on to concrete ADTs. Finally, we will introduce a type class called Generic that provides automatic mapping back and forth between ADTs and generic representations. We will finish with some simple examples using Generic to convert values from one type to another.

2.1 Recap: algebraic data types

Algebraic data types (ADTs) are a functional programming concept with a fancy name but a very simple meaning. They are an idiomatic way of representing data using “ands” and “ors”. For example:

a shape is a rectangle or a circle

a rectangle has a width and a height

a circle has a radius

In ADT terminology, “and” types such as rectangle and circle are called products and “or” types such as shape are called coproducts. In Scala we typically represent products using case classes and coproducts using sealed traits:
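The shape example can be sketched in plain Scala as follows. The area method is an illustrative addition that shows how pattern matching works over the coproduct:

```scala
object Shapes {
  // Coproduct: a Shape is a Rectangle or a Circle.
  sealed trait Shape
  // Products: a Rectangle has a width and a height; a Circle has a radius.
  final case class Rectangle(width: Double, height: Double) extends Shape
  final case class Circle(radius: Double) extends Shape

  // Because Shape is sealed, the compiler can check this match for exhaustivity.
  def area(shape: Shape): Double = shape match {
    case Rectangle(w, h) => w * h
    case Circle(r)       => math.Pi * r * r
  }
}
```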

2.1.1 Alternative encodings

Sealed traits and case classes are undoubtedly the most convenient encoding of ADTs in Scala. However, they aren’t the only encoding. For example, the Scala standard library provides generic products in the form of Tuples and a generic coproduct in the form of Either. We could have chosen these to encode our Shape:
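One way to write this alternative encoding, using the names Rectangle2, Circle2, and Shape2 from the discussion that follows. The area2 method is an illustrative addition:

```scala
object TupleEitherEncoding {
  // Generic product: a pair of Doubles stands in for width and height.
  type Rectangle2 = (Double, Double)
  // A circle is just its radius.
  type Circle2 = Double
  // Generic coproduct: Either stands in for the sealed trait.
  type Shape2 = Either[Rectangle2, Circle2]

  def area2(shape: Shape2): Double = shape match {
    case Left((w, h)) => w * h
    case Right(r)     => math.Pi * r * r
  }
}
```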

Importantly, Shape2 is a more generic encoding than Shape. Any code that operates on a pair of Doubles will be able to operate on a Rectangle2 and vice versa. As Scala developers we tend to prefer semantic types like Rectangle and Circle to generic ones like Rectangle2 and Circle2 precisely because of their specialised nature. However, in some cases generality is desirable. For example, if we’re serializing data to disk, we don’t care about the difference between a pair of Doubles and a Rectangle2. We just write two numbers and we’re done.

Shapeless gives us the best of both worlds: we can use friendly semantic types by default and switch to generic representations when we want interoperability (more on this later). However, instead of using Tuples and Either, shapeless uses its own data types to represent generic products and coproducts. We’ll introduce these types in the next sections.

2.2 Generic product encodings

In the previous section we introduced tuples as a generic representation of products. Unfortunately, Scala’s built-in tuples have a couple of disadvantages that make them unsuitable for shapeless’ purposes:

Each size of tuple has a different, unrelated type, making it difficult to write code that abstracts over sizes.

There is no type for zero-length tuples, which are important for representing products with zero fields. We could arguably use Unit, but we ideally want all generic representations to have a sensible common supertype. The least upper bound of Unit and Tuple2 is Any so a combination of the two is impractical.

For these reasons, shapeless uses a different generic encoding for product types called heterogeneous lists or HLists.

An HList is either the empty list HNil, or a pair ::[H, T] where H is an arbitrary type and T is another HList. Because every :: has its own H and T, the type of each element is encoded separately in the type of the overall list:
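A short sketch, assuming shapeless 2.3.x. The element values are arbitrary:

```scala
import shapeless.{::, HList, HNil}

object HListBasics {
  // The type records one entry per element, in order.
  val product: String :: Int :: Boolean :: HNil =
    "Sunday" :: 1 :: false :: HNil

  // head and tail are statically typed, so no casts are needed.
  val first: String = product.head
  val second: Int = product.tail.head
  val rest: Boolean :: HNil = product.tail.tail
}
```

Calling head on an empty list, for example product.tail.tail.tail.head, is rejected at compile time rather than failing at runtime.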

We can manipulate and transform HLists in addition to being able to inspect and traverse them. For example, we can prepend an element with the :: method. Again, notice how the type of the result reflects the number and types of its elements:

val newProduct = 42L :: product

Shapeless also provides tools for performing more complex operations such as mapping, filtering, and concatenating lists. We’ll discuss these in more detail in Part II.

The behaviour we get from HLists isn’t magic. We could have achieved all of this functionality using (A, B) and Unit as alternatives to :: and HNil. However, there is an advantage in keeping our representation types separate from the semantic types used in our applications. HList provides this separation.

2.2.1 Switching representations using Generic

Shapeless provides a type class called Generic that allows us to switch back and forth between a concrete ADT and its generic representation. Some behind-the-scenes macro magic allows us to summon instances of Generic without boilerplate:

Note that the instance of Generic has a type member Repr containing the type of its generic representation. In this case iceCreamGen.Repr is String :: Int :: Boolean :: HNil. Instances of Generic have two methods: one for converting to the Repr type and one for converting from it:
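A sketch of summoning and using a Generic instance, assuming shapeless 2.3.x. The case class field names are illustrative:

```scala
import shapeless._

object GenericProducts {
  case class IceCream(name: String, numCherries: Int, inCone: Boolean)
  case class Employee(name: String, number: Int, manager: Boolean)

  // Summon the instance; its Repr is String :: Int :: Boolean :: HNil.
  val iceCreamGen = Generic[IceCream]

  val iceCream = IceCream("Sundae", 1, false)

  // to: concrete type => generic representation
  val repr: String :: Int :: Boolean :: HNil = iceCreamGen.to(iceCream)

  // from: generic representation => concrete type
  val iceCream2: IceCream = iceCreamGen.from(repr)

  // Two ADTs with the same Repr can be converted via their Generics.
  val employee: Employee = Generic[Employee].from(repr)
}
```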

In versions 2.10 and earlier, Scala had a limit of 22 fields for case classes. This limit was nominally fixed in 2.11, but using HLists will help avoid the remaining limitations of 22 fields in Scala.

2.3 Generic coproducts

Now we know how shapeless encodes product types. What about coproducts? We looked at Either earlier but that suffers from similar drawbacks to tuples. Again, shapeless provides its own encoding that is similar to HList:

In general coproducts take the form A :+: B :+: C :+: CNil meaning “A or B or C”, where :+: can be loosely interpreted as Either. The overall type of a coproduct encodes all the possible types in the disjunction, but each concrete instance contains a value for just one of the possibilities. :+: has two subtypes, Inl and Inr, that correspond loosely to Left and Right. We create instances of a coproduct by nesting Inl and Inr constructors:
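For illustration, assuming shapeless 2.3.x and three hypothetical case classes representing traffic light colours:

```scala
import shapeless.{:+:, CNil, Inl, Inr}

object CoproductBasics {
  case class Red()
  case class Amber()
  case class Green()

  // "Red or Amber or Green"
  type Light = Red :+: Amber :+: Green :+: CNil

  // The first alternative is selected with a single Inl.
  val red: Light = Inl(Red())

  // Later alternatives are reached by skipping earlier ones with Inr.
  val green: Light = Inr(Inr(Inl(Green())))
}
```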

Every coproduct type is terminated with CNil, which is an empty type with no values, similar to Nothing. We can’t instantiate CNil or build a Coproduct purely from instances of Inr. We always have exactly one Inl in a value.

Again, it’s worth stating that Coproducts aren’t particularly special. The functionality above can be achieved using Either and Nothing in place of :+: and CNil. There are technical difficulties with using Nothing, but we could have used any other uninhabited or arbitrary singleton type in place of CNil.

2.3.1 Switching encodings using Generic

Coproduct types are difficult to parse on first glance. However, we can see how they fit into the larger picture of generic encodings. In addition to understanding case classes and case objects, shapeless’ Generic type class also understands sealed traits and abstract classes:

The Repr of the Generic for Shape is a Coproduct of the subtypes of the sealed trait: Rectangle :+: Circle :+: CNil. We can use the to and from methods of the generic to map back and forth between Shape and gen.Repr:
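A sketch of the round trip, assuming shapeless 2.3.x. The sketch deliberately does not annotate the Repr type, because the order of the alternatives is chosen by shapeless and may differ between the REPL and compiled code:

```scala
import shapeless._

object GenericCoproducts {
  sealed trait Shape
  final case class Rectangle(width: Double, height: Double) extends Shape
  final case class Circle(radius: Double) extends Shape

  val gen = Generic[Shape]

  // to: wraps the value in the Inl/Inr nesting for its alternative.
  val repr = gen.to(Circle(1.0))

  // from: unwraps the coproduct back to the sealed trait.
  val back: Shape = gen.from(repr)
}
```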

2.4 Summary

In this chapter we discussed the generic representations shapeless provides for algebraic data types in Scala: HLists for product types and Coproducts for coproduct types. We also introduced the Generic type class to map back and forth between concrete ADTs and their generic representations. We haven’t yet discussed why generic encodings are so attractive. The one use case we did cover—converting between ADTs—is fun but not tremendously useful.

The real power of HLists and Coproducts comes from their recursive structure. We can write code to traverse representations and calculate values from their constituent elements. In the next chapter we will look at our first real use case: automatically deriving type class instances.

3 Automatically deriving type class instances

In the last chapter we saw how the Generic type class allowed us to convert any instance of an ADT to a generic encoding made of HLists and Coproducts. In this chapter we will look at our first serious use case: automatic derivation of type class instances.

3.1 Recap: type classes

Before we get into the depths of instance derivation, let’s quickly recap on the important aspects of type classes.

Type classes are a programming pattern borrowed from Haskell (the word “class” has nothing to do with classes in object oriented programming). We encode them in Scala using traits and implicits. A type class is a parameterised trait representing some sort of general functionality that we would like to apply to a wide range of types:

// Turn a value of type A into a row of cells in a CSV file:
trait CsvEncoder[A] {
  def encode(value: A): List[String]
}

We implement our type class with instances for each type we care about. If we want the instances to automatically be in scope we can place them in the type class’ companion object. Otherwise we can place them in a separate library object for the user to import manually:

3.1.1 Resolving instances

Type classes are very flexible but they require us to define instances for every type we care about. Fortunately, the Scala compiler has a few tricks up its sleeve to resolve instances for us given sets of user-defined rules. For example, we can write a rule that creates a CsvEncoder for (A, B) given CsvEncoders for A and B:

When all the parameters to an implicit def are themselves marked as implicit, the compiler can use it as a resolution rule to create instances from other instances. For example, if we call writeCsv and pass in a List[(Employee, IceCream)], the compiler is able to combine pairEncoder, employeeEncoder, and iceCreamEncoder to produce the required CsvEncoder[(Employee, IceCream)]:
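The resolution described here can be sketched in plain Scala. The names pairEncoder, employeeEncoder, iceCreamEncoder, and writeCsv come from the text; their bodies, the helper createEncoder, and the case class fields are assumptions made for illustration:

```scala
object PairResolution {
  trait CsvEncoder[A] {
    def encode(value: A): List[String]
  }

  // Hypothetical helper that builds an instance from a function.
  def createEncoder[A](func: A => List[String]): CsvEncoder[A] =
    new CsvEncoder[A] {
      def encode(value: A): List[String] = func(value)
    }

  case class Employee(name: String, number: Int, manager: Boolean)
  case class IceCream(name: String, numCherries: Int, inCone: Boolean)

  implicit val employeeEncoder: CsvEncoder[Employee] =
    createEncoder(e => List(e.name, e.number.toString, if (e.manager) "yes" else "no"))

  implicit val iceCreamEncoder: CsvEncoder[IceCream] =
    createEncoder(i => List(i.name, i.numCherries.toString, if (i.inCone) "yes" else "no"))

  // A rule: given encoders for A and B, build an encoder for (A, B).
  implicit def pairEncoder[A, B](
    implicit
    aEncoder: CsvEncoder[A],
    bEncoder: CsvEncoder[B]
  ): CsvEncoder[(A, B)] =
    createEncoder { case (a, b) => aEncoder.encode(a) ++ bEncoder.encode(b) }

  def writeCsv[A](values: List[A])(implicit encoder: CsvEncoder[A]): String =
    values.map(value => encoder.encode(value).mkString(",")).mkString("\n")

  // The compiler assembles pairEncoder(employeeEncoder, iceCreamEncoder) for us.
  val csv: String =
    writeCsv(List((Employee("Bill", 1, true), IceCream("Sundae", 1, false))))
}
```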

Given a set of rules encoded as implicit vals and implicit defs, the compiler is capable of searching for combinations to give it the required instances. This behaviour, known as “implicit resolution”, is what makes the type class pattern so powerful in Scala.

Even with this power, the compiler can’t pull apart our case classes and sealed traits. We are required to define instances for ADTs by hand. Shapeless’ generic representations change all of this, allowing us to derive instances for any ADT for free.

However, as we will see in Section 4.2, when working with shapeless we encounter situations where implicitly doesn’t infer types correctly. We can always define the summoner method to do the right thing, so it’s worth writing one for every type class we create. We can also use a special method from shapeless called “the” (more on this later):

Unfortunately, several limitations of typesetting code in a book prevent us writing long singletons containing lots of methods and instances. We therefore tend to describe definitions outside of their context in the companion object. Bear this in mind as you read and check the accompanying repo linked in Section 1.3 for complete worked examples.

3.2 Deriving instances for products

In this section we’re going to use shapeless to derive type class instances for product types (i.e. case classes). We’ll use two intuitions:

If we have type class instances for the head and tail of an HList, we can derive an instance for the whole HList.

If we have a case class A, a Generic[A], and a type class instance for the generic’s Repr, we can combine them to create an instance for A.

Take CsvEncoder and IceCream as examples:

IceCream has a generic Repr of type String :: Int :: Boolean :: HNil.

The Repr is made up of a String, an Int, a Boolean, and an HNil. If we have CsvEncoders for these types, we can create an encoder for the whole thing.

If we can derive a CsvEncoder for the Repr, we can create one for IceCream.

3.2.1 Instances for HLists

Let’s start by defining an instance constructor and CsvEncoders for String, Int, and Boolean:
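A sketch of these building blocks, assuming shapeless 2.3.x. The name createEncoder for the instance constructor and the "yes"/"no" rendering of Booleans are illustrative choices. The sketch also includes rules for HNil and ::, following the first intuition listed in Section 3.2:

```scala
import shapeless.{::, HList, HNil}

object HListEncoders {
  trait CsvEncoder[A] {
    def encode(value: A): List[String]
  }

  // Instance constructor: saves writing an anonymous class each time.
  def createEncoder[A](func: A => List[String]): CsvEncoder[A] =
    new CsvEncoder[A] {
      def encode(value: A): List[String] = func(value)
    }

  implicit val stringEncoder: CsvEncoder[String] =
    createEncoder(str => List(str))

  implicit val intEncoder: CsvEncoder[Int] =
    createEncoder(num => List(num.toString))

  implicit val booleanEncoder: CsvEncoder[Boolean] =
    createEncoder(bool => List(if (bool) "yes" else "no"))

  // Base case: the empty list contributes no cells.
  implicit val hnilEncoder: CsvEncoder[HNil] =
    createEncoder(_ => Nil)

  // Inductive case: encode the head, then the tail.
  implicit def hlistEncoder[H, T <: HList](
    implicit
    hEncoder: CsvEncoder[H],
    tEncoder: CsvEncoder[T]
  ): CsvEncoder[H :: T] =
    createEncoder { case h :: t => hEncoder.encode(h) ++ tEncoder.encode(t) }

  val reprEncoder: CsvEncoder[String :: Int :: Boolean :: HNil] = implicitly

  val row: List[String] = reprEncoder.encode("abc" :: 123 :: true :: HNil)
}
```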

This solution is specific to IceCream. Ideally we’d like to have a single rule that handles all case classes that have a Generic and a matching CsvEncoder. Let’s work through the derivation step by step. Here’s a first cut:

The problem here is a scoping issue: we can’t refer to a type member of one parameter from another parameter in the same block. The trick to solving this is to introduce a new type parameter to our method and refer to it in each of the associated value parameters:

We’ll cover this coding style in more detail in the next chapter. Suffice to say, this definition now compiles and works as expected, and we can use it with any case class. Intuitively, this definition says:

Given a type A and an HList type R, an implicit Generic to map A to R, and a CsvEncoder for R, create a CsvEncoder for A.
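The rule stated here can be sketched as follows, assuming shapeless 2.3.x. The supporting encoders are repeated so the block stands alone; createEncoder and the IceCream field names are illustrative:

```scala
import shapeless.{::, Generic, HList, HNil}

object GenericDerivation {
  case class IceCream(name: String, numCherries: Int, inCone: Boolean)

  trait CsvEncoder[A] { def encode(value: A): List[String] }

  def createEncoder[A](func: A => List[String]): CsvEncoder[A] =
    new CsvEncoder[A] { def encode(value: A): List[String] = func(value) }

  implicit val stringEncoder: CsvEncoder[String] = createEncoder(s => List(s))
  implicit val intEncoder: CsvEncoder[Int] = createEncoder(n => List(n.toString))
  implicit val booleanEncoder: CsvEncoder[Boolean] =
    createEncoder(b => List(if (b) "yes" else "no"))
  implicit val hnilEncoder: CsvEncoder[HNil] = createEncoder(_ => Nil)

  implicit def hlistEncoder[H, T <: HList](
    implicit hEncoder: CsvEncoder[H], tEncoder: CsvEncoder[T]
  ): CsvEncoder[H :: T] =
    createEncoder { case h :: t => hEncoder.encode(h) ++ tEncoder.encode(t) }

  // R is a type parameter, so both implicit parameters can refer to it.
  implicit def genericEncoder[A, R](
    implicit
    gen: Generic[A] { type Repr = R },
    enc: CsvEncoder[R]
  ): CsvEncoder[A] =
    createEncoder(a => enc.encode(gen.to(a)))

  val row: List[String] =
    implicitly[CsvEncoder[IceCream]].encode(IceCream("Sundae", 1, false))
}
```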

We now have a complete system that handles any case class. The compiler expands a call like:

In this case the error message is relatively easy to understand. If shapeless can’t calculate a Generic it means that the type in question isn’t an ADT—somewhere in the algebra there is a type that isn’t a case class or a sealed abstract type.

The other potential source of failure is when the compiler can’t calculate a CsvEncoder for our HList. This normally happens because we don’t have an encoder for one of the fields in our ADT. For example, we haven’t yet defined a CsvEncoder for java.util.Date, so the following code fails:

The message we get here isn’t very helpful. All the compiler knows is it tried a lot of combinations of implicits and couldn’t make them work. It has no idea which combination came closest to the desired result, so it can’t tell us where the source(s) of failure lie.

There’s not much good news here. We have to find the source of the error ourselves by a process of elimination. We’ll discuss debugging techniques in Section 3.5. For now, the main redeeming feature is that implicit resolution always fails at compile time. There’s little chance that we will end up with code that fails during execution.

3.3 Deriving instances for coproducts

In the last section we created a set of rules to automatically derive a CsvEncoder for any product type. In this section we will apply the same patterns to coproducts. Let’s return to our shape ADT as an example:

The generic representation for Shape is Rectangle :+: Circle :+: CNil. In Section 3.2.2 we defined product encoders for Rectangle and Circle. Now, to write generic CsvEncoders for :+: and CNil, we can use the same principles we used for HLists:

Because Coproducts are disjunctions of types, the encoder for :+: has to choose whether to encode a left or right value. We pattern match on the two subtypes of :+:, which are Inl for left and Inr for right.

Alarmingly, the encoder for CNil throws an exception! Don’t panic, though. Remember that we can’t create values of type CNil, so the throw expression is dead code. It’s ok to fail abruptly here because we will never reach this point.
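A sketch of the two coproduct rules, assuming shapeless 2.3.x. To keep the block self-contained it is exercised on a hand-written coproduct of Int and String rather than on Shape. The helper createEncoder, the exception message, and the type alias are illustrative:

```scala
import shapeless.{:+:, CNil, Coproduct, Inl, Inr}

object CoproductEncoders {
  trait CsvEncoder[A] { def encode(value: A): List[String] }

  def createEncoder[A](func: A => List[String]): CsvEncoder[A] =
    new CsvEncoder[A] { def encode(value: A): List[String] = func(value) }

  implicit val intEncoder: CsvEncoder[Int] = createEncoder(n => List(n.toString))
  implicit val stringEncoder: CsvEncoder[String] = createEncoder(s => List(s))

  // Never called at runtime: there are no values of type CNil.
  implicit val cnilEncoder: CsvEncoder[CNil] =
    createEncoder(_ => throw new Exception("Inconceivable!"))

  // Choose the head encoder for Inl and defer to the tail encoder for Inr.
  implicit def coproductEncoder[H, T <: Coproduct](
    implicit
    hEncoder: CsvEncoder[H],
    tEncoder: CsvEncoder[T]
  ): CsvEncoder[H :+: T] = createEncoder {
    case Inl(h) => hEncoder.encode(h)
    case Inr(t) => tEncoder.encode(t)
  }

  type IntOrString = Int :+: String :+: CNil

  val encoder: CsvEncoder[IntOrString] = implicitly[CsvEncoder[IntOrString]]

  val anInt: IntOrString = Inl(42)
  val aString: IntOrString = Inr(Inl("hello"))

  val left: List[String] = encoder.encode(anInt)
  val right: List[String] = encoder.encode(aString)
}
```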

If we place these definitions alongside our product encoders from Section 3.2, we should be able to serialize a list of shapes. Let’s give it a try:

There is a Scala compiler bug called SI-7046 that can cause coproduct generic resolution to fail. The bug causes certain parts of the macro API, on which shapeless depends, to be sensitive to the order of the definitions in our source code. Problems can often be worked around by reordering code and renaming files, but such workarounds tend to be volatile and unreliable.

If you are using Lightbend Scala 2.11.8 or earlier and coproduct resolution fails for you, consider upgrading to Lightbend Scala 2.11.9 or Typelevel Scala 2.11.8. SI-7046 is fixed in each of these releases.

3.3.1 Aligning CSV output

Our CSV encoder isn’t very practical in its current form. It allows fields from Rectangle and Circle to occupy the same columns in the output. To fix this problem we need to modify the definition of CsvEncoder to incorporate the width of the data type and space the output accordingly. The examples repo linked in Section 1.3 contains a complete implementation of CsvEncoder that addresses this problem.

The problem is that our type is recursive. The compiler senses an infinite loop applying our implicits and gives up.

3.4.1 Implicit divergence

Implicit resolution is a search process. The compiler uses heuristics to determine whether it is “converging” on a solution. If the heuristics don’t yield favorable results for a particular branch of search, the compiler assumes the branch is not converging and moves onto another.

One heuristic is specifically designed to avoid infinite loops. If the compiler sees the same target type twice in a particular branch of search, it gives up and moves on. We can see this happening if we look at the expansion for CsvEncoder[Tree[Int]]. The implicit resolution process goes through the following types:

We see Tree[A] twice in lines 1 and 5, so the compiler moves onto another branch of search. The eventual consequence is that it fails to find a suitable implicit.

In fact, the situation is worse than this. If the compiler sees the same type constructor twice and the complexity of the type parameters is increasing, it assumes that branch of search is “diverging”. This is a problem for shapeless because types like ::[H, T] and :+:[H, T] can appear several times as the compiler expands different generic representations. This causes the compiler to give up prematurely even though it would eventually find a solution if it persisted with the same expansion. Consider the following types:

The compiler attempts to resolve a CsvEncoder[::[H, T]] twice in this branch of search, on lines 2 and 4. The type parameter for T is more complex on line 4 than on line 2, so the compiler assumes (incorrectly in this case) that the branch of search is diverging. It moves onto another branch and, again, the result is failure to generate a suitable instance.

3.4.2 Lazy

Implicit divergence would be a show-stopper for libraries like shapeless. Fortunately, shapeless provides a type called Lazy as a workaround. Lazy does two things:

it suppresses implicit divergence at compile time by guarding against the aforementioned over-defensive convergence heuristics;

it defers evaluation of the implicit parameter at runtime, permitting the derivation of self-referential implicits.

We use Lazy by wrapping it around specific implicit parameters. As a rule of thumb, it is always a good idea to wrap the “head” parameter of any HList or Coproduct rule and the Repr parameter of any Generic rule in Lazy:
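A sketch of the rule of thumb applied to the CSV encoder, assuming shapeless 2.3.x and a hypothetical recursive Tree ADT with Branch and Leaf constructors. Note the calls to .value, which unwrap the Lazy parameter only when encoding actually happens:

```scala
import shapeless.{:+:, ::, CNil, Coproduct, Generic, HList, HNil, Inl, Inr, Lazy}

object LazyDerivation {
  sealed trait Tree[A]
  case class Branch[A](left: Tree[A], right: Tree[A]) extends Tree[A]
  case class Leaf[A](value: A) extends Tree[A]

  trait CsvEncoder[A] { def encode(value: A): List[String] }

  def createEncoder[A](func: A => List[String]): CsvEncoder[A] =
    new CsvEncoder[A] { def encode(value: A): List[String] = func(value) }

  implicit val intEncoder: CsvEncoder[Int] = createEncoder(n => List(n.toString))
  implicit val hnilEncoder: CsvEncoder[HNil] = createEncoder(_ => Nil)
  implicit val cnilEncoder: CsvEncoder[CNil] =
    createEncoder(_ => throw new Exception("Inconceivable!"))

  implicit def hlistEncoder[H, T <: HList](
    implicit
    hEncoder: Lazy[CsvEncoder[H]], // head wrapped in Lazy
    tEncoder: CsvEncoder[T]
  ): CsvEncoder[H :: T] = createEncoder {
    case h :: t => hEncoder.value.encode(h) ++ tEncoder.encode(t)
  }

  implicit def coproductEncoder[H, T <: Coproduct](
    implicit
    hEncoder: Lazy[CsvEncoder[H]], // head wrapped in Lazy
    tEncoder: CsvEncoder[T]
  ): CsvEncoder[H :+: T] = createEncoder {
    case Inl(h) => hEncoder.value.encode(h)
    case Inr(t) => tEncoder.encode(t)
  }

  implicit def genericEncoder[A, R](
    implicit
    gen: Generic.Aux[A, R],
    rEncoder: Lazy[CsvEncoder[R]] // Repr wrapped in Lazy
  ): CsvEncoder[A] =
    createEncoder(a => rEncoder.value.encode(gen.to(a)))

  val treeEncoder: CsvEncoder[Tree[Int]] = implicitly[CsvEncoder[Tree[Int]]]

  val row: List[String] = treeEncoder.encode(Branch(Leaf(1), Leaf(2)))
}
```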

The reason for the failure is that we haven’t defined a CsvEncoder for Float. However, this may not be obvious in application code. We can work through the expected expansion sequence to find the source of the error, inserting calls to CsvEncoder.apply or implicitly above the error to see if they compile. We start with the generic representation of Foo:

Int passes but Float fails. CsvEncoder[Float] is a leaf in our tree of expansions, so we know to start by implementing this missing instance. If adding the instance doesn’t fix the problem we repeat the process to find the next point of failure.

3.5.2 Debugging using reify

The reify method from scala.reflect takes a Scala expression as a parameter and returns an AST object representing the expression tree, complete with type annotations:

The types inferred during implicit resolution can give us hints about problems. After implicit resolution, any remaining existential types such as A or T provide a sign that something has gone wrong. Similarly, “top” and “bottom” types such as Any and Nothing are evidence of failure.

3.6 Summary

In this chapter we discussed how to use Generic, HLists, and Coproducts to automatically derive type class instances. We also covered the Lazy type as a means of handling complex/recursive types. Taking all of this into account, we can write a common skeleton for deriving type class instances as follows.

In the next chapter we’ll cover some useful theory and programming patterns to help write code in this style. In Chapter 5 we will revisit type class derivation using a variant of Generic that allows us to inspect field and type names in our ADTs.

4 Working with types and implicits

In the last chapter we saw one of the most compelling use cases for shapeless: automatically deriving type class instances. There are plenty of even more powerful examples coming later. However, before we move on, we should take time to discuss some theory we’ve skipped over and establish a set of patterns for writing and debugging type- and implicit-heavy code.

4.1 Dependent types

Last chapter we spent a lot of time using Generic, the type class for mapping ADT types to generic representations. However, we haven’t yet discussed an important bit of theory that underpins Generic and much of shapeless: dependent types.

To illustrate this, let’s take a closer look at Generic. Here’s a simplified version of the definition:

The answer is it depends on the instance we get for gen. In expanding the call to getRepr, the compiler will search for a Generic[A] and the result type will be whatever Repr is defined in that instance:
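A sketch of getRepr, assuming shapeless 2.3.x. The case classes Vec and Rect are hypothetical examples; the result type of each call is determined by the Generic instance the compiler finds:

```scala
import shapeless.Generic

object DependentTypes {
  // The result type is a path-dependent type: it refers to the value gen.
  def getRepr[A](value: A)(implicit gen: Generic[A]): gen.Repr =
    gen.to(value)

  case class Vec(x: Int, y: Int)
  case class Rect(origin: Vec, size: Vec)

  // Here gen.Repr is Int :: Int :: HNil ...
  val vecRepr = getRepr(Vec(1, 2))

  // ... and here it is Vec :: Vec :: HNil.
  val rectRepr = getRepr(Rect(Vec(0, 0), Vec(5, 5)))
}
```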

What we’re seeing here is called dependent typing: the result type of getRepr depends on its value parameters via their type members. Suppose we had specified Repr as type parameter on Generic instead of a type member:

We would have had to pass the desired value of Repr to getRepr as a type parameter, effectively making getRepr useless. The intuitive take-away from this is that type parameters are useful as “inputs” and type members are useful as “outputs”.

4.2 Dependently typed functions

Shapeless uses dependent types all over the place: in Generic, in Witness (which we will see in the next chapter), and in a host of other “ops” type classes that we will survey in Part II of this guide.

For example, shapeless provides a type class called Last that returns the last element in an HList. Here’s a simplified version of its definition:

This code uses the idiomatic layout described in Section 3.1.2. We define the Aux type in the companion object beside the standard apply method for summoning instances.

Summoner methods versus “implicitly” versus “the”

Note that the return type on apply is Aux[L, O], not Second[L]. This is important. Using Aux ensures the apply method does not erase the type members on summoned instances. If we define the return type as Second[L], the Out type member will be erased from the return type and the type class will not work correctly.

The implicitly method from scala.Predef has this behaviour. Compare the type of an instance of Last summoned with implicitly:

The type summoned by implicitly has no Out type member. For this reason, we should avoid implicitly when working with dependently typed functions. We can either use custom summoner methods, or we can use shapeless’ replacement method, the:
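A sketch of the difference, assuming shapeless 2.3.x and its Last type class from shapeless.ops.hlist:

```scala
import shapeless._
import shapeless.ops.hlist.Last

object SummoningLast {
  // Static type is Last[String :: Int :: HNil]: the Out member is not visible,
  // so the result of applying this instance cannot be used as an Int.
  val viaImplicitly = implicitly[Last[String :: Int :: HNil]]

  // `the` keeps the refinement, so Out is known to be Int.
  val viaThe = the[Last[String :: Int :: HNil]]

  // This compiles only because the Out type member has been preserved.
  val lastElement: Int = viaThe("foo" :: 123 :: HNil)
}
```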

4.3 Chaining dependent functions

Dependently typed functions provide a means of calculating one type from another. We can chain dependently typed functions to perform calculations involving multiple steps. For example, we should be able to use a Generic to calculate a Repr for a case class, and use a Last to calculate the type of the last element. Let’s try coding this:

Unfortunately our code doesn’t compile. This is the same problem we had in Section 3.2.2 with our definition of genericEncoder. We worked around the problem by lifting the free type variable out as a type parameter:
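A sketch of the working version, assuming shapeless 2.3.x. The method name lastField and the case classes Vec and Rect are illustrative. Repr is lifted out as a type parameter so that Generic and Last can both refer to it:

```scala
import shapeless._
import shapeless.ops.hlist.Last

object ChainingExample {
  case class Vec(x: Int, y: Int)
  case class Rect(origin: Vec, size: Vec)

  def lastField[A, Repr <: HList](input: A)(
    implicit
    gen: Generic.Aux[A, Repr], // step 1: find the Repr for A
    last: Last[Repr]           // step 2: find the last element type of Repr
  ): last.Out = last.apply(gen.to(input))

  val result = lastField(Rect(Vec(1, 2), Vec(3, 4)))
}
```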

As a general rule, we always write code in this style. By encoding all the free variables as type parameters, we enable the compiler to unify them with appropriate types. This goes for more subtle constraints as well. For example, suppose we wanted to summon a Generic for a case class of exactly one field. We might be tempted to write this:

The error message hints at the problem. The clue is in the appearance of the type H. This is the name of a type parameter in the method: it shouldn’t be appearing in the type the compiler is trying to unify. The problem is that the gen parameter is over-constrained: the compiler can’t find a Repr and ensure its length at the same time. The type Nothing also often provides a clue, appearing when the compiler fails to unify covariant type parameters.

The solution to our problem above is to separate implicit resolution into steps:

This doesn’t compile because the head method in the method body requires an implicit parameter of type IsHCons. This is a much simpler error message to fix—we just need to learn a tool from shapeless’ toolbox. IsHCons is a shapeless type class that splits an HList into a Head and Tail. We can use IsHCons instead of =:=:
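A sketch of the definition using IsHCons, assuming shapeless 2.3.x is on the classpath. The Wrapper case class is an illustrative single-field type:

```scala
import shapeless._
import shapeless.ops.hlist.IsHCons

object WrapperDemo {
  case class Wrapper(value: Int)

  // Step 1: gen finds the Repr for A.
  // Step 2: isHCons checks that Repr is exactly Head :: HNil.
  def getWrappedValue[A, Repr <: HList, Head](in: A)(
    implicit
    gen: Generic.Aux[A, Repr],
    isHCons: IsHCons.Aux[Repr, Head, HNil]
  ): Head = gen.to(in).head

  val result: Int = getWrappedValue(Wrapper(42))
}
```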

This fixes the bug. Both the method definition and the call site now compile as expected:

getWrappedValue(Wrapper(42))
// res17: Int = 42

The take home point here isn’t that we solved the problem using IsHCons. Shapeless provides a lot of tools like this (see Chapters 6 to 8), and we can supplement them where necessary with our own type classes. The important point is to understand the process we use to write code that compiles and is capable of finding solutions. We’ll finish off this section with a step-by-step guide summarising our findings so far.

4.4 Summary

When coding with shapeless, we are often trying to find a target type that depends on values in our code. This relationship is called dependent typing.

Problems involving dependent types can be conveniently expressed using implicit search, allowing the compiler to resolve intermediate and target types given a starting point at the call site.

We often have to use multiple steps to calculate a result (e.g. using a Generic to get a Repr, then using another type class to get to another type). When we do this, there are a few rules we can follow to ensure our code compiles and works as expected:

We should extract every intermediate type out to a type parameter. Many type parameters won’t be used in the result, but the compiler needs them to know which types it has to unify.

The compiler resolves implicits from left to right, backtracking if it can’t find a working combination. We should write implicits in the order we need them, using one or more type variables to connect them to previous implicits.

The compiler can only solve for one constraint at a time, so we mustn’t over-constrain any single implicit.

We should state the return type explicitly, specifying any type parameters and type members that may be needed elsewhere. Type members are often important, so we should use Aux types to preserve them where appropriate. If we don’t state them in the return type, they won’t be available to the compiler for further implicit resolution.

The Aux type alias pattern is useful for keeping code readable. We should look out for Aux aliases when using tools from the shapeless toolbox, and implement Aux aliases on our own dependently typed functions.

When we find a useful chain of dependently typed operations we can capture them as a single type class. This is sometimes called the “lemma” pattern (a term borrowed from mathematical proofs). We’ll see an example of this pattern in Section 6.2.

5 Accessing names during implicit derivation

Often, the type class instances we define need access to more than just types. In this chapter we will look at a variant of Generic called LabelledGeneric that gives us access to field names and type names.

To begin with we have some theory to cover. LabelledGeneric uses some clever techniques to expose name information at the type level. To understand these techniques we must discuss literal types, singleton types, phantom types, and type tagging.

5.1 Literal types

A Scala value may have multiple types. For example, the string "hello" has at least three types: String, AnyRef, and Any[5]:

Interestingly, "hello" also has another type: a “singleton type” that belongs exclusively to that one value. This is similar to the singleton type we get when we define a companion object:

object Foo
Foo
// res3: Foo.type = Foo$@4464b13a

The type Foo.type is the type of Foo, and Foo is the only value with that type.

Singleton types applied to literal values are called literal types. These have existed in Scala for a long time, but we don’t normally interact with them because the default behaviour of the compiler is to “widen” literals to their nearest non-singleton type. For example, these two expressions are essentially equivalent:
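A small sketch of widening in plain Scala. The object and method names are ours:

```scala
object WideningDemo {
  // The compiler widens the literal to String when inferring the type.
  val inferred = "hello"

  // Writing the type out explicitly gives the same result.
  val ascribed = ("hello": String)

  // Both values are ordinary Strings as far as the rest of the code is concerned.
  def shout(s: String): String = s.toUpperCase
  val results = List(shout(inferred), shout(ascribed))
}
```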

math.sqrt(4).narrow
// <console>:17: error: Expression scala.math.`package`.sqrt(4.0) does not evaluate to a constant or a stable reference value
//        math.sqrt(4.0).narrow
//                 ^
// <console>:17: error: value narrow is not a member of Double
//        math.sqrt(4.0).narrow
//                       ^

Literal types in Scala

Until recently, Scala had no syntax for writing literal types. The types were there in the compiler but we couldn’t express them directly in code. However, as of Lightbend Scala 2.12.1, Lightbend Scala 2.11.9, and Typelevel Scala 2.11.8 we have direct syntax support for literal types. In these versions of Scala we can use the -Yliteral-types compiler option and write declarations like the following:

val theAnswer: 42 = 42
// theAnswer: 42 = 42

The type 42 is the same as the type Int(42) we saw in printed output earlier. You’ll still see Int(42) in output for legacy reasons, but the canonical syntax going forward is 42.

5.2 Type tagging and phantom types

Shapeless uses literal types to model the names of fields in case classes. It does this by “tagging” the types of the fields with the literal types of their names. Before we see how shapeless does this, we’ll do it ourselves to show that there’s no magic (well… minimal magic, at any rate). Suppose we have a number:

val number = 42

This number is an Int in two worlds: at runtime, where it has an actual value and methods that we can call, and at compile-time, where the compiler uses the type to calculate which pieces of code work together and to search for implicits.

We can modify the type of number at compile time without modifying its run-time behaviour by “tagging” it with a “phantom type”. Phantom types are types with no run-time semantics, like this:

trait Cherries

We can tag number using asInstanceOf. We end up with a value that is both an Int and a Cherries at compile-time, and an Int at run-time:
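A sketch of the cast in plain Scala. The object name TagDemo is ours:

```scala
object TagDemo {
  // A phantom type: it has no members and no run-time role.
  trait Cherries

  val number = 42

  // The cast changes only the compile-time type; the run-time value is still 42.
  val numCherries = number.asInstanceOf[Int with Cherries]

  // Int methods remain available on the tagged value.
  val sum: Int = numCherries + 1
}
```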

Shapeless uses this trick to tag fields and subtypes in an ADT with the singleton types of their names. If you find using asInstanceOf uncomfortable then don’t worry: shapeless provides two tagging syntaxes to avoid such unsavoriness.

The first syntax, ->>, tags the expression on the right of the arrow with the singleton type of the literal expression on the left:

FieldType is a type alias that simplifies extracting the tag and base types from a tagged type:

type FieldType[K, V] = V with KeyTag[K, V]

As we’ll see in a moment, shapeless uses this mechanism to tag fields and subtypes with their names in our source code.

Tags exist purely at compile time and have no runtime representation. How do we convert them to values we can use at runtime? Shapeless provides a type class called Witness for this purpose[6]. If we combine Witness and FieldType, we get something very compelling: the ability to extract the field name from a tagged field:
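A sketch of this combination, assuming shapeless 2.3.x is on the classpath. The helper names getFieldName and getFieldValue are illustrative:

```scala
import shapeless.Witness
import shapeless.labelled.FieldType
import shapeless.syntax.singleton._

object FieldNameDemo {
  // Tag the value 123 with the literal type of "numCherries".
  val numCherries = "numCherries" ->> 123

  // Witness turns the key type K back into a run-time value.
  def getFieldName[K, V](value: FieldType[K, V])(
    implicit witness: Witness.Aux[K]
  ): K = witness.value

  // The tagged value is still a V, so no conversion is needed.
  def getFieldValue[K, V](value: FieldType[K, V]): V = value

  val name = getFieldName(numCherries)
  val value = getFieldValue(numCherries)
}
```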

If we build an HList of tagged elements, we get a data structure that has some of the properties of a Map. We can reference fields by tag, manipulate and replace them, and maintain all of the type and naming information along the way. Shapeless calls these structures “records”.

We don’t need to go into depth regarding records here; suffice to say that records are the generic representation used by LabelledGeneric. LabelledGeneric tags each item in a product or coproduct with the corresponding field or type name from the concrete ADT (although the names are represented as Symbols, not Strings). Shapeless provides a suite of Map-like operations on records, some of which we’ll cover in Section 6.4. For now, though, let’s derive some type classes using LabelledGeneric.

5.3 Deriving product instances with LabelledGeneric

We’ll use a running example of JSON encoding to illustrate LabelledGeneric. We’ll define a JsonEncoder type class that converts values to a JSON AST. This is the approach taken by Argonaut, Circe, Play JSON, Spray JSON, and many other Scala JSON libraries.

The type here is slightly more complex than we have seen. Instead of representing the field names with literal string types, shapeless is representing them with symbols tagged with literal string types. The details of the implementation aren’t particularly important: we can still use Witness and FieldType to extract the tags, but they come out as Symbols instead of Strings[7].

5.3.1 Instances for HLists

Let’s define JsonEncoder instances for HNil and ::. Our encoders are going to generate and manipulate JsonObjects, so we’ll introduce a new type of encoder to make that easier:

We can access the value of K using witness.value, but the compiler has no way of knowing what type of tag we’re going to get. LabelledGeneric uses Symbols for tags, so we’ll put a type bound on K and use symbol.name to convert it to a String:
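The following sketch puts these pieces together for products, assuming shapeless 2.3.x is on the classpath. The JSON AST, the helper methods createEncoder and createObjectEncoder, and the instance names are illustrative assumptions rather than a definitive implementation:

```scala
import shapeless._
import shapeless.labelled.FieldType

// A small, hypothetical JSON AST.
sealed trait JsonValue
final case class JsonObject(fields: List[(String, JsonValue)]) extends JsonValue
final case class JsonString(value: String) extends JsonValue
final case class JsonNumber(value: Double) extends JsonValue
final case class JsonBoolean(value: Boolean) extends JsonValue

trait JsonEncoder[A] {
  def encode(value: A): JsonValue
}

// Encoders that are known to produce objects, used for HLists.
trait JsonObjectEncoder[A] extends JsonEncoder[A] {
  def encode(value: A): JsonObject
}

object JsonEncoder {
  def createEncoder[A](f: A => JsonValue): JsonEncoder[A] =
    new JsonEncoder[A] { def encode(value: A): JsonValue = f(value) }

  def createObjectEncoder[A](f: A => JsonObject): JsonObjectEncoder[A] =
    new JsonObjectEncoder[A] { def encode(value: A): JsonObject = f(value) }

  implicit val stringEncoder: JsonEncoder[String] =
    createEncoder[String](s => JsonString(s))
  implicit val intEncoder: JsonEncoder[Int] =
    createEncoder[Int](n => JsonNumber(n.toDouble))
  implicit val booleanEncoder: JsonEncoder[Boolean] =
    createEncoder[Boolean](b => JsonBoolean(b))

  implicit val hnilEncoder: JsonObjectEncoder[HNil] =
    createObjectEncoder[HNil](_ => JsonObject(Nil))

  // K <: Symbol lets us call .name on the witnessed key.
  implicit def hlistObjectEncoder[K <: Symbol, H, T <: HList](
    implicit
    witness: Witness.Aux[K],
    hEncoder: Lazy[JsonEncoder[H]],
    tEncoder: JsonObjectEncoder[T]
  ): JsonObjectEncoder[FieldType[K, H] :: T] = {
    val fieldName: String = witness.value.name
    createObjectEncoder[FieldType[K, H] :: T] { hlist =>
      val head = hEncoder.value.encode(hlist.head)
      val tail = tEncoder.encode(hlist.tail)
      JsonObject((fieldName, head) :: tail.fields)
    }
  }

  implicit def genericObjectEncoder[A, R](
    implicit
    generic: LabelledGeneric.Aux[A, R],
    rEncoder: Lazy[JsonObjectEncoder[R]]
  ): JsonEncoder[A] =
    createObjectEncoder[A](value => rEncoder.value.encode(generic.to(value)))
}

object JsonDemo {
  case class IceCream(name: String, numCherries: Int, inCone: Boolean)

  val json: JsonValue =
    implicitly[JsonEncoder[IceCream]].encode(IceCream("Sundae", 1, false))
}
```

The field names in the output come from the Witness for each key, not from reflection.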

coproductEncoder follows the same pattern as hlistEncoder. We have three type parameters: K for the type name, H for the type of the value at the head of the Coproduct, and T for the type of the tail. We use FieldType and :+: in the result type to declare the relationships between the three, and we use a Witness to access the runtime value of the type name. The result is an object containing a single key/value pair: the key is the type name and the value is the result of encoding the head value:

Other encodings are possible with a little more work. We can add a "type" field to the output, for example, or even allow the user to configure the format. Sam Halliday’s spray-json-shapeless is an excellent example of a codebase that is approachable while providing a great deal of flexibility.

5.5 Summary

In this chapter we discussed LabelledGeneric, a variant of Generic that exposes type and field names in its generic representations.

The names exposed by LabelledGeneric are encoded as type-level tags so we can target them during implicit resolution. We started the chapter discussing literal types and the way shapeless uses them in its tags. We also discussed the Witness type class, which is used to reify literal types as values.

Finally, we combined LabelledGeneric, literal types, and Witness to build a JsonEncoder library that includes sensible names in its output.

The key take home point from this chapter is that none of this code uses runtime reflection. Everything is implemented with types, implicits, and a small set of macros that are internal to shapeless. The code we’re generating is consequently very fast and reliable at runtime.

6 Working with HLists and Coproducts

In Part I we discussed methods for deriving type class instances for algebraic data types. We can use type class derivation to augment almost any type class, although in more complex cases we may have to write a lot of supporting code for manipulating HLists and Coproducts.

In Part II we’ll look at the shapeless.ops package, which provides a set of helpful tools that we can use as building blocks. Each op comes in two parts: a type class that we can use during implicit resolution, and extension methods that we can call on HList and Coproduct.

There are three general sets of ops, available from three packages:

shapeless.ops.hlist defines type classes for HLists. These can be used directly via extension methods on HList, defined in shapeless.syntax.hlist.

shapeless.ops.coproduct defines type classes for Coproducts. These can be used directly via extension methods on Coproduct, defined in shapeless.syntax.coproduct.

shapeless.ops.record defines type classes for shapeless records (HLists containing tagged elements—Section 5.2). These can be used via extension methods on HList, imported from shapeless.record, and defined in shapeless.syntax.record.

We don’t have room in this book to cover all of the available ops. Fortunately, in most cases the code is understandable and well documented. Rather than provide an exhaustive guide, we will touch on the major theoretical and structural points and show you how to extract further information from the shapeless codebase.

6.1 Simple ops examples

HList has init and last extension methods based on two type classes: shapeless.ops.hlist.Init and shapeless.ops.hlist.Last. While init drops the last element of an HList, last drops all except the last one. Coproduct has similar methods and type classes. These serve as perfect examples of the ops pattern. Here are simplified definitions of the extension methods:

The return type of each method is determined by a dependent type on the implicit parameter. The instances for each type class provide the actual mapping. Here’s the skeleton definition of Last as an example:

We can make a couple of interesting observations about this implementation. First, we can typically implement ops type classes with a small number of instances (just two in this case). We can therefore package all of the required instances in the companion object of the type class, allowing us to call the corresponding extension methods without any imports from shapeless.ops:
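For instance, the following sketch needs only the top-level shapeless import, assuming shapeless 2.3.x is on the classpath. The object and value names are ours:

```scala
import shapeless._

object OpsDemo {
  val hlist = "Hello" :: 123 :: true :: HNil

  // Both calls resolve their type class instances from the companion objects.
  val last: Boolean = hlist.last
  val init = hlist.init

  val expectedInit = "Hello" :: 123 :: HNil
}
```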

Second, the type class is only defined for HLists with at least one element. This gives us a degree of static checking. If we try to call last on an empty HList, we get a compile error:

HNil.last
// <console>:16: error: Implicit not found: shapeless.Ops.Last[shapeless.HNil.type]. shapeless.HNil.type is empty, so there is no last element.
//        HNil.last
//             ^

6.2 Creating a custom op (the “lemma” pattern)

If we find a particular sequence of ops useful, we can package them up and re-provide them as another ops type class. This is an example of the “lemma” pattern, a term we introduced in Section 4.4.

Let’s work through the creation of our own op as an exercise. We’ll combine the power of Last and Init to create a Penultimate type class that retrieves the second-to-last element in an HList. Here’s the type class definition, complete with Aux type alias and apply method:
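A sketch of such a definition together with one instance built from Init and Last, assuming shapeless 2.3.x is on the classpath. The instance name hlistPenultimate and the demo values are illustrative:

```scala
import shapeless._
import shapeless.ops.hlist

trait Penultimate[L] {
  type Out
  def apply(l: L): Out
}

object Penultimate {
  type Aux[L, O] = Penultimate[L] { type Out = O }

  def apply[L](implicit p: Penultimate[L]): Aux[L, p.Out] = p

  // Drop the last element with Init, then take the last of what remains.
  implicit def hlistPenultimate[L <: HList, M <: HList, O](
    implicit
    init: hlist.Init.Aux[L, M],
    last: hlist.Last.Aux[M, O]
  ): Penultimate.Aux[L, O] =
    new Penultimate[L] {
      type Out = O
      def apply(l: L): O = last.apply(init.apply(l))
    }
}

object PenultimateDemo {
  type BigList = String :: Int :: Boolean :: Double :: HNil
  val bigList: BigList = "foo" :: 123 :: true :: 456.0 :: HNil

  val result = Penultimate[BigList].apply(bigList)
}
```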

The important point here is that, by defining Penultimate as another type class, we have created a reusable tool that we can apply elsewhere. Shapeless provides many ops for many purposes, but it’s easy to add our own to the toolbox.

6.3 Case study: case class migrations

The power of ops type classes fully crystallizes when we chain them together as building blocks for our own code. We’ll finish this chapter with a compelling example: a type class for performing “migrations” (aka “evolutions”) on case classes[8]. For example, if version 1 of our app contains the following case class:

The type class should take care of the migration without additional boilerplate.

6.3.1 The type class

The Migration type class represents a transformation from a source to a destination type. Both of these are going to be “input” types in our derivation, so we model both as type parameters. We don’t need an Aux type alias because there are no type members to expose:

trait Migration[A, B] {
  def apply(a: A): B
}

We’ll also introduce an extension method to make examples easier to read:

Take a moment to locate Intersection in the shapeless codebase. Its Aux type alias takes three parameters: two input HLists and one output for the intersection type. In the example above we are specifying ARepr and BRepr as the input types and BRepr as the output type. This means implicit resolution will only succeed if B has an exact subset of the fields of A, specified with the exact same names in the same order:

6.3.3 Step 2. Reordering fields

We need to lean on another ops type class to add support for reordering. The Align op lets us reorder the fields in one HList to match the order they appear in another HList. We can redefine our instance using Align as follows:

We introduce a new type parameter called Unaligned to represent the intersection of ARepr and BRepr before alignment, and use Align to convert Unaligned to BRepr. With this modified definition of Migration we can both remove and reorder fields:
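A sketch of this version of the instance, assuming shapeless 2.3.x is on the classpath. The extension method migrateTo and the IceCream case classes are illustrative:

```scala
import shapeless._
import shapeless.ops.hlist

trait Migration[A, B] {
  def apply(a: A): B
}

object Migration {
  implicit def genericMigration[
    A, B, ARepr <: HList, BRepr <: HList, Unaligned <: HList
  ](
    implicit
    aGen: LabelledGeneric.Aux[A, ARepr],
    bGen: LabelledGeneric.Aux[B, BRepr],
    // Fields common to A and B, still in A's order.
    inter: hlist.Intersection.Aux[ARepr, BRepr, Unaligned],
    // Reorder the common fields to match B.
    align: hlist.Align[Unaligned, BRepr]
  ): Migration[A, B] =
    new Migration[A, B] {
      def apply(a: A): B =
        bGen.from(align.apply(inter.apply(aGen.to(a))))
    }
}

object MigrationDemo {
  implicit class MigrationOps[A](a: A) {
    def migrateTo[B](implicit migration: Migration[A, B]): B =
      migration.apply(a)
  }

  case class IceCreamV1(name: String, numCherries: Int, inCone: Boolean)
  // A field removed:
  case class IceCreamV2a(name: String, inCone: Boolean)
  // Fields reordered:
  case class IceCreamV2b(name: String, inCone: Boolean, numCherries: Int)

  val removed = IceCreamV1("Sundae", 1, true).migrateTo[IceCreamV2a]
  val reordered = IceCreamV1("Sundae", 1, true).migrateTo[IceCreamV2b]
}
```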

6.3.4 Step 3. Adding new fields

We need a mechanism for calculating default values to support the addition of new fields. Shapeless doesn’t provide a type class for this, but Cats does in the form of a Monoid. Here’s a simplified definition:

We need to combine Monoid[9] with a couple of other ops to complete our final implementation of Migration. Here’s the full list of steps:

1. use LabelledGeneric to convert A to its generic representation;

2. use Intersection to calculate an HList of fields common to A and B;

3. calculate the types of fields that appear in B but not in A;

4. use Monoid to calculate a default value of the type from step 3;

5. append the common fields from step 2 to the new field from step 4;

6. use Align to reorder the fields from step 5 in the same order as B;

7. use LabelledGeneric to convert the output of step 6 to B.

We’ve already seen how to implement steps 1, 2, 4, 6, and 7. We can implement step 3 using an op called Diff that is very similar to Intersection, and step 5 using another op called Prepend. Here’s the complete solution:

Note that this code doesn’t use every type class at the value level. We use Diff to calculate the Added data type, but we don’t actually need diff.apply at run time. Instead we use our Monoid to summon an instance of Added.

With this final version of the type class instance in place we can use Migration for all the use cases we set out at the beginning of the case study:

It’s amazing what we can create with ops type classes. Migration has a single implicit def with a single line of value-level implementation. It allows us to automate migrations between any pair of case classes, in roughly the same amount of code we’d write to handle a single pair of types using the standard library. Such is the power of shapeless!

6.4 Record ops

We’ve spent some time in this chapter looking at type classes from the shapeless.ops.hlist and shapeless.ops.coproduct packages. We mustn’t leave without mentioning a third important package: shapeless.ops.record.

6.4.1 Selecting fields

Attempting to access an undefined field causes a compile error as we might expect:

sundae.get('nomCherries)
// <console>:20: error: No field Symbol with shapeless.tag.Tagged[String("nomCherries")] in record String with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("name")],String] :: Int with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("numCherries")],Int] :: Boolean with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("inCone")],Boolean] :: shapeless.HNil
//        sundae.get('nomCherries)
//                  ^

6.4.2 Updating and removing fields

The updated method and Updater type class allow us to modify fields by key. The remove method and Remover type class allow us to delete fields by key:
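A sketch of these record operations, assuming shapeless 2.3.x is on the classpath. Symbol literals such as 'name compile on Scala 2.12 and, with a deprecation warning, on 2.13. The IceCream case class and value names are illustrative:

```scala
import shapeless._
import shapeless.record._

object RecordDemo {
  case class IceCream(name: String, numCherries: Int, inCone: Boolean)

  // LabelledGeneric produces a record: an HList of tagged fields.
  val sundae = LabelledGeneric[IceCream].to(IceCream("Sundae", 1, false))

  // Select a field by key.
  val name: String = sundae.get('name)

  // Update a field by key; the result is a new record.
  val moreCherries = sundae.updated('numCherries, 3)
  val updatedCount: Int = moreCherries.get('numCherries)

  // Remove a field by key; we get the removed value and the remaining record.
  val (wasInCone, withoutCone) = sundae.remove('inCone)
}
```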

6.4.4 Other operations

There are other record ops that we don’t have room to cover here. We can rename fields, merge records, map over their values, and much more. See the source code of shapeless.ops.record and shapeless.syntax.record for more information.

6.5 Summary

In this chapter we explored a few of the type classes that are provided in the shapeless.ops package. We looked at Last and Init as two simple examples of the ops pattern, and built our own Penultimate and Migration type classes by chaining together existing building blocks.

Many of the ops type classes share a similar pattern to the ops we’ve seen here. The easiest way to learn them is to look at the source code in shapeless.ops and shapeless.syntax.

In the next chapters we will look at two suites of ops type classes that require further theoretical discussion. Chapter 7 discusses functional operations such as map and flatMap on HLists, and Chapter 8 discusses how to implement type classes that require type level representations of numbers. This knowledge will help us gain a more complete understanding of the variety of type classes from shapeless.ops.

7 Functional operations on HLists

“Regular” Scala programs make heavy use of functional operations like map and flatMap. A question arises: can we perform similar operations on HLists? The answer is “yes”, although we have to do things a little differently than in regular Scala. Unsurprisingly the mechanisms we use are type class based and there is a suite of ops type classes to help us out.

Before we delve into the type classes themselves, we need to discuss how shapeless represents polymorphic functions suitable for mapping over heterogeneous data structures.

7.1 Motivation: mapping over an HList

We’ll motivate the discussion of polymorphic functions by looking at the map method. Figure 1 shows a type chart for mapping over a regular list. We start with a List[A], supply a function A => B, and end up with a List[B].

Figure 1: Mapping over a regular list (“monomorphic” map)

The heterogeneous element types in an HList cause this model to break down. Scala functions have fixed input and output types, so the result of our map will have to have the same element type in every position.

Ideally we’d like a map operation like the one shown in Figure 2, where the function inspects the type of each input and uses it to determine the type of each output. This gives us a closed, composable transformation that retains the heterogeneous nature of the HList.

Figure 2: Mapping over a heterogeneous list (“polymorphic” map)

Unfortunately we can’t use Scala functions to implement this kind of operation. We need some new infrastructure.

7.2 Polymorphic functions

Shapeless provides a type called Poly for representing polymorphic functions, where the result type depends on the parameter types. Here is a simplified explanation of how it works. Note that the next section doesn’t contain real shapeless code—we’re eliding much of the flexibility and ease of use that comes with real shapeless Polys to create a simplified API for illustrative purposes.

7.2.1 How Poly works

At its core, a Poly is an object with a generic apply method. In addition to its regular parameter of type A, Poly accepts an implicit parameter of type Case[A]:

When we call myPoly.apply, the compiler searches for the relevant implicit Case and inserts it as usual:

myPoly.apply(123)
// res8: Double = 61.5

There is some subtle scoping behaviour here that allows the compiler to locate instances of Case without any additional imports. Case has an extra type parameter P referencing the singleton type of the Poly. The implicit scope for Case[P, A] includes the companion objects for Case, P, and A. We’ve assigned P to be myPoly.type and the companion object for myPoly.type is myPoly itself. In other words, Cases defined in the body of the Poly are always in scope no matter where the call site is.
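The toy encoding described in this section can be sketched in plain Scala as follows. This is not real shapeless code; the Aux alias and the demo object are our additions:

```scala
// A single case of a polymorphic function P for argument type A.
trait Case[P, A] {
  type Result
  def apply(a: A): Result
}

object Case {
  type Aux[P, A, R] = Case[P, A] { type Result = R }
}

trait Poly {
  // The result type depends on which Case the compiler finds for A.
  def apply[A](arg: A)(implicit cse: Case[this.type, A]): cse.Result =
    cse.apply(arg)
}

object myPoly extends Poly {
  // Cases live in the body of the Poly, so they are found via myPoly.type.
  implicit val intCase: Case.Aux[myPoly.type, Int, Double] =
    new Case[myPoly.type, Int] {
      type Result = Double
      def apply(num: Int): Double = num / 2.0
    }

  implicit val stringCase: Case.Aux[myPoly.type, String, Int] =
    new Case[myPoly.type, String] {
      type Result = Int
      def apply(str: String): Int = str.length
    }
}

object ToyPolyDemo {
  val half = myPoly.apply(123)
  val length = myPoly.apply("hello")
}
```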

We’re extending a trait called Poly1 instead of Poly. Shapeless has a Poly type and a set of subtypes, Poly1 through Poly22, supporting different arities of polymorphic function.

The Case.Aux types don’t seem to reference the singleton type of the Poly. Case.Aux is actually a type alias defined within the body of Poly1. The singleton type is there; we just don’t see it.

We’re using a helper method, at, to define cases. This acts as an instance constructor method (as discussed in Section 3.1.2), which eliminates a lot of boilerplate.

Syntactic differences aside, the shapeless version of myPoly is functionally identical to our toy version. We can call it with an Int or String parameter and get back a result of the corresponding return type:
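A sketch of a shapeless Poly1 with an Int case and a String case, assuming shapeless 2.3.x is on the classpath. The demo object is ours, and we leave the result types to be inferred:

```scala
import shapeless._

object myPoly extends Poly1 {
  // Case.Aux here is the alias defined inside Poly1.
  implicit val intCase: Case.Aux[Int, Double] =
    at[Int](num => num / 2.0)

  implicit val stringCase: Case.Aux[String, Int] =
    at[String](str => str.length)
}

object PolyDemo {
  val half = myPoly.apply(123)
  val length = myPoly.apply("hello")
}
```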

Because Cases are just implicit values, we can define cases based on type classes and do all of the advanced implicit resolution covered in previous chapters. Here’s a simple example that totals numbers in different contexts:

This behaviour is confusing and annoying. Unfortunately there are no concrete rules to follow to avoid problems. The only general guideline is to try not to over-constrain the compiler, solve one constraint at a time, and give it a hint when it gets stuck.

7.3 Mapping and flatMapping using Poly

Shapeless provides a suite of functional operations based on Poly, each implemented as an ops type class. Let’s look at map and flatMap as examples. Here’s map:
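A sketch of mapping a Poly called sizeOf over an HList, assuming shapeless 2.3.x is on the classpath. The particular cases chosen here are illustrative:

```scala
import shapeless._

object sizeOf extends Poly1 {
  implicit val intCase: Case.Aux[Int, Int] =
    at[Int](n => n)

  implicit val stringCase: Case.Aux[String, Int] =
    at[String](str => str.length)

  implicit val booleanCase: Case.Aux[Boolean, Int] =
    at[Boolean](bool => if (bool) 1 else 0)
}

object MapDemo {
  // Each element is transformed by the Case matching its type.
  val sizes = (10 :: "hello" :: true :: HNil).map(sizeOf)

  val expected = 10 :: 5 :: 1 :: HNil
}
```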

Note that the elements in the resulting HList have types matching the Cases in sizeOf. We can use map with any Poly that provides Cases for every member of our starting HList. If the compiler can’t find a Case for a particular member, we get a compile error:

Interestingly, although we define a type P for our Poly, we don’t reference any values of type P anywhere in our code. The Mapper type class uses implicit resolution to find Cases, so the compiler only needs to know the singleton type of P to locate the relevant instances.

Let’s create an extension method to make ProductMapper easier to use. We only want the user to specify the type of B at the call site, so we use some indirection to allow the compiler to infer the type of the Poly from a value parameter:

The mapTo syntax looks like a single method call, but is actually two calls: one call to mapTo to fix the B type parameter, and one call to Builder.apply to specify the Poly. Some of shapeless’ built-in ops extension methods use similar tricks to provide the user with convenient syntax.
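A sketch of how such a type class and its extension method might fit together, assuming shapeless 2.3.x is on the classpath. The instance name genericProductMapper, the conversions Poly, and the IceCream case classes are illustrative assumptions:

```scala
import shapeless._
import shapeless.ops.hlist.Mapper

trait ProductMapper[A, B, P] {
  def apply(a: A): B
}

object ProductMapper {
  // Convert A to its Repr, map P over it, and convert the result to B.
  implicit def genericProductMapper[
    A, B, P <: Poly, ARepr <: HList, BRepr <: HList
  ](
    implicit
    aGen: Generic.Aux[A, ARepr],
    bGen: Generic.Aux[B, BRepr],
    mapper: Mapper.Aux[P, ARepr, BRepr]
  ): ProductMapper[A, B, P] =
    new ProductMapper[A, B, P] {
      def apply(a: A): B = bGen.from(mapper.apply(aGen.to(a)))
    }
}

object conversions extends Poly1 {
  implicit val intCase: Case.Aux[Int, Boolean] = at[Int](n => n > 0)
  implicit val boolCase: Case.Aux[Boolean, Int] = at[Boolean](b => if (b) 1 else 0)
  implicit val strCase: Case.Aux[String, String] = at[String](s => s)
}

object MapToDemo {
  implicit class ProductMapperOps[A](a: A) {
    // Builder fixes B; its apply method infers P from the poly argument.
    class Builder[B] {
      def apply[P <: Poly](poly: P)(
        implicit pm: ProductMapper[A, B, P]
      ): B = pm.apply(a)
    }

    def mapTo[B]: Builder[B] = new Builder[B]
  }

  case class IceCream1(name: String, numCherries: Int, inCone: Boolean)
  case class IceCream2(name: String, hasCherries: Boolean, numCones: Int)

  val result = IceCream1("Sundae", 1, false).mapTo[IceCream2](conversions)
}
```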

7.6 Summary

In this chapter we discussed polymorphic functions whose return types vary based on the types of their parameters. We saw how shapeless’ Poly type is defined, and how it is used to implement functional operations such as map, flatMap, foldLeft, and foldRight.

Each operation is implemented as an extension method on HList, based on a corresponding type class: Mapper, FlatMapper, LeftFolder, and so on. We can use these type classes, Poly, and the techniques from Section 4.3 to create our own type classes involving sequences of sophisticated transformations.

8 Counting with types

From time to time we need to count things at the type level. For example, we may need to know the length of an HList or the number of terms we have expanded so far in a computation. We can represent numbers as values easily enough, but if we want to influence implicit resolution we need to represent them at the type level. This chapter covers the theory behind counting with types, and provides some compelling use cases for type class derivation.

ScalaCheck provides built-in instances of Arbitrary for a wide range of standard Scala types. However, creating instances of Arbitrary for user ADTs is still a time-consuming manual process. This makes shapeless integration via libraries like scalacheck-shapeless very attractive.

In this section we will create a simple Random type class to generate random values of user-defined ADTs. We will show how Length and Nat form a crucial part of the implementation. As usual we start with the definition of the type class itself:

The Repr for Light is Red :+: Amber :+: Green :+: CNil. An instance of Random for this type will choose Red 50% of the time and Amber :+: Green :+: CNil 50% of the time. A correct distribution would be 33% Red and 67% Amber :+: Green :+: CNil.

And that’s not all. If we look at the overall probability distribution we see something even more alarming:

To fix this problem we have to alter the probability of choosing H over T. The correct behaviour should be to choose H 1/n of the time, where n is the length of the coproduct. This ensures an even probability distribution across the subtypes of the coproduct. It also ensures we choose the head of a single-subtype Coproduct 100% of the time, which means we never call cnilProduct.get. Here’s an updated implementation:
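A sketch of the weighted choice, assuming shapeless 2.3.x is on the classpath. It is cut down to what an ADT of case objects needs; the helper create, the instance names, and the Light ADT are illustrative:

```scala
import shapeless._
import shapeless.ops.coproduct
import shapeless.ops.nat.ToInt

sealed trait Light
case object Red extends Light
case object Amber extends Light
case object Green extends Light

trait Random[A] {
  def get: A
}

object Random {
  def create[A](f: () => A): Random[A] =
    new Random[A] { def get: A = f() }

  // Case objects have HNil as their generic representation.
  implicit val hnilRandom: Random[HNil] = create[HNil](() => HNil)

  // Never called: the head of a one-element coproduct is always chosen.
  implicit val cnilRandom: Random[CNil] =
    create[CNil](() => throw new Exception("Inconceivable!"))

  implicit def coproductRandom[H, T <: Coproduct, L <: Nat](
    implicit
    hRandom: Lazy[Random[H]],
    tRandom: Random[T],
    tLength: coproduct.Length.Aux[T, L],
    tLengthAsInt: ToInt[L]
  ): Random[H :+: T] =
    create[H :+: T] { () =>
      // n is the length of the whole coproduct: the head plus the tail.
      val length = 1 + tLengthAsInt()
      val chooseH = scala.util.Random.nextDouble() < (1.0 / length)
      if (chooseH) Inl(hRandom.value.get) else Inr(tRandom.get)
    }

  implicit def genericRandom[A, R](
    implicit
    gen: Generic.Aux[A, R],
    random: Lazy[Random[R]]
  ): Random[A] =
    create[A](() => gen.from(random.value.get))
}

object RandomDemo {
  def randomLight(): Light = implicitly[Random[Light]].get

  val samples: List[Light] = List.fill(3000)(randomLight())
}
```

With this weighting each of the three subtypes should appear in roughly a third of the samples.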

These operations and their associated type classes are useful for manipulating individual elements within a product or coproduct.

8.5 Summary

In this chapter we discussed how shapeless represents natural numbers and how we can use them in type classes. We saw some predefined ops type classes that let us do things like calculate lengths and access elements by index, and created our own type classes that use Nat in other ways.

Between Nat, Poly, and the variety of types we have seen in the last few chapters, we have seen just a small fraction of the toolbox provided in shapeless.ops. There are many other ops type classes that provide a comprehensive foundation on which to build our own code. However, the theory laid out here is enough to understand the majority of ops needed to derive our own type classes. The source code in the shapeless.ops packages should now be approachable enough to pick up other useful ops.

Prepare for launch!

With Part II’s look at shapeless.ops we have arrived at the end of this guide. We hope you found it useful for understanding this fascinating and powerful library, and wish you all the best on your future journeys as a type astronaut.

As functional programmers we value abstraction above all else. Concepts like functors and monads arise from years of programming research: writing code, spotting patterns, and making abstractions to remove redundancy. Shapeless raises the bar for abstraction in Scala. Tools like Generic and LabelledGeneric provide an interface for abstracting over data types that were previously frustratingly unique and distinct.

There have traditionally been two barriers to entry for aspiring new shapeless users. The first is the wealth of theoretical knowledge and implementation detail required to understand the patterns we need. Hopefully this guide has helped in this regard.

The second barrier is the fear and uncertainty surrounding a library that is seen as “academic” or “advanced”. We can overcome this by sharing knowledge—use cases, pros and cons, implementation strategies, and so on—to widen the understanding of this valuable tool. So please share this book with a friend… and let’s scrap some boilerplate together!

[1] Not to be confused with “abstract data types”, which are a different tool from computer science that has little bearing on the discussion here.

[2] The word “algebra” meaning: the symbols we define, such as rectangle and circle; and the rules for manipulating those symbols, encoded as methods.

[3] We’re using “generic” in an informal way here, rather than the conventional meaning of “a type with a type parameter”.

[4] Product is perhaps a better name for HList, but the standard library unfortunately already has a type scala.Product.

[5] String also has a bunch of other types like Serializable and Comparable but let’s ignore those for now.