GHC.Generics

From HaskellWiki

GHC 7.2 includes improved support for datatype-generic programming through two new features, enabled with two new flags: DeriveGeneric and DefaultSignatures. We show how this all works on this page, starting with a detailed example.

Since this is a fresh new feature, it is possible that you will run into bugs when using it. If so, please report them!

For starters, try to ignore the p parameter in all types; it's there just for future compatibility. The easiest way to understand how you can use these types to represent others is to see an example. Let's represent the UserTree type shown before:

type RepUserTree a =
  -- A UserTree is either a Leaf, which has no arguments
      U1
  -- ... or it is a Node, which has three arguments that we put in a product
  :+: a :*: UserTree a :*: UserTree a

Simple, right? Different constructors become alternatives of a sum, and multiple arguments become products. In fact, we want to have some more information in the representation, like datatype and constructor names, and to know if a product argument is a parameter or a type. We use the other primitives for this, and the representation looks more like:

type RealRepUserTree a =
  -- Information about the datatype
  M1 D Data_UserTree (
    -- Leaf, with information about the constructor
        M1 C Con_Leaf U1
    -- Node, with information about the constructor
    :+: M1 C Con_Node (
          -- Constructor argument, which could have information
          -- about a record selector label
          M1 S NoSelector (
            -- Argument, tagged with P because it is a parameter
            K1 P a)
          -- Another argument, tagged with R because it is
          -- a recursive occurrence of a type
      :*: M1 S NoSelector (K1 R (UserTree a))
          -- Idem
      :*: M1 S NoSelector (K1 R (UserTree a))))

A bit more complicated, but essentially the same. Datatypes like Data_UserTree are empty datatypes used only for providing meta-information in the representation; you don't have to worry much about them for now. Also, GHC generates these representations for you automatically, so you should never have to define them yourself! All of this is explained in much more detail in Section 2.1 of the original paper describing the new generic deriving mechanism.
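To check that a derived representation really has this shape, one can derive Generic (with the DeriveGeneric flag described later on this page) and inspect the result of from. The sketch below assumes a definition of UserTree with Leaf as its first constructor, matching the representation above; the helper names treeTypeName, treeConName and inLeftAlternative are made up for the illustration. Current versions of GHC encode the meta-information as type-level data rather than as empty datatypes such as Data_UserTree, but the M1 wrappers and the datatypeName and conName accessors are used in the same way.

```haskell
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics

-- Assumed definition of UserTree; the constructor order is chosen to match
-- the representation shown above (Leaf first, then Node).
data UserTree a = Leaf | Node a (UserTree a) (UserTree a)
  deriving Generic

-- Name of the datatype, read from the outermost M1 D wrapper.
treeTypeName :: UserTree a -> String
treeTypeName t = datatypeName (from t)

-- Name of the constructor, read from the M1 C wrapper inside the sum.
treeConName :: UserTree a -> String
treeConName t = case from t of
  M1 (L1 c) -> conName c
  M1 (R1 c) -> conName c

-- True when the value sits in the left alternative of the sum.
inLeftAlternative :: UserTree a -> Bool
inLeftAlternative t = case from t of
  M1 (L1 _) -> True
  M1 (R1 _) -> False

main :: IO ()
main = do
  let leaf = Leaf :: UserTree Int
      node = Node 1 Leaf Leaf :: UserTree Int
  putStrLn (treeTypeName leaf)    -- UserTree
  putStrLn (treeConName leaf)     -- Leaf
  putStrLn (treeConName node)     -- Node
  print (inLeftAlternative leaf)  -- True
```

With Leaf declared first, Leaf values land in the left alternative (L1) and Node values in the right one (R1), which is the sum structure described above.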

1.1.2 A generic function

Since GHC can represent user types using only those primitive types, all you have to do is to tell GHC how to serialize each of the individual primitive types. The best way to do that is to create a new type class:

class GSerialize f where
  gput :: f a -> [Bit]

This class looks very much like the original Serialize class, just that the type argument is of kind * -> *, since our generic representation types have this p parameter lying around. Now we need to give instances for each of the basic types. For units there's nothing to serialize:

instance GSerialize U1 where
  gput U1 = []

The serialization of multiple arguments is simply the concatenation of each of the individual serializations:
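A minimal sketch of such a product instance, written against the types exported by GHC.Generics. The Bit type with constructors O and I is an assumption, since its definition is not part of this section. The instances for sums (one bit recording the chosen alternative) and for the M1 metadata wrappers are likewise assumptions, included only so that the sketch covers the shapes that occur in the UserTree representation.

```haskell
{-# LANGUAGE TypeOperators #-}
import GHC.Generics

-- Assumed bit type; the page's own definition is not shown in this section.
data Bit = O | I deriving (Eq, Show)

class GSerialize f where
  gput :: f a -> [Bit]

-- Units carry no information.
instance GSerialize U1 where
  gput U1 = []

-- Products: concatenate the serializations of both components.
instance (GSerialize f, GSerialize g) => GSerialize (f :*: g) where
  gput (x :*: y) = gput x ++ gput y

-- Sums (assumed encoding): one bit for the alternative, then the payload.
instance (GSerialize f, GSerialize g) => GSerialize (f :+: g) where
  gput (L1 x) = O : gput x
  gput (R1 x) = I : gput x

-- Metadata wrappers (assumed): nothing is encoded, just recurse.
instance GSerialize f => GSerialize (M1 i c f) where
  gput (M1 x) = gput x

main :: IO ()
main = do
  print (gput (U1 :*: U1))                              -- []
  print (gput (R1 U1 :: (U1 :+: U1) ()))                -- [I]
  print (gput (L1 (R1 U1) :: ((U1 :+: U1) :+: U1) ()))  -- [O,I]
```

Since units serialize to the empty list, the only bits in these examples come from the sum tags.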

Finally, we're only left with the arguments. For these we will just use our first class, Serialize, again:

instance (Serialize a) => GSerialize (K1 i a) where
  gput (K1 x) = put x

So, if a user datatype has a parameter which is instantiated to Int, at this stage we will use the library instance for Serialize Int.

1.1.3 Default implementations

We've seen how to represent user types generically, and how to define functions on representation types. However, we still have to tie these two together, explaining how to convert user types to their representation and then applying the generic function.

The representation RepUserTree we have seen earlier is only one component of the representation; we also need functions to convert between the user datatype and its representation. For that we use another type class:

class Generic a where
  -- Encode the representation of a user datatype
  type Rep a :: * -> *
  -- Convert from the datatype to its representation
  from :: a -> (Rep a) x
  -- Convert from the representation to the datatype
  to   :: (Rep a) x -> a

The type of putDefault says that we can serialize any a into a list of bits, as long as that a is Generic, and its representation Rep a has a GSerialize instance. The implementation is very simple: first convert the value to its representation using from, and then call gput on that representation. However, we still have to write a Serialize instance for the user datatype:

instance (Serialize a) => Serialize (UserTree a) where
  put = putDefault
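The pieces of this section can be put together in one module without any deriving support. The following is only a sketch under several assumptions: the Bit type, the Serialize Bool instance, and the one-bit sum encoding are made up for the example; the Generic instance is written by hand with a simplified representation that leaves out the M1 metadata wrappers; and putDefault is defined as the text above describes it (convert with from, then call gput).

```haskell
{-# LANGUAGE FlexibleContexts, TypeFamilies, TypeOperators #-}
import GHC.Generics

-- Assumed bit type and base class, mirroring the names used on this page.
data Bit = O | I deriving (Eq, Show)

class Serialize a where
  put :: a -> [Bit]

-- Hypothetical base instance, so the example has something to serialize.
instance Serialize Bool where
  put False = [O]
  put True  = [I]

class GSerialize f where
  gput :: f a -> [Bit]

instance GSerialize U1 where
  gput U1 = []

instance (GSerialize f, GSerialize g) => GSerialize (f :*: g) where
  gput (x :*: y) = gput x ++ gput y

-- Assumed sum encoding: one bit for the alternative, then the payload.
instance (GSerialize f, GSerialize g) => GSerialize (f :+: g) where
  gput (L1 x) = O : gput x
  gput (R1 x) = I : gput x

instance Serialize a => GSerialize (K1 i a) where
  gput (K1 x) = put x

data UserTree a = Leaf | Node a (UserTree a) (UserTree a)

-- Hand-written instance with a simplified representation (no metadata).
instance Generic (UserTree a) where
  type Rep (UserTree a) =
    U1 :+: (K1 R a :*: K1 R (UserTree a) :*: K1 R (UserTree a))
  from Leaf         = L1 U1
  from (Node x l r) = R1 (K1 x :*: K1 l :*: K1 r)
  to (L1 U1)                       = Leaf
  to (R1 (K1 x :*: K1 l :*: K1 r)) = Node x l r

-- Generic default: convert to the representation, then serialize that.
putDefault :: (Generic a, GSerialize (Rep a)) => a -> [Bit]
putDefault a = gput (from a)

instance Serialize a => Serialize (UserTree a) where
  put = putDefault

main :: IO ()
main = print (put (Node True Leaf Leaf))  -- [I,I,O,O]
```

In the output, the first bit selects the Node alternative, the second is the Bool payload, and the last two are the tags of the two Leaf subtrees.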

1.2 Using GHC's new features

What we have seen so far could all already be done, at the cost of writing a lot of boilerplate code yourself (or spending hours writing Template Haskell code to do it for you). Now we'll see how the new features of GHC can help you.

1.2.1 Deriving representations

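The Generic class, and all the representation types, come with GHC in the GHC.Generics module. GHC can also derive Generic instances for user datatypes: with the DeriveGeneric flag enabled, it is enough to add a deriving Generic clause to the datatype declaration.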

With the new language pragma DefaultSignatures, GHC allows you to put the keyword default before a (new) type signature for a method inside a class declaration. If you give such a default type signature, then you have to provide a default method implementation, which will be type-checked using the default signature, and not the original one.
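As a sketch, this is how the Serialize class could look with a default signature, together with the generic instances it relies on. The Bit type, the Serialize Bool instance, the one-bit sum encoding, and the definition of UserTree are assumptions made to keep the example self-contained.

```haskell
{-# LANGUAGE DefaultSignatures, DeriveGeneric, FlexibleContexts, TypeOperators #-}
import GHC.Generics

-- Assumed bit type, mirroring the name used on this page.
data Bit = O | I deriving (Eq, Show)

class Serialize a where
  put :: a -> [Bit]
  -- Default signature: the default body is checked against this type,
  -- which is more specific than the method's own type.
  default put :: (Generic a, GSerialize (Rep a)) => a -> [Bit]
  put a = gput (from a)

class GSerialize f where
  gput :: f a -> [Bit]

instance GSerialize U1 where
  gput U1 = []

instance (GSerialize f, GSerialize g) => GSerialize (f :*: g) where
  gput (x :*: y) = gput x ++ gput y

-- Assumed sum encoding: one bit for the alternative, then the payload.
instance (GSerialize f, GSerialize g) => GSerialize (f :+: g) where
  gput (L1 x) = O : gput x
  gput (R1 x) = I : gput x

instance GSerialize f => GSerialize (M1 i c f) where
  gput (M1 x) = gput x

instance Serialize a => GSerialize (K1 i a) where
  gput (K1 x) = put x

-- Hypothetical base instance, written by hand as usual.
instance Serialize Bool where
  put b = [if b then I else O]

-- Assumed definition of UserTree, with a derived Generic instance.
data UserTree a = Leaf | Node a (UserTree a) (UserTree a)
  deriving Generic

-- Empty instance: put is filled in by the generic default.
instance Serialize a => Serialize (UserTree a)

main :: IO ()
main = print (put (Node False Leaf Leaf))  -- [I,O,O,O]
```

Instances that do not fit the generic scheme, such as Serialize Bool here, can still define put by hand; the default only applies when the method is left out.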

Now the user can simply write:

instance (Serialize a) => Serialize (UserTree a)

GHC fills out the implementation for put using the default method. It will type-check correctly because we have a Generic instance for UserTree, and GSerialize instances for all the representation types.

2 Different perspectives

We outline the changes introduced in 7.2 regarding support for generic programming from the perspective of three different types of users: the end-user, the generic programmer, and the GHC hacker.

2.1 The end-user

If you know nothing about generic programming and would like to keep it that way, then you will be pleased to know that using generics in GHC 7.2 is easier than ever. As soon as you encounter a class with a default signature (like Serialize above), you will be able to give empty instances for your datatypes, like this:

instance (Serialize a) => Serialize (UserTree a)

You will need to add a deriving Generic clause to each datatype that you want to have generic implementations for. You might have datatypes that use other datatypes, and you might need Generic instances for those too. In that case, you can import the module where the datatype is defined and give a standalone deriving Generic instance. In either case, you will need the -XDeriveGeneric flag; standalone deriving additionally requires -XStandaloneDeriving.
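Both ways of obtaining a Generic instance can be sketched in a single module. The definition of UserTree is an assumption, and Colour is a made-up stand-in for a datatype that would normally be imported from another module (standalone deriving needs its constructors to be in scope). The roundTrip helper is also hypothetical; it only checks that from and to are inverses.

```haskell
{-# LANGUAGE DeriveGeneric, StandaloneDeriving #-}
import GHC.Generics (Generic, from, to)

-- Assumed definition of UserTree, using a deriving clause.
data UserTree a = Leaf | Node a (UserTree a) (UserTree a)
  deriving (Eq, Show, Generic)

-- Stand-in for a datatype defined elsewhere without a Generic instance.
data Colour = Red | Green | Blue
  deriving (Eq, Show)

-- Standalone deriving: add the Generic instance after the fact.
deriving instance Generic Colour

-- Convert to the representation and back again.
roundTrip :: Generic a => a -> a
roundTrip = to . from

main :: IO ()
main = do
  print (roundTrip (Node (1 :: Int) Leaf Leaf) == Node 1 Leaf Leaf)  -- True
  print (roundTrip Green)                                            -- Green
```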

2.2 The generic programmer

If you are a library author and are eager to make your classes easy to instantiate by your users, then you should invest some time in defining instances for each of the representation types of GHC.Generics and defining a generic default method. See the example for Serialize above, and the original paper for many other examples (but make sure to check the changes from the paper).

2.3 The GHC hacker

If you are working on the GHC source code, you might find it useful to know what kind of changes were made. There is a Trac wiki page that gives a lower-level overview and keeps track of what still needs to be done.

3 Changes from the paper

In the paper we describe the implementation in UHC. The implementation in GHC is slightly different:

Representable0 and Representable1 have become Generic and Generic1, respectively. from0, to0, and Rep0 also lost the 0 at the end of their names.

We are using type families, so the Generic and Generic1 type classes have only one type argument. So, in GHC the classes look like what we describe in the "Avoiding extensions" part of Section 2.3 of the paper. This change affects only a generic function writer, and not a generic function user.

Default definitions (Section 3.3) work differently. In GHC we don't use a DERIVABLE pragma; instead, a type class can declare a generic default method, which is akin to a standard default method, but includes a default type signature. This removes the need for a separate default definition and a pragma. For example, the Encode class of Section 3.1 is now: