Simple Extensible Records — Quick Generic Tricks, Pt. 1

There have been a few discussions lately about how to do quick and easy typesafe extensible records in Haskell. And there have been a number of discussions lately about extensions to do it more cleanly (see, e.g. this one on haskell-cafe). This came up on the channel the other day and somebody immediately suggested HList, which seems too heavyweight for this sort of thing to me. HaskellDB, for example, also rolls its own fairly heavyweight new records system.

Anyway, in discussion, we cooked up an idea which Kamina, who originally asked the question, later implemented very cleanly. The gist is that one defines a map of properties, and implements a typeclass for accessing them. Subsequently, one defines a typeclass for each property, whose default methods know how to query them from the properties map. Then, providing any data type with a new property is just a quick instance declaration away. There’s a slight messiness in that one needs to directly index properties by strings, or by a Property enumeration that’s set in stone. (One can actually index them by TypeReps using Typeable, but that’s sort of messy.)

However, there’s another issue as well — we want records extensible in two ways. The typical way is just being able to reuse an accessor and setter function over multiple data types with the same property. But we also want to be able to actually extend records with something like inheritance — i.e. to declare some records strict supersets of others. The following solution, with generics, both solves that issue, and is significantly more lightweight, although it still requires mild boilerplate. Additionally, while the former solution lets one “break” it by circumventing the typeclass-bound “smart accessor” functions and operating on the properties map directly, this solution can also be “broken” by giving it a bad instance declaration.

Both solutions, one should note, can easily be augmented with Template Haskell to reduce the boilerplate still further.

We start by realizing that rather than creating a properties map, it would be nice if we could walk directly over the data in our record and select something of the right type. This, of course, is something that the Scrap Your Boilerplate generics library excels at. In fact, it turns out we have the function we need right at our fingertips: gfindtype, which is of type (Data x, Typeable y) => x -> Maybe y . We ask it for a specific type, and if that type is an immediate subterm of x, then it returns it, otherwise it fails.

Now we need a modifier. There isn’t one built in, but its simple enough to add one. The type signature is immediately evident: greplace :: (Data a, Typeable b) => a -> b -> Maybe a . However, what to do with this type signature is a bit trickier. Data.Generics gives us gmapMo of type (Data a, MonadPlus m) => (a -> m a) -> a -> m a. It traverses the immediate subterms of a type, and applies a transformation. We build our traverser using `extM` which composes a chain of actions to try, matching from right to left the first with an appropriate type. Our default action when nothing matches is to const Nothing (i.e., to leave the term unchanged). When the term does match, we return just the term. This gives us:
greplace x y = gmapMo (const Nothing `extM` (const (Just y))) x

And indeed, this does what we want. However, we don’t have any type safety constraints yet. Every query and update gives us a maybe, depending if it matches. But we can combine this system with the typeclass constraints from the other system and get a very clean interface. Assume we have properties of type Color and Size:

And for anything with a Color, e.g., a car, we just can write instance HasColor Car. Not too shabby!

However, as we’ve written it, this only works for immediate subterms. So, just like our old solution with an explicit properties map, we still can’t extend our records just by wrapping them as part of a larger record.

Luckily, the generics libraries make it trivial to do deep searches as well. We can replace gfindnew with gfind = something (const Nothing `extQ` Just) . This function is built much the same as our greplace was — yielding Just a value of the appropriate type, or Nothing otherwise. The something combinator just maps our query function deeply over every term, from top to bottom, until it hits a match. And as for greplace, we can just replace gmapMo with “somewhere” for a similar effect. Now, if we have data DrivenCar = DrivenCar Person Car we can give the appropriate instance declarations, and get and set the color of the car directly here as well.

One more problem though: the somewhere combinator will apply our transformation at least once — which means that it might apply it many multiple times, and even if it doesn’t, it’ll keep working ever after it succeeds. This too is easy enough to fix. Look at the source and we see that somewhere is just defined as f x `mplus` gmapMq (somewhere f) x. So its trying the transform, and if the transform fails, its mapping the transform over all the immediate subterms, recursively. All we have to do is write a function “once” that looks exactly the same, but with gmapMo instead of gmapMq, and we have what we really need.