For long lost friends and stalkers

Summary: When programming, often there is a requirement to store heterogeneous types of values in a collection. The Typed Keys pattern defines keys which ‘know’ the type of data to which they refer, and which encapsulate the type information and logic to convert it to and from the storage type.

Preamble

I seem to write on this theme a lot: how to leverage the type system of modern programming languages to reduce programming errors, and make application code less verbose. Well, this is another example of that, and it’s a technique I find myself using quite often in Java & C# to really simplify code, increase type-safety (i.e., reduce the opportunity for dumb errors) and make business logic more understandable.

The gist is that if you do the same thing every time you access particular keys/variables/lookup items, then you should encapsulate these actions within the key, rather than spreading them throughout the code.

It’s not an original idea, and we’ll point to some examples in a future post.

Motivating example:

Let’s say you have a map of strings to objects (in Java: Map<String, Object>), and you’re storing in it values of several different types:
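For illustration, the contents of such a map might look like the sketch below. Only the "sessionId" key comes from this post; the other key names and all of the values are hypothetical.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

class SessionAttributes {
    // Builds a map whose values have several unrelated types.
    // Every key except "sessionId" is invented for this example.
    static Map<String, Object> example() {
        Map<String, Object> map = new HashMap<>();
        map.put("sessionId", 1234567890L);                            // Long
        map.put("userName", "alice");                                 // String
        map.put("loginTime", Instant.parse("2016-01-01T09:00:00Z"));  // Instant
        map.put("rememberMe", Boolean.TRUE);                          // Boolean
        return map;
    }
}
```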

And let’s say that there could be a bunch of other such keys too—dozens or hundreds—it’s an open-ended list.

This situation occurs with session attributes, and configuration settings, and key-value databases.

If it is a small, relatively-fixed list of keys, the solution might be a wrapper class around the Map, with appropriately-typed getters and setters for each value. But the keys are an open-ended set. The code to retrieve a session-id might be:

long sessionId = (Long)map.get("sessionId");

Let’s imagine that we’re doing that a few times throughout the codebase.

And to set it:

map.put("sessionId", generateSessionId()); // generateSessionId() is a placeholder: generate the session id somehow

Already, there are a couple of problems with this code, so let’s address them one at a time.

The string literal

A simple (and obvious) one to start:

We’re repeating the same constant strings throughout the codebase. This is error-prone, and makes the code fragile & hard to change.

It’s error-prone—because a typo at any of the string literals will never be caught at compile time, but will cause the code to fail at runtime.

It’s fragile—because these errors are easy to introduce accidentally.

It’s hard to change—because changing any key-name requires changing it at many places throughout the code.

So the obvious first step is to define each of the key-names we’re using as string constants:
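A minimal sketch of that step, assuming a holder class called KeyNames (the class name and the USER_NAME constant are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

final class KeyNames {
    private KeyNames() {}  // constants only; never instantiated

    static final String SESSION_ID = "sessionId";
    static final String USER_NAME = "userName";  // hypothetical second key
}

class ConstantsDemo {
    // Writes and reads a session id using the constant instead of a literal.
    static long roundTrip(long id) {
        Map<String, Object> map = new HashMap<>();
        map.put(KeyNames.SESSION_ID, id);             // autoboxed to Long
        return (Long) map.get(KeyNames.SESSION_ID);   // the cast is still repeated at each read
    }
}
```

A typo in KeyNames.SESSION_ID is now a compile error, and renaming the key is a one-line change. The cast is untouched.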

That’s an improvement. It’s one that you’d expect most people to make. Fairly uncontroversial.

However, there is another bit of repetition & fragility in the code as it stands, and it’s not immediately obvious how to refactor this one:

The type of each key

The repetition is the cast-to-long (or whatever the type of the key is) at each access point. This repetition leads to the same issues as the literal strings: fragility, error-proneness & difficulty in changing the code.

It’s error-prone, because a wrong cast at any access point will never be caught at compile time, but will cause the code to fail at runtime. For example, there’s no compile-time guarantee that we’re casting to the same type when reading the value as the type we wrote to the key.

It’s fragile—because these errors are easy to introduce accidentally.

It’s hard to change—because changing any key’s type could require changing it at many, many places throughout the code.

Additionally, the type cast, although it is determined entirely by the key, is nevertheless written out explicitly at each access point, and so clutters the code.

The idea

So can we somehow encapsulate the string key name with the type of the variable…?

Perhaps something like:

class Key<T> { final String keyName; final Class<T> valueType; Key(String keyName, Class<T> valueType) { this.keyName = keyName; this.valueType = valueType; } }

We can then construct a mechanism for getting and setting these variables:
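One possible shape for that mechanism is sketched below. It puts get and put methods on the key itself; those method names, the Keys holder class and the USER_NAME key are assumptions made for this example, not necessarily the original design.

```java
import java.util.HashMap;
import java.util.Map;

final class Key<T> {
    final String keyName;
    final Class<T> valueType;

    Key(String keyName, Class<T> valueType) {
        this.keyName = keyName;
        this.valueType = valueType;
    }

    // Reads the value; Class.cast throws ClassCastException if the stored
    // value has the wrong type, and passes null through unchanged.
    T get(Map<String, Object> map) {
        return valueType.cast(map.get(keyName));
    }

    // Only values of the key's declared type are accepted by the compiler.
    void put(Map<String, Object> map, T value) {
        map.put(keyName, value);
    }
}

final class Keys {
    private Keys() {}

    static final Key<Long> SESSION_ID = new Key<>("sessionId", Long.class);
    static final Key<String> USER_NAME = new Key<>("userName", String.class);  // hypothetical
}

class TypedKeyDemo {
    static long roundTrip(long id) {
        Map<String, Object> map = new HashMap<>();
        Keys.SESSION_ID.put(map, id);               // no string literal
        long sessionId = Keys.SESSION_ID.get(map);  // no cast at the call site
        return sessionId;
    }
}
```

With this in place, a call such as Keys.SESSION_ID.put(map, "oops") does not compile, because the key only accepts a Long.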

If you ever needed to change the type of a keyed variable, for example, changing the session ID from a Long to a GUID, you’d change the declaration of SESSION_ID, and the compiler would point out all the places in the code that needed to change.

Alternatively

Another, and common, approach to achieving type-safety is to wrap the (untyped) map in a typed wrapper, and provide accessors for each of the values. For example:
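A rough sketch of such a wrapper, assuming a class called Session with two properties (the class name and the userName property are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

class Session {
    private final Map<String, Object> map;

    Session(Map<String, Object> map) {
        this.map = map;
    }

    // Each value gets its own typed accessor pair; the key name and the
    // cast appear exactly once, inside the wrapper.
    Long getSessionId() {
        return (Long) map.get("sessionId");
    }

    void setSessionId(long sessionId) {
        map.put("sessionId", sessionId);
    }

    String getUserName() {
        return (String) map.get("userName");
    }

    void setUserName(String userName) {
        map.put("userName", userName);
    }
}
```

This works well for a small, fixed set of keys, but every new key means another pair of accessors on the wrapper, which is awkward when the set of keys is open-ended.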

More adventures in strongly-typed database ID fields. A follow-up (of sorts) to a post from 2013.

In that previous post I described a way of adding strong typing to database ID references in C# code, with almost no runtime overhead, while remaining interoperable with existing code which passes database IDs as integers. This post presents a refinement which is more flexible, and produces less cluttered code.

Background

In a lot of database-heavy apps, at least the ones I’ve been involved in, you spend a lot of time passing database IDs around in the code. Usually these are integers (32- or 64-bit), but they could also be UUIDs or strings.

The trouble is that an integer representing a customer ID has the same static type as an integer representing a user ID, invoice line ID, product ID, or for that matter an integer representing a quantity. The compiler will not complain at you when you pass an integer representing a user ID to a function expecting an integer representing a customer ID—because they are all just undifferentiated integers.

So it’s an appealing idea to somehow introduce static type checking for database entity IDs. Of course, we should avoid bloating the code or introducing any runtime overhead and it should easily interoperate with whatever the native key type is for the database entities.

Ideally the scheme should even cope with composite primary keys, though in my experience composite primary keys are pretty rare (at least when using an ORM which doesn’t directly expose joining tables).

Previous approach

In ID: Type-safety in database code, I described a C# generic struct type, ID<>, for representing strongly-typed database IDs. It worked, but had the following shortcomings:

It was verbose: ID types look like ID<Customer> or ID<Invoice>, which is awkward to type and visually messy.

It was limited: It assumed that database IDs are always 32-bit integers. Different types of keys (for example, some tables with string keys and others with integers) cannot be mixed in a single project without creating multiple, differently-named ID classes.

New approach

Ideally key types would be named EntityName.Id, but how can we do that while keeping them as structs, and without requiring each entity to redefine its own Id struct?

The answer is to make it an inner type of a parameterised Entity class (parameterised by database key type and Entity subtype). Subclasses instantiate the parameter types, and get an Id struct type strongly typed with respect to their key and Entity type.

ID types now look like Customer.ID or Invoice.ID—which is visually less noisy, and puts the entity name first.

Entities have a ‘Key’ property which is of the underlying primary key type.

The downside is that all entity classes must inherit from the same Entity<> base class in order to be able to have strongly-typed ID types. However, since the entity ‘knows’ about its ID type, it can expose an Id field of that type.

It’s possible for many entities which share the same underlying key type (and key field name) to inherit from a common subclass of Entity, specialised to their key type.

You’ll notice that there is one abstract property on Entity: Key; this represents the entity’s (primary) key as its underlying type. Making this abstract allows subclasses to decide how they want to store all their fields—the Entity class itself does not store any state.

Entities which use the same key type/key name

I have a few line-of-business applications most of which use 32-bit integer entity IDs. Table key names are almost always ‘Id’, and they use Microsoft’s EntityFramework for database access. We can abstract the common bits of the 32-bit-ID database entities like this:

This says that all inheriting entities have an Int32 key field (and hence an ID type based on ints), represented in the database as a field called ‘Id’.

Accepting IDs or entities

As with my previous approach, it includes a mechanism for methods to receive as parameters objects which can be either an ID or a whole entity.

This is useful because frequently business logic already has an entity object, and it’s a useful optimisation for called methods not to have to retrieve the same entity again from the database.

We specify an interface to represent the union of an Entity type and its corresponding ID type, called EntityName.IDOrEntity. Entities and their IDs implement this interface, and an extension method on the interface, GetEntity(Func<ID, Entity>), provides a mechanism to either return the entity, or to look up the entity from its ID.

In other words, if you provide an entity, the method can use it directly; if you provide just an ID, it can look up the entity itself.
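The idea can be sketched in Java as a rough analogue. The post itself targets C# and uses an extension method on the interface; here an ordinary interface method stands in for it, and every name below (CustomerIdOrEntity, CustomerId, Customer, InvoiceService) is hypothetical.

```java
import java.util.function.Function;

// The "union" of a Customer and the ID of a Customer.
interface CustomerIdOrEntity {
    // Returns the entity, calling lookup only if we hold just an ID.
    Customer getEntity(Function<CustomerId, Customer> lookup);
}

final class CustomerId implements CustomerIdOrEntity {
    final int key;

    CustomerId(int key) {
        this.key = key;
    }

    @Override
    public Customer getEntity(Function<CustomerId, Customer> lookup) {
        return lookup.apply(this);  // only an ID: fetch the entity
    }
}

final class Customer implements CustomerIdOrEntity {
    final CustomerId id;
    final String name;

    Customer(CustomerId id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public Customer getEntity(Function<CustomerId, Customer> lookup) {
        return this;  // already the entity: no lookup needed
    }
}

class InvoiceService {
    // Accepts either a Customer or a CustomerId.
    static String customerName(CustomerIdOrEntity customer,
                               Function<CustomerId, Customer> lookup) {
        return customer.getEntity(lookup).name;
    }
}
```

A caller which already holds a Customer passes it straight in and no lookup happens; a caller with only a CustomerId passes that instead, and the lookup function is invoked once.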

Note: This article/proposal is an aggregation of several previous blog posts written over the past few years. The content is mostly the same, but has been edited for consistency and clarity.

Abstract

In the C# programming language, there are two kinds of data types: value types and reference types. Reference types can hold either pointers to objects, or the special value null, which is used to indicate a ‘missing’ reference. This feature of implicitly allowing nulls in reference types has some well-recognised problems and the .NET team are already exploring ways of enforcing non-nullable reference types; this paper explores one possible design.

I’ve been developing a mobile app with Ionic 2/Angular 2/Cordova/TypeScript, and given how many moving parts there are in there, and that Ionic 2 and Angular 2 were still beta software when I started—and that I was completely new to all of the technologies when I started—it’s gone relatively well.

However, it’s not a totally smooth development experience, and there are lots of little annoyances which occasionally turn it from pleasant to hideously frustrating.