Rails Enum is a Sharp Knife

When we say something "is a sharp knife," we mean that it is eminently valuable if used correctly, and exceptionally hazardous when wielded incorrectly or without effective training. It's similar to the Principle of Least Astonishment, but with a harmful outcome rather than merely an astonishing one. We, at Bendyworks, assert that the Rails Enum functionality is exactly (exacto-ly?) that: a very sharp knife.

Why Use Enum in the First Place?

We've frequently run into situations where having an enum in a database would be great. State indicators like "active," "locked," and "archived" for a conversation are a great example. In such a case, we'd probably want to filter by this state, so we're gonna want an index. The first question to come up is: should the "active," "locked," and "archived" values be in the code or the database?

How Should We Store Enums?

Unless we're allowing users to build dynamic workflows with custom states, the obvious choice (to me) is to put the values in the code. The alternative is a lookup table in the database, and my primary objection centers on how difficult it is to insert the values into that table idempotently. In production, this is needlessly unnerving, and in testing, it's unnecessarily complex. You're also introducing complexity into many queries by adding an extra join.

Why Not Native Enums?

Let's say that we wanted to use the db-native enums. In PostgreSQL, that means we create a type with a list of acceptable values, then effectively treat it as a string (even though it's actually stored in 4 bytes, just like an integer).

This seems all well and good until you realize that you have a human-understandable string representation in both your code and the database. If, for example, you wanted to change "archived" to "closed," you ought to change it in both places (the database probably through a migration), lest you introduce cognitive dissonance into your project, where "archived" and "closed" mean the same thing but can't be used interchangeably in all places.

Just Store Them as Integers

Informed by the above section, we'd prefer to have just one canonical location for abstracting from human-understandable values into integers: our application code. We could write our own simple abstraction layer to do that, or we could just use ActiveRecord::Enum because that's literally all it is!
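To make "literally all it is" concrete, here is a minimal sketch of what such a hand-rolled layer could look like in plain Ruby. The module name, method names, and integers are hypothetical and chosen only for illustration; this is not how ActiveRecord::Enum is implemented internally.

```ruby
# Hypothetical hand-rolled mapping; names and integers are illustrative only.
module ConversationStatus
  VALUES = { active: 0, archived: 1 }.freeze

  # Symbol used in application code -> integer stored in the database column.
  def self.to_db(name)
    VALUES.fetch(name.to_sym)
  end

  # Integer read from the database -> symbol used in application code.
  def self.from_db(integer)
    VALUES.key(integer) || raise(KeyError, "unknown status integer: #{integer}")
  end
end
```

The application only ever talks in symbols, and the integers live in exactly one place.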

Wait! No! Not Like That!

The documentation for ActiveRecord::Enum kicks off with an example that declares a Conversation model's status enum as a bare array, [ :active, :archived ].
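The declaration in question, as best we can reconstruct it from the Rails docs of the time, appears in the comment below (the exact syntax varies by Rails version). The executable part is a plain-Ruby stand-in, with constant names of our own invention, showing the mapping that the array form implies.

```ruby
# Array form as it appeared in the Rails docs (reconstructed; syntax varies by version):
#
#   class Conversation < ActiveRecord::Base
#     enum status: [ :active, :archived ]
#   end
#
# Plain-Ruby stand-in: each name is mapped to its position in the array.
IMPLICIT_STATUSES = [:active, :archived].freeze
IMPLICIT_STATUS_MAP = IMPLICIT_STATUSES.each_with_index.to_h
# IMPLICIT_STATUS_MAP maps :active to 0 and :archived to 1.
```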

This seems like a reasonable and concise way of declaring an enum until you ask the question:

"What if I insert :locked in between :active and :archived?"

The answer?

"Unless you write a migration to convert every 1 into 2 (or is it 2 into 3?) alongside this change, AND make sure to run it, all of your archived conversations will switch over to being locked instead."
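A small plain-Ruby sketch can settle the "1 into 2 or 2 into 3" question. It assumes the array form maps names to zero-based positions; the helper and constant names are hypothetical.

```ruby
# Hypothetical helper mimicking the positional mapping of the array form.
def positional_mapping(names)
  names.each_with_index.to_h
end

OLD_MAP = positional_mapping([:active, :archived])
NEW_MAP = positional_mapping([:active, :locked, :archived])

# A row written as :archived under the old code holds this integer...
STORED_FOR_ARCHIVED = OLD_MAP.fetch(:archived)    # 1
# ...which the new code reads back as a different state.
REINTERPRETED = NEW_MAP.key(STORED_FOR_ARCHIVED)  # :locked
# The migration would need to move archived rows to this value.
NEW_ARCHIVED_VALUE = NEW_MAP.fetch(:archived)     # 2
```

So under these assumptions it is 1 into 2: archived rows hold 1, and the new code expects 2.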

You can avoid this particular problem by treating enums as append-only, which is indeed what the documentation suggests.
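If you do go the append-only route, the rule is easy to check mechanically. The helper below is a hypothetical sketch, not part of Rails; it assumes you keep a copy of the previously deployed list somewhere (a test fixture, say) to compare against.

```ruby
# Hypothetical guard: true only when every old name keeps its old position,
# i.e. new names were added strictly at the end.
def append_only?(old_names, new_names)
  new_names.first(old_names.length) == old_names
end

DEPLOYED = [:active, :archived].freeze

SAFE_CHANGE   = append_only?(DEPLOYED, [:active, :archived, :locked])  # true
UNSAFE_CHANGE = append_only?(DEPLOYED, [:active, :locked, :archived])  # false
```

This only catches reordering and insertion. Nothing in the declaration itself enforces the rule, so it still depends on someone remembering to run the check.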

This is what we like to call a "leaky abstraction." That is, in order to use an abstraction (name→integer), you must know its underlying implementation to avoid problems. This is a particularly malevolent leaky abstraction, as it could easily lead to some serious problems. For example, what if you had an enum of roles [ :admin, :author, :editor ] and you decided to insert :anonymous at the beginning? Then you probably just turned every Author into an Admin, and every Admin into an Anonymous user. Oops!

Another way to think of this problem is that it introduces a very high level of connascence, namely connascence of value. Not only is this the second worst form of connascence, it also exists between your code and your database, exhibiting a troublesome level of locality which further amplifies the severity.
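One way to break that coupling is the hash form, which the Rails documentation also describes: spell out the integer for every name. The Rails declaration is shown in the comment (syntax varies by version); the executable part is a plain-Ruby stand-in with hypothetical constant names, showing that a later addition cannot disturb existing rows.

```ruby
# Hash form with explicit integers (syntax varies by Rails version):
#
#   class Conversation < ActiveRecord::Base
#     enum status: { active: 0, archived: 1 }
#   end
#
# Plain-Ruby stand-in for the mapping before and after adding :locked.
EXPLICIT_BEFORE = { active: 0, archived: 1 }.freeze
# :locked takes the next unused integer, regardless of where it is listed.
EXPLICIT_AFTER = { active: 0, locked: 2, archived: 1 }.freeze

# A row stored as :archived under the old mapping still reads as :archived.
STILL_ARCHIVED = EXPLICIT_AFTER.key(EXPLICIT_BEFORE.fetch(:archived))

# The one rule left to follow: never reuse an integer.
NO_DUPLICATES = EXPLICIT_AFTER.values.uniq.length == EXPLICIT_AFTER.length
```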

Is spelling out every integer in an explicit hash pretty? Not particularly, but it's safe. And when you're handling a Sharp Knife, you want to be safe. In addition, it signifies to the next reader of this code (perhaps your future self?) that, as long as you follow the established pattern, you'll continue to be safe.