statebox, an eventually consistent data model for Erlang (and Riak)

A few weeks ago when I was on call at work I was chasing down a bug in
friendwad [1] and I realized that we had made a big mistake. The data
model was broken: it could only work with transactions, but we were using
Riak. The original prototype was built with Mnesia, which would've
been able to satisfy this constraint, but when it was refactored for
an eventually consistent data model it just wasn't correct anymore.
Given just a little bit of concurrency, such as a popular user, it would
produce inconsistent data. Soon after this discovery, I found another
service built with the same invalid premise and I also realized
that a general solution to this problem would allow us to migrate several
applications from Mnesia to Riak.

When you choose an eventually consistent data store you're
prioritizing availability and partition tolerance over consistency,
but this doesn't mean your application has to be inconsistent. What it
does mean is that you have to move your conflict resolution from
writes to reads. Riak does almost all of the hard work for you [2],
but if it's not acceptable to discard some writes then you will have to
set allow_mult to true on your bucket(s) and handle siblings
[3] from your application. In some cases, this might be trivial.
For example, if you have a set and only support adding to that set,
then a merge operation is just the union of those two sets.
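To make the add-only set case concrete, here is a minimal sketch using only the standard ordsets module. The module and function names (set_merge_sketch, merge_siblings/1) are made up for illustration and are not part of Riak or statebox.

```erlang
-module(set_merge_sketch).
-export([merge_siblings/1]).

%% Hypothetical helper: merge any number of sibling values, each an
%% ordsets set, by taking their union. Union is commutative,
%% associative, and idempotent, so it does not matter in which order
%% (or how many times) siblings get merged.
merge_siblings(Siblings) ->
    ordsets:union(Siblings).
```

For example, merging the siblings [alice, bob] and [bob, charlie] gives [alice, bob, charlie]. This only works because nothing is ever removed; once you support removal, a plain union can resurrect deleted members.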

statebox is my solution to this problem. It bundles the value with
repeatable operations [4] and provides a means to automatically
resolve conflicts. Usage of statebox feels much more declarative
than imperative. Instead of modifying the values yourself, you
provide statebox with a list of operations and it will apply them
to create a new statebox. This is necessary because it may apply
these operations again at a later time when resolving a conflict between
siblings on read.
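The idea of bundling a value with its operations can be sketched in a few lines. This is a toy model written for this explanation, not the statebox implementation: the module name box_sketch, the record layout, and the explicit timestamp argument are all assumptions. An operation here is a {Fun, Args} pair that gets the current value appended as its last argument.

```erlang
-module(box_sketch).
-export([new/1, modify/3, merge/1, value/1]).

%% Toy "box" (hypothetical, not the real statebox record): the current
%% value plus a timestamp-sorted queue of the operations applied to it.
-record(box, {value, queue = []}).

new(Value) ->
    #box{value = Value}.

value(#box{value = V}) ->
    V.

%% Apply Op to the value and remember it in the queue under timestamp T.
modify(T, Op, #box{value = V, queue = Q}) ->
    #box{value = apply_op(Op, V),
         queue = lists:umerge([{T, Op}], Q)}.

%% Resolve siblings: take the first box's value and replay the union of
%% every sibling's queue on top of it, in timestamp order. This is only
%% correct if the operations are repeatable, because some of them have
%% already been applied to that value.
merge([#box{value = V} | _] = Boxes) ->
    Queue = lists:umerge([Q || #box{queue = Q} <- Boxes]),
    Replayed = lists:foldl(fun({_T, Op}, Acc) -> apply_op(Op, Acc) end,
                           V, Queue),
    #box{value = Replayed, queue = Queue}.

%% An operation is {Fun, Args}; the value is passed as the last argument.
apply_op({F, Args}, V) ->
    apply(F, Args ++ [V]).
```

With the operation {fun ordsets:add_element/2, [bob]} applied to one copy of an empty set and {fun ordsets:add_element/2, [charlie]} applied to another, merging the two boxes gives [bob, charlie], even though bob gets added twice along the way.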

Design goals (and non-goals):

The intended use case is for data structures such as dictionaries
and sets

Direct support for counters is not required

Applications must be able to control the growth of a statebox so that
it does not grow indefinitely over time

The implementation need not support platforms other than Erlang and
the data does not need to be portable to nodes that do not share
code

It should be easy to use with Riak, but not be dependent on it
(clear separation of concerns)

Must be comprehensively tested; mistakes at this level are very expensive

It is ok to require that the servers' clocks are in sync with NTP
(but it should be aware that timestamps can be in the future or past)
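The growth-control goal is easier to picture with code. Below is a rough sketch of two ways an application could bound a queue of timestamped operations: by age and by count. The module queue_trim_sketch and both function signatures are hypothetical, and the queue is assumed to be a list of {Timestamp, Op} tuples sorted by timestamp; statebox itself may expose this differently.

```erlang
-module(queue_trim_sketch).
-export([expire/3, truncate/2]).

%% Hypothetical: drop queued operations older than MaxAge relative to
%% Now. Now, MaxAge, and the timestamps all use the same unit.
%% Entries with timestamps in the future are kept.
expire(Now, MaxAge, Queue) ->
    [E || {T, _Op} = E <- Queue, T >= Now - MaxAge].

%% Hypothetical: keep only the N most recent operations of a queue that
%% is sorted by timestamp (oldest first).
truncate(N, Queue) ->
    Len = length(Queue),
    case Len > N of
        true -> lists:nthtail(Len - N, Queue);
        false -> Queue
    end.
```

The trade-off in this sketch is that a dropped operation can no longer be replayed, so a sibling that never saw it will not pick it up in a later merge.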

Here's what typical statebox usage looks like for a trivial
application (note: Riak metadata is not merged [5]). In this case we
are storing an orddict in our statebox, and this orddict has the keys
following and followers.

Realistically, these operations may happen with some concurrency and
cause conflict. For demonstration purposes we will have AB happen
concurrently with BA and the conflict will be resolved during AC.
For simplicity, I'll only show the operations that modify the key for
alice.

Uh oh, there are two stateboxes in Riak now... so
statebox_riak:from_values([AB, BA]) is called. This will apply
all of the operations from both of the event queues to one of the
current values and we will get a single statebox as a result.
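As a rough simulation of that resolution step, here is the AB / BA / AC sequence for alice's key using only orddict, ordsets, and lists from the standard library. It does not call statebox or Riak; the module follow_sketch, the {Value, Queue} pairs standing in for stateboxes, and the integer timestamps are all assumptions made for the demonstration.

```erlang
-module(follow_sketch).
-export([demo/0]).

%% Union Members into the ordset stored under Key in an orddict.
%% Repeating this operation leaves the result unchanged.
union(Key, Members, Dict) ->
    New = ordsets:from_list(Members),
    orddict:update(Key,
                   fun(Old) -> ordsets:union(Old, New) end,
                   New,
                   Dict).

%% Replay a timestamp-sorted queue of {T, {Key, Members}} operations.
replay(Queue, Dict) ->
    lists:foldl(fun({_T, {Key, Members}}, Acc) ->
                        union(Key, Members, Acc)
                end, Dict, Queue).

demo() ->
    %% Two concurrent writes to alice's key, both starting from empty.
    OpAB = {1, {following, [bob]}},   % AB: alice follows bob
    OpBA = {2, {followers, [bob]}},   % BA: bob follows alice
    {ValueAB, QueueAB} = {replay([OpAB], orddict:new()), [OpAB]},
    {_ValueBA, QueueBA} = {replay([OpBA], orddict:new()), [OpBA]},
    %% Resolve on read: replay both queues onto one sibling's value.
    Merged = replay(lists:umerge(QueueAB, QueueBA), ValueAB),
    %% AC (alice follows charlie) is then applied to the merged value.
    union(following, [charlie], Merged).
```

Calling follow_sketch:demo() returns [{followers,[bob]},{following,[bob,charlie]}]. OpAB is applied twice to the AB sibling's value during the merge, which is harmless because a set union is repeatable.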

Well, that's about it! alice is following both bob and
charlie despite the concurrency. No locks were harmed during this
experiment, and we've arrived at eventual consistency by using
statebox_riak, statebox, and Riak without having to write any
conflict resolution code of our own.

[1] friendwad manages our social graph for Mochi Social and MochiGames.
It is also evidence that naming things is a hard problem in
computer science.

[5] The default conflict resolution algorithm in statebox_riak
chooses metadata from one sibling arbitrarily. If you use
metadata, you'll need to come up with a clever way to merge it
(such as putting it in the statebox and specifying a custom
resolve_metadatas in your call to statebox_riak:new/1).