Commentary

Motivated by applications that wish to exploit availability for low latency where possible, but not sacrifice correctness, Holt et al. propose a new way of writing applications with mixed consistency that is tied closely to the application itself. The approach, named Inconsistent, Performance-bound, Approximate (IPA) presents a methodology that present three main contributions: consistency safety, consistency types, and error-bounded consistency.

Consistency safety: Based on annotated ADTs, consistency safety prevents data that results from weakly consistent operations from flowing into strongly consistent operations. Simply put, if you have a Counter ADT and the read operation is annotated as a “inconsistent“ operation, you cannot use this value as the value for a “consistent” insert operation on a Set ADT.

Error-bounded Consistency: Based on the current system load, return values using weak consistency, while ensuring values stay within some numeric error bound. Simply put, we can ask the system to provide a value that is within a particular error, or the most accurate value that is available within a certain latency profile.

IPA uses ADTs and their methods as the target of their annotations. Policies come in two flavors: static, where a value can be classified as either Consistent(Strong) or Consistent(Weak); and dynamic, where values can be classified as LatencyBound(x) or ErrorTolerance(x%) to provide more granular, application specific consistent requirements.

Each method and ADT yields values of a particular type, based on the consistency policy. To use an example from the paper, a read operation on a Counter ADT under the Consistency(Strong) policy would yield a Consistent[Int], whereas a read operation on a Counter ADT under the Consistency(Weak) policy would yield a Inconsistent[Int]. Furthermore, under the LatencyBound(...) policy, it would return a Rushed[Int] and under the ErrorTolerance(...) policy, it would return a Interval[Int].

These types form a lattice for a parameterized result type T where ⊤ = Consistent[T], ⊥ = Inconsistent[T] with a number of data store and error-bounded specific types, incomparable, lie between ⊤ and ⊥: ⊥ ≤ UserDefined ≤ ⊤whereUserDefined = {Interval[T], Rushed[T], LocalQuorum[T], ...}.

Rushed types represent values returned within a latency bound. It can be seen as a disjoint sum of all possible consistency values and pattern matching can be used to take action based on the returned type. The authors propose and demonstrate an optimization to prevent load introduced by parallel requests, by using reservoir sampling to make requests based on the consistency model that’s historically more likely to return a value within the specified bound.

Interval types allow the system to return a value within a particular error tolerance. Using an example from the paper, only return size of a set within a particular error tolerance (think: 5% on a 100 element set that just had an element inserted, would return intervals containing 101, such as [95, 105], [100, 107]). These policies are enforced using an escrow system, where reservations to perform operations are allocated and transferred between different replicas, ensuring a value stays within a particular bound.

It would have been nice to see more on how the implementation works, as some of the mechanics remain unclear. For example, the system has no way to ensure that an annotated example actually returns values that match the specified consistency model.

If we consider the case of the Counter ADT again: it is possible to achieve strongly consistency with Cassandra (their backing data store of choice) by two mechanisms, with very different latency profiles:

Read and write using quorums (QUORUM);

Read against a single node (ONE), but ensure that writes go to all nodes (ALL).

Yet another example of where the methodology becomes unclear is around enforcing latency bounds. Given the user has the choice on how they aim to meet the consistency policy specified on a data type by selecting the appropriate data store operations, is the user responsible for specifying the code to run the requests for values under different consistency models in parallel? The previous concern would indicate so, but the demonstrating examples in the paper seems to show the user issuing single read operations against the store.

The IPA system is realized on top of the Scala programming language, with an associated middleware component for managing reservations, and can be run on top of any backing data store that supports per-object or per-operation consistency levels. The paper contains a thorough evaluation and links to related and future work.