Dan Pritchett's Design Principle

One of the underlying principles is assuming high latency, not low latency. An architecture that is tolerant of high latency will operate perfectly well with low latency, but the opposite is never true.

This post reviews The Challenges of Latency, an article about how asynchronous architectures can improve the quality of Web applications, published on the InfoQ site by eBay architect Dan Pritchett in May 2007. Dan's article is especially relevant today, given the high level of interest in adopting Web services and SOA approaches.

Dan explains why global, large-scale architectures need to address latency, and what architectural patterns can be applied to deal with it. He begins by invoking the second fallacy of distributed computing:

Latency.

The time it takes packets to flow from one part of the world to another. Everyone knows it exists. The second fallacy of distributed computing is "Latency is zero". Yet so many designs attempt to work around latency instead of embracing it. This is unfortunate and in fact doesn't work for large-scale systems. Why?

In any large-scale system, there are a few inescapable facts:

A broad customer base will demand reasonably consistent performance across the globe.

Business continuity will demand geographic diversity in your deployments.

The speed of light isn't going to change.

The last point can't be emphasized enough. The speed of light dictates that even if we could route packets at the speed of light (which seems unlikely), it would still take 30ms for a packet to traverse the Atlantic.

Latency hurts customer service

He emphasises the connection between Internet latency and customer service:

The Internet is part of the foundation of the global economy.

Companies need to reliably reach their customers regardless of where they may be located. Architectures that force close geographic proximity of the components limit the quality of service provided to geographically distributed customers. Response time will obviously degrade the further customers are from the servers, but so will reliability. Despite the tremendous increase in the reliability of traffic routing on the Internet, the further you are from a service, the more often that service will be effectively unavailable to you.

Latency tolerance

After spelling out the principle that I have highlighted above as today's Performance Wisdom, he goes on to make the case for introducing asynchronous interactions as the way to achieve latency tolerance.

The web has created an interaction style that is very problematic for building asynchronous systems. The web has trained the world to expect request/response interactions, with very low latency between the request and response. These expectations have driven architectures that are request/response oriented that lead to synchronous interactions from the browser to the data. This pattern does not lend itself to high latency connections.

Latency tolerance can only be achieved by introducing asynchronous interactions to your architecture.

The challenge becomes determining the components that can be decoupled and integrated via asynchronous interactions. An asynchronous architecture is far more than simply changing the request/response from a call to a series of messages though. The client is still expecting a response in a deterministic time. Asynchronous architectures shift from deterministic response time to probabilistic response time. Removing the determinism is uncomfortable for users and probably for your business units, but is critical to achieving true asynchronous interactions.
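The shift Dan describes, from a blocking call with a deterministic response to a message plus a later, probabilistic result, can be sketched with a simple in-process work queue. This is a minimal illustration, not anything from Dan's article; the queue, worker, and `submit` function are invented for the sketch:

```python
import queue
import threading
import time
import uuid

# Hypothetical work queue: the client enqueues a request and immediately
# receives an acknowledgment token instead of blocking on the result.
requests = queue.Queue()
results = {}  # completed work, keyed by request id

def worker():
    while True:
        req_id, payload = requests.get()
        time.sleep(0.05)            # simulate a slow, high-latency back end
        results[req_id] = payload.upper()  # the actual processing
        requests.task_done()

threading.Thread(target=worker, daemon=True).start()

def submit(payload):
    """Asynchronous submit: returns a token now, not the result."""
    req_id = str(uuid.uuid4())
    requests.put((req_id, payload))
    return req_id                   # acknowledgment time is deterministic

token = submit("hello")
# The result arrives at some later, probabilistic time:
requests.join()
print(results[token])               # HELLO
```

The point of the sketch is that `submit` returns in deterministic time, while the arrival of the result in `results` is only probabilistically bounded, which is exactly the trade Dan says users and business units must learn to accept.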

Dan accepts that not all Web application components can be designed to function asynchronously, but argues that designers can identify those use cases that do support asynchronous interactions. These arguments confirm my earlier conclusion that synchronous solutions must be combined with asynchronous designs in which the user must accept that unconfirmed changes will be reflected in the enterprise database(s) at a later time.

Data partitioning

He makes a very good point about the importance of designing data distribution from the outset:

You can decompose your applications into a collection of loosely coupled components; expose your services using asynchronous interfaces, and yet still leave yourself parked in one data center with little hope of escape. You have to tackle your persistence model early in your architecture and require that data can be split along both functional and scale vectors or you will not be able to distribute your architecture across geographies.

I recently read an article where the recommendation was to delay horizontal data spreading until you reach vertical scaling limits. I can think of few pieces of worse advice for an architect. Splitting data is more complex than splitting applications.

But if you don't do it at the beginning, applications will ultimately take short cuts that rely on a monolithic schema. These dependencies will be extremely difficult to break in the future.
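Dan's "functional and scale vectors" can be pictured as two levels of routing: first split data by function (which entity it belongs to), then by scale (which shard within that function holds the key). The following sketch is purely illustrative; the function names and shard counts are invented, and a real system would use a directory service rather than a hard-coded table:

```python
import hashlib

# Two-level routing: split data functionally (by entity type) and then
# by scale (hashing the key across N shards per function). The shard
# counts below are arbitrary for the sake of the example.
SHARDS = {"users": 4, "orders": 8}

def shard_for(function, key):
    """Map a (function, key) pair to a physical shard name."""
    n = SHARDS[function]  # scale vector: shard count within one function
    digest = hashlib.md5(key.encode()).hexdigest()
    return f"{function}-{int(digest, 16) % n}"

print(shard_for("users", "alice"))   # e.g. 'users-2'
print(shard_for("orders", "alice"))  # same key, different functional split
```

Baking a routing function like this into the architecture from day one is what prevents applications from taking the monolithic-schema short cuts Dan warns about: no component ever assumes all data lives in one place.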

ACID or BASE?

Noting that the traditional way to maintain database consistency across partitioned data requires ACID-compliant distributed transactions and two-phase commit protocols, Dan advocates a (cleverly-named) alternative to the ACID properties, the BASE approach to database consistency:

The problem with distributed transactions is they create synchronous couplings across the databases. Synchronous couplings are the antithesis of latency tolerant designs. The alternative to ACID is BASE:

Basically Available
Soft state
Eventually consistent

BASE frees the model from the need for synchronous couplings. Once you accept that state will not always be perfect and consistency occurs asynchronous to the initiating operation, you have a model that can tolerate latency.
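The BASE idea, that a write succeeds immediately while consistency "occurs asynchronous to the initiating operation", can be made concrete with a toy two-copy store. This is a deliberately simplified sketch (a dict-backed primary, a background replication thread, and an artificial delay), not a description of any real database:

```python
import threading
import time

# Toy eventually-consistent store: writes land on the primary at once,
# and a background thread propagates them to the replica after a delay.
primary, replica = {}, {}
replication_log = []
lock = threading.Lock()

def write(key, value):
    with lock:
        primary[key] = value              # basically available: succeeds now
        replication_log.append((key, value))  # soft state: not yet replicated

def replicate():
    while True:
        time.sleep(0.02)                  # simulated cross-datacenter latency
        with lock:
            while replication_log:
                k, v = replication_log.pop(0)
                replica[k] = v            # consistency occurs asynchronously

threading.Thread(target=replicate, daemon=True).start()

write("balance", 100)
print(replica.get("balance"))             # likely None: not yet consistent
time.sleep(0.1)
print(replica.get("balance"))             # 100: eventually consistent
```

The window in which the replica returns a stale (or missing) value is exactly the imperfection BASE asks you to accept in exchange for removing the synchronous coupling between the two copies.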

Another article worth reading, Web-Services Transactions by Doug Kaye, advances similar arguments without using the BASE terminology.

References: business-driven or event-driven architectures

While these articles present, at a high-level, a convincing case for asynchronous architectures, many others have elaborated on the implementation details. Here are five examples of more detailed treatments, in approximately descending order of generality:

Event-Driven Architecture Overview, a very detailed post by Brenda Michelson in her blog, Elemental Links.
[A blog response from Doug McClure exemplifies the gulf between the concerns of business/application architects and those of IT/network management, despite all the talk of "alignment" these days. As I wrote in my book, "Middleware is the reason why network specialists and application programmers cannot communicate!" (Guideline 15.3, p469).]

Next: the CAP theorem

Dan points out that adopting the BASE approach to consistency forces you to understand a very important principle, known as The CAP Theorem. This theorem was first propounded in a 1998 presentation -- Lessons from Internet Services: ACID vs. BASE -- by Dr. Eric Brewer of Inktomi, now a professor at UC Berkeley.

Of course there are situations where data needs to be consistent at the end of an operation. The CAP Theorem is a useful tool for determining what data to partition and what data must conform to ACID.

The CAP Theorem states that when designing distributed databases you must consider three properties: Consistency, Availability, and Partition tolerance. You can have at most two of the three for any data model.

Organizing your data model around CAP allows you to make the appropriate decisions with regards to consistency and latency.
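The decision CAP forces during a network partition can be caricatured in a few lines. In this sketch (entirely invented for illustration; the `Replica` class and its methods are not from any real system), a "CP" read refuses to answer when it cannot confirm consistency with its peer, while an "AP" read always answers from local, possibly stale, state:

```python
# Toy illustration of the CAP trade-off during a network partition.
class Replica:
    def __init__(self):
        self.data = {}
        self.peer_reachable = True   # flips to False during a partition

    def read_cp(self, key):
        """Choose consistency: refuse to answer if the peer is unreachable."""
        if not self.peer_reachable:
            raise RuntimeError("unavailable: cannot confirm consistency")
        return self.data.get(key)

    def read_ap(self, key):
        """Choose availability: always answer, possibly with stale data."""
        return self.data.get(key)

r = Replica()
r.data["x"] = 1
r.peer_reachable = False             # the partition happens

print(r.read_ap("x"))                # 1 (available, possibly stale)
try:
    r.read_cp("x")
except RuntimeError as e:
    print(e)                         # consistency chosen over availability
```

Deciding, per data item, which of these two read behaviours is acceptable is the "organizing your data model around CAP" that Dan recommends.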

Because The CAP Theorem plays a crucial role in the design of large scalable systems using asynchronous architectures, I will be devoting my next post to it.

This series of posts contains some material first published in High-Performance Client/Server. My 1998 book is out of print now, and contains some outdated examples and references. But most of the discussions of performance principles are timeless, and you can pick up a used copy for about $3.00 at Amazon.