Amazon, Web Services, and Sesame Street

Two years ago, Sesame Street's Cookie Monster learned a lesson: Some
foods -- like vegetables -- are "anytime" foods, while others --
including cookies -- are "sometimes" foods. Cookie Monster can't just
go ahead and eat cookies whenever he likes (which, presumably, would be
all the time) -- instead, he has to ask "is sometimes now?". I was
reminded of this by, of all things, Amazon's recent launch of its
SimpleDB database service.

Amazon's SimpleDB is conceptually very similar to Amazon's S3 storage
service: Where S3 provides a key -> file mapping, SimpleDB provides
a (key, attribute) -> value mapping. The main difference between
the two is in pricing: S3 is optimized for storage of large files, with
a cost of $0.15 per GB-month -- a factor of 10 less than the $1.50 per
GB-month which SimpleDB costs for storage -- while SimpleDB is
optimized for transactions, with requests priced based on CPU usage
($0.14 per CPU-hour), which will probably work out to far less than
S3's $1 per million reads and $1 per hundred thousand writes.
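
To put the request pricing in perspective, here is a rough
back-of-the-envelope sketch using only the prices quoted above. It
computes how much billed CPU time a single SimpleDB request could use
before it cost as much as the equivalent S3 request. The function and
variable names are mine, and the amount of CPU time Amazon actually
bills per request is not something I'm assuming here.

```python
# Prices quoted in the text above (USD).
SIMPLEDB_PER_CPU_HOUR = 0.14
S3_PER_READ = 1.00 / 1_000_000    # $1 per million reads
S3_PER_WRITE = 1.00 / 100_000     # $1 per hundred thousand writes


def break_even_cpu_ms(s3_price_per_request):
    """CPU milliseconds per request at which SimpleDB costs the same as S3."""
    price_per_cpu_ms = SIMPLEDB_PER_CPU_HOUR / (3600 * 1000)
    return s3_price_per_request / price_per_cpu_ms


read_break_even_ms = break_even_cpu_ms(S3_PER_READ)
write_break_even_ms = break_even_cpu_ms(S3_PER_WRITE)
print(f"reads:  {read_break_even_ms:.1f} ms of CPU per request")
print(f"writes: {write_break_even_ms:.1f} ms of CPU per request")
```

The thresholds come out at roughly 26 ms of CPU time per read and 257 ms
per write, so a SimpleDB request is the cheaper one whenever it is billed
for less CPU time than that.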

The most important similarity between S3 and SimpleDB -- and the fact
that will probably cause the most headaches to potential users -- is
a rather esoteric one: Neither S3 nor SimpleDB guarantees that it will
always be consistent. Instead, they both guarantee "eventual
consistency": You can update a file in S3 (or a value in SimpleDB) and
then get an old version of the file (or value) back when you try to
read it a moment later -- but "eventually" the updates will propagate
through Amazon's network and be visible.

This "eventual consistency" greatly limits what SimpleDB can be used
for. Don't try to use it to store any sort of accounting information,
for example: If you adjust an account balance twice in quick succession
(with each transaction being performed as a read-modify-write sequence)
there's a good chance that you'll lose the first transaction because it
won't have propagated by the time that you read data for the second
transaction. One approach to solving this problem would be to cache
values: If you've written a value to SimpleDB (or stored a file on S3)
then hold onto it for a while to give SimpleDB (or S3) a chance to
propagate the update.
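
The lost-update problem is easy to reproduce with a toy model. The
sketch below is not SimpleDB's API: EventuallyConsistentStore and
adjust_balance are hypothetical names, and "propagation" is simulated
by an explicit propagate() call rather than by anything Amazon does.

```python
class EventuallyConsistentStore:
    """Toy model: writes become visible to readers only after propagate()."""

    def __init__(self):
        self._visible = {}   # what readers currently see
        self._pending = []   # accepted writes that have not propagated yet

    def put(self, key, value):
        self._pending.append((key, value))

    def get(self, key, default=0):
        return self._visible.get(key, default)

    def propagate(self):
        # Apply pending writes in the order they were accepted.
        for key, value in self._pending:
            self._visible[key] = value
        self._pending.clear()


def adjust_balance(store, account, delta):
    # Read-modify-write: the read may return a stale balance.
    balance = store.get(account)
    store.put(account, balance + delta)


store = EventuallyConsistentStore()
store.put("alice", 100)
store.propagate()                     # opening balance is visible everywhere

adjust_balance(store, "alice", -30)   # writes 70, not yet visible
adjust_balance(store, "alice", -20)   # reads the stale 100, writes 80
store.propagate()

final_balance = store.get("alice")
print(final_balance)                  # 80 rather than 50: the -30 was lost
```

The second adjustment reads the old balance because the first write has
not propagated yet, so the first transaction silently disappears.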

Unfortunately, this solution is unworkable: There's no way to know how
long you'll need to hold onto recently-stored data for. A few seconds
is probably enough. A few hours is almost certainly enough.
But there's no way to know -- you can't even try reading the data back
from SimpleDB (or S3) to check if you get the "new" version, because
if Amazon's network partitions (due to hardware, software, human, or
backhoe error) it's possible that updates have propagated to some parts
of SimpleDB/S3 but not others. (This is an instance of a general
theorem, usually called the CAP theorem: It's impossible to build a
distributed system which is simultaneously partition tolerant,
available, and consistent. Amazon has, in its design, chosen
availability rather than consistency.)
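
To see why reading the data back proves nothing, consider a toy store
with two replicas. Everything here is hypothetical (the PartitionedStore
class, the replica numbering, the "partitioned" flag); it only models
the situation described above, not how Amazon's systems are built.

```python
class PartitionedStore:
    """Toy store with two replicas; a partition blocks replication to the second."""

    def __init__(self):
        self.replicas = [{}, {}]
        self.partitioned = False

    def put(self, key, value):
        self.replicas[0][key] = value
        if not self.partitioned:
            self.replicas[1][key] = value   # replication succeeds only when connected

    def get(self, key, replica):
        return self.replicas[replica].get(key)


store = PartitionedStore()
store.put("alice", 100)        # reaches both replicas
store.partitioned = True       # the backhoe arrives
store.put("alice", 70)         # reaches replica 0 only

read_back = store.get("alice", replica=0)    # 70: the update looks propagated
later_read = store.get("alice", replica=1)   # 100: another replica is still stale
print(read_back, later_read)
```

A client that happens to hit replica 0 when checking would drop its
cached copy, and could then be served the stale value by replica 1.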

In order to make SimpleDB and S3 (and indirectly, EC2) more usable,
Amazon should add a new API call -- one which I've been asking them to
add for the past 8 months. This API call would answer the following
question: "When is the most recent time T, such that all data which
was stored prior to time T is now guaranteed to be visible everywhere?"
This would allow users of SimpleDB and S3 to keep recently-stored data
cached for as long as necessary, but to flush those cache entries once
SimpleDB and S3 could guarantee that the data had propagated, thereby
keeping the cache size under control.
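
Here is a minimal sketch of how a client might use such a call. No such
API exists in SimpleDB or S3 as far as this post is concerned:
stable_time(), ToyStore, CachingClient and the integer logical clock
are all assumptions made up for illustration.

```python
class ToyStore:
    """Toy eventually consistent store with a logical clock (not a real AWS API)."""

    def __init__(self):
        self._now = 0
        self._visible = {}
        self._pending = []            # (timestamp, key, value), oldest first

    def put(self, key, value):
        self._now += 1
        self._pending.append((self._now, key, value))
        return self._now              # timestamp assigned to this write

    def get(self, key, default=None):
        return self._visible.get(key, default)

    def propagate(self):
        for _, key, value in self._pending:
            self._visible[key] = value
        self._pending.clear()

    def stable_time(self):
        # Hypothetical call: every write stamped strictly before T is visible.
        if self._pending:
            return self._pending[0][0]
        return self._now + 1


class CachingClient:
    """Serves recent writes from a local cache until the store says they are stable."""

    def __init__(self, store):
        self._store = store
        self._cache = {}              # key -> (timestamp, value)

    def put(self, key, value):
        stamp = self._store.put(key, value)
        self._cache[key] = (stamp, value)

    def get(self, key, default=None):
        if key in self._cache:
            return self._cache[key][1]
        return self._store.get(key, default)

    def flush_stable(self):
        # Keep only entries written at or after T; older ones are safe to drop.
        stable = self._store.stable_time()
        self._cache = {k: (s, v) for k, (s, v) in self._cache.items() if s >= stable}

    def cache_size(self):
        return len(self._cache)


store = ToyStore()
client = CachingClient(store)
client.put("alice", 100)
client.put("alice", client.get("alice") - 30)   # reads 100 from the cache
client.put("alice", client.get("alice") - 20)   # reads 70 from the cache

client.flush_stable()
size_before = client.cache_size()   # nothing has propagated, so the entry stays

store.propagate()
client.flush_stable()
size_after = client.cache_size()    # the write is now stable, so the cache is empty
final_balance = client.get("alice")
print(size_before, size_after, final_balance)
```

Both adjustments survive this time because reads of recently-written
keys come from the cache, and the cache only shrinks when the store
vouches for the data.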

In other words, just like Cookie Monster asks "is sometimes now?", we
need an API which will answer the question "is eventually now?"