Amazon’s DynamoDB: What it is, and why it matters

Guest Commentary: Amazon Web Services this morning announced a new service called DynamoDB — rolling out, in public form, a cloud database technology already used by several of Amazon’s internal teams. So what problem does it aim to solve? And is it worth a look?

First, some background: As many developers can attest, the job of configuring, provisioning and managing traditional (SQL-based) relational databases can be complex and fraught with numerous challenges. Whether it’s MySQL, Microsoft SQL Server, Oracle or PostgreSQL, developers are often required to learn a great deal about database architecture if they’re to successfully configure a solution that’s efficient and scalable. They also have to worry about things like backups and replication.

The goal of Amazon’s DynamoDB is to make database storage and retrieval not only extremely fast but, perhaps more importantly, maintenance-free. Amazon does this by dramatically simplifying the storage model and automatically handling things like replication and scaling. The company revealed that DynamoDB achieves much of its speed through extensive use of solid-state drives (SSDs), which are much faster than mechanical, spinning hard drives.

Whereas more traditional relational SQL (Structured Query Language) database solutions require detailed knowledge of table design and index structure, the NoSQL movement, which DynamoDB is a part of, embraces the notion of a key-value storage model. Programmers new to NoSQL/DynamoDB can quickly wrap their heads around the model by equating it with a simple hash table or dictionary. Instead of thinking in terms of columns and indexes, NoSQL/DynamoDB reads look more like requests for data associated with simple keys. Writes behave in a similar manner.
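To make the hash-table comparison concrete, here is a minimal sketch in Python. It is only an in-memory stand-in for the idea, not the DynamoDB API: the `store`, `put` and `get` names and the sample keys are made up for illustration.

```python
# Toy in-memory stand-in for a key-value store.
# Assumption: names and data here are illustrative, not DynamoDB's API.
store = {}

def put(key, value):
    """Write: associate a value with a key, replacing any previous value."""
    store[key] = value

def get(key):
    """Read: return the value stored under the key, or None if absent."""
    return store.get(key)

# Writes and reads are addressed by key alone; no columns or indexes involved.
put("user:42", {"name": "Ada", "plan": "free"})
put("user:43", {"name": "Grace"})

profile = get("user:42")   # the dictionary stored under "user:42"
missing = get("user:99")   # None, because nothing was stored under that key
```

The point of the sketch is the access pattern: every read and write names a key, and the store does not care what shape the value takes.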

In DynamoDB, data are stored in tables that are quickly created through Amazon’s AWS dashboard or programmatically using the AWS SDK, which supports Java, .NET and PHP. Unlike traditional relational databases, DynamoDB’s tables simply have names and a description of how their keys will behave. Developers don’t have to create columns or commit to a particular design schema at the outset of their project.
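A rough way to picture such a table is sketched below in Python. The `Table` class, its method names and the sample attributes are hypothetical and exist only to illustrate the idea of "a name plus a key description"; they are not taken from the AWS SDK.

```python
class Table:
    """Hypothetical table: just a name and the attribute that acts as the key.

    Assumption: this class is an illustration, not part of any AWS SDK.
    """

    def __init__(self, name, key_attribute):
        self.name = name
        self.key_attribute = key_attribute
        self._items = {}

    def put_item(self, item):
        # The only structural rule: every item must carry the key attribute.
        if self.key_attribute not in item:
            raise ValueError(f"item is missing key {self.key_attribute!r}")
        self._items[item[self.key_attribute]] = dict(item)

    def get_item(self, key):
        item = self._items.get(key)
        return dict(item) if item is not None else None


# No columns are declared up front; each item brings its own attributes.
photos = Table("Photos", key_attribute="photo_id")
photos.put_item({"photo_id": "p1", "owner": "don", "width": 1024})
photos.put_item({"photo_id": "p2", "caption": "sunset"})
```

Note that the two items share nothing except the key attribute, which is the sense in which no design schema has to be fixed at the start.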

My takeaway: Developers should be excited about Amazon’s entry into the NoSQL space because it offers, perhaps for the first time, a maintenance-free database solution that’s capable of scaling to support applications of almost any size. While there are rabid supporters of both NoSQL and the more traditional relational SQL-based database models, Amazon’s DynamoDB product makes it easy to work with both, simultaneously.

With Amazon handling all of the storage, backup and replication, there’s no reason why developers wouldn’t want to explore and, eventually, embrace the solution.

During a 30-minute webinar, Amazon CTO Werner Vogels and DynamoDB GM Swami Sivasubramanian discussed the new service, which is available starting today in beta form through the Amazon AWS platform. The entire webinar is now available online.

SmugMug CEO Don MacAskill, one of Amazon’s largest AWS customers, was on hand from the Bay Area to describe his company’s early experiences using Amazon’s new service. SmugMug currently stores billions of photos and video objects using a variety of Amazon AWS services, including S3 for storage and EC2 for compute resources. MacAskill said that they “absolutely love how simple the [DynamoDB] model is and the API” and are “thrilled with the performance.”

Comments

http://omniti.com/ Robert Treat

While I do think this announcement and service are significant, my takeaway was a bit less bullish. Amazon already trumpets two other maintenance-free database solutions, as does Google, and services like Heroku are also getting into that game, so this isn’t exactly revolutionary; what we really need to see are some real-world examples and comparisons to see if this service will perform better than some of the other offerings.

I think this will probably have more impact on folks looking to implement things like Cassandra or Riak, especially those doing so on top of cloud-based servers. If you are already working on horizontal scale-out, and already subject to Amazon’s service levels, this becomes very interesting. Granted, it doesn’t look as feature-full as something like Riak, but the trade-off of features for eliminating the ops overhead of a distributed system could well be worth it.

Of course, I’m discounting the difficulty that a lot of developers have when starting to develop on these platforms. If you are used to developing against systems where you have ACID guarantees or even simple things like unique indexes, there is a learning curve to these new systems (and in some cases those guarantees make these solutions a bad fit). Not saying it is a show stopper, just another cost that folks should factor in when investigating a new platform.

Peter

Be warned that DynamoDB does NOT seamlessly scale. It outright rejects any workload that it receives above some prepaid throughput capacity. You have to pre-purchase capacity at a ‘worst case’ cost and then be prepared for your application to fail if it receives a load higher than you paid for. Combine that with the extremely limited querying API, and it all translates to an oddball cloud service for oddball situations.
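The behavior described in this comment can be pictured with a small simulation. The Python sketch below is a toy model built on assumptions: the `ProvisionedTable` class, the `ThroughputExceeded` exception, the one-second accounting window and the capacity of 5 writes per second are all invented for illustration and do not reflect how the service actually meters or reports capacity.

```python
class ThroughputExceeded(Exception):
    """Raised by the toy table when the prepaid write capacity is used up."""


class ProvisionedTable:
    """Toy model of a table with a fixed, prepaid write capacity per second.

    Assumption: the per-second window and all names are hypothetical.
    """

    def __init__(self, writes_per_second):
        self.writes_per_second = writes_per_second
        self._window = None   # the whole second currently being counted
        self._used = 0        # writes accepted in that second
        self._items = {}

    def put(self, key, value, now):
        window = int(now)
        if window != self._window:
            # A new second has started, so the budget resets.
            self._window = window
            self._used = 0
        if self._used >= self.writes_per_second:
            # Over budget: the write is rejected rather than queued.
            raise ThroughputExceeded(f"capacity of {self.writes_per_second}/s exceeded")
        self._used += 1
        self._items[key] = value


table = ProvisionedTable(writes_per_second=5)
accepted = rejected = 0
for i in range(8):  # a burst of 8 writes within the same second
    try:
        table.put(f"k{i}", i, now=100.0)
        accepted += 1
    except ThroughputExceeded:
        rejected += 1
# Under these assumptions: accepted == 5, rejected == 3
```

In this model the application has to catch the rejection and decide what to do, for example retry later or buy more capacity, which is the extra cost the comment is pointing at.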