Introducing Amazon Aurora

When you think of AWS, the first thing that comes to mind is scalability, high availability, and performance. AWS is now bringing the same features to a relational database by offering Aurora through Amazon RDS. As announced at AWS re:Invent 2014, Amazon Aurora is a fully managed relational database engine that is feature compatible with MySQL 5.6. Though it is currently in preview, it promises to give you the performance and availability of high end commercially available databases with built-in scalability without the administrative overhead.

To create Amazon Aurora, AWS has taken a service-oriented approach, making individual layers such as storage, logging and caching scale out while keeping the SQL and transaction layers using the core MySQL engine.

So, let’s talk about storage. Aurora can automatically add storage volume in 10 GB chunks up to 64 TB. This eliminates the need to plan for the data growth upfront and manually intervene when storage needs to be added. This feature is described by AWS to happen seamlessly without any downtime or performance hit.

Amazon Aurora’s storage volume is automatically replicated across three AWS Availability Zones (AZ) with two copies of data in each AZ. Aurora uses quorum writes and reads across these copies. It can handle the loss of two copies of data without affecting database write availability and up to 3 copies without affecting read availability. This SSD powered multi-tenant 10 GB storage chunks allows for concurrent access and reduces hot spots making it fault tolerant. It is also self-healing as it continuously scans data blocks for errors and repairs them automatically.

Unlike in a traditional database where it has to replay the redo log since last check point, which is single threaded in MySQL, Aurora’s underlying storage replays redo logs on-demand, in parallel, distributed and asynchronous mode. This allows you to recover from a crash almost instantaneously.

Aurora database can have up to 15 Aurora replicas. Since master and replica share the same underlying storage, you can failover with no data loss. Also, there’s very little load on the master since there’s no log replay, resulting in minimal replica lag of approximately 10 to 20 milliseconds. The cache lives outside of the database process which allows it to survive during a database restart. As a result, operations can be resumed much faster as there is no need to warm the cache. The backup is automatic, incremental and continuous. This enables point-in-time recovery up to the last five minutes.

A new added feature of Amazon Aurora is its ability to simulate failure of node, disk or networking components using SQL commands. This allows you to the high availability and scaling features of your application.

According to the Aurora FAQ, it is five times faster than a stock version of MySQL. Using Sysbench “Aurora delivers over 500,000 selects/sec and 100,000 updates/sec running the same benchmark on the same hardware.”

While Aurora may cost slightly more than the RDS for MySQL per instance based on the US-EAST-1, the features may justify it, and you only pay for the storage consumed at a rate of $0.10/GB per month and IOs at a rate of $0.20 per million requests.

It’s exciting to see the challenges of scaling the relational database being addressed by AWS. To learn more about Amazon Aurora, you can sign up for a preview at http://aws.amazon.com/rds/aurora/preview/.