Ever since Amazon announced AWS EC2 C5D instances, we, as AWS practitioners, have been digging deep into their technical details. We found some interesting things, and we’d like to share our findings for the greater good.

C5D Intro: AWS EC2 Instances with Local NVMe Storage

For those unaware, AWS EC2 C5D instances are compute-optimized instances with high-performance local NVMe block storage. Introduced in May 2018, they are said to be ideal for applications that need access to high-speed, low-latency local storage. This has made them a favorite instance type in the media vertical.

Read on for the details, or skip directly to the Tests section of this post to view the benchmarking observations.

Highlights:

#3 | They combine the CPU power of the C5 family with the disk performance (IOPS) of the I3 family, making them ideal for database workloads that are heavier on CPU, such as block compression.

#4 | 25% to 50% improvement in price-performance over C4 instances.

#5 | While input and output files are typically stored in Amazon S3, the intermediate files are expendable.

#6 | Data on the local storage is lost when the instance is stopped or terminated, making it suitable for intermediate files, not long-term storage.

#7 | Each local NVMe device is hardware encrypted using the XTS-AES-256 block cipher and a unique key. Each key is destroyed when the instance is stopped or terminated.

#8 | The local storage will show up as one or more devices (/dev/nvme*1 on Linux) after the guest operating system has booted.

#9 | Cheaper (for now) than a regular C5 Large, but only as Spot instances. Spot requests for C5D Large start at $0.0324 per hour across all regions, whereas C5 Large on-demand costs $0.085 per hour. This is likely due only to slow uptake. As C5D instances become more popular, the prices might increase.

Here’s detailed pricing for all variations of C5D instances in N. Virginia.
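To put the Spot and on-demand figures quoted in the highlights into perspective, here is a minimal sketch of the arithmetic. The two hourly prices come from this post. The 730-hours-per-month figure and the function names are our own assumptions for illustration, and Spot prices change over time.

```python
# Hourly prices quoted in this post (USD). Spot prices fluctuate,
# so treat these as a snapshot rather than current values.
C5D_LARGE_SPOT = 0.0324
C5_LARGE_ON_DEMAND = 0.085


def savings_percent(cheaper: float, baseline: float) -> float:
    """Return how much cheaper `cheaper` is than `baseline`, in percent."""
    return (baseline - cheaper) / baseline * 100


def monthly_cost(hourly: float, hours: int = 730) -> float:
    """Approximate monthly cost, assuming about 730 hours in a month."""
    return hourly * hours


if __name__ == "__main__":
    saving = savings_percent(C5D_LARGE_SPOT, C5_LARGE_ON_DEMAND)
    print(f"C5D Large Spot vs C5 Large on-demand: {saving:.1f}% cheaper")
    print(f"C5D Large Spot per month:      ${monthly_cost(C5D_LARGE_SPOT):.2f}")
    print(f"C5 Large on-demand per month:  ${monthly_cost(C5_LARGE_ON_DEMAND):.2f}")
```

Note that this compares a Spot price with an on-demand price, so the gap also reflects the usual Spot discount and not just the instance type.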

Applications:

Batch and log processing

Apps that rely heavily on caches and scratch files

Image manipulation

Distributed and/or real-time analytics

High-performance computing (HPC)

Ad serving

Highly scalable multiplayer gaming

Video encoding and other forms of media processing that require large amounts of I/O to temporary storage

Note that batch and log processing runs in a race-to-idle model, flushing volatile data to disk as fast as possible in order to make full use of compute resources. More details on availability, regions, sizes, and purchase models are available here.

Tests

In the same region and availability zone, we launched a C5.large instance with a 54 GB EBS volume and a T2.small instance for comparison, with no NACLs in the way and open security groups between the instances. All instances were running Amazon Linux 2 (ami-0a5e707736615003c), patched up to date as of October 2018.

Before we walk you through the benchmarking results, let’s have a look at the pricing comparison between C5D.Large, C5.Large and T2.Small on-demand instances:

We ran the SysBench tool to calculate all prime numbers up to 20,000, to compare the pure compute performance of the C5D instance with the C5 instance as well as the most popular instance type, T2.
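For readers who want a feel for what a prime-number CPU test exercises, here is a small Python sketch of the same idea: checking every number up to a limit by trial division and timing the run. This is our own illustration, not SysBench's implementation, and its timings are not comparable with SysBench scores. SysBench exposes the limit through its `--cpu-max-prime` option.

```python
import math
import time


def is_prime(n: int) -> bool:
    """Trial division: test divisors from 2 up to the integer square root."""
    if n < 2:
        return False
    for divisor in range(2, math.isqrt(n) + 1):
        if n % divisor == 0:
            return False
    return True


def count_primes(limit: int) -> int:
    """Count the primes in the range 2..limit inclusive."""
    return sum(1 for n in range(2, limit + 1) if is_prime(n))


if __name__ == "__main__":
    start = time.perf_counter()
    total = count_primes(20_000)
    elapsed = time.perf_counter() - start
    print(f"Found {total} primes up to 20,000 in {elapsed:.3f} s")
```

Because the loop runs on a single thread and touches almost no memory or disk, the elapsed time depends mostly on per-core CPU speed, which is what a single-threaded CPU test is meant to isolate.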

Here are the kinds of tests we ran on all three instances:

To run CPU tests, we used the SysBench tool with a single-threaded CPU test.

To measure disk performance, we used the SysBench file IO benchmark (with random read/write). As these instances are capable of bursting, we conducted a few tests to drain the IO balance. Before draining the IO balance, we used IOping to test latency. Then, using FIO, we attempted to drain the balance and test the performance as the burst bucket was emptied and refilled.
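As a rough guide to how long draining a burst balance can take, here is a back-of-the-envelope sketch. It assumes the EBS volume is a gp2 (General Purpose SSD) volume, which this post does not state, and it uses the gp2 figures as we understand them from the AWS documentation: a bucket of 5.4 million I/O credits, a baseline of 3 IOPS per GiB with a 100 IOPS floor, and bursts of up to 3,000 IOPS. The function names are ours.

```python
# Assumed gp2 burst-bucket parameters (not stated in this post).
GP2_BUCKET_CREDITS = 5_400_000   # I/O credits in a full bucket
GP2_BURST_IOPS = 3_000           # maximum burst rate
GP2_IOPS_PER_GIB = 3             # baseline IOPS earned per GiB
GP2_MIN_BASELINE_IOPS = 100      # baseline floor for small volumes


def baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a gp2 volume of the given size."""
    return max(GP2_MIN_BASELINE_IOPS, GP2_IOPS_PER_GIB * size_gib)


def seconds_to_drain(size_gib: int, demand_iops: int = GP2_BURST_IOPS) -> float:
    """Seconds of sustained load needed to empty a full bucket.

    Credits drain at (demand - baseline) per second, because the baseline
    rate keeps refilling the bucket while the test runs.
    """
    effective = min(demand_iops, GP2_BURST_IOPS)
    net_drain = effective - baseline_iops(size_gib)
    if net_drain <= 0:
        return float("inf")  # demand never exceeds what the baseline refills
    return GP2_BUCKET_CREDITS / net_drain


def seconds_to_refill(size_gib: int) -> float:
    """Seconds for an idle volume to refill an empty bucket."""
    return GP2_BUCKET_CREDITS / baseline_iops(size_gib)


if __name__ == "__main__":
    size = 54  # GB volume size used in this post
    print(f"Baseline: {baseline_iops(size)} IOPS")
    print(f"Drain at full burst: {seconds_to_drain(size) / 60:.1f} min")
    print(f"Refill when idle:    {seconds_to_refill(size) / 3600:.1f} h")
```

Under these assumptions, a benchmark has to sustain the burst rate for roughly half an hour before it measures baseline performance, which is why a short disk test can overstate what an EBS-backed instance delivers over time.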

To benchmark the performance of a typical file server workload, we used the BlogBench score.

To run network tests and demonstrate TCP/IP latency, we used the average latency of 100 ICMP packets. Additionally, we used iperf to test the bandwidth between the instances.

To get a slightly more “real-world” test, we used the Phoronix Test Suite to run a compiler benchmark (compiling Linux). This gives real-world disk IO across a filesystem alongside general system and CPU performance. Disk IO is often a constraining factor in machine performance, but it very much depends on what particular workload you’re running.
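The latency figure described above is simply the mean round-trip time over 100 echo requests. As a hedged sketch, the snippet below shows one way to compute it from saved `ping` output. It assumes the Linux `ping` output format, where each reply line contains `time=<value> ms`. The sample text and function names are hypothetical.

```python
import re
import statistics
from typing import Dict, List

# Matches the per-reply round-trip time in Linux ping output, e.g. "time=0.212 ms".
RTT_PATTERN = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtts(ping_output: str) -> List[float]:
    """Extract every per-packet round-trip time (in ms) from ping output."""
    return [float(value) for value in RTT_PATTERN.findall(ping_output)]


def summarize(rtts: List[float]) -> Dict[str, float]:
    """Return count, mean, min and max of the round-trip times."""
    if not rtts:
        raise ValueError("no round-trip times found in ping output")
    return {
        "count": len(rtts),
        "avg_ms": statistics.fmean(rtts),
        "min_ms": min(rtts),
        "max_ms": max(rtts),
    }


if __name__ == "__main__":
    # Hypothetical sample; real input would come from `ping -c 100 <host>`.
    sample = (
        "64 bytes from 10.0.0.12: icmp_seq=1 ttl=255 time=0.212 ms\n"
        "64 bytes from 10.0.0.12: icmp_seq=2 ttl=255 time=0.198 ms\n"
        "64 bytes from 10.0.0.12: icmp_seq=3 ttl=255 time=0.230 ms\n"
        "3 packets transmitted, 3 received, 0% packet loss, time 2003ms\n"
    )
    print(summarize(parse_rtts(sample)))
```

Reporting the minimum and maximum alongside the average helps spot outliers, since a single slow reply can skew the mean of a small sample.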

Here are the results:

Here are our key observations:

#1 | The CPUs all come out at the same score, as all three instances have CPUs of the same capability.

#2 | The C5D vastly outperforms the C5 and T2 in file IO thanks to its NVMe disk, with nearly four times the read and write capability of a C5 instance. AWS uses the same underlying disk technology for C5 and T2, hence the results for C5 and T2 are similar.

#3 | The disk latency measure (IOping) shows that the NVMe disk has half the latency of the C5. This is a dramatic difference, and has significant implications for latency-sensitive workloads such as real-time analytics.

#4 | The network measures are essentially the same for C5D and C5 instances, as expected, but significantly improved over the T2 instance.

#5 | The compile time is interesting. The task hits a CPU bottleneck at this point. Although the C5D is faster every time, it’s hard to find a repeatable real-world task that really stretches the capability of the new NVMe disk.

#6 | The test showed that the CPU maxes out at 100% while compiling Linux, which is why the C5D is only a little faster. Even though compiling Linux is a very disk-intensive operation, the CPU was the ultimate bottleneck of the task in our test case.

Just to add to these tests, the C5 instance took 8.10 seconds to unpack the Linux kernel. The C5D took 8.06 seconds to unpack.

The Wrap-Up

We conducted several tests over a week to benchmark the C5D.Large instance’s performance. We hope these test results give you a realistic perspective on the new C5D instances.

If you’d like to know more about these test results, write to us at tech@totalcloud.io or tweet to us at @totalcloudio. Also, do share your views on this post. We would love to hear your thoughts.

TotalCloud is a visual and interactive cloud management platform. Built using a 3D gaming engine, it helps monitor and analyze all the cloud infrastructure frame by frame based on the user's actions and renders it in real time.