This is the second post in our series on cloud server performance benchmarking. The previous post, What is an ECU? CPU Benchmarking in the Cloud, focused strictly on CPU performance. In this post, we'll look at disk IO performance. Choosing a cloud provider should be based on many factors, including performance, price, support, reliability/uptime, scalability, network performance, and features. Our intent is to provide a good reference for those looking to use cloud services by offering objective analysis of all of these factors.

Benchmark Setup

All benchmarked cloud servers were configured almost identically in terms of OS and software: CentOS 5.4 64-bit (or 32-bit in the case of EC2 m1.small and c1.medium and IBM's Development Cloud, where 64-bit is not supported). File systems were formatted as ext3.

Benchmark Methodology

In the previous post we used Amazon's EC2 ECU (Elastic Compute Unit) as a baseline for comparing CPU performance between providers. For disk performance, there really isn't a common term synonymous with the ECU. However, most readers will be at least somewhat familiar with hardware disk IO performance factors such as drive types (SAS, SATA), spindle speeds (10K, 15K), and RAID levels. So, we chose to use a "bare-metal" cloud server as the baseline for disk IO performance in this post. Our experience has been that most providers will not disclose much technical detail about their underlying storage systems. However, in the case of Storm on Demand's new Bare Metal Cloud Servers, most technical details about the servers are fully disclosed (one of the selling points of this service). With this service, you are assigned a dedicated server. Your OS still runs on a hypervisor (Storm uses Xen), but the underlying hardware is not shared with any other virtual servers, so you have the full resources of that server available (no CPU limits, disk or memory sharing).

The server model we chose as the performance comparison baseline is the dual processor Intel E5506 2.13 GHz (8 cores total) with 4 x 15K RPM SAS drives configured in hardware-managed RAID 1+0. SAS is one of the fastest storage interfaces, with throughput up to 6 Gb/s, 15K RPM is the fastest common spindle speed, and RAID 1+0 can improve performance through striping.

To use this server as the baseline, we assigned it an aggregate IO Performance score (IOP) of exactly 100 points. Other servers' disk IO benchmark results were then compared to the baseline results and assigned a relative score, where 100 is equal in performance, less than 100 is worse, and greater than 100 is better. For example, a server with a score of 50 scored 50% lower than the baseline server, while a server with a score of 125 scored 25% higher.

To compute the IOP, the results from each of the benchmarks are first calculated for the baseline server. Each benchmark carries a weight in the overall aggregate score, and the baseline server's benchmark results set the 100% mark for each weight (i.e. the baseline server receives the full weight of each benchmark for its IOP). Each server's weighted benchmark scores are then summed to create an aggregate score. This score is then compared with the aggregate score of the baseline server and used to produce the IOP.
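The weighted aggregation described above can be sketched as follows. This is a minimal illustration, not our actual scoring code: the weights are those listed in the Benchmarks section, but the benchmark result values are made-up numbers for demonstration only.

```python
# Sketch of the IOP aggregation. Weights come from the Benchmarks
# section below; the raw result values are illustrative only.

WEIGHTS = {
    "blogbench": 200,
    "bonnie++": 100,
    "dbench": 30,
    "fio": 30,
    "hdparm": 100,
}

def aggregate_score(results, baseline):
    """Scale each benchmark result relative to the baseline result,
    apply that benchmark's weight, and sum the weighted scores."""
    return sum(
        WEIGHTS[name] * (results[name] / baseline[name])
        for name in WEIGHTS
    )

def iop(results, baseline):
    """Express the aggregate as a score where the baseline = 100."""
    return 100 * aggregate_score(results, baseline) / aggregate_score(baseline, baseline)

# Hypothetical baseline results (not actual measurements):
baseline = {"blogbench": 1000, "bonnie++": 500, "dbench": 300, "fio": 200, "hdparm": 150}

# The baseline compared against itself scores exactly 100.
print(iop(baseline, baseline))  # 100.0
```

A server that matched the baseline on every benchmark would therefore score exactly 100, and one that was half as fast on every benchmark would score 50.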

Benchmarks

We used a combination of 7 disk IO performance benchmarks to create the IOP score. The following is a description of the benchmarks and the corresponding weights used:

Blogbench [weight=200]: BlogBench is designed to replicate the load of a real-world busy file server by stressing the file system with multiple threads of random reads, writes, and rewrites. Its behavior mimics that of a blog: it creates blogs with content and pictures, modifies blog posts, adds comments to these blogs, and then reads the content of the blogs. All of the generated blogs are created locally with fake content and pictures.

Bonnie++ [weight=100]: Bonnie++ is based on the Bonnie hard drive benchmark by Tim Bray. It is used by ReiserFS developers, but can be useful for anyone who wants to know how fast their hard drive or file system is.

Dbench (128 clients) [weight=30]: Dbench is a benchmark designed by the Samba project as a free alternative to netbench, but dbench contains only file-system calls for testing the disk performance.

Flexible IO Tester (fio) [weight=30]: fio is an I/O tool meant to be used both for benchmarking and for stress/hardware verification. It has support for 13 different types of I/O engines (sync, mmap, libaio, posixaio, SG v3, splice, null, network, syslet, guasi, solarisaio, and more), I/O priorities (for newer Linux kernels), rate I/O, forked or threaded jobs, and much more.

hdparm buffered disk reads [weight=100]: Determines the speed of reading through the buffer cache to the disk without any prior caching of data. This measurement is an indication of how fast the drive can sustain sequential data reads under Linux, without any filesystem overhead.
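For reference, `hdparm -t` reports buffered read throughput as a single MB/sec figure, which can then be fed into the weighted scoring above. The sketch below shows one way to extract that figure; the sample output text is illustrative, not from the baseline server.

```python
import re

# Illustrative sample of `hdparm -t /dev/sda` output.
# The figures here are made up, not actual measurements.
sample = """
/dev/sda:
 Timing buffered disk reads:  540 MB in  3.00 seconds = 179.93 MB/sec
"""

def parse_hdparm_throughput(output):
    """Extract the sustained read throughput (MB/sec) from hdparm -t output."""
    match = re.search(r"=\s*([\d.]+)\s*MB/sec", output)
    return float(match.group(1)) if match else None

print(parse_hdparm_throughput(sample))  # 179.93
```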

We credit the Phoronix Test Suite for making it easier to run the benchmarks. The tests above come from the Disk test suite (except for bonnie++). If you'd like to compare your own server to the baseline server in this post, you can install Phoronix and use the comparison feature when running your own IO benchmarks. The full baseline server results are available here (including the ID for comparison tests). The baseline server bonnie++ results are available here.

Results

The following results are divided into sections by provider. If a provider has more than one data center, multiple tables are displayed, one for each. Each table shows the server identifier, CPU architecture, memory (GB), storage description (if known) or size, the server price, and the IOP score.

AWS EC2 servers can use either ephemeral instance storage or external Elastic Block Storage (EBS). Regarding EBS, Amazon states: "The latency and throughput of Amazon EBS volumes is designed to be significantly better than the Amazon EC2 instance stores in nearly all cases." EBS is block-level, off-instance storage, meaning it exists independently of the underlying host. Striping across multiple EBS volumes can be used to improve performance. We used only a single EBS volume (no striping or other performance enhancements).

Because EBS is off-instance, it has the advantage of added durability: if the host system fails, the instance can be quickly restarted on another host (although this is not automatic). EBS volumes are automatically replicated to prevent data loss due to failure of any single hardware component. EBS is billed at $0.10/GB/month + $0.10 per 1 million I/O requests. The prices below do not include EBS usage charges.
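As a rough illustration of those EBS usage charges, the monthly cost is a simple function of provisioned capacity and request volume. The volume size and request count below are hypothetical; the rates are those quoted above.

```python
# EBS usage charges at the rates quoted above:
# $0.10 per GB-month of storage + $0.10 per 1 million I/O requests.
def ebs_monthly_cost(gb, io_requests):
    """Estimated monthly EBS charge in dollars (rounded to cents)."""
    return round(0.10 * gb + 0.10 * (io_requests / 1_000_000), 2)

# Hypothetical example: a 100 GB volume serving 50 million I/O
# requests in a month costs $10 for storage + $5 for I/O.
print(ebs_monthly_cost(100, 50_000_000))  # 15.0
```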

Rackspace Cloud uses local storage for cloud servers. This provides generally good performance on nodes of all sizes. However, the tradeoff is that it does not provide durability should the host system fail. Rackspace does offer scheduled hot backup/imaging capabilities for added durability.

Storm offers a diverse range of cloud servers to choose from, including traditional cloud servers and their new "bare metal" dedicated cloud servers. The row highlighted in green is the baseline server for the IOP metric in this post. The 48GB cloud server was the top performer at almost 70% faster than the baseline server! We aren't sure how they do this, but we do know that the Intel Westmere CPU it runs on is very high performing (it was also the top performer in the CPU benchmarks post), and most likely the storage components are very high end as well. To our knowledge, all of Storm's cloud servers run on local host storage. They offer automated daily backups and imaging for added durability.

GoGrid's disk IO performance was excellent across instances of all sizes. Even the low end 1GB cloud server scored almost 10% faster than the baseline. The 4GB cloud server was the #2 performer in this post at over 60% better performance than the baseline. GoGrid's cloud servers run off of local host storage. They offer prepaid plans starting at $199/mo which discounts usage to $0.08/GB/hr (versus non-prepaid $0.19/GB/hr).

Voxel's cloud server storage resides on external SANs in each data center. Disk IO performance was roughly 50% of the baseline across the board, about average for providers using external instance storage. The exact technical details of Voxel's SAN storage are not available.

NewServers offers a true bare metal cloud. When you create a server instance it actually deploys onto physical hardware, including local disk storage (no hypervisor overhead). Like the baseline server, the ns-fast instance below uses SAS storage. According to NewServers, the SAS drives used in this instance are Seagate 10K RPM. The slower spindle speed and RAID 1 configuration may have contributed to the reduced performance of that server compared with the baseline.

Linode markets itself as a VPS provider. However, because they are a very common and popular provider and support some cloud-like features, we included them in our benchmarks. Linode uses RAID 1 local storage. The higher end servers generally performed better. The ln-14400 performed slightly better than the baseline server.

SoftLayer's cloud server storage resides on external SANs. Disk IO performance ranged from OK to painfully slow and showed a general increase in performance on larger sized instances. We believe storage is based on a GigE iSCSI SAN.

Terremark is a VMware vCloud provider. Terremark was the best performing provider in this post that uses external SAN storage and provides automatic failover capabilities. The combination of the SAN and VMware provides a high availability feature wherein servers are automatically migrated to another host should the existing host fail.

OpSource is another VMware-based provider. Cloud server storage resides on an external SAN. Performance was generally slower than other SAN/VMware providers in this post, and roughly flat across instances of all sizes.

BlueLock is a VMware vCloud provider. Server storage resides on a SAN. The combined SAN and VMware provide a high availability feature wherein servers are automatically migrated to another host should the existing host fail. Performance was very good for an external storage provider and generally showed a linear increase on larger sized instances.

Flexiscale is a UK cloud provider that has been around for a few years. They were recently acquired and renamed Flexiant, and have recently released version 2.0 of their cloud server platform. They use external SAN storage.

8/17/2010 Update: Since we originally conducted these tests in May 2010, Flexiscale has updated their external storage system. The top 8gb instance in the table below represents benchmark results conducted after that update. These results show a dramatic increase in IO performance (about 35%).

Flexiscale [UK]

ID    CPU                                               Memory (GB)   Storage   IOP
8gb   AMD Opteron 8356 2.29 GHz [1 processor, 4 cores]  8             320 GB    84.93
8gb   AMD Opteron 8218 2.62 GHz [1 processor, 4 cores]  8             320 GB    62.71
4gb   AMD Opteron 8220 2.80 GHz [1 processor, 2 cores]  8             160 GB    41.77
2gb   AMD Opteron 8218 2.60 GHz [1 processor, 1 core]   8             80 GB     36.56
1gb   AMD Opteron 8218 2.60 GHz [1 processor, 1 core]   8             40 GB     19.46

Summary

This is our first attempt at comparing cloud provider disk IO performance. We acknowledge that it is not perfect and hope to make improvements over time. Please comment on this post if you have any suggestions on how we might improve our methods. We intend to run these benchmarks continually (every couple of months) to incorporate new providers, improve the quality and quantity of the data available, and check for provider upgrades and improvements.

Providers were about evenly split between those using external SAN storage and those using local host storage for cloud server instances. Generally, the local storage providers performed better in the benchmarks. However, most of the external SAN providers are able to offer better durability, including implicit backups and automatic host server failover.