Is the Performance of All Cloud Servers the Same?

One of the benefits of delivering Infrastructure as a Service (IaaS) through the cloud is abstraction from the underlying hardware. There’s no requirement to understand what technology is being used to deliver, for example, cloud servers. The specification of a cloud-based server comes down to a few simple metrics: CPU, memory and disk space.

CPU or processor power is described by most vendors in terms of cores, which translate to some abstract definition of physical computing power. Only Amazon Web Services (AWS) references a physical CPU architecture, rating processing power in EC2 Compute Units (ECUs). In summary, an ECU is approximately the power of a 1.2GHz 2007 Intel Xeon processor. Memory is a more tangible quantity and is simply expressed in megabytes or gigabytes. Storage refers purely to disk capacity and has no correlation to actual disk performance.

As a “storage guy”, I found that this lack of an I/O performance metric piqued my interest, as much of my professional career in storage has involved ensuring consistent and high I/O performance. I thought it would be interesting to look at both processing power and disk I/O performance to see how the different cloud implementations match up.

Measuring Performance

Now, I could install a software tool to run the performance tests, but it’s more interesting to think about the underlying processes occurring on a virtual machine, so I’ve created a couple of Perl scripts to do the analysis. For the CPU measurement, the script loops for a fixed interval (in this case one second), performing maths calculations and counting the number of loops executed in that interval. A single measurement isn’t an entirely accurate measure of performance, so I repeat the process at one-second intervals, obtaining a series of figures that can be averaged.
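To make the CPU test concrete, here is a minimal sketch of the same idea in Python rather than Perl. It is not the original script: the function names, the particular maths workload and the default of five samples are all assumptions chosen for illustration.

```python
import math
import os
import time


def count_loops(interval_s=1.0):
    """Count how many iterations of a small maths workload fit in interval_s."""
    loops = 0
    x = 0.0
    deadline = time.perf_counter() + interval_s
    while time.perf_counter() < deadline:
        # Arbitrary floating-point work; the real script's calculation is unknown.
        x += math.sqrt(loops + 1.0) * math.sin(loops)
        loops += 1
    return loops


def run_cpu_benchmark(samples=5, interval_s=1.0):
    """Repeat the fixed-interval count and report the mean plus timing data."""
    start_wall = time.perf_counter()
    start_cpu = os.times()
    counts = [count_loops(interval_s) for _ in range(samples)]
    end_cpu = os.times()
    return {
        "counts": counts,
        "mean": sum(counts) / len(counts),
        "elapsed_s": time.perf_counter() - start_wall,
        "user_s": end_cpu.user - start_cpu.user,
        "system_s": end_cpu.system - start_cpu.system,
    }


if __name__ == "__main__":
    print(run_cpu_benchmark())
```

The loop counts are only comparable between machines running the same interpreter version, which is one reason a consistent operating system and runtime matter for this kind of test.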

For storage I/O, my Perl script creates a 100MB file by writing a series of random 4K data blocks. With both scripts I measure elapsed time and the user and system CPU time taken. If the Perl script is being executed consistently, then the CPU time for each test should be similar across all cloud environments, although the elapsed time will vary with the share of resources being allocated.
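The storage test can be sketched in the same way. Again this is a Python approximation, not the original Perl. It assumes that “random 4K data blocks” means blocks of random content written sequentially, and it adds an `fsync` call so that the elapsed time includes flushing to disk; neither detail is confirmed by the description above. The function name and result keys are hypothetical.

```python
import os
import tempfile
import time

BLOCK_SIZE = 4 * 1024          # 4K blocks
FILE_SIZE = 100 * 1024 * 1024  # 100MB file


def write_random_file(path, file_size=FILE_SIZE, block_size=BLOCK_SIZE):
    """Write file_size bytes of random data in block_size chunks and time it."""
    blocks = file_size // block_size
    start_wall = time.perf_counter()
    start_cpu = os.times()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(os.urandom(block_size))
        f.flush()
        os.fsync(f.fileno())  # assumption: include the flush to disk in the timing
    end_cpu = os.times()
    return {
        "bytes": blocks * block_size,
        "elapsed_s": time.perf_counter() - start_wall,
        "user_s": end_cpu.user - start_cpu.user,
        "system_s": end_cpu.system - start_cpu.system,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(write_random_file(os.path.join(tmp, "iotest.dat")))
```

Generating the random data costs CPU time of its own, so the user and system figures here reflect both the data generation and the write path. Comparing CPU time against elapsed time is what separates “the script did the same work” from “the platform took longer to complete it”.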

The Results

I ran tests against the following platforms: Amazon AWS, Rackspace and GoGrid, in each case using the US-based service. I also tried to choose a consistent platform, standardizing on CentOS or RHEL (which should be identical). Unfortunately no single version of these operating systems is available on every platform, so some tests are based on version 5.6 and some on 6.x.

AWS#1: RHEL 6.1, 2ECU (burst only), 613MB, I/O performance low

AWS#2: RHEL 6.1, 2ECU, 7.5GB, I/O performance high

AWS#3: CentOS 5.6, 2ECU, 7.5GB, I/O performance high

Rackspace: CentOS 5.6, 4 virtual cores, 256MB, 10GB

GoGrid: CentOS 6.0, 0.5 CPU Core, 512MB, 25GB disk

What’s interesting is that most of the CPU performance figures came out at a similar level, except for the AWS micro-instance. This gets more power initially, but after about 10 seconds of continuous use it starts to get throttled. The instances all define computing resource differently, but those definitions effectively translate to the same amount of CPU speed (remember the script runs single-threaded).

For the storage, most instances ranged between 1 and 2 seconds per 100MB file. However, the two AWS instances using Elastic Block Store (permanent data store, retained even if the instance is destroyed) have significantly worse performance, with the micro-instance being particularly bad. One curious anomaly is that I/O performance on the micro-instance seemed to improve as its CPU was constricted. Although the timings were rounded to the nearest second, taking the average of all the observations, the three solutions using instance storage came out at a remarkably similar time, although AWS was slightly faster (1.4s per 100MB compared to 1.7s).
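For readers who think in throughput rather than seconds per file, the two averages quoted above convert as follows. This is just arithmetic on the published figures, and since the underlying timings were rounded to the nearest second the results are approximate. The function name is hypothetical.

```python
FILE_SIZE_MB = 100  # size of the test file


def throughput_mb_per_s(seconds_per_file, file_size_mb=FILE_SIZE_MB):
    """Convert an average time per test file into MB/s."""
    return file_size_mb / seconds_per_file


aws = throughput_mb_per_s(1.4)     # AWS instance storage average
others = throughput_mb_per_s(1.7)  # the other instance-storage results

print(f"AWS instance storage: {aws:.1f} MB/s")                  # 71.4 MB/s
print(f"Other instance storage: {others:.1f} MB/s")             # 58.8 MB/s
print(f"Relative advantage: {(aws / others - 1) * 100:.0f}%")  # 21%
```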

So what’s the point of performing these measurements? Firstly, they provide an additional way to make like-for-like comparisons of the different offerings. I/O performance is not part of the server profile, but it can vary dramatically depending on the instance type chosen. And if, over time, servers are migrated to new technology, the relative performance level can be evaluated to ensure servers remain correctly sized.

I’ll be extending the scripts to do more complex performance testing and see if I/O varies with differing block sizes and how multi-threaded CPU tasks are handled. Plus, there’s the whole comparison of Linux versus Windows to contend with.
