Hidden Cloud Costs: AWS, Azure Management Costs Compared

With the help of in-depth Gartner analysis, InformationWeek takes a closer look at the cost of managing Amazon vs. Azure.

What's so hard about assessing cloud ROI?

When it comes to determining cloud cost, one of the murkiest areas to assess is the implicit management expense of cloud workloads. Is there a difference between the management expenses of different services? You bet there is. To illustrate this, here's a quick look at the differences in operation between Amazon Web Services and Microsoft Azure.

If you can get the workload to run in either location, which one runs more efficiently, compute-wise? That might be possible to determine through some kind of improvised benchmark. Which one runs more efficiently, management-wise? That might be harder to determine, but recent research from Gartner will help us make some comparisons.

To determine how well your workload is running on Amazon, some basic tools are available. CloudWatch supplies a basic set of metrics on CPU, memory, and bandwidth use per application to customers. The IT cloud manager can programmatically retrieve data collected by CloudWatch, analyze it, chart it, and look for trends. Service basics are free; metrics and alerts can be customized for small fees. In addition, AWS's free CloudTrail logs API calls and delivers the log files to the customer.
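As a rough illustration of the "analyze it, chart it, and look for trends" step, here is a minimal Python sketch. It assumes the metric values have already been retrieved and are held in a plain list; the sample numbers, the 45% threshold, and the function names linear_trend and breaches are hypothetical and are not part of any AWS tool.

```python
from statistics import mean


def linear_trend(samples):
    """Return the least-squares slope of the samples, per sample interval."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(samples)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, samples))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def breaches(samples, threshold):
    """Return the indexes of samples that exceed the alert threshold."""
    return [i for i, value in enumerate(samples) if value > threshold]


# Hypothetical hourly CPU utilization percentages, standing in for
# data points already pulled from the monitoring service.
cpu_percent = [40.0, 42.0, 44.0, 46.0, 48.0, 50.0]

slope = linear_trend(cpu_percent)
over_limit = breaches(cpu_percent, threshold=45.0)
print(f"CPU trend: {slope:+.2f} points per hour; samples over 45%: {over_limit}")
```

A steadily positive slope is the kind of signal that might prompt a capacity review before the workload actually hits its limit.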

Microsoft also supplies free monitoring for Azure operations, but according to Gartner research director Kyle Hilgendorf's report, Microsoft Azure: In-Depth Assessment (registration required), it does not allow creation of custom metrics. The report says Azure holds the operational data for only 10 days, and then it's discarded.

On the other hand, if you're a skilled Microsoft shop -- and there are many -- using Active Directory, Windows Server 2012, SQL Server, and particularly System Center will make it easier to monitor and manage your cloud workload. They'll also work with tasks sent to Azure, more or less as if they were on premises. That's also true of SharePoint. In terms of efficient management, this can be a big plus.

Perhaps the main drawback in the Microsoft management approach is that it supports some, but not all, types of Linux under Hyper-V. For example, SUSE and Oracle Linux are welcome; Red Hat Enterprise Linux is not. Many modern enterprise datacenters depend on both Windows Server and Linux, specifically Red Hat Enterprise Linux. If yours does, Azure's blind eye for Red Hat is a problem. Similarly, Azure can see and work with SQL Server, but not MySQL. This pattern is seen across many examples of popular open source code, with a few exceptions Microsoft does support, such as PHP.

So what will it cost to cope with this dichotomy? Can you save money on Azure? Azure has been noted in certain benchmarking tests for delivering consistent, efficient compute service. But you may need to change the Linux you depend on (an expensive and unlikely proposition) or use a different service than Azure for your Red Hat Linux workloads, complicating the management challenge. Amazon, of course, welcomes Red Hat workloads.

Amazon provides a battery of configuration and deployment tools, including Elastic Beanstalk for scalable deployments, Data Pipeline for making connections between compute and storage deployments, and CloudFormation for generating reusable templates that capture all the dependencies of a given workload.

Again, familiarity with Windows Server and System Center will give you roughly equal capabilities for similar workloads. The Gartner report notes one difference: Azure's Autoscale allows customers only to turn VMs on and off. In other words, it requires customers to pre-deploy instances for the scaling group. This means customers must predict ahead of time how many instances may be needed for the peak usage, according to the report. Then instances get turned on as needed, and hopefully the number reserved matches the need.
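To make the consequence of pre-deployment concrete, here is a small Python sketch of scaling inside a fixed pool. It is a simplified model of the behavior the report describes, not Azure's actual API; the pool size, the per-instance capacity of 100 requests per second, and the load figures are all hypothetical.

```python
import math


def instances_needed(requests_per_second, capacity_per_instance):
    """Return how many instances the current load calls for."""
    return math.ceil(requests_per_second / capacity_per_instance)


def scale_within_pool(demand, pool_size):
    """Turn on as many pre-deployed instances as demand requires.

    Returns (running, idle, shortfall). A shortfall above zero means
    the peak was underestimated when the pool was deployed.
    """
    running = min(demand, pool_size)
    idle = pool_size - running
    shortfall = max(0, demand - pool_size)
    return running, idle, shortfall


POOL_SIZE = 4          # hypothetical: instances pre-deployed for the peak
CAPACITY_RPS = 100     # hypothetical: requests per second one instance handles

results = []
for load in (150, 320, 450):
    demand = instances_needed(load, CAPACITY_RPS)
    results.append(scale_within_pool(demand, POOL_SIZE))
    print(load, results[-1])
```

In this model, a load that needs five instances leaves one instance's worth of demand unserved, while a light load leaves pre-deployed instances sitting idle.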

The report notes, "Amazon includes robust load-balancing options... AWS has expanded the feature set of its Elastic Load Balancing service to include session affinity and metrics-driven load balancing."

Here's a little-known fact: If an AWS server hardware dies and your application goes with it, that's your tough luck. It's up to you to monitor CloudWatch or set up alerts to let you know when a virtual machine has died. Alternatively, you could set up a backup system in a second "availability zone" (an independently powered section of the AWS data center) for high-availability operation. Moving your data from one zone to another, of course, results in transfer fees, along with the fees for a second system.

"AWS customers can set up CloudWatch alarms and automatic scripting options to ensure a new healthy instance is created, if the original instance fails," Gartner's report says. "Customers can create auto-scaling groups that will make sure that a fixed number of instances are available, so that if one fails, a new one is launched as a replacement."
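The fixed-size group idea in that quote can be sketched as a simple reconcile loop. The Python below is a toy simulation under stated assumptions, not the AWS Auto Scaling API: instances are plain dictionaries, and launch_instance and reconcile are hypothetical helpers invented for the illustration.

```python
import itertools

_ids = itertools.count(1)


def launch_instance():
    """Create a new simulated instance record (hypothetical stand-in)."""
    return {"id": f"i-{next(_ids):04d}", "healthy": True}


def reconcile(group, desired_count):
    """Drop failed instances and launch replacements up to desired_count.

    Failed instances are replaced rather than restarted, which matches
    the replace-on-failure behavior the report describes.
    """
    survivors = [inst for inst in group if inst["healthy"]][:desired_count]
    missing = desired_count - len(survivors)
    return survivors + [launch_instance() for _ in range(missing)]


# Start with three instances, then simulate a hardware failure on one.
group = [launch_instance() for _ in range(3)]
group[1]["healthy"] = False

group = reconcile(group, desired_count=3)
print([inst["id"] for inst in group])
```

The same loop with a desired count of 1 gives a single instance a basic form of self-healing, at the cost of losing whatever state lived on the failed machine.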

If failure is pending -- for example, hardware is showing signs of impending demise -- Amazon will provide notice to affected customers of a maintenance issue and give customers leeway to migrate to a new server before the faulty unit is shut down.

For Azure, Microsoft provides automated restart for a stopped virtual machine. "AWS does not currently support automatic instance recovery or restart of the original instance, should a failure occur," Gartner's report explains. Amazon instead gives customers the tools to monitor, alert, and restart the VM on their own.

What is the difference in cost for the one approach versus the other? Amazon offers a more complete way for customers to manage workloads themselves -- for those who know how. But many Microsoft shops see a benefit in the Azure restart feature. If they are already Premier support contract holders, technical support will notify them of any outage issues.

Of course, it's not a black-and-white issue. If your company already has expertise in Microsoft products, Azure will likely offer an edge in lower costs. As Gartner's report puts it, "companies that have existing people with skills in .NET, Visual Studio, System Center, Active Directory, Hyper-V, SQL Server, and SharePoint will be able to transfer some of this knowledge to the Azure platform." The report adds, "Current Microsoft customers with enterprise agreements can also use the negotiated discount for Azure."

As for its rating on overall infrastructure-as-a-service operation, AWS meets 92% of Gartner's criteria, and Microsoft meets 75%. With Amazon, you'll have to learn the Amazon way of doing things versus the datacenter software you already know. Doing so, however, will give you a more thorough management approach and possibly a more cost-effective handle on operations.

But not everyone needs the full Amazon treatment. Those who want an auxiliary operation to complement their data center may find Azure less expensive and easier to use. Take your pick -- and estimate your costs -- on where you wish to end up.

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive ...

This is a great article and good information about the differences between Amazon and Azure. When we started with Azure we liked the simplification Microsoft did but missed a lot of transparency when it comes to cost management. Especially as a customer under an enterprise agreement, you do not have much transparency. Because of that we created a simple portal hosted on Azure. Everybody who is interested can find this absolutely free portal here: costs.azurewebsites.net

The Cloudvertical site does offer comparisons of cloud providers' costs, although location would also play a role. It is also important to avoid missteps at the beginning of this journey, as well as the hidden costs down the line.

Every major IT vendor (e.g., Google, IBM, Hewlett-Packard, Oracle, VMware) and the telecommunication service providers that have the infrastructure resources to run ITaaS offer enterprise public cloud services, and they want to expand their footprints and make revenue off the cloud. Many of them, notably Rackspace and HP, are betting on OpenStack infrastructure, which, besides Azure, is the most formidable competitor to Amazon.

It's important not to lose time and effort. Despite the tit-for-tat price cuts in the cloud, one needs to justify the bigger gains with best practices, as well as avoid those hidden bear traps. Having a partner that understands your business requirements, development strategies, outlook, end goal, and PaaS/IaaS needs, and that complements your knowledge of the public, private, or hybrid cloud, be it AWS or Azure, will surely benefit your roadmap by putting you on the right track from the get-go.

As part of Mindtree's Infrastructure Managed Services team, we have a history of navigating the cloud wagon for our customers and partners alike as their trusted advisors, while maintaining our vendor-agnostic best practices and domain expertise to support their needs, implementation, and journey.

Yes, DougL321, my comments reflect an assumption that someone wants a hybrid cloud operation, with some operations on-premises and some workloads in the cloud. The picture looks different for a startup with no on-premises data center; and right, it's also different for someone only interested in platform as a service. How many companies are only interested in PaaS?

This is useful information as long as you have a large investment in on-prem. However, it seems to completely ignore PaaS, which scales very differently. If you just want to move your datacenter off-prem, you aren't going to see as much ROI as you would if you reduced your overall footprint. You will achieve some economies of scale and might get some reduced licensing (i.e., you can run SQL on a VM using an Azure per-minute license instead of paying per core), but your cloud provider will be adding profit that otherwise wouldn't be in the mix. In my opinion, the biggest cloud ROI is in eliminating servers and environments that exist to handle periodic loads (quarterly financials, annual MRP runs, etc.) and otherwise sit unused.

The statement, "Azure's Autoscale allows customers only to turn VMs on and off," is a little misleading in my opinion because that makes it sound like your only option to scale in Azure is to add more VMs and more instances. That simply isn't the case. Services such as SQL Server can scale without adding more instances. More importantly, with a PaaS model, you can treat the compute resources as disposable assets and only have to pay for them when they are used.

For a small business or startup, I don't know why you would design VMs into your strategy when you could most likely use a combination of SaaS and PaaS to meet your needs. For larger companies, obviously there is an investment required to deploy a workload to a PaaS environment when you already have everything set up to run on your on-prem servers. However, if your cloud strategy is to just take existing server instances and move them to the cloud, I'm not surprised if you are scrambling to find the ROI.

"If possible, something outside of your cloud environment should be checking to make sure that services are still up and available rather than relying on built-in cloud tools." Exactly, PCComf. There's more than a cottage industry trying to provide this: think Compuware, SolarWinds, New Relic, AppDynamics, NetScout, and don't forget the French firm Cedexis. All can supply information that the cloud provider may not.

That angle is deceptive, though I'm not alleging it as necessarily intentional. When you have auto-scaling turned on, a dead or underperforming (by virtue of your customer config) instance is replaced by another instance. Even if you're running 1 instance, you should turn on auto-scaling and set its limit to 1 if you want this limited notion of availability.

What the article says is that the same instance isn't rebooted. That's a good, smart thing. Since the system can't determine whether the reason for the dead instance falls on the software or on the instance itself, the design of the system covers all the cases. Terminating the failed instance and starting a second is the smartest thing to do there. Also, specifically, opting to restart the failed instance is both proven to be risky and reflective of other cloud providers' limited experience working with the kinds of products that need cloud hosting.

AWS is Amazon's own infrastructure. Their huge success was built entirely on this architecture. Other companies like Rackspace and Microsoft don't have that same level of insight into operational excellence with regard to online reliability. When you see an article that views AWS in a critical light compared to other cloud hosting products, take it with a grain of salt (or suspect that it's written by an Azure consultant, not the case here).

You still need application level monitoring to ensure that what you think you are providing is actually what you are providing. This is true in and out of the cloud.

I have a very small AWS environment, but I once saw a server fail, yet it appeared to still be running fine in the AWS administration console. It took an hour or so before it responded well enough that I could even restart it or shut it down.

If possible, something outside of your cloud environment should be checking to make sure that services are still up and available rather than relying on built-in cloud tools to keep you going. There are too many ways for a server to "go out" that no cloud provider can account for all of them.

Is it only me who finds "If an AWS server hardware dies and your application goes with it, that's your tough luck. It's up to you to monitor CloudWatch or set up alerts to let you know when a virtual machine has died." really worrisome?

Companies are trusting AWS with their data and applications, and if the hardware, not the virtual machine, but the hardware dies, it's the customer's responsibility to plan for it? Does this not seem a little odd?

As you review management costs for Azure and Amazon, it's hard not to see how VMware has a shot at grabbing part of the hybrid cloud market. The pretty good things that Windows Server, Hyper-V and System Center can do will be amplified by all the things that vSphere and vCloud can do. Think hybrid computing based on live migration of virtual machines. That's not possible today. The same storage file system must underlie both departure point and host destination for live migration. But I'll bet VMware is working on it.

Enterprise cloud adoption has evolved to the point where hybrid public/private cloud designs and use of multiple providers are common. Who among us has mastered provisioning resources in different clouds, allocating the right resources to each application, and assigning applications to the "best" cloud provider based on performance or reliability requirements?