
There are a few definitions of cloud computing out there, but now that three big players (Amazon, Google, and Microsoft) are solidifying on what they offer (and what it costs), there seems to be a pretty consistent purpose for cloud computing: the ability to increase infrastructure capacity on the fly. In some ways, this is similar to keeping servers in a hosting facility – the hosting company handles some level of IT administration, ensures the power and network connectivity are available, etc. The difference in cloud computing is that getting more servers at a hosting facility takes time…typically days or weeks. In cloud computing, you can increase capacity immediately, and also reduce it. This provides huge advantages for systems that have usage spikes periodically, as you can increase and reduce infrastructure when needed, rather than paying for the unused capacity of beefy servers (and electricity and IT staff) that sit idle most of the time. You pay for what you use in terms of storage, processing and memory capacity, and internet data transfer.
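To make the pay-for-what-you-use argument concrete, here is a rough sketch in Python. The hourly rate, the per-server capacity, and the traffic profile are all made-up numbers chosen for illustration, not actual pricing or limits from any provider. It compares running enough servers for the peak around the clock against adjusting capacity every hour.

```python
import math

# Hypothetical numbers for illustration only; not real provider pricing.
HOURLY_RATE = 0.10          # assumed cost of one server for one hour
REQUESTS_PER_SERVER = 100   # assumed requests/second one server can handle


def servers_needed(load):
    """Servers required to handle a load (requests/second), at least one."""
    return max(1, math.ceil(load / REQUESTS_PER_SERVER))


def fixed_cost(hourly_load):
    """Cost when you provision for the peak hour and run that all the time."""
    peak_servers = servers_needed(max(hourly_load))
    return peak_servers * len(hourly_load) * HOURLY_RATE


def elastic_cost(hourly_load):
    """Cost when capacity is adjusted every hour to match that hour's load."""
    return sum(servers_needed(load) for load in hourly_load) * HOURLY_RATE


# One day of traffic: quiet for 22 hours, with a two-hour spike.
day = [50] * 22 + [950] * 2
print(f"fixed: ${fixed_cost(day):.2f}  elastic: ${elastic_cost(day):.2f}")
```

With this invented profile, the fixed setup pays for ten servers all day even though nine of them sit idle for 22 hours. With perfectly flat traffic the two costs come out the same, which is why the shape of your traffic matters so much.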

From that standpoint, Amazon’s Elastic Compute Cloud (EC2), Google’s App Engine, and Microsoft’s Azure are very similar. They all provide administrative consoles to manage your infrastructure’s capacity within their hosting facility on the fly. You pay only for the amount of infrastructure you use, rather than paying for oversized infrastructure to meet peak demands. There are quite a few differences in their administrative consoles, security capabilities, and of course, pricing. But the biggest differences are in the type of infrastructure they offer. I’ve been evaluating the different cloud platforms for some time, and now that Microsoft has official pricing and Go Live dates, I thought others might find my research helpful in understanding and choosing a platform.

Amazon’s EC2 essentially provides you with hosting for virtual machines, which are launched from images called Amazon Machine Images (AMIs). They have various images of Windows, Linux, and Unix operating systems, with various database and application server software packages available. You also have the ability to create your own virtual machine images with whatever software you need. You also get to choose the type of hardware your image runs on – standard, high-memory, or high-processing. For prepackaged images with software bundled in, you typically pay a little extra per processing hour, covering things like licensing costs. Links to pricing, listings of prepackaged instances, and other details about EC2 are available at the end of this post.

Microsoft’s Windows Azure is similar to EC2 in some respects, as the amount you pay depends entirely on the number of instances you’re running. The difference is that you can’t configure any image you want. Everything you do needs to run on the Windows Azure platform, which supports a wide variety of languages and runtimes, but not everything. You’re limited to the Azure SDKs for .NET, PHP, Ruby, Python, or Java. As for database support, you have SQL Azure, which was recently announced to have full T-SQL compatibility with Microsoft’s SQL Server. You can also choose to use a MySQL database on the Azure platform. This is fairly broad application support, especially for Microsoft, and if you can live within these constraints, there are two big advantages. The first is that administration of instances is much simpler: Microsoft handles all the OS administration and updates, so you only need to administer your application. The other advantage is simplified pricing, which is due primarily to the limited application offerings. You just pick what level of hardware you want for each Windows Azure instance, and what size database you want for SQL Azure (currently only supporting databases up to 10GB). Links to pricing, SDK resources, and other details about Azure are available at the end of this post.

Google’s App Engine (GAE) gets rid of the whole concept of an instance. You don’t have to predetermine or preconfigure the number of instances you’ll need to service your application. App Engine will let your application use as many requests and as much bandwidth, processing time, and storage as it happens to need. This is a nice model, as it takes out the complexity of predicting how busy your site will be at any time and spinning up more instances accordingly. You just pay for exactly what your application uses in terms of resources. Since there is no hard number of instances to throttle your usage, GAE keeps costs from running out of control by giving you quotas that you administer. The great advantage of Google App Engine is that there is a free level of quotas that lowers the barrier to entry significantly. If your application doesn’t exceed a certain level of usage, Google is essentially hosting it for free. This convenience comes with a cost, however. The Google App Engine SDK only comes in two flavors – Java and Python, and they are both sandboxed to control what you can do. For example, everything is basically request-based – you can’t really have a service running in the background (although there is a Cron service that allows you to schedule jobs). You’re also limited in their SDK to using Google Accounts for authentication (although some people have workarounds for this as well). Links to pricing, SDK resources, and other details about the Google App Engine are available below.

Amazon EC2: http://aws.amazon.com/ec2/
Complete flexibility – many operating system and application options, and the ability to package your own images with whatever custom software you need.

Windows Azure: http://www.microsoft.com/windowsazure/
Simpler administration and pricing – only administer your application and the number of instances to run, rather than administering the entire virtual machine. Limited to SDKs for .NET, PHP, Ruby, Python, and Java, with databases in SQL Azure or MySQL.

Google’s App Engine: http://code.google.com/appengine/
Even simpler administration – just administer your resource quotas. Limited to SDKs for Python and Java, and to using Google’s Datastore. Also supports a free model, with lower quotas, excellent for hosting a startup application.

So how do you choose, or should you even bother with this “cloud” stuff?
It depends entirely on what you are doing. In some scenarios, hosting in the cloud just doesn’t make financial sense. If you have a basic PHP site with MySQL, you could certainly host that in Azure or EC2, but unless you have the kind of high-volume traffic that necessitates dedicated servers, it’s going to cost significantly more than a run-of-the-mill hosting plan, which can offer a similar Service Level Agreement for availability and performance. If your traffic is very steady, then the ability to add and remove infrastructure on the fly isn’t as attractive, and a standard hosting center may still work well and be cost effective for you. If you’re willing and able to develop or port your existing application to run within the Google App Engine or Microsoft Azure SDKs, those probably make a lot of sense, because of the simplified administration. If you can’t work with those constraints, the Amazon EC2 cloud is a very flexible option, although you get stuck with the administration of the operating system. If you are trying to split your application between your existing infrastructure and a cloud infrastructure in order to handle peak conditions or provide load balancing, Amazon EC2 and Microsoft Azure provide the capability to host applications both “on premise” (on your infrastructure) and in their cloud.

What’s the next step?
Assuming you have an application running already, you need to analyze the traffic. Is it always steady, or does your application often encounter traffic spikes? If it does, are these spikes periodic or sporadic? If you frequently have large spikes, moving to a cloud environment certainly makes more sense than paying for server resources that go unused between spikes.
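One simple way to put a number on how "spiky" your traffic is would be the ratio of the busiest hour to the average hour. The sketch below assumes you can pull hourly request counts out of your server logs. The function names and the threshold of 3.0 are arbitrary choices for illustration, not an industry standard, and the sample numbers are invented.

```python
from statistics import mean


def peak_to_average(requests_per_hour):
    """Ratio of the busiest hour to the average hour (1.0 = perfectly steady).

    Expects a non-empty list of hourly request counts with a non-zero average.
    """
    return max(requests_per_hour) / mean(requests_per_hour)


def looks_spiky(requests_per_hour, threshold=3.0):
    """True if the peak is at least `threshold` times the average.

    The default threshold is an arbitrary illustrative value.
    """
    return peak_to_average(requests_per_hour) >= threshold


# Invented sample data: hourly request counts for two different sites.
steady_site = [100, 110, 95, 105]
spiky_site = [100] * 22 + [2000] * 2

print(looks_spiky(steady_site), looks_spiky(spiky_site))
```

A ratio close to 1 suggests a standard hosting plan sized for your normal load will do. A large ratio means most of a peak-sized setup would sit idle most of the time. This doesn't tell you whether the spikes are periodic or sporadic; for that you would still need to look at when they occur.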

Which cloud?
Now you just need to choose your cloud, and that depends on the trade-off between flexibility and administration. The flexibility provided by Amazon’s EC2 means you need to have some IT resources that can administer the operating system instances, but it also generally means you’ll be able to host your existing application with few or no code changes. Microsoft’s Azure and Google’s App Engine are more constrained in terms of technology choices. An existing application will require at least some code modification to run on their cloud platforms, and you’ll really need to review the application’s full architecture to determine the effort to migrate. However, there is much less IT administration, which means that once you’ve migrated, your resources can be spent on building new functionality. If you can fit within the relatively narrow bounds of the Google App Engine, there is hardly any infrastructure to manage at all. Of course, if you’re a .NET development shop, the Microsoft Azure cloud probably makes the most sense: while you could deploy to Amazon EC2, you’d be wasting resources on server administration overhead instead of building new product features. If you’re hooking into lots of Google services, like XMPP, Maps, and the Google Web Toolkit, Google App Engine is going to fit in well.

What if you’re afraid to commit?
Cloud computing is a great concept, but it is new. And there have been some highly publicized outages that can make just about everyone nervous. Many people also worry about data security, although it isn’t much different than the security afforded by a reputable hosting center. If you don’t want to commit, but would like to supplement existing infrastructure, Amazon EC2 and Microsoft Azure make the most sense. Amazon is nothing more than virtual machines that happen to reside on Amazon’s infrastructure. You can keep some infrastructure “on premise” and other infrastructure in Amazon’s virtual machines. Similarly, Microsoft’s Windows Azure has AppFabric, which is an abstraction layer that your application code writes to, allowing the same code to run in Microsoft’s cloud as in an “on premise cloud” that runs on your own infrastructure. Google’s App Engine can only run locally in somewhat of a “restricted” mode, where some of the services are not available.

There are some big players offering a lot of choices, and probably more to come in the future as the technology matures, but cloud computing is certainly one of the larger computing paradigm shifts in this decade, IMHO second only to the explosion of mobile computing.