Keep Track of the Stack: Minimizing Downtime

One of the most common blockers keeping organizations away from cloud computing is the fear of downtime. Whether it’s a cloud outage or just an application crashing because it wasn’t designed for cloud deployment, the effect on end-user productivity and even on bottom-line revenue can be devastating. This can scare IT people who see cloud computing as something that decreases their level of control. They can’t touch that humming rack of servers that used to be in the datacenter, so it feels as though they’ve put the success or failure of workloads in someone else’s hands. Someone they don’t even know.

But control really isn’t the issue. You’ve got all the control you need for cloud computing, especially with a mature cloud provider like AWS. The trick is understanding that the cloud is a different way of doing things: different deployment, different delivery, basically a whole new set of possible gotchas. End users without technical expertise read cloud marketing and press literature that makes it sound as though putting applications and infrastructure in the cloud is just a matter of hitting a button, but from the IT side, that’s not true at all.

Take the AWS stack as an example, with a cloud outage as the scenario. A major feature of the service that customers commonly ignore is its sophisticated failover capabilities. Many people just activate an account, deploy some virtual machines and disregard everything else, but that misses out on valuable capabilities for minimizing downtime. For example, AWS is made up of many geographic regions, each containing multiple physically separate availability zones. By using this as a foundation, you can design a virtual infrastructure that is often more reliable than a physical version you’d manage in-house. IT staff who aren’t cloud experts will understand this feature, since failover is a network necessity whether local or remote.
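To make the availability zone idea concrete, here is a minimal sketch of the kind of check you might run against a deployment plan: if any single zone goes dark, are enough instances still running elsewhere? The function name, instance names, and zone names are all hypothetical and chosen for illustration; nothing here calls an AWS API.

```python
from collections import Counter


def survives_single_zone_loss(placement: dict[str, str], min_running: int) -> bool:
    """Check whether a plan keeps `min_running` instances up if any one zone fails.

    `placement` maps a (hypothetical) instance name to the zone it runs in.
    """
    total = len(placement)
    if total < min_running:
        # Not enough capacity even with every zone healthy.
        return False
    per_zone = Counter(placement.values())
    # Losing a zone removes every instance placed in it.
    return all(total - count >= min_running for count in per_zone.values())


if __name__ == "__main__":
    # Everything in one zone: a single outage takes the whole application down.
    one_zone = {"web-1": "zone-a", "web-2": "zone-a"}
    # Same instance count, spread across two zones.
    two_zones = {"web-1": "zone-a", "web-2": "zone-b"}

    print(survives_single_zone_loss(one_zone, min_running=1))
    print(survives_single_zone_loss(two_zones, min_running=1))
```

The point of the sketch is that the second layout uses the same number of virtual machines as the first; the added resilience comes from where they are placed, not from buying more of them.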

Uptime and availability for IT systems always have been, and always will be, a business decision. This can be a challenge for most businesses to understand, as IT is often seen as an art form instead of a science. Surely if we have better IT folks and good equipment we can support 100% uptime! The cloud actually makes uptime and availability an easily understood investment in terms of cost, and with appropriate automation it can make failovers fast and automatic.
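A rough way to see why availability is an investment decision is to put numbers on it. The sketch below assumes that zones fail independently of each other and that the application stays up as long as at least one zone is up; both are simplifying assumptions, and the 99% single-zone figure is a made-up input, not an AWS commitment.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` independent replicas where any one suffices."""
    return 1 - (1 - single) ** copies


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year at a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    single_zone = 0.99  # hypothetical input for illustration only
    for copies in (1, 2, 3):
        a = combined_availability(single_zone, copies)
        print(f"{copies} zone(s): {a:.6f} available, "
              f"{downtime_hours_per_year(a):.2f} hours/year down")
```

Under those assumptions, one zone at 99% works out to about 87.6 hours of downtime a year, while two zones bring that under an hour. Real failures are not perfectly independent, so treat the output as a way to frame the cost conversation rather than a prediction.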

Having a deeper understanding of the cloud stack your provider is using can make decisions around uptime and availability easier. As an example, Multi-AZ deployments can get you added resiliency at a modest cost. Availability zones within a region are connected by low-latency links, and combined with elastic load balancing they make it straightforward to shift workloads from one zone to another. These won’t just minimize downtime; they can also improve application performance beyond what you’d have if you were managing in-house!

For those folks for whom running applications and infrastructure in different regions is a necessity, there are options in the AWS stack, too. These might be people who use the web to do business in multiple countries. Without understanding the full scope of the AWS stack, these folks could be doomed to constant application monitoring and fast, manual fixes so their customers can reach their websites reliably. Smart use of all of AWS’ features, on the other hand, can make this not only viable but a performance booster at the same time!

Using AWS’ virtual private cloud (VPC) service, for instance, allows IT to manage infrastructure on AWS at a much more granular level than with the standard service, including areas like IP addressing, hardware/virtual bridging, security, and virtual storage optimization. For global distribution, AWS also has CloudFront, a market-leading web content delivery service that optimizes content delivery across big geographic distances using advanced caching. It can also integrate with other AWS cloud services, which opens up new design options for cloud infrastructure.

Customers who already have standard AWS infrastructure might expect implementing these options to be frustrating and shy away, but even here, knowledge of AWS’ comprehensive stack helps. Using CloudFormation templates, your IT staff and your developers can organize new and existing AWS resources and provision them however they want. CloudFormation works from the AWS Management Console and is also available through command line tools and APIs.
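For a feel of what a CloudFormation template looks like, here is a small sketch that builds one as a Python dictionary and prints it as JSON. It describes a single database with the Multi-AZ option switched on. The choice of a database, the logical name AppDatabase, the parameter name DbPassword, the user name, and the instance size are all assumptions made for illustration, and the template has not been validated against a live account.

```python
import json


def multi_az_database_template() -> dict:
    """Build an illustrative CloudFormation template as a plain dictionary."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative Multi-AZ database (sketch only)",
        "Parameters": {
            # Hypothetical parameter; NoEcho keeps the value out of console output.
            "DbPassword": {"Type": "String", "NoEcho": True},
        },
        "Resources": {
            # "AppDatabase" is a made-up logical name.
            "AppDatabase": {
                "Type": "AWS::RDS::DBInstance",
                "Properties": {
                    "Engine": "mysql",
                    "DBInstanceClass": "db.t3.micro",  # assumed size
                    "AllocatedStorage": "20",          # assumed size in GiB
                    "MultiAZ": True,  # ask for a standby in a second zone
                    "MasterUsername": "appadmin",      # assumed user name
                    "MasterUserPassword": {"Ref": "DbPassword"},
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(multi_az_database_template(), indent=2))
```

Keeping the template in a file under version control means the resilience decision (here, the single MultiAZ setting) is written down and repeatable instead of living in someone’s memory of which console boxes were ticked.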

There are many more AWS features and strategies you can use to swat potential downtime before it bites you, too many to list here. Using the right ones for your situation is a big job, one with a steep learning curve. Expecting your in-house IT staff to develop and maintain that level of expertise in someone else’s rapidly evolving service stack isn’t practical. To get the best reliability bang for your buck, work closely with an AWS value-add partner: folks who are experts in the AWS stack and can map its capabilities to the specific needs of your business. This takes planning, but once it’s completed you’ll not only have a more resilient application infrastructure than you could ever hope to have in-house, you’ll also have expertise you can call upon in case of problems.