How cloud scalability can be easily achieved today

It’s perhaps the most captivating myth of cloud: autoscaling. A major selling point, the promise of autoscaling is that computing resources scale automatically with load, absorbing unexpected traffic spikes with no human intervention.

The workload just knows, for example, when your website has been mentioned on a television show or in a major magazine; demand spikes, and the cloud compensates. The cloud becomes a nimble, always-on, semi-sentient member of your IT staff with extraordinary reflexes, reacting accordingly.

That cloud nirvana doesn’t really exist, though – or rather, for most, it is out of reach.

Why? Well, there are dozens of reasons, from the technological to the practical. And unfortunately, it’s this promise that IT leaders are often sold on – the belief that autoscaling is easy, quick to set up and always ensures 100 per cent uptime. The truth about autoscaling, as with most technological promises, is that it’s a little more involved.

We often say we want autoscaling when, in our heart of hearts, we know that there are many ways a system can appear to be spiking where the spike has, in fact, nothing to do with demand. It could be an attack. It could be a runaway process. It could be a middling attempt at recursion by a newbie developer. The list goes on.

For this mythical self-scaling automatic magic-cloud to exist, it would not only have to be amazingly responsive, but it would also have to be intelligent enough to triage, in those moments, the reason for the spike and respond – or not – accordingly. Few humans have that skill, let alone distant, application-unaware computing systems.

So autoscaling isn’t a little more involved, it’s actually a lot more involved. And it’s also not exactly ‘auto’. It is, in fact, complex, time-consuming and demands a great deal of technical knowledge and skill. To create a truly automated and self-healing architecture that scales with little or no human intervention requires custom scripts and templates that can take months for a skilled team to get right, and many organisations have neither the time nor those resources to make it work.
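To make that concrete, the "custom scripts" in question often start life as something like the naive threshold rule below – a hypothetical Python sketch, with all names and thresholds invented for illustration. Notice how much it does not know: a genuine demand spike, a denial-of-service attack and a runaway process all look identical to it.

```python
# Hypothetical threshold-based scaling rule -- a sketch, not any real cloud API.
def decide_capacity(cpu_samples, current_instances, min_instances=2, max_instances=20):
    """Return a new instance count based on recent CPU utilisation samples (0-100)."""
    avg_cpu = sum(cpu_samples) / len(cpu_samples)
    if avg_cpu > 80:
        # Scale out -- but an attack or a runaway process looks exactly
        # the same to this rule as genuine customer demand.
        return min(current_instances * 2, max_instances)
    if avg_cpu < 20:
        # Scale in -- risky if the lull is just a gap in the metrics feed.
        return max(current_instances // 2, min_instances)
    return current_instances
```

Even this toy version needs tuning – thresholds, cooldown periods, sample windows – and a production-grade equivalent grows into exactly the months-long scripting effort described above.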

So, what’s the next best thing? Alerting on a potential demand spike so a real live human can assess the situation (the timing of a major television show or print magazine, marketing email blast data, and some knowledge of whether the last patch might be at fault) and make an intelligent scaling decision.
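A minimal sketch of that "alert, then let a human decide" approach might look like the following (hypothetical Python; the metric names and threshold are invented). The point is that the system only flags the anomaly – the scaling decision stays with a person who has the context.

```python
# Hypothetical spike alert -- notifies a human rather than scaling automatically.
def spike_alert(requests_per_min, baseline_rpm, factor=3.0):
    """Return an alert message if traffic exceeds `factor` times baseline, else None."""
    if requests_per_min > baseline_rpm * factor:
        return (f"Traffic at {requests_per_min} req/min "
                f"({requests_per_min / baseline_rpm:.1f}x baseline). "
                "Check the marketing calendar, media mentions and the last patch "
                "before scaling.")
    return None
```

A human on call then pairs the alert with the context no metric captures – was there an email blast, a media mention, a suspect deploy – before deciding whether to scale at all.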

But, as has often been pointed out, people aren’t actually doing that. Instead, they are just over-provisioning. Why? Well, in part because resizing instances or VMs in most clouds is a taxing process. Often, it requires downtime and restarts in new, larger instances. And then, when the hullabaloo about your artisanal locally-grown alpaca friendship bracelets passes, you need to reset the whole thing to a smaller size.

IT people are busy. They don’t have time for this either. Couple that with the fact that they are chastised when systems are under-provisioned or fail, that restarting a system may land it on an unfortunate server filled with noisy neighbours, and that all of this is happening at the scale of dozens or hundreds of servers at a time – and it feels like a great time to just over-provision everything and leave well enough alone.

There is an alternative.

Managed clouds are different. A managed cloud is not exactly like having an autonomic, self-motivated mega-computer presciently rejiggering your RAM, but then most of us aren’t eager to invite HAL 9000 to our staff meetings.

Instead, you can achieve a happy middle ground of cost-savings and ultimate control – and that feels like an IT myth whose time has come.