Vertical scalability used to require optimizations inside the application, at the code level. Cloud computing changes the nature of vertical scalability and, one hopes, will lead to a new model of scalability based on the capabilities of Infrastructure 2.0 and increasingly granular resource management capabilities.

RightScale recently offered up its own analysis of Amazon Usage Estimates, and while the details they provide on Amazon usage from their vantage point are very interesting, I found one of their related observations even more fascinating:

In earlier days the predominant method of scaling was by launching more servers, but we are now seeing a lot more scaling by replacing smaller servers by larger ones.

The reason I find this fascinating has to do with not so much where cloud computing is today, but where (we hope at least) it is going.

Horizontal scalability is the ability of an application to be scaled out to meet demand through replication and the distribution of requests across a pool or farm of servers. It's the traditional load balanced model, and it's an integral component of cloud computing environments. Vertical scalability is the ability of an application to scale under load; to maintain performance levels as the number of concurrent requests increases. While load balancing solutions can certainly assist in optimizing the environment in which an application needs to scale by reducing overhead that can negatively impact performance (such as TCP session management, SSL operations, and compression/caching functionality), they can't solve core problems that prevent vertical scalability.

While this is still correct, the observation by RightScale tells me that there is now another way to vertically scale in a cloud computing environment. It’s a hack, in the traditional workaround geek-cool sense, but an ingenious one nonetheless.

WHAT’S THAT GONNA COST YA?

When an application hits the top boundaries of CPU and memory on any machine – whether virtual or physical – the traditional response to ensure scalability is to use load balancing solutions to horizontally scale the application, thus increasing concurrent user and TCP connection limits while hopefully addressing the degrading performance problems associated with high utilization.

But what appears to be happening, at least in some cases, is that rather than scaling horizontally by adding new instances, people are simply “upgrading” the virtual machine and thus raising the limits on CPU and memory. It’s like buying bigger hardware, only it’s a lot easier and faster and doesn’t require nearly as much preparation before deployment. In essence people have found a way to vertically scale their application by simply provisioning more CPU and memory. Sort of.

This does not address the inherent performance degradation that occurs as utilization rises, and if you check Amazon’s pricing you’ll find that it’s quite a jump from a “small” to a “large” instance, regardless of operating system or whether it’s “standard” or “high-CPU” usage. In fact, in both cases it’s 4x the cost to go from small to large or from medium to extra-large, which makes sense given you get 4x the EC2 compute units with each “upgrade”.

It seems, at least on the surface, that even with the costs of load balancing services from Amazon ($0.025/hour per Elastic Load Balancer + $0.008/GB of data transferred through an Elastic Load Balancer), it would be financially advantageous to simply launch a second instance and take advantage of load balancing, while also benefiting from the performance improvements typically associated with load balancing.
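To make that comparison concrete, here is a rough sketch of the arithmetic. The Elastic Load Balancer rates and the 4x upgrade factor come from the figures quoted above; the small-instance hourly price and the traffic volume are placeholders I made up for illustration, not Amazon's actual rates, and the function names are hypothetical.

```python
# Figures quoted in the post.
ELB_HOURLY = 0.025     # $/hour per Elastic Load Balancer
ELB_PER_GB = 0.008     # $/GB transferred through the load balancer
UPGRADE_FACTOR = 4     # small -> large costs 4x

def scale_up_hourly(small_price):
    """Hourly cost of replacing one small instance with one large instance."""
    return UPGRADE_FACTOR * small_price

def scale_out_hourly(small_price, instances, gb_per_hour):
    """Hourly cost of running several small instances behind one load balancer."""
    return instances * small_price + ELB_HOURLY + ELB_PER_GB * gb_per_hour

if __name__ == "__main__":
    small_price = 0.10   # ASSUMPTION: illustrative $/hour, not a real price
    gb_per_hour = 1.0    # ASSUMPTION: illustrative traffic volume
    print("scale up :", round(scale_up_hourly(small_price), 3))
    print("scale out:", round(scale_out_hourly(small_price, 2, gb_per_hour), 3))
```

The instance count is left as a parameter so you can compare a second small instance against whatever number of small instances you think matches the capacity of the bigger one.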

Now granted, the small servers are, by enterprise standards, pretty small and most organizations would never deploy an application into production running on a server with 1.7GB RAM and “the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.” I’m fairly certain my digital camera has more processing power than that, so what RightScale is seeing actually makes a great deal of sense to me. It might cost a bit more but it seems more apposite to provision higher performing instances despite the possibility of overprovisioning.

But I digress (yet again) and so now let’s get back to the point of this post which is not actually a comparison of vertical and horizontal scaling technology (although that’s interesting, too) but what this unique vertical scaling solution says about where the future of cloud computing (hopefully) lies.

WHERE COMPUTE RESOURCES, NOT VIRTUAL MACHINES, ARE PROVISIONED ON-DEMAND

Really, that’s it. That’s where cloud computing should be going and that’s where cloud computing hopefully is going. Someday you won’t need to launch a bigger instance because the environment will automatically, based on specified thresholds and business needs, allocate more CPU and/or memory on-demand to your “application”. How that happens isn’t important at the moment, but that it will happen and that it will take a combination of data derived from across the infrastructure is what is important and exciting. Because what RightScale is seeing is the first step toward someone deciding you shouldn’t have to launch a separate VM, you should just be able to grow the one you have until it can’t hold any more. Until you really are paying per clock-tick, per instructions executed and bytes in memory used instead of in chunks that may or may not be enough, or may be too much. Overpaying is not what cloud computing is supposed to be about either, but right now that very well may be the case.

But not in the future. No, in the future the infrastructure will see the requests, the users, the traffic patterns, and the performance of the application; it will process the needs of the application based on the context and capabilities of the infrastructure and the business needs, and then determine when an application needs more compute resources. It will further be able to signal management systems, or invoke the proper methods itself, to provision the resources needed to ensure the application scales. That same infrastructure should - and hopefully will - be able to determine at what point vertical scalability is no longer ensuring application performance meets business criteria (or has met some compute ceiling) and can then decide to provision resources using horizontal scaling techniques instead.
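A minimal sketch of that decision might look something like the following. Everything here is hypothetical: the field names, the idea of a single response-time target standing in for "business criteria", and the three action labels are my own assumptions for illustration, not any provider's API.

```python
from dataclasses import dataclass

@dataclass
class AppState:
    response_ms: float          # observed application response time
    target_ms: float            # business performance target (assumed metric)
    compute_units: int          # compute currently allocated to the instance
    max_compute_units: int      # ceiling a single instance can grow to
    last_vertical_helped: bool  # did the previous vertical step improve things?

def next_action(state: AppState) -> str:
    """Pick a scaling action: grow the instance first, then scale out."""
    if state.response_ms <= state.target_ms:
        return "none"                   # business criteria are being met
    at_ceiling = state.compute_units >= state.max_compute_units
    if at_ceiling or not state.last_vertical_helped:
        return "scale_horizontal"       # vertical scaling is exhausted
    return "scale_vertical"             # allocate more CPU and/or memory

if __name__ == "__main__":
    print(next_action(AppState(350.0, 200.0, 2, 8, True)))
```

The point of the sketch is only the ordering: grow what you have while that still works, and fall back to adding instances when it no longer does.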

The intelligence to interpret the technical context and measure it against business needs. The ability to connect and integrate to gather and share that technical context. The flexibility to automatically determine whether horizontal or vertical scaling is necessary to meet those business criteria. That’s a dynamic infrastructure, that’s what we’re trying to enable via Infrastructure 2.0.

That’s the future of cloud computing. That’s where we’re going. And that’s why it’s so exciting to see the beginnings of it happening with virtual images. It is just the beginning, and people are starting to really flex the boundaries of cloud computing, which will lead to even more innovation and change and shift us into a higher gear so we can get where I hope we’re going a little bit faster.