Virtualized and private cloud infrastructures are all about sharing resources - compute, storage and network. Optimizing these environments comes down to the ability to properly balance capacity supply and application demand. In practical terms, this means allocating the right amount of resources and putting workloads in the right places. These decisions are critical to ensuring performance, compliance and cost control.

Yet most organizations are using antiquated methods such as home-grown spreadsheets and best guesses to determine which infrastructure to host workloads on and how much capacity to allocate. Not only do these approaches hinder operational agility, but as hosting decisions become more and more complex, they are downright dangerous. The typical strategy employed to stave off risk is to over-provision infrastructure, on the theory that having excess capacity on hand will ensure that enough resources are available to avoid performance problems. This is not only expensive, but it also fails to prevent key operational issues, including many of the performance and compliance problems caused by incorrectly combining workloads.

In essence, this management challenge is the same one faced by hotel operators. Hoteliers need to constantly align guest demands with hotel resources and amenities. A hotel could not operate without a reservation system to manage resource availability and match it with guest needs, and yet this is exactly how companies manage their virtual and internal cloud environments. Imagine if a hotel didn't have the operational control provided by its reservation system, and was constantly forced to build more rooms than necessary in order to meet "potential" guest demands, rather than basing its decisions on an actual profile of historical and predicted demand. Or if it put guests in rooms without enough beds or the required amenities. This should start sounding familiar to anyone who has managed a production virtual environment.

Hotels have had the luxury of a long history to refine their operations, and by using reservation systems to properly place guests and manage current and future bookings, they have gained a complete picture of available resources at any point in time. In doing so, they have optimized their ability to plan for and leverage available capacity, achieving the right balance between supply and demand.

Why Workload Routing and Reservations are Important

By applying the same principles used to manage a hotel's available capacity to their own operations, IT organizations can significantly reduce risk and cost while ensuring service levels in virtual and cloud infrastructures. There are five reasons why the process of workload routing and capacity reservation must become a core, automated component of IT planning and management:

1. Complexity of the Hosting Decision

Hosting decisions are all about optimally aligning supply with demand. However, this is very complex in modern infrastructures, where capabilities can vary widely, and the requirements of the workloads may have a significant impact on what can go where. To make the optimal decision, there are three important questions that must be asked:

Do the infrastructure capabilities satisfy the workload requirements? This is commonly referred to as "fit for purpose," and is required to determine whether the hosting environment is suitable for the kind of workload being hosted. This question has not always been top of mind in the past, as the typical process to deploy new applications has been to procure new infrastructure with very detailed specifications. But the increasing use of shared environments is changing this, and understanding the specifications of the currently running hosting environments is critical. Unfortunately, early virtual environments tended to be one-size-fits-all, and early internal clouds tended to focus on dev/test workloads, so fit for purpose decisions rarely extended beyond ensuring the environment has the right CPU architecture.

Will the workloads fit? While the fit for purpose analysis is concerned with whether a target environment has the right kind of capacity, this aspect of making hosting decisions is concerned with whether there is sufficient free capacity to host the workloads. This is a more traditional capacity problem, but with a twist, as virtual and cloud environments are by nature shared environments, and the capacity equation is multi-dimensional. Resources such as CPU, memory, disk I/O, network I/O and storage capacity must all be considered, along with the levels and patterns of activity, to ensure that the new workloads are "dovetailing" with the existing ones. Furthermore, any analysis of capacity must ensure that the workload will fit at the point in time it is deployed and that it will continue to fit beyond that time.

What is the relative cost? While fit and suitability are critical to where to host a workload, in a tiebreaker the main issue becomes relative cost. While many organizations are still not sophisticated enough to have an accurate chargeback model in place, a more precise way to determine cost may be to consider the relative cost of hosting a workload as a function of policy and placement.
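The three questions above can be read as a filter-then-rank procedure. The following is a minimal sketch under stated assumptions, not a definitive implementation: the `Environment` and `Workload` classes, the capability tags, the resource names and the unitless `relative_cost` score are all hypothetical, and demand is reduced to a single peak value per resource rather than the activity patterns over time that a real analysis would need.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    name: str
    capabilities: set      # hypothetical capability tags, e.g. {"x86_64", "pci-zone"}
    free: dict             # free capacity per resource dimension
    relative_cost: float   # hypothetical unitless cost score


@dataclass
class Workload:
    name: str
    requirements: set      # capability tags the workload needs
    demand: dict           # peak demand per resource dimension (a simplification)


def is_fit_for_purpose(env, wl):
    """Question 1: does the environment offer every required capability?"""
    return wl.requirements <= env.capabilities


def will_fit(env, wl):
    """Question 2: is there enough free capacity in every resource dimension?"""
    return all(env.free.get(res, 0) >= amount for res, amount in wl.demand.items())


def choose_environment(envs, wl):
    """Question 3: among suitable candidates, use relative cost as the tiebreaker."""
    candidates = [e for e in envs if is_fit_for_purpose(e, wl) and will_fit(e, wl)]
    if not candidates:
        return None
    return min(candidates, key=lambda e: e.relative_cost)


# Illustrative data only; none of these environments come from the article.
envs = [
    Environment("dev-cloud", {"x86_64"}, {"cpu_ghz": 40, "mem_gb": 256}, 1.0),
    Environment("prod-a", {"x86_64", "pci-zone"}, {"cpu_ghz": 8, "mem_gb": 32}, 2.5),
    Environment("prod-b", {"x86_64", "pci-zone"}, {"cpu_ghz": 20, "mem_gb": 128}, 3.0),
]
billing = Workload("billing-db", {"x86_64", "pci-zone"}, {"cpu_ghz": 10, "mem_gb": 64})

chosen = choose_environment(envs, billing)
```

In this example the cheapest environment is rejected because it lacks a required capability, and the next cheapest is rejected because it has too little free CPU, so cost only decides among the environments that pass the first two questions.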

2. Capacity Supply and Application Demand are Dynamic

Nothing stands still in virtualized IT environments, and any decisions must be made in the context of ever-changing technologies, hardware specs, service catalogs, application requirements and workloads. This is becoming even more prevalent in the age of the software-defined data center.

Because of this, capacity must be viewed as a pipeline, with inbound demands, inbound supply side capacity, outbound demands and decommissioned capacity all being part of the natural flow of activity. Handling this flow is a key to achieving agility, which is a goal in the current breed of virtual and cloud hosting infrastructure. The ability to efficiently react to changing needs is critical, and the lack of agility in legacy environments is really a reflection of the fact that previous approaches did not operate as a pipeline. If it currently takes two to three months to get capacity, then it is a clear indication that there is no pipeline in place.
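One way to picture the pipeline is as a dated list of changes to supply and demand, replayed in order to project free capacity. The sketch below is illustrative only: the event format, the starting figures and the single abstract "capacity unit" are assumptions made for the example, not something the article prescribes.

```python
from datetime import date

# Each event: (effective date, change in supply, change in demand),
# expressed in arbitrary capacity units. All values are made up.
events = [
    (date(2024, 1, 10), 0, +30),   # inbound demand: a new workload arrives
    (date(2024, 2, 1), +50, 0),    # inbound supply: new servers come online
    (date(2024, 2, 15), 0, -20),   # outbound demand: a workload is retired
    (date(2024, 3, 1), -40, 0),    # decommissioned capacity
]


def project_free_capacity(supply, demand, events):
    """Replay events in date order and record free capacity after each one."""
    timeline = []
    for when, d_supply, d_demand in sorted(events, key=lambda e: e[0]):
        supply += d_supply
        demand += d_demand
        timeline.append((when, supply - demand))
    return timeline


timeline = project_free_capacity(supply=100, demand=80, events=events)

# Dates on which projected demand exceeds projected supply.
shortfalls = [when for when, free in timeline if free < 0]
```

With these numbers the projection shows a shortfall on January 10, three weeks before the new servers arrive, which is the kind of gap that is only visible when supply and demand are tracked as one flow.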

3. Meeting Your Customers' Expectations

Application owners today expect that capacity will be available when required, so IT needs a way to hold capacity for planned workload placements so that it is available on the date of deployment (like booking a hotel room in advance).

Sometimes the concept of a capacity reservation is equated with the draw-down on a pool of resources or a quota that has been assigned to a consumer or internal group. This is dangerous, as it simply ensures that a specific amount of resources will not be exceeded, and does not guarantee that actual resources will be available. This is analogous to getting a coupon from a store that says "limit 10 per customer" - it in no way guarantees that there will be any product left on the shelf. Organizations should beware of these types of reservations, as they can give a false sense of security.
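The difference between a quota and a true reservation can be made concrete with a toy model. This is a sketch under simplifying assumptions: the `CapacityPool` class is hypothetical, it tracks a single resource dimension, and it ignores dates entirely.

```python
class CapacityPool:
    """Toy pool with one resource dimension and per-consumer quotas."""

    def __init__(self, total, quotas):
        self.total = total      # capacity that physically exists
        self.quotas = quotas    # consumer -> maximum they are allowed to draw
        self.used = {}          # consumer -> amount drawn against their quota
        self.reserved = 0       # capacity held by confirmed reservations

    def within_quota(self, consumer, amount):
        """Quota check: only confirms the consumer's limit is not exceeded."""
        return self.used.get(consumer, 0) + amount <= self.quotas.get(consumer, 0)

    def reserve(self, consumer, amount):
        """Reservation: holds real capacity, so it can fail even within quota."""
        if not self.within_quota(consumer, amount):
            return False
        if self.reserved + amount > self.total:
            return False
        self.reserved += amount
        self.used[consumer] = self.used.get(consumer, 0) + amount
        return True


# Quotas add up to 160 units, but only 100 units actually exist.
pool = CapacityPool(total=100, quotas={"team-a": 80, "team-b": 80})

first = pool.reserve("team-a", 80)          # succeeds and holds 80 units
quota_ok = pool.within_quota("team-b", 60)  # the quota alone says yes
second = pool.reserve("team-b", 60)         # fails: only 20 units are left
```

Here the second team is well within its quota, yet nothing is left on the shelf, which is the "limit 10 per customer" situation described above.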

Capacity reservations are extremely useful to those managing the infrastructure capacity. They provide an accurate model of the pipeline of demand, which allows for much more efficient, accurate and timely purchasing decisions. Simply put, less idle capacity needs to be left on the floor. It also allows infrastructure to be managed as a portfolio, and if a certain mix of resources is needed to satisfy the overall supply and demand balance (such as buying servers with more memory), then procurement can factor this in.

4. Even Self-Service Needs Reservations

Self-service can create a highly volatile demand pipeline. But a bigger issue with self-service models is the way organizations perceive them. Many early cloud implementations focus on dev/test users or more grid-type workloads, and the entire approach to delivering capacity takes on a last-minute, unplanned flavor. But these are not the only kinds of workloads - or even the most common - and for a cloud to become a true "next-generation" hosting platform it must also support enterprise applications and proper release planning processes.

The heart of the issue is a tendency for organizations to equate self-service with instant provisioning. Although instant provisioning is useful for dev/test, grid and other horizontal scaling scenarios, it is not the only approach. For example, an online hotel reservation site provides self-service access to hotel rooms, but these rooms are not often being booked for that night. For business trips, conferences and even vacations, you book ahead. The same process must be put into place for hosting workloads.

Rather than narrowly defining self-service as the immediate provisioning of capacity, it is better to focus on the intelligent provisioning of capacity, which may or may not be immediate. For enterprise workloads with proper planning cycles and typical lead times, reservations are far more important than instant provisioning. And deciding where the application should be hosted in the first place is a solution-critical decision that is often overlooked. Unless an organization has only one hosting environment, the importance (and difficulty) of this should not be underestimated.
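Booking ahead, in the hotel sense, amounts to checking capacity over a future date range rather than only at the moment of the request. The sketch below assumes a hypothetical `ReservationBook` class with one resource dimension and half-open date ranges (the end date is excluded); instant provisioning is simply the special case of a booking that starts today.

```python
from datetime import date


class ReservationBook:
    """Toy date-aware reservation ledger for a single resource dimension."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.bookings = []  # list of (start, end, amount); end is exclusive

    def booked_on(self, day):
        """Total capacity held by bookings that are active on the given day."""
        return sum(a for s, e, a in self.bookings if s <= day < e)

    def can_book(self, start, end, amount):
        """True if the amount fits on every day in [start, end)."""
        # Usage only rises when a booking starts, so it is enough to check
        # the requested start date and each existing start inside the range.
        check_days = {start} | {s for s, _, _ in self.bookings if start <= s < end}
        return all(self.booked_on(d) + amount <= self.capacity for d in check_days)

    def book(self, start, end, amount):
        if not self.can_book(start, end, amount):
            return False
        self.bookings.append((start, end, amount))
        return True


book = ReservationBook(capacity=100)

spring = book.book(date(2024, 3, 1), date(2024, 6, 1), 70)   # accepted
overlap = book.book(date(2024, 4, 1), date(2024, 5, 1), 50)  # rejected: 70 + 50 > 100
summer = book.book(date(2024, 6, 1), date(2024, 9, 1), 50)   # accepted: spring has ended
```

Because the request carries its own dates, the same check answers both "will it fit when it is deployed" and "will it continue to fit" for the life of the booking.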

5. Demand Is Global

There is a huge benefit to thinking big when it comes to making hosting decisions. The long-term trend will undoubtedly be to start thinking beyond the four walls of an organization and make broader hosting decisions that include external cloud providers, outsourcing models and other potential avenues of efficiency. But the use of external capacity is still a distant roadmap item in many IT organizations, and the current focus tends to be on making the best use of existing capacity and purchasing dollars.

Operating in scale also allows certain assumptions to be challenged, such as the requirement for an application to be hosted at a specific geographical location. Geographical constraints should be fully understood and properly identified, and not simply assumed based on past activity or server-hugging paranoia. Some workloads do have specific jurisdictional constraints, compliance requirements or latency sensitivities, but many have a significant amount of leeway in this regard, and to constrain them unnecessarily ties up expensive data center resources.

Unfortunately, the manual processes and spreadsheet-based approaches in use in many organizations are simply not capable of operating at the necessary scale, and cannot properly model the true requirements and constraints of a workload. This not only means that decisions are made in an overly narrow context, but that the decisions that are made are likely wrong.

Moving Past Your "Gut"

Hosting decisions are far too important to be left to simplistic, best-efforts approaches. Where a workload is placed and how resources are assigned to it is likely the most important factor in operational efficiency and safety, and is even more critical as organizations consider cloud hosting models. These decisions must be driven by the true requirements of the applications, the capabilities of the infrastructure, the policies in force and the pipeline of activity. They should be made in the context of the global picture, where all supply and demand can be considered and all hosting assumptions challenged. And they should be made in software, not brains, so they are repeatable, accurate and can drive automation.
