The lesson from the Amazon outage: It's time to layer the cloud

If cloud services were layered architecturally, providers would be able to better contain the effects of failure on the critical services

InfoWorld | Oct 25, 2012

The recent Amazon Web Services outage reminded us once again that cloud computing is not yet a perfect science. That said, perhaps it's also time we define formal methods, models, and approaches to make cloud computing easier to understand -- and more reliable.

Most organizations that implement cloud computing view clouds as a simple collection of services or APIs; they use the cloud functions, such as storage and compute, through these services. To them, building on the cloud is just a matter of mixing and matching these services in an application or process to form the solution.

The trouble with that approach? Most cloud users see only a single layer of services, and that one layer exposes all of the cloud computing functions. Thus, all types of services exist at this layer, from primitive to high level, from coarse to fine grained.

Although it's not at all a new concept, in many instances it's helpful to define cloud services using a layered approach that works up from the most primitive to the highest-level services, with the higher-level services depending on those at the lower levels. Many IaaS clouds already work this way internally. However, all exposed services, primitive or not, are pretty much treated the same: as one layer.

A better approach would be for each layer to have a common definition from cloud provider to cloud provider. Each layer would provide a specific set of predefined levels of support. For example:

Layer 0: Hardware services

Layer 1: Virtualization service (if required)

Layer 2: Storage

Layer 3: Compute

Layer 4: Data

Layer 5: Tenant management

Layer 6: Application

Layer 7: Process

Layer 8: Management

Of course, this is just a concept. I suspect the layers will change to represent the purpose and functions of each cloud.
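To make the idea concrete, here is a minimal sketch of the example stack as an ordered type. It assumes the simplest possible rule, that each layer depends on every layer below it; the names `Layer` and `depends_on` are illustrative only and do not come from any provider's API.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The example stack; numbering follows the list above."""
    HARDWARE = 0
    VIRTUALIZATION = 1
    STORAGE = 2
    COMPUTE = 3
    DATA = 4
    TENANT_MANAGEMENT = 5
    APPLICATION = 6
    PROCESS = 7
    MANAGEMENT = 8


def depends_on(layer: Layer) -> list[Layer]:
    """Layers this one relies on (assumption: every layer below it)."""
    return [lower for lower in Layer if lower < layer]


print([lower.name for lower in depends_on(Layer.COMPUTE)])
# ['HARDWARE', 'VIRTUALIZATION', 'STORAGE']
```

A real cloud would likely have a sparser dependency graph than this strict ordering, but the ordering is enough to show why the layer a service lives in matters.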

Whatever the right specification, the core notion is that we would treat each layer differently, understanding that the lower-level or primitive layers support the layers above, so they should be extremely fault-tolerant and scalable. The higher layers are closer to the application instances or solutions and are treated accordingly. You can work the same angle with security and governance -- each layer is treated differently based on its purpose and mission.
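The reasoning here can be sketched in a few lines: if a layer fails, everything stacked on top of it is at risk, so the lowest layers carry the most weight. The example below assumes the same strict bottom-to-top ordering as the example list; `affected_by_failure` and `layers_supported` are hypothetical helpers, not part of any real cloud service.

```python
# Example stack from the article, lowest layer first.
LAYERS = [
    "Hardware services",
    "Virtualization service",
    "Storage",
    "Compute",
    "Data",
    "Tenant management",
    "Application",
    "Process",
    "Management",
]


def affected_by_failure(failed_index: int) -> list[str]:
    """Layers assumed to be hit by a failure: the failed layer and all above it."""
    if not 0 <= failed_index < len(LAYERS):
        raise ValueError(f"no such layer: {failed_index}")
    return LAYERS[failed_index:]


def layers_supported(index: int) -> int:
    """How many layers sit on top of this one (a rough measure of criticality)."""
    if not 0 <= index < len(LAYERS):
        raise ValueError(f"no such layer: {index}")
    return len(LAYERS) - index - 1


for i, name in enumerate(LAYERS):
    print(f"Layer {i}: {name:<24} supports {layers_supported(i)} layer(s) above")
```

Under this model a storage failure touches seven of the nine layers, while a failure in the management layer touches only itself, which is the case for holding the lower layers to a stricter fault-tolerance standard.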

There's no magic here or anything new. We've been layering architecture for years, and I'm sure the layering concept is on the whiteboards of most cloud computing providers. But it needs to be explicit in the services, too. If we can map most cloud computing services to clearly understandable domains, we can begin to manage and evaluate those layers differently, considering the differing degrees of importance.