An IT industry insider's perspective on information, technology and customer challenges.

October 04, 2013

Beyond SDDC: The "Burger King" Model Of Service Delivery

We are constantly bombarded with marketing slogans and taglines. Some of them stick in your brain for a very long time.

In the hotly-contested fast-food burger market, Burger King launched a very successful campaign -- "Have It Your Way" -- that emphasized their willingness to tailor each and every hamburger ordered. This was in sharp contrast to others (e.g. McDonald's) who had relatively standard offerings at the time.

In our virtual infrastructure / cloud world, something similar is happening as well. The historical approach of pre-defined menu items (performance, availability, capacity, etc.) is giving way to a strong desire for a-la-carte ordering on a per-application basis, with the ability to change your mind at any time.

And, of course, this creates new challenges and opportunities all around.

Antecedents

Over the last several years, I've worked with many customers and service providers who invested in the big leap of creating a standardized infrastructure services catalog.

Instead of hand-crafting IT plumbing, they built robust catalogs of repeatable and reusable infrastructure services -- the familiar "gold, silver, bronze" approach. Service providers would offer these to customers; enterprise IT organizations would offer them to internal consumers.

Operationally, this form of IT-as-a-service was a huge leap forward: the focus was now on the suitability of the service delivered and consumed vs. how it was constructed. Very transformative stuff -- a theme that's still being played out in IT organizations around the globe.

But the next wave appears to be starting, driven by precisely articulated application requirements -- and enabled by software-defined data centers.

The Discussion

Let's say that I'm an unsophisticated consumer of enterprise IT services, and I approach my ITaaS-oriented enterprise IT team (or an external service provider) with my new application that I need infrastructure for.

Since I'm sort of new at this, they help me out.

They show me some pre-configured options for capacity, performance and availability -- along with the costs for each. After a bit of discussion, I make my choices, everything is speedily provisioned -- and I can come back to them if things change down the road.

Not a bad model for an unsophisticated application like mine, no? I'm more than OK with a few standard choices on the menu, because I really don't need more than that.

But the discussion is very different if I'm a sophisticated consumer of enterprise IT services, and I know *exactly* what kind of behavior I'd like out of the infrastructure.

Here's the default infrastructure profile for my database servers.

If there's a demand spike and measured query response time rises above 3 msec, I'd like you to take action: faster storage, more CPU, etc. Keep doing that until my query response time drops back below 3 msec, and then you're free to reallocate the resources elsewhere until I need them again.

I may need to spin up a bunch of new instances in a hurry, so be ready for that. And here's my best prediction as to how often that will happen, and how quickly you'll need to react.

As far as data protection goes, I'm OK with a 10 minute RPO and 30 minute RTO, unless it's the holiday retail season, and then I need more like 30 seconds of RPO and 5 minutes of RTO. Of course, when that period passes, we can go back to the 10/30 thing.

I'm also really concerned about the network between my web apps and my database servers -- if I start seeing congestion and latency, I'll want to dial up more bandwidth temporarily.

Got all that? Great! Let's go ...

Yikes!

As you can see, this utterly breaks both (a) the historical bespoke configuration approach, and (b) any current notion of a standard service catalog offering for infrastructure.

Your only reasonable alternative would be the infamous "have a hunch, provision a bunch": try to reserve dedicated resources for the high-water mark.

Yes, that's incredibly wasteful -- and potentially ineffective -- but your options in this scenario are currently very limited.

Is This A Realistic Discussion To Have?

I think so.

For starters, I believe we are entering a new era of what I call "monster apps" -- complex, organic chains of application entities that reflect large-scale integrated business processes. As competitive business models get more integrated and hence complex, the app chains behind them inevitably follow. It's becoming less useful or interesting to think of application components in isolation.

And the people who are responsible for these larger, complex apps will be very sophisticated consumers of IT services. Put differently, your best (internal) customers are going to demand it before long.

Anyone who has spent time on AWS is already comfortable with the idea of dialing up (and dialing down) resources as needed, albeit in a very limited way. Economic incentives encourage complex application owners to be very precise in understanding exactly what infrastructure resources are needed -- when and where -- to deliver the exact combination of capacity, performance and availability they're looking for.

What's Needed?

At a very high level, a good starting point would be a much richer and more expressive set of models around per-application infrastructure policies. Go back to that example above -- how would you represent that policy in a consistent and programmatic way?
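As a thought experiment, here's one way the database policy from the discussion above might be captured as structured data. The class names, field names and most of the specific numbers (everything other than the 3 msec latency target and the RPO/RTO figures from the example) are hypothetical -- this is a sketch of the shape of such a policy, not anyone's actual API.

```python
# A minimal sketch of an application-centric infrastructure policy as data.
# All names and numbers are illustrative assumptions, not a real product API.
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class LatencyRule:
    metric: str                   # e.g. "query_response_time_ms"
    threshold_ms: float           # act when the metric exceeds this value
    actions: List[str]            # escalation steps, tried in order


@dataclass
class ProtectionRule:
    rpo_seconds: int
    rto_seconds: int
    effective_from: Optional[date] = None    # None = default rule, always in effect
    effective_until: Optional[date] = None


@dataclass
class AppInfraPolicy:
    app: str
    latency: LatencyRule
    protection: List[ProtectionRule]   # default posture plus seasonal overrides
    burst_instances: int               # how many new instances to be ready for
    burst_reaction_seconds: int        # how quickly they must come online
    network_floor_mbps: int            # dial up the web-to-db link if congested


db_policy = AppInfraPolicy(
    app="orders-db",
    latency=LatencyRule(
        metric="query_response_time_ms",
        threshold_ms=3.0,
        actions=["move_to_faster_storage", "add_vcpu", "add_memory"],
    ),
    protection=[
        ProtectionRule(rpo_seconds=600, rto_seconds=1800),            # 10 min / 30 min default
        ProtectionRule(rpo_seconds=30, rto_seconds=300,               # 30 sec / 5 min, holiday season
                       effective_from=date(2013, 11, 25),
                       effective_until=date(2013, 12, 31)),
    ],
    burst_instances=20,
    burst_reaction_seconds=300,
    network_floor_mbps=1000,
)
```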

Policies will likely have all sorts of if-then constructs, triggered by external events: high demand, an outage, cutting over to a new version, etc. Most importantly, they're unlikely to result in static infrastructure requirements. So I've started to call these dynamic policies or perhaps scenario policies to differentiate them from the more familiar static ones.
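To make the "dynamic" part concrete, here's a toy evaluation pass over a simplified form of the same policy: if-then rules keyed off live metrics and the calendar. The metric names, the 80% congestion threshold and the action strings are assumptions for illustration; a real SDDC control plane would subscribe to telemetry rather than be handed a dictionary.

```python
# A toy evaluation of a "scenario policy": if-then rules driven by live
# metrics and calendar events. Names and thresholds are illustrative only.
from datetime import date


def active_protection(protection_rules, today):
    """Pick the seasonal override if today falls in its window, else the default."""
    for rule in protection_rules:
        start, end = rule.get("from"), rule.get("until")
        if start and end and start <= today <= end:
            return rule
    return protection_rules[0]


def evaluate(policy, metrics, today=None):
    """Return the infrastructure actions the policy asks for right now."""
    today = today or date.today()
    actions = []

    # If-then: a latency breach escalates through the configured remedies.
    if metrics["query_response_time_ms"] > policy["latency_threshold_ms"]:
        actions.extend(policy["latency_actions"])

    # If-then: congestion on the web-to-db path dials up bandwidth temporarily.
    if metrics["db_link_utilization_pct"] > 80:
        actions.append("increase_db_link_bandwidth")

    # Calendar-driven: swap the data-protection posture during the holiday window.
    rule = active_protection(policy["protection"], today)
    actions.append(f"set_replication(rpo={rule['rpo_s']}s, rto={rule['rto_s']}s)")

    return actions


policy = {
    "latency_threshold_ms": 3.0,
    "latency_actions": ["move_to_faster_storage", "add_vcpu"],
    "protection": [
        {"rpo_s": 600, "rto_s": 1800},                              # default: 10 min / 30 min
        {"rpo_s": 30, "rto_s": 300,
         "from": date(2013, 11, 25), "until": date(2013, 12, 31)},  # holiday window
    ],
}

print(evaluate(policy, {"query_response_time_ms": 4.2, "db_link_utilization_pct": 91}))
```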

Dive down underneath the plumbing, and we'll need very rich sets of dynamic infrastructure services capable of responding to the moment-to-moment demands of application-centric policies: more or less CPU, memory, network bandwidth, IOPS and application instances, newly instantiated services, etc. -- all as the dynamic norm vs. a human-driven exception.
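For illustration, here's what the actuator side of that might look like as an abstract interface -- the set of knobs an SDDC would need to expose programmatically so policy decisions can be acted on without a human in the loop. The interface and method names are invented for this sketch; they don't correspond to any particular product.

```python
# A hedged sketch of the programmable surface a dynamic infrastructure layer
# might expose. All names are hypothetical, for illustration only.
from abc import ABC, abstractmethod


class DynamicInfraService(ABC):
    """Resource controls that can be adjusted moment to moment, per application."""

    @abstractmethod
    def set_compute(self, app: str, vcpus: int, memory_gb: int) -> None: ...

    @abstractmethod
    def set_storage(self, app: str, iops: int, tier: str) -> None: ...

    @abstractmethod
    def set_network(self, app: str, bandwidth_mbps: int) -> None: ...

    @abstractmethod
    def scale_instances(self, app: str, count: int) -> None: ...

    @abstractmethod
    def set_replication(self, app: str, rpo_seconds: int, rto_seconds: int) -> None: ...
```

Note the design choice in this sketch: each call declares a desired state ("set bandwidth to X") rather than an imperative delta, which fits the "here's what I need, you go make it happen" model discussed below.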

I think of SDDC -- the software-defined data center -- as evolving into this role: orchestrating resources dynamically based on pushed-down application policy requirements.

I also see a sharp distinction between two infrastructure abstraction approaches currently in evidence. The first is what I'd describe as a bottom-up model: here's what we've got on the menu, you choose. The second is top-down: here's what I need, you go make it happen. The first is better than no model at all; the second clearly trumps the first.

But I think to complete the big picture, we need to introduce one more abstraction.

The New Role Of Mediation

Anyone who's worked in a corporate setting knows that there are never enough resources available to satisfy the demands of every stakeholder, so choices must be made. And the same will likely be the norm in this model.

I think of mediation as distinct and separate from application-centric policy and SDDC. One creates demand, the other attempts to fulfill it with available resources. But when hard choices need to be made, a third party has to be introduced into the model.

Can we (or should we) move things around? Can we bring more resources online? Can we deprioritize some components to make resources available for more important ones?

Today, this is the hard work of enterprise IT: balancing demand against supply, and doing so quickly and efficiently. But in this proposed model, vast amounts of friction have been removed between the entities expressing their needs (demand) and the infrastructure's ability to dynamically satisfy them (SDDC).

Remove this friction, and the bottleneck inevitably moves to the mediation function: being able to make prioritized decisions around what's important, and what's less so. It's not enough to simply think in terms of prioritizing applications -- keeping in mind that they are vast, aggregated groups of functionality -- the challenge is isolating the subcomponents that matter, and that requires an entirely new level of insight.
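As a purely illustrative sketch, a mediation pass might look something like this: competing requests, each tied to a specific subcomponent rather than a whole application, ranked by business priority against a finite pool. The component names, priority scheme and single-dimension "capacity" are simplifications invented for the example.

```python
# A toy mediation pass: when policy-driven demand exceeds what the SDDC can
# supply, something has to rank the requests. Names are illustrative only.
from dataclasses import dataclass


@dataclass
class ResourceRequest:
    component: str     # the specific subcomponent asking, not just "the app"
    units: int         # amount of the contended resource requested
    priority: int      # business priority; lower number = more important


def mediate(requests, capacity):
    """Grant requests in priority order until capacity runs out."""
    granted, deferred = [], []
    for req in sorted(requests, key=lambda r: r.priority):
        if req.units <= capacity:
            capacity -= req.units
            granted.append(req)
        else:
            deferred.append(req)   # candidate for deprioritization or queuing
    return granted, deferred


requests = [
    ResourceRequest("orders-db/replica", units=8, priority=1),
    ResourceRequest("reporting/batch",   units=6, priority=3),
    ResourceRequest("web-tier/cache",    units=4, priority=2),
]
granted, deferred = mediate(requests, capacity=12)
print([r.component for r in granted], [r.component for r in deferred])
```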

We're Making Progress

Virtualization (and cloud) at its outset was mostly about efficiency -- doing things for less money. While that never loses its appeal, it didn't take long for people to figure out that the real business benefit was agility -- reacting more quickly to new demands.

And that too will never lose its appeal. How we define agility in our emerging IT world is likely to undergo rapid evolution over the next five years: moving from simply provisioning static resources more quickly to a model where resources and services are dynamically varied through very expressive application-centric policies.

It appears inevitable.

Because, if I'm the owner of a large, complex and business-critical application, I want it my way.

Comments

The approach and situation you have described have already been investigated, and a resolution is provided in one of my research papers, presented a couple of years back at the IEEE International SysCon Conference.
Here is the link to the paper if you wish to read it. You really need to understand the multi-dimensional situation in a highly virtualised and orchestrated environment where the focus is just on 'on-demand' resource allocation for high-priority business services. The problem is well studied, the data is analysed and a solution is already proposed in the paper.

The situation is worst in cloud environments, where the orchestration engine is configured with rules to provision resources with 'tunnel vision'. Unfortunately, neither cloud service providers nor consumers are aware of the 'syndrome'.

Currently I am working on extending my research to resolve more advanced problems in the cloud environment which cannot be resolved simply by virtualisation, consolidation and orchestration components.