IT management in a world of utility IT

A cynic might call it “could computing” rather than “cloud computing”. What if you could get rid of your data center. What if you could pay only for what you use. What if you could ramp up your capacity on the fly. We’ve been hearing these promising pitches for a while now and recently the intensity has increased, fueled by some real advances.

As an IT management architect who is unfortunately unlikely to be in position to retire anytime soon (donations accepted for the send-William-to-retirement-on-a-beach fund) it forces me to wonder what IT management would look like in a world in which utility computing is a common reality.

First, these utility computing providers themselves will need plenty of IT management, if not necessarily the exact same kind that is being sold to enterprises today. You still need provisioning (automated of course). You definitely need access measuring and billing. Disaster recovery. You still have to deal with change planning, asset management and maybe portfolio management. You need processes and tools to support them. Of course you still have to monitor, manage SLAs, and pinpoints problems and opportunities for improvement. Etc. Are all of these a source of competitive advantage? Google is well-known for writing its infrastructure software (and of course also its applications) in house but there is no reason it should be that way, especially as the industry matures. Even when your business is to run a data center, not all aspects of IT management provide competitive differentiation. It is also very unclear at this point what the mix will be of utility providers that offer raw infrastructure (like EC2/S3) versus applications (like CRM as a service), a difference that may change the scope of what they would consider their crown jewels.

An important variable in determining the market for IT management software directed at utility providers is the number of these providers. Will there be a handful or hundreds? Many people seem to assume a small number, but my intuition goes the other way. The two main reasons for being only a handful would be regulation and infrastructure limitations. But, unlike with today’s utilities, I don’t see either taking place for utility computing (unless you assume that the network infrastructure is going to get vertically integrated in the utility data center offering). The more independent utility computing providers there are, the more it makes sense for them to pool resources (either explicitly through projects like the Collaborative Software Initiative or implicitly by buying from the same set of vendors) which creates a market for IT management products for utility providers. And conversely, the more of a market offering there is for the software and hardware building blocks of a utility computing provider, the lower the economies of scale (e.g. in software development costs) that would tend to concentrate the industry.

Oracle for one is already selling to utility providers (SaaS-type more than EC2-type at this point) with solutions that address scalability, SLA and multi-tenancy. Those solutions go beyond the scope of this article (they include not just IT management software but also databases and applications) but Oracle Enterprise Manager for IT management is also part of the solution. According to this Aberdeen report the company is doing very well in that market.

The other side of the equation is the IT management software that is needed by the consumers of utility computing. Network management becomes even more important. Identity/security management. Desktop management of some sort (depending on whether and what kind of desktop virtualization you use). And, as Microsoft reminds us with S+S, you will most likely still be running some software on-premises that needs to be managed (Carr agrees). The new, interesting thing is going to be the IT infrastructure to manage your usage of utility computing services as well as their interactions with your in-house software. Which sounds eerily familiar. In the early days of WSMF, one of the scenarios we were attempting to address (arguably ahead of the times) was service management across business partners (that is, the protocols and models were supposed to allow companies to expose some amount of manageability along with the operational services, so that service consumers would be able to optimize their IT management decision by taking into account management aspects of the consumed services). You can see this in the fact that the WSMF-WSM specification (that I co-authored and edited many years ago at HP) contains a model of a “conversation” that represents “set of related messages exchanged with other Web services” (a decentralized view of a BPEL instance, one that represents just one service’s view of its participation in the instance). Well, replace “business partner” with “SaaS provider” and you’re in a very similar situation. If my business application calls a mix of internal services, SaaS-type services and possibly some business partner services, managing SLAs and doing impact/root cause analysis works a lot better if you get some management information from these other services. Whether it is offered by the service owner directly, by a proxy/adapter that you put on your end or by a neutral third party in charge of measuring/enforcing SLAs. There are aspects of this that are “regular” SOA management challenges (i.e. that apply whenever you compose services, whether you host them yourself or not) and there are aspects (security, billing, SLA, compliance, selection of partners, negotiation) that are handled differently in the situation where the service is consumed from a third party. But by and large, it remains a problem of management integration in a word of composed, orchestrated and/or distributed applications. Which is where it connects with my day job at Oracle.

Depending on the usage type and the level of industry standardization, switching from one utility computing provider to the other may be relatively painless and easy (modify some registry entries or some policy or even let it happen automatically based on automated policies triggered by a price change for example) or a major task (transferring huge amounts of data, translating virtual machines from one VM format to another, performing in-depth security analysis…). Market realities will impact the IT tools that get developed and the available IT tools will in return shape the market.

Another intriguing opportunity, if you assume a mix of on-premises computing and utility-based computing, is that of selling back your spare capacity on the grid. That too would require plenty of supporting IT management software for provisioning, securing, monitoring and policing (coming soon to an SEC filing: “our business was hurt by weak sales of our flagship Pepsi cola drink, partially offset by revenue from renting computing power from our data center to the Coca cola company to handle their exploding ERP application volume”). I believe my neighbors with solar panels on their roofs are able to run their electric counter backward and sell power to PG&E when they generate more than they use. But I’ll stop here with the electric grid analogy because it is already overused. I haven’t read Carr’s book so the comment may be unfair, but based on extracts he posted and reviews he seems to have a hard time letting go of that analogy. It does a good job of making the initial point but gets tiresome after a while. Having personally experienced the Silicon Valley summer rolling black-outs, I very much hope the economics of utility computing won’t be as warped. For example, I hope that the telcos will only act as technical, not commercial intermediaries. One of the many problems in California is that the consumer don’t buy from the producers but from a distributor (PG&E in the Bay Area) who sells at a fixed price and then has to buy at pretty much any price from the producers and brokers who made a killing manipulating the supply during these summers. Utility computing is another area in which economics and technology are intrinsically and dynamically linked in a way that makes predictions very difficult.

For those not yet bored of this topic (or in search of a more insightful analysis), Redmonk’s Coté has taken a crack at that same question, but unlike me he stays clear of any amateurish attempt at an economic analysis. You may also want to read Ian Foster’s analysis (interweaving pieces of technology, standards, economy, marketing, computer history and even some movie trivia) on how these “clouds” line up with the “grids” that he and others have been working on for a while now. Some will see his post as a welcome reminder that the only thing really new in “cloud” computing is the name and others will say that the other new thing is that it is actually happening in a way that matters to more than a few academics and that Ian is just trying to hitch his jalopy to the express train that’s passing him. For once I am in the “less cynical” camp on this and I think a lot of the “traditional” Grid work is still very relevant. Did I hear “EC2 components for SmartFrog”?

[UPDATED 2008/6/30: For a comparison of “cloud” and “grid”, see here.]

[UPDATED 2008/9/22: More on the Cloud vs. Grid debate: a paper critical of Grid (in the OGF sense of the term) efforts and Ian Foster’s reply (reat the comments too).]

11 Responses to IT management in a world of utility IT

“coming soon to an SEC filing: “our business was hurt by weak sales of our flagship Pepsi cola drink, partially offset by revenue from renting computing power from our data center to the Coca cola company to handle their exploding ERP application volume”).”

The flip side of this might be when an investment analyst starts looking at 10-Ks and finds that company A’s technology expenditures are extremely lower than company Bs. I have been saying this for a while now, when the “Mad Money” guy starts figuring out how clouds can help his investors he will start to yell.

In order to discuss some of the issues surrounding The Cloud concept, I think it is important to place it in historical context, looking at the Cloud’s forerunners and the problems they encountered before being adopted.

One of the current barriers in the way of The Cloud is economics. I argue that,

“Telecom prices have fallen and bandwidth has increased, but more slowly than processing power, leaving the economics worse than in 2003”.

And “I’m sure that advances will appear over the coming years to bring us closer, but at the moment there are too many issues and costs with network traffic and data movements to allow it to happen for all but select processor intensive applications, such as image rendering and finite modelling.”

Thanks for the comment. There are some very interesting context, history and data points in your post. And it’s very true that bandwidth has not increased at the same rate as processing power (the same thing applies to the home and is also shaping/constraining net usage there). But at the same time I don’t think this really limits utility computing to CPU-intensive applications. Unlike what the Globus guys had in mind, the people who are using EC2 today are not necessarily going “on the Grid” because they want to perform tasks that require a scale they can’t meet in their own datacenter. They are mostly doing thing they could do on their own but would rather not bother with (both from a Capex and administration perspective). Nothing in what they do requires the data to be far away from the processing. They’re just moving data and CPU as a pair from their own datacenter to one ran and managed by someone else. At least in scenarios where the end users go directly to the hosted system, there is no additional bandwidth requirement.

BTW, here is a direct link to Paul’s “Cloud computing” blog entry for those who will read this in a few weeks/months/years, when the entry has been pushed out of the front page by other interesting articles.