05 May 2011

VMware Cloud's Stormy Start and Superspecialization in SM

It's still in beta, so cut them some slack, but two failures at VMware's new Cloud Foundry infrastructure may give some prospective customers pause. Cloud Foundry has been available only since 12 April 2011, and the two failures on 25 and 26 April may be related, but they are noteworthy nonetheless. One reason is that prospective customers are keen to learn how VMware coordinates its service interruptions with partners and customers.

As reported by Network World, the first outage was caused by a PSU failure in a storage cabinet. This is the sort of failure to be expected at any facility of this kind. The outage lasted 10 hours. The next day, staff were working out remediation plans to address future failures when, as VMware put it, "Unfortunately, at 10:15am PDT, one of the operations engineers developing the playbook touched the keyboard. This resulted in a full outage of the network infrastructure sitting in front of Cloud Foundry. This took out all load balancers, routers, and firewalls; caused a partial outage of portions of our internal DNS infrastructure; and resulted in a complete external loss of connectivity to Cloud Foundry" (via Network World).

It's possible that in an era of superspecialization yet another specialization has been added to the mix: service availability management for mega-datacenters. It's not likely to be in the job description for my next job or yours, but it will be part of someone's highly compensated sweat.

Let's hope there's still room in mega-datacenter budgets for some decent CRM. Based on recent history, and the steady, though reluctant, migration to cloud services, they will need it.