Disaster recovery has become table stakes in the world of server virtualization. Any good virtualization platform these days will find a way to restart a virtual machine in the event of a hardware failure. But which vendor excels more than any other at getting critical applications back online after failures, and making sure the most important virtual machines are given priority in the restart process?

Debate has broken out over this topic since the Burton Group research and analysis firm declared that Microsoft's Hyper-V is not enterprise-ready because it lacks a specific feature found in both the VMware and Citrix hypervisors. But Microsoft contends that Hyper-V does meet the core features customers are looking for, and even the Burton Group concedes that Microsoft has surpassed its rivals in certain types of disaster-recovery scenarios.

The feature in question is restart priority. According to the Burton Group, enterprise-class virtualization products must let IT administrators assign a restart priority to virtual machines, ensuring that the most critical workloads restart before any others in the event of a physical server outage.

Microsoft insists that its virtualization management tools allow this type of prioritization, if perhaps in a roundabout way, but the Burton Group has refused to give Hyper-V a final seal of approval, saying only VMware and Citrix allow this functionality.

The VM restart priority setting in VMware's High Availability software lets IT assign VMs a priority of low, medium or high, with the high-priority VMs starting first. But this is not a perfect tool, as administrators cannot set a restart order within the "high priority" bucket.
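
To make the bucket behavior concrete, here is a minimal Python sketch of a three-level restart priority as described above. It is an illustration only, not VMware's implementation or API; the function name `restart_waves` and the VM names are hypothetical.

```python
from collections import defaultdict

# Priority levels named in the article, highest first.
PRIORITY_ORDER = ["high", "medium", "low"]


def restart_waves(vms):
    """Group (vm_name, priority) pairs into restart waves, highest priority first.

    Every VM in one wave is restarted before any VM in the next wave.
    Inside a wave no order is guaranteed; names are sorted here only so
    that the output is stable and easy to read.
    """
    buckets = defaultdict(list)
    for name, priority in vms:
        if priority not in PRIORITY_ORDER:
            raise ValueError(f"unknown priority {priority!r} for VM {name!r}")
        buckets[priority].append(name)
    return [(p, sorted(buckets[p])) for p in PRIORITY_ORDER if p in buckets]


if __name__ == "__main__":
    # Hypothetical inventory for illustration.
    inventory = [("web", "medium"), ("sql", "high"), ("test", "low"), ("mail", "high")]
    for priority, names in restart_waves(inventory):
        print(priority, names)
```

In this sketch "sql" and "mail" share the high bucket, so nothing says which of the two comes up first, which is the granularity gap described above.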

Citrix's XenServer provides a greater level of control and is therefore the best platform for this type of disaster recovery scenario, according to Burton Group analyst Chris Wolf.

"The idea behind priority is to ensure that mission critical workloads come up first," he says of VMware's system. "Only those types of systems should be given high priority. Even if I had 10 VMs with high priority set, those 10 would all come up before any VMs set to medium or low priority. That's the point. Customers would like greater granularity with VMware's priority metric (XenServer's is better) and we've called that out in our vSphere assessment. Still VMware's behavior meets the minimum expectation of our criteria, while the XenServer implementation is the most ideal."

VMware counters that its Site Recovery Manager software does provide "strict ordering of VM restart," while conceding that its High Availability software does not.

In any case, Wolf says his team at the Burton Group has discussed the restart priority issue with Microsoft, and that Microsoft officials "understand the use case and they understand why it's important."

Microsoft tells a somewhat different story. "I've gone back and forth with Burton Group about this specific feature," says Edwin Yuen, a virtualization director at Microsoft. "Certainly we have alternatives, or ways around it."

Hyper-V lets IT delay the restart of certain virtual machines by a set time: 15 seconds, 30 seconds, or whatever amount of time is chosen. Delaying the restart of lower-priority virtual machines effectively allows the highest-priority ones to start up first, Yuen says.
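
The delay approach can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Hyper-V's actual configuration interface; the function name `startup_schedule` and the VM names and delay values are hypothetical.

```python
def startup_schedule(delays_seconds):
    """Return (start_offset_seconds, vm_name) pairs in the order VMs begin starting.

    delays_seconds maps a VM name to its configured startup delay.
    A VM with a shorter delay begins starting earlier; ties are broken
    by name only to keep the output deterministic.
    """
    for name, delay in delays_seconds.items():
        if delay < 0:
            raise ValueError(f"negative delay for VM {name!r}")
    return sorted((delay, name) for name, delay in delays_seconds.items())


if __name__ == "__main__":
    # Hypothetical delays: the critical database gets no delay at all.
    for offset, name in startup_schedule({"sql": 0, "web": 30, "test": 60}):
        print(f"t+{offset:>3}s start {name}")
```

The schedule is purely time-based: a delayed VM starts when its timer expires, whether or not the VMs ahead of it have finished booting.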

Customers can go even further in Microsoft's System Center Virtual Machine Manager, which lets IT write scripts defining which VMs restart first in the case of failure. Customers can also set rules preventing restart of certain virtual machines while back-end services are restarted. For example, if a Web application running in one virtual machine requires a SQL database that runs in another VM, Microsoft admins can require that the database start up before the application.
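
The database-before-application rule amounts to a dependency ordering. The Python sketch below shows the idea with the standard library's `graphlib` module (Python 3.9 or later). It is not a System Center Virtual Machine Manager script; the function name `restart_order` and the VM names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def restart_order(depends_on):
    """Return a restart order in which every VM follows the VMs it depends on.

    depends_on maps a VM name to the set of VMs that must be started first.
    Raises ValueError if the dependencies form a loop.
    """
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as exc:
        raise ValueError(f"circular dependency between VMs: {exc.args[1]}") from exc


if __name__ == "__main__":
    # Hypothetical example from the text: a Web application VM needs a SQL database VM.
    dependencies = {"web-app": {"sql-db"}, "sql-db": set()}
    print(restart_order(dependencies))
```

A real recovery script would also wait for each back-end service to report ready before starting its dependents; this sketch only computes the order.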

Wolf agrees: "Where Microsoft has a decided advantage is application-aware high availability. That's something we highlight as a real strength in the Microsoft solution that neither Citrix nor VMware can offer." VMware treats the virtual machine as a black box, so if an application inside a VM stalls, the company's high availability product won't detect the problem unless there is a complete failure of the operating system, according to Wolf.

As Yuen puts it, Microsoft "can look at the VMs, the operating systems and the services. We literally can tell 'is the SQL database up and running? Has the mailbox service started?' We can do a level of detection that VMware can't do."
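
The difference between a black-box check and an application-aware check can be sketched in Python as follows. This is a conceptual model only, not either vendor's product logic; the function names, the dictionary layout and the service names are hypothetical.

```python
def black_box_healthy(vm):
    """Black-box view: the VM counts as healthy as long as its operating system is up."""
    return vm["os_running"]


def app_aware_healthy(vm, required_services):
    """Application-aware view: the OS must be up and every required service running."""
    if not vm["os_running"]:
        return False
    running = vm.get("services_running", set())
    return all(service in running for service in required_services)


if __name__ == "__main__":
    # Hypothetical VM whose OS is fine but whose SQL service has stalled.
    stalled = {"os_running": True, "services_running": {"mailbox"}}
    print(black_box_healthy(stalled))                    # the stall goes unnoticed
    print(app_aware_healthy(stalled, ["sql", "mailbox"]))  # the stall is detected
```

In this model the stalled database is invisible to the first check and caught by the second, which is the distinction Wolf and Yuen draw.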

This, combined with the other features Yuen described, should meet customers' requirements as much as, if not more than, the specific restart priority feature that the Burton Group considers crucial, Yuen says.

"I don't believe that the feature requirement for restart priority fulfills what the customer wants to do anyway. That's my opinion," Yuen says.

A few Network World readers expressed dissatisfaction with both the Microsoft and VMware approaches in comments posted to a recent article titled "It's time to virtualize Microsoft Exchange (but not with Hyper-V)."

"Setting a 'high, medium, low' priority is just as weak and unmanageable as setting a startup delay. Neither gives any sort of guarantee that the service that you are relying on is actually available," one reader commented. "Both ways are fragile and prone to failure and need to improve."

Wolf notes that many customers have deployed Hyper-V despite the restart priority issue, and despite other areas in which VMware excels. For example, VMware allows VMs to run in lockstep on two physical hosts at the same time, providing a better level of fault tolerance. Citrix achieves this through a partnership with Marathon Technologies, while Microsoft doesn't have the feature yet but should in the near future, he says.

Large enterprises that have virtualized mission-critical apps and have high service-level expectations may feel this lockstep feature is important, but "to be honest that level of availability is not something that's that important to most enterprise organizations today," Wolf says.

Customers may also be willing to accept a slightly lower level of availability in exchange for the better pricing offered by Hyper-V. Since Hyper-V is bound to improve over time, customers may also prefer to start their virtualization journey with Hyper-V rather than VMware to avoid the high exit cost of a future switch away from VMware.

But for the time being, Wolf says VMware is clearly in the lead in providing disaster recovery and high availability.

"It's definitely VMware because they have the richest integration with the storage vendors," Wolf says. "Their Site Recovery Manager product today is very mature. Their live migration is more powerful than the competition," allowing concurrent migration of up to eight virtual machines.

Microsoft's market share has been growing more rapidly than VMware's, but the two most widely deployed hypervisors are VMware ESX and VMware Server, with Hyper-V in third place, according to IDC.

Microsoft argues that VMware's extra features do not justify its higher cost, but Wolf says, "It's going to take a long time for VMware to be unseated as the dominant player, in my opinion. But Microsoft has done this before. … VMware is going to have to be very good in terms of execution."