Several customers have asked me what to do with new ESX hosts that should join an existing VMware cluster but cause vMotion problems once they are added. vMotion simply stops working because of minor CPU differences.

One of these customers had bought the exact same HP DL380 model, with the same product number and revision and the same type of Intel CPUs inside. Unfortunately, the stepping of the old CPUs is 6 and that of the new ones is 10: the HP machines contain Intel Xeon 5450 3 GHz CPUs with stepping 6 (existing) and stepping 10 (new).

I asked them the following questions so I could give them advice on what to do next:

Question: How many new ESX hosts are you adding to the cluster?
Impact: If you are just adding 1 or 2 ESX hosts as extra capacity, it is worth converting the existing VMware cluster to an EVC cluster, because the more ESX hosts (up to 32) there are in a DRS cluster, the better DRS can do its job. If you have more than 2 hosts to add, building a dedicated cluster with the new ESX hosts can be a solution.

Question: Do you need the new features added to the CPUs or do you just need more power in the VMware cluster?
Impact: The latest range of CPUs can be up to 25% faster in total because of the newly added features.

Question: What are the plans for the future of the clusters, and do you expect significant growth?
Impact: If you expect significant growth, it can be useful to build a new ESX cluster with the new functionality, but always weigh the pros and cons carefully.

Question: Can the servers and CPUs in use switch on the VT or AMD-V option, and can the XD or NX bit be enabled in the BIOS? (Intel markets the feature as the XD bit, for eXecute Disable. AMD uses the name Enhanced Virus Protection.)
Impact: If the machine and the CPUs are capable, you can start using a VMware EVC cluster.

After going through the above questions, I recommended that some clients build an EVC cluster in vCenter Server. Most answers I get after suggesting such a move are:

When enabled for a cluster, EVC ensures that all CPUs within the cluster are vMotion compatible. CPUs starting with Intel 45nm Core 2 (Penryn) and AMD Second Generation Opteron (revision E or F) incorporate FlexMigration and Extended Migration technologies, respectively.

The EVC feature allows the virtualization layer to mask or hide certain CPU features by modifying the semantics of the CPUID instruction, so that certain CPUID feature bits are hidden even from nonprivileged code. For example, with support from the hardware, the virtualization layer hides the SSE4.1 feature from any code (privileged or nonprivileged) to make CPUs that differ in this feature compatible for vMotion. For details, see the specifics on CPU compatibility with the Enhanced vMotion Compatibility feature.
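As a rough illustration of the masking idea, the sketch below treats each host's CPUID leaf 1 ECX feature word as a bit field and takes the bitwise AND across hosts as the common baseline. The bit positions are the ones Intel documents for that leaf. The host names and feature words are made up for the example, and this is only a model of the principle, not how ESX implements EVC internally.

```python
# Feature bits in CPUID leaf 1, register ECX (as documented by Intel).
SSE3 = 1 << 0
SSSE3 = 1 << 9
SSE4_1 = 1 << 19
SSE4_2 = 1 << 20
POPCNT = 1 << 23

FEATURE_NAMES = {
    SSE3: "SSE3",
    SSSE3: "SSSE3",
    SSE4_1: "SSE4.1",
    SSE4_2: "SSE4.2",
    POPCNT: "POPCNT",
}


def common_baseline(host_features):
    """Return the feature bits that every host in the mapping supports."""
    if not host_features:
        raise ValueError("need at least one host")
    values = iter(host_features.values())
    baseline = next(values)
    for bits in values:
        baseline &= bits  # keep only the features all hosts share
    return baseline


def hidden_features(host_bits, baseline):
    """Names of the features a host has but the baseline would hide."""
    hidden = host_bits & ~baseline
    return sorted(name for bit, name in FEATURE_NAMES.items() if hidden & bit)


# Hypothetical hosts: the new one adds SSE4.1 on top of the old feature set.
hosts = {
    "esx-old": SSE3 | SSSE3,
    "esx-new": SSE3 | SSSE3 | SSE4_1,
}
baseline = common_baseline(hosts)
print(hidden_features(hosts["esx-new"], baseline))
```

In this model the new host loses SSE4.1 from the guest's point of view, which is the trade-off described above: compatibility for vMotion at the cost of the newest features.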

It is recommended to apply CPU masking at cluster level rather than at virtual machine level. You will have to build an EVC cluster for this, though, which may limit the total power of the VMware cluster.

Boundary conditions for an Enhanced vMotion Compatibility (EVC) cluster are:

CPUs from a single vendor, either AMD or Intel

Running ESX Server 3.5 Update 2 or newer

Connected to vCenter Server

Hardware support for AMD-V or Intel VT, enabled in the BIOS

AMD no execute (NX) or Intel execute disable (XD) technology

Hardware support for live migration: AMD-V Extended Migration or Intel FlexMigration
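The boundary conditions above can be written down as a simple pre-flight check. The sketch below is only a checklist in code: the `Host` record and its field names are hypothetical, and the values would have to be collected by hand or with an inventory tool. It does not query vCenter Server.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical inventory record for one ESX host."""
    name: str
    cpu_vendor: str                  # "Intel" or "AMD"
    esx_version: tuple               # (major, minor, update), e.g. (3, 5, 2)
    connected_to_vcenter: bool
    hw_virtualization_enabled: bool  # Intel VT or AMD-V enabled in the BIOS
    nx_xd_enabled: bool              # AMD NX or Intel XD enabled in the BIOS
    migration_support: bool          # FlexMigration or Extended Migration


MIN_ESX = (3, 5, 2)  # ESX Server 3.5 Update 2


def evc_blockers(hosts):
    """Return a list of reasons why these hosts cannot form an EVC cluster."""
    problems = []
    vendors = {h.cpu_vendor for h in hosts}
    if len(vendors) > 1:
        problems.append("mixed CPU vendors: " + ", ".join(sorted(vendors)))
    for h in hosts:
        if h.esx_version < MIN_ESX:
            problems.append(f"{h.name}: ESX older than 3.5 Update 2")
        if not h.connected_to_vcenter:
            problems.append(f"{h.name}: not connected to vCenter Server")
        if not h.hw_virtualization_enabled:
            problems.append(f"{h.name}: VT/AMD-V not enabled in BIOS")
        if not h.nx_xd_enabled:
            problems.append(f"{h.name}: NX/XD not enabled in BIOS")
        if not h.migration_support:
            problems.append(f"{h.name}: no FlexMigration/Extended Migration")
    return problems
```

An empty list means none of the listed conditions is violated. It does not guarantee that vCenter Server will accept the host.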

After explaining the whole EVC cluster principle, you can probably guess the next question: “How do we build an EVC cluster?”

Steps to take to build an EVC cluster:

Step 1
Build a new VMware cluster with the new host(s) and make sure the NX bit is enabled in the BIOS. Also switch on VT or AMD-V technology there. The ESX host may not have any VMs present or powered on. DRS and HA are fully compatible with a VMware EVC cluster.

Step 2
Change the VMware EVC mode in the cluster settings from disabled to enabled.

Step 3
Shut down the VMs on an ESX host you want to move into the VMware EVC cluster. Use a cold migration to move them to the new EVC cluster, then start the VMs on the EVC cluster. After all the VMs have been moved off the old ESX host, restart the host and edit its BIOS settings so it can join the EVC cluster. Then add the ‘old’ ESX host to the new EVC cluster.

Step 4
Repeat step 3 for every ESX host you want to add to the EVC cluster. See also page 238 of the Basic System Administration guide, Update 2 and later for ESX 3.5, ESXi 3.5 with VirtualCenter 2.5 (vi3_35_25_u2_admin_guide.pdf).
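To keep the order of operations straight when there are many hosts and VMs, the four steps can be turned into a printable checklist. The sketch below only generates text: the cluster name, host names and VM names are invented for the example, and nothing in it talks to vCenter Server.

```python
def evc_migration_plan(new_hosts, old_hosts_with_vms, cluster="EVC-Cluster"):
    """Return the ordered list of manual actions for steps 1 to 4."""
    plan = [
        # Step 1: new cluster containing only the new, empty host(s).
        f"create cluster {cluster} with new host(s): {', '.join(new_hosts)}",
        # Step 2: switch EVC from disabled to enabled.
        f"enable EVC on {cluster}",
    ]
    # Steps 3 and 4: empty each old host, fix its BIOS, then add it.
    for host, vms in old_hosts_with_vms.items():
        for vm in vms:
            plan.append(f"power off {vm} on {host}")
            plan.append(f"cold migrate {vm} to {cluster}")
            plan.append(f"power on {vm} in {cluster}")
        plan.append(f"reboot {host} and enable NX/XD and VT/AMD-V in BIOS")
        plan.append(f"add {host} to {cluster}")
    return plan


for number, action in enumerate(
    evc_migration_plan(["esx-new1"], {"esx-old1": ["vm-a", "vm-b"]}), start=1
):
    print(f"{number}. {action}")
```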

Background info:

The following VMware EVC mode options are available to choose from at the moment, one for AMD Opteron and two for Intel Xeon:

AMD Option 1
Applies the baseline feature set of AMD Opteron™ Generation 1/2 (“Rev. E”/”Rev. F”)
processors to all hosts in the cluster.

Intel Option 2
Applies the baseline feature set of Intel® Xeon® 45nm Core™2 (“Penryn”) processors to all
hosts in the cluster.

Hosts with the following processor types will be permitted to enter the cluster:
Intel® Xeon® 45nm Core™2 (“Penryn”)
Intel® Xeon® Core™ i7 (“Nehalem”)

Additional CPU features exposed include SSE4.1.

Intel Option 3
Applies the baseline feature set of Intel® Xeon® Core™ i7 (“Nehalem”) processors to all
hosts in the cluster.

Hosts with the following processor types will be permitted to enter the cluster:
Intel® Xeon® Core™ i7 (“Nehalem”)

Additional CPU features exposed include SSE4.2 and POPCOUNT.
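The admission rules for the two Intel modes above can be captured in a small lookup table. The sketch below uses only the processor types listed in this post, so the AMD mode is left out because its admitted processor list is not given here. The short labels "Penryn" and "Nehalem" are shorthand chosen for the example.

```python
# EVC mode -> processor generations allowed to enter a cluster in that mode,
# taken from the Intel options listed above.
EVC_MODES = {
    "Intel Xeon 45nm Core 2 (Penryn)": {"Penryn", "Nehalem"},
    "Intel Xeon Core i7 (Nehalem)": {"Nehalem"},
}


def admissible_modes(cluster_cpus):
    """Return the EVC modes that admit every CPU generation in the cluster."""
    cpus = set(cluster_cpus)
    return [mode for mode, allowed in EVC_MODES.items() if cpus <= allowed]


# A mixed Penryn/Nehalem cluster can only run at the Penryn baseline.
print(admissible_modes(["Penryn", "Nehalem"]))
```

A cluster that mixes both generations is limited to the Penryn baseline, which means SSE4.2 and POPCOUNT stay hidden on the Nehalem hosts.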

Interested in knowing if all your physical ESX servers are the same? VMware CPU Host Info will help you find out. The application gathers the important system information from your hosts and puts this in one single overview. This tool is written by Richard Garsthagen.

Edwin Weijdema

Edwin Weijdema is a Solutions Architect at Veeam for the Benelux & Nordics region and has over 20 years of experience designing, implementing, and managing data center technologies for large companies. His areas of expertise include virtualization, networking, and storage solutions. He knows what it takes to add business value to partners and customers. He is a veteran vExpert, Cisco Champion 2015 and holds several other certifications.

6 Comments

Hi Vip,
I’m looking to implement EVC for one of our customers. This is because two of our hosts have SSE4-capable CPUs, while the rest are only SSE3-capable. I also read the article at http://hyperinfo.wordpress.com/2008/08/27/vmware-esx-and-enhanced-vmotion-compatibility/, which makes me concerned about the fact that EVC only “masks” the (in my case) SSE4 feature. As I understand it, it’s still possible for applications to use features that are masked by EVC, and this could theoretically result in unexpected behavior by applications when a VM is vMotioned to a CPU that does not have (for example) SSE4.

What’s your take on this? Any real-world experience (good or bad) with EVC in these kinds of situations?

As for real-world experience: three of the customers I advised on this matter have switched to an EVC cluster so they can keep as many ESX servers as possible in one cluster. I would recommend testing the application on an EVC cluster if you aren’t totally sure the application is written cleanly. It all depends on whether the application developer correctly used CPUID to find out which instructions to use in the application. If the application is already running on the VMware cluster and you are adding two new hosts to it, I would say go for it. If it’s a new application, test it by creating an EVC cluster using 1 ‘old’ ESX server and 1 ‘new’ ESX server.
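To show what "written cleanly" means here, the sketch below picks a code path from the CPU feature flags that the operating system reports instead of assuming a feature is present. It parses text in the format of Linux's /proc/cpuinfo, where SSE4.1 appears as the flag sse4_1. The sample text and the function names are made up for the example.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pick_code_path(flags):
    """Use the SSE4.1 path only when the CPU actually advertises it."""
    if "sse4_1" in flags:
        return "sse4.1"
    return "generic"


# Hypothetical guest without SSE4.1 in its advertised flags.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 pni ssse3\n"
print(pick_code_path(parse_cpu_flags(sample)))
```

An application that checks the advertised features like this falls back to the generic path when SSE4.1 is hidden. One that uses the instructions without checking is the kind that needs testing first.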

Thanks for your input. I implemented EVC a week ago, and have not encountered any problems with activating it or with the applications running on VMs that are being vMotioned back and forth between what used to be incompatible CPUs.