June 11, 2009

Why you may want to leave FT alone for now…

Today, FT has a LOT of caveats. Aside from the obvious one everyone’s talking about (the limitation to 1 CPU VMs), there are others that I think are much worse (especially if you have newer CPUs) because they don’t just affect one VM, but the whole host. Here are a few examples:

Power Management must be turned off in the BIOS on the ESX Host. This is a big bummer, considering ESX4 just started supporting low-power state CPU features like SpeedStep.

Hyper-Threading must be turned off on the host. If you’ve got a new Xeon 5500 processor, this is a bummer as well.

Turning on FT disables EPT/RVI for the ENTIRE host. This means one of the biggest performance enhancements (vMMU support) is gone for your whole host when you turn on FT for a single VM! UPDATE: This only disables EPT/RVI for the single FT-protected VM, not for all VMs on the host.

And as for VM-specific limitations (meaning these at least only affect the FT VM):

No Virtual SMP

No Snapshots (meaning no Storage VMotion, no VCB)

No Hot-Add hardware

No NPIV

No DRS

No Thin Provisioning
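
The VM-specific list above lends itself to a quick pre-flight check before you enable FT on a VM. Here is a minimal sketch of that idea in Python. The `VMConfig` class, its field names, and the `ft_conflicts` function are all made up for illustration; none of this is a VMware API, and you would have to populate the fields from your own inventory data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VMConfig:
    """Hypothetical description of a VM (illustrative fields, not a VMware API)."""
    name: str
    num_vcpus: int = 1
    has_snapshots: bool = False
    hot_add_enabled: bool = False
    uses_npiv: bool = False
    drs_managed: bool = False
    thin_provisioned: bool = False


def ft_conflicts(vm: VMConfig) -> List[str]:
    """Return the limitations from the list above that this VM would run into."""
    checks = [
        (vm.num_vcpus > 1, "No Virtual SMP"),
        (vm.has_snapshots, "No Snapshots"),
        (vm.hot_add_enabled, "No Hot-Add hardware"),
        (vm.uses_npiv, "No NPIV"),
        (vm.drs_managed, "No DRS"),
        (vm.thin_provisioned, "No Thin Provisioning"),
    ]
    # Keep only the limitations whose condition is true for this VM.
    return [label for hit, label in checks if hit]


if __name__ == "__main__":
    # Example VM with two vCPUs and a thin-provisioned disk (made-up values).
    vm = VMConfig(name="db01", num_vcpus=2, thin_provisioned=True)
    for conflict in ft_conflicts(vm):
        print(f"{vm.name}: {conflict}")
```

An empty result only means the VM avoids the limitations listed in this post; it says nothing about the host-level requirements above.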

I think FT is a really cool piece of engineering, but today it’s pretty obvious that it’s a version 1.0 (or worse, version 0.9) product. It works, but there are more gotchas than in any other VMware feature I’ve ever seen.

7 Comments

Actually, it rivals DirectPath I/O, which has a list about as long. Seriously though, when vMotion was released, did every customer run out and do it to all of their production VMs the first day they installed it? Probably not (unless they have no fear). Customers will kick the tires on it for some time, and in the meantime the limitations should diminish.

This is incorrect. Turning on FT only disables the use of NPT/EPT for that virtual machine (since the use of NPT/EPT results in non-determinism that is not trackable). Other virtual machines on the host can use NPT/EPT.

Krishna:
You are correct. In previous versions of the vSphere migration checklist, the following item was listed under the Fault Tolerance section:
“Ensure that there is no user requirement to use NPT/EPT (Nested Page Tables/Extended Page Tables) since VMware FT disables NPT/EPT on the ESX host”
However, they have since updated it (on June 8th) and it now reads:
“Ensure that there is no user requirement to use NPT/EPT (Nested Page Tables/Extended Page Tables) on VMware FT protected VMs, since VMware FT disables NPT/EPT for the VMs”
Thanks for the correction! I’m glad I was wrong.

Eric,
Again, in the version of the migration checklist from before June 8th, this was listed as a “required” option. It is now listed as “Optional” due to “performance implications”.
So it looks like this shows two things. One: FT may not be as bad as I first thought. Two: VMware really needs to get their documentation straight!