Hypervisor Build – Best Practices

What follows is a list of things that you may want to consider when building your Hyper-V servers on HP DL hardware.

Always update the BIOS, ILO firmware, storage and network controller firmware, even on a new (recently purchased) server. As of this writing some Gen8s that are already in the shipping pipeline have older versions of all of these components. Do not assume that HP will ship a fully updated server.

Boot into the BIOS and check CPU settings

Check that hardware virtualization and VT-d options are enabled

Check that the DEP (Data Execution Prevention) option is enabled

Disable hyperthreading

Boot into the BIOS and check Power settings

Set Performance Mode to static high performance

Set Power Regulator to maximum performance

Save changes and power down/unplug the server from power before continuing
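Once the server is back up, the BIOS-level virtualization settings can be sanity-checked from the OS. A minimal sketch, assuming a Windows build whose `systeminfo` reports the "Hyper-V Requirements" section (older builds can use the Sysinternals Coreinfo tool instead):

```shell
:: Run from an elevated command prompt. If the Hyper-V role is not yet
:: installed, systeminfo prints a "Hyper-V Requirements" section showing
:: whether virtualization and DEP are visible to the OS.
systeminfo | findstr /C:"Hyper-V Requirements" /C:"Virtualization Enabled In Firmware" /C:"Data Execution Prevention Available"
```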

Following these steps will ensure that the Hyper-V role installs without issues on the first try and that your server operates at peak performance and stability. I have seen cases where voltage regulation was crashing servers – HP eventually acknowledged the problem, but setting server power to max/static avoids it altogether.

NIC configuration

NIC teaming is a matter of preference. With 10 Gbps interfaces, throughput may not be an issue anymore, and teaming will only add complexity. If you are building a failover cluster, a non-teamed NIC failure will cause a failover and maybe a minute of downtime (depending on the amount of RAM used by the affected VMs). On the other hand, in 15 years of working with hardware on and off, I am hard pressed to think of a single case where a NIC failed in flight. However, I can think of more than one instance of a configuration change on the server or the switch causing an interface flap, a link disconnect, or an EtherChannel meltdown – the chances of this happening with a NIC team are arguably higher. In addition, with more and more applications now being resilient by design (Exchange DAGs, SQL database mirroring, AD multi-master replication, etc.), a case can be made that 100% uptime isn't needed (and is unattainable anyway) at any one infrastructure point.

If your NIC driver allows it, always set speed and duplex manually to 10 or 1 Gbps (as applicable) and full duplex, and match the setting on the switch ports. Autodetection is still broken.

Enable VMQ features on the ports that will be used as virtual switches in Hyper-V

Enable VMQ lookahead split on the ports that have VMQ features enabled

Use jumbo frames (MTU 9000) on iSCSI ports

Disable power saving feature on all NICs on the server
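For the jumbo-frame item above, the MTU can be set persistently from the command line. A sketch assuming an iSCSI interface already renamed to "SAN1" (the name is hypothetical):

```shell
:: Set a 9000-byte MTU on the iSCSI interface and persist it across reboots.
netsh interface ipv4 set subinterface "SAN1" mtu=9000 store=persistent

:: Confirm the new MTU took effect.
netsh interface ipv4 show subinterfaces
```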

If you are using iSCSI storage, it is extremely important to follow the vendor's best practice configuration for the switch you are using. For example, you will typically have to set the switch ports to PortFast (so STP skips its listening and learning states), drop them into a dedicated storage VLAN that has no routing outside of the storage subnet, enable flow control at the port level, and configure jumbo frames to match the NIC setting (9000 bytes).

Ensure proper binding order of the primary NIC (the one with the gateway setting). Move it to the top of the NIC order in Advanced Settings.

Leave IPv6 bound to all NICs even if you are not using it. Optionally reorder the bindings on each NIC so that IPv4 is bound above IPv6.

On all iSCSI interfaces, leave the default gateway blank, leave DNS settings blank, disable Automatic Metric and set the metric manually to something like 512, disable NetBIOS over TCP/IP and LMHOSTS lookups, and disable DNS registration of the iSCSI adapter.

On all other NIC interfaces except the main management NIC (the one you left at the top of the bindings list), disable Automatic Metric, set the metric manually to something like 256, and disable DNS registration. DNS registration should be enabled only on the primary (management) NIC.

To avoid confusion, NICs that are used to create Hyper-V virtual switches should not be shared with the host operating system – clear the option that allows the management OS to use the virtual switch adapter.

Rename your NICs to reflect their physical location in the chassis, and potentially their static IP address. Example: LAN1 – <ipaddress>, SAN1 – <ipaddress>, etc. This will make your lights-out datacenter management experience much less painful.

Try to automate your NIC management using the NETSH CLI. After a difficult ramp-up you will love this tool, and it may save you time later by ensuring that your servers are configured consistently (even if the settings are incorrect, they will at least be consistent…).
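As a starting point, several of the settings above can be scripted with netsh. A sketch with hypothetical interface names and addresses:

```shell
:: Rename NICs to reflect role/location (original names are examples).
netsh interface set interface name="Local Area Connection" newname="LAN1"
netsh interface set interface name="Local Area Connection 2" newname="SAN1"

:: Static address on the iSCSI NIC -- note there is no default gateway.
netsh interface ipv4 set address name="SAN1" source=static address=10.10.10.11 mask=255.255.255.0

:: Replace Automatic Metric with a fixed value.
netsh interface ipv4 set interface "SAN1" metric=512

:: Dump the resulting configuration so it can be diffed between hosts.
netsh interface ipv4 show config
```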

On a Hyper-V failover cluster with iSCSI CSVs you will need more than 4 NICs, so on something like a DL360 G7 or Gen8 you will need to install an additional 4-port PCIe card (4 ports may already be on the server's chassis, depending on which network bundle you order). I prefer low-profile PCIe cards with full-height/half-height bracket options. I use 4 ports for iSCSI MPIO (no teaming!), 1 port for server management, and 1 port for cluster heartbeat and occasional live migrations. The remaining 2 ports go to the Hyper-V virtual switch and can be teamed. This configuration is based on 1 Gbps NICs and is open to some debate.

On iSCSI NIC ports (only on iSCSI ports), make sure to disable File and Printer Sharing for Microsoft Networks as well as Client for Microsoft Networks.

Storage configuration

This section will apply only to local storage. iSCSI/SAN storage configuration for Hyper-V clustering is subject to its own best practices write-up.

Do not use RAID5 on a Hyper-V server.

I know, it is tempting, but do not use RAID5 on a Hyper-V server. Its write overhead will be a tremendous setback on any production server.

Try to use 15K RPM disks only, resorting to 10K only if you must attain greater capacity.

I prefer to run OS/hypervisor on a pair of RAID1 disks, using a separate array/volume. This array can use a pair of 146 GB SAS 15K disks.

Fill the remaining bays with SAS or SATA disks at 10K RPM or higher and use either RAID10 (preferred) or RAID6. If you have the luxury of 8 or more drives in a single server, you may have other potentially feasible options like RAID50.

An alternative is to create one big RAID10 array and carve out something like a 100 GB partition for the OS. Be mindful of the RAM amount, as your pagefile.sys and/or memory dump may grow.

Remember that configuring a disk system for performance is not the same as configuring it for capacity. Performance is mostly about the number of spindles and the cumulative IOPS those spindles can handle, given RAID overhead and the IO profile. A higher number of smaller drives with higher RPM and lower seek times will beat fewer higher-capacity drives any day, even if the cumulative capacity is about the same.
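To make the RAID-overhead point concrete, here is a back-of-the-envelope write-IOPS comparison. The ~175 IOPS per 15K spindle and the classic write penalties (2 for RAID10, 4 for RAID5) are assumed ballpark figures, not measurements:

```shell
#!/bin/sh
# Rough write IOPS for an 8-disk array of 15K SAS drives.
disks=8
iops_per_disk=175                  # assumed ballpark for one 15K spindle
raw=$((disks * iops_per_disk))     # raw IOPS across all spindles
raid10=$((raw / 2))                # RAID10 write penalty = 2
raid5=$((raw / 4))                 # RAID5 write penalty = 4
echo "Raw:    $raw IOPS"
echo "RAID10: $raid10 write IOPS"
echo "RAID5:  $raid5 write IOPS"
```

With the same spindles, RAID5 delivers half the write IOPS of RAID10 – that halving is the write overhead behind the RAID5 warning above.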

Very important – read up on, and learn to love, partition alignment. Partitions properly aligned to the array's stripe boundaries can cut IO by something like 15% (with a corresponding performance gain), depending on the stripe and cluster sizes. The difference may be even more significant on some RAID architectures, especially RAID5.
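A quick way to reason about alignment: a partition is aligned when its starting offset is an even multiple of the array's stripe size. A sketch assuming a 64 KB stripe (on Windows, the actual offset can be read with `wmic partition get Name, StartingOffset`):

```shell
#!/bin/sh
# 32,256 bytes is the classic 63-sector offset of older Windows versions;
# 1,048,576 bytes (1 MB) is the modern default.
stripe=65536                         # 64 KB stripe size (assumed)
for offset in 32256 1048576; do
  if [ $((offset % stripe)) -eq 0 ]; then
    echo "$offset: aligned"
  else
    echo "$offset: misaligned - IOs can straddle two stripes"
  fi
done
```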

Drive fragmentation may be an issue on the local disk – it is not an issue in an iSCSI SAN environment. Keep this in mind if you find your hypervisor slowing down.

SSD storage is in, controllers are in their fourth generation, and drives are becoming cheaper and somewhat bigger. Invest in a pair of SSDs for the Hyper-V boot volume.

Where SSD drives just don’t have the required capacity, look at SSD hybrid drives. These are spindle-based disks with an SSD cache that is much larger than traditional disk buffer. In some use cases these drives perform just as well as pure SSD disks and you may find them suitable as the main storage for your VMs.

Storage controller hardware acceleration can give a spindle-based storage system a boost big enough to rival an equivalent-grade SSD system in some use cases. The controller can store write IOs in its onboard cache (RAM) an order of magnitude faster than the disks can accept them, acknowledge the IO back to the system, and then take its time writing the data out to the much slower spindle-based disks. This write-to-RAM path is essentially what happens inside an SSD (and writing to SSD is likely slower than writing to controller RAM). So if you are not planning to invest in a SAN environment or SSDs, invest in a production-grade storage controller with lots of onboard cache. Don't forget to enable write caching and set the read/write cache ratio according to the IO profile of your virtualization environment.

Remember that putting many SAS or SATA disks on the same channel can create a bus bottleneck, so make sure the cumulative throughput of your drives does not exceed the speed of the channel they are connected to. Example: the theoretical ceiling of SATA-3 (3 Gb/s) is 300 MB/sec, and one 10K RPM SATA disk can burst to something like 150 MB/sec, so putting more than 2 such drives on the same SATA-3 channel may limit drive performance.
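The same arithmetic, using the numbers above, shows where the channel saturates:

```shell
#!/bin/sh
# SATA-3 (3 Gb/s) ceiling ~300 MB/s; one 10K SATA drive bursts to ~150 MB/s.
channel_mbps=300
drive_mbps=150
echo "Drives per channel before the bus saturates: $((channel_mbps / drive_mbps))"
```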

Keep in mind that putting SATA-6 (6 Gb/s) drives on a SATA-3 controller is pointless, as the more expensive drives will downgrade to match the speed of the bus.

If you have money to spare, look at SSD PCIe cards. Think of them as a SAN on steroids on a PCB. A $3000 1 TB SSD PCIe 2.0 card from OCZ can provide the kind of IOPS muscle that iSCSI SANs (and, broadly speaking, any spindle-based storage system) can only dream about – several hundred thousand IOPS. Admittedly, SSD PCIe does not fit all use cases, but I'll mention it here for completeness.

Never use dynamically expanding disks in Hyper-V production environments.

Never use differencing disks in Hyper-V production environments.

Never use Hyper-V snapshots in Hyper-V production environments.

The best performance can be attained only with fixed-size VHDX disks that have no Hyper-V snapshots and that are stored on aligned partitions.

Attach VHDX disks as SCSI disks to your VMs unless the drive is a boot volume. SCSI adapters in Hyper-V offer slightly better performance than IDE ones, but you still can't boot from a SCSI disk.

Pass-through disks work best in high-IO or large-disk scenarios (roughly around the 200-300 GB mark and up). If you virtualize SQL and Exchange servers, for example, especially in an iSCSI environment, consider using pass-through disks. Some people argue that they are less agile, since they tie you to a storage platform and make an otherwise mobile server harder to move. At the end of the day, any volume can be migrated from any storage to any other storage, even if not easily, and pass-through disks offer better performance through lower virtualization overhead.

That sums it up for now. Most of these points come from experience, some from research, some from the school of hard knocks. If you think that I missed something, please feel free to share your thoughts below.

Comments

One piece of information regarding VMQ and teaming – I've learned that unless all team members support VMQ, and unless you do the Hyper-V team assignment from within Windows (if you are teaming), you will lose the VMQ capability. Since not all of my NICs support it, in my experience I chose not to team on the host, but rather to team within the guest when needing redundancy or performance. That way I am guaranteed to use the NICs I set aside for VMQ, and guaranteed to actually have VMQ.

Hi Sergey,
Your simple question made me think for a bit. I'll answer it this way: in the past, hyperthreading sucked. The present version is much better, and it will probably do no harm to leave it on. However, can one state with certainty that two heavy VM workloads won't end up on two different hyperthreads of the same underlying core? The bottom line is that a hyperthread is not a core, and having more virtual processors in your OS does not mean that your physical cores can compute any faster. When I run serious production workloads, I need guarantees that the scheduler places VMs on different cores (up until the point where the number of VMs exceeds the number of cores). There is a somewhat scientific post here which I found to be very good – take a look. Ignore the post's SAP theme (I work with SAP, so it was relevant for me) and read on to where the conversation shifts to VM and virtualization workloads.
Hope this helps,
Dennis