In previous articles I described a number of steps to optimise the performance of ESXi host system resources; in this article we will look at how to optimise virtual machines with the available resources.

Through its memory management techniques, ESXi is capable of reclaiming excess memory assigned to virtual machines when host system memory is exhausted. However, it is important that you configure the virtual machine's memory to satisfy its workload and do not over-allocate memory resources, as assigning excess memory can lead to a number of issues, for example:

An increase in the amount of overhead memory required to power on the virtual machine.
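To make the reclamation behaviour mentioned above concrete, here is a simplified sketch of the order in which ESXi applies its memory reclamation techniques as host free memory falls through its states (high, soft, hard, low). The state names and technique order follow the vSphere documentation; the percentage thresholds below are illustrative examples, not real host values.

```python
# Simplified model of ESXi memory reclamation by host free-memory state.
# Threshold percentages are illustrative assumptions, not actual minFree values.

RECLAMATION_BY_STATE = {
    "high": ["transparent page sharing"],                     # always active
    "soft": ["transparent page sharing", "ballooning"],
    "hard": ["transparent page sharing", "ballooning",
             "memory compression", "hypervisor swapping"],
    "low":  ["transparent page sharing", "ballooning",
             "memory compression", "hypervisor swapping"],    # VMs may also block
}

def memory_state(free_pct: float) -> str:
    """Map the percentage of free host memory to a simplified memory state."""
    if free_pct >= 6.0:
        return "high"
    if free_pct >= 4.0:
        return "soft"
    if free_pct >= 2.0:
        return "hard"
    return "low"

def active_reclamation(free_pct: float) -> list:
    """Return the reclamation techniques active at a given free-memory level."""
    return RECLAMATION_BY_STATE[memory_state(free_pct)]
```

The point of the sketch: the more aggressive (and more expensive) techniques such as compression and hypervisor swapping only engage once free memory is scarce, which is exactly the situation over-allocated virtual machines help create.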

By default, ESXi is configured to support the use of large pages. However, the guest operating system may in some instances require additional configuration in order to use large memory pages. For example, on a Windows Server 2012 server with Microsoft SQL Server installed we would be required to grant the 'Lock pages in memory' privilege to the user account running the Microsoft SQL Server service, so that the application will execute with the use of large memory pages.
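As a quick way to verify the privilege described above, the 'Lock pages in memory' user right corresponds to the Windows privilege SeLockMemoryPrivilege, which can be inspected with `whoami /priv` when run as the SQL Server service account. A minimal sketch of parsing that output (the sample text below is illustrative, not captured from a real server):

```python
# Sketch: check whether 'Lock pages in memory' (SeLockMemoryPrivilege) is
# held and enabled, based on the text output of `whoami /priv`.

def has_lock_pages_privilege(whoami_priv_output: str) -> bool:
    """Return True if SeLockMemoryPrivilege appears in the output and is Enabled."""
    for line in whoami_priv_output.splitlines():
        if "SeLockMemoryPrivilege" in line:
            return "Enabled" in line
    return False

# Illustrative sample of `whoami /priv` output for a service account.
sample = """\
PRIVILEGES INFORMATION
----------------------
Privilege Name          Description              State
======================= ======================== ========
SeLockMemoryPrivilege   Lock pages in memory     Enabled
SeChangeNotifyPrivilege Bypass traverse checking Enabled
"""
```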

Network Configuration

It is recommended that you use the VMXNET3 virtual network adapter for all supported guest operating systems that have VMware Tools installed. This virtual machine network adapter is optimised to provide higher throughput, lower latency and less overhead than the other virtual machine network adapter options. The driver required for the VMXNET3 virtual adapter is not provided by the guest operating system and therefore requires VMware Tools to be installed to supply the driver.

VMXNET3, the newest generation of virtual network adapter from VMware, offers performance on par with or better than its previous generations in both Windows and Linux guests. Both the driver and the device have been highly tuned to perform better on modern systems. Furthermore, VMXNET3 introduces new features and enhancements, such as TSO6 and RSS. TSO6 makes it especially useful for users deploying applications that deal with IPv6 traffic, while RSS is helpful for deployments requiring high scalability. All these features give VMXNET3 advantages that are not possible with previous generations of virtual network adapters. Moving forward, to keep pace with an ever-increasing demand for network bandwidth, we recommend customers migrate to VMXNET3 if performance is of top concern to their deployments.
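From inside a Linux guest, you can confirm that an interface is actually backed by the vmxnet3 driver supplied by VMware Tools (or open-vm-tools) by reading the standard sysfs driver symlink. A minimal sketch, with the interface name `eth0` as an assumption:

```python
# Sketch: report the kernel driver bound to a network interface via sysfs.
# On a Linux guest using the VMXNET3 adapter, the driver name is "vmxnet3".

import os
from typing import Optional

def interface_driver(iface: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the kernel driver name bound to a network interface, if any."""
    link = os.path.join(sysfs_root, "class", "net", iface, "device", "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))

def is_vmxnet3(iface: str, sysfs_root: str = "/sys") -> bool:
    return interface_driver(iface, sysfs_root) == "vmxnet3"
```

The `sysfs_root` parameter simply makes the sketch testable outside a real guest; in practice you would call `is_vmxnet3("eth0")` directly.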

CPU Configuration

The CPU scheduler in ESXi schedules CPU activity and fairly grants CPU access to virtual machines using shares, so it is important to configure a virtual machine with the number of vCPUs required for its workload. For unknown application workloads, my recommendation is to start small and increase the number of vCPUs gradually until you observe acceptable and stable performance from the virtual machine workload. Enabling CPU Hotplug for the virtual machine, where supported by the guest operating system and/or application, allows additional vCPUs to be added without incurring downtime for the virtual machine.
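The 'start small' approach above can be sketched as a simple sizing loop: add vCPUs only while measured utilisation stays above a target, and settle on the smallest count that leaves stable headroom. The measurement function and the 70% target here are assumptions for illustration; in practice you would use real guest or vCenter performance metrics.

```python
# Sketch of the 'start small and grow' vCPU sizing approach.
# measure_utilisation is a stand-in for real performance metrics.

def right_size_vcpus(measure_utilisation, max_vcpus=8, target_pct=70.0):
    """Return the smallest vCPU count whose measured utilisation is at or
    below target_pct, or max_vcpus if the workload never settles."""
    for vcpus in range(1, max_vcpus + 1):
        if measure_utilisation(vcpus) <= target_pct:
            return vcpus
    return max_vcpus

# Hypothetical utilisation observed at each vCPU count (illustrative data).
observed = {1: 98.0, 2: 95.0, 3: 80.0, 4: 62.0, 5: 55.0}
recommended = right_size_vcpus(lambda n: observed.get(n, 50.0))
# recommended == 4: the first count at which utilisation drops to the target
```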

Overcommitting CPU resources by assigning additional vCPUs to a virtual machine can lead to performance issues, and it also directly consumes additional memory for the associated virtual machine overhead. The additional CPU demand may exhaust the host system's resources and degrade the performance of the virtual machines on that host. Therefore, adding additional vCPUs to a virtual machine to resolve a perceived vCPU contention issue may actually add an extra burden to the host system and degrade performance further.

By default, the VMkernel schedules a virtual machine's vCPUs to run on any logical CPU of the host system's hardware. In some cases you may wish to configure CPU affinity for the virtual machine. For example, you may need to troubleshoot the performance of a CPU workload as if it were not sharing CPU resources with other workloads on the host system, but lack the ability to migrate the virtual machine to an isolated host system. You may also wish to use CPU affinity to measure the throughput and response times of several virtual machines competing for specific logical CPUs on the host system.

One limitation of enabling CPU scheduling affinity for a virtual machine is that vMotion is not functional in this configuration; additionally, if the virtual machine is in a Distributed Resource Scheduler (DRS) cluster, the ability to enable CPU scheduling affinity is disabled.
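An affinity set is specified as a list of logical CPUs and ranges (for example `"0,2-3"`, the form written into the virtual machine's advanced settings). A small sketch of parsing and sanity-checking such a value; the range syntax and validation rules here are assumptions for illustration:

```python
# Sketch: expand and validate a CPU affinity specification such as "0,2-3".

def parse_affinity(spec: str):
    """Expand a comma-separated list of logical CPUs and ranges into a set."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        elif part:
            cpus.add(int(part))
    return cpus

def validate_affinity(spec: str, host_logical_cpus: int, vm_vcpus: int) -> bool:
    """The set must fit on the host and include at least one logical CPU per
    vCPU, otherwise the vCPUs cannot all be scheduled concurrently."""
    cpus = parse_affinity(spec)
    return bool(cpus) and max(cpus) < host_logical_cpus and len(cpus) >= vm_vcpus
```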

Storage Configuration

The placement of a virtual machine on a datastore may have a significant impact on its performance, because the I/O requirements of all the virtual machines on that shared resource may result in I/O latency if the underlying storage array is unable to meet them. To optimise the placement of virtual machine I/O, you may use Storage vMotion to migrate virtual machines to datastores that have fewer competing workloads, or that are configured for better performance, when an I/O latency threshold is exceeded.
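The placement decision above can be sketched as a simple selection: exclude datastores whose observed latency exceeds a threshold, then prefer the one with the fewest competing workloads. The 15 ms default mirrors the Storage DRS I/O latency threshold; the datastore names and figures are hypothetical.

```python
# Sketch: choose a Storage vMotion target from observed per-datastore latency.

def pick_datastore(datastores, latency_threshold_ms=15.0):
    """datastores: list of (name, avg_latency_ms, competing_vm_count).
    Returns the best candidate name, or None if all exceed the threshold."""
    candidates = [d for d in datastores if d[1] < latency_threshold_ms]
    if not candidates:
        return None
    # Prefer fewer competing workloads, then lower observed latency.
    return min(candidates, key=lambda d: (d[2], d[1]))[0]

# Hypothetical observations for three datastores.
observed = [
    ("datastore-ssd-01", 4.2, 12),
    ("datastore-sas-02", 9.8, 3),
    ("datastore-sata-03", 27.5, 1),   # over the latency threshold: excluded
]
```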

When provisioning a virtual machine, the default virtual SCSI controller type is based on the guest operating system and in most cases will be adequate for the virtual machine workload. However, if this is not sufficient to satisfy the workload, using a VMware Paravirtual SCSI controller can improve performance, with higher achievable throughput and lower CPU utilisation in comparison to the other SCSI controller types. As with the VMXNET3 virtual network adapter, this requires VMware Tools to be installed to provide the appropriate driver for the supported guest operating system.

As discussed in a previous performance blog (http://wp.me/p15Mdc-u7), eager-zeroed thick disks provide the best performance of the virtual machine disk types, as they do not need to obtain new physical disk blocks or write zeros during normal operations.
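The reason eager-zeroed thick disks perform best can be illustrated with a toy model of the first-write penalty: with lazy zeroing, each block must be zeroed the first time it is touched, whereas an eager-zeroed disk was zeroed in full at creation time. The costs below are arbitrary units for illustration, not measurements.

```python
# Toy model of the first-write penalty on lazy-zeroed vs eager-zeroed disks.

ZERO_COST, WRITE_COST = 1.0, 1.0   # illustrative cost units

def write_cost(zeroed_blocks: set, block: int, eager: bool) -> float:
    """Cost of writing one block. An eager-zeroed disk (or an already-touched
    block on a lazy-zeroed disk) pays only the write; a lazy-zeroed disk pays
    an extra zeroing cost on the first write to each block."""
    if eager or block in zeroed_blocks:
        return WRITE_COST
    zeroed_blocks.add(block)        # lazy disk zeroes the block on first touch
    return ZERO_COST + WRITE_COST
```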

If, in some instances, you need to remove the VMFS layer to satisfy I/O performance requirements, you may configure a virtual machine to use raw device mappings (RDMs).