What is new in VMware vSphere 5.5

Update, September 3: the supported number of hosts and VMs for the vCenter Server Appliance has been reduced when using the embedded database.

At VMworld 2013 US, VMware announced a new point release of vSphere: 5.5. The General Availability date is not yet known, but I expect it will be soon.

This post is part of a series of blog posts on VMworld 2013 announcements. See here for a complete overview of all announcements.

vSphere 5.5 has new features and enhancements in ESXi 5.5 and vCenter Server 5.5. Note that Virtual SAN (VSAN) is not part of the vSphere license and needs to be acquired separately. Lots of details on VSAN in this post. Many details on all the announcements made at VMworld 2013 are shown here.

Some of the highlights of VMware vSphere 5.5:

Support for VMDK files with a maximum size of 62 TB on VMFS-5 and NFS (finally)

End-to-end 16 Gb Fibre Channel support: 16 Gb from host to switch and 16 Gb from switch to SAN

Support for 40 Gbps NICs

Enhanced IPv6 support

Enhancements for CPU C-states. This reduces power consumption. More info here.

Expanded vGPU support: in vSphere 5.1 VMware only supported NVIDIA GPUs. In vSphere 5.5, support for AMD and Intel GPUs is added. Three rendering modes are supported: automatic, hardware and software.

vMotion of a virtual machine between hosts with GPUs from different vendors is also supported. If hardware rendering is enabled on the source host and no GPU exists on the destination host, the vMotion will not be attempted and fails.

Added Microsoft Windows Server 2012 guest clustering support

AHCI controller support, which enables Mac OS guests to use virtual CD-ROM drives. AHCI is an operating mode for SATA. More info on AHCI here.

VMDK maximum size to 62 TB One of the most interesting new features is the increased maximum size of the virtual disk file (VMDK). While vSphere 5.1 has a maximum of 2 TB, vSphere 5.5 has a maximum of 62 TB. 62 TB is also the maximum size for virtual mode RDMs. Existing VMDKs smaller than 2 TB can be grown beyond that limit offline (no hot-grow support just yet). Online/hot extension of 2 TB+ VMDKs is not supported because of concerns around updating the GPT partition header of a live VM.

If an admin creates a 2 TB+ disk on a VMFS-5 datastore and then presents that datastore to ESXi versions prior to 5.5, the VM will fail to start on those pre-5.5 hosts.
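The 62 TB rules above boil down to a couple of simple constraints. Here is a toy Python sketch of that logic — not a VMware API; the function names and version tuples are made up for illustration:

```python
# Toy sketch of the 2 TB+ VMDK rules described above. Illustrative only;
# this is not how ESXi represents disks or versions internally.

TB = 1024 ** 4

def can_power_on(vmdk_size_bytes: int, esxi_version: tuple) -> bool:
    """A VM with a 2 TB+ VMDK only starts on ESXi 5.5 or later."""
    if vmdk_size_bytes <= 2 * TB:
        return True
    return esxi_version >= (5, 5)

def can_grow(current_size: int, new_size: int, vm_powered_on: bool) -> bool:
    """Growing a VMDK past 2 TB is offline-only; 62 TB is the hard cap."""
    if new_size > 62 * TB:
        return False
    if new_size > 2 * TB and vm_powered_on:
        return False  # no hot-grow past 2 TB (GPT header update concern)
    return new_size >= current_size
```

For example, `can_grow(1 * TB, 3 * TB, vm_powered_on=True)` returns `False`, matching the offline-only extension rule.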

These items/features support 62 TB disks:

NFS & VMFS

Offline extension of 2TB+ VMDK

vMotion

Storage vMotion

SRM/vSphere Replication

Flash

Snapshots

Linked Clones

SE Sparse Disks

These items/features do not support 62 TB disks:

Virtual SAN (VSAN)

Online/hot extension of 2TB+ VMDK

BusLogic Virtual SCSI Adapters

Fault Tolerance

VI (C#) Client

MBR Partitioned Disks

vmfsSparse Disks

vSphere Web Client All the new features of vSphere 5.5 are only available in the vSphere Web Client. The full Windows client (C#) is still provided to support Update Manager, Site Recovery Manager and some vendor plug-ins, and to connect directly to an ESXi host.

Other enhancements are

Increased Platform Support

Dropped support for the Linux client integration tools, which are used to get VM console access. This is due to Adobe dropping support for Flash on Linux.

Added support for OS X

VM Console access

Deploy OVF Templates

Attach Client Devices

Enhanced Usability Experience

Drag and Drop

Filters

Recent Items

Virtual hardware version 10 vSphere 5.5 introduces a new version of the virtual hardware: version 10. New in this version is, among other things, the virtual SATA (AHCI) controller mentioned earlier.

VMFS heap size An improvement on the storage side is the VMFS heap size. The VMFS heap is the part of the host's physical memory, in use by the kernel, that is reserved for file handling of VMFS volumes. The heap memory contains pointers to data blocks of VMDK files on VMFS volumes. An issue with the heap in previous VMFS versions meant that there were concerns when accessing more than 30 TB of open files from a single ESXi host. ESXi 5.0p5 and 5.1U1 introduced a larger heap size to deal with this: admins could reserve up to 640 MB for the heap. vSphere 5.5 introduces a much improved heap eviction process, so there is no need for the larger heap size, which consumes memory. With a maximum of 256 MB of heap, vSphere 5.5 allows ESXi hosts to access the entire address space of a 64 TB VMFS.
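The heap-to-address-space relationship can be illustrated with back-of-envelope arithmetic. The 4-byte pointer and 1 MB block figures below are assumptions chosen to make the numbers work out, not documented VMFS internals:

```python
# Back-of-envelope arithmetic for the heap/address-space relationship above.
# The 4-byte-per-pointer and 1 MB-block figures are illustrative assumptions.

MB = 1024 ** 2
TB = 1024 ** 4

def addressable_bytes(heap_bytes: int, pointer_size: int, block_size: int) -> int:
    """Each heap pointer maps one file block; coverage = entries * block size."""
    return (heap_bytes // pointer_size) * block_size

coverage = addressable_bytes(256 * MB, pointer_size=4, block_size=1 * MB)
print(coverage // TB)  # 64 -> under these assumptions, 256 MB of heap covers 64 TB
```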

Change in vSphere HA

A major new feature of VMware HA is App HA. This makes it possible to monitor applications running inside the guest operating system. If a service fails, App HA will try to restart it. If that does not succeed, HA will then restart the VM.

However, the set of supported services running in the guest is limited to:

MSSQL 2005, 2008, 2008R2, 2012

Tomcat 6.0, 7.0

TC Server Runtime 6.0, 7.0

IIS 6.0, 7.0, 8.0

Apache HTTP Server 1.3, 2.0, 2.2

More info on VMware AppHA in this post.
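The escalation App HA applies (restart the service, fall back to restarting the VM) can be sketched roughly as follows. The function names and retry count are illustrative, not VMware's actual implementation:

```python
# Sketch of the App HA escalation described above: try to restart the failed
# service a few times, then fall back to restarting the whole VM.

def handle_service_failure(restart_service, restart_vm, max_attempts=3):
    """Returns the action that ultimately recovered the workload."""
    for _ in range(max_attempts):
        if restart_service():        # e.g. ask the agent to restart MSSQL
            return "service restarted"
    restart_vm()                     # escalation: HA restarts the whole VM
    return "vm restarted"
```

For example, passing a `restart_service` callback that always fails would end in a VM restart after three attempts.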

There are a few other changes in VMware HA. In vSphere 5.1, DRS VM-to-VM anti-affinity rules could be created. These prevent two or more VMs serving the same application role (like a web front end) from running on the same host. After an HA event, however, HA could restart a member of such a rule on a host on which another member VM was already running. After the VM had started, DRS would kick in and vMotion the VM to another host. In vSphere 5.5, directly after an HA event the VM is restarted on a host that does not run another member of the DRS anti-affinity rule. See the images below.
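The improved restart placement can be sketched as a simple host filter; the data model here is hypothetical and only mirrors the behavior described above:

```python
# Sketch of the vSphere 5.5 behavior: pick a failover host that does not
# already run another member of the same anti-affinity rule.

def pick_restart_host(hosts, rule_members, vm):
    """hosts: {host_name: set of running VM names};
    rule_members: VMs that must not share a host with `vm`."""
    others = rule_members - {vm}
    for host, running in sorted(hosts.items()):
        if not (running & others):
            return host  # no rule member runs here -> safe restart target
    return None  # rule cannot be honored (5.1 would simply restart anywhere)
```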

vSphere networking feature summary:

Enhancement to LACP feature

Number of Link Aggregation Groups supported on a VDS increased to 64

Number of supported hashing algorithms increased to 22

Enhanced SR-IOV

Port group-specific properties are communicated to the virtual functions

Traffic Filtering

Helps drop or allow selected traffic

QoS Tagging

Provides service level agreements for important traffic types by marking the packets

Latency-sensitivity feature Some applications demand a lot of compute resources; think of high-performance computing and stock-trading applications. When dealing with millions of dollars per second, the slightest delay in response can cost a lot of money. So VMware introduces a new feature in vSphere 5.5 called latency sensitivity. This is set per VM and has a value of normal, medium or high. With the latency-sensitivity feature enabled, the CPU scheduler determines whether exclusive access to PCPUs can be given, considering various factors including whether the PCPUs are over-subscribed.

Reserving 100% of VCPU time increases the chances of getting exclusive PCPU access for the VM. With exclusive PCPU access given, each VCPU entirely owns a specific PCPU and no other VCPUs are allowed to run on it. This achieves nearly zero ready time, improving response time and jitter under CPU contention. Although just reserving 100% of CPU time (without latency sensitivity enabled) can yield a similar effect on a relatively large time scale, the VM may still have to wait in a short time span, possibly adding jitter. Note that the LLC is still shared with other VMs residing on the same socket, even with exclusive PCPU access.

The latency-sensitivity feature requires the user to reserve the VM's memory, to ensure that the memory size requested by the VM is always available. Without a memory reservation, vSphere may reclaim memory from the VM when the host's free memory gets scarce. Some memory reclamation techniques, such as ballooning and hypervisor swapping, may significantly degrade VM performance when the VM accesses a memory region that has been swapped out to disk. A memory reservation prevents such performance degradation from happening. More details on memory management in vSphere can be found here.
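The decision factors described above (sensitivity setting, full CPU reservation, no PCPU over-subscription) can be sketched as follows. The real scheduler heuristics are not public; this only encodes what the text says:

```python
# Sketch of the exclusive-PCPU decision factors described above.
# Illustrative only; ESXi's actual scheduler uses internal heuristics.

def grants_exclusive_pcpu(latency_sensitivity: str,
                          cpu_reservation_pct: int,
                          total_vcpus: int,
                          total_pcpus: int) -> bool:
    if latency_sensitivity != "high":
        return False
    if cpu_reservation_pct < 100:
        return False                    # full reservation raises the odds
    return total_vcpus <= total_pcpus   # over-subscribed PCPUs block exclusivity
```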

Bypassing virtualization layers: once exclusive access to PCPUs is obtained, the feature allows the VCPUs to bypass the VMkernel's CPU scheduling layer and halt directly in the VMM, since there are no other contexts that need to be scheduled. That way, the cost of running the CPU scheduler code and the cost of switching between the VMkernel and the VMM are avoided, leading to much faster VCPU halt/wake-up operations. VCPUs still experience switches between direct guest code execution and the VMM, but this operation is relatively cheap with the hardware-assisted virtualization technologies provided by recent CPU architectures.

Tuning virtualization layers: when the VMXNET3 para-virtualized device is used for VNICs in the VM, VNIC interrupt coalescing and LRO support for the VNICs are automatically disabled to reduce response time and its jitter. Although such tunings can help improve performance, they may have a negative side effect in certain scenarios, which is discussed in VMware's whitepaper on this feature. If the hardware supports SR-IOV and the VM doesn't need certain virtualization features such as vMotion, Network I/O Control and Fault Tolerance, VMware recommends the use of a pass-through mechanism, single-root I/O virtualization (SR-IOV), for latency-sensitive VMs. Best practices:

Set the Latency-Sensitivity to High

Consider a 100% CPU reservation

Consider overprovisioning PCPUs to reduce the impact of sharing the LLC. Even with exclusive PCPU access, the LLC is still shared

Consider using a pass-through mechanism such as SR-IOV to bypass the network virtualization layer, if the hardware supports one and virtualization features such as vMotion and Fault Tolerance are not needed

Consider using a separate PNIC for latency sensitive VMs to avoid network contention

Consider using NetIOC, if a pass-through mechanism is not used and there is contention for network bandwidth.

Reliable Memory
Reliable Memory is a new feature in vSphere 5.5. Internal memory in an ESXi server is not always equal: some parts of it can be more reliable (have fewer errors) than other parts. When memory fails, the server can get a purple screen of death (PSOD). The server hardware is able to report which memory is reliable and which is not, and ESXi is able to query that information. This allows ESXi to stop using parts of memory when it determines that a failure might occur (predictive, ECC), as well as when a failure did occur (actual, uncorrectable). Corrected errors are reported and collected, and ESXi stops using the failed address to prevent the corrected error from becoming an uncorrected error. This enables better VMkernel reliability despite errors occurring in RAM, and avoids using memory pages that might contain errors.
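The corrected-error handling described above amounts to simple bookkeeping: record the error, retire the page. A purely illustrative sketch; ESXi's real tracking is internal:

```python
# Sketch of the corrected-error handling described above: after a page reports
# a corrected ECC error it is retired, so the error cannot grow into an
# uncorrectable one. Illustrative only.

class PageTracker:
    def __init__(self):
        self.retired = set()
        self.corrected_errors = []

    def report_corrected_error(self, page_addr: int):
        self.corrected_errors.append(page_addr)  # collected for reporting
        self.retired.add(page_addr)              # stop using this address

    def is_usable(self, page_addr: int) -> bool:
        return page_addr not in self.retired
```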

vSphere 5.1 had a similar feature named Memory Reliability, which provided better VMkernel reliability despite corrected and uncorrected errors in RAM and enabled the system to avoid using memory pages that might contain errors. I understand the feature in ESXi 5.5 anticipates future developments of server vendors implementing reliable-memory techniques. An example of a server vendor that offers reporting on reliable memory is Dell.

Dell offers Fault Resilient Memory (FRM), a specialized, Dell-patented technology designed to work in conjunction with the VMware vSphere hypervisor to protect it, and the virtual machines (VMs) it supports, from memory faults that would take them out of service. The concept originated with the notion of having the platform firmware provide the hypervisor and its kernel with a "fault resilient zone". When provided with the FRM zone, the VMware vSphere hypervisor can place itself in this zone, protecting it from exposure to potential memory faults (as memory can deteriorate over time) and providing a highly available virtualization solution.

Dell also has Dell Reliable Memory Technology. I have not found any other documentation showing vendors supporting reliable-memory techniques from which ESXi Reliable Memory can benefit.

Difference between Reliable Memory in 5.5 and Memory Reliability The objective of Reliable Memory and Memory Reliability is the same, to protect against physical memory errors, but the specifics differ. In 5.1, VMware avoided using regions of memory where physical errors were detected. In 5.5, ESXi runs critical code in memory regions that are designated as "more reliable" / "better protected", so that even if physical memory errors do occur in those regions, ESXi keeps running (assuming ESXi runs on hardware that supports this capability). You can look up how much memory is considered reliable with the esxcli hardware memory get command. Reliable Memory is available in the vSphere Enterprise and Enterprise Plus editions.

Pricing and features per edition Some of the major announcements on pricing and features are:

The vCPU entitlements are removed from all editions: you can configure up to the maximum supported number of vCPUs per VM in all editions

No price change for vSphere 5.5 editions.

vSphere Essentials Plus is now available in two ways: without VSA ($4,495) and with VSA ($4,995)

No feature waterfall. All features available in vSphere 5.1 editions remain available in the same vSphere 5.5 edition.

The vCPU entitlements are removed from all editions. This means you can configure a VM with any number of vCPUs, independent of the vSphere edition. Previously, in vSphere 5.1, the Standard edition allowed a maximum of 8-way vCPU, Enterprise a 32-way and Enterprise Plus a 64-way. The free vSphere Hypervisor product, however, keeps its limit of 8 vCPUs per VM! But there is no longer a restriction on the amount of physical RAM, which used to be 32 GB per host. vSphere Hypervisor 5.5 allows you to use as much RAM as the server has onboard.

A lesser-known edition is vSphere for Retail and Branch Office (ROBO). Customers previously had to purchase the starter kit first; this is no longer required. The ROBO kit enables deployment of a maximum of 6 CPUs across three servers per site. There is now an Essentials for ROBO 10 CPU pack and an Essentials Plus for ROBO 10 CPU pack.

What is new in vCenter Server 5.5:

Security

Improved vCenter Single Sign-On

User Interface

Increased Platform Support

Enhanced Usability Experience

vCenter Single Sign On

Improved user experience in multi domain environments

Secure Database connectivity with Windows Authentication

vCenter Databases

Official support for database clustering technologies. VMware now supports the use of Oracle RAC and Microsoft Failover Clustering to create highly available clustered databases. In the past, VMware did help customers with issues on clustered databases, but there was no official support.

vCenter Server Appliance

Scalability of embedded database (vPostgres)

It now supports up to 100 hosts and 3,000 VMs (initially VMware announced support for 500 vSphere hosts / 5,000 virtual machines) when using the embedded vPostgres database. An external Oracle database will continue to scale to 1,000 hosts / 10,000 VMs.

There is no support yet for Microsoft SQL Server as a database for the vCenter Server Appliance. This is due to the lack of an ODBC driver on Linux. Microsoft has a tech preview of the ODBC driver available, but this is not supported in production environments.

VCSA doesn't support Linked Mode, and therefore can't support multi-site SSO without a vCenter on Windows (mixed vCenter deployments to achieve single-pane-of-glass management)

VCSA doesn’t support vCenter Heartbeat yet…
