Availability Management

Reducing planned and unplanned downtime

Whether or not we are talking about a highly available, business-critical environment, any planned or unplanned downtime means financial losses. Historically, solutions that could provide high availability and redundancy were costly and complex.

With the virtualization technologies available today, it becomes easier to provide higher levels of availability for environments where they are needed. With VMware products, and vSphere in particular, it's possible to do the following things:

Have higher availability that is independent of hardware, operating systems, or applications

Schedule planned downtime for many maintenance tasks, and shorten or eliminate it

Provide automatic recovery in case of failure

Planned downtime

Planned downtime usually happens during hardware maintenance, firmware or operating system updates, and server migrations. To reduce the impact of planned downtime, IT administrators are forced to schedule small maintenance windows outside working hours.

vSphere makes it possible to reduce planned downtime dramatically. With vSphere, IT administrators can perform many maintenance tasks at any point in time, eliminating downtime for many common maintenance operations.

This is possible mainly because workloads in a vSphere environment can be moved dynamically between different physical servers and storage resources without any service interruption.

The main availability capabilities that are built into vSphere allow the use of HA and redundancy features, and are as follows:

Shared storage: Storage resources such as a Fibre Channel or iSCSI Storage Area Network (SAN), or Network-Attached Storage (NAS), help eliminate single points of failure. SAN mirroring and replication features can be used to keep fresh copies of the virtual disk at disaster recovery sites.

vSphere vMotion® and Storage vMotion functionalities allow the migration of VMs between ESXi hosts and their underlying storage without service interruption, as shown in the following figure:

In other words, vMotion is the live migration of VMs between ESXi hosts, and Storage vMotion is the live migration of VMs between storage LUNs. In both cases, the VM retains its network and disk connections. With vSphere 5.1 and later versions, it's possible to combine vMotion with Storage vMotion into a single migration, which simplifies administration. The final switchover takes less than two seconds on a Gigabit Ethernet network.

vMotion keeps track of ongoing memory transactions while the memory and system state are copied to the target host. Once copying is done, vMotion suspends the source VM, copies the transactions that happened during the process to the target host, and resumes the VM on the target host. This way, vMotion ensures transaction integrity.
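The iterative pre-copy just described can be pictured with a toy model. This is pure Python for illustration only: the page counts, dirty rate, and per-round throughput below are invented, and ESXi's real transfer scheduling is considerably more sophisticated.

```python
# Toy model of vMotion's iterative memory pre-copy. All numbers are
# invented for illustration; ESXi's real transfer scheduling differs.

def precopy_rounds(total_pages, dirty_per_round, copy_per_round, switchover_limit):
    """Copy memory in rounds; pages dirtied during a round must be
    re-sent in the next one. Stop when the remaining set is small
    enough to transfer during the brief switchover pause."""
    remaining = total_pages
    rounds = 0
    while remaining > switchover_limit and rounds < 100:
        copied = min(remaining, copy_per_round)
        # Pages the guest dirties while this round is in flight must
        # be sent again in the next round.
        remaining = (remaining - copied) + dirty_per_round
        rounds += 1
    return rounds, remaining

rounds, left = precopy_rounds(
    total_pages=1_000_000,    # ~4 GB of 4 KB pages
    dirty_per_round=50_000,   # pages the guest dirties per round
    copy_per_round=300_000,   # pages sent over the vMotion network per round
    switchover_limit=60_000,  # small enough to copy while suspended
)
print(rounds, left)
```

When the dirty rate approaches the copy rate, pre-copy stops converging, which is one reason vMotion benefits from a fast, preferably dedicated, network.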

vSphere requirements for vMotion

vSphere requirements for vMotion are as follows:

All the hosts must have the following features:

Be correctly licensed for vMotion

Have access to the shared storage

Use a Gigabit (or faster) Ethernet adapter for vMotion, preferably a dedicated one

Have a VMkernel port group configured for vMotion with the same name on each host (the name is case sensitive)

Have access to the same subnets

Must be members of all the vSphere distributed switches that VMs use for networking

Use jumbo frames for best vMotion performance

All the virtual machines that need to be vMotioned must have the following features:

Shouldn't use raw disks if migration between storage LUNs is needed

Shouldn't use devices that are not available on the destination host (for example, a CD drive or USB devices not enabled for vMotion)

Should be located on a shared storage resource

Shouldn't use devices connected from the client computer

Migration with vMotion

Migration with vMotion happens in three stages:

vCenter server verifies that the existing VM is in a stable state and that the CPU on the target host is compatible with the CPU this VM is currently using

vCenter migrates VM state information such as memory, registers, and network connections to the target host

The virtual machine resumes its activities on the new host

VMs with snapshots can be vMotioned regardless of their power state as long as their files stay on the same storage. Obviously, this storage has to be accessible for both the source and destination hosts.

Both the source and destination hosts must be of ESX or ESXi version 3.5 or later

All the VM files should be kept in a single directory on a shared storage resource

To vMotion a VM in vCenter, right-click on a VM and choose Migrate… as shown in the following screenshot:

This opens a migration wizard where you can select whether to migrate between hosts, between datastores, or both. The Change host option is the standard vMotion, and Change datastore is Storage vMotion. As you can see, the Change both host and datastore option is not available because this VM is currently running. As mentioned earlier, vSphere 5.1 and later support vMotion and Storage vMotion in one transaction.

In the next steps, you are able to choose the destination as well as the priority for this migration. Multiple VMs can be migrated at the same time if you make multiple selections in the Virtual Machines tab for the host or the cluster.

VM vMotion is widely used to perform host maintenance such as upgrading the ESX operating system, memory, or any other configuration changes. When maintenance is needed on a host, all the VMs can be migrated to other hosts and this host can be switched into the maintenance mode. This can be accomplished by right-clicking on the host and selecting Enter Maintenance Mode.

Unplanned downtime

Environments, especially critical ones, need to be protected from any unplanned downtime caused by possible hardware or application failures. vSphere has important capabilities that can address this challenge and help to eliminate unplanned downtime.

These vSphere capabilities are transparent to the guest operating system and any applications running inside the VMs; they are also a part of the virtual infrastructure. The following features can be configured for VMs in order to reduce the cost and complexity of HA. More detail on these features will be given in the following sections of this article.

High availability (HA)

vSphere's HA is a feature that allows a group of hosts connected together to provide high levels of availability for VMs running on these hosts. It protects VMs and their applications in the following ways:

In case of ESX server failure, it restarts VMs on the other hosts that are members of the cluster

In case of guest OS failure, it resets the VM

If application failure is detected, it can reset a VM

With vSphere HA, there is no need to install any additional software in a VM. After vSphere HA is configured, all the new VMs will be protected automatically.

The HA option can be combined with vSphere DRS to protect against failures and to provide load balancing across the hosts within a cluster.

Creating a vSphere HA cluster

Before HA can be enabled, a cluster itself needs to be created. To create a new cluster, right-click on the datacenter object in the Hosts and Clusters view and select New Cluster... as shown in the following screenshot:

The following prerequisites have to be considered before setting up a HA cluster:

All the hosts must be licensed for vSphere HA.

ESX/ESXi 3.5 hosts are supported for vSphere HA with the following patches installed; these fix an issue involving file locks:

ESX 3.5: patch ESX350-201012401-SG and prerequisites

ESXi 3.5: patch ESXe350-201012401-I-BG and prerequisites

At least two hosts must exist in the cluster.

All the hosts' IP addresses need to be assigned statically or configured via DHCP with static reservations to ensure address consistency across host reboots.

At least one network should exist that is shared by all the hosts, that is, a management network. It is best practice to have at least two.

To ensure VMs can run on any host, all the hosts should also share the same datastores and virtual networks.

All the VMs must be stored on shared, and not local, storage.

VMware tools must be installed for VM monitoring to work.

Host certificate checking should be enabled.

Once all of the requirements have been met, vSphere HA can be enabled in vCenter under the cluster settings dialog. In the following screenshot, it appears as PRD-CLUSTER Settings:

Once HA is enabled, all the cluster hosts that are running and are not in maintenance mode become a part of HA.

HA settings

The following HA settings can also be changed at the same time:

Host monitoring status is enabled by default

Admission control is enabled by default

Virtual machine options (restart priority is Medium by default and isolation response by default is set to Leave powered on)

VM monitoring is disabled by default

Datastore heartbeating is selected by vCenter by default

More details on each of these settings can be found in the following sections of this article.

Host monitoring status

When a HA cluster is created, an agent is uploaded to all the hosts and configured to communicate with the other agents within the cluster. One of the hosts becomes the master host, and the rest become slave hosts. There is an election process to choose the master host, and the host that mounts the most datastores has an advantage in this election. In case of a tie, the host with the lexically-highest Managed Object ID (MOID) is chosen.

MOID, also called MoRef ID, is a value generated by vCenter for each object: host, datastore, VM, and so on. It is guaranteed to be unique across the infrastructure managed by this particular vCenter server.

When it comes to the election process for choosing the master host, a host with ID 99 will have higher priority than a host with ID 100, because the comparison is lexical rather than numeric.

If a master host fails or becomes unavailable, a new election process is initiated.
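The election rule described above can be sketched as a sort key. This is a simplified model of the rule only, not the actual FDM election protocol, and the host records below are hypothetical:

```python
# Sketch of the HA master election ordering: the host mounting the most
# datastores wins, and ties are broken by the lexically highest MOID.
# Hypothetical host records; this models the rule, not the protocol.

def election_key(host):
    # MOIDs are compared as strings, so "host-99" beats "host-100"
    return (host["datastores"], host["moid"])

hosts = [
    {"name": "esx1", "datastores": 4, "moid": "host-100"},
    {"name": "esx2", "datastores": 4, "moid": "host-99"},
    {"name": "esx3", "datastores": 3, "moid": "host-101"},
]

master = max(hosts, key=election_key)
print(master["name"])  # esx2: same datastore count as esx1, lexically higher MOID
```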

Slave hosts monitor whether the VMs are running locally and report to the master host.

In its turn, the master host communicates with vCenter and monitors other hosts for failures. Its main responsibilities are listed as follows:

Monitoring the state of the slave hosts and in case of failure, identifying which VMs must be restarted

Monitoring the state of all the protected VMs and restarting them in case of failure

Managing a list of hosts and protected VMs

Communicating with vCenter and reporting the cluster's health state

Host availability monitoring is done through a network heartbeat exchange, which happens every second by default. When network heartbeats from a host are lost, before declaring it failed, the master host checks whether the host is still exchanging datastore heartbeats with any of the existing datastores and whether it responds to pings sent to its management IP address.

The master host detects the following types of host failure:

Type of failure                        Network heartbeats   ICMP ping   Datastore heartbeats
Lost connectivity to the master host   -                    +           +
Network isolation                      -                    -           +
Failure                                -                    -           -

If host failure is detected, the host's VMs will be restarted on other hosts.
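The table above translates directly into a small decision function. This is a sketch of the logic only; the real agent has additional states and timeouts:

```python
# Decision function mirroring the failure table: the master only reacts
# once network heartbeats are lost, then uses datastore heartbeats and
# ping to tell the cases apart. A sketch of the logic only.

def classify(network_heartbeat, icmp_ping, datastore_heartbeat):
    if network_heartbeat:
        return "healthy"
    if datastore_heartbeat and icmp_ping:
        return "lost connectivity to the master host"
    if datastore_heartbeat:
        return "network isolation"
    return "failure"  # VMs get restarted on other hosts

print(classify(False, True, True))    # lost connectivity to the master host
print(classify(False, False, True))   # network isolation
print(classify(False, False, False))  # failure
```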

Host network isolation happens when a host is running but doesn't see any traffic from vSphere HA agents, which means that it's disconnected from the management network. Isolation is handled as a special case of failure in VMware HA. If a host becomes network isolated, the master host continues to monitor this host and the VMs running on it. Depending on the isolation settings chosen for individual VMs, some of them may be restarted on another host.

The master host has to communicate with vCenter, and therefore it can't remain the master while isolated. If the master host becomes isolated, a new master host will be elected.

When network isolation happens, certain hosts are not able to communicate with vCenter, which may result in configuration changes not having effect on certain parts of the infrastructure. If a network infrastructure is configured correctly and has redundant network paths, isolation should happen rarely.

Datastore heartbeating

Datastore heartbeating was introduced in vSphere 5. In the previous versions of vSphere, once a host became unreachable through the management network, HA always initiated VM restart, even if the VMs were still running. This, of course, created unnecessary downtime and additional stress to the host. Datastore heartbeating allows HA to make a distinction between hosts that are isolated or partitioned and hosts that have failed, which adds more stability to the way HA works.

vCenter server selects a list of datastores for heartbeat verification to maximize the number of hosts that can be verified. It uses a selection algorithm designed to select datastores that are connected to the highest number of hosts. This algorithm attempts to choose datastores that are hosted on different storage arrays or NFS servers. It also prefers VMFS-formatted LUNs over NFS-hosted datastores.

vCenter selects datastores for heartbeating in the following scenarios:

When HA is enabled

If a new datastore is added

If the accessibility to a datastore changes

By default, two datastores are selected; this is the minimum number of datastores needed. It can be increased to up to five datastores using the das.heartbeatDsPerHost parameter under Advanced Settings. The PRD-CLUSTER Settings dialog box can be used to verify or change the datastores selected for heartbeating, as shown in the following screenshot:

It is recommended, however, to let vCenter choose the datastores. Only the datastores that are mounted to more than one host are available in the list.
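vCenter's stated preferences can be approximated with a greedy scoring sketch. The field names and the two-pass strategy below are assumptions for illustration; the actual selection algorithm is internal to vCenter:

```python
# Greedy sketch of heartbeat-datastore selection: prefer datastores
# connected to the most hosts, prefer VMFS over NFS, and try to spread
# the picks across different storage arrays. Field names are invented.

def pick_heartbeat_datastores(datastores, count=2):
    ranked = sorted(datastores,
                    key=lambda d: (d["hosts"], d["vmfs"]), reverse=True)
    picked, arrays = [], set()
    for ds in ranked:                  # first pass: distinct arrays
        if ds["array"] not in arrays:
            picked.append(ds)
            arrays.add(ds["array"])
        if len(picked) == count:
            return picked
    for ds in ranked:                  # fall back if too few arrays
        if ds not in picked:
            picked.append(ds)
        if len(picked) == count:
            break
    return picked

datastores = [
    {"name": "ds-a", "hosts": 8, "vmfs": True,  "array": "array1"},
    {"name": "ds-b", "hosts": 8, "vmfs": True,  "array": "array1"},
    {"name": "ds-c", "hosts": 8, "vmfs": False, "array": "array2"},
    {"name": "ds-d", "hosts": 5, "vmfs": True,  "array": "array2"},
]

print([d["name"] for d in pick_heartbeat_datastores(datastores)])
```

Note how ds-b loses to ds-c despite being VMFS: spreading heartbeats across two arrays removes a shared point of failure.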

Datastore heartbeating leverages the existing VMFS filesystem locking mechanism. There is a so-called heartbeat region on each datastore that is updated as long as a lock on a file exists, and a host updates the datastore's heartbeat region if it has at least one file open on this volume. HA creates a file on each heartbeat datastore purely to make sure there is at least one file open on the volume. Each host creates its own file, so to determine whether an unresponsive host still has a connection to a datastore, HA simply checks whether the heartbeat region has been updated.

By default, an isolation response is triggered after 5 seconds for the master host and after approximately 30 seconds if the host was a slave in vSphere 5. The difference arises because a slave host first needs to go through the election process to determine whether any other hosts exist or whether the master host is simply down. This election starts within 10 seconds after the slave host has lost its heartbeats, and if there is no response for 15 seconds, the HA agent on this host elects itself as the master. The isolation response time can be increased using the das.config.fdm.isolationPolicyDelaySec parameter under Advanced Settings. This is, however, not recommended, as it increases the downtime when a problem occurs.

If a host becomes a master in a cluster with more than one host but has no slaves, it continuously checks whether it is isolated, and keeps doing so until it becomes a master with slaves or connects to a master as a slave. To determine whether the management network is available again, the host pings its isolation address. By default, the isolation address is the gateway configured for the management network. This can be changed using the das.isolationaddress[X] parameter under Advanced Settings; [X] takes values from 1 to 10 and allows the configuration of multiple isolation addresses. Additionally, the das.usedefaultisolationaddress parameter indicates whether the default gateway address should be used as an isolation address. This parameter should be set to False if the default gateway is not configured to respond to ICMP ping packets.

Generally, it's recommended to have one isolation address for each management network. If this network uses redundant paths, the isolation address should always be available under normal circumstances.

In certain cases, a host may be isolated, that is, not accessible via the management network but still able to receive election traffic. This host is called partitioned. Have a look at the following figure to gain more insight about this:

When multiple hosts are isolated but can still communicate with each other, it's called a network partition. This can happen for various reasons; one of them is when a cluster spans multiple sites over a metropolitan area network. This is called the stretched cluster configuration.

When a cluster partition occurs, one subset of hosts is able to communicate with the master while the other is not. Depending on the isolation response selected for VMs, they may be left running or restarted. When a network partition happens, the master election process is initiated within the subset of hosts that loses its connection to the master. This is done to make sure that the host failure or isolation results in appropriate action on the VMs. Therefore, a cluster will have multiple masters; each one in a different partition as long as the partition exists. Once the partition is resolved, the masters are able to communicate with each other and discover the multiplicity of master hosts. Each time this happens, one of them becomes a slave.

The hosts' HA state is reported by vCenter through the Summary tab for each host as shown in the following screenshot:

This is done under the Hosts tab for cluster objects as shown in the following screenshot:

Running (Master) indicates that HA is enabled and the host is a master host.

Connected (Slave) means that HA is enabled and the host is a slave host.

Only the running VMs are protected by HA. Therefore, the master host monitors the VM's state and once it changes from powered off to powered on, the master adds this VM to the list of protected machines.

Virtual machine options

Each VM's HA behavior can be adjusted under vSphere HA settings or in the Virtual Machine Options option found in the PRD-CLUSTER Settings page as shown in the following screenshot:

Restart priority

The restart priority setting determines which VMs will be restarted first after the host failure. The default setting is Medium. Depending on the applications running on a VM, it may need to be restarted before other VMs, for example, if it's a database, a DNS, or a DHCP server. It may be restarted after others if it's not a critical VM.

If you select Disabled, this VM will never be restarted if there is a host failure. In other words, HA will be disabled for this VM.
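The effect of restart priority can be pictured as a simple sort. This is illustrative only: the priority levels are the standard settings described above, but the VM names are made up, and HA itself applies these priorities internally.

```python
# Sketch: order VMs for restart by their configured restart priority.
# VMs with priority Disabled are excluded entirely, per the text above.

PRIORITY = {"High": 0, "Medium": 1, "Low": 2}

vms = [
    {"name": "web01",  "priority": "Low"},
    {"name": "dns01",  "priority": "High"},
    {"name": "app01",  "priority": "Medium"},
    {"name": "test01", "priority": "Disabled"},
]

restart_order = sorted((vm for vm in vms if vm["priority"] != "Disabled"),
                       key=lambda vm: PRIORITY[vm["priority"]])
print([vm["name"] for vm in restart_order])  # dns01 first, web01 last
```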

Isolation response

The isolation response setting defines HA actions against a VM if its host loses connection to the management network but is still running. The default setting is Leave powered on. To be able to understand why this setting is important, imagine the situation where a host loses connection to the management network and at the same time or shortly afterwards, to the storage network as well—a so-called split-brain situation.

In vSphere, only one host can have access to a VM at a time. For this purpose, the .vmdk file is locked, and there is an additional .lck file present in the same folder where the .vmdk file is stored. With HA enabled, the VMs fail over to another host; however, their original instances keep running on the isolated host. Without the locks, once this host came out of isolation, we would end up with two running copies of each VM. Because of the locks, the isolated host cannot reacquire access to the .vmdk file. In vCenter, however, such a VM will look as if it is flipping between two hosts.

With the default settings, the original host is not able to reacquire the disk locks and raises a question on the VM. HA answers this question automatically, which allows the second running copy to be powered off.

If the Power Off option is selected for a VM under the isolation response settings, this VM will be immediately stopped when isolation occurs. This can cause inconsistency with the filesystem on a virtual drive. However, the advantage of this is that VM restart on another host will happen more quickly, thus reducing the downtime.

The Shut down option attempts to gracefully shut down a VM. By default, HA waits 5 minutes for this to happen; when this time is over, if the VM is not off yet, it is powered off. This timeout is controlled by the das.isolationshutdowntimeout parameter under the Advanced Settings option. The VM must have VMware Tools installed to be able to shut down gracefully; otherwise, the Shut down option is equivalent to Power Off.

VM monitoring

Under VM Monitoring, the monitoring settings of individual applications can be adjusted as shown in the following screenshot:

The default setting is Disabled. However, VM and Application Monitoring can be enabled so that if the VM heartbeat (the VMware Tools heartbeat) or its application heartbeat is lost, the VM is restarted. To avoid false positives, the VM monitoring service also monitors the VM's I/O activity. If a heartbeat is lost and there was no I/O activity (by default, during the last 2 minutes), the VM is considered unresponsive. This feature allows you to power cycle nonresponsive VMs.

I/O interval can be changed under the advanced attribute settings (for more details, check the HA Advanced attributes table later in this section).

To avoid repeated VM resets, by default a VM will be restarted only three times during the reset period. This can be changed in the Custom mode as shown in the following screenshot:

In order to be able to monitor applications within a VM, they need to support VMware application monitoring. Alternatively, you can download the appropriate SDK and set up customized heartbeats for the application that needs to be monitored.

Under Advanced Options, the following vSphere HA behaviors can be set. Some of them have already been mentioned in sections of this article.

The following screenshot shows the Advanced Options (vSphere HA) window where advanced HA options can be added and set to specific values:

Admission control

Admission control ensures that sufficient resources are available to provide failover protection and that VM resource reservations are respected.

Admission control is available for the following:

Hosts

Resource pools

vSphere HA

Admission control can only be disabled for vSphere HA. The following screenshot shows PRD-CLUSTER Settings with the option to disable admission control:

Examples of actions that may not be permitted because of insufficient resources are as follows:

Power on a VM

Migrate a VM to another host, cluster, or resource pool

Increase CPU or memory reservation for a VM

Admission control policies

Three types of admission control policies are available for HA configuration, as follows:

Host failures cluster tolerates: When this option is chosen, HA ensures that a specified number of hosts can fail and sufficient resources will still be available to accommodate all the VMs from these hosts. The decision to either allow or deny an operation is based on the following calculations:

Slot size: A hypothetical VM that has the largest amount of memory and CPU that is assigned to an existing VM in the environment. For example, for the following VMs, the slot size will be 4 GHz and 6 GB:

VM     CPU      RAM
VM1    4 GHz    2 GB
VM2    2 GHz    4 GB
VM3    1 GHz    6 GB

Host capacity: The number of slots each host can hold, based on the resources available for VMs rather than the total host memory and CPU. For example, for the previous slot size, the host capacities are as given in the following table:

Host     CPU       RAM       Slots
Host1    4 GHz     128 GB    1
Host2    24 GHz    6 GB      1
Host3    8 GHz     14 GB     2

Cluster failover capacity: The number of hosts that can fail while enough slots remain to accommodate all the VMs. Admission control assumes the worst case, that is, the failure of the host with the largest capacity. For the previous hosts with a policy of one host failure, a failure of Host3 leaves the cluster with only two slots. If the current failover capacity is less than the allowed limit, admission control disallows the operation. For example, if we are running two VMs and need to power on a third one, the operation is denied, as the remaining capacity of two slots may not be able to accommodate three VMs.

This option is probably not the best one for an environment that has VMs with significantly more resources assigned than the rest of the VMs.

The Host failure cluster tolerates option can be used when all cluster hosts are sized pretty much equally. Otherwise, if you use this option, then excessive capacity is reserved such that the cluster tolerates the largest host failure. When this option is used, VM reservations should be kept similar across the cluster as well. Because vCenter uses the slot sizes model to calculate capacity, and the slot size is based on the largest reservation, having VMs with a large reservation will again result in additional unnecessary capacity being reserved.
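The slot arithmetic from the example above can be reproduced in a few lines. This is a simplified model: real HA calculations also account for virtualization overhead and use per-VM reservations rather than the raw figures shown here.

```python
# Slot-size model from the example: the slot is the largest CPU and the
# largest memory figure across all VMs; a host's slot count is limited
# by whichever resource runs out first. Simplified for illustration.

vms   = [(4, 2), (2, 4), (1, 6)]      # (GHz, GB) per VM: VM1..VM3
hosts = [(4, 128), (24, 6), (8, 14)]  # (GHz, GB) per host: Host1..Host3

slot_cpu = max(cpu for cpu, _ in vms)  # 4 GHz
slot_ram = max(ram for _, ram in vms)  # 6 GB

slots = [min(cpu // slot_cpu, ram // slot_ram) for cpu, ram in hosts]
print(slots)      # [1, 1, 2] -- Host1 is CPU-bound, Host2 is RAM-bound

# Admission control assumes the worst case: the host contributing the
# most slots (Host3) fails, leaving too few slots for three VMs.
remaining = sum(slots) - max(slots)
print(remaining)  # 2
```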

Percentage of cluster resources: With this policy enabled, HA ensures that a specified percentage of resources is reserved for failover across all the hosts. It also checks that at least two hosts are available. The calculation happens as follows:

The total resource requirement for all the running VMs is calculated. For example, for three VMs in the previous table, the total requirement will be 7 GHz and 12 GB.

The total available host resources are calculated. For the previous example, the total is 34 GHz and 148 GB.

The current CPU and memory failover capacity for the cluster is calculated as follows:

CPU: (1-7/34)*100%=79%

RAM: (1-12/148)*100%=92%

If the current CPU and memory capacity is less than allowed, the operation is denied.

With hosts as different as those in the example, the CPU and RAM capacity should be configured carefully to avoid a situation where, for example, the host with the largest amount of RAM fails and the other hosts are not able to accommodate all the VMs because of memory constraints. Therefore, RAM should be configured at 87 percent based on the two smallest hosts (Host2 and Host3), and not at roughly 30 percent based simply on the number of hosts in the environment:

[1-(6+14)/148]*100%=87%

In other words, if the host with 128 GB fails, we need to make sure that the total resources needed by the VMs are less than the sum of 6 GB and 14 GB, which is only 13 percent of the total cluster's 148 GB. Therefore, we need to make sure that in all instances, the VMs use only 13 percent of the RAM or that the cluster has 87 percent of RAM that is free.
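Both percentage calculations above can be checked with a short script, using the same example numbers. The final figure is rounded up to stay conservative, matching the 87 percent quoted in the text.

```python
import math

# Percentage-of-cluster-resources arithmetic from the example above.

vm_cpu, vm_ram = 7, 12    # total demand of the running VMs (GHz, GB)
cl_cpu, cl_ram = 34, 148  # total cluster capacity (GHz, GB)

cpu_capacity = (1 - vm_cpu / cl_cpu) * 100
ram_capacity = (1 - vm_ram / cl_ram) * 100
print(round(cpu_capacity), round(ram_capacity))  # 79 92

# Worst case for RAM: the 128 GB host fails, leaving 6 + 14 = 20 GB.
# Rounded up to keep the reservation conservative.
reserve = math.ceil((1 - (6 + 14) / cl_ram) * 100)
print(reserve)  # 87
```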

Specified failover hosts: With this policy enabled, HA keeps the chosen failover hosts reserved, doesn't allow the powering on or migrating of any VMs to this host, and restarts VMs on this host only when failure occurs. If for some reason, it's not possible to use a designated failover host to restart the VMs, HA will restart them on other available hosts.

It is recommended to use the Percentage of cluster resources reserved option in most cases. This option offers more flexibility in terms of host and VM sizing than other options.

HA security and logging

vSphere HA configuration files for each host are stored on the host's local storage and are protected by the filesystem permissions. These files are only available to the root user.

For security reasons, ESXi 5 hosts log HA activity only to syslog. Therefore, logs are placed at the location where syslog is configured to keep them. Log entries related to HA are prefixed with fdm, which stands for fault domain manager; this is the name of the vSphere HA service on the host.

Older versions of ESXi write HA activity to fdm logfiles in /var/log/vmware/fdm stored on the local disk. There is also an option to enable syslog logging on these hosts. Older ESX hosts are able to save HA activity only in the fdm local logfile in /var/log/vmware/.

Setting the das.config.log.maxFileNum option causes ESXi 5 hosts to maintain two copies of the logfiles: one created by the version 5.0 logging mechanism, and the other maintained by the pre-5.0 logging mechanism. After any of these options are changed, HA needs to be reconfigured.

The following table provides log capacity recommendations according to VMware for environments of different sizes based on the requirement to keep one week of history:

Size                                       Minimum log capacity per host (MB)
40 VMs in total with 8 VMs per host        4
375 VMs in total with 25 VMs per host      35
1,280 VMs in total with 40 VMs per host    120
3,000 VMs in total with 512 VMs per host   300

These are just recommendations; additional capacity may be needed depending on the environment. Increasing the log capacity involves specifying the number of rotations together with the file size as well as making sure there is enough space on the storage resource where the logfiles are kept.

The vCenter server uses the vpxuser account to connect to the HA agents. When HA is enabled for the first time, vCenter creates this account with a random password and makes sure the password is changed periodically. The time period for a password change is controlled by the VirtualCenter.VimPasswordExpirationInDays parameter that can be set under the Advanced Settings option in vCenter.

All communication between vCenter and HA agents, as well as agent-to-agent traffic, is secured with SSL. Therefore, for vSphere HA, it's necessary that each host has verified SSL certificates. New certificates require HA to be reconfigured. It will also be reconfigured automatically if a host has been disconnected before the certificate is replaced.

SSL certificates are also used to verify election messages so if there is a rogue agent running, it will only be able to affect the host it's running on. This issue, if it occurs, is reported to the administrator.

HA uses TCP/8182 and UDP/8182 ports for communication between agents. These ports are opened and closed automatically by the host's firewall. This helps to ensure that these ports are open only when they are needed.

Using HA with DRS

When vSphere HA restarts VMs on a different host after a failure, the main priority is the immediate availability of VMs. Based on CPU and memory reservations, HA determines which host to use to power the VMs on. This decision is based, of course, on the available capacity of the host. It's quite possible that after all the VMs have been restarted, some hosts become highly loaded while others are relatively lightly loaded.

DRS is the load balancing solution that can be enabled in vCenter for better host resource management.

vSphere HA, together with DRS, is able to deliver automatic failover and load balancing solutions, which may result in a more balanced cluster. However, there are a few things to consider when it comes to using both features together.

In a cluster with DRS, HA, and admission control enabled, VMs may not be automatically evacuated from a host entering maintenance mode. This happens because of resources reserved for VMs that would need to be restarted after a failure. In this case, the administrator needs to migrate these VMs manually.

Some VMs may not fail over because of resource constraints. This can happen in one of the following cases:

HA admission control is disabled and DPM is enabled, which may result in insufficient capacity to perform a failover, as some hosts may be in standby mode and therefore fewer hosts are available.

VM to host affinity rules limit hosts where certain VMs can be placed.

Total resources are sufficient but fragmented across multiple hosts. In this case, these resources can't be used by the VMs for failover.

DPM is in the manual mode that requires an administrator's confirmation before a host can be powered on from the standby mode.

DRS is in the manual mode, and an administrator's confirmation may be needed so that the migration of VMs can be started.

What to expect when HA is enabled

HA only restarts a VM if there is a host failure. In other words, it powers on all the VMs that were running on the failed host on another member of the cluster. Therefore, even with HA enabled, there is still a short downtime for the VMs that were running on the faulty host. In fast environments, however, the VM reboot happens quickly, so if you are using some kind of monitoring system, it may not even trigger an alarm. Therefore, if a bunch of VMs have rebooted unexpectedly, you know there was an issue with one of the hosts and can review the logs to find out what it was.

Of course, if you have set up vCenter notifications, you should get an alert.

If you need VMs to be up all the time even if the host goes down, there is another feature that can be enabled called Fault Tolerance.

Fault Tolerance

It's fair to say that vSphere HA provides only a basic level of protection; in the event of a host failure, it restarts VMs. vSphere FT, however, provides a higher level of availability and protects VMs from host failure without any downtime, any loss of data, or connection interruptions.

FT can be enabled for critical VMs. Continuous availability is provided by creating and maintaining a secondary VM that is an exact copy of the primary one. The primary and secondary machines exchange heartbeats, which allows them to monitor each other's status.

If the host where the primary VM is running fails, the secondary VM is activated within a few seconds to replace it, and a transparent failover occurs.
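The heartbeat-based monitoring can be sketched as follows. This is a minimal, self-contained model of one peer watching the other, loosely inspired by the FT pair's mutual monitoring; the timeout value and the class itself are illustrative, not VMware's implementation:

```python
# Minimal sketch of heartbeat-based peer monitoring, loosely modeled on
# how the primary and secondary FT VMs watch each other. The timeout
# value and class are illustrative, not VMware's implementation.
import time

HEARTBEAT_TIMEOUT = 3.0  # seconds without a heartbeat before failover


class PeerMonitor:
    def __init__(self, now=time.monotonic):
        self.now = now                     # injectable clock for testing
        self.last_heartbeat = now()

    def heartbeat_received(self):
        """Called each time a heartbeat arrives from the peer."""
        self.last_heartbeat = self.now()

    def peer_alive(self):
        """True while heartbeats keep arriving within the timeout."""
        return (self.now() - self.last_heartbeat) < HEARTBEAT_TIMEOUT
```

In an FT-like design, the secondary would promote itself as soon as `peer_alive()` turns false, which is the "activated within a few seconds" behavior described above.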

To ensure that both VMs are exactly the same, VMware uses the vLockstep technology. vLockstep executes identical sequences of x86 instructions on both machines. The primary VM records all the events taking place between the processor and the virtual I/O devices and sends them to the secondary VM, which replays them in exactly the same way; only the primary VM, however, executes the actual workload. Therefore, failover from the primary VM to the secondary VM happens seamlessly, without any loss of existing network connections or in-progress transactions. The whole process is transparent, fully automated, and doesn't require the vCenter Server to be available.

Obviously, the primary and secondary VMs are not allowed to run on the same host. When a primary VM is powered on, an anti-affinity check takes place, and the secondary VM is placed on another host.

Logging traffic, by default, is unencrypted and contains all of the network and I/O data. This traffic can contain sensitive data such as passwords. Therefore, it's important to make sure that this network is secure to avoid, for example, man-in-the-middle attacks. Best practice is to make this network private.

Preparing hosts and VMs

Hosts and VMs need to be configured correctly before FT is enabled.

The following requirements should be considered:

The cluster requirements for FT are as follows:

All the hosts share the same datastores and network.

The vSphere HA cluster has been created and enabled. For more details about creating a HA cluster, please see the previous section about vSphere HA.

FT logging and vMotion are configured (see the host requirements for details).

Host certificate checking is enabled. Certificate checking can be enabled under the SSL Settings section on the vCenter Server Settings page as shown in the following screenshot:

At least two FT-certified hosts that have the same build number or are running the same FT version should be present. It's better, however, to use three hosts: if the primary VM fails on one host, the secondary VM becomes primary and a new secondary VM is created on the third host. The FT version can be checked on a host's Summary tab in vCenter.

For hosts older than ESX/ESXi version 4.1, the Summary tab shows the host's build number instead. It is not recommended to combine ESX and ESXi hosts in an FT pair.

The host requirements for FT are as follows:

Hardware Virtualization (HV) is enabled in the host's BIOS.

The host's CPUs belong to an FT-compatible processor group. It is also recommended to use host processors that are compatible with one another. The list of supported CPUs can be found in the VMware knowledge base.

Hosts are licensed for FT.

Hosts are certified for FT. This can also be verified in VMware's compatibility lists. To confirm a host's ability to support FT, you can also use vCenter's compliance checker in the following manner:

Select the cluster in the vCenter inventory and go to the Profile Compliance tab.

Click on Check Compliance Now, which will run the compliance tests.

To view the running tests, click on Description. The compliance status appears at the bottom of the screen. A host is labeled as either Compliant or Noncompliant.

When a host is not compliant for FT, the reasons for this can be viewed on the Summary tab for each host in the vSphere Client. Click on the blue caption icon next to the Host Configured for FT field as shown in the following screenshot:

Each host in the cluster must be configured with two different network switches to support FT. Therefore, a minimum of two physical gigabit network adapters are required on each host; VMware, however, recommends 10 GbE adapters. The vMotion and FT logging network cards must be connected to different subnets.
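These networking prerequisites can be expressed as a simple validation routine. The sketch below is a hedged illustration of the checks just described (at least two physical NICs, and vMotion and FT logging on different subnets); the function and its parameters are hypothetical, not part of any VMware API:

```python
# Hedged sketch of the FT networking prerequisites described above:
# at least two physical NICs, and vMotion and FT-logging traffic on
# different subnets. Function and parameter names are illustrative,
# not a real VMware API.
import ipaddress

def check_ft_networking(nics, vmotion_cidr, ft_logging_cidr):
    """nics: list of physical NIC names, e.g. ["vmnic0", "vmnic1"]
    vmotion_cidr / ft_logging_cidr: subnets, e.g. "10.0.1.0/24".
    Returns a list of problems; an empty list means the checks pass."""
    problems = []
    if len(nics) < 2:
        problems.append("at least two physical NICs are required")
    if ipaddress.ip_network(vmotion_cidr) == ipaddress.ip_network(ft_logging_cidr):
        problems.append("vMotion and FT logging must use different subnets")
    return problems
```

For example, a host with a single NIC carrying both traffic types on one subnet would fail both checks, while two NICs on separate subnets would pass.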

IPv6 is not yet supported on the FT logging NIC.

The VM requirements for FT are as follows:

All the devices attached to the VM are supported, and all the features it uses are compatible.

The following list gives the features and devices that are not supported, along with the corrective action for each:

Snapshots: Remove all the snapshots to enable FT, or disable FT to create a snapshot.

Storage vMotion: Turn off FT to migrate a VM to a different storage.

Linked clones: It's not possible to enable FT on a linked clone or to create a linked clone from a VM that has FT enabled on it.

VM backup: Disable FT, or use backup solutions that do not require snapshot creation.

You will need additional products such as VMware View Composer to be able to create linked clones. Therefore, this type of clone is out of the scope of this article.

Configuring FT

To enable FT on a VM, right-click on the VM in vCenter, go to Fault Tolerance, and choose Turn On Fault Tolerance as shown in the following screenshot:

This VM will become the primary VM, and the secondary VM will be created on a different host.

Unfortunately, there is no option to turn on FT for multiple VMs at once; when multiple VMs are selected, the FT option is unavailable. FT has to be enabled on each VM separately.

When FT is on, vCenter removes the VM's memory limit and sets the memory reservation to be the same as the memory size of the VM. Therefore, the memory reservation size, limit, and shares on the VM can't be changed while FT is on.

The FT state for a particular VM can be viewed in vCenter under the Summary tab in the Fault Tolerance section for a primary VM. Under Fault Tolerance Status, there is an indication whether the VM is protected or not protected.

When the status is Protected, both the primary and secondary VMs are powered on and running. Not protected means that, for some reason, the secondary VM is not running.

Under the Fault Tolerance tab, you will also be able to see the location of the secondary VM, its CPU, and memory, as well as vLockstep Interval and Log Bandwidth.

vLockstep Interval is the delay before changes in the primary VM are replicated to the secondary one; typically, this delay is a fraction of a second. Log Bandwidth is the network capacity being used to transfer the change data from the primary to the secondary VM.

FT can be disabled from the same menu when you right-click on the VM. However, it can be disabled only from the primary VM. When FT is used together with HA, HA detects the use of FT and is able to ensure proper operation.

In the case of host isolation, isolation responses are not performed on FT-enabled VMs. The primary and secondary VMs are already communicating with each other, so they will either keep functioning if there is network connectivity or fail over if connectivity and/or the heartbeat is lost.

If a host partition occurs, HA restarts the secondary VM only if the primary VM is running in the same partition as the master HA agent; otherwise, the secondary VM is not restarted until the partition is resolved.

Using FT with DRS

When FT is used together with DRS, the behavior will be different depending on the EVC settings.

If EVC is not enabled, fault-tolerant VMs have their DRS automation level set to disabled. In this case, the primary VM is powered on only on its registered host, the secondary VM is automatically placed, and neither of them is moved for load-balancing reasons.

Enabling EVC allows fault-tolerant VMs to be placed by DRS and included in the cluster's load-balancing calculations. By default, DRS does not place more than four FT VMs (primary or secondary) on a single host. This limit can be changed in Advanced Options by adjusting the das.maxftvmsperhost parameter. When it is set to 0, this option is ignored by DRS.
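The das.maxftvmsperhost behavior reduces to a simple admission check, sketched below. This is a toy model of the rule as described above, not VMware's actual placement code:

```python
# Toy model of the das.maxftvmsperhost limit described above: DRS does
# not place more FT VMs (primary or secondary) on a host than the limit
# allows, and a value of 0 disables the check entirely. Not VMware's
# actual placement code.

def can_place_ft_vm(ft_vms_on_host, das_maxftvmsperhost=4):
    """Return True if one more FT VM may be placed on this host."""
    if das_maxftvmsperhost == 0:   # 0 means the limit is ignored
        return True
    return ft_vms_on_host < das_maxftvmsperhost
```

With the default of 4, a host already running four FT VMs is rejected as a placement target, while setting the parameter to 0 removes the limit.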

When DRS is used with affinity rules, VM-to-VM rules apply only to the primary VM, while VM-to-host affinity rules apply to both the primary and secondary VMs. A VM-to-VM rule set for the primary VM applies to the secondary one after a failover, that is, after the secondary VM becomes primary.

Summary

Any downtime, either planned or unexpected, means financial losses. With the virtualization technologies available today, it becomes easier to provide higher levels of availability for environments where they are needed. vSphere HA is a feature that allows a group of hosts connected together to provide high levels of availability for VMs running on these hosts.

Admission control ensures that sufficient resources are available for failover protection and that VM resource reservations are honored. It is available for hosts, resource pools, and vSphere HA.

HA offers only a basic level of protection; in the event of a host failure, it restarts the VMs. vSphere FT provides a higher level of availability and protects VMs from host failure without any downtime, or any loss of data, transactions, or connections.
