Tuesday, 3 April 2012

How to Backup Fault Tolerant VMs in vSphere 4

A major benefit of deploying VMware's vSphere 4 is the additional options it
offers you for business continuity and disaster recovery, such as virtual
machine level backups and high availability features. vSphere 4 Essentials Plus,
Advanced, and Enterprise editions all include VMware Data Recovery.

A more detailed explanation of the potential of vSphere based DR solutions is
beyond the scope of this article, but all the products utilise VM snapshots to
enable backing up of live VMs without affecting availability:

[Figure: VMware Data Recovery Snapshot Based Backup]

Another major feature in vSphere 4 Advanced and higher editions is Fault
Tolerance, which is intended to eliminate VM downtime in the event of a host
server failure by creating a live shadow instance of the VM on another host and
keeping them in "lockstep" synchronisation. The effect is similar to clustering,
but since it operates at the hypervisor level it does not require any special
features at the VM software level. Again, it is beyond the scope of this article
to discuss the advantages and disadvantages of this technology. The important
thing to note here is that FT enabled VMs cannot be snapshotted.

The Problem with Backing Up FT enabled VMs

As we have already seen, all the vSphere based VM backup solutions rely on
snapshot technology to image live VMs, but Fault Tolerant VMs cannot be
snapshotted which therefore precludes backing them up. Unfortunately, it seems
that the first time many users discover this is when trying to run their first
backup of a FT VM. Depending on their backup application, it will either not
allow them to create the backup job in the first place, or the job will fail
shortly after starting.

It seems that this situation was exacerbated by conflicting information from
VMware when vSphere 4 was originally released. At one point they said that Fault
Tolerant VMs would be allowed a single snapshot for backup purposes, but this
feature was not included in the released version of vSphere 4.0. Subsequently
they implied that it would be enabled in a later update, but although vSphere
4.1 included some major improvements to FT, it appeared that they had given up
on the single snapshot feature, at least until the next major version release.

The Solution

To be completely accurate, it is still not possible to back up FT enabled VMs.
What is now the commonly accepted method is in fact a workaround, as it involves
turning off FT in order to allow a snapshot to be created, and then turning it
back on once the backup has completed. This means that for the duration of the
backup, the Virtual Machine will no longer have the benefit of FT protection,
which may be a problem for some readers. If that is the case, then unfortunately
you will have to look at alternative backup methods, i.e. backups that run
either inside the VM or at the SAN level.

It is quite simple to test the process manually in order to establish that
your backup application will be able to image an FT VM; you just need to turn
off Fault Tolerance for that VM. Note that you have to "turn off" rather than
"disable", otherwise it still won't allow a snapshot to be created.

Turning Off Fault Tolerance

In your vSphere Client, right-click on the FT VM and select
"Fault Tolerance", then "Turn Off Fault
Tolerance" from the sub-menu. The "Disable Fault Tolerance" option will
only stop the lockstep synchronisation, but leaves the secondary VM in place.
"Turn Off" will actually disable lockstep sync and then remove the secondary VM
completely, leaving you with a normal VM that can be snapshotted as usual.

Now start your backup running. If you watch the "Tasks" section at the bottom
of the vSphere Client, you should see the snapshot being created by the backup
application. Once the backup has completed, the snapshot should be removed.
Once that is done you can right-click the VM again and select "Turn On Fault
Tolerance". This might take a few minutes as it has to create a secondary VM
and bring it into lockstep sync with the primary, exactly the same process as
when you originally enabled FT.

This process obviously isn't a practical solution for regular scheduled
backups. However, it does demonstrate the steps required to allow backups of any
Fault Tolerant Virtual Machine, and it will give you an idea of the potential
hazards involved. The main problem has already been mentioned; the VM will not
be FT protected during the backup process so a host hardware problem would cause
downtime, and an error whilst turning FT on or off could leave the VM
unprotected, requiring manual intervention. On a more positive note though, in
my experience such problems are very rare and are unlikely to cause downtime by
themselves, and whilst the VM lacks FT protection, it will still have vSphere
High Availability. Therefore, should the host fail, the VM will be started on
another host. It will have undergone a "dirty shutdown" so there may be some
data loss or even corruption and a short period of downtime, all of which
illustrates quite neatly why Fault Tolerance was an attractive option in the
first place!

Automating the Procedure

Fortunately vSphere has comprehensive scripting support, allowing for the
automation of any process achievable via the vSphere Client GUI, so we can use
this to turn Fault Tolerance off and on when required. vSphere CLI scripts are
written in Perl, but don't worry if you have no experience of using it. The
instructions below will show you how to implement it. However if you are using
a third party
application such as Veeam and have an active support agreement with the vendor,
then you should contact them first to see if they have their own solution.

To use any scripts you first of all need to install the vSphere CLI
(Command Line Interface), which you can download from the
VMware website. Browse to the "Downloads" section and
select vSphere 4, then click the "Drivers & Tools"
tab. Expand the "Automation Tools & SDKs" section,
then download the version of the vSphere CLI that matches your
vCenter installation. Note that you want the standard CLI, not
the "PowerCLI" (the PowerCLI has similar functionality but integrates with
Windows PowerShell).

Once you have downloaded the CLI, run the file to install
it. In theory, you can run the CLI on any Windows system with a network
connection to your vCenter Server, but in practice it will be much easier to
install it on the same system as your VM backup application. For these
instructions I have assumed you will install it to the default recommended
folder location, but if you choose a custom folder then just change subsequent
file paths appropriately.

Now download the FTcli2.pl script file from http://communities.vmware.com/docs/DOC-10279
and save it to the C:\Program Files (x86)\VMware\VMware vSphere
CLI\perl\bin folder.

If you open the Start - Programs - VMware menu you should see an entry for
the "VMware vSphere CLI", with just a Command prompt icon in it. Clicking this
will indeed just open a standard command prompt, but with the location changed
to the vSphere CLI installation folder. At this point, to simplify things in
future, I would recommend adding the C:\Program Files
(x86)\VMware\VMware vSphere CLI\perl\bin folder to the system PATH
environment variable.
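If you want to confirm that the folder really has been added, the check can be sketched in a few lines of Python. This is only an illustration and not part of the vSphere CLI: Python itself is an assumption (the article otherwise uses only the Command Prompt), and the function name is_on_path is made up for the example.

```python
import ntpath
import os

# Folder named in the article (default vSphere CLI install location)
CLI_BIN = r"C:\Program Files (x86)\VMware\VMware vSphere CLI\perl\bin"


def is_on_path(folder, path_value=None, sep=";"):
    """Return True if folder appears as an entry in a Windows PATH string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # Windows paths are case-insensitive, so normalise both sides first
    wanted = ntpath.normcase(ntpath.normpath(folder))
    for entry in path_value.split(sep):
        entry = entry.strip().strip('"')
        if entry and ntpath.normcase(ntpath.normpath(entry)) == wanted:
            return True
    return False
```

Calling is_on_path(CLI_BIN) from a newly opened Command Prompt session checks the PATH that session actually inherited, which matters because windows opened before the change keep the old value.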

With the path set, you should now find you can execute your scripts from a
standard Command Prompt. Try entering FTcli2.pl /? in order to
see the online help listing all the options for this script. You will see that
you can specify explicit or passthrough authentication. We will assume you are
running it on your vCenter Server with sufficient privileges to use
passthrough.

Now we need to test running the script with the required options to first
turn off FT, and then to turn it back on again. It would be a good idea to
create a test FT VM for testing these procedures rather than using a live
production VM, just in case. The command to turn off FT should be something like
this:

    FTcli2.pl --server vcenter.domain.local --passthroughauth --operation stop --vmname MyTestFTvm

where vcenter.domain.local is the FQDN (or IP) of your vCenter Server. Enter
that command at the prompt and run it. It should return some progress
information, you should see the task appear in the vSphere Client, and Fault
Tolerance will be turned off for that VM.

In the event that the script fails to turn off FT, then the output in the
Command Prompt window, or the Task Status in the vSphere Client will usually
give a good indication of the cause of the problem. You may also add the
--verbose option to the command which should make it return
more detailed error messages.

The command to turn on FT should be identical to the turn off command,
except with --operation create instead, so now you should be
able to make a test VM Fault Tolerant and then remove FT again afterwards.
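Because the two commands differ only in the --operation value, they are easy to wrap in a small helper. The following is a minimal Python sketch rather than anything supplied with the vSphere CLI. It uses only the options quoted above; the function names are hypothetical, and it assumes that perl.exe is on the PATH, that the script sits in the default folder, and that the script exits with a non-zero status on failure (which you should verify in testing).

```python
import subprocess

# Assumed location: the default folder the article saves FTcli2.pl into
SCRIPT = r"C:\Program Files (x86)\VMware\VMware vSphere CLI\perl\bin\FTcli2.pl"

# "stop" turns FT off, "create" turns it on (as described in the article)
VALID_OPERATIONS = ("stop", "create")


def build_ft_command(server, vm_name, operation, script=SCRIPT):
    """Return the argument list for one FTcli2.pl invocation."""
    if operation not in VALID_OPERATIONS:
        raise ValueError("operation must be 'stop' or 'create'")
    # Assumption: perl.exe is reachable via PATH
    return [
        "perl", script,
        "--server", server,
        "--passthroughauth",
        "--operation", operation,
        "--vmname", vm_name,
    ]


def run_ft_command(server, vm_name, operation):
    """Run the script; True means it exited with status 0 (assumed success)."""
    result = subprocess.run(build_ft_command(server, vm_name, operation))
    return result.returncode == 0
```

For example, run_ft_command("vcenter.domain.local", "MyTestFTvm", "stop") would issue the same turn off command as the one typed at the prompt earlier.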

In order to use these new script commands effectively, they need to be
coordinated with the backup application. To facilitate this, you should create
two batch files. Open Notepad and enter your commands, starting each separate
command on a new line: first a cd command to change to the vSphere CLI perl\bin
folder, then the FTcli2.pl command you tested above.

The cd C:\Program Files (x86)\.... line should not be
necessary if you have already added the folder location to your default system
paths list, but it won't do any harm to include it anyway. This example is for
turning on FT as it contains the --operation create
option.

Now save the file to a suitable location, making sure you change the
"Save as type" to "All files" and
include a .bat extension at the end of the file name. This
tells Windows that it is a batch file, so the commands in it will be
executed when it is run.

Repeat the batch file steps above, but replace create with stop,
in order to create a batch file to turn off FT for your VM.

Now you have your two batch files. In the future, all you should have to do
is change the --vmname MyTestFTvm option to match the name of
your FT VM as shown in the vSphere Client.
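If you have several FT VMs, editing the batch files by hand soon becomes tedious, so generating them is one option. The sketch below is a hedged illustration only: the file names (for example MyTestFTvm_ft_off.bat) and function names are invented for this example, the server name is the placeholder used earlier, and it assumes VM names without spaces so that no quoting is needed.

```python
import os

# Default vSphere CLI folder, as used throughout the article
CLI_BIN = r"C:\Program Files (x86)\VMware\VMware vSphere CLI\perl\bin"


def batch_file_text(vm_name, operation, server="vcenter.domain.local"):
    """Return the two-line batch file content for one VM and operation."""
    lines = [
        # /d also switches drive if the task starts on a different one
        'cd /d "{}"'.format(CLI_BIN),
        "FTcli2.pl --server {} --passthroughauth --operation {} --vmname {}"
        .format(server, operation, vm_name),
    ]
    # Batch files conventionally use Windows (CRLF) line endings
    return "\r\n".join(lines) + "\r\n"


def write_batch_files(vm_name, folder):
    """Write the turn-off and turn-on batch files and return their paths."""
    paths = []
    for operation, suffix in (("stop", "ft_off"), ("create", "ft_on")):
        path = os.path.join(folder, "{}_{}.bat".format(vm_name, suffix))
        # newline="" stops Python translating the CRLF endings again
        with open(path, "w", newline="") as handle:
            handle.write(batch_file_text(vm_name, operation))
        paths.append(path)
    return paths
```

Calling write_batch_files once per VM name gives you a matching pair of files to reference from the scheduled task and the backup job.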

Scheduling your Fault Tolerant VM Backups

The first step for every scheduled FT VM backup
needs to be turning off Fault Tolerance, so you can use the Windows Task
Scheduler to create a scheduled task to run your batch file at the appropriate
time. The Windows Task Scheduler in Windows Server 2008 is quite different to
use from the old Windows Server 2003 version.

Note that when you create your task, you can specify
what Windows user account it should run under. If you are using the passthrough
authentication option, it is essential that you specify an account that has
sufficient rights on your vCenter Server to change the Fault Tolerance settings
for the VM. Configure the task to run at a suitable time and frequency for your
backup schedule.

Next you need to create a backup job for your FT
VM, just like any other VM backup, but you should schedule it to run at a
suitable period after the "Turn Off FT" task to ensure it has time to complete
that before starting the backup, otherwise it will fail. Usually I find allowing
a delay of 15 minutes is ample, but you should be able to confirm what is best
for your system with some testing. Setting the time for reactivating Fault
Tolerance is harder because the duration of the backup job may be quite variable
from day to day - set it too early and the task will fail, whilst making it too
late will leave your VM unprotected for longer than necessary. The best option,
which most backup applications support, is to use the option in the backup job
properties to run a command after the job has completed, and to point that
command at your "turn on FT" batch file.
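The ordering problem described above (turn FT off, back up, then turn FT back on however long the backup takes) can also be expressed as a single wrapper. This is a rough Python sketch of the sequencing only, not a feature of any backup product. The function and its three parameters are hypothetical; each is simply a callable that you would supply, for instance one that launches a batch file.

```python
def backup_with_ft_toggle(turn_off_ft, run_backup, turn_on_ft):
    """Turn FT off, run the backup, then always try to turn FT back on."""
    # If this step raises, nothing else runs: the backup would fail anyway
    turn_off_ft()
    try:
        run_backup()
    finally:
        # Runs whether the backup succeeded or raised, so the VM is not
        # left without FT protection just because the backup job failed.
        turn_on_ft()
```

The try/finally means there is no fixed delay to guess. One caveat is that if a failed backup leaves its snapshot behind, turning FT back on is likely to fail until the snapshot is removed, so the result of that last step should still be checked.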

Conclusion

Although it sounds like a complex and significant set of tasks to be running
on a perhaps nightly basis, it does in fact usually turn out to be a reliable
procedure once it is set up and the schedules are established. The main
disadvantage has already been highlighted: Fault Tolerance protection has to be
turned off in order to back up the VM, which increases the risk of downtime
during the backup window. Depending on the role of the VM in question this may
or may not be an issue, but with some planning it should be possible to minimise
the risk, for example by combining a weekly VM level backup with a daily OS
level backup.