Unifying backup and replication for VMware

Posted on February 03, 2010

With support for all of the new features in vSphere 4, Veeam Backup & Replication brings near-CDP levels of data protection to virtual machines.

By Jack Fegreus, openBench Labs

Adding to the complexity of today's data center operations are the many regulatory mandates to secure and maintain critical business data at a granular file level. However, file-level compliance mandates can also complicate another important data center initiative: server virtualization.

A key characteristic of a virtual environment is the encapsulation of VM logical disk volumes as single physical disk files. This representation makes image-level backups faster than traditional file-level backups and enhances restoration as virtual disks can be restored as either a whole disk image or as individual files. That's why general-purpose backup packages integrate with VMware Consolidated Backup (VCB) to provide image-based backup.

Veeam Backup & Replication software is a dedicated image-level data protection package that unifies backup and replication in a VMware environment and easily integrates with existing general-purpose backup packages. Veeam provides IT with a data protection application specifically designed to deal with the complex data protection issues found in a sophisticated VMware vSphere 4 environment supported by host servers running either the ESX or ESXi hypervisor.

Veeam Backup & Replication can directly leverage the new vStorage APIs in a vSphere 4 environment to improve performance without VCB integration. That means SMBs without a SAN in place can leverage the ESXi hypervisor with minimal investment. More importantly, Veeam Backup & Replication software extends disk imaging operations to include on-site or off-site replication for rapid recovery. Veeam Backup & Replication can also utilize the new Changed-Block Tracking feature of VMFS to accelerate incremental backup and replication, recognize virtual disks with thin provisioning, and leverage hot-add for virtual disks, when deployed within a VM.

What's more, when conducting the business impact analysis required for any IT disaster recovery plan, Veeam Backup & Replication helps balance two important objectives: the Recovery Time Objective (RTO), which is the maximum period of time that recovery may take, and the Recovery Point Objective (RPO), which is the maximum amount of data, measured in time prior to the disruption, that could be lost in the recovery process.

To provide IT with a better RPO, Veeam Backup & Replication will by default ensure transaction consistency with either VCB or the vStorage API. In addition, a backup administrator can configure Veeam Backup & Replication to use the Windows Volume Shadow Copy Service (VSS) to ensure transaction consistency for VSS-aware applications running within a VM, including Active Directory, MS Exchange, or SQL Server.

Veeam Backup & Replication fully supports the VMware ESXi hypervisor, which many sites are now bringing into production as more servers are bundled with ESXi firmware. Nonetheless, the use of ESXi hosts can complicate backup and recovery processes as the architecture of ESXi differs significantly from ESX. To keep the ESXi hypervisor as compact as possible in order to be able to embed this hypervisor in firmware, VMware does not implement a service console in ESXi. Typically data protection applications, such as backup and replication, utilize the service console in ESX.

Veeam Backup & Replication uses VMware Application Programming Interfaces (APIs) to access ESXi remotely and enable backup and restore of VMs running on ESXi servers over the network. With the introduction of Veeam Backup & Replication 4.1, system administrators can also replicate VMs to, as well as from, an ESXi host. As a result, Veeam Backup & Replication simplifies sophisticated IT backup and recovery procedures in virtual environments that include both ESX and ESXi hosts—a capability that is precisely what IT needs to meet Service Level Agreements (SLAs) with aggressive RTO and RPO requirements.

VMs and synthetic backup

To assess the ability of Veeam Backup & Replication 4.1 to simplify backups and enhance IT operations for disaster recovery, openBench Labs hosted a VMware vSphere 4 environment using two servers: One server ran the ESX hypervisor and the other ran ESXi. All hypervisor datastores were created on a Xiotech Emprise 5000 disk array, and each host shared SAN access to every datastore. On a third system, we ran Windows Server 2003 and installed Veeam Backup & Replication 4.1 with shared SAN access to all of the logical volumes used by the two VMware hosts. We also installed VCB on that server to measure any performance differences between integration via the vStorage API and VCB.

For testing performance and functionality of backup and replication processes, we set up two groups of server VMs: Eight VMs ran Windows Server 2003 on the ESX host and two dual-processor VMs ran Windows Server 2008 on the ESXi host. All VMs were configured as application servers running SQL Server and IIS.

We utilized thin provisioning on each VM system drive, which was virtualized as a vmdk file in a VMware datastore. We stored all work files on Raw Device Mapping (RDM) disks, which were configured in virtual compatibility mode. An RDM volume is a logical disk volume that is formatted by the VM with a native operating system (OS) file system—NTFS in our tests. That makes an RDM volume directly accessible by other physical, as well as virtual, systems running the same OS. Through the creation of an optional vmdk mapping file by the VMware host, an RDM volume remains manageable through VMFS; however, virtual compatibility does not extend to the implementation of Changed-Block Tracking.

To lower the amount of disk space required to store backup images, Veeam Backup & Replication 4.1 provides options for data compression, job-level inline data deduplication, and synthetic backup. Within a job policy, a system administrator is able to configure the level of data compression and set the implementation of data deduplication for that job. It's important to note that Veeam Backup & Replication only applies data deduplication within a job: There is no global store of unique blocks. That means two independent backups of the same VM will create identical backup image files.

That deduplication scheme provides a number of interesting performance and functionality tradeoffs. First, the processing overhead for inline data deduplication is dramatically reduced. What's more, a Veeam Backup & Replication image for a job with data deduplication can be transferred without data re-hydration to any off-line media and restored on any server running Veeam Backup & Replication software. More importantly, the software's data deduplication scheme is transparent when integrating Veeam Backup & Replication with a general-purpose enterprise backup package. Integration with an external package is simplified via a menu option to execute an external command or trigger a batch processing script at the end of the Veeam Backup & Replication process.

Also dramatically reducing storage requirements and the time needed for a backup window is Veeam's automated synthetic backup technology, which administrators can apply when backing up or replicating VMs. Using Veeam's automated synthetic backup, a full backup only needs to run once. After an initial full backup, subsequent backups default to incremental, which can be made more efficient by leveraging the new Changed-Block Tracking mechanism in VMware hypervisors.

Unlike traditional incremental backup schemes, however, Veeam automatically writes the incremental data into the existing backup file (.vbk) to create a new synthetic full backup and then generates a reversed incremental rollback file (.vrb) containing the original data overwritten in the previous full backup. As a result, an administrator can immediately restore the most recent backup without extra processing. If a backup from an earlier time period is needed, the Veeam software will automatically invoke the proper rollback files. What's more, IT administrators can insert a physical full backup into the backup job rotation at any time in order to comply with any local IT security policy. When this is done, Veeam resets the chain of rollback files to use the new full backup.
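The reversed incremental scheme described above can be illustrated with a minimal Python sketch. It is a toy model rather than Veeam's implementation: the block-map representation and the function names `apply_reverse_incremental` and `restore_point` are assumptions made for illustration, with a dictionary standing in for the .vbk file and a list of dictionaries standing in for the chain of .vrb files.

```python
from typing import Dict, List, Optional

Blocks = Dict[int, bytes]               # block index -> contents (stands in for a .vbk)
Rollback = Dict[int, Optional[bytes]]   # overwritten contents (stands in for a .vrb)


def apply_reverse_incremental(full: Blocks, changed: Blocks) -> Rollback:
    """Fold changed blocks into the full image and return the data they replaced."""
    rollback: Rollback = {}
    for index, new_data in changed.items():
        # None records that the block did not exist before this increment.
        rollback[index] = full.get(index)
        full[index] = new_data
    return rollback


def restore_point(full: Blocks, rollbacks: List[Rollback], steps_back: int) -> Blocks:
    """Rebuild an older image by undoing the newest `steps_back` increments."""
    if not 0 <= steps_back <= len(rollbacks):
        raise ValueError("steps_back must be between 0 and the number of rollbacks")
    image = dict(full)  # steps_back == 0 returns a copy of the latest image
    start = len(rollbacks) - steps_back
    # Undo the most recent increment first, then work back in time.
    for rollback in reversed(rollbacks[start:]):
        for index, old_data in rollback.items():
            if old_data is None:
                image.pop(index, None)
            else:
                image[index] = old_data
    return image
```

In this model the most recent restore point needs no rollback processing at all, while each older restore point costs one additional pass over a rollback file, applied newest first.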

VM backup automation

To help ensure success within a sophisticated VMware environment—especially a distributed environment with multiple backup servers—Veeam Backup & Replication includes two management graphical user interfaces (GUIs). The two GUIs, Veeam Backup Enterprise Manager and Veeam Backup and FastSCP, allow IT departments to fine tune their own balance of centralized and decentralized management and reporting.

Veeam Backup Enterprise Manager acts as a single management point in the distributed enterprise environment by rolling up all of the activities relating to jobs and VMs across all of the servers running Veeam Backup & Replication. With Veeam Backup Enterprise Manager, IT managers can immediately evaluate what has taken place with respect to data protection processes over the last seven days or the last 24 hours within a VMware environment. What's more, they can clear and resolve issues on any remote server by taking control of backup and replication jobs from a central location.

openBench Labs managed all of the backup and replication tests for our VMware environment using the Veeam Backup and FastSCP GUI. By establishing one connection with our vSphere server, we were able to integrate with all of the ESX and ESXi hosts in our environment. What's more, we were able to create highly flexible job definitions by using VMware container objects, including hosts and datastores to create jobs that would automatically recognize and adjust to changes in VM clients, which is particularly important when vMotion is used at a site for load balancing.

Complementing the centralized management and reporting paradigm of the Veeam Backup Enterprise Manager, the Veeam Backup and FastSCP software provides a local wizard-based management and reporting application for decentralized operations. At the foundation of the Veeam Backup and FastSCP interface are four key constructs: job definitions, sessions that are instances of job definitions, backup image files that are created by job sessions, and replicas of VMs that are created by replication jobs.

In addition, there are five important wizards: Backup, Replication, Restore, VM Copy, and File Copy. All of these wizards provide simple check-off menus to utilize compression, data deduplication, and synthetic backup for improved processing. Another important aspect of these wizards is the ability to choose VMware container objects to define backup and replication clients. Using these objects allows job definitions to automatically adjust to changes in the number and location of VMs in a VMware environment. Using the Veeam Backup and FastSCP wizards, we began our assessment of Veeam Backup & Replication with the job of backing up the eight server VMs that were running on our ESX host server.
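The benefit of defining jobs on container objects can be sketched in a few lines of Python. This is a conceptual illustration only: the inventory dictionary, the host names, and the function `resolve_job_clients` are hypothetical and do not correspond to any Veeam or VMware API. The point is that the list of clients is resolved when the job runs, not when it is defined.

```python
from typing import Dict, Iterable, List


def resolve_job_clients(containers: Iterable[str],
                        inventory: Dict[str, List[str]]) -> List[str]:
    """Return the VMs currently found in the job's containers, without duplicates."""
    clients = set()
    for container in containers:
        # A container that is missing from the inventory contributes no VMs.
        clients.update(inventory.get(container, []))
    return sorted(clients)


# Hypothetical job definition: protect whatever lives on these two hosts.
job_containers = ["esx-host", "esxi-host"]

inventory_before = {"esx-host": ["vm1", "vm2"], "esxi-host": ["vm3"]}
# vm2 is migrated to the other host and a new vm4 is created.
inventory_after = {"esx-host": ["vm1", "vm4"], "esxi-host": ["vm2", "vm3"]}
```

Resolving the same job against `inventory_after` still covers vm2 after its migration and picks up vm4 without any change to the job definition.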

Phat performance with thin provisioning

For all of our tests, we used VMware version 7 VMs configured with thin provisioning. To set up performance baselines, we first ran a series of full backups with different levels of compression, settings for data deduplication, and methods of integration with VMware—vStorage API or VCB. These tests provided an eye-opening perspective on the process of backing up a VM.

We began our assessment by running full backup jobs of individual VMs without utilizing either compression or data deduplication. In addition to providing a perfect baseline for our general assessment, this configuration matches the best profile for IT sites planning to stage Veeam Backup & Replication backup images to another repository with a more sophisticated data deduplication scheme via integration with a general enterprise backup application, such as NetBackup.

We started testing by running backups of single VMs. Our principal metrics were wall clock time and the size of the resulting backup image. Wall clock time replaced throughput as the primary metric because of the manner in which Veeam Backup & Replication handles a backup.

Unlike most competitive backup packages, which utilize VCB to integrate with VMware, Veeam Backup & Replication does not copy the VM files from a datastore on the VMware host to a temporary directory on the Windows server before backing up the files. Veeam reads the VMware data and writes the backup image directly to storage on the Windows server. While NetBackup runs that first step at upwards of 500MBps, the second stage slows significantly as the data is reformatted and saved as a NetBackup backup image.

In addition, Veeam Backup & Replication is able to leverage the Changed-Block Tracking that was introduced in vSphere 4. That makes the perceived throughput of an incremental backup appear to be as high as 500MBps. As a result, wall clock time becomes the best metric for assessing the overall throughput.
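A toy Python model helps show why changed-block tracking inflates perceived throughput. The class `TrackedDisk` and the function `perceived_throughput` are assumptions made for illustration; they model the idea of tracking changed blocks and are not the vSphere interface itself. Perceived throughput is taken here to mean the size of the protected disk divided by elapsed time, which is one plausible reading of the figure quoted above.

```python
from typing import Dict, List, Set


class TrackedDisk:
    """Toy virtual disk that remembers which blocks changed since the last backup."""

    def __init__(self, blocks: List[bytes]) -> None:
        self.blocks = list(blocks)
        self._changed: Set[int] = set()

    def write(self, index: int, data: bytes) -> None:
        self.blocks[index] = data
        self._changed.add(index)

    def incremental_backup(self) -> Dict[int, bytes]:
        """Copy only the changed blocks, then reset the change set."""
        delta = {i: self.blocks[i] for i in sorted(self._changed)}
        self._changed.clear()
        return delta


def perceived_throughput(protected_bytes: float, seconds: float) -> float:
    """Protected disk size divided by elapsed time, whatever was actually read."""
    return protected_bytes / seconds
```

Because the incremental pass touches only the changed blocks, the elapsed time shrinks while the size of the protected disk stays the same, so a rate computed against the whole disk says little about the actual I/O performed. That is why wall clock time is the more honest metric.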

For disk-to-disk (D2D) backups, the size of backup images is a critical metric. Standard backup rotation schemes, such as the popular Grandfather-Father-Son (GFS) scheme for creating and storing daily, weekly, monthly, and annual backups, typically increase storage requirements by a factor of 25 over the original data. The ability to address the issue of storage utilization with compression and deduplication has made these features essential enterprise options.
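To see how a rotation scheme can multiply storage requirements, consider the following arithmetic sketch in Python. The retention counts are purely hypothetical: the article does not state the schedule behind its factor of 25, and the counts below were chosen only because they happen to total 25 retained copies. The sketch also assumes every retained copy is an uncompressed, non-deduplicated full image.

```python
def retained_full_copies(daily: int, weekly: int, monthly: int, annual: int) -> int:
    """Total number of backup copies kept on disk under a GFS-style rotation."""
    return daily + weekly + monthly + annual


def storage_required_gb(original_gb: float, copies: int) -> float:
    """Disk space needed if each retained copy is a full image of the original data."""
    return original_gb * copies


# Hypothetical schedule: 6 dailies, 4 weeklies, 12 monthlies, 3 annuals.
copies = retained_full_copies(daily=6, weekly=4, monthly=12, annual=3)
```

Under these assumptions, 200GB of original data would need `storage_required_gb(200, copies)` gigabytes of backup storage, which is the kind of growth that compression and deduplication are meant to contain.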

Thin provisioning of storage is another important way to increase storage utilization. Thin provisioning uses storage virtualization to present an OS with what appears to be a fully provisioned volume, while real storage capacity is allocated only when consumed. That makes it important for a backup package to recognize the thin provisioning scheme introduced with vSphere.

On our server VMs, we used thin provisioning for the system disk: Typically each disk consumed 8GB of the allotted 25GB. Only the Veeam software recognized this configuration. As a result, Veeam backup images created using either the vStorage API or VCB were less than one third the size of the NetBackup image, and the wall clock time for the NetBackup backup session was six times longer than the Veeam backup session.
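The difference between a backup that recognizes thin provisioning and one that does not can be modeled with a short Python sketch. The `ThinDisk` class and both backup functions are hypothetical illustrations, with one block standing in for 1GB; they do not reflect the on-disk format of a vmdk file or the internals of either backup product.

```python
from typing import Dict, List

ZERO_BLOCK = b"\x00"


class ThinDisk:
    """Toy thin-provisioned disk: space is allocated only when a block is written."""

    def __init__(self, provisioned_blocks: int) -> None:
        self.provisioned_blocks = provisioned_blocks
        self._blocks: Dict[int, bytes] = {}

    def write(self, index: int, data: bytes) -> None:
        if not 0 <= index < self.provisioned_blocks:
            raise IndexError("write beyond provisioned size")
        self._blocks[index] = data

    def read(self, index: int) -> bytes:
        # Unallocated blocks read back as zeros, as the guest OS would see them.
        return self._blocks.get(index, ZERO_BLOCK)

    def allocated(self) -> List[int]:
        return sorted(self._blocks)


def thin_aware_backup(disk: ThinDisk) -> Dict[int, bytes]:
    """Copy only the blocks that were actually allocated."""
    return {i: disk.read(i) for i in disk.allocated()}


def naive_backup(disk: ThinDisk) -> List[bytes]:
    """Copy every provisioned block, including never-written ones."""
    return [disk.read(i) for i in range(disk.provisioned_blocks)]
```

With 25 provisioned blocks and 8 written, the thin-aware copy holds 8 blocks while the naive copy holds all 25, which mirrors the 8GB-of-25GB ratio observed on the test VMs.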

With compression set to its default level of optimal, we tested the effect of data deduplication and compression on a backup of a single VM. In these tests, integration with the vStorage API always provided close to a 27% advantage in reduced wall clock time. More importantly, optimal compression consistently reduced the data written to the backup image by 2 to 1.

In our high-speed Fibre Channel environment, there were no throughput bottlenecks with respect to the volume of data being backed up. As a result, the additional CPU processing to provide a 2-to-1 reduction in data via compression was reflected in a modest increase of 18% in the wall clock time of a full backup. On the other hand, when throughput was constrained by a Gigabit Ethernet connection in a replication process, which requires VM data be transferred to the new host server over a LAN so that the new host can create the proper structures, the reduction in the amount of data transferred reduced wall clock time. Reduced wall clock time is also a characteristic of an iSCSI SAN or a LAN-based backup—made possible by support of the vStorage API.

On the other hand, the Veeam inline data deduplication scheme provided no advantage when applied to a single VM with thin provisioning, since we had eliminated multiple blocks of blank data by reducing disk utilization on the host ESX server via thin provisioning. To leverage deduplication, we needed to change our backup strategy from parallel backup jobs of individual VMs to a single job that backed up all of the system disks—approximately 70GB of data on 200GB of thin-provisioned storage—of our eight VMs sequentially.

When we backed up the system disks on eight server VMs sequentially in a single job, the deduplication module in Veeam Backup & Replication was able to compare data blocks across all eight VMs to lower the storage footprint of a backup image by roughly 20% more than compression alone. Not only did we back up eight VMs in sequence more quickly with Veeam Backup & Replication than eight VMs in parallel with NetBackup, but our full backup of eight VMs using optimal compression and data deduplication also consumed only 30% more space than the backup image of one VM with NetBackup. Increasing compression to the highest level that Veeam supports in combination with data deduplication further reduced the amount of storage consumed by 5%, but increased processing time by 25%.
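Why job-level deduplication pays off only when several similar VMs share a job can be shown with a minimal Python sketch. The hashing approach, the function name `dedup_job`, and the sample data are assumptions for illustration; the article does not describe how Veeam identifies duplicate blocks.

```python
import hashlib
from typing import Dict, List, Tuple


def dedup_job(vm_disks: Dict[str, List[bytes]]) -> Tuple[Dict[str, bytes], Dict[str, List[str]]]:
    """Deduplicate blocks across every VM in one job.

    Returns the store of unique blocks and, per VM, the ordered list of
    digests needed to rebuild its disk. Nothing is shared between jobs.
    """
    store: Dict[str, bytes] = {}
    layout: Dict[str, List[str]] = {}
    for vm, blocks in vm_disks.items():
        refs = []
        for block in blocks:
            digest = hashlib.sha256(block).hexdigest()
            store.setdefault(digest, block)  # keep only the first copy of a block
            refs.append(digest)
        layout[vm] = refs
    return store, layout


# Two hypothetical VMs built from the same OS image with different data.
vm_a = [b"os-block-1", b"os-block-2", b"data-a"]
vm_b = [b"os-block-1", b"os-block-2", b"data-b"]
```

Backing up `vm_a` and `vm_b` in one job stores four unique blocks instead of six, whereas two separate jobs store three blocks each because there is no global store of unique blocks to share between them.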

Synthetic reality

With a full backup in hand, we could now begin to leverage Veeam by never again doing a full backup. For ongoing backup sessions, Veeam Backup & Replication defaults to a synthetic backup process. After an initial full backup, Veeam Backup & Replication defaults to performing incremental backups of changed data, which is even more efficient when vSphere Changed-Block Tracking is enabled.

Using the restore wizard, we first chose how the backup should be restored—as a working VM, selected VM container files, or files of the logical VM. We then restored the VM from the ESX host to the ESXi host and automatically registered the new guest VM. We restored the current version in 22 minutes and the oldest version (eight automatic roll backs) in 25 minutes.

What characterizes a synthetic backup process is the automatic roll up of each incremental backup into a new full backup—dubbed a synthetic full backup—so that the most recent backup is always a full backup ready to restore. In a traditional incremental backup scheme, the full backup is the oldest file. IT administrators must first restore the full backup and then sequentially roll up the incremental files to the required point in time. Since the most recent backup is often the required target, the Veeam synthetic backup process automatically rolls up each incremental backup as it finishes.

In particular, Veeam Backup & Replication creates a new synthetic full backup (a .vbk file) and a rollback file (a .vrb file) from the incremental backup to reverse the traditional restoration process. Using the restore wizard, an IT administrator picks a point in time for the restoration and the Veeam software automatically handles any rollbacks.

We tested the process by restoring backups of a VM running on our ESX server as fully functional VMs on our ESXi server. We started with the most recent full synthetic backup. It took 22 minutes to automatically restore the VM, register the VM on the ESXi host, and power on the VM. We then repeated the process with a restore point that required eight roll backs. This time the process took 25 minutes.

Equally impressive were the performance levels for backup throughput and storage utilization with incremental backups following the initial full backup. Driven by the degree of change in the data and the similarity of the VMs in the backup session, 80% of the incremental roll back files required less than 2GB of storage in our tests. It took just 33GB of storage to hold one full Veeam backup and two additional rollbacks. That 33GB backup image represented 207GB of original data—a reduction of better than 6 to 1.
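The reduction figure quoted above follows from simple division, shown here as a small Python check. The function name `reduction_ratio` is an assumption for illustration; the two inputs are the sizes reported in the test.

```python
def reduction_ratio(original_gb: float, stored_gb: float) -> float:
    """How many gigabytes of original data each stored gigabyte represents."""
    if stored_gb <= 0:
        raise ValueError("stored_gb must be positive")
    return original_gb / stored_gb


# 207GB of original data held in 33GB of backup storage.
ratio = reduction_ratio(207, 33)
```

The result is a little under 6.3, consistent with the stated reduction of better than 6 to 1.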

The results of these tests also have profound implications when it comes to effectively implementing a continuous data protection (CDP) scheme for VMs. Using vSphere's Changed-Block Tracking feature, we were typically able to generate an incremental backup of one of our VMs in about 2 minutes. What's more, we were easily able to restore and register a VM on a different VMware host. Taken together and packaged as a service, those capabilities represent the heart of a disaster recovery process.

Morphing replication and CDP

Veeam Backup & Replication builds on its backup constructs to set up replication for disaster recovery. The wizard for replication utilizes the same menus as the backup wizard, which simplifies setup for an administrator.

In a simple sequence of steps, an IT administrator first chooses the VMs for replication and a datastore target host. In this datastore, Veeam Backup & Replication creates a folder dubbed VeeamBackup. Within the VeeamBackup folder, folders are then created for each replica VM. Finally, the administrator makes the same choices for data deduplication, compression, VM quiescence for transaction integrity, and job scheduling that would be made for a normal backup.

Once the administrator finishes with the wizard, there is one very significant difference between a replication process and a backup process. As in a backup process, the Veeam proxy server triggers a snapshot of the files for the VM being replicated and reads those files without engaging the VM or its host server. Nonetheless, the Veeam proxy server must engage the new host server for the replica VM. The new host must create duplicate setting files for the original VM and place these files in the new replica directory along with an initial full backup, dubbed replica.vbk. In addition, incremental backup files will be added based on the backup schedule in the job definition.

Engaging the new host requires that the original data be transferred over a LAN to the new hypervisor. This means replication throughput will be constrained at most sites by a 1Gb Ethernet connection. As a result, the amount of data sent during replication has a major effect on the time it takes to complete the replication operation. More importantly, the time necessary for replication to complete also determines the minimum amount of time we would have to wait between launching replication operations, and that directly determines the most aggressive (shortest) RPO that can be met.

Since replication, even with multiple VMs, is done on a per-VM basis, deduplication would clearly not be effective, just as in our single VM backup test. On the other hand, using optimal compression easily cut network traffic in half. In all of our tests, optimal compression cut data traffic and significantly reduced wall clock time for both the initial full backup and the subsequent incremental backups, which are key to meeting an aggressive RPO.
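The link between data volume, compression, and the shortest achievable replication interval can be approximated with a back-of-the-envelope Python calculation. The function `transfer_seconds` is a hypothetical helper, and the model is deliberately idealized: it assumes the full nominal link rate is available and ignores protocol overhead, snapshot handling, and processing on the target host, so real transfers would take longer.

```python
def transfer_seconds(data_bytes: float, link_bits_per_second: float,
                     compression_ratio: float = 1.0) -> float:
    """Idealized time to push data across a link after compression.

    compression_ratio is original size divided by compressed size,
    so 2.0 means the traffic is cut in half.
    """
    if compression_ratio <= 0 or link_bits_per_second <= 0:
        raise ValueError("ratio and link speed must be positive")
    compressed_bytes = data_bytes / compression_ratio
    return compressed_bytes * 8 / link_bits_per_second


GIGABIT = 1e9  # nominal 1Gb Ethernet rate in bits per second

# Example: an 8GB system disk, uncompressed and with 2-to-1 compression.
uncompressed = transfer_seconds(8e9, GIGABIT)
compressed = transfer_seconds(8e9, GIGABIT, compression_ratio=2.0)
```

Since a new replication pass cannot usefully start before the previous one completes, halving the transfer time also halves the shortest interval between passes in this simplified model.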

In our testing, we replicated two VMs running Windows Server 2008 on our ESXi host to our ESX host and two VMs running Windows Server 2003 on our ESX host to our ESXi host. What's more, given the minimal amount of time—about 3 minutes—it took to do an incremental backup with minimal changes, we set up a backup schedule that called for replication every 5 minutes and limited the number of saved replicas to 12 for ESX and 7 for ESXi, which is the limit imposed by that hypervisor.
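How a replication interval and a cap on saved replicas combine into a window of available restore points can be sketched in Python with a bounded queue. The function names are assumptions for illustration, and the model simply keeps the newest restore points, which may differ from how the product prunes them.

```python
from collections import deque
from typing import List


def retained_restore_points(interval_minutes: int, cycles: int,
                            max_points: int) -> List[int]:
    """Timestamps (in minutes) of the restore points still kept after `cycles` runs."""
    points = deque(maxlen=max_points)  # oldest entry is dropped automatically
    for cycle in range(cycles):
        points.append(cycle * interval_minutes)
    return list(points)


def coverage_minutes(points: List[int]) -> int:
    """Time span between the oldest and newest retained restore points."""
    return points[-1] - points[0] if points else 0


# 5-minute schedule observed after 20 runs, with the two limits used in the test.
esx_points = retained_restore_points(5, 20, max_points=12)
esxi_points = retained_restore_points(5, 20, max_points=7)
```

With a 5-minute schedule, 12 retained points span 55 minutes between oldest and newest, while 7 retained points span 30 minutes.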

We chose to fail over our replicas from the Replicas menu rather than the restore wizard. Once triggered, the process is totally transparent to administrators. In particular, it took less than two minutes for the replica to start booting.

In effect, we had created a virtual CDP process. After allowing sufficient time to populate a number of restore points, we tested the process by corrupting the directory of a VM running on our ESXi host. We then triggered a failover to its VM replica on the ESX server. Within 2 minutes, the replica was booting and within 10 minutes our VM was back online.

At this point the standard option of copying guest files from the replica VM to the original VM was of little use. To build a new VM on the ESXi host, we ran a standard backup job of the replica running on the ESX host. This created a clean backup without the replica's .vbk and .vrb files. We then restored that backup to the ESXi server and were ready to revert to our normal configuration.

With full support for vSphere and synthetic backup technology, which includes in-line deduplication and compression, Veeam Backup & Replication combines both backup and replication protection in one unified package. With a common technology foundation and wizards that employ the same menus for both data protection processes, IT administrators are able to easily implement a robust data protection plan capable of meeting any corporate SLA with aggressive recovery point and recovery time options. Using Veeam Backup & Replication, IT can support near-CDP levels of replication for VMware VMs with full recovery time measured in minutes.

In addition, the Veeam synthetic backup scheme supports continuous incremental backup, which can be modified to include full backups to meet local SLAs. With incremental backups, the data storage footprint of multiple backups can be reduced on the order of 30 to 1. As a result, Veeam Backup & Replication reduces storage costs as it increases the reliability and availability of a VMware environment.

• Veeam Backup & Replication combines backup and near-CDP-level replication in one interface to unify both data protection processes for IT with a synthetic backup process and consistent process wizards.
• A full backup of a VM running Windows Server 2003 averaged 189MBps and an incremental backup of the same VM averaged 245MBps on a VMware ESX 4.0 host.
• From a single full VM backup, restore full VMs or individual files as a logical system or as an ESX application in VMFS.
• Data deduplication, compression, and recognition of thin provisioning lowered the footprint of a full backup of 8 VMs by a factor of 9 to 1.
• Veeam Backup & Replication can leverage the new vStorage API when performing backup or replication to allow SMB sites with or without shared storage to fully leverage ESXi, including the ability to replicate VMs to an ESXi host.
• Full support for VMware vSphere Changed-Block Tracking and thin provisioning increases full and incremental backup speed and reduces storage requirements.
• Via support for synthetic backup, Veeam Backup & Replication can leverage inline data deduplication and compression to support very narrow backup windows, which provides for near-continuous data protection (CDP) with replication.
• Veeam Backup & Replication employs VMware snapshots to ensure VM transaction consistency and the Volume Shadow Copy Service (VSS) for VSS-aware Windows applications such as Active Directory, SQL Server, and Exchange.