an insider's perspective, technical tips n' tricks in the era of the VMware Revolution

September 03, 2013

VMworld 2013: Gory details on VM Instant Access and performance

In the most recent Avamar and Data Domain releases, if you are restoring an entire virtual machine from backup – and those backups are stored on a Data Domain system – a special feature called “instant access” is available.

Instant access is similar to restoring the full image to a new virtual machine, except that the restored virtual machine can be booted directly from the Data Domain system (via an NFS datastore). This reduces the amount of time required to restore an entire virtual machine to essentially zero, and is particularly handy with large VMs.
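Under the covers, instant access exposes the restored VM’s files through an NFS export that the ESXi host mounts as a datastore. Avamar orchestrates all of this automatically, but as a rough sketch of the equivalent manual step on an ESXi host (the hostname and export path below are hypothetical, not from this post):

```shell
# Mount a Data Domain NFS export as a datastore on the ESXi host
# (hypothetical hostname/export -- Avamar does this for you):
esxcli storage nfs add -H dd990.example.com -s /data/col1/instant-access -v DD-InstantAccess

# Confirm the datastore is visible:
esxcli storage nfs list
```

Because the datastore is NFS-backed by the Data Domain, the VM can boot with no up-front data movement – blocks are read from the DD system on demand.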

The admin can now play with the restored VM and make sure it’s right. At that point, you want it back onto a production datastore. From the vSphere client, power on the virtual machine and initiate a storage vMotion of the virtual machine to a datastore within the vCenter environment.

When the storage vMotion is complete, the restored virtual machine files no longer exist on the Data Domain system.

If you read the fine print you’ll see this… NOTICE: In order to minimize operational impact to the Data Domain system, only one (1) instant access is permitted at a time.

So, is one (1) Instant Access enough for your needs?

Probably not, so we didn’t hardcode this limitation. In fact, we anticipate that customers will need to restore multiple VMs at a time, especially in the vApp context or for multi-VM applications. But we are being cautious initially, because we want to ensure the Data Domain system remains operational for its intended purpose – backups and restores, not running VMs.

If you read my post back here, you’ll see yet another example of my “different horses for different courses” view when it comes to storage stacks (yes, you CAN have a general-purpose storage stack like VNX or NetApp – good at many things, not great at any one thing – in fact, their “good at many” characteristic is why people dig them). People LOVE Data Domain as a backup target (for dedupable datasets), and we often get the “why not just always use it as an NFS datastore for VMs?” question. “Hey – it does inline dedupe (unlike VNX or NetApp, which do it as a post-process), and it seems that only all-flash arrays like XtremIO can do that… so why not?” The answer is that the things that make DD awesome for backup workloads (inline dedupe, huge ingest bandwidth) are inseparable from the things that make it NOT ideal for transactional NFS (small-IO latency characteristics, IOps density, and cost).

So, a characterization of “what does this load do to a DD system during the transient period” seems necessary… For that, read on dear reader for gobs of test data! (thank you BRS team!)

We tested the ability to use Avamar “Instant Access” to restore and power up five (5) virtual machines on a Data Domain system. Once the virtual machines were powered on, we storage vMotioned them to an EMC VMAX 40K, and during the S-vMotion we gathered ESXTop metrics to understand the impact on the VMs during the process.

Q: How do you enable more than one (1) instant access at a time?

A: The Instant Access field in the Avamar MC GUI’s “Edit Data Domain System” dialog is read-only by default, and its value defaults to 1. The field’s read-only mode can be changed by modifying the /usr/local/avamar/var/mc/server_data/prefs/mcserver.xml file. Locate the XML element <entry key="ddr_can_modify_ir_limit" value="false" /> and change the boolean value to true. After you restart the MC and return to the Edit Data Domain System dialog, the field will accept a new value.
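As a minimal sketch of that edit – shown here against a stand-in copy of the file rather than the live one, since the exact surrounding contents of mcserver.xml on a real Avamar utility node (and the MC restart procedure) will vary:

```shell
# The real file lives at /usr/local/avamar/var/mc/server_data/prefs/mcserver.xml;
# for illustration we create a stand-in containing just the relevant entry.
cfg=$(mktemp)
echo '<entry key="ddr_can_modify_ir_limit" value="false" />' > "$cfg"

# Flip the boolean from false to true (back up the real file first, e.g. sed -i.bak):
sed -i 's/\(key="ddr_can_modify_ir_limit" value=\)"false"/\1"true"/' "$cfg"

grep 'ddr_can_modify_ir_limit' "$cfg"
# -> <entry key="ddr_can_modify_ir_limit" value="true" />
```

Then restart the MC service; once it’s back up, the Instant Access field in the dialog becomes editable.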

Q: What hardware and software will be used during the tests?

A: We used the following:

Avamar Virtual Edition (AVE) 7.0 – image-level backup datasets were configured to use Data Domain as the target.

After backups are complete, run the “filesys restart” command at the DD CLI prompt to ensure backup data is not in the DD cache (for more realistic test results)

Run instant access for virtual machines

Start virtual machines

Start load generation tools within each virtual machine (it’s unrealistic that customers would start to pound on the virtual machines prior to S-vMotioning them to primary storage, but we wanted to know the impact)

Start ESXTop data collection on ESXi hosts

Start storage vMotions – by default, Data Domain restores the virtual machine as Eager Zero Thick; however, we wanted to know the effects of converting to thin during the S-vMotion.
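The command-line portions of the steps above can be sketched as a short transcript (hostnames are hypothetical; the “filesys restart” is run against the Data Domain per the steps, and ESXTop is collected in batch mode on each ESXi host):

```shell
# 1. Flush the DD cache after backups complete, so restore reads
#    come from disk rather than cache (hypothetical hostname):
ssh sysadmin@dd990.example.com "filesys restart"

# 2. On each ESXi host, collect esxtop data in batch mode during the
#    instant-access + storage vMotion window
#    (5-second samples x 360 iterations = 30 minutes of data):
esxtop -b -d 5 -n 360 > /tmp/esxtop-ia-test.csv
```

The batch-mode CSV is what the latency and CPU charts below were derived from.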

Q: How much time did it take to complete each portion of the tests?

A: Here were the results:

Q: What can we conclude from these tests?

A: Well – we shouldn’t conclude TOO much – but here are some thoughts:

Keep in mind, most people will not generate stressing loads on a virtual machine while it’s residing on a Data Domain system, let alone during a storage vMotion, if they don’t have to. These tests are probably unusual, but they help illustrate the effects on the Data Domain systems and the virtual machines themselves.

In most of the tests, converting back to thin vDisks increased the duration of the storage vMotion. This may be a necessary evil if the “source” virtual machines used thin vDisks and storage resources are limited during the restoration.

The durations appear to be linear, based on previous single-virtual-machine testing (not shown here).

The effects on Data Domain system CPU resources during the storage vMotion were similar to the effects during the initial backups, which leads us to believe that while it’s NOT the design center of Data Domain, the DD990 and DD890 are engineered to handle temporarily running virtual machines.

The disk latency effects (for both reads and writes) on the virtual machines were unacceptable by application vendors’ standards (e.g., what Microsoft requires for Exchange and/or SQL operations) – again, see my comment about design centers of platforms. However, as noted in bullet one, it’s unlikely that a load would be placed upon the virtual machine during the restore process, and the virtual machine would only briefly reside on the DD system.

How were the virtual machines affected during the S-vMotion? (In general, latency was higher than you would like on a VM running in production, but for the time period where you’re checking it out before using storage vMotion to move it back to a production datastore, not bad.)

Figure 11 – DD990 All VMs from all tests - read latency

Figure 12 - DD890 All VMs from all tests - read latency

Figure 13 – DD990 All VMs from all tests - write latency

Figure 14 - DD890 All VMs from all tests - write latency

There you have it – gory detail on the performance impact and performance envelope of VM instant access!

Comments


Hi Chad,

Like all the other Avamar features – self-service restore, vCenter integration, CBT restore – this is great stuff, but unlike those other Avamar features, Instant Access is currently not available to us NetWorker customers. Anytime soon perhaps?


Disclaimer

The opinions expressed here are my personal opinions. Content published here is not read or approved in advance by EMC and does not necessarily reflect the views and opinions of EMC. This is my blog, it is not an EMC blog.