For my experiment I took two random VMs with a total provisioned storage of 87GB and set up a backup job on a VDR appliance configured to send backup data to an NFS share on a Windows 2012 server.

I won't detail the configuration steps, both because the way to export an NFS share in Windows 2012 hasn't changed since Windows 2008 (add the RSAT-NFS-Admin feature) and because the aforementioned video shows most of the steps to configure the deduplicated volume.

In my test, once the VDR backup job had completed, the initial 87GB had shrunk to a mere 14GB.
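To put that reduction in numbers, here is a small sketch of the arithmetic. The function name `dedup_savings` is my own invention for illustration. Keep in mind that 87GB is the provisioned size, so part of the reduction may come from space the VMs never used rather than from deduplication proper.

```python
def dedup_savings(original_gb: float, stored_gb: float) -> tuple[float, float]:
    """Return (ratio, percent_saved) for data that shrank from original_gb to stored_gb."""
    if original_gb <= 0 or stored_gb <= 0:
        raise ValueError("sizes must be positive")
    ratio = original_gb / stored_gb                      # e.g. 6.2 means "6.2:1"
    percent_saved = (1 - stored_gb / original_gb) * 100  # share of space no longer needed
    return ratio, percent_saved


ratio, saved = dedup_savings(87, 14)
print(f"ratio {ratio:.1f}:1, {saved:.0f}% saved")  # prints "ratio 6.2:1, 84% saved"
```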

Now, before we proceed, there are two things worth mentioning.

The first is that you have to shut down your VDR appliance in order to close the vmdk disk before you launch the Windows deduplication task, otherwise the optimization task will silently fail with

Event ID 8196: "Data Deduplication failed to dedup file "testvdr02-flat.vmdk" with file ID 844424930132042 due to non-fatal error 0x80565350, An error occurred while opening the file because the file was in use."

The second thing to keep in mind is that you have to make sure there's plenty of room on your destination volume for deduplication to succeed, otherwise the optimization task will fail with

Event ID 8252: "Data Deduplication has failed to set NTFS allocation size for container file \\?\Volume{339d7092-0...9a69ca5460}\SVI\Dedup\ChunkStore\{C811D0A8-A4DD-59A4-8518-98158C627379}.ddp\Data\00000078.00000001.ccc due to error 0x80070070, There is not enough space on the disk."

I wasn't able to establish the minimum amount of free disk space Windows deduplication needs in order to succeed, so if anybody has information about this parameter, please share!
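Until somebody comes up with the real figure, a pre-flight check along these lines could at least warn you before you launch the optimization. This is only a sketch: the function `has_headroom` is hypothetical, and the 20% default is an arbitrary placeholder of mine, not a documented requirement of Windows deduplication.

```python
import shutil


def has_headroom(path: str, min_free_fraction: float = 0.2) -> bool:
    """Return True if the volume holding `path` has at least min_free_fraction free.

    min_free_fraction is an ASSUMED threshold chosen for illustration only;
    the actual minimum needed by the deduplication engine is unknown to me.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return usage.free / usage.total >= min_free_fraction


# Example: check the current volume before starting the optimization task.
if has_headroom("."):
    print("Enough free space (by the assumed threshold), go ahead.")
else:
    print("Volume is getting full, the optimization task may fail.")
```

On the Windows server you would pass the drive that hosts the NFS share instead of `"."`.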

Let's go on. At this point the optimization task starts, and the process Microsoft File Server Data Management Host (fsdmhost.exe) scans the disk high and low for data chunks to deduplicate.

Once it's finished, and contrary to what is stated on the Veeam blog, I don't see any size improvement for the backups: the amount of disk space used stays roughly the same, 14GB. To me this means that VDR deduplication is quite efficient and that the Windows deduplication engine can't add much to it.

The reported deduplication saving of 37.1GB is just the space deduplication can reclaim from the vmdk disk because it is stored as thick (its full size is allocated up front, including the blocks that hold no data).
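To illustrate why a thick disk produces this kind of saving, here is a toy model of chunk-level deduplication. It is a deliberate simplification under my own assumptions: fixed 4KB chunks and SHA-256 hashes, whereas the real Windows engine is more sophisticated. The point is only that all the empty chunks of a thick disk collapse into a single stored chunk.

```python
import hashlib
import os

CHUNK = 4096  # assumed fixed chunk size, for illustration only


def dedup_stats(data: bytes, chunk_size: int = CHUNK) -> tuple[int, int]:
    """Return (logical_bytes, unique_bytes) after naive chunk-level deduplication."""
    seen = set()
    unique = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        if digest not in seen:  # store each distinct chunk only once
            seen.add(digest)
            unique += len(chunk)
    return len(data), unique


# Simulated thick disk: 10 chunks of random "real" data, then 90 empty (zeroed) chunks.
disk = os.urandom(10 * CHUNK) + bytes(90 * CHUNK)
logical, unique = dedup_stats(disk)
print(f"logical {logical} bytes, stored {unique} bytes, saved {logical - unique} bytes")
```

Here the 90 zeroed chunks are stored once, so the "saving" is large even though the real data (the random part) did not deduplicate at all, which matches what I observed.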

If anybody at Veeam has a better interpretation of these results, I am open to suggestions, remarks and of course corrections!

For an introduction to Windows 2012 Data Deduplication check this previous post.

2 comments:

I am seeing large gains writing Veeam backups to a deduped Win 2012 volume. Veeam can only dedupe at the per-job level, so if you have many jobs that back up many VMs, Windows dedupe will help (a lot). Your test needs more VMs, and more backup jobs, to be relevant.