The issue arises because data has been deleted from inside the master VM, from which the replica VM’s disk image is created by sparsing. Sparsing removes unused blocks to reduce the storage space required; the same process is used when VMs are converted to thin disks, for example by Storage VMotion. The problem is that it isn’t unused blocks that are removed, it’s zeroed blocks. To preserve the guest’s view of the disk, sparsing leaves intact any block that contains data, even if that data belongs to a deleted file.
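The rule can be sketched in a few lines of Python (a toy model, not VMware’s actual sparsing code): a block survives sparsing whenever any byte in it is non-zero, so the blocks of a deleted file are kept right alongside live data.

```python
BLOCK_SIZE = 4096

def keeps_block(block: bytes) -> bool:
    """Sparsing drops a block only when every byte in it is zero."""
    return any(block)

live    = b"config data".ljust(BLOCK_SIZE, b"\x00")    # in-use block: kept
deleted = b"old ISO bytes".ljust(BLOCK_SIZE, b"\xFF")  # deleted file's block: still kept
free    = bytes(BLOCK_SIZE)                            # zeroed block: dropped
```

The guest’s filesystem knows `deleted` is free space, but the sparsing pass cannot see filesystem metadata; it only sees non-zero bytes.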

Keeping the replica small is very desirable if there will be many copies on your shared storage, which is what happens without a tiered storage configuration. In Andre’s 2000 seat example a copy of the replica was stored for roughly every seven desktop VMs, so any bloat in the replica was multiplied by about 280 when the View Composer pool was deployed.

The simplest way to resolve the issue is to use sdelete from Sysinternals to zero the free space in the master before taking the snapshot that will be the basis for the replica.
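What sdelete’s zeroing pass does can be approximated in a short sketch (a hypothetical Python model, not the real tool): fill the free space with a single file of zeros, then delete it. The guest still sees the space as free, but the underlying blocks are now zeroed and therefore removable by sparsing.

```python
import os

def zero_free_space(mount_point: str, free_bytes: int, chunk: int = 1 << 20) -> int:
    """Crude model of a zero-free-space pass: write zeros over the free
    space in chunks, then delete the fill file. Returns bytes zeroed."""
    fill = os.path.join(mount_point, "zerofill.tmp")
    written = 0
    with open(fill, "wb") as f:
        while written < free_bytes:
            n = min(chunk, free_bytes - written)
            f.write(bytes(n))  # bytes(n) is n zero bytes
            written += n
    os.remove(fill)  # space is free again, but the blocks now hold zeros
    return written
```

The real tool is more careful (it also handles MFT records and compressed files on NTFS), but the principle is the same: free space only sparses well after something has overwritten it with zeros.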

Here’s how it looks:

1. My Windows 7 master, a 40GB hard disk with 26GB of OS and applications installed (yes, I currently have everything installed and only ThinApp a small selection), produces a similarly sized replica.

2. To create blocks of data from deleted files I copy a few ISO files (8.26GB) to the disk and then delete them. The OS still reports 26GB consumed but the replica is rather larger, reflecting the non-zeroed blocks previously used by the ISOs.

3. In the VM I run SDelete to zero the free space, which removes the ability to undelete the files.

4. The VM still reports 26GB of consumed disk, however the replica has returned to its original size.
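The four steps above can be replayed in a toy block-level simulation (hypothetical, not real VMDK handling): deleting the ISOs leaves their blocks non-zero, so the “replica” only shrinks once the free blocks are explicitly zeroed.

```python
BLOCK = 4096

def replica_size(blocks: list) -> int:
    """A sparsed replica keeps every block that is not all zeros."""
    return sum(len(b) for b in blocks if any(b))

# Step 1: master with OS/app data in the first two blocks, two free blocks.
disk = [b"os".ljust(BLOCK, b"\x01"), b"apps".ljust(BLOCK, b"\x01"),
        bytes(BLOCK), bytes(BLOCK)]
base = replica_size(disk)        # 2 blocks kept

# Step 2: copy ISOs into the free blocks, then "delete" them.
# Deletion only drops directory entries; the blocks still hold the data.
disk[2] = b"iso-data".ljust(BLOCK, b"\x02")
disk[3] = b"iso-data".ljust(BLOCK, b"\x02")
bloated = replica_size(disk)     # 4 blocks kept -- the replica bloats

# Steps 3-4: zero the free blocks (what SDelete does); the replica shrinks.
disk[2] = bytes(BLOCK)
disk[3] = bytes(BLOCK)
after = replica_size(disk)       # back to 2 blocks
```

Throughout, the guest would report the same consumed space; only the sparsed copy changes size.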

The conclusion is that with View Composer pools it’s important to completely clean up the master image before recomposing your pool, including zeroing free space after you have defragmented the disk.

If you use the View 4.5 tiered storage feature you can reduce the size multiplication, as only a single copy of the replica is required for each pool; however, this brings with it the issue of all replica IO going to a single datastore. Andre has a great article explaining Tiered Storage on View 4.5.

2 Responses to Bloated Replicas, or “deleted isn’t zeroed”

Andrew has identified an issue I didn’t encounter: if you thin provision the master VM, then running sdelete will fully expand the VMDK, which is expected behavior. In fact I always thick provision my master VMs, so it’s not an issue for me. There aren’t a lot of master VMs; certainly there will be no more of these than there are pools, and often one master will be the parent for multiple pools.

The aim is to get the replica to a small size since there are usually a lot of these: one per datastore, so possibly ten per pool. The space saving in the replica is multiplied.