Recently I blogged about a quite convenient way to release read-only locks on vmdk files that prevent snapshots from being committed. Unfortunately, this is a fairly common problem with third-party backup tools that use the vStorage API. Storage vMotion can help to release these locks, but sometimes even Storage vMotion won't work.

One more side note: you may ask how a VM can still be running on a snapshot while the vCenter server knows nothing about it, or better said, while the snapshot manager shows no existing snapshots for that VM. The reason is that the backup program sends a "remove snapshot" command to the vCenter server. vCenter passes the command on to the ESX host that is responsible for the snapshot handling and, at the same time, deletes the information about the existing snapshot from its own database without waiting for the host to confirm the removal. This is by design, don't ask me why. So the new "Consolidation" "feature" in vSphere 5.x is nothing more than a simple database query over all listed snapshots, and this list is compared with the settings of each VM. When vCenter finds a mismatch it activates the "Consolidation needed" button. Curious...

Because I have troubleshot several of these problems in the past, I decided to write a short step-by-step guide on how to identify which system holds the lock and how to release it. I can't cover every situation in which this problem occurs, but in about 95% of the cases I have seen, one of the two solutions provided here did the job.

The first and probably most important step is to reboot the physical backup server you use to back up your VMs. This is especially true if the server uses SAN mode to read the vmdks. These servers tend to hold locks when backup jobs fail or crash, and a reboot is sometimes enough to reset the lock. If you have a virtual backup proxy, as tools like Veeam or vRanger support, then rebooting that VM won't help.

Check whether the consolidation process works now. If it still fails, proceed to the next step.

Identify which file is really locked and prevents the snapshot removal. To do this, shut down the VM first: while the VM is running, the hosting ESX server shows up as one of the file lock holders, and this can be confusing.

Log on via SSH to the ESX server that has the VM registered, change to the VM's directory on the datastore, and run the following command:

for a in *; do echo "$a"; vmkfstools -D "$a"; done
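Because the base disk is usually the file of interest, the loop can be narrowed to the *-flat.vmdk files. The following is only a sketch: the function name each_flat_vmdk and the datastore path in the comment are placeholders of mine, not anything VMware ships. It takes the directory as an argument, so you do not have to change into it first, and it quotes file names so that names with spaces survive.

```shell
# Run a command once for every *-flat.vmdk file in a directory.
# Usage: each_flat_vmdk DIR COMMAND [ARGS...]
# (each_flat_vmdk is a hypothetical helper name.)
each_flat_vmdk() {
    dir=$1
    shift
    for f in "$dir"/*-flat.vmdk; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        echo "== $f"
        # Append the file name to the given command and run it.
        "$@" "$f"
    done
}

# On the ESX host (the path is a placeholder, adjust it to your VM):
#   each_flat_vmdk /vmfs/volumes/<datastore>/<vm-directory> vmkfstools -D
```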

Search the output for the entries of the VM's base vmdk (the file named *-flat.vmdk). In almost every case this is the locked file. You will see an output like:

The first column tells you the so-called cartel ID. You will need this information in the next step. The second column tells you that the disk is in use by a VM process. That means one of your VMs has the disk attached and thus holds the lock. The other columns can be ignored.

Now check which VM holds the lock. This can be done with the following command, where 11696694 is the cartel ID from the previous step:

esxcli vm process list | grep -B3 11696694

The output will be something like this:

servername
   World ID: 11696695
   Process ID: 0
   VMX Cartel ID: 11696694
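The grep above matches any line that merely contains the number, so a longer ID that happens to include it would match as well. If you prefer an exact match, the lookup can be sketched with awk as below. The function name vm_name_for_cartel is made up for this example, and the sketch assumes the output layout shown above: a name line, followed by "Key: value" lines.

```shell
# Print the name of the VM whose VMX Cartel ID is exactly $1.
# Reads text in the layout of "esxcli vm process list" on stdin.
# (vm_name_for_cartel is a hypothetical helper name.)
vm_name_for_cartel() {
    awk -v id="$1" '
        # Attribute lines look like "   Key: value".
        /^[ \t]*(World ID|Process ID|VMX Cartel ID|UUID|Display Name|Config File):/ {
            if ($0 ~ /VMX Cartel ID:/) {
                value = $0
                sub(/^.*VMX Cartel ID:[ \t]*/, "", value)
                sub(/[ \t]*$/, "", value)
                # Exact comparison, not a substring match.
                if (value == id) print name
            }
            next
        }
        # Any other non-blank line starts a new entry and carries the VM name.
        NF {
            name = $0
            sub(/^[ \t]+/, "", name)
            sub(/[ \t]+$/, "", name)
        }
    '
}

# On the ESX host:
#   esxcli vm process list | vm_name_for_cartel 11696694
```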

Open the vSphere client, connect to the ESX host on which you ran the last commands, and search for the VM that the last command shows in its first line (in this case it's servername). Go to the VM settings and check the attached hard disks. You will see the disks that are locked. Remove these hard disks from the VM, but WITHOUT deleting the disk files from the datastore.

This problem is mainly caused by virtual backup proxies that use hot-add mode for data transport. If they crash or run into other problems so that the disks can't be detached, the base disks of the backed-up VM remain connected to the proxy and thus locked for other processes such as snapshot removal tasks. That's the reason why a simple reboot of a virtual backup proxy won't help in this situation: the disks remain attached to the proxy VM and stay locked.
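If you would rather check the attached disks from the shell, a leftover hot-add disk should show up in the proxy's .vmx file as a disk entry that points into another VM's directory. The following sketch lists all disk entries of a .vmx file. It assumes the usual entry layout of the form scsi0:1.fileName = "path.vmdk"; the function name list_vmx_disks and the path in the comment are placeholders of mine.

```shell
# Print the path of every virtual disk referenced in a .vmx file.
# Assumes entries of the form:  scsi0:1.fileName = "/path/to/disk.vmdk"
# (list_vmx_disks is a hypothetical helper name.)
list_vmx_disks() {
    # Keep only "<device>.fileName" entries whose value ends in .vmdk
    # and print the value without the surrounding quotes.
    sed -n 's/^[[:space:]]*[A-Za-z0-9:]*\.fileName[[:space:]]*=[[:space:]]*"\(.*\.vmdk\)"[[:space:]]*$/\1/p' "$1"
}

# On the ESX host (the path is a placeholder):
#   list_vmx_disks /vmfs/volumes/<datastore>/<proxy-vm>/<proxy-vm>.vmx
# Entries with a path into a different VM's directory are the suspects.
```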

One final piece of advice: if you encounter locked files, first reboot the physical backup server. Next, check for the lock holder as described in this article. After that, try a Storage vMotion. If none of this helps, call VMware support, as you probably have a rare problem where the standard procedures won't work. Good luck.