Last week I was moving some VMs for a client using Veeam Quick Migration. During the migration, I happened to stumble upon some strange behavior.

The source VMs make up an Oracle RAC cluster using shared VMDKs. One of the requirements for setting up shared VMDKs with the multi-writer option is that they are thick provisioned eager zeroed disks. When the VMs got to the other side, they wouldn’t boot. I figured something had gone wrong during the migration, so I tried it again. Once the second run completed, the VMs still wouldn’t boot, so I started digging around some more.

The error message I was getting was rather vague: “Incompatible device backing specified for device ‘0’.” After verifying the config of both nodes, I eventually decided to look at the disk type on the destination side. That’s when I noticed the disks were thick provisioned lazy zeroed. Aha, that’s why they didn’t want to boot! After manually converting the disks to eager zeroed, they were up and running again. At this point I started to suspect that this was a bug.

Running some tests

I started building some more test VMs to prove that this was, in fact, a bug. One of the options you can set in the Quick Migration wizard is the disk type. You can explicitly select each of the types, or you can have Veeam use the same format as on the source side. Whether I explicitly selected thick provisioned eager zeroed or used the same format as the source, the result was a VM with lazy zeroed disks. Time to submit a ticket!

As usual, Veeam support was very helpful and investigated the issue. A couple of days later they came back to me and confirmed this was indeed a bug that would be fixed in an upcoming version.

Workaround

This bug is a minor inconvenience since there is an easy workaround. You can log in to an ESXi host over SSH and convert the VMDK from the command line.
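As a rough sketch of that conversion: `vmkfstools -k` (long form `--eagerzero`) is the ESXi option that converts a lazy zeroed thick disk to eager zeroed thick while keeping the existing data. The helper name `build_eagerzero_cmd` and the datastore path below are made up for illustration, and I'm assuming the VM is powered off and that you point at the descriptor file rather than the `-flat.vmdk` extent. The helper only prints the command so you can review it before running it on the host.

```shell
#!/bin/sh
# Hypothetical helper: prints the vmkfstools command that converts a
# thick lazy zeroed VMDK to eager zeroed in place (data is kept).
# Assumptions: the VM is powered off, and the path is the descriptor
# file (vm.vmdk), not the -flat.vmdk extent.
build_eagerzero_cmd() {
    vmdk_path="$1"
    case "$vmdk_path" in
        *-flat.vmdk)
            echo "error: pass the descriptor file, not the flat extent" >&2
            return 1
            ;;
        *.vmdk)
            # -k is the short form of --eagerzero
            printf 'vmkfstools -k "%s"\n' "$vmdk_path"
            ;;
        *)
            echo "error: not a .vmdk path" >&2
            return 1
            ;;
    esac
}

# Placeholder path; substitute your own datastore, VM folder and disk.
build_eagerzero_cmd "/vmfs/volumes/datastore1/rac-node1/rac-node1_1.vmdk"
```

Run the printed command once per affected disk. Zeroing out a large disk can take a while, so plan for that before powering the cluster nodes back on.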

When I got into the office this morning, I noticed that one particular copy job hadn’t done its job over the weekend. This job copies the daily restore points to a separate scale-out repository and enforces the GFS scheme that’s been set.

The job report displayed

Not that much to go on, if you ask me. First, I checked whether all extents in the repository still had enough room, which they did. While I was doing that, I also verified that all my proxies were still up and running. Before heading to my good friend Google, I decided to remove the copy job restore points from the configuration.

After this, I did a rescan of the repository and retried the job. It ran without a hitch: a nice and easy fix 🙂 I hope this won’t become a common thing, but time will tell.