
Backups crashing CSV volumes on iSCSI

Question

We've had a 2-node Hyper-V cluster running and backing up with DPM 2010 for a long time with no issues. The hosts talk to the CSV LUNs via Fibre Channel, while DPM uses iSCSI for its storage. We just tried to move the hosts from Fibre Channel to iSCSI, on
the same iSCSI subnet as the DPM server, and when a backup starts it crashes the CSV volume the VM is on. Sometimes I see the volumes in redirected mode (not sure what that means) during the backup attempt, sometimes not. The SAN is an EMC VNX5300
and we use PowerPath with the EMC DSM.

Can anyone tell me what to look at? I am building a test lab with a Hyper-V cluster to do more testing, but any ideas on what to test would be appreciated.

Answers

The best way to back up your Hyper-V environment is to use your Fibre Channel connection to your CSV LUNs and install (if you haven't already done so) the VSS hardware providers on the DPM server. With this technique DPM will actually use the
VSS function of the SAN, so your EMC SAN will take the snapshot and provide the data (block-level changes) to your DPM server.

Your CSV volume goes into redirected mode because DPM triggers the host that holds the coordination handle to make the VSS snapshot of your CSV volume. Without a hardware provider, the snapshot is made from within the operating system on the host, and this is a very time-consuming process
when you have a large amount of data on your VMs.

All replies

Great information! The serialized method does not appeal to me and pushes me toward staying on Fibre Channel. Can anyone point me to a doc on how to install a VSS hardware provider? Also, I assume that installing the provider will not affect the backups
of my file servers, SharePoint or SQL. Am I correct?

I had to enable File and Printer Sharing and Client for Microsoft Networks on all of the iSCSI adapters to get backups working, though. Is this normal?

After starting a backup, I was unable to access the VM for about a minute, but it did come back. I still need to do more testing with more than one VM on the CSV, since in production all VMs on the CSV went down. In the test lab,

While I still have one VM that I can't back up while it is online for some reason, the ultimate solution was to disable the iSCSI networks for cluster networking. I also installed the VSS provider on the hosts, which seems to work very well. While at it, I created
a live migration network and now see much better migration and backup performance.

My setup: a Hyper-V cluster using iSCSI storage, with dedicated 10-gig NICs in MPIO mode for iSCSI, plus a dedicated 10-gig NIC for all other network traffic, all VLAN'ed off etc. All updates from Microsoft and EqualLogic, hardware VSS, many...many Microsoft hotfixes
for "multiple VM backups on the same CSV" and 3-4 MPIO hotfixes etc. I spent a HUGE amount of time tracking down all of the hotfixes.

If I kick off a DPM backup of a VM, it works fine as long as it is backing up only one VM per CSV at that moment. I get a brief CSV redirect, then the hardware snapshot is created, the backup runs, and the snapshot is destroyed. Worked great.

Then I attempted to back up 4 VMs on the same CSV......the wheels fell off the bus. I got all kinds of cluster errors, 1134, 1169 etc., basically telling me the volumes were not there, or had moved or whatever. I had about 25 VMs on this
cluster at the time, and 4 or 5 were just off when I came into work.
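To illustrate the "only one VM per CSV at a time" pattern that worked for me, here is a minimal Python sketch that groups VMs into backup waves so that no wave contains two VMs from the same CSV. The function name `plan_backup_waves` and the VM/CSV names are hypothetical. This is not a DPM API, and it only plans the order; it does not start any backup.

```python
from collections import defaultdict
from itertools import zip_longest


def plan_backup_waves(vm_to_csv):
    """Return a list of waves; each wave holds at most one VM per CSV.

    vm_to_csv maps a VM name to the CSV it lives on (hypothetical inventory,
    you would have to build this mapping yourself).
    """
    by_csv = defaultdict(list)
    for vm, csv in sorted(vm_to_csv.items()):
        by_csv[csv].append(vm)

    waves = []
    # zip_longest takes the i-th VM from every CSV; None pads the shorter lists.
    for group in zip_longest(*by_csv.values()):
        waves.append([vm for vm in group if vm is not None])
    return waves


# Hypothetical inventory: four VMs share CSV1, one VM sits on CSV2.
inventory = {
    "vm1": "CSV1",
    "vm2": "CSV1",
    "vm3": "CSV1",
    "vm4": "CSV1",
    "vm5": "CSV2",
}

for number, wave in enumerate(plan_backup_waves(inventory), start=1):
    print(f"wave {number}: {wave}")
```

Each wave would be run to completion before the next one starts, so a CSV never has more than one snapshot in flight. The cost is a longer backup window when many VMs share one CSV.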

EqualLogic blamed it on Microsoft; Microsoft said it was the hardware VSS provider. Honestly, I would bet on Microsoft, since there are only about 5-10 hotfixes out there for CSV/MPIO that I know of, this week.

Long story short, I now use SAN snapshots for whole-VM backups of VMs whose data changes little if at all, and DPM inside the guest VM for stuff like SQL, Exchange, domain controllers etc.

DPM with CSV for full backups is NOT ready IMHO. DPM is a great tool, and we use the heck out of it, but Hyper-V on CSV feels very "1.0" to me, especially coming from VMware. The next version of DPM/Hyper-V/CSV needs to have this nailed down.
It also needs a way to track changes on the host, so DPM does not have to scan the whole VHD file of a VM looking for changes. A 1 TB VHD being read from the SAN will put a load on the SAN.
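To make the change-tracking wish concrete, here is a minimal Python sketch of the general idea: record which fixed-size blocks were written between backups, then read only those blocks instead of the whole VHD. The class `ChangeTracker` and its methods are hypothetical and purely illustrative. This is not how DPM or Hyper-V work today, and a real implementation would have to sit in the host's storage path and survive reboots.

```python
MB = 1024 * 1024


class ChangeTracker:
    """Remember which blocks of a virtual disk were written since the last backup."""

    def __init__(self, disk_size, block_size=MB):
        self.disk_size = disk_size
        self.block_size = block_size
        self.dirty = set()  # indexes of blocks touched by writes

    def record_write(self, offset, length):
        """Mark every block overlapped by a write of `length` bytes at `offset`."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        self.dirty.update(range(first, last + 1))

    def changed_ranges(self):
        """Return sorted (offset, length) pairs the backup has to read."""
        ranges = []
        for block in sorted(self.dirty):
            start = block * self.block_size
            # The last block of the disk may be shorter than block_size.
            ranges.append((start, min(self.block_size, self.disk_size - start)))
        return ranges

    def reset(self):
        """Call after a successful backup to start a new tracking interval."""
        self.dirty.clear()


# Hypothetical example: a 2-byte write that straddles the first block boundary.
tracker = ChangeTracker(disk_size=10 * MB)
tracker.record_write(MB - 1, 2)
to_read = sum(length for _, length in tracker.changed_ranges())
print(f"backup reads {to_read // MB} MB instead of {tracker.disk_size // MB} MB")
```

With something like this on the host, the read load on the SAN would scale with the amount of changed data rather than with the size of the VHD.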
