I am considering two WS 2012 Hyper-V hosts, each with DAS, with the VMs running on each host replicated to the opposite host. I expect to put 4 TB on each host and have a dedicated FC or 10 Gb link between the two hosts. For that money I could almost afford some shared storage, but it would be a single point of failure. I can accept an RPO/RTO of 30 minutes for a disaster.

I would like to do host maintenance without downtime.

How can I do a shared-nothing live migration of these VMs? I have read that the live migration will fail because the VM already exists on the target host as a replica. Is it easy and advisable to remove the replica and then proceed with the live migration, or is there some other option?

Thanks Rod-IT, but I would prefer a MS solution, since the alternative would add a completely new skill-set requirement for the department. I understand there are VSA solutions for Hyper-V too, but they take up host resources and come with a price.

Replica doesn't do auto or non-stop migration for either planned or unplanned movements. That is the crux of the issue. I can accept some downtime for unplanned failover but would like live or uninterrupted migration for planned events, like host maintenance.

I've been crash testing StarWind Virtual iSCSI SAN with two WS 2012 hosts in a cluster. The free version I've been using for tests caps storage at 128GB and w/b cache at 512MB. It pretty much takes storage on one host, mirrors it to equal storage on the second host, and presents it back to both as an iSCSI target you can add as a CSV.

Very easy to set up, and VMs can live migrate or fail over to either host with ease. I've got a VM running Exch2013 and a DC: pull the plug on the primary host with stress tests running and they're both humming along and responding to the live network in under 120 secs. It eats up disk I/O and network bandwidth, as would be expected, but to a tolerable level with only 128GB on a couple of disks and two dual-NIC teams. Throw some Veeam in there and it almost seems too good to be true. So I must be missing something.

The software is rather expensive (but cheap compared to alternative means of shared storage, a second storage cluster, etc.).

Anyone use StarWind in production? After dozens of soft and hard failovers, live migrations, etc., I see no corruption or data errors to speak of...

An RTO of 30 minutes is doable on a LAN with Veeam, depending on the hardware and the size of the guests being replicated. If you set the servers to replicate every half hour, you will have to set up Veeam proxies between the hosts in order to throttle replication, as opposed to letting it consume all your resources. With your FC link, backups should be quick.

Maybe Veeam can give you a trial in order to see if this scenario will work for you? They're a great company to work with.

Best Regards,

James Braunstein

I've seen someone do an RPO of 3 minutes with Veeam over the WAN (OK, well, they had a GigE point-to-point link and some massive HDS SANs). I wouldn't recommend trying to hit that RPO (it was averaging a snap on a VM every minute, was rolling through multiple jobs, and was abusive on storage), but it did work. Again, this isn't a suggested config, but Veeam with beastly amounts of hardware/pipes can do some darn near CDP configurations.

My understanding is that StarWind is the most popular VSA product on the market for Hyper-V. Are you trying their new log-structured file system yet?

So what you are suggesting is not to use Replica, so that I can do live migration for maintenance, and to use Veeam for replication/backup for disaster recovery?

David

For 4TB at 10Gbps with ~10% overhead factored in, you're looking at an hour to migrate everything using shared nothing. Now the fun part is getting that much throughput out of your disk subsystem. If the disks are busy and under load for IOPS, you'll get a fraction of that (remember, putting a bunch of VMs on a host is an IO blender, so those nice streaming reads become random when the VMs are fighting each other for IO). If you're using NL-SAS this is going to be miserable.

Given the added headache of trying to calculate how many spare IOPS/disks you need to feed this system so you can get migrations to finish within an hour, (and this will scale equally poorly) I'd argue that shared nothing isn't really that great of a feature.
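As a rough check on the hour figure above, here is a minimal Python sketch of the arithmetic. It assumes decimal units (1 TB = 10^12 bytes) and treats the ~10% overhead as lost link bandwidth. The function name and the 200 MB/s disk figure are hypothetical, chosen only to show how a busy disk subsystem, rather than the link, can set the migration time.

```python
def migration_hours(size_tb, link_gbps, overhead=0.10, disk_mb_s=None):
    """Estimate hours needed to copy size_tb terabytes between hosts.

    overhead  -- fraction of link bandwidth lost to protocol overhead
    disk_mb_s -- optional sustained disk throughput in MB/s; if given,
                 the copy runs at the slower of link and disk
    Decimal units: 1 TB = 1e12 bytes, 1 Gbps = 1e9 bits/s.
    """
    size_bytes = size_tb * 1e12
    # Usable link rate in bytes per second after overhead.
    rate = link_gbps * 1e9 * (1 - overhead) / 8
    if disk_mb_s is not None:
        # The disks may not be able to feed the link at full speed.
        rate = min(rate, disk_mb_s * 1e6)
    return size_bytes / rate / 3600


# Link-limited case from the post: 4 TB over 10 Gbps with ~10% overhead.
print(f"{migration_hours(4, 10):.2f} h")                  # 0.99 h
# Hypothetical disk-limited case: disks sustain only 200 MB/s under load.
print(f"{migration_hours(4, 10, disk_mb_s=200):.2f} h")   # 5.56 h
```

The second figure is only an illustration of the "fraction of that" point: the real number depends on how many spare IOPS the disks have while the VMs are running.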

Like John773 says, VMware is very embedded with a lot of folk and is by no means rocket science to learn.

I too have used a range of VM options, and VMware for me is by far the easiest to learn and understand. At the OS level, it also requires fewer host patches than a Windows OS.

Hyper-V will require more patches in the long run, and therefore more reboots, than an ESXi host.

Install consists of selecting the USB stick/SD card you want to install to and mashing enter a lot. Set an IP/Password and you've got a host!

Exactly. "click, return, return, finish"

On the OP's original statement about shared storage being the single point of failure: it is far less of a single point than a server. Put in a good UPS (or two) and you're pretty resilient. Split the array between cabinets as well, on different electrical phases, and you're as good as you can be without replicating storage.

In all the times I've been using VMware with SAN/NAS, I've not seen the disks be the point of failure. Split your array between cabinets and even if you do have a failure, it's only selected machines on those disks that are affected.

I would certainly look to see what shared storage you can get; this will be a much better option than shared nothing.

Replica is for unplanned failover. My understanding is that there will be downtime with a failover from Replica. I would like to use live migration for host maintenance (no downtime).

I have considered VMware but really would like to stick with a MS-only solution. Maybe in the US VMware knowledge and consultants are a dime a dozen, but in the Dominican Republic they aren't plentiful or cheap. Besides, I need my staff to be self-sufficient and feel that it would be easier with the MS path. We have already started the process with 2008 R2 virtualization and are ready to continue into a DR configuration.

As for Rod-IT, I will have the data replicated between the two hosts, so no single point of failure.

So back to the question, if I am replicating, is it trivial to remove the replication and proceed with a Shared nothing Live Migration (for planned maintenance)? Of course, I will have a fresh backup of VM before removing the replica.

Whether replicating between the two hosts really removes the single point of failure still depends on whether they are on the same electrical phase or UPS, whether there is a backup generator, and so on.

As John pointed out earlier, this is going to be IOPS-intensive and network-hungry, not to mention time-consuming.

Just to give you an idea of the time these things can take: I recently did some storage migrations on our VMware farm, from one SAN to another. A ~600GB system, moved over an 8Gbit fibre link while in use (along with other systems still running), took about 3 hours. This was a thick-provisioned system, so it's a full 600GB to copy.

I hope if nothing more this gives you an idea of the time replications / migrations can take - it's just an example, and speed of copy/impact on systems are all down to your own setup and specs.
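To put the example above in numbers, the sketch below works out the effective copy rate from the figures given (~600GB in about 3 hours over an 8Gbit link) and then, purely as an illustration, extrapolates that rate to the 4TB per host mentioned at the start of the thread. The function name is hypothetical, decimal units are assumed, and the extrapolation assumes the same effective rate on different hardware, which is exactly the caveat stated above.

```python
def effective_rate(size_gb, hours, link_gbps):
    """Return (MB/s achieved, fraction of the link actually used)."""
    seconds = hours * 3600
    bytes_per_s = size_gb * 1e9 / seconds
    mb_per_s = bytes_per_s / 1e6
    # Compare achieved bits per second against the nominal link rate.
    utilization = bytes_per_s * 8 / (link_gbps * 1e9)
    return mb_per_s, utilization


mb_per_s, used = effective_rate(600, 3, 8)
print(f"{mb_per_s:.1f} MB/s, {used:.1%} of the link")   # 55.6 MB/s, 5.6% of the link

# Illustration only: time to move 4 TB at that same effective rate.
hours_for_4tb = 4e12 / (mb_per_s * 1e6) / 3600
print(f"{hours_for_4tb:.0f} h for 4 TB")                # 20 h for 4 TB
```

The link was nowhere near saturated in that migration, which fits the earlier point that busy disks, not the pipe, tend to set the pace.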

I am coming to the conclusion that I need to bite the bullet and justify the SAN, and have a DAS storage failover configuration in case the SAN goes down (I certainly can't afford a second SAN). That would allow the cluster, live migration with shared storage, etc. Oh well, I tried to do it on the cheap... off to fight the budget battle.