About a year ago I wrote a whitepaper about vCloud Director resiliency, or better said I developed a disaster recovery solution for vCloud Director. This solution allows you to fail-over vCloud Director workloads between sites in the case of a failure. Immediately after it was published various projects started to implement this solution. As part of our internal project our PowerCLI gurus Aidan Dalgleish and Alan Renouf started looking into automating the solution. Those who read the initial case study have probably seen the manual steps required for a fail-over; those who haven’t, read this white paper first…

The manual steps in the vCloud Director Resiliency whitepaper are exactly what Alan and Aidan addressed. So if you are interested in implementing this solution then it is useful to read this new white paper about Automating vCloud Director Resiliency as well. Nice work Alan and Aidan!

One of the white papers I worked on in 2012 when I was part of Technical Marketing was just published. This white paper is about VMware View infrastructure resiliency. It is a common question from customers, and now with this white paper you can explore the different options and understand the impact of these options. Below is a link to the paper and the description it has on the VMware website.

When I was playing with the new vCloud Director 5.1 and Site Recovery Manager 5.1 I figured I would record a demo of the DR solution that Chris Colotti and I developed. The demo is fairly straightforward and hopefully helps you in the process of building a resilient cloud infrastructure. In this demo I have included:

I just received a note that the DR paper for vCloud Director is finally available in both epub and mobi formats. So if you have an e-reader make sure to download one of these formats, as it will render a lot better than a generic PDF!

Description: vCloud Director disaster recovery can be achieved through various scenarios and configurations. This case study focuses on a single scenario as a simple explanation of the concept, which can then easily be adapted and applied to other scenarios. In this case study it is shown how vSphere 5.0, vCloud Director 1.5 and Site Recovery Manager 5.0 can be implemented to enable recoverability after a disaster.

I know some of you have been waiting for this so I wanted to share some early results. I was in the UK last week and we managed to get an environment configured using persistent linked clone virtual desktops with View. We also managed to fail-over and fail-back desktops between two datacenters. The concept is really similar to the vCloud Director DR concept.

In this scenario Site Recovery Manager will be leveraged to fail-over all View management components. In each of the sites it is required to have a management vCenter Server and an SRM Server, which aligns with standard SRM design concepts. Since it is difficult to use SRM for View persistent desktops, there is no requirement to have an SRM environment connecting to the View desktop cluster’s vCenter Server. In order to facilitate a fail-over of the View desktops a simple mount of the volume is done. This could be done using ‘esxcfg-volume -m’ for VMFS, or for NFS by pointing a DNS CNAME alias to the secondary NAS server and then mounting the share.

What would the architecture look like? This is an oversimplified architecture, of course … but I just want to get the message across:

What would the steps be?

Fail-over View management environment using SRM

Validate all View management virtual machines are powered on

Using your storage management utility break replication for the datastores connected to the View Desktop Cluster and make the datastores read/write (if required by storage platform)

Mask the datastores to the recovery site (if required by storage platform)

Using ESXi command line tools mount the volumes of the View Desktop Cluster on each host of the cluster

esxcfg-volume -m ;
or

point the DNS CNAME to the secondary NAS server and mount the NAS datastores

Validate all volumes are available and visible in vCenter, if not rescan/refresh the storage

Take the hosts out of maintenance mode for the View Desktop Cluster (or add the hosts to your cluster, depending on the chosen strategy)

In our tests the virtual desktops were automatically powered on by vSphere HA. vSphere HA is aware of the situation before the fail-over and will power-on the virtual machines according to the last known state
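The steps above can be sketched as a small runbook generator. This is only an illustration of the ordering and of the "if required by storage platform" branches, not a tested automation of the procedure. The names `Datastore`, `mount_commands` and `failover_plan`, as well as the host names and volume labels in the usage note, are hypothetical; the only real command referenced is `esxcfg-volume -m`, and the sketch assumes a volume label is passed to it. Nothing is executed against a host, the code only builds a list of steps to review.

```python
import shlex
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Datastore:
    name: str  # hypothetical VMFS volume label or NFS datastore name
    kind: str  # "vmfs" or "nfs"


def mount_commands(hosts: List[str], datastores: List[Datastore]) -> List[Tuple[str, str]]:
    """Return (host, command) pairs: one VMFS mount per host and volume."""
    commands = []
    for host in hosts:
        for ds in datastores:
            if ds.kind == "vmfs":
                # Assumption: the volume label is a valid argument to -m.
                commands.append((host, f"esxcfg-volume -m {shlex.quote(ds.name)}"))
    return commands


def failover_plan(
    hosts: List[str],
    datastores: List[Datastore],
    break_replication: bool = True,  # depends on the storage platform
    mask_datastores: bool = True,    # depends on the storage platform
) -> List[str]:
    """Build the ordered list of fail-over steps described in the article."""
    steps = [
        "Fail-over View management environment using SRM",
        "Validate all View management virtual machines are powered on",
    ]
    if break_replication:
        steps.append("Break replication and make the datastores read/write")
    if mask_datastores:
        steps.append("Mask the datastores to the recovery site")
    # VMFS volumes are mounted on every host of the View Desktop Cluster.
    for host, command in mount_commands(hosts, datastores):
        steps.append(f"[{host}] {command}")
    # NFS datastores follow the DNS CNAME route instead.
    if any(ds.kind == "nfs" for ds in datastores):
        steps.append("Point the DNS CNAME to the secondary NAS server and mount the NAS datastores")
    steps.append("Validate all volumes are visible in vCenter, rescan/refresh if not")
    steps.append("Take the View Desktop Cluster hosts out of maintenance mode")
    steps.append("Confirm vSphere HA has powered on the virtual desktops")
    return steps
```

Calling `failover_plan(["esx01", "esx02"], [Datastore("view-ds-01", "vmfs")])` returns the steps in the order listed above, with one mount line per host. Keeping the plan as plain text makes it easy to review before anyone runs a command by hand.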

These steps have been validated this week and we managed to successfully fail-over our desktops and fail them back. Keep in mind that we only did these tests two or three times, so don’t consider this article to be a support statement. We used persistent linked clones as that was the request we had at that point, but we are certain this will work for various other scenarios. We will extend our testing to include those scenarios.