High Availability for a file share using WSFC, ILB and 3rd-party Software SIOS Datakeeper

While a Windows “Shared Disk Failover Cluster” is not yet supported for Microsoft Azure Virtual Machines, the 3rd-party software SIOS Datakeeper can be used as an alternative: http://us.sios.com/products/datakeeper-cluster/ . As a sample use case, this post describes how to make a file share highly available. All the information needed to set it up is available on the Internet; the idea of this post is simply to put everything together. Before going into the details, I would like to start with a high-level overview of the approach:

Figure 1

The key components of the tested solution setup are:

A domain controller VM and two VMs which represent the failover cluster

HA for the domain controller is NOT part of this post! The focus is purely on the high availability implementation for the file share (not the whole cluster end-to-end)

For simplicity, a file share witness on the domain controller VM was used as the cluster quorum configuration. As mentioned above, the DC VM has no HA, which means that the file share witness is gone once the DC VM is stopped

All three VMs are part of one Azure vnet. The domain controller VM is in one cloud service and the two cluster node VMs are in another, to get a clean separation of the Internal Load Balancer configuration, which is done at the cloud service level

To ensure high availability for the file share, both cluster nodes must be placed in an Azure availability set so that the two VMs do not go down at the same time (e.g. during Azure maintenance)

WSFC was used to configure a shared disk failover cluster between the two cluster node VMs and to finally provide the highly available file share

The Azure Internal Load Balancer (ILB) allows access to the file share via the virtual name of the file server role created on the failover cluster

The “trick” then is to replace the IP address of the file server role with the IP address of the ILB

Now the question is, of course: how should all this work, given that there is no shared disk available for Azure Virtual Machines as there is for Hyper-V on-premises? The answer is: SIOS Datakeeper

Figure 2

Using SIOS Datakeeper, one can create a so-called “mirror” between two volumes (data disks attached to the VMs) via synchronous replication:

When the mirror between the two volumes (Azure data disks) is created, SIOS Datakeeper adds this mirror as storage to the cluster configuration

To the Failover Cluster Manager it then looks like a shared disk

As mentioned before, the ILB allows access to the file share via the file server role name and always routes to the active cluster node

Prerequisites:

All tests were done with Windows Server 2012 R2

SIOS Datakeeper version was 8.2

SIOS Datakeeper installation requires .NET Framework 3.5. I ran into a problem which, at the time of my testing, required a fix in the form of a small exe downloaded from the following KB article: http://support2.microsoft.com/kb/3005628 . Recent tests showed that this is no longer an issue in the latest WS2012 R2 Azure Gallery image from October

For my internal testing, I finally created my own WS2012 R2 OS image in which I installed .NET Framework 3.5 as well as the fix from the third item above. The other two bug fix examples were not necessary or applicable. To create the private OS image I followed this guidance: https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-capture-image-windows-server/ . Then I created all my test VMs from this private image. If no bug fixes are required, it’s of course perfectly fine to just use a standard Azure Gallery OS image

I used a dedicated vnet for my test environment. You have to take care to create the vnet the right way to make ILB work. In the past, vnets were associated with an affinity group. This is no longer the case: http://msdn.microsoft.com/en-us/library/azure/jj156085.aspx . To set up ILB, the new regional type is required. For newly created vnets everything is fine.

One has to enter the location (Azure region) when creating the vnet via the portal. For old existing vnets there is an option to change the vnet setting from affinity group to regional by modifying the network config. I tested this myself, and it is not possible while there are VMs in the vnet. For my test, I removed the VMs (keeping all disks), exported the VM configs and imported them again. Here is an article about this config change: https://azure.microsoft.com/blog/2014/05/14/regional-virtual-networks/

Then I set up a domain controller VM and the two cluster node VMs in two different cloud services. This is a clean setup that avoids potential side effects related to the ILB configuration, as the ILB is configured per cloud service.

The Datakeeper screen shows that the primary (source) is currently cluster node 1 and the secondary (target) is cluster node 2

Figure 5

Looking at the file system of cluster node 1, you will see a file share on volume S:

This is the share which should become highly available

Figure 6

The Datakeeper volume is visible in the failover cluster manager and allows the creation of a file server role

Under “Shares” within the file server role, one can see the file share that we looked at in File Explorer in figure 5

Figure 7

Checking the second cluster node, it turns out that the replicated volume S: is visible, e.g. in File Explorer, but cannot be accessed

SIOS Datakeeper makes sure that the replicated volume can only be accessed on the current owner node

Figure 8

The file share created in the cluster role can be accessed from the domain controller, as expected, by the virtual name: “\\fshafsrole\fsha_share”

Figure 9

Now we start a manual failover of the file server role from cluster node 1 to cluster node 2

Figure 10
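The same manual failover can also be triggered from PowerShell instead of the Failover Cluster Manager GUI. The following is only a minimal sketch using the standard FailoverClusters cmdlets. It assumes that the cluster role (group) has the same name as the virtual name "fshafsrole" used to reach the share; that role name is an assumption and has to be adjusted to the actual setup.

```powershell
# Sketch only: run in an elevated PowerShell session on one of the cluster nodes.
Import-Module FailoverClusters

# Assumption: the cluster role (group) is named like the virtual name of the
# file server role. Adjust if the role has a different name.
$roleName   = "fshafsrole"
$targetNode = "fsha-cln2"   # second cluster node, as named in this post

# Show the current owner node before the move.
Get-ClusterGroup -Name $roleName | Format-Table Name, OwnerNode, State

# Move the file server role to the second node (manual failover).
Move-ClusterGroup -Name $roleName -Node $targetNode

# Show the owner node again after the move.
Get-ClusterGroup -Name $roleName | Format-Table Name, OwnerNode, State
```

After the move, the owner node reported by Get-ClusterGroup should be the second cluster node, matching what the Failover Cluster Manager displays.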

The Failover Cluster Manager on the second node (Azure VM fsha-cln2) shows that the owner node changed to fsha-cln2

SIOS Datakeeper also switched source and target server

Now it’s possible to access the file share in File Explorer on the second node, which didn’t work before

Access from the domain controller VM also still works as expected

Figure 11

Right-clicking the file server role within the Failover Cluster Manager allows one to set appropriate permissions for the file share access

Figure 12

As mentioned in the overview section, it’s not enough to use the Datakeeper mirror like a shared disk for the failover cluster

Another “workaround” is necessary for access to the file share via the name of the file server role

This was achieved with the help of the Azure Internal Load Balancer (ILB)

The screenshot shows that an Internal Load Balancer with the name “ilbfsha” was created on the cloud service of the cluster nodes

The IP address of the Load Balancer is 10.0.0.99

Figure 13
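As a rough illustration, an ILB like the one in the screenshot could be created with the classic (Service Management) Azure PowerShell cmdlets. This is a sketch under assumptions, not the exact commands from the lab: the cloud service name and the subnet name are hypothetical placeholders, while the ILB name "ilbfsha" and the IP address 10.0.0.99 are the values mentioned above.

```powershell
# Sketch only: classic (Service Management) Azure PowerShell module, run from a
# workstation that is already connected to the Azure subscription.

$serviceName = "fsha-cluster-svc"  # hypothetical: cloud service of the two cluster nodes
$subnetName  = "Subnet-1"          # hypothetical: subnet of the vnet used by the cluster nodes
$ilbName     = "ilbfsha"           # ILB name used in this post
$ilbIp       = "10.0.0.99"         # ILB IP address used in this post

# Create the Internal Load Balancer on the cloud service with a static vnet IP.
Add-AzureInternalLoadBalancer -ServiceName $serviceName `
    -InternalLoadBalancerName $ilbName `
    -SubnetName $subnetName `
    -StaticVNetIPAddress $ilbIp

# List the ILB to confirm its name and IP address.
Get-AzureInternalLoadBalancer -ServiceName $serviceName
```

The static IP address has to be a free address inside the chosen subnet, because it later becomes the address of the file server role.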

Checking the “Resources” tab in the Failover Cluster Manager shows that the file server role does in fact have the IP address of the ILB: 10.0.0.99

This is the “trick” that finally makes the whole setup work. The file server role was originally created with a different IP address

Once the file server role was created, its IP address was replaced by the IP address of the ILB via a PowerShell command. The ILB Configuration section further down describes how this was done. This is not just a simple IP address change that could be done in the Failover Cluster Manager GUI

Figure 14

The PS command “Get-AzureEndpoint” shows that two ILB-related endpoints were created for cluster node 1 (the same applies to cluster node 2)

The local ports are 443 and 445, and the so-called “probe port” is 59999
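The endpoints described here can be sketched with the classic Azure PowerShell cmdlets in the same way as for a SQL Server AlwaysOn listener. The ports 443 and 445, the probe port 59999, the ILB name "ilbfsha" and the VM name fsha-cln2 come from this post. The cloud service name, the node name fsha-cln1, the endpoint and LB set names, the probe interval and the use of direct server return are assumptions borrowed from the AlwaysOn listener pattern, not confirmed lab settings.

```powershell
# Sketch only: classic (Service Management) Azure PowerShell module.

$serviceName = "fsha-cluster-svc"          # hypothetical cloud service name
$ilbName     = "ilbfsha"                   # ILB name used in this post
$nodes       = @("fsha-cln1", "fsha-cln2") # fsha-cln1 is an assumed name for node 1
$ports       = @(443, 445)                 # local ports mentioned in this post

foreach ($node in $nodes) {
    foreach ($port in $ports) {
        # Add one load-balanced endpoint per port, bound to the ILB.
        # Endpoint/LB set names, probe interval and DirectServerReturn are
        # assumptions following the SQL Server AlwaysOn listener pattern.
        Get-AzureVM -ServiceName $serviceName -Name $node |
            Add-AzureEndpoint -Name "fsha-ep-$port" `
                -LBSetName "fsha-lbset-$port" `
                -Protocol tcp `
                -LocalPort $port `
                -PublicPort $port `
                -ProbePort 59999 `
                -ProbeProtocol tcp `
                -ProbeIntervalInSeconds 10 `
                -InternalLoadBalancerName $ilbName `
                -DirectServerReturn $true |
            Update-AzureVM
    }

    # List the endpoints of the node to verify ports and probe port.
    Get-AzureVM -ServiceName $serviceName -Name $node |
        Get-AzureEndpoint |
        Format-Table Name, LocalPort, Port, ProbePort, InternalLoadBalancerName
}
```

The probe port is what lets the ILB find out which node currently owns the file server role, so it has to match the probe port configured on the cluster IP address resource.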

ILB Configuration:

Like the setup of a domain controller VM or a failover cluster on Azure, the information on how to configure the Azure Internal Load Balancer (ILB) can be found on the Internet. What the highly available file share needs from the ILB is in fact the same as what SQL Server AlwaysOn requires. Adding the ILB to an Azure cloud service and adding the endpoints 443 and 445 to the VMs via Azure PowerShell is trivial and exactly the same as for SQL Server AlwaysOn. The only part that needs a bit of attention is the setting of the file server role IP address. In the lab environment, the file server role was originally created with IP address 10.0.0.100. In the end, it had to be replaced by the ILB IP address 10.0.0.99. This can be accomplished with the same Set-ClusterParameter command that is described for the SQL Server AlwaysOn availability group listener. In the file share lab setup, the Azure PowerShell commands were run locally on-premises.
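The IP address replacement can be sketched as follows. This is not a verbatim copy of the lab commands but a sketch modeled on the Set-ClusterParameter call documented for the SQL Server AlwaysOn listener behind an ILB. The IP address 10.0.0.99 and the probe port 59999 are taken from this post. The IP resource name, the cluster network name, the 255.255.255.255 subnet mask and the remaining parameter values are assumptions that follow that documented pattern. Unlike the Azure cmdlets, this part runs on one of the cluster nodes.

```powershell
# Sketch only: run in an elevated PowerShell session on one of the cluster nodes.
Import-Module FailoverClusters

# List resources and networks to find the real names for the two
# placeholders below.
Get-ClusterResource | Format-Table Name, ResourceType, OwnerGroup, State
Get-ClusterNetwork  | Format-Table Name, Address, AddressMask

$ipResourceName     = "IP Address 10.0.0.100"  # hypothetical name of the role's IP address resource
$clusterNetworkName = "Cluster Network 1"      # hypothetical cluster network name
$ilbIp              = "10.0.0.99"              # ILB IP address used in this post

# Replace the role's IP address with the ILB address and set the probe port.
# The subnet mask and the last two parameters are assumptions that follow the
# AlwaysOn listener pattern.
Get-ClusterResource -Name $ipResourceName | Set-ClusterParameter -Multiple @{
    "Address"              = $ilbIp
    "ProbePort"            = "59999"
    "SubnetMask"           = "255.255.255.255"
    "Network"              = $clusterNetworkName
    "OverrideAddressMatch" = 1
    "EnableDhcp"           = 0
}

# Show the stored parameters of the IP address resource.
Get-ClusterResource -Name $ipResourceName | Get-ClusterParameter
```

The new parameters typically only take effect after the IP address resource has been taken offline and brought online again, so a restart of the file server role should be planned after the change.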

Miscellaneous

To make my life easier when setting up the test environment, I created an Azure PS script which helped me to create a domain controller VM and also the two cluster node VMs, including domain join and so on. It’s NOT an official Microsoft utility. The emphasis was neither on programming style nor on security. It was just about functional testing of SIOS Datakeeper. The idea was to automate, as far as possible, all the steps up to the point where one has to install SIOS Datakeeper. There were a few challenges to overcome, and I decided to share the findings and the PS code. The whole SIOS Datakeeper project is related to SAP HA on Azure Virtual Machines. Therefore I will publish this Azure PS sample in about two weeks on our Microsoft SAP Engineering Team blog: Running SAP Applications on the Microsoft Platform