Over the last several years, I’ve had the honor of working with a talented and focused engineer named Brian Macbeth. Brian has spent a bit of time working with Citrix’s new Linux VDA technology and has run into a number of limitations around roaming profiles and home directories, which he has resolved using the processes described below. Brian has graciously volunteered to share his experiences with the extended virtualization community in an attempt to further advance the use of Linux VDA in enterprise organizations. Without further ado, below is the guest blog post by Brian Macbeth; feel free to comment below!

Configuring a High Availability NFS Server Cluster

Welcome to part 2 of the Citrix Linux VDA Centralized Home Directories blog post series. In this post, I will describe how to stand up a highly available, two-node Linux NFS server cluster using a shared iSCSI LUN connected via a software initiator.

Installing a Linux HA cluster for the first time can be a challenging endeavor; while there is plenty of information on the web, much of it is fragmented or dated, which can make the build frustrating. This post should set you on the path to successfully implementing highly available, centralized NFS home directories.

I highly recommend getting comfortable with managing a Linux cluster in a non-production environment before moving into production. A clustered Linux NFS server failover is not nearly as seamless as an SMB/CIFS share failover on a Windows failover cluster. As such, you will need to understand how the NFS client “pause” during failovers will impact the user community, and tune the NFS client recovery and timeout settings to best meet your environment’s needs and expectations. The NFS man page should be your new friend.
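For illustration, the main client-side tuning knobs are NFS mount options such as hard, timeo, and retrans (all documented in the nfs man page). Below is a hedged sketch of an /etc/fstab entry on a Linux VDA client; the export path and option values are placeholders built from the Environment section further down, not recommendations, and should be tuned and tested for your environment:

# Illustrative NFS client mount; timeo is in tenths of a second, retrans is the retry count before a "server not responding" event
vdinfs01.demo.private:/nfsmount  /home  nfs  hard,vers=4.1,timeo=150,retrans=3,_netdev  0 0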

Environment

Operating System: CentOS 7, domain joined

Node 1 FQDN/IP: vdinfs01a.demo.private / 10.1.10.80

Node 2 FQDN/IP: vdinfs01b.demo.private / 10.1.10.81

Cluster Virtual IP: vdinfs01.demo.private / 10.1.10.82

iSCSI Target IP/LUN: 10.1.10.10, LUN size 50 GB

Volume Group Name: volgrp1

Logical Volume Name: nfshome_vol1

NFS Mount Folder: nfsmount

Cluster Node Base Configs

Perform the following configurations on both cluster nodes.

Host File Configuration

I originally had the hosts file configured the same way I had configured it for the Linux VDAs, which caused intra-cluster communication problems between the nodes. It turns out that if the loopback address is associated with the hostname, Corosync binds to the loopback address and then cannot reach the other node. At a bare minimum, you only need to list the local node in its respective hosts file; however, listing both nodes ensures name resolution in the event of any DNS server connectivity issues.

Edit /etc/hosts using your editor of choice so that it looks like this:
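A minimal sketch, using the node names and addresses from the Environment section above (your file may differ), keeps the loopback entries free of the node hostname:

# Loopback entries must not reference the node hostname, or Corosync may bind to 127.0.0.1
127.0.0.1     localhost localhost.localdomain
::1           localhost localhost.localdomain
10.1.10.80    vdinfs01a.demo.private vdinfs01a
10.1.10.81    vdinfs01b.demo.private vdinfs01b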

Run the following command to validate that the new resources are configured and started:

pcs status

Run the following command to prevent resources from moving after node recovery:

pcs resource defaults resource-stickiness=100

Configure Fencing

Fencing is the cluster function of isolating a malfunctioning node with STONITH (Shoot The Other Node In The Head) to take it offline. Since the cluster nodes are virtual machines and we don’t have access to power controllers or out-of-band management cards such as iLO or DRAC, we will use fence_scsi and associate it with the shared iSCSI LUN.

Run the following command to find the iSCSI device by WWN:

ls /dev/disk/by-id/wwn-*

Build your fencing configuration by including the cluster node names and the WWN:
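A minimal sketch of that configuration is below; substitute the WWN path returned by the previous command for the placeholder, make sure the names in pcmk_host_list match the node names shown by pcs status (the FQDNs from the Environment section are assumed here), and note that the resource name iscsi-stonith is arbitrary:

pcs stonith create iscsi-stonith fence_scsi pcmk_host_list="vdinfs01a.demo.private vdinfs01b.demo.private" devices=/dev/disk/by-id/wwn-0x<your-LUN-WWN> meta provides=unfencing

The meta provides=unfencing attribute is required for storage-based fence agents such as fence_scsi, so that a fenced node is explicitly unfenced (re-registered with the LUN) before it rejoins the cluster.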