Introduction to the Cluster Quorum Model

Failover clusters ensure that workloads such as File Server, Exchange, SQL, and Virtual Machines are highly available. These resources are considered highly available as long as the nodes that host them are up. However, the cluster generally requires more than half of the nodes to be running, which is known as having “quorum”.

Quorum is designed to prevent “split-brain” scenarios, which can happen when there is a partition in the network and subsets of nodes cannot communicate with each other. This can cause both subsets of nodes to try to own the workload and write to the same disk, which can lead to numerous problems. The Windows Server Failover Clustering quorum model prevents this by requiring a group to hold a majority of the votes in order to keep running, so only one of these groups will stay online.
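As a rough illustration of the majority rule described above, the following Python sketch checks which side of a network partition would keep running. The function name and node names are hypothetical and only model the arithmetic; this is not how the cluster service itself is implemented, and it assumes each node holds exactly one vote with no witness.

```python
def partitions_with_quorum(total_nodes, partitions):
    """Return the partitions that hold a strict majority of all nodes.

    Assumption: one vote per node, no disk or file share witness.
    """
    needed = total_nodes // 2 + 1  # more than half of the nodes
    return [group for group in partitions if len(group) >= needed]


# A 5-node cluster split 3 / 2 by a network partition (hypothetical names).
split = [["N1", "N2", "N3"], ["N4", "N5"]]
survivors = partitions_with_quorum(5, split)
print(survivors)

# A 4-node cluster split evenly: neither side has more than half.
even_split = [["N1", "N2"], ["N3", "N4"]]
print(partitions_with_quorum(4, even_split))
```

Because at most one group can ever hold more than half of the votes, two groups can never both stay online and write to the same disk.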

If the cluster loses quorum, all the nodes of the cluster will go offline, and any hosted resources will go offline with them. Choosing the right quorum mode while deploying your cluster will help ensure that the cluster stays up for as long as possible while sustaining node, disk or network failures.

Terminology

Here is some relevant terminology:

· Disk Witness Resource – A clustered disk can contribute towards the cluster’s quorum. This disk resides in the cluster group. Besides providing a vote for the quorum, this resource serves two other critical functions.

o Stores a constantly-updated version of the cluster database. This allows the cluster to maintain its state and configuration independently of individual node failures, which ensures that nodes will always have the most up-to-date copy of the database.

· File Share Witness (FSW) Resource – A file share accessible by all nodes of the cluster can contribute to the cluster’s quorum. Besides providing a vote for the quorum, it also helps with the “split-brain” scenario. However, a file share witness doesn’t contain the cluster database.

· Vote – The quorum calculation is based on votes. Cluster nodes, disk witness resources and file share witness resources may have a vote based on the quorum configuration. The table in the next section shows the relationship between quorum mode and votes.

Quorum Model Description

The following table describes the different quorum modes available since Windows Server 2008.

Quorum Mode                     | Components that have a vote                            | Votes needed for quorum (v = total votes; v/2 rounded down)
Node Majority                   | Nodes (1 per node)                                     | v/2 + 1
Node and Disk Majority          | Nodes (1 per node) and Disk Witness Resource (1)       | v/2 + 1
Node and File Share Majority    | Nodes (1 per node) and File Share Witness Resource (1) | v/2 + 1
No Majority: Disk Only (Legacy) | Disk Witness Resource (1)                              | v

It is recommended to have an odd number of total votes in the cluster, since quorum requires more than half of the votes to be online. If I have a 4-node cluster and give a vote only to each node, for 4 total votes, I need 3 nodes to stay running to maintain quorum with more than half of the votes. This means I can sustain only a single node failure. However, by assigning a disk or FSW a 5th vote, I still need 3 votes to maintain quorum, but I can now sustain two node failures instead of one. So by adding an extra vote through a disk or file share, instead of requiring the purchase of an additional node, Failover Clustering can offer higher availability at a much lower cost.
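The arithmetic in the 4-node example can be checked with a small Python sketch. The helper names are hypothetical, and the calculation assumes the witness (disk or FSW) itself stays online while nodes fail.

```python
def votes_needed(total_votes):
    """Votes required for quorum: more than half of the total."""
    return total_votes // 2 + 1


def node_failures_tolerated(node_count, witness=False):
    """Number of node failures the cluster can sustain and keep quorum.

    Assumption: each node has one vote, the optional witness has one
    vote, and the witness remains reachable throughout.
    """
    total_votes = node_count + (1 if witness else 0)
    return total_votes - votes_needed(total_votes)


# 4 nodes, 4 votes: 3 votes needed, 1 node failure tolerated.
print(votes_needed(4), node_failures_tolerated(4))

# 4 nodes plus a witness, 5 votes: still 3 needed, 2 failures tolerated.
print(votes_needed(5), node_failures_tolerated(4, witness=True))
```

Going from 4 to 5 votes leaves the required count at 3, which is why the extra witness vote buys one more tolerated node failure.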

Default Configuration

When the cluster is first created, the most appropriate quorum mode is automatically assigned, based on the number of nodes and the available cluster storage. This can always be changed, as described in the next section. The cluster will attempt to configure quorum so that there is always an odd number of votes. If there is an odd number of nodes, the cluster will select Node Majority as the quorum type to keep the odd number of votes. If there is an even number of nodes and there are disks in Available Storage, the cluster will select Node and Disk Majority, giving a disk a single vote, so that there is an odd number of total votes. If there is an even number of nodes but no Available Storage, the cluster will select Node Majority and issue a warning message. The cluster will never select Node and File Share Majority, since it requires additional configuration, and it will never select No Majority: Disk Only, which is not recommended because the disk is a single point of failure.
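The default selection rules described above can be summarized as a short decision function. This is only a sketch of the logic as stated in this article; the function name, its parameters and the returned tuple are hypothetical and do not correspond to any actual cluster API.

```python
def default_quorum_mode(node_count, has_available_storage):
    """Return (quorum_mode, warning_issued) following the rules above.

    Hypothetical helper: models the article's description only.
    """
    if node_count % 2 == 1:
        # Odd number of nodes: node votes alone are already odd.
        return ("Node Majority", False)
    if has_available_storage:
        # Even number of nodes: a disk witness adds the odd vote.
        return ("Node and Disk Majority", False)
    # Even number of nodes and no disk available: fall back, with a warning.
    return ("Node Majority", True)


print(default_quorum_mode(3, has_available_storage=True))
print(default_quorum_mode(4, has_available_storage=True))
print(default_quorum_mode(4, has_available_storage=False))
```

Note that neither Node and File Share Majority nor No Majority: Disk Only appears as a possible result, matching the statement that the cluster never picks those modes by default.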

Configuring Quorum

In Failover Cluster Manager, the quorum configuration can be changed through the Configure Cluster Quorum Wizard. The wizard can be reached by right-clicking on the cluster name, selecting “More Actions…” and then “Configure Cluster Quorum Settings…”

Once “Configure Cluster Quorum Settings…” is selected, the Configure Cluster Quorum Wizard appears. It will recommend the best configuration for you based on the number of nodes and Available Storage, and inform you about the number of failures you can sustain. In the snapshot below, Node Majority quorum is recommended because it’s a 3-node cluster, which will be explained later.

This worked well for me. I went from a cluster that had 2 nodes to 3, and upon adding the 3rd node the wizard warned me to "remove" the quorum. This was not a very accurate message, so after some searching I found this article to be helpful and I re-ran the "Configure Cluster Quorum Wizard". It suggested changing to "Node Majority" from the mode it had been in for just 2 nodes, "Node and Disk Majority".

Jalith

13 Nov 2011 8:56 PM

Hi, can I configure 2 FSWs in a cluster (from a 3rd and 4th location)? This is not to increase the votes but just to have high availability at the share level. If the connection to site 3 is lost, the site 4 FSW provides a vote, and if the connection to site 4 is lost, the site 3 FSW provides the vote.

Amitabh (PM Microsoft)

19 Dec 2011 12:39 PM

You can only configure 1 FSW per Cluster.

Please look at the 'Quorum Model Description' above to understand how the Quorum Votes are calculated.

If you have your Cluster nodes up and running and you lose connectivity to the File Share Witness (3rd site), then the cluster will continue to run provided you have enough Cluster Nodes up and running.