Re: [Linux-cluster] Quorum Disk on 2 nodes out of 4?

From: Lon Hohberger <lhh redhat com>

To: linux clustering <linux-cluster redhat com>

Subject: Re: [Linux-cluster] Quorum Disk on 2 nodes out of 4?

Date: Thu, 19 Nov 2009 14:16:01 -0500

On Wed, 2009-11-18 at 11:08 +0000, Karl Podesta wrote:
> On Wed, Nov 18, 2009 at 06:32:25AM +0100, Fabio M. Di Nitto wrote:
> > > Apologies if a similar question has been asked in the past, any inputs,
> > > thoughts, or pointers welcome.
> >
> > Ideally you would find a way to plug the storage into the 2 nodes that
> > do not have it now, and then run qdisk on top.
> >
> > At that point you can also benefit from "global" failover of the
> > applications across all the nodes.
> >
> > Fabio
>
> Thanks for the reply and pointers, indeed the 4 nodes attached to storage
> with qdisk sounds best... I believe in the particular scenario above,
> 2 of the nodes don't have any HBA cards / attachment to storage. Maybe
> an IP tiebreaker would have to be introduced if storage connections could
> not be obtained and the cluster was to split into two.
>
> I wonder how common that type of quorum disk setup would be these days,
> I gather most would use GFS in this scenario with 4 nodes, eliminating
> the need for any specific failover of an ext3 disk mount etc., and merely
> failing over the services across all cluster nodes instead.

We don't have an IP tiebreaker in the traditional sense.

I wrote a demo IP tiebreaker which works for 2-node clusters, but it
does not work in 4-node clusters because the demo application has no
coordination about whether the other nodes in a partition can "see"
the tiebreaker.

You can use a tweaked version of Carl's weighted voting scheme to be
able to sustain 2 node failures 1/2 the time in a 4 node cluster:

  node#  1  2  3  4
  votes  1  3  5  4

  Votes  = 13
  Quorum = 7
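
As a rough sketch of the arithmetic (not part of the original post): the
snippet below assumes quorum is a strict majority of the total votes, i.e.
floor(total / 2) + 1, which is consistent with 13 votes giving a quorum of 7.
The function name quorum_threshold is made up for illustration.

```python
# Votes per node, copied from the table above.
votes = {1: 1, 2: 3, 3: 5, 4: 4}


def quorum_threshold(votes):
    """Smallest vote count that is a strict majority of the total.

    Assumption: quorum = floor(total / 2) + 1.
    """
    return sum(votes.values()) // 2 + 1


total = sum(votes.values())
quorum = quorum_threshold(votes)
print(f"Votes = {total}, Quorum = {quorum}")
```

A partition keeps quorum only if the votes of its surviving nodes add up to
at least that threshold.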

Any 1 node can fail, since every group of 3 surviving nodes keeps quorum:

  Nodes 1 2 3 =  9 votes
  Nodes 2 3 4 = 12 votes
  Nodes 1 3 4 = 10 votes
  Nodes 1 2 4 =  8 votes

Half of the time, 2 nodes can fail (ex: if you were worried about a
random partition between 2 racks). These surviving pairs keep quorum:

  Nodes 2 3 = 8 votes
  Nodes 3 4 = 9 votes
  Nodes 2 4 = 7 votes

Obviously in the other half of the possible failure combinations, 2
nodes failing would mean loss of quorum, because every pair that
includes node 1 falls short of 7 votes:

  Nodes 1 2 = 4 votes -> NO QUORUM
  Nodes 1 3 = 6 votes -> NO QUORUM
  Nodes 1 4 = 5 votes -> NO QUORUM
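
The lists above can be double-checked by brute force. This is only an
illustrative sketch, not anything the cluster software runs: it assumes the
same strict-majority rule (floor(total / 2) + 1), and the helper name
surviving_partitions is hypothetical.

```python
from itertools import combinations

# Votes per node, as in the weighted scheme above.
VOTES = {1: 1, 2: 3, 3: 5, 4: 4}


def surviving_partitions(votes, size):
    """Yield (nodes, vote_sum, has_quorum) for each surviving group of `size` nodes."""
    quorum = sum(votes.values()) // 2 + 1  # assumed strict-majority rule
    for nodes in combinations(sorted(votes), size):
        vote_sum = sum(votes[n] for n in nodes)
        yield nodes, vote_sum, vote_sum >= quorum


# 3 survivors = one node failed; 2 survivors = two nodes failed.
for size in (3, 2):
    for nodes, vote_sum, ok in surviving_partitions(VOTES, size):
        label = "" if ok else " -> NO QUORUM"
        print("Nodes", *nodes, "=", vote_sum, "votes" + label)
```

Changing the numbers in VOTES is a quick way to try other weightings before
touching the cluster configuration.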

If you do this, put your critical applications on nodes 1 and 2. If
both of those fail, nodes 3 and 4 (9 votes) can pick up the load
without losing quorum. Well, in theory ;)

-- Lon