
Load Sharing with IPtables and Linux-HA

This HOWTO describes the setup for version 2 of the software, with the CRM being an integral part of heartbeat. If you use pacemaker, this article is only good for the basic concept; please refer to the new document for the setup in a pacemaker cluster.

Why load sharing?

Load sharing distributes the CPU load caused by incoming requests across several distinct machines. If you build a cluster from, let's say, three machines and load sharing works perfectly, every single machine carries only 1/3 of the original load. It is also a way to enhance the availability of a service you want to offer: if one of the three machines fails and the others take over, the availability of the service is much higher than with just one machine. This is very useful in scenarios where one machine has to be switched off for maintenance.

Basics

Two different software components have to be installed:

One component is responsible for the distribution of the connections. Within the netfilter framework, the iptables CLUSTERIP target is a very elegant solution to this problem. Every node of the cluster calculates a hash value from the IP packet (source IP, source port, ...) and decides autonomously whether it is responsible for this connection or not. No communication between the nodes is necessary! For further explanation, discussion, and examples see "man iptables" or Flavio's Blog.
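The idea can be illustrated with a toy shell sketch. Note that the real CLUSTERIP target uses a Jenkins hash inside the kernel; the simple arithmetic hash and the function name `node_for_ip` below are stand-ins invented for this illustration, chosen only to show the principle that identical deterministic hashing lets all nodes agree without talking to each other:

```shell
#!/bin/sh
# Toy illustration of the CLUSTERIP idea: every node applies the SAME
# deterministic hash to the packet's source IP, so all nodes agree on
# which one of them answers -- without exchanging any messages.
# NOTE: the real kernel code uses a Jenkins hash; this arithmetic
# stand-in only demonstrates the principle.

node_for_ip() {
    ip=$1
    total_nodes=$2
    sum=0
    # fold the four octets of the dotted-quad address into one number
    for octet in $(printf '%s' "$ip" | tr '.' ' '); do
        sum=$(( (sum * 257 + octet) % 65536 ))
    done
    # map the hash onto node numbers 1..total_nodes
    echo $(( sum % total_nodes + 1 ))
}

# Each node of a 3-node cluster runs the same function and only handles
# the connection if the result matches its own node number.
node_for_ip 192.168.188.42 3   # -> 1 with this toy hash
```

Because the function is deterministic, a given client always lands on the same node, which is also why established connections cannot simply migrate when the hash assignment changes.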

The other component is a system that detects the failure of a node and makes sure that the connections of the failed node are distributed to the remaining nodes. Linux-HA is ideal for this purpose because of its resource concept: an IP address is just a resource that is shared between the member nodes of a cluster. In this case the IP address is a virtual address for the whole cluster.

The complete failure detection and handling is done within the heartbeat framework of Linux-HA.

Load Sharing vs Load Balancing

In my approach I talk about load sharing. In my understanding, "load balancing" uses a concept of active load distribution between the nodes: a distribution agent measures the load of every node and directs new connections to the node with the least load. See the software of the company Stonesoft for a well-working example of a load-balancing cluster.

Load sharing uses fixed hash values and therefore cannot distribute the load dynamically. Also, all connections established on a node have to be reinitialized if that node fails: there is no synchronization of the connection tables between the nodes.

Prerequisites

iptables and the CLUSTERIP target are a Linux concept, so this will not work on other platforms. Sorry. First of all, please check whether the CLUSTERIP target works on your distribution. FC5 and FC6 did NOT work for me, and I got the same error with the live version of F7. OpenSUSE 10.2 works for me, as does Debian Etch. SUSE/Novell assured me that it will work on SUSE Linux Enterprise Server 10, Service Pack 1 (SLES 10 SP1). Please mail me (misch .at. schwartzkopff.org) if you have further success stories, so I can include them here. For a quick check enter on the command line:
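The command itself is missing from this copy of the page; a typical CLUSTERIP rule, as documented in "man iptables", looks like the following (the multicast cluster MAC and the node numbers are example values, to be adapted to your setup):

```shell
# Example CLUSTERIP rule for a quick check. The cluster MAC must be a
# multicast MAC address, every node uses the same --total-nodes, and
# --local-node differs per machine (1 on the first node, 2 on the second).
iptables -I INPUT -d <IP address> -i eth0 \
    -j CLUSTERIP --new \
    --hashmode sourceip \
    --clustermac 01:00:5e:00:00:20 \
    --total-nodes 2 --local-node 1
```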

For <IP address> use an unused address of your local net. Be sure that eth0 is the correct interface. Please also be sure to load the ipt_conntrack module (modprobe ipt_conntrack) before using CLUSTERIP. Then add the cluster IP address to the interface:

ifconfig eth0:0 <IP address> or
ip address add <IP address> dev eth0

Ping the new IP address from another machine. You should see "hash=1 ct_hash=1 not responsible" or "hash=1 ct_hash=1 responsible" in /var/log/messages. Depending on the responsibility, you get an answer from the cluster or not. You can change the responsibility of the node with

echo "+1" > /proc/net/ipt_CLUSTERIP/<IP address> or

echo "-1" > /proc/net/ipt_CLUSTERIP/<IP address>

You can also use any other hash value instead of the "1" for the responsibility, and there can be more than one hash value in the /proc/net/... file. Now you can be sure that the CLUSTERIP target of your distribution works.

Installation

The installation of the Linux-HA software is described elsewhere. But be sure that you get a freshly compiled installation: my 2.0.8 works, while older versions sometimes have problems. Since the new IPaddr2 resource agent uses the version 2 framework, an installation of version 1 will not work either.

For version 2.0.8 you have to get the RA directly from me. There is also some further documentation on my website. Copy the script to the directory where the OCF scripts live; in my installation it is /usr/lib/ocf/resource.d/heartbeat/. Since version 2.1.0 the new RA is included in the official distribution, so there is no need to install it separately anymore.

Configuration

Open your hb_gui and connect to one of the nodes. You should get a picture like the one below. Of course the names of your cluster nodes may differ.

Create a new resource by a right-click on "Resource". Enter native when asked for the "Item type". A new window with the details of the new resource opens.

Give the new resource a reasonable name. I named the Resource ID "resource_IP_95" since it is an IP address and my cluster gets the address 192.168.188.95. The type of the resource is IPaddr2 with Class/Provider "ocf/heartbeat". The value of the IP address parameter is 192.168.188.95.

Now comes the interesting part. The resource is a clone resource, so it is started on all nodes. Since I have two nodes in my cluster, clone_max is 2, as is clone_node_max. The name of the clone resource is IP_95. In the parameters of the dedicated resource I set clusterip_hash to sourceip-sourceport to tell the node how to calculate the hash values for the IP packets. I also remove the target_role "stopped" from the dedicated resource, since this is inherited from the clone resource. But beware: the resource is started immediately when the target_role is removed.
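Behind the scenes hb_gui writes this configuration into the CIB. A minimal sketch of the resulting fragment might look like the following; the ids are illustrative, and the exact attribute layout may differ between 2.0.x releases, so compare with the output of cibadmin -Q on your own cluster:

```xml
<clone id="IP_95">
  <instance_attributes id="IP_95_instance_attrs">
    <attributes>
      <nvpair id="IP_95_clone_max" name="clone_max" value="2"/>
      <nvpair id="IP_95_clone_node_max" name="clone_node_max" value="2"/>
    </attributes>
  </instance_attributes>
  <primitive id="resource_IP_95" class="ocf" provider="heartbeat" type="IPaddr2">
    <instance_attributes id="resource_IP_95_instance_attrs">
      <attributes>
        <nvpair id="IP_95_ip" name="ip" value="192.168.188.95"/>
        <nvpair id="IP_95_hash" name="clusterip_hash" value="sourceip-sourceport"/>
      </attributes>
    </instance_attributes>
  </primitive>
</clone>
```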

Once the resource is started and no errors have occurred, everything should be green.

Have fun

You can check the cluster now by pinging the cluster IP address. In my case it answers on

ping 192.168.188.95

It is also possible to log into the cluster with SSH, but you never know which node will answer: sometimes it is the first node, sometimes the second. So if you want to log in to one specific node, always use that node's dedicated IP address, not the cluster address.

For a test of the cluster you can switch one of the nodes to "standby" with a simple right-click→Standby on the node while pinging the cluster. In my case I lose perhaps one ping, and sometimes I get one duplicate, but within two seconds the other node takes over.

When the first node comes up again, the resource will not voluntarily move back to it. Please set resource_stickiness=0, so that the clones are distributed equally across all available nodes of the cluster.
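If you prefer the command line over hb_gui, the cluster-wide default can be set with crm_attribute. The attribute name below follows the heartbeat 2.0.x convention; please verify it against your version before relying on it:

```shell
# Set the cluster-wide default stickiness to 0 so the clones spread out
# evenly again after a failed node returns (heartbeat 2.0.x naming;
# check the crm_attribute man page of your installed version).
crm_attribute -t crm_config -n default_resource_stickiness -v 0
```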

Now you can use the cluster as a base for the installation of all(!) other services like web servers, databases, ... Of course the synchronization of the data between the applications is not part of Linux-HA; this has to be handled inside the application. But where no data synchronization is needed, this is a very simple way to build a load-sharing cluster that handles large amounts of traffic. Please mail any success stories, suggestions, and corrections to misch .at. schwartzkopff .dot. org.