Getting High with Lenny

The aim here is to set up some highly available services on Debian Lenny (at the time of writing, October 1st, still due to be released).

There has been a lot of buzz for a while now about virtualisation and High Availability. While Vserver is very capable of doing this job, documented examples are a little lacking compared to some other virtualisation techniques, so I thought I'd do my share.

I prefer to use Vserver for the "virtualisation" because of its configurability, its shared memory and CPU resources, and basically the raw speed.
DRBD8 and Heartbeat should take care of the availability magic in case a machine shuts down unexpectedly.
In my experience it takes a few seconds to have several Vservers fail over to another machine with this setup.

The main aim here is to give a single working example without going too much into the details of every option. The scenario is relatively simple, but different variations can be made.

For this setup we will have:

2 machines
both machines have 1 single large DRBD partition
primary/secondary: there is always 1 machine active and 1 on standby
1 LVM partition per Vserver on top of the DRBD partition, for quota support from within the guest and LVM snapshots
the Vservers' /etc/vservers and /var/lib/vservers directories will be placed on the DRBD partition.

In case the main machine that runs the Vservers goes down, the synchronized second machine should take over and automatically start the Vservers.

Basically this is an on-line RAID solution that can keep your services running in case of hardware failure. It is NOT a backup replacement.

The cost of this setup is that you always have 1 idle machine on standby. This cost can be justified by the fact that Linux-Vserver enables you to make full use of the 1 machine that is running. You could also consider running this on somewhat less expensive (and less reliable) hardware.

Also note that I will be using R1 style configuration for Heartbeat. R1 style can be considered deprecated when using Heartbeat2, but I could not get my head around the R2 XML configuration, so if you want R2 you might want to have a look here: Fail-over.

machine1 will use the following names.
hostname = node1
IP number = 192.168.1.100
is primary for r0 on disk c0d0p6
physical volume on r0 is /dev/drbd0
volume group on /dev/drbd0 is called drbdvg0

machine2 will use the following names.
hostname = node2
IP number = 192.168.1.200
is secondary for r0 on disk c0d0p6
The Volume Group and the Physical Volume will be identical on node2 if this one becomes the primary for r0.

Loadbalance-Failover the network cards

Maybe not very specific to Vserver, Heartbeat or DRBD, but load balancing your network cards for failover is always useful. Some more in-depth details by Carla Schroder can be found here.
[[1]]
I did not do it for the DRBD crossover cable between the nodes, although this is actually highly recommended.
We need both mii-tool and ethtool.

apt-get install ethtool ifenslave-2.6

nano /etc/modprobe.d/arch/i386

Add the following to load the bonding module with the correct options at boot time.

alias bond0 bonding
options bond0 mode=balance-alb miimon=100

Then, in /etc/network/interfaces, set the interfaces eth0 and eth1 as slaves to bond0. eth2 is also set here, for the crossover cable that carries the DRBD connection to the failover machine.

This way the system needs to be rebooted before the changes take effect. Otherwise you should load the drivers and ifdown eth0 and eth1 before you ifup bond0, but I'm planning to install a new kernel anyway in the next step.
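
The interfaces stanza itself is not reproduced in this guide. As a rough sketch only, the script below writes a candidate stanza for node1 to a scratch file so it can be reviewed before being merged into /etc/network/interfaces. The bond0 address and the interface names come from the text; the netmask, the gateway 192.168.1.1 and the 10.0.0.x crossover addresses are assumptions.

```shell
#!/bin/sh
# Sketch: generate a bonding stanza for node1 into a scratch file.
# Assumed values (not from the guide): netmask, gateway, 10.0.0.x crossover net.
OUT="${OUT:-${TMPDIR:-/tmp}/interfaces.node1.sketch}"

cat > "$OUT" <<'EOF'
# bond0: eth0 and eth1 enslaved for load balancing and failover
auto bond0
iface bond0 inet static
    address 192.168.1.100
    netmask 255.255.255.0
    gateway 192.168.1.1
    up /sbin/ifenslave bond0 eth0 eth1
    down /sbin/ifenslave -d bond0 eth0 eth1

# eth2: crossover cable to node2, used for DRBD traffic only
auto eth2
iface eth2 inet static
    address 10.0.0.100
    netmask 255.255.255.0
EOF

echo "wrote $OUT"
```

For node2 the same stanza would apply with 192.168.1.200 on bond0 and, under the same assumption, 10.0.0.200 on eth2.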

Install the Vserver packages

With Etch I found that the Vserver kernel often ended up second in the grub list. Not so in Lenny, but to be safe check the kernel stanza in /boot/grub/menu.lst, especially when doing this from a remote location.

Install DRBD8, LVM2 and Heartbeat

Not sure about this, but DRBD always needed to be compiled against the running kernel. Is this still the case with the kernel-specific modules? I did not check, but it would be good to know in case of a kernel upgrade.

Build DRBD8

Although packages are available in the repository for DRBD8, the purpose of these packages is that you can easily build the module from source for the running kernel.

To do this we just issue this command

m-a a-i drbd8

And to load it into the kernel..

depmod -ae

modprobe drbd

Configure DRBD8

Now that we have the essentials installed we can configure DRBD. Again, I will not go into the details of all the options here, so check out the default config and http://www.drbd.org/ to find a match for your setup.
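
As an illustration of what a minimal resource definition for this scenario might look like, the sketch below writes a DRBD 8 config to a scratch file for review. The resource name, host names, device and backing disk follow the names used in this guide; the 10.0.0.x crossover addresses, port 7788 and the sync rate are assumptions and should be adapted.

```shell
#!/bin/sh
# Sketch: write a minimal DRBD 8 resource definition to a scratch file.
# Assumed values (not from the guide): the 10.0.0.x crossover addresses,
# port 7788 and the resync rate.
OUT="${OUT:-${TMPDIR:-/tmp}/drbd.conf.sketch}"

cat > "$OUT" <<'EOF'
global { usage-count no; }

common {
  # cap on resynchronisation bandwidth
  syncer { rate 30M; }
}

resource r0 {
  # protocol C: a write completes only once both nodes have it on disk
  protocol C;

  on node1 {
    device    /dev/drbd0;
    disk      /dev/cciss/c0d0p6;
    address   10.0.0.100:7788;
    meta-disk internal;
  }
  on node2 {
    device    /dev/drbd0;
    disk      /dev/cciss/c0d0p6;
    address   10.0.0.200:7788;
    meta-disk internal;
  }
}
EOF

echo "wrote $OUT"

# Once the reviewed file is in place as /etc/drbd.conf on both nodes:
#   drbdadm create-md r0    # both nodes: initialise the metadata
#   drbdadm up r0           # both nodes: attach the disk and connect
#   drbdadm -- --overwrite-data-of-peer primary r0   # node1 only, first sync
#   cat /proc/drbd          # watch the synchronisation progress
```

The host names in the `on` sections have to match the output of `uname -n` on each node.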

Configure LVM2

<note important>
LVM will normally scan all available devices under /dev, but since /dev/cciss/c0d0p6 and /dev/drbd0 are basically the same, this will lead to errors where LVM reads and writes the same data to both devices.
So to limit it to scanning /dev/drbd devices only, we do the following on both nodes.

</note>

cp /etc/lvm/lvm.conf /etc/lvm/lvm.conf.original

nano /etc/lvm/lvm.conf

#filter = [ "a/.*/" ]
filter = [ "a|/dev/drbd|", "r|.*|" ]

To re-scan with the new settings, on both nodes

vgscan
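
A common slip at this point is leaving both filter lines active. The following is a small sketch of a check for that; the function name `check_lvm_filter` is made up for this example and it only inspects the text of the file, it does not ask LVM itself.

```shell
#!/bin/sh
# Sketch: report whether exactly one filter line is active in an lvm.conf
# and whether it is the DRBD-only filter from this guide.
# check_lvm_filter is a hypothetical helper, not part of LVM.
check_lvm_filter() {
    conf="${1:-/etc/lvm/lvm.conf}"
    # Active filter lines: optional whitespace, then "filter =".
    active=$(grep -E '^[[:space:]]*filter[[:space:]]*=' "$conf")
    count=$(printf '%s\n' "$active" | grep -c .)
    if [ "$count" -ne 1 ]; then
        echo "expected 1 active filter line, found $count"
        return 1
    fi
    case "$active" in
        *'a|/dev/drbd|'*'r|.*|'*) echo "filter ok: $active" ;;
        *) echo "unexpected filter: $active"; return 1 ;;
    esac
}
```

Running `check_lvm_filter` without arguments reads /etc/lvm/lvm.conf; a different path can be passed as the first argument.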

Create the Physical Volume

The following only needs to be done on the node that is the primary!!

On node1

pvcreate /dev/drbd0

Create the Volume Group

The following only needs to be done on the node that is the primary!!

On node1

vgcreate drbdvg0 /dev/drbd0

Create the Logical Volume

Yes, again only on the node that is primary!!!

For this example about 50GB, this leaves plenty of space to expand the volumes or to add extra volumes later on.

On node1

lvcreate -L50000 -n web drbdvg0

Then we put a file system on the logical volume

mkfs.ext3 /dev/drbdvg0/web

create the directory where we want to mount the Vservers

mkdir -p /VSERVERS/web

and mount the logical volume on the mount point

mount -t ext3 /dev/drbdvg0/web /VSERVERS/web/

Get informed

Of course we want to be informed later on by Heartbeat in case a node goes down, so we install postfix to send the mail.

This should be done on both nodes

apt-get install postfix mailx

and go for the defaults, "Internet Site" and "node1.example.com".

We don't want postfix to listen to all interfaces,

nano /etc/postfix/main.cf

and change the line at the bottom to read like this, otherwise we get into trouble with postfix blocking port 25 for all the Vservers later.

inet_interfaces = loopback-only

Heartbeat

Get acquainted

Add the other node to the hosts file on both nodes. This way Heartbeat knows who is who.

so for node1 do

nano /etc/hosts

and add node2

192.168.1.200 node2

Get intimate

Set up some keys on both boxes so we can ssh login without a password (defaults, no passphrase)

ssh-keygen

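then copy over the public keys. Note that this overwrites any existing authorized_keys file on the target.

On node2, copy the key to node1

scp /root/.ssh/id_rsa.pub 192.168.1.100:/root/.ssh/authorized_keys

On node1, copy the key to node2

scp /root/.ssh/id_rsa.pub 192.168.1.200:/root/.ssh/authorized_keys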

Configure Heartbeat

Without the ha.cf file Heartbeat will not start. This only needs to be done on 1 of the nodes.

<note>
We will be using Heartbeat R1-style configuration here, simply because I don't understand the R2 XML-based syntax.
</note>
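
The two files Heartbeat needs before it will start are /etc/ha.d/ha.cf and /etc/ha.d/authkeys. Their contents are not reproduced in this guide, so what follows is only a sketch that writes candidate versions to a scratch directory. All timings, the use of eth2 for the heartbeat broadcast, `auto_failback off` and the placeholder secret are assumptions.

```shell
#!/bin/sh
# Sketch: generate ha.cf and authkeys into a scratch directory for review.
# Assumed values (not from the guide): all timings, heartbeat over eth2,
# auto_failback off, and the placeholder secret.
OUTDIR="${OUTDIR:-${TMPDIR:-/tmp}/ha.d.sketch}"
mkdir -p "$OUTDIR"

cat > "$OUTDIR/ha.cf" <<'EOF'
logfacility local0
keepalive 2
warntime 10
deadtime 30
initdead 120
udpport 694
bcast eth2
auto_failback off
node node1
node node2
EOF

cat > "$OUTDIR/authkeys" <<'EOF'
auth 1
1 sha1 ReplaceWithYourOwnSecret
EOF
# Heartbeat refuses to start unless authkeys is readable by root only.
chmod 600 "$OUTDIR/authkeys"

echo "wrote $OUTDIR/ha.cf and $OUTDIR/authkeys"
```

The `node` entries have to match `uname -n` on each machine. After review, both files go into /etc/ha.d/ on one node.
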
We only created the above 2 config files (ha.cf and authkeys) on 1 node, but we need them on both. Heartbeat can do that for us.

/usr/lib/heartbeat/ha_propagate

Heartbeat behavior

After the above 2 files are set, haresources is where we want to be to control Heartbeat's behaviour.
This is an example for 1 Vserver that we will set up later on.
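
The original listing is not reproduced here. A sketch of what such a haresources line could look like, written to a scratch file for review, is given below. The DRBD resource, volume group, device and mount point follow this guide; the guest address 192.168.1.110 and the mail address are assumptions, and vserver-web is a custom script expected in /etc/ha.d/resource.d/.

```shell
#!/bin/sh
# Sketch: a one-line haresources entry for the Vserver "web" on node1.
# Assumed values (not from the guide): guest IP 192.168.1.110 and the
# mail address. Resources start left to right and stop right to left.
OUT="${OUT:-${TMPDIR:-/tmp}/haresources.sketch}"

cat > "$OUT" <<'EOF'
node1 drbddisk::r0 LVM::drbdvg0 Filesystem::/dev/drbdvg0/web::/VSERVERS/web::ext3 vserver-web SendArp::192.168.1.110/bond0 MailTo::root@localhost::DRBDFailure
EOF

echo "wrote $OUT"
```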

The above will default the Vserver named web to node1 and specify the mount points. The vserver-web script will start and stop the Vserver, and SendArp notifies the network that this IP can now be found somewhere other than before. (I have added the SendArp an extra time below for better results.)
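
The vserver-web script itself is not shown in this guide. A minimal sketch of such an R1 resource script might look as follows. `VSERVER_CMD` is an invented override so the logic can be tried without util-vserver installed, and the status wording relies on the assumption that Heartbeat R1 looks for "running" in the output.

```shell
#!/bin/sh
# Sketch of /etc/ha.d/resource.d/vserver-web (hypothetical contents).
# VSERVER_CMD is an invented override for dry runs; by default the real
# "vserver" command from util-vserver is used.
VSERVER_CMD="${VSERVER_CMD:-vserver}"
GUEST="web"

vserver_web() {
    case "$1" in
        start)  $VSERVER_CMD "$GUEST" start ;;
        stop)   $VSERVER_CMD "$GUEST" stop ;;
        status)
            # Assumption: Heartbeat R1 treats "running" in the output as active.
            if $VSERVER_CMD "$GUEST" running; then
                echo "running"
            else
                echo "stopped"
            fi ;;
        *)  echo "usage: vserver-web {start|stop|status}" >&2
            return 1 ;;
    esac
}

# Only act when called with an argument, as Heartbeat does.
[ $# -eq 0 ] || vserver_web "$1"
```

Saved as /etc/ha.d/resource.d/vserver-web and made executable, it would be called by Heartbeat with start, stop or status.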

Another example, for more than 1 Vserver.
We only specify 1 default node here for all Vservers, along with the same DRBD disk and Volume Group. The individual start scripts and mount points are specified separately. Mind the \, it's all on 1 line. The last mail command is only needed once.
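
Since that listing is also missing, here is a sketch that builds such a line for any number of Vservers. `make_haresources` is a made-up helper, "mail" is a hypothetical second guest, and the mail address is an assumption. SendArp entries are left out because the guest IP addresses are not given in this guide.

```shell
#!/bin/sh
# Sketch: build one haresources line for several Vservers that share the
# same node, DRBD resource and volume group.
# make_haresources is a hypothetical helper; the mail address is assumed.
make_haresources() {
    node="$1"; shift
    line="$node drbddisk::r0 LVM::drbdvg0"
    for guest in "$@"; do
        line="$line Filesystem::/dev/drbdvg0/${guest}::/VSERVERS/${guest}::ext3 vserver-${guest}"
    done
    # MailTo only needs to appear once, at the end.
    printf '%s\n' "$line MailTo::root@localhost::DRBDFailure"
}

make_haresources node1 web mail
```

Each guest named here needs its own logical volume, mount point and vserver-NAME script. The function prints everything on 1 line, so no \ continuation is needed.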

not needed????

There is some more interesting discussion going on here, Advanced_DRBD_mount_issues, for those who have multiple Vservers on multiple DRBD devices. Not sure if it also applies to this setup, but I'm using it without any drawbacks at the moment.

Create a Vserver

Note that we have already mounted the LVM partition on /VSERVERS/web in an earlier step. We're going to place both the /var and /etc directories on the mount point and symlink to them. This way the complete Vserver and its config are available on the other node when mounted.

mkdir -p /VSERVERS/web/etc

mkdir -p /VSERVERS/web/barrier/var

When making the Vserver it will be in the default location /var/lib/vservers/web and its config in /etc/vservers/web
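
The build command is not given in this guide. One possible way, sketched here with util-vserver's debootstrap method, is shown below. The hostname, the guest address 192.168.1.110/24 and the mirror URL are placeholders, not values from the guide. `RUN` defaults to `echo`, so the command is only printed.

```shell
#!/bin/sh
# Sketch: build the guest "web" with util-vserver's debootstrap method.
# Assumed values (not from the guide): hostname, guest address and mirror.
# RUN defaults to "echo" so the command is only printed; invoke the
# script with RUN= (empty) in the environment to really execute it.
RUN="${RUN-echo}"

build_web() {
    $RUN vserver web build -m debootstrap \
        --hostname web.example.com \
        --interface bond0:192.168.1.110/24 \
        -- -d lenny -m http://ftp.debian.org/debian
}

build_web
```

Everything after the bare `--` is handed to the debootstrap build method: `-d` selects the distribution and `-m` the mirror.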

On node1 we move the Vserver directories to the LVM volume on the DRBD disks and make symlinks from the normal locations.

On node1

mv /etc/vservers/web/* /VSERVERS/web/etc/

rmdir /etc/vservers/web/

ln -s /VSERVERS/web/etc /etc/vservers/web

mv /var/lib/vservers/web/* /VSERVERS/web/barrier/var

rmdir /var/lib/vservers/web/

ln -s /VSERVERS/web/barrier/var /var/lib/vservers/web

We need to set the same symlinks on node2, but then we need the Vserver directories available there first.
The mounting should be handled by heartbeat by now so we make our resources move to the other machine.
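
A sketch of that last step follows. `link_vserver` and `ROOT` are invented for this example; `ROOT` is a path prefix that allows a trial run in a scratch directory before touching the real /etc and /var. The hb_standby call is shown as a comment only.

```shell
#!/bin/sh
# Sketch: point the standard Vserver paths at the DRBD-backed volume.
# link_vserver and ROOT are hypothetical; ROOT is a path prefix used to
# try the function in a scratch directory first (empty means the real /).
ROOT="${ROOT:-}"

link_vserver() {
    guest="$1"
    mkdir -p "$ROOT/etc/vservers" "$ROOT/var/lib/vservers"
    # -n replaces an existing symlink instead of descending into it.
    ln -sfn "/VSERVERS/$guest/etc" "$ROOT/etc/vservers/$guest"
    ln -sfn "/VSERVERS/$guest/barrier/var" "$ROOT/var/lib/vservers/$guest"
}

# On node2 beforehand: make sure the mount point exists.
#   mkdir -p /VSERVERS/web
# On node1: hand all resources over to the standby node.
#   /usr/lib/heartbeat/hb_standby
# On node2, once /VSERVERS/web is mounted there:
#   link_vserver web
```

Once the symlinks exist on both nodes, either machine can start the Vserver as soon as Heartbeat has mounted the volume there.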