Installing an OpenSSI Cluster on Debian 3.1 (Sarge)

These instructions describe how to install an OpenSSI cluster on Debian
with minimal hardware requirements. Currently, OpenSSI can be installed
on Debian testing (3.1, Sarge); it will not work with Debian stable
(Woody). All you need is two or more computers connected by a private
ethernet switch. This network is called the "interconnect",
and should be private for security and performance reasons. Each individual
computer in the cluster is called a "node".

In this basic configuration, the first node's root filesystem is shared
with the rest of the cluster via the interconnect. This works well
for many users. To learn more about how filesystems are shared over
the interconnect, please see README.cfs.

You can make your filesystems highly-available (``HA'') if they
are on shared disk hardware that is physically connected to two or
more nodes. This can be done with Fibre Channel or some other Storage
Area Network (``SAN''). Please see README.hardmounts
for more information.

If you do not have shared disk hardware, an alternate solution for
HA filesystems is provided by the Distributed Replicated Block Device
(``DRBD'') project. This solution is provided as a supplemental
download from OpenSSI.org.

Note that any time another document is referenced, you can find it
in the docs/ directory of this release tarball, as well as
in your system's /usr/share/doc/openssi/ directory after
you install OpenSSI.

A good document that explains more about OpenSSI clustering in general
is Introduction-to-SSI. It was written by Bruce Walker, who
is the project leader for OpenSSI.

This installation guide is provided in multiple formats for your convenience.

These instructions assume you are doing a fresh install of OpenSSI.
If you are upgrading from a previous release of OpenSSI, please see
the upgrade instructions in README.upgrade.

Install Debian testing (3.1, Sarge) on the first node. There's no
need to install a distribution on any node other than the first one.
The installation manual can be found at http://www.debian.org/devel/debian-installer.

Feel free to either let the installer automatically partition your
filesystems or do it yourself using the provided tools. The ext3 filesystem
is preferred over ext2, because of its journalling capabilities.

/boot can either be its own partition or a directory on the
root filesystem. Regardless of this choice, these instructions assume
that /boot is located on the first partition of the first
drive (e.g., /dev/hda1 or /dev/sda1).

Configure GRUB as the boot loader, rather than LILO. The OpenSSI project
no longer supports LILO.

The Debian installer gives you the option to install the boot loader
on the Master Boot Record (``MBR'') of a particular disk, or on
the boot block of a particular partition. It is recommended that you
install the boot loader on the MBR of your first internal disk (e.g.,
/dev/hda or /dev/sda).

Configure the cluster interconnect interface with a static IP address.
The interconnect should be on a private switch for security reasons,
so hopefully this requirement does not cause much trouble, even in
a networking environment with dynamic addresses.
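
For reference, a minimal static stanza in /etc/network/interfaces might
look like the following sketch (the interface name eth1 and the 10.0.0.x
addresses are only examples; use whatever private subnet you chose for
your interconnect):

auto eth1
iface eth1 inet static
    address 10.0.0.1
    netmask 255.255.255.0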

Add the following entries to /etc/apt/sources.list, in addition to the
entries created during the Debian installation.

deb http://deb.openssi.org/v1 ./
deb-src http://deb.openssi.org/v1 ./

Add the following entries to /etc/apt/preferences:

Package: *
Pin: origin deb.openssi.org
Pin-Priority: 1001

Configure an HTTP proxy if your network requires one. In the bash shell,
you can export the environment variable ``http_proxy'', setting its value
to your local proxy server.
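
For example (the proxy host and port below are placeholders for your own
proxy server):

# export http_proxy=http://proxy.example.com:3128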

Execute:

# apt-get update

# apt-get dist-upgrade

As part of the dist-upgrade, some utilities will be downgraded, since
OpenSSI needs modified versions of those utilities.

Add the necessary driver names to ``/etc/mkinitrd/modules''. The most
important ones are the network drivers for the cards used by the nodes
participating in the cluster (e.g., e100, eepro100), since these drivers
are needed when booting cluster nodes over the network.
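
As a sketch, the file simply lists one module name per line (the drivers
below are examples; list whichever modules your interconnect NICs
actually need):

e100
eepro100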

Execute:

# apt-get install openssi

This installs OpenSSI and creates the first node (the init node) of the
cluster. While creating the first node, the installation runs
``ssi-create'', which asks a few questions about the cluster setup; they
are listed below. Please see the 'known problems' section at the end of
this document for how to treat a few error messages that may appear.

Enter a node number between 1 and 125. Every node in the cluster must
have a unique node number. The first node is usually 1, although you
might want to choose another number for a reason such as where the
machine is physically located.

Select a Network Interface Card (``NIC'') for the cluster interconnect.
It must already be configured with an IP address and netmask before
it will appear in the list. If the desired card has not been configured,
do so in another terminal then select (R)escan.
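
For example, to bring up an interconnect interface from another terminal
before rescanning (the interface name and addresses are examples):

# ifconfig eth1 10.0.0.1 netmask 255.255.255.0 up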

The NIC should be connected to a private network for better security
and performance. It should also be capable of network booting, in
case anything ever happens to the boot partition on the local hard
drive. To be network boot capable, the NIC must have a chipset supported
by PXE or Etherboot.

Select (P)XE or (E)therboot as the network boot protocol for this
node. PXE is an Intel standard for network booting, and many professional
grade NICs have a PXE implementation pre-installed on them. You can
probably enable PXE with your BIOS configuration tool. If you do not
have a NIC with PXE, you can use the open-source project Etherboot,
which lets you generate a floppy or ROM image for a variety of different
NICs.

OpenSSI includes an integrated version of Linux Virtual Server (``LVS''),
which lets you configure a Cluster Virtual IP (``CVIP'') address
that automatically load balances TCP connections across various nodes.
This CVIP is highly available and can be configured to move to another
node in the event of a failure. For more information, please see README.CVIP.

Enter a clustername. It should resolve to your CVIP address, either
in DNS or the cluster's /etc/hosts file, if you choose to
configure a CVIP. This is required if you want to run an NFS server.
For more information, please see README.nfs-server.
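
A sketch of the corresponding /etc/hosts entry (the address and
clustername below are placeholders for your own CVIP and chosen
clustername):

10.0.0.100    mycluster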

The current hostname will automatically become the nodename for this
node.

Select whether you want to enable root filesystem failover. The root
filesystem must be installed on (or copied to) shared disk hardware in
order to answer yes to this question. If you answer yes, then each time
you add a new node, you will be asked if the node is physically attached
to the root filesystem and if it should be configured as a root failover
node (see openssi-config-node later). You can learn more
about filesystem failover in README.hardmounts.

A simple mechanism for synchronizing time across the cluster will
be installed. Any time a node boots, it will synchronize its system
clock with the initnode (the node where init is running). You can
also run the ssi-timesync command at any time to force all nodes to
synchronize with the initnode.
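
For example, to force an immediate cluster-wide synchronization with the
initnode:

# ssi-timesync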

This timesync mechanism synchronizes nodes to within a second or two
of each other. If you need a higher degree of synchronization, you
can configure Network Time Protocol (``NTP'') across the cluster.
Instructions for how to do this are available in README.ntp.

Automatic process load balancing will be installed as part of OpenSSI.
To enable load balancing for a program, list its name in the file
``/cluster/etc/loadlevellist''. The program name ``bash-ll'' is listed
in ``/cluster/etc/loadlevellist'' by default, but the ``bash-ll'' program
itself is not delivered. So, to enable load balancing for every program
run from a bash shell, create a hard link as shown below and run your
programs from the shell ``/bin/bash-ll''.
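
A sketch of the hard link (this assumes /bin/bash is the bash binary on
your root filesystem):

# ln /bin/bash /bin/bash-ll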

Nodes in an OpenSSI cluster are booted over the network.
This lets you avoid having to install a distribution on more than
one node. To network boot a new node, first select one of its NICs
for the cluster interconnect. It must have a chipset supported by
PXE or Etherboot.

If the selected NIC does not support PXE booting, download
an appropriate Etherboot image from the following URL:

http://rom-o-matic.net/5.2.4/

Choose the appropriate chipset. Under Configure it is recommended
that ASK_BOOT be set to 0. Floppy Bootable
ROM Image is the easiest format to use. Just follow the instructions
for writing it to a floppy.
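
A sketch of writing the downloaded image to a floppy with dd (the image
filename is a placeholder for whatever rom-o-matic produces, and /dev/fd0
is assumed to be your floppy drive):

# dd if=eb-5.2.4-yournic.zdsk of=/dev/fd0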

If the node requires a network driver not already listed in the file
``/etc/mkinitrd/modules'' (on the init node), add the driver name to
that file. Then rebuild the ramdisk to include the driver
and update the network boot images.

# mkinitrd -o <init RD image file> <kernel-version>

# ssi-ksync
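
For example, to rebuild the initrd for the currently running OpenSSI
kernel and then update the network boot images (this assumes the init
node is already running the OpenSSI kernel, so that uname -r reports the
right version):

# mkinitrd -o /boot/initrd.img-$(uname -r) $(uname -r)

# ssi-ksync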

NOTE:

The initrd and the OpenSSI kernel are installed in /boot on the init
node during the OpenSSI installation. For PXE boot, do the following
steps manually.

# apt-get install syslinux

# cp /usr/lib/syslinux/pxelinux.0 /tftpboot

It has been observed that tftpd-hpa or atftpd works fine with Etherboot
or PXE, so it is recommended to install tftpd-hpa or atftpd. The entry
for ``tftp'' in /etc/inetd.conf should use ``/tftpboot'' as its root
directory; please check that it does, and correct it if not. tftpd-hpa
does not use it by default, so edit /etc/inetd.conf manually. If you
install atftpd, reconfigure it with ``dpkg-reconfigure atftpd'' so that
it refers to /tftpboot on the init node. This is the directory where the
kernel and initrd images are made available for other nodes to boot over
the network. You can refer to the sample entry shown below.
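
A typical tftpd-hpa entry in /etc/inetd.conf might look like the
following sketch (verify the in.tftpd path on your system):

tftp    dgram   udp     wait    root    /usr/sbin/in.tftpd /usr/sbin/in.tftpd -s /tftpboot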

Connect the selected NIC to the cluster interconnect, insert an Etherboot
floppy (if needed), and boot the computer. It should display the hardware
address of the NIC it is attempting to boot with, then hang while
it waits for a DHCP server to answer its request.

On the first node (or any node already in the cluster), execute
``ssi-addnode''. It will ask you a few questions about how you want to
configure your new node; they are as follows.

Enter a unique node number between 1 and 125.

Enter the MAC address of the new node to be added to the cluster.

Enter a static IP address for the NIC. It must be unique and it must
be on the same subnet as the cluster interconnect NICs for the other
nodes.

Select (P)XE or (E)therboot as the network boot protocol for this
node. PXE is an Intel standard for network booting, and many professional
grade NICs have a PXE implementation pre-installed on them. You can
probably enable PXE with your BIOS configuration tool. If you do not
have a NIC with PXE, you can use the open-source project Etherboot,
which lets you generate a floppy or ROM image for a variety of different
NICs.

Enter a nodename. It should be unique in the cluster and it should
resolve to one of this node's IP addresses. The nodename can resolve
to either the IP address you configured above for the interconnect,
or to one of the external IP addresses that you might configure below.
The nodename can resolve to the IP address either in DNS or in the
cluster's /etc/hosts file.

The nodename is stored in /etc/nodename, which is a context-dependent
symlink (``CDSL''). In this case, the context is node number,
which means each node you add will have its own view of /etc/nodename
containing its own hostname. To learn more about CDSLs, please see
the document entitled cdsl.

If you enabled root failover during the first node's installation,
you will be asked if this node should be a root failover node. This
node must have access to the root filesystem on a shared disk
in order to answer yes. If you answer yes, then this node can boot
first as a root node, so you should configure it with a local boot
device. This is done after this node joins the cluster and is described
in a later step.

Save the configuration.

The program will now do all the work to admit the node into the cluster.
Wait for the new node to join. A ``nodeup'' message on the first
node's console will indicate this. You can confirm its membership
with the cluster command:

# cluster -v

If the new node is hung searching for the DHCP server, try manually
restarting ``dhcpd'' on the cluster's init node, where the DHCP server
is running:

# invoke-rc.d dhcp restart

NOTE: It has been observed that the tftp server sometimes stops
responding to requests after it has already answered a client's request.
So restart inetd on the init node if the client gets an IP address but
cannot continue booting.

# invoke-rc.d inetd restart

If the new node is still hung, try rebooting it. It should then come
up.

The following steps tell you how to configure the new node's hardware,
including its swap space, local boot device (optional unless configured
for root failover), and external NICs. Sorry for the complexity of
some of the steps. The Debian installer automates most of it for you
during the installation of the first node, but it's not much help
when adding new nodes.

Configure the new node with one or more swap devices using fdisk
(or a similar tool) and mkswap:

# onnode <node_number> fdisk /dev/hda (device name)

(partition the disk as needed within fdisk)

# onnode <node_number> mkswap /dev/hda3 (device partition name)

Add the device name(s) to the file /etc/fstab, as documented
in README.fstab.

Either reboot the node or manually activate the swap device(s) with
the swapon command:

# onnode <node_number> swapon <swap_device>

If you have enabled root failover you MUST configure a local boot
device on the new node. Otherwise, configuring a local boot device
is optional. If you are going to configure a local boot device, it
is highly recommended that the boot device have the same name as the
first node's boot device. Remember that we assumed at the beginning
of these instructions that the first node's boot device is located
on the first partition of the first drive (e.g., /dev/hda1
or /dev/sda1).

Assuming you have already created a suitable partition with fdisk,
format your boot device with an ordinary Linux filesystem, such as
ext3:

# onnode <node_number> mkfs.ext3 /dev/hda1

Now run ssi-chnode anywhere in the cluster (no need to use
onnode with this command). Select the new node, enter its
local boot device name, and ssi-chnode will copy over the
necessary files.

Finally, you need to manually install a GRUB boot block on the new
node:

# onnode <node_number> grub --device-map=/boot/grub/device.map

grub> root (hd0,0)

grub> setup (hd0)

grub> quit

Repeat the above steps at any time to add other nodes to the cluster.

Enjoy your new OpenSSI cluster!!!

To learn more about OpenSSI, please read Introduction-to-SSI.

One of the first things you can try is running the demo Bruce, Scott
and I have done at recent trade shows. It illustrates some of the
features of OpenSSI clusters. You can find it here, along with older
demos:

http://OpenSSI.org/#demos

Recently, a scalable LTSP server has been tested on an OpenSSI cluster.
To learn more about how to set this up, please see README.ltsp.

If you have questions or comments that are not addressed on the website,
do not hesitate to send a message to the user's discussion forum: