If you are a real neophyte in all senses of the word (a linux neophyte,
a parallel computing neophyte, a beowulf neophyte, a network manager
neophyte) and need to pretty much learn everything as you go
along, you're going to want to stick to the following recipe. Even if
you did your homework very carefully and know that you really need to build
one of the cheaper designs discussed next, you should still build
yourself a very small beowulf (perhaps four nodes) out of either systems
and parts at hand or systems you plan to recycle into stripped nodes or
server nodes according to this plan, just to learn what you're doing.

That is, the following design isn't really very cost-beneficial as you
buy some things your nodes don't really need and do a lot of work by
hand, but it is still reasonable for small beowulfs or while you're
learning. What you learn will put you in a position to implement the
next design, which scales far better in all ways.

This design has you install and configure each node basically by hand
using the standard installation tools that come with whatever linux
distribution you selected to use. You then put the nodes onto the
chosen network(s) and configure them for parallel operation.

The other place where this design works as a prototype is if you are
setting up a beowulf-style cluster that isn't really a true
beowulf. In that case you'd actually configure each node to be a fully
functioning standalone workstation, setting up X and sound and all that,
and providing each node with a monitor and keyboard and room on a table
or desk. You'd still install most of the ``beowulf'' software - PVM,
MPI, MOSIX and so forth, and you'd still configure it for parallel
operation.

In this ``hand crafted beowulf'' design, your nodes have to be
configured to install independently. These days, that means that they
probably need the following hardware:

A floppy drive.

A cheap, small (4 GB is small these days) IDE hard drive.

A CD-ROM drive

A generic SVGA card (I usually get $30 S3-Virge cards)

plus of course your NIC(s). Each node is then attached to your choice
of the following:

A KVM (Keyboard, Video, Mouse) switch, which in turn is connected
to a single keyboard, monitor and mouse. KVM switches are available
that are cheap (but fuzz a high resolution monitor a bit and don't work
for PS/2 mice) or expensive (but keep the monitor clear and can manage
all kinds of mice). The latter can be purchased to support all the way
up to some 64 nodes, although they might add almost as much to the
marginal cost of your nodes as a monitor, keyboard and mouse for each.

A monitor, keyboard and mouse for each. That is, you're building
a NOW (network of workstations) or COW (cluster of workstations) as
opposed to building a "true beowulf". Big deal. It will still work
like a beowulf for anything but moderately fine grained synchronous
parallel code and you can use the workstations for all sorts of useful
(but not particularly CPU or network intensive) things while it is doing
parallel computations.

A moderately portable monitor, keyboard and mouse, perhaps on a
cart. You plug this into the nodes only one at a time of course,
installing one, then the next one, then the next and so on.

One of several moderately expensive specialty cards that let you
use (e.g.) a serial console for the original install. Expect to pay
three or four times the cost of a cheap SVGA card.

The installation procedure is then very simple. You plug your
distribution CD into the CD-ROM drive, the boot floppy into the floppy
drive, (if necessary attach the portable monitor and keyboard to the
appropriate ports) and boot. You will generally find yourself in your
distribution's standard install program.

From there, install a more or less standard linux according to the
distribution instructions. You probably have more than enough hard disk
space to install everything as it is hard to buy a disk nowadays with
less than 4 gigabytes (which is way plenty) so don't waste too much time
picking and choosing - if it looks like it might be useful
install it, or just install ``everything'' if that is an option. Be
moderately careful to install all the nodes the same way as you really
want them to be as ``identical'' as possible.

Be sure to include general programming support (compilers, libraries,
editors, debuggers, and documentation). Be sure to include the full
kernel, including sources and documentation (a lot of distributions
won't install the kernel source unless you ask them to). Be sure to
install all networking support, including things like NFS server
packages. Sure, a lot of these things will never be needed on a node
(at least if you do things correctly overall), but if they are
ever needed it will be a total pain in the rear to put them on later and
space is cheap (your time later is expensive).

Be sure to install enough swap space to handle the node's memory if you
can possibly spare the disk. A rule of thumb to follow might be to
install 1-2x main memory. Again, if you are sensible (and read the
chapter on the utter evil of swapping) you will avoid running the nodes
so that they swap. However, in the real world memory leaks (MPI is
legendary for leaking in real live beowulfs!), Joe runs his job at the
same time as Mary without telling her, a forking daemon goes forking
nuts and spawns a few thousand instances of itself, netscape goes
berserk on a NOW workstation, and you'd just LOVE to have a tiny bit of
slack to try to kill off the offending processes without wasting Mary's
two week run. A system without swap that runs out of memory generally
dies a ghastly death soon thereafter. It's one of the few ways to crash
even linux. Be warned.
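As a purely illustrative instance of the 1-2x rule: a node with 256 MB of
main memory might get a 256-512 MB swap partition, which would show up in
its /etc/fstab as something like the following (the device name is an
assumption; yours will depend on how you partitioned the disk):

```
/dev/hda2    swap    swap    defaults    0 0
```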

Finally, install your beowulf specific software off of a homemade CD or
the net (when the network is up) or perhaps the CD that came with this
book (if a CD came with this book). If you installed a distribution
that uses RPMs (like Red Hat, SuSE, Caldera) this should be
straightforward. Debian users will firebomb my house if I don't extend
this to Debian packages as well, so I will. At this point in my life,
I'd tend to avoid Slackware although we were very happy together for
years. Good packaging systems scale well to lots of nodes, and
scalability is key to manageability.

With all the software installed, it is time to do the system
configuration. Here I cannot possibly walk you through a full course in
linux systems management, and most of what you do is common to all linux
or unix systems, things like installing a root password (you probably
did this during the install, actually, and hopefully picked the same
password for all nodes), setting up the network, setting up security and
network services, setting up NFS mounts, and so forth. To learn how to
do all this, you can use the documentation that came with your
distribution or head on down to Barnes and Noble (or over to amazon.com)
and get a few books on the subject. Be warned that the ``administration
tools'' that come with most linux distributions suck wildly in so many
ways, so even if you use them to get started you need to
learn how to do things by hand.

There are a few things you need to do a bit differently than the
out-of-the-box configuration, and I'll focus on just these.

Be sure that the latest version of the openssh package is
installed on all the nodes. Keep this
revision up to date as aggressively as you can manage, as there are
occasional security holes found in ssh and you want to be sure you are
working with the latest patched release. The latest releases of ssh are
also much easier to debug when something goes wrong with your
setup.

When you set up networking on a ``true beowulf'' node (one that is
isolated from the main network of your organization by some sort of
gateway node), use an IP number for a private internal network. Private
internal networks are described in RFC 1918 (if you know what that is or
care). They are also described in the HOWTO on IP-Masquerading. I
personally like the 192.168.x.x addresses, but you can also use the
10.x.x.x addresses (if you want to be lavish) or the 172.[16-31].x.x
addresses, which I can never remember. Remember not to assign the 0 address or the
255 address to nodes - that is, use only something like
192.168.1.[1-254] as a range. 0 and 255 are ``special'' addresses and
can break things if used.

Set up a common /etc/hosts or some sort of nameservice. There are
good things and bad things about using NIS to manage system databases
like this. It is likely that the bad outweighs the good - NIS can
significantly increase the overhead of certain kinds of network traffic
and network traffic is the last thing that you want to slow down
in a beowulf. On a ``true beowulf'' most people tend to use a tool like
rsync or an scp script to distribute identical copies of /etc/passwd,
/etc/group, /etc/hosts, and so forth. However, in a NOW-type cluster
with lots of users (and not particularly fine grained parallel code) NIS
is a reasonable enough solution.
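For concreteness, a common /etc/hosts for a small cluster on the
192.168.1.x private network discussed above might look like this (bhead
and b1-b4 are just illustrative naming conventions; use your own):

```
127.0.0.1      localhost localhost.localdomain
192.168.1.1    bhead
192.168.1.2    b1
192.168.1.3    b2
192.168.1.4    b3
192.168.1.5    b4
```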

When you are done and have rebooted the node, it should come up
accessible (via ssh) over the network. Once you can login as root over
the net (test this) you can move or switch the monitor and
keyboard to the next node.

With all of this established, and with ssh set up to permit root access
to the nodes from the head node without a password, it is time to
distribute common copies of things like /etc/hosts,
/etc/hosts.[allow,deny], /etc/passwd, and your preferred /root home
directory (I tend to like to customize mine in various ways and miss the
customizations when they aren't there).

To do this, one can use something like rsync (with the underlying shell
set to ssh, of course) or just an scp. Either way, you will find it
very useful to have a set of scripts on the head node that permit
commands to be executed on all nodes (one at a time, in order) or files
copied to all nodes (one after another, in order). Some simple scripts
for doing this sort of thing are in the Software appendix (and available
on the web, since I doubt that you want to type them in).
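By way of illustration, a minimal pair of such functions might look like
the following. This is a sketch, not the appendix scripts themselves, and
the node list in NODES is an assumption you'd edit to match your own
cluster:

```shell
#!/bin/sh
# Simple "run everywhere" helpers for a small hand-built beowulf.
# NODES (and the b1, b2, ... naming scheme) is an assumption; edit it.
NODES="b1 b2 b3 b4"

# Execute the given command on each node, one at a time, in order.
allnodes() {
    for node in $NODES; do
        echo "== $node =="
        ssh "$node" "$@"
    done
}

# Copy a file to the same path on each node, one after another.
allcopy() {
    for node in $NODES; do
        scp "$1" "$node:$1"
    done
}
```

With these sourced into your shell, ``allnodes uptime'' checks every node
in order and ``allcopy /etc/hosts'' pushes a freshly edited hosts file
out to the whole cluster.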

I'd strongly recommend that you arrange for all nodes to do all their
logging on your head node to make it as easy as possible to monitor the
nodes and to make it as easy as possible to reinstall or upgrade the
nodes. If all they contain is a distribution plus some simple
post-install configuration files, you don't need to back them up as
reinstalling them according to your recipe will generally be faster.
This is a good reason to set things up so that the nodes provide at most
scratch space on disk for running calculations with the full
understanding that this space is volatile and will go away if a node
dies.
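With the stock syslogd of the era this takes one line in each node's
/etc/syslog.conf (bhead being the hypothetical internal name of the head
node; note that the head node's own syslogd generally must be started
with the -r flag before it will accept remote messages):

```
# /etc/syslog.conf on each node: send everything to the head node
*.*    @bhead
```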

When you are finished with this general configuration, you should have a
head node (mywulf outside and bhead inside) that is also an NFS server
exporting home directory space and project space to all the nodes. You
should have a common password file (and possibly /etc/shadow file) on
all the nodes containing all your expected users. You should have ssh
set up so all your users (and root) can transparently execute ssh
commands on all nodes from the head node or each other (root might only
work from the head node). That is, ``ssh b12 ls /'' should show you the
contents of the root directory without a password. You should have PVM
and MPI (and possibly other things like MOSIX or a queuing system)
installed on all nodes (probably via an NFS mount - there is little
reason to maintain N copies of the binary installation, although with
RPM or a decent package manager there isn't too much reason not to).
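One way to arrange the passwordless root ssh described above is to
generate a key on the head node (with an empty passphrase, or managed by
ssh-agent) and append its public half to each node's authorized_keys.
The sketch below assumes the b1-b4 node names used earlier, a default
key path, and that you can still log in as root with a password for the
initial push:

```shell
#!/bin/sh
# Append the head node's root public key to authorized_keys on every
# node, so that e.g. "ssh b12 ls /" works without a password.
# NODES and the default key path are illustrative assumptions.
NODES="b1 b2 b3 b4"

push_root_key() {
    keyfile="${1:-$HOME/.ssh/id_rsa.pub}"
    for node in $NODES; do
        ssh "root@$node" \
            'mkdir -p /root/.ssh && cat >> /root/.ssh/authorized_keys' \
            < "$keyfile"
    done
}
```

After running it (typing the root password one last time per node), test
with something like ``ssh b1 ls /'' before moving on.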

PVM or MPI should be configured so that they can utilize all the
nodes. How to do this is beyond the scope of this book - there are
lots of nice references on both of them and one can usually succeed even
if one only follows the instructions provided with both of them. With
PVM, for example, you'll have to tell it to use ssh instead of rsh and
decide whether you want to run pvmd as root (with a preconfigured
virtual machine) or let users build their own virtual machine for any
given calculation, which in turn may depend on who your users are and
what sort of usage policy you have. Similar decisions are required for
MPI. It is a very good idea to run a few of the test examples that are
provided with PVM and MPI to verify that your beowulf is functioning.
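As one concrete version of that smoke test (MPICH-flavored, which is an
assumption; other MPI implementations differ in detail), list your nodes
in a machines file and run the bundled cpi example across them:

```shell
#!/bin/sh
# Sketch of an MPI smoke test for a hand-built beowulf.  The node
# names and the mpirun flags are MPICH-era assumptions; consult the
# documentation that came with your MPI.
cat > machines <<EOF
b1
b2
b3
b4
EOF
# From the MPICH examples directory:
#   mpirun -np 4 -machinefile machines cpi
```

If cpi comes back with a value of pi and no errors, then the nodes, the
network, ssh and MPI are all minimally working together.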

From this point on, you can declare your beowulf open for business.
Your work is probably not done, as I've only described a very minimalist
beginning, but from this beginning you can learn and add bells and
whistles as you need them.

This approach, as we've seen, more or less builds your beowulf nodes by
hand. This teaches you the most about how to build them and configure
them, but it doesn't scale too well. It might take you as long as half
a day to install a new node using the approach above, even after you
have mastered it (the first few nodes might take you days or weeks to
get ``just right''). There has to be a better way.

Of course there is. There are several, and I'll proceed to cover at
least two. The next example will be a bit Red Hat-centric in my
description. This is not to endorse Red Hat over any other linux
distribution, but simply because I'm most familiar with Red Hat and too lazy to
experiment with alternatives (at least to the point of becoming
moderately ``expert''). It is certain that a very similar solution is
possible with other distributions, if you take the time to figure out
how to make it work.