The concept underlying the beowulf-style compute cluster is not new, and
was not invented by any one group at any one time (including the NASA
group headed by Sterling and Becker that coined the name ``beowulf'').
Rather it was an idea that was developed over a long period and that
grew along with a set of open source tools capable of supporting it
(primarily PVM at first, and later MPI). Note that this is not an
attempt to devalue the contributions of Sterling and Becker in any way;
it is simply a fact.

However, Thomas Sterling and Don Becker at NASA-Goddard (CESDIS) were,
as far as I know, the first group to conceive of making a
dedicated-function supercomputer out of commodity components running
entirely open source software. Don Becker, especially, has
devoted a huge fraction of his life to the development of the open
source software drivers required to make such a vision reality. Don
actually wrote most of the Ethernet device drivers in use in Linux
today, which are the sine qua non of any kind of networked
parallel computing [1.23]. The NASA group also made specific
modifications to the Linux kernel to support the beowulf design (like
channel bonding) that are worthy of mention. Most recently Don Becker,
Erik Hendriks, and others from the original NASA-Goddard beowulf
team have formed Scyld.com [1.24], which both
maintains the beowulf list and beowulf website and has produced a ``true
beowulf in a box'' - the Scyld Beowulf CD - that can be used to
transform any pile of PCs into a beowulf in literally minutes.

By providing the sexy name, a useful website, and the related mailing
list, they formed a nucleation point for all the users of PVM and MPI who
were tired of programming in parallel on networks of expensive hardware
with proprietary and expensive operating systems (like those offered at
the time by IBM, DEC, SGI, Sun Microsystems, and Hewlett-Packard) only
to have to buy the whole thing over and over again at very high cost as
the hardware evolved. Once Linux had a reasonably reliable network and
Intel finally managed to produce a mass-market processor with decent and
cost-beneficial numerical performance (the P6), those PVM/MPI users,
including myself, rejected those expensive, proprietary systems like
radioactive waste and joined with others of a like mind on the beowulf
list. This began an open source development/user support cycle that
persists and is amazingly effective today.

It is this last contribution, the clear articulation of the idea of the
Linux-based beowulf and the focusing of previously disparate energies
onto its collaborative development, that is likely to be the most
important in the long run, as it transcends any particular architectural
contributions made in association with the original project. It is an
idea that is finally coming to a long-awaited maturity - it appears
that a number of Linux distributions are going to be providing
integrated beowulf/cluster software support ``out of the box'' in their
standard distributions quite soon (really, they largely have for some
time, although there have been a few missing pieces). The Scyld beowulf
is just the first, and most deeply integrated, of what I expect to
become many attempts to make network parallel computing a fully
integrated feature of everyday Linux rather than something even remotely
exotic.

Beowulfs have always been built from MCOTS hardware, which is by
definition readily available. Soon beowulf support will similarly be in
MCOTS box-set Linux distributions (instead of being scattered
hither-and-yon across the web). That takes care of the hardware and
software side of things. All that's missing is the knowledge of how to
put the two together to make beowulfs work for you, a hole that I'm
shamelessly hoping to exploit, errrm, uh, ``fill'' with this
book [1.25]. With all this to further support parallel development, can
commercial-grade parallel software be far behind?

With imitation being the most sincere form of flattery, it is amusing
that the beowulf concept has been transported by name to other
architectures, some of them most definitely not open source on
COTS hardware. There are FreeBSD beowulfs, NT based ``beowulfs'',
Solaris based ``beowulfs'', and so forth, where I quite deliberately put
the term beowulf in each of these latter cases within quotes to indicate
my skepticism that - with the exception of the FreeBSD efforts - the
clusters in question could truly qualify as beowulfs [1.26].

Not that I'm totally religious about this - a lot of the clusters I'll
discuss below, although COTS and open source, are not really
beowulfs either, although they function about the same way. I am
fairly religious about the open source part; it is a True Fact that
nobody sane would consider building a high performance
beowulf without the full source of all its software components,
especially the kernel. I also really, really like Linux. However, even
ignoring the historical association of beowulfery and Linux, there are
tremendous practical advantages associated with access to the full
operating system source even for people with mundane needs.

Issues of control, repair, improvement, cost, or just plain
understandability all come down strongly in favor of open source
solutions to complex problems of any sort. Not to mention scalability
and reliability. This is true in spades for beowulfery, which tends to
nonlinearly magnify any small instability in its component platforms
into horrible problems when jobs are run over lots of nodes.
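
To put a rough number on that magnification, here is a back-of-the-envelope
sketch. It rests on assumptions of my own choosing rather than on any
measurement: each node fails independently during a job with some small
probability, and the whole job is lost if any one node fails. The function
name and the sample values are hypothetical, picked only for illustration.

```python
def job_failure_probability(p_node: float, n_nodes: int) -> float:
    """Chance that at least one of n_nodes fails during a job.

    Assumes (for illustration only) independent node failures, each with
    probability p_node, and a job that dies if any single node dies.
    """
    if not 0.0 <= p_node <= 1.0:
        raise ValueError("p_node must be a probability between 0 and 1")
    if n_nodes < 0:
        raise ValueError("n_nodes must be non-negative")
    # The job survives only if every node survives: (1 - p)^N.
    return 1.0 - (1.0 - p_node) ** n_nodes


if __name__ == "__main__":
    # Hypothetical per-node failure probability of 1% per job.
    for n in (1, 16, 64, 256):
        print(f"{n:4d} nodes: {job_failure_probability(0.01, n):.3f}")
```

Under these assumptions a platform that misbehaves in only 1% of runs on a
single node will spoil roughly 63% of jobs spread over 100 nodes, which is
why a small instability that is tolerable on a desktop becomes intolerable
on a cluster.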

If you are foolish enough to buy into the notion that WinNT or Win2K
(for example) can be used to build a ``beowulf'' that will somehow be
more stable than or outperform a Linux-based beowulf, you're paying good
money [1.27] for an illusion, as you will realize very painfully the first
time your systems misbehave and Microsoft claims that it Isn't Their
Fault. They could even be right. It wouldn't matter. It's out of your
control and you'll likely never know, since long before you find out
your patience will be exhausted and you'll go right out and reinstall
Linux on the hardware (for free), do a recompile, and live happily ever
after [1.28]. Use the NT CDs (however much they cost
originally) for frisbees with your dog, or as coasters for your coffee
cup [1.29].

At this point in time beowulfs (both ``true beowulfs'' and beowulf-style
MCOTS clusters of all sorts) are proven technology and can easily be
shown to utterly overwhelm any other computing model in cost-benefit for
all but a handful of very difficult bleeding edge computational
problems. A beowulf-style cluster can often equal or even beat a ``big
iron'' parallel supercomputer in performance while costing a tiny
fraction as much to build or run [1.30]. The following is a guide on how
to analyze your own situation and needs to determine how best to design
a beowulf or beowulf-style cluster to meet those needs at the lowest
possible cost.
Enjoy.