Temporary exile

Why Must It Be So Hard to Cluster?

Oct 8th, 2008

In the past, I’ve gotten pretty upset by how difficult it is to
take advantage of multiple computers on a network for general
tasks. In this age of advanced Linux software, I’m still shocked at
how hard it is to cluster machines. Let’s say I have three or so
machines on my local network. If the task is something commonplace
like encoding audio or compiling, I can use either
distmp3 or
distcc, respectively.
Alternatively, if I want to share disk space among nodes, I could
use a clustered file system such as
Lustre or
GFS. After that,
I’d have to put together a
more formal
cluster like OpenMosix
(now abandoned), OpenSSI,
Kerrighed (comparison
paper, PDF) or some other
option. The next step is to write my own applications to do
something explicitly parallel using any number of options like
OpenMP,
PVM along
with trendy stuff like
Hadoop and
MapReduce. I can always
opt for just doing it by hand using distributed objects for a given
language. Apropos, Ruby has positively stellar support for
distributed objects
including
Rinda,
an implementation of
tuple spaces (à la
Linda)
which provides nifty things (auto-discovery, among
other features).
Still, these options don’t help me build a general-purpose cluster
out of machines. Then there are the tools to control the actions of
the machines remotely like
clusterssh,
dsh and
gsh. So far, my options are:
1. Settle for tools that only handle select tasks.
2. Write my own app to do something (or everything, which is a bad
   idea).
3. Deal with it and control the machines using a remote, group-admin
   tool.
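
To give a feel for what option 2 looks like with Rinda, here is a minimal sketch of the tuple-space pattern: a master posts task tuples, workers take them and post results back. It is only an illustration under some loud assumptions: everything runs in one process with threads standing in for machines, and the `:task` / `:result` tags and the squaring "job" are made up for the example. On a real network the same tuple space would be shared over DRb, with Rinda's ring classes handling discovery, and none of that is shown here.

```ruby
require 'rinda/tuplespace'

# A local tuple space. Assumption: on a real cluster this object would be
# exported over DRb so that other machines could reach it.
space = Rinda::TupleSpace.new

# Two workers (threads here, separate machines in real life). Each blocks
# until a [:task, n] tuple appears, removes it, and writes back a result.
workers = 2.times.map do
  Thread.new do
    loop do
      _tag, n = space.take([:task, nil])   # nil acts as a wildcard
      space.write([:result, n, n * n])     # hypothetical "work": squaring
    end
  end
end

# The master posts the work...
numbers = [1, 2, 3, 4, 5]
numbers.each { |n| space.write([:task, n]) }

# ...and collects exactly one result per task, in whatever order they land.
results = numbers.map do
  _tag, n, square = space.take([:result, nil, nil])
  [n, square]
end.sort

workers.each(&:kill)
puts results.inspect   # prints [[1, 1], [2, 4], [3, 9], [4, 16], [5, 25]]
```

The nice part is that the master never addresses a particular worker; whoever is free takes the next tuple. That is still a long way from a general-purpose cluster, since every job has to be written against the tuple space by hand.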

I understand how the landscape could reach such a state, but I
don’t like the fact that this is the same set of options I’ve had
for the last five years or so. Are there options I’m overlooking?
Is there something I don’t know about? The only thing I can see
down the pipeline is
GNU Queue (got a tipoff
from mct) which might very well be
exactly what I’ve been dreaming of. Unfortunately, no
releases have yet been made,
so there’s certainly no chance of using it now.