Beowulf Questions

On Sat, Jan 04, 2003 at 01:19:40PM -0500, Mark Hahn wrote:
> > Personally, I don't think so, especially if we consider the
> > fact that in the not-too-distant future, networking speeds
> > will be up to snuff with the various tasks at hand. With these
> ah! I think this is the central fallacy that drives grid enthusiasm.
>
Then you clearly don't understand grid computing. That's understandable,
because "grid computing" has become a hijacked term - just like when
most people say "hacker" they really mean "cracker". Most of the press
coverage of grid computing in the past few months has simply been wrong
about what's really going on, and most of the new grid products and
grid companies are either useless or really don't have very much to do
with grid computing. They're just on the grid bandwagon because it's trendy,
and it's pretty frustrating for those of us who are doing real work in the
area, because the marketing drones are drowning out the real message.
> there simply is no coming breakthrough that will make all networking
> fast, low-latency, cheap, ubiquitous and low-power. and grid
> (in the grand sense) really does require *all* those properties.
Grid computing does not require any of this. Grid computing is all about
access and coordination. Grid computing is much more than just
running naturally (embarrassingly) parallel problems on spare cycles on
every computer people can find. Certainly SETI at Home and distributed.net
have been successful and are probably the first sorts of applications
that will be able to make serious use of "the grid" (if there's ever a real
"the grid" that emerges, though it's more likely there will be several large
"the grids" and many internal ones).
> oh, you will certainly manage to do some very interesting things
> with wimpier networking, but with major compromises. I don't see
> people doing parallel weather sims over 802.11*-connected nodes
> any time soon. but seti at home-type applications (very loosely coupled
> and coarse-grained) would be a fine way to keep my fridge's brain
> busy. on the other hand, a fridge will always be a tiny fraction
> of the compute power of a desktop, so is it worth it? not to mention
> the fact that seti at fridge will jack up my monthly power bill...
>
No, it's not. I'm convinced that there will never be a market for cycles
recovered from the home user - it's just not worth it. Pretend that compute
power is (physical) storage space - nearly everyone in the world has some
extra closet space (or at least I do, but I'm single :). Nearly every
company in the world has some storage requirements - maybe it's worthwhile
to rent some of that extra closet space - if I ever need my space back, I'll
just send back the box the company sent me to store. But now the company
needs to:
1. Be able to put their stuff into boxes small enough to fit into my closet
2. Handle keeping track of all of those boxes
3. Verify that I never messed with the contents of that box when they get it
back.
It's just not worth it - probably not for the company, and certainly not for
me - whatever they're paying me it's not likely enough for me to keep their
box safe and send it back to them on a whim. There may be a few things I'd
consider doing that for, though - if the Red Cross told me they wanted to
store a box of emergency supplies in my closet (and the closets of all of my
neighbors) and all I had to do was pull them out if they ever needed them,
then I'd probably do that - this is the model that SETI at home and
FightAids at home are using and it's working for them.
However, it's probably worth it for the company to do that internally -
they've got control over all of the closets anyway, so for some of their stuff
they should try and reclaim that space. (I think we've all got signs at work
that say "Physical Plant storage only" :)
Think of "grid computing" as a shipping and warehousing company (actually,
after the power crisis in California a while back naming something after
the electric grid is kinda a bad idea :). You don't want to have to
build and maintain your own warehouse, you'd rather get it from someone
else. This warehouse(grid) company will have the appropriate resources for
what I want to do - if I just want to store lots of little boxes then
maybe I don't care much if they use lots of little 10x10 storage units. Or,
maybe my stuff is too big (maybe I need to store the shuttle for a few days :)
so I need a huge, huge warehouse(cluster). I don't want to have separate
billing and addressing methods for this; I just want to say "I need to store
this" and have it happen. And not only do I need just space, but maybe I
want to consider location - if my factory is in northern Wisconsin, I'd
rather not ship my widgets to California if there's closer, unused space in
Milwaukee.
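
To make the "closest place with enough room" idea concrete, here's a rough sketch in Python. None of this is a real grid interface - the Site class, pick_site, the site names and the capacity/distance numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    free_units: int      # unused capacity (CPUs, or storage units in the analogy)
    distance_km: float   # made-up distance from whoever is asking


def pick_site(sites, needed_units):
    """Return the closest site with enough free capacity, or None if nothing fits."""
    candidates = [s for s in sites if s.free_units >= needed_units]
    return min(candidates, key=lambda s: s.distance_km, default=None)


# Hypothetical numbers: a big, far-away site and a small, nearby one.
sites = [
    Site("california", free_units=500, distance_km=3000.0),
    Site("milwaukee", free_units=64, distance_km=400.0),
]

small = pick_site(sites, 32)        # fits nearby, so it stays nearby
big = pick_site(sites, 256)         # too big for the nearby site
nothing = pick_site(sites, 10_000)  # nobody has that much room
```

The requester only says how much it needs; where it ends up is the provider's problem, which is the whole point.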
Some companies will set up their own, internal distribution/grids - think of
Walmart - and inside the company they'll deal with however the cost recovery
method needs to work. Others will get it from the big boys - you'll want
someone you can trust, so you're more likely to use FedEx than Fly-By-Night
Shipping, Inc. The important point is that access to it is basically the
same - you've got a box that needs to go somewhere - FedEx and UPS both
take packages with the same address, only the billing is a bit different.
There are some cooler things that Grid Computing will let you do that aren't
really covered in a shipping analogy. First off, it's easy to create
free-wheeling deals with other sites - maybe I can get access to another cluster
down the road, and I can use the standard grid interfaces to it, instead of
having to learn where all the software is installed, remember my username on
the machine, which batch system it's running, etc etc etc. There are also
possibilities for levels of indirection and middlemen - maybe the American
Association of Physicists will buy 1 million CPU hours for its members. The
physicist will just go to grid://aap.org* and submit their jobs. AAP.org will
deal for 750,000 hours at IBM.com, 10,000 from doe.gov, 100,000 from
GridStartup.com, and so forth. When the million hours starts to run out,
AAP.org will deal with buying more.
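
A minimal sketch of that middleman, assuming a toy Broker class that I'm inventing here (it is not any real grid software). The three allocations are the ones named above; the rest of the million hours is left out because I didn't say where it comes from.

```python
class Broker:
    """Toy middleman: members submit jobs, the broker decides who runs them."""

    def __init__(self, allocations):
        # allocations: provider name -> CPU hours still available there
        self.remaining = dict(allocations)

    def total(self):
        """CPU hours left across all providers."""
        return sum(self.remaining.values())

    def submit(self, cpu_hours):
        """Charge a job to the first provider that can hold it whole."""
        for provider, hours in self.remaining.items():
            if hours >= cpu_hours:
                self.remaining[provider] = hours - cpu_hours
                return provider
        raise RuntimeError("no provider has enough hours left; time to buy more")


broker = Broker({"IBM.com": 750_000, "doe.gov": 10_000, "GridStartup.com": 100_000})

first = broker.submit(740_000)   # lands on the biggest allocation
second = broker.submit(50_000)   # IBM.com and doe.gov are too small now
left = broker.total()
```

The physicist never sees which provider got the job, only that it was accepted. A real broker would also care about price, queue times and which codes each site can actually run.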
I think that (at least for the next few years) linux-based beowulfs will
be the main building blocks of these sorts of Grids. This doesn't mean that
we'll stick 16-node clusters at 10 sites, and haphazardly schedule MPI jobs
across the 160 CPUs - clearly, tightly-coupled codes will stay together.
Consider TeraGrid (teragrid.org) - it's 4 (5 now that the Pittsburgh TCS-1
will be part of it) large clusters. Most jobs will run on one cluster at a
time - certainly some will span multiple clusters on the grid, probably as
two tightly coupled instances exchanging coarse-grained boundary info or some
such. (TeraGrid does have a 40Gbps connection between the 4 sites, but it
still takes those 40Gb some time to cross between Illinois and California :)
If your job can tolerate the latency, then go ahead and schedule it
wherever. If it can't, then don't. Grid Computing doesn't throw 40 years
of parallel computing basics out the window.
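
Some back-of-the-envelope arithmetic on that, as a sketch. The 40Gbps figure is from above; the ~3000 km fiber distance and the message sizes are my assumptions, and the signal speed is the usual rough figure for light in fiber (about two thirds of c).

```python
def transfer_seconds(size_bits, bandwidth_bps, distance_m, signal_speed_mps=2.0e8):
    """One-way time for a message: propagation delay plus time to push the bits out."""
    propagation = distance_m / signal_speed_mps
    serialization = size_bits / bandwidth_bps
    return propagation + serialization


LINK_BPS = 40e9      # the 40Gbps link mentioned above
DISTANCE_M = 3.0e6   # assumed: roughly 3000 km of fiber, Illinois to California

# A 1 KiB boundary update: almost all of the time is propagation delay.
small_msg = transfer_seconds(8 * 1024, LINK_BPS, DISTANCE_M)

# 40 Gb of bulk data: about a second on the wire plus the same delay.
bulk = transfer_seconds(40e9, LINK_BPS, DISTANCE_M)
```

Under those assumptions the small message takes about 15 ms one way no matter how fat the pipe is, which is far too slow for fine-grained MPI traffic but fine for coarse-grained boundary exchange.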
Grid Computing is not going to replace cluster computing - it's a
complementary style of computing. Some people are going to (and with good
reason) still build their own clusters and keep them in house. I think that
in the future (when grid computing is more reliable) many of the people
who currently buy 32 node clusters and then have them at about 5% utilization
will be better off going to one of the (someday to exist) grid providers. What
the grid community really hopes is that the (much larger) percentage of
scientists who currently don't use computing, but should, will be able to
get into it - even with all the wonderful work that Scyld and LinuxNetworX and
the like have been doing to make turn-key clustering easy, I'd still guess
that only 1 out of every 10 or 1 out of every 100 people who could do better
science with more computing will. (And of course, there are people other than
just scientists: the arts, business, etc.)
> ultrawideband is an interesting development for this kind of networking,
> perhaps also in the optical range. anyone interested in this stuff should
> read Robert Forward and Vernor Vinge's books (SF novels).
> ps: I don't mean grid stuff isn't worthwhile, or that we can't do
> any of it until the perfect network arrives. there's lots of great
> work going on - p2p networking, java/jini/jxta, etc.
The great thing about a downturn in the economy is that all-hype doesn't
survive. At SC01, P2P companies were all over the place, and at SC02
most of them didn't have a booth this year. Hopefully a good amount of
the deadwood in grid computing will be culled out this year (and we'll
have a different set of companies at SC03 that won't make it the year after :)
-Erik
> I just don't see
> it being relevant to the beowulf world very soon, or ever being as
> grand as the starry-eyed gridophiliacs would like to predict...