[Beowulf] MIPS/Watt data

At 11:24 AM 6/3/2004 -0500, Brian D. Ropers-Huilman wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>IBM's BlueGene systems are based on similar architectures, 500MHz PowerPCs,
>full system on a chip design.
>
>On 2004-06-02 09:57 (-0500), Bari Ari <bari at onelabs.com> wrote:
>] Jim Lux wrote:
>] > If we compare Dhry mips to Bogomips (who's to say if we're not within an
>] > order of magnitude).. A 3.4 GHz P4 turned in 6700 Bogomips. That was
>] > probably around 100W (total guess), for 67 Bogomips/watt.
>]
>] ARM, Mips and SH all beat the pants off x86 in the MIPS/watt and also in
>] the FLOPS/watt department.
>]
>] Nobody seems to be interested in clusters built with these
>] architectures.
>
>- --
>Brian D. Ropers-Huilman :: Manager :: High Performance Computing

Not exactly nobody: I'm VERY interested in minimizing joules per
computation, because every joule and watt is precious in space....
Lest you think that my application is peculiar, it really isn't; it's just
an extreme case of the increasing emphasis on total cost of
ownership. Power consumption is going to be more and more important,
especially for clusters with more than 4-5 CPUs. Ballpark it: say your node
costs $1000/CPU and burns 100W/CPU. You'll actually need more like
150W/CPU (by the time you pump that heat to the big outdoors). Run that
CPU 24/7 for a year, and you've burned 1.3 MWh, about $250 worth, a
pretty big fraction of the cost of the node (a REAL big fraction if you
amortize over 3 years). Cut the power in half to do your computation, and
you've just effectively bought yourself a bunch more nodes.
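
For anyone who wants to plug in their own numbers, the ballpark above can be
sketched in a few lines of Python. This is only a rough sketch: the 50%
cooling overhead is the 100W-to-150W step from the paragraph above, and the
$0.19/kWh rate is an assumption back-solved from the $250 figure, not a
quoted tariff. The function and variable names are made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def annual_energy(cpu_watts, cooling_overhead=0.5, dollars_per_kwh=0.19):
    """Return (kWh per year, dollars per year) for one CPU run 24/7.

    cooling_overhead=0.5 turns 100 W at the CPU into 150 W at the meter.
    dollars_per_kwh=0.19 is an ASSUMED rate chosen to reproduce ~$250/yr.
    """
    total_watts = cpu_watts * (1.0 + cooling_overhead)
    kwh = total_watts * HOURS_PER_YEAR / 1000.0
    return kwh, kwh * dollars_per_kwh


node_cost = 1000.0  # dollars per CPU, from the ballpark above
kwh, dollars = annual_energy(100.0)
print(f"{kwh:.0f} kWh/yr, ${dollars:.0f}/yr")
print(f"3-year energy cost is {3 * dollars / node_cost:.0%} of the node price")

# Halving the power needed for the same computation.
half_kwh, half_dollars = annual_energy(50.0)
print(f"halving CPU power saves ${dollars - half_dollars:.0f}/yr per CPU")
```

Swap in your own electricity rate and overhead factor; the conclusion is not
very sensitive to either one.
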
There are other capital cost aspects to moving the heat around. Taking a
quick gander at the Grainger catalog... A ductless split system with 23000
BTU/hr capacity runs about $2300 and consumes about 2200W (SEER very close
to 10). 23kBTU/hr is about 6.7 kW. Figuring a 50% overall system efficiency, we
get pretty close to 3 Watts/dollar. (We'll also assume that includes
installation costs, etc.) So, to move that 150W out of the room costs
about $50 in capital costs, as well. Could be twice or three times that, or
half.
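
The same kind of back-of-the-envelope check works for the HVAC capital cost.
A minimal sketch, assuming the catalog figures above (23000 BTU/hr for about
$2300). The derating parameter is a hypothetical knob of my own for how much
of the nameplate capacity you actually count on; it is not a catalog number.

```python
WATTS_PER_BTU_PER_HR = 0.29307107  # 1 BTU/hr expressed in watts


def cooling_capital_cost(heat_watts, unit_btu_per_hr=23000.0,
                         unit_price=2300.0, derating=1.0):
    """Return (watts of heat removed per dollar, capital dollars needed).

    derating is a HYPOTHETICAL knob: the fraction of rated capacity
    you expect to get in practice (1.0 means full nameplate capacity).
    """
    usable_watts = unit_btu_per_hr * WATTS_PER_BTU_PER_HR * derating
    watts_per_dollar = usable_watts / unit_price
    return watts_per_dollar, heat_watts / watts_per_dollar


wpd, dollars = cooling_capital_cost(150.0)
print(f"{wpd:.2f} W/$, ${dollars:.0f} of HVAC per CPU")

# Same unit, but only counting on half the nameplate capacity.
wpd_half, dollars_half = cooling_capital_cost(150.0, derating=0.5)
print(f"{wpd_half:.2f} W/$, ${dollars_half:.0f} of HVAC per CPU")
```

At full nameplate capacity this lands near the 3 watts/dollar and $50 figures
above; counting on only half the capacity doubles the cost, which is still
inside the "twice or three times that" slop.
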
Before people start beating me up about my pessimistic estimates for power
and HVAC costs, the real point is that reducing the energy consumed to do a
unit of computation is a "good thing". There is a distinctly non-zero cost
to moving the heat away from the CPU to the great outdoors that pervades
every part of a cluster. You need bigger heat sinks or heat pipes and
bigger fans, which makes the chassis bigger and heavier, which means that
the rack has to be stronger, which means that the floor has to be stronger,
and the room probably has to be bigger.
For trivial sized clusters, where you can "fit everything under a desk",
this is less of an issue, but once you start to get any size at all, the
infrastructure costs start to come out of the noise level.
I think that in the cluster market, this is where vendors are going to have
to start competing. Clearly, they can't try to make deals on the cluster
software, because (as evidenced by all the discussion on the list of late)
the software is basically low cost. A vendor can sell on "quality of
hardware", but underneath it all, there are only a few mobo manufacturers, so
it comes down to who gets the best quantity discounts, and who has the best
sheet metal fabricators, which is a pretty competitive business. The basic
cost to design, market, and fabricate nodes and racks for a cluster is
pretty much the same for all vendors.
So, how, as a vendor, do you differentiate yourself from all the rest?
(especially if someone wants to make a decision based on quantitative metrics.)
- service: repair frequency and speed, documented high MTBF and low MTTR,
etc. - these are tough to quantify in any meaningful way, because clusters
are still essentially "one-off" unique items, so statistics are not
particularly meaningful. This comes down to the fuzzy things like "vendor
reputation".
- unique value added: Training, preconfigured systems, etc. - again, very
difficult to quantify, because some bright soul at your prospective
customer will come up with the "why should we pay X all that money, when I
can download the software, burn the CDs and get it up and running in my
spare time?" Granted it's bogus, but it happens. A vendor can't prove
that the "do it yourself" approach is actually more expensive, and unless
the customer has built some clusters themselves, they won't know any better.
- lower cost of ownership - something that is readily quantifiable! I know
what electricity costs, what floor space costs, what sysadmin time costs, etc.
James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875