----- Forwarded message from Joe Landman <landman at scalableinformatics.com> -----
From: Joe Landman <landman at scalableinformatics.com>
Date: Thu, 23 Jun 2005 21:20:58 -0400
To: "Clustering, compute farming & distributed computing in life science informatics" <bioclusters at bioinformatics.org>
Subject: Re: [Bioclusters] topbiocluster.org
User-Agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)
Reply-To: "Clustering, compute farming & distributed computing in life science informatics" <bioclusters at bioinformatics.org>
I had sent a note to James offline. Worth posting a similar letter here.
James Cuff wrote:
>Ok,
>
>So I put my money where my mouth is, (well 50 bucks anyway)
>
>http://topbiocluster.org is alive
>
>(well once the DNS gets pushed everywhere that is, I only set it last
>night, some of you may have to hang fire for a bit :-))
>
>We all talk a lot on this list about which cluster this, that and the
>other for application this that and the rest. I also saw the last top500
>list yesterday, and to be frank I'm all done with Linpack, we do other
>stuff, and it matters.
Absolutely. Linpack makes sense for folks doing large dense matrix
work. Few folks here are doing that. Moreover, IDC data seems to
indicate that a sizeable fraction of compute cycles goes to life science
computing. Yet there is no equivalent to HPL for this community.
We started to do this with baseline tests for BBSv3 and we had hoped for
some feedback. Most of what we got was from vendors wanting to use it
for marketing purposes (ok, but it needs to mean something, so the
content needs to be relevant for a large swath of users).
>There are two good benchmark tools I know of, both are currently listed on
>the topbiocluster.org 'site', but I'm going to need a bit of help from
>folk to actually get this thing off the ground.
>
>My first thoughts are we build a list of what is actually out there in
>terms of bioclusters, bit like Glen's QA mail from the other day, then we
>start to go about doing the benchmark gig.
In looking over the reported results, I was struck with the thought that
someone is designing/building/selling slow filesystems. That matters,
because this community is fundamentally data motion bound: huge
databases and data sets take non-zero time to move, even on fast links.
>I'm also looking to the vendors a bit here (I know some of you folk hang
>out in here :-)). Let me know off list if I'm opening up a can of worms,
>or if you would like to help. I want to keep this open, but there are
>often things best talked about off list...
I believe this is a can of worms that needed to be opened some years
ago. We tried to pry this open with the BBSv3 baseline tests last year,
and get some feedback. Since then we have added Amber8 tests, GAMESS
tests, and a few others. But we are missing some critical tests.
Bonnie is nice but it doesn't replicate the workload of most of the
tools we have encountered. Most of the tools we have seen have use
cases that are either large sequential reads punctuated by occasional
writes, or effectively random IO. Other tools introduce the effectively
random latency of network traffic (remote interactions with web service
systems).
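
A minimal sketch of the two access patterns described above (large
sequential reads versus effectively random IO), in Python. The function
names, block sizes, and scratch file are assumptions made for
illustration and are not part of BBS or Bonnie. The file here is far too
small to defeat the OS page cache, so the timings it prints are not
meaningful measurements; a real test would use a data set larger than RAM.

```python
import os
import random
import tempfile
import time


def sequential_read(path, block_size=1 << 20):
    """Read the whole file front to back in large blocks; return (bytes, seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def random_read(path, block_size=4096, count=1000, seed=0):
    """Read `count` small blocks at random offsets; return (bytes, seconds)."""
    size = os.path.getsize(path)
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as fh:
        for _ in range(count):
            # Pick an offset that leaves room for a full block when possible.
            fh.seek(rng.randrange(max(1, size - block_size + 1)))
            total += len(fh.read(block_size))
    return total, time.perf_counter() - start


def make_test_file(size_bytes):
    """Create a scratch file of the given size filled with random bytes."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as fh:
        fh.write(os.urandom(size_bytes))
    return path


if __name__ == "__main__":
    path = make_test_file(8 << 20)  # 8 MiB: illustration only
    try:
        for name, fn in (("sequential", sequential_read), ("random", random_read)):
            nbytes, secs = fn(path)
            print(f"{name}: {nbytes} bytes in {secs:.4f} s")
    finally:
        os.remove(path)
```

Pointing `path` at a file on NFS or a SAN volume instead of local scratch
would exercise the shared storage rather than a single local disk.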
>If we get this thing right it _will_ be a one stop shop for biocluster
>performance.
>
>I really want to capture NFS/SAN/storage figures in here, we all know it's
>not just about the number of CPUs. We really need to see if we can
>capture the whole *cluster* performance, not just raw CPU horsepower...
Thank you. The fastest CPUs can be hamstrung by terrible IO or simply
poor/non-scalable cluster IO designs. The fastest nets can be hamstrung
by poor quality switches/NICs. Bad OS choices have a huge performance
impact, as do many other bits along those lines. The wrong compilers
or compiler options can make fast codes creep.
>So, let's open this up, and let's get talking...
>
>- How can we best start to fill in this web site?
First off, get end users to start talking about the things that they
care about in performance: what bottlenecks their runs? With enough
data, we can move to step 2.
Second, build tests that exercise the weak spots as well as the strong
spots. Sure, the latest greatest multi-core CPUs are great. Just don't
run them with a single spindle, or a poorly designed RAID5.
>- Would people be happy to submit figures about their cluster?
Hopefully. BBSv4 is aimed at making this very simple.
>- What numbers shall we use for ranking? What to run etc.
One fundamental error made in the SPEC numbers is reducing the
multidimensional performance space to a single number by the dubious
practice of averaging over things with very different
characteristics/dimensions. I would argue for a vector, and the vector
would be per application. That is, have a BLAST vector with blastx,
psiblast, rpsblast, and so on. Have an HMMER vector with hmmalign, a
Pfam search, and so on. Have a data transfer vector: time to copy nt to
all nodes in the cluster / number of nodes. Have a web services vector.
This way you don't lose information (some systems may be better designed
for one subset of tasks than another).
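
To illustrate why a single averaged number hides information that a
per-application vector keeps, here is a small Python sketch. All timings
are invented for illustration, the cluster and test names are
hypothetical, and the data transfer metric assumes the "/" above means
division by the node count.

```python
from statistics import mean

# Hypothetical timings in seconds (lower is better); invented for illustration only.
cluster_a = {
    "blast": {"blastx": 100.0, "psiblast": 200.0, "rpsblast": 150.0},
    "hmmer": {"hmmalign": 400.0, "pfam_search": 500.0},
}
cluster_b = {
    "blast": {"blastx": 300.0, "psiblast": 400.0, "rpsblast": 350.0},
    "hmmer": {"hmmalign": 100.0, "pfam_search": 200.0},
}


def single_number(results):
    """The reduction being criticized: one average over unlike tests."""
    return mean(t for tests in results.values() for t in tests.values())


def faster_per_test(a, b):
    """Compare two clusters test by test, keeping the per-application structure."""
    return {
        app: {test: ("a" if a[app][test] < b[app][test] else "b") for test in a[app]}
        for app in a
    }


def per_node_copy_time(total_seconds, n_nodes):
    """Data transfer metric: time to copy to all nodes divided by the number of nodes."""
    return total_seconds / n_nodes


if __name__ == "__main__":
    # Both averages come out identical, yet the clusters behave very differently.
    print(single_number(cluster_a), single_number(cluster_b))
    print(faster_per_test(cluster_a, cluster_b))
```

With these made-up numbers the two clusters tie on the average, while the
vector view shows one is faster on every BLAST test and the other on every
HMMER test.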
>- How do we capture storage aspects?
Bonnie is a start, but I would question how well it correlates with
real use cases. I would think that a more typical use case
would involve remote queries of a large database, moving large
databases, local queries of large databases, etc.
>I'm happy to do some of the grunt work here to collect information etc.
>I guess it's best that we keep all the chat open on this list, and I'll
>see what pops up. As things come in, I'll start to flesh out the website
>some more. Also, once we have a bit more of a scope as to what we will
>actually rank, list and store, I'll be happy to start on the MySQL
>database, and get things rocking.
I had an XML format for the output. I was speaking with a few other
folks about standardizing it with a few other tools. This would enable
easier construction of comparison tools.
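
As a rough idea of what a shared XML result format could look like, here
is a Python sketch using the standard library's xml.etree.ElementTree.
The element and attribute names (benchmark, application, test, seconds)
are hypothetical and are not the actual BBS output format, which is not
shown in this message.

```python
import xml.etree.ElementTree as ET


def results_to_xml(cluster_name, results):
    """Serialize {application: {test: seconds}} to an XML string (hypothetical schema)."""
    root = ET.Element("benchmark", {"cluster": cluster_name})
    for app, tests in results.items():
        app_el = ET.SubElement(root, "application", {"name": app})
        for test, seconds in tests.items():
            ET.SubElement(app_el, "test", {"name": test, "seconds": repr(seconds)})
    return ET.tostring(root, encoding="unicode")


def xml_to_results(text):
    """Parse the XML produced above back into {application: {test: seconds}}."""
    root = ET.fromstring(text)
    return {
        app.get("name"): {
            t.get("name"): float(t.get("seconds")) for t in app.findall("test")
        }
        for app in root.findall("application")
    }


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    demo = {"blast": {"blastx": 12.5}, "hmmer": {"hmmalign": 3.0}}
    print(results_to_xml("demo-cluster", demo))
```

A comparison tool would then only need the parsing half, whichever
benchmark suite produced the file.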
>submit at topbiocluster.org will work to send things in so I can get them
>into a database if we actually get going on it.
>
>Let's see what happens, this could be a bumpy ride, but it should be fun.
>
>"Cabin crew, doors to automatic and cross check!"
>
>So I guess the floor is now open...
>
>Best,
>
>J.
As indicated, we fully support this effort. BBSv4 is being built as we
speak, fixing many unanticipated and sometimes surprising "features",
and adding functionality and consistency. Documentation too (most
requested feature).
Please folks, suggest some tests which stress IO and the rest of the
system. Specifically real workloads. They are the only benchmarks
that matter.
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615
_______________________________________________
Bioclusters maillist - Bioclusters at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bioclusters
----- End forwarded message -----
--
Eugen* Leitl leitl http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE