On Tue, Feb 15, 2005 at 09:53:37AM +0100, Joachim Worringen wrote:
> This is an interesting issue. If you look at what Greg mentioned about
> dumb NICs (like InfiniPath, or SCI) and the latency numbers Ole posted
> for ScaMPI on different interconnects (all(?) accessed through uDAPL),
> you see that the dumb interface SCI has the lowest latency for both
> pingpong and random, with random being about twice that of pingpong. In
> contrast, the "smart" NIC Myrinet, which has much lower CPU utilization,
> has twice the pingpong latency, and a slightly worse random-to-pingpong
> ratio.
I would make 2 comments about this:
First, you should be using the best MPI for each piece of hardware.
Hardware architects pick their interface with a software
implementation in mind. I don't expect any 3rd party MPI to get close
to PathScale's MPI latency on PathScale's hardware, unless the 3rd
party is flexible enough to change a lot of code.
Second, you really can't generalize about dumb NICs by looking at
SCI. SCI has a unique situation: its raw latency is much lower than
the MPI latency of all MPI implementations for it. I suspect no
hardware designer would set out to imitate that property! Both
InfiniPath and the Quadrics STEN (forgive me for classing this as
dumb, I happen to think dumb is a compliment...) get this right.
Third (you knew I couldn't keep to my promise of 2), I wouldn't make
any scaling generalizations based on a test with 16 nodes. Even at
128-256 nodes the picture is quite different, and that's the sweet
spot where lots of today's clusters sit. So, if you want to make a
scaling generalization, you should be quoting 256-512 node results.
-- greg