On 02/01/2014 11:17 AM, atchley tds.net wrote:
> On Fri, Jan 31, 2014 at 11:27 AM, Prentice Bisbal
> <prentice.bisbal at rutgers.edu> wrote:
>> Alex,
>>> On 01/30/2014 07:15 PM, Alex Chekholko wrote:
>> Hi Prentice,
>> Today, IB probably means Mellanox, so why not get their pre-sales
>> engineer to draw you up a fabric configuration for your intended use
>> case?
>>> Because I've learned that sales people will tell you anything is
> possible with their equipment if it means a sale.
> I posted my question to this list instead of talking to Mellanox
> specifically to get real-world, unbiased information.
>>> Certainly you can have a fabric where each host has two links, and
> then you segregate the different types of traffic on the different
> links. But what would that accomplish if they're using the same
> fabric?
>>> Doesn't IB use cross-bar switches? If so, the bandwidth between
> one pair of communicating hosts should not be affected by
> communication between another pair of communicating hosts.
>>> The cross-bar switch only guarantees non-blocking if the two ports are
> on the same line card (i.e. using the same crossbar). Once you start
> traversing multiple crossbars, you are sharing links and can
> experience congestion.
Scott, You're right. I wasn't thinking when I made that earlier
statement. As soon as I read your reply, I facepalmed. D'oh!
>> Certainly you can have totally separate fabrics and each host could
>> have links to one or more of those.
>> If this was Ethernet, you'd be comparing separate networks vs multiple
>> interfaces on the same network vs bonded interfaces on the same
>> network. Not all the concepts translate directly, the main one being
>> the default network layout: Mellanox will suggest a strict fat tree.
>> Furthermore, your question really just comes down to performance.
>> Leave IB out of it. You're asking: is an interconnect with such and
>> such throughput and latency sufficient for my heterogeneous workload
>> comprised of bulk data transfers and small messages? Only you can
>> answer that.
>>> This question does not "come down to performance", and this
> question is specifically about IB, so there's no way to leave IB
> out of it.
>> This is really a business/economics question as much as it's about
> performance: Is it possible to saturate FDR IB, and if so, how
> often does it happen? How much will it cost for a larger or second
> IB switch and double the number of cables to make this happen? And
> how hard will it be to set up? Will the increased TCO be justified by
> the increase in performance? How can I measure the increase in
> performance? How can I measure, in real-time, the load on my IB
> fabric, and collect that data to see if the investment paid off?
>>> Generally (lots of hand waving), HPC does not saturate the fabric for
> IPC unless it is a many-to-one pattern (e.g. a collective). Where lots of
> bandwidth makes the most difference is for I/O. Distributed file
> systems probably put the most bandwidth load on the system.
> Scott
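
On the question above of measuring the load on the fabric in real time,
here is a minimal sketch, assuming a Linux host that exposes the per-port
counters under /sys/class/infiniband. The device name "mlx4_0", the port
number, and the 54.5 Gb/s figure (roughly the data rate of a 4x FDR link
after 64b/66b encoding) are assumptions for illustration; substitute
whatever your HCA reports. The port_xmit_data counter is in units of
4 bytes. On some setups these counters are 32-bit and saturate instead of
wrapping, so short sampling intervals are safer.

```python
import time
from pathlib import Path

# port_xmit_data and port_rcv_data count 32-bit words, not bytes.
WORD_BYTES = 4


def counter_path(device, port, name):
    """Build the sysfs path of one per-port counter."""
    return (Path("/sys/class/infiniband") / device / "ports"
            / str(port) / "counters" / name)


def read_counter(path):
    """Read a counter file and return its value as an integer."""
    return int(Path(path).read_text().strip())


def utilization(before, after, seconds, link_gbps):
    """Fraction of the link data rate used between two counter samples."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    delta_words = after - before
    if delta_words < 0:
        raise ValueError("counter was reset between samples")
    bits = delta_words * WORD_BYTES * 8
    return bits / (seconds * link_gbps * 1e9)


def sample_utilization(device="mlx4_0", port=1, interval=1.0,
                       link_gbps=54.5):
    """Sample transmit utilization once over `interval` seconds.

    The defaults are illustrative assumptions, not detected values.
    """
    path = counter_path(device, port, "port_xmit_data")
    start = time.monotonic()
    before = read_counter(path)
    time.sleep(interval)
    after = read_counter(path)
    elapsed = time.monotonic() - start
    return utilization(before, after, elapsed, link_gbps)
```

Calling sample_utilization("mlx4_0", 1) in a loop and logging the result
with a timestamp gives a per-host record of how close each link gets to
saturation, which is the data needed to judge whether a second fabric
would have paid off. Switch ports need a different source of counters;
this only covers the host side.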