Archive

The TOP500 list was released this week, ranking the fastest supercomputers in the world. We at the IBTA were excited to see that InfiniBand’s presence on the list grew, making it the most used interconnect for the first time. Clearly, InfiniBand is the interconnect of choice for today’s compute-intensive systems! As the chart below demonstrates, InfiniBand’s adoption rate has grown significantly, outpacing all of the other options.

We were also pleased to see that, since the last report six months ago, FDR has increased the number of systems it connects tenfold, making it the fastest growing interconnect on the list.

The TOP500 list notes that InfiniBand connects eight of the 20 Petascale systems on the list, and that InfiniBand-connected systems boast the highest performance growth rates on the list. Petascale systems on the TOP500 list favored InfiniBand because of its scalability and the resulting computing efficiency. The graph below illustrates the performance trends showing how supercomputing depends on InfiniBand to achieve the highest performance.

The TOP500 list demonstrates what the IBTA and the InfiniBand community have long known: InfiniBand is a technology that has changed the face of HPC and, we believe, is having the same effect on the enterprise data center. Below are some additional stats referenced on the TOP500 list.

InfiniBand connects 25 of the 30 most compute-efficient systems, including the top 2

InfiniBand-based system performance grew 69% from June ‘11 to June ‘12

Want to learn more about the TOP500, or how InfiniBand fared? Check out the IBTA’s press release on the news.

This month, IBTA member companies attended the Interop Conference 2012 in Las Vegas. As news from the event streamed in and demos began on-site, we were excited to see InfiniBand and RDMA making headlines at this traditionally datacenter-focused event. Microsoft, FusionIO and Mellanox demoed a setup with Windows Server 2012 Beta and SMB 3.0 that illustrated amazing remote file performance using SMB Direct (SMB over RDMA): 5.8 Gbytes per second from a single network port. The demo combined Intel Romley motherboards, each with two 8-core CPUs, the faster PCIe Gen3 bus, four FusionIO ioDrive 2 drives rated at 1.5 Gbytes/sec each, and the latest Mellanox ConnectX-3 InfiniBand network adapters. Microsoft’s Jose Barreto noted in his TechNet coverage of the demo, referring to the results table that accompanied it:
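As a back-of-the-envelope sanity check (our sketch, not part of the demo itself), the reported 5.8 Gbytes/sec lines up almost exactly with the aggregate ceiling of the four storage drives:

```python
# Rough sanity check of the SMB Direct demo numbers (illustrative only).
drives = 4
per_drive_gbytes_s = 1.5                        # each ioDrive 2 rated at 1.5 GB/s
storage_ceiling = drives * per_drive_gbytes_s   # 6.0 GB/s aggregate storage limit
achieved = 5.8                                  # GB/s reported over one network port
efficiency = achieved / storage_ceiling
print(f"storage ceiling: {storage_ceiling} GB/s, achieved: {achieved} GB/s "
      f"({efficiency:.0%} of ceiling)")
```

In other words, the network was fast enough that the storage, not the fabric, was the limiting factor.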

“You can’t miss how RDMA improves the numbers for % Privileged CPU utilization, fulfilling the promise of low CPU utilization and low number of cycles per byte. The comparison between traditional, non-RDMA 10GbE and InfiniBand FDR for the first workload shows the most impressive contrast: over 5 times the throughput with about half the CPU utilization.”

We’re happy to see RDMA getting the recognition it deserves, and we look forward to seeing more coverage resulting from Interop in the coming days, as well as future discussions around RoCE and how InfiniBand solutions can be deployed in the enterprise.

Last month, VMware’s Josh Simons attended the OpenFabrics Alliance User and Developer Workshop in Monterey, CA. While at the event, Josh sat down with InsideHPC to discuss VMware’s current play in the HPC space, big data and the company’s interest in InfiniBand and RDMA. Josh believes that by adopting RDMA, VMware can better manage low-latency issues in the enterprise.

Josh noted that RDMA over an InfiniBand device aids virtualization. VMware is seeing live-migration times shrink in its virtualization platforms, such as vMotion. The company is also seeing CPU savings thanks to the efficient RDMA applications it has deployed.

Greg Ferro also posted on his EtherealMind blog about why VMware believes InfiniBand is better than Ethernet alone. According to Greg:

“Good InfiniBand networks have latency measured in hundreds of nanoseconds and much lower impact on system CPU because InfiniBand uses RDMA to transfer data. RDMA (Remote Direct Memory Access) means that data is transferred from memory location to memory location thus removing the encapsulation overhead of Ethernet and IP (that’s as short as I can make that description).”

To watch Josh’s interview with InsideHPC, or to check out the other presentations from the OFA workshop, head over to the InsideHPC workshop page. Presentations are also available for download on the OFA website.

The IBTA wrapped up the four-part fall Webinar Series in December, and if you didn’t have the opportunity to attend these events live, a recorded version is available on the IBTA’s website. In the webinar series, we suggested that it makes sense to take a fresh look at I/O in light of recent developments in I/O and data center architecture. We took a high-level look at two RDMA technologies: InfiniBand and a relative newcomer called RoCE (RDMA over Converged Ethernet).

RDMA is an interesting network technology that has been dominant in the HPC marketplace for quite a while and is now finding increasing application in modern commercial data centers, especially in performance-sensitive environments or environments that depend on an agile, cost-constrained approach to computing, such as almost any form of cloud computing. So it’s no surprise that several questions arose during the webinar series about the differences between a “native” InfiniBand RDMA fabric and one based on RoCE. In a nutshell, the questions boiled down to this: What can InfiniBand do that RoCE cannot? If I start down the path of deploying RoCE, why not simply stick with it, or should I plan to migrate to IB?

As a quick review, RoCE is a new technology best thought of as a network that delivers many of the advantages of RDMA, such as lower latency and improved CPU utilization, but using an Ethernet switched fabric instead of InfiniBand adapters and switches. This is illustrated in the diagram below. Conceptually, RoCE is simple enough, but there is a subtlety that is easy to overlook. Many of us, when we think of Ethernet, naturally envision the complete IP architecture consisting of TCP, IP and Ethernet. But the truth is that RoCE bears no relationship to traditional TCP/IP/Ethernet, even though it uses an Ethernet layer. The diagram also compares the two RDMA technologies to traditional TCP/IP/Ethernet. As the drawing makes clear, RoCE and InfiniBand are sibling technologies, but only distant cousins to TCP/IP/Ethernet. Indeed, RoCE’s heritage is found in the basic InfiniBand architecture, and it is fully supported by the open source software stacks provided by the OpenFabrics Alliance. So if it’s possible to use Ethernet and still harvest the benefits of RDMA, what’s to choose between the two? Naturally, there are trade-offs to be made.

During the webinar we presented the following chart as a way to illustrate some of the trade-offs one might encounter in choosing an I/O architecture. The first column shows a pure Ethernet approach, as is common in most data centers today. In this scenario, the data center rides the wave of improvements in Ethernet speeds, but using traditional TCP/IP/Ethernet you don’t get any of the RDMA advantages. For this blog, our interest is mainly in the middle and right-hand columns, which focus on the two alternate implementations of RDMA technology.

From the application perspective both RoCE and native InfiniBand present the same API and provide about the same sets of services. So what are the differences between them? They really break down into four distinct areas.

Wire speed and the bandwidth roadmap. The roadmap for Ethernet is maintained by the IEEE and is designed to suit the needs of a broad range of applications, ranging from home networks to corporate LANs to data center interconnects and even wide area networking. Naturally, each type of application has unique requirements, including different speed requirements; client networking, for example, does not have the speed requirements typical of a data center application. Across this wide range of applications, the Ethernet roadmap naturally tends to reflect the bulk of its intended market, even though speed grades more representative of data center needs (40 and 100GbE) have recently been introduced. The InfiniBand roadmap, on the other hand, is maintained by the InfiniBand Trade Association and has one focus: to be the highest performance data center interconnect possible. Commodity InfiniBand components (adapters and switches) at 40Gb/s have been in wide distribution for several years now, and a new 56Gb/s speed grade has recently been announced. Although the InfiniBand and Ethernet roadmaps are slowly converging, it is still true that the InfiniBand bandwidth roadmap leads the Ethernet roadmap. So if bandwidth is a serious concern, you would probably want to think about deploying an InfiniBand fabric.
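To make those speed grades concrete, here is a small sketch of how a 4x link’s quoted rate decomposes into per-lane signaling and effective data rate. The lane rates and line encodings used below are the standard published values for each InfiniBand speed grade; they are our addition, not figures from the roadmap chart itself:

```python
# Effective data rate of a 4x InfiniBand link:
#   lanes * per-lane signaling rate * line-encoding efficiency
speed_grades = {
    # name: (Gb/s per lane, encoding efficiency)
    "SDR": (2.5,     8 / 10),   # 8b/10b encoding
    "DDR": (5.0,     8 / 10),
    "QDR": (10.0,    8 / 10),
    "FDR": (14.0625, 64 / 66),  # FDR moved to the leaner 64b/66b encoding
}
lanes = 4
for name, (lane_rate, eff) in speed_grades.items():
    raw = lanes * lane_rate          # signaling rate quoted on the roadmap
    data = raw * eff                 # what actually carries user data
    print(f"{name}: {raw:g} Gb/s signaling -> {data:.1f} Gb/s data")
```

This is why FDR’s 56Gb/s is a bigger jump over QDR’s 40Gb/s than the headline numbers suggest: the encoding change raises the usable fraction of the wire as well.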

InfiniBand Speed Roadmap

Adoption curve. Historically, next generation Ethernet has been deployed first as a backbone (switch-to-switch) technology and eventually trickled down to the end nodes. 10GbE was ratified in 2002, but until 2007 almost all servers connected to the Ethernet fabric using 1GbE, with 10GbE reserved for the backbone. The same appears to be true for 40 and 100GbE; although the specs were ratified by the IEEE in 2010, an online search for 40GbE NICs reveals only one 40GbE NIC product in the marketplace today. Server adapters for InfiniBand on the other hand, are ordinarily available coincident with the next announced speed bump allowing servers to connect to an InfiniBand network at the very latest speed grades right away. 40Gb/s InfiniBand HCAs, known as QDR, have been available for a number of years now, and new adapter products matching the next roadmap speed bump, known as FDR, were announced at SC11 this past fall. The important point here is that one trade-off to be made in deciding between RoCE and native InfiniBand is that RoCE allows you to preserve your familiar Ethernet switched fabric, but at the price of a slower adoption curve compared to native InfiniBand.

Fabric management. RoCE and InfiniBand both offer many of the features of RDMA, but there is a fundamental difference between an RDMA fabric built on Ethernet using RoCE and one built on top of native InfiniBand wires. The InfiniBand specification describes a complete management architecture based on a central fabric management scheme, very much in contrast to traditional Ethernet switched fabrics, which are generally managed autonomously. InfiniBand’s centralized management architecture, which gives its fabric manager a broad view of the entire layer 2 fabric, allows it to provide advanced fabric features such as support for arbitrary layer 2 topologies, partitioning, QoS and so forth. These may or may not be important in any particular environment, but by avoiding the limitations of the traditional spanning tree protocol, InfiniBand fabrics can maximize bisectional bandwidth and thereby take full advantage of the fabric capacity. That’s not to say that there are no proprietary solutions in the Ethernet space, or that there is no work underway to improve Ethernet management schemes, but again, if these features are important in your environment, that may impact your choice of native InfiniBand compared to an Ethernet-based RoCE solution. So when choosing between an InfiniBand fabric and a RoCE fabric, it makes sense to consider the management implications.
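The spanning-tree point can be illustrated with a toy calculation (the link counts and speeds below are made up for illustration, not a model of any particular fabric): spanning tree must block redundant paths, leaving parallel inter-switch links idle, while a centrally routed fabric can use them all.

```python
# Toy comparison: usable inter-switch bandwidth, spanning tree vs. multipath routing.
# Assume two switches joined by several parallel links (illustrative numbers).
parallel_links = 4
link_gbps = 40

# Spanning tree blocks all redundant paths, so only one link carries traffic:
stp_usable = 1 * link_gbps

# A centrally managed fabric (e.g. an InfiniBand subnet manager computing routes
# over an arbitrary topology) can spread traffic across every parallel link:
multipath_usable = parallel_links * link_gbps

print(f"spanning tree: {stp_usable} Gb/s usable; "
      f"multipath: {multipath_usable} Gb/s usable "
      f"({multipath_usable // stp_usable}x the capacity)")
```

The gap grows with the number of redundant paths, which is exactly why bisectional bandwidth matters in large Clos and fat-tree topologies.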

Link level flow control vs. DCB. RDMA, whether native InfiniBand or RoCE, works best when the underlying wires implement a so-called lossless fabric, one in which packets are not routinely dropped. By comparison, traditional Ethernet is considered a lossy fabric since it frequently drops packets, relying on the TCP transport layer to notice and adjust for the losses. InfiniBand, on the other hand, uses a technique known as link level flow control, which ensures that packets are not dropped in the fabric except in the case of serious errors. This technique explains much of InfiniBand’s traditionally high bandwidth utilization efficiency; in other words, you get all the bandwidth for which you’ve paid. When using RoCE, you can accomplish almost the same thing by deploying the latest version of Ethernet, sometimes known as Data Center Bridging, or DCB. DCB comprises five new specifications from the IEEE which, taken together, provide almost the same lossless characteristic as InfiniBand’s link level flow control. But there’s a catch: to get the full benefit of DCB, your switches and NICs must implement the important parts of these new IEEE specifications. I would be very interested to hear from anybody who has experience with these new features in terms of how complex they are to implement in products, how well they work in practice, and whether there are any special management challenges.
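Link level flow control can be sketched as a simple credit scheme (a toy model of the idea, not the actual InfiniBand wire protocol): the sender may only transmit when the receiver has advertised free buffer space, so a full buffer stalls the sender instead of dropping the packet.

```python
# Toy model of credit-based link-level flow control (illustrative only).
from collections import deque

class CreditedLink:
    def __init__(self, buffer_slots):
        self.credits = buffer_slots   # credits advertised by the receiver
        self.rx_buffer = deque()

    def send(self, packet):
        if self.credits == 0:
            return False              # sender must wait; packet is NOT dropped
        self.credits -= 1
        self.rx_buffer.append(packet)
        return True

    def receive(self):
        packet = self.rx_buffer.popleft()
        self.credits += 1             # buffer slot freed: credit returns to sender
        return packet

link = CreditedLink(buffer_slots=2)
sent = [link.send(p) for p in ("p0", "p1", "p2")]
print(sent)             # [True, True, False]  -- third send stalls, nothing lost
link.receive()          # receiver drains one packet, freeing a credit
print(link.send("p2"))  # True  -- the stalled packet can now go
```

Contrast this with lossy Ethernet, where the overflowing packet would simply be discarded and TCP would have to detect the loss and retransmit.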

As we pointed out in the webinars, there are many practical routes to follow on the path to an RDMA fabric. In some environments, it is entirely likely that RoCE will be the ultimate destination, providing many of the benefits of RDMA technology while preserving major investments in existing Ethernet. In some other cases, RoCE presents a great opportunity to become familiar with RDMA on the way toward implementing the highest performance solution based on InfiniBand. Either way, it makes sense to understand some of these key differences in order to make the best decision going forward.

If you didn’t get a chance to attend any of the webinars or missed one of the parts, be sure to check out the recording here on the IBTA website. Or, if you have any lingering questions about the webinars or InfiniBand and RoCE, email me at pgrun@systemfabricworks.com.

Recently, there has been a lot of conversation around InfiniBand. Members of the IBTA often take our knowledge of InfiniBand technology for granted, which is why we are happy to see more exploratory discussion and educational conversations happening. If you’re interested in finding out more about InfiniBand, the IBTA has a number of resources for you to check out, including a product roadmap put together by the IBTA’s members.

Additionally, we wanted to share a recent blog post by Oracle’s Neeraj Gupta, which succinctly introduces the InfiniBand technology to those who may be unfamiliar with it.

Looking forward to more discussion and education on InfiniBand in the coming weeks.

Heard the buzzword InfiniBand and wondered what it is? Here is some information to get you started.

I am quite sure that you are already familiar with more common networking technologies like Ethernet and the various wireless media in use these days. InfiniBand is yet another networking technology, but it does not reach into our daily lives as much as the others, which is probably the reason you are still interested in reading about it here.

InfiniBand is meant to provide the interconnect for high-end computing environments by delivering high bandwidth at extremely low latency. In other words, it enables computing end points to exchange more data, faster. Let’s compare InfiniBand with Ethernet based on the various product offerings today.

I would like to point out that these are raw bandwidths; the actual throughput is usually lower and depends on the messaging protocols used between the end points. I will talk about this more later.
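To see roughly where that gap between raw bandwidth and actual throughput comes from, here is a small sketch. The two overhead sources modeled, line encoding and per-packet headers, are real, but the specific byte counts below are illustrative assumptions, not measured values:

```python
# Rough illustration of why actual throughput trails raw bandwidth.
raw_gbps = 40              # e.g. a QDR 4x link's signaling rate
encoding_eff = 8 / 10      # 8b/10b line encoding at QDR

# Per-packet protocol overhead (illustrative byte counts):
payload, headers = 2048, 30
protocol_eff = payload / (payload + headers)

effective = raw_gbps * encoding_eff * protocol_eff
print(f"{raw_gbps} Gb/s raw -> roughly {effective:.1f} Gb/s of application data")
```

Real numbers depend on the message sizes and the upper-layer protocol in use, which is exactly why the quoted figures above are labeled raw bandwidths.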

In recent years of technology evolution, computing platforms’ capabilities have reached a point where they can use a better, higher-speed network to communicate with peer platforms more efficiently. When the network cannot keep up, we call the result a bottleneck. In highly demanding computing environments, InfiniBand solves this problem by allowing computers to exchange more data, faster.

So, what do you need to get on this high-speed data highway? You guessed it: your existing equipment is not likely to work.

InfiniBand requires specialized hardware. Each computing end point needs an I/O card called a Host Channel Adapter, or HCA. HCAs connect to InfiniBand switches using special cables engineered to carry your data at this high rate with precision.

Oh wow! So, do I need to re-write my applications? I do not have time to do that!

I know you will ask this at this point. The answer is “no”. Before I go any further, let me state that InfiniBand follows the well-known industry standard networking model, Open Systems Interconnection, or OSI. The model defines seven layers, and just as with Ethernet, they apply to InfiniBand as well. Now, let me come back to the original point. We don’t need to re-write our applications because InfiniBand technology enables very seamless integration.

The new hardware we just talked about integrates with and presents itself to your application in much the same way as Ethernet. Your view into the network remains the same, and you continue to interact with sockets comprising IP addresses and ports.
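This point can be shown with ordinary socket code. Nothing in the sketch below is InfiniBand-specific, which is exactly the point: run over an IP-over-InfiniBand (IPoIB) interface, the same code works unchanged, just over a faster network.

```python
# A plain TCP echo exchange. Over IPoIB this exact code runs unchanged;
# the application still sees ordinary IP addresses, ports and sockets.
import socket
import threading

def echo_server(server_sock):
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(conn.recv(1024))   # echo back whatever arrives

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))           # on IPoIB, bind to that interface's IP instead
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=echo_server, args=(server,), daemon=True).start()

client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"hello over any IP network")
reply = client.recv(1024)
client.close()
print(reply)   # b'hello over any IP network'
```

(Applications that want InfiniBand’s full latency advantage can go further and use RDMA interfaces directly, but that is an optimization, not a requirement.)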

That’s all for this blog. I will come back with more information later and open up the topic in detail. Thanks for reading!

Recently, QLogic Corporation sold its InfiniBand business unit to Intel Corporation. QLogic continues its focus on and expansion of its Ethernet product line, as well as RoCE, iWARP and other protocols over Ethernet. QLogic remains committed to its close relationships and positions within the IBTA.

Exciting things are to come in 2012, including an update to our minibook, more discussions and events around RoCE, and much more. As we look to the future of InfiniBand, we want to broaden the InfiniBand story beyond HPC into the enterprise, highlighting real-life deployments and showcasing the great work of our members. We know that various market verticals, such as financial services, cloud computing and manufacturing, utilize InfiniBand in their storage solutions, and this year we plan to increase the visibility of these solutions by reaching out to you, our members, and spreading your success stories.

We look forward to the year ahead with the IBTA and the continued support and participation of our members.

With the SC11 show less than a week away, we at 3M are gearing up and getting excited about what we have to share with the IBTA community. The high-speed twin axial cable and active optical cable (AOC) teams haven’t wasted any time by quickly identifying ways to contribute to the working groups and working to bring the best of 3M processes and innovation to the table. Coupled with access to IBTA’s resources, our support and rapidly growing portfolio of high-speed products will continue to expand and evolve.

This week, our 3M engineers are at IBTA Plugfest #20 with a full suite of complementary copper and fiber products for FDR, QDR and DDR rates, in lengths from 0.5 meters up to 100 meters. And this momentum continues as we head into SC11.

3M is a sponsor of the IBTA/OFA booth, where we’ll be showcasing our various high-speed products for the HPC segment. What we have to offer in this space is truly differentiated and we believe can meet the most demanding of our customers’ needs.

For example, our line of copper QSFP+ assemblies is made with 3M™ Twin Axial Cable technology, which lets the cable assembly easily bend and fold, allowing it to achieve a very small bend radius (2.5 mm) with little to no impact on electrical performance.

The cable is lightweight and low profile (flat ribbon format) as well, and together with its unique flexibility, enables very efficient routing and improved airflow. Imagine folding the cable at a right angle right off the backshell where it exits the switch. Honestly, it’s a bit hard to envision without seeing it in action, so check out the cable and its capabilities on our YouTube page!

We’ll also be leveraging our presence at SC11 to showcase our AOC assemblies, among the lowest-power AOCs available on the market. Matching the flexibility of the 3M Twin Ax assemblies, our AOCs utilize a low-loss, bend-insensitive fiber, meaning the flexibility and high-performance benefits of 3M copper QSFP+ assemblies extend to the fiber side as well.

These and other 3M cables will be put to the test in the 40 Gbps RDMA live demonstration in the IBTA/OFA booth. So, if you’re going to be at SC11, don’t miss the demo: an extraordinary 40 Gbps RDMA demonstration over a distance of 6,000 miles, farther than a trip from Seattle to Paris!

3M will be in Seattle all week during SC11 with a full agenda, so make sure you stop by the IBTA/OFA booth and check out our latest technologies. We’ll also be meeting with customers one-on-one in our meeting suite at the Grand Hyatt. The countdown is on and we look forward to seeing you at the show. Feel free to e-mail us for more information or to schedule time to chat at SC11.

With less than two weeks until SC11, there’s lots of buzz around the forthcoming November release of the TOP500 list. Will Japan’s K supercomputer stay at the top? Where will China’s Sunway Bluelight place? How many systems with GPUs will we see?

The TOP500 list plays into SC11’s unofficial theme of Big Data. Nicole Hemsoth of HPCwire released an article last week providing highlights of what you can expect to see at SC11. Nicole cites John Johnson, conference chair, who says that this year the supercomputing community is “being called upon to rise to the data challenge and develop methods for dealing with the exponential growth of data and strategies for analyzing and storing large data sets.”

Nicole goes on to highlight a number of presentations and sessions being given at SC11 focused on the problems and new developments spawned by Big Data and technical or scientific computing.

The technical working groups of the IBTA and OFA are all about addressing the problems and new developments spawned by data challenges in high performance computing - and translating those technologies into meaningful solutions for the enterprise data center.

We have joined forces with several member companies as well as SC11’s SCinet and Energy Sciences Network (ESnet) and will be showcasing a “world’s first” demonstration of Remote Direct Memory Access (RDMA) protocols over a 40 Gbps WAN. Watch here for more details to be released this Monday, Nov. 14.

We will also be featuring presentations from member companies on a full range of topics detailed on this site.

Be sure to add IBTA/OFA booth #6010 to your list of must-see booths at the show and watch this space for live updates from the show.

In September, I had the privilege of working with my friend and colleague Paul Grun of System Fabric Works (SFW) on the first webinar in a four-part series, “Why I/O is Worth a Fresh Look,” presented by the InfiniBand Trade Association on September 23.

The IBTA Fall Webinar Series is part of a planned outreach program led by the IBTA to expand InfiniBand technology to new areas where its capabilities may be especially useful. InfiniBand is well-accepted in the high-performance computing (HPC) community, but the technology can be just as beneficial in “mainstream” Enterprise Data Centers (EDC). The webinar series addresses the role of remote direct memory access (RDMA) technologies, such as InfiniBand and RDMA over Converged Ethernet (RoCE), in the EDC, highlighting the rising importance of I/O technology in the ongoing transformation of modern data centers. We know that broadening into the EDC is a difficult task for several reasons, including the fact that InfiniBand could be viewed as a “disruptive” technology: it is not based on the familiar Ethernet transport and therefore requires new components in the EDC. The benefits are certainly there, but so are the challenges, hence the difficulty of our task.

Like all new technologies, one of our challenges is educating those who are not familiar with InfiniBand and challenging them to look at their current systems differently, just as the first part in this webinar series suggests: taking a fresh look at I/O. In this first webinar, we took on the task of reexamining I/O and assessing genuine advancements in it, specifically InfiniBand, making the case for why this technology should be considered when improving your data center. We believe the developments in the InfiniBand world over the last decade are not well-known to EDC managers, or at least not well understood.

I am very happy with the result, and the first webinar really set the stage for the next three webinars which dive into the nuts and bolts of this technology and give practical information on how this technology can be implemented and improve your data center.

During the webinar we answered several questions, but there was one in particular that, due to time limitations, I felt we did not discuss thoroughly enough. The attendee asked, “How will interoperability in the data center be assured? The results from the IBTA plugfests are less than impressive. Will this improve with the next generation FDR product?”

First, this question requires a little explanation, because it uses terminology and implies knowledge outside of the webinar itself. There is testing of InfiniBand components which takes place jointly between the IBTA and OpenFabrics Alliance (OFA) at the University of New Hampshire Interoperability Lab (UNH-IOL). We test InfiniBand components for compliance to the InfiniBand specification and for interoperability with other compliant InfiniBand components.

In the opinion of IBTA and OFA members, vendors and customers alike, interoperability must be verified with a variety of vendors and their products. However, that makes the testing much more difficult and results in lower success rates than if a less demanding approach were to be taken. The ever-increasing data rates also put additional demands on cable vendors and InfiniBand Channel Adapter and Switch vendors.

The real world result of our testing is a documented pass rate of about 90%, and a continuing commitment to do better.

What this means in real world terms is that the InfiniBand community has achieved the most comprehensive and strictest compliance and interoperability program in the industry. This fact, in and of itself, is probably the strongest foundational element that justifies our belief that InfiniBand can and should be considered for adoption in the mainstream EDC, with complete confidence as to its quality, reliability and maturity.

If you were unable to attend the webinar, be sure to check out the recorded webinar and download the presentation slides here. We’re looking forward to the next webinar in the series (The Practical Approach to Applying InfiniBand in Your Data Center, taking place October 21), which will dig more deeply into how this technology can be integrated into the data center. I look forward to your participation in the remaining webinars. There’s a lot we can accomplish together, and it starts with a basic understanding of the technology and how it can help you reach your company’s goals.