Conflicting Goals In Data Centers

Two conflicting goals are emerging inside data centers: speed at any cost, and the ability to extend hardware well beyond its expected lifetime to amortize that cost.

Layered across both of those are concerns about how to move data back and forth more efficiently, how to secure it, and how to best integrate different generations of technology. But these widely different goals have created headaches for data centers, and opportunities for a surprisingly wide swath of new technology, particularly on the hardware side.

Much of this was in full view at the Open Compute Project Summit last month, where silicon and hardware innovations were the key focus. On the raw performance side, Nvidia showed off its internally built NVlink interconnect fabric, which supports hyperscale computing based on massively parallel GPUs. Nvidia isn’t contributing NVlink to OCP, but it is tapping into the group’s focus on the need for faster movement of data for the most compute-intensive tasks.

“Of the big cloud vendors, there are four in the world that really matter—Microsoft, Google, Amazon, and Alibaba [of China],” says Rob Ober, CTO of Nvidia’s Accelerated Computing group. “Two of the four showed platforms with our Tesla GPUs at the Open Compute conference.”

Microsoft engineers discussed their new “Project Olympus” platform, an eight-GPU ‘hybrid mesh cube’ in a modular chassis that interconnects via NVlink, as well as standard PCI Express. But the real focus was on NVlink, which is 5X to 12X faster than PCIe gen3.

“NVlink is a very lightweight protocol,” Ober explains. “We are not opening that up to the OCP community. We don’t want to make that a standard. It’s not that we are against it, it’s just that there is no practical use for people outside Nvidia. However, we are making the design rules and all of the required information to build a board like this. The Gerber files will be available along with the layout. The board stack ups with all the impedance rules. In theory, an ODM (original design manufacturer) could take that and build it, but I don’t think they will. Well, maybe Facebook or Google might.”

The integration challenge
This is not as easy as it sounds, of course. While GPUs have become a key component in AI and neural networks, which rely on massive parallelism, the integration of heterogeneous processing elements remains a challenge.

“The tricky thing here is how to integrate many different cores correctly, how to achieve cache coherence,” said Frank Schirrmeister, senior group director for product management in Cadence’s System & Verification Group. “Cache coherence and the GPUs’ access to memory, that’s really extremely interesting—and that drives requirements for our tools. We sometimes build our own designs to stress test our tools. We need to make sure that we really find the right bugs. The chips and the various cores and the GPUs are all accessing the same data, you have to make sure that as they access the memories, nothing gets lost or out of sync. Say you have four participants accessing the same memory, if one of them changes something, that needs to be reflected in the cache of the other cores.”

It’s particularly difficult in complex heterogeneous architectures.

“You need to be sure all those cores work correctly and the memory is refreshed correctly,” said Schirrmeister. “Those are tricky things to find. That is obviously happening in real time in the machine. It’s a very interesting problem. If certain cores are shut down in power, what happens when that core comes back? Those are very challenging bugs to find. To make sure every core sees the same memory.”
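
To make the problem Schirrmeister describes concrete, here is a minimal sketch of a write-invalidate scheme, one common way to keep caches in sync. It is a toy model, not a description of any vendor's hardware or of Cadence's tools. The `Memory`, `Bus`, and `Cache` classes and the `power_down`/`power_up` methods are hypothetical names made up for illustration, and write-through is assumed only to keep the example short.

```python
# Toy write-invalidate coherence model. All class and method names are
# hypothetical; real protocols (MSI, MESI, etc.) track more states.

class Memory:
    def __init__(self):
        self.data = {}

    def read(self, addr):
        return self.data.get(addr, 0)

    def write(self, addr, value):
        self.data[addr] = value


class Bus:
    """Shared interconnect that broadcasts invalidations to attached caches."""

    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def broadcast_invalidate(self, addr, source):
        for cache in self.caches:
            if cache is not source:
                cache.invalidate(addr)


class Cache:
    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.lines = {}  # addr -> value; presence means the line is valid
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:  # miss: fetch the current value
            self.lines[addr] = self.bus.memory.read(addr)
        return self.lines[addr]

    def write(self, addr, value):
        # Invalidate every other copy before updating our own.
        self.bus.broadcast_invalidate(addr, source=self)
        self.lines[addr] = value
        self.bus.memory.write(addr, value)  # write-through (assumption)

    def invalidate(self, addr):
        self.lines.pop(addr, None)

    def power_down(self):
        # Assumed retention state: contents kept, but no snooping.
        self.bus.caches.remove(self)

    def power_up(self, flush=True):
        if flush:
            self.lines.clear()  # drop anything that may have gone stale
        self.bus.caches.append(self)


def demo():
    """Four participants share one address; one of them writes to it."""
    bus = Bus(Memory())
    participants = [Cache(f"core{i}", bus) for i in range(3)]
    participants.append(Cache("gpu", bus))
    for p in participants:
        p.read(0x10)  # everyone caches the same line
    participants[0].write(0x10, 42)
    return [p.read(0x10) for p in participants]
```

In this sketch, `demo()` returns the written value for all four participants because the write invalidates the other three copies, which then re-fetch. The power-down case shows why such bugs are hard to find: a cache that calls `power_down()`, misses a write, and then calls `power_up(flush=False)` will keep returning its old value.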

The OCP advance program also listed a session on “the OCP HPC interconnect silicon spec,” which, according to OCP, can build on existing mainstream technologies such as PCIe, RapidIO, InfiniBand and Ethernet. It was not listed in the final program, although it may be something to watch for in the future. Marvell, Nvidia, Qualcomm, Cadence and Cavium all said they hadn’t heard much about it.

The OCP program did cover a tremendous amount of ground, though, including the Switch Abstraction Interface (SAI) ASIC work being done at Microsoft with the support of 77 other contributors, updates from Western Digital on SSDs, and a peek into how Facebook runs data centers worldwide with eight generations of architectures at once (OCP and non-OCP) to keep its 1.8 billion monthly users satisfied.

Extending technology
One of Facebook’s prods for suppliers was the top-of-rack switching topologies that are now prevalent. Marvell discussed its new PIPE port extender for these environments, where chassis with high-end speed and port density need to connect to already deployed lower-speed Ethernet ports.

This is the other end of the spectrum, where existing technology needs to be extended and improved without massive investments or a complete rethinking of the data center architecture.

Source: Marvell

“We can give that to rack suppliers with no in-box management,” said Yaron Zimerman, staff product line manager at Marvell. “Management comes from the central entity, so there is no need for an on-board CPU. So you get power as low as 50 watts. You get a thinner, smaller backplane with less PCB layers, so no need for a fan.”

A typical board for this application normally would have 16 to 18 layers. The goal here is to reduce that to as few as eight.

“Traditional equipment goes away at about 100 watts, and you are out of the box with the optics,” Zimerman said, noting that the central management entity is a control bridge on the spine switch, and that this is enabled by an IEEE 802.1BR control plane. The PIPE device can fan out 10 or 12 ports of 10 Gigabit Ethernet.

A wild card with a sizable presence at OCP was Qualcomm, which showed a bit more of its Centriq server processor, a 10nm chip with 48 ARM cores. The silicon is sampling now, said Ram Peddibhotla, vice president of product management for Qualcomm Datacenter Technologies.

Qualcomm is collaborating with the Microsoft Windows Server team to get this processor running Windows Server, in a version that is used only internally at Microsoft today.

“It’s really a collaboration between our two companies that spans the full spectrum, so we can accelerate Microsoft Cloud services,” he said. Peddibhotla worked at Intel in software and services strategy for 18 years before coming to Qualcomm two years ago.

Qualcomm isn’t yet disclosing die size, the fab or power consumption. It may do so later this year. What is known is that the processor will support 32 lanes of PCIe Gen 3.
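
For a rough sense of what 32 lanes of PCIe Gen 3 means in raw bandwidth, the arithmetic can be sketched as follows. The 8 GT/s per-lane rate and 128b/130b encoding come from the PCIe Gen 3 specification, not from Qualcomm. The function name is a made-up helper, and the result is a theoretical per-direction ceiling before protocol overhead, not a measured figure for this chip.

```python
def pcie_gen3_bandwidth_gbytes(lanes: int) -> float:
    """Theoretical PCIe Gen 3 bandwidth per direction, in GB/s.

    Hypothetical helper for illustration; ignores packet and protocol
    overhead, so real-world throughput will be lower.
    """
    transfers_per_second = 8.0e9   # 8 GT/s per lane (PCIe Gen 3)
    encoding_efficiency = 128 / 130  # 128b/130b line encoding
    bits_per_second = transfers_per_second * encoding_efficiency * lanes
    return bits_per_second / 8 / 1e9  # bits -> bytes -> GB


if __name__ == "__main__":
    print(f"{pcie_gen3_bandwidth_gbytes(32):.1f} GB/s per direction")
```

Under those assumptions, each lane carries a little under 1 GB/s, so 32 lanes work out to roughly 31.5 GB/s in each direction.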