A Chinese PhD candidate at the country's National University of Defense Technology has attempted to solve some of the networking problems that crop up when businesses try to chain multiple datacentres together.

The method, which aims to assure solid network performance in so-called "mega datacentres" (facilities built by linking together multiple modules of 1,000 to 2,000 servers each), lets distributed software applications maintain performance even after multiple hardware failures.

Containerised datacentres can cause networking problems, so a Chinese researcher has set out to solve them. Image credit: Jack Clark

In a paper entitled "SCautz: A Fault Tolerant Network Architecture for Modular Datacenter" (PDF), lead researcher Feng Huang writes that as cloud providers have seen the amount of IT they manage grow, they have resorted to packing equipment into shipping containers and linking these together. While many cloud companies, such as Google (video) and Amazon Web Services, have adopted this technique, Huang argues that any crimp in inter-container networking performance can have a huge effect on facilities.

"As the crucial component of modular datacentres [MDC], modular datacentre networks' [MDCN] incomplete structure should try its best to retain the network performance," Huang writes. "The most important point is that the performance of MDCN must degrade more gracefully than MDC's computation and storage do, so as not to become the fatal weakness that make containers' overall performance below the threshold criterion ahead of time."

The SCautz method

To tackle this, Huang and his team have come up with a new way of structuring the network, called 'SCautz'.

SCautz allows servers to carry out many of the typical functions of network switches, leaving the actual switches to focus on inter-container data transfer. This approach shares the philosophy of the nascent field of software-defined networking, in which companies such as Nicira, recently acquired by VMware, attempt to move networking away from proprietary hardware and onto basic servers.

The SCautz method assumes that the operator is using commodity off-the-shelf switches: stripped-down switches that lack much of the additional software and hardware found in products from the major networking companies.

In simulations, the team pitted SCautz against a Microsoft-led experimental network architecture for modular datacentres, named BCube (PDF).

SCautz performed almost as well as BCube in testing, but required far fewer switches, lowering the overall cost of the datacentre network.

In addition, in cases where 10 to 20 percent of network hardware failed, SCautz networks saw their throughput drop by between 6.91 and 13.74 percent respectively, compared with BCube, which saw falls of between 15.3 and 25.23 percent.

Consequently, SCautz was able to route around failed hardware well enough that network performance degraded by less than the proportion of hardware that became unavailable. This makes SCautz networks more resilient to hardware failures, giving datacentre operators greater flexibility when responding to a hardware crisis.
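As a rough back-of-envelope check on those figures, the reported throughput drops can be compared with the fraction of hardware lost. This is a sketch using only the numbers quoted above; the "degradation ratio" framing is an illustration, not a metric from the paper:

```python
# Illustrative only: relate the article's reported throughput losses to the
# fraction of failed hardware. A ratio below 1.0 means the network degrades
# more gracefully than the hardware itself, matching the claim above.

reported = {
    # failure rate -> % throughput drop, as quoted in the article
    "SCautz": {0.10: 6.91, 0.20: 13.74},
    "BCube":  {0.10: 15.30, 0.20: 25.23},
}

def degradation_ratio(failure_rate: float, throughput_drop_pct: float) -> float:
    """Fraction of throughput lost per fraction of hardware lost."""
    return (throughput_drop_pct / 100.0) / failure_rate

for arch, points in reported.items():
    for rate, drop in sorted(points.items()):
        ratio = degradation_ratio(rate, drop)
        print(f"{arch}: {rate:.0%} failures -> {drop}% throughput drop "
              f"(ratio {ratio:.2f})")
```

On the reported numbers, SCautz's ratio stays below 1.0 at both failure rates (about 0.69), while BCube's exceeds 1.0, which is consistent with the graceful-degradation claim.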

The next stage in the research is to design an inter-container network by connecting multiple SCautz-based containers, Huang writes. Huang is currently working toward a PhD at China's National University of Defense Technology (NUDT).

Jack Clark has spent the past three years writing about the technical and economic principles that are driving the shift to cloud computing. He's visited data centers on two continents, quizzed senior engineers from Google, Intel and Facebook on the technologies they work on and read more technical papers than you care to name on topics f...