Large companies like Facebook, Google, and Microsoft as
well as a number of small and medium enterprises daily
process massive amounts of data in batch jobs and in real
time applications. This generates high network traffic,
which is hard to support using traditional, oversubscribed,
network infrastructures. To address this issue, several alternative
network topologies have been proposed, aiming
to increase the bandwidth available in enterprise clusters.
We observe that in many of the commonly used workloads,
data is aggregated during the process and the output
size is a fraction of the input size. This motivated us to explore
a different point in the design space. Instead of increasing
the bandwidth, we focus on decreasing the traffic
by pushing aggregation from the edge into the network.
We built Camdoop, a MapReduce-like system running
on CamCube, a cluster design that uses a direct-connect
network topology with servers directly linked to other
servers. Camdoop exploits the property that CamCube
servers forward traffic, to perform in-network aggregation
of data during the shuffle phase. Camdoop supports
the same functions used in MapReduce and is compatible
with existing MapReduce applications. We demonstrate
that, in common cases, Camdoop significantly reduces
the network traffic and provides high performance
increase over a version of Camdoop running over a
switch and against two production systems, Hadoop and
Dryad/DryadLINQ.