Towards predictable datacenter networks

Submitted by richard on Sun, 2013-10-27 15:30

Authors:

Hitesh Ballani, Paolo Costa, Thomas Karagiannis and Ant Rowstron

Published:

SIGCOMM

This paper looks at the issue of reducing variability in performance
in data centre networks. Variable network performance can lead to
unreliable application performance – a particular problem for cloud
applications. Virtual networks are proposed as a solution, isolating
each “tenant’s” performance from the physical network infrastructure.
The system presented, known as Oktopus, maps virtual networks onto
the physical network, providing a tradeoff between guarantees to
tenants, costs to tenants and profits to providers.

In the introduction a useful study of intra-cloud network bandwidth
is made by collating results from past studies (figure 1). Large
differences (a factor of five or more in some studies) are observed.

Tenant requests are abstracted to simply a number N of virtual
machines (assumed standard in terms of CPU, memory and storage) and a
bandwidth B, making a virtual switch with a total bandwidth of N × B
but with a limit of B on each VM’s link. In this case the
subscription request is <N, B>. This is known as a “Virtual cluster”.
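As a minimal sketch (my own illustration, not the paper’s code), the virtual cluster abstraction amounts to:

```python
from dataclasses import dataclass

@dataclass
class VirtualCluster:
    """A tenant request <N, B>: N identical VMs, each connected to a
    single virtual switch by a link of bandwidth B."""
    N: int  # number of VMs
    B: int  # per-VM link bandwidth (e.g. in Mbps)

    def switch_capacity(self) -> int:
        # The virtual switch carries at most N * B in aggregate,
        # but each individual VM is capped at B.
        return self.N * self.B

req = VirtualCluster(N=100, B=500)
print(req.switch_capacity())  # 50000
```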

Some networks are oversubscribed – in these cases a
request is described as <N, B, S, O>, where S is the size of the
groups in which the machines are clustered (giving N/S groups) and O
is an oversubscription factor. VMs in a group are connected by a
virtual group switch with per-VM links of capacity B, and groups are
interconnected by a link of capacity B′ = S × B / O. This is known as
a “Virtual oversubscribed cluster”.
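A corresponding sketch for the oversubscribed variant (again my own toy code; the group uplink capacity S × B / O follows from the description above):

```python
from dataclasses import dataclass

@dataclass
class VirtualOversubscribedCluster:
    """A tenant request <N, B, S, O>: N VMs in groups of size S,
    oversubscribed by a factor O between groups."""
    N: int    # total VMs
    B: int    # per-VM link bandwidth within a group
    S: int    # group size
    O: float  # oversubscription factor

    def num_groups(self) -> int:
        return self.N // self.S

    def group_uplink_capacity(self) -> float:
        # Each group switch connects to the root by a link of
        # capacity B' = S * B / O.
        return self.S * self.B / self.O

voc = VirtualOversubscribedCluster(N=100, B=500, S=10, O=4.0)
print(voc.num_groups(), voc.group_uplink_capacity())  # 10 1250.0
```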

Other topologies could be offered – for example a clique (which would have high
max bandwidth but limit provider flexibility).

The Oktopus system (named after Paul the psychic octopus – hence the
German spelling) implements these virtual abstractions. Requesting
tenants can ask for a virtual cluster or a virtual oversubscribed
cluster. A logically centralised management plane performs admission
control and maps requests to physical machines. A data plane uses
rate-limiting at end-hosts to enforce bandwidth limits.

Tenants without virtual networks should be guaranteed a fair share
of network bandwidth. Tenants on virtual networks have their traffic
treated as high priority, with guaranteed but limited bandwidth;
other traffic gets a fair share of the residual bandwidth. Limiting
the share of total bandwidth used by virtual networks guarantees
a minimum QoS for non-virtual-network clients.
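As a rough sketch of this sharing policy (hypothetical numbers and a function name of mine, assuming the residual is split evenly among non-virtual-network flows):

```python
def residual_fair_share(link_capacity, reserved_rates, n_best_effort_flows):
    """Virtual-network tenants get their reserved (high-priority)
    rates; the remaining best-effort flows split what is left of the
    link fairly."""
    residual = link_capacity - sum(reserved_rates)
    return residual / n_best_effort_flows

# A 10,000 Mbps link, 6,000 Mbps reserved for virtual networks,
# shared by 4 non-virtual-network flows:
print(residual_fair_share(10_000, [4_000, 2_000], 4))  # 1000.0
```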

In non-tree networks, routing may be an issue; load balancing
is used to spread traffic across paths.

Evaluation is by large-scale simulation, together with a deployment
on a testbed of 25 physical machines. A three-level tree topology
with no path diversity is used. The simulation is of a datacentre
with 16,000 physical machines and 4 VMs per machine. The machines are
arranged in racks of 40, each with a top-of-rack switch; these
connect to aggregation switches, which connect to a datacentre core
switch.

The allocation algorithm for VMs is greedy (since the full problem
is NP-hard).
If m out of N VMs are placed on a single physical machine
then the remaining N − m must be on other physical machines, and
bandwidth min(m, N − m) × B must be available on the machine’s
outbound link.
Assume K slots for VMs per machine, with each link l having
capacity C_l; k_M is the number of empty slots on machine M and
R_l is the residual bandwidth for link l. The set of numbers of VMs
for request r = <N, B> that can be allocated on machine M with
outbound link l is:
{m : 0 ≤ m ≤ min(k_M, N), min(m, N − m) × B ≤ R_l}.
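The feasibility test above can be expressed directly (a toy version with my own names, not the paper’s implementation):

```python
def allocatable_vm_counts(N, B, empty_slots, residual_bw):
    """Numbers of VMs m of a request <N, B> that could be placed on a
    machine with `empty_slots` free slots, given `residual_bw` on its
    outbound link: if m VMs are local, the other N - m are remote, so
    the link must carry min(m, N - m) * B."""
    return [m for m in range(min(empty_slots, N) + 1)
            if min(m, N - m) * B <= residual_bw]

# A machine with 4 free slots and 1000 Mbps residual, request <6, 300>:
print(allocatable_vm_counts(6, 300, 4, 1000))  # [0, 1, 2, 3, 4]
```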

Simulations are run with tenant jobs modelled as independent tasks,
with flows of uniform length between tasks on individual VMs.
The simulated workload makes some simple assumptions about task
completion times, and the network is assumed to have more demand
for bandwidth than it can handle. The baseline comparison is tenants
asking only for a number of machines <N>, which are greedily
allocated as close together as possible. This is extended to <N, B>
and <N, B, S, O> to see the effects on completion times.
Completion times are much reduced by placement using the
oversubscribed policies, as bandwidth is used more efficiently. In
addition, the efficiency would allow providers to operate clouds
closer to 100% utilisation (Amazon is apparently at 70–80%).