Evolution: 20 years of switching fabric

Packet switching equipment is a comparatively young market, with merely 20 years of evolution. At the core of packet switching equipment lies the switch fabric; this single sub-system is probably the most significant contributor to network scalability. This paper describes the evolution of the switching fabric.

The switch fabric provides three functions: it switches traffic from one fabric port to another while providing fairness between the fabric ports, it provides quality of service (QoS) in case of port congestion, and it delivers resiliency to faults. As the fabric is the core of the platform, it is required to scale in capacity and to enable rapid evolution of the system QoS.

The capacity of a switch fabric is defined as the total bandwidth switched without packet drops under any traffic pattern. A "non-blocking" fabric is one whose capacity and QoS are agnostic to the traffic pattern switched through it, and whose capacity equals the sum of its port rates.
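The non-blocking condition above can be stated as a one-line check. This is a minimal sketch (the function name and Gb/s units are illustrative, not from the text):

```python
def is_nonblocking(fabric_capacity_gbps, port_rates_gbps):
    """A fabric is non-blocking when its switching capacity covers the
    sum of all fabric port rates, regardless of traffic pattern."""
    return fabric_capacity_gbps >= sum(port_rates_gbps)

# 16 ports at 10 Gb/s require at least 160 Gb/s of fabric capacity.
print(is_nonblocking(160, [10] * 16))  # True
print(is_nonblocking(120, [10] * 16))  # False
```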

A switch maintains fairness if no port has an advantage over another port in the system. QoS is the ability to differentiate between packets competing for an egress port in case of congestion. A fabric that is aware of network ports in addition to the fabric ports typically makes better-informed decisions, enhancing both QoS and fairness. Sitting at the core of the switching platform, the fabric's resiliency to faults is critical, and must be addressed by both a resilient architecture and a redundancy scheme.
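To make the QoS notion concrete, the following toy egress scheduler differentiates between packets competing for a congested output using strict priority. The class and names are hypothetical; real fabrics implement this in hardware with per-port queues:

```python
import heapq

class EgressScheduler:
    """Toy strict-priority egress queue: under congestion, packets with a
    lower numeric priority value are transmitted first."""
    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker keeps FIFO order within one priority

    def enqueue(self, packet, priority):
        heapq.heappush(self._heap, (priority, self._seq, packet))
        self._seq += 1

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

sched = EgressScheduler()
sched.enqueue("bulk-1", priority=3)
sched.enqueue("voice-1", priority=0)
sched.enqueue("bulk-2", priority=3)
print(sched.dequeue())  # voice-1 (highest priority wins under congestion)
print(sched.dequeue())  # bulk-1 (FIFO among equal priorities)
```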

Shared Bus
The shared bus was the first switching architecture in use. As its name reflects, it is a shared bus medium over which a number of input-output (IO) devices communicate. No contention is allowed on the bus, i.e., at each point in time only a single source is allowed to send traffic on the bus. Typically, contention is resolved via a centralized arbiter that grants a source permission to send traffic on the bus.
In shared bus systems, non-blocking means that the sum of the fabric port rates is less than the bus rate; that is, the capacity of the system is bounded by the bus capacity. Even if the total bandwidth is kept below the bus capacity, the number and capacity of IO devices are bounded by the performance of the centralized arbiter. Examples of the shared bus architecture are the Cisco 1900 and VLSI vendor Galileo's (Marvell) "Galnet" chipset.
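One common way such a centralized arbiter avoids starvation is round-robin rotation. This is a sketch of one bus cycle's grant decision, under the assumption of a rotating-priority scheme (the text does not specify the arbitration algorithm):

```python
def round_robin_arbiter(requests, last_granted):
    """Grant exactly one requesting source per bus cycle, scanning from
    the source after the last grant so every device eventually gets the
    bus (no starvation)."""
    n = len(requests)
    for offset in range(1, n + 1):
        candidate = (last_granted + offset) % n
        if requests[candidate]:
            return candidate
    return None  # no source is requesting this cycle

# Sources 0 and 2 request; the last grant went to 0, so 2 goes next.
print(round_robin_arbiter([True, False, True], last_granted=0))  # 2
```

Because every grant decision passes through this single function, its speed bounds the number and rate of IO devices, as noted above.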

Shared Memory
Improving on the shared bus architecture brought about the "high performance" shared memory architecture. The shared memory architecture is based on a large-capacity memory which, in advanced designs, is distributed among several memory controllers. Each memory controller is connected to all of the IO devices. The memory is typically organized as a set of queues, each assigned to an output IO device or network port.
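The per-output queue organization can be modeled in a few lines. This is an illustrative sketch only (class and method names are invented), showing any input writing into a common pool of queues while each output drains its own:

```python
from collections import defaultdict, deque

class SharedMemoryFabric:
    """Toy shared-memory fabric: one logical queue per output port, all
    backed by a common buffer pool."""
    def __init__(self):
        self.queues = defaultdict(deque)  # output port -> packet queue

    def write(self, out_port, packet):
        # Any input device may write to any output's queue.
        self.queues[out_port].append(packet)

    def read(self, out_port):
        # Each output device drains only its own queue.
        q = self.queues[out_port]
        return q.popleft() if q else None

fabric = SharedMemoryFabric()
fabric.write(out_port=1, packet="p0")
fabric.write(out_port=1, packet="p1")
print(fabric.read(1))  # p0
```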

For non-blocking operation, the guaranteed memory "write" bandwidth should equal the combined maximum bandwidth of all input devices; the same holds for the "read" bandwidth. Analyzing the shared memory architecture, one reaches the following conclusions:

Cost--The shared memory serves as the main buffer for the system. In carrier systems this is on the order of 100 ms of buffering at the system capacity. A practical issue is the initial cost of the system, as it has to be equipped from day one with its full shared memory fabric.

Scale--In a shared memory architecture, the size of the system is limited by the number of links (or the total bandwidth) a memory controller ASIC can handle. Figure 2 presents a system with N fabric ports (N IO devices). Each memory controller is connected to all N input and output devices. In addition, it connects to its local memory, supporting the write and read bandwidth of these N connections. That is, for a memory controller to support the bandwidth of N connections, its normalized IO-pin capacity is 4×N (N input connections, N output connections, N for memory write, and N for memory read). This limits the capacity of the ASIC that can be built, and thus the overall system size.
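The 4×N pin budget can be written out directly. A small worked example, with the function name and Gb/s figures chosen for illustration:

```python
def controller_io_budget_gbps(n_ports, port_rate_gbps):
    """Normalized I/O bandwidth one memory controller must terminate:
    N input links + N output links + N memory write + N memory read."""
    return 4 * n_ports * port_rate_gbps

# A 64-port system at 10 Gb/s per port demands 2560 Gb/s of controller
# I/O bandwidth -- the scaling wall described above.
print(controller_io_budget_gbps(64, 10))  # 2560
```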

On the other hand, shared memory systems are known for their excellent QoS performance. Examples of shared memory architectures are the Juniper M series and VLSI vendor MMC Networks' (AMCC) chipset.
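The 100 ms buffer sizing mentioned in the Cost point can be made concrete with delay-times-capacity arithmetic (function name and capacity figure are illustrative):

```python
def buffer_size_bytes(system_capacity_gbps, delay_ms=100):
    """Rule of thumb from carrier systems: buffer roughly
    delay x capacity worth of traffic (here 100 ms)."""
    bits = system_capacity_gbps * 1e9 * (delay_ms / 1e3)
    return bits / 8

# A 100 Gb/s system buffered for 100 ms needs 1.25 GB of shared memory,
# all of which must be provisioned from day one.
print(buffer_size_bytes(100))  # 1250000000.0
```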