Bus architecture and performance

Network (and other) interfaces attach to the motherboard using some kind of interconnect. Typically a motherboard offers several bus types, e.g., a number of PCI, AGP, PCI-X or similar slots. How adapters attach to these interconnects has significant performance implications.

Maximum performance of some buses

Newer Intel-based systems typically include the PCI Express (PCIe) bus, which scales to higher speeds than classical PCI and also features smaller physical adapters. The physical data transmission is quite different from earlier PCI buses and is based on serial "lanes", which can be bundled to obtain higher rates. However, the programming interface has been kept very close to that of PCI, so that older PCI drivers can work with PCIe buses and devices. Initially, PCIe was mainly used for graphics adapters, but now it is used for all kinds of I/O. Most Gigabit Ethernet and 10GE network adapters have PCIe connections, especially the higher-performance ones.

PCIe x1: 400 MB/sec

PCIe x2: 800 MB/sec

PCIe x4: 1.6 GB/sec

PCIe x8: 3.2 GB/sec

PCIe x16: 6.4 GB/sec

(PCIe "Gen 2" x16: 16 GB/sec peak bandwidth)
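As a rough illustration of lane bundling, the sketch below scales the x1 figure from the list linearly with the number of lanes and picks the smallest listed width whose peak rate covers a given Ethernet line rate. The function names are hypothetical, the per-lane value is simply the one quoted above, and protocol overhead is ignored, so the result is an upper bound rather than an expected throughput.

```python
from typing import Optional

# Per-lane peak figure taken from the list above (PCIe x1: 400 MB/sec).
MB_PER_LANE = 400

# Link widths that appear in the list above.
LANE_WIDTHS = (1, 2, 4, 8, 16)


def pcie_peak_mb(lanes: int) -> int:
    """Peak MB/s of a link, assuming bandwidth scales linearly with lanes."""
    return lanes * MB_PER_LANE


def smallest_width_for(line_rate_gbit: float) -> Optional[int]:
    """Smallest listed width whose peak rate covers the given line rate."""
    needed_mb = line_rate_gbit * 1000 / 8  # Gbit/s -> MB/s, decimal units
    for lanes in LANE_WIDTHS:
        if pcie_peak_mb(lanes) >= needed_mb:
            return lanes
    return None  # no listed width is fast enough


if __name__ == "__main__":
    for rate in (1, 10):
        print(f"{rate} Gbit/s needs at least x{smallest_width_for(rate)}")
```

With these assumptions a Gigabit Ethernet adapter fits in an x1 slot, while a 10GE adapter needs at least x4 before any overhead is considered.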

Intel has published a white paper, Hardware Level IO Benchmarking of PCI Express (see "References" section), which has detailed information about the various PCIe variants. It also explains the overhead with different transaction workloads. These explanations can be used to estimate what effective performance can be expected.

Older systems use PCI buses, or their high-performance successor, PCI-X.

PCI 32-bit/33 MHz: 133 MB/sec

PCI 2.x 64-bit/66 MHz: 533 MB/sec

PCI-X 1.0 64-bit/66 MHz: 533 MB/sec

PCI-X 1.0 64-bit/133 MHz: 1.07 GB/sec

PCI-X 2.0 64-bit/266 MHz: 2.13 GB/sec

PCI-X 2.0 64-bit/533 MHz: 4.26 GB/sec

All PCI and PCI-X buses are backward compatible, except that later models no longer support 5V adapters. Note that the slowest adapter on a bus determines the maximum speed of that bus. The speed available to each adapter also decreases when multiple adapters are plugged into the same bus. Regardless of the numbers above, an adapter in a 32-bit/33 MHz PCI slot can typically only achieve ~300-400 Mbit/s TCP/IP performance.
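The peak figures for the parallel buses above follow from bus width times clock rate. The sketch below reproduces that arithmetic and compares the result with the observed TCP/IP throughput. It assumes one word is transferred per clock cycle and uses nominal clocks that are multiples of 33.33 MHz, so results can differ by a MB/sec or so from the rounded figures in the list. The function names are illustrative.

```python
def parallel_bus_peak_mb(width_bits: int, clock_mhz: float) -> float:
    """Peak MB/s of a parallel bus that moves one word per clock cycle."""
    return width_bits / 8 * clock_mhz


def fraction_of_peak(achieved_mbit: float, peak_mb: float) -> float:
    """Share of the bus peak used by a throughput given in Mbit/s."""
    return achieved_mbit / (peak_mb * 8)


if __name__ == "__main__":
    buses = [
        ("PCI 32-bit/33 MHz", 32, 33.33),
        ("PCI 64-bit/66 MHz", 64, 66.66),
        ("PCI-X 64-bit/133 MHz", 64, 133.33),
    ]
    for name, width, clock in buses:
        print(f"{name}: {parallel_bus_peak_mb(width, clock):.0f} MB/sec")

    # 350 Mbit/s is the midpoint of the ~300-400 Mbit/s range quoted above.
    pci_peak = parallel_bus_peak_mb(32, 33.33)
    print(f"share of 32-bit/33 MHz peak: {fraction_of_peak(350, pci_peak):.0%}")
```

On this arithmetic, the ~300-400 Mbit/s seen in practice is roughly a third of the nominal peak of a 32-bit/33 MHz slot.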

For historical comparison, the predecessor of PCI was the ISA bus, which in its 16-bit form could only just support 100 Mb/s Ethernet at line rate:

ISA 8-bit/8 MHz: 8 MB/sec

ISA 16-bit/8 MHz: 16 MB/sec

Measuring bus performance

There are tools to measure the I/O performance of PCI* buses. For example, at least on HP Itanium systems running Linux, the pcitop utility may be of great help.

Common Issues: Bus support and adapter placement

Check what bus types and speeds a network adapter supports. Verify how many distinct channels the motherboard supports. Verify that the network adapter is the only card plugged into its bus, so that performance is not degraded by slower or additional cards.
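For PCIe adapters on Linux, one way to carry out such a check is to compare the link capability with the negotiated link status reported by `lspci -vv`. The sketch below parses text in that style. The sample string is made up for illustration, the helper names are hypothetical, and the exact output format may vary between lspci versions, so the regular expression is an assumption to verify against real output.

```python
import re

# Matches lines such as "LnkSta: Speed 2.5GT/s, Width x4" in `lspci -vv`
# style text. The colon right after Cap/Sta skips "LnkCap2"/"LnkSta2" lines.
LINK_RE = re.compile(r"Lnk(Cap|Sta):.*?Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def parse_links(lspci_text: str) -> dict:
    """Return {'Cap': (speed, width), 'Sta': (speed, width)} if present."""
    links = {}
    for kind, speed, width in LINK_RE.findall(lspci_text):
        # Keep the first occurrence of each kind.
        links.setdefault(kind, (float(speed), int(width)))
    return links


def is_downgraded(links: dict) -> bool:
    """True if the negotiated link is slower or narrower than the capability."""
    cap, sta = links.get("Cap"), links.get("Sta")
    if cap is None or sta is None:
        return False
    return sta[0] < cap[0] or sta[1] < cap[1]


# Illustrative sample, not captured from a real system.
SAMPLE = """\
        LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1
        LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train-
"""

if __name__ == "__main__":
    links = parse_links(SAMPLE)
    print(links, "downgraded" if is_downgraded(links) else "ok")
```

A link that reports a narrower width than the adapter is capable of usually means the card sits in a slot with fewer electrical lanes than its connector suggests.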

Common Issues: Memory read registers

The Maximum Memory Read Byte Count (MMRBC) set by default by the operating system, driver or BIOS may be low, e.g., 512 bytes. A higher value (e.g., 4096 bytes) can as much as double the performance in 10 Gbit/s-grade tests (in one case, from 2 Gbit/s to 4 Gbit/s). In some cases the value can be set in the BIOS; in others it must be adjusted through the operating system's PCI-X settings (e.g., with 'setpci' on Linux).
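When adjusting the value by hand, it helps to know how the byte count is encoded. The sketch below assumes the PCI-X command register layout in which bits 3:2 hold the MMRBC field (0 = 512, 1 = 1024, 2 = 2048, 3 = 4096 bytes). The register values used are illustrative, the function names are hypothetical, and the offset of the register is device-specific and not shown, so check the adapter's documentation before writing anything with setpci.

```python
# Assumed layout: bits 3:2 of the PCI-X command register encode the MMRBC.
MMRBC_SHIFT = 2
MMRBC_MASK = 0b11 << MMRBC_SHIFT
MMRBC_BYTES = (512, 1024, 2048, 4096)  # indexed by the 2-bit field value


def mmrbc_bytes(command_reg: int) -> int:
    """Decode the byte count from a PCI-X command register value."""
    return MMRBC_BYTES[(command_reg & MMRBC_MASK) >> MMRBC_SHIFT]


def with_mmrbc(command_reg: int, nbytes: int) -> int:
    """Return the register value with only the MMRBC field changed."""
    field = MMRBC_BYTES.index(nbytes)  # raises ValueError for other sizes
    return (command_reg & ~MMRBC_MASK) | (field << MMRBC_SHIFT)


if __name__ == "__main__":
    current = 0x22  # illustrative value: MMRBC field is 0, i.e. 512 bytes
    print(mmrbc_bytes(current))             # 512
    print(hex(with_mmrbc(current, 4096)))   # 0x2e
```

Changing only the two MMRBC bits leaves the other fields of the register as they were.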

It is not clear whether this is the same setting, but in some cases adjusting the "PCI payload size" in the BIOS made a huge difference, especially when using small packets, because the overhead in PCIe was significant. See the netdev post referenced below for more.
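The effect of payload size on overhead can be sketched with a simple model in which every PCIe packet carries a fixed number of overhead bytes on top of its payload. The 24-byte default below is an assumed ballpark for header, sequence number, CRC and framing, not a figure from the netdev post, and the function name is hypothetical.

```python
def link_efficiency(payload_bytes: int, overhead_bytes: int = 24) -> float:
    """Fraction of link bandwidth carrying payload, for a fixed
    per-packet overhead (assumed value, see the text above)."""
    return payload_bytes / (payload_bytes + overhead_bytes)


if __name__ == "__main__":
    for size in (64, 128, 256, 512):
        print(f"payload {size:4d} bytes: {link_efficiency(size):.1%} efficient")
```

Under this model, small payloads lose a noticeably larger share of the link to overhead than large ones, which is consistent with the observation above.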