PCI-X Exposed

PCI-X delivers a better, faster, safer
PCI. It is more complicated, particularly for bridges, but is worth
it for a significant upgrade in bandwidth and operating
efficiency.

There is no mystery to PCI-X. It is a straightforward
re-implementation of PCI, designed to improve PCI's efficiency,
especially for Reads. In fact, PCI-X uses most of PCI's basic
signaling and most of its command set. Like PCI, it is a
synchronous, arbitrated bus that multiplexes addresses and data
over the same lines.

PCI Revisited

We shouldn't forget just how far PCI has come. When PCI was
first created, it was implemented as a solder-in I/O bus for PC
motherboards. One of its main purposes was to replace the VESA
Local bus (VL) and provide a higher-bandwidth I/O path for PC
graphics, which were loading down existing I/O resources.
PCI was also designed to be a low-cost I/O bus for PCs. As such,
it had a minimum of I/O resources (four interrupts, processed
round-robin) and supported a PC host-oriented style of processing.
All of this caused problems downstream, when we started using PCI
as an embedded system bus (CompactPCI) for telecom systems and as
a mezzanine bus (PMC and PC-MIP) for real-time systems.

And then there is the PCI Read problem. PCI has historically
delivered poor bandwidth on bus Reads. The problem is that the
target PCI controller has to fetch the data, usually from some
device. That access and transfer can take a long time, and the PCI
bus sits waiting for the Read data. In effect, the bus can hang
until the data is ready to return, which ties up a critical system
resource.

Son of AGP

The AGP bus, a variant of PCI, was designed by Intel to solve
the 3-D graphics problem. History was repeating itself: the memory
transactions needed for 3-D rendering and processing were again
overloading the PCI bus, just as graphics had overloaded PCI's
predecessor buses. Intel's solution was AGP. The AGP bus used the
basic PCI commands, signals, and lines. However, it was really a
point-to-point connection, providing a direct path to memory for
the graphics engine. Thus the engine could use cheaper main-memory
DRAM to store its intermediate figures.

AGP added a sideband bus, a small bus parallel to the main PCI
bus. This sideband bus carried the commands, addresses, and status,
off-loading the main PCI bus. AGP also used the sideband bus to
create split transactions: bus transactions where the command is
one transaction and the requested action is another, possibly
later, transaction. Thus it converted Host Reads (to targets) into
Target Writes (to the host), which eliminated the read access
delays holding up the bus. AGP also introduced transaction byte
counts, which give the number of bytes in a transaction. With the
count known up front, the target or host knows how long the
transaction will be and does not have to react quickly to a bus
signal that ends it.

PCI-X does all this and more. Unlike AGP, it doesn't use a
sideband bus to carry the transaction information. Instead, PCI-X
relies on a 36-bit attribute word to deliver that information to
the transaction target. It uses split transactions to beat the
Read problem, and a split transaction can be broken into multiple
transactions, each with its own transaction byte count. PCI-X is
basically an enhanced PCI, as is AGP. Both can run the PCI command
set.

PCI-X is backward compatible with PCI. If a PCI card is inserted
into a PCI-X system, the system drops down to PCI-level operation
and will not execute PCI-X operations. Thus, to take advantage of
PCI-X, engineers will have to use only PCI-X cards and boards.

PCI-X Basics

At the conceptual level, PCI-X represents a straightforward PCI
derivative architecture (bridging and some other operations can get
complicated). Here are the PCI-X basics:

PCI-X is defined as both a 32-bit and a 64-bit bus, but its real
efficiency comes at 64 bits.

In the spec, PCI-X is defined for two bus speeds: 66 and
133 MHz.
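
To put the width and speed options side by side, here is a minimal
sketch of the theoretical peak numbers. It assumes one data transfer
per clock and ignores address and attribute phases, arbitration, and
wait-states, so real throughput will be lower. The function name is
ours, not the spec's.

```python
def peak_bandwidth_mb_per_s(bus_width_bits, clock_mhz):
    """Theoretical peak, assuming one full-width transfer per clock."""
    bytes_per_transfer = bus_width_bits // 8
    # bytes/transfer * million transfers/second = MB/s (10^6 bytes)
    return bytes_per_transfer * clock_mhz

for width in (32, 64):
    for clock in (66, 133):
        rate = peak_bandwidth_mb_per_s(width, clock)
        print(f"{width}-bit @ {clock} MHz: {rate:.0f} MB/s")
```

On these assumptions the 64-bit, 133-MHz bus peaks at 1064 MB/s,
four times the 264 MB/s of the 32-bit, 66-MHz case.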

Bus loading for PCI-X is currently being characterized. Initial
work shows that the bus at 66-MHz can drive five to seven slots,
making it an option to replace or extend CompactPCI, which is based
on PCI. Loading for the 133-MHz version is on the order of one to
two slots.

PCI-X introduces the concept of a transaction sequence: the
one or more transactions that make up a single logical transfer.
Each sequence is assigned an ID, which is the combination of the
Requester's ID and Tag attributes.

PCI-X Split Transactions replace PCI's Delayed Transactions.
Any transaction other than a Memory Write can be completed using
the Split Transaction protocol. The target terminates the
transaction by signaling a Split Response and then launches one
or more Split Completion transactions to complete the original
request. In effect, the original transaction is converted into a
series of target-initiated Split Completions. For example, a Memory
Read becomes a sequence of Memory Writes.
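
The flow is easier to see in a toy model. The sketch below is a
software illustration only, not bus-accurate: the class names, the
"SPLIT_RESPONSE" string, and the 128-byte completion size are all
assumptions made for the example. It shows the requester tagging a
read with a sequence ID (Requester ID plus Tag), the target releasing
the bus with a Split Response, and the target later pushing the data
back as Split Completions matched by that ID.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class SequenceId:
    requester_id: int  # stands in for the requester's bus identity
    tag: int           # separates outstanding sequences from one requester

@dataclass
class Requester:
    requester_id: int
    next_tag: int = 0
    pending: dict = field(default_factory=dict)   # SequenceId -> bytes still owed
    received: dict = field(default_factory=dict)  # SequenceId -> data so far

    def start_read(self, byte_count):
        seq = SequenceId(self.requester_id, self.next_tag)
        self.next_tag += 1
        self.pending[seq] = byte_count
        self.received[seq] = bytearray()
        return seq

    def on_split_completion(self, seq, data):
        # Completions are matched to the original request by sequence ID.
        self.received[seq] += data
        self.pending[seq] -= len(data)
        if self.pending[seq] == 0:
            del self.pending[seq]

class Target:
    def __init__(self, memory):
        self.memory = memory
        self.deferred = []  # reads that were answered with a Split Response

    def memory_read(self, seq, address, byte_count):
        # Data is not ready: free the bus now instead of stalling it.
        self.deferred.append((seq, address, byte_count))
        return "SPLIT_RESPONSE"

    def run_completions(self, requester, chunk=128):
        # Later the target acts as initiator and sends the data back.
        for seq, address, byte_count in self.deferred:
            for offset in range(0, byte_count, chunk):
                n = min(chunk, byte_count - offset)
                start = address + offset
                requester.on_split_completion(seq, self.memory[start:start + n])
        self.deferred.clear()

target = Target(bytes(range(256)) * 2)  # 512 bytes of sample data
req = Requester(requester_id=0x12)
seq = req.start_read(300)
response = target.memory_read(seq, 0, 300)
target.run_completions(req)
print(response, len(req.received[seq]), req.pending)
```

Between the Split Response and the first completion the bus is free
for other traffic, which is the whole point of the protocol.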

A transaction can be terminated early (on an ADB boundary or on
the first data phase). The transaction can be completed by one or
more transactions initiated by the target.

On starting a transaction, PCI-X puts the address on the AD
lines in the first cycle and a 36-bit attribute word in the second
cycle (32 bits on the AD lines and 4 bits on the C/BE lines). The
attribute word provides the transaction byte count, a Tag (a
sequence number for the requester), the requester's bus and device
numbers, and function/status bits (the no-snoop and relaxed-order
bits).
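
As a rough illustration of how such a word might be assembled, the
sketch below packs the fields named above into a 32-bit AD value and
a 4-bit C/BE value. The bit positions, the field widths, the function
number field, and the convention that a 12-bit count of zero stands
for 4096 bytes are assumptions made for this example; check the PCI-X
specification before relying on any of them.

```python
def pack_attribute(byte_count, tag, bus, device, function=0,
                   relaxed_order=False, no_snoop=False):
    """Return (ad, cbe): 32 bits for the AD lines, 4 bits for C/BE."""
    if not 1 <= byte_count <= 4096:
        raise ValueError("byte_count must be 1..4096")
    for name, value, bits in (("tag", tag, 5), ("bus", bus, 8),
                              ("device", device, 5), ("function", function, 3)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    count = byte_count & 0xFFF          # assumed: 4096 is encoded as 0
    ad = ((count & 0xFF)                # assumed: low 8 bits of byte count
          | (function << 8)
          | (device << 11)
          | (bus << 16)
          | (tag << 24)
          | (int(relaxed_order) << 29)
          | (int(no_snoop) << 30))
    cbe = count >> 8                    # assumed: high 4 bits of byte count
    return ad, cbe

def unpack_attribute(ad, cbe):
    """Inverse of pack_attribute, using the same assumed layout."""
    count = ((cbe & 0xF) << 8) | (ad & 0xFF)
    return {
        "byte_count": count or 4096,
        "function": (ad >> 8) & 0x7,
        "device": (ad >> 11) & 0x1F,
        "bus": (ad >> 16) & 0xFF,
        "tag": (ad >> 24) & 0x1F,
        "relaxed_order": bool((ad >> 29) & 1),
        "no_snoop": bool((ad >> 30) & 1),
    }

ad, cbe = pack_attribute(300, tag=5, bus=2, device=3, relaxed_order=True)
print(hex(ad), hex(cbe), unpack_attribute(ad, cbe))
```

Whatever the exact layout, the useful property is that the target
sees the byte count and the requester's identity before the first
data phase.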

PCI-X multiple-word transactions are controlled by a byte
count. PCI-X defines Allowable Disconnect Boundaries (ADBs),
aligned on 128-byte boundaries. Intermediate transactions can
terminate either on these boundaries or when the byte count is
exhausted.
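
The boundary arithmetic can be sketched in a few lines. This example
lists every point at which a transfer could legally be cut, given a
start address and a byte count. It shows the finest possible split; a
real device may disconnect at any subset of these boundaries or not
at all. The function name is hypothetical.

```python
BOUNDARY = 128  # disconnect boundaries fall on 128-byte-aligned addresses

def split_at_boundaries(start_address, byte_count):
    """Return (address, length) pieces, each ending on a boundary
    or at the point where the byte count is exhausted."""
    pieces = []
    address, remaining = start_address, byte_count
    while remaining > 0:
        next_boundary = (address // BOUNDARY + 1) * BOUNDARY
        length = min(remaining, next_boundary - address)
        pieces.append((address, length))
        address += length
        remaining -= length
    return pieces

print(split_at_boundaries(0x1F0, 300))
# [(496, 16), (512, 128), (640, 128), (768, 28)]
```

A 300-byte transfer starting 16 bytes short of a boundary therefore
offers three legal disconnect points before the count runs out.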

Unlike PCI, PCI-X restricts the use of wait-states to improve bus
performance. Initiators cannot insert wait-states, and targets can
insert only initial wait-states.

PCI-X was designed for large burst transfers. The standard
block size is 128 bytes. The maximum block size is 4096 bytes.

PCI-X implements "relaxed ordering" for bridged transactions.
Setting the relaxed-order bit in the attribute field lets bridges
process posted transactions out of order.
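
A toy model shows why this helps. The sketch below is our own
simplification, not the specification's ordering rules: a bridge
holds a queue of posted transactions, one of which cannot be
delivered yet. Strictly ordered transactions behind it must wait, but
one with the relaxed-order bit set may pass. The Posted class, its
ready flag, and the forward function are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Posted:
    name: str
    relaxed_order: bool  # the attribute bit described above
    ready: bool          # hypothetical: can the destination accept it now?

def forward(queue):
    """Return the names a toy bridge would forward in this round."""
    sent = []
    stalled = False
    for txn in queue:
        if not txn.ready:
            stalled = True          # this transaction holds its place
            continue
        if stalled and not txn.relaxed_order:
            continue                # strictly ordered: wait behind the stall
        sent.append(txn.name)       # in order, or allowed to pass
    return sent

queue = [
    Posted("A", relaxed_order=False, ready=False),
    Posted("B", relaxed_order=False, ready=True),
    Posted("C", relaxed_order=True, ready=True),
]
print(forward(queue))  # ['C']
```

Without the bit, C would sit behind A along with B; with it, the
bridge keeps doing useful work while A is stalled.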

Unlike PCI, which gates bus signals and data before setting
values in registers, PCI-X is a registered bus. Bus signals and
data are strobed into registers without intermediate gating to
minimize bus latency.