AMD Radeon HD 6990 4GB Review

The Architecture behind Antilles

AMD’s Antilles card uses a pair of Cayman cores that have been specifically binned for low power consumption and heat output. Unfortunately, because the cores are so stringently binned, we will likely see a mere trickle of these cards making their way into the retail channel.

One of the main benefits of using such select GPUs is the ability to utilize fully enabled dies instead of trading off SIMD blocks for additional power savings. As we will see on the next pages, AMD stuck with fully enabled cores but did have to sacrifice clock speeds in order to retain acceptable power consumption numbers. For a more in-depth presentation of the Cayman architecture, we recommend that you visit our HD 6970 / HD 6950 launch article.

A bird’s-eye view of a single Cayman core, two of which reside in Antilles, really doesn’t show much of a departure from Cypress, but there are some noteworthy changes. Since AMD has moved to a simplified VLIW4 architecture for the thread processors, the number of SIMD engines has been increased by four for a total of 24. Each of these engines features 16 thread processors with four ALUs each (for a total of 64 ALUs per SIMD), four texture units, 512KB of L2 texture cache and 64KB allocated to the local data share. This means a fully enabled core will have 1536 shaders and 96 TMUs, while the ROP array layout hasn’t changed from Cypress with its 32 colour and 128 z-stencil ROPs. Multiply this by two and you have a general idea of what Antilles can bring to the table.
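The unit counts quoted above follow directly from the per-SIMD figures. As a quick sanity check, here is a minimal Python sketch of that arithmetic; the constant and function names are our own and purely illustrative, not anything AMD publishes.

```python
# Per-SIMD figures as quoted in the text for a fully enabled Cayman core.
SIMD_ENGINES = 24
THREAD_PROCESSORS_PER_SIMD = 16
ALUS_PER_THREAD_PROCESSOR = 4   # VLIW4
TEXTURE_UNITS_PER_SIMD = 4
GPUS_PER_CARD = 2               # Antilles carries two Cayman cores


def core_totals(simd_engines=SIMD_ENGINES):
    """Return ALU and texture unit totals for one core with the given SIMD count."""
    alus_per_simd = THREAD_PROCESSORS_PER_SIMD * ALUS_PER_THREAD_PROCESSOR
    return {
        "alus_per_simd": alus_per_simd,
        "shaders": simd_engines * alus_per_simd,
        "tmus": simd_engines * TEXTURE_UNITS_PER_SIMD,
    }


cayman = core_totals()
# Doubling the per-core totals gives the card-level figures for Antilles.
antilles = {key: cayman[key] * GPUS_PER_CARD for key in ("shaders", "tmus")}

print("Cayman core:", cayman)
print("Antilles card:", antilles)
```

Passing a lower SIMD count to core_totals() shows what a cut-down die would have looked like had AMD gone that route instead of lowering clock speeds.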

All in all, Cayman may have fewer shader processors than Cypress, but the processors themselves are slightly more efficient and the architecture gains texture processing power from the four additional SIMD engines.
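To put rough numbers on that trade-off, the sketch below compares the two cores. The Cayman figures come from this page; the Cypress figures (20 SIMD engines, VLIW5, four texture units per SIMD) are the commonly published HD 5870 specifications rather than anything stated here, and the class and helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuCore:
    name: str
    simd_engines: int
    thread_processors_per_simd: int
    alus_per_thread_processor: int
    texture_units_per_simd: int

    @property
    def shaders(self) -> int:
        return (self.simd_engines
                * self.thread_processors_per_simd
                * self.alus_per_thread_processor)

    @property
    def tmus(self) -> int:
        return self.simd_engines * self.texture_units_per_simd


# Cypress values are an assumption taken from published HD 5870 specs.
cypress = GpuCore("Cypress", 20, 16, 5, 4)
cayman = GpuCore("Cayman", 24, 16, 4, 4)


def percent_change(old: int, new: int) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


shader_delta = percent_change(cypress.shaders, cayman.shaders)
tmu_delta = percent_change(cypress.tmus, cayman.tmus)
print(f"Shaders: {cypress.shaders} -> {cayman.shaders} ({shader_delta:+.1f}%)")
print(f"TMUs:    {cypress.tmus} -> {cayman.tmus} ({tmu_delta:+.1f}%)")
```

Raw unit counts say nothing about per-unit efficiency or clock speeds, so this only frames the comparison; it does not predict performance.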

Much like on the Barts series, we can also see that in an effort to further increase rendering efficiency, AMD has broken up the Ultra Threaded Dispatch Processor into two, with each section having its own instruction and constant cache. This dispatch processor basically acts like a traffic cop, directing draw calls to the SIMD arrays. With each section directing its own “half” of the SIMD engines, rendering information can be processed at a much quicker rate.

Since geometry performance has been the overriding focus here, we can naturally expect Cayman-based cards to run circles around the HD 5800-series in some games. However, not all first-generation DX11 games incorporate complex geometry or high levels of tessellation. DX10 and, to a greater extent, DX9 applications lack a real need for increased performance in this area, which may very well lead to a relatively minor gap between AMD’s current and past generations.

In order to facilitate communication between the two cores, AMD has once again gone with a PCI-E switch from PLX. For the Antilles card, an ultra-low-latency 8647 chip is used, providing a total of 48 second-generation PCI-E lanes, which should eliminate any bottlenecks seen with previous generations. The HD 5970 used this same switch, but due to communication differences between the Cypress and Cayman cores, Antilles should be better able to utilize it. There is no magical Sideport wand here, just a fast 7.92 Gbps interconnect between the two GPUs.

The 48 bi-directional lanes afforded by the 8647 are partitioned at a 2:1 ratio between the GPU-to-switch links and the switch-to-slot interface. This means each Cayman GPU has the bandwidth of sixteen PCI-E Gen 2 lanes between it and the PLX switch (32 lanes in total), while the switch itself has an open x16 link to the motherboard’s PCI-E slot. Meanwhile, the main display output is handled by the primary GPU.
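The lane split is easy to verify with a little arithmetic. The sketch below is illustrative only: the lane counts come from the text, while the per-lane rate relies on the standard PCI-E 2.0 figures of 5 GT/s with 8b/10b encoding, which are not stated on this page. The result is a theoretical peak per direction that ignores protocol overhead.

```python
# Lane allocation on the PLX switch, as described in the text.
TOTAL_LANES = 48
GPU_COUNT = 2
LANES_PER_GPU_LINK = 16
HOST_LINK_LANES = 16

# Assumption: standard PCI-E 2.0 signalling (5 GT/s per lane, 8b/10b encoding).
GEN2_TRANSFER_RATE_GT_S = 5.0
ENCODING_EFFICIENCY = 8 / 10


def link_bandwidth_gb_s(lanes: int) -> float:
    """Theoretical peak bandwidth of a Gen 2 link, in GB/s per direction."""
    usable_gbit_s = lanes * GEN2_TRANSFER_RATE_GT_S * ENCODING_EFFICIENCY
    return usable_gbit_s / 8  # bits to bytes


gpu_side_lanes = GPU_COUNT * LANES_PER_GPU_LINK
assert gpu_side_lanes + HOST_LINK_LANES == TOTAL_LANES

print(f"GPU-side lanes: {gpu_side_lanes}, host-side lanes: {HOST_LINK_LANES}")
print(f"Ratio: {gpu_side_lanes // HOST_LINK_LANES}:1")
print(f"Each x16 link: {link_bandwidth_gb_s(LANES_PER_GPU_LINK):.1f} GB/s per direction")
```

Note that both GPUs share the single x16 link to the motherboard, so the host-side link is the narrower path whenever both cores talk to the system at once.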

AMD has chosen a very straightforward design which should be quite efficient considering the PLX chip consumes less than 3 watts of power. In addition, an expensive dual-PCB solution has been avoided through the use of a linear communication path with a low-latency switch instead of a built-in Crossfire bridge.