Tile-based Rasterization in Nvidia GPUs

Nvidia has continually evolved the architecture of its GPUs in each generation to enhance performance and power efficiency. The company has discussed the changes to the programmable shader cores in the Maxwell and Pascal generations, which have generally eliminated or reduced scheduling logic and placed a greater burden on the compiler. However, Nvidia’s architects have avoided disclosing details about the fixed-function graphics hardware, in some cases denying changes.

Starting with the Maxwell architecture, Nvidia’s high-performance GPUs have borrowed techniques from low-power mobile graphics architectures. Specifically, Maxwell and Pascal use tile-based immediate-mode rasterizers that buffer pixel output, instead of conventional full-screen immediate-mode rasterizers. Using simple DirectX shaders, we demonstrate the tile-based rasterization in Nvidia’s Maxwell and Pascal GPUs and contrast this behavior with the immediate-mode rasterizer used by AMD.
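To make the contrast concrete, the following is a minimal software sketch of the two traversal orders, not a description of how either vendor's hardware is actually built. It assumes axis-aligned rectangles in place of triangles, a fixed square tile size, and hypothetical function names (covered_pixels, full_screen_order, tiled_order); none of these come from the shader code discussed in the video.

```python
def covered_pixels(rect, width, height):
    """Pixels covered by a rectangle (x0, y0, x1, y1), upper bounds exclusive,
    clipped to the screen. Rectangles stand in for triangles (an assumption)."""
    x0, y0, x1, y1 = rect
    return [(x, y)
            for y in range(max(y0, 0), min(y1, height))
            for x in range(max(x0, 0), min(x1, width))]


def full_screen_order(prims, width, height):
    """Full-screen immediate mode: each primitive is rasterized across the
    whole screen before the next primitive starts."""
    return [(i, x, y)
            for i, rect in enumerate(prims)
            for (x, y) in covered_pixels(rect, width, height)]


def tiled_order(prims, width, height, tile):
    """Tiled immediate mode (illustrative): visit one tile at a time and, within
    that tile, rasterize the buffered primitives in submission order."""
    out = []
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            for i, (x0, y0, x1, y1) in enumerate(prims):
                # Clip the primitive to the current tile; an empty clip yields no pixels.
                clipped = (max(x0, tx), max(y0, ty),
                           min(x1, tx + tile), min(y1, ty + tile))
                out.extend((i, x, y)
                           for (x, y) in covered_pixels(clipped, width, height))
    return out


if __name__ == "__main__":
    # Two overlapping full-screen rectangles on a 4x4 screen with 2x2 tiles.
    prims = [(0, 0, 4, 4), (0, 0, 4, 4)]
    print([i for i, _, _ in full_screen_order(prims, 4, 4)][:8])
    print([i for i, _, _ in tiled_order(prims, 4, 4, 2)][:8])
```

Both functions emit the same set of pixels; only the order differs. In the tiled version, the second primitive begins drawing in the first tile before the first primitive has touched the rest of the screen, which is the kind of ordering difference the test shaders are designed to expose.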

The YouTube video below is best viewed in full-screen mode and includes:

A brief refresher on the 3D pipeline

An explanation of the DirectX shader code, which is available on GitHub

Using tiled regions and buffering the rasterizer data on-die reduces the memory bandwidth needed for rendering, improving performance and power efficiency. Consistent with this hypothesis, our testing shows that Nvidia GPUs change the tile size to ensure that the pixel output from rasterization fits within a fixed-size on-chip buffer or cache.
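The relationship between per-pixel data and tile size can be sketched with simple arithmetic. The example below is only an illustration of the idea: the 4 KB buffer size, the power-of-two tile dimensions, and the function name tile_dims are all assumptions made for the sketch, not measured or disclosed properties of any Nvidia GPU.

```python
def tile_dims(buffer_bytes, bytes_per_pixel):
    """Return a (width, height) tile, with power-of-two sides, whose pixel
    output fits in a fixed-size buffer. The sizing policy is hypothetical."""
    if bytes_per_pixel <= 0 or buffer_bytes < bytes_per_pixel:
        raise ValueError("buffer must hold at least one pixel")
    w, h = 1, 1
    while True:
        # Stop once doubling the tile area would overflow the buffer.
        if 2 * w * h * bytes_per_pixel > buffer_bytes:
            return w, h
        # Grow width first, then height, keeping the tile roughly square.
        if w == h:
            w *= 2
        else:
            h *= 2


if __name__ == "__main__":
    HYPOTHETICAL_BUFFER = 4096  # bytes; an assumed figure for illustration only
    for bpp in (4, 8, 16):
        print(bpp, tile_dims(HYPOTHETICAL_BUFFER, bpp))
```

Under these assumptions, doubling the bytes written per pixel halves the tile area, which matches the qualitative behavior described above: more output per pixel means smaller tiles, so that each tile still fits on-chip.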

Tile-based rasterization is nothing new in graphics. The PowerVR architecture has used tile-based deferred rendering since the 1990s, and mobile GPUs from ARM and Qualcomm also use various forms of tiling. However, tiling has repeatedly failed on the desktop. In the 1990s, Gigapixel developed the GP-1 tiled-rendering GPU before the company was acquired by 3dfx (in turn acquired by Nvidia). The PowerVR-based Kyro desktop GPU was released in 2001, but STMicro cancelled the product line. Microsoft also investigated tiling for Talisman, an entirely new graphics pipeline (including APIs and hardware) that was ultimately shelved.

Historically, new graphics technologies have moved from performance-focused desktop GPUs into mobile GPUs. For example, programmable shaders and GPGPU were pioneered on the PC and only appeared in mobile GPUs years later. In the case of tiled rasterization, the direction is reversed, and a mobile technology is now influencing high-performance architectures. This is an exciting turning point for computer graphics, and it will be fascinating to see which other mobile technologies will be adapted for high-performance GPUs in the future.