Mac OS X 10.2 Jaguar

Quartz Extreme

Quartz Extreme is Apple's name for the new hardware-accelerated version of the Quartz Compositor, reimplemented on top of OpenGL. It's not one of Apple's more inspired product branding efforts, but it is at least descriptive. I'll simply refer to it as QE.

The Quartz display layer was examined in several earlier articles. Here's a crash course in Quartz to refresh your memory.

Quartz in a Nutshell

Quartz is an umbrella term that describes the core technologies used by Mac OS X to render images. It has two parts: Quartz 2D and the Quartz Compositor.

[Quartz 2D's] APIs allow you to create text and images by specifying a sequence of commands and mathematical statements that place lines, shapes, color, shading, translucency, and other graphical attributes in two-dimensional space. You do not need to specify the attributes of individual pixels. As a result, a shape can be efficiently defined as a series of paths and attributes rather than as a bitmap.

Quartz 2D accepts input from a variety of sources and can produce output in several different formats, including PDF, PostScript, and of course bitmaps suitable for screen display.

Quartz 2D has several "sibling" APIs that also produce bitmapped data for screen display: QuickDraw, QuickTime, and OpenGL. QuickDraw actually uses some Quartz 2D APIs in its back end, but QuickTime and OpenGL do most of their own drawing.

All of the bitmapped data produced by Quartz 2D, QuickDraw, QuickTime, and OpenGL is passed to the Quartz Compositor for eventual display on the screen.

The Quartz Compositor (formerly "Core Graphics Services") is implemented as a single "window server" process that is responsible for managing all on-screen windows. Each window has an associated bitmap (created by the application that owns the window using one of the drawing APIs). The window server produces the final screen image by compositing all of the visible window bitmaps with each other according to their position and layering. From the System Overview:

[The window server] composites and recomposites each pixel of an application's window as the window is drawn, redrawn, covered, and uncovered. Each window is represented as a bitmap that includes both translucency (alpha channel) and anti-aliasing information. The bitmap serves as a buffer, allowing the window server to "remember" an application's window contents and to recomposite it without the application's involvement.

The window server also routes events (e.g. cursor movement, mouse clicks, typing) to the appropriate applications, and manages the cursor.

So, in a nutshell: applications issue drawing commands using one of the various drawing APIs; the drawing APIs produce bitmaps based on these (possibly vector-based) drawing commands; the window server retains the resulting bitmaps and composites them into a pleasing, cohesive final image on the screen.
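The "retained bitmap" part of this model is worth dwelling on, because it's what distinguishes Mac OS X from more traditional window systems. Here's a toy sketch in Python (the class and method names are invented for illustration; real Quartz buffers are bitmaps, not strings, and the real window server does vastly more than this):

```python
class ToyWindowServer:
    """Toy model of a retained-buffer window server.

    Hypothetical sketch only: the point is that once an application
    has drawn into its window buffer, the server can recomposite the
    screen from its retained copies without the app's involvement.
    """

    def __init__(self):
        self.buffers = {}  # window id -> retained window contents

    def draw(self, win_id, contents):
        # An application draws via one of the drawing APIs; the
        # server keeps ("remembers") the resulting bitmap.
        self.buffers[win_id] = contents

    def recomposite(self):
        # Rebuild the whole screen from retained buffers alone --
        # no application is asked to redraw anything.
        return sorted(self.buffers)

server = ToyWindowServer()
server.draw("terminal", "tty bitmap")
server.draw("finder", "finder bitmap")
```

Covering and uncovering windows, in this model, only ever calls `recomposite()`; the applications that own the windows never hear about it.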

Performance Problems

There are at least two performance problems with this architecture. As we've seen in earlier articles, the amount of memory used by the window server for retaining window bitmaps quickly becomes substantial. Worse, it scales linearly with the number and size of windows on the screen.

In more traditional architectures, bitmaps are not retained for every window on the screen. Instead, applications are asked to redraw any newly revealed portion of their windows. Each application draws into a single shared "frame buffer" that is the same size as the screen itself (e.g. 1024x768 pixels). This frame buffer is usually stored in dedicated video memory on the video card, rather than in main memory. To enable smoother drawing, it is possible to use two frame buffers: one on-screen and one off-screen. Applications draw into the off-screen frame buffer, and the video card swaps the two when the drawing is done. In this way, no one ever sees drawing "as it happens."

In the Mac OS X architecture, however, the screen resolution is almost irrelevant as far as memory usage is concerned. The memory usage for a pair of frame buffers is dwarfed by the potentially boundless number of windows that may be on the screen at any given time, each of which has its own buffered bitmap equal to the size of the window.
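To put rough numbers on this (illustrative figures, not measurements), compare a fixed pair of frame buffers against a handful of retained window buffers at 32 bits per pixel:

```python
def buffer_bytes(width, height, bytes_per_pixel=4):
    # One 32-bit ARGB buffer: width x height x 4 bytes.
    return width * height * bytes_per_pixel

# A double-buffered 1024x768 screen costs a fixed ~6 MB...
frame_buffers = 2 * buffer_bytes(1024, 768)    # 6,291,456 bytes

# ...while ten modest 800x600 windows, each with its own retained
# buffer, already cost roughly three times that -- and the total
# keeps growing with every window opened.
window_buffers = 10 * buffer_bytes(800, 600)   # 19,200,000 bytes
```

The frame-buffer cost is bounded by the screen resolution; the window-buffer cost is bounded only by how many windows the user opens.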

Mac OS X stores its window buffers in main memory rather than trying to fit them all in video memory. Although neither memory pool is limitless, main memory is usually larger than video memory, and it is managed by the OS's virtual memory system, which falls back to hard disk space when things get tight. Performance falls off a cliff pretty quickly when that happens.

The second performance problem has to do with all the compositing that the window server does. Rather than simply take the pixels from each window's bitmap and display them as-is on the screen, the window server must blend each pixel with the pixels from all of the other windows that have a pixel at that position, taking into account each pixel's transparency value.

When windows are entirely opaque, the calculations required are significantly abbreviated. But every window in Mac OS X has a partially transparent drop shadow and title bar (if it is not the front-most window). Menus are also partially transparent, as are the Dock, the overlays that appear when you hit the sound volume or media eject keys on the keyboard, and so on. The point is that transparency is unavoidable in Mac OS X, and the compositing calculations become more significant as more transparent objects appear on the screen.
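The per-pixel blend the window server performs is essentially the classic "over" operator. A minimal sketch, with color channels as 0.0–1.0 floats and straight (non-premultiplied) alpha assumed for simplicity:

```python
def over(src_rgb, src_alpha, dst_rgb):
    """Blend a source pixel over a destination pixel ("over" operator)."""
    if src_alpha == 1.0:
        # Fully opaque source: the blend collapses to a plain copy,
        # which is why opaque pixels are so much cheaper to composite.
        return src_rgb
    return tuple(s * src_alpha + d * (1.0 - src_alpha)
                 for s, d in zip(src_rgb, dst_rgb))

# A 50%-transparent red pixel over a blue one yields purple:
print(over((1.0, 0.0, 0.0), 0.5, (0.0, 0.0, 1.0)))  # (0.5, 0.0, 0.5)
```

Multiply that little bit of arithmetic by every pixel of every translucent window, menu, and drop shadow on every screen update, and the CPU cost becomes clear.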

Enter Quartz Extreme

So, where does Quartz Extreme fit in? As stated earlier, QE is a reimplementation of the Quartz Compositor using OpenGL--a description that should make a bit more sense to you now. Let me restate my "in a nutshell" summary of the Mac OS X on-screen graphics system, this time accounting for Quartz Extreme. Stay with me here, because this gets a little strange.

Applications issue drawing commands using one of the various drawing APIs; the drawing APIs produce bitmaps based on these (possibly vector-based) drawing commands; the window server, now an OpenGL application itself, retains the resulting bitmaps as textures on polygons in an OpenGL scene and composites them into a pleasing, cohesive final image on the screen by issuing OpenGL drawing commands.

It's slightly confusing to think about the window server as an OpenGL application, but that's what it is. It just happens to be the only OpenGL application that does not send its output to the window server for compositing...because it, er, is the window server.

Let's see what this buys us in terms of performance. Here's a comparison of the Mac OS X display architecture with and without Quartz Extreme:

Notice the sudden proliferation of the red "hardware" lines in the QE-enabled diagram. What it shows is that the calculations required to composite each application's windows onto the screen are now handled by the GPU on the video card rather than by the main CPU. Each window is treated as an OpenGL surface, and the bitmap that makes up the window's contents is the "texture" mapped onto that surface.
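In toy form, the compositor's job under QE reduces to walking the windows back to front and emitting one textured-quad draw per window, letting the GPU's blending hardware do the per-pixel "over" math. The function and call names below are invented for illustration; this is a model of the idea, not Apple's implementation:

```python
def composite_pass(windows):
    """windows: list of (name, layer) pairs; higher layer = nearer the front.

    Returns the ordered draw calls a QE-style compositor would issue:
    one textured quad per window, painted back to front so that
    hardware alpha blending layers them correctly.
    """
    calls = []
    for name, layer in sorted(windows, key=lambda w: w[1]):
        calls.append(f"draw_textured_quad({name})")
    return calls

print(composite_pass([("finder", 2), ("desktop", 0), ("terminal", 1)]))
# ['draw_textured_quad(desktop)', 'draw_textured_quad(terminal)',
#  'draw_textured_quad(finder)']
```

The back-to-front ordering matters: alpha blending is not commutative, so the desktop must be drawn before the translucent windows that float above it.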

The end result is that the CPU cycles previously spent compositing windows are now free for other purposes, and the previously (mostly) idle GPU is put to work doing what it does best.

Now let's look at what QE does not do. First, it does not affect Quartz 2D or any of the other drawing APIs at all. They all continue to function just as they always have, with the same amount of participation from the CPU.

Second, QE doesn't lessen the memory requirements of Mac OS X's display architecture. The window server must still retain bitmapped buffers for each window on the screen, and those buffers are still stored in main memory for the reasons stated earlier.

For maximum performance, the textures (window buffers) being composited should be in video memory. But they must also remain in main memory, because they can be evicted from the limited pool of video memory at any time.

Without QE, data flows from the window buffers in main memory to the CPU for compositing. With QE, data flows from main memory to the video card instead. This removes the burden on the bus between main memory and the CPU. But if all of the window buffers in main memory cannot fit into video memory, there will be heavy demands placed on the bus between main memory and the video card.
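In effect, VRAM behaves like a cache over the window buffers in main memory. The sketch below models that idea with a simple FIFO eviction policy; the real driver's residency policy is not something the article specifies, so every name here is hypothetical:

```python
class VramTextureCache:
    """Toy model of VRAM as a cache over main-memory window buffers.

    Hypothetical sketch: the point is only that an eviction forces a
    later re-upload over the AGP bus from the copy that still lives
    in main memory -- which is why the buffers must stay there.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = {}  # window id -> texture size in VRAM
        self.uploads = 0    # count of transfers over the AGP bus

    def bind(self, win_id, size):
        if win_id in self.resident:
            return  # already in VRAM: no bus traffic needed
        # Evict the oldest resident textures until the new one fits.
        while sum(self.resident.values()) + size > self.capacity and self.resident:
            self.resident.pop(next(iter(self.resident)))
        self.resident[win_id] = size
        self.uploads += 1  # re-upload from the main-memory copy

cache = VramTextureCache(capacity=10)
cache.bind("finder", 6)
cache.bind("terminal", 6)  # doesn't fit alongside "finder": evicts it
```

When the working set of window textures exceeds VRAM, every compositing pass can trigger fresh uploads, and the AGP bus becomes the bottleneck.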

So it's no surprise that two of the requirements for Quartz Extreme support are at least 16MB of VRAM (32MB recommended) and an AGP2x bus (4x or better recommended). Furthermore, since windows come in many different shapes and sizes, video cards that do not support arbitrary texture sizes (e.g. ATI Rage 128) cannot use Quartz Extreme. The video card must also have support for all the pixel formats used by Quartz, and support multitexturing.
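Those requirements boil down to a simple predicate. The parameter names below are invented for illustration (and `agp_speed` is the AGP multiplier, with 0 standing in for a PCI card):

```python
def quartz_extreme_capable(vram_mb, agp_speed, arbitrary_texture_sizes,
                           multitexturing, quartz_pixel_formats):
    """True if a video card meets the QE requirements listed above."""
    return (vram_mb >= 16                 # 16MB VRAM minimum (32MB recommended)
            and agp_speed >= 2            # AGP 2x minimum (4x or better recommended)
            and arbitrary_texture_sizes   # windows come in arbitrary sizes
            and multitexturing
            and quartz_pixel_formats)     # all pixel formats used by Quartz

# A Rage 128 on a PCI bus fails on several counts...
print(quartz_extreme_capable(16, 0, False, True, True))  # False
# ...while a 64MB GeForce4 MX in an AGP 4x slot qualifies.
print(quartz_extreme_capable(64, 4, True, True, True))   # True
```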

The G3/400 with its Rage 128 card on a PCI bus fails to meet these requirements and therefore cannot use Quartz Extreme. The 64MB GeForce 4 MX card in an AGP 4x slot on the G4/800 is ready to go, however. We'll see what kind of difference it makes in the next section.

John Siracusa has a B.S. in Computer Engineering from Boston University. He has been a Mac user since 1984, a Unix geek since 1993, and is a professional web developer and freelance technology writer.