Technology Lab —

The Linux graphics stack from X to Wayland

Ars looks at the evolution of the Linux graphics stack, from the origins of …

In the early 1980s, MIT computer scientist Bob Scheifler set about laying down the principles for a new windowing system. He had decided to call it X, because it was an improvement on the W graphical system, which naturally resided on the V operating system. Little did Bob know at the time, but the X Window System that he and fellow researchers would eventually create would go on to cause a revolution. It became the standard graphical interface of virtually all UNIX-based operating systems because it provided features and concepts far superior to its competition. It took only a few short years for the UNIX community to embrace the X windowing system en masse.

In this article, we'll take a look at the development of the Linux graphics stack, from the initial X client/server system to the modern Wayland effort.

What made X so special, of course, is legendary. X was the first graphical interface to embrace a networked, distributed solution. An X Server running on one of the time sharing machines was capable of generating the display for windows that belong to any number of local clients. X defined a network display protocol so that windows from one machine could be displayed on another, remote machine. In fact, X was always intended to be used in this network fashion, and the protocol was completely hardware-independent. X clients running on one type of UNIX could send their displays over the wire to a completely different UNIX hardware platform.

X also abstracted the look-and-feel away from the server itself. So the X protocol defined pointing devices and window primitives, but left the appearance of the interface up to the widget toolkits, window managers, and desktop environments.

As X development proceeded, led by Bob Scheifler and under the stewardship of MIT, more vendors became interested. Industry leaders at the time, like DEC, obtained a free license to the source code to make further improvements. Then a curious thing happened. A group of vendors asked MIT if some sort of arrangement could be made to preserve the integrity of the source. They wanted to keep X universally useful to all interested parties. MIT agreed, and soon the MIT X Consortium was formed and the full source tree was released, including the DEC improvements. This release of the X source really was an extraordinary event. The vendor community realized that X had become a valuable commodity, and it was in the best interests of all to protect it from any one company gaining control. Perhaps the opening of the X source code is the single most important event to come out of the X story; stewardship of the X source has since passed through a series of successor organizations to today's X.Org Foundation.

One of the senior developers recruited by the Consortium was Keith Packard, who was commissioned to re-implement the core X server in 1988. As we'll see, Packard figured prominently in the development of the Linux graphics stack.

Although X has ruled the UNIX and Linux graphics stacks, the feature-laden and ubiquitous software eventually became a victim of its own success. As Linux took flight throughout the '90s, X began to find use in configurations where a standalone X server and client both resided on one desktop computer; it came bundled this way with pretty much all of the Linux distributions. The network transparency of X is of no use on a single desktop installation, and this once vaunted feature was adding overhead to the video drawing.

As PC sales ballooned during this period, the sophistication of dedicated graphics hardware began to creep ahead of the capabilities of X. The development of new and improved hardware in graphics cards was and continues to be very aggressive.

The arrival of Translation Table Maps

Around 2004, some Linux developers had become increasingly frustrated with the slow pace of X development. They had at their disposal OpenGL, a rendering API for 2D and 3D graphics released in 1992 and derived from work at the now-defunct Silicon Graphics. But after years of attempting to get X to talk 3D to the graphics device, not a single OpenGL call could be made through the X layer.

Then, in 2007, a bright light. Thomas Hellstrom, Eric Anholt, and Dave Airlie had developed a memory management module they called translation table maps (TTM). TTM was designed to move the memory buffers destined for graphics devices back and forth between graphics device memory and system memory. It was greeted with wild applause from the Linux community: it provided hope that somebody—anybody—was working on the problem of providing an API to properly manage graphical applications' 3D needs. The strategy was to make the memory buffer a first-class object, and to allow applications to allocate and manipulate memory buffers of graphical content. TTM would manage the buffers for all applications on the host, and provide synchronization between the GPU and the CPU. This would be accomplished with the use of a "fence." The fence was simply a signal that the GPU had finished operating on a buffer, so that control of it could be handed back to the owning application.
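
The buffer-and-fence idea is easier to see in code than in prose. The sketch below is purely conceptual: the ttm_* names and types are invented here for illustration (the real TTM lives inside the kernel and is used by drivers, not called directly by applications like this). What matters is the handoff of buffer ownership that the fence signals.

    /* Conceptual sketch only: these ttm_* names are hypothetical and do not
     * correspond to the real in-kernel TTM interfaces. */
    #include <stddef.h>

    struct ttm_buffer;   /* buffer object: may live in VRAM or in system RAM   */
    struct ttm_fence;    /* raised by the GPU when it is done with a buffer    */

    struct ttm_buffer *ttm_buffer_alloc(size_t size);
    void ttm_buffer_write(struct ttm_buffer *buf, const void *data, size_t len);
    struct ttm_fence *ttm_buffer_submit(struct ttm_buffer *buf); /* hand buffer to the GPU */
    void ttm_fence_wait(struct ttm_fence *fence);                /* block until GPU signals */

    static void draw_one_frame(const void *pixels, size_t len)
    {
        struct ttm_buffer *buf = ttm_buffer_alloc(len);

        ttm_buffer_write(buf, pixels, len);              /* CPU owns the buffer here        */
        struct ttm_fence *done = ttm_buffer_submit(buf); /* ownership passes to the GPU     */

        ttm_fence_wait(done);                            /* fence fires: the GPU is done,   */
                                                         /* so the CPU may safely reuse buf */
    }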

To be fair, TTM was an ambitious attempt to standardize how applications access the GPU; it was meant to be an overall memory manager for all video drivers in the Linux space. In short, TTM tried to provide every operation that any graphics program might possibly need. The unfortunate side effect was a very large amount of code—the TTM API was huge, whereas each individual open source driver needs only a small subset of its calls. A large API means confusion for developers who have to make choices. The loudest complaint was that TTM had performance issues, perhaps related to the fencing mechanism and to inefficient copying of buffer objects. TTM could be many things to many applications, but it couldn't afford to be slow.

Reenter Keith Packard. In 2008, he announced that work was proceeding on an alternative to TTM. By now Keith was working for Intel, and together with Eric Anholt he took the lessons learned from TTM and rewrote it. The new API was to be called GEM (Graphics Execution Manager). Most developers reading this piece can probably guess what happened next, because experienced developers know that the only thing better than getting a chance to solve a big problem by writing a significant chunk of code is doing it twice.

GEM had many improvements over TTM, one of the more significant being a much tighter API; the troublesome fence concept was removed. Keith and Eric put the onus on applications to lock memory buffers outside of the API. That freed GEM to concentrate on managing the memory under control of the GPU and on controlling the video device's execution context. The goal was to shift the focus to managing ioctl() calls within the kernel instead of managing memory by moving buffers around. The net effect was that GEM became more of a streaming API than a memory manager.
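
To give a flavor of what "managing ioctl() calls" looks like from user space, here is a minimal sketch that asks the kernel's GEM code, via the i915 driver, for a buffer object. It assumes an Intel GPU exposed as /dev/dri/card0 and DRM development headers installed (the exact include path varies between the libdrm and kernel header packages); error handling is pared down to the bare minimum.

    #include <stdio.h>
    #include <string.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <drm/i915_drm.h>   /* struct drm_i915_gem_create, DRM_IOCTL_I915_GEM_CREATE */

    int main(void)
    {
        /* Open the DRM device node; this assumes card0 is the Intel GPU. */
        int fd = open("/dev/dri/card0", O_RDWR);
        if (fd < 0) {
            perror("open /dev/dri/card0");
            return 1;
        }

        /* Ask GEM for a 1 MiB buffer object living under GPU control. */
        struct drm_i915_gem_create create;
        memset(&create, 0, sizeof(create));
        create.size = 1024 * 1024;

        if (ioctl(fd, DRM_IOCTL_I915_GEM_CREATE, &create) < 0) {
            perror("DRM_IOCTL_I915_GEM_CREATE");
            return 1;
        }

        /* 'handle' names the buffer to the kernel; mapping it and submitting
           commands that use it are further ioctl() calls on the same fd. */
        printf("created GEM buffer, handle %u\n", create.handle);
        return 0;
    }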

GEM allowed applications to share memory buffers so that the entire contents of the GPU memory space did not have to be reloaded. This is from the original release notes:

"Gem provides simple mechanisms to manage graphics data and control execution flow within the linux [sic] operating system. Using many existing kernel subsystems, it does this with a modest amount of code."

The introduction of GEM in May of 2008 was a promising step forward for the Linux graphics stack. GEM did not try to be all things to all applications. For example, it left the generation of GPU commands to the device-specific driver. Because Keith and Eric were working at Intel, it was only natural for them to write GEM support specific to the open-source intel driver. The hope was that GEM could be improved to the point where it could support other drivers as well, thus effectively covering the three biggest manufacturers of GPUs.

However, non-intel device driver adoption of GEM was slow. There is some evidence to suggest that the AMD driver adopted a "GEMified TTM manager", signifying a reluctance to move the code directly into the GEM space. GEM was in danger of becoming a one-horse race.

Both TTM and GEM try to solve the 3D acceleration problem in the Linux graphics stack by integrating directly with X to reach the device and perform GPU operations. Both attempt to bring order to the crowd of libraries like OpenGL (which depends on X), Qt (depends on X), and GTK+ (also X). The problem is that X stands between all of these libraries and the kernel, and the kernel is the way to the device driver, and ultimately to the GPU.

X is the oldest lady at the dance, and she insists on dancing with everyone. X has millions of lines of source, but most of it was written long ago, when there were no GPUs, and no specialized transistors to do programmable shading or rotation and translation of vertexes. The hardware had no notion of oversampling and interpolation to reduce aliasing, nor was it capable of producing extremely precise color spaces. The time has come for the old lady to take a chair.

Who will update the open source drivers for AMD and NVIDIA hardware? Developing open source drivers in Linux, especially for graphics adapters, has always been the developers' scourge. Usually working with incomplete hardware specifications, or none at all, developers find that the exercise invariably boils down to reverse engineering the device.

Spelling error: NVIDIA. I'm very excited about this. Overall, the current system needs work.

Is there a good primer on the overall architecture of Linux graphics? You never explained in detail the interactions between the kernel, graphics drivers, X, OpenGL, and graphics toolkits like GTK. This article appears to assume that the reader already knows how most of these components interact.

You also forgot to mention the fact that Wayland removes the most definitive and beloved feature of X (i.e. network transparency). Sure, you may be able to run X in a Wayland window, but then you can only run your X apps over the network.

@dburr: Beloved by whom? Yes, back in the 80s this was the defining feature of X, and was extremely important, but nowadays it's more a hindrance than a help. If you really want network transparency, use VNC.

True, but I can't think of the last time I wanted a universal per-app solution. For programs where it makes sense to have the GUI frontend on a different machine than the processing backend, frequently the two halves are already separate programs with their own protocol, and for running network-aware programs (e.g., web browsers) as though you were on a different machine, you can set up a SOCKS proxy. What place is there for a system wide per-app solution?

There is, of course, an alternate Unix graphics stack that appeared way back in 1988 right around the same time that X11 first appeared.

Unlike X11, this graphics stack was tied closely to an object-oriented SDK and dynamic programming language intended for writing modern large-scale GUI desktop applications. It featured one of the first GUI implementations of GNU emacs, as well as the world's first web browser written by Tim Berners-Lee.

This graphics stack was designed for use on high-powered workstations rather than thin clients and therefore lacked the network transparency of X11. Nonetheless, it evolved into the fully composited, buttery smooth display system that powers some of today's finest desktops, laptops, smartphones, MP3 players, and tablets.

Graphics hardware vendors are free to choose how much they want to contribute. It always amazes me how reluctant they are to do just that. Wouldn't any hardware vendor, especially if their product was perhaps the most capable hardware in the world, want its users to have the best experience they possibly could? By holding back information aren't they really hurting their own product line?

We're talking about Linux here, the OS nobody gives a damn about, remember? Its users are already having the "best experience they possibly could"; just on OSes that people actually care about. That's what happens when you're talking about the also-ran OS: it doesn't get support.

The insistence on open source also hurts. Sure, it's nice and all to have open source, but this basically means that the IHVs have to fork over their detailed hardware specifications. They might do it if they had no other choice, or if the market forced them to. But Linux users are not sufficiently numerous to impose those kinds of demands on them.

This makes it sound like there was no real OpenGL support on X until 2007, which is just ridiculous. GLX has existed just as long as OpenGL itself and allowed for direct loopback connections, and on the Linux side, agpgart did that exact same thing of treating the command buffer as a 'first-class object' in X11's world, with complete userspace control over the command buffer.

AMD fares much better than NVIDIA in this department. Over the last few years, a driver team was assembled to write open source drivers for their hardware. They also release specifications periodically so that open source development can continue in the wild. The driver name is fglx (FireGL and Radeon for X), and the Linux community can get periodic (monthly) updates from AMD.

That is very confused. AMD write a closed-source driver for Linux called fglrx. This is virtually the exact equivalent of the closed source driver for Linux for nvidia GPUs, which is written by nvidia and called nvidia.

Nouveau is an open source driver for nvidia cards written by reverse engineering. This project gets no help, and no information, from nvidia.

Intel write an open source graphics driver for Linux. Although open source, this is still Intel code, and Intel hold the copyrights.

There is an open source driver for AMD GPUs, but it is not like either of the other open source drivers. AMD released the programming specifications for their GPUs: http://www.x.org/docs/AMD/

Open source developers took this programming information and used it to write an open source driver, called radeon. This project is hosted by the X.Org Foundation; it is called xf86-video-ati.

AMD fares much better than NVIDIA in this department. Over the last few years, a driver team was assembled to write open source drivers for their hardware. They also release specifications periodically so that open source development can continue in the wild. The driver name is fglx (FireGL and Radeon for X), and the Linux community can get periodic (monthly) updates from AMD.

While true in parts, this paragraph is misleading.

fglrx is the name for AMD's closed source kernel and X driver for Linux that comes as part of its Catalyst suite. The open source version is called, simply, radeon, and the X component of the same name is developed under the package xf86-video-ati.

As long as Wayland has full support for Intel, I'm perfectly happy. Almost all the low- to mid-end desktop and laptop solutions nowadays use Intel. I haven't had a dedicated GPU for over five years, and I don't expect I ever will again.

Question: does this mean that Ubuntu will no longer feature X whatsoever?

No. X Windows is going to be around for a very very very long time.

Wayland will feature an X Window server that will allow you to run X applications integrated into your desktop.

Quote:

And are there going to be significant performance gains as a result?

No. Not by itself. Wayland is designed to be extremely 'thin' and lightweight. It may provide very modest benefits over X Windows currently, but it's not something to look forward to.

The benefits from Wayland come from making full utilization of the Gallium3D driver stack. It is designed to allow application developers easy access to graphics APIs in a manner that the application developers see fit.

Getting good performance and proper acceleration out of X Windows requires herculean efforts. Wayland will vastly simplify an application developer's job. Wayland should provide benefits from this and from Gallium.

So yes, Wayland can provide performance benefits, but it's not automatic. It just makes it much easier to get good performance.

Intel uses GEM and provides a simplified version of the 'old way'. ATI open source driver users can use something like Intel... or they can use X Windows on top of Gallium. (This is what I am using.)

Nvidia open source driver users can use X Windows on top of Gallium for 3D stuff.

With Ubuntu 11.10 they should provide Wayland support. But it's not going to be something useful. It's going to take a long time to transition and it may end up that people are satisfied with X Windows on top of Gallium instead of Wayland on top of Gallium.

If the designers of X-Windows built cars, there would be no fewer than five steering wheels hidden about the cockpit, none of which followed the same principles -- but you'd be able to shift gears with your car stereo. Useful feature, that.

What a wildly inaccurate and speculative story; it seemed more like a novel than a tech story.

And no, there is no local 'X' overhead; just because X can be used over a network does not imply a local networking overhead. Those who cling to X because of remoting, you are wrong. It's cheaper to transmit image deltas today, because toolkits have changed, as done in RDP/VNC, and there is no reason it can't be per app; that's an implementation detail. Further, nothing stops you from using a remote X when running Wayland; you just need an X server that draws in a Wayland window.

Read up on Xrender and cairo to grasp what has changed, and why it leaves much of X unused today.

Good article, but missing a lot of history. It almost completely glosses over the period from 2000ish to 2008 when a lot happened in the Linux/Unix graphics world:

- The rise of the now mostly defunct Berlin and DirectFB projects, the first attempts to compete with and replace X. Berlin was extremely ambitious but was designed around the clumsy CORBA protocol.
- The fall of XFree86 and the rise of freedesktop.org
- Keith Packard's experimental compositing X server, this was the first taste of a 'modern' desktop for Linux users. OS X had been released for several years at this point (ca. 2003?), and Linux was badly lagging.
- The re-organization of X under X.org, and the rewriting of several key components. This brought life back to X development and allowed it to somewhat modernize. That's why the Ubuntu desktop looks reasonably good and not like something from 1985.
- AIGLX and XGL, two competing compositing solutions (one championed by Red Hat, the other by SuSE, IIRC) written as sort-of "stopgap" measures until the X situation was straightened out. Everybody is still waiting.

I'm sure there's more detail I forgot. The tale of graphics on Linux has been long and torturous, that's for sure. If you've ever been forced to play with X11 modelines settings, you already know this.

This article gets a lot of the gist more or less right, but has a few big factual errors.

Some highlights that stuck out to me:

1) While the article does state clearly what TTM is, at other times it seems to be saying that TTM does something magical to make 3D work. This isn't true. None of this is mandatory to get 3D working on any card, it's simply necessary to have some code in place to efficiently use the GPU's memory.

2) The article explicitly states that TTM was necessary for OpenGL to ever work on X, which is completely untrue. GLX (the protocol for using OpenGL on an X11 desktop) has been around and implemented with hardware acceleration since well before TTM was ever conceived. See Utah-GLX for one of the older FOSS implementations there-of.

3) The article implies that "there's evidence" that the open ATI drivers use GEM+TTM, while we have a lot more than evidence: we have the freaking source code. It's true. And it has nothing to do with "inertia" like the author suggests, but instead has to do with the fact that GEM was never designed to be a universal GPU memory manager. It was designed for Intel's IGP devices. It simply doesn't handle some of the more complex issues that a high-end discrete GPU needs. That's why ATI is using much of the TTM internals, but with a "GEM-ified" API for consistency. The article confuses this a bit by trying to note that GEM is not "everything to all applications" but GEM/TTM have absolutely nothing to do with applications; they're kernel interfaces for the low-level graphics stack.

4) It is stated (after already stating that both are just protocols) that Wayland has its own compositing manager and X11 uses something outside the protocol for compositing, both of which are untrue. X11 splits the job of compositing out from the job of window/event management, but these are certainly still done using the X11 protocol. Wayland's protocol simply assumes that there is a client and a server, and the server takes care of window management and compositing and the like. However, that's just the client-server protocol behavior. Nothing stops you from writing a Wayland server that spawns an external process to do the compositing, or window management, or anything else; if you do so, you'd simply have to rely on a separate protocol for communication.

5) The article claims that Wayland does away with networking. This is true of the base protocol. It's true pretty much entirely, in a very technical sense. In a more practical sense, though, once again remember that nothing stops you from writing a Wayland server that communicates over a network with a remote display using whatever protocol you want. Just because your current VNC client is full-desktop does not mean that a VNC-like protocol could not be per-application (and I'm fairly sure such protocols already exist). Such a network-capable server does not exist yet, but I do recall someone starting on it. If nothing else, you can keep running your X apps just like you always have, and simply use Wayland as the display server for low-level hardware management (which has a ton of advantages, like being able to multiplex multiple X servers for fast-user-switching support). Toolkits like GTK 3 can actually have both the Wayland and X11 backends compiled in at the same time, so there's no need to fear that your binaries will someday lose X11 support and be Wayland-only until well after Wayland has a networking story (and, given the huge latency problems of the X11 protocol and modern wireless networks' issues with latency, it's likely that the Wayland story will be _better_ than X11 ever was when it comes to networking -- though that is conjecture at this point).

- Keith Packard's experimental compositing X server, this was the first taste of a 'modern' desktop for Linux users. OS X had been released for several years at this point (ca. 2003?), and Linux was badly lagging.

And still is lagging. X has had an extremely chilling effect on the modernisation of Linux, resulting in an architectural catastrophe that a client/server architecture never intended for hardware accelerated low-latency display is being shoe-horned into the role it was never designed for.

The Linux display stack is very roughly at the WindowsXP level: It has rudimentary GPU offloading through OpenGL (another shoe-horned API), lacks proper compositing and VRAM virtualisation and is largely unable to expose the full features of even a nine year old SM2.0 GPU.

If we can rely on past experience as a measure of future development, Linux will catch up with Windows Vista (saying nothing for Windows 7) sometime in 2019.

I like the way the X article portrays Mr Wayland as a genius for having a flash of inspiration in 2008, a mere decade after the developers of Quartz had the same flash of inspiration, and half a decade after Microsoft's Aero/DWM engineers.

The exceptionally poor quality of the writing in this article is beyond disappointing. I just can't believe how painfully bad some of it is.

and how is flinging crap at people like a chimp in a cage going to help anyone? do you have anything constructive or do you just get off on shitting on people?

Nah, he's right. The article feels like it tries to cover 10+ pages worth of material in only two, and uses compression-by-random-discard to make that happen. It strikes exactly the wrong balance between assuming the reader is new to the subject matter and assuming she has all of the prerequisites to understand it. At various points it feels like the author went "oh hey, DRI2 is important and I haven't mentioned it yet", so writes a couple of sentences mentioning DRI2, while doing nothing at all to explain what it is to the naive reader, nor conveying anything of use or interest for one who already knows. (Substitute in whichever technology or acronym for "DRI2".) In addition to that in many places the connective tissue between adjacent sentences and paragraphs feels exceedingly thin -- like the only reason one thought follows the other is because it's the next one which happened to enter the author's mind. Overall, the article ends up reading like an enthusiastic forum post from someone who's been reading Phoronix for a couple of years, not like the sort of well-researched, well-written essay one expects from Ars.*

When the third paragraph manages to get things completely wrong, is there any point reading on?

Quote:

What made X so special, of course, is legendary. X was the first graphical interface to embrace a networked, distributed solution. An X Server running on one of the time sharing machines was capable of generating the display for windows that belong to any number of local clients. X defined a network display protocol so that windows from one machine could be displayed on another, remote machine. In fact, X was always intended to be used in this network fashion, and the protocol was completely hardware-independent. X clients running on one type of UNIX could send their displays over the wire to a completely different UNIX hardware platform.

This is completely arse about tit.

An X Server running on the local machine is able to generate the display for windows that belong to applications on one of the time sharing machines (or any other machine.)

The X Server runs on the local machine. The clients (i.e. applications that want to draw to the screen) are the things that run remotely.

The exceptionally poor quality of the writing in this article is beyond disappointing. I just can't believe how painfully bad some of it is.

and how is flinging crap at people like a chimp in a cage going to help anyone? do you have anything constructive or do you just get off on shitting on people?

Nah, he's right. The article feels like it tries to cover 10+ pages worth of material in only two, and uses compression-by-random-discard to make that happen. It strikes exactly the wrong balance between assuming the reader is new to the subject matter and assuming she has all of the prerequisites to understand it. At various points it feels like the author went "oh hey, DRI2 is important and I haven't mentioned it yet", so writes a couple of sentences mentioning DRI2, while doing nothing at all to explain what it is to the naive reader, nor conveying anything of use or interest for one who already knows. (Substitute in whichever technology or acronym for "DRI2".) In addition to that in many places the connective tissue between adjacent sentences and paragraphs feels exceedingly thin -- like the only reason one thought follows the other is because it's the next one which happened to enter the author's mind. Overall, the article ends up reading like an enthusiastic forum post from someone who's been reading Phoronix for a couple of years, not like the sort of well-researched, well-written essay one expects from Ars.*

* (Though obviously we're just spoiled.)

This. I'm that occasional Phoronix reader you mention, and I came away from this article with no change in knowledge whatsoever. In fact, after the many comments saying the article got some basics outright *wrong* I'm trying to forget I read it.

What I still do not quite understand is why "closed source" drivers could not be developed for the GPU layer, with an appropriate set of hooks 'inserted' into the kernel and window manager for these drivers to communicate with.

All that is holding Linux back is the lack of a solid window manager and UI environment that can hold its head high amongst the Windows and OS X desktops of the world...