An evening with glamor

The idea of glamor is to leverage an existing OpenGL driver to implement the 2D acceleration required by an X server, so that in future one need only write an OpenGL driver. But how well does the existing mesa/i965 driver handle the task of accelerating glamor for cairo – a task well suited to the 3D pipeline and pixel shaders of the GPU? With the first patches to implement accelerated trapezoids for glamor in hand, I decided to find out.

I ran the usual cairo trace benchmarks, including a couple of new ones that highlight areas of poor performance recently brought to our attention, measuring the throughput of various applications which use cairo against the glamor (with and without the trapezoid shader enabled), SNA and UXA ddx backends on a small selection of Core processors, from a lowly i3 Arrandale to a thoroughbred i7 IvyBridge. The results were then normalised to the performance of UXA on each system, so that we can directly compare each proposed backend to the current driver. A result above the centre-line means that the driver was faster than UXA; below, slower.
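The normalisation itself is simple: divide each backend's throughput by the UXA throughput measured on the same machine. A minimal sketch, using made-up numbers rather than the actual benchmark results:

```python
# Normalise each backend's throughput to the UXA baseline on the same machine.
# The sample numbers below are illustrative placeholders, not real measurements.
def normalise_to_uxa(results):
    """results: {backend: throughput}; returns {backend: speedup relative to UXA}."""
    baseline = results["UXA"]
    return {backend: tput / baseline for backend, tput in results.items()}

sample = {"UXA": 100.0, "SNA": 250.0, "glamor": 60.0}
relative = normalise_to_uxa(sample)
# A value above 1.0 means faster than UXA; below 1.0, slower.
```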

Between the impedance mismatch between the Render protocol and OpenGL, the fact that mesa has not been optimised for this workload, and that glamor itself is still very immature, the results belie the simplicity and appeal of glamor.


2 Comments

So Glamor with trapezoids is mostly a performance regression compared with the “green” glamor? Interesting.

I like SNA and I want to see it used, so here I’m going to give you my humble opinion:

I think one of the things that make people “fear” trying SNA is the amount of commits you still do to xf86-video-intel. Why so many commits even 1 year after you started the project? Are these all bugfixes? Or performance improvements? You really should write a blog post explaining what you’ve been doing in xf86-video-intel and how you see the state of SNA (and maybe also UXA and Glamor).

You’re the father of SNA, you’re the “boss” of Intel 2D. When you think SNA is ready for the real world, you really should focus on pushing it. If you never say “I believe SNA is ready and you can cut my balls off if it fails” no one will ever try to use it. Give convincing arguments in favour of SNA and show the flaws behind UXA and Glamor. Otherwise, we risk seeing the Glamor guys doing a better job at “sales & marketing” than you and pushing their baby first… We don’t want to see such amazing work like SNA forgotten, right?

SNA is like that hot chick that keeps hitting on you but is never available to do something when you invite her.

Changing the subject:

I heard SNA splits the driver into some Gen-specific backends. Is it possible to code it in a way that someone could replace the backend for a specific Gen (e.g. Gen2) and make it use Glamor, while the rest of the driver can still use the super-fast backend? It would be a nice solution where both “competitors” can have a chance of living. It would also allow you to just plug the Glamor backend when you don’t have time to optimize boring architectures (like Gen2) or to do some quick new-hardware-enablement. IMHO, for old hardware we shouldn’t care about performance, but care about stability, and maybe a rule like “hardware older than 7 years old uses the Glamor backend” would allow you to keep the code base in a not-so-huge state while still accelerating the parts that need it most.

Trapezoidal acceleration as implemented by the proposed shader for glamor is a mixed bag. For strokes, it is a severe regression – the overhead of emitting each trapezoid is far larger than the cost of computing the mask on the CPU. Conversely, for large fills, such as the rounded rectangles around text boxes or misaligned rectangles, the cost of emitting the commands for each trap is much smaller than that of emitting the mask. As it stands, using the current implementation everywhere is a net loss in my estimation.
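One way to picture the trade-off is as a simple cost model: each trapezoid submitted to the GPU carries a fixed command-emission overhead, while rasterising the mask on the CPU costs roughly in proportion to the covered area. A hypothetical heuristic (all constants invented for illustration, not measured from the driver) might choose a path like this:

```python
# Hypothetical cost model for choosing between GPU trapezoids and a CPU mask.
# The constants are invented for illustration; real values would be measured.
GPU_COST_PER_TRAP = 50.0   # fixed command-submission overhead per trapezoid
CPU_COST_PER_PIXEL = 0.1   # cost of rasterising one pixel of the mask on the CPU

def prefer_gpu_traps(num_traps, mask_area_pixels):
    """Return True if emitting per-trap commands looks cheaper than a CPU mask."""
    gpu_cost = num_traps * GPU_COST_PER_TRAP
    cpu_cost = mask_area_pixels * CPU_COST_PER_PIXEL
    return gpu_cost < cpu_cost

# A stroke decomposes into many tiny trapezoids, so the per-trap overhead
# dominates; a large fill is a few big trapezoids, so the GPU path wins.
```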

SNA is not yet the default option because we try very hard not to introduce regressions. So I have taken a cautious approach, soliciting as much feedback as possible and fixing all the loose ends found. Although it seems like there is still a lot of churn with many commits (for instance, just this last week Zdenek Kabelac has been feeding me the results of running the code through a static analyser and I’ve been busily reducing the noise so that we may fix one or two genuine issues), those commits are no longer building the foundation but snagging. Once the foundations are ready, then I can start unveiling the grand design…

Making the choice of backend on a per-generation basis is trivial, but it doesn’t answer the stability issue. (I’m not even going to mention that glamor doesn’t support such old devices, nor is it a paragon of stability and resilience. 😉) The majority of the code for any backend is not generation specific, and keeping multiple backends around and supported increases the bug surface, so we are more likely to encounter bugs (and less likely to be able to fix them). Also, solving issues on the older hardware tends to prevent very similar issues occurring in future products – coding for an Atom and scaling to IvyBridge means we should be able to cope with whatever the hardware engineers throw at us next.
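The "trivial" per-generation choice mentioned above amounts to a lookup at initialisation time. A sketch of how such a policy table might look – the backend names and generation numbering here are illustrative, not the driver’s actual API:

```python
# Hypothetical per-generation backend selection for the ddx, sketched to
# illustrate the policy discussed above; names and numbers are invented.
OVERRIDES = {2: "glamor", 3: "glamor"}  # e.g. route old generations to glamor
DEFAULT_BACKEND = "sna"

def select_backend(gen):
    """Pick an acceleration backend for a given hardware generation."""
    return OVERRIDES.get(gen, DEFAULT_BACKEND)
```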