
ATI R500: Mesa vs. Catalyst Benchmarking

08-29-2008, 01:01 AM

Phoronix: ATI R500: Mesa vs. Catalyst Benchmarking

With Mesa 7.1 having been released this week and open-source R600/770 3D support just around the corner, we've taken this opportunity to see how the open-source Mesa 3D stack compares to AMD's monthly-refined Catalyst Linux Suite with the fglrx driver for the Radeon X1000 (R500) series. In this article are Mesa 7.1 and Catalyst 8.8 benchmarks for the Radeon X1300PRO and X1800XL graphics cards.

Comment

There is a deeper and reasonable explanation of why most games perform poorly, and will keep performing poorly, on open drivers:

We can't really blame the driver writers; this is mainly due to the game programmers. They don't really follow the graphics API guidelines, with a few exceptions (UT2004, anyone?). The proprietary drivers which ATI and NVIDIA make are piled up with past experience TO COPE with the bad practices of game programmers. This is why writing a working graphics driver is easy, but making it work fast in all circumstances is VERY hard.

I have seen an interview of this kind, where a driver programmer at NVIDIA unveils this truth. One example, if I recall correctly: if a graphics program (be it OpenGL or D3D) creates a dynamic vertex (pixel?) buffer, writes to it, but doesn't unlock it in time, the graphics pipeline will stall, reducing the rendering speed tremendously. The documentation recommends against using the pipeline this way, but game programmers (maybe innocently) use it anyway. The result is obvious: poor performance.

However, GPU driver writers have to cope with this problem. In the NVIDIA interview, they say they analyze the behavior of the game, then aggressively unlock the resource before the game actually does.

This might be a lame example, but you get the idea. It's like a loophole: if GPU vendor A writes its driver strictly following the industry standard and performs poorly, while GPU vendor B hacks its driver to cope with the problem so the game runs a lot smoother, the direct outcome is that people will prefer GPU B, because, seen from the surface, GPU B runs the game faster!

Based on the above speculation, I am not confident about the performance of open drivers at all. A fast-running system requires each and every piece of the puzzle to fall into the correct place; in this case, the game writers!

In short, what are all those monthly official GPU drivers that ATI and NVIDIA are busying themselves with? Hacking games!

Comment

I'm very surprised about the 2D performance. I always thought that 2D is the area where OS drivers rule, and many people report more lag in 2D when fglrx is used, and almost smooth performance with the OS drivers.

How can that be explained?

Comment

I'm very surprised about the 2D performance. I always thought that 2D is the area where OS drivers rule, and many people report more lag in 2D when fglrx is used, and almost smooth performance with the OS drivers.

How can that be explained?

Well, it depends on what the Java 2D benchmark actually does. However, seeing how the results are exactly tied in all three benchmarks, I suppose the driver is hitting a software fallback in those cases. It would be nice to repeat all the tests with a different CPU; it could highlight where the driver isn't doing hardware acceleration.

Comment

I'm very surprised about the 2D performance. I always thought that 2D is the area where OS drivers rule, and many people report more lag in 2D when fglrx is used, and almost smooth performance with the OS drivers.

How can that be explained?

Well, unless I've misread the article, the Java 2D benchmarks are still using OpenGL for rendering.
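For context, Java 2D's OpenGL-based pipeline is opt-in and enabled via a system property, so a benchmark can route its 2D rendering through the GL driver like this (the class name here is a hypothetical stand-in for whatever benchmark was run; the property itself is the standard Sun JVM flag):

```shell
# Capital "True" also makes the JVM print a confirmation line when
# the OpenGL pipeline is successfully enabled for a screen.
java -Dsun.java2d.opengl=True MyBenchmark
```

If that property isn't set, Java 2D falls back to the X11/software pipeline, which would bypass the Mesa-vs-fglrx comparison entirely.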

Comment

While I haven't done any profiling, the bad performance for games is almost certainly due to exactly two facts:

1. Until we get a real memory manager, we do not have real support for VBOs. That hurts *a lot*. [1]
2. We don't use texture or framebuffer tiling and HyperZ on R300+.

Once these issues are fixed, I expect our performance to come a lot closer to fglrx in those games. Obviously we will never reach performance parity simply because a lot more effort goes into fglrx. But I don't think we'll be beaten by the orders of magnitude that you're seeing right now.

As for the 2D benchmarks: I believe this is 2D implemented on top of OpenGL, and while I haven't looked at the Java2D tests, the fact that both graphics cards performed equally badly suggests that we're simply hitting software fallbacks. This isn't unexpected - 2D rendering is likely to use OpenGL features that are off the "typical" path that games use, so we just haven't gotten around to implementing them in hardware.

Note that normal 2D operation goes through the render acceleration in the X server, which is indeed pretty good.

[1] A simple example: While working on the r300-bufmgr branch I made some subtle changes to the vertex upload code which I thought were harmless. Turns out they prevented gcc from doing some optimizations which caused Nexuiz performance to drop by 50% (of course I reverted those changes immediately...). So at least Nexuiz is clearly vertex upload bound, and that bottleneck will simply vanish in many cases as soon as we support VBOs properly.

P.S.: I don't think it's fair to blame game developers. Sure, they sometimes stray off the fast path. But right now, the ball is clearly in our (the driver developers') court.

Comment

My understanding is that 2.6.26 brought some new support for R500 chips into the kernel, but the article states that the benchmarks were done with a 2.6.24 kernel.

Were these changes backported to the 2.6.24 kernel that was used, or do they not matter? I thought the driver doesn't even work without these changes. (R300-based card user here, so these changes don't really apply to me.)

Comment

My understanding is that 2.6.26 brought some new support for R500 chips into the kernel, but the article states that the benchmarks were done with a 2.6.24 kernel.

Were these changes backported to the 2.6.24 kernel that was used, or do they not matter? I thought the driver doesn't even work without these changes. (R300-based card user here, so these changes don't really apply to me.)

The only relevant components are the drm.ko and radeon.ko modules, and the benchmark was done using the drm tree from GIT.