The Test

For the purposes of our testing we’ll be looking at the 6 games we’ve adopted for use with FCAT due to their proven reliability. These are Total War: Shogun 2, HItman: Absolution, Sleeping Dogs, Battlefield 3, Bioshock Infinite, and Crysis 3. All of our results unless otherwise noted are using Catalyst 13.8b1 for the AMD cards, and NVIDIA’s 326.19 beta drivers for the GeForce cards.

Our metric of choice for measuring frame times and frame pacing is a metric we’re calling Delta Percentages. With delta percentages we’re collecting the deltas (differences) between frame times, averaging that out, and then dividing delta average by the average frame time of the entire run. The end result of this process is that we can measure whether sequential frames are rendering in roughly the same amount of time, while controlling for performance differences by looking at the data relative to the average frame time (rather than as absolute time). This gives us the average frame-to-frame time difference as a percentage.

In general, a properly behaving single-GPU card should have a delta average of under 3%, with the specific value depending in part on how variable the workload is throughout any given game benchmark. 3% may sound small, but since we’re talking about an average it means it’s weighed against the entire run, as the higher the percentage the more unevenly frames are arriving. For a multi-GPU setup we’d ideally like to see the delta percentages be equal to our single-GPU setups, but this is for the most part unreasonable. There is no hard number for what is or isn’t right here, but based on play testing we’d say 15%-20% is a reasonable threshold for acceptable variance, with anything under 10% being very good for a multi-GPU setup.

Finally, in our testing we did encounter an issue with Catalyst 13.8 that required we make some slight adjustments to FCAT to compensate for this bug, so we need to make note of this. For reasons we can’t sufficiently explain at this time but has been confirmed by AMD, in some cases in Crossfire mode AMD’s latest drivers are periodically drawing small slices of old frame buffers at the top of the screen. The gameplay impact is minimal-to-nonexistent, but this problem throws off FCAT badly.

To quickly demonstrate the problem, below we have two consecutive frames from one of our Battlefield 3 runs. The correct FCAT color order here is dark blue, green, light blue, and olive. The frames corresponding to dark blue and green occur on frame one, and light blue and olive on frame two. Yet looking at frame two, we see a small 6 pixel high stripe of dark blue at the very top of the image. At this point the dark blue frame should have already been discarded, as the cards have moved on to the green and later light blue frames. Instead we’re getting a very small slice of a frame that is essentially 2 frames old.

The gameplay impact from this is trivial to none; the issue never exceeds a 6 pixel slice, only occurs at the top of the frame (which is generally skybox territory), and is periodic to the point where it occurs at most a few times per minute. And based on our experience this primarily occurs when a buffer swap should be occurring during or right after the start of a new refresh cycle, which is why it’s so periodic.

However the larger issue is that FCAT detects this as a frame drop, believing that over a dozen frames have been dropped. This isn’t actually possible of course – the context queue isn’t large enough to hold that many frames – and analysis shows that it’s actually part of the old frame as we’ve explained earlier. As such we’ve had to modify FCAT to ignore this issue so that it doesn’t find these slices and count them as dropped frames. The issue is real enough (this isn’t a capture error) and AMD will be fixing it, but it’s not evidence of a dropped frame as the stock implementation of FCAT would assume.

Ultimately our best guess here is that AMD is somehow mistiming their buffer swaps, as the 2 frame old aspect of this correlates nicely to the fact that the dark blue and light blue frames would both be generated by the same GPU in a two-GPU setup.

Post Your Comment

102 Comments

That makes sense, but I guess the bigger concern from the outset was how AMD's allowance of runtframes/microstutter in an "all out performance" mentality might have overstated their performance. You found in your review that AMD performance typically dropped 5-10% as a result of this fix, that should certainly be considered, especially if AMD isn't doing a good job of making sure they implement this frame time fix across all their drivers, games, APIs etc.

Also, any word whether this is a driver-level fix or an game-specific profile optimization (like CF, SLI, AA profiles)?Reply

The performance aspect is a bit weird. To be honest I'm not sure why performance was up with Cat 13.6 in the first place. For a mature platform like Tahiti it's unusual.

As for the fix, AMD has always presented it as being a driver level fix. Now there are still individual game-level optimizations - AMD is currently trying to do something about Far Cry 3's generally terrible consistency, for example (an act I'm convinced is equivalent to parting the Red Sea) - but the basic frame pacing mechanism is universal.Reply

All frames with vsync disabled are runts, since they are never completely displayed. With a sufficiently fast graphics card and/or sufficiently less complex game, every frame will be a runt even by the arbitrary definitions you find at sites like this.

And all the while, nothing at all is ever said about the most hideous artifact of all - screen tearing.Reply

There is a simple and definite fix for tearing artifacts and you mention it yourself - vsync. If screen tearing bothers you, and I think it should bother most people, you should keep vsync on at all times.Reply

Vsync or frame limiters are certainly workarounds, but it also introduces input lag and largely negates the benefit of having multiple powerful GPUs to begin with. A 120Hz monitor would increase the headroom for Vsync, but also by nature reduces the need for Vsync (there's much less tearing).Reply

Yes this is certainly true, when I was on 60Hz I would always enable Triple Buffering when available, however, TB isn't the norm and few games implemented it natively. Even fewer implemented it correctly, most use a 3 frame render ahead queue, similar to the Nvidia driver forcing it which is essentially a driver hack for DX.

Having said all that, TB does still have some input lag even at 120Hz even with Nvidia Vsync compared to 120Hz without Vsync (my preferred method of gaming now when not using 3D).Reply

The amount of tearing is independent the refresh rate of your monitor. If you have vsync off, every frame rendered creates a tear line. If you are drawing frames at 80Hz without vsync, you are going to see a tear every 1/80 of a second no matter what the refresh rate of your screen is. The only difference is that a 60Hz screen would occasionally have two tear lines on screen at once.Reply

Sorry, not even remotely close to true. Runt frames were literally tiny shreds of frames followed by full frames, unlike normal screen tearing with Vsync off that results in 1/3 or more of the frame being updated at a time, consistently.

The difference is, one method does provide the impression of fluidity and change from one frame to the next (with palpable tearing) whereas runt frames are literally worthless unless you think 3-4 rows worth of image followed by full images provides any meaningful sense of motion.

I do love the term "runt frame" though, an anachronism in the tech world born of AMD's ineptitude with regard to CrossFire. I for one will miss it. Reply

You're not making sense. All frames with vsync off are partial. The frame buffer is replaced in the middle of screen updates, so no rendered frame is ever displayed completely.

A sense of motion is achieved by displaying different frames in a time sequence. It has nothing to do with showing parts of different frames in the same screen refresh.

And vsync adds a maximum latency of the inverse of the screen refresh (16.67ms for a 60Hz display). On average, it will be half that. If you have a very laggy monitor (Overdrive-TN, PVA, or MVA panel types), that tiny bump from vsync might push the display lag to noticeability. For plain TN and IPS panels (not to mention CRT), there will be no detectable display lag with vsync on.Reply