Does FCAT reveal instances when frame metering is taking place? The focus on the end of the rendering pipeline concerns me, as NVIDIA might be using frame metering to give the appearance of very smooth frame delivery without improving the user experience. Delaying "fast" frames to shrink the delay attributable to "slow" frames does not help the subjective experience; stuttering game-time instead of frame-time is no better for the user.Reply

In essence, stuttering in the game simulation is being downplayed maybe a bit too much.

Very intuitively, suppose that in the simulation an object moves from one side of the screen to the other at a constant speed, and suppose there is a stutter in there caused by (1) the context queue being full and (2) the simulation engine being blocked by that fact. (Also assume the simulation timer runs independently of the rendering pipeline, of course; otherwise fast/slow rendering would accelerate/decelerate simulation time.)

Then the simulation might sample like this over time:

[-X-X-X-------X--]

while the graphics subsystem is able to smooth out the stutter via queues, buffers, and so on:

[---X---X---X---X]

Frame delivery in that instance might be smooth; visually, however, the object will not move at a constant speed, moving slowly in the beginning and stuttering/jumping or accelerating at a later stage.

Today's multi-threaded games use a separate thread for simulation. The simulation runs at a constant speed and sets up variables, which the graphics rendering thread picks up when generating frames. It's totally irrelevant whether you get 20 frames or 120, or horrible visual stuttering: the simulation itself will be smooth... but the way it's displayed/presented, that's another thing.Reply

It doesn't matter whether the game engine is running on another thread or not: if it expects one frame to be displayed at t+0ms, the next at t+10ms, and the third at t+30ms, then an evenly moving object will be one third of the way between its first-frame and third-frame positions in the second frame.

If you even these frames out and display them at t+0/t+15/t+30, then the object will seem to move at twice the speed between the first and second frames compared to between the second and third. That is exactly stuttering.Reply
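To make the arithmetic above concrete, here is a tiny sketch using the commenter's numbers (positions are hypothetical and assume the object's position is proportional to the simulation time each frame represents):

```python
# The engine expects frames at these times (ms); an evenly moving object's
# position is proportional to simulation time (1 unit per ms here).
expected_times = [0, 10, 30]
positions = [float(t) for t in expected_times]

# Frame metering evens the frames out on screen:
display_times = [0, 15, 30]

# Apparent on-screen speed between consecutive frames:
for i in range(1, 3):
    speed = (positions[i] - positions[i - 1]) / (display_times[i] - display_times[i - 1])
    print(f"frame {i-1} -> {i}: {speed:.2f} units/ms")
# frame 0 -> 1: 0.67 units/ms
# frame 1 -> 2: 1.33 units/ms  (twice the speed: visible stutter)
```

Even though the display cadence is perfectly regular, the object's apparent speed doubles between the second and third frames.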

I also agree with you and Spoelie. Simulation step stutter is also an issue that should be covered. It would be really nice to get simulation timestamps directly from the output of the simulator and match them up to their corresponding output frames. However, this would probably require collaboration from game developers that you probably won't get. Until then, a tool that works at the output of the renderer (like FRAPS) and can associate simulation steps with output frames would be useful.

That said, there is really little that AMD or Nvidia can do to fix issues in the application other than trick it into working correctly. These results would be more useful for game developers developing new engines. Also keep in mind, simulation steps are tied loosely (through queues) to the GPU's processing capabilities (unless the bottleneck happens to be before the command is dispatched to D3D). Simulation steps should be roughly equal to frame times on average. If GPU processing were completely consistent, then the latency between the simulation step and output would be fixed and the output would appear smooth. It is variations in frame times that cause variations in simulation steps, and on average the variations of each should be roughly equal. The worst-case stutter, which should be something like double the frame-time variation (when the simulator compensates in the opposite direction of the frame-time variation), is what we need to look out for. That said, variation in frame time is generally smaller than frame time itself. I would suggest that simulation step stuttering is a smaller problem than frame time stuttering, and becomes smaller as frame times get shorter. Point of interest: Nvidia's frame metering may actually increase simulation step stuttering.Reply

I agree with this. If the frames aren't displayed at the times the game engine expects, then your should-be-moving-smoothly objects become quirkily-moving objects rendered at a smooth framerate...Reply

I get your concern, but I'm not sure it's valid. The problem here is variation in frame delivery: marked jerkiness where the moving image looks like it has slowed down or even stopped, and then suddenly jumps forward. Evening this out by metering the frames ameliorates the issue, but at the cost of overall frame rates.

And, really, what's wrong with the technique, unless it were to bring frame rates down to an unacceptable level, and why would you consider it cheating somehow?Reply

IIRC, frame metering occurs right before output and after the feedback loop. It does nothing to smooth out simulation steps. Further, adding random delays completely unknown to the simulator actually makes simulation step stutter worse. However, I believe simulation step stutter will prove to be a smaller problem than frame time stutter (at least until frame times themselves are long enough to be an issue).Reply

As far as I can tell it does not reveal instances of frame metering. While I do believe simulation step stutter to be a less significant problem than frame time stuttering, judicious use of frame metering may elevate its position as you are effectively adding random delays to the output that the simulator knows nothing about and therefore cannot compensate. I would encourage the reviewers here not to dismiss FRAPS completely yet. FCAT seems to be the better tool for evaluating the end of the pipeline, but until there is a better tool at the beginning of the pipeline that can see fine grained simulation step times, a coarse tool like FRAPS is still useful for revealing coarse simulation step time inconsistencies.Reply

...but if they did do frame metering (i.e. delaying faster frames), the overall frame rate would drop, which other benchmarks would pick up easily. I'm sure that ATI and nVidia still care about overall FPS too.Reply

If there is frame metering, then FCAT would show results with highly regular frames with few-to-no runt frames. So in a roundabout way it proves frame metering/pacing; you wouldn't have that kind of smoothness without it.Reply

It gets interesting ^^ A year and a half later, but still. It would be interesting to have a FRAPS comparison result to see the improvements with this method.

Also, I'm wondering about the cases where frames pass almost invisibly, with only a few lines (or none?) being drawn to the screen (due to the screen/capture card refresh rate). Does the tool detect such cases? (Through knowing the order in which the colour overlay is supposed to appear, it could detect frames that weren't drawn, while being unable to provide timing for them.)

Techreport has been doing this for months. It's a great measurement and definitely the way of the future when it comes to comparing cards. Sites holding onto FPS only reviews are stuck in the past. Reply

You may have missed the point of this article as well as the prior one. Techreport utilizes Fraps, which is what's being criticized. Simply because they may be waiting for the correct tool does not define many sites as "stuck in the past."Reply

No, Tech Report has something very similar to what is described above with the colour bars; it's what detected the runt frames that afflict xfire. The NVIDIA tool looks like a more professional, complete solution, but Tech Report did it first.Reply

A color bar can hold a whole lot of information. By my reasoning it might be possible to timestamp each line in a frame, but let's not get that detailed. How about just timestamping each frame and encoding the information in the overlay? Each 24-bit pixel of the overlay is 3 bytes of information, so an overlay about 10 pixels wide would give 30 bytes of information on each line. This should be enough to timestamp the frame from where FRAPS would get its information, to be compared with the time of what comes out at the end of the pipeline. Why wouldn't this be possible?Reply
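The byte budget works out easily. As a sketch of the idea (the layout here is made up for illustration, not anything FCAT actually does), a frame number plus a microsecond timestamp need only 12 bytes, i.e. 4 of the 10 overlay pixels per line:

```python
import struct

# Hypothetical layout: pack a frame number (4 bytes) and a microsecond
# timestamp (8 bytes) into 24-bit RGB overlay pixels, 3 bytes per pixel.
# 12 bytes fit in just 4 pixels of a 10-pixel-wide overlay (30 bytes/line).

def encode_overlay_pixels(frame_no, timestamp_us):
    payload = struct.pack("<IQ", frame_no, timestamp_us)  # 12 bytes total
    # Group the bytes three at a time into (R, G, B) pixels.
    return [tuple(payload[i:i + 3]) for i in range(0, len(payload), 3)]

def decode_overlay_pixels(pixels):
    payload = bytes(b for px in pixels for b in px)
    return struct.unpack("<IQ", payload[:12])

pixels = encode_overlay_pixels(1234, 987_654_321)
print(decode_overlay_pixels(pixels))  # (1234, 987654321)
```

The capture-side analysis would read these pixels back out of the recorded video and recover the timestamp exactly, which is what makes the comparison against screen-output time possible.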

Agreed. Ideally, you'd write a frame number and timestamp on each line of the overlay (as FRAPS/FCAT does), and separately transmit the frame number and timestamp to the video capture system (a simple 32-bit GPIO interface would do). This would tell you everything about latency, stutter, and dropped frames from the Present call to the monitor.Reply

Unfortunately that won't work. The timestamp needs to be generated at the moment the simulation state is finalized, before the draw calls are sent off. Present is too late, particularly because a pipeline stall means that Present may come well after the simulation state has been finalized.Reply

The whole issue is not the simulation state but what happens between Present and your display. The accuracy of what is being simulated is not what the benchmark is about (it would be nice to have a benchmark for that); the point is to measure the frame latency introduced by the graphics pipeline.Reply

At least the game/benchmark could encode the simulation time when it calls "present" and the capture analysis tool could then discover fluctuations in the time difference between simulation time and shown-on-the-screen time. Ideally, such a difference should be constant all the time, leading to a perfectly smooth rendering of the simulation. This is what someone has already pointed out earlier in the comments.Reply
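The analysis that comment describes is simple to state: for each frame, take the difference between its encoded simulation timestamp and its measured on-screen time, and look at how much that difference fluctuates. A minimal sketch, with all numbers hypothetical:

```python
# Per-frame simulation timestamps (encoded at Present) and the times the
# frames were observed on screen by the capture card, both in ms.
sim_times = [0.0, 10.0, 20.0, 30.0, 40.0]
screen_times = [25.0, 35.0, 50.0, 55.0, 65.0]

# For perfectly smooth rendering of the simulation, this difference
# should be constant for every frame.
latencies = [s - t for s, t in zip(screen_times, sim_times)]
jitter = max(latencies) - min(latencies)

print(latencies)                        # [25.0, 25.0, 30.0, 25.0, 25.0]
print(f"sim-to-screen jitter: {jitter} ms")  # 5.0 ms
```

Here frame 2 arrives 5 ms later relative to its simulation time than its neighbors, which would show up as visible stutter even if the screen-side frame intervals looked regular.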

So that's a very good question, and one I didn't have time to adequately delve into for this article.

The short answer is that on a technical level there's no real way to rig the results. The extractor tool only looks at a recorded video and has no idea what generated it. The overlay tool, meanwhile, hooks into the pipeline at the application/Present level, similar to FRAPS, so it doesn't know anything about the hardware either.

On a practical level we've been over this with a fine-toothed comb. So have PCPerspective and others. What FCAT generates and what FCAT sees is consistent between all video cards. There's nothing strange going on.

That said, even NVIDIA is aware of the potential conflict of interest, which is why their goal is to eventually wash their hands of FCAT. The overlay can easily be implemented by someone else - ideally the FRAPS guys - and the extractor wouldn't be too hard to recreate (and the possibility is open of releasing the source for that). Someone had to do all the work to put this together though, and there really isn't anyone in a better position than a GPU vendor like AMD or NVIDIA. But now that they've laid down the template, independent parties can easily recreate all of this.Reply

IMO, if their intent was to avoid maintaining the tool long-term to avoid conflict-of-interest accusations, they should've planned on making it open source from the start. Forcing third parties to recreate something they always intended as a throwaway would be spiteful, and a pledge to open-source it would help mitigate accusations even if they held the code closed until it was completed.Reply

Unfortunately the overlay cannot be open sourced. There are patent/licensing issues there with some of the code they're using. The only way for that to be compliant is to recreate it from scratch, hence the very specific interest in having the FRAPS guys do it.Reply

Just what the heck are they doing in the overlay code that has patent/licensing issues? Recreating it from scratch won't avoid any patent issues, and it's simple enough to implement that you don't need to cross license copyrighted code from a 3rd party.Reply

Ryan, Gigaplex is right and you're misunderstanding something here. Recreating code from scratch (*without* looking at the original, in other words clean-room reverse engineering) will get around any copyright issues. However, the whole reason software patents are so pernicious and wrong is that they protect the mere idea, not any implementation. Someone who came up with this completely independently and did it all themselves would still be violating any software patents on the idea of it.

If it actually is patented then the only way around it is for someone in a non-software patent country to do it or for anonymous people to just blow the law off and do it anyway. Also, software patents don't preclude open source either, they're semi-orthogonal depending on the license, but OSS is primarily about copyright (though the GPL 3 tries to expand it a bit). Nvidia may have separate licensing issues though that'd be unique to them.Reply

This can be time-consuming and costly, but to verify the authenticity of FCAT's numbers, reviewers can always cross-reference with a second set of tools, e.g. a high-speed camera in random sections of a benchmark run, to see if the two results are consistent. If the stutter is really bad, a player can also detect it by eye. My way of checking how consistent the frames are is to strafe side to side at constant speed in a first-person shooter (A and D in a WASD config), or to hold the left stick all the way left or right on a controller for other types of games.Reply

Does the extractor tool just use the color bar or does it attempt to determine where in a scan line a new frame starts?

Can this be used on an Eyefinity/Surround setup? Does only one of the monitors need to be captured? What is the maximum resolution that can be captured?

Looking at PLOT.png, there already appear to be some abnormally large spikes. Is there going to be any sort of visual inspection to verify that such oddities are just performance spikes and not something erroneous going on with FCAT?

If FCAT and FRAPS are used simultaneously, which one intercepts the Present call first to process an overlay?Reply

What happens when the Present call is invoked while the color bar is being drawn on screen, i.e. one scan line has a color bar composed of two different colors? Does the extractor record that one-scan-line frame as a full frame? (For example, this could lead to a spike where one frame is recorded at 1080 fps on a 1080p display.)

If FRAPS can determine the frame rate from the application's perspective and FCAT can record the frame rate at the display level, then couldn't the latency introduced by just the OS/driver be determined on a per-frame basis? Determining this latency would be interesting.

Can FCAT be used to determine where a frame was rendered in a multi-GPU setup, i.e. if each GPU were only allowed a specific subset of colors for its color bar?Reply

1) A scan line cannot be composed of two different colors. A frame cannot be switched mid-line. It has to wait until the end of the line.

2) So without timestamping and clock syncing, we would not be able to determine latency. It wouldn't be possible to easily match up frames to Present calls, nor to tell how long it took a given frame to traverse the pipe.

3) No. FCAT cannot tell us which GPU rendered a frame. AFR is too abstracted from the rendering pipeline for that.Reply

1) Correct. If the FPS is below 60, the monitor would be repeating part of a frame, so the color bar would stay the same until a new frame is finally served up.

2) The colors are in a regular pattern of 16 colors to detect dropped frames.

3) The extractor tool only looks for the color bar.

4) Yes, this can be used on a surround setup. You'd just capture the leftmost display. The maximum resolution right now would be 2560x1440 (the limits of the capture card), which would be part of a 7680x1440 or 12800x1440 setup.

5) We've already taken a look at results like those. FCAT is almost dead simple; those aren't anomalies in FCAT.

6) It would be FRAPS first, then FCAT. NVIDIA suggests starting FRAPS second, so it comes earlier in the chain.Reply
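Point 2 above (a regular, repeating pattern of 16 colors) makes dropped-frame detection straightforward. A sketch of the logic an extractor could use, with hypothetical color indices rather than FCAT's actual palette:

```python
# If the overlay cycles through 16 colors in a fixed order, a captured
# frame sequence that ever advances the color index by more than one
# step (mod 16) means frames were dropped. A repeated color (step 0) is
# just the monitor showing the same frame again, not a drop.

SEQUENCE_LEN = 16

def count_dropped(observed_indices):
    dropped = 0
    for prev, cur in zip(observed_indices, observed_indices[1:]):
        step = (cur - prev) % SEQUENCE_LEN
        if step > 1:
            dropped += step - 1  # frames skipped between these captures
    return dropped

# e.g. colors 0, 1, 2, 4 observed: color 3 never reached the screen.
print(count_dropped([0, 1, 2, 4]))  # 1
```

The modular arithmetic also handles the sequence wrapping from color 15 back to color 0 between captures.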

What about Frame Render Ahead/Pre-Rendering (frames prepared by the CPU ahead of time)? How does this option change the outlook for SLI and Crossfire? The Battlefield 3 community says that if you change this value to 0 or 1 versus the default value of 3, gameplay is smoother. I've done it myself on my old HD 4870 Crossfire setup on an i7-2600K + Z77 system and it does seem to help.

Does this pre-rendering affect the amount of runt frames on an SLI/Crossfire configuration?

What is the effect of different CPU/chipset combinations? At one point folks used to say AMD systems felt smoother, despite showing lower FPS.

Soooo... not only did AMD NOT realize that this was an issue, but nVidia has known enough to worry about this particular issue for YEARS and develop a tool to detect and measure it.

Wow, AMD. You were behind the curve BEFORE you fired all those engineers in your R&D. The more this story develops, the more sad it becomes. Damn. It's like you're a farmer who failed to realize that you have to spray your fields for insects in addition to fertilizing them.Reply

I'm not sure you realize how development of a silicon-based semiconductor product works. This is not on the scale of a planting season, where you put down some seed and later that year you harvest your crop. You start with the specification phase of the product, move on to development of the HDL and verification of everything, and then go to NRE and silicon samples. This is a multi-year process. AMD may well have known about this issue just as long as NVIDIA, but silicon products don't go from specification to product overnight. That is why vendors offer driver/firmware/microcode updates.

As for taking a pot shot at AMD laying off R&D people: it's called a business decision. Sometimes you need to let people go so that the rest of the employees can remain employed; otherwise you can end up facing bankruptcy and massive layoffs. I don't know if it's the right move at this time or not, and I'm no financial analyst. Currently NVIDIA has no GPU that shares a piece of silicon with a CPU, unlike AMD and Intel; the next generation of consoles seems to have gone with AMD; and I haven't heard of any big chipsets from NVIDIA.

Anyway, sorry for the long response. I believe AMD, Intel, and NVIDIA all have strong points and areas where they can improve. This seems like it might be largely a driver issue, and I'll admit that AMD seems to have had issues with drivers. Your post just seemed like an easy shot against AMD. I am not affiliated with AMD, NVIDIA, or Intel, and the views expressed in this post are my own opinions and not those of any current or previous employer.Reply

From what I heard/read somewhere, NVIDIA knew of such a problem back in their 8000-series cards (8800 GTX/GTS/GT), so it's something they've known about for a long time. NVIDIA has hardware built into their cards to keep them in sync frame-wise, whereas AMD doesn't. It may take AMD some two years to fix it properly, and a software driver fix will most likely cost some fps.Reply

I'd like an AnandTech investigation of a slightly different issue: how fast does Apple update the screen on iPads and iPhones when displaying movies? Specifically, when displaying movies, do they switch to updating the screen only every 24th of a second?

The reason I think this is an interesting question is that, in my experience, movies displayed on an iPad show none of the stuttering when panning that is so obvious on both TVs and computers, stuttering which is, as far as I can tell, generated by the 3:2 pulldown. (I don't have a 120Hz TV, so I can't comment on how well those deal with this.)

So we have the visual suggestion that this is what Apple is doing, along with the obvious point that it would presumably save power to only refresh the screen at 24Hz (though the power savings may be negligible).

I must admit I would find an investigation of this very interesting (perhaps by techniques similar to those being used here: a movie consisting of a color-coded sequence of frames, plus time-stamped video capture, though you'd probably have to use an external video camera).

And this is not just an Apple specific issue; it would be interesting to know if Android and MS likewise are capable of displaying stutter-free movies on their mobile devices (unlike on the desktop where, sure, you have far less control, but for fsck's sake --- can't you at least do the job right in full screen mode?)Reply

Wow, so this issue of stuttering has been talked about amongst users for at least a decade, then Scott Wasson from techreport decides to run a test of latency just weeks before Nvidia releases their super duper new tool to test latency?......and Nvidia have been working on it for 2 years? What a coincidence!! ...and NVidia cards perform better in this regard? Double coincidence!! So does that mean Nvidia is a benevolent company who wants to help AMD fix their stuttering issues so AMD can sell more cards?? Wow they must be saints!! We can add this to Nvidia's long list of open, honest and transparent business practices.Reply

Not sure if joking or troll.... Scott's latency tests started in their August 23, 2012 article entitled "Inside the second: Gaming performance with today's CPUs". That's not "just weeks" ago. Second, if NVIDIA thinks Scott's FRAPS tests are so awesome for them, why would they bother to release a tool that measures at an entirely different point in the pipeline? Your conspiracy theory is not only factually wrong, it doesn't even make a good conspiracy theory.Reply

I've just added FCAT overlay rendering support to the OSD server of MSI Afterburner and EVGA Precision. I still need some time to discuss the exact RGB color sequence with NVIDIA, then I guess we'll release it to the public.Reply

Thanks, Ryan. I've already received the hex reference colors from NV. MSI Afterburner with FCAT overlay support (3.0.0 beta 8) is expected to be released on Monday; EVGA Precision 4.1.0 with FCAT overlay support will be available next week too.Reply

There is a well-documented stuttering fix for both NVIDIA and AMD users on multiple forums. I've tried it on my HD 4870 Crossfire setup and it works. The particular user from the above link has an NVIDIA GTX 470.

5. Open Notepad and paste this into it, saving it as "user.cfg" inside your "program files/origingames/battlefield3" folder:

I don't like the spin. Lots of sites were left flat-footed by Scott Wasson's work, and this article tries to spin it like "well, we always wanted to do this, but we never had a good enough tool, until now." How convenient. It reminds me of countless corporate BS: when beaten to a trend, a corporation will usually say something like "well, we always wanted to do what our wildly popular competitor did, but only NOW can it be done properly, by us; we're not copying, guys, no really."

Nonsense. You never cared (much) before; Wasson's work started exposing you, so you jumped on the bandwagon, late, like a lot of sites.

Mind you, I like AnandTech, and don't even really like Wasson, lol, but a spade's a spade.

And this FCAT is probably a better tool; all that may be true. But again, call a spade a spade.Reply

Nice article, but I'm just a bit concerned about using another piece of hardware to get the results. What I also don't like is that the capture takes place at 60 Hz. What about those who are "overclocking" their monitors (running them above the default 60 Hz refresh rate; my monitor now runs at 75 Hz, for example) or those who have high-refresh-rate monitors, 120 Hz for example? Also, what I would really, really like to see is an in-depth analysis of VirtuMVP. In theory it should give a more responsive and/or smoother experience, but most of the games I've run with VirtuMVP had, more or less, some form of stuttering.Reply

I feel the topic of the OP relates directly to all of this new frame-time testing. AMD systems may in fact be SMOOTHER than Intel systems. I have a Core i7-2600K/Z77 system running Crossfire. I can play Battlefield 3 on High at 60 to 120 fps... albeit with a ton of stutter/dropped frames/runt frames. My coworker has a measly AMD FX-4100 with the same HD 6850 Crossfire on an AMD 970 chipset. His system allows CrossfireX to be enabled (Crossfire through the chipset/PCIe AND the Crossfire cable simultaneously). His system ran at only 30 to 65 fps during gameplay but clearly had no stutter/dropped frames/runt frames. At a reported 35 fps his system played smoother than mine at 75 fps. His 970 chipset was also pushing one card at PCIe 2.0 x16 and the other at x4, yet he was still 'smoother'.

This irked me... AMD systems may very well be smoother in Crossfire configurations given the added features that support crossfire on their chipsets. I urge Anandtech members to please write Techreport, Anandtech, and PCPer to do more testing with AMD systems vs. Intel systems. Reviewers tend to only use Intel systems when doing all testing, but this may not be showing the entire picture (literally). Also let's continue this discussion given this new frame time point of reference and get to the bottom of this.

As Ryan Shrout has discussed, frame issues with AMD cards happen in games where the GPU is the limiting factor in fps. In games that lean on the CPU more than the GPU, the issue doesn't show up. So the likely reason AMD CPUs show the problem less than Intel CPUs is that the AMD CPU maxes out well before the GPU does. It's pretty common that, per core and per watt, Intel is faster.Reply

I do not believe your statement on patents is completely accurate. Patents do cover a specific implementation of an idea, although they also cover independent discoveries of that same implementation and clean-room reverse engineering of said implementation. What I mean by a specific implementation is that, taking the RSA cryptosystem for example, the patent did not cover all possible implementations of a public-key cryptosystem, although it might have been the only way to implement a public-key system at the time. I'm reasonably certain that if Diffie-Hellman had been known of at the time, people could have used it without infringing any patents, just as it was used once it was discovered. Similarly, the LZW patents didn't cover generic data compression but instead a compression scheme that used less processing than LZ-77/78. People working on patent-free audio and video codecs have been finding ways around various algorithms for some time now. I am not a lawyer or a trained solicitor; I have not taken any formal classes in practicing law, and this post should not be taken as professional legal advice. I appreciate your comment and just felt it could use some clarification.Reply

I wonder how 3DMark will react to this evolution, because it gives scores based on FPS, but this kind of analysis requires an add-on card to capture video. Will 3DMark ignore frame-by-frame analysis? Will it require more hardware (a capture card)? Will it resort to a software solution (questionable)? Will it give the user a choice of methodology (giving different, non-comparable results)?

I hope that AMD also provides a similar tool. NVIDIA and AMD have both tried to cheat with their software before, and if NVIDIA is the only one providing tools…

I'm happy with this evolution. I ranted for a long time about noticeable freezes in benchmarks and games that reported great FPS (on single cards, not SLI stuttering), but I felt frustrated by benchmark review sites. Now I'm worried about not having the same hardware as reviewers, and not being able to verify websites' claims/results. FbF (frame-by-frame) analysis is a step in the right direction, but it makes us consumers vulnerable to companies corrupting websites to get better, non-reader-verifiable results.Reply

Um, not sure there is anything AMD can add to this. The DX overlay adds colors to each frame before it gets to the subsystems, and the colors are analyzed to confirm they appear in a certain order; if a color is missing, then a frame was dropped. I wouldn't say AMD is trying to cheat, but they surely have an issue with frames being dropped or being so small they don't improve the gaming experience. When you remove those tiny and dropped frames from the fps numbers, it paints a different picture of which card is faster in SLI/CF setups. In single-card setups, which card is better is a toss-up.Reply
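The "tiny frames" half of that analysis is also easy to sketch. Once the extractor knows how many scan lines each frame occupied in the capture, classifying runts is just a threshold check (the 21-line cutoff here is an assumption for illustration, not FCAT's actual value):

```python
# Classify captured frames as "runt" when they occupy only a few scan
# lines of the output. Threshold is a hypothetical example value.
RUNT_THRESHOLD_LINES = 21

def split_runts(scanlines_per_frame):
    runts = [n for n in scanlines_per_frame if n < RUNT_THRESHOLD_LINES]
    full = [n for n in scanlines_per_frame if n >= RUNT_THRESHOLD_LINES]
    return full, runts

# Hypothetical per-frame scan-line counts from one captured second:
full, runts = split_runts([540, 3, 537, 520, 8, 552])
print(len(full), len(runts))  # 4 2
```

Dropping the runts before computing fps is exactly what changes the picture of which multi-GPU setup is actually faster.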