Inside the second: Gaming performance with today’s CPUs

Does the processor you choose still matter?

This story was brought to you by our friends at The Tech Report. You can visit the original story here.

As you may know, a while back, we came to some difficult realizations about the validity of our methods for testing PC gaming performance. In my article Inside the second: A new look at game benchmarking, we explained why the widely used frames-per-second averages tend to obscure some of the most important information about how smoothly a game plays on a given system. In a nutshell, the problem is that FPS averages summarize performance over a relatively long span of time. It's quite possible to have lots of slowdowns and performance hiccups during the period in question and still end up with an average frame rate that seems quite good. In other words, the FPS averages we (and everyone else) had been dishing out to readers for years weren't very helpful—and were potentially misleading.

To sidestep this shortcoming, we proposed a new approach, borrowed from the world of server benchmarking, that focuses on the actual problem at hand: frame latencies. By considering the time required to render each and every frame of a gameplay session and finding ways to quantify the slowdowns, we figured we could provide a more accurate sense of true gaming performance—not just the ability to crank out lots of frames for high averages, but the more crucial ability to deliver frames on time consistently.
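
To make the distinction concrete, here is a minimal Python sketch (our own illustration with made-up numbers, not the tooling or data from our tests) showing how a trace that hitches regularly can still post a respectable FPS average, while a high-percentile frame time exposes the problem:

```python
import numpy as np

# Hypothetical trace: mostly smooth 16.7 ms frames, with a nasty
# 150 ms hitch roughly once per second.
frame_times_ms = np.full(600, 16.7)
frame_times_ms[::60] = 150.0  # 10 hitches out of 600 frames

avg_fps = 1000.0 * len(frame_times_ms) / frame_times_ms.sum()
p99_ms = np.percentile(frame_times_ms, 99)

print(f"average FPS:         {avg_fps:.1f}")    # ~53 FPS, looks playable
print(f"99th-pct frame time: {p99_ms:.1f} ms")  # 150 ms, reveals the stutter
```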

Some good things have happened since we proposed our new methods. We've deployed them in a host of graphics card reviews, and they have proved their worth, helping to uncover some performance deficiencies that would have otherwise remained hidden. In response to your feedback, we've refined our means of quantifying the latency picture and presenting the info visually. A few other publications have noticed what we're doing and adjusted their own testing methods; even more have quietly inquired about the possibility behind the scenes.

Most importantly, you, our readers, have responded very positively to the changes, even though we've produced some articles that are much more demanding reading than your average scan-and-skip-to-the-conclusion PC hardware review.

We have largely missed one important consequence of our insights, though. Latency-focused game testing doesn't just apply to graphics cards; it's just as helpful for evaluating CPU performance. We made a quick down payment on exploring this matter in our Ivy Bridge review, but we haven't pursued it properly since. Happily, that oversight ends today. Over the summer, we've tested 18 different PC processors from multiple generations in a range of games, and we can now share the results with you.

Before your eyes glaze over from the prospect of data overload, listen up. The results we've compiled confront a popular myth: that PC processors are now so fast that just about any CPU will suffice for today's games, especially since so many titles are console ports. I've said something to that effect myself more than once. But is it true? We now have the tools at our disposal to find out. You may be surprised by what we've discovered.

The contenders

Yes, we really have tested 18 different desktop CPUs for this article. They break down into several different classes, delineated mainly by price. We have a full complement of the latest chips on hand, including several members of Intel's Ivy Bridge lineup and a trio of AMD FX processors. We've tested them against their predecessors in the past generation or two, to cover a pretty big swath of the CPUs sold in the past several years. Allow me to make some brief introductions.

Quite a few PC enthusiasts will be interested in the first class of CPUs tested, which is headlined by the Core i5-3470 at $184. This Ivy Bridge-based quad-core replaces the Sandy Bridge-derived Core i5-2400 at the same price. The newer chip has slightly faster clocks and a lower power envelope—77W instead of 95W—versus the model it supplants. Two generations back, this price range was served by the Core i5-655K, a dual-core chip. The closest competing offering from AMD is the FX-6200 at $155, a six-core part based on the Bulldozer architecture. The FX-6200's precursor was the Phenom II X4 980, which we've also invited to the festivities.

For a little more money, the next class of CPUs promises even higher performance. Intel's Ivy Bridge offering in this range is the Core i5-3570K for $216, with a fully unlocked multiplier to ease overclocking. The 3570K replaces an enthusiast favorite, the Core i5-2500K, again with slightly higher clock speeds and a lower thermal design power (or TDP). This is also the space where AMD's top Bulldozer chip, the FX-8150, contends. The legacy options here are a couple of 45-nm chips, the Core i5-760 and the Phenom II X6 1100T.

More relevant for many of us mere mortals, perhaps, are the lower-end chips that sell for closer to a hundred bucks. AMD's FX-4170 at $135 gets top billing here, since our selection of Intel chips skews to the high end. We think the FX-4170 is a somewhat notable entry in the FX lineup because it boasts the highest base and Turbo clock speeds, even though it has fewer cores. The FX-4170 supplants a lineup of chips known for their strong value, the Athlon II X4 series. Our legacy representative from that series actually bears the Phenom name, but under the covers, the Phenom II X4 850 employs the same silicon with slightly higher clocks.

Finally, we have the high-end chips, a segment dominated by Intel in recent years. We've already reviewed the Ivy-derived Core i7-3770K, a $332 part that inherits the spot previously occupied by the Core i7-2600K and, before that, by the Core i7-875K. Also kicking around in the same price range is the Core i7-3820, a fairly affordable Sandy Bridge-E-based part that drops into Intel's pricey X79 platform. The Core i7-3820's big brother is a thousand-dollar killer, the Core i7-3960X, the fastest desktop CPU ever.

This selection isn't perfect, but we think it provides a good cross-section of the market. Face it: the CPU makers offer way too many models these days. The sheer volume of parts is difficult to track without an online reference. If you're having trouble keeping them sorted, fear not. We've broken down the results by class in the following pages, and we'll summarize the overall picture with one of our famous price-performance scatter plots.

The value of the i7 over the i5s shows up in applications that support Hyper-Threading; there's not much point for gaming. Also, the 3570K can potentially deliver a lot more performance for its price than the 3470 because of its unlocked multiplier. At 179 bucks at Microcenter, I'd probably consider it the best value of this lot.

This is the most helpful write-up of this type I've ever seen. I had sort of guessed that the i5-3470 and 3570 would be the best bang for the buck but having these stats to back it up is really excellent.

I don't exactly have the budget to upgrade my system right now, but it's nice to see that my i5-760 from a couple years ago is holding up just fine too; I can probably get another 18 months out of it before it starts creaking.

As far as I understand, the improvement over conventional benchmarking procedure in this article is simply that the data is not averaged into a single numerical performance index (i.e., average FPS), but presented using statistical measures that incorporate information on spread, such as variance and range, to highlight outliers.

If you take the reciprocal (1/x) of the frame latency graph in this article, you get back the more user-understandable frames-per-second graph as a function of frame number. This is the same as what other benchmark sites, for example [H]ardOCP, have done. Rather than presenting averaged FPS as a single bar chart, they show a time-lapsed graph of FPS, so people can see where the dips (minimum FPS) are. That is understandable, and there is no need to reinvent the wheel by artificially introducing frame latency, spending a paragraph and a table to explain it, when it is mathematically just the reciprocal of FPS conveying the same information.

In fact, looking at spikes is more confusing than looking at dips. Saying I only get 10 FPS at frame number 3000, rather than saying my frame time is 0.1 seconds at frame number 3000, is infinitely clearer to average people.

While I am not affiliated with [H]ardOCP, and I am a big fan of Ars Technica, I'll post a link to [H]ardOCP to make my point: http://www.hardocp.com/article/2010/12/ ... d_review/5. In my humble opinion, saying "To sidestep this shortcoming, we proposed a new approach, borrowed from the world of server benchmarking, that focuses on the actual problem at hand: frame latencies" and making it sound like you completely revolutionized benchmarking is a bit over the top =P

I do appreciate the 99th percentile plots to highlight outliers.

Lastly, you could revolutionise benchmarking with simple but effective measures such as the total time spent rendering under 30 FPS and under 60 FPS, respectively. Ultimately, that's all people really need to know, isn't it?
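
For what it's worth, that last metric is trivial to compute from the same frame-time data. A rough sketch, assuming a capture of per-frame render times in milliseconds (the sample values are hypothetical):

```python
import numpy as np

# Hypothetical per-frame render times from one capture, in milliseconds.
frame_times_ms = np.array([15.2, 16.8, 14.9, 35.1, 16.5, 70.3, 16.1])

def time_spent_beyond(times_ms, cutoff_ms):
    """Total time accumulated on frames slower than the cutoff."""
    return times_ms[times_ms > cutoff_ms].sum()

# 60 FPS corresponds to a 16.7 ms frame budget; 30 FPS to 33.3 ms.
print(f"time under 60 FPS: {time_spent_beyond(frame_times_ms, 1000/60):.1f} ms")
print(f"time under 30 FPS: {time_spent_beyond(frame_times_ms, 1000/30):.1f} ms")
```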

All of the games tested are mechanically simple and there's no reason to expect them to heavily use the processor for anything but graphics. Why was there no effort to include games which need lots of CPU for AI, simulation, etc.?

Fantastic methodology. The work put into this must have been phenomenal. Thank you for taking the time to do this, and for sharing it with us.

Phenomenally short-sighted, perhaps: not a single RTS or turn-based strategy game was tested. The Total Annihilation-Supreme Commander sequence of games, for instance, has long been known to tax CPUs far more than GPUs, not to mention memory usage (SupCom and its FA sequel, when played on a Windows XP system with mods and large maps, could easily require the /3GB startup switch be used to force Windows to allocate another gigabyte to the app space).

From the selection that was tested, an ignorant person might walk away with the presumption that "game" is synonymous with "first-person shooter". WTF? If the purpose here is to highlight how the performance characteristics of specific processors affect game play, then why not test a broad range of game TYPES?

That was a bit weird for me as well. Two games I would have liked to see benched are World of Warcraft and Civ 5, since they care far more about CPU power than GPU power. Though honestly, since WoW doesn't have a built-in benchmark tool and is an online game, it would be difficult to get repeatable tests done.

Seconded on testing a broad range of game types.

And I also wish there'd been more of a range, time-wise. I run a Q6600 and would like to have seen how a C2Q (or C2D) would compare (because I'll bet a lot of people still use them) although since they're discontinued, I can understand the difficulty of getting hold of them...

Perhaps a Pentium G6xx then? Sure, they're not aimed at 'gaming rigs', but I thought the point was to evaluate performance against power.

I can't remember which tech site did the testing, but real-world multiplayer BF3 results aren't nearly so close. When benchmarking on 64-player maps with a high-end video card, the Intel 4+ core chips really pull away from the pack.

Wow, another great article. You guys have been doing really, really well lately. I think this article finally pushed me back into 'et Subscriptor' status, though it'll be a week or two before I actually do it.

God, it's so nice to have the old Ars back!

Actually, this is a Tech Report article that Ars is republishing. It is nice to have some benchmarking back at Ars, but it's not the Ars staff doing the testing (they'd have tried to run it over with a VW, or thrown it off a balcony or something).

I guess the TL;DR for the article is that AMD chips still suck for gaming unless you're on a really tight budget. I wish AMD could get their act together and make the CPU business competitive again, like in the Socket A/Socket 939 days.

Agree that it would've been nice for them to test some CPU-heavy simulation games. FSX (most recent version of Flight Simulator from Microsoft) would be a great one. Despite the fact that it's about six years old, I've heard that it takes an i7 to run it with graphics settings maxed out while still getting a smooth frame rate.

Great review. Always nice to see a solidly presented methodology. As others have commented, it would be nice to see Civ 5 or other strategy games that churn CPUs for minutes (large/huge-map late-game turns, which are probably also highly repeatable). I also would have liked to see how well a C2D or C2Q fared against the more modern CPUs.

I <3 AMD for pulling off the amd64 revolution... but I recently got a new Win7 + i5-3550 machine, and Kubuntu 12.04 feels nicer in VirtualBox than it did natively on a Phenom II X4 840. IVB is a nice piece of silicon.

I don't get the game selection either. Why didn't they whip out StarCraft II? All they'd need to do is pick a good replay that involves a couple of death-balls (or even a 2v2 that goes into the late game) and play it a bunch.

Agreed. The moral of the article for AMD users: if you want to play games with an AMD under the hood, stay well clear of Bulldozer. I actually made the mistake of upgrading my Phenom II X6 1100T to its supposed Bulldozer 'replacement' (edit: it was the 8150, as I'd thought). Thankfully I managed to return it and get my money back.

I think CPUs have always been the bottleneck in making a computer perform better. GPUs, RAM, and data storage/access speeds have all grown in capability much faster. The emphasis with CPUs has always been on improving the architecture and the way calculations are handled, as opposed to improving how many calculations can be performed at once.

I believe the (relatively) large user base that knows about overclocking CPUs is partially a reflection of that as well.

To those who wish I'd tested RTS games or whatever: I agree. It would be nice to test more games and perhaps more CPUs (like the sainted Core i3, it seems). In my defense, putting together this article was a pretty big undertaking by itself, and I needed to finish this project and publish the results I had.

But nothing prevents us from doing more testing with different games in the future. I think it would be fun to try Arma 2 and some other games that are known CPU hogs. I'll see what we can do next time around.

Spot on, MobiusPizza! Thank you for voicing this. I felt the same way and agree completely with you.

The article is BS and fluff surrounding precious little new benchmarking methodology. Clearly the CPU matters, and the tried-and-true metric (FPS) is effective at demonstrating that. About the only useful new metric, as you mentioned, is the percentage of time spent under 60 FPS.

This 99th-percentile frame-time stuff is also BS. The CDF is hardly as useful as the PDF; present that instead. To calculate the PDF: take the vector of raw frame latencies, compute the reciprocal 1/t, and plot the histogram. The mean and variance in FPS are immediately visualized.
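
In code, that recipe is just a few lines. A minimal sketch (the file name and capture format are hypothetical):

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical capture: one frame latency per line, in seconds.
latencies_s = np.loadtxt("frame_times.txt")
inst_fps = 1.0 / latencies_s  # reciprocal: instantaneous per-frame FPS

plt.hist(inst_fps, bins=50)
plt.xlabel("instantaneous FPS")
plt.ylabel("frame count")
plt.title(f"mean {inst_fps.mean():.1f} FPS, std dev {inst_fps.std():.1f}")
plt.show()
```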

Good to hear that more testing is planned. I look forward to the review sequel that includes Supreme Commander: Forged Alliance, StarCraft II, ArmA 2, Civ 5, Sins of a Solar Empire, and Sword of the Stars.

I'll nitpick a bit: almost nobody plays BF3 for the single-player; multiplayer is where the magic happens. I know it's hard to benchmark the multiplayer portion of a game (since it is not repeatable), but the difference between single and multi in that game is like night and day. Running a long-term average would at least give an idea of how much difference the CPU selection makes.

I don't expect to see detailed charts for such a test, but I think it should at least be mentioned in the relevant section. There is a reason why BF3 has much smaller maps and lower player caps on consoles.

For a lousy reference point: I have an aging gaming rig with a Phenom II X4 960T, and the performance is manageable at the lowest settings. A few months ago it had a Phenom II X3 720, and the game was unplayable. These two CPUs have very similar architectures; the difference is three cores vs. four and 200 MHz per core. The actual difference in performance is much greater than what you would expect.

As for the defense that this was a big undertaking: I don't see what the size of the undertaking has to do with it. You chose to test many games for which there was every reason to expect similar, GPU-bound behavior. You could have tested fewer games, done less work, and still gotten more interesting results by selecting games that are generally known or expected to be CPU-bound.

Thanks for the write-up, Scott; it was very informative. I also enjoyed your GPU write-up. I would love to see how the CPUs fare against something like Civ 5. That is one game that can bring a system to its knees.

@MobiusPizza, we used to use those FPS-over-time graphs you mention, too. The trouble is, the tools used to produce them consistently average FPS over a full second, so a relatively long wait for one frame can be masked if it's surrounded by a bunch of low-latency frames. See "why FPS fails" here for an illustrated example.

So the results from other sites (or our own past results) are not just a reciprocal.

I'll admit we could convert our results back to FPS, but I fear the distinction I've just made would be lost even more easily. Also, I think once you start thinking in terms of latency, it sort of begins to stick.
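
A worked example of that masking effect, with numbers invented for illustration: one second containing ninety 10 ms frames plus a single 100 ms hitch still reports a perfectly healthy per-second FPS figure.

```python
# Ninety fast frames plus one 100 ms hitch add up to exactly one second.
frame_times_ms = [10.0] * 90 + [100.0]

window_fps = len(frame_times_ms) / (sum(frame_times_ms) / 1000.0)

print(f"one-second FPS sample: {window_fps:.0f} FPS")  # 91 FPS, looks great
print(f"worst frame: {max(frame_times_ms):.0f} ms")    # but a visible hitch
```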

Hi Scott,

I know I'm not MobiusPizza, but I don't agree with your statement about latency. FPS is an intuitive measure. Why? Because it's a measure of velocity, a natural quantity for people to grasp. Customers are interested in how fast things go, i.e., mph. No customer is interested in how long it takes to travel one mile; that measure is for manufacturers.

Furthermore, the human eye is an integrating sensor; it does not measure individual frames. The naked eye cannot gauge absolute frame rates except to a rough order, but deviations in frame rate (on the order of 2-3 FPS) are easily caught. If you want a useful metric, provide FPS with error bars: 55 +/- 10 FPS. That deviation, to first order, is a measure of how smooth the video will be. That's a useful metric for consumers.
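
That suggestion is also easy to derive from the same frame-time data. A minimal sketch, with hypothetical per-frame times:

```python
import numpy as np

# Hypothetical per-frame times in seconds from a short capture.
frame_times_s = np.array([0.016, 0.017, 0.015, 0.033, 0.016, 0.018])
inst_fps = 1.0 / frame_times_s

# Mean +/- standard deviation of instantaneous FPS, as suggested above.
print(f"{inst_fps.mean():.0f} +/- {inst_fps.std():.0f} FPS")
```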