Translating to IPC: All This for 3%?

Contrary to popular belief, increasing IPC is difficult. Attempting to ensure that each execution port is fed every cycle requires wide decoders, large out-of-order queues, fast caches, and the right execution port configuration. It might sound easy to simply pile it all on; however, both physics and economics get in the way: the chip still has to be thermally efficient, and it has to make money for the company. Every generational design update goes for what is called the ‘low-hanging fruit’: the identified changes that give the most gain for the least effort. Reducing cache latency, for example, is rarely an easy task, and to non-semiconductor engineers (myself included) it sounds like a lot of work for a small gain.

For our IPC testing, we use the following rules. Each CPU is allocated four cores, without extra threading, and power-saving modes are disabled so that the cores run at a single fixed frequency. The DRAM is set to what the processor officially supports: DDR4-2933 for the new CPUs, and DDR4-2666 for the previous generation. I have recently seen threads disputing whether this is fair; it is, because this is an IPC test, not an instruction efficiency test. Official DRAM support is part of the hardware specification, just as much as the size of the caches or the number of execution ports. Running the two CPUs at the same DRAM frequency would give one of them an unfair advantage (effectively a memory overclock or underclock) and would deviate from the intended design.

So for our test, we took the new Ryzen 7 2700X, the first-generation Ryzen 7 1800X, and the pre-Zen, Bristol Ridge-based A12-9800, which also uses the AM4 platform and DDR4. We set each processor to four cores, no multi-threading, and 3.0 GHz, then ran through some of our tests.

For this graph we have set the first-generation Ryzen 7 1800X as our 100% marker, with the blue columns showing the Ryzen 7 2700X. The problem with trying to identify a 3% IPC increase is that 3% can easily fall within the noise of a benchmark run: if the cache is not fully warmed before the run, for example, performance can differ. As shown above, a good number of tests fall within that +/- 2% range.

However, for compute-heavy tasks there are 3-4% benefits: Corona, LuxMark, Cinebench, and GeekBench are the ones here. We haven't included the GeekBench sub-test results in the graph above, but most of those fall into the 2-5% range for gains.

If we take out the Cinebench R15 nT result and the GeekBench memory tests, the average across all of the tests comes out to a +3.1% gain for the new Ryzen 7 2700X. That sounds bang on the money for what AMD stated it would do.
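The averaging itself is straightforward; here is a minimal Python sketch of excluding the outliers before taking the mean. The per-test ratios below are hypothetical placeholders, not our measured data (the real numbers are in the graph above):

```python
# Hypothetical per-test performance ratios (Ryzen 7 2700X / Ryzen 7 1800X);
# the actual values are those plotted in the graph above.
gains = {
    "Corona": 1.035,
    "LuxMark": 1.040,
    "Cinebench R15 1T": 1.030,
    "Cinebench R15 nT": 1.220,   # 22% outlier, likely SMT-related
    "GeekBench memory": 0.990,   # memory sub-test
    "PDF Opening": 1.010,
}

# Exclude the outlier and the memory sub-tests before averaging.
excluded = {"Cinebench R15 nT", "GeekBench memory"}
kept = [ratio for name, ratio in gains.items() if name not in excluded]
avg_gain = sum(kept) / len(kept) - 1.0
print(f"Average gain (excluding outliers): {avg_gain:+.1%}")
```

With real data, leaving the 22% nT outlier in would noticeably skew a small sample like this, which is why it is reported separately below.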

Cycling back to that Cinebench R15 nT result, which showed a 22% gain: we also did some IPC testing at 3.0 GHz but with 8C/16T (which we could not compare against Bristol Ridge), and a few other tests there also showed 20%+ gains. This is probably a sign that AMD has also adjusted how it manages its simultaneous multi-threading. This requires further testing.

AMD’s Overall 10% Increase

Given the benefits of the 12LP manufacturing process, a few editors internally have questioned exactly why AMD hasn't redesigned certain elements of the microarchitecture to take advantage of it. Ultimately, it would appear that the 'free' frequency boost is worth porting the same design over: as mentioned previously, 12LP is based on 14LPP with a performance bump, and in the past it might not even have been marketed as a separate process. So pushing through the same design is an easy win, allowing the teams to focus on the next major core redesign.

That all being said, AMD has previously stated its intentions for the Zen+ core design: rolling back to CES at the beginning of the year, AMD said that it wanted Zen+ and future products to go above and beyond the 'industry standard' of a 7-8% performance gain each year.

Clearly 3% IPC is not enough, so AMD is combining that gain with a +250 MHz increase, worth about another 6% in peak frequency, plus better turbo behavior through Precision Boost 2 / XFR 2. Together this comes to about 10%, on paper at least. Benchmarks to follow.
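As a quick sanity check on that figure: IPC and frequency gains multiply rather than add, since performance scales roughly as IPC times clock speed. A two-line sketch using the article's rounded percentages (nothing measured here):

```python
# Performance ~ IPC x frequency, so generational gains compound.
ipc_gain = 0.031    # ~3.1% average IPC uplift from the testing above
freq_gain = 0.06    # +250 MHz, about 6% extra peak frequency
combined = (1 + ipc_gain) * (1 + freq_gain) - 1
print(f"Combined uplift: {combined:.1%}")  # just over 9%, i.e. "about 10%" on paper
```

The compounding matters more as the individual gains grow; at these single-digit percentages, multiplying and adding give nearly the same answer.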

Should be out next year, as AMD has been very much on the ball with Ryzen launches, hitting more or less to the day the dates it claimed, which is very nice. Basically, what they promise for product delivery, they are doing, IMO. Not to mention TSMC recently announced volume production of its 7nm process, so GloFo is likely to follow soon, and AMD can use TSMC just the same :)

If you ever do fancy a bit more oomph in the meantime (and assuming IPC is less important than threaded performance, e.g. HandBrake matters more than PDF loading), a decent temporary sideways step for X79 is a Xeon E5-2697 v2 (IB-EP). An oc'd 3930K is quicker for single-threaded tasks of course, but for multithreaded work the Xeon does very well, easily beating an oc'd 3930K, and the Xeon has native PCIe 3.0, so there's no need to bother with the not-entirely-stable forced NVIDIA tool. See my results (for Firefox, set Page Style to No Style in the View menu):

I never felt limited by my i5-4670k either, especially mildly overclocked to 4.0GHz.

Until I built a new PC around the same old components because the MSI Z97 motherboard failed (thanks, MSI; it was 4 years old, but still...). I picked up a new i3-8350K + ASRock Z270 bundled together at Microcenter for $200 a month ago, and it's a joke how much faster it is than my old i5.

First off, it's noticeably faster, at STOCK, than the max stable overclock I could get on my old i5. Granted, I replaced the RAM too, but it's still 16GB, now DDR4-2400 instead of DDR3-2133; I doubt that makes a huge difference.

Where things are noticeably faster comes down to boot times, app launches, and gaming. All of this is on the same Intel SSD 730 480GB SATA3 drive I've had for years. I didn't even do a fresh install; I just dropped it in, let Windows 10 rebuild the HAL, and reactivated with my product key.

Even on paper, the 8th-gen i3s are faster than previous-gen i5s. The i3 at stock is still faster than the 4th-gen i5 mildly overclocked.

I wish I had waited. It's compelling (although more expensive) to build an AMD Ryzen 2 system now. It really wasn't before, but now that performance is slightly better and prices are slightly lower, it would be worth the gamble.

I think there's something wrong with your old Haswell setup if the difference is that noticeable. I have every generation of Intel i7 or i5 except Coffee Lake running in two adjoining rooms, and I can't notice a significant difference between my Sandy 2600K system with a SATA 850 Evo Pro sitting literally right next to my Kaby Lake i7 with a 960 EVO NVMe SSD. I want to convince myself how much better the newer one is, but it just isn't. And this is five generations apart for the CPUs/mobos, comparing one of the fastest SSDs ever made against a SATA drive (although about the fastest SATA drive there is). Coffee Lake is faster than Kaby Lake, but the gap between equivalent i7s is so tiny that I can't see myself noticing a major difference.

In the same room, across from those two, is my first Ryzen build, the 1800X, also with a 960 EVO SSD. Again, I can barely convince myself it's a different system from the Sandy 2600K with its SATA SSD. I have your exact Haswell i5 too, and it still feels fast as hell, especially for app launches and gaming. The only time I notice major differences between these systems is when I'm encoding video or running synthetic benchmarks. Just for the thrill of a new flagship release, I've ordered the 2700X too, and it'll sit next to the 1800X for another side-by-side experience. It'll be fun to set up, but I'm pretty convinced I won't be able to tell the two systems apart when not benchmarking.