
Ever since Intel did the first Haswell graphics explainer, there’s been talk about just how much additional performance the company would deliver with its new lineup of graphics processors. Ivy Bridge, after all, was a major improvement over Sandy Bridge — so much so that Intel talked up its “Tick Plus” model as a key component of the first-generation 22nm processor’s appeal.

We’re still barred from talking about Haswell’s GPU performance directly, but Intel has released new branding information and some early performance figures. If you plan on buying a Haswell-based system, you’ll want to understand these performance levels and how the new brands are being used.

The top-end Intel graphics products are now known as Iris Pro and Iris. Intel Iris Pro systems are based on the company’s GT3e graphics solution. That’s the chip with an onboard eDRAM frame buffer (pegged at 128MB by RealWorldTech). The slide below describes the differences…

As of this writing, Intel’s Iris Pro 5200 GPUs will only be available in 48W mobile SKUs. There are two further flavors of GT3 — Iris 5100 and Iris 5000. The Iris 5100 parts are intended for 28W mobile processors while the 15W chips all use the Iris 5000. Intel isn’t revealing clock speeds yet, but the difference between Iris 5100 and 5000 is going to be boost/turbo clocks.

How do these SKUs impact desktop parts? They basically don’t. Intel will offer one BGA-based solution (the R-series) with GT3e graphics. All of the other desktop chips will use the HD 4600-4200 SKUs. These top out at 20 EUs (Execution Units), compared with Ivy Bridge’s 16 EUs, a 25% increase in raw shader hardware. Once clocks and memory bandwidth are factored in, the desktop graphics performance increase should land in the 15-20% range. That’s reasonable, if not particularly exciting.

The performance per watt trade-off

We’re going to ignore 3DMark06 (because no one cares) and look at 3DMark Vantage and 3DMark11. Forget the far-right bar for the 28W part for a moment, and focus on the middle. That’s the Intel Core i7-4650U, a 15W part with (presumably) Iris 5000 graphics. Factor in the 2W TDP decline, and the performance boost between the new Haswell chip and the older Ivy Bridge CPU is actually excellent. Moving to GT3 picks up ~25% in one benchmark and 50% in the other.
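
To put rough numbers on that, here’s a quick back-of-the-envelope sketch in Python. It assumes the Ivy Bridge comparison point is a 17W ULV part (implied by the 2W TDP decline against the 15W Haswell chip) and treats TDP as a crude stand-in for actual power draw, which it isn’t exactly:

    # Rough perf-per-watt comparison: Core i7-4650U (15W, Iris 5000) vs. a
    # 17W Ivy Bridge ULV part, using the ~1.25x and ~1.5x gains quoted above.
    # (The chart doesn't say which benchmark is which, so the labels are generic.)
    ivb_tdp, hsw_tdp = 17.0, 15.0                      # watts (TDP)
    for label, speedup in [("benchmark A", 1.25), ("benchmark B", 1.50)]:
        perf_per_watt = speedup * (ivb_tdp / hsw_tdp)  # normalize by power budget
        print(f"{label}: {speedup:.2f}x performance, ~{perf_per_watt:.2f}x perf per watt")

That works out to roughly 1.4x and 1.7x performance per watt; the efficiency gain outpaces the raw performance gain because the TDP dropped at the same time.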

While Intel is obviously going to cherry-pick results, these should map to real-world games. GT3 is more than twice the size of GT2 (40 EUs vs. the 16 in Ivy Bridge’s GT2), and one reason Intel went with the wider array is that it could downclock the graphics cores and save power as a result.

The i7-4558U is a different story. Forget Ivy Bridge for a moment and look at the 4650U vs. 4558U comparison. The 4558U is only 1.32x as fast as the 4650U in 3DMark Vantage, and 1.5x as fast in 3DMark11 — despite drawing nearly 2x as much power. This suggests that GT3’s power consumption benefits are narrowly defined: the array will sip power if you keep clock speeds constrained, but push clocks higher and the efficiency advantage evaporates quickly.
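
The same kind of sketch makes the point in numbers. Again treating TDP (28W vs. 15W) as a rough proxy for power draw, the faster part comes out well behind on efficiency:

    # perf/W of the 28W i7-4558U relative to the 15W i7-4650U, using the 1.32x
    # (3DMark Vantage) and 1.5x (3DMark11) speedups quoted above. TDP is once
    # more a crude stand-in for real power draw.
    low_tdp, high_tdp = 15.0, 28.0
    for bench, speedup in [("3DMark Vantage", 1.32), ("3DMark11", 1.50)]:
        relative_efficiency = speedup / (high_tdp / low_tdp)
        print(f"{bench}: {speedup:.2f}x faster, but only "
              f"{relative_efficiency:.2f}x the perf per watt of the 15W part")

That’s roughly 0.7x and 0.8x the efficiency of the 15W chip: cranking GT3’s clocks buys performance at a steep cost in power.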


Comments

Juan Perez

o boy here go again with this full of hot air…………haven’t they learned anything yet….x86 will not replace GPU AT ALL…………..ITS GPU REPLACING x86 CPU’S…. THAT’S WHAT AMD DOING………..REPLACING X86 with APU…WHY CAN’T INTEL SEE THIS AND UNDERSTAND THIS…..YOU WILL NEVER SURPASS 300 GFLOPS BARRIER……BECAUSE OS LIMITATION AND x86 execution limitation……I AM WONDERING HOW WILL eDram will help since it didn’t help wiiu performance….

Joel Detrow

OH MY GOD CAPS LOCK AND ELLIPSES

Joel Hruska

This post is so full of wrong it makes my head hurt.

Harry_Wild

This is a very confusing article. It’s based on mobile, but then throws in a bunch of desktop info, then goes back to mobile again, and then leaves the reader hanging.

I’m more interested in tablets and desktops than notebook graphics right now, since I have a need for both and not for a notebook.

I will just wait for now and buy a graphics card for my newer desktop. I have already updated my older desktop, and it’s better than my newer desktop for only $50.00. What a deal! If I had known about this performance difference for only $50, I would have just bought another of the older desktops, upgraded it with the same graphics card, and saved myself the difference of around $700.

Joel Hruska

There’s nothing at all about tablets.

Desktop graphics perf will be up 15-25%.

VirtualMark

I have to admit, all this talk of low-level integrated graphics is totally and utterly boring. Why can’t they just spend the extra transistors on more cache or a better CPU? Sure, give the low-end i3 its integrated graphics, but for those of us who want performance, give us a chip that doesn’t have this nonsense built in.

2x the performance isn’t going to help it run new games either. It might just barely let you scrape by on a game like Metro 2033, with pretty much everything that makes the game look nice turned off. Great, that’s really amazing news; I can’t wait to see this blocky mess.

Fast_Turtle

Since I’m planning a new build (3-year replacement cycle) for year’s end, if Haswell offers me a 10-15 percent boost in performance for the same cost as what I’ve already selected, then simple economics comes into play. In other words: take the performance boost.

The funny thing is, there are many of us building ITX-based systems around Xeons for work such as video encoding and programming that simply doesn’t require a dedicated GPU, and those systems will benefit from this improvement. I’ve looked closely at the performance gap between the comparable unlocked i7 and the Xeon E3-1275 v2 (Ivy Bridge), and the price difference is minimal and, depending on timing, in favor of the Xeon. Then there’s the ECC support built into the Xeon, which becomes more important with more system memory (I’m currently using 16GB and the new build will have at least the same, if not double), so ECC matters.

Sure, the IGP from Intel doesn’t offer great gaming performance, but then it’s not a gaming GPU. It’s a business GPU that offers acceptable performance for development work and other things that don’t depend on high video performance. In fact, if the performance improvement is sufficient and Intel succeeds in getting the driver OpenGL certified, you’re looking at something that could end up in mid-level CAD/CAM workstations. Personally, I’ve got no qualms about using Intel’s IGP, as I’ve seen the performance from the latest HD 4000 unit. It’s good enough for my purposes and saves me a bit of money since I don’t need a dedicated video card, and no, the cheap cards under $50 don’t really offer any more performance than the HD 4000.

http://www.mrseb.co.uk/ Sebastian Anthony

Hey, Turtle! Welcome back :)

Joel Hruska

VM,

Two reasons.

1) Because more cache isn’t going to get you a faster chip. We’ve hit diminishing marginal returns there, big time.

Here’s the essence of the problem: Larger caches are slower. Slower caches add latency, latency reduces performance. Obviously there are a lot of ways to tweak cache design, but slapping on more cache stops helping past a certain point.
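
To put that in toy-model terms (a sketch only; every number below is hypothetical and chosen just to illustrate the trade-off), the standard average-memory-access-time formula shows how a bigger, slower cache can end up a net loss:

    # Toy average-memory-access-time (AMAT) model; all figures are hypothetical.
    def amat(hit_latency_cycles, miss_rate, miss_penalty_cycles=150):
        # AMAT = hit latency + (miss rate * cost of going out to DRAM)
        return hit_latency_cycles + miss_rate * miss_penalty_cycles

    # A bigger L3 misses a little less often but costs more on every access.
    print(amat(hit_latency_cycles=30, miss_rate=0.10))  # 45.0 cycles
    print(amat(hit_latency_cycles=40, miss_rate=0.08))  # 52.0 cycles

Once the extra hits stop covering the extra latency, the larger cache is a net loss on average.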

2) Haswell’s graphics performance doesn’t jump as much as I thought it might, but delivering 20% improved GPU performance (let’s call that a reasonable expectation) is a much bigger deal than delivering 5-8% better CPU performance. And if Intel can keep up 20% per iteration for a few years, you *will* be playing modern games on it.

I’m not saying that they automatically will — but they’re gaining ground on where games sit.

Fast_Turtle

True, we may be seeing diminishing returns on caches, but (and this is the big BUT) generally there’s enough headroom in the CPU to handle what the GPU doesn’t. It’s one of the many reasons I’m looking at an E3 Xeon for my next system. Sure, the onboard HD 4000 doesn’t hold a candle to my current Radeon 5670, but unless the app is really CPU bound, there’s plenty of headroom to pick up the GPU slack.

AMD’s Trinity/Fusion design differs in that the main problem they’ve had was FPU performance. They weren’t able to match Intel until they began moving the GPU onto the same die and turning it into the APU. Eventually, the move will be complete, with AMD calling it the FPU, and we’ll start the entire cycle all over again with discrete GPUs offering features and performance that onboard setups can’t match. Soon we’ll be back to the same place as when the Voodoo cards hit the market.

Joel Hruska

Why would you opt for an E3 Xeon in the first place? What do you need from Xeon that consumer-level Intel hardware doesn’t offer?

Separately from that:

“AMD’s trinity/fusion design differs in the fact that the main problem they’ve had was FPU performance.”

Actually, AMD’s VLIW5 / VLIW4 / GCN cores all excel at *integer* performance. I realize that this runs counter to AMD’s discussion of GPU-as-FPU from several years back, but the benchmarks don’t lie:

Pull any comparison of code you want — you’ll find that GCN’s big advantage over Nvidia is in integer. Floating-point code is where NV is most competitive.

Second point: Nothing AMD has done to-date allows for the sort of “GPU-as-FPU” combination that they were talking about a few years ago. And, as things currently stand, that level of integration isn’t on the roadmap. Yes, there was a time when AMD definitely talked about using the GPU for FPU offload; it was one reason given for why Bulldozer used a shared FPU design.

There are two problems with that approach, however. First, there’s the practical: We’re nowhere near that level of integration. AMD’s most recent HSA roadmaps show Kaveri sharing unified pointers and memory structures by 2014. Integrating FPU functionality into the GPU means that the CPU needs to be able to dispatch FPU code to the graphics card seamlessly, interrupt context, and schedule the code with virtually no latency hit. It also means your GPU needs to understand x87, SSE2/AVX.

Even if you can handle the practical problems, programming support and the difficulty of seamlessly handing off the task may make it a non-starter. But this is a 2016-2017 era move in any case.

It’s been years since AMD talked up the idea and I don’t think that’s an accident. The push, I think, is towards wider OpenCL adoption.

Fast_Turtle

The Xeon is cheaper than the same performance from an i7, and the board supports them. So why not save money and get the better deal? I’m not normally one to chase performance, but when there’s a 10-15 percent difference in benchmarks between the two chips, why buy the slower chip for more money? That was the question I asked myself. Thus the Xeon.

The link to E.T. just goes to prove my contention that AMD is pushing the APU as their new FPU, even if the integration is taking a while. This is pretty much the same transition as between the 386 and 486, where Intel began integrating the FPU on die. Remember that the 386 had a companion chip, the 387, for mathematical work; then the 486 came out, but some models didn’t have the FPU and were called SX. People pitched a major fit about that when they discovered it, and then the Pentium was released with the FPU on die.

In regards to SSE2/AVX, those are extensions to the x86 code base, just like EM64T/AMD64 and 3DNow!, and they’re handled in the microcode. Simply put, AMD has already integrated their new FPU, based on the GPU technology they bought ATI for. All that’s left is to tweak things to boost the performance of the new design while cutting power consumption.

Even your OpenCL comment simply reinforces things. What AMD has done is relabel the FPU; they’re now calling it an APU, but it serves the same purpose. Same S***, just a different name involved. In other words, Coke and Pepsi are both cola, just labeled differently with slightly different taste (performance).

Joel Hruska

Fast_Turtle,

I don’t think you understand what has to happen for the GPU to function as FPU. Let me try to explain.

All of the following functionality is completely missing from current chips — there’s not even a predecessor of the proper capability. Right now, there’s no way for the CPU decoder units to dispatch code directly to the GPU. There’s no graceful preemption mechanism for interrupting the APU, running FPU code on it, and returning the result. The GPU’s ability to write to CPU caches is still fairly constrained, and latency is much higher than letting the CPU handle the data outright.

There’s also the question of structure. The only way for the GPU to handle native FPU code is if the GPU understands x87/SSE2/AVX, or what have you. Otherwise you have to convert the results.

You mention bringing the FPU onboard, but this is actually far more complex than that. It may never make sense to pursue this kind of configuration. But even if AMD goes for it, it’ll be 3-5 more years before it’s ready for seamless integration.

Fast_Turtle

Where’s the interruption? That’s why the APU is AMD’s new FPU. If you look at any of the block slides regarding the Athlon design, you see a meager 1-4 (maybe 8) FPU paths. The new APU has 8x that number available. So where’s the interrupt going to happen? Same place it always has: the decoder, not the FPU/APU. That’s what you’re having trouble wrapping your mind around. The APU is the new FPU. The only thing taking time is AMD tweaking the microcode and finishing the design integration.

Joel Hruska

Fast_Turtle,

I’ve spent *hours* buried in those docs. “The APU design may have originated with the Radeon GPU but internally it’s no where’s near the same.”

It’s true that the APU doesn’t *have* to look like an IGP, and that the HSA enhancements AMD has made to the PS4 and to Kaveri introduce new capabilities. Thus far, however, all modern AMD APUs, including Kabini, are based on the APU design that debuted with Llano. That means a mixture of heterogeneous and non-heterogeneous access across specialized buses, with dramatically different maximum bandwidth and capabilities.

Critically, however, nothing about Kaveri (the next-generation implementation of HSA due at the end of the year) changes the sequence of events as I laid it out above.

Look at the two graphs at the bottom that compare Trinity against Sandy Bridge, then against Ivy Bridge. That’s HD 2500 vs. HD 4000.

In an average of 15 titles, Trinity was about 1.8x faster than Sandy Bridge.

In an average of the same 15 titles, Trinity was 1.2x faster than Ivy Bridge.

Therefore: Ivy Bridge was *much* faster than Sandy Bridge (1.8/1.2 works out to roughly 1.5x).

And since we’re talking about gaming, and my comment was made specifically regarding gaming, your own perception of a lack of improvement is invalid.

Phobos

*Much* faster? Not enough, if you’re already using Sandy. Clearly, if you have more cash than common sense, by all means.

Joel Hruska

Phobos,

In what universe is 60% not “much faster?”

A 60% gain is huge. Maybe it’s still not fast *enough* for your purposes, or for my purposes. Maybe we want more. That’s fine. But if you look at a 60% performance gain in a year, within the same power envelope, and your answer is “Meh,” then I’d seriously recommend you consider both how quickly GPU performance has been improving within the same power envelope these past six years and exactly what you want a company — any company — to deliver within a 15-month product window.

As a reviewer, I consider a 60% performance increase to be a phenomenal year-on-year delivery. That doesn’t mean I’d pick Ivy Bridge HD 4000 for a gaming laptop. I wouldn’t. But it’s a huge gain.

Cyber Revengeance

I don’t think Intel is able to improve GPU performance 2x to 3x. It’s just an exaggeration for marketing, because Haswell is just one month from release. Intel always does that. It helps increase sales.

Let it come out, and then the benchmarks will show what it can do. Assume a max of 1.5x the performance. Compare the Intel Haswell GPU to AMD’s 8000-series integrated GPU.
