You may want to note that what's unique about the 7970 is not that it can reach an 18% overclock on stock volts, but that it is a top-end card with 18% of headroom. The 6970 had nothing close to that, as Techspot found, so with the more expensive 7970 the headroom should be factored into the cost equation - the price premium for top-end cards rarely comes with this bonus.

As an example, the HD5850, which was introduced as a high mid-range card, could typically reach a 17% overclock at stock volts (both of mine do), and the GTX460 was similar in this regard. That's why they were such value cards. But there's nothing entirely new about this kind of overclocking headroom at stock volts - it's not reserved only for CPUs, as you suggested.

To clarify things, the point I was attempting to make was in reference to high-end cards - the 580, 6970, 5870, and the like. Mid-range cards have traditionally overclocked better because there's plenty of thermal and power headroom to work with, which is consistent with Techspot's findings. In any case, I've slightly edited the article to clarify this point.

I think people will be disappointed with the overclocking part of this article - namely, that you didn't do any voltage adjustments. Many wanted to see where the voltage sweet spot is (the best overclock without going too high, and how increased voltage affects heat and power), as you often do with CPUs.

On the flip side, I would have liked to see undervolting explored. I saw someone mention that they had dropped the voltage while maintaining clocks, which cut power consumption by a fair margin with no loss in performance.

Considering that this is a reference card, I consider overclocking without voltage adjustment to be far more important. The 7970 is not an overengineered card like the 6990/5970, which were specifically built to be overvolted. It should be possible to give it some more voltage, but given the lack of design headroom in the power circuitry and the cooler, what you can achieve on stock voltage is much more important, since it's all "free" performance.

Ryan - as usual, thanks so much for being responsive to feedback. And thanks for putting this article together - very informative. That PCIe scaling analysis will be referenced for years to come, in my opinion.

By the way, I agree that stock-voltage overclocking is worth exploring. It is a totally separate beast from overvolted overclocking, which not everyone has the skill or knowledge to attempt. The promise of higher performance with essentially no risk of hardware damage is truly a freebie, as you noted.

Yep Termie, now the hyper-enthusiast experts with their 7970s are noobs without the skill to overclock...

Can you AMD fans get together sometime and agree on your massive fudges once and for all? We just heard that no one but the highest of all gamers and end-user experts buys these cards, with the intention of overclocking the 7970 to the hilt, since the expert demands the most performance for the price...

We just heard MONTHS of that crap - now it's the opposite....

Suddenly, the $579.00 amd fanboy buyers can't overclock...

How about this one- add this one to the arsenal of hogwash...

" Don't void your warranty !" by overclocking even the tiniest bit..

( We know every amd fanboy will blow the crap out of their card screwing around and every tip given around the forums is how to fake out the vendor, lie, and get a free replacement after doing so )

Fair enough. I didn't mean it as a real criticism - it was more of a nitpick. I realize the state of voltage control on video cards isn't exactly stellar, and I'm sure AMD/nVidia aren't keen on you doing it.

It's certainly not as robust as CPU voltage adjustment is today - I didn't mean to conflate the two, as I understand there's a pretty significant disparity.

I should have expanded on my comment a bit more. I have a hunch AMD is being pretty conservative on voltage with these (in both directions: it's higher than it needs to be, but it's not as high as it could fairly safely be either). Firstly, probably to play it safe with chips from the new process, but I also think they're giving themselves some breathing room for improvement. After 40nm, they probably didn't want to go for broke right out of the gate, and instead left some extra that they could push as needed (they have space to release a 7980, something in line with the 4890). Considering the results, it's not like they really need to, especially coupled with the rumored 28nm issues.

Oh, and likewise to Termie, I do still appreciate the work and realize you can't please everyone. I liked the update, and I actually think you did enough to touch on the subject in the 7950 review (namely, addressing the current lack of quality software voltage management for GPUs).

It seems like we've just finished watching most major engines - Unreal Engine 3, FROSTBITE 2.0, CryEngine 3 - transition to a deferred rendering model. Is it very difficult for developers to modify their existing forward renderers to incorporate the new lighting technique used in the Leo Demo? Otherwise, given the investment developers have put into deferred rendering, I'm guessing they're not looking to transition back to an improved forward renderer anytime soon.

On a related note, you mentioned the lack of MSAA is a problem common to DX10+. Given that this improved lighting technique requires compute shaders, is it actually DX11-GPU-only, i.e. does it require CS5.0, or can it be implemented in CS4.x to support DX10 GPUs? According to the latest Steam survey, the majority of GPUs by far are still DX10, so game developers won't be dropping support for them for a few years. Some games do support DX11-only features like tessellation, but I presume that having to implement two different rendering/lighting models is a lot more work, which could hinder adoption if the technique isn't compatible with DX10 GPUs.

No one has tested the 7970 in a CrossFire configuration under PCIe 3.0. I would expect the increased bandwidth to matter most in that environment. I realize the 7800 series will be a better candidate for CrossFire given price, heat, and power consumption, but a test with the 7900 series would show the potential.

If I remember correctly, TB provides the bandwidth of a PCIe x4 connection. So if a high-end card like this isn't bottlenecked by that much constraint, things look good for external graphics! You'd need a separate power plug, of course, but it now looks feasible.

TB controllers have a PCIe 2.0 x4 back end, but the protocol adapter can only pump 10Gbps, so Thunderbolt devices essentially share the equivalent of 2.5 lanes of PCIe 2.0. I was hoping PCIe 3.0 x1 performance would be tested as well, since that would show bottlenecking very similar to what could be expected from a Thunderbolt-connected GPU.

We'd see less than a doubling of bandwidth if TB 2.0 just went from PCIe 2.0 to 3.0 clocks, because TB already incorporates a high-efficiency encoding like 3.0 does. That's why a TB 1.0 connection can carry 2.5 PCIe 2.0 lanes' worth of data over a channel whose raw capacity is only 2 lanes wide.
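The 2.5-lane figure checks out with some back-of-the-envelope arithmetic. A minimal sketch, assuming the usual published numbers (PCIe 2.0 at 5 GT/s per lane with 8b/10b encoding, and a 10 Gbps Thunderbolt 1.0 channel that is already efficiently encoded):

```python
# PCIe 2.0: 5 GT/s per lane, but 8b/10b encoding means only 80% is payload.
PCIE2_RAW_GTPS = 5.0
PCIE2_ENCODING = 8.0 / 10.0
pcie2_lane_gbps = PCIE2_RAW_GTPS * PCIE2_ENCODING  # 4.0 Gbps usable per lane

# Thunderbolt 1.0 channel: 10 Gbps, already counted post-encoding.
TB1_CHANNEL_GBPS = 10.0

equivalent_lanes = TB1_CHANNEL_GBPS / pcie2_lane_gbps
print(equivalent_lanes)  # → 2.5
```

Which is also why simply doubling the clock wouldn't double usable bandwidth: there's no 8b/10b overhead left to reclaim on the TB side.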

I don't really understand why dumb SSAA would be so hard to implement in a game-independent, API-independent, renderer-independent fashion. The driver can simply present a larger framebuffer to the game (say, 3840x2160 for a 1080p game) and, as a final step before swapping the buffer, average the pixel values in 2x2 blocks, supersampling down to the target resolution.

I mean, this is how antialiasing used to work in the days before MSAA, and while there's a big performance penalty, it has the virtue of working in any scenario, on any content or geometry.
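The resolve step being described is trivial in principle. A toy sketch of an ordered-grid 2x2 downsample (illustrative only - a real driver would do this on the GPU, and the function name here is made up):

```python
import numpy as np

def resolve_ssaa_2x2(framebuffer):
    """Average each 2x2 pixel block: (2H, 2W, C) buffer -> (H, W, C) image."""
    h2, w2, c = framebuffer.shape
    # Group pixels into 2x2 blocks, then average within each block.
    blocks = framebuffer.reshape(h2 // 2, 2, w2 // 2, 2, c)
    return blocks.mean(axis=(1, 3))

# e.g. render internally at 2160p, present at 1080p
hi_res = np.random.rand(2160, 3840, 3).astype(np.float32)
lo_res = resolve_ssaa_2x2(hi_res)
print(lo_res.shape)  # (1080, 1920, 3)
```

The cost is the 4x pixel workload of rendering at double resolution, not the resolve itself, which is a single cheap pass.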

So PCIe at 4GB/s (2.0 x8 or 3.0 x4) is where high-end cards start dropping off and showing noticeable differences in performance. That is definitely going to be the big advantage IVB brings to the mainstream, as you'll be able to get 8GB/s in an x8/x8 config with PCIe 3.0 cards.
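Those per-config numbers follow directly from the per-lane rates. A rough sketch, assuming the standard figures (PCIe 2.0: 5 GT/s with 8b/10b encoding; PCIe 3.0: 8 GT/s with 128b/130b):

```python
def pcie_gbps(lanes, gen):
    """Approximate per-direction PCIe bandwidth in GB/s for a given config."""
    rates = {
        2: (5.0, 8 / 10),      # PCIe 2.0: 5 GT/s per lane, 8b/10b encoding
        3: (8.0, 128 / 130),   # PCIe 3.0: 8 GT/s per lane, 128b/130b encoding
    }
    gtps, encoding = rates[gen]
    return lanes * gtps * encoding / 8.0  # bits -> bytes

print(pcie_gbps(8, 2))  # 2.0 x8 → 4.0 GB/s
print(pcie_gbps(4, 3))  # 3.0 x4 → ~3.94 GB/s (nearly identical to 2.0 x8)
print(pcie_gbps(8, 3))  # 3.0 x8 → ~7.88 GB/s (the IVB x8/x8 case)
```

The near-equality of 2.0 x8 and 3.0 x4 is why both configs sit at the same performance cliff.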

It'd be interesting if you could do a comparison at some point on the combined impact of VRAM capacity, memory bandwidth, and PCIe bus speed. An ideal candidate would be a card with double-VRAM variants, like a GTX 580 or 6970, that's still fast enough to make things interesting.

Also, interesting discussion on the MSAA situation. That helps explain why enabling MSAA has caused VRAM usage to balloon in recent games like BF3, Skyrim, and Crysis: that extra G-buffer with all that geometry data. Is this what Nvidia was doing in the past with their AA override compatibility bits - telling their driver to store intermediate buffers for MSAA? Also, wasn't DX10.1/11 supposed to help with this via the ability to read back the multisample depth buffer?

In any case, I for one welcome FXAA. While it does have a blurring effect, the AA it provides with virtually no loss in performance is amazing. It lets me run much lower levels of AA (4xMSAA + 4xTSAA max, or even 2x+2x) in conjunction with FXAA to achieve better overall AA at the expense of slight blurring. MSAA+TSAA+FXAA gives me full-scene AA results similar to the much more performance-expensive SGSSAA.

LOL, yeah, because with Nvidia's 780 coming out in a month I'm gonna go blow a load of cash on a card that's only marginally faster than the 580... riiiight. Nvidia released a performance slide for the 780 vs. the 580: the 780 was more than twice as fast as the 580 in all the games they tested, some almost 2.5x as fast. If the rumored specs are true, it will have almost identical specs to the 590, only on a 28nm die in a single chip. This is why you never jump at the first offerings of a new generation of cards - especially when, if you've been doing your research, you know both chips are being made at the same foundry, both taped out around the same time, and AMD went with the lower-power chips first instead of high-k metal gates like Nvidia did. Now Nvidia is doing a hard launch, not a paper launch, at the end of March. Way to jump the gun, dude.

Microsoft Flight Simulator X is a "game" that is limited by a PCIe 2.0 x16 bus (when not using the buggy DX10 preview). You can easily see this when you fly low over a forest with loads of autogenerated trees.

This information also makes me dream about the unannounced Xbox. Is this compatible with the current eDRAM-based hardware AA on the Xbox 360? Another question: could this possibly help a mythical next-gen gaming console do more in hardware?

I like the visuals of the demo - surface materials, lighting, and shadows all look natural and refreshing. It has the quality of an offline renderer. It's much better than most games out there today, in which all surfaces look alike: overly shiny materials, unnatural glow, and general blurriness. I know a lot of that has to do with the hardware limitations of consoles, and developers like to use excessive post-processing to hide it, but the look is getting old.

... I see that you have stated the consequences for the PCIe lanes if you use a non-IVB processor, but not what happens if you have a PCIe 2.0-compliant GPU, and in SLI at that! An article on how the lanes would be distributed under all scenarios would also help: PCIe 2.0 with single/dual/triple GPUs, and PCIe 3.0 with single/dual/triple GPUs!

I am very excited about this new technique for calculating lighting in a forward renderer. Deferred MSAA is a disaster, and post-AA gives mediocre results, so I really hope we'll see a move back to forward rendering in the next iteration of engines, in 2-3 years.