Nvidia's Kepler GTX 680: Powering the Next Gen

Quicker, quieter, and more power efficient, the GTX 680 is the new "world's fastest" GPU.

At GDC 2011, Epic Games (maker of Gears of War) unveiled Samaritan, an eye-popping technical demo that showed what was possible with Unreal Engine 3 and a seriously hardcore PC. It was--and still is--an impressive demo, showcasing smoothly tessellated facial features, point light reflections, and judicious use of movie-style bokeh. The demo was so impressive that Epic decided to show it again at this year's GDC, with vice president Mark Rein reiterating that Samaritan is its vision for the next generation--a "screw you" to the naysayers predicting that graphical prowess will play second fiddle to features and functionality.

The problem was, Samaritan didn't exactly run on your average gaming PC, requiring three Nvidia GTX 580 GPUs at a cost of thousands of dollars, as well as a power supply that brought Greenpeace members out in a cold sweat. And this left us wondering: if it took so much power to run the demo, what chance would the next generation of console and PC gamers have to experience it?

The answer, it turns out, was also unveiled at this year's GDC in the form of Nvidia's brand-new GTX 680 GPU, a single one of which easily powers the Samaritan demo. Aside from a few optimisations from Epic, most of that comes down to the new 28nm Kepler architecture that the 680 is based on. It features a new Streaming Multiprocessor (SMX) design, GPU Boost, new FX and TX Anti-aliasing (FXAA/TXAA) technology, Adaptive VSync, support for up to four monitors (including 3D Vision Surround), and--most interestingly of all--much reduced power consumption with an increased focus on performance per watt.

Based on specs alone, the 680 is a more powerful beast than its predecessor, but it's also more practical. The four display outputs mean you can drive four monitors (at up to 4K 3840x2160 resolution!) at once from a single card. That goes for 3D Vision Surround too, meaning you don't have to splurge on an SLI setup if you're into giving yourself a headache. Also notable is the card's reduced power consumption: it has a TDP of 195 watts, compared to 250 watts for a GTX 580, meaning you need only two six-pin connectors and can run it from a much smaller power supply.

The reduced TDP also results in reduced heat, meaning the card is easier to keep cool. The reference card has an all-new cooling setup that's much kinder on your ears, a godsend for anyone who ever had to endure the jet engine sounds of the GTX 400 series. It features special acoustic dampening material around the fan, a triple heat pipe design, and a redesigned fin stack that has been shaped for better airflow. Of course, this being a reference card, you can expect manufacturers to come up with their own crazy cooling solutions once they start shipping 680s.

Another neat feature of the 680 is a new hardware-based H.264 video encoder called NVENC. If you've ever tried to encode H.264 video, you'll know that it's a time-consuming process. While previous GTX cards sped up encoding using the GPU's CUDA cores, doing so increased power consumption. And so, in keeping with Nvidia's new power-saving attitude, the NVENC encoder consumes much less power, while also being four times as fast as the CUDA-based approach.

That means 1080p videos encode up to eight times faster than real time, depending on your chosen quality setting, so a 16-minute-long 1080p, 30fps video takes approximately 2 minutes to encode. Unfortunately, software developers need to incorporate NVENC support in their software, so at launch you're limited to using CyberLink's MediaEspresso. Support for CyberLink PowerDirector and ArcSoft MediaConverter is promised for a later date.
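The encode-time arithmetic above can be sketched in a few lines of Python. This only illustrates the article's own figures (a 16-minute clip at up to eight times real time); the function name encode_minutes is made up for the example and is not part of any NVENC or CyberLink API.

```python
def encode_minutes(video_minutes: float, speedup_vs_realtime: float) -> float:
    """Wall-clock minutes needed to encode a clip at a given multiple of real time."""
    if speedup_vs_realtime <= 0:
        raise ValueError("speedup must be positive")
    return video_minutes / speedup_vs_realtime


# The article's example: 16 minutes of 1080p video at eight times real time.
print(encode_minutes(16, 8))  # 2.0

# Hypothetical slower case (not a figure from the article): four times real time.
print(encode_minutes(16, 4))  # 4.0
```

Since the eight-times figure is quoted as an upper bound that depends on the quality setting, real encodes will probably land somewhere above the 2-minute mark.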

One other notable improvement in the GTX 680 is its PhysX performance. And, if you're anything like us, then nothing gets you more excited than realistic cloth animations and individually animated strands of hair. To demonstrate the 680's improved performance, Nvidia has put together a tech demo featuring a very hairy ape in a wind tunnel. Each strand of its fur is individually animated, with PhysX processing each movement in real time.

Nvidia also put together a demo called Fracture, which features three destructible pillars. Instead of scripted animations, it uses PhysX to calculate the destruction of an object in real time. Depending on what force the pillar is struck with and taking into account the environment and any previous damage, it falls apart in an amazingly realistic way. The obvious application for this tech is in action games, where gunfire could accurately damage buildings.

The improvements to PhysX aren't just part of a tech demo either. The PC version of Gearbox Software's upcoming Borderlands 2 is set to support many PhysX enhancements. These include water that reacts accurately to a player's movements, rippling and splashing around the environment as you walk through it. Borderlands 2 also makes use of PhysX to render destruction. For example, fire a rocket launcher into the ground, and huge chunks of earth and gravel fly into the air. The resulting debris settles on the floor, where you can kick it around by walking through it.

So those are the top-line improvements to Kepler, but there's plenty more tech to get stuck into below, where we take a more in-depth look at Nvidia's latest architecture. Or, if you just want to get straight to some benchmarks, skip ahead to the Benchmarks section.

A Closer Look at Kepler

SMX Design

Since the GTX 400 series, Nvidia GPUs have been based on the Fermi architecture, which introduced DirectX 11 and OpenGL 4.0 support, as well as a new pipeline technology for improved tessellation performance. Kepler builds upon Fermi, but instead of simply increasing clock speeds to achieve better performance, it actually decreases them in favour of having more processing (CUDA) cores operating at lower speeds. It's much like Intel's transition from the "hotter than the surface of the sun" architecture of the Pentium 4, to the much more power-efficient Core architecture.

The 680's CUDA cores reside within the SMX. Most GPU functions are performed by the SMX, including pixel and geometry shading, physics calculations, texture filtering, and tessellation. Each SMX contains 192 CUDA cores, six times as many as each of Fermi's streaming multiprocessors, and the 680 contains eight SMX blocks for a total of 1,536 cores. This increase provides two times the performance per watt of Fermi. What that means for you is a graphics card with much greater performance at a reduced power consumption of 195W, compared to 250W for a GTX 580. The reduction is so great that the card requires only two six-pin power connectors, generates less heat, and can be used with smaller power supplies such as those in Alienware's console-sized X51 desktop.
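For anyone who wants to check the sums, here is a rough Python sketch of the core count and the power budget. The 192 cores, eight SMX blocks, and the 195W and 250W TDPs come from the article. The figure of 32 cores per Fermi multiprocessor, and the 75W limits for the PCI Express slot and for each six-pin connector, are assumptions taken from the usual published specs rather than from this article, and the function names are invented for illustration.

```python
CORES_PER_SMX = 192        # from the article
SMX_COUNT = 8              # from the article
FERMI_CORES_PER_SM = 32    # assumption: not stated in the article

SLOT_WATTS = 75            # assumption: PCI Express x16 slot power limit
SIX_PIN_WATTS = 75         # assumption: limit per six-pin connector


def total_cores(units: int, cores_per_unit: int) -> int:
    """Total CUDA cores across all multiprocessor blocks."""
    return units * cores_per_unit


def board_power_budget(six_pin_connectors: int) -> int:
    """Watts available from the slot plus the given number of six-pin connectors."""
    return SLOT_WATTS + six_pin_connectors * SIX_PIN_WATTS


print(total_cores(SMX_COUNT, CORES_PER_SMX))   # 1536
print(CORES_PER_SMX // FERMI_CORES_PER_SM)     # 6

budget = board_power_budget(2)
print(budget)          # 225
print(budget >= 195)   # True: enough for the GTX 680's TDP
print(budget >= 250)   # False: not enough for a GTX 580's TDP
```

Under those assumptions, two six-pin connectors cover the 680's 195W with some room to spare, which fits the article's point about smaller power supplies.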

GPU Boost

Another performance-pushing feature of the 680 is GPU Boost, which dynamically overclocks the GPU from the base clock speed of 1GHz. That's possible thanks to some clever monitoring of the GPU's Thermal Design Power (TDP). Many games and applications don't tax the GPU to its maximum, leaving some TDP headroom available. In those cases the clock speed of the GPU is increased on the fly, typically by around 5 percent, but sometimes by as much as 10 percent, giving you a boost in performance.

What's neat is that the feature is entirely automated, being integrated into the 680 drivers and hardware, so you get better performance with little effort on your part. That's not to say performance junkies can't push things further; like all previous GeForce cards, the 680's base clock can be overclocked as far as you're willing to push it.
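As a very rough illustration of the idea, the Python sketch below raises the clock by a fixed percentage only when the card is drawing less than its TDP. It is a toy model, not Nvidia's actual algorithm: the function name, the all-or-nothing boost, and the 160W power draw are all assumptions made up for the example. Only the 1GHz base clock, the 195W TDP, and the 5 to 10 percent range come from the article.

```python
def boosted_clock_mhz(base_mhz: int, power_draw_w: float,
                      tdp_w: float, boost_percent: int) -> float:
    """Return the clock after applying a boost only when there is TDP headroom."""
    if power_draw_w >= tdp_w:
        return float(base_mhz)  # no headroom left: stay at the base clock
    return base_mhz * (100 + boost_percent) / 100


BASE_MHZ = 1000   # the article's 1GHz base clock
TDP_W = 195       # the article's TDP figure

# Hypothetical game drawing 160W, leaving headroom under the TDP.
print(boosted_clock_mhz(BASE_MHZ, 160, TDP_W, 5))    # 1050.0 (typical boost)
print(boosted_clock_mhz(BASE_MHZ, 160, TDP_W, 10))   # 1100.0 (best case)

# Hypothetical load that already uses the full TDP.
print(boosted_clock_mhz(BASE_MHZ, 195, TDP_W, 5))    # 1000.0
```

In practice the boost presumably scales with how much headroom is available rather than switching between two fixed states, but the sketch shows why lighter workloads are the ones that benefit.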

FXAA

One of the key features in getting Samaritan to run on just a single graphics card is FXAA, Nvidia's own custom anti-aliasing. Good anti-aliasing is important if you don't want your games to look like a jaggy mess around the edges of objects, with most modern games using some form of it. The most common is multi-sample anti-aliasing (MSAA), which was used extensively in the first Samaritan demo. While MSAA produces some beautiful results, it's a bit of a resource hog.

Nvidia's FXAA produces similar (if not better) results using a pixel shader image filter that is applied alongside other post-processing effects like motion blur and bloom. And it does so using much less of the GPU's resources. That means a performance hit of around 1ms per frame or less, resulting in frame rates that are around two times higher than with 4xMSAA.
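To put that 1ms figure in context, the following Python sketch converts a frame rate to a frame time, adds a fixed per-frame cost, and converts back. The 60fps and 30fps baselines are hypothetical starting points chosen for illustration, and the function name is invented; only the roughly 1ms cost comes from the article.

```python
def fps_with_added_cost(baseline_fps: float, added_ms: float) -> float:
    """Frame rate after adding a fixed per-frame cost in milliseconds."""
    frame_ms = 1000.0 / baseline_fps      # time per frame before the extra work
    return 1000.0 / (frame_ms + added_ms)


# Hypothetical baselines with a 1ms post-processing cost added to each frame.
print(round(fps_with_added_cost(60, 1.0), 1))   # 56.6
print(round(fps_with_added_cost(30, 1.0), 1))   # 29.1
```

A fixed cost in milliseconds hurts less the slower the game is already running, which is part of why a cheap post-process filter is attractive in demanding scenes.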

While FXAA has been around for some time, it has previously been dependent on developers' implementing it: if the game you'd bought didn't support it, you were out of luck. With the GTX 680, FXAA can be turned on from the Nvidia control panel, making it compatible with most games.

TXAA

And, as if that weren't enough AA talk for you, the Kepler architecture also features TXAA, which is a brand-new film-style AA technique that works exclusively on the GTX 680. It's a mix of hardware anti-aliasing, a custom CG film-style AA resolve, and--in the case of TXAA 2--an optional temporal component for better image quality.

Like FXAA, it also requires much less processing power, resulting in better performance. TXAA offers similar visual quality to 8xMSAA, but with the performance hit of 2xMSAA, while TXAA 2 offers image quality that is superior to 8xMSAA, but with the performance hit of 4xMSAA.

It's certainly impressive but will require game developers to support it in future titles, so you won't be able to go TXAA crazy from day one. That said, support has been promised for MechWarrior Online, The Secret World, Eve Online, and Borderlands 2, as well as by BitSquid, Slant Six Games, Crytek, and Epic for its upcoming Unreal Engine 4.

Adaptive VSync

If you get annoyed by screen tearing or random stuttering in your favourite games, then Adaptive VSync is for you. VSync is the process of presenting new frames at the same rate as your monitor refreshes, that is, 60fps for your typical 60Hz monitor. The problem is, if you suddenly hit a particularly taxing area in your game that causes the frame rate to drop, rather than simply decreasing slightly, the frame rate drops right down to 30fps. This causes noticeable stuttering.
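The sudden drop to 30fps follows from frames only being shown on a refresh. A minimal Python sketch, assuming plain double-buffered VSync and a steady render time (neither of which the article spells out), rounds each frame up to a whole number of refresh intervals. The function name is made up for the example.

```python
import math


def vsync_fps(render_fps: float, refresh_hz: float = 60.0) -> float:
    """Displayed frame rate when every frame must wait for the next refresh."""
    if render_fps <= 0:
        raise ValueError("render_fps must be positive")
    # A frame that misses a refresh is held until the following one,
    # so each frame occupies a whole number of refresh intervals.
    intervals = max(1, math.ceil(refresh_hz / render_fps))
    return refresh_hz / intervals


print(vsync_fps(75))   # 60.0: faster than the display, capped at 60
print(vsync_fps(59))   # 30.0: just missing 60fps halves the rate
print(vsync_fps(29))   # 20.0: the next step down
```

Under these assumptions, losing a single frame per second of rendering speed is enough to cut the displayed rate in half, which is the stutter described above.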

You might imagine the solution is to simply turn VSync off, but that presents its own problems. With VSync off, new frames are presented immediately, which causes a visible tear line onscreen at the switching point between old and new frames. It's exacerbated at higher frame rates, where the tearing gets bigger and more distracting.

Nvidia's solution to this predicament is Adaptive VSync, which switches VSync on and off on the fly. If your frame rate drops below 60fps, VSync is automatically disabled, thus preventing any stuttering. Once you hit 60fps again, VSync is turned back on to reduce screen tearing.
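The switching rule described above can be written as a small Python sketch. It is a toy model of the behaviour as the article describes it, not the driver's real logic; the function name and the sample frame rates are assumptions for illustration.

```python
from typing import Tuple


def adaptive_vsync(render_fps: float, refresh_hz: float = 60.0) -> Tuple[bool, float]:
    """Return (vsync_enabled, displayed_fps) for a given render rate."""
    vsync_on = render_fps >= refresh_hz
    # With VSync on, output is capped at the refresh rate; with it off,
    # frames are shown as fast as they are rendered.
    shown_fps = refresh_hz if vsync_on else float(render_fps)
    return vsync_on, shown_fps


# Hypothetical render rates as a game moves through lighter and heavier scenes.
for fps in (75, 60, 52, 45):
    print(fps, adaptive_vsync(fps))
# 75 (True, 60.0)
# 60 (True, 60.0)
# 52 (False, 52.0)
# 45 (False, 45.0)
```

The point of the design is that a dip to 52fps is shown as 52fps rather than being forced down to 30fps, at the cost of possible tearing while VSync is off.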

Like FXAA, Adaptive VSync doesn't require developer support, so you can simply turn it on in the Nvidia control panel when needed.

Benchmarks

Because the release of a new graphics card wouldn't be complete without some benchmarks and bar charts, we've rounded up three of the most GPU-taxing games we could find to put the GTX 680 through its paces. We also nabbed a GTX 580 and an AMD Radeon HD 7970 for some added competition. Each game was run at maximum settings, with AA enabled and at a 1080p resolution. For the Unigine Heaven benchmark, tessellation was set to extreme.

Judging by our very orange bar charts, it's easy to see what a great performer the GTX 680 is. Not only is it a step up from its predecessor, but it also outperforms AMD's high-end HD 7970 by some margin.

What the charts don't tell you, though, is just how damn quiet the 680 is. There's a very noticeable difference in volume between it and the other cards, particularly when it's running at full whack. If your gaming PC lives under the TV in the living room, or you have it in the same room as your significant other, you'll definitely appreciate the decreased volume.

There's a price to pay for all that quiet power, though, with the GTX 680 retailing at around £429, which is pretty much the same price as AMD's HD 7970. With its greater performance, quiet cooling, support for four monitors, much-reduced power consumption, and a bunch of new technologies under the hood, the GTX 680 is easily the better choice. Plus, if you're into headaches, Nvidia's 3D Vision Surround is much more widely supported than AMD's solution, and it performs better too.

The question is, does anyone actually need a card like the GTX 680? After all, a previous-generation GTX 580 can run pretty much anything you throw at it at maximum settings. And while Nvidia hasn't announced anything just yet, it's likely there will be midrange 600 series cards to follow later in the year at a much cheaper price point.

But if Epic's Samaritan demo really is what the next generation of games is going to look like--maybe even more so with Unreal Engine 4--then cards like the GTX 680 are just what the gaming industry needs to push through technological advances and create experiences that can astound, and make us more immersed in our favourite games than ever before. And with its advances in power consumption, there's a chance--however slim--that something like it might just make its way into a next-generation console.

You might not need a 680 just yet, but as soon as a game that looks as good as Samaritan hits, you'll definitely want one.

triple monitor setups are b*tchin'..... quad seems too much. However, I can't wait until we have six-monitor support with a chair that has a protracting keyboard that turns based on the movement of the mouse.

Sounds awesome, but we need more games developed for the pc that can take advantage of the power. I've had a GTX580 for a year now and it hasn't even broken a sweat. What a waste of money. Unless you're playing on multiple displays, what's the point. I should have stuck with my 5770.

@BrassBullet
You are right about the consoles needing hardware like this in the next Xbox and PS, but it's already been stated (though not officially) that the next Xbox GPU will be a 6780 equivalent, so I very, very much doubt a 680 will go into the next-gen consoles. So if the next Xbox does have a 6780 and Sony sticks with Nvidia, I reckon it will be something like a 650, or whatever they call it. We shall wait and see. Oh, and I don't think it was mentioned in this article, but I have read in other articles on the 680 that its codename indicates it may not actually be the highest-end single GPU Nvidia releases. Hell, they may even be going back to the old 680 Ultra naming scheme, which would be awesome lol. I mean, they have plenty of headroom to increase it because of temps and power usage, so Nvidia could really spice things up and release some super-high-end single GPU that leaves ATI's 7970 in the dust. Btw, I own two 6970s in CrossFire, so I'm not an ATI fanboy, but I'm liking what both companies are doing, although prices are too expensive in my opinion.

As a high-end PC GPU I'm underwhelmed by the 680, but I think it might just be the perfect fit for the new consoles. First, it's very small, about half the size of the 580, which will make it cheap to mass-produce for a low-margin item like a game console. Second, they stripped out a lot of the GPGPU performance. GPGPU is the sole reason enterprises buy Nvidia products, so the 680 is clearly not meant to win that market like the last few flagships. Lastly, and largely because of the last two reasons, it is the most power-efficient high-end GPU. The space and cooling constraints of a console make this a huge selling point. Especially because AMD has chosen to add more features unnecessary for a console, I wouldn't be surprised in the least if the 680 is exactly the GPU that shows up in the Nextbox or PS4.

The GTX 680 is an excellent chip in every way, but I feel it is being a little overhyped. The Radeon HD 7970 card has more memory (3GB vs 2GB) and actually performs on par with and often beats the GTX 680 at ultra-high resolutions, such as 2560x1600, or multiple-monitor setups such as 5760x1080, which is where such cards will realistically be used.
AMD aren't in trouble, they just need to price the 7970 about $20 cheaper than the GTX 680 to remain competitive.
Plus, aside from gaming, the 7970 is proven to have much more compute power (read any professional review, such as Tom's Hardware's for details).

It's amazing really how, in just 2 years since the 4xx series came out, the technology difference between the two is insane. Kepler is clearly the next gen over Fermi, but around 2014 it's expected that the next-gen Maxwell will outperform Kepler by up to 10 times. Also, with Nvidia putting out a roadmap for mobile (Tegra) performance that will soon match PC performance, also with a next-gen architecture, Nvidia, the way I see it, is pushing humanity's hardware technology forward really fast.

@NeoEnigma
I agree completely. I was just stating that in terms of power, if anyone was confused by the article. The prime choice for high-end GPU would be the GTX 680, no doubt. I prefer a single GPU setup over sli/crossfire any day of the week. One of the main points of the GTX 680 is less power consumption. So in terms of performance/watt, it's hands down the best. For $500 USD, it's way better than the GTX 580 when it launched.

@Elann2008 - but with so many games having broken or no SLI support at all, I'd sooner sell my current 580 and buy a 680 rather than get a 2nd 580 and SLI it. I Just feel like it's the more stable and reliable option. My 580 will be on sale soon... :)

I can run anything with my AMD budget PC at 1680x1050. (Phenom II 955BE and OCd 5850) If I ever do decide to play at 1920x1080 I'll definitely be getting one of these. By then the price will have dropped as well! Win-win.

I am currently running 2 560 ti cards, mixed w/ my 8 gb ram and my amd phenom 6 core, atm There is no need for me to upgrade, But when new games start to run slow, or dx12 comes out, Then it is Time !
( imo )

@ll0stryker0ll
This "review" gets so many things wrong, it's a complete joke. Sounds like it was copied out of an Nvidia press junket. 3 benchmarks, and only two of them games? At 1080p? For a high-end graphics card? Might as well just draw some lines with crayon.
This card is good enough to stand on its own merits, but Nvidia still feels the need to play games with the press. It's disappointing, really.

@markiewicz
"You are wrong because the CPU clocks are provided by NVIDIA and come with a warranty. The card is designed to work within these frequencies at an exact power consumption and heat output treshold."
One would hope so. The problem is that Nvidia will do whatever it can to maintain the perception of a performance advantage. So it wouldn't surprise me if these cards started failing prematurely because they were clocked too high.
"Difference is that Nvidia guarantees every single card you buy will hit the quoted clockspeed, including Boost."
Yeah, but Nvidia's guarantee isn't exactly rock solid. See bumpgate.
I mean, I wouldn't worry too much about the 680, but Nvidia in general has burned quite a few bridges in the past, which is the only reason I'm still hesitant to buy a sweet-looking card such as this one.

@B-boy
You're joking, right? Please tell me you are joking and you are not dumb enough to believe what you wrote. If you're being serious, then I suggest doing some major research before posting such nonsense in the future. Not trying to insult you here, but I have to agree with ihsiep and say that was ignorant indeed.

naryanrobinson
$500 is the retail price for end consumers and will never mean their costs are around $450 or something!! This is a technology sell and one of the best in the market, and that is why it costs so much to end consumers, as the majority of companies are not even capable of producing such items!! Giants in tech like Nvidia always run up a lot of costs in research and development!! For a product to be successful they need to cover those costs!! What I believe is that Nvidia and ATI will be dying to get their cards used in one of those Microsoft and Sony machines!! The larger the sale they have secured, the lower the costs can be!!
So all those who say Microsoft or Sony may use these cards in their next-gen consoles might be right!!
Clearly you are not related to costing, finance or even accounts field!! :-p

hmmmm I think I'll wait for this to come out so that the 580 comes down a bit more then nab another 580 to SLI. Not worth me upgrading just yet - I only upgraded from a 768mb 280 to a 3gb 580 about a month ago and I'm happy as Larry. The FPS gain to price ratio is a no brainer for me.

@bluebird08 That's assuming there will be new consoles in the near future. The way things look, doesn't look like there will be a new one within a few years. But I digress here, I am a PC gamer, so that doesn't bother me too much :P.