WE'VE HEARD REPORTS about how the upcoming Nvidia GTX680, the very first Kepler 'GK104' GPU, will beat all and sundry in everything, including AMD's top of the range Radeon HD 7970, despite the latter's new GCN architecture and its 50 per cent wider memory bus and larger memory capacity.

After all, look at the impressive block diagram. With all the brand new compute-oriented shaders and such, it does leave one impressed:

[Figure: gtx680inside - GTX680 (GK104) block diagram]

According to specifications leaked by Techpowerup, the complicated hierarchy starts with the Gigathread Engine, which marshals all the unprocessed and processed information between the rest of the GPU and the PCI-Express 3.0 system interface. Below this are four graphics processing clusters (GPCs), each holding one common resource, the raster engine, and two streaming multiprocessors (SMs). Only this time, innovation has gone into redesigning the SM, and it is now called the SMX. Each SMX has one next-generation Polymorph 2.0 engine, an instruction cache, 192 CUDA cores, and other first-level caches. So four GPCs of two SMXs each, and eight SMXs of 192 CUDA cores each, amount to the 1536 CUDA core count. There are four raster units amounting to 32 ROPs, eight geometry units each with a tessellation unit, and some third-level cache. There's a 256-bit wide GDDR5 memory interface at 6GHz declared throughput, and as noted it's a third narrower than that of the top end AMD Radeon HD 7970.
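A quick sanity check of those figures, as a minimal Python sketch. The constant and function names are made up for illustration, and reading "6GHz declared throughput" as 6 Gbit/s per memory pin is an assumption, as is deriving the HD 7970 bus width from the "50 per cent wider" figure quoted earlier.

```python
# Figures taken from the leaked specification quoted above.
GPCS = 4
SMX_PER_GPC = 2
CORES_PER_SMX = 192
BUS_WIDTH_BITS = 256
DATA_RATE_GBPS = 6.0  # assumption: "6GHz" means 6 Gbit/s per pin


def cuda_core_count(gpcs, smx_per_gpc, cores_per_smx):
    """Total CUDA cores across all SMXs."""
    return gpcs * smx_per_gpc * cores_per_smx


def peak_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Theoretical peak memory bandwidth in GB/s (8 bits per byte)."""
    return bus_width_bits * data_rate_gbps / 8


smx_total = GPCS * SMX_PER_GPC
cores = cuda_core_count(GPCS, SMX_PER_GPC, CORES_PER_SMX)
bandwidth = peak_bandwidth_gb_s(BUS_WIDTH_BITS, DATA_RATE_GBPS)

# HD 7970 bus width derived from the "50 per cent wider" claim (assumption).
hd7970_bus_width_bits = BUS_WIDTH_BITS * 3 // 2
narrower_fraction = 1 - BUS_WIDTH_BITS / hd7970_bus_width_bits

print(f"SMXs: {smx_total}, CUDA cores: {cores}")
print(f"Peak bandwidth: {bandwidth:.0f} GB/s")
print(f"Bus is {narrower_fraction:.0%} narrower than a {hd7970_bus_width_bits}-bit bus")
```

The same bus that is 50 per cent wider on the AMD side is a third narrower when seen from the Nvidia side, which is why both figures appear in the text.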

As The INQUIRER hasn't recently gotten Nvidia cards for review, I used a bit of spare time here in sunny Shenzhen, where the March all-time high of 29C heat hit us just a day before. It was a sweaty ordeal taking a public bus to a funny factory place nearly 10 miles away, in a booming city of 15 million twice the size of Greater London, but it was worth it....

Since almost anything, including the world's newest GPUs, can be found in Shenzhen, I had a quick look at an - unidentified, obviously, for the vendor's protection - GTX680 2GB card in that factory for just half an hour. I was shown some 3DMark 11 and similar benchmark results, but being a compute boffin, I ran the Sandra 2012 SP2 benchmark that I carry around on a USB stick to check GPGPU compute performance in floating point, especially double precision.

Remember, this card is supposed to be 'the crown winner' for Nvidia, since it couldn't make the bigger GK100 die on time, and all the effort was put into tuning it to the hilt to try to win against AMD, which has led the performance pack for the past year or so. Therefore, I thought I'd get some good compute performance results here, too - in particular since AMD has enabled double precision floating-point even in the mid-range Radeon HD 7870 GPU, a first in this market segment, not to mention the high end Radeon HD 7970 model. I had run the same benchmark before on both the Radeon HD 7870 and the Radeon HD 7970, on - really underclocked - reference clock versions, which AMD could push up by another 20 per cent anytime.

Here is the result:

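[Figure: nv680compute - Sandra 2012 SP2 GPGPU compute results for the GTX680 against the Radeon HD 7970, HD 7870 and HD 7850]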

Wow! The claim of beating the HD7970 vanishes into thin air, it seems. Nvidia's new GPU is beaten by the Radeon HD 7970 by an order of magnitude here in double precision floating-point, and by nearly a factor of two in ordinary single precision floating point. One is speechless here. Even the Radeon HD 7870 with its restricted double precision floating-point still outperforms the GTX680 by a noticeable margin in this department, as you can see here. Only the Radeon HD 7850 is substantially slower.

One might ask, why bother? Well, GPU compute performance can't rely on tweaked drivers, application-detection workarounds and similar shortcuts. It is pure, raw processing ability that defines a GPU's usability for general purpose computing. After all, Nvidia created the GPGPU market and the CUDA programming environment. This situation not only badly hurts its prestige in this area but also forces the need for a, say, GK110 'real Kepler high end' follow-on to be delivered soon. Not to mention, Nvidia's GPU compute optimised cards like Tesla sell for thousands apiece, even though they are based on essentially the same dies as high end consumer GPUs, so GPU compute is important.

As for the other aspects, I was shown how it is quite close to the Radeon HD 7970 in Full HD gaming performance, except where its memory bandwidth limitations, with a third narrower bus, lose to AMD in high anti-aliasing and highly textured scenarios. One interesting, and rather negative, observation over Chinese tea from the hardware guy in charge was the issue with PCB and component quality on the reference boards, something I'd leave for later when more boards are seen. If this problem is really there, though, it could affect overclocking chances rather badly.

What then? We need Nvidia to be a strong competitor with a good product line from top to bottom, to avoid further acquisition attempts, just like the one from Intel some time ago. That one almost succeeded, save for these same Chinese saving Nvidia's stock price at the last moment with a truly huge order of, guess what, expensive GPU compute cards for their Tianhe multi-petaflop supercomputer clusters in Tianjin, right next to the capital, Beijing. So, we can't say that the Chinese don't help US companies survive, even if led by a Taiwanese.

For that, Nvidia needs a true performance leading world class GPU, one that will drive a 'waterfall effect' to help the sales of its other GPUs too. Remember that Intel will also greatly improve its integrated graphics, with near doubling in the Ivy Bridge generation, followed by another massive jump in Haswell. And these on-chip GPUs will support DX11 Compute, among other things. It's not a good idea to be squeezed from both top and bottom.

So, aside from a few gaming benchmark tests, the GK104 die in the GTX680 is not exactly the cure for Nvidia's ills or a true performance leader at the moment. Nvidia urgently needs the GK110 die, especially since AMD can really easily crank up its GPUs by well over 30 per cent across the board right away - count the 20 per cent frequency jump plus driver improvements, and there you are. And did we mention the dual GPU Radeon HD 7990 followed by the Sea Islands? It's an exciting year for watching the GPU market. µ
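To see what the "well over 30 per cent" figure asks of the drivers, here is a minimal Python sketch. It assumes, purely for illustration, that performance scales linearly with clock speed and that clock and driver gains multiply; the function name is hypothetical.

```python
def required_driver_gain(total_gain, clock_gain):
    """Driver gain needed so that clock and driver gains compound to total_gain.

    Assumes (1 + clock_gain) * (1 + driver_gain) == 1 + total_gain.
    """
    return (1 + total_gain) / (1 + clock_gain) - 1


# 20 per cent frequency jump, 30 per cent overall target (both from the article).
driver_gain = required_driver_gain(total_gain=0.30, clock_gain=0.20)
print(f"Drivers would need to add about {driver_gain:.1%}")
```

Under those assumptions, a bit over eight per cent from drivers would be enough to reach the 30 per cent mark.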

Wow now they have to sell the GTX680 at around $330 so it can be between HD7870 and HD7850

$330 would be nice, since the GTX680 is a mid range card anyway. It happens to handle high clocks well, which gives it a lead in benchmarks, but overall it's not a good all rounder card, which means it's a mid range card after all.

What bullshit is this? What gamer gives a crap about compute performance? Fact is, going by the Tom's Hardware review of the 680, it beats the 7970 by a good 20%.

Well, let's see... do you use a GPU accelerated web browser, watch much Flash content, work on any 3D modeling on the side, or care about the maximum possible performance / time? As the article said, GPGPU has more to do with raw power than with driver optimization... so we may see quite a bit more out of the 79xx series over time.

Oh and if you do that other stuff you might enjoy the increased speed.

Flash and browser GPU acceleration work more than fine with any modern card. They hardly use more than a few % of the GPU at worst. As for 3D modeling, it's pretty much the same thing there, except with more GPU usage. You are not going to see a difference between these cards.

E: And if you are doing some hardcore rendering, you are a fool if you bought anything other than a Quadro or FirePro.

Plus, I couldn't care less about GPGPU performance in gaming graphics cards. I am sure GK110 will be that insane cruncher replacement for the GPGPU guys.

Removing a lot of GPGPU stuff is probably the biggest reason for GK104's greatly improved perf/w. The card is designed purely with games in mind.

Well of course. The DP performance was capped for Fermi GTX; NV sells Quadros and Teslas for very good money, so that's where they need it. I'm sure it's the same now, but this journalist has no idea what he's talking about. And yes, GK104 is not meant for GPGPU, it's TWIMTBP...

Get out. Who needs amazing compute performance to run a web browser and Flash? As for 3D modeling, who the fack cares; buy another GPU that does the job better. The 680 is about gaming, and apparently it does it rather well and better than the competition.

Back to the speculation on the topic... That means it's possible that, in time and with better optimized drivers, Tahiti could be substantially faster than GK104. End of line.

Me personally, I would love to see that. For three reasons.
1. AMD has been having issues in recent years and really needs a payoff year to keep competitive, as we all know no competition = sucks for us.
2. When was the last generation AMD got the chance to hold the single GPU crown again?...
3. I prefer ethical companies over nonethical companies. AMD falls into the former, Nvidia into the latter. http://www.hardocp.com/news/2012/02/08/amd_named_to_top_20_clean_capitalism_ranking

When Nvidia was doing very well in GPGPU with most of their last gen cards, most people used to buy Nvidia cards because they were good all rounder cards. Now AMD has turned the tables and they have good all rounder cards, and people have a problem with that??? I don't get it. I do not support AMD or Nvidia. I can purchase any card from either team, but mostly the well balanced ones.

In a perfectly ordered multithreaded application, yes. In an out of order game where the user and many other branches could change the rendering, these numbers mean shit now and later. A driver isn't going to make this much difference; W1zzard has already done rebenches with new "performance" drivers, where it helped some games a few percent and hurt others.

GPGPU will mean more to us when lazy devs get to making it work and work right.

We don't have a problem with AMD cards. But we do have a problem with bullshit articles.

I just want the GC manufacturers to keep within a certain distance of each other and not blow the other out of the water.
Why? Because it keeps the fanboys screaming for their side, which eventually leads to violating TPU rules of etiquette, which helps us keep our bansticks from rusting.

I'm a little surprised by what appears to be somewhat lackluster compute power. They have always prided themselves on that. Perhaps they are pushing that more into the professional card market, since more of the people who use high levels of compute power buy those. I know of a few people who bought their 5xx series cards for the compute power when doing non-gaming applications, but it may not be a large portion of their non-pro desktop card sales.

ATI/AMD Radeon used to do the same thing as what is happening with this GTX680 situation: they never used to care much about GPGPU, but now they are playing Nvidia's own game. At least when AMD Radeon were not focused on GPGPU, they used to price their cards right.