With Rahul having covered the basis of Titan’s strong compute performance, let’s shift gears a bit and take a look at real world usage.

On top of Rahul’s work with Titan, as part of our 2013 GPU benchmark suite we put together a larger number of compute benchmarks to try to cover real world usage, including the old standards of gaming usage (Civilization V) and ray tracing (LuxMark), along with several new tests. Unfortunately that got cut short when we discovered that OpenCL support is currently broken in the press drivers, which prevents us from using several of our tests. We still have our CUDA and DirectCompute benchmarks to look at, but a full look at Titan’s compute performance on our 2013 GPU benchmark suite will have to wait for another day.

For their part, NVIDIA of course already has OpenCL working on GK110 with Tesla. The issue is that somewhere between that and bringing up GK110 for Titan by integrating it into NVIDIA’s mainline GeForce drivers – specifically the new R314 branch – OpenCL support was broken. Since OpenCL is already known to work on GK110, we expect this will be fixed in short order, but it’s not something NVIDIA checked for ahead of the press launch of Titan, and it’s not something they could fix in time for today’s article.

Unfortunately this means that comparisons with Tahiti will be few and far between for now. Most significant cross-platform compute programs are OpenCL based rather than DirectCompute based, so short of games and a couple of other cases such as Ian’s C++ AMP benchmark, we don’t have too many cross-platform benchmarks to look at. With that out of the way, let’s dive into our condensed collection of compute benchmarks.

We’ll once more start with our DirectCompute game example, Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. While DirectCompute is used in many games, this is one of the few games with a benchmark that can isolate the use of DirectCompute and its resulting performance.

Note that for 2013 we have changed the benchmark a bit, moving from using a single leader to using all of the leaders. As a result the reported numbers are higher, but they’re also not comparable with the results for this benchmark in our 2012 datasets.

Civilization V launched in 2010, and graphics cards have become significantly more powerful since then, far outpacing growth in the CPUs that feed them. As a result we’ve rather quickly drifted from being GPU bottlenecked to being CPU bottlenecked, as we see both in our Civ V game benchmarks and our DirectCompute benchmarks. For high-end GPUs the performance difference is rather minor; the gap between GTX 680 and Titan for example is 45fps, or just less than 10%. Still, it’s at least enough to get Titan past the 7970GE in this case.
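As a quick sanity check on figures like these, the arithmetic can be sketched in a few lines of Python. This is a minimal sketch rather than part of our benchmark tooling: the function names are hypothetical, and it assumes the percentage is measured relative to the slower card’s score. Under that assumption, a 45fps gap that works out to just under 10% implies a GTX 680 result somewhat above 450fps.

```python
def relative_gap_percent(baseline_fps: float, faster_fps: float) -> float:
    """Gap between two results, as a percentage of the slower (baseline) result."""
    return 100 * (faster_fps - baseline_fps) / baseline_fps


def implied_baseline_fps(gap_fps: float, gap_percent: float) -> float:
    """Baseline score at which an absolute gap equals the given percentage."""
    return gap_fps * 100 / gap_percent


# A 45fps gap equals exactly 10% at this baseline; a gap of "just less than
# 10%" therefore means the real baseline sits a little above this value.
threshold = implied_baseline_fps(45, 10)
print(f"A 45fps gap is under 10% when the baseline exceeds {threshold:.0f}fps")
```

The point of the exercise is simply that at scores this high, a double-digit fps gap is a small relative difference, which is what we would expect from a CPU-bound test.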

Our second test is one of our new tests, utilizing Elcomsoft’s Advanced Office Password Recovery utility to take a look at GPU password generation. AOPR has separate CUDA and OpenCL kernels for NVIDIA and AMD cards respectively, which means it doesn’t follow the same code path on all GPUs but it is using an optimal path for each GPU it can handle. Unfortunately we’re having trouble getting it to recognize AMD 7900 series cards in this build, so we only have CUDA cards for the time being.

Password generation and other forms of brute force crypto are an area where the GTX 680 is particularly weak, thanks to the various compute aspects that have been stripped out in the name of efficiency. As a result it ends up below even the GTX 580 in these benchmarks, never mind AMD’s GCN cards. But with Titan/GK110 offering NVIDIA’s full compute performance, it rips through this task. In fact it more than doubles performance from both the GTX 680 and the GTX 580, indicating that the huge performance gains we’re seeing are coming from not just the additional functional units, but from architectural optimizations and new instructions that improve overall efficiency and reduce the number of cycles needed to complete work on a password.

Altogether at 33K passwords/second Titan is not just faster than GTX 680, but it’s faster than GTX 690 and GTX 680 SLI, making this a test where one big GPU (and its full compute performance) is better than two smaller GPUs. It will be interesting to see where the 7970 GHz Edition and other Tahiti cards place in this test once we can get them up and running.
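To put a rate like 33K passwords/second in perspective, the sketch below estimates how long a brute force search of a given keyspace would take at a fixed rate. It’s a back-of-the-envelope illustration only: the function names, the 26-letter character set and the 1 to 6 character length range are assumptions chosen for the example, not AOPR settings, and a real search will on average find the password well before exhausting the keyspace.

```python
def keyspace_size(charset_size: int, min_len: int, max_len: int) -> int:
    """Number of candidate passwords across all lengths in [min_len, max_len]."""
    return sum(charset_size ** n for n in range(min_len, max_len + 1))


def seconds_to_exhaust(keyspace: int, rate_per_sec: float) -> float:
    """Worst-case time to try every candidate at a constant rate."""
    return keyspace / rate_per_sec


TITAN_RATE = 33_000  # passwords/second, the Titan figure quoted above

# Hypothetical search: lowercase letters only, 1 to 6 characters long.
candidates = keyspace_size(charset_size=26, min_len=1, max_len=6)
hours = seconds_to_exhaust(candidates, TITAN_RATE) / 3600
print(f"{candidates:,} candidates, about {hours:.1f} hours worst case")
```

Because the keyspace grows exponentially with password length, doubling the rate only buys a fraction of one extra character, which is why throughput differences between cards matter less than they might first appear for long passwords.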

Our final test in our abbreviated compute benchmark suite is our very own Dr. Ian Cutress’s SystemCompute benchmark, which is a collection of several different fundamental compute algorithms. Rahul went into greater detail on this back in his look at Titan’s compute performance, but I wanted to go over it again quickly with the full lineup of cards we’ve tested.

Surprisingly, for all of its performance gains relative to GTX 680, Titan still falls notably behind the 7970GE here. Given Titan’s theoretical performance and the fundamental nature of this test we would have expected it to do better. But without additional cross-platform tests it’s hard to say whether this is something where AMD’s GCN architecture continues to shine over Kepler, or if perhaps it’s a weakness in NVIDIA’s current DirectCompute implementation for GK110. Time will tell on this one, but in the meantime this is the first solid sign that Tahiti may be more of a match for GK110 than it’s typically given credit for.

337 Comments

Price is a disgrace. Can we really be surprised though ? We saw the 680 release and knew then they were selling their mid ranged card as a flagship with a flagship price.

We knew then the real flagship was going to come at some point. I admit I assumed they would replace the 680 with it and charge maybe 600 or 700. Can't believe they're trying to pawn it off for 1000. Looks like nvidia has decided to try and reshape what the past flagship performance level is worth. 8800gtx,280,285,480,580 all 500-600, we all know gtx680 is not a proper flagship and was their mid-range. Here is the real one and..... 1000

Problem here is this gen none of the reviewers chewed out AMD for the 7970. This led Nvidia to think it was totally fine to release GK104 for $500 which was still cheaper than a 7970 but not where that die was originally slotted and to do this utter insanity with a $1000 solution that is more expensive than solutions that are faster than it.

7950 3-way Crossfire, GTX690, GTX660Ti 3 Way SLI, GTX670SLI and GTX680SLI are all better options for anyone who isn't spending $3000 on cards as even dual card you are better off with the GTX690s in SLI. Poor form Nvidia, poor form. But poor form to every reviewer who gives this an award of any kind. It's time to start taking pricing and availability into the equation.

I think I'd have much less of an issue if partners had access to GK110 dies binned for slightly lower clocks and limited to 3GB at 750-800. I'd wager you'd hit close to the same performance window at a more reasonable price that people wouldn't have scoffed at. GTX670SLI is about $720...

Pretty much agree. GPU reviewers of late have been so forgiving toward nVidia and AMD for all kinds of crap. They don't seem to have the cahoneys to put their foot down and say, "This far, no farther!"

They just keep bowing their head and saying, "Can I have s'more, please?" Pricing is way out of hand, but the reviewers here and elsewhere just seem to be living in a fantasy world where these prices make even an iota of sense.

That said, the Titan is a halo card and I don't think 99% of people out there are even supposed to be considering it.

This is for that guy you read about on the forum thread who says he's having problems with quad-sli working properly. This is for him to help him spend $1k more on GPU's than he already would have.

So then we can have a thread with him complaining about how he's not getting optimal performance from his $3k in GPU's. And how, "C'mon, nVidia! I spent $3k in your GPU's! Make me a custom driver!"

"Investing" as our many local amd fanboy retard destroyers like to proclaim, in an amd card, is one sorry bet on the future. It's not an investment.

If it weren't for the constant crybaby whining about price in a laser focused insane fps only dream world of dollar pinching beyond the greatest female coupon clipper in the world's OBSESSION level stupidity, I could stomach an amd fanboy buying Radeons at full price and not WHINING in an actual show of support for the failing company they CLAIM must be present for "competition" to continue.

Instead our little hoi polloi amd ragers rape away at amd's failed bottom line, and just shortly before screamed nVidia would be crushed out of existence by amd's easy to do reduction in prices.... it went on and on and on for YEARS as they were presented the REAL FACTS and ignored them entirely. Yes, they are INSANE. Perhaps now they have learned to keep their stupid pieholes shut in this area, as their meme has been SILENCED for it's utter incorrectness. Thank God for small favors YEARS LATE.

Keep crying crybabies, it's all you do now, as you completely ignore amd's utter FAILURE in the driver department and are STUPID ENOUGH to unconsciously accept "the policy" about dual card usage here, WHEN THE REALITY IS NVIDIA'S CARDS ALWAYS WORK AND AMD'S FAIL 33% OF THE TIME.

So recommending CROSSFIRE cannot occur, so here is thrown the near perfect SLI out with the biased waters.

ANOTHER gigantic, insane, lie filled BIAS.

Congratulations amd fanboys, no one could possibly be more ignorant nor dirtier. That's what lying and failure is all about, it's all about amd and their little CLONES.

Awesome card! Best single GPU on the planet at the moment. Almost 50% better in frame latencies than 7970. Crossfire, don't make me laugh. Here's an analysis. Many of the frames "rendered" by the 7970 and especially Crossfire aren't visible. http://www.pcper.com/reviews/G...