Earlier this week NVIDIA announced their new top-end single-GPU consumer card, the GeForce GTX Titan. Built on NVIDIA’s GK110 and named after the same supercomputer that GK110 first powered, the GTX Titan is in many ways the apex of the Kepler family of GPUs first introduced nearly one year ago. With anywhere between 25% and 50% more resources than NVIDIA’s GeForce GTX 680, Titan is intended to be the ultimate single-GPU card for this generation.

Meanwhile with the launch of Titan NVIDIA has repositioned their traditional video card lineup to change who the ultimate video card will be chasing. With a price of $999 Titan is decidedly out of the price/performance race; Titan will be a luxury product, geared towards a mix of low-end compute customers and ultra-enthusiasts who can justify buying a luxury product to get their hands on a GK110 video card. So in many ways this is a different kind of launch than any other high performance consumer card that has come before it.

So where does that leave us? On Tuesday we could talk about Titan’s specifications, construction, architecture, and features. But the all-important performance data would be withheld another two days until today. So with Thursday finally upon us, let’s finish our look at Titan with our collected performance data and our analysis.

Titan: A Performance Summary

GTX Titan

GTX 690

GTX 680

GTX 580

Stream Processors

2688

2 x 1536

1536

512

Texture Units

224

2 x 128

128

64

ROPs

48

2 x 32

32

48

Core Clock

837MHz

915MHz

1006MHz

772MHz

Shader Clock

N/A

N/A

N/A

1544MHz

Boost Clock

876Mhz

1019MHz

1058MHz

N/A

Memory Clock

6.008GHz GDDR5

6.008GHz GDDR5

6.008GHz GDDR5

4.008GHz GDDR5

Memory Bus Width

384-bit

2 x 256-bit

256-bit

384-bit

VRAM

6GB

2 x 2GB

2GB

1.5GB

FP64

1/3 FP32

1/24 FP32

1/24 FP32

1/8 FP32

TDP

250W

300W

195W

244W

Transistor Count

7.1B

2 x 3.5B

3.5B

3B

Manufacturing Process

TSMC 28nm

TSMC 28nm

TSMC 28nm

TSMC 40nm

Launch Price

$999

$999

$499

$499

On paper, compared to GTX 680, Titan offers anywhere between a 25% and 50% increase in resource. At the starting end, Titan comes with 25% more ROP throughput, a combination of Titan’s 50% increase in ROP count and simultaneous decrease in clockspeeds relative to GTX 680. Shading and texturing performance meanwhile benefits even more from the expansion of the number of SMXes, from 8 to 14. And finally, Titan has a full 50% more memory bandwidth than GTX 680.

Setting aside the unique scenario of compute for a moment, this means that Titan will be between 25% and 50% faster than GTX 680 in GPU limited situations, depending on the game/application and its mix of resource usage. For an industry and userbase still trying to come to terms with the loss of nearly annual half-node jumps, this kind of performance jump on the same node is quite remarkable. At the same time it also sets expectations for how future products may unfold; one way to compensate for the loss of the rapid cadence in manufacturing nodes is to spread out the gains from a new node over multiple years, and this is essentially what we’ve seen with the Kepler family by launching GK104, and a year later GK110.

In any case, while Titan can improve gaming performance by up to 50%, NVIDIA has decided to release Titan as a luxury product with a price roughly 120% higher than the GTX 680. This means that Titan will not be positioned to push the price of NVIDIA’s current cards down, and in fact it’s priced right off the currently hyper-competitive price-performance curve that the GTX 680/670 and Radeon HD 7970GE/7970 currently occupy.

This setup isn’t unprecedented – the GTX 690 more or less created this precedent last May – but it means Titan is a very straightforward case of paying 120% more for 50% more performance; the last 10% always costs more. What this means is that the vast majority of gamers will simply be shut out from Titan at this price, but for those who can afford Titan’s $999 price tag NVIDIA believes they have put together a powerful card and a convincing case to pay for luxury.

So what can potential Titan buyers look forward to on the performance front? As always we’ll do a complete breakdown of performance in the following pages, but we wanted to open up this article with a quick summary of performance. So with that said, let’s take a look at some numbers.

GeForce GTX Titan Performance Summary (2560x1440)

vs. GTX 680

vs. GTX 690

vs. R7970GE

vs. R7990

Average

+47%

-15%

34%

-19%

Dirt: Showdown

47%

-5%

3%

-38%

Total War: Shogun 2

50%

-15%

62%

1%

Hitman: Absolution

34%

-15%

18%

-15%

Sleeping Dogs

49%

-15%

17%

-30%

Crysis

54%

-13%

21%

-25%

Far Cry 3

35%

-23%

37%

-15%

Battlefield 3

48%

-18%

52%

-11%

Civilization V

59%

-9%

60%

0

Looking first at NVIDIA’s product line, Titan is anywhere between 33% and 54% faster than the GTX 680. In fact with the exception of Hitman: Absolution, a somewhat CPU-bound benchmark, Titan’s performance relative to the GTX 680 is actually very consistent at a narrow 45%-55% range. Titan and GTX 680 are of course based on the same fundamental Kepler architecture, so there haven’t been any fundamental architecture changes between the two; Titan is exactly what you’d expect out of a bigger Kepler GPU. At the same time this is made all the more interesting due to the fact that Titan’s real-world performance advantage of 45%-55% is so close to its peak theoretical performance advantage of 50%, indicating that Titan doesn’t lose much (if anything) in efficiency when scaled up, and that the games we’re testing today favor memory bandwidth and shader/texturing performance over ROP throughput.

Moving on, while Titan offers a very consistent performance advantage over the architecturally similar GTX 680, it’s quite a different story when compared to AMD’s fastest single-GPU product, the Radeon HD 7970 GHz Edition. As we’ve seen time and time again this generation, the difference in performance between AMD and NVIDIA GPUs not only varies with the test and settings, but dramatically so. As a result Titan is anywhere between being merely equal to the 7970GE to being nearly a generation ahead of it.

At the low-end of the scale we have DiRT: Showdown, where Titan’s lead is less than 3%. At the other end is Total War: Shogun 2, where Titan is a good 62% faster than the 7970GE. The average gain over the 7970GE is almost right in the middle at 34%, reflecting a mix of games where the two are close, the two are far, and the two are anywhere in between. With recent driver advancements having helped the 7970GE pull ahead of the GTX 680, NVIDIA had to work harder to take back their lead and to do so in an concrete manner.

Titan’s final competition are the dual-GPU cards of this generation, the GK104 based GTX 690, and the officially unofficial Tahiti based HD 7990 cards, which vary in specs but generally have just shy of the performance of a pair of 7970s. As we’ve seen in past generations, when it comes to raw performance one big GPU is no match for two smaller GPUs, and the same is true with Titan. For frames per second and nothing else, Titan cannot compete with those cards. But as we’ll see there are still some very good reasons for Titan’s existence, and areas Titan excels at that even two lesser GPUs cannot match.

None of this of course accounts for compute. Simply put, Titan stands alone in the compute world. As the first consumer GK110 GPU based video card there’s nothing quite like it. We’ll see why that is in our look at compute performance, but as far as the competitive landscape is concerned there’s not a lot to discuss here.

1. Does compute capability really takes that much more transistors to build? as in 2x trans. only yield ~140% improvement on gaming.I think this was a conscious decision by nVidia to focus on compute and the required profit margin to sustain R&D.

2. despite the die size shrink, I'm guessing it would be harder to have functional silicon as the process shrinks. i.e. finding 100mm^2 of functional silicon @ 40nm is easier than @28nm, from the standpoint that more transistors are packed to the same area. Which I think why they have 15SMXs designed.Thus it'd be more expensive for nVidia to build same area at 28 vs. 40 nm... at least until the process matures, but at 7B I doubt it will ever be attainable.

3. The AMD statement on no updates to 7970 essentially sealed the $1000 price for titan. I would bet if AMD announced 8970, Titan would be priced at $700 today, with 3GB memory.Reply

Luxury GPU is no more silly than Extreme CPUs that cost $1000 each. And yet, Intel continues to sell those, and what's more the performance offered by Titan is a far better deal than the performance offered by a $1000 CPU vs. a $500 CPU. Then there's the Tesla argument: it's a $3500 card for the K20 and this is less than a third that price, with the only drawbacks being no ECC and no scalability beyond three cards. For the Quadro crowd, this might be a bargain at $1000 (though I suspect Titan won't get the enhanced Quadro drivers, so it's mostly a compute Tesla alternative).Reply

The problem with this analogy, which I'm sure was floated around Nvidia's Marketing board room in formulating the plan for Titan, is that Intel offers viable alternative SKUs based on the same ASIC. Sure there are the few who will buy the Intel EE CPU (3970K) for $1K, but the overwhelming majority in that high-end market would rather opt for the $500 option (3930K) or $300 option (3820).

Extend this to the GPU market and you see Nvidia clearly withheld GK100/GK110 as the flagship part for over a year, and instead of offering a viable SKU for traditional high-end market segments based on this ASIC, they created a NEW ultra-premium market. That's the ONLY reason Titan looks better compared to GK104 than Intel's $1K and $500 options, because Nvidia's offerings are truly different classes while Intel's differences are minor binning and multiplier locked parts with a bigger black box.Reply

You are assuming that the Intel EE parts are nothing more than a marketing ploy, which is wrong, while at the same time assuming that the Titan is orders of magnitude beyond the 680 which is also wrong.

You're seeing it from the point of view of someone who buys the cheapest Intel CPU, overclocks it to the point of melting, and then feels they have a solution "just as good if not better" than the Intel EE.

Because the Titan has unlocked stream procs that the 680 lacks, and there is no way to "overclock" your way around missing SPs, you feel that NVidia has committed some great sin.

The reality is that the EE procs give out of box performance that is superior to out of box performance of the lesser SKUs by a small, but appreciable, margin. In addition, they are unlocked, and come from a better bin, which means they will overclock *even better* than the lesser SKUs. Budget buyers never want to admit this, but it is reality in most cases. Yes you can get a "lucky part" from the lesser SKU that achieves a 100% overclock, but this is an anomaly. Most who criticize the EE SKUs have never even come close to owning one.

Similarly, the Titan offers a small, but appreciable, margin of performance over the 680. It allows you to wait longer before going SLI. The only difference is you don't get the "roll of the dice" shot at a 680 that *might* be able to appear to match a Titan since the SP's arent there.

The analogy is fine, it's just that biased perspective prevents some from seeing it.Reply

Well you obviously have trouble comprehending analogies if you think 3.6B difference in transistors and ~40% difference in performance is analogous to 3MB L3 cache, an unlocked multiplier and 5% difference in performance.

But I guess that's the only way you could draw such an asinine parallel as this:

"Similarly, the Titan offers a small, but appreciable, margin of performance over the 680."

It's the only way your ridiculous analogy to Intel's EE could possibly hold true, when in reality, it couldn't be further from the truth. Titan holds a huge advantage over GTX 680, but that's expected, its a completely different class of GPU whereas the 3930K and 3960X are cut from the exact same wafer.Reply

There was no manufacturing capacity you IDIOT LIAR.The 680 came out 6 months late, and amd BARELY had 79xx's on the shelves till a day before that.

Articles were everywhere pointing out nVidia did not have reserve die space as the crunch was extreme, and the ONLY factory was in the process of doing a multi-billion dollar build out to try to keep up with bare minimum demand.

Now we've got a giant GPU core with perhaps 100 attempted dies per wafer, with a not high yield, YET YOU'RE A LIAR NONETHELESS.Reply

It has nothing to do with manufacturing capacity, it had everything to do with 7970's lackluster performance and high price tag.

GTX 680 was only late (by 3, not 6 months) because Nvidia was too busy re-formulating their high-end strategy after seeing 7970 outperform GTX 580 by only 15-20% but asking 10% higher price. Horrible price:performance metric for a new generation GPU on a new process node.

This gave Nvidia the opportunity to:

1) Position mid-range ASIC GK104 as flagship GTX 680 and still beat the 7970.2) Push back and most importantly, re-spin GK100 and refine it to be GK110.3) Screw their long-time customers and AMD/AMD fans in the process.4) Profit.

So instead of launching and mass-producing their flagship ASIC first (GK100) as they've done in every single previous generation and product launch, they shifted their production allocation at TSMC to their mid-range ASIC, GK104 instead.

Once GK110 was ready, they've had no problem churning them out, even the mfg date of these TITAN prove this point as week 31 chips are somewhere in the July-August time frame. They were able to deliver some 19,000 K20X units to ORNL for the real TITAN in October 2012. Coupled with the fact they're using ASICs with the same number of functional units for GTX Titanic, it goes to show yields are pretty good.

But the real conclusion to be drawn for this is that other SKUs based on GK110 are coming. There's no way GK110 wafer yields are anywhere close to 100% for 15 SMX ASICs. I fully expect a reduced SMX unit, maybe 13 with 2304SP as originally rumored show it's face as the GTX 780 with a bunch of GK114 refreshes behind it to fill out the line-up.

The sooner people stop overpaying for TITAN, the sooner we'll see the GTX 700 series, imo, but with no new AMD GPUs on the horizon we may be waiting awhile.Reply

you're a brainwashed lying sack of idiocy, so maybe i'll waste my time reading your idiotic lies, and maybe not, since your first line is the big fat frikkin LIE you HAVE TO BELIEVE that you made up in your frikkin head, in order to take your absolutely FALSE STANCE for the past frikkin nearly year now.Reply