Old News at 4 Months of Age?

It seems that all of the latest news surrounding NVIDIA and its partners these days pertains to the GTX 460 launch. While that is a pretty impressive product for its price point, interest has naturally shifted away from NVIDIA’s high end products. Now that the GTX 480 has been out for four months, I figured it was time to take a second, closer look at the product, the software environment around it, and what has been done to potentially rectify some of the issues that have dogged this sometimes contentious card.

MSI was kind enough to ship its high end GTX 480 board for this closer inspection. We all know that the board performs very well in games, and it is arguably the fastest single-GPU solution on the market. That performance comes at a high price in power consumption and heat production, though. It is well documented that these are big negatives for the card, but in digging deeper, will we find aspects that outweigh them?

Musings Upon NVIDIA’s Intentions with Fermi

Fermi is arguably one of the most controversial products NVIDIA has ever released. Controversy certainly seems to be a cornerstone of NVIDIA’s very existence, and this product has added to that reputation. Obviously Fermi was not designed from the outset to cause controversy; it was meant to be a well rounded and high performing solution from NVIDIA.

Readers can take a deep dive into the major architectural points of the Fermi chip here. For the time being though, I do want to go over some of my opinions on the architecture after having seen it in action and mulled over the tradeoffs that were made to get this product to market. The first impression of this product harkens back to the GeForce FX days, but while some of the issues surrounding Fermi are reminiscent of that earlier situation, the two products have very little in common.

The board is nicely packed and protected in MSI's not overly large box.

At the time of release, GeForce FX was not the fastest part on the market. And while it had some next generation features that the competition did not (e.g. ATI’s products were all 24 bit vs. the 16/32 bit FP support GeForce FX had), it was never able to make adequate use of them due to poor overall performance. GeForce FX also had some serious design issues that caused production woes; these were not so much the result of TSMC having major problems with its 130 nm process (TSMC’s 130 nm process was considered “clean” approximately 4 months before NVIDIA requested first samples of NV30). Performance was the other problem. At 16 bit precision, the NV30 was almost as fast as the competing Radeon 9700 Pro, but output quality often suffered in situations that required decent amounts of pixel shading. While the NV30 was really a very good DX8 card, it was pretty pathetic with DX9 content. Certainly NV30 was an ambitious design at the time, aimed to be far more flexible than the competition, but it fell down when it came to real-world performance.

Another rumor about Fermi is that the initial design was a failure, so NVIDIA dusted off a shelved design that had been meant for the GPGPU/Enterprise market and shoehorned it into the desktop/gaming market. While enterprise class aspects of the architecture certainly permeate this card, it is hard to imagine that NVIDIA worked on such a product and then shelved it. Certainly ideas and initial designs are thrown around, picked up, and discarded fairly quickly, but once real work gets underway, big changes such as the one described above simply do not happen. There are a lot of very competent and talented engineers and architects at NVIDIA, and it is a bit of a stretch to think that a fully developed product was dropped at the last minute and a previous castoff was put in its place.

Personally, I like what NVIDIA has done with the Fermi architecture. It is a pretty radical re-imagining of the workloads and workflows for a graphics chip. It takes a major step away from being a pure rendering device, and is much more akin to Intel’s Larrabee project, just without the horrific bandwidth and x86 overhead issues.