GK104 gets cheaper and faster

A week ago today we posted our review of the GeForce GTX 780, NVIDIA's attempt to split the difference between the GTX 680 and the GTX Titan graphics cards in terms of performance and pricing. Today NVIDIA launches the GeForce GTX 770 which, despite its fancy new name, is a card and a GPU that you are already very familiar with.

The NVIDIA GK104 GPU Diagram

Based on GK104, the same GPU that powers the GTX 680 (released in March 2012), the GTX 670 and the GTX 690 (though in a pair), the new GeForce GTX 770 has very few changes from the previous models that are really worth noting. NVIDIA has updated the GPU Boost technology to 2.0 (more granular, better controls in software) but the real changes come in the clock speeds.

The GTX 770 is still built around 4 GPCs and 8 SMXs for a grand total of 1536 CUDA cores, 128 texture units and 32 ROPs. The clock speeds have increased from 1006 MHz base clock and 1058 MHz Boost up to 1046 MHz base and 1085 MHz Boost. That is a pretty minor speed bump in reality, an increase of just 4% or so over the previous clock speeds.
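As a quick sanity check on those figures, the sketch below recomputes the core count and the size of the clock bump. The figure of 192 CUDA cores per SMX is the standard Kepler layout rather than something stated above, and the function names are purely illustrative.

```python
# Rough arithmetic check of the GTX 770 figures quoted above.
# Assumption: each Kepler SMX holds 192 CUDA cores (not stated in the text).
CORES_PER_SMX = 192


def cuda_cores(smx_count: int) -> int:
    """Total CUDA cores for a Kepler GPU with the given number of SMXs."""
    return smx_count * CORES_PER_SMX


def percent_increase(old_mhz: float, new_mhz: float) -> float:
    """Relative clock increase, in percent."""
    return (new_mhz - old_mhz) / old_mhz * 100.0


if __name__ == "__main__":
    print(cuda_cores(8))                            # 1536
    print(round(percent_increase(1006, 1046), 1))   # base clock: 4.0
    print(round(percent_increase(1058, 1085), 1))   # Boost clock: 2.6
```

The Boost clock moves even less than the base clock does, which fits the "pretty minor speed bump" description.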

NVIDIA did bump up the GDDR5 memory speed considerably though, going from 6.0 Gbps to 7.0 Gbps, or 1750 MHz. The memory bus remains 256 bits wide but the total memory bandwidth has jumped up to 224.3 GB/s.
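The bandwidth number follows directly from the data rate and the bus width. Here is a minimal sketch of that arithmetic; the helper name is made up for illustration. Note that the round 7.0 Gbps figure yields 224.0 GB/s, so the quoted 224.3 GB/s suggests the actual effective data rate is fractionally above 7.0 Gbps.

```python
# Peak memory bandwidth from per-pin data rate and bus width.
# Illustrative helper; the name and signature are not from any NVIDIA tool.


def memory_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: (Gbit/s per pin) * (pins) / (8 bits per byte)."""
    return data_rate_gbps * bus_width_bits / 8


if __name__ == "__main__":
    gtx680 = memory_bandwidth_gbs(6.0, 256)
    gtx770 = memory_bandwidth_gbs(7.0, 256)
    print(gtx680)                                  # 192.0
    print(gtx770)                                  # 224.0
    print(round((gtx770 / gtx680 - 1) * 100, 1))   # 16.7 (percent increase)
```

At roughly 17%, the memory bandwidth gain is much larger than the 4% core clock bump.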

Maybe the best change for PC gamers is the new starting MSRP for the GeForce GTX 770 at $399 - a full $50-60 less than the GTX 680 was selling for as of yesterday. If you happened to pick up a GTX 680 recently, you are going to want to look into your return options as this will surely annoy the crap out of you.

Tired of this design yet? If so, you'll want to look into some of the non-reference options I'll show you on the next page from other vendors, but I for one am still taken with the design of these cards. You will find a handful of vendors offering up re-branded GTX 770 options at the outset of release but most will have their own SKUs to showcase.

Our 4K Testing Methods

You may have recently seen a story and video on PC Perspective about a new TV that made its way into the office. Of particular interest is the fact that the SEIKI SE50UY04 50-in TV is a 4K television; it has a native resolution of 3840x2160. For those that are unfamiliar with the new upcoming TV and display standards, 3840x2160 is exactly four times the resolution of current 1080p TVs and displays. Oh, and this TV only cost us $1300.
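For anyone who wants to verify the "exactly four times" claim, a short sketch of the pixel arithmetic follows. The function name is just for illustration.

```python
# Compare the pixel counts of a 4K (3840x2160) panel and a 1080p panel.


def pixel_count(width: int, height: int) -> int:
    """Number of pixels on a panel of the given resolution."""
    return width * height


if __name__ == "__main__":
    uhd = pixel_count(3840, 2160)
    fhd = pixel_count(1920, 1080)
    print(uhd)        # 8294400
    print(fhd)        # 2073600
    print(uhd / fhd)  # 4.0
```

Both dimensions double, so a GPU has to shade four times as many pixels per frame at 4K as it does at 1080p.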

In that short preview we validated that both NVIDIA and AMD current generation graphics cards support output to this TV at 3840x2160 using an HDMI cable. You might be surprised to find that HDMI 1.4 can support 4K resolutions, but it can do so only at 30 Hz (60 Hz 4K TVs won't be available until 2014 most likely), half the refresh rate of most TVs and monitors at 60 Hz. That doesn't mean we are limited to 30 FPS of performance though, far from it. As you'll see in our testing on the coming pages we were able to push out much higher frame rates using some very high end graphics solutions.

I should point out that I am not a TV reviewer and I don't claim to be one, so I'll leave the technical merits of the monitor itself to others. Instead I will only report on my experiences with it while using Windows and playing games - it's pretty freaking awesome. The only downside I have found in my time with the TV as a gaming monitor thus far comes from the combination of the 30 Hz refresh rate and Vsync being disabled. Because you are seeing fewer screen refreshes over the same amount of time than you would with a 60 Hz panel, all else being equal, you are getting twice as many "frames" of the game being pushed to the monitor each refresh cycle. This means that the horizontal tearing associated with running without Vsync will likely be more apparent than it would otherwise.
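A rough way to picture the effect with Vsync disabled: divide the game's frame rate by the panel's refresh rate to get how many rendered frames land in each refresh. This is a simplified model that assumes evenly paced frames and ignores scan-out timing, and the function name is hypothetical.

```python
# Simplified model: average number of rendered frames that arrive during
# one display refresh when Vsync is off. Assumes evenly paced frames.


def frames_per_refresh(fps: float, refresh_hz: float) -> float:
    """Average rendered frames delivered per display refresh cycle."""
    return fps / refresh_hz


if __name__ == "__main__":
    print(frames_per_refresh(60, 60))  # 1.0 on a typical 60 Hz panel
    print(frames_per_refresh(60, 30))  # 2.0 on the 30 Hz 4K TV
```

At the same frame rate, the 30 Hz panel has twice as many frames contributing to each refresh, which means more opportunities for a visible tear line per screen update.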

I would likely recommend enabling Vsync for a tear-free experience on this TV once you are happy with performance levels, but obviously for our testing we wanted to keep it off to gauge performance of these graphics cards.

Recent rumors seem to suggest that NVIDIA will release its desktop-class GeForce 700 series of graphics cards later this year. The new cards will reportedly be faster than the currently-available GTX 600 series, but will likely remain based on the company's Kepler architecture.

According to the information presented during NVIDIA's GTC keynote, its Kepler architecture will dominate 2012 and 2013. NVIDIA will then follow up with Maxwell-based cards in 2014. Notably absent from the slides are product names, meaning the publicly-available information at least leaves open the possibility of a refreshed Kepler GTX 700 lineup in 2013.

Fudzilla further reports that NVIDIA will release the cards as soon as May 2013, with an official launch as soon as Computex. Having actual cards available for sale by Computex is a bit unlikely, but a summer launch could be possible if the new 700 series is merely a tweaked Kepler-based design with higher clocks and/or lower power usage. The company is rumored to be accelerating the launch of the GTX 700 series in the desktop space in response to AMD's heavy game-bundle marketing, which seems to be working well at persuading gamers to choose the red team.

What do you make of this rumor? Do you think a refreshed Kepler is coming this year?

Last month, NVIDIA revealed its Kayla development platform that combines a quad core Tegra System on a Chip (SoC) with an NVIDIA Kepler GPU. Kayla will be out later this year, but that has not stopped other board makers from putting together their own solutions. One such solution that began shipping earlier this week is the mITX GPU Devkit from SECO.

The new mITX GPU Devkit is a hardware platform for developers to program CUDA applications for mobile devices, desktops, workstations, and HPC servers. It combines an NVIDIA Tegra 3 processor, 2GB of RAM, and 4GB of internal storage (eMMC) on a Qseven module with a Mini-ITX form factor motherboard. Developers can then plug their own CUDA-capable graphics card into the single PCI-E 2.0 x16 slot (which actually runs at x4 speeds). Additional storage can be added via an internal SATA connection, and cameras can be hooked up using the CIC headers.
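To put the x4 link in perspective, here is a back-of-the-envelope sketch of theoretical PCI-E 2.0 bandwidth per direction. It assumes the standard PCI-E 2.0 figures of 5 GT/s per lane with 8b/10b encoding, which are not stated above, and it ignores protocol overhead. The constant and function names are illustrative.

```python
# Theoretical PCI-E 2.0 bandwidth per direction, ignoring protocol overhead.
# Assumptions (standard PCI-E 2.0 figures, not taken from the article):
#   5 GT/s per lane and 8b/10b encoding (8 payload bits per 10 line bits).
PCIE2_MEGATRANSFERS_PER_S = 5000


def pcie2_bandwidth_mb_s(lanes: int) -> int:
    """Peak one-way bandwidth in MB/s for a PCI-E 2.0 link of the given width."""
    payload_mbit_per_lane = PCIE2_MEGATRANSFERS_PER_S * 8 // 10  # 4000 Mbit/s
    return lanes * payload_mbit_per_lane // 8                    # bits -> bytes


if __name__ == "__main__":
    print(pcie2_bandwidth_mb_s(4))   # 2000 (how the slot actually runs)
    print(pcie2_bandwidth_mb_s(16))  # 8000 (a full x16 link)
```

Under those assumptions the slot offers a quarter of the host-to-GPU bandwidth of a full x16 link, which matters most for workloads that move a lot of data across the bus.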

Rear IO on the mITX GPU Devkit includes:

1 x Gigabit Ethernet

3 x USB

1 x OTG port

1 x HDMI

1 x Display Port

3 x Analog audio

2 x Serial

1 x SD card slot

The SECO platform is proving to be popular for GPGPU in the server space, especially with systems like Pedraforca. The intention of using these types of platforms in servers is to save power by using a low power ARM chip for inter-node communication and basic tasks while the real computing is done solely on the graphics cards. With Intel’s upcoming Haswell-based Xeon chips getting down to 13W TDPs though, systems like this are going to be more difficult to justify. SECO is mostly positioning this platform as a development board, however. One use in that respect is to begin optimizing GPU-accelerated code for mobile devices. With future Tegra chips set to get CUDA-compatible graphics, new software development and optimization of existing GPGPU code for smartphones and tablets will be increasingly important.

Either way, the SECO mITX GPU Devkit is available now for 349 EUR or approximately $360 (in both cases, before any taxes).

Two new photos recently popped up on Cowcotland, showing off an unreleased "Dragon Edition" GTX 660 Ti graphics card from ASUS. The new card boasts some impressive factory overclocks on both the GPU and memory as well as a beefy heatsink and a new blue and black color scheme.

The ASUS GTX 660 Ti Dragon will feature a custom cooler with two fans and an aluminum heatsink. The back of the card includes a metal backplate to secure the cooler and help dissipate a bit of heat itself. However, there is also a cutout in the backplate to allow for (likely) additional power management circuitry. The card also features the company's power phase technology, NVIDIA's 660 Ti GK104 GPU, and 2GB of GDDR5 memory. The graphics core is reportedly clocked at 1150MHz (no word on whether that is the base or boost figure) while the memory is overclocked to 6100MHz. For comparison, the reference GTX 660 Ti clocks are 915MHz base, 980MHz boost, and 6000MHz memory. The new card will support DVI, DisplayPort, and HDMI video outputs.

There is no word on pricing or availability, but the Dragon looks like it will be one of the fastest GTX 660 Ti cards available when (if?) it is publicly released!

NVIDIA releases the GeForce GT 700M family

NVIDIA revolutionized gaming on the desktop with the release of its 600-series Kepler-based graphics cards in March 2012. With the release of the GeForce GT 700M series, Kepler enters the mobile arena to power laptops, ultrabooks, and all-in-one systems.

Today, NVIDIA introduces four new members to its mobile line: the GeForce GT 750M, the GeForce GT 740M, the GeForce GT 735M, and the GeForce GT 720M. These four new mobile graphics processors join the previously-released members of the GeForce GT 700M series: the GeForce GT 730M and the GeForce GT 710M. With the exception of the Fermi-based GeForce GT 720M, all of the newly-released mobile cores are based on NVIDIA's 28nm Kepler architecture.

Notebooks based on the GeForce GT 700M series will offer in-built support for the following new technologies:

Server platform manufacturer TYAN is showing off several of its latest servers aimed at the high performance computing (HPC) market. The new servers range in size from 2U to 4U chassis and hold up to 8 Kepler-based Tesla accelerator cards. The new product lineup consists of two motherboards and three bare-bones systems. The S7055 and S7056 are the motherboards while the FT77-B7059, TA77-B7061, and FT48-B7055 are the bare-bones systems.

The TA77-B7061 is the smallest system, with support for two Intel Xeon E5-2600 processors and four Kepler-based Tesla accelerator cards. The FT48-B7055 has similar specifications but is housed in a 4U chassis. Finally, the FT77-B7059 is a 4U system with support for two Intel Xeon E5-2600 processors and up to eight Tesla accelerator cards. The S7055 supports a maximum of 4 GPUs while the S7056 can support two Tesla cards, though these are bare boards so you will have to supply your own cards, processors, and RAM (of course).

According to TYAN, the new Kepler-based HPC systems will be available in Q2 2013, though there is no word on pricing yet.

Earlier this week, NVIDIA updated its Quadro line of workstation cards with new GPUs with GK104 “Kepler” cores. The updated line introduced four new Kepler cards, but the Quadro 6000 successor was notably absent from the NVIDIA announcement. If rumors hold true, professionals may get access to a K6000 Quadro card after all, and one that is powered by GK110 as well.

According to rumors around the Internet, NVIDIA has reserved its top-end Quadro slot for a GK110-based graphics card. Dubbed the K6000 (and in line with the existing Kepler Quadro cards), the high-end workstation card will feature 13 SMX units, 2,496 CUDA cores, 192 Texture Mapping Units, 40 Raster Operations Pipeline units, and a 320-bit memory bus. The K6000 card will likely have 5GB of GDDR5 memory, like its Tesla K20 counterpart. Interestingly, this Quadro K6000 graphics card has one fewer SMX unit than NVIDIA’s Tesla K20X and even NVIDIA’s consumer-grade GTX Titan GPU. A comparison between the rumored K6000 card, the Quadro K5000 (GK104), and other existing GK110 cards is available in the table below. Also, note that the (rumored) K6000 specs put it more in line with the Tesla K20 than the K20X, but as it is the flagship Quadro card I felt it was still fair to compare it to the flagship Tesla and GeForce cards.

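| Specification | Quadro K6000 | Tesla K20X | GTX Titan | GK110 Full (Not available yet) | Quadro K5000 |
|---------------|--------------|------------|-----------|--------------------------------|--------------|
| SMX Units     | 13           | 14         | 14        | 15                             | 8            |
| CUDA Cores    | 2,496        | 2,688      | 2,688     | 2,880                          | 1,536        |
| TMUs          | 192          | 224        | 224       | 256                            | 128          |
| ROPs          | 40           | 48         | 48        | 48                             | 32           |
| Memory Bus    | 320-bit      | 384-bit    | 384-bit   | 384-bit                        | 256-bit      |
| DP TFLOPS     | ~1.17        | 1.31       | 1.31      | ~1.4                           | 0.09         |
| Core          | GK110        | GK110      | GK110     | GK110                          | GK104        |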

The Quadro cards are in an odd situation when it comes to double precision floating point performance. The Quadro K5000, which uses GK104, brings an abysmal 90 GFLOPS of double precision. The rumored GK110-powered Quadro K6000 brings double precision performance up to approximately 1 TFLOPS, which is quite the jump and shows that GK104 really was cut down to focus on gaming performance! Further, the card that the K6000 is replacing in name, the Quadro 6000 (no prefixed K), is based on NVIDIA’s previous-generation Fermi architecture and offers 0.5152 TFLOPS (515.2 GFLOPS) of double precision performance. On the plus side, users can expect around 3.5 TFLOPS of single precision horsepower, which is a substantial upgrade over the Quadro 6000's 1.03 TFLOPS of single precision floating point. For comparison, the GK104-based Quadro K5000 offers 2.1 TFLOPS of single precision. Although it's no full GK110, it looks to be the Quadro card to beat for the intended usage.
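For the curious, the throughput figures above can be approximated from the SMX count and a clock speed. The sketch below rests on several assumptions that are not in the rumor itself: each GK110 SMX has 192 single precision cores and 64 double precision units, a fused multiply-add counts as two floating point operations per clock, and the K6000 would run at the Tesla K20's 706 MHz (the Tesla K20X runs at 732 MHz). Treat the output as an estimate, not a spec.

```python
# Estimate peak GK110 throughput from SMX count and clock speed.
# Assumptions (not from the rumor): 192 SP cores and 64 DP units per SMX,
# one fused multiply-add (2 FLOPs) per unit per clock, and a 706 MHz clock
# for the K6000 borrowed from the Tesla K20.
SP_CORES_PER_SMX = 192
DP_UNITS_PER_SMX = 64
FLOPS_PER_CLOCK = 2  # a fused multiply-add counts as two operations


def peak_tflops(smx_count: int, units_per_smx: int, clock_mhz: float) -> float:
    """Theoretical peak throughput in TFLOPS."""
    return smx_count * units_per_smx * FLOPS_PER_CLOCK * clock_mhz / 1e6


if __name__ == "__main__":
    # Rumored Quadro K6000: 13 SMX units at an assumed 706 MHz.
    print(round(peak_tflops(13, DP_UNITS_PER_SMX, 706), 2))  # 1.17 (double)
    print(round(peak_tflops(13, SP_CORES_PER_SMX, 706), 2))  # 3.52 (single)
    # Tesla K20X: 14 SMX units at 732 MHz.
    print(round(peak_tflops(14, DP_UNITS_PER_SMX, 732), 2))  # 1.31 (double)
```

Under these assumptions the estimates line up with the ~1.17 TFLOPS and 1.31 TFLOPS double precision entries in the table and with the roughly 3.5 TFLOPS single precision figure.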

Of course, Quadro is more about stable drivers, beefy memory, and single precision than double precision, but it would be nice to see the expensive Quadro workstation cards have the ability to pull double duty, as it were. NVIDIA’s Tesla line is where DP floating point is key. There is just a rather wide gap between the two lineups, one that the K6000 fortunately closes somewhat. I would have really liked to see the K6000 have at least 14 SMX units, to match the consumer Titan and the Tesla K20X, but rumors are not looking positive in that regard. Professionals should expect to see quite the premium with the K6000 versus the Titan, despite the hardware differences. It will likely be sold for around $3,000.

No word on availability, but the card will likely be released soon in order to complete the Kepler Quadro lineup update.

Missed the live event? Here is the full replay featuring me and Tom Petersen!

Hopefully by now you have read our review of the NVIDIA GeForce GTX TITAN 6GB graphics card that was just released. This is definitely a product release that highlights a generation of GPUs and I would really encourage you to read the article and offer your feedback.