We first heard about this one back at Nvidia's GPU Technology Conference, and since it has already been put to use in Oak Ridge National Laboratory's Titan supercomputer, Nvidia has now officially launched its new Tesla K20 GPU family based on the elusive GK110 GPU.

Back at GTC 2012, Nvidia made a short announcement of the K20 and shared some details regarding its power, but kept all those juicy GK110 specs to itself. Now that it is officially announced, it is clear that the Tesla K20 lineup will consist of two products, the Tesla K20 and the Tesla K20X.

Both of the new Tesla graphics cards are based on the same GK110 Kepler GPU, and it is now clear that the GK110 features 15 SMX clusters with six memory controllers and 1.5MB of L2 cache. The more powerful Tesla K20X uses 14 active SMXs with all six memory controllers and the full 1.5MB of L2 cache. This adds up to a total of 2688 CUDA cores. The Tesla K20X GPU works at 732MHz and features 6GB of GDDR5 memory clocked at 5.2GHz and paired with a 384-bit memory interface. These specs add up to 3.95 TFLOPS of single- and 1.31 TFLOPS of double-precision floating-point performance, and it is the same GPU that is part of the Titan supercomputer.

The less powerful Tesla K20 has one SMX fewer, five memory controllers and 1.25MB of L2 cache. That adds up to 2496 CUDA cores, and it works at 706MHz for the GPU and 5.2GHz for the 5GB of GDDR5 memory paired with a 320-bit memory interface. It will provide up to 3.52 TFLOPS of single- and 1.17 TFLOPS of double-precision floating-point performance.
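The core counts and peak throughput figures for both cards can be roughly reproduced from the SMX count and the GPU clock. The following is a minimal sketch, assuming 192 single-precision cores and 64 double-precision units per SMX (the per-SMX figures from Nvidia's GK110 whitepaper) and assuming that every unit retires one fused multiply-add, counted as two floating-point operations, per clock. The function and variable names are purely illustrative.

```python
# Per-SMX resources of GK110, as given in Nvidia's GK110 whitepaper.
SP_CORES_PER_SMX = 192
DP_UNITS_PER_SMX = 64
# Assumption: one fused multiply-add per unit per clock, counted as 2 FLOPs.
FLOPS_PER_UNIT_PER_CLOCK = 2


def peak_tflops(active_smx, clock_mhz, units_per_smx):
    """Theoretical peak throughput in TFLOPS for one type of execution unit."""
    units = active_smx * units_per_smx
    return units * FLOPS_PER_UNIT_PER_CLOCK * clock_mhz * 1e6 / 1e12


def memory_bandwidth_gbs(bus_width_bits, data_rate_ghz):
    """Theoretical peak memory bandwidth in GB/s."""
    return bus_width_bits / 8 * data_rate_ghz


# Figures quoted in the article for the two cards.
cards = {
    "Tesla K20X": {"smx": 14, "clock_mhz": 732, "bus_bits": 384, "mem_ghz": 5.2},
    "Tesla K20": {"smx": 13, "clock_mhz": 706, "bus_bits": 320, "mem_ghz": 5.2},
}

for name, c in cards.items():
    cores = c["smx"] * SP_CORES_PER_SMX
    sp = peak_tflops(c["smx"], c["clock_mhz"], SP_CORES_PER_SMX)
    dp = peak_tflops(c["smx"], c["clock_mhz"], DP_UNITS_PER_SMX)
    bw = memory_bandwidth_gbs(c["bus_bits"], c["mem_ghz"])
    print(f"{name}: {cores} cores, {sp:.2f} SP TFLOPS, "
          f"{dp:.2f} DP TFLOPS, {bw:.1f} GB/s")
```

Under these assumptions the core counts come out exactly as quoted, while the single-precision figure for the K20X lands slightly below the quoted 3.95 TFLOPS, so the quoted numbers are best read as rounded vendor figures rather than something this simple formula pins down.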

The TDP is set at 235W for the Tesla K20X and 225W for the K20. Although the performance of both of Nvidia's new Teslas falls short of AMD's FirePro S10000 dual-GPU beast, their TDP is also nowhere near that card's incredible 375W.

When compared to the previous Fermi-based Tesla M2090, the new Tesla K20X raises the actual architecture efficiency from 65 percent to up to 93 percent. This was achieved via the new Hyper-Q and Dynamic Parallelism features, a new ECC algorithm, and OpenACC, MPI and other HPC libraries and technologies.

Of course, the Tesla family was never cheap, so although there is no official price, the estimated price is around US $3199 for the K20 and somewhat higher for the more powerful K20X.

Nvidia has officially published the Kepler GK110 whitepaper detailing all the specs that had us wondering for months. The Kepler-based GK110 will first show up in the Tesla K20 graphics card, which is built for intense computing applications that include data analytics, weather modeling, computational chemistry and physics, and is aimed at both server and workstation systems.

As far as the specs are concerned, the GK110 features 7.1 billion transistors and promises up to three times the performance per watt when compared to the previous Fermi architecture. It packs a total of 2880 cores organized in 15 SMX modules. Each SMX features 192 single-precision CUDA cores, 64 double-precision units, 32 special function units (SFU) and 32 load/store units. The GK110 GPU features six 64-bit memory controllers, which add up to a 384-bit memory interface.
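The chip-wide totals follow directly from the per-SMX figures and the memory controller count. Here is a minimal sketch of that arithmetic; the names used are illustrative and not taken from any Nvidia tool or API.

```python
# Per-SMX resources of GK110 as listed in the text.
PER_SMX = {
    "single-precision CUDA cores": 192,
    "double-precision units": 64,
    "special function units": 32,
    "load/store units": 32,
}
SMX_COUNT = 15              # full GK110 die
MEMORY_CONTROLLERS = 6
CONTROLLER_WIDTH_BITS = 64


def chip_totals(smx_count):
    """Multiply every per-SMX resource by the number of active SMX modules."""
    return {name: count * smx_count for name, count in PER_SMX.items()}


def bus_width_bits(controllers, width=CONTROLLER_WIDTH_BITS):
    """Total memory interface width for a given number of controllers."""
    return controllers * width


for name, total in chip_totals(SMX_COUNT).items():
    print(f"{name}: {total}")
print(f"memory interface: {bus_width_bits(MEMORY_CONTROLLERS)}-bit")
```

Calling `chip_totals` with 14 or 13 active SMX modules, and `bus_width_bits` with five controllers, gives the figures for cut-down configurations of the same die.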

The memory subsystem inside the Kepler-based GK110 includes 64KB of on-chip memory for each SMX that can be allocated with a bit more flexibility than on the previous Fermi architecture, as it adds a 32KB/32KB split between shared memory and L1 cache. In addition to the L1 cache, each GK110 SMX also has 48KB of read-only data cache. In case you lost count, this adds up to 960KB of configurable shared memory and L1 cache across all 15 SMXs.
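To make the 960KB figure concrete, the sketch below walks through the possible splits of the 64KB block and the resulting chip-wide totals. The 32KB/32KB split comes from the text; the 16KB/48KB and 48KB/16KB splits are assumed here as the options carried over from Fermi, and all names are illustrative.

```python
ON_CHIP_KB_PER_SMX = 64          # configurable shared memory + L1 cache
SMX_COUNT = 15

# (shared memory KB, L1 cache KB) per SMX.
# Assumption: 16/48 and 48/16 are the Fermi-era options; 32/32 is the new one.
SPLITS_KB = [(16, 48), (32, 32), (48, 16)]


def chip_wide_kb(per_smx_kb, smx_count=SMX_COUNT):
    """Scale a per-SMX capacity to the whole chip."""
    return per_smx_kb * smx_count


for shared_kb, l1_kb in SPLITS_KB:
    # Every split must use exactly the 64KB available in each SMX.
    assert shared_kb + l1_kb == ON_CHIP_KB_PER_SMX
    print(f"{shared_kb}KB shared / {l1_kb}KB L1 per SMX -> "
          f"{chip_wide_kb(shared_kb)}KB shared, {chip_wide_kb(l1_kb)}KB L1 chip-wide")

print(f"total configurable on-chip memory: {chip_wide_kb(ON_CHIP_KB_PER_SMX)}KB")
```

Note that the 960KB total covers the whole configurable block. How much of it actually serves as shared memory depends on the split chosen for each kernel.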

As far as the L2 cache is concerned, the Kepler GK110 packs 1536KB of it, double the amount found in the Fermi architecture, and offers up to twice the bandwidth per clock. As expected, the GK110 also has ECC memory protection support for all register files, shared memories, L1 and L2 cache, and DRAM.

In addition to these specs, Nvidia also detailed a couple of Kepler features that will show up in the GK110, including Dynamic Parallelism, Hyper-Q, Grid Management and Nvidia GPUDirect. We already wrote about these features, and if you are looking for more details, you can always check out the Nvidia GK110 whitepaper.

All in all, the GK110 looks like a pretty impressive GPU, but we honestly doubt that we'll see a GeForce graphics card based on the GK110 anytime soon.

Nvidia used the GPU Technology Conference, currently taking place in San Jose, California, to announce two new Tesla GPUs based on the Kepler architecture, the Tesla K10 and K20.

When compared to the previous generation of Tesla products based on the Fermi architecture, the new Kepler-based Teslas bring three main features. The first is the SMX streaming multiprocessor, which provides up to three times the performance per watt of Fermi. The second is Dynamic Parallelism, which enables GPU threads to dynamically spawn new threads, allowing the GPU to adapt dynamically to the data. The third is Hyper-Q, which enables multiple CPU cores to simultaneously use the CUDA cores on a single GPU. According to Nvidia, Hyper-Q dramatically increases GPU utilization, slashing CPU idle times and advancing programmability, which makes it ideal for cluster applications that use MPI.

The newly introduced Tesla K10 is able to produce peak double-precision floating-point performance of up to 0.19 teraflops, while single-precision floating-point performance hovers at an impressive 4.58 teraflops. In case you are wondering, the Tesla K10 is based on two GK104 GPUs for a total of 3072 CUDA cores and 8GB of GDDR5 memory with 320GB/s of memory bandwidth. The Tesla K10 is for servers only and is aimed at seismic, image and signal processing, and video analytics computing applications.

Unlike the Tesla K10, which was at least detailed to some extent, the Tesla K20's specs are still unknown. What is known is that it will be based on the GK110 GPU. It targets much more intensive computing applications like computational chemistry and physics, data analytics and weather modeling, and is aimed at both server and workstation systems. Unlike the Tesla K10, which only features SMX as an architecture feature, the Tesla K20 will also pack the mentioned Dynamic Parallelism and Hyper-Q features.
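The different target workloads of the two product lines line up with their ratios of single- to double-precision throughput. The short sketch below computes that ratio from the peak figures quoted in this article (the K20 and K20X numbers are the launch figures given at the top). The reading that a lower ratio suits double-precision-heavy HPC codes is our interpretation, not an Nvidia statement.

```python
# Peak throughput in TFLOPS as quoted in the article.
quoted = {
    "Tesla K10 (2x GK104)": {"sp": 4.58, "dp": 0.19},
    "Tesla K20X (GK110)": {"sp": 3.95, "dp": 1.31},
    "Tesla K20 (GK110)": {"sp": 3.52, "dp": 1.17},
}


def sp_to_dp_ratio(sp_tflops, dp_tflops):
    """How many times faster single precision is than double precision."""
    return sp_tflops / dp_tflops


for name, t in quoted.items():
    ratio = sp_to_dp_ratio(t["sp"], t["dp"])
    print(f"{name}: single precision is about {ratio:.0f}x double precision")
```

The K10 comes out at roughly 24:1, while both GK110 cards sit at roughly 3:1, which matches the 192 single-precision cores to 64 double-precision units in each GK110 SMX.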

The dual-GK104 Tesla K10 GPU will be available later this month, while the big-dog K20 is scheduled for Q4 2012.