Tegra K1: The Convergence of Mobile and Desktop Computing

Many people have been waiting a long time for the announcement Nvidia made today. The Logan SoC has been long awaited by mobile and gaming enthusiasts, and it finally brings Nvidia's desktop graphics expertise into the world of ARM. Nvidia accomplishes this by integrating a single Kepler SMX unit containing 192 GPU shader cores into the SoC. Judging by Nvidia's own images of the die, the GPU appears to take up around 70-80% of the die area. The GPU supports both desktop and mobile graphics APIs on a single chip while still promising excellent performance and efficiency. The K1's Kepler GPU is capable of OpenGL 4.4 and DirectX 11.2 in addition to OpenGL ES 3.0, which effectively covers the full gamut of graphics APIs.

The K1, however, is not actually one chip but two. The two flavors differ in the CPUs Nvidia uses: one is a 32-bit design with four Cortex-A15 cores plus a fifth low-power companion core, and the other is a 64-bit dual-core Project Denver design that Nvidia is dubbing a "supercore." Project Denver is Nvidia's own custom CPU, which uses the ARMv8 instruction set for 64-bit computing. Jen-Hsun Huang showed a working prototype of the dual-core SoC at the press conference and said that it had just come back from the factory and was very new. This implies that the A15-based K1 will likely ship to customers well ahead of the Project Denver-based K1, since the A15-based K1 was already shown at SIGGRAPH in July.

Nvidia is calling the K1 a 192-core chip even though, counting CPU cores, the 32-bit A15 version is technically a 197-core processor and the 64-bit Project Denver-based K1 is a 194-core SoC. To be fair, Nvidia has had trouble keeping count of cores in the past: it considers the Tegra 4 a quad-core SoC even though it features five ARM cores (four A15 cores plus a low-power companion core). The interesting thing about the K1, though, is that both versions are pin compatible and could theoretically be swapped for one another. The Denver-based chip also features significantly more L1 cache, 128 KB + 64 KB compared to 32 KB + 32 KB on the 32-bit A15 chip. The Project Denver cores clock higher as well, at 2.5 GHz rather than the A15s' 2.3 GHz, and given the likely improvement in IPC (instructions per clock), the core-for-core advantage should be larger than the clock speeds alone suggest.
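As a quick sanity check on those core counts, here is a minimal sketch that simply adds the CPU cores to the 192 GPU shader cores. The function and variable names are my own, chosen for illustration, and the clock ratio says nothing about real performance since it ignores any IPC difference.

```python
# Core tallies using only the figures quoted in this article.
GPU_SHADER_CORES = 192  # one Kepler SMX


def total_cores(cpu_cores: int, gpu_cores: int = GPU_SHADER_CORES) -> int:
    """Count every CPU core and GPU shader core as one 'core'."""
    return cpu_cores + gpu_cores


# 32-bit variant: four A15 cores plus one low-power companion core.
a15_variant = total_cores(cpu_cores=4 + 1)

# 64-bit variant: two Project Denver cores.
denver_variant = total_cores(cpu_cores=2)

# Raw clock advantage of Denver (2.5 GHz) over the A15s (2.3 GHz).
# This is frequency only; it does not account for IPC.
clock_ratio = 2.5 / 2.3

print(f"A15-based K1:    {a15_variant} cores")
print(f"Denver-based K1: {denver_variant} cores")
print(f"Clock ratio:     {clock_ratio:.3f}x")
```

On clock speed alone the Denver cores are only about 9% ahead, so any larger per-core gain would have to come from the architecture itself.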

The truth is that we have all been waiting for ARM SoCs to finally get desktop-class GPUs, and Nvidia is the first company to really do it. The addition of CUDA 6 on top of all of the graphics APIs means that Nvidia's SoC is theoretically capable of vastly more compute than virtually any other mobile SoC in the world. Nvidia compared the K1 against the Xbox 360 and PS3 and stated that it achieves more performance in both CPU and GPU at only 5 W, 1/20th the power of the gaming consoles. With CUDA 6, the parallel GPU compute capabilities of mobile devices increase significantly, and the platform instantly gains an already well-established developer base. It remains to be seen how many game developers and mobile app developers will adopt CUDA, but those that already use it in AA and AAA titles should find it easy to launch mobile titles alongside their console titles.

Another thing that Nvidia has been working tirelessly on is its image signal processing capabilities. It started this with the Chimera architecture in the Tegra 4 and has continued to improve upon it with the Tegra K1. The K1's new dual ISP core is capable of supporting a 100-megapixel sensor, handling 4096 simultaneous focus points and providing a throughput of 1.2 gigapixels per second. This is far beyond what the sensor in most smartphones, tablets or other mobile devices would demand, but it could easily be fully utilized by a device with an array of multiple camera sensors.
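To put the ISP numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes that pixel throughput is the only bottleneck and that it divides evenly across sensors, which a real ISP pipeline may not do. The 13-megapixel, 30 fps camera is a hypothetical example of mine, not a figure from Nvidia.

```python
# Back-of-the-envelope ISP math using the figures quoted in this article.
ISP_THROUGHPUT_PX_PER_S = 1.2e9  # 1.2 gigapixels per second
MAX_SENSOR_PX = 100e6            # 100-megapixel sensor


def max_frame_rate(sensor_px: float,
                   throughput: float = ISP_THROUGHPUT_PX_PER_S) -> float:
    """Frames per second if pixel throughput is the only limit (assumption)."""
    return throughput / sensor_px


def max_sensors(sensor_px: float, fps: float,
                throughput: float = ISP_THROUGHPUT_PX_PER_S) -> int:
    """How many identical sensors fit in the throughput budget (assumption)."""
    return int(throughput // (sensor_px * fps))


# A full 100 MP sensor read out as fast as the quoted throughput allows.
fps_at_100mp = max_frame_rate(MAX_SENSOR_PX)

# Hypothetical array of 13 MP cameras, each running at 30 fps.
sensors_13mp_30fps = max_sensors(sensor_px=13e6, fps=30)

print(f"100 MP sensor: {fps_at_100mp:.0f} fps")
print(f"13 MP @ 30 fps sensors supported: {sensors_13mp_30fps}")
```

Under those assumptions, the quoted throughput works out to roughly a dozen frames per second from a 100-megapixel sensor, or a handful of conventional phone-class sensors running at video frame rates.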

Overall, the Tegra K1 is really two Tegra chips and delivers on a lot of Nvidia's promises, albeit a bit later than anyone would've liked. At the same time, Nvidia makes a nearly eight-generation leap in GPU technology, and as a result, vast improvements should be expected in both performance and power consumption. By the looks of it, though, Nvidia is still struggling to make the modem a part of its Tegra chips, as there was no mention of the Icera modems, unlike at last year's CES, where the T4i and the i500 were a big part of the press conference. While it remains to be seen what's going on with Nvidia's modems, it is clear that the company is moving full steam ahead with its applications processors and that we can expect them to arrive fairly soon.

The 32-bit K1 chip will likely launch in devices around the MWC timeframe, while I would expect the 64-bit Project Denver-based K1 to arrive towards the latter part of the year, considering that Nvidia only now has working silicon. I think we will see more of Project Denver throughout the course of the year, but I'm not sure when we'll actually see real shipping products. Nvidia has upped the applications processor game by a huge notch, especially in terms of GPU capability, and it will be interesting to see what kinds of design wins, if any, it will get. Nvidia struggled quite a bit to get any Tegra 4 smartphone design wins; the tablet design wins trickled in slowly, and the overall showing was satisfactory. Hopefully 2014 will be kinder to Nvidia, because everyone knows we need another competitor to Qualcomm to drive down prices and drive up performance.