Nvidia needs to use GTC to push GPGPU to the mainstream

THIS WEEK Nvidia will be doing its best at its GPU Technology Conference (GTC) to show how its general purpose graphics processing units (GPGPU) can do much more than play games and will highlight the increasing flexibility of the GPU in offloading computing from the CPU.

Nvidia's GPU Technology Conference has grown this year and this is a testament to the work the firm has done to promote its GPGPUs and CUDA in the research community over the past five years. Nvidia hopes that showing off what researchers and its partners are able to do with its GPUs will create a snowball effect and reduce the industry's reliance on traditional CPUs.

This is Nvidia's opportunity to show the wider developer community that GPUs can be relevant for their applications and not just be used to plot points on a Gnuplot graph. But it is also important to appreciate how far Nvidia, and its rivals, have come in pushing GPGPUs in certain domains over the course of the last five years.

Programming languages hold the key

Both AMD and Nvidia design GPGPU accelerators for use in high performance computing (HPC) clusters, but Nvidia was the first to push both the power of the GPU and its CUDA programming language into the classroom. The firm's hard work back then is starting to pay off now as students who were taught CUDA at university are working in industry and prescribing CUDA and with it Nvidia hardware.

It would be unfair to say that AMD hasn't done anything; quite the contrary, in fact. While the firm might not have heavily pushed GPGPUs in the research domain, it has spent the last two years pushing OpenCL in order to sell its accelerated processing units (APU).

AMD and Nvidia pushing GPUs as accelerators, whether as dedicated accelerator boards - such as the FirePro or Tesla boards, respectively - or in AMD's APUs, is having an effect on Intel. As part of Intel's major GPU revamp with its Ivy Bridge processors, the chipmaker finally supported Full Profile OpenCL.

Intel's support of OpenCL, at least in terms of implementation if perhaps not in driving the Khronos OpenCL working group with the same vigour as AMD, is a tacit acknowledgement that x86 CPUs cannot be relied upon for 100 percent of overall system performance, even in mainstream applications.

Much has been made of Nvidia's insistence on pushing its CUDA programming language, which rivals have pointed out is proprietary. Nvidia counters such claims by saying that it produces compilers for other programming languages such as C and Fortran, though there is evidence to suggest that an algorithm programmed in CUDA performs better on Nvidia GPU cards than the same algorithm written in other languages.

One Nvidia representative told us last month that this was simply down to the maturity of the CUDA compiler relative to other languages. It is true that Nvidia's CUDA compiler has been around the block a few more times than its C or Fortran compilers, but the firm will have to improve in this area quickly if it wants to move from seeing its GPUs being used in research to helping accelerate mainstream applications.

Nvidia, and all silicon vendors, should not underestimate the amount of time and effort professional developers have spent not just in learning a particular programming language but, most importantly, in working out ways to optimise algorithms and development workflows to increase all-round efficiency. It is not surprising therefore that developers will want to stick rather than twist.

While Nvidia's work with universities means a new generation of computer science graduates will be skilled in using its accelerators, there will be far more programmers in the field who will be more comfortable with C, Java or Python. Making those developers change to coding in CUDA will be a daunting task and a change that many will not make if it means dependence on a single hardware vendor.

Bringing GPGPUs to mainstream consumers

Nvidia could steal a march on its rivals by releasing a true GPGPU in its Tegra system on chip (SoC) processor. AMD has done this with its embedded APUs but its reliance on Windows 8 tablets presently limits its market reach.

Nvidia would need to do considerable work with the Android project to allow developers to take advantage of its GPGPU designs, similar to Intel's work with porting Android to x86.

Samsung's recent Galaxy S4 launch highlighted a number of applications that are ripe for GPGPU acceleration such as image manipulation and gesture recognition. It should be noted that Samsung and Qualcomm, which will supply the Galaxy S4 SoC in the US, are members of the HSA Foundation, but where Samsung goes other handset makers will surely look to follow in a bid to stop Samsung from effectively owning the Android market.

Nvidia's Tegra 4 still doesn't have GPGPU support but we expect the firm will start talking future support pretty soon in order to whet developers' appetites. For Nvidia, GPGPU and software defined radios are what will push the firm into a different league and should get Qualcomm in particular hot under the collar.

We hope Nvidia will showcase GPGPU acceleration in an efficient way. By efficient we mean a shallow learning curve for developers, not just to bang out code but to debug and optimise it. AMD, Nvidia and Intel have all talked about the performance improvements to be had with their respective accelerators, but what developers care about is how easily those improvements are attainable.

Above all, Nvidia has to start telling developers how it will move GPGPUs out of universities and put them into the hands of consumers. Consumers hold the key for Nvidia and can make it a top-tier mainstream chip designer. µ