Posted by Soulskill on Friday March 30, 2012 @04:27PM
from the when-two-processors-love-each-other-very-much dept.

MrSeb writes "Late last week, Jen-Hsun Huang sent a letter to Nvidia employees congratulating them on successfully launching the highly acclaimed GeForce GTX 680. After discussing how Nvidia changed its entire approach to GPU design to create the new GK104, Jen-Hsun writes: 'Today is just the beginning of Kepler. Because of its super energy-efficient architecture, we will extend GPUs into datacenters, to super thin notebooks, to superphones.' (Nvidia calls Tegra-powered products 'super,' as in super phones, super tablets, etc, presumably because it believes you'll be more inclined to buy one if you associate it with a red-booted man in blue spandex.) This has touched off quite a bit of speculation concerning Nvidia's Tegra 4, codenamed Wayne, including assertions that Nvidia's next-gen SoC will use a Kepler-derived graphics core. That's probably true, but the implications are considerably wider than a simple boost to the chip's graphics performance."
Nvidia's CEO is also predicting this summer will see the rise of $200 Android tablets.

It is possible to use the GPU effectively to speed up some scientific simulations, usually fluid mechanics problems that can be solved by time marching (that is, physics that obeys hyperbolic governing differential equations). But working with the GPU is a real PITA. There is no standardization. There is no real support for any high-level languages. Of course they have bullet points saying "C++ is Supported". But you dig in and find that you have to link with their library, you need to manage the memory, you need to manage the data pipeline, the fetches, and the cache, and the actual amount of code you can fit in their "processing" unit is trivially small. All it could store turned out to be about 10 or so double-precision solution variables and roughly the flux vector splitting for Navier-Stokes for just one triangle: about 40 lines of C code.
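To make "time marching" concrete, here is a minimal CPU-only sketch, assuming the simplest hyperbolic model problem (1D linear advection, u_t + a u_x = 0 with a > 0) and a first-order upwind scheme. It is not the Navier-Stokes flux vector splitting described above, and the name upwind_step is purely illustrative. The point is that each new cell value depends only on values from the previous time level, so every cell could in principle be updated by its own small kernel.

```python
def upwind_step(u, c):
    """Advance u by one explicit time step on a periodic grid.

    c is the Courant number a*dt/dx (assumed 0 <= c <= 1 for stability).
    Each entry reads only old values, so the updates are independent.
    """
    n = len(u)
    # u[i - 1] with i == 0 wraps to the last cell (periodic boundary).
    return [u[i] - c * (u[i] - u[i - 1]) for i in range(n)]


# March a single pulse forward three steps at Courant number 1,
# where the upwind scheme shifts the profile exactly one cell per step.
u = [0.0, 1.0, 0.0, 0.0, 0.0]
for _ in range(3):
    u = upwind_step(u, 1.0)
print(u)
```

On a GPU the list comprehension would become one thread per cell, with the old and new solution arrays kept in device memory between steps.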

On top of everything, the binary is a mishmash of compiled executable chunks sitting in interpreted code. Essentially, if a competitor or hacker gets the "executable", they can reverse engineer every bit of innovation you put into cramming your code into these tiny processors, and reverse engineer your scientific algorithm at a very fine grain.

Then their sales critters create "buzz" and make misleading, almost lying, presentations about GPU programming and how it is going to achieve world domination.

In my experience, GPU programming works exactly like you'd expect it to work. Your nightmare doesn't sound like it's with GPU programming, it sounds like it's with NVidia's marketing.

GPU processors are really small, so everything you've listed here is expected: the code size, the variable limits, etc. The advantage is you have thousands of them at your disposal. That makes GPUs extremely good when you need to run the same kernel for every x from 0 to a trillion. Upload the problem set to VRAM, and send the cores to work.
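The "same kernel for every x" model can be sketched on the CPU as follows. This is only an illustration of the programming model, not CUDA or OpenCL code: saxpy_kernel and launch are hypothetical names, and the loop runs sequentially where a GPU would spread the indices across its cores.

```python
def saxpy_kernel(i, a, x, y, out):
    """Work done by one logical thread: touch only element i."""
    out[i] = a * x[i] + y[i]


def launch(kernel, n, *args):
    """Stand-in for a GPU launch: invoke the kernel once per index.

    Running the calls in order is valid only because no invocation
    depends on the result of another.
    """
    for i in range(n):
        kernel(i, *args)


n = 8
x = [float(i) for i in range(n)]  # plays the role of data uploaded to VRAM
y = [1.0] * n
out = [0.0] * n
launch(saxpy_kernel, n, 2.0, x, y, out)
print(out)
```

Note how small the kernel body is. The per-index function stays tiny, and the scale comes from how many indices are launched.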

Stuff like C++ and high-level languages is also not good for this sort of work. I'm not even sure why people are bothering with C++ on GPGPU, to be perfectly frank. Again, you're writing kernels here, not entire programs. C++ is honestly bulky for GPGPU work and I can't imagine what I'd use it for. Both CUDA and OpenCL are already pretty high level; go any further past that and you risk sacrificing performance.

Interpreted code is also good. It's usually JIT compiled for the architecture you're working on. With OpenCL, the same kernel source can be compiled at runtime for an ATI card, an NVidia card, or the local CPU, all of which have different machine languages that you won't know about until runtime. CUDA does something similar across different generations of NVidia GPUs.

It sounds like you're angry because GPU programming isn't very much like programming for CPUs, and you'd be right. That's the nature of the hardware: it's built very differently and is optimized for different tasks. Whether that's because you were sold a false bill of goods by NVidia, I don't know. But it doesn't mean GPU programming is broken; it just may not be for you. It mostly sounds like you're just trying to cram too much into your individual kernels, though.