ispc on GPUs

Anon (no@thanks.com) on 2/11/12 wrote:
---------------------------
>For what it's worth, the thing I'll be working on has no such luxuries. It won't
>ever run on Windows, should be able to run on a multi-core x86 without a GPU, and
>isn't likely to get very far with management if some kind of commercial compiler
>tech is required. I'm not really seeing an alternative to OpenCL at the moment.
>ISPC looks really interesting, but has no GPU backends at the moment, and it's not clear if it ever will.

I think a GPU backend for ispc probably doesn't make sense (and thus is unlikely to happen), for a variety of reasons, most of which boil down to the substantial underlying architectural differences between CPUs and GPUs.

ispc was designed primarily to map well to modern (and future) CPU hardware; this led to design features like the ability to express both scalar and vector computation, cross-lane communication mechanisms, an execution model with stronger guarantees of convergence across the running program instances than is possible on GPUs, the ability to use the application's data in memory without copying or reformatting it, etc.
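To make those features a bit more concrete, here's a rough sketch in plain C (not ispc syntax) that emulates a gang of program instances summing an array. The name `gang_sum` and the width `GANG_WIDTH` are made up for illustration, and a real compiler would map the inner lane loop onto SIMD instructions rather than run it serially; the point is only to show scalar values, per-lane values, a cross-lane step, and in-place access to the caller's data sitting side by side.

```c
#include <stddef.h>

/* Hypothetical gang width: how many program instances run together. */
#define GANG_WIDTH 8

/* Sum n floats directly from the caller's array (no copy into a separate
 * buffer). `a` and `n` act as scalar values shared by the whole gang;
 * `partial` holds one value per program instance (one per lane). */
float gang_sum(const float *a, size_t n)
{
    float partial[GANG_WIDTH] = {0};

    for (size_t base = 0; base < n; base += GANG_WIDTH) {
        /* Emulated vector step: every lane does the same work on its own
         * element. The lanes stay converged from one step to the next. */
        for (size_t lane = 0; lane < GANG_WIDTH; ++lane) {
            size_t i = base + lane;
            if (i < n) {              /* lanes past the end are masked off */
                partial[lane] += a[i];
            }
        }
    }

    /* Cross-lane step: combine the per-lane values into a single scalar. */
    float total = 0.0f;
    for (size_t lane = 0; lane < GANG_WIDTH; ++lane) {
        total += partial[lane];
    }
    return total;
}
```

Calling `gang_sum(data, count)` on an ordinary application array is all there is to it in this sketch: no driver call, no buffer setup, and the result comes back as a plain scalar.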

Conversely, ispc doesn't have the notion of GPU-y things like local memory, and it doesn't have barriers (they're not needed, thanks to the execution convergence guarantees). Without those mechanisms in the language, it's hard (impossible?) to write programs that run with high performance on GPUs.

I don't think it makes sense to add those features to ispc, since they negatively impact the goal of mapping well to CPU hardware and delivering very high performance on CPUs.

Put another way (alert: my personal opinion only, not Intel's): I think that OpenCL on the CPU has a number of shortcomings that stem from these issues: it's full of limitations that are unnecessary on the CPU (e.g. a very restricted pointer model), it requires a driver, with the resulting overhead and data copying, it has language features like barriers and shared memory that don't really correspond to CPU hardware features, etc. All of that comes from the GPU heritage of OpenCL, but it adds unnecessary complexity and needlessly hurts performance when running on CPUs. My opinion is that making ispc run on the GPU would in the end require making a number of similar compromises, which in turn would make it less appealing for the CPU.