Note: This event is being held before ISC begins and as such is not affiliated with the ISC conference. Attendees do not need to register for the conference.

NVIDIA® CUDA™ is a general-purpose, scalable parallel programming model for writing highly parallel applications. It provides several key abstractions – a hierarchy of thread blocks, shared memory, and barrier synchronization. This model has proven quite successful at programming multi-threaded many-core GPUs and scales transparently to hundreds of cores. Scientists throughout industry and academia are already using CUDA to achieve dramatic speedups on production and research codes (see http://www.gpucomputing.org for a list of codes, academic papers and commercial packages based on CUDA). An upcoming version of CUDA will also include a new compiler backend that extends the model to multi-core CPUs.
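To make the three abstractions above concrete, here is a minimal CUDA C sketch (illustrative only, not part of the tutorial material): a grid of thread blocks in which each block cooperatively sums a 256-element tile of an array in shared memory, with __syncthreads() providing the barrier between phases.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread block sums one tile of the input in shared memory.
__global__ void blockSum(const float *in, float *out, int n) {
    __shared__ float tile[256];          // per-block shared memory
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    tile[tid] = (i < n) ? in[i] : 0.0f;  // one element per thread
    __syncthreads();                     // barrier: tile fully loaded

    // Tree reduction within the block.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride)
            tile[tid] += tile[tid + stride];
        __syncthreads();                 // barrier between reduction steps
    }
    if (tid == 0)
        out[blockIdx.x] = tile[0];       // one partial sum per block
}

int main() {
    const int n = 1 << 20, threads = 256, blocks = n / threads;
    size_t inBytes = n * sizeof(float), outBytes = blocks * sizeof(float);

    float *hIn = (float *)malloc(inBytes), *hOut = (float *)malloc(outBytes);
    for (int i = 0; i < n; ++i) hIn[i] = 1.0f;

    float *dIn, *dOut;
    cudaMalloc(&dIn, inBytes);
    cudaMalloc(&dOut, outBytes);
    cudaMemcpy(dIn, hIn, inBytes, cudaMemcpyHostToDevice);

    blockSum<<<blocks, threads>>>(dIn, dOut, n);  // launch a grid of blocks
    cudaMemcpy(hOut, dOut, outBytes, cudaMemcpyDeviceToHost);

    float total = 0.0f;                  // host combines the partial sums
    for (int b = 0; b < blocks; ++b) total += hOut[b];
    printf("sum = %f\n", total);

    cudaFree(dIn); cudaFree(dOut);
    free(hIn); free(hOut);
    return 0;
}
```

Communication within a block goes through shared memory and barriers, while blocks themselves run independently; this independence is what lets the same program scale transparently across GPUs with different core counts.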

In this tutorial, NVIDIA engineers will partner with academic and industrial researchers to present CUDA and discuss its advanced use in science and engineering domains. We will demonstrate its application with traditional HPC examples including BLAS, FFT, and integration with Fortran and high-level languages (MATLAB, Mathematica, Python), and describe in detail the programming model at the heart of it all. We will then turn to advanced topics, including optimizing CUDA programs, CUDA floating-point performance and accuracy, and CUDA programming strategies and tips. Finally, we will present detailed case studies in which domain scientists describe their experience using CUDA to accelerate mature, deployed, real-world science codes.