CUDACasts Episode #2: Your First CUDA C Program

In the last episode of CUDACasts, we learned how to install the CUDA Toolkit on Windows. Now we’re going to quickly move on and accelerate code on the GPU. For this episode, we’ll use the CUDA C programming language. However, as I will show in future CUDACasts, there are other CUDA-enabled languages, including C++, Fortran, and Python.

The simple code we’ll be writing is a kernel called VectorAdd, which adds two vectors, a and b, in parallel and stores the results in vector c. You can follow along in the video or download the source code for this episode from GitHub.
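For reference, here is a minimal sketch of what the serial CPU starting point might look like. It is not the episode's actual source: the vector length of 1024, the integer element type, and the initialization values are assumptions chosen for illustration. The code is host-only, so it compiles with nvcc as a .cu file even though nothing runs on the GPU yet.

```cuda
#include <stdio.h>
#include <stdlib.h>

// Serial starting point: a single CPU thread walks the vectors
// and adds one pair of elements per loop iteration.
void VectorAdd(const int *a, const int *b, int *c, int n)
{
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

int main(void)
{
    const int n = 1024;  // assumed vector length, for illustration only
    int *a = (int *)malloc(n * sizeof(int));
    int *b = (int *)malloc(n * sizeof(int));
    int *c = (int *)malloc(n * sizeof(int));

    // Initialize the inputs so that the expected result is c[i] == 2 * i.
    for (int i = 0; i < n; ++i) {
        a[i] = i;
        b[i] = i;
        c[i] = 0;
    }

    VectorAdd(a, b, c, n);

    // Print the first few results as a quick sanity check.
    for (int i = 0; i < 10; ++i)
        printf("c[%d] = %d\n", i, c[i]);

    free(a);
    free(b);
    free(c);
    return 0;
}
```

Every iteration of the loop is independent of the others, which is what makes this function a good candidate for the GPU.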

The process for moving VectorAdd from the CPU to the massively parallel GPU follows three simple steps.

1. Parallelize the VectorAdd function by converting the serial for loop that adds each pair of elements on the CPU into a parallel kernel that uses an independent GPU thread to add each pair of elements.

2. Copy the initialized data from CPU memory to the GPU memory space and the results back.

3. Modify the VectorAdd function call to launch the now parallelized kernel on the GPU.
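The three steps can be sketched roughly as follows. This is an illustrative version under stated assumptions, not necessarily the code written in the video: the helper name RunVectorAdd, the block size of 256 threads, the integer element type, and the initialization values are all hypothetical choices. It uses explicit cudaMalloc and cudaMemcpy calls to make the copy step visible.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Step 1: the loop body becomes a kernel; each GPU thread adds one pair.
__global__ void VectorAdd(const int *a, const int *b, int *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's element
    if (i < n)                                      // skip threads past the end
        c[i] = a[i] + b[i];
}

// Hypothetical helper: runs all three steps for vectors of length n.
// Returns the number of wrong elements, or -1 if a CUDA call failed.
int RunVectorAdd(int n)
{
    size_t bytes = (size_t)n * sizeof(int);
    int *a = (int *)malloc(bytes);
    int *b = (int *)malloc(bytes);
    int *c = (int *)malloc(bytes);
    for (int i = 0; i < n; ++i) { a[i] = i; b[i] = i; c[i] = 0; }

    int *d_a = NULL, *d_b = NULL, *d_c = NULL;
    cudaError_t err = cudaMalloc((void **)&d_a, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_b, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_c, bytes);

    // Step 2 (first half): copy the initialized inputs from CPU to GPU memory.
    if (err == cudaSuccess) err = cudaMemcpy(d_a, a, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(d_b, b, bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        // Step 3: launch the kernel with enough 256-thread blocks to cover n.
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        VectorAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);
        err = cudaGetLastError();  // reports launch configuration errors
    }

    // Step 2 (second half): copy the results back to CPU memory.
    // This cudaMemcpy waits for the kernel to finish before copying.
    if (err == cudaSuccess) err = cudaMemcpy(c, d_c, bytes, cudaMemcpyDeviceToHost);

    int wrong = 0;
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        wrong = -1;
    } else {
        for (int i = 0; i < n; ++i)
            if (c[i] != a[i] + b[i]) ++wrong;
    }

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);  // cudaFree(NULL) is a no-op
    free(a); free(b); free(c);
    return wrong;
}

int main(void)
{
    int wrong = RunVectorAdd(1024);
    printf("wrong elements: %d\n", wrong);
    return wrong == 0 ? 0 : 1;
}
```

Assuming the file is saved as vectoradd.cu (a hypothetical name), it can be built with `nvcc vectoradd.cu -o vectoradd`. The bounds check in the kernel matters whenever the vector length is not a multiple of the block size, because the last block then contains threads with no element to process.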

If you’re interested in learning more about CUDA C, you can watch the recording of my in-depth Introduction to CUDA C/C++. In the next CUDACast, we’ll explore an alternate method for accelerating code using the OpenACC directive-based approach.

If you would like to request a topic for a future episode of CUDACasts, or if you have any other feedback, please use the contact form or leave a comment to let us know!

About Mark Ebersole

As CUDA Educator at NVIDIA, Mark Ebersole teaches developers and programmers about the NVIDIA CUDA parallel computing platform and programming model, and the benefits of GPU computing. With more than ten years of experience as a low-level systems programmer, Mark has spent much of his time at NVIDIA as a GPU systems diagnostics programmer, a role in which he developed a tool to test, debug, validate, and verify GPUs from pre-emulation through bring-up and into production. Before joining NVIDIA, he worked for IBM developing Linux drivers for the IBM iSeries server. Mark holds a BS degree in math and computer science from St. Cloud State University. Follow @cudahamster on Twitter.