What Is CUDA?

You may not realize it, but GPUs are good for more than videogames and scientific research. In fact, there’s a good chance your daily life is being affected by GPU computing.

Mobile applications rely on GPUs running in servers in the cloud. Stores use GPUs to analyze retail and web data. Web sites use GPUs to place ads more accurately. Engineers rely on them in computer-aided engineering applications. Accelerated computing using GPUs continues to expand.

CUDA also plays a key role in modern AI: the NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks.

So, What Is CUDA?

Even with this broad and expanding interest, as I travel across the United States educating researchers and students about the benefits of GPU acceleration, I routinely get asked the question “what is CUDA?”

CUDA is often mistaken for a programming language or an API. It’s more than that. CUDA is a parallel computing platform and programming model that makes using a GPU for general-purpose computing simple and elegant. The developer still programs in familiar C, C++, Fortran, or one of an ever-expanding list of supported languages, and incorporates extensions of these languages in the form of a few basic keywords.

These keywords let the developer express massive amounts of parallelism and direct the compiler to the portion of the application that maps to the GPU.

A simple example of code is shown below. It’s written first in plain “C” and then in “C with CUDA extensions.”
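As a minimal sketch of that comparison, assume the operation is SAXPY (y = a*x + y). The first function is ordinary serial C. The other two are not CUDA itself: they are plain C that mimics how a CUDA kernel derives its element index from block and thread coordinates (`blockIdx.x * blockDim.x + threadIdx.x` in real CUDA code). The names `saxpy_serial`, `saxpy_thread`, and `saxpy_emulated` are hypothetical and chosen only for illustration.

```c
/* Plain C: a single loop walks every element in order. */
void saxpy_serial(int n, float a, const float *x, float *y)
{
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* The work one GPU thread would do. In CUDA this would be a __global__
 * function reading blockIdx.x, blockDim.x and threadIdx.x; here those
 * values are ordinary parameters (an assumption, so that the sketch
 * builds with any C compiler). */
void saxpy_thread(int n, float a, const float *x, float *y,
                  int block_idx, int block_dim, int thread_idx)
{
    int i = block_idx * block_dim + thread_idx; /* global element index */
    if (i < n)                                  /* guard the last, partial block */
        y[i] = a * x[i] + y[i];
}

/* Stand-in for a kernel launch: visits every (block, thread) pair one
 * after another. A GPU would run many of these threads at once. */
void saxpy_emulated(int n, float a, const float *x, float *y,
                    int num_blocks, int block_dim)
{
    for (int b = 0; b < num_blocks; ++b)
        for (int t = 0; t < block_dim; ++t)
            saxpy_thread(n, a, x, y, b, block_dim, t);
}
```

Calling `saxpy_emulated(n, 2.0f, x, y, (n + 255) / 256, 256)` gives the same result as `saxpy_serial(n, 2.0f, x, y)`. In actual CUDA code the two nested loops disappear: the launch is written with the `<<<blocks, threads>>>` syntax and the hardware supplies the indices.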

More CUDA Resources

Learning how to program using the CUDA parallel programming model is easy. We have webinars and self-study exercises at the CUDA Developer Zone website.

You can contact us at AccelerEyes (support@accelereyes.com) and we’re happy to help.

mebersole

You can try posting your questions on stackoverflow.com using the CUDA tag (or the NVIDIA forums when they’re back up). Lots of NVIDIA engineers and other CUDA experts monitor the forums.

mebersole

Launching 4096 blocks would work on any CUDA-capable device. In fact you can have up to 65535 blocks in the x-direction for compute capability up to 2.x, and 2^31-1 for 3.0! The CUDA driver will just keep scheduling blocks until all have been executed on the hardware.
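To put numbers on those limits, here is a rough sketch in plain C. It assumes a one-dimensional launch with 256 threads per block; that figure and the helper names `blocks_needed`, `max_grid_x`, and `launch_fits` are illustrative assumptions, not part of any CUDA API. The limits themselves are the ones quoted above.

```c
/* Blocks needed so that blocks * threads_per_block >= n (ceiling division). */
long long blocks_needed(long long n, int threads_per_block)
{
    return (n + threads_per_block - 1) / threads_per_block;
}

/* Largest grid size in the x-direction, using the limits quoted above:
 * 65535 for compute capability below 3.0, 2^31 - 1 from 3.0 on. */
long long max_grid_x(int cc_major)
{
    return (cc_major >= 3) ? 2147483647LL : 65535LL;
}

/* Returns 1 if a one-dimensional launch covering n elements fits in a
 * single grid on a device with the given major compute capability. */
int launch_fits(long long n, int threads_per_block, int cc_major)
{
    return blocks_needed(n, threads_per_block) <= max_grid_x(cc_major);
}
```

With 256 threads per block, an array of 2^20 elements needs exactly 4096 blocks, which fits on a compute capability 2.x device. An array of 2^24 elements would need 65536 blocks, one more than the 2.x limit in the x-direction, but it fits comfortably from 3.0 on.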

Agastya Parikh

Thank you everyone!

abhinole

…ahhh….I use Fermi with CC 2.0….and I just realized that even I can launch 4096 blocks…..Thanks for your reply….