GPGPU, or general-purpose GPU programming, changed dramatically and became far more accessible when NVIDIA introduced the CUDA architecture for its GPUs in 2006. Since then, problems in engineering, economics, or any other field no longer need to be recast as graphics problems before the GPU can process them. Developers can write code much as they would in C or C++, without worrying about wrappers or abstractions that hide the graphics pipeline behind the scenes. The developer still needs a working understanding of the underlying hardware structure, though not of every nut and bolt, to harness the power of parallel computing efficiently on CUDA-enabled NVIDIA GPUs. This section is mainly concerned with the basics of the CUDA architecture and the CUDA C/C++ extension for programming NVIDIA GPUs. It also explores many concepts in CUDA programming.