Fragment processing requires floating point math, as the fragments are colored and filtered to become pixels. This stage is where muchof the interesting compute for CFD can happen, as the parallel floating point processors can be repurposed with either fragment shaders or special purpose languages to do non-graphics floating point math.

Fragment processing requires floating point math, as the fragments are colored and filtered to become pixels. This stage is where muchof the interesting compute for CFD can happen, as the parallel floating point processors can be repurposed with either fragment shaders or special purpose languages to do non-graphics floating point math.

-

===Performance===

+

==Performance==

NVidia has been actively promoting GPGPU computing in recent years. They introduced CUDA in 2007, giving programmer direct access to the GPU. NVidia maintains a [http://www.nvidia.com/object/cuda_showcase_html.html CUDA community showcase] on their CUDA website showing the performance boost to a variety of applications when making use of the GPU.

NVidia has been actively promoting GPGPU computing in recent years. They introduced CUDA in 2007, giving programmer direct access to the GPU. NVidia maintains a [http://www.nvidia.com/object/cuda_showcase_html.html CUDA community showcase] on their CUDA website showing the performance boost to a variety of applications when making use of the GPU.

When programming GPGPU systems, it is important to remember that there is significant overhead involved in transferring data between the CPU and the GPU. Serial programs with memory requirements beyond those available on the GPU will not see the same dramatic performance enhancements as programs with low memory transfer overhead.

When programming GPGPU systems, it is important to remember that there is significant overhead involved in transferring data between the CPU and the GPU. Serial programs with memory requirements beyond those available on the GPU will not see the same dramatic performance enhancements as programs with low memory transfer overhead.

-

===Languages===

+

==Languages==

There exist several languages which support direct control of the GPU. OpenCL, Microsoft's DirectCompute, and Nvidia's CUDA are good examples of these. For more information about how to program on the GPU, see the [http://en.wikipedia.org/wiki/GPGPU Wikipedia] site on GPGPU computing. Coding syntax is similar to C/C++ programming syntax.

There exist several languages which support direct control of the GPU. OpenCL, Microsoft's DirectCompute, and Nvidia's CUDA are good examples of these. For more information about how to program on the GPU, see the [http://en.wikipedia.org/wiki/GPGPU Wikipedia] site on GPGPU computing. Coding syntax is similar to C/C++ programming syntax.

Revision as of 06:11, 18 March 2010

Contents

Introduction

GPGPU is an acronym for General Purpose Graphic Processor Unit. A GPGPU is any graphics processor being used for general computing beyond graphics. GPUsare widely available, and often targeted at the computer gaming industry. Graphics workloads are very parallel, and so GPUs developed as large-scale parallel computation machines. Originally GPGPU processing was done by tricking the GPU by disguising computation loads as graphic loads. In recent years, GPU manufacturers have been actively encouraging GPGPU computing with the release of specialized languages which support GPGPU commands. GPUs incorporate many more computational cores than their equivalent CPUs, and so the performance of parallel operations can be greatly enhanced. Programming in parallel on a GPU has the same justification given for parallel computing in general.

Application to CFD

GPGPU computing offers large amounts of compute power, which can be tapped for parallel components of CFD algorithms, while the CPU performs the serial portions of the algorithm. GPGPU languages also support data-parallel computation, similar to vector processors. In short, modern GPUs provide raw computational power orders of magnitude larger than a CPU and can fit inside a single computer case.

Graphics Architecture

This section is written by a non english speaker; please excuse the bad grammar (And correct it!).

A GPU has a main memory, many stages and parallel processors. Traditional GPUs have a linear pipeline with several distinct stages: application, command, geometry, rasterization, fragment, and display. Intel's project Larrabee promises a reconfigurable graphics pipeline, with many of the traditional steps being handled in software. Such a development would expose even more of the GPUs compute power to parallel programmers.

Traditional Pipeline

A traditional pipeline will have three main computation stages: geometry, rasterization, and fragment. Graphics is traditionally done with triangles, and a GPU will operate on a batch of triangle verticies to first create fragments, which will help create the pixels that end up on the monitor.

Geometry

Vertex processing is handled in the geometry step. Geometry from the CPU is transformed based on the vertex shaders (programs) written to the GPU. These processors specialized in matrix transformations. Common operations include projecting 3D coordinates onto 2D screen coordinates. The closest analogue would be a vector or quaternion processor since each vector operation takes a series of components which represent a triangle vertex. Lagrangian frame computations might be well suited to vertex shaders.

Rasterization

Rasterization takes the transformed vectors from the geometry step and creates fragments from the geometry. The easiest way to think of rasterization, is "chunking" a large triangle into many fragments. This stage is typically done with fixed function specialized hardware.

Fragment

Fragment processing requires floating point math, as the fragments are colored and filtered to become pixels. This stage is where muchof the interesting compute for CFD can happen, as the parallel floating point processors can be repurposed with either fragment shaders or special purpose languages to do non-graphics floating point math.

Performance

NVidia has been actively promoting GPGPU computing in recent years. They introduced CUDA in 2007, giving programmer direct access to the GPU. NVidia maintains a CUDA community showcase on their CUDA website showing the performance boost to a variety of applications when making use of the GPU.

When programming GPGPU systems, it is important to remember that there is significant overhead involved in transferring data between the CPU and the GPU. Serial programs with memory requirements beyond those available on the GPU will not see the same dramatic performance enhancements as programs with low memory transfer overhead.

Languages

There exist several languages which support direct control of the GPU. OpenCL, Microsoft's DirectCompute, and Nvidia's CUDA are good examples of these. For more information about how to program on the GPU, see the Wikipedia site on GPGPU computing. Coding syntax is similar to C/C++ programming syntax.

Additionally, if a programmer wants to use the graphics card to compute, and then display the results of their CFD equations. OpenGL hooks into C++ to enable that, and Microsoft's DirectX performs a similar role.

All of these languages have active communities participating in their further development. There are many code samples and tools on the internet to help a programmer get started with GPGPU computing. Go for it!