OpenMP [14] is the dominant programming model for shared-memory parallelism in C, C++ and Fortran due to its easy-to-use directive-based style, portability and broad support by compiler vendors. Compute-intensive application regions are increasingly being accelerated using devices such as GPUs and DSPs, and a programming model with similar characteristics is needed here. This paper presents extensions to OpenMP that provide such a programming model. Our results demonstrate that a high-level programming model can provide accelerated performance comparable to that of hand-coded implementations in CUDA.

Email address protected by JavaScript. Activate javascript to see the email.

We use cookies to improve our service for you. You can find more information in our data protection declaration. By continuing to use our site, you accept our use of cookies and Privacy Policy.OkPrivacy policy