PERFORMANCE PREDICTION OF OPENMP PROGRAMS

Abstract

OpenMP, a directive-based API, supports multithreaded programming on shared-memory
systems. Since OpenMP pragmas, directives, function calls, and environment
variables are platform-independent, the API is highly portable. OpenMP lets the
programmer give the compiler the hints it needs to parallelize the given code,
instead of focusing on the low-level details of the hardware.
Performance prediction methodologies enable estimation of performance factors
(execution time, cache misses, effect of a compiler's optimizations) prior to the actual
execution process. Existing approaches involve mathematical modeling of these
performance factors. In order to achieve the best performance using OpenMP, it is
critical to analyze issues such as efficient cache utilization and optimal distribution
of the workload among the CPUs.
We attempt to solve the problem of efficient per-thread workload distribution by
predicting an optimal combination of an OpenMP scheduling policy and a chunk
size (we call this combination a "class"). We employed PAPI hardware counters, the R
statistical package, the machine learning software WEKA, the TAU toolkit, and the OpenMP
collector API. A set of heuristics was applied to analyze the data to find
similarities between snippets of code pertaining to the same class. We developed a
framework for taking measurements to gather the training data for the predictive
model being constructed.
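
To make the notion of a "class" concrete, the following is a minimal sketch in C, not the framework described in this work. It assumes a class can be represented as a (scheduling policy, chunk size) pair and applied to a loop through the standard `omp_set_schedule` call together with a `schedule(runtime)` clause. The names `sched_policy`, `sched_class`, `apply_class`, and `triangular_sum` are hypothetical and chosen only for illustration. The code also compiles without OpenMP enabled, in which case the loop simply runs serially.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical representation of a "class":
 * an OpenMP scheduling policy paired with a chunk size. */
typedef enum { SCHED_STATIC, SCHED_DYNAMIC, SCHED_GUIDED } sched_policy;

typedef struct {
    sched_policy policy;
    int chunk;            /* values below 1 request the implementation default */
} sched_class;

/* Select the class used by subsequent loops declared with schedule(runtime). */
static void apply_class(sched_class c)
{
#ifdef _OPENMP
    omp_sched_t kind = omp_sched_static;
    if (c.policy == SCHED_DYNAMIC)
        kind = omp_sched_dynamic;
    else if (c.policy == SCHED_GUIDED)
        kind = omp_sched_guided;
    omp_set_schedule(kind, c.chunk);
#else
    (void)c;              /* no OpenMP runtime: the loop runs serially */
#endif
}

/* Illustrative imbalanced workload: iteration i sums a[0..i], so later
 * iterations cost more and the choice of class affects load balance. */
static double triangular_sum(const double *a, int n, sched_class c)
{
    double total = 0.0;
    apply_class(c);
    #pragma omp parallel for schedule(runtime) reduction(+:total)
    for (int i = 0; i < n; i++) {
        double row = 0.0;
        for (int j = 0; j <= i; j++)
            row += a[j];
        total += row;
    }
    return total;
}
```

A caller would time `triangular_sum` under several candidate classes, for example `(sched_class){SCHED_STATIC, 0}` against `(sched_class){SCHED_DYNAMIC, 4}`, and record which one runs fastest; such a label is the kind of target a predictive model of the class would be trained on.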
We evaluate our approach using several case studies from application domains
such as Dense Linear Algebra, Structured, and Unstructured Grids. The results
demonstrate that there is a set of parameters that influences the choice of the "class"
for performance prediction.