To deliver the promise of Moore's Law to the end
user, compilers must make decisions that are intimately tied
to a specific target architecture. As engineers add architectural
features to increase performance, systems become
harder to model, and it therefore becomes harder for a compiler
to make effective decisions.
Machine-learning techniques may be able to help compiler
writers model modern architectures. Because learning techniques
can effectively make sense of high-dimensional spaces,
they can be a valuable tool for clarifying and discerning
complex decision boundaries. In our work we focus on loop
unrolling, a well-known optimization for exposing instruction-level
parallelism. Using the Open Research Compiler
as a testbed, we demonstrate how one can use supervised
learning techniques to model the appropriateness of loop
unrolling.
We use more than 1,100 loops — drawn from 46 benchmarks
— to train a simple learning algorithm to recognize
when loop unrolling is advantageous. The resulting classifier
can predict with 88% accuracy whether a novel loop
(i.e., one that was not in the training set) benefits from
loop unrolling. Furthermore, we can predict the optimal or
nearly optimal unroll factor 74% of the time. We evaluate
the ramifications of these prediction accuracies using the
Open Research Compiler (ORC) and the Itanium 2 architecture.
The learned classifier yields a 6% speedup (over
ORC’s unrolling heuristic) for SPEC benchmarks, and a 7%
speedup on the remainder of our benchmarks. Because the
learning techniques we employ run very quickly, we were
able to exhaustively identify the four loop characteristics
most salient for deciding when unrolling is beneficial.