Abstract

On the road to Exascale computing, both performance and power consumption must be addressed at several levels, from the system down to the processor. The processor is the main contributor to serial node performance and also accounts for most of the energy consumed by the system. It is therefore important to have tools that analyze performance and energy efficiency simultaneously at the processor level.
Performance tools have allowed analysts to understand, and even improve, the performance of an application running on a system. Now that recent processors can measure their own power consumption, performance tools can extend their set of metrics with energy-related ones and correlate the source code with both its performance and its energy efficiency.
In this paper, we present a performance tool that has been extended to gather such energy metrics. The output of this tool is passed to a mechanism called folding, which produces detailed metrics and source code references from coarse-grained sampling. We have applied the tool to multiple serial benchmarks as well as parallel applications, and we demonstrate its usefulness by locating hot spots in terms of both performance and power consumption.