Advanced Computing, Mathematics and Data Division Research Highlights

February 2014

Powering Performance at the Exascale

Optimal performance within stringent power and reliability requirements is the main design criterion for exascale systems

Results: The Performance and Architecture Laboratory (PAL) at Pacific Northwest National Laboratory (PNNL) takes on power and energy problems affecting high-performance computing (HPC), especially those that could impede practical performance at the exascale. Using methods that range from processor architecture and system integration, through software and programming models, to performance and power modeling, PAL scientists, who are part of PNNL's HPC Group, have developed power-aware algorithms that use an accurate per-core proxy power sensor model to estimate the active power of each core. Their methods have also produced the first workload-specific quantitative power modeling capability that accurately captures workload phases, their impact on power consumption, and the effects of system architecture and processor clock speeds.

PNNL’s Performance and Architecture Laboratory develops an integrated approach to co-design of the hardware-software stack through integrated performance and power modeling.

Why it Matters: Understanding how much power computing system components consume at any given time is critical to building self-aware, self-adaptive system software that can manage changing power characteristics, for example by shifting power to the computationally demanding processes within an application to save power without hindering performance. Such software helps ensure that next-generation supercomputers will have the power and energy efficiency required to deliver practical and sustainable exaflops performance.
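The idea of shifting power toward demanding processes can be illustrated with a minimal sketch. It is not PAL's algorithm: the function name `shift_power`, the per-core floor, and the proportional sharing rule are all assumptions made for illustration. Given estimated per-core power demands and a fixed total budget, it guarantees each core a small floor and divides the remaining budget in proportion to each core's demand above that floor.

```python
def shift_power(demand_watts, budget_watts, floor_watts=1.0):
    """Split a fixed power budget across cores (hypothetical policy).

    demand_watts: estimated active power each core would like to draw.
    budget_watts: total power available to all cores.
    floor_watts:  minimum cap every core receives (an assumed parameter).
    Returns a list of per-core power caps in watts.
    """
    n = len(demand_watts)
    if budget_watts < n * floor_watts:
        raise ValueError("budget cannot cover the per-core floor")

    # No core is treated as wanting less than the floor.
    demand = [max(d, floor_watts) for d in demand_watts]

    # If everything fits, every core simply gets what it asks for.
    if sum(demand) <= budget_watts:
        return demand

    # Otherwise give each core the floor, then share what is left in
    # proportion to how far each core's demand exceeds the floor.
    spare = budget_watts - n * floor_watts
    excess = [d - floor_watts for d in demand]
    total_excess = sum(excess)  # > 0 here because sum(demand) > budget
    return [floor_watts + spare * e / total_excess for e in excess]


if __name__ == "__main__":
    # Two busy cores and two mostly idle cores under a 16 W budget.
    print(shift_power([10.0, 2.0, 8.0, 2.0], budget_watts=16.0))
```

In this toy policy the busy cores receive most of the budget while the lightly loaded cores are held near the floor. A real runtime would also have to account for how performance responds to each cap, which this sketch ignores.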

Methods: Responding to the U.S. Department of Energy's (DOE) power consumption goal of 20 to 25 MW for future exascale computing systems, the PAL group treats power as a resource, akin to memory modules, and focuses on system software power management to identify opportunities for power savings. Their accurate per-core proxy power sensor model has shown that processes in the same application can exhibit different power profiles and may alternate between high- and low-power phases independently, offering opportunities to shift power without diminishing performance. The PAL group is also analyzing the energy cost of data movement relative to an application's total energy consumption and identifying the dominant component of data movement energy for current and future parallel applications. Already, their research has indicated that the energy cost of data movement affects each application differently (ranging from 18 to 40 percent in their experiments). Moreover, the energy spent on resolving data dependencies, speculation, and out-of-order scheduling accounted for up to 35 percent of the total dynamic energy, indicating a need for simpler, more energy-efficient processor core designs. Along with their application-specific power modeling capability, PAL researchers also developed Energy Templates that capture per-core idle/busy states and the time each core is expected to remain in those states. This allows runtime software to determine when to employ power-saving features, such as dynamic voltage and frequency scaling, without negatively impacting performance.
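To make the role of an Energy Template more concrete, the following sketch shows how runtime software might use per-core state and expected duration to decide when frequency scaling is worthwhile. The highlight does not describe the actual template format, so the `TemplateEntry` record, the `pick_frequency` function, the frequencies, the switching cost, and the safety margin are all hypothetical. The rule used here is a simple break-even test: scale down only when a core is idle and is expected to stay idle much longer than it takes to switch frequency down and back up.

```python
from dataclasses import dataclass


@dataclass
class TemplateEntry:
    """One hypothetical Energy Template record for a single core."""
    core: int
    state: str         # "idle" or "busy"
    expected_s: float  # time the core is expected to remain in this state


def pick_frequency(entry, f_max_ghz=2.4, f_min_ghz=1.2,
                   switch_cost_s=1e-4, margin=10.0):
    """Choose a clock frequency for one core (illustrative policy only).

    switch_cost_s: assumed time for one frequency transition.
    margin:        how many round trips (down and back up) the idle
                   period must cover before scaling is considered safe.
    """
    round_trip_s = 2 * switch_cost_s
    if entry.state == "idle" and entry.expected_s >= margin * round_trip_s:
        return f_min_ghz  # long idle phase: lower frequency to save power
    return f_max_ghz      # busy or briefly idle: keep full speed


if __name__ == "__main__":
    template = [
        TemplateEntry(core=0, state="busy", expected_s=0.200),
        TemplateEntry(core=1, state="idle", expected_s=0.050),
        TemplateEntry(core=2, state="idle", expected_s=0.0005),
    ]
    for entry in template:
        print(entry.core, entry.state, pick_frequency(entry), "GHz")
```

With these assumed values, only core 1 is scaled down: core 0 is busy, and core 2's idle period is too short to repay the cost of changing frequency.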

What's Next? To accommodate the increasing complexity and ensure the practicality of future exascale applications and systems, resources that address performance and power consumption must be an essential part of system and application design. To that end, the PAL group continues to develop and integrate tools and techniques that capture and quantify power and performance, both to exploit energy-saving opportunities and to provide mechanisms for dynamically optimizing applications as they execute.