Introduction

Separation of System-Awareness

HPC system architectures are becoming more complicated and more diverse. As complexity grows, it becomes harder to exploit the full potential of a particular system without performance optimizations specific to that system. That is, an application code must be thoroughly optimized and specialized for a specific platform to achieve high performance. At the same time, growing diversity increases the number of system architectures that have to be considered during the life of an application. Together, these trends force programmers to invest ever more time and effort in HPC application development and maintenance. To overcome this difficulty, we are developing an extensible programming framework, Xevolver, which separates system-specific performance optimizations from application codes and thereby facilitates HPC application migration.

Xevolver framework

We are developing Xevolver to enable users to express their own code optimizations for the specific demands of individual systems and individual applications. Instead of modifying an application code by hand, users can use Xevolver to optimize and specialize it for a particular system. In the Xevolver framework, translation rules can be defined in an external file. Thus, Xevolver separates system-specific and/or application-specific code optimizations from application codes. We are also developing numerical libraries and domain-specific tools with Xevolver to help enhance the performance portability of HPC applications.

User-defined code transformation

There are various ways to abstract system complexities, but developers still need to modify an application code to meet the specific demands of individual systems and applications. Such modifications are likely to be scattered over the whole application code, and they make it difficult to maintain the code while keeping performance portability. Hence, we need an additional abstraction layer for those code modifications.
According to our observation, there are repetitive patterns in the code modifications. Thus, we can assume that those code modifications could be replaced with a smaller number of code transformations. Under this assumption, we are developing the Xevolver code transformation framework, which enables users to define system-specific and/or application-specific code transformations as translation rules. Those translation rules are defined separately from application codes. By using different rules for different systems, a single application code can be transformed into an appropriate version for each system.
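The idea of keeping rules apart from the application code can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions and does not show Xevolver's actual rule format or API: the `!MARK hot_loop` marker, the `RULES` table, and the `transform` function are hypothetical names, and plain text substitution stands in for a real code transformation.

```python
import re

# Hypothetical application code: it carries one generic marker and
# no system-specific directives.
APP_SOURCE = """\
subroutine axpy(n, a, x, y)
  !MARK hot_loop
  do i = 1, n
    y(i) = y(i) + a * x(i)
  end do
end subroutine
"""

# Hypothetical rule sets, one per target system, kept apart from the
# application code. Each rule is a (pattern, replacement) pair.
RULES = {
    "multicore": [(r"^([ \t]*)!MARK hot_loop$", r"\1!$omp parallel do")],
    "accelerator": [(r"^([ \t]*)!MARK hot_loop$", r"\1!$acc parallel loop")],
}


def transform(source, rules):
    """Apply each rule in order and return the transformed source text."""
    for pattern, replacement in rules:
        source = re.sub(pattern, replacement, source, flags=re.MULTILINE)
    return source


if __name__ == "__main__":
    # The same application code yields a different version per system.
    for system, rules in RULES.items():
        print(f"--- {system} ---")
        print(transform(APP_SOURCE, rules))
```

Supporting another system in this sketch means adding one more entry to `RULES`; `APP_SOURCE` itself stays unchanged.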

HPC refactoring

HPC application codes are often optimized and specialized for their target systems. However, some code optimizations for one HPC system may lead to performance degradation on another system. If an HPC application code is directly modified for system-specific optimizations, those optimizations could prevent the code from achieving high performance on future systems. Therefore, we explore the methodology and supporting tools for “HPC refactoring,” which is code refactoring that separates system-specific code optimizations from an existing HPC code, using Xevolver’s abstractions.

Hierarchical abstraction of computing systems

Abstraction technologies such as numerical libraries are essential for hiding complicated system configurations from application developers. We are developing numerical libraries to hierarchically abstract the HPC system. Our numerical libraries are optimized for multiple platforms, such as GPU, MIC, and CPU cluster systems, so that application developers can use different implementations through common interfaces, resulting in high performance portability. The numerical libraries are designed to support as many data structures as possible to cover various use cases while achieving high performance. We also investigate autotuning technologies that adapt the implementations of numerical libraries to similar platforms in order to achieve high performance portability.