Authors

Abstract

The node-level architecture of exascale systems may differ significantly from current designs.
Heterogeneous cores will be severely power- and heat-constrained. They may have large amounts of
2.5D or 3D stacked DRAM and even larger non-volatile storage. Simulating the large design space of
these systems is important for finding the best candidates for further study.
Current simulation techniques offer good visibility into low-level details but are too slow and too
memory-intensive for such explorations.

We propose a simulation model based on the idea that giving up insight into some low-level details can
greatly increase the simulator's performance. This allows both the exploration of a broader design space and
the analysis of larger and more representative workloads. Figure 1 presents our simulation methodology, which
uses existing hardware, along with fast performance- and power-estimation models, to simulate HPC
benchmarks at nearly full speed. This can yield better insights into exascale node designs by allowing faster
turnaround of tests at different design points and more workloads that better represent exascale applications.