ABSTRACT: Scaling computations on emerging massive-core supercomputers
is a daunting task, and significantly lagging system I/O
capabilities further degrade
applications’ end-to-end performance. The I/O bottleneck
often negates potential performance benefits of assigning
additional compute cores to an application. In this paper,
we address this issue via a novel functional partitioning
(FP) runtime environment that allocates cores to specific
application tasks — checkpointing, de-duplication, and
scientific data format transformation — so that the deluge
of cores can be brought to bear on the entire gamut of
application activities. The focus is on using the extra
cores to support HPC application I/O activities and on
leveraging solid-state disks in this context. For example, our
evaluation shows that dedicating one core on an oct-core
machine to checkpointing and its assist tasks using FP can
improve the overall execution time of a FLASH benchmark on
80 and 160 cores by 43.95% and 41.34%, respectively.