I/O Performance Challenges at Leadership Scale

Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis

Date Published

07/2009

Publisher

ACM

Conference Location

Portland, Oregon

Other Numbers

ANL/MCS-P1648-0709

Abstract

Today's top HPC systems run applications with hundreds of thousands of processes, contain hundreds of storage nodes, and must meet massive I/O requirements for capacity and performance. These leadership class systems face daunting challenges to deploying scalable I/O systems. In this paper we present a case study of the I/O challenges to performance and scalability on Intrepid, the IBM Blue Gene/P system at the Argonne Leadership Computing Facility. Listed in the top 5 fastest supercomputers of 2008, Intrepid runs computational science applications with intensive demands on the I/O system. We investigate the challenges to I/O at this scale, and evaluate the performance of the I/O hardware and software deployed on Intrepid. We show the file and storage system sustain high performance under varying workloads as the applications scale with the number of processes.