darshan

Darshan has been turned on by default on Hopper and Edison

Description

Darshan is a lightweight I/O profiling tool capable of profiling POSIX I/O, MPI-IO, and HDF5 I/O. We encourage all users to turn on Darshan for their applications running on Hopper. Darshan not only helps users identify I/O bottlenecks and improve performance, but also helps NERSC better understand the I/O usage of its users and shape its future plans.

Availability

The Darshan module is loaded by default for all Hopper and Edison users. Simply recompile your code to allow Darshan to collect statistics.

Examining the Darshan Results

Completed Jobs Page

On the Completed Jobs page, a Darshan summary is automatically generated for every job logged by Darshan. An example of a job's I/O summary is shown below:

Note that each aprun command produces a separate log file. The raw log files are kept for 1 month before being deleted from the central location. The log summary is stored in a database for 24/7 access via the web.

Understanding Results

You can use the darshan-parser tool to analyze your log, for example to show how much data was read or written in a run.
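As a rough illustration of totaling the bytes read and written, the Python sketch below sums the byte counters found in text output that has already been saved from darshan-parser. It rests on several assumptions not stated on this page: that comment lines start with "#", that data lines are whitespace-separated, that the relevant counter names end in BYTES_READ or BYTES_WRITTEN (for example CP_BYTES_READ), and that each counter name is immediately followed by its integer value. The function name sum_byte_counters is hypothetical. Check the header of your own darshan-parser output before relying on it.

```python
from collections import defaultdict


def sum_byte_counters(lines):
    """Sum *BYTES_READ / *BYTES_WRITTEN counters in darshan-parser text output.

    Assumption: each data line contains a counter name immediately
    followed by its integer value; lines starting with '#' are comments.
    Returns a dict mapping counter name -> total bytes.
    """
    totals = defaultdict(int)
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Look at each (token, next token) pair for a byte counter.
        for name, value in zip(fields, fields[1:]):
            if name.endswith(("BYTES_READ", "BYTES_WRITTEN")):
                try:
                    count = int(value)
                except ValueError:
                    break  # value is not an integer; ignore this line
                # Assumption: negative values mark unset counters.
                if count > 0:
                    totals[name] += count
                break
    return dict(totals)
```

Totals are kept per counter name rather than merged into a single number, so that counters reported by different I/O layers are not added together by accident. To use the sketch, open the saved text file and pass the file object to sum_byte_counters.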

Case Studies/Frequent I/O Problems

Overhead of Darshan

The overhead of Darshan is negligible for most jobs. "Overhead" is defined as the time that Darshan uses to communicate the results and write the log file. The plot below shows the overhead of Darshan with respect to job concurrency. The blue markers show the overhead when each MPI task reads or writes its own file. The red markers show the overhead when all MPI tasks read or write the same file. For big jobs, we suggest using shared-file I/O (e.g. MPI-IO) instead of file-per-process I/O. The overhead of Darshan should be unnoticeable for jobs with fewer than 10,000 MPI tasks.

Contact

If you have any questions about Darshan at NERSC, please email consult@nersc.gov.