Most I/O- and data-intensive scientific applications access multiple layers of the parallel I/O software stack during execution. Typical I/O requests from these applications involve high-level I/O libraries such as Parallel netCDF and HDF5, the MPI I/O library, and parallel file systems. To design and implement parallel applications that exercise such a parallel I/O software stack, one must understand how I/O calls interact across the entire stack. This understanding in turn helps one characterize I/O behavior and thus exploit the potential performance of the different layers of the storage hierarchy. In this paper, we propose a Pin-based dynamic instrumentation framework for understanding the complex interactions of I/O from the application, through multiple I/O libraries, to the underlying parallel file system, without any modification of the code. We also present the overheads incurred by the proposed dynamic instrumentation tool. When our test application is executed with process counts of 32, 64, 128, and 256, the observed overheads are 38.7\%, 66\%, 68.9\%, and 78.4\%, respectively.