darshan issues
https://xgitlab.cels.anl.gov/darshan/darshan/issues

https://xgitlab.cels.anl.gov/darshan/darshan/issues/231
update lustre module to support DNE, PFL
2017-08-08 | Glenn K. Lockwood | enhancement

Lustre has recently incorporated features, including distributed namespace (DNE) and progressive file layouts (PFL), which have performance implications beyond the stripe width and size. There are probably new ioctls or llapi calls to query this information, and it should be stored in the Lustre module data for each file. Specifically:
1. PFL changes the core assumptions we made about stripe layouts, so this must be revisited to ensure that the Lustre module isn't just storing nonsense on PFL-enabled Lustre file systems (>= Lustre 2.10)
2. Examining which llapi calls are available will require much more deliberate autoconf macros. Do we want to resort to ioctls (if available) for missing llapi calls? And if so, how will we detect which ioctls are supported by the file system at configure time?
3. We should store the MDT to which each file is assigned for both DNE1 and DNE2. At present this information is not stored at all because the Lustre module carries an implicit assumption that there is only one MDT per file system.
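As a starting point for item 3, a minimal sketch of querying a file's MDT index from a Lustre client, assuming the `llapi_file_fget_mdtidx()` call in liblustreapi (availability varies by Lustre version, and the helper shown is hypothetical, not Darshan code):

```
#include <fcntl.h>
#include <unistd.h>
#include <lustre/lustreapi.h>

/* Query the index of the MDT that holds a given file's metadata, which
 * the Lustre module could record per file under DNE. Returns 0 on success. */
static int get_mdt_index(const char *path, int *mdt_idx)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* assumed call: fills in the MDT index for the open file descriptor */
    int rc = llapi_file_fget_mdtidx(fd, mdt_idx);
    close(fd);
    return rc;
}
```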

https://xgitlab.cels.anl.gov/darshan/darshan/issues/232
add support for DataWarp accounting API
2017-08-08 | Glenn K. Lockwood | enhancement

Recent versions of Cray Linux include a DataWarp telemetry API that provides information analogous to what the Lustre module provides. Specifically, `dw_get_stripe_configuration(3)` returns:
* `int stripe_size`
* `int stripe_width`
* `int starting_index`
And the DataWarp servers to which `starting_index` refers can be divined from `/proc/mounts` or other kernel interfaces.
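A minimal sketch of querying this API, assuming the signature documented for `dw_get_stripe_configuration(3)` (a file descriptor in, three `int` values out); the helper and its error handling are illustrative, not Darshan code:

```
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <datawarp.h>   /* Cray DataWarp user API, assumed available on CLE */

/* Query DataWarp stripe parameters for a file; a real Darshan module
 * would stash these in a file record rather than printing them. */
static int query_dw_stripe(const char *path)
{
    int stripe_size, stripe_width, starting_index;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* assumed signature: fd in, three int values out */
    int rc = dw_get_stripe_configuration(fd, &stripe_size, &stripe_width,
                                         &starting_index);
    close(fd);
    if (rc != 0)
        return rc;  /* not a DataWarp-backed file, or API error */

    printf("stripe_size=%d stripe_width=%d starting_index=%d\n",
           stripe_size, stripe_width, starting_index);
    return 0;
}
```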
Furthermore, CLE 6.0UP04 now includes `dw_get_accounting_data_json(3)`, which returns aggregate server-side telemetry. While not broken out on a per-file basis, the data types returned may be of interest at some point. An example output is:
```
{
"realm": {
"namespaces": [
{
"stage_bytes_written": 0,
"namespace_id": 378,
"stripe_width": 2,
"file_size_limit": 0,
"stage_bytes_read": 0,
"stripe_size": 8388608,
"files_create_threshold": 0,
"substripe_width": 12,
"substripe_size": 8388608,
"files_created": 0,
"num_data_created": 0,
"bytes_read": 0,
"max_offset_written": 0,
"bytes_written": 0,
"max_offset_read": 0
}
],
"fragments": [
{
"fs_capacity": 0,
"files_created": 0,
"capacity_used": 0,
"capacity_max": 0,
"max_window_write": 86400,
"write_high_water": 0,
"server_hostname": "c0-0c0s6n2",
"write_moving_avg": 0,
"write_limit": 1730066513920,
"bytes_written": 0,
"bytes_read": 0
},
{
"fs_capacity": 172920012800,
"files_created": 0,
"capacity_used": 34369536,
"capacity_max": 34369536,
"max_window_write": 86400,
"write_high_water": 0,
"server_hostname": "c0-0c0s6n1",
"write_moving_avg": 0,
"write_limit": 1730066513920,
"bytes_written": 0,
"bytes_read": 0
}
],
"realm_id": 378,
"namespace_count": 1,
"server_count": 2
}
}
```
These sorts of data may be best collected by some other tool (just as we decided client-side file system counters should be).

https://xgitlab.cels.anl.gov/darshan/darshan/issues/233
Fortran with SGI-mpt-2.15 does not work with DARSHAN
2017-12-15 | Siddhartha Ghosh

It seems you do not intercept Fortran versions of mpi_init / mpi_finalize. Unlike MPICH, SGI-mpt does not have a separate Fortran library that in turn calls the C library. Do you have a suggestion for Fortran codes with SGI-mpt? If not, do you have any plans to add this support? Apparently this would not require a lot of change.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/234
Darshan hangs with MVAPICH2 on gpfs
2018-01-22 | Samuel Khuvis

I am running into an issue with Darshan using MVAPICH2 on GPFS. If I configure MVAPICH2 with the `--with-file-system=ufs+nfs` option then Darshan behaves correctly; however, if I configure with the `--with-file-system=ufs+nfs+gpfs` option then Darshan hangs, even for a simple [MPI hello world](/uploads/33c722be54ae4acabf1cd25fe40bd337/hello.c). The below screenshot from Totalview shows that it is hanging inside a call to `PMPI_File_write_at_all` during `darshan_core_shutdown`. We created a standalone [MPIIO test code](/uploads/35e140bee7d82ddfcde4527ff95c640c/gpfs-write-test.c) to try to reproduce this issue, but our code does not hang. I would appreciate any insight into why Darshan is hanging.
![darshan-totalview](/uploads/4724c483720bff5334a5cbf6092e024b/darshan-totalview.png)

https://xgitlab.cels.anl.gov/darshan/darshan/issues/228
add per-OST metric tabulation to Lustre module
2018-01-11 | Glenn K. Lockwood | enhancement

Right now the Lustre module retains the OSTs over which each file is striped, but it does not track any more detailed metrics such as:
- number of bytes read/written to each OST
- number of ops issued to each OST
A recent post to the lustre-discuss mailing list pointed out that the `FIEMAP` ioctl, issued from any Lustre client, will return the exact OST corresponding to a file and an offset. Andreas went on to describe the placement algorithm as a simple three-step process (sketched in code after this list):
1. fetch the file layout via `llapi_layout_get_by_path()` or similar
2. `stripe_index = (logical file offset / stripe_size) % stripe_count`
3. `OST index = llapi_layout_ost_index_get(layout, stripe_index)`
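A minimal sketch of that mapping using the public `llapi_layout` API in newer Lustre clients; this assumes a plain (non-PFL) layout, and the helper name is hypothetical:

```
#include <stdint.h>
#include <lustre/lustreapi.h>   /* llapi_layout_* calls */

/* Map a logical file offset to the OST index that stores it,
 * following the three-step recipe above. Returns 0 on success. */
static int offset_to_ost(const char *path, uint64_t offset, uint64_t *ost_idx)
{
    struct llapi_layout *layout = llapi_layout_get_by_path(path, 0);
    if (!layout)
        return -1;

    uint64_t stripe_size = 0, stripe_count = 0;
    int rc = llapi_layout_stripe_size_get(layout, &stripe_size);
    if (rc == 0)
        rc = llapi_layout_stripe_count_get(layout, &stripe_count);

    if (rc == 0 && stripe_size > 0 && stripe_count > 0) {
        uint64_t stripe_index = (offset / stripe_size) % stripe_count;
        rc = llapi_layout_ost_index_get(layout, stripe_index, ost_idx);
    } else {
        rc = -1;
    }

    llapi_layout_free(layout);
    return rc;
}
```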
This would allow us to tell if certain OSTs are receiving a large re-read or modify workload. There are two major drawbacks though:
1. We would have to scope this very carefully. It may be possible to track a large chunk of the POSIX module counters on a per-file, per-OST basis which would be a tremendous amount of data. We would have to make choices as to which POSIX counters are worth tracking at such fine granularity, and which ones aren't.
2. The placement algorithm might change in future versions of Lustre, which would cause Darshan to report reasonable-looking but wrong data. The `llapi_*` calls are designed explicitly to work around this, but we would need to carefully measure the overheads of using these over the standard ioctls.
This isn't a high priority, but it'd give us a tremendous amount of insight for data-intensive applications that do a lot of modify-in-place or re-read.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/221
darshan randomly fails to detect username on mira/cetus
2018-01-11 | Shane Snyder | triage-bugs

This ticket was split off from issue #201 so that we could troubleshoot and track the issue separately for Mira/Cetus at the ALCF. See the original ticket for more details.
In summary, Darshan is randomly failing to detect the username associated with a job on these systems and instead generates a log file with the user's EUID for certain jobs. This issue occurs with both 2.x and 3.x versions of Darshan. Darshan attempts to determine the username using the following three methods, in order (sketched in code below):
1.) `cuserid()`
2.) `getenv("LOGNAME")`
3.) `geteuid()` (which returns the numeric euid of a user rather than the string user name)
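A minimal sketch of this fallback chain, assuming POSIX `cuserid(3)`, `getenv(3)`, and `geteuid(2)`; the buffer handling is illustrative, not Darshan's actual code:

```
#define _GNU_SOURCE     /* for the cuserid() declaration */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>

/* Resolve a user name for the log file name, falling back from
 * cuserid() to LOGNAME to the numeric effective uid. */
static void get_log_user(char *out, size_t len)
{
    char cuser[L_cuserid] = {0};

    /* method 1: cuserid() -- fails sporadically on Mira/Cetus */
    if (cuserid(cuser) != NULL && cuser[0] != '\0') {
        snprintf(out, len, "%s", cuser);
        return;
    }

    /* method 2: the LOGNAME environment variable */
    const char *logname = getenv("LOGNAME");
    if (logname && logname[0] != '\0') {
        snprintf(out, len, "%s", logname);
        return;
    }

    /* method 3: last resort, the numeric euid */
    snprintf(out, len, "%u", (unsigned)geteuid());
}
```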
We have confirmed that `getenv("LOGNAME")` never succeeds on Mira/Cetus, so if `cuserid()` fails sporadically, Darshan falls back to using EUIDs, causing the issue. We need to determine what is causing `cuserid()` to fail and whether there is any workaround. If not, we should look into whether there is another environment variable we can query for the user's username.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/220
darshan randomly fails to detect usernames on cori/edison
2018-01-11 | Shane Snyder | triage-bugs

This ticket was split off from issue #201 so that we could troubleshoot and track the issue separately for Cori/Edison systems at NERSC. See the original ticket for more details.
In summary, Darshan is randomly failing to detect the username associated with a job on these systems and instead generates a log file with the user's EUID for certain jobs. This issue occurs with both 2.x and 3.x versions of Darshan. Darshan attempts to determine the username using the following three methods, in order:
1.) `cuserid()`
2.) `getenv("LOGNAME")`
3.) `geteuid()` (which returns the numeric euid of a user rather than the string user name)
The `cuserid()` option has typically been disabled on Cray systems due to it causing strange errors in the past, but this has not been tested in around 5 years or so. So, our only chance of getting a string username is if the environment has `LOGNAME` defined to the username.
It turns out that if a user explicitly passes environment variables to `srun` using the `--export` switch, it wipes many of the environment variables that are propagated to compute nodes, including LOGNAME. Users who use the `--export` switch will therefore get log file names with their EUIDs rather than usernames.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/200
investigate use of new liblustre client API calls in the Lustre module
2018-01-11 | Shane Snyder | triage-feature-request, enhancement

We should look at the new liblustre API calls for gathering information on Lustre file layouts at the client. For instance, `llapi_layout_get_by_fd` can supposedly provide the detailed stripe information given just a file descriptor, rather than using the low-level ioctl method we are currently using. We don't have access to a Lustre file system with a new enough version (I *think* these calls showed up around Lustre 2.7?) to test them right now, however -- Edison and Cori each use some minor version of Lustre 2.5. We should ensure they are low-overhead enough to put in wrappers before trying to use them in the Lustre module directly. When we previously analyzed the `llapi_file_get_stripe` call, we determined it had too much overhead, but perhaps the new API overcomes that.
Additionally, we should look through the interface and see if there are any other calls we can leverage to provide additional data in the Lustre module.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/198
is "used_derrived" working?
2018-01-11 | Rob Latham

After looking at Edison log files, we find cases where pnetcdf or hdf5 is used and mpi-io is used, but no derived datatypes were used. This smells fishy and warrants further investigation... but we're in the middle of proposal madness and ATPESC preparation.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/197
stdio regression tests are failing on BG/Q
2018-01-11 | Shane Snyder | triage-bugs, defect

The stdio test cases used in our regression tests are working correctly on Jenkins and a personal workstation, but not on BG/Q.
Just glancing at the output, it looks like the issue may be that our wrapper functions aren't intercepting the fstream calls, but we need to investigate why.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/190
-e option not supported in some gnuplot versions
2018-12-14 | Philip Carns | triage-bugs

The darshan-job-summary.pl script relies on sending commands into gnuplot with the -e argument. It also checks that the version of gnuplot is at least 4.2.
However, it appears that gnuplot 4.2 patchlevel 3 (or maybe some particular builds of it) does not support this option. See the mailing list thread here, reported by Rene Salmon:
http://lists.mcs.anl.gov/pipermail/darshan-users/2016-May/000374.html
We need to see if we should refine our gnuplot version number check, or if there is some runtime test we should be doing to safely detect and report the problem.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/189
allow multiple modules to stack on the same wrappers
2018-01-11 | Shane Snyder | major-feature-request, enhancement

As it stands, each Darshan module is responsible for wrapping and instrumenting a set of I/O functions. The issue is that if multiple modules want to wrap the same I/O functions, there is no method to do that now. The only way to share the wrappers is to explicitly call into other modules from the module that actually wraps the function of interest.
We should think about how to add a layer of abstraction between the actual I/O function wrappers and the modules that are trying to instrument these functions so that we can stack multiple modules on the same wrappers.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/182
darshan does not understand binded mount points
2020-05-15 | Shane Snyder | 3.2.2, defect

When accessing a file path corresponding to a bind mount point, Darshan is unable to recognize the actual underlying mount point it references. This prevents Darshan from determining the underlying file system type, which is necessary for determining the FS block size (which is used to determine whether I/O accesses are aligned or not).
It would be nice if Darshan could recognize bind mount points and store mappings from these bind mount points to the true underlying mount points in its runtime data structures. When an application accesses files in a bind mount point, Darshan can map them to the actual mount points it stores in the log file and correctly determine the underlying FS type and block size.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/181
darshan outputs warnings when using Intel '-ipo' compile flag
2018-01-11 | Shane Snyder | triage-bugs, defect

On Cori, when using the '-ipo' (interprocedural optimization) flag for Intel compilers, Darshan causes compilation warnings. I have verified this using a simple test Fortran program and the ifort compiler (output shown below). The applications appear to run without errors, and Darshan logs are generated successfully.
Sample output warnings:

```
ipo: warning #11021: unresolved __real_H5Fcreate
        Referenced in libdarshan.a(darshan-hdf5.o)
ipo: warning #11021: unresolved __real_H5Fopen
        Referenced in libdarshan.a(darshan-hdf5.o)
...
```
Sample output warnings:
ipo: warning #11021: unresolved __real_H5Fcreate
Referenced in libdarshan.a(darshan-hdf5.o)
ipo: warning #11021: unresolved __real_H5Fopen
Referenced in libdarshan.a(darshan-hdf5.o)
...triage-bugsdefecthttps://xgitlab.cels.anl.gov/darshan/darshan/issues/180improve Darshan modules&#39; record layout2018-01-11T10:10:34-06:00Shane Snyderimprove Darshan modules' record layoutCurrently, modules receive their own distinct memory buffer when registering with Darshan which will only store records corresponding to the specific module. At shutdown time, each module's memory buffers are first sorted to get all shared records in a contiguous region so that reductions can be ran on the shared records. After that, the buffers are compressed and written out collectively. A log file header is used to indicate the extents in the output log file that correspond to each module so their data can easily be retrieved.
Some of our initial performance testing has shown this not to be the optimal design. Specifically, the decision to compress and write out memory buffers on a per-module basis leads to reduced compression efficiency and longer shutdown times as collective writes are performed for each active module. It would be more desirable to compress and write everything in single operations, rather than one for each module.
To avoid this issue, we would need to modify Darshan to manage a single contiguous memory buffer that is used to store records from all modules. Whenever modules register a record with Darshan, it is appended to this buffer and an address to refer to the record is returned. At shutdown time, Darshan would only have to compress this single buffer and write it out all at once.
A couple of problems are evident from the proposed approach:
1.) How would a log consumer be able to extract data from the log file without knowing what order records were stored in? I.e., given a contiguous buffer of records from numerous modules, how do we know which addresses correspond to which types of records (obviously, each module has different record structs, so they are likely all different lengths, etc.)?
2.) How do we handle shared file reductions? Data records from active modules will be stored together, meaning we likely won't have a contiguous range of records to reduce.
For issue 1), we have a simple solution: prefix all records with a "Darshan base record" that indicates the record identifier and the associated module identifier (and perhaps more data that is likely to be used in all modules, such as corresponding rank) for the following record. This turns out to be a natural refactor, as all current modules store this type of info (sans the module identifier which is most critical for this mechanism) at the beginning of each record. Then, when a consumer is reading a log file, it can work through the buffer of module data, peeking at each record's module identifier to determine exactly how to read what follows. This solution trades off more flexibility in storage of records at runtime for more complexity in log reading utilities, which is completely fine.
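A minimal sketch of what such a prefix could look like; the field names here are illustrative assumptions, not a final format:

```
#include <stdint.h>

/* Hypothetical "base record" prepended to every module record so that
 * log utilities can identify what follows in a mixed record buffer. */
struct darshan_base_record {
    uint64_t rec_id;   /* unique identifier for the file/record */
    uint8_t  mod_id;   /* which module produced the record that follows */
    int64_t  rank;     /* rank that generated the record (-1 if shared) */
};
```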
For issue 2), there are a couple of solutions: sort Darshan's record buffer in a way that gets shared records from the same module in a contiguous memory region or use custom MPI datatypes to do a reduction on noncontiguous records. I'm not sure how the overhead of each method compares, but intuitively the first option (sorting) seems like it would be easier and more efficient. In fact, if we use a sorting algorithm that puts all records from the same module in a contiguous memory region, compression efficiency should increase (since we are compressing sequences of records with identical structure), making this an even more appealing option.
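For the sorting option, a minimal illustration using `qsort(3)` over an index of record references; the `rec_ref` structure and its fields are hypothetical:

```
#include <stdlib.h>
#include <stdint.h>

/* Hypothetical index entry describing one record in the shared buffer. */
struct rec_ref {
    uint8_t  mod_id;   /* owning module */
    uint64_t rec_id;   /* record identifier */
    void    *payload;  /* location of the record in the big buffer */
};

/* Order records by module first, then by record id, so each module's
 * shared records end up in one contiguous, reducible region. */
static int rec_ref_cmp(const void *a, const void *b)
{
    const struct rec_ref *ra = a, *rb = b;
    if (ra->mod_id != rb->mod_id)
        return (ra->mod_id < rb->mod_id) ? -1 : 1;
    if (ra->rec_id != rb->rec_id)
        return (ra->rec_id < rb->rec_id) ? -1 : 1;
    return 0;
}

/* usage: qsort(refs, nrefs, sizeof(*refs), rec_ref_cmp); */
```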
So, what we need to do:
* refactor of module records to begin with a Darshan "base record"
* update Darshan's record registration and memory management to use a single buffer for all module records
* implement a sort algorithm to sort records by module identifier
* compress and collectively write out this big record buffer
* update darshan logutils API to be able to consume the new record format
Some potential advantages:
* improved compression efficiency
* reduced number of collective I/O ops at shutdown
* optimized memory consumption, as there will no longer be fragmentation or wasted use of memory buffers as there is when each module has its own distinct buffer
* no longer any need for module offset/extent pairs in the Darshan header, which will greatly reduce the fixed-length size of this header
Drawbacks:
* increased record size to include module identifier (1 byte...which is likely offset by increased compression efficiency and much smaller header, so probably not really a drawback)
* the need to sort a big array of records at runtime, probably twice (once to get modules in order, and again to get each module's shared records in order)
* development effort :)
The key to determining whether this redesign is a good idea is measuring how the overhead of sorting all records at shutdown time and doing a single compress/collective write compares to doing independent compress/collective writes for each module.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/172
consider an MPIT module
2018-01-11 | Philip Carns | major-feature-request

The new MPIT interface in MPI may provide some information that we could collect in Darshan. Suggested by Raghu Chandrasekar of Cray following the SC15 BoF.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/167
parser options to only output specific module data
2018-01-11 | Shane Snyder | triage-feature-request, parsing tools

Currently, darshan-parser outputs data for all modules present in a given darshan log. Module-specific utilities would probably prefer a way to see output counter data for only a specific module, without having to filter the default darshan-parser output.
It shouldn't be too terribly difficult to add command line options to the parser to specify which modules to parse output from. E.g.
./darshan-parser --modules="POSIX,MPIIO" sample-log.darshan
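A minimal sketch of how such a switch could be parsed with standard `getopt_long(3)`; the option handling shown (including the `strtok` split of the module list) is an illustrative assumption, not darshan-parser's actual code:

```
#include <getopt.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    static struct option opts[] = {
        { "modules", required_argument, NULL, 'm' },
        { 0, 0, 0, 0 }
    };
    int c;

    while ((c = getopt_long(argc, argv, "", opts, NULL)) != -1) {
        if (c == 'm') {
            /* split the comma-separated module list, e.g. "POSIX,MPIIO" */
            for (char *tok = strtok(optarg, ","); tok; tok = strtok(NULL, ","))
                printf("enabling output for module: %s\n", tok);
        }
    }
    /* argv[optind] would then be the darshan log file name */
    return 0;
}
```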

https://xgitlab.cels.anl.gov/darshan/darshan/issues/235
dxt module does not record offsets for MPIIO accesses
2018-02-22 | Philip Carns

Should we inspect the MPI datatypes in the wrapper to at least get the first offset for each access?

https://xgitlab.cels.anl.gov/darshan/darshan/issues/236
Create environment variable for dynamic library on Cray
2018-02-26 | Kevin Harms

Set up the Cray module to export the DARSHAN_PRELOAD environment variable to allow users to easily set the LD_PRELOAD environment variable. For example:
```
module load darshan
aprun -e LD_PRELOAD=${DARSHAN_PRELOAD}
```

https://xgitlab.cels.anl.gov/darshan/darshan/issues/238
Alternative malloc() implementation for Darshan
2019-01-18 | Philip Carns | major-feature-request, wrapper libraries

Darshan already has the ability to use mmap() to manage memory for its runtime statistics collection, but it also uses a few small malloc() calls at runtime for transient data structures. This can cause problems when interacting with other instrumentation libraries that could invoke POSIX file operations in the malloc path and cause a deadlock (examples include libunwind, hugepages, and alternative malloc implementations).
We've been able to work around this case by case in the past, but John Mellor-Crummey offered this suggestion for a more systemic solution:
>>>
What we do in HPCToolkit is just mmap a segment of perhaps 4MB. Then our allocator simply advances a cursor through the segment to provide data. There is no free. I assume that you don’t need to free either. Then, if we out of space in the segment, we mmap another and move the cursor there. Our code as it is a bit specific to HPCToolkit, so you probably don’t want it verbatim. However, it would probably provide a good template for what to do. You can find the few files involved here:
https://github.com/HPCToolkit/hpctoolkit/tree/master/src/tool/hpcrun/memory
>>>
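A minimal sketch of the suggested approach, assuming a simple bump allocator over mmap'd segments with no free(); the segment size and names are illustrative:

```
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS */
#include <stddef.h>
#include <sys/mman.h>

#define SEGMENT_SIZE (4 * 1024 * 1024)   /* 4 MiB per segment, per the quote */

static char  *seg_base = NULL;
static size_t seg_used = 0;

/* Allocate len bytes by advancing a cursor through an mmap'd segment;
 * there is no corresponding free. Avoids malloc() entirely. */
static void *darshan_bump_alloc(size_t len)
{
    len = (len + 7) & ~(size_t)7;   /* 8-byte alignment */

    if (seg_base == NULL || seg_used + len > SEGMENT_SIZE) {
        if (len > SEGMENT_SIZE)
            return NULL;
        seg_base = mmap(NULL, SEGMENT_SIZE, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (seg_base == MAP_FAILED) {
            seg_base = NULL;
            return NULL;
        }
        seg_used = 0;
    }

    void *p = seg_base + seg_used;
    seg_used += len;
    return p;
}
```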

https://xgitlab.cels.anl.gov/darshan/darshan/issues/102
adjust darshan-job-summary.pl for color-blindness
2018-03-16 | Shane Snyder | triage-bugs, enhancement, darshan-job-summary.pl

The default colors chosen by gnuplot are not friendly to color-blind users.
See http://old.nabble.com/color-blindness-and-gnuplot-default-colors-td15478098.html for suggestions.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/93
discrepancy in data volume in --perf and --file output from darshan-parser
2018-03-16 | Shane Snyder | triage-bugs, defect, parsing tools

Reported by Huong Luu:
The total_bytes field reported by darshan-parser --perf almost never agrees with the cumulative totals in the 2nd column of the statistics reported by --file (read_only, write_only, and read_write).
We should get an example log and investigate what causes this; it may be a bug.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/92
report I/O time in darshan-parser
2018-03-16 | Shane Snyder | triage-feature-request, enhancement, parsing tools

As reported by Huong Luu, UIUC:
Currently the Darshan parser reports aggregate performance (computed as total_bytes/time). When I parse the output log, I have to compute the I/O time in reverse (total_bytes/agg_perf). When a user uses text files, total_bytes = 0, so the I/O time is lost.
So I think it would be nice if the darshan parser could output the I/O time AND the aggregate performance.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/90
Disable Darshan instrumentation if --wrap arguments are detected on BG platform
2018-03-16 | Shane Snyder | triage-feature-request, enhancement, compiler scripts

This would prevent conflicts with memlog and possibly other tools.
We could check for this in the profile conf by looking at command line arguments.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/80
merge darshan-florin-extensions branch to trunk (epoch support)
2018-03-16 | Shane Snyder | triage-feature-request, enhancement, wrapper libraries

The darshan-florin-extensions branch contains work done by Florin Isaila to add epochs to Darshan (the ability to explicitly enable and disable periods of Darshan instrumentation and report them separately).

https://xgitlab.cels.anl.gov/darshan/darshan/issues/69
Link failure with FORTRAN netcdf libraries
2018-03-16 | Shane Snyder | triage-bugs, defect, compiler scripts

I got the same errors after changing the .soft as you suggested.
After some more tests, I found a strange behavior.
Here is my test program.
```
% cat test.f
      program main
      include "mpif.h"
      include "pnetcdf.inc"
      integer ierr, ncid
      call MPI_Init(ierr)
      ierr = nfmpi_create(MPI_COMM_WORLD, "pnf_test.nc", NF_CLOBBER,
     +                    MPI_INFO_NULL, ncid)
      ierr = nfmpi_enddef(ncid)
      ierr = nfmpi_close(ncid)
      call MPI_Finalize (ierr)
      end program
```
Here is my compile command, and it generated no error:

```
% /soft/compilers/wrappers/gcc/mpif77 -I../../src/libf test.f -o test \
    -L../../src/lib -lpnetcdf
```
However, once I removed the line `ierr = nfmpi_enddef(ncid)`, the errors appeared:
```
/soft/perftools/darshan/darshan-2.2.7/lib/libdarshan-mpi-io.a(darshan-pnetcdf.o):
In function `__wrap_ncmpi_close':
/soft/perftools/darshan/darshan-2.2.7/build/darshan-2.2.7/darshan-runtime/lib/darshan-pnetcdf.c:152:
undefined reference to `ncmpi_close'
/soft/perftools/darshan/darshan-2.2.7/lib/libdarshan-mpi-io.a(darshan-pnetcdf.o):
In function `__wrap_ncmpi_open':
/soft/perftools/darshan/darshan-2.2.7/build/darshan-2.2.7/darshan-runtime/lib/darshan-pnetcdf.c:108:
undefined reference to `ncmpi_open'
/soft/perftools/darshan/darshan-2.2.7/lib/libdarshan-mpi-io.a(darshan-pnetcdf.o):
In function `__wrap_ncmpi_create':
/soft/perftools/darshan/darshan-2.2.7/build/darshan-2.2.7/darshan-runtime/lib/darshan-pnetcdf.c:62:
undefined reference to `ncmpi_create'
collect2: ld returned 1 exit status
```
I guess the MPI compiler wrapper fails to place the user command-line library (`-L../../src/lib -lpnetcdf` in this case) after the internal libraries used by Darshan.
(via Ray)

https://xgitlab.cels.anl.gov/darshan/darshan/issues/65
-wrap error when using HPCToolkit
2018-03-16 | Shane Snyder | triage-bugs, defect, compiler scripts
I think there's an issue where -wrap calls are specified when we should be disabling darshan at link. Need to investigate more.
Do you know if there's a conflict between Darshan and Rice HPCToolkit on BG/P (Intrepid)? I'm getting these hpclink errors:
```
hpclink mpixlf90 -WF,-qfpp=comment -qfixed=120 -O3 -qhot -g -I/home/beres/software/p3d-1/include/ -o runit common.o main.o inits.o time1.o time2.o non1d1.o non1d2.o outen.o out1dsp.o out78sp.o out78.o \
    read78sp.o forcew.o forcez.o -L/soft/apps/LAPACK -L/soft/apps/fftw-3.1.2-float/lib -L/home/beres/software/p3d-1/lib -lp3dfft -llapack_bgp -lfftw3f
/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/lib/libfmpich.cnk.a(_wcomm_rankf.o): In function `mpi_comm_rank':
/bghome/bgbuild/V1R4M2_200_2010-100508P/ppc/bgp/comm/lib/dev/mpich2/src/binding/f77/comm_rankf.c:190: undefined reference to `__wrap_MPI_Comm_rank'
/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/lib/libfmpich.cnk.a(_wfinalizef.o): In function `mpi_finalize':
/bghome/bgbuild/V1R4M2_200_2010-100508P/ppc/bgp/comm/lib/dev/mpich2/src/binding/f77/finalizef.c:190: undefined reference to `__wrap_MPI_Finalize'
/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/lib/libfmpich.cnk.a(_winitf.o): In function `mpi_init':
/bghome/bgbuild/V1R4M2_200_2010-100508P/ppc/bgp/comm/lib/dev/mpich2/src/binding/f77/initf.c:196: undefined reference to `__wrap_MPI_Init'
```

https://xgitlab.cels.anl.gov/darshan/darshan/issues/59
Disable darshan when application links against ADIOS dummy MPI implementation
2018-03-16 | Shane Snyder | triage-bugs, enhancement, compiler scripts

Using Darshan will cause link-time errors in this case because there are no PMPI symbols in the dummy implementation (see alcf-support #158921).
Ideally Darshan would detect this situation automatically and disable itself, similar to what it does when it detects the presence of another tool using PMPI.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/48
track per-process information in addition to per-file information
2018-03-16 | Shane Snyder | triage-feature-request, defect, wrapper libraries

This ticket would be best done after modularizing the log format (see #46). In addition to storing per-file records, we could store a summary record per process (but spanning files) as well.
For example, we might want to know the total amount of time a process spent doing metadata, reads, and writes, as well as the number of bytes it read and wrote, regardless of how many files it opened or how many threads it used.
We could use this per-process data (in conjunction with a reduction step) to produce an immediate performance estimate without much post processing.
This also depends on accurate thread accounting in #81.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/26
expanded instrumentation for a high level library
2018-03-16 | Shane Snyder | triage-feature-request, defect, wrapper libraries

Pick one of pnetcdf, hdf5, or damsel and add additional optional support in darshan. See #46.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/160
darshan-job-summary.pl: show bytes read/written per interface
2018-03-16 | Shane Snyder | triage-feature-request, enhancement, darshan-job-summary.pl

One of the summary tables should show how much data was read or written via POSIX, MPI-IO, NetCDF, or HDF5.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/156
safety checks for latex packages
2018-03-16 | Shane Snyder | triage-feature-request, enhancement, darshan-job-summary.pl

See mailing list: http://lists.mcs.anl.gov/pipermail/darshan-users/2015-September/000305.html
darshan-job-summary.pl does not do anything to gracefully detect missing latex packages, like multirow.sty.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/155
the shared file time_by_slowest metric doesn't distinguish metadata and i/o time
2018-03-16 | Shane Snyder | triage-feature-request, enhancement, wrapper libraries

Reported by Huong Luu.
This is the most accurate metric for estimating performance for globally shared files, but it lumps both metadata and I/O transfer into the same measurement.
Ideally we would break down the two categories to help with analysis. This would require a library change, a log format change, and a utility change. The current log file does not contain this level of detail.

https://xgitlab.cels.anl.gov/darshan/darshan/issues/154
update file terminology in tools and docs
2018-03-16 | Shane Snyder | triage-documentation, enhancement, parsing tools

Reported by Huong Luu.
The command line utilities for Darshan often refer to files as either "shared" or "unique". It would be more correct to refer to them as "globally shared" or "non-globally shared" to reflect that Darshan only identifies shared files if all ranks open them.
This will require updates to several tools to fix. We also need to check if some tools identify partially shared files as a post processing step, which would be a third category.
triage-documentationenhancementparsing toolshttps://xgitlab.cels.anl.gov/darshan/darshan/issues/151account for possible time skew across ranks2018-03-16T10:14:02-05:00Shane Snyderaccount for possible time skew across ranksDarshan automatically normalizes all timestamps to be relative to MPI_Init() time, but it does not check for possible time skew across ranks (if one rank reports a drastically different MPI_Wtime() from another). We could check for skew at MPI_Finalize() time and adjust accordingly.triage-feature-requestenhancementwrapper librarieshttps://xgitlab.cels.anl.gov/darshan/darshan/issues/150darshan_log_getexe() is prone to buffer overflows2018-03-16T10:14:02-05:00Shane Snyderdarshan_log_getexe() is prone to buffer overflowsThis API call in the logutils library requires the caller to provide a large enough pre-allocated buffer (4KiB works well now) but there is no argument for the caller to indicate how big the buffer is. It will overflow if the caller does not provide a large enough buffer.triage-bugsdefectparsing toolshttps://xgitlab.cels.anl.gov/darshan/darshan/issues/144get Darshan to compile cleanly with clang2018-03-16T10:14:02-05:00Shane Snyderget Darshan to compile cleanly with clangCompile with clang and check code with static analysis and address sanitizer.triage-feature-requestenhancementotherhttps://xgitlab.cels.anl.gov/darshan/darshan/issues/239investigate the possibility of dynamic library usage tracking2020-05-15T14:54:45-05:00Philip Carnscarns@mcs.anl.govinvestigate the possibility of dynamic library usage trackingDarshan could theoretically be used to track how many libraries are loaded as part of execution. It does not do this right now, either because it filters out system paths, or because it is not intercepting the I/O calls used by the loader.
We should investigate this and see how hard it would be to enable this functionality. Maybe optional, maybe as a separate module that just reports library usage?
This would only be valid for LD_PRELOAD mode.
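One possible approach, rather than intercepting the loader's I/O calls, would be to ask the runtime linker directly; below is a minimal sketch using glibc's dl_iterate_phdr(3). This is not something Darshan does today.
```c
/* Minimal sketch (not current Darshan behavior): enumerate the shared
 * objects mapped into the process via glibc's dl_iterate_phdr(3),
 * instead of intercepting the I/O calls used by the loader. */
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

static int count_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = (int *)data;

    /* dlpi_name is an empty string for the main executable */
    if (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
    {
        printf("loaded: %s\n", info->dlpi_name);
        (*count)++;
    }
    return 0; /* returning nonzero would stop the iteration */
}

int main(void)
{
    int count = 0;
    dl_iterate_phdr(count_cb, &count);
    printf("%d shared libraries loaded\n", count);
    return 0;
}
```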
3.2.2wrapper librarieshttps://xgitlab.cels.anl.gov/darshan/darshan/issues/240Investigate issues on Cooley with LD_PRELOAD=libdarshan.so:libtrackdeps.so to...2018-03-20T11:21:06-05:00Kevin HarmsInvestigate issues on Cooley with LD_PRELOAD=libdarshan.so:libtrackdeps.so togetherRunning tests on Cooley, I found MPI executables would crash when LD_PRELOAD was set to point to libfmpich.so:libdarshan.so:libtrackdeps.so. Removing libtrackdeps.so solved the crash problem. It's not clear how trackdeps works; we will need to talk to Hal.
Cooley documentation currently states to only set LD_PRELOAD=$DARSHAN_PRELOAD, so it avoids this issue. This was detected when running the regression suite, which keeps any existing LD_PRELOAD libraries in addition to adding Darshan.
https://xgitlab.cels.anl.gov/darshan/darshan/issues/242darshan-parser output to json2018-06-14T14:49:47-05:00Shane Snyderdarshan-parser output to jsonA number of projects have implemented their own translators from darshan-parser output into JSON format -- we should just offer an option for outputting in this format directly to limit code duplication.https://xgitlab.cels.anl.gov/darshan/darshan/issues/245derived values in darshan-parser --perf should be easier to use2018-08-03T20:39:58-05:00Philip Carnscarns@mcs.anl.govderived values in darshan-parser --perf should be easier to useThe output of darshan-parser --perf is valuable, but there are some problems:
- it still emits several deprecated estimate methods for I/O performance
- it does not emit a unified elapsed time for data and metadata; you can only get this by working backwards from the total_bytes and agg_perf_by_slowest
At some point we should do another usability pass over the output of this tool.
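For reference, the "work backwards" step amounts to a single division. The helper below is just a sketch (not part of darshan-util) and assumes agg_perf_by_slowest is reported in MiB/s, as in current darshan-parser output:
```c
/* Sketch: recover a unified elapsed I/O time (data + metadata) from the
 * total_bytes and agg_perf_by_slowest values emitted by
 * darshan-parser --perf. Assumes agg_perf_by_slowest is in MiB/s. */
static double unified_io_time(double total_bytes, double agg_perf_by_slowest)
{
    return (total_bytes / (1024.0 * 1024.0)) / agg_perf_by_slowest;
}
```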
https://xgitlab.cels.anl.gov/darshan/darshan/issues/246ability to discriminate close() time from other metadata time2018-08-03T20:58:47-05:00Philip Carnscarns@mcs.anl.govability to discriminate close() time from other metadata timeOn some file systems, close() could actually transfer data due to flush-at-close behavior. In that case it would really be I/O time rather than metadata time.
If we had a separate accounting of close() time, it would be easier to assess whether this is going on. It would require new counters, though.
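A sketch of what dedicated close() accounting could look like in a POSIX wrapper; the counter below is hypothetical, not an existing Darshan counter:
```c
#include <unistd.h>
#include <sys/time.h>

/* hypothetical counter; Darshan currently folds this into metadata time */
static double posix_f_close_time = 0.0;

static double wtime(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1.0e6;
}

int close_wrapper(int fd)
{
    double start = wtime();
    int ret = close(fd);

    /* attribute the interval to close() specifically, so flush-at-close
     * file systems show up as I/O-like time rather than metadata time */
    posix_f_close_time += wtime() - start;
    return ret;
}
```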
enhancementhttps://xgitlab.cels.anl.gov/darshan/darshan/issues/247API to support injection of characterization data from external tools2020-04-25T07:17:15-05:00Philip Carnscarns@mcs.anl.govAPI to support injection of characterization data from external toolsThe use case for this would be instrumentation from an I/O library that wants to inject its data into Darshan so that it travels with the Darshan log and uses its compression and output infrastructure.
This would most likely mean adding an externally visible wrapper to the module register and record id functions, plus an extra function to retrieve a pointer to the memory region allocated during registration.
https://xgitlab.cels.anl.gov/darshan/darshan/issues/249user-defined data block within darshan logs2018-10-11T18:19:52-05:00Glenn K. Lockwooduser-defined data block within darshan logsI would like the ability to insert new records/modules (or even arbitrary data, like HDF5's user block) into Darshan logs after a log has been generated. While this would probably be dangerous to do in production, it would be a more portable way for downstream analysis to attach indices or derived quantities to existing logs so they don't have to be recalculated.
A specific use case that I've had is generating the results of `darshan-parser --perf` only once per log and then caching that information somewhere. I resorted to storing these summary metrics as extended attributes associated with the Darshan log file, but xattrs tend to disappear when a file is transferred across different systems or batched up into tar-like formats.
Another potential use case would be to plug into a framework like TOKIO and insert additional performance data that came from sources outside the scope of the job. A facility could add value to users' Darshan logs by putting server-side I/O load data into the Darshan log after the job has completed, giving the user a single über-log that contains everything the center knows about I/O that is relevant to that job.
https://xgitlab.cels.anl.gov/darshan/darshan/issues/256no dxt support for stdio module2020-05-15T14:54:22-05:00Shane Snyderno dxt support for stdio moduleDXT support should be pretty straightforward to integrate into the STDIO module so we can trace I/O operations that work on `FILE *`s.3.2.2Shane SnyderShane Snyderhttps://xgitlab.cels.anl.gov/darshan/darshan/issues/257POSIX_OPENS value 02019-04-23T10:18:12-05:00Greg WatsonPOSIX_OPENS value 0Hi, what does it mean for POSIX_OPENS to have a value of 0? I have an application that is creating about 3000 files, but darshan-job-summary only reports about 10. The problem seems to be that most of the POSIX_OPENS records have 0 for the value field. Any idea why this might be happening (this is darshan 3.1.7)?https://xgitlab.cels.anl.gov/darshan/darshan/issues/260lustre module uses a deprecated lustre_user2020-04-02T18:36:28-05:00Glenn K. Lockwoodlustre module uses a deprecated lustre_userOn newer Lustre versions, Darshan spits out the following:
```
In file included from lib/darshan-lustre.c(21):
/usr/include/lustre/lustre_user.h(45): warning #1224: #warning directive: "Including lustre_user.h is deprecated. Include linux/lustre/lustre_user.h directly."
#warning "Including lustre_user.h is deprecated. Include linux/lustre/lustre_user.h directly."
^
```
Simple fix, but needs to be done. Probably by me.
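The fix is likely a guarded include along these lines; HAVE_LINUX_LUSTRE_LUSTRE_USER_H is a hypothetical configure-detected macro, not an existing Darshan symbol:
```c
/* prefer the new header location when configure detects it */
#ifdef HAVE_LINUX_LUSTRE_LUSTRE_USER_H
#include <linux/lustre/lustre_user.h>
#else
#include <lustre/lustre_user.h>
#endif
```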
Glenn K. LockwoodGlenn K. Lockwoodhttps://xgitlab.cels.anl.gov/darshan/darshan/issues/262create non-mpi test cases for regression tests2019-12-11T09:21:53-06:00Shane Snydercreate non-mpi test cases for regression testsWe should include testing for those code paths in our normal regression tests.https://xgitlab.cels.anl.gov/darshan/darshan/issues/263add information to darshan-job-summary to summarize process load imbalance, i...2019-12-12T13:58:52-06:00Shane Snyderadd information to darshan-job-summary to summarize process load imbalance, if possibleThis is definitely possible using DXT output, which opens up the broader question of if/how we should include some DXT-style information in the output graphs if it's available.
We should consider how to do this as we think more about revamping darshan-job-summary output as part of the integration of Python bindings into darshan-util.
https://xgitlab.cels.anl.gov/darshan/darshan/issues/268first I/O time graph in darshan-job-summary can be misleading if not all rank...2020-02-12T15:44:06-06:00Philip Carnscarns@mcs.anl.govfirst I/O time graph in darshan-job-summary can be misleading if not all ranks perform I/O[benchio_1202.darshan](/uploads/2e902a44d1b2beec22f19d2deab61416/benchio_1202.darshan)
File to use as a reproducer is attached, but the issue should be visible from any parallel application doing I/O from rank 0 (or any other subset of ranks). The first graph shows "Average I/O cost per process", which will look low on those apps because it amortizes I/O time across ranks that aren't doing I/O.
It would probably be more intuitive to show slowest_rank_time/run_time instead, so that we show the percentage of time spent in I/O on the worst-case rank rather than the average across all ranks regardless of whether they participate or not.
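The proposed metric is a simple ratio; as a sketch (names illustrative):
```c
/* percentage of runtime spent in I/O on the worst-case rank */
static double io_cost_pct(double slowest_rank_time, double run_time)
{
    return 100.0 * slowest_rank_time / run_time;
}
```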
darshan-job-summary.plhttps://xgitlab.cels.anl.gov/darshan/darshan/issues/270warnings/crashes in newer Lustre versions related to use of ioctls2020-05-25T20:23:41-05:00Shane Snyderwarnings/crashes in newer Lustre versions related to use of ioctlsDarshan's Lustre module uses ioctl to determine striping information for files, but this ability has been deprecated in newer Lustre versions, leading to warnings like:
```
using old ioctl(LL_IOC_LOV_GETSTRIPE) on [0x2c002048c:0x25:0x0], use llapi_layout_get_by_path()
```
Another user has reported crashes if the Lustre module is not disabled, using Lustre version 2.11.0.300_cray_102_g3dbace1.
We should look at using the suggested API for retrieving this info, and confirm its overhead.
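A sketch of what querying striping through the interface suggested by the warning might look like, assuming liblustreapi as documented in llapi_layout_get_by_path(3); availability varies by Lustre version:
```c
/* Sketch: query striping via the llapi layout interface instead of the
 * deprecated LL_IOC_LOV_GETSTRIPE ioctl. */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <lustre/lustreapi.h>

int get_stripe_info(const char *path)
{
    struct llapi_layout *layout = llapi_layout_get_by_path(path, 0);
    uint64_t stripe_count, stripe_size;

    if (layout == NULL)
        return -1;
    if (llapi_layout_stripe_count_get(layout, &stripe_count) == 0 &&
        llapi_layout_stripe_size_get(layout, &stripe_size) == 0)
        printf("%s: stripe_count=%" PRIu64 " stripe_size=%" PRIu64 "\n",
               path, stripe_count, stripe_size);
    llapi_layout_free(layout);
    return 0;
}
```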
3.2.2Shane SnyderShane Snyderhttps://xgitlab.cels.anl.gov/darshan/darshan/issues/271consider a new output mode for darshan-parser that is more machine parseable2020-05-15T14:54:53-05:00Philip Carnscarns@mcs.anl.govconsider a new output mode for darshan-parser that is more machine parseableWe probably should not change the existing --perf output in case existing analysis tools depend on it, but it would be helpful to have a variation that:
* includes the module name in each line of output (to make it easier to grep)
* only shows what's accepted as the best performance estimate ("by_slowest")
* also outputs bytes
* also outputs time (rate, bytes, and time should all be directly related)
This output mode would probably be preferred for future ML analysis tools.
3.2.2https://xgitlab.cels.anl.gov/darshan/darshan/issues/272Darshan log writes failing to Lustre filesystem2020-05-15T14:55:00-05:00Shane SnyderDarshan log writes failing to Lustre filesystemReported by André Carneiro on the Darshan users mailing list:
```
*Using OpenMPI 3.1.5 and GCC 7
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
Backtrace for this error:
#0 0x7f4f6759f27f in ???
#1 0x7f4f687ababe in ???
#2 0x7f4f687add06 in ???
#3 0x7f4f687db6c0 in ???
#4 0x7f4f687dbddb in ???
#5 0x7f4f6879d6f1 in ???
#6 0x7f4f6871892b in ???
#7 0x7f4f691d0ae1 in MPI_File_write_at_all
at lib/darshan-mpiio.c:536
#8 0x7f4f691bea7f in darshan_log_append_all
at lib/darshan-core.c:1800
#9 0x7f4f691c1907 in darshan_log_write_name_record_hash
at lib/darshan-core.c:1761
#10 0x7f4f691c1907 in darshan_core_shutdown
at lib/darshan-core.c:546
#11 0x7f4f691be402 in MPI_Finalize
at lib/darshan-core-init-finalize.c:82
#12 0x7f4f68b6a798 in ???
#13 0x4023bb in ???
#14 0x401ae6 in ???
#15 0x7f4f6758b3d4 in ???
#16 0x401b16 in ???
#17 0xffffffffffffffff in ???
--------------------------------------------------------------------------
*Using Intel PSXE 2018 with Intel MPI
forrtl: severe (71): integer divide by zero
Image PC Routine Line Source
exec.exe 000000000045282E Unknown Unknown Unknown
libpthread-2.17.s 00002B8B5A5FE5D0 Unknown Unknown Unknown
libmpi_lustre.so. 00002B8B659D4FDF ADIOI_LUSTRE_Get_ Unknown Unknown
libmpi_lustre.so. 00002B8B659CFFD9 ADIOI_LUSTRE_Writ Unknown Unknown
libmpi.so.12.0 00002B8B59A4C15C Unknown Unknown Unknown
libmpi.so.12 00002B8B59A4D1D5 PMPI_File_write_a Unknown Unknown
libdarshan.so 00002B8B58F90312 MPI_File_write_at Unknown Unknown
libdarshan.so 00002B8B58F7E63A Unknown Unknown Unknown
libdarshan.so 00002B8B58F815B0 darshan_core_shut Unknown Unknown
libdarshan.so 00002B8B58F7DFF3 MPI_Finalize Unknown Unknown
libmpifort.so.12. 00002B8B592414DA pmpi_finalize__ Unknown Unknown
exec.exe 00000000004490A5 Unknown Unknown Unknown
exec.exe 00000000004032DE Unknown Unknown Unknown
libc-2.17.so 00002B8B5AB2F3D5 __libc_start_main Unknown Unknown
exec.exe 00000000004031E9 Unknown Unknown Unknown
```
So, two different MPI implementations hit the same problem.
The user can work around the problem by writing to a non-Lustre file system. Having the user `export DARSHAN_LOGHINTS=""` also works around it, so the problem seems related to hint interaction with Lustre.
3.2.2https://xgitlab.cels.anl.gov/darshan/darshan/issues/273Darshan conflicts with trackdeps2020-05-15T14:55:08-05:00Shane SnyderDarshan conflicts with trackdepsLooks like trying to use Darshan and trackdeps at the same time is leading to crashes on Cooley. We should investigate further to ensure these tools can operate together without causing crashes for users.3.2.2https://xgitlab.cels.anl.gov/darshan/darshan/issues/275add non-mpi test case2020-05-15T14:54:09-05:00Shane Snyderadd non-mpi test caseAdd a non-MPI regression test case to more easily test that functionality.3.2.2Shane SnyderShane Snyderhttps://xgitlab.cels.anl.gov/darshan/darshan/issues/276update regression tests to use darshan-diff2020-05-15T10:15:55-05:00Shane Snyderupdate regression tests to use darshan-diffWe should have a regression test case that compares the counters of the mpi-io-test example with previous log versions to ensure arbitrary counters aren't changing or being corrupted from release to release.Shane SnyderShane Snyderhttps://xgitlab.cels.anl.gov/darshan/darshan/issues/277conflict between --without-mpi and --enable-mmap-logs2020-05-18T07:26:29-05:00Philip Carnscarns@mcs.anl.govconflict between --without-mpi and --enable-mmap-logsReported by Jeff Layton: https://lists.mcs.anl.gov/pipermail/darshan-users/2020-May/000599.html
We either need to fix the conflict or update autoconf to make the options mutually exclusive.
3.2.2Shane SnyderShane Snyderhttps://xgitlab.cels.anl.gov/darshan/darshan/issues/278darshan-mk-log-dirs unclear errors when root log directory does not exist2020-05-18T19:32:28-05:00Shane Snyderdarshan-mk-log-dirs unclear errors when root log directory does not exist`darshan-mk-log-dirs` prints out a generic error and fails if the root log directory Darshan was configured with doesn't exist. We should update the script to print the error string so it's clear why it failed, and also to create the root log directory if it does not already exist.3.2.2https://xgitlab.cels.anl.gov/darshan/darshan/issues/279darshan-job-summary requires some specific latex/perl packages2020-05-20T09:43:50-05:00Shane Snyderdarshan-job-summary requires some specific latex/perl packages(reported by Jeff Layton)
`darshan-job-summary` requires a specific Perl module for LaTeX generation that is not necessarily available by default on most systems:
libpod-latex-perl
We should make that an explicit darshan-util dependency for now.
3.2.2https://xgitlab.cels.anl.gov/darshan/darshan/issues/142darshan-shutdown-bench triggers weak symbol check in wrappers2015-09-25T11:54:59-05:00Shane Snyderdarshan-shutdown-bench triggers weak symbol check in wrappersThe following command:
```text
mpicc darshan-test/darshan-shutdown-bench.c -o darshan-shutdown-bench
```
... fails because the mpicc wrappers trigger the weak symbol check and turn off Darshan (the above test program will not link without the Darshan libraries).
triage-bugsdefectcompiler scriptshttps://xgitlab.cels.anl.gov/darshan/darshan/issues/139documentation for darshan-logutils.h2015-09-25T11:55:06-05:00Shane Snyderdocumentation for darshan-logutils.hThe logutils API doesn't have much documentation. We should consider using doxygen to generate an online API reference.triage-documentationenhancementdocumentationhttps://xgitlab.cels.anl.gov/darshan/darshan/issues/132internal diagnostic timing routines are skewed if processes call MPI_Finalize...2015-09-25T11:55:33-05:00Shane Snyderinternal diagnostic timing routines are skewed if processes call MPI_Finalize() at different timesDarshan includes a feature that will display internal diagnostic timing information if the DARSHAN_INTERNAL_TIMING environment variable is set. However, the mechanism used to collect timing information assumes that all processes have called MPI_Finalize() simultaneously. If this is not the case, then the diagnostic information will erroneously attribute too much time to Darshan while it waits on the other processes to synchronize.triage-bugsdefectwrapper librarieshttps://xgitlab.cels.anl.gov/darshan/darshan/issues/128keep read/write/meta breakdown during variance reduction2015-09-25T11:55:42-05:00Shane Snyderkeep read/write/meta breakdown during variance reductionRight now we combine all three into a single time calculation, but in some cases it might be helpful to retain the original three categories as well.
This is a low priority.
triage-feature-requestdefectwrapper librarieshttps://xgitlab.cels.anl.gov/darshan/darshan/issues/125darshan-util fails to build on Mac OS2015-09-25T11:56:03-05:00Shane Snyderdarshan-util fails to build on Mac OSReported by Karen Schuchardt:
```text
gcc -I . -I . -I ./../ -DDARSHAN_CONFIG_H=\"darshan-util-config.h\" -g -O2
-I/usr/include darshan-parser.c darshan-logutils.o -o darshan-parser -lz
-lbz2
Undefined symbols:
"_strndup", referenced from:
_darshan_log_getjob in darshan-logutils.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
make: *** [darshan-parser] Error 1
```
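One portability fix would be a guarded fallback definition; a sketch, where HAVE_STRNDUP is a hypothetical configure-detected macro:
```c
#include <stdlib.h>
#include <string.h>

/* fallback for platforms whose libc lacks strndup(), as in this report */
#ifndef HAVE_STRNDUP
static char *strndup(const char *s, size_t n)
{
    size_t len = 0;
    char *copy;

    /* length of s, capped at n */
    while (len < n && s[len] != '\0')
        len++;
    copy = malloc(len + 1);
    if (copy != NULL)
    {
        memcpy(copy, s, len);
        copy[len] = '\0';
    }
    return copy;
}
#endif
```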
triage-feature-requestdefectparsing toolshttps://xgitlab.cels.anl.gov/darshan/darshan/issues/122compiler script generation fails on OpenMPI2015-09-25T11:56:17-05:00Shane Snydercompiler script generation fails on OpenMPIThe darshan-gen-cc.pl script fails when generating a compiler wrapper because the '-v' option does not output anything to standard out. I don't think we even need this test specifically. I believe if we remove it and restructure a little, OpenMPI wrapper script generation will work.
    my $version_out = `$input_file -v 2>/dev/null |head -n 1`;
    if (!($version_out))
    {
        printf STDERR "Error: failed to invoke $input_file with -v\n";
        exit(1);
    }
triage-feature-requestdefectcompiler scriptshttps://xgitlab.cels.anl.gov/darshan/darshan/issues/108histogram for read/write times2015-09-25T11:56:25-05:00Shane Snyderhistogram for read/write timesThis feature was suggested by Jim Browne. It would be helpful to have a histogram not only of the access sizes, but also of how long each access took to complete.
We already time each read and write (in order to record cumulative I/O time, and to identify the slowest operation), so there wouldn't be any extra overhead except for the storage space required for the new fields.
We would want the bucket sizes to scale automatically based on the slowest operation we have seen on that file.
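A sketch of the bookkeeping, with the caveat that rescaling buckets when a new slowest operation appears would also require rebinning existing counts (omitted here); all names are illustrative:
```c
#define TIME_BINS 10

struct time_hist
{
    double slowest;         /* slowest operation seen so far on this file */
    long   bins[TIME_BINS]; /* operation counts per relative-time bucket  */
};

static void time_hist_add(struct time_hist *h, double elapsed)
{
    int i;

    if (elapsed > h->slowest)
        h->slowest = elapsed; /* buckets implicitly rescale; see caveat */
    if (h->slowest <= 0.0)
    {
        h->bins[0]++; /* zero-duration operations land in the first bin */
        return;
    }
    /* bucket i covers fraction (i .. i+1]/TIME_BINS of the slowest time */
    i = (int)((elapsed / h->slowest) * TIME_BINS);
    if (i >= TIME_BINS)
        i = TIME_BINS - 1;
    h->bins[i]++;
}
```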
triage-feature-requestenhancementwrapper librarieshttps://xgitlab.cels.anl.gov/darshan/darshan/issues/105determine how to tabulate close() cost2015-09-25T11:56:33-05:00Shane Snyderdetermine how to tabulate close() costRight now Darshan tracks close() as metadata time, but some file systems flush data at close, so it is really performing I/O more than metadata.
Should we track close() in a separate category?
triage-feature-requestenhancementwrapper librarieshttps://xgitlab.cels.anl.gov/darshan/darshan/issues/104histogram bins using bytes rather than counts2019-11-06T11:30:43-06:00Shane Snyderhistogram bins using bytes rather than countsDarshan includes histogram bins that increment by 1 each time a given read or write size is encountered. It might be helpful to also have bins that increment by the number of bytes in the operation instead, so that we can see what percentage of data was transferred using various access sizes.
We need to think about the tradeoff in additional memory overhead to do this.
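A sketch of carrying both kinds of bins side by side; the bucket boundaries and names are illustrative, and the second array is the extra memory the tradeoff refers to:
```c
#include <stdint.h>

#define SIZE_BINS 4
static const int64_t bin_max[SIZE_BINS] =
    { 100, 1024, 10240, INT64_MAX };

static int64_t count_hist[SIZE_BINS]; /* existing style: +1 per access */
static int64_t bytes_hist[SIZE_BINS]; /* proposed: +nbytes per access  */

static void record_access(int64_t nbytes)
{
    int i;

    for (i = 0; i < SIZE_BINS; i++)
    {
        if (nbytes <= bin_max[i])
        {
            count_hist[i] += 1;      /* how many accesses in this range */
            bytes_hist[i] += nbytes; /* how much data moved in this range */
            break;
        }
    }
}
```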
enhancementwrapper librarieshttps://xgitlab.cels.anl.gov/darshan/darshan/issues/103Additional darshan-job-summary.pl graph ideas2015-09-25T11:56:55-05:00Shane SnyderAdditional darshan-job-summary.pl graph ideas- histogram of data read/written per file
- histogram of data read/written per process
- scatter plot of i/o time vs. file size
Maybe a heat map would be appropriate for something?
triage-feature-requestenhancementdarshan-job-summary.pl