New Features

Pegasus now automatically distributes the files in the HTCondor submit directory for all workflows in a two-level directory structure. This prevents too many workflow and HTCondor submit files from accumulating in a single directory for a large workflow. The organization of the submit directory can be controlled by the following properties:

pegasus.dir.submit.mapper.hashed.levels – the number of directory levels used to accommodate the files. Defaults to 2.
pegasus.dir.submit.mapper.hashed.multiplier – the number of files associated with a job in the submit directory. Defaults to 5.

Note that this is enabled by default. If you want to have pre 4.7.0 behavior you can

[PM-833] – Pegasus should manage directory structure on the staging site
For non-sharedfs mode, Pegasus now automatically manages the directory structure on the staging site as a hierarchical directory structure through the use of staging mappers. The staging mappers determine which subdirectory on the staging site a job is associated with. Before the introduction of staging mappers, all files associated with the jobs scheduled for a particular site landed in the same directory on the staging site. For large workflows, this could degrade filesystem performance on the staging servers. More information can be found in the documentation at https://pegasus.isi.edu/docs/4.7.0/ref_staging_mapper.php

[PM-1036] – R DAX API
Pegasus now includes an R API for generating DAXes of complex and large workflows in R environments. The API follows Google's R style guide, and all objects and methods are defined using the S3 OOP system. The source package can be obtained by running pegasus-config --r or from the Pegasus downloads page. A tutorial workflow can be generated using pegasus-init, and an example workflow is provided in the examples folder. More information can be found in the documentation at https://pegasus.isi.edu/documentation/dax_generator_api.php#api-r.

[PM-1126] – pegasus-analyzer should report information about held jobs
The Pegasus monitoring daemon now populates the reason for held jobs in its database. Both pegasus-analyzer and the dashboard were updated to show this information.
Related JIRA items:

[PM-928] – pegasus-exitcode should write its output to a log file
pegasus-exitcode now writes to a workflow-global log file ending in exitcode.log, which captures pegasus-exitcode stdout and stderr as JSON messages. This allows users to check pegasus-exitcode messages, which would otherwise have been sent to /dev/null by HTCondor DAGMan.

[PM-1115] – Pegasus to check for cyclic dependencies in the DAG
Pegasus now explicitly checks for cyclic dependencies and reports one of the edges making up the cycle.
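
To illustrate the kind of check described above, here is a minimal sketch of cycle detection that reports one edge of a cycle. It is not the Pegasus implementation: the function name find_cycle_edge and the dictionary-based graph representation are assumptions made for this example, and the technique shown is a standard iterative depth-first search with three node states.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical helper, not part of Pegasus: the graph maps each job id
# to the list of its child job ids.
def find_cycle_edge(graph: Dict[str, List[str]]) -> Optional[Tuple[str, str]]:
    """Return one (parent, child) edge that closes a cycle, or None if acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished

    # Collect every node, including ones that only appear as children.
    nodes = set(graph)
    for children in graph.values():
        nodes.update(children)
    color = {node: WHITE for node in nodes}

    for root in sorted(nodes):
        if color[root] != WHITE:
            continue
        color[root] = GRAY
        # Explicit stack instead of recursion, so deep workflows do not
        # hit Python's recursion limit.
        stack = [(root, iter(graph.get(root, [])))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if color[child] == GRAY:
                    # The child is still on the current path: back edge found.
                    return (node, child)
                if color[child] == WHITE:
                    color[child] = GRAY
                    stack.append((child, iter(graph.get(child, []))))
                    break
            else:
                # All children explored without finding a back edge.
                color[node] = BLACK
                stack.pop()
    return None
```

Calling find_cycle_edge on a graph where job c depends back on job a returns the edge that closes the loop, which mirrors the idea of reporting a single offending edge rather than the whole cycle.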

[PM-1054] – Add option to ignore files in libinterpose
kickstart now supports the environment variables KICKSTART_TRACE_MATCH and KICKSTART_TRACE_IGNORE, which determine what file accesses are captured via libinterpose. The MATCH version traces only files that match the patterns, and the IGNORE version does NOT trace files that match the patterns. Only one of the two can be specified.
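
The MATCH and IGNORE semantics can be sketched as a small filter function. This is only an illustration of the behavior described above, not kickstart's code: the function name should_trace is hypothetical, and the pattern syntax (regular expressions separated by commas) is an assumption, so consult the kickstart documentation for the actual format.

```python
import re
from typing import Mapping

# Hypothetical model of the filtering rule; kickstart itself is not written
# in Python, and the comma-separated regex syntax is an assumption.
def should_trace(path: str, env: Mapping[str, str]) -> bool:
    """Decide whether a file access would be traced under the given environment."""
    match = env.get("KICKSTART_TRACE_MATCH")
    ignore = env.get("KICKSTART_TRACE_IGNORE")

    # Only one of the two variables can be specified.
    if match and ignore:
        raise ValueError("set only one of KICKSTART_TRACE_MATCH / KICKSTART_TRACE_IGNORE")

    if match:
        # MATCH: trace only files that match at least one pattern.
        return any(re.search(p, path) for p in match.split(","))
    if ignore:
        # IGNORE: trace everything except files that match a pattern.
        return not any(re.search(p, path) for p in ignore.split(","))
    # Neither variable set: nothing is filtered out.
    return True
```

For example, with KICKSTART_TRACE_IGNORE set to a pattern covering /proc, accesses under /proc would be skipped in this model while all other files are still traced.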

The Pegasus project is supported by the National Science Foundation under the OAC SI2-SSI program, grant #1664162. Pegasus also receives support from the Department of Energy, the National Institutes of Health, Defense Advanced Research Projects Agency, and the USC Information Sciences Institute.