pyFlow

pyFlow is a tool for managing tasks in the context of a task dependency graph. It has some similarities to make. pyFlow is not a standalone program – it is a Python module, and workflows are defined by writing regular Python code against the pyFlow API.
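
To make the idea of a task dependency graph concrete, the following is a minimal sketch in plain Python. It does not use the pyFlow API: the run_order helper and the task labels (align, sort, report) are hypothetical names chosen for illustration. It only shows how a make-like tool can derive a valid execution order from declared dependencies.

```python
# Toy illustration only: this is NOT the pyFlow API.
# run_order and the task labels below are hypothetical.

def run_order(dependencies):
    """Return task labels in an order that respects dependencies.

    dependencies maps each task label to the labels it depends on.
    """
    order = []
    state = {}  # label -> "visiting" or "done"

    def visit(label):
        if state.get(label) == "done":
            return
        if state.get(label) == "visiting":
            raise ValueError("dependency cycle at task %r" % label)
        state[label] = "visiting"
        # Every upstream task must be scheduled before this one.
        for upstream in sorted(dependencies.get(label, ())):
            visit(upstream)
        state[label] = "done"
        order.append(label)

    for label in sorted(dependencies):
        visit(label)
    return order


# Hypothetical three-step workflow: align, then sort, then report.
graph = {
    "align": [],
    "sort": ["align"],
    "report": ["sort", "align"],
}
order = run_order(graph)
print(order)
```

In this sketch each task appears only after everything it depends on, and a cyclic specification is rejected with an error instead of looping forever.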

Use case:

pyFlow has been optimized to be lightweight and simple to use for prototype/R&D workflows.

Features:

Define workflows as Python code

Run workflows on localhost or SGE

Continue workflows which have partially completed

Task resource management: Specify number of threads and memory required for each task

Recursive workflow specification: take any existing pyFlow object and use it as a task in another pyFlow.

Dynamic workflow specification: define a wait on task specification rather than just on tasks, so that tasks can be defined based on the results of upstream tasks (note: recursive workflows are an even better way to do this)

Detects and reports all failed tasks with consistent workflow-level logging.
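
As a rough illustration of the task resource management feature above, the sketch below shows one way a scheduler could decide which ready tasks to start when each task declares the cores and memory it needs. It is an assumption-laden toy, not pyFlow's scheduler or API: pick_runnable, the tuple layout, and the task labels are all hypothetical.

```python
# Toy illustration only: this is NOT pyFlow's scheduler or API.
# pick_runnable and the task tuples below are hypothetical.

def pick_runnable(ready_tasks, free_cores, free_mem_mb):
    """Greedily choose ready tasks that fit in the remaining cores and memory."""
    chosen = []
    for label, cores, mem_mb in ready_tasks:
        if cores <= free_cores and mem_mb <= free_mem_mb:
            chosen.append(label)
            # Reserve the resources so later tasks see the reduced budget.
            free_cores -= cores
            free_mem_mb -= mem_mb
    return chosen


# Each ready task is (label, cores needed, memory needed in MB).
ready = [
    ("index", 4, 8000),
    ("stats", 2, 1000),
    ("plot", 4, 2000),
    ("tidy", 1, 500),
]
started = pick_runnable(ready, free_cores=8, free_mem_mb=10000)
print(started)
```

Here the "plot" task is held back because only 2 cores remain once "index" and "stats" have been admitted, while the smaller "tidy" task still fits. A real scheduler would revisit the held-back tasks as running tasks finish and release their resources.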

Release Distributions:

Requirements:

pyFlow's only requirement is Python. pyFlow is supported on Python 2, versions 2.4 and later. Note that Python 2.7.2 should not be used because of a critical multithreading bug in that interpreter release, which affects many pyFlow runs.

Getting Started

To use an existing pyFlow workflow or develop a new one, you may need to download or generate the latest pyFlow installation tarball (see the top-level README.txt in the git repository).

To develop a new pyFlow workflow:
Start by downloading the latest pyFlow tarball (from the version history section below).
See pyflow/README.txt.
Look at the demo programs. If you are new to pyFlow, the recommended order is:

helloWorld – simplest workflow

simpleDemo – a basic feature sandbox

subWorkflow – shows how recursive workflow invocation works

runOptionsDemo – shows an example of how workflow run options can be acquired from command-line arguments.

cwdDemo – a simple demonstration of how the 'cwd' option is used on task calls.