In this paper we propose a formal, graphical workflow language
for dataflows, i.e., workflows where large amounts of complex data
are manipulated and the structure of the manipulated data is reflected
in the structure of the workflow. It is a common extension of
– Petri nets, which are responsible for the organization of the processing
tasks, and
– the nested relational calculus, a database query language over
complex objects, which is responsible for handling collections of data
items (in particular, for iteration) and for the type system.
We demonstrate that dataflows constructed in a hierarchical manner, according
to a set of refinement rules we propose, are sound: when initiated with a
single token (which may represent a complex scientific data collection) in
the input node, they terminate with a single token in the output node (which
represents the output data collection). In particular, they always process
all of the input data, leave no "debris data" behind, and always eventually
compute the output.
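
The soundness property stated above can be illustrated on a plain Petri net. The following is a minimal sketch only, not the formalism of this paper: it ignores the data carried by tokens and the nested relational calculus part entirely, assumes the net is bounded so that exhaustive search terminates, and uses hypothetical names (`fire`, `reachable`, `is_sound`, and the place names `in` and `out`) chosen for illustration. It checks two conditions: every marking reachable from a single token in the input place can still reach the marking with a single token in the output place, and no token is left elsewhere once the output place is marked.

```python
from collections import Counter

def fire(marking, pre, post):
    """Return the marking after firing a transition, or None if it is not enabled."""
    m = Counter(marking)
    if any(m[place] < n for place, n in pre.items()):
        return None
    m.subtract(pre)   # consume tokens from the input places
    m.update(post)    # produce tokens in the output places
    return frozenset((p, n) for p, n in m.items() if n > 0)

def reachable(start, transitions, limit=10_000):
    """Enumerate all markings reachable from `start` (assumes a bounded net)."""
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for pre, post in transitions.values():
            nxt = fire(dict(current), pre, post)
            if nxt is not None and nxt not in seen:
                if len(seen) >= limit:
                    raise RuntimeError("state space too large for this sketch")
                seen.add(nxt)
                stack.append(nxt)
    return seen

def is_sound(transitions, source="in", sink="out"):
    """Check option to complete and proper completion by brute force."""
    start = frozenset({(source, 1)})
    final = frozenset({(sink, 1)})
    for m in reachable(start, transitions):
        # Proper completion: a token in the sink means nothing else remains.
        if dict(m).get(sink, 0) > 0 and m != final:
            return False
        # Option to complete: the final marking stays reachable.
        if final not in reachable(m, transitions):
            return False
    return True

# Each transition maps to (tokens consumed, tokens produced).
good = {
    "split": ({"in": 1}, {"a": 1, "b": 1}),
    "t1": ({"a": 1}, {"c": 1}),
    "t2": ({"b": 1}, {"d": 1}),
    "join": ({"c": 1, "d": 1}, {"out": 1}),
}
# Without the join, one branch finishes while the other leaves a token behind.
leaky = {
    "split": ({"in": 1}, {"a": 1, "b": 1}),
    "t1": ({"a": 1}, {"out": 1}),
    "t2": ({"b": 1}, {"out": 1}),
}

print(is_sound(good), is_sound(leaky))
```

In this sketch the second net fails the check because the output place can be marked while a token remains in another place, which corresponds to the "debris" that sound dataflows are required to avoid.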