Construct arbitrarily complex workflows in which the specific methods run are
determined at runtime. This module supports short-circuiting a workflow when an
item fails, ordering methods, registering callbacks for processed items, and
deciding which methods are executed based on state or runtime options.

As an example of the Workflow object, let’s construct a sequence processor
that will filter out sequences shorter than 10 nucleotides, reverse the
sequence if the runtime options say to, and truncate it if a specific
nucleotide pattern is observed. The Workflow object will only short circuit,
and evaluate requirements, on methods decorated by method. Developers are free
to define as many other methods as they’d like within the object definition,
and these can be called from workflow methods, but they will not be subjected
directly to workflow checks.
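
The behavior described above can be sketched in plain Python. This is a minimal stand-in written for illustration, not the module's own implementation: MiniWorkflow, the bodies of the method and requires decorators, the numeric priorities, and the AATTG pattern are all assumptions chosen to match the prose.

```python
import functools


def method(priority=0):
    """Mark a function as a workflow step; higher priority runs first."""
    def decorator(func):
        func.priority = priority
        return func
    return decorator


def requires(option=None, values=None, state=None):
    """Skip the step unless the option or state requirement is met."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self):
            if state is not None and not state(self.state):
                return
            if option is not None and self.options.get(option) != values:
                return
            func(self)
        return wrapper
    return decorator


class MiniWorkflow:
    """Hypothetical stand-in for the Workflow base class."""

    def __init__(self, state=None, options=None):
        self.state = state
        self.options = options or {}
        self.failed = False

    def __call__(self, items):
        # Only attributes tagged by @method count as workflow steps.
        attrs = (getattr(type(self), name, None) for name in dir(type(self)))
        steps = sorted((a for a in attrs if hasattr(a, "priority")),
                       key=lambda a: a.priority, reverse=True)
        for item in items:
            self.failed = False
            self.initialize_state(item)
            for step in steps:
                step(self)
                if self.failed:
                    break  # short circuit: skip the remaining steps
            if not self.failed:
                yield self.state


class SequenceProcessor(MiniWorkflow):
    def initialize_state(self, item):
        self.state = item  # overrides whatever state was passed in

    @method(priority=100)
    def check_length(self):
        if len(self.state) < 10:
            self.failed = True

    @method(priority=90)
    @requires(state=lambda seq: seq.startswith("AATTG"))  # assumed pattern
    def truncate(self):
        self.state = self.state[5:]

    @method(priority=80)
    @requires(option="reverse", values=True)
    def reverse(self):
        self.state = self.state[::-1]
```

In this sketch, undecorated helpers such as initialize_state are never treated as steps, so they are not subject to the priority ordering or the short-circuit check.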

An instance of a Workflow must be passed a state object and any runtime
options. There are a few other useful parameters that can be specified but are
out of scope for the purposes of this example. We also do not need to provide
a state object as our initialize_state method overrides self.state.
Now, let’s create the instance.

>>> wf = SequenceProcessor(state=None, options={'reverse': False})

To run items through the SequenceProcessor, we need to pass in an
iterable. So, let’s create a list of sequences.

>>> seqs = ['AAAAAAATTTTTTT', 'ATAGACC', 'AATTGCCGGAC', 'ATATGAACAAA']

Before we run these sequences through, we’re also going to define callbacks
that are applied to the result of a single pass through the Workflow.
Callbacks are optional: by default, a success simply yields the state
member variable while failures are ignored. Depending on your workflow,
though, it can be useful to handle failures or to do something fun and
exciting on success.
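
As a rough illustration of the callback behavior just described, here is a self-contained sketch in plain Python rather than against the Workflow class. The process and run functions, the AATTG pattern, and the callback signatures (which here receive the processed sequence) are assumptions made for this example.

```python
def process(seq, reverse=False):
    """Return (state, failed) for one sequence: one pass of the workflow."""
    if len(seq) < 10:
        return seq, True            # too short: mark as failed
    if seq.startswith("AATTG"):     # assumed pattern of interest
        seq = seq[5:]
    if reverse:
        seq = seq[::-1]
    return seq, False


def run(seqs, reverse=False, success_callback=None, fail_callback=None):
    """Yield one result per item, mirroring the default callback behavior."""
    for seq in seqs:
        state, failed = process(seq, reverse)
        if failed:
            if fail_callback is not None:   # failures are ignored by default
                yield fail_callback(state)
        elif success_callback is not None:
            yield success_callback(state)
        else:
            yield state                     # default: yield the state itself


seqs = ['AAAAAAATTTTTTT', 'ATAGACC', 'AATTGCCGGAC', 'ATATGAACAAA']
results = list(run(seqs,
                   success_callback=lambda s: "SUCCESS: " + s,
                   fail_callback=lambda s: "FAIL: " + s))
for line in results:
    print(line)
# SUCCESS: AAAAAAATTTTTTT
# FAIL: ATAGACC
# SUCCESS: CCGGAC
# SUCCESS: ATATGAACAAA
```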

A few things of note just happened. First off, none of the sequences were
reversed as the SequenceProcessor did not have option “reverse”
set to True. Second, you’ll notice that the 3rd sequence was truncated,
which is expected as it matched our nucleotide pattern of interest. Finally,
of the sequences we processed, only a single sequence failed.

To assist in constructing workflows, debug information is available but it
must be turned on at instantiation. Let’s do that, and while we’re at it, let’s
go ahead and enable the reversal method. This time through, though, we’re going
to walk through one item at a time so we can examine the debug information.

The debug_trace specifies the methods executed and the order of their
execution, where values closer to zero indicate earlier execution. Gaps
indicate that a method was evaluated but not executed. Each of the items in
the debug_trace is a key into a few other dicts of debug information,
which we’ll discuss in a moment. Did you see that the sequence was reversed
this time through the workflow?

Now, let’s take a look at the next item, which on our prior run through the
workflow was a failed item.

What we can see is that the failed sequence only executed the check_length
method. Since the sequence didn’t pass our length filter of 10 nucleotides,
it was marked as failed within the check_length method. As a result, none
of the other methods were evaluated (note: this short circuiting behavior can
be disabled if desired).
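
To make the trace and short-circuit behavior concrete, here is a small self-contained sketch. It assumes each trace entry is a (method name, position) tuple and uses a hypothetical run_with_trace helper with an assumed AATTG pattern; the real debug_trace may be structured differently.

```python
def run_with_trace(seq, options):
    """Run one item and record which steps executed, and at what position."""
    steps = [
        # (name, requirement check, action returning (new_state, failed))
        ("check_length", lambda s, o: True,
         lambda s: (s, len(s) < 10)),
        ("truncate", lambda s, o: s.startswith("AATTG"),  # assumed pattern
         lambda s: (s[5:], False)),
        ("reverse", lambda s, o: o.get("reverse") is True,
         lambda s: (s[::-1], False)),
    ]
    trace = set()
    state = seq
    for index, (name, required, action) in enumerate(steps):
        if not required(state, options):
            continue  # evaluated but not executed: leaves a gap in the trace
        trace.add((name, index))
        state, failed = action(state)
        if failed:
            break     # short circuit: later steps are never evaluated
    return state, trace


state, trace = run_with_trace('AAAAAAATTTTTTT', {'reverse': True})
print(state, sorted(trace))
# TTTTTTTAAAAAAA [('check_length', 0), ('reverse', 2)]

state, trace = run_with_trace('ATAGACC', {'reverse': True})
print(state, sorted(trace))
# ATAGACC [('check_length', 0)]
```

In the first call, position 1 is missing from the trace because truncate was evaluated but its requirement was not met. In the second, the failed length check stops the run before any other step is considered.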

This third item previously matched our nucleotide pattern of interest for
truncation. Let’s see what that looks like in the debug output.

In this last example, we can see that the truncate method was executed
prior to the reverse method and following the check_length method. This
is as anticipated given the priorities we specified for these methods. Since
the truncate method is doing something interesting, let’s take a closer
look at how the state is changing. First, we’re going to dump out the
state of the workflow prior to the call to truncate and then we’re going
to dump out the state following the call to truncate, which will allow
us to rapidly see what is going on.

As we expect, we have our original sequence going into truncate, and
following the application of truncate, our sequence is missing our
nucleotide pattern of interest. Awesome, right?

There is one final piece of debug output, wf.debug_runtime, which can
be useful when diagnosing the amount of time required for individual methods
on a particular piece of state (as opposed to the aggregate as provided by
cProfile).
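
The state dumps and per-method timings discussed above could be collected along these lines. This is only a sketch: run_with_debug, the pre_state and post_state dictionaries, the (name, position) keys, and the AATTG pattern are hypothetical names and values chosen for illustration.

```python
import time


def run_with_debug(seq, steps):
    """Record the state before and after each step, plus its wall-clock time."""
    state = seq
    pre_state, post_state, runtime = {}, {}, {}
    for index, (name, action) in enumerate(steps):
        key = (name, index)              # same key shape as a trace entry
        pre_state[key] = state
        start = time.perf_counter()
        state = action(state)
        runtime[key] = time.perf_counter() - start
        post_state[key] = state
    return state, pre_state, post_state, runtime


steps = [
    # Assumed pattern of interest: drop a leading AATTG if present.
    ("truncate", lambda s: s[5:] if s.startswith("AATTG") else s),
    ("reverse", lambda s: s[::-1]),
]
final, pre, post, runtime = run_with_debug('AATTGCCGGAC', steps)
print(pre[('truncate', 0)])    # AATTGCCGGAC
print(post[('truncate', 0)])   # CCGGAC
print(final)                   # CAGGCC
```

Timing each step per item in this way gives a per-call figure for one piece of state, whereas a profiler reports totals aggregated over the whole run.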

Three final components of the workflow that are quite handy are objects that
allow you to indicate anything as an option value, to indicate anything that
is not_none, and to define a range of valid values.
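
One way such objects can work is by overriding the membership test, so that an option value is checked with the in operator against whatever describes the accepted values. The classes and the option_satisfied helper below are assumptions for illustration, and Python's built-in range stands in for the range mechanism; none of this is taken from the module itself.

```python
class Anything:
    """Membership test that accepts every value, including None."""
    def __contains__(self, item):
        return True


class NotNone:
    """Membership test that accepts every value except None."""
    def __contains__(self, item):
        return item is not None


anything = Anything()
not_none = NotNone()


def option_satisfied(options, name, values):
    """True if the option is set and its value is among the accepted values."""
    return name in options and options[name] in values


options = {"reverse": None, "min_length": 10}
print(option_satisfied(options, "reverse", anything))         # True
print(option_satisfied(options, "reverse", not_none))         # False
print(option_satisfied(options, "min_length", range(5, 50)))  # True
print(option_satisfied(options, "missing", anything))         # False
```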