Vision

The vision is to provide tools to easily achieve better performance and
reproducibility when working with long-running jobs.

Avoid computing the same thing twice: code is often rerun over and
over, for instance when prototyping computationally heavy jobs (as in
scientific development), but hand-crafted solutions to alleviate this
issue are error-prone and often lead to unreproducible results.

Persist to disk transparently: persisting in an efficient way
arbitrary objects containing large data is hard. Using
joblib’s caching mechanism avoids hand-written persistence and
implicitly links the file on disk to the execution context of
the original Python object. As a result, joblib’s persistence is
good for resuming the state of an application or a computational job,
e.g. after a crash.
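
As a rough illustration of resuming after a restart, the sketch below
points two separate Memory objects at the same cache directory; the
second one stands in for a fresh process started after a crash. The
function name expensive_step, the runs list, and the temporary
directory are hypothetical and chosen only for this example.

```python
import tempfile

from joblib import Memory

# Hypothetical cache location; a real job would use a stable directory.
cache_dir = tempfile.mkdtemp()

runs = []  # records every time the function body really executes


def expensive_step(n):
    """Stand-in for a long computation."""
    runs.append(n)
    return sum(i * i for i in range(n))


# First "session": the result is computed and written to disk.
session_one = Memory(cache_dir, verbose=0)
result_before = session_one.cache(expensive_step)(1000)

# Simulated restart: a new Memory object on the same directory
# finds the stored result and does not execute the function again.
session_two = Memory(cache_dir, verbose=0)
result_after = session_two.cache(expensive_step)(1000)
```

In this sketch the function body runs only once, and both sessions get
the same result.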

Joblib strives to address these problems while leaving your code and
your flow control as unmodified as possible (no framework, no new
paradigms).

Main features

Transparent and fast disk-caching of output value: a memoize or
make-like functionality for Python functions that works well for
arbitrary Python objects, including very large numpy arrays. Separate
persistence and flow-execution logic from domain logic or algorithmic
code by writing the operations as a set of steps with well-defined
inputs and outputs: Python functions. Joblib can save their
computation to disk and rerun it only if necessary.
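
A minimal sketch of this caching pattern, using joblib's Memory class
and its cache decorator, might look as follows. The function
slow_square, the calls list, and the temporary cache directory are
assumptions made for illustration; they are not part of joblib.

```python
import tempfile

from joblib import Memory

# Hypothetical cache directory; any writable path would do.
cache_dir = tempfile.mkdtemp()
memory = Memory(cache_dir, verbose=0)

calls = []  # tracks how often the function body actually runs


@memory.cache
def slow_square(x):
    """A step with well-defined input and output."""
    calls.append(x)
    return x * x


first = slow_square(12)   # computed and stored on disk
second = slow_square(12)  # same argument: loaded from the cache
third = slow_square(3)    # new argument: computed again
```

Here the function body runs once per distinct argument, so calls ends
up holding 12 and 3 while all three results remain available.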

Logging/tracing: The different functionalities will
progressively acquire better logging mechanisms to help track what
has been run and to capture I/O easily. In addition, Joblib will
provide a few I/O primitives to easily define logging and
display streams, and provide a way of compiling a report.
We want to be able to quickly inspect what has been run.