papermill: Parameterize, execute, and analyze notebooks

papermill is a tool for parameterizing, executing, and analyzing
Jupyter Notebooks.

Papermill lets you:

parameterize notebooks

execute notebooks

DEPRECATED: The following functionality will be removed entirely in
papermill 1.0. These features have moved to scrapbook:

collect metrics across the notebooks

summarize collections of notebooks

This opens up new opportunities for how notebooks can be used. For
example:

Perhaps you have a financial report that you wish to run with
different values on the first or last day of a month or at the
beginning or end of the year. Using parameters makes this task
easier.

Do you want to run a notebook and, depending on its results, choose a
particular notebook to run next? You can now programmatically
execute a workflow without having to copy and paste from
notebook to notebook manually.

Installation

From the command line:

pip install papermill

To install optional I/O dependencies, you can specify individual bundles
such as s3 or azure, or use all to install every bundle:

pip install papermill[all]

Python Version Support

This library will support Python 2.7 and 3.5+ until the end-of-life for Python 2 in 2020. After that, Python 2 support will end and only 3.x versions will be maintained.

Usage

Parameterizing a Notebook

To parameterize your notebook, designate a cell with the tag parameters.

Papermill looks for the parameters cell and treats its contents as defaults for the parameters passed in at execution time. Papermill adds a new cell tagged injected-parameters, containing the input parameters, so that they overwrite the values in parameters. If no cell is tagged with parameters, the injected cell is inserted at the top of the notebook.

Additionally, if you rerun a notebook through papermill, it will reuse the injected-parameters cell from the prior run. In this case, papermill replaces the old injected-parameters cell with the new run's inputs.
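The injection rules above can be modeled in a few lines of plain Python. This is only a simplified sketch operating on a list of cell dictionaries, not papermill's actual implementation; the function name inject_parameters and the sample cells are hypothetical, and real notebooks carry more fields than shown here.

```python
import copy


def inject_parameters(cells, parameters):
    """Return a new cell list with an injected-parameters cell added or replaced.

    Simplified model of the rules described above (hypothetical helper,
    not part of papermill).
    """
    # Render the parameters as assignments, e.g. "alpha = 0.6".
    source = "\n".join(f"{name} = {value!r}" for name, value in parameters.items())
    injected = {
        "cell_type": "code",
        "metadata": {"tags": ["injected-parameters"]},
        "source": source,
    }

    def has_tag(cell, tag):
        return tag in cell.get("metadata", {}).get("tags", [])

    result = copy.deepcopy(cells)

    # Rerun: replace the injected cell left behind by a prior run.
    for index, cell in enumerate(result):
        if has_tag(cell, "injected-parameters"):
            result[index] = injected
            return result

    # First run: place the injected cell right after the defaults cell.
    for index, cell in enumerate(result):
        if has_tag(cell, "parameters"):
            result.insert(index + 1, injected)
            return result

    # No parameters cell at all: insert at the top of the notebook.
    result.insert(0, injected)
    return result


notebook_cells = [
    {"cell_type": "markdown", "metadata": {}, "source": "# Report"},
    {"cell_type": "code", "metadata": {"tags": ["parameters"]}, "source": "alpha = 0.1"},
    {"cell_type": "code", "metadata": {}, "source": "print(alpha)"},
]

first_run = inject_parameters(notebook_cells, {"alpha": 0.6})
second_run = inject_parameters(first_run, {"alpha": 0.9})
print([cell["source"] for cell in second_run])
# prints ['# Report', 'alpha = 0.1', 'alpha = 0.9', 'print(alpha)']
```

Because the injected cell runs after the defaults cell, the injected value of alpha is the one the rest of the notebook sees.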

Executing a Notebook

The two ways to execute the notebook with parameters are: (1) through
the Python API and (2) through the command line interface.
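As a minimal sketch of the Python API route, the wrapper below forwards keyword arguments to papermill's execute_notebook function as the parameters dictionary. The wrapper name run_with_parameters, the paths, and the parameter values are illustrative assumptions; running it requires papermill and a Jupyter kernel to be installed.

```python
def run_with_parameters(input_path, output_path, **parameters):
    """Execute input_path with the given parameters and write output_path.

    Hypothetical convenience wrapper around papermill.execute_notebook.
    """
    # Imported lazily so this sketch can be loaded without papermill installed.
    import papermill as pm

    return pm.execute_notebook(input_path, output_path, parameters=parameters)


# Example call (illustrative paths and values; uncomment to run):
# run_with_parameters(
#     "path/to/input.ipynb",
#     "path/to/output.ipynb",
#     alpha=0.6,
#     l1_ratio=0.1,
# )
```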

Execute via CLI

NOTE:
If you use multiple AWS accounts and have properly configured your AWS credentials, you can specify which account to use by setting the AWS_PROFILE environment variable on the command line.
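One way to picture such an invocation is to assemble it from Python. The sketch below builds the argument list for a papermill run that sets alpha and l1_ratio with -p, plus an environment that selects an AWS profile. The helper names, the profile name dev_account, and the values 0.6 and 0.1 are assumptions chosen for illustration only.

```python
import os


def build_papermill_command(input_path, output_path, parameters):
    """Return the argv list for a papermill CLI run using -p for each parameter."""
    argv = ["papermill", input_path, output_path]
    for name, value in parameters.items():
        argv += ["-p", name, str(value)]
    return argv


def environment_for_profile(profile, base_env=None):
    """Return a copy of the environment with AWS_PROFILE set to `profile`."""
    env = dict(os.environ if base_env is None else base_env)
    env["AWS_PROFILE"] = profile
    return env


argv = build_papermill_command(
    "local/input.ipynb",
    "s3://bkt/output.ipynb",
    {"alpha": 0.6, "l1_ratio": 0.1},  # illustrative values
)
env = environment_for_profile("dev_account")  # hypothetical profile name

print(" ".join(argv))
# prints papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1

# To actually run it (needs papermill, S3 support, and valid AWS credentials):
# import subprocess
# subprocess.run(argv, env=env, check=True)
```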

Parameters such as alpha and l1_ratio are set using -p (--parameters also works). Parameter values that look like booleans or numbers will be interpreted as such. Here are the different ways users may set parameters:

$ papermill local/input.ipynb s3://bkt/output.ipynb -r version 1.0

Using -r or --parameters_raw, users can set parameters one by one. However, unlike with -p, the value remains a string, even if it could be interpreted as a number or boolean.
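The difference between -p and -r can be sketched as follows. This is a simplified model of the behavior described above, not papermill's own parsing code, and the exact rules papermill applies to edge cases may differ. The function names and the debug parameter are hypothetical.

```python
def resolve_type(value):
    """Model of -p: turn strings that look like booleans or numbers into those types."""
    if value == "True":
        return True
    if value == "False":
        return False
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        return value  # anything else stays a string


def parse_cli_parameters(pairs, raw_pairs):
    """Combine -p style pairs (type-inferred) with -r style pairs (kept as strings)."""
    parameters = {name: resolve_type(value) for name, value in pairs}
    parameters.update({name: value for name, value in raw_pairs})
    return parameters


params = parse_cli_parameters(
    pairs=[("alpha", "0.6"), ("debug", "True")],  # as if passed with -p
    raw_pairs=[("version", "1.0")],               # as if passed with -r
)
print(params)
# prints {'alpha': 0.6, 'debug': True, 'version': '1.0'}
```

Passing version with -r keeps it as the string '1.0', whereas the same value passed with -p would become the float 1.0.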

Analyzing a Collection of Notebooks

DEPRECATED: This functionality will be removed entirely in papermill 1.0.

See scrapbook's scrapbook
model for an equivalent API for this capability.

Papermill can read in a directory of notebooks and provides the
NotebookCollection interface for operating on them.

"""summary.ipynb"""
import papermill as pm

# Load every notebook in the results directory into a NotebookCollection
nbs = pm.read_notebooks('/path/to/results/')

# Show the named plot from 'train_1.ipynb'
# Accepts a key or a list of keys to plot in order.
nbs.display_output('train_1.ipynb', 'matplotlib_hist')

# Dataframe for all notebooks in the collection
nbs.dataframe.head(10)

Development Guide

Read CONTRIBUTING.md for guidelines on how to set up a local development environment and contribute code changes back to Papermill.

For development guidelines, see the DEVELOPMENT_GUIDE.md file. It explains how to make particular kinds of additions to the code base.