Why not use Jupyter Notebook?

2016-02-01

Jupyter Notebook is a wonderful tool in situations that play to its
strengths. But I try to use it only in those situations, because I find it
makes it hard to write good code and to practice good software engineering.

YMMV, but I use a notebook to:

distribute reports with inline figures, or

explain code containing no function or class definitions to an audience

Why not more often?

1. I find it hard to write substantial code in a notebook

Jumping between notebooks (or even moving around within a notebook) is hard
compared to a text editor or IDE, so functions and classes impose a cognitive
overhead on the coder and the reader that is not entirely overcome by the
shift-tab tooltip.

And you can execute cells out of order. In fact it’s natural to use it in this
way. But I find that I introduce bugs as soon as I try it in a project more
substantial or structured than a script. Sometimes the bugs are simple
NameError exceptions. These are irritating, since they would be easily spotted
in a text editor, but they are simple enough to fix by running cells in order.
Variables mutated by cells run out of order or more than once, however, often
introduce much subtler and less benign bugs.
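
To make the mutated-variable problem concrete, here is a minimal sketch. It is not how a notebook kernel is implemented: it just assumes that cells are snippets of code run in one shared namespace, and the cell names and contents are made up for illustration.

```python
# Hypothetical "cells": snippets of code that share one namespace,
# roughly as the cells of a notebook share a kernel.
cells = {
    "load": "prices = [100, 200, 300]",
    "double": "prices = [p * 2 for p in prices]",
    "total": "total = sum(prices)",
}


def run(order):
    """Execute the named cells in the given order and return `total`."""
    namespace = {}
    for name in order:
        exec(cells[name], namespace)
    return namespace["total"]


# Running top to bottom gives the intended answer.
in_order = run(["load", "double", "total"])

# Re-running the "double" cell (easy to do by accident) silently
# changes the answer. Nothing raises, so nothing tells you.
double_run_twice = run(["load", "double", "double", "total"])

print(in_order, double_run_twice)
```

Running `run(["double", "total"])` instead raises a NameError, which is the benign case: at least it fails loudly.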

The upshot of these two issues is that I almost never define classes or
non-lambda functions in a notebook. When I do, I write buggy, confusing code.
And in any case, if code is useful enough and abstractable enough to be a
function or class, then it belongs in a module that can be imported by other
code, not siloed off in a notebook.

An environment in which I avoid writing functions or classes is not an
environment I can use for substantial work.

2. It doesn’t play nice with version control

The notebook way of doing things requires documentation, code and output to be
woven together. There is not a one-to-one mapping between lines of the .ipynb
file and the code and comments you typed. This complicates the semantics of
version control enormously.

Notebook diffs are difficult to read, which makes things harder for the
committer, and much harder for anyone who has to review commits (for example,
a diff produced by fixing a NameError just by running cells in order, without
changing any code). And your version control software may insist that a file
has changed when, from your point of view, it hasn’t.

Why is this? Firstly, diffs due to real changes to the input are hard to read
because they contain JSON cruft, and the <textarea> editor makes it harder to
edit code consistently.

Secondly, changes to the metadata section of a notebook are invisible to the
user until they diff. And finally, a notebook changes when you run a cell,
even if the input is unchanged:

cell numbers change every time you run a cell

output can change if the environment is different or the code has random
behaviour

the string that encodes an inline image can change, even if the image is
visually identical
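
The first of these is easy to reproduce without Jupyter at all. The sketch below builds two minimal notebook-like dicts by hand (a simplified stand-in for a real .ipynb file, with made-up cell content), identical except for the execution count, and diffs their JSON serialisations.

```python
import difflib
import json


def notebook(execution_count):
    """Build a minimal notebook-like dict containing one code cell."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [],
                "source": ["x = 1 + 1\n"],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 0,
    }


def serialized_lines(nb):
    """Serialise the dict to JSON and split it into lines for diffing."""
    return json.dumps(nb, indent=1, sort_keys=True).splitlines()


before = serialized_lines(notebook(1))  # cell run once
after = serialized_lines(notebook(2))   # same cell run again, same source

# Keep only the added/removed lines, dropping the diff headers.
changed = [
    line
    for line in difflib.unified_diff(before, after, lineterm="", n=0)
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print("\n".join(changed))
```

The source of the cell is byte-for-byte identical in both versions, but the file still differs, so version control reports a change.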

You can of course manually clear output before committing, or use commit hooks
to automate this. And although it makes reading diffs extra-difficult, you
might actually want to record output changes in version control. But these
complexities make integrating notebooks into a source-controlled project (or
getting used to version control as a beginner) non-trivial.
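
For the curious, clearing output amounts to something like the sketch below. It uses only the standard library and assumes the version 4 notebook layout, in which each code cell carries `outputs` and `execution_count` keys. The function name and example cells are hypothetical, and a real commit hook would also need to read and write the .ipynb file and decide what to do about metadata.

```python
import copy


def strip_outputs(nb):
    """Return a copy of a notebook dict with outputs and counts cleared."""
    cleaned = copy.deepcopy(nb)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def example_notebook(execution_count, text):
    """Hypothetical notebook: one markdown cell and one executed code cell."""
    return {
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": ["# Report\n"]},
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {"output_type": "stream", "name": "stdout", "text": [text]}
                ],
                "source": ["import random\n", "print(random.random())\n"],
            },
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 0,
    }


# Two runs of the same code: different counts, different random output.
first_run = example_notebook(3, "0.417\n")
second_run = example_notebook(7, "0.902\n")

print(first_run == second_run)                                # False
print(strip_outputs(first_run) == strip_outputs(second_run))  # True
```

After stripping, two runs of unchanged code compare equal, which is what you want version control to see. The cost is that the figures and results are no longer in the committed file.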

3. I have to swap my text editor for a <textarea>

This is a big point, but it doesn’t take as long as the others to explain: text
editing in Notebook has come on leaps and bounds. But it doesn’t help me write
PEP 8-compliant code, it doesn’t have IDE features to navigate a non-trivial
project and spot bugs, and because it’s ultimately a tricked-out <textarea>
for the general user, it will never behave like my text editor.

So, that’s why I don’t use Jupyter Notebook more often!

Jupyter Notebook is great!

All that said, the features of Jupyter Notebook that lead to the difficulties
I’ve described are features, not bugs, and I’m certainly not suggesting they
be “fixed”. All things in moderation, but a notebook is often the right tool
for the job, and literate programming is a tremendously powerful way of
interacting with a computer.

If a notebook workflow works for you, then my intention here is not to persuade
you that you’re wrong! But if you struggle to grok development in notebooks as
I do, I wrote this to remind you that modules and text editors are still an
option. And if you teach, think about whether the benefits justify the
complexities for beginner coders.