I think the majority of the complaints I hear about notebooks come from a misunderstanding of what they're supposed to be. A notebook is a mashup between a scientific paper and a REPL, so it's useful for a bit of both:

a) Just like with a paper, you can present scientific or mathematical ideas with accompanying visualizations or simulations. From the REPL side, as a bonus, you get interactivity, and the reader can pause and experiment with the examples you're giving to improve their understanding or test their hypotheses. If I change this variable, how will the system react? You can just try it!

b) Just like with a REPL, you can type in and execute commands step by step, viewing the output of the previous command instead of running the whole thing at once. From the document side, as a bonus, you get nicer presentation (charts, interactivity, nice and wide sortable tables, etc) than you would in a shell, which comes in handy when doing things like data exploration or mathematical simulation.

It's decidedly NOT there for you to type all your code in like an editor and make a huge mess. It's apples and oranges compared to, and a poor substitute for, something like PyCharm or VS Code or vim. It is there for you to a) try things out yourself, so that whatever you discover hopefully eventually makes it into proper Python modules, and b) make interesting ideas presentable and explorable for others. That's all!

When I see stuff like "out of order execution is confusing", I don't disagree, but it does make me wonder how long and convoluted the notebooks these people work with are - probably a ripe candidate to refactor stuff out into python modules as functions. When I see stuff around notebooks for "reproducibility", I'm a bit confused in that notebooks often don't specify any guidance on installation and dependencies, let alone things like arguments and options that a regular old script would. In that regard I think it's barely an improvement over .py files lying around. When I hear "how do I import a notebook like a python module", I'm very very scared.
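To make the "refactor stuff out into python modules" point concrete, here's a minimal sketch (the module and function names are made up): a cell that grew a pile of inline logic becomes a plain function in a .py file, and the notebook shrinks to a call site.

```python
# analysis_utils.py -- hypothetical module extracted from a sprawling notebook.
# The notebook cell that used to contain all of this now just does:
#     from analysis_utils import summarize
#     summarize(records)

def summarize(records):
    """Return (count, mean) for a list of numeric records."""
    count = len(records)
    mean = sum(records) / count if count else 0.0
    return count, mean
```

The function is now importable, testable, and usable from scripts, while the notebook keeps its role as presentation and exploration.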

Granted, I've seen huge notebooks that are a mess, so I understand the frustration, but it's not like we all haven't seen the single file of code with 5000 lines and 10 nested layers of conditionals at some point in our lives.

> When I see stuff around notebooks for "reproducibility", I'm a bit confused in that notebooks often don't specify any guidance on installation and dependencies, let alone things like arguments and options that a regular old script would.

At the core of this, as some others may have already alluded to already, is that many academic scientists have not been socialized to make a distinction between development and production environments. Jupyter notebooks are clearly beneficial for sandboxing and trying out analyses creatively (with many wrong turns) before running "production" analyses, which ideally should be the ones that are reproducible. For many scientific papers, the analysis stops at "I was messing around in SPSS and MATLAB at 3 AM and got this result" without much consideration for reformulating what the researcher did and rewriting code/scripts so that they can be re-run consistently.

> many academic scientists have not been socialized to make a distinction between development and production environments

Geologist here - definitely true in my field. Nonetheless, while I don't develop in notebooks at all, I do use them for "reproducibility" in a sense -- by putting a bit of dependency info in a github repo along with a .ipynb file, I can do things like this: https://mybinder.org/v2/gh/brenhinkeller/Chron.jl/master?fil...

Which ends up being useful when a lot of folks in my field don't do any computational work at all, so being able to just click on a link and have something work in browser is a big help.

This is kind of a broad observation, but scientists tend to borrow tools from a huge variety of fields, and use them in ways that seem un-disciplined to the practitioners of those fields. For instance, an engineer would be horrified to see me working in the machine shop without a fully dimensioned and toleranced drawing. A project manager would be disturbed to learn that I don't have a pre-written plan for my next task. How do I even know what I'm going to do? If we adopted the most disciplined processes from every field, we'd grind to a halt.

In fact, there might be something about what attracts people to be scientists rather than engineers, that makes us bristle at doing what engineers consider to be "good" engineering.

I agree that science can't be bound by the rigid structures of most applied disciplines, and that the freedom to combine technologies in novel ways is a pre-requisite to novel findings.

What I find objectionable is the inability of scientists to explicitly delegate tasks to domain specialists in their everyday work when it makes sense. I think it's unrealistic of you to believe that engineers always work with "a fully dimensioned and toleranced drawing" before starting work on a project, and that your work would "grind to a halt". Indeed, there's a reason for the qualifier rapid in the term "rapid prototyping". If you can give an engineer general specifications for what you want and then leave him/her alone, he/she should be able to produce something that mostly fits your needs while avoiding all of the pitfalls that wouldn't have occurred to you. It would also be incorrect to assume that engineering does not involve creativity and is purely bound by rigid processes; if your requirements were strange enough, something fresh would inevitably be built.

This sort of delegation, of course, is actually more efficient, since you can work on other tasks in parallel with the engineer (such as writing your next grant proposal or article or gasp teaching). Most scientists also already do this implicitly by choosing to purchase instrumentation from manufacturers like Olympus, Philips, or Siemens rather than building it themselves.

Part of the reason why I have such strong opinions about this matter is that I've actually witnessed scientists waste more time messing around in fields where they were clearly out of their depth. As an example, there was a thread on a listserv in my (former) field that lasted for literally months and was solely devoted to the appearance of a website. Everyone wanted to turn the website design into an academic debate, when the website's creation (which had little to do with the substance of the scholarship itself) could have been turned over to a seasoned web developer and finished in a week or two.

But in the case of dev and prod distinction it has nothing to do with fitting some over-constrained engineering principle, but about fitting actual science: if you cannot reproduce something, you don't have a result, you have a fluke.

I think GP's comment here is insightful. Reproducing things is indeed important, but re-running code is much too narrow a definition, and possibly distractingly narrow.

Maybe your awful notebook gets the same answer you got the day before on the blackboard. Or the same answer your collaborator got independently, perhaps with different tools. Those might be great checks that you understand what you're doing. Spending time on them might be more valuable for finding errors than spending time on making one approach run without human intervention.

Not to say that there aren't some scientists who would benefit from better engineering. But it's too strong to say that fixing everything that looks wrong to an engineer's eyes is automatically a good idea.

I find that with Jupyter, re-running code does serve one useful purpose, which is to make sure that your result isn't affected by out-of-order execution or a global that you declared and forgot about. That is a real pitfall of Jupyter that has to be explained to beginners.
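A minimal sketch of that pitfall, mimicking two notebook cells in plain Python (the names are invented). The first "cell" later gets deleted from the notebook, but its binding lives on in the kernel, so the second cell keeps working interactively and only breaks on Restart & Run All:

```python
# "Cell 1" -- later deleted from the notebook, but the variable
# survives in the running kernel:
threshold = 0.5

# "Cell 2" -- silently depends on the forgotten global, so it works
# interactively but fails after a kernel restart:
def classify(score):
    return "high" if score > threshold else "low"
```

Re-running the whole notebook from a fresh kernel is what surfaces the hidden dependency.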

For my work, reproducing a result may involve collecting more data, because a notebook might be a piece of a bigger puzzle that includes hardware and physical data. This is where scripting is a two edged sword. On the one hand, it's easy to get sloppy in all of the ways that horrify real programmers. On the other hand, scripting an experiment so it runs with little manual intervention means that you can run it several times.

Huge fan of just including an environment.yml for a conda virtual-env in the repo you store your notebooks in, but the challenge there is that it's OS specific reproducibility. I've had no luck creating a single yml for all OS's and the overhead of creating similar yml's for (say) Mac and Win is a lot unless you plan on sharing your notebook widely.
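One partial mitigation (a sketch, not a recipe): write the environment.yml by hand with loosely pinned, top-level dependencies rather than using a full `conda env export`, which bakes in OS-specific build strings. Something like:

```yaml
# Hand-written environment.yml: only top-level deps, loosely pinned,
# which tends to resolve on more than one OS than a full export does.
name: notebook-env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - numpy
  - pandas
  - jupyterlab
```

The trade-off is weaker reproducibility (the solver may pick different minor versions per OS) in exchange for a single file that works everywhere.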

If you have ever used an R Notebook written in R Markdown, then it's pretty easy to see why Jupyter Notebooks putting everything in JSON is just... infuriatingly wrong-headed. In an R Notebook, I can see my code, I can see my text, everything is exceedingly simple to understand, and I can edit it in any of the fantastic text editors out there (Jupyter's editor is not among them).

RStudio is also my favorite editor. All my work is data science / stats related, where I like the workflow of writing/modifying code in a .R (or .py) file, and being able to quickly experiment by running chunks in a REPL with Ctrl + Enter.

R and Python are supported. No Julia, unfortunately.
VS Code and Atom support similar workflows with Julia. However, the Julia Language server in VS Code is extremely unstable and I regularly lose LaTeX completions.
The REPL in Atom is mind boggling laggy and slow to the point that it is much less frustrating to copy and paste code into a REPL running in your favorite terminal emulator.

Is that Atom's fault or just the Julia REPL's fault? I use the REPL directly on Windows and it seems to be really slow, as it will take something like "using JuMP" and precompile the module, which takes time.

I think it is Juno (the Julia package for Atom)'s fault.
Atom is fine on its own, as is the Julia REPL after compilation.

I just looked through Julia's settings tab in Atom, and saw the option "Fallback Renderer" with the note "Enable this if you're experiencing slowdowns in the built-in terminals."
It was disabled by default, so I've just enabled it.

Subjectively, I think it feels fine now. Longer use will tell, but I suspect I was just running into a known issue some setups run into, and they already provided the workaround.

EDIT:
Comparing running some code in Atom's terminal and a REPL running in the GNOME Terminal, the regular REPL still feels notably snappier -- even though I'm `using OhMyREPL`, which makes the REPL a bit less responsive.

I'd say Atom feels acceptable (and definitely not "mind boggling laggy" right now), and shift/ctrl + enter more convenient than switching tabs.
So I will stick with it (for Julia). More time shall tell.

As a result of the serialize-to-JSON approach, Jupyter supports R, Python, Scala, Go, Lua, Bash, Julia, and Haskell, among others. It's accessible to a much wider range of programmers, at the cost of version control being a bit weirder.

That is a complete non-sequitur. Json in no way enables that. Just having a defined format enables that.

Emacs org-mode is proof that a simple text format with markup rules is all you really need to support multiple languages in a single file. You lose some of the simplicity of parsing the file, but you gain a ton more.

To that point, a browser cannot render a notebook; it can just parse the JSON. It can also parse text/plain, so it could show the org document without styling. The org document is actually readable. JSON... not so much.

How is a notebook without proper software to handle it in any way more useful than any other structured plaintext file? Yes, JSON can be pretty-printed in a browser, but what then? It's still a useless mess you can't work with.

Edit: For trivial examples of "org-mode" in an org-mode document, you need only look at the documentation of org-mode. That said, I expect there to be limitations, because they make sense. Similar to how you can pretty print json inside a jupyter notebook, but don't expect to have a notebook interpreted in the notebook. (If that makes sense.)

Ah, actually it looks like I'm somewhat mistaken. R Markdown supports other languages as well. I think the real difference is that it doesn't look like R Markdown supports partial evaluation.

By that I mean that to share the R Markdown doc, it appears you need to rerun the whole thing. It does some tricks to do concurrent visualization, but to actually share the doc you have to rerun all the R/Python from scratch.

In jupyter OTOH, if I have a long running ML pipeline as part of my doc, I can render without rerunning the pipeline.

You can cache the results of an Rmd cell, and you can also share the rendered version of the doc first. You're right that there's a higher emphasis on "run the whole thing," and I think that's a conscious (and acceptable) design choice versus not being sure that the shared doc will run as provided.

The main reason for json, I believe, is that the Jupyter client is separate from the backend. It's actually pretty trivial to run the engine on a beefy box while interacting on a light laptop (on the same subnet). With Jupyter Lab and some fiddling, you can put the server anywhere.

It's also trivial to export notebooks to .py files.

That said, my goodness do notebooks wreak havoc on git. I hope this in particular gets fixed as popularity grows.

Having a proper server available is even more reason to use a proper fileformat. The client doesn't care what the server handles and the server doesn't need to send raw datastructures directly from the storage.

Actually, fixing the fileformat mess should be very simple. Just change the file load/save functions. Use a folder structure with every cell being a separate file. Or switch to XML. Or make a generic interface and allow saving in whatever format people want. Saving notebooks in MongoDB or some SQL database seems like a good goal for dedicated services.
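The folder-per-cell idea is easy to prototype on top of the existing format. A rough sketch (the file naming is made up) that splits a .ipynb into per-cell source files that diff cleanly:

```python
import json
import pathlib

def explode_notebook(ipynb_path, out_dir):
    """Split a notebook's cells into numbered files, one per cell.
    Hypothetical layout: cell_000.py, cell_001.md, ..."""
    nb = json.loads(pathlib.Path(ipynb_path).read_text())
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i, cell in enumerate(nb["cells"]):
        ext = "md" if cell["cell_type"] == "markdown" else "py"
        # nbformat stores cell source as a list of lines
        (out / f"cell_{i:03d}.{ext}").write_text("".join(cell["source"]))
```

A matching load function reassembling the JSON would complete the round trip; outputs and metadata are dropped here, which is arguably a feature for version control.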

Your parting "Granted..." is precisely what fills me with dread when I see notebooks. Yes, I have seen poorly done source files. I have made more than a few myself. However, many of the practices we have grown into as sound programming advice seem to be largely thrown out the window for these notebooks.

The irony, to me, is that I actually typically argue for the mixing of presentation and content. But to me, notebooks look like an attempt by people to make a WYSIWYG out of JUnit/TestNG/whatever style reports. Only, without the repeatability.

There is also the entire bend where these are taking off in a way that doesn't make sense. Do they do the things you are saying? Well, yeah. But no better than plenty of tools before them. Mathematica and Matlab both had "notebook" like features for a long long time. Complete with optimized libraries. And this is ignoring the interactivity of the old LISP machines. (You can see from my history I have a soft spot for emacs org-mode.)

Jupyter is a lot of things. Bad isn't necessarily one of them, but exceptional isn't, either. Heavily marketed is.

Yes, both are expensive outside the student licenses, but Mathematica is significantly cheaper and has a lot more built into the language, so you don't have to turn around and buy expensive "toolboxes" for all the functionality missing in Matlab.

Notebooks have been in Mathematica for ages and are really powerful and difficult to describe to those who haven't used them. To give an example, I was building a tool and embedded images as variables in a way reminiscent of being an engineer on the USS Enterprise. You can point to a file in Python as a variable, but you can't just copy-paste an image in as a variable last I checked (don't think Jupyter is there yet).

> There is also the entire bend where these are taking off in a way that doesn't make sense.

It makes perfect sense. Just not to a lot of HN readers.

The average HN reader is approaching this from a perspective of "I am a professional programmer who might occasionally dabble in scientific computing, and therefore I hate this thing because it's not a professional programmer's tool designed by and for professional programmers according to the best practices of professional programmers".

The people who are actually using notebooks, meanwhile, are not professional programmers. They're scientists who increasingly have to do programming as part of their science. And notebooks are a godsend for them. We don't need to drag them all the way into our world; we need to pay attention to what they actually want, need, and find useful, and accept that it's going to differ from what we want, need, and find useful.

Because I don't have four years to get something done, that doesn't do what I want when I finally get it, if it even works at all, and that I can't fix myself.

Okay, that was extreme, and if you think I was talking about programming, it's because you have a guilty conscience. ;-) It actually applies to all interesting fields -- programming, engineering, management, classical music composition, etc. Those fields don't even know what their best practices are, and acknowledge that things take too long and can't be managed. No manager would say: "Our programmers have best practices, so the work will be done next week." Why should scientists have such faith?

Meanwhile, do you trust Maxwell's Equations, Darwinian evolution, quantum mechanics, etc.? How did we establish the physical constants to mostly better than 8 digits of precision? Science has somehow figured out how to make progress despite the messy business of research.

For me, it's not that I "have" to do programming, but that physical science has been computation driven since before the 1940s. Programming is how I think and work. With apologies to Richelieu, "programming is too important to be left to the programmers."

"Best practices" are a chimera. The issue at hand isn't about what is "best", but whether or not a software engineer's "good enough" practices are more likely to achieve science's goals than a graduate student's "good enough" practices.

It's also disingenuous to claim that classical music composition doesn't have "best practices" when the field of music theory exists as an explicit manifestation of "best practices" in music. Having gone to a school with a conservatory, I also believe that I know several individuals who would disagree with your mindset regarding how the creative process can't be managed. Indeed, if creativity, as it relates to musical composition, couldn't be managed, most orchestras would be brimming with anger at the number of commissions that weren't finished on time for the concert, and most Hollywood studios and Broadway shows would screech to a halt.

Show me the reproducible research in programming about the merits of different type systems (murky at best). Or of different approaches to testing. Or software architecture. Or... well, most of the stuff day-to-day working programmers actually do. There are barely even attempts at rigor in most of our practices, let alone the kind of reviewed and reproduced results we demand from the sciences.

I run my unit and integration tests with every build, and they reproducibly pass if my code is working. If you have code, it doesn't take much to make it able to run again and get the same result, and it's frustrating to see Jupyter users mess it up.

I work in development of scientific equipment. Jupyter is my lab notebook. I think that to make good use of Jupyter for this purpose, you have to be a good programmer and a good scientist. No tool will turn us into these things against our will.

With that said, Jupyter has greatly improved my ability to find my own mistakes, and to reproduce my own results later on.

I think it speaks to people's desire for a quick, easy-to-set-up basic GUI creator with an editor that allows inline code editing, and no need to deal explicitly with the client-server interaction.

I myself, as someone who likes to create really solid and maintainable tools, have fallen into the notebook trap and written things like "change the month in cell 22 then execute cells 1 through 3 and 20 through 27 to update the report".

The notebook format was great for prototyping what was really a small app. You don't really have those problems when you're just generating a document.

You're right about what excel is (and the whole VB ecosystem for that matter), but I think the critical difference is that the language and environment are very different. If I know the smallest amount of python (or R) I can leverage Jupyter notebooks and it is intuitive.

To really get something great out of excel you have to learn excel. I think that difference is almost as important as the excel stigma.

I agree completely on your first point - notebooks are a poor substitute for proper software tooling. I wrote this recently [1]

> In the case of an analyst, the domain of "software engineering" lies close to their own domain. Projects in both areas require code which (ideally) exhibits clarity and reproducibility. Obfuscated software is bad [...] and idempotency is good.

> The problem, then, is when the analyst takes a core tool from their domain and applies it to a slightly different domain like software engineering. Things go south fast: your notebook has not-quite-imperative code that is untested and unmonitored. It is, in other words, bad software.

As for the point about "refactoring stuff out into python modules as functions," the problem is that the new crop of data scientists aren't learning how to do this. The role of "machine learning engineer" is emerging to address this shortcoming in SWE skill throughout the data science community. It honestly cannot happen quickly enough.

I fundamentally agree with you, but I have the feeling that some of the major proponents of notebooks belong to the category of people who misunderstand them, simply use them for everything, and write long and convoluted notebooks; I’ve definitely seen my share of those in my domain (bioinformatics, AI) and elsewhere. By contrast, Joel Grus, for instance, perfectly understands their strengths and weaknesses.

As for being a good REPL, I feel that an actual REPL (+ editor integration) works better than notebooks: you can combine a literate document with a REPL but still get the benefits of a proper editor/IDE and a proper execution environment, rather than a half-hearted mix of both that’s hosted inside an HTML contenteditable (= Jupyter), and you also get “charts, interactivity, nice and wide sortable tables, etc” if you want. RMarkdown inside RStudio or Nvim-R does this well. I just don’t want to give up the advantages of a proper editor for the very slight increase in integration that Jupyter gives me.

Is it actually a common skill to write meaningful, non-hello-worldish Python code that yields expected results without a number of iterations of debugging and correcting, and without PyCharm's intelligent completion, hinting, and correcting features? I understand the value of Jupyter notebooks for publishing your work results but find it almost impossible to use them to actually do the work - it feels a million times more convenient to code in PyCharm and then copy-paste the code to Jupyter once it's ready.

I think we can all agree some notebooks are shit storms and should not be relied upon at ALL for production. At my job we started using notebooks as an 'in-repo', 'interactive' documentation of sorts: showcase various modules and give simple usage examples of them. It was pretty awesome. I love using notebooks as a more advanced scratch pad, for the times when the ipython shell isn't enough and you want something extra. Also I had to install the vim bindings ASAP; gotta have that vim.

I'd say it's more of a shell than a REPL. For most languages that provide a shell, there isn't a real separation between the reader, evaluator, and the printer. Being able to interact with those components separately is the real advantage of a REPL over a shell.

> The majority of the complaints I hear about notebooks I think come from a misunderstanding of what they're supposed to be

No, the majority of complaints are that notebooks are great, but Jupyter is a bad notebook. I mean maybe it’s impressive to someone who’s never seen a notebook before but to someone used to Mathematica, MathCAD,
RMarkdown, org-mode, whatever, it just seems clunky as hell. I wonder how many “data scientists” claiming it as their top choice have ever tried anything else?

Version control for Jupyter notebooks was one of the biggest complaints I had. Specifically, diff and merge with the JSON files (.ipynb) is ugly.

I built ReviewNb[1] to solve one of those problems (diff). Note that, there is nbdime[2] which works well for local diff/merge. The idea for ReviewNb is to have much tighter integration with GitHub etc.

The hard part is that introducing a tool like git (which requires you to choose moments to take a snapshot of the file, and then add some commit message) breaks the flow of interactive experimentation that notebooks are so good for. And then we need to find a way to make those commits useful, because the time ordering of commits could be different from the time order in which cells were run! That is what is crucial to making computations reproducible — viewers should be able to replay the history of how a notebook result came to be. (EDIT: Note that this is the case only for stateful computations -- if a notebook interface was used to construct a dataflow graph (like spreadsheets) with values updating live, then this wouldn't be so much of a problem. More fundamentally, it is not at all obvious that thinking of notebook contents as akin to code is the best way to use version control)

I wonder whether there is a solution along the lines of auto-committing each cell before it’s executed and the results just after the cell is executed. Otherwise a user has to do too much manual organizing, which is a problem the notebook should ideally solve. When a user is happy with the experiments and the provenance of their results, they should be able to use an interactive rebase to create a cleaner version to share/archive.
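A rough sketch of that auto-commit idea, with each cell's source written to its own file and committed at execution time (the file layout, commit message, and function name are all invented):

```python
import pathlib
import subprocess

def commit_cell(repo_dir, cell_index, source):
    """Record a cell's source in git just before it runs, so commit
    order mirrors execution order rather than cell order."""
    path = pathlib.Path(repo_dir) / f"cell_{cell_index:03d}.py"
    path.write_text(source)
    subprocess.run(["git", "add", path.name], cwd=repo_dir, check=True)
    subprocess.run(
        ["git", "-c", "user.name=nb", "-c", "user.email=nb@example.com",
         "commit", "-m", f"run cell {cell_index}"],
        cwd=repo_dir, check=True,
    )
```

The interactive-rebase cleanup step would then operate on these per-execution commits to produce the shareable, linear version.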

To have reproducible prototypes, I use Make to wrap the whole workflow in Docker. Then I push the code to a gist and forget about it. Although GitHub gist doesn't allow binary files, images embedded in .ipynb (JSON), on the other hand, work in gist. Here is an example.

This is a super valuable perspective! We (https://qri.io) are building a kind of git/github for datasets and are hoping to talk to would-be users about just this issue. Would love to have your feedback on it (particularly on how commits are registered). Mind if I ping you at the email address you listed? - Rico

The RCloud project covered some of this ground https://cscheid.net/2015/08/17/collaborative-visual-analysis... It takes the view that everything should be saved and versioned. In hindsight it seems obvious that this can overwhelm people with dead ends and scratch work and in general the flat workbook space doesn't provide enough help with organizing results. There are some other ideas mentioned in the conclusion of the RCloud paper.

For me, markdown is meant to be readable even when not rendered. I could see how not having persistent graphs and tables might be an issue, but my own philosophy is to start fresh each time - I treat it like a templating language with some convenient rendering features for prototyping, rather than like an IDE.

Your last point also has an upside - it's using different engines (Rmarkdown vs. Sweave). I can write whatever HTML or LaTeX code I want, depending on what's appropriate. I wouldn't want to have to make web documents with LaTeX, nor would I want to make PDFs with HTML.

RMarkdown and Knitr are dramatic improvements in terms of final outputs and VC relative to notebooks. Notebook believers (Satan worshippers, imho) would suggest that notebooks are best for developing in and not primarily made for use as final outputs.

I think fundamentally JSON is just the wrong format for these files. Speaking from (ancient and limited) experience, I made a little notebook-style interpreter for learning Scala back in 2009 or so called scalide. It saved its files ("scalapads") to XML. XML actually worked better in some ways since most of the code could live between the tags unescaped (sans < > &), so it merged/diffed the user code well. The meta-level stuff (cell boundaries etc.) needed by the notebook... not so much.

In json the code has to be escaped into strings, and json is really finicky about syntax (e.g. no trailing commas). So it doesn't work well.

I never got the chance to redo it, however the solution I was leaning to for my post "I won the lottery, I can work on fun stuff" attempt was to store the meta-code in a version of the host language(s), with some simple syntax that could live comfortably in the comments of various different languages to do things like encode the cell divisions and so on.

Basically something like:
#notebook[lang=python]

#cell[lang=python]
def add(x, y):
    return x + y
#endcell

//notebook[lang=scala]

//cell[lang=scala]
def add(x: Int, y: Int) = x + y
//endcell

This I think would be beneficial for a couple of reasons.

1. Better diffing / merging.

2. One click toggle between show source and view as notebook mode, which would really allow this to work in an IDE like vscode pretty seamlessly. The cells become something akin to //#regions in the IDE. But at the end of the day you are still editing a source code file, so you can edit the whole file easily.

3. The keyboard shortcuts for executing and jumping between cells would generally work in raw code mode, so you could just edit there continuously, manually writing out //cell //endcell. Also, the execution results could appear in block comments inline in the editor, off to the side, or in a popup above the code you are editing.

4. The IDEs could uprender the comment-syntax into cells as they gained better support for the paradigm (similar to how they do for code folding / syntax higlighting already).

5. Eventually, perhaps a cross language, metasyntax could be established to make things a bit more concrete than magic comments (get ready for some serious bikeshed painting though!)

The closest I have seen anything come in this regard is Quokka however it's not quite all the way there.

The closest thing I've seen to what you described would be... Emacs. It actually uses the "metadata in file-specific comments" paradigm. You can put file-local values for Emacs variables in comments at the top or bottom of your file, like described in [0].

Your example could be rewritten as:

# -*- notebook-lang: python -*-

or

// -*- notebook-lang: scala -*-

Still, the usual way of using Emacs for "interactive notebooks" is via org-mode, which is a better Markdown with support for (among other things) executing code blocks straight in the org document you're writing. This way, Emacs supports all your points 1 to 5, and is generally more powerful than Jupyter or other similar things, but it also means you can kiss any kind of collaboration goodbye.

For some weird reason, the more powerful a tool, the less likely it is other people will be using it.

Well, the reason is not so weird. Generally speaking, the more powerful the tool, the higher the bar. More time and effort is required to learn it and become proficient with it. When it comes to the very powerful tools, few will have the aptitude or be prepared to put in the effort to learn them.

For those who don't, less powerful tools take their place and proliferate.

As you say, emacs checks all the boxes, but the majority is not prepared to learn it and prefers to program through their browser.

Anything in something other than the primary language could be in something like `execute_scala(""" scala code """)` - which would execute properly given proper globals.
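A sketch of what such a wrapper could look like (the helper name is hypothetical): hand the foreign-language source to its interpreter on stdin and capture stdout.

```python
import subprocess
import sys

def execute_snippet(interpreter_cmd, source):
    """Run a snippet under a foreign-language interpreter and return
    its stdout. For Scala you'd pass the scala launcher command; here
    the demo just uses the Python interpreter itself."""
    result = subprocess.run(
        interpreter_cmd,
        input=source,
        text=True,
        capture_output=True,
        check=True,
    )
    return result.stdout

# Demo against the current Python interpreter:
# execute_snippet([sys.executable, "-"], "print(1 + 1)") returns "2\n"
```

Real cross-language cells would also need a story for passing values in and out, which is where most of the complexity lives.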

As long as the output-hash storage is treated as append-only and is highly available (output cells could even be encrypted for security if this was a public cloud service, or you could even use a local or shared filesystem), then this file would not only parse and run as a perfectly valid Python file, but it would also hold references to outputs in a source-control friendly way. IDEs could show the cell outputs inline. If you rerun your notebook and get different outputs for some reason, `git diff` tells you exactly where things changed without being too messy. Basically, put outputs in off-chain storage, and just be a literate code file.
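A rough sketch of that storage scheme, with the directory layout and reference-comment syntax made up for illustration:

```python
import hashlib
import pathlib
import tempfile

# Append-only, content-addressed output store; a throwaway local directory
# here, but it could equally be a shared filesystem or cloud bucket.
STORE = pathlib.Path(tempfile.mkdtemp()) / "outputs"

def put_output(data: bytes) -> str:
    """Store a cell's output and return the hash the source file references."""
    digest = hashlib.sha256(data).hexdigest()
    STORE.mkdir(parents=True, exist_ok=True)
    path = STORE / digest
    if not path.exists():   # append-only: existing outputs are never rewritten
        path.write_bytes(data)
    return digest

# In the notebook-as-.py file, the cell output becomes a stable one-line ref,
# e.g.  #| output: sha256=<digest>  -- so `git diff` only changes when outputs do.
ref = put_output(b"col_a,col_b\n1,2\n")
```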

I’m glad I’m not the only one. When I inherited some “production notebooks” (if that’s a thing) I couldn’t believe it was nearly impossible to do basic things such as test and review changes (via version control).

At our company, if it's in a notebook it's not considered ready for production; it must run as a script before Eng will consider taking it over from DS. It's actually not that hard to write a notebook in such a way that it converts easily to a script: check that your variables/functions/whatever are initialized above the cells they're used in, declare all imports in the top cell, and periodically move cells to fix any inconsistencies with these rules (checking that you didn't break anything, of course). I've always said that "Data Scientist" doesn't mean "I don't do engineering": good basic eng practice makes for more productive data science and brings it into production more robustly. How do you know your models work well if the code that generated them is inscrutable?

I wonder how much of the "3 engineers for 1 data scientist" ratio I hear all the time is due to Data Engineering being assigned the role of cleanup to code that should be better in the first place.

I think cleanup is part of it. I also have noticed as a guy on the DS job family, but who has taken a large interest in SDE work, separating the job families can result in churn.
For example, I might think of three model choices A, B and C. C may be the worst of the three, but only very marginally worse. It can also be the case that C is an order of magnitude easier to keep and maintain in production.

I've seen cases where the wrong choice here ends up requiring three SDEs for half a year, where if they gave up a tiny benefit of the best model, they could have done it with 1 SDE in 1 month.

You don't use Jupyter notebooks in production; they are super useful for pitching ideas to clients/bosses and doing some early prototyping. I feel sorry for anyone that has to work with "pure data scientists" that have no clue about software engineering practices...

It depends on what you're doing, yeah? In RMarkdown notebooks... yeah, I wouldn't write models in one. But if the focus is on embedding some visualizations and tables into a document, and then refreshing the document every so often to pull in new data, I can see that as a production use for a notebook. TL;DR: can be useful for reporting; wouldn't use it anywhere else in the pipeline.

I'm coming to realize one of the key skills for a data engineer to have nowadays is "productionizing" notebook code from data scientists and PMs and teaching them to make it more testable and modular in the first place.

We're building something just like that at qri (https://qri.io), a free and open source dataset version control system. Right now all datasets on qri are public by default, but we're working toward supporting encryption and private networks.

Jupytext, linked elsewhere in the thread, seems like a step in the right direction. Instead of changing the whole tool, accept that you're always going to be married to GitHub and change the serialization layer to be source-control friendly. Basically, split the input from the output+metadata and flatten it all to text. Then you can source-control it fine, and if you need the output+metadata, fold them back in.
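For the curious, Jupytext's "percent" pairing stores a notebook as an ordinary script with cell boundaries in comments, roughly like this (a sketch from memory; check the Jupytext docs for the exact markers):

```python
# %% [markdown]
# # Monthly totals
# Narrative text lives in markdown cells, written as plain comments,
# so the whole notebook diffs cleanly and runs as a normal script.

# %%
totals = [1, 2, 3]
monthly_sum = sum(totals)
print(monthly_sum)
```

Because the file is valid Python, the same content can be opened as cells in Jupyter or edited as text in any IDE.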

As with so many things Python-related (including Python itself), I am perplexed by how willing people seem to be to fall in love with solutions that have so many limitations and problems. I find Jupyter just barely usable. I constantly have issues with editing in the cells, diagrams not sizing correctly, cells accidentally displaying huge amounts of data and freezing my browser, complete failure of autocompletion in many languages, a very awkward security model involving manual cutting and pasting of auth tokens, and near impossibility of getting a reasonable rendering of the notebook into something like PDF (yes, there are attempted solutions; they are full of problems). Many limitations derive directly from the architecture: the kernels are limited in what they can do because language-specific parts have to be interpreted in the browser.

From my perspective, it's a dumpster fire - in 2018 there should be something so much better than this. RStudio is a thousand times better but only does R. I used to like Beaker Notebook but it gave up due to Jupyter's popularity and converted itself into a bunch of Jupyter extensions which now have all of Jupyter's limitations.

Yet despite all this I can see that there's this enormous community that loves this and keeps developing and contributing to it.

I feel the same way, especially as an emacs user. Org-babel seems to be a superior implementation of the same idea. Org is just a text document, so git and git diffs work. I can use any combination of languages I want in a document and have them running in different sessions. And best of all I can edit code blocks using my customized major mode for that language. On top of that you get all the goodness that comes with org-mode, not least of which is the ability export it to dozens of other human readable formats for easy sharing. I think there's even an exporter for jupyter notebooks (there's at least one for ipython notebooks).

I like Jupyter and use it all the time, mostly because I can log what I do for future reference. But I have to work around so much stuff that it can get really annoying sometimes. Yeah, compared to the Matlab IDE and how easy it is to use, it's not even close. But it's an open source project, so people tend to find it awkward to criticize it a lot (I mean, that sentiment is justified since it's mostly volunteer work, but sometimes it can get to be too much).

it's a dumpster fire - in 2018 there should be something so much better than this

In 1998 I was using a tool called MathCAD that provided a notebook interface running as a plugin to MS Word. In 2018, Jupyter is still not as good as that. Some things are just not meant to be webpages.

I love Jupyter Notebook for experimenting and rapid creation of reports, but dislike it for not being able to use my editor and for intermingling inputs and outputs in a single file. So I'm working on an alternative frontend to Jupyter kernels, which is heavily inspired by KnitR: https://github.com/azag0/knitj It is still being developed, but it's functional and I use it every day.

I have not tested it with anything else than the Python kernel, but it uses Jupyter Client to communicate with the kernel, which is kernel agnostic. So you should be able to do just “knitj -k <kernel name> ...”.

True. Actually that's what I meant by "intermingling inputs and outputs". KnitJ still shows both code and its output in the rendered HTML, but unlike in Jupyter Notebook, the code is stored and edited separately in a single source file.

1. Variable state

The most important tool for programming, for me, is the window that shows you the current state of all the variables. When I step through a program, I look at the state. 90% of my debugging solutions come from seeing that a variable doesn't have the right state.

2. Intellisense

For the love of god, I do not want to remember if it is len(), length(), .len(), .length(), .size(), size(1) or whatever.

That's it. But those two are so big that I have to code and debug in Spyder and then paste the code into the notebook. I feel sorry for newcomers who think that all the debugging is supposed to happen in the notebook.

Microsoft just announced an initiative like this (unfunded, community-based, likely at risk of becoming abandonware); perhaps you could combine your efforts with theirs? (The issue in their code that I'm personally most impacted by is lack of support for conda [0].)

One thing you may want to be aware of is that the Python language server used by VS Code runs pylint, which performs static analysis on the code. However, Jupyter Notebook does autocomplete by actually introspecting the variables as they are defined. This makes a large difference when doing things such as selecting a column in a pandas dataframe. In Jupyter, if you press tab on the column name, it can autocomplete, and it also knows you are getting a Series, which leads to autocomplete on things like .min, .max, etc. With static analysis you don't get any of this, since pylint cannot statically determine the column names, so you lose the intellisense.

Re #2, if you haven't tried the newest versions recently (and especially with the jupyterlab beta which has a nicer completion GUI), I'd encourage you to take a look! It's come a long way, along with the library that's doing the completions under the hood.

I like R for many things, but Python just keeps getting more compelling, particularly given the excellent machine learning packages. As these sorts of toolchain elements get better and better, and as more people realize that there's a benefit to simultaneously training researchers to run code as well as stats, I suspect we'll start to see an exodus from pure R solutions.

The real question is when (and whether) new social scientist stats courses will start teaching Python stats toolchains, rather than R. That seemed to be an inflection point for R (as folks moved away from SAS), and could be for stats-centric Python too.

I'm not sure what about Jupyter makes Python more compelling in comparison to R. R is entirely usable in Jupyter Notebooks, and R Notebooks are, in my opinion, possibly superior to Jupyter notebooks in many ways.

> and as more people realize that there's a benefit to simultaneously training researchers to run code as well as stats, I suspect we'll start to see an exodus from pure R solutions

I'm not sure what you are saying here.

I would actually argue that most of the Python data science toolchain is years behind what is available in R.

Python definitely has more mindshare for machine learning, and particularly deep learning. However, that's not all of statistics. For things like mixed-effects modeling, I think R still has a clear lead. There are some Python packages (e.g., statsmodels), but R's lme4 has more features, like custom covariance structures, and virtually every textbook and tutorial currently uses R. I'm not sure I've ever actually encountered statsmodels in the wild. PyMCMC is relatively popular, but I think BUGS/JAGS are still more common.

The migration between languages is also industry specific. They are still teaching SAS to finance and healthcare analysts, for instance, and R and Python are still rising in healthcare specifically. Keep in mind all the legacy code and all the coders who just know SAS and don't need to change. It'll take longer for the transition than you think.

I have tried to move from SAS to R a few times. At this point it's largely inertia: there is so much in my org already written in SAS that it's very difficult to convince people to change.

I do like the SAS dev tools (especially Enterprise Guide). I'd really love it if R had some sort of GUI front end for non-technical people. E.g., I know the finance analysts in our org wouldn't have a clue how to configure their own ODBC sources, which you need to do with RStudio. Until it's as easy for them as SAS, convincing them to switch won't get any traction.

I'm definitely well in the R camp but keep feeling this nagging pull from Python. Especially for trading...it would be so nice to have a language for both research and production, as right now I translate all my research into scala for production.

I hate to be the stereotypical Julia recommender, but it is made for this use case, more so than Python, which isn't all that much faster than R if speed matters. (Unless you want to try Cython, but that's a whole can of worms.)

I'd second that. R and Python both have the same pre-LLVM performance issues.

I don't expect either R or Python to go away anytime soon, nor would I want them to, but I would like to see people moving to things like Julia and Nim, which have the same level of expressivity but are much more performant. I have difficulty imagining many people saying "I love programming in R and Python, but don't like Julia or Nim."

I like Python but at least with stats/numerics there isn't a big reason to move away from R except for specific libraries (especially DL stuff) or front-end integration with web-land (and even then things like Jupyter mitigate against that).

I've tried it while it was unstable and it was excellent (I remember some random forest training that went from a few days down to half an hour). Since everything was unstable at the time, one update was all it took to break everything. Now that 1.0 is out, I'd definitely like to pick it up again. Unfortunately for my current use cases, the ecosystem just doesn't exist like it does with R or Python.

Meh. As long as you have defined deliverables between grader and student, grading programming-based assignments is relatively easy. Coursera has been around longer than Jupyter has been popular, after all. (And they aren't all just multiple choice.)

Being interactive is what makes it good for teaching. But there are plenty of interactive options. And for a certain class of teaching, it is not "on rails" enough: people will need a ramp-up period on Jupyter before they can really get into their topic.

The only thing that stops me from being able to use notebooks full time is that their intellisense is horrible compared to IDEs'. I like being able to use them for demos/presentations, but I can't imagine trying to code within one primarily. Especially when it comes to tracking results.

How do people cope with this? Do you supplement it with other tools? I spend a lot of my time in an IDE and then just paste some of the code in to cells. That seems easier.

I do the opposite, my job is kind of bad data engineer/scientist/etl minion so it's a lot of dataframes.

Work (and often debug) in jupyter -> open the notebook from pycharm when it's got some completed thoughts and write into a python module + test module, tidying up and adding type annotations.

Sometimes doing that multiple times so that the notebook is importing from modules which were originally pulled out of the notebook.

It sucks having to use two tools but I don't think there's any one tool that can do both as well as pycharm/jupyter, short of me getting a lot better at emacs or writing a lot of custom Atom extensions (I think).

I am very hopeful that JupyterLab will get support for the Language Server Protocol sometime soon. That would make all the difference in the world for me. I'd still have to use a terminal to build and run tests, but I wouldn't be surprised if a test runner comes along fairly quickly after that.

Data frame rendering in the various notebooks (beaker,jupyter,zeppelin,..) is wonderful.
Your workflow sounds closest to what I do. If I want to visualize something I tend to compile my thoughts/imports and organize things in an editor first and put it in a notebook in parallel. It helps with version control as well.

I am a Spark data engineer and spend a lot of time in Scala / Python IDEs & browser notebooks. Databricks lets you package code as JAR / wheel files & attach the binaries to the cluster. I write all the complicated code in tested projects that are checked into GitHub & use the notebooks to invoke the functions and visualize results.

Folks that try to do all programming in notebooks typically drown in complexity and suffer.

When you're processing a lot of data, it can be expensive to keep re-running your whole script every time you make a change. The notebook keeps the results of your earlier steps in memory when you want to change and re-run a later step.

This is a trade-off between how much code you're writing and how much data you're processing. If you're writing maybe 20 lines of code but you have enough input that it takes several minutes to run, the notebook becomes a clear win for your development process.

So does the standard terminal REPL in Python. You can achieve the same workflow by keeping a plain old Python file and using your favorite editor's "send block of code to console" function. This way, you retain your editor's functionality while working just as interactively as with a notebook.

You can generally persist the results to disk yourself, though. Especially since a lot of things end up being numpy arrays: you run one script that saves all the results, and another that loads them and runs just the part of your workflow you want. Bonus: it's persisted to disk on top of that! I know things get more complicated than that, but I'd say the compelling use case for notebooks isn't the state saving so much as the whole package in one place (state persistence, visualization, interactive REPL, ...).
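A minimal sketch of that save/load split, with the cache path and "expensive step" invented for illustration:

```python
import os
import tempfile

import numpy as np

# Cache file for the expensive step (a throwaway temp path in this sketch).
cache = os.path.join(tempfile.gettempdir(), "expensive_step_demo.npy")

def expensive_step():
    # Stand-in for minutes of computation.
    return np.arange(1_000_000, dtype=np.float64) ** 2

if os.path.exists(cache):
    data = np.load(cache)       # later runs skip the slow part entirely
else:
    data = expensive_step()
    np.save(cache, data)        # persist so the next session starts here

# Iterate freely on the cheap, later parts of the workflow:
head_total = data[:10].sum()
```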

I find this odd because I am the opposite - one of my primary use cases for Jupyter/ipython in general is the ease with which I can get 'live' code introspection and intellisense. It's often my prototyping sandbox for python code that I then move into my IDE once it's close to being ready.

I also notice that developing in this way encourages me to create smaller, more testable functions that I can easily work with inside a single notebook cell.

It's not about writing code as much as it is about exploring the data.

If you're writing a lot of code in them, it's probably better to put that code into libraries that get imported and reused.

And I do agree that default code environment is unbearable. Particularly the auto insertion of completing quotation marks, which has me continually fighting with the editor to get correct code into a tiny web text box.

Oh, I won't argue with you there. I just find myself rotating quite a bit because I have to do both deployment and writing code for experimentation.

What I'm specifically talking about is even that kinda hacky experiment code you end up writing. I don't try to implement whole projects in there, but even just "train this model" type code ends up being a hassle because of how bad the editors are.

My above comment was more referencing wishing I could spend more time writing experiment code in jupyter without copying and pasting all the time.

That's surprising because I have the opposite experience! Since my first cell is to import all of the libraries I want to use to memory, the intellisense works without fail, regardless of how big the libraries are. Comparing that with my VS Code experience where using intellisense to pull up functions' doc strings takes an age for all but the inbuilt Python libraries.

Hey There! I'm trying to solve this right now in VSCode's in built editor: https://github.com/pavanagrawal123/VSNotebooks . It's a fork from another extension somebody already built, but all activity is dead, so I'm starting up dev on an active fork. I'd love to hear any feedback y'all have! :)

Yeah but the whole point is "interactive coding". It doesn't feel very interactive when I have to context switch all the time :). I'd prefer something closer to what the lisp folks get to do with the repl where you can scratch out an idea and see it working without leaving your environment.

well, I don't think so. Not everything you do is interactive. Data exploration and basic model selection is, but complex models and more complicated data-pipelines/preprocessing isn't, I think.
Tensorflow is the opposite of interactive, even in a notebook.

Putting models (in the sense of more complicated models, not just an SVM), data pipelines, and shared visualization code in a src folder, and experimenting in the notebook, divides stuff that's interactive by nature from "real" coding. I don't context-switch that much, to be honest.

I don't really copy code into cells, because I only experiment there.

Also, what happens if you need to share code between notebooks?

I think notebooks should be simple and explain the experiments and the reasoning behind them to your coworkers. Otherwise it's hard to coordinate and learn from each other's insights into the data.

I've switched largely to Jupyter / Python for computational linguistics / psycholinguistics because of the pandas / numpy /numba stack, decent off-the-shelf NLP (spacy and gensim), and the ease of moving data into an R kernel for specific analyses and plots. Also nice that any reasonably sized notebook will render on GitHub (and access can be controlled through the accounts system, until something is ready to be public).

One thing I haven't figured out how to do is generate fully styled LaTeX manuscripts from notebooks (like papaja for RStudio). Is there a way to do this with pandoc?

Having NLTK and SciKitLearn in the same environment as my stats tools is... tantalizing. And if I could write Markdown-to-TeX docs straight from Jupyter, rather than the R -> TeX tables/variables read into LaTeX I'd used before, that'd be a massive win.

I used to use Excel extensively and have started using Jupyter as a replacement. Some things are great, like Python modules that can do anything, and visualizations. But if there are fewer than 100k rows, it's still much easier to just use Excel, which is kind of disappointing. Once you have more than 100k rows, Excel starts to be cumbersome; that's Jupyter's sweet spot for me.

Yeah, I have a use case now where we pull data from a database, manipulate it, and then have a final table/csv/dataframe/whatever. The problem is then how to share this with non-technical users. In an ideal world, this would get inserted into a Google Sheet, and that sheet would just update daily after new data is loaded into the database.

I'm pretty sure this is a use case others have, and I'm curious what people use to solve it. I've heard, variously, that some options are to use Tableau or similar, or to email a CSV and ask the end user to import it into Google Sheets/Excel.
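A minimal sketch of the CSV end of that pipeline (the data and file name are invented; pushing the same dataframe to Google Sheets via an API client like gspread would replace the last step):

```python
import os
import tempfile

import pandas as pd

# Hypothetical daily job: query results arrive as a dataframe, get reshaped,
# and the artifact non-technical users see is a regenerated CSV.
raw = pd.DataFrame({"region": ["east", "west", "east"],
                    "sales": [10, 20, 5]})
report = raw.groupby("region", as_index=False)["sales"].sum()

out_path = os.path.join(tempfile.gettempdir(), "daily_sales_demo.csv")
report.to_csv(out_path, index=False)
```

Scheduling this with cron or Airflow gets you the "updates daily" part without anyone touching a notebook.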

Putting aside the R vs Python question (as as noted in this thread, you can use R in a Jupyter notebook and Python in an RMarkdown notebook), I much prefer RMarkdown notebooks. RMarkdown notebooks are plain text, so you can read them easily in any text editor (which also means they play well with git, unlike Jupyter notebooks).

And it's meant to work with the RStudio IDE, so I get a much more seamless experience going between regular code and notebooks (although this is admittedly a more R-centric benefit, at least until and unless RStudio adds Python support outside of notebooks).

A benefit I like about rmarkdown is that it makes it very easy for me to create templated reports. They're built in a way that makes it easy for me to work either iteratively (due to caching of blocks) or rerun the whole thing and get an output.

IMO, Jupyter is nice for presenting the final results of research (like LaTeX), but it is often not the right tool to get there. It's good for professors who teach and publish but it's bad for students to learn and research.

Frankly, I find that all programming environments for scientific computing are deficient in some way or another.
If you look at the feature sets of Visual Studio, RStudio and Jupyter notebooks, you will see that the union of useful features is large, and the intersection is almost empty.

Question/idea: could a notebook model supplant bespoke photo-processing software such as the "darkroom" mode of Lightroom (or darktable)? The extant programs essentially take a lot of data (a camera's raw output) and apply a configurable recipe to produce intelligible output (an image). Each recipe (stored as an XMP sidecar) is essentially a list of math operations (increase brightness, wavelet decompose, change color model, etc.) and their parameters.

Obviously a great part of why we use Lightroom/darktable is because of the speed with which the recipe-processing occurs. Plus a smooth UI, a catalog-viewing feature, and a well vetted choice of image operations. The appeal of moving this work to a notebook would be that an actively maintained Jupyter ecosystem could supplant lock-in to a specific software, and open up the underlying math magic.

At the very least, this could be an interesting platform for experimenting with image-processing methods. And the reordering of cells could become a virtue, letting an image-processing pipeline run out of the standard order.

I'm curious if anyone has already worked along these lines. A quick web search shows people doing some image processing, but more on the face-detection or ML-for-medical-imaging side. http://scikit-image.org/docs/dev/auto_examples/ looks like a basic toolkit, though it isn't the whole range of operations needed for, say, fine-art image tuning.
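As a toy illustration of the recipe idea (operation names and parameters invented, using plain NumPy rather than a real raw-processing library):

```python
import numpy as np

def adjust_brightness(img, amount):
    # Shift exposure and clamp to the displayable [0, 1] range.
    return np.clip(img + amount, 0.0, 1.0)

def apply_gamma(img, gamma):
    # Simple gamma curve; a real pipeline would work in a proper color space.
    return np.clip(img, 0.0, 1.0) ** gamma

# The XMP-sidecar analogue: an ordered list of (operation, parameters).
RECIPE = [
    (adjust_brightness, {"amount": 0.1}),
    (apply_gamma, {"gamma": 2.2}),
]

def develop(raw, recipe):
    out = raw
    for op, params in recipe:
        out = op(out, **params)
    return out

raw = np.full((2, 2), 0.4)   # stand-in for a camera's raw sensor data
img = develop(raw, RECIPE)
```

In notebook form, each recipe step could be its own cell, with the intermediate image rendered inline, which is exactly the kind of inspection darkroom software hides.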

Yes and no. I do computer vision and like photography as well. I use Jupyter notebooks extensively for computer vision, and they work pretty well for (semi) interactive manipulation of image data with code. But as a general purpose tool, it's too clunky for anything more than prototyping. I don't see them replacing darktable/lightroom anytime soon.

Zeppelin[1] is another great tool of a similar nature. It leans a little bit more towards the Scala / Spark world, for people who like that stack. That said, you can use Python, R, etc. with Zeppelin as well.

Does anybody know of a good hosted solution for JupyterHub? I made a neat notebook that I needed to share with my non-technical team, it was using ipywidgets to do some interactive modeling, but they each needed to be able to use it independently. It has private data so I couldn't use Binder. I've been following Zepl.com for a long time, but couldn't use them here because Zeppelin doesn't support ipywidgets. Pretty soon I found myself installing helm and trying to follow along a tutorial on how to deploy JupyterHub on a kubernetes cluster. That started to add an unmanageable level of complexity to own, especially to share a simple notebook. And while spinning up a GKE node per user is the whole point of Kubernetes, it got expensive quickly in my test. We cannot spend $75K a year on Domino. Any other options?

Polyaxon, https://github.com/polyaxon/polyaxon, is an open source platform that not only simplifies running notebooks on Kubernetes but also tackles issues related to scaling, tracking, and reproducibility.

Sharing articles with team members and letting them run them is trivial. We automatically version the article, the data and the environment (docker image), and you can remix (fork) other articles. `xoxo` is a signup code you can use if you want to give it a try.

We're not far out (~two weeks) from launching our beta for private research. Here, you'll get your own private data store and docker registry, as well as secrets management (stored securely in HashiCorp's Vault).

I'm literally building this right now to fulfill this need. If you shoot me an email to hugo@opensourceanswers.com, I can let you know when it's ready. My plan is to charge a premium (similar to github prices) per user, and pass on compute costs directly to the customer with no markup

Notebooks are great for invoking existing functions and exploring data.

Notebooks aren't ideal for creating functions (standard text editor features are lacking and testing is impossible).

Notebooks encourage an "order dependent variable assignment" programming style without abstractions. Here's what you'll commonly see in a notebook:

val df = spark.read.csv("some_data")

val df2 = df.withColumn("clean_name", trim(col("name")))

val df3 = df2.filter(col("clean_name") === "Mark")

I've found that notebooks are very useful if you write all the complicated code in separate GitHub repos and attach binary executables to the cluster. If you try to write all your logic in notebooks, you'll quickly struggle with order dependent, messy code.

I had the same trouble with order dependence as notebooks got to a certain size, so my team and I created and open-sourced a library, Loman, to help with that. It allows you to interactively create a graph, where nodes represent inputs or functions, and then keeps track of state as you change or add inputs, intermediate functions and request recalculations. Our experience has been broadly positive with this way of working. As graphs get larger, it's easy to lift them into code files in libraries, while continuing to modify or extend them in notebooks. The graph structure and visualization make it easy to return to loman graphs with up to low hundreds of nodes, which would make for a fearsome notebook otherwise. It also makes it easy to bolt Qt or Bokeh UIs onto them for interactive dashboards - just bind UI widgets and events to the inputs and widgets to the outputs. They can be serialized, which is useful for tracking exceptions in intermediate calculations when we put them in airflow to run periodically, as you can see all the inputs to the failing calculation, and its upstreams.
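The general idea can be sketched as follows (an illustrative toy in plain Python, not Loman's actual API):

```python
# A toy dependency graph in the spirit described above: nodes are inputs or
# functions, and changing an input invalidates only its downstream nodes,
# so only the affected parts recompute.
class Graph:
    def __init__(self):
        self.funcs, self.deps, self.values = {}, {}, {}

    def add_input(self, name, value):
        self.values[name] = value
        self._invalidate_dependents(name)

    def add_node(self, name, func, deps):
        self.funcs[name], self.deps[name] = func, deps
        self.values.pop(name, None)

    def _invalidate_dependents(self, name):
        for node, deps in self.deps.items():
            if name in deps and node in self.values:
                del self.values[node]
                self._invalidate_dependents(node)

    def value(self, name):
        if name not in self.values:   # recompute lazily from upstream values
            args = [self.value(d) for d in self.deps[name]]
            self.values[name] = self.funcs[name](*args)
        return self.values[name]

g = Graph()
g.add_input("x", 2)
g.add_node("y", lambda x: x + 1, ["x"])
g.add_node("z", lambda y: y * 10, ["y"])
assert g.value("z") == 30
g.add_input("x", 5)   # only x's downstream nodes are invalidated and recomputed
assert g.value("z") == 60
```

Unlike notebook cells, the execution order here is derived from declared dependencies, so "out of order" simply can't happen.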

The function bit has always confused me. I tend to write code with lots of functions/modules/classes for handling various aspects of the analysis and I just don't understand how that's supposed to be integrated. Instead notebooks seem better designed to handle small code snippets that rely on well known libraries. I'd be happy to jump on the notebook bandwagon but I'm having trouble seeing how I could adapt my code to the notebook style.

Having spent a decent amount of time learning to be a programmer while doing scientific image analysis in Matlab (shudders from the real programmers), and with a decent amount of time spent in Mathematica as well, I just can't seem to buy into the Jupyter/notebook-based programming enthusiasm. The talk linked in the article explains it better than I ever could, but for me, when I am leaving data in memory, it is much more convenient to have a completely linear history, ordered by command execution time.

In Python I have found the best way to do this is writing standard Python functions and scripts and running them in an IPython environment with the %run magic. You have the linear history, git works well on standard .py files, and you can interactively work with the data at the IPython prompt without worrying that something is proceeding nonlinearly.

What I find works best is to explore the data in the live prompt, which gives you interactivity, and then slowly build up a master collection of functions and commands that, when run with a single command, can reproduce the results you got while exploring. Then to come back to the data at a later point in time, you only have to run one script file on the raw data. Of course, this is kind of the point of the Jupyter notebook, but I find that when I want to change parameters, the ability to jump around and redefine things means I do. By moving from IPython to a script/functions I run, I ensure that everything progresses linearly. Idk, just my two cents.

Many comments here are about implementation details, such as Jupyter's JSON format, or compare the user experience with IDEs and shells, and fail to see the fundamental difference between Emacs and Jupyter, which is captured in this quote from the article:

“In many cases, it’s much easier to move the computer to the data than the data to the computer,” says Pérez of Jupyter’s cloud-based capabilities. “What this architecture helps to do is to say, you tell me where your data is, and I’ll give you a computer right there.”

We achieve full-stack reproducibility by allowing you to install arbitrary software and version these environments using Docker. You can reuse these in other articles or pull and use them locally. `xoxo` is a signup code you can use if you want to give it a try.

I work for a company, Code Ocean, that aims to solve this issue. We have a custom-UI for installing packages through a variety of package managers, including Conda, CRAN, etc. Here's an example of some Julia notebooks being rendered to HTML with the environment fully configured and accessible https://codeocean.com/2018/08/16/counterexamples-on-the-mono...