The Trouble with Spreadsheets

As a prelude to my next look at alternative fuels models, some thoughts on spreadsheets.

Everyone loves to hate spreadsheets, and it’s especially easy to hate Excel 2007 for rearranging the interface: a productivity-killer with no discernible benefit. At the same time, everyone uses them. Magne Myrtveit wonders, “Why is the spreadsheet so popular when it is so bad?”

Spreadsheets are convenient modeling tools, particularly where substantial data is involved, because numerical inputs and outputs are immediately visible and relationships can be created flexibly. However, flexibility and visibility quickly become problematic when more complex models are involved, because:

Structure is invisible and equations, using row-column addresses rather than variable names, are sometimes incomprehensible.

Dynamics are difficult to represent; only Euler integration is practical, and propagating dynamic equations over rows and columns is tedious and error-prone.

Without matrix subscripting, array operations are hard to identify, because they are implemented through the geography of a worksheet.

Arrays with more than two or three dimensions are difficult to work with (row, column, sheet, then what?).

Data and model are mixed, so that it is easy to inadvertently modify a parameter and save changes, and then later be unable to easily recover the differences between versions. It’s also easy to break the chain of causality by accidentally replacing an equation with a number.

Implementation of scenario and sensitivity analysis requires proliferation of spreadsheets or cumbersome macros and add-in tools.

Execution is slow for large models.

Adherence to good modeling practices like dimensional consistency is impossible to formally verify.
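To make the points about cell addresses and dynamics concrete, here is a minimal Python sketch of Euler integration for a one-stock model. The model (a stock with a constant inflow and a first-order outflow) and every name and number in it are hypothetical, chosen only for illustration. The equation is written once with named variables, and a loop takes the place of a formula copied down one row per time step.

```python
import math


def derivative(stock, inflow, residence_time):
    """Net flow into the stock: constant inflow minus first-order outflow."""
    return inflow - stock / residence_time


def simulate_euler(initial_stock, inflow, residence_time, dt, final_time):
    """Euler integration; each loop pass plays the role of one spreadsheet row."""
    steps = round(final_time / dt)
    stock = initial_stock
    trajectory = [stock]
    for _ in range(steps):
        stock += dt * derivative(stock, inflow, residence_time)
        trajectory.append(stock)
    return trajectory


# Hypothetical parameter values, for illustration only.
trajectory = simulate_euler(
    initial_stock=100.0, inflow=0.0, residence_time=5.0, dt=0.25, final_time=10.0
)

# With zero inflow the exact solution is simple exponential decay,
# which gives a reference for judging the Euler error.
exact = 100.0 * math.exp(-10.0 / 5.0)
print(f"Euler: {trajectory[-1]:.2f}, exact: {exact:.2f}")
```

Changing the time step here means editing one argument. In a worksheet it means rebuilding the block of rows, which is one place where the errors creep in.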

For some of the reasons above, auditing the equations of even a modestly complex spreadsheet is an arduous task. That means spreadsheets hardly ever get audited, which contributes to many of them being lousy. (An add-in tool called Exposé can get you out of that pickle to some extent.)

There are, of course, some benefits: spreadsheets are ubiquitous and many people know how to use them. They have pretty formatting and support a wide variety of data input and output. They support many analysis tools, especially with add-ins.

For my own purposes, I generally restrict spreadsheets to data pre- and post-processing. I do almost everything else in Vensim or a programming language. Even seemingly trivial models are better in Vensim, mainly because it’s easier to avoid unit errors, and more fun to do sensitivity analysis with Synthesim.
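As a rough illustration of why sensitivity analysis is less painful outside a spreadsheet, the sketch below sweeps one parameter of a toy model in Python. The model (a stock draining at a first-order rate, integrated with Euler steps), the function name final_stock, and the parameter values are all assumptions made up for this example. It is not how Vensim or Synthesim works internally.

```python
def final_stock(residence_time, initial_stock=100.0, dt=0.25, final_time=10.0):
    """Euler-integrate a first-order drain and return the ending stock."""
    stock = initial_stock
    for _ in range(round(final_time / dt)):
        stock -= dt * stock / residence_time
    return stock


# Scenario values are illustrative, not taken from any real model.
residence_times = [2.0, 5.0, 10.0]

# One model definition, many runs: no copied worksheets, no macros.
results = {tau: final_stock(tau) for tau in residence_times}

for tau, value in results.items():
    print(f"residence_time={tau:>4}: final stock {value:.1f}")
```

Because the parameters live apart from the equations, each scenario is just another function call, and the model itself is never overwritten.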

5 thoughts on “The Trouble with Spreadsheets”

In econometrics, they’re simply a tool for pre-processing: a convenient way to sort the data out before feeding it into R, Stata or EViews, for example. No fancy manipulations there, thanks!

That said, I have long used OpenOffice (must have been a good 4 years since I dropped Office 2003 altogether), and recently discovered an R add-on I’ve been itching to try. I wonder what it does? Need a bit of time to try it.

That said, I did my previous research on fishery dynamics using spreadsheets. Not very intuitive, to be honest, but I had no other software around at the time (it was during the Christmas holidays. Seriously.). It was something I knew how to use, and the data was conveniently formatted in discrete time periods, which allowed me to do it. As far as I can tell, modeling and simulating continuous variables is beyond spreadsheets.