25 November 2009

The Visual Display of Quantitative Information

After reading Visual Explanations I was intrigued enough to pick up Tufte's classic, The Visual Display of Quantitative Information. Overall it is quite good. With numerous examples, Tufte shows when graphics and tables can be used to illuminate the truth, and he presents some pieces of a theory to govern their design. Tufte also calls out the use of misleading data graphics in newspapers, ads, corporate publications, and other sources.

Some takeaways:

Graphics are frequently considered a crutch to lean on when text is deemed boring; but, when used judiciously, they can actually present and reveal data in a much more information-dense and easily retained manner than text can. Making effective graphics takes statistical training, not just artistic training.

Creating graphics with "integrity" is, in large part, making sure that the relative sizes perceived by the eye are commensurate with the relative sizes in the real data. Tufte displays examples of many tricks that people have used in order to make differences appear more or less significant than they are. And it's not just about how the data are drawn but also often about what data are drawn.
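To make the "commensurate sizes" idea concrete: the book quantifies it with what Tufte calls the "Lie Factor," the size of the effect shown in the graphic divided by the size of the effect in the data. Here is a minimal sketch of that ratio in Python. The function name and the numbers are my own illustrative assumptions, not an example from the book.

```python
def lie_factor(data_start, data_end, drawn_start, drawn_end):
    """Relative change shown in the graphic divided by relative change in the data.

    A value near 1 means the drawing is faithful to the data; values well
    above 1 exaggerate the change, and values well below 1 understate it.
    """
    data_effect = (data_end - data_start) / abs(data_start)
    drawn_effect = (drawn_end - drawn_start) / abs(drawn_start)
    return drawn_effect / data_effect


# Hypothetical chart: the underlying value rises from 100 to 150 (+50%),
# but the bars are drawn 1.0 cm and 3.0 cm tall (+200%).
factor = lie_factor(100, 150, 1.0, 3.0)
print(f"Lie factor: {factor:.1f}")  # Lie factor: 4.0
```

In this made-up case the drawing overstates the change fourfold. Note that the ratio only covers how the data are drawn; it says nothing about which data were chosen in the first place.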

Tufte also presents some general design principles, including:

Use as little ink on the page as you can, within reason. Avoid redundancy and remove inessential elements.

Avoid busy-looking textures and other "chartjunk."

Arrays of small similar graphics ("small multiples") are an elegant way to present multivariate data.

Integrate graphics with the narrative text when possible.

I was struck by Tufte's lament that computer graphics often evoke the thought "Isn't it remarkable that the computer can be programmed to draw like that?" rather than "My, what interesting data." This seems to be no less true in 2009 than when it was written. Given Tufte's opinions about PowerPoint, I do wonder what he would say about Apple's Keynote. PowerPoint presentations are usually merely inane or unattractive; Keynote presentations, with their typical distorted 3-D charts, are often downright misleading.

While I thought most of Tufte's book was valuable, he occasionally appears to make dubious logical jumps and comparisons to prove his point:

He speaks of a 2.2 megapixel grayscale astronomical survey map being subdivided into "2,275,328 rectangles" as if it were equivalent to a table with as many entries. The eye can perceive macro- and micro-structure in a graphic but not every last detail with fidelity. Thus the effective content of the image is much smaller than 2,275,328 elements. Throughout, Tufte shows some odd fascination with numbers like these and seems to labor under the delusion that their exact magnitudes are meaningful.

Tufte does some rethinking of how box plots, axes, and other graphical elements might look. But he seems to be driven only by his maxim of "reduce ink." In his favorite box plot alternative the different components are barely distinguishable from each other. The simplicity of the resulting graphic does not nearly make up for the fact that the information therein is much less easily perceived.

Still, I recommend this book. Reading it is like flipping through a curated gallery of data graphic designs, many of them strikingly elegant. It is an easy read (I read it in one sitting), and since many of us have to marshal data to make an argument once in a while, we could use some advice on how to do so effectively.

Disclosure

I'm a software engineer at DNAnexus, Inc. This blog represents my own opinion and no one else's. Unless specifically noted otherwise, I do not receive free review copies of books or other products mentioned here.