Books

Which books to read to learn more about data visualization

I get most of the basic principles and best practices of information visualization and design from books. I read fiction on my iPad, but when I want to study I still need an actual book in my hands that I can annotate and quickly flip through. Below you can find my reviews and thoughts on the content and usefulness of the books that I’ve read. I hope this will help you figure out which books you still want to read to learn more about data visualization. Further down is another section with books that are still on my own “to-read” list.

Non-design Books

An introduction to information graphics and visualization

Alberto Cairo

This book is divided into four parts: Foundations, Cognition, Practice and Profiles. The foundation starts, like many other books on this subject, with explaining why you should visualize in the first place and what the benefits are. The second chapter starts with the statement that “Function constrains the form” to convey the idea that, unlike art, information design is meant to help us in the visual analysis of data. This is a fact that is often forgotten in all those awful marketing infographics/infoposters that you see these days.
I found Alberto’s starting example, in which he redesigns a graphic about the armed forces in South America, extremely useful. It shows how difficult it is to get insights from the original design and contrasts this with how easy it is to get them from the redesign. Perhaps the most important message from the book (to me at least) is in this section: try to put yourself in the viewer’s place and think about what insights/answers you want to find from the visualization. Do you want to make comparisons? See correlations? Use these questions to make sure that your design follows a form that makes answering these questions easy for the viewer.

The third chapter is about a topic that I haven’t seen in any other book so far, but that plays a big role when designing data visualizations more meant for promotion or for online use (instead of dashboards): The beauty paradox. You cannot really create a chart that both contains a lot of (different types of) information and that can be understood within seconds. Alberto introduces us to “The Visualization Wheel” (image bottom right), which covers the main features you need to balance during a design. A more familiar design means less originality. How to find the best middle ground in the visualization wheel, somewhere between radical minimalism and very playful is the subject of chapter 4 “The Complexity Challenge: Presentation and Exploration”. My favorite quote of this chapter is the following: “Graphics should not simplify the messages, they should clarify”.

The second section, Cognition, is Colin Ware’s book “Information Visualization” in about 50 pages, which is more than any other book besides Colin’s, so bonus points for that. We get a short introduction into the workings of the eye, preattentive features and Gestalt laws with nice examples made clear through (simple and print) charts.

The third section is about Practice, in which the first chapter takes us through the design choices of ~4 infographics that Alberto has made. I always enjoy reading about the process behind a beautiful end result. Then you find out the struggles the designer faced, the lessons learned and the effort that somebody has put in. The final chapter is about the rise of interactive graphics. The pictures in this chapter feel a bit outdated, but the general ideas about feedback, styles of interaction (instruction versus exploration) and ways in which a reader can navigate the visual (overview first, zoom and filter, details on demand by Ben Shneiderman) are still relevant.

The final section, Profiles, is a collection of 10 interviews that Alberto had with the creme-de-la-creme of the data visualization world: John Grimwade, Hans Rosling, Stefanie Posavec, Jan Willem Tulp and Moritz Stefaner, to name a few. They come from very different backgrounds. Stefanie is more hands-on, John Grimwade works in the illustrated infographics world (the way infographics should be) and Jan Willem Tulp/Moritz Stefaner use coding to create their online visualizations. I found it really refreshing to have such a section at the end of this book. Again, learning how all these people work and how they tackle visualizations is very useful to figure out what way works best for yourself.

To wrap it up, I found this to be a really good book! I especially liked how it teaches you to think about the design and what questions you should ask yourself to improve it. It also has a good portion on visualization principles and interviews with several of the experts in the field. On top of that, it’s well written: I couldn’t put it down and read it all in two days on holiday. I think this is the book I would recommend to read first if you haven’t read any other book about data visualization, because with just the teachings from this book I think you can already create some effective and beautiful infographics (to stress this again, the way infographics should be).

Edward R. Tufte

There is a lot of information packed in this ±200 page book (perhaps I should say, “bible” of data visualization). I found it to be a very educative but also fun read. It’s packed with gorgeous data visualizations, most of which I had never seen before, but that were made before the computer age really took flight.

The book is divided into two sections. The first “Graphical Practice”, has three chapters. The first chapter is a bit of a history lesson and showcase where we see many different types of charts that belong in Tufte’s “Hall of Fame” as it were. He is particularly fond of graphics that manage to pack an enormous amount of data into a 2-dimensional display and he has often calculated for the reader how many numbers there are visible per square inch.
The second chapter is about the other side, the “Hall of Shame”, where terms such as the “Lie factor” are introduced. Its most important part, I would say, is the six principles of graphical integrity on the last page of this chapter.
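
As a rough illustration of the “Lie factor”: it is usually summarized as the size of the effect shown in the graphic divided by the size of the effect in the data, where the size of an effect is the relative change between two values. The Python sketch below follows that summary. The function names and the example numbers are invented for illustration and do not come from the book.

```python
def effect_size(first: float, second: float) -> float:
    """Relative change from the first value to the second."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs(second - first) / abs(first)


def lie_factor(graphic_first: float, graphic_second: float,
               data_first: float, data_second: float) -> float:
    """Effect shown in the graphic divided by the effect in the data.

    A value close to 1 means the graphic represents the data faithfully;
    values far above or below 1 mean the graphic exaggerates or
    understates the change.
    """
    return (effect_size(graphic_first, graphic_second)
            / effect_size(data_first, data_second))


# Hypothetical example: the data doubles (10 -> 20), but the bar that
# represents it is drawn four times as long (1 cm -> 4 cm).
example = lie_factor(1.0, 4.0, 10.0, 20.0)
print(f"Lie factor: {example:.1f}")
```

In this made-up case the graphic shows a 300% increase for a 100% increase in the data, so the graphic overstates the change by a factor of three.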

The second section (about 100 pages) “Theory of Data Graphics” consists of 6 chapters. It aims to provide a set of rules by which you can assess the effectiveness of a chart. His famous “data-ink” ratio is explained here and he gives several examples of how to gradually remove non-data ink from different kinds of charts to improve readability. Although I do see an amazing improvement in how the newer charts convey information I do sometimes feel that Tufte takes it one step too far in his minimization. But then again, it’s good to see how far you can go just to be aware of it.
Chartjunk gets a chapter in which things such as moiré effects are condemned. Although perhaps in our age of colorful print and computer screens, the need for different types of cross-hatching to create categories is less relevant (my guess is that if Tufte had written this book after 2000 he would have replaced the moiré effect with all the horrible unnecessary 3D graphics out there).
In the chapter on Data maximization I loved his idea of a “range-frame” where the axes on a scatterplot run from min to max only, thereby giving the axes a data conveying element as well. Strange that I haven’t seen this idea around in practice yet.

His chapter on High-resolution Data Graphics showcases all kinds of charts that manage to display vast amounts of data. But it is the “simple” sparkline that really shines here. He spends several pages on examples and explaining their use and by the end you really want to employ them everywhere.
The final chapter is a bit of a random collection of final rules to improve the aesthetics of data graphical design. Topics such as combining words and text and the proportion of height versus width come up.

I loved reading this book. Its ratio of text and graphics is very pleasant. I had first expected that most other data visualization books that I’ve read before this one would summarize practically all of the good points from this book. But that wasn’t entirely the case. Even though it was first published in the 80’s there were still a lot of new insights to be found here for me, such as ideas on how to make even a common chart like the scatterplot more effective. In hindsight, I should’ve read this book as one of my first dataviz related books. I highly recommend it to anyone interested in data visualization, whether you’re new or a more experienced practitioner.

Third Edition: Perception for Design (Interactive Technologies)

Colin Ware

I find the “psychology of seeing” or perception an extremely interesting topic. I still remember that it first started to intrigue me when I heard about the book “The man who mistook his wife for a hat” and read a short excerpt from it. The fact that this man could not really make any sense of what he was seeing, while being completely unaware of this, was so strange that I wanted to know more about how this could happen. What really happens in between the light falling onto the back of our eyes and our understanding of what we are seeing?

The first two chapters of the book dive straight into this topic and the biological workings of our eyes. A model of perceptual processing is explained that runs between the eyes and the brain, super acuity (where our brains make very smart use of all the visual input to see details on a very fine scale) and the distribution of nerves attached to our light receptors.

The addition of “Design Guidelines” boxes in the current edition, which formulate the impact of an insight into clear design motivations and rules, helps to make the insights practical. It also makes this book a gem for practitioners of data visualization, not just researchers.

Next is a chapter on Lightness followed by a chapter on Color. In these chapters you really learn how often your brain is fooling you in what you think you’re seeing versus what is physically true. For example, two patches of grey that have exactly the same RGB values do not seem the same because one patch is embedded in a light area and the other in a dark area. The book also tries to explain why we are seeing something different than is actually there, sometimes getting a bit technical with actual formulas. In the chapter on Color we first get a short introduction into color specification schemes, but most of the chapter contains fascinating insights into how we perceive color. One example is how to spiral through different hues and different luminosities if you want to both see patterns of highs and lows and read off rather exact values from a legend key (the right most square in the photo on the bottom left shows such a color scheme). If there is one thing to remember from these chapters, it’s that “Context is Everything”.

The next chapter dives into the topic of “Finding Information”. It starts off a bit technical again about the workings of the brain, but in essence this is a chapter on preattentive processing. In preattentive processing our eyes and brain are doing some serious heavy lifting and parallel processing without us really noticing, which is why a red dot immediately stands out in a group of grey dots. There are several features that are preattentively processed, such as color, orientation, size and many more. This chapter explains all features currently known, with examples. The last part goes into the combination of features, which are separable (color and shape) and which are inseparable (width and height, where we can only easily see the complete shape), and other more advanced topics relating to how you make it easy for the viewer to find information.

In “Static and Moving Patterns” we are introduced to the Gestalt Laws of grouping, or how our brain intuitively sees groups and how we perceive patterns in general. It also covers visualizing flows (I’ve become a fan of the streamlet). Again a chapter packed with extremely useful information for data visualization, in which it is often the goal to show patterns. It ends with a few sections on seeing patterns through motion.

And all that is just in the first half of the book. However, I found the second half to be less intriguing and it took me a lot longer to read it than the first half. The chapters have interesting titles, such as “Visual Thinking Processes”, and the chapter on “Visual Objects and Data Objects” was an exception and still very useful, but overall the content felt less applicable to me. Nonetheless, there is just so much information, useful guidelines and good examples in this book, that it really is a must-read for data visualization designers.

Type, Typography and the Reader

Jan Middendorp

I actually bought the Dutch version of this book on a whim while at Infographic Conference near Utrecht and I’m very glad that I did! This is a wonderfully complete book about Typography. Each spread of two pages is almost its own sub-sub chapter and explains one facet of typography. This can range from history lessons to more technical subjects like kerning to a specific category of fonts. Each page has a whole array of wonderful full-color examples to accompany the lesson of that spread, which makes for a very entertaining read.

Even though each 2-pager can be read on its own, they are grouped into 6 actual chapters. The first is a very general chapter about reading and seeing; ways to read, how typography is used in marketing. Next a short chapter on organizing and planning a page by using grids. The third chapter is the biggest of the book and concerns type. How to recognize the different categories, how to choose a type and how to combine them. We get an entire history lesson from before Gutenberg to the typographic noise of the first commercial spreads to the (digital) trends of today. I loved reading it all. But there is also more than enough about the technical side, especially in the fourth chapter on typographic detail. We learn about the do’s and don’ts, each gets its own spread which explains the why’s and shows good and bad examples and the chapter contains much more. How to get effective texts for reading, but also on creating big captions. The final two chapters are a bit smaller again. Design strategies swiftly introduces us to typographic logos, corporate identities and the current vibe of going back to the printing press for authenticity. The final chapter contains a few spreads on the history of the actual printing of text. What kind of work was needed and what types of machines have been used in the past.

This is by far the best book on typography I’ve read so far (although I have to admit that I haven’t read that many, but still). It’s a very complete introduction into typography that doesn’t shy away from becoming technical or detailed from time to time. I love the fact that by using history it teaches the reader the “why” on how some conventions, types or words (like uppercase & lowercase) came into being. It will not satisfy those looking for an entire chapter on specific things like kerning, you’ll need a much more specialized book for that. But for all who want to learn about the world of typography, I would definitely suggest you start with this book!

Note: If you’re deciding between this book and “Thinking with Type” by Ellen Lupton, then I would definitely recommend “Shaping Text”. I won’t write a full review of “Thinking with Type” on this page since I think one book about typography in a data visualization section is enough :) But I found “Shaping Text” to be a much better book to learn from. It has a more elaborate history, more examples about the different technical things and most importantly, it doesn’t fill its pages with fluffy words and anecdotes with metaphors about text/typography. “Shaping Text” covers all that “Thinking with Type” covers, but more completely and more clearly. But this is just my opinion of course.

Edward R. Tufte

Another amazing book by Tufte. In terms of style of print, this book looks the same as his “The Visual Display of Quantitative Information”, with a large portion of the page’s width devoted to notes (or note-taking) and many images in between the text. At 120 pages it’s a bit shorter than the other, but I would say this is definitely one of the must-have books for those venturing into data visualization. Also just like the other, “The Visual Display of Quantitative Information”, this book is filled with many wonderful data visualizations that I had never seen before, some already several hundred years old. It seems that Tufte also has a love for how the Japanese visualize data, because the book features quite a few images from Japan.

The book has six chapters, each with its own subject on how to display information in a specific way. The first, “Escaping Flatland”, feels a bit like an introduction to the rest. It showcases many dataviz examples that were able to convey a lot of information in a small amount of space, sometimes going into the history of a specific chart, such as how sunspots have been recorded since Galileo.
The second chapter, “Micro/Macro Readings”, is about charts which can be read on multiple levels of hierarchy: the overall view, but also a detailed view (think of maps for example). An example is the stem-and-leaf plot, where you can see the distribution by the shape but also read off the exact numbers of each data point.

Chapter three, “Layering and Separation”, is about the 1+1=3 effect (in this case not a good thing) of how visual clutter can create noise (you see more than there really is). It starts with examples that are very good at layering their information in such a way to complement each other and the chapter ends with several examples in which the non-data is too prevalent (of which Tufte redesigns a few).
The subject of chapter four, “Small Multiples”, seems to have risen in popularity in the past year or so. It’s a relatively short chapter in terms of text, but it is nice to see examples of small multiples that are not necessarily a grid of boxes with a similar image (but of course each slightly different) in each.
Chapter five, “Color and Information”, dives a bit into how color can be used effectively. It’s a brief intro, and I would recommend reading Colin Ware’s book for those looking for an in-depth piece on the effects of color in (data) visualization. But the pages about Oliver Byrne’s explanations of mathematical proofs using colors and shapes are truly amazing.

The final chapter, “Narratives of Space and Time”, is about the display of time (and a bit about location) in a chart. And no, you won’t find a conventional line chart in this chapter. As with the rest of the book, this chapter has many wonderful examples of data visualization that have used a (slightly) different way to convey the passages of time, modelled on their dataset, from dancing to flight schedules.

This book is all about the 6 subjects of the chapters. There are no data visualization “rules” in here like in “The Visual Display of Quantitative Information”. Instead each chapter consists of examples that apply the subject of the chapter in an effective way and Tufte explains to the reader what it is exactly that works so well. Like his previous book I also very much enjoyed reading this book and learning from the many examples and would definitely recommend it.

Scott Murray

After discovering D3 about two years ago, this was the book that I bought to get myself started. I had already done a few chapters that were then available online. Even though I didn’t quite understand the whole “data().enter().append()” back then (and often think I still don’t these days :) ) I was able to actually create a bar chart and a scatterplot in D3 by following the examples. So, happy with the success I bought the book itself. I still remember very clearly, I was alone on a two week project in the center of Paris during the summer. I still haven’t found a better place to read than in the Tuileries, the gardens near the Louvre, during a beautiful summer evening.

Before this book I had no JavaScript and practically no HTML & CSS experience. Thankfully, this book is really aimed at somebody with no prior knowledge and the third chapter on Technology Fundamentals explains HTML, DOM, CSS, JavaScript and of course SVG fundamentals. Especially the introduction to JavaScript arrays and objects was essential for me.

The chapter on Data is where you first make something appear on the screen, and also the chapter that introduces you to the D3 chaining method. Next in “Drawing with Data” we get to draw a bar chart. In just 20 pages, we go from nothing to a beautiful bar chart with labels and colored according to height. Even the smallest step is thoroughly explained and always accompanied with an image of the result to make sure that you really understand what each step of the code adds to the end result.

The chapters on Scales and Axes are essential too, we couldn’t really make any interesting charts if we had to do the mapping of the range of the data to the locations of the pixels on the screens all the time. But it’s the next two chapters that really hooked me on D3; “Updates, Transitions and Motion” and “Interactivity”. Continuing with the bar chart from the “Drawing with Data” chapter, the book now explains how to make updates to the data and see your bar chart move to adjust, add new bars or remove bars. I think I didn’t fully understand the “enter, update, exit” routine at the time, but I understood it enough to get it to work in my own example. In the “Interactivity” chapter it’s crazy when you see how easy it is to add simple interactive elements, such as a color change of a bar when you hover over it. Even though it can be done in just a few lines of code, after creating your own hover effect for the very first time, it feels like you’ve made your bar chart 300% more interesting than before :)
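
To make the idea of a scale a bit more concrete, here is a minimal sketch in plain Python (not D3 itself, and not code from the book) of what a linear scale does conceptually: it takes an input interval of data values and an output interval of pixel positions and interpolates between them. The name `linear_scale` and the example numbers are assumptions made up for this illustration.

```python
from typing import Callable, Tuple


def linear_scale(domain: Tuple[float, float],
                 pixel_range: Tuple[float, float]) -> Callable[[float], float]:
    """Return a function that maps a data value to a pixel position."""
    d0, d1 = domain
    r0, r1 = pixel_range
    if d0 == d1:
        raise ValueError("domain must span more than one value")

    def scale(value: float) -> float:
        # Fraction of the way through the data interval (0 at d0, 1 at d1)
        t = (value - d0) / (d1 - d0)
        # The same fraction of the way through the pixel interval
        return r0 + t * (r1 - r0)

    return scale


# Hypothetical chart area of 500 x 300 pixels with data values from 0 to 100.
x = linear_scale((0, 100), (0, 500))
# Screen coordinates grow downwards, so the vertical range is flipped.
y = linear_scale((0, 100), (300, 0))

print(x(50), y(25))
```

Defining the mapping once and reusing it for every bar, dot and axis tick is what saves you from redoing the arithmetic by hand all the time.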

It took me many, many more hours of coding in D3 and searching on Stack Overflow after reading this book before I could really start to make custom visualizations, but I think I couldn’t have gotten a better start into the wonderful world of D3 than with this book.
Nowadays, there is a free online version of the book. This makes the examples involving animation more interesting of course, but I still would’ve bought a hard copy since I learn more easily from books than web pages and to show my gratitude to Scott Murray for creating this book :)

Edward R. Tufte

After reading “The Visual Display of Quantitative Information” and “Envisioning Information” by Tufte, I was very happy when Edward Tufte was so generous to send me his other two books (including his signature ^^) of which this is one. And I can say that these are of the same high standards as the first two books and they were an absolute joy to read.

This book revolves around the idea of presenting evidence. Evidence can come in words, numbers, images, diagrams, still or moving. What are the best ways to combine these aspects into one coherent story? There are 7 main chapters in this book with 2 bonus chapters relating to Tufte’s work in art. We start with “Mapped Pictures: Images as Evidence and Explanation”. This chapter makes it very clear that many data visualizations these days are missing their scales: “Mapped pictures combine representational images with scales, diagrams, overlays, numbers, words, images”. Of course, we get to see beautiful examples; of art, a stork, dancing and more.

The second chapter “Sparklines: Intense, Simple, Word-Sized Graphics” is an expanded version from the Sparklines section in his previous book “The Visual Display of Quantitative Information” and it brings enough new examples and information to be deserving of its own chapter now. We should really be employing sparklines more.

In “Links and Causal Arrows: Ambiguity in Action” we get into the realm of networks and graphs, which encompasses much more than what we would typically call a network visualization these days. It’s about linking “things” objects or words with arrows. But arrows can be rather ambiguous, what do they really mean? He gives tips on how to create effective diagrams.

In “Words, Numbers, Images—Together” Tufte explains very well the issue that you see so often when the evidence is not presented together: images at the far end of a book, or difficult mappings from numbers in an image to explanations somewhere in the text. We are shown the excellent examples of the Hypnerotomachia and Galileo’s Sidereus Nuncius, in which words and images, evidence in the latter case, are presented together. We also see how Isaac Newton’s Opticks, with its images bound together away from the text, is a pain to read and understand. The chapter ends with a few nice redesigns of how to integrate words within a chart without it being crowded and obscuring the data.

The fifth chapter “The Fundamental Principles of Analytical Design” is all about Minard’s famous map of Napoleon’s campaign to Moscow and back. It starts with a wonderful fold-out large example of the (translated) map. By studying Minard and the map Tufte teaches us 6 very useful lessons on presenting evidence. From the importance of showing comparisons to credibility to content counting most of all. A great chapter!

Chapter six deals with the opposite of Minard’s map “Corruption in Evidence Presentations: Effects without Causes, Cherry-Picking, Overreaching, Chartjunk, and the Rage to Conclude”, quite a mouthful. It’s about the different ways that somebody might be trying to corrupt reasoning through presentations. The title of this chapter sums up the subsections of this chapter and Tufte explains and shows (elaborate) examples on all.

I love that he ends with “The Cognitive Style of PowerPoint: Pitching Out Corrupts Within”. We get a front-row seat on how awful a technical report (from NASA on the Columbia disaster) becomes when it has gone through PowerPoint. It’s very clear after reading this chapter that when you want to present (beautiful) evidence, don’t use PowerPoint.

I am always amazed how Tufte manages to find so many wonderful examples from all over history to accompany his teachings. Although he sometimes re-uses an example that has appeared in another title of his, it is always either in short passing (with a reference to where the more detailed story can be found) or a much more elaborate telling (such as the sparklines and Minard’s map in this book). Definitely a book that will teach you much more about data visualization if you care about data and how to convey insights to your intended audience!

The FlowingData Guide to Design, Visualization, and Statistics

Nathan Yau

This was the first book that I read about data visualization, even before I had figured out that I wanted to specialize in the area. It’s been quite a while since I read it so it’s a bit difficult to remember, but I do remember that I really enjoyed reading this book and learned a lot from it.

The first 3 chapters of the book are about the preparation; how to get the data in the right form and choosing your tools to visualize. A whole bunch of links to (mostly) open data sources are shared and next we learn how to use Python to scrape weather data. There are many cases throughout the book where Yau shares the complete code (and explains what each step/line of code means) to create a visualization. He does this in many different tools, all of them free. This is one of the reasons why I learned so much from this book, by following the examples I had a good first step into a new tool, made something cool which gave me enough motivation to want to learn more about the tool (although I ditched Python for R).

The next five chapters are similar in idea to Stephen Few’s book “Now You See It” and each is about one particular visual analysis task: visualizing patterns over time, proportions, relationships, differences and spatial relationships. Each chapter explains several chart types from the ground up (so even the bar chart is explained like we’ve never seen it before) and then shows how to create one and make it look elegant and effective. Especially creating a simple chart in R and then refining the look in Illustrator (or Inkscape) opened my eyes to how much you can improve a simple chart by not sticking to program defaults.

One thing that I found interesting is that, for the most part, one type of chart only appears in one of the chapters. So it’s not all line charts and bar charts, more complex charts such as Treemaps and Parallel Coordinates (or Chernoff faces, although I feel these are exceptionally useless) are also explained.

This is the most practical book that I’ve read that combines visual best practices with learning exactly how to (re)create a chart. It contains tons of examples, is written in an engaging manner and is not too long. I would definitely recommend this book to those who already know how to program and want to dive into the world of data visualization.

Images and Quantities, Evidence and Narrative

Edward R. Tufte

After reading “The Visual Display of Quantitative Information” and “Envisioning Information” by Tufte, I was very happy when Edward Tufte was so generous to send me his other two books (including his signature ^^) of which this is one. And I can say that these are of the same high standards as the first two books and they were an absolute joy to read.

As said in his introduction, this book “describes design strategies—the proper arrangement in space and time of images, words, and numbers—for presenting information about motion, process, mechanism, cause, and effect”.

The first chapter “Images and Quantities” is a short one about the scales in images. Appropriate scales to help the reader understand the context are often forgotten or incomplete, especially in scientific images.

The second chapter is definitely my favorite one, “Visual and Statistical Thinking: Displays of Evidence for Making Decisions”. This chapter is devoted to the analytical and decision process behind two major events. The first is the cholera epidemic in London during September 1854, from which some of you might already know John Snow’s map (see the image below left). In this case, the analysis and visual evidence were presented in exemplary form. The other case concerns the launch of the space shuttle Challenger on January 28, 1986. Sadly, this is the opposite: the visual evidence was conveyed very poorly, which resulted in the shuttle taking off and exploding, killing all 7 astronauts on board. I loved how elaborately Tufte goes through both events and highlights the critical issues that made a difference.

Chapter three “Explaining Magic: Pictorial Instructions and Disinformation Design” has a nice twist. By looking at how magic tries to obscure, to deceive the audience, Tufte teaches us how to not do this for good information design. It starts with some fun examples of magic tricks and how these are visualized in learning books and the chapter ends with 6 learnings on how to give a “good information” presentation. So, for example, instead of not telling the audience anything as is normal in magic, turn that around and start your presentation by explaining the problem, why it is important and what the solution is.

Chapter four “The Smallest Effective Difference” is another short chapter that revolves around the idea to “make all visual distinctions as subtle as possible, but still clear and effective”. This chapter has some nice and convincing redesigns that you can learn from.

In chapter 5 “Parallelism: Repetition and Change, Comparison and Surprise” we see the first ideas of (small) multiples, although in this chapter that can also mean just two different versions side by side instead of many. This chapter again has a wonderful long explanation about the cyclogram from the Salyut 6 space flight (image below right) and many other examples ranging from areas such as landscape design to typography.

And then in chapter 6 “Multiples in Space and Time” we get to the >2 versions that are slightly different. Tufte explains the benefits of small multiples, and we again see a very diverse collection of examples, such as an image of 13 different interpretations of Saturn from the 1600s.

The final chapter “Visual Confections: Juxtapositions from the Ocean of the Streams of Story” started out a bit confusing for me. Although I liked the examples from centuries ago, I couldn’t quite get the idea of confections until I realized that it is another term for, or at least relates to, infographics in the data visualization realm of today. Some of the examples in this chapter were maybe a bit too much into the realm of art for my taste.

As always, the collection of examples that span centuries and are not constrained to what we nowadays typically think of as data visualizations is what I love the most about all of Tufte’s books, and this one is no exception. However, I will be honest and say that if you are still new to the world of data visualization I would recommend other books first to get a good basis. I felt that “Visual Explanations” has fewer direct lessons for data visualization and felt more like a wonderful collection (and explanation) of examples. However, once you’ve read the basics and best practices of data visualization, I would definitely recommend this book to expand your growing understanding of visualization and your collection of data visualization books.

Designing Tables and Graphs to Enlighten

Stephen Few

Stephen Few isn’t the guru on dashboard design for nothing. His books are packed with useful and practical information and examples on the many ways to display data. His books are not meant for fancy charts, no Streamgraphs in here, but for the basic charts and tables that have been at our disposal for many years. Nonetheless, I think you need to first understand how to make a simple chart effective.

The book starts by explaining the different types of data: quantitative versus categorical. Afterwards, the basic statistical measures are explained: mean, median, standard deviation, distributions, ratios and measures of correlation. The third chapter is a short one, explaining when to use tables and when to use charts.

The next two chapters are about the fundamental variations of either tables or graphs. What kind of relationships can you display in a table; quantitative to categorical or quantitative to quantitative. The fundamentals of graphs is a much longer chapter. That this book really starts at the basics comes across here where the start of the chapter tells the reader how points and bars can be used to encode data. Next, the visual attributes such as position, color and shape are explained to create categorical subdivisions in a graph.
Afterwards, things get a bit more interesting when the section on “Relationships in Graphs” begins. Few points out that there are 7 types of relationships that business graphs usually display, such as ranking, deviation (image on the bottom left) and correlation. For each of the 7 options Few shows what chart types work particularly well. Here we see one or two non-typical charts, such as box plots (which I like a lot) and two bars on top of each other where one set of bars is thinner than the other (not yet sure what I think of these).

Chapter 6 is in essence Colin Ware’s book “Information Visualization” in 25 pages. It quickly tells you about the workings of the eye, preattentive design, colors, context and the Gestalt Laws. It is a bit short for my taste; I feel that those subjects, which have such a big impact on the effectiveness of a chart, could use more pages. The famous “Data-ink ratio”, by Edward Tufte, is quickly explained in chapter 7: reduce the non-data ink and enhance the data-ink.

Chapter 8 is all about the design of tables. Even though I hardly ever “design” tables, this chapter points out some very useful tips to make elegant tables with as few row filling colors and cell borders as possible.

Although chapter 9 is called “General Graph Design”, it is mostly about what not to do. Use zero-based scales and avoid 3D (the pie chart was already addressed in one of the first chapters). Then we finally get to what is, I think, the most useful chapter of the book, “Component-Level Graph Design”. Here are the tips that are least obvious but can still make a big impact on chart effectiveness: how much white space there should be between bars, the best x/y axis width ratio, using trend or reference lines to make the pattern more apparent, and when and how to eliminate legends, tick marks or grid lines. Most of the chapters before this one are about how to display the data and which chart types to use. This one really shows the extra mile that a chart can go when you optimize its visual side once you’ve chosen a specific chart type.

The book ends with a short chapter on displaying multiple variables, by using small multiples for example.

I have to admit that I read through the first ~6 chapters really quickly because I already knew most of what was being explained. Few writes in a very entertaining fashion making sure that it never gets boring to read about tables or data.
This book is one big list of useful tips tied together with extensive explanations and examples. Many are obvious, but having them all enumerated with nice tables and charts to prove the point is what makes this book such a valuable read.

Data, Charts, and Maps for Communication

Alberto Cairo

The Truthful Art is the second book in Alberto’s trilogy (I think) of which The Functional Art, one of my favorite books, was the first. In short, I’d say this book is a statistics book for journalists. And that’s also the reason that I find this book difficult to review. With my background in Astronomy, I’ve had 5 years of mathematics and physics. Therefore, I was already aware of the formulas and ideas presented in this book, although it was certainly nice to see data visualization examples with each subject. I can’t say for sure, but I think that if you are new to statistics then this can be a good and gentle introduction to the main topics. Some things are still explained in a slightly confusing way, but most is explained in small steps with easy maths and examples.

The book has 12 chapters divided into four parts. The first part, “foundations”, talks about the differences between infographics, data visualizations, charts, and maps. But it also points out and explains the qualities of great visualizations: truthful, functional, beautiful, insightful and enlightening.

Part II, called “truthful”, talks about visualization being a model of the truth. We can never be absolutely true, a visual just isn’t real life, but there are “better” models. Alberto also talks about the common mistakes due to dubious models and error-prone human reasoning.

I find it interesting that Part III, which is by far the biggest section in the book, is called “functional”. I guess he wasn’t done with it after his 1st book :) It’s here that we dive into the statistics and see some formulas: (weighted) means, standard deviations, histograms, the normal distribution, boxplots, percentiles, visualizing trends and seasonality, ratios, log scales, correlation (coefficient), z-scores, parallel coordinates and linear regressions are terms that you will know after reading these ±200 pages. And then there’s suddenly a chapter on mapping data on maps before it ends with a chapter on “Uncertainty and Significance”, which I find a fascinating subject in visualization. Many of the terms above are preceded by a fun example of a simple dataset that Alberto takes apart and visualizes in several slightly different ways.

The last part is about “practice” and is a collection of the wonderful work that other data visualization practitioners have been doing.

Definitely a useful book to have when you’re new to statistics and data manipulation.

Principles for Creating Graphics that People Understand

Connie Malamed

This book consists of two sections. The first explains how we humans process visual information and the second section talks about principles to improve your design. The book is filled with colorful and large examples, from data/information visualization designers such as Nigel Holmes and Nicholas Felton to marketing pieces to art. Each example is accompanied by a small explanation about how it connects with the lessons you’ve just learned from the main body of text. In fact, most of the book is taken up by the visual content, so it doesn’t take that long to read through the entire book.

The first section contains elements that I’ve seen in other books in this section as well, for example that we can only hold a few chunks of information in our working memory. A data visualization especially needs to keep that in mind. There was also a nice section on schemas, our mental representations that embody our understanding of the world.

The next 3/4 of the book is devoted to section two: the Principles, which contains 6 subsections.

Organize for Perception is a bit about our preattentive processing, but using very different examples and wording than all the other books in this list. It’s more about how to design a piece where the viewer can get a sense of the (most important) information quickly. Direct the Eyes talks about compositional and signaling techniques that are effective at guiding the eyes to a specific location, such as emphasis and position. Reduce Realism explains that reducing the realism can make it easier for a viewer to understand the graphic quickly and help it stick in long-term memory. Make the Abstract Concrete is about diagrams and flows, how visuals can help us to think so we do not have to hold it all in memory. In Clarify Complexity we learn different techniques for visually representing complex concepts, such as science or medical topics. Many of these examples remind me of National Geographic style visualizations. The final subsection, Charge it Up, is about creating an emotional response when somebody views your design (through storytelling, metaphors or humor, for example).

Each subsection first explains what the general idea is and then has a section on applying the principle to designs in several ways.

This book is aimed at graphic/information designers in general and not strictly about visualizing data (or information). Therefore, not all examples are as relevant to a data visualization designer. Nonetheless, I feel this book shares many useful lessons about making a visualization that stands out in a crowd, that is easy to understand for the audience and that makes a lasting impact.

Simple Visualization Techniques for Quantitative Analysis

Stephen Few

This book is about what charts to use for the most common tasks in data analysis. There are two big sections in the book. The first is about building the core skills for visual analysis, which involves general concepts and principles. The second section is about honing skills for diverse types of visual analysis, to identify patterns for example.

The first 6 chapters about building the core skills start off with a really brief history of data visualization, what makes a good analyst (curiosity) and what is meaningful data. Chapter 3 “Thinking with our eyes” is again a summary of Colin Ware’s book, just like chapter 6 in Few’s previous book “Show Me the Numbers”. Color, luminosity and preattentive processing are really quickly explained.

Chapter 4 is about 13 different analytical interactions that you can have with the data (visualization) to get an understanding of it: comparing, highlighting, zooming and annotating, for example. Each of the 13 options is explained with many visual examples. Chapter 5 then looks at several techniques that can improve the effectiveness of visual analysis, such as using brushing in a dashboard of charts or adding reference lines. Finally, chapter 6 ends the first section by quickly introducing six different analytical patterns, which will each get their own full chapter in the second section.

The 6 chapters in the second section are devoted to time-series analysis, part-to-whole and ranking analysis, deviation analysis, distribution analysis, correlation analysis and multivariate analysis. Each of these types is introduced more thoroughly than in chapter 6, after which Few shows the reader what chart types can be used to perform the analysis. For some analyses, such as deviation analysis, there are quite a few different options available, whereas time-series analysis really only has two (bar and line chart). Each chapter ends with a section on best practices for the chart types, for example how to select the best interval for binning data to do distribution analysis.

This book is rather big, but it actually doesn’t contain a lot of text and I was able to finish it within a week. It uses big gutters for notes and there might be more space filled by charts than text (which I don’t mind).
In the end, I think this book is more meant for the data analyst who needs to get a grip on the data or create dashboards or reports for the management about the operation of the business. What types of charts can you use that will transform the data in such a manner that the viewer can visually see a pattern is the main point, not necessarily visualization best practices.
For a data visualization designer who has no experience as a data analyst this book is a good read to understand how to get meaning from data. But for those who are learning data visualization from a data-scientist-like background this book offers little new knowledge and you are better off picking Few’s first book “Show Me the Numbers”.

An Introduction to the Histories, Theories, and Best Practices Behind Effective Information Visualizations

Isabel Meirelles

This is the most colorful book in my possession that also teaches you a bit about information visualization. There are only six chapters in the book and each chapter focuses on a different type of data: hierarchical structures (trees), relational structures (networks), temporal structures (timelines and flows), spatial structures (maps), spatio-temporal structures and textual structures.

Each chapter consists of a history of the type of chart. These are the most extensive histories that I’ve read and they are accompanied by many beautiful vintage visualizations that were made hundred(s) of years ago. Next is an explanation on how to create/interpret the chart and what kind of variations there are. A tree can be visualized as a treemap or a sunburst for example. And finally there are several case studies. A short introduction about the visualization is given, why was there a need or desire to create it and an explanation of what is visualized. All of them are presented with many beautiful screenshots.
Dispersed throughout the book are also small breakouts that explain several Gestalt Laws, such as similarity and continuity, and preattentive principles.

If I were to describe this book in one sentence, it would be a book that visualizes and (quickly) explains a lot of the more famous visualizations found online. If you really want to learn about best practices and visualization principles I would suggest going for a few other books on this list, such as Colin Ware’s book “Information Visualization” or Stephen Few’s “Now You See It”. But if you want to be inspired by beautiful examples, get a bit of background about them, get introduced to a few visualization best practices and are interested in a piece of history on several chart types, this is the book for you.

4th Edition

Robin Williams

This book does not contain any visualization of data. It is all about how the everyday man/woman can turn their WordArt flyer into something more appealing. Most of the examples in this book are therefore related to flyers and banners from small businesses or community groups that you would find in the local weekly newspaper.

The most important message from this book is “Don’t be a Wimp”, if you use contrasting letters, make them really different, for example. This book is one of those that is filled with more visual examples than text, which I rather enjoy since a point is made so much clearer with the right example. It doesn’t take long to read all the text with its ~200 pages in total.

After the introduction, the first four chapters are on the four design principles that Robin has distilled from design theory. Contrast: don’t use elements that are too similar. Repetition: repeat visual elements of the design throughout the piece. Alignment: every element should have some visual connection with another element. Proximity: items relating to each other should be grouped together.

There are often examples of advertisements that do not use a certain design principle and how they would look with a make-over. I felt that this does help to make clear how much difference small changes can make. Although not all of the redesigns get rid of their “home grown made-on-my-own-computer” feel, they are indeed always an improvement.

There’s also a nice chapter on the basics of color theory and pages with extra tips and tricks for creating business cards, envelopes, flyers, newsletters, brochures and so on.

The second section of the book is devoted to typography. First explaining essentials such as the right kind of quotation marks and when to use underlining (i.e. never). The rest of the section explains different categories of type; the serifs, sans serifs and others and how you can combine these categories on a page to make it look more interesting.

Although I did like the book, thinking back I guess that you really only need to read the page that summarizes the four design principles. Like I said in the beginning, perhaps this book is more suited for the person who wants to create a nice flyer for their little pie shop. Those who really want to go into the design of graphics, information and data would do better to invest their money in a different book that is more technical, dives deeper and is focused more on the professional designer.

Tamara Munzner

After hearing the Data Stories episode with Tamara Munzner I knew I wanted to read her book. She’s done, seen and thought about so many aspects of Information Visualization.

Each chapter starts with a small schematic representation of the topics that will be discussed. In the bottom left image you can see such a representation for chapter 3 on the different types of abstract data tasks.
Chapter 2 introduces us to data, what data types are there, but this goes much further than the typical ordinal and nominal. It sets apart networks from geometry and sequential versus cyclic. In the next chapter Munzner explains what the main abstract reasons are for using a visualization tool in the first place. I felt the schematic at the start of this page (image bottom left) is one of the most useful in the book and very useful when you are at the start of a project to figure out the “point” of the visual (and that there is actually an “enjoy” task as well).
Once you know the data and the task you can start on the design, and chapter 4 explains the “Four levels of Validation”, or the four levels that exist in visualization design: domain situation, data/task abstraction, visual encoding/interaction choice and creating an algorithm to handle the visual encoding. On each of these levels you need to perform validation to make sure that your visualization program is effective.

Chapter 5 on “Marks and Channels” (together with chapter 10) is this book’s version of Colin Ware’s work reviewed above. I do really like Figure 5.8 (image bottom right) that ranks how good people are at judging exact values from several different studies, including crowd sourcing.
One of my favorite chapters was number 6 on “The Rules of Thumb” with 10 basic rules such as “Eyes Beat Memory” and “No Unjustified 3D”. At the start of the chapter Munzner explains “Each of them [the rules] has a catchy title in hopes that you’ll remember it as a slogan”.
The next 3 chapters are all about how to visually arrange data: data that comes from tables, spatial data and networks. Using radial layouts or parallel layouts for “normal” data is one example. Munzner uses many case studies, many of which are research related (about biology for example).

Chapter 10 “Map Color and Other Channels” was my other favorite chapter of this book. Even though most can also be found in Colin Ware’s book I did learn a few more things about color and there are a few very nice examples here that I hadn’t seen before.
The final 4 chapters are about the major strategies that are available to manage complexity in visualizations: changing a view over time, faceting data into multiple views, reducing items and attributes, and embedding focus and contextual information in one view. Each chapter explains what the strategy means, why to do it and then several options for how you can do it.

After reading it all, I do have to admit that even though there are so many interesting topics covered in this book, I found it difficult to pick it up again in the evenings and read. The text felt a bit dry, in that it wasn’t pulling me in, but more an enumeration and explanation of facts.
I also feel that this book will be more useful to researchers than practitioners. It focuses more on theory and on creating complete visualization tools, and uses many examples from tools created as part of research for a very specific task or user group. Combined with the rather high price, I would suggest going with the other books on this list (first).

Non-design Books

Books that I’ve read that don’t really have a direct connection to data visualization or design. Nonetheless I do think they contain valuable information that has the potential to make you a better designer.

Darrell Huff

Lovely little book (only 124 A5-ish pages) written more than 50 years ago, but still very much relevant (perhaps even more so, in this data overloaded online world now). It is packed full of wonderful real-life examples of cases where scientists and businessmen have drawn false conclusions from and with numbers. And, sadly enough, those false conclusions are not always by accident. There are several chapters about how graphs can be misleading, such as the bar chart that doesn’t start at zero.

If you’re working with numbers, this really is a must read. The humorous book contains many valuable lessons, both on catching a wonky statistic you read about and on how not to make the mistakes yourself.

Daniel Kahneman

This book, written by Economics Nobel Prize winner Daniel Kahneman, really has nothing to do with data visualization. It won’t teach you anything about what charts to choose or what the best practices are in Information design. But this book does teach you how people think and behave, or rather, it shows you how counterintuitively people behave and how we make decisions for reasons that are far from rational.
Things such as “Anchoring”, where we are influenced by irrelevant numbers in making choices, or “Framing” where the context in which choices are presented influences the outcome are exceptionally funny and informative to learn (An example of framing: subjects were asked whether they would opt for surgery if the “survival” rate is 90 percent, while others were told that the mortality rate is 10 percent. The first framing increased acceptance, even though the situation was no different).

This book, written in an engaging manner that really makes it easy to breeze through, is packed with interesting psychological research, practically all of which resulted in some amazing and unexpected finding about how people behave that has really changed my worldview.
I wouldn’t call it essential for learning data visualization, but it’s a good addition if you want to know more about what really drives people and their decisions, to improve the overall design set-up of your visual.

Upcoming Books

Below is a list of books that I really want to read, but haven’t gotten around to yet. I’ll write a short review and add each one to the section above once I’ve read it. But for now the list below explains in a sentence or two why they are on my wishlist.

William Cleveland

Both of these books are written by William Cleveland and have been around for more than 25 years. That’s why I’m starting with Tufte and Few first. However, I keep reading about how these books still contain a lot of valuable information on the subject of making effective and clear visualizations for exploratory data analysis. I’m not quite sure if “Visualizing Data” is a companion to “The Elements of Graphing Data” or if it’s more of an update though.

Version 4.0: 20th Anniversary Edition

Robert Bringhurst

One of the books on this list not aimed at data. I’m really fascinated by typography. That there is a whole area of expertise in this field, really wonderful! I keep enrolling in all the free typography classes on Skillshare, but again, I should really start with reading about the basics and I understand this is one of the best typography books.