Data Visualization Design, Part 1: Data to Ink Ratio

(Image from Tufte’s The Visual Display of Quantitative Information)

This fall semester I’m taking a Data Visualization class as part of my Data Science graduate studies at Indiana University’s School of Informatics and Computing. I haven’t devoted many blog posts here to data visualization, even though it has been a key part of my data science journey. Today I want to focus on a particular principle of good design that came up in that class: choosing a good data to ink ratio.

Arguably the first person to explore this principle of good data visualization design was Edward Tufte, who is known as ‘a pioneer in the field of data visualization.’ Tufte discusses it in Chapter 4 of his signature book The Visual Display of Quantitative Information. His main argument is ‘Above all else show the data’. He expands on this by saying that ‘every bit of ink on a graphic requires a reason.’ This idea is known as the data to ink ratio: the share of a graphic’s ink that is devoted to presenting the data itself. I agree with this principle and think that most of the time it is a good guide for creating effective data visualizations. A graphic should be a visual representation of the information with as little redundancy as possible, so erasing what is not necessary is crucial.
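To make the ratio concrete, here is a minimal Python sketch of the arithmetic, taking the ratio to be data ink divided by the total ink in the graphic. The chart elements, their ink amounts, and the choice to count only the bars as data ink are all made-up assumptions for illustration, not measurements from a real chart or figures from Tufte's book.

```python
def data_ink_ratio(ink_by_element, data_elements):
    """Return the share of total ink spent on data-carrying elements."""
    total_ink = sum(ink_by_element.values())
    if total_ink == 0:
        raise ValueError("graphic has no ink")
    data_ink = sum(
        amount for name, amount in ink_by_element.items() if name in data_elements
    )
    return data_ink / total_ink


# Hypothetical ink amounts (arbitrary units) for a simple bar chart.
chart = {
    "bars": 40,
    "gridlines": 25,
    "border": 15,
    "background_shading": 10,
    "axis_labels": 10,
}

# Assumption: only the bars count as data ink in this example.
data_elements = {"bars"}

before = data_ink_ratio(chart, data_elements)

# Erase decoration that carries no data, but keep the axis labels.
non_essential = {"gridlines", "border", "background_shading"}
decluttered = {k: v for k, v in chart.items() if k not in non_essential}

after = data_ink_ratio(decluttered, data_elements)

print(f"before: {before:.2f}, after: {after:.2f}")
```

With these invented numbers the ratio rises from 0.40 to 0.80 once the gridlines, border, and shading are erased, which is the kind of improvement the principle is asking for.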

However, Tufte later mentions that repeating the same information can be useful for certain graphs, such as those representing cyclical time-series data. Michelle Borkin and colleagues also present interesting research in which, in certain situations, redundant qualitative data encodings and additional qualitative encodings of the main trend can reinforce the message and help the viewer remember the graph. That said, Borkin et al. generally agree with Tufte’s principles of good design.

Next week, we’ll look at some case studies that apply the data to ink ratio principle to improve data visualizations.