Is That Chart Saying What You Think It’s Saying?

More than ever, charts are part of the news. Since we spend most of our idle moments with our faces in front of screens, charts are an effective way for news outlets to keep us from looking away (what publishers more optimistically refer to as engagement). They’re a great Twitter cheat, too, since a picture is worth far more than 140, or even 10,000, characters. I like to think charts also serve a higher purpose by enabling people to understand complex ideas and see old ideas in new ways. Visualization is powerful.

But then there’s the other part of it, when the charts themselves become the news. Visualization being powerful makes it a tempting thing to misuse. In highly charged partisan debates and in high-stakes business scenarios, chart makers can toe — and sometimes blatantly cross — the blurred line between a persuasive visualization and a dishonest manipulation.

Recently, the New York Times reported on emails from airbag maker Takata that reveal engineers were allegedly “changing colors or lines in a graphic ‘to divert attention’” in a chart about airbag test results. This was part of a larger data manipulation that ultimately led to Honda severing ties with the airbag company.

I haven’t seen those charts, so it’s hard to judge how dishonest the renderings were. But Takata’s example is one of the many clear signs that managers who hone their visualization skills may increasingly find themselves in a position to change emphasis in a chart, leave out data, or add some data. It’s time to start thinking about these issues.

The intent of this chart is clear, but its execution is so dopey that even calling it malicious is hard. This is amateur manipulation, like a student who wants to get out of gym giving his teacher a note that reads, “Please excuse Scott from gym today, signed Scott’s Mom.” Almost immediately after it showed up during a U.S. Congressional hearing, the chart was unanimously panned. Still, looking at some of the techniques used in it to understand why someone would try to pass it off as true is instructive.

The best place to start is to find the idea that you think the chart maker wants you to see. In this case, every decision seems to be designed to make me see a surge in abortions at the expense of cancer screenings. As one shoots up, the other dives down. Making the lines cross injects that idea in a way I see immediately. If I don’t read the numbers, or if I’m looking at it projected on a screen in a hearing room or on C-SPAN, where the numbers will be hard to make out, then all the better. I see a crossover dynamic that possibly suggests causation.

The chart maker also plotted abortions in a highly saturated red, which draws the eye and looks “dominant” compared to the softer pink, further reinforcing the idea: “Look at this red line going up (much higher than the pink one).”

What would have been a less deceptive representation? Nearly any other chart! Many people recharted it by simply plotting the lines as they would appear with a single y-axis for the total population that received the services. That makes sense. I went a little further: The services here are for women, and the population of women changed between 2006 and 2013. Using U.S. Census population estimates*, I recharted the values as rates (services per 1,000 women), and I excluded females under age 10:

The fact that less than half as many women received cancer screenings in 2013 compared to 2006 is remarkable and makes me wonder why that is. But if the intent of the original chart was to show some relationship between cancer screenings and abortions, we now see why this plotting wasn’t used: the suggested relationship just doesn’t exist. In fact, the abortion rate was effectively unchanged, moving from 2.19 to 2.33 per 1,000 women aged 10 and older.

That was an easy one. No amount of rationalizing can justify putting a lower number above a higher one on a y-axis with rising values. On the other hand, subtler issues with technique are at play in every chart we produce. Someone could make the argument that my chart doesn’t include “interim” data between 2006 and 2013. What if the number of abortions doubled by 2010, then returned to baseline? What if funding for cancer screenings changed dramatically? Maybe screenings per dollar of funding would be a better metric? Every chart includes these kinds of decisions, which means every chart is, in a sense, a manipulation of the available information.

Here’s a trickier one that made the rounds recently:

The accompanying tweet surely lathered up people on both sides of the issue. Some giddily retweeted, and others angrily responded by calling the chart a “lie.” But it’s accurately plotted. The chart maker hasn’t put lower numbers above higher ones or left out relevant data points or done anything else particularly egregious. The chart isn’t wrong, so why are some people so infuriated by it?

Remember that key question: what is the idea that you think the chart maker wants you to see? The answer here appears to be “nothing.” They want you to see a flat line. That flatness is accentuated by starting the y-axis at 0 and extending it twice as far as the top range of values. The more values on the y-axis, the less distance between each value, effectively flattening the line.
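The flattening effect comes down to simple arithmetic, and a minimal sketch can show it. The temperatures below are made up for illustration and are not the chart's data, and the function name `plotted_height_fraction` is hypothetical. The sketch compares how much of the axis height the same series fills on a tightly fitted axis versus one that starts at 0 and extends to twice the top value.

```python
def plotted_height_fraction(values, y_min, y_max):
    """Return the share of the y-axis height that the data's range occupies."""
    if y_max <= y_min:
        raise ValueError("y_max must be greater than y_min")
    data_range = max(values) - min(values)
    return data_range / (y_max - y_min)


# Hypothetical series (assumption, for illustration only).
temps = [57.0, 57.4, 57.9, 58.3]

# Axis fitted closely around the data.
tight = plotted_height_fraction(temps, 56.5, 58.5)

# Axis starting at 0 and extending to twice the highest value.
stretched = plotted_height_fraction(temps, 0, 2 * max(temps))

print(f"tight axis: {tight:.1%} of the plot height")
print(f"stretched axis: {stretched:.1%} of the plot height")
```

With these placeholder numbers, the same change fills well over half the plot on the tight axis and only about one percent of it on the stretched one. Neither version misplots a single value.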

But the flatness isn’t totally artificial. The total temperature increase is just a little more than a single degree. What this chart maker understands is that we don’t actually read data or think about specific values on a chart like this. We see a shape. We respond to its concision. In psychology, this is related to the central Gestalt principle called the law of Prägnanz, which means pithiness. We find the easiest meaning we can in the object itself, at a glance.

So we see a flat line and draw on convention to connect that shape to what we associate it with: changelessness. Nothing is happening. It’s steady. Safe.

What I love about this chart is that you can’t dismiss it as some charlatan’s trick. Since it’s accurately plotted, it demands that anyone who thinks it’s misleading produce something that’s equally accurate but that better shows why the small changes in the trend line are significant. Many people did just that. Some truncated the y-axis to make the line rise more steeply. Others threw a lot of science at the problem, changing the measure plotted to “departures from average temperature” and showing it as steeply rising bars.
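Changing the measure to departures from an average is a small transformation, sketched here under stated assumptions. The series is made up, the function name `departures_from_average` is hypothetical, and the baseline defaults to the mean of the series itself, which may not be the reference period those rechartings actually used.

```python
def departures_from_average(values, baseline=None):
    """Subtract a baseline from every value.

    If no baseline is given, the mean of the series is used (an assumption).
    """
    if baseline is None:
        baseline = sum(values) / len(values)
    return [v - baseline for v in values]


# Hypothetical series (assumption, for illustration only).
temps = [57.0, 57.4, 57.9, 58.3]

anomalies = departures_from_average(temps)
print([round(a, 2) for a in anomalies])
```

The values are no more or less accurate than before. Centering them on zero simply makes the axis range follow the size of the change rather than the size of the raw temperatures.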

Another approach is to attack the Gestalt by coming up with examples of flat lines that nonetheless show profound change. Below is a flat line I created using techniques similar to those in the chart above, extending the y-axis far beyond the top values and using a y-axis measure that encourages flatness. (The data isn’t real; I created it for the purpose of the exercise.) Drag the slider to the right to see why what looks changeless actually matters:

Most people assumed the climate change chart was an effort to say “Look, the line is flat. Nothing’s happening. Stop worrying.” Responding that way to the body temperature chart would eventually result in a coma, or death. Sometimes, even flat lines mean something.

Consumers of data visualization that’s found in the news or the boardroom shouldn’t be expected to brush up on the law of Prägnanz to suss out manipulative charts. In fact, you can’t stop your visual perception system from doing its thing. It’s going to interpret a flat line as showing changelessness. What you can do is take a few extra moments to think critically about what you’re looking at, just like you would when looking at any other information someone presented to you. If a manager sent you a memo that read, “Our competitor’s revenues are flat, so they’re nothing to worry about,” would you simply assent without challenging them?

Recent research shows that with practice and by using critical thinking skills, we can improve our data-visual literacy. Nasty skirmishes like the ones above are a sign that data-visual literacy is increasing and that we’re beginning to consider how to define the ethical use of visualization in news and in business. Two of these examples were fraught political ones, but the analysis isn’t ideological. It’s relevant to managers who, like the Takata engineers, are going to find themselves in need of making a case, winning some business, or documenting results. How will they handle the power they have with visualization? Here’s hoping it’s responsibly.

*The U.S. Census has produced an estimate of the female population by age for 2006, but not for 2013. For 2013, I calculated 51% of the total population estimate (that was the proportion of women in 2006) to arrive at a number of women, then subtracted 13% of that figure for females under 10 years old (again, using the same proportion that existed in the 2006 data).
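The estimate described in this footnote can be sketched as follows. The 51% and 13% proportions come from the footnote. The population and service counts are placeholders, not Census or provider figures, and the function names are hypothetical.

```python
FEMALE_SHARE = 0.51    # proportion of women in 2006, per the footnote
UNDER_10_SHARE = 0.13  # share of females under age 10, per the footnote


def women_aged_10_plus(total_population):
    """Estimate the number of women aged 10 and older from a total population."""
    women = total_population * FEMALE_SHARE
    return women * (1 - UNDER_10_SHARE)


def rate_per_1000(service_count, total_population):
    """Services per 1,000 women aged 10 and older."""
    return service_count / women_aged_10_plus(total_population) * 1000


# Placeholder inputs (assumptions, not real data).
example_population = 1_000_000
example_services = 1_000

print(round(women_aged_10_plus(example_population)))
print(round(rate_per_1000(example_services, example_population), 2))
```

Expressing each service as a rate means growth in the population of women does not show up as growth in the service itself.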