Making Tables of Numbers Comprehensible

SEARCH BLOG

I’m always amazed (read: dismayed) when I see the results of an analysis presented with a key set of the results delivered as a raw table of numbers. It is impossible to instantly comprehend a data table that has more than 3 or 4 rows and 3 or 4 columns. And, “instant comprehension” should be the goal of any presentation of information — it’s the hook that gets your audience’s brain wrapped around the material and ready to ponder it more deeply. Below are two different ways to use conditional formatting to convey information rather than data.

Heatmaps

An industry analyst report was released recently that summarized the scores from the analyst’s evaluation of multiple platforms across a number of dimensions. You may recognize the table below (although I’ve doctored the results enough that I’m not giving away any of the analyst’s intellectual property):

This is a barely-dressed-up Excel spreadsheet. It takes some real staring at the table, including scanning and re-scanning the numbers, to realize that “Jupiter” is rated as the strongest offering. “Neptune” — the fourth vendor listed — appears to be the second strongest. To be fair, a high-level summary of these results is presented in a separate chart, but that chart is really high level.

Some things that, when I tried to wrap my own head around the table, seemed extraneous:

2 decimal places — these scores were the roll-ups of dozens and dozens of individual scores that were, inherently, somewhat subjective. Two decimal places implies a precision that simply does not exist.

The actual component scores — in an evaluation like this, the reader really primarily cares about relative strength across each row rather than the absolute scores.

The weightings — the weighting matters…but it’s just a factor in the formulas that get to the overall scores — it’s not actually part of the results

With that in mind, just to get a better understanding of the data myself, I grabbed the data and reformatted it using a heatmap. Each row is graduated separately based on the high/low values in that row:

I eliminated the actual display of the numbers for the components for each group, and I shifted the weighting off to the right (and lightened it and made it a smaller font). When I looked at this, I realized that the order of the vendors was simply alphabetical. While that is a logical order, and it may seem like a good way to “let the data speak for itself,” alphabetization is entirely arbitrary in this context. Why not arrange the platforms from overall strongest to weakest?

As it turns out, “Jupiter” happened to be first alphabetically and had the highest overall score. But, with this arrangement, we can quickly see where the different platforms stand out. For instance, “Mercury” is rated relatively lower on all dimensions except for “Employees,” where they were scored high relative to the other platforms. “Saturn” has only 3 areas where they are not the absolute weakest of the entire group.

Now, is the heatmap approach above the only way to present it? Do I think you have to omit the numbers from all of the cells? Of course not! But, it hardly seems arguable that the heatmap is much easier to digest than the raw data table.

Chartless Bars

A different type of data table is shown below. This one is a case where multiple metrics are shown across a single dimension (in this case, traffic source):

This is a reasonably formatted table of numbers, but reading and interpreting a bunch of numbers at a glance isn’t something we do well. Interestingly, the first “read” of this data actually comes from the number of digits in each metric rather than the numbers themselves. For instance, the higher conversion rate for email jumps out because it’s a longer number, rather than because it is numerically larger. As a matter of fact, the first three metrics actually each have a bit of a bar chart nature to them just because they have varying numbers of digits.

Understanding that that length is a much faster/easier visual input than numerical digits, we can use conditional formatting to capitalize on that fact:

Or, if you have the room to allow a slightly wider chart, add a column for each metric so that the value and the bars don’t overlap:

In either case, it then becomes much easier to grasp which metrics for which rows are anomalous. For instance, the conversion rate for email — noted earlier — but also the visits from paid search.

Always More Than “Just The Numbers”

Hopefully, the examples here, more than prescribing exactly how to plague-ishly avoid raw tables of numbers, show how much more readily comprehensible tables of numbers can be with some quick visualizations of the data. What do you think? Do you have techniques you use to make data tables more readily digestible?

6 Comments

Mark PetersJune 24th, 2014

Hi Tim, fantastic post.

I usually go with the in-cell bars or traffic lights, but you did such a great job on the heat map, I need to start doing that more 🙂

Glad to see you sorted greatest to least as well! This is frequently a missed opportunity.

Interesting – I’ve never used a heatmap, but it looks like a good way to go. I like the in-cell charts too – I can imagine audiences being wowed!

In the heatmap section, you mentioned rounding, and that’s something that jumps out at me too. My area is slide presentations, and often I see figures like $1,043,478 projected up on the wall, but having 7 digits of precision is far too many on a slide (rather than in a report for longer reading). So you might like these ways to round big numbers more sensibly for use on slides.

Tim has moved on from Analytics Demystified effective 12/31/2017 but his content lives on. If you have questions for Tim please send them to eric@analyticsdemystified.com directly and they will get routed.