Data Visualization: Representing Proportions Accurately

In data analytics, you often need to show how a whole breaks down into its parts. Pie charts are frequently used for this; however, you may have heard at some point in your career that you should never use one. The primary criticism is that the human eye struggles to judge the area of each slice well enough to compare the relative magnitudes of the values shown, especially when the pie has many slices. The problem is compounded by 3D pie charts, readily available in software like Microsoft Excel, which distort the apparent area of each slice even further.

In my experience, a pie chart has a single functional use: comparing the fractional relationship between two numbers. One example is the percentage of revenue at Doug’s Desserts that comes from selling baked goods compared to online subscriptions.
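As a minimal sketch of that two-number case, the snippet below turns two revenue figures into fractions and draws them as a two-slice pie with matplotlib. The dollar amounts are placeholders I made up for illustration, and the function names are hypothetical.

```python
# Placeholder figures (thousands of dollars), invented for illustration only.
REVENUE_BY_CHANNEL = {"Baked goods": 300, "Online subscriptions": 100}


def channel_fractions(revenue):
    """Return each channel's fraction of total revenue."""
    total = sum(revenue.values())
    return {channel: value / total for channel, value in revenue.items()}


def plot_two_slice_pie(revenue):
    """Draw a flat (non-3D) pie with exactly two slices."""
    import matplotlib.pyplot as plt  # imported here so the math above runs without it

    fractions = channel_fractions(revenue)
    fig, ax = plt.subplots()
    ax.pie(
        list(fractions.values()),
        labels=list(fractions.keys()),
        autopct="%1.0f%%",  # print the percentage on each slice
        startangle=90,
    )
    ax.set_title("Revenue by channel")
    return fig
```

Calling `plot_two_slice_pie(REVENUE_BY_CHANNEL)` and then `plt.show()` (or `fig.savefig(...)`) displays the chart. With only two slices, the reader is comparing one part against the rest, which is the one judgment a pie chart supports reasonably well.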

Once you want to compare more than two values, it is time to use a different form of data visualization. Let’s look at an example from my hypothetical company, Doug’s Desserts, where I’ll investigate the share of online subscription revenue generated by the top 10 U.S. states. Here is what the data looks like using a pie chart.

Alternative number one: the bar chart

The bar chart is a great alternative to a pie chart. Comparing the lengths of bars is much easier than comparing the areas of pie slices, so the relative magnitudes of the values are easier to judge, particularly when the bars are sorted by value. A bar chart also does not need a different color for every category, which I find distracting to the eye. Below you’ll see a bar chart that visualizes the same Doug’s Desserts data.
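Here is one way a sorted, single-color bar chart like this could be put together with matplotlib. It is a sketch rather than the code behind my chart: the state revenue figures are placeholders I made up, and the function names are hypothetical.

```python
# Placeholder revenue figures (thousands of dollars) for ten states,
# invented for illustration only.
REVENUE_BY_STATE = {
    "CA": 120, "TX": 95, "NY": 90, "FL": 70, "IL": 55,
    "PA": 50, "OH": 40, "WA": 35, "MA": 25, "GA": 20,
}


def revenue_shares(revenue):
    """Return (state, share) pairs sorted from largest to smallest share."""
    total = sum(revenue.values())
    shares = {state: value / total for state, value in revenue.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


def plot_sorted_bars(revenue):
    """Draw a bar chart of revenue share, sorted descending, in one color."""
    import matplotlib.pyplot as plt  # imported here so the data prep runs without it

    states, shares = zip(*revenue_shares(revenue))
    fig, ax = plt.subplots()
    # One color for every bar: position and length carry the information.
    ax.bar(list(states), [share * 100 for share in shares], color="steelblue")
    ax.set_xlabel("State")
    ax.set_ylabel("Share of subscription revenue (%)")
    ax.set_title("Online subscription revenue share by state")
    return fig
```

Sorting before plotting is what does most of the work here: once the bars run from largest to smallest, the ranking is visible at a glance and small differences between neighboring states are easy to spot.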

Alternative number two: the Cleveland dot plot

While the bar chart above is easier to interpret than the pie chart, it still puts a lot of “ink” on the page, which can easily distract the eye. Another alternative, and my preferred option, is the Cleveland dot plot. This plot replaces each bar with a single point and flips the axes, so that you read the discrete variable (here, the states) from top to bottom, not left to right.
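A Cleveland dot plot can be sketched in matplotlib with a scatter plot on a categorical vertical axis. As before, the revenue figures below are made-up placeholders and the function names are hypothetical, so treat this as an illustration of the idea rather than the exact chart shown here.

```python
# Placeholder revenue figures (thousands of dollars) for ten states,
# invented for illustration only.
REVENUE_BY_STATE = {
    "CA": 120, "TX": 95, "NY": 90, "FL": 70, "IL": 55,
    "PA": 50, "OH": 40, "WA": 35, "MA": 25, "GA": 20,
}


def dot_plot_rows(revenue):
    """Return (state, percent) rows sorted ascending by percent.

    Categories are drawn bottom-to-top in the order given, so an
    ascending sort places the largest state at the top of the plot.
    """
    total = sum(revenue.values())
    rows = [(state, 100 * value / total) for state, value in revenue.items()]
    return sorted(rows, key=lambda row: row[1])


def plot_cleveland_dots(revenue):
    """Draw one dot per state, with states on the vertical axis."""
    import matplotlib.pyplot as plt  # imported here so the data prep runs without it

    states, percents = zip(*dot_plot_rows(revenue))
    fig, ax = plt.subplots()
    ax.scatter(list(percents), list(states), color="steelblue", zorder=3)
    # Faint dotted guides connect each dot to its label with very little ink.
    ax.grid(axis="y", linestyle=":", zorder=0)
    ax.set_xlim(left=0)  # start at zero so horizontal positions compare fairly
    ax.set_xlabel("Share of subscription revenue (%)")
    ax.set_title("Online subscription revenue share by state")
    return fig
```

Putting the states on the vertical axis also keeps longer category labels horizontal and readable, which helps if you swap the two-letter codes for full state names.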

I hope you’ve enjoyed this week on data visualization. I’d love to hear about what plotting techniques you find the most useful, so please send me an e-mail. Next week, I’ll talk about the tools you can use to calculate the simple statistics I talked about last week and the charts I’ve shown you this week.

Questions? Send any questions on data analytics or pricing strategy to doug@outlier.ai and I’ll answer them in future issues!