Improve the use of pie charts


Pie charts are one of the most commonly used data visualizations to explore the composition of data. They are easy to understand but they do have some shortcomings. Data visualization expert Matt Francis explains why care has to be taken when using pie charts and he offers some important guidelines to ensure you use them correctly.

- [Instructor] Of all the ways to visualize the composition of a data set, the pie chart is probably the most commonly used. But it's also one of the most abused visualization types. When used correctly, it can be effective, but far too often it gets misused. So let's look at some examples of both good and bad pie charts and some of the things you should avoid if you are going to use them. So a pie chart works by showing the overall composition of a data set by encoding the values in the angles. Let's build an example and see how they work.

So say in our data set we're interested in looking at the overall sales subdivided by our regions. So first we'll add the region and then we're going to add the sales onto the angle. So there we have a very simple pie chart. What we've done is taken all of the data, subdivided it by the region, and then encoded the amount of sales in the angle of the pie chart. Now this shows one of the problems with using a pie chart. Which segment is the biggest? Pie charts work on area and angle, and it's very difficult to compare these accurately.
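The angle encoding described here comes down to simple arithmetic. The following is a minimal Python sketch, not the tool used in the video. The function name `pie_angles` and the sales figures are invented for illustration, chosen only so that the slices resemble the ones described.

```python
def pie_angles(values):
    """Map each category's value to its slice angle in degrees."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total must be positive")
    # Each slice gets the same fraction of 360 degrees as its share of the total.
    return {name: 360.0 * value / total for name, value in values.items()}


# Hypothetical regional sales, invented for illustration only.
sales = {"West": 300, "East": 280, "Central": 210, "South": 210}
angles = pie_angles(sales)

for region, angle in angles.items():
    print(f"{region:<8}{angle:6.1f} degrees")
```

With these invented figures, West and East differ by only about 7 degrees out of 360, which suggests why the eye struggles to pick the larger slice.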

Pie charts are good for an overall impression of how the data is distributed. But in this example it's very hard to tell which is the largest, which is the smallest. However, what we can say is that there are four segments to this data. The west and east look pretty even, and the south and central also look pretty even. That's about all we can really tell from this data. One thing that pie charts are quite good for, however, is when you're combining regions together. So what we could say is that the combination of the west and the south makes up roughly half of the data.

With other visualizations it's hard to make that same kind of association. But the pie chart does let us join those segments together. Now pie charts only work well with a small number of segments. Three to four is probably about the maximum you want to deal with. Any more than that and the pie chart starts getting messy. Let's look at a bad example. This is looking at all of the sales for each one of our states. So the angle is the amount of sales and they're colored by state. The problem here is we've got 49 states being displayed.

But we can only show 20 distinct colors. So the colors have started repeating. So if we look on here we can see we have three dark blues. But which state does that refer to? Even though we've got the legend, we don't know. Also, which is the third biggest state? We can see that that peach color one is the biggest, followed by probably the red one. But then is it that blue? Or is it the green at the top? It's really hard to see any kind of ranking in this kind of visualization. At the moment, this is sorted alphabetically. And we could change that to be sorted by the amount of sales.
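The repeated colors follow directly from the counts mentioned: 49 categories and 20 colors. Here is a small Python sketch of that arithmetic, assuming the palette is simply reused in order once it runs out (the video does not say how the tool assigns colors, so that cycling rule and the name `color_usage` are assumptions).

```python
from collections import Counter


def color_usage(n_categories, palette_size):
    """Count how many categories land on each palette slot when colors cycle."""
    # Assumption: category i is given color number i modulo the palette size.
    return Counter(i % palette_size for i in range(n_categories))


usage = color_usage(49, 20)
max_shared = max(usage.values())
slots_shared_by_three = sum(1 for count in usage.values() if count == 3)

print(f"Most states sharing one color: {max_shared}")
print(f"Colors shared by three states: {slots_shared_by_three}")
```

Under that assumption, some colors are shared by three states, which is consistent with the three dark blues visible in the chart, and no legend can tell those states apart.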

So in this case, the smallest to the largest. So now we do have some kind of order. However, what we can't tell is how much bigger is the peach state, which is California, compared to, say, Florida. Because we're trying to compare angles against each other, and area, that's a really difficult thing to do. The better way to visualize this data would have been in a simple bar chart. And then we can see California and Florida. And we can compare the actual values between the two.
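To illustrate why a sorted bar chart makes this comparison easier, here is a minimal Python sketch that ranks categories and draws each value as a bar of proportional length in plain text. The function name `text_bar_chart` and all sales figures are hypothetical and are not the values shown in the video.

```python
def text_bar_chart(values, width=40):
    """Return one text line per category, largest first, with bar length proportional to value."""
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    largest = ranked[0][1]
    if largest <= 0:
        raise ValueError("the largest value must be positive")
    lines = []
    for name, value in ranked:
        bar = "#" * round(width * value / largest)
        lines.append(f"{name:<12}{bar} {value}")
    return lines


# Hypothetical state sales, invented for illustration only.
state_sales = {"Florida": 90, "California": 450, "Texas": 170}
chart = text_bar_chart(state_sales)
print("\n".join(chart))
```

Because every bar starts from the same baseline, length alone carries the comparison: in this made-up data the California bar is five times as long as the Florida bar, a ratio that would be hard to judge from two slice angles.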

We can see much more detail in here than we could have done in the pie chart. Another example where pie charts get abused is when we're looking for change over time. This example here, we're looking at the regional sales for four years. Now the question is, are sales increasing or decreasing for each one of our regions? Well, it's really hard to tell. At first glance it looks like the west has been pretty static for three years, and then in 2015 it increased. Similarly, for the east, it started off just over a quarter or so and then increased, and then it looks like it decreased again.

But it's very hard to see the ebb and flow of the data across time. A better way of visualizing time is using a line chart. If we convert these pies to lines, we see something like this. We can see that east has been steadily increasing, whereas the west in fact dropped and then increased. Let's compare these side by side to really see the difference. Again, looking at the west in the pie chart, those segments for 2013 and 2012 look almost identical. But the line chart shows a definite drop between those two values.
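A related arithmetic point, offered here as an illustration and not something stated in the lesson: a pie slice shows a region's share of that year's total, so a slice can stay the same size while the underlying value falls. The Python sketch below uses invented figures and a hypothetical helper, `share_of_total`, to show how that can happen.

```python
def share_of_total(values):
    """Return each category's fraction of the total, which is what a pie slice encodes."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total must be positive")
    return {name: value / total for name, value in values.items()}


# Hypothetical regional sales for two years, invented for illustration only.
sales_by_year = {
    2012: {"West": 150, "East": 130, "Central": 110, "South": 110},
    2013: {"West": 120, "East": 104, "Central": 88, "South": 88},
}

west_share_2012 = share_of_total(sales_by_year[2012])["West"]
west_share_2013 = share_of_total(sales_by_year[2013])["West"]
west_change = (sales_by_year[2013]["West"] - sales_by_year[2012]["West"]) / sales_by_year[2012]["West"]

print(f"West share: {west_share_2012:.0%} in 2012, {west_share_2013:.0%} in 2013")
print(f"West sales change: {west_change:.0%}")
```

In this made-up data the West slice is the same size in both years even though West sales fell by a fifth. A line chart plots the values themselves, so the drop is visible.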

This is information and detail we wouldn't have got from the pie chart simply because we cannot accurately compare angle and area together. So a pie chart isn't a particularly good choice when we're looking at detail. But when we're getting a general feeling for data then it can work quite well. Another example of a pie chart is placement on a map. Here we're looking at the sales per state subdivided by all of our categories. Again, we can't see much detail, but we can see a general summary of the distribution. And there's some interesting things that crop up.

For example, in Wyoming, we only sell furniture. Whereas in North Dakota we only sell office supplies. In Montana, technology seems to be the biggest sector. Whereas in Kansas, it's office supplies. We're not comparing one state against the other, so the pie chart works well because we're treating each state as its own independent entity and looking at the composition of data within each one, according to what the pie chart is showing us. Pie charts are not a bad data visualization, but they are often misused and abused.

If you keep the number of segments to a minimum and don't compare one pie chart against the other, they can be a very effective and simple data visualization that everybody understands how to read.


Author

Released

11/22/2016

Data Visualization Tips and Tricks is a series of standalone lessons on how to do data viz the right way, every time. Presented by award-winning data visualization expert (and Tableau-designated "Zen Master") Matt Francis, this software-agnostic course is designed for experienced data scientists and analytics specialists and serves as a must-have bank of knowledge and best practices. Learn how to choose the right visualization for your data, and answer the 5 key questions you should ask yourself at the beginning of every project. Topics include understanding the relationships between data sets, making comparisons, charting relationships, visualizing data distributions, creating maps, and—most valuably—knowing when to use which types of graphs and charts. Matt also teaches you how to understand what others are doing with their own visualizations, ask informed questions, and look with a critical eye at the work of others.