Visualizing Data

When one of my engineering colleagues (engineers make up 95% of my immediate surroundings) emailed me a link this past week, I thought for sure it was for a YouTube video. This link was different, though: it introduced me to Anscombe’s quartet, a set of four datasets that share nearly identical summary statistics. Each of these datasets has the same mean and variance of the x variable, nearly the same mean and variance of the y variable, and nearly the same correlation between the x and y variables. A little geeky, eh?
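For the curious, here is a minimal sketch that checks those numbers using only the Python standard library. The data values are the ones published for Anscombe’s quartet; the helper names `pearson` and `summarize` are my own and purely illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet. Datasets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Summary statistics for one dataset (sample variance, n - 1 denominator)."""
    return {
        "mean_x": mean(xs),
        "var_x": variance(xs),
        "mean_y": mean(ys),
        "var_y": variance(ys),
        "corr": pearson(xs, ys),
    }


summaries = {name: summarize(xs, ys) for name, (xs, ys) in quartet.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

The four printed rows should look almost interchangeable, which is exactly why the summary numbers alone tell you so little about the shape of each dataset.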

What’s my point, you might ask? When each one of these datasets is plotted out visually, they have completely different appearances (just click on the link above and you’ll see what I mean). There are outliers where one would not expect to see them, and depending on what you are analyzing, outliers like these can point to both opportunities and risks in your data. However, one would never see these differences in the data patterns if the data were not plotted in a chart or graph (or analyzed data point by data point).

Looking at your data is just as important as reading your data. Not all of us can see the obvious by just looking at numbers, even if we don’t consider ourselves “visual” people. I’ve learned this firsthand in my current job when presenting data findings to a mix of finance and product teams. Some people are “table” folks and others are “chart” folks. Regardless, the combination of the two data presentation methods jogs the brain and forces you to see the data in new ways and patterns. Here are three reasons why you need to visualize your data:

Visualizing data allows you to segment elements within your result set. For example, when analyzing campaign data, the average clickthrough rate or cost per acquisition might meet your campaign goals. However, deeper segmentation via visualization can show you where further optimization opportunities lie: where you can eliminate waste and where you can exploit upside outliers.
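To make that concrete, here is a small sketch with made-up numbers. The segment names, impressions and clicks are all hypothetical, chosen only to show how an acceptable overall clickthrough rate can hide very different segments; the text bars stand in for a real chart.

```python
# Hypothetical campaign segments: name -> (impressions, clicks).
# These numbers are invented purely for illustration.
segments = {
    "search": (10_000, 400),
    "display": (30_000, 150),
    "email": (10_000, 450),
}

total_impressions = sum(imp for imp, _ in segments.values())
total_clicks = sum(clk for _, clk in segments.values())
overall_ctr = total_clicks / total_impressions

# Clickthrough rate per segment.
segment_ctr = {name: clk / imp for name, (imp, clk) in segments.items()}

print(f"{'overall':<8} {overall_ctr:6.2%}")
for name, ctr in sorted(segment_ctr.items(), key=lambda kv: kv[1], reverse=True):
    bar = "#" * round(ctr * 1000)  # one '#' per 0.1 percentage point of CTR
    print(f"{name:<8} {ctr:6.2%} {bar}")
```

In this invented example the blended rate looks fine on its own, while the bars make it obvious that one segment drags the average down and two others carry it.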

Visualizing data lets you see relationships and correlations within the data. Scatterplots and grids can help sift out opportunities where you might have huge upside if you can optimize your key trigger points (whether they are behavior metrics like CTR or margin management).

Trend analysis over time is more obvious when it is visualized. Trends in data can shift dramatically or slowly over time. Visualizing trends can allow you to see the slope of a metric or group of metrics to identify seasonal and market trends that may not pop out at you when you are just staring at a table of numbers.

Segmentation, correlation and trending are just three of the reasons for data visualization, but there are many more. As analysts we sometimes do not see the forest for the trees. Visualizing data forces you to stop and look at the results and ponder the bigger picture. Perhaps a picture is worth a thousand data points in this case.