Data Visualization

This is part 1 of a five-part series on Data Visualization.

Last week we talked about simple statistical measurements that help you understand your data better. The resulting metrics are a great way to summarize large data sets into a few numbers. Data visualization is complementary to data analysis – it allows you to condense large data sets into a single picture (or set of pictures).

The most important thing to keep in mind is that a visualization should be the clearest possible representation of the data, built with plotting techniques that make values easy to compare. This week I will cover the following concepts¹:

Understanding the basics of your data

Plotting regression coefficients

Comparing multiple measures on the same plot

Representing proportions accurately

To provide context for the examples, I’ll continue to refer back to my hypothetical company, Doug’s Desserts. It sells baked goods in a storefront and online, and it also runs an online subscription service that provides customers with recipes and tips.

Since I won’t be able to cover every detail of visualization this week, I encourage you to check out the work of experts like William Cleveland and Edward Tufte. They have conducted experiments on how humans perceive visualizations and offer excellent critiques of content delivery.

¹ The plots you see this week will be created using R, my favorite statistical analysis tool. I’ll cover R, and other alternatives, in more depth next week during my series on analysis tools.