Thoughts on data, statistics, computer science, and those other things people call life.

Menu

Teaching Analytics to NYC Employees: Excel as a Tool for Summarizing Data

This blog post is part of a series of posts I’m doing outlining the experiences I’ve had in a year of teaching analytics classes to NYC employees. For more information on the courses I teach, see the Github repository of course outlines and other important information as well as the other posts in this series.

I find Excel intimidating.

I’m afraid of data.

It’s been a long time since I was in grad school and I’ve forgotten how to use Excel.

I learned Excel on my own and I feel like there are a lot of things I don’t know.

These are all things I hear frequently in my NYC data analytics classes. Almost everyone is self-taught in how to use Excel, having learned enough to get by in their jobs. In some cases, they were given spreadsheets created by their predecessors and they learned enough to make sense of the formulas or charts that previously existed. In other cases, they’re the only person in their office who know much about Excel and are often in the role of the resident expert even if they feel their own skills are fairly weak.

My Excel Tools: Summarizing Data class is intended to help address this issue by giving a solid foundation for beginner, intermediate, and advanced users alike. No one wants to be manually splitting fields, retyping values, or constantly filtering and sorting to count or replace values. I cover the basics necessary to have a firm grasp on using Excel to clean, analyze, and visualize data. I emphasize the use of formulas to manipulate text and calculate numbers, showing students the best practices I’ve learned (usually through my own trial and error) to produce reliable and repeatable analysis. A long the way, I’m able to address long standing questions, share some best practices, and even pick up a few tricks from the students.

Ultimately, my goal isn’t to make them Excel ninjas, it’s to build a foundation of understanding they can use to continue their learning outside the classroom. This should be the goal of any learning experience, but in this case, it’s about helping them understand how Excel works with discussions about the menu ribbon, formulas, functions, using the Microsoft Office help, data types, pivot tables, and other foundational concepts that they can use. I teach them how to do common tasks, like extracting values from strings, doing calculations, splitting and combing texts, and the mother of all tasks, joining data from multiple spreadsheets.

Except for the first exercise, a sample inventory sheet to review formulas, I use real-world data. We look at the population data for the NYC Community Districts since 1970, charting the fall of NYC’s population during these times of social and economic trouble for the city. There are usually several people in the class who lived through those times and I encourage them to share their personal experiences to give context to the data. Our analysis enriches their personal stories with greater context, as data should. It also shows them things they may not have realized, like how the loss of population in the Bronx between 1970 and 1980 (roughly 20% of the population) wasn’t evenly spread around the whole borough, but concentrated in particular places, like the South Bronx. Some community districts saw no drop in population and some saw increases, like Community District 10, which went from a population of 84,948 in 1970 to 106,516 in 1980.

Being able to tell this more nuanced story, one that both reinforces our assumptions and displays the nuance we don’t know is a key goal for this exercise. I hope to show the power of good analysis, one that can both confirm and challenge our beliefs and intuitions, all the while learning practical skills for using Excel to manipulate and visualize data.

We then examine 311 service requests, specifically noise complaints. We manipulate the data with formulas and analyze the results before finally joining the data with the population data to normalize the number of complaints by the population of a community district. They leave class not only understanding the steps to accomplishing a task but also the intuition behind why they are doing it. My hope is they are able to better synthesize the information I share with them into their own work, automating where it makes sense, and ultimately becoming more data literate moving forward.

Excel is a powerful tool for working with data and with its relatively intuitive interface, democratizes analytics for the masses. Like any tool, however, there is a necessary process of learning the functions and features that make it useful. There are also common mistakes that can lead to serious errors. I warn them of these and help talk through common methods of error checking and validation.

By the end of the class, I hope they leave empowered to see Excel as their friend, confident they can use it more effectively in their work. The excitement I see when they learn something that will make their jobs easier makes the effort to put together and present these classes worthwhile. The positive feedback I’ve gotten in the course evaluations helps me know I’m on the right track, helping NYC employees better use Excel 15-20 students at a time.