A Data Scientist’s Real Job: Storytelling

Every morning at DoSomething.org, our computers greet us with a report containing over 350 million data points tracking our organization’s performance. Our challenge as data scientists is to translate this haystack of information into guidance for staff so they can make smart decisions — whether it’s choosing the right headline for today’s email blast (should we ask our members to “take action now” or “learn more”?) or determining the purpose of our summer volunteer campaign (food donation drive or recycling campaign?).

In short, we’re tasked with transforming data into directives. Good analysis parses numerical outputs into an understanding of the organization. We “humanize” the data by turning raw numbers into a story about our performance.

When many people hear “Big Data,” they think “Big Brother.” (Type “big data is…” into Google and one of the top recommendations is “…watching you.”) Central to this anxiety is a feeling that what it means to be human can’t be tracked or quantified by computers. This fear is well-founded. As the cost of collecting and storing data continues to decrease, the volume of raw data an organization has available can be overwhelming. Of all the data in existence, 90% was created in the last 2 years. Inundated organizations can lose sight of the difference between what’s statistically significant and what’s important for decision-making.

Using Big Data successfully requires human translation and context, whether it’s for your staff or the people your organization is trying to reach. Without a human frame, like photos or words that make emotion salient, data will only confuse, and certainly won’t lead to smart organizational behavior.

Data gives you the what, but humans know the why.

The best business decisions come from intuitions and insights informed by data. Using data in this way allows your organization to build institutional knowledge and creativity on top of a solid foundation of data-driven insights.

For DoSomething.org, mapping our communications data gives us an amazing window onto our audience. We have over 1.5 million users, and for each one we have hundreds of data points on what they respond to, and how, when we send new volunteer opportunities via email and text. Here’s how we go from 350 million data points to organizational change, and how organizations grappling with similarly huge amounts of information can do the same:

Look only for data that affect your organization’s key metrics. At DoSomething.org, our goal is increasing teens’ engagement in volunteering. So when we did a deep dive on our data last fall to determine how to increase that metric, we started with simple questions: Who currently volunteers the most, and how can we find more people like them? We were able to ignore larger volumes of data that didn’t answer our questions and home in on what really mattered.

Present data so that everyone can grasp the insights. Hint: never show a regression analysis or a plot from R. In fact, our final presentation had very few numbers. We focused on telling a clear story with simple slides and visuals. While we used regression analysis to find a list of significant variables, we visualized data to find trends: even data analysts are much better at discovering geographic (and underlying demographic) trends on maps than in regression tables, especially when there are multiple underlying patterns with ambiguous relationships.

Because we presented the data visually, the entire staff was able to quickly grasp the findings and contribute to the conversation. Everyone was able to see areas of high and low engagement. That led to a big insight: Someone outside the analytics team noticed that members in Texas border towns were much more engaged than members in Northwest coastal cities.
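
For readers who want to try this step, here is a minimal sketch of the kind of aggregation that sits behind an engagement map: average activity per member, grouped by region and ranked. It assumes a hypothetical list of (region, campaigns completed) records. The region names, values, and function names are invented for illustration and are not DoSomething.org data or code.

```python
from collections import defaultdict

# Hypothetical member records: (region, number of campaigns completed).
# Illustrative values only.
members = [
    ("Region A", 4),
    ("Region A", 6),
    ("Region B", 1),
    ("Region B", 0),
    ("Region C", 3),
]


def engagement_by_region(records):
    """Return {region: average campaigns completed per member}."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for region, completed in records:
        totals[region] += completed
        counts[region] += 1
    return {region: totals[region] / counts[region] for region in totals}


def rank_regions(records):
    """Sort regions from most to least engaged, ready to shade on a map."""
    averages = engagement_by_region(records)
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranking = rank_regions(members)
for region, average in ranking:
    print(f"{region}: {average:.1f} campaigns per member")
```

A ranked list like this is only the input. The point of the step is to put the result on a map so that people outside the analytics team can spot the pattern themselves.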

Return to the data with new questions. Once we learned who our most engaged members were, we returned to the data to see what campaigns those members liked best; in other words, what led those members to get involved. The answer turned out to be campaigns around improving community health, an issue that disproportionately impacts minorities. This information allowed us to better tailor our volunteer campaigns going forward to engage new members, reach out to the right partners for those campaigns, and highlight another potential area for growth: white, male college students in the Northwest.
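
The follow-up question, which campaigns the most engaged members prefer, can be sketched the same way. The example below assumes a hypothetical sign-up log of (member ID, campaign category) pairs and an arbitrary threshold of three sign-ups for "most engaged"; the data, the threshold, and the function names are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical sign-up log: (member_id, campaign_category).
# Illustrative values only.
signups = [
    ("m1", "community health"),
    ("m1", "community health"),
    ("m1", "recycling"),
    ("m2", "community health"),
    ("m2", "food drive"),
    ("m2", "community health"),
    ("m3", "recycling"),
]


def most_engaged(log, min_signups=3):
    """Return the set of members with at least min_signups sign-ups."""
    counts = Counter(member for member, _ in log)
    return {member for member, n in counts.items() if n >= min_signups}


def favorite_categories(log, min_signups=3):
    """Count campaign categories among the most engaged members only."""
    engaged = most_engaged(log, min_signups)
    categories = Counter(cat for member, cat in log if member in engaged)
    return categories.most_common()


favorites = favorite_categories(signups)
for category, count in favorites:
    print(f"{category}: {count} sign-ups")
```

The count says which category leads among engaged members. Explaining why it leads, and what to do about it, is still a human judgment.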

Data scientists want to believe that data has all the answers. But the most important part of our job is qualitative: asking questions, creating directives from our data, and telling its story.
