From Anecdotes to Evidence: A Preview Of A New Data Journalism MOOC

Doing Journalism with Data: First Steps, Skills and Tools, which starts May 19, is an example of a growing trend of MOOCs offered by organizations outside the usual university structure, in this case under the auspices of the European Journalism Centre and featuring an international team of instructors from across the commercial and non-profit publishing spectrum. Charlie Chung from Class Central recently spoke with one of the instructors on the team to get a preview and generously developed this article for MOOC News and Reviews. -Editor

The amount of data being generated in our modern society is exploding, and it’s being collected and analyzed by all kinds of organizations, from governments (e.g. the U.S. NSA) to corporations (e.g. loyalty reward cards). There have also been extreme reactions against some of this (e.g. in the form of data leaks by Edward Snowden, Julian Assange et al.), but this does not provide an adequate counter-balance.

Thankfully, there is another group in our society that is getting more sophisticated with data: journalists. As long as they can get access to information, they can help bring important issues to public attention. Investigative reporting, of course, has been around for a long time and has often used data to support its conclusions. But the difference now is that the datasets are huge, and computing power allows individual reporters to analyze these data without the aid of statisticians or social scientists. This exciting trend is referred to as data-driven journalism.

An example of data-driven journalism is a story that broke in 2011 about a large private hospital chain in California engaged in Medicare billing fraud. But it wasn’t CMS (the Centers for Medicare & Medicaid Services) that figured this out. Instead, reporters from California Watch, a nonprofit investigative news operation, dug up the story and analyzed public anonymized Medicare diagnosis data. The story included first-hand accounts from employees, but it was the data analysis that clinched it, and the story eventually led to federal investigations.

One of the reporters for that story was Pulitzer Prize-winning journalist Stephen Doig, the Knight Chair in Journalism at Arizona State University’s Walter Cronkite School of Journalism and Mass Communication and one of the instructors of the upcoming MOOC, Doing Journalism with Data: First Steps, Skills and Tools. He was called in specifically to analyze the Medicare billing data.

Doig describes data-driven journalism as “going beyond anecdotes to provide evidence.” This is a powerful advance in the field; however, we should not let the presence of gigabytes of data cause us to lose perspective. “The most important word in ‘data-driven journalism’ is ‘journalism’,” Doig reminds us. The right questions still need to be asked, insights need to be developed and promising leads followed up on. But to the extent journalists increase their data sophistication, it opens the door to more stories and deeper insights. And solid in-depth data in a story “adds an incredible amount of power to it and enhances the credibility of what we do.”

In our age of social media, the practice of data-driven journalism also covers how data results are presented in a story, hence the growing popularity of infographics. (If a fact is in a pretty infographic, it must be right!) But data-driven journalists are also taking this to the next level by creating interactive infographics and allowing readers to actually sort the data in ways that are meaningful to them. A good example of this is the Los Angeles Times Crime Map, where you can navigate a map and see crime reports by neighborhood, updated daily. Creating a resource like this requires a tremendous amount of work, but the value to the public can be very high.

But are there dangers in data-driven journalism?

As journalists use more sophisticated analyses to support their points, their approaches are starting to resemble robust social science methods. But Doig notes that there are important differences between the two: social science is trying to advance a field, build generalizable findings and support theory, whereas data-driven journalism is looking to report news in a timely manner.

So is there a danger that journalists inexperienced in data analysis may come to mistaken conclusions or publish incorrect factoids? Doig comments:

“I’m just as concerned [with faulty data analysis] as with other things. There is nothing special about data that makes the misuse of it any worse than other kinds of things. A journalist who can’t take good notes in an interview is just as dangerous as a journalist who can’t do the math to do good data journalism . . . Good journalists are very conscious of data quality problems. We know we have to write our name on that story, and we know that writing corrections is not going to be a career enhancement.”

New, more powerful tools are always a double-edged sword: they offer more capability to do good things but also increase the potential to do bad things. Even esteemed social scientists aren’t immune from errors, as demonstrated by the recent example of “Excelgate,” where two Harvard economics professors made a simple formula error in their spreadsheet that had a major effect on their oft-quoted conclusion about the relationship between debt and economic growth. Doig’s module in the upcoming MOOC will cover analyzing data in Microsoft Excel, where students will work with hypothetical crime data.

A MOOC for journalists, whether professional or citizen

The upcoming data journalism MOOC is sponsored by the European Journalism Centre and consists of five modules taught by different data journalism instructors. Along with the module mentioned above, the topics will cover the full range from concepts to analysis to graphical presentation, and will be taught through a combination of video lectures and assignments.

The intended audience for the MOOC is broad: professional journalists, citizen journalists and members of the public interested in data journalism. The content is relevant to a wide audience because, as Doig states, “What journalists do is gather information, organize it and present it clearly. And those are skills that a number of other professions (lawyers, and business) also need.”

On the subject of citizen journalists, Doig notes their increasing numbers and the important role they play in the reporting ecosphere. They too should be leveraging data-driven journalism:

“The traditional idea of a journalist is someone who is working for a traditional news organization, a newspaper or a broadcast news station. But there’s a growing world of essentially citizen-journalists — people who are doing blogs about particular subjects and things like that, and there’s no reason why they shouldn’t be using these kinds of tools to do their own research for that.”

It would be great to see a large number of data-savvy citizen-journalists providing crowdsourced insights that extend and supplement the work of traditional news organizations and professional journalists. Assuming relevant public data remain accessible, this could be the best counterweight we have to government data-mining activities.

My background is in software and management consulting, throughout which I've developed various internal and external training courses. My interests are in cognitive psychology, pedagogy, and lifelong learning. Currently, I'm Chief Course Curator at Class Central, a comprehensive MOOC directory provider, and an advisor to several edtech ventures.