The ethics of automation and feeds

Since Adrian Holovaty built ChicagoCrime.org in 2005 to automatically update a map with police crime statistics, automation has been an important element of data journalism. Few news organisations have guidelines on automation, but the BBC’s guidelines (2013) on video feeds do provide a framework.

The guidelines state that a senior editorial figure must approve the feed first, and that they “should aim to maintain editorial control”. The stated objective is to protect “editorial independence and reduce the risk of intrusive, harmful, offensive or unduly promotional images appearing on our site”. In line with that, the guidance recommends monitoring the output of the feed.

“The level of monitoring should be appropriate for the content of the camera. A producer should normally be in a position to cut the feed from a live webcam if it becomes necessary.”

Ethics in a world without gatekeepers: the new information environment

Perhaps the final ethical dilemma facing journalists in dealing with data is its sheer volume and public availability. In this new context, where journalists no longer act as gatekeepers, there may be a new ethical claim to devote more time to fact-checking and to improving public literacy around data.

Predictions, for example, can be retrospectively tested, as the New York Times did with Budget Forecasts, Compared With Reality. Editorial time can be devoted to unpicking statistical spin, opening up datasets to public scrutiny, and publicising key changes affecting context, for example decisions to re-classify terms such as ‘poverty’, or to change an authority’s boundaries.
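As a rough illustration of what retrospectively testing predictions can involve, the sketch below compares forecast values with what later happened. It is not the New York Times' method: the `Forecast` class, the `forecast_errors` function and the figures are all invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Forecast:
    """One prediction and its outcome (hypothetical structure)."""
    year: int
    predicted: float  # value forecast in advance
    actual: float     # value later observed


def forecast_errors(
    forecasts: List[Forecast],
) -> List[Tuple[int, float, Optional[float]]]:
    """Return (year, error, percent_error) for each forecast.

    error = predicted - actual, so a positive number means the
    forecast was too high. percent_error is None when the actual
    value is zero, because the ratio is undefined.
    """
    results = []
    for f in forecasts:
        error = f.predicted - f.actual
        pct = error / abs(f.actual) * 100 if f.actual != 0 else None
        results.append((f.year, error, pct))
    return results


# Invented figures, for illustration only.
sample = [
    Forecast(year=2010, predicted=120.0, actual=100.0),
    Forecast(year=2011, predicted=90.0, actual=100.0),
]

for year, error, pct in forecast_errors(sample):
    print(year, error, pct)
```

Reporting both the absolute and the percentage error helps readers judge whether a miss was large relative to the quantity being forecast.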

Data itself is increasingly the ‘power’ to be held to account, with journalists investigating its flaws, uses and abuses, and giving a voice to the data which is ‘voiceless’. In fashioning guidelines for your own data journalism practice, then, consider the following checklist:

How do we ensure that reporting on data is accurate? What processes should be routine in seeking clarification?

How do we put data into context? Is data always reported in relative terms (e.g. per person), and alongside historical trends? Do we check how the data was gathered?

What are the considerations to be made when publishing data in full, or when automating publication of data?

What are the considerations to be made when obtaining data?

In collaborative projects, do we ensure that all parties are clear on shared ethics, values and roles?

And finally: how do we ensure we choose the most important data – which may require more work to obtain – rather than simply the most available?
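To make the question about relative terms and historical trends concrete, here is a minimal sketch of the two calculations involved: a rate per 100,000 people and a percentage change between two periods. The function names and all figures are hypothetical, chosen only to show how raw counts and rates can tell different stories.

```python
def rate_per_100k(count: float, population: float) -> float:
    """Convert a raw count into a rate per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


def percent_change(old: float, new: float) -> float:
    """Percentage change from an earlier value to a later one."""
    if old == 0:
        raise ValueError("cannot compute change from a zero baseline")
    return (new - old) / old * 100


# Invented figures: area A has more incidents in absolute terms,
# but area B has the higher rate once population is considered.
area_a = rate_per_100k(count=500, population=1_000_000)
area_b = rate_per_100k(count=200, population=100_000)
print(area_a, area_b)

# Historical context: change in a rate between two periods.
print(percent_change(old=40.0, new=50.0))
```

With these made-up numbers, a headline based on raw counts would single out area A, while the per-person rate points to area B.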

If you have examples of ethical dilemmas, best practice, or guidance, I’d be happy to include them with an acknowledgement.