The next wave of data journalism?

In the first of three expanded extracts from a forthcoming book chapter on ‘The next wave of data journalism’ I outline some of the ways that data journalism is reinventing itself, and adapting for a world which is rapidly changing again. Where networked communications and processing power were key in the 2000s, automation and AI are becoming key in the decade to come. And just as data journalism raised the bar for journalism as a whole, the bar is about to be raised for data journalism itself.

Data journalism isn’t doing enough. It is now into its second decade, and the noughties-era technologies that it was built on – networked access to information and vastly improving visualisation capabilities – are taken for granted, just as the ‘computer assisted’ part of its antecedent, Computer Assisted Reporting, was.

In just ten years data journalism has settled down into familiar practices and genres, from the interactive map and giant infographics to the quick turnaround “Who comes bottom in the latest dataset” write-up. It’s a sure sign of maturity when press officers are sending you data journalism-based media releases.

Now we need to move forward. And the good news is: there are plenty of places to go.

Looking back to look forward

Philip Meyer’s reporting laid the foundation for CAR

In order to look forward it is often useful to look back: in any history of data journalism you will read that it came partly out of the Computer Assisted Reporting (CAR) tradition that emerged in the US in the late 1960s.

CAR saw journalists using spreadsheet and database software to analyse datasets, but it also had important political and cultural dimensions: firstly, the introduction of a Freedom of Information Act in the US which made it possible to access more data than before; and secondly, the spread of social science methods into politics and journalism, pioneered by seminal CAR figure Philip Meyer.

Data journalism, like CAR, had technological, political and cultural dimensions. Where CAR had spreadsheets and databases, data journalism had APIs and datavis tools; where CAR had Freedom of Information, data journalism had a global open data movement; and where CAR acted as a trojan horse that brought social science methods into the newsroom, data journalism has brought ‘hacker’ culture into publishing.

Much of the credit for the birth of data journalism lies outside of the news industry: often overlooked in histories of the form is the work of civic coders and information activists (in particular MySociety which was opening up political data and working with news organisations well before the term data journalism was coined), and technology companies (the APIs and tools of Yahoo! for example formed the basis of much of data journalism’s early experiments).

The early data journalists were self-created, but as news organisations formalised data journalism roles and teams, data journalism newswork has been formalised and routinised too.

So where do we look for data journalism’s next wave of change?

Look outside news organisations once again and you see change in two areas in particular: on the technical side, an increasing use of automation, from algorithms and artificial intelligence (AI) to bots and the internet of things.

On the political side, a retreat from open data and transparency while non-governmental organisations take on an increasingly state-like role in policing citizens’ behaviour.

What is data journalism for?

Datavis can be seen as “Striving to keep the significant interesting and relevant”

Data journalists will often tell you that the key part of data journalism is the journalism bit: we are not just analysing data but finding and telling important stories in it. But journalism isn’t just about stories, either. Kovach and Rosenstiel, in their excellent book The Elements of Journalism, outline 10 principles which are always important to return to:

Journalism’s first obligation is to the truth

Its first loyalty is to citizens

Its essence is a discipline of verification

Its practitioners must maintain an independence from those they cover

It must serve as an independent monitor of power

It must provide a forum for public criticism and compromise

It must strive to keep the significant interesting and relevant

It must keep the news comprehensive and proportional

Its practitioners must be allowed to exercise their personal conscience

Citizens, too, have rights and responsibilities when it comes to the news

Some of these can be related to data journalism relatively easily: journalism’s first obligation to the truth, for example, appears to be particularly well served by an ability to access and analyse data.

Striving to keep the significant interesting and relevant? Visualisation and interactivity are great examples of how data journalism has been able to do just that for even the dryest subjects.

But an attraction to those more obvious benefits of data journalism can distract us from the demands of the other principles.

Is data journalism “a discipline of verification”, or do we attribute too much credibility to data? Cleaning data, and seeking further sources that can independently confirm what the data appears to tell us, are just two processes that should be as central as being able to generate a bar chart.
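As a rough illustration of what those two processes can look like in practice, here is a minimal Python sketch. It assumes two hypothetical sources that report a figure for the same areas; the function names, the sample figures and the 5% tolerance are all invented for the example rather than taken from any real dataset or newsroom workflow.

```python
def clean_name(name: str) -> str:
    """Normalise an area name so the same place matches across sources."""
    return " ".join(name.strip().lower().split())


def cross_check(primary: dict, secondary: dict, tolerance: float = 0.05) -> list:
    """List areas where the second source is missing or disagrees.

    `tolerance` is the relative difference we are willing to accept
    (an arbitrary assumption for this sketch, not a standard).
    """
    second = {clean_name(k): v for k, v in secondary.items()}
    problems = []
    for name, value in primary.items():
        key = clean_name(name)
        if key not in second:
            problems.append((key, "missing from second source"))
            continue
        other = second[key]
        # Compare against the larger figure to avoid dividing by zero.
        baseline = max(abs(value), abs(other))
        if baseline and abs(value - other) / baseline > tolerance:
            problems.append((key, f"{value} vs {other}"))
    return problems


# Hypothetical figures from two independent sources.
official = {" Leeds ": 120, "York": 45, "Hull": 80}
independent = {"leeds": 121, "york": 60}

issues = cross_check(official, independent)
```

Anything that ends up in `issues` is a prompt for a phone call or a further records request, not a conclusion in itself.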

Some of the other principles become more interesting when you begin to look at developments that are set to impact our practice in the coming decades…

Rise of the robots

The rise of ‘robot journalism’ – the use of automated scripts to analyse data and generate hundreds of news reports that would be impossible for individual journalists to write – is one to keep a particular eye on.
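To make the idea concrete, here is a minimal sketch of that kind of script in Python. It is not how any particular news organisation does it: the team names, the results and the wording rules (such as "thrashed" for a margin of three or more goals) are assumptions made up for the example.

```python
def write_report(match: dict) -> str:
    """Turn one row of results data into a one-sentence report."""
    home, away = match["home"], match["away"]
    home_goals, away_goals = match["home_goals"], match["away_goals"]

    if home_goals == away_goals:
        return f"{home} and {away} drew {home_goals}-{away_goals}."

    winner, loser = (home, away) if home_goals > away_goals else (away, home)
    high, low = max(home_goals, away_goals), min(home_goals, away_goals)
    # The editorial judgement lives in rules like this one (hypothetical).
    verb = "thrashed" if high - low >= 3 else "beat"
    return f"{winner} {verb} {loser} {high}-{low}."


# Hypothetical amateur league results; in practice this would be
# hundreds of rows loaded from a spreadsheet or an API.
matches = [
    {"home": "Northfield", "away": "Easton", "home_goals": 4, "away_goals": 1},
    {"home": "Westbury", "away": "Southgate", "home_goals": 0, "away_goals": 1},
    {"home": "Kingsmead", "away": "Harlow", "home_goals": 2, "away_goals": 2},
]

reports = [write_report(match) for match in matches]
```

The same function works unchanged on three rows or three thousand, which is where the scale comes from. Every choice of wording is still a decision a person wrote into the code.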

Aside from the more everyday opportunities that automation offers for reporting on amateur sports or geological events, automation also offers an opportunity to better “serve as an independent monitor of power”.

Lainna Fader, Engagement Editor at New York Magazine, for example, highlights the way that bots are useful “for making value systems apparent, revealing obfuscated information, and amplifying the visibility of marginalized topics or communities.”

By tweeting every time anonymous sources are used in the New York Times the Twitter bot @NYTanon serves as a watchdog on the watchdogs (Lokot and Diakopoulos 2015).

But is robot journalism a “discipline of verification” (another of Kovach and Rosenstiel’s principles)? Well, that all boils down to the programming: in 2015 Matt Carlson talked about the rise of new roles of “meta-writer” or “metajournalist” to “facilitate” automated stories using methods from data entry to narrative construction and volunteer management. And by the end of 2016 observers were talking about ‘augmented journalism’: the idea of using computational techniques to assist in your news reporting.

The concept of ‘augmented journalism’ is perhaps a defensive response to the initial examples of robot journalism: with journalists feeling under threat, the implied assurance is that robots would free up time for reporters to do the more interesting work.

What has remained unspoken, however, is that in order for this to happen, journalists need to be willing — and able — to shift their focus from routine, low-skilled processes to workflows involving high levels of technical skill, critical abilities — and computational thinking.

But more than a decade on from Adrian Holovaty’s seminal post “A fundamental way newspaper sites need to change”, there is very little evidence of this being seriously addressed in journalism training or newsroom design. Instead, computational thinking is being taught earlier, to teenagers and younger children at school.

Some of those may, in decades to come, get a chance to reshape the newsroom themselves. In the next part of this series, then, I look at how computational thinking is likely to play a role in the next wave of data journalism — and the need to problematise and challenge it at the same time.

10 thoughts on “The next wave of data journalism?”

re: future directions, *automation* and *robot journalists* (generating text, rather than charts, from data) are two areas I think are also likely to develop, but there’s also a need to think about workflows I think?

Ideas and workflows developed originally to support reproducible research are also attractive I think (workflows based on RStudio/Rmd/knitr, or Jupyter notebooks, for example) but there are several blockers in practice: the need to install software, the need to learn scripting (not necessarily programming….), the need to produce output formats that can be used to feed downstream parts of the publication process.

Yes, I keep meaning to write a separate post about reproducibility and those tools. It’s a shame that data journalism – in the UK anyway – seems to have gone backwards in terms of transparency, linking less often to data etc.

Good post, but I am afraid you are slightly missing the point. We are in a stage where data journalism is more about presentation of data than about their interpretation, which is good (data journalism is at its early stages) but far from enough. We know how to use data: go on Google Scholar and see what Political Scientists, Psychologists, Linguists (not to mention Economists) do with data. They have methods and skills we are still lacking, as journalists. We are increasingly good at presenting data. We are learning to use tools like Ggplot, Tableau and js. But what we still do not know is how to frame the visualization in a broader context. And that is the challenge for the next wave of data journalism, provided that there will be one anytime soon.

That’s a very good point. I would disagree slightly about it being more about presentation – if anything I think there was an obsession with datavis tools in the early years which then faded. At the moment I would say the interest in visualisation is part of a wider interest in coding – some focus that on JavaScript, and some on R or Python. But yes, I’m surprised how much of that is focused on superficial presentation rather than more rigorous data work. I suspect we’ll probably see history repeating itself, with people moving from the production to the newsgathering (this happens in journalism generally – students tend to be more interested in writing first and only later in getting more original stories).