JEEcamp thoughts on data journalism

I spent some time talking to Martin Belam (@currybet) about data journalism and the importance (or otherwise) of journalists learning to code.

He said, as he’s said before, that it’s more important for journalists to know whether something is or isn’t possible than for us to necessarily be able to do it ourselves.

And for working journalists whose day-to-day job doesn’t already carry a coding requirement, and particularly for those lucky enough to be in a workplace where developers or programmers can take our ideas and make them flesh (i.e. not me), he’s almost certainly right.

Data skills are becoming more and more important. With the birth of data.gov.uk and the increasingly open approach to information that the new coalition government is likely to take, sifting and analysing data to find the stories is going to be a vital skill for a lot of journalists.

We need to know our way around a spreadsheet. We need to be able to spot patterns in data and understand both what they mean and how we can use them to reveal stories that are useful as well as relevant.

We need to know where our skills can get us. We need to know our capabilities and our limits – and, crucially, we must be aware of what we don’t know. That’s not just knowing that there are holes in our knowledge, but knowing the shape of those holes so that we can try to get our problems a little closer to a solution.

Journalism is about asking the right questions. We research stories before we interview subjects so that we can ask pertinent questions whose answers will illuminate the subject. We need to be able to do the same thing with our data – we need to know what questions to ask and how, so that even if we can’t make the tools ourselves we can hand over the task to someone else without asking the impossible or wasting their time.

But most of the time, certainly for journalists on regional papers and I would wager for many in other areas, those people who know how to make the tools just don’t exist. I have friends who code, but I can’t ask them for a favour every time I want to create a news app, or diff two versions of a stack of documents, or visualise a complex dataset, or tell the story of 100 people’s losses from an investment fund going bust in a way that conveys both the scale and the humanity of the problem.
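To give a sense of how small some of these tasks can be, here is a rough sketch of diffing two versions of a document in Python using the standard library's difflib module. It assumes the two versions are already available as plain text; the function name and the sample sentences are invented for illustration, and a real stack of documents would need more work to load and pair up.

```python
import difflib


def diff_versions(old_text, new_text):
    """Return unified-diff lines showing what changed between two versions."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # lineterm="" stops difflib adding newlines to the header lines,
    # since splitlines() has already stripped them from the content.
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="old", tofile="new", lineterm=""
        )
    )


# Made-up example text, standing in for two drafts of the same document.
old = "The council will spend 2m on roads.\nWork starts in May."
new = "The council will spend 3m on roads.\nWork starts in May."

for line in diff_versions(old, new):
    print(line)
```

Lines starting with a minus sign appear only in the old version and lines starting with a plus sign appear only in the new one, so a changed figure shows up as a removed line followed by an added one.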

Regional journalists work on hundreds of stories that could be made vastly easier or more beautiful or more accessible through a touch of computer work (spreadsheets, maps, things that aren’t quite coding but sort of almost are and look like it to the untrained eye). A few of us can create those additions; the rest just write the story, and our papers and websites are poorer for it.

We work on a few stories – and the number is increasing – that are perfect for news apps, web coding, multimedia packages or other more complex solutions that very, very few of us can create. But no one else will do it for us.

On top of that, many of us struggle with inflexible content management systems that penalise data-driven work or make it literally impossible to display online. Faced with that problem, some budding computer-assisted reporters give up before they’ve even started.

So I’m not going to stop learning Python. It’s not a complete solution to the problem – for that we need real, systemic change so that the businesses we work for all value data work, understand its increasing relevance, reflect on current practice and support training journalists to do an evolving job.

But for me, it means that in the future I might be able to create better stories, automate processes within series or campaigns or multiple follow-up stories, make my job easier and make a better experience for the reader all at the same time.