Few things illustrate the challenges facing journalism in the age of ‘Big Data’ better than Cable Gate – and specifically, how you engage people with stories that involve large sets of data.

The Cable Gate leaks have been of a different order to the Afghanistan and Iraq war logs. Not in number (there were 90,000 documents in the Afghanistan war logs and over 390,000 in the Iraq logs; the Cable Gate documents number around 250,000) – but in subject matter.

Tragedy or statistic?

I once heard a journalist trying to put the number ‘£13 billion’ into context by saying: “imagine 13 million people paying £1,000 more per year” – as if imagining 13 million people was somehow easier than imagining £13bn. Comparing numbers to the size of Wales or the prime minister’s salary is hardly any better.

Generally misattributed to Stalin, the quote “The death of one man is a tragedy, the death of millions is a statistic” illustrates the problem particularly well: when you move beyond scales we can deal with on a human level, you struggle to engage people in the issue you are covering.

Research suggests this is a problem that not only affects journalism, but justice as well. In October Ben Goldacre wrote about a study that suggested “People who harm larger numbers of people get significantly lower punitive damages than people who harm a smaller number. Courts punish people less harshly when they harm more people.”

“Out of a maximum sentence of 10 years, people who read the three-victim story recommended an average prison term one year longer than the 30-victim readers. Another study, in which a food processing company knowingly poisoned customers to avoid bankruptcy, gave similar results.”

“‘As long as we have reporting that gives the impression to everyone that poor, black folks in these communities don’t value life, it just adds to their sense of isolation,’ says Stephen Franklin, the community media project director at the McCormick Foundation-funded Community Media Workshop, where he led the ‘We Are Not Alone’ campaign to promote stories about solution-based anti-violence efforts.

“Natalie Moore, the South Side Bureau reporter for the Chicago Public Radio, asks: ‘What do we want people to know? Are we just trying to tell them to avoid the neighborhoods with many homicides?’ Moore asks. ‘I’m personally struggling with it. I don’t know what the purpose is.’”

Salience

“Whistleblowing that lacks salience does nothing to serve the public interest – if we mean capturing the public’s attention to nurture its discourse in a way that has the potential to change something material.”

He is right. But Charlie Beckett, in the comments to that post, points out that Wikileaks is not operating in isolation:

“Wikileaks is now part of a networked journalism where they are in effect, a kind of news-wire for traditional newsrooms like the New York Times, Guardian and El Pais. I think that delivers a high degree of what you call salience.”

This is because last year Wikileaks realised that they would have much more impact working in partnership with news organisations than releasing leaked documents to the world en masse. It was a massive move for Wikileaks, because it meant re-assessing a core principle of openness to all, and taking on a more editorial role. But it was an intelligent move – and undoubtedly effective. The Guardian, Der Spiegel, New York Times and now El Pais and Le Monde have all added salience to the leaks. But could they have done more?

Visualisation through personalisation and humanisation

In my series of posts on data journalism I identified visualisation as one of four interrelated stages in its production. I think that this concept needs to be broadened to include visualisation through case studies: or humanisation, to put it more succinctly.

There are dangers here, of course. Firstly, that humanising a story makes it appear to be an exception (one person’s tragedy) rather than the rule (thousands suffering) – or simply emotive rather than also informative; and secondly, that your selection of case studies does not reflect the more complex reality.

“Avastin extends survival from 19.9 months to 21.3 months, which is about 6 weeks. Some people might benefit more, some less. For some, Avastin might even shorten their life, and they would have been better off without it (and without its additional side effects, on top of their other chemotherapy). But overall, on average, when added to all the other treatments, Avastin extends survival from 19.9 months to 21.3 months.

“The Daily Mail, the Express, Sky News, the Press Association and the Guardian all described these figures, and then illustrated their stories about Avastin with an anecdote: the case of Barbara Moss. She was diagnosed with bowel cancer in 2006, had all the normal treatment, but also paid out of her own pocket to have Avastin on top of that. She is alive today, four years later.

“Barbara Moss is very lucky indeed, but her anecdote is in no sense whatsoever representative of what happens when you take Avastin, nor is it informative. She is useful journalistically, in the sense that people help to tell stories, but her anecdotal experience is actively misleading, because it doesn’t tell the story of what happens to people on Avastin: instead, it tells a completely different story, and arguably a more memorable one – now embedded in the minds of millions of people – that Roche’s £21,000 product Avastin makes you survive for half a decade.”

Broadcast journalism – with its regulatory requirement for impartiality, often interpreted in practical terms as ‘balance’ – is particularly vulnerable to this. Here’s one example of how the homeopathy debate is given over to one person’s experience for the sake of balance:

Journalism on an industrial scale

The Wikileaks stories are journalism on an industrial scale. The closest equivalent I can think of was the MPs’ expenses story which dominated the news agenda for 6 weeks. Cable Gate is already on Day 9 and the wealth of stories has even justified a live blog.

With this scale comes a further problem: cynicism and passivity; Cable Gate fatigue. In this context online journalism has a unique role to play which was barely possible previously: empowerment.

Three years ago I wrote about the 5 Ws and an H that should come after every news story. The ‘How’ and ‘Why’ of that are possibilities that many news organisations have still barely explored. ‘Why should I care?’ is about a further dimension of visualisation: personalisation – relating information directly to me. The Guardian moves closer to this with its searchable database, but I wonder at what point processing power, tools, and user data will allow us to do this sort of thing more effectively.

‘How can I make a difference?’ is about pointing users to tools – or creating them ourselves – where they can move the story on by communicating with others, campaigning, voting, and so on. This is a role many journalists may be uncomfortable with because it raises advocacy issues, but then choosing to report on these stories, and how to report them, raises the same issues; linking to a range of online tools need not be any different. These are issues we should be exploring, ethically.

All the above in one sentence

Somehow I’ve ended up writing over a thousand words on this issue, so it’s worth summing it all up in a sentence.

Industrial scale journalism using ‘big data’ in a networked age raises new problems and new opportunities: we need to humanise and personalise big datasets in a way that does not detract from the complexity or scale of the issues being addressed; and we need to think about what happens after someone reads a story online and whether online publishers have a role in that.

“Let the crowd have the middle of the diamond. Just let it go, our time there is ending. There’s too many of them, they are too fast, they will out man and out maneuver you every time that it matters to them–and if it doesn’t matter to them, I bet there’s not much of a market for it. Just walk away… and watch.”

He finishes by arguing that commercial journalism needs to raise the bar:

“The future of Journalism is not to become public service with the hopes of gratuity, but a professional service with professional expectations and results. If people are going to blogs and the crowd instead of your publications, it’s because your publication is not meeting the expectations of your audience. As a publication you have the choice to evolve to meet those expectations, find a new audience, or leave.”

“2. Paul does not make the distinction between unplanned breaking news events (like accidents and terrorist attacks) and planned live coverage of events (like the Super Bowl or the US presidential inauguration). Paul’s “news diamond” and my “news lifecycle” models are much more valid for unplanned breaking news events.”

It’s fair to say that my diamond does take the perspective of a news organisation – that’s who it was aimed at. But I’m not sure that that means it doesn’t acknowledge the blurring of boundaries.

Anyway, Mishra poses some questions:

How do we increase the number and variety of sources in the process of creating, curating and consuming news?

How do we separate signal from noise during each stage of the news lifecycle?

How do we contract the “alert” to “analysis” stages of the news lifecycle, in order to get better signal to noise ratio sooner in the cycle?

How do we expand the “conversation” to “customization” stages of the news lifecycle, in order to maximize the returns from the content we have created?

Here’s a new contribution to the ‘Model for a 21st Century Newsroom’ concept: the Google Newsroom, by Benoît Raphaël. Based on his experience as editor in chief at Le Post, Raphaël makes a number of salient points about reorganising the newsroom in a digital age. He suggests that “we have to forget that old idea of merging newsrooms” and create “one ‘where everything happens,’ that is to say on the web. This is the heart of information system. The rest is just appearance.”