Gideon Lichfield's blog. About news, news things, and things.

What journalism can learn from science

Note: this essay is based on a talk (audio) that NPR’s Matt Thompson and I gave at SXSW Interactive on March 13th, 2012. Matt laid out the groundwork for the talk in this Poynter article last September.

1. Why make journalism more like science?

Journalism and science both try to make sense of the world. They make observations, and test them against theories about how the world actually works. But there the similarity more or less ends.

Science has established standards of measurement and evidence. Journalists make them up as they go along. (The closest thing journalism has to a standard of evidence is the rule that something is true if two independent sources told you it was—hardly a bedrock of epistemology.) Scientists add to the edifice of knowledge piece by piece, citing and building on each other’s work. Journalists have no specific, agreed-on body of knowledge, other than what we call “general knowledge”, that they can refer to. Science makes precise, testable claims and then tries to prove them wrong; this is the doctrine of falsifiability. Journalists tend to make broad, untestable claims (or cite such claims by others: “Video games will damage children’s brains!”) and then look for factoids to confirm them, avoiding the inconvenient evidence to the contrary. And so on.

All of this is very sweeping, of course; it isn’t to say that there isn’t plenty of good, careful, solid journalism. Nor is real science the serene and lofty pursuit of crystalline certainty that it likes to seem to be; what we’ll describe here is an idealised form. But science does have a methodological basis and rigour that journalism does not.

And that’s fine. If journalists had to work the way scientists do they would publish rarely, interest almost nobody, and be out of business very fast. But that isn’t to say that journalism can’t borrow from some of science’s methods. And in fact, as we will explain, it’s already starting to happen.

First, though, a digression. We’re about to propose a few things—tools—that could make journalism more scientific. These tools mostly involve adding more stuff to the stories journalists produce. This will not happen by fiat. Journalists have plenty to do already. So these tools will catch on only if they meet three conditions:

Self-justifying: they are useful—immediately useful, not just long-term useful—to a journalist working on a story, regardless of whether they help anyone else.

Easy-to-use: they don’t impose a significant burden or learning curve.

Interoperable: they work on any publishing platform or content-management system.

A good example is Storify, a platform for collecting and displaying various kinds of material from the web—tweets, blog posts, pictures, etc—in one place. (You can see the Storify of our session here.) Storify is self-justifying: it helps a journalist collate and organise source material for a story, though it also helps others see where the information came from. It is easy to use (try it). And it is interoperable: a Storify collection can be embedded on any web page. As a result, though no newspaper built Storify and nobody told them to use it, more and more are now doing so.

And so to our thesis.

2. What’s so good about the scientific method?

We’ve identified three broad-brush qualities of science (because journalists always like to have things in threes) that we think journalism ought to have more of.

>> First, science is collaborative. As we’ve said, scientists build on each other’s work. So do journalists. Once something has been well enough reported and established—such as where Barack Obama was born, or what led to the collapse of Lehman Brothers—other journalists shouldn’t need to repeat the reporting in order to take the story further. In this sense, journalism is already collaborative… sort of. But there’s no agreed way of establishing and pointing to what is considered solid knowledge. There’s only a fuzzy and arbitrary general understanding, which not everyone may agree with. In short, there is no convention for signalling authority.

In science, authority is dealt with by means of citations. A paper cites other papers to establish the starting point for its research; the citations effectively say, “this is what we already know.” A cited paper may in fact turn out to be wrong later on, but in that case the citations act as a paper trail, showing other scientists what other results are dubious as a consequence.

Citations perform another function, as well: they act as measures of reputation. A paper (or a scientist) that is cited often is important and probably reliable (either that, or notoriously wrong). The way this measure is calculated is through the Science Citation Index, which collates all citations. The index serves as a kind of map of the scientific edifice. It shows which bricks rest on which other bricks, and lets you see which ones are most central to the structure.

So what if there were some kind of citation index for news? The first element of this is already, crudely speaking, in place. On blogs and on some newspaper websites, articles carry links to other articles; they show, in other words, what work they are relying on. But we don’t do this systematically. And we don’t have a bigger picture of how these links all relate. Technorati used to act as a kind of primitive citation index for blog posts: you could enter the address of a post and it would show you every page that linked to it. It seems not to any more.

Something of that nature, only more sophisticated, would be enormously helpful in journalism. It would make the starting points and assumptions of a story more transparent. It would help identify particularly useful or ground-breaking pieces of work. And it would be especially useful in tracking down the origins of misinformation.

Ideally, the news citation index would be more sophisticated than Technorati (when it worked) or even than the Science Citation Index. It would need not only to show which other articles linked to a given story, but to track where in those articles the links came from and what in the story they were linking to: to show, in other words, which facts they were attributing to the story in question. This, what we might call “fine linking”, is not something the web is well set up to do right now. But it would be a step towards the “web of data” or “semantic web”, a much-discussed goal of information seers, in which links point to and from facts, rather than pages.

A news citation index would be self-justifying if it helped journalists organise their own research, by giving them a way to keep track of what they already know and where it came from. And it would be interoperable by definition if it were based on hyperlinking, which is the architecture of the web itself.
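At bottom, such an index is just the link graph run in reverse. A toy sketch of the idea, using invented story URLs purely for illustration:

```python
from collections import defaultdict

# Hypothetical example: each story lists the stories it links to.
# All URLs are made up for illustration.
citations = {
    "news.example/lehman-collapse-explained": [],
    "news.example/bank-bailout-analysis": [
        "news.example/lehman-collapse-explained",
    ],
    "news.example/regulation-debate": [
        "news.example/lehman-collapse-explained",
        "news.example/bank-bailout-analysis",
    ],
}

# Invert the map: for each story, which stories cite it?
cited_by = defaultdict(list)
for story, sources in citations.items():
    for source in sources:
        cited_by[source].append(story)

# A crude authority measure, as in the Science Citation Index:
# rank stories by how often they are cited.
for story in sorted(citations, key=lambda s: len(cited_by[s]), reverse=True):
    print(story, len(cited_by[story]))
```

The inverted map is all a Technorati-style service really maintained; the harder part, as described above, is tracking which individual facts the links point to rather than which pages.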

>> Second, science is replicable. Scientific work is, broadly speaking, transparent: you show your assumptions, your method, your data, and how those led you to your conclusions. But why is transparency important? Not for its own sake, but because it allows others to repeat your work. Scientists must be able to replicate each other’s results—first, so that they can check for errors (some experiments need to be done several times by different groups in different places to make sure they are right); and second, because to see if a hypothesis applies more broadly, they may need to do the same experiment under different sets of conditions. So we’ve chosen replicability to be one of our three traits, as a finer-grained version of the notion of transparency.

Journalists, too, could do with some replicability. An investigation finds cases of police brutality in one town. Is the same pattern being repeated up and down the region or the country, evidence of a systemic problem, or is it just a local anomaly, in which case some bad apples are to blame? Another investigation alleges corruption in the city council, based on some funny-looking numbers in the accounts. But is the conclusion warranted, or is there a more innocent explanation? Other journalists, and the public, need to be able to see how these investigations were done so they can judge the results and repeat them if necessary. It’s also, frankly, good discipline for the reporter who wrote the story: when you have to list your sources, you are more careful about checking them.

What makes a scientific paper replicable is the transparency about its methods, data and reasoning. It might seem at first that this is impossible for journalists. They have to protect sources and guard scoops. They also mustn’t overload their audience with facts. Stories would be unreadable if they looked like academic papers. But in fact, online journalism has become far more transparent in the last three years or so.

Inline links, which we’ve already mentioned, have been around for a while. A slightly newer practice is posting photos, videos, interview recordings or transcripts that can’t go in the print version. And some websites now publish source documents next to their articles using document-sharing platforms such as Scribd and DocumentCloud. ProPublica, an investigative-journalism non-profit, publishes step-by-step guides on its digging and data analysis, as with this story on tracking nurses’ licences. Politifact’s Truth-O-Meter™, a collaboration between several American newspapers that fact-checks statements by public figures, lists all its sources of information in its articles.

And then there is Wikipedia. Sure, many still don’t consider Wikipedia a journalistic outfit. It doesn’t check material for errors before publishing it; its “stories” are constantly evolving and never finished; and when mistakes get in they may, or may not, get caught and cut out later. But the Wikipedia page for a big current event can have far more people at once working on it than at any newspaper, and when errors are caught they disappear far more quickly than they would from a normal online article.

And—here is the point—Wikipedia uses footnotes. When its users are following the guidelines, they give a source for every fact. Journalists and schoolchildren are taught not to rely on a Wikipedia page for accuracy, but what it does do very well is show where purported facts came from, so you can check for yourself. And footnotes are more useful than inline links, because you have to click on each link to see what it leads to, whereas a set of footnotes gives you, in one place, an alternative view of the story as told through its sources.

So: how about footnotes for news? Your first reaction might be: ugh, that would look terrible. Book publishers hate footnotes; they make books look academic and put readers off. But Wikipedia wouldn’t be the 6th most-visited site on the internet if people really hated footnotes. It’s a matter of habit, and also of design: there are ways to make footnotes less ugly. (Here’s an easy one: make them show up only when you hover your mouse over the text.)

In fact, stories would be snappier with footnotes. Instead of “A survey last month by the Whatalongname Foundation for Public Health Research found that 43% of American adults are unhappy with their GP, but according to the Department of Health and Human Services, just 3% of them changed doctors last year”, wouldn’t it be easier to read that “43% of American adults are unhappy with their GP¹ but just 3% of them changed doctors last year²”? Having to state your sources in the text is really just a holdover from the days of print. With footnotes you can separate the story from the scaffolding behind it, and thus make both the story clearer and the scaffolding more visible.
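The separation of story from scaffolding is easy to picture in code. A toy sketch, reusing the (invented) survey example from above:

```python
# A toy sketch of "footnotes for news": the story text carries only
# numbered markers, and the sourcing lives in a separate structure.
# The organisations and figures are the invented ones from the text above.
story = (
    "43% of American adults are unhappy with their GP[1] "
    "but just 3% of them changed doctors last year[2]."
)

footnotes = {
    1: "Survey last month by the Whatalongname Foundation for Public Health Research",
    2: "Department of Health and Human Services figures",
}

def render(story, footnotes):
    """Return the story followed by its sources, one per line."""
    notes = "\n".join(f"[{n}] {src}" for n, src in sorted(footnotes.items()))
    return story + "\n\n" + notes

print(render(story, footnotes))
```

Because the sources live in their own structure, a page can choose how to display them: as hover-notes, as a list at the end, or not at all; and a citation index could harvest them without parsing the prose.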

As for protecting your sources: sure, some have to stay secret. But just as a story can quote “a source close to so-and-so”, the footnote could explain that a particular fact came from a protected source. The point is not to reveal the source so much as it is to reveal the process, so that others can retrace its steps.

Footnotes for news would be self-justifying if, like the citation index, they helped journalists organise their research. (The citation index might, in fact, come about as a by-product or a layer on top of the footnotes.) Some blogging platforms, such as WordPress, already have easy-to-use footnoting plug-ins, as does Wikipedia. To make them interoperable might mean adopting some standards for footnoting. But this could well come about by convergence.

>> Third, science is predictive. As we’ve already said, scientists make precise, testable predictions, whereas journalists and the people they quote tend to make general, untestable ones. And this is bad. But what’s worse is that it’s very hard to hold the makers of these predictions to account when they get it wrong.

An example that has entered the lore is Thomas Friedman of the New York Times. In November 2003, eight months after the beginning of the war in Iraq, he predicted that the following six months would “determine the prospects for democracy-building there”. Over the next three years, he made several more predictions, all stating that the following six months or so would be decisive. The blogger Atrios defined a “Friedman”, later the “Friedman Unit”, as any six-month period starting from the present moment. In the spirit of science we would like to redefine the Friedman Unit more broadly, and time-agnostically, as “that period of time within which a prediction by a public figure is almost certain to be forgotten”.

Friedman is hardly alone in this habit, though. And since the central mission of journalism is to hold the powerful to account, the fact that it is singularly bad at keeping track of the predictions the powerful make is one of its most fundamental weaknesses as an institution. On the great issues of our time—stimulus measures, global warming, healthcare, international security policy, the role of the state, and so on—the poverty of public debate is due in no small part to the fact that those who say things that later turn out to be false get away with it, because nobody can remember who said what when.

And so we think journalism needs a prediction tracker. The prediction trackers we have are mainly in the form of journalists’ own, sometimes short and patchy, memories. A more concrete one does exist: Politifact’s “Obameter”, which has been keeping track of promises Barack Obama made, and reporting on whether they are being fulfilled. But despite being a superb thing in itself, it has not become a point of reference for most journalists. It rarely crops up in stories. Readers do not flock to see how well or badly the president has been doing lately. Tracking predictions may be good for a news outlet in the long term, like eating your spinach, but in the short term it just means extra work. So it is not obviously self-justifying.

However, there are steps towards making it interoperable. Dan Schultz’s Truth Goggles project is an experiment in making data from Politifact’s Truth-O-Meter crop up next to a politician’s name in any news article, so that you can see that politician’s overall record of truth-telling. A twist on that technology might be to pull up past statements the politician had made about the topic of the article in question from a database (software already exists for aggregating such quotes from the web), so you could instantly see whether she was contradicting herself. If news organisations decide to include prediction-tracking of this sort for the benefit of their readers, they might start to realise that it would do their journalists good to make use of it too—if only to avoid looking stupid.
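The core of a prediction tracker is unglamorous: a dated record of who claimed what, and a way to surface claims whose judgement day has arrived. A minimal sketch, with invented pundits and dates:

```python
from datetime import date

# Hypothetical prediction tracker: each entry records who said what,
# when they said it, and by when it can be judged. All entries invented.
predictions = [
    {"who": "Pundit A", "claim": "The next six months will be decisive.",
     "made": date(2003, 11, 30), "due": date(2004, 5, 30)},
    {"who": "Pundit B", "claim": "Unemployment will fall below 5%.",
     "made": date(2012, 1, 15), "due": date(2013, 1, 15)},
]

def due_for_review(predictions, today):
    """Return predictions whose judgement date has passed."""
    return [p for p in predictions if p["due"] <= today]

for p in due_for_review(predictions, date(2012, 3, 13)):
    print(p["who"], "-", p["claim"])
```

The hard part, of course, is not the database but the habit: getting journalists to log the predictions in the first place, and to consult the log before quoting the same pundit again.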

3. What else might journalism learn from science?

We have focused on three traits of science that we think news ought to have more of: being collaborative, replicable and predictive. But science has many other qualities. These may or may not be worth trying to adopt. For instance, the doctrine of falsifiability is the very linchpin of what distinguishes science from speculation. But can you imagine making journalists look for proof that their ideas are wrong rather than right?

On the other hand, it’s interesting to try to imagine how the checks and balances imposed by peer review might look in a journalistic context. It already exists, in a way: comment threads let anyone (not usually the journalists’ peers, though) pick a story apart. They’re mostly junk, but there are gems of useful information in them that can make a story better. And an entire industry exists around trying to make comment threads less troll-laden and more useful in order to drive more traffic to sites, so it’s already in the media’s interests to improve them.

There is also a genuine experiment in peer review in the form of NewsTrust, a site that lets experts and journalists rate articles for accuracy, fairness and so on. Websites can choose to display the ratings next to their stories. Not many do so, because the ratings fail the self-justifying test: even if they have a long-term benefit, they don’t make a journalist’s work immediately easier. But maybe they contain the germ of a system that could.

As we’ve said, this isn’t about making journalism into science. But we hope we’ve shown that the two could have a lot more in common, and that this could make journalism better, more reliable, and more valuable to society.