If democracy would be poorer without journalism, then journalism must have some effect. Can we measure those effects in some way? While most news organizations already watch the numbers that translate into money (such as audience size and pageviews), the profession is just beginning to consider metrics for the real value of its work.

That’s why the recent announcement of a Knight-Mozilla Fellowship at The New York Times on “finding the right metric for news” is an exciting moment. A major newsroom is publicly asking the question: How do we measure the impact of our work? Not the economic value, but the democratic value. The Times’ Aaron Pilhofer writes:

The metrics newsrooms have traditionally used tended to be fairly imprecise: Did a law change? Did the bad guy go to jail? Were dangers revealed? Were lives saved? Or least significant of all, did it win an award?

But the math changes in the digital environment. We are awash in metrics, and we have the ability to engage with readers at scale in ways that would have been impossible (or impossibly expensive) in an analog world.

The problem now is figuring out which data to pay attention to and which to ignore.

Evaluating the impact of journalism is a maddeningly difficult task. To begin with, there’s no single definition of what journalism is. It’s also very hard to track what happens to a story once it is released into the wild, and even harder to know for sure if any particular change was really caused by that story. It may not even be possible to find a quantifiable something to count, because each story might be its own special case. But it’s almost certainly possible to do better than nothing.

The idea of tracking the effects of journalism is old, beginning in discussions of the newly professionalized press in the early 20th century and flowering in the “agenda-setting” research of the 1970s. What is new is the possibility of cheap, widespread, data-driven analysis down to the level of the individual user and story, and the idea of using this data for managing a newsroom. The challenge, as Pilhofer put it so well, is figuring out which data, and how a newsroom could use that data in a meaningful way.

What are we trying to measure and why?

Metrics are powerful tools for insight and decision-making. But they are not ends in themselves because they will never exactly represent what is important. That’s why the first step in choosing metrics is to articulate what you want to measure, regardless of whether or not there’s an easy way to measure it. Choosing metrics poorly, or misunderstanding their limitations, can make things worse. Metrics are just proxies for our real goals — sometimes quite poor proxies.

An analytics product such as Chartbeat produces reams of data: pageviews, unique users, and more. News organizations reliant on advertising or user subscriptions must pay attention to these numbers because they’re tied to revenue — but it’s less clear how they might be relevant editorially.

Consider pageviews. That single number is a combination of many causes and effects: promotional success, headline clickability, viral spread, audience demand for the information, and finally, the number of people who might be slightly better informed after viewing a story. Each of these components might be used to make better editorial choices — such as increasing promotion of an important story, choosing what to report on next, or evaluating whether a story really changed anything. But it can be hard to disentangle the factors. The number of times a story is viewed is a complex, mixed signal.

It’s also possible to try to get at impact through “engagement” metrics, perhaps derived from social media data such as the number of times a story is shared. Josh Stearns has a good summary of recent reports on measuring engagement. But though it’s certainly related, engagement isn’t the same as impact. Again, the question comes down to: Why would we want to see this number increase? What would it say about the ultimate effects of your journalism on the world?

As a profession, journalism rarely considers its impact directly. There’s a good recent exception: a series of public media “impact summits” held in 2010, which identified five key needs for journalistic impact measurement. The last of these needs nails the problem with almost all existing analytics tools:

While many Summit attendees are using commercial tools and services to track reach, engagement and relevance, the usefulness of these tools in this arena is limited by their focus on delivering audiences to advertisers. Public interest media makers want to know how users are applying news and information in their personal and civic lives, not just whether they’re purchasing something as a result of exposure to a product.

Or as Ethan Zuckerman puts it in his own smart post on metrics and civic impact, “measuring how many people read a story is something any web administrator should be able to do. Audience doesn’t necessarily equal impact.” Not only that, but it might not always be the case that a larger audience is better. For some stories, getting them in front of particular people at particular times might be more important.

Measuring audience knowledge

Pre-Internet, there was usually no way to know what happened to a story after it was published, and the question seems to have been mostly ignored for a very long time. Asking about impact gets us to the idea that the journalistic task might not be complete until a story changes something in the thoughts or actions of the user.

If journalism is supposed to inform, then one simple impact metric would ask: Does the audience know the things that are in this story? This is an answerable question. A survey during the 2010 U.S. mid-term elections showed that a large fraction of voters were misinformed about basic issues, such as expert consensus on climate change or the predicted costs of the recently passed healthcare bill. Though coverage of the study focused on the fact that Fox News viewers scored worse than others, that missed the point: No news source came out particularly well.

In one of the most limited, narrow senses of what journalism is supposed to do — inform voters about key election issues — American journalism failed in 2010. Or perhaps it actually did better than in 2008 — without comparable metrics, we’ll never know.

While newsrooms typically see themselves in the business of story creation, an organization committed to informing, not just publishing, would have to operate somewhat differently. Having an audience means having the ability to direct attention, and an editor might choose to continue to direct attention to something important even if it’s “old news”; if someone doesn’t know it, it’s still new news to them. Journalists will also have to understand how and when people change their beliefs, because information doesn’t necessarily change minds.

I’m not arguing that every news organization should get into the business of monitoring the state of public knowledge. This is only one of many possible ways to define impact; it might only make sense for certain stories, and to do it routinely we’d need good and cheap substitutes for large public surveys. But I find it instructive to work through what would be required. The point is to define journalistic success based on what the user does, not on what the publisher does.
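To make the idea concrete, here is a minimal sketch of the arithmetic behind such a knowledge metric: the share of survey respondents who answer a factual question correctly, with a rough 95% margin of error from the normal approximation. The function name and the sample responses are hypothetical, invented for illustration, and a real survey would also need careful sampling and question design.

```python
import math


def knowledge_share(answers):
    """Return (share_correct, margin) for a list of True/False survey answers.

    The margin is a rough 95% margin of error using the normal
    approximation; it is only a sketch, not a full survey analysis.
    """
    n = len(answers)
    if n == 0:
        raise ValueError("need at least one response")
    share = sum(answers) / n  # True counts as 1, False as 0
    margin = 1.96 * math.sqrt(share * (1 - share) / n)
    return share, margin


# Hypothetical responses to one factual question, before and after coverage.
before = [True] * 40 + [False] * 60
after = [True] * 55 + [False] * 45

for label, answers in (("before", before), ("after", after)):
    share, margin = knowledge_share(answers)
    print(f"{label}: {share:.0%} correct, plus or minus {margin:.0%}")
```

Comparing the two shares against their margins gives a first hint of whether the audience actually knows more than it did, though it says nothing about whether the coverage caused the change.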

Other fields have impact metrics too

Measuring impact is hard. The ultimate effects on belief and action will mostly be invisible to the newsroom, and so tangled in the web of society that it will be impossible to say for sure that it was journalism that caused any particular effect. But neither is the situation hopeless, because we really can learn things from the numbers we can get. Several other fields have been grappling with the tricky problems of diverse, indirect, not-necessarily-quantifiable impact for quite some time.

Academics wish to know the effect of their publications, just as journalists do, and the academic publishing field has long had metrics such as citation count and journal impact factor. But the Internet has upset the traditional scheme of things, leading to attempts to formulate wider-ranging, web-inclusive measures of impact such as Altmetrics or the article-level metrics of the Public Library of Science. Both combine a variety of data, including social media.

Social science researchers are interested not only in the academic influence of their work, but also in its effects on policy and practice. They face many of the same difficulties as journalists do in evaluating their work: unobservable effects, long timelines, complicated causality. Helpfully, lots of smart people have been working on the problem of understanding when social research changes social reality. Recent work includes the payback framework, which looks at benefits from every stage in the lifecycle of research, from intangibles such as increasing the human store of knowledge, to concrete changes in what users do after they’ve been informed.

Data beyond numbers

Numbers are helpful because they allow standard comparisons and comparative experiments. (Did writing that explainer increase the demand for the spot stories? Did investigating how the zoning issue is tied to developer profits spark a social media conversation?) Numbers can also be compared at different times, which gives us a way to tell if we’re doing better or worse than before, and by how much. Dividing impact by cost gives measures of efficiency, which can lead to better use of journalistic resources.
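The two calculations just described, impact divided by cost and change over time, are simple enough to sketch in a few lines. Everything here is an assumption made for illustration: the `StoryResult` fields, the notion of a single numeric impact score, and the sample figures are all hypothetical, and choosing what the impact number should actually be is the hard part.

```python
from dataclasses import dataclass


@dataclass
class StoryResult:
    slug: str
    impact: float      # hypothetical impact score, in whatever unit the newsroom chose
    cost_hours: float  # reporting and editing time spent on the story


def efficiency(story):
    """Impact per hour of work: impact divided by cost."""
    if story.cost_hours <= 0:
        raise ValueError("cost_hours must be positive")
    return story.impact / story.cost_hours


def relative_change(before, after):
    """Fractional change between two measurements of the same metric."""
    if before == 0:
        raise ValueError("cannot compute change from a zero baseline")
    return (after - before) / before


# Hypothetical figures for two stories.
explainer = StoryResult("zoning-explainer", impact=120, cost_hours=8)
investigation = StoryResult("developer-profits", impact=300, cost_hours=60)

for story in (explainer, investigation):
    print(story.slug, efficiency(story))

# Did the metric go up or down between two periods, and by how much?
print(relative_change(before=200, after=250))
```

A ratio like this is only as meaningful as the impact score that goes into it, so a low efficiency figure for an investigation may say more about the metric than about the story.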

But not everything can be counted. Some events are just too rare to provide reliable comparisons — how many times last month did your newsroom get a corrupt official fired? Some effects are maddeningly hard to pin down, such as “increased awareness” or “political pressure.” And very often, attributing cause is hopeless. Did a company change its tune because of an informed and vocal public, or did an internal report influence key decision makers?

Fortunately, not all data is numbers. Do you think that story contributed to better legislation? Write a note explaining why! Did you get a flood of positive comments on a particular article? Save them! Not every effect needs to be expressed in numbers, and a variety of fields are coming to the conclusion that narrative descriptions are equally valuable. This is still data, but it’s qualitative (stories) instead of quantitative (numbers). It includes comments, reactions, repercussions, later developments on the story, unique events, related interviews, and many other things that are potentially significant but not easily categorizable. The important thing is to collect this information reliably and systematically, or you won’t be able to make comparisons in the future. (My fellow geeks may here be interested in the various flavors of qualitative data analysis.)

Qualitative data is particularly important when you’re not quite sure what you should be looking for. With the right kind, you can start to look for the patterns that might tell you what you should be counting.

Metrics for better journalism

Can the use of metrics make journalism better? If we can find metrics that show us when “better” happens, then yes, almost by definition. But in truth we know almost nothing about how to do this.

The first challenge may be a shift in thinking, as measuring the effect of journalism is a radical idea. The dominant professional ethos has often been uncomfortable with the idea of having any effect at all, fearing “advocacy” or “activism.” While it’s sometimes relevant to ask about the political choices in an act of journalism, the idea of complete neutrality is a blatant contradiction if journalism is important to democracy. Then there is the assumption, long invisible, that news organizations have done their job when a story is published. That stops far short of the user, and confuses output with effect.

The practical challenges are equally daunting. Some data, like web analytics, is easy to collect but doesn’t necessarily coincide with what a news organization ultimately values. And some things can’t really be counted. But they can still be considered. Ideally, a newsroom would have an integrated database connecting each story to both quantitative and qualitative indicators of impact: notes on what happened after the story was published, plus automatically collected analytics, comments, inbound links, social media discussion, and other reactions. With that sort of extensive data set, we stand a chance of figuring out not only what the journalism did, but how best to evaluate it in the future. But nothing so elaborate is necessary to get started. Every newsroom has some sort of content analytics, and qualitative effects can be tracked with nothing more than notes in a spreadsheet.
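As a rough sketch of what such a record might look like at the spreadsheet end of the scale, the following keeps a few analytics counts and a running list of dated notes for each story, then writes everything out as CSV. The field names, the example URL, and the note text are hypothetical; a real newsroom would choose its own indicators.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImpactRecord:
    """One story, with quantitative counts and qualitative notes side by side."""
    story_url: str
    pageviews: int = 0
    shares: int = 0
    inbound_links: int = 0
    notes: List[str] = field(default_factory=list)

    def add_note(self, date, text):
        # Date every note so that later comparisons are possible.
        self.notes.append(f"{date}: {text}")


def to_csv(records):
    """Serialize records to CSV text that any spreadsheet program can open."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["story_url", "pageviews", "shares", "inbound_links", "notes"])
    for r in records:
        writer.writerow(
            [r.story_url, r.pageviews, r.shares, r.inbound_links, " | ".join(r.notes)]
        )
    return buf.getvalue()


# Hypothetical example record.
record = ImpactRecord("https://example.com/zoning", pageviews=1200, shares=45)
record.add_note("2012-08-01", "Council scheduled a hearing")
print(to_csv([record]))
```

The useful property is not the format but the habit: every story gets a row, and every observed effect gets a dated note, so the qualitative record accumulates alongside the numbers.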

Most importantly, we need to keep asking: Why are we doing this? Sometimes, as I pass someone on the street, I ask myself if the work I am doing will ever have any effect on their life — and if so, what? It’s impossible to evaluate impact if you don’t know what you want to accomplish.