A dull one. Without the invective and ideology about free vs. paid, pajama-clad bloggers vs. stick-in-the-mud mainstream media curmudgeons, and Utopian visions of crowdsourced news vs. dark fears about falling standards you can find elsewhere. It has words like taxonomy and persistent content in it; discusses business models and revenue streams in dull, accountant-like language; and tries to dissect the sparkling prose journalists turn out into tiny bytes of data.

But there is a purpose here, and it’s based around the idea that we as journalists haven’t really thought about how people are changing the way they access information, or how we need to fundamentally rethink the way we carry out journalism and the kinds of – for want of a better word – products we turn out for them.

There’s much hand-wringing over the loss of the traditional business model of news, it’s true. Perhaps too much. And this site will contribute its share. But hopefully it’ll also explore some of the less-explored questions about where the profession goes in a digital age. And lay out some of the thinking behind one concrete idea that might help move the business forward: Something I’m calling Structured Journalism.

Just an entirely self-serving shout-out to the nice CJR piece by Jonathan Stray about some of the innovations going on at Reuters, including our Automation For Insight project and Reuters News Tracer – a cool new tool that detects newsworthy events on social media and assigns a confidence score assessing how credible they are.

And as a two-fer, we got a nice piece about News Tracer in Nieman Lab as well.

Not bad for a single day.

(And as a complete side note, if you haven’t been reading Jonathan’s blog, you should. There’s some really good stuff there.)

Basically – and you can read the pieces for more description – what News Tracer does is find clusters of tweets, clean out spam and other dross, figure out which clusters are “newsworthy,” at least as mainstream news organizations define it, separate assertions of opinion from assertions of fact, and then figure out a score for the credibility of each cluster.
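To make those stages concrete, here is a toy sketch in Python. It is emphatically not how News Tracer works: the `Tweet` fields, the `min_size` newsworthiness test and the averaging of a `source_trust` value are all assumptions invented for illustration, and the hard parts (clustering, spam detection, telling opinion from fact) are assumed to have been done already by classifiers not shown here.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Tweet:
    text: str
    topic: str           # hypothetical stand-in for a real clustering signal
    is_spam: bool        # hypothetical: assume a spam classifier already ran
    is_opinion: bool     # hypothetical: assume an opinion/fact classifier already ran
    source_trust: float  # hypothetical 0..1 trust value for the account


def score_clusters(tweets, min_size=2):
    """Return {topic: credibility score} for clusters that pass every stage."""
    # Stage 1: find clusters (here, simply group by the made-up topic label).
    clusters = defaultdict(list)
    for tweet in tweets:
        clusters[tweet.topic].append(tweet)

    scores = {}
    for topic, members in clusters.items():
        # Stage 2: clean out spam.
        clean = [t for t in members if not t.is_spam]
        # Stage 3: toy newsworthiness test, assumed to be "enough clean tweets".
        if len(clean) < min_size:
            continue
        # Stage 4: keep assertions of fact, drop assertions of opinion.
        facts = [t for t in clean if not t.is_opinion]
        if not facts:
            continue
        # Stage 5: toy credibility score, assumed to be the average source trust.
        scores[topic] = round(mean(t.source_trust for t in facts), 2)
    return scores


tweets = [
    Tweet("Bridge closed after crash", "bridge", False, False, 0.9),
    Tweet("Police confirm bridge closure", "bridge", False, False, 0.7),
    Tweet("Worst bridge ever", "bridge", False, True, 0.2),
    Tweet("Win a free phone!!!", "promo", True, False, 0.1),
    Tweet("Win a free phone now", "promo", True, False, 0.1),
]
print(score_clusters(tweets))
```

The point of the sketch is only that each newsroom judgment has to become an explicit rule or threshold somewhere in the code.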

Loads of kudos to the Thomson Reuters R&D team, and especially Sameena Shah, who led the development team that solved a whole host of very interesting algorithmic challenges over a two-year period. As Jonathan notes:

Newsroom standards are rarely formal enough to turn into code. How many independent sources do you need before you’re willing to run a story? And which sources are trustworthy? For what type of story? “The interesting exercise when you start moving to machines is you have to start codifying this,” says Chua. Much like trying to program ethics for self-driving cars, it’s an exercise in turning implicit judgments into clear instructions.

Sameena’s team did really smart work figuring out – with help from the newsroom – what “newsworthiness” means, and also how to pull together a basket of factors to help assess credibility. It’s a never-ending iterative process, of course, but they’ve built up a very impressive capability that extends the reach of the newsroom, improves its speed, and frees reporters up to do more value-added work.

Hillary Clinton did not, as it turns out, have a 75% or 80% or 90% or 99% chance of winning the election, as many of us – including our massive States of the Nation project – predicted up until Election Day.

(You’ll see that we’ve now updated our analysis to correct for an accurate understanding of the actual turnout, which shows that Donald Trump’s odds of success would have been 75%. Kinda late, it’s true, but simply validating the underlying assumptions behind the project.)

So what happened? And given Brexit, and Colombia, what does this mean for polls more broadly? And how do we do better in the future?

To be fair, the polls weren’t actually that far off. Seriously. At least the national polls. They showed Clinton leading by a couple of percentage points, and as the final results trickle in, indeed the former Secretary of State is leading in the popular vote. It’s true that the polls may have overstated her appeal, but broadly the numbers are within the margin of error, especially if you assume the likely voter models are somewhat off.

Which they were. And that’s one critical place all the polls fell down on. And again, to be fair, predicting who will or won’t go to the polls – a once every two- or four-year event – is a tricky exercise at best. But it doesn’t help that we tend to present the numbers we come up with as absolutes, rather than give a range of possible outcomes based on a range of possible turnout models. As we noted when we launched the Read More…

There’s no question that much of the media missed much of the story of Trump’s rise to power – that’s a journalism problem. But it’s also clear that even all the great journalism about the campaign – and there was a lot of it – didn’t really factor hugely in many voters’ minds. And that’s a distribution problem.

That’s not the same as the “fake news” issue – which is yet another real problem. This is really about how, in addition to upending business models, the new digital landscape is also upending the media’s ability to get quality journalism in front of audiences. And that’s at least as big an issue as fake news.

Not that I have any proposals for solutions; but I thought it might be helpful to try to disaggregate the two issues of journalism and distribution, and point to different groups or approaches that should be tackling them.

To be sure, we all could – and should – do better journalism, and certainly much of the media dropped the ball in this election. And fake news is a real problem. But let’s focus for the moment on our distribution problem, our dependence on external platforms to get real news to readers, and the filter bubbles that they inhabit.

As Josh Benton put it very nicely in a Nieman Lab piece right after the elections:

In a column just before the election, The New York Times’ Jim Rutenberg argued that “the cure for fake journalism is an overwhelming dose of good journalism.” I wish that were true, but I think the evidence shows that it’s not. There was an enormous amount of good journalism done on Trump and this entire election cycle, from both old-line giants like the Times and The Washington Post and digital natives like BuzzFeed and The Daily Beast. …

The problem is that not enough people sought it out. And of those who did, not enough of them trusted it to inform their political decisions. And even for many of those, the good journalism was crowded out by the fragmentary glimpses of nonsense.

And tackling this issue isn’t something that individual journalists – or even large news organizations – are really equipped to do well.

There’s an analogy here – a medical one – that I think bears on this. Stay with me. Read More…

I was meaning to write this post – honest! – before the election, but procrastination has its benefits: Now the timing seems much more apt, even if the subject – filter bubbles – has been heavily picked over. Which means I can talk about a different kind of bubble instead – the kind in newsrooms.

There are two pretty big – somewhat unrelated – problems to address on that front: One is how many flat-out untruthful/half-truthful memes are out there, masquerading as real news and crowding out real information; and the other is just how hard it is for good, serious journalism – the kind of work, for example, that the Washington Post did on the Trump Foundation – to actually get in front of audiences that matter. The latter is much more about the questions of virality, discovery, platforms, filter algorithms and issues of how to distribute rather than create news – of which another post, probably.

But amid all of the angst in the media about how we failed to predict – or even contemplate – the prospect of a Trump victory, there has also been the meme about how the mainstream media inhabits its own bubble with a self-reinforcing worldview. Which certainly has some truth to it. As Fortune noted:

In part, that’s because much of the East Coast-based media establishment is arguably out of touch with the largely rural population that voted for Trump, the disenfranchised voters who looked past his cheesy exterior and his penchant for half-truths and heard a message of hope, however twisted.

They’re getting the shit kicked out of them. I know, I was there. Step outside of the city, and the suicide rate among young people fucking doubles. The recession pounded rural communities, but all the recovery went to the cities. The rate of new businesses opening in rural areas has utterly collapsed.

So the argument is that the media elite missed a key part of the story because they didn’t have enough insight into the rural heartland; that they sent reporters in to report, but largely as anthropological expeditions rather than as genuine explorations Read More…

First, it’s a link driven by Google, which means millions – hundreds of millions – of people will see it and use it, and hence drive up the value and importance of fact-checking, at least in theory.

Second, it stems from a recognition by Google – or at least I hope it does – that people’s news needs aren’t driven solely by the freshest story on the subject, and more by a desire to understand a subject in context. That explains, to some extent, why Wikipedia has become a real destination for news searches, and certainly pushes the value of depth rather than just speed. (Not that speed doesn’t matter as well, of course.)

And third, by highlighting only the fact checks that conform to a certain schema, Google is rewarding the notion of structured journalism, and using the best of what the idea has to offer: Building greater long-term value out of structuring the information journalists collect, analyze and publish every day.
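For a sense of what conforming to a schema can look like in practice, here is a minimal sketch. It assumes the markup in question resembles schema.org’s ClaimReview type; the choice of which fields to treat as required, the helper names and the example values are all mine, for illustration only, and are not Google’s actual eligibility rules.

```python
import json

# Assumption for this sketch: the fields we choose to treat as mandatory.
REQUIRED_FIELDS = ("claimReviewed", "reviewRating", "author", "url")


def make_fact_check(claim, verdict, publisher, url):
    """Build a ClaimReview-style record as a plain dict (JSON-LD shape)."""
    return {
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        "claimReviewed": claim,
        "reviewRating": {"@type": "Rating", "alternateName": verdict},
        "author": {"@type": "Organization", "name": publisher},
        "url": url,
    }


def missing_fields(record):
    """List the fields from REQUIRED_FIELDS that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Placeholder values only; not a real fact check.
record = make_fact_check(
    claim="Example claim made by a public figure",
    verdict="False",
    publisher="Example Fact Desk",
    url="https://example.com/fact-checks/1",
)
print(json.dumps(record, indent=2))
print(missing_fields(record))
```

Once the claim, the verdict and the publisher are separate fields rather than sentences buried in a story, a search engine (or anyone else) can find, compare and reuse them.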

To be sure, some don’t see that as an advantage, as this piece from Slate suggests:

Google seems to have a somewhat narrow view of fact-checking journalism, one that defines it by form as much as by function. It will likely leave out plenty of stories that could merit the tag, while including some others that might not. At least at first, it seems to be surfacing stories mainly from dedicated fact-checking organizations, such as Politifact, rather than articles from mainstream news organizations.

And it’s true that there are fact checks embedded in all sorts of journalism that won’t be surfaced by this new link. On the other hand, it’s just as likely that Read More…

It’s the biggest presidential tracking poll ever – at least as far as we can figure, and with upwards of 15,000 people surveyed every week, we’re reasonably confident we can make that claim. But it isn’t just a huge poll, cool though that is.

It’s built around the idea that polling accuracy – and election results – hinge on good estimates and predictions of actual turnout on polling day. So while every pollster has their own model for what percentage of each demographic group will show up to vote – as do we – the site we’ve built lets users create demographic groups on the fly and adjust their predictions of turnout for that group, and see how that would impact the results of the elections at the state level, and hence overall in the Electoral College.

(OK, so you can actually create that last filter, but then again that’s not a huge slice of the population, so I’m not sure changing their turnout is going to materially affect the election. But it’s great that you can do that.)
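The arithmetic behind adjusting a group’s turnout can be sketched in a few lines. This is an illustration with made-up numbers, not the project’s model: the `Group` fields, the two-group state, the two-candidate race and the winner-take-all rule for every state are all simplifying assumptions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Group:
    name: str
    eligible: int     # eligible voters in the state (made-up figure)
    turnout: float    # share of the group expected to vote
    support_a: float  # share of the group's voters backing candidate A


def share_for_a(groups):
    """Candidate A's share of all votes cast, given each group's turnout."""
    votes = sum(g.eligible * g.turnout for g in groups)
    votes_a = sum(g.eligible * g.turnout * g.support_a for g in groups)
    return votes_a / votes


def electoral_votes_for_a(states):
    """states maps name -> (electoral votes, groups); assumes winner-take-all."""
    return sum(ev for ev, groups in states.values() if share_for_a(groups) > 0.5)


urban = Group("urban", eligible=600, turnout=0.6, support_a=0.60)
rural = Group("rural", eligible=400, turnout=0.5, support_a=0.35)

baseline = {"Example State": (10, [urban, rural])}
# Same opinions, same poll: only the rural turnout assumption changes.
high_rural = {"Example State": (10, [urban, replace(rural, turnout=0.7)])}

for label, scenario in (("baseline", baseline), ("high rural turnout", high_rural)):
    _, groups = scenario["Example State"]
    print(label, round(share_for_a(groups), 3), electoral_votes_for_a(scenario))
```

In this toy state, nobody changes their mind between the two scenarios; moving rural turnout from 50% to 70% is enough to flip the result, and with it all of the state’s electoral votes.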

And if the lively discussion at the session, and then at an evening drinks session afterwards, was anything to go by, there’s cause for some optimism. Not that it’s all smooth sailing from here – and certainly one of the bigger questions we have to address as more people try to implement structured journalism sites is: What’s the use? As in: What use do you want the site to serve?

That’s not a question that’s unique to structured journalism, of course – all news organizations need to think about who their intended audience is, and what they bring to them. But structure brings with it much more, well, structure – and that means trying to solve those questions much earlier.

But first to go back to the session for a minute. It featured a good mix: David talked about the ambitious goals of his structured stories software and template, and how it fared in actual coverage, and Jacquie shared the progress she’s made in developing more standardized templates for turning out BBC stories and explainers on key topics. And I just tried to keep up with them.

The room was pretty full, and there was no shortage of comments and questions from the floor, so towards the end of the session, we extended an invite to get together in the bar (where else?) in the evening to keep the discussion going. By the time I got there, a couple of people had already gathered around David, and before too long there was a solid core of about a dozen of us around a table – sharing ideas for projects, discussing challenges they’d faced.

But a key question that kept surfacing – nicely framed by Jonathan Stray – was about use cases, and how tightly to define them before you set up shop. And it’s a key issue, I think: Deciding what topic and questions we want to throw a light on, designing our information structure for that – and shedding everything else.

It’s a tough thing to do, because we all naturally want to preserve as much flexibility as possible. But – at least so far – it’s very hard to build and maintain Read More…

Best of all, we’re meeting in a restaurant, which ought to mean a meeting that isn’t in a windowless room. I’ll be posting (honest!) on what we discuss there, as well as (hopefully) finally catching up with some long overdue posts from NICAR and IRE.

At a recent meeting of the Institute for Nonprofit News – for my sins, I now sit on INN’s board – we learned an interesting statistic: About half the organization’s members have a strategy to drive readers to their own sites/destinations, and the other half count on distributing their content via other platforms.

Does it matter how they (you/we) reach readers? And should they (you/we) care?

Good questions, albeit without clear answers. But with the expansion of Facebook’s Instant Articles and the launch of Google’s Accelerated Mobile Pages, it’s clear that distribution of news is increasingly moving out of the hands of news organizations – to the point that some start-ups no longer even have websites or home pages.

There’s a balance, of course, in the middle. No news organization can afford to ignore social platforms or how its stories are surfaced via search. But the real question between distributed and direct strategies seems to hinge on whether the news site wants to prioritize reach, or engagement.

There’s no way a news site can build the size of audience Facebook has, so it makes sense, if you want to reach millions, to focus your distribution strategy on getting your content on social platforms. And it especially makes sense if you’re a relatively small start-up that likely doesn’t have much brand recognition or is unlikely to become a destination site. Non-profits, too, are often incentivized to maximize their reach and impact by getting their content to as many people as possible. That speaks to following a distributed strategy.

Not that generating revenue is the most important thing – although it helps, even for non-profits – but engagement is likely to be better if you have a destination site, or, even better, a destination app. As Ken Doctor notes:

While only 8 percent of those accessing news on smartphones and tablets use apps, they account for 45 percent of all mobile time spent on news.
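A quick back-of-the-envelope calculation shows how lopsided that is. The sketch below assumes mobile news users split cleanly into app users and everyone else, which the quote doesn’t quite say, so treat the result as a rough illustration; the function name is mine.

```python
def per_user_time_ratio(user_share, time_share):
    """How much more time an average user inside the group spends
    than an average user outside it, given the group's share of
    users and its share of total time."""
    inside = time_share / user_share
    outside = (1 - time_share) / (1 - user_share)
    return inside / outside


# 8 percent of mobile news users account for 45 percent of the time spent.
ratio = per_user_time_ratio(user_share=0.08, time_share=0.45)
print(f"{ratio:.1f}")
```

On those assumptions, the average app user spends roughly nine times as long with mobile news as the average non-app user.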

So in a surprise win, Spotlight bagged the Oscar for Best Picture – a very nice victory for a great movie about investigative journalism. In fact, probably the best film about journalism since All The President’s Men came out in 1976.

If you haven’t seen it, you should: It’s a nicely nuanced look at the long, dogged process of investigative reporting, both wonderfully acted and directed, that features no car chases, meetings in dark carparks, or secret leaks – just hard work. And a spreadsheet.

And that’s probably one of the nicest things about the movie, at least for me. There’s a point in it when the reporters figure out that priests who have been caught molesting children are sent off for a period of “recovery” somewhere else – so now the team can, instead of looking for tips about abusive priests, start working the other way, by building a database of priests who have been warehoused for a year or so.

As Matt Carroll, one of the reporters on the story, notes in an essay on Medium:

It’s also wonderful because it shows the power of investigative journalism, through the tedious grind of slowly building a major story, thread by thread. One scene pays homage to the gritty work involved in building a spreadsheet of suspect priests. A spreadsheet, of all things! And the scene is great. (OK, so I’m biased: I was the data geek. But I still think the scene is fantastic.)

Hear, hear. So here’s to a great movie on investigative journalism where the star of the show is a spreadsheet. OK, so I’m stretching it. But it’s still a nice win for data journalism as well.

Welcome

(Re)Structuring Journalism explores the evolution of information in a digital age and how we need to fundamentally rethink what journalists do and what they produce.

And it proposes one possible solution: Structured Journalism.

About the author

Reg Chua has been a journalist for more than a quarter-century; he's currently Executive Editor, Editorial Operations, Data and Innovation at Thomson Reuters. Prior to that, he was Editor-in-Chief of the South China Morning Post and had a 16-year career at The Wall Street Journal.