Signals of churnalism?

On Friday I had quite a bit of fun with Churnalism.com, a new site from the Media Standards Trust which allows you to test how much of a particular press release has been reproduced verbatim by media outlets.

The site has an API, which got me thinking whether you might be able to ‘mash’ it with an RSS feed from Google News to check particular types of articles – and what ‘signals’ you might use to choose those articles.

“According to research”, “research published today” and “according to a new report”

And of course there is “A press release said”.

Signal – or sign?

The idea kicked off a discussion on Twitter on whether certain phrases were signals of churnalism, or just journalistic cliches. The answer, of course, is both.

By brainstorming for ‘signals’ I wasn’t arguing that any material using these phrases would be guilty of churnalism – or even the majority – just that they might represent one way of narrowing your sample. Once you have a feed of stories containing “Revolutionary new” you can then use the API to test what proportion of those articles are identical to the text in a press release – or another news outlet.

The signal determines the sample, the API calculates the results.
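As a rough illustration of that split, here is a minimal Python sketch. It assumes you already have the article texts from a feed, and it deliberately does not call the Churnalism.com API: `overlap_score` is a hypothetical stand-in for whatever function you write to ask the API how much of an article matches a press release, and the 0.5 threshold is an arbitrary assumption.

```python
# Signal phrases taken from the post; the list is illustrative, not exhaustive.
SIGNALS = [
    "according to research",
    "research published today",
    "according to a new report",
    "a press release said",
    "revolutionary new",
]


def matches_signal(text, signals=SIGNALS):
    """Return True if the text contains any signal phrase (case-insensitive)."""
    lowered = text.lower()
    return any(signal in lowered for signal in signals)


def churn_rate(articles, overlap_score, threshold=0.5):
    """Share of signal-matching articles that score at or above the threshold.

    overlap_score is a hypothetical callable (article text -> number between
    0 and 1) standing in for a real call to the Churnalism API.
    """
    # The signal determines the sample...
    sample = [article for article in articles if matches_signal(article)]
    if not sample:
        return 0.0
    # ...and the scoring function calculates the results.
    flagged = sum(1 for article in sample if overlap_score(article) >= threshold)
    return flagged / len(sample)
```

Running `churn_rate` once on a signal-filtered feed and once on a general feed (with a filter that accepts everything) would give the two proportions to compare.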

Indeed, there’s an interesting research project to be done – perhaps using the Churnalism API – on whether articles containing the phrases above are more likely to include passages copied wholesale from press releases than a general feed of stories from Google News.

(Another research project might involve looking at press releases to identify common phrases used by press officers that might be used with the API.)

You may have another opinion of course – or other phrases you might suggest?

7 thoughts on “Signals of churnalism?”

Hadn’t seen Churnalism – what a great site. I do a lot of work on student plagiarism in higher education, so it’s interesting to see how this site gives a good demonstration of the recycling of information online. When students copy from news articles, you can never be sure that the source highlighted by Turnitin (plagiarism detection software) is the source that the student used, as it’s obvious there is so much recycling of news stories from press releases. Would your idea of looking at signals therefore also compare articles to each other to find those which had copied from each other (i.e. closer matches to one another than to the original press release)?

Finding news articles which are derived from press releases.
Finding the press release from which these news articles were derived.

These problems can be shortcut if you have the original press release. Put it into churnalism.com and you get the news articles. However, if you don’t, how do you find the likely articles? Your keyphrase approach is one technique. Another might be to put every news article through the search process and find the sets of articles which have common windows of text. From this subset, find the windows which are common to all the articles and you very likely have part of the text which made up the original press release. These sections are then ideal Google fodder for finding the press release itself: just append the term “press release” to the query. You can do this manually by looking for quotes which look like they came from a “spokesman”, but it is something that could very well be automated.
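The “common windows of text” step could be sketched roughly like this in Python. It is only one way of doing it, under assumptions of my own: a window is taken to be a run of five consecutive words, matching ignores case and punctuation, and the function names are made up for the illustration.

```python
import re
from collections import Counter


def windows(text, size=5):
    """Return the set of all runs of `size` consecutive words in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def shared_windows(articles, size=5, min_articles=2):
    """Return windows that appear in at least `min_articles` different articles."""
    counts = Counter()
    for article in articles:
        # windows() returns a set, so each article counts a window only once.
        counts.update(windows(article, size))
    return sorted(" ".join(w) for w, n in counts.items() if n >= min_articles)
```

Setting `min_articles` to the number of articles in a set gives the windows common to all of them, which are the candidates for a Google search with “press release” appended.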

It seems users of churnalism.com have very strong hunches that some articles are press release derived and they use the article as the search text. Perhaps this is something that should be encouraged!

I can see why you’d want to use potential phrases as possible signs of churnalism, but none can possibly be proof of churnalism. Looking at the list of phrases, only ‘a survey said’ would probably be proof of churnalism. I know you’ve made this point, but if the purpose of flagging up churnalism is to improve journalism overall, wrongly suggesting that articles are churnalism when they are not will only make it harder for people to take the concept seriously.

Like you say, it’s a point I make in the post: the exercise was to identify potential RSS feeds that could be used with the Churnalism.com API to find specific instances where the majority of text was copied. So there would be no “wrongly suggesting that articles are churnalism” – the RSS feed would not be seen by anyone.

I think the concept of the site is fantastic: if you’re having work done for you, it’s a good way of checking that you’re getting something original. It might push journalists to think harder about what they’re writing, instead of falling back on clichés as a safety net. But of course there will be problems with certain things that can only be phrased a finite number of ways. It serves as a good indication of originality though!