Pheme and the fight against fake news

Facebook, Google and, most recently, the Décodex of the newspaper Le Monde are some of the projects being developed to fight rumors and false information. Launched in January 2013, the Pheme project brings together researchers, journalists and European experts. It is the oldest of these initiatives and the least well known. Nevertheless, its goals are very ambitious: developing, on the one hand, algorithms that can automatically detect rumors and, on the other, tools for tracing, analyzing and verifying information.

Designed to serve journalists and physicians, these tools could offer the news world some breathing room. As Kalina Bontcheva, head of the project, points out, we face a “race between machines and people who make up false information for fun, for political purposes or for money”. It is a race that often leaves journalists out of breath, caught between breaking news and the budget restrictions imposed on the press.

One of the ambitious tools Pheme aims to develop would trace implicit links between articles, posts and Twitter conversations that refer to the same information but contain, for example, contradictory details. To do this, the tracing will rely not on the precise order of the words but on a more flexible structure: the grammatical relationships between words.

As a result, a tweet stating a fact (be it true or false) will not be analyzed by Pheme in isolation, but as part of a sequence, within a network of related messages. In this way, Pheme would give its users indications of the veracity of the analyzed information, providing, for instance, the time and place in which it was expressed, or how reliable its authors are, based on their status and past publications.
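The idea of linking messages through grammatical relationships rather than word order can be illustrated with a minimal sketch. This is not Pheme's actual method: the (head, relation, dependent) triples below are written by hand for illustration, whereas a real system would obtain them from a parser, and the function name `relation_overlap` is hypothetical.

```python
def relation_overlap(relations_a, relations_b):
    """Share of grammatical relations two messages have in common (Jaccard index)."""
    if not relations_a and not relations_b:
        return 0.0
    return len(relations_a & relations_b) / len(relations_a | relations_b)


# Hand-written triples, assumed for illustration only.
# "The London Eye is on fire"
tweet_a = {("on fire", "subject", "london eye")}
# "On fire: the London Eye" (different word order, same relation)
tweet_b = {("on fire", "subject", "london eye")}
# "The London Eye is not on fire" (same relation plus a negation)
tweet_c = {("on fire", "subject", "london eye"), ("on fire", "negation", "not")}

same_claim = relation_overlap(tweet_a, tweet_b)
contradicting_claim = relation_overlap(tweet_a, tweet_c)
```

In this toy case the two differently worded tweets share all of their relations, while the denial shares the core relation but adds a negation, which is the kind of contradicting detail the tracing tool is meant to surface.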

From the London riots to project Pheme

The Pheme project was conceived in the wake of the 2011 London riots. At the time, the Guardian and the London School of Economics teamed up on “Reading the Riots”, a social research study that sought to understand the role of social media during these events.

“A fake photograph appearing to show the London Eye on fire, which circulated on Twitter during the August riots (source: The Guardian)”

Rob Procter, now a professor in the Department of Computer Science at the University of Warwick, and his team analyzed more than 2.5 million tweets posted during the riots. This work allowed them to study the diffusion of seven rumors that spread through Twitter. Following this research, the Guardian published a visual interface that lets readers follow, on an interactive timeline, the progression of the tweets confirming, contradicting, questioning or commenting on each of these rumors.

“The project with the Guardian gave us an idea of the meanings people give to information they receive from social platforms such as Twitter,” Rob Procter says. “It suggested that we could use the information that we gathered to teach machines to predict automatically, or semi-automatically, the veracity of the information on social networks.”

To reach the results published in the Guardian, the team had to sort the data manually, Rob Procter specifies. Within the Pheme project, this procedure has been automated so that computers can classify messages that confirm, oppose, question or comment on a piece of information.
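To make the four categories concrete, here is a deliberately naive sketch of such a classifier. It is not the method used by Pheme, which relies on machine learning: the cue words, the function names `classify_reply` and `tally_thread`, and the sample messages are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical cue words, chosen only for illustration.
DENY_PATTERN = re.compile(r"\b(fake|false|untrue|hoax|not true)\b")
SUPPORT_PATTERN = re.compile(r"\b(confirmed|true|official)\b")


def classify_reply(text):
    """Assign one of the four categories to a message about a rumor."""
    lowered = text.lower()
    if DENY_PATTERN.search(lowered):
        return "opposes"
    if "?" in lowered:
        return "questions"
    if SUPPORT_PATTERN.search(lowered):
        return "confirms"
    return "comments"


def tally_thread(messages):
    """Count how many messages fall into each category."""
    return Counter(classify_reply(message) for message in messages)


# Invented sample messages about a single rumor.
thread = [
    "This photo is fake, the London Eye is fine",
    "Is the London Eye really on fire?",
    "Confirmed by police",
    "Wow, what a night",
]
counts = tally_thread(thread)
```

Keyword rules like these break down quickly on sarcasm, quotations or indirect phrasing, which is one reason to learn the categories from examples sorted by hand instead.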

A multidisciplinary team

Behind Pheme is a team of academics, experts and journalists coordinated by the University of Sheffield. Drawn from the fields of machine learning, data mining, journalism, human-computer interaction and visualization, they meet face-to-face two or three times a year and collaborate regularly through Skype. Since its launch in 2013, the project has been funded by the European Commission.

To develop models for verifying data, the researchers have collected data from past cases in which rumors proliferated on social media, such as the Germanwings crash or the Ottawa shooting. “In such news stories, we expect to find erroneous information, misinterpreted information, or hoaxes surrounding the real events,” Rob Procter explains.

The swissinfo.ch teams have helped researchers understand the work and needs of journalists. “Our role has also been to provide deep analysis of the rumors that circulated after events such as the Charlie Hebdo shooting, or the Germanwings crash,” adds Geraldine Wong Sak-Hoi, a swissinfo representative.

Together with the researchers, the journalists have developed protocols for verifying and sorting data. Once the data had been sorted, analyzed and verified by human experts, machine learning algorithms were run to reproduce their work. The objective is for the algorithms to reach the same conclusions as the researchers before they are applied to new data.
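Checking whether an algorithm reaches the same conclusions as the human experts can be sketched as a simple agreement measure. The function names, the labels and the 0.9 threshold below are hypothetical, chosen for illustration; the article does not say which measure or threshold the project uses.

```python
def agreement_rate(human_labels, machine_labels):
    """Fraction of items on which the algorithm matches the human experts."""
    if len(human_labels) != len(machine_labels):
        raise ValueError("both label lists must have the same length")
    if not human_labels:
        raise ValueError("at least one labeled item is required")
    matches = sum(h == m for h, m in zip(human_labels, machine_labels))
    return matches / len(human_labels)


def ready_for_new_data(human_labels, machine_labels, threshold=0.9):
    """Hypothetical gate: only use the algorithm on new data above the threshold."""
    return agreement_rate(human_labels, machine_labels) >= threshold


# Invented labels for four messages.
human = ["confirms", "opposes", "questions", "comments"]
machine = ["confirms", "opposes", "comments", "comments"]

rate = agreement_rate(human, machine)
ready = ready_for_new_data(human, machine)
```

Here the algorithm disagrees with the experts on one message out of four, so under the assumed threshold it would not yet be applied to new data.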