A team of researchers from 7 countries is building an open-source tool to help verify claims on Twitter

Social media newsgathering and verification are no longer novel practices in the newsroom. But even though many publishers now have a person or a team of reporters tasked with monitoring conversations on these platforms and checking the accuracy of what is being said, there have still been instances of false rumours or misrepresented facts spreading online when news breaks.

A team of researchers, developers and journalists is hoping to solve this through the EU-funded project Pheme, an open-source dashboard they are currently building to help newsrooms detect, track and verify facts and claims the moment they start spreading on Twitter.

The project, which began in January 2014, aims to be completed early next year, when a working prototype should be made available to the public. Its name references Pheme, the Greek goddess of fame and rumours.

The idea for Pheme first came about when a group of academics at the University of Warwick were examining the social media conversations during the 2011 riots in England, explained SWI swissinfo's Geraldine Wong Sak Hoi, who works on the project.

"Professor Rob Procter at Warwick University was actually involved in a project with the Guardian and the London School of Economics called 'Reading the riots', where they tried to chart the evolution of social media claims and verify the rumours they were able to find about what was going on in various cities such as London.

"What they identified is that during major breaking news events, a lot of people will go to Twitter to try to find out what's going on, but also to contribute to what's happening.

"At the time, plenty of people believed the London Eye was on fire for example, so how do we solve that problem for journalists and for anyone who is interested in getting the facts?"

The Pheme dashboard uses natural language processing (NLP) to analyse social media messages, with algorithms designed to identify trends and patterns in the discussions and, ultimately, to assess whether or not the claims being made are true.

The user will be able to search for specific news events by keyword, author or location, and Pheme will display the results from Twitter, evaluating in real time how likely they are to be true based on a variety of factors.

"We have a set of past rumours, which are annotated manually by journalists as to whether they are true, false, or unconfirmed," Kalina Bontcheva, senior researcher in the department of computer science at University of Sheffield, told Journalism.co.uk via email.

"Each of the tweets pertaining to a rumour is annotated as supporting, denying, or questioning that rumour, such as that the London Eye was on fire during the London riots.

"NLP methods are then used to identify automatically linguistic information such as negations and positive or negative sentiment words, names of people, locations, and organisations, as well as part of speech information and basic sentence structure. In addition, we take into account whether a tweet contains emoticons, URLs, and hashtags."

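As a rough illustration of the surface features Bontcheva describes, the sketch below counts negations and sentiment words in a tweet and flags emoticons, URLs and hashtags. It is not Pheme's code: the function name `extract_features`, the tiny word lists and the regular expressions are all assumptions made for this example, and the named-entity and part-of-speech steps, which would need a full NLP library, are left out.

```python
import re

# Hypothetical, deliberately tiny word lists used only for illustration;
# a real system would rely on proper sentiment and negation lexicons.
NEGATIONS = {"not", "no", "never", "isn't", "wasn't", "don't", "didn't"}
POSITIVE_WORDS = {"safe", "good", "great", "relieved"}
NEGATIVE_WORDS = {"fire", "terrible", "scary", "awful"}

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
EMOTICON_RE = re.compile(r"[:;=]-?[)(DP]")  # covers simple cases such as :) ;-) =D
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def extract_features(tweet: str) -> dict:
    """Return simple surface features for one tweet."""
    urls = URL_RE.findall(tweet)
    # Remove URLs first so their punctuation is not mistaken for an emoticon.
    text = URL_RE.sub(" ", tweet)
    tokens = WORD_RE.findall(text.lower())
    return {
        "negations": sum(t in NEGATIONS for t in tokens),
        "positive_words": sum(t in POSITIVE_WORDS for t in tokens),
        "negative_words": sum(t in NEGATIVE_WORDS for t in tokens),
        "has_url": bool(urls),
        "has_emoticon": bool(EMOTICON_RE.search(text)),
        "hashtags": HASHTAG_RE.findall(text),
    }


features = extract_features(
    "The London Eye is NOT on fire :) http://example.com/photo #londonriots"
)
print(features)
```

For the example tweet, the sketch finds one negation, one word from the hypothetical negative list, a URL, an emoticon and the hashtag `#londonriots`. Features of this kind would feed a trained classifier rather than decide anything on their own.
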
Once the tweet author's position with regard to the claim has been established, and information about the author has been extracted, an algorithm will attempt to determine the tweet's overall veracity.

"This may not always be possible with high confidence, in which case, the journalists themselves can make that judgement based on the automatically collected evidence," Bontcheva said.

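The hand-off Bontcheva describes, where the system offers a verdict only when it is confident and otherwise leaves the call to a journalist, can be sketched as follows. This is a simplified stand-in, not Pheme's algorithm: it assumes every tweet has already been labelled as supporting, denying or questioning, it treats the majority stance as a proxy for truth (a naive assumption), and the function name `assess_rumour` and the 0.7 threshold are invented for the example.

```python
from collections import Counter


def assess_rumour(stances: list[str], threshold: float = 0.7) -> tuple[str, float]:
    """Turn per-tweet stance labels into a tentative verdict.

    `stances` holds one of "supporting", "denying" or "questioning" per tweet.
    Returns (verdict, confidence); the verdict is "unverified" when confidence
    falls below the (hypothetical) threshold, leaving the call to a journalist.
    """
    counts = Counter(stances)
    total = len(stances)
    if total == 0:
        return ("unverified", 0.0)

    supporting = counts["supporting"]
    denying = counts["denying"]
    # Questioning tweets stay in the denominator, so they lower confidence.
    confidence = max(supporting, denying) / total
    if confidence < threshold:
        return ("unverified", confidence)
    verdict = "likely true" if supporting > denying else "likely false"
    return (verdict, confidence)


labels = ["denying"] * 8 + ["supporting", "questioning"]
print(assess_rumour(labels))
```

With eight denials out of ten tweets, the sketch reports the rumour as likely false. An evenly split conversation would come back as unverified, leaving the judgement to the journalist.
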
Pheme will also include an indicator of trustworthy accounts, drawn from manually curated lists of Twitter users such as scientists and journalists.

However, because a person starting a rumour may not have been involved in similar activity before, it can be tricky to provide a definitive list, Bontcheva added, so the team is now considering whether this data should be provided as part of the public prototype at the end of the project.

A working prototype of the dashboard will be released internally by the team at the end of June, to be tested by SWI swissinfo. It will ultimately be free for individuals or news organisations, but the creators are considering potential paid-for features.