Stories on the conflict in Yemen rarely make it to the mainstream media. Unlike in the wars in Syria and Iraq, none of the parties fighting in Yemen release official information on bombings, which makes it difficult for journalists to source facts.

What little coverage exists is often plagued by misinformation. The independent Yemen Data Project aims to tackle that by collecting and disseminating data on Saudi-led coalition airstrikes, helping news outlets widen their reporting.

“It’s the greatest data project you’ve never heard of,” said Iona Craig, a freelance journalist and adviser at the Yemen Data Project, at the CIJ Logan Symposium last week (20 October).

“We don’t do advocacy – it’s about transparency and the fact that data can be used to hold the parties of the conflict accountable, because there is no independent monitoring or data collection going on.”

In the absence of official data, the Yemen Data Project is entirely based on open-source information, including that from the Armed Conflict Location and Event Data Project – a data and analysis source on political violence and protest in the developing world.

The Yemenis working on the project use information from local news outlets, as well as social media and instant messaging apps.

To counter any bias, they then cross-reference information with media outlets that are more closely aligned with the other side of the conflict, as well as with human rights organisations and NGOs on the ground.

Craig explained that the data collectors risk their lives to create a public record that would otherwise simply not exist, and that their identities must remain secret.

It is difficult for international journalists to report independently from the coalition-controlled territory without being embedded or going in with local fighters. Similarly, foreign journalists cannot access the rebel-controlled territory without local fixers.

“Yemen was probably one of the most under-reported places in the Middle East, even during the Arab Spring,” said Craig.

“Journalists have lived in Syria, or Lebanon, so even when they had to report remotely, when it was not safe to be there, they had contacts and they knew the context very well.

“But for Yemen, the basic knowledge was not there.”

Although it has collected data on Saudi-led coalition airstrikes since 2015, the Yemen Data Project was only formally named in mid-2016, when its team wanted to publicise the database and hand the raw dataset to the Guardian, which then spent three months analysing it and developing stories.

The initiative only started to receive funding at the end of last year, but is now seen as a legitimate source. Its data is used by major media outlets such as the Washington Post and the Times, as well as by NGOs, and has been cited in the British Parliament and the U.S. Congress.