The first meeting was a three-hour brainstorming session on “Improving Humanitarian Information for Affected Communities” organized in preparation for the second meeting on “The Unmet Need for Communication in Humanitarian Response,” which was held at the UN General Assembly.

The meetings presented an ideal opportunity for participants to share information on current initiatives that focus on communications with crisis-affected populations. Ushahidi naturally came to mind, so I introduced the concept of crowdsourcing crisis information. I should have expected the immediate pushback on the issue of data validation.

Crowdsourcing and Data Validation

While I have already blogged about overcoming some of the challenges of data validation in the context of crowdsourcing here, there is clearly more to add, since the demand for “fully accurate information” a.k.a. “facts and only facts” was echoed during the second meeting in the General Assembly. I’m hoping this blog post will help move the discourse beyond the black-and-white concepts that characterize current discussions on data accuracy.

Having worked in the field of conflict early warning and rapid response for the past seven years, I fully understand the critical importance of accurate information. Indeed, a substantial component of my consulting work on CEWARN in the Horn of Africa specifically focused on the data validation process.

To be sure, no one in the humanitarian and human rights community is asking for inaccurate information. We all subscribe to the notion of “Do No Harm.”

Does Time Matter?

What was completely missing from today’s meetings, however, was a reference to time. Nobody noted the importance of timely information during crises, which is rather ironic since both meetings focused on sudden-onset emergencies. I suspect that our demand (and partly Western obsession) for fully accurate information has clouded some of our thinking on this issue.

This omission is all the more striking given that evidence-based policy-making and data-driven analysis remain the exception rather than the rule in the humanitarian community. Field-based organizations frequently make decisions on coordination, humanitarian relief and logistics without complete, fully accurate, real-time information, especially right after a crisis strikes.

So why is this same community holding crowdsourcing to a higher standard?

Time versus Accuracy

Timely information when a crisis strikes is a critical element for many of us in the humanitarian and human rights communities. Surely then we must recognize the tradeoff between accuracy and timeliness of information. Crisis information is perishable!

The more we demand fully accurate information, the longer the data validation process typically takes, and thus the more likely the information will become useless. Our public health colleagues who work in emergency medicine know this only too well.

The figure below represents the perishable nature of crisis information. Data validation makes sense during time-periods A and B. Continuing to carry out data validation beyond time B may be beneficial to us, but hardly to crisis-affected communities. We may very well have the luxury of time. Not so for at-risk communities.
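To make the tradeoff concrete, here is a minimal sketch in Python of how one might model the decay of information value and decide whether validation is still worthwhile. The 24-hour half-life and the 48-hour cutoff for the end of time-period B are invented purely for illustration; the real curve varies by crisis and information type.

```python
import math

def information_value(hours_since_event, half_life=24.0):
    """Usefulness of a crisis report as a fraction of its initial value.

    Assumes, purely for illustration, that value halves every
    `half_life` hours; the real decay curve varies by crisis type.
    """
    return math.exp(-math.log(2) * hours_since_event / half_life)

def worth_validating(hours_since_event, validation_hours, end_of_period_b=48.0):
    """Validation makes sense only if it finishes before the end of
    time-period B (assumed here to be 48 hours after the event)."""
    return hours_since_event + validation_hours <= end_of_period_b

print(round(information_value(6), 2))   # 0.84 -- still highly useful
print(worth_validating(12, 24))         # True: validated data arrives in time
print(worth_validating(40, 24))         # False: too late for affected communities
```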

This point often gets overlooked when anxieties around inaccurate information surface. Of course we need to ensure that the information we produce or relay is as accurate as possible. Of course we want to prevent dangerous rumors from spreading. To this end, the Thomson Reuters Foundation clearly spelled out that their new Emergency Information Service (EIS) would disseminate facts and only facts. (See my previous post on EIS here).

Yes, we can focus all our efforts on disseminating facts, but are facts communicated after time-period B above really useful to crisis-affected communities? (Incidentally, since EIS will be based on verifiable facts, their approach may well be likened to Wikipedia’s rules for corrective editing. In any event, I wonder how EIS might define the term “fact”).

Why Ushahidi?

Ushahidi was created within days of the Kenyan elections in 2007 because both the government and national media were seriously under-reporting widespread human rights violations. I was in Nairobi visiting my parents at the time, and it was frustrating to see the majority of international and national NGOs on the ground suffering from “data hugging disorder,” i.e., they had no interest whatsoever in sharing information with each other, or with the public for that matter.

This left the Ushahidi team with few options, which is why they decided to develop a transparent platform that would allow Kenyans to report directly, thereby circumventing the government, media and NGOs, who were working against transparency.

Note that the Ushahidi team consists solely of tech experts. Here’s a question: why didn’t the human rights or humanitarian community set up a platform like Ushahidi? Why were a few tech-savvy Kenyans without a humanitarian background able to set up and deploy the platform within a week, while the humanitarian community was not? Where were we? Shouldn’t we be the ones pushing for better information collection and sharing?

In a recent study for the Harvard Humanitarian Initiative (HHI), I mapped and time-stamped reports of the post-election violence from the mainstream media, citizen journalists and Ushahidi. I then created a Google Earth layer of this data and animated the reports over time and space. I recommend reading the conclusions.

Accuracy is a Luxury

Having worked in humanitarian settings, we all know that accuracy is more often a luxury than a reality, particularly right after a crisis strikes. Accuracy is not black and white, yes or no. Rather, we need to start thinking in terms of likelihood, i.e., how likely is this piece of information to be accurate? All of us already do this every day, albeit subjectively. Why not think of ways to complement or triangulate our personal subjectivities to determine the accuracy of information?
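One simple way to triangulate is sketched below, under a strong independence assumption that rarely holds in the field: treat each source’s subjective reliability as a probability and compute the chance that at least one source is right. This is a crude heuristic, not a validated method.

```python
def triangulated_confidence(source_reliabilities):
    """Chance that at least one of several independent sources is correct.

    Reliabilities are subjective probabilities; the independence
    assumption is rarely true in practice, so treat this as a rough
    heuristic rather than a ground-truth measure.
    """
    prob_all_wrong = 1.0
    for p in source_reliabilities:
        prob_all_wrong *= (1.0 - p)
    return 1.0 - prob_all_wrong

# Two mediocre sources corroborating each other beat either one alone:
print(triangulated_confidence([0.6, 0.5]))  # 0.8
```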

At CEWARN, we included “Source of Information” for each incident report. A field reporter could select from several choices: (1) direct observation, (2) media, and (3) rumor. This gave us a three-point weighted scale that could be used in subsequent analysis.
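To illustrate how such a scale might feed into analysis (the numeric weights below are hypothetical, not CEWARN’s actual values):

```python
# Hypothetical weights for the three "Source of Information" codes.
SOURCE_WEIGHTS = {
    "direct_observation": 3,
    "media": 2,
    "rumor": 1,
}

def average_source_weight(source_codes):
    """Rough confidence indicator: the mean source weight across all
    incident reports describing the same event."""
    return sum(SOURCE_WEIGHTS[code] for code in source_codes) / len(source_codes)

print(average_source_weight(["direct_observation", "media", "rumor"]))  # 2.0
```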

At Ushahidi, we are working on Swift River, a platform that applies human crowdsourcing and machine analysis (natural language parsing) to filter crisis information produced in real time, i.e., during time-periods A and B above. Colleagues at WikiMapAid are developing similar solutions for data on disease outbreaks. See my recent post on WikiMapAid and data validation here.
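I am not describing Swift River’s actual algorithm here, but a toy triage score along the following lines shows how source credibility and freshness could be combined to prioritize incoming reports during time-periods A and B. All constants are invented for illustration.

```python
import math

def triage_score(source_weight, age_hours, corroborations,
                 half_life=24.0, corroboration_bonus=0.15):
    """Toy priority score for an incoming crisis report.

    Combines a normalized source weight (e.g. the three-point scale
    above), exponential time decay, and a bonus for independent
    corroborating reports. The formula and constants are illustrative
    assumptions, not Swift River's actual logic.
    """
    freshness = math.exp(-math.log(2) * age_hours / half_life)
    credibility = min(1.0, source_weight / 3.0 + corroboration_bonus * corroborations)
    return credibility * freshness

# A fresh rumor corroborated twice can outrank a day-old media report:
print(round(triage_score(source_weight=1, age_hours=2, corroborations=2), 2))   # ~0.6
print(round(triage_score(source_weight=2, age_hours=24, corroborations=0), 2))  # ~0.33
```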

Conclusion

In sum, there are various ways to rate the likelihood that a reported event is true. But again, we are not looking to develop a platform that ensures 100% reliability. If full accuracy were the gold standard of humanitarian response (or military action for that matter), the entire enterprise would come to a grinding halt. The intelligence community has also recognized this, as I have blogged about here.

The purpose of today’s meetings was for us to think more concretely about communication in crises from the perspective of at-risk communities. Yet, as soon as I mentioned crowdsourcing, the discussion shifted to our own demand for fully accurate information, with no concern raised about the importance of timely information for crisis-affected communities.

After a couple of years’ experience as head of disaster relief for an NGO in Indonesia, I am sad to admit that I am unsurprised by the organisation’s reluctance to rely on crowdsourcing, despite the obvious benefits. Man’s natural response to crisis is often to make risk minimisation, rather than result-orientation, the guiding principle, which unfortunately results in timidity.

No probs. I found myself similarly frustrated many times ;0)
I only discovered your blog recently, as I have been musing on a universal needs assessment software platform and had some ideas regarding data integrity. Like you, I encountered different grades of data sources in the field (trained observers; untrained observers; rumours). In my system, trained observers’ assessments would enter the universal map for dissemination to all other users immediately. Untrained observers’ assessments and rumours would be similarly disseminated, but in a separate ‘holding’ layer until confirmed, moderated or rejected by a trained observer. Theoretically, this records all available information, avoids unnecessary doubling up of trained observers’ time, directs resources to possible gaps and maintains the universal map’s integrity. Any thoughts?
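A minimal sketch of how I imagine the layers working, with all names and types hypothetical:

```python
from enum import Enum
from typing import Optional

class SourceGrade(Enum):
    TRAINED = "trained observer"
    UNTRAINED = "untrained observer"
    RUMOUR = "rumour"

class Layer(Enum):
    UNIVERSAL = "universal map"   # disseminated to all users
    HOLDING = "holding layer"     # visible but awaiting moderation

def initial_layer(grade: SourceGrade) -> Layer:
    """Trained observers publish straight to the universal map;
    all other reports wait in the holding layer."""
    return Layer.UNIVERSAL if grade is SourceGrade.TRAINED else Layer.HOLDING

def moderate(confirmed: bool) -> Optional[Layer]:
    """A trained observer either promotes a held report to the
    universal map or rejects it (None drops the report)."""
    return Layer.UNIVERSAL if confirmed else None
```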