What happens when technology can do great things for humanity, but doesn't make a lot of money? Jim Fruchterman explores the social entrepreneurship side of technology applications: how to get great tech tools to the people who often need them the most, but are least able to afford them!

Issues with Crowdsourced Data Part 2

A recent guest Beneblog post explained why we believe the correlation that researchers found between SMS text messages and building damage was not useful. Some of the questions we received made us realize we need to be clearer about why this is important: Why did we bother analyzing this claim? Why does it matter? Thanks to Patrick Ball, Jeff Klingner and Kristian Lum for contributing this material (and making it much clearer).

We’re reacting to the following claim: “Data collected using unbounded crowdsourcing (non-representative sampling) largely in the form of SMS from the disaster affected population in Port-au-Prince can predict, with surprisingly high accuracy and statistical significance, the location and extent of structural damage post-earthquake.”

While this claim is technically correct, it misses the point. If decision makers simply had a map, they could have made better decisions more quickly, more accurately, and with less complication than if they had tried to use crowdsourcing. Our concern is that if in the future decision makers depend on crowdsourcing, bad decisions are likely to result -- decisions that impact lives. So, we’re speaking up.

In the comments on our last post, Jon from Ushahidi said, "If a tool's fitness cannot be absolute, then neither can it's fallibility," and argued that the correlation they found was useful. Why is this something worth arguing about?

Misunderstanding relationships in data is a problem because it can lead decision makers to choose less effective, more expensive data sources over obvious, more accurate starting points. The correlation found in Haiti is an example of a confounding factor: a correlation was found between building damage and SMS streams, but only because both were correlated with the simple existence of buildings. The correlation between the SMS feed and the building damage is thus an artifact, a spurious correlation. Here are two other examples of confounding effects.

- Children's reading skill is strongly correlated with their shoe size -- because older kids have bigger feet and tend to read better. You wouldn't measure all the shoes in a classroom to evaluate the kids' reading ability.

- Locations with high rates of drowning deaths are correlated with locations with high rates of ice cream sales because people tend to eat ice cream and swim when they're at leisure in hot places with water, like swimming pools and seasides. If we care about preventing drowning deaths, we don't set up a system to monitor ice cream vendors.
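The confounding effect in these examples can be made concrete with a small simulation. The numbers below are entirely hypothetical (they are not the Haiti data): building damage and SMS volume are each generated from a cell's building count plus independent noise, so neither causes the other, yet they correlate strongly -- and the correlation vanishes once we control for building count.

```python
# Hypothetical simulation of a confounder: damage and SMS volume both
# depend on how many buildings a grid cell contains, but not on each other.
import random

random.seed(42)

n = 1000
buildings = [random.randint(0, 200) for _ in range(n)]        # confounder
damage = [0.5 * b + random.gauss(0, 10) for b in buildings]   # buildings + noise
sms = [0.3 * b + random.gauss(0, 10) for b in buildings]      # buildings + noise

def pearson(x, y):
    m = len(x)
    mx, my = sum(x) / m, sum(y) / m
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

r_ds = pearson(damage, sms)        # the spurious correlation
r_bd = pearson(buildings, damage)
r_bs = pearson(buildings, sms)
# Partial correlation of damage and SMS, controlling for building count:
partial = (r_ds - r_bd * r_bs) / ((1 - r_bd**2) * (1 - r_bs**2)) ** 0.5
print(r_ds)     # strong positive correlation
print(partial)  # near zero once the confounder is removed
```

The damage/SMS correlation is high even though the simulation contains no causal link between them; conditioning on the confounder (building count) drives it to roughly zero.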

We're particularly concerned because we think that using an SMS stream to measure a pattern is probably at its best in a disaster situation. When there's a catastrophe, people often pull together and help each other. If an SMS stream was ever going to work as a pattern measure, it was going to be in a context like this -- and it didn't work very well. We don't think that SMS was a good measure of building damage relative to the obvious alternative of using a map of building locations.

The problems will be much worse if SMS streams are used to try to measure public violence. In these contexts, the perpetrators will be actively trying to suppress reporting, and so the SMS streams will not just measure where the cell phones are, they'll measure where the cell phones that perpetrators can't suppress are. We'll have many more "false negative" zones where there seems to be no violence, but there's simply no SMS traffic. And we'll have dense, highly duplicated reports of visible events where there are many observers and little attempt to suppress texting.

In the measurement of building damage in Port-au-Prince, there were several zones where there was lots of damage but few or no SMS messages ("false negatives"). This occurred when no one was trying to stop people from texting. The data will be far more misleading when the phenomenon being measured is violence.
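A toy example with made-up numbers shows how false negatives mislead: if responders rank areas by SMS volume, a heavily damaged cell whose observers can't text (towers down, or reporting suppressed) looks like the lowest priority.

```python
# Illustrative only: five hypothetical grid cells where SMS volume is used
# as a proxy for damage. A damaged cell with no working coverage produces
# a "false negative" -- it looks fine in the SMS feed.
cells = [
    {"name": "A", "damage": 90, "sms": 40},  # damaged, well connected
    {"name": "B", "damage": 85, "sms": 0},   # damaged, no SMS -> false negative
    {"name": "C", "damage": 10, "sms": 35},  # light damage, many observers
    {"name": "D", "damage": 80, "sms": 2},   # damaged, little reporting
    {"name": "E", "damage": 5,  "sms": 1},
]

# Prioritize response by SMS volume (the proxy)...
by_sms = [c["name"] for c in sorted(cells, key=lambda c: -c["sms"])]
# ...versus by actual damage (what a ground survey would show).
by_damage = [c["name"] for c in sorted(cells, key=lambda c: -c["damage"])]

print(by_sms)     # ['A', 'C', 'D', 'E', 'B'] -- B ranked last despite heavy damage
print(by_damage)  # ['A', 'B', 'D', 'C', 'E']
```

Cell B, the second-most damaged area, falls to the bottom of the SMS-based priority list; a map of building locations would at least have flagged it as a place worth checking.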

As we've said in each post, crowdsourcing generally and SMS traffic in particular is great for documenting specific requests for help. Our critique is that it's not a good way to generate a valid basis for understanding patterns.

Comments

Two points to stress here: 1) crowdsourcing isn't typically methodologically rigorous; 2) people assure information quality. While technology can aid, no automated rules can give you accuracy. Accuracy is a measure that describes judgments comparing representations to what they describe in the real world, and those judgments are made by humans. People produce, assess, and correct for accuracy.

You're describing a circumstance similar to Australia's initiative to make the primary data collected for research reusable for other research. The methodology for producing this primary data (including relevant software and metadata such as definitions and specifications) has to be shared and reported in a way that allows its integrity to be tested, and the data's fitness for purpose needs to be assessed. Data usually isn't collected to be fit for *all* purposes -- doing so generally requires a common understanding of stakeholder requirements, since people will otherwise produce information only in a manner suited to the function they're most interested in or exposed to. Making this kind of common understanding apply universally, across all enterprises, is much trickier. If the quality of information in a set were assessed, we'd have metrics for understanding its fitness for particular purposes.

And the participants in crowdsourcing (that I've seen) aren't typically wedded to a practice of managing information as a shared resource in the way members of an established enterprise can be. A practice of assessing and giving feedback could probably improve crowdsourcing despite this gap, but establishing the culture for understanding and meeting downstream stakeholder needs is more problematic in that scenario. Not entirely impossible, but I am unaware of any efforts to apply this perspective to crowdsourcing.

However: 1) as I said, information quality is produced by people; and 2) we should be able to do distributed information capture with a proper understanding of this fact. Most of the troubles with information quality come from a misunderstanding whereby people tend to expect automation to assure quality.

You improve an information production process at the point of capture (as well as through downstream automated processing): as you detect non-quality, you prevent it and "build quality into" the process. The alternative is redundant scrap and rework -- repeatedly fixing already-captured information from a process that continues to produce non-quality.

Finally, this understanding of the role of people as information producers -- the value of whose work is understood in terms of measurable quality characteristics of information -- is what I see as the eventual development of the nature of work in the information economy. We will sell quality -- accuracy, timeliness, completeness, common understanding, scalability, etc. -- not "content."

Here's a link describing the Australian primary data project that I mention in my comment (and which I included there, but it's not visible or accessible on the main blog's rendering of my comment): http://www.theaustralian.com.au/higher-education/project-aims-to-reuse-stored-primary-data/story-e6frgcjx-1226022044954

Although I agree with your general contention that crowdsourced data is not a valid replacement for good sampling and data analysis, I am a bit troubled by the realities of military/disaster operations that you gloss over. It is all well and good to say "just look at a satellite image" or "just look at a map" to determine where the likely damage is going to occur... it is really not that simple. I worked in the Army HQ in Iraq for a year, and acquiring satellite imagery of an area, or even up-to-date maps, was next to impossible without some very high-ranking support behind you. I think that crowdsourced data, as you mention, is one piece of a huge jigsaw puzzle, which a coordinator only has a vague understanding of in the first place.

Proper data collection and analysis is a personal project of mine -- the year spent working in Iraq was largely composed of questioning "surveys" conducted by local companies that claimed representative sampling but would never actually give us specifics as to how those samples were collected. But I think that at some level there has to be a compromise between the ideal (real-time satellite imagery, a true SRS of any population, etc.) and the reality...

Thanks for these comments: they make me think there's another blog post here to articulate this a third way. Each of these interchanges lets us try out different ways of zeroing in on our point and more clearly excluding things that are not our point. More to come!

Jim, as usual, I agree with your points. I'm a data geek, so the point you make about an anecdotal use case versus a more dependable one is spot on.

But again, collecting data via crowdsourcing takes many shapes. Sending surveyors out to the field with pens and pads is what might now be referred to as 'bounded crowdsourcing', which would not have to be public, would not have to be chaotic, and would not have to be contributed by unvetted (or untrained) individuals. In such a use case, tools used for "crowdsourcing" simply become tools used for data collection.

As it's been said, "information quality is produced by people." That said, platforms like Ushahidi serve as more than just tools for collection; the methodology is often the mode for delivering a different message -- one that is hard to quantify. That message is often: "You, the observer, the victim, or the person empathetic to the cause, have a voice -- you are participating."

This is why we encourage the use of Ushahidi in combination with more quantitative tools like your own, or, for instance, ArcGIS (for more advanced mapping). While crowdsourcing alone *can* be used for decision-based campaigns, it rarely is, and we don't take the stand that it should be. The recent Disaster 2.0 report shows exactly why it wasn't, and the people who are indeed making decisions can see how lowering the bar for local participation offers a different type of strategic leverage than more accurate data alone. So I'll agree with you by using one of my favorite quotes: "More isn't better. Better is better." Yet better isn't always the sole objective.

Thanks for your comments, Jon. This crowdsourcing-centric view of the world is getting kind of silly, though.

You are stretching the use of the term crowdsourcing way beyond its accepted meaning here. I’m sure that many statisticians will be surprised to find that “running a survey” is now going to be called “bounded crowdsourcing!” From Wikipedia: “Crowdsourcing is the act of outsourcing tasks, traditionally performed by an employee or contractor, to an undefined, large group of people or community (a 'crowd'), through an open call.” Doesn’t sound like running a survey with surveyors to me. I don't think we should keep stretching to try to make well-established approaches sound like a new branch of crowdsourcing that we just figured out!

Tools like pads of paper, PCs, landline telephones and SMS-capable phones aren't crowdsourcing tools that simply become data collection tools, as you suggest. They are general-purpose tools used frequently for data collection that occasionally get used for crowdsourcing.

I do agree with you on the empathetic nature of SMS texts being made more widely available, if the objective is to get people to care more, or to hear from people whose messages are often ignored or suppressed through other channels. As long as the personal security of the sender isn't threatened by sending a text, I think it's a good thing.

And I do agree with you that more isn't better. The goal of collecting data usually is to make better decisions, in real time and/or at a later date.
