Mar 4, 2011

Crowdsourcing Genealogy

Ancestry Insider has a great post today on what he calls laissez faire indexing. That kind of collaboration is common practice in the Web 2.0 era, where it's generally referred to as crowdsourcing. But I like Insider's term too, as it describes how the phenomenon actually works: individuals' self-interest benefits the whole community. Economists have known this for a couple of centuries (although you couldn't tell from our government's economic policies). It's the same principle behind Wikipedia and many other successful websites, and it's exactly the kind of collaboration we need more of, which is why it comes up so often at RootsTech.

Insider doesn't mention it in the post, but Ancestry.com also does a form of this (besides Member Connect and their World Archives Project). Many of their records must be OCRed, so misspellings abound (of course, there were already plenty in the original records). I find records about my relatives despite the misspellings, and the site provides a way to submit corrections. So as I work on my family tree and find mistakes, I input corrections. Eventually, they show up as alternate spellings on the View Record page, and Ancestry.com periodically sends me a nice e-mail thanking me for my contributions. Granted, the results usually aren't publicly available (only subscribers see them), but it works on the same principle, and their customers benefit from each other.

Another company that does this well, albeit outside the genealogy community, is LibraryThing. Individual users can input their own book collections, including tags and other metadata for each book. LibraryThing aggregates all the information from its users and sells a catalog product to small libraries that can't afford software from the major players in the industry. (Their algorithm weeds out the weird tags some people use, e.g., the physical bookshelf where a book sits in their home.) They also have services for authors, publishers, and bookstores. It's completely free to catalog up to 200 books, and they're very community-friendly (mashups with RSS feeds, API access, etc.). I use it to organize my personal library, and couldn't be happier with it.
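How LibraryThing actually weeds out idiosyncratic tags isn't something I know, so take the following only as a minimal sketch of one plausible approach: keep a tag for a book only when enough different users have applied it. The function name `commonTags`, the `minUsers` threshold, and the data shape are all assumptions made up for illustration, not anything from LibraryThing's API.

```typescript
// Hypothetical sketch: NOT LibraryThing's real algorithm or data model.
// Input: for one book, the tags each user applied (user id -> list of tags).
// Output: tags applied by at least `minUsers` distinct users, sorted A-Z.
function commonTags(
  tagsByUser: Record<string, string[]>,
  minUsers: number
): string[] {
  // Prototype-free object so a tag like "constructor" is counted safely.
  const userCounts: Record<string, number> = Object.create(null);

  Object.keys(tagsByUser).forEach(function (user) {
    // Track tags already seen for this user so each user counts once per tag.
    const seen: Record<string, boolean> = Object.create(null);
    tagsByUser[user].forEach(function (rawTag) {
      // Normalize so "Genealogy " and "genealogy" are treated as the same tag.
      const tag = rawTag.trim().toLowerCase();
      if (tag.length === 0 || seen[tag]) {
        return;
      }
      seen[tag] = true;
      userCounts[tag] = (userCounts[tag] || 0) + 1;
    });
  });

  return Object.keys(userCounts)
    .filter(function (tag) {
      return userCounts[tag] >= minUsers;
    })
    .sort();
}
```

With a threshold of two users, a one-off tag like "shelf 3 by the window" drops out while tags that several people agree on survive. A real service would presumably do something more sophisticated, but the crowd-agreement idea is the same.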

Now here's a challenge: I'd bet a talented developer (that leaves me out) could create a mashup with freely available records to do just what Insider's talking about. The completely digitized US Census collection at the Internet Archive immediately comes to mind. I'm not a Footnote subscriber, so I don't know how they implemented the example in Insider's post, but there are many methods that would work. Using HTML5 and JavaScript (with libraries like jQuery and jQuery UI to do the heavy lifting), you could overlay the census image with a drag-and-drop form for typing in the text values of the image underneath.
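To make the overlay idea a little more concrete, here is a rough sketch of the bookkeeping such a mashup might need, written in TypeScript and deliberately kept free of any DOM or jQuery calls. It assumes each form field is stored as a rectangle expressed in fractions of the image size, so the field stays lined up with the handwriting at any zoom level. Every name here (`FieldRegion`, `toPixelBox`, `dragRegion`, `collectTranscription`) is hypothetical, and none of it reflects how Footnote or the Internet Archive actually work.

```typescript
// Hypothetical sketch of the data side of an image-overlay transcription form.
// A field's position is stored as fractions (0..1) of the image dimensions.
interface FieldRegion {
  name: string;   // e.g. "surname"
  x: number;      // left edge, fraction of image width
  y: number;      // top edge, fraction of image height
  width: number;  // fraction of image width
  height: number; // fraction of image height
}

interface PixelBox {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Convert a fractional region to pixels for the image as currently displayed.
// The result could be applied as CSS left/top/width/height on an input element.
function toPixelBox(region: FieldRegion, imageWidth: number, imageHeight: number): PixelBox {
  return {
    left: Math.round(region.x * imageWidth),
    top: Math.round(region.y * imageHeight),
    width: Math.round(region.width * imageWidth),
    height: Math.round(region.height * imageHeight),
  };
}

// Apply a drag of (dxPx, dyPx) pixels, clamped so the field stays on the image.
function dragRegion(
  region: FieldRegion,
  dxPx: number,
  dyPx: number,
  imageWidth: number,
  imageHeight: number
): FieldRegion {
  const clamp = (value: number, min: number, max: number): number =>
    Math.min(Math.max(value, min), max);
  return {
    name: region.name,
    x: clamp(region.x + dxPx / imageWidth, 0, 1 - region.width),
    y: clamp(region.y + dyPx / imageHeight, 0, 1 - region.height),
    width: region.width,
    height: region.height,
  };
}

// Pair each region with the text typed into it, skipping fields left blank.
function collectTranscription(
  regions: FieldRegion[],
  typedValues: Record<string, string>
): { name: string; text: string; region: FieldRegion }[] {
  return regions
    .map((region) => ({
      name: region.name,
      text: (typedValues[region.name] || "").trim(),
      region: region,
    }))
    .filter((entry) => entry.text.length > 0);
}
```

In a browser, a drag-and-drop library would supply the pixel offsets passed to `dragRegion`, and `toPixelBox` would be re-run whenever the image is resized. Saving the region along with the typed text means every submitted value records which part of the page it came from.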