A competitor to Google Book Search emerges as the Yahoo-backed Open Content …

After several years of scanning and archiving, the Internet Archive and the Open Content Alliance this week unveiled the Open Library, their attempt at bringing public domain books to the masses.

The Internet Archive has hosted texts for quite some time, but the Open Library makes fully-searchable, high-quality scans of books available, along with downloadable PDFs. It offers an experience designed to match paper: there's even a page-flipping animation as readers move forward and backward through the book.

Ben Vershbow of the Institute for the Future of the Book says that when it comes to presentation, "they already have Google beat, even with recent upgrades to the [Google Book Search] system including a plain text viewing option." Magnification isn't yet in place but is coming soon, apparently in time for the "official" launch later this year. The Open Library does have a neat trick up its sleeve: it allows the on-demand printing of any book through Lulu.com, allowing anyone in the world to order a printed copy of long out-of-print works.

The Open Content Alliance provided plenty of support for the project, drawing on the resources of companies like Yahoo, Adobe, MSN, and HP Labs. As Vershbow's comment above indicated, the new project looks like a direct competitor to many of the things that Google wants to do with Google Book Search, although the Open Library takes a different approach to handling copyrighted texts. Only public domain books will be scanned, and other publishers can opt in to the system at some point; Google, by contrast, scans everything but only displays tiny snippets of information from copyrighted texts.

But the Open Library goes beyond Google Book Search in a couple of clever ways. For one, it will integrate with Librivox, a site that allows users to contribute home audio recordings of public domain books. Open Library users can already click the "Listen" button to hear Henry James' "An International Episode" in full and for free (and read surprisingly well).

The Internet Archive also partnered months ago with researchers at Carnegie Mellon to use their reCAPTCHA system to correct the results of optical character recognition. In essence, millions of users across the Internet are helping to make the texts behind the Open Library more accurate even as they prove their own humanity to various blog comment systems.

The Open Library site is limited right now: it offers access to only a small selection of books and has no way to return to the main page (access to many other books is possible through the Internet Archive). Although the project is not nearly as developed as Google's, it's good to see some competition develop; hopefully, the two projects will spur one another on to create a pair of truly compelling resources for readers.

Do we really need multiple projects like this, or would everyone be better off by pooling resources and building a single, massive database? The Internet Archive's Brewster Kahle recognizes that this question will arise, and he tackles it head-on in his vision statement for the Open Library. "Won't some of the big commercial digitization projects deliver this future?" he asks. "They are part of it, but if we go no further, we may have an expanded bookstore, or a single means of organizing the materials, but we may not be building on the open tradition of a library."