Orphan Books: Is Google Robbing the Warehouse?

It's just not Google's week. A mob of angry villagers north of London formed human chains and chased off the Google Maps car (no word whether they had torches). Microsoft is all up in Google's business (to be precise, they're funding a team at New York Law School's Institute for Information Law and Policy, led by a former Microsoft programmer, which is weighing in on the pending settlement of Google's book-scanning lawsuit). And it's not just Microsoft that's taking aim at Google: the NYT has an overview of the many parties, from librarians to law professors, who have serious doubts about that Google settlement. Talk about a village mob.

In the NYT, Miguel Helft discusses one of the key issues in the Google settlement: so-called orphan books. While the settlement gives authors and publishers a say in how Google uses their books, orphan out-of-print books have no clear rightsholder. Without a rightsholder to opt out of Google's database or grant rights to another organization besides Google, these orphan books basically devolve to Google's custody:

While the registry's agreement with Google is not exclusive, the registry will be allowed to license to others only the books whose authors and publishers have explicitly authorized it. Since no such authorization is possible for orphan works, only Google would have access to them, so only Google could assemble a truly comprehensive book database. (source)

If Google really did have a monopoly on digital access to those out-of-print books, it would be a huge coup. Google could charge other libraries for access to the orphan books in its database, profiting off of volumes which it scanned and digitized, but didn't write or publish - books which currently languish in obscurity (but freedom) in library warehouses across the country.

Robert Darnton, head of Harvard's libraries, says the settlement "takes the vast bulk of books that are in research libraries and makes them into a single database that is the property of Google. Google will be a monopoly." Darnton is not completely opposed to the settlement, but is very concerned about its ramifications for libraries and readers. Back in February, he wrote a long historical perspective on the settlement for the New York Review of Books:

The eighteenth-century Republic of Letters had been transformed into a professional Republic of Learning, and it is now open to amateurs--amateurs in the best sense of the word, lovers of learning among the general citizenry. Openness is operating everywhere, thanks to "open access" repositories of digitized articles available free of charge, the Open Content Alliance, the Open Knowledge Commons, OpenCourseWare, the Internet Archive, and openly amateur enterprises like Wikipedia. The democratization of knowledge now seems to be at our fingertips. We can make the Enlightenment ideal come to life in reality.

At this point, you may suspect that I have swung from one American genre, the jeremiad, to another, utopian enthusiasm. It might be possible, I suppose, for the two to work together as a dialectic, were it not for the danger of commercialization. When businesses like Google look at libraries, they do not merely see temples of learning. They see potential assets or what they call "content," ready to be mined. Built up over centuries at an enormous expenditure of money and labor, library collections can be digitized en masse at relatively little cost--millions of dollars, certainly, but little compared to the investment that went into them. (source)

Darnton makes an analogy with the skyrocketing costs of science journals, which now run to thousands of dollars per library subscription. He warns that once a subscription to Google's book search database becomes a standard feature of all libraries, and people come to see it as a necessity, Google can, and will, charge whatever it wants, because it will have no competitors. (Once you've spent twenty minutes or so digesting Darnton's essay, you can see some replies to him here.)

The truth is, I find this whole settlement debate very troubling. Google took the initiative to scan millions of library books, digitize them, and make them searchable on the web. That is a public benefit that did not exist before, and one I am glad to have. On the other hand, the fact that Google did this first should not preclude another company or organization from also scanning and digitizing books, including orphan books, and making them available under different terms - perhaps for free, in a grand open-access initiative. I'm not clear on how the 134-page settlement affects such future competitors to Google. Darnton says,

The class action character of the settlement makes Google invulnerable to competition. Most book authors and publishers who own US copyrights are automatically covered by the settlement. They can opt out of it; but whatever they do, no new digitizing enterprise can get off the ground without winning their assent one by one, a practical impossibility, or without becoming mired down in another class action suit. If approved by the court--a process that could take as much as two years--the settlement will give Google control over the digitizing of virtually all books covered by copyright in the United States.

In Helft's piece, Google's lawyer sort of agrees - although he spins it differently:

nothing prevented a potential rival from following in its footsteps -- namely, by scanning books without explicit permission, waiting to be sued and working to secure a similar settlement (source)

Perhaps. I'm withholding judgment until later this month; I'm going to a panel on the settlement hosted by the Information Technology and Innovation Foundation, and will hopefully get a few of my questions answered. Will update you with what I find out.

Darnton works at an institution that could easily create a competing book database on its own, if Google starts behaving badly or charging exorbitant fees for access. Harvard has the money and the books already. Princeton or Yale could probably do it, too. I think Google's settlement would actually make things easier for future competitors since they would already have a precedent for negotiating with the various rightsholders.

There are several areas of our economy where we recognize that monopolies are unavoidable and/or more efficient -- electric utilities, water supply, etc. However, in those same cases we have realized the potential abuses inherent in a monopoly and have set up appropriate regulatory commissions to make sure the public gets a fair deal.

I think we have reached the point where we will need to recognize the existence of monopolistic "information utilities" -- like Google's -- and put in place an appropriate regulatory framework.

I have to say that I'm both confused and a little dismayed (being a lover of Teh Google). Is there talk of Google actually charging for access to the database? And are they seriously trying to claim that no one else can legally database the orphan books? Or is the fear that they could?

I think the fear is that in the long run, Google would start charging more and more for full-text access to the books, the ability to print them out, etc. While publishers and authors would have a say in the handling of copyrighted books with clear rightsholders, what to do with orphan books would be up to Google, and apparently other people might not be able to database them under this settlement. So yeah, I think people are concerned about the situation a decade or so down the road - not the situation now.

My only concern is that the scanned books not become less accessible to the public after scanning.

I love being able to find 50-year-old books in libraries that cover certain topics better than newer books are ever likely to. I still prefer holding and reading a physical book to reading one or two pages of it on a computer monitor.
