At Risk: Universal Online Access to All Knowledge

I’ve recently been following Brewster Kahle and Robert Darnton, a University Professor and director of Harvard’s Library, and both are concerned about the settlement of the lawsuit between Google and the authors and publishers over the scanning and use of books in Google Book Search. In my experience, Brewster is extraordinarily thoughtful and takes a long view. Early in my career, I was a librarian. I love books. So while I’m not a lawyer and I find this settlement confusing, I’m writing about it because I think it merits awareness and a serious discussion.

The key issues appear to be whether the business model created by the settlement will lock up content that essentially belongs to the public domain (per Brewster) and whether the publishers’ and authors’ creation of a Google monopoly for books will harm access to knowledge in the future (per Darnton). Below, I’m relying on their words to explain this further.

Last week Brewster posted “It’s All About the Orphans” (http://www.opencontentalliance.org/2009/02/23/its-all-about-the-orphans/) on the blog of the Open Content Alliance, focusing on the plight of “orphan works” – that vast number of books that are still under copyright but whose authors can no longer be found:

“After digesting the proposed Google Book Settlement, it becomes clear that the dizzyingly complex agreement is, in essence, an elaborate scheme for the exploitation of orphan works… The upshot, if the Settlement is approved, would be legal protection for Google, and only for Google, to scan and provide digital access to the orphan works. Presto! … So, should the Settlement be approved, Google will be handed exclusive access to the orphans, and the public loses out… I, personally, am amazed at this creative use of class action law. The three parties have managed to skirt copyright law, bypass legislative efforts, and feather their own nests – all through the clever use of law intended to remedy harms. This Settlement, if approved by the judge, will accomplish things appropriate to a legislative body not to private corporate boardrooms. Let’s live under the rule of law, as arduous as that might be, and free the orphans, legitimately, not for one corporation but for all of us.”

And in “Google & the Future of Books” (http://www.nybooks.com/articles/22281), an article that Darnton published in The New York Review of Books last month, the focus is slightly different but the upshot is the same:

“After reading the settlement and letting its terms sink in—no easy task, as it runs to 134 pages and 15 appendices of legalese – one is likely to be dumbfounded: here is a proposal that could result in the world’s largest library… Moreover, in pursuing the terms of the settlement with the authors and publishers, Google could also become the world’s largest book business – not a chain of stores but an electronic supply service that could out-Amazon Amazon… The class action character of the settlement makes Google invulnerable to competition… We are allowing a question of public policy – the control of access to information – to be determined by private lawsuit… As an unintended consequence, Google will enjoy what can only be called a monopoly – a monopoly of a new kind, not of railroads or steel but of access to information… The settlement creates a fundamental change in the digital world by consolidating power in the hands of one company… This is also a tipping point in the development of what we call the information society. If we get the balance wrong at this moment, private interests may outweigh the public good for the foreseeable future, and the Enlightenment dream may be as elusive as ever.”

A lot seems to be at stake and the court may approve the settlement in June! I don’t care if the settlement means that Google will get even richer (disclosure: I’m a Google shareholder). The question is: to what extent will WE become poorer?

I thought quite a bit about this issue back before the settlement and around the time of the settlement. I carefully read the federal code around the issue and the analysis of quite a few others (and, sorry, I can’t give proper cites – this was an informal effort).

My conclusion then was that the big libraries, like Harvard, had made a bad deal: they didn’t understand the tech well enough, and Google not only steamrollered them but implicated them in the potentially massive infringement case that Google’s settlement put to bed (for now). How I got there is interesting, but only forensically. What is still interesting going forward is how I imagined “the deal they should have made”.

Basically, Google should have, indeed, paid for scanning and building the databases – but the ownership of those databases should have remained entirely with the libraries. Google should have built the databases with an API so that they could still build their services on top of the databases (with a performance hit) – but that API should have been equally available to all competitors and the public generally. No monopoly. No private commercial benefit.

A week after Google offered book search, other search engine companies (and other firms beyond those) could have offered alternatives. The various firms would compete mainly on UI and on making efficient use of the API. Large scale users of the library-owned API would help support scaling and operating costs by paying fees for use.

As far as I could tell, copyright law essentially requires that solution and forbids what actually happened. I was pretty shocked that such libraries as these went along with what happened.

It’s not too late. The Authors Guild caved pretty easily and pretty early, but legal pressure can still be brought to bear on Google. They can hand their private databases back to the libraries that should properly have owned them in the first place, and those institutions, if need be, can ensure non-rival access to the low-level APIs.

-t

http://tim.oreilly.com/ Tim O'Reilly

I agree with Tom’s analysis. (See my old post: book search should work like web search.)

And I do agree with Brewster’s concern that this settlement will derail the kind of reform that would have solved this problem far more effectively. That’s still my preferred solution.

That being said, the tone of both Brewster’s comments and Darnton’s implies that Google was up to some kind of skulduggery here. That’s unfair. Should they have stood up on principle to the Authors Guild and the AAP? Absolutely, yes. But it’s the AG and the AAP who should be singled out for censure.

From conversations with people at Google, I believe that they do in fact continue to believe in real solutions to the orphaned works problem, and that demonizing them doesn’t do any of us any good.

The fact is that Google made a massive investment to digitize these books in the first place. No one else was making the effort. (Brewster was scanning out-of-copyright works, but Google was the only one to try to cut the Gordian knot of orphaned works. See my NY Times Op Ed from 2005: http://www.nytimes.com/2005/09/28/opinion/28oreilly.html)

And while there are some legal issues with the settlement seeming to give Google clearance to use orphan works, I don’t know that there’s anything stopping anyone else who wants to take the same risks that Google took from inviting their own lawsuit from the same clowns who forced this settlement.

In short, we’re comparing a flawed real world outcome with an “if wishes were horses” outcome that wasn’t in the cards.

Barring change to copyright law (and yes, we need that), Google has at least created digital copies of millions of books that were not otherwise available at all. Make those useful enough and valuable enough, and I guarantee there will be pressure to change the law so that others can profit too. Not to mention that if anyone does become clear about their ownership of books that have been orphaned, they can step forward.

Net net, Google Book Search was an important step forward in building an ebook ecosystem. I wish this settlement hadn’t happened, and that Google had held out for the win on the idea that search is fair use. And I wish that Google had taken the road that Tom outlined.

But they put hundreds of millions of dollars into a project that no one else wanted to touch. And frankly, I think we’re better off, even with this flawed settlement, than if Google had never done this in the first place.

Finally, I’ll point out that there is more competition in ebooks today than at any time in the past. Any claim that we’re on the verge of a huge Google monopoly, such as Darnton claims, is so far from the truth as to be laughable. Google is one of many contenders in an exploding marketplace.

http://public.resource.org Carl Malamud

My big issues with this settlement are not with Google, they are with this new registry being created that will purport to represent the interests of authors and publishers (and will handle vast sums of money with no stipulated oversight or accountability).

What Google is doing here is pretty straightforward. You might agree or not, but it is clear. What I don’t get is what the other side of the settlement is up to. Will this be another $500m/year nonprofit registry with $1m salaries and extravagant expenses?

http://basiscraft.com Thomas Lord

Tim,

Thanks for the kind words.

In the spirit of forensics: Those very fine libraries uncharacteristically erred, up front – they could have insisted a little more strongly and probably Google would have internally come up with the solution I propose (that you like). Played right, the process would have been early agreement on scanning with an unclear reading on who would have what kind of access to the databases – giving Google a lot of first-mover advantage in addition to their incumbent infrastructure advantages – but then when the service finally “lit up” the DBs would be with the libraries and the low-level APIs non-rival. As you say: coulda-shoulda-woulda-but-didn’t-so-isn’t. Probably as that played out, other people besides Google would have stepped up to pick up some of the tab for scanning – making Google’s spreadsheets look that much better given the eventual non-rivalry concession.

In the spirit of understanding things: you praise Google, I don’t. We’re better off those books having been scanned (I strongly agree) – I don’t like the way they bull-in-china-shop worked this. I think there’s a deep and lasting threat here that they need to fix if they want to “not be evil”. You’re basically saying that Google performed an act of civil disobedience and we’re all better off for it. I agree they basically did civil disobedience. I think it is a horrid precedent for a corporation to get away with that, in this case, and I note: This wasn’t non-violent civil disobedience; it wasn’t passive resistance. Even if we allow that corps can be disobedient protesters we have to remember that Google took and is taking enormous private gain here. They aren’t freedom fighters, by a long shot (or, have you got them to ship you a copy of the book DB?). Real peaceful protesters plan on spending some jail time when they deliberately set out to break the law. I don’t think the settlement with the authors counts as such — that’s just a toss-off pay-off that leaves the underlying issues unresolved. These aren’t the nice guys you make them out to be. Personally and one on one I’m sure they’re nice and well-intentioned, mostly — they are just ideologically willfully ignorant of the emergent effects of their speculative value system choices – if you know what I mean. :-)

There are other nits to pick with your talk there. For example: competition in e-books really doesn’t assure much about these legacy works.

Going forward:

Google’s vague long-term notion is the same as AT&T’s was in response to Unix and the same as Oracle’s was given the thin-client notion: to own “The” platform monopoly in the form of an over-the-net commodity. Commoditized computing with a monopoly over the system-admin / network engineering parts of that circuit.

That ain’t gonna happen, for lots of reasons I’m sure you can imagine – but it’s the default mode of reasoning (or an emergent effect thereof) at Google and several other firms. The sooner we get out of there the better for all of us, including those firms.

Conveniently, the book dbs are a fine, fine space for Google to begin to, on its own, start to fix its approach. There isn’t a lot of revenue tied to those assets so they can speculate with them without risking much. In speculating, they can tweak them in the direction of better citizenship by returning ownership to the libs to set a signal and to broadcast a meme. Win-win.

They should just unilaterally decide to fix things along the lines I outlined that you seconded (except you said something similar earlier and, anyway, we’re both drawing on 10 other people and… hey, you get the idea). I mean, that’s the essential thing: it’s an idea and we both (and probably others) “get it” and it’s, so far, “testing true” in discussion.

Tom to Google: “Just Do It!”. Double-dog dare you to fix things. And just think, if you do, well, gee, I might have something good to say about you, for whatever that’s worth.

-t

bowerbird

first, yes, the libraries made a huge mistake up-front.
no, check that — the _librarians_ made a huge mistake,
specifically the people in charge who made the contracts.
they ignored their job in the first place, and then sold out
their own interests when someone offered to do it for ’em.

even worse, those people in charge didn’t get smarter,
so the contracts didn’t really get much better over time.
(heck, because google had more power, some got worse.)

the librarians-in-charge made such bad deals that google
is essentially laughing at their sheer impotence nowadays,
as evidenced by the way the “settlement” shafts libraries…

so the first thing we have to do is get those people fired.
they made _gigantic_ mistakes, and must be accountable.
and yes, i’m quite serious about that. fire the bastards…

second, the authors guild is a bunch of scumbags, and
they need to be _isolated_ and _disdained_, immediately.
moreover, they are fools. google bought them off with
a measly $35 million, which is petty funds for google…

and last, but not least, google has now turned evil on us.
up until this, i supported them. they no longer deserve it.
it’s not that they “caved” on the suit. it’s worse than that.
they used the suit as an excuse to take our money and run.
so any brownie points they had collected up to that point
are forfeited with this slick maneuver to take advantage of
the groups here that had no voice, namely the _orphans_
— who have now become the virtual property of google —
and the _public_ which is being robbed big-time by this…

this “settlement” needs to be fought, _vigorously_…

-bowerbird

http://www.mymeemz.com Alex Tolley

In biology, we had the government-run Human Genome Project and also the private venture, Celera, using a different technology. Only the former was to be published to the public. There were also private gene annotation databases, e.g. by Incyte Genomics. The IP landscape has allowed gene sequences to be restricted from free use in research and product development.

Darnton would have wanted a public equivalent to the HGP for digitizing libraries. Now the only option seems to be to try to keep the digitized data from being restricted by fees.

It seems to me that in both these domains, the problem is more to do with the evolution of IP law. Focussing efforts on IP might bring about the desired changes in a host of areas.

bowerbird

$1 billion will digitize most of the books in most of the libraries.

we’re in the process of spending hundreds of billions of dollars
— hundreds of billions, literally! — to big banks who failed us.

but we can’t spend $1 billion to digitize our cultural heritage?

is it really that hard to see what’s wrong with this picture?

-bowerbird

http://www.google.com/ Alexander Macgillivray

Disclosure First: I am an attorney at Google and have also done work for Brewster Kahle and the Internet Archive.

Thank you Tim for challenging some of the “Google is the boogey man” assumptions in the post and comments. As Tim says, Google has been and continues to be a strong proponent of reasonable orphan works legislation. Our submissions (dating back to 2005) can be found here: http://www.copyright.gov/orphan/comments/OW0681-Google.pdf and http://www.copyright.gov/orphan/comments/reply/OWR0134-Google.pdf
As for your contentions about copyright law and civil disobedience, Thomas, they are just plain wrong. What we did is well within the law. Two cases I’d point you to for a start are: http://www.eff.org/files/filenode/Kelly_v_Arriba_Soft/20030707_9th_revised_ruling.pdf and http://fairuse.stanford.edu/primary_materials/cases/GrahamKindersley.pdf
Both stand for the proposition that use of the whole of copyrighted works for a commercial but transformative purpose can be fair use. Kelly puts an even finer point on it because that case, like this one, is about copying to index and present search results.
As for Thomas’s exclusivity arguments, nothing we are doing prevents anyone else from doing the same, even with these same libraries and this same registry. None of our library contracts are exclusive. Any of our partner libraries can do the deal you propose with anyone that cares to do it. Even though we are funding the creation of the Registry, it too is non-exclusive and can do deals with anyone else it pleases (and we don’t control it). And, Tim is right, there is lots of competition in ebooks and books generally.
More importantly, and seldom mentioned, is that the agreement, if it is approved by the Court, will give important benefits to many in the United States. No longer will the students and faculty of great universities be the only ones who have the advantage of large libraries. Every public library will have free access to millions of books and any school will be able to get a campus-wide subscription. Anyone who has a connection to the internet will be able to search and preview pages from many of these books before purchasing from us or from a bookstore, or clicking “find it in a library” to get access to the physical book for free. Meanwhile the research corpus created as part of the settlement will give researchers access to the entire corpus to create new types of algorithms, including in search, that may even be “google-beaters.” And all of that is not even to mention the tremendous uses libraries can make of their copies, the more than a million (and growing) public domain works we make available for free without restrictions, and that the settlement “will revolutionize access to books for blind Americans” (the quotation is from Dr. Marc Maurer, President of the National Federation of the Blind): http://www.mytechboxonline.com/mtomass/mass-gbsblind-11.html
There’s much more in the settlement, but I think even this subset of the public benefits ought to be enough. And, on top of that, authors and publishers will get paid when people purchase their books. That too, we believe, is something that benefits the public.
Anyhow, as Tim wrote, these are real benefits available now, not some “if wishes were horses” outcome. To turn Linda’s original question around: that is how we all (not just Google) become richer.
-Alex

http://basiscraft.com Thomas Lord

Alexander

I hope that you see this comment (and could the O’Reilly folks please try to close the loop and perhaps ping him to make sure).

Without commenting on the content of your response here, may I please ask for your email address so you and I can have a brief exchange about these issues off-blog? Before I start on that I’ll spend some hours digesting your comment here, but I’d like to have at least a quick exchange directly. I think the matter is a lot simpler and more clear-cut than you make it out to be, and if we talk a little more efficiently, as over email, we can probably either come to agreement or each reach a clearer opinion of why the other is wrong. I don’t mean to impose on your time, but I think it’s a worthy cause, and I think you see, even if you don’t agree with me, that I’m not arguing from a trivially dismissed point of view. Please drop me a line (lord@emf.net).
