Wednesday, January 28, 2009

Google: What's in it for libraries?

(This is a version of the talk I gave on the Google panel at ALA in Denver. PDF for printing.)

The title of this panel is: Google/AAP Settlement: What's in it for libraries? What I can say with certainty is that we don't really know. And I don't really know. What I have to work with is the settlement document, all 140+ pages and 15 appendices, which is the same information that is available to you. But in spite of its size, that is just the tip of the iceberg. It doesn't reveal the discussions that took place nor the reasons behind the decisions that were made. Some people know much more because they were involved in the negotiations. However, everyone who was involved is sworn to secrecy and can't speak about it. This greatly limits what we can know about the potential effect on libraries. Although I do not have answers, I do have many questions.

I'm going to address the question of libraries that are not Google partners. All of us who are not partners, who are not involved in the scanning, are potential customers. For the Google partners, the settlement includes in an appendix examples of the contracts that they will be asked to sign. For everyone else, it is important to note that you have no contract with Google. There is some information in the Google/AAP settlement document about aspects of the product that Google can provide, but it is far from the whole picture. Any input that you will have into what this settlement means to you will take place as you negotiate your contracts for the Google book products, should you choose to subscribe. What I will cover in the next few minutes are some things that you should be aware of as you consider becoming a Google book product customer.

The first thing you should ask, and the bottom line, is: Does this product serve my users? At the moment, the book search product is an idiosyncratic offering of digitized texts, but nothing that would resemble a curated collection. Can the library's patrons benefit from the service? Will it meet their research needs?

Next you should consider the quality of the product. The obvious areas are the quality of the scans and of the OCR, but also the use of metadata, the search capabilities, and the ability to integrate the product into library practices.

There are numerous legal requirements on libraries, especially publicly funded libraries, that we always must be aware of in our relationships with vendors. A key one of these is privacy. We know that Google's primary business model has been that of delivering customers to advertisers, and obviously we cannot participate in such a model. Those of us in public institutions are bound by state laws to ensure the confidentiality of the use of our materials, and we generally extend that to outside services contracted by the library. The only mention of confidentiality in the agreement is the confidentiality of rights holders.

Our services also must be ADA compliant. Beyond that I would say that for public libraries it is implicit in your mandate that you provide equal access to all. This generally excludes any services that require payments by end-users. (Note that there is a statement in the settlement that users of the free subscription that will be available to public libraries may need to make royalty payments for any printing.)

Publicly funded institutions may be bound by the First Amendment, and all libraries are champions of intellectual freedom. We know that Google does censor other products, and that publishers withdraw controversial books. If nothing else, we need those activities to not take place secretly.

Which brings me to another issue: transparency. The entire settlement process has been an exercise in the lack of transparency. For those of us who were not involved, it came as a surprise when the agreement was released and we found out that it was the result of two years of secret negotiations. This is normal in the for-profit world, but for those of us in publicly funded institutions, transparency of our operations is a legal and moral obligation. In addition, the secrecy around the workings of the product makes it very hard for us to help users who aren't finding what they need. We don't have to know the secret page rank algorithm, but until the settlement document came out Google would not even reveal how many scanned books it had in its database (7 million). Should the database be offered to subscribers, we should insist on knowing what it contains and what features it will have, so that we can assess its value for our users. I also want to say that we do not want to be in a position of getting information about the product that we cannot share with our users, so becoming party to secrets is not an option.

The last question that I'll bring up here is that of sustainability. Libraries have been in existence for thousands of years, and modern libraries in this country have a history that is measured in centuries. Google has been in existence for about 15 years. Do any of us expect that Google will be around in 200 years? What are the plans for this content should Google cease to exist, or decide it doesn't want to continue to support this product? Some libraries will have copies of scanned books, but is there a plan to place in escrow all of the scans? It's not just a question of the scans, however, because they will be in dark archives. What happens to the service, the user interface?

I also want to say a few words about the so-called "free subscription" that will be offered to publicly funded libraries. We need to look this gift horse in the mouth, if nothing else to make sure that it isn't a Trojan horse. We have very little information today about the nature of this particular product, other than that it will be reduced in functionality from the paid subscription, and it is stated in the agreement as being "one terminal per library building." Remote access to this product is not allowed; users must be physically at the library. Clearly, for any medium or large library, one access point will not be sufficient. It is also clear that "free" has its costs, and in this case one cost will be the management of a very scarce resource. While this free service is often touted as an act of great generosity, in my more cynical moments I see it as a clever act of "product placement." Where better to put a demo version of your product than in the institution that is most frequented by potential customers: book readers?

I have no idea what the future will bring, but I can imagine a wide range of possibilities. At one end, I see the possibility that the Google book product turns out not to be profitable, that it doesn't gain enough subscribers and it doesn't sell enough out-of-print books to make it worthwhile. Google drops the product, as it has dropped other products that just didn't pan out. The other end is a scenario where the product is highly profitable, either through sales or advertising revenue, and Google continues to make deals with libraries to scan books until it is parallel in content with the entire system of libraries in this country. Parallel, but highly capitalized, ubiquitous, online. At that point we would have a privatized version of the library system, with different goals and values, and no public oversight.

You and I know that Google is not a library, but we also know that our users don't understand that difference. And I'm pretty sure that some city managers with budget problems will not understand that difference, but they do know what it costs them to maintain a public library. I hope that Google understands that its own ambitions can have far-reaching effects on public institutions, but I don't know if there is any way to mitigate the danger that Google can pose to those institutions.

To my library colleagues, I have some advice. We have to be willing to throw off the past and learn to innovate. This is a new information world, and we must be full participants in it. To be visible we must embrace the Web as our data platform, and to do that we must reject any attempts to prevent us from participating openly on the Net.