Google book project: Digital-age test of copyright law

By Anick Jesdanun, The Associated Press

NEW YORK - Tony Sanfilippo is of two minds when it comes to Google Inc.'s ambitious program to scan millions of books and make their text fully searchable on the Internet.

Photo: Tony J. Sanfilippo, Penn State University Press marketing and sales director, says Google's book scan has boosted his sales. (By Pat Little, AP)

On the one hand, Sanfilippo credits the program for boosting sales of obscure titles at Penn State University Press, where he works. On the other, he's worried that Google's plans to create digital copies of books obtained directly from libraries could hurt his industry's long-term revenues.

With Google's book-scanning program set to resume in earnest this fall, copyright laws that long preceded the Internet look to be headed for a digital-age test.

The outcome could determine how easy it will be for people with Internet access to benefit from knowledge that's now mostly locked up — in books sitting on dusty library shelves, many of them out of print.

"More and more people are expecting access, and they are making do with what they can get easy access to," said Brewster Kahle, co-founder of the Internet Archive, which runs smaller book-scanning projects, mostly for out-of-copyright works. "Let's make it so that they find great works rather than whatever just happens to be on the Net."

To prevent the wholesale file-sharing that is plaguing the entertainment industry, Google has set some limits in its library project: Users won't be able to easily print materials or read more than small portions of copyright works online.

Google also says it will send readers hungry for more directly to booksellers and libraries.

But many publishers remain wary.

To endorse Google's library initiative is to say "it's OK to break into my house because you're going to clean my kitchen," said Sally Morris, chief executive of the U.K.-based Association of Learned and Professional Society Publishers. "Just because you do something that's not harmful or (is) beneficial doesn't make it legal."

Morris and other publishers believe Google must get their permission first, as it has under the Print Publisher Program it launched in October 2004, two months before announcing the library initiative.

Google books Q&A

Q. Where is Google getting books to scan?

A. Publishers can submit books directly. Most U.S. and U.K. publishers are contributing some books from their collections, and Google recently expanded the program to several other countries to encourage more non-English books. But even if all publishers join, Google believes they would be providing no more than 15% of all books ever published. So it made deals to scan collections, including copyright-protected books, from Harvard, Michigan and Stanford universities. It is also scanning out-of-copyright books from the New York Public Library and Oxford University. Google won't disclose numbers but says most of the scanning so far has been for publisher-submitted titles.

Q. How much of a book can one read online?

A. For works in the public domain, or out of copyright, the entire book can be read online. For books submitted by publishers, readers can see five pages at a time and no more than 20% of an entire book using multiple searches. For copyright books scanned from libraries, readers can read a few sentences. Google keeps track of page views using "cookie" data files on computers and a username tied to an e-mail address. A small portion of each book can never be viewed, to prevent readers from obtaining an entire book by using multiple computers and accounts. Google also tries to disable printing, though there are ways to bypass that. Ads are displayed only for publisher-submitted titles, and publishers get most of the revenues, though Google won't say how much. Google also provides links to online booksellers and to nearby libraries that may carry the book, using the user's ZIP code and a global card catalog maintained by the Online Computer Library Center.

Q. Is this legal?

A. Google believes it is under "fair use" provisions of copyright law. Google is giving publishers the option to specify titles they do not want scanned, similar to its approach in letting Web site owners opt out of the search engine index. Publishers, however, believe that even if Google is limiting display of copyright works, the very act of copying without permission is illegal. The outcome could hinge on the interpretation of a 2003 federal appeals court ruling in Kelly vs. Arriba Soft. The court held that a search engine may create smaller versions, or thumbnails, of images under fair use. With books, Google is reproducing the entire work, though a court may factor in Google's limited display and any new functionality the copying enables.

Q. Why is Google getting the complaints when others are scanning books, too?

A. Google's initiative is the most ambitious and includes materials for which it hasn't obtained permission. Amazon.com Inc. uses only books submitted by publishers, while the Internet Archive's two book-scanning projects deal with out-of-copyright books or those for which permission was granted. Nonetheless, the archive has asked a court for clarification on whether it can scan so-called orphan works - those that are still under copyright but are out of print and belong to owners who are difficult to find. The Library of Congress also has been conducting hearings on such works.

Under the publishers' program, Google has deals with most major U.S. and U.K. publishers. It scans titles they submit, displays digital images of selected pages triggered by search queries and gives publishers a cut of revenues from accompanying ad displays.

But publishers aren't submitting all their titles under that program, and many of the titles Google wants to scan are out of print and belong to no publisher at all.

Jim Gerber, Google's director of content partnerships, says the company would get no more than 15% of all books ever published if it relied solely on publisher submissions.

That's why it has turned to libraries.

Under the Print Library Project, Google is scanning millions of copyright books from libraries at Harvard, Michigan and Stanford along with out-of-copyright materials there and at two other libraries.

Google has unilaterally set this rule: Publishers can tell it which books not to scan at all, similar to how website owners can request to be left out of search engine indexes. In August, the company halted the scanning of copyright books until Nov. 1, saying it wanted to give publishers time to compile their lists.

Richard Hull, executive director of the Text and Academic Authors Association, called Google's approach backwards. Publishers shouldn't have to bear the burden of record-keeping, agreed Sanfilippo, the Penn State press's marketing and sales director.

"We're not aware of everything we've published," Sanfilippo said. "Back in the 50s, 60s and 70s, there were no electronic files for those books."

Google, which wouldn't say how many books it has scanned so far, says it believes its initiative is protected under the "fair use" provisions of copyright law.

Gerber argues that the initiative will "stimulate more people to contribute to the arts and the sciences by making these books more findable."

Washington lawyer Jonathan Band says Google's case is strong given the limits on display — a few sentences at a time for works scanned from libraries, with technology making it difficult to recreate even a single page.

"I don't see how making a few snippets of a work available to a user could have any negative impact on the market," said Band, who has advised library groups and Internet companies on copyright issues.

Under Google's strictures, readers can see just five pages at a time of publisher-submitted titles — and no more than 20% of an entire book through multiple searches. For books in the public domain, they can read the entire book online.

Not all publishers are opposed.

"For a typical author, obscurity is a far greater threat than piracy," said Tim O'Reilly, chief executive of O'Reilly Media and an adviser to Google's project. "Google is offering publishers an amazing opportunity for people to discover their content."

James Hilton, associate provost and interim librarian at the University of Michigan, said his school is contributing 7 million volumes over six years because one day, materials that aren't searchable online simply won't get read.

"That doesn't mean it's going to be read online, but it's not going to be found if it's not online," he said.

Hal Hallstein, a 2003 Colby College graduate, said Google's project would have been useful for his studies in Buddhism. He typed the word "shunyata" — Sanskrit for emptiness — and found several books he didn't know existed.

"The card catalog in my experience is rather limited in terms of the amount it really describes," he said.

Nonetheless, as e-media coordinator at Wisdom Publications, he believes each publisher should be able to decide whether to join, as his company has.

Many of the objections appear to stem from fears of setting a precedent that could do future harm to publishing.

"If Google is seen as being permitted to do this without any response, then probably others will do it," said Allan Adler, a vice president at the Association of American Publishers. "You would have a proliferation of databases of complete copies of these copyrighted works."

Publishers won't rule out a lawsuit against Google.

The technology juggernaut, whose name is synonymous with online search, isn't just shaking up book publishing.

Google has a separate project to archive television programs but has so far received limited permissions. The company also faces lawsuits over facilitating access to news resources and porn images online.

Jonathan Zittrain, an Internet legal scholar affiliated with Oxford and Harvard universities, says the book-scanning dispute comes down to balancing commercial and social benefits.

"From the point of view of the publishers, you can't blame them for playing their role, which is to maximize sales," he said. "But if fair use wasn't found, (Google) would never be able to do the mass importation of books required to make a database that is socially useful."

Copyright 2005 The Associated Press. All rights reserved. This material may not be published, broadcast, rewritten or redistributed.