This item in Monday's New York Times, cited by Ann Okerson on the
liblicense list, should be of some interest to this list.

David Green
===========

>Date: Tue, 13 May 2003 21:46:43 -0400 (EDT)
>From: Ann Okerson <ann.okerson@yale.edu>
>To: liblicense-l@lists.yale.edu
>Reply-To: liblicense-l@lists.yale.edu
>
>Monday's New York Times included an interesting article on an impressive
>new automated scanning machine in use at Stanford University. The machine
>literally turns the pages of books of every format while scanning to
>graphic images or OCR text of extraordinarily high quality. The potential
>for automating the conversion of large quantities of text is immediately
>obvious.
>
>http://www.nytimes.com/2003/05/12/technology/12TURN.html
>(requires simple registration)
>
>The Evelyn Wood of Digitized Book Scanners
>
>May 12, 2003
>By JOHN MARKOFF
>
>PALO ALTO, Calif., May 10 - Putting the world's most
>advanced scholarly and scientific knowledge on the Internet
>has been a long-held ambition for Michael Keller, head
>librarian at Stanford University. But achieving this goal
>means digitizing the texts of millions of books, journals
>and magazines - a slow process that involves turning each
>page, flattening it and scanning the words into a computer
>database.
>
>Mr. Keller, however, has recently added a tool to his
>crusade. On a recent afternoon, he unlocked an unmarked
>door in the basement of the Stanford library to demonstrate
>the newest agent in the march toward digitization. Inside
>the room a Swiss-designed robot about the size of a sport
>utility vehicle was rapidly turning the pages of an old
>book and scanning the text. The machine can turn the pages
>of both small and large books as well as bound newspaper
>volumes and scan at speeds of more than 1,000 pages an
>hour.
>
>Occasionally the robot will stumble, turning more than a
>single page. When that happens, the machine will pause
>briefly and send out a puff of compressed air to separate
>the sticking pages.
>
>For Mr. Keller, the robot, made by 4DigitalBooks, one of
>two companies now introducing the first automated
>digitization systems, is a boon.
>
>"Think about the power of bringing our library to little
>schools in the middle of Africa," Mr. Keller said. "Would
>it make a difference for those who now have their minds
>closed to the idea of democracy?"
>
>The first book-scanning robots were introduced this spring
>by 4DigitalBooks of St. Aubin, Switzerland, and Kirtas
>Technologies of Victor, N.Y. The machines have already
>begun to generate interest from libraries and private and
>nonprofit groups now working to digitize books.
>
>Until now, the job has been done mostly by students or
>armies of low-cost workers in countries like India and the
>Philippines. But manual digitization presents significant
>logistical problems. Book collections may have to be moved
>long distances to digitization centers.
>
>And in some cases the process of scanning has damaged old
>books and journals, making it necessary to rebind them
>afterward.
>
>The digitizing machines, by contrast, can be located close
>to book collections and offer speed and quality control
>unattainable by manual systems.
>
>Even so, manual processing is still less expensive in many
>cases than acquiring a robot. The 4DigitalBooks robot,
>whose price neither the company nor Stanford officials
>would disclose, becomes cost effective on projects larger
>than 5.5 million pages, said Ivo Iossiger, the company's
>chief technology officer and a co-founder. It seems likely
>that the vast majority of digitization over the next
>several years will be done by hand.
>
>Mr. Keller admits that his dream to have the entire
>Stanford library in a digital database is unlikely in the
>foreseeable future because such an undertaking - involving
>eight million volumes - could cost upward of $250 million.
>
>In the meantime, the Stanford librarians have begun
>digitizing books and documents that pose no thorny
>copyright barriers and that have important historical and
>political significance.
>
>The newly installed robot is currently finishing two pilot
>projects, scanning books published by Stanford's Center for
>the Study of Language and Information and works for the
>Medieval and Modern Thought Text Digitization Project. It
>will soon begin work on the 2,500 titles published by the
>Stanford University Press.
>
>Not long ago Stanford helped finance the manual
>digitization of the presidential papers of Eduardo Frei,
>the former president of Chile, who was concerned that
>records of his administration could be lost in a coup.
>
>And beginning in 1999, the Stanford library system sent a
>team of specialists and students to Europe, where the
>university is engaged in a multiyear project to digitize
>selected documents produced by the General Agreement on
>Tariffs and Trade and its successor organization, the World
>Trade Organization in Geneva. The project, which will take
>five years, will ultimately scan about 2.2 million pages of
>information.
>
>Other ambitious undertakings like Carnegie Mellon
>University's Million Book Project will also continue to
>rely on manual digitization for several more years. Another
>project, led by the Internet Archive in San Francisco,
>recently shipped 80 tons of old books acquired from the
>Kansas City Library to Hyderabad, India, where they will be
>scanned, according to Michael Lesk, a former National
>Science Foundation official and digital library expert who
>works with the archive.
>
>Mr. Lesk said that currently in India or the Philippines it
>is possible to scan and digitize a book for $1 to $4. But
>he acknowledged that there were significant costs in
>quality control.
>
>For Mr. Keller the most vexing challenges are neither labor
>costs nor technology. Librarians, he said, must find a way
>to address the copyright restrictions that appear to be
>tightening as a result of new federal laws like the Digital
>Millennium Copyright Act of 1998.
>
>Stanford is struggling to comply with copyright
>restrictions while making works that have recently lost
>their copyright protection available digitally. Mr. Keller
>said the library increased the circulation of its
>collection by 50 percent when it computerized its card
>catalog. Digitizing out-of-print books could likewise make
>them available to a much wider audience, he said. The
>payoff for building such a digital collection, he added, is
>vastly improved availability of a huge store of knowledge
>and information for teaching, learning and research.

--

-----------------------------------------------------------------------
NINCH-Announce is an announcement listserv, produced by the National
Initiative for a Networked Cultural Heritage (NINCH). The subjects of
announcements are not the projects of NINCH, unless otherwise noted;
neither does NINCH necessarily endorse the subjects of announcements. We
attempt to credit all re-distributed news and announcements and appreciate
reciprocal credit.

For questions, comments or requests to un-subscribe, contact the editor:
<mailto:david@ninch.org>
-----------------------------------------------------------------------
See and search back issues of NINCH-ANNOUNCE at
<http://www.cni.org/Hforums/ninch-announce/>.
-----------------------------------------------------------------------