The
current issue of WIRED (or is it only the online WIRED News? I'm
not always sure which is which.) carries a piece on what Amazon is
doing with its search engines to tease data out of the PDF books it
carries. “Judging a Book by its Contents”
includes the following from Amazon exec. Bill Carr. Oh that news
organizations could bring the same type of thinking to their archives.

Bill Carr, Amazon's executive vice president of digital media, confirms that this is a serious attempt to sell more books.

“We've been spending a lot of time thinking, 'We have this rich digital
content, how can we pull info out and expose it to customers that makes
discovery even better?'” Carr said. “What you are seeing here are the
fruits of a lot of experimenting and brainstorming.”

Carr points to the “adaptive unconscious” SIP from Malcolm Gladwell's best seller, Blink, as an example of how improbable data mining can get a curious reader into the long tail of Amazon's catalog.

Benjamin Vershbow, a researcher at the Institute for the Future of the Book, “…sees Amazon's data mining as part of a trend on the web where sites are
learning to weave data sources together to create a new web experience.”

Someone, and it won't be a newspaper or magazine
publisher, will see an opportunity to do the same thing with our
archives. And no, Lexis-Nexis doesn't count; it is just a warehouse. Valuable,
but without much added value.