Some people eat, sleep and chew gum, I do genealogy and write...

Thursday, December 26, 2013

Will all the books in the world be digitized?

I guess my first comment on this subject would be, don’t
hold your breath. But this is a real concern for genealogists and others, and it
cannot be dismissed quite that cavalierly.
This is especially true when some very large companies in the
world have as a goal the digitization of every book and, in some cases,
every record in the world.

But let’s assume that the number of books in the world really is around 130
million or so. Could Google digitize all those books? Well, the answer is that if
they had them available to scan, yes, they could. According to some estimates, Google has
already scanned over 30 million books since starting in 2004 and has done over
10 million in the last year. At that rate, they would be “done” in about ten
years. But the real question is not whether Google is going to digitize every
last book in the world, but whether anyone at all is going to do so.
Of course, if you think about it for a minute (or more, as the case may be), you
will soon realize that there are some apparently insurmountable obstacles
to achieving this goal. There are the physical limitations of access created by
national boundaries and attitudes. Do you really believe that every library in
the world is just going to sit there and let Google (or anyone else) waltz in
and start scanning away?
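
For anyone who wants to check the "about ten years" arithmetic, here is a minimal sketch in Python. The function name is my own invention for illustration, and the figures are just the rough estimates quoted above (130 million books, 30 million already scanned, 10 million scanned per year). It also assumes the scanning rate stays constant and that every book is actually available to scan, which is exactly the assumption questioned above.

```python
def years_to_finish(total_books, scanned_so_far, books_per_year):
    """Estimate the years left to scan everything at a constant yearly rate.

    All three inputs are rough estimates, so the result is only a
    back-of-the-envelope figure, not a prediction.
    """
    remaining = max(total_books - scanned_so_far, 0)
    return remaining / books_per_year


# Rough figures quoted in the post (assumptions, not verified counts).
years = years_to_finish(
    total_books=130_000_000,
    scanned_so_far=30_000_000,
    books_per_year=10_000_000,
)
print(f"About {years:.0f} years at the current rate")
```

Changing any one of the three numbers shows how sensitive the estimate is. If the true number of books is much larger than the estimate, the date moves out accordingly.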

Don’t think I have ignored the issue of copyrights. Really,
copyright isn’t an issue with the digitization of books; it is only an issue
with displaying the digitized books or making them available online. Let me
give you an example of one problem. This problem hits home because it is
sitting in the Mesa FamilySearch Library. Many of the books in the Mesa
FamilySearch Library are essentially unique. They are very limited editions.
What's more, they are extremely unlikely to have been included in Google's
estimate of the number of books. So the question of whether all the
world's books will be digitized is not a legal issue, nor is it a
digitization issue; in the end, it is a totally practical problem of making all
of the books available to be digitized. Now, I should mention that many of the
books in the Mesa FamilySearch Library have already been digitized and are
already available online on FamilySearch.org. But under present policies and procedures,
the remaining books that are under copyright and are unique or in limited editions
will likely not be digitized ever. In this context, "ever" means until the
copyrights run out, and that is a very long time, even assuming that the
United States Congress does not pass additional extensions of copyright
coverage in the future.

So the answer to the question is 42.

Just in case that answer is not satisfying, here is the full
quotation:

"Good Morning," said Deep Thought at last.
"Er..good morning, O Deep Thought" said Loonquawl nervously, "do
you have...er, that is..."
"An Answer for you?" interrupted Deep Thought majestically.
"Yes, I have."
The two men shivered with expectancy. Their waiting had not been in vain.
"There really is one?" breathed Phouchg.
"There really is one," confirmed Deep Thought.
"To Everything? To the great Question of Life, the Universe and
everything?"
"Yes."
Both of the men had been trained for this moment, their lives had been a
preparation for it, they had been selected at birth as those who would witness
the answer, but even so they found themselves gasping and squirming like
excited children.
"And you're ready to give it to us?" urged Loonquawl.
"I am."
"Now?"
"Now," said Deep Thought.
They both licked their dry lips.
"Though I don't think," added Deep Thought, "that you're going
to like it."
"Doesn't matter!" said Phouchg. "We must know it! Now!"
"Now?" inquired Deep Thought.
"Yes! Now..."
"All right," said the computer, and settled into silence again. The
two men fidgeted. The tension was unbearable.
"You're really not going to like it," observed Deep Thought.
"Tell us!"
"All right," said Deep Thought. "The Answer to the Great
Question..."
"Yes..!"
"Of Life, the Universe and Everything..." said Deep Thought.
"Yes...!"
"Is..." said Deep Thought, and paused.
"Yes...!"
"Is..."
"Yes...!!!...?"
"Forty-two," said Deep Thought, with infinite majesty and
calm.
― Adams, Douglas. The
Hitchhiker's Guide to the Galaxy. Ballantine, 1980.

4 comments:

Because the 'Inside Google Books' article refers to "books of the world", I would hope that the count isn't limited to those published in the English language. However, there's no explicit mention of the issue, James.

While I can easily imagine OCR works OK for other Latin-based languages, I admit that I have no knowledge of its usage for other scripts, e.g. Cyrillic, Japanese, Chinese, Korean.

There must be a large number of published works in the associated languages but you wouldn't find many of them in a US/UK library.

All I can say is… this would require a lot of time to process metadata and other details associated with the digitized books, should it eventually come to pass. Although digitized books do pave the way to easier access, their usefulness will depend on how well the digital catalog is handled and how extensive it is, because digitizing for the sake of just having digital copies isn’t exactly productive.