All eBooks -- so-called -- are right now nothing more than a lightly tarted-up text dump into a specific electronic container. They're objects with less capability than print (go on, riffle through those pages with eInk's refresh rate!) and even dumber than print (uh, which page are you actually on?). They are linear, they are stupid. They are the lowest common denominator of what an eBook can be -- and even at that they are simply atrocious.

They are unworthy of serious consideration. Selling them is tantamount to legalized robbery. And the future will look back on them and wonder just what the hell we were thinking.

When I showed this tiny, heavy thing [an original Sony Walkman cassette tape player] to Lili [his young daughter], I’m wondering now if she was thinking, "yeah, it plays music, but what else does it do?" She didn’t ask, but, knowing her, I wonder if that was going through her head. Whether that’s what goes through the heads of her Western generation, the third (?) internet generation. Where’s the controller? What else does it do?

They try to "turn the page" by flicking a finger across the screen. But the Kindle doesn’t have a touch screen….Which means that the second thing that people checking out my Kindle do is get a funny confused look — why doesn’t it work?

It's not unreasonable to expect people to look at an eBook today and wonder, What else does it do?

No, not the hardware device -- the eBook itself.

I can imagine the interior monologue: "Wait, you mean I can start here at the beginning and read through to the end -- and what else? That's all there is? That's stupid! That's dumb! That's like a printed newspaper!"

Yes: Like a printed newspaper! Exactly.

And what has the health of the printed newspaper industry been like in just the past year?

This post is my attempt to:

1) Save book publishing

2) Save eBooks from the fate of printed newspapers

3) Justify a ten-dollar (American) price (and sometimes more than that) for an eBook

4) Create an entire new industry

This post will also delineate two groups. The first is the one we know already: the dying dinosaurs of print, who can't figure out this "eBook thing."

The second group I'm designating second-stage dinosaurs. These are people who think they "get" eBooks -- and see today's pathetic excuses for eBooks as being eBooks.

Concomitant with those groups, there is an implied third group: those who will understand this post and all of its implications and see that, yes, this is part of what an "eBook" actually should be -- and it's time to begin to bring this about.

Consider the sentence "Jesus wept." There is more information contained in those two words than most people would imagine.

And it's that information that should be extracted. That hidden information is the foundation of part of what an eBook should be -- and is meant to be.

Below is an illustration of at least part of the invisible information contained in that short sentence. I don't intend this to be comprehensive nor to be professionally correct within the disciplines I'm bumping into. This is illustrative and a place to begin.

I'm using very rough markup language here. It's probably incorrect as all hell -- but it's a start.

Jesus wept.

<book-title=Christian Holy Bible><sub-book-title=Book of John><book-author=X><book-editor=X><book-publisher-company=X><book-publisher-company-country=X><book-publication-year=X><book-copyright-year=X><book-publication-country=X><book-type=religious><book-origin-language=Greek><book-translated-language=English><book-edition=King James><book-origin-publication-year=X><book-origin-publication-month=X><book-origin-publication-location=X><book-edition-publication-year=X><book-edition-publication-month=X><book-edition-publication-location=X><book-series=X><book-series-number=X><book-dedication=X><book-acknowledgement=X><book-footnote=X><book-sidebar=X><book-caption=X><book-photo-caption=X><book-part=New Testament><book-chapter=11><book-area=verse 35><book-paragraph-number=#><book-sentence-number=#><chapter-paragraph-number=#><chapter-sentence-number=#><factual-person-name=Jesus><factual-person-name-alt-01=Joshua><factual-person-name-alt-02=Yeshua><factual-person-sex=male><factual-person-identity=man><factual-person-age=32><factual-person-eye-color=X><factual-person-hair-color=X><factual-person-skin-color=X><factual-person-nationality=X><factual-person-role=leader><factual-person-occupation-01=carpenter><factual-person-occupation-02=preacher><factual-person-race=Semite><factual-person-religion-01=Judaism><factual-person-religion-02=Jewish><factual-person-label-01=Messiah><factual-person-label-02=Son of God><factual-person-label-03=Son of Man><factual-person-label-04=Word><factual-person-label-05=Lamb of God><factual-person-label-06=Saviour><factual-person-label-07=King of Kings><factual-person-label-08=Lord of Lords><factual-person-father=Joseph><factual-person-mother=Mary><factual-location-galaxy=Milky Way><factual-location-planet=Earth><factual-location-nation=Israel><factual-location-city=Jerusalem><factual-location-rule=Roman><factual-location-politics=Imperial><factual-location-ruler=Pontius Pilate><factual-location-year=32><factual-year-designation=A.D.><factual-local-time=day><factual-local-time-hour=unknown><action-01=weep><action-02=cry><body-item-implied=eyes><body-item=tears><emotion-01=sorrow><emotion-02=regret><emotion-03=disappointment><emotion-04=frustration><emotion-05=sympathy><emotion-06=empathy>

And all of that sample data above is just within a two-word sentence! How much more resides in full paragraphs?
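To show that even markup this rough is something a machine can work with, here is a minimal Python sketch that reads a run of those tags into a lookup table. The regular expression and the `parse_tags` name are my own assumptions about the informal `<key=value>` style above, not part of any real eBook format, and the sample is only a shortened excerpt of the tags.

```python
import re

# Matches one rough tag of the form <key=value>. This pattern is an assumption
# about the informal markup in this post, not a real eBook standard.
TAG_PATTERN = re.compile(r"<([^=<>]+)=([^<>]*)>")


def parse_tags(markup):
    """Turn a run of <key=value> tags into a dict, skipping X and # placeholders."""
    record = {}
    for key, value in TAG_PATTERN.findall(markup):
        value = value.strip()
        if value in ("X", "#", ""):
            continue  # placeholder: the slot exists but has not been filled in
        record[key.strip()] = value
    return record


# A shortened excerpt of the "Jesus wept." tags.
sample = (
    "<book-title=Christian Holy Bible><sub-book-title=Book of John>"
    "<book-author=X><book-chapter=11><book-area=verse 35>"
    "<factual-person-name=Jesus><action-01=weep><emotion-01=sorrow>"
)

record = parse_tags(sample)
print(record["book-title"])     # Christian Holy Bible
print(record["action-01"])      # weep
print("book-author" in record)  # False, because its value was the placeholder X
```

Once the tags sit in a structure like this, each one becomes a field that software can compare, sort and search.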

Some people bailed or skipped down to this paragraph when they saw the tag, "<factual-location-galaxy=Milky Way>." Too bad for you. Put yourself in group two, as a second-stage dinosaur. Because you're not grasping the idea that the same tag can also be used in books about astronomy and the like.

And also because that tag has as its mirror, "<fictional-location-galaxy=>", which is applicable to what is broadly and coarsely termed "science-fiction."

All of this hidden information -- exploded out like that, made explicit -- turns an ebook from a dumb object into a smart object.

Further, it's then possible to associate it with other such objects in ways that are not currently possible. It would enable queries such as these:

Show me all mystery fiction books set in Los Angeles in the year 1945.

Show me all romance fiction books set in Maine in the year 2009.

Show me all fiction books set on Mars in any fictional year, published between 1940 and 1960.

Show me all books with alcoholic leaders.

Show me all books with alcoholic leaders who drink American whiskey.

Show me all books with Abraham Lincoln as a fictional character.

Show me all mystery fiction books originally published in Portuguese in 1960 and translated to English with antagonists who are political activists with left-wing affiliations.

Show me all fiction books that mention writer John Straley.

Show me all first paragraphs from fiction books published in May 2009.

Show me all time-travel fiction books set in America in the 1800s.

Show me all time-travel fiction books that are set in Yugoslavia with any character who is Jewish.

Show me all fiction books with dedications to a wife.

Show me all fiction books with acknowledgements that mention Charles Bukowski.

... and many, many, many more.
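A minimal Python sketch of how two of those queries might run once the tags live in records. Everything in it is hypothetical: the three catalog entries and their titles are invented, `find_books` is a name I made up, and the `fictional-location-*` fields simply extend the rough tag style used above.

```python
# Hypothetical records, as a tag extractor might produce them. The field names
# extend the rough tag style from this post; the titles are invented.
CATALOG = [
    {"book-title": "Example Noir", "book-type": "mystery fiction",
     "fictional-location-city": "Los Angeles", "fictional-location-year": 1945,
     "book-publication-year": 1949},
    {"book-title": "Example Romance", "book-type": "romance fiction",
     "fictional-location-state": "Maine", "fictional-location-year": 2009,
     "book-publication-year": 2009},
    {"book-title": "Example Red Planet", "book-type": "fiction",
     "fictional-location-planet": "Mars", "fictional-location-year": 2150,
     "book-publication-year": 1952},
]


def find_books(catalog, **criteria):
    """Return titles of records that match every criterion.

    Keyword names use underscores in place of hyphens. A criterion is either
    a plain value (exact match) or a callable that tests the stored value.
    """
    matches = []
    for record in catalog:
        for name, wanted in criteria.items():
            stored = record.get(name.replace("_", "-"))
            if stored is None:
                break  # the record lacks this field, so it cannot match
            if callable(wanted):
                if not wanted(stored):
                    break
            elif stored != wanted:
                break
        else:
            # Reached only when no criterion failed.
            matches.append(record["book-title"])
    return matches


# "Show me all mystery fiction books set in Los Angeles in the year 1945."
print(find_books(CATALOG, book_type="mystery fiction",
                 fictional_location_city="Los Angeles",
                 fictional_location_year=1945))   # ['Example Noir']

# Books set on Mars, published between 1940 and 1960.
print(find_books(CATALOG, fictional_location_planet="Mars",
                 book_publication_year=lambda year: 1940 <= year <= 1960))
# ['Example Red Planet']
```

A real system would sit on a proper database instead of a Python list, but the shape of the question stays the same: every tag is a field, and every query is a set of conditions on those fields.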

This would open up book discovery in a way that's just not possible with today's crude and coarse methods. It would enable scholarship that has been impossible. It would give eBooks more possibilities than anyone today can envision -- or should try to envision.

Remember the Walkman cited above. What else can it do?

With such exploded data, an eBook is not just an eBook -- it becomes a ticket for admission to a vast collection of databased information.

An eBook becomes a local terminal connected to a growing and living cloud of associated information, with meanings and implications no publisher or writer can currently imagine. It lets the reader make those connections. It's an eBook that can do something.

And this is precisely why Google wants the Book Search settlement to go through: it sees that as the future. Google is staffed by geeks who juggle information with an expertise that print publishers lack.

Google makes information do things.

Print publishing freezes information into a static object -- an object that dies a little with each passing day. An object that stands alone, disconnected, unable to do anything.

Right now, the hierarchy of print publishing stops at the Publisher. That's the pinnacle. There needs to be another layer slathered over that. The information geeks. The ones who will take the static objects, extract the hidden information, and database it. They will set the standards across the entire industry. They are new publishers for a new age.

This metadata and thin-sliced denotata has value. And that value will increase, not decrease, as it ages. As new connections are formed, and new data is added, its value increases exponentially.

The metadata value of a publisher could equal, if not surpass, that of the works on which it's based.

Metadata will become a multi-billion dollar business.

The entire global economy is built on metadata.

And it's access to that metadata that would justify a ten-dollar (American) price for an eBook. Consumers would understand the additional value that justified the higher price. They would see an actual investment has been made to turn a crudely-decorated text data dump into something active and intelligent.

It'd no longer be a flat, linear collection of words. Dimensions have been added to it that breathe and grow. The eBook price -- again -- becomes a ticket. People are no longer buying an object -- they are also buying into an ongoing experience.

And this is not only applicable to non-fiction. It's just as important to fiction too.

In the early 1990s, Oxford University Press published a hardcover collection of the Sherlock Holmes stories which was filled with descriptive and historical notes -- another kind of metadata!

If you've gotten this far and you can see all of the implications of this, good for you.

If you've gotten this far and you're asking, "But who would use this? What good is this?" then you're a second-stage dinosaur. Because you've just asked the kind of questions that executives at IBM and Hewlett-Packard asked when confronted by the MITS Altair and the Apple II and all those other "toys" of the late 1970s. What did those "toys" lead to?

It's beyond the scope of this post to delve into all the threads that lead from this: such as who owns the metadata, should writers have any rights to publisher-created metadata, etc.

It's also beyond my technical capabilities to suggest how this can be constructed. My instinct is the eBook resides locally and the experiential data resides in the Cloud.

And that data must be able to cross-query across all publisher databases for this to work. Imagine a thousand Googles, all having to speak a common language to present results to one UberGoogle.
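As a toy illustration only, here is a Python sketch of what "a common language" could mean: every publisher answers the same kind of question through the same function signature, and one aggregator merges the answers. The publishers, titles, and the names `make_publisher` and `cross_query` are all invented for this example; a real system would involve networks, standards bodies and a great deal more.

```python
from typing import Callable, Dict, List

# Each publisher exposes one search function with the same signature. That
# shared signature stands in for the "common language"; it is an assumption.
Record = Dict[str, str]
SearchFn = Callable[[str, str], List[Record]]


def make_publisher(name: str, records: List[Record]) -> SearchFn:
    """Build a search function over one publisher's private records."""
    def search(field: str, value: str) -> List[Record]:
        # Copy each hit and stamp it with the publisher it came from.
        return [dict(r, publisher=name) for r in records if r.get(field) == value]
    return search


def cross_query(publishers: List[SearchFn], field: str, value: str) -> List[Record]:
    """Ask every publisher the same question and merge the answers by title."""
    results: List[Record] = []
    for search in publishers:
        results.extend(search(field, value))
    return sorted(results, key=lambda r: r["book-title"])


# Invented publishers and titles, for illustration only.
publishers = [
    make_publisher("Publisher A", [
        {"book-title": "Example One", "fictional-location-planet": "Mars"},
        {"book-title": "Example Two", "fictional-location-planet": "Earth"},
    ]),
    make_publisher("Publisher B", [
        {"book-title": "Example Three", "fictional-location-planet": "Mars"},
    ]),
]

for hit in cross_query(publishers, "fictional-location-planet", "Mars"):
    print(hit["publisher"], "-", hit["book-title"])
# Publisher A - Example One
# Publisher B - Example Three
```

The point of the sketch is the agreement, not the code: if each publisher names its fields differently, the one query in the middle has nothing to ask.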

My aim here is to help push eBooks into what they should be and today must begin to be.

Oh, one more thing: Cost. Yes, it will cost. But would publishers rather go out of business? Or would they like to have third-parties springing up to do this for free? Jump while the window of opportunity is open! Save publishing and advance eBooks.

11 comments:

Terrific post. I found your blog through an RT on Twitter, and I'm hooked! I've been arguing that eReaders are not the future. Although I own many of them for our business (Kindle 1, 2, and DX, and the Sony eReader), as a publisher I just don't see the value of delivering my content to the devices, particularly when I've gone through all the trouble of increasing the meta in our publications. With that said, there might be certain things we would do to satisfy a particular market segment, but in the end, the value of our content (of most content) is on the web. Thanks again.

There's a huge amount of information in that post, and all of it smart, informative, and prompting lots of discussion about the rights to own and/or control that metadata. The end result is, as you point out, a better understanding of anything being read, and a livelier experience for the reader.

Plus, yes, a fuck of a lot of work. But the end result will make it worthwhile.

First, however, we gotta convince people that anything other than ink, glue, and paper is desirable. This model might be the way people are convinced.

I find this endlessly fascinating. How would you envision it working, though? Let me get all geeky for a moment (and please accept the substitute xml tags since I'm pretty sure I can't use the real thing in a comment)...

1. If the info lived in the cloud, I'm assuming there'd be a centralized server of some kind. Something that says "this John Doe is ID# 398381858381". That in itself could get very complicated, and there's the question of who would maintain it, how they would resolve arguments and stuff like that. But let's skip that for now.

2. Let's assume it's not in the cloud. Let's say the metadata is stored in the tail end of the ebook itself, and referenced in a transparent way by whatever reader is reading it. So now we define John Doe on a book-by-book basis... which means if he evolves between books in a series, the metadata on the earlier story is out of date... but maybe that's ideal, since it would be like spoilers otherwise. Anyway... so we have a giant grid of character, location and maybe selected object/abstraction information too. Anything not fully defined outside the book itself. You could skip describing a toothbrush, but not a fictional company. So that leads me to...

3. How many times would you link things? Look at this line:

[p]"What's going on here?" said [char id="314"]John[/char][/p]

The ID 314 connects to the object ID in the meta file, and pulls the info through. But do you link every instance of the name "John"? Do you link the word "he" when it refers to him? Are we looking at quadrupling the size of every ebook? Or just the first instance? Or maybe just the first instance in each chapter?

I think I might try something along these lines for my new project (it could use the metadata anyway)... but the execution will be interesting to explore. Excellent post!

As a writer who sells many more eBooks than print books, I think extended metadata is important. There are two key problems, however.

One: Fiction is written to provide an experience that begins at page one and ends on the last page (for the most part). There are plot twists and other developments in my work that would come up in searches and spoil the experience I'm trying to provide to my readers, for example. There are many other aspects that would be ruined if you didn't read a book linearly.

Two: For metadata searches to work as you've stated here everyone would have to have open access to eBooks. No one would be able to sell an eBook because it would already be open to the public. Publishers might find another way to make money, they're harder to crush than people give them credit for, kind of like cockroaches, but writers would starve. I make my living selling my eBooks for very reasonable prices (under $8.00 US), and my readers are happy to read them on their mobile phones. I've also never had a complaint about the price.

Metadata will become an integral part of finding the eBook you want. How we search for literature will change, what we look for in literature may change a little, but there will always be people who want to read a BOOK. That is to say, they'll start at page 1 and finish on the last, not search for one tidbit after another through random volumes that have been made a part of a searchable database that treats fiction, with all its colour, invention and style, like parts of a bloody overgrown encyclopedia.

Oh, and as for the devices, they're in their infancy. With regard to searchability, most good reader software makes eBooks easily searchable and is linked to a dictionary or encyclopedia so you can do an instant word lookup while reading. They're getting there, but as with any new concept it'll take time. Perhaps you could create a working model of your idea? Sounds like you could become something of a publisher yourself.

You covered a lot of ground but your salient points are as well-made as ever. Let's make sure we make eBooks the best they can be, working hard at this, the nascent stage of their development, to realize their full potential X years down the road.

Even if we assumed that a book passage could contain all that metadata, I'm not sure that searchability is that important for most books (especially fiction books).

People would never search for horror books produced in Los Angeles between 1980 and 1990 (maybe scholars would, but not ordinary readers). They are more likely to have fuzzy criteria in mind. Serendipity is a better value to cultivate, but that involves an element of randomness and social sharing. Perhaps there is more value in user-generated folksonomies than something a creator could provide.

With regard to "Jesus wept," as a literature buff, I would be most interested in seeing echoes of this sentence in other literary passages. We need a way to collect annotations on a book.

One more thing. I highly recommend Jack Matthews: BOOKING IN THE HEARTLAND (1986). It's a visionary look at the book.