Basically, EPUB internally uses XHTML or DTBook (an XML standard provided by the DAISY Consortium) to represent the text and structure of the content document, and a subset of CSS to provide layout and formatting. XML is used to create the document manifest, table of contents, and EPUB metadata. Finally, the files are bundled in a zip file as a packaging format.

So, um, the ePub format uses HTML and CSS? Pardon, but what the flying flip is wrong with, say, using HTML and CSS, raw, straight up, no chaser?

ePub is specifically hobbled for no good reason, when apparently if it weren’t for the sleight of hand in file extensions and a few things that aren’t kosher HTML, we could just read these things in a browser…

…A browser on any screen we already happen to own (and we own a lot of screens that support web browsing).

Hm? Publishers? Amazon, Sony, B&N? Anyone care to mention why you’re using tried-and-true open source code and standards but are trying to hide the easily comprehensible, utterly familiar bits behind this ePub screen? “Pay no attention to the open source code behind the curtain!”

Say, how long would it take someone to kludge together a Firefox and/or Chrome extension to read ePub-formatted “e-books” in browser with very little in the way of effort?

[who says Google isn’t already working on that for Chrome? Google Editions, anyone?]
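For a rough sense of how little effort "very little" might be, here is a minimal sketch in Python (not a browser extension, just the same unpacking steps). It assumes the standard EPUB container layout: a zip whose META-INF/container.xml points at a package (OPF) file, whose manifest and spine give the reading order. The sample book, its file names, and the helper names `reading_order` and `make_sample_epub` are all made up for illustration, and the sample is stripped down well below what a real, valid EPUB requires.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def reading_order(epub_file):
    """Return the content files of an EPUB in reading (spine) order."""
    with zipfile.ZipFile(epub_file) as zf:
        # META-INF/container.xml points at the package (OPF) document.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf_path = rootfile.get("full-path")

        # The manifest maps ids to files; the spine lists ids in order.
        opf = ET.fromstring(zf.read(opf_path))
        hrefs = {
            item.get("id"): item.get("href")
            for item in opf.findall("opf:manifest/opf:item", OPF_NS)
        }
        base = posixpath.dirname(opf_path)
        return [
            posixpath.join(base, hrefs[ref.get("idref")])
            for ref in opf.findall("opf:spine/opf:itemref", OPF_NS)
        ]


def make_sample_epub():
    """Build a tiny, hypothetical EPUB in memory so the sketch runs alone."""
    container = (
        '<container version="1.0" '
        'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
        '<rootfiles><rootfile full-path="OEBPS/content.opf" '
        'media-type="application/oebps-package+xml"/></rootfiles>'
        '</container>'
    )
    opf = (
        '<package xmlns="http://www.idpf.org/2007/opf" version="2.0">'
        '<manifest>'
        '<item id="ch1" href="ch1.xhtml" media-type="application/xhtml+xml"/>'
        '<item id="ch2" href="ch2.xhtml" media-type="application/xhtml+xml"/>'
        '<item id="css" href="style.css" media-type="text/css"/>'
        '</manifest>'
        '<spine><itemref idref="ch1"/><itemref idref="ch2"/></spine>'
        '</package>'
    )
    page = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>{}</p></body></html>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr("META-INF/container.xml", container)
        zf.writestr("OEBPS/content.opf", opf)
        zf.writestr("OEBPS/ch1.xhtml", page.format("Chapter one."))
        zf.writestr("OEBPS/ch2.xhtml", page.format("Chapter two."))
        zf.writestr("OEBPS/style.css", "p { margin: 1em; }")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for path in reading_order(make_sample_epub()):
        print(path)
```

Each path that comes back is an ordinary XHTML file, which is the whole point: once the zip is unpacked, what is left is something a browser already knows how to display.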

Manufacturers, publishers, and Google are all likely scared stiff that we’ll figure this out on our own. There is nothing new about e-books; in fact even the fancy e-container is nothing new, as anyone who can use Wikipedia soon realizes:

E-books are web pages! [and Soylent Green is made of People!] The trick, apparently, is restricting access to web-ready text, keeping us from copying it, and making it seem new and special and something that requires new hardware, when in fact we’ve been reading text on screens for a couple of decades now. Why sell us a system when you could just sell us a book: device agnostic, HTML/CSS configured, browser ready? Just the file, dammit. Why all the shenanigans?

With a simple browser extension or update, ePub is even worse (in a business sense) than MP3. MP3s require some sort of plug-in or app; text requires a pair of eyeballs (— sorry, didn’t mean to be discriminatory: a single eyeball, or a sensitive touch and knowledge of Braille will also do the trick) and everything we do on the web already supports that text function. The ‘T’ in HTML is text; this goes deep into the bones of the web we all already know.

##

It’s not gloom and doom, though: Where a hit song can take up 3 1/2 minutes of your time, and a whole album (no matter how epic) seldom takes up more than 77 minutes, a book requires hours of your time and attention. That’s not to say people won’t pirate books (I think it’s fair to say people will pirate anything), but there is less of a potato-chip factor to books [justonemore] than there was in the early days of downloading music.

Though I suppose that only applies to long works of fiction and literature; comics (particularly 22-48 page pamphlets) may be screwed and 200pgs of manga take, in my experience, about an hour to read the 1st time through. (Obviously I take more time on subsequent re-reading, though not all fans are plumbing books for depth of meaning and appreciation of the art)

But, still, after 15 years (Napster launched in June of 1999 — and only operated for two years — and Napster would have been impossible without the acceptance by so much of the user base of the .mp3 encoding and format, introduced in 1995)

Actually, I need to interrupt that thought — Please note, MP3 came first: small enough to be portable, good enough for most uses, widespread years before Napster. It wasn’t file sharing that “killed” music (if that is your corporate or legal position; and it’s a position that is debatable) it was the music files that made the sharing possible and piracy inevitable.

Anyway, after 15 years, digital music settled down into a new pattern and apparently, someone [*cough* Jobs, Apple *cough* *cough*] figured out how to make money off of it, and we currently enjoy a DRM-free music environment where we buy things and they’re ours, so long as we don’t lose the file. (I’ve lost more music to friends, borrowing my CDs and never returning them, than I ever will to hard drive crashes.)

First comes the format, then the user base, then the tools, then the pirates, and then the new sales ecosystem. We’re still waiting on the ebook format for comics (raw image files don’t quite cut it) (and the ebook format for text is already out there: you’re reading a blog that uses it) and while we haven’t seen great tools or ‘mainstream’ piracy a la Napster, yet, I think we all know this is coming.

The trick is to leapfrog the mess and lost revenue and legal hassles and skip ahead to the stable, legal sales ecosystem.

Closed formats, DRM, and “walled gardens” are going to be part of that process. In fact, it’s the part that is currently trying to charge you $100-$250-$500 (and up, for some iPads) for a dedicated e-book reader. And honestly, much like the lawsuits and massive piracy, it’s a step most of us would be willing to skip.

There are just too many businesses, participants, and players out there looking to squeeze money out of the old system or cash in on the new system. This isn’t an open market; this isn’t free-market capitalism yet. People, customers, are going to get screwed over. Businesses, heck, whole industries may go under before the dust settles and the solution seems obvious.

##

What pisses me off most is this assumption that borders on religious belief, that the old system and old players are somehow sacrosanct, and will always be able to stay in business, stay at the top of their business, and no matter what changes they are guaranteed a place at whatever new table gets set up.

In short: No.

And that’s how things work: Past performance is no guarantor of future success, all investment carries risk; innovate, adapt, respond, or die.

##

So, I’m a bookseller, the endangered salamander of the old ecosystem, soon to die and not likely to stay open through next year, let alone the re-alignment to come.

Say, combine this recent post by Seth Godin with my post of 10 months ago — hm. It seems I’ve a business that has nothing to do with ‘books’ and everything to do with the way you people *already* use my bookstore. So what if the books are only decorations, and you buy coffee while connecting to the internet through a portal on which I can sell ads — it’s not much, but the major payroll expenses are restocking and selling the books; if no one is buying books we can do more with less. (While also selling books online – oh, yeah, I’m thinking about this 5 different ways, including some I haven’t posted about yet) Don’t worry about the store; we may end up as nothing more than a fancy coffeeshop, but I’ll make out.

And new books will be released. Maybe not as many, to start with, but good content will win out.

The question, then, between 1970-publishing-and-retail and 2015-publishing-and-retail is how much bullshit they [“they” being corporations who control—but do not create—content] put us through and how much cash they con us out of, before we find the new model book paradise just past the horizon.

Comment

What’s wrong with raw image files?

And how much beyond raw image files is required to unwrongify it?

The reason image/jpeg is flawed for manga is that it is lossy and this is not 1998. The artifacts are more cost than the bandwidth.

But what’s wrong with raw PNG files? It’s got the color range needed to display color pages, it allows the precision to display fine ink drawings, and it has the transparency to encode both the underlying artwork and the language-specific overlay.

The metadata required is a rectangle dimension for each panel to allow smaller viewer devices to auto-pan panel by panel. That can be defined as a tEXt ancillary chunk.
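As a rough illustration of the tEXt idea, the Python sketch below writes panel rectangles into a PNG and reads them back using only the standard library. The chunk layout (length, type, data, CRC-32) and the tEXt body (keyword, null byte, Latin-1 text) follow the PNG specification; the keyword "Panels" and the "x,y,w,h;x,y,w,h" text encoding are assumptions invented here, not an existing convention, and the 1x1 test image stands in for a real page.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(kind, data):
    """Serialize one PNG chunk: length, type, data, CRC-32 of type+data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def iter_chunks(png):
    """Yield (type, data) for each chunk in a PNG byte string."""
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        kind = png[pos + 4:pos + 8]
        yield kind, png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def add_panels(png, panels):
    """Return a copy of png with panel rectangles stored in a tEXt chunk."""
    text = ";".join(",".join(str(n) for n in rect) for rect in panels)
    # "Panels" is a hypothetical keyword chosen for this sketch.
    panel_chunk = chunk(b"tEXt", b"Panels\x00" + text.encode("latin-1"))
    out = PNG_SIGNATURE
    for kind, data in iter_chunks(png):
        if kind == b"IEND":
            out += panel_chunk  # IEND must stay the last chunk
        out += chunk(kind, data)
    return out


def read_panels(png):
    """Return the stored (x, y, width, height) rectangles, or an empty list."""
    for kind, data in iter_chunks(png):
        if kind == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            if keyword == b"Panels" and text:
                return [
                    tuple(int(n) for n in rect.split(","))
                    for rect in text.decode("latin-1").split(";")
                ]
    return []


def tiny_png():
    """A 1x1 grayscale PNG built by hand, so the example needs no files."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte 0 + one black pixel
    return (PNG_SIGNATURE + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", idat) + chunk(b"IEND", b""))


if __name__ == "__main__":
    page = add_panels(tiny_png(), [(0, 0, 400, 300), (400, 0, 400, 300)])
    print(read_panels(page))
```

Because tEXt is an ancillary chunk, a viewer that knows nothing about panels should simply ignore it and show the full page, while a small-screen reader could use the rectangles to pan.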

The fact that manga and comics viewer sites with anywhere from millions and tens of millions to in excess of a billion views work with raw images suggests that it is more or less workable.

For viewing, they can just be served.

For downloading, they are stored in a zip containing a text index file and a main subdirectory named after the work. The main art subdirectory contains the art in PNG files, Name-000.png and so on. Additional subdirectories provide each language localization with a parallel Name-000.png overlay, with appropriate transparency to overlay the main art. The localization subdirectories are named with the standard language codes from some widely available reference standard.
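One way that download layout could be packed, sketched in Python. Several details are guesses filled in for illustration: the index file name (index.txt), one page name per line as the index format, two-letter codes such as "en" and "ja" for the localization directories, and placing those directories inside the main art directory. The function name `build_archive` is likewise made up.

```python
import io
import zipfile


def build_archive(work, pages, overlays):
    """Pack page art and per-language overlays into one zip, returned as bytes.

    pages    -- list of PNG byte strings, in reading order
    overlays -- dict mapping a language code to a parallel list of PNG overlays
    """
    names = [f"{work}-{n:03d}.png" for n in range(len(pages))]
    buf = io.BytesIO()
    # ZIP_STORED: PNG data is already compressed, so deflating again gains little.
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        # Plain-text index: one page file name per line, in reading order.
        zf.writestr("index.txt", "\n".join(names) + "\n")
        for name, art in zip(names, pages):
            zf.writestr(f"{work}/{name}", art)
        for lang, layers in sorted(overlays.items()):
            for name, layer in zip(names, layers):
                zf.writestr(f"{work}/{lang}/{name}", layer)
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder bytes stand in for real PNG data.
    data = build_archive(
        "Name",
        [b"art0", b"art1"],
        {"en": [b"en0", b"en1"], "ja": [b"ja0", b"ja1"]},
    )
    for entry in zipfile.ZipFile(io.BytesIO(data)).namelist():
        print(entry)
```

A reader would then draw Name/Name-000.png and composite Name/en/Name-000.png on top of it for English, swapping only the overlay directory to change languages.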

If there were a format that allowed one to both view full pages (& double pagespreads) in the ‘raw’ while also allowing easy navigation panel-by-panel [so much easier to see detail & also to read dialogue and captions that way] then I think we’d have a suitable e-comic or e-manga format.

I know of no format or reader that currently makes that possible, at least not in an any-browser, any-reader format.

HTML can’t be used, because it’s a failure as a brand. If it’s used for the format then everyone is going to try to implement an ebook reader using one of the existing browsers, which means you’ll have ebooks that look correct only on (read: target as a platform) IE, some on Firefox, and some on WebKit. That puts HTML out of the question.

While it’s unpleasant to hear that it’s yet another XML application, at least it’s not something odious like COM structured storage, or something as complex as a subset of PostScript.

As for why not a raw image, it’s difficult to pass an image file through a TTS synthesizer. A properly-done XML file, with a bit of processing, becomes an instant audiobook.
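The "bit of processing" can be quite small. The Python sketch below pulls speakable text out of a made-up XHTML chapter, one string per top-level block, which is roughly what a text-to-speech engine would want as input. It only walks the direct children of the body element and does not call any speech engine; the sample markup and the name `speakable_text` are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"

# A made-up chapter, standing in for one content file from an e-book.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body>
    <h1>Chapter 1</h1>
    <p>It was a <em>dark</em> and stormy night.</p>
  </body>
</html>"""


def speakable_text(xhtml):
    """Return the text of each top-level block in <body>, markup removed."""
    body = ET.fromstring(xhtml).find(XHTML + "body")
    blocks = []
    for element in body:
        # itertext() walks nested inline tags; split/join tidies whitespace.
        text = " ".join("".join(element.itertext()).split())
        if text:
            blocks.append(text)
    return blocks


if __name__ == "__main__":
    for line in speakable_text(SAMPLE):
        print(line)
```

Nothing comparable exists for a page of pixels without first running it through character recognition.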

Plus, XML compresses better in a zip archive than almost any image format, but no one really cares about file size in the end.

It’s a gimped-up, almost web-ready document, even simpler than a web page but using the familiar code in unfamiliar ways.

An ePub file might look different in IE/Safari/Firefox/Chrome/Lynx (remember Lynx?) but would be viewable by all because that’s its whole damn raison d’être and design philosophy: it’s straight text, with just the barest overlay of formatting and navigation.

It’s a damn file, which was the point of my article, and a pretty simple file at that. This isn’t YouTube, it’s text.

And I’m wondering why I can’t just open it in a browser — any browser, and all browsers.

In my head, I sound like Yahtzee (quite a feat, given my inherited U.S.-flat-midwestern-accent.)

attribution

RocketBomber is a publication of Matt Blind, some rights reserved: unless otherwise noted in the post, all articles are non-commercial CC licensed (please link back, and also allow others to use the same data where applicable).