kottke.org posts about Publishing

Okay, I'll chase ONE new story today. But it's about this fundamental problem of converting old media objects into new ones, and I get to dig up some old blog posts too, so I feel like I'm still in character.

Google's counting method relies entirely on its enormous metadata collection--almost one billion records--which it winnows down by throwing out duplicates and non-book items like CDs. The result is a book count that's arrived at by a kind of process of elimination. It's not so much that Google starts with a fixed definition of "book" and then combs its records to identify objects with those characteristics; rather, the GBS algorithm seeks to identify everything that is clearly not a book, and to reject all those entries. It also looks for collections of records that all identify the same edition of the same book, but that are, for whatever reason (often a data entry error), listed differently in the different metadata collections that Google subscribes to.
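To make that process of elimination concrete, here is a minimal sketch in Python. It is not Google's actual algorithm: the record fields (`title`, `author`, `year`, `format`), the `NON_BOOK_FORMATS` set, and the normalization rule are all assumptions invented for illustration. It only shows the general shape of the idea: throw out what is clearly not a book, then collapse records that describe the same edition but are listed slightly differently.

```python
import re
from collections import defaultdict

# Hypothetical list of formats that are "clearly not a book".
NON_BOOK_FORMATS = {"cd", "dvd", "map", "microfilm"}


def edition_key(record):
    """Build a normalized key so trivially different listings collide.

    Lowercases text and collapses punctuation and whitespace, so that
    "Moby-Dick" and "MOBY DICK" produce the same key. This rule is an
    assumption; real metadata matching is far messier.
    """
    def norm(text):
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

    return (norm(record["title"]), norm(record["author"]), record["year"])


def count_books(records):
    """Count distinct editions by elimination, not by defining 'book'."""
    clusters = defaultdict(list)
    for record in records:
        # Step 1: reject anything that is clearly not a book.
        if record["format"].lower() in NON_BOOK_FORMATS:
            continue
        # Step 2: group records that identify the same edition.
        clusters[edition_key(record)].append(record)
    return len(clusters)


# Made-up sample records, for illustration only.
sample = [
    {"title": "Moby-Dick", "author": "Melville, Herman", "year": 1851, "format": "book"},
    {"title": "MOBY DICK", "author": "melville herman", "year": 1851, "format": "Book"},
    {"title": "Moby-Dick", "author": "Melville, Herman", "year": 1851, "format": "CD"},
    {"title": "Walden", "author": "Thoreau, Henry David", "year": 1854, "format": "book"},
]
```

Calling `count_books(sample)` drops the CD and merges the two differently typed Moby-Dick listings. Note that the count is only as good as the metadata: a wrong year or a misfiled format in a record quietly changes the total, which is exactly the weakness discussed next.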

But the problem with Google's count, as is clear from the GBS count post itself, is that GBS's metadata collection is riddled with errors of every sort. Or, as linguist and GBS critic Geoff Nunberg put it last year in a blog post, Google's metadata is a "train wreck: a mish-mash wrapped in a muddle wrapped in a mess."

It's not just Google that has a problem. I wrote a post for Wired.com last week ("Why Metadata Matters for the Future of E-books") about how increased reliance on metadata was affecting publishers of new books, who also depend heavily on digital search -- and generally how bibliographic and legal arcana around e-books affects what we see and how we come to see it more than you'd think.

But I wish I'd added Google's woeful records to the piece. It's not like I didn't know about it; here's the title of a post I wrote a year ago, also citing Nunberg's post when it first appeared at Language Log: "Scholars to Google: Your Metadata Sucks".

The benefits of winning the award appear to be few. According to Philip Stone, The Bookseller's charts editor:

"What does the future hold for these items?" Mr. Stone asked, speaking of fromage-frais cartons. "Well, given that fromage frais normally comes in 60-gram containers, one would assume that the world outlook for 0.06-gram containers of fromage frais is pretty bleak. But I'm not willing to pay Â£795 to find out."

For those of you who are more into designer accessories than dairy almanacs, the Calf & Half pitcher lets you pour with udder abandon.

And if you're looking for more clandestine cream, bring your own containers. Raw milk, once our only option, then treated as a potential health hazard, now finds itself on the black market.

Shaking up tech publishing: "It seems that the industry standard [for authors] is something akin to 10% of the profits (which easily take 4-5-6 months to arrive), being forced to write in Word, and finally a production cycle that's at least a good 3 months from final book to delivery. That's horrible!" Building a shop "to take $19 from your credit card" and laying out books in InDesign aren't as easy for everyone as he makes them out to be, but it's a great overall point.

I got an email this morning from a kottke.org reader, Meghann Marco. She's an author and struggling to get her book out into the hands of people who might be interested in reading it. To that end, she asked her publisher, Simon & Schuster, to put her book up on Google Print so it could be found, and they refused. Now they're suing Google over Google Print, claiming copyright infringement. Meghann is not too happy with this development:

Kinda sucks for me, because not that many people know about my book and this might help them find out about it. I fail to see what the harm is in Google indexing a book and helping people find it. Anyone can read my book for free by going to the library anyway.

In case you guys haven't noticed, books don't have marketing like TV and Movies do. There are no commercials for books, this website isn't produced by my publisher. Books are driven by word of mouth. A book that doesn't get good word of mouth will fail and go out of print.

Personally, I hope that won't happen to my book, but there is a chance that it will. I think the majority of authors would benefit from something like Google Print.

Someone asked me recently, "Meghann, how can you say you don't mind people reading parts of your book for free? What if someone xeroxed your book and was handing it out for free on street corners?"

I replied, "Well, it seems to be working for Jesus."

And here's an excerpt of the email that Meghann sent me (edited very slightly):

I'm a book author. My publisher is suing Google Print and that bothers me. I'd asked for my book to be included, because gosh it's so hard to get people to read a book.

Getting people to read a book is like putting a cat in a box. Especially for someone like me, who was an intern when she got her book deal. It's not like I have money for groceries, let alone a publicist.

I feel like I'm yelling and no one is listening. Being an author can really suck sometimes. For all I know speaking up is going to get me blacklisted and no one will ever want to publish another one of my books again. I hope not though.

[My book is] called 'Field Guide to the Apocalypse' It's very funny and doesn't suck. I worked really hard on it. It would be nice if people read it before it went out of print.

As Tim O'Reilly, Eric Schmidt, and Google have argued, I think these lawsuits against Google are a stupid (and legally untenable) move on the part of the publishing industry. I know a fair number of kottke.org readers have published books...what's your take on the situation? Does Google Print (as well as Amazon's "Search Inside the Book" feature) hurt or help you as an author? Do you want your publishing company suing Google on your behalf?