I've bought more than 100 books from Amazon since getting my K1 in Jan '08. The last several books I've purchased have had more and more formatting problems: hyphens in the middle of a line in words that shouldn't be hyphenated, no spaces between paragraphs, no paragraph indentations, and, in the current book, at least one instance on almost every page of two words run together without a space.

I have written to Amazon that it is about time they had somebody review new books, and that books which are not properly formatted should not be listed, but bounced back to the publisher for correction. It is not as though Amazon has so few books listed that it needs to rush to list new ones that are deficient.

I trust everybody who finds these messy, stupid errors will send a feedback email asking that they be corrected. (Fat chance Amazon will do this.)

I also find many extra hyphens in the NYT - I get the Sunday editions for the book review section - and am continuously amazed at the introduction of hyphens - especially into names of people and locations.

Almost every book I've bought has errors on nearly every page. Not the kind of typos you see in a published book, like bad grammar and misspelled words. What I'm seeing is formatting problems indicative of sloppy conversions that are not proofread. For example, paragraphs beginning in the middle of sentences, inappropriate hyphenation, terrible OCR conversions, etc. Probably 99% of the errors I see could have been caught and corrected by proofreading.

Proofreading is an extremely costly process. It's probably not economically viable for publishers who are doing "cheap and cheerful" conversions of back-catalog stuff.

Having done a couple of homebrew dead-tree-to-ebook conversions, I can completely agree with this. It is a huge time sink, even taking out the dead tree part and doing just format conversions -- you still have to go through the whole darn thing.

If you are a publishing company and have two staffers assigned to working on eBook technical matters, it'd be a pretty daunting task to do all your expected job responsibilities AND proofread a dozen conversions a week.

However, I agree. I can take the occasional "carriage return didn't get converted to a space" or "hyphenation in the middle of a line" issue. I've seen some truly unreadable content too. The unreadable stuff should never have made it to Amazon -- not when you can spot-check a book and find this kind of problem in under thirty seconds!
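For what it's worth, the "hyphenation in the middle of a line" check is the kind of thing a short script can do as a first pass. Here is a rough Python sketch, not anything a publisher or Amazon is known to run. The function name `find_suspect_hyphens` and the `known_words` set are made up for illustration; in practice you would load a real dictionary file. The idea is simply that a hyphenated token is suspicious if removing the hyphen gives an ordinary word.

```python
import re

def find_suspect_hyphens(text, known_words):
    """Return hyphenated tokens that look like leftover line-break hyphens.

    known_words is an assumed set of lowercase dictionary words; a token
    such as "beau-tiful" is flagged because "beautiful" is in that set.
    """
    suspects = []
    # Match word-hyphen-word tokens that sit inside a line of text.
    for match in re.finditer(r"\b([A-Za-z]+)-([a-z]+)\b", text):
        joined = (match.group(1) + match.group(2)).lower()
        if joined in known_words:
            suspects.append(match.group(0))
    return suspects
```

With a word set containing "beautiful" and "washington", the sentence "The beau-tiful house was well-known in Wash-ington." gets "beau-tiful" and "Wash-ington" flagged, while "well-known" is left alone because "wellknown" is not a word. It only flags candidates; a human still has to decide what to fix.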

So the publishers need to step up to the plate, Amazon needs to reject titles, and consumers need to beat on both for really badly done conversions.

Almost 600 years ago, some guy named Gutenberg revolutionized the written word. He is credited with being the catalyst for the Renaissance period by providing a method of spreading ideas very quickly. Instead of a book taking a year to be copied by hand, it could be printed in a few hours.

The books made with the first moveable type printing press lacked pagination, indentations, and paragraph breaks due to limitations in the technology. This obviously was overcome in time with technology and processes to make the newly printed books easier to read.

I consider E-book readers in the same vein as the invention of moveable type. The time it takes to receive an idea via books has dropped another order of magnitude. And just like 600 years ago, it will take a little time for the rest of the processes to catch up to the new technology.

Proofreading is an extremely costly process. It's probably not economically viable for publishers who are doing "cheap and cheerful" conversions of back-catalog stuff.

I wouldn't have a problem with the shoddy conversions if they were giving the books away or selling them at bargain prices. But publishers are selling unproofed books at full book prices. If publishers want to charge paperback prices or more for e-books, they need to provide the proofing that is a standard part of the publishing process as well.

I agree. I also know how hard it is to thoroughly proof a scanned book. However, many (maybe most) of these errors can be found and corrected automatically by appropriate software or macros, or at least flagged automatically and then corrected by eye. For example, the stray carriage return is usually due to a page ending that is not at a paragraph ending. It usually shows up as a "section break" followed by a lower case letter.
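To make that concrete, here is a minimal Python sketch of the lowercase-letter heuristic. It assumes paragraphs are separated by blank lines and that a paragraph starting with a lowercase letter, after one that does not end in closing punctuation, is really a continuation. The function name `merge_false_breaks` and the `TERMINAL` punctuation list are my own choices for illustration, not part of any real conversion tool.

```python
import re

# Assumed list of characters that plausibly end a real paragraph.
TERMINAL = (".", "!", "?", ":", '"', "'", "\u201d")

def merge_false_breaks(text):
    """Join paragraphs that were probably split by a page break.

    A paragraph is merged into the previous one when it starts with a
    lowercase letter and the previous one lacks terminal punctuation.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    merged = []
    for para in paragraphs:
        if merged and para[0].islower() and not merged[-1].endswith(TERMINAL):
            merged[-1] = merged[-1] + " " + para
        else:
            merged.append(para)
    return "\n\n".join(merged)
```

Given "He walked to the", a blank line, then "store and bought milk.", the two pieces are joined back into one sentence, while a following paragraph that starts with a capital letter is left alone. It will miss some cases and could wrongly merge others (a deliberately lowercase opening, say), so the output still wants a quick visual check.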

A little effort considering the price being charged shouldn't be too much to expect.

I don't think I'd have that big of an issue if the books 1) weren't priced at full price and 2) weren't locked into DRM so I couldn't "fix" the errors. I can't read books with those types of flaws. It drives me nuts. I'm a bit OCD though....

User feedback returned via surveys, e-mails, phone calls and the "contact us" form has been overwhelmingly positive and interesting. Users did not expect to be able to correct OCR text. Once they discovered they could, they quickly took to the concept and method, and several reported finding correcting the text both addictive and rewarding. Users were actively correcting much more than they or we had expected to correct. In addition, our own users have the potential to achieve a 100% accuracy rate with their knowledge of English, history and context, whereas our contractors are only achieving an accuracy of 99.5% in the title headings.

I wouldn't have a problem with the shoddy conversions if they were giving the books away or selling them at bargain prices. But publishers are selling unproofed books at full book prices. If publishers want to charge paperback prices or more for e-books, they need to provide the proofing that is a standard part of the publishing process as well.

This. I'm sorry, but when they're charging as much as a paperback DTB, I expect the same standards as you'd see in a DTB. If they aren't willing to invest in making an ebook version good, I'd rather they not release it at all.

There are so many potential ways to solve these issues that it's not even funny -- I'm sure volunteers could be recruited in exchange for free copies of many titles (and maybe a credit in the ebook edition). Or they could use a distributed proofreading model, as Alisa has suggested before, and allow readers to flag problems on the Kindle itself. But neither publishers nor Amazon seem to want to address quality problems in any solution-oriented way.

To the OP and others -- even if you don't have the time to get specific, I'd encourage you to submit the titles to this thread so that other readers are aware of the problem. I'd also suggest adding the tag "formatting problems" to the product page on Amazon.

I can understand sloppy formatting creeping into older books - that is, any title that has not been published in printed format for more than, say, 20 years.

But, surely, OCR and proofreading can't be an issue for more recent titles. For such books, the publisher must surely have a way of extracting the text for the ebook from their traditional composition systems. In other words, the text is already held electronically (complete with formatting tags), so converting it to an ebook should be a mechanical process.