There's probably a big discussion of this somewhere in here, but I'm not finding it, so if I'm beating a dead horse, send me over there, please...

I have a Nook and read epubs. Normally, the formatting is OK, and I generally use Coolreader and have done a little fiddling to make the book look as I like. Works fine.

I recently started reading an epub that APPEARS to have embedded page numbers, headers, and footers. It's still readable, but the extraneous text makes it very choppy and disrupts the reading process.

Initially, I thought maybe it was a book I'd gotten in Kindle format and converted in calibre, but I can't find any indication of that, so I THINK I actually got it as an epub. It ALSO appears to have hyphenation embedded, so I'm now thinking the publisher may have just taken a paper layout and converted it to epub, complete with all the formatting they'd have used for the paper printing...

Has this happened to other people? Is it common? And, is there a way for me to modify the epub to remove all the extra junk so I have a more readable file? If so, recommendations on what (hopefully free) tool would work?

DiapDealer

08-04-2012, 10:08 AM

Could be a fixed-layout ePub. If so, I don't think there are any magic beans to help convert it into a "normal" reflowable ePub. It could also simply be an untouched OCR scan hastily converted to ePub ... in which case, you'll just have to roll up your sleeves, dive in and manually "clean it up." My guess would be the latter.
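
If you want to rule out the fixed-layout case before you start cleaning, here is a rough sketch of one way to check. It assumes an EPUB 3 file, where a fixed-layout book declares "rendition:layout" as "pre-paginated" in its package (.opf) file. The function names are made up for illustration, and older vendor-specific fixed-layout markers are not checked, so a False result is not proof that the book is reflowable.

```python
import re
import zipfile


def looks_pre_paginated(opf_text):
    """True if the package document declares rendition:layout = pre-paginated."""
    pattern = r'rendition:layout[^>]*>\s*pre-paginated'
    return re.search(pattern, opf_text) is not None


def is_fixed_layout(epub_path):
    """Open the EPUB (a zip archive), locate its .opf file, and inspect it."""
    with zipfile.ZipFile(epub_path) as zf:
        # META-INF/container.xml points at the package document via full-path.
        container = zf.read("META-INF/container.xml").decode("utf-8")
        match = re.search(r'full-path="([^"]+)"', container)
        if match is None:
            return False
        opf_text = zf.read(match.group(1)).decode("utf-8")
    return looks_pre_paginated(opf_text)
```

Calling is_fixed_layout("mybook.epub") (the file name is a placeholder) on a DRM-free copy should tell you which of the two cases you are probably dealing with.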

Toxaris

08-04-2012, 12:37 PM

It sounds like a poor conversion from a PDF. You can usually fix it with some good search and replace functions.

gracie

08-10-2012, 09:04 AM

I figured out enough regular expressions to get the file pretty well cleaned up. Calibre's search and replace worked well, and it made the book a LOT easier to read...
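
For anyone who lands here with the same problem, here is a minimal sketch of the kind of regular expressions that tend to help. These are not the exact expressions used on this book. The patterns, the clean_text name, and the sample header text are all assumptions for illustration, and they work on plain text lines. Inside a real epub the junk is usually wrapped in HTML tags, so the patterns would need adjusting to match what is actually in the file.

```python
import re


def clean_text(text, running_header):
    """Strip page-number lines and a running header, then rejoin split words."""
    # Drop lines that hold only a page number, e.g. "123" or "Page 123".
    text = re.sub(r'(?m)^[ \t]*(?:Page[ \t]+)?\d{1,4}[ \t]*\n', '', text)
    # Drop lines that consist solely of the running header or footer text.
    header_line = r'(?m)^[ \t]*' + re.escape(running_header) + r'[ \t]*\n'
    text = re.sub(header_line, '', text)
    # Rejoin words hyphenated across a line break: "conver-\nsion" -> "conversion".
    text = re.sub(r'(\w)-\n(\w)', r'\1\2', text)
    return text


sample = "This is a poor conver-\nsion of a\n42\nMY BOOK TITLE\npaper layout.\n"
cleaned = clean_text(sample, "MY BOOK TITLE")
print(cleaned)
```

One caveat: the hyphen rule also joins real compounds that happened to break at the end of a line ("well-" / "known" becomes "wellknown"), so it is worth spot-checking the result, and keeping a backup copy of the original file before running any bulk replace.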