Re: reading PDFs on an ereader, check out mobipocket creator. Does a FAR better job at converting than calibre, and you can convert either the final .mobi or better yet the intermediate .html to ePub. Only reason I have a Windows vm on my mac.

If a simple program can do the conversion from PDF to a format that works, reflows, and is easily readable, that proves that it is easy and feasible for the reading software on the Kobo to do the same, or at least that part of it that allows the PDF to be read easily.

David Soulayrol is absolutely correct. The whole point of the PDF format is to allow the author/publisher, not the reader, to control the appearance of the document. It was designed to support typography properly. Think of it as a page sized image, designed to allow printers to duplicate the look of an original document.

Even html/epub/css does not have all the functions of PDF (or of real typography), and therefore, conversions involving graphics or tables will never be very successful.

If you don't like how PDF documents render on a device, get your material in another format more suitable for the device.

David Soulayrol is absolutely correct. The whole point of the PDF format is to allow the author/publisher, not the reader, to control the appearance of the document. It was designed to support typography properly. Think of it as a page sized image, designed to allow printers to duplicate the look of an original document.

I know that's the theory, but it really assumes you are displaying the PDF on a device at least as large as the PDF's target page size. Any attempt to display it on a smaller device, like our ereaders, is quite awkward, and this is why the smart manufacturers have introduced reflow for PDFs.

Re: reading PDFs on an ereader, check out mobipocket creator. Does a FAR better job at converting than calibre, and you can convert either the final .mobi or better yet the intermediate .html to ePub. Only reason I have a Windows vm on my mac.

Could you give a quick step-by-step guide to how you do that? I couldn't find out how to convert the .html to either PDF, .mobi or ePub.

If a simple program can do the conversion from PDF to a format that works, reflows, and is easily readable, that proves that it is easy and feasible for the reading software on the Kobo to do the same, or at least that part of it that allows the PDF to be read easily.

So by that logic we should be able to browse all websites, use a word processor and Excel, etc. To claim it's simple is over-simplifying things, if only because of the difference in hardware capability of a laptop/desktop vs an e-reader. As for the PDFs I use on my Kobo, I would curse them if they tried to reflow them. They aren't meant to be reflowed and attempting to do so would cause me grief. It's at least in part personal preference. Personally I can think of lots of other things I'd rather see them put their time into. Obviously you think they should put time into reflow of PDF. Neither of us is wrong, nor is Kobo. We (like all kinds of Kobo owners) have different priorities for what we want, and Kobo only has so many hours of development time, so they prioritize what they see as important (which might not be what you or I want at the top of the list).

If a simple program can do the conversion from PDF to a format that works, reflows, and is easily readable, that proves that it is easy and feasible for the reading software on the Kobo to do the same, or at least that part of it that allows the PDF to be read easily.

Something like MobiPocket Creator is doing a whole lot of clean-up afterwards - things like filtering out page headings and numbers, de-hyphenation, image cropping, etc. - which would be impossible in real time, at least on current ereader hardware. Yes, you can do a simple reflow without this extra processing, but it looks/reads like crap, see: Calibre's PDF>ePub output.
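To make the kind of clean-up described above concrete, here is a minimal Python sketch. It is not MobiPocket Creator's actual algorithm, which I don't know; the function name `clean_pages` and the heuristics (a first line repeated on most pages counts as a running header, a line of only digits counts as a page number) are assumptions for illustration. It assumes the text has already been extracted from the PDF, one string per page.

```python
import re
from collections import Counter


def clean_pages(pages):
    """Clean extracted PDF text. `pages` is a list of strings, one per page."""
    split = [page.splitlines() for page in pages]

    # Heuristic (assumption): a first line that repeats on more than half
    # of the pages, and at least twice, is a running header.
    first_lines = Counter(lines[0].strip() for lines in split if lines)
    headers = {
        text
        for text, count in first_lines.items()
        if count > 1 and count > len(pages) / 2
    }

    body = []
    for lines in split:
        for index, line in enumerate(lines):
            stripped = line.strip()
            if index == 0 and stripped in headers:
                continue  # drop the running header
            if re.fullmatch(r"\d+", stripped):
                continue  # drop a bare page number
            body.append(stripped)

    text = "\n".join(body)
    # De-hyphenation: rejoin a word that was split across a line break.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Join single line breaks so the text can reflow; blank lines
    # (paragraph breaks) are left alone.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    return text
```

Even this naive version needs to see every page before it can decide what a header is, and it will wrongly merge genuine compounds that happen to break at a line end (e.g. "well-" / "known"), which hints at why doing it well takes more work.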

And of course a simple scanned PDF, which I think is largely what people want to read, is impossible to do any of this with anyway, at least without the additional step of OCRing which introduces a whole extra bunch of problems.

Could you give a quick step-by-step guide to how you do that? I couldn't find out how to convert the .html to either PDF, .mobi or ePub.

thank you

In MobiPocket creator import the PDF. The software will then do its thing and dump the resultant .html file and associated images in a folder in your home directory somewhere, and offer to display the files. It will also offer to create a .mobi, which you can skip since everything Calibre needs is there already. Then import the .html into Calibre, which will show up as a .zip (including the images), and convert that to .ePub.
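If you'd rather script the Calibre half of this than click through the GUI, something like the following Python sketch should work. It assumes Calibre's command-line tools are installed and that `ebook-convert` is on your PATH; the file names in the usage line are hypothetical, so substitute whatever MobiPocket Creator actually wrote out. The helper names `build_command` and `convert` are mine.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(html_file, epub_file):
    """Build the ebook-convert call: input file first, then output file.

    Calibre picks the output format from the output file's extension.
    """
    return ["ebook-convert", str(html_file), str(epub_file)]


def convert(html_file, epub_file):
    """Convert the MobiPocket-generated HTML to ePub using Calibre's CLI."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; is Calibre installed?")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_command(html_file, epub_file), check=True)
    return Path(epub_file)
```

Usage would be along the lines of `convert("MyBook/MyBook.html", "MyBook.epub")`. Keep the images in the same folder as the .html so the links in the HTML still resolve.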

Attached is a screenshot showing the original PDF, the files MobiPocket Creator created, and the end-product .ePub after importing and converting with Calibre (I imported the files over to MacOS as MobiPocket unfortunately is Windows-only, so this is even easier in Windows). The original PDF is unreadable with tiny text, while converting directly with Calibre results in a mess of layout artefacts, but using MobiPocket as the front-end results in a final .ePub that looks like an official ePub, for a book that's very unlikely to ever get an ebook release. Would be amazing if Calibre could do this itself, but it's really focused on transcoding between well-formed XML formats, not intelligently dealing with layout issues from a PDF.

I've also had good success with MobiPocket with multi-column documents, and even some docs I scanned and OCR'd myself using Acrobat. Calibre doesn't even come close.

p.s. tried to convert the same book from PDF to .html with Acrobat to see how they stack up: the Acrobat conversion was totally unreadable: inserted random spaces inside words everywhere for reasons I can't figure out since they're not in the raw text if I just copy-paste it. The only thing Acrobat seems to have over MobiPocket is better handling of tables. Also the notable downside of costing money.

I think when it comes down to the format support issue, you also have to consider the business model of the e-reader maker: do they want to make money mainly on the device they are selling you, or mainly on the e-books you will buy from them?

I think for Kobo it is quite clear: they use high hardware specifications to attract customers, but the end goal is probably not just to sell the device for 129; they want the device to lead you to buy their books, currently mostly (if not only) in ePub format. If they made the Glo handle other formats better, they would spend more (on developing support for those formats) and earn less (since you would be less willing to buy their books when you have a much broader choice of books to read conveniently without buying from them). From this point of view, it is easy to explain why Kobo does whatever it takes to make the Glo look better in ads, but once you have got it, you realize you just don't want to use the Glo to read formats like PDF. Then you settle for thinking of it as probably just an ePub reader, and start to consider whether you should buy an ePub from the Kobo store.

In contrast, Sony, even though it has its own online book store, seems to focus more on hardware profits, at least more than Kobo does. This could explain why Sony makes its readers handle PDF better and adds a lot of features that make reading convenient, while at the same time downgrading the hardware on recent models and attempting to raise the market price of the product.

So basically, I think that for the issue of formats, how the companies want to profit will mostly determine what kind of reading experience the customers eventually get.

Something like MobiPocket Creator is doing a whole lot of clean-up afterwards - things like filtering out page headings and numbers, de-hyphenation, image cropping, etc. - which would be impossible in real time, at least on current ereader hardware. Yes, you can do a simple reflow without this extra processing, but it looks/reads like crap, see: Calibre's PDF>ePub output.

Actually it doesn't look like crap (EDIT: well, maybe Calibre's efforts do, as mentioned in a post above), but I used an ereader that does PDF reflow for a year or more before I got the Kobo. It worked well, and made it possible to enjoy reading PDFs. So I am not speaking theoretically: it can be done, and it has been done, very successfully.

Quote:

Originally Posted by stewacide

And of course a simple scanned PDF, which I think is largely what people want to read, is impossible to do any of this with anyway, at least without the additional step of OCRing which introduces a whole extra bunch of problems.

It is quite clear from the discussion - and I am sure I mentioned it somewhere - that we are NOT talking about scanned image PDFs.

So by that logic we should be able to browse all websites, use a word processor and Excel, etc. To claim it's simple is over-simplifying things, if only because of the difference in hardware capability of a laptop/desktop vs an e-reader. As for the PDFs I use on my Kobo, I would curse them if they tried to reflow them. They aren't meant to be reflowed and attempting to do so would cause me grief. It's at least in part personal preference. Personally I can think of lots of other things I'd rather see them put their time into. Obviously you think they should put time into reflow of PDF. Neither of us is wrong, nor is Kobo. We (like all kinds of Kobo owners) have different priorities for what we want, and Kobo only has so many hours of development time, so they prioritize what they see as important (which might not be what you or I want at the top of the list).

I am astounded that people can argue that it is so hard to do, when it's been done, very successfully, by other ereader manufacturers. It doesn't make the PDF into a perfect epub, but it makes it much more pleasurable to read, and that's all we are asking.

but using MobiPocket as the front-end results in a final .ePub that looks like an official ePub, for a book that's very unlikely to ever get an ebook release. Would be amazing if Calibre could do this itself, but it's really focused on transcoding between well-formed XML formats, not intelligently dealing with layout issues from a PDF.

The Pdf->MobiPocket(.html)->Calibre(epub) route is actually what I did. But how do you make an ePub with MobiPocket without using Calibre?
There is no obvious option to convert to .epub or .mobi. There is an option to make an ePub, but the result is a bunch of files: .html, .opf, .prc, .xml, .jpg, .png

The Pdf->MobiPocket(.html)->Calibre(epub) route is actually what I did. But how do you make an ePub with MobiPocket without using Calibre?
There is no obvious option to convert to .epub or .mobi. There is an option to make an ePub, but the result is a bunch of files: .html, .opf, .prc, .xml, .jpg, .png

Think about it for a second! MobiPocket... Why on earth would a program designed solely for making Mobi content be able to create an alternate format such as ePub, which is totally foreign to the ecosystem the Mobi format was intended for?