I apologize if this inconveniences folks. I asked about putting up a cross-sticky to this thread in the PDF forum, and the solution decided on by the moderator ended up being to merge the two main k2pdfopt threads, with the merged thread residing in this PDF forum, which I do think makes the most sense. I do want to recognize WangMinh12, who started the k2pdfopt thread in the Kindle Development forum where it saw a lot of activity over the last year. Thank you, WangMinh12.

I read a lot of scanned PDFs on 6" e-ink readers, and k2pdfopt has been my main weapon for about a year now.

Before k2pdfopt I used Briss, Pdfscissors, ABBYY FineReader 10, Scan Tailor, etc. to crop margins, and then printed the cropped PDF from Adobe Acrobat in Tile page mode (now also possible in Adobe Reader as Posters mode) at landscape dimensions (about 120x85 mm).

Now I usually use Briss together with k2pdfopt: first I crop the PDF with Briss, then I convert it in landscape mode with k2pdfopt, with wrap turned off, margins set to zero, and output at 185 ppi instead of 167.

What I would like to see in a new version of k2pdfopt is the possibility to crop an OCR-ed PDF image at exactly the text width!

Is it possible to somehow use the OCR coordinates to crop the PDF image?

It is a matter of seconds to crop a textual PDF with many of the viewers or editors out there, so why is there no such command for a PDF image with OCR in the background?

For those who didn't know, there is a nice free online PDF cropping service for files under 10 MB that I sometimes use for OCR-ed PDF images.

Markom--I'm not quite sure what you're getting at since the whole objective of k2pdfopt is to magnify the text and/or to crop out excess white space and margins. Maybe you want to e-mail me via my web site and we can discuss it offline?

Sometimes removing black margins close to the text (due to bad scanning) is not perfect with k2pdfopt or any other application out there.
I mean, it is always at least good enough for reading on my small e-ink reader in landscape mode and I'm very happy with it, but on a small screen every millimeter can matter, so I prefer to use Briss to crop the PDF image as close to the text as possible and then run the cropped PDF through k2pdfopt. Even then, the result is sometimes not perfect, or it takes more time.

I'm talking about PDF scans with OCR in the background (searchable image) here.

If there were a tool to automatically crop such a PDF to the text width (size), maybe (if possible) by using the existing background OCR data to decide exactly where to cut the front image, there would be no need to manually draw rectangles as in Briss or Pdfscissors, or to try different margin values in k2pdfopt for different pages, and the result would always be near perfect.

So, yes, this wish of mine is not directly connected with k2pdfopt itself but as you've mentioned on your pages:

"... A future release might also have an option for a different type of output that would use cropping instructions rather than rasterizing to generate the converted file (similar to what is done in Cut2Col, SoPDF, and the latest version of PaperCrop, which all leave the text in searchable form if it started that way in the original file)."

maybe you can figure things out and grant us another great cropping tool.
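
To make the idea above concrete, here is a rough sketch in Python of cropping to the OCR text extent. It is not how k2pdfopt works and it does not read a PDF; it assumes the word bounding boxes from the hidden OCR layer have already been extracted by some other tool. The names `text_crop_box`, `word_boxes`, `page_box`, and `pad` are made up for illustration.

```python
from typing import Iterable, Tuple

# A box is (x0, y0, x1, y1) in page units, with x0 < x1 and y0 < y1.
Box = Tuple[float, float, float, float]


def text_crop_box(word_boxes: Iterable[Box], page_box: Box, pad: float = 2.0) -> Box:
    """Return the smallest box enclosing all word boxes, padded and clamped to the page."""
    boxes = list(word_boxes)
    if not boxes:
        # No OCR text on this page: leave it uncropped.
        return page_box
    # Union of all word boxes, grown by a small safety margin.
    x0 = min(b[0] for b in boxes) - pad
    y0 = min(b[1] for b in boxes) - pad
    x1 = max(b[2] for b in boxes) + pad
    y1 = max(b[3] for b in boxes) + pad
    # Never let the crop extend beyond the original page.
    px0, py0, px1, py1 = page_box
    return (max(x0, px0), max(y0, py0), min(x1, px1), min(y1, py1))


# Hypothetical word boxes on a 612x792 pt page.
words = [(72.0, 90.0, 130.0, 102.0), (140.0, 90.0, 300.0, 102.0), (72.0, 700.0, 250.0, 712.0)]
page = (0.0, 0.0, 612.0, 792.0)
print(text_crop_box(words, page))
```

The resulting box would then be applied as the page's crop rectangle, which leaves the scan and its searchable text untouched. A stray OCR "word" recognized in a black scan border would widen the box, so a real tool would probably need to filter outliers first.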

I think I get what you want and it depends on whether I can figure out how to have MuPDF render only text primitives from the PDF file. I will add it to my k2pdfopt wish list. Thanks for the idea.

K2pdfopt v1.50 is released. The major new feature is optical character recognition (OCR--English only), but there are several other new features that various users have requested. I've also released the source code. See the web site for more details.

1. Can't you preserve images somehow, without changing their size? I mean, if the image is large, don't split it; just fit it onto the page (or the next blank page), because right now it splits big images. If you don't split them, the reader can zoom in on those pictures (and since this case happens rarely, it won't be too inconvenient for readers).

Now that 6" e-ink readers are pretty cheap, I read some of those problematic-size PDFs (magazines, newspapers, A4 or bigger books, etc.) on two e-readers placed next to each other.

... it would be nice if we could do it all within k2pdfopt only.

Interesting concept. Not hard to implement if you don't need a clean gap between the two side-by-side images (just save each half of the output image to separate PDF files), but I'd think it would be distracting to have words or even individual letters split across the two kindles. Seems a lot more cumbersome and less convenient than just getting a bigger e-reader or tablet.
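
One way to soften the split-letter problem mentioned above would be to cut at the blank column nearest the middle of the page rather than at the exact center. The Python sketch below is only an illustration of that idea, not k2pdfopt code; `find_split_column` and `ink_per_column` (the count of dark pixels in each pixel column of the rendered page) are hypothetical names.

```python
from typing import List, Optional


def find_split_column(ink_per_column: List[int], max_offset: Optional[int] = None) -> int:
    """Return the blank column nearest the page center, or the center if none is blank."""
    width = len(ink_per_column)
    if width == 0:
        raise ValueError("empty page")
    center = width // 2
    # By default, search the whole page width for a blank column.
    limit = max_offset if max_offset is not None else width
    for offset in range(0, limit + 1):
        # Check the column to the left of center first, then the one to the right.
        for col in (center - offset, center + offset):
            if 0 <= col < width and ink_per_column[col] == 0:
                return col
    # No blank column within reach: fall back to a plain center cut.
    return center


# Hypothetical ink profile for an 11-pixel-wide page.
ink = [0, 0, 5, 7, 3, 4, 0, 6, 2, 0, 0]
split = find_split_column(ink)
left_columns, right_columns = ink[:split], ink[split:]
```

Columns `[0, split)` would go to the left reader and `[split, width)` to the right one. On a page of justified text there may be no blank column near the middle at all, in which case this falls back to the center cut and letters still get split.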

K2pdfopt v1.50 is released. The major new feature is optical character recognition (OCR--English only), but there are several other new features that various users have requested. I've also released the source code. See the web site for more details.

Great news! Thank you very much.
It compiles well on ARM btw (OCR switched off for now).
I tested it on a 25-page document, which took about 10.5 minutes.