I bought "the girl on the dock" PDF and want to convert it to Mobi format.

There are a few issues with that. Every page has an image background for a line at the top and bottom, some pages have a separator graphic as part of the background, and some pages are just an illustration with a caption.

I want to keep the illustrations and lose the rest.

I tried running it through Calibre and it looks like every single line in the PDF is a paragraph. The output has everything double-spaced and broken sentences.

Then there's the page numbers to get rid of and the author's name and book title alternating at the top of every page. (That at the page tops has always annoyed me, even with dead tree books. I'm not likely to forget what book I'm reading or who wrote it while I'm reading it.)

Finally, no table of contents. That should be the easiest thing to do. There's only 5 chapters. Might not even bother with adding one.

Is there an easy way to delete the first and last lines of every page (to remove the page numbers and the author name and book title) then remove all paragraph marks except where there's indents or a line begins with a " mark, which are standalone lines of dialog? Also need to delete all carriage returns except at the ends of each paragraph so the text can flow with different screen or font sizes.
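
For what it's worth, the cleanup described above can be sketched in a few lines of Python. This is only a rough sketch under some loud assumptions: the PDF has already been dumped to plain text with a form feed (`\f`) between pages and leading indents preserved, the first and last non-empty line of every page really are the running head and page number, and a new paragraph always starts with an indent or a quote mark. The function names `strip_header_footer`, `unwrap` and `reflow` are made up for illustration. Illustration-only pages would lose their caption lines too, so those would need handling separately.

```python
import re

# Assumption: a paragraph starts with whitespace (an indent) or a quote mark.
PARA_START = re.compile(r'^(?:\s|["\u201c])')

def strip_header_footer(page):
    """Drop the first and last non-empty lines of one page of text."""
    lines = [ln for ln in page.splitlines() if ln.strip()]
    return lines[1:-1]

def unwrap(lines):
    """Join wrapped lines into paragraphs; only indented or quoted lines start a new one."""
    paragraphs = []
    for line in lines:
        if not paragraphs or PARA_START.match(line):
            paragraphs.append(line.strip())
        else:
            paragraphs[-1] += " " + line.strip()
    return paragraphs

def reflow(text):
    """Assumes pages are separated by form feeds in the extracted text."""
    body = []
    for page in text.split("\f"):
        body.extend(strip_header_footer(page))
    return "\n\n".join(unwrap(body))

sample = (
    "Author Name\n   It was a dark\nnight on the dock.\n\"Who's there?\"\n1\f"
    "Book Title\n   She waited for\nan answer that\n2\f"
    "Author Name\nnever came.\n3"
)
print(reflow(sample))
```

Because the headers and footers are removed before the lines are joined, a paragraph that runs across a page break gets stitched back together as well. A wrapped line that merely happens to begin with a quote mark would be wrongly split off, so the result still needs a read-through.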

The PDF could be a case study in "How to format a PDF in order to make it as difficult as possible to convert to another format." I suppose it'd work decently on a large tablet or reader but not on a 4.3" Android phone screen.

PDF is a destination format, and usually a final destination. It would be hard to imagine a worse source format for conversion. There is no "poof!" The very nature of PDF precludes being "poofed" to another format. There's really nothing to be done except diving in and cleaning things up by hand. Regex can help if you're capable, but any such search & replace would have to be tailored to each individual document. There is no "regex X will do Y."

You can crop the PDF pages to remove the FIRST and LAST lines (if they are positioned in the same location on every page) before loading it into Calibre for conversion. I usually do this for page numbers and headers/footers that I don't want included in the ebook.

Just a word of advice: I usually export the PDF to an HTML file first, then hand-build the EPUB in Sigil before converting it to MOBI. It makes for cleaner code if you know a bit of HTML/CSS.

Acrobat is just about the only software that will allow you to "really" crop PDFs. The scads of other cropping utilities only keep the cropped portions from displaying, meaning all the info is still part of the PDF and comes back to haunt you when you try to convert.

It involves a few steps and the result is far from perfect. The PDF pagination is still there and splits the paragraph at the page break. I could further edit it in Sigil but THIS book is not worth the bother.
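
If anyone does want to chase the page-break splits, here is a rough Python sketch of one heuristic. It assumes the paragraphs have already been pulled out of the converted file into a list of strings, and that a complete paragraph ends in terminal punctuation (optionally followed by a closing quote or bracket). The name `merge_split_paragraphs` is made up for illustration. Chapter headings and captions usually lack end punctuation and would get glued to the next paragraph, so it is not safe to run blind.

```python
import re

# Assumed heuristic: a finished paragraph ends with . ! ? : or an ellipsis,
# optionally followed by closing quotes or a closing parenthesis.
ENDS_SENTENCE = re.compile("[.!?:\u2026][\"'\u201d\u2019)]*$")

def merge_split_paragraphs(paragraphs):
    """Glue a paragraph onto the previous one when the previous one looks unfinished."""
    merged = []
    for para in paragraphs:
        para = para.strip()
        if not para:
            continue  # skip empty paragraphs left behind by the conversion
        if merged and not ENDS_SENTENCE.search(merged[-1]):
            merged[-1] += " " + para
        else:
            merged.append(para)
    return merged

pieces = [
    "She walked to the end of the",
    "dock and looked out.",
    "\"Hello?\" she called.",
    "Nobody answered",
]
for p in merge_split_paragraphs(pieces):
    print(p)
```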