[0916] Introducing pdf2htmlEX: converts PDF to HTML w/o losing fmt

* Completely removed Boost
* Relaxed the C++11 dependency; GCC 4.4.6 and later are supported
* Links are now supported (in-document jumps are accurate to the page)
* Fixed an encoding problem for some fonts

There are basically two types of PDF-to-HTML converters. One is roughly a PDF-to-text converter with a few predefined formats in HTML. The other is a render-everything-as-images converter, which loses all text and generates huge files.

pdf2htmlEX takes the advantages of both, retaining both text and styling.

Features:
1. Extracts and embeds fonts from the PDF
2. Optimized for the web while keeping the rendering precise
3. Non-text objects are rendered as images
4. Single-file output mode -- I know you hate separated font/image files

To compile & install, you need:
* A recent poppler (>= 0.20.3); make sure '--enable-xpdf-headers' is passed to configure
* The latest git version of fontforge (https://github.com/fontforge/fontforge), because I submitted a few features/bugs to it for pdf2htmlEX
* The Boost C++ library (see the project home page for the detailed list of required components)
* cmake
* A GCC that supports C++11
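Before building, the poppler requirement above can be checked with a short script. This is only a sketch: the helper names `parse_version`, `meets_minimum` and `installed_poppler_version` are made up for illustration, and it assumes `pkg-config` is installed and that poppler registers itself under the module name `poppler`.

```python
import subprocess


def parse_version(version):
    """Turn a dotted version string such as '0.20.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum="0.20.3"):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_poppler_version():
    """Ask pkg-config for the poppler version (assumes pkg-config is available)."""
    result = subprocess.run(
        ["pkg-config", "--modversion", "poppler"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    try:
        version = installed_poppler_version()
    except (OSError, subprocess.CalledProcessError):
        print("could not query poppler via pkg-config")
    else:
        status = "OK" if meets_minimum(version) else "too old"
        print("poppler", version, status)
```

Version strings with non-numeric suffixes would need extra handling; the comparison here only covers plain dotted numbers.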

I must admit, this is pretty impressive to me; it could be a good starting point for saner pdf->epub conversion.

I know PDF is a pita to handle and parse, but here are some feature wishes:
a) automatically create working links for any valid URLs and mail addresses
b) try to find the table of contents and link it
c) link objects/images to open in a new window/tab, so I can look at them and read the surrounding text more easily
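To make wish (a) concrete, here is a rough sketch of what such link detection could look like on already-extracted text. It is not how pdf2htmlEX works; the `linkify` function and the `LINK_RE` pattern are hypothetical, and the pattern is deliberately simple.

```python
import html
import re

# Hypothetical pattern: http(s) URLs, or simple e-mail addresses.
LINK_RE = re.compile(
    r"(?P<url>https?://[^\s<>\"]+)"
    r"|(?P<mail>[\w.+-]+@[\w-]+(?:\.[\w-]+)+)"
)


def linkify(text):
    """Return HTML-escaped text with URLs and mail addresses wrapped in <a> tags."""
    pieces = []
    pos = 0
    for match in LINK_RE.finditer(text):
        # Escape the plain text between the previous match and this one.
        pieces.append(html.escape(text[pos:match.start()]))
        target = match.group(0)
        href = target if match.group("url") else "mailto:" + target
        pieces.append(
            '<a href="{}">{}</a>'.format(html.escape(href, quote=True), html.escape(target))
        )
        pos = match.end()
    # Escape whatever is left after the last match.
    pieces.append(html.escape(text[pos:]))
    return "".join(pieces)


if __name__ == "__main__":
    print(linkify("Sources are at https://github.com/fontforge/fontforge for now"))
```

A real implementation would also have to deal with trailing punctuation and with URLs that are split across lines in the PDF, which this sketch ignores.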

Will try it over the weekend and give feedback. Thanks so far, much appreciated.

coolwanglu,
Welcome to the Arch Forums. Very nice application you have put together there.

Generally, we reserve these forums for Arch support. That you are using Ubuntu does not violate our rules as you are not asking for support and are not about to create confusion amongst Arch users.

We do, however, strongly encourage users to use our build system so that Pacman (our package manager) can keep abreast of the system files that are installed. I see you asked if anyone would like to help package this for Arch. We have a subforum for just that purpose; I am wondering if this thread should perhaps be moved to that subforum.

a) What are your thoughts on the move?
b) Why not make the move to Arch?

Thanks for your attention.

So far I've been working on handling all kinds of text/font stuff, and in future versions I'm planning to support other objects (images, links, drawings, etc.) "natively" in HTML, so (a) and (b) are in the plan. I'm not sure about (c), though: what do you expect to see in a new tab after clicking an image?

Hello,
Sorry if I have posted in the wrong subforum. I just saw the description "A place for true innovation. Share your own created utilities with the Arch community." and came in.

I didn't intend to advertise that Ubuntu PPA; I just put up a general description and wanted to broadcast this tool. As one user has already kindly made a package, I'll remove the PPA line and add a link to the package instead. Would this be OK?

While it's functional as it is with the stable fontforge from extra, the HTML rendering is a bit odd in my tests. I plan to check with fontforge-git later.

Thanks for the cool program. Just the other day I was looking for something like that and couldn't find anything functional.

Thank you very much!
I'll put it into the git repo.

I've submitted a few features/bugs to fontforge recently for pdf2htmlEX, so the scripts may not be valid for earlier versions of fontforge. I think the 'odd' rendering you saw was incorrect fonts, as no fonts were actually generated. Please do check with the latest version and see if it works.