PDF to HTML5 conversion – Clipping

One of the things I have missed most in moving PDF content to HTML5 is the clipping functionality of the PDF.

In a PDF file you can set a clip, which can be any irregular shape. Only content inside the clip is drawn; the rest is not (it is that simple). HTML5 has nothing like this, which means we have to emulate it. Otherwise, content that should be invisible (such as crop marks or hidden lines) starts to appear.

This turned out to be quite a complex task. Eliminating anything which is not in the clipped area was easy – the tricky bit is handling items which intersect with the clip (i.e. drawn so that they are only partly visible). Images can be clipped but shapes have to be altered. The hardest items to handle were images.
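As a rough illustration of that three-way split (outside, inside, intersecting), here is a minimal Python sketch. It assumes, purely for simplicity, that both the clip and each item have been reduced to axis-aligned bounding boxes; a real PDF clip can be any shape. The names Rect and classify are made up for this example and are not taken from our converter.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned bounding box (hypothetical helper type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def classify(item: Rect, clip: Rect) -> str:
    """Return 'outside', 'inside' or 'intersects' for an item against a clip."""
    # No overlap at all: the item can simply be dropped.
    if (item.x1 <= clip.x0 or item.x0 >= clip.x1
            or item.y1 <= clip.y0 or item.y0 >= clip.y1):
        return "outside"
    # Entirely within the clip: the item can be drawn unchanged.
    if (item.x0 >= clip.x0 and item.x1 <= clip.x1
            and item.y0 >= clip.y0 and item.y1 <= clip.y1):
        return "inside"
    # Partly visible: this is the case that needs real work.
    return "intersects"


clip = Rect(2, 2, 5, 5)
print(classify(Rect(0, 0, 1, 1), clip))  # crop mark well away from the clip
print(classify(Rect(3, 3, 4, 4), clip))
print(classify(Rect(1, 1, 3, 3), clip))
```

Only the last case needs the item itself to be changed; the first two are cheap to detect, which is why it pays to test for them first.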

In the PDF file format you can have a stroked shape (the outline), a filled shape (the shape coloured in) or both. So you have to work out how the shape interacts with the clip. For example, if the clip is totally inside the shape, we can ignore the shape if it is a stroke (i.e. an outline) but need to fill in the clipped area if it is filled. We had to dig out our old Maths notes on trigonometry to calculate the points where the lines appear and disappear!
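To give a flavour of the sums involved, here is a small Python sketch. It is not our actual code: it assumes a rectangular clip and uses the standard Liang-Barsky line clipping algorithm (rather than our trigonometry) to find where a line enters and leaves the clip. The function names clip_segment and parts_when_clip_inside_shape are hypothetical.

```python
def clip_segment(x0, y0, x1, y1, xmin, ymin, xmax, ymax):
    """Liang-Barsky: return the visible part of a segment, or None.

    The result is ((ax, ay), (bx, by)): the points where the line
    appears and disappears inside the rectangular clip.
    """
    dx, dy = x1 - x0, y1 - y0
    t_enter, t_exit = 0.0, 1.0
    # One (p, q) pair per clip edge: left, right, bottom, top.
    for p, q in ((-dx, x0 - xmin), (dx, xmax - x0),
                 (-dy, y0 - ymin), (dy, ymax - y0)):
        if p == 0:
            if q < 0:
                return None  # parallel to this edge and outside it
            continue
        t = q / p
        if p < 0:
            t_enter = max(t_enter, t)  # line is entering across this edge
        else:
            t_exit = min(t_exit, t)    # line is leaving across this edge
        if t_enter > t_exit:
            return None  # the line misses the clip entirely
    return ((x0 + t_enter * dx, y0 + t_enter * dy),
            (x0 + t_exit * dx, y0 + t_exit * dy))


def parts_when_clip_inside_shape(stroked, filled):
    """The case from the text: the clip sits totally inside the shape."""
    # The outline lies outside the clip, so a stroke contributes nothing;
    # a fill covers the whole clipped area.
    return ["fill clip area"] if filled else []


# A horizontal line crossing a 10 x 10 clip from left to right.
print(clip_segment(-5, 5, 15, 5, 0, 0, 10, 10))
print(parts_when_clip_inside_shape(stroked=True, filled=False))
```

An irregular clip needs the same idea applied edge by edge against the clip's own outline, which is where the extra cases start to pile up.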

I am sure we will find some additional cases which we have not currently covered. So try the latest version and let us know what you think.

Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX.
He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.
