Edit:
My use case: I have an HTML report that I also want to make available as a PDF. The report's structure will change over time, so I don't want to maintain a separate PDF version; ideally the PDF would be generated automatically from the HTML.
Also, because I generate the report HTML myself, I can ensure it is well-formed XHTML to make the PDF conversion easier.

3 Answers

pisa is an html2pdf converter using the ReportLab Toolkit, HTML5lib and pyPdf. It supports HTML 5 and CSS 2.1 (and some of CSS 3). It is written entirely in pure Python, so it is platform independent. The main benefit of this tool is that a user with Web skills like HTML and CSS can generate PDF templates very quickly without learning new technologies. It integrates easily into Python frameworks like CherryPy, KID Templating, TurboGears, Django, Zope, Plone, Google AppEngine (GAE), etc.

I got pisa working on GAE and it works great. However, it doesn't support all CSS properties. For example, I was relying heavily on position, top, left and float, none of which pisa supports: htmltopdf.org/doc/pisa-en.html (take a look at the supported CSS). Other than those restrictions, it's a great library.
– adam, May 29 '11 at 18:48

Have you considered pyPdf? I doubt it has anything like the functional richness you require, but it IS a start, and it is pure Python. The PdfFileWriter class is the one that generates PDF output; unfortunately, it requires PageObject instances and doesn't provide any real way to build them, other than extracting them from existing PDF documents. Unfortunately, all the richer PDF page-generation packages I can find do appear to depend on reportlab or other non-pure-Python libraries :-(.

@Richard, your total misconception about PIL on GAE is very common, let me try once again to clear it up: with GAE in real service you get a microscopic image-manipulation API that's less than 1/100 the PIL functionality; the GAE SDK can emulate that tiny API based on local installs of PIL, that DOESN'T mean you'll get PIL when you run your GAE app on Google's servers. And freetype2 doesn't seem an "optional to run faster C module" to me: how are you going to deal with fonts when freetype2's not around, fast or slow as you may be?!
– Alex Martelli, Oct 22 '09 at 4:27

What you're asking for is a pure Python HTML renderer, which is a big task to say the least ('real' renderers like webkit are the product of thousands of hours of work). As far as I'm aware, there aren't any.

Instead of looking for an HTML to PDF converter, what I'd suggest is building your report in a format that's easily converted to both - for example, you could build it as a DOM (a set of linked objects), and write converters for both HTML and PDF output. This is a much more limited problem than converting HTML to PDF, and hence much easier to implement.
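
As a rough illustration of that idea, here is a minimal pure-Python sketch, not a definitive implementation. All of the names (`Report`, `Heading`, `Paragraph`, `to_html`, `to_pdf`) are hypothetical, and the PDF writer is deliberately tiny: a single page, the built-in Helvetica font, one line of text per block, with no wrapping, pagination or styling. It only shows that one document model can feed two small converters.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List


@dataclass
class Heading:
    text: str


@dataclass
class Paragraph:
    text: str


@dataclass
class Report:
    """Hypothetical report model: a title plus an ordered list of blocks."""
    title: str
    blocks: List[object] = field(default_factory=list)


def to_html(report):
    """Render the report as a simple HTML page."""
    parts = ["<html><head><title>%s</title></head><body>" % escape(report.title)]
    for block in report.blocks:
        tag = "h1" if isinstance(block, Heading) else "p"
        parts.append("<%s>%s</%s>" % (tag, escape(block.text), tag))
    parts.append("</body></html>")
    return "\n".join(parts)


def _pdf_text(text):
    # Escape the characters that are special inside a PDF literal string.
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def to_pdf(report):
    """Render the report as a one-page PDF and return it as bytes."""
    # Content stream: one line of text per block, positioned top to bottom.
    ops = ["BT"]
    y = 770
    for block in report.blocks:
        size = 18 if isinstance(block, Heading) else 11
        ops.append("/F1 %d Tf 1 0 0 1 50 %d Tm (%s) Tj"
                   % (size, y, _pdf_text(block.text)))
        y -= size + 10
    ops.append("ET")
    # Assumption: Latin-1 text only; anything else is replaced with "?".
    stream = "\n".join(ops).encode("latin-1", "replace")

    # Objects 1..5: catalog, page tree, page, font, content stream.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 595 842] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object, for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_at = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_at
    return bytes(out)
```

Usage would look something like `report = Report("Sales", [Heading("Q1"), Paragraph("Revenue was flat.")])`, then `to_html(report)` for the web page and `open("report.pdf", "wb").write(to_pdf(report))` for the PDF. Tables, images and multi-page layout would each need their own block types and more work in `to_pdf`, but the problem stays far smaller than rendering arbitrary HTML.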