michellem has asked for the wisdom of the Perl Monks concerning the following question:

Hi Folks,

I'm working on a project which requires nice formatted reports generated from database output. One possible way to go is to generate PDF files, and I've done a fair bit of research and beginning code testing to figure out the best way to go. I need a bit of advice on one particular thing.

One module that I like the best (or at least I like the idea best) is called PDF::Template. It uses an XML template to separate the formatting from the data. I like this approach a lot. The one fatal flaw, however (for me, not necessarily for others), is that it requires the totally non-free software PDFLib. (Apparently you used to be able to download it for free, but that is no longer the case: the free version puts a demo watermark on every PDF it generates. In any event, I'm a totally open source gal, so any software I use in my applications has to be open source.) However, PDF::Template looks fairly straightforward, and I was thinking of writing a version that uses PDF::API2, which looks like a really nice PDF creation module.

So my question is: is it worth doing this? Has anyone done something similar that I couldn't find? Based on my research, I'm pretty clear that PDF generation via PDF::API2 is the way I want to go (as opposed to using the older PDF::Create, or a combination of html2ps, etc.), but I'm wondering whether it's worth going whole hog and recreating this module (or a similar idea) or just writing my own little thing.

I had the same problem some time ago and decided to take a slightly different approach: I generate reports in HTML, which is a trivial task, and then use htmldoc to convert them to PDF on the fly.
It supports full HTML 3.2 input (no CSS, though), it's extremely fast, and the PDF output is rendered better than under my old Netscape browsers :-)
The source code is distributed under GPL2.
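
A minimal sketch of this approach, assuming htmldoc is installed and on the PATH. The sub names (escape_html, rows_to_html, html_to_pdf), the file names, and the sample data are hypothetical. The --webpage and -f switches are htmldoc command-line options, but check them against your installed version.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Build a plain HTML 3.2 table (no CSS) from a header row and data rows.
sub rows_to_html {
    my ($title, $header, $rows) = @_;
    my $html = "<html><head><title>" . escape_html($title) . "</title></head><body>\n";
    $html .= "<h1>" . escape_html($title) . "</h1>\n<table border=\"1\">\n";
    $html .= "<tr>" . join('', map { "<th>" . escape_html($_) . "</th>" } @$header) . "</tr>\n";
    for my $row (@$rows) {
        $html .= "<tr>" . join('', map { "<td>" . escape_html($_) . "</td>" } @$row) . "</tr>\n";
    }
    $html .= "</table>\n</body></html>\n";
    return $html;
}

# Write the HTML to a file and hand it to htmldoc (assumed to be on PATH).
sub html_to_pdf {
    my ($html, $html_file, $pdf_file) = @_;
    open my $fh, '>', $html_file or die "Cannot write $html_file: $!";
    print {$fh} $html;
    close $fh or die "Cannot close $html_file: $!";
    # The list form of system() avoids shell quoting problems.
    system('htmldoc', '--webpage', '-f', $pdf_file, $html_file) == 0
        or die "htmldoc failed: $?";
    return $pdf_file;
}

my $html = rows_to_html(
    'Quarterly Sales',
    [ 'Region', 'Total' ],
    [ [ 'North', 1200 ], [ 'South & West', 950 ] ],
);
# Uncomment on a machine where htmldoc is installed:
# html_to_pdf($html, 'report.html', 'report.pdf');
```

In a real report the rows would come from a DBI query rather than a literal list; the HTML generation stays the same.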

++ to your question. I had almost exactly the same issue to solve last year, and after researching the options I came to the same conclusions. I went back to my customers and asked them to purchase PDFlib in order to proceed with the project. They declined, so the project died an early death.

First off, why PDF? Because only PDF can be reliably displayed AND PRINTED across multiple operating systems, browsers and hardware. It would be nice if there were another way...

The ideas and architecture behind PDFlib and PDF::Template are key: you can create a template PDF that is filled in at run time by the Perl program.
The PDFlib software provided a number of important services:

* A template editor plug-in for Adobe Acrobat. This allowed templates to be created that overlaid existing PDFs. This was important because any program could be used to lay out the form and then "print" it to PDF, from which the template could then be constructed. The template editor was an easy enough GUI to be used by anyone: anyone could make and maintain the templates.

* A C-based Perl module that allowed templates to be populated by the Perl program. The API was very nicely designed, very simple and robust.

Is it worth it to roll your own? I would answer yes! It would be great to have similar or identical functionality, so that one could fill in a template and get an output file that prints identically on any platform. I would contribute time to such an open source project.

As the (hidden) maintainer of PDF::Template, I feel it necessary to say that we are working towards using free solutions. We're also looking at addressing things that PDFLib doesn't do (such as Acrobat forms). Unfortunately, I have something like -4 hours /week to work on it. *winces* If you want to help, please email me at rkinyon@columbus.rr.com and I can definitely put you to work.

------We are the carpenters and bricklayers of the Information Age.

The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.

Here is a different approach: instead of creating a PDF report directly, you can create your report in an XML format first, and then use an XSL stylesheet and Apache FOP to translate the XML file to a PDF file. The advantage is that you can easily provide your report in a different format (e.g. HTML) when it's needed, just by supplying a different stylesheet.
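
A rough sketch of the first half of that pipeline, using only core Perl. The sub names, the element names (report, row, region, total), and the file names are hypothetical, and the field names are assumed to be valid XML element names. The stylesheet that turns this XML into XSL-FO is not shown.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub escape_xml {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Serialize rows as <report><row><field>value</field>...</row></report>.
sub rows_to_xml {
    my ($title, $fields, $rows) = @_;
    my $xml = qq{<?xml version="1.0" encoding="UTF-8"?>\n};
    $xml .= '<report title="' . escape_xml($title) . qq{">\n};
    for my $row (@$rows) {
        $xml .= "  <row>\n";
        for my $i (0 .. $#$fields) {
            my $name = $fields->[$i];
            $xml .= "    <$name>" . escape_xml($row->[$i]) . "</$name>\n";
        }
        $xml .= "  </row>\n";
    }
    $xml .= "</report>\n";
    return $xml;
}

my $xml = rows_to_xml(
    'Quarterly Sales',
    [ 'region', 'total' ],
    [ [ 'North', 1200 ], [ 'South & West', 950 ] ],
);
print $xml;
```

Saved as report.xml, this could then be rendered with something like `fop -xml report.xml -xsl report2fo.xsl -pdf report.pdf`, where report2fo.xsl is a stylesheet you would have to write yourself. Swapping in a different stylesheet would give HTML from the same XML.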

What is proposed with PDF results in a fully paginated, laid-out file with embedded graphics, all nicely wrapped up into a single template file.

Correction
As mentioned below, FOP uses XML tags to more or less completely specify layout, pagination, and so on. Thus you could have XML-encoded data and, using an XSLT transformation, end up with an XSL-FO file that is nicely laid out. These files are specifically designed to be rendered into PDF for presentation.

I'm not aware how mature the tools are for generating an XSL-FO file (especially GUI-driven visual tools). I'm still in favor of PDF templates because at this point PDF print drivers allow virtually any program to output PDF templates.

I use HTMLDOC. It's a good solution for simple HTML-to-PDF conversion. I'd also suggest the HTML::HTMLDoc module, which implements an object-oriented Perl interface to this program.

But for building a commercial system that has to generate documents in different formats, I'd suggest looking at FOP. I've only just read up on this tool, and I hope to use it soon in my own project, a document circulation system.

You should be able to substitute PDFlib Lite for PDFLib. The summary of its license agreement is here; you can redistribute it, you can modify it, but with certain caveats. You might be able to live with it; I have been, so far.

PDF::API2 has a wonderful feature set, and is being developed at breakneck pace, but I find its documentation a bit bewildering. It has a decent user community, though, through which you will find many answers.

I'm one of those PDF::API2 users also. I looked high and low for a good way to get my feet wet in PDF generation, and ended up there. It's served my changing purposes quite well, and while I've met with the occasional issue here or there, and the same confuddling documentation, the author is updating it probably weekly, and the message boards are active enough.

I agree, open source does not necessarily mean free, and I was oversimplifying in talking about what I meant by free. I am interested *both* in free as in beer and in freely modifiable, redistributable, etc. - something along the lines of a GPL/BSD license.

I don't know what you are talking about when you say "read the website". There is not a whole lot on the website suggesting that the software is written in an open source model. The license can't possibly pass muster as an open source license. Agreed, PDFlib lite seems to be released as open source, and does fit open source guidelines - but that doesn't mean that PDFlib in general is developed with an open source model.

I have written a module that produces PDF reports using only Perl (no libs). About 4 weeks ago, I was tasked with automating a research report that pulled information from Sybase and MSSQL databases. A requirement was that the report had to look the same as the existing report, which was being produced by cutting and pasting data in Excel and MS Word. The data had to be scrubbed by hand, and the customer wanted those rules automated as well.

None of the Perl PDF modules did everything I needed, and I didn't want to hack the existing modules, so I sat down and read the first 500 pages of the Adobe PDF Reference Manual and wrote my own special implementation (I knew PostScript already, so it wasn't so scary). I took a different approach than the existing modules and decided to develop a pdf object with methods for each of the things I needed to do. So I have built a graph method (it draws scalable stacked bar charts with a legend: font, size, color, linewidth, autoscale, etc.), a table method (header, footer, rows, cols, font, fontsize, color, padx, pady, linewidth, linepattern, alignment), an image method (file, height, width, x, y; only does JPEGs currently), a text method (x, y, align, font, size, color), and methods to draw ...

I have about 75% of the module finished. I only need to add the methods to produce an index automatically from the content, and a rule set to split the content stream across pages. (This is one area that led me to write my own module in the first place. PDF doesn't have any mechanisms (or restrictions!) governing how to wrap text or when to break a page. As a result, you have to keep track of the content height yourself or it just runs off the page.)
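
To illustrate the bookkeeping involved, here is a minimal sketch of tracking the content height by hand. It is not the module described above; the paginate sub and its parameter names are hypothetical, and it handles only fixed-height lines of text, with no wrapping.

```perl
use strict;
use warnings;

# Assign lines of text to pages by tracking the vertical cursor ourselves.
# Units are PDF points (1/72 inch). The PDF origin is the bottom-left
# corner, so the cursor starts near the top of the page and moves down.
sub paginate {
    my (%arg) = @_;
    my $page_height = $arg{page_height};   # e.g. 792 for US Letter
    my $margin      = $arg{margin};        # top and bottom margin
    my $leading     = $arg{leading};       # distance between baselines
    my @pages = ( [] );
    my $y = $page_height - $margin;        # cursor: top of the next line

    for my $line (@{ $arg{lines} }) {
        # Not enough room left above the bottom margin: start a new page.
        if ($y - $leading < $margin) {
            push @pages, [];
            $y = $page_height - $margin;
        }
        $y -= $leading;                    # move down to this line's baseline
        push @{ $pages[-1] }, { text => $line, y => $y };
    }
    return \@pages;
}

my $pages = paginate(
    page_height => 792,
    margin      => 72,
    leading     => 14,
    lines       => [ map { "Row $_" } 1 .. 100 ],
);
printf "%d pages, %d lines on the first page\n",
    scalar @$pages, scalar @{ $pages->[0] };
```

Each entry carries the y coordinate at which the line would be drawn, so a later pass can emit the text operators page by page. Tables, images, and wrapped paragraphs need the same check with their own heights in place of the fixed leading.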

Three weeks ago, my customer decided that he would rather have the project completed in Java because they were worried about the long term maintenance of the report ... (sigh) and so, I had to stop working on the Perl implementation and start developing the report using iText.

I am going to complete this module in my spare time on the weekends just to show the customer just how powerful Perl is.

I plan on implementing a simple write method that mimics Perl's format, so that users don't have to learn anything new to send their output directly to PDFs.

I have run into a few technical difficulties concerning font encoding, hyphenation, auto-pagination, multi-column control and layout, and a few other items (i.e., I need some help!). Anyway, if you would like to know more or see a sample report, or if you want to help me, send me an e-mail: jmoosmann@earthlink.net

I definitely vote for the option of generating PDF through another means. I've done autogeneration of reports based on data, some pulled from flat files, some pulled from database backends, and it always seems easiest to go through some language that affords a certain level of redundant markup. As such, the XML and HTML options are quite helpful for markup, but formatting is a different issue.

Thus, I have to say I quite like using TeX for these types of things, because it gives you markup when you want it, it's language-based so it's easy to generate from within scripts, and it will give consistent formatting if that is necessary (generating forms from DBs, etc.). It might take a little bit more work, since you need to be familiar with TeX, and it might take a few more CPU cycles, but it pays off by offering quite a bit more control over the layout.
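
A small sketch of what generating TeX from a script can look like, here using LaTeX and core Perl only. The sub names and sample data are hypothetical, and the document skeleton is the bare minimum rather than a recommended layout.

```perl
use strict;
use warnings;

# Characters that LaTeX treats specially, mapped to safe replacements.
my %TEX_ESCAPE = (
    '\\' => '\\textbackslash{}',
    '{'  => '\\{',
    '}'  => '\\}',
    '$'  => '\\$',
    '&'  => '\\&',
    '#'  => '\\#',
    '%'  => '\\%',
    '_'  => '\\_',
    '^'  => '\\textasciicircum{}',
    '~'  => '\\textasciitilde{}',
);

# Escape database values in a single pass so replacements are not re-escaped.
sub escape_tex {
    my ($s) = @_;
    $s =~ s/([\\{}\$&#%_^~])/$TEX_ESCAPE{$1}/g;
    return $s;
}

# Turn a header row and data rows into a complete LaTeX document.
sub rows_to_latex {
    my ($header, $rows) = @_;
    my $cols = 'l' x scalar @$header;      # one left-aligned column per field
    my @out = (
        '\documentclass{article}',
        '\begin{document}',
        "\\begin{tabular}{$cols}",
        '\hline',
        join(' & ', map { escape_tex($_) } @$header) . ' \\\\',
        '\hline',
    );
    for my $row (@$rows) {
        push @out, join(' & ', map { escape_tex($_) } @$row) . ' \\\\';
    }
    push @out, '\hline', '\end{tabular}', '\end{document}';
    return join("\n", @out) . "\n";
}

my $tex = rows_to_latex(
    [ 'Department', 'Headcount' ],
    [ [ 'R&D', 42 ], [ 'Sales_EU', 17 ] ],
);
print $tex;
```

Writing the result to report.tex and running pdflatex on it (assuming a TeX distribution is installed) would produce the PDF. Escaping matters here because values such as "R&D" would otherwise be read as column separators.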
