ReportLab

As I mentioned before, after writing a Python script to read in Kada’s data files on the rifle club shooters’ scores and calculate new ladders, the next step is output that’s a bit fancier than the straight ASCII text dump:

This does the basic job that the original system did. Actually, it does a bit more: the asterisks mark out those shooters who haven’t yet shot enough cards to get on the ladder, but they’re still listed as an incentive for them to shoot more cards, which the current system doesn’t do. It’s not really doing all that can be done, however, and it’s certainly not all that fancy-looking, especially for a scripting language, where the whole point is to do fancy stuff quickly through toolkits. So, what I wanted was PDF output (because these are printed off and posted up in the club), graphics like logos and so on, but also some graphs and charts with some meaningful data.

One of the graphs I wanted to include was one of Edward Tufte’s many good ideas, sparklines: small graphs which summarise the state of play of a variable in an easy-to-read, inline format (meaning that it sits in the flow of the text itself, as if English had suddenly become a pictographic language for a moment). They seemed perfect to show a high-level view of how the shooters were doing over the course of the year. Also, it would be nice to display the various breakdowns and analysis of membership (by gender, experience, college year, etc.) in a graphical form; there’s nothing wrong with the raw data, but it’s almost always easier and faster to take in analysis that has a graphical expression. So pie charts and such would be an improvement.

Right, so I need a PDF generation library and a graphing library, and after some casting about and looking up different options, I settled on ReportLab to generate the PDF because it could do both high and low level generation (more on that in a moment) and because it has some well-written documentation. For the graphs and charts, Matplotlib seemed to have the best combination of documentation and features (though there are others with wide followings). I’ll write a bit more about Matplotlib in a future post and just focus on ReportLab for now.

PDF generation turns out to be conceptually very easy with ReportLab, both at high and low levels. At the low level, the page is a canvas and you simply draw lines and place text on it using the pdfgen section of the library. I’d guess it would be very useful for small jobs, or for fine detail, but for this project pdfgen is a bit too fine-grained. ReportLab’s high level document creation model, Platypus (Page Layout and Typography Using Scripts), is more like it: it’s based on the TeX model of creating documents (i.e. specify the content, let the software do the layout properly for you).

In this case, I want two documents, one for the actual ladders and one for the analysis and breakdowns of membership numbers. Platypus requires that you create an actual document object (in this case I use their base template, but custom templates are possible too), and each document will require you to create and populate a list of elements within that document, as well as any layout information, before generating the document itself. So to start with, two lists, two document objects, and setting of some basic layout stuff:
[cc escaped="true" lang="python"]
def prepareReport():
    LadderDocElements = []
    ReportDocElements = []
    # ...
[/cc]

There is some additional setup to arrange the layout of frames on the page that the elements will plug into, and it looks a little complex because I’ve got two different layouts (one and two column), but to be honest the lack of readability is more down to formatting issues than code complexity:
[cc escaped="true" lang="python"]
interFrameMargin = 0.5*inch
frameWidth = ladderDoc.width/2 - interFrameMargin/2
frameHeight = ladderDoc.height - inch*0.6
framePadding = 0*inch

[/cc]
So we have frames and soloframe, the former a two-column layout with a titlebar, the latter a single frame taking up the whole of the page between the margins. Later on in the file (which is now in dire need of refactoring for readability’s sake, which is the next thing on the project to-do list), those frame layouts are fed into the document objects and the document generated.

There’s also a helper function or two defined within prepareReport() to manage simple tasks like header text and preformatted chunks. These are mainly left over from swiped demo code, but they do show rather well how text gets added into the document.
[cc escaped="true" lang="python"]
def header(Elements, txt, style=HeaderStyle, klass=platypus.Paragraph, sep=0.3):
    # Add a vertical gap, then the heading text as a centred, bold paragraph.
    s = platypus.Spacer(0.2*inch, sep*inch)
    Elements.append(s)
    style.alignment = 1  # 1 = centred
    style.fontName = 'Helvetica-Bold'
    style.fontSize = 18
    para = klass(txt, style)
    Elements.append(para)

# ...
    return data
[/cc]
At the end there is the first instance of graphing. Rather than trying to hook Matplotlib into ReportLab tightly, the easier option is to simply generate an individual PNG image file for each sparkline (the same method is used for all the other graphics as well) and to save them all in a temporary directory, which can be either deleted or re-used after the PDF is generated. It’s not too hard to think of advantages to this: you could upload the images to the club webserver and have both PDF and webpage formats for reports and ladders and so on. As I said, I’ll go into this in more detail in a later post.
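The PNG-per-sparkline approach might look something like the following sketch. This is not the project’s code: the function name `sparkline`, the figure dimensions, the file name and the scores are all invented for illustration, and the non-interactive Agg backend is assumed so that no display is needed.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt


def sparkline(scores, path, width=1.5, height=0.25):
    """Save a tiny axis-free line chart of scores as a PNG file.

    Hypothetical helper; width and height are in inches.
    """
    fig, ax = plt.subplots(figsize=(width, height))
    ax.plot(range(len(scores)), scores, linewidth=0.8, color="black")
    # Mark the most recent score with a red dot.
    ax.plot(len(scores) - 1, scores[-1], "r.", markersize=4)
    ax.margins(0.05)
    ax.axis("off")  # no ticks, labels or border: just the line
    fig.savefig(path, dpi=150)
    plt.close(fig)  # free the figure so many sparklines can be made in a loop
    return path


# Made-up sample scores, written to a scratch directory (assumed location).
image_dir = tempfile.mkdtemp()
png_path = sparkline([88, 91, 90, 94, 93, 96],
                     os.path.join(image_dir, "shooter_a.png"))
```

A file saved this way can then be dropped into the element list as a `platypus.Image` flowable, or uploaded as-is for a web page.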

I’m not really overly happy with the code as yet though. It’s rambling in places, downright ugly in others, and I really want to do a job refactoring it before putting it up anywhere. This whole series of blog posts isn’t really supposed to be saying “look at this incredible Python coding”, it’s meant to be saying “look how fast you can do actual, useful work”. In this case the actual need comes from a sports club to be sure, but it’s an actual need nonetheless. And the fact that you can pick up Python with no prior experience and put something like this together inside of about 20 hours of playing with it for the first time is a rather excellent recommendation really. I’m definitely going to be keeping Python in the toolbox for future projects; in fact I’m already playing with using PyQt for the Range Officer Report part of my RCMS project. More on that later as well, but I’m already finding it’s a nifty little tool for doing GUIs quickly. Whether or not it’ll be fast enough on the new toy to be usable remains to be seen, of course.

7 comments

Do you have an example of the output as it stands at the moment? I’ve been following this little series with interest and I’m keeping note of it all as I may be faced with a similar enough problem in the next 12 – 18 months at a club I’m involved with myself (not air rifles though).

The problem with LaTeX and pdflatex (or even just dvips followed by ps2pdf) is that it’s a whole additional toolchain to be installed on the club computer. Reportlab’s just another library so I just have to worry about the python install and that’s it.