Progression 0.3: Bar Charts and Normalisation

The CHP posts are currently on a bit of a hiatus while I write up a couple of papers on CHP and prepare for going to the SIGCSE conference as part of my day job. In the meantime, it seems that a few people have picked up and begun using my Progression Haskell benchmarking library. The problem with users is that they want new features. So, spurred on by a couple of emails, I’ve created a new release of the library: version 0.3.

The changes in version 0.3 are all centred around the graphing (which in turn caused changes to the config and command-line options). John Millikin asked if I could support having versions as the groups on the X axis, and benchmarks as the lines — the opposite to the current arrangement. He made the point that with few benchmarks and many versions this would provide a better visualisation. I’ve exposed the function that maps raw data into graphable data in the configuration, and have also added default (exposed) implementations for Progression’s original transformation and the one that John wanted. Furthermore, I added a function that supports normalisation of the data (scaling by the corresponding benchmark times for the latest recorded version — perhaps I should make this configurable, too?). Normalisation should help if the benchmarks take noticeably different amounts of time — it is used by default, but configurable on the command-line.
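To make the two arrangements concrete, here is a minimal sketch of swapping which label forms the groups and which forms the series within each group. This is not Progression's actual code: the names Label, Grouped and swapGrouping are invented for illustration, and the real configuration function may well use different types.

```haskell
import qualified Data.Map as Map

-- Hypothetical types for illustration only; not Progression's API.
type Label = String

-- Outer key: the group shown on the X axis.
-- Inner key: the series (line or bar) within each group.
type Grouped = Map.Map Label (Map.Map Label Double)

-- Swap the roles of the outer and inner keys, so that data grouped by
-- benchmark (with versions inside) becomes grouped by version (with
-- benchmarks inside), and vice versa.
swapGrouping :: Grouped -> Grouped
swapGrouping g = Map.fromListWith Map.union
  [ (inner, Map.singleton outer v)
  | (outer, m) <- Map.toList g
  , (inner, v) <- Map.toList m
  ]
```

Applying swapGrouping twice gives back the original data, provided no group is empty, so in this sketch either arrangement can be derived from the other.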

Sebastian Fischer also emailed me to offer his opinion that my line graphs were inferior to bar charts, both in theory and in terms of visibility. He’s right, and I have now corrected this by offering a choice between line graphs and bar graphs — I’ve actually made bar graphs the new default. I also made Progression print out its gnuplot script when it draws a graph — that way, if you know gnuplot and want to tweak something further manually, you can start with that script and work from there.

All of the above is configurable through Progression’s Config data type, and via command-line options. Here is a demonstration of a normalised bar graph, with benchmarks as the major groups and versions as sub-groupings (i.e. a graph that uses the new defaults):

Currently, I take all the values for a particular benchmark, say, simpleChannel. I find the time for simpleChannel under the version rewrite-strike-single and divide all the times for simpleChannel by that time. I then repeat the process independently for SeqCommsTime, dividing all of its times by its own time under rewrite-strike-single, and so on for each benchmark. (In the above graph you can see that the times for rewrite-strike-single all come out as 1.0.) I think perhaps I had a slight mis-wording in the blog post; I’ll try to clarify it.
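The per-benchmark scaling described above can be sketched roughly as follows. This is an illustration under assumptions rather than the library's implementation: the types, the name normaliseBy, the version name "older" and all of the timings are made up for the example.

```haskell
import qualified Data.Map as Map

-- Hypothetical types for illustration only; not Progression's API.
type Benchmark = String
type Version   = String
type Timings   = Map.Map Benchmark (Map.Map Version Double)

-- For each benchmark independently, divide every recorded time by that
-- benchmark's time under the baseline version. A benchmark with no
-- positive baseline time is left unchanged (an assumption of this sketch).
normaliseBy :: Version -> Timings -> Timings
normaliseBy baseline = Map.map scale
  where
    scale times = case Map.lookup baseline times of
      Just t | t > 0 -> Map.map (/ t) times
      _              -> times

-- Made-up timings, purely to exercise the function.
example :: Timings
example = Map.fromList
  [ ("simpleChannel", Map.fromList [("rewrite-strike-single", 2.0), ("older", 4.0)])
  , ("SeqCommsTime",  Map.fromList [("rewrite-strike-single", 8.0), ("older", 6.0)])
  ]
```

With these invented numbers, normaliseBy "rewrite-strike-single" example maps the baseline version to 1.0 for both benchmarks, while "older" becomes 2.0 for simpleChannel and 0.75 for SeqCommsTime. Each benchmark is scaled by its own baseline, so benchmarks with very different absolute times end up on a comparable axis.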