Monday, 23 May 2011

One of the side effects of having the word 'visualisation' in your job title is that people expect you to know something about the subject. Accordingly, every now and again I get contacted by one of my colleagues who needs to draw a graph, or reformat an image, or incorporate a visualisation into a document, and is - usually - too busy with other (probably more important) activities to spend a lot of time figuring out how to make a graphics package behave itself and do the right thing.

I received the latest of these queries last week, from a colleague who wanted to reproduce Figure 1 (below) as part of the documentation for the NAG Parallel Library. It's a schematic illustration of the way in which, for certain routines in the Library, elements of a matrix are distributed between compute processors. More specifically, this example shows the distribution of a two-dimensional matrix consisting of twelve rows and columns (each numbered from 1 to 12) across a two by three grid of processors.

The request was to reproduce the figure using gnuplot, a plotting program that can be used to create a wide variety of two- and three-dimensional plots of data, functions and approximations to data. It runs on a variety of platforms, and is freely distributed. Gnuplot's interface is command-driven - i.e. the user either enters commands one at a time at its interactive prompt, or puts the commands into a script, which the program then loads and runs. Several example scripts (and their associated output) are available at the gnuplot site - some of them, for example, illustrate the range of the program's functionality for plotting analytical functions. We use the program a lot for the display of the results from the example programs which form an important part of NAG Library documentation; thus, Figure 2 is a plot of the output from the c06ab routine, which is used in the numerical summation of series.
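As a small illustration of the command-driven style (a sketch only, not one of the scripts mentioned above; the file name demo.gp is hypothetical), the same lines can either be typed at the prompt or saved to a file and loaded:

```gnuplot
# Typed one at a time at the gnuplot> prompt, or saved in a file
# (hypothetically demo.gp) and run from the prompt with:  load 'demo.gp'
set xlabel "x"
set ylabel "y"
set xrange [-10:10]
plot sin(x) title "sin(x)"
```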

Whilst producing this sort of picture in gnuplot is comparatively straightforward, and can be done with a handful of commands, the diagram in Figure 1 presents a few more challenges. However, the elementary components aren't hard to find: thus, gnuplot has a set object rectangle command which creates a rectangle, given the coordinates of its diagonal corners, a set arrow command which creates an arrow (or a line) between two endpoints, and a set label command which adds text at a specified location. So I started writing the script in this fashion:
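(The original listing is not reproduced here. As a rough sketch of the hard-wired style, with every tag, coordinate and label invented purely for illustration, the opening lines might have looked something like this:)

```gnuplot
# Hard-wired style: every value below is an invented placeholder,
# not taken from the original script.
set object 1 rectangle from 1,1 to 2,2        # first box, by diagonal corners
set object 2 rectangle from 2.5,1 to 3.5,2    # second box, shifted by hand
set arrow 1 from 0.5,0.5 to 4,0.5 nohead      # 'nohead' turns the arrow into a line
set label 1 "1" at 1.5,1.5 center             # text centred in the first box
set label 2 "2" at 3,1.5 center               # text centred in the second box
```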

but I hadn't got very far before realising that hardwiring all of the coordinate values wasn't a good idea if - say - the size of the boxes, or the spacing between them, needed changing. And I was getting tired of keeping track of the running number of objects, and working out in my head the locations of the boundaries. Fortunately, I recalled that a computer is good at this sort of thing (the clue is in the name, after all). I also discovered that gnuplot allows user-defined variables to be declared, and supports mathematical operations on them. This made writing the script a lot easier, but there's a large degree of repetition in the code which - although straightforward to produce by cutting and pasting - looks inelegant and verbose. In a more fully-featured programming environment, this drawback would be overcome by using control statements such as loops or branches, or by using subprograms. However, these features aren't supported by gnuplot (or if they are, I couldn't find them - and the call of more important activities was becoming too strident to ignore), so we're left with a lengthy script, which is reproduced below in all its resplendence for the curious.
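To make the idea concrete, here is a minimal sketch (not the original script) of positions computed from user-defined variables. The names box, gap and n, and their values, are assumptions chosen for illustration. The last part uses a do for block, which, as far as I can tell, arrived in gnuplot 4.6, a release that postdates this post; on earlier versions only the unrolled form applies.

```gnuplot
# Hypothetical names and values, chosen only for illustration.
box = 1.0     # side length of each box
gap = 0.25    # spacing between neighbouring boxes
n   = 4       # number of boxes in the row

# Unrolled style: each corner is computed from the variables rather than
# typed in by hand, but every box still needs its own line.
set object 1 rectangle from 0*(box+gap),0 to 0*(box+gap)+box,box
set object 2 rectangle from 1*(box+gap),0 to 1*(box+gap)+box,box

# Loop style (assumes gnuplot 4.6 or later): one block creates all n boxes
# and their labels; objects 1 and 2 are simply redefined with the same values.
do for [i=1:n] {
    x0 = (i-1)*(box+gap)
    set object i rectangle from x0,0 to x0+box,box
    set label i sprintf("%d", i) at x0+box/2,box/2 center
}
# Nothing is drawn yet: the objects and labels appear only once a plot
# command is issued.
```

Changing box or gap then moves every box and label consistently, which is the point of the exercise.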

Buried within it (almost at the end) is one technical point that might be of wider interest: because gnuplot is, when all's said and done, a plotting package, each script must contain at least one plot command, otherwise nothing will appear. More specifically in our example, we've used the set command to create items to be displayed by specifying values for its object, arrow and label options, but these won't be drawn on the graph until the plot command is issued. This has a compulsory argument, but having specified everything to be displayed, we don't have anything else left to be explicitly plotted. Other users who have used the command in similar circumstances have used the expression 1/0 for this argument; gnuplot evaluates this to NaN, which is interpreted as an undefined point, and is quietly ignored by the plot command when it's displaying the other objects in the figure.
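A minimal sketch of the trick, with all coordinates and the label text invented for illustration, might look like this. Both axis ranges are set explicitly, on the assumption that gnuplot cannot autoscale from a plot that contains no valid points.

```gnuplot
# Everything to be shown is created with set commands
# (all values are invented placeholders).
set object 1 rectangle from 1,1 to 3,2
set arrow 1 from 0.5,0.5 to 3.5,0.5 nohead
set label 1 "A" at 2,1.5 center

# Fix both ranges by hand: the dummy plot below supplies no valid points
# from which the ranges could be autoscaled.
set xrange [0:4]
set yrange [0:3]

# A schematic has no use for axes, tics or a key (an illustrative choice).
unset border
unset tics
unset key

# The dummy plot: 1/0 is undefined everywhere, so no curve is drawn,
# but the rectangle, line and label defined above are rendered.
plot 1/0
```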

5 comments:

GNUPlot is a great package and about half of my PhD thesis plots were generated using it.

For that particular plot, however, I would have probably used Mathematica which I think would have been able to do the same thing with MUCH less code. Of course Mathematica is rather more expensive than GNUPlot :)

Thanks, Mick - I too was surprised that gnuplot commands can be assembled into something that "almost" looks like a program. It's probably asking for too much (and would be getting too far away from its original design) to ask for the extra features - loops, branches, subprograms - that would make it more like a "real" programming language, but it's interesting to see how far you can get (e.g. by unrolling loops) just employing the features it has.

I think the spec asked for gnuplot because it's used for most (all?) of the other figures in our documentation, and because it's reasonably straightforward to produce images in a variety of formats (e.g. PNG, as well as PostScript) from it. Plus, of course, there's a certain amount of gnuplot experience and expertise (to which, perhaps, this article is a tiny contribution) lying around here.

Many thanks for the helpful comment, Mike - my apologies for the delayed response. Yes, Mathematica (or MATLAB, or Java, or C, or Python, or any similarly-endowed programming environment) would indeed facilitate a much more compact representation of the same program, but the spec called for gnuplot - mostly, I think, for reasons of history and (as you note) economy. Whether or not the latter proves to be false is another question, of course.

Thanks for sharing your gnuplot experience! Would you like to write about a LaTeX graphics and gnuplot contest in your blog, to give your readers a chance to win a gnuplot cookbook? It's here: http://latex-community.org/component/content/article/92-contests/431-gnuplot-book
