I use quite a number of boxplots in my writing, and have chosen pgfplots as my plotting solution for a number of reasons (one of which is the benefit of having the data to build the plot in the tex file itself).

Boxplot support in pgfplots has improved a lot since version 1.8, when I first started using it. However, I normally use R to analyse my data, and this strikes me as a relatively common setup. So I was wondering how other people get their boxplots from R into pgfplots, to see if we can arrive at a common best approach.

Is pgfplots a must? You can also build the plot from within the tex file itself using R and Sweave, without pgfplots.
– Fran, Oct 6 '13 at 15:34

Definitely not a must, but I prefer to keep a unified visual aesthetic for my plots throughout my document, and I've found I can achieve this more easily by using the same tool for everything. That said, I've never actually got Sweave to work, so that might have factored into this. An example would be a welcome addition! :P
– jja, Oct 6 '13 at 15:37

2 Answers
This is what I've started using recently, now that I more or less understand how to use the new boxplot interface of pgfplots. I know it's not particularly pretty (how could it be? I'm by no means an R programmer...), but it does get the job done. It would be interesting to see what others have come up with.

EDIT: Since writing this answer, the function I use has expanded quite a bit. It now accepts more options and can output a completely specified tikzpicture environment. Still on the to-do list is making it accept lists of boxplots to print as sets of groupplot plots. But FWIW, here's the current version. Older versions can be seen in the answer's edit history.

This version also makes use of a custom outid entry in the R boxplot object, which holds the ids of the outliers. The function will still work if this entry is not set (it then assigns numbers as placeholders).

In R, you can then save the boxplot object and pass it as an argument to pgfbp:

bp <- boxplot(response ~ group, data = data)  # save the boxplot object
pgfbp(bp)                                     # print the pgfplots code

and copy the output to your tex file.
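For reference, here is a rough sketch of the kind of code that ends up in the tex file. It is not the actual output of pgfbp: the numbers and the group labels A and B are made up for illustration. It relies only on the `boxplot prepared` interface from the pgfplots statistics library, where the five keys correspond to the five rows of `bp$stats` in R (lower whisker, lower hinge, median, upper hinge, upper whisker) and the outliers from `bp$out` are passed as the plot's coordinates.

```latex
\documentclass{standalone}
\usepackage{pgfplots}
\pgfplotsset{compat=1.8}
\usepgfplotslibrary{statistics} % provides the boxplot handlers
\begin{document}
\begin{tikzpicture}
  \begin{axis}[
    boxplot/draw direction=y,
    xtick={1,2},
    xticklabels={A, B}, % hypothetical group names (bp$names in R)
  ]
    % Group A: summary statistics are illustrative values only.
    % The table rows are the outliers (bp$out for this group).
    \addplot+[boxplot prepared={
      lower whisker=2.1, lower quartile=3.4, median=4.0,
      upper quartile=4.9, upper whisker=6.2,
    }] table[row sep=\\, y index=0] {y\\ 7.8\\ 8.3\\};
    % Group B: no outliers, so the coordinate list is empty.
    \addplot+[boxplot prepared={
      lower whisker=1.5, lower quartile=2.6, median=3.1,
      upper quartile=3.9, upper whisker=5.0,
    }] coordinates {};
  \end{axis}
\end{tikzpicture}
\end{document}
```

Keeping the summary statistics in the tex file this way means the plot can be restyled later without going back to R.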

Labeling outliers

As for the meta column, I included it in this function because sometimes (particularly when showing initial plots to my supervisor) it is useful to label the outliers, so that unusual tendencies in a single participant can be identified. This I do together with a pgfplots style:

but I still have to find a good solution for extracting the labels for each outlier from the data (I have a kludge put together from a previous version, but I thought this was a bit too specific for this question). The version above uses numbers as placeholders, but they are easy to remove if they are not used.

Without pgfplots, you can insert chunks of R code directly in the source file, and the PDF file will show the results of these chunks (text, tables or figures) instead of the R code.

The source file must have the extension .Rnw (R noweb). R converts it with the Sweave function (or with knitr) into a normal .tex file that you compile as usual. If you use RStudio, the editor can do all of these steps for you with one click.
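Since an example was requested in the comments, here is a minimal sketch of such a file. The file name boxplot.Rnw and the chunk label are arbitrary, and the built-in ToothGrowth dataset stands in for real data. The chunk options `fig=TRUE` and `echo=FALSE` tell Sweave to include the resulting figure and hide the R code.

```latex
% boxplot.Rnw (hypothetical file name)
\documentclass{article}
\begin{document}

A boxplot produced by R when the document is processed:

<<toothgrowth-boxplot, fig=TRUE, echo=FALSE>>=
# ToothGrowth ships with R; replace it with your own data frame
boxplot(len ~ dose, data = ToothGrowth)
@

\end{document}
```

Running `Sweave("boxplot.Rnw")` in R should produce boxplot.tex together with the figure file. Compiling that .tex file may require Sweave.sty (shipped with R) to be visible to your TeX installation.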