"R is beautiful" - Video interview with author Prof. Michael J. Crawley as The R Book is re-released

Features

Author: Statistics Views

Date: 03 Dec 2012

Copyright: Photograph copyright of Statistics Views.

Wiley is proud to announce the release today of the second edition of The R Book. This hugely successful and popular textbook is a comprehensive guide for all R users.

The R language is recognized as one of the most powerful and flexible statistical software packages, enabling users to apply many statistical techniques to large data sets that would be impossible to analyse without such software. R has become an essential tool for understanding and carrying out research.

What developments do you think will be made in the future with regard to R, and which new audiences will start to use it?

I think everyone will use it! Everyone who has data will use it. I personally cannot see any point in using any other existing software. Of course, in computing, nothing stands still. There will be software that comes after R and it will be better, quicker and more beautiful, but at the moment there is nothing that comes even close. I cannot see what is going to happen in the future, of course, but computing speed and the capacity to hold bigger data sets were for a long time a limitation on us, and those limits are gradually being relieved. R would not be the language of choice for something like managing supermarket data, which involves tens of trillions of cross-references.

R is not a database management system; however, you can talk to a database with R. There is a nice package that makes R look like the front end of a database: it opens an access channel, and then you can write your queries from R. It’s like opening a gateway, and if you like using R there is no need to use anything else, because it can talk to big data sets.

Are there people or events that have been influential in your career?

When I was a teenager, computers were very new and cool to be involved in, and if you weren’t a computer scientist and yet were computer-literate, that was unusual.

As an undergraduate at Edinburgh, I got the opportunity to learn more about computers and that stood me in good stead. The first money I ever earned was as a computer programmer in the US, programming large simulation models for ecosystems in Utah. Working in computer programming was very influential. Then I ended up teaching statistics in a curious, roundabout sort of way: although I was working in a science department teaching ecology, there always had to be a statistics teacher, and it always ended up being me because I had done it before.

Through years of trial and error, I learned how difficult students found statistical computing and specifically what it was about it that they found hard. That is what has made the style of writing in The R Book effective: it talks them through it in what, to a mathematician, is an extraordinarily long way, using long sentences to say what a mathematician would say in a word, but leading the student gently through what they are seeing as output from R and explaining why everything is necessary.

R is infinitely opaque at first if you are not talked through it. Looking at the output of a model, for example, one wonders: why aren’t we shown the means and the standard errors of the means? Why one mean and a bunch of differences between the other means? All of that, which is so elegant, beautiful and intuitive to a professional statistician, is just peculiar to a beginner in science or economics. You cannot just read about R. These courses are only effective if taught well with a lot of practical work, and I think a lot of students use The R Book as a kind of practical guide, so they can actually obtain the data off the web, type what I’ve typed in the book for themselves and see it come up on the screen, and hopefully it looks exactly like it does in the book! This is again one of those frustrations when software is dynamic, because if you write a book about it, as soon as the book is printed the software has changed and the output no longer looks exactly like it did. So the students look at it and their natural inclination is to think that they have made an error, but R is very stable and will guide you through your research.
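The "one mean and a bunch of differences" layout mentioned above is what R reports for a model with a single categorical explanatory variable under its default treatment contrasts: the intercept row is the mean of the baseline group, and every other row is that group's mean minus the baseline mean. The sketch below reproduces only this arithmetic. It is written in Python purely for illustration and is not taken from The R Book; the group names, the yield values and the helper functions `treatment_contrasts` and `group_means` are all made up.

```python
from statistics import mean

# Made-up yields for three hypothetical treatment groups (illustration only).
groups = {
    "control": [4.0, 5.0, 6.0],
    "fertiliser_a": [6.0, 7.0, 8.0],
    "fertiliser_b": [9.0, 10.0, 11.0],
}


def treatment_contrasts(data, baseline):
    """Return the baseline mean plus each other group's difference from it."""
    base_mean = mean(data[baseline])
    rows = {"(Intercept)": base_mean}
    for name, values in data.items():
        if name != baseline:
            rows[name] = mean(values) - base_mean
    return rows


def group_means(contrasts):
    """Recover each group's mean by adding its difference back to the baseline."""
    base = contrasts["(Intercept)"]
    return {
        name: base if name == "(Intercept)" else base + diff
        for name, diff in contrasts.items()
    }


contrasts = treatment_contrasts(groups, baseline="control")
for name, value in contrasts.items():
    print(f"{name:12s} {value:6.2f}")
```

Read this way, the table loses nothing: each group mean is the intercept plus that group's row, which is what `group_means` computes. The standard errors that R prints alongside these rows are not reproduced here.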