Daily news about using open source R for big data analysis, predictive modeling, data science, and visualization since 2008

August 07, 2012

An analysis of the r-help mailing list

Even though forums and question-and-answer services like StackOverflow are emerging as the place to find crowdsourced technical help when using software like R, the traditional r-help email list is still going strong. UCLA grad student and R user Richard Kwock presented a poster at last month's JSM conference with an analysis of traffic on the list, showing it's still generating nearly 3,000 messages per month.

Thanks to the generosity of R experts who monitor the list (including the most active responder, R core member Brian Ripley), R users are likely to find an answer to their questions by mailing to the list. (Although the answer to a poorly framed question might well be: "read the posting guide!" -- newbies are advised to lurk on the list for a while before diving in with a question.) In fact, over the years and despite the growing traffic on the list, the likelihood of a response has increased. New questions have received 2.2 responses on average in recent years.

The most popular categories of questions on the list are data structures (objects like lists and data frames), data manipulation (functions like rep and paste), and statistical functions (like rnorm and lm). The most active time for the list is early in the morning for the US West Coast, most likely because it coincides with the early evening in western Europe, where many of the most active list participants are located.

Richard used R (of course!) to perform many other analyses of the list traffic, including charts of the top 20 most mentioned R functions (#1 is c, the function that combines values into a vector), the top 20 mailing list contributors, and trends in popular topics (graphics, data mining, and Bayesian analysis are all growing in interest). For the complete analysis and more charts, check out Richard's poster linked below.