Tag: statistics

(Guest post by Achim Zeileis)
Development of the R package exams for automatic generation of (statistical) exams in R started in 2006 and version 1 was published in JSS by Grün and Zeileis (2009). It was based on standalone Sweaveexercises, that can be combined into exams, and then rendered into different kinds of PDF output (exams, solutions, self-study materials, etc.). Now, a major revision of the package has been released that extends the capabilities and adds support for learning management systems. It is still based on the same type ofSweave files for each exercise but can also render them into output formats like HTML (with various options for displaying mathematical content) and XML specifications for online exams in learning management systems such as Moodle or OLAT. Supplementary files such as graphics or data are
handled automatically. Here, I give a brief overview of the new capabilities. A detailed discussion is in the working paper by Zeileis, Umlauf, and Leisch (2012) that is also contained in the package as a vignette.Continue reading Generation of E-Learning Exams in R for Moodle, OLAT, etc.

A Bernoulli process is a sequence of Bernoulli trials (the realization of n binary random variables), taking two values (0/1, Heads/Tails, Boy/Girl, etc…). It is often used in teaching introductory probability/statistics classes about the binomial distribution.

When visualizing a Bernoulli process, it is common to use a binary tree diagram in order to show the progression of the process, as well as the various consequences of the trial. We might also include the number of “successes”, and the probability for reaching a specific terminal node.

I wanted to be able to create such a diagram using R. For this purpose I composed some code which uses the {diagram} R package. The final function should allow one to create different sizes of diagrams, while allowing flexibility with regards to the text which is used in the tree.

Here is an example of the simplest use of the function:

source("http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game_.r.txt")# loading the function
binary.tree.for.binomial.game(2)# creating a tree for B(2,0.5)

The resulting diagram will look like this:

The same can be done for creating larger trees. For example, here is the code for a 4 stage Bernoulli process:

source("http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game_.r.txt")# loading the function
binary.tree.for.binomial.game(4)# creating a tree for B(4,0.5)

The resulting diagram will look like this:

The function can also be tweaked in order to describe a more specific story. For example, the following code describes a 3 stage Bernoulli process where an unfair coin is tossed 3 times (with probability of it giving heads being 0.8):

So it is no surprise that the new release of plyr 1.5 got me curious. While going through the news file with the new features and bug fixes, I noticed how (quietly) Hadley has also released (6 days ago) another version of plyr prior to 1.5 which was numbered 1.4.1. That version included only one more function, but a very important one – a new citation reference for when using the plyr package. Here is how to use it:

install.packages("plyr")# so to upgrade to the latest releasecitation("plyr")

The output gives both a simple text version as well as a BibTeX entry for LaTeX users. Here it is (notice the download link for yourself to read):

Recently I was asked by O’Reilly publishing to give a book review for Paul Teetor new introductory book to R. After giving the book some attention and appreciating it’s delivery of the material, I was happy to write and post this review. Also, I’m very happy to see how a major publishing house like O’Reilly is producing more and more R books, great news indeed.

And now for the book review:

Executive summary: a book that offers a well designed gentle introduction for people with some background in statistics wishing to learn how to get common (basic) tasks done with R.

In this new edition, Professor Fox has teamed with Professor Sandy Weisberg, to refresh the original edition so to cover the development gained in the (nearly) 10 years since the first edition was written.

Here is what John Fox had to say:

Dear all,

Sandy Weisberg and I would like to announce the publication of the second
edition of An R Companion to Applied Regression (Sage, 2011).

As is immediately clear, the book now has two authors and S-PLUS is gone
from the title (and the book). The R Companion has also been thoroughly
rewritten, covering developments in the nearly 10 years since the first
edition was written and expanding coverage of topics such as R graphics and
R programming. As before, however, the R Companion provides a general
introduction to R in the context of applied regression analysis, broadly
construed. It is available from the publisher at (US) or (UK), and from Amazon (see here)

The book is augmented by a web site with data sets, appendices on a variety of topics, and more, and it associated with the car package on CRAN, which has recently undergone an overhaul.

In this post I publish a PDF document titled “A collection of tips for R in Finance”.
It is a basic 5 page introduction to R in finances by Arnaud Amsellem (linked in profile).

The article offers tips related to the following points:

Code Editor

Organizing R code

Update packages

Getting external data into R

Communicating with external applications

Optimizing R code

This article is well articulated, and offers a perspective of someone who is experienced in the field and touches points that I can imagine beginners might otherwise overlook. I hope publishing it here will be of use to some readers out there.

Update: as some readers have noted to me (by e-mail, and by commenting), this document touches very lightly on the topic of “finances” in R. I therefore decided to update the title from “R in finance – some tips for beginners”, to it’s current form.

Lastly: if you (a reader of this blog) feel you have an article (“post”) to contribute, but don’t feel like starting your own blog, feel welcome to contact me, and I’ll be glad to post what you have to say on my blog (and subsequently, also on R bloggers).

Ian fellows, a hard working contributer to the R community (and a cool guy), has announced today the release of Deducer (0.4) to CRAN (scheduled to update in the next day or so).
This major update also includes the release of a new plug-in package (DeducerExtras), containing additional dialogs and functionality.

Following is the e-mail he sent out with all the details and demo videos.

The next step is that the website will go into closed BETA for about a week. If you want to be part of this – now is the time to join (
From being part in some other closed BETA of similar projects, I can attest that the enthusiasm of the people trying to answer questions in the BETA is very impressive, so I strongly recommend the experience.

If you won't make it by the time you see this post, then no worries - about a week or so after the website will go online, it will be open to the wide public.

(p.s: thanks Romunov for pointing out to me that the BETA is about to open)

p.s: MetaOptimize

I would like to finish this post with mentioning MetaOptimize. This is a Q&A website which is of a more “machine learning” then a “statistical” community. It also started out some short while ago, and already it has around 700 users who have submitted ~160 questions with ~520 answers given. From my experience on the site so far, I have enjoyed the high quality of the questions and answers.
When I first came by the website, I feared that supporting this website will split the R community of users between this website and the area 51 StackExchange website.
But after a lengthy discussion (published recently as a post) with MetaOptimize founder, Joseph Turian, I came to have a more optimistic view of the competition of the two websites. Where at first I was afraid, I am now hopeful that each of the two website will manage to draw a tiny bit of different communities of people (that would otherwise wouldn’t be present in the other website) – thus offering all of us a wider variety of knowledge to tap into.

Letter in the conversation, Achim Zeileis, has surprised us (well, me) saying the following

I’ve thought about adding a plot() method for the coeftest() function in the “lmtest” package. Essentially, it relies on a coef() and a vcov() method being available – and that a central limit theorem holds. For releasing it as a general function in the package the code is still too raw, but maybe it’s useful for someone on the list. Hence, I’ve included it below.

(I allowed myself to add some bolds in the text)

So for the convenience of all of us, I uploaded Achim’s code in a file for easy access. Here is an example of how to use it:

I hope Achim will get around to improve the function so he might think it worthy of joining his“lmtest” package. I am glad he shared his code for the rest of us to have something to work with in the meantime

* * *

Update (07.07.10):
Thanks to a comment by David Atkins, I found out there is a more mature version of this function (called coefplot) inside the {arm} package. This version offers many features, one of which is the ability to easily stack several confidence intervals one on top of the other.

It works for baysglm, glm, lm, polr objects and a default method is available which takes pre-computed coefficients and associated standard errors from any suitable model.

Example:
(Notice that the Poisson model in comparison with the binomial models does not make much sense, but is enough to illustrate the use of the function)