Are you ready now? OK, this post reviews how to install Stan. Let's start here! :) In principle this post just follows the content of "RStan Getting Started", but I've added some tips for fixing lesser-known problems.

On any OS, Stan needs a C++ compiler because Stan is implemented in C++. For R on Windows, CRAN distributes a C++ build toolchain and other dependencies as Rtools, and we need it here. Fortunately, the Stan project kindly shows us how to install Rtools.
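Before installing anything, it can help to check from R whether a compiler is already visible. This is a minimal sketch using only base R; the tool names `g++` and `make` are my assumption of what a typical gcc/Rtools toolchain exposes, so adjust them if your setup differs.

```r
# Look up build tools on the PATH seen by this R session.
# Sys.which() returns "" for any tool it cannot find.
tools <- Sys.which(c("g++", "make"))
print(tools)

missing_tools <- names(tools)[tools == ""]
if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
} else {
  message("A C++ toolchain appears to be available.")
}
```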

OMG, what happened? As far as I know, this kind of trouble is caused by a mismatch between the gcc version that is needed and the gcc that PATH actually points to.

For many versions of Stan, gcc 4.6.3 or later is recommended. However, some distributions of other tools, e.g. Python (in particular Python(x,y)) or MinGW, bundle an older version of gcc, and PATH is usually edited so that the bundled gcc comes before any other gcc you installed manually.

In my own case, MinGW included an older gcc, 4.5.2, and its PATH setting prevented gcc 4.6.3 from being picked up properly. All I had to do was remove the folder path below from PATH,

C:\MinGW32-xy\lib\gcc\mingw32\4.5.2

and add the folder paths below.

C:\Rtools\bin;c:\Rtools\gcc-4.6.3\bin
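To see which gcc R would actually pick up, you can inspect PATH from inside R. The following is only a sketch with base R; the search pattern is a rough heuristic and `prepend_path()` is a hypothetical helper I made up for illustration. It changes PATH for the current R session only, not the system setting.

```r
# Split PATH into its entries, in search order.
path_entries <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]

# Entries that mention gcc, MinGW or Rtools (the pattern is only a heuristic).
suspects <- grep("gcc|mingw|rtools", path_entries, ignore.case = TRUE, value = TRUE)
print(suspects)

# The gcc that would be found first ("" if none is on PATH).
print(Sys.which("gcc"))

# Hypothetical helper: put the given directories at the front of PATH
# for this R session only.
prepend_path <- function(dirs) {
  new_path <- paste(c(dirs, Sys.getenv("PATH")), collapse = .Platform$path.sep)
  Sys.setenv(PATH = new_path)
  invisible(new_path)
}

# Example on Windows, using the directories mentioned in this post:
# prepend_path(c("C:\\Rtools\\bin", "C:\\Rtools\\gcc-4.6.3\\bin"))
```

Entries earlier in PATH win, which is why an older bundled gcc can shadow the one you installed later.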

Just for your information, I recently tried to install {rstan} on an Amazon Linux EC2 instance, but it failed. Wrong version of gcc? A missing library or other required software? No, no, it simply ran out of memory :( On Amazon EC2, you have to prepare a larger instance.

Get Stan and install it

OK, let's install Stan into R. Actually {rstan} requires some dependencies such as {Rcpp} and {inline}, but those packages will be installed together with {rstan}. Don't worry.
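As a rough sketch, and assuming a current R setup where {rstan} is available from CRAN, the installation can look like this. Please treat "RStan Getting Started" as the authoritative procedure for your platform; this is just the short version.

```r
# Install rstan from CRAN together with its dependencies (Rcpp, inline, ...),
# but only if it is not installed yet.
if (!requireNamespace("rstan", quietly = TRUE)) {
  install.packages("rstan", dependencies = TRUE)
}

# Report the installed version.
print(packageVersion("rstan"))
```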

If you still see an error message and the installation fails, please read the message carefully and fix the problem. As far as I know, the most common cause of a failed installation is a version mismatch among the dependencies (and/or build tools). As with installing anything on your machine, please verify that all the software involved is updated to the latest version.

Run an easy example of binary logistic regression with {rstan}

Now you can run Stan from R as {rstan} anytime. :) Please make sure {rstan} is ready.

> library(rstan)

OK, let's try it with a very easy example. Please download "conflict_sample.txt" from my GitHub repository and import it as a data frame "dat". This is a really simple data set that can easily be modeled with a generalized linear model, such as logistic regression.
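Importing the file might look like the sketch below. I am assuming here that the file sits in the working directory, has a header row, and is whitespace- or tab-delimited; if yours differs, adjust the `read.table()` arguments.

```r
# Assumption: "conflict_sample.txt" was saved into the working directory,
# has a header row, and is whitespace/tab delimited.
data_file <- "conflict_sample.txt"

if (file.exists(data_file)) {
  # stringsAsFactors = TRUE so that text columns such as "cv" become factors.
  dat <- read.table(data_file, header = TRUE, stringsAsFactors = TRUE)
  str(dat)  # check the column types before modeling
} else {
  message("File not found: ", data_file, " (please download it first)")
}
```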

Unfortunately Stan does not accept R factors (categorical variables) as data, so please convert the "cv" column into numeric 0/1 values.

> dat$cv <- as.numeric(dat$cv)-1
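Why the "-1"? `as.numeric()` on a factor returns the internal level codes, which start at 1, so subtracting 1 gives a 0/1 coding. Here is a tiny demonstration on a made-up factor (the labels "No"/"Yes" are only for illustration, not the real contents of "cv"). Note that since R 4.0.0 `read.table()` no longer turns text columns into factors by default, so a character column has to go through `factor()` first.

```r
# Toy factor standing in for dat$cv (labels are made up).
cv <- factor(c("No", "Yes", "Yes", "No"))

as.numeric(cv)      # level codes: 1 2 2 1
as.numeric(cv) - 1  # 0/1 coding:  0 1 1 0

# If the column is character rather than factor, convert it explicitly first.
cv_chr <- c("No", "Yes", "Yes", "No")
cv_num <- as.numeric(factor(cv_chr)) - 1
cv_num              # 0 1 1 0
```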

Generate MC samples and get a simple result

Next, prepare the Stan code; in principle {rstan} just works as an R interface to Stan, so it needs source code to compile. Please write the code below and save it as "conflict.stan" with any editor you like.
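For reference, here is a sketch of what a Stan program for binary logistic regression can look like, written out from R with `writeLines()`. It is not necessarily identical to my original listing: the names `N`, `K`, `X`, `y`, `alpha` and `beta` are placeholders chosen for this sketch, and the predictors are passed as a generic matrix rather than by column name. It uses the `array[N] int` declaration syntax of recent Stan releases; older releases wrote `int y[N]` instead.

```r
# A generic binary logistic regression in Stan (loop form).
stan_code <- "
data {
  int<lower=0> N;                     // number of observations
  int<lower=1> K;                     // number of predictors
  matrix[N, K] X;                     // predictor matrix
  array[N] int<lower=0, upper=1> y;   // binary response
}
parameters {
  real alpha;                         // intercept
  vector[K] beta;                     // coefficients
}
model {
  for (n in 1:N)
    y[n] ~ bernoulli_logit(alpha + X[n] * beta);
}
"

# Save the program so that rstan can compile it from a file.
writeLines(stan_code, "conflict.stan")
```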

This code is merely a simple binary logistic regression. Of course, such a model never requires heavy computation with MCMC or HMC, because an ordinary maximum likelihood estimate would do, so please take it as just an easy example.
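To illustrate the point, the same kind of model can be fitted by maximum likelihood with base R's `glm()` in an instant. The data below are simulated purely for illustration (they are not the "conflict_sample.txt" data), and the true coefficients -0.5, 1.0 and -2.0 are arbitrary choices of mine.

```r
set.seed(1)

# Simulate a small binary-response data set with two predictors.
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
p  <- plogis(-0.5 + 1.0 * x1 - 2.0 * x2)  # inverse logit of the linear predictor
y  <- rbinom(n, size = 1, prob = p)
sim <- data.frame(y, x1, x2)

# Maximum likelihood fit of the logistic regression.
fit_glm <- glm(y ~ x1 + x2, data = sim, family = binomial)
print(coef(summary(fit_glm)))
```

With weak priors and a data set of this size, posterior means from Stan would typically land close to these estimates.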

Actually this code can be rewritten using vectorization. It looks even simpler :)
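A vectorized version might look like the sketch below, again with placeholder names of my own (`N`, `K`, `X`, `y`, `alpha`, `beta`) and the `array[N] int` syntax of recent Stan releases. The file name "conflict_vec.stan" and the helper `fit_conflict()` are hypothetical; the helper assumes that "cv" is the 0/1 response and that every other column of `dat` is a numeric predictor.

```r
# Vectorized form: one sampling statement instead of a loop over observations.
stan_code_vec <- "
data {
  int<lower=0> N;
  int<lower=1> K;
  matrix[N, K] X;
  array[N] int<lower=0, upper=1> y;
}
parameters {
  real alpha;
  vector[K] beta;
}
model {
  y ~ bernoulli_logit(alpha + X * beta);
}
"
writeLines(stan_code_vec, "conflict_vec.stan")

# Hypothetical helper: build the data list and draw samples with rstan.
# Assumes dat$cv is already 0/1 and all other columns are numeric predictors.
fit_conflict <- function(dat, stan_file = "conflict_vec.stan") {
  X <- as.matrix(dat[, setdiff(names(dat), "cv"), drop = FALSE])
  stan_data <- list(N = nrow(X), K = ncol(X), X = X, y = as.integer(dat$cv))
  rstan::stan(file = stan_file, data = stan_data, chains = 4, iter = 1000)
}

# Usage, once dat has been prepared:
# fit <- fit_conflict(dat)
# print(fit)
```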

Check convergence with plotting

Usually Bayesian practitioners like to check whether the model estimation has converged well. To do so, you plot the distribution (histogram) of each estimated parameter. The simplest way is to use {coda}.
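One possible way to hand the samples over to {coda} is sketched below. `check_convergence()` is a hypothetical helper name; it expects a fitted stanfit object with at least two chains, because the Gelman-Rubin diagnostic compares chains with each other.

```r
# Hypothetical helper: convert a stanfit object to coda's format and
# run some basic convergence checks.
check_convergence <- function(fit) {
  stopifnot(requireNamespace("rstan", quietly = TRUE),
            requireNamespace("coda", quietly = TRUE))
  draws <- rstan::As.mcmc.list(fit)  # one coda mcmc object per chain
  plot(draws)                        # trace and density plots per parameter
  # Potential scale reduction factors; values near 1 suggest convergence.
  print(coda::gelman.diag(draws, multivariate = FALSE))
  invisible(draws)
}

# Usage, given a fitted model object `fit`:
# draws <- check_convergence(fit)
```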

It's beautiful, but I think it needs somewhat sophisticated {ggplot2} skills. :P In the next post, I'll discuss how to use Stan for a dataset with random effects; that is, a simple example of hierarchical Bayesian models.