
I finally found the time to try parallel computing in R using snowfall/snow, thanks to this article in the first issue of the R Journal (the replacement for R News). I hadn’t tried parallel computing before because I didn’t have a good toy example, and the learning curve seemed steep. snow and snowfall are perfect for ‘embarrassingly parallel’ jobs, e.g., a simulation study, a bootstrap, or a cross-validation. I run simulation studies a lot, e.g., to assess the properties of a statistical methodology, so parallel computing will be very useful.
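To make ‘embarrassingly parallel’ concrete, here is a minimal sketch of a bootstrap spread across local workers with snowfall. The data, the function name `boot_mean`, the number of replicates, and `cpus = 2` are all made up for illustration; this is not the example from the R Journal article.

```r
library(snowfall)

# Start a local cluster; cpus = 2 is an arbitrary choice for illustration.
sfInit(parallel = TRUE, cpus = 2)

# Toy data (hypothetical): 100 draws from a standard normal.
set.seed(1)
x <- rnorm(100)

# One bootstrap replicate: resample with replacement and take the mean.
# The index i is ignored; it only gives sfLapply something to iterate over.
boot_mean <- function(i, data) mean(sample(data, replace = TRUE))

# Each replicate is independent of the others, so the replicates can be
# handed out to the workers in any order.
boot_means <- unlist(sfLapply(1:1000, boot_mean, data = x))

# Always shut the workers down when finished.
sfStop()

# Bootstrap estimate of the standard error of the mean.
sd(boot_means)
```

The random-number streams on the workers are not seeded here, so the result will differ from run to run; `sfClusterSetupRNG()` is the snowfall call to look at for reproducible streams.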

I got the toy example to work, but at first it only ran in parallel on a single computer with multiple cores. Thanks to Michael Zeller, I got it to work on multiple machines. If you use multiple nodes, make sure to enable passwordless ssh.
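The difference between the single-machine and multi-machine setups is mostly in the call to `sfInit()`. The sketch below assumes a socket cluster and that passwordless ssh from the master to every listed host already works. The `hosts` vector is a placeholder: it uses `"localhost"` so the snippet can be tried on one machine (as far as I know snow starts `"localhost"` workers without going through ssh), and should be replaced with real node names.

```r
library(snowfall)

# Placeholder host list (an assumption for illustration). Replace with the
# names of your own nodes; repeat a name to start several workers on it.
hosts <- c("localhost", "localhost")

# type = "SOCK" asks for a socket cluster; one worker is started per entry
# in socketHosts.
sfInit(parallel = TRUE, cpus = length(hosts), type = "SOCK",
       socketHosts = hosts)

# Ask every worker which machine it is running on, as a quick sanity check
# that the workers were started where you expected.
worker_hosts <- unlist(sfClusterCall(function() Sys.info()[["nodename"]]))
print(worker_hosts)

sfStop()
```

If a remote worker fails to start, the first things to check are that `ssh hostname` works without a password prompt and that R and the required packages are installed on that host.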

Credit for getting snowfall to work on the BDUC servers (uci-nacs) goes to Harry Mangalam.

## Example 3 – Multiple nodes on a cluster (namely, the BDUC servers of uci-ics)
## ssh to bduc, then ssh to one of their claws (the head node is 32bit whereas the other ones are 64)
## put something like
## export LD_LIBRARY_PATH=/home/vqnguyen/lib:/usr/local/lib:/usr/lib:/lib:/sge62/lib/lx24-x86 in .bashrc
## or
## Sys.setenv(LD_LIBRARY_PATH="/home/vqnguyen/lib:/usr/local/lib:/usr/lib:/lib:/sge62/lib/lx24-x86")
## in an R session. Note: modify path to your home directory
## might have to install required packages elsewhere, like ~/Rlib, and use .libPaths() to add library path. Put this in .Rprofile
sink('SnowFallExample.Rout', split=TRUE)
.Platform
.Machine
R.version
Sys.info()