vecLib: Why Mac users are better off with Open Source R

The July and August meetings of the New England R Users group focused on two different aspects of R performance: parallel processing techniques and the effects of compiler & library selection when compiling the R executable itself.

It was the comprehensive presentation by IBM’s Vipin Sachdeva (slides here) showing 15-20X speedups through compiler and library selection that made me want to try a couple of benchmarks myself. And my recent it’s-about-time upgrade to R 2.11 seemed like the perfect opportunity.

Open source vs. Revolutions Community R

Performance is one of the advantages claimed by Revolution Analytics for its distributions, with their product page promising “optimized libraries and compiler techniques run most computation-intensive programs significantly faster than Base R” even with their free, Community edition. I have heard good things about its performance on Windows, so I was curious to see if it provides an improvement over the already-optimized Mac binary.

Benchmarking Methodology (or lack thereof)

First, some disclaimers: I am not a serious benchmarker and have made no special effort at statistical rigor. I am just looking for orders of magnitude here, so I kept a normal number of programs running in the background, like Firefox and OpenOffice, though nothing was doing anything substantial and I avoided any user input while each test ran. My machine is the short-lived, late-2008, aluminum unibody 13″ MacBook (MacBook5,1) with 4GB RAM and Mac OS X Leopard 10.5.8 running the 32-bit kernel. It has a 2.4GHz Core 2 Duo, nothing special.

For my tests, I ran the standard R Benchmark 2.5 available from AT&T’s benchmarking page, which performs various matrix and vector calculations and is therefore well suited to discerning the effects of such optimized libraries. I kept the defaults, such as running each test 3 times, and installed the required “SuppDists” package. I tested the open source 2.10.1 32-bit version I already had on my machine and then installed Revolution’s 2.10.1-based 64-bit community edition. I should have repeated the test with the open source 64-bit edition, but I didn’t think of it at the time (I told you I wasn’t serious about this), so instead I later re-ran the benchmark with the 32- and 64-bit versions of the open source 2.11.1 to check whether there are any significant 32-vs-64 differences.
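For a feel of what one of these matrix tests involves, here is a minimal sketch of a cross-product (b = a’ * a) timed over three runs. It is written in plain Python rather than R, it is not the benchmark's own code, and the helper names `crossprod` and `time_runs` are made up for illustration. The matrix is also far smaller than the benchmark's 2800×2800, since pure Python has no optimized BLAS behind it.

```python
import random
import time


def crossprod(a):
    """Return a' * a for a matrix given as a list of rows."""
    cols = list(zip(*a))  # the columns of a are the rows of a'
    return [[sum(x * y for x, y in zip(ci, cj)) for cj in cols] for ci in cols]


def time_runs(func, arg, runs=3):
    """Call func(arg) `runs` times and return the elapsed seconds of each call."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func(arg)
        timings.append(time.perf_counter() - start)
    return timings


n = 60  # hypothetical size, chosen only to keep the pure-Python run short
a = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
timings = time_runs(crossprod, a)
print(["%.4f s" % t for t in timings])
```

An optimized BLAS does the same arithmetic with blocked, vectorized kernels, which is why the choice of library matters so much for tests like this one.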

Results

It didn’t take long to realize that the Revolutions community edition was not going to fare well. During just the fourth benchmark, 2800×2800 cross-product matrix (b = a’ * a), there was a pregnant pause in the output while my laptop’s fans kicked in and soon spun up to full force. It took nearly 25 seconds to complete each turn of that one test where the open source 2.10.1 had finished in less than one tenth the time. (The complete output for each test is at the end of this post.)

Figure 1: Summary-level benchmark results. (Smaller bars are better.)

Figure 1 shows the geometric means of the elapsed times for each benchmark section as reported by R Benchmark 2.5. Clearly the Revolutions distribution did significantly worse on the matrix benchmarks. Figure 2 drills into the individual benchmarks to show the roughly 2-8X difference on the five slowest matrix benchmarks. Only on the sixth, Grand common divisors of 400,000 pairs (recursion), was the slowdown matched by the base 64-bit distribution. And only on Revolution’s fastest benchmark, 2400×2400 normal distributed random matrix ^1000, all the way at the bottom of Figure 2, did the 64-bit versions hold a distinct (and roughly equal) advantage over their 32-bit brethren.

Figure 2: Individual benchmark results. (Smaller bars are better.)
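The geometric means behind Figure 1 are easy to reproduce by hand. The sketch below shows the calculation and a slowdown ratio between two distributions. The timings are hypothetical placeholders, not my measurements, the function names are my own, and the sketch ignores any trimming of extreme values that the benchmark script may apply.

```python
import math


def geometric_mean(times):
    """Return the geometric mean of a sequence of positive elapsed times."""
    if not times:
        raise ValueError("need at least one timing")
    if any(t <= 0 for t in times):
        raise ValueError("timings must be positive")
    # Average the logs, then exponentiate, instead of multiplying everything.
    return math.exp(sum(math.log(t) for t in times) / len(times))


def slowdown(candidate, baseline):
    """Ratio of geometric means; values above 1 mean candidate is slower."""
    return geometric_mean(candidate) / geometric_mean(baseline)


# Hypothetical timings in seconds, NOT the results from this post.
baseline = [1.0, 2.0, 4.0]
candidate = [2.0, 4.0, 8.0]

print(round(geometric_mean(baseline), 3))       # 2.0
print(round(slowdown(candidate, baseline), 3))  # 2.0
```

A geometric mean suits this kind of summary because a single very slow test cannot dominate the average the way it would with an arithmetic mean.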

vecLib: BLAS, LAPACK, and built into the Mac

So, no surprise: Amy was right. The off-the-shelf open source distributions of R for the Mac are already optimized. But how? Vipin walked us through all the different choices for BLAS and LAPACK libraries, not to mention the different C and FORTRAN compilers and their optimization flags. How can we know what’s being used by a given distribution? Well, it turns out that R makes it easy to find out with the config options to “R CMD”, which report the BLAS and LAPACK libraries a build was linked against.
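As a rough sketch of that lookup, the snippet below shells out to `R CMD config` for the `BLAS_LIBS` and `LAPACK_LIBS` variables. It assumes an `R` executable on the PATH, and the wrapper name `r_config` is made up for illustration. What it prints depends entirely on how your copy of R was built, so no output is shown here.

```python
import shutil
import subprocess


def r_config(variable, r_binary="R"):
    """Return the output of `R CMD config <variable>`, or None if unavailable."""
    if shutil.which(r_binary) is None:
        return None  # no such executable on the PATH
    try:
        result = subprocess.run(
            [r_binary, "CMD", "config", variable],
            capture_output=True,
            text=True,
            timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None  # R could not report this variable
    return result.stdout.strip()


# Ask the installed R which BLAS and LAPACK libraries it links against.
for name in ("BLAS_LIBS", "LAPACK_LIBS"):
    print(name, "=", r_config(name))
```

Running the same two queries against each installed distribution is a quick way to compare what they link to before bothering with a full benchmark.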

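For the open source Mac build, the answer is vecLib, the vector library built into Mac OS X, which is described as follows:

VecLib also contains Basic Linear Algebra Subprograms (BLAS) that use AltiVec technology for their implementations. The functions are grouped into three categories (called levels), as follows: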

Vector-scalar linear algebra subprograms

Matrix-vector linear algebra subprograms

Matrix operations

A Readme file is included that contains the following sections:

Descriptions of functions

Comparison with BLAS (Basic Linear Algebra Subroutines)

Test methodology

Future releases

Compiler version

LAPACK

LAPACK provides routines for solving systems of simultaneous linear equations, least-squares solutions of linear systems of equations, eigenvalue problems, and singular value problems. The associated matrix factorizations (LU, Cholesky, QR, SVD, Schur, generalized Schur) are also provided, as are related computations such as reordering of the Schur factorizations and estimating condition numbers. Dense and banded matrices are handled, but not general sparse matrices. In all areas, similar functionality is provided for real and complex matrices, in both single and double precision.

As Vipin had demonstrated, using fast BLAS and LAPACK libraries can make all the difference in the world (well, 20X or so). And since Apple controls the horizontal and vertical on their platform, it shouldn’t be a surprise that vecLib is fast on their hardware and OS. The real question is why Revolutions doesn’t simply link to vecLib too. It can’t be because their libraries are better (they clearly aren’t). Nor could they be afraid of competing with their Enterprise edition because, according to this edition comparison chart, they don’t offer an enterprise edition for the Mac. Perhaps they’re simply not that familiar with the platform and don’t know about vecLib. I know I didn’t know anything about it until these tests prompted me to consult The Google.

So there still seems to be plenty of opportunity to beat vecLib if you’re willing to compile R and mix and match BLAS libraries. For the rest of us, the open source distribution offers the best bang for the buck.

Maybe I need to ask Vipin to take a look at my Mac at the next meeting….