Two Surprising Things about R

I see that it’s been over a year since my last post! I have a backlog of blog post ideas, but something else always seems to have higher priority. Today, though, I have some interesting (and useful) things to say about R, which I discovered in the last few days, and which shouldn’t take long to blog about. Of course, some other people may already be quite familiar with these things. Or maybe not…

First up, a useful feature of R that I hadn’t realized existed, which comes with a surprising gain in efficiency. Second, something surprisingly slow about R’s implementation of a very common operation.

First, the good thing I discovered about R. In complex mathematical expressions, it’s common to use more than one type of bracket, so that it’s easier to pair them up visually. Typical programming languages use only parentheses, other brackets having been appropriated for other uses. But it turns out that in R you can use both parentheses and curly brackets! The curly brackets are normally used to group statements, but an expression is one type of statement, and the last (or only) in the group provides the value. I’m not sure that this was always the case — I vaguely recall otherwise with some earlier version (perhaps an early version of S). But it works now.

Here’s the even more surprising thing. It occurred to me that before rushing out and using this feature, I should check that it doesn’t introduce some horrible inefficiency, as might be the case if curly brackets were optimized for their more common use in grouping statements. So I did a little test, as follows:
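The code block here didn’t survive. Reconstructing from the comments below (which mention functions f and g that evaluate d <- 1/(a*(b+c)) a million times), the test was along these lines, with f using parentheses and g using curly brackets:

```r
a <- 1; b <- 2; c <- 3

f <- function (n) for (i in 1:n) d <- 1/(a*(b+c))   # parentheses
g <- function (n) for (i in 1:n) d <- 1/{a*{b+c}}   # curly brackets

system.time(f(1000000))
system.time(g(1000000))
```

The surprise: g, with curly brackets, consistently ran faster than f.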

That’s with R version 2.9.2 on a Windows XP machine. Of course I ran it several times to be sure that results were consistent. And I ran it on two other versions of R, on Intel Solaris and Sun SPARC machines, with similar results.

I’m having difficulty imagining how curly brackets can be more efficient than parentheses. Could there be a dispatch operation somewhere in which a curly bracket operator gets recognized faster than a parenthesis operator? But surely any such lookup wouldn’t be done linearly, or by any other method where this could happen. I could imagine some strange accidental effect of caching, except that the result is consistent over different versions of R and different machine architectures.

The second surprising thing is how long it takes R to square all the numbers in a vector. Consider the following:

So multiplying the vector by itself is about 3.6 times faster than squaring it the obvious way with ^2. This is again R version 2.9.2, released 2009-08-24.

My first thought was that R treats ^2 as a general exponentiation operation, requiring taking a log, multiplying by two, and then exponentiating. But no, as seen below, a general exponentiation takes even longer:
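The code block showing this was also lost; a test of a general (non-integer) exponent would look something like:

```r
a <- 1:10000

system.time (for (i in 1:1000) b <- a^2.1)
```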

So my guess is that R does check for an exponent being exactly two, and treats that specially, but that it does this check again and again, for every element of the vector.

The speed gain from replacing a^2 with a*a is enough to justify this replacement in time-critical code, even though it makes the program less readable. But perhaps squaring will be (has already been?) made faster in a later version.


Regarding the a^2 versus a*a thing, I don’t know much about this part of R’s implementation, but it’s possible that a*a uses integer arithmetic instead of floating-point arithmetic if a is an integer, but that a^2 always uses floating point. I don’t know why a^2.1 would be even slower, though….

This makes things even more mysterious. I could see R storing 1:10000 as a vector of integers, and then having to convert to floating point when it does arithmetic (since the overflow properties of integer arithmetic would be incorrect). But that should slow down both a*a and a^2.

Whatever the reason, it seems that when the vector isn’t all integer, the advantage of a*a over a^2 increases to a factor of about 6.8 (based on some longer timing tests).

I can tell you why a^2.1 is slow… just remember what fractional exponents mean…
in this case the operation you would need to do is (a^21)^(1/10).
I assume there is a more efficient algorithm to figure that out than actually multiplying “a” 21 times and then taking the 10th root, which is the most obvious simplification for human eyes….

But it is clear that finding the umpteenth root of something will always be slower than just squaring it, whichever way you do it.

The integer versus float argument does not hold. I tried it myself.

Using a*a instead of a**2 is a case of “unrolling the loop”. You can imagine how much easier it is for a program to just do a*a than to work through “I have to multiply a (n-1) times”, which is basically a general loop.

In R you can do a lot of optimization if you put your mind to it. Most compilers will actually read the statement a^2 as a*a, thereby improving the speed. R obviously doesn’t do that.

Also, try doing the same operation a*a as a loop like
r = c()
for (i in 1:length(a)) r[i] = a[i]*a[i]

In this case this is obviously a bad idea, but generally, if a loop can be converted to something vectorized, speed will usually benefit. That is because R code is inherently slow, because of all the fancy and useful higher-level constructs and notation, and the less of the work you leave to R code, the faster the program!

Yes, a^2.1 is slow because it is a much more complex operation than just multiplying a by a. It is presumably implemented as exp(2.1*log(a)).
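For positive a, the identity this implementation would rely on is easy to check in R (a sketch of the mathematical equivalence, not the actual internals):

```r
a <- 3.7
all.equal(a^2.1, exp(2.1 * log(a)))   # TRUE, up to floating-point rounding
```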

It turns out that when a is an integer a*a is actually done with integer arithmetic. Which is a bad idea, I think. Maybe I’ll post on that sometime. But as I mention above, the benefit of a*a over a^2 also exists for reals.

Practically any compiled language will indeed treat a^2 identically to a*a. And most interpreted languages with vector operations would also do a^2 and a*a equally quickly on long vectors. That’s why it’s surprising that R doesn’t.

Inspired by the two issues above, I’ve actually looked at the R source code. So I now understand what’s going on. More on that in a future post. At the moment, however, I can say that these issues are not a result of R having fancy higher level constructs (except in so far as that may be why R is interpreted rather than compiled). I can also say that there is considerable scope for speeding up the R interpreter.

Yes, R could be made a lot faster, I think. But I believe that accuracy, usability and reliability are of higher priority to the core developers. Other than that, there is a chance that R will receive the kind of attention that Javascript, Perl and Python have received, and just look at how far these languages have come, speedwise.

I don’t get the integer argument though. If I type in typeof(2) I get “double”. I haven’t yet figured out a way to create or use integer arithmetic in R.

I think substantial improvements to speed can be made without compromising accuracy, usability, and reliability, and without much effort either. Slow speed is a significant impediment to greater use of R, much more so than lack of some wish list of new features.

You can create a vector of 10 integers with integer(10). Sequences from “:” are also integers. For instance, typeof(2:2) returns “integer”, even though typeof(2) returns “double”.

Integers may be needed for interface to C and Fortran routines, but I think having them infiltrate the rest of the language is a bad idea.

As you can see above, there’s a problem trying to enter R programs as comments! The issue is < and >, which have to be entered as &lt; and &gt; to avoid interpretation as HTML commands.

I assume that the f and g functions were using curly brackets and parentheses, however. I’d use the first number output by system.time to judge time, which makes the advantage of curly brackets somewhat less.

I: The difference between (1:10000)^2 and (1:10000)*(1:10000) may be that (1:10000)*(1:10000) is evaluated using the software equivalent of a vector processor. Or, the extra time may be in replacing (1:10000)^2 by
as.real(1:10000)*as.real(1:10000)

II: The following, as I interpret it, makes it clear that, when a is an integer vector,
a*a is evaluated using integer arithmetic. This has a downside (potential integer overflow) as well as an upside (faster).
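A quick illustration of the overflow downside (the L suffix makes a literal an integer in R):

```r
a <- 100000L
typeof(a)          # "integer"
a * a              # NA, with a warning about integer overflow
as.numeric(a)^2    # 1e+10, fine in floating point
```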

Your example got garbled, probably because you have to enter < and > in comments using &lt; and &gt;. But in any case, you’re right that n:m with n and m integers produces an integer vector, about the only way a normal user not writing C routines for interface to R would encounter integers.

But as I say in a comment above, that’s not the source of the problem. Having now looked at the R source code, I can say that the problem is that when evaluating a^2, R checks for the exponent being 2 over and over again, once for every element of a, and furthermore, this check is done in a procedure that is called for every element, producing a huge slowdown.

Look in names.c in src/main. A search for “^” will reveal an association with do_arith, which is found in arithmetic.c. From there, one can trace through to where POWOP is handled in the real_binary procedure.

Naive question to Radford, regarding integer squaring (or powers of powers of two): in the very specific case of squares, how about just using a bitwise left shift?
That’s one case where integer programming would be an advantage.
More generally, for integer exponent and base, using the binary representation of the exponent allows for fast exponentiation by summing as many bit-shifted versions of the base as there are ones in the exponent (and making no more than n shifts, where n = floor(log_2(exponent)) is the index of the highest-order one).

I think you’re thinking of a*2, and more generally of multiplying a by a small integer. But the analogous idea does transfer to raising a to an integer power, and people do indeed implement exponentiation by an integer power that way. In my mod to R I only put in the simple check for a^2, however.

Sorry, my bad, thanks for correcting me! For the shift, yes indeed, I was confused: that is for multiplication, my apologies (I’ll blame it on the jet lag). However, I was indeed thinking of fast exponentiation by repeated squaring (instead of bitwise shifts), as you pointed out.
Great patches for improvements, by the way, I’ve just seen your new post.
Next step: fixing the memory management?

I agree speed matters. Especially for those who are long-time Matlab users (as is often the case in the machine learning community). Trying out R, finding it much slower to do what you want, and going back to Matlab is a rather common occurrence. For many of the computations we want to do we are talking about hours to days of computing time, and so speed really does matter.

According to the R Language Definition, the difference between curly braces and parentheses is that the curly braces yield a function object, whereas the parentheses give you an expression object; it shouldn’t be surprising that there might be a slight speed difference between these two things.

Now with the original example, as well as some of the variants that other people tried (like adding lots of extra parens or braces), keep in mind what you’re asking R to do: you’ve defined a function that creates either a function object or an expression object which gets called. So what you learn from the example you gave is that creating lots of objects is expensive! I haven’t looked at R internals in a while, but it’s not unreasonable to think that for each layer of parens or braces you add, it generates a new function/expression object; after all, you’re working in an interpreted environment and don’t have a compiler to help.

Not only is creating lots of objects expensive, but in this case the expression you are evaluating takes a tiny fraction of time compared to actually generating the function/expression object. Of course, the example you gave is contrived and I couldn’t imagine writing the code that way for real; the values of a, b, and c never change, so it doesn’t make sense to re-calculate d a million times when you could instead move the d <- 1/(a*(b+c)) outside the function definition of f/g so that it is only ever evaluated once.

If you find yourself iterating a million times then it's worth re-thinking your code. Sometimes I see people use iteration when they ought to be using the kind of array and matrix operations that, in a sense, is one of the reasons to use R in the first place. For example:
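The commenter’s example appears to have been lost; a simple illustration of the point, using a hypothetical vector a, would be:

```r
a <- rnorm(100000)

# Iterating element by element, as one might in C or Fortran
r <- numeric(length(a))
for (i in seq_along(a)) r[i] <- a[i]^2

# The vectorized equivalent: one expression, typically much faster
r <- a^2
```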

I think it’s easy to fall into this trap if you’ve used some C or Fortran before where iteration is common, but not as expensive.

Sometimes you can’t help but use some iteration, but if you do have to iterate lots of times, then make sure the cost of the expression you’re iterating over far outweighs the cost of generating the function/expression object. Also, you can sometimes reduce the cost of an expensive function or expression by using the R.memoise or R.cache packages.

Neither parens nor curly brackets create a “function object”. Neither one creates anything that then “gets called”. They both simply evaluate their arguments and return the last, the only difference being that curly brackets can have more than one argument, with only the last providing the value. (Actually, there are also differences in how they get treated for debugging and profiling purposes, but those differences would all go in favour of parens being faster, not slower.)

The timing test is not, of course, a real program. I’m well aware that it computes the same thing over and over, but this would be a problem for a timing test only if R was smart enough to realize this too, and skip the extra computations. It isn’t that smart.

Your comment about it being bad to use loops is one that has been said again and again by many people, going back to the days of S, which would not only do loops slowly, but would also run out of memory for no good reason whenever you used a loop that iterated many times. I think it’s just a lazy excuse for not improving the implementation. Loops are often the clearest way to write a program. Blaming the user for writing them is not the way to improve a language implementation. And loops are unavoidable for many programs that do things like bootstrapping and MCMC.


In response to my comments, the R core people did make changes to speed up squaring. I’ve just tried R 2.14.1, and x^2 is now slightly faster than x*x. It wasn’t in the version current when I wrote this post.

Yes, parentheses are still slower than curly brackets in R 2.13.1, and in the latest R 2.14.1, though not by as much as before (a patch I suggested to reduce general overhead in built-in functions was incorporated in later releases, which reduced this difference).

There’s no good reason for parentheses to be slower. I suggested another change (to just a couple lines of code) that would speed up parentheses. Luke Tierney responded that this patch should on no account be incorporated into R, on the basis that parentheses are an operator just like sqrt, and that there should therefore be nothing in the code implementing parentheses that departs from how operators like sqrt are implemented.

Any self-respecting programming language implementor would be deeply embarrassed that parentheses slow down a program at all. Rejecting a very simple modification that would reduce the overhead of parentheses on such totally specious grounds requires a level of poor design judgement that is difficult to comprehend.

The R Core people made some changes in response to my comments about a*a versus a^2, which speeded up a^2 (though not by as much as they could have). The result is that in 2.13.1, on one of my machines, a^2 is indeed about the same speed as a*a for reals, and faster for integers. This is for vectors of length 1000.