Iteration or Recursion?

by Liviu Tudor

Did you ever get stuck having to choose between a recursive and an iterative implementation? Find out whether one is better than the other, or whether it really matters.

Whether we like it or not, recursion is something we encounter very frequently when dealing with applications — even the simplest call to Arrays.sort reveals that under the hood there is a recursive quicksort implementation. It is something you need to be aware of and, most importantly, you need to be aware of its implications.

There have been many books written about recursive functions, and I'm not going to get involved in a whole discussion about whether recursion could or should be avoided entirely, or whether that is even possible (the Ackermann function springs to mind, I'm sure). Rather, I'll work on the assumption that in the real world most solutions are a mixture of recursive and iterative implementations, and because of that there are a few considerations you should factor into your decision. I'm therefore going to take a comparative look at implementing a few basic functions in both a recursive and an iterative manner and see what conclusions can be drawn.

In order to perform such a comparison, I'm going to need some basic metrics to allow for a comparison between the two implementations in each case. This in itself could fuel a whole discussion, as there is more than one coordinate for such measurements; however, for simplicity I'm going to settle for just two parameters: the time it took to complete the operation and the number of steps taken. Arguably the two are related, as it seems logical that the more steps an algorithm takes, the more time it will take to finish. This is the real world, however, and there are other things involved in executing a function, as will become apparent soon. It is also worth mentioning the "Schrödinger's cat" factor: introducing these simple measurements will obviously affect the measurements themselves. However, these measurements are not intended to be precise; they are only used for comparison. Any measurement errors introduced shouldn't affect the comparison as long as they are introduced in both implementations and with the same magnitude.

To help with such measurements, I've provided a very simple interface ingeniously named SimpleMeasurement that provides just two methods, one for retrieving the time and one for retrieving the number of steps:

public interface SimpleMeasurement
{
    public long getNanos();
    public long getSteps();
}

While retrieving the number of steps is obvious (the getSteps function), the time function seems to have an obscure signature (getNanos). There is a reason behind this madness. On most modern computers, most implementations run in less than a millisecond, so trying to measure the time in milliseconds (let alone seconds) would render zero in most cases for both recursive and iterative implementations. This might lead to the wrong conclusion that both implementations are equally fast. Therefore I decided to go for a finer-grained measurement and measure the time in nanoseconds; a nanosecond is one billionth of a second. To put it in perspective, 1,000,000 nanoseconds = 1 millisecond.

Each implementation provided for the purpose of this article will have to implement this interface, in order to be able to extract the information that will be used for comparison. Each implementation will have to apply the following pattern when called:

Store the current system time in nanoseconds (see the javadoc on System.nanoTime; it is safe to use here since it is only used to compute the difference between the starting time and the ending time).

Reset the counter that counts the number of steps it took to execute the function.

Increase the counter for each execution cycle.

Store the end system time in nanoseconds at the end of the cycle.

Based on the above steps, each implementation will return (ending time - starting time) for getNanos and the counter value for getSteps.

Since a common pattern has been identified for all of these implementations, it makes sense to provide the basics of this pattern in a class on its own. The SimpleMeasurementImpl class does exactly this. It provides members for start time and end time and the steps counter and also provides implementations for returning the two values:
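The original listing is not reproduced here, so below is a minimal sketch of what SimpleMeasurementImpl might look like, assuming only the members the text describes (timeStart, timeEnd, steps):

```java
// Minimal sketch of the SimpleMeasurementImpl base class described in the
// text; the member names (timeStart, timeEnd, steps) follow the article.
interface SimpleMeasurement
{
    long getNanos();
    long getSteps();
}

class SimpleMeasurementImpl implements SimpleMeasurement
{
    protected long timeStart;  // set at the beginning of each call
    protected long timeEnd;    // set at the end of each call
    protected long steps;      // incremented once per cycle

    public long getNanos()
    {
        return timeEnd - timeStart;
    }

    public long getSteps()
    {
        return steps;
    }
}
```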

You will notice that all of the data members have been declared as protected so classes extending this one can simply access the data directly. (Remember, this is just for the purpose of some basic measurement metrics, so there is no point in overloading the implementation with encapsulation, getters and setters, which would ultimately add even more execution time to our calls!) In light of the above, each one of our implementations will have to extend this class and, on each call:

Initialize timeStart at the beginning of each call (timeStart = System.nanoTime()).

Reset the counter (steps = 0).

Perform the execution, making sure the counter is incremented at each step (steps++).

Initialize timeEnd (timeEnd = System.nanoTime()) at the end.

And the rest is already done for us!

NOTE: You might have noticed a very loose description when it comes to counting the number of steps in these implementations. That's because, for the purpose of the comparisons mentioned above, I am not going to count every single step taken in the execution of a function, but rather the number of cycles — the number of times a loop executes or a recursive function calls itself. Since there is already a measurement of the time spent executing a function, it isn't necessary to count precisely every single step of the execution (which would include things like variable assignments, if/then/else tests, etc.). Simply put, these implementations will have to increment the steps member by one for each iteration of the loop or each time the recursive function calls itself.

With the basics now implemented, it is time to finally move to the core of this article. The first comparison you're going to look at is the factorial function. For those of you who need their memory refreshed, here's a brief definition:

n! = 1 if n == 0
n! = 1 * 2 * ... * n if n >= 1

The above definition can be also written in a recursive manner as:

n! = 1 if n == 0
n! = n * (n-1)!

As you can see from the above, the two definitions are identical from a complexity perspective, as they both execute in O(n), which ultimately means an algorithm computing these values will have to traverse all of the values from 1 to n. To the untrained eye, this means it doesn't matter which implementation one goes for. Let's see if that is right!

For the purpose of clarity and in order to separate one comparison from another, I've decided to structure each such set of implementations in its own package. The factorial implementation has a corresponding factorial package. For the same clarity reason and in order to clearly separate the iterative and recursive implementation, each package will provide an abstract class that defines the functionality it offers (for instance, computing the factorial) and also provides the basic measurements defined in SimpleMeasurement (by extending the SimpleMeasurementImpl class) and two other classes—one which implements this functionality recursively and another one which implements it iteratively. Also, for each of these you will find a "driver" class in the default package which simply calls the recursive and iterative implementations and then displays the measurements returned by the SimpleMeasurement interface that each such implementation will have to offer. So in the case of the factorial function, you will find inside the factorial package first and foremost an abstract class that defines the "service" this package offers:
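The listings themselves did not survive here, so below is a hedged sketch of the FactorialCalculation contract and the IterativeFactorial class, following the measurement pattern described earlier; the exact method bodies are an assumption, and a compact copy of the measurement base class is included so the sketch compiles on its own:

```java
// Sketch of the factorial "contract" and its iterative implementation.
// The measurement base class is repeated in compact form; bodies are an
// assumption, not the original listing.
class SimpleMeasurementImpl
{
    protected long timeStart, timeEnd, steps;
    public long getNanos() { return timeEnd - timeStart; }
    public long getSteps() { return steps; }
}

abstract class FactorialCalculation extends SimpleMeasurementImpl
{
    public abstract long getFactorial( int n );
}

class IterativeFactorial extends FactorialCalculation
{
    public long getFactorial( int n )
    {
        timeStart = System.nanoTime();
        steps = 0;
        long result = 1;
        for( int i = 2; i <= n; i++ )
        {
            steps++;       // one step per loop cycle
            result *= i;
        }
        timeEnd = System.nanoTime();
        return result;
    }
}
```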

This maps to the iterative definition provided above for the factorial, where you loop over all the values from 1 to n and multiply them together. This does mean of course that the complexity of such an approach is O(n).

Now look at the recursive implementation of the same factorial, provided in the RecursiveFactorial class:
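The original RecursiveFactorial listing is not reproduced here; a sketch following the two-function pattern the text goes on to describe (a wrapper handling the measurements, plus the actual recursive function) might look like this, with the compact measurement base class repeated as an assumption for self-containment:

```java
// Sketch of the RecursiveFactorial class: a "wrapper" (getFactorial) that
// handles the measurements, plus the actual recursive computeFactorial.
class SimpleMeasurementImpl
{
    protected long timeStart, timeEnd, steps;
    public long getNanos() { return timeEnd - timeStart; }
    public long getSteps() { return steps; }
}

class RecursiveFactorial extends SimpleMeasurementImpl
{
    public long getFactorial( int n )
    {
        timeStart = System.nanoTime();
        steps = 0;
        try
        {
            return computeFactorial( n );
        }
        finally
        {
            // always record the end time, even on an early exit
            timeEnd = System.nanoTime();
        }
    }

    protected long computeFactorial( int n )
    {
        if( n <= 1 )
            return 1;
        steps++;  // one step per recursive call
        return n * computeFactorial( n - 1 );
    }
}
```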

The first thing you will notice is that the implementation has been broken into two functions, a "wrapper" (getFactorial) and the actual recursive function, computeFactorial. While that might seem odd at first glance, a closer look reveals that this pattern has been applied so the measurements (start/end time and number of steps) can be properly initialized. Removing the measurements from the implementation would render the following:
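A sketch of the stripped-down version, wrapped in a class here so it stands on its own:

```java
// The recursive factorial with all measurement code removed -- a sketch,
// wrapped in a class so it compiles on its own.
class RecursiveFactorial
{
    public long getFactorial( int n )
    {
        return computeFactorial( n );
    }

    protected long computeFactorial( int n )
    {
        if( n <= 1 )
            return 1;
        return n * computeFactorial( n - 1 );
    }
}
```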

You will notice that the try/finally gets removed, since the finally clause would have an empty body. Looking at the above, getFactorial becomes just a function that delegates the call to computeFactorial, so you can merge the two of them into a single call:
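Merged into a single function, the sketch becomes:

```java
// The wrapper and the recursive function merged into one call -- again a
// sketch of the shape the text describes.
class RecursiveFactorial
{
    public long getFactorial( int n )
    {
        if( n <= 1 )
            return 1;
        return n * getFactorial( n - 1 );
    }
}
```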

As stated before, this pattern was implemented purely so all of the measurements could be added into the code. You will notice the same pattern of splitting the implementation into two parts applied to all the other recursive implementations discussed in here, for the same reason.

The implementation shown above maps perfectly onto the recursive definition of the factorial provided earlier and, just as in the case of the iterative one, the complexity involved is still O(n), as ultimately you will end up cycling through all the numbers from 1 to n and multiplying them.

So at first sight the two implementations are completely identical from a computational point of view, thus leaving the choice of one over the other to the personal preference of the programmer. Is this definitely the case, though? Since you do have some basic measurements, run the two implementations side by side and look at the numbers. A side-by-side comparison has been provided in the main function of the Factorial class in the default package:
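A runnable sketch of such a driver, with compact versions of the classes described in the text included as an assumption so the example stands on its own:

```java
// Sketch of the Factorial driver class; compact copies of the classes the
// text describes are included so the example runs by itself.
class SimpleMeasurementImpl
{
    protected long timeStart, timeEnd, steps;
    public long getNanos() { return timeEnd - timeStart; }
    public long getSteps() { return steps; }
}

class IterativeFactorial extends SimpleMeasurementImpl
{
    public long getFactorial( int n )
    {
        timeStart = System.nanoTime();
        steps = 0;
        long result = 1;
        for( int i = 2; i <= n; i++ ) { steps++; result *= i; }
        timeEnd = System.nanoTime();
        return result;
    }
}

class RecursiveFactorial extends SimpleMeasurementImpl
{
    public long getFactorial( int n )
    {
        timeStart = System.nanoTime();
        steps = 0;
        try { return computeFactorial( n ); }
        finally { timeEnd = System.nanoTime(); }
    }
    private long computeFactorial( int n )
    {
        if( n <= 1 ) return 1;
        steps++;
        return n * computeFactorial( n - 1 );
    }
}

class Factorial
{
    public static void main( String[] args )
    {
        IterativeFactorial iterative = new IterativeFactorial();
        RecursiveFactorial recursive = new RecursiveFactorial();
        iterative.getFactorial( 1000 );
        recursive.getFactorial( 1000 );
        System.out.println( "Iterative: " + iterative.getSteps()
                + " steps, " + iterative.getNanos() + " ns" );
        System.out.println( "Recursive: " + recursive.getSteps()
                + " steps, " + recursive.getNanos() + " ns" );
    }
}
```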

As you can see, this is a very simple idea: the factorial of 1,000 is computed using the iterative approach and then using the recursive implementation. On my PC, this execution renders the following result:

As I signaled before, the steps member is not a precise measurement, but rather a measure of how many cycles/loops have taken place. As such, in both cases this is 999, since the case of n = 1 is not counted as a step. So, as expected, both implementations take the same number of steps.

The recursive one seems, however, to take longer!?

So that's a first signal that perhaps personal preference is not the best way to decide between these two implementations. Furthermore, let's push this even further and see how the two cope with the likes of computing the factorial of 10,000. A number 10 times bigger suggests that you should expect 10 times more steps (obviously!) and roughly 10 times longer to complete. The result probably comes as no surprise:

While the iterative function still completes (though indeed the computation overflows the long data type, so it just returns 0), on my machine the recursive one bails out around step 5,000 as it runs out of stack space. (It also takes more time to execute, as our timing measurements show, but I will get back to that.)

So, as a first rule for choosing one implementation over another: when the complexities involved are equivalent, one needs to bear in mind the stack limitations present in the real world when dealing with recursive functions. As such, an iterative approach is always the better one when the complexity involved is of order n.

Next, look at a more complex function, the Fibonacci series, which is defined by:

fib(0) = 0
fib(1) = 1
fib(n) = fib(n-1) + fib(n-2), if n >= 2

As you can figure out from this definition, the series generated by this formula is 0, 1, 1, 2, 3, 5, 8, 13, 21... and so on. Based on this definition, a recursive implementation of the Fibonacci function can be provided — in the fibonacci package. Start again by defining the "contract" in the FibonacciCalculation abstract class:
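The original listings are missing here; a sketch of the contract and the recursive implementation, with measurement code left out for brevity (the method name getFibonacci is an assumption):

```java
// Sketch of the Fibonacci "contract" and its recursive implementation;
// measurement code is omitted, class names follow the text.
abstract class FibonacciCalculation
{
    public abstract long getFibonacci( int n );
}

class RecursiveFibonacci extends FibonacciCalculation
{
    public long getFibonacci( int n )
    {
        if( n <= 1 )
            return n;  // fib(0) = 0, fib(1) = 1
        // "tree recursion": two recursive calls at each step
        return getFibonacci( n - 1 ) + getFibonacci( n - 2 );
    }
}
```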

Now let's see if an iterative implementation of this function can be provided. Note that the Fibonacci function is defined only recursively; however, that doesn't mean an iterative implementation is not possible! The definition, if you recall, was:

fib(0) = 0
fib(1) = 1
fib(n) = fib(n-1) + fib(n-2), if n >= 2

To provide an iterative implementation, you iterate from 0 to n-1, bearing in mind at each step i that you have already computed the values of fib(i-1) and fib(i-2), and you can use these to compute fib(i). Moving on to the next step, you reuse the values of fib(i-1) and fib(i). An iterative implementation will therefore look like the one in the IterativeFibonacci class (again, the code required for measurements has been taken out):
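A sketch of what that class might look like, reusing fib(i-1) and fib(i-2) at each step as described:

```java
// Sketch of the IterativeFibonacci class described in the text; at each step
// the previously computed fib(i-1) and fib(i-2) are reused.
class IterativeFibonacci
{
    public long getFibonacci( int n )
    {
        if( n <= 1 )
            return n;
        long previous = 0;  // fib(i-2)
        long current = 1;   // fib(i-1)
        for( int i = 2; i <= n; i++ )
        {
            long next = previous + current;  // fib(i)
            previous = current;
            current = next;
        }
        return current;
    }
}
```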

If you compare the complexities of the two implementations, a quick glance reveals that the recursive one makes two calls to itself at each step. This is a perfect example of "tree recursion," which has exponential complexity — while the iterative one has linear complexity. Running these two in parallel (see the Fibonacci class in the default package) provides the following result for computing fib(20):

Even without the time measurements, anyone can see straight away that the iterative solution should be the preferred one due to the number of iterations/steps required. This is true even for really low values of n. While a stack overflow can occur here too for large values, the purpose of this example is not to re-iterate the stack overflow problem, but to underline that one should try to map a recursive definition onto an iterative one and compare their complexities before deciding on the implementation, even if the given problem only "suffers" a recursive definition. This is even more important for cases like the Fibonacci series, which follow a tree-recursive definition and therefore have exponential complexity.

Just because your requirement specification outlines a recursion, it doesn't mean you should proceed straight to implementing it recursively. It is always worth analyzing whether an alternative (iterative?) implementation could be a better choice.

For the next exercise, assume that you have enough memory and stack space that you will never cause a stack overflow, regardless of the inputs. Let's take a look at whether, for equivalent recursive and iterative implementations (i.e., implementations that have the same complexity), there are any differences between recursion and iteration. For the purpose of this exercise, the package "stack" has been created, together with the abstract class that defines the "contracts," StackConsumingCalculation. The class itself offers three different functions; however, I will be looking at them one at a time.
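Only the first of the three functions is sketched below; the parameter types are an assumption, mirroring the fields of the Capsule class shown further down, and the body follows the behavior the text describes:

```java
// Sketch of the StackConsumingCalculation contract (first function only)
// and a recursive implementation; parameter types mirror the Capsule fields.
abstract class StackConsumingCalculation
{
    public abstract long stackConsumingCalculation( String a, int b, long c, Object d );
}

class RecursiveStackConsuming extends StackConsumingCalculation
{
    public long stackConsumingCalculation( String a, int b, long c, Object d )
    {
        if( b == 0 )
            return 0;
        // all four parameters are pushed onto the stack again at each call
        return stackConsumingCalculation( a, b - 1, c, d );
    }
}
```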

For simplicity, all this function does is return 0 if b == 0; otherwise, it returns stackConsumingCalculation(a, b-1, c, d). It's easy to see the function will always return 0, but only after it has iterated through all the values b, b-1, b-2, ..., 0. The iterative implementation of this function (provided in the IterativeStackConsuming class) looks like this:
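A sketch of that iterative version, which performs the same countdown in a loop rather than through recursive calls:

```java
// Sketch of the IterativeStackConsuming class: the same countdown from b to
// zero, done in a simple loop instead of recursive calls.
class IterativeStackConsuming
{
    public long stackConsumingCalculation( String a, int b, long c, Object d )
    {
        while( b > 0 )
        {
            b--;
        }
        return 0;
    }
}
```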

This goes to show that even with the same number of steps taken (i.e., equivalent complexities), the recursive one takes longer; quite noticeably longer, in fact (about four times!). Since there are no complex operations (computations) involved, the only cause left is "stack preparation": the time it takes the JVM internally to put the parameters on the stack at each step, together with pointers into the code marking where execution should continue once the call is finished. Since you're passing four parameters, at each step four values need to be pushed onto the stack, so one idea is to encapsulate these four parameters into a single object, reducing the number of such operations at each step, and see if this makes a difference. In order to test this, the Capsule class has been introduced, which simply wraps these four values:

public class Capsule
{
    public String a;
    public int b;
    public long c;
    public Object d;
}

A method taking an instance of Capsule as a parameter has been provided in the abstract class StackConsumingCalculation:

public abstract long stackConsumingCapsule( Capsule c );

Now, running the test again using the Capsule class gives the following results:

Surprise! The execution times have gotten much closer, which goes to prove that the number of parameters involved is a significant factor when dealing with recursion in real life. This conclusion leads to another idea: what if you use only one parameter, of a primitive data type? To test this, the stackConsumingPrimitive function has been added to the abstract class StackConsumingCalculation:
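A sketch of what that variant might look like; the signature follows the naming in the text, while the body (a single primitive counted down recursively) is an assumption based on the earlier function:

```java
// Hedged sketch of the stackConsumingPrimitive variant: a single primitive
// parameter counted down recursively, as described in the text.
class RecursiveStackConsuming
{
    public long stackConsumingPrimitive( int b )
    {
        if( b == 0 )
            return 0;
        return stackConsumingPrimitive( b - 1 );
    }
}
```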

Ouch! What happened there? We seemed to be on the right track for getting the execution times closer! The explanation lies, not surprisingly, in what happens when the following code gets executed:

Stack s = new Stack();
int i = 10;
s.push( i );

I'm sure you've figured it out by now: autoboxing! The primitive i (an int) gets converted into an Integer instance, which then gets pushed onto the Stack instance s. There is an extra step of creating an instance of the Integer class before pushing the data onto the stack. Similarly (though not identically), inside the JVM the primitive type gets wrapped in order to be pushed onto the call stack. Despite using a "simple" data type, more time is actually being spent internally preparing the call than in the case of using an object!

So when dealing with recursion, another coordinate to take into account is the complexity of the data passed between calls, and whether this can be simplified by using objects or by aggregating the parameters into a single object!

As a last comparative study in this article, you'll look at the greatest common divisor problem, which can be defined using the Euclidean definition:

gcd(x,y) = x if y == 0
gcd(x,y) = gcd(y, x % y), if x >= y > 0
(obviously, for the other case, x < y, one would just have to swap x and y in this definition)

Following the same pattern as before, the package "gcd" was defined, and the "contract" for this computation is presented in the GCDComputation class:
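The listings themselves are missing here; a sketch of the contract and both implementations, consistent with the Euclidean definition above and with the temp/swap step the text mentions (measurement code omitted, method name getGCD assumed):

```java
// Sketch of the GCDComputation contract with recursive and iterative
// implementations; measurement code is omitted, names follow the text.
abstract class GCDComputation
{
    public abstract long getGCD( long max, long min );
}

class RecursiveGCD extends GCDComputation
{
    public long getGCD( long max, long min )
    {
        if( min == 0 )
            return max;
        // tail call: nothing left to do after the recursion returns
        return getGCD( min, max % min );
    }
}

class IterativeGCD extends GCDComputation
{
    public long getGCD( long max, long min )
    {
        while( min > 0 )
        {
            max = max % min;
            if( max < min )
            {
                // extra step: swap so that max stays the larger value
                long temp = max;
                max = min;
                min = temp;
            }
        }
        return max;
    }
}
```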

For those who can't spot it straight away, the complexities of the two implementations are equivalent, though it has to be noted that the iterative one requires an extra variable (temp) and an extra step to swap the min and max values when min becomes greater than max.

Running these two implementations in parallel yields, in most cases, slightly longer times for the recursive one; however, after the fourth execution I did manage to get the following:

While this is partly a result of randomness (one has to take into account memory usage, CPU load, etc., at the OS level), such a result, where the recursive implementation's execution time is shorter than the iterative one's, cannot be achieved with any of the previous functions (trust me, I've tried!). So this result proves that the recursive implementation is not just nearly equivalent to the iterative one; it might in fact occasionally be the better one! And that is because the recursive GCD implementation provided here is tail-recursive. In other words, it is an implementation that doesn't build up any deferred operations (operations which need to be performed after returning from the recursive call).

If you look at the factorial recursive implementation, having just returned from the call to computeFactorial(n-1), the code still has to multiply that result by n:
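The relevant fragment, as a sketch (wrapped in a class so it stands alone):

```java
// Sketch: the multiplication by n is a deferred operation -- it can only
// happen after computeFactorial(n - 1) has returned.
class RecursiveFactorial
{
    protected long computeFactorial( int n )
    {
        if( n <= 1 )
            return 1;
        return n * computeFactorial( n - 1 );
    }
}
```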

The multiplication in this case is the deferred operation; in the case of the recursive GCD, however, there are no deferred computations/operations left after returning from the recursion. Because of that, modern compilers can optimize tail-recursive functions, since they don't need to save much of the information regarding the execution point (in the caller function) when returning from the recursion. All the time previously spent putting things on the stack and retrieving them at each step can thus be eliminated, making the recursive implementation very close to (and sometimes better than) the iterative one.

Conclusion

While the choice of recursion over iteration, or vice versa, will always invite a lot of discussion, there are a few basic checks one can perform before going down the wrong route, as shown above. And last but not least, one should consider whether a recursive implementation might be easier to understand and maintain than an iterative one, which could be an argument strong enough to overrule all of the above!

About the Author

Liviu Tudor is a Java consultant living in the UK with a lot of experience with high-availability systems, mostly in the online media sector. Throughout his years of working with Java, he came to realize that when performance matters, it is "low-level," core Java that makes an application deliver rather than a bloated middleware framework. When not nursing the injuries acquired playing rugby with his team, The Phantoms (http://www.phantomsrfc.com) in London, he writes articles on Java technology for Developer.com.