Recursion is a topic that scares a lot of people in the beggining, but it is not as difficult as one might think. All recursion is is a function or method that calls itself. Here we'll take a look at what recursion is, how it is used, and what some of the common pitfalls with recursion are.

Recursion is a tool that beautifully handles a certain set of problems. You may not use it that much as a programmer, but its always good to know how you can use it if the need should arise. Also, it is a common question in job interviews and a subject that every programmer must learn in school.

3: a computer programming technique involving the use of a procedure, subroutine, function, or algorithm that calls itself one or more times until a specified condition is met at which time the rest of each repetition is processed from the last one called to the first — compare iteration

The exception that is thrown when the execution stack overflows because it contains too many nested method calls. This class cannot be inherited.

Stack Overflow Exception - MSDN listing of the stack overflow exception class and explains this is generally a recursion problem.

Note:All examples were created using Visual Studio 2010, targetting the .NET Framework 4.0. We'll do our best to point out anything that might not work in older versions.

Using Recursion

Recursion is nothing more than a function or method that calls itself. There is really not anything complicated to it. Probably the most complicated part about it is "when to use it".

In school they usually teach recursion by having you write a program to calculate Factorials because this is a fairly simple example of what recursion is good for. In fact, it may be a little too simple in that it does not demonstrate that you can have multiple parameters being passed when you do recursion and also you kind of miss the point that recursion is very good at "climbing trees", which is something we will look at later.

Below I have some example code demonstrating how to calculate Factorials using recursion. You should look at the definition of Factorials first if you are not familiar with Factorials, so that you get an idea of what it is here that the program is supposed to do. But basically, a factorial is just a number multiplied by all of the integers lower than itself. So, 5 factorial is 5*4*3*2*1 = 120.

The code starts in Main, of course - as this is a console app, and then asks the user for a number. Factorials by definition cannot be calculated on 0 or negative numbers. So, we will exit the program if zero is selected and give an error message if a negative number is selected. I've also excluded any number higher than 20, because - as you can see - the numbers become astronmical very quickly and even the long data type cannot hold the results too much higher than that.

The method named Factorial is our recursion method.

The Factorial method ends the recursion if the number submitted is 1 and returns 1 as a result. This is what is called a base case it is the case that stops the recursion and therefore is the end, or base, of the recursion. Not having a base case that ends the recursion is the primary cause of the dreaded "stack overflow exception".

If the number submitted is not 1, then it does recursion and calls itself. First, it gets the next number to be submitted and calls itself with that number. It is important to notice that it will multiply the result with the current number. But the important thing to notice there is that this multiplication takes place after the result comes back.

There are two directions of travel here and that is one thing that often confuses people. Recursion travels from the "root" to the "tip" and then back again from the "tip" to the "root". Whether something is calculated on the way out or on the way back in is determined by whether the operation happens before or after the recursive method is called. In this example, the multiplication actually happens on the way back even though its written as part of the same line, because it happens after the recursive call. And this is what we need to happen, because we cannot do the multiplication until we have the answer from the method call.

Eventually, the next number will keep getting smaller and smaller until it reaches 1 and then the function will not call itself, but it will return the number 1 without calling itself. At that point, it travels back out towards the root executing any code after the line in the method where the function called itself. A lot of people do not realize that the recursion travels in both directions as it goes out to the base case and then back through every call that was made until it comes back and returns from the original call to the method.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace Factorial
{
class Program
{
//We define a method that returns type long integer, and it takes in a long integer which is the number we will enter for which we want to calculate the factorial.
public static long Factorial(long Number)
{
long ReturnValue;
long NextNumber;
NextNumber = Number - 1;
if (Number == 1)
{
//This is the base case that stops the recursion.
ReturnValue = 1;
}
else
{
//This is what makes it recursion because it calls itself.
//Any code before the recursion call will happen as the recursion is traveling from the root to the branch tip.
ReturnValue = Factorial(NextNumber) * Number;
//Any code after the recursion will happen as the recursion is traveling back from the branch tip to the root.
}
return ReturnValue;
}
static void Main(string[] args)
{
bool ExitLoop = false;
long UsersNumber = 0;
do
{
Console.Write("Enter a number to calculate it's factorial (or 0 to exit):");
UsersNumber = Convert.ToInt16( Console.ReadLine());
if (UsersNumber > 0)
{
if (UsersNumber > 20)
{
Console.WriteLine("Numbers entered must be between 1 and 20 due to how big");
Console.WriteLine("these numbers get for higher values.");
}
else
{
//The recursion happens when the Factorial method is called.
Console.WriteLine("The Factorial of "+UsersNumber+" is " + Factorial(UsersNumber));
}
}
else
{
if (UsersNumber == 0) ExitLoop = true;
Console.WriteLine("The number must not be negative.");
}
}
while (!ExitLoop);
}
}
}

Now the factorial example is commonly used to teach recursion because it is a very simple example of recursion. However, it in many ways is too simple.

One of the things that recursion is very good at is "climbing trees". And what better way to demonstrate this than with a family tree.

In this example, we have a family tree of objects. There is a Person class that has a field for the person's name. It also has a field for the person's favorite color. And a Person has a List<Person>, or a list of Persons, who are the children of that Person. Really, this family tree is more complicated than the recursion used to examine the family tree.

Now the family must have a person who is the head of the family. So, we have declared a Person instance called the Patriarch to hold the head of the family. Everyone else, is defined as a child of this person (or a child's child's child, etc.).

Really, all the code in Main is setting up the family tree, or keeping the program from exiting at the end.

The actual recursion happens in the RecursiveColorsOfTheFamilyTree method. With this method we want to know the favorite colors of everyone in the family tree. We have no idea how big the tree is. We have no idea what the shape of all the branches is. All we know is the Patriarch. And the Patriarch object (Person class) gives public access to it's list of children to be queried. We use that list to find out who that Person's children are, and who their children are.

So, this is our recursive method. It writes this person's favorite color using the ChosenColor method of the Person class. This simply writes Name + " prefers " + FavoriteColor+"." to the screen.

Next, it iterates through every child this Person has, writes Person's child, and then calls itself with the Child object to have that Child object act as a Person and do the same. So, it is traveling down that branch of the family tree.

Now the "base case" may not be as obvious here. There is nothing that says "Stop the recursion". However, if the current Person has no Children in its list then it will not iterate through the list and that branch of the recursion will end.

Likewise, each child's child's child's child will be explored. So, once all of a Person's children have been called, that Person's branch is complete and the method will return back to the parent and finish analyzing any children that the child's parent has. I think this is more clear if you run the actual program and see the results.

You can see the order that they are being called in. You can see that when it gets to Steve Jr. that it starts going through all of Steve Jr.'s children and then goes back to Steve's (Steve Jr.'s parent) children when it is finished exploring Steve Jr.'s children. You can add more children to the family tree to experiment with it.

So, hopefully, you can see how useful recursion is for exploring the branches of a data structure such as this. Experiement by adding more children and watching the result.

And finally, I want to discuss the stack overflow problem that is commonly associated with recursion. This is "extra credit" as you really don't need to know much about the stack overflow exception except that it is generally caused by not having anything in your recursion method that ends the recursion and so it goes on forever into infinity which crashes the stack. But I'll explore it in more detail for those interested since it is tied so closely to recursion.

First of all, you have to understand what the stack is. I'll give you an analogy here that might help you envision it without going into the specific technical details of a stack. You can make a stack yourself or have stacks other than the one we are talking about here. But this is a specific stack that is used by your program behind the scenes.

So, imagine the stack as being a box. Maybe a nice wooden box. Since its your imagination I'll let you use any sort of box you please.

So, your program has this box. And imagine that it keeps track of everything it is working on in the current method on a piece of paper that we will call the stack frame. All of the values of the current fields it is working on are stored here in this stack frame.

When the program calls another method, it "pushes" the stack frame (our imaginary piece of paper) onto the top of the stack (our box). That way, when it finishes the new method and comes back to the current method, it can "pop" the stack frame off the top of the stack of "papers" and go back to working on what it was working on before the call to the new method and have the exact values it was using before. (Storing values by reference may or may not have those values stored in the stack frame).

So, when you call methods 5, 6, or 23 layers deep, your program has a stack frame on the stack that represents each one of those methods. And the one on top is the method that it is currently executing.

Now eventually the stack, I mean this box, will fill up if you go deep enough into enough method calls. It's like the box filling up with stack frame papers and then it starts over flowing and the papers start falling out everywhere. That's basically what a "stack overflow" is. The stack runs out of space and then when it tries to "push" a new stack frame on top it falls off. The stack is broken.

So, you have to either get a bigger box (allocate more space to the stack) or make sure that the calls to methods never go that many layers deep.

But usually what happens is that you do recursion without a base case to tell the recursion that it is done. And so it becomes "infinate" recursion and then you are guaranteed to overflow the stack and crash your program. So, you never want to have infinate recursion because it will always result in a stack overflow exception.

In Conclusion

Recursion is not all that complicated. All it is is a function or method that calls itself, usually many times. It is very good at "climbing" data trees and working its way through an unlimited number of branches without knowing the shape of those branches before it begins. Make sure you have a base case that tells the recursive function when enough is enough. If you want something to happen as the recursion is moving from root to branch tip put the code in the recursive function before it calls itself. Any code after the call to itself, will execute as it climbs back from the branch tip back towards the root. Play around with the family tree code, add some new children at any and every level, and see what you can learn about the order it is doing things in.

I would disagree. The is a C# tutorial. Defining language features that don't exist in the language, and can cause potential confusion, seems detrimental. You might as well define pure functions and memoization at that point...