Depth First Search Algorithm

Hello people…! In this post I will talk about the other Graph Search Algorithm, the Depth First Search Algorithm. Depth First Search is different by nature from Breadth First Search. As the name suggests, “Depth”, we pick up a vertex S and see all the other vertices that can possibly reached by that vertex S and we do that to all vertices in V. Depth First Search can be used to search over all the vertices, even for a disconnected graph. Breadth First Search can also do this, but DFS is more often used to do that. Depth First Search is used to solve puzzles! You can solve a given maze or even create your own maze by DFS. DFS is also used in Topological Sorting, which is the sorting of things according to a hierarchy. It is also used to tell if a cycle exists in a given graph. There are many other applications of DFS and you can do a whole lot of cool things with it. So, lets get started…!

The way the Depth First Search goes is really like solving a maze. When you see a maze in a newspaper or a magazine or anywhere else, the way you solve it is you take a path and go through it. If you find any junction or a crossroad, where you have a choice of paths to choose, you mark that junction, take up a path and traverse the whole route in your brain. If you see a dead end, you realize that this leads you now where so you come back to the junction you marked before and take up another path of the junction. See if that is also a dead end, else you continue to explore the, puzzle as this route contributes in reaching your destination. Well, at least that’s how I solve a maze. But… I hope you get the idea. You pick up a path and you explore it as much as possible. When you can’t travel any further and you haven’t yet reached your destination, you come back to the starting point and pick another path.

In code, you do this by recursion. Because of the very nature of recursion. Your recursion stack grows-grows and eventually becomes an empty stack. If you think for a while you can notice that the way of traversing which I told you above is logically covering only the vertices accessible from a given vertex S. Such traversal can be implemented using a single function that is recursive. But we want to explore the whole graph. So, we will use another function to do this for us. So, DFS is a two-functions thing. We will come back to the coding part later. Now, DFS too can be listed out in a step-by-step process –

For a given set of vertices V = {V1, V2,V3…}, start with V1, and explore all the vertices that can possibly be reached from V1.

Then go to the next vertex V2, if it hasn’t been visited, explore all the nodes reachable from V2, if not, go for V3.

Similarly, go on picking up all the vertices one-by-one and explore as much as possible if it wasn’t visited at all.

Now, how do you tell if a vertex wasn’t visited earlier…? If it has no parent vertex assigned. So what you have to do when you visit a node is –

Set the parent vertex of the current vertex to be the vertex from where you reached that vertex. We will use an array to assign parent vertices, which is initialised to -1, telling that the vertices were never visited.

When you are starting your exploration from a vertex, say, V1, we assign parent[V1] = 0, because it has no parent. If there is an edge from V1 to V2, we say, parent[V2] = V1.

Let’s look at an example and see how DFS really works –

Depth First Search Algorithm Step-by-Step

The working of DFS is pretty clear from the picture. Notice how we would assign the parent vertices to each vertex. Once we have visited all the vertices from a given initial vertex V1, we backtrack to V1. What do we really mean by this “backtrack” here is that the recursion control will gradually come back to the function that started explopring from V1. We will understand this once we put DFS in code. And one more thing, whenever we got a choice of going to two vertices from one vertex, we preferred going to the vertex with the greater number. Why is this…? This is because we will be following Head Insertion in our Adjacency Lists to have O(1) Insertion operation. Now, assuming that we insert the vertices from Vertex 1 to Vertex 10, the greater number vertices will end up being in front of the Linked Lists. Take a moment to understand this. Even if you don’t understand, it’s ok…! You will get the hang of it later. Why I really did that was to explain you the concept of Edge Classification.

Edge Classification in Graphs

As you can see there are three types of edges, in fact, there are 4 actually. They are –

Tree Edge – These are the edges through which we have traversed all the vertices of the graph by DFS. More clearly, these are the edges that represent the parent-child relationship. That is, the tree edge Vertex 1 → Vertex 3 says that, Vertex 1 is the parent of Vertex 3. Just like the parent-child relationship in a tree. Why this is called a “tree” edge is that it happens so that these edges together form a “tree”, or rather a “forest”.

Forward Edge – This is an edge which points from one vertex which is higher in the hierarchy of parent-child relationship to a vertex which is a descendant. Observe that Vertex 2 is a descendant of Vertex 1, so the edge Vertex 1 → Vertex 3, is a forward edge.

Backward Edge – This is the opposite of forward edge. It points from a descendant Vertex to an ancestor Vertex. In the above diagram, the edge, Vertex 4 → Vertex 3 is a backward edge.

Cross Edge – Every other edge is a cross edge. We don’t have a cross edge in the above diagram, but, a cross edge can arise when there is a edge between siblings, two vertices that have the same parent.

Cycle Detection – Using DFS, we can detect if there are any cycles in the given graph. How…? This is very simple. If there exists at least one backward edge, then the given graph will have cycles. Well, telling how many cycles would be there for given number of backward edges is a challenge. But the point is if there is a backward edge, there is a cycle. This is by the very definition of the Backward Edge. It is an edge from a descendant to an ancestor. If we have a backward edge, then there will surely be another path of tree edges from the ancestor to descendant, forming a cycle. The picture below should make things clear.

Cycle Detection by Edge Classification

But, how do we compute backward edges in code…? This is a little tricky. We could have a boolean array of size |V| which would hold the status of the vertex, whether it is in the recursion stack or not. If the vertex is in the recursion stack, it means that the vertex is indeed an ancestor. So, the edge will be a backward edge.

Now try to code the DFS, it is basically a recursion process. If you are good with recursion, I’m sure you can get this. Nonetheless, I have put my code below –

CJava

Just remember that, the parent array is used to indicate whether the vertex is visited or not, and it also indicates the path through which we came. On the other hand, the status array indicates, if a vertex is currently in the recursion stack. This is used to detect backward edges, as discussed. If a vertex has an edge which points to a vertex in the recursion stack, then it is a backward edge.

This is the Depth First Search Algorithm. It has a time complexity of O(|V| + |E|), just like the Breadth First Search. This is because, we visit every vertex once, or you could say, twice, and we cover all the edges that AdjacencyList[Vi] has, for all Vi ∈ V which takes O(|E|) time, which is actually the for loop in our depth_first_search_explore() function. DFS is not very difficult, you just need to have experienced hands in recursion. You might end up getting stuck with some bug. But it is worth spending time with the bugs because they make you think in the perfect direction. So if you are not getting the code, you just have to try harder. But if you are having any doubts, feel free to comment them…! Keep practising… Happy Coding…! 🙂

Status array is like an array of boolean flags which indicate if the vertex is being processed. We say a vertex is processed when we see it twice.. Once when it enters the recursion stack and once when it leaves… So for the time it is in the recursion stack, the value will be 1, when it is processed, we must set it’s value to 0… Which is what that line does… All this mechanism is to find backward edges. We say an edge is a backward edge if the edge points to a vertex currently in the recursion stack… Think you are traversing a graph path and you got 1 → 2 → 3 → 4… and then you suddenly see an edge from 4 → 1.. So its a backward edge… Think about this scenario and think of what values the status[] array will have at this point… I’m sure you’ll get it! 😉

The status array tells us if a vertex is in the recursion stack or not. If it is in the recursion stack, then the corresponding vertex will have 1 in the status array. If an vertex has an edge to a vertex already in the recursion stack, then it a backward edge. You can print all the vertices without repetition if you add this line –

Hi Vamsi,
The blog is wonderful. I tried to print cross edges and forward edges along with the back edges in the following code.http://ideone.com/18xfGW
Please have a look and tell me where it can fail.

The logic I used is:
If we visit a node already visited and we have not yet departed from that node, than it makes a back edge.
Else, there are two possibilities – cross edge or forward edge. In both these cases we visit a node from which we have alredy departed. So here, we can distinguish between them using the arrival time. If it is a forward edge (going from ancestor to descendent, u -> v), arrival time of u will be less v and if it is cross edge, arrival time of u will be greater than v.

Hi Vikas..! 🙂 … I’m sorry for the late reply..! I got caught up in exams..! 😛 …. Coming to your code… Your logic for the back edge is perfect… Your logic indirectly implies my logic for checking the backward edge… Your logic for the forward edge also seems flawless… However, your logic for the cross edge tickles my brain… As far as I know, if two vertices do not have an ancestor-descendant relationship (one is not the ancestor of the other), then an edge between them is a cross edge… I’m unable to reduce your logic to this… I thought about your cross edge technique for an hour… It is giving the expected output… However, I have a small hunch, that given good time, one MAY be able to find a counter example to your logic… So, though your code seems correct, I suggest you manually check the parents of the nodes for printing the cross edges… Feel free if you have anything else to say…! 🙂 … Have a nice day..! 😉