Does Kruskal produce all MSTs?

The area of Theoretical Computer Science has so many beautiful questions. Among these are a few questions, that can be called as “cute” questions. The problem raised here is one such question. 🙂

Here’s the question: Given a weighted connected graph, G, does Kruskal’s algorithm produce all possible Minimum Spanning Trees of G?

A couple of lines on Kruskal’s algorithm: In order to find the MST, you take the edges and order them in non-decreasing order. Then traverse this sorted sequence of edges, picking edges one by one, keeping in mind that the picked edge should not create a cycle with any of the previously picked edges.

There’s also another interesting fact about MSTs: If a graph has unique edge weights, i.e. no two edge weights are same, then that graph also has a unique MST. Also, if a graph has repetitions of certain edge weights, then there may exist more than one MST for the graph.

Now, back to the original question.

Let us take an example graph to understand the arguments that follow.

Suppose our connected graph, G has the following ordered sequence of edge weights:

Now, we apply Kruskal’s algorithm on G to get some MST. Let’s call this as Tk.

Also, let T be any random MST of G. We shall compare T with Tk to see whether T can be produced by Kruskal’s algorithm.

Simple till now, right? 🙂

Now, T and Tk will generally have some common edges. (Even if they don’t the ensuing arguments will hold just as well.)

Let us consider the edges one by one (in sorted order). Let’s say that edge, e1 is part of both T and Tk. Similarly, edges, e2 and e3 are both of both T and Tk. Also, let us assume that edge, e4 is not part of either T or Tk. So far so good; all edges thus far between T and Tk are common.

Case 1: Now, let us assume that Tk contains edge, e5, but not e6; while T contains edge, e6, but not e5.

We now have to prove that Kruskal’s algorithm can produce an MST which contains e6 but not e5.

(We know that Kruskal’s algorithm selects an edge to be a part of the MST in sorted order. When two or more edges have equal weight, it randomly picks one of them and checks whether that edge can be a part of the MST. Only then does it go to the next edge of equal weight to test whether that is compatible with the tree constructed thus far.)

So, in our example, after Kruskal’s algorithm successfully processes edge, e4, it has to make a choice between e5 and e6. Let us suppose that it picks edge, e6. We know that e6 would be compatible with the previously made selections of e1 through e4 (this is because, T contains the same selection as Kruskal for e1 through e4, and also contains e6. Hence, e6 is compatible with the earlier selections i.e. it does not form a cycle with those edges).

So, one problem is sorted out. There does exist a Kruskal-produced MST, call it Tk1, having edge, e6. But, we need to ensure that this MST does not contain e5. This can be easily proved.

Here’s the proof: We know that { e1, e2, e3 } is compatible with both e5 as well as e6. (e4 is part of neither T nor Tk, in our example). Now, Kruskal’s algorithm always tests an edge along the ordered sequence in order to accept or reject it. Since, e6 was not part of Tk, it means that { e1, e2, e3, e5, e6} form a cycle. Coming back to Tk1: e6 can’t be part of Tk1 since it forms a cycle with the previously picked edges {e1, e2, e3, e5}.

So, we have proved that if a Kruskal-produced MST and a random MST differ in an edge of the same weight, then we can apply Kruskal’s algorithm to rectify this difference.

Case 2: Let’s get back to the sample edge sequence (reproduced here for our convenience):

Let’s assume that T and Tk agree on edge selections until edge, e7. Also, let us assume that Tk does not contain e8, while T contains e8.

We can easily prove that this is not possible. The two facts that T agrees with Tk on edges e1 through e7, and that e8 is part of T, imply that e8 is compatible with Kruskal’s selection edges from e1 through e7. By definition, Kruskal’s algorithm must accept e8. Hence, the fact that e8 is not part of Tk is not possible. Hence, case 2 is an impossibility.

Assume that T and Tk agree on edge selections from e1 through e9. Also, assume that T contains two edges of weight, 8, say e10 and e11. On the other hand, assume that Tk contains only one edge of weight, 8, say e12.

To rectify the fact that Tk contains e12 but not e10, we apply the same arguments as in Case 1. Hence, we get that there can be a Kruskal-produced MST, call it TK1, that contains e10 but not e12.

Now, can e11 be made to be a part of a Kruskal-produced MST that contains e10 but not e12? The arguments here are similar in nature to those in Case 2. We can prove that such a case can not exist.

Using the same meaning for T and Tk, we shall assume that T and Tk agree on edge selections from e1 through e8. Also, assume that Tk contains e9, while T does not contain e9.

Add edge, e9 to T. We get a cycle. It can be observed that this cycle will contain an edge with index greater than 9. i.e. one edge from among the set {e10, e11, …, em}. (This is because, if e9 formed a cycle composed exclusively of edges selected by Kruskal from e1 through e8, it would not have been a part of Tk in the first place. Hence, the presence of a higher-indexed edge is essential in the cycle).

Now, we can remove the higher-indexed, and hence, higher-weighted edge, say f, from the cycle and replace it in the MST, T by e9. This new tree, T – {f} + {e9}, is lower in weight than T. This is not possible. Hence, Case 5 is also an impossibility.

Assume T and Tk agree on edge selections from e1 through e9. There can be a possibility that Tk contains 2 or more edges of weight 8, while T contains just one of them. We can show by arguments similar to those in Case 4 that such a possibility does not exist.

The cases that were considered were chosen so as to represent all possible types of interesting situations that might arise. There may be other cases possible, but hopefully it would reduce to one of these.

Explanation of this problem through an example makes, I believe, it easier for the reader and certainly, the writer. 🙂 It is indeed a very long explanation, and probably one may want to think of the answer themselves.

So, Kruskal’s Algorithm does produce all possible MSTs of a graph G.

Some Practical Implications of this Result:

Whenever one is asked to prove that a given spanning tree is not an MST, we just need to show that the given tree can not be produced by Kruskal’s algorithm.

We had seen an interesting fact at the beginning of this post, that uniqueness of edge weights implies uniqueness of MST. (Note that we did not use this fact anywhere in showing that Kruskal’s algorithm produces all MSTs.) This is easily proven since Kruskal’s algorithm produces only one MST when run on a graph with unique edge weights, implying that only one MST exists for such a graph.

You conclude by saying “Whenever one is asked to prove that a given spanning tree is not an MST, we just need to show that the given tree can not be produced by Kruskal’s algorithm.”

But from my experience, I have found that it is easier to find out a MST using Kruskal’s Algorithm, rather than trying to show that a particular spanning tree cannot be produced using the algorithm. (The procedure involves assuming that a spanning tree is an MST, and thereby giving a result of contradiction). Can you suggest any simpler way?

You asked a very nice question. Here’s what I meant when I wrote that sentence.

First of all, the correctness of the sentence is clear from the fact that Kruskal’s algorithm does produce all MSTs.

Now regarding whether this is a simple method or not:

There may be situations where we can use this to prove that a given spanning tree is not an MST. Consider the same edge sequence as in the post:
1 (e1), 2 (e2), 3 (e3), 4 (e4), 5 (e5), 5 (e6), 5 (e7), 6 (e8), 7 (e9), 8 (e10), 8 (e11), 8 (e12), 8 (e13), 9 (e14), 10 (e15), …. , (em).

Suppose when we apply Kruskal’s algorithm, on processing edge, e4, we find that it is compatible with whatever selection of edges we make from e1 through e3. Hence, we find that e4 MUST necessarily be a part of ALL Kruskal-produced MSTs, and hence, MUST necessarily be a part of ALL MSTs of G.

So, if we are now given a spanning tree, which does not contain e4, we can directly say that it is not an MST.

Also as you said, another procedure for proving that a spanning tree is not minimum, is to assume the opposite and arrive at a contradiction. This Kruskal method for proving the same might probably be simpler in some cases.