I have recently been doing some research into algorithms for finding minimum spanning trees in graphs, and I am interested in the following problem:

Let G be an undirected graph on n vertices with m edges, such that each edge has a weight w(e) ∈ {1, 2, ..., R}, where R is a constant natural number. Is there an algorithm which finds a minimum spanning tree of G in time O(n+m)?

Obviously, you could just run Prim's or Kruskal's algorithm on the graph and you would get a minimum spanning tree, but not in linear time.

IDEA 1:

I was thinking that we could start by adding every edge of weight 1 to the tree, provided it creates no cycle, since an edge of weight 1 that creates no cycle is preferable to, say, an edge of weight 2. We would then do the same for weight 2, weight 3, and so on in increasing order. However, I am not sure whether this is actually correct, and I didn't get anywhere with proving its correctness.

IDEA 2:

Using the Union-Find (disjoint-set) data structure and adapting Kruskal's algorithm:

T = {};

For all v in V {
    MAKE_SET(v);
}

Distribute the edges into R buckets (bucket i holds the edges of weight i);

For every edge (u,v) in E (bucket 1 first, then bucket 2, ..., i.e. in non-decreasing order of weight) {
    if FIND(u) != FIND(v) then {
        T = T U {(u,v)};
        UNION(u,v);
    }
}

Return T

But again, I am unsure about the running time analysis, the correctness proof and the implementation. I believe this is correct because it is just Kruskal's algorithm with the Union-Find data structure, except that the sorting step takes linear O(m) time thanks to the buckets. Does that make it O(m+n) overall?

Any help on possible ways to design such an algorithm or to improve my ideas would be appreciated, and any implementations, pseudocode, proofs or analysis ideas (Java preferable, but any language welcome) would be super helpful.

1 Answer

Your second idea seems correct to me, though some details are missing. However, its running time will not be linear: the best possible running time for $m$ Find and $n$ Union operations is $\mathcal{O}(n + m \cdot \alpha(n))$, where $\alpha$ is the inverse Ackermann function. This is not linear in theory, but in practice $\alpha(n)$ will be at most 5 for all realistic input values. Note that achieving this bound requires a careful Union-Find implementation (union by rank together with path compression).

A truly linear-time algorithm is described in this paper by Fredman and Willard from 1994. They grow a tree until the number of neighbors of this tree grows too large and then start a new tree. Their algorithm is probably even more difficult to implement than your second idea.

If you are aiming for a practical algorithm, I would go with your second idea and a practically efficient implementation of Union Find.
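For illustration, here is a minimal sketch of that second idea in Java. It is only one possible way to write it, under a few assumptions of mine that are not in the question: vertices are numbered 0 to n-1, each edge is an `int[]{u, v, w}`, and the names `BucketKruskal`, `DisjointSets`, `minimumSpanningForest` and `totalWeight` are made up for this example. The Union-Find uses union by size with path halving, a common practical variant, so the total running time is O(m + R) for the bucketing plus the near-linear Union-Find cost discussed above, not strictly O(n + m).

```java
import java.util.ArrayList;
import java.util.List;

final class BucketKruskal {

    /** Union-Find with union by size and path halving (hypothetical helper). */
    private static final class DisjointSets {
        private final int[] parent;
        private final int[] size;

        DisjointSets(int n) {
            parent = new int[n];
            size = new int[n];
            for (int v = 0; v < n; v++) { // MAKE_SET for every vertex
                parent[v] = v;
                size[v] = 1;
            }
        }

        int find(int v) {
            while (parent[v] != v) {
                parent[v] = parent[parent[v]]; // path halving
                v = parent[v];
            }
            return v;
        }

        /** Merges the sets of a and b; returns false if they were already joined. */
        boolean union(int a, int b) {
            int ra = find(a);
            int rb = find(b);
            if (ra == rb) {
                return false;
            }
            if (size[ra] < size[rb]) { // attach the smaller tree below the larger
                int tmp = ra;
                ra = rb;
                rb = tmp;
            }
            parent[rb] = ra;
            size[ra] += size[rb];
            return true;
        }
    }

    /**
     * Assumes edges[i] = {u, v, w} with 0 <= u, v < n and 1 <= w <= maxWeight.
     * Returns the edges of a minimum spanning forest (a tree if G is connected).
     */
    static List<int[]> minimumSpanningForest(int n, int[][] edges, int maxWeight) {
        // Bucket the edges by weight: O(m + R) instead of an O(m log m) sort.
        List<List<int[]>> buckets = new ArrayList<>();
        for (int w = 0; w <= maxWeight; w++) {
            buckets.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            buckets.get(e[2]).add(e);
        }

        DisjointSets sets = new DisjointSets(n);
        List<int[]> tree = new ArrayList<>();
        // Kruskal: scan buckets in increasing weight, keep edges joining two components.
        for (int w = 1; w <= maxWeight; w++) {
            for (int[] e : buckets.get(w)) {
                if (sets.union(e[0], e[1])) {
                    tree.add(e);
                }
            }
        }
        return tree;
    }

    /** Sums the weights of the given edges. */
    static long totalWeight(List<int[]> tree) {
        long sum = 0;
        for (int[] e : tree) {
            sum += e[2];
        }
        return sum;
    }
}
```

For example, calling `BucketKruskal.minimumSpanningForest(4, edges, 3)` on the edges {0,1,1}, {1,2,2}, {0,2,3}, {2,3,1}, {1,3,3} keeps the two weight-1 edges and the weight-2 edge and skips both weight-3 edges, since each of them would close a cycle.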