Transitive closure of a directed graph

Contents

1 Formulation of the problem

Let [math]G = (V, E)[/math] be a directed graph. The sequence [math]P(u, v)[/math] of edges [math]e_1 = (u, w_1)[/math], [math]e_2 = (w_1, w_2)[/math], …, [math]e_k = (w_{k-1}, v)[/math] is called a "path from the vertex [math]u[/math] to [math]v[/math]. We say that [math]v[/math] is reachable from [math]u[/math] if there exists at least one path [math]P(u, v)[/math]. Each vertex is reachable from itself.

It is required to construct the transitive closure[math]G^+ = (V, E^+)[/math] of the graph [math]G[/math]; namely, an edge [math](v, w) \in E^+[/math] if and only if the vertex [math]w[/math] is reachable in the graph [math]G[/math] from the vertex [math]v[/math].

An equivalent formulation is as follows: Given a reflexive binary relation [math]R[/math], construct the minimal (with respect to inclusion) relation [math]R^+[/math] that contains [math]R[/math] and has the transitivity property; that is, if [math]aR^+b[/math] and [math]bR^+c[/math], then [math]aR^+c[/math]. If the original relation [math]R[/math] is symmetric, then [math]R^+[/math] is an equivalence relation, and it suffices to find the corresponding equivalence classes.

2 Properties of the problem

If vertices [math]v[/math] and [math]w[/math] belong to the same strongly connected component of the graph [math]G[/math], then the transitive closure contains the edges [math](v, w)[/math] and [math](w, v)[/math]. In an undirected graph, the edge [math](v, w)[/math] belongs to the transitive closure if and only if the vertices [math]v[/math] and [math]w[/math] belong to the same connected component. Consequently, for an undirected graph, the search for transitive closure is equivalent to finding connected components.

If vertices [math]x[/math] and [math]y[/math] belong to the same strongly connected component of the graph [math]G[/math], while vertices [math]z[/math] and [math]t[/math] belong to the other component, then the edges [math](x, z)[/math], [math](x, t)[/math], [math](y, z)[/math], and [math](y, t)[/math] simultaneously belong or not belong to the transitive closure [math]G^+[/math]. It follows that the search for the transitive closure of the graph [math]G[/math] can be reduced to finding the transitive closure of the acyclic graph obtained from [math]G[/math] by merging each strongly connected component into a single vertex. This idea underlies Purdom's algorithm алгоритм Пурдома.

In terms of vertices, the graph [math]G[/math] and its transitive closure [math]G^+[/math] have the same strongly connected components.

3 Algorithms for solving the problem

In an undirected graph, a vertex [math]w[/math] is reachable from a vertex [math]v[/math] if and only if both belong to the same connected component. The transitive closure of such graph reduces to finding its connected components and can be constructed by the following algorithms:

the use of a system of disjoint sets[1] The time complexity is [math]O(m \alpha(m, n))[/math]. An effective multithreaded implementation is possible [2];

the parallel algorithm of Shiloach-Vishkin[3] The time complexity is [math]O(\ln n)[/math], provided that [math]n + 2m[/math] processors are used.

For a directed graph, the transitive closure can be reduced to the search for shortest paths in a graph with unit weights. It can then be found by the following algorithms:

Floyd--Warshall algorithm[4][5]. The complexity is [math]O(n^3)[/math]. (The first published algorithm for transitive closure [6] is exactly the Floyd--Warshall algorithm for graphs with unit weights);

Notation: [math]m[/math] is the number of edges, [math]n[/math] is the number of vertices, and [math]\mu \le m[/math] is the number of edges that connect strongly connected components.

4 Parallelization resource

The problem can be solved by applying the breadth first search to all the vertices of the graph. For different vertices, the calculations are independent and may be performed in parallel.

Since the size of the output data and the complexity of calculations are quadratic, it makes sense to perform a preliminary processing of the data and use various graph decompositions in order to reduce the original problem to the parallel finding of shortest paths on smaller subgraphs [8]: