Succinct Representation of Trees and Graphs

View/ Open

Date

Author

Metadata

Abstract

In this thesis, we study succinct representations of trees and graphs. A succinct representation of a combinatorial object is a space efficient representation that supports a reasonable set of operations and queries on the object in constant or near constant time on the RAM with logarithmic word size. The storage requirement of a succinct representation is intended to be optimal to within lower order terms.
We first propose a uniform approach for succinct representation of various families
of trees. The method is based on two recursive decompositions of trees into subtrees. The approach simplifies the existing representation of ordinal trees while allowing the full set of navigational operations and queries. The approach applied to cardinal (i.e., k-ary) trees yields a space-optimal succinct representation allowing cardinal-type operations (e.g., determining child labeled i) as well as the full set of ordinal-type operations (e.g., reporting the number of siblings to the left of a node). Previous space-optimal succinct representations had not been able to support both types of operations efficiently. We demonstrate how the approach can be applied to obtain a space-optimal succinct representation for the family of free trees where the order of children is insignificant. Furthermore, we show that our approach can be used to obtain entropy-based succinct representations. The approach adapts to match the degree-distribution entropy suggested by Jansson et al. We discuss that our approach can be made adaptive to various other entropy measures.
Next, we focus on ordinal trees, and present a novel universal succinct representation. Our new representation is able to simultaneously emulate previous ordinal tree representations of the balanced parenthesis (BP), depth first unary degree sequence (DFUDS) and partitioned representations using a single instance of the data structure. They not only support the union of all the ordinal tree operations supported by these representations, but will also automatically inherit any new operations supported by these representations in the future; hence the universality title we attributed to the representation.
We then move to more general graphs rather than trees, and consider the problem of encoding a graph with $n$ vertices and m edges compactly supporting adjacency, neighborhood and degree queries in constant time. The adjacency query asks whether there is an edge between two vertices, the neighborhood query reports the neighbors of a given vertex in constant time per neighbor, and the degree query reports the number of edges incident to a given vertex. The representation is to achieve the optimal space requirement as a function of n and m to within lower order terms. We prove a lower bound in the cell probe model that it is impossible to achieve the information theoretic lower bound to within lower order terms unless the graph is too sparse (namely, $m=o(n^\delta)$ for any constant \delta > 0) or too dense (namely m = \littleOmega{n^{2-\delta}}) for any constant \delta > 0).
We also present a succinct encoding for graphs for all values of n,m supporting
queries in constant time. The space requirement of the representation is always within a multiplicative 1+\epsilon factor of the information-theory lower bound for any constant $\epsilon > 0$. This is the best achievable space bound according to our lower bound where it applies. The space requirement of the representation achieves the information-theory lower bound tightly to within lower order terms when
the graph is sparse (m=o(n^\delta) for any constant \delta > 0), or very dense (m = \littleOmega (n^2/(\sqrt{\log n})).