Abstract

It is a challenging and fundamental problem to construct
the underlying overlay network to support efficient and
scalable information distribution in topic-based
publish/subscribe systems. Existing overlay design
algorithms aim to minimize the node fan-out while building topic-connected overlays,
in which all nodes interested in the same topic are organized in a directly
connected dissemination sub-overlay. However, most state-of-the-art
algorithms suffer from high computational complexity, such as $O(|V|^4|T|)$,
where $V$ is the node set and $T$ is the topic set.

In this paper, we devise a general indexing data structure that allows us to provide a
significantly faster implementation, with $O(|V|^2|T|)$ running time, for different
state-of-the-art algorithms. The generality of the indexing data structure is due to the
fact that it enables edge lookup by both node degree and \emph{edge contribution},
a central metric in all existing algorithms. When tested on typical pub/sub workloads,
the speedup observed was by a factor of over $1\,000$, thereby rendering the algorithms more
suitable for practical use. For example, under a typically Zipf distributed pub/sub workload,
with $1\,000$ nodes and $100$ topics, our new implementation completes in $3.823$ seconds,
while the previous alternative takes over $555$ minutes.