On a Recurrence Arising in Graph Compression

Yongwook Choi, Charles Knessl, Wojciech Szpankowski

Abstract

In a recently proposed graphical compression algorithm by Choi and Szpankowski (2012), the following tree arose in the course of the analysis. The root contains $n$ balls that are consequently distributed between two subtrees according to a simple rule: In each step, all balls independently move down to the left subtree (say with probability $p$) or the right subtree (with probability $1-p$). A new node is created as long as there is at least one ball in that node. Furthermore, a nonnegative integer $d$ is given, and at level $d$ or greater one ball is removed from the leftmost node before the balls move down to the next level. These steps are repeated until all balls are removed (i.e., after $n+d$ steps). Observe that when $d=\infty$ the above tree can be modeled as a trie that stores $n$ independent sequences generated by a binary memoryless source with parameter $p$. Therefore, we coin the name $(n,d)$-tries for the tree just described, and to which we often refer simply as $d$-tries. We study here in detail the path length, and show how much the path length of such a $d$-trie differs from that of regular tries. We use methods of analytic algorithmics, from Mellin transforms to analytic poissonization.