Discrete Mathematics & Theoretical Computer Science |
In a recently proposed graphical compression algorithm by Choi and Szpankowski (2009), the following tree arose in the course of the analysis. The root contains n balls that are consequently distributed between two subtrees according to a simple rule: In each step, all balls independently move down to the left subtree (say with probability $p$) or the right subtree (with probability 1-$p$). A new node is created as long as there is at least one ball in that node. Furthermore, a nonnegative integer $d$ is given, and at level $d$ or greater one ball is removed from the leftmost node before the balls move down to the next level. These steps are repeated until all balls are removed (i.e., after $n+d$ steps). Observe that when $d=∞$ the above tree can be modeled as a $\textit{trie}$ that stores $n$ independent sequences generated by a memoryless source with parameter $p$. Therefore, we coin the name $(n,d)$-tries for the tree just described, and to which we often refer simply as $d$-tries. Parameters of such a tree (e.g., path length, depth, size) are described by an interesting two-dimensional recurrence (in terms of $n$ and $d$) that – to the best of our knowledge – was not analyzed before. We study it, and show how much parameters of such a $(n,d)$-trie differ from the corresponding parameters of regular tries. We use methods of analytic algorithmics, from Mellin transforms to analytic poissonization.