Mark Daniel Ward ; Wojciech Szpankowski
-
Analysis of the multiplicity matching parameter in suffix trees
dmtcs:3387 -
Discrete Mathematics & Theoretical Computer Science,
January 1, 2005,
DMTCS Proceedings vol. AD, International Conference on Analysis of Algorithms
-
https://doi.org/10.46298/dmtcs.3387
Analysis of the multiplicity matching parameter in suffix trees
Authors: Mark Daniel Ward 1; Wojciech Szpankowski 2
NULL##NULL
Mark Daniel Ward;Wojciech Szpankowski
1 Department of mathematics Purdue University
2 Department of Computer Science [Purdue]
In a suffix tree, the multiplicity matching parameter (MMP) $M_n$ is the number of leaves in the subtree rooted at the branching point of the $(n+1)$st insertion. Equivalently, the MMP is the number of pointers into the database in the Lempel-Ziv '77 data compression algorithm. We prove that the MMP asymptotically follows the logarithmic series distribution plus some fluctuations. In the proof we compare the distribution of the MMP in suffix trees to its distribution in tries built over independent strings. Our results are derived by both probabilistic and analytic techniques of the analysis of algorithms. In particular, we utilize combinatorics on words, bivariate generating functions, pattern matching, recurrence relations, analytical poissonization and depoissonization, the Mellin transform, and complex analysis.
Volume: DMTCS Proceedings vol. AD, International Conference on Analysis of Algorithms
Section: Proceedings
Published on: January 1, 2005
Imported on: May 10, 2017
Keywords: data compression,complex asymptotics,suffix trees,combinatorics on words,pattern matching,autocorrelation polynomial,[INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS],[INFO.INFO-DM] Computer Science [cs]/Discrete Mathematics [cs.DM],[MATH.MATH-CO] Mathematics [math]/Combinatorics [math.CO],[INFO.INFO-CG] Computer Science [cs]/Computational Geometry [cs.CG]
Funding:
Source : OpenAIRE Graph
Analytic Information Theory, Combinatorics, and Algorithmics: The Precise Redundancy and Related Problems; Funder: National Science Foundation; Code: 0208709
Combinatorial &Probabilistic Methods for Biol Sequences; Funder: National Institutes of Health; Code: 5R01GM068959-04
1 Document citing this article
Source : OpenCitations
Drmota, Michael; Gittenberger, Bernhard; Panholzer, Alois; Prodinger, Helmut; Ward, Mark Daniel, 2009, On The Shape Of The Fringe Of Various Types Of Random Trees, Mathematical Methods In The Applied Sciences, 32, 10, pp. 1207-1245, 10.1002/mma.1085.