Discrete Mathematics & Theoretical Computer Science |

3011

Consider a countable alphabet $\mathcal{A}$. A multi-modular hidden pattern is an $r$-tuple $(w_1,\ldots , w_r)$, where each $w_i$ is a word over $\mathcal{A}$ called a module. The hidden pattern is said to occur in a text $t$ when the later admits the decomposition $t = v_0 w_1v_1 \cdots v_{r−1}w_r v_r$, for arbitrary words $v_i$ over $\mathcal{A}$. Flajolet, Szpankowski and Vallée (2006) proved via the method of moments that the number of matches (or occurrences) with a multi-modular hidden pattern in a random text $X_1\cdots X_n$ is asymptotically Normal, when $(X_n)_{n\geq1}$ are independent and identically distributed $\mathcal{A}$-valued random variables. Bourdon and Vallée (2002) had conjectured however that asymptotic Normality holds more generally when $(X_n)_{n\geq1}$ is produced by an expansive dynamical source. Whereas memoryless and Markovian sequences are instances of dynamical sources with finite memory length, general dynamical sources may be non-Markovian i.e. convey an infinite memory length. The technical difficulty to count hidden patterns under sources with memory is the context-free nature of these patterns as well as the lack of logarithm-and exponential-type transformations to rewrite the product of non-commuting transfer operators. In this paper, we address a case study in which we have successfully overpassed the aforementioned difficulties and which may illuminate how to address more general cases via auto-correlation operators. Our main result shows that the number of matches with a bi-modular pattern $(w_1, w_2)$ normalized by the number of matches with the pattern $w_1$, where $w_1$ and $w_2$ are different alphabet characters, is indeed asymptotically Normal when $(X_n)_{n\geq1}$ is produced by a holomorphic probabilistic dynamical source.

Source : oai:HAL:hal-01082042v1

Volume: DMTCS Proceedings vol. AQ, 23rd Intern. Meeting on Probabilistic, Combinatorial, and Asymptotic Methods for the Analysis of Algorithms (AofA'12)

Section: Proceedings

Published on: January 1, 2012

Imported on: January 31, 2017

Keywords: non-Markovian sequence,hidden pattern,pattern matching,probabilistic dynamical source,[INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS],[INFO.INFO-CC] Computer Science [cs]/Computational Complexity [cs.CC]

Fundings :

- Source : OpenAIRE Research Graph
*AMC-SS: Markovian Embeddings for the Analysis and Computation of Patterns in non-Markovian Random Sequences*; Funder: National Science Foundation; Code: 0805950

This page has been seen 133 times.

This article's PDF has been downloaded 131 times.