DMTCS Proceedings vol. AQ, 23rd Intern. Meeting on Probabilistic, Combinatorial, and Asymptotic Methods for the Analysis of Algorithms (AofA'12)

The present volume collects the proceedings of the Aofa'12, 23st International Meeting on Probabilistic, Combinatorial, and Asymptotic Methods for the Analysis of Algorithms held at the Centre de Recherches Mathématiques at Université de Montreal, Canada, between June 18-22, 2012. The conference builds on the communities of the former series of conferences ``Mathematics and Computer Science'' and ``Analysis of Algorithms'', and aims at studying rigorously the combinatorial objects which appear in particular as data structures, algorithms, models of networks and as well as the essential ubiquitous combinatorial structures. The program committee selected submissions covering this wide range of topics. The regular papers were presented in 30-minute talks, and short abstracts in presentations of 15 minutes. They all appear in the present volume. The conference included five invited plenary lectures: - Michael Drmota (TU Vienna, Austria): Extremal parameters in critical and subcritical graph classes - Svante Janson (Uppsala, Sweden): Simply generated trees, conditioned Galton-Watson trees, random allocations and condensation - Amin Coja-Oghlan (Warwick, UK): Phase transitions and computational complexity - Claire Mathieu (Brown, USA): Algorithms for optimization over noisy data - Avi Wigderson (IAS, USA): Population Recovery via Partial Identification We thank the members of the steering and program committees for their involvement. We also thank the invited speakers, and the authors of the contributed papers. We express our gratitute to the people at CRM and especially Louis Pelletier for their invaluable help in making sure that the meeting went smoothly. We are also grateful to the editor-in-chief of DMTCS Jens Gustedt for his help in the compilation of the proceedings. Finally, our special thanks go to the sponsors of the conference for their contributions: CRM, Inria, and NSERC. Nicolas Broutin and Luc Devroye, Editors **Steering Committee** - Brigitte Chauvin (Versailles, France) - Luc Devroye (McGill, Canada) - Michael Drmota (TU Vienna, Austria) - Daniel Panario (Carleton, Canada) - Robert Sedgewick (Princeton, USA) - Wojciech Szpankowski (Purdue, USA) - Brigitte Vallée (Caen, France) **Program Committee** - Frédérique Bassino, Paris 13, France - Jit Bose, Carleton, Canada - Nicolas Broutin (co-chair), Inria, France - Philippe Chassaing, Nancy, France - Julien Clément, Caen, France - Luc Devroye (co-chair), McGill, Canada - Uriel Feige, Weizmann, Israel - Alan Frieze, Carnegie-Mellon, USA - Bernhard Gittenberger, TU Vienna, Austria - Hsien-Kuei Hwang, Academia Sinica, Taiwan - Philippe Jacquet, Inria, France - Colin McDiarmid, Oxford, UK - Mike Molloy, Toronto, Canada - Ralph Neininger, Frankfurt, Germany - Marc Noy, Barcelona, Spain - Daniel Panario, Carleton, Canada - Alois Panholzer, TU Vienna, Austria - Bruno Salvy, Inria, France - Mohit Singh, McGill, Canada - Brigitte Vallée, Caen, France - Mark Ward, Purdue, USA **Organization**- Nicolas Broutin - Luc Devroye

1. A New Binomial Recurrence Arising in a Graphical Compression Algorithm

Yongwook Choi ; Charles Knessl ; Wojciech Szpankowski.

In a recently proposed graphical compression algorithm by Choi and Szpankowski (2009), the following tree arose in the course of the analysis. The root contains n balls that are consequently distributed between two subtrees according to a simple rule: In each step, all balls independently move down to the left subtree (say with probability $p$) or the right subtree (with probability 1-$p$). A new node is created as long as there is at least one ball in that node. Furthermore, a nonnegative integer $d$ is given, and at level $d$ or greater one ball is removed from the leftmost node before the balls move down to the next level. These steps are repeated until all balls are removed (i.e., after $n+d$ steps). Observe that when $d=∞$ the above tree can be modeled as a $\textit{trie}$ that stores $n$ independent sequences generated by a memoryless source with parameter $p$. Therefore, we coin the name $(n,d)$-tries for the tree just described, and to which we often refer simply as $d$-tries. Parameters of such a tree (e.g., path length, depth, size) are described by an interesting two-dimensional recurrence (in terms of $n$ and $d$) that – to the best of our knowledge – was not analyzed before. We study it, and show how much parameters of such a $(n,d)$-trie differ from the corresponding parameters of regular tries. We use methods of analytic algorithmics, from Mellin transforms to analytic poissonization.

Section: Proceedings

2. Approximate Counting via the Poisson-Laplace-Mellin Method

Michael Fuchs ; Chung-Kuei Lee ; Helmut Prodinger.

Approximate counting is an algorithm that provides a count of a huge number of objects within an error tolerance. The first detailed analysis of this algorithm was given by Flajolet. In this paper, we propose a new analysis via the Poisson-Laplace-Mellin approach, a method devised for analyzing shape parameters of digital search trees in a recent paper of Hwang et al. Our approach yields a different and more compact expression for the periodic function from the asymptotic expansion of the variance. We show directly that our expression coincides with the one obtained by Flajolet. Moreover, we apply our method to variations of approximate counting, too.

Section: Proceedings

3. On death processes and urn models

Markus Kuba ; Alois Panholzer.

We use death processes and embeddings into continuous time in order to analyze several urn models with a diminishing content. In particular we discuss generalizations of the pill's problem, originally introduced by Knuth and McCarthy, and generalizations of the well known sampling without replacement urn models, and OK Corral urn models.

Section: Proceedings

4. Asymptotic behavior of some statistics in Ewens random permutations

Valentin Feray.

The purpose of this article is to present a general method to find limiting laws for some renormalized statistics on random permutations. The model considered here is Ewens sampling model, which generalizes uniform random permutations. We describe the asymptotic behavior of a large family of statistics, including the number of occurrences of any given dashed pattern. Our approach is based on the method of moments and relies on the following intuition: two events involving the images of different integers are almost independent.

Section: Proceedings

5. Asymptotics of Divide-And-Conquer Recurrences Via Iterated Function Systems

John Kieffer.

Let $k≥2$ be a fixed integer. Given a bounded sequence of real numbers $(a_n:n≥k)$, then for any sequence $(f_n:n≥1)$ of real numbers satisfying the divide-and-conquer recurrence $f_n = (k-mod(n,k))f_⌊n/k⌋+mod(n,k)f_⌈n/k⌉ + a_n, n ≥k$, there is a unique continuous periodic function $f^*:\mathbb{R}→\mathbb{R}$ with period 1 such that $f_n = nf^*(\log _kn)+o(n)$. If $(a_n)$ is periodic with period $k, a_k=0$, and the initial conditions $(f_i:1 ≤i ≤k-1)$ are all zero, we obtain a specific iterated function system $S$, consisting of $k$ continuous functions from $[0,1]×\mathbb{R}$ into itself, such that the attractor of $S$ is $\{(x,f^*(x)): 0 ≤x ≤1\}$. Using the system $S$, an accurate plot of $f^*$ can be rapidly obtained.

Section: Proceedings

6. Additive tree functionals with small toll functions and subtrees of random trees

Stephan Wagner.

Many parameters of trees are additive in the sense that they can be computed recursively from the sum of the branches plus a certain toll function. For instance, such parameters occur very frequently in the analysis of divide-and-conquer algorithms. Here we are interested in the situation that the toll function is small (the average over all trees of a given size $n$ decreases exponentially with $n$). We prove a general central limit theorem for random labelled trees and apply it to a number of examples. The main motivation is the study of the number of subtrees in a random labelled tree, but it also applies to classical instances such as the number of leaves.

Section: Proceedings

7. Domination analysis for scheduling on non preemptive uniformly related machines

Idan Eisner ; Alek Vainshtein.

no abstract

Section: Proceedings

8. Enumeration and Random Generation of Concurrent Computations

Olivier Bodini ; Antoine Genitrini ; Frédéric Peschanski.

In this paper, we study the shuffle operator on concurrent processes (represented as trees) using analytic combinatorics tools. As a first result, we show that the mean width of shuffle trees is exponentially smaller than the worst case upper-bound. We also study the expected size (in total number of nodes) of shuffle trees. We notice, rather unexpectedly, that only a small ratio of all nodes do not belong to the last two levels. We also provide a precise characterization of what ``exponential growth'' means in the case of the shuffle on trees. Two practical outcomes of our quantitative study are presented: (1) a linear-time algorithm to compute the probability of a concurrent run prefix, and (2) an efficient algorithm for uniform random generation of concurrent runs.

Section: Proceedings

9. On total variation approximations for random assemblies

Eugenijus Manstavičius.

We prove a total variation approximation for the distribution of component vector of a weakly logarithmic random assembly. The proof demonstrates an analytic approach based on a comparative analysis of the coefficients of two power series.

Section: Proceedings

10. Some exact asymptotics in the counting of walks in the quarter plane

Guy Fayolle ; Kilian Raschel.

Enumeration of planar lattice walks is a classical topic in combinatorics, at the cross-roads of several domains (e.g., probability, statistical physics, computer science). The aim of this paper is to propose a new approach to obtain some exact asymptotics for walks confined to the quarter plane.

Section: Proceedings

11. Biased Boltzmann samplers and generation of extended linear languages with shuffle

Alexis Darrasse ; Konstantinos Panagiotou ; Olivier Roussel ; Michele Soria.

This paper is devoted to the construction of Boltzmann samplers according to various distributions, and uses stochastic bias on the parameter of a Boltzmann sampler, to produce a sampler with a different distribution for the size of the output. As a significant application, we produce Boltzmann samplers for words defined by regular specifications containing shuffle operators and linear recursions. This sampler has linear complexity in the size of the output, where the complexity is measured in terms of real-arithmetic operations and evaluations of generating functions.

Section: Proceedings

12. On the number of transversals in random trees

Bernhard Gittenberger ; Veronika Kraus.

We study transversals in random trees with n vertices asymptotically as n tends to infinity. Our investigation treats the average number of transversals of fixed size, the size of a random transversal as well as the probability that a random subset of the vertex set of a tree is a transversal for the class of simply generated trees and for Pólya trees. The last parameter was already studied by Devroye for simply generated trees. We offer an alternative proof based on generating functions and singularity analysis and extend the result to Pólya trees.

Section: Proceedings

13. Generic properties of random subgroups of a free group for general distributions

Frédérique Bassino ; Cyril Nicaud ; Pascal Weil.

We consider a generalization of the uniform word-based distribution for finitely generated subgroups of a free group. In our setting, the number of generators is not fixed, the length of each generator is determined by a random variable with some simple constraints and the distribution of words of a fixed length is specified by a Markov process. We show by probabilistic arguments that under rather relaxed assumptions, the good properties of the uniform word-based distribution are preserved: generically (but maybe not exponentially generically), the tuple we pick is a basis of the subgroup it generates, this subgroup is malnormal and the group presentation defined by this tuple satisfies a small cancellation condition.

Section: Proceedings

14. Matching solid shapes in arbitrary dimension via random sampling

Daria Schymura.

We give simple probabilistic algorithms that approximately maximize the volume of overlap of two solid, i.e. full-dimensional, shapes under translations and rigid motions. The shapes are subsets of $ℝ^d$ where $d≥ 2$. The algorithms approximate with respect to an pre-specified additive error and succeed with high probability. Apart from measurability assumptions, we only require that points from the shapes can be generated uniformly at random. An important example are shapes given as finite unions of simplices that have pairwise disjoint interiors.

Section: Proceedings

15. On Bernoulli Sums and Bernstein Polynomials

Jacek Cichoń ; Zbigniew Gołębiewski.

In the paper we discuss a technology based on Bernstein polynomials of asymptotic analysis of a class of binomial sums that arise in information theory. Our method gives a quick derivation of required sums and can be generalized to multinomial distributions. As an example we derive a formula for the entropy of multinomial distributions. Our method simplifies previous work of Jacquet, Szpankowski and Flajolet from 1999.

Section: Proceedings

16. Support and density of the limit $m$-ary search trees distribution

Brigitte Chauvin ; Quansheng Liu ; Nicolas Pouyanne.

The space requirements of an $m$-ary search tree satisfies a well-known phase transition: when $m\leq 26$, the second order asymptotics is Gaussian. When $m\geq 27$, it is not Gaussian any longer and a limit $W$ of a complex-valued martingale arises. We show that the distribution of $W$ has a square integrable density on the complex plane, that its support is the whole complex plane, and that it has finite exponential moments. The proofs are based on the study of the distributional equation $ W \overset{\mathcal{L}}{=} \sum_{k=1}^mV_k^{\lambda}W_k$, where $V_1, ..., V_m$ are the spacings of $(m-1)$ independent random variables uniformly distributed on $[0,1]$, $W_1, ..., W_m$ are independent copies of W which are also independent of $(V_1, ..., V_m)$ and $\lambda$ is a complex number.

Section: Proceedings

17. Adaptive compression against a countable alphabet

Dominique Bontemps ; Stephane Boucheron ; Elisabeth Gassiat.

This paper sheds light on universal coding with respect to classes of memoryless sources over a countable alphabet defined by an envelope function with finite and non-decreasing hazard rate. We prove that the auto-censuring (AC) code introduced by Bontemps (2011) is adaptive with respect to the collection of such classes. The analysis builds on the tight characterization of universal redundancy rate in terms of metric entropy by Haussler and Opper (1997) and on a careful analysis of the performance of the AC-coding algorithm. The latter relies on non-asymptotic bounds for maxima of samples from discrete distributions with finite and non-decreasing hazard rate.

Section: Proceedings

18. Exactly Solvable Balanced Tenable Urns with Random Entries via the Analytic Methodology

Basile Morcrette ; Hosam M. Mahmoud.

This paper develops an analytic theory for the study of some Pólya urns with random rules. The idea is to extend the isomorphism theorem in Flajolet et al. (2006), which connects deterministic balanced urns to a differential system for the generating function. The methodology is based upon adaptation of operators and use of a weighted probability generating function. Systems of differential equations are developed, and when they can be solved, they lead to characterization of the exact distributions underlying the urn evolution. We give a few illustrative examples.

Section: Proceedings

19. Locally Restricted Compositions IV. Nearly Free Large Parts and Gap-Freeness

Edward Bender ; Rodney Canfield ; Zhicheng Gao.

We define the notion of $t$-free for locally restricted compositions, which means roughly that if such a composition contains a part $c_i$ and nearby parts are at least $t$ smaller, then $c_i$ can be replaced by any larger part. Two well-known examples are Carlitz and alternating compositions. We show that large parts have asymptotically geometric distributions. This leads to asymptotically independent Poisson variables for numbers of various large parts. Based on this we obtain asymptotic formulas for the probability of being gap free and for the expected values of the largest part and number distinct parts, all accurate to $o(1)$.

Section: Proceedings

20. The weighted words collector

Jérémie Du Boisberranger ; Danièle Gardy ; Yann Ponty.

We consider the word collector problem, i.e. the expected number of calls to a random weighted generator before all the words of a given length in a language are generated. The originality of this instance of the non-uniform coupon collector lies in the, potentially large, multiplicity of the words/coupons of a given probability/composition. We obtain a general theorem that gives an asymptotic equivalent for the expected waiting time of a general version of the Coupon Collector. This theorem is especially well-suited for classes of coupons featuring high multiplicities. Its application to a given language essentially necessitates knowledge on the number of words of a given composition/probability. We illustrate the application of our theorem, in a step-by-step fashion, on four exemplary languages, whose analyses reveal a large diversity of asymptotic waiting times, generally expressible as $\kappa \cdot m^p \cdot (\log{m})^q \cdot (\log \log{m})^r$, for $m$ the number of words, and $p, q, r$ some positive real numbers.

Section: Proceedings

21. A phase transition in the distribution of the length of integer partitions

Dimbinaina Ralaivaosaona.

We assign a uniform probability to the set consisting of partitions of a positive integer $n$ such that the multiplicity of each summand is less than a given number $d$ and we study the limiting distribution of the number of summands in a random partition. It is known from a result by Erdős and Lehner published in 1941 that the distributions of the length in random restricted $(d=2)$ and random unrestricted $(d \geq n+1)$ partitions behave very differently. In this paper we show that as the bound $d$ increases we observe a phase transition in which the distribution goes from the Gaussian distribution of the restricted case to the Gumbel distribution of the unrestricted case.

Section: Proceedings

22. The Euclid Algorithm is totally gaussian

Brigitte Vallée.

We consider Euclid’s gcd algorithm for two integers $(p, q)$ with $1 \leq p \leq q \leq N$, with the uniform distribution on input pairs. We study the distribution of the total cost of execution of the algorithm for an additive cost function $d$ on the set of possible digits, asymptotically for $N \to \infty$. For any additive cost of moderate growth $d$, Baladi and Vallée obtained a central limit theorem, and –in the case when the cost $d$ is lattice– a local limit theorem. In both cases, the optimal speed was attained. When the cost is non lattice, the problem was later considered by Baladi and Hachemi, who obtained a local limit theorem under an intertwined diophantine condition which involves the cost $d$ together with the “canonical” cost $c$ of the underlying dynamical system. The speed depends on the irrationality exponent that intervenes in the diophantine condition. We show here how to replace this diophantine condition by another diophantine condition, much more natural, which already intervenes in simpler problems of the same vein, and only involves the cost $d$. This “replacement” is made possible by using the additivity of cost $d$, together with a strong property satisfied by the Euclidean Dynamical System, which states that the cost $c$ is both “strongly” non additive and diophantine in a precise sense. We thus obtain a local limit theorem, whose speed is related to the irrationality exponent which intervenes in the new diophantine condition. […]

Section: Proceedings

23. Joint String Complexity for Markov Sources

Philippe Jacquet ; Wojciech Szpankowski.

String complexity is defined as the cardinality of a set of all distinct words (factors) of a given string. For two strings, we define $\textit{joint string complexity}$ as the set of words that are common to both strings. We also relax this definition and introduce $\textit{joint semi-complexity}$ restricted to the common words appearing at least twice in both strings. String complexity finds a number of applications from capturing the richness of a language to finding similarities between two genome sequences. In this paper we analyze joint complexity and joint semi-complexity when both strings are generated by a Markov source. The problem turns out to be quite challenging requiring subtle singularity analysis and saddle point method over infinity many saddle points leading to novel oscillatory phenomena with single and double periodicities.

Section: Proceedings

24. Data Streams as Random Permutations: the Distinct Element Problem

Ahmed Helmi ; Jérémie Lumbroso ; Conrado Martínez ; Alfredo Viola.

In this paper, we show that data streams can sometimes usefully be studied as random permutations. This simple observation allows a wealth of classical and recent results from combinatorics to be recycled, with minimal effort, as estimators for various statistics over data streams. We illustrate this by introducing RECORDINALITY, an algorithm which estimates the number of distinct elements in a stream by counting the number of $k$-records occurring in it. The algorithm has a score of interesting properties, such as providing a random sample of the set underlying the stream. To the best of our knowledge, a modified version of RECORDINALITY is the first cardinality estimation algorithm which, in the random-order model, uses neither sampling nor hashing.

Section: Proceedings

25. Exact $L^2$-Distance from the Limit for QuickSort Key Comparisons (Extended Abstract)

Patrick Bindjeme ; james Allen fill.

Using a recursive approach, we obtain a simple exact expression for the $L^2$-distance from the limit in the classical limit theorem of Régnier (1989) for the number of key comparisons required by $\texttt{QuickSort}$. A previous study by Fill and Janson (2002) using a similar approach found that the $d_2$-distance is of order between $n^{-1} \log{n}$ and $n^{-1/2}$, and another by Neininger and Ruschendorf (2002) found that the Zolotarev $\zeta _3$-distance is of exact order $n^{-1} \log{n}$. Our expression reveals that the $L^2$-distance is asymptotically equivalent to $(2 n^{-1} \ln{n})^{1/2}$.

Section: Proceedings

26. The Limiting Distribution for the Number of Symbol Comparisons Used by QuickSort is Nondegenerate (Extended Abstract)

Patrick Bindjeme ; james Allen fill.

In a continuous-time setting, Fill (2012) proved, for a large class of probabilistic sources, that the number of symbol comparisons used by $\texttt{QuickSort}$, when centered by subtracting the mean and scaled by dividing by time, has a limiting distribution, but proved little about that limiting random variable $Y$—not even that it is nondegenerate. We establish the nondegeneracy of $Y$. The proof is perhaps surprisingly difficult.

Section: Proceedings

27. Stokes polyhedra for $X$-shaped polyminos

Yu. Baryshnikov ; L. Hickok ; N. Orlow ; S. Son.

Consider a pair of $\textit{interlacing regular convex polygons}$, each with $2(n + 2)$ vertices, which we will be referring to as $\textit{red}$ and $\textit{black}$ ones. One can place these vertices on the unit circle $|z | = 1$ in the complex plane; the vertices of the red polygon at $\epsilon^{2k}, k = 0, \ldots , 2n − 1$, of the black polygon at $\epsilon^{2k+1}, k = 0, \ldots , 2n − 1$; here $\epsilon = \exp(i \pi /(2n + 2))$. We assign to the vertices of each polygon alternating (within each polygon) signs. Note that all the pairwise intersections of red and black sides are oriented consistently. We declare the corresponding orientation positive.

Section: Proceedings

28. Mean field analysis for inhomogeneous bike sharing systems

Christine Fricker ; Nicolas Gast ; Hanene Mohamed.

In the paper, bike sharing systems with stations having a finite capacity are studied as stochastic networks. The inhomogeneity is modeled by clusters. We use a mean field limit to compute the limiting stationary distribution of the number of bikes at the stations. This method is an alternative to analytical methods. It can be used even if a closed form expression for the stationary distribution is out of reach as illustrated on a variant. Both models are compared. A practical conclusion is that avoiding empty or full stations does not improve overall performance.

Section: Proceedings

29. On Greedy Trie Execution

Zbigniew Gołębiewski ; Filip Zagórski.

In the paper "How to select a looser'' Prodinger was analyzing an algorithm where $n$ participants are selecting a leader by flipping <underline>fair</underline> coins, where recursively, the 0-party (those who i.e. have tossed heads) continues until the leader is chosen. We give an answer to the question stated in the Prodinger's paper – what happens if not a 0-party is recursively looking for a leader but always a party with a smaller cardinality. We show the lower bound on the number of rounds of the greedy algorithm (for <underline>fair</underline> coin).

Section: Proceedings

30. On the Number of 2-Protected Nodes in Tries and Suffix Trees

Jeffrey Gaither ; Yushi Homma ; Mark Sellke ; Mark Daniel Ward.

We use probabilistic and combinatorial tools on strings to discover the average number of 2-protected nodes in tries and in suffix trees. Our analysis covers both the uniform and non-uniform cases. For instance, in a uniform trie with $n$ leaves, the number of 2-protected nodes is approximately 0.803$n$, plus small first-order fluctuations. The 2-protected nodes are an emerging way to distinguish the interior of a tree from the fringe.

Section: Proceedings

31. Analysis of Digital Expansions of Minimal Weight

Florian Heigl ; Clemens Heuberger.

Extending an idea of Suppakitpaisarn, Edahiro and Imai, a dynamic programming approach for computing digital expansions of minimal weight is transformed into an asymptotic analysis of minimal weight expansions. The minimal weight of an optimal expansion of a random input of length $\ell$ is shown to be asymptotically normally distributed under suitable conditions. After discussing the general framework, we focus on expansions to the base of $\tau$, where $\tau$ is a root of the polynomial $X^2- \mu X + 2$ for $\mu \in \{ \pm 1\}$. As the Frobenius endomorphism on a binary Koblitz curve fulfils the same equation, digit expansions to the base of this $\tau$ can be used for scalar multiplication and linear combination in elliptic curve cryptosystems over these curves.

Section: Proceedings

32. Mixing times of Markov chains on 3-Orientations of Planar Triangulations

Sarah Miracle ; Dana Randall ; Amanda Pascoe Streib ; Prasad Tetali.

Given a planar triangulation, a 3-orientation is an orientation of the internal edges so all internal vertices have out-degree three. Each 3-orientation gives rise to a unique edge coloring known as a $\textit{Schnyder wood}$ that has proven useful for various computing and combinatorics applications. We consider natural Markov chains for sampling uniformly from the set of 3-orientations. First, we study a "triangle-reversing'' chain on the space of 3-orientations of a fixed triangulation that reverses the orientation of the edges around a triangle in each move. We show that (i) when restricted to planar triangulations of maximum degree six, the Markov chain is rapidly mixing, and (ii) there exists a triangulation with high degree on which this Markov chain mixes slowly. Next, we consider an "edge-flipping'' chain on the larger state space consisting of 3-orientations of all planar triangulations on a fixed number of vertices. It was also shown previously that this chain connects the state space and we prove that the chain is always rapidly mixing.

Section: Proceedings

33. Toward the asymptotic count of bi-modular hidden patterns under probabilistic dynamical sources: a case study

Loïck Lhote ; Manuel E. Lladser.

Consider a countable alphabet $\mathcal{A}$. A multi-modular hidden pattern is an $r$-tuple $(w_1,\ldots , w_r)$, where each $w_i$ is a word over $\mathcal{A}$ called a module. The hidden pattern is said to occur in a text $t$ when the later admits the decomposition $t = v_0 w_1v_1 \cdots v_{r−1}w_r v_r$, for arbitrary words $v_i$ over $\mathcal{A}$. Flajolet, Szpankowski and Vallée (2006) proved via the method of moments that the number of matches (or occurrences) with a multi-modular hidden pattern in a random text $X_1\cdots X_n$ is asymptotically Normal, when $(X_n)_{n\geq1}$ are independent and identically distributed $\mathcal{A}$-valued random variables. Bourdon and Vallée (2002) had conjectured however that asymptotic Normality holds more generally when $(X_n)_{n\geq1}$ is produced by an expansive dynamical source. Whereas memoryless and Markovian sequences are instances of dynamical sources with finite memory length, general dynamical sources may be non-Markovian i.e. convey an infinite memory length. The technical difficulty to count hidden patterns under sources with memory is the context-free nature of these patterns as well as the lack of logarithm-and exponential-type transformations to rewrite the product of non-commuting transfer operators. In this paper, we address a case study in which we have successfully overpassed the aforementioned difficulties and which may illuminate how to address more general cases via auto-correlation operators. Our main result […]

Section: Proceedings

34. Infinite Systems of Functional Equations and Gaussian Limiting Distributions

Michael Drmota ; Bernhard Gittenberger ; Johannes F. Morgenbesser.

In this paper infinite systems of functional equations in finitely or infinitely many random variables arising in combinatorial enumeration problems are studied. We prove sufficient conditions under which the combinatorial random variables encoded in the generating function of the system tend to a finite or infinite dimensional limiting distribution.

Section: Proceedings

35. Simply generated trees, conditioned Galton―Watson trees, random allocations and condensation: Extended abstract

Svante Janson.

We give a unified treatment of the limit, as the size tends to infinity, of random simply generated trees, including both the well-known result in the standard case of critical Galton-Watson trees and similar but less well-known results in the other cases (i.e., when no equivalent critical Galton-Watson tree exists). There is a well-defined limit in the form of an infinite random tree in all cases; for critical Galton-Watson trees this tree is locally finite but for the other cases the random limit has exactly one node of infinite degree. The random infinite limit tree can in all cases be constructed by a modified Galton-Watson process. In the standard case of a critical Galton-Watson tree, the limit tree has an infinite "spine", where the offspring distribution is size-biased. In the other cases, the spine has finite length and ends with a vertex with infinite degree. A node of infinite degree in the limit corresponds to the existence of one node with very high degree in the finite random trees; in physics terminology, this is a type of condensation. In simple cases, there is one node with a degree that is roughly a constant times the number of nodes, while all other degrees are much smaller; however, more complicated behaviour is also possible. The proofs use a well-known connection to a random allocation model that we call balls-in-boxes, and we prove corresponding results for this model.

Section: Proceedings