Counting ℓ-letter subwords in compositions

Let N be the set of all positive integers and let A be any ordered subset of N. Recently, Heubach and Mansour enumerated the compositions of n with m parts in A that contain the subword τ exactly r times, where τ ∈ {111, 112, 221, 123}. Our aims are (1) to generalize these results, i.e., to enumerate the compositions of n with m parts in A that contain an ℓ-letter subword, and (2) to analyze the number of compositions of n with m parts that avoid an ℓ-letter pattern, for given ℓ. We use tools such as asymptotic analysis of generating functions, leading to Gaussian asymptotics.


Introduction
A composition $\pi = \pi_1 \pi_2 \cdots \pi_m$ of n ∈ N is an ordered collection of one or more positive integers whose sum is n. The number of summands, namely m, is called the number of parts of the composition. Let $A = \{a_1, a_2, \ldots, a_d\}$ or $A = \{a_1, a_2, a_3, \ldots\}$, where $a_1 < a_2 < \cdots$ are positive integers. We will refer to such a set as an ordered subset of N. We consider compositions of n with parts in A, i.e., compositions whose parts are restricted to a set A ⊆ N.
The problem of counting the compositions (words) that contain a given set of strings as substrings is a classical problem in combinatorics. This problem can be studied using the transfer matrix method (see [12, Section 4.7] and [6]). In particular, it is a well-known fact that the generating function of such compositions is always rational [12, Theorem 4.7.2]. For example, following [12, Example 4.7.5], one finds that the generating function for the number of compositions with parts in {1, 2, 3}, where neither 11 nor 23 appears as two consecutive parts, is given by
$$\frac{1+x}{1-2x^3-x^4-x^2+x^5+x^6}.$$
We note that the term pattern in [12, Section 4.7] and [6] is used to denote an exact string rather than its type with respect to order isomorphism. For example, the pattern 11 is the actual string 11, whereas in our setting an occurrence of the subword pattern 11 is any substring aa. As another example, the pattern "rise, non-rise" as defined in [6] includes the subword patterns 121, 122, 132, 231 as defined in this paper. However, we show that each of the subword patterns 121, 122, 132 and 231 is avoided by a different number of compositions of n.
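The rational generating function quoted above can be checked numerically. The following is a minimal sketch, not the transfer matrix method itself: it counts the restricted compositions directly and compares them with the power-series coefficients of the quotient. The helper names `count_bruteforce` and `series_coefficients` are illustrative choices, not notation from the paper.

```python
PARTS = (1, 2, 3)
FORBIDDEN = {(1, 1), (2, 3)}  # adjacent pairs of parts that may not appear


def count_bruteforce(n):
    """Count compositions of n with parts in PARTS and no forbidden adjacent pair."""
    def extend(remaining, last):
        if remaining == 0:
            return 1
        return sum(
            extend(remaining - p, p)
            for p in PARTS
            if p <= remaining and (last, p) not in FORBIDDEN
        )
    return extend(n, None)


def series_coefficients(num, den, terms):
    """Power-series coefficients of num(x)/den(x); assumes den[0] == 1."""
    coeffs = []
    for n in range(terms):
        c = num[n] if n < len(num) else 0
        # Convolution recurrence obtained from den(x) * series(x) = num(x).
        c -= sum(den[k] * coeffs[n - k] for k in range(1, min(n, len(den) - 1) + 1))
        coeffs.append(c)
    return coeffs


NUM = [1, 1]                     # 1 + x
DEN = [1, 0, -1, -2, -1, 1, 1]   # 1 - x^2 - 2x^3 - x^4 + x^5 + x^6

from_series = series_coefficients(NUM, DEN, 15)
from_counting = [count_bruteforce(n) for n in range(15)]
print(from_series)
print(from_series == from_counting)
```

The comparison covers only the first 15 coefficients, so it is a sanity check rather than a proof.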
In this paper we analyze, in several cases, different asymptotic parameters for the number of compositions of n with m parts in N that do not contain a given subword pattern (see below for the precise definition). To do so, we first need explicit formulas for the generating functions for the number of compositions of n with m parts in N that do not contain given subword patterns. While the transfer matrix method expresses these generating functions in terms of determinants, we present here another technique which leads to compact formulas.

1365-8050 © 2006 Discrete Mathematics and Theoretical Computer Science (DMTCS), Nancy, France
Let $\pi = \pi_1 \pi_2 \cdots \pi_m$ be any composition of n with m parts in A and let $\sigma = \sigma_1 \sigma_2 \cdots \sigma_{m'}$ be any word of length $m'$, where $m \ge m'$. An occurrence of σ in π is a subword $\pi_i \cdots \pi_{i+m'-1}$ which is order-isomorphic to σ, i.e., $\pi_{i-1+a} < \pi_{i-1+b}$ if and only if $\sigma_a < \sigma_b$ for all $1 \le a, b \le m'$. We denote the number of occurrences of σ in π by σ(π), and in such a context σ is usually called a pattern of length $m'$ (or $m'$-letter pattern). For example, the number of occurrences of the pattern 112 in the composition π = 11223113 of 14 with 8 parts is 3, namely, 112, 223 and 113; thus 112(π) = 3. If σ(π) = 0, we say that π avoids σ. Our aim is to count the number of compositions of n with m parts in A which contain an ℓ-letter pattern σ exactly r times, and to analyze the number of compositions of n with m parts in A which avoid an ℓ-letter pattern σ.
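The definition of an occurrence can be made concrete with a short sketch. It compares every ordered pair of positions, so that equal letters in the pattern force equal parts in the subword. The function names are illustrative, and compositions are represented as Python lists of positive integers.

```python
def is_order_isomorphic(u, v):
    """True if the equal-length words u and v have the same relative order."""
    return all(
        (u[a] < u[b]) == (v[a] < v[b])
        for a in range(len(u))
        for b in range(len(u))
    )


def occurrences(pattern, composition):
    """Number of windows of consecutive parts that are order-isomorphic to pattern."""
    k = len(pattern)
    return sum(
        is_order_isomorphic(composition[i:i + k], pattern)
        for i in range(len(composition) - k + 1)
    )


pi = [1, 1, 2, 2, 3, 1, 1, 3]       # the composition 11223113 of 14
print(occurrences([1, 1, 2], pi))   # the example in the text reports 3
```

A composition avoids the pattern exactly when `occurrences` returns 0.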
We denote the set of compositions of n with parts in A that avoid a pattern τ (resp. with m parts) by

Recently, Heubach and Mansour [8] found that

(1.1)

where $t_p(A) = \sum_{1 \le i_1 < \cdots < i_p \le d} y^p \prod_{j=1}^{p} x^{a_{i_j}}$. Our first goal is to generalize these results to obtain explicit formulas for the generating function $C^A_\tau(x, y, z)$, where τ is an ℓ-letter pattern. In particular, we find explicit formulas for the generating functions $C^A_\tau(x, y, z)$, where τ ∈ {212, 121, 132, 213}; see Section 2. We note that our generating function $C^A_\tau(x, y, z)$ for x = 1 and A = {1, 2, ..., k} gives the generating function for the number of k-ary words that avoid τ; thus this generalizes the results of Burstein and Mansour [2].
A Carlitz composition of n, introduced in [3], is a composition of n in which no adjacent parts are the same, namely, one that avoids the pattern 11. More generally, a k-Carlitz composition of n is a composition of n in which no k consecutive parts are the same, that is, a composition that avoids the pattern 11...1 of length k. We analyze asymptotic parameters of the number of k-Carlitz compositions of n with m parts in N; see Section 3. We use tools such as asymptotic analysis of generating functions (based on their singularities) leading to Gaussian asymptotics (see [1], [5, Theorem IX.8], and [9]).
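As an illustration of the definition of k-Carlitz compositions, the following sketch counts them with a memoized recursion over the state (remaining sum, last part, current run length). It is a direct enumeration under the definition above, not the generating-function method of this paper, and the name `k_carlitz_counts` is an illustrative choice.

```python
from functools import lru_cache


def k_carlitz_counts(k, max_n):
    """Number of compositions of n = 0..max_n with no k consecutive equal parts."""

    @lru_cache(maxsize=None)
    def extend(remaining, last, run):
        # `last` is the previous part (0 if there is none); `run` is the length
        # of the block of equal parts that currently ends the composition.
        if remaining == 0:
            return 1
        total = 0
        for part in range(1, remaining + 1):
            new_run = run + 1 if part == last else 1
            if new_run < k:  # a run of length k would be an occurrence of 11...1
                total += extend(remaining - part, part, new_run)
        return total

    return [extend(n, 0, 0) for n in range(max_n + 1)]


print(k_carlitz_counts(2, 6))   # Carlitz compositions (avoid 11)
print(k_carlitz_counts(3, 6))   # compositions avoiding 111
```

For k = 2 the recursion forbids any repeated adjacent part, which is the classical Carlitz case.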

Enumeration of compositions
In this section we give an explicit formula for the generating function $C^A_\tau(x, y, z)$ for several interesting cases of ℓ-letter patterns τ. To do so we need the following notation. Denote the generating function for all compositions $\pi \in C^A_{\tau;r}(n; m)$ with $\pi_1 \cdots \pi_s = \alpha_1 \cdots \alpha_s$ by $C^A_\tau(\alpha_1 \cdots \alpha_s \mid x, y, z)$.

2.1 The pattern τ = 11...1
Let τ = 11...1 be a pattern of length ℓ. By the definitions we have that

which is equivalent to, using (2.1), 3) and induction on s we obtain that

for all s = 1, 2, ..., ℓ. Hence, putting s = 1 and using Equation (2.1) we get

which gives the following result.
Theorem 2.1 Let A be any ordered subset of N and let τ = 11...1 be a pattern of length ℓ. Then

For instance, Theorem 2.1 for z = 0 and A = N gives that the generating function for the number of compositions of n with m parts in N that avoid τ = 11...1 (ℓ-letter pattern) is given by

In particular, if ℓ = 3 then we obtain [8, Theorem 2.1], which is also given for easy reference in (1.1).

2.2 The pattern τ = 11...12
Let τ = 11...12 be a pattern of length ℓ and let $A = \{a_1, \ldots, a_d\}$ be any ordered subset of N. Define $d^A_{\tau;r}(n; m)$ to be the number of compositions $\pi \in C^A_{\tau;r}(n; m)$ such that the string concatenation $\pi(a_d + 1)$ (note that $a_d + 1 > a_d$) contains τ exactly r times, and denote the corresponding generating function by 2), where and $r_1 + r_2 = r$. In terms of generating functions, the above translates into

or, equivalently,

Let us now find the recurrence for $D^A_\tau(x, y, z)$. Let $\pi \in C^A(n; m)$ be such that the string concatenation $\pi(a_d + 1)$ contains τ exactly r times, and has exactly s letters $a_d$. Then $\pi = \pi^{(1)} a_d \pi^{(2)} \cdots a_d \pi^{(s)} a_d \pi^{(s+1)}$ for some

where δ(ξ) = 1 if ξ holds and δ(ξ) = 0 otherwise. Translating to generating functions, we obtain

which, after summing over s, yields

These two recurrences, together with $D^{\emptyset}_\tau(x, y, z) = C^{\emptyset}_\tau(x, y, z) = 1$ and induction, yield the following theorem.
Theorem 2.2 Let A be any ordered subset of N and let τ = 11...12 be a pattern of length ℓ. Then

For example, Theorem 2.2 for z = 0 and A = N gives that the generating function for the number of compositions of n with m parts in N which avoid the ℓ-letter pattern τ = 11...12 is given by

In particular, if ℓ = 3 then we get [8, Theorem 2.2], which is also listed in (1.1). Arguments similar to those in the proof of Theorem 2.2, with $a_d$ replaced by $a_1$, lead to the following result.
Theorem 2.3 Let A be any ordered subset of N and let τ = 22...21 be a pattern of length ℓ. Then

For example, Theorem 2.3 for z = 0 and A = N gives that the generating function for the number of compositions of n with m parts in N which avoid the ℓ-letter pattern τ = 22...21 is given by

2.3 The subword τ = 211...112
Let $A = \{a_1, \ldots, a_d\}$ be any ordered subset of N, and let τ = 211...112 be a pattern of length ℓ. We define $d^A_{\tau;r}(n; m)$ to be the number of compositions $\pi' \in C^A(n; m)$ such that the string concatenation $(a_d + 1)\pi'(a_d + 1)$ contains τ exactly r times, and denote the corresponding generating function by $D^A_\tau(x, y, z)$. Let $\pi \in C^A_{\tau;r}(n; m)$ be such that π contains s occurrences of the letter $a_d$. For s = 0, the generating function for the number of such compositions π is given by $C^{A'}_\tau(x, y, z)$, where $A' = \{a_1, \ldots, a_{d-1}\}$, and for s ≥ 1, by

Hence, if we sum over all s ≥ 0, we get

On the other hand, the string concatenation $(a_d + 1)\pi'(a_d + 1)$, with $\pi'$ as above, contains an occurrence of τ involving the two letters $a_d + 1$ if and only if $\pi'$ is a constant string of length ℓ − 2; otherwise, $\tau((a_d + 1)\pi'(a_d + 1)) = \tau(\pi')$ (that is, the number of occurrences of τ in the string concatenation is the same as the number of occurrences of τ in $\pi'$). Translating to generating functions, we obtain

Therefore, using the initial conditions $C^{\emptyset}_\tau(x, y, z) = D^{\emptyset}_\tau(x, y, z) = 1$ and induction on d, we get the following theorem.
Theorem 2.4 Let A be any ordered subset of N and let τ = 211...112 be a pattern of length ℓ. For all ℓ ≥ 3,

x a j y 1+x a j y ℓ−1 (1−z)

For example, Theorem 2.4 for z = 0 and A = N gives that the generating function for the number of compositions of n with m parts in N which avoid the ℓ-letter pattern τ = 21...12 is given by

In particular, for ℓ = 3 we have that $C^N_{212}(x, 1) =$

and the sequence for the number of compositions of n with parts in N that avoid 212 for n = 0 to n = 20 is given by 1, 1, 2, 4, 8, 15, 30, 58, 114, 222, 434, 846, 1655, 3230, 6310, 12322, 24067, 46997, 91791, 179262, 350106. Note that the first time the pattern 212 can occur is for n = 5, as the composition 212.
Arguments similar to those in the proof of Theorem 2.4, with $a_d$ replaced by $a_1$, lead to the following result.
Theorem 2.5 Let A be any ordered subset of N and let τ = 122...221 be a pattern of length ℓ. Then

x a j y 1+x a j y ℓ−1 (1−z)

For example, Theorem 2.5 for z = 0 and A = N gives that the generating function for the number of compositions of n with m parts in N which avoid the ℓ-letter pattern τ = 12...21 is given by

In particular, for ℓ = 3 we have that $C^N_{121}(x, 1) =$

and the sequence for the number of compositions of n with parts in N that avoid 121 for n = 0 to n = 20 is given by 1, 1, 2, 4, 7, 13, 24, 44, 82, 153, 284, 528, 981, 1820, 3378, 6270, 11638, 21608, 40121, 74494, 138317. Note that the first time the pattern 121 can occur is for n = 4, as the composition 121. We remark that in [8] the two patterns 212 and 121 were not treated individually, but as part of a peak and valley.
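The sequences quoted for 212 and 121 can be cross-checked by direct enumeration. The sketch below is independent of the generating functions: it grows compositions one part at a time and prunes a branch as soon as its last letters form an occurrence of the pattern. The function names are illustrative.

```python
def avoids_at_end(word, pattern):
    """True unless the last len(pattern) letters of word form an occurrence."""
    k = len(pattern)
    if len(word) < k:
        return True
    tail = word[-k:]
    return not all(
        (tail[a] < tail[b]) == (pattern[a] < pattern[b])
        for a in range(k)
        for b in range(k)
    )


def avoider_counts(pattern, max_n):
    """counts[n] = number of compositions of n (parts in N) that avoid pattern."""
    counts = [0] * (max_n + 1)
    word = []

    def extend(total):
        counts[total] += 1  # `word` is an avoiding composition of `total`
        for part in range(1, max_n - total + 1):
            word.append(part)
            if avoids_at_end(word, pattern):
                extend(total + part)
            word.pop()

    extend(0)
    return counts


print(avoider_counts([2, 1, 2], 12))
print(avoider_counts([1, 2, 1], 12))
```

Every prefix of an avoiding composition is itself avoiding, so each avoiding composition is visited exactly once. The running time grows roughly like the counts themselves, which is fine for n up to about 20.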

General patterns
In this section we consider several general cases of patterns. We start with the following result, which generalizes Theorem 2.4.
Proof: Let $\pi \in C^A(n; m)$. The generating function for the number of compositions π which do not contain the letter $a_d$ and contain τ exactly r times is given by $C^{A'}_\tau(x, y, z)$, where $A' = \{a_1, \ldots, a_{d-1}\}$. Now assume that $\pi = \pi^{(1)} a_d \pi^{(2)}$, where $\pi^{(1)}$ does not contain the part $a_d$. Then the number of occurrences of τ in π satisfies $\tau(\pi) = \tau(\pi^{(1)}) + \tau(a_d \pi^{(2)})$, so the generating function for the number of such compositions π is given by $x^{a_d} y\, C^{A'}_\tau(x, y, z)\, D^A_\tau(x, y, z)$, where $D^A_\tau(x, y, z)$ is the generating function for the number of compositions $\pi \in C^A(n; m)$ such that $a_d \pi$ contains τ exactly r times. Therefore,

On the other hand, let $a_d \pi \in C^A(n + a_d; m + 1)$. If π does not contain the letter $a_d$, then the generating function for the number of such π is given by $C^{A'}_\tau(x, y, z)$. Otherwise, let i be the position of the leftmost letter $a_d$ and let $\pi|_i$ be the prefix of π of length i; then the generating function for these compositions is given by

Hence, from the above two equations, we obtain

so, by induction on d with the initial condition $C^{\{a_1, \ldots, a_{p-1}\}}_\tau(x, y, z) = \frac{1}{1 - y \sum_{j=1}^{p-1} x^{a_j}}$, we get the desired result. □

Now, let τ = 1τ′1 be a pattern, where τ′ is a pattern with parts in {2, 3, ...}. Then arguments similar to those in the proof of Theorem 2.6, with $a_d$ replaced by $a_1$, lead to the following result.
Theorem 2.7 Let A be any ordered subset of N, and let τ = 1τ′1, where τ′ is a pattern with ℓ − 2 letters in {2, 3, ...}. Then

where $B_j$ is the set of all the compositions $\beta = a_{i_1} \cdots a_{i_{\ell-2}}$ with parts in $\{a_j, \ldots, a_d\}$ such that β is order-isomorphic to τ′.
Note that the above theorem generalizes Theorem 2.5. Let τ = pτ′(p + 1) be a pattern such that τ′ is a pattern with letters in {1, 2, ..., p − 1}. This case is treated in a similar manner to the case τ = pτ′p. As a result, we obtain the following theorem.
For instance, Theorem 2.9 for z = 0, y = 1, A = N and τ = 132 gives

and the sequence for the number of compositions of n with parts in N that avoid 132 for n = 0 to n = 20 is given by 1,

Note that the first time the pattern 132 can occur is for n = 6, as the composition 132. We remark that in [8] the two patterns 132 and 213 were not treated individually, but as part of a peak and valley.
3 Asymptotic distribution of the number of compositions that avoid an ℓ-letter pattern
We will now use methods from asymptotic analysis as described in [5, Theorem IX.8] (see also [1]) to analyze different asymptotic parameters for the number of compositions of n with m parts in N that avoid an ℓ-letter pattern, for given ℓ ≥ 3. We regard the generating function as a complex function, and indicate this by using the variables z, w instead of the variables x, y. For any given pattern τ, the function $C^N_\tau(z, w)$ is a bivariate function that is analytic at (0, 0) and has nonnegative coefficients there. The asymptotic behavior of $C^N_\tau(n)$, the number of compositions of n with parts in N that avoid a pattern τ, is determined by the dominant pole of the function $C^N_\tau(z, 1) = 1/h_\tau(z)$, i.e., the smallest positive root $z^*_\tau$ of $h_\tau(z)$ (see for example [13]). For instance,

To be sure that $z^*_\tau$, for fixed τ, is the dominant singularity, we use the argument principle [7]. Figures 1 and 2 present the curve $h_\tau(z)$ with $z = 0.6e^{it}$, where $h_\tau(z)$ is expanded as a series up to $O(z^{30})$. Thus, the winding number is 1 (this result can also be obtained analytically: for example, Rouché's theorem [7] gives that the functions $h_\tau(z)$ and $z - z^*_\tau$ have the same number of zeros in the domain |z| < 0.6, thus $h_\tau(z)$ has only one simple zero there), so that the function $h_\tau(z)$ has only one root $z^*_\tau$ in the domain |z| < 0.6. Thus, singularity analysis gives that

We are now ready for bivariate asymptotics using [5, Theorem IX.8], which allows us to obtain results on the asymptotic distribution of the number of parts in the set of compositions of n with parts in N that avoid an ℓ-letter pattern (all such compositions being considered equiprobable). Take |z| ≤ 0.6 and $|w - 1| \le \epsilon_\tau$, where $\epsilon_\tau$ is a small positive real number. Because of the form of the terms involving $wz^j$ in the sum in the denominator of $C_\tau(z, w) = 1/h_\tau(z, w)$, the function $h_\tau(z, w)$ remains analytic there. Thus, there exists a nonconstant function $\rho_\tau(w)$ with $h_\tau(\rho_\tau(w), w) = 0$ that is analytic for w in a sufficiently small neighborhood of 1 (by the Weierstrass preparation theorem or the implicit function theorem). The nondegeneracy conditions are easily verified by numerical computations.
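The role of the dominant pole can be illustrated with the sequences listed in Section 2 for the patterns 212 and 121. If $1/h_\tau(z)$ has a single simple pole of smallest modulus at $z^*_\tau$, then the ratio of consecutive coefficients $a_n / a_{n+1}$ tends to $z^*_\tau$. The sketch below only computes these ratios from the listed data. It does not evaluate $h_\tau$, and the function name `pole_estimates` is an illustrative choice.

```python
# Numbers of compositions of n = 0..20 avoiding 212 and 121, as listed in Section 2.
AVOID_212 = [1, 1, 2, 4, 8, 15, 30, 58, 114, 222, 434, 846, 1655, 3230,
             6310, 12322, 24067, 46997, 91791, 179262, 350106]
AVOID_121 = [1, 1, 2, 4, 7, 13, 24, 44, 82, 153, 284, 528, 981, 1820,
             3378, 6270, 11638, 21608, 40121, 74494, 138317]


def pole_estimates(counts):
    """Ratios a_n / a_{n+1}, which approach z* under the single-simple-pole assumption."""
    return [counts[n] / counts[n + 1] for n in range(len(counts) - 1)]


for name, counts in (("212", AVOID_212), ("121", AVOID_121)):
    last_three = [round(r, 5) for r in pole_estimates(counts)[-3:]]
    print(name, last_three)
```

The last ratios for both patterns lie between 0.5 and 0.6, which is consistent with a single root of $h_\tau$ inside the disc |z| < 0.6.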
For instance, if τ = 112 then the function h .
Substituting w = 1 and using the fact that $\rho(1) = z^*_{112}$, we get that $\rho'(1) = -0.24714426\ldots$. Hence, the variability condition holds in the case τ = 112, and therefore [5, Theorem IX.8] applies. Hence, the number of parts in compositions of a large given n that avoid the pattern 112 is asymptotically Gaussian. The Gaussian asymptotics for any other pattern τ given in Section 2 can be studied in the same way as for 112.
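The distribution of the number of parts can be inspected for small n by direct enumeration. The following sketch tabulates, for a fixed n, how many compositions avoiding 112 have m parts, and reports the empirical mean and variance. It is an exploratory check only and does not reproduce the constants of the Gaussian limit law. The function names and the choice n = 16 are illustrative assumptions.

```python
def parts_distribution(pattern, n):
    """dist[m] = number of compositions of n with m parts that avoid pattern."""
    k = len(pattern)
    dist = [0] * (n + 1)
    word = []

    def tail_is_occurrence():
        if len(word) < k:
            return False
        tail = word[-k:]
        return all(
            (tail[a] < tail[b]) == (pattern[a] < pattern[b])
            for a in range(k)
            for b in range(k)
        )

    def extend(remaining):
        if remaining == 0:
            dist[len(word)] += 1
            return
        for part in range(1, remaining + 1):
            word.append(part)
            if not tail_is_occurrence():  # prune as soon as an occurrence appears
                extend(remaining - part)
            word.pop()

    extend(n)
    return dist


def mean_and_variance(dist):
    """Mean and variance of the number of parts under the uniform distribution."""
    total = sum(dist)
    mean = sum(m * c for m, c in enumerate(dist)) / total
    var = sum((m - mean) ** 2 * c for m, c in enumerate(dist)) / total
    return mean, var


dist = parts_distribution([1, 1, 2], 16)
print(dist)
print(mean_and_variance(dist))
```

Plotting `dist` for increasing n gives a visual impression of the bell shape, although such small values of n can only suggest the limiting behavior.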
In order to obtain an explicit Gaussian asymptotic for each case of τ, let (see [5, Section IX.5] and [1])

For the above numerical calculations we expanded $h_\tau(z, w)$ as a series up to $O(z^{30})$ and used the values of $z^*_\tau$ which are given in (3.1). Therefore, by using [5, Theorem IX.8] (see also [1]) we can state the following result.