Enumeration of words by the sum of differences between adjacent letters



Introduction
Given a totally ordered set of k letters [k] = {1, 2, . . ., k}, we refer to the elements of [k]^n as k-ary words of length n, or just words. Let σ be any k-ary word of length n. We define u(σ) = u_{n,k}(σ) to be the sum of the absolute differences between each pair of adjacent letters of σ, that is,

u(σ) = ∑_{j=2}^{n} |σ_j − σ_{j−1}|,

and u is the total variation (or just variation) of σ. For example, the total variation of the word σ = 1121415221 ∈ [5]^{10} is u(σ) = 0 + 1 + 1 + 3 + 3 + 4 + 3 + 0 + 1 = 16.
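The total variation is straightforward to compute directly from the definition; the following sketch (the function name is ours, not from the paper) reproduces the example above.

```python
def total_variation(word):
    """u(sigma): sum of absolute differences between adjacent letters."""
    return sum(abs(a - b) for a, b in zip(word, word[1:]))

# The example word sigma = 1121415221 over the alphabet [5]:
sigma = [1, 1, 2, 1, 4, 1, 5, 2, 2, 1]
print(total_variation(sigma))  # -> 16
```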
Several authors have studied the statistic u(σ) in different contexts. For example, Curtis [5] discussed the problem of finding a word σ = σ_1σ_2⋯σ_n over the alphabet {a_1, a_2, . . ., a_k} which maximizes the risk function ∑_{j=1}^{n} |σ_j − σ_{j−1}|, where σ_0 = σ_n. Chao and Liang [3] studied the permutations of n distinct letters arranged on a circle or a line and found the arrangements that maximize the risk function. Later, Cohen and Tonkes [4] (see also [5, 9]) analyzed optimal permutations for multisets.
Motivated by the above results, we pose the following problem:

• Given a nonnegative integer m and a multiset of n letters M = {a_1, a_2, . . ., a_n}, enumerate all permutations σ of M with u(σ) = m.

This paper addresses the restricted case of the problem in which σ ∈ [k]^n for a fixed integer k > 0.
More precisely, let f_k(n, m) be the number of k-ary words σ of length n with u(σ) = m. Clearly, u(σ) ≤ u(1k1k⋯) = (n − 1)(k − 1). We denote the corresponding generating function by F_k(n; q), that is,

F_k(n; q) = ∑_{σ ∈ [k]^n} q^{u(σ)} = ∑_{m ≥ 0} f_k(n, m) q^m,

and define

F_k(x, q) = ∑_{n ≥ 0} F_k(n; q) x^n.

In the next section we find two different expressions for the bivariate generating function F_k(x, q). The first expression is obtained by using the scanning-element algorithm as described in [6] and is given by the following result.
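For small parameters, the coefficients f_k(n, m) of F_k(n; q) can be tabulated by brute force; the sketch below (helper names are ours) counts words by total variation.

```python
from itertools import product

def f_poly(n, k):
    """Return {m: f_k(n, m)}, the number of k-ary words of length n with u = m."""
    counts = {}
    for word in product(range(1, k + 1), repeat=n):
        m = sum(abs(a - b) for a, b in zip(word, word[1:]))
        counts[m] = counts.get(m, 0) + 1
    return counts

poly = f_poly(3, 2)        # coefficients of F_2(3; q)
print(poly)                # -> {0: 2, 1: 4, 2: 2}
print(sum(poly.values()))  # -> 8, i.e. 2**3 words in total
```

Setting q = 1 recovers the total count k^n, and the largest exponent with a nonzero coefficient is (n − 1)(k − 1), in agreement with the bound above.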
Theorem 1.1 The bivariate generating function F_k(x, q) is given by , where

A crucial device in proving Theorem 1.1 is a new statistic on the set of k-ary words of length n, defined as follows. The complete total variation (or just complete variation) of a nonempty word σ is v(σ) = σ_1 + u(σ). The bivariate generating function F_k(x, q) can also be expressed in terms of Chebyshev polynomials of the second kind; see Theorem 1.2. Chebyshev polynomials of the second kind are defined by U_r(cos θ) = sin((r + 1)θ)/sin θ. For example, U_0(t) = 1, U_1(t) = 2t, and U_2(t) = 4t² − 1. Chebyshev polynomials were invented for the needs of approximation theory, but are also widely used in various other branches of mathematics, including algebra, combinatorics, and number theory (see [12]).
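Chebyshev polynomials of the second kind satisfy the standard three-term recurrence U_{r+1}(t) = 2t·U_r(t) − U_{r−1}(t) with U_0(t) = 1 and U_1(t) = 2t; a minimal numerical check of the defining identity U_r(cos θ) = sin((r + 1)θ)/sin θ (function name ours):

```python
import math

def chebyshev_U(r, t):
    """Evaluate U_r(t) via the recurrence U_{r+1} = 2t*U_r - U_{r-1}."""
    u_prev, u_curr = 1.0, 2.0 * t  # U_0, U_1
    if r == 0:
        return u_prev
    for _ in range(r - 1):
        u_prev, u_curr = u_curr, 2.0 * t * u_curr - u_prev
    return u_curr

theta = 0.7
for r in range(6):
    lhs = chebyshev_U(r, math.cos(theta))
    rhs = math.sin((r + 1) * theta) / math.sin(theta)
    assert abs(lhs - rhs) < 1e-12
```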
Theorem 1.2 The bivariate generating function F_k(x, q) is given by

where . Note that F_k(x, 1) = 1/(1 − kx), which is the generating function for all k-ary words of length n. Moreover, using [7, Theorem IX.9 (meromorphic schema)] with ρ = 1/k we obtain that the random variable u_{n,k} with probability generating function F_k(n; q)/k^n converges in distribution, after standardization, to a Gaussian variable, with a speed of convergence that is O(1/√n), and the mean and the standard deviation of u_{n,k} are asymptotically linear in n.
In our next theorem, we determine the mean and the variance of the total variation exactly.
Theorem 1.3 The mean and the variance of the total variation of a uniformly chosen k-ary word of length n are given by

E(u_{n,k}) = (n − 1)(k² − 1)/(3k)   and   Var(u_{n,k}) = (k² − 1)((6n − 7)k² + 6n − 2)/(90k²),

respectively.
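For small n ≥ 2 and k, the closed forms E(u_{n,k}) = (n − 1)(k² − 1)/(3k) and Var(u_{n,k}) = (k² − 1)((6n − 7)k² + 6n − 2)/(90k²) can be checked exactly by enumerating all of [k]^n with rational arithmetic; a sketch (helper names ours):

```python
from fractions import Fraction
from itertools import product

def mean_and_variance(n, k):
    """Exact mean and variance of u over all k**n words, as Fractions."""
    values = [sum(abs(a - b) for a, b in zip(w, w[1:]))
              for w in product(range(1, k + 1), repeat=n)]
    total = Fraction(len(values))
    mean = sum(map(Fraction, values)) / total
    var = sum((Fraction(v) - mean) ** 2 for v in values) / total
    return mean, var

for n in (2, 3, 4):
    for k in (2, 3):
        mean, var = mean_and_variance(n, k)
        assert mean == Fraction((n - 1) * (k * k - 1), 3 * k)
        assert var == Fraction((k * k - 1) * ((6 * n - 7) * k * k + 6 * n - 2),
                               90 * k * k)
```

For n = 1 a word has no adjacent pairs, so u ≡ 0 and both moments vanish; the variance formula is used for n ≥ 2.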
We present two proofs of the above theorem. The first proof relies on Theorem 1.1, while the second consists of an elegant derivation based solely on counting special types of subsets of [k]^n and the additivity of the expected value operator; see Section 3.

Proof of Theorems 1.1 and 1.2
In this section we establish the formula for the generating function F_k(x, q) asserted in Theorems 1.1 and 1.2. To this end, let

F_k(x, q | i) = ∑_{n ≥ 1} ∑_{σ ∈ [k]^n, σ_1 = i} x^n q^{u(σ)},

the generating function for the nonempty k-ary words whose first letter is i.
Our first result is a recurrence relation for the functions F_k(x, q | i), i = 1, 2, . . ., k.

Lemma 2.1 For all i = 1, 2, . . ., k,

F_k(x, q | i) = x + x ∑_{j=1}^{k} q^{|i−j|} F_k(x, q | j).

Proof: From the definitions we can state that a word σ with σ_1 = i is either the one-letter word i or of the form iσ′ for a nonempty word σ′ with u(σ) = |i − σ′_1| + u(σ′); the two cases contribute x and x ∑_{j=1}^{k} q^{|i−j|} F_k(x, q | j), respectively. ✷

Now let us solve the recurrence relation in the statement of Lemma 2.1, either by using the kernel method technique (see [2, 6, 8]) or by using the tridiagonal matrix algorithm (see [17]).
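The first-letter decomposition behind the scanning-element approach says, at the level of coefficients, that for n ≥ 2 the count of words of length n starting with i and having variation m equals ∑_j f_k(n − 1, m − |i − j| | j), where the tail starts with j. The sketch below (our notation and helper names) checks this refinement by brute force.

```python
from itertools import product

def counts_by_first(n, k):
    """table[i][m] = number of k-ary words of length n with first letter i, u = m."""
    table = {i: {} for i in range(1, k + 1)}
    for word in product(range(1, k + 1), repeat=n):
        m = sum(abs(a - b) for a, b in zip(word, word[1:]))
        table[word[0]][m] = table[word[0]].get(m, 0) + 1
    return table

n, k = 4, 3
prev, curr = counts_by_first(n - 1, k), counts_by_first(n, k)
for i in range(1, k + 1):
    for m, cnt in curr[i].items():
        # strip the first letter i; the tail is a word of length n-1 starting with j
        rhs = sum(prev[j].get(m - abs(i - j), 0) for j in range(1, k + 1))
        assert cnt == rhs
```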

Kernel method technique
In order to solve the recurrence relation in the statement of Lemma 2.1, we define

F_k(x, q; t) = 1 + ∑_{i=1}^{k} t^{i−1} F_k(x, q | i).

Clearly, F_k(x, q; 1) = F_k(x, q). So, in order to specify F_k(x, q), we need to study a functional relation for F_k(x, q; t).
Proof: Lemma 2.1 gives, for all i = 1, 2, . . ., k, a linear equation for F_k(x, q | i). Multiplying through by t^{i−1} and summing over i ∈ [k], we obtain the claimed relation, which completes the proof. ✷

We note the following useful restatement of Lemma 2.2: We need a result relating F_k(x, q; 1/q) and F_k(x, q; q). We claim that

F_k(x, q; t) = 1 + t^{k−1} (F_k(x, q; 1/t) − 1). (2.2)

To see this, define the complement of σ = σ_1σ_2⋯σ_n ∈ [k]^n to be σ′ = (k + 1 − σ_1)(k + 1 − σ_2)⋯(k + 1 − σ_n). Then it is immediate from the definitions that σ′ is a k-ary word of length n with u(σ′) = u(σ) and σ′_1 − 1 = k − σ_1, as claimed. Equation (2.2) for t = q gives that F_k(x, q; 1/q) = 1 + (1/q^{k−1})(F_k(x, q; q) − 1). Hence, (2.1) reduces to an equation relating F_k(x, q; t) and F_k(x, q; q). This type of functional equation can be solved using the kernel method (see [2, 8]). In this case, if we assume that
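The complement map behind the symmetry argument above sends σ to σ′ with σ′_i = k + 1 − σ_i; it preserves length and total variation while reflecting the first letter. A brute-force check over small alphabets (helper name ours):

```python
from itertools import product

def u(word):
    """Total variation of a word."""
    return sum(abs(a - b) for a, b in zip(word, word[1:]))

k = 4
for n in (1, 2, 3):
    for word in product(range(1, k + 1), repeat=n):
        comp = tuple(k + 1 - a for a in word)
        assert u(comp) == u(word)          # total variation is preserved
        assert comp[0] - 1 == k - word[0]  # first letter is reflected
```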

We can formulate the following theorem. Let G_k(x, q) be the generating function for the number of k-ary words of length n according to the complete variation v, that is, G_k(x, q) = ∑_{n ≥ 0} ∑_{σ ∈ [k]^n} x^n q^{v(σ)}, where the empty word contributes 1.

Theorem 2.3 The generating function G_k(x, q) = 1 − q + qF_k(x, q; q) is given by , where

Proof: Using the definition v(σ) = σ_1 + u(σ), we have that the contribution of the nonempty k-ary words to G_k(x, q) is q(F_k(x, q; q) − 1) and the contribution of the empty word to G_k(x, q) is 1. Thus G_k(x, q) = 1 + q(F_k(x, q; q) − 1). The rest follows from (2.4). ✷

For example, Theorem 2.3 gives

We are now ready to establish Theorem 1.1.
Proof of Theorem 1.1: By substituting t = 1 in (2.3), we obtain the first part of Theorem 1.1. Then, by applying Theorem 2.3, we obtain the second part of Theorem 1.1. ✷

For example, Theorem 1.1 gives

Tridiagonal matrix algorithm
Rewriting the equations in the statement of Lemma 2.1 in matrix form, we get that

which is equivalent to

where

By using the tridiagonal matrix algorithm on (2.5), as described in [17], we get that

where

and ξ = (1 + q² − x(1 − q²))/q. Thus, by Equations (1.1) and (2.7) together with induction on i, we obtain that

On the other hand, Equation (2.6) together with induction on i = k, k − 1, . . ., 1 gives that

x_i = ∑_{j=i}^{k} d_j ∏_{s=i}^{j−1} (−c_s),

for all i = 1, 2, . . ., k. Thus, by using (2.8) we have that

Hence, by using

we complete the proof of Theorem 1.2.
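The tridiagonal matrix (Thomas) algorithm referenced from [17] solves a system whose nonzero entries lie on three diagonals with one forward-elimination sweep and one back substitution, in O(k) operations. The following is a generic numerical sketch of the algorithm (not the paper's symbolic computation; names are ours):

```python
def solve_tridiagonal(a, b, c, d):
    """Solve A x = d, where A has sub-diagonal a, diagonal b, super-diagonal c.

    All lists have length n; a[0] and c[-1] are unused. Thomas algorithm, O(n).
    """
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):                      # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Example: the system [[4,1,0],[1,4,1],[0,1,4]] x = [5,6,5] has solution x = [1,1,1].
x = solve_tridiagonal([0.0, 1.0, 1.0], [4.0, 4.0, 4.0], [1.0, 1.0, 0.0],
                      [5.0, 6.0, 5.0])
```

In the proof above, the same sweep is carried out symbolically over rational functions in x and q rather than over floats.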
Proof of Theorem 1.3
Let [k]^n be the set of k-ary words of length n, where we assume that all words occur equally likely. Then u_{n,k} and v_{n,k} are discrete random variables for the variation and the complete variation, which take values in 0, 1, . . ., (n − 1)(k − 1) and 1, 2, . . ., (n − 1)(k − 1) + k, respectively.
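These value ranges are easy to confirm exhaustively for small n and k, using v(σ) = σ_1 + u(σ); a sketch (helper name ours):

```python
from itertools import product

def u(word):
    return sum(abs(a - b) for a, b in zip(word, word[1:]))

for n in (2, 3, 4):
    for k in (2, 3):
        words = list(product(range(1, k + 1), repeat=n))
        us = [u(w) for w in words]
        vs = [w[0] + u(w) for w in words]
        assert min(us) == 0 and max(us) == (n - 1) * (k - 1)
        assert min(vs) == 1 and max(vs) == (n - 1) * (k - 1) + k
```

The extremes are attained by the constant word 11⋯1 (for the minima) and by the alternating word k1k1⋯ (for the maxima).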

Generating function techniques
As a corollary of Theorem 1.1 we get the following two results.
Theorem 3.1 The means of the random variables u_{n,k} and v_{n,k} are given by

E(u_{n,k}) = (n − 1)(k² − 1)/(3k)   and   E(v_{n,k}) = (n − 1)(k² − 1)/(3k) + (k + 1)/2,

respectively.

Proof: After simple algebraic operations, Theorem 2.3 gives

and Theorem 1.1 gives

Theorem 3.2 The variances of the random variables u_{n,k} and v_{n,k} are given by

Var(u_{n,k}) = (k² − 1)((6n − 7)k² + 6n − 2)/(90k²)

and

Var(v_{n,k}) = (k² − 1)/12 + (k² − 1)((6n − 7)k² + 6n − 2)/(90k²),

respectively.
Proof: After simple algebraic operations, Theorem 1.1 gives

which implies that the ratio of the second factorial moment of the random variable u_{n,k} to k^n is given by

Using this expression, the variance of the random variable u_{n,k} is given by

Similarly, Theorem 2.3 gives an expression with denominator 90(1 − kx)³, which implies that the ratio of the second factorial moment of the random variable v_{n,k} to k^n is given by

Using this expression, the variance of the random variable v_{n,k} is given by

as required. ✷

Additivity of the expected value operator
In this section, we give an alternative proof of Theorems 3.1 and 3.2 (see Theorem 1.3), which consists of an elegant derivation based solely on counting special types of subsets of [k]^n and the additivity of the expected value operator.
In order to do that, let X_i = X_i(σ), i = 1, 2, . . ., n − 1 and σ ∈ [k]^n, be the discrete random variable for the absolute difference |σ_i − σ_{i+1}| between the values in the i-th and (i + 1)-st positions, and let X_0 = X_0(σ) be the discrete random variable for the value of the first letter of σ ∈ [k]^n. From the definitions we obtain that

• P(X_i = 0) = 1/k and P(X_i = m) = 2(k − m)/k², for all m = 1, 2, . . ., k − 1,
• P(X_0 = m) = 1/k for every m ∈ [k],

where P(Y = m) denotes the probability that the discrete random variable Y equals m. Then all X_i's, i = 1, 2, . . ., n − 1, have the same distribution and u_{n,k} = ∑_{i=1}^{n−1} X_i. Thus, the expected value of u_{n,k} is given by

E(u_{n,k}) = (n − 1)E(X_1) = (n − 1) ∑_{m=1}^{k−1} m · 2(k − m)/k² = (n − 1)(k² − 1)/(3k). (3.1)

The discrete random variable v_{n,k} for the complete total variation is given by v_{n,k} = u_{n,k} + X_0 and its expected value equals E(v_{n,k}) = E(u_{n,k}) + E(X_0). Thus, by (3.1) we have that

E(v_{n,k}) = (n − 1)(k² − 1)/(3k) + (k + 1)/2. (3.2)

Hence, Equations (3.1) and (3.2) complete the proof of Theorem 3.1. Now we are ready to prove Theorem 3.2. The random variables X_i and X_j are independent if |i − j| > 1 (that is, the covariance Cov(X_i, X_j) equals 0 for all |i − j| > 1); thus the variance of u_{n,k} is given by

Var(u_{n,k}) = (n − 1)Var(X_1) + 2(n − 2)Cov(X_1, X_2),

with Var(X_1) = E(X_1²) − (E(X_1))² = (k² − 1)(k² + 2)/(18k²). In the expansion of E(X_1X_2) as a triple sum over (σ_1, σ_2, σ_3), the first sum corresponds to cases where σ_1σ_2σ_3 is an increasing sequence or a decreasing sequence, the second sum corresponds to cases where σ_1 = σ_3 ≠ σ_2, and the third sum corresponds to the rest of the cases. Hence, by the fact that Cov(X_1, X_2) = E(X_1X_2) − E(X_1)E(X_2) we get that

Cov(X_1, X_2) = (k² − 1)(k² − 4)/(180k²),

which implies that

Var(u_{n,k}) = (n − 1)(k² − 1)(k² + 2)/(18k²) + (n − 2)(k² − 1)(k² − 4)/(90k²) = (k² − 1)((6n − 7)k² + 6n − 2)/(90k²),

which agrees with Theorem 3.2.
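The point probabilities of X_1 and the covariance Cov(X_1, X_2) can be verified exactly over all triples (σ_1, σ_2, σ_3) with rational arithmetic; the closed form asserted for the covariance below, (k² − 1)(k² − 4)/(180k²), is our computation from these distributions, and the sketch checks it (helper name ours):

```python
from fractions import Fraction
from itertools import product

def check(k):
    triples = list(product(range(1, k + 1), repeat=3))
    total = Fraction(len(triples))
    # P(X_1 = m): distribution of |s1 - s2| over a uniform pair
    for m in range(1, k):
        hits = sum(1 for s in triples if abs(s[0] - s[1]) == m)
        assert Fraction(hits) / total == Fraction(2 * (k - m), k * k)
    # Cov(X_1, X_2) = E(X_1 X_2) - E(X_1) E(X_2), X_1 and X_2 identically distributed
    e1 = sum(Fraction(abs(s[0] - s[1])) for s in triples) / total
    e12 = sum(Fraction(abs(s[0] - s[1]) * abs(s[1] - s[2])) for s in triples) / total
    assert e12 - e1 * e1 == Fraction((k * k - 1) * (k * k - 4), 180 * k * k)

for k in (2, 3, 4, 5):
    check(k)
```

Note that for k = 2 the covariance vanishes, so adjacent differences are uncorrelated on a binary alphabet.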
To this end, let us find an explicit formula for the variance of the discrete random variable v_{n,k} = ∑_{i=0}^{n−1} X_i. Since the discrete random variables X_i and X_j are independent if |i − j| > 1, we get that the variance of v_{n,k} is given by

Var(v_{n,k}) = Var(X_0) + (n − 1)Var(X_1) + 2Cov(X_0, X_1) + 2(n − 2)Cov(X_1, X_2) = Var(X_0) + 2Cov(X_0, X_1) + Var(u_{n,k}),

with Var(X_0) = E(X_0²) − (E(X_0))² = (k + 1)(2k + 1)/6 − (k + 1)²/4 = (k² − 1)/12. A direct computation gives Cov(X_0, X_1) = 0, so that Var(v_{n,k}) = (k² − 1)/12 + Var(u_{n,k}), which agrees with Theorem 3.2.
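The resulting decomposition Var(v_{n,k}) = (k² − 1)/12 + Var(u_{n,k}) (the vanishing of Cov(X_0, X_1) is our computation, checked here numerically as well) can be confirmed exactly for small parameters:

```python
from fractions import Fraction
from itertools import product

def var_stats(n, k):
    """Exact variances of u and v = first letter + u over all k**n words."""
    words = list(product(range(1, k + 1), repeat=n))
    total = Fraction(len(words))
    def var(values):
        mean = sum(map(Fraction, values)) / total
        return sum((Fraction(v) - mean) ** 2 for v in values) / total
    us = [sum(abs(a - b) for a, b in zip(w, w[1:])) for w in words]
    vs = [w[0] + uu for w, uu in zip(words, us)]
    return var(us), var(vs)

for n in (2, 3):
    for k in (2, 3, 4):
        var_u, var_v = var_stats(n, k)
        assert var_v == Fraction(k * k - 1, 12) + var_u
```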