Note on the weighted internal path length of b-ary trees

In a recent paper Broutin and Devroye (2005) studied the height of a class of edge-weighted random trees. This is a class of trees growing in continuous time which includes many well-known trees as examples. In this paper we derive a limit theorem for the internal path length for this class of trees. The application of this limit theorem to concrete examples depends on obtaining an expansion of the mean of the path length. For the proof we extend a limit theorem in Neininger and Rüschendorf (2004) to recursive sequences of random variables with continuous time parameter.


1 Introduction
In this paper we derive a limit theorem for the internal path length of edge-weighted b-ary random trees. This class of trees includes as particular cases many well-known types of random trees, e.g. random binary search trees, random recursive trees, random split trees and others. For this class a general law of large numbers for the weighted height was established in a recent paper by Broutin and Devroye (2005).
The weighted b-ary tree is defined as follows. Let $T_\infty$ denote an infinite complete rooted b-ary tree. To each node $u$ in $T_\infty$ a random vector $((Z_1, E_1), \ldots, (Z_b, E_b))$ is assigned independently, corresponding to the $b$ outgoing edges of $u$, where $Z_i, E_i \ge 0$, each pair $(Z_i, E_i)$ is distributed as $(Z, E)$, and $Z$, $E$ have finite expectations. We also assume that $(Z_i)$ and $(E_i)$ are independent. $Z_e$ assigns a weight and $E_e$ an age to edge $e$. Then for a node $u$

$$\sum_{e \in \pi(u)} E_e \ \text{ is the age of } u, \tag{1}$$

$$D_u := \sum_{e \in \pi(u)} Z_e \ \text{ is the weighted depth of } u. \tag{2}$$

Here $\pi(u)$ is the set of edges in $T_\infty$ on the path from the root to node $u$. The b-ary tree of age $\le t$ is then defined in continuous time $t \ge 0$ by

$$T_t := \Big\{ u \in T_\infty : \sum_{e \in \pi(u)} E_e \le t \Big\}. \tag{3}$$

For the height of $T_t$, Broutin and Devroye (2005) established a strong law of large numbers (6), in which the limit is described in terms of the Cramér function of $(Z, E)$. This result applies to random binary search trees (rBST), random recursive trees, plane oriented trees, oriented trees, split trees and others. Their proof was based on Chernoff's theorem and extends earlier results of Biggins and Grey (1997) and Biggins (1977, 1978) using branching random walks. The upper bound in (6) can be extended to dependent reproduction based on the Gärtner-Ellis theorem (see Schopp (2005)). The lower bound, however, needs a new Galton-Watson type result for the case of dependent reproduction, which does not seem to be available in sufficient generality.

The application of our limit theorem to concrete examples depends upon an expansion of the first moment of the path length and, in some cases, of the first two moments.
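To make the construction of the tree of age at most t concrete, the following minimal Python sketch generates one realisation and returns the number of nodes, the sum of the weighted depths of its nodes (taken here as the weighted internal path length) and the weighted height. The names `simulate_tree`, `sample_edge` and `exponential_edge`, as well as the example edge law, are hypothetical and only serve the illustration; the sampler has to return positive ages often enough for the tree to stay finite.

```python
import random


def simulate_tree(t, b, sample_edge, rng):
    """Simulate the b-ary tree of age <= t and summarise it.

    sample_edge(rng) must return one pair (weight Z, age E) for an edge.
    Returns (number of nodes, sum of weighted depths, weighted height).
    """
    num_nodes = 0
    path_length = 0.0
    height = 0.0
    stack = [(0.0, 0.0)]  # the root has age 0 and weighted depth 0
    while stack:
        age, depth = stack.pop()
        if age > t:
            continue  # older than t: neither the node nor its descendants belong to the tree
        num_nodes += 1
        path_length += depth
        height = max(height, depth)
        for _ in range(b):  # the b outgoing edges of the current node
            weight, edge_age = sample_edge(rng)
            stack.append((age + edge_age, depth + weight))
    return num_nodes, path_length, height


def exponential_edge(rng):
    """Example edge law (an assumption): Z uniform on [0, 1], E ~ Exp(1)."""
    return rng.random(), rng.expovariate(1.0)


if __name__ == "__main__":
    print(simulate_tree(3.0, 2, exponential_edge, random.Random(0)))
```

With deterministic edges Z = E = 1, b = 2 and t = 2.5 the tree consists of the three complete levels of ages 0, 1 and 2, which gives a simple sanity check of the bookkeeping.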
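2 Limit theorem for the weighted internal path length

For the internal path length $Y_t$ of b-ary weighted trees as introduced in Section 1 we obtain the following recursive equation in continuous time, which arises when splitting the tree at the root:

$$Y_t \stackrel{d}{=} \sum_{r=1}^{b} \left( Y^{(r)}_{t-E_r} + Z_r V^{(r)}_{t-E_r} \right), \tag{7}$$

where $V_t$ denotes the number of nodes of $T_t$ and $(Y^{(r)}_t, V^{(r)}_t)$, $r = 1, \ldots, b$, are independent copies of each other.

To argue for (7) let $u_1, \ldots, u_b$ be the $b$ nodes below the root with corresponding ages $E_1, \ldots, E_b$. If $E_1 > t$, then $V^{(1)}_{t-E_1}$, the number of nodes in the subtree with root $u_1$, is zero and we get no contribution of this subtree to the internal path length. Only the nodes in the subtree of $u_1$ of age less than $t - E_1$ contribute to the internal path length. For each of them we have to add the weight $Z_1$ of the edge from the root to $u_1$, i.e. $Z_1 V^{(1)}_{t-E_1}$. Similarly, the contribution of the other subtrees is accounted for in (7), yielding the recursion

$$Y_t \stackrel{d}{=} \sum_{r=1}^{b} Y^{(r)}_{t-E_r} + \sum_{r=1}^{b} Z_r V^{(r)}_{t-E_r}. \tag{8}$$

To deal with the recursive random variables $(Y_t)$ with continuous time parameter $t$ as in (8), we derive in the following an extension to continuous time of the contraction method as developed in Neininger and Rüschendorf (2004) (see also Rösler and Rüschendorf (2001)). Let $0 < s \le 3$, let $Y_t$ be $s$-integrable for all $t$, and consider the normalized version $X_t$ of $Y_t$ defined by

$$X_t := C_t^{-1/2} (Y_t - M_t),$$

where $M_t := EY_t$ for $1 < s \le 3$, and $C_t := \mathrm{Var}(Y_t)$ for $2 < s \le 3$ and $C_t > 0$ otherwise (for the motivation of this normalization see Neininger and Rüschendorf (2004)). Convergence will be formulated w.r.t. the Zolotarev metric

$$\zeta_s(X, Y) := \sup_{f \in \mathcal{F}_s} |E(f(X) - f(Y))|,$$

where, for $s = m + \alpha$ with $m \in \mathbb{N}_0$ and $0 < \alpha \le 1$,

$$\mathcal{F}_s := \{ f \in C^m(\mathbb{R}) : |f^{(m)}(x) - f^{(m)}(y)| \le |x - y|^\alpha \}.$$

The normalized version $X_t$ of $Y_t$ satisfies a recursive equation of a form similar to (8):

$$X_t \stackrel{d}{=} \sum_{r=1}^{b} A_r^{(t)} X^{(r)}_{t-E_r} + b^{(t)}, \tag{12}$$

with modified coefficients $A_r^{(t)}$ and modified toll term $b^{(t)}$.

Theorem 1 Let $0 < s \le 3$ and let $X_t \in L_s$ satisfy the recursive equation (12). Assume that $\|A_r^{(t)}\|_s, \|b^{(t)}\|_s < \infty$ and $\sup_{0 \le u \le t} \|X_u\|_s < \infty$ for all $t > 0$. Assume further that the conditions 1), 2) and 3), numbered (14), (15) and (16), hold. Then $X_t$ converges in distribution to a limit $X$, and $X$ is in law the unique solution of the fixpoint equation

$$X \stackrel{d}{=} \sum_{r=1}^{b} A_r^{*} X^{(r)} + b^{*} \tag{17}$$

in $L_s$ with $EX = 0$ for $1 < s \le 3$ and $\mathrm{Var}\, X = 1$ for $2 < s \le 3$.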
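As a worked illustration of how the normalization turns the recursion for $Y_t$ into one for $X_t$, the following sketch carries out the substitution. It assumes the root-splitting recursion suggested by the argument for (7), the normalization $Y_t = C_t^{1/2} X_t + M_t$, and the convention $M_u = C_u = 0$ for $u < 0$. The label $\beta_t$ for the toll term is introduced here only for illustration. The resulting expressions are a reconstruction in the spirit of Neininger and Rüschendorf (2004) and not necessarily the exact form used by the authors.

```latex
\begin{align*}
\beta_t &:= \sum_{r=1}^{b} Z_r\, V^{(r)}_{t-E_r}
  && \text{(toll term; hypothetical label)} \\
C_t^{1/2} X_t + M_t
  &\stackrel{d}{=} \sum_{r=1}^{b}
     \Bigl( C_{t-E_r}^{1/2}\, X^{(r)}_{t-E_r} + M_{t-E_r} \Bigr) + \beta_t
  && \text{(substitute } Y = C^{1/2} X + M \text{)} \\
X_t &\stackrel{d}{=} \sum_{r=1}^{b} A_r^{(t)}\, X^{(r)}_{t-E_r} + b^{(t)}
  && \text{(divide by } C_t^{1/2} \text{)} \\
A_r^{(t)} &= \Bigl( \frac{C_{t-E_r}}{C_t} \Bigr)^{1/2}, \qquad
b^{(t)} = C_t^{-1/2} \Bigl( \beta_t - M_t + \sum_{r=1}^{b} M_{t-E_r} \Bigr).
\end{align*}
```

Under these assumptions the modified toll term collects the original toll term together with the centering constants, which is why an expansion of the mean (and, for $2 < s \le 3$, of the variance) of $Y_t$ is needed to verify convergence of the coefficients.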
Proof: Note that by the normalization $X_t$ is centered for $1 < s \le 2$, thus $E b^{(t)} = 0$. For $2 < s \le 3$ we have $EX_t = 0$, $\mathrm{Var}(X_t) = 1$ and thus $E b^{(t)} = 0$ and $E \sum_{r=1}^{b} (A_r^{(t)})^2 + E (b^{(t)})^2 = 1$. Thus from assumption (14) we obtain $E b^* = 0$ for $1 < s \le 2$, and $E b^* = 0$, $E \sum_{r=1}^{b} (A_r^*)^2 + E (b^*)^2 = 1$ for $2 < s \le 3$. By Corollary 3.4 of Neininger and Rüschendorf (2004) this implies existence and uniqueness of a solution of (17) in $M_s(0, 1)$, the class of distributions of all $X \in L_s$ with moments as specified above. As in the discrete time case we introduce an accompanying sequence $Q_t$ of $X_t$ by

$$Q_t := \sum_{r=1}^{b} A_r^{(t)} \left( 1_{\{t - E_r \ge \tau\}} X^{(r)} + 1_{\{t - E_r < \tau\}} X^{(r)}_{t-E_r} \right) + b^{(t)},$$

where $(X^{(r)})$, $(X^{(r)}_t)$ are independent copies of $X$, $X_t$ and $\tau$ is some suitable positive number specified later in the proof. Then for $2 < s \le 3$ we have $\mathrm{Var}(Q_t) = \mathrm{Var}(X_t)$, thus $Q_t \in M_s(0, 1)$, and the distances between $X_t$, $Q_t$, $X$ w.r.t. the Zolotarev metric are finite for all $t > 0$.
By the triangle inequality,

$$d_t := \zeta_s(X_t, X) \le \zeta_s(X_t, Q_t) + \zeta_s(Q_t, X). \tag{20}$$

As in the discrete case we obtain that the remainder term $r_t := \zeta_s(Q_t, X) \to 0$ as $t \to \infty$, using (16) and the condition $\sup_{0 < t \le \tau} \|X_t\|_s < \infty$. Further, the ideality properties of the $\zeta_s$-metric yield an estimate (21) for $\zeta_s(X_t, Q_t)$ and thus, by (21), an estimate (22) for the running supremum $d^*_t$ of $d_u$, involving $r^* = \sup_{\tau \le t} r_t < \infty$. Here an inequality due to Zolotarev, valid with some constant $C > 0$, is used. By assumption (15) the coefficient of $d^*_t$ in (22) is bounded by some $\eta < 1$ if $\tau$ is chosen large enough. This implies $d^*_t \le \eta d^*_t + r^*$ by monotonicity of $d^*_t$, i.e. $d^*_t \le r^* / (1 - \eta)$ for all $t > \tau$. Thus $d_t$ is bounded.

Now we refine the estimate as in the discrete case to obtain that $d_t \to 0$. Let $d := \limsup_{t \to \infty} d_t$. Then for any $\epsilon > 0$ we have $d_t \le d + \epsilon$ for all $t \ge \tau_1$, and thus (20), (21) give an upper bound for $d_t$ in terms of $d + \epsilon$. Using assumptions (15), (16), this implies $d = 0$. Since $\zeta_s$-convergence implies weak convergence, we obtain the conclusion of the theorem. ✷

Remark.
a) In order to apply the limit theorem to concrete b-ary weighted trees we have to control the first moment of $Y_t$ for $1 < s \le 2$ and the first and second moments for $2 < s \le 3$ (as is typical in the case of normal limits). In the case of discrete time recursive sequences several examples of this type have been given in Neininger and Rüschendorf (2004). Broutin and Devroye (2005) applied their results on the height of b-ary weighted trees to several trees. In the case where the age variables are exponential, the corresponding b-ary tree has a Markov structure, the number of nodes $V_t$ can be determined by a law of large numbers, and so a transference to e.g. rBSTs is possible (see Broutin and Devroye (2005)).
b) Assumption 3) of Theorem 1 can be weakened a bit for the limit theorem (see Schopp (2005)). Also, the random subtree sizes $t - E_r$ in the recursive equation (12) can be replaced in the formulation of Theorem 1 by general subtree sizes $I_r(t) \le t$, as in the discrete time case in Neininger and Rüschendorf (2004). In a recent paper of Janson and Neininger (2006) a similar extension of the limit theorem of Neininger and Rüschendorf (2004) to the continuous time case has been independently established (even in the multivariate case) and has been applied to a fragmentation process.

c) In Theorem 1 we assume finiteness of the $s$-th absolute moments of the random modified coefficients $(A_r^{(t)})$ and the modified toll terms $(b^{(t)})$. Thus integrability properties of $Z$, $E$ may have an impact on the applicability of the theorem.
In many applications (see Broutin and Devroye (2005)) $Z$ is a bounded random variable. Thus for the finiteness of the $s$-th moment of $b^{(t)}$ it suffices in that case to estimate the $s$-th absolute moment of the number of nodes up to time $t$, since $b^{(t)}$ is then bounded in terms of $V_t$ through a constant $c(t)$ depending on $t$.
If for example $E$ is Exponential(1)-distributed, then $\{\tilde V_t \le n\} = \{t_n \ge t\}$, where $t_n$ is the time of the $n$-th birth and $\tilde V_t$ denotes the number of external nodes in the b-ary tree. As

$$t_n \stackrel{d}{=} \sum_{i=1}^{n} \frac{E_i}{1 + (i-1)(b-1)},$$

we obtain a bound on the distribution of $\tilde V_t$. Therefore the $s$-th moment of $\tilde V_t$ is bounded by the $s$-th moment of a geometric $G(e^{-t(b-1)})$ random variable, and since $P(V_t \ge k) = P(\tilde V_t \ge (b-1)k + 1)$ we also have a bound for $V_t$.
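The birth-time representation for exponential ages can be checked numerically. The sketch below simulates the number of external nodes at time $t$ by adding up waiting times with rates $1 + (i-1)(b-1)$, starting from a single external node as in the representation of $t_n$. The function names `external_nodes_at` and `internal_nodes` are hypothetical. For $b = 2$ this is the classical Yule process, whose size at time $t$ is geometric with parameter $e^{-t}$ and hence has mean $e^{t}$, which gives a simple Monte Carlo check.

```python
import math
import random


def external_nodes_at(t, b, rng):
    """Number of external nodes at time t when edge ages are Exp(1).

    Starts from a single external node; after k births there are
    1 + k*(b-1) external nodes, each turning internal at rate 1.
    """
    clock = 0.0
    births = 0
    while True:
        rate = 1 + births * (b - 1)      # current number of external nodes
        clock += rng.expovariate(rate)   # waiting time until the next birth
        if clock > t:
            return rate
        births += 1


def internal_nodes(external, b):
    """Invert external = (b-1)*internal + 1 for a b-ary tree."""
    return (external - 1) // (b - 1)


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [external_nodes_at(1.0, 2, rng) for _ in range(20000)]
    print(sum(sample) / len(sample), math.exp(1.0))
```

The helper `internal_nodes` uses the same relation between external and internal nodes that underlies $P(V_t \ge k) = P(\tilde V_t \ge (b-1)k + 1)$.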
Throughout, $V_t$ denotes the number of nodes of $T_t$. Broutin and Devroye (2005) proved a strong law for the height $H_t := \max\{D_u : u \in T_t\}$ of $T_t$, assuming that $Z_i$, $E_i$ are independent.

In the definition of the Zolotarev metric $\zeta_s$, with $s = m + \alpha$ and $0 < \alpha \le 1$, the supremum is taken over $m$-fold continuously differentiable real functions on $\mathbb{R}^1$ with a Hölder condition for the $m$-th derivative. $\zeta_s(X, Y)$ is finite if $X$, $Y$ have finite absolute moments of order $s$ and the moments of order $1, \ldots, m$ of $X$ and $Y$ coincide. $\zeta_s$ is an ideal metric of order $s$, i.e. for $Z$ independent of $X$, $Y$ and any $c \in \mathbb{R}$,

$$\zeta_s(X + Z, Y + Z) \le \zeta_s(X, Y), \qquad \zeta_s(cX, cY) = |c|^s \zeta_s(X, Y).$$