Analysis of some statistics for increasing tree families

This paper deals with statistics concerning distances between randomly chosen nodes in varieties of increasing trees. Increasing trees are labelled rooted trees where labels along any branch from the root go in increasing order. Many important tree families that have applications in computer science or are used as probabilistic models in various applications, like recursive trees, heap-ordered trees or binary increasing trees (isomorphic to binary search trees), are members of this variety of trees. We consider the parameters depth of a randomly chosen node, distance between two randomly chosen nodes, and the generalisations where p nodes are randomly chosen: the size of the ancestor-tree of these selected nodes and the size of the smallest subtree generated by these nodes, also called the Steiner distance. Under the restriction that the node-degrees are bounded, we can prove that all these parameters, suitably normalised, converge in law to the normal distribution. This extends results obtained earlier for binary search trees and heap-ordered trees to a much larger class of structures.


Introduction
In this paper we consider families of increasing trees. Increasing trees as defined in [2] are labelled trees (the nodes of a tree of size n are labelled by distinct integers of the set {1, ..., n}) such that each sequence of labels along any branch starting at the root is increasing. As the underlying tree model, we use the so-called simply generated trees, compare [12]. They are essentially the same as Galton-Watson trees, obtained as the family tree of a Galton-Watson branching process conditioned on a given total size, see e.g. [1]. Additionally, they are equipped with increasing labellings. We will thus speak about simple families of increasing trees. A thorough study of families (= varieties) of increasing trees was conducted in [2].
A class $\mathcal{T}$ of a simple family of increasing trees can be defined in the following way. A sequence of non-negative numbers $(\varphi_k)_{k \ge 0}$ with $\varphi_0 > 0$ is used to define the weight $w(T)$ of any planted plane tree $T$ by $w(T) = \prod_v \varphi_{d(v)}$, where $v$ ranges over all vertices of $T$ and $d(v)$ is the out-degree of $v$. Further, $\lambda(T)$ denotes the number of different increasing labellings of the tree $T$. Then the family $\mathcal{T}$ consists of all trees $T$ together with their weights $w(T)$ and the various increasing labellings $\lambda(T)$.
For a given integer sequence $(\varphi_k)_{k \ge 0}$, the quantities $T_n := \sum_{|T| = n} w(T)\,\lambda(T)$ count the number of trees of size $n$ in $\mathcal{T}$, where $|T|$ denotes the size (= number of nodes) of the tree $T$.
Further we define the degree generating function $\varphi(t) := \sum_{k \ge 0} \varphi_k t^k$, which contains all the information required for analysing the tree parameters that we consider in this paper.
As is natural in enumeration problems related to labelled structures, we use the exponential generating function $T(z) := \sum_{n \ge 1} T_n \frac{z^n}{n!}$. It follows then from the recursive way in which these trees are generated that $w(T)\,\lambda(T) = \varphi_k \binom{|T|-1}{n_1, \dots, n_k}\, w(S_1)\lambda(S_1) \cdots w(S_k)\lambda(S_k)$, where the degree of the root is $k$ and the subtrees $S_1, \dots, S_k$ have sizes $n_1, \dots, n_k$, respectively. Thus, for simple families of increasing trees with degree generating function $\varphi(t)$, the (exponential) generating function $T(z)$ satisfies the autonomous first order differential equation $T'(z) = \varphi(T(z))$, $T(0) = 0$ (3). As an example we list in Figure 1 all planar unary-binary increasing trees of size 4. These trees are described via the degree generating function $\varphi(t) = 1 + t + t^2$.
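The differential equation T'(z) = ϕ(T(z)) with T(0) = 0 determines the numbers T_n one coefficient at a time. The following is a minimal sketch of that computation for a polynomial ϕ, using exact rational arithmetic; the function names `mul_trunc` and `tree_counts` are illustrative and not taken from the paper.

```python
from fractions import Fraction
from math import factorial


def mul_trunc(a, b, n):
    """Product of two power series (coefficient lists), truncated to n terms."""
    c = [Fraction(0)] * n
    for i, ai in enumerate(a[:n]):
        if ai == 0:
            continue
        for j, bj in enumerate(b[:n - i]):
            c[i + j] += ai * bj
    return c


def tree_counts(phi, n_max):
    """Return [T_1, ..., T_n_max] for phi(t) = phi[0] + phi[1]*t + ...

    Uses T'(z) = phi(T(z)), T(0) = 0: with t[k] = T_k / k! one has
    (n + 1) * t[n + 1] = [z^n] phi(T(z)).
    """
    t = [Fraction(0)] * (n_max + 1)
    for n in range(n_max):
        comp = [Fraction(0)] * (n + 1)              # phi(T(z)) up to z^n
        power = [Fraction(1)] + [Fraction(0)] * n   # T(z)^0
        for coeff in phi:
            for i in range(n + 1):
                comp[i] += coeff * power[i]
            power = mul_trunc(power, t[:n + 1], n + 1)
        t[n + 1] = comp[n] / (n + 1)
    return [int(t[k] * factorial(k)) for k in range(1, n_max + 1)]


print(tree_counts([1, 1, 1], 5))  # planar unary-binary increasing trees
print(tree_counts([1, 2, 1], 5))  # binary increasing trees
```

For ϕ(t) = 1 + t + t^2 this gives T_4 = 9, and for ϕ(t) = (1 + t)^2 it gives T_n = n!, in line with the isomorphism to binary search trees.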
In [2], an asymptotic study of many parameters related to simple families of increasing trees can be found. These results are obtained by means of a generating functions approach and naturally depend on the degree generating function ϕ(t).
We want to give a few examples as an illustration that simple families of increasing trees are not an obscure construction but occur rather naturally.
We begin with the so-called recursive trees, which are the family of non-planar increasing trees where all node degrees ≥ 0 are allowed. They are described by the degree generating function $\varphi(t) = \exp(t)$. This model is used, among other things, to describe the spread of epidemics, to model pyramid schemes, and as a basis of Burge's sorting method (see e.g. [11]).
Another family of trees are the so-called heap ordered trees (also known as plane recursive trees), which are planar increasing trees such that all node degrees are allowed. They are described by the degree generating function $\varphi(t) = \frac{1}{1-t}$. They can be used for pyramid schemes based on the principle "success breeds success" (see also [11]).
Finally we want to mention the so-called binary increasing trees, which are labelled unary-binary trees with one sort of leaves and one sort of binary nodes, but two sorts of unary nodes (left branching nodes and right branching nodes); thus they are described by the degree generating function $\varphi(t) = (1 + t)^2$. They can be used as a data structure to represent mergeable priority queues, with algorithms that can be precisely analysed (see [2]). This tree model is also isomorphic to the model of standard binary search trees (and thus to the Quicksort algorithm, see e.g. [7]). Hence when analysing structural parameters, we can either do it in the binary search tree model or in the binary increasing tree model.
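Two of these families can be grown node by node. The sketch below assumes the well-known growth rules (not spelled out in the text above): in a recursive tree the new node attaches to a uniformly chosen existing node, and in a heap ordered tree it attaches to node v with probability proportional to outdeg(v) + 1. Only the parent relation is recorded, so the left-to-right order of children is ignored; all function names are illustrative.

```python
import random


def grow_recursive_tree(n, rng):
    """Random recursive tree: label k picks its parent uniformly in 1..k-1."""
    parent = {1: None}
    for label in range(2, n + 1):
        parent[label] = rng.randint(1, label - 1)
    return parent


def grow_heap_ordered_tree(n, rng):
    """Heap ordered tree: parent v is chosen proportionally to outdeg(v) + 1."""
    parent = {1: None}
    slots = [1]  # node v occurs outdeg(v) + 1 times in this list
    for label in range(2, n + 1):
        v = rng.choice(slots)
        parent[label] = v
        slots += [v, label]  # one more slot for v, one slot for the new node
    return parent


def depth(parent, v):
    """Number of nodes on the path from the root to v (the root has depth 1)."""
    count = 1
    while parent[v] is not None:
        v = parent[v]
        count += 1
    return count


rng = random.Random(2024)
recursive = grow_recursive_tree(200, rng)
heap_ordered = grow_heap_ordered_tree(200, rng)
print(depth(recursive, 200), depth(heap_ordered, 200))
```

By construction every parent carries a smaller label than its children, which is exactly the increasing-labelling condition.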
For the last-mentioned model of binary search trees, two tree parameters are analysed in [15], which extend the quantities depth of a random node and distance between two random nodes. The depth of a node v is here defined as the number of nodes lying on the unique path from the root to v. The natural extension, the size of the ancestor-tree of p chosen nodes $v_1, \dots, v_p$, measures the size of the tree spanned by the root and $v_1, \dots, v_p$ and therefore counts the number of nodes that lie on at least one direct path from the root to $v_i$ for 1 ≤ i ≤ p. For binary search trees this parameter is of interest when analysing the average behaviour of the Multiple Quickselect algorithm (see e.g. [16]), which is used to find arbitrary p-order statistics in a data file of length n.
The node distance between $v_1$ and $v_2$ is defined as the number of nodes lying on the direct path from $v_1$ to $v_2$. A study of the distance between nodes is of interest for various tree families. For example, the distance between two specified nodes appears in the analysis of the costs of finger search operations in search tree structures (see [5]). Moreover, heap ordered trees (they are a special instance of the so-called Albert-Barabási model for scale-free networks, see [3]) and recursive trees (see [8]) are used to model the growth of the world-wide web. Thus in this context the distance between randomly chosen nodes in heap ordered trees resp. recursive trees is of interest when analysing the "hopcount" problem of the internet, which asks for the number of hops (= traversed routers) along the shortest path between two arbitrary nodes in the internet.
The natural extension for the parameter distance between two nodes is the spanning subtree size of p chosen nodes $v_1, \dots, v_p$, which counts the number of nodes that lie on at least one direct path from $v_i$ to $v_j$ for 1 ≤ i ≤ j ≤ p. In the literature, this parameter is also known as the Steiner distance between these p nodes (see e.g. [4]). Such measures are used for analysing transportation networks and multiprocessor computer networks.
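Both parameters can be read off from the root paths of the chosen nodes. The sketch below is an illustration on a small hand-made increasing tree (the tree and the function names are hypothetical): the ancestor-tree is the union of the root paths, and the Steiner tree is obtained from it by dropping the common ancestors that lie strictly above the deepest common ancestor.

```python
def path_to_root(parent, v):
    """All nodes on the path from v up to the root, both ends included."""
    path = [v]
    while parent[v] is not None:
        v = parent[v]
        path.append(v)
    return path


def ancestor_tree_size(parent, chosen):
    """Number of nodes on at least one path from the root to a chosen node."""
    nodes = set()
    for v in chosen:
        nodes.update(path_to_root(parent, v))
    return len(nodes)


def steiner_distance(parent, chosen):
    """Number of nodes of the smallest subtree containing all chosen nodes."""
    paths = [set(path_to_root(parent, v)) for v in chosen]
    union = set().union(*paths)
    common = paths[0].intersection(*paths[1:])
    # The common nodes form a chain from the root to the deepest common
    # ancestor; all of them except the deepest one are removed.
    return len(union) - (len(common) - 1)


# Hypothetical increasing tree of size 7, given by its parent map.
parent = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 4, 7: 3}
print(ancestor_tree_size(parent, [5, 6]), steiner_distance(parent, [5, 6]))
print(ancestor_tree_size(parent, [6, 7]), steiner_distance(parent, [6, 7]))
```

For the nodes 5 and 6 the ancestor-tree is {1, 2, 4, 5, 6} while the Steiner tree is {2, 4, 5, 6}; for 6 and 7 the two coincide, because the deepest common ancestor is the root.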
Our previous paper [15] contains the following results for the binary search tree model: Under the assumption that all $\binom{n}{p}$ possibilities of selecting p nodes in a tree of size n are equally likely, the distribution of the Steiner distance of p chosen nodes as well as the distribution of the size of the ancestor-tree of p chosen nodes is asymptotically Gaussian for n → ∞ and fixed p; mean and variance are both asymptotically $2p \log n$.
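For p = 1 the ancestor-tree size is simply the depth of a random node, and the leading term 2 log n can be checked against exact values. The sketch below assumes the classical recurrence for the expected internal path length of a random binary search tree, I_n = n - 1 + (2/n)(I_0 + ... + I_{n-1}), which is not taken from the text above; the function names are illustrative.

```python
from fractions import Fraction
from math import log


def expected_internal_path_length(n_max):
    """I[n]: expected sum of edge-depths of all nodes in a random BST of size n."""
    values = [Fraction(0)] * (n_max + 1)
    prefix = Fraction(0)  # I[0] + ... + I[n - 1]
    for n in range(1, n_max + 1):
        values[n] = (n - 1) + 2 * prefix / n
        prefix += values[n]
    return values


def harmonic(n):
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


def mean_depth(n):
    """Expected depth of a uniformly chosen node, depth counted in nodes."""
    return expected_internal_path_length(n)[n] / n + 1


for n in (10, 100, 1000):
    print(n, float(mean_depth(n)), 2 * log(n))
```

The exact mean equals 2(1 + 1/n)H_n - 3, so it grows like 2 log n, but the constant second-order term is clearly visible for moderate n.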
See Figure 2 for a comparison of both parameters considered here: ancestor-tree size and Steiner distance for a given increasing tree. The intention of this work is to generalise the limiting distribution results for the Steiner distance and for the ancestor-tree size that were obtained for the special case of binary increasing trees in [15], and for the case of heap ordered trees in [13], to other members of the simple family of increasing trees. We will carry out this analysis for all simple increasing trees where the possible node degrees are bounded, and thus the degree generating function $\varphi(t) = \varphi_0 + \dots + \varphi_d t^d$ is a polynomial of degree d ≥ 2. We will thus speak about the polynomial variety of increasing trees.
Important special cases are the d-ary increasing trees ($\varphi(t) = (1 + t)^d$) and the planar unary-binary increasing trees ($\varphi(t) = 1 + t + t^2$), whereas the above-mentioned recursive trees and heap-ordered trees are not covered by this analysis. So the results in [14] (recursive trees), [13] (heap ordered trees) and in the present paper complement each other.
However, in principle one can also obtain results for tree classes with unbounded node-degrees (where ϕ(t) is not a polynomial) using the methods of this paper. This is technically more subtle and depends on whether one can describe the behaviour of T(z) near its dominant singularities (see Section 2).
Throughout this paper we will (for a given tree family) denote by $X_{n,p}$ the random variable counting the size of the ancestor tree of p randomly chosen nodes in a tree of size n and by $Y_{n,p}$ the random variable counting the Steiner distance of p randomly chosen nodes in a tree of size n. The main result concerning the limiting distribution of these parameters in a polynomial variety of increasing trees is then stated in the next two theorems. The distribution function of the standard normal distribution N(0, 1) is denoted by Φ(x).
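For very small n the exact means of both random variables can be obtained by brute force. The sketch below does this in the binary search tree model (isomorphic to binary increasing trees), enumerating all insertion orders and all p-subsets of nodes; it is only an illustration of the definitions, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations


def bst_parent_map(keys):
    """Insert the keys in the given order and return each key's parent."""
    root = keys[0]
    parent, left, right = {root: None}, {}, {}
    for key in keys[1:]:
        node = root
        while True:
            side = left if key < node else right
            if node in side:
                node = side[node]
            else:
                side[node] = key
                parent[key] = node
                break
    return parent


def root_path(parent, v):
    """Set of nodes on the path from the root to v."""
    path = set()
    while v is not None:
        path.add(v)
        v = parent[v]
    return path


def exact_means(n, p):
    """Exact (E X_{n,p}, E Y_{n,p}) over all n! orders and all p-subsets."""
    total_x = total_y = samples = 0
    for keys in permutations(range(1, n + 1)):
        parent = bst_parent_map(keys)
        for chosen in combinations(range(1, n + 1), p):
            paths = [root_path(parent, v) for v in chosen]
            union = set.union(*paths)
            common = set.intersection(*paths)
            total_x += len(union)                    # ancestor-tree size
            total_y += len(union) - len(common) + 1  # Steiner distance
            samples += 1
    return Fraction(total_x, samples), Fraction(total_y, samples)


print(exact_means(3, 1))
print(exact_means(3, 2))
```

The enumeration grows like n! times the number of p-subsets, so it is only useful as a sanity check for sizes up to about 8.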

Theorem 1. Given a polynomial variety of increasing trees with degree generating function $\varphi(t) = \varphi_0 + \dots + \varphi_d t^d$, d ≥ 2.
The distribution of the random variable $X_{n,p}$, which counts the size of the ancestor-tree of p randomly chosen nodes in a random tree of size n, is for fixed p ≥ 1 asymptotically Gaussian, where the convergence rate is of order $O(1/\sqrt{\log n})$: The expectation $E_{n,p} = E(X_{n,p})$ and the variance $V_{n,p} = V(X_{n,p})$ satisfy with constants $c_p$ resp. $d_p$ that are specified in Section 3.

Theorem 2. Given a polynomial variety of increasing trees with degree generating function $\varphi(t) = \varphi_0 + \dots + \varphi_d t^d$, d ≥ 2,
the random variable $Y_{n,p}$, which counts the Steiner distance (and thus the spanning subtree size) of p randomly chosen nodes in a random tree of size n, is for fixed p ≥ 2 asymptotically Gaussian, where the convergence rate is of order $O(1/\sqrt{\log n})$: The expectation $E_{n,p} = E(Y_{n,p})$ and the variance $V_{n,p} = V(Y_{n,p})$ satisfy with constants cp resp. dp that are specified in Section 4.

Known results for the generating function T (z)
As the starting point in our analysis of the parameters X n,p and Y n,p , we will collect results from [2] concerning the structure of the generating function T (z).

Theorem 3. [Bergeron et al.] The exponential generating function T (z) of a simple family of increasing trees defined by the degree function
Depending on the degree function ϕ(t), periodicity phenomena can occur. If ϕ(t) is a function of $t^q$ for some q ≥ 2, such that $\varphi(t) = \psi(t^q)$ for some power series ψ, one says that ϕ(t) is periodic and the maximum possible q is called the period (otherwise ϕ(t) is called aperiodic, q = 1). For a period q ≥ 2 one gets, e.g. by applying the Lagrange inversion formula, that $T(z) = z T^*(z^q)$ for some power series $T^*$. Thus non-zero coefficients $T_n$ can occur only if the congruence condition n ≡ 1 (mod q) is satisfied. For the asymptotic behaviour of the coefficients $T_n$ one can translate the behaviour of T(z) in the neighbourhood of the dominant singularities (singularities with smallest modulus) via singularity analysis (see [6]).
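The period of a polynomial ϕ is the greatest common divisor of the exponents that actually occur, which also fixes the tree sizes that can appear. The sketch below illustrates this; it assumes ϕ is given as a coefficient list with at least one non-zero coefficient of positive degree, and the function names are illustrative.

```python
from functools import reduce
from math import gcd


def period(phi):
    """Largest q such that phi(t) = psi(t^q); phi is [phi_0, phi_1, ...]."""
    exponents = [k for k, c in enumerate(phi) if k > 0 and c != 0]
    return reduce(gcd, exponents, 0)


def admissible_sizes(phi, n_max):
    """Sizes n <= n_max satisfying the congruence n = 1 (mod q)."""
    q = period(phi)
    return [n for n in range(1, n_max + 1) if n % q == 1 % q]


print(period([1, 1, 1]), admissible_sizes([1, 1, 1], 7))  # aperiodic case
print(period([1, 0, 1]), admissible_sizes([1, 0, 1], 7))  # phi(t) = 1 + t^2
```

For ϕ(t) = 1 + t^2 (strict binary increasing trees) the period is 2 and only odd sizes occur.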
The next theorem describes the location of the dominant singularities.

Theorem 4. [Bergeron et al.] Given a polynomial degree function
Furthermore, if ϕ(t) is aperiodic, ρ is the only dominant singularity of T(z). If ϕ(t) has period q ≥ 2, then $T(z) = z T^*(z^q)$, where $T^*(t)$ has a unique dominant singularity at $t = \rho^q$.
From this theorem it follows that T(z) is analytic in a domain larger than the disk of convergence if we slit at angles 2πm/q: there exists a ρ′ > ρ, such that T(z) is analytic in the domain The next theorem describes the behaviour of T(z) near the dominant singularity ρ. For a polynomial degree generating function $\varphi(t) = \sum_{0 \le k \le d} \varphi_k t^k$ of degree d we use throughout this paper the abbreviations and Via singularity analysis one then immediately gets, in the aperiodic case, the asymptotic behaviour of the coefficients $T_n$. If the degree generating function ϕ(t) has period q ≥ 2, one has q dominant singularities, and their contributions have to be added. This leads to an extra factor q in the formula (9) as stated in the next theorem.
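The location of ρ can be checked numerically. The sketch below assumes the expression ρ = ∫_0^∞ dt/ϕ(t), which follows from T'(z) = ϕ(T(z)), T(0) = 0 by separation of variables when ϕ is a polynomial of degree d ≥ 2 with non-negative coefficients; the substitution t = x/(1 - x) and the use of Simpson's rule are choices made here for illustration, not the paper's method.

```python
from math import pi, sqrt


def dominant_singularity(phi, steps=2000):
    """Approximate rho = int_0^infinity dt / phi(t) for polynomial phi.

    With t = x / (1 - x) the integral becomes
    int_0^1 (1 - x)^(d - 2) / sum_k phi_k x^k (1 - x)^(d - k) dx,
    which is smooth on [0, 1] when phi_0 > 0, phi_d > 0 and d >= 2.
    `steps` must be even (composite Simpson rule).
    """
    d = len(phi) - 1

    def integrand(x):
        denom = sum(c * x**k * (1 - x)**(d - k) for k, c in enumerate(phi))
        return (1 - x)**(d - 2) / denom

    h = 1.0 / steps
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3


print(dominant_singularity([1, 2, 1]))   # phi(t) = (1 + t)^2, T(z) = z/(1 - z)
print(dominant_singularity([1, 0, 1]))   # phi(t) = 1 + t^2,   T(z) = tan z
print(dominant_singularity([1, 1, 1]))   # phi(t) = 1 + t + t^2
```

The first two cases reproduce the known singularities 1 and π/2 of z/(1 - z) and tan z; for the unary-binary family the value is 2π/(3√3).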

The quantities $T_n$ of elements of $\mathcal{T}$ with size n satisfy for d ≥ 2:

Outline of the generating functions approach for X n,p and Y n,p
We will start now by giving a generating functions approach to obtain the results for the random variables $X_{n,p}$ and $Y_{n,p}$ stated in Theorem 1 and Theorem 2. To do this, we introduce the generating functions If we take the recursive generation of the simple families of increasing trees into consideration, we can translate these recurrences into equations for the corresponding generating functions. This gives for the generating function of the ancestor-tree size the non-linear first order differential equation The derivative w.r.t. z on the left side of this equation reflects the marking of the root with label 1; on the right side of this equation the factor v reflects the counting of the root, regardless of whether the root is one of the chosen nodes or not (which leads to a factor 1 + u). But in the case p = 0, where we do not select any node, the root must not be counted. Thus we have to add the stated correction term. The parameters "Steiner distance" and "ancestor-tree size" are closely related; differences occur only if all selected nodes are hanging on the same subtree. By translating the recursive generation of the simple families of increasing trees, one gets a linear first order differential equation that connects both generating functions: with initial value F(0, u, v) = 0.
Essential in our analysis will be the knowledge of the asymptotic behaviour of the coefficients These singular expansions will be given in the formulae (23) resp. (46). To get (23) we will "pump out" the main term and the order of the remainder term of $G_p(z, v)$ from the differential equation (12) by induction, where we have to solve a first-order differential equation in every induction step. This is done in the Subsections 3.1-3.4. Using singularity analysis, we can translate this expansion of $G_p(z, v)$ into an asymptotic formula for the moment generating function $\sum_{m \ge 0} P\{X_{n,p} = m\} e^{ms}$ in a neighbourhood of s = 0 for fixed p and n → ∞. To obtain the stated Gaussian limiting distribution result (Theorem 1), we can apply a central limit theorem (the so-called quasi power theorem, which is due to Hwang, see [9]) that is very powerful in particular when dealing with combinatorial structures; this is done in Subsection 3.5.
Since the generating functions for the quantities ancestor-tree size and Steiner distance are closely related by a first order differential equation, we can translate the asymptotic expansion around the dominant singularity z = ρ of $G_p(z, v)$ (given by equation (23)) into the asymptotic expansion (46) of $F_p(z, v)$, which will be established in Subsection 4.1. From this expansion, we obtain also an asymptotic formula for the moment generating function $\sum_{m \ge 0} P\{Y_{n,p} = m\} e^{ms}$, and the stated normal convergence result (Theorem 2) follows by applying the quasi power theorem; this is done in Subsection 4.2.
For computing the second order terms $c_p$ resp. $d_p$ in the asymptotic expansion of the expectation resp. of the variance of $X_{n,p}$, we will consider in Subsection 3.6 the coefficients of the main term in the expansion of $G_p(z, v)$ in more detail. Via generating functions, we can solve the recurrence equation that appears, at least in principle, and obtain a formula for the $c_p$'s, as defined in Theorem 1. This formula is obtained in Subsection 3.7 and given as Theorem 13 (with more effort, one could also obtain a formula for $d_p$, but this is omitted here). Due to the close relation between $X_{n,p}$ and $Y_{n,p}$, we obtain in Subsection 4.3 as Corollary 14 a formula for cp, as defined in Theorem 2.
The quasi power theorem as proven in [9], which we will apply to our problem, is stated below for the reader's convenience.
, with U(s) and V(s) analytic for |s| ≤ σ and independent of n; $U''(0) \neq 0$, Under these assumptions, the distribution of $X_n$ is asymptotically Gaussian with the given convergence rate in the Kolmogorov metric: Moreover, the mean and the variance of $X_n$ satisfy
Throughout this paper, we will use the following abbreviations: the rising factorials $x^{\overline{n}} := x(x + 1) \cdots (x + n - 1)$, the falling factorials $x^{\underline{n}} := x(x - 1) \cdots (x - n + 1)$ and the harmonic numbers $H_n := \sum_{1 \le k \le n} \frac{1}{k}$. We will also use the notations $D_u$ for the differential operator w.r.t. u and $N_u$ for the operator that evaluates at u = 0.
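The three abbreviations are easy to mix up, so a small executable version of the definitions may help. The sketch below evaluates them exactly; the function names `rising`, `falling` and `harmonic` are illustrative.

```python
from fractions import Fraction


def rising(x, n):
    """Rising factorial x (x + 1) ... (x + n - 1); empty product for n = 0."""
    result = 1
    for i in range(n):
        result *= x + i
    return result


def falling(x, n):
    """Falling factorial x (x - 1) ... (x - n + 1); empty product for n = 0."""
    result = 1
    for i in range(n):
        result *= x - i
    return result


def harmonic(n):
    """Harmonic number H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


print(rising(3, 2), falling(5, 3), harmonic(3))
```

Both factorials reduce to n! in the boundary cases: the rising factorial at x = 1 and the falling factorial at x = n.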
3 The ancestor-tree size

Recurrences for the generating functions G p (z, v)
As already pointed out in Subsection 2.2 the main step in the proof of Theorem 1 is a thorough analysis of the asymptotic behaviour of the functions near the dominant singularity z = ρ uniformly in a neighbourhood of v = 1.
Here we start our analysis by describing how one can obtain the functions The functions $G_p(z, v)$, for p ≥ 1, will be obtained recursively. Differentiating equation (12) p times w.r.t. u and evaluating at u = 0 gives for p ≥ 1 that If we denote by $\varphi^{(L)}(t)$ the L-th derivative of ϕ(t), it also holds for p ≥ 1 that Using (3), we get that the functions $G_p(z, v)$ satisfy for p ≥ 1 the differential equation with initial values $G_p(0, v) = 0$. By solving this first-order linear differential equation, we get for p ≥ 1 equations, which define the $G_p(z, v)$ recursively, where the auxiliary functions $N_u D_u^l \varphi^{(L)}(G(z, u, v))$ for 1 ≤ l < p appear, as defined by (15):

Analyticity of the functions G p (z, v)
In order to prove the analyticity of the functions $G_p(z, v)$ within the analytic domain D of T(z) (which will be done in Lemma 8) it is necessary to show that $T'(z) \neq 0$ for all z within this domain D. But if there existed a κ ∈ D such that $T'(\kappa) = 0$, we would get by (3) that $\varphi(T(\kappa)) = 0$ holds as well. Using the integral representation (4) and expanding the polynomial ϕ(t) around T(κ), one gets immediately that the integral is unbounded: Thus $T'(z) \neq 0$ for z ∈ D and therefore the functions
Lemma 8. The functions $G_p(z, v)$ and $N_u D_u^p \varphi^{(L)}(G(z, u, v))$ are for p ≥ 0, 0 ≤ L ≤ d − 1 and fixed v analytic within the domain z ∈ D where T(z) is analytic. Their dominant singularities are the dominant singularities of T(z) and are thus located at z = ρ in the aperiodic case, resp. at $z = \rho e^{2\pi i m/q}$, 0 ≤ m < q, if ϕ(t) has period q.
Proof. We start by considering in which instance we know from Theorem 4 that the lemma is true. Moreover, we get that Using the recursive description (17) of the functions $G_p(z, v)$, we will show the lemma for p ≥ 1 by induction. We assume thus that the lemma is for p ≥ 1 already shown for all $G_l(z, v)$ and Since $T'(z) \neq 0$ for z ∈ D, it follows immediately that the integrand in the representation of $G_p(z, v)$ as given by (17) is analytic in the considered domain D. The dominant singularities are the dominant singularities of $T'(z)$ and thus of T(z). It follows then that the lemma is also true for $G_p(z, v)$.
Once the lemma is proven for $G_p(z, v)$, we can use for 0 ≤ L ≤ d − 1 the recursive description of $N_u D_u^p \varphi^{(L)}(G(z, u, v))$ stated as equation (15) to show that the lemma is also true for these auxiliary functions. Moreover, one gets for p ≥ 1 that

Singular behaviour of G 1 (z, v)
The next lemma describes the asymptotic behaviour of G 1 (z, v) near z = ρ.
Lemma 9. G 1 (z, v) has for fixed v, |1 − v| ≤ σ and σ small enough in a neighbourhood of the dominant singularity z = ρ the expansion with g 1 (0, v) = 1, and where g 1 (t, v) and g 2 (t, v) are for fixed v with |1 − v| ≤ σ, σ small enough, analytic around t = 0.Moreover, α 1 (v) is analytic for v with |1 − v| ≤ σ, σ small enough, and given by Proof.We start with the integral representation which is obtained from (17).
Next, we consider the integral We want to show that the first part of equation ( 22), is an analytic function of v in a neighbourhood of v = 1.This is not completely obvious in advance, since T ′ (z) is not analytic at z = ρ.We will consider the coefficients n and show that they will not grow too fast.We choose ε > 0 small enough, such that the expansion of T ′ (z) as given by (19) holds for |1 − z/ρ| ≤ ε and write From the expansion (19), we get and with the substitution x = 1 − t/ρ: we get further Thus we found that c n ≤ ρ ε (n + 1)! (M 1 − (1 + δ) log ε) n + ρεM n 0 , and this gives by majorisation, that C(v) converges for |v| < 1 M 1 −(1+δ) log ε and is analytic therein.For the second part in (22) we obtain It follows again by majorisation, which is omitted here, that H(t, v) is for fixed v with |1 − v| ≤ σ, σ small enough, in a neighbourhood of t = 0 an analytic function, since Ĥ(t, v) is analytic there.
Combining these results, we get from (18) the expansion Setting the lemma is proven.

Establishing the singular behaviour of G p (z, v) by induction
The crucial step in proving the normal convergence theorem in our approach is the description of the behaviour of the functions G p (z, v) in a neighbourhood of v = 1 around the dominant singularity z = ρ.This is given in the next lemma.
Lemma 10. The functions $G_p(z, v)$ have for p ≥ 1 and |1 − v| ≤ σ, σ small enough, the following local expansions around the dominant singularity z = ρ: and for p = 0: The auxiliary functions $N_u D_u^p \varphi^{(L)}(G(z, u, v))$ have for p ≥ 1, 0 ≤ L ≤ d − 1 and |1 − v| ≤ σ, σ small enough, the following local expansions around the dominant singularity z = ρ: and for p = 0: The coefficients $\alpha_p(v)$ resp. $\beta_{p,L}(v)$ are for |1 − v| ≤ σ analytic functions which are given by the following system of recurrences: where the initial values are given by
Proof. First we will treat the special cases. Since
Further it follows for L = d that This gives the initial values Of course these functions are constants and therefore analytic.Further it follows from Lemma 9 that the stated expansion holds for G 1 (z, v).We only use the fact that for v with |1 − v| ≤ σ and σ small enough, it holds in a neighbourhood of z = ρ: for a < 0.
This gives we obtain for 0 ≤ L ≤ d − 1: Therefore the functions β 1,L (v) are also analytic for |1 − v| ≤ σ, and the lemma holds for p = 1.
In the following we will use the expansions , which follow from (21).
We will now use induction to show the lemma for general p, where we assume that the lemma is already proven for 0 ≤ l < p with p ≥ 2. To do this, we have to examine first the integrand near the dominant singularity t = ρ in the integral representation given by equation (17). By using the induction hypothesis, we obtain then for |1 − v| ≤ σ in a neighbourhood of t = ρ the expansion To obtain a suitable bound for the remainder term of the integral, we will use the following fact. Given an analytic function f(z) with its only dominant singularity at z = ρ, which satisfies for holds, where the contour γ is for $z \in D(\phi_0, \tau)$ with |z − ρ| = r ≤ τ and Arg(z − ρ) = φ with $|\phi| \ge \phi_0$ defined by $\gamma := \gamma_1 + \gamma_2$, $\gamma_1 := [\rho - \tau, \rho - r]$, and where γ Next we consider the integral $\int_{t=0}^{z} I(t, v)\,dt$, where we split the integration path into parts as described below. The quantity τ is chosen small enough such that the above expansion (26) for the integrand I(t, v) holds. We obtain , where Ĉ(v) subsumes the contributions .
Furthermore, Ĉ(v) is uniformly bounded for |1 − v| ≤ σ, σ small enough.This leads thus to the expansion Since it follows from the induction hypothesis that the appearing functions β l,1 (v), α p−l (v) are analytic for |1 − v| ≤ σ, and there the denominator (p − 1)dδv − (p − 1)δ does not vanish, we obtain that α p (v) is also analytic.
Once the expansion is proven for $G_p(z, v)$, we can use equation (15) to obtain the corresponding results for $N_u D_u^p \varphi^{(L)}(G(z, u, v))$. We obtain for 0 ≤ L ≤ d − 1 and p ≥ 2 (but it turns out that it holds also for p = 1): , where Therefore the functions $\beta_{p,L}(v)$ are also analytic and the lemma is proven.

Applying the quasi power theorem
From this local expansion of the generating functions $G_p(z, v)$ around the dominant singularity z = ρ, as given by (23), we obtain immediately via singularity analysis (see [6]) the expansion for the coefficients: The remainder term holds uniformly for |1 − v| ≤ σ, σ small enough. As remarked in the description of Theorem 6, one has to add in the periodic case q ≥ 2 the contributions of the q dominant singularities, which leads to the factor q in formula (27).
Using the asymptotic behaviour of the numbers $T_n$ as given by equation (9), we obtain by choosing σ small enough the expansion We obtain $U_p'(0) = pd\delta$ and $U_p''(0) = pd\delta$, and get by applying the quasi power theorem (Theorem 7) the stated normal convergence result, Theorem 1. The proof that $V_p(s)$ is actually an analytic function in a neighbourhood of s = 0, which is necessary to apply the quasi power theorem, will follow in Subsection 4.1, when we examine the leading coefficients $\alpha_p(v)$. We will show that $\alpha_p(1) > 0$ for p ≥ 1 (see equation (36)), and since $\alpha_p(v)$ is analytic around v = 1, this is sufficient for the analyticity of $V_p(s)$ around s = 0.

Examining the leading coefficients in the expansion of G p (z, v)
From the quasi power theorem it also follows that the constants $c_p$ resp. $d_p$, which denote the second order terms in the expansion of the expectation $E(X_{n,p})$ resp. the variance $V(X_{n,p})$ in Theorem 1, are given by Equation (29) shows then that it is (apart from $\alpha_p(1)$) necessary to compute $\alpha_p'(1)$ to obtain $c_p$, resp. $\alpha_p'(1)$ and $\alpha_p''(1)$ to obtain $d_p$. Since the recurrence (24) for the $\alpha_p(v)$ is quite involved, we will describe in the next lemma how to find the leading coefficients $\alpha_p(v)$ by means of generating functions.
Lemma 11. The generating functions $A(x, v) := \sum_{p \ge 0} \alpha_p(v) \frac{x^p}{p!}$ of the leading coefficients $\alpha_p(v)$, that are given by (24), satisfy the differential equation Proof. To prove this lemma, we also define for 0 ≤ L ≤ d the auxiliary functions $B_L(x, v) := \sum_{p \ge 0} \beta_{p,L}(v) \frac{x^p}{p!}$ of the leading coefficients $\beta_{p,L}(v)$. We can then translate the recurrences (24) and initial values (25) into the following system of differential equations: Next we want to show by induction that For k = 0 the statement is satisfied. If we assume that it is already proven for all $B_{d-l}(x, v)$ with 0 ≤ l < k and k > 0, we obtain and thus But plugging in the initial values for x = 0, we obtain and (32) is proven.
We thus obtain where we used the definition of η. By integration we obtain Setting x = 0 and taking the initial values into consideration, we obtain from this Combining equations (33) and (34), we obtain the first-order differential equation stated in the lemma.
It is easy to verify (but also clear from prior considerations) that and thus For general d, we cannot expect to get a simple formula for A(x, v). But for the special case d = 2 (which covers e.g. binary increasing trees), we obtain For binary increasing trees, we get with $\varphi(t) = (1 + t)^2$ eventually which was already computed in [15] to obtain the second order terms of $E(X_{n,p})$ and $V(X_{n,p})$ for binary search trees. Recall that binary search trees and binary increasing trees are isomorphic.

Computing the second order term for E(X n,p )
Although in general we will not be able to obtain a simple formula for A(x, v) as defined in Lemma 11, we can use the differential equation (31) to compute α ′ p (1) (in principle also for α ′′ p (1), but this is not carried out here), and thus c p in Theorem 1.
One can simplify the sum e. g. by establishing a recurrence, where one uses the Chu-Vandermonde identity, and obtains for p ≥ 2 p−1 This leads for p ≥ 2 to the formula T =0 ϕ(T ) −v dT , and thus log(ϕ(T )) (ϕ(T )) v dT, we also obtain the stated formula for α ′ 1 (1).From (29) we get then immediately that the constants c p in Theorem 1 are given by where Ψ(x) := (log Γ(x)) ′ denotes the digamma-function.With Lemma 12, Equation (36) and using we obtain the following theorem.
In order to use the asymptotic expansions for $G_p(z, v)$ as given by (23), we will integrate by parts and use (3). We then get Now we can use (23) and obtain finally for p ≥ 2 the asymptotic expansion

Applying the quasi power theorem
From the singular expansion (46) of F p (z, v) we obtain again by using singularity analysis the following asymptotic expansion for the coefficients: and where γ p (v) is given by (48).By another application of the quasi power theorem, the normal convergence theorem (Theorem 2) follows, since Ũ′ p (0) = Ũ′′ p (0) = pdδ.

Computing the second order term for E(Y n,p )
To obtain the constants cp = Ṽ ′ p (0) in the expansion of the expectation E(Y n,p ) in Theorem 2, we use equations ( 29 where the formula for c p is given in Theorem 13.

Fig. 2: An increasing tree with the two parameters under consideration

Theorem 7. [H. K. Hwang] Let $\{X_n\}_{n \ge 1}$ be a sequence of integral random variables. Suppose that the moment generating function satisfies the asymptotic expression

Theorem 13. The second-order terms $c_p$ in the asymptotic expansion of the expectations $E(X_{n,p})$ as stated by Theorem 1 are for p ≥ 1 given by c p = p log δ dt − pdδ Ψ(δ) + H p − 1 + 1 − pd.

Singular behaviour of the generating functions F p (z, v)
Since the differential equation (13) connects the generating functions F(z, u, v) and G(z, u, v), we obtain also for $F_p(z, v) := N_u D_u^p F(z, u, v) = p!\,[u^p] F(z, u, v)$ a differential equation that links $F_p(z, v)$ to the $G_p(z, v)$'s, which were analysed in Section 3. For p ≥ 1, $\frac{\partial}{\partial z} F_p(z, v) = \varphi'(T(z))\, F_p(z, v) + \frac{\partial}{\partial z} G_p(z, v) - v\, \varphi'(T(z))\, G_p(z, v)$ (43) holds. This differential equation has the solution