Asymptotics of lattice walks via analytic combinatorics in several variables

We consider the enumeration of walks on the two dimensional non-negative integer lattice with short steps. Up to isomorphism there are 79 unique two dimensional models to consider, and previous work in this area has used the kernel method, along with a rigorous computer algebra approach, to show that 23 of the 79 models admit D-finite generating functions. In 2009, Bostan and Kauers used Pad\'e-Hermite approximants to guess differential equations which these 23 generating functions satisfy, in the process guessing asymptotics of their coefficient sequences. In this article we provide, for the first time, a complete rigorous verification of these guesses. Our technique is to use the kernel method to express 19 of the 23 generating functions as diagonals of tri-variate rational functions and apply the methods of analytic combinatorics in several variables (the remaining 4 models have algebraic generating functions and can thus be handled by univariate techniques). This approach also shows the link between combinatorial properties of the models and features of its asymptotics such as asymptotic and polynomial growth factors. In addition, we give expressions for the number of walks returning to the x-axis, the y-axis, and the origin, proving recently conjectured asymptotics of Bostan, Chyzak, van Hoeij, Kauers, and Pech.


Introduction
Recently, the study of two-dimensional lattice walks restricted to the non-negative quadrant has been an active topic of interest in several sub-areas of combinatorics (see, for instance, [8,14,9,26,27,21,4,5,13,34,18,23,7,6,12]), with applications in branches of applied mathematics including queuing theory and the study of linear polymers.The seminal work of Mishna and Bousquet-Mélou [13] gave a uniform approach to several enumerative questions, including the nature of a model's GF 1 (algebraic, D-finite, etc.) and the determination of exact or asymptotic counting formulas.In particular, they used the kernel method to prove that the GFs corresponding to 22 of the 79 non-equivalent two-dimensional models are D-finite.They conjectured that one additional model was D-finite-proved later by several authors [5,6,12]-and that the rest were not.In 2009, Bostan and Kauers [4] used computer algebra approaches to guess differential equations satisfied by the GFs of these 23 models, which were then exploited to guess dominant asymptotics for the number of walks of a given length; see Table 1.
The main difficulty one encounters in trying to determine asymptotics through annihilating equations of D-finite GFs is the connection problem; that is, even if one is able to rigorously derive an annihilating linear differential equation and compute asymptotics for a basis of solutions, it can be surprisingly hard (possibly incomputable in general) to rigorously express the GF in question as a linear combination of these basis elements.Ongoing work of Bostan, Chyzak, van Hoeij, Kauers, and Pech [3] attempts to get around this problem by using creative telescoping techniques combined with the kernel method to represent the walk GFs explicitly in terms of hypergeometric functions.Although such a representation should, in principle, allow 1 We abbreviate 'generating function' as GF throughout.
Table 1.Asymptotics for the 23 D-finite models.
one to rigorously determine asymptotics, in practice this depends on computing integrals of hypergeometric functions which those authors have only been able to numerically approximate 2 .
1.1.Our Contribution.In this work, we combine the expressions resulting from the kernel method with the techniques of analytic combinatorics in several variables to rigorously determine the asymptotics for 19 of the 23 D-finite models, verifying the computational guesses of [4] 3 .The final 4 models admit algebraic GFs whose minimal polynomials are explicitly known, meaning their asymptotics can be determined rigorously through univariate means 4 [26, 11, 5].Thus, to our knowledge, this work gives the first complete proof of asymptotics for the 23 D-finite models.The analysis breaks down into four groups (up to exchanging x and y coordinates): 4 models whose step sets are symmetric over every axis (the highly symmetric models, these were handled in general dimension by Melczer and Mishna [25]), 6 models whose step sets are symmetric over the y axis and have positive vector sum in their second coordinate (the positive drift models), 6 models whose step sets are symmetric over the y axis and have negative vector sum in their second coordinate (the negative drift models), and 3 sporadic cases.One reason for the recent interest in lattice path models in the quarter plane is the large variety of asymptotic and analytic behaviour their GFs exhibit, and this is evident in our work as well.The GF of a model is represented as the diagonal of a multivariate rational function, and the range of asymptotic behaviour apparent in Table 1 reflects differences in the geometry of the set of singularities of these rational functions.
We begin in Section 2 by giving an overview of lattice path enumeration and the kernel method, and show how it can be used to derive expressions for lattice path GFs which are amenable to the techniques of analytic combinatorics in several variables.Section 3 then details the general methods of analytic combinatorics in several variables, and outlines how the asymptotic analysis will proceed.This is followed by Section 4, where we derive asymptotics for the 19 models represented by multivariate diagonals through the kernel method.In Section 5 we examine the asymptotics for walks which return to one or both of the boundaries x = 0 and y = 0-proving recently guessed asymptotics by Bostan et al. [3]-and conclude in Section 6 with directions for future research.For more historical background and details on the calculations, the reader is referred to an upcoming full journal article building upon this extended abstract.
2 For some models, such integrals need to be rigorously determined to show not only the asymptotic constant of growth but even its exponential growth.
3 Three of the models have a periodic constant which their original table failed to take into account, but their guesses are otherwise correct. 4While these algebraic equations can be used to derive diagonal expressions for the 4 algebraic GFs, the resulting representations have more pathological characteristics (such as degenerate critical points) than those arising 'naturally' out of the kernel method.In any case, univariate methods for algebraic GFs are well established and easily give asymptotics for these models.

The Kernel Method for Quadrant Walks
The kernel method is a widely used strategy for manipulating functional equations, often those arising in the context of enumerating lattice paths in restricted regions.An often cited early example appears in the work of Knuth [22], and more modern accounts include Bousquet-Mélou and Petkovšek [15] and Banderier et al. [1], among many others.
To a given set of steps S = {±1, 0} 2 \ {(0, 0)} we associate the multivariate GF C(x, y, t) := i,j,n≥0 c i,j,n x i y j t n , where c i,j,n denotes the number of walks on the steps S of length n, beginning at the origin, ending at the lattice point (i, j), and never leaving the first quadrant (including coordinate axes).We further define the characteristic polynomial of a model to be the Laurent polynomial S(x, y) = (i,j)∈S x i y j .Decomposing a walk of length n > 0 ending at the point (i, j) into a walk of length n − 1 followed by a step in S gives a recurrence for c i,j,n (with special cases when i = 0 or j = 0 to account for the restriction to the quarter plane).Translating this recurrence into GF equations allows one to show that the GF C(x, y, t) satisfies the functional equation where x is shorthand for 1/x and ǫ = 1 if (−1, −1) ∈ S and 0 otherwise.Letting B −1 (y) = [x]S(x, y) and and we note that C(0, y, t), C(x, 0, t), and C(0, 0, t) represent the GFs for walks ending on the x-axis, on the y-axis, and at the origin, respectively.
In order to manipulate this functional equation into a more usable form, Bousquet-Mélou [9] (see also [13]) used a group of bi-rational transformations of the plane which fix S(x, y).As S ⊂ {±1, 0} 2 \ {(0, 0)}, we can in fact write for Laurent polynomials A i , B i .If A −1 (x) = 0 or B −1 (y) = 0 then S has no step moving towards (at least) one of its boundaries.In other words, the model defined by S is actually a lattice path model with a restriction to a halfplane (or having no restriction).Banderier and Flajolet [2] have shown that such models always admit algebraic GFs, and gave effective means of calculating their asymptotics.Thus, we may assume neither A −1 (x) nor B −1 (y) is 0. With this in mind, we define the bi-rational transformations and let G be the (possibly infinite) group of bi-rational transformations these involutions generate.We can view an element g ∈ G as acting on a Laurent polynomial f (x, y) by setting σ(f (x, y)) := f (σ(x, y)).

Define the non-negative series extraction operator
The main results of Bousquet-Mélou and Mishna, and Bostan and Kauers, combine to yield the following.
Theorem 1 ( [13,5]).Up to isomorphism, there are 79 distinct two-dimensional lattice path models with short steps restricted to the non-negative quadrant (which are not equivalent to halfplane models).Of these 79, precisely the 23 models listed in Table 1 give rise to a finite group G.For each model in Table 1 except models 5-8, the multivariate GF C(x, y, t) can be expressed as where sgn(g) is the minimal length of g in terms of the generators Φ and Ψ.For models 5-8, C(x, y, t) is algebraic and the dominant asymptotics of C(1, 1, t) match Table 1.
The key idea is that Φ and Ψ each fix S(x, y) along with one term on the right hand side of Equation ( 1).Thus the sign-weighted sum over the group eliminates all unknown series on the right hand side.This introduces new terms C(g(x, y), t) on the left hand side for each g ∈ G, but one can show that for each case, except models 5-8, taking the non-negative series extraction leaves only C(x, y, t).The algebraic models 5, 6, and 7 require a more delicate analysis [13], and model 8 a different approach [5].
The 56 models with infinite group are strongly suggested to have non-D-finite GFs C(1, 1, t) counting the total number of walks, meaning that determining their asymptotics is likely to be difficult.Dominant asymptotics are only known for 5 of the 51 models (see Melczer and Mishna [24]) although Bostan et al. [7] have determined asymptotics up to a constant multiple for the number of walks returning to the origin in the other 51 cases.Recent work of Duraj [16] implies that when the vector sum of a model's step set contains two negative coordinates the results of [7] determine asymptotics up to a constant multiple for the number of walks in these models ending anywhere.Fayolle and Raschel [18] outline a method which one can use to find the exponential growth of all of these 51 models, although they cannot generally recover sub-exponential behaviour.
2.1.Diagonal expressions.In order to prove the dominant asymptotics listed in Table 1, we will need to convert Equation ( 2) into a more computational form.For that we define the diagonal operator ∆ : The result which allows us a compact representation of the GF C(1, 1, t) for the number of walks ending anywhere-along with the GFs C(1, 0, t), C(0, 1, t) and C(0, 0, t) for walks ending on one or both of the coordinate axes-is the following. where The proof follows from the definition of the diagonal after writing out the geometric series and expansion of P on the right hand side.Combining Lemma 2 with Theorem 1 gives us a compact representation for our GFs which will allow for the asymptotic analysis.Theorem 3. Let S be a step set corresponding to a model in Table 1 other than the algebraic models 5-8.Then for a, b ∈ {0, 1}, where O(x, y) = g∈G sgn(g)g(xy).

Analytic Combinatorics in Several Variables
A general reference for this section is the text of Pemantle and Wilson [31] which contains precise statements and proofs for the results sketched here.Let F (x, y, t) = G(x, y, t)/H(x, y, t) ∈ Q(x, y, t) be a rational function analytic at the origin.Similar to the univariate case, the starting point here is to use the (multivariate) Cauchy integral formula to write (4) where T is a sufficiently small torus around the origin.The set of zeros V = V(H) ∩ (C * ) 3 describes the set of singularities of F (x, y, t) off the coordinate axes, and is known as the singular variety.If D is the open domain of convergence of F at the origin, we call a singular point (x, y, t) ∈ V ∩ ∂D on the boundary of the domain of convergence a minimal point.Minimal points are, in a sense, the most straightforward generalization of dominant singularities in the univariate case.Unfortunately, however, the extreme pathology of singularities possible in the multivariate case means it is possible that no minimal points will contribute to the dominant asymptotics.
Another notion is needed to fill this gap.The function h : V → R defined by h(x, y, t) := − log |xyt| is known as the height function associated to V and captures the part of the integrand in Equation ( 4) whose modulus increases with n.Assume first that V defines a smooth manifold.Then the work of Pemantle and Wilson implies that the critical points of the function h as a map between smooth manifolds are those around which local behaviour of the GF will determine asymptotics of the coefficient sequence, they are called the critical points of V. Basic results from complex analysis imply that we can deform the cycle of integration T in Equation ( 4) without changing the value of the integral as long as we do not cross the singular variety V or the coordinate axes.If there are critical points on the boundary of the domain of convergence-i.e., critical minimal points-then one can deform T to be arbitrarily close without changing the integral representation of a n,n,n .This allows one to show that the local behaviour of the function at these points will determine asymptotics (under the assumption that V is smooth).
In the general case, one begins by computing a Whitney stratification of the singular variety, which is a decomposition of V into a disjoint collection of smooth manifolds called strata with some additional properties (see Pemantle and Wilson [31], Definition 5.4.1 and the following discussion).When F is rational, a stratification can be computed algorithmically: in the smooth case the stratification simply consists of the manifold itself and in general each stratum can be effectively represented as the intersection of algebraic hypersurfaces V(H 1 ) ∩ • • • ∩ V(H k ) minus some varieties of lower dimension.A point (x, y, t) in a stratum B is called a critical point of B if ∇h| B (x, y, t) = 0, and the set of critical points of the singular variety is the set of points that are critical points for some stratum.
When V is a smooth manifold, or is smooth except for points where it consists of smooth manifolds intersecting transversely (the multiple point case) the set of critical points forms an algebraic set easily computed by Gröbner basis or other elimination methods; generically the set of critical points is finite.It is very difficult in general to determine which critical points actually contribute to the dominant asymptotics.
For a point (x, y, t) where V is locally the transverse intersection of smooth hypersurfaces V(H 1 ) ∩ • • • ∩ V(H k ) we define the cone K(x, y, t) ⊂ RP 2 to be the span of the vectors ∇ log H i := (x∂H i /∂x, y∂H i /∂y, t∂H i /∂t), i = 1, . . ., k and the cone N(x, y, t) = K(x, y, t) * to be the dual cone to K(x, y, t).Pemantle and Wilson [30] proved that a multiple point (x, y, t) is critical if and only if (1, 1, 1) ∈ N(x, y, t).Furthermore, they showed that when critical minimal points exist they almost always are the ones determining asymptotics for the diagonal coefficient sequence.
Proposition 4 (Pemantle and Wilson [31], Proposition 10.3.6).Suppose that for a rational function G/H the singular variety V is composed solely of smooth or multiple points and let K be its set of critical points.Let W ⊂ D be the points in the closure of the power series domain of convergence at which h(x, y, t) is minimized, and assume that the function (x, y, t) → (|x|, |y|, |t|) is constant on W .If 1 / ∈ ∂N(x, y, t) for any (x, y, t) ∈ K ∩ W and the set is non-empty and finite, then V is the set of contributing points of F , in the sense that [x n y n t n ]F (x, y, t) ∼ (x,y,t)∈V formula(x, y, t), where formula(x, y, t) denotes an effective function-under mild conditions-which depends on the local geometry of V at (x, y, t).
A precise description of formula(x, y, t) in the smooth and multiple point cases can be found in [33,Thm 3.2], which is the version used in our calculations through a Sage implementation by Raichev [32].

Asymptotics
We now apply the results of analytic combinatorics in several variables to the expressions obtained in Theorem 3, proving the guesses of Bostan and Kauers [4].

The Highly Symmetric Models.
Four of the models in Table 1 have step sets which are symmetric over every axis, and for each model one can directly calculate that the group G is {(x, y) → (x ±1 , y ±1 )}, meaning Equation (3) simplifies to

y) .
Let H = 1 − txyS(x, y).There are no solutions to H = H t = 0, so V is smooth, and the condition (1, 1, 1) ∈ K(x, y, t) becomes xH x = yH y = tH t .Solving this yields the critical points: Melczer and Mishna [25] showed-for the analogue in dimension d-that the points in K are all minimal.Thus, one can use Proposition 4 along with [33,Thm 3.2] to give the asymptotics of C(1, 1, t) which appear in Table 1.

Models With One Symmetry.
There are 12 models whose step sets have one symmetry (we assume without loss of generality over the y axis).For each of these models, the group G is a group of order 4 generated by (x, y) → (x, y) and (x, y) → (x, yA −1 (x)/A 1 (x)), and Equation ( 3) simplifies to We note that this rational function may be singular at the origin (if A −1 = x + x and A 1 = 1, for instance) but if it is not analytic it is of the form R(x, y, t)/x for R analytic at the origin, and we can use the identity [t n ]∆(R/x) = [t n+1 ]∆(ytR) to determine the asymptotics of the diagonal sequence by analyzing the function ytR(x, y, t) which is analytic at the origin.
Let H 1 = A 1 (x), H 2 = 1 − y, and H 3 = 1 − txyS(x, y).The singular variety V is the union of the three smooth varieties V 1 = V(H 1 ), V 2 = V(H 2 ), and V 3 = V(H 3 ), which intersect transversely.As H 3 is the only factor with t any critical point (x, y, t) must be in V 3 , so there are four possible strata which could provide critical points.The first two cases are: (1) (critical smooth point of V 3 ) As the multivariate expansion of 1/H 3 (x, y, t) has all non-negative coefficients, (x, y, t) is a critical minimal point of V 3 only if (|x|, |y|, |t|) is (see [30]).Thus, we search for positive real solutions of the smooth critical point equations xH x = yH y = tH t .This simplifies to S x (x, y) = S y (x, y) = 0, and due to the step set symmetry over the x axis one finds the only positive real solution occurs at (x 1 , y 1 , t 1 ) = 1, A 1 (1)/A −1 (1), 1/(x 1 y 1 S(x 1 , y 1 )) . ( Any point (x, y, t) on V 2 has y = 1 and, as ∆ log (H 2 ) = (0, −1, 0) at any point on V 2 , we see (1, 1, 1) ∈ K(x, y, t) if and only if S y (x, 1) = 1.This gives one positive real critical point in this stratum: (x 2 , y 2 , t 2 ) = (1, 1, 1/|S|).
The remaining two strata which could contribute critical points are = 0 and S is not symmetric over both axes, a straightforward computation shows S y (x, y) = 0 has no solution when y = 0.As t = 1/(xyS(x, y)) on V 3 , any positive real point (x, y, t) ∈ V can be shown to be minimal with respect to V 3 .After simplification, H 1 (x) = 1, 1 + x 2 , or 1 + x + x 2 so the factor H 1 does not affect the minimality of the two points above.Whether or not V 2 affects the analysis depends on whether A 1 (1) < A −1 (1) or A 1 (1) > A −1 (1) (if they are equal we are in the highly symmetric case).We now examine both of these cases.

The Negative Drift Models.
If A 1 (1) < A −1 (1)-i.e., more steps move south than north-then both points we have found above are critical minimal points.One can verify that (x 1 , y 1 , t 1 ) minimizes the height function-as the minimum must occur at a critical point -so the points contributing to the dominant asymptotics are (x 1 , y 1 , t 1 ) and any other points on V 3 with the same modulus.It can be shown that the contributing points are exactly the set and each is a smooth point of the singular variety.Theorem 3.2 of Raichev and Wilson [33] calculates the contribution of each point and gives the asymptotics of C(1, 1, t) which appear in Table 1.
Example 5. Consider the model defined by step set S = {(0, 1), (−1, −1), (1, −1)} = {N, SE, SW }.Here we have , and four of the eight possible points described above are contributing points: Using the Sage implementation of Raichev [32], we calculate the contribution at each to be so that the number of walks of length n satisfies : n even : n odd Note that the original table of Bostan and Kauers [4] only had the constant for the even cases listed.

The Positive Drift Models.
When A 1 (1) > A −1 (1)-i.e., more steps move north than south-then the factor H 2 = 1 − y makes (x 1 , y 1 , t 1 ) fall outside the domain of convergence.Thus, there is a single positive critical minimal point: (1, 1, 1/|S|).Now this point will determine the dominant asymptotics; that the factor 1 − y "cuts off" the critical point which contributed to the dominant asymptotics in the negative drift case gives an analytic reason for why the combinatorial factor of drift affects the asymptotic growth rate.It remains only to find the points in V with the same coordinate-wise modulus and determine the contribution of each.A quick calculation shows that there are at most four such points, the subset of (x, y, t) = (x, 1, ±S(x, 1)) such that x = ±1 and |t| = |S|.Calculating the contributions of each gives the asymptotics of C(1, 1, t) which appear in Table 1.

The Sporadic Cases.
Aside from the four algebraic models which were discussed in the opening section, for which asymptotics are already known, there are only three models in Table 1 whose asymptotics remain to be proven.The asymptotics are proven case by case, and each is a straightforward application of analytic combinatorics in several variables.The asymptotics of C(1, 1, t) for these models were originally proven by Bousquet-Mélou and Mishna [13] via different methods (our results on the asymptotics of boundary returns for the first and third sporadic models, in the next section, are not covered by their work).
Example 7.For the second step set S = {E, SE, W, N W }, Equation (3) simplifies to This case turns out to be easy to analyze, since the denominator is smooth.There are two points which satisfy the critical point equations: ρ 1 = (1, 1, 1/4) and ρ 2 = (−1, 1, 1/4), both of which are minimal and smooth.As the numerator has a zero of order 2 at ρ 1 but order 3 at ρ 2 , in fact only ρ 1 contributes to the dominant asymptotics.
Example 8.For the final step set S = {N W, SE, N, S, E, W }, Equation (3) simplifies to .
) is the only critical minimal point, and the analysis at ρ is analogous to Example 6.

Walks Returning to the Boundary
In the previous section we derived asymptotics for [t n ]C(1, 1, t) -that is, the number of walks ending anywhere.Combinatorially, it is also of interest to determine asymptotics for the number of walks returning to the origin or one of the bounding coordinate axes.Theorem 3 gives a simple link between the rational functions whose diagonals we take to get these sequences: one simply multiplies the rational function F (x, y, t) whose diagonal determined C(1, 1, t) by 1 − x, 1 − y, or both factors to get the appropriate representation.This allows one to see immediately that for the highly symmetric and negative drift models the contributing points determining dominant asymptotics are unchanged, meaning the exponential growth of all walks and walks returning to either/both axes are the same (the order of vanishing of the numerator at these points will increase, however, meaning the polynomial growth term will be changed, along with the constant).For the positive drift models, in contrast, the point (x 1 , y 1 , t 1 ) (along with the other points in V with the same modulus) will determine dominant asymptotics for walks returning to the x-axis or origin as the factor of 1 − y in the numerator will cancel with the one present in the denominator.This means that the exponential growth of walks returning to the x-axis or origin for positive drift models will be less than the exponential growth for the total number of walks, which itself is equal to the exponential growth for the number of walks returning to the y-axis.The complete list of these asymptotics is given in Tables 2 and 3, and we note that for each of the algebraic models 5-8 the minimal polynomial of C(x, y, t) has been determined previously [13,5], meaning the asymptotics corresponding to these 4 models are already known.These results prove numerically guessed asymptotics of Bostan et al. [3].

Conclusion
Lattice paths restricted to the non-negative quarter plane are well-studied objects, and there are many approaches to their enumeration.In this article we have tried to highlight the benefits of using diagonals of rational functions: compact representations of GFs, effective determinations of asymptotics, and clear links between analytic and combinatorial properties.After determining asymptotics of the 23 D-finite models in the quarter plane there are clear generalizations still left to be worked on: models with longer or weighted steps, models in higher dimensions, and restrictions to different regions or other lattices are a few obvious

Table 2 .
Asymptotics of boundary returns for the highly symmetric, positive drift, and sporadic cases.