Upper bounds on the non-3-colourability threshold of random graphs

We present a full analysis of the expected number of ‘rigid’ 3-colourings of a sparse random graph. This shows that, if the average degree is at least 4.99, then as n → ∞ the expected number of such colourings tends to 0 and so the probability that the graph is 3-colourable tends to 0. (This result is tight, in that with average degree 4.989 the expected number tends to ∞.) This bound appears independently in Kaporis et al. [14]. We then give a minor improvement, which shows in particular that the probability that the graph is 3-colourable tends to 0 if the average degree is at least 4.989.


Introduction
We are concerned with the 3-colourability of sparse random graphs. We consider the G_{n,m} model (also known as the uniform model), where m = ⌈θn/2⌉. This consists of all graphs with vertex set V_n = {1, 2, ..., n} which have m edges, where each of these graphs occurs with probability $\binom{\binom{n}{2}}{m}^{-1}$. Note that the expected degree of a vertex is θ + o(1) as n → ∞. We say that a graph property holds asymptotically almost surely (aas) if the probability that the random graph of order n has this property tends to 1 as n → ∞.
The threshold for non-2-colourability (that is, for the existence of an odd circuit) is well understood, and is not sharp; see for example the recent survey of Molloy [18]. For k > 2 our understanding of k-colourability is less complete. Let χ(G) denote the chromatic number of a graph G. Erdős in [9,3] asked whether for each k ≥ 3 there exists a constant θ_k such that for any ε > 0, χ(G_{n,m}) ≤ k aas when m ≤ (θ_k/2 − ε)n, and χ(G_{n,m}) > k aas when m ≥ (θ_k/2 + ε)n. Note that θ_k would specify a critical average degree for the random graph. In the case of 3-colourability, the experiments in [13] suggest that θ_3 ≃ 4.6.
In the above theorem as well as in what follows, the symbol log always refers to the binary logarithm. In order to prove this result, one half of the battle is to show (2), and the other half is to show (3). The main step in proving (3) is to show that for θ = 4.9895, we have µ(θ) < 0. This involves considerable computation: we are more explicit about this side of matters than has been the custom in previous papers in this area.
From the previous theorem and the Markov inequality (1) we obtain Therefore, θ_3^+ ≤ 4.9895. Working from the foundation provided by Theorem 1.1, we may try to improve the bound on θ_3^+ in different ways. One way involves an adequate subfamily of the rigid 3-colourings, the "Ψ-gadget-free" rigid 3-colourings (see the brief discussion in Section 7). A second way involves considering the number of non-trivial tree components, and using the observation that any non-trivial tree has at least two rigid 3-colourings: this approach is described in Section 6 below, and yields the following theorem.
Similar results can be deduced for the G_{n,p} model. In this model, again V_n = {1, 2, ..., n} is the set of vertices, but now each of the $\binom{n}{2}$ possible edges appears independently with probability p. We find that if θ = 4.98887 and p = θ/n, then there exists a δ > 0 such that P[χ(G_{n,p}) ≤ 3] ≤ 2^{−δn}, for n sufficiently large.
The main steps in the proof of Theorem 1.1 are as follows. We first give an exact formula for E[R(G*_{n,m})], where G*_{n,m} is a slight variation of the G_{n,m} model; see Lemma 2.1 below. We show that we can discard the 'tails' of the sum that appears there, and give good approximations to the remaining 'central' terms. These terms involve probabilities p(k, α, n, θ), which are investigated in Section 3. The probabilities can be written as a sum, where the summands involve binomial coefficients and certain 'balls-and-bins' probabilities. Again we show that we can discard the tails in the sums, and give good approximations to the remaining central terms, now involving Stirling numbers of the second kind. We use known asymptotic expressions for these Stirling numbers, thus expressing the summands as terms like 2^{h(x,θ)n}. This yields an approximation for E[R(G*_{n,m})] as 2^{µ(θ)n+O(log n)} (see (22) below) and we then obtain (2). This part of the proof is completed in Section 4.
The remaining work to prove Theorem 1.1 is largely numerical, and is described in Section 5. The main task is to show that for θ = 4.9895, we have µ(θ) < 0. We show that, for this specific θ, h(x, θ) is concave over its domain D. We find numerically a first approximation for a point which gives the maximum value of h inside this area. We define a very fine grid in a box around this point, and find a grid point x where the maximum value of h on the grid is attained. Then, we determine an upper bound for h on the surface of the box (by computing values of h and its partial derivatives on a fine grid and using concavity). We find that this bound is less than the value of h at x, and so we deduce that the box contains the maximum of h over D. Further computations handle the region inside the box.
After completing the proof of Theorem 1.1 as described above, in Section 6 we prove Theorem 1.3, and then we make some brief concluding remarks in Section 7. Some details from earlier proofs are given in the Appendices.

Starting the proofs
For the sake of simplicity, we carry out the probability calculations in the G*_{n,m} model. In this model, we form the random graph by choosing at random m times, each time independently, uniformly and with replacement, an edge out of the $\binom{n}{2}$ possible 2-subsets of V_n = {1, ..., n}. We ignore any repetitions of an edge, so the random graph may have fewer than m edges. Our results transfer easily to the G_{n,m} model; see Lemma 4.3 below. Every probability, unless otherwise stated, is meant to be taken over the G*_{n,m} model.
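The sampling procedure that defines the G*_{n,m} model can be sketched in a few lines. This is only an illustration of the definition above; the function name `sample_g_star` and the chosen values of n and θ are assumptions, not part of the paper.

```python
import math
import random


def sample_g_star(n, m, rng):
    """Sample an edge set from the G*_{n,m} model.

    Makes m independent uniform draws, with replacement, from the 2-subsets
    of {1, ..., n}. Repeated edges are ignored, so the result may contain
    fewer than m edges.
    """
    edges = set()
    for _ in range(m):
        u, v = rng.sample(range(1, n + 1), 2)  # a uniform 2-subset of V_n
        edges.add((min(u, v), max(u, v)))      # store each edge as (small, large)
    return edges


# Illustrative parameters (assumed): n vertices, average degree about theta.
n, theta = 1000, 4.99
m = math.ceil(theta * n / 2)
graph = sample_g_star(n, m, random.Random(0))
```

Because the draws are made with replacement, `len(graph)` is at most m, and for sparse parameters such as these only a small number of draws are typically repeats.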
and for each positive integer n let For each Let C^{(n)} be the set of all 3-colourings of V_n, i.e. the set of all possible mappings from V_n into {1, 2, 3}, or equivalently the set of all partitions of V_n into three sets S_1, S_2 and S_3 (some of them possibly empty).
Also, for each positive integer n and each (k, α) ∈ D^{(n)}, let C(k, α, n) denote the set of all partitions of V_n into three sets S_1, S_2, S_3, where |S_1| = αn, |S_2| = (k − α)n and |S_3| = (1 − k)n. where S ∈ C(k, α, n). Note that the above quantity depends only on the sizes of the independent sets induced by S. Recall that R(G) denotes the number of rigid 3-colourings of G.
Lemma 2.1 For all positive integers n and m with 0

Proof. By the linearity of the expected value, we have Let us take a fixed colouring S with stable sets S_1, S_2, S_3, where Hence, e(S)/n^2 = (αβ + βγ + γα) = φ(k, α). Thus, . Now notice that the family C(k, α, n) consists of exactly $\binom{n}{kn}\binom{kn}{\alpha n}$ colourings. So, rephrasing the sum in (5) in terms of k and α we obtain the following: ✷

The above lemma is our starting point. We next check that in Lemma 2.1 we may ignore the extreme values of α and k. We will split the sum there into two pieces. Let (which corresponds to α, β, γ ≥ 0.2); and let D^{(n)} Moreover, doing some elementary calculations, we obtain the following:

Lemma 2.2 The function φ(k, α) is continuous and concave on D, for each (k, α) ∈ D we have 0 ≤ φ(k, α) ≤ 1/3, and and let We will see that the second 'error' term here is negligible for the relevant values of m; and then we may focus on the first term.

Proof. By Lemma 2.2, we have σ = sup Thus, we have 2 < 1, and the lemma follows. ✷

The following standard lemma on approximating binomial coefficients may be proved using Stirling's formula: (the use of the same letter p should not cause confusion). Directly from the definition (7) and the last lemma, we have: In the next section we consider the term p(k, α, n, θ).

3 Calculations for p(k, α, n, θ)
In this section, we derive an asymptotic formula for p(k, α, n, θ), which was defined in (9) to be equal to P[S is rigid | S is proper], where S ∈ C(k, α, n), m = ⌈θn/2⌉ and we are working in the G*_{n,m} model. For positive integers t ≥ r, we let p(t, r) denote the probability that, when we throw t balls uniformly at random into r bins, each bin ends up non-empty. where and b(λm; m, p) is the probability that a random variable distributed according to the binomial distribution Bi(m, p) is equal to λm. To see this, observe first that, conditioning on S being a proper 3-colouring, the random variable that determines the number of edges between S_1 and S_2 ∪ S_3 is binomially distributed, namely it is Bi(m, p), where p is defined in (11). Once we have specified the number of edges between S_1 and S_2 ∪ S_3 (and, therefore, the number of edges between S_2 and S_3 as well), the probability that S is rigid is exactly the probability that each vertex in S_2 ∪ S_3 is adjacent to some vertex in S_1 and each vertex in S_3 is adjacent to some vertex in S_2. Note that for each edge, say, between S_1 and S_2 ∪ S_3, each vertex in S_2 ∪ S_3 has the same probability of being an endvertex of it. The same holds for the edges between S_2 and S_3. Thus, we can think of this as a random throwing of balls into bins, each ball corresponding to an edge and each bin corresponding to a vertex. Note that we have two independent such random experiments. This observation yields (10).
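The balls-and-bins probability p(t, r) can be evaluated exactly for small arguments by inclusion-exclusion over the set of empty bins. The sketch below is an illustration only; the function name `p_nonempty` is an assumption, and the paper itself works with asymptotic estimates rather than exact evaluation.

```python
from fractions import Fraction
from math import comb


def p_nonempty(t, r):
    """Exact probability that t >= 1 balls thrown uniformly at random
    into r >= 1 bins leave no bin empty.

    The number of arrangements with no empty bin is
    sum_{j=0}^{r} (-1)^j * C(r, j) * (r - j)^t   (inclusion-exclusion),
    and there are r^t equally likely arrangements in total.
    """
    covering = sum((-1) ** j * comb(r, j) * (r - j) ** t for j in range(r + 1))
    return Fraction(covering, r ** t)


# With fewer balls than bins some bin must stay empty.
assert p_nonempty(2, 3) == 0
```

For example, `p_nonempty(4, 2)` is 7/8, since only the two arrangements that put all four balls into one bin leave a bin empty.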

Discarding the tails
We next check that we may discard the extreme values of λ in (10). This is a technical exercise for which we need one preliminary lemma.
Proof. Let W(t, r) be the set of all arrangements of t balls into r bins leaving no empty bins and let w(t, r) be its cardinality. What we want to prove will follow from the following inequality: To prove (12), consider ordered pairs of balls and bins, i.e. if T is the set of balls and R the set of bins, take the Cartesian product P = T × R. Each such pair (b, B), where b ∈ T and B ∈ R, corresponds to the fact that the ball b is in bin B. For each such pair arrange the remaining t − 1 balls into the r bins leaving no empty bins. Thus, we form the set W = {(p, w) : p ∈ P, w ∈ W(t − 1, r)}. Note that we have a surjective mapping from the set W onto the set W(t, r). Clearly, W is of cardinality rt · w(t − 1, r). The mapping induces a natural partition on this set and each of the parts, which is the set of pairs that are mapped to a specific arrangement of t balls into r bins without leaving any empty bins, is of cardinality equal to the number of balls which are not the only ball in their bin, which is at least t − r and at most 2(t − r). Thus, (12) has been established. Therefore, and the lemma follows. ✷

and let (The extra terms 0.24 and 0.065 here are to exclude extreme values which, as we shall see shortly, are negligible, but which would cause awkwardness later.) It is convenient to restrict θ to a range [θ_l, θ_u], as we need to obtain approximations uniformly over θ. We let (Recall that p is defined in (11).)

Proof. Within the proof, let f(λ) be the general term in the sum in equation (10). We will compare the term f(λ), for some λ which will be specified later, with the adjacent term f(λ − 1/m). Note that f(λ) = b(λm; m, p)p(t, r)p(m − t, r′), where r = n(1 − α) and r′ = n(1 − k), and t = λm. We consider the "lower" tail first. Assume that t = n(1 − α) + ⌊ηn(1 − α)⌋, for some η > 0.
By Lemma 3.1 we have Thus, we obtain Using the fact that (k, α) ∈ D_1, straightforward verification shows that for η = 0.25 and for n sufficiently large the factor on the right hand side is less than 1/2 (in fact it is less than 0.45), for any θ ∈ [θ_l, θ_u]. (This is the case because the factor is increasing with respect to η; see [10] for the details.) Therefore, the sum of the terms for t from n(1 − α) up to n(1 − α) + ⌊0.25(1 − α)n⌋ − 1 can be bounded as follows: by the geometric sum formula. Following the same treatment, we can bound the other tail of the sum. Here, assume that Using the fact that (k, α) ∈ D_1, one can see that for η = 0.07 and for n sufficiently large the factor on the right hand side is less than 1/2 (in fact it is less than 0.49), for any θ ∈ [θ_l, θ_u]. As in the previous case, this expression is increasing with respect to η. Therefore, the sum of the terms for by the geometric sum formula. Now, the lemma follows from the above observations along with the fact that each term is non-negative, which means that removing a few terms from the sum gives a lower bound on it. ✷
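The double-counting argument used for the preliminary lemma (a ball-bin pair together with a covering arrangement of the remaining t − 1 balls) gives bounds of the form (t − r) w(t, r) ≤ r t w(t − 1, r) ≤ 2(t − r) w(t, r). The exact displayed form of (12) is not reproduced above, so this form is an assumption read off from the cardinality bounds stated in the proof. The following sketch checks it for small t and r; the function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def w(t, r):
    """Number of arrangements of t labelled balls into r labelled bins
    with no bin empty.

    Recurrence: the last ball goes into one of r bins, and the other t - 1
    balls either already cover all r bins or cover exactly the other r - 1.
    """
    if r == 0:
        return 1 if t == 0 else 0
    if t == 0:
        return 0
    return r * (w(t - 1, r) + w(t - 1, r - 1))


def counting_bounds_hold(max_t):
    """Check (t - r) w(t, r) <= r t w(t - 1, r) <= 2 (t - r) w(t, r)
    for all 1 <= r <= t <= max_t."""
    for t in range(1, max_t + 1):
        for r in range(1, t + 1):
            middle = r * t * w(t - 1, r)
            if not (t - r) * w(t, r) <= middle <= 2 * (t - r) * w(t, r):
                return False
    return True
```

Calling `counting_bounds_hold(12)` exercises every pair with t ≤ 12; the case t = r is the degenerate one in which all three quantities vanish.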

Introducing Stirling numbers of the second kind
For positive integers t ≥ r the Stirling number of the second kind S(t, r) is defined to be 1/r! times the number of surjective functions from a set of cardinality t to a set of cardinality r. Thus p(t, r) = r! S(t, r)/r^t. Hence, we may rewrite Lemma 3.2 as follows: 1 , we have: .
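As a small worked check of how S(t, r) enters the balls-and-bins probability: there are r! S(t, r) surjections from the t balls onto the r bins and r^t equally likely throws in total. The instance t = 4, r = 2 below is chosen purely for illustration.

```latex
\[
  p(t,r) \;=\; \frac{r!\,S(t,r)}{r^{t}},
  \qquad\text{for instance}\qquad
  p(4,2) \;=\; \frac{2!\cdot S(4,2)}{2^{4}}
         \;=\; \frac{2\cdot 7}{16}
         \;=\; \frac{7}{8}
         \;=\; 1 - 2\cdot 2^{-4}.
\]
```

The last expression counts directly: the only throws that leave a bin empty are the two that send all four balls to the same bin.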

Asymptotics for Stirling numbers of the second kind
An essential part of our probability calculations involves asymptotic expressions for the Stirling numbers of the second kind. We need some preliminary definitions and results; see for example [20].
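Appendix A records that the auxiliary quantity x_0(u) used in these asymptotics satisfies e^{x_0} − 1 − e^{x_0} u x_0 = 0, that is, u = (1 − e^{−x_0})/x_0, for 0 < u ≤ 1. A minimal numerical sketch for evaluating x_0(u) by bisection follows; the function name, the tolerance and the restriction to 0 < u < 1 (where the root is strictly positive) are assumptions made for illustration.

```python
import math


def x0(u, tol=1e-13):
    """Positive root x of (1 - exp(-x)) / x = u, for 0 < u < 1.

    The left-hand side decreases from 1 (as x -> 0) to 0 (as x -> infinity),
    so the root is unique and can be bracketed and bisected.
    """
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")

    def g(x):
        return -math.expm1(-x) / x  # (1 - e^{-x}) / x, computed stably

    lo, hi = 0.0, 1.0
    while g(hi) > u:          # grow the bracket until g(hi) <= u
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if g(mid) > u:        # root lies to the right of mid
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Since the left-hand side is decreasing in x, x_0(u) is decreasing in u, so for instance `x0(0.3)` exceeds `x0(0.6)`.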

The estimate
Recall that D_1 and L_1 are defined in (6) and (13) above. Let and let , θ(1−λ) . We shall prove the following lemma: Proof. In what follows, the "error" term ⌈θn/2⌉ − θn/2 yields a Θ(1) factor for each term of the sum in Lemma 3.3, since (k, α) ∈ D_1^{(n)}. We use Lemma 2.4 to give asymptotic expressions for the binomial coefficients. Note that since (k, α) ∈ D_1 the assumptions of Lemma 2.4 are satisfied. This is also true for the coefficient that involves θ, since θ is assumed to be in a closed and bounded interval not containing 0. Thus, using (16) and Stirling's approximation for the factorials, Lemma 3.3 implies the following: where, for θ ∈ [θ_l, θ_u] and (k, α, λ) ∈ D, where , where x_0 = x_0(x/y) and where p is defined in (11). Doing some calculations, we obtain: where For the elementary but tedious calculations see Appendix B.
Also, note that by (27) in Appendix A, the monotonicity of the lower bound for f(u), where u = 2(1−α) , and the fact that (k, α) ∈ D_1, it follows that in both cases the function f is at least , and so the two factors including f in the expression above yield a Θ(1) term. This concludes the proof of the lemma. ✷

4 Proof of Theorem 1.1 and Corollary 1.2
Recall that θ_l, θ_u were introduced in (14) and D was defined in (17). where θ(1−λ) . (Recall that the function x_0(u) was defined at the start of Subsection 3.3.) We will prove the following: Proof. Lemmas 2.5 and 3.4 imply that uniformly over θ ∈ [θ_l, θ_u]: where p is defined in (11). Hence, the lemma has been established. ✷

By the last lemma, where c(n, θ) = Ω(n^{−3/2}), c(n, θ) = O(n^{3/2}) and Let D^{(∞)} = lim inf D^{(n)} and note that it is the set of rationals contained in D and, therefore, it is dense inside it. Then, we have since for each fixed θ the function h(k, α, λ, θ) is continuous on D, which is a compact subset of R^3. In fact, we can say a little more than this. We know that h(k, α, λ, θ) attains its maximum at an internal point of D, say x*. Note that this is a stationary point and one can also see that h is differentiable on D and its derivatives are continuous. The latter implies that for any ε > 0 there exists an open ball U containing x* where ‖∇h‖ < ε. For n sufficiently large, there is a point x_n ∈ D^{(n)} ∩ U with and then by the Mean Value Theorem. Hence, This fact along with (19) implies the following: S_1 = 2^{µ(θ)n+O(log n)}.
Since h(k, α, λ, θ) is continuous on its domain D, and h as a function of θ is also continuous, the function µ(θ) is continuous as well. As we shall see later, for θ_0 = 4.9893, we have The numerical investigation in the next section shows that for θ_2 = 4.9895. By Lemma 2.3, there exists δ > 0 such that S_2 = O(2^{−δn}), uniformly over θ ∈ [θ_l, θ_u].
Proof. To see one direction note that Here, E(G) denotes the set of edges of a graph G. Recall that ln(1 This expression is bounded away from 0 uniformly for θ in the closed interval The other direction is a little more tricky. To see this note as before that whence we obtain e(S)/m^2 ≥ η > 0, for some η (depending only on b). Thus (once 2m ≤ e(S)), ≥ exp(−2m^2/e(S)) ≥ e^{−2/η}.
Hence, for such an S, we have On the other hand, since adding edges to a proper 3-colouring increases the probability that this is rigid. Therefore, Finally, by Lemma 2.3, since θ ≥ 4.98, ✷

Thus, Lemma 4.3 along with (22) and inequalities (20) and (21) conclude the proof of Theorem 1.1. Corollary 1.2 follows immediately.
To complete the proof of Theorem 1.1, and thus of Corollary 1.2, it remains to establish (21). Let h(k, α, λ) = h(k, α, λ, θ_2), where as above θ_2 = 4.9895. Then h is continuous over D, and we shall see in Appendix C that it is strictly concave over the interior of D. Thus, if C ⊆ D and y ∈ C^o (the interior of C) are such that h(y) > h(x) for every x in the boundary of C, then h has a unique maximum point x* over D and x* ∈ C^o. Using concavity, we can estimate numerically where the maximum of h is located, and give an upper bound on h over this domain, as follows. From an initial approximation to the maximum in D, using the Complex method [5] (see also [6]), our attention is directed to the cube C ⊆ D, where 0.6980 ≤ k ≤ 0.6981, 0.3622 ≤ α ≤ 0.3623 and 0.6910 ≤ λ ≤ 0.6911. Divide the surface of the cube C into squares of side s = 5 × 10^{−6}. For a square centred at a, by the concavity of h we obtain for each point b in the square. By checking each square, we find that h(x) ≤ −3.937721 × 10^{−5} for each point x on the surface of the cube C. But there is a point y inside C with h(y) strictly greater than this bound. More specifically, we may define a cubic grid inside C, each cube having side equal to 5 × 10^{−6}.
The maximum value we find by searching this grid is equal to −3.937414 × 10^{−5} and it is strictly greater than the upper bound on h on the surface of C. Since h is concave on D, it follows as noted above that h attains its maximum over D inside C. Now we obtain an upper bound on h inside C, using the aforementioned grid. By concavity, for each point b in the sub-cube with its centre located at a and edge of length equal to s, we have By checking through each sub-cube, we find that h(x) is less than −3.9 × 10^{−5}, for each x ∈ C, and thus for each x ∈ D.
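The grid argument can be illustrated with the standard first-order bound for a differentiable concave function: h(b) ≤ h(a) + ∇h(a)·(b − a), so on a cube of side s centred at a one has h(b) ≤ h(a) + (s/2) Σ_i |∂h/∂x_i(a)|. Whether this is exactly the displayed inequality used in the paper is an assumption. The sketch below applies the bound to a hypothetical concave stand-in `toy_h`, not to the paper's function h.

```python
from itertools import product


def toy_h(x):
    """Hypothetical strictly concave stand-in for h(k, alpha, lambda)."""
    return -((x[0] - 0.3) ** 2) - 2 * (x[1] - 0.5) ** 2 - (x[2] - 0.7) ** 2


def toy_grad(x):
    """Exact gradient of toy_h."""
    return (-2 * (x[0] - 0.3), -4 * (x[1] - 0.5), -2 * (x[2] - 0.7))


def cell_upper_bound(h, grad, centre, side):
    """Upper bound on a concave h over the cube of the given side about centre."""
    return h(centre) + 0.5 * side * sum(abs(g) for g in grad(centre))


def box_upper_bound(h, grad, lower, side, cells):
    """Upper bound on h over the box made of cells^d sub-cubes of the given
    side, whose lowest corner is `lower`: the largest per-cell bound."""
    best = float("-inf")
    for idx in product(range(cells), repeat=len(lower)):
        centre = tuple(lo + (i + 0.5) * side for lo, i in zip(lower, idx))
        best = max(best, cell_upper_bound(h, grad, centre, side))
    return best


# Box [0.2, 0.4] x [0.4, 0.6] x [0.6, 0.8] split into 10^3 sub-cubes.
bound = box_upper_bound(toy_h, toy_grad, (0.2, 0.4, 0.6), 0.02, 10)
```

The true maximum of `toy_h` on this box is 0, and the computed bound lies slightly above it; refining the grid tightens the bound, which is why the paper uses a very fine grid around the maximiser.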

6 Proof of Theorem 1.3
Let t(G) denote the number of components of the graph G that are non-trivial trees (that is, trees with at least one edge).

Lemma 6.1 For any t ≥ 0 and any positive integers n and m with
Proof. Any non-trivial tree has at least 2 rigid 3-colourings. Thus, for a graph Now, we apply this result to G_{n,m} and take expectations. We obtain: and the lemma follows. ✷

and m = ⌈θn/2⌉. We shall use standard methods to prove:

Lemma 6.2 Let θ > 0. For any ε > 0 there exists δ_1 > 0 such that

Proof. Let T be a tree on vertices 1, ..., t, where t is constant. Then by standard approximations. Now we may multiply by $\binom{n}{t} t^{t-2}$ to see that the expected number of tree components with t vertices in G_{n,m} is (1 + o(1)) n θ^{t−1} e^{−θt} t^{t−2}/t!. Since the number of tree components of G_{n,m} with at least t vertices is at most n/t, it follows that (see also [4] p.96): The following lemma will complete the proof, since if G′ is obtained from G by adding an edge and deleting an edge then |t(G) − t(G′)| ≤ 2. ✷

The following lemma is a special case of Theorem 7.4 of [16].

Lemma 6.3 Let f be a function on graphs such that, if G ′ is obtained from G by adding an edge and deleting an edge, then
Proof. Given an m-tuple x of distinct edges of K_n, let G(x) be the graph on {1, ..., n} with edges those mentioned in x (ignoring the order), and let f x and y differ in exactly one co-ordinate, or if they differ in exactly two co-ordinates and the values there are swapped. Thus we may apply Theorem 7.4 of [16], see also Example 7.3 there. ✷

Therefore, setting x = τ(θ)n − εn, for some ε > 0 which will be specified later, in Lemma 6.1, and using also Lemmas 6.2, 4.2 and 4.3 along with equation (8) we obtain the following: We now fix θ = 4.98887.
From the proof of Lemma 2.3 we may see that S_2 ≤ 1, for n sufficiently large. We keep the definition of the region D unchanged. Inside D, we may perform numerical investigations similar to those in the case of the rigid 3-colourings. We consider the same sub-cube C as before, and find again that h(k, α, λ, θ) attains its maximum inside C. We can check through the same family of sub-cubes and see that the maximum value µ(θ) of h(k, α, λ, θ) over D satisfies µ(θ) > 0, but τ(θ) − µ(θ) ≥ 10^{−5}. Then by (23) with

Concluding remarks
We considered the adequate family consisting of the rigid 3-colourings of a graph, and investigated carefully the expected number of such colourings in the random graph G_{n,m}. We thus obtained an upper bound on the non-3-colourability threshold θ_3^+, which appears independently in [14]. We then improved this upper bound slightly, by taking into account the number of non-trivial tree components.
Let us sketch now some related ideas that also will improve the upper bound slightly.When we considered the number of non-trivial tree components in Section 6, in the proof of Lemma 6.1 we used the fact that R(T ) ≥ 2 for any non-trivial tree T .We can be more precise; for example R(T ) ≥ 3 unless T is a star.By computing the value of R(T ) for each 'small' non-trivial tree (say those having at most 5 vertices), and then following the general approach in Section 6, it is possible to obtain a slight improvement on Theorem 1.3 (see [10] for further details).
We may also consider an adequate subfamily of the rigid 3-colourings of a graph G, namely the leftmost 3-colourings.These are the proper 3-colourings S 1 , S 2 , S 3 where |S 3 | is minimal and, subject to this, |S 2 | is minimal.Note that any such 3-colouring must be rigid and, further, this family is adequate.Unfortunately it seems to be hard to study leftmost 3-colourings, but we can work with related families such as those defined in terms of "Ψ-gadgets".
Given a 3-colouring S_1, S_2, S_3 of a graph G, a Ψ_{12}-gadget is a component of the subgraph induced on S_1 ∪ S_2, which is a star with centre in S_1 and at least 2 leaves (which must belong to S_2). We may define Ψ_{13}- and Ψ_{23}-gadgets similarly. Call a rigid 3-colouring Ψ-gadget free if there are no Ψ_{12}, Ψ_{13} or Ψ_{23} gadgets. Note that these 3-colourings form an adequate family, since each leftmost 3-colouring is Ψ-gadget free. By analysing such families of 3-colourings we may reduce the upper bound on θ_3^+ slightly; see [11] and [10].

Appendices
A A note on Subsection 3.3
We set u = r/t. Recall that x_0(u) = x_0(r, t) is the root of the following equation: for 0 < u = u(r, t) ≤ 1. Let x_0 = x_0(u) throughout this Appendix. Suppose that y is either r or t. We have Therefore, Again, using (24) we have e^{x_0} − 1 − e^{x_0} u x_0 = 0.
So, u = e^{−x_0}(e^{x_0} − 1)x_0^{−1}. Thus, (25) becomes ∂x_0/∂y = (∂u/∂y) x_0 / (e^{−x_0} − e^{−x_0}(e^{x_0} − 1)x_0^{−1}). The denominator is negative, since 1 + x_0 < e^{x_0}. Thus, the sign of this expression depends upon the sign of ∂u/∂y. We shall use these expressions for the derivatives of x_0(u) with respect to the variables on which it depends in Appendix C. Now, we shall give a lower bound on f(t, r) and we will study its monotonicity. In fact we shall work with , where u = r/t. We have Thus, In what follows, we investigate the monotonicity of the latter function. The derivative of this with respect to u is We have to determine the sign of the numerator. Using (25), we obtain , where x_0 = x_0(x/y). Thus, for (k, α, λ) ∈ D we have ,

C The concavity of h(k, α, λ) over D
In this Appendix, we show that the function h(k, α, λ) as it was defined in Section 5 is strictly concave over the interior of D, where D was defined in (17). (Recall that the function h(k, α, λ, θ) was defined in (18), and we have fixed θ = θ_2 = 4.9895 to obtain h(k, α, λ).) This treatment improves the proof in an earlier version of this paper, and was inspired by the correction [12] to [14]. We split the function h(k, α, λ) (multiplied by ln 2 to change to natural logarithms) into four parts. Namely, for any (k, α, λ) ∈ D, we have where and x_1, x_2 are defined in (28) and (29), respectively. For i = 1, ..., 4, we set h_i = h_i(k, α, λ). We prove that each of these functions is concave over the interior of D, with h_4 being strictly concave there. To prove that a suitably differentiable function is concave (strictly concave, respectively) over an open domain, we have to prove that its Hessian matrix (i.e. the matrix of the second partial derivatives) is negative semidefinite (definite, respectively) over this domain (see for example [21], Theorem 5.5.5 p. 230). By, for example, Theorem 6E in [19] (p. 339), to check negative semidefiniteness (definiteness, respectively) of a real symmetric matrix it is necessary and sufficient to show that the principal minors have alternating signs (and are non-zero in the case of definiteness), the first one being non-positive. Thus, we may deal with h_1, ..., h_4 as follows (for further details of the calculations see [10]).
1. Observe that h_1 does not depend on λ. It can be easily checked that it is concave over D. The elements of the Hessian matrix with respect to k and α are the following: .
> 0, and so is concave over D.
2. Next we consider the function h_2, which does not depend on α. After doing some algebraic manipulations and using (29) and (26), we obtain: Similarly, and Clearly we have ∂²h_2/∂k² < 0 and it is easily checked from the above that is identically 0 over D; and so h_2 is concave there.
3. The function h_3 has precisely the same form as h_2, where 1 − λ has been replaced by λ. Thus, we deduce that h_3 is also concave over the interior of D.
4. We show that h_4 is strictly concave over the interior of D. Its Hessian matrix is:
To show strict concavity, it suffices to show that the principal minors are non-zero and have alternating signs with the first one being negative. It is easy to see that the first two principal minors are as required, since ∂²h_4/∂k² and ∂²h_4/∂α² are strictly less than −(1−λ)/(k−α)². The third principal minor, which is the determinant ∆ of the matrix, satisfies and this can be verified e.g. by using Maple (or see [10] for the details). The denominator is always strictly positive, so it is sufficient to show that the numerator is negative for any (k, α, λ) ∈ D.

Fig. 1: The function x_0(u)

P[S is rigid | S is proper], where S ∈ C(k, α, n), m = ⌈θn/2⌉ and we are working in the G*_{n,m} model.

where as above θ_2 = 4.9895. Then h is continuous over D, and we shall see in Appendix C that it is strictly concave over the interior of D. Thus, if C ⊆ D and y ∈ C^o (the interior of C) are such that h(y) > h(x) for every x in the boundary of C, then h has a unique maximum point x* over D and x* ∈ C^o. Using concavity, we can estimate numerically where the maximum of h is located, and give an upper bound on h over this domain, as follows.