To cheer you up in difficult times 32, Annika Heckel’s guest post: How does the Chromatic Number of a Random Graph Vary?
Combinatorics and more 2021-10-02
This is a guest post kindly written by Annika Heckel. We first reported about Annika Heckel’s breakthrough in this post. A pdf version of this post can be found here.
Pick an -vertex graph uniformly at random. Pick another one. Will it have the same chromatic number? Or if not, how different are their chromatic numbers likely to be if
is large?
The chromatic number of a random graph is a random variable, so what we are really asking is: Is this random variable essentially deterministic? That is, is the weight of the distribution concentrated on one value, or on just a few values which are close together?
Until recently we did not know whether or not this is the case. In this blog post I’ll describe a new result showing that, at least for infinitely many , the chromatic number of a random
-vertex graph is not concentrated on fewer than about
consecutive values.
Random graphs
A uniform -vertex graph is generated by the random graph
where we include each possible edge independently with probability
, and more generally we can ask about the chromatic number
of
.
Results about generally fall into one of two categories: Can we find (i) upper and lower bounds for its typical value, and (ii) bounds on how much it varies about this typical value?
The typical value
On (i), we have the well-known 1987 result of Bollobás who showed that, with high probability (whp), . This was improved and generalised several times, the current best bounds are from 2016:
(I am writing out this result because it will become important when talking about question (ii) later.)
Upper concentration bounds
So how about the width of the distribution of — what is the length of the shortest interval (or rather sequence of intervals) which is likely to contain
?
Of course (1) is already a weak concentration type result, giving an explicit such interval of length . However, it turns out that if we don’t have to specify the interval, we can do a lot better.
The starting point for question (ii) is the classic 1987 result of Shamir and Spencer, who showed that for any function ,
is whp contained in some sequence of intervals of length about
. If
quickly enough, however, much sharper concentration holds: in 1997, Alon and Krivelevich reduced the interval length to only
in the case
. So here the chromatic number behaves almost like a deterministic function; almost all of the weightof the distribution is on two consecutive values.
The opposite question
In view of strong results showing that the chromatic number is sharply concentrated, in the late 1980s Bollobas raised the following question (later popularised by him and Erdos): Can we find any examples where χ(Gn,p) is not very sharply concentrated? This is trivially true for p = 1 − 1/(10n) (as noted by Alonand Krivelevich), but how about non-trivial examples, and what about the mostnatural special case, p = 1/2?
This question remained open for quite some time, for a simple reason: whilewe have a number of standard tools to prove upper bounds on the concentrationof a random variable (for example the martingale concentration argument ofShamir and Spencer), we have few ways of giving lower concentration bounds.(Unless we can work out the entire approximate distribution, for example byproving asymptotic normality.)A lower bound for concentrationI will sketch the main ideas needed for the following result:
Theorem 1 (H. 2019; H., Riordan 2021). Let ε > 0, and let be a sequence of intervals such that
whp. Then there are infinitely many values
such that
Of course we can also replace with some function
which tends to
slowly. Up to this
-term, the lower bound matches Shamir and Spencer’s upper bound.
A word of caution: The theorem only says that there are some so that
is not very sharply concentrated. It does not tell us what these values
are, and no bound is given for the other
. In fact, we do not believe the conclusion of Theorem~1 is true for every
— more on this later!
Basic proof strategy
The proof needs two ingredients:(1) A (weak) concentration type result
A result that says that, whp, for some well-behaved function
and an error term
. Here,
is much larger than the scale on which we are trying to prove non-concentration. We also need a lower bound on the slope of
,
where we will specify
later.
(2) A coupling result
A coupling that shows that, for and some slightly larger
,
and
are close together. More specifically, we need a coupling of
and
with
(with
as above) so that with probability at least
, say,
Now suppose is a sequence of intervals as in Theorem 1. As shown in the picture, we now know that
is concentrated around a function
with slope more than
(blue area), but it also follows from the coupling that the slope between the concentration intervals (red lines) is at most
.
If all intervals were short, then as we increase , eventually this would lead to a contradiction: an interval
lying outside the blue area. So there must be {at least one long interval}.
Working out the numbers, under certain reasonable conditions there is an interval of length about . We will have
of order
,
of order
and
close to
.
Independence number and large independent sets
So what’s the functionWhat does this have to do with ? Each colour class in a colouring (i.e. all the vertices of one particular colour) is independent, because neighbouring vertices must be coloured differently. So there are at most
vertices in each colour class, and therefore
. We saw in (1) that this easy lower bound is in fact asymptotically correct.
We call an independent set of size an
-set. It is plausible that the optimal colouring of
contains all or almost all
-sets as colour classes. Roughly speaking, this is because the expected number of
-colourings is dominated by colourings with as many
-sets as possible.
Let count the number of
-sets, then it can be shown that
is approximately Poisson with mean
, where
. In particular,
typically varies by about
. If we must use all
-sets in the colouring, then it seems plausible that the overall number of colours we end up with also varies by roughly this amount. (Divided by
, as we will see in a moment.)
The weak concentration result
Luckily we already have a suitable estimate forSo what is the effect of having one extra -set in
? Using one colour for this
-set, we need to colour the remaining
vertices. So one extra
-set should save us about
colours. So extra
-sets should reduce
by about
.
The coupling
We construct the coupling in the following way: take
The inner graph on
vertices is simply
. It is also clear that, if
is the outer graph on
vertices, then
This is because any colouring of can be extended to a colouring of
by giving a new colour to each of the
-sets.
The trouble is that is not distributed as
— we have disturbed the distribution by planting the
-sets. The key point is that, as long as
is not too large — less than the standard deviation
of
— then
and
are very similar. This makes sense on an intuitive level: the random graph doesn’t `notice’ the extra
-sets as long as we have planted fewer than the natural fluctuation
.
So we let (or in fact
works). For the formal proof, we bound the total variation distance of the two random graph models.
Tying it all together
The two ingredients above can be combined to show that for everyHow close to Shamir and Spencer’s upper bound can we actually get? At the moment, nothing better than
for some unspecified function
is possible. The main bottleneck is the size of the error term
in (1).
Konstantinos Panagiotou and I have been working on an improved estimate for in a paper we are currently writing up. Assuming this result, Oliver Riordan and I can prove that there are infinitely many
such that
The best upper concentration bound for is
, due to Alon.
Which of these is closer to the truth? We conjecture that, for the worst case , the width of the distribution of
matches our lower bound up to the constant. However, the concentration behaviour seems to be very different for different
, as described below.
The Zigzag conjecture
LetThere is another conjectured lower bound coming from fluctuations in , the number of
-sets, namely
The Zigzag conjecture, made recently by Bollobás, Heckel, Morris, Panagiotou, Riordan and Smith, states that at least for most ,
is essentially the maximum of these two lower bounds. Ignoring
-terms, we have the following simplified statement.
Define by letting
. Let
Then, if denotes the standard deviation of
,
It is not hard to show that the function `zigzags’ between
and
, roughly linearly in
, as shown in the picture (the orange and blue lines are the lower bounds coming from
– and
-sets, respectively).
There’s a detailed heuristic explanation of the conjecture in the paper. We also think that we have a pretty good idea of the behaviour of at the extreme points. In particular, we believe that in the `worst case’,
is of order
, while in the `best case’ it’s of order
.