Intransitive dice V: we want a local central limit theorem
Gowers's Weblog 2018-03-10
It has become clear that what we need in order to finish off one of the problems about intransitive dice is a suitable version of the local central limit theorem. Roughly speaking, we need a version that is two-dimensional — that is, concerning a random walk on $\mathbb{Z}^2$ — and completely explicit — that is, giving precise bounds for error terms so that we can be sure that they get small fast enough.
There is a recent paper that does this in the one-dimensional case, though it used an elementary argument, whereas I would prefer to use Fourier analysis. Here I’d like to begin the process of proving a two-dimensional result that is designed with our particular application in mind. If we are successful in doing that, then it would be natural to try to extract from the proof a more general statement, but that is not a priority just yet.
As people often do, I’ll begin with a heuristic argument, and then I’ll discuss how we might try to sharpen it up to the point where it gives us good bounds for the probabilities of individual points of $\mathbb{Z}^2$. Much of this post is cut and pasted from comments on the previous post, since it should be more convenient to have it in one place.
Using characteristic functions
The rough idea of the characteristic-functions approach, which I’ll specialize to the 2-dimensional case, is as follows. (Apologies to anyone who knows about this properly for anything idiotic I might accidentally write.) Let $X=(X_1,X_2)$ be a random variable on $\mathbb{Z}^2$ and write $p_{a,b}$ for $\mathbb{P}[X=(a,b)]$. If we take $n$ independent copies of $X$ and add them together, then the probability of being at $(a,b)$ is $p^{*n}(a,b)$, where that denotes the $n$-fold convolution.
Now let’s define the Fourier transform of $p$, which probabilists call the characteristic function, in the usual way by

$\hat p(\theta,\phi)=\mathbb{E}\,e(\theta X_1+\phi X_2)=\sum_{a,b}p_{a,b}\,e(a\theta+b\phi)$,

where $e(x)$ is shorthand for $e^{2\pi ix}$. Here $\theta$ and $\phi$ belong to $\mathbb{T}=\mathbb{R}/\mathbb{Z}$, but I’ll sometimes think of them as belonging to $[-1/2,1/2)$ too.
We have the convolution law that $\widehat{p*q}=\hat p\,\hat q$ and the inversion formula

$p_{a,b}=\int_{\mathbb{T}^2}\hat p(\theta,\phi)\,e(-a\theta-b\phi)\,d\theta\,d\phi.$

Putting these together, we find that if the random variables $X^{(1)},\dots,X^{(n)}$ are independent copies of $X$, then the probability that their sum is $(a,b)$ is

$p^{*n}(a,b)=\int_{\mathbb{T}^2}\hat p(\theta,\phi)^n\,e(-a\theta-b\phi)\,d\theta\,d\phi.$
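As a quick numerical sanity check (my own illustration, not part of the argument), here is a sketch that computes an $n$-fold convolution of a toy distribution on $\mathbb{Z}^2$ in two ways: directly, by repeated convolution, and via the Fourier route, by raising the transform to the $n$th power and inverting. The toy support points, weights and grid size are arbitrary choices.

```python
import numpy as np

# Toy distribution p on Z^2: a few support points with random weights.
rng = np.random.default_rng(0)
support = [(0, 0), (1, 0), (0, 1), (-1, -1), (2, 1)]
weights = rng.random(len(support))
weights /= weights.sum()
n = 6

# Route 1: n-fold convolution by direct dynamic programming on Z^2.
pn = {(0, 0): 1.0}
for _ in range(n):
    nxt = {}
    for (x, y), w in pn.items():
        for (a, b), q in zip(support, weights):
            nxt[(x + a, y + b)] = nxt.get((x + a, y + b), 0.0) + w * q
    pn = nxt

# Route 2: embed p in a large L x L grid (negative coordinates wrap),
# take the discrete Fourier transform, raise to the nth power, invert.
L = 64
grid = np.zeros((L, L))
for (a, b), q in zip(support, weights):
    grid[a % L, b % L] = q
pn_fourier = np.real(np.fft.ifft2(np.fft.fft2(grid) ** n))

# The two routes agree (the grid is big enough that nothing wraps round).
err = max(abs(w - pn_fourier[x % L, y % L]) for (x, y), w in pn.items())
assert err < 1e-12
assert abs(sum(pn.values()) - 1) < 1e-12
```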
The very rough reason that we should now expect a Gaussian formula is that we consider a Taylor expansion of $\hat p$. We can assume for our application that $X_1$ and $X_2$ have mean zero. From that one can argue that the coefficients of the linear terms in the Taylor expansion are zero. (I’ll give more details in a subsequent comment.) The constant term is 1, and the quadratic terms give us the covariance matrix of $X_1$ and $X_2$. If we assume that we can approximate $\hat p(\theta,\phi)$ by an expression of the form $e^{-q(\theta,\phi)}$ for some suitable quadratic form $q$ in $\theta$ and $\phi$, then the $n$th power should be close to $e^{-nq(\theta,\phi)}$, and then, since Fourier transforms (and inverse Fourier transforms) take Gaussians to Gaussians, when we invert this one, we should get a Gaussian-type formula for $p^{*n}(a,b)$. So far I’m glossing over the point that Gaussians are defined on $\mathbb{R}^2$, whereas $a$ and $b$ live in $\mathbb{Z}$ and $\theta$ and $\phi$ live in $\mathbb{T}$, but if most of $\hat p^{\,n}$ is supported in a small region around 0, then this turns out not to be too much of a problem.
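To illustrate the heuristic, here is a sketch (with an arbitrarily chosen lazy walk on $\mathbb{Z}^2$, not the dice distribution) comparing the exact point probabilities $p^{*n}(a,b)$ with the Gaussian density that has the matching covariance matrix. The step distribution is symmetric, so it has mean zero, and it is aperiodic thanks to the holding probability at the origin.

```python
import numpy as np

# A mean-zero, aperiodic step distribution on Z^2 (an arbitrary choice):
# stay put w.p. 0.2, else move one step in one of the four directions.
steps = {(0, 0): 0.2, (1, 0): 0.2, (-1, 0): 0.2, (0, 1): 0.2, (0, -1): 0.2}
n, L = 400, 256

grid = np.zeros((L, L))
for (a, b), q in steps.items():
    grid[a % L, b % L] = q
pn = np.real(np.fft.ifft2(np.fft.fft2(grid) ** n))  # exact p^{*n}

# The covariance of one step is 0.4 * I, so the sum has covariance 0.4n * I.
var = 0.4 * n
gauss = lambda a, b: np.exp(-(a * a + b * b) / (2 * var)) / (2 * np.pi * var)

# Compare at a few lattice points near the centre of the distribution.
rel_errs = [abs(pn[a % L, b % L] - gauss(a, b)) / gauss(a, b)
            for (a, b) in [(0, 0), (5, 3), (-10, 7), (12, -12)]]
assert max(rel_errs) < 0.05
```

The agreement to within a few percent at scale $\sqrt{0.4n}$ is exactly the local-central-limit-theorem phenomenon the post is after, though of course with no explicit error bounds.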
The Taylor-expansion part
If we take the formula $\hat p(\theta,\phi)=\sum_{a,b}p_{a,b}\,e(a\theta+b\phi)$ and partially differentiate $r$ times with respect to $\theta$ and $s$ times with respect to $\phi$, we obtain the expression

$\dfrac{\partial^{r+s}\hat p}{\partial\theta^r\,\partial\phi^s}(\theta,\phi)=(2\pi i)^{r+s}\,\mathbb{E}\,X_1^rX_2^s\,e(\theta X_1+\phi X_2).$

Setting $(\theta,\phi)=(0,0)$ turns this into $(2\pi i)^{r+s}\,\mathbb{E}\,X_1^rX_2^s$. Also, for every $\theta$ and $\phi$ the absolute value of the partial derivative is at most $(2\pi)^{r+s}\,\mathbb{E}\,|X_1|^r|X_2|^s$. This allows us to get a very good handle on the Taylor expansion of $\hat p$ when $\theta$ and $\phi$ are close to the origin.
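These derivative formulas are easy to check numerically. The sketch below (a toy distribution again, my own) compares finite-difference derivatives of $\hat p$ at the origin with $(2\pi i)^{r+s}\,\mathbb{E}\,X_1^rX_2^s$ for $(r,s)=(1,0)$ and $(1,1)$.

```python
import numpy as np

rng = np.random.default_rng(1)
pts = np.array([(0, 0), (1, 2), (-1, 1), (2, -1), (-2, 0)])
w = rng.random(len(pts))
w /= w.sum()

def phat(theta, phi):
    # \hat p(theta, phi) = sum_{a,b} p_{a,b} e(a*theta + b*phi)
    return np.sum(w * np.exp(2j * np.pi * (pts[:, 0] * theta + pts[:, 1] * phi)))

h = 1e-5
# First derivative in theta at the origin vs (2 pi i) E[X_1].
d1 = (phat(h, 0) - phat(-h, 0)) / (2 * h)
# Mixed second derivative at the origin vs (2 pi i)^2 E[X_1 X_2].
d12 = (phat(h, h) - phat(h, -h) - phat(-h, h) + phat(-h, -h)) / (4 * h * h)

EX1 = np.sum(w * pts[:, 0])
EX1X2 = np.sum(w * pts[:, 0] * pts[:, 1])
assert abs(d1 - 2j * np.pi * EX1) < 1e-6
assert abs(d12 - (2j * np.pi) ** 2 * EX1X2) < 1e-3
```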
Recall that the two-dimensional Taylor expansion of $f$ about $(0,0)$ is given by the formula

$f(\theta,\phi)=f(0,0)+\theta D_1f(0,0)+\phi D_2f(0,0)+\tfrac12\theta^2D_{11}f(0,0)+\theta\phi D_{12}f(0,0)+\tfrac12\phi^2D_{22}f(0,0)+E(\theta,\phi),$

where $D_1$ is the partial derivative operator with respect to the first coordinate, $D_{12}$ the mixed partial derivative, and so on.

In our case, $\hat p(0,0)=1$, $D_1\hat p(0,0)=2\pi i\,\mathbb{E}X_1=0$, and $D_2\hat p(0,0)=2\pi i\,\mathbb{E}X_2=0$.

As in the one-dimensional case, the error term has an integral representation, namely

$E(\theta,\phi)=\int_0^1\frac{(1-t)^2}{2}\,\frac{d^3}{dt^3}f(t\theta,t\phi)\,dt$,

which has absolute value at most $\frac16\max_{0\le t\le1}\big|\frac{d^3}{dt^3}f(t\theta,t\phi)\big|$, which in turn, when $f=\hat p$, is at most $\frac{(2\pi)^3}{6}\,\mathbb{E}\big(|\theta||X_1|+|\phi||X_2|\big)^3$.
When $X$ is the random variable $(f_A(j),g_A(j))$ (where $A=(a_1,\dots,a_n)$ is a fixed $n$-sided die, $j$ is chosen randomly from $[n]$, $f_A(j)=\#\{i:a_i<j\}+\tfrac12\#\{i:a_i=j\}-j+\tfrac12$, and $g_A(j)=j-\tfrac{n+1}2$), we have that $|X_2|\le n/2$, so $\mathbb{E}|X_2|^3\le n^3/8$.
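For concreteness, here is a small simulation of this random variable as I understand the model from this series: a die is sampled uniformly from $[n]^n$ conditioned (by rejection) on having the standard sum $n(n+1)/2$, with $f_A(j)=\#\{i:a_i<j\}+\frac12\#\{i:a_i=j\}-j+\frac12$ and $g_A(j)=j-\frac{n+1}2$; these formulas are my reading of the definitions from earlier posts. The checks confirm that both coordinates have mean zero under the sum condition and that $|X_2|\le n/2$.

```python
import numpy as np

def random_die(n, rng):
    # Uniform from [n]^n conditioned on sum n(n+1)/2, by rejection.
    target = n * (n + 1) // 2
    while True:
        a = rng.integers(1, n + 1, size=n)
        if a.sum() == target:
            return a

def f_A(a, n):
    # f_A(j) = #{i: a_i < j} + (1/2) #{i: a_i = j} - j + 1/2, for j = 1..n
    j = np.arange(1, n + 1)
    less = (a[:, None] < j[None, :]).sum(axis=0)
    ties = (a[:, None] == j[None, :]).sum(axis=0)
    return less + 0.5 * ties - j + 0.5

rng = np.random.default_rng(2)
n = 50
a = random_die(n, rng)
x1 = f_A(a, n)                          # values taken by X_1
x2 = np.arange(1, n + 1) - (n + 1) / 2  # values taken by X_2 = g_A(j)

assert abs(x1.mean()) < 1e-9   # mean zero, thanks to the sum condition
assert abs(x2.mean()) < 1e-9
assert np.abs(x2).max() <= n / 2
print(np.abs(x1).max() / np.sqrt(n))   # typically of order 1
```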
With very slightly more effort we can get bounds for the moments of $X_1=f_A(j)$ as well. For any particular $j$ and a purely random sequence $(a_1,\dots,a_n)$ of elements of $[n]$, the probability that $|f_A(j)|\ge t\sqrt n$ is bounded above by $e^{-ct^2}$ for an absolute constant $c$. (Something like 1/8 will do.) So the probability that there exists such a $j$, conditional on the sum of the $a_i$ being $n(n+1)/2$ (which happens with probability about $cn^{-3/2}$), is at most $Cn^{5/2}e^{-ct^2}$, and in particular is small when $t\ge C\sqrt{\log n}$. I think that with a bit more effort we could probably prove that $\mathbb{E}|X_1|^3$ is at most $Cn^{3/2}$, which would allow us to improve the bound for the error term, but I think we can afford the logarithmic factor here, so I won’t worry about this. So we get an error of

$E(\theta,\phi)=O\Big(\big(|\theta|(n\log n)^{1/2}+|\phi|n\big)^3\Big).$
For this error to count as small, we want it to be small compared with the second moments. For the time being I’m just going to assume that the rough size of the second-moment contribution is around $\theta^2n+\phi^2n^2$. So for our error to be small, we want $|\theta|$ to be $o\big(n^{-1/2}(\log n)^{-3/2}\big)$ and $|\phi|$ to be $o(n^{-1})$.

That is giving us a rough idea of the domain in which we can say confidently that the terms up to the quadratic ones give a good approximation to $\hat p$, and hence that $\hat p^{\,n}$ is well approximated by a Gaussian.
The remaining part
Outside the domain, we have to do something different, and that something is fairly simple: we shall show that $|\hat p(\theta,\phi)|^n$ is very small. This is equivalent to showing that $|\hat p(\theta,\phi)|$ is bounded away from 1 by significantly more than $1/n$. This we do by looking more directly at the formula for the Fourier transform:

$\hat p(\theta,\phi)=\mathbb{E}\,e(\theta X_1+\phi X_2).$

We would like this to have absolute value bounded away from 1 by significantly more than $1/n$ except when $|\theta|$ is quite a bit smaller than $n^{-1/2}$ and $|\phi|$ is quite a bit smaller than $n^{-1}$.

Now in our case $X$ is uniformly distributed on the $n$ points $(f_A(j),g_A(j))$, $j=1,\dots,n$. So we can write $\hat p(\theta,\phi)$ as

$\hat p(\theta,\phi)=\frac1n\sum_{j=1}^n e\big(\theta f_A(j)+\phi g_A(j)\big).$
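Continuing the simulation theme, the sketch below evaluates $\hat p(\theta,\phi)=\frac1n\sum_j e(\theta f_A(j)+\phi g_A(j))$ for one sampled die (same hypothetical sampling model and definitions of $f_A$, $g_A$ as in the earlier sketch) and confirms that $\hat p(0,0)=1$ while, at a point well outside the good region, the modulus is comfortably bounded away from 1.

```python
import numpy as np

def random_die(n, rng):
    # Uniform from [n]^n conditioned on sum n(n+1)/2, by rejection.
    target = n * (n + 1) // 2
    while True:
        a = rng.integers(1, n + 1, size=n)
        if a.sum() == target:
            return a

rng = np.random.default_rng(3)
n = 50
a = random_die(n, rng)
j = np.arange(1, n + 1)
f = (a[:, None] < j).sum(axis=0) + 0.5 * (a[:, None] == j).sum(axis=0) - j + 0.5
g = j - (n + 1) / 2

def phat(theta, phi):
    # (1/n) sum_j e(theta f_A(j) + phi g_A(j))
    return np.mean(np.exp(2j * np.pi * (theta * f + phi * g)))

assert abs(phat(0, 0) - 1) < 1e-12
# Far from the origin the phases spread over many full circles, so the
# modulus should be far from 1 (heuristically of size about 1/sqrt(n)).
assert abs(phat(0.1, 0.37)) < 0.9
```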
Here’s a possible way that we might try to bound that sum. Let $m=n/2$ (let us assume for convenience that $n$ is even) and let us split up the sum into pairs of terms with $j$ and $j+m$, for $j=1,\dots,m$. So each pair of terms will take the form

$e\big(\theta f_A(j)+\phi g_A(j)\big)+e\big(\theta f_A(j+m)+\phi g_A(j+m)\big).$

The ratio of these two terms is

$e\big(\theta(f_A(j+m)-f_A(j))+\phi m\big),$

since $g_A(j+m)-g_A(j)=m$. And if the ratio is $e(\delta)$, then the modulus of the sum of the two terms is at most $2-c\|\delta\|^2$, where $\|\delta\|$ denotes the distance from $\delta$ to the nearest integer.
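The elementary fact being used here is that two unit vectors whose phases differ by $\delta$ cannot sum to something of modulus close to 2 unless $\delta$ is close to an integer: quantitatively, $|e(u)+e(v)|=2|\cos(\pi(u-v))|\le 2-8\|u-v\|^2$, so $c=8$ works. A quick numerical check of this inequality:

```python
import numpy as np

# Check |e(u) + e(v)| = 2|cos(pi (u - v))| <= 2 - 8 ||u - v||^2,
# where ||x|| is the distance from x to the nearest integer.
e = lambda x: np.exp(2j * np.pi * x)
delta = np.linspace(-3, 3, 20001)
u = 0.1234                               # arbitrary base phase; only u - v matters
lhs = np.abs(e(u) + e(u + delta))
dist = np.abs(delta - np.round(delta))   # distance to nearest integer
assert np.allclose(lhs, 2 * np.abs(np.cos(np.pi * delta)))
assert np.all(lhs <= 2 - 8 * dist**2 + 1e-9)
```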
Now let us suppose that as $j$ varies, the differences $f_A(j+m)-f_A(j)$ are mostly reasonably well distributed in an interval between $-C\sqrt n$ and $C\sqrt n$, as seems very likely to be the case. Then the ratios above vary in a range from about $e(\phi m-C\theta\sqrt n)$ to $e(\phi m+C\theta\sqrt n)$. But that should imply that the entire sum, when divided by $n$, has modulus at most $1-c\theta^2n$. (This analysis obviously isn’t correct when $\theta^2n$ is bigger than $1$, since the modulus can’t be negative, but once we’re in that regime, then it really is easy to establish the bounds we want.)
If $|\theta|$ is, say, $n^{-1/2}$, then this gives us $|\hat p(\theta,\phi)|\le 1-c$, and raising that to the power $n$ gives us $e^{-cn}$, which is tiny.
As a quick sanity check, note that for $\hat p(\theta,\phi)^n$ not to be tiny we need $|\theta|$ to be not much more than $n^{-1}$. This reflects the fact that a random walk of $n$ steps of typical size about $\sqrt n$ will tend to be at a distance comparable to $n$ from the origin, and when you take the Fourier transform, you take the reciprocals of the distance scales.
If $|\theta|$ is quite a bit smaller than $n^{-1/2}$ and $|\phi|$ is not too much smaller than $n^{-1}$, then the numbers $\theta f_A(j)$ are all small but the numbers $\phi g_A(j)$ vary quite a bit, so a similar argument can be used to show that in this case too the Fourier transform is not close enough to 1 for its $n$th power to be large. I won’t give details here.
What is left to do?
If the calculations above are not too wide of the mark, then the main thing that needs to be done is to show that for a typical die $A$ the numbers $f_A(j+m)-f_A(j)$ are reasonably uniform in a range of width around $\sqrt n$, and more importantly that the numbers $f_A(j)$ are not too constant: basically I’d like them to be pretty uniform too.
It’s possible that we might want to try a slightly different approach, which is to take the uniform distribution on the set of points $(f_A(j),g_A(j))$, convolve it once with itself, and argue that the resulting probability distribution is reasonably uniform in a rectangle of width around $\sqrt n$ and height around $n$. By that I mean that a significant proportion of the points are hit around $\sqrt n$ times each (because there are $n^2$ sums and they lie in a rectangle of area about $n^{3/2}$). But one way or another, I feel pretty confident that we will be able to bound this Fourier transform and get the local central limit theorem we need.