Intransitive dice V: we want a local central limit theorem
Gowers's Weblog 2018-03-10
It has become clear that what we need in order to finish off one of the problems about intransitive dice is a suitable version of the local central limit theorem. Roughly speaking, we need a version that is two-dimensional — that is, concerning a random walk on $\mathbb{Z}^2$ — and completely explicit — that is, giving precise bounds for error terms so that we can be sure that they get small fast enough.
There is a recent paper that does this in the one-dimensional case, though it uses an elementary argument, whereas I would prefer to use Fourier analysis. Here I’d like to begin the process of proving a two-dimensional result that is designed with our particular application in mind. If we are successful in doing that, then it would be natural to try to extract from the proof a more general statement, but that is not a priority just yet.
As people often do, I’ll begin with a heuristic argument, and then I’ll discuss how we might try to sharpen it up to the point where it gives us good bounds for the probabilities of individual points of $\mathbb{Z}^2$. Much of this post is cut and pasted from comments on the previous post, since it should be more convenient to have it in one place.
Using characteristic functions
The rough idea of the characteristic-functions approach, which I’ll specialize to the 2-dimensional case, is as follows. (Apologies to anyone who knows about this properly for anything idiotic I might accidentally write.) Let $X = (X_1, X_2)$ be a random variable on $\mathbb{Z}^2$ and write $f(x,y)$ for $\mathbb{P}[X = (x,y)]$. If we take $n$ independent copies of $X$ and add them together, then the probability of being at $(x,y)$ is $f^{*n}(x,y)$,
where $f^{*n}$ denotes the $n$-fold convolution of $f$ with itself.
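To make the convolution picture concrete, here is a small numerical sketch. The step distribution is a toy one chosen purely for illustration (a mean-zero nearest-neighbour walk on $\mathbb{Z}^2$, not the distribution that comes from dice), and the check confirms that the $n$-fold convolution really is the law of the sum of $n$ independent copies.

```python
import itertools
from collections import Counter
from fractions import Fraction

# Toy step distribution f on Z^2 (an illustration only, not the dice
# distribution from the post): a mean-zero nearest-neighbour walk.
f = {(1, 0): Fraction(1, 4), (-1, 0): Fraction(1, 4),
     (0, 1): Fraction(1, 4), (0, -1): Fraction(1, 4)}

def convolve(p, q):
    """Convolution of two finitely supported distributions on Z^2."""
    out = Counter()
    for (a, b), w1 in p.items():
        for (c, d), w2 in q.items():
            out[(a + c, b + d)] += w1 * w2
    return out

def convolve_power(p, n):
    """The n-fold convolution p * p * ... * p."""
    out = dict(p)
    for _ in range(n - 1):
        out = convolve(out, p)
    return out

# Sanity check: the n-fold convolution agrees with brute-force enumeration
# of all n-step walks, weighted by the product of the step probabilities.
n = 4
brute = Counter()
for steps in itertools.product(f, repeat=n):
    point = (sum(s[0] for s in steps), sum(s[1] for s in steps))
    weight = Fraction(1)
    for s in steps:
        weight *= f[s]
    brute[point] += weight

assert convolve_power(f, n) == brute
```

The exact arithmetic with `Fraction` means the agreement is exact rather than up to rounding; for instance, the 4-step walk returns to the origin with probability exactly $9/64$.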
Now let’s define the Fourier transform of $f$, which probabilists call the characteristic function, in the usual way by

$\hat f(\alpha,\beta) = \sum_{x,y} f(x,y)\, e^{2\pi i(\alpha x + \beta y)} = \mathbb{E}\, e^{2\pi i(\alpha X_1 + \beta X_2)}.$
Here $\alpha$ and $\beta$ belong to $\mathbb{T} = \mathbb{R}/\mathbb{Z}$, but I’ll sometimes think of them as belonging to the interval $[-1/2, 1/2)$ too.
We have the convolution law that $\widehat{f * g} = \hat f\, \hat g$ and the inversion formula

$f(x,y) = \int_{\mathbb{T}^2} \hat f(\alpha,\beta)\, e^{-2\pi i(\alpha x + \beta y)}\, d\alpha\, d\beta.$
Putting these together, we find that if the random variables $X^{(1)}, \dots, X^{(n)}$ are independent copies of $X$, then the probability that their sum is $(x,y)$ is

$f^{*n}(x,y) = \int_{\mathbb{T}^2} \hat f(\alpha,\beta)^n\, e^{-2\pi i(\alpha x + \beta y)}\, d\alpha\, d\beta.$
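Here is a quick numerical sketch of the inversion formula, again with a made-up nearest-neighbour step distribution rather than the dice one. Since the characteristic function of a finitely supported distribution is a trigonometric polynomial, the integral over $\mathbb{T}^2$ can be evaluated exactly (up to rounding) by averaging over a sufficiently fine grid.

```python
import cmath

# Toy step distribution (an illustration, not the dice distribution from
# the post): the mean-zero nearest-neighbour walk on Z^2.
f = {(1, 0): 0.25, (-1, 0): 0.25, (0, 1): 0.25, (0, -1): 0.25}
n = 4

def fhat(alpha, beta):
    """Characteristic function  E exp(2 pi i (alpha X1 + beta X2))."""
    return sum(w * cmath.exp(2j * cmath.pi * (alpha * x + beta * y))
               for (x, y), w in f.items())

def prob_of_sum(x, y, N=16):
    """Inversion formula: integral over T^2 of fhat^n e(-alpha x - beta y).

    Because fhat(alpha, beta)^n is a trigonometric polynomial of degree at
    most n, averaging the integrand over an N x N grid with N large enough
    reproduces the integral exactly, up to floating-point error."""
    total = 0.0
    for k in range(N):
        for l in range(N):
            a, b = k / N, l / N
            total += (fhat(a, b) ** n
                      * cmath.exp(-2j * cmath.pi * (a * x + b * y)))
    return (total / N ** 2).real
```

For this walk `prob_of_sum(0, 0)` recovers the return probability $9/64$ after four steps, and points of the wrong parity, such as $(1,0)$, correctly get probability zero.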
The very rough reason that we should now expect a Gaussian formula is that we consider a Taylor expansion of $\hat f$. We can assume for our application that $X_1$ and $X_2$ have mean zero. From that one can argue that the coefficients of the linear terms in the Taylor expansion are zero. (I’ll give more details in a subsequent comment.) The constant term is 1, and the quadratic terms give us the covariance matrix of $X_1$ and $X_2$. If we assume that we can approximate $\hat f(\alpha,\beta)$ by an expression of the form $e^{-q(\alpha,\beta)}$ for some suitable quadratic form $q$ in $\alpha$ and $\beta$, then the $n$th power should be close to $e^{-nq(\alpha,\beta)}$, and then, since Fourier transforms (and inverse Fourier transforms) take Gaussians to Gaussians, when we invert this one, we should get a Gaussian-type formula for $f^{*n}(x,y)$. So far I’m glossing over the point that Gaussians are defined on $\mathbb{R}^2$, whereas $\alpha$ and $\beta$ live in $\mathbb{T}$ and $x$ and $y$ live in $\mathbb{Z}$, but if most of $\hat f^n$ is concentrated in a small region around 0, then this turns out not to be too much of a problem.
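As a sanity check on this heuristic, one can compare the $n$-fold convolution with the corresponding Gaussian density in a toy example (again the nearest-neighbour walk on $\mathbb{Z}^2$, not the dice distribution; the factor 2 below is a parity correction, because this particular walk only reaches points with $x + y$ congruent to $n$ mod 2).

```python
from collections import Counter
from math import pi, exp

# Toy illustration of the Gaussian heuristic (a nearest-neighbour walk on
# Z^2, not the dice distribution from the post).
f = {(1, 0): 0.25, (-1, 0): 0.25, (0, 1): 0.25, (0, -1): 0.25}

def convolve(p, q):
    """Convolution of two finitely supported distributions on Z^2."""
    out = Counter()
    for (a, b), w1 in p.items():
        for (c, d), w2 in q.items():
            out[(a + c, b + d)] += w1 * w2
    return out

n = 100
dist = dict(f)
for _ in range(n - 1):
    dist = convolve(dist, f)

def gaussian(x, y):
    """Density of N(0, (n/2) I) at (x, y), times a parity factor of 2,
    since the walk only reaches points with x + y congruent to n mod 2."""
    return 2 * exp(-(x * x + y * y) / n) / (pi * n)

# The n-fold convolution is within a couple of per cent of the Gaussian
# prediction at and near the origin.
assert abs(dist[(0, 0)] / gaussian(0, 0) - 1) < 0.02
assert abs(dist[(10, 0)] / gaussian(10, 0) - 1) < 0.05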
The Taylor-expansion part
If we take the formula

$\hat f(\alpha,\beta) = \mathbb{E}\, e^{2\pi i(\alpha X_1 + \beta X_2)}$

and partially differentiate $r$ times with respect to $\alpha$ and $s$ times with respect to $\beta$ we obtain the expression

$(2\pi i)^{r+s}\, \mathbb{E}\, X_1^r X_2^s\, e^{2\pi i(\alpha X_1 + \beta X_2)}.$
Setting $\alpha = \beta = 0$ turns this into $(2\pi i)^{r+s}\, \mathbb{E}\, X_1^r X_2^s$. Also, for every $\alpha$ and $\beta$ the absolute value of the partial derivative is at most $(2\pi)^{r+s}\, \mathbb{E}\, |X_1|^r |X_2|^s$. This allows us to get a very good handle on the Taylor expansion of $\hat f$ when $\alpha$ and $\beta$ are close to the origin.
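A small numerical check of these derivative formulas, on a made-up nearest-neighbour step distribution: the first $\alpha$-derivative at the origin vanishes because the mean is zero, and the second equals $-4\pi^2\,\mathbb{E}X_1^2$.

```python
import cmath

# Toy step distribution (not the dice distribution from the post).
f = {(1, 0): 0.25, (-1, 0): 0.25, (0, 1): 0.25, (0, -1): 0.25}

def fhat(alpha, beta):
    """Characteristic function  E exp(2 pi i (alpha X1 + beta X2))."""
    return sum(w * cmath.exp(2j * cmath.pi * (alpha * x + beta * y))
               for (x, y), w in f.items())

h = 1e-3
# Central second difference in alpha at the origin.
second_diff = (fhat(h, 0) - 2 * fhat(0, 0) + fhat(-h, 0)).real / h ** 2
moment = sum(w * x * x for (x, y), w in f.items())   # E X1^2 = 1/2

# Second derivative at 0 is (2 pi i)^2 E X1^2 = -4 pi^2 E X1^2.
assert abs(second_diff - (-4 * cmath.pi ** 2 * moment)) < 1e-3

# First derivative at 0 vanishes because X1 has mean zero.
first_diff = (fhat(h, 0) - fhat(-h, 0)).real / (2 * h)
assert abs(first_diff) < 1e-9
```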
Recall that the two-dimensional Taylor expansion of a function $g$ about $(0,0)$, up to the quadratic terms, is given by the formula

$g(\alpha,\beta) = g(0,0) + \alpha D_1 g(0,0) + \beta D_2 g(0,0) + \tfrac12 \alpha^2 D_{11} g(0,0) + \alpha\beta D_{12} g(0,0) + \tfrac12 \beta^2 D_{22} g(0,0) + E(\alpha,\beta),$

where $D_1$ is the partial derivative operator with respect to the first coordinate, $D_{12}$ the mixed partial derivative, and so on, and $E(\alpha,\beta)$ is an error term.
In our case, $\hat f(0,0) = 1$, $D_1 \hat f(0,0) = D_2 \hat f(0,0) = 0$ (since $X_1$ and $X_2$ have mean zero), and $D_{ij} \hat f(0,0) = -4\pi^2\, \mathbb{E}\, X_i X_j$.
As in the one-dimensional case, the error term has an integral representation, namely

$E(\alpha,\beta) = \sum_{r+s=3} \frac{3\,\alpha^r \beta^s}{r!\, s!} \int_0^1 (1-t)^2\, D_1^r D_2^s \hat f(t\alpha, t\beta)\, dt,$

which has absolute value at most

$\sum_{r+s=3} \frac{(2\pi)^3\, |\alpha|^r |\beta|^s}{r!\, s!}\, \mathbb{E}\, |X_1|^r |X_2|^s,$

which in turn is at most

$\frac{(2\pi)^3}{6}\, \mathbb{E}\, \big( |\alpha X_1| + |\beta X_2| \big)^3.$
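Here is a numerical check of this error bound on a toy step distribution (not the dice one). For that distribution $\mathbb{E}X_1^2 = \mathbb{E}X_2^2 = 1/2$ and $\mathbb{E}X_1X_2 = 0$, and $|\alpha X_1| + |\beta X_2|$ equals $|\alpha|$ or $|\beta|$ with probability $1/2$ each, so both the quadratic approximation and the bound take a very simple form.

```python
import cmath

# Toy step distribution (not the dice distribution from the post).
f = {(1, 0): 0.25, (-1, 0): 0.25, (0, 1): 0.25, (0, -1): 0.25}

def fhat(alpha, beta):
    return sum(w * cmath.exp(2j * cmath.pi * (alpha * x + beta * y))
               for (x, y), w in f.items())

def quadratic_approx(alpha, beta):
    # 1 - 2 pi^2 (alpha^2 E X1^2 + 2 alpha beta E X1 X2 + beta^2 E X2^2),
    # with E X1^2 = E X2^2 = 1/2 and E X1 X2 = 0 for this distribution.
    return 1 - 2 * cmath.pi ** 2 * (alpha ** 2 + beta ** 2) / 2

def error_bound(alpha, beta):
    # ((2 pi)^3 / 6) E (|alpha X1| + |beta X2|)^3
    return (2 * cmath.pi) ** 3 / 6 * (abs(alpha) ** 3 + abs(beta) ** 3) / 2

# The third-order bound holds on a grid of small (alpha, beta).
for i in range(-5, 6):
    for j in range(-5, 6):
        a, b = 0.02 * i, 0.02 * j
        assert abs(fhat(a, b) - quadratic_approx(a, b)) \
            <= error_bound(a, b) + 1e-12
```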
When $X$ is the random variable $(g_A(j), j - (n+1)/2)$ (where $A$ is a fixed die and $j$ is chosen randomly from $[n]$), we have that $\mathbb{E}X = (0,0)$.
With very slightly more effort we can get bounds for the moments of $g_A(j)$ as well. For any particular $j$ and a purely random sequence $A \in [n]^n$, the probability that $|g_A(j)| \geq t\sqrt n$ is bounded above by $e^{-ct^2}$ for an absolute constant $c$. (Something like 1/8 will do.) So the probability that there exists such a $j$, conditional on $\sum_i a_i = n(n+1)/2$ (which happens with probability about $n^{-3/2}$), is at most $n^{5/2} e^{-ct^2}$, and in particular is small when $t \geq C\sqrt{\log n}$ for a suitable constant $C$. I think that with a bit more effort we could probably prove that $\mathbb{E}|g_A(j)|^3$ is at most $Cn^{3/2}$, which would allow us to improve the bound for the error term, but I think we can afford the logarithmic factor here, so I won’t worry about this. So we get an error of $O\big( |\alpha|^3 (n\log n)^{3/2} + |\beta|^3 n^3 \big)$.
For this error to count as small, we want it to be small compared with the second moments. For the time being I’m just going to assume that the rough size of the second-moment contribution is around $\alpha^2 n + \beta^2 n^2$. So for our error to be small, we want $\alpha$ to be $o\big(n^{-1/2} (\log n)^{-3/2}\big)$ and $\beta$ to be $o(n^{-1})$.
That gives us a rough idea of the domain in which we can say confidently that the terms up to the quadratic ones give a good approximation to $\hat f(\alpha,\beta)$, and hence that $\hat f(\alpha,\beta)^n$ is well approximated by a Gaussian.
The remaining part
Outside the domain, we have to do something different, and that something is fairly simple: we shall show that $|\hat f(\alpha,\beta)|^n$ is very small. This is equivalent to showing that $|\hat f(\alpha,\beta)|$ is bounded away from 1 by significantly more than $n^{-1}$. This we do by looking more directly at the formula for the Fourier transform:

$\hat f(\alpha,\beta) = \mathbb{E}\, e^{2\pi i(\alpha X_1 + \beta X_2)}.$
We would like this to have absolute value bounded away from 1 by significantly more than $n^{-1}$ except when $\alpha$ is quite a bit smaller than $n^{-1/2}$ and $\beta$ is quite a bit smaller than $n^{-1}$.
Now in our case $X$ is uniformly distributed on the $n$ points $(g_A(j), j - (n+1)/2)$ with $j \in [n]$. So, up to a global phase that does not affect the modulus, we can write $\hat f(\alpha,\beta)$ as

$\frac 1n \sum_{j=1}^n e^{2\pi i(\alpha g_A(j) + \beta j)}.$
Here’s a possible way that we might try to bound that sum. Let $m = n/2$ and let us split up the sum into pairs of terms with $j$ and $j+m$, for $1 \leq j \leq m$. So each pair of terms will take the form

$e^{2\pi i(\alpha g_A(j) + \beta j)} + e^{2\pi i(\alpha g_A(j+m) + \beta (j+m))}.$
The ratio of these two terms is

$e^{2\pi i(\alpha(g_A(j+m) - g_A(j)) + \beta m)}.$
And if the ratio is $e^{2\pi i\delta}$ with $|\delta| \leq 1/2$, then the modulus of the sum of the two terms is at most $2 - c\delta^2$ for an absolute constant $c > 0$.
Now let us suppose that as $j$ varies, the differences $g_A(j+m) - g_A(j)$ are mostly reasonably well distributed in an interval between $-C\sqrt n$ and $C\sqrt n$, as seems very likely to be the case. Then the ratios above vary in a range from about $e^{2\pi i(\beta m - C\alpha\sqrt n)}$ to $e^{2\pi i(\beta m + C\alpha\sqrt n)}$. But that should imply that the entire sum, when divided by $n$, has modulus at most $1 - c\alpha^2 n$. (This analysis obviously isn’t correct when $c\alpha^2 n$ is bigger than 1, since the modulus can’t be negative, but once we’re in that regime, then it really is easy to establish the bounds we want.)
If $\alpha$ is, say, $n^{-3/4}$, then this gives us $1 - cn^{-1/2}$, and raising that to the power $n$ gives us $\exp(-cn^{1/2})$, which is tiny.
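In the two-term bound above one can in fact take $c = 4$ (my choice of constant, not one fixed in the discussion): $|1 + e^{2\pi i\delta}| = 2|\cos(\pi\delta)|$, and $1 - \cos(\pi\delta) \geq 2\delta^2$ for $|\delta| \leq 1/2$. A quick numerical confirmation:

```python
import cmath

def pair_modulus(delta):
    """Modulus of the sum of two unit vectors whose ratio is e(delta)."""
    return abs(1 + cmath.exp(2j * cmath.pi * delta))

# |1 + e(delta)| <= 2 - 4 delta^2 on the whole range |delta| <= 1/2.
deltas = [k / 1000 for k in range(-500, 501)]
assert all(pair_modulus(d) <= 2 - 4 * d * d + 1e-12 for d in deltas)
```

At $\delta = 1/2$ the two terms cancel completely, which matches the bound $2 - 4\delta^2 = 1$ comfortably; the bound is tightest as $\delta \to 0$, where the true modulus is $2 - \pi^2\delta^2 + O(\delta^4)$.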
As a quick sanity check, note that for $\hat f(\alpha,\beta)^n$ not to be tiny we need $\alpha$ to be not much more than $n^{-1}$. This reflects the fact that a random walk of $n$ steps of typical size about $\sqrt n$ will tend to be at a distance comparable to $n$ from the origin, and when you take the Fourier transform, you take the reciprocals of the distance scales.
If $\alpha$ is quite a bit smaller than $n^{-1/2}$ and $\beta$ is not too much smaller than $n^{-1}$, then the numbers $\alpha g_A(j)$ are all small but the numbers $\beta j$ vary quite a bit, so a similar argument can be used to show that in this case too the Fourier transform is not close enough to 1 for its $n$th power to be large. I won’t give details here.
What is left to do?
If the calculations above are not too wide of the mark, then the main thing that needs to be done is to show that for a typical die $A$ the numbers $g_A(j)$ are reasonably uniform in a range of width around $\sqrt n$, and more importantly that the numbers $g_A(j + n/2) - g_A(j)$ are not too constant: basically I’d like them to be pretty uniform too.
It’s possible that we might want to try a slightly different approach, which is to take the uniform distribution on the set of points $(g_A(j), j)$, convolve it once with itself, and argue that the resulting probability distribution is reasonably uniform in a rectangle of width around $\sqrt n$ and height around $n$. By that I mean that a significant proportion of the points are hit around $\sqrt n$ times each (because there are $n^2$ sums and they lie in a rectangle of area around $n^{3/2}$). But one way or another, I feel pretty confident that we will be able to bound this Fourier transform and get the local central limit theorem we need.
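As a rough illustration of this convolution idea, here is a simulation with synthetic data: the function `g` below is a random stand-in for $g_A$ (it is not computed from an actual die), with values spread over a window of width about $\sqrt n$. The $n^2$ pairwise sums then land in a rectangle of area about $n^{3/2}$, and one can check that points hit more than once account for most of the sums, in line with the area count above.

```python
import random
from collections import Counter

# Synthetic stand-in for g_A (an assumption for illustration, not a real
# die): n values in a window of width about sqrt(n).
random.seed(0)
n = 400
width = int(n ** 0.5)                      # about sqrt(n)
g = {j: random.randint(-width, width) for j in range(1, n + 1)}

# All n^2 pairwise sums of the points (g(j), j).
hits = Counter()
for j in range(1, n + 1):
    for k in range(1, n + 1):
        hits[(g[j] + g[k], j + k)] += 1

# The sums lie in a rectangle of area about n^(3/2) (width about
# 4*sqrt(n), height about 2n), so the average multiplicity is on the
# order of sqrt(n); in particular, multiple hits dominate.
total = sum(hits.values())
assert total == n * n
multiple = sum(c for c in hits.values() if c >= 2)
assert multiple > total // 2
```

The final assertion is in fact guaranteed by pigeonhole here: there are $n^2 = 160{,}000$ sums but at most $81 \times 799 < 65{,}000$ available points, so singly-hit points cannot account for half the sums.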