Intransitive dice VII — aiming for further results
Gowers's Weblog 2018-03-10
While Polymath13 has (barring a mistake that we have not noticed) led to an interesting and clearly publishable result, there are some obvious follow-up questions that we would be wrong not to try to answer before finishing the project, especially as some of them seem to be either essentially solved or promisingly close to a solution. The ones I myself have focused on are the following.
- Is it true that if two random elements $A$ and $B$ of $[n]^n$ are chosen, then $A$ beats $B$ with very high probability if it has a sum that is significantly larger? (Here “significantly larger” should mean larger by $f(n)n$ for some function $f(n)$ that tends to infinity — note that the standard deviation of the sum has order $n^{3/2}$, so the idea is that this condition should be satisfied one way or the other with probability $1-o(1)$.)
- Is it true that the stronger conjecture, which is equivalent (given what we now know) to the statement that for almost all pairs $(A,B)$ of random dice, the event that $A$ beats a random die $C$ has almost no correlation with the event that $B$ beats $C$, is false?
- Can the proof of the result obtained so far be modified to show a similar result for the multisets model?
The status of these three questions, as I see it, is as follows: the first is basically solved — I shall try to justify this claim later in the post; for the second there is a promising approach that I think will lead to a solution — again I shall try to back up this assertion; and while the third feels as though it shouldn’t be impossibly difficult, we have so far made very little progress on it, apart from experimental evidence that suggests that all the results should be similar to those for the balanced-sequences model. [Added after finishing the post: I may possibly have made significant progress on the third question as a result of writing this post, but I haven’t checked carefully.]
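Since the paragraph above appeals to experimental evidence, here is a minimal simulation sketch (my own illustration rather than the project’s code; the function names and parameter choices are mine) that estimates the proportion of intransitive triples in the balanced-sequences model and in the multisets model. If the conjectures are right, both fractions should be roughly a quarter, but this is only a sanity check.

```python
# Minimal sketch: estimate how often a random triple of dice is intransitive,
# in the balanced-sequences model and in the multisets model.
import random
from itertools import product

def beats(A, B):
    """Return +1 if A beats B, -1 if B beats A, 0 for a tie."""
    score = sum((a > b) - (a < b) for a, b in product(A, B))
    return (score > 0) - (score < 0)

def balanced_sequence_die(n):
    """Uniform element of [n]^n conditioned on having sum n(n+1)/2 (rejection sampling)."""
    target = n * (n + 1) // 2
    while True:
        A = [random.randint(1, n) for _ in range(n)]
        if sum(A) == target:
            return A

def multiset_die(n):
    """Uniform multiset of size n from [n], via the bijection with n-subsets of [2n-1]."""
    b = sorted(random.sample(range(1, 2 * n), n))
    return [b[i] - i for i in range(n)]

def intransitive_fraction(sampler, n, trials):
    cyclic = 0
    for _ in range(trials):
        A, B, C = sampler(n), sampler(n), sampler(n)
        r1, r2, r3 = beats(A, B), beats(B, C), beats(C, A)
        if r1 == r2 == r3 != 0:          # a 3-cycle in either direction
            cyclic += 1
    return cyclic / trials

if __name__ == "__main__":
    n, trials = 50, 200   # small values: rejection sampling for balanced sequences is slow
    print("balanced sequences:", intransitive_fraction(balanced_sequence_die, n, trials))
    print("multisets:         ", intransitive_fraction(multiset_die, n, trials))
```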
The strength of a die depends strongly on the sum of its faces.
Let $A=(a_1,\dots,a_n)$ and $B=(b_1,\dots,b_n)$ be elements of $[n]^n$ chosen uniformly and independently at random. I shall now show that the average of $\#\{(i,j):a_i>b_j\}-\#\{(i,j):a_i<b_j\}-2\bigl(\sum_i a_i-\sum_j b_j\bigr)$ is zero, and that the probability that this quantity differs from its average by substantially more than $n\log n$ is very small. Since typically the modulus of $\sum_i a_i-\sum_j b_j$ has order $n^{3/2}$, it follows that whether or not $A$ beats $B$ is almost always determined by which has the bigger sum.
As in the proof of the main theorem, it is convenient to define the functions $f_A(j)=\#\{i:a_i<j\}+\tfrac12\#\{i:a_i=j\}$ and $g_A(j)=f_A(j)-(j-\tfrac12)$. Then $\#\{(i,j):a_i>b_j\}-\#\{(i,j):a_i<b_j\}=n^2-2\sum_j f_A(b_j)$, from which it follows that $A$ beats $B$ if and only if $\sum_j f_A(b_j)<n^2/2$. Note also that $\sum_j g_A(b_j)=\sum_j f_A(b_j)-\sum_j b_j+n/2$.
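For concreteness, here is a small code sketch (mine, not from the post; the function names are invented) of these definitions, together with a check that the criterion $\sum_j f_A(b_j)<n^2/2$ agrees with the direct count of winning pairs.

```python
# The functions f_A and g_A as defined above, plus a check that
# "A beats B iff sum_j f_A(b_j) < n^2/2" matches the direct pairwise count.
import random

def f(A, j):
    """f_A(j) = #{i : a_i < j} + (1/2) #{i : a_i = j}."""
    return sum(a < j for a in A) + 0.5 * sum(a == j for a in A)

def g(A, j):
    """g_A(j) = f_A(j) - (j - 1/2)."""
    return f(A, j) - (j - 0.5)

def beats_directly(A, B):
    """A beats B iff more pairs (i, j) have a_i > b_j than a_i < b_j."""
    wins = sum(a > b for a in A for b in B)
    losses = sum(a < b for a in A for b in B)
    return wins > losses

def beats_via_f(A, B):
    """The criterion from the text: A beats B iff sum_j f_A(b_j) < n^2 / 2."""
    n = len(A)
    return sum(f(A, b) for b in B) < n * n / 2

if __name__ == "__main__":
    n = 30
    for _ in range(100):
        A = [random.randint(1, n) for _ in range(n)]
        B = [random.randint(1, n) for _ in range(n)]
        assert beats_directly(A, B) == beats_via_f(A, B)
```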
If we choose $A$ purely at random from $[n]^n$, then the expectation of $g_A(j)$ is $0$ for every $j$, and Chernoff’s bounds imply that the probability that there exists $j$ with $|g_A(j)|\geq C\sqrt{n\log n}$ is, for a suitable constant $C$, at most $n^{-10}$ (say). Let us now fix some $A$ for which there is no such $j$, but keep $B$ as a purely random element of $[n]^n$.
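To spell out the Chernoff step (a standard computation; the constant $C$ and the exponent $10$ are illustrative choices rather than optimal ones): for a fixed $j$ we have $f_A(j)=\sum_{i=1}^n X_i$ with $X_i=\mathbf{1}[a_i<j]+\tfrac12\mathbf{1}[a_i=j]$, so the $X_i$ are i.i.d., take values in $[0,1]$, and have mean $\frac{j-1}{n}+\frac{1}{2n}=\frac{j-1/2}{n}$; hence $\mathbb{E}f_A(j)=j-\tfrac12$ and $\mathbb{E}g_A(j)=0$. Hoeffding’s inequality gives $\Pr[|g_A(j)|\geq t]\leq 2e^{-2t^2/n}$, so taking $t=C\sqrt{n\log n}$ and applying a union bound over the $n$ values of $j$ bounds the failure probability by $2n^{1-2C^2}$, which is at most $n^{-10}$ once $C\geq 3$ (say) and $n\geq 2$.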
Then $\sum_j g_A(b_j)$ is a sum of $n$ independent random variables, each with modulus at most $C\sqrt{n\log n}$. The expectation of this sum is $\sum_{k=1}^n g_A(k)$. But $\sum_{k=1}^n g_A(k)=\sum_{k=1}^n f_A(k)-\tfrac{n^2}2=n^2+\tfrac n2-\sum_i a_i-\tfrac{n^2}2$ (since $\sum_{k=1}^n f_A(k)=\sum_i(n-a_i)+\tfrac n2$ and $\sum_{k=1}^n(k-\tfrac12)=\tfrac{n^2}2$), so the expectation of $\sum_j g_A(b_j)$ is $\tfrac{n^2}2+\tfrac n2-\sum_i a_i$.
By standard probabilistic estimates for sums of independent random variables, with probability at least $1-n^{-10}$ the difference between $\sum_j g_A(b_j)$ and its expectation $\tfrac{n^2}2+\tfrac n2-\sum_i a_i$ is at most $Cn\log n$. Writing this out, we have $\bigl|\sum_j f_A(b_j)-\sum_j b_j+\tfrac n2-\tfrac{n^2}2-\tfrac n2+\sum_i a_i\bigr|\leq Cn\log n$, which works out as $\bigl|\sum_j f_A(b_j)-\tfrac{n^2}2-\bigl(\sum_j b_j-\sum_i a_i\bigr)\bigr|\leq Cn\log n$.
Therefore, if $\sum_i a_i-\sum_j b_j\geq 2Cn\log n$, it follows that with high probability $\sum_j f_A(b_j)<n^2/2$, which implies that $A$ beats $B$, and if $\sum_j b_j-\sum_i a_i\geq 2Cn\log n$, then with high probability $B$ beats $A$. But one or other of these two cases almost always happens, since the standard deviations of $\sum_i a_i$ and $\sum_j b_j$ are of order $n^{3/2}$. So almost always the die that wins is the one with the bigger sum, as claimed. And since “has a bigger sum than” is a transitive relation, we get transitivity almost all the time.
Why the strong conjecture looks false
As I mentioned, the experimental evidence seems to suggest that the strong conjecture is false. But there is also the outline of an argument that points in the same direction. I’m going to be very sketchy about it, and I don’t expect all the details to be straightforward. (In particular, it looks to me as though the argument will be harder than the argument in the previous section.)
The basic idea comes from a comment of Thomas Budzinski. It is to base a proof on the following structure.
- With probability bounded away from zero, two random dice $A$ and $B$ are “close”.
- If $A$ and $B$ are two fixed dice that are close to each other and $C$ is random, then the events “$A$ beats $C$” and “$B$ beats $C$” are positively correlated.
Here is how I would imagine going about defining “close”. First of all, note that the function $g_A$ is somewhat like a random walk that is constrained to start and end at zero. There are results that show that such random walks have a positive probability of never deviating very far from the origin — at most half a standard deviation, say — so something like the following idea ought to work for proving the first step (remaining agnostic for the time being about the precise definition of “close”). We choose some fixed positive integer $m$ and let $x_1<\dots<x_m$ be integers evenly spread through the interval $[n]$. Then we argue — and this should be very straightforward — that with probability bounded away from zero, the values of $g_A$ and $g_B$ at each of the points $x_1,\dots,x_m$ are close to each other, where here I mean that the difference is at most some small (but fixed) fraction of a standard deviation.
If that holds, it should also be the case, since the intervals between consecutive points $x_i$ and $x_{i+1}$ are short, that $g_A$ and $g_B$ are uniformly close with positive probability.
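Here is a rough sketch (mine; the checkpoint count $m$, the fraction of a standard deviation, and the sample sizes are arbitrary choices) of how one might estimate the probability that two random balanced-sequence dice are close at evenly spaced checkpoints in the sense just described.

```python
# Estimate the probability that g_A and g_B agree to within half of sqrt(n)
# (sqrt(n) being the order of a standard deviation) at m evenly spaced checkpoints.
import math
import random

def balanced_die(n):
    """Uniform element of [n]^n conditioned on having sum n(n+1)/2 (rejection sampling)."""
    target = n * (n + 1) // 2
    while True:
        A = [random.randint(1, n) for _ in range(n)]
        if sum(A) == target:
            return A

def g(A, j):
    """g_A(j) = #{i : a_i < j} + (1/2)#{i : a_i = j} - (j - 1/2)."""
    return sum(a < j for a in A) + 0.5 * sum(a == j for a in A) - (j - 0.5)

def close_at_checkpoints(A, B, m, fraction):
    """True if |g_A - g_B| <= fraction * sqrt(n) at m evenly spaced points of [n]."""
    n = len(A)
    checkpoints = [round((k + 1) * n / (m + 1)) for k in range(m)]
    return all(abs(g(A, x) - g(B, x)) <= fraction * math.sqrt(n) for x in checkpoints)

if __name__ == "__main__":
    n, m, trials = 50, 10, 300
    hits = sum(close_at_checkpoints(balanced_die(n), balanced_die(n), m, 0.5)
               for _ in range(trials))
    print("estimated probability of closeness at all checkpoints:", hits / trials)
```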
I’m not quite sure whether proving the second part would require the local central limit theorem in the paper or whether it would be an easier argument that could just use the fact that since $g_A$ and $g_B$ are close, the sums $\sum_j g_A(c_j)$ and $\sum_j g_B(c_j)$ (where $C=(c_1,\dots,c_n)$ is the random die) are almost certainly close too. Thomas Budzinski sketches an argument of the first kind, and my guess is that that is indeed needed. But either way, I think it ought to be possible to prove something like this.
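And here is the corresponding sketch for the second step (again my own, with arbitrary sample sizes): fix two balanced-sequence dice $A$ and $B$, and estimate over random balanced $C$ the covariance of the indicator events “$A$ beats $C$” and “$B$ beats $C$”. The strong conjecture would require this covariance to be close to zero for almost all pairs, whereas the two-step structure above suggests that it is noticeably positive for a non-negligible proportion of pairs.

```python
# Estimate Cov(1[A beats C], 1[B beats C]) over random balanced dice C,
# for one fixed random pair (A, B).
import random

def balanced_die(n):
    """Uniform element of [n]^n conditioned on having sum n(n+1)/2 (rejection sampling)."""
    target = n * (n + 1) // 2
    while True:
        A = [random.randint(1, n) for _ in range(n)]
        if sum(A) == target:
            return A

def beats(X, Y):
    """True if X beats Y: more pairs with x_i > y_j than with x_i < y_j."""
    wins = sum(x > y for x in X for y in Y)
    losses = sum(x < y for x in X for y in Y)
    return wins > losses

def covariance_of_beating(A, B, n, trials):
    a = b = ab = 0
    for _ in range(trials):
        C = balanced_die(n)
        ea, eb = beats(A, C), beats(B, C)
        a += ea
        b += eb
        ab += ea and eb
    return ab / trials - (a / trials) * (b / trials)

if __name__ == "__main__":
    n, trials = 50, 500
    A, B = balanced_die(n), balanced_die(n)
    print("estimated covariance for one random pair (A, B):",
          covariance_of_beating(A, B, n, trials))
```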
What about the multisets model?
We haven’t thought about this too hard, but there is a very general approach that looks to me promising. However, it depends on something happening that should be either quite easy to establish or not true, and at the moment I haven’t worked out which, and as far as I know neither has anyone else.
The difficulty is that while we still know in the multisets model that $A$ beats $B$ if and only if $\sum_j f_A(b_j)<n^2/2$ (since this depends just on the dice and not on the model that is used to generate them randomly), it is less easy to get traction on the sum $\sum_j f_A(b_j)$ because it isn’t obvious how to express it as a sum of independent random variables.
Of course, we had that difficulty with the balanced-sequences model too, but there we got round the problem by considering purely random sequences and conditioning on their sum, having established that certain events held with sufficiently high probability for the conditioning not to stop them holding with high probability.
But with the multisets model, there isn’t an obvious way to obtain the distribution over random dice by choosing $a_1,\dots,a_n$ independently (according to some distribution) and conditioning on some suitable event. (A quick thought here is that it would be enough if we could approximate the distribution of $(a_1,\dots,a_n)$ in such a way, provided the approximation was good enough. The obvious distribution to take on each $a_i$ is the marginal distribution of that $a_i$ in the multisets model, and the obvious conditioning would then be on the sum, but it is far from clear to me whether that works.)
A somewhat different approach that I have not got far with myself is to use the standard one-to-one correspondence between increasing sequences of length $n$ taken from $[n]$ and subsets of $[2n-1]$ of size $n$. (Given such a sequence $(a_1,\dots,a_n)$ one takes the subset $\{a_1,a_2+1,\dots,a_n+n-1\}$, and given a subset $\{b_1,\dots,b_n\}$ of $[2n-1]$, where the $b_i$ are written in increasing order, one takes the multiset of all values $b_i-i+1$, with multiplicity.) Somehow a subset of $[2n-1]$ of size $n$ feels closer to a bunch of independent random variables. For example, we could model it by choosing each element of $[2n-1]$ independently with probability $1/2$ and conditioning on the number of elements chosen being exactly $n$, which will happen with non-tiny probability.
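To make the correspondence concrete, here is a short sketch (mine; the helper names are invented) of the bijection and of the proposed sampling scheme: independent fair coin flips over $[2n-1]$, conditioned on exactly $n$ elements being chosen. In practice one would just sample an $n$-subset directly; the point of the coin-flip version is only that it exhibits the multiset as conditioned independent choices.

```python
# Multisets of size n from [n] correspond to n-element subsets of [2n-1];
# a uniform n-subset can be generated by fair coin flips conditioned on the count.
import random

def multiset_to_subset(A):
    """Non-decreasing sequence (a_1,...,a_n) from [n]  ->  n-subset of [2n-1]."""
    A = sorted(A)
    return {a + i for i, a in enumerate(A)}          # a_i + (i-1) in 1-indexed notation

def subset_to_multiset(S):
    """n-subset of [2n-1]  ->  multiset of size n from [n] (as a sorted list)."""
    b = sorted(S)
    return [b_i - i for i, b_i in enumerate(b)]      # b_i - (i-1) in 1-indexed notation

def random_multiset_die(n):
    """Uniform multiset die: flip a fair coin for each element of [2n-1] and condition
    on exactly n elements being chosen (probability of order n^{-1/2})."""
    while True:
        S = {x for x in range(1, 2 * n) if random.random() < 0.5}
        if len(S) == n:
            return subset_to_multiset(S)

if __name__ == "__main__":
    n = 6
    A = random_multiset_die(n)
    S = multiset_to_subset(A)
    print(A, "<->", sorted(S))
    assert subset_to_multiset(S) == A
    assert all(1 <= a <= n for a in A) and len(A) == n
```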
Actually, now that I’m writing this, I’m coming to think that I may have accidentally got closer to a solution. The reason is that earlier I was using a holes-and-pegs approach to defining the bijection between multisets and subsets, whereas with this approach, which I had wrongly assumed was essentially the same, there is a nice correspondence between the elements of the multiset and the elements of the set. So I suddenly feel more optimistic that the approach for balanced sequences can be adapted to the multisets model.
I’ll end this post on that optimistic note: no doubt it won’t be long before I run up against some harsh reality.