An abridged proof of Marton’s conjecture
What's new 2024-06-22
[This post is dedicated to Luca Trevisan, who recently passed away due to cancer. Though far from his most significant contribution to the field, I would like to mention that, as with most of my other blog posts on this site, this page was written with the assistance of Luca’s LaTeX to WordPress converter. Mathematically, his work and insight on pseudorandomness in particular have greatly informed how I myself think about the concept. – T.]
Recently, Timothy Gowers, Ben Green, Freddie Manners, and I were able to establish the following theorem:
Theorem 1 (Marton’s conjecture) Let $A \subset {\bf F}_2^n$ be non-empty with $|A+A| \leq K|A|$. Then there exists a subgroup $H$ of ${\bf F}_2^n$ with $|H| \leq |A|$ such that $A$ is covered by at most $2K^C$ translates of $H$, for some absolute constant $C$.
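For instance (a trivial sanity check of the statement, not needed later): if $A$ is itself a subgroup of ${\bf F}_2^n$, then $|A+A| = |A|$, so one can take $K=1$, $H = A$, and cover $A$ by a single translate of $H$; the content of the theorem is that for general $K$ the covering number degrades only polynomially in $K$.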
We established this result with $C=12$, although it has since been improved to $C=9$ by Jyun-Jie Liao.
Our proof was written in order to optimize the constant $C$ as much as possible; similarly for the more detailed blueprint of the proof that was prepared in order to formalize the result in Lean. I have been asked a few times whether it is possible to present a streamlined and more conceptual version of the proof in which one does not try to establish an explicit constant $C$, but just to show that the result holds for some constant $C$. This is what I will attempt to do in this post, though some of the more routine steps will be outsourced to the aforementioned blueprint.
The key concept here is that of the entropic Ruzsa distance $d[X;Y]$ between two random variables $X, Y$ taking values in ${\bf F}_2^n$, defined as

$$ d[X;Y] := {\bf H}[X'+Y'] - \frac{1}{2} {\bf H}[X'] - \frac{1}{2} {\bf H}[Y'],$$

where $X', Y'$ are independent copies of $X, Y$ respectively, and ${\bf H}$ denotes Shannon entropy.
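For example (an easy computation directly from the definition): if $X = Y = U_H$ is the uniform distribution on a subgroup $H$ of ${\bf F}_2^n$, then the sum of two independent copies of $U_H$ is again uniform on $H$, so

$$ d[U_H; U_H] = \log |H| - \frac{1}{2} \log |H| - \frac{1}{2} \log |H| = 0.$$

Thus uniform distributions on subgroups are at zero entropic distance from themselves; the next theorem asserts a converse of sorts, namely that small $d[X;X]$ forces $X$ to be close to such a uniform distribution.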
Theorem 2 (Entropic Marton’s conjecture) Let $X$ be an ${\bf F}_2^n$-valued random variable with $d[X;X] \leq \log K$. Then there exists a uniform random variable $U_H$ on a subgroup $H$ of ${\bf F}_2^n$ such that $d[X; U_H] \leq C \log K$ for some absolute constant $C$.
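To see how the hypothesis here relates to that of Theorem 1 (a standard observation, recorded here for convenience): if $X$ is uniform on a set $A$ with $|A+A| \leq K|A|$, and $X', Y'$ are independent copies of $X$, then $X'+Y'$ is supported on $A+A$, so

$$ {\bf H}[X'+Y'] \leq \log |A+A| \leq \log K + \log |A| = \log K + {\bf H}[X],$$

and hence $d[X;X] \leq \log K$, so Theorem 2 applies to such an $X$.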
We were able to establish Theorem 2 with $C=11$, which implies Theorem 1 with $C=12$ by fairly standard additive combinatorics manipulations; see the blueprint for details.
The key proposition needed to establish Theorem 2 is the following distance decrement property:
Proposition 3 (Distance decrement) If $X, Y$ are ${\bf F}_2^n$-valued random variables, then one can find ${\bf F}_2^n$-valued random variables $X', Y'$ such that

$$ d[X';Y'] \leq (1-\eta) d[X;Y]$$

and

$$ d[X;X'], d[Y;Y'] \leq C d[X;Y]$$

for some absolute constants $\eta, C > 0$.
Indeed, suppose this proposition held. Starting with $X_0$ and $Y_0$ both equal to $X$ and iterating, one can then find sequences of random variables $X_n, Y_n$ with

$$ d[X_n;Y_n] \leq (1-\eta)^n d[X;X]$$

and

$$ d[X_{n+1};X_n], d[Y_{n+1};Y_n] \leq C (1-\eta)^n d[X;X].$$

One can then check (see the blueprint) that the $X_n, Y_n$ converge in a suitable sense to limiting random variables $X_\infty, Y_\infty$ with $d[X_\infty;Y_\infty] = 0$ and $d[X;X_\infty] \ll d[X;X]$, at which point Theorem 2 follows from the “100% inverse theorem” for entropic Ruzsa distance, which asserts that random variables at zero distance from each other are translates of a uniform distribution $U_H$ on a subgroup $H$.
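To spell out one small step here: the triangle inequality for entropic Ruzsa distance (see the blueprint) and summing the geometric series gives, for every $n$,

$$ d[X_n; X] \leq \sum_{k=0}^{n-1} d[X_{k+1};X_k] \leq \sum_{k=0}^{\infty} C(1-\eta)^k d[X;X] = \frac{C}{\eta} d[X;X],$$

and similarly for the $Y_n$, so the entire iteration stays within bounded entropic distance of $X$; this is what allows the limiting random variable $X_\infty$ to be compared back to $X$.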
To prove Proposition 3, we can reformulate it as follows:
Proposition 4 (Lack of distance decrement implies vanishing) If $X, Y$ are ${\bf F}_2^n$-valued random variables, with the property that

$$ d[X';Y'] > d[X;Y] - \eta ( d[X;X'] + d[Y;Y'] )$$

for all ${\bf F}_2^n$-valued random variables $X', Y'$ and some sufficiently small absolute constant $\eta > 0$, then one can derive a contradiction.
Indeed, we may assume from the above proposition that

$$ d[X';Y'] \geq d[X;Y] - \eta ( d[X;X'] + d[Y;Y'] ) \ \ \ \ \ (1)$$

for all ${\bf F}_2^n$-valued random variables $X', Y'$.
The entire game is now to use Shannon entropy inequalities and “entropic Ruzsa calculus” to deduce a contradiction from (1) for $\eta$ small enough. This we will do below the fold, but before doing so, let us first make some adjustments to (1) that will make it more useful for our purposes. Firstly, because conditional entropic Ruzsa distance (see blueprint for definitions) is an average of unconditional entropic Ruzsa distances, we can automatically upgrade (1) to the conditional version

$$ d[X'|Z; Y'|W] \geq d[X;Y] - \eta ( d[X;X'|Z] + d[Y;Y'|W] ) \ \ \ \ \ (2)$$

for all ${\bf F}_2^n$-valued random variables $X', Y'$ and arbitrary auxiliary random variables $Z, W$.
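In slightly more detail (a routine verification, with notation as in the blueprint): by definition of conditional entropic Ruzsa distance,

$$ d[X'|Z; Y'|W] = \sum_{z,w} {\bf P}(Z=z)\, {\bf P}(W=w)\, d[(X'|Z=z); (Y'|W=w)],$$

and applying (1) to each conditioned pair $(X'|Z=z)$, $(Y'|W=w)$ and then averaging in $z$ and $w$ yields (2), since the right-hand side of (1) averages to the right-hand side of (2).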
— 1. Main argument —
Now we derive more and more consequences of (2) – at some point crucially using the hypothesis that we are in characteristic two – before we reach a contradiction.
Right now, our hypothesis (2) only supplies lower bounds on entropic distances. The crucial ingredient that allows us to proceed is what we call the fibring identity, which lets us convert these lower bounds into useful upper bounds as well, which in fact match up very nicely when $\eta$ is small. Informally, the fibring identity captures the intuitive fact that the doubling constant of a set $A$ should be at least as large as the doubling constant of the image $\pi(A)$ of that set under a homomorphism $\pi$, times the doubling constant of a typical fiber $A \cap \pi^{-1}(\{z\})$ of that homomorphism; and furthermore, one should only be close to equality if the fibers “line up” in some sense.
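For instance (a toy example to illustrate the informal statement, not needed later): if $A = B \times C \subset G_1 \times G_2$ and $\pi: G_1 \times G_2 \rightarrow G_1$ is the projection, then $\pi(A) = B$ and every fiber is a copy of $C$, and

$$ \frac{|A+A|}{|A|} = \frac{|B+B|}{|B|} \cdot \frac{|C+C|}{|C|},$$

so equality holds in this case precisely because all the fibers are perfectly aligned.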
Here is the fibring identity:
Proposition 5 (Fibring identity) Let $\pi: G \rightarrow H$ be a homomorphism. Then for any independent $G$-valued random variables $X, Y$, one has

$$ d[X;Y] = d[\pi(X);\pi(Y)] + d[X|\pi(X); Y|\pi(Y)] + {\bf I}[X - Y : (\pi(X), \pi(Y)) | \pi(X) - \pi(Y)],$$

where ${\bf I}$ denotes conditional mutual information.
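As quick sanity checks (degenerate cases, not part of the argument): if $\pi$ is the zero map, then the first and third terms on the right-hand side vanish and the identity collapses to $d[X;Y] = d[X;Y]$; if $\pi$ is the identity map, then conditioning on $\pi(X) = X$ makes the second term vanish, and ${\bf I}[X-Y : (X,Y) | X-Y] = 0$ since the first argument is already part of the conditioning, so the identity again reduces to a triviality.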
The proof is of course in the blueprint, but given that it is a central pillar of the argument, I reproduce it here.
Proof: Expanding out the definition of Ruzsa distance, and using the conditional entropy chain rule

$$ {\bf H}[X] = {\bf H}[\pi(X)] + {\bf H}[X|\pi(X)]$$

(together with its analogues for $Y$ and for $X - Y$, noting that $\pi(X)-\pi(Y) = \pi(X-Y)$ is a function of $X-Y$), we see that it suffices to establish the identity

$$ {\bf H}[X-Y|\pi(X)-\pi(Y)] = {\bf H}[X-Y|\pi(X),\pi(Y)] + {\bf I}[X - Y : (\pi(X), \pi(Y)) | \pi(X) - \pi(Y)].$$

But by definition of conditional mutual information,

$$ {\bf I}[X - Y : (\pi(X), \pi(Y)) | \pi(X) - \pi(Y)] = {\bf H}[X-Y|\pi(X)-\pi(Y)] - {\bf H}[X-Y|\pi(X),\pi(Y),\pi(X)-\pi(Y)],$$

and the final conditioning variable $\pi(X)-\pi(Y)$ is redundant, being a function of $(\pi(X),\pi(Y))$; the claim follows. $\Box$
We will only care about the characteristic $2$ setting here, so we will now assume that all groups involved are $2$-torsion, so that we can replace all subtractions with additions. If we specialize the fibring identity to the case where $G = {\bf F}_2^n \times {\bf F}_2^n$, $H = {\bf F}_2^n$, $\pi$ is the addition map $\pi(x,y) := x+y$, and $X = (X_1,X_2)$, $Y = (Y_1,Y_2)$ are pairs of independent random variables in ${\bf F}_2^n$, we obtain the following corollary:
Corollary 6 Let $X_1, X_2, Y_1, Y_2$ be independent ${\bf F}_2^n$-valued random variables. Then we have the identity

$$ d[X_1;Y_1] + d[X_2;Y_2] = d[X_1+X_2; Y_1+Y_2] + d[X_1|X_1+X_2; Y_1|Y_1+Y_2] + {\bf I}[X_1+Y_1 : X_1+X_2 | X_1+X_2+Y_1+Y_2].$$
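To spell out the bookkeeping behind this specialization (a routine verification): for $X = (X_1,X_2)$ and $Y = (Y_1,Y_2)$ one has $d[X;Y] = d[X_1;Y_1] + d[X_2;Y_2]$ by independence, while $X+Y = (X_1+Y_1, X_2+Y_2)$ and $\pi(X)+\pi(Y) = X_1+X_2+Y_1+Y_2$. Once one conditions on the total sum $X_1+X_2+Y_1+Y_2$, the pair $(X_1+Y_1, X_2+Y_2)$ carries the same information as $X_1+Y_1$, and the pair $(X_1+X_2, Y_1+Y_2)$ carries the same information as $X_1+X_2$, which is how the mutual information term of Proposition 5 simplifies to the one displayed above; similarly, given $X_1+X_2$, the pair $(X_1,X_2)$ is determined by $X_1$, so the conditioned distance reduces to $d[X_1|X_1+X_2; Y_1|Y_1+Y_2]$.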
This is a useful and flexible identity, especially when combined with (2). For instance, we can discard the conditional mutual information term as being non-negative, to obtain the inequality

$$ d[X_1;Y_1] + d[X_2;Y_2] \geq d[X_1+X_2; Y_1+Y_2] + d[X_1|X_1+X_2; Y_1|Y_1+Y_2].$$
Now we use the fibring identity again, relabeling $X_2, Y_2$ as $Y_2, X_2$, and requiring $X_1, X_2, Y_1, Y_2$ to be independent copies of $X, X, Y, Y$ respectively. We conclude that

$$ 2 d[X;Y] \geq d[X_1+Y_2; Y_1+X_2] + d[X_1|X_1+Y_2; Y_1|Y_1+X_2].$$
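To indicate how such inequalities are actually used in conjunction with (2) (a sketch only; the supporting entropic Ruzsa calculus estimates are in the blueprint): with $X_1, X_2, Y_1, Y_2$ copies of $X, X, Y, Y$, the left-hand side of Corollary 6 is exactly $2d[X;Y]$, while (2) bounds each of the two distance terms on the right-hand side from below by $d[X;Y]$ minus $\eta$ times error terms that Ruzsa calculus controls by $O(d[X;Y])$. Comparing, the discarded conditional mutual information term must in fact be small:

$$ {\bf I}[X_1+Y_1 : X_1+X_2 | X_1+X_2+Y_1+Y_2] \ll \eta\, d[X;Y].$$

Thus, conditionally on the total sum, various sums of the $X_i, Y_j$ are forced to be nearly independent of each other, and it is by playing several such near-independence statements against one another (crucially using the characteristic $2$ hypothesis) that one eventually reaches the desired contradiction.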
Remark 7 A similar argument works in the $m$-torsion case for general $m$. Instead of decrementing the entropic Ruzsa distance, one instead decrements a “multidistance”

$$ D[X_1,\dots,X_m] := {\bf H}[X_1 + \dots + X_m] - \frac{1}{m} \sum_{i=1}^m {\bf H}[X_i]$$

for independent $X_1,\dots,X_m$. By an iterated version of the fibring identity, one can first reduce again to the symmetric case where the random variables are all copies of the same variable $X$. If one then takes $X_{i,j}$, $1 \leq i,j \leq m$, to be an array of $m^2$ copies of $X$, one can get to the point where the row sums $\sum_j X_{i,j}$ and the column sums $\sum_i X_{i,j}$ have small conditional mutual information with respect to the double sum $S := \sum_{i,j} X_{i,j}$. If we then set $V := \sum_{i,j} i X_{i,j}$ and $W := \sum_{i,j} j X_{i,j}$, the data processing inequality again shows that $V$ and $W$ are nearly independent given $S$. The $m$-torsion now crucially intervenes as before to ensure that $V+W$ has the same form as $V$ or $W$, leading to a contradiction as before.
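To make the role of torsion slightly more concrete (an illustration in the notation just introduced): one has

$$ V + W = \sum_{i,j} (i+j) X_{i,j},$$

and in an $m$-torsion group the coefficient $i+j$ only matters modulo $m$, so $V+W$ is again a fixed integer linear combination of the diagonal sums $\sum_{i+j \equiv k\, (m)} X_{i,j}$, an object of the same shape as the combinations of row sums and column sums defining $V$ and $W$; without the torsion hypothesis, the coefficients $i+j$ would grow beyond the range $1, \dots, m$ and no such identification would be available.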