A quasirandomness implication

Gowers's Weblog 2019-08-20

This is a bit of a niche post, since its target audience is people who are familiar with quasirandom graphs and like proofs of an analytic flavour. Very roughly, a quasirandom graph is one that behaves like a random graph of the same density. It turns out that there are many ways that one can interpret the phrase “behaves like a random graph” and, less obviously, that they are all in a certain sense equivalent. This realization dates back to seminal papers of Thomason, and of Chung, Graham and Wilson.

I was lecturing on the topic recently, and proving that certain of the quasirandomness properties all implied each other. In some cases, the proofs are quite a bit easier if you assume that the graph is regular, and in the past I have sometimes made my life easier by dealing just with that case. But that had the unfortunate consequence that when I lectured on Szemerédi’s regularity lemma, I couldn’t just say “Note that the condition on the regular pairs is just saying that they have quasirandomness property $n$ ” and have as a consequence all the other quasirandomness properties. So this year I was determined to deal with the general case, and determined to find clean proofs of all the implications. There is one that took me quite a bit of time, but I got there in the end. It is very likely to be out there in the literature somewhere, but I haven’t found it, so it seems suitable for a blog post. I can be sure of at least one interested reader, which is the future me when I find that I’ve forgotten the argument (except that actually I have now found quite a conceptual way of expressing it, so it’s just conceivable that it will stick around in the more accessible part of my long-term memory).

The implication in question, which I’ll state for bipartite graphs, concerns the following two properties. I’ll state them qualitatively first, and then give more precise versions. Let $G$ be a bipartite graph of density $\delta$ with (finite) vertex sets $X$ and $Y$ .

Property 1. If $A\subset X$ and $B\subset Y$ , then the number of edges between $A$ and $B$ is roughly $\delta|A||B|$ .

Property 2. The number of 4-cycles (or more precisely ordered quadruples $(x_1,y_1,x_2,y_2)$ such that $x_iy_j$ is an edge for all four choices of $i,j$ ) is roughly $\delta^4|X|^2|Y|^2$ .

A common way of expressing property 1 is to say that the density of the subgraph induced by $A$ and $B$ is approximately $\alpha$ as long as the sets $A$ and $B$ are not too small. However, the following formulation leads to considerably tidier proofs if one wants to use analytic arguments. I think of $G$ as a function, so $G(x,y)=1$ if $xy$ is an edge of the graph and 0 otherwise. Then the condition is that if $A$ has density $\alpha$ in $X$ and $B$ has density $\beta$ in $Y$ , then

$|\mathbb E_{x,y}G(x,y)A(x)B(y)-\delta\alpha\beta|\leq c_1$

This condition is interesting when $c_1$ is small. It might seem more natural to write $c_1\alpha\beta$ on the right-hand side, but then one has to add some condition such as that $\alpha,\beta\geq c_1$ in order to obtain a condition that follows from the other conditions. If one simply leaves the right-hand side as $c_1$ (which may depend on $\delta$ ), then one obtains a condition that automatically gives a non-trivial statement when $A$ and $B$ are large and a trivial one when they are small.

As for Property 2, the most natural analytic way of expressing it is the inequality

$\displaystyle \mathop{\mathbb E}_{x_1,x_2\in X}\mathop{\mathbb E}_{y_1,y_2\in Y}G(x_1,y_1)G(x_1,y_2)G(x_2,y_1)G(x_2,y_2)\leq\delta^4(1+c_2)$

An easy Cauchy-Schwarz argument proves a lower bound of $\delta^4$ , so this does indeed imply that the number of labelled 4-cycles is approximately $\delta^4|X|^2|Y|^2$ .

The equivalence between the two statements is that if one holds for a sufficiently small $c_i$ , then the other holds for a $c_j$ that is as small as you want.

In fact, both implications are significantly easier in the regular case, but I found a satisfactory way of deducing the first from the second a few years ago and won’t repeat it here, as it requires a few lemmas and some calculation. But it is given as Theorem 5.3 in these notes, and the proof in question can be found by working backwards. (The particular point that is easier in the regular case is Corollary 5.2, because then the function $g'$ mentioned there and in Lemma 5.1 is identically zero.)

What I want to concentrate on is deducing the second property from the first. Let me first give the proof in the regular case, which is very short and sweet. We do it by proving the contrapositive. So let’s assume that every vertex in $X$ has degree $\delta Y$ , that every vertex in $Y$ has degree $\delta X$ , and that

$\displaystyle \mathop{\mathbb E}_{x_1,x_2\in X}\mathop{\mathbb E}_{y_1,y_2\in Y}G(x_1,y_1)G(x_1,y_2)G(x_2,y_1)G(x_2,y_2)>\delta^4(1+c_2)$

Since for each fixed $x_2,y_2$ , the expectation over $x_1,y_1$ on the left-hand side is zero if $G(x_2,y_2)=0$ , it follows that there exists some choice of $x_2,y_2$ such that

$\displaystyle \mathop{\mathbb E}_{x_1\in X}\mathop{\mathbb E}_{y_1\in Y}G(x_1,y_1)G(x_1,y_2)G(x_2,y_1)>\delta^3(1+c_2)$

We now set $A=\{x:G(x,y_2)=1\}$ and $B=\{y:G(x_2,y)=1\}$ . Then $A$ and $B$ both have density $\delta$ (by the regularity assumption), and the above inequality tells us that

$\mathbb E_{x_1,y_1}G(x_1,y_1)A(x_1)B(y_1)-\delta^3>c_2\delta^3$ ,

so we obtain the desired conclusion with $c_1=c_2\delta^3$ . Or to put it another way, if we assume Property 1 with constant $c_1$ , then we obtain Property 2 with $c_2=\delta^{-3}c_1$ (which in practice means that $c_1$ should be small compared with $\delta^3$ in order to obtain a useful inequality from Property 2).

The difficulty when $G$ is not regular is that $A$ and $B$ may have densities larger than $\delta$ , and then the inequality

$\mathbb E_{x_1,y_1}G(x_1,y_1)A(x_1)B(y_1)-\delta^3>c_2\delta^3$ ,

no longer gives us what we need. The way I used to get round this was what I think of as the “disgusting” approach, which goes roughly as follows. Suppose that many vertices in $X$ have degree substantially larger than $\delta|Y|$ , and let $A$ be the set of all such vertices. Then the number of edges between $A$ and $Y$ is too large, and we get the inequality we are looking for (with $B=Y$ ). We can say something similar about vertices in $Y$ , so either we are done or $G$ is at least approximately regular, and in particular almost all neighbourhoods are of density not much greater than $\delta$ . Then one runs an approximate version of the simple averaging argument above, arguing in an ugly way that the contribution to the average from “bad” choices $(x_2,y_2)$ is small enough that there must be at least one “good” choice.

To obtain a cleaner proof, I’ll begin with a non-essential step, but one that I think clarifies what is going on and shows that it’s not just some calculation that magically gives the desired answer. It is to interpret the quantity

$\displaystyle \mathop{\mathbb E}_{x_1,x_2\in X}\mathop{\mathbb E}_{y_1,y_2\in Y}G(x_1,y_1)G(x_1,y_2)G(x_2,y_1)G(x_2,y_2)$

as the probability that the four pairs $x_iy_j$ are all edges of $G$ if $x_1,x_2$ are chosen independently at random from $X$ and $y_1,y_2$ are chosen independently at random from $Y$ . Then our hypothesis is that

$\mathbb P[$ all $x_iy_j$ are edges $]-\delta^4>c_2\delta^4$

Let us now split up the left-hand side as follows. It is the sum of

$\mathbb P[$ all $x_iy_j$ are edges $]$ $-\delta\mathbb P[x_1y_2,x_2y_1,x_2y_2$ are edges $]$

and

$\delta\mathbb P[x_1y_1,x_1y_2,x_2y_1$ are edges $]$ $-\delta^2\mathbb P[x_1y_2,x_2y_1$ are edges $]$ ,

where I have used the fact that the probability that $x_1y_2$ and $x_2y_1$ are both edges is exactly $\delta^2$ .

One or other of these two terms must be at least $c_2\delta^4/2$ . If it is the first, then averaging gives us $x_2,y_2$ such that

$\mathbb E_{x_1,y_1}(G(x_1,y_1)-\delta)G(x_1,y_2)G(x_2,y_1)\geq c_2\delta^3/2$

and now we can define $A$ and $B$ just as in the regular case, obtaining the same result apart from an extra factor of 1/2.

If it is the second, then averaging over $x_1,y_1$ this time and dividing by $\delta$ gives us some $x_1,y_1$ such that the inequality

$\mathbb E_{x_2,y_2}G(x_1,y_2)G(x_2,y_1)(G(x_2,y_2)-\delta)\geq c_2\delta^3/2$ .

So again we are done, this time with the roles of 1 and 2 reversed.

I haven’t checked the details, but it is clear that one could run this argument for any subgraph $H$ that occurs the “wrong” number of times. The extra factor one would need to divide by would be the number of edges you need to remove from $H$ to obtain a matching.

Of course, the fact that Property 1 implies that all small subgraphs occur with roughly the right frequency is a very well known fact about quasirandom graphs, but it is usually proved using an almost regularity assumption, which I wanted to avoid.