The three-dimensional Kakeya conjecture, after Wang and Zahl

What's new 2025-02-26

There has been some spectacular progress in geometric measure theory: Hong Wang and Joshua Zahl have just released a preprint that resolves the three-dimensional case of the infamous Kakeya set conjecture! This conjecture asserts that a Kakeya set – a subset of {{\bf R}^3} that contains a unit line segment in every direction – must have Minkowski and Hausdorff dimension equal to three. (There is also a stronger “maximal function” version of this conjecture that remains open at present, although the methods of this paper will give some non-trivial bounds on this maximal function.) It is common to discretize this conjecture in terms of a small scale {0 < \delta < 1}. Roughly speaking, the conjecture then asserts that if one has a family {\mathbb{T}} of {\delta \times \delta \times 1} tubes of cardinality {\approx \delta^{-2}}, and pointing in a {\delta}-separated set of directions, then the union {\bigcup_{T \in \mathbb{T}} T} of these tubes should have volume {\approx 1}. Here we shall be a little vague as to what {\approx} means, but roughly one should think of this as “up to factors of the form {O_\varepsilon(\delta^{-\varepsilon})} for any {\varepsilon>0}”; in particular this notation can absorb any logarithmic losses that might arise for instance from a dyadic pigeonholing argument. For technical reasons (including the need to invoke the aforementioned dyadic pigeonholing), one actually works with slightly smaller sets {\bigcup_{T \in \mathbb{T}} Y(T)}, where {Y} is a “shading” of the tubes in {\mathbb{T}} that assigns a large subset {Y(T)} of {T} to each tube {T} in the collection; but for this discussion we shall ignore this subtlety and pretend that we can always work with the full tubes.

Previous results in this area tended to center around lower bounds of the form

\displaystyle  |\bigcup_{T \in \mathbb{T}} T| \gtrapprox \delta^{3-d} \ \ \ \ \ (1)

for various intermediate dimensions {0 < d < 3}, that one would like to make as large as possible. For instance, just from considering a single tube in this collection, one can easily establish (1) with {d=1}. By just using the fact that two lines in {{\bf R}^3} intersect in a point (or more precisely, a more quantitative estimate on the volume of the intersection of two {\delta \times \delta \times 1} tubes, based on the angle of intersection), combined with a now classical {L^2}-based argument of Córdoba, one can obtain (1) with {d=2} (and this type of argument also resolves the Kakeya conjecture in two dimensions). In 1995, building on earlier work by Bourgain, Wolff famously obtained (1) with {d=2.5} using what is now known as the “Wolff hairbrush argument”, based on considering the size of a “hairbrush” – the union of all the tubes that pass through a single tube (the hairbrush “stem”) in the collection.

In their new paper, Wang and Zahl established (1) for {d=3}. The proof is lengthy (127 pages!), and relies crucially on their previous paper establishing a key “sticky” case of the conjecture. Here, I thought I would try to summarize the high level strategy of proof, omitting many details and also oversimplifying the argument at various places for sake of exposition. The argument does use many ideas from previous literature, including some from my own papers with co-authors; but the case analysis and iterative schemes required are remarkably sophisticated and delicate, with multiple new ideas needed to close the full argument.

A natural strategy to prove (1) would be to try to induct on {d}: if we let {K(d)} represent the assertion that (1) holds for all configurations of {\approx \delta^{-2}} tubes of dimensions {\delta \times \delta \times 1}, with {\delta}-separated directions, we could try to prove some implication of the form {K(d) \implies K(d + \alpha)} for all {0 < d < 3}, where {\alpha>0} is some small positive quantity depending on {d}. Iterating this, one could hope to get {d} arbitrarily close to {3}.

A general principle with these sorts of continuous induction arguments is to first obtain the trivial implication {K(d) \implies K(d)} in a non-trivial fashion, with the hope that this non-trivial argument can somehow be perturbed or optimized to get the crucial improvement {K(d) \implies K(d+\alpha)}. The standard strategy for doing this, since the work of Bourgain and then Wolff in the 1990s (with precursors in older work of Córdoba), is to perform some sort of “induction on scales”. Here is the basic idea. Let us call the {\delta \times \delta \times 1} tubes {T} in {\mathbb{T}} “thin tubes”. We can try to group these thin tubes into “fat tubes” of dimension {\rho \times \rho \times 1} for some intermediate scale {\delta \ll \rho \ll 1}; it is not terribly important for this sketch precisely what intermediate value is chosen here, but one could for instance set {\rho = \sqrt{\delta}} if desired. Because of the {\delta}-separated nature of the directions in {\mathbb{T}}, there can only be at most {\lessapprox (\delta/\rho)^{-2}} thin tubes in a given fat tube, and so we need at least {\gtrapprox \rho^{-2}} fat tubes to cover the {\approx \delta^{-2}} thin tubes. Let us suppose for now that we are in the “sticky” case where the thin tubes stick together inside fat tubes as much as possible, so that there is in fact a collection {\mathbb{T}_\rho} of {\approx \rho^{-2}} fat tubes {T_\rho}, with each fat tube containing about {\approx (\delta/\rho)^{-2}} of the thin tubes. Let us also assume that the fat tubes {T_\rho} are {\rho}-separated in direction, an assumption that is highly consistent with the other assumptions made here.

If we already have the hypothesis {K(d)}, then by applying it at scale {\rho} instead of {\delta} we conclude a lower bound on the volume occupied by fat tubes:

\displaystyle  |\bigcup_{T_\rho \in \mathbb{T}_\rho} T_\rho| \gtrapprox \rho^{3-d}.

Since {\sum_{T_\rho \in \mathbb{T}_\rho} |T_\rho| \approx \rho^{-2} \rho^2 = 1}, this morally tells us that the typical multiplicity {\mu_{fat}} of the fat tubes is {\lessapprox \rho^{d-3}}; a typical point in {\bigcup_{T_\rho \in \mathbb{T}_\rho} T_\rho} should belong to about {\mu_{fat} \lessapprox \rho^{d-3}} fat tubes.

Now, inside each fat tube {T_\rho}, we are assuming that we have about {\approx (\delta/\rho)^{-2}} thin tubes that are {\delta}-separated in direction. If we perform a linear rescaling around the axis of the fat tube by a factor of {1/\rho} to turn it into a {1 \times 1 \times 1} tube, this would inflate the thin tubes to be rescaled tubes of dimensions {\delta/\rho \times \delta/\rho \times 1}, which would now be {\approx \delta/\rho}-separated in direction. This rescaling does not affect the multiplicity of the tubes. Applying {K(d)} again, we see morally that the multiplicity {\mu_{fine}} of the rescaled tubes, and hence the thin tubes inside {T_\rho}, should be {\lessapprox (\delta/\rho)^{d-3}}.

We now observe that the multiplicity {\mu} of the full collection {\mathbb{T}} of thin tubes should morally obey the inequality

\displaystyle  \mu \lessapprox \mu_{fat} \mu_{fine}, \ \ \ \ \ (2)

since if a given point lies in at most {\mu_{fat}} fat tubes, and within each fat tube a given point lies in at most {\mu_{fine}} thin tubes in that fat tube, then it should only be able to lie in at most {\mu_{fat} \mu_{fine}} tubes overall. This heuristically gives {\mu \lessapprox \rho^{d-3} (\delta/\rho)^{d-3} = \delta^{d-3}}; since the total volume of the thin tubes is {\approx 1}, the union then has volume {\gtrapprox \delta^{3-d}}, which recovers (1) in the sticky case.
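
The exponent bookkeeping behind this heuristic can be checked mechanically. The sketch below is only an illustration under stated assumptions: it writes {\rho = \delta^s} for a hypothetical parameter {0<s<1}, tracks every quantity as a power of {\delta}, and reads the multiplicity as total tube volume (about {1}) divided by the volume of the union, so that {K(d)} at a scale {r} corresponds to a multiplicity of at most about {r^{-(3-d)}}. The function name is invented for this example.

```python
from fractions import Fraction

def sticky_multiplicity_exponent(d, s):
    """Exponent e with mu_fat * mu_fine = delta**e, assuming rho = delta**s."""
    # K(d) at scale rho: mu_fat <~ rho**(d-3) = delta**(s*(d-3))
    fat = s * (d - 3)
    # K(d) at scale delta/rho, after rescaling: mu_fine <~ (delta/rho)**(d-3)
    fine = (1 - s) * (d - 3)
    return fat + fine

d = Fraction(5, 2)  # sample value only (the Wolff exponent)
s = Fraction(1, 2)  # rho = sqrt(delta)
e = sticky_multiplicity_exponent(d, s)
# Total tube volume is about 1, so the union has volume >~ delta**(-e).
print("mu <~ delta **", e, "; volume >~ delta **", -e)
```

The answer does not depend on {s}, which reflects the remark that the precise choice of intermediate scale is not important for this sketch.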

In their previous paper, Wang and Zahl were roughly able to squeeze a little bit more out of this argument to get something resembling {K(d) \implies K(d+\alpha)} in the sticky case, loosely following a strategy of Nets Katz and myself that I discussed in this previous blog post from over a decade ago. I will not discuss this portion of the argument further here, referring the reader to the introduction to that paper; instead, I will focus on the arguments in the current paper, which handle the non-sticky case.

Let’s try to repeat the above analysis in a non-sticky situation. We assume {K(d)} (or some suitable variant thereof), and consider some thickened Kakeya set

\displaystyle  E = \bigcup_{T \in {\mathbb T}} T

where {{\mathbb T}} is something resembling what we might call a “Kakeya configuration” at scale {\delta}: a collection of {\delta^{-2}} thin tubes of dimension {\delta \times \delta \times 1} that are {\delta}-separated in direction. (Actually, to make the induction work, one has to consider a more general family of tubes than these, satisfying some standard “Wolff axioms” instead of the direction separation hypothesis; but we will gloss over this issue for now.) Our goal is to prove something like {K(d+\alpha)} for some {\alpha>0}, which amounts to obtaining some improved volume bound

\displaystyle  |E| \gtrapprox \delta^{3-d-\alpha}

that improves upon the bound {|E| \gtrapprox \delta^{3-d}} coming from {K(d)}. From the previous paper we know we can do this in the “sticky” case, so we will assume that {E} is “non-sticky” (whatever that means).

A typical non-sticky setup is when there are now {m \rho^{-2}} fat tubes for some multiplicity {m \ggg 1} (e.g., {m = \delta^{-\eta}} for some small constant {\eta>0}), with each fat tube containing only {m^{-1} (\delta/\rho)^{-2}} thin tubes. Now we have an unfortunate imbalance: the fat tubes form a “super-Kakeya configuration”, with too many tubes at the coarse scale {\rho} for them to be all {\rho}-separated in direction, while the thin tubes inside a fat tube form a “sub-Kakeya configuration” in which there are not enough tubes to cover all relevant directions. So one cannot apply the hypothesis {K(d)} efficiently at either scale.
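
To see that this configuration still accounts for all {\approx \delta^{-2}} thin tubes, and how far each scale is from a Kakeya count, one can again track exponents of {\delta}. The following is a minimal sketch, assuming purely for illustration that {\rho = \delta^s} and {m = \delta^{-\eta}} for hypothetical parameters {0<s<1} and {\eta>0}; the function name is invented for this example.

```python
from fractions import Fraction

def non_sticky_counts(s, eta):
    """Exponents of delta for tube counts, assuming rho = delta**s, m = delta**(-eta)."""
    fat = -eta - 2 * s                # m * rho**(-2) fat tubes
    thin_per_fat = eta - 2 * (1 - s)  # m**(-1) * (delta/rho)**(-2) thin tubes per fat tube
    kakeya_fat = -2 * s               # rho**(-2): Kakeya count at scale rho
    kakeya_thin = -2 * (1 - s)        # (delta/rho)**(-2): Kakeya count at scale delta/rho
    return {
        "total": fat + thin_per_fat,                 # -2, i.e. delta**(-2) thin tubes
        "fat_excess": fat - kakeya_fat,              # -eta: a factor m too many fat tubes
        "thin_deficit": thin_per_fat - kakeya_thin,  # +eta: a factor m too few thin tubes
    }

counts = non_sticky_counts(Fraction(1, 2), Fraction(1, 100))
print(counts)
```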

This looks like a serious obstacle, so let’s change tack for a bit and think of a different way to try to close the argument. Let’s look at how {E} intersects a given {\rho}-ball {B(x,\rho)}. The hypothesis {K(d)} suggests that {E} might behave like a {d}-dimensional fractal (thickened at scale {\delta}), in which case one might be led to a predicted size of {E \cap B(x,\rho)} of the form {(\rho/\delta)^d \delta^3}. Suppose for sake of argument that the set {E} was denser than this at this scale, for instance we have

\displaystyle  |E \cap B(x,\rho)| \gtrapprox (\rho/\delta)^d \delta^{3-\alpha} \ \ \ \ \ (3)

for all {x \in E} and some {\alpha>0}. Observe that the {\rho}-neighborhood of {E} is basically {\bigcup_{T_\rho \in {\mathbb T}_\rho} T_\rho}, and thus has volume {\gtrapprox \rho^{3-d}} by the hypothesis {K(d)} (indeed we would even expect some gain in {m}, but we do not attempt to capture such a gain for now). Since {\rho}-balls have volume {\approx \rho^3}, this should imply that {E} needs about {\gtrapprox \rho^{-d}} balls to cover it. Applying (3), we then heuristically have

\displaystyle  |E| \gtrapprox \rho^{-d} \times (\rho/\delta)^d \delta^{3-\alpha} = \delta^{3-d-\alpha}

which would give the desired gain {K(d+\alpha)}. So we win if we can exhibit the condition (3) for some intermediate scale {\rho}. I think of this as a “Frostman measure violation”, in that the Frostman type bound

\displaystyle |E \cap B(x,\rho)| \lessapprox (\rho/\delta)^d \delta^3

is being violated.
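
The arithmetic in this Frostman violation argument can be verified by tracking exponents of {\delta}. The sketch below is an illustration only, assuming {\rho = \delta^s} for a hypothetical {0<s<1}; the function name is made up for this example.

```python
from fractions import Fraction

def frostman_volume_exponent(d, alpha, s):
    """Exponent of delta in the lower bound for |E|, assuming rho = delta**s."""
    # Number of rho-balls needed to cover E: rho**(-d) = delta**(-d*s)
    balls = -d * s
    # Mass of E in each ball, from (3): (rho/delta)**d * delta**(3-alpha)
    per_ball = d * (s - 1) + (3 - alpha)
    return balls + per_ball

d, alpha = Fraction(5, 2), Fraction(1, 50)  # sample values only
for s in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    # Each line should show the same exponent 3 - d - alpha, whatever s is.
    print(s, frostman_volume_exponent(d, alpha, s))
```

The powers of {\rho} cancel, so a violation of the Frostman bound at any single intermediate scale is enough to give the gain.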

The set {E}, being the union of tubes of thickness {\delta}, is essentially the union of {\delta \times \delta \times \delta} cubes. But it has been observed in several previous works (starting with a paper of Nets Katz, Izabella Laba, and myself) that these Kakeya type sets tend to organize themselves into larger “grains” than these cubes – in particular, they can organize into {\delta \times c \times c} disjoint prisms (or “grains”) in various orientations for some intermediate scales {\delta \lll c \ll 1}. The original “graininess” argument of Nets, Izabella and myself required a stickiness hypothesis which we are explicitly not assuming (and also an “x-ray estimate”, though Wang and Zahl were able to find a suitable substitute for this), so is not directly available for this argument; however, there is an alternate approach to graininess developed by Guth, based on the polynomial method, that can be adapted to this setting. (I am told that Guth has a way to obtain this graininess reduction for this paper without invoking the polynomial method, but I have not studied the details.) With rescaling, we can ensure that the thin tubes inside a single fat tube {T_\rho} will organize into grains of a rescaled dimension {\delta \times \rho c \times c}. The grains associated to a single fat tube will be essentially disjoint; but there can be overlap between grains from different fat tubes.

The exact dimensions {\rho c, c} of the grains are not specified in advance; the argument of Guth will show that {\rho c} is significantly larger than {\delta}, but other than that there are no bounds. But in principle we should be able to assume without loss of generality that the grains are as “large” as possible. This means that there are no grains of greater length, that is, of dimensions {\delta \times \rho' c' \times c'} with {c'} much larger than {c}; and for fixed {c}, there are no wider grains of dimensions {\delta \times \rho' c \times c} with {\rho'} much larger than {\rho}.

One somewhat degenerate possibility is that there are enormous grains of dimensions approximately {\delta \times 1 \times 1} (i.e., {\rho \approx c \approx 1}), so that the Kakeya set {E} becomes more like a union of planar slabs. Here, it turns out that the classical {L^2} arguments of Córdoba give good estimates, so this turns out to be a relatively easy case. So we can assume that at least one of {\rho} or {c} is small (or both).

We now revisit the multiplicity inequality (2). There is something slightly wasteful about this inequality, because the fat tubes used to define {\mu_{fat}} occupy a lot of space that is not in {E}. An improved inequality here is

\displaystyle  \mu \lessapprox \mu_{coarse} \mu_{fine}, \ \ \ \ \ (4)

where {\mu_{coarse}} is the multiplicity, not of the fat tubes {T_\rho}, but rather of the smaller set {\bigcup_{T \subset T_\rho} T}. The point here is that by the graininess hypotheses, each {\bigcup_{T \subset T_\rho} T} is the union of essentially disjoint grains of some intermediate dimensions {\delta \times \rho c \times c}. So the quantity {\mu_{coarse}} is basically measuring the multiplicity of the grains.

It turns out that after a suitable rescaling, the arrangement of grains looks locally like an arrangement of {\rho \times \rho \times 1} tubes. If one is lucky, these tubes will look like a Kakeya (or sub-Kakeya) configuration, for instance with not too many tubes in a given direction. (More precisely, one should assume here some form of the Wolff axioms, which the authors refer to as the “Katz-Tao Convex Wolff axioms”.) A suitable version of the hypothesis {K(d)} will then give the bound

\displaystyle  \mu_{coarse} \lessapprox \rho^{d-3}.

Meanwhile, the thin tubes inside a fat tube are going to be a sub-Kakeya configuration, having about {m} times fewer tubes than a Kakeya configuration. It turns out to be possible to use {K(d)} to then get a gain in {m} here,

\displaystyle  \mu_{fine} \lessapprox m^{-\sigma} (\delta/\rho)^{d-3},

for some small constant {\sigma>0}. Inserting these bounds into (4), one obtains a good bound {\mu \lessapprox m^{-\sigma} \delta^{d-3}} which leads to the desired gain {K(d+\alpha)}.
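
The size of the resulting gain can be read off by tracking exponents. The sketch below is an illustration only: it assumes {m = \delta^{-\eta}} as in the non-sticky setup described earlier, and normalizes the multiplicity as total tube volume (about {1}) divided by {|E|}, so that {K(d)} alone corresponds to {\mu \lessapprox \delta^{-(3-d)}}. The function name and the sample values of {\sigma} and {\eta} are hypothetical.

```python
from fractions import Fraction

def gain_alpha(d, sigma, eta):
    """Gain alpha in |E| >~ delta**(3 - d - alpha), assuming m = delta**(-eta)."""
    # mu <~ m**(-sigma) * delta**(d-3) = delta**(sigma*eta + d - 3)
    mu_exp = sigma * eta + (d - 3)
    # |E| >~ 1/mu, since the total tube volume is about 1
    volume_exp = -mu_exp
    # Compare with the K(d) exponent 3 - d
    return (3 - d) - volume_exp

alpha = gain_alpha(Fraction(5, 2), Fraction(1, 10), Fraction(1, 100))
print(alpha)  # equals sigma * eta for these sample values
```

Under these assumptions the gain is {\alpha = \sigma \eta}, which is positive but small, consistent with the need to iterate the implication {K(d) \implies K(d+\alpha)} many times.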

So the remaining case is when the grains do not behave like a rescaled Kakeya or sub-Kakeya configuration. Wang and Zahl introduce a “structure theorem” to analyze this case, concluding that the grains will organize into some larger convex prisms {W}, with the grains in each prism {W} behaving like a “super-Kakeya configuration” (with significantly more grains than one would have for a Kakeya configuration). However, the precise dimensions of these prisms {W} are not specified in advance, and one has to split into further cases.

One case is when the prisms {W} are “thick”, in that all dimensions are significantly greater than {\delta}. Informally, this means that at small scales, {E} looks like a super-Kakeya configuration after rescaling. With a somewhat lengthy induction on scales argument, Wang and Zahl are able to show that (a suitable version of) {K(d)} implies an “x-ray” version of itself, in which the lower bound for super-Kakeya configurations is noticeably better than the lower bound for Kakeya configurations. The upshot of this is that one is able to obtain a Frostman violation bound of the form (3) in this case, which as discussed previously is already enough to win.

It remains to handle the case when the prisms {W} are “thin”, in that they have thickness {\approx \delta}. In this case, it turns out that the {L^2} arguments of Córdoba, combined with the super-Kakeya nature of the grains inside each of these thin prisms, imply that each prism is almost completely occupied by the set {E}. In effect, this means that these prisms {W} themselves can be taken to be grains of the Kakeya set. But this turns out to contradict the maximality of the dimensions of the grains (if everything is set up properly). This treats the last remaining case needed to close the induction on scales, and obtain the Kakeya conjecture!