247B, Notes 1: Restriction theory

What's new 2020-03-30

This set of notes focuses on the restriction problem in Fourier analysis. Introduced by Elias Stein in the 1970s, the restriction problem is a key model problem for understanding more general oscillatory integral operators, and has turned out to be connected to many questions in geometric measure theory, harmonic analysis, combinatorics, number theory, and PDE. Only partial results on the problem are known, but these partial results have already proven to be very useful or influential in many applications.

We work in a Euclidean space {{\bf R}^d}. Recall that {L^p({\bf R}^d)} is the space of {p^{th}}-power integrable functions {f: {\bf R}^d \rightarrow {\bf C}}, quotiented out by almost everywhere equivalence, with the usual modifications when {p=\infty}. If {f \in L^1({\bf R}^d)} then the Fourier transform {\hat f: {\bf R}^d \rightarrow {\bf C}} will be defined in this course by the formula

\displaystyle  \hat f(\xi) := \int_{{\bf R}^d} f(x) e^{-2\pi i x \cdot \xi}\ dx. \ \ \ \ \ (1)

From the dominated convergence theorem we see that {\hat f} is a continuous function; from the Riemann-Lebesgue lemma we see that it goes to zero at infinity. Thus {\hat f} lies in the space {C_0({\bf R}^d)} of continuous functions that go to zero at infinity, which is a subspace of {L^\infty({\bf R}^d)}. Indeed, from the triangle inequality it is obvious that

\displaystyle  \|\hat f\|_{L^\infty({\bf R}^d)} \leq \|f\|_{L^1({\bf R}^d)}. \ \ \ \ \ (2)

If {f \in L^1({\bf R}^d) \cap L^2({\bf R}^d)}, then Plancherel’s theorem tells us that we have the identity

\displaystyle  \|\hat f\|_{L^2({\bf R}^d)} = \|f\|_{L^2({\bf R}^d)}. \ \ \ \ \ (3)

Because of this, there is a unique way to extend the Fourier transform {f \mapsto \hat f} from {L^1({\bf R}^d) \cap L^2({\bf R}^d)} to {L^2({\bf R}^d)}, in such a way that it becomes a unitary map from {L^2({\bf R}^d)} to itself. By abuse of notation we continue to denote this extension of the Fourier transform by {f \mapsto \hat f}. Strictly speaking, this extension is no longer defined in a pointwise sense by the formula (1) (indeed, the integral on the RHS ceases to be absolutely integrable once {f} leaves {L^1({\bf R}^d)}; we will return to the (surprisingly difficult) question of whether pointwise convergence continues to hold (at least in an almost everywhere sense) later in this course, when we discuss Carleson’s theorem). On the other hand, the formula (1) remains valid in the sense of distributions, and in practice most of the identities and inequalities one can show about the Fourier transform of “nice” functions (e.g., functions in {L^1({\bf R}^d) \cap L^2({\bf R}^d)}, or in the Schwartz class {{\mathcal S}({\bf R}^d)}, or test function class {C^\infty_c({\bf R}^d)}) can be extended to functions in “rough” function spaces such as {L^2({\bf R}^d)} by standard limiting arguments.

By (2), (3), and the Riesz-Thorin interpolation theorem, we also obtain the Hausdorff-Young inequality

\displaystyle  \|\hat f\|_{L^{p'}({\bf R}^d)} \leq \|f\|_{L^p({\bf R}^d)} \ \ \ \ \ (4)

for all {1 \leq p \leq 2} and {f \in L^1({\bf R}^d) \cap L^2({\bf R}^d)}, where {2 \leq p' \leq \infty} is the dual exponent to {p}, defined by the usual formula {\frac{1}{p} + \frac{1}{p'} = 1}. (One can improve this inequality by a constant factor, with the optimal constant worked out by Beckner, but the focus in these notes will not be on optimal constants.) As a consequence, the Fourier transform can also be uniquely extended as a continuous linear map from {L^p({\bf R}^d) \rightarrow L^{p'}({\bf R}^d)}. (The situation with {p>2} is much worse; see below the fold.)
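As a quick sanity check of (4), one can test the inequality on the gaussian {f(x) = e^{-\pi x^2}} in one dimension, which is its own Fourier transform under the normalisation (1) and whose {L^p} norms have the closed form {p^{-1/(2p)}}. The following is only an illustrative sketch; the function names are hypothetical and not part of these notes. The ratio it prints agrees with Beckner's optimal constant in dimension one, which is attained by gaussians.

```python
import math

def gaussian_lp_norm(p: float) -> float:
    """L^p(R) norm of exp(-pi x^2): the integral of exp(-pi p x^2) is p^(-1/2),
    so the norm is p^(-1/(2p)); the sup norm (p = infinity) is 1."""
    if math.isinf(p):
        return 1.0
    return p ** (-1.0 / (2.0 * p))

def dual_exponent(p: float) -> float:
    """Dual exponent p' defined by 1/p + 1/p' = 1."""
    return math.inf if p == 1.0 else p / (p - 1.0)

def hausdorff_young_ratio(p: float) -> float:
    """Ratio ||f^||_{p'} / ||f||_p for the self-dual gaussian f = exp(-pi x^2)."""
    return gaussian_lp_norm(dual_exponent(p)) / gaussian_lp_norm(p)

for p in (1.0, 1.25, 1.5, 1.75, 2.0):
    # (4) predicts that every printed ratio is at most 1
    print(p, hausdorff_young_ratio(p))
```

The ratio equals {1} at the endpoints {p=1,2} and is strictly smaller in between, consistent with the remark that (4) can be improved by a constant factor.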

The restriction problem asks, for a given exponent {1 \leq p \leq 2} and a subset {S} of {{\bf R}^d}, whether it is possible to meaningfully restrict the Fourier transform {\hat f} of a function {f \in L^p({\bf R}^d)} to the set {S}. If the set {S} has positive Lebesgue measure, then the answer is yes, since {\hat f} lies in {L^{p'}({\bf R}^d)} and therefore has a meaningful restriction to {S} even though functions in {L^{p'}} are only defined up to sets of measure zero. But what if {S} has measure zero? If {p=1}, then {\hat f \in C_0({\bf R}^d)} is continuous and therefore can be meaningfully restricted to any set {S}. At the other extreme, if {p=2} and {f} is an arbitrary function in {L^2({\bf R}^d)}, then by Plancherel’s theorem, {\hat f} is also an arbitrary function in {L^2({\bf R}^d)}, and thus has no well-defined restriction to any set {S} of measure zero.

It was observed by Stein (as reported in the Ph.D. thesis of Charlie Fefferman) that for certain measure zero subsets {S} of {{\bf R}^d}, such as the sphere {S^{d-1} := \{ \xi \in {\bf R}^d: |\xi| = 1\}}, one can obtain meaningful restrictions of the Fourier transforms of functions {f \in L^p({\bf R}^d)} for certain {p} between {1} and {2}, thus demonstrating that the Fourier transform of such functions retains more structure than a typical element of {L^{p'}({\bf R}^d)}:

Theorem 1 (Preliminary {L^2} restriction theorem) If {d \geq 2} and {1 \leq p < \frac{4d}{3d+1}}, then one has the estimate

\displaystyle  \| \hat f \|_{L^2(S^{d-1}, d\sigma)} \lesssim_{d,p} \|f\|_{L^p({\bf R}^d)}

for all Schwartz functions {f \in {\mathcal S}({\bf R}^d)}, where {d\sigma} denotes surface measure on the sphere {S^{d-1}}. In particular, the restriction {\hat f|_S} can be meaningfully defined by continuous linear extension to an element of {L^2(S^{d-1},d\sigma)}.

Proof: Fix {d,p,f}. We expand out

\displaystyle  \| \hat f \|_{L^2(S^{d-1}, d\sigma)}^2 = \int_{S^{d-1}} |\hat f(\xi)|^2\ d\sigma(\xi).

From (1) and Fubini’s theorem, the right-hand side may be expanded as

\displaystyle  \int_{{\bf R}^d} \int_{{\bf R}^d} f(x) \overline{f}(y) (d\sigma)^\vee(y-x)\ dx dy

where the inverse Fourier transform {(d\sigma)^\vee} of the measure {d\sigma} is defined by the formula

\displaystyle  (d\sigma)^\vee(x) := \int_{{\bf R}^d} e^{2\pi i x \cdot \xi}\ d\sigma(\xi).

In other words, we have the identity

\displaystyle  \| \hat f \|_{L^2(S^{d-1}, d\sigma)}^2 = \langle f, f * (d\sigma)^\vee \rangle_{L^2({\bf R}^d)}. \ \ \ \ \ (5)

Since the sphere {S^{d-1}} has bounded measure, we have from the triangle inequality that

\displaystyle  (d\sigma)^\vee(x) \lesssim_d 1. \ \ \ \ \ (6)

Also, from the method of stationary phase (as covered in the previous class 247A), or Bessel function asymptotics, we have the decay

\displaystyle  (d\sigma)^\vee(x) \lesssim_d |x|^{-(d-1)/2} \ \ \ \ \ (7)

for any {x \in {\bf R}^d} (note that the bound already follows from (6) unless {|x| \geq 1}). We remark that the exponent {-\frac{d-1}{2}} here can be seen geometrically from the following considerations. For {|x|>1}, the phase {e^{2\pi i x \cdot \xi}} on the sphere is stationary at the two antipodal points {x/|x|, -x/|x|} of the sphere, and constant on the tangent hyperplanes to the sphere at these points. The wavelength of this phase is proportional to {1/|x|}, so the phase would be approximately stationary on a cap formed by intersecting the sphere with a {\sim 1/|x|} neighbourhood of the tangent hyperplane to one of the stationary points. As the sphere is tangent to first order at these points, this cap will have diameter {\sim 1/|x|^{1/2}} in the directions of the {d-1}-dimensional tangent space, so the cap will have surface measure {\sim |x|^{-(d-1)/2}}, which leads to the prediction (7). We combine (6), (7) into the unified estimate

\displaystyle  (d\sigma)^\vee(x) \lesssim_d \langle x\rangle^{-(d-1)/2}, \ \ \ \ \ (8)

where the “Japanese bracket” {\langle x\rangle} is defined as {\langle x \rangle := (1+|x|^2)^{1/2}}. Since {\langle x \rangle^{-\alpha}} lies in {L^p({\bf R}^d)} precisely when {p > \frac{d}{\alpha}}, we conclude that

\displaystyle  (d\sigma)^\vee \in L^q({\bf R}^d) \hbox{ iff } q > \frac{d}{(d-1)/2}.

Applying Young’s convolution inequality, we conclude (after some arithmetic) that

\displaystyle  \| f * (d\sigma)^\vee \|_{L^{p'}({\bf R}^d)} \lesssim_{p,d} \|f\|_{L^p({\bf R}^d)}

whenever {1 \leq p < \frac{4d}{3d+1}}, and the claim now follows from (5) and Hölder’s inequality. \Box
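The arithmetic behind the use of Young's inequality in this proof can be spelled out as follows. To map {L^p} to {L^{p'}} by convolution, the kernel must lie in {L^q} with {\frac{1}{p} + \frac{1}{q} = 1 + \frac{1}{p'}}, that is {q = p'/2}, and the kernel {(d\sigma)^\vee} lies in {L^q} exactly when {q > \frac{2d}{d-1}}. The sketch below (with hypothetical helper names) checks with exact rational arithmetic that these two conditions are compatible precisely in the range {p < \frac{4d}{3d+1}} of Theorem 1; the case {p=1} is excluded since it is already covered by the bounded kernel estimate (6).

```python
from fractions import Fraction

def young_kernel_exponent(p: Fraction) -> Fraction:
    """Exponent q with 1/p + 1/q = 1 + 1/p', i.e. 1/q = 2 - 2/p (needs p > 1)."""
    return 1 / (2 - 2 / p)

def kernel_threshold(d: int) -> Fraction:
    """(d sigma)^vee lies in L^q exactly when q > d / ((d-1)/2) = 2d/(d-1)."""
    return Fraction(2 * d, d - 1)

def p_threshold(d: int) -> Fraction:
    """Upper limit 4d/(3d+1) of the range of p in Theorem 1."""
    return Fraction(4 * d, 3 * d + 1)

def young_applies(d: int, p: Fraction) -> bool:
    """True when the exponent required by Young's inequality is admissible."""
    return young_kernel_exponent(p) > kernel_threshold(d)

for d in (2, 3, 5):
    print(d, p_threshold(d), young_applies(d, p_threshold(d) - Fraction(1, 100)))
```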

Remark 2 By using the Hardy-Littlewood-Sobolev inequality in place of Young’s convolution inequality, one can also establish this result for {p = \frac{4d}{3d+1}}.
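The decay estimate (7) used in the proof of Theorem 1 can be observed numerically in the simplest case {d=2}, where {d\sigma} is arclength measure on the unit circle and (by rotation invariance) {(d\sigma)^\vee(x)} depends only on {r = |x|}; in fact it equals {2\pi J_0(2\pi r)}. The sketch below is an illustration only: the function name and the choice of the trapezoid rule are assumptions made here, not something taken from the notes.

```python
import cmath
import math

def circle_measure_transform(r: float, n: int = 4096) -> complex:
    """Approximate (d sigma)^vee at a point of norm r for the unit circle in R^2.

    Taking x = (r, 0), we have x . xi = r cos(theta) on the circle, and the
    integral over theta in [0, 2 pi) is computed with the trapezoid rule,
    which is very accurate for smooth periodic integrands.
    """
    h = 2 * math.pi / n
    return h * sum(cmath.exp(2j * math.pi * r * math.cos(k * h)) for k in range(n))

for r in (1.0, 2.0, 5.0, 10.0, 20.0, 50.0):
    value = circle_measure_transform(r)
    # (7) with d = 2 predicts that |value| * r^(1/2) stays bounded
    print(r, abs(value), abs(value) * math.sqrt(r))
```

The last printed column stays bounded as {r} grows, in line with the exponent {-(d-1)/2 = -1/2}.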

Motivated by this result, given any Radon measure {\mu} on {{\bf R}^d} and any exponents {1 \leq p,q \leq \infty}, we use {R_\mu(p \rightarrow q)} to denote the claim that the restriction estimate

\displaystyle  \| \hat f \|_{L^q({\bf R}^d, \mu)} \lesssim_{d,p,q,\mu} \|f\|_{L^p({\bf R}^d)} \ \ \ \ \ (9)

holds for all Schwartz functions {f}; if {S} is a {k}-dimensional submanifold of {{\bf R}^d} (possibly with boundary), we write {R_S(p \rightarrow q)} for {R_\mu(p \rightarrow q)} where {\mu} is the {k}-dimensional surface measure on {S}. Thus, for instance, we trivially always have {R_S(1 \rightarrow \infty)}, while Theorem 1 asserts that {R_{S^{d-1}}(p \rightarrow 2)} holds whenever {1 \leq p < \frac{4d}{3d+1}}. We will not give a comprehensive survey of restriction theory in these notes, but instead focus on some model results that showcase some of the basic techniques in the field. (I have a more detailed survey on this topic from 2003, but it is somewhat out of date.)

— 1. Necessary conditions —

It is relatively easy to find necessary conditions for a restriction estimate {R_S(p \rightarrow q)} to hold, as one simply needs to test the estimate (9) against a suitable family of examples. We begin with the simplest case {S = {\bf R}^d}. The Hausdorff-Young inequality (4) tells us that we have the restriction estimate {R_{{\bf R}^d}(p \rightarrow p')} whenever {1 \leq p \leq 2}. These are the only restriction estimates available:

Proposition 3 (Restriction to {{\bf R}^d}) Suppose that {1 \leq p,q \leq \infty} are such that {R_{{\bf R}^d}(p \rightarrow q)} holds. Then {q=p'} and {1 \leq p \leq 2}.

We first establish the necessity of the duality condition {q=p'}. This is easily shown, but we will demonstrate it in three slightly different ways in order to illustrate different perspectives. The first perspective is from scale invariance. Suppose that the estimate {R_{{\bf R}^d}(p \rightarrow q)} holds, thus one has

\displaystyle  \| \hat f \|_{L^q({\bf R}^d)} \lesssim_{d,p,q} \|f\|_{L^p({\bf R}^d)} \ \ \ \ \ (10)

for all Schwartz functions {f \in {\mathcal S}({\bf R}^d)}. For any scaling factor {\lambda>0}, we define the scaled version {f_\lambda \in {\mathcal S}({\bf R}^d)} of {f} by the formula

\displaystyle  f_\lambda(x) := f(x/\lambda).

Applying (10) with {f} replaced by {f_\lambda}, we then have

\displaystyle  \| \hat f_\lambda \|_{L^q({\bf R}^d)} \lesssim_{d,p,q} \|f_\lambda\|_{L^p({\bf R}^d)}.

From change of variables, we have

\displaystyle  \|f_\lambda\|_{L^p({\bf R}^d)} = \lambda^{d/p} \|f\|_{L^p({\bf R}^d)}

and from the definition of Fourier transform and further change of variables we have

\displaystyle  \hat f_\lambda(\xi) = \lambda^d \hat f(\lambda \xi)

so that

\displaystyle  \|\hat f_\lambda\|_{L^q({\bf R}^d)} = \lambda^{d-d/q} \|\hat f\|_{L^q({\bf R}^d)};

combining all these estimates and rearranging, we conclude that

\displaystyle  \| \hat f \|_{L^q({\bf R}^d)} \lesssim_{d,p,q} \lambda^{d/p+d/q-d} \|f \|_{L^p({\bf R}^d)}.

If {d/p+d/q-d} is non-zero, then by sending {\lambda} either to zero or infinity we conclude that {\| \hat f\|_{L^q({\bf R}^d)}=0} for all {f \in {\mathcal S}({\bf R}^d)}, which is absurd. Thus we must have the necessary condition {d/p+d/q-d=0}, or equivalently that {q=p'}.
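The scaling argument can be illustrated numerically in one dimension with the gaussian {f(x) = e^{-\pi x^2}}, for which {f_\lambda(x) = e^{-\pi x^2/\lambda^2}} has Fourier transform {\lambda e^{-\pi \lambda^2 \xi^2}}. Following (10), the function is measured in {L^p} and its Fourier transform in {L^q}. The sketch below (helper names and quadrature parameters are hypothetical choices) computes the ratio of the two norms: it is independent of {\lambda} when {q=p'}, and otherwise behaves like the power {\lambda^{d-d/p-d/q}}, so no uniform bound can hold.

```python
import math

def lp_norm(func, p: float, half_width: float = 40.0, n: int = 40_000) -> float:
    """Midpoint-rule approximation of the L^p(R) norm of func,
    integrating over [-half_width, half_width]."""
    h = 2 * half_width / n
    total = sum(abs(func(-half_width + (k + 0.5) * h)) ** p for k in range(n))
    return (total * h) ** (1.0 / p)

def norm_ratio(lam: float, p: float, q: float) -> float:
    """||(f_lam)^||_{L^q} / ||f_lam||_{L^p} for f(x) = exp(-pi x^2), d = 1."""
    f_lam = lambda x: math.exp(-math.pi * (x / lam) ** 2)
    f_lam_hat = lambda xi: lam * math.exp(-math.pi * (lam * xi) ** 2)
    return lp_norm(f_lam_hat, q) / lp_norm(f_lam, p)

for lam in (0.5, 1.0, 2.0, 4.0):
    # first ratio: dual exponents (p, q) = (3/2, 3); second: (p, q) = (3/2, 2)
    print(lam, norm_ratio(lam, 1.5, 3.0), norm_ratio(lam, 1.5, 2.0))
```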

We now establish the same necessary condition from the perspective of dimensional analysis, which one can view as an abstraction of scale invariance arguments. We give the spatial variable a unit {L} of length. It is not so important what units we assign to the range of the function {f} (it will cancel out of both sides), but let us make it dimensionless for sake of discussion. Then the {L^p} norm

\displaystyle  \|f\|_{L^p({\bf R}^d)} = (\int_{{\bf R}^d} |f(x)|^p\ dx)^{1/p}

will have the units of {L^{d/p}}, because integration against {d}-dimensional Lebesgue measure will have the units of {L^d} (note this conclusion can also be justified in the limiting case {p=\infty}). For similar reasons, the Fourier transform

\displaystyle  \hat f(\xi) = \int_{{\bf R}^d} f(x) e^{-2\pi i x \cdot \xi}\ dx

will have the units of {L^d}; also, the frequency variable {\xi} must have the units of {L^{-1}} in order to make the exponent {-2\pi i x \cdot \xi} appearing in the exponential dimensionless. As such, the norm

\displaystyle  \| \hat f \|_{L^q({\bf R}^d)} = (\int_{{\bf R}^d} |\hat f(\xi)|^q\ d\xi)^{1/q}

has units {L^d (L^{-1})^{d/q} = L^{d-d/q}}. In order for the estimate (10) to be dimensionally consistent, we must therefore have {d-d/q = d/p}, or equivalently that {q=p'}.

Finally, we establish the necessary condition {q=p'} once again using the example of a rescaled bump function, which is basically the same as the first approach but with {f} replaced by a bump function. We will argue at a slightly heuristic level, but it is not difficult to make the arguments below rigorous and we leave this as an exercise to the reader. Given a length scale {R}, let {\varphi_R} be a bump function adapted to the ball {B(0,R) := \{ x \in {\bf R}^d: |x| \leq R \}} of radius {R} around the origin, thus {\varphi_R(x) = \varphi(x/R)} where {\varphi \in C^\infty_c({\bf R}^d)} is some fixed test function supported on {B(0,1)}. As long as {\varphi} is non-zero, the norm {\| \varphi_R \|_{L^p({\bf R}^d)}} is comparable to {R^{d/p}} (up to constant factors that can depend on {d,p,\varphi} but are independent of {R}). The uncertainty principle then predicts that the Fourier transform {\widehat{\varphi_R}} will be concentrated in the dual ball {B(0,1/R)}, and within this ball (or perhaps a slightly smaller version of this ball) {\widehat{\varphi_R}(\xi) = \int_{{\bf R}^d} \varphi_R(x) e^{-2\pi ix \cdot \xi}\ dx} would be expected to be of size comparable to {R^{d}} (the phase {x \cdot \xi} does not vary enough to cause significant cancellation). From this we expect {\| \widehat{\varphi_R} \|_{L^q({\bf R}^d)}} to be comparable in size to {R^{d-d/q}}. If (10) held, we would then have

\displaystyle  R^{d-d/q} \lesssim_{d,p,q,\varphi} R^{d/p}

for all {R>0}, which is only possible if {d-d/q = d/p}, or equivalently {q=p'}.

Now we turn to the other necessary condition {p \leq 2}. Here one does not use scaling considerations; instead, it is more convenient to work with randomised examples. A useful tool in this regard is Khintchine’s inequality, which encodes the square root cancellation heuristic that a sum {\sum_{j=1}^n f_j} of numbers or functions {f_j} with randomised signs (or phases) should have magnitude roughly comparable to the square function {(\sum_{j=1}^n |f_j|^2)^{1/2}}.

Lemma 4 (Khintchine’s inequality) Let {0 < p < \infty}, and let {\epsilon_1,\dots,\epsilon_n \in \{-1,+1\}} be independent random variables that each take the values {-1,+1} with an equal probability of {1/2}.

  • (i) For any complex numbers {z_1,\dots,z_n}, one has

    \displaystyle  ({\bf E} |\sum_{j=1}^n \epsilon_j z_j|^p)^{1/p} \sim_p (\sum_{j=1}^n |z_j|^2)^{1/2}.

  • (ii) For any functions {f_1,\dots,f_n \in L^p(X,\mu)} on a measure space {(X,\mu)}, one has

    \displaystyle  ({\bf E} \|\sum_{j=1}^n \epsilon_j f_j\|_{L^p(X,\mu)}^p)^{1/p} \sim_p \| (\sum_{j=1}^n |f_j|^2)^{1/2} \|_{L^p(X,\mu)}.

Proof: We begin with (i). By taking real and imaginary parts we may assume without loss of generality that the {z_j} are all real, then by normalisation it suffices to show the upper bound

\displaystyle  {\bf E} |\sum_{j=1}^n \epsilon_j x_j|^p \lesssim_p 1 \ \ \ \ \ (11)

and the lower bound

\displaystyle  {\bf E} |\sum_{j=1}^n \epsilon_j x_j|^p \gtrsim_p 1 \ \ \ \ \ (12)

for all {0 < p < \infty}, whenever {x_1,\dots,x_n} are real numbers with {\sum_{j=1}^n x_j^2=1}.

When {p=2} the upper and lower bounds follow by direct calculation (in fact we have equality { {\bf E} |\sum_{j=1}^n \epsilon_j x_j|^2 = 1} in this case). By Hölder’s inequality, this yields the upper bound for {p \leq 2} and the lower bound for {p>2}. To handle the remaining cases of (11) it is convenient to use the exponential moment method. Let {\lambda>0} be an arbitrary threshold, and consider the upper tail probability

\displaystyle  {\bf P}( \sum_{j=1}^n \epsilon_j x_j \geq \lambda ).

For any {t>0}, we see from Markov’s inequality that this quantity is less than or equal to

\displaystyle  e^{-t\lambda} {\bf E} \exp( t \sum_{j=1}^n \epsilon_j x_j ).

The expectation here can be computed to equal

\displaystyle {\bf E} \exp( t \sum_{j=1}^n \epsilon_j x_j ) = \prod_{j=1}^n {\bf E} \exp( \epsilon_j t x_j ) = \prod_{j=1}^n \cosh(tx_j).

By comparing power series we see that {\cosh(y) \leq \exp( y^2/2)} for any real {y}, hence by the normalisation {\sum_{j=1}^n x_j^2=1} we see that

\displaystyle  {\bf P}( \sum_{j=1}^n \epsilon_j x_j \geq \lambda ) \leq e^{-t\lambda} e^{t^2/2}.

If we set {t := \lambda} we conclude that

\displaystyle  {\bf P}( \sum_{j=1}^n \epsilon_j x_j \geq \lambda ) \leq e^{-\lambda^2/2};

since the random variable {\sum_{j=1}^n \epsilon_j x_j} is symmetric around the origin, we conclude that

\displaystyle  {\bf P}( |\sum_{j=1}^n \epsilon_j x_j| \geq \lambda ) \leq 2e^{-\lambda^2/2}.

From the Fubini-Tonelli theorem we have

\displaystyle  {\bf E} |X|^p = \int_0^\infty p \lambda^{p-1} {\bf P}(|X| \geq \lambda)\ d\lambda

and this then gives the upper bound (11) for any {2 < p < \infty}. The claim (12) for {0 < p < 2} then follows from this, Hölder’s inequality (applied in reverse), and the fact that (12) was already established for {p=2}.

To prove (ii), observe from (i) that for every {x \in X} one has

\displaystyle  {\bf E} |\sum_{j=1}^n \epsilon_j f_j(x)|^p \sim_p (\sum_{j=1}^n |f_j(x)|^2)^{p/2};

integrating in {X} and applying the Fubini-Tonelli theorem, we obtain the claim. \Box
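For small {n}, part (i) of Lemma 4 and the tail bound from its proof can be verified by brute force, averaging over all {2^n} sign patterns. The sketch below is only an illustration with hypothetical function names and an arbitrary choice of coefficients; it uses the normalisation {\sum_{j=1}^n x_j^2 = 1} from the proof, under which the second moment is exactly {1} and the fourth moment is {3 - 2 \sum_{j=1}^n x_j^4}.

```python
import itertools
import math

def rademacher_moment(xs, p: float) -> float:
    """Exact value of E|sum_j eps_j x_j|^p, averaging over all 2^n sign patterns."""
    total = 0.0
    for signs in itertools.product((-1, 1), repeat=len(xs)):
        total += abs(sum(s * x for s, x in zip(signs, xs))) ** p
    return total / 2 ** len(xs)

def tail_probability(xs, lam: float) -> float:
    """Exact value of P(|sum_j eps_j x_j| >= lam)."""
    hits = sum(
        1
        for signs in itertools.product((-1, 1), repeat=len(xs))
        if abs(sum(s * x for s, x in zip(signs, xs))) >= lam
    )
    return hits / 2 ** len(xs)

def normalise(xs):
    """Rescale the coefficients so that sum_j x_j^2 = 1."""
    norm = math.sqrt(sum(x * x for x in xs))
    return [x / norm for x in xs]

xs = normalise([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
for p in (0.5, 1, 2, 4, 6):
    print("moment", p, rademacher_moment(xs, p))
for lam in (0.5, 1.0, 2.0, 3.0):
    # exact tail probability next to the bound 2 exp(-lam^2 / 2)
    print("tail", lam, tail_probability(xs, lam), 2 * math.exp(-lam * lam / 2))
```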

Exercise 5

  • (i) How does the implied constant in (11) depend on {p} in the limit {p \rightarrow \infty} if one analyses the above argument more carefully?
  • (ii) Establish (11) for the case of even integers {p} by direct expansion of the left-hand side and some combinatorial calculation. How does the dependence of the implied constant in (11) on {p} compare with (i) if one does this?
  • (iii) Establish a matching lower bound (up to absolute constants) for the implied constant in (11).

Now we show that the estimate (10) fails in the large {p} regime {p>2}, even when {q=p'}. Here, the idea is to have {f} “spread out” in physical space (in order to keep the {L^p} norm low), and also to have {\hat f} somewhat spread out in frequency space (in order to prevent the {L^{p'}} norm from dropping too much). We use the probabilistic method (constructing a random counterexample rather than a deterministic one) in order to exploit Khintchine’s inequality. Let {\varphi} be a non-zero bump function supported on (say) the unit ball {B(0,1)}, and consider a (random) function of the form

\displaystyle  f(x) = \sum_{j=1}^n \epsilon_j \varphi(x-x_j)

where {\epsilon_1,\dots,\epsilon_n} are the random signs from Lemma 4, and {x_1,\dots,x_n \in {\bf R}^d} are sufficiently separated points in {{\bf R}^d} (all we need for this construction is that {|x_j-x_k| \geq 2} for all {1 \leq j < k \leq n}). Then the summands here have disjoint supports and

\displaystyle  \| f\|_{L^p({\bf R}^d)} \sim_{p,\varphi} n^{1/p}

(note that the signs {\epsilon_j} have no effect on the magnitude of {f}). If (10) were true, this would give the (deterministic) bound

\displaystyle  \| \hat f \|_{L^{p'}({\bf R}^d)} \lesssim_{p,\varphi,d} n^{1/p}. \ \ \ \ \ (13)

On the other hand, the Fourier transform of {f} is

\displaystyle  \hat f(\xi) = \sum_{j=1}^n \epsilon_j e^{-2\pi i x_j \cdot \xi} \hat \varphi(\xi),

so by Khintchine’s inequality

\displaystyle  ({\bf E} \| \hat f\|_{L^{p'}({\bf R}^d)}^{p'})^{1/p'} \sim_{p,d} \| (\sum_{j=1}^n |e^{-2\pi i x_j \cdot \xi} \hat \varphi(\xi)|^2)^{1/2} \|_{L^{p'}_\xi({\bf R}^d)}.

The phases {e^{-2\pi i x_j \cdot \xi}} can be deleted, and {\hat \varphi} is not identically zero, so one arrives at

\displaystyle  ({\bf E} \| \hat f\|_{L^{p'}({\bf R}^d)}^{p'})^{1/p'} \sim_{p,d,\varphi} n^{1/2}.

Comparing this with (13) and sending {n \rightarrow \infty}, we obtain a contradiction if {p>2}. This completes the proof of Proposition 3.

Exercise 6 Find a deterministic construction that explains why the estimate (10) fails when {p>2} and {q=p'}.

Exercise 7 (Marcinkiewicz-Zygmund theorem) Let {(X,\mu), (Y,\nu)} be measure spaces, let {0 < q \leq p < \infty}, and suppose {T: L^p(X,\mu) \rightarrow L^q(Y,\nu)} is a bounded linear operator with operator norm {\|T\|}. Show that

\displaystyle  \| (\sum_{\alpha \in A} |Tf_\alpha|^2)^{1/2} \|_{L^q(Y,\nu)} \lesssim_{p,q} \|T\| \| (\sum_{\alpha \in A} |f_\alpha|^2)^{1/2} \|_{L^p(X,\mu)}

for any at most countable index set {A} and any functions {f_\alpha \in L^p(X,\mu)}. Informally, this result asserts that if a linear operator {T} is bounded from scalar-valued {L^p} functions to scalar-valued {L^q} functions, then it is automatically bounded from vector-valued {L^p} functions to vector-valued {L^q} functions.

Exercise 8 Let {U} be a bounded open subset of {{\bf R}^d}, and let {1 \leq p,q \leq \infty}. Show that {R_U(p \rightarrow q)} holds if and only if {1 \leq p \leq 2} and {q \leq p'}. (Note: in order to use either the scale invariance argument or the dimensional analysis argument to get the condition {q \leq p'}, one should replace {U} with something like a ball {B(0,r)} of some radius {r>0}, and allow the estimates to depend on {r}.)

Now we study the restriction problem for two model hypersurfaces:

  • (i) The paraboloid

    \displaystyle  \Sigma := \{ (\xi', |\xi'|^2): \xi' \in {\bf R}^{d-1} \} \ \ \ \ \ (14)

    equipped with the measure {\mu} induced from Lebesgue measure {d\xi'} in the horizontal variables {{\bf R}^{d-1}}, thus

    \displaystyle  \int_\Sigma f\ d\mu = \int_{{\bf R}^{d-1}} f( \xi', |\xi'|^2)\ d\xi'

    (note this is not the same as surface measure on {\Sigma}, although it is mutually absolutely continuous with this measure).

  • (ii) The sphere {S^{d-1} = \{ \xi \in {\bf R}^d: |\xi| = 1 \}}.

These two hypersurfaces differ from each other in one important respect: the paraboloid is non-compact, while the sphere is compact. Aside from that, though, they behave very similarly; they are both conic hypersurfaces with everywhere positive curvature. Furthermore, they are also very highly symmetric surfaces. The sphere of course enjoys the rotation symmetry under the orthogonal group {O(d)}. At first glance the paraboloid {\Sigma} only enjoys symmetry under the smaller orthogonal group {O(d-1)} that rotates the {\xi'} variable (leaving the final coordinate {\xi_d} unchanged), but it also has a family of Galilean symmetries

\displaystyle  (\xi', \xi_d) \mapsto (\xi' + \xi_0, \xi_d + 2 \xi' \cdot \xi_0 + |\xi_0|^2 )

for any {\xi_0 \in {\bf R}^{d-1}}, which preserves {\Sigma} (and also can be seen to preserve the measure {\mu}, since the horizontal variable {\xi'} is simply translated by {\xi_0}). Furthermore, the paraboloid also enjoys a parabolic scaling symmetry

\displaystyle  (\xi', \xi_d) \mapsto (\lambda \xi', \lambda^2 \xi_d)

for any {\lambda > 0}, for which the sphere does not have an exact analogue (though morally Taylor expansion suggests that the sphere “behaves like” the paraboloid at small scales). The following exercise exploits these symmetries:

Exercise 9

  • (i) Let {V} be a bounded non-empty open subset of {S^{d-1}}, and let {1 \leq p,q \leq \infty}. Show that {R_{V}(p \rightarrow q)} holds if and only if {R_{S^{d-1}}(p \rightarrow q)} holds.
  • (ii) Let {V_1, V_2} be bounded non-empty open subsets of {\Sigma} (endowed with the restriction of {\mu} to {V_1,V_2}), and let {1 \leq p,q \leq \infty}. Show that {R_{V_1}(p \rightarrow q)} holds if and only if {R_{V_2}(p \rightarrow q)} holds.
  • (iii) Suppose that {1 \leq p,q \leq \infty} are such that {R_\Sigma(p \rightarrow q)} holds. Show that {q = \frac{d-1}{d+1} p'}. (Hint: Any of the three methods of scale invariance, dimensional analysis, or rescaled bump functions will work here.)
  • (iv) Suppose that {1 \leq p,q \leq \infty} are such that {R_{S^{d-1}}(p \rightarrow q)} holds. Show that {q \leq \frac{d-1}{d+1} p'}. (Hint: The same three methods still work, but some will be easier to pull off than others.)
  • (v) Suppose that {1 \leq p,q \leq \infty} are such that {R_V(p \rightarrow q)} holds for some bounded non-empty open subset {V} of {\Sigma}, and that {q = \frac{d-1}{d+1} p'}. Conclude that {R_\Sigma(p \rightarrow q)} holds.
  • (vi) Suppose that {1 \leq p,q \leq \infty} are such that {R_{S^{d-1}}(p \rightarrow q)} holds, and that {q = \frac{d-1}{d+1} p'}. Conclude that {R_\Sigma(p \rightarrow q)} holds.
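The Galilean and parabolic scaling symmetries of the paraboloid (14) described before Exercise 9 are easy to confirm numerically. In the sketch below, points of {{\bf R}^d} are represented as plain lists of coordinates; the helper names and the tolerance are hypothetical choices made for illustration.

```python
import random

def on_paraboloid(xi, tol: float = 1e-9) -> bool:
    """Check that xi = (xi', xi_d) satisfies xi_d = |xi'|^2 up to rounding."""
    *horizontal, last = xi
    return abs(last - sum(c * c for c in horizontal)) <= tol * (1 + abs(last))

def galilean(xi, xi0):
    """(xi', xi_d) -> (xi' + xi0, xi_d + 2 xi'.xi0 + |xi0|^2)."""
    *horizontal, last = xi
    dot = sum(a * b for a, b in zip(horizontal, xi0))
    shifted = [a + b for a, b in zip(horizontal, xi0)]
    return shifted + [last + 2 * dot + sum(b * b for b in xi0)]

def parabolic_scaling(xi, lam: float):
    """(xi', xi_d) -> (lam xi', lam^2 xi_d)."""
    *horizontal, last = xi
    return [lam * a for a in horizontal] + [lam * lam * last]

def random_point(d: int, rng: random.Random):
    """A random point of the paraboloid in R^d."""
    horizontal = [rng.uniform(-2, 2) for _ in range(d - 1)]
    return horizontal + [sum(c * c for c in horizontal)]

rng = random.Random(0)
point = random_point(4, rng)
shift = [rng.uniform(-2, 2) for _ in range(3)]
print(on_paraboloid(galilean(point, shift)), on_paraboloid(parabolic_scaling(point, 3.0)))
```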

Exercise 10 (No non-trivial restriction estimates for flat hypersurfaces) Let {S} be an open non-empty subset of a hyperplane in {{\bf R}^d}, and let {1 \leq p,q \leq \infty}. Show that {R_S(p \rightarrow q)} can only hold when {p=1}.

To obtain a further necessary condition on the restriction estimates {R_\Sigma(p \rightarrow q)} or {R_{S^{d-1}}(p \rightarrow q)} holding, it is convenient to dualise the restriction estimate to an extension estimate.

Exercise 11 (Duality) Let {\mu} be a Radon measure on {{\bf R}^d}, let {1 \leq p,q \leq \infty}, and let {A > 0}. Show that the following claims are equivalent:

  • (i) (Restriction estimate) One has

    \displaystyle  \| \hat f \|_{L^q({\bf R}^d,\mu)} \leq A \|f\|_{L^p({\bf R}^d)}

    for all {f \in C^\infty_c({\bf R}^d)}.

  • (ii) (Extension estimate) one has

    \displaystyle  \| (g\ d\mu)^\vee \|_{L^{p'}({\bf R}^d)} \leq A \| g\|_{L^{q'}({\bf R}^d,\mu)}

    for all {g \in C^\infty_c({\bf R}^d)}, where the inverse Fourier transform {(g\ d\mu)^\vee: {\bf R}^d \rightarrow {\bf C}} of the finite measure {g\ d\mu} is defined by the formula

    \displaystyle  (g\ d\mu)^\vee(x) := \int_{{\bf R}^d} g(\xi) e^{2\pi i x \cdot \xi}\ d\mu(\xi). \ \ \ \ \ (15)

This gives a further necessary condition as follows. Suppose for instance that {R_{S^{d-1}}(p \rightarrow q)} holds; then by the above exercise, one has

\displaystyle  \| (g\ d\sigma)^\vee \|_{L^{p'}({\bf R}^d)} \lesssim_{d,p,q} \| g\|_{L^{q'}(S^{d-1},d\sigma)}

for all {g \in C^\infty_c({\bf R}^d)}. In particular, {(g\ d\sigma)^\vee \in L^{p'}({\bf R}^d)}. However, we have the following stationary phase computation:

Exercise 12 If {g \in C^\infty_c({\bf R}^d)}, show that

\displaystyle  (g\ d\sigma)^\vee (x) = \frac{1}{|x|^{\frac{d-1}{2}}} ( c_+ g(\frac{x}{|x|}) e^{2\pi i |x|} + c_- g(-\frac{x}{|x|}) e^{-2\pi i |x|})

\displaystyle  + O_{g,d}( |x|^{-\frac{d}{2}} )

for all {|x| \geq 1} and some non-zero constants {c_+, c_-} depending only on {d}. Conclude that the estimate {R_{S^{d-1}}(p \rightarrow q)} can only hold if {p < \frac{2d}{d+1}}.

Exercise 13 Show that the estimate {R_\Sigma(p \rightarrow q)} can only hold if {p < \frac{2d}{d+1}}. (Hint: one can explicitly test (15) when {g} is a gaussian; the fact that gaussians are not, strictly speaking, compactly supported can be dealt with by a limiting argument.)

It is conjectured that the necessary conditions claimed above are sufficient. Namely, we have

Conjecture 14 (Restriction conjecture for the sphere) Let {d \geq 2}. Then we have {R_{S^{d-1}}(p \rightarrow q)} whenever {p < \frac{2d}{d+1}} and {q \leq \frac{d-1}{d+1} p'}.

Conjecture 15 (Restriction conjecture for the paraboloid) Let {d \geq 2}. Then we have {R_{\Sigma}(p \rightarrow q)} whenever {p < \frac{2d}{d+1}} and {q = \frac{d-1}{d+1} p'}.

It is also conjectured that Conjecture 14 holds if one replaces the sphere {S^{d-1}} by any bounded open non-empty subset of the paraboloid {\Sigma}.
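To get a feel for the exponent range in Conjecture 14, the two conditions can be encoded as an exact test on a pair {(p,q)}. The helper below is a hypothetical illustration (restricted to finite {q} for simplicity); it confirms in particular that the exponents covered by Theorem 1, namely {q=2} and {p < \frac{4d}{3d+1}}, always lie inside the conjectured range.

```python
from fractions import Fraction

def in_conjectured_range(d: int, p, q) -> bool:
    """Test the two conditions p < 2d/(d+1) and q <= (d-1)/(d+1) * p'
    of Conjecture 14, for exponents 1 <= p and finite q >= 1."""
    p, q = Fraction(p), Fraction(q)
    if p >= Fraction(2 * d, d + 1):
        return False
    if p == 1:
        return True  # p' is infinite, so every finite q is allowed
    p_dual = p / (p - 1)
    return q <= Fraction(d - 1, d + 1) * p_dual

for d in range(2, 7):
    p = Fraction(4 * d, 3 * d + 1) - Fraction(1, 1000)  # just inside Theorem 1
    print(d, p, in_conjectured_range(d, p, 2))
```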

The current status of these conjectures is that they are fully solved in the two-dimensional case {d=2} (as we will see later in these notes) and partially resolved in higher dimensions. For instance, in {d=3} one of the strongest results currently is due to Hong Wang, who established {R_{U}(p \rightarrow 1)} for {U} a bounded open non-empty subset of {\Sigma} when {p' > 3 + \frac{3}{13}} (conjecturally this should hold for all {p'>3}); for higher dimensions see this paper of Hickman and Rogers for the most recent results.

We close this section with an important connection between the restriction conjecture and another conjecture known as the Kakeya maximal function conjecture. To describe this connection, we first give an alternate derivation of the necessary condition {q \leq \frac{d-1}{d+1} p'} in Conjecture 14, using a basic example known as the Knapp example (as described for instance in this article of Strichartz).

Let {\kappa} be a spherical cap in {S^{d-1}} of some small radius {\delta>0}, thus {\kappa = S^{d-1} \cap B(\omega_0,\delta)} for some {\omega_0 \in S^{d-1}}. Let {g \in C^\infty_c({\bf R}^d)} be a bump function adapted to this cap, say {g(\xi) = \varphi( \frac{\xi-\omega_0}{\delta})} where {\varphi} is a fixed non-zero bump function supported on {B(0,1)}. We refer to {g} as a Knapp example at frequency {\omega_0} (and spatially centred at the origin). The cap {\kappa} (or any slightly smaller version of {\kappa}) has surface measure {\sim_d \delta^{d-1}}, thus

\displaystyle  \| g\|_{L^{q'}(S^{d-1})} \sim_{d,\varphi,q} \delta^{(d-1)/q'}

for any {1 \leq q \leq \infty}. We then apply the extension operator to {g}:

\displaystyle  (g d\sigma)^\vee(x) = \int_{\kappa} e^{2\pi i x \cdot \omega} \varphi( \frac{\omega-\omega_0}{\delta})\ d\sigma(\omega). \ \ \ \ \ (16)

The integrand is only non-vanishing if {\omega = \omega_0 + O(\delta)}; since also from the cosine rule we have

\displaystyle  |\omega-\omega_0|^2 = |\omega|^2 + |\omega_0|^2 - 2 \omega \cdot \omega_0 = 2 - 2 \omega \cdot \omega_0

we also have {\omega \cdot \omega_0 = 1 + O(\delta^2)}. Thus, if {x} lies in the tube

\displaystyle  T^{(c/\delta)^2 \times c/\delta}_{\omega_0,0} := \{ x \in {\bf R}^d: |x \cdot \omega_0| \leq (c/\delta)^2; |x - (x \cdot \omega_0) \omega_0| \leq c/\delta \}

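for a sufficiently small absolute constant {c>0}, then the phase {e^{2\pi i x \cdot (\omega-\omega_0)}} has real part {\gg 1}. If we set {\varphi} to be non-negative and not identically zero, and note that the remaining factor {e^{2\pi i x \cdot \omega_0}} has unit modulus and can be pulled out of the integral (16), we conclude that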

\displaystyle  |(g d\sigma)^\vee(x)| \gtrsim_{d,\varphi} \delta^{d-1}

for {x \in T^{(c/\delta)^2 \times c/\delta}_{\omega_0,0}}. Since the tube {T^{(c/\delta)^2 \times c/\delta}_{\omega_0,0}} has dimensions {c/\delta^2 \times c/\delta}, its volume is

\displaystyle  \sim_d c/\delta^2 \times (c/\delta)^{d-1} \sim_d \delta^{-(d+1)}

and thus

\displaystyle  \| (gd\sigma)^\vee \|_{L^{p'}({\bf R}^d)} \gtrsim_{d,\varphi,p} \delta^{d-1} \delta^{-(d+1)/p'}

for any {1 \leq p \leq \infty}. By Exercise 11, we thus see that if the estimate {R_{S^{d-1}}(p \rightarrow q)} holds, then

\displaystyle  \delta^{d-1} \delta^{-(d+1)/p'} \lesssim_{d,\varphi,p,q} \delta^{(d-1)/q'}

for all small {\delta>0}; sending {\delta} to zero, we conclude that

\displaystyle  d-1 - \frac{d+1}{p'} \geq \frac{d-1}{q'}

or equivalently that {q \leq \frac{d-1}{d+1} p'}, recovering the second necessary condition in Conjecture 14.
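The lower bound on {|(g d\sigma)^\vee|} in the Knapp example can be observed numerically in the case {d=2}, with {\omega_0 = (1,0)}. The sketch below makes several assumptions that are not in the notes: the smooth bump {\varphi} is replaced by the non-negative (but non-smooth) profile {\varphi(y) = \max(0, 1-|y|^2)}, the constant is taken to be {c = 0.05}, the integral (16) is computed by the midpoint rule in the angle, and the function name is hypothetical. On the tube of dimensions {c/\delta^2 \times c/\delta} one then expects {|(g d\sigma)^\vee(x)|} to be comparable to {\delta^{d-1} = \delta}.

```python
import cmath
import math

def knapp_extension(x1: float, x2: float, delta: float, n: int = 4000) -> complex:
    """(g d sigma)^vee(x) for d = 2 and the cap of radius delta around
    omega_0 = (1, 0), with the illustrative profile varphi(y) = max(0, 1 - |y|^2)."""
    # omega = (cos theta, sin theta) lies in the cap iff 2 sin(|theta|/2) <= delta
    theta_max = 2 * math.asin(delta / 2)
    h = 2 * theta_max / n
    total = 0j
    for k in range(n):
        theta = -theta_max + (k + 0.5) * h
        dist_sq = 2 - 2 * math.cos(theta)  # |omega - omega_0|^2 by the cosine rule
        weight = max(0.0, 1 - dist_sq / delta ** 2)
        phase = cmath.exp(2j * math.pi * (x1 * math.cos(theta) + x2 * math.sin(theta)))
        total += weight * phase
    return total * h

c = 0.05
for delta in (0.1, 0.05, 0.02):
    # sample the far corner of the tube |x1| <= c/delta^2, |x2| <= c/delta
    value = knapp_extension(c / delta ** 2, c / delta, delta)
    print(delta, abs(value) / delta)
```

The printed ratios stay bounded away from zero as {\delta} decreases, which is the behaviour used to derive the condition {q \leq \frac{d-1}{d+1} p'}.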

Exercise 16

  • (i) By considering a random superposition of Knapp examples located at different frequencies {\omega_0}, and using Khintchine’s inequality, recover the first necessary condition {p < \frac{2d}{d+1}} of Conjecture 14.
  • (ii) Suppose that {R_{S^{d-1}}(p \rightarrow q)} holds for some {1 \leq p, q < \infty}. Establish the estimate

    \displaystyle  \| \sum_{T \in {\mathcal T}} c_T 1_T \|_{L^{p'/2}} \lesssim_{p,q,d} \delta^{2(\frac{2d}{p'}-d+1)} (\delta^{d-1} \sum_{T \in \mathcal T} |c_T|^{q'/2})^{2/q'} \ \ \ \ \ (17)

    whenever {{\mathcal T}} is a collection of {\delta \times 1} tubes – that is to say, sets of the form

    \displaystyle  T = T^{\delta \times 1}_{\omega_T, x_T} = \{ x \in {\bf R}^d: |(x-x_T) \cdot \omega_T| \leq 1;

    \displaystyle  |(x-x_T) - ((x-x_T) \cdot \omega_T) \omega_T| \leq \delta \}

    whose directions {\omega_T} are {\delta}-separated (thus {|\omega_T - \omega_{T'}| \geq \delta} for any two distinct {T,T' \in {\mathcal T}}), and the {c_T} are arbitrary real numbers.

  • (iii) Establish claims (i) and (ii) with the sphere {S^{d-1}} replaced by a bounded non-empty open subset of the paraboloid {\Sigma}.

Using this exercise, we can show that restriction estimates imply assertions about the dimension of Kakeya sets (also known as Besicovitch sets).

Exercise 17 (Restriction implies Kakeya) Assume that either Conjecture 14 or Conjecture 15 holds. Define a Kakeya set to be a compact subset {E} of {{\bf R}^d} that contains a unit line segment in every direction (thus for every {\omega \in S^{d-1}}, there exists a line segment {\{ x_\omega + t \omega: t \in [0,1] \}} for some {x_\omega \in {\bf R}^d} that is contained in {E}). Show that for any {0 < \delta < 1}, the {\delta}-neighbourhood of {E} has Lebesgue measure {\gtrsim_{d,\varepsilon} \delta^\varepsilon} for any {\varepsilon>0}. (This is equivalent to the assertion that {E} has Minkowski dimension {d}.) It is also possible to show that the restriction conjecture implies that all Kakeya sets have Hausdorff dimension {d}, but this is trickier; see this paper of Bourgain. (This can be viewed as a challenge problem for those students who are familiar with the concept of Hausdorff dimension.)

The Kakeya conjecture asserts that all Kakeya sets in {{\bf R}^d} have Minkowski and Hausdorff dimension equal to {d}. As with the restriction conjecture, this is known to be true in two dimensions (as was first proven by Davies), but only partial results are known in higher dimensions. For instance, in three dimensions, Kakeya sets are known to have (upper) Minkowski dimension at least {\frac{5}{2}+\epsilon_0} for some absolute constant {\epsilon_0>0} (a result of Katz, Laba, and myself), and also more recently for Hausdorff dimension (a result of Katz and Zahl). For the latest results in higher dimensions, see these papers of Hickman-Rogers-Zhang and Zahl.

Much of the modern progress on the restriction conjecture has come from trying to reverse the implication in Exercise 17, and use known partial results towards the Kakeya conjecture (or its relatives) to obtain restriction estimates. We will not give the latest arguments in this direction here, but give an illustrative example (in the multilinear setting) at the end of this set of notes.

— 2. {L^2} theory —

One of the best understood cases of the restriction conjecture is the {q=2} case. Note that Conjecture 14 asserts that {R_{S^{d-1}}(p \rightarrow 2)} holds whenever {p \leq \frac{2(d+1)}{d+3}}, and Conjecture 15 asserts that {R_{\Sigma}(p \rightarrow 2)} holds when {p = \frac{2(d+1)}{d+3}}. Theorem 1 already gave a partial result in this direction. Now we establish the full range of the {L^2} restriction conjecture, due to Tomas and Stein:

Theorem 18 (Tomas-Stein restriction theorem) Let {d \geq 2}. Then {R_{S^{d-1}}(p \rightarrow 2)} holds for all {1 \leq p \leq \frac{2(d+1)}{d+3}}, and {R_{\Sigma}(p \rightarrow 2)} holds for {p = \frac{2(d+1)}{d+3}}.

The exponent {p = \frac{2(d+1)}{d+3}} is sometimes referred to in the literature as the Tomas-Stein exponent; though the dual exponent {p' = \frac{2(d+1)}{d-1}} is also referred to by this name.

We first establish the restriction estimate {R_{S^{d-1}}(p \rightarrow 2)} in the non-endpoint case {1 \leq p < \frac{2(d+1)}{d+3}} by an interpolation method. Fix {p,d}. By the identity (5) and Hölder’s inequality, it suffices to establish the inequality

\displaystyle  \| f * (d\sigma)^\vee \|_{L^{p'}({\bf R}^d)} \lesssim_{d,p} \|f\|_{L^p({\bf R}^d)}.

We use the standard technique of dyadic decomposition. Let {\varphi \in C^\infty_c({\bf R}^d)} be a bump function supported on {B(0,1)} that equals {1} on {B(0,1/2)}. Then one has the telescoping series

\displaystyle  1 = \varphi(x) + \sum_{k=1}^\infty \psi(x/2^k)

where {\psi(x) := \varphi(x) - \varphi(2x)} is a bump function supported on the annulus {B(0,1) \backslash B(0,1/4)}. We can then decompose the convolution kernel {(d\sigma)^\vee} as

\displaystyle  (d\sigma)^\vee(x) = \varphi(x) (d\sigma)^\vee(x) + \sum_{k=1}^\infty \psi(x/2^k) (d\sigma)^\vee(x)

so by the triangle inequality it will suffice to establish the bounds

\displaystyle  \| f * \varphi (d\sigma)^\vee \|_{L^{p'}({\bf R}^d)} \lesssim_{d,p,\varphi} \|f\|_{L^p({\bf R}^d)} \ \ \ \ \ (18)

and

\displaystyle  \| f * \psi(\cdot/2^k) (d\sigma)^\vee \|_{L^{p'}({\bf R}^d)} \lesssim_{d,p,\varphi} 2^{-ck} \|f\|_{L^p({\bf R}^d)} \ \ \ \ \ (19)

for all {k \geq 1} and some constant {c>0} depending only on {d,p}.

The function {\varphi (d\sigma)^\vee} is smooth and compactly supported, so (18) is immediate from Young’s inequality (note that {p' \geq p} when {p \leq \frac{2(d+1)}{d+3} \leq 2}). So it remains to prove (19). Firstly, we recall from (7) (or (8)) that the kernel {\psi(\cdot/2^k) (d\sigma)^\vee} is of magnitude {O_d( 2^{-k(d-1)/2})}. Thus by Young’s inequality we have

\displaystyle  \| f * \psi(\cdot/2^k) (d\sigma)^\vee \|_{L^{\infty}({\bf R}^d)} \lesssim_{d} 2^{-k(d-1)/2} \|f\|_{L^1({\bf R}^d)} \ \ \ \ \ (20)

We now complement this with an {L^2} estimate. The Fourier transform of {\psi(\cdot/2^k) (d\sigma)^\vee} can be computed as

\displaystyle  \widehat{\psi(\cdot/2^k) (d\sigma)^\vee}(\xi) = \int_{S^{d-1}} 2^{kd} \hat\psi(2^k(\xi-\eta)) d\sigma(\eta)

for any {\xi\in {\bf R}^d}, and hence by the triangle inequality and the rapid decay of the Schwartz function {\hat \psi} we have

\displaystyle  \widehat{\psi(\cdot/2^k) (d\sigma)^\vee}(\xi) \lesssim_{d,\varphi} \int_{S^{d-1}} 2^{kd} \langle 2^k(\xi-\eta) \rangle^{-100d} d\sigma(\eta).

By dyadic decomposition we then have

\displaystyle  \widehat{\psi(\cdot/2^k) (d\sigma)^\vee}(\xi) \lesssim_{d,\varphi} \sum_{j=0}^\infty 2^{kd} 2^{-100jd} \sigma( B(\xi, 2^{j-k} )).

From elementary geometry we have

\displaystyle  \sigma( B(\xi, 2^{j-k} )) \lesssim_d \min( 1, 2^{(d-1)(j-k)} )

(basically because the sphere is {d-1}-dimensional), and then on summing the geometric series we conclude that

\displaystyle  \widehat{\psi(\cdot/2^k) (d\sigma)^\vee}(\xi) \lesssim_{d,\varphi} 2^k.
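
To make the summation explicit (a routine check, bounding the minimum by its second argument), each term of the sum is at most a constant multiple of

\displaystyle  2^{kd} 2^{-100jd} 2^{(d-1)(j-k)} = 2^{k} 2^{-(99d+1)j},

and the geometric series {\sum_{j=0}^\infty 2^{-(99d+1)j}} is {O(1)}.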

From Plancherel’s theorem we conclude that

\displaystyle  \| f * \psi(\cdot/2^k) (d\sigma)^\vee \|_{L^{2}({\bf R}^d)} \lesssim_{d} 2^{k} \|f\|_{L^2({\bf R}^d)}. \ \ \ \ \ (21)

Applying either the Marcinkiewicz interpolation theorem or the Riesz-Thorin interpolation theorem to (20) and (21), we conclude (after some arithmetic) the required estimate (19) with

\displaystyle  c := \frac{d+1}{2} (\frac{1}{p} - \frac{1}{p'}) - 1,

which is indeed positive when {p < \frac{2(d+1)}{d+3}}.
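
For the record, here is a sketch of the arithmetic, under the assumption that one interpolates with weight {1-\theta} on (20) and weight {\theta} on (21), where {\theta} is our own notation. Matching exponents gives {\frac{1}{p} = (1-\theta) + \frac{\theta}{2}} and {\frac{1}{p'} = \frac{\theta}{2}}, so {\theta = \frac{2}{p'}}, and the interpolated operator norm is at most a constant multiple of

\displaystyle  (2^{-k(d-1)/2})^{1-\theta} (2^k)^\theta = 2^{-k( \frac{d-1}{2} - \frac{d+1}{p'} )}.

Since {\frac{1}{p} - \frac{1}{p'} = 1 - \frac{2}{p'}}, the exponent {\frac{d-1}{2} - \frac{d+1}{p'}} agrees with the displayed value of {c}, and it is positive precisely when {p' < \frac{2(d+1)}{d-1}}, that is to say when {p < \frac{2(d+1)}{d+3}}.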

At the endpoint {p = \frac{2(d+1)}{d+3}} the above argument does not quite work; we obtain a decent bound for each dyadic component {f * \psi(\cdot/2^k) (d\sigma)^\vee} of {f * (d\sigma)^\vee}, but then we have trouble getting a good bound for the sum. The original argument of Stein got around this problem by using complex interpolation instead of dyadic decomposition, embedding {f * (d\sigma)^\vee} in an analytic family of functions. We present here another approach, which is now popular in PDE applications; the basic inputs (namely, an {L^1} to {L^\infty} estimate similar to (20), an {L^2} estimate similar to (21), and an interpolation) are the same, but we employ the additional tool of Hardy-Littlewood-Sobolev fractional integration to recover the endpoint.

We turn to the details. Set {p = p_0 := \frac{2(d+1)}{d+3}}. We write {{\bf R}^d} as {{\bf R}^{d-1} \times {\bf R}} and parameterise the frequency variable {\xi} by {(\xi',\xi_d)} with {\xi' \in {\bf R}^{d-1}, \xi_d \in {\bf R}}, thus for instance {S^{d-1} = \{ (\xi',\xi_d): |\xi'|^2 + \xi_d^2 = 1\}}. (One can think of {x_d} as a “time” variable that we will give a privileged role in the physical domain {{\bf R}^d}.) We split the spatial variable {x = (x',x_d)} similarly. Let {\eta \in C^\infty_c} be a non-negative bump function localised to a small neighbourhood of the north pole {(0,1)} of {S^{d-1}}. By Exercise 9 it will suffice to show that

\displaystyle  \| \hat f \|_{L^2({\bf R}^d, \eta d\sigma)} \lesssim_{d,\eta} \|f\|_{L^{p_0}({\bf R}^d)}

for {f \in C^\infty_c({\bf R}^d)}. Squaring as before, it suffices to show that

\displaystyle  \| f * (\eta d\sigma)^\vee \|_{L^{p'_0}({\bf R}^d)} \lesssim_{d,\eta} \|f\|_{L^{p_0}({\bf R}^d)}.

For each {x_d \in {\bf R}}, let {f_{x_d}: {\bf R}^{d-1} \rightarrow {\bf C}} denote the function {f_{x_d}(x') := f(x',x_d)}, and let {K_{x_d}} denote the function

\displaystyle  K_{x_d}(x') := (\eta d\sigma)^\vee(x', x_d).

Then we have

\displaystyle  f * (\eta d\sigma)^\vee(x',x_d) = \int_{\bf R} f_{y_d} * K_{x_d-y_d}(x')\ dy_d

where on the right-hand side the convolution is now over {{\bf R}^{d-1}} rather than {{\bf R}^d}. By the Fubini-Tonelli theorem and Minkowski’s inequality, we thus have

\displaystyle  \| f * (\eta d\sigma)^\vee \|_{L^{p'_0}({\bf R}^d)} \leq \| \int_{\bf R} \| f_{y_d} * K_{x_d-y_d} \|_{L^{p'_0}({\bf R}^{d-1})}\ dy_d \|_{L^{p'_0}_{x_d}({\bf R})}.

From Exercise 12 we have the bounds

\displaystyle  K_{x_d}(x') \lesssim_{\eta,d} |(x',x_d)|^{-\frac{d-1}{2}} \leq |x_d|^{-\frac{d-1}{2}}

leading to the dispersive estimate

\displaystyle  \| g * K_{x_d-y_d} \|_{L^\infty({\bf R}^{d-1})} \lesssim_{\eta,d} |x_d-y_d|^{-\frac{d-1}{2}} \|g\|_{L^1({\bf R}^{d-1})}

for any {g \in L^1({\bf R}^{d-1})} (the claim is vacuous when {x_d-y_d} vanishes). On the other hand, the {d-1}-dimensional Fourier transform of {K_{x_d}} can be computed as

\displaystyle  \hat K_{x_d}(\xi') = \frac{\eta(\xi', \sqrt{1-|\xi'|^2})}{\sqrt{1-|\xi'|^2}} e^{2\pi i x_d \sqrt{1-|\xi'|^2}}

which is bounded by {O_\eta(1)}, hence by Plancherel we have the energy estimate

\displaystyle  \| g * K_{x_d-y_d} \|_{L^2({\bf R}^{d-1})} \lesssim_{\eta,d} \|g\|_{L^2({\bf R}^{d-1})}.

Interpolating, we conclude after some arithmetic that

\displaystyle  \| g * K_{x_d-y_d} \|_{L^{p'_0}({\bf R}^{d-1})} \lesssim_{\eta,d} |x_d-y_d|^{-\frac{d-1}{d+1}} \|g\|_{L^{p_0}({\bf R}^{d-1})}.

Applying the one-dimensional Hardy-Littlewood-Sobolev inequality we conclude (after some more arithmetic) that

\displaystyle  \| \int_{\bf R} \| f_{y_d} * K_{x_d-y_d} \|_{L^{p'_0}({\bf R}^{d-1})}\ dy_d \|_{L^{p'_0}_{x_d}({\bf R})} \lesssim_{\eta,d} \| \| f_{y_d} \|_{L^{p_0}({\bf R}^{d-1})} \|_{L^{p_0}_{y_d}({\bf R})} = \|f\|_{L^{p_0}({\bf R}^d)}

and the claim follows.
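
The two pieces of arithmetic can be checked as follows (a sketch; the symbols {\theta}, {\alpha}, {h}, {r}, {s} are our own notation). Interpolating the dispersive and energy estimates with weight {\theta = \frac{2}{p'_0} = \frac{d-1}{d+1}} on the energy estimate leaves weight {1-\theta = \frac{2}{d+1}} on the dispersive estimate, which gives the decay exponent

\displaystyle  \frac{d-1}{2} \times \frac{2}{d+1} = \frac{d-1}{d+1} =: \alpha.

The one-dimensional Hardy-Littlewood-Sobolev inequality {\| h * |\cdot|^{-\alpha} \|_{L^s({\bf R})} \lesssim_{\alpha,r} \|h\|_{L^r({\bf R})}} requires {0 < \alpha < 1}, {1 < r < s < \infty} and {\frac{1}{r} - \frac{1}{s} = 1 - \alpha}; with {r = p_0} and {s = p'_0} this condition is satisfied, since

\displaystyle  \frac{1}{p_0} - \frac{1}{p'_0} = \frac{d+3}{2(d+1)} - \frac{d-1}{2(d+1)} = \frac{2}{d+1} = 1 - \alpha.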

This latter argument can be adapted for the paraboloid, which in turn leads to some very useful estimates for the Schrödinger equation:

Exercise 19 (Strichartz estimates for the Schrödinger equation) Let {d \geq 2}.

  • (i) By modifying the above arguments, establish the restriction estimate {R_\Sigma(\frac{2(d+1)}{d+3} \rightarrow 2)}.
  • (ii) Let {f \in {\mathcal S}({\bf R}^{d-1})}, and let {u: {\bf R}^d \rightarrow {\bf C}} denote the function

    \displaystyle  u(x',x_d) := \int_{{\bf R}^{d-1}} \hat f(\xi') e^{2\pi i (x' \cdot \xi' + x_d |\xi'|^2)}\ d\xi'.

    (This is the solution to the Schrödinger equation {2\pi i \partial_{x_d} u = \Delta_{x'} u} with initial data {u(x',0) = f(x')}.) Establish the Strichartz estimate

    \displaystyle  \| u \|_{L^{\frac{2(d+1)}{d-1}}({\bf R}^d)} \lesssim_d \|f\|_{L^2({\bf R}^{d-1})}.

  • (iii) More generally, with the hypotheses as in (ii), establish the bound

    \displaystyle  \| \| u_{x_d} \|_{L^r({\bf R}^{d-1})} \|_{L^q_{x_d}({\bf R})} \lesssim_{d,q,r} \|f\|_{L^2({\bf R}^{d-1})}

    whenever {2 < q,r \leq \infty} are exponents obeying the scaling condition {\frac{2}{q} + \frac{d-1}{r} = \frac{d-1}{2}}. (The endpoint case {q=2} of this estimate is also available when {d \neq 3}, using a more sophisticated interpolation argument; see this paper of Keel and myself.)
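
As a quick consistency check on the scaling condition in (iii) (not part of the exercise itself), the diagonal case {q=r} reduces to {\frac{d+1}{q} = \frac{d-1}{2}}, that is to say

\displaystyle  q = r = \frac{2(d+1)}{d-1},

which is the dual Tomas-Stein exponent {p'} mentioned after Theorem 18.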

The Strichartz estimates in the above exercise were for the linear Schrödinger equation, but Strichartz estimates can also be established by the same method (namely, interpolating between energy and dispersive estimates) for other linear dispersive equations, such as the linear wave equation {\partial_{x_d}^2 u = \Delta_{x'} u}. Such Strichartz estimates are a fundamental tool in the modern analysis of nonlinear dispersive equations, as they often allow one to view such nonlinear equations as perturbations of linear ones. The topic is too vast to survey in these notes, but see for instance my monograph on this topic.

— 3. Bilinear estimates —

A restriction estimate such as

\displaystyle  \| \hat f \|_{L^q({\bf R}^d,\mu)} \lesssim_{p,q,d,\mu} \|f\|_{L^p({\bf R}^d)}

or its equivalent dual form

\displaystyle  \| (g\ d\mu)^\vee \|_{L^{p'}({\bf R}^d)} \lesssim_{p,q,d,\mu} \| g\|_{L^{q'}({\bf R}^d,\mu)}

are linear estimates, asserting the boundedness of either a restriction operator {f \mapsto \hat f|_S} (where {S} denotes the support of {\mu}) or an extension operator {g \mapsto (g\ d\mu)^\vee}. In the last ten or twenty years, it has been realised that one should also consider bilinear or multilinear versions of the extension estimate, both as stepping stones towards making progress on the linear estimate, and also as being of independent interest and application.

In this section we will show how the consideration of bilinear extension estimates can be used to resolve the restriction conjecture for the circle (i.e., the {d=2} case of Conjecture 14):

Theorem 20 (Restriction conjecture for {S^1}) One has {R_{S^1}(p \rightarrow q)} whenever {p < \frac{4}{3}} and {q \leq \frac{1}{3} p'}.

Note from Exercise 9(vi) that this theorem also implies the {d=2} case of Conjecture 15. This case of the restriction conjecture was first established by Zygmund; Zygmund’s proof is shorter than the one given here (relying on the Hausdorff-Young inequality (4)), but the arguments here have broader applicability, in particular they are also useful in higher-dimensional settings.

To prove this theorem, it suffices to verify it at the endpoint {q = \frac{1}{3} p'}, since from Hölder’s inequality the norm {\| \hat f \|_{L^q(S^1, d\sigma)}} is essentially non-decreasing in {q}, where {d\sigma} is arclength measure on {S^1}. By Exercise 9, we may replace {S^1} here by (say) the first quadrant {\phi([0,1/4])} of the circle, where {\phi} is the map {\phi(\theta) := (\cos 2\pi \theta, \sin 2\pi \theta)}; we let {d\sigma_{[0,1/4]}} be the arclength measure on that quadrant. (This reduction is technically convenient to avoid having to deal with antipodal points with parallel tangents a little later in the argument.)

By (3) and relabeling, it suffices to show that

\displaystyle  \| (g\ d\sigma_{[0,1/4]})^\vee \|_{L^q({\bf R}^2)} \lesssim_{q} \| g\|_{L^p(\phi([0,1/4]))} \ \ \ \ \ (22)

whenever {q > 4}, {p' = \frac{q}{3}}, and {g \in L^p(\phi([0,1/4]))} (we drop the requirement that {g} is smooth, in order to apply rough cutoffs shortly), and arcs such as {\phi([0,1/4])} are always understood to be endowed with arclength measure.

We now bilinearise this estimate. It is clear that the estimate (22) is equivalent to

\displaystyle  \| (g_1\ d\sigma_{[0,1/4]})^\vee (g_2\ d\sigma_{[0,1/4]})^\vee \|_{L^{q/2}({\bf R}^2)} \lesssim_{q} \| g_1\|_{L^p(\phi([0,1/4]))} \| g_2\|_{L^p(\phi([0,1/4]))} \ \ \ \ \ (23)

for any {g_1,g_2 \in L^p(\phi([0,1/4]))}, since (23) follows from (22) and Hölder’s inequality, and (22) follows from (23) by setting {g_1=g_2}.

Right now, the two functions {g_1} and {g_2} are both allowed to occupy the entirety of the arc {\phi([0,1/4])}. However, one can get better estimates if one separates the functions {g_1,g_2} to lie in transverse sub-arcs {\phi(I), \phi(J)} of {\phi([0,1/4])} (where by “transverse” we mean that there is some non-zero separation between the normal vectors of {\phi(I)} and the normal vectors of {\phi(J)}). The key estimate is:

Proposition 21 (Bilinear {L^2} estimate) Let {I,J} be subintervals of {[0,1/4]} such that {|I| \sim |J| \sim \mathrm{dist}(I,J)}. Then we have

\displaystyle  \| (g_1\ d\sigma_I)^\vee (g_2\ d\sigma_J)^\vee \|_{L^2({\bf R}^2)} \lesssim |I|^{-1/2} \| g_1\|_{L^2(\phi(I))} \| g_2\|_{L^2(\phi(J))}

for any {g_1,g_2 \in C^\infty({\bf R}^2)}, where {\sigma_I} denotes the arclength measure on {\phi(I)}.

Proof: To avoid some very minor technicalities involving convolutions of measures, let us approximate the arclength measures {d\sigma_I, d\sigma_J}. Observe that we have

\displaystyle  d\sigma_I(\xi) = \lim_{\varepsilon \rightarrow 0} \frac{1}{2\varepsilon} 1_{A_{I,\varepsilon}}(\xi)\ d\xi

in the sense of distributions, where {A_{I,\varepsilon}} is the annular region

\displaystyle  A_{I,\varepsilon} := \{ r \phi(\theta): \theta \in I, 1-\varepsilon \leq r \leq 1+\varepsilon \}.

Thus we have the pointwise limit

\displaystyle  (g_1\ d\sigma_I)^\vee = \lim_{\varepsilon \rightarrow 0} \frac{1}{2\varepsilon} (g_1 1_{A_{I,\varepsilon}})^\vee

and

\displaystyle  \|g_1\|_{L^2(\phi(I))} = \lim_{\varepsilon \rightarrow 0} \frac{1}{\sqrt{2\varepsilon}} \|g_1\|_{L^2(A_{I,\varepsilon})}

and similarly for {g_2}. Hence by Fatou’s lemma it suffices to show that

\displaystyle  \| (g_1 1_{A_{I,\varepsilon}})^\vee (g_2 1_{A_{J,\varepsilon}})^\vee \|_{L^2({\bf R}^2)} \lesssim |I|^{-1/2} \varepsilon \| g_1\|_{L^2(A_{I,\varepsilon})} \| g_2\|_{L^2(A_{J,\varepsilon})}

for sufficiently small {\varepsilon>0}. By Plancherel’s theorem, it thus suffices to show that

\displaystyle  \| (g_1 1_{A_{I,\varepsilon}}) * (g_2 1_{A_{J,\varepsilon}}) \|_{L^2({\bf R}^2)} \lesssim |I|^{-1/2} \varepsilon \| g_1\|_{L^2(A_{I,\varepsilon})} \| g_2\|_{L^2(A_{J,\varepsilon})}

for {g_1 \in L^2(A_{I,\varepsilon}), g_2 \in L^2(A_{J,\varepsilon})}, if {\varepsilon} is sufficiently small. From Young’s inequality one has

\displaystyle  \| (g_1 1_{A_{I,\varepsilon}}) * (g_2 1_{A_{J,\varepsilon}}) \|_{L^1({\bf R}^2)} \lesssim \| g_1\|_{L^1(A_{I,\varepsilon})} \| g_2\|_{L^1(A_{J,\varepsilon})}

so by interpolation it suffices to show that

\displaystyle  \| (g_1 1_{A_{I,\varepsilon}}) * (g_2 1_{A_{J,\varepsilon}}) \|_{L^\infty({\bf R}^2)} \lesssim |I|^{-1} \varepsilon^2 \| g_1\|_{L^\infty(A_{I,\varepsilon})} \| g_2\|_{L^\infty(A_{J,\varepsilon})}.

But this follows from the pointwise bound

\displaystyle  \| 1_{A_{I,\varepsilon}} * 1_{A_{J,\varepsilon}} \|_{L^\infty({\bf R}^2)} \lesssim |I|^{-1} \varepsilon^2, \ \ \ \ \ (24)

for sufficiently small {\varepsilon>0}, whose proof we leave as an exercise. \Box

Exercise 22 Establish (24).

Remark 23 Higher-dimensional bilinear {L^2} estimates, involving more complicated manifolds than arcs, play an important role in the modern theory of nonlinear dispersive equations, especially when combined with the formalism of dispersive variants of Sobolev spaces known as {X^{s,b}} spaces. See for instance this book of mine for further discussion.

From the triangle inequality we have

\displaystyle  \| (g_1\ d\sigma_I)^\vee (g_2\ d\sigma_J)^\vee \|_{L^\infty({\bf R}^2)} \leq \| g_1\|_{L^1(\phi(I))} \| g_2\|_{L^1(\phi(J))}

so by complex interpolation (which works perfectly well for bilinear operators) we have

\displaystyle  \| (g_1\ d\sigma_I)^\vee (g_2\ d\sigma_J)^\vee \|_{L^{q/2}({\bf R}^2)} \ \ \ \ \ (25)

\displaystyle  \lesssim |I|^{-2/q} \| g_1\|_{L^{(q/2)'}(\phi(I))} \| g_2\|_{L^{(q/2)'}(\phi(J))}

for any {q > 4}. The estimate (25) begins to look rather similar to (23), and we can deduce (23) from (25) as follows. Firstly, it is convenient to use Marcinkiewicz interpolation (using the fact that we have an open range of {q}) to reduce (22) to proving a restricted estimate

\displaystyle  \| (1_E\ d\sigma_{[0,1/4]})^\vee \|_{L^q({\bf R}^2)} \lesssim_{q} \sigma_{[0,1/4]}( E )^{1 - 3/q}

for any measurable subset {E} of the circle, so to prove (23) it suffices to show that

\displaystyle  \| (1_E \ d\sigma_{[0,1/4]})^\vee (1_E \ d\sigma_{[0,1/4]})^\vee \|_{L^{q/2}({\bf R}^2)} \lesssim_{q} \sigma_{[0,1/4]}( E )^{2 - 6/q}.

We can view the expression {(1_E \ d\sigma_{[0,1/4]})^\vee (1_E \ d\sigma_{[0,1/4]})^\vee(x)} as a two-dimensional integral

\displaystyle  \int_{[0,1/4] \times [0,1/4]} 1_E(\phi(\theta_1)) 1_E (\phi(\theta_2)) e^{2\pi i x \cdot (\phi(\theta_1)+\phi(\theta_2))}\ d\theta_1 d\theta_2. \ \ \ \ \ (26)

We now perform a

    It is a known theorem (first conjectured by Klainerman and Machedon) that one has the bilinear restriction theorem

    \displaystyle  \| \widehat{g_1\ d\mu_{S_1}} \widehat{g_2\ d\mu_{S_2}} \|_{L^q({\bf R}^d)} \lesssim_{q,d,S_1,S_2} \|g_1\|_{L^2(\mu_{S_1})} \|g_2\|_{L^2(\mu_{S_2})} \ \ \ \ \ (28)

    whenever {d \geq 2}, {q > \frac{d+2}{d}}, {S_1,S_2} are disjoint compact subsets of {{\bf R}^{d-1}}, and {g_1,g_2 \in C^\infty_c({\bf R}^d)}, where {\mu_S} denotes the measure given by the integral

    \displaystyle  \int_{{\bf R}^d} f(\xi)\ d\mu_S(\xi) := \int_S f(\xi', |\xi'|^2)\ d\xi'.

    (The range {q > \frac{d+2}{d}} is known to be sharp for (28) except possibly for the endpoint {q = \frac{d+2}{d}}, which remains open currently.) Assuming this result, show that Conjecture 15 holds for all {p < \frac{2(d+2)}{d+4}}. (Hint: one repeats the above arguments, but at one point one will be faced with estimating a bilinear expression involving two “close” regions {S_1,S_2}, which could be very large or very small. The hypothesis (28) does not specify how the implied constants depend on the size or location of {S_1,S_2}, but one can obtain such a dependence by exploiting the translation and Galilean symmetries of the paraboloid.)

    — 4. Multilinear estimates —

    We now turn to multilinear (or more precisely, {d}-linear) Kakeya and restriction estimates, where we happen to have nearly optimal estimates. For instance, we have the following estimate (cf. (17)), first established by Bennett, Carbery, and myself:

    Theorem 24 (Multilinear Kakeya estimate) Let {d \geq 2}, let {c \geq 0} be sufficiently small, and let {0 < \delta < 1}. Suppose that {{\mathcal T}_1,\dots,{\mathcal T}_d} are collections of {\delta \times 1} tubes such that each tube {T_i} in {{\mathcal T}_i} is oriented within {c} of the basis vector {e_i}. Then we have

    \displaystyle  \| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i} 1_{T_i} \|_{L^{1/(d-1)}({\bf R}^d)} \lesssim_{d,\varepsilon} \delta^{-\varepsilon} \prod_{i=1}^d \delta^{d-1} \# {\mathcal T}_i \ \ \ \ \ (29)

    for any {\varepsilon>0}.

    Exercise 25 Assuming Theorem 24, obtain an estimate for {\| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i} 1_{T_i} \|_{L^{p}({\bf R}^d)}} for any {0 < p < \infty} in terms of {\delta} and {\prod_{i=1}^d \# {\mathcal T}_i}, and use examples to show that this estimate is optimal in the sense that the exponents for {\delta} and {\prod_{i=1}^d \# {\mathcal T}_i} can only be improved by epsilon factors at best.

    In the two-dimensional case {d=2} the estimate is easily established with no epsilon loss. Indeed, in this case we can expand the left-hand side of (29) as

    \displaystyle  \int_{{\bf R}^2} (\sum_{T_1 \in {\mathcal T}_1} 1_{T_1}) (\sum_{T_2 \in {\mathcal T}_2} 1_{T_2})

    \displaystyle  = \sum_{T_1 \in {\mathcal T}_1} \sum_{T_2 \in {\mathcal T}_2} |T_1 \cap T_2|.

    But if {T_1} is a {1 \times \delta} rectangle oriented near {e_1}, and {T_2} is a {1 \times \delta} rectangle oriented near {e_2}, then {|T_1 \cap T_2|} is {O(\delta^2)}, and the claim follows.
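
    Written out (a sketch of the omitted final step, using only the upper bound {|T_1 \cap T_2| \lesssim \delta^2}), the double sum is bounded by

    \displaystyle  \sum_{T_1 \in {\mathcal T}_1} \sum_{T_2 \in {\mathcal T}_2} |T_1 \cap T_2| \lesssim \delta^2 \# {\mathcal T}_1 \# {\mathcal T}_2 = \prod_{i=1}^2 \delta^{d-1} \# {\mathcal T}_i,

    which is the right-hand side of (29) with {d=2} and no {\delta^{-\varepsilon}} factor.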

    The epsilon loss was removed in general dimension by Guth, using the polynomial method. We will not give that argument here, but instead give a simpler proof of Theorem 24, also due to Guth, and based primarily on the method of induction on scales. We first treat the case when {c=0}, that is when all the tubes in each family {{\mathcal T}_i} are completely parallel:

    Exercise 26

    • (i) (
        If {d \geq 2} and {0 < c, \delta \leq 1}, one has

        \displaystyle  K_c(\delta) \lesssim_d K_c(\sqrt{\delta}) \times K_c(\sqrt{\delta}).

    Proof: For each {i=1,\dots,d}, let {{\mathcal T}_i} be a collection of {\delta \times 1} tubes oriented within {c} of {e_i}. Our objective is to show that

    \displaystyle  \| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i} 1_{T_i} \|_{L^{1/(d-1)}({\bf R}^d)} \lesssim K_c(\sqrt{\delta})^2 \prod_{i=1}^d \delta^{d-1} \# {\mathcal T}_i. \ \ \ \ \ (32)

    Let {c_0>0} be a small constant depending only on {d}. We partition {{\bf R}^d} into cubes {Q} of sidelength {c_0 \sqrt{\delta}}; then the left-hand side of (32) can be decomposed as

    \displaystyle  ( \sum_Q \| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i} 1_{T_i} \|_{L^{1/(d-1)}(Q)}^{1/(d-1)})^{d-1}.

    Clearly we can restrict the inner sum to those tubes {T_i} that actually intersect {Q}. For {c_0} small enough, the intersection of {T_i} with {Q} is contained in a {\sqrt{\delta} \times \delta} tube oriented within {c} of {e_i}; such a tube can be viewed as a rescaling by {\sqrt{\delta}} of a {1 \times \sqrt{\delta}} tube, also oriented within {c} of {e_i}. From (30) and rescaling we conclude that

    \displaystyle  \| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i} 1_{T_i} \|_{L^{1/(d-1)}(Q)}

    \displaystyle  \leq (\sqrt{\delta})^{d(d-1)} K_c(\sqrt{\delta}) \prod_{i=1}^d \sqrt{\delta}^{d-1} \# \{ T_i \in {\mathcal T}_i: T_i \cap Q \neq \emptyset\}.

    Now let {T^*_i} be the {2 \times 2\sqrt{\delta}} tube with the same central axis and center of mass as {T_i}. For {c_0} small enough, if {T_i \cap Q \neq \emptyset} then {1_{T^*_i}} equals {1} on all of {Q}, and hence

    \displaystyle  \prod_{i=1}^d \# \{ T_i \in {\mathcal T}_i: T_i \cap Q \neq \emptyset\} \lesssim_{c_0,d} \sqrt{\delta}^{-d(d-1)} \| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i} 1_{T^*_i} \|_{L^{1/(d-1)}(Q)}.

    Combining all these estimates, we can bound the left-hand side of (32) by

    \displaystyle  \lesssim_{c_0,d} (\sqrt{\delta})^{d(d-1)} K_c(\sqrt{\delta}) \| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i} 1_{T^*_i} \|_{L^{1/(d-1)}({\bf R}^d)}.

    But by (30) and rescaling we have

    \displaystyle  \| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i} 1_{T^*_i} \|_{L^{1/(d-1)}({\bf R}^d)} \lesssim_d K_c(\sqrt{\delta}) \prod_{i=1}^d \sqrt{\delta}^{d-1} \# {\mathcal T}_i

    and the claim follows. \Box
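
    To see how the powers of {\delta} combine in the last step (a bookkeeping sketch only), insert the final bound into the preceding one:

    \displaystyle  (\sqrt{\delta})^{d(d-1)} K_c(\sqrt{\delta}) \times K_c(\sqrt{\delta}) \prod_{i=1}^d \sqrt{\delta}^{d-1} \# {\mathcal T}_i = K_c(\sqrt{\delta})^2 (\sqrt{\delta})^{2d(d-1)} \prod_{i=1}^d \# {\mathcal T}_i = K_c(\sqrt{\delta})^2 \prod_{i=1}^d \delta^{d-1} \# {\mathcal T}_i,

    which is the right-hand side of (32).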

    Now let {\varepsilon>0}, and let {A>1} be sufficiently large depending on {d}. If {c>0} is sufficiently small depending on {A,\varepsilon,d}, then from (31) we have the claim

    \displaystyle  K_c(\delta) \leq A^{-1} \delta^{-\varepsilon} \ \ \ \ \ (33)

    whenever {c \leq \delta \leq \sqrt{c}}. On the other hand, from Proposition 26 we see (for {A} large enough) that if (33) holds in some range {r \leq \delta \leq \sqrt{c}} with {r \leq c} then it also holds in the larger range {r^2 \leq \delta \leq \sqrt{c}}. By induction we then have (33) for all {0 < \delta \leq \sqrt{c}}. Combining this with (31), we have shown that

    \displaystyle  K_c(\delta) \lesssim_d \delta^{-\varepsilon}

    for all {0 < \delta < 1}, whenever {c} is sufficiently small depending on {\varepsilon,d}. This is almost what we need to prove Theorem 24, except that we are requiring {c} to be small depending on {\varepsilon} as well as {d}, whereas Theorem 24 only requires {c} to be sufficiently small depending on {d} and not {\varepsilon}. We can overcome this (at the cost of worsening the implied constants by an {\varepsilon}-dependent factor) by the triangle inequality and exploiting affine invariance (somewhat in the spirit of Exercise 9). Namely, suppose that {\varepsilon>0} and {c} is only assumed to be small depending on {d} but not on {\varepsilon}. By what we have previously established, we have

    \displaystyle  \| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i} 1_{T_i} \|_{L^{1/(d-1)}({\bf R}^d)} \lesssim_d \delta^{-\varepsilon} \prod_{i=1}^d \delta^{d-1} \# {\mathcal T}_i \ \ \ \ \ (34)

    whenever the tubes {T_i} lie within {c_\varepsilon} of {e_i}, where {c_\varepsilon>0} is a quantity that is sufficiently small depending on {\varepsilon,d}. Now we apply a linear transformation to both sides, and also modify {\delta} slightly, and conclude that for any {\omega_i} within {c} of {e_i}, we still have the bound (34) if the {T_i} are assumed to lie within {c_\varepsilon/2} (say) of {\omega_i} instead of {e_i}. On the other hand, by compactness (or more precisely, total boundedness), we can find {O_{c_\varepsilon,c,d}(1)} directions {\omega_i} that lie within {c} of {e_i}, such that any other direction that lies within {c} of {e_i} lies within {c_\varepsilon/2} of one of the {\omega_i}. Applying the (quasi-)triangle inequality for {L^{1/(d-1)}}, we conclude that

    \displaystyle  \| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i} 1_{T_i} \|_{L^{1/(d-1)}({\bf R}^d)} \lesssim_{d,\varepsilon} \delta^{-\varepsilon} \prod_{i=1}^d \delta^{d-1} \# {\mathcal T}_i \ \ \ \ \ (35)

    whenever the directions of the {T_i} are merely assumed to lie within {c} of {e_i}. This concludes the proof of Theorem 24.

    Exercise 27 By optimising the parameters in the above argument, refine the estimate in Theorem 24 slightly to

    \displaystyle  \| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i} 1_{T_i} \|_{L^{1/(d-1)}({\bf R}^d)} \lesssim_{d} \exp( O_d(\sqrt{\log \frac{1}{\delta}}) ) \prod_{i=1}^d \delta^{d-1} \# {\mathcal T}_i

    for any {0 < \delta \leq \frac{1}{2}}.

    We can use the multilinear Kakeya estimate to prove a multilinear restriction (or more precisely, multilinear extension) estimate:

    Theorem 28 (Multilinear restriction estimate) Let {d \geq 2}, let {c \geq 0} be sufficiently small, and let {R \geq 1}. Suppose that {S_1,\dots,S_d} are open subsets of {S^{d-1}} such that each {S_i} lies within {c} of the basis vector {e_i}. Then we have

    \displaystyle  \| \prod_{i=1}^d (g_i\ d\sigma_{S_i})^\vee \|_{L^{2/(d-1)}(B(0,R))} \lesssim_{d,\varepsilon} R^{\varepsilon} \prod_{i=1}^d \| g_i \|_{L^2(S_i, d\sigma_{S_i})}, \ \ \ \ \ (36)

    for any {\varepsilon>0} and {g_i \in L^2(S_i, d\sigma_{S_i})}, where {\sigma_{S_i}} denotes surface measure on {S_i}.

    Exercise 29 By modifying the arguments used to prove Exercise 16(ii), show that Theorem 28 implies Theorem 24.

    Exercise 30 Assuming Theorem 28, obtain for each {0 < q < \infty} and {1 < p < \infty} an estimate of the form

    \displaystyle \| \prod_{i=1}^d (g_i\ d\sigma_{S_i})^\vee \|_{L^q(B(0,R))} \lesssim_{d,\varepsilon} R^{\alpha+\varepsilon} \prod_{i=1}^d \|g_i\|_{L^p(S_i, d\sigma_{S_i})}

    whenever {\varepsilon>0}, {R>1}, and {g_i \in L^p(S_i,d\sigma_{S_i})}, for some exponent {\alpha = \alpha(p,q,d)}, and use examples to show that the exponent {\alpha} you obtain is best possible.

    Remark 31 In the {d=2} case, this result with no epsilon loss follows from Proposition 21. It is an open question whether the epsilon can be removed in higher dimensions; see this recent paper of mine for some progress in this direction.

    To prove Theorem 28, we again turn to induction on scales; the argument here is a corrected version of one from this paper of Bennett, Carbery and myself, which first appeared in this paper of Bennett. Fix {d}, and let {c>0} be sufficiently small. For technical reasons it is convenient to replace the subsets {S_i} of the sphere by annuli. More precisely, for each {R \geq c^{-1}}, let {C(R)} denote the best constant in the inequality

    \displaystyle  \| \prod_{i=1}^d G_i^\vee \|_{L^{2/(d-1)}(B(0,R))} \leq C(R) R^{-\frac{d}{2}} \prod_{i=1}^d \| G_i \|_{L^2(A_{i,R})}, \ \ \ \ \ (37)

    whenever {G_i \in L^2(A_{i,R})}, where {A_{i,R}} is the annular cap

    \displaystyle  A_{i,R} := \{ \xi \in {\bf R}^d: 1-\frac{1}{R} \leq |\xi| \leq 1+\frac{1}{R}; \quad |\xi - e_i| \leq c + \frac{1}{R} \}.

    Because we have restricted both the Fourier and spatial domains to be compactly supported it is clear that {C(R)} is finite for each {R}, thus

    \displaystyle  C(R) \lesssim_{d,R} 1. \ \ \ \ \ (38)

    We will show that

    \displaystyle  C(R) \lesssim_{d,\varepsilon} R^\varepsilon \ \ \ \ \ (39)

    for any {R \geq c^{-1}}.

    Exercise 32 Show that (39) implies Theorem 28. (Hint: starting with {g_i \in L^2(S_i,d\sigma_{S_i})}, multiply {(g_i\ d\sigma_{S_i})^\vee} by a suitable weight function that is large on {B(0,R)} and has Fourier transform supported on {B(0,1/R)}, and write this as {G_i^\vee} for a suitable {G_i}. Then obtain {L^2} estimates on {G_i}.)

    To establish (39), the key estimate is

    Proposition 33 (Induction on scales) For any {R \geq c^{-2}}, one has

    \displaystyle  C(R) \lesssim_d C(\sqrt{R}) K_{2c}(\sqrt{R})^{1/2}

    where the multilinear Kakeya constant {K_{2c}(\sqrt{R})} was defined in (30).

    Suppose we can establish this claim. Applying Theorem 24, we conclude that if {R \geq R_\varepsilon} for a sufficiently large {R_\varepsilon \geq c^{-2}} depending on {\varepsilon,d}, one has

    \displaystyle  C(R) \leq C(\sqrt{R}) \sqrt{R}^\varepsilon.

    From (38) one has

    \displaystyle  C(R) \leq C_\varepsilon R^\varepsilon \ \ \ \ \ (40)

    for all {c^{-1} \leq R \leq R_\varepsilon} and some {C_\varepsilon = O_{\varepsilon,d}(1)}, and an easy induction then shows that (40) holds for all {R \geq c^{-1}}, giving the claim.
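
    To spell out the induction (a sketch only, with {R_0} an auxiliary scale not appearing in the argument above): suppose (40) is already known at all admissible scales up to some {R_0 \geq R_\varepsilon}, and let {R_0 < R \leq R_0^2}. Since {R \geq R_\varepsilon \geq c^{-2}}, we have {c^{-1} \leq \sqrt{R} \leq R_0}, so (40) applies at scale {\sqrt{R}}, and

    \displaystyle  C(R) \leq C(\sqrt{R}) \sqrt{R}^\varepsilon \leq C_\varepsilon \sqrt{R}^\varepsilon \sqrt{R}^\varepsilon = C_\varepsilon R^\varepsilon.

    Thus (40) propagates from scales up to {R_0} to scales up to {R_0^2} with the same constant {C_\varepsilon}, and iterating this step covers all large {R}.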

    It remains to prove Proposition 33. We will rely on the wave packet decomposition. Informally, this decomposes {G_i^\vee} into a sum of “wave packets” that is approximately of the form

    \displaystyle  G_i^\vee(x) \approx \sum_{T_i} c_{i,T_i} 1_{T_i}(x) e^{2\pi i x \cdot \omega_{T_i}}, \ \ \ \ \ (41)

    where {T_i} ranges over {\sqrt{R} \times R}-tubes in {B(0,R)} oriented in various directions {\omega_{T_i}} near {e_i}, and the coefficients {c_{i,T_i}} obey an {L^2} type estimate

    \displaystyle  \sum_{T_i} |c_{i,T_i}|^2 \lessapprox R^{-\frac{d+1}{2}} \|G_i\|_{L^2(A_{i,R})}^2. \ \ \ \ \ (42)

    (This decomposition is inaccurate in a number of technical ways; for instance, the sharp cutoff {1_{T_i}} should be replaced by something smoother, but we ignore these issues for the sake of this informal discussion.) Heuristically speaking, (41) is asserting that {G_i^\vee} behaves like a superposition of various (translated) Knapp examples (16) with {\delta = R^{-1/2}}.

    Let us informally indicate why we would expect the wave packet decomposition to hold, and then why it should imply something like Proposition 33. Geometrically, the annular cap {A_{i,R}} behaves like the union of essentially disjoint {1/R \times 1/\sqrt{R}}-disks {D_i}, each centred at some point {\omega_{D_i}} on the unit sphere that is close to {e_i}, and oriented normal to the direction {\omega_{D_i}}. Thus {G_i^\vee} should behave like the sum of the components {(G_i 1_{D_i})^\vee}. By the uncertainty principle, each such component {(G_i 1_{D_i})^\vee} should behave like a constant multiple of the plane wave {e^{2\pi ix \cdot \omega_{D_i}}} on each translate of the dual region {D_i^\perp} to {D_i}, which is a {\sqrt{R} \times R} tube oriented in the direction {\omega_{D_i}}. By Plancherel’s theorem, the total {L^2} norm of {(G_i 1_{D_i})^\vee} should equal {\| G_i\|_{L^2(D_i)}}. Thus we expect to have a decomposition roughly of the form

    \displaystyle  (G_i 1_{D_i})^\vee(x) \approx \sum_{T_i \in {\mathcal T}_{D_i}} c_{i,T_i} 1_{T_i}(x) e^{2\pi i x \cdot \omega_{D_i}}

    where {{\mathcal T}_{D_i}} is a collection of parallel and boundedly overlapping {\sqrt{R} \times R} tubes {T_i} oriented in the direction {\omega_{D_i}}, and the {c_{i,T_i}} are coefficients with

    \displaystyle  R^{\frac{d+1}{2}} \sum_{T_i \in {\mathcal T}_{D_i}} |c_{i,T_i}|^2 \approx \| G_i 1_{D_i} \|_{L^2({\bf R}^d)}^2.

    Summing over {D_i} and collecting powers of {R}, we (heuristically) obtain the wave packet decomposition (41) with bound (42).
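
    As a sketch of the bookkeeping in this last step (assuming, as in the heuristic above, that the disks {D_i} essentially partition {A_{i,R}}), summing the preceding display over {D_i} gives

    \displaystyle  \sum_{T_i} |c_{i,T_i}|^2 = \sum_{D_i} \sum_{T_i \in {\mathcal T}_{D_i}} |c_{i,T_i}|^2 \approx R^{-\frac{d+1}{2}} \sum_{D_i} \| G_i 1_{D_i} \|_{L^2({\bf R}^d)}^2 \approx R^{-\frac{d+1}{2}} \| G_i \|_{L^2(A_{i,R})}^2,

    where the factor {R^{\frac{d+1}{2}} = \sqrt{R}^{d-1} \times R} is the volume of a {\sqrt{R} \times R} tube.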

    Now we informally explain why the decomposition (41) (and attendant bound (42)) should yield Proposition 33. Our task is to show that

    \displaystyle  \| \prod_{i=1}^d G_i^\vee \|_{L^{2/(d-1)}(B(0,R))} \lesssim_d C(\sqrt{R}) K_{2c}(\sqrt{R})^{1/2} R^{-d/2} \prod_{i=1}^d \| G_i \|_{L^2(A_{i,R})} \ \ \ \ \ (43)

    for {G_i \in L^2(A_{i,R})}. We may as well normalise {\| G_{i} \|_{L^2(A_{i,R})}=1}. Applying the wave packet decomposition, one expects to have an approximation of the form

    \displaystyle  G_i^\vee(x) \approx \sum_{T_i \in {\mathcal T}_i} c_{i,T_i} 1_{T_i}(x) e^{2\pi i x \cdot \omega_{T_i}}

    for some coefficients {c_{i,T_i}} with

    \displaystyle  \sum_{T_i \in {\mathcal T}_i} |c_{i,T_i}|^2 \lessapprox R^{-\frac{d+1}{2}} \ \ \ \ \ (44)

    where the {T_i} are essentially distinct {\sqrt{R} \times R} tubes oriented within {2c} of {e_i}. We cover {B(0,R)} by balls {B(x_0,\sqrt{R})} of radius {\sqrt{R}}. On each such ball, the cutoffs {1_{T_i}(x)} are morally constant, and so

    \displaystyle  \| \prod_{i=1}^d G_i^\vee \|_{L^{2/(d-1)}(B(x_0,\sqrt{R}))} \ \ \ \ \ (45)

    \displaystyle  \approx \| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i: x_0 \in T_i} c_{i,T_i} e^{2\pi i x \cdot \omega_{T_i}} \|_{L^{2/(d-1)}(B(x_0,\sqrt{R}))}.

    From the uncertainty principle, the trigonometric polynomial {\sum_{T_i \in {\mathcal T}_i: x_0 \in T_i} c_{i,T_i} e^{2\pi i x \cdot \omega_{T_i}}} behaves on {B(x_0,\sqrt{R})} like the inverse Fourier transform {G_{i,x_0}^\vee} of a function {G_{i,x_0}} supported on {A_{i,\sqrt{R}}} with

    \displaystyle  \| G_{i,x_0} \|_{L^2(A_{i,\sqrt{R}})}^2 \approx \sqrt{R}^{d} \sum_{T_i \in {\mathcal T}_i: x_0 \in T_i} |c_{i,T_i}|^2

    and hence by (37) we expect the expression (45) to be bounded by

    \displaystyle  \lessapprox C(\sqrt{R}) \sqrt{R}^{-d/2} \prod_{i=1}^d \sqrt{R}^{\frac{d}{2}} (\sum_{T_i \in {\mathcal T}_i: x_0 \in T_i} |c_{i,T_i}|^2)^{1/2}

    which is also morally

    \displaystyle  \lessapprox C(\sqrt{R}) \| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i} |c_{i,T_i}|^2 1_{T_i} \|_{L^{1/(d-1)}(B(x_0,\sqrt{R}))}^{1/2}.

    Averaging in {x_0}, we thus expect the left-hand side of (43) to be

    \displaystyle  \lessapprox C(\sqrt{R}) \| \prod_{i=1}^d \sum_{T_i \in {\mathcal T}_i} |c_{i,T_i}|^2 1_{T_i} \|_{L^{1/(d-1)}(B(0,R))}^{1/2}.

    Applying a rescaled (and weighted) version of Theorem 24, this is bounded by

    \displaystyle  \lessapprox C(\sqrt{R}) (K_{2c}(\sqrt{R}) \prod_{i=1}^d \sqrt{R}^{d-1} \sum_{T_i \in {\mathcal T}_i} |c_{i,T_i}|^2)^{1/2}

    and the claim now follows from (44).
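
    For the record, here is the power counting behind this last step, written as a sketch: by (44), each factor obeys {\sqrt{R}^{d-1} \sum_{T_i \in {\mathcal T}_i} |c_{i,T_i}|^2 \lessapprox R^{\frac{d-1}{2}} R^{-\frac{d+1}{2}} = R^{-1}}, so

    \displaystyle  (\prod_{i=1}^d \sqrt{R}^{d-1} \sum_{T_i \in {\mathcal T}_i} |c_{i,T_i}|^2)^{1/2} \lessapprox (\prod_{i=1}^d R^{-1})^{1/2} = R^{-d/2},

    which is the power of {R} appearing in (43) under the normalisation {\| G_i \|_{L^2(A_{i,R})} = 1}.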

    Now we begin the rigorous argument. We need to prove (43), and we normalise {\| G_{i} \|_{L^2(A_{i,R})}=1}. By Fubini’s theorem we have

    \displaystyle  \| \prod_{i=1}^d G_i^\vee \|_{L^{2/(d-1)}(B(0,R))}

    \displaystyle  \lesssim_d (R^{-d/2} \int_{B(0,R)} \| \prod_{i=1}^d G_i^\vee \|_{L^{2/(d-1)}(B(x_0,\sqrt{R}))}^{2/(d-1)}\ dx_0)^{(d-1)/2}.

    Let {\phi} be a fixed Schwartz function that is bounded away from zero on {B(0,1)} and has Fourier transform supported on {B(0,1/2)}; thus the function {\phi^{x_0}_{\sqrt{R}}(x) := \phi(\frac{x-x_0}{\sqrt{R}})} is bounded away from zero on {B(x_0,\sqrt{R})} and has Fourier transform supported on {B(0,\frac{1}{2\sqrt{R}})}. In particular we have

    \displaystyle  \| \prod_{i=1}^d G_i^\vee \|_{L^{2/(d-1)}(B(x_0,\sqrt{R}))} \lesssim_d \| \prod_{i=1}^d \phi^{x_0}_{\sqrt{R}} G_i^\vee \|_{L^{2/(d-1)}(B(x_0,\sqrt{R}))}.

    We can write

    \displaystyle  \phi^{x_0}_{\sqrt{R}} G_i^\vee = (G_i * \widehat{\phi^{x_0}_{\sqrt{R}}})^\vee

    and observe that {G_i * \widehat{\phi^{x_0}_{\sqrt{R}}}} is supported in {A_{i,\sqrt{R}}}. Thus
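
    To check this support claim, here is a short sketch (it uses {R \geq 4}, which we may assume since {R \geq c^{-2}} and {c} is small). If {\xi \in A_{i,R}} and {|\eta| \leq \frac{1}{2\sqrt{R}}}, then

    \displaystyle  | |\xi+\eta| - 1 | \leq \frac{1}{R} + \frac{1}{2\sqrt{R}} \leq \frac{1}{\sqrt{R}}; \quad |\xi + \eta - e_i| \leq c + \frac{1}{R} + \frac{1}{2\sqrt{R}} \leq c + \frac{1}{\sqrt{R}},

    so {\xi + \eta \in A_{i,\sqrt{R}}}; here we used {\frac{1}{R} \leq \frac{1}{2\sqrt{R}}} for {R \geq 4}.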

    \displaystyle  \| \prod_{i=1}^d \phi^{x_0}_{\sqrt{R}} G_i^\vee \|_{L^{2/(d-1)}(B(x_0,\sqrt{R}))}

    \displaystyle \leq C(\sqrt{R}) \sqrt{R}^{-d/2} \prod_{i=1}^d \| G_i * \widehat{\phi^{x_0}_{\sqrt{R}}} \|_{L^2(A_{i,\sqrt{R}})}.

    Thus it remains to establish the bound

    \displaystyle  (R^{-d/2} \int_{B(0,R)} \prod_{i=1}^d \| G_i * \widehat{\phi^{x_0}_{\sqrt{R}}} \|_{L^2(A_{i,\sqrt{R}})}^{2/(d-1)}\ dx_0)^{(d-1)/2} \ \ \ \ \ (46)

    \displaystyle  \lesssim K_{2c}(\sqrt{R})^{1/2} \sqrt{R}^{-d/2}.
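
    To see why (46) suffices, one can combine the preceding displays (a sketch of the bookkeeping, with implied constants depending only on {d}):

    \displaystyle  \| \prod_{i=1}^d G_i^\vee \|_{L^{2/(d-1)}(B(0,R))} \lesssim_d C(\sqrt{R}) \sqrt{R}^{-d/2} (R^{-d/2} \int_{B(0,R)} \prod_{i=1}^d \| G_i * \widehat{\phi^{x_0}_{\sqrt{R}}} \|_{L^2(A_{i,\sqrt{R}})}^{2/(d-1)}\ dx_0)^{(d-1)/2}.

    Inserting (46), the right-hand side is {\lesssim_d C(\sqrt{R}) K_{2c}(\sqrt{R})^{1/2} \sqrt{R}^{-d/2} \sqrt{R}^{-d/2} = C(\sqrt{R}) K_{2c}(\sqrt{R})^{1/2} R^{-d/2}}, which is (43) in the normalised case.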

    We cover {A_{i,R}} by a collection of {O(1/\sqrt{R}) \times O(1/R)} disks {D_i}, each one centered at an element {\omega_{D_i} \in S^{d-1}} that lies within {2c} of {e_i}, and is oriented with normal {\omega_{D_i}}, with the {\omega_{D_i}} separated from each other by {\gg 1/\sqrt{R}}. A partition of unity then lets us write {G_i = \sum_{D_i} G_{i,D_i}} where each {G_{i,D_i} \in L^2(D_i)} with

    \displaystyle  \sum_{D_i} \| G_{i,D_i}\|_{L^2(D_i)}^2 = 1. \ \ \ \ \ (47)

    The functions {G_{i,D_i} * \widehat{\phi^{x_0}_{\sqrt{R}}}} then have boundedly overlapping supports, in the sense that every {\xi \in {\bf R}^d} is contained in at most {O_d(1)} of these supports. Hence

    \displaystyle  \| G_i * \widehat{\phi^{x_0}_{\sqrt{R}}} \|_{L^2(A_{i,\sqrt{R}})}^2 \lesssim_d \sum_{D_i} \| G_{i,D_i} * \widehat{\phi^{x_0}_{\sqrt{R}}} \|_{L^2(A_{i,\sqrt{R}})}^2.

    By Plancherel’s theorem the right-hand side is at most

    \displaystyle  \sum_{D_i} \| G_{i,D_i}^\vee \phi^{x_0}_{\sqrt{R}} \|_{L^2({\bf R}^d)}^2.

    This is morally bounded by

    \displaystyle  \sum_{D_i} \| G_{i,D_i}^\vee \|_{L^2(B(x_0,\sqrt{R}))}^2

    so one has morally bounded the left-hand side of (46) by

    \displaystyle  (R^{-d/2} \int_{B(0,R)} \prod_{i=1}^d (\sum_{D_i} \| G_{i,D_i}^\vee \|_{L^2(B(x_0,\sqrt{R}))}^2)^{1/(d-1)}\ dx_0)^{(d-1)/2}.

    In practice, due to the rapid decay of {\phi^{x_0}_{\sqrt{R}}}, one has to add some additional terms involving some translates of the balls {B(x_0,\sqrt{R})}, but these can be handled by the same method as the one given below and we omit this technicality for brevity. We can write {G_{i,D_i} = \tilde G_{i,D_i} \psi_{i,D_i}}, where {\psi_{i,D_i}} is a Schwartz function adapted to a slight dilate of {D_i} whose inverse Fourier transform {\psi_{i,D_i}^\vee} is a bump function adapted to an {O(\sqrt{R}) \times O(R)} tube {D_i^\perp} oriented along {\omega_{D_i}} through the origin, and {\tilde G_{i,D_i} \in L^2(D_i)} with

    \displaystyle  \|\tilde G_{i,D_i} \|_{L^2(D_i)} \sim_d \|G_{i,D_i} \|_{L^2(D_i)}. \ \ \ \ \ (48)

    This gives a reproducing-type formula

    \displaystyle  G_{i,D_i}^\vee = \tilde G_{i,D_i}^\vee * \psi_{i,D_i}^\vee

    which by Cauchy-Schwarz (or Jensen’s inequality) gives the pointwise bound

    \displaystyle  |G_{i,D_i}^\vee|^2(x) \lesssim_d R^{-(d+1)/2} |\tilde G_{i,D_i}^\vee|^2 * 1_{D_i^\perp}(x).

    By enlarging {D_i^\perp} slightly, we then have

    \displaystyle  |G_{i,D_i}^\vee|^2(x) \lesssim_d R^{-(d+1)/2} |\tilde G_{i,D_i}^\vee|^2 * 1_{D_i^\perp}(x_0)

    for all {x \in B(x_0,\sqrt{R})}, hence

    \displaystyle  \| G_{i,D_i}^\vee \|_{L^2(B(x_0,\sqrt{R}))}^2 \lesssim_d R^{-1/2} |\tilde G_{i,D_i}^\vee|^2 * 1_{D_i^\perp}(x_0).
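
    The power {R^{-1/2}} here comes from integrating the previous pointwise bound over the ball; as a sketch,

    \displaystyle  \| G_{i,D_i}^\vee \|_{L^2(B(x_0,\sqrt{R}))}^2 \leq |B(x_0,\sqrt{R})| \sup_{x \in B(x_0,\sqrt{R})} |G_{i,D_i}^\vee(x)|^2 \lesssim_d R^{d/2} \times R^{-(d+1)/2} |\tilde G_{i,D_i}^\vee|^2 * 1_{D_i^\perp}(x_0),

    and {R^{d/2} \times R^{-(d+1)/2} = R^{-1/2}}.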

    We have thus bounded the left-hand side of (46) by

    \displaystyle  (R^{-d/2} \int_{B(0,R)} \prod_{i=1}^d (\sum_{D_i} R^{-1/2} |\tilde G_{i,D_i}^\vee|^2 * 1_{D_i^\perp}(x_0))^{1/(d-1)}\ dx_0)^{(d-1)/2}

    which we can rearrange as

    \displaystyle  R^{-d^2/4} \| \prod_{i=1}^d \sum_{D_i} |\tilde G_{i,D_i}^\vee|^2 * 1_{D_i^\perp}(x_0) \|_{L^{1/(d-1)}(B(0,R))}^{1/2}.
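
    The exponent {-d^2/4} can be verified as follows (a routine check, recorded as a sketch). Pulling the {d} factors of {R^{-1/2}} out of the product contributes {R^{-\frac{d}{2(d-1)}}} inside the integral, so the total power of {R} is

    \displaystyle  (-\frac{d}{2} - \frac{d}{2(d-1)}) \times \frac{d-1}{2} = -\frac{d(d-1)}{4} - \frac{d}{4} = -\frac{d^2}{4},

    while the remaining integral, raised to the power {(d-1)/2}, is by definition the square root of the {L^{1/(d-1)}(B(0,R))} quasi-norm of the product.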

    Using a rescaled version of (30) (and viewing the convolution here as a limit of Riemann sums) we can bound this by

    \displaystyle  R^{-d^2/4} ( R^{d(d-1)/2} K_{2c}(\sqrt{R}) \prod_{i=1}^d \| \sum_{D_i} |\tilde G_{i,D_i}^\vee|^2 \|_{L^1({\bf R}^d)} ) ^{1/2},

    which by Plancherel’s theorem, (48), and (47) is bounded by

    \displaystyle  R^{-d/4} K_{2c}(\sqrt{R})^{1/2}

    giving (46) as desired.
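
    As a sketch of the final bookkeeping: by Plancherel’s theorem, (48), and (47),

    \displaystyle  \| \sum_{D_i} |\tilde G_{i,D_i}^\vee|^2 \|_{L^1({\bf R}^d)} = \sum_{D_i} \| \tilde G_{i,D_i} \|_{L^2(D_i)}^2 \sim_d \sum_{D_i} \| G_{i,D_i} \|_{L^2(D_i)}^2 = 1,

    and the powers of {R} combine as

    \displaystyle  R^{-d^2/4} \times (R^{d(d-1)/2})^{1/2} = R^{-\frac{d^2}{4} + \frac{d^2-d}{4}} = R^{-d/4} = \sqrt{R}^{-d/2},

    which matches the right-hand side of (46).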