Heat flow and zeroes of polynomials II: zeroes on a circle

What's new 2018-08-02

This is a sequel to this previous blog post, in which we discussed the effect of the heat flow evolution

\displaystyle  \partial_t P(t,z) = \partial_{zz} P(t,z)

on the zeroes of a time-dependent family of polynomials {z \mapsto P(t,z)}, with a particular focus on the case when the polynomials {z \mapsto P(t,z)} had real zeroes. Here (inspired by some discussions I had during a recent conference on the Riemann hypothesis in Bristol) we record the analogous theory in which the polynomials instead have zeroes on a circle {\{ z: |z| = \sqrt{q} \}}, with the heat flow slightly adjusted to compensate for this. As we shall discuss shortly, a key example of this situation arises when {P} is the numerator of the zeta function of a curve.

More precisely, let {g} be a natural number. We will say that a polynomial

\displaystyle  P(z) = \sum_{j=0}^{2g} a_j z^j

of degree {2g} (so that {a_{2g} \neq 0}) obeys the functional equation if the {a_j} are all real and

\displaystyle  a_j = q^{g-j} a_{2g-j}

for all {j=0,\dots,2g}, thus

\displaystyle  P(\overline{z}) = \overline{P(z)}

and

\displaystyle  P(q/z) = q^g z^{-2g} P(z)

for all non-zero {z}. This means that the {2g} zeroes {\alpha_1,\dots,\alpha_{2g}} of {P(z)} (counting multiplicity) lie in {{\bf C} \backslash \{0\}} and are symmetric with respect to complex conjugation {z \mapsto \overline{z}} and inversion {z \mapsto q/z} across the circle {\{ |z| = \sqrt{q}\}}. We say that this polynomial obeys the Riemann hypothesis if all of its zeroes actually lie on the circle {\{ z = \sqrt{q}\}}. For instance, in the {g=1} case, the polynomial {z^2 - a_1 z + q} obeys the Riemann hypothesis if and only if {|a_1| \leq 2\sqrt{q}}.

Such polynomials arise in number theory as follows: if {C} is a projective curve of genus {g} over a finite field {\mathbf{F}_q}, then, as famously proven by Weil, the associated local zeta function {\zeta_{C,q}(z)} (as defined for instance in this previous blog post) is known to take the form

\displaystyle  \zeta_{C,q}(z) = \frac{P(z)}{(1-z)(1-qz)}

where {P} is a degree {2g} polynomial obeying both the functional equation and the Riemann hypothesis. In the case that {C} is an elliptic curve, then {g=1} and {P} takes the form {P(z) = z^2 - a_1 z + q}, where {a_1} is the number of {{\bf F}_q}-points of {C} minus {q+1}. The Riemann hypothesis in this case is a famous result of Hasse.

Another key example of such polynomials arise from rescaled characteristic polynomials

\displaystyle  P(z) := \det( 1 - \sqrt{q} F ) \ \ \ \ \ (1)

of {2g \times 2g} matrices {F} in the compact symplectic group {Sp(g)}. These polynomials obey both the functional equation and the Riemann hypothesis. The Sato-Tate conjecture (in higher genus) asserts, roughly speaking, that “typical” polyomials {P} arising from the number theoretic situation above are distributed like the rescaled characteristic polynomials (1), where {F} is drawn uniformly from {Sp(g)} with Haar measure.

Given a polynomial {z \mapsto P(0,z)} of degree {2g} with coefficients

\displaystyle  P(0,z) = \sum_{j=0}^{2g} a_j(0) z^j,

we can evolve it in time by the formula

\displaystyle  P(t,z) = \sum_{j=0}^{2g} \exp( t(j-g)^2 ) a_j(0) z^j,

thus {a_j(t) = \exp(t(j-g)) a_j(0)} for {t \in {\bf R}}. Informally, as one increases {t}, this evolution accentuates the effect of the extreme monomials, particularly, {z^0} and {z^{2g}} at the expense of the intermediate monomials such as {z^g}, and conversely as one decreases {t}. This family of polynomials obeys the heat-type equation

\displaystyle  \partial_t P(t,z) = (z \partial_z - g)^2 P(t,z). \ \ \ \ \ (2)

In view of the results of Marcus, Spielman, and Srivastava, it is also very likely that one can interpret this flow in terms of expected characteristic polynomials involving conjugation over the compact symplectic group {Sp(n)}, and should also be tied to some sort of “{\beta=\infty}” version of Brownian motion on this group, but we have not attempted to work this connection out in detail.

It is clear that if {z \mapsto P(0,z)} obeys the functional equation, then so does {z \mapsto P(t,z)} for any other time {t}. Now we investigate the evolution of the zeroes. Suppose at some time {t_0} that the zeroes {\alpha_1(t_0),\dots,\alpha_{2g}(t_0)} of {z \mapsto P(t_0,z)} are distinct, then

\displaystyle  P(t_0,z) = a_{2g}(0) \exp( t_0g^2 ) \prod_{j=1}^{2g} (z - \alpha_j(t_0) ).

From the inverse function theorem we see that for times {t} sufficiently close to {t_0}, the zeroes {\alpha_1(t),\dots,\alpha_{2g}(t)} of {z \mapsto P(t,z)} continue to be distinct (and vary smoothly in {t}), with

\displaystyle  P(t,z) = a_{2g}(0) \exp( t g^2 ) \prod_{j=1}^{2g} (z - \alpha_j(t) ).

Differentiating this at any {z} not equal to any of the {\alpha_j(t)}, we obtain

\displaystyle  \partial_t P(t,z) = P(t,z) ( g^2 - \sum_{j=1}^{2g} \frac{\alpha'_j(t)}{z - \alpha_j(t)})

and

\displaystyle  \partial_z P(t,z) = P(t,z) ( \sum_{j=1}^{2g} \frac{1}{z - \alpha_j(t)})

and

\displaystyle  \partial_{zz} P(t,z) = P(t,z) ( \sum_{1 \leq j,k \leq 2g: j \neq k} \frac{1}{(z - \alpha_j(t))(z - \alpha_k(t))}).

Inserting these formulae into (2) (expanding {(z \partial_z - g)^2} as {z^2 \partial_{zz} - (2g-1) z \partial_z + g^2}) and canceling some terms, we conclude that

\displaystyle  - \sum_{j=1}^{2g} \frac{\alpha'_j(t)}{z - \alpha_j(t)} = z^2 \sum_{1 \leq j,k \leq 2g: j \neq k} \frac{1}{(z - \alpha_j(t))(z - \alpha_k(t))}

\displaystyle  - (2g-1) z \sum_{j=1}^{2g} \frac{1}{z - \alpha_j(t)}

for {t} sufficiently close to {t_0}, and {z} not equal to {\alpha_1(t),\dots,\alpha_{2g}(t)}. Extracting the residue at {z = \alpha_j(t)}, we conclude that

\displaystyle  - \alpha'_j(t) = 2 \alpha_j(t)^2 \sum_{1 \leq k \leq 2g: k \neq j} \frac{1}{\alpha_j(t) - \alpha_k(t)} - (2g-1) \alpha_j(t)

which we can rearrange as

\displaystyle  \frac{\alpha'_j(t)}{\alpha_j(t)} = - \sum_{1 \leq k \leq 2g: k \neq j} \frac{\alpha_j(t)+\alpha_k(t)}{\alpha_j(t)-\alpha_k(t)}.

If we make the change of variables {\alpha_j(t) = \sqrt{q} e^{i\theta_j(t)}} (noting that one can make {\theta_j} depend smoothly on {t} for {t} sufficiently close to {t_0}), this becomes

\displaystyle  \partial_t \theta_j(t) = \sum_{1 \leq k \leq 2g: k \neq j} \cot \frac{\theta_j(t) - \theta_k(t)}{2}. \ \ \ \ \ (3)

Intuitively, this equation asserts that the phases {\theta_j} repel each other if they are real (and attract each other if their difference is imaginary). If {z \mapsto P(t_0,z)} obeys the Riemann hypothesis, then the {\theta_j} are all real at time {t_0}, then the Picard uniqueness theorem (applied to {\theta_j(t)} and its complex conjugate) then shows that the {\theta_j} are also real for {t} sufficiently close to {t_0}. If we then define the entropy functional

\displaystyle  H(\theta_1,\dots,\theta_{2g}) := \sum_{1 \leq j < k \leq 2g} \log \frac{1}{|\sin \frac{\theta_j-\theta_k}{2}| }

then the above equation becomes a gradient flow

\displaystyle  \partial_t \theta_j(t) = - 2 \frac{\partial H}{\partial \theta_j}( \theta_1(t),\dots,\theta_{2g}(t) )

which implies in particular that {H(\theta_1(t),\dots,\theta_{2g}(t))} is non-increasing in time. This shows that as one evolves time forward from {t_0}, there is a uniform lower bound on the separation between the phases {\theta_1(t),\dots,\theta_{2g}(t)}, and hence the equation can be solved indefinitely; in particular, {z \mapsto P(t,z)} obeys the Riemann hypothesis for all {t > t_0} if it does so at time {t_0}. Our argument here assumed that the zeroes of {z \mapsto P(t_0,z)} were simple, but this assumption can be removed by the usual limiting argument.

For any polynomial {z \mapsto P(0,z)} obeying the functional equation, the rescaled polynomials {z \mapsto e^{-g^2 t} P(t,z)} converge locally uniformly to {a_{2g}(0) (z^{2g} + q^g)} as {t \rightarrow +\infty}. By Rouche’s theorem, we conclude that the zeroes of {z \mapsto P(t,z)} converge to the equally spaced points {\{ e^{2\pi i(j+1/2)/2g}: j=1,\dots,2g\}} on the circle {\{ |z| = \sqrt{q}\}}. Together with the symmetry properties of the zeroes, this implies in particular that {z \mapsto P(t,z)} obeys the Riemann hypothesis for all sufficiently large positive {t}. In the opposite direction, when {t \rightarrow -\infty}, the polynomials {z \mapsto P(t,z)} converge locally uniformly to {a_g(0) z^g}, so if {a_g(0) \neq 0}, {g} of the zeroes converge to the origin and the other {g} converge to infinity. In particular, {z \mapsto P(t,z)} fails the Riemann hypothesis for sufficiently large negative {t}. Thus (if {a_g(0) \neq 0}), there must exist a real number {\Lambda}, which we call the de Bruijn-Newman constant of the original polynomial {z \mapsto P(0,z)}, such that {z \mapsto P(t,z)} obeys the Riemann hypothesis for {t \geq \Lambda} and fails the Riemann hypothesis for {t < \Lambda}. The situation is a bit more complicated if {a_g(0)} vanishes; if {k} is the first natural number such that {a_{g+k}(0)} (or equivalently, {a_{g-j}(0)}) does not vanish, then by the above arguments one finds in the limit {t \rightarrow -\infty} that {g-k} of the zeroes go to the origin, {g-k} go to infinity, and the remaining {2k} zeroes converge to the equally spaced points {\{ e^{2\pi i(j+1/2)/2k}: j=1,\dots,2k\}}. In this case the de Bruijn-Newman constant remains finite except in the degenerate case {k=g}, in which case {\Lambda = -\infty}.

For instance, consider the case when {g=1} and {P(0,z) = z^2 - a_1 z + q} for some real {a_1} with {|a_1| \leq 2\sqrt{q}}. Then the quadratic polynomial

\displaystyle  P(t,z) = e^t z^2 - a_1 z + e^t q

has zeroes

\displaystyle  \frac{a_1 \pm \sqrt{a_1^2 - 4 e^{2t} q}}{2e^t}

and one easily checks that these zeroes lie on the circle {\{ |z|=\sqrt{q}\}} when {t \geq \log \frac{|a_1|}{2\sqrt{q}}}, and are on the real axis otherwise. Thus in this case we have {\Lambda = \log \frac{|a_1|}{2\sqrt{q}}} (with {\Lambda=-\infty} if {a_1=0}). Note how as {t} increases to {+\infty}, the zeroes repel each other and eventually converge to {\pm i \sqrt{q}}, while as {t} decreases to {-\infty}, the zeroes collide and then separate on the real axis, with one zero going to the origin and the other to infinity.

The arguments in my paper with Brad Rodgers (discussed in this previous post) indicate that for a “typical” polynomial {P} of degree {g} that obeys the Riemann hypothesis, the expected time to relaxation to equilibrium (in which the zeroes are equally spaced) should be comparable to {1/g}, basically because the average spacing is {1/g} and hence by (3) the typical velocity of the zeroes should be comparable to {g}, and the diameter of the unit circle is comparable to {1}, thus requiring time comparable to {1/g} to reach equilibrium. Taking contrapositives, this suggests that the de Bruijn-Newman constant {\Lambda} should typically take on values comparable to {-1/g} (since typically one would not expect the initial configuration of zeroes to be close to evenly spaced). I have not attempted to formalise or prove this claim, but presumably one could do some numerics (perhaps using some of the examples of {P} given previously) to explore this further.