Erdös problem #385, the parity problem, and Siegel zeroes

What's new 2024-08-19

The Erdös problem site was created last year, and announced earlier this year on this blog. Every so often, I have taken a look at a random problem from the site for fun. A few times, I was able to make progress on one of the problems, leading to a couple of papers; but the more common outcome is that I play around with the problem for a while, see why the problem is difficult, and then eventually give up and do something else. But, as is common in this field, I don’t make public the observations that I made, and the next person who looks at the same problem would likely have to go through the same process of trial and error to work out what the main obstructions are.

So, as an experiment, I thought I would record here my preliminary observations on one such problem – Erdös problem #385 – to discuss why it looks difficult to solve with our current understanding of the primes. Here is the problem:

Problem 1 (Erdös Problem #385) Let

\displaystyle F(n) =\max_{\stackrel{m<n}{m\hbox{\ composite}}} m+p(m),

where {p(m)} is the least prime divisor of {m}. Is it true that {F(n)>n} for all sufficiently large {n}? Does {F(n)-n \rightarrow \infty} as {n \rightarrow \infty}?
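For concreteness, here is a short brute-force computation of {F(n)} in Python (a sketch of my own, not part of the original problem; the helper names are arbitrary). It lists the small {n} with {F(n) \leq n}, which is exactly the set whose finiteness the first part of the problem concerns.

```python
def least_prime_factor(m):
    """Return p(m), the least prime divisor of m >= 2, by trial division."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m  # m itself is prime

def F(n):
    """F(n) = max of m + p(m) over composite m < n (0 if there is no such m)."""
    best = 0
    for m in range(4, n):
        p = least_prime_factor(m)
        if p < m:  # m is composite precisely when p(m) < m
            best = max(best, m + p)
    return best

# Small n with F(n) <= n; the first part of the problem asks whether
# this list is finite.
print([n for n in range(5, 1000) if F(n) <= n])
```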

This problem is mentioned on page 73 of this 1979 paper of Erdös (where he attributes the problem to an unpublished work of Eggleton, Erdös, and Selfridge that, to my knowledge, has never actually appeared), as well as briefly on page 92 of this 1980 paper of Erdös and Graham.

At first glance, this looks like a somewhat arbitrary problem (as many of Erdös’s problems initially do), as the function {F} is not obviously related to any other well-known function or problem. However, it turns out that this problem is closely related to the parity barrier in sieve theory (as discussed in this previous post), with the possibility of Siegel zeroes presenting a particular obstruction. I suspect that Erdös was well aware of this connection; certainly he mentions the relation with questions on gaps between primes (or almost primes), which is in turn connected to the parity problem and Siegel zeroes (as is discussed recently in my paper with Banks and Ford, and in more depth in these papers of Ford and of Granville).

Let us now explore the problem further. Let us call a natural number {n} bad if {F(n) \leq n}, so the first part of the problem is asking whether there are only finitely many bad numbers. We unpack the definitions: {n} is bad if and only if {m+p(m) \leq n} for every composite {m < n}, so, placing {m} in intervals of the form {[n-h,n]}, we see that the badness of {n} is equivalent to the assertion that

\displaystyle  p(m) \leq h \hbox{ for all composite } m \in [n-h, n]

for each {1 \leq h \leq n}. To put it another way, the badness of {n} asserts that, for each {1 \leq h \leq n}, the residue classes {0 \hbox{ mod } p} for {p \leq h} cover all the natural numbers in the interval {[n-h,n]} except for the primes.
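As a sanity check on this reformulation, here is a toy verification of my own (not from the problem itself) that the condition {F(n) \leq n} agrees with the covering condition, where I take the composite {m} in the range {n-h \leq m < n} in accordance with the definition of {F}.

```python
def least_prime_factor(m):
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m

def is_bad_direct(n):
    """F(n) <= n, i.e. m + p(m) <= n for every composite m < n."""
    return all(m + least_prime_factor(m) <= n
               for m in range(4, n) if least_prime_factor(m) < m)

def is_bad_covering(n):
    """For every 1 <= h <= n, each composite m with n-h <= m < n has p(m) <= h,
    i.e. m is covered by some class 0 mod p with p <= h."""
    for h in range(1, n + 1):
        for m in range(max(4, n - h), n):
            p = least_prime_factor(m)
            if p < m and p > h:
                return False
    return True

# The two formulations agree on a small range.
assert all(is_bad_direct(n) == is_bad_covering(n) for n in range(5, 200))
```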

It is now natural to try to understand this covering property for a specific choice of the length {h} as a function of {n}. If {h} is large in the sense that {h > \sqrt{n}}, then the claimed covering property is automatic, since every composite number less than or equal to {n} has a prime factor less than or equal to {\sqrt{n}}. On the other hand, for {h} very small, in particular {h = o(\log n)}, it is also possible to find {n} with this property. Indeed, if one takes {n} to lie in the residue class {0 \hbox{ mod } \prod_{p \leq h} p}, then we see that the residue classes cover all of {[n-h,n]} except for {n-1}, and by Dirichlet's theorem on primes in arithmetic progressions one can additionally arrange for {n-1} to be prime. Thus the covering property is only really in doubt in an intermediate regime {\log n \ll h \ll \sqrt{n}}, and to show that a large {n} is not bad it suffices to violate the covering property for a single {h} in this range, that is, to find a composite {m} with {n-h \leq m < n} all of whose prime factors exceed {h}. The natural candidates here are the almost primes, and specifically the semiprimes that are products of two primes between {n^{1/u}} and {n^{1-1/u}}. If one can show for some {2 < u < 3} that the largest gap between semiprimes in, say, {[x,2x]} with prime factors in {[x^{1/u}, x^{1-1/u}]} is {o( x^{1/u} )}, then this would affirmatively answer the first part of this problem (and also the second). This is certainly very plausible – it would follow from a semiprime version of the Cramér conjecture – but remains well out of reach for now. Even assuming the Riemann hypothesis, the best known upper bound on prime gaps in {[x,2x]} is {O( \sqrt{x} \log x )}, and the best known upper bound on semiprime gaps is not significantly better than this – in particular, one cannot reach {x^{1/u}} for any {2 < u < 3}. (There is a remote possibility that an extremely delicate analysis near {u=2}, together with additional strong conjectures on the zeta function, such as a sufficiently quantitative version of the GUE hypothesis, may barely be able to resolve this problem, but I am skeptical of this, absent some further major breakthrough in analytic number theory.)
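Here is a quick numerical illustration (again just a sketch of mine, with arbitrary parameters) of the small-{h} construction mentioned above: if {n} is a multiple of the primorial {\prod_{p \leq h} p}, then the classes {0 \hbox{ mod } p} for {p \leq h} cover every element of {[n-h,n]} except {n-1}.

```python
from math import prod

def primes_up_to(h):
    """Primes p <= h by trial division (adequate for small h)."""
    return [p for p in range(2, h + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

h = 10
P = primes_up_to(h)
n = 7 * prod(P)                    # any multiple of the primorial works

# Elements of [n-h, n] missed by every class 0 mod p with p <= h:
survivors = [m for m in range(n - h, n + 1) if all(m % p for p in P)]
print(survivors, survivors == [n - 1])   # only n-1 escapes the sieve
```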

Given that multiplicative number theory does not seem powerful enough (even on RH) to resolve these problems, the other main approach would be to use sieve theory. In this theory, we do not really know how to exploit the specific location of the interval {[n-h,n]} or the specific congruence classes used, so one can study the more general problem of trying to cover an interval {I} of length {h} by one residue class mod {p} for each {p \leq h}, leaving only a small number of survivors which could potentially be classified as “primes”. The discussion of the small {h} case already reveals a problem with this level of generality: one can sieve out the interval {[-h, -1]} by the residue classes {0 \hbox{ mod } p} for {p \leq h}, and leave only one survivor, {-1}. Indeed, thanks to known bounds on Jacobsthal’s function, one can be more efficient than this; for instance, using equation (1.2) from this paper of Ford, Green, Konyagin, Maynard, and myself, it is possible to completely sieve out any interval of sufficiently large length {h} using only those primes {p} up to {O(\frac{h \log\log h}{\log h \log\log\log h})}. On the other hand, from the work of Iwaniec, we know that sieving up to {o(h^{1/2})} is insufficient to completely sieve out such an interval; related to this, if one only sieves up to {h^{1/u}} for some {2 < u < 3}, the linear sieve (see e.g., Theorem 2 of this previous blog post) shows that one must have at least {(f(u)+o(1)) \frac{h}{\log h}} survivors, where {f(u)} can be given explicitly in the regime {2 < u < 3} by the formula

\displaystyle  f(u) := \frac{2e^\gamma}{u} \log(u-1).

These lower bounds are not believed to be best possible. For instance, the Maier–Pomerance conjecture on Jacobsthal’s function would indicate that one needs to sieve out primes up to {\gg h/\log^2 h} in order to completely sieve out an interval of length {h}, and it is also believed that sieving up to {h^{1-\varepsilon}} should leave {\gg_\varepsilon h/\log h} survivors, although even these strong conjectures are not enough to positively resolve this problem, since we are permitted to sieve all the way up to {h} (and we are allowed to leave every prime number as a survivor, which in view of the Brun–Titchmarsh theorem could permit as many as {O(h/\log h)} survivors).
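To get a rough numerical feel for these sieving statements, here is a toy computation of my own (the parameters are arbitrary and nothing here is sharp): it checks directly that the classes {0 \hbox{ mod } p} for {p \leq h} sieve out all of {[-h,-1]} except {-1}, while sieving only up to {h^{1/u}} leaves a number of survivors of the same order as {h/\log h}.

```python
import math

def primes_up_to(z):
    """Simple sieve of Eratosthenes."""
    z = int(z)
    flags = [True] * (z + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(z ** 0.5) + 1):
        if flags[p]:
            flags[p * p::p] = [False] * len(flags[p * p::p])
    return [p for p, f in enumerate(flags) if f]

def survivors(h, z):
    """Elements of [-h, -1] not lying in any class 0 mod p with p <= z."""
    alive = [True] * h                    # index j stands for m = -(j + 1)
    for p in primes_up_to(z):
        for j in range(p - 1, h, p):      # p divides -(j+1) iff j = p-1 mod p
            alive[j] = False
    return [-(j + 1) for j, a in enumerate(alive) if a]

h = 10 ** 5
print(survivors(h, h))                    # [-1]: sieving up to h clears the rest

u = 2.5                                   # sieving level h^{1/u}, with 2 < u < 3
surv = survivors(h, h ** (1 / u))
print(len(surv), round(h / math.log(h)))  # comparable in order to h / log h
```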

Unfortunately, as discussed in this previous blog post, the parity problem blocks such improvements from being established by most standard analytic number theory methods, in particular sieve theory. A particularly dangerous enemy arises from Siegel zeroes. This is discussed in detail in the papers of Ford and of Granville mentioned previously, but an informal discussion is as follows. If there is a Siegel zero associated to a quadratic character of some conductor {q}, this roughly speaking means that almost all primes {p} (in certain ranges) will be quadratic non-residues mod {q}. In particular, if one restricts attention to numbers {n} in a residue class {a \hbox{ mod } q} that is a quadratic residue, we then expect most numbers in this class to have an even number of prime factors, rather than an odd number.

This alters the effect of sieving in such residue classes. Consider for instance the classical sieve of Eratosthenes. If one sieves out {0 \hbox{ mod } p} for each prime {p \leq \sqrt{h}}, the sieve of Eratosthenes tells us that the surviving elements of {[1,h]} are simply the primes between {\sqrt{h}} and {h} (together with {1}), of which there are about {h/\log h} many. However, if one restricts attention to {[1,h] \cap a \hbox{ mod } q} for a quadratic residue class {a \hbox{ mod } q} (and taking {h} to be somewhat large compared to {q}), then by the preceding discussion, this eliminates most primes, and so now sieving out {0 \hbox{ mod } p} should leave almost no survivors. Shifting this example by {a} and then dividing by {q}, one can end up with an example of an interval {I} of length {h} that can be sieved by residue classes {b_p \hbox{ mod } p} for each {p \leq \sqrt{h}} in such a manner as to leave almost no survivors (in particular, {o(h/\log h)} many). In the presence of a Siegel zero, it seems quite difficult to prevent this scenario from “infecting” the above problem, creating a bad scenario in which for all {\log n \ll h \ll \sqrt{n}}, the residue classes {0 \hbox{ mod } p} for {p \leq \sqrt{h}} already eliminate almost all elements of {[n-h,n]}, leaving it mathematically possible for the remaining survivors either to be prime, or to be eliminated by the remaining residue classes {0 \hbox{ mod } p} for {\sqrt{h} < p \leq h}.
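The unrestricted Eratosthenes count in the first sentence is easy to confirm numerically (another toy sketch of mine): sieving {[1,h]} by {0 \hbox{ mod } p} for primes {p \leq \sqrt{h}} leaves exactly {1} together with the primes in {(\sqrt{h}, h]}, roughly {h/\log h} numbers. (The Siegel zero scenario itself of course cannot be simulated this way, since no such zero is known to exist.)

```python
import math

h = 10 ** 5
z = math.isqrt(h)
# m in [1, h] survives sieving by 0 mod p for every prime p <= sqrt(h)
# exactly when m has no divisor in [2, sqrt(h)], i.e. m = 1 or m is a
# prime exceeding sqrt(h).
alive = [m for m in range(1, h + 1) if all(m % d for d in range(2, z + 1))]
print(len(alive), round(h / math.log(h)))
```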

Because of this Siegel zero obstruction, I suspect that it will not be possible to resolve this Erdös problem without a major breakthrough on the parity problem that (at a bare minimum) is enough to exclude the possibility of Siegel zeroes existing. (But it is not clear at all that Siegel zeroes are the only “enemy” here, so absent a major advance in “inverse sieve theory”, one cannot simply assume GRH to make this problem go away.)

— 0.1. Addendum: heuristics for Siegel zero scenarios —

This post also provides a good opportunity to refine some heuristics I had previously proposed regarding Siegel zeroes and their impact on various problems in analytic number theory. In this previous blog post, I wrote

“The parity problem can also be sometimes be overcome when there is an exceptional Siegel zero … [this] suggests that to break the parity barrier, we may assume without loss of generality that there are no Siegel zeroes.”

On the other hand, as was pointed out in a more recent article of Granville, Siegel zeroes can sometimes serve (as in the current situation) to enforce the parity barrier rather than to overcome it; Granville responds to my previous statement with the comment “this claim needs to be treated with caution, since its truth depends on the context”.

I actually agree with Granville here, and propose the following synthesis of the two situations. In the absence of a Siegel zero, standard heuristic models in analytic number theory (such as the ones discussed in this post) typically suggest that a given quantity {X} of interest in number theory (e.g., the number of primes in a certain set) obeys an asymptotic law of the form

\displaystyle  X = \mathrm{Main\ term} + \mathrm{Error\ term}

where {\mathrm{Main\ term}} is generally fairly well understood, while {\mathrm{Error\ term}} is expected to fluctuate “randomly” (or more precisely, pseudorandomly) and thus be smaller than the main term. However, a major difficulty in analytic number theory is that we often cannot rule out a “conspiracy” in which the error term becomes as large as, or even larger than, the main term: the fluctuations present in that term are usually too poorly understood to be kept under good control. The parity barrier manifests itself by providing examples of analogous situations in which the error term is indeed as large as the main term (with an unfavorable sign).

However, the presence of a Siegel zero tends to “magnetize” the error term by pulling most of the fluctuations in a particular direction. In many situations, what this means is that one can obtain a refined asymptotic of the form

\displaystyle  X = \mathrm{Main\ term} + \mathrm{Siegel\ correction} + \mathrm{Better\ error\ term}

where {\mathrm{Better\ error\ term}} now fluctuates less than the original {\mathrm{Error\ term}}, and in particular can (in some cases) be shown to be lower order than {\mathrm{Main\ term}}, while {\mathrm{Siegel\ correction}} is a new term that is often explicitly describable in terms of the exceptional character {\chi} associated to the Siegel zero, as well as the location {\beta} of the Siegel zero {L(\beta,\chi)=0}. A typical example is the problem of estimating the sum {\sum_{n \leq x: n = a\ (q)} \Lambda(n)} of primes in an arithmetic progression. The Siegel–Walfisz theorem gives a bound of the form

\displaystyle  \sum_{n \leq x: n = a\ (q)} \Lambda(n) = \frac{x}{\varphi(q)} + O_A( x \log^{-A} x )

for any {A>0} (with an ineffective constant); in the regime {q = O(\log^{O(1)} x)} one can improve the error term to {O( x \exp(-c \log^{1/2} x) )}, but for larger {q} one does not know how to do better than the Brun–Titchmarsh bound of {O(x / \varphi(q))}. However, when there is a Siegel zero {\beta} (that is, a real zero {L(\beta,\chi)=0} of an exceptional real character {\chi}, with {\beta} close to {1}) in an appropriate range, we can obtain the refined bound

\displaystyle  \sum_{n \leq x: n = a\ (q)} \Lambda(n) = \frac{x}{\varphi(q)} - \chi(a) 1_{q_0|q} 1_{(a,q)=1} \frac{x^{\beta}}{\beta \varphi(q)} + O( x \exp(-c \log^{1/2} x))

for some {c>0}, where {q_0} is the conductor of {\chi}; see e.g., Theorem 5.27 of Iwaniec–Kowalski. Thus we see that the error term is much improved (and in fact can even be made effective), at the cost of introducing a Siegel correction term which (for {\beta} close to {1} and {x} not too large) is of comparable size to the main term, and can be aligned either with or against the main term depending on the sign of {\chi(a)}.
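As a small reality check on the main term (a toy computation of mine; at this scale there is of course no visible Siegel correction, and the modulus and range are arbitrary), one can compare the left-hand side against {x/\varphi(q)} directly:

```python
import math

def is_prime(m):
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def psi(x, q, a):
    """Sum of Lambda(n) over n <= x with n congruent to a mod q."""
    total = 0.0
    for p in range(2, x + 1):
        if is_prime(p):
            pk = p
            while pk <= x:                 # Lambda(p^k) = log p
                if pk % q == a % q:
                    total += math.log(p)
                pk *= p
    return total

def euler_phi(q):
    return sum(1 for a in range(1, q + 1) if math.gcd(a, q) == 1)

x, q = 10 ** 5, 7
for a in range(1, q):
    print(a, round(psi(x, q, a)), round(x / euler_phi(q)))
```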

The implications of such refined asymptotics then depend rather crucially on how the Siegel correction term is aligned with the main term, and also whether it is of comparable order or lower order. In many situations (particularly those concerning “average case” problems, in which one wants to understand the behavior for typical choices of parameters), the Siegel correction term ends up being lower order, and so one ends up with the situation described in my initial blog post, where we are able to get the predicted asymptotic {X \approx \mathrm{Main\ term}} in the Siegel zero case. However, as pointed out by Granville, there are other situations (particularly those involving “worst case” problems, in which some key parameter can be chosen adversarially) in which the Siegel correction term can align to completely cancel (or to highly reinforce) the main term. In such cases, the Siegel zero becomes a very concrete manifestation of the parity barrier, rather than a means to avoid it.