A Quest For Simple Hard Statements

Gödel’s Lost Letter and P=NP 2018-08-02

Made-to-order statements that are not so simple

Harvey Friedman is a long-standing friend who is a world expert on proofs and the power of various logics. This part of mathematics is tightly connected to complexity theory. This first mention of his work on this blog was a problem about the rationals that says nothing of “logic” or “proofs.”

Today Ken and I would like to introduce some of Harvey’s work in progress.

Harvey’s work is a kind of quest. Recall a quest is a a long or arduous search for something. Ever since his 1967 PhD thesis, Harvey has been trying to find simple statements that are immensely hard to prove. This might seem to be a strange quest: Why look for hard statements; why not look for easy statements? Indeed. Since Kurt Gödel’s seminal work, logicians have been interested in finding natural statements that hide great complexity underneath. The measure of complexity is that the statements are true but cannot be proved in certain prominent logical theories.

Wikipedia has this to say about the famous theorem of Jeff Paris and Leo Harrington (with input from Laurence Kirby) that Peano Arithmetic (PA) cannot prove a certain strengthening of the finite Ramsey theorem:

This was the first “natural” example of a true statement about the integers that could be stated in the language of arithmetic, but not proved in Peano arithmetic; it was already known that such statements existed by Gödel’s first incompleteness theorem.

See also this for history and related examples including earlier work by Reuben Goodstein that was developed further by Kirby and Paris.

What if we can find statements that (i) use arguably simpler notions than arithmetic, yet (ii) are unprovable in systems such as ZFC set theory that are vastly stronger than PA? They might require systems beyond ZFC, such as adding plausible “large cardinal” axioms, to prove. The foundational goal of Harvey’s quest is to demonstrate that realms of higher set theory long regarded as wildly infinitistic and intangible connect algorithmically to concrete and intimately familiar combinatorial ideas. He does also mention motives in complexity, such as in this recent popular talk, and we will develop this aspect here.

Unprovable Yet Simple

Introductory theory courses show how to define sets that are computably enumerable but not decidable. The complement is then definable by a formula ${\phi(n)}$ whose only unbounded quantifiers over numbers are universal. Such formulas and the resulting sentences ${\phi(n_0)}$ for any fixed value ${n_0}$ are said to be ${\prod_1^0}$ . For instance, the set ${E}$ of numbers ${n}$ coding Turing machines that accept no inputs is defined by saying that for all ${x}$ and ${c}$ , ${c}$ is not a valid accepting computation of the machine on input ${x}$ . For every effective theory ${T}$ , be it PA or ZFC or stronger, the set ${D}$ of cases ${\phi(n)}$ that ${T}$ can prove is computably enumerable. Thus ${D}$ cannot equal ${E}$ . Presuming ${T}$ does not prove any false cases, this makes ${D}$ a proper subset of ${E}$ . Every ${n_0 \in E \setminus D}$ (and there are infinitely many) yields a ${\prod_1^0}$ sentence ${\phi(n_0)}$ that is true but not provable in ${T}$ .

Indeed, one can construct particular values ${n_0}$ via Gödel’s diagonalization mechanism. The resulting sentence ${\phi(n_0)}$ is simple in terms of quantifiers, but the “ ${n_0}$ ” is definitely not simple. It bakes all the complexity of diagonalizing over proofs into the statement. Gödel further showed one can use a ${\prod^0_1}$ sentence expressing the consistency of ${T}$ , but this too references ${T}$ and proofs explicitly.

The Paris-Harrington statements are natural, but they are ${\prod^0_2}$ not ${\prod^0_1}$ —that is, they have the form “(for all ${x}$ )(there exists ${y}$ ) ${[\cdots]}$ ” where ${[\cdots]}$ is first-order with bounded quantifiers only. They embody the idea that functions ${f}$ giving ${f(x) = y}$ for these quantifiers must grow infinitely often faster than any function PA can prove to be computable. The statement ${\mathsf{P \neq NP}}$ is likewise ${\prod^0_2}$ . Some time back we wrote a post on independence and complexity that also covered work by Harvey with Florian Pelupessy related to Paris-Harrington. In 1991, Shai Ben-David and Shai Halevi proved a two-way connection to circuit complexity that applies when an effective first-order theory ${T}$ is augmented to a theory ${T_1}$ by adding all true ${\prod^0_1}$ sentences over its language as axioms. They cited as folklore a theorem that ${T}$ and ${T_1}$ generally have the same sets of provably computable functions and said in consequence:

[I]f ${\phi}$ is a natural mathematical statement for which “ ${\phi}$ is independent of PA” is provable in any of the known methods for establishing such results, then ${\phi}$ is independent of PA ${\;_1}$ as well. [Likewise for ZFC and any similar ${T}$ and its corresponding ${T_1}$ .]

Well this is not in Harvey’s quest either. Here are several objections: Statements of Paris-Harrington type cannot be brought much below ${\prod_2^0}$ . The theory ${T_1}$ is not effective. Adding true numerical ${\prod^0_1}$ sentences may be benign for provable growth rates but not for ${\mathsf{P \neq NP}}$ and other statements we want to prove. Although it is often kosher to assume particular ${\prod^0_1}$ statements like (forms of) the Riemann Hypothesis, allowing arbitrary ones begs the idea of proof. The situation is that either ${T}$ is stronger than PA and the functions grow so insanely as to strain naturalness, or ${T}$ is weaker but then the independence is weak too.

What we all want is to show independence from strong theories for simple and natural statements that while not literally in ${\prod^0_1}$ form demonstrably have no more power. Then we may hope to apply his results to understand the hardness of statements like ${\mathsf{P \neq NP}}$ at the ground level. According to Ben-David and Halevi, this will require ideas and constructions beyond “the known methods.” Harvey’s ideas begin by blending questions about the integers into questions about the rationals that leverage the latter’s dense order properties. Let’s turn to look at them now.

Theories on the Rationals

Consider the rationals ${\mathbb{Q}}$ . How hard are statements about the rationals? It depends.

If we allow first order sentences that involve addition and multiplication over the rationals, then Julia Robinson proved, long ago, that the integers can be defined in this first order theory. This immediately implies the general unprovability results discussed above. We have covered this topic before here.

What is surprising, perhaps, is that there are complex statements that use only the order properties of the rationals. However, if we restrict ourselves to the first order theory of the rationals under the usual ordering ${x < y}$ , then it is long known that:

Theorem 1 The first order theory ${(\mathbb{Q},<)}$ is decidable.

Harvey’s ideas use mildly second-order statements about the rationals—with little or no arithmetic besides the ${<}$ relation—to get his hard statements. This may be surprising or may not—it depends on your viewpoint.

Michael Rabin further showed that the second-order monadic theory of ${(\mathbb{Q},<)}$ is decidable (see also this). So we need more than quantifiers over sets of rationals. If one allows an existential quantifier over triples, then it does not take much else to write down the rules for relations ${A(x,y,z)}$ and ${M(x,y,z)}$ to behave like addition and multiplication and use the quantifier to assert their existence, whereupon undecidability follows by Robinson’s result. It seems to us that this paper gives a ready way to show the undecidability also for pairs.

Harvey uses predicates defining sets of tuples of rationals and a single quantifier over them. He eliminates that quantifier by expressing the formula body using satisfiability in first-order predicate calculus with equality, obtaining the schematic form

$\displaystyle \phi = (\forall k)[\psi_k \text{ is satisfiable}],$

where ${\psi_k}$ uses only first-order quantifiers to talk about finite sets of ${k}$ -tuples and is computable from ${k}$ . The satisfaction relation is equivalent to the negation of a derivable contradiction. Hence ${\phi}$ is semantically equivalent to a ${\prod^0_1}$ sentence ${\sigma}$ . The point is that whereas ${\sigma}$ would encode proofs or computations, ${\phi}$ is simple and free of navel-gazing.

Some Key Concepts

Harvey’s work is based on the standard notion of order equivalence:

Definition 2 Say ${x=x_{1},\dots,x_{k}}$ is order equivalent to ${y=y_{1},\dots,y_{k}}$ if
$\displaystyle x_{i} < x_{j} \iff y_{i} < y_{j}$
for all indices ${i}$ and ${j}$ . We use ${x \equiv y}$ to denote this.

Theorem 3 Order equivalence on ${k}$ -tuples is an equivalence relation and there are only a polynomial in ${k!}$ equivalence classes.

Of course, if the tuples are restricted to distinct rationals then there are exactly ${k!}$ equivalence classes. The exact numbers of classes for each ${k}$ when members can be equal are addressed in this paper.

Now we can define the key binary relation on sets ${A,B}$ of ${k}$ -tuples of rationals. It uses order equivalence on ${2k}$ -tuples obtained by flattening elements of ${A^2}$ and ${B^2}$ :

Definition 4 Let ${k}$ be fixed. Then ${A \preceq B}$ provided
$\displaystyle (\forall x \in A^2)(\exists y \in B^2) \; x \equiv y.$

Harvey calls this emulation and the equivalence ${A \preceq B \wedge B \preceq A}$ he calls duplication. The notions naturally extend to ${A^r}$ and ${B^r}$ for fixed ${r > 2}$ , when they are called ${r}$ –emulation and ${r}$ –duplication. In applications, ${B}$ will be finite while ${A}$ can be huge, but in writing ${A \preceq B}$ we still regard ${A}$ as being constrained by ${B}$ .

The third key concept is a kind of invariance that is customizable. Its calibrations may determine the level of hardness of the statements obtained. They require identifying some elements of a ${k}$ -tuple as being integers. Here is an example:

For all sequences ${p_1,\dots,p_i}$ of nonnegative rationals, each less than the integer ${i}$ where ${1 \leq i < k}$ ,
$\displaystyle \begin{array}{rcl} && (p_1,p_2,\dots,p_i,i,i+1,\dots,k-1) \in A\\ &\iff& (p_1,p_2,\dots,p_i,i+1,i+2,\dots,k) \in A. \end{array}$

The case ${i = 1}$ says that for all ${p < 1}$ ,

$\displaystyle (p,1,2,\dots,k-1) \in A \iff (p,2,3,\dots,k) \in A$ .

A stronger constraint is

$\displaystyle (p,n_1,n_2,\dots,n_{k-1}) \in A \iff (p,1+n_1,1+n_2,\dots,1+n_{k-1}) \in A$

whenever ${p < n_1,\dots,n_{k-1} < k}$ . Here ${n_1,\dots,n_{k-1}}$ stand for integers that are not distinct and like ${p_1,\dots,p_i}$ are not required to be in nondescending order. Harvey calls these various notions of ${A}$ being stable.

The final core ingredient is the usual notion of ${A}$ being maximal with respect to some property: ${A}$ has the property but no proper superset of ${A}$ does. It will suffice to consider sets obtained by adding one element to ${A}$ . Now we can exemplify his statements ${\phi}$ in simple terms:

For every finite set ${B}$ of ${k}$ -tuples of rationals there is a set ${A}$ that is maximal with regard to ${A \preceq B}$ and is stable.

There are similar statements for duplication and other variants of emulation. Their definitive treatment will come in papers after #106 and #107 on Harvey’s manuscripts page.

The Goal

The above concepts are all basic. They figure into many areas of combinatorial mathematics, for instance regarding preferences and rankings. None talks about Gödel numbers or Turing machines or proofs. The goal is to establish and classify instances of this phenomenon:

There are simple true statements ${\phi}$ involving little more than ${\equiv}$ and ${\preceq}$ that require immensely strong set theories to prove.

Here “true” needs some explanation. If ordinary set theory cannot prove a statement ${\phi}$ , with what confidence can we say ${\phi}$ is true? The answer is that the aforementioned large cardinal axioms ${\Lambda}$ which Harvey uses to prove ${\phi}$ are widely accepted. We note this 2014 post by Bill Gasarch on how Fermat’s Last Theorem was originally proved using theory derived from the axiom ${\Lambda}$ asserting the existence of a strongly inaccessible cardinal. This ${\Lambda}$ is equivalent to having a Grothendieck universe, by which Alexander Grothendieck proved statements used by Andrew Wiles in his proof. Colin McLarty later showed how to remove this dependency and prove Fermat in a fragment of Ernst Zermelo’s set theory—but the relevance of ${\Lambda}$ to “real mathematics” remains.

So the upshot will be that ZFC plus some ${\Lambda}$ will prove ${\phi}$ but ZFC alone cannot (unless ZFC is inconsistent). It is possible for ZFC to be consistent whereas ZFC ${\cup \;\{\Lambda\}}$ is inconsistent. All the known large cardinal axioms ${\Lambda}$ have these relationships to ZFC:

ZFC can prove that ${\Lambda}$ implies the consistency of ZFC.
ZFC can prove that the consistency of ZFC implies the consistency of ZFC ${\cup \;\{\neg\Lambda\}}$ .

The final—and long—step needed in Harvey’s proofs is to get point 1 with ${\phi}$ in place of ${\Lambda}$ . This exemplifies a reversal step in his program of Reverse Mathematics. From this, the above desired conclusion about ZFC not proving ${\phi}$ follows. By point 2, ZFC cannot prove ${\neg\phi}$ either. Nevertheless, we can accept ${\phi}$ as true. This is intended to rest not only on heuristic arguments in favor of the particular ${\Lambda}$ employed, as exemplified for Fermat, but on evidence that can be programmed.

Harvey’s ultimate goal will be realized when the effect of ${\Lambda}$ is shown to be algorithmic. Here is a facile analogy. Suppose we build an infinite set ${A}$ by a simple greedy algorithm. We get a “simple path” of finite sets ${A_0,A_1,A_2,\dots}$ that grow successively larger: each ${A_i}$ has cardinality ${i}$ . We can view the process as structured by the ordinal ${\omega}$ being the limit of ${0,1,2,\dots,i}$ as ${i}$ grows. The idea is that more-ramified search structures can be “governed” by higher ordinals and patterns of descents from them. The end product will nevertheless always be a countable set ${A}$ of tuples and the evidence of progress will always be a longer finite path. Greedy ${\omega}$ may give only maximality, whereas the large cardinal axiom implies the existence of an infinite ${A}$ with all stipulated properties. Though it may provide no guidance on building longer and longer finite paths, it implies their existence so that computers can always find a longer path by a finite search. The concrete import of this and further motivations are described in this Nautilus article from last year.

Open Problems

Open problems are most engaging when easy to state: the simple attracts and the complex repels. Simplicity allows them to be stated also in the direction that most of us believe. Yet we may think they are true but don’t know how to prove them. The whole point of Harvey’s quest is to create more milestones that are simple but hard to prove statements. How much will it help to have such statements?