Proof of the Diagonal Lemma in Logic

Persiflage 2020-05-25

Why is the proof so short yet so difficult?

Saeed Salehi is a logician at the University of Tabriz in Iran. Three years ago he gave a presentation at a Moscow workshop on proofs of the diagonal lemma.

Today I thought I would discuss the famous diagonal lemma.

The lemma is related to Georg Cantor’s famous diagonal argument yet is different. The logical version imposes requirements on when the argument applies, and requires that it be expressible within a formal system.

The lemma underpins Kurt Gödel’s famous 1931 proof that arithmetic is incomplete. However, Gödel did not state it as a lemma or proposition or theorem or anything else. Instead, he focused his attention on what we now call Gödel numbering. We consider this today as “obvious” but his paper’s title ended with “Part I”. And he had readied a “Part II” with over 100 pages of calculations should people question that his numbering scheme was expressible within the logic.

Only after his proof was understood did people realize that one part, perhaps the trickiest part, could be abstracted into a powerful lemma. The tricky part is not the Gödel numbering. People granted that it can be brought within the logic once they saw enough of Gödel’s evidence, and so we may write {\ulcorner \phi \urcorner} for the function giving the Gödel number of any formula {\phi} and use that in other formulas. The hard part is what one does with such expressions.

This is what we will try to motivate.

Tracing the Lemma

Rudolf Carnap is often credited with the first formal statement, in 1934, for instance by Elliott Mendelson in his famous textbook on logic. Carnap was a member of the Vienna Circle, which Gödel frequented, and Carnap is considered a giant among twentieth-century philosophers. He worked on sweeping grand problems of philosophy, including logical positivism and analysis of human language via syntax before semantics. Yet it strikes us with irony that his work on the lemma may be the best remembered.

Who did the lemma first? Let’s leave that for others and move on to the mystery of how to prove the lemma once it is stated. I must say the lemma is easy to state, easy to remember, and has a short proof. But I believe that the proof is not easy to remember or even follow.

Salehi’s presentation quotes others’ opinions about the proof:

{\bullet } Sam Buss: “Its proof [is] quite simple but rather tricky and difficult to conceptualize.”

{\bullet} György Serény (we jump to Serény’s paper): “The proof of the lemma as it is presented in textbooks on logic is not self-evident to say the least.”

{\bullet } Wayne Wasserman: “It is `Pulling a Rabbit Out of the Hat’—Typical Diagonal Lemma Proofs Beg the Question.”

So I am not alone, and I thought it might be useful to try and unravel its proof. This exercise helped me and maybe it will help you.

Here goes.

Stating the Lemma

Let {S(w)} be a formula in Peano Arithmetic ({PA}). We claim that there is some sentence {\phi} so that

\displaystyle  PA \vdash \phi \iff S(\ulcorner \phi \urcorner).

Formally,

Lemma 1 Suppose that {S(x)} is some formula in {PA}. Then there is a sentence {\phi} so that

\displaystyle  PA \vdash \phi \iff S(\ulcorner \phi \urcorner).

The beauty of this lemma is that it was used by Gödel and others to prove various powerful theorems. For example, the lemma quickly proves this result of Alfred Tarski:

Theorem 2 Suppose that {PA} is consistent. Then truth cannot be defined in {PA}. That is, there is no formula {Tr(x)} so that for all sentences {\phi}, {PA} proves

\displaystyle  \phi \iff Tr(\ulcorner \phi \urcorner).

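The proof is this. Assume there is such a formula {Tr(x)}. Then apply the diagonal lemma to {S(x) = \neg Tr(x)} and get a sentence {\phi} with

\displaystyle  \phi \iff \neg Tr(\ulcorner \phi \urcorner).

Combining this with the assumed property of {Tr} for this same {\phi} shows that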

\displaystyle  \phi \iff \neg Tr(\ulcorner \phi \urcorner) \iff Tr(\ulcorner \phi \urcorner).

This is a contradiction. A short proof.

The Proof

The key is to define the function {F(n)} as follows. If {n} is the Gödel number of a formula of the form {A(x)} for some variable {x}, then

\displaystyle  F(n) = \ulcorner A(\ulcorner A(x) \urcorner) \urcorner.

If {n} is not of this form then define {F(n)=0}. This is a strange function, a clever function, but a perfectly fine function. It certainly maps numbers to numbers. It is certainly recursive; in fact it is computable in polynomial time for any reasonable Gödel numbering. Note: the function {F} does depend on the choice of the variable {x}. Thus,

\displaystyle  F(\ulcorner y=0 \urcorner) = \ulcorner (\ulcorner y=0 \urcorner)=0 \urcorner,

and

\displaystyle  F(\ulcorner x=0 \urcorner) = \ulcorner (\ulcorner x=0 \urcorner)=0 \urcorner.
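To make {F} less mysterious, here is a minimal sketch in Python. It is only an analogy, under assumptions that are not part of the lemma: formulas are plain strings, the Gödel number of a formula is played by its string literal (via `repr`), and the value 0 is played by the empty string. The helper name `quote` and the parameter `var` are ours, chosen for illustration.

```python
import re

def quote(formula: str) -> str:
    """Toy stand-in for the Goedel numeral: the 'code' of a formula is its string literal."""
    return repr(formula)

def F(code: str, var: str = "x") -> str:
    """Given the text of A(var), return the text of A(quote(A(var))).

    Assumption: `var` appears in the formula only as a standalone token.
    If `var` does not occur at all, return "" (our stand-in for F(n) = 0).
    """
    pattern = r"\b" + re.escape(var) + r"\b"
    if not re.search(pattern, code):
        return ""
    # A callable replacement avoids any escape processing of the quoted text.
    return re.sub(pattern, lambda _match: quote(code), code)

print(F("x == 0"))           # 'x == 0' == 0
print(F("y == 0", var="y"))  # 'y == 0' == 0
print(repr(F("y == 0")))     # '' because x does not occur in "y == 0"
```

The last two calls show the dependence on the chosen variable: the same text is handled differently depending on which variable the function substitutes for.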

Now we make two definitions:

\displaystyle  \begin{array}{rcl}        g(w) &\equiv& S(F(w)) \\        \phi &\equiv& g(\ulcorner g(x) \urcorner). \end{array}

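Now we compute just using the definitions of {F, g, \phi}. Since {F} is recursive it is representable in {PA}, so each equality below should be read as an equivalence provable in {PA}: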

\displaystyle  \begin{array}{rcl}        \phi &=& g(\ulcorner g(x) \urcorner) \\                 &=& S(F(\ulcorner g(x) \urcorner)) \\           &=& S(\ulcorner g(\ulcorner g(x) \urcorner) \urcorner) \\               &=& S(\ulcorner \phi \urcorner). \end{array}

We are done.
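The computation can be replayed in a toy setting, which may help convince you that nothing circular is going on. This is a sketch under assumptions, not the proof in {PA}: formulas are Python expression strings, the code of a formula is its string literal, {S} is any Python predicate on strings, and "truth" is just evaluation with `eval`. The function name `fixed_point` is hypothetical.

```python
import re

def F(code: str, var: str = "x") -> str:
    """Text of A(quote(A(var))) given the text of A(var); the quote is repr."""
    pattern = r"\b" + re.escape(var) + r"\b"
    return re.sub(pattern, lambda _match: repr(code), code)

def fixed_point(S):
    """Build phi = g(quote(g)) for g(x) = S(F(x)) and evaluate it.

    Returns (text of phi, value of phi). S is any predicate on strings.
    """
    g_text = "S(F(x))"       # the formula g(x), as text
    phi_text = F(g_text)     # g applied to its own code
    # Evaluating phi calls F on the code of g, which rebuilds phi_text itself.
    phi_value = eval(phi_text, {"S": S, "F": F})
    return phi_text, phi_value

def has_even_length(code: str) -> bool:
    # One arbitrary choice of S, purely for illustration.
    return len(code) % 2 == 0

phi_text, phi_value = fixed_point(has_even_length)
print(phi_text)                                  # S(F('S(F(x))'))
print(phi_value == has_even_length(phi_text))    # True
```

The final comparison is the toy version of {\phi \iff S(\ulcorner \phi \urcorner)}: evaluating the sentence gives the same answer as applying {S} to the sentence's own text, and this holds for whichever predicate is passed in.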

But …

Where did this proof come from? Suppose that you forgot the proof but remember the statement of the lemma. I claim that we can then reconstruct the proof.

First let’s ask: Where did the definition of the function {F} come from? Let’s see. Imagine we defined

\displaystyle  \begin{array}{rcl}        g(w) &\equiv& S(F(w)) \\        \phi &\equiv& g(\ulcorner g(x) \urcorner). \end{array}

But left {F} undefined for now. Then

\displaystyle  \begin{array}{rcl}        \phi &=& g(\ulcorner g(x) \urcorner) \\                 &=& S(F(\ulcorner g(x) \urcorner)). \end{array}

But we want {\phi = S(\ulcorner \phi \urcorner)}, which happens provided:

\displaystyle  \ulcorner g(\ulcorner g(x) \urcorner) \urcorner = F(\ulcorner g(x) \urcorner).

This essentially gives the definition of the function {F}. Pretty neat.

But but …

Okay, where did the definitions of {g} and {\phi} come from? It is reasonable to define

\displaystyle  g(w) \equiv S(F(w)),

for some {F}. We cannot change {S} but we can control the input to the formula {S}, so let’s put a function there. Hence the definition for {g} is not unreasonable.

Okay how about the definition of {\phi}? Well we could argue that this is the magic step. If we are given this definition then {F} follows, by the above. I would argue that {\phi} is not completely surprising. The name of the lemma is after all the “diagonal” lemma. So defining {\phi} as the application of {g} to itself is plausible.

Taking an Exam

Another way to think about the diagonal lemma is to imagine you are taking an exam in logic. The first question is:

Prove in {PA} that for any {S(x)} there is a sentence {\phi} so that

\displaystyle  \phi \iff S(\ulcorner \phi \urcorner).

You read the question again and think: “I wish I had studied harder, I should not have checked Facebook last night. And then went out and {\dots}” But you think: let’s not panic, let’s think.

Here is what you do. You say let me define

\displaystyle  g(x) = S(F(x)),

for some {F}. You recall there was a function that depends on {S}, and changing the input from {x} to {F(x)} seems to be safe. Okay you say, now what? I need the definition of {F}. Hmmm let me wait on that. I recall vaguely that {F} had a strange definition. I cannot recall it, so let me leave it for now.

But you think: I need a sentence {\phi}. A sentence cannot have an unbound variable. So {\phi} cannot be {g(x)}. It could be {g(m)} for some {m}. But what could {m} be? How about {\ulcorner g(x) \urcorner}? This makes

\displaystyle  \phi = g(\ulcorner g(x) \urcorner).

It is after all the diagonal lemma. Hmmm, does this work? Let’s see. Wait, as above I get that {F} is now forced to satisfy

\displaystyle  F(\ulcorner g(x) \urcorner) = \ulcorner g(\ulcorner g(x) \urcorner) \urcorner.

Great this works. I think this is the proof. Wonderful. Got the first question.

Let’s look at the next exam question. Oh no {\dots}

Open Problems

Does this help? Does this unravel the mystery of the proof? Or is it still magic?
