Sat, 15 Dec 2018 14:32:02 GMT

And again, I gave a talk for the
best mathematics club of the multiverse, for "younger" students,
meaning that **this text is not scientific**. This is the script for
the talk.

We want to prove the following theorem:

**Theorem:** There is an algorithm of which, with standard set theory,
we cannot decide whether it terminates.

Firstly, we need to look at what an algorithm is. Usually, people introduce this with Turing machines. However, we will use a model which is closer to modern computers, so-called register machines. These are provably equivalent to Turing machines, except that some algorithms might take a bit longer on Turing machines – but at this point, we do not care about efficiency, we just care about whether things can be calculated at all, given enough time and space. If you care about efficiency, that would be in the realm of complexity theory, while we will work in the realm of recursion theory.

A register machine has a finite set of registers `R[0]`, `R[1]`, …,
which can contain arbitrarily large natural numbers. A program
consists of a sequence of instructions:

- For every i, there is the instruction `R[i]++`, which increases the number that is contained in `R[i]` by 1.
- For every i, there is the instruction `R[i]--`, which decreases the number contained in `R[i]` by 1 if it is larger than 0, and does nothing otherwise.
- There is the instruction `End`, which ends the program.
- For every i and n, there is the instruction `if R[i]==0 then goto n`. The instructions are numbered, and this instruction jumps to the instruction with number `n` if `R[i]` contains 0; otherwise it does nothing.

The following program checks whether `R[0] >= R[1]`, and if so, it
sets `R[2]` to 1:

```
0 if R[1]==0 then goto 6
1 if R[0]==0 then goto 5
2 R[1]--
3 R[0]--
4 if R[2]==0 then goto 0
5 End
6 R[2]++
7 End
```

Line 4 is only there to perform an unconditional jump: `R[2]` is still 0 at that point, so the condition always holds. Therefore, we can introduce a shorthand that just does an unconditional jump:

```
0 if R[1]==0 then goto 6
1 if R[0]==0 then goto 5
2 R[1]--
3 R[0]--
4 goto 0
5 End
6 R[2]++
7 End
```

To make it a bit more readable, we can omit the line numbers, which we
usually do not need, and just set *labels* to the lines we want to
jump to:

```
Start: if R[1]==0 then goto Yes
if R[0]==0 then goto No
R[1]--
R[0]--
goto Start
No: End
Yes: R[2]++
End
```

This just increases readability; it doesn't make anything possible
that wasn't possible before. By using this algorithm, we can generally
check whether `R[i] >= R[j]`, and, with the roles of the two registers
and of the two outcomes swapped, whether `R[i] > R[j]`. Therefore, we
can introduce a shorthand notation `if R[i] > R[j] then goto A`
without being able to do anything we couldn't do before.
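To experiment with such programs, a small interpreter helps. The following is a minimal sketch, not part of the original talk: the tuple representation of instructions (`"inc"`, `"dec"`, `"jz"`, `"end"`), the function name `run` and the `max_steps` budget are assumptions made for illustration. It runs the numbered comparison program from above.

```python
from typing import Dict, List

# Hypothetical tuple representation of the four instructions (an assumption):
#   ("inc", i)      R[i]++
#   ("dec", i)      R[i]--
#   ("jz", i, n)    if R[i]==0 then goto n
#   ("end",)        End

def run(program: List[tuple], registers: Dict[int, int],
        max_steps: int = 10_000) -> Dict[int, int]:
    """Run a register program; registers that are not given start at 0."""
    regs = dict(registers)
    pc = 0  # number of the current instruction
    for _ in range(max_steps):
        op = program[pc]
        if op[0] == "inc":
            regs[op[1]] = regs.get(op[1], 0) + 1
            pc += 1
        elif op[0] == "dec":
            regs[op[1]] = max(regs.get(op[1], 0) - 1, 0)
            pc += 1
        elif op[0] == "jz":
            pc = op[2] if regs.get(op[1], 0) == 0 else pc + 1
        elif op[0] == "end":
            return regs
        else:
            raise ValueError(f"unknown instruction {op!r}")
    raise RuntimeError("step budget exhausted (the program may not terminate)")

# The comparison program from above, instruction by instruction.
GEQ = [
    ("jz", 1, 6),
    ("jz", 0, 5),
    ("dec", 1),
    ("dec", 0),
    ("jz", 2, 0),  # R[2] is still 0 here, so this always jumps
    ("end",),
    ("inc", 2),
    ("end",),
]

print(run(GEQ, {0: 5, 1: 3}).get(2, 0))  # R[0] >= R[1]
print(run(GEQ, {0: 2, 1: 3}).get(2, 0))  # R[0] <  R[1]
```

The step budget is only a convenience of the sketch: a real register machine has no such limit, which is exactly why termination becomes an interesting question later on.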

We can do addition by

```
Start: if R[1]==0 then goto Done
R[0]++
R[1]--
goto Start
Done: End
```

and truncated subtraction ($\max(R[0] - R[1], 0)$) by

```
Start: if R[1]==0 then goto Done
R[0]--
R[1]--
goto Start
Done: End
```

Therefore, we can add the instructions `R[i]+=R[j]`, which adds `R[j]`
to `R[i]`, and `R[i]-=R[j]`, without being able to do anything more
than before. Having these instructions, we can define multiplication
by repeated addition, and division and modulo by repeated
subtraction. This is left to the reader.
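One possible solution to this exercise can be sketched in Python, restricting ourselves to what a register machine offers: adding 1, subtracting 1 (truncated at 0), and testing for zero. The function names `add`, `monus`, `mul` and `div_mod` are made up for this sketch.

```python
def add(a: int, b: int) -> int:
    """R[0] += R[1]: count b down while counting a up."""
    while b != 0:
        a += 1
        b -= 1
    return a

def monus(a: int, b: int) -> int:
    """Truncated subtraction: decrementing 0 leaves it at 0."""
    while b != 0:
        a = max(a - 1, 0)
        b -= 1
    return a

def mul(a: int, b: int) -> int:
    """Multiplication by repeated addition."""
    result = 0
    while b != 0:
        result = add(result, a)
        b -= 1
    return result

def div_mod(a: int, b: int) -> tuple:
    """Division and modulo by repeated subtraction."""
    if b == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    # a >= b holds exactly when the truncated difference b - a is 0.
    while monus(b, a) == 0:
        a = monus(a, b)
        quotient += 1
    return quotient, a

print(mul(6, 7))
print(div_mod(17, 5))
```

Each function is just a loop around increments, decrements and zero tests, so it translates directly into a register program.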

Now, we cannot know for sure whether everything that is computable at
all is computable by register machines. However, we do not know any
computable function that cannot be computed by a register
machine. Hence, it is generally believed that there is none. This is
called the **Church-Turing thesis**.

These programs operate on numbers only. Real computers work with
images and text. However, computationally this makes no difference, as
there are several injections between these kinds of data. In particular,
programs themselves can be represented as numbers, which is called
**Gödelization**. For this, we use the Cantor pairing, which
gives a bijective function
$\langle\cdot,\cdot\rangle : \mathbb{N}^2 \to \mathbb{N}$ by
$\langle a, b\rangle = \frac{(a+b)(a+b+1)}{2} + b$. This function
enumerates the backward diagonals on the grid of natural numbers.
The pairing itself can obviously be
calculated with the above functions by a register machine. Inverting
the function is also easy, and left to the reader.
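For the inversion, here is a short sketch in Python. It uses the usual form of the Cantor pairing, $\langle a, b\rangle = \frac{(a+b)(a+b+1)}{2} + b$; the names `pair` and `unpair` are chosen for this example only.

```python
from math import isqrt

def pair(a: int, b: int) -> int:
    """Cantor pairing: the position of (a, b) when walking the diagonals."""
    return (a + b) * (a + b + 1) // 2 + b

def unpair(n: int) -> tuple:
    """Inverse of pair: first find the diagonal, then the position on it."""
    w = (isqrt(8 * n + 1) - 1) // 2   # index of the diagonal, w = a + b
    b = n - w * (w + 1) // 2          # offset inside the diagonal
    return w - b, b

# The first few pairs in the order in which they are enumerated.
print([unpair(n) for n in range(6)])
```

A register machine would not use a square root, of course; it can find the diagonal by searching upwards, which is slower but needs only the arithmetic defined above.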

Now we can represent every program we have in the following way: We first map each single instruction to a number, one for each of

- `R[i]++`
- `R[i]--`
- `if R[i]==0 then goto j`
- `End`

in such a way that the kind of instruction and its parameters can be recovered from the number. Therefore, every instruction has its own code. A program can be encoded as a sequence of these codes, and a finite sequence of numbers can again be encoded as a single number by applying the pairing function repeatedly.
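To make this concrete, the following sketch shows one possible choice of codes. The specific numbering (tags 0 to 3 paired with the parameters, and sequences as nested pairs shifted by 1) is an assumption of this example and not necessarily the encoding used in the talk; any injective, computable choice works equally well.

```python
from math import isqrt

def pair(a: int, b: int) -> int:
    return (a + b) * (a + b + 1) // 2 + b

def unpair(n: int) -> tuple:
    w = (isqrt(8 * n + 1) - 1) // 2
    b = n - w * (w + 1) // 2
    return w - b, b

def encode_instruction(ins: tuple) -> int:
    """Hypothetical codes: a tag for the kind, paired with the parameters."""
    if ins[0] == "inc":
        return pair(0, ins[1])
    if ins[0] == "dec":
        return pair(1, ins[1])
    if ins[0] == "jz":
        return pair(2, pair(ins[1], ins[2]))
    if ins[0] == "end":
        return pair(3, 0)
    raise ValueError(f"unknown instruction {ins!r}")

def decode_instruction(code: int) -> tuple:
    tag, arg = unpair(code)
    if tag == 0:
        return ("inc", arg)
    if tag == 1:
        return ("dec", arg)
    if tag == 2:
        return ("jz", *unpair(arg))
    if tag == 3 and arg == 0:
        return ("end",)
    raise ValueError(f"{code} is not an instruction code")

def encode_sequence(xs: list) -> int:
    """0 is the empty sequence; otherwise pair(head, code of tail) + 1."""
    code = 0
    for x in reversed(xs):
        code = pair(x, code) + 1
    return code

def decode_sequence(code: int) -> list:
    xs = []
    while code != 0:
        x, code = unpair(code - 1)
        xs.append(x)
    return xs

# The addition program; "goto Start" is written as a jump on R[2], which stays 0.
ADD = [("jz", 1, 4), ("inc", 0), ("dec", 1), ("jz", 2, 0), ("end",)]
goedel_number = encode_sequence([encode_instruction(i) for i in ADD])
print(goedel_number)
```

Decoding the resulting number with `decode_sequence` and `decode_instruction` gives back the original program, which is all the argument needs.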

Therefore, it is well-defined to talk about "programs getting other
programs as parameters". And having seen this, it is easy to write a
**universal register machine**, which evaluates such a program, given
a sequence of register values:

The state of a program is entirely determined by its registers and the number of the current instruction. It has a finite sequence of registers $r_0, \ldots, r_n$ and an instruction line $l$, and therefore we can encode the state as the sequence $(l, r_0, \ldots, r_n)$, which is again a single number.

Now, let one register contain the program code, and another contain the current state. Let $l$ be the first element of the program state, which tells us at which position of the program we are. Let $c$ be the current instruction, given by the $l$-th element of the program code.

- If $c$ is the code of an increment `R[k-1]++`, let $v$ be the value of the $(k-1)$-st register in the simulated program, which we read from the state. We replace it inside the state by $v+1$ and replace $l$ by $l+1$. We have simulated `R[k-1]++`.
- Similarly, we can simulate `R[k-1]--`.
- For `if R[i]==0 then goto j`, we have to check the register value, and set the instruction number to the appropriate value.
- For `End`, we end, and have the final state of the program.
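The case distinction above can be sketched as a function that works only on the two numbers "program code" and "state code". The concrete encoding (tags 0 to 3 for the four kinds of instructions, sequences as nested Cantor pairs) is again an assumption of this sketch, as are the names `step` and `run`.

```python
from math import isqrt

def pair(a: int, b: int) -> int:
    return (a + b) * (a + b + 1) // 2 + b

def unpair(n: int) -> tuple:
    w = (isqrt(8 * n + 1) - 1) // 2
    b = n - w * (w + 1) // 2
    return w - b, b

def encode_sequence(xs: list) -> int:
    code = 0
    for x in reversed(xs):
        code = pair(x, code) + 1
    return code

def decode_sequence(code: int) -> list:
    xs = []
    while code != 0:
        x, code = unpair(code - 1)
        xs.append(x)
    return xs

def step(program_code: int, state_code: int):
    """Simulate one instruction; return the new state code, or None on End."""
    program = decode_sequence(program_code)
    line, *regs = decode_sequence(state_code)  # state = (line, r_0, ..., r_n)
    tag, arg = unpair(program[line])
    if tag == 0:                    # R[arg]++
        regs[arg] += 1
        line += 1
    elif tag == 1:                  # R[arg]--
        regs[arg] = max(regs[arg] - 1, 0)
        line += 1
    elif tag == 2:                  # if R[i]==0 then goto target
        i, target = unpair(arg)
        line = target if regs[i] == 0 else line + 1
    else:                           # End (any other tag, in this sketch)
        return None
    return encode_sequence([line, *regs])

def run(program_code: int, registers: list, max_steps: int = 10_000) -> list:
    """Iterate step from line 0; the state must list all registers used."""
    state = encode_sequence([0, *registers])
    for _ in range(max_steps):
        new_state = step(program_code, state)
        if new_state is None:
            return decode_sequence(state)[1:]
        state = new_state
    raise RuntimeError("step budget exhausted")

# The addition program under the assumed encoding; R[2] stays 0 for "goto".
ADD = [pair(2, pair(1, 4)), pair(0, 0), pair(1, 1), pair(2, pair(2, 0)), pair(3, 0)]
print(run(encode_sequence(ADD), [3, 4, 0]))
```

Everything `step` does is pairing, unpairing and simple arithmetic, so it could itself be written as a register program. That is the point of the universal machine.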

While it is really intricate, we can see that it is possible to
"simulate" a register program inside a register program. A natural
question which arises is: Is there an algorithm such that, given a
program $P$ (or its Gödelization) and an input state $s$,
the algorithm determines whether $P$ terminates? This problem
is called the **Halting problem**. We now show that it cannot be
solved. Formally, we show that there is no program $M$ that,
given the Gödelization of a program $P$ in `R[0]` and an
input state in `R[1]`, leaves `R[2]` being 0 if and only if
$P$ with the given input state terminates. We do this by
contradiction: Assume such an $M$ exists. Then we could, from this $M$,
generate the following program:

```
"execute M"
if R[2]==0 then goto Loop
End
Loop: goto Loop
```

This program terminates if and only if the given program with the given state does not terminate. We modify this program once again by one line:

```
"set R[1] to <R[0]>"
"execute M"
if R[2]==0 then goto Loop
End
Loop: goto Loop
```

We call this program $Q$. $Q$ only takes one argument. It
terminates if and only if the given program, given its *own*
Gödelization, does not terminate. Programs that do terminate when
given their own Gödelization are called **self-accepting**.

Now, as $Q$ is itself a program, we can Gödelize it, so let $q$ be the Gödelization. By setting `R[0]` to $q$, we can calculate $Q(q)$.

Now assume $Q(q)$ terminates. This means that in $Q$, after
executing M, there will be `R[2]==0`. Therefore $Q(q)$
would not terminate. Contradiction.

But assuming $Q(q)$ would not terminate would mean, by the same
argument, that `R[2]` is not 0 after M. Therefore, $Q(q)$ would
terminate. Also a contradiction.

Such a program $Q$ cannot exist. Therefore, $M$ cannot exist.
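The same construction can be played through in Python, where function objects stand in for Gödelizations. This is only an illustration under that assumption: `make_q` and `pessimist` are hypothetical names, and `pessimist` is a deliberately naive candidate for the decider.

```python
from typing import Callable

def make_q(halts: Callable[[Callable, object], bool]) -> Callable:
    """Build the diagonal program from a claimed halting decider."""
    def q(program):
        if halts(program, program):
            while True:          # Loop: goto Loop
                pass
        return "terminated"      # End
    return q

def pessimist(program, argument) -> bool:
    """A candidate decider that claims that nothing ever terminates."""
    return False

q = make_q(pessimist)
claimed = pessimist(q, q)   # the decider claims: q(q) does not terminate
actual = q(q)               # but q(q) does terminate
print(claimed, actual)
```

A decider that answers `True` for this pair is refuted in the same way, except that `q(q)` would then loop forever, so we do not run that case. Whatever the candidate answers about `q` applied to itself, `q` does the opposite.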

This proves that we cannot generally decide whether an algorithm terminates. However, it is not yet what we want: We want an algorithm, of which we cannot decide whether it terminates, at all. To get it, we need to do a bit of logic. We will mainly focus on Zermelo-Fraenkel set theory here, as it is the foundation of mathematics.

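We first define what a mathematical *formula* is, which is essentially
a string that encodes a mathematical proposition.

- We have an infinite set $\mathcal{V}$ of *variable symbols*.
- The set of strings $\mathcal{A} = \{\, a \in b \mid a, b \in \mathcal{V} \,\}$ is the set of *atomic formulae*: formulae which just give the $\in$-relation between two free variables.

Now, the set $\mathcal{F}$ of formulae is the smallest set such that

- Every atomic formula is a formula: $\mathcal{A} \subseteq \mathcal{F}$.
- If $X \in \mathcal{F}$ and $a \in \mathcal{V}$, then $\forall a\, X$ ("for all a X holds") and $\exists a\, X$ ("there exists an a such that X holds") and $\neg X$ ("not X") are in $\mathcal{F}$.
- If $X, Y \in \mathcal{F}$, then $X \wedge Y$ ("X and Y"), $X \vee Y$ ("X or Y") and $X \to Y$ ("X implies Y") are in $\mathcal{F}$.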
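Such an inductive definition maps directly onto a recursive data structure. The following sketch represents formulae as nested tuples and computes their free variables; the tuple tags (`"in"`, `"not"`, `"forall"`, and so on) and the function names are assumptions of this example.

```python
# Hypothetical tuple representation of formulae:
#   ("in", a, b)                                   atomic formula  a ∈ b
#   ("not", X)
#   ("and", X, Y), ("or", X, Y), ("imp", X, Y)
#   ("forall", a, X), ("exists", a, X)

BINARY = {"and": "∧", "or": "∨", "imp": "→"}
QUANTIFIER = {"forall": "∀", "exists": "∃"}

def free_vars(f: tuple) -> frozenset:
    """Variables of f that are not bound by a surrounding quantifier."""
    tag = f[0]
    if tag == "in":
        return frozenset({f[1], f[2]})
    if tag == "not":
        return free_vars(f[1])
    if tag in BINARY:
        return free_vars(f[1]) | free_vars(f[2])
    if tag in QUANTIFIER:
        return free_vars(f[2]) - {f[1]}
    raise ValueError(f"not a formula: {f!r}")

def show(f: tuple) -> str:
    """Render a formula as a string."""
    tag = f[0]
    if tag == "in":
        return f"{f[1]} ∈ {f[2]}"
    if tag == "not":
        return f"¬({show(f[1])})"
    if tag in BINARY:
        return f"({show(f[1])} {BINARY[tag]} {show(f[2])})"
    if tag in QUANTIFIER:
        return f"{QUANTIFIER[tag]}{f[1]} {show(f[2])}"
    raise ValueError(f"not a formula: {f!r}")

# "There is a set without elements."
empty_set_exists = ("exists", "x", ("forall", "y", ("not", ("in", "y", "x"))))
print(show(empty_set_exists))
print(sorted(free_vars(("forall", "y", ("in", "y", "x")))))
```

A formula without free variables, like the first example, is a proposition that is simply true or false; a formula with free variables is a statement about them.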

We now give the axioms of set theory:

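**NUL**: There is an empty set:

$$\exists x\,\forall y\,\neg\, y \in x$$

We can introduce a common shorthand notation for $\neg\, a \in b$ by $a \notin b$, and rewrite this axiom as

$$\exists x\,\forall y\; y \notin x$$

If we want to talk about the empty set now, we need to introduce some variable $e$, and add $\forall y\; y \notin e$ to the formula. Therefore, our system doesn't get stronger if we introduce a symbol $\emptyset$ for the empty set, instead of always adding this formula, and it increases readability, which is why we do that.

We furthermore define the shorthand notation $a \subseteq b$ by $\forall c\,(c \in a \to c \in b)$, and $a = b$ by $a \subseteq b \wedge b \subseteq a$.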

**EXT**: The axiom of *extensionality* says that sets that contain the
same elements are also contained in the same sets:

$$\forall a\,\forall b\,\bigl(a = b \to \forall c\,(a \in c \to b \in c)\bigr)$$

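**FUN**: The axiom of *foundation* says that every nonempty set contains a set
that is disjoint to it. From this axiom follows that there are no
infinite descending $\in$-chains.

$$\forall a\,\bigl(\exists b\; b \in a \to \exists b\,(b \in a \wedge \neg\exists c\,(c \in a \wedge c \in b))\bigr)$$

or, with additional obvious shorthand notation

$$\forall a\,\bigl(a \neq \emptyset \to \exists b \in a\colon a \cap b = \emptyset\bigr)$$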

**PAR**: The axiom of *pairing* says that there is a set that contains
at least two given elements, meaning, for all $a, b$, there
exists a superset of $\{a, b\}$:

$$\forall a\,\forall b\,\exists c\,(a \in c \wedge b \in c)$$

**UN**: The axiom of *union* says that a superset of the union of all sets in a set exists:

$$\forall a\,\exists b\,\forall c\,\forall d\,\bigl((d \in c \wedge c \in a) \to d \in b\bigr)$$

**POW**: The axiom of the powerset: A superset of the powerset of
every set exists:

$$\forall a\,\exists b\,\forall c\,(c \subseteq a \to c \in b)$$

**INF**: The axiom of *infinity* says that a superset of the set of
natural numbers exists. Natural numbers are encoded as ordinals:
$0 = \emptyset$, and $n + 1 = n \cup \{n\}$. Writing it out
as a formula is left as an exercise.
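The ordinal encoding $0 = \emptyset$, $n + 1 = n \cup \{n\}$ can be tried out with Python's `frozenset`, which may contain other frozensets. The function names `successor` and `ordinal` are made up for this sketch.

```python
def successor(n: frozenset) -> frozenset:
    """n + 1 = n ∪ {n}"""
    return n | frozenset({n})

def ordinal(k: int) -> frozenset:
    """The set that encodes the natural number k."""
    n = frozenset()          # 0 = ∅
    for _ in range(k):
        n = successor(n)
    return n

three = ordinal(3)
print(len(three))                  # the number k is a set with k elements
print(ordinal(2) in three)         # "smaller than" is "element of" ...
print(ordinal(2) < three)          # ... and also "proper subset of"
```

In this encoding, every natural number is exactly the set of all smaller natural numbers.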

The other two sets of formulae we need are given by **axiom schemes**:
They are infinitely many axioms, but they can be expressed by a
simple, finite rule:

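**SEP**: The axiom scheme of *separation* says that, for every formula
$\varphi$ and every set $z$, the set $\{x \in z \mid \varphi(x)\}$ exists:

Let a formula $\varphi$ be given with free variables among $x, z, w_1, \ldots, w_n$, and let $y$ not occur freely. Then the formula

$$\forall z\,\forall w_1 \ldots \forall w_n\,\exists y\,\forall x\,\bigl(x \in y \leftrightarrow (x \in z \wedge \varphi)\bigr)$$

is an axiom of set theory.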

**RPL**: The axiom scheme of *replacement* is a bit more complicated.

A formula $\varphi(x, y)$ is called a **functor** on a set $A$
(which is *not* the same as a functor in category theory), if for all
$a \in A$ there is a unique $b$ such that $\varphi(a, b)$
holds. Therefore, in some sense, $\varphi$ defines something
similar to a function on $A$, and we write $\varphi(a)$
for this unique $b$. Then the set
$\{\varphi(a) \mid a \in A\}$, the "image" of $\varphi$,
exists. Formalizing this scheme is left as an exercise.

**AC**: It should be noted that usually the axiom of *choice* is
added. However, we do not need to care whether it is added or not,
so we omit it here.

We already talked about embedding natural numbers into this set theory. We can also define general arithmetic inside this set theory. Most of mathematics can be formalized inside Zermelo-Fraenkel set theory.

Now, we can formalize propositions. Now we want to formalize proofs. Normally, I would introduce the calculus of natural deduction here, because it corresponds to the dependently typed lambda calculus, so every proof is a term. However, for the specific purpose we need, namely, formalizing proof theory in arithmetic, the equivalent Hilbert calculus is the better choice. It corresponds to the SKI calculus for proof terms.

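Firstly, we further reduce our formulae: We can express $A \vee B$
as $\neg A \to B$, and $A \wedge B$ as $\neg(A \to \neg B)$. Furthermore, $\exists x\, A$ can be expressed by
$\neg\forall x\,\neg A$. Hence, we only need $\to$, $\neg$
and $\forall$ to express all formulae. We now
define additional *logical axiom schemes*, where
$A, B, C$ range over all formulae. (Notice:
$\to$ is right-associative.)

- P1. $A \to B \to A$
- P2. $(A \to B \to C) \to (A \to B) \to A \to C$
- P3. $(\neg A \to \neg B) \to B \to A$
- Q5. $\forall x\, A \to A[x := y]$ for all variables $y$
- Q6. $\forall x\,(A \to B) \to \forall x\, A \to \forall x\, B$
- Q7. $A \to \forall x\, A$ if $x$ is not free in $A$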

A *proof* of a formula $A$ is a finite sequence of formulae
$A_1, \ldots, A_n$, such that $A_n = A$ and for
all $1 \le i \le n$, either $A_i$ is an axiom of set
theory, or a logical axiom, or there exist $j, k < i$ such that
$A_k = A_j \to A_i$. Essentially this means that
everything in the sequence is either an axiom or follows from former
formulae by applying modus ponens.
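Checking whether a sequence is a proof is a purely mechanical task, which the following sketch illustrates. It assumes a tuple representation of formulae, the two standard Hilbert-style schemes $A \to B \to A$ and $(A \to B \to C) \to (A \to B) \to A \to C$, and it takes the allowed axioms as an explicit finite set instead of recognizing schemes; all names are hypothetical.

```python
def imp(a, b):
    """The formula a → b in a hypothetical tuple representation."""
    return ("imp", a, b)

def p1(a, b):
    """Instance of the scheme  A → B → A."""
    return imp(a, imp(b, a))

def p2(a, b, c):
    """Instance of the scheme  (A → B → C) → (A → B) → A → C."""
    return imp(imp(a, imp(b, c)), imp(imp(a, b), imp(a, c)))

def is_proof(sequence: list, axioms: set) -> bool:
    """Every formula is an axiom or follows from earlier ones by modus ponens."""
    for i, formula in enumerate(sequence):
        if formula in axioms:
            continue
        earlier = sequence[:i]
        # modus ponens: some earlier g and an earlier "g → formula"
        if any(imp(g, formula) in earlier for g in earlier):
            continue
        return False
    return True

A = ("in", "x", "y")  # any formula would do here
step1 = p1(A, imp(A, A))                 # A → (A → A) → A
step2 = p2(A, imp(A, A), A)              # (A → (A → A) → A) → (A → A → A) → A → A
step3 = imp(imp(A, imp(A, A)), imp(A, A))  # modus ponens from step1 and step2
step4 = p1(A, A)                         # A → A → A
step5 = imp(A, A)                        # modus ponens from step4 and step3

proof = [step1, step2, step3, step4, step5]
print(is_proof(proof, axioms={step1, step2, step4}))
```

The example is the classic five-line derivation of $A \to A$. The important observation for what follows is that `is_proof` always terminates, so "this sequence is a correct proof" is a decidable property.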

**Completeness Theorem:** If a formula $A$ is true in set
theory (that is, in every model of the axioms), then there exists a proof of it.

To prove this, we would need model theory, which would lead too far, so we leave out the proof.

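Now, as we did for programs before, we can gödelize formulae and proofs. Let us denote by $\ulcorner A \urcorner$ the gödelization of $A$.

**Diagonalization Lemma:** For every formula $A$ with one
free variable $x$, there exists a formula $X$, such
that $X \leftrightarrow A(\ulcorner X \urcorner)$ holds.

*Proof:* First, we notice that, given the formula $A$, we
can express the *substitution* of another variable for
$x$; therefore, we can give a function $s$ that satisfies
$s(\ulcorner F \urcorner, n) = \ulcorner F(n) \urcorner$ for every formula $F$ with one free variable. Now we can
define $B(x) := A(s(x, x))$. Now,
define $X := B(\ulcorner B \urcorner)$. Then we have
$X \leftrightarrow A(s(\ulcorner B \urcorner, \ulcorner B \urcorner)) \leftrightarrow A(\ulcorner B(\ulcorner B \urcorner) \urcorner) \leftrightarrow A(\ulcorner X \urcorner)$. This concludes the proof.

Notice that the definition of $X$ is computational: It can be done effectively by a computer. As we can find such a formula for every $A$, we denote it by $X_A$.

Now, we can also gödelize proofs and their correctness criterion. Therefore, we can give a formula $\nu(p, a)$ meaning "$p$ is the gödelization of a correct proof of the gödelized formula $a$". Therefore, $\exists p\,\nu(p, a)$ says that the gödelized formula $a$ is provable.

By the diagonalization lemma, there is a formula $G$ such that $G \leftrightarrow \neg\exists p\,\nu(p, \ulcorner G \urcorner)$. Now, assume that $G$ does not hold. Then also $\neg\exists p\,\nu(p, \ulcorner G \urcorner)$ cannot hold; therefore, $G$ would be provable, which is a contradiction, because provable formulae hold. Hence, $G$ must hold. But then, it cannot be provable. This is a (sketch of a) proof of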

**Gödel's first incompleteness theorem:** In Zermelo-Fraenkel set
theory, there are propositions that can neither be proved nor
disproved.

More generally, this theorem holds for all axiom systems that are capable of basic arithmetic, because this is all we used. Specifically for Zermelo-Fraenkel set theory, there are other examples of such propositions, namely the continuum hypothesis, and the existence of large cardinals.

Now, something we always implicitly assumed is that set theory is
**consistent**: If $A$ is provable, then $\neg A$
cannot be provable. This is, however, unknown, which follows from:

**Gödel's second incompleteness theorem:** Set theory cannot prove its
own consistency.

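*Proof:* We use our $G$ from the proof of the
first incompleteness theorem. Furthermore, we can define a formula
$\bot$ such that $\neg\bot$ is provable, for example
$\bot := \exists x\; x \in \emptyset$. Now, we can
define what it means to be consistent, namely: $\mathrm{Con} := \neg\exists p\,\nu(p, \ulcorner \bot \urcorner)$. Now, we know that
$\neg\bot$, and therefore, since
false propositions imply anything,
$\bot \to A$ for all formulae $A$,
and obviously this implies
$\exists p\,\nu(p, \ulcorner \bot \urcorner) \to \exists p\,\nu(p, \ulcorner A \urcorner)$. So set theory is consistent
if and only if some formula is unprovable. The argument of the first
incompleteness theorem shows that $G$ is unprovable if set theory is
consistent, and this argument can itself be formalized in set theory. Therefore,
$\mathrm{Con} \to G$ is provable. If $\mathrm{Con}$ were provable, then so would be $G$. But this contradicts what
we proved in the first incompleteness theorem. Hence,
$\mathrm{Con}$ cannot be provable.

Let $a = \ulcorner \bot \urcorner$. Obviously, $\exists p\,\nu(p, a)$ if and only if set theory is inconsistent (since $\bot$ is wrong). Now consider the following algorithm: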

```
Retry: if ν(R[0], a) then goto Found
R[0]++
goto Retry
Found: End
```

Does this algorithm terminate?

If it terminates, it has found an inconsistency in set theory. Assuming that set theory is consistent, it would not terminate. But if we could prove that it does not terminate, we would be able to prove that set theory is consistent, and this contradicts the second incompleteness theorem.

Hence, we have an algorithm of which we cannot decide whether it terminates.