
Contents

1 Turing Machine for Multiplication of fixed k and arbitrary n 2

2 Turing Machine for Multiplication 7

3 Computing Functions and Standard Configurations 31

4 The Turing (aka Church-Turing) Thesis 38

5 Effective Coding and the Diagonal Function 44


5.1 Effective Coding and the Enumerability of Turing Machines . 44
5.2 Coding and The Universal Turing Machine . . . . . . . . . . . 47
5.3 The Diagonal Function . . . . . . . . . . . . . . . . . . . . . . 53

6 The Halting Problem 59


6.1 Framing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.2 The Copying Machine . . . . . . . . . . . . . . . . . . . . . . 69
6.3 An impossible machine . . . . . . . . . . . . . . . . . . . . . . 82
6.4 “Reducing to the halting problem” . . . . . . . . . . . . . . . 85
6.5 Rice’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.6 Productivity/Busy Beaver functions - uncomputability via rapid
growth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

1
1 Turing Machine for Multiplication of fixed k and arbitrary n

It will help give a sense of the resources we do and don’t have with our two-letter Turing machines if we quickly glimpse at how to take the machine we built in our last lecture — computing n ↦ 2n — and modify it into a machine that computes n ↦ k × n for some fixed number k.

The modification is simple enough: the n ↦ 2n machine reads a counter, moves to the second blank detected on the left, writes two strokes, and heads back to erase the original counter. Then it begins the cycle again, repeating until all the counters have been erased.

To get the n ↦ k × n machine, just do the same thing, except write k strokes at the point in the cycle where the n ↦ 2n machine writes 2.

This requires just a very superficial adjustment to the program. I’ll first illustrate with the modification needed to get the machine calculating n ↦ 3n.

2
Here are the crucial moments on the diagrams on slide ??. The n ↦ 2n machine prints a 1, moves to q4, prints another 1 and goes into q5. (Instructions: q3B1q3; q31Lq4; q4B1q4; q41Rq5; q51Rq5)

q3 /q4 q4 q4 /q5

1 1 1 1 1 1 1 1 1 1 1 1 1 1

For the n ↦ 3n machine we interpolate an extra state between q4 and q5. We drop q41Rq5, and add: (q41Lqa; qaB1q5)
Then the machine will continue as before:

q3 /q4 q4 q4 /qa

1 1 1 1 1 1 1 1 1 1 1 1 1 1

qa qa /q5

1 1 1 1 1 1 1 1 1 1 1 1

3
After the three cycles corresponding to the three counters are completed, the modified machine will halt on the leftmost of a string of 9 strokes, on an otherwise blank tape.

Clearly, for any k, we can get the function n ↦ k × n from the n ↦ 2n machine by dropping the instruction q41Rq5, adding additional states {qa1, qa2, . . . , qak−2}, and adding additional instructions:

q41Lqa1;
qa1 B1qa1; qa1 1Lqa2;
qa2 B1qa2; qa2 1Lqa3;
qa3 B1qa3; qa3 1Lqa4;
...
qak−3 B1qak−3; qak−3 1Lqak−2;
qak−2 B1qak−2; qak−2 1Rq5
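If it helps to see the pattern all at once, here is a small sketch (mine, not BJB’s) that generates these replacement instructions for a given k. It assumes quadruples written as (state, scanned, action, next), with 'B' for the blank and the action being either a move ('L'/'R') or a symbol to write.

```python
# Sketch: generate the replacement instructions above for a fixed k >= 3.
def extra_instructions(k):
    assert k >= 3
    a = [f"qa{i}" for i in range(1, k - 1)]      # qa1 ... qa(k-2)
    quads = [("q4", "1", "L", a[0])]             # replaces q4 1 R q5
    for i in range(k - 2):
        quads.append((a[i], "B", "1", a[i]))     # write a stroke on the blank
        last = (i == k - 3)
        quads.append((a[i], "1", "R" if last else "L",
                      "q5" if last else a[i + 1]))
    return quads

print(extra_instructions(3))
# [('q4','1','L','qa1'), ('qa1','B','1','qa1'), ('qa1','1','R','q5')]
```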

4
It is crucial that this is a function that multiplies n by
some specific number given in advance, before we set
out to construct the machine.

To print k strokes rather than 2 each cycle is just a matter of adding a fixed number of additional states, to get a different machine that performs a different computation on a given input than the original n ↦ 2n machine does.

5
It’s important to see why this strategy won’t work to get
a machine that multiplies a given number n with some
other, arbitrary number m, with both m and n supplied
as input on the tape.

A Turing machine is allowed to have an arbitrarily large finite number of states, but the machine can’t add states as the computation goes on.

You can use as large a finite number of states as you like to build a machine, but once the machine is built those are all the states it will have. The states available for a given computation are hard-wired into the machine itself.

6
2 Turing Machine for Multiplication

Now that we have a sense of some of the obstacles, how can we build a Turing machine that, when started on the leftmost 1 of two blocks of m and n strokes separated by a blank, on an otherwise blank tape, will end scanning the leftmost 1 of a single block of m × n strokes on an otherwise blank tape?

First idea that comes to mind: Given m strokes separated by a B from another n strokes, do what you did with the n ↦ 2n machine, except that you want to perform the duplicating operation m times instead of just twice.

This is indeed the right idea, but as with the duplication machine it requires some ingenuity and finesse to make the machine track the needed information during the computation.

7
To see the challenge, let’s look at one failed try. Say we
use the first block of m strokes as counters, as in the du-
plication machine. (This is, in fact, what we’ll do.)

So let’s blunder into a clumsy first plan and see where it gets into trouble; I’ll use the effort to calculate 4 × 3 to illustrate.

Say we try to build cycles that work like this: (next slide)

(NB: THIS IS AN EXAMPLE OF A STRATEGY THAT FAILS!)

8
1 1 1 1 1 1 1

From the above starting configuration, move right until you are scanning the leftmost 1 of the second block.

1 1 1 1 1 1 1

From there, initiate a routine that computes n ↦ 2n, ending up here:

1 1 1 1 1 1 1 1 1 1

Move back to the far left stroke, to erase one counter:

1 1 1 1 1 1 1 1 1 1

Erase the left-most 1, and initiate another sequence to add another 3 to the second block (next slide):

9
1 1 1 1 1 1 1 1 1

BUT if we carry out the n ↦ 2n routine on the right-hand block, it won’t write 3 more strokes this time. It will write 6, because that is how many strokes are currently in the second block.

The machine has no way to “remember” that the original input was just 3. It can only deal with what is in fact on the tape.

BZZT! back to the drawing board.

And so that is the challenge of the m, n ↦ m × n machine: figure out a way to repeatedly add the input n, while at the same time somehow retaining on the tape the information as to what the original input n was.

This is what the multiplication machine diagrammed and described on BJB pp. 30-31 does; the solution to the problem is quite clever.

10
I won’t go into all of the details of how the machine works:
you can extract those from the diagram p. 30 of BJB.

I’ll just give you a high-altitude picture of how it works, and then I’ll go through the most complicated subroutine in detail.

The core idea is to keep pushing the original input n farther and farther along the tape. You let the distance between the leftmost 1 of the moving block of n strokes and the rightmost 1 in the first group keep track of how much has been added.

Here is what successive cycles will look like. (Again illustrating with 4 × 3.)

Opening configuration:

1 1 1 1 1 1 1

11
After 1 cycle, the first counter has been erased, then the right-hand block of 3 1’s has been pushed three squares down the tape, and the head has moved right to begin another cycle:

1 1 1 1 1 1

After 2 cycles, the second counter has been erased, then the block of 3 strokes has been shifted 6 squares down, and the head is moving right to begin another cycle:

1 1 1 1 1

12
After 3 cycles, the third counter has been erased and then
the block of 3 strokes has been shifted 9 squares down.

1 1 1 1

As the head moved back right, it detected that there is only one counter left, and so it transitions to a “cleanup” routine, to change 9 blanks to strokes:

First it moves one step right, to erase the last remaining counter:

1 1 1 1

13
When it erases the last remaining counter, it moves right
one square, to the square that originally separated the
two blocks of strokes. That square is supposed to remain
blank:

1 1 1

14
After the machine registers (by changing states) that it
has passed a square that is supposed to stay blank, it
continues to move right, writing 1 on every blank it en-
counters, until it encounters a 1. Once it encounters a 1,
it just moves to the left-most 1, and halts.

1 1 1

1 1 1 1

1 1 1 1 1 1 1 1 1 1 1 1

qhalt

1 1 1 1 1 1 1 1 1 1 1 1

15
To construct the machine, there are three different mod-
ules that need to be put together.

One of them cleans up by writing the 1’s over blanks at the end, and then halts in the right place.

One of them implements the procedure of erasing the counters and entering the “shift the right block over” subroutine.

And finally, we need the subroutine that actually does the shifting of the block of n strokes precisely n squares down the tape.

The last of these is the only one that is at all tricky, so I’ll just deal with that one.

16
OK, so let’s look at the beginning of one instance of the
subroutine, in the calculation of 4 × 3.

We take up the calculation after 2 cycles: the block of 3 strokes has been shifted 6 squares down, the third counter has been erased, and the head has moved right to begin another cycle:

1 1 1 1

17
The reason you need a bit of finesse here is that you need to figure out a way to move the strokes down the tape, in such a way that the machine can tell, just from the configuration of the tape, assisted with some fixed finite number of states:

a) when a stroke still needs to be moved

b) where a stroke should go when it is moved and

c) when all the strokes have been moved.

(And remember, the “fixed finite number of states” needs to suffice for a routine that works for arbitrary n, not just for n = 3.)

18
Here is the plan, called the “Leapfrog routine” in BJB
p.31.

The strokes are moved, beginning with the leftmost in the block and moving right, according to the arrows in this diagram:

1 1 1 1

So long as there is a gap between the newly written 1’s and some 1’s still to be moved, the machine knows it has to keep executing the leapfrog subroutine.

When the machine detects that only one stroke is on the left of that gap, it knows that it’s time to begin wrapping up the subroutine.

Now I’ll go through the leapfrog routine, to make explicit what the machine instructions have to look like in full detail.

I’ll use the same state names as in the flow diagram of BJB p. 30, to facilitate comparisons, and to make it easier to eventually put all the modules together.

19
Some pictures, illustrating with the case of moving 4
strokes. We join the computation with the read/write
head on the far left stroke of the right-most block, in
state q4:
q4

1 1 1 1

The machine erases the stroke, enters a state q5 that says “I erased the first stroke in the block”, and, upon reading the blank it just wrote, moves right one square and goes into q6. (q41Bq5; q5BRq6)
q5

1 1 1

q6

1 1 1

20
If, in q6, the machine were to read another blank, that
would tell it that the whole block had been erased, so it
should write a 1 and exit the leapfrog routine. I look at
that path beginning on slide 27.

But in this case the machine sees a 1, so it knows the routine isn’t done yet, so it goes into q7 and moves right.

In q7 it keeps moving right so long as it is reading 1’s, until it reads a blank. (q61Rq7; q71Rq7)

q6 /q7

1 1 1

q7

1 1 1

21
q7

1 1 1

Reading a blank in q7, it will move right one more square and go into q8. (q7BRq8)

q7 /q8

1 1 1

This is a minor choice point, though not one that leaves the subroutine. It depends on whether or not, in q8, the machine reads a 1.

I’ll first consider what happens when, as now, the machine reads a blank. Then, on slide 24, I’ll note what happens when it reads a 1.

22
Reading a blank in q8, the machine writes a 1, and goes into q10. (q8B1q10)

q8 /q10

1 1 1 1

In q10 if it reads a 1, it goes left and stays in q10.

In later cycles, the machine will move left through more strokes, remaining in q10, but in this case the machine encounters a blank right away. The machine moves left again and goes into q11. (q101Lq10; q10BLq11)

q10 q10 /q11

1 1 1 1 1 1 1 1

23
I’ll pause to explain the minor divergence I mentioned on
slide 22.

After the machine has completed a full cycle, the machine will be reading a 1, rather than a blank, when it goes into q8.

q7 /q8 q8

1 1 1 1 1 1

Here, the head needs to move right until it finds a blank, and then write a 1. So we interpolate q9 as a “move right until you encounter a blank” state, before writing a 1 on the first blank it finds, and going into q10. (q81Rq9; q91Rq9; q9B1q10)

q8 /q9 q9 /q10

1 1 1 1 1 1 1

Once the machine is in q10 scanning a 1, it returns to the main line of the subroutine.

24
Where we left off (on slide 23), the machine had just gone
from q10 to q11 and moved left. I’ll return to that point.

The head is now reading a 1. Now the machine moves left through 1’s until it encounters a blank. When it encounters a blank, it moves right, goes into q4, and starts another cycle. (q111Lq11; q11BRq4)

That is, the machine will repeat the cycle of erasing a stroke from the group on the left of the gap, and writing a stroke at the end of the group on the right.

q11 q11 /q4

1 1 1 1 1 1 1 1

25
Here’s a fast-forwarded diagram version of the next cy-
cle, to put us in a position to follow the steps to exit the
leapfrog routine.

q4 /q5 q5 /q6 q6 /q7

1 1 1 1 1 1 1 1 1

q7 /q8 q8 /q10 q10

1 1 1 1 1 1 1 1 1 1 1

q11 /q4 q4

1 1 1 1 1 1 1 1

And the cycle starts again.

26
Skipping ahead: after one more full cycle, we come to the choice-point that will lead to exiting the leapfrog routine.

As before, in state q4 we erase the 1 and move to q5.

In q5, the machine reads the blank it has just written, moves right, and goes into q6. But this time, instead of reading a 1, the machine reads a blank in q6.

That is a sign that the machine has erased the last of the
counters for the leapfrog subroutine, so it’s time to write
a final 1 (on the square above which the machine head
is currently resting) and move to q12 to exit the routine.
(q6B1q12)

q4 q4 /q5 q5 /q6

1 1 1 1 1 1 1 1 1 1

q6 q6 /q12

1 1 1 1 1 1 1

27
So let’s pull the camera back to get a wider view of the
tape.

In this example, we have a block of 4 1’s on the right, and a block of 3 counters on the left. (In this specific configuration, the machine has gone through two cycles of the calculation of 5 × 4 and has erased two strokes from the left block, which was originally 5 strokes.)

q12

1 1 1 1 1 1 1

Now what we need to do is go back to the far left block, so as to erase another counter, and proceed either to another leapfrog routine or to the cleanup phase, depending on how many counters are left.

So we just add some states that move the read/write head left, while keeping track of changes from strokes to blanks and blanks to strokes. (q121Lq13; q13BLq13; q131Lq14; q141Lq14; q14BRq1)

28
The effect of the transitions q121Lq13; q13BLq13; q131Lq14; q141Lq14; q14BRq1 in pictures:

q12 /q13

1 1 1 1 1 1 1

q13

1 1 1 1 1 1 1

q13 /q14

1 1 1 1 1 1 1

q14

1 1 1 1 1 1 1

q14 /q1

1 1 1 1 1 1 1

q1

1 1 1 1 1 1 1
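Collecting the leapfrog quadruples from the last several slides (q4 through q11, plus the q6-reads-blank exit into q12), we can sanity-check the subroutine with a small simulator. This is my own sketch, not code from BJB; the return states q12–q14 are left out, so the run stops as soon as the shift is complete.

```python
# A machine is a list of quadruples (state, scanned, action, next);
# the action is 'L', 'R', or a symbol to write. 'B' is the blank.

def run(quads, tape, pos, state, max_steps=10_000):
    prog = {(st, sym): (act, nxt) for st, sym, act, nxt in quads}
    for _ in range(max_steps):
        sym = tape.get(pos, "B")
        if (state, sym) not in prog:          # no instruction: machine halts
            return state, pos, sorted(p for p in tape if tape[p] == "1")
        act, state = prog[(state, sym)]
        if act == "L":
            pos -= 1
        elif act == "R":
            pos += 1
        else:
            tape[pos] = act
    return None                               # still running after max_steps

leapfrog = [("q4", "1", "B", "q5"), ("q5", "B", "R", "q6"),
            ("q6", "1", "R", "q7"), ("q6", "B", "1", "q12"),
            ("q7", "1", "R", "q7"), ("q7", "B", "R", "q8"),
            ("q8", "B", "1", "q10"), ("q8", "1", "R", "q9"),
            ("q9", "1", "R", "q9"), ("q9", "B", "1", "q10"),
            ("q10", "1", "L", "q10"), ("q10", "B", "L", "q11"),
            ("q11", "1", "L", "q11"), ("q11", "B", "R", "q4")]

# A block of three strokes at squares 0..2 ends up at 3..5, with the
# head in q12 on its leftmost stroke -- shifted exactly n = 3 squares:
print(run(leapfrog, {0: "1", 1: "1", 2: "1"}, 0, "q4"))
# ('q12', 3, [3, 4, 5])
```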

29
I’ll leave it as an exercise for you to work through the BJB p. 30 flowchart to work out:

a) how the machine uses the first block of m strokes as counters, so that you perform the leapfrog subroutine m − 1 times (states q1, q2, q3);

b) the routine that kicks in when the last of the counters is erased, which cleans up at the end by instructing the machine to keep moving right and writing 1’s in the blank spaces it has created, then getting into the desired halting configuration (states q2, q15, q16, q17, q18).

Both of them are quite straightforward.

30
3 Computing Functions and Standard Configurations

Turing machines will be used to compute functions from numbers (or tuples of them) to numbers (or tuples of them).

These will often be partial functions (i.e. undefined for one or more arguments). If the function is defined for every argument, it’s called total.

For this, we need a canonical way to match up these functions with the machines that we’ll take to compute them.

31
Canonical representation of f : N × N × · · · × N → N (k factors of N):

1. Arguments m1, m2, . . . , mk are represented by strokes in monadic notation, each separated from its neighbors by a single blank, on an otherwise blank tape.

(You need to do some trickery to represent 0 as an argument, but I won’t address that right now. It adds an additional complication I’d prefer to avoid for the moment, to keep things simpler.)

2. The machine begins in state 1 (I will sometimes use 0, because it’s hard to unlearn old habits), scanning the leftmost 1 on the tape.

A configuration satisfying these two conditions is called a standard initial configuration.
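As a quick illustration (a sketch of mine, not part of BJB’s formalism), here is the standard initial layout in code, with 'B' marking a blank square and the head understood to start at index 0, the leftmost stroke. Positive arguments only, since representing 0 is deferred:

```python
def standard_initial_tape(*args):
    assert all(m >= 1 for m in args)
    return list("B".join("1" * m for m in args))

print(standard_initial_tape(3, 2))   # ['1','1','1','B','1','1']
```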

32
3. If f (m1, . . . , mk) = n, then the machine representing f will eventually halt scanning the leftmost 1 of a tape consisting of n consecutive strokes, otherwise blank.

This is called a “standard final configuration”.

4. If the machine is to assign no value at all to the inputs (i.e. if f (m1, . . . , mk) is undefined) then the machine will either never halt, or halt in some configuration that is not a standard final configuration.

Notation: when we want to say that f (m1, . . . , mk) is undefined, we will sometimes write it as:

f (m1, . . . , mk) = ↑.

If we have occasion to say that f (m1, . . . , mk) is defined, but don’t know or need to give the numerical value, we may write it as:

f (m1, . . . , mk) = ↓.

33
A few trivial-looking Turing computable functions.

I’ll just draw your attention to a couple of functions that BJB describe Turing machines for on p. 32.

The identity function: id(n) = n

The k-place empty function: e(m1, . . . , mk) = ↑ for every (m1, . . . , mk).

The k-place constant 1 function: c(m1, . . . , mk) = 1 for every (m1, . . . , mk).

It is worth filing these away in memory; they can be handy as building blocks for more complex Turing machines, and they will prove to be important in connection with recursive functions.

34
You might ask: how can we hope to get machines to com-
pute more complicated functions, given how hard it was
just to get apparently elementary operations like addition
and multiplication?

The point is that once we have built some simple machines, we will be able to get more complicated ones by combining the simple ones in straightforward ways.

It will turn out that it is easier to address some of these cases with a change of framework, to the topic of recursive functions, a bit later in the course.

But I’ll say a few orienting words here.

35
It’s important to appreciate how the multiplication ma-
chine described in the last lecture (and BJB p. 28-29)
was obtained. This is a general pattern.

We had previously described a procedure for doubling the input m, which we could see as the procedure m ↦ m + m. This involved repeating an “add two” routine, indexed by the strokes in the input m as counters.

The multiplication machine nested a modification of the m ↦ m + m within a loop indexed by n, to have the effect of repeating the m ↦ m + m procedure n − 1 times (and retaining the original block of m strokes, though moved down the tape).

By setting up a nested loop, indexed by counters, we were essentially defining multiplication by recursion from addition.

36
More generally, with the nesting technique, we can get a natural hierarchy of recursively defined functions by beginning with successor (i.e. +1):

Addition in terms of +1:
x + 0 = x; x + (y + 1) = (x + y) + 1

Multiplication in terms of addition:
x × 1 = x; x × (y + 1) = (x × y) + x

Exponentiation in terms of multiplication:
x^1 = x; x^(y+1) = x^y × x

On the same pattern, we can define “superexponentiation” (sometimes called “tetration”) in terms of exponentiation, and the process can be continued further still.
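Here is the ladder written out as literal recursions (a sketch; Python is used just to make the equations executable). Each level calls only the level below it, exactly as in the defining equations:

```python
# Defined for y >= 1 where the base case is y = 1, as in the equations.
def add(x, y):            # x + 0 = x;  x + (y+1) = (x + y) + 1
    return x if y == 0 else add(x, y - 1) + 1

def mul(x, y):            # x * 1 = x;  x * (y+1) = (x * y) + x
    return x if y == 1 else add(mul(x, y - 1), x)

def power(x, y):          # x^1 = x;  x^(y+1) = x^y * x
    return x if y == 1 else mul(power(x, y - 1), x)

def tower(x, y):          # superexponentiation: a tower of y x's
    return x if y == 1 else power(x, tower(x, y - 1))

print(add(2, 3), mul(2, 3), power(2, 3), tower(2, 3))   # 5 6 8 16
```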

37
4 The Turing (aka Church-Turing) Thesis

We say that a numerical function of k arguments is Turing computable if it is computed by a Turing machine satisfying 1) - 4).

Turing computable functions are clearly effectively computable.

Turing’s thesis is the converse: that all effectively computable functions are Turing computable.

Sometimes Turing’s thesis is blended with another called “Church’s thesis” that we’ll see in a few weeks. Church’s thesis is that all effectively computable functions are recursive.

Since (as we’ll see) the recursive functions are exactly the
Turing computable ones, people sometimes speak of the
“Church - Turing thesis”.

38
When you are trying to devise a Turing machine, it is of-
ten useful in the early stages to think in informal terms,
trusting the Church-Turing thesis.

Ask yourself “Could I get a machine of any type to do this?”

Simple example: Say I have two Turing machines M1 and M2. I want a Turing machine that will run M1 if input an even number and M2 if input an odd number.

Is there a Turing machine that can do this? Since it is clear that this is something that can be accomplished mechanically, Turing’s thesis says there must be a Turing machine that can do it.
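Informally, the machine we are promised behaves like this two-line sketch, where run_M1 and run_M2 are hypothetical stand-ins for the two given machines:

```python
def dispatch(n, run_M1, run_M2):
    return run_M1(n) if n % 2 == 0 else run_M2(n)
```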

39
Note though that such appeals to the Turing thesis are
usually just for orientation.

You aren’t allowed to appeal to Turing’s thesis for a problem set / midterm problem unless the question explicitly says you can!

40
Turing’s thesis is not the sort of thing that can be proven.

The idea of “effectively computable” is an informally explained idea of “computable by a step by step procedure with explicit instructions”. The idea has the vagueness that informal ideas typically have.

“Turing computable” is a precisely defined concept - the thesis is that the informal idea of effectively computable is captured by the precise one.

41
Now it is in principle possible that someone could find
a function that is intuitively effectively computable, but
which is not Turing computable.

If there were such an example, it would show that Turing’s thesis is false.

But all of the candidate counter-examples that people have proposed have been unconvincing.

42
The fact that 70 years of research has passed without
any plausible counterexamples, or even suggestions that
look like they might be fleshed out into counter-examples,
gives some evidence for the truth of the thesis.

Though by itself, that evidence isn’t overwhelming. Perhaps we just have certain systematic blind spots that hinder us from discovering potential counter-examples.

Better evidence is the fact we’ll learn in the coming weeks, that there are many other plausible, independent analyses of “effectively computable”, and they all turn out to capture exactly the same class of functions as the Turing computable ones!

Another argument, a rather subtle one, turns on the fact that the Turing computable / recursive functions are in a certain way a naturally stable and robust class of functions, in a way that I’ll explain a few lectures from now.

For the rest of this lecture and the next, I’ll concentrate less on showing what can be done with Turing machines and switch to the question of ascertaining what can’t be done.

43
5 Effective Coding and the Diagonal Function

5.1 Effective Coding and the Enumerability of Turing Machines

In this lecture, I’ll shift emphasis from what can be done with Turing machines toward the question of ascertaining what can’t be done.

We can easily establish, just by a counting argument, that there are going to be functions from N to N that can’t be computed by Turing machines.

Turing machine descriptions are finite strings of symbols in a countable alphabet.

(Not a finite alphabet, because we need to have countably many names for states q0, q1, . . . , qn, . . .)

We’ve shown that we can code finite sequences in a countable alphabet into natural numbers, so we can code up Turing machines into natural numbers, with distinct natural numbers for distinct machines.

44
So: There are only countably many Turing machines.

But there are uncountably many functions from N to N.

Hence there is at least one function from N to N that is not Turing computable.

(In fact, there are uncountably many such functions.)

But this is nonconstructive, just an existence proof. We would like to have an example of such a function. First let’s look at a somewhat artificial one.

45
First we need to code up the Turing machines. It isn’t
so crucial that we know what the actual coding scheme
is, though you might want to look at BJB p. 36 to get a
sense of what one such coding looks like.

All we need to know for our purposes is:

1. We can assign natural numbers to Turing machines so that there is a list M1, M2, M3, . . . , Mn, . . . that contains every possible Turing machine. (And with each TM description appearing only once in the list.)

2. The coding is itself effective, meaning that there is an effective procedure to produce the number given a description of the Turing machine and an effective procedure to get a description of the Turing machine from the code.

(Assuming the Church-Turing Thesis, this means that there is a Turing machine that can code up machine descriptions into codes and produce machine descriptions given codes.)
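To make assumption 2 concrete, here is one workable coding (a sketch, deliberately not the one on BJB p. 36): number the parts of each quadruple, flatten, and pack the list into a single integer as prime exponents. Distinct machines get distinct codes, and factoring recovers the description, so both directions are effective.

```python
def primes():
    n, found = 2, []
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def code(flat_numbers):
    n, gen = 1, primes()
    for x in flat_numbers:
        n *= next(gen) ** (x + 1)     # +1 keeps every exponent positive
    return n

# q11Rq1, q1BRq1 (a move-right-forever machine), with 1/0 for the
# scanned symbols and 3 standing for "move right":
print(code([1, 1, 3, 1, 1, 0, 3, 1]))   # one (large) code number
```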

46
5.2 Coding and The Universal Turing Machine

A key fact about the effective enumeration is that there are a wide variety of operations that can be mechanically performed on Turing machines themselves that can be duplicated by Turing machines operating on codes of Turing machines.

Here is what I mean. One of the simplest operations you can perform on Turing machines is composition.

Say we have two Turing machines Mj and Mi, both of which compute functions of one variable, fj and fi.

The composition of Mj and Mi is just the machine Mi ◦ Mj that takes input m, feeds it into Mj, takes the output of that computation, and feeds it as input to Mi, producing fi(fj (m)) as output.
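At the level of descriptions, composition is mostly bookkeeping. Here is a sketch, under the simplifying assumption that each machine signals halting by entering an explicitly named state (here "qhalt"); quadruples are (state, scanned, action, next):

```python
# Rename the two state sets apart, then rewire M_j's halting state to
# M_i's start state, so control falls through from M_j into M_i.
def compose(quads_j, quads_i, halt_j="qhalt", start_i="q1"):
    out = []
    for st, sym, act, nxt in quads_j:
        nxt = start_i + ".i" if nxt == halt_j else nxt + ".j"
        out.append((st + ".j", sym, act, nxt))
    for st, sym, act, nxt in quads_i:
        out.append((st + ".i", sym, act, nxt + ".i"))
    return out          # start the composite machine in "q1.j"
```

A composition machine would carry out this same renaming on codes of descriptions rather than on the descriptions themselves, which is the point made next.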

47
Can you carry out “machine composition” mechanically?

Sure, in principle.

You can imagine a big machine with mechanical hands taking the machines, setting them side by side, and attaching some device to feed the output of Mj into Mi. Or some such thing.

But isn’t that a limitation on the Church-Turing thesis?

There is in principle a mechanical way to compose Turing machines, but that’s not the sort of thing a Turing machine can do, right?

Turing machines don’t have mechanical hands; they don’t physically move objects around...

48
True enough, but Turing machines can operate on Turing
machine codes to produce other Turing machine codes.

Continuing with our example, there is a Turing machine M◦ that takes j and i as input, and outputs the numerical code of the Turing machine Mi ◦ Mj.

I’ll often speak of mechanical operations on Turing machines themselves when strictly speaking I am describing mechanical operations on codes, as performed for example by M◦.

(Aside: “M◦” is not a standard name for the composition machine; I’m just using it as a label here.)

49
Another crucial operation a Turing machine can perform
is the function of what is called the Universal Turing Ma-
chine.

This machine takes inputs l and n, and computes the output of Ml on input n.

So when I say something like “take m and l, then feed input m to machine Ml” I’m describing something that can be done effectively, using the Universal Turing Machine.

50
Let us write fn for the function computed by the machine Mn.

It is important to bear in mind that we code up descriptions of machines, not functions.

We individuate functions only by their values on the given arguments, not by their definitions, or by the Turing machines that compute them:

f = g ⇔def ∀x1∀x2 . . . ∀xk f (x1, . . . , xk) = g(x1, . . . , xk)

This means that for any function fj computed by a Turing machine Mj there will be infinitely many other Turing machines Ml that also compute fj.

51
This is easy to see:

Let M* be a (pointless) Turing machine that does nothing but change whatever it sees on the square it is reading, then change it back. (q1B1q2, q21Bqhalt, q11Bq2, q2B1qhalt)

If fj is computed by Mj, then it will also be computed by Mj ◦ M*, (Mj ◦ M*) ◦ M*, ((Mj ◦ M*) ◦ M*) ◦ M*, and so on.

Each of these is a different machine with a different code.

In other words, for infinitely many i, fj = fi.

52
5.3 The Diagonal Function

We can get our first example of a non-Turing-computable function by adapting the diagonal technique that we encountered when studying enumerability and non-enumerability.

Remember how that worked: we produced something that couldn’t be on a given list by ensuring that it would have to differ from each entry on the list.

That’s what we do with this definition. We define the diagonal function:

d(x) = 2 if fx(x)↓ and fx(x) = 1
d(x) = 1 otherwise

53
After the diagonal arguments we’ve already seen, the ar-
gument that d is not Turing computable is familiar.

I’ll rewrite the definition so we have it on this page:

d(x) = 2 if fx(x)↓ and fx(x) = 1
d(x) = 1 otherwise

If d were Turing computable, then it would be on the list of Turing computable functions: d = fk for some k. Apply d to its own index:

d(k) = fk(k) = 2 if fk(k)↓ and fk(k) = 1; 1 otherwise

This is inconsistent. By its definition, d is always defined, and its value is 1 or 2.

But when applied to k, it cannot be 1, because then, by its definition, it would be 2. And it cannot be 2, since it would then, by its definition, be 1.

54
That gives one example of an uncomputable function, but
it’s a bit abstract and theoretical.

Even perhaps a bit gimmicky. The only interesting thing about the diagonal function is that it is uncomputable.

How can we come up with an uncomputable function that is at least in the ballpark of something we might be independently interested in knowing?

That brings us to the Halting Problem.

55
First note that it is possible for a Turing machine, given
an input, to never halt at all.

Before you set the machine to work on such an input, it would be nice to have advance warning:

“The machine won’t halt on this input. Don’t try it, it’s a waste of time.”

Is it possible to devise a machine to give such an advance warning?

56
For certain machines and inputs it’s clear that the ma-
chine won’t halt on that input.

For example, take the machine with this description:

q11Rq1, q1BRq1.

On any input, this machine will just move right forever. Or, if it were a real machine, it would move right until it wore out.

But you can’t always count on the conclusion being that obvious.

57
Is there a machine that will take a code c and a number k, and:

a) halt and print 1 if Mc halts on input k,

b) halt and print 2 if Mc doesn’t halt on input k?

Answer: nope! Let’s see why.

58
6 The Halting Problem

6.1 Framing

The problem: Given an effective enumeration of the Turing machines M1, M2, M3, . . . , Mn, . . . produce a machine Mh such that:

i) if Mh is started on the leftmost stroke with m and n strokes on an otherwise blank tape, then Mh will halt scanning a 1 on an otherwise blank tape if Mm halts on input n, and

ii) Mh halts scanning the leftmost stroke of 2 consecutive strokes on an otherwise blank tape if Mm doesn’t halt on input n.

This problem is unsolvable. Let’s see how to prove that.

59
When you consider different functions and ask if they can
be effectively computed, you often find that the main ob-
stacle is knowing whether or not some procedure will halt.

Try the thought experiment: How could I try to compute the diagonal function? Where would my efforts break down?

Recall the definition:

d(x) = 2 if fx(x)↓ and fx(x) = 1
d(x) = 1 otherwise

60
Procedure: Take the input you are given — say that it is
l — and find the description of Ml .

That can be done effectively. After a finite time, you’ll have the description in hand.

Put l strokes on the tape and start Ml scanning the leftmost 1. If Ml halts on this input, you will know that in a finite time - i.e. after a finite number of steps - when the machine finishes.

If the machine is in standard final configuration scanning a single 1, then write down 2. If the machine is not in a standard final configuration, or it is in one but scanning the leftmost of more than 1 stroke, write 1.

Those are all things that are intuitively effectively computable: you could construct a machine to do these things.
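The whole procedure, as code, would look like the sketch below. Everything in it is computable except the guard: halts is a hypothetical oracle (the very thing at issue), and universal stands for the Universal Turing Machine of section 5.2.

```python
def halts(l, x):        # hypothetical oracle -- the very thing at issue
    raise NotImplementedError

def universal(l, x):    # the Universal Turing Machine of section 5.2;
    ...                 # computable, details omitted in this sketch

def d(l):
    if halts(l, l):                     # the step no machine supplies
        return 2 if universal(l, l) == 1 else 1
    return 1                            # the case "sit and wait" never settles
```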

61
But there is one case left over: what do you do if the
machine never halts?

This is a tough one: Say that you have been sitting for a while, waiting for the machine to halt.

You leave it for a while. Read a chapter of a book, eat some dinner. Come back and the machine is still running.

Next morning: still running.

But you don’t know that it will never halt, just that it
hasn’t halted yet.

There is this asymmetry in what we can know: if it does halt, we will know that in a finite time.

But if it doesn’t halt, and our only strategy is the “sit and
wait and watch” strategy, at no point in the computation
will we be able to decide between i) the machine will never
halt, or ii) it just hasn’t halted yet, but it will.

62
Another way to cast the point is this:

Here is a Turing machine we can construct, and the previous discussion tells us why (if we assume the Church-Turing thesis):

M*(x) = 1 if Mx halts on input x
M*(x) = ↑ if Mx doesn’t halt on input x

The problem is to replace the ↑ with 2. We can only do that if we can examine the description and look ahead to see if the computation finishes.

Sooooo... let’s prove we can’t, via a uniform, effective procedure, look ahead to see if the computation finishes.

63
The halting function is a two-variable reframing of the diagonal function in terms of machines:

h(x, y) = 1 if Mx halts on input y
h(x, y) = 2 otherwise

We want to show that there can be no machine Mh that computes this.

64
The argument is quite straightforward, but I’m going to
go into some details, because this argument reveals one
of the basic techniques for proving that Turing machine
M ∗ satisfying a certain description is impossible:

Take some other machine M that you know to be impossible, and show that if you could build M*, you could also build M.

(Also, it will be useful orientation for the problem set to get some more exposure to the nuts and bolts of building Turing machines.)

First we produce a function ϕ that cannot be computed, and then we show that if we could construct Mh, we could build a machine to compute ϕ.

65
Since we already know that the diagonal function is un-
computable, I’ll show that if we could solve the halting
problem, we could compute (a variation on) the diagonal
function.

We could construct a “non-self-halting machine” that halts and prints 2 if Mk doesn’t halt on k as input, and that doesn’t halt at all if Mk does halt on k as input.

66
OK, so let’s assume that we can build the machine Mh that computes:

h(x, y) = 1 if Mx halts on input y
h(x, y) = 2 otherwise

We can build two additional machines that we can use to build the non-self-halting machine.

One is what BJB call a “dithering machine”, which never halts on input 1 and always halts for any other input.

One is a “copying machine”, which takes n consecutive strokes on an otherwise blank tape and returns two blocks of n consecutive strokes on an otherwise blank tape, separated by a single blank.

67
The “dithering machine” is pretty trivial to build.

In q1, reading a 1, move right and go into q2. In q2, reading a 1, move left and halt in q3. (The machine will be in a standard final configuration.)

In q2, reading a blank (which tells you that the preceding 1 is the only 1 on the tape, if the machine was started in a standard initial configuration), move left and go into state q1.

(An unending cycle of move one square right - move one square left - move one square right - … will ensue.)

In canonical form: q11Rq2, q21Lq3, q2BLq1.
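In the quadruple-list format of the simulator sketched back in section 2, this machine is just:

```python
dither = [("q1", "1", "R", "q2"),
          ("q2", "1", "L", "q3"),
          ("q2", "B", "L", "q1")]
# run(dither, {0: "1"}, 0, "q1") hits the step limit (input 1 dithers);
# run(dither, {0: "1", 1: "1"}, 0, "q1") halts in q3 on the leftmost stroke.
```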

68
6.2 The Copying Machine

The copying machine is a bit more challenging, but not so bad, and the idea is straightforward.

As with the doubling machine, you need to use the initial strokes as counters, and erase them as you go. The two blocks you end up with will be written by the machine itself.

I’ll mention in passing that you will need to use a version of this “use input as counters” strategy to solve one of the problems on the problem set, so pay attention!

Here is the graphic version, looking at the representative case of input 4.

69
There is a first “set up” cycle you go through, to place a
couple of counters on the right. In q1, reading a 1, move
right until you hit a blank, move right and go into q2
(q11Rq1; q1BRq2)

q1 q2

1 1 1 1 1 1 1 1

Now the machine writes a 1, moves past a blank, and writes another 1, changing states as it goes to keep track of what it has done. The machine will end up in q4 reading a stroke (that the machine has just printed) and it will move left.
(q2B1q2; q21Rq3; q3BRq4; q4B1q4; q41Lq5)
q2

1 1 1 1 1

q4 /q5

1 1 1 1 1 1

70
As the machine cruises left, it passes a blank, then a 1,
then a blank, and a bunch of 1’s, changing states at ap-
propriate points, to keep track of the fascinating things
it has seen. (q5BLq5; q51Lq6; q6BLq7; q71Lq7; q7BRq8)

When it hits the first blank on the far left, it moves right
once, and begins a cycle that it will repeat until it has
erased the counters. (Now it has left the “set up” phase
once it has laid down the two placeholders on the right.)

After all the state-changing, it’s in q8 reading a 1.


q8

1 1 1 1 1 1

Erase the 1, move right into q9. (q81Bq8; q8BRq9)


q9

1 1 1 1 1

71
If the head were reading a blank in q9, it would know it had erased the last of the counters. It would then just move right and halt as soon as it encountered the first 1, so it would be in standard final configuration. (q9BRq*; q*BRq*; q*11qhalt)

But in this case it isn’t reading a blank, so it goes right until it does find a blank. When it does, it goes into q10 and keeps moving right until it hits the second blank. (q91Rq9; q9BRq10; q101Rq10)

At that point the situation looks like this:


q10

1 1 1 1 1

72
Now we have a delicate move. We want to keep shoving
the second block down to create room to expand the first
by one, and also expand the second by one. So we write
a 1 in the blank that the read/write head is occupying,
and go into q11:

q10 /q11

1 1 1 1 1 1

This adds 1 to the first block, all right, but at the cost
of eliminating the blank separating the two blocks. So
we need to erase the leftmost 1 of the second block, and
then move right and write two strokes on the first two
available blanks. (One stroke to replace the stroke we
erased, and one stroke because we want to expand the
second block by 1).
Here are the state-transitions for this:

(q10B1q11; q111Rq12; q121Bq12; q12BRq13; q131Rq13; q13B1q14; q141Rq14; q14B1q15)

[Pictures next page]

73
q11 /q12

1 1 1 1 1 1

q12

1 1 1 1 1 1/0

q12 /q13

1 1 1 1 1

q13

1 1 1 1 1 0/1

q13 /q14

1 1 1 1 1 1

q14 /q15

1 1 1 1 1 1 0/1

74
Note how this works: you write an additional 1 on the
first new block, which means (danger!) that there is no
longer a blank between the two blocks.

So you erase the 1 that used to be the leftmost of the second block, and write two 1’s. One of the 1’s is to replace the one that you erased (but you’ve “moved that 1 to the right”) and the second is to add the 1 that you need to add.

You are constantly pushing the second block to the right, to create room for the expanding first block.

75
And now it’s just a mad dash to the left, to find the first
blank after the leftmost block.

There is a switch to transition to a clean-up phase when there is only one stroke left in the far left block. (After the machine reads the first stroke in the far left block, it is alert to see if the next square is a blank. If so, there is just the one counter remaining in the left block, and it will just erase the last counter and move right to halt.)

q151Lq15; q15BLq16; q161Lq16; q16BLq17; q171Lq18;
q18BRqcleanup; q181Lq19; q191Lq19; q19BRq8

Pictures next page.

76
q15

1 1 1 1 1 1 1

q15 /q16

1 1 1 1 1 1 1

q16

1 1 1 1 1 1 1

q16 /q17

1 1 1 1 1 1 1

q17 /q18

1 1 1 1 1 1 1

q18 /q19

1 1 1 1 1 1 1

q19 /q8

1 1 1 1 1 1 1

77
Now that you’ve completed a cycle, go back to the far
left, erase a counter, and run the cycle again!

[Pictures next page - some fast-forwarding applied]

78
q8

1 1 1 1 1 1 1

q8

1 1 1 1 1 1

q11

1 1 1 1 1 1

q12

1 1 1 1 1 1 1

q13

1 1 1 1 1 1 1

q13

1 1 1 1 1 1

q13

1 1 1 1 1 1

79
q14

1 1 1 1 1 1

q14

1 1 1 1 1 1 1

q15

1 1 1 1 1 1 1

q15

1 1 1 1 1 1 1 1

You’ve added one more to each block, and shifted the rightmost over to make room for the growing middle one.

After another completed cycle, you are here:
q15

1 1 1 1 1 1 1 1 1

At this point the blocks on the right are what you want them to be, and the machine’s next steps are to finish off the sequence: move left and erase the next counter.

80
In this case the machine detected that it is the last counter, so instead of starting another cycle, it moves right one square and halts.
q15

1 1 1 1 1 1 1 1 1

qcleanup

1/0 1 1 1 1 1 1 1 1

qcleanup

1 1 1 1 1 1 1 1

qcleanup /qhalt

1 1 1 1 1 1 1 1

qhalt

1 1 1 1 1 1 1 1

81
6.3 An impossible machine

So we have the dithering machine, and we have the copying machine.

Say we also had a machine Mh that computed the halting function h.

Then we could construct an impossible machine.

First combine Mh and the copying machine Mc, so that the input of a single block of n strokes would be duplicated and fed to Mh.

Combining two machines is a simple process of adding transitions from the halting state of Mc to the start state of Mh.

(You can see one reason why we insisted on the standard initial and final configurations as defined. You want to be able to feed the output from one machine directly into a second machine as input. This would be much more complicated if we hadn’t set things up as we have.)

82
The machine Mh ◦Mc is a “self-halting machine”: it halts
and prints 1 on input k if the one-variable machine Mk
halts on input k. It halts and prints 2 if Mk doesn’t halt
on input k.

Now we take the dithering machine Md, and splice it to the self-halting machine:

Md ◦ Mh ◦ Mc. This takes the output from Mh ◦ Mc applied to x and acts this way: if the output of Mh ◦ Mc is 1, it will go on forever. If the output is 2, it will halt in a standard final configuration.

In other words: it doesn’t halt on input x iff Mx does halt on x.
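In program form, the impossible construction looks like this sketch, where halts is the assumed (and, as we are about to conclude, impossible) decider, returning 1 or 2 as above:

```python
def halts(x, y):              # assumed decider, for reductio: 1 or 2
    ...

def non_self_halter(x):       # M_d ∘ M_h ∘ M_c in program form
    if halts(x, x) == 1:      # M_c copies x into the pair (x, x)
        while True:           # M_d dithers forever on output 1
            pass
    return 2                  # and halts promptly on output 2

# If non_self_halter is machine number k, then non_self_halter(k)
# halts iff halts(k, k) == 2, i.e. iff it does not halt: contradiction.
```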

83
Since Mc, Md, and (by assumption) Mh all are Turing
machines, Md ◦ Mh ◦ Mc is a Turing machine as well.

Therefore it is Mk̃ for some k̃. But this is impossible, since Mk̃ would halt on input k̃ if and only if it didn’t halt.

Since this “non-self-halting” machine is impossible, that means the halting machine we (theoretically) used to construct it is impossible as well.

84
6.4 “Reducing to the halting problem”

As I noted above: this opens up a proof technique: “reduce X to the halting problem” (or more precisely: reduce the halting problem to X!)

Example: Do we have a machine to tell if two given machines compute the same function?

(Recall, as I noted above, that for any given computable function, there are infinitely many different Turing machines that compute it.)

85
Now you might think that this is the sort of thing we
could get a machine to check:

Is there a Turing machine Msame function that, when input k and l, will halt and print 1 if Ml and Mk compute the same function, and halt and print 2 if they don’t?

Answer: nope!

The easiest way to show this is to give a procedure whereby you could take such a machine Msame function and use it to build a machine that computes the halting function.

To spell this out in formal detail, full of state-transition goodness, would be more of a detour than it is worth taking at this point, so I’ll wave my hands at various points and make appeals to the Church-Turing thesis to “justify” the assertion that certain Turing machines exist.

86
Assume, for reductio, that Msame function exists.

The idea is this: given k and n, modify Mk into another machine Mk n−modified that performs just like Mk, with the possible difference that on input n, Mk n−modified doesn’t halt.

(I say “possible difference” because we don’t know whether or not Mk halts on input n. It might or it might not. But we force Mk n−modified not to halt on input n.)

First note that for any given n we can easily write down
a machine that will do the following: check to see if the
input is exactly n. If it isn’t, then return to the initial
configuration. If it is, then go into an infinite loop and
never halt.

Call the machine Mtest n

87
Mtest n looks like this:

qi1Rqi+1 for every i, 1 ≤ i ≤ n
qiBLqreturn for every i, 1 ≤ i ≤ n
qn+11Lqreturn
qn+1BLqcycle
qreturn 1Lqreturn
qreturn BRqhalt
qcycle B1qcycle
qcycle 1Bqcycle
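The same description, generated mechanically (a sketch; quadruples are (state, scanned, action, next), with 'B' for blank):

```python
def test_machine(n):
    quads = []
    for i in range(1, n + 1):
        quads.append((f"q{i}", "1", "R", f"q{i + 1}"))
        quads.append((f"q{i}", "B", "L", "qreturn"))
    quads += [(f"q{n + 1}", "1", "L", "qreturn"),
              (f"q{n + 1}", "B", "L", "qcycle"),
              ("qreturn", "1", "L", "qreturn"),
              ("qreturn", "B", "R", "qhalt"),
              ("qcycle", "B", "1", "qcycle"),
              ("qcycle", "1", "B", "qcycle")]
    return quads
```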

The above procedure for “given n, write down the Turing machine description for Mtest n” is intuitively effective, so by the Church-Turing thesis, it can be performed by a Turing machine.

(Though, of course, the Turing machine would take n and output the description coded into a number.)

88
Here is another thing that we can intuitively do effec-
tively: Given k and l, return the code of the machine
Mk ◦ Ml .

So, here is our procedure for solving the halting problem, given the machine Msame function.

Given k and n, produce Mtest n.

We can then form Mk n−modified as Mk ◦ Mtest n (run Mtest n first, then Mk).

Say that the code of Mk n−modified is l̃.

Feed k and l̃ into Msame function.

89
This will give a yes or no answer in a finite number of steps to the question “Do Mk and Ml̃ compute the same function?”

If Mk and Ml̃ compute the same function, then Mk doesn’t halt on input n, because Ml̃ doesn’t.

If Mk and Ml̃ don’t compute the same function, then Mk must halt on input n, because Ml̃ doesn’t halt on input n, and Mk and Ml̃ do exactly the same thing on all other inputs.
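Compressed into a sketch: if same_function existed, the halting function would be computable. Here compose and test_machine are the effective operations sketched earlier, while same_function, machine_from_code, and code_of are hypothetical stand-ins for the assumed decider and the coding machinery.

```python
def halts(k, n):
    # run M_test_n first, then M_k -- M_k ∘ M_test_n in the text's notation
    modified = compose(test_machine(n), machine_from_code(k))
    l_tilde = code_of(modified)
    # The two machines agree everywhere except possibly at n, where the
    # modified machine never halts; so they differ iff M_k halts on n.
    return not same_function(k, l_tilde)
```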

This gives us an effective procedure to compute the halting function, by the Church-Turing thesis.

But we have established that no such effective procedure exists, so Msame function is impossible.

90
You can use similar moves to show that there is no Tur-
ing machine Mhalt on all that will decide, given k as input,
whether or not Mk halts on all inputs.

If you had such a machine Mhalt on all you could solve the
halting problem this way:

Take k and l. Modify Mk so that on every input except l, it halts right away, but on l it behaves as before.

Say that the resulting machine is Mk̃ .

Feed k̃ into Mhalt on all.

If Mhalt on all says that Mk̃ halts on everything, then it must halt on l as input, which happens if and only if Mk halts on l as input.

If Mhalt on all says that Mk̃ doesn’t halt on everything, then it must fail to halt on l as input, since it halts on every other input. But Mk̃ fails to halt on l if and only if Mk fails to halt on l.

91
6.5 Rice’s Theorem

This section is optional - “it will not be on the exam”, as the saying goes - and I won’t be proving the central result, but it’s worth a detour to illustrate how hard it is to mechanically decide questions pertaining to the capabilities of classes of Turing machines.

We can produce Turing machines for a remarkable range of arithmetical functions - it is amazing what you can do on such a slender basis - but when we turn from calculating numerical functions to working with codes for Turing machines, everything changes.

There is a general result called Rice’s Theorem that illustrates just how few things we can do, effectively, in this domain.

92
There are different ways to state the theorem; I’ll put it in terms of functions. Say that an index set C of machine codes has this property:

If f is computed by Mk, and k ∈ C, then for every other l, if Ml computes f, then l ∈ C as well.
If f is computed by Mk, and k ∉ C, then for every other l, if Ml computes f, then l ∉ C as well.

The point of an index set is that you want to make sure that you don’t separate machines that compute the same function. That way you can regard the index set as essentially a class of functions rather than machines.

93
Say that a class C of numbers is decidable if and only if there is a Turing machine that i) halts and prints 1 on input k if k ∈ C, and ii) halts and prints 2 on input k if k ∉ C.

Rice’s theorem states that the only decidable index sets of numbers are the empty set ∅ and the set N of all natural numbers.

As we have already seen, there is no effective procedure to decide if Ml and Mk compute the same function, so from the point of view of Turing computability, requiring a class of numbers to be an index set is a powerful constraint.

Rice’s theorem tells you that the constraint is about as powerful as it can possibly be!

94
So consider any property of Turing-computable functions:

{f | f is constant}
{f | f is not constant}
{f | f(x) = 0 for at least one input x}
{f | f(x) ≠ 5 for any input x}
...

None of these classes (more precisely, their index sets) is decidable.

95
6.6 Productivity/Busy Beaver functions - uncomputability via rapid growth

For the proof of the unsolvability of the halting problem, and several other of the key arguments in this course, the key trick is diagonalization.

The optional chapter 4.2 presents a different technique that we won’t be using much, if at all, but it is useful in many other applications: producing a function f that can’t be in a given class C because f grows faster than any of the functions in C.

More precisely, say that f eventually dominates g (where f, g : N → N) if:

(∃n)(∀k) n < k ⇒ g(k) < f(k)

In this terminology, we show f ∉ C if we can show that f eventually dominates every function in C.

96
The productivity function (also called the “Busy Beaver
Function”) is shown to be uncomputable because it even-
tually dominates every Turing computable function.

Another example of a function constructed to grow rapidly is on pp. 84-85 of the text - the Ackermann function.

It’s a simple illustration of the point, so I’ll quickly glance at it as an example of using rapid rates of growth to show that a given function γ is not in a given class because everything in the class is eventually dominated by γ.

97
Consider a natural hierarchy of functions beginning with successor (i.e. +1) and applying recursion.

f1(x, n) = x + n:
x + 0 = x
x + (y + 1) = (x + y) + 1

f2(x, n) = x × n:
x × 1 = x
x × (y + 1) = (x × y) + x

f3(x, n) = x^n:
x^1 = x
x^(y+1) = x^y × x

f4(x, n) = ⁿx (superexponentiation: ⁿx is a tower of n x’s, x^(x^(···^x))):
¹x = x
^(y+1)x = x^(ʸx)

and so on.

Set γ(n) = fn(n, n)
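One way to fold the whole ladder into a single recursion (an Ackermann-style form; my sketch, close in spirit to the text’s pp. 84-85) and then diagonalize:

```python
# f(1, x, n) = x + n; for i >= 2, f(i, x, 1) = x and
# f(i, x, n+1) = f(i-1, x, f(i, x, n)) -- each level iterates the one below.

def f(i, x, n):
    if i == 1:
        return x + n
    if n == 1:
        return x
    return f(i - 1, x, f(i, x, n - 1))

def gamma(n):
    return f(n, n, n)

print([gamma(1), gamma(2), gamma(3)])   # [2, 4, 27]
```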

98
γ will grow faster than any of the functions fi.

In fact, γ is recursive, and it doesn’t dominate all recursive functions.

But γ is not “primitive recursive” - a phrase we’ll define in a couple of weeks.

99
