1 Primitive Recursive Functions
5. f is defined by recursion from two primitive recursive functions, i.e., if
g(x1, . . . , xn−1) and h(x1, . . . , xn+1) are primitive recursive then the following
function is also primitive recursive:
f (x1, . . . , xn−1, 0) = g(x1, . . . , xn−1)
f (x1, . . . , xn−1, m + 1) = h(x1, . . . , xn−1, m, f (x1, . . . , xn−1, m))
This rule for deriving a primitive recursive function is called the Recursion
rule. It is a very powerful rule and is why these functions are
called ‘primitive recursive.’
Informally we would say: f (x) = S(S(S(S(S(x))))) is primitive recursive by
successor and several applications of composition.
The question arises, “how does one think of these definitions?” The key
point is to think recursively. The thought process for defining plus might be
“Well gee, x + 0 = x, that’s easy enough, and x + (y + 1) = (x + y) + 1, so I
can define x + (y + 1) in terms of x + y.” From that point it should be easy
to see how to formalize that f (x, y) = x + y is primitive recursive.
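To connect this with the notes' JAVA model of computation, here is a minimal JAVA sketch of plus built exactly from the recursion rule (the class and method names are ours):

```java
public class Plus {
    // The successor function S(x) = x + 1, one of the base functions.
    static int succ(int x) { return x + 1; }

    // f(x, 0) = x  (here g(x) = x);  f(x, y+1) = S(f(x, y))  (here h returns S of the recursive value).
    static int plus(int x, int y) {
        if (y == 0) return x;
        return succ(plus(x, y - 1));
    }

    public static void main(String[] args) {
        System.out.println(plus(3, 4)); // 7
    }
}
```

The recursion in `plus` mirrors the Recursion rule exactly: the `y == 0` branch is the g-case and the recursive branch is the h-case.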
Now that we know plus is primitive recursive, we can use it to define other
primitive recursive functions. Here we use it to define multiplication.
Example 1.4 f (x, y) = xy is primitive recursive via
f (x, 0) = 0
f (x, y + 1) = f (x, y) + x
Now that we know multiplication is primitive recursive, we use it to define
powers.
Example 1.5 f (x, y) = x^y is primitive recursive via
f (x, 0) = 1
f (x, y + 1) = x · f (x, y)
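Continuing in the same style, a JAVA sketch stacking multiplication and powers on top of plus (names are ours; each recursion mirrors the scheme above):

```java
public class PrimRec {
    // plus as before: f(x, 0) = x;  f(x, y+1) = S(f(x, y))
    static int plus(int x, int y) { return y == 0 ? x : 1 + plus(x, y - 1); }

    // multiplication: f(x, 0) = 0;  f(x, y+1) = f(x, y) + x
    static int mult(int x, int y) { return y == 0 ? 0 : plus(mult(x, y - 1), x); }

    // powers: f(x, 0) = 1;  f(x, y+1) = x * f(x, y)
    static int power(int x, int y) { return y == 0 ? 1 : mult(x, power(x, y - 1)); }

    public static void main(String[] args) {
        System.out.println(mult(3, 4));  // 12
        System.out.println(power(2, 5)); // 32
    }
}
```

Note how each level only calls the level below it, just as each primitive recursive definition only uses previously derived functions.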
Example 1.6 In all the above examples we took a well-known function and
showed it was primitive recursive. Here we define a function directly: if
x > 0 it subtracts 1, otherwise it just returns 0.
f (0) = 0
f (x + 1) = x
To do this formally, recall that in the recursion rule the function h can
depend on x and f (x). Henceforth, call the function just defined M (x).
f (x, 0) = x
f (x, y + 1) = M (f (x, y))
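A JAVA sketch of M (x) and of the two-argument f defined from it; this f computes x − y truncated at 0 (names are ours):

```java
public class Monus {
    // M(0) = 0;  M(x+1) = x   (the predecessor function from Example 1.6)
    static int pred(int x) { return x == 0 ? 0 : x - 1; }

    // f(x, 0) = x;  f(x, y+1) = M(f(x, y))
    // Applying pred y times to x yields max(x - y, 0), i.e. truncated subtraction.
    static int monus(int x, int y) { return y == 0 ? x : pred(monus(x, y - 1)); }

    public static void main(String[] args) {
        System.out.println(monus(7, 3)); // 4
        System.out.println(monus(3, 7)); // 0
    }
}
```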
Fact 1.9 Virtually every function you can think of (primality, finding quo-
tients and remainders, any polynomial) is primitive recursive.
Since each of the gi ’s, and h, take m or fewer steps to derive, BY THE INDUCTION
HYPOTHESIS they can be computed by a JAVA program. Using this,
we write a JAVA program for f .
begin
input(x1 , . . . , xn );
output(h(g1 (x1 , . . . , xn ), . . . , gk (x1 , . . . , xn ))) ;
end
This program was easy to write because we already knew we had JAVA
programs for g1 , . . . , gk and h.
We leave the last case, where the last rule was recursion, to the reader as
an exercise. It is not hard.
Is EVERY function that is computed by a JAVA program a primitive
recursive function? We will show the answer is NO.
Exercise Describe a mapping ρ from N onto the set of all primitive recursive
functions of one variable. The function U (x, y) = ρ(x)(y) should be JAVA
computable. Why do you think I chose the letter U ? (The actual details of
the JAVA code aren’t necessary; just argue that it can be done.)
In particular, take x = n to get
F (n) = U (n, n) + 1.
A(0, y) = y + 1
A(x + 1, 0) = A(x, 1)
A(x + 1, y + 1) = A(x, A(x + 1, y))
It is easy to see that Ackermann’s function is JAVA computable. We will
not prove that it is not primitive recursive, but we give an idea by giving a
list of intuitions:
1. Ackermann’s function grows very fast. It grows faster than any primitive
recursive function.
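The three equations above translate directly into a JAVA program, which is the sense in which it is easy to see that Ackermann's function is JAVA computable (a sketch; the recursion is extremely deep, so only tiny arguments are practical):

```java
public class Ackermann {
    // A(0, y) = y + 1
    // A(x+1, 0) = A(x, 1)
    // A(x+1, y+1) = A(x, A(x+1, y))
    static long A(long x, long y) {
        if (x == 0) return y + 1;
        if (y == 0) return A(x - 1, 1);
        return A(x - 1, A(x, y - 1));
    }

    public static void main(String[] args) {
        System.out.println(A(2, 3)); // 9
        System.out.println(A(3, 3)); // 61
    }
}
```

Already A(4, 2) has 19,729 decimal digits, a first taste of how fast the function grows.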
We will add a basic way of generating one function from another which
will allow functions to not always be defined, yet is basic and is something
that machines can really do. Intuitively, we allow the operation of being
able to search for a number without having an a priori bound on what the
number may be (in fact, it might not exist). The resulting set of functions
will be called the General Recursive Functions.
Keep in mind our final goal of showing that this model is very powerful:
many functions will turn out to be general recursive.
Notation 2.2 The symbol µ stands for “least number such that.” We il-
lustrate its use. If g(x) is a function then µx[g(x) = 13] is the least num-
ber x such that g(x) = 13. Note that such a number need not exist, in
which case µx[g(x) = 13] is undefined. If g(x1, . . . , xn, y) is a function
then f (x1, . . . , xn) = µy[g(x1, . . . , xn, y) = 0] is the function that, on input
(a1, . . . , an), returns the least value of y such that g(a1, . . . , an, y) = 0.
Note that such a number need not exist, in which case f (a1, . . . , an) is
undefined. The function f is called the unbounded minimalization of g.
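The µ operator can be sketched in JAVA as an unbounded search. Note the loop has no a priori bound: when no such y exists it simply never terminates, which matches the "undefined" case above (the names and the example g are ours, and g is assumed total):

```java
import java.util.function.LongUnaryOperator;

public class Mu {
    // mu y [ g(y) == 0 ]: search y = 0, 1, 2, ... with no a priori bound.
    // If no such y exists, this loop runs forever -- the function is undefined there.
    static long mu(LongUnaryOperator g) {
        for (long y = 0; ; y++) {
            if (g.applyAsLong(y) == 0) return y;
        }
    }

    public static void main(String[] args) {
        // least y with y*y - 49 = 0, i.e. the square root of 49
        System.out.println(mu(y -> y * y - 49)); // 7
    }
}
```

This possibility of non-termination is exactly what separates the general recursive functions from the (always total) primitive recursive ones.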
Def 2.5 The general recursive functions are also called the partial com-
putable functions. The subset of the partial computable functions that are
total are called the total computable functions, or just the computable func-
tions. The term “general recursive” will not be used again in this course, but
is included for historical purposes.
We will later see that general recursive functions are very powerful.
3 Turing Machines
We will now take a different approach to pinning down what is meant by
“computable.” This definition is motivated by actual computers and resembles
a machine. The definition is similar to that of a Deterministic Finite
Automaton or a Push Down Automaton, but it can do much, much more.
Keep in mind that our final goal is to show that this model can compute a
lot of functions.
We first give the formal definition, and then explain it intuitively.
The machine acts in discrete steps. At any one step it will read the symbol
in the tape square under its head, see what state it is in, and do one of the following:
1. write a symbol on the tape square and change state,
2. move the head one symbol to the left and change state,
3. move the head one symbol to the right and change state.
We now formally say how the machine computes a function. This will be
followed by intuition.
Def 3.3 Let M be a Turing Machine. Let α1, α2, α3, α4 ∈ Σ∗, and q, q′ ∈ Q.
Let α1 = x1 x2 · · · xk, and α2 = xk+1 xk+2 · · · xn. The symbol α1 q α2 ⊢M α3 q′ α4
means that one of the following is true:
Def 3.4 If C and D are IDs then C ⊢∗M D means that either C = D or there
exists a finite sequence of IDs C1, C2, . . . , Ck such that C = C1, Ck = D, and
for all i, Ci ⊢M Ci+1.
Def 3.5 Let M be a Turing Machine. Recall that the partial function computed
by Turing Machine M is the following partial function: f (x) is the
unique y (if it exists) such that x q0 ⊢∗M y h. If no such y exists then M (x) is
said to diverge.
Intuitively we start out with x laid out on the tape, and the head looking
at the rightmost symbol of x. The machine then runs, and if it gets to the
halt state with the condition that there are only blanks to the right of the
head, then the string to the left of the head is the value f (x).
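The step rules above can be turned into a small simulator. The following JAVA sketch is one possible concrete encoding (the action strings, the map representation of the tape and of δ, and the sample unary-successor machine are all our own choices, not part of the text's formal definition):

```java
import java.util.HashMap;
import java.util.Map;

public class TinyTM {
    // A minimal 1-tape simulator.  The tape is a map from position to symbol
    // (absent = blank '_').  Each step reads the square under the head, looks up
    // delta(state, symbol), and performs exactly one of the three actions:
    // "W:c" writes symbol c, "L" moves left, "R" moves right; each names a next state.
    static String run(Map<String, String[]> delta, String input, String halt) {
        Map<Integer, Character> tape = new HashMap<>();
        for (int i = 0; i < input.length(); i++) tape.put(i, input.charAt(i));
        int head = input.length() - 1;          // head starts on the rightmost symbol of x
        String state = "q0";
        while (!state.equals(halt)) {
            char sym = tape.getOrDefault(head, '_');
            String[] act = delta.get(state + "," + sym);
            if (act[0].startsWith("W:")) tape.put(head, act[0].charAt(2));
            else if (act[0].equals("L")) head--;
            else head++;
            state = act[1];
        }
        StringBuilder out = new StringBuilder();  // f(x) = the string left of the head
        for (int i = 0; i < head; i++) out.append(tape.getOrDefault(i, '_'));
        return out.toString();
    }

    public static void main(String[] args) {
        // Unary successor: step right onto the blank, write a 1, step right, halt.
        Map<String, String[]> delta = new HashMap<>();
        delta.put("q0,1", new String[]{"R", "q1"});
        delta.put("q1,_", new String[]{"W:1", "q2"});
        delta.put("q2,1", new String[]{"R", "h"});
        System.out.println(run(delta, "111", "h")); // 1111
    }
}
```

The sample machine computes f(x) = x + 1 in unary, reading the output off the tape exactly as described above: the string to the left of the head when the halt state is reached.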
For examples of Turing machines, and exercises on building them to do
things, see Lewis and Papadimitriou’s text “Elements of the Theory of Com-
putation.”, or the Hopcroft-Ullman White book. Other books also contain
this material.
Note that, just like a computer, the computation of a Turing machine is
in discrete steps.
Note that the function computed by Me,s is intuitively computable. Although
it is a partial function, we can tell when it will be undefined, so we
can think of it as being total.
A careful analysis of the proof of the above theorem reveals that the 1-
tape machine is not that much more inefficient than the equivalent 2-tape
machine. In particular, we have actually shown that if the 2-tape machine
halts on inputs of length n in T (n) steps, then the 1-tape machine will halt, on
inputs of length n, in T (n)2 steps. While this is not important for recursion
theory, it will be a significant fact in complexity theory. The best known
simulation of a multitape Turing Machine by a machine with a fixed number of tapes
is that any function f that can be computed by a k-tape Turing Machine in
T (n) steps on inputs of length n can be computed by a 2-tape machine in
T (n) log T (n) steps. (See Hopcroft-Ullman, the White book.)
Other enhancements to a Turing Machine, such as extra heads, two-dimensional
tapes, or a 2-way infinite tape, do not add power. Note that
a Turing Machine with many added features resembles an actual computer.
Exercise Discuss informally how to convert various variants of a Turing
Machine to a 1-tape 1-head 1-dim Turing Machine. Comment on how runtime
and number of states are affected.
5 Gödelization
By using variations of Turing Machines it would not be hard to show that
standard functions such as addition, multiplication, exponentiation, etc. are
all computable by Turing Machines. We wish to examine functions that, in
some sense, take Turing Machines as their input. In order to do this, we must
code machines by numbers. In this subsection we give an explicit coding and
its properties. The actual coding is not that interesting or important and
can be skipped, but should at least be skimmed to convince yourself that it
really can be carried out. The properties of the coding are very important. A
more abstract approach to this material would be to DEFINE a numbering
system as having those properties. We DO NOT take this approach, but will
discuss it at the end of this section.
Def 5.1 A Gödelization is an onto mapping from N to the set of all Turing
Machines such that given a Turing Machine, one can actually find the number
mapped to, and given a number one can actually find the Turing Machine
that maps to it.
• L, R are represented by the numbers 1 and 2. (We still denote L and
R by L and R. Note that L and R have numbers different from those
in the alphabet.)
We first show how to encode a rule as a number:
Let q1 , q2 ∈ Q and σ1 , σ2 ∈ Σ. (By our convention, q1 , q2 , σ1 , σ2 are
numbers). The rule
δ(q1 , σ1 ) = (q2 , σ2 )
is represented by the number 2^{q1} · 3^{σ1} · 5^{q2} · 7^{σ2}. The representations for rules that
have L or R in the last component are defined similarly. In any case we
denote the number coding the rule that says what δ(q, σ) does by c(q, σ).
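As a sanity check, here is a JAVA sketch computing the code of a write rule (names are ours; states and symbols are given directly as their numbers, per the convention above):

```java
public class RuleCode {
    // Integer exponentiation b^e by repeated multiplication (e >= 0).
    static long pow(long b, long e) {
        long r = 1;
        for (long i = 0; i < e; i++) r *= b;
        return r;
    }

    // Code of the rule delta(q1, s1) = (q2, s2):  2^q1 * 3^s1 * 5^q2 * 7^s2.
    static long c(long q1, long s1, long q2, long s2) {
        return pow(2, q1) * pow(3, s1) * pow(5, q2) * pow(7, s2);
    }

    public static void main(String[] args) {
        System.out.println(c(1, 0, 2, 1)); // 2^1 * 3^0 * 5^2 * 7^1 = 350
    }
}
```

Because 2, 3, 5, 7 are distinct primes, unique factorization lets us recover q1, σ1, q2, σ2 from the code, which is what makes the encoding usable.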
We now code the entire machine M as a number. Let pi denote the ith
prime. Let ⟨−, −⟩ be such that the map (i, j) → ⟨i, j⟩ is a bijection from
N × N to N which is computable by a Turing Machine. The Turing Machine
M is coded by the number
C(M ) = ∏_{i=1}^{n} ∏_{j=1}^{m} p_{⟨i,j⟩}^{c(i,j)}
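The text only requires some Turing-computable bijection ⟨−, −⟩; one standard choice is the Cantor pairing function, sketched here in JAVA (this particular choice and the names are ours):

```java
public class Pairing {
    // Cantor pairing: <i, j> = (i + j)(i + j + 1)/2 + j, a bijection N x N -> N.
    static long pair(long i, long j) {
        return (i + j) * (i + j + 1) / 2 + j;
    }

    // Inverse: recover (i, j) from n by inverting the triangular-number formula.
    static long[] unpair(long n) {
        long w = (long) ((Math.sqrt(8.0 * n + 1) - 1) / 2); // diagonal index i + j
        long t = w * (w + 1) / 2;                           // first n on that diagonal
        long j = n - t;
        return new long[]{w - j, j};
    }

    public static void main(String[] args) {
        System.out.println(pair(3, 5));          // 41
        long[] ij = unpair(41);
        System.out.println(ij[0] + "," + ij[1]); // 3,5
    }
}
```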
• Given x, determine the number of symbols in the alphabet of Mx .
• Given numbers x and y, produce the code for the Turing Machine that
computes the composition of the functions computed by Mx and My .
Our coding has some very nice properties that we now state as theorems.
There is nothing inherently good about the coding we used; virtually any
coding one might come up with has these properties. The properties
essentially say that we can treat the indices of Turing Machines as though they
were programs. We will be using these theorems informally, without explicit
reference to them, for most of this course.
Both the s-1-1 theorem and the s-m-n theorem are proven by actually
constructing such functions. These constructions were of more interest when
they were proven than they are now, since now the notion of treating data
and parameters the same has been absorbed into our culture.
The next theorem says that there is one Turing Machine that can simulate
all others. It is similar to a mainframe: you feed it programs and inputs, and
it executes them.
Theorem 5.5 (Universal Turing Machine Theorem, or Enumeration Theo-
rem) There is a Turing Machine M such that M (x, y) is the result of running
Mx on y. (Note that this might diverge.)
One concern might be that if we prove theorems for our particular APS,
will they be true for all APSs? The following theorem says YES, as it says that
all APSs are essentially the same.
3. Post Systems were proposed by Emil Post in 1943. They are a generalization
of Grammars.
5. Markov Algorithms were proposed by Andrei Andreyevich Markov in the
1940’s.
4. Any JAVA program that halts on all inputs you can think of is com-
puting a computable function.
We give an interesting example of a partial computable function. We
want a function that will, on input e, output some PRIME that Me halts on.
If Me does not halt on any prime, then the function will be undefined.
First attempt (which will fail): run Me (2). If it halts then output 2, else
run Me (3). If it halts then output 3, else run Me (5). This will not work since
you cannot tell if Me (2) halts.
So what to do?
Well, we can try to run Me (2) for a few steps, then try Me (3) for a few
steps, then go back to Me (2) and try out various other primes as we go.
We try Me (p) for s steps for many primes p and numbers s. This process
is known as DOVETAILING. Before presenting the formal algorithm we’ll
need pairing functions.
Def 6.2 Let π1 and π2 be computable functions such that the set {(π1 (x), π2 (x)) :
x ∈ N} is all of N × N.
Algorithm for f :
1. Input(e)
2. i := 1
FOUND := FALSE
While NOT FOUND
x := π1 (i)
s := π2 (i)
Run Me (x) for s steps.
If x is prime and Me (x) halts within s steps then
output(x)
FOUND := TRUE
else i := i + 1
The algorithm looks at ALL possible pairs (x, s), and if we find that Me (x)
halts in s steps, and x is prime, then we halt. Note that if Me halts on SOME
prime then f (e) will be such a prime; however, if Me does not halt on any
prime, then the algorithm will diverge (as it should).
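The dovetailing loop can be sketched in JAVA. Since we cannot run real Turing Machines here, we stand in for "run Me(x) for s steps" with a hypothetical table giving, for each input, how many steps the machine needs before halting (absent = diverges); that stub, and all names below, are our own assumptions:

```java
import java.util.HashMap;
import java.util.Map;

public class Dovetail {
    // Cantor unpairing gives pi1, pi2 with {(pi1(i), pi2(i)) : i in N} = N x N.
    static long[] unpair(long n) {
        long w = (long) ((Math.sqrt(8.0 * n + 1) - 1) / 2);
        long j = n - w * (w + 1) / 2;
        return new long[]{w - j, j};
    }

    static boolean isPrime(long x) {
        if (x < 2) return false;
        for (long d = 2; d * d <= x; d++) if (x % d == 0) return false;
        return true;
    }

    // stepsToHalt.get(x) = number of steps M_e takes to halt on x (absent = diverges).
    // This stub stands in for simulating M_e; it is NOT a real Turing Machine.
    static long f(Map<Long, Long> stepsToHalt) {
        for (long i = 0; ; i++) {                       // enumerate all pairs (x, s)
            long x = unpair(i)[0], s = unpair(i)[1];
            Long need = stepsToHalt.get(x);
            if (isPrime(x) && need != null && need <= s) return x;  // M_e(x) halted within s steps
        }
    }

    public static void main(String[] args) {
        Map<Long, Long> m = new HashMap<>();
        m.put(4L, 2L);   // halts on 4, but 4 is not prime
        m.put(7L, 10L);  // halts on the prime 7 after 10 steps
        System.out.println(f(m)); // 7
    }
}
```

If the table contains no prime at all, the `for` loop never exits, matching the intended divergence of f.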
7 Some strange examples of computable functions
Functions that are almost always 0 are very easy to compute: just store a
table.
Example 7.1 Let f be the function with f (0) = 12, f (10) = 20, f (14) = 7,
and f zero elsewhere. The function f is easily seen to be computable: just
write a program with a lot of ‘if’ statements in it. It will output 0 on values
other than 0, 10, and 14.
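A JAVA sketch of this table-lookup idea (the class name is ours):

```java
public class TableLookup {
    // f(0) = 12, f(10) = 20, f(14) = 7, and f is zero elsewhere:
    // a handful of 'if' statements is the entire program.
    static int f(int x) {
        if (x == 0) return 12;
        if (x == 10) return 20;
        if (x == 14) return 7;
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(f(10)); // 20
        System.out.println(f(3));  // 0
    }
}
```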
Example 7.2 Let f be the function that is zero on values 10 and above,
and on values less than 10 outputs the input squared. From this description
we can deduce that f (0) = 0, f (1) = 1, f (2) = 4, f (3) = 9, f (4) = 16, f (5) = 25,
f (6) = 36, f (7) = 49, f (8) = 64, f (9) = 81, and f is zero elsewhere.
In the above example, even though we were not given the function explicitly,
we could derive an explicit description from what was given. In the
next example this is no longer the case, but the function is still computable.
INTERESTING EXAMPLE
One needs to know what the Goldbach Conjecture is to appreciate this
example. Goldbach’s Conjecture, which is still open, states that every even
number greater than 2 is the sum of two primes.
So THERE EXISTS a JAVA program for f . In fact, we can write down two
programs, and know that one of them computes f , but we don’t know which
one. But to show that f is computable WE DO NOT CARE WHICH ONE!
The definition of computability only said THERE EXISTS a JAVA program,
it didn’t say we could find it.
The fact that this list is infinite should not bother us. It is still the case
that f is computable since one of the functions on this list is f , or f is always
0.
1. a and b are irrational and a^b is rational, OR
1. The primes.
4. Most sets you can think of are computable.
Are there any noncomputable sets? Cheap answer: The number of SETS
is uncountable, the number of COMPUTABLE SETS is countable, hence
there must be some noncomputable sets. In fact, there are an uncountable
number of them. I find this answer rather unenlightening.
Proof:
We show that K0 is NOT computable, by using diagonalization. Assume
that K0 is computable. Let M be the Turing Machine that decides K0 . Using
M we can easily create a machine M ′ that operates as follows:
M ′ (x) = 0 if Mx (x) does not halt,
M ′ (x) = ↑ if Mx (x) does halt.
If M ′ (e) ↑ then by the definition of M ′ , we know that Me (e) does halt;
but since M ′ = Me , we know that Me (e) does not halt. Hence the scenario
that M ′ (e) ↑ cannot happen. (This alone is not a contradiction.)
By combining the two above statements we get that M ′ (e) can neither
converge nor diverge, which is a contradiction.
This proof may look unmotivated: why define M ′ as we did? We now
look at how one might have come up with the halting set, if one’s goal was
to come up with an explicit set that is not decidable.
We want to come up with a set A that is not decidable. So we want that
M1 does not decide A, M2 does not decide A, etc. Let’s make A and machine
Mi differ on their value of i. So we can DEFINE A to be
A = {i | Mi (i) ≠ 1}.
This set can easily be shown undecidable: for any i, Mi fails to decide it
since A and Mi will differ on i. But looking at what makes A hard intuitively,
we note that the “≠ 1” is a red herring, and the set
B = {i | Mi (i) ↓}
Note 9.4 In some texts, the set we denote as K is called the Halting set.
We shall later see that these two sets are identical in computational power,
so the one you care to dub THE halting problem is not important. We chose
the one we did since it seems like a more natural problem. Henceforth, we
will be using K as our main workhorse, as you will see in a later section.
was: on input x, y, run Mx (y) until it halts. The problem was that if ⟨x, y⟩ ∉
K0 then the algorithm diverges. But note that if ⟨x, y⟩ ∈ K0 then this
algorithm converges. SO, this algorithm DOES distinguish K0 from its complement,
but not quite in the way we’d like. The following definition pins this down.
1. Input(n).
2. If n = 0 then output a.
We show that range(g) =domain(f ). If y is in the range of g then it must be
the case that M (y) halted, so y is in the domain of f . If y is in the domain of
f then let n be the least number such that M (y) halts in n steps and y ≤ n.
If there is some m < n such that g(m) = y then we are done. Otherwise
consider the computation of g(n). In that computation y ∈ Y but might
not be output if there is some smaller element of Y . The same applies to
g(n + 1), g(n + 2), . . .. If there are z elements smaller than y in A then one
of g(n), g(n + 1), . . . , g(n + z) must be y.
2) → 1). Assume that A is either empty or the range of a total computable
function. If A is empty then A is the domain of the partial computable
function that always diverges, and we are done. Assume A is the range of
a total computable function f . Let g be the partial computable function
computed by the following algorithm:
1. Input(n).
2. Compute f (0), f (1), . . . until (if it happens) you discover that there is
an i such that f (i) = n. If this happens then halt. (If it does not, then
the function will end up diverging, which is okay by us.)
Theorem 10.3 A set A is computable iff both A and Ā are r.e.
Proof: If A is computable then Ā is computable. Since any computable
set is r.e., both are r.e.
Assume A and Ā are r.e. Let Ma be a Turing Machine that has domain
A and Mb be a Turing Machine that has domain Ā. The set A is computable
via the following algorithm: on input x run both Ma (x) and Mb (x) simultaneously;
if Ma (x) halts then output YES, if Mb (x) halts then output NO.
Since either x ∈ A or x ∈ Ā, one of these two events must happen.
This theorem links two of our questions: there exists an r.e. set that is
not computable iff r.e. sets are not closed under complementation.
3. HILBERT’S TENTH PROBLEM (improvements) Given a polynomial
in just 13 variables p(x1 , . . . , x13 ) with integer coefficients, do there
exist integers a1 , . . . , a13 such that p(a1 , . . . , a13 ) = 0?
In this section we exhibit sets that are harder than K but do not prove
this.
Recall that K can be written as
Theorem 11.5 For every i there are sets in Σi −Πi , there are sets in Σi+1 −
Σi , there are Σi -complete sets, and there are Πi -complete sets.
Exercise (You may use the above Theorem.) Show that a Σi -complete set
cannot be in Πi .
Exercise Show that K is Σ1 -complete. Show that K̄ is Π1 -complete.
Exercise Show that if A is Πi -complete then Ā is Σi -complete.
We show that FIN (the set of indices of Turing Machines with finite
domain) is in Σ2 and that COF (the set of indices of Turing Machines with cofinite
domain) is in Σ3 . It turns out that FIN is Σ2 -complete, and COF is Σ3 -complete,
though we will not prove this. As a general heuristic, whatever level
you can get a set into, it will probably be complete there.