Introduction to the Theory of Computation
David Liu
1 Introduction
    Prerequisite Knowledge
2 Induction
    Complete Induction
    Beyond Numbers
    Structural Induction
    A Larger Example
    Exercises
3 Recursion
    Measuring Runtime
    Divide-and-Conquer Algorithms
    Quicksort
    Exercises
4 Program Correctness
    What is Correctness?
    Iterative Programs
    Termination
    Exercises
5 Regular Languages & Finite Automata
    Regular Languages
    A Suggestive Flowchart
    Correctness of DFAs
    Limitations of DFAs
    Nondeterminism
    Equivalence of Definitions
    Exercises
It should come as no surprise that the field of computer science predates the invention of computers, since humans have been solving problems for millennia. Our English word algorithm, a sequence of steps taken to solve a problem, is named after the Persian mathematician Muhammad ibn Musa al-Khwarizmi, whose mathematics texts were compendia of computational procedures. (The word algebra is derived from the word al-jabr, appearing in the title of one of his books, describing the operation of subtracting a number from both sides of an equation.) In 1936, Alan Turing, one of the fathers of modern computer science, developed the Turing Machine, a theoretical model of computation which is widely believed to be just as powerful as all programming languages in existence today. In one of the earliest and most fundamental results in computer science, Turing proved that there are some problems that cannot be solved by any computer that has ever or will ever be built – before computers had been invented at all! (A little earlier, Alonzo Church, who would later supervise Turing during the latter's graduate studies, developed the lambda calculus, an alternative model of computation that forms the philosophical basis for functional programming languages like Scheme, Haskell, and ML.)
A programmer's value lies not in her ability to write code, but to understand problems and design solutions – a much harder task. Beginning programmers often write code by trial and error ("Does this compile? What if I add this line?"), which indicates not a lack of programming experience, but a lack of design experience. When presented with a problem, many students often jump straight to the computer, even if they have no idea what they are going to write! And when the code is complete, they are at a loss when asked the two fundamental questions: Why is your code correct, and is it a good solution? ("My code is correct because it passed all of the tests" is reasonable but unsatisfying. What I really want to know is how your code works.)
In this course, you will learn the skills necessary to answer both of these ques-
tions, improving both your ability to reason about the code you write and your
ability to communicate your thinking with others. These skills will help you
design cleaner and more efficient programs, and clearly document and present
your code. Of course, like all skills, you will practice and refine these throughout
your university education and in your careers.
The first section of the course introduces the powerful proof technique of induc-
tion. We will see how inductive arguments can be used in many different math-
ematical settings; you will master the structure and style of inductive proofs, so
that later in the course you will not even blink when asked to read or write a
“proof by induction.”
From induction, we turn our attention to the runtime analysis of recursive programs. You have done this already for non-recursive programs, but did not have the tools necessary to handle recursion. We will see that (mathematical) induction and (programming) recursion are two sides of the same coin, so we use induction to make analysing recursive programs easy as cake. (Some might even say, chocolate cake.) After these lessons, you will always be able to evaluate your recursive code based on its runtime, a very important consideration!
We next turn our attention to the correctness of both recursive and non-recursive programs. You already have some intuition about why your programs are correct; we will teach you how to formalize this intuition into mathematically rigorous arguments, so that you may reason about the code you write and determine errors without the use of testing. (This is not to say tests are unnecessary! The methods we'll teach you in this course are quite tricky for larger software systems. However, a more mature understanding of your own code certainly facilitates finding and debugging errors.)

Finally, we will turn our attention to the simplest model of computation, the finite automaton. This serves as both an introduction to more complex computational models like Turing Machines, and also to formal language theory through the intimate connection between finite automata and regular languages. Regular languages and automata have many other applications in computer science, from text-based pattern matching to modelling biological processes.
Prerequisite Knowledge
In MAT102, you learned how to write proofs. This is the main object of interest in CSC236, so you should be comfortable with this style of writing. However, one key difference is that we will not expect (nor award marks for) a particular proof structure – indentation is no longer required, and your proofs can be mixtures of mathematics, English paragraphs, pseudocode, and diagrams! Of course, we will still greatly value clear, well-justified arguments, especially since the content will be more complex. (So a technically correct solution that is extremely difficult to understand will not receive full marks. Conversely, an incomplete solution which clearly explains partial results, and possibly even what is left to do to complete the solution, will be marked more generously.)

Concepts from CSC148
Recursion, recursion, recursion. If you liked using recursion in CSC148, you're in luck: induction, the central proof structure in this course, is the abstract thinking behind designing recursive functions. And if you didn't like recursion or found it confusing, don't worry! This course will give you a great opportunity to develop a better feel for recursive functions in general, and even give you programming opportunities to get practical experience.
This is not to say you should forget everything you have done with iterative programs; loops will be present in our code throughout this course, and will be the central object of study for a week or two when we discuss program correctness. In particular, you should be very comfortable with the central design pattern of first-year Python: computing on a list by processing its elements one at a time using a for or while loop. (A design pattern is a common coding template which can be used to solve a variety of different problems. "Looping through a list" is arguably the simplest one.)
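As a concrete illustration of that pattern (a small sketch, not from the original notes):

def count_evens(lst):
    # The central first-year pattern: process one element per loop iteration.
    count = 0
    for x in lst:
        if x % 2 == 0:
            count += 1
    return count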
You should also be comfortable with terminology associated with trees, which
will come up occasionally throughout the course when we discuss induction
proofs.
You will also have to remember the fundamentals of Big-O algorithm analysis,
and how to determine tight asymptotic bounds for common functions.
Finally, the last part of the course deals with regular languages; you should be
familiar with the terminology associated with strings, including length, reversal,
concatenation, and the empty string.
2 Induction
A classic example: to compute the sum 1 + 2 + · · · + n, Gauss's famous trick pairs up the terms:

$$1 + 2 + 3 + \cdots + (n-1) + n = (1 + n) + (2 + (n-1)) + (3 + (n-2)) + \cdots = \frac{n}{2}(n+1) \quad \text{(since there are } n/2 \text{ pairs)}$$

This isn't exactly a formal proof – what if n is odd? – and although it could be made into one, this proof is based on a mathematical "trick" that doesn't work for, say, $\sum_{i=0}^{n} i^2$. And while mathematical tricks are often useful, they're hard to come up with in the first place! Induction gives us a different way to tackle this problem that is astonishingly straightforward. (We ignore the 0 in the summation, since this doesn't change the sum.)
A predicate is a parametrized statement that, for each possible value of its parameter(s), is either true or false. Here are some examples:

EV(n): n is even
GR(x, y): x > y
FROSH(a): a is a first-year university student
Every predicate has a domain, the set of its possible input values. For example, the above predicates could have domains N, R, and "the set of all UofT students," respectively. (We will always use the convention that 0 ∈ N unless otherwise specified.) Predicates give us a precise way of formulating English problems; the predicate that is relevant to our example is

$$P(n): \sum_{i=0}^{n} i = \frac{n(n+1)}{2}.$$

(A common mistake: defining the predicate to be something like $P(n): \frac{n(n+1)}{2}$. Such an expression is wrong and misleading because it isn't a True/False value, and so fails to capture precisely what we want to prove.)

You might be thinking right now: "Okay, now we're going to prove that P(n) is true." But this is wrong, because we haven't yet defined n! So in fact we want to prove that P(n) is true for all natural numbers n, or written symbolically, ∀n ∈ N, P(n). Here is how a formal proof might go if we were not using mathematical induction:
Proof of ∀n ∈ N, $\sum_{i=0}^{n} i = \frac{n(n+1)}{2}$:

    Let n ∈ N.
    # Want to prove that P(n) is true.
    Case 1: Assume n is even.
        # Gauss' trick
        ...
        Then P(n) is true.
    Case 2: Assume n is odd.
        # Gauss' trick, with a twist?
        ...
        Then P(n) is true.
    Then in all cases, P(n) is true.
    Then ∀n ∈ N, P(n).
Instead, we’re going to see how induction gives us a different, easier way of
proving the same thing.
Suppose we want to create a viral Youtube video featuring "The World's Longest Domino Chain!!! (like plz)".
Of course, a static image like the one featured on the right is no good for video;
instead, once we have set it up we plan on recording all of the dominoes falling
in one continuous, epic take. It took a lot of effort to set up the chain, so we
would like to make sure that it will work; that is, that once we tip over the
first domino, all the rest will fall. Of course, with dominoes the idea is rather
straightforward, since we have arranged the dominoes precisely enough that
any one falling will trigger the next one to fall. We can express this thinking a
bit more formally:

(1) The first domino will fall (because we will tip it over).
(2) For each domino in the chain, if it falls, then the next domino in the chain will also fall.

We can apply the same reasoning to the set of natural numbers. Instead of "every domino in the chain will fall," suppose we want to prove that "for all n ∈ N, P(n) is true", where P(n) is some predicate. The analogues of the above statements in the context of natural numbers are

(1) P(0) is true.
(2) For all k ∈ N, if P(k) is true, then P(k + 1) is also true.
Putting these together yields the Principle of Simple Induction (also known as Mathematical Induction):
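$$\Big[ P(0) \,\land\, \big( \forall k \in \mathbb{N},\ P(k) \Rightarrow P(k+1) \big) \Big] \Rightarrow \forall n \in \mathbb{N},\ P(n)$$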
A different, slightly more mathematical intuition for what induction says is that "P(0) is true, and P(1) is true because P(0) is true, and P(2) is true because P(1) is true, and P(3) is true because. . . " However, it turns out that a more rigorous proof of simple induction doesn't exist from the basic arithmetic properties of the natural numbers alone. Therefore mathematicians accept the principle of induction as an axiom, a statement as fundamentally true as 1 + 1 = 2. (It certainly makes sense intuitively, and turns out to be equivalent to another fundamental math fact called the Well-Ordering Principle.)
This gives us a new way of proving a statement is true for all natural numbers:
instead of proving P(n) for an arbitrary n, just prove P(0), and then prove the
link P(k) ⇒ P(k + 1) for an arbitrary k. The former step is called the base case,
while the latter is called the induction step. We’ll see exactly how such a proof
goes by illustrating it with the opening example.
Example 2.1. Prove that for every natural number n, $\sum_{i=0}^{n} i = \frac{n(n+1)}{2}$.

Proof. Step 1 (Define the predicate): as above, the predicate we will prove is

$$P(n): \sum_{i=0}^{n} i = \frac{n(n+1)}{2}.$$

It's easy to miss this step, but without it, often you'll have trouble deciding precisely what to write in your proofs.
Step 2 (Base Case): n = 0. We would like to prove that P(0) is true. Recall the meaning of P:

$$P(0): \sum_{i=0}^{0} i = \frac{0(0+1)}{2}.$$

This statement is trivially true, because both sides of the equation are equal to 0. (For induction proofs, the base case is usually a very straightforward proof. In fact, if you find yourself stuck on the base case, then it is likely that you've misunderstood the question and/or are trying to prove the wrong predicate.)

Step 3 (Induction Step): the goal is to prove that ∀k ∈ N, P(k) ⇒ P(k + 1). Let k ∈ N be some arbitrary natural number, and assume P(k) is true. This antecedent assumption has a special name: the Induction Hypothesis. Explicitly, we assume that
$$\sum_{i=0}^{k} i = \frac{k(k+1)}{2}.$$

Now, we want to prove that P(k + 1) is true, i.e., that $\sum_{i=0}^{k+1} i = \frac{(k+1)(k+2)}{2}$. This can be done with a simple calculation:
$$\begin{aligned}
\sum_{i=0}^{k+1} i &= \left( \sum_{i=0}^{k} i \right) + (k+1) \\
&= \frac{k(k+1)}{2} + (k+1) && \text{(by the Induction Hypothesis)} \\
&= (k+1)\left( \frac{k}{2} + 1 \right) \\
&= \frac{(k+1)(k+2)}{2}
\end{aligned}$$

(The one structural requirement we do have for this course is that you must always state exactly where you use the induction hypothesis. We expect to see the words "by the induction hypothesis" at least once in each of your proofs.)
Therefore P(k + 1) holds. This completes the proof of the induction step: ∀k ∈
N, P(k) ⇒ P(k + 1).
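Though no substitute for the proof, a quick computational check of a formula like this can catch a mis-stated predicate early. Here is a small sketch (illustrative only, not part of the original example):

def check_sum_formula(limit=100):
    # Compare 0 + 1 + ... + n against n*(n+1)/2 for all n up to limit.
    for n in range(limit + 1):
        assert sum(range(n + 1)) == n * (n + 1) // 2
    return True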
In our next example, we look at a geometric problem – notice how our proof
will use no algebra at all, but instead constructs an argument from English state-
ments and diagrams. This example is also interesting because it shows how to
apply simple induction starting at a number other than 0.
Example 2.2. A triomino is a three-square L-shaped figure. To the right, we show a 4-by-4 chessboard with one corner missing that has been tiled with triominoes. Prove that for all n ≥ 1, any 2^n-by-2^n chessboard with one corner missing can be tiled with triominoes.

Proof. Predicate: P(n): Any 2^n-by-2^n chessboard with one corner missing can be tiled with triominoes.
Base Case: This is slightly different, because we only want to prove the claim for n ≥ 1 (and ignore n = 0). Therefore our base case is n = 1, i.e., this is the "start" of our induction chain. When n = 1, we consider a 2-by-2 chessboard with one corner missing. But such a chessboard is exactly the same shape as a triomino, so of course it can be tiled by triominoes! (Again, a rather trivial base case. Keep in mind that even though it was simple, the proof would have been incomplete without it!)

Induction Step: Let k ≥ 1 and suppose that P(k) holds; that is, that every 2^k-by-2^k chessboard with one corner missing can be tiled by triominoes. (This is the Induction Hypothesis.) The goal is to show that any 2^{k+1}-by-2^{k+1} chessboard with one corner missing can be tiled by triominoes.
Consider an arbitrary 2^{k+1}-by-2^{k+1} chessboard with one corner missing. Divide it into quarters, each quarter a 2^k-by-2^k chessboard.

Exactly one of these has one corner missing; by the Induction Hypothesis, this quarter can be tiled by triominoes. Next, place a single triomino in the middle that covers one corner in each of the three remaining quarters.

Each of these quarters now has one corner covered, and by the I.H. again, they can each be tiled by triominoes. This completes the tiling of the 2^{k+1}-by-2^{k+1} chessboard. (Note that in this proof, we used the induction hypothesis twice! Or technically, 4 times, once for each 2^k-by-2^k quarter.)

Before moving on, here is some intuition behind what we did in the previous two examples. Given a problem of a 2^n-by-2^n chessboard, we repeatedly broke it up into smaller and smaller parts, until we reached the 2-by-2 size, which we could tile using just a single triomino. This idea of breaking down the problem into smaller ones "again and again" was a clear sign that a formal proof by induction was the way to go. Be on the lookout for phrases like "repeat over and over" in your own thinking to signal that you should be using induction. (In your programming, this is the same sign that points to using recursive solutions as the easiest approach.) In the opening example, we used an even more specific approach: in the induction step, we took the sum of size k + 1 and reduced it to a sum of size k, and evaluated that using the induction hypothesis. The cornerstone of simple induction is this link between problem instances of size k and size k + 1, and this ability to break down a problem into something exactly one size smaller.
Example 2.3. Consider the sequence of natural numbers satisfying the following properties: a_0 = 1, and for all n ≥ 1, a_n = 2a_{n−1} + 1. Prove that for all n ∈ N, a_n = 2^{n+1} − 1. (We will see in the next chapter one way of discovering this expression for a_n.)

Proof. The predicate is

P(n): a_n = 2^{n+1} − 1.

The base case is n = 0: by definition, a_0 = 1 = 2^{0+1} − 1, so P(0) holds.

For the induction step, let k ∈ N and suppose a_k = 2^{k+1} − 1. Our goal is to prove that P(k + 1) holds. By the recursive property of the sequence,

$$\begin{aligned}
a_{k+1} &= 2a_k + 1 \\
&= 2(2^{k+1} - 1) + 1 && \text{(by the I.H.)} \\
&= 2^{k+2} - 2 + 1 \\
&= 2^{k+2} - 1
\end{aligned}$$
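As with the previous example, the closed form is easy to check empirically for small n (a sketch, not part of the original proof):

def check_closed_form(limit=30):
    # Compare a_n (from the recurrence a_0 = 1, a_n = 2*a_{n-1} + 1)
    # against the claimed closed form 2**(n+1) - 1.
    a = 1  # a_0
    for n in range(limit):
        assert a == 2 ** (n + 1) - 1
        a = 2 * a + 1  # advance to a_{n+1}
    return True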
By this point, you have done several examples using simple induction. Recall that the intuition behind this proof technique is to reduce problems of size k + 1 to problems of size k (where "size" might mean the value of a number, or the size of a set, or the length of a string, etc.). However, for many problems there is no natural way to reduce problem sizes just by 1. Consider, for example, the following problem:

Prove that every natural number greater than 1 has a prime factorization, i.e., can be written as a product of primes. (Every prime can be written as a product of just one number: itself!)

How would you go about proving the induction step, using the method we've used so far? That is, how would you prove P(k) ⇒ P(k + 1)? This is a very hard question to answer, because even the prime factorizations of consecutive numbers can be completely different! (E.g., 210 = 2 · 3 · 5 · 7, but 211 is prime.)
But if I asked you to solve this question by "breaking the problem down," you would come up with the idea that if k + 1 is not prime, then we can write k + 1 = a · b, where a, b < k + 1, and we can "do this recursively" until we're left with a product of primes. Since we always identify recursion with induction, this hints at a more general form of induction that we can use to prove this statement.
Complete Induction
Recall the intuitive “chain of reasoning” that we do with simple induction: first
we prove P(0), and then use P(0) to prove P(1), then use P(1) to prove P(2),
etc. So when we get to k + 1, we try to prove P(k + 1) using P(k), but we have
already gone through proving P(0), P(1), . . . , and P(k − 1), in addition to P(k )!
In some sense, in Simple Induction we’re throwing away all of our previous
work except for P(k). In Complete Induction, we keep this work and use it in our
proof of the induction step. Here is the formal statement of The Principle of
Complete Induction:

$$\Big[ P(0) \,\land\, \forall k,\ \big( P(0) \land P(1) \land \cdots \land P(k) \big) \Rightarrow P(k+1) \Big] \Rightarrow \forall n,\ P(n)$$
Let us use complete induction to prove the prime factorization claim from above, with predicate P(n): n can be written as a product of primes. Since the claim is about natural numbers greater than 1, the base case is n = 2, which is itself prime.

Induction Step: Here is the only structural difference for Complete Induction proofs. We let k ≥ 2, and our induction hypothesis is now to assume that for all 2 ≤ i ≤ k, P(i) holds. (That is, we're assuming P(2), P(3), P(4), . . . , P(k) are all true.) The goal is still the same: prove that P(k + 1) is true.
There are two cases. In the first case, assume k + 1 is prime. Then of course k + 1 can be written as a product of primes, so P(k + 1) is true. (The product contains a single number, k + 1.)

In the second case, k + 1 is composite. But then by the definition of compositeness, there exist a, b ∈ N such that k + 1 = ab and 2 ≤ a, b ≤ k; that is, k + 1 has factors other than 1 and itself. This is the intuition from earlier. And here is the "recursive thinking": by the induction hypothesis, P(a) and P(b) hold. (We can only use the induction hypothesis because a and b are at least 2 and less than k + 1.) Therefore we can write

$$a = q_1 \cdots q_{l_1} \quad \text{and} \quad b = r_1 \cdots r_{l_2},$$

where the q's and r's are primes. But then

$$k + 1 = ab = q_1 \cdots q_{l_1} r_1 \cdots r_{l_2},$$

which is a product of primes, so P(k + 1) holds.
Note that we used inductive thinking to break down the problem; but unlike
Simple Induction where the size of the subproblem is one less than the current
problem size, we didn’t know much about the sizes of the resulting problems
(only that they were smaller than the original problem). Complete Induction
allows us to handle this sort of structure.
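The "recursive thinking" in this proof translates directly into a recursive algorithm. Here is a sketch (a hypothetical helper, not from the original notes) that mirrors the structure of the induction step: if n is prime we are done, and otherwise we split n into smaller factors and recurse on both:

def prime_factors(n):
    # Return a list of primes whose product is n, for n >= 2.
    for a in range(2, int(n ** 0.5) + 1):
        if n % a == 0:
            # n = a * (n // a), with both factors smaller than n: recurse.
            return prime_factors(a) + prime_factors(n // a)
    return [n]  # no divisor found, so n is prime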
Our next example concerns the famous Fibonacci sequence 1, 1, 2, 3, 5, 8, . . . , defined by f_1 = 1, f_2 = 1, and f_n = f_{n−1} + f_{n−2} for n ≥ 3. Prove that for all n ≥ 1,

$$f_n = \frac{\left( \frac{1+\sqrt{5}}{2} \right)^n - \left( \frac{1-\sqrt{5}}{2} \right)^n}{\sqrt{5}}.$$

Proof. Note that we really need complete induction here (and not just simple induction) because f_n is defined in terms of both f_{n−1} and f_{n−2}, and not just f_{n−1} only.

The predicate we will prove is $P(n): f_n = \frac{( \frac{1+\sqrt{5}}{2} )^n - ( \frac{1-\sqrt{5}}{2} )^n}{\sqrt{5}}$. We require two base cases: one for n = 1, and one for n = 2. These can be checked by simple calculations:
$$\frac{\left( \frac{1+\sqrt{5}}{2} \right)^1 - \left( \frac{1-\sqrt{5}}{2} \right)^1}{\sqrt{5}} = \frac{\frac{1+\sqrt{5}}{2} - \frac{1-\sqrt{5}}{2}}{\sqrt{5}} = \frac{\sqrt{5}}{\sqrt{5}} = 1 = f_1$$

$$\frac{\left( \frac{1+\sqrt{5}}{2} \right)^2 - \left( \frac{1-\sqrt{5}}{2} \right)^2}{\sqrt{5}} = \frac{\frac{6+2\sqrt{5}}{4} - \frac{6-2\sqrt{5}}{4}}{\sqrt{5}} = \frac{\sqrt{5}}{\sqrt{5}} = 1 = f_2$$
For the induction step, let k ≥ 2 and assume P(1), P(2), . . . , P(k ) hold. Consider
$$\begin{aligned}
f_{k+1} &= f_k + f_{k-1} \\
&= \frac{\left( \frac{1+\sqrt{5}}{2} \right)^k - \left( \frac{1-\sqrt{5}}{2} \right)^k}{\sqrt{5}} + \frac{\left( \frac{1+\sqrt{5}}{2} \right)^{k-1} - \left( \frac{1-\sqrt{5}}{2} \right)^{k-1}}{\sqrt{5}} && \text{(by I.H.)} \\
&= \frac{\left( \frac{1+\sqrt{5}}{2} \right)^k + \left( \frac{1+\sqrt{5}}{2} \right)^{k-1}}{\sqrt{5}} - \frac{\left( \frac{1-\sqrt{5}}{2} \right)^k + \left( \frac{1-\sqrt{5}}{2} \right)^{k-1}}{\sqrt{5}} \\
&= \frac{\left( \frac{1+\sqrt{5}}{2} \right)^{k-1} \left( \frac{1+\sqrt{5}}{2} + 1 \right)}{\sqrt{5}} - \frac{\left( \frac{1-\sqrt{5}}{2} \right)^{k-1} \left( \frac{1-\sqrt{5}}{2} + 1 \right)}{\sqrt{5}} \\
&= \frac{\left( \frac{1+\sqrt{5}}{2} \right)^{k-1} \cdot \frac{6+2\sqrt{5}}{4}}{\sqrt{5}} - \frac{\left( \frac{1-\sqrt{5}}{2} \right)^{k-1} \cdot \frac{6-2\sqrt{5}}{4}}{\sqrt{5}} \\
&= \frac{\left( \frac{1+\sqrt{5}}{2} \right)^{k-1} \left( \frac{1+\sqrt{5}}{2} \right)^2}{\sqrt{5}} - \frac{\left( \frac{1-\sqrt{5}}{2} \right)^{k-1} \left( \frac{1-\sqrt{5}}{2} \right)^2}{\sqrt{5}} \\
&= \frac{\left( \frac{1+\sqrt{5}}{2} \right)^{k+1} - \left( \frac{1-\sqrt{5}}{2} \right)^{k+1}}{\sqrt{5}}
\end{aligned}$$
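This completes the induction step. Since the closed form involves irrational numbers, a floating-point sanity check is reassuring; here is a sketch (illustrative only, not part of the proof):

def check_binet(limit=20):
    # Compare the closed form against f_1 = f_2 = 1, f_n = f_{n-1} + f_{n-2}.
    phi = (1 + 5 ** 0.5) / 2
    psi = (1 - 5 ** 0.5) / 2
    f_curr, f_next = 1, 1
    for n in range(1, limit + 1):
        closed = (phi ** n - psi ** n) / 5 ** 0.5
        assert round(closed) == f_curr
        f_curr, f_next = f_next, f_curr + f_next
    return True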
Beyond Numbers
So far, our proofs have all been centred on natural numbers. Even in situations
where we have proved statements about other objects — like sets and chess-
boards — our proofs have always required associating these objects with natural
numbers. Consider the following problem:
Prove that any non-empty binary tree has exactly one more node
than edge.
You are already familiar with many descriptions of sets: {2, π, √10}, {x ∈ R | x ≥ 4}, and "the set of all non-empty binary trees" are all perfectly valid descriptions of sets. Unfortunately, these set descriptions don't lend themselves very well to induction, because induction is recursion and it isn't clear how to apply recursive thinking to any of these descriptions. However, for some objects – like binary trees – it is relatively straightforward to define them recursively. Here's a warm-up.
Example 2.6. Suppose we want to construct a recursive definition of N. Here is one way. Define N to be the (smallest) set such that:

• 0 ∈ N
• If k ∈ N, then k + 1 ∈ N

("Smallest" means that nothing else is in N. This is an important point to make; for example, the set of integers Z also satisfies the given properties, but includes more than N. In the recursive definitions below, we omit "smallest" but it is always implicitly there.)

Notice how similar this definition looks to the Principle of Simple Induction! This isn't a coincidence: induction fundamentally makes use of this recursive structure of N. We'll refer to the first rule as the base of the definition, and the second as the recursive rule. In general, a recursive definition can have multiple base and recursive rules!
Example 2.7. Construct a recursive definition of “the set of all non-empty binary
trees.”
Intuitively, the base rule(s) always capture the smallest or simplest elements of
a set. Certainly the smallest non-empty binary tree is a single node.
What about larger trees? This is where “breaking down” problems into smaller
subproblems makes the most sense. You should know from CSC148 that we
really store binary trees in a recursive manner: every tree has a root node and
links to the roots of the left and right subtrees (the suggestive word here is
“subtree.”) One slight subtlety is that one or both of these subtrees could be
empty. Here is a formal recursive definition (before you read it, try coming up
with one yourself!):
Define the set of non-empty binary trees to be the (smallest) set such that:

• A single node is a non-empty binary tree.
• If T_1 is a non-empty binary tree, then the tree formed by attaching a new root node to the root of T_1 is a non-empty binary tree.
• If T_1 and T_2 are non-empty binary trees, then the tree formed by attaching a new root node to the roots of both T_1 and T_2 is a non-empty binary tree.

Notice that this definition has two recursive rules, not one!
Structural Induction
Now, we mimic the format of our induction proofs, but with the recursive definition of non-empty binary trees rather than natural numbers. The similarity of form is why this type of proof is called structural induction. In particular, notice the identical terminology.
Example 2.8. Prove that every non-empty binary tree has one more node than edge.

Proof. For a non-empty binary tree T, let V(T) and E(T) denote the number of nodes and edges in T, respectively. The predicate is P(T): V(T) = E(T) + 1.

Base Case: Our base case is determined by the first rule. Suppose T is a single node. Then it has one node and no edges, so P(T) holds.
Induction Step: We’ll divide our proof into two parts, one for each recursive
rule.
• Let T_1 and T_2 be non-empty binary trees, and suppose P(T_1) and P(T_2) hold. Let T be the tree formed by attaching a new root r to the roots of T_1 and T_2. Then

V(T) = V(T_1) + V(T_2) + 1
E(T) = E(T_1) + E(T_2) + 2
since one extra node (new root r) and two extra edges (from r to the roots
of T1 and T2 ) were added to form T. By the induction hypothesis, V ( T1 ) =
E( T1 ) + 1 and V ( T2 ) = E( T2 ) + 1, and so
V(T) = (E(T_1) + 1) + (E(T_2) + 1) + 1
     = (E(T_1) + E(T_2) + 2) + 1
     = E(T) + 1
Therefore P( T ) holds.
• Let T1 be a non-empty binary tree, and suppose P( T1 ) holds. Let T be the tree
formed by taking a new node r and adding an edge to the root of T1 . Then
V ( T ) = V ( T1 ) + 1 and E( T ) = E( T1 ) + 1, and since V ( T1 ) = E( T1 ) + 1 (by
the induction hypothesis), we have
V ( T ) = E( T1 ) + 2 = E( T ) + 1.
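The two recursive counting functions below mirror the recursive rules of the definition; they are a small sketch (using a hypothetical Node class, not from the original notes) that makes the invariant V(T) = E(T) + 1 concrete:

class Node:
    # A node in a non-empty binary tree; either child may be None.
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def num_nodes(t):
    return 1 + sum(num_nodes(c) for c in (t.left, t.right) if c is not None)

def num_edges(t):
    # Each child contributes one edge from t plus the edges in its subtree.
    return sum(1 + num_edges(c) for c in (t.left, t.right) if c is not None)

# Example: a root, a two-node left chain, and a right leaf: 4 nodes, 3 edges.
t = Node(Node(Node()), Node())
assert num_nodes(t) == num_edges(t) + 1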
In structural induction, we identify some property that is satisfied by the simplest (base) elements of the set, and then show that the property is preserved under each of the recursive construction rules. (We say that such a property is invariant under the recursive rules, meaning it isn't affected when the rules are applied. The term "invariant" will reappear throughout this course in different contexts.)

Here is some intuition: imagine you have a set of Lego blocks. Starting with
individual Lego pieces, there are certain “rules” that you can use to combine
Lego objects to build larger and larger structures, corresponding to (say) differ-
ent ways of attaching Lego pieces together. This is a recursive way of describing
the (infinite!) set of all possible Lego creations.
Now suppose you’d like to make a perfectly spherical object, like a soccer ball
or the Death Star. Unfortunately, you look in your Lego kit and all you see are
rectangular pieces! Naturally, you complain to your mother (who bought the kit
for you) that you’ll be unable to make a perfect sphere using the kit. But she
remains unconvinced: maybe you should try doing it, she suggests, and if you’re
lucky you’ll come up with a clever way of arranging the pieces to make a sphere.
Aha! This is impossible, since you’re starting with non-spherical pieces, and you
(being a Lego expert) know that no matter which way you combine Lego objects
together, starting with rectangular objects yields only other rectangular objects
as results. So even though there are many, many different rectangular structures
you could build, none of them could ever be perfect spheres.
A Larger Example
Let us turn our attention to another useful example of induction: proving the equivalence of recursive and non-recursive definitions. We know from our study of Python that often problems can be solved using either recursive or iterative programs (although often a particular problem lends itself more to one technique than the other), but we've taken it for granted that these programs really can accomplish the same task. We'll look later in this course at proving things about what programs do, but for a warm-up in this section, we'll step back from programs and consider two definitions of a set of pairs of numbers. Define S to be the (smallest) set such that:

• (0, 0) ∈ S
• If (a, b) ∈ S, then both (a + 1, b + 1) ∈ S and (a + 3, b) ∈ S

(Again, there are two recursive rules here.)

Also, define the set S′ = {(x, y) ∈ N × N | x ≥ y ∧ 3 | x − y}, where 3 | x − y means that x − y is divisible by 3. Prove that these two definitions are equivalent, i.e., S = S′.
Proof. We divide our solution into two parts. First, we show using structural induction that S ⊆ S′; that is, every element of S satisfies the property of S′. Then, we prove using complete induction that S′ ⊆ S; that is, every element of S′ can be constructed from the base and recursive rules of S.

Part 1: S ⊆ S′. In this part, we show that the base element of S is in S′, and that all elements generated using the recursive rules of S are also in S′. For clarity, we define the predicate

P(x, y): x ≥ y ∧ 3 | x − y
The only base element of S is (0, 0). Clearly, P(0, 0) is true, as 0 ≥ 0 and 3 | 0.

Now for the induction step. There are two recursive rules for S. Let (a, b) ∈ S, and suppose P(a, b) holds. First consider (a + 1, b + 1). By the induction hypothesis, a ≥ b, and so a + 1 ≥ b + 1. Also, (a + 1) − (b + 1) = a − b, which is divisible by 3 (again by the I.H.). So P(a + 1, b + 1) also holds. Next consider (a + 3, b). Since a ≥ b, certainly a + 3 ≥ b; and (a + 3) − b = (a − b) + 3, which is divisible by 3 since a − b is. So P(a + 3, b) holds as well.
Part 2: S′ ⊆ S. We would like to use complete induction, but we can only apply that technique to natural numbers, and not pairs of natural numbers. So we need to associate each pair (a, b) with a single natural number. We can do this using the sum of the pair: define the predicate

P(n): every (x, y) ∈ S′ with x + y = n is also in S.

The base case is n = 0: the only element of S′ whose (x, y) sums to 0 is (0, 0), which is certainly in S by the base rule of the recursive definition.
Now let k ∈ N, and suppose P(0), P(1), . . . , P(k) all hold. Let (x, y) ∈ S′ such that x + y = k + 1. We will prove that (x, y) ∈ S. There are two cases to consider:

• y > 0. Then since x ≥ y, x > 0. Then (x − 1, y − 1) ∈ S′, and (x − 1) + (y − 1) = k − 1. By the Induction Hypothesis (in particular, P(k − 1)), (x − 1, y − 1) ∈ S. Then (x, y) ∈ S by applying the first recursive rule in the definition of S. (The > 0 checks ensure that x − 1, y − 1 ∈ N.)

• y = 0. Since k + 1 > 0, it must be the case that x > 0. Then since x − y = x, x must be divisible by 3, and so x ≥ 3. Then (x − 3, y) ∈ S′ and (x − 3) + y = k − 2, so by the Induction Hypothesis (in particular, P(k − 2)), (x − 3, y) ∈ S. Applying the second recursive rule in the definition of S shows that (x, y) ∈ S.
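This completes both directions, so S = S′. As an aside, the first direction can also be explored computationally; the following sketch (not part of the original proof) generates the portion of S with small coordinates by closing the base element under the two rules, and confirms each generated pair satisfies the defining property of S′:

def generate_S(max_coord):
    # Close {(0, 0)} under (a, b) -> (a+1, b+1) and (a, b) -> (a+3, b),
    # keeping only pairs whose coordinates stay at most max_coord.
    S = {(0, 0)}
    frontier = [(0, 0)]
    while frontier:
        a, b = frontier.pop()
        for nxt in [(a + 1, b + 1), (a + 3, b)]:
            if max(nxt) <= max_coord and nxt not in S:
                S.add(nxt)
                frontier.append(nxt)
    return S

# Every generated pair satisfies the property defining S'.
assert all(x >= y and (x - y) % 3 == 0 for (x, y) in generate_S(30))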
Exercises
Challenge: can you mathematically derive this formula by starting from the
standard geometric identity?
16. Recall two standard trigonometric identities:
17. Let f_n denote the Fibonacci sequence (f_1 = f_2 = 1, and f_n = f_{n−1} + f_{n−2} for n ≥ 3).

(a) Prove that for all n ≥ 1, $\sum_{i=1}^{n} f_i = f_{n+2} - 1$.
(b) Prove that for all n ≥ 1, $\sum_{i=1}^{n} f_{2i-1} = f_{2n}$.
(c) Prove that for all n ≥ 2, $f_n^2 - f_{n+1} f_{n-1} = (-1)^{n-1}$.
(d) Prove that for all n ≥ 1, gcd(f_n, f_{n+1}) = 1. (You may use the fact that for all a < b, gcd(a, b) = gcd(a, b − a).)
(e) Prove that for all n ≥ 1, $\sum_{i=1}^{n} f_i^2 = f_n f_{n+1}$.
18. A full binary tree is a non-empty binary tree where every node has exactly 0 or 2 children. Equivalently, every internal node (non-leaf) has exactly two children.

(a) Prove using complete induction that every full binary tree has an odd number of nodes. (You can choose to do induction on either the height or number of nodes in the tree. A solution with simple induction is also possible, but less straightforward.)
(b) Prove using complete induction that every full binary tree has exactly one more leaf than internal nodes.
(c) Give a recursive definition for the set of all full binary trees.
(d) Reprove parts (a) & (b) using structural induction instead of complete induction.
19. Consider the set of binary trees with the following property: for each node, the heights of its left and right children differ by at most 1. Prove that every binary tree of height n with this property has at least (1.5)^n − 1 nodes.

20. Let k > 1. Prove that for all n ∈ N, $\left(1 - \frac{1}{k}\right)^n \ge 1 - \frac{n}{k}$.
21. Consider the following recursively defined function f: N → N.

$$f(n) = \begin{cases} 2 & \text{if } n = 0 \\ 7 & \text{if } n = 1 \\ (f(n-1))^2 - f(n-2) & \text{if } n \ge 2 \end{cases}$$
Prove that every string in S is balanced, i.e., the number of left brackets equals
the number of right brackets.
24. The Fibonacci trees Tn are a special set of binary trees defined recursively as
follows.
25. Define the set S ⊆ N recursively as follows:

• 2 ∈ S
• If k ∈ S, then k² ∈ S
• If k ∈ S, and k ≥ 2, then k/2 ∈ S

(a) Prove that every element of S is a power of 2, i.e., can be written in the form 2^m for some m ∈ N.
(b) Prove that every power of 2 (including 2^0) is in S.
26. Consider the set S ⊂ N² of ordered pairs of integers defined by the following recursive definition:

• (3, 2) ∈ S
• If (x, y) ∈ S, then (3x − 2y, x) ∈ S

Also define the set S′ = {(2^{k+1} + 1, 2^k + 1) | k ∈ N}. Prove that S = S′.
Prove that for all propositional formulas F, F has a logically equivalent for-
mula G such that G only has negations applied to propositions. For example,
we have the equivalence
¬(¬( P ∧ Q) ⇒ R) ⇔ (¬ P ∨ ¬ Q) ∧ ¬ R
Hint: you won’t have much luck applying induction directly to the statement
in the question. (Try it!) Instead, prove the stronger statement: “F and ¬ F
have equivalent formulas that only have negations applied to propositions.”
28. It is well-known that Facebook friendships are the most important relation-
ships you will have in your lifetime. For a person x on Facebook, let f x denote
the number of friends x has. Find a relationship between the total number of
Facebook friendships in the world, and the sum of all of the f x ’s (over every
person on Facebook). Prove your relationship using induction.
29. Consider the following 1-player game. We start with n pebbles in a pile,
where n ≥ 1. A valid move is the following: pick a pile with more than 1
pebble, and divide it into two smaller piles. When this happens, add to your
score the product of the sizes of the two new piles. Continue making moves
until no more can be made, i.e., there are n piles each containing a single
pebble.
Prove using complete induction that no matter how the player makes her moves, she will always score $\frac{n(n-1)}{2}$ points when playing this game with n pebbles. (So this game is completely determined by the starting conditions, and not at all by the player's choices. Sounds fun.)
30. A certain summer game is played with n people, each carrying one water
balloon. The players walk around randomly on a field until a buzzer sounds,
at which point they stop. You may assume that when the buzzer sounds,
each player has a unique closest neighbour. After stopping, each player then
throws their water balloon at their closest neighbour. The winners of the game
are the players who are dry after the water balloons have been thrown (as-
sume everyone has perfect aim).
Prove that for every odd n, this game always has at least one winner.
$$\frac{1 - x_1}{1 + x_1} \times \frac{1 - x_2}{1 + x_2} \times \cdots \times \frac{1 - x_n}{1 + x_n} \ge \frac{1 - S}{1 + S},$$

where $S = \sum_{i=1}^{n} x_i$.

3. A unit fraction is a fraction of the form $\frac{1}{n}$, n ∈ Z⁺. Prove that every rational number $0 < \frac{p}{q} < 1$ can be written as the sum of distinct unit fractions.
3 Recursion
Now, programming! In this chapter, we will apply what we've learned about induction to study recursive algorithms. In particular, we will learn how to analyse the time complexity of recursive programs, for which the runtime on an input of size n depends on the runtime on smaller inputs. Unsurprisingly, this is tightly connected to the study of recursively defined (mathematical) functions; we will discuss how to go from a recurrence relation like f(n + 1) = f(n) + f(n − 1) to a closed form expression like f(n) = 2^n + n². For recurrences of a special form, we will see how the Master Theorem gives us immediate, tight asymptotic bounds. (Recall that asymptotic bounds involve Big-O, and are less precise than exact expressions.) These recurrences will be used for divide-and-conquer algorithms; you will gain experience with this common algorithmic paradigm and even design algorithms of your own.
Measuring Runtime
Recall that one of the most important properties of an algorithm is how long
it takes to run. We can use the number of steps as a measurement of running
time; but reporting an absolute number like “10 steps” or “1 000 000 steps” an
algorithm takes is pretty meaningless unless we know how “big” the input was,
since of course we’d expect algorithms to take longer on larger inputs. So a
more meaningful measure of runtime is “10 steps when the input has size 2” or
“1 000 000 steps when the input has size 300” or even better, “n2 + 2 steps when
the input has size n.” But as you probably remember from CSC148, counting an
exact number of steps is often tedious and arbitrary, so we care more about the
Big-O (asymptotic) analysis of an algorithm.
In this course, we will mainly care
about the upper bound on the worst-
In CSC148 and earlier in this course, you analysed the runtime of iterative algo- case runtime of algorithms; that is, the
rithms. As we’ve mentioned several times by now, induction is very similar to absolute longest an algorithm could run
on a given input size n.
recursion; since induction has been the key idea of the course so far, it should
come as no surprise that we’ll turn our attention to recursive algorithms now!
Consider the following simple recursive function, which you probably saw in All code in this course will be in
CSC148: Python-like pseudocode. The syntax
and methods will be mostly Python,
with some English making the code
more readable and/or intuitive. We’ll
expect you to follow a similar style.
def fact(n):
    if n == 1:
        return 1
    else:
        return n * fact(n-1)
You should all be familiar with standard function notation: f(n) = n², f(n) = n log n, or the slightly more unusual (but no less meaningful) f(n) = "the number of distinct prime factors of n." There is a second way of defining functions using recursion, e.g.,

$$f(n) = \begin{cases} 0 & \text{if } n = 0 \\ f(n-1) + 2n - 1 & \text{if } n \ge 1 \end{cases}$$
Before returning to our earlier factorial example, let us see how to apply this to
a more concrete example.
Example 3.1. There are exactly two ways of tiling a 3-by-2 grid using triominoes,
shown to the right:
Develop a recursive definition for f (n), the number of ways of tiling a 3-by-n
grid using triominoes for n ≥ 1. Then, find a closed form expression for f .
Solution:
Note that if n = 1, there are no possible tilings, since no triomino will fit in
a 3-by-1 board. We have already observed that there are 2 tilings for n = 2.
Suppose n > 2. The key idea to get a recurrence is that for a 3-by-n block, first
consider the upper-left square. In any tiling, there are only two possible tri-
omino placements that can cover it (these orientations are shown in the diagram
above). Once we have fixed one of these orientations, there is only one possible
triomino orientation that can cover the bottom-left square (again, these are the
two orientations shown in the figure).
So there are exactly two possibilities for covering both the bottom-left and top-left squares. But once we've put these two down, we've tiled the leftmost 3-by-2 part of the grid, and the remainder of the tiling really just tiles the remaining 3-by-(n − 2) part of the grid; there are f(n − 2) such tilings. Since these two parts are independent of each other, we get the total number of tilings by multiplying the number of possibilities for each. Therefore the recurrence relation is:

$$f(n) = \begin{cases} 0 & \text{if } n = 1 \\ 2 & \text{if } n = 2 \\ 2f(n-2) & \text{if } n > 2 \end{cases}$$

(Because we've expressed f(n) in terms of f(n − 2), we need two base cases – otherwise, at n = 2 we would be stuck, as f(0) is undefined.)
Now that we have the recursive definition of f, we would like to find its closed form expression. The first step is to guess the closed form expression, by a "brute force" approach known as repeated substitution. Intuitively, we'll expand out the recursive definition until we find a pattern. (So much of mathematics is finding patterns.)

f(n) = 2f(n − 2)
     = 4f(n − 4)
     = 8f(n − 6)
     ⋮
     = 2^k f(n − 2k)

When n is even, taking k = (n − 2)/2 gives f(n) = 2^{(n−2)/2} f(2) = 2^{n/2}; when n is odd, taking k = (n − 1)/2 gives f(n) = 2^{(n−1)/2} f(1) = 0.

Thus we've obtained a closed form formula for f(n) – except the ⋮ in our repeated substitution does not constitute a formal proof! When you saw the ⋮, you probably interpreted it as "repeat over and over again until. . . " and we already know how to make this thinking formal: induction! That is, given the recursive definition of f, we can prove using complete induction that f(n) has the closed form given above. (Why complete and not simple induction? We need the induction hypothesis to work for n − 2, and not just n − 1.) This is a rather straightforward argument, and we leave it for the Exercises.
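Both the recurrence and the guessed closed form are easy to check against each other for small n (a sketch, illustrative only):

def f(n):
    # Number of triomino tilings of a 3-by-n grid, from the recurrence.
    if n == 1:
        return 0
    elif n == 2:
        return 2
    else:
        return 2 * f(n - 2)

# Closed form guessed by repeated substitution: 0 for odd n, 2^(n/2) for even n.
assert all(f(n) == (0 if n % 2 == 1 else 2 ** (n // 2)) for n in range(1, 21))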
Let us now analyse the runtime of our earlier factorial example.

Solution: Let T(n) denote the worst-case running time of fact on input n. In this course, we will ignore exact step counts entirely, replacing these counts with constants. The base case of this method is when n = 1; in this case, the if block executes and the method returns 1. This is done in constant time, and so we can say that T(1) = c for some constant c. (Constant always means "independent of input size.")

What if n > 1? Then fact makes a recursive call, and to analyse the runtime we consider the recursive and non-recursive parts separately. The non-recursive part is simple: only a constant number of steps occur (the if check, multiplication by n, and the return), so let's say the non-recursive part takes d steps. What about the recursive part? The recursive call is fact(n-1), which has worst-case runtime T(n − 1), by definition! Therefore when n > 1 we get the recurrence relation T(n) = T(n − 1) + d. Putting this together with the base case, we get the full recursive definition of T:

$$T(n) = \begin{cases} c & \text{if } n = 1 \\ T(n-1) + d & \text{if } n > 1 \end{cases}$$
Now, we would like to say that T(n) = O(??), but to do so, we really need a closed form definition of T. Once again, we use repeated substitution.

T(n) = T(n − 1) + d
     = T(n − 2) + d + d = T(n − 2) + 2d
     = T(n − 3) + 3d
     ⋮
     = T(1) + (n − 1)d
     = c + (n − 1)d        (since T(1) = c)

Thus we've obtained the closed form formula T(n) = c + (n − 1)d, modulo the ⋮. As in the previous example, we leave proving this closed form as an exercise. After proving this closed form, the final step is simply to convert this closed form into an asymptotic bound on T. Since c and d are constants with respect to n, we have that T(n) = O(n).
Next, consider binary search on a sorted list:

def bin_search(A, x):
    if len(A) == 0:
        return False
    elif len(A) == 1:
        return A[0] == x
    else:
        m = len(A) // 2
        if x < A[m]:
            return bin_search(A[0..m-1], x)
        else:
            return bin_search(A[m..len(A)-1], x)

(One notable difference from Python is how we'll denote sublists. Here, we use the notation A[i..j] to mean the slice of the list A from index i to index j, including A[i] and A[j].)
We analyse the runtime of bin_search in terms of n, the length of the input list
A. If n = 0 or n = 1, bin_search(A,x) takes constant time (note that it doesn’t
matter whether the constant is the same or different for 0 and 1).
What about when n > 1? Then some recursive calls are made, and we again look at the recursive and non-recursive steps separately. We include the computation of A[0..m-1] and A[m..len(A)-1] in the non-recursive part, since argument evaluation happens before the recursive call begins. (Interestingly, this is not the case in some programming languages – an alternative is "lazy evaluation.")

With this in mind, we conclude that the recurrence relation for T(n) is T(n) = T(n/2) + d. Therefore the full recursive definition of T is

$$T(n) = \begin{cases} c & \text{if } n \le 1 \\ T\!\left(\frac{n}{2}\right) + d & \text{if } n > 1 \end{cases}$$

(Again, we omit floors and ceilings.)
Let us use repeated substitution to guess a closed form. Assume that n = 2^k for some k ∈ N. Then

T(n) = T(n/2) + d
     = T(n/4) + 2d
     ⋮
     = T(1) + kd
     = c + kd

Once again, we'll leave proving this closed form to the Exercises. So T(n) = c + kd. This expression is quite misleading, because it seems to not involve an n, and hence be constant time – which we know is not the case for binary search! The key is to remember that n = 2^k, so k = log₂ n. Therefore we have T(n) = c + d log₂ n, and so T(n) = O(log n).
In our analysis of binary search, we assumed that the list slicing operation
A[0..m-1] took constant time. However, this is not the case in Python and many
other programming languages, which implement this operation by copying the
sliced elements into a new list. Depending on the scale of your application, this
can be undesirable for two reasons: this copying takes time linear in the size of
the slice, and uses linear additional memory.
While we are not so concerned in this course about the second issue, the first can
drastically change our runtime analysis (e.g., in our analysis of binary search).
However, there is always another way to implement these algorithms without
this sort of slicing that can be done in constant time, and without creating new
lists. The key idea is to use variables to keep track of the start and end points
of the section of the list we are interested in, but keep the whole list all the way
through the computation. We illustrate this technique in our modified binary
search:
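(A sketch consistent with the description below, using first and last parameters:)

def bin_search(A, x, first, last):
    # Search for x in A[first..last]; the list A itself is never sliced.
    if first > last:
        return False
    elif first == last:
        return A[first] == x
    else:
        m = (first + last + 1) // 2
        if x < A[m]:
            return bin_search(A, x, first, m - 1)
        else:
            return bin_search(A, x, m, last)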
In this code, the same list A is passed to each recursive call; the range of searched
values, on the other hand, indeed gets smaller, as the first and last parameters
change. More technically, the size of the range, last - first + 1, decreases by
a (multiplicative) factor of two at each recursive call.
Passing indices as arguments works well for recursive functions that work on
smaller and smaller segments of a list. We’ve just introduced the most basic
version of this technique. However, many other algorithms involve making new
lists in more complex ways, and it is usually possible to make these algorithms
in-place, i.e., to use a constant amount of extra memory, and do operations by
changing the elements of the original list.
In general, finding exact closed forms for recurrences can be tedious or even
impossible. Luckily, we are often not really looking for a closed form solution to
a recurrence, but an asymptotic bound. But even for this relaxed goal, the only
method we have so far is to find a closed form and turn it into an asymptotic
bound. In this section, we’ll look at a powerful technique with the caveat that it
works only for a special recurrence form.
def mergesort(A):
    if len(A) <= 1:
        return A
    else:
        m = len(A) // 2
        L1 = mergesort(A[0..m-1])
        L2 = mergesort(A[m..len(A)-1])
        return merge(L1, L2)

def merge(A, B):
    i = 0
    j = 0
    C = []
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:
            C.append(A[i])
            i += 1
        else:
            C.append(B[j])
            j += 1
    return C + A[i..len(A)-1] + B[j..len(B)-1]  # list concatenation
From the written description, you should be able to intuit that there are two recursive calls, each on a list of size n/2. So the cost of the recursive calls is 2T(n/2). (For the last time, we'll point out that we ignore floors and ceilings.) Putting all three steps together, we get the recurrence relation

$$T(n) = 2T\!\left(\frac{n}{2}\right) + cn.$$
• a is the “number of recursive calls”; the bigger a is, the more recursive calls,
and the bigger we expect T (n) to be.
• b determines the rate of decrease of the problem size; the larger b is, the faster
the problem size goes down to 1, and the smaller T (n) is.
• f (n) is the cost of the non-recursive part; the bigger f (n) is, the bigger T (n)
is.
We can further quantify this relationship by considering the following even more specific form:

$$f(n) = \begin{cases} c & \text{if } n = 1 \\ a f\!\left(\frac{n}{b}\right) + n^k & \text{if } n > 1 \end{cases}$$
Repeated substitution (with r = log_b n levels of recursion) gives

$$f(n) = c \, a^r + n^k \sum_{i=0}^{r-1} \left( \frac{a}{b^k} \right)^i.$$

The latter term looks like a geometric series, for which we may use our geometric series formula. However, this only applies when the common ratio a/b^k is not equal to 1. Therefore, there are two cases.

• Case 1: $\frac{a}{b^k} = 1$, so a = b^k. Taking logs, we have log_b a = k. In this case, the expression becomes

$$f(n) = cn^k + n^k \sum_{i=0}^{r-1} 1^i = cn^k + n^k r = cn^k + n^k \log_b n = O(n^k \log n)$$

• Case 2: $\frac{a}{b^k} \ne 1$. Then, using the geometric series formula and the fact that $a^r = a^{\log_b n} = n^{\log_b a}$,

$$f(n) = cn^{\log_b a} + n^k \cdot \frac{1 - \frac{a^r}{b^{kr}}}{1 - \frac{a}{b^k}} = cn^{\log_b a} + n^k \cdot \frac{1 - \frac{n^{\log_b a}}{n^k}}{1 - \frac{a}{b^k}} = \left( c - \frac{1}{1 - \frac{a}{b^k}} \right) n^{\log_b a} + \left( \frac{1}{1 - \frac{a}{b^k}} \right) n^k$$

Since one of the two terms dominates the other, in this case f(n) = O(n^{log_b a}) when log_b a > k, and f(n) = O(n^k) when log_b a < k.
With this intuition in mind, let us now state the Master Theorem.
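Master Theorem: Let T(n) = aT(n/b) + f(n), where a ≥ 1, b > 1, and f(n) = Θ(n^k) for some k ≥ 0. Then

$$T(n) = \begin{cases} O(n^k \log n) & \text{if } \log_b a = k \quad \text{(Case 1)} \\ O(n^{\log_b a}) & \text{if } \log_b a > k \quad \text{(Case 2)} \\ O(n^k) & \text{if } \log_b a < k \quad \text{(Case 3)} \end{cases}$$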
Example 3.4. Consider the recurrence for mergesort: T(n) = 2T(n/2) + dn. Here a = b = 2, so log₂ 2 = 1, while dn = Θ(n¹). Therefore Case 1 of the Master Theorem applies, and T(n) = O(n log n), as we expected.

Example 3.5. Consider the recurrence T(n) = 49T(n/7) + 50n^π. Here, log₇ 49 = 2 and 2 < π, so by Case 3 of the Master Theorem, T(n) = O(n^π).
Even though the Master Theorem is useful in a lot of situations, be sure you
understand the statement of the theorem to see exactly when it applies (see
Exercises for some questions investigating this).
Divide-and-Conquer Algorithms
Now that we have seen the Master Theorem, let’s discuss some algorithms for
which it can help us analyse the runtime! A key feature of the recurrence form
aT (n/b) + f (n) is that each of the recursive calls has the same size. This naturally
leads to the divide-and-conquer paradigm, which can be summarized as fol-
divide-and-conquer
lows:
An algorithmic paradigm is a general
strategy for designing algorithms to
1 def divide-and-conquer(P): solve problems. You will see many
2 if P has "small enough" size: more such strategies in CSC373.
3 return solve_directly(P)
4 else:
5 divide P into smaller problems P_1, ..., P_k (same size)
6 for i from 1 to k:
7 # Solve each subproblem recursively
8 s_i = divide_and_conquer(P_i)
9 # combine the s_1 ... s_k to solve P
10 return combine(s_1 ... s_k)
This is a very general template — in fact, it may seem exactly like your mental model of recursion so far, and certainly it is a recursive strategy. What distinguishes divide-and-conquer algorithms from a lot of other recursive procedures is that we divide the problem into two or more parts and solve the subproblems for each part, whereas recursive functions in general may make only a single recursive call, like in fact or bin_search. (Another common non-divide-and-conquer recursive design pattern is taking a list, processing the first element, then recursively processing the rest of the list, and combining the results.)

This introduction to the divide-and-conquer paradigm was deliberately abstract. However, we have already discussed one divide-and-conquer algorithm: mergesort! Let us now see two more examples of divide-and-conquer algorithms: fast multiplication and quicksort.
Fast Multiplication
Now, let's see a different way of making this faster. Using a divide-and-conquer approach, we want to split 1234 and 5678 into smaller numbers:

1234 = 12 · 100 + 34,    5678 = 56 · 100 + 78.

Now we use some algebra to write the product 1234 · 5678 as the combination of some smaller products:

1234 · 5678 = (12 · 100 + 34)(56 · 100 + 78) = (12 · 56) · 10⁴ + (12 · 78 + 34 · 56) · 10² + 34 · 78.

So now instead of multiplying 4-digit numbers, we have shown how to find the solution by multiplying some 2-digit numbers, a much easier problem! Note that we aren't counting multiplication by powers of 10, since that amounts to just adding some zeroes to the end of the numbers. (On a computer, we would use base-2 instead of base-10 to take advantage of the "adding zeros," which corresponds to (very fast) bit-shift machine operations.)

Reducing 4-digit multiplication to 2-digit multiplication may not seem that impressive; but now, we'll generalize this to arbitrary n-digit numbers (the difference in multiplying 100-digit vs. 50-digit numbers may be more impressive). We have found a mathematical identity that seems useful, and we can use this to develop a multiplication algorithm. Let's see some pseudocode:
def rec_mult(x, y):
    n = length of x  # assume x and y have the same length
    if n == 1:
        return x * y
    else:
        a = x // 10^(n//2)
        b = x % 10^(n//2)
        c = y // 10^(n//2)
        d = y % 10^(n//2)

        r = rec_mult(a, c)
        s = rec_mult(a, d)
        t = rec_mult(b, c)
        u = rec_mult(b, d)

        return r * 10^n + (s + t) * 10^(n//2) + u
Now, let's talk about the running time of this algorithm, in terms of the size n of the two numbers. Note that there are four recursive calls; each call multiplies two numbers of size n/2, so the cost of the recursive calls is 4T(n/2). What about the non-recursive parts? Note that the final return step involves addition of 2n-digit numbers, which takes Θ(n) time. Therefore we have the recurrence

$$T(n) = 4T\!\left(\frac{n}{2}\right) + cn.$$

Since log₂ 4 = 2 > 1, Case 2 of the Master Theorem gives T(n) = O(n²).
So, this approach didn't help! We had an arguably more complicated algorithm that achieved the same asymptotic runtime as what we learned in elementary school! Moral of the story: divide-and-conquer, like all algorithmic paradigms, doesn't always lead to "better" solutions! (This is a serious lesson. It is not the case that everything we teach you works for every situation. It is up to you to carefully put together your knowledge to figure out how to approach problems!)

In the case of fast multiplication, though, we can use more math to improve the running time. Note that the "cross term" ad + bc in the algorithm required two multiplications to compute naïvely; however, it is correlated with the values of ac and bd with the following straightforward identity:

(a + b)(c + d) = ac + (ad + bc) + bd
(a + b)(c + d) − ac − bd = ad + bc
def fast_rec_mult(x, y):
    n = length of x  # assume x and y have the same length
    if n == 1:
        return x * y
    else:
        a = x // 10^(n//2)
        b = x % 10^(n//2)
        c = y // 10^(n//2)
        d = y % 10^(n//2)

        p = fast_rec_mult(a + b, c + d)
        r = fast_rec_mult(a, c)
        u = fast_rec_mult(b, d)

        return r * 10^n + (p - r - u) * 10^(n//2) + u
You can study the (improved!) runtime of this algorithm in the exercises.
Quicksort
Below is pseudocode for quicksort; before moving on, an excellent exercise is to take this pseudocode and implement quicksort yourself. As we will discuss again and again, implementing algorithms yourself is the best way to understand them. Remember that the only way to improve your coding abilities is to code lots — even something as simple and common as sorting algorithms offers great practice. See the Exercises for more examples.
def quicksort(A):
    if len(A) <= 1:
        pass
    else:
        # Choose the final element as the pivot
        pivot = A[-1]

        # Partition the rest of A with respect to the pivot
        L, G = partition(A[0:-1], pivot)
        # Sort each list recursively
        quicksort(L)
        quicksort(G)

        # Combine
        sorted = L + [pivot] + G
        # Set A equal to the sorted list
        for i in range(len(A)):
            A[i] = sorted[i]

def partition(A, pivot):
    L = []
    G = []
    for x in A:
        if x <= pivot:
            L.append(x)
        else:
            G.append(x)
    return L, G
Let us try to analyse the running time T(n) of this algorithm, where n is the length of the input list A. First, the base case n ≤ 1 takes constant time. The partition method takes linear time, since it is called on a list of length n − 1 and contains a for loop that loops through all n − 1 elements. The Python list methods in the rest of the code also take linear time, though a more careful implementation could reduce this. But because partitioning the list always takes linear time, the non-recursive cost of quicksort is linear.

What about the costs of the recursive steps? There are two of them: quicksort(L) and quicksort(G), so the recursive cost in terms of L and G is T(|L|) and T(|G|). (Here |A| denotes the length of the list A.) Therefore a potential recurrence is:

$$T(n) = \begin{cases} c & \text{if } n \le 1 \\ T(|L|) + T(|G|) + dn & \text{if } n > 1 \end{cases}$$

What's the problem with this recurrence? It depends on what L and G are, which in turn depends on the input array and the chosen pivot! In particular, we can't use either repeated substitution or the Master Theorem to analyse this function. In fact, the asymptotic running time of this algorithm can range from Θ(n log n) to Θ(n²), the latter of which is just as bad as bubblesort! (See the Exercises for details.)
This begs the question: why is quicksort so used in practice? Two reasons: quicksort takes Θ(n log n) time "on average", and careful implementations of quicksort yield better constants than other Θ(n log n) algorithms like mergesort. These two facts together imply that quicksort often outperforms other sorting algorithms in practice! (Average-case analysis is slightly more sophisticated than what we do in this course, but you can take this to mean that most of the time, on randomly selected inputs, quicksort takes Θ(n log n) time.)
Exercises
1. Let f: N → N be defined as

$$f(n) = \begin{cases} 0 & \text{if } n = 0 \\ f(n-1) + 2n - 1 & \text{if } n \ge 1 \end{cases}$$

Find a closed form expression for f(n), and prove that it is correct using induction.
3. Prove that the closed form expression for the runtime of fact is T (n) = c +
(n − 1)d.
4. Prove that the closed form expression for the runtime of bin_search is T (n) =
c + d log2 n.
5. Let T (n) be the number of binary strings of length n in which there are no
consecutive 1’s. So T (0) = 1, T (1) = 2, T (2) = 3, etc.
(a) Develop a recurrence for T (n). Hint: think about the two possible cases for
the last character.
(b) Find a closed form expression for T (n).
(c) Prove that your closed form expression is correct using induction.
6. Repeat the steps of the previous question, except with binary strings where
every 1 is immediately preceded by a 0.
7. It is known that every full binary tree has an odd number of nodes. (A full binary tree is a binary tree where every node has either 0 or 2 children.) Let T(n) denote the number of distinct full binary trees with n nodes. For example, T(1) = 1, T(3) = 1, and T(7) = 5. Give a recurrence for T(n), justifying why it is correct. Then, use induction to prove that $T(n) \ge \frac{1}{n} 2^{(n-1)/2}$.
8. Consider the following recursively defined function:

$$f(n) = \begin{cases} 3 & \text{if } n = 0 \\ 7 & \text{if } n = 1 \\ 3f(n-1) - 2f(n-2) & \text{if } n \ge 2 \end{cases}$$

Find a closed form expression for f, and prove that it is correct using induction.
Develop a recursive definition for H (n), and justify why it is correct. Then
find a closed form for H using repeated substitution.
11. Consider the following recursively defined function:
$$T(n) = \begin{cases} 1, & \text{if } n = 1 \\ 4T\!\left(\frac{n}{2}\right) + \log_2 n, & \text{otherwise} \end{cases}$$
Use repeated substitution to come up with a closed form expression for T(n) when n = 2^k, i.e., n is a power of 2. You will need to use the following identity:
$$\sum_{i=0}^{n} i \cdot a^i = \frac{n \cdot a^{n+2} - (n+1) \cdot a^{n+1} + a}{(a-1)^2}.$$
(a)
1 def sum(A):
2 if len(A) == 0:
3 return 0
4 else:
5 return A[0] + sum(A[1..len(A)-1])
(b)
1 def fun(A):
2 if len(A) < 2:
3 return len(A) == 0
4 else:
5 return fun(A[2..len(A)-1])
(c)
1 def double_fun(A):
2 n = len(A)
3 if n < 2:
4 return n
5 else:
6 return double_fun(A[0..n-2]) + double_fun(A[1..n-1])
(d)
1 def mystery(A):
2 if len(A) <= 1:
3 return 1
4 else:
5 d = len(A) // 4
6 s = mystery(A[0..d-1])
7 i = d
8 while i < 3 * d:
9 s += A[i]
10 i += 1
11 s += mystery(A[3*d..len(A)-1])
12 return s
where L and G are the partitions of the list. Clearly, how the list is partitioned
matters a great deal for the runtime of quicksort.
(a) Suppose the lists are always evenly split; that is, |L| = |G| = n/2 at each recursive call. Find a tight asymptotic bound on the runtime of quicksort using this assumption. (For simplicity, we'll ignore the fact that each list really would have size (n − 1)/2.)
(b) Now suppose that the lists are always very unevenly split: | L| = n − 2
and | G | = 1 at each recursive call. Find a tight asymptotic bound on the
runtime of quicksort using this assumption.
4 Program Correctness
In our study of algorithms so far, we have mainly been concerned with their
worst-case running time. While this is an important consideration of any pro-
gram, there is arguably a much larger one: program correctness! That is, while
it is important for our algorithms to run quickly, it is more important that they
work! You are used to testing your programs to demonstrate their correctness, but your confidence depends on the quality of your testing. (Frankly, developing high-quality tests takes a huge amount of time – much longer than you probably spent on it in CSC148!)

In this chapter, we'll discuss methods of formally proving program correctness, without writing any tests at all. We cannot overstate the importance of this technique: a test suite cannot possibly test a program on all possible inputs (unless it is a very restricted program), and so a proof is the only way we can ensure that our programs are actually correct on all inputs. Even for larger software systems, which are far too complex to formally prove their correctness, the skills you will learn in this chapter will enable you to reason more effectively about your code; essentially, what we will teach you is the art of semantically tracing through code. (By "semantically" we mean your ability to derive meaning from code, i.e., to identify exactly what the program does. This contrasts with program syntax, things like punctuation and, in Python, indentation.)
What is Correctness?
You may be familiar with the most common tools used to specify program correctness: preconditions and postconditions. Formally, a precondition of a function is a property that an input to the function must satisfy in order to guarantee that the function will work properly. A postcondition of a function is a property that must be satisfied after the function completes. Most commonly, this refers to properties of a return value, though it can also refer to changes to the variables passed in, as with the implementation of quicksort from the previous chapter (which didn't return anything, but instead changed the input list A). As a program designer, it is up to you to specify preconditions; this often balances the desire for flexibility (allowing a broad range of inputs and usages) against feasibility (how much code you want or are able to write).
Example 4.1. Consider the following code for calculating the greatest common
divisor of two natural numbers. Its pre- and postconditions are shown.
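Here is a listing consistent with the path analysis in Example 4.2 below, which cites returns at lines 7, 9, and 11; the precise precondition shown (a ≥ b ≥ 1) is an assumption.

1 def gcd_rec(a, b):
2 '''
3 Pre: a and b are natural numbers, with a >= b >= 1
4 Post: returns gcd(a, b)
5 '''
6 if a == 1 or b == 1:
7 return 1
8 elif a % b == 0:
9 return b
10 else:
11 return gcd_rec(b, a % b)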
So preconditions tell us what must be true before the program starts, and postconditions tell us what must be true after the program terminates (assuming it ends at all). We have the following formal statement of correctness. Though it is written in more formal language, note that it really captures what we mean when we say that a program is "correct."

A function f is correct if the following holds: for every input that satisfies the preconditions of f, the call to f terminates, and when it does, the postconditions of f are satisfied.

Consider the code for gcd_rec shown in the previous example. Here is its statement of correctness:

For every pair of inputs a and b satisfying the preconditions of gcd_rec, the call gcd_rec(a, b) terminates, and when it does, it returns gcd(a, b).
Writing full induction proofs that formalise the above logic is tedious, so instead we use the fundamental idea in a looser template. For each program path from the first line to a return statement, we show that it terminates and that, when it does, the postconditions are satisfied. We do this as follows:

• If the path contains no recursive calls or loops, analyse the code line by line until the return statement.

• For each recursive call on the path (if there are any), argue why the preconditions are satisfied at the time of the recursive call, and that the recursive call occurs on a "smaller" input than the original call. (There is some ambiguity around what is meant by "smaller"; we will discuss this shortly.) Then you may assume that the postconditions for the recursive call are satisfied when the recursive call terminates. Finally, argue from the last recursive call to the end of the function why the postconditions of the original function call will hold.

• For each loop, use a "loop invariant." We will deal with this in the next section.
Example 4.2. We will show that gcd_rec is correct. There are three program
paths (this is easy to see, because of the if statements). Let’s look at each one
separately.
• Path 1: the program terminates at line 7. If the program goes into this block,
then a = 1 or b = 1. But in these cases, gcd( a, b) = 1, because gcd( x, 1) = 1
for all x. Then the postcondition holds, since at line 7 the program returns 1.
• Path 2: the program terminates at line 9. If the program goes into this block,
b divides a. Since b is the greatest possible divisor of itself, this means that
gcd( a, b) = b, and b is returned at line 9.
• Path 3: the program terminates at line 11. We need to check that the recursive
call satisfies its preconditions and is called on a smaller instance. Note that b
and ( a mod b) are both at least 1, and ( a mod b) < b, so the preconditions
are satisfied. Since a + b > ( a mod b) + b, the sum of the inputs decreases, and
so the recursive call is made on a smaller instance.
Therefore when the call completes, it returns gcd(b, a mod b). Now we use the identity gcd(a, b) = gcd(b, a mod b) to conclude that the original call returns the correct answer. (If you recall the example of using complete induction on ordered pairs, taking the sum of the two components was the size measure we used there, too.)
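The proof below analyses a recursive binary search on a sorted list. For concreteness, here is a listing consistent with that analysis; the exact code is an assumption, chosen so that its five program paths and the return at line 15 match what the proof cites.

1 def bin_search(A, x):
2 '''
3 Pre: A is a list of numbers sorted in non-decreasing order
4 Post: returns True if and only if x appears in A
5 '''
6 if len(A) == 0:
7 return False
8 elif len(A) == 1:
9 return A[0] == x
10 else:
11 guess = len(A) // 2
12 if A[guess] == x:
13 return True
14 elif A[guess] > x:
15 return bin_search(A[0..guess-1], x)
16 else:
17 return bin_search(A[guess+1..len(A)-1], x)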
Proof of correctness. Here there are five different program paths. We’ll check
three of them, and leave the others as an exercise:
• Path 4: the program terminates at line 15. This happens when len(A) > 1,
and A[guess] > x. Because A is sorted, and A[guess] > x, for every index
i ≥ guess, A[i] > x. Therefore the only way x could appear in A is if it
appeared at an index smaller than guess.
Now, let us handle the recursive call. Since guess ≤ len( A) − 1, we have
that guess − 1 ≤ len( A) − 2, and so the length of the list in the recursive
call is at most len( A) − 1; so the recursive call happens on a smaller instance.
Therefore, when the recursive call returns, the postcondition is satisfied: it
returns true if and only if x appears in A[0..guess-1]. The original function
call then returns this value; by the discussion in the previous paragraph, this
is the correct value to return, so the postcondition is satisfied.
Iterative Programs
In this section, we'll discuss how to handle loops in our code. So far, we have been able to determine the exact sequence of steps in each program path (e.g., "Lines 1, 2, 4, and 6 execute, and then the program returns"). (We've treated recursive calls as "black boxes" that behave nicely, i.e., can be treated as a single step, as long as their preconditions are satisfied and they are called on smaller inputs.) However, this is not the case when we are presented with a loop, because the sequence of steps depends on the number of times the loop iterates, which in turn depends on the input (e.g., the length of an input list). Thus our argument for correctness cannot possibly go step by step!
Instead, we treat the entire loop as a single unit, and give a separate correctness argument just for it. But what do we mean for a loop to be "correct"?
Consider the following function.
1 def avg(A):
2 '''
3 Pre: A is a non-empty list of numbers
4 Post: Returns the average of the numbers in A
5 '''
6 sum = 0
7 i = 0
8 while i < len(A):
9 sum += A[i]
10 i += 1
11 return sum / len(A)
Clearly, the loop calculates the sum of the elements in A one element at a time. After some thought, we determine that the variable sum starts with value 0 and in the loop takes on the values A[0], then A[0] + A[1], then A[0] + A[1] + A[2], etc. We formalize this by defining a loop invariant for this loop. A loop invariant is a predicate that is true every time the loop condition is checked (including the check that terminates the loop). Usually, the predicate will depend on which iteration the loop is on, or more generally, the value(s) of the program variable(s) associated with the loop. For example, in avg, the loop invariant corresponding to our previous intuition is

$$P(i, sum) : sum = \sum_{k=0}^{i-1} A[k].$$

(By convention, the empty sum $\sum_{k=0}^{-1} A[k]$ evaluates to 0.)

The i and sum in the predicate really correspond to the values of those variables
The i and sum in the predicate really correspond to the values of those variables
in the code for avg. That is, this predicate is stating a property of these variables
in the code.
Unfortunately, this loop invariant isn’t quite right; what if i > len( A)? Then the
sum is not well-defined, since, for example, A[len(A)] is undefined. This can be
solved with a common technique for loop invariants: putting bounds on “loop
counter” variables, as follows:
$$Inv(i, sum) : 0 \le i \le len(A) \;\wedge\; sum = \sum_{k=0}^{i-1} A[k].$$

This ensures that the sum is always well-defined, and has the added benefit of explicitly defining a possible range of values for i. (In general, the fewer possible values a variable takes on, the fewer cases you have to worry about in your code.)
A loop invariant is correct if it is always true at the beginning of every loop
iteration, including the loop check that fails, causing the loop to terminate. This is
why we allowed i ≤ len( A) rather than just i < len( A) in the invariant.
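Before proving an invariant, it can help to check it empirically. Here is a minimal sketch of avg instrumented to assert Inv(i, sum) at every loop check, including the final one; the instrumented function is ours, and the variable total stands in for sum to avoid shadowing Python's built-in.

def avg_checked(A):
    total = 0    # plays the role of sum in the text
    i = 0
    while True:
        # Inv(i, sum): 0 <= i <= len(A) and sum = A[0] + ... + A[i-1]
        assert 0 <= i <= len(A) and total == sum(A[:i])
        if not i < len(A):
            break
        total += A[i]
        i += 1
    return total / len(A)

assert avg_checked([1, 2, 3, 4]) == 2.5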
How do we prove that loop invariants are correct? The argument is yet another application of induction:

• First, we argue that the loop invariant is satisfied when the loop is reached (this is arrow (1) in the diagram).

• Then, we argue that if the loop invariant is satisfied at the beginning of an iteration, then after the loop body executes once (i.e., one loop iteration occurs), the loop invariant still holds (arrow (2)).

• Finally, after proving that the loop invariant is correct, we show that if the invariant holds when the loop ends, then the postcondition will be satisfied (arrow (3)).

(The diagram shows Pre connected by arrow (1) to Inv, arrow (2) looping from Inv back to itself, and arrow (3) leading from Inv to the postcondition.)

Though this is basically an inductive proof, as was the case for recursive programs, we won't hold to the formal induction structure here. (If we wanted to be precise, we would do induction on the number of loop iterations executed.)
Example 4.4. Let us formally prove that avg is correct. The main portion of the
proof will be the proof that our loop invariant Inv(i, sum) is correct.
Proof. When the program first reaches the loop, i = 0 and sum = 0. Plugging this into the predicate yields

$$Inv(0, 0) : 0 \le 0 \le len(A) \;\wedge\; 0 = \sum_{k=0}^{-1} A[k],$$

which is true (recall the note about the empty sum from earlier).
Now suppose the loop invariant holds when i = i0 , at the beginning of a loop it-
eration. Let sum0 be the value of the variable sum at this time. The loop invariant
we are assuming is the following:
$$Inv(i_0, sum_0) : 0 \le i_0 \le len(A) \;\wedge\; sum_0 = \sum_{k=0}^{i_0 - 1} A[k].$$
What happens next? The obvious answer is "the loop body runs," but this misses one subtle point: if i0 = len(A) (which is allowed by the loop invariant), the body of the loop doesn't run. We don't need to worry about this case, since we only care about checking what happens to the invariant when the loop actually runs.
Assume that i0 < len( A), so that the loop body runs. What happens in one
iteration? Two things: sum increases by A[i0 ], and i increases by 1. Let sum1
and i1 be the values of sum and i at the end of the loop iteration. We have
sum1 = sum0 + A[i0 ] and i1 = i0 + 1. Our goal is to prove that the loop invariant
holds for i1 and sum1 , i.e.,
$$Inv(i_1, sum_1) : 0 \le i_1 \le len(A) \;\wedge\; sum_1 = \sum_{k=0}^{i_1 - 1} A[k].$$
Let us check that the loop invariant is still satisfied by sum1 and i1. First, 0 ≤ i0 < i0 + 1 = i1 ≤ len(A), where the first inequality comes from the loop invariant holding at the beginning of the iteration, and the last from the assumption that i0 < len(A). The second part of Inv can be checked by a simple calculation:

$$sum_1 = sum_0 + A[i_0] = \sum_{k=0}^{i_0 - 1} A[k] + A[i_0] = \sum_{k=0}^{i_1 - 1} A[k].$$
The next key idea is that when the loop ends, the variable i has value len(A): by the loop invariant it always has value ≤ len(A), and if it were strictly less than len(A), another iteration of the loop would run. Then by the loop invariant, the value of sum is $\sum_{k=0}^{len(A)-1} A[k]$, i.e., the sum of all the elements in A! The final step is to continue tracing until the program returns, which in this case takes just a single step: the program returns sum / len(A). But this is exactly the average of the numbers in A, because the variable sum is equal to their sum! Therefore the postcondition is satisfied. (That might seem like a lot of writing to get to what we said paragraphs ago, but this is a formal argument that confirms our intuition. We also implicitly use here the mathematical definition of "average" as the sum of the numbers divided by how many there are.)
Note the deep connection between the loop invariant and the postcondition.
There are many other loop invariants we could have tried to prove: for example,
Inv(i, sum) : i + sum ≥ 0. But this wouldn’t have helped at all in proving
the postcondition! When confronting more problems on your own, it will be
up to you to determine the right loop invariants for the job. Keep in mind
that choosing loop invariants can usually be done by either taking a high-level
approach and mimicking something in the postcondition or taking a low-level
approach by carefully tracing through the code on test inputs to try to find
patterns in the variable values.
One final warning before we move on: loop invariants describe relationships between variables at a specific moment in time (specifically, at the beginning of a particular loop check). Students often try to use loop invariants to capture how variables change over time (e.g., "i will increase by 1 when the loop runs"), which creates massive headaches: determining how the code works is the meat of a proof, and shouldn't be shoehorned into a single predicate! When working with more complex code, we take the view that loop invariants are properties that are preserved, even if they don't describe exactly how the code works or exactly what happens. This flexibility is what makes correctness proofs manageable.
1 def mult(a,b):
2 '''
3 Pre: a and b are natural numbers
4 Post: returns a * b
5 '''
6 m = 0
7 count = 0
8 while count < b:
9 m += a
10 count += 1
11 return m
Proof of correctness. The key thing to figure out here is how the loop accom-
plishes the multiplication. It’s clear what it’s supposed to do: the variable m
changes from 0 at the beginning of the loop to a*b at the end. How does this
change happen? One simple thing we could do is make a table of values of the
variables m and count as the loop progresses:
m      count
0      0
a      1
2a     2
3a     3
...    ...
Aha: m seems to always contain the product of a and count; and, when the loop ends, count = b! This leads directly to the following loop invariant (including the bound on count):

$$Inv(m, count) : m = a \times count \;\wedge\; 0 \le count \le b.$$

Consider an execution of the code, with the preconditions satisfied by the inputs. Then when the loop is first encountered, m = 0 and count = 0, so m = a × count, and count ≤ b (since b ∈ N).
Now suppose the loop invariant holds at the beginning of some iteration, with m = m0 and count = count0; explicitly, we assume that Inv(m0, count0) holds. Furthermore, suppose count0 < b, so the loop runs. When the loop runs, m increases by a and count increases by 1. Let m1 and count1 denote the new values of m and count; so m1 = m0 + a and count1 = count0 + 1. Since Inv(m0, count0) holds, we have
m1 = m0 + a
= a × count0 + a (by invariant)
= a(count0 + 1)
= a × count1
Moreover, since we’ve assumed count0 < b, we have that count1 = count0 + 1 ≤
b. So Inv(m1 , count1 ) holds.
Finally, when the loop terminates, we must have count = b, since by the loop
invariant count ≤ b, and if count < b another iteration would occur. Then by
the loop invariant again, when the loop terminates, m = ab, and the function
returns m, satisfying the postcondition.
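As with avg, the invariant can be checked mechanically on small inputs before (or after) writing the proof; the instrumented function below is our sketch, not from the text.

def mult_checked(a, b):
    m = 0
    count = 0
    while True:
        # Inv(m, count): m = a * count and 0 <= count <= b
        assert m == a * count and 0 <= count <= b
        if not count < b:
            break
        m += a
        count += 1
    assert m == a * b    # invariant plus the exit condition count == b
    return m

for a in range(6):
    for b in range(6):
        assert mult_checked(a, b) == a * b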
Termination
Unfortunately, there is a slight problem with all the correctness proofs we have
done so far. We’ve used phrases like “when the recursive call ends” and “when
the loop terminates”. But how do we know that the recursive calls and loops
end at all? That is, how do we know that a program does not contain an infinite
loop or infinite recursion?
(This is a serious issue: what beginning programmer hasn't been foiled by either of these errors?)

The case for recursion is actually already handled by our implicit induction
proof structure. Recall that the predicate in our induction proofs is that f is cor-
rect on inputs of size n; part of the definition of correctness is that the program
terminates. Therefore as long as the induction structure holds — i.e., that the
recursive calls are getting smaller and smaller — termination comes for free.
The case is a little trickier for loops. As an example, consider the loop in avg. Because the loop invariant Inv(i, sum) doesn't say anything about how i changes, we can't use it to prove that the loop terminates. But for most loops, including this one, it is "obvious" why they terminate, because they typically have counter variables that iterate through a fixed range of values. (In avg, the counter role is played by the variable i, which goes through the range {0, 1, . . . , len(A)}.)
This argument would certainly convince us that the avg loop terminates, and
in general all loops with this form of loop counter terminate. However, not all
loops you will see or write will have such an obvious loop counter. Here’s an
example:
1 def collatz(n):
2 ''' Pre: n is a natural number '''
3 curr = n
4 while curr > 1:
5 if curr % 2 == 0:
6 curr = curr // 2
7 else:
8 curr = 3 * curr + 1

(In fact, it is an open question in mathematics whether this function halts on all inputs or not. If only we had a computer program that could tell us whether it does!)
Therefore we'll now introduce a formal way of proving loop termination. Recall that our correctness proofs of recursive functions hinged on the fact that the recursive calls were made on smaller and smaller inputs, until some base case was reached. Our strategy for loops will draw inspiration from this: we associate with the loop a loop variant v that has the following two properties:

(1) v decreases with each iteration of the loop

(2) v is always a natural number at the beginning of each loop iteration

(While this is not the only strategy for proving termination, it turns out to be suitable for most loops involving numbers and lists. When you study more advanced data structures and algorithms, you will discuss more complex arguments for both correctness and termination.)

If such a v exists, then at some point v won't be able to decrease any further (because 0 is the smallest natural number), and therefore the loop cannot have any more iterations. This is analogous to inputs to recursive calls getting smaller and smaller until a base case is reached.
Example 4.6. Proof of termination of avg. Even though we’ve already observed that
the loop has a natural loop counter variable i, this variable increases with each
iteration. Instead, our loop variant will be v = len( A) − i. Let us check that v
satisfies the properties (1) and (2):
(1) Since at each iteration i increases by 1, and len( A) stays the same, v =
len( A) − i decreases by 1 on each iteration.
(2) Note that i and len(A) are both always natural numbers. But this alone is not enough to conclude that v ∈ N; for example, 3, 5 ∈ N but (3 − 5) ∉ N. However, the loop invariant we proved included the predicate 0 ≤ i ≤ len(A), and because of this we can conclude that len(A) − i ≥ 0, so len(A) − i ∈ N. (This is a major reason we include such loop counter bounds in the loop invariant.)
Since we have established that v is a decreasing, bounded variant for the loop,
this loop terminates, and therefore avg terminates (since every other line of code
is a simple step that certainly terminates).
Notice that the above termination proof relied on i increasing by 1 on each itera-
tion, and that i never exceeds len(A). That is, we basically just used the fact that
i was a standard loop counter. Here is a more complex example where there is
no obvious loop counter.
1 def term_ex(x,y):
2 ''' Pre: x and y are natural numbers. '''
3 a = x
4 b = y
5 while a > 0 or b > 0:
6 if a > 0:
7 a -= 1
8 else:
9 b -= 1
10 return x * y
Proof. Intuitively, the loop terminates because when the loop runs, either a or b decreases, and the loop will stop once a and b reach 0. To make this argument formal, we need the following loop invariant: a, b ≥ 0, whose proof we leave as an exercise.

The loop variant we define is v = a + b. Let us prove the necessary properties for v:

(1) On each iteration, exactly one of a and b decreases by 1 while the other is unchanged, so v = a + b decreases by 1.

(2) By the loop invariant, a ≥ 0 and b ≥ 0, so v = a + b is always a natural number.

Therefore the loop terminates.
Exercises
1. Here is some code that recursively determines the smallest element of a list.
Give pre- and postconditions for this function, then prove it is correct accord-
ing to your specifications.
1 def recmin(A):
2 if len(A) == 1:
3 return A[0]
4 else:
5 m = len(A) // 2
6 min1 = recmin(A[0..m-1])
7 min2 = recmin(A[m..len(A)-1])
8 return min(min1, min2)
1 def sort_colours(A):
2 '''
3 Pre: A is a list whose elements are either 'red' or 'blue'
3. Prove the following loop invariant for the loop in term_ex: Inv( a, b) : a, b ≥ 0.
4. Consider the following modification of the term_ex example.
1 def term_ex_2(x,y):
2 ''' Pre: x and y are natural numbers '''
3 a = x
4 b = y
5 while a >= 0 or b >= 0:
6 if a > 0:
7 a -= 1
8 else:
9 b -= 1
10 return x * y
5. For each of the following, state pre- and postconditions that capture what
the program is designed to do, then prove that it is correct according to your
specifications.
Don’t forget to prove termination (even though this is pretty simple). It’s easy
to forget about this if you aren’t paying attention.
(a)
(b)
4 while r >= d:
5 r -= d
6 q += 1
7 return q
(c)
1 def lcm(a,b):
2 x = a
3 y = b
4 while x != y:
5 if x < y:
6 x += a
7 else:
8 y += b
9 return x
(d)
1 def div3(s):
2 sum = 0
3 i = 0
4 while i < len(s):
5 sum += s[i]
6 i += 1
7 return sum % 3 == 0
(e)
1 def count_zeroes(L):
2 z = 0
3 i = 0
4 while i < len(L):
5 if L[i] == 0:
6 z += 1
7 i += 1
8 return z
(f)
1 def f(n):
2 r = 2
3 i = n
4 while i > 0:
5 r = 3*r - 2
6 i -= 1
7 return r
1 def f(x):
2 ''' Pre: x is a natural number '''
3 a = x
4 y = 10
5 while a > 0:
6 a -= y
7 y -= 1
8 return a * y
(a)
(b)
8. Prove that the following function is correct. Warning: this one is probably the
most difficult of these exercises. But, it runs in linear time – pretty amazing!
1 def majority(A):
2 '''
3 Pre: A is a list with more than half its entries equal to x
4 Post: Returns the majority element x
5 '''
6 c = 1
7 m = A[0]
8 i = 1
9 while i <= len(A) - 1:
10 if c == 0:
11 m = A[i]
12 c = 1
13 elif A[i] == m:
14 c += 1
15 else:
16 c -= 1
17 i += 1
18 return m
1 def bubblesort(L):
2 '''
3 Pre: L is a list of numbers
4 Post: L is sorted
5 '''
6 k = 0
7 while k < len(L):
8 i = 0
9 while i < len(L) - k - 1:
10 if L[i] > L[i+1]:
11 L[i], L[i+1] = L[i+1], L[i]
12 i += 1
13 k += 1
6 return pivot
7 elif len(L) >= k:
8 return extract(L, k)
9 else:
10 return extract(G, k - len(L) - 1)
In this final chapter, we turn our attention to the study of finite automata, a
simple model of computation with surprisingly deep applications ranging from
vending machines to neurological systems. We focus on one particular applica-
tion: matching regular languages, which are the foundation of natural language
processing, including text searching and parsing. This application alone makes
automata an invaluable computational tool, one with which you are probably
already familiar in the guise of regular expressions.
Definitions
We open with some definitions related to strings. An alphabet Σ is a finite set of symbols, e.g., {0, 1}, {a, b, . . . , z}, or {0, 1, . . . , 9, +, −, ×, ÷}. A string over an alphabet Σ is a finite sequence of symbols from Σ. Therefore "0110" is a string over {0, 1}, and "abba" and "cdbaaaa" are strings over {a, b, c, d}. The empty string "", denoted by e, consists of a sequence of zero symbols from the alphabet. (Σ is the Greek letter "Sigma"; e is "epsilon.") We use the notation Σ* to denote the set of all strings over the alphabet Σ.

The length of a string w ∈ Σ* is the number of symbols appearing in the string, and is denoted |w|. For example, |e| = 0, |aab| = 3, and |11101010101| = 11. We use Σ^n to denote the set of strings over Σ of length n. For example, if Σ = {0, 1}, then Σ^0 = {e} and Σ^2 = {00, 01, 10, 11}. So Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ · · ·
A language over an alphabet Σ is a set of strings L ⊆ Σ*. For example, here are some languages over the alphabet {a, b, c}:

• {w ∈ {a, b, c}* | |w| ≤ 3}
• {w ∈ {a, b, c}* | w has the same number of a's and c's}
• {w ∈ {a, b, c}* | w can be found in an English dictionary}
These are pretty mundane examples. Somewhat surprisingly, however, this no-
tion of languages also captures solutions to computational problems. Consider the
following languages over the alphabet of all standard ASCII characters.
Regular Languages
Regular languages are the most basic kind of languages, and are derived from rather simple language operations. In particular, we define the following three operations for languages L, M ⊆ Σ*:

• Union: L ∪ M = {w | w ∈ L or w ∈ M}.
• Concatenation: LM = {xy | x ∈ L and y ∈ M}, the strings formed by a string from L followed by a string from M.
• Kleene star: L* = {e} ∪ L ∪ LL ∪ LLL ∪ · · ·, the strings formed by concatenating zero or more strings from L.
Example 5.1. Consider the languages L = {a, bb} and M = {a, c}. Then we have

L ∪ M = {a, bb, c}
LM = {aa, ac, bba, bbc}
L* = {e, a, aa, bb, aaa, abb, bba, . . . }
M* = {e, a, c, aa, ac, ca, cc, aaa, aac, . . . }

(Note that M* = {a, c}* is exactly the set of strings made up of only a's and c's. This explains the notation Σ* for the set of all strings over the alphabet Σ.)
Definition 5.1 (Regular Language). The set of regular languages over an alphabet Σ is defined recursively as follows:

• ∅, the empty language, is regular.
• {e}, the language consisting of only the empty string, is regular.
• For each symbol a ∈ Σ, the language {a} is regular.
• If L and M are regular languages, then so are L ∪ M, LM, and L*.
Regular languages are sets of strings, and are often infinite. As humans, we
are able to leverage our language processing and logical abilities to represent
languages; for example, “strings that start and end with the same character”
and “strings that have an even number of zeroes” are both simple descriptions
of regular languages. What about computers? We could certainly write simple programs that compute either of the above languages (exercise: write these programs!), but we have grander ambitions. Specifically, we would like a simple, computer-friendly representation of regular languages, so that we could input an arbitrary regular language and a string, and the computer would determine whether the string is in the language or not.
This is precisely the idea behind the regular expression (regex), a pattern-based string representation of a regular language. Given a regular expression r, we use L(r) to denote the language matched (represented) by r. Here are the elements of regular expressions (note the almost identical structure to the definition of regular languages themselves):

• ∅ is a regex, with L(∅) = ∅; e is a regex, with L(e) = {e}; and each symbol a ∈ Σ is a regex, with L(a) = {a}.
• If r1 and r2 are regexes, then so are r1 + r2, r1r2, and r1*, with L(r1 + r2) = L(r1) ∪ L(r2), L(r1r2) = L(r1)L(r2), and L(r1*) = (L(r1))*.
It’s an easy exercise to prove by structural induction that every regular language
can be matched by a regular expression, and every regular expression matches a language
that is regular. Another way to put it is that a language L is regular if and only if
there is a regular expression r such that L = L(r ).
Example 5.2. Let Σ = {0, 1}. Describe the language of the regex 01 + 1(0 + 1)∗ .
To interpret this regex, we need to understand precedence rules. By convention, these are identical to the standard arithmetic precedences: star has the highest precedence, followed by concatenation, and finally union. (So star is like power, concatenation is like multiplication, and union is like addition.) Therefore the complete bracketing of this regex is (01) + (1((0 + 1)*)), but with these precedence rules in place, we only need the brackets around the (0 + 1).
Let us proceed part by part. The “01” component matches the string 01. The
(0 + 1)∗ matches all binary strings, because it contains all strings resulting from
adding a 0 or 1 at each step. This means that 1(0 + 1)∗ matches a 1, followed by
any binary string. Finally, we take the union of these two: L(01 + 1(0 + 1)∗ ) is
the set of strings that are either 01 or start with a 1.
Example 5.3. Let's go the other way, and develop a regular expression given a description of the following regular language:

L = {w ∈ {a, b}* | |w| ≤ 2}.

Solution:
Note that L is finite, so in fact we can simply list out all of the strings in our
regex:
e + a + b + aa + ab + ba + bb
Another strategy is to divide up the regex into cases depending on length:
e + ( a + b) + ( a + b)( a + b)
The three parts capture the strings of length 0, 1, and 2, respectively. A final
representation would be to say that we’ll match two characters, each of which
could be empty, a, or b:
(e + a + b)(e + a + b).
All three regexes we gave are correct! In general, there is more than one regular
expression that matches any given regular language.
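Claims like this can be machine-checked for small cases. Here is a sketch using Python's re module; the encoding (translating + into | and e into an empty alternative) is ours.

import re
from itertools import product

regexes = [r"|a|b|aa|ab|ba|bb",         # e + a + b + aa + ab + ba + bb
           r"|a|b|(a|b)(a|b)",          # e + (a + b) + (a + b)(a + b)
           r"(|a|b)(|a|b)"]             # (e + a + b)(e + a + b)
for n in range(5):
    for chars in product("ab", repeat=n):
        w = "".join(chars)
        results = [re.fullmatch(r, w) is not None for r in regexes]
        # All three regexes agree, and match exactly the strings of length <= 2.
        assert results == [len(w) <= 2] * 3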
But what about the complement of L, i.e., the language L̄ = {w ∈ {0, 1}∗ |
w does not have 11 as a substring}? It is more difficult to find a regex for this
language because regular expressions specify patterns that should be matched,
not avoided. Let's approach this problem by interpreting the definition as "every 1 must be preceded by a 0." This suggests a first attempt, (00*1)*, which matches sequences of blocks, each consisting of one or more 0's followed by a single 1. But this fails to match strings that begin with a 1, such as 101; allowing an optional leading 1 fixes that:

(e + 1)(00*1)*.

There is one last problem: this fails to match any string that ends with 0's, e.g., 0100. We can apply a similar fix as the previous one to allow a block of 0's to be matched at the end:

(e + 1)(00*1)*0*.
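Brute force over short strings is a handy sanity check here too; a sketch, again translating the regex into Python's syntax:

import re
from itertools import product

pattern = re.compile(r"1?(00*1)*0*")    # (e + 1)(00*1)*0*
for n in range(11):
    for bits in product("01", repeat=n):
        w = "".join(bits)
        # The regex should match exactly the strings with no "11" substring.
        assert (pattern.fullmatch(w) is not None) == ("11" not in w)

Of course, this checks only strings up to length 10; it builds confidence, but it is not a proof.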
Some interesting meta-remarks arise naturally from the last example. One is a rather straightforward way of showing that a language and a regex are inequivalent: simply produce a string that is in one but not the other. (Keep this in mind when you and your friends are arguing about whose regular expressions are correct.) On the other hand, convincing ourselves that a regular expression correctly matches a language can be quite difficult; how did we know that (e + 1)(00*1)*0* correctly matches L̄? Proving that a regex matches a particular language L is much harder, because we need to show that every string in L is matched by the regex, and every string not in L is not. (Note that this is a universally quantified statement, while its negation, that the regex doesn't match L, is existentially quantified. This explains the difference in difficulty.)
Our intuition only takes us so far — it is precisely this gap between our gut and true understanding that proofs were created to fill. This is, however, just a little beyond the scope of the course; you have all the necessary ingredients — surprise: induction — but the arguments for all but the simplest languages are quite involved. (We will see later that reasoning about the correctness of deterministic finite automata is just as powerful, and a little simpler.)

Example 5.5. We finish off this section with a few corner cases to illustrate some of the subtleties in our definition, and extend the arithmetic metaphor we hinted at earlier. First, the following equalities show that ∅ plays the role of the "zero" in regular expressions. Let r be an arbitrary regular expression.

∅ + r = r + ∅ = r
∅r = r∅ = ∅

(By equality between regular expressions we mean that they match the same language; that is, r1 = r2 ⇔ L(r1) = L(r2).)
The first is obvious because taking the union of any set with the empty set
doesn’t change the set. What about concatenation? Recall that concatenation
of languages involves taking combinations of strings from the first language
and strings from the second; if one of the two languages is empty, then no
combinations are possible.
We can use similar arguments to show that e plays the role of the “one”:
er = re = r
e∗ = e
A Suggestive Flowchart
You might be wondering how computers actually match strings to regular expressions. It turns out that regular languages are rather easy to match because of the following (non-obvious!) property: a regular language can always be matched by reading the string one symbol at a time, left to right, while keeping track of only a fixed, finite amount of information.

(Figure: a flowchart with two boxes; reading a 0 stays in the current box, reading a 1 moves to the other box, and a string is accepted if processing it ends in the second box.)

After some thought you may realise that the accepted strings are exactly the ones with an odd number of 1's. A more suggestive way of saying this is that the language accepted by this flowchart is the set of strings with an odd number of 1's.
Formally, a deterministic finite automaton (DFA) is a quintuple D = (Q, Σ, δ, s, F), where Q is a finite set of states, Σ is an alphabet, δ : Q × Σ → Q is the transition function, s ∈ Q is the initial state, and F ⊆ Q is the set of final (accepting) states.

Example 5.6. In the introductory example, the state set is Q = {q0, q1}, the alphabet is {0, 1}, the initial state is q0, and the set of final states is {q1}. (Note that there's only one final state in this example, but in general there may be several.) We can represent the transition function as a table of values:

state   symbol   resulting state
q0      0        q0
q0      1        q1
q1      0        q1
q1      1        q0

In general, the number of rows in the transition table is |Q| · |Σ|; each state must have exactly |Σ| transitions leading out of it, each labelled with a unique symbol in Σ.
Before proceeding, make sure you understand how a DFA processes a string: it begins in the initial state, reads the string one symbol at a time, follows the transition labelled with each symbol read, and accepts the string exactly when the state it ends in is a final state.
A quick note about notation before proceeding with a few more examples. Tech-
nically, δ takes as its second argument a single symbol; e.g., δ(q0 , 1) = q1
(from the previous example). But we can just as easily extend this definition
to arbitrary-length strings in the second argument. For example, we can say
δ(q0 , 11) = q0 , δ(q1 , 1000111) = q1 , and δ(q0 , e) = q0 . For every state q, δ(q, e) = q.
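These conventions translate directly into code. Here is a minimal sketch (the dictionary encoding and all names are ours) of the DFA from Example 5.6, with δ extended to whole strings:

delta = {('q0', '0'): 'q0', ('q0', '1'): 'q1',
         ('q1', '0'): 'q1', ('q1', '1'): 'q0'}
start, final = 'q0', {'q1'}

def delta_star(q, w):
    # Extended transition function: the state reached from q after reading w.
    for ch in w:
        q = delta[(q, ch)]
    return q    # note delta_star(q, "") == q, as in the text

def accepts(w):
    return delta_star(start, w) in final

assert delta_star('q0', '11') == 'q0'
assert delta_star('q1', '1000111') == 'q1'
assert accepts('0100') and not accepts('110011')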
Example 5.7. Let us design a DFA that accepts the following language:

L = {w ∈ {a, b}* | w begins with a and ends with b}.

Solution: Consider starting at an initial state q0. Reading an a at q0 should lead to a state q2 where we need to "end with a b," while reading a b at q0 should lead to a "dead" state q1 that loops to itself on both a and b, since such a string can never be in L. The simple way to achieve "end with a b" is to have q2 loop on a's, then move to an accepting state q3 on a b. What about the transitions for q3? As long as it's reading b's, it can accept, so it should loop to itself. On the other hand, if it reads an a, it should go back to q2, and continue reading until it sees another b.

(The resulting DFA: δ(q0, a) = q2 and δ(q0, b) = q1; q1 loops on both symbols; δ(q2, a) = q2 and δ(q2, b) = q3; δ(q3, b) = q3 and δ(q3, a) = q2; the only accepting state is q3.)
Correctness of DFAs

Like regular expressions, arguing that DFAs are incorrect is generally easier than arguing that they are correct. However, because of the rather restricted form DFAs must take, reasoning about their behaviour is a little more amenable than for regular expressions.

The simple strategy of "pick an arbitrary string in L, and show that it is accepted by the DFA" is hard to carry out, because the paths taken through the DFA by different accepted strings can be quite different; i.e., different strings will probably require substantially different proofs. Therefore we adopt a different strategy. We know that DFAs consist of states and transitions between states; the term state suggests that if a string reaches that point, the DFA "knows" something about that string, or is "expecting" what will come next. We formalize this notion by characterizing for each state precisely what must be true about the strings that reach it: a state invariant for a state q is a predicate P such that for every string w ∈ Σ*, δ(s, w) = q if and only if P(w) holds.
Note that the definition of state invariant uses an if and only if. We aren’t
just giving properties that the strings reaching q must satisfy; we are defining
precisely which strings reach q. Let us see how state invariants can help us prove
that DFAs are correct.
Example 5.8. Consider the following language over the alphabet {0, 1}: L = {w | w has an odd number of 1's}, and the DFA shown: two states q0 (initial) and q1 (accepting), with a 0-loop on each state and 1-transitions between q0 and q1 in both directions. Prove that the DFA accepts precisely the language L.

Proof. It is fairly intuitive why this DFA is correct: strings with an even number of 1's go to q0, and transition to q1 upon reading a 1 (where the string now has an odd number of 1's). Here are some state invariants for the two states:

• q0: w has an even number of 1's.
• q1: w has an odd number of 1's.
Here are two important properties to keep in mind when designing state invari-
ants:
• The state invariants should be mutually exclusive. That is, there should be no
overlap between them; no string should satisfy two different state invariants.
Otherwise, to which state would the string go?
• The state invariants should be exhaustive. That is, they should cover all pos-
sible cases; every string in Σ∗ , including e, should satisfy one of the state
invariants. Otherwise, the string goes nowhere.
These conditions are definitely satisfied by our two invariants above, since every
string has either an even or odd number of 1’s.
Next, we want to prove that the state invariants are correct. We do this in two
steps.
• Show that the empty string e satisfies the state invariant of the initial state.
In our case, the initial state is q0 ; e has zero 1’s, which is even; therefore the
state invariant is satisfied by e.
• For each transition from a state q to a state r labelled with a symbol a, show that if a string w satisfies the invariant of state q, then the string wa satisfies the invariant of r. (That is, each transition respects the invariants.) There are four transitions in our DFA. For the two 0-loops, appending a 0 to a string doesn't change the number of 1's in the string; hence if w contains an even (odd) number of 1's, then w0 contains an even (odd) number of 1's as well, so these two transitions are correct. On the other hand, appending a 1 increases the number of 1's in a string by one. So if w contains an even (odd) number of 1's, w1 contains an odd (even) number of 1's, so the transitions between q0 and q1 labelled 1 preserve the invariants.

(The astute reader will note that we are basically doing a proof by induction on the length of the strings. Like our proofs of program correctness, we "hide" the formalities of induction proofs and focus only on the content.)
Thus we have proved that the state invariants are correct. The final step is to show that the state invariants of the accepting state(s) precisely describe the target language. (Remember that in general, there can be more than one accepting state!) This is very obvious in this case, because the only accepting state is q1, and its state invariant is exactly the defining characteristic of the target language L.
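State invariants lend themselves to the same kind of exhaustive sanity check we used for loop invariants. The sketch below (ours, and a confidence boost rather than a proof) confirms the two invariants on all binary strings of length at most 8:

from itertools import product

delta = {('q0', '0'): 'q0', ('q0', '1'): 'q1',
         ('q1', '0'): 'q1', ('q1', '1'): 'q0'}

def state_after(w):
    q = 'q0'
    for ch in w:
        q = delta[(q, ch)]
    return q

for n in range(9):
    for bits in product('01', repeat=n):
        w = ''.join(bits)
        # q0 should hold exactly the strings with an even number of 1's,
        # and q1 exactly those with an odd number.
        expected = 'q1' if w.count('1') % 2 == 1 else 'q0'
        assert state_after(w) == expected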
Limitations of DFAs
The simplicity of the DFA model enables proofs of correctness, as shown above.
This simplicity is also useful for reasoning about the model’s limitations. In this
section, we’ll cover two examples. First, we prove a lower bound on the number
of states required in a DFA accepting a particular language. Then, we’ll show
that certain languages cannot be accepted by DFAs of any size!
Example 5.9. Consider the language L = {w ∈ {0, 1}* | w contains at least three 1's}. We'll prove that any DFA accepting this language has at least 4 states.

Proof. Suppose for contradiction that D is a DFA with at most 3 states that accepts L. For i = 0, 1, 2, 3, let w_i denote the string of i 1's, so the four strings under consideration are e, 1, 11, and 111. Since D has at most 3 states, by the Pigeonhole Principle two of these strings, say w_i and w_j with i < j, must reach the same state of D.
Now, since w_i and w_j reach the same state, they are indistinguishable prefixes to the DFA; this means that any strings of the form w_i x and w_j x will end up at the same state in the DFA, and hence are both accepted or both rejected. However, suppose x = w_{3−j}. Then w_i x = w_{i+3−j} and w_j x = w_{j+3−j} = w_3. Then w_i x contains 3 − j + i 1's, and w_j x contains three 1's. But since i < j, 3 − j + i < 3, so w_i x ∉ L, while w_j x ∈ L. Therefore w_i x and w_j x cannot end up at the same state in D, a contradiction!
The key idea in the above proof was that the four different strings e, 1, 11, 111 all had to reach different states, because there were suffixes that could distinguish any pair of them. (These suffixes were the x = w_{3−j} in the proof.) In general, to prove that a language L requires at least k states in any DFA accepting it, it suffices to give a set of k strings, each of which is distinguishable from the others with respect to L. (Two strings w1 and w2 are "distinguishable with respect to L" if there is a suffix x such that w1x ∈ L and w2x ∉ L, or vice versa.)

Now we turn to a harder problem: proving that some languages cannot be accepted by DFAs of any size.
Example 5.10. Consider the language L = {0^n 1^n | n ∈ N}. Prove that no DFA accepts L.

Proof. Suppose for contradiction that some DFA D with k states accepts L, and let s be its initial state. We will find two strings that reach the same state of D, even though L requires them to be treated differently.

Here's how: consider the string w = 0^{k+1}, i.e., the string consisting of k + 1 0's. Since there are only k states in D, the path that w takes through D must involve a loop starting at some state q. That is, we can break up w into three parts: w = 0^a 0^b 0^c, where b ≥ 1, δ(s, 0^a) = q, and δ(q, 0^b) = q.

This loop is dangerous for the DFA! Because reading 0^b causes a loop that begins and ends at q, the DFA forgets whether it has read 0^b or not; thus the strings 0^a 0^c and 0^a 0^b 0^c reach the same state, and are now indistinguishable to D. (The DFA has lost track of the number of 0's.) But of course these two strings are distinguishable with respect to L: 0^a 0^c 1^{a+c} ∈ L, but 0^a 0^b 0^c 1^{a+c} ∉ L.
Nondeterminism
Consider the following language over the alphabet {0, 1}: L = {w | the third last character of w is 1}. You'll see in the Exercises that any DFA accepting L needs at least 8 states, using the techniques we developed in the previous section. Yet there is a very short regular expression that matches this language: (0 + 1)*1(0 + 1)(0 + 1). Contrast this with the regular expression (0 + 1)(0 + 1)1(0 + 1)*, matching strings whose third character is 1; this has a simpler 5-state DFA. (You can prove both facts in the Exercises.)
Why is it hard for DFAs to "implement" the former regex, but easy to implement the latter? The fundamental problem is the uncertainty associated with the Kleene star. In the former case, a DFA cannot tell how many characters to match with the initial (0 + 1)* segment before moving on to the 1! This is not an issue in the latter case, because DFAs read left to right, and so have no problem reading the first three characters and then matching the rest of the string against the (0 + 1)*.

On the other hand, consider the following automaton: state q0 loops to itself on 0 and 1, and on a 1 can also move to q1; q1 moves to q2 on either 0 or 1; q2 moves to q3 on either 0 or 1; and q3, the final state, has no outgoing transitions. This is not deterministic, because it has a "choice": reading in a 1 at q0 can loop back to q0 or move to q1. Moreover, reading in a 0 or a 1 at q3 leads nowhere! But consider this: for any string whose third last character is indeed a 1, there is a "correct path" that leads to the final state q3. For example, for the string 0101110, the correct path would continuously loop at q0 for the prefix 0101, then read the next 1, transition to q1, then read the remaining 10 to end at q3, and accept.

BUT WAIT, you say, isn't this like cheating? How did the automaton "know" to loop at the first two 1's, then transition to q1 on the third 1? We will define our model in this way first, so bear with us. The remarkable fact we'll show later is that, while this model seems more powerful than plain old DFAs, in fact every language that we can accept in this model can also be accepted with a DFA. Cheating doesn't help.
A Nondeterministic Finite Automaton (NFA) is defined in a similar fashion to a DFA: it is a quintuple N = (Q, Σ, δ, s, F), with Q, Σ, s, and F playing the same roles as before. The transition function δ now maps to sets of states rather than individual states; that is, δ : Q × Σ → 2^Q, where 2^Q denotes the set of all subsets of Q. For instance, in the previous example, δ(q0, 1) = {q0, q1}, and δ(q3, 0) = ∅.

We think of δ(q, a) here as representing the set of possible states reachable from q by reading in the symbol a. This is extended in the natural way to δ(q, w) for arbitrary-length strings w, to mean all states reachable from q by reading in the string w. Note that for a state q and next symbol a, if δ(q, a) = ∅ then this path "aborts," i.e., the NFA does not continue reading more characters along this path.
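Computing δ(q, w) amounts to tracking the set of possible current states. Here is a sketch (the dictionary encoding is ours) for the third-last-character NFA above:

delta = {('q0', '0'): {'q0'}, ('q0', '1'): {'q0', 'q1'},
         ('q1', '0'): {'q2'}, ('q1', '1'): {'q2'},
         ('q2', '0'): {'q3'}, ('q2', '1'): {'q3'},
         ('q3', '0'): set(),  ('q3', '1'): set()}
start, final = 'q0', {'q3'}

def nfa_accepts(w):
    current = {start}
    for ch in w:
        # Paths with no outgoing transition simply "abort" (contribute nothing).
        current = set().union(*(delta[(q, ch)] for q in current))
    return bool(current & final)

assert nfa_accepts('0101110')        # third last character is 1
assert not nfa_accepts('0010')       # third last character is 0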
e-transitions

We can augment NFAs even further through the use of e-transitions: nondeterministic transitions that do not require reading in a symbol to activate. That is, if you are currently at a state q of an NFA, and there is an e-transition from q to another state r, then you can transition to r without reading the next symbol in the string.

For instance, consider an NFA with e-transitions from q0 to q1 and from q0 to q4, a 0-transition from q1 to q2, a 1-transition from q4 to q5, and transitions labelled e and 0 from q2 to the accepting state q3. This NFA accepts the string 0 by taking the e-transition to q1, reading the 0 to reach q2, then taking another e-transition to q3.

As was the case for nondeterminism, e-transitions do not actually add any power to the model, although they can be useful in certain constructions, which we'll use in the next section.
Equivalence of Definitions
So far in this chapter, we have used both DFAs and regular expressions to represent the class of regular languages. We have taken for granted that DFAs are sufficient to represent regular languages; in this section, we will prove this formally. There is also the question of nondeterminism: do NFAs accept a larger class of languages than DFAs? In this section, we'll show that the answer, somewhat surprisingly, is no. Specifically, we will sketch a proof of the following theorem.

Theorem. For every language L, the following statements are equivalent:
(1) L is regular (equivalently, L is matched by some regular expression).
(2) L is accepted by some DFA.
(3) L is accepted by some NFA.

Proof. If you aren't familiar with theorems asserting the equivalence of multiple statements, what we need to prove is that any one of the statements being true implies that all of the others must also be true. We are going to prove this by showing the following chain of implications: (3) ⇒ (2) ⇒ (1) ⇒ (3). (We only sketch the main ideas of the proof. For a more formal treatment, see Sections 7.4.2 and 7.6 of Vassos Hadzilacos' course notes.)
(3) ⇒ (2). Given an NFA, we'll show how to construct a DFA that accepts the same language. Here is the high-level idea: nondeterminism allows you to "choose" different paths to take through an automaton. After reading in some characters, the possible path choices can be interpreted as the NFA being simultaneously in some set of states. Therefore we can model the NFA as transitioning between sets of states each time a symbol is read. Rather than formally defining this construction, we'll illustrate it on an example NFA with states 0, 1, and 2, where 0 is the initial state, 1 is the accepting state, reading an a at state 2 can lead to state 1, and there is an e-transition from 1 back to 2. The states of the constructed DFA are the subsets of {0, 1, 2}.

Next, we put in transitions between the states of the DFA. Consider the subset {0, 2} of states in the NFA. Upon reading the symbol a, we could end up at all three states, {0, 1, 2} (notice that to reach 2, we must transition from 2 to 1 by reading the a, then use the e-transition from 1 back to 2). So in our constructed DFA, there is a transition from {0, 2} to {0, 1, 2} on symbol a. In general, we look at all possible outcomes starting from a state in subset S and reading symbol a; repeating this for all subsets yields all of the DFA's transitions.
Next, we need to identify the initial and final states. The initial state of the NFA is 0, and since there are no e-transitions leaving 0, the initial state of the constructed DFA is {0}. The final states of the DFA are exactly the subsets containing the final state 1 of the NFA. Finally, we can simplify the DFA considerably by removing the states ∅ and {2}, which cannot be reached from the initial state.
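Here is a compact sketch of this subset construction (no e-transitions, for brevity; with them, each subset would be replaced by its e-closure; all names are ours):

from itertools import chain

def nfa_to_dfa(nfa_delta, start, finals, alphabet):
    # DFA states are frozensets of NFA states, built by exploring from {start}.
    start_set = frozenset({start})
    dfa_delta = {}
    todo, seen = [start_set], {start_set}
    while todo:
        S = todo.pop()
        for a in alphabet:
            # All NFA states reachable from any state in S by reading a.
            T = frozenset(chain.from_iterable(nfa_delta.get((q, a), set()) for q in S))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_finals = {S for S in seen if S & finals}
    return dfa_delta, start_set, dfa_finals

Because only subsets reachable from {start} are ever generated, unreachable states are dropped automatically, mirroring the simplification described above. Applied to the third-last-character NFA from earlier in this chapter, the construction yields the 8-state DFA promised in the Exercises.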
(1) ⇒ (3). In this part, we show how to construct NFAs from regular expressions. Note that regular expressions have a recursive definition, so we can actually prove this part using structural induction. First, there are simple NFAs for the base cases ∅, {e}, and {a} (for a generic symbol a): a single non-accepting state with no transitions accepts ∅; a single accepting initial state with no transitions accepts {e}; and an initial state with an a-transition to an accepting state accepts {a}.

Next, we show how to construct NFAs for union, concatenation, and star, the three recursive operations used to define regular expressions. Note that because we're using structural induction, it suffices to show how to perform these operations on NFAs; that is, given two NFAs N1 and N2, construct NFAs accepting L(N1) ∪ L(N2), L(N1)L(N2), and (L(N1))*. In what follows, s1 and s2 denote the start states of N1 and N2, respectively.
First consider union. This can be accepted by an NFA with a new start state s that has e-transitions to both s1 and s2. Essentially, the idea is that starting in the new start state, we "guess" whether the word will be accepted by N1 or N2 by e-transitioning to either s1 or s2, and then see if the word is actually accepted by running the corresponding NFA.

For concatenation, we start with the first NFA N1, and then every time we reach a final state, we "guess" that the matched string from L(N1) is complete, and e-transition to the start state of N2; the final states of the combined NFA are those of N2.

Finally, for the Kleene star we perform a similar construction, except that the final states of N1 e-transition back to s1 rather than to s2. To possibly accept e, we add a new initial state that is also accepting.
(2) ⇒ (1). Finally, we show how to construct regular expressions from DFAs. This is the hardest construction to prove, so our sketch here will be especially high-level. The key claim is the following:

For any two states q, r in a DFA, there is a regular expression that matches precisely the strings w such that δ(q, w) = r; i.e., the strings that induce a path from q to r.

To prove this, number the states of the DFA 1, 2, . . . , n, and for all states i and j and every 0 ≤ k ≤ n, let L_{ij}(k) denote the set of strings that induce a path from i to j passing through no intermediate state numbered higher than k. Note that L_{ij}(0) is the set of strings where there must be no intermediate states, i.e., w is a symbol labelling a transition directly from i to j. Also, L_{ij}(n) = L_{ij}: no restrictions are placed on the states that can be passed through. We will show how to inductively build up regular expressions matching each of the L_{ij}(k), where the induction is done on k. (Formally, our predicate is P(k): "For all states i and j, the set L_{ij}(k) can be matched by a regex.") First, the base case, which we've already described intuitively:
$$L_{ij}(0) = \begin{cases} \{a \in \Sigma \mid \delta(i, a) = j\}, & \text{if } i \neq j \\ \{a \in \Sigma \mid \delta(i, a) = j\} \cup \{e\}, & \text{if } i = j \end{cases}$$

Note that when i = j, we need to include e, as this indicates the trivial act of following no transition at all. Since the L_{ij}(0) are finite sets of symbols, we can write regular expressions for them; e.g., if L_{ij}(0) = {a, c, f}, the regex would be a + c + f. (You can prove this fact in the Exercises.)
Finally, here is the recursive definition of the sets that will allow us to construct regular expressions: it defines L_{ij}(k + 1) in terms of sets at level k, using only the operations of union, concatenation, and star:

$$L_{ij}(k+1) = L_{ij}(k) \;\cup\; L_{i,k+1}(k)\,\big(L_{k+1,k+1}(k)\big)^{*}\,L_{k+1,j}(k)$$

The two terms correspond to the paths from i to j that avoid state k + 1 entirely, and the paths that visit state k + 1 one or more times. Therefore, given regular expressions for L_{ij}(k), L_{i,k+1}(k), L_{k+1,k+1}(k), and L_{k+1,j}(k), we can construct a regular expression for L_{ij}(k + 1).
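This recurrence translates directly into a short program. Below is a sketch for the two-state DFA of Example 5.8; the regex-string helpers, the use of None for ∅ and "e" for the empty string, and the state numbering are all our conventions.

def union(r, s):
    if r is None:
        return s
    if s is None:
        return r
    return "(" + r + "+" + s + ")"

def concat(r, s):
    if r is None or s is None:
        return None              # concatenating with the empty set gives the empty set
    if r == "e":
        return s
    if s == "e":
        return r
    return r + s

def star(r):
    if r is None or r == "e":
        return "e"
    return "(" + r + ")*"

# DFA for "odd number of 1's": state 1 = even (initial), state 2 = odd (accepting)
delta = {(1, '0'): 1, (1, '1'): 2, (2, '0'): 2, (2, '1'): 1}
states = (1, 2)

# Base case: L_ij(0) is the set of symbols leading directly from i to j,
# plus e when i = j.
L = {}
for i in states:
    for j in states:
        r = None
        for a in '01':
            if delta[(i, a)] == j:
                r = union(r, a)
        L[(i, j)] = union(r, "e") if i == j else r

# Recursive case: allow one more intermediate state at a time.
for k in states:
    L = {(i, j): union(L[(i, j)],
                       concat(L[(i, k)], concat(star(L[(k, k)]), L[(k, j)])))
         for i in states for j in states}

print(L[(1, 2)])    # a (messy but correct) regex for "odd number of 1's"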
Exercises
1. For each of the following regular languages over the alphabet Σ = {0, 1},
design a regular expression and DFA which accepts that language. For which
languages can you design an NFA that is substantially smaller than your
DFA?
2. Let L = {w ∈ {0, 1}∗ | the third character of w is a 1}. Prove that every DFA
accepting L has at least 5 states.
3. Let L = {w ∈ {0, 1}∗ | the third last character of w is a 1}. Prove that every
DFA accepting L has at least 8 states. Hint: Consider the 8 binary strings of length 3. (For a bonus, what is the smallest DFA you can find that accepts L? It will have at least 8 states!)
4. Prove by induction that every finite language can be represented by a regular expression. (This shows that all finite languages are regular.)
5. Prove that the following languages are not regular.
(a) {a^{n²} | n ∈ N}
(b) {xx | x ∈ {0, 1}*}
(c) {w ∈ {a, b}* | w has more a's than b's}
(d) {w ∈ {0, 1}* | w has two blocks of 0's with the same length} (A block is a maximal substring containing the same character; for example, the string 00111000001 has four blocks: 00, 111, 00000, and 1.)
(e) {a^n b^m c^{n−m} | n ≥ m ≥ 0}
6. Recall that the complement of a language L ⊆ Σ* is the set L̄ = {w ∈ Σ* | w ∉ L}.
Prove that if L is a regular language, then so is Pre(L), the language of prefixes of strings in L: Pre(L) = {x ∈ Σ* | xy ∈ L for some y ∈ Σ*}. (Hint: recall the definition of regular languages, and use structural induction!)
6 In Which We Say Goodbye
With CSC236 complete, you have now mastered the basic concepts and reason-
ing techniques vital to your computer science career both at this university and
beyond. You have learned how to analyse the efficiency of your programs, both
iterative and recursive. You have also learned how to argue formally that they
are correct by using program specifications (pre- and postconditions) and loop
invariants. You studied the finite automaton, a simple model of computation
with far-reaching consequences.
Where to from here? Most obviously, you will use your skills in CSC263 and
CSC373, where you will study more complex data structures and algorithms.
You will see first-hand the real tools with which computers store and compute
with large amounts of data, facing real-world problems as ubiquitous as sorting
– but whose solutions are not nearly as straightforward. If you were intrigued
by the idea of provably correct programs, you may want to check out CSC410; if
you liked the formal logic you studied in CSC165, and are interested in more of
its computer science applications (of which there are many!), CSC330, CSC438,
CSC465, and CSC486 would be good courses to consider. If you’d like to learn
about more powerful kinds of automata and more complex languages, CSC448
is the course for you. Finally, CSC463 tackles computability and complexity theory,
the fascinating study of the inherent hardness of problems.
For more information on the above, or other courses, or any other matter aca-
demic, professional, or personal, come talk to any one of us in the Department
of Computer Science! Our {doors, ears, minds} are always open.