Aidma 2 7 2
Charles A. Cusack
David A. Santos
Version 2.7.2
June 7, 2021
History
• An Active Introduction to Discrete Mathematics and Algorithms, 2019, 2018, 2017, Charles
A. Cusack. Minor revisions/fixing of errors. A few additions/subtractions of material.
Contents
Charles A. Cusack
May, 2019
How to use this book
As the title of the book indicates, this is not a book that is just to be read. It was written so
that the reader interacts with the material. If you attempt to just read what is written and take
no part in the exercises that are embedded throughout, you will likely get very little out of it.
Learning needs to be active, not passive. The more active you are as you ‘read’ the book, the
more you will get out of it. That will translate to better learning. And it will also translate to a
higher grade. So whether you are motivated by learning (which is my hope) or merely by getting
a certain grade, your path will be the same–use this book as described below.
The content is presented in the following manner. First, concepts and definitions are given–
generally one at a time. Then one or more examples that illustrate the concept/definition will
be given. After that you will find one or more exercises of various kinds. This is where this
book differs from most. Instead of piling on more examples that you merely read and think you
understand, you will be asked to solve some for yourself so that you can be more confident that
you really do understand.
Some of the exercises are just called Exercises. They are very similar to the examples, except
that you have to provide the solution. There are also Fill in the details which provide part of
the solution, but ask you to provide some of the details. The point of these is to help you think
about some of the finer details that you might otherwise miss. There are also Questions of various
kinds that get you thinking about the concepts. Finally, there are Evaluate exercises. These ask
you to look at solutions written by others and determine whether or not they are correct. More
precisely, your goal is to try to find as many errors in the solutions as you can. Usually there will
be one or more errors in each solution, but occasionally a correct solution will be given, so pay
careful attention to every detail. The point of these exercises is to help you see mistakes before
you make them. Many of these exercises are based on solutions from previous students, so they
often represent the common mistakes students make. Hopefully if you see someone else make
these mistakes, you will be less likely to make them yourself.
The point of the exercises is to get you thinking about and interacting with the material. As
you encounter these, you should write your solution in the space provided. After you have written
your solution, you should check your answer with the solution provided in the back of the book.
You will get the most out of them if you first do your best to give a complete solution on your
own, and then always check your solution with the one provided to make sure you did it correctly.
If yours is significantly different, make sure you determine whether or not the differences are just
a matter of choice or if there is something wrong with your solution.
If you get stuck on an exercise, you should re-read the previous material (definitions, examples,
etc.) and see if that helps. Then give it a little more thought. For Fill in the details questions,
sometimes reading what is past a blank will help you figure out what to put there. If you get
really stuck on an exercise, look up the solution and make sure you fully understand it. But don’t
jump to the solution too quickly or too often without giving an honest attempt at solving the
exercise yourself. When you do end up looking up a solution, you should always try to rewrite
it in the space provided in your own words. You should not just copy it word for word. You
won’t learn as much if you do that. Instead, do your best to fully understand the solution. Then,
without looking at the solution, try to re-solve the problem and write your solution in the space
provided. Then check the solution again to make sure you got it right.
It is highly recommended that you act as your own grader when you check your solutions. If
your solution is correct, put a big check mark in the margin. If there are just a few errors, use a
different colored writing utensil to mark and fix your errors. If your solution is way off, cross it
out (just put a big ‘X’ through it) and write out your second attempt, using a separate sheet of
paper if necessary. If you couldn’t get very far without reading the solution, you should somehow
indicate that. So that you can track your errors, I highly recommend crossing out incorrect
solutions (or portions of solutions) instead of erasing them. Doing this will also allow you to look
back and determine how well you did as you were working through each chapter. It may also help
you determine how to spend your time as you study for exams.
This whole process will help you become better at evaluating your own work. This is important
because you should be confident in your answers, but only when they are correct. Grading yourself
will help you gain confidence when you are correct and help you quickly realize when you are not
correct so that you do not become confident about the wrong things. Another reason that grading
your solutions is important is so that when you go back to re-read any portion of the book, you
will know whether or not what you wrote was correct.
It is important that you read the solutions to the exercises after you attempt them, even if
you think your solution is correct. The solutions often provide further insight into the material
and should be regarded as part of any reading assignment given.
Make sure you read carefully. When you come upon an Evaluate exercise, do not mistake it
for an example. Doing so might lead you down the wrong path. Never consider the content of an
Evaluate exercise to be correct unless you have verified with the solution that it is really correct.
To be safe, when re-reading, always assume that the Evaluate exercises are incorrect, and never
use them as a model for your own problem solving. To help you, we have tried to differentiate
these from other example and exercise types by using a different font.
There is an expectation that you are able to solve every exercise on your own. (Note that I
am talking about the exercises embedded into the chapters, not the homework problems at the
end of each chapter.) If there are exercises that you are unable to complete, you need to get them
cleared up immediately. This might mean asking about them in class, going to see the professor
or a teaching assistant, and/or going to a help center/tutor. Whatever it takes, make sure you
have a clear understanding of how to solve all of them.
Every chapter ends with two sections called Reading Comprehension Questions and Problems.
The Problems sections are exactly what they sound like–a list of problems suitable for working
on in class or given as homework assignments.
All of the Reading Comprehension Questions should be attempted after you have finished
reading each section (including completing all of the exercises). They are sort of the final check of
your comprehension of the material before you move on to solving homework problems. Although
some of these questions are similar to the exercises in the sections, others are more conceptual
in nature. The majority of them are not meant to be difficult, but rather to test whether you
really understand the material from the section as a whole. These can be used as a starting point
for class discussion, so be sure to ask about those that you have trouble completing and/or are
unsure about.
Space is not given in the book for solutions to the Reading Comprehension Questions, so write
your answers on paper or use a Google Doc or other typesetting software to record your solutions.
(In my classes I have students share a Google Doc with me in which they place their answers to
these questions, adding the most recent answers at the top of the document to make it easier to
find their recent answers. When questions can’t easily be done in a Google Doc, they write their
solution on paper, scan or take a picture of it, and include the picture in their Google Doc.)
Solutions to the Reading Comprehension Questions are available in the back of the book.
As with the exercises throughout the book, it is highly recommended that you check your answers
and grade your own work, crossing out your solution when you were incorrect (instead of
erasing/deleting it) and replacing it with the correct solution.
Chapter 1: Motivation
The purpose of a discrete mathematics course in the computer science curriculum is to give
students a foundation in some of the mathematical concepts that are foundational to computer
science. By “foundational,” we mean both that the field of computer science was built upon (some
of) them and that they are used to varying degrees in the study of the more advanced topics in
computer science.
Computer science students sometimes complain about taking a discrete mathematics course.
They do not understand the relevance of the material to the rest of the computer science curriculum
or to their future career. This can lead to a lack of motivation. They also perceive the material
to be difficult.
To be honest, some of the topics are difficult. But the majority of the material is very
accessible to most students. One problem is that learning discrete mathematics takes effort, and
when something doesn’t sink in instantly, some students give up too quickly. The perceived
difficulty together with a lack of motivation can lead to a lack of effort, which almost always leads
to failure. Even when students expend effort to learn, they can let their perceptions get the
best of them. If someone believes something is hard or that they can’t do it, it often leads to
a self-fulfilling prophecy. This is perhaps human nature. On the other hand, if someone believes
that they can learn the material and solve the problems, chances are they will. The bottom line
is that a positive attitude can go a long way.
This book was written in order to ensure that the student has to expend effort while reading it.
The idea is that if you are allowed to simply read but not required to interact with the material,
you can easily read a chapter and get nothing out. For instance, your brain can go on ‘autopilot’
when something doesn’t sink in and you might get nothing out of the remainder of your time
reading. By requiring you to solve problems and answer questions as you read, your brain is
forced to stay engaged with the material. In addition, when you incorrectly solve a problem, you
know immediately, giving you a chance to figure out what the mistake was and correct it before
moving on to the next topic. When you correctly solve a problem, your confidence increases. We
strongly believe that this feature will go a long way to help you more quickly and thoroughly
learn the material, assuming you use the book as instructed.
What about the problem of relevance? In other words, what is the connection between discrete
mathematics and other computer science topics? There are several reasons that this connection
is unclear to students. First, we don’t always do a very good job of making the connection clear.
We teach a certain set of topics because it is the set of topics that has always been taught in such
a course. We don’t always think about the connection ourselves, and it is easy to forget that this
connection is incredibly important to students. Without it, students can suffer from a lack of
motivation to learn the material.
The second reason the connection is unclear is that one of the goals of such a course is
simply to help students to be able to think mathematically. As they continue in their education
and career, they will most certainly use some of the concepts they learn, yet they may be totally
unaware of the fact that some of their thoughts and ideas are based on what they learned in a
discrete mathematics course. Thus, although the students gain a benefit from the course, it is
essentially impossible to convince them of this during the course.
The third reason that the connection is unclear is that given the time constraints, it is impossible
to provide all of the foundational mathematics that is relevant to the advanced computer
science courses and make the connection to those advanced topics clear. Making these connections
would require in-depth discussions of the advanced topics. The connections are generally
made, either implicitly or explicitly, in the courses in which the material is needed.
This book attempts to address this problem by making connections to one set of advanced
topics–the design and analysis of algorithms. This is an ideal application of the discrete mathematics
topics since many of them are used in the design and analysis of algorithms. We also
do not have to go out of our way too far to provide the necessary background, as we would if
we attempted to make connections to topics such as networking, operating systems, architecture,
artificial intelligence, databases, or any number of other advanced topics. As already mentioned,
the necessary connections to those topics will be made when you take courses that focus on those
topics.
The goal of the rest of this chapter is to further motivate you to want to learn the topics that
will be presented in this book. We hope that after reading it you will be more motivated. For
some students, the topics are interesting enough on their own, whether or not they can be applied
elsewhere. For others, this is not the case. One way or another, you must find motivation to learn
this material.
Problem A: The following algorithm supposedly computes the sum of the first n integers. Does
it work properly? If it does not work, explain the problem and fix it.
sum1ToN(int n) {
    return n + sum1ToN(n - 1);
}
Problem B: The Mega Millions lottery involves picking five different numbers from 1 to 56, and
one number from 1 to 46. I purchased a ticket last week and was surprised when none of my
six numbers matched. Should I have been surprised? What are the chances that a randomly
selected ticket will match none of the numbers?
(a) How large of an array can I run the algorithm on in less than 24 hours?
(b) How large can n be if I can wait a year for the answer?
Some Problems 3
Problem E: I have an algorithm that takes two inputs, n and m. The algorithm treats n
differently when it is less than zero, between zero and 10, and greater than 10. It treats m
differently based on whether or not it is even. I want to write some test code to make sure
the algorithm works properly for all possible inputs. What pairs (n, m) should I test? Do
these tests guarantee correctness? Explain.
Problem F: Can the following code be simplified? If so, give equivalent code that is as simple
as possible.
if ((!(x.size() <= 0) && x.get(0) != 11) || x.size() > 0)
{
    if (!(x.get(0) == 11 && (x.size() > 13 || x.size() < 13))
        && (x.size() > 0 || x.size() == 13))
    {
        // do something
    }
}
Problem G: In how many ways may we write the number 19 as the sum of three positive integer
summands? Here order counts, so, for example, 1 + 17 + 1 is to be regarded different from
17 + 1 + 1.
Problem H: Consider the stoogeSort algorithm given here:
void stoogeSort(int[] A, int L, int R) {
    if (R <= L) { // Array has at most one element so it is sorted
        return;
    }
    if (A[R] < A[L]) {
        int temp = A[L]; // Swap first and last element
        A[L] = A[R];     // if they are out of order
        A[R] = temp;
    }
    if (R - L > 1) { // If the list has at least 3 elements
        int third = (R - L + 1) / 3;
        stoogeSort(A, L, R - third); // Sort first two-thirds
        stoogeSort(A, L + third, R); // Sort last two-thirds
        stoogeSort(A, L, R - third); // Sort first two-thirds again
    }
}
Problem I: A cryptosystem was recently proposed. One of the parameters of the cryptosystem
is a nonnegative integer n, the meaning of which is unimportant here. What is important
is that someone has proven that the system is insecure for a given n if there is more than
one integer m such that 2 · m ≤ n ≤ 2 · (m + 1).
(a) For what value(s) of n, if any, can you prove or disprove that there is more than one
integer m such that 2 · m ≤ n ≤ 2 · (m + 1)?
(b) Given your answer to (a), does this prove that the cryptosystem is either secure or
insecure? Explain.
Problem J: A certain algorithm takes a positive integer, n, as input. The first thing the algorithm
does is set n = n mod 5. It then uses the value of n to do further computations. One
friend claims that you can fully test the algorithm using just the inputs 1, 2, 3, 4, and 5.
Another friend claims that the inputs 29, 17, 38, 55, and 6 will work just as well. A third
friend responds with “then why not just use 50, 55, 60, 65, and 70? Those should work just
as well as your stupid lists.” A fourth friend claims that you need many more test cases to
be certain. A fifth friend says that you can never be certain no matter how many test cases
you use. Which friend or friends is correct? Explain.
Problem K: Write an algorithm to swap two integers without using any extra storage. (That
is, you can’t use any temporary variables.)
Problem L: You are at a party with some friends and one of them claims “I just did a quick
count, and it turns out that at this party, there are an odd number of people who have
shaken hands with an odd number of other people.” Can you prove or disprove that this
friend is correct?
So $f_2 = 1$, $f_3 = 2$, $f_4 = 3$, $f_5 = 5$, $f_6 = 8$, etc.
(a) One friend claims that the following algorithm is an elegant and efficient way to
compute $f_n$.
int Fibonacci(int n) {
    if (n <= 1) {
        return (n);
    } else {
        return (Fibonacci(n - 1) + Fibonacci(n - 2));
    }
}
Is he right? Explain.
(b) Another friend claims that he has an algorithm that computes $f_n$ in constant
time–that is, no matter how large n is, it always takes the same amount of time to
compute $f_n$. Is it possible that he has such an algorithm? Explain.
Problem N: You need to settle an argument between your boss (who can fire you) and your
professor (who can fail you). They are trying to decide who to invite to the Young Accountants
Volleyball League. They want to invite freshmen who are studying accounting and
are over 6 feet tall. They have a list of everyone they could potentially invite.
1. Your boss says they should make a list of all freshmen, a list of all accounting majors,
and a list of everyone over 6 feet tall. They should then combine the lists (removing
duplicates) and invite those on the combined list.
2. Your professor says they should make a list of everyone who is not a freshman, a list
of anyone who does not do accounting, and a list of everyone who is 6 feet tall or less.
They should make a fourth list that contains everyone who is on all three of the prior
lists. Finally, they should remove from the original list everyone on this fourth list,
and invite the remaining students.
• Two integers have the same parity if they are both even or both odd.
Example 2.2. Use the definition of even to prove that the sum of two even integers is even.
Proof: If x and y are even, then x = 2a and y = 2b for some integers a and b.
Then x + y = 2a + 2b = 2(a + b), which is even since a + b is an integer.
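The definitions used in this proof can also be tried out in code. The following is a minimal Java sketch, not part of the original text: the class name `Parity` and its method names are hypothetical, and it assumes the usual definitions (x is even if x = 2a for some integer a, and odd otherwise). It spot-checks the claim of Example 2.2 on a small range. Such a check only illustrates the claim; the proof above is what establishes it for all integers.

```java
class Parity {
    // x is even exactly when x = 2a for some integer a, i.e. when the remainder is 0.
    static boolean isEven(int x) {
        return x % 2 == 0;
    }

    // In Java, -3 % 2 evaluates to -1, so testing "x % 2 == 1" would miss
    // negative odd numbers. Testing "!= 0" handles both signs.
    static boolean isOdd(int x) {
        return x % 2 != 0;
    }

    // Two integers have the same parity if they are both even or both odd.
    static boolean sameParity(int x, int y) {
        return isEven(x) == isEven(y);
    }

    // Spot-checks Example 2.2 for every pair of even integers in [-bound, bound].
    static boolean evenSumsAreEven(int bound) {
        for (int x = -bound; x <= bound; x++) {
            for (int y = -bound; y <= bound; y++) {
                if (isEven(x) && isEven(y) && !isEven(x + y)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(evenSumsAreEven(100));
        System.out.println(sameParity(3, -5));
    }
}
```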
Example 2.3. Use the definitions of even and odd to prove that the sum of an even integer
and an odd integer is odd.
Note: The next example is the first of many Fill in the details exercises in which you need
to supply some of the details. After you have filled in the blanks, compare your answers with
the solutions. The answers are given with semicolons (;) separating the blanks.
8 Chapter 2
⋆Fill in the details 2.4. Use the definitions of even and odd to prove that the sum of two
odd integers is even.
Note: Did you notice the ⋆ in the heading of the previous example? This indicates that a
solution is provided. If you are reading the PDF file, clicking on the ⋆ will take you to the
solution. Clicking on the number in the solution will take you back.
If you are reading a hard copy, go to the back of the book to find the solutions.
Example 2.5. Use the definitions of even and odd to prove that the product of two odd
integers is odd.
⋆Fill in the details 2.6. Use the definitions of even and odd to prove that the product of
an even integer and an odd integer is even.
a · b = (2n)(2o + 1) = ________. Since ________ is an
integer, a · b is ________.
These examples may seem somewhat ridiculous since they are proving such obvious facts.
However, keep in mind that our goal is to learn techniques for writing proofs. As we proceed the
proofs will become more complicated, but we will continue to follow the same basic techniques
we are using here. In other words, the fact that we are proving facts about even and odd integers
is not at all important. What is important are the techniques we are learning in the process.
You may be asking yourself “why are we wasting our time proving such obvious results”? If
so, ask yourself this: Would you rather be asked to prove more complicated things right away?
Think about how you learned to read and write. You started by reading books that only
had a few simple words. As you progressed, the books and the words in them got longer. The
vocabulary increased. You encountered increasingly complex sentence and paragraph structures.
Direct Proofs 9
The same is true when you learned to write. You began by writing the letters of the alphabet.
Then you learned to write words, followed by sentences, paragraphs, and eventually essays.
Learning to read and write proofs follows the same procedure. In order to know how to write
correct proofs you first need to see some examples of them. But you need to go beyond just
seeing them–you need to understand them. That is the goal of examples like the previous one.
Your brain needs to be engaged with the material as you work through the book. You must work
through all of the examples in order to get the most out of this book.
Note: Next you will see the first of many Exercises. These give you an opportunity to solve
a problem from start to finish and then check your answer with the solution provided. It is
important that you try each of these on your own before looking at the solution. You will not
get as much out of the book if you skip these or jump straight to the answer without trying
them yourself.
⋆Exercise 2.7. Use the definition of even to prove that the product of two even integers is
even.
Proof:
Note: The next example is an Evaluate example. These examples give a problem and then
provide one or more solutions to the problem based on previous student solutions. Your job
is to evaluate each solution by finding any mistakes. Mistakes include not only incorrect
algebra and logic, but also unclear presentation, skipped steps, incorrect assumptions, oversimplification,
etc. When you come across these examples you should write down every error
you can find. Once you are pretty sure you know all of the problems (if there are any), compare
your evaluation to the one given in the solutions.
⋆Evaluate 2.8. Evaluate the following proof that supposedly uses the definition of odd to
prove that the product of two odd integers is odd.
Evaluation
Sometimes students get frustrated because they think that too many details are required in
a proof. Why are mathematicians such sticklers on the details? The next example is the first of
many that will try to demonstrate why the seemingly little details matter.
Note: The Question examples are similar to the Evaluate ones except that they ask a
specific question. Write down your answer in the space provided and then compare your
answer with the one in the solutions.
⋆Question 2.9. What is wrong with the following “proof” that the sum of an even and an
odd number is even?
Answer
Definition 2.10. Let b and a be integers with a ≠ 0. We say that b is divisible by a if there
exists an integer c such that b = ac. If b is divisible by a, we also say that b is a multiple of
a, a is a factor or divisor of b, and that a divides b, written as a|b. If a does not divide b,
we write a ∤ b.
Example 2.11. Since 6 = 2 · 3, 2|6, and 3|6. But 4 ∤ 6 since we cannot write 6 = 4 · c for any
integer c.
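Definition 2.10 translates almost directly into code. The following is a minimal Java sketch, not part of the original text; the class name `Divisibility` and the method `divides` are hypothetical names chosen for illustration. It relies on the fact that, for a ≠ 0, an integer c with b = ac exists exactly when the remainder `b % a` is 0.

```java
class Divisibility {
    // Returns true when a | b, that is, when b = a * c for some integer c.
    // Definition 2.10 requires a != 0, so a zero divisor is rejected.
    static boolean divides(int a, int b) {
        if (a == 0) {
            throw new IllegalArgumentException("a must be nonzero");
        }
        return b % a == 0;
    }

    public static void main(String[] args) {
        System.out.println(divides(2, 6)); // 6 = 2 * 3
        System.out.println(divides(3, 6)); // 6 = 3 * 2
        System.out.println(divides(4, 6)); // no integer c satisfies 6 = 4 * c
    }
}
```

The three calls in `main` mirror Example 2.11: the first two correspond to 2|6 and 3|6, and the third to 4 ∤ 6.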
Example 2.12. Prove that the product of two even integers is divisible by 4.
⋆Fill in the details 2.13. Prove that if x is an integer and 7 divides 3x + 2, then 7 also
divides $15x^2 - 11x - 14$.
________. Notice that
$15x^2 - 11x - 14$ = (________)(________)
= 7a(5x − 7).
Therefore ________.
Example 2.14. Let a and b be integers such that a|b and b|a. Prove that either a = b or
a = −b.
Proof: If a|b, we can write b = ac for some integer c. Similarly, if b|a, we can
write a = bd for some integer d. Then we can write b = ac = (bd)c. Dividing both
sides by b (which is legal, since b|a implies b ≠ 0), we can see that cd = 1. Since
c and d are integers, we know that either c = d = 1 or c = d = −1. In the first
case, we have that a = b, and in the second case, we have that a = −b.
Evaluation
Definition 2.16. A positive integer p > 1 is prime if its only positive factors are 1 and p.
A positive integer c > 1 which is not prime is said to be composite.
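Definition 2.16 can be checked mechanically by trial division. The following is a minimal Java sketch, not part of the original text; the class name `Primes` and the methods `isPrime` and `isComposite` are hypothetical names chosen for illustration. It uses the observation that if p = d · e with d ≤ e, then d · d ≤ p, so only divisors up to the square root of p need to be tried.

```java
class Primes {
    // Returns true when p > 1 and the only positive factors of p are 1 and p.
    static boolean isPrime(int p) {
        if (p <= 1) {
            return false; // the definition only applies to integers greater than 1
        }
        // d is a long so that d * d cannot overflow for large int values of p.
        for (long d = 2; d * d <= p; d++) {
            if (p % d == 0) {
                return false; // d is a factor other than 1 and p
            }
        }
        return true;
    }

    // Returns true when c > 1 and c is not prime.
    static boolean isComposite(int c) {
        return c > 1 && !isPrime(c);
    }

    public static void main(String[] args) {
        System.out.println(isPrime(7));
        System.out.println(isComposite(9));
        System.out.println(isPrime(1));
        System.out.println(isComposite(1));
    }
}
```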
⋆Evaluate 2.17. Prove or disprove that if a is a positive even integer, then it is composite.
Evaluation
Note: Notice that according to the definitions given above, 1 is neither prime nor composite.
This is one of the many things that makes 1 special.
Proof
⋆Question 2.19. Did you notice that the proof in the solution to the previous exercise (you
read it, right?) did not consider the case of 0 or negative even numbers? Was that O.K.?
Explain why or why not.
Answer
$$n! = 1 \cdot 2 \cdot 3 \cdots n \le n \cdot n \cdot n \cdots n = n^n.$$
Proof: Since n is composite, n = ab for some integers 1 < a < n−1 and
1 < b < n − 1. By definition of factorial, a|(n − 1)! and b|(n − 1)!. Therefore
n = ab divides (n − 1)!
Evaluation
Proof: If n is not a perfect square, then we can write n = ab for some integers
a and b with 1 < a < b < n − 1. Thus, (n − 1)! = 1 · · · a · · · b · · · (n − 1). Since a
and b are distinct numbers on the factor list, n = ab is clearly a factor of (n − 1)!.
If n is a perfect square, then $n = a^2$ for some integer 2 < a < n − 1. Since a > 2,
$2a < a^2 = n$. Thus, 2a < n, so (n − 1)! = 1 · · · a · · · 2a · · · (n − 1). Then a(2a) = 2n
is a factor of (n − 1)!, which means that n is as well.
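The statement being proved is not reproduced above; judging from the proof, it appears to be that every composite n > 4 divides (n − 1)!. Under that assumption, the following minimal Java sketch (not part of the original text, with the hypothetical class name `FactorialDivisibility` and hypothetical method names) checks the claim for small n. It computes (n − 1)! mod n by reducing after each multiplication, so no large factorial is ever stored. A check like this can catch a false claim, but it is not a substitute for the proof.

```java
class FactorialDivisibility {
    // Returns (n - 1)! mod n, assuming n >= 2. Reducing after every
    // multiplication keeps the running value below n, so the product
    // result * k always fits in a long when n is an int.
    static long factorialMod(int n) {
        long result = 1;
        for (long k = 2; k <= n - 1; k++) {
            result = (result * k) % n;
        }
        return result;
    }

    // Trial division: n is composite when some d >= 2 with d * d <= n divides it.
    static boolean isComposite(int n) {
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) {
                return true;
            }
        }
        return false;
    }

    // Checks that n divides (n - 1)! for every composite n with 4 < n <= limit.
    static boolean holdsUpTo(int limit) {
        for (int n = 5; n <= limit; n++) {
            if (isComposite(n) && factorialMod(n) != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holdsUpTo(2000));
        // n = 4 is composite, yet 3! = 6 leaves remainder 2 when divided by 4,
        // which is why the perfect-square case of the proof needs a > 2.
        System.out.println(factorialMod(4));
    }
}
```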
⋆Question 2.25. Why was it O.K. to assume 1 < a < b < n − 1 in the previous proof?
Answer
⋆Question 2.26. In the second part of the previous proof, why could we say that a > 2?
Answer
Example 2.27. Prove the Arithmetic Mean-Geometric Mean Inequality, which states that
for all non-negative real numbers x and y,
$$\sqrt{xy} \le \frac{x+y}{2}.$$
Proof: Since x and y are non-negative, $\sqrt{x}$ and $\sqrt{y}$ are real numbers, so $\sqrt{x} - \sqrt{y}$
is a real number. Since the square of any real number is greater than or equal to
0 we have
$$(\sqrt{x} - \sqrt{y})^2 \ge 0.$$
Expanding (recall the FOIL method?) we get
$$x - 2\sqrt{xy} + y \ge 0.$$
Adding $2\sqrt{xy}$ to both sides and dividing by 2, we get
$$\frac{x+y}{2} \ge \sqrt{xy},$$
yielding the result.
The previous example illustrates the creative part of writing proofs. The proof started out
considering $\sqrt{x} - \sqrt{y}$, which doesn’t seem to be related to what we wanted to prove. But hopefully
after you read the entire proof you see why it makes sense. If you are saying to yourself “I would
never have thought of starting with $\sqrt{x} - \sqrt{y}$,” or “How do you know where to start?,” I am
afraid there are no easy answers. Writing proofs is as much of an art as it is a science. There
are three things that can help, though. First, don’t be afraid to experiment. If you aren’t sure
where to begin, try starting at the end. Think about the end goal and work backwards until you
see a connection. Sometimes working both backward and forward can help. Try some algebra
and see where it gets you. But in the end, make sure your proof goes from beginning to end. In
other words, the order that you figured things out should not necessarily dictate the order they
appear in your proof.
The second thing you can do is to read example proofs. Although there is some creativity
necessary in proof writing, it is important to follow proper proof writing techniques. Although
there are often many ways to prove the same statement, there is often one technique that works
best for a given type of problem. As you read more proofs, you will begin to have a better
understanding of the various techniques used, know when a particular technique might be the
best choice, and become better at writing your own proofs. If you see several proofs of similar
problems, and the proofs look very similar, then when you prove a similar problem, your proof
should probably resemble those proofs. This is one area where some students struggle—they
submit proofs that look nothing like any of the examples they have seen, and they are often
incorrect. Perhaps it is because they are afraid that they are plagiarizing if they mimic another
proof too closely. However, mimicking a proof is not the same as plagiarizing a sentence. To be
clear, by ‘mimic’, I don’t mean just copy exactly what you see. I mean that you should read
and understand several examples. Once you understand the technique used in those examples,
you should be able to see how to use the same technique in your proof. For instance, in many of
the examples related to even numbers, you may have noticed that they start with a statement like
“Assume x is even. Then x = 2a for some integer a.” So if you need to write a proof related to
even numbers, what sort of statement might make sense to begin your proof?
The third thing that can help is practice. This applies not only to writing proofs, but to
learning many topics. An analogy might help here. Learning is often like sports—you don’t learn
how to play basketball (or insert your favorite sport, video game, or other hobby that takes some
skill) by reading books and/or watching people play it. Those things can be helpful (and in some
cases necessary), but you will never become a proficient basketball player unless you practice.
Practicing a sport involves running many drills to work on the fundamentals and then applying
the skills you learned to new situations. Learning many topics is exactly the same. First you need
to do lots of exercises to practice the fundamental skills. Then you can apply those skills to new
situations. When you can do that well, you know you have a good understanding of the topic. So
if you want to become better at writing proofs, you need to write more proofs.
⋆Question 2.28. What three things can help you learn to write proofs?
1.
2.
3.
Although not technically interchangeable, you may sometimes see the word statement instead
of proposition. Context should help you determine whether or not a given usage of the word
statement should be understood to mean proposition.
Definition 2.30. An implication is a proposition of the form “if p, then q,” where p and q
are propositions. p is called the premise and q is called the conclusion.
It is sometimes written as p → q, which is read “p implies q.” It is a statement that
asserts that if p is a true proposition then q is a true proposition.
An implication is true unless p is true and q is false.
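The last sentence of the definition pins down the truth value of p → q in all four cases. The following is a minimal Java sketch, not part of the original text; the class name `Implication` and the method `implies` are hypothetical names chosen for illustration. It encodes "true unless p is true and q is false" as `!p || q` and prints the resulting truth table.

```java
class Implication {
    // p -> q is false only when p is true and q is false,
    // which is exactly when (!p || q) is false.
    static boolean implies(boolean p, boolean q) {
        return !p || q;
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        // Print one line for each of the four combinations of p and q.
        for (boolean p : values) {
            for (boolean q : values) {
                System.out.println(p + " -> " + q + " is " + implies(p, q));
            }
        }
    }
}
```

Writing the implication as `!p || q` is one common encoding. Any expression that is false only for p true and q false would serve equally well.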
Example 2.31. The proposition “If I do well in this course, then I can take the next course”
is an implication. However, the proposition “I can do well in this course and take the next
course” is not an implication.
If you read xkcd and laugh, you are being consistent with the proposition. If you read xkcd
and do not laugh, then you are demonstrating that the proposition is false.
But what if you don’t read xkcd? Are you demonstrating that the proposition is true or
false? Does it matter whether or not you laugh? It turns out that you are not disproving
it in this case–in other words, the proposition is still true if you don’t read xkcd, whether
or not you laugh. Why? Because the statement is not saying anything about laughing by
itself. It is only asserting that IF you read xkcd, then you will laugh. In other words, it is a
conditional statement, with the condition being that you read xkcd. The statement is saying
nothing about anything if you don’t read xkcd.
So the bottom line is that if you do not read xkcd, the statement is still true.
a If you are unfamiliar with xkcd, go to http://xkcd.com.
⋆Question 2.33. When is the implication “If you read xkcd, then you will laugh” false?
Answer
Implication and Its Friends 17
⋆Exercise 2.34. Consider the implication “If you build it, they will come.” What are all
of the possible ways this proposition could be false?
Solution
Given an implication p → q, there are three related propositions. But first we need to discuss
the negation of a proposition.
Definition 2.35. Given a proposition p, the negation of p, written ¬p, is the proposition
“not p” or “it is not the case that p.”
Example 2.36. If p is the proposition “x ≤ y” then ¬p is the proposition “it is not the case
that x ≤ y,” or “x > y”.
Note: It is easy to incorrectly negate sentences, especially when they contain words like
“and”, “or”, “implies”, and “if.” This will become easier after we study logic in Chapter 4.
Definition 2.37. The contrapositive of a proposition of the form “if p, then q” is the
proposition “if q is not true, then p is not true” or “if not q, then not p” or ¬q → ¬p.
⋆Question 2.38. What is the contrapositive of the proposition “If you know Java, then
you know a programming language”?
Answer
Theorem 2.39. An implication is true if and only if its contrapositive is true. Stated another
way, an implication and its contrapositive are equivalent.
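One way to convince yourself of Theorem 2.39 is to line up the truth values of p → q and ¬q → ¬p side by side. The table below is a sketch built only from the rule in Definition 2.30 (an implication is false exactly when its premise is true and its conclusion is false). It is a check by enumeration, not a replacement for a written proof.

```latex
\[
\begin{array}{cc|cc|c|c}
p & q & \neg q & \neg p & p \to q & \neg q \to \neg p \\ \hline
T & T & F & F & T & T \\
T & F & T & F & F & F \\
F & T & F & T & T & T \\
F & F & T & T & T & T
\end{array}
\]
```

The last two columns agree in every row, which is what it means for the two propositions to be equivalent.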
false when ¬q is true and is false, and true otherwise. Notice that this is
Definition 2.41. The inverse of a proposition of the form “if p, then q” is the proposition
“if p is not true, then q is not true” or “if not p, then not q” or ¬p → ¬q.
⋆Question 2.42. What is the inverse of the proposition “If you know Java, then you know
a programming language”?
Answer
⋆Question 2.43. Are a proposition and its inverse equivalent? Explain, using the proposi-
tion from Question 2.42 as an example.
Answer
Definition 2.44. The converse of a proposition of the form “if p, then q” is the proposition
“if q, then p” or q → p.
⋆Question 2.45. What is the converse of the proposition “If you know Java, then you know
a programming language”?
Answer
⋆Question 2.46. Are a proposition and its converse equivalent? Explain using the propo-
sition about Java/programming languages.
Answer
As you have just seen, the inverse and converse of a proposition are not equivalent to the
proposition. However, it turns out that the inverse and converse of a proposition are equivalent
to each other. You will be asked to prove this in Problem 2.2. If you think about it in the right
way, it should be fairly easy to prove.
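If you would like to check these claims by brute force before writing a proof, the following Python sketch compares the four forms on every assignment of truth values. All of the function names here are hypothetical helpers invented for this illustration, and a truth-table check like this is not the written proof that Problem 2.2 asks for.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """False only when the premise is true and the conclusion is false."""
    return (not p) or q


def original(p, q):
    return implies(p, q)


def converse(p, q):
    return implies(q, p)


def inverse(p, q):
    return implies(not p, not q)


def contrapositive(p, q):
    return implies(not q, not p)


def equivalent(f, g) -> bool:
    """Two forms are equivalent if they agree on every truth assignment."""
    return all(f(p, q) == g(p, q) for p, q in product([True, False], repeat=2))


if __name__ == "__main__":
    print("inverse vs converse:        ", equivalent(inverse, converse))
    print("original vs contrapositive: ", equivalent(original, contrapositive))
    print("original vs converse:       ", equivalent(original, converse))
    print("original vs inverse:        ", equivalent(original, inverse))
```

The first two comparisons report agreement on every row, while the last two do not.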
2. Inverse If I do not get to watch “The Army of Darkness,” then I will not be happy.
4. Contrapositive If I am not happy, then I didn’t get to watch “The Army of Darkness.”
⋆Question 2.48. Using the propositions from the previous example, answer the following
questions.
(a) Give an explanation of why an implication might be true, but the inverse false.
Answer
(b) Explain why an implication is saying the exact same thing as its contrapositive. (Don’t
just say “By Theorem 2.39.”)
Answer
Implications can be tricky to fully grasp and it is easy to get your head turned around when
dealing with them. We will discuss them in quite a bit of detail throughout the next few sections
in order to help you understand them better. We will also revisit them in Chapter 4.
Proof: Assume that 5n + 2 is odd, but that n is even. Then n = 2k for some
integer k. This implies that 5n + 2 = 5(2k) + 2 = 10k + 2 = 2(5k + 1), which is
even. But this contradicts our assumption that 5n + 2 is odd. Therefore it must
be the case that n is odd.
The idea behind this proof is that if we are given the fact that 5n + 2 is odd, we are asserting
that n must be odd. How do we prove that n is odd? We could try a direct proof, but it
is actually easier to use a proof by contradiction in this case. The idea is to consider what
would happen if n is not odd. What we showed was that if n is not odd, then 5n + 2 has to
be even. But we know that 5n + 2 is odd because that was our initial assumption. How can
5n + 2 be both odd and even? It can’t. In other words, our proof led to a contradiction–an
impossibility. Therefore, something is wrong with the proof. But what? If n is indeed even,
our proof that 5n + 2 is even is correct. So there is only one possible problem–n must not be
even. The only alternative is that n is odd. Can you see how this proves the statement “if
5n + 2 is odd, then n is odd?”
Note: If you are somewhat confused at this point that’s probably O.K. Keep reading, and
re-read this section a few times if necessary. At some point you will have an “Aha” moment
and the idea of contradiction proofs will make sense.
Example 2.50. Prove that if n = ab, where a and b are positive integers, then either a ≤ √n
or b ≤ √n.
Proof: Let’s assume that n = ab but that the statement “either a ≤ √n or
b ≤ √n” is false. Then it must be the case that a > √n and b > √n. But then
ab > √n √n = n. But this contradicts the fact that ab = n. Since our assumption
that a > √n and b > √n led to a contradiction, it must be false. Therefore it
must be the case that either a ≤ √n or b ≤ √n.
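A computer cannot prove Example 2.50, but it can test it on small cases and show why the fact is useful. The sketch below is our own illustration (the function names are hypothetical). The first function checks the claim for every factorization of every n up to a limit, and the second uses the claim to stop a search for factors once the candidate exceeds √n.

```python
def check_example_2_50(limit: int) -> bool:
    """Return True if every factorization n = a * b with 1 <= n <= limit
    has a <= sqrt(n) or b <= sqrt(n)."""
    for n in range(1, limit + 1):
        for a in range(1, n + 1):
            if n % a != 0:
                continue
            b = n // a
            # For positive integers, a <= sqrt(n) is the same as a*a <= n,
            # so no floating-point square roots are needed.
            if not (a * a <= n or b * b <= n):
                return False
    return True


def smallest_factor_above_one(n: int) -> int:
    """Smallest factor of n that is greater than 1, for n >= 2.
    By Example 2.50 it suffices to try candidates d with d*d <= n."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no factor found up to sqrt(n), so n itself is prime


if __name__ == "__main__":
    print(check_example_2_50(500))
    print(smallest_factor_above_one(91))
```

Passing the check for n up to 500 says nothing about larger n. That is exactly why the general proof above is needed.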
Sometimes your proofs will not directly contradict an assumption made but instead will con-
tradict a statement that you otherwise know to be true. For instance, if you ever conclude that
0 > 1, that is a contradiction. The next example illustrates this.
Proof by Contradiction 21
⋆Fill in the details 2.51. Show, without using a calculator, that 6 − √35 < 1/10.
Proof: Assume that 6 − √35 ≥ 1/10. Then 6 − 1/10 ≥ ______ . If we multiply
Now that we have seen a few examples, let’s discuss contradiction proofs a little more formally.
Here is the basic concept of contradiction proofs: You want to prove that a statement p is true.
You “test the waters” by seeing what happens if p is not true. So you assume p is false and use
proper proof techniques to arrive at a contradiction. By “contradiction” I mean something that
cannot possibly be true. Since you proved something that is not true, and you used proper proof
techniques, then it must be that your assumption was incorrect. Therefore the negation of your
assumption—which is the original statement you wanted to prove—must be true.
⋆Evaluate 2.52. Use the definition of even and odd to prove that if a and b are integers
and ab is even, then at least one of a or b is even.
Evaluation
Proof 2: If true, either one is odd and the other even, or they are both
even, so we will show that the product of an even and an odd is even, and
that the product of two evens integers is even.
Let a = 2k and b = 2x + 1. (2k)(2x + 1) = 4kx + 2k = 2(2kx + k). 2kx + k is an
integer so 2(2kx + k) is even.
Let a = 2k and b = 2x. (2k)(2x) = 4kx = 2(2kx) since 2kx is an integer,
2(2kx) is even.
Thus, if a and b are integers, ab is even, at least one of a or b is even.
Evaluation
Proof 3: Let a and b be integers and assume that ab is even, but that
neither a nor b is even. Then both a and b are odd, so a = 2n + 1 and
b = 2m + 1 for some integers n and m. But then ab = (2n + 1)(2m + 1) =
4nm + 2n + 2m + 1 = 2(2nm + n + m) + 1, which is odd since 2nm + n + m is an
integer. This contradicts the fact that ab is even. Therefore either a or
b must be even.
Evaluation
For some students, the trickiest part of contradiction proofs is what to contradict. Sometimes
the contradiction is the fact that p is true. At other times you arrive at a statement that is clearly
false (e.g. 0 > 1). Generally speaking, you should just try a few things (e.g. do some algebra) and
see where it leads. With practice, this gets easier. In fact, with enough practice this will probably
become one of your favorite techniques. When a direct proof doesn’t seem to be working this is
usually the next technique I try.
Example 2.53. Let a_1, a_2, . . ., a_n be real numbers. Prove that at least one of these numbers
is greater than or equal to the average of the numbers.
Our next contradiction proof involves permutations. Here is the definition and an example in
case you haven’t seen these before.
Definition 2.54. A permutation is a function from a finite set to itself that reorders the
elements of the set.
Note: We will discuss both functions and sets more formally later. For now, just think of
a set as a collection of objects of some sort and a function as a black box that produces an
output when given an input.
Example 2.55. Let S be the set {a, b, c}. Then (a, b, c), (b, c, a) and (a, c, b) are permutations
of S. (a, a, c) is not a permutation of S because it repeats a and does not contain b. (b, d, a)
is not a permutation of S because d is not in S, and c is missing.
⋆Exercise 2.56. List all of the permutations of the set {1, 2, 3}. (Hint: There are 6.)
Answer
Note: In many contexts, when a list of objects occurs in curly braces, the order they are
listed does not matter (e.g. {a, b, c} and {b, c, a} mean the same thing). On the other hand,
when a list occurs in parentheses, the order does matter (e.g. (a, b, c) and (b, c, a) do not
mean the same thing).
Proof: Assume that the product is odd. Then all of the differences a_k − k
must be odd. Now consider the sum S = (a_1 − 1) + (a_2 − 2) + · · · + (a_n − n).
Since the a_k’s are just a reordering of 1, 2, . . . , n, S = 0. But S is the sum of
an odd number of odd integers, so it must be odd. Since 0 is not odd, we have a
contradiction. Thus our initial assumption that all of the a_k − k are odd is wrong,
so at least one of them is even and hence the product is even.
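The statement being proved is not reproduced above, so the following sketch rests on an assumption: that a_1, . . . , a_n is a reordering of 1, 2, . . . , n with n odd, and that the product in question is (a_1 − 1)(a_2 − 2) · · · (a_n − n). Under that reading, this Python snippet (with function names of our own choosing) tries every permutation for a few small n. It also shows that the claim can fail when n is even, which is why the proof needs an odd number of terms.

```python
from itertools import permutations
from math import prod


def product_of_differences(perm) -> int:
    """Compute (a_1 - 1)(a_2 - 2)...(a_n - n) for a permutation of 1..n."""
    return prod(a - k for k, a in enumerate(perm, start=1))


def sum_of_differences(perm) -> int:
    """Compute S = (a_1 - 1) + (a_2 - 2) + ... + (a_n - n)."""
    return sum(a - k for k, a in enumerate(perm, start=1))


def all_products_even(n: int) -> bool:
    """True if the product is even for every permutation of 1..n."""
    return all(product_of_differences(p) % 2 == 0
               for p in permutations(range(1, n + 1)))


if __name__ == "__main__":
    for n in (1, 3, 5, 7):
        print(n, all_products_even(n))
    # With n even the claim can fail: (2, 1) gives (2 - 1)(1 - 2) = -1.
    print(2, all_products_even(2))
```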
⋆Question 2.58. Why did the previous proof begin by assuming that the product was odd?
Answer
⋆Question 2.59. In the previous proof, we asserted that S = 0. Why was this the case?
Answer
We will use facts about rational/irrational numbers to demonstrate some of the proof tech-
niques. In case you have forgotten, here are the definitions.
• A rational number is one that can be written as p/q, where p and q are integers, with
q ≠ 0.
Example 2.61. Prove that √2 is irrational. We present two slightly different proofs. In
both, we will use the fact that any positive integer greater than 1 can be factored uniquely
as the product of primes (up to the order of the factors).
Proof 1: Assume that √2 = a/b, where a and b are positive integers with b ≠ 0. We can
assume a and b have no factors in common (since if they did, we could cancel them
and use the resulting numerator and denominator as a and b). Multiplying by b and
squaring both sides yields 2b^2 = a^2. Clearly 2 must be a factor of a^2. Since 2 is prime,
a must have 2 as a factor, and therefore a^2 has 2^2 as a factor. Then 2b^2 must also have
2^2 as a factor. But this implies that 2 is a factor of b^2, and therefore a factor of b. This
contradicts the fact that a and b have no factors in common. Therefore √2 must be
irrational.
Proof 2: Assume that √2 = a/b, where a and b are positive integers with b ≠ 0. Multiplying
by b and squaring both sides yields 2b^2 = a^2. Now both a^2 and b^2 have an even
number of prime factors. So 2b^2 has an odd number of primes in its factorization and
a^2 has an even number of primes in its factorization. This is a contradiction since they
are the same number. Therefore √2 must be irrational.
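Both proofs hinge on the equation 2b^2 = a^2 having no solution in positive integers. The following Python sketch (the function name and the search bound are our own assumptions) looks for a solution with small b. Failing to find one is of course not a proof, since a search can only ever cover finitely many cases, but it is a useful sanity check. Searching with 4 in place of 2 shows that the search does find a fraction when one exists.

```python
from math import isqrt


def find_fraction(k: int, max_b: int):
    """Search for positive integers a, b with b <= max_b and a*a == k*b*b,
    that is, a/b = sqrt(k). Return the first pair found, or None."""
    for b in range(1, max_b + 1):
        target = k * b * b
        a = isqrt(target)  # exact integer square root, no rounding error
        if a * a == target:
            return (a, b)
    return None


if __name__ == "__main__":
    print(find_fraction(2, 10_000))  # no pair is found in this range
    print(find_fraction(4, 10))      # sqrt(4) = 2/1, so a pair is found
```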
⋆Question 2.62. In proof 2 from the previous example, why do a^2 and b^2 have an even
number of factors?
Answer
Now that you have seen a few more examples, it is time to begin the discussion about how/why
contradiction proofs work. We will start with the following idea that you may not have thought
of before. It turns out that if you start with a false assumption, then you can prove that anything
is true. It may not be obvious how (e.g. How would you prove that all elephants are less than
1 foot tall given that 1 + 1 = 1?), but in theory it is possible. This is because statements of the
form “p implies q” are true if p (called the premise) is false, regardless of whether or not q (called
the conclusion) is true or false.
Example 2.63. The statement “If chairs and tables are the same thing, then the moon is
made of cheese” is true. This may seem weird, but it is correct. Since chairs and tables are
not the same thing, the premise is false so the statement is true. But it is important to realize
that the fact that the whole statement is true doesn’t tell us anything about whether or not
the moon is made of cheese. All we know is that if chairs and tables were the same thing,
then the moon would have to be made out of cheese in order for the statement to be true.
Example 2.64. Consider what happens if your parents tell you “If you clean your room,
then we will take you to get ice cream.” If you don’t clean your room and your parents don’t
take you for ice cream, did your parents tell a lie? No. What if they do take you for ice
cream? They still haven’t lied because they didn’t say they wouldn’t take you if you didn’t
clean your room. In other words, if the premise is false, the whole statement is true regardless
of whether or not the conclusion is true.
It is important to understand that when we say that a statement of the form “p implies q” is
true, we are not saying that q is true. We are only saying that if p is true, then q has to be true.
We aren’t saying anything about q by itself. So, if we know that “p implies q” is true, and we
also know that p is true, then q must also be true. This is a rule called modus ponens, and it is
at the heart of contradiction proofs as we will see shortly.
⋆Exercise 2.65. It might help to think of statements of the form “p implies q” as rules
where breaking them is equivalent to the statement being false. For instance, consider the
statement “If you drink alcohol, you must be 21.” If we let p be the statement “you drink
alcohol” and q be the statement “you are 21,” the original statement is equivalent to “p
implies q”.
1. If you drink alcohol and you are 21, did you break the rule?
2. If you drink alcohol and you are not 21, did you break the rule?
3. If you do not drink alcohol and you are 21, did you break the rule?
4. If you do not drink alcohol and you are not 21, did you break the rule?
5. Generalize the idea. If you have a statement of the form “p implies q”, where p and q
can be either true or false statements, exactly when can the statement be false?
6. If you do not drink alcohol, does it matter how old you are?
Now we are ready to explain the idea behind contradiction proofs. We want to prove some
statement p is true. We begin by assuming it is false—that is, we assume ¬p is true. We use this
fact to prove that q—some false statement—is true. In other words, we prove that the statement
“¬p implies q” is true, where q is some false statement. But if ¬p is true, and “¬p implies q” is
true, modus ponens tells us that q has to be true. Since we know that q is false, something is
wrong. We only have two choices: either ¬p is false or “¬p implies q” is false. If we used proper
proof techniques to establish that “¬p implies q” is true, then that isn’t the problem. Therefore,
the only other possibility is that ¬p is false, implying that p must be true. That is how/why
contradiction proofs work.
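The argument above can be compressed into a short chain of steps. The layout below is only a sketch of one way to organize it. It leans on Theorem 2.39 (an implication is equivalent to its contrapositive) rather than on the "two choices" phrasing used in the paragraph, and the step labels are our own.

```latex
\begin{align*}
&1.\ \neg p \to q            && \text{(what the body of the proof establishes)}\\
&2.\ \neg q                  && \text{($q$ is a statement known to be false)}\\
&3.\ \neg q \to \neg(\neg p) && \text{(contrapositive of step 1, Theorem 2.39)}\\
&4.\ \neg(\neg p)            && \text{(modus ponens applied to steps 3 and 2)}\\
&5.\ p                       && \text{(``it is not the case that not $p$'' means $p$)}
\end{align*}
```

Notice that step 1 is the only place where real work happens. Everything else is the same for every contradiction proof.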
Let’s analyze the second proof from Example 2.61 in light of this discussion. The only
assumption we made was that √2 is rational (¬p = “√2 is rational”). From this (and only this), we
were able to show that a^2 has both an even and an odd number of factors (q = “a^2 has an even
and an odd number of factors”, and we showed that “¬p implies q” is true). Thus, we know for
certain that if √2 is rational, then a^2 has an even and an odd number of factors.¹ This fact is
indisputable since we proved it. If it is also true that √2 is rational, modus ponens implies that
a^2 has an even and an odd number of factors. This is also indisputable. But we know that a^2
cannot have both an even and odd number of factors. In other words, we have a contradiction.
Therefore, something that has been said somewhere is wrong. Everything we said is indisputable
except for one thing–that √2 is rational. That was never something we proved; we just assumed
it. So it has to be the case that this is false, which means that √2 must be irrational.
To summarize, if you want to prove that a statement is true using a contradiction proof, assume
the statement is false, use this assumption to get a contradiction (i.e. prove a false statement),
and declare that it must therefore be true.
Notice that what q is doesn’t matter. In other words, given the assumption ¬p, the goal in
a contradiction proof is to establish that any false statement is true. This is both a blessing and
a curse. The blessing is that any contradiction will do. The curse is that we don’t have a clear
goal in mind, so it can sometimes be difficult to know what to do. As mentioned previously, this
becomes easier as you read and write more proofs.
If this discussion has been a bit confusing, try re-reading it. The better you understand the
theory behind contradiction proofs, the better you will be at writing them. We will revisit some
of these concepts in the chapter on logic, so the more you understand from here, the better off
you will be when you get there. O.K., enough theory. Let’s see some more examples!
¹ We did not prove that a^2 has an even and an odd number of factors. We proved that if √2 is rational, then a^2
has an even and an odd number of factors. It is very important that you understand the difference between these
two statements.
⋆Fill in the details 2.66. Let a, b be real numbers. Prove that if a < b + ε for all ε > 0,
then a ≤ b.
a < b + (a − b)/2 = ______ .
a . Therefore, .
a
Hint: What assumption do we always make when doing a contradiction proof?
b
Same as the previous blank
The following beautiful proof goes back to Euclid. It uses the assumption that any integer
greater than 1 is either a prime or a product of primes.
Example 2.67 (Euclid). Show that there are infinitely many prime numbers.
Proof: Assume that there are only a finite number of primes and label the
primes {p1 , p2 , . . . , pn }. Consider the number
N = p1 p2 · · · pn + 1.
This is a positive integer that is clearly greater than 1. Observe that none of the
primes on the list {p1 , p2 , . . . , pn } divides N , since division by any of these primes
leaves a remainder of 1. Since N is larger than any of the primes on this list, it
is either a prime or divisible by a prime outside this list. But we assumed the
list above contained all of the prime numbers. This is a contradiction. Therefore
there must be infinitely many primes.
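It is worth seeing the construction in Euclid's proof on a concrete list. The Python sketch below (the function names are ours, chosen for illustration) builds N from a list of primes and finds the smallest prime factor of N. Note that N does not have to be prime itself. The proof only needs N to have some prime factor that is missing from the list.

```python
def euclid_number(primes) -> int:
    """Return N = p1 * p2 * ... * pn + 1 for the given list of primes."""
    n = 1
    for p in primes:
        n *= p
    return n + 1


def smallest_prime_factor(n: int) -> int:
    """Smallest prime factor of n (n >= 2), found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n is prime


if __name__ == "__main__":
    primes = [2, 3, 5, 7, 11, 13]
    N = euclid_number(primes)
    print(N)                                # 30031
    print([N % p for p in primes])          # every remainder is 1
    print(smallest_prime_factor(N))         # 59, which is not on the list
```

Here N = 30031 = 59 · 509 is composite, yet both of its prime factors lie outside the original list, just as the proof predicts.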
⋆Fill in the details 2.68. If a, b, c are odd integers, prove that ax^2 + bx + c = 0 does not
have a rational number solution.
Proof: Suppose p/q is a rational solution to the equation. We may assume that
p and q have no prime factors in common, so either p and q are both odd, or one
is odd and the other even. Since p/q is a solution, we know that
= 0.
Proof by Contraposition 29
Proof: We will instead prove that if n is even (not odd), then 5n + 2 is even
(not odd). Since this is the contrapositive of the original statement, a proof of
this will prove that the original statement is true.
Assume n is even. Then n = 2a for some integer a. Then 5n + 2 = 5(2a) + 2 =
2(5a + 1). Since 5a + 1 is an integer, 2(5a + 1) is even.
Be careful with proof by contraposition. Do not make the mistake of trying to prove the
converse or inverse instead of the contrapositive. In that case, you may write a correct proof, but
it would be a proof of the wrong thing.
In the next example we will see the similarities and differences between contradiction proofs
and proofs by contraposition.
⋆Evaluate 2.72. Let n be an integer. Use the definition of even/odd to prove that if 3n + 2
is even, then n is even using a proof by contraposition.
Evaluation
Evaluation
Evaluation
Other Proof Techniques 31
Definition 2.73. A trivial proof is a proof of a statement of the form “if p, then q” that
doesn’t use p in the proof.
(x + 1)^2 − 2x = (x^2 + 2x + 1) − 2x
= x^2 + 1
> x^2.
Notice that we never used the fact that x > 0 in the proof.
Example 2.76. Prove or disprove that the product of two irrational numbers is irrational.
Proof: We showed in Example 2.61 that √2 is irrational. But √2 ∗ √2 = 2,
which is an integer so it is clearly rational. Thus the product of two irrational
numbers is not always irrational.
Example 2.77. Prove or disprove that “Everybody Loves Raymond” (or that “Everybody
Hates Chris”).
Proof: Since I don’t really love Raymond (and I don’t hate Chris), the statement
is clearly false.
⋆Exercise 2.78. Prove or disprove that the sum of any two primes is also prime.
Proof
Definition 2.79. A proof by cases breaks up a statement into multiple cases and proves
each one separately.
We have already seen several examples of proof by cases (e.g. Examples 2.24 and 2.68), but
it never hurts to see another example.
⋆Fill in the details 2.81. Let s be a positive integer. Prove that the closed interval [s, 2s]
contains a power of 2.
If and Only If Proofs 33
⋆Question 2.82. Why is there a choice between proving q implies p and proving ¬p
implies ¬q when proving the backwards direction?
Answer
As we have mentioned before, the examples in this section are quite trivial and may seem
ridiculous–since they are so obvious, why are we bothering to prove them? The point is to use
the proof techniques we are learning. We will use the techniques on more complicated problems
later. For now we want the focus to be on proper use of the techniques. That is more easily
accomplished if you don’t have to think too hard about the details of the proof.
⋆Exercise 2.84. Prove that x is odd iff x + 20 is odd using direct proofs for both directions.
⋆Exercise 2.85. Prove that x is odd iff x + 20 is odd using a direct proof for the
forward direction and a proof by contraposition for the backward direction.
⋆Fill in the details 2.86. The two most common ways to prove p iff q are
⋆Evaluate 2.87. Use the definition of odd to prove that x is odd if and only if x − 4 is odd.
Evaluation
Evaluation
Common Errors in Proofs 35
Example 2.88. Is the following proof that 16/64 = 1/4 correct? Why or why not?
Proof: This is true because if I cancel the 6 on the top and the
bottom, I get 16/64 = 1/4.
Evaluation: You probably know that you can’t cancel arbitrary digits in a fraction,
so this is not a valid proof. In addition, consider this: If this proof is correct,
then it could be used to prove that 16/61 = 1/1 = 1 (again by canceling the 6s), which is clearly false.
Note: The point of the previous example is this: Don’t confuse the fact that what you are
trying to prove is true with whether or not your proof actually proves that it is true. An
incorrect proof of a correct statement is no proof at all.
One rookie mistake that I see often is proof by example, where the writer attempts to prove
something in general by proving it for one particular case and assuming it must therefore work
for all of the other cases.
⋆Question 2.89. What is wrong with this ‘proof’ that the sum of two even integers is even?
Answer
Just because a proof seems to work out, it does not mean that it is a proof of the correct
statement. For instance, the proof in Question 2.89 is a correct proof of the fact that the sum of
4 and 6 is even. But it is certainly not a proof that the sum of any two even numbers is even.
Let’s see an example of a supposed proof of something that is not even true. Hopefully I do
not need to convince you that the proof cannot be valid (since the statement is false).
Example 2.90. What is wrong with this ‘proof’ that one more than an even number is
divisible by 3?
Evaluation: This only shows that one more than 14 is divisible by 3. Notice
Hopefully this example helps you see the problem with proof by example. If the technique
worked, then the proof in the previous example is a valid proof of the false statement that one
more than an even number is divisible by 3. But since that statement is false, it can’t have been
a valid proof. Indeed, as we already mentioned, the proof does show that the statement is true
for the given even number (in this case, 14), but that does not imply anything about the validity
of the statement for any other even numbers.
If you want to prove something for a general collection of numbers (e.g. even numbers, integers,
etc.), then your proof has to be general enough to include all possible values. For instance, if
you want to prove something about odd numbers, then you let x = 2k + 1 where k is an integer.
Notice that no matter which odd integer you want to consider, you can pick k to obtain that
value. Thus, if you prove something about the value x = 2k + 1, then you have proven it for all
odd values of x. However, if you show it is true for x = 7 (for instance), you have only shown
that it is true for x = 7.
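To see how little a single example tells you, it can help to let a computer try a few more. The following Python sketch (function names are hypothetical, chosen for this illustration) tests the false claim from Example 2.90 that one more than an even number is divisible by 3. The claim passes for 14, but a short search turns up a counterexample almost immediately.

```python
def claim_holds(even_number: int) -> bool:
    """The (false) claim: one more than an even number is divisible by 3."""
    return (even_number + 1) % 3 == 0


def first_counterexample(candidates):
    """Return the first candidate for which the claim fails, or None."""
    for n in candidates:
        if not claim_holds(n):
            return n
    return None


if __name__ == "__main__":
    print(claim_holds(14))                          # True: 15 is divisible by 3
    print(first_counterexample(range(2, 100, 2)))   # 4, since 5 is not divisible by 3
```

The search works in only one direction. Finding a counterexample disproves a general claim, but finding none (in a finite search) proves nothing.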
Another common mistake when writing proofs is to make one or more invalid assumptions
without realizing it. This is another case where you end up proving a different statement (usually
a more specific statement) than the one you set out to prove. The problem is that when you make
this sort of mistake, the proof can sometimes seem to “work” because you get the conclusion you
want. Thus, your proof might actually be a valid proof, but it is of the wrong statement. Thus,
it isn’t always obvious that you even made a mistake.
The next few examples should illustrate what can go wrong if you aren’t careful.
⋆Question 2.91. What is wrong with this ‘proof’ that the sum of two even integers is even?
Answer
Since the statement in the previous example is true, it can be difficult to appreciate why the
proof is wrong. The proof seems to prove the statement but as you saw in the solution, it actually
doesn’t. It proves a more specific statement (In this case, it is a proof of the fact that the sum of
an even number with itself is even when it was supposed to be a proof of the fact that the sum
of any two even numbers is even.).
If it seems like we are being too nit-picky, consider the next example which gives a supposed
proof that the sum of two even numbers is divisible by 4 (hopefully you can quickly convince
yourself that this is not a true statement).
⋆Question 2.92. What is wrong with the following ‘proof’ that the sum of two even integers
is divisible by 4?
Answer
Another common mistake students make when trying to prove an identity/equation is to start
with what they want to prove and work both sides of it until they demonstrate that they are
equal. I want to stress that this is an invalid proof technique. Again, if this seems like I am
making something out of nothing, consider this example:
Proof:
−1 = 1
(−1)^2 = 1^2
1 = 1
Therefore −1 = 1.
How do you know that this proof is incorrect? (Think about the obvious reason, not any
technical reason.)
Answer
Notice that each step of algebra in the previous proof is correct. For instance, if a = b, then
a^2 = b^2 is correct. And (−1)^2 and 1^2 are both equal to 1. So the majority of the proof contains
proper techniques. It contains just one problem: It starts by assuming something that isn’t true.
Unfortunately, one error is all it takes for a proof to be incorrect.
Note: When writing proofs, never assume something that you don’t already know to be true!
In particular, if you are trying to prove an equality, never start with the equality and work
both sides until you get the same thing. As demonstrated in the previous example, this is not
a valid proof technique.
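It may help to look at which direction the implications run in the flawed proof. The display below is only a sketch of that idea, written in our own notation.

```latex
\begin{align*}
\text{What the flawed proof shows:}\quad
  & (-1 = 1) \;\Longrightarrow\; \bigl((-1)^2 = 1^2\bigr) \;\Longrightarrow\; (1 = 1) \\
\text{What a proof would need:}\quad
  & (1 = 1) \;\Longrightarrow\; \cdots \;\Longrightarrow\; (-1 = 1)
\end{align*}
```

The first chain starts from the statement to be proved and ends at something true. As Example 2.63 showed, an implication with a false premise is true no matter what, so reaching a true statement tells us nothing about the premise. The second chain is what is actually required, and it cannot be obtained by reversing the first because the squaring step does not reverse: a = b implies a^2 = b^2, but a^2 = b^2 does not imply a = b.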
⋆Question 2.94. When you are given an equation to prove, should you prove it by writing
it down and working both sides until you get them both to be the same? Why or why not?
Answer
Let’s be clear about this issue. If you know an equation is correct, you can work both sides
of it until you get to some desired conclusion. However, if you have an equation and you do not
know whether or not it is correct, you cannot start your proof by considering that equation. As
Example 2.93 demonstrated, if an equation is not correct, sometimes you can work both sides
until they are the same, which gives the illusion that you have proven that it is correct, which is
clearly not possible. Hopefully this makes it clear to you that beginning a proof with an unknown
equation (e.g. the equation you are trying to prove) and using it in your proof is not valid.
⋆Question 2.95. You are given an equation. You work both sides of it until they are the
same. Should you now be convinced that the equation is correct? Why or why not?
Answer
Note: If you already know that an equation is true, then working both sides of it (for some
purpose other than demonstrating it is true) is a valid technique. However, it is more common
to start with a known equation and work just one side until it is what we want.
There are plenty of other common errors in proofs. We will see more examples of them
throughout the remainder of the book (although we will focus more on correct proof techniques!),
especially in the Evaluate examples. I want to say that you will likely see other examples of errors
in proofs as you write your own proofs, but that would be mean. Probably accurate, but still
mean.
More Practice 39
⋆Exercise 2.96. Let p < q be two consecutive odd primes (two primes with no other primes
between them). Prove that p + q is a composite number. Further, prove that it has at least
three, not necessarily distinct, prime factors. (Hint: think about the average of p and q.)
Proof:
⋆Evaluate 2.97. Prove or disprove that if x and y are rational, then x^y is rational.
Proof 1: Because x and y are both rational, assume x = a/b where a and b
are integers and b ≠ 0. We can assume that a and b have no factors
in common (since if they did we could cancel them and use the resulting
numbers as our new a and b). Then x^y = a^y/b^y, so x^y is rational.
Evaluation
Evaluation
Since none of the proofs in the previous example were correct, you need to prove it.
⋆Exercise 2.98. Prove or disprove that if x and y are rational, then x^y is rational.
Proof:
Evaluation
Evaluation
Evaluation
Evaluation
Evaluation
⋆Evaluate 2.100. Mersenne primes are primes that are of the form 2^p − 1, where p is prime.
Are all numbers of this form prime? Give a proof/counterexample.
Evaluation
Evaluation
⋆Exercise 2.101. Let p be prime. Prove that not all numbers of the form 2^p − 1 are prime.
Proof:
Reading Comprehension Questions 43
Note: It is recommended that you attempt to complete all of the questions before checking
your answers. As with the exercises throughout the book, it is also recommended that if you
get one wrong, attempt to solve it again before reading the answer/solution in detail. You will
learn a lot more that way!
Also, the solutions given are often just one possible answer (especially when answers in-
volve coming up with an example). If your answer is different, you should be able to determine
whether or not your answer is also correct. If you are not sure, please ask!
⋆Question 2.1. Because it was (perhaps incorrectly) assumed that you have heard the term
proof before, it was never formally defined in the chapter. Let’s make sure you don’t go any
further without having a good definition. So, what is a proof? Feel free to look up the definition
(online or in a dictionary) if you need to.
⋆Question 2.2. Let’s say someone correctly proves statement A. Does that mean A is a true
statement, that you are just pretty sure that it is true, or that it may or may not be true based
on whether or not you understand the argument being made in the proof? Explain your answer.
⋆Question 2.3. True or false: Every even number is not odd. Explain your answer.
⋆Question 2.4. If b is divisible by a, is it always the case that a is divisible by b? Explain using
an example.
⋆Question 2.7. Is it possible for both a proposition and its negation to be true? Explain.
⋆Question 2.8. If the proposition A implies B is true, does that mean the proposition B implies
A is true? Prove or give a counterexample.
⋆Question 2.9. True or false: If the inverse of an implication is true, then the implication is
also true. Explain your answer.
⋆Question 2.10. True or false: If the inverse of an implication is true, then the converse of the
implication is also true. Explain your answer.
⋆Question 2.11. What can you say about an implication and its contrapositive?
⋆Question 2.12. In your own words, explain the idea behind contradiction proofs. Include
specifics like how one goes about writing a proof by contradiction and why it is a valid proof
technique. (The goal of this question is to help you better understand the technique and to
convince you that it is indeed a valid technique, so put some thought into this one!)
⋆Question 2.13. This should be a lot easier than the last question: Explain why proof by
contraposition is a valid proof technique.
⋆Question 2.14. Explain the difference between a proof by contradiction and a proof by
contraposition, particularly as it applies to proving statements of the form p → q.
⋆Question 2.15. True or false: Every integer is a rational number. Explain your answer.
⋆Question 2.16. True or false: Every irrational number is not an integer. Explain your answer.
⋆Question 2.17. True or false: Every rational number is an integer. Explain your answer.
⋆Question 2.18. If you want to prove that A if and only if B is true (where A and B are
statements of some sort), can you just show that A implies B? If not, explain why that does not
work and what you would have to do instead (or in addition).
(a) Is showing that p implies q and ¬q implies ¬p a valid technique? Explain why or why not.
(b) Is showing that ¬q → ¬p and q → p a valid technique? Explain why or why not.
⋆Question 2.20. Proof by counterexample is a valid proof technique. Proof by example is not.
Explain the difference.
⋆Question 2.21. If I want to prove some equation, should I write down the equation and work
both sides until they are the same? Explain why this is or is not a valid proof technique.
2.10 Problems
Problem 2.1. Prove that a number and its square have the same parity. That is, the square of
an even number is even and the square of an odd number is odd.
Problem 2.2. Prove that the inverse of an implication is true if and only if the converse of the
implication is true.
Problem 2.3. Let a and b be integers. Consider the problem of proving that if at least one of
a or b is even, then ab is even. Is this equivalent to the statement from Evaluate 2.52? Explain,
using the appropriate terminology from this chapter.
Problem 2.4. Let a and b be integers. Consider the statement “If ab is even, then at least one
of a or b is even.” Rephrase this statement using the word odd instead of even (but you cannot
use the phrase not odd). Using terminology from this chapter, how did you come up with the
alternative phrasing?
Problem 2.5. Prove or disprove that there are 100 consecutive positive integers that are not
perfect squares. (Recall: a number is a perfect square if it can be written as a^2 for some integer
a.)
(a) Are there any integers n and m that satisfy this equation? Prove it.
(b) Are there any positive integers n and m that satisfy this equation? Prove it.
Problem 2.7. Consider the equation a^3 + b^3 = c^3 over the integers (that is, a, b, and c have to
all be integers).
(b) If we restrict a, b, and c to the positive integers, are there infinitely many solutions? Are
there any? Justify your answer. (Hint: Do a web search for “Fermat’s Last Theorem.”)
(b) Is it possible to prove that n is odd iff 3n + 4 is odd? If so, prove it. If not, explain why not
(i.e. give a counterexample).
(c) If we don’t assume n has to be an integer, is it possible to prove that n is odd iff 3n + 4 is
odd? If so, prove it. If not, explain why not (i.e. give a counterexample).
Problem 2.9. Prove that if n is an integer and 5n + 4 is even, then n is even using a
(b) Is it possible to prove that n is odd iff 4n + 3 is odd? If so, prove it. If not, explain why not
(i.e. give a counterexample).
Problem 2.13. Prove that ab is odd iff a and b are both odd.
Problem 2.15. Let n be an odd integer. For what values of k do n and n^k have the same parity?
Prove your claim.
Problem 2.16. Let n be an even integer. For what values of k do n and n^k have the same
parity? Prove your claim.
Problem 2.17. Prove or disprove: Every positive integer can be written as the sum of the squares
of two integers.
Problem 2.18. Prove that the product of two rational numbers is rational.
Problem 2.19. Prove that the product of a non-zero rational number and an irrational number
is irrational.
Problem 2.22. Prove or disprove that n^2 − 1 is composite whenever n is a positive integer greater
than or equal to 1.
Problem 2.23. Prove or disprove that n^2 − 1 is composite whenever n is a positive integer greater
than or equal to 3.
2
A successful solution to this will earn you an A in the course. You are free to use Google or whatever other
resources you want for this problem, but you must fully understand the solution you submit.
Chapter 3: Programming Fundamentals and
Algorithms
The purpose of this chapter is to review some of the programming concepts you should have picked
up in previous classes while introducing you to some basic algorithms and new terminology that
we will find useful as we continue our study of discrete mathematics. We will also practice our
skills at proving things by sometimes proving that an algorithm does as specified.
Algorithms are presented in a syntax similar to Java and C++. This can be helpful since you
should already be familiar with it. On the other hand, this sort of syntax ties our hands more
than one often likes when discussing algorithms. What I mean is that when discussing algorithms,
we often want to gloss over some of the implementation details. For instance, we may not care
about data types, or how parameters are passed (i.e. by value or by reference), but by using a
Java-like syntax we are forcing ourselves to use particular data types and pass parameters in a
certain way.
Consider an algorithm that swaps two values (we will see an implementation of this shortly).
The concept is the same regardless of what type of data is being swapped. But given our choice
of syntax, we will give an implementation that assumes a particular data type. Most of the time
the algorithms presented can be modified to work with other data types.
The issue of pass-by-value versus pass-by-reference is more complicated. We will have a brief
discussion of this later, but the bottom line is that whenever you implement an algorithm from
any source, you need to consider how this and other language-specific features might change how
you understand the algorithm, how you implement it, and/or whether you even can.
3.1 Algorithms
An algorithm is a set of instructions that accomplishes a task in a finite amount of time.
Example 3.1 (Area of a Trapezoid). Write an algorithm that gives the area of a trapezoid
with height h and bases a and b.
Note: Notice that we use the return keyword to indicate what value should be passed to
whoever calls an algorithm. For instance, if someone calls x=AreaTrapezoid(a, b, h), then
x will be assigned the value h ∗ (a + b)/2 since this is what was returned by the algorithm.
Those who know Java, C, C++, or just about any other programming language should already
be familiar with this concept.
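For readers who want something to run, here is a minimal Java sketch of an area-of-a-trapezoid algorithm using return. The class name, the lowercase method name areaTrapezoid, and the choice of double for every value are my assumptions for illustration, not necessarily what the original example used.

```java
final class TrapezoidDemo {
    // Area of a trapezoid with bases a and b and height h.
    static double areaTrapezoid(double a, double b, double h) {
        return h * (a + b) / 2;
    }

    public static void main(String[] args) {
        // The returned value is what gets assigned to x.
        double x = areaTrapezoid(3.0, 5.0, 2.0);
        System.out.println(x); // prints 8.0
    }
}
```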
⋆Exercise 3.2. Write an algorithm that returns the area of a square with sides of length s.
double areaSquare(double s) {
Definition 3.3. The assignment operator, =, assigns to the left-hand argument the value
of the right-hand argument.
Example 3.4. The statement x = a + b means “assign to x the value of a plus the value of
b.”
Note: Most modern programming languages use = for assignment. Other symbols used
include :=, =:, <<, ←, etc.
As it turns out, the most common symbol for assignment (=) is perhaps the most confusing
for someone who is first learning to program. One of the most common assignment statements
is x = x + 1;. What this means is “assign to x its current value plus one.” However,
what it looks like is the mathematical statement “x is equal to x + 1”, which is false for every
value of x. If this has tripped you up in the recent past or still does, fear not. Eventually you
will instinctively interpret it correctly, probably forgetting you were ever confused by it.
Example 3.5 (Swapping variables). Write an algorithm that will interchange the values of
two variables, x and y. That is, the contents of x becomes that of y and vice-versa.
It can be very useful to be able to prove that an algorithm actually does what we think it
does. Then when an error is found in a program we can focus our attention on the portions of
the code that we are uncertain about, ignoring the code that we know is correct.
Example 3.6. Prove that the algorithm in Example 3.5 works correctly.
Proof: Assume the values a and b are passed into swap. Then at the beginning
of the algorithm, x = a and y = b. We need to prove that after the algorithm is
finished, x = b and y = a.
After the first line, x and y are unchanged and t = a since it was assigned the
value stored in x, which is a. After the second line, x = b since it is assigned the
value stored in y, which is b. Currently x = b, y = b, and t = a. Finally, after the
third line, y = a since it is assigned the value stored in t, which is a. Thus x = b and
y = a, as required.
Note: The correctness of this algorithm (and a few others in this chapter) is based on the
assumption that the variables are passed by reference rather than passed by value.
In C and C++, it is possible to pass by value or by reference, although we didn’t use the
proper syntax to do so. The * or & you sometimes see in argument lists is related to this.
In Java, everything is passed by value and it is impossible to pass by reference. However,
because non-primitive variables in Java are essentially reference variables (that is, they store
a reference to an object, not the object itself ), in some ways they act as if they were passed by
reference. This is where things start to get complicated. I don’t want to get into the subtleties
here, especially since there are arguments about whether or not these are the best terms to
use. Let me give an analogy instead. a
If I share a Google Doc with you, I am passing by reference. We both have a reference
to the same document. If you change the document, I will see the changes. If I change the
document, you will see the changes. On the other hand, if I e-mail you a Word document, I am
passing by value. You have a copy of the document I have. Essentially, I copied the current
value (or contents) of the document and gave that to you. If you change the document,
my document will remain unchanged. If I change my document, your document will remain
unchanged. This sounds pretty simple. However, it gets more complicated. In Java, you
can create a “primitive” Word document, but in a sense you can’t create an “object” Word
document. Instead, a Google Doc is created and you are given access (i.e. a reference) to it.
This is why in some ways primitive and object variables seem to act differently in Java.
I’ve already said too much. You will/did learn a lot more about this issue in another
course. Here is the bottom line: The assumption being made in the various swap algorithms
is that when a variable is passed in, the algorithm has direct access to that variable and not
just a copy of it. Thus if changes are made to that variable in the algorithm, it is changing
the variable that was passed in. This subtlety does not matter for most of the algorithms here.
a Inspired by a response on http://stackoverflow.com/questions/373419/.
Note: We should be absolutely clear that it is impossible to implement the swap method
from Example 3.5 in Java. In fact, there is no way to implement a method that swaps two
arbitrary values in Java. As we will see shortly, it is possible to implement a method that
swaps two elements from an array.
Note: One final note before we move on. Whether or not the swap method (or any method)
can be implemented, we can still use it in other algorithms as if it can. This is because when
discussing algorithms we are usually more concerned about the idea behind the algorithm,
not all of the implementation details. Using a method like swap in another algorithm often
makes it easier to understand the overall concept of that algorithm. If we actually wanted to
implement an algorithm that uses swap, we would simply need to replace the call to swap with
some sort of code that accomplishes the task if swap is impossible to implement.
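To see the pass-by-value issue concretely, here is a small Java sketch. The names naiveSwap and swapInArray are hypothetical, chosen only for this illustration. The first method exchanges its own copies of the values, so the caller sees no change. The second receives a copy of a reference to an array, so changes to the array's elements are visible to the caller.

```java
final class SwapDemo {
    // Receives copies of the caller's values; the exchange is lost on return.
    static void naiveSwap(int x, int y) {
        int t = x;
        x = y;
        y = t;
    }

    // Receives a copy of the reference to the array, so writes to its
    // elements change the caller's array.
    static void swapInArray(int[] pair) {
        int t = pair[0];
        pair[0] = pair[1];
        pair[1] = t;
    }

    public static void main(String[] args) {
        int a = 1, b = 2;
        naiveSwap(a, b);
        System.out.println(a + " " + b); // prints 1 2

        int[] pair = {1, 2};
        swapInArray(pair);
        System.out.println(pair[0] + " " + pair[1]); // prints 2 1
    }
}
```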
⋆Question 3.7. Does the following swap algorithm work properly? Why or why not?
void swap(Object x, Object y) {
    x = y;
    y = x;
}
Answer
Example 3.8. Write an algorithm that will interchange the values of two variables x and y
without introducing a third variable, assuming they are of some numeric type.
Solution: The idea is to use sums and differences to store the values. Assume
that initially x = a and y = b.
void swap(number x, number y) {
    x = x + y; // x = a + b and y = b
    y = x - y; // x = a + b and y = a + b - b = a
    x = x - y; // x = a + b - a = b and y = a
}
Notice that the comments in the code sort of provide a proof that the algorithm
is correct, although keep reading for an important disclaimer.
Example 3.9. It was mentioned that the comments in the algorithm from Example 3.8
provide a proof of its correctness. What possibly faulty assumption is being made?
Problem 3.23 explores whether or not the algorithm in Example 3.8 works for integer types–
specifically 2’s complement numbers.
The mod operator and Integer Division
Definition 3.10. The mod operator is defined as follows: for integers a and n such that
a ≥ 0 and n > 0, a mod n is the integral non-negative remainder when a is divided by n.
Observe that this remainder is one of the n numbers
0, 1, 2, ..., n − 1.
Java, C, C++, and many other languages use % for mod (e.g. int x = a % n instead of
int x = a mod n).
Definition 3.13. For integers a, b, and n, where n > 0, we say that a is congruent to b
modulo n if n divides a − b (that is, a − b = kn for some integer k). We write this as a ≡ b
(mod n).
There are a few other (equivalent) ways of defining congruence modulo n.
If a − b ≠ kn for any integer k, then a is not congruent to b modulo n, and we write this
as a ≢ b (mod n).
Notice that if a ≡ b (mod n) and 0 ≤ b < n, then b is the remainder when a is divided by n.
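The definition of congruence translates almost directly into code. The following is a minimal Java sketch; the method name isCongruent is hypothetical, and it assumes n > 0 and that a − b does not overflow an int.

```java
final class CongruenceDemo {
    // True when n divides a - b, that is, a is congruent to b modulo n.
    // Assumes n > 0 and that a - b fits in an int.
    static boolean isCongruent(int a, int b, int n) {
        return (a - b) % n == 0;
    }

    public static void main(String[] args) {
        System.out.println(17 % 5);                 // prints 2
        System.out.println(isCongruent(17, 2, 5));  // prints true
        System.out.println(isCongruent(17, 3, 5));  // prints false
    }
}
```

Since 17 ≡ 2 (mod 5) and 0 ≤ 2 < 5, the value 2 is also the remainder when 17 is divided by 5, which matches the first line of output.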
Proof: Since every integer is either even (of the form 2k) or odd (of the form
2k + 1) we have two possibilities:
Example 3.16. Prove that the sum of two squares of integers leaves remainder 0, 1 or 2
when divided by 4.
Example 3.17. Prove that 2003 is not the sum of two squares.
Proof: Notice that 2003 ≡ 3 (mod 4). Thus, by Example 3.16 we know that
2003 cannot be the sum of two squares.
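A brute-force check is no substitute for the proof, but it can be reassuring. Here is a minimal Java sketch (the class and method names are my own, for illustration only) that searches for a way to write n as a sum of two squares and reports the remainder of a sum of two squares when divided by 4.

```java
final class TwoSquaresDemo {
    // Remainder of a*a + b*b when divided by 4, for small non-negative a and b.
    static int sumOfSquaresMod4(int a, int b) {
        return (a * a + b * b) % 4;
    }

    // Brute-force search: is n = a*a + b*b for some integers 0 <= a <= b?
    // Intended only for small n, where a*a + b*b cannot overflow.
    static boolean isSumOfTwoSquares(int n) {
        for (int a = 0; a * a <= n; a++) {
            for (int b = a; a * a + b * b <= n; b++) {
                if (a * a + b * b == n) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(2003 % 4);                // prints 3
        System.out.println(isSumOfTwoSquares(2003)); // prints false
        System.out.println(isSumOfTwoSquares(25));   // prints true (9 + 16)
    }
}
```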
The proof of the following is left as an exercise. Recall that iff is shorthand for if and only
if.
Example 3.19. Since 1961 mod 37 = 0 and 356 mod 37 = 23, and 0 ≠ 23, we know that
1961 ≢ 356 (mod 37) by Theorem 3.18.
Note: Our definition of mod required that n > 0 and a ≥ 0. However, it is possible to define
a mod n when a is negative. Unfortunately, there are two possible ways of doing so based on
how you define the remainder when the dividend is negative. One possible answer is negative
and the other is positive. However, they always differ by n, so computing one from the other
is easy.
Example 3.20. Since −13 = (−2) ∗ 5 − 3 and −13 = (−3) ∗ 5 + 2, we might consider the
remainder of −13/5 as either −3 or 2. Thus, −13 mod 5 = −3 and −13 mod 5 = 2 both seem
like reasonable answers. Fortunately, the two possible answers differ by 5. In fact, you can
always obtain the positive possibility by adding n to the negative possibility.
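Here is how the two possibilities look in Java, as a small sketch. The % operator gives the negative possibility for −13 and 5, while the standard library method Math.floorMod (available since Java 8) gives the non-negative one.

```java
final class NegativeModDemo {
    public static void main(String[] args) {
        System.out.println(-13 % 5);               // prints -3
        System.out.println(Math.floorMod(-13, 5)); // prints 2
        System.out.println(-13 % 5 + 5);           // prints 2
    }
}
```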
⋆Exercise 3.21. Fill in the missing numbers that are congruent to 1 (mod 4) (listed in
increasing order)
Note: When using the mod operator in computer programs in situations where the dividend
might be negative, it is important to know which definition your programming language/compiler
uses. Java returns a negative number (or zero) when the dividend is negative. In C, the answer
depended on the compiler before the C99 standard; C99 and later behave like Java.
⋆Exercise 3.22. If you write a C program that computes −45 mod 4, what are the two
possible answers it might give you?
Answer
The next exercise explores a reasonable idea: What if we want the answer to a mod b to always
be between 0 and b − 1, even if a is negative? In other words, we want to force the correct positive
answer even if the compiler for the given language might return a negative answer.
⋆Evaluate 3.23. Although different programming languages and compilers might return
different answers to the computation a mod b when a < 0, they always return a value between
−(b − 1) and b − 1. Given that fact, give an algorithm that will always return an answer
between 0 and b − 1, regardless of whether or not a is negative. Try to do it without using
any conditional statements.
Evaluation
Evaluation
Evaluation
Evaluation
Definition 3.25. The floor of a real number x, written ⌊x⌋, is the largest integer that is less
than or equal to x. The ceiling of a real number x, written ⌈x⌉, is the smallest integer that
is greater than or equal to x.
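In Java the floor and ceiling are available as Math.floor and Math.ceil, both of which return a double. The short sketch below shows a few values. Notice in particular what happens with negative numbers, where "largest integer less than or equal to x" does not mean "drop the fractional part."

```java
final class FloorCeilingDemo {
    public static void main(String[] args) {
        System.out.println(Math.floor(2.5));  // prints 2.0
        System.out.println(Math.ceil(2.5));   // prints 3.0
        System.out.println(Math.floor(-2.5)); // prints -3.0
        System.out.println(Math.ceil(-2.5));  // prints -2.0
        System.out.println(Math.floor(7.0));  // prints 7.0
    }
}
```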
Theorem 3.28. Let a be an integer and x be a real number. Then a ≤ x if and only if
a ≤ ⌊x⌋.
⋆Evaluate 3.29. Implement an algorithm that will round a real number x to the closest
integer, but rounds down at .5. You can only use numbers, basic arithmetic (+, –, *, /),
and floor(y) and/or ceiling(y) (which correspond to ⌊y⌋ and ⌈y⌉). Don’t worry about
the data types (i.e. returning either a double or an int is fine as long as the value stored
represents an integer value).
Evaluation
Evaluation
Evaluation
Evaluation
Corollary 3.30. Let a, b, and c be integers. Then a ≤ b/c if and only if a ≤ ⌊b/c⌋.
Proof: Since b/c is a real number, this is a special case of Theorem 3.28.
The floor function is important because in many programming languages, including Java, C, and
C++, integer division truncates. That is, when you compute n/k for integers n and k, the result
is rounded so it is closer to zero (as opposed to rounding down). That means that if n, k ≥ 0,
n/k rounds down to ⌊n/k⌋. But if n < 0, n/k rounds up to ⌈n/k⌉. So in Java, C, and C++,
3/4 = −3/4 = 0, 11/5 = 2 and −11/5 = −2, for instance.
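The values above can be checked directly in Java. The sketch below also shows Math.floorDiv (a standard library method since Java 8), which rounds down instead of toward zero, so you can compare truncation with the floor.

```java
final class IntegerDivisionDemo {
    public static void main(String[] args) {
        System.out.println(3 / 4);                 // prints 0
        System.out.println(-3 / 4);                // prints 0
        System.out.println(11 / 5);                // prints 2
        System.out.println(-11 / 5);               // prints -2
        System.out.println(Math.floorDiv(-11, 5)); // prints -3
    }
}
```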
⋆Exercise 3.31. Compute each of the following, assuming they are expressions in Java, C,
or C++.
⋆Evaluate 3.32. Let n and m be positive integers with m > 2. Assuming integer division
truncates, write an algorithm that will compute n/m, but will round the result instead of
truncating it (round up at or above .5, down below .5). For instance, 5/4 should return 1,
but 7/4 should return 2 instead of 1. You may only use basic integer arithmetic, not including
the mod operator.
Solution 1: floor(n/m+0.5)
Evaluation
Evaluation
Evaluation
Although the previous example may seem like it is based on an unnecessary restriction, this is
taken from a real-world situation. When writing code for an embedded device (e.g. a thermostat
or microwave oven), code space is often of great concern. Performing just a single calculation
using doubles/floats can add a lot of code since it needs to add certain code to deal with the data
type. Sometimes the amount of code added is too much since embedded processors have a lot less
space than the processor in your laptop or desktop computer. Because of this, some embedded
programmers do everything they can to avoid all non-integer computations in their code when it
is possible.
Definition 3.34. The if-then-else control statement has the following syntax:
if (expression) {
    blockA
} else {
    blockB
}
and evaluates as follows. If expression is true then the statements in blockA are executed.
Otherwise, the statements in blockB are executed.
Example 3.35. Write an algorithm that will determine the maximum of two integers. Prove
your algorithm is correct.
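One possible solution is sketched below in Java. This is my own minimal version, not necessarily the one the author had in mind. The comments outline why it is correct: exactly one of the two branches runs, and in each branch the value returned is at least as large as both x and y.

```java
final class MaxDemo {
    // Returns the larger of x and y (either one if they are equal).
    static int max(int x, int y) {
        if (x >= y) {
            return x; // x >= y, so x is at least as large as both values
        } else {
            return y; // here y > x, so y is larger than both values
        }
    }

    public static void main(String[] args) {
        System.out.println(max(3, 7));  // prints 7
        System.out.println(max(-2, -5)); // prints -2
        System.out.println(max(4, 4));  // prints 4
    }
}
```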
⋆Exercise 3.36. Write an algorithm that will determine the maximum of three numbers
that uses the algorithm from Example 3.35. Prove that your algorithm is correct.
int max ( int x , int y , int z ) {
Proof
The previous exercise is an example of something that you are already familiar with: code
reuse. We could have written an algorithm from scratch, but it is much easier to use one that
already exists. Not only is the resulting algorithm simpler, it is easier to prove that it is correct
since we know that the algorithm it uses is correct.
⋆Exercise 3.37. Write an algorithm that prints “Hello” if one enters a number between
4 and 6 (inclusive) and “Goodbye” otherwise. Use the function print(String s) to print.
You are not allowed to use any boolean operators like and, or, etc.
void HelloGoodbye(int x) {
For simplicity, we will sometimes use print to output results and not worry about whitespace
(e.g. spaces and newlines). Think of it as being equivalent to Java’s System.out.print(i+" ") or
C++’s cout<<i<<" ", or C’s printf("%d ",i) if you would like.
⋆Question 3.38. The solution given for the previous example uses three print statements,
with two identical print statements appearing in different places. Is it possible to write the
algorithm using only two print statements while maintaining the restriction that you cannot
use and and or? If so, give that version of the algorithm. If not, explain why not.
Answer:
• initialize is one or more statements that set up the initial conditions and is executed
once at the beginning of the loop.
• condition is the condition that is checked each time through the loop. If condition is
true, the statements in blockA are executed followed by the code in increment. This
process repeats until condition is false.
• increment is code that ensures the loop progresses. Typically increment is just a simple
increment statement, but it can be anything.
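To connect the three parts to actual code, here is a small Java sketch that adds up the squares 1 + 4 + · · · + n·n. The method name sumOfSquares is hypothetical and the task was chosen only to illustrate the pieces of the loop.

```java
final class ForLoopDemo {
    // Returns 1*1 + 2*2 + ... + n*n (0 when n <= 0).
    static int sumOfSquares(int n) {
        int sum = 0;
        // initialize: int i = 1    condition: i <= n    increment: i++
        for (int i = 1; i <= n; i++) {
            sum = sum + i * i; // blockA: runs once for each i from 1 to n
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(3)); // prints 14
        System.out.println(sumOfSquares(0)); // prints 0
    }
}
```

When n is 0 the condition is false the first time it is checked, so blockA never runs.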
⋆Question 3.41. Does the factorial algorithm from Example 3.40 ever do something
unexpected? If so, what does it do, when does it do it, and what should be done to fix it?
Answer
The for loop
⋆Evaluate 3.42. Evaluate these algorithms that supposedly compute n! for values of n > 0.
Don’t worry about what they do when n ≤ 0.
Solution 1:
int fact = 1;
for (int i = 0; i <= n; i++) {
    fact = fact * i;
}
return fact;
Evaluation
Solution 2:
int fact = 1;
for (int i = 2; i <= n; i++) {
    fact = fact * i;
}
return fact;
Evaluation
Solution 3:
int fact = 1;
for (int i = n; i > 0; i--) {
    fact = fact * i;
}
return fact;
Evaluation
Solution 4:
int fact = 1;
for (int i = 1; i < n; i++) {
    fact = fact * (n - i);
}
return fact;
Evaluation
⋆Exercise 3.43. Write an algorithm that will compute x^n, where x is a given real number
and n is a given positive integer.
double power(double x, int n) {
}
3.5 Arrays
Definition 3.44. An array is an aggregate of homogeneous types. The length of the array
is the number of entries it has.
Example 3.45. Write an algorithm that determines the maximum element of a 1-dimensional
array of n integers.
Solution: We declare the first value of the array (the 0-th entry) to be the
maximum (a sentinel value). Then we successively compare it to the other n − 1 entries.
If an entry is found to be larger than it, that entry is declared the maximum.
int MaxEntry(int[] X, int n) {
    int max = X[0]; // Start with the 0-th entry as the maximum.
    for (int i = 1; i < n; i++) {
        if (X[i] > max) {
            max = X[i]; // Found a larger entry.
        }
    }
    return max;
}
If your primary language is Java, you might wonder why we did not use X.length in the
previous algorithm. There are two reasons. First, not all languages store the length of an array as
part of the array. For example, C and C++ do not. In these languages you always need to pass
the length along with an array. Second, sometimes you want to be able to process only part of an
array. Written as we did above, the algorithm will return the maximum of the first n elements of
an array. The algorithm works as long as the array has at least n elements.
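The second reason is easy to see with a quick experiment. The Java sketch below wraps the algorithm in a class (the class name and the lowercase method name maxEntry are my choices) and calls it once on the whole array and once on just the first three elements.

```java
final class MaxEntryDemo {
    // Returns the maximum of the first n elements of X. Assumes 1 <= n <= X.length.
    static int maxEntry(int[] X, int n) {
        int max = X[0];
        for (int i = 1; i < n; i++) {
            if (X[i] > max) {
                max = X[i];
            }
        }
        return max;
    }

    public static void main(String[] args) {
        int[] X = {4, 9, 2, 7, 15};
        System.out.println(maxEntry(X, X.length)); // prints 15
        System.out.println(maxEntry(X, 3));        // prints 9
    }
}
```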
Note: If an algorithm has an array and a variable n as parameters, you can probably assume
n is the length of the array unless it is otherwise specified.
Example 3.46. Implement a method that swaps two elements of an array that works in Java
and other languages that can’t pass by reference.
Solution: Here is a method that swaps two elements of an integer array. Except
for the type of the parameter and temp variable, this works for any data type.
void swap(int[] X, int a, int b) {
    int temp = X[a]; // Save X[a] before it is overwritten.
    X[a] = X[b];
    X[b] = temp;
}
I don’t want to get into the technical details of pass-by-value versus pass-by-
reference since that is really the subject of another course. But briefly, this works
because when the array is passed we have access to the individual array elements.
Therefore when we change them, they are changed in the original array.
Example 3.47. An array (X[0], . . . , X[n − 1]) is given. Without introducing another array,
put its entries in reverse order.
Solution: Observe that we want to exchange the first and last element, the
second and second-to-last element, etc. That is, we want to exchange X[0] ↔
X[n−1], X[1] ↔ X[n−2], . . . , X[k] ↔ X[n−k−1]. But what value of k is correct?
If we go all the way to n − 1, the result will be that every element is swapped and
then swapped back, so we will accomplish nothing. Hopefully you can see that if
we swap elements when k < n − k − 1, we will swap each element at most once.
The “at most once” is because if the array has an odd number of elements, the
middle element occurs when k = n − k − 1, but we can skip it since it doesn’t need
to be swapped with anything. Notice that k < n − k − 1 if and only if 2k < n − 1.
Since k and n are integers, this is equivalent to 2k ≤ n − 2. This is equivalent
to k ≤ ⌊(n − 2)/2⌋ by Corollary 3.30. Thus, we need to swap the elements
0, 1, . . . , ⌊(n − 2)/2⌋ with elements n − 1, n − 2, . . . , n − 1 − ⌊(n − 2)/2⌋ = n − ⌊n/2⌋,
respectively. The following algorithm implements this idea.
void reverseArray(int[] X, int n) {
    // Swap X[i] with X[n-i-1] for i = 0, 1, ..., (n-2)/2.
    for (int i = 0; i <= (n - 2) / 2; i++) {
        swap(X, i, n - i - 1);
    }
}
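If you would like to experiment with the algorithm, here is a self-contained Java sketch that combines it with the array swap from Example 3.46. The class name and the test arrays are my own. Try it on arrays with both an odd and an even number of elements.

```java
import java.util.Arrays;

final class ReverseArrayDemo {
    // Swaps the elements at indices a and b of X.
    static void swap(int[] X, int a, int b) {
        int temp = X[a];
        X[a] = X[b];
        X[b] = temp;
    }

    // Reverses the first n elements of X in place.
    static void reverseArray(int[] X, int n) {
        for (int i = 0; i <= (n - 2) / 2; i++) {
            swap(X, i, n - i - 1);
        }
    }

    public static void main(String[] args) {
        int[] odd = {1, 2, 3, 4, 5};
        reverseArray(odd, odd.length);
        System.out.println(Arrays.toString(odd));  // prints [5, 4, 3, 2, 1]

        int[] even = {1, 2, 3, 4};
        reverseArray(even, even.length);
        System.out.println(Arrays.toString(even)); // prints [4, 3, 2, 1]
    }
}
```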
⋆Question 3.48. The previous algorithm went until i was (n − 2)/2, not ⌊(n − 2)/2⌋. Why
is this O.K.? Does it depend on the language? Explain.
Answer
⋆Question 3.49. Does the following algorithm correctly reverse the elements of an array?
Explain.
reverseArray(int[] X, int n) {
    for (int i = 0; i < n / 2; i++) {
        swap(X, i, n - i - 1);
    }
}
Answer
Hopefully the previous example helps you realize that you need to be careful when working
with arrays. Formulas related to array indices change depending on whether arrays are indexed
starting at 0 or 1. In addition, formulas involving the number of elements in an array can be
tricky, especially when the formulas relate to partitioning the array into pieces (e.g. into two
halves). These can both lead to the so-called “off by one” error that is common in computer
science. The next example illustrates these problems, and one way to deal with it.
Example 3.50. Give a formula for the index of the middle element of an array of size n. If
there are two middle elements (e.g. n is even), use the first one.
⋆Question 3.51. The previous example seems to suggest that ⌈n/2⌉ − 1 = ⌊(n − 1)/2⌋ for
all integers n. Is this correct? Do a few sample computations to try to convince yourself of
your answer.
Answer
Note: Always be very careful with formulas related to the index of an array. Double-check
your logic by plugging in some values to be certain your formula is correct.
Definition 3.52. A boolean variable is a variable that can be either true or false.
Definition 3.53. The not unary operator changes the status of a boolean variable from true
to false and vice-versa. In Java, C, and C++, the not operator is ! and it appears before the
expression being negated (e.g. !x).
The not operator is essentially the same thing as the negation we discussed earlier. The
difference is context—we are applying not to a boolean variable, whereas we applied negation to
a statement.
Example 3.54. [The Locker-Room Problem] A locker room contains n lockers, numbered
1 through n. Initially all doors are open. Person number 1 enters and closes all the doors.
Person number 2 enters and opens all the doors whose numbers are multiples of 2. Person
number 3 enters and toggles all doors that are multiples of 3. That is, he closes them if they
are open and opens them if they are closed. This process continues, with person i toggling
each door that is a multiple of i. Write an algorithm to determine which lockers are closed
when all n people are done.
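Here is one way the process might be simulated in Java. This is a sketch of my own and may differ from the solution the example originally gave. It uses a boolean array in which closed[k] records whether locker k is closed, and the not operator to toggle a door. The names closedLockers and closed are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class LockerRoomDemo {
    // Returns the numbers of the lockers that are closed after all n people are done.
    static List<Integer> closedLockers(int n) {
        // Index 0 is unused. false means open, which is the initial state.
        boolean[] closed = new boolean[n + 1];
        for (int person = 1; person <= n; person++) {
            // Visit only the multiples of person: person, 2*person, 3*person, ...
            for (int door = person; door <= n; door += person) {
                closed[door] = !closed[door]; // toggle the door
            }
        }
        List<Integer> result = new ArrayList<>();
        for (int door = 1; door <= n; door++) {
            if (closed[door]) {
                result.add(door);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(closedLockers(10)); // prints [1, 4, 9]
    }
}
```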
Can you see how to slightly improve the algorithm in Example 3.54?
The while loop
Example 3.56. An array X satisfies X[0] ≤ X[1] ≤ · · · ≤ X[n − 1]. Write an algorithm that
finds the number of entries that are different.
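A possible solution using a while loop is sketched below in Java. It is my own minimal version (the name countDistinct is hypothetical), and it reads "the number of entries that are different" as the number of distinct values. Because the array is sorted, equal values are adjacent, so it is enough to count the positions where an entry differs from the one before it.

```java
final class DistinctCountDemo {
    // Returns the number of distinct values among the first n entries of the
    // sorted array X.
    static int countDistinct(int[] X, int n) {
        if (n == 0) {
            return 0; // an empty array has no values at all
        }
        int count = 1; // X[0] is the first distinct value
        int i = 1;
        while (i < n) {
            if (X[i] != X[i - 1]) {
                count++; // X[i] starts a new run of equal values
            }
            i++;
        }
        return count;
    }

    public static void main(String[] args) {
        int[] X = {1, 1, 2, 3, 3, 3};
        System.out.println(countDistinct(X, X.length)); // prints 3
    }
}
```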
⋆Exercise 3.57. What will the following algorithm return for n = 5? Trace the algorithm
carefully, outlining all your steps.
int mystery(int n) {
    int x = 0;
    int i = 1;
    while (n > 1) {
        if (n * i > 4) {
            x = x + 2 * n;
        } else {
            x = x + n;
        }
        n = n - 2;
        i++;
    }
    return x;
}
Answer
Theorem 3.58. Let n > 1 be a positive integer. Either n is prime or n has a prime factor
no greater than √n.
Example 3.59. To determine whether 103 is prime we proceed as follows. Observe that
⌊√103⌋ = 10 (according to Theorem 3.28, we only need concern ourselves with the floor).
We now divide 103 by every prime no greater than 10. If one of these primes divides 103, then
103 is not a prime. Otherwise, 103 is a prime. Notice that 103 mod 2 = 1, 103 mod 3 = 1,
103 mod 5 = 3, and 103 mod 7 = 5. Since none of these remainders is 0, 103 is prime.
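The computations in this example are easy to reproduce. Here is a short Java sketch, where the method name hasFactorAmong and the hard-coded list of primes up to 10 are my own choices for illustration.

```java
final class TrialDivisionDemo {
    // True if some entry of primes divides n evenly.
    static boolean hasFactorAmong(int n, int[] primes) {
        for (int p : primes) {
            if (n % p == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] primesUpTo10 = {2, 3, 5, 7};
        System.out.println((int) Math.floor(Math.sqrt(103))); // prints 10
        for (int p : primesUpTo10) {
            // Prints the remainders 1, 1, 3, and 5, one per line.
            System.out.println("103 mod " + p + " = " + 103 % p);
        }
        System.out.println(hasFactorAmong(103, primesUpTo10)); // prints false
    }
}
```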
Proof
Proof
Solution: We first deal with a few base cases. If n = 1, it is not prime, and if
n = 2 or n = 3 it is prime. Then we determine if n is even, in which case it is not
prime. Finally, we loop through all of the odd values, starting with 3 and going
to √n, determining whether or not n is a multiple of any of them. If so, it is not
prime. If we get through all of this, then n has no factors less than or equal to √n
which means it must be prime. Here is the algorithm based on this description.
boolean isPrime(int n) {
    if (n <= 1) { // Anything less than 2 is not prime.
        return false;
    }
    if (n == 2 || n == 3) { // 2 and 3 are special cases.
        return true;
    }
    if (n % 2 == 0) { // Discard even numbers right away.
        return false;
    } else {
        // Determine if it has any odd factors.
        int i = 1;
        while (i <= sqrt(n)) {
            i = i + 2;
            if (n % i == 0) {
                return false;
            }
        }
        return true; // It had no factors.
    }
}
Note: Although the algorithm in Example 3.62 works, it is not very
practical for large values of n. In fact, there is no known algorithm that can factor numbers
efficiently on a “classical” computer. The most commonly used public-key cryptosystems rely
on the assumption that there is no efficient algorithm to factor a number. However, if you
have a quantum computer, you are in luck. Shor’s algorithm actually can factor numbers
efficiently.
⋆Question 3.63. Why did the algorithm in the previous example deal with even numbers
differently?
Answer
⋆Exercise 3.64. Use the fact that integer division truncates to write an algorithm that
reverses the digits of a given positive integer. For example, if 123476 is the input, the output
should be 674321. You should be able to do it with one extra variable, one while loop, one
mod operation, one multiplication by 10, one division by 10, and one addition.
int reverseDigits(int n) {
}
⋆Question 3.9. Write a method sumFirstN(int n) that returns the sum of the first n positive
integers. Your code should be as simple and efficient as possible. (Hint: you should use a loop,
although we will see later on this semester how to solve this particular problem without a loop.)
⋆Question 3.12. Write two versions of a method called boolean noZeroes(int a[], int n),
where a is an array of length n, that returns true if and only if none of the elements of a are 0:
one that calls containsZero (see the previous question), and one that does not. In both cases,
your code should be as simple and efficient as possible.
⋆Question 3.13. Rewrite the code from Question 3.8 using a while loop.
3.8 Problems
Note: For the remainder of the book, whenever a problem asks for an algorithm, always
assume it is asking for the most efficient algorithm you can find. You will likely lose points
if your algorithm is not efficient enough.
Problem 3.1. Let n be a positive integer. Recall that a ≡ b (mod n) iff n divides a − b (that is,
a − b = k · n for some k ∈ Z). Use this formal definition to prove each of the following:
(c) If a ≡ b (mod n) and b ≡ c (mod n), then a ≡ c (mod n). (Transitive property)
Problem 3.2. Implement the swap operation for integers without using an additional variable
and without using addition or subtraction. (Hint: bit operations)
Problem 3.3. Prove or disprove that the following method correctly computes the maximum of
two integers x and y, assuming that the minimum method correctly computes the minimum of x
and y.
int maximum(int x, int y) {
    int min = minimum(x, y);
    int max = x + y - min;
    return max;
}
Problem 3.4. Let a be an array with n elements. Give an algorithm int sum(int []a,int n)
that returns the sum of all of the elements from the array a.
Problem 3.5. Give an algorithm int sumLarge(int []a,int n,int k) (where a is an array
with n elements) that returns the sum of all of the elements from the array a that are at least k
(as in a[i] ≥ k, not whose index is at least k).
Problem 3.6. Let a be an array with n elements. Give an algorithm that returns true if and only
if the elements of a are in increasing order. That is, if i < j, then a[i] ≤ a[j]. Call your algorithm
boolean increasing(int []a,int n). (Hint: You do not have to check that a[i] ≤ a[j] for
every i < j. What is a simpler thing to check that is equivalent to this?)
Problem 3.7. Let a be an array with n elements. Give an algorithm mod(int []a,int n,int k)
that replaces each element a[i] from the array a with a[i] mod k.
Problem 3.8. Let a be an array with n elements. Give an algorithm sort(int []a,int n) that
sorts the elements of the array a in increasing order.
Problem 3.9. Let a be an array with n elements and assume that a[i] is the number of people
in room i of a hotel. The hotel can have at most capacity total guests. If they have too many
guests, they have to kick guests out in reverse order of room number (so larger room numbers get
kicked out first). Give an algorithm int tooMany(int []a,int n, int capacity) that returns
the room number k such that some or all guests in room k and all guests in rooms k + 1 and
beyond must vacate; or -1 if there is enough space for everybody.
Problem 3.10. Give an algorithm secondSmallest(int []a,int n) that returns the second
smallest element of an array a with n elements. Implement this under two different assumptions:
(a) If there are two or more of the smallest value in the array, that value should be returned.
(e.g. If the array is [2, 5, 6, 2, 4, 3, 2], then 2 should be returned.)
(b) If there are two or more of the smallest value in the array, return the next larger value. (e.g.
If the array is [2, 5, 6, 2, 4, 3, 2], then 3 should be returned.)
(Note: If there are not two or more of the smallest value, you of course return the next larger
value than the smallest value in either case.)
Problem 3.11. Give a recursive algorithm that computes n!. You can assume n ≥ 0.
Problem 3.12. What will the following algorithm return for n = 3?
int iCanDuzSomething(int n) {
    int x = 0;
    while (n > 0) {
        for (int i = 1; i <= n; i++) {
            for (int j = i; j <= n; j++) {
                x = x + i * j;
            }
        }
        n--;
    }
    return x;
}
Problem 3.13. Give an algorithm that will round a real number x to the closest integer, rounding
up at .5. You can only use floor(y), ceiling(y), basic arithmetic (+, -, *, /) and/or numbers.
You cannot use anything else, including conditional statements! Prove that your algorithm is
correct.
Problem 3.14. Recall that Example 3.32 had the conditions that n > 0 and m > 2. Also recall
that you gave a solution to this in Exercise 3.33. Also recall that integer division always truncates
toward zero, so negative numbers truncate differently than positive ones.
(a) Does your solution work when m = 2? Justify your answer with a proof/counterexample.
(b) Does your solution work when n ≤ 0? Justify your answer with a proof/counterexample.
(c) Give an algorithm that will work for any integer n and any non-zero m. Give examples
that demonstrate that your algorithm is correct for the various cases and/or a proof that it
always works. Make sure you consider all relevant cases (e.g., when it should round up and
down, when n and m are positive/negative). You may only use basic integer arithmetic and
conditional statements. You may not use floor, ceiling, abs (absolute value), etc. You also
may not use the mod operator since how it works with negative numbers is not the same for
every language.
Problem 3.15. Assume you have a function random(int n) that returns a random integer
between 0 and n − 1, inclusive. Give code/pseudocode for an algorithm random(int a, int b)
that returns a random number between a and b, inclusive of both a and b. You may assume that
a < b (although in practice, this should be checked). You may only call random(int n) once
and you may not use conditional statements. Prove that your algorithm returns an integer in the
required range.
Problem 3.16. Assume you have a function random() that returns a non-negative random
integer. Give code/pseudocode for an algorithm random(int a, int b) that returns a random
integer between a and b, inclusive of both a and b. Each possible number generated should occur
with approximately the same probability. You may assume that a and b are both positive and that
a < b (although in practice, this should be checked). You may use only basic integer arithmetic
(including the mod operator) and you may only call random() once. You may not use loops,
conditional statements, floor, ceiling, abs (absolute value), etc. Prove that your algorithm returns
an integer in the required range.
Problem 3.17. The following method is a simplified version of a method that might be used
to implement a hash table or in a cryptographic system. Assume that for one particular use the
number returned by this function has to have the opposite parity (even/odd) of the parameter.
For instance, hash_it(4) returns 49 which has the opposite parity of 4, so it works for 4. Prove
or disprove that this function always returns a value of opposite parity of the parameter.
int hash_it(int x) {
return x*x+6*x+9;
}
Problem 3.18. Give an algorithm that prints all of the primes that are less than or equal to n.
Your algorithm should be as efficient as possible. One approach is to modify the algorithm from
Example 3.62 by using an array to make it more efficient.
Problem 3.19. Prove or disprove that the following method computes the absolute value of x.
For simplicity, assume that all of the calculations are performed with perfect precision. You may
use the fact that √(x²) = x when x ≥ 0 if it will help.
double absoluteValue(double x) {
    double square = x * x;
    double answer = sqrt(square);
    return answer;
}
Problem 3.20. Prove or disprove that the following method computes the absolute value of x.
For simplicity, assume that all of the calculations are performed with perfect precision. You may
use the fact that (√x)² = x when x ≥ 0 if it will help.
double absoluteValue(double x) {
    double root = sqrt(x);
    double answer = root * root;
    return answer;
}
Problem 3.21. Problems 3.19 and 3.20 both assumed that “all of the calculations are performed
with perfect precision”. Is that a realistic assumption? Give an example of an input for which
each algorithm will work properly. Then give an example of an input for which each algorithm
will not work properly. You can implement and run the algorithms to do some testing if you wish.
Problem 3.22. The following method is supposed to do some computations on a positive number
that result in getting the original number back. Prove or disprove that this method always returns
the exact value that was passed in. Unlike in the previous problems, here you should assume that
although a double stores a real number as accurately as possible, it uses only a fixed amount
of space. Thus, a double is unable to store the exact value of any irrational number–it instead
stores an approximation.
double returnTheParameterUnmodified(double x) {
    double a = sqrt(x);
    double b = a * a;
    return b;
}
Problem 3.23. Prove or disprove that the algorithm from Example 3.8 actually does work¹
properly with integer data types stored using 2’s complement.² You may restrict to 8-bit numbers
if it will help you think about it more clearly–a proof/counterexample for 8-bit numbers can easily
be modified to work for 32- or 64-bit numbers. (Hint: If it doesn’t work, what sort of numbers
might it fail on?)
Problem 3.24. Use the first definition of congruence modulo n given in Definition 3.13 to prove
Theorem 3.18. (Note: This is an if and only if proof, so you need to prove both ways.)
¹ When we say “works,” we mean for all possible values of x and y.
² We assume you have previously encountered the 2’s complement representation of integers. If not, do an
Internet search for details.
Chapter 4: Logic
4.1 Propositional Logic
Whether the statement is obviously true or false does not enter into the definition. One only needs
to know that its certainty can be established.
Example 4.2. The following are propositions and their truth values, if known:
(d) There exist infinitely many primes which are the sum of a square and 1. (unknown)
(h) Every prime that leaves remainder 1 when divided by 4 is the sum of two squares. (true)
(i) Every even integer greater than 6 is the sum of two distinct primes. (unknown)
⋆Exercise 4.3. Give the truth value of each of the following statements.
(a) 0 = 1.
(b) 17 is an integer.
Example 4.4. The following are not propositions, since it is impossible to assign a true or
false value to them.
(c) What I am is what I am, are you what you are or what?
(d) x = x + 1.
⋆Exercise 4.5. For each of the following statements, state whether it is true, false, or not
a proposition.
(b) “Psych” was one of the best shows on TV when it was on the air.
Definition 4.6. A logical operator is used to combine one or more propositions to form a
new one. A proposition formed in this way is called a compound proposition. We call the
propositions used to form a compound proposition variables for reasons that should become
evident shortly.
Next we will discuss the most common logical operators. Some of these will be familiar to
you. When you learned about Boolean expressions in your programming courses, you probably
saw NOT (e.g. if( !list.isEmpty() )), OR (e.g. if( x>0 || y>0 )), and AND (e.g.
if( list.size() > 0 && list.get(0) > 1 )). The notation we use will be different, however.
This is because the symbols you are familiar with are specific choices made by whoever created
the programming language(s) you learned. Here we will use standard mathematical notation for
the operators.
For each of the following definitions, assume p and q are propositions.
Definition 4.7. The negation (or NOT) of p, denoted by ¬p, is the proposition “it is not
the case that p”. ¬p is true when p is false, and vice-versa. Other notations include p̄,
∼p, and !p. Many programming languages use the last one.
Example 4.8. If p is the proposition “x < 0”, then ¬p is the proposition “It is not the case
that x < 0,” or “x ≥ 0.”
⋆Fill in the details 4.9. Let p be the proposition “I am learning discrete mathematics.”
(a) Use the fact that “This is a proposition” is a proposition to prove that “This is not a
proposition” is a proposition. Then prove that its truth value is false.
Proof
(b) Use a contradiction proof to prove that “This is not a proposition” is a proposition. Then
prove that its truth value is false.
Proof
⋆Exercise 4.11. You need a program to execute some code only if the size of a list is not
0. The variable is named list, and its size is list.size(). Give the expression that should
go in the if statement. In fact, give two different expressions that will work.
1.
2.
Definition 4.12. The conjunction (or AND) of p and q, denoted by p∧q, is the proposition
“p and q”. The conjunction of p and q is true when p and q are both true and false otherwise.
Many programming languages use && for conjunction.
Example 4.13. Let p be the proposition “x > 0” and q be the proposition “x < 10.” Then
p ∧ q is the proposition “x > 0 and x < 10,” or “0 < x < 10.” In a Java/C/C++ program, it
would be “x>0 && x<10.”
Example 4.14. Let p be the proposition “x < 0” and q be the proposition “x > 10.” Then
p ∧ q is the proposition “x < 0 and x > 10.” Notice that p ∧ q is always false since if x < 0,
clearly x ≯ 10. But don’t confuse the proposition with its truth value. When you see the
statement ‘p ∧ q is “x < 0 and x > 10”’ and ‘p ∧ q is false,’ these are saying two different
things. The first one is telling us what the proposition is. The second one is telling us its
truth value. ‘p ∧ q is false’ is just a shorthand for saying ‘p ∧ q has truth value false.’
⋆Fill in the details 4.15. If p is the proposition “I like cake,” and q is the proposition “I
Example 4.16. Write a code fragment that determines whether or not three numbers can
be the lengths of the sides of a triangle.
Solution: Let a, b, and c be the numbers. For simplicity, let’s assume they are
integers. First we must have a > 0, b > 0, and c > 0. Also, the sum of any two of
them must be larger than the third in order to form a triangle. More specifically,
we need a + b > c, b + c > a, and c + a > b. Since we need all of these to be true,
this leads to the following algorithm.
boolean IsItATriangle(int a, int b, int c) {
    if (a > 0 && b > 0 && c > 0 && a + b > c && b + c > a && a + c > b) {
        return true;
    } else {
        return false;
    }
}
Definition 4.17. The disjunction (or OR) of p and q, denoted by p ∨ q, is the proposition
“p or q”. The disjunction of p and q is false when both p and q are false and true otherwise.
Put another way, if p is true, q is true, or both are true, the disjunction is true. Many
programming languages use || for disjunction.
Example 4.18. Let p be the proposition “x < 5” and q be the proposition “x > 15.”
Then p ∨ q is the proposition “x < 5 or x > 15.” In a Java/C/C++ program, it would be
“x<5 || x>15.”
⋆Fill in the details 4.19. Let p be the proposition “x > 0” and q be the proposition
⋆Exercise 4.20. Let p be “Jill is tall,” and q be “Jill is smart.” Express each of the following
propositions in English.
1. ¬p is
2. p ∨ q is
3. p ∧ q is
⋆Exercise 4.21. Give an algorithm that will return true if an array of integers either starts
or ends with a 0, and false otherwise. Assume array indexing starts at 0 and that the array
is of length n. Use only one conditional statement. Be sure to deal with the possibility of an
empty array.
boolean startsOrEndsWithZero(int[] a, int n) {
}
⋆Question 4.22. Does the solution given for the previous exercise properly deal with arrays
of size 0 and 1? Prove it.
Answer
Definition 4.23. The exclusive or (or XOR) of p and q, denoted by p⊕q, is the proposition
“p is true or q is true, but not both”. The exclusive or of p and q is true when exactly
one of p or q is true. Put another way, the exclusive or of p and q is true iff p and q have
different truth values.
Example 4.24. Let p be the proposition “x > 10” and q be the proposition “x < 20.” Then
p ⊕ q is the proposition “x > 10 or x < 20, but not both.”
Note: Notice that ∨ is an inclusive or, meaning that it is true if both are true, whereas ⊕
is an exclusive or, meaning it is false if both are true. The difference between ∨ and ⊕ is
complicated by the fact that in English, the word “or” can mean either of these depending
on context. For instance, if your mother tells you “you can have cake or ice cream” she is
likely using the exclusive or, whereas a prerequisite of “Math 110 or demonstrated competency
with algebra” clearly has the inclusive or in mind.
⋆Exercise 4.25. For each of the following, is the or inclusive or exclusive? Answer OR or
XOR for each.
(d) You need to take probability or statistics before taking this class.
(e) You can get credit for either Math 111 or Math 222.
⋆Exercise 4.26. Let p be “list 1 is empty” and q be “list 2 is empty.” Explain the difference
in meaning between p ∨ q and p ⊕ q.
Answer
⋆Question 4.27. Let p be the proposition “x < 5” and q be the proposition “x > 15.”
Answer
Answer
XOR is not used as often as AND and OR in logical expressions in programs. Some languages
have an XOR operator and some do not. The issue gets blurry because some languages, like Java,
have an explicit Boolean type, while others, like C and C++, do not. All of these languages have
a bitwise XOR operator, but this is not the same thing as a logical XOR operator. We will return
to this topic later. In the next section we will see how to implement ⊕ using ∨, ∧, and ¬.
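As a small illustration of the paragraph above, here is a hedged Java sketch. In Java, applying ^ or != to two boolean operands gives true exactly when the operands differ, so either can serve as a logical XOR. The method xor below is a hypothetical helper written for this sketch (it is not from the text); it shows one possible way to build the same truth values from OR, AND, and NOT only.

```java
class XorSketch {
    // One possible formulation of exclusive or using only ||, &&, and !.
    // (Hypothetical helper, not from the text.)
    static boolean xor(boolean p, boolean q) {
        return (p || q) && !(p && q);
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        for (boolean p : values) {
            for (boolean q : values) {
                // The last three columns agree on every row.
                System.out.println(p + " " + q + " " + xor(p, q)
                        + " " + (p ^ q) + " " + (p != q));
            }
        }
    }
}
```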
Definition 4.28. The conditional statement (or implies) involving p and q, denoted by
p → q, is the proposition “if p, then q”. It is false when p is true and q is false, and true
otherwise. In the statement p → q, we call p the premise (or hypothesis or antecedent)
and q the conclusion (or consequence).
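Java has no built-in operator for the conditional statement, but the definition can be sketched with a small helper. The name implies is an assumption made for this illustration. The expression !p || q is false only when p is true and q is false, which matches Definition 4.28.

```java
class ImpliesSketch {
    // p -> q is false only when p is true and q is false.
    // (Hypothetical helper written for illustration.)
    static boolean implies(boolean p, boolean q) {
        return !p || q;
    }

    public static void main(String[] args) {
        System.out.println(implies(true, true));   // true
        System.out.println(implies(true, false));  // false
        System.out.println(implies(false, true));  // true
        System.out.println(implies(false, false)); // true
    }
}
```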
Example 4.29. Let p be “you earn at least 94%,” and q be “you will receive an A.” Then
p → q is the proposition “If you earn at least 94%, then you will receive an A.”
It is important to realize that p → q and q → p are not always equivalent.
Example 4.30. Let p be “you earn at least 94%,” and q be “you will receive an A.” Then
p → q is the proposition “If you earn at least 94%, then you will receive an A,” and q → p
is the proposition “If you receive an A, then you earned at least 94%.” Although they may
sound equivalent, they are not. Consider the possibility that it is true that receiving at least
⋆Question 4.31. Assume that the proposition “If you earn at least 94% in this class, then
you will receive an A” is true.
(a) What grade will you get if you earn 94%? Explain.
Answer
Answer
(c) If you don’t earn 94%, does that mean you didn’t get an A? Explain.
Answer
Example 4.32. Translating between an English sentence and a mathematical expression can
sometimes be tricky with conditional statements. Again, let p be “you earn at least 94%,”
and q be “you will receive an A.” Then the sentence “You will receive an A whenever you
earn at least 94%” is p → q, and not q → p since it is expressing the same idea as the sentence
“If you earn at least 94%, you will receive an A.”
Note: The conditional operator is by far the one that is the most difficult to get a handle
on for at least two reasons. First, the conditional statement p → q is not saying anything
about p or q by themselves. It is only saying that if p is true, then q has to also be true. It
doesn’t say anything about the case that p is not true. This brings us to the second reason.
Should F → T be true or false? Although it seems counterintuitive to some, it should be true.
Again, p → q is telling us about the value of q when p is true (i.e., if p is true, then q must
be true). What does it tell us if p is false? Nothing. As strange as it might seem, when p is
false, the whole statement is true regardless of the truth value of q.
If you are still confused, you can simply fall back on the formal definition: p → q is false
when p is true and q is false, and is true otherwise. In other words, if interpreting
p → q as the English sentence “p implies q” is more harmful than helpful in understanding
the concept, don’t worry about why it doesn’t make sense and just remember the definition.ᵃ
ᵃ In mathematics, terms are usually chosen so they make sense immediately. Sometimes this is not possible
(if the concept is very complicated or it doesn’t relate to anything familiar). Sometimes a term is poorly
defined but the definition sticks because of prior use. Sometimes it makes sense to some people and not to
others, probably based on a person’s background. I think this last possibility may be the reason in this case.
Example 4.34. Let p be “you earn at least 94%,” and q be “you receive an A.” Then p ↔ q
is the proposition “You earn at least 94% if and only if you receive an A.”
⋆Question 4.35. Assume that the proposition “You will receive an A if and only if you
earn at least 94%” is true.
Answer
Answer
(c) If you don’t earn at least 94%, does that mean you didn’t get an A?
Answer
Now let’s bring all of these operations together with a few more examples.
Example 4.36. Let a be the proposition “I will eat my socks,” b be “It is snowing,” and c
be “I will go jogging.” Here are some compound propositions involving a, b, and c, written
using these variables and operators and in English.
With Variables/Operators    In English
(b ∨ ¬b) → c                Whether or not it is snowing, I will go jogging.
b → ¬c                      If it is snowing, I will not go jogging.
b → (a ∧ ¬c)                If it is snowing, I will eat my socks, but I will not go jogging.
a ↔ c                       When I eat my socks I go jogging, and when I go jogging I eat my socks;
                            or, I eat my socks if and only if I go jogging.
⋆Fill in the details 4.37. Let p be the proposition “Iron Man is on TV,” q be “I will
watch Iron Man,” and r be “I own Iron Man on DVD.” Fill in the missing information in
the following table.
Definition 4.38. A truth table is a table that shows the truth value of a compound proposition
for all possible combinations of truth assignments to the variables in the proposition.
If there are n variables, the truth table will have 2ⁿ rows.
The truth table for ¬ is given in Table 4.1 and the truth tables for all of the other operators
we just defined are given in Table 4.2. In the latter table, the first two columns are the possible
values of the two variables, and the last 5 columns are the values for each of the two-variable
compound propositions we just defined for the given inputs.
p   ¬p
T   F
F   T

p   q   (p ∧ q)   (p ∨ q)   p ⊕ q   (p → q)   (p ↔ q)
T   T      T         T        F        T         T
T   F      F         T        T        F         F
F   T      F         T        T        T         F
F   F      F         F        F        T         T
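The rows of the two-variable table can also be generated mechanically. The following Java sketch is only an illustration; the class and method names are chosen here, and the Java expressions used for ⊕, →, and ↔ (p ^ q, !p || q, and p == q) are one possible encoding of the definitions above.

```java
class OperatorTableSketch {
    static String tf(boolean b) {
        return b ? "T" : "F";
    }

    // Builds one row: p, q, then AND, OR, XOR, implies, if-and-only-if.
    static String row(boolean p, boolean q) {
        return String.join(" ", tf(p), tf(q), tf(p && q), tf(p || q),
                tf(p ^ q), tf(!p || q), tf(p == q));
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        for (boolean p : values) {
            for (boolean q : values) {
                System.out.println(row(p, q));
            }
        }
    }
}
```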
Example 4.39. Construct the truth table of the proposition a ∨ (¬b ∧ c).
Solution: Since there are three variables, the truth table will have 2³ = 8 rows.
Here is the truth table, with several helpful intermediate columns.
a   b   c   ¬b   ¬b ∧ c   a ∨ (¬b ∧ c)
T   T   T   F      F           T
T   T   F   F      F           T
T   F   T   T      T           T
T   F   F   T      F           T
F   T   T   F      F           F
F   T   F   F      F           F
F   F   T   T      T           T
F   F   F   T      F           F
Note: Notice that there are several columns in the truth table besides the columns for the
variables and the column for the proposition we are interested in. These are “helper” or
“intermediate” columns (those are not official definitions). Their purpose is simply to help
us compute the final column more easily and without (hopefully) making any mistakes.
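One way to double-check a hand-computed truth table such as the one for a ∨ (¬b ∧ c) is to evaluate the proposition for every assignment in a short program. The sketch below is an assumption-laden illustration (the names Example439Sketch, f, and finalColumn are invented here); it lists the final column from the top row to the bottom row in the same order as the table.

```java
class Example439Sketch {
    // The proposition a OR (NOT b AND c).
    static boolean f(boolean a, boolean b, boolean c) {
        return a || (!b && c);
    }

    // Returns the final column of the truth table as a string,
    // with rows ordered TTT, TTF, TFT, ..., FFF.
    static String finalColumn() {
        boolean[] values = {true, false};
        StringBuilder column = new StringBuilder();
        for (boolean a : values) {
            for (boolean b : values) {
                for (boolean c : values) {
                    column.append(f(a, b, c) ? 'T' : 'F');
                }
            }
        }
        return column.toString();
    }

    public static void main(String[] args) {
        System.out.println(finalColumn()); // TTTTFFTF
    }
}
```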
p q p → q (p → q) ∧ q
T T
T F
F T
F F
Note: As long as all possible values of the variables are included, the order of the rows of
a truth table does not matter. However, by convention one of two orderings is usually used.
Since there is an interesting connection to the binary representation of numbers, let’s take a
closer look at this connection in the next example.
Example 4.41 (Ordering the rows of a Truth Table). Notice that the values of the variables
can be used to construct an index for each row. We can do this by thinking of each T as a
1 and each F as a 0, and treating the columns as a binary number. The rows will then be
listed either in order or (more commonly) in reverse order. For instance, if there are three
variables, we can think of it as shown in the following table.
a   b   c    bits     index
T   T   T    1 1 1    7
T   T   F    1 1 0    6
T   F   T    1 0 1    5
T   F   F    1 0 0    4
F   T   T    0 1 1    3
F   T   F    0 1 0    2
F   F   T    0 0 1    1
F   F   F    0 0 0    0
This is the ordering you should follow so that you can easily check your answers with those
in the solutions. It also makes grading your homework easier.
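The connection between row order and binary numbers can be sketched in Java. This is an illustration only, with names invented here: row(index, n) reads the bits of index from most significant to least significant and writes T for 1 and F for 0, and main lists the rows in reverse order of index, as in the table above.

```java
class RowOrderSketch {
    // Returns the truth assignment for the given row index with n variables:
    // each 1 bit becomes T and each 0 bit becomes F, leftmost variable first.
    static String row(int index, int n) {
        StringBuilder sb = new StringBuilder();
        for (int bit = n - 1; bit >= 0; bit--) {
            sb.append(((index >> bit) & 1) == 1 ? 'T' : 'F');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Reverse order: index 7 (TTT) down to index 0 (FFF).
        for (int index = 7; index >= 0; index--) {
            System.out.println(row(index, 3) + " " + index);
        }
    }
}
```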
There is also a way of thinking about this recursively. Given an ordering for a table with
n variables, we can compute an ordering for a table with n + 1 variables as follows. Make
two copies of the columns corresponding to the n variables, appending a T to the beginning
of the first copy, and an F to the beginning of the second copy.
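The recursive description can be sketched in Java as follows. The class and method names are assumptions made for this illustration, and a table with zero variables is taken to have a single empty row so that the recursion has a base case.

```java
import java.util.ArrayList;
import java.util.List;

class RecursiveRowsSketch {
    // Returns the rows for n variables, each row written as a string of T/F.
    static List<String> rows(int n) {
        List<String> result = new ArrayList<>();
        if (n == 0) {
            result.add(""); // base case: one empty row
            return result;
        }
        List<String> smaller = rows(n - 1);
        // First copy: prepend T to every row for n - 1 variables.
        for (String r : smaller) {
            result.add("T" + r);
        }
        // Second copy: prepend F to every row for n - 1 variables.
        for (String r : smaller) {
            result.add("F" + r);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String r : rows(3)) {
            System.out.println(r);
        }
    }
}
```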
⋆Exercise 4.42. Construct the truth table of the proposition (a ∨ ¬b) ∧ c. You’re on your
own this time to supply all of the details.
Answer
Answer
Evaluation
Definition 4.48. A proposition that is always true is called a tautology. One that is always
false is a contradiction. Finally, one that is neither of these is called a contingency.
(a) The proposition “x < 0” is a contingency since its truth depends on the value of x.
(b) The proposition “x² < 0” is a contradiction since it is false no matter what x is.
(c) The proposition “x² ≥ 0” is a tautology since it is true no matter what x is.
⋆Fill in the details 4.50. State whether each of the following propositions is a tautology,
contradiction, or contingency. Give a brief justification.
(b) p ∧ ¬p is a since .
(c) p ∨ q is a since .
To prove something is a tautology, one must prove that it is always true. One way to do this
is to show that the proposition is true for every row of the truth table. Another way is to argue
(without using a truth table) that the proposition is always true, often using a proof by cases.
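The truth-table method amounts to checking every row, which a program can do by brute force. The Java sketch below is only an illustration for propositions with two variables; the name classify and the use of BiPredicate to represent a proposition are choices made here, not part of the text. A program like this is a useful sanity check, but it does not replace a written proof.

```java
import java.util.function.BiPredicate;

class TautologySketch {
    // Classifies a two-variable proposition by evaluating it on all four
    // rows of its truth table.
    static String classify(BiPredicate<Boolean, Boolean> prop) {
        boolean[] values = {true, false};
        int trueRows = 0;
        for (boolean p : values) {
            for (boolean q : values) {
                if (prop.test(p, q)) {
                    trueRows++;
                }
            }
        }
        if (trueRows == 4) {
            return "tautology";
        }
        if (trueRows == 0) {
            return "contradiction";
        }
        return "contingency";
    }

    public static void main(String[] args) {
        // p OR NOT p (q is simply ignored).
        System.out.println(classify((p, q) -> p || !p));
        // (p AND (p -> q)) -> q, writing x -> y as !x || y.
        System.out.println(classify((p, q) -> !(p && (!p || q)) || q));
    }
}
```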
Proof 1: Since every row in the following truth table for p ∨ ¬p is T, it is a tautology.
p ¬p p ∨ ¬p
T F T
F T T
Proof 1:
p q p → q p ∧ (p → q) p ∧ (p → q) → q
T T T T T
T F F F T
F T T F T
F F T F T
Evaluation
p q p → q p ∧ (p → q) p ∧ (p → q) → q
T T T T T
T F F F T
F T T F T
F F T F T
Evaluation
Evaluation
Proof 4: Since an implication can only be false when the premise is true
and the conclusion is false, we only need to prove that this can’t happen.
So let’s assume that p ∧ (p → q) is true but that q is false. Since p ∧ (p → q)
is true, p is true and p → q is true (by definition of conjunction). But if p
is true and q is false, p → q is false. This is a contradiction, so it must be
the case that our assumption that p ∧ (p → q) is true but that q is false is
incorrect. Since that was the only possible way for p ∧ (p → q) → q to be
false, it cannot be false. Therefore it is a tautology.
Evaluation
Evaluation
Definition 4.53. Let p and q be propositions. Then p and q are said to be logically equivalent
if p ↔ q is a tautology. An alternative (but equivalent) definition is that p and q are
equivalent if they have the same truth table. That is, if they have the same truth value for all
assignments of truth values to the variables.
When p and q are equivalent, we write p = q. An alternative notation is p ≡ q.
There are many logical equivalences (or identities/rules/laws) that come in handy when working
with compound propositions. Many of them (e.g. commutative, associative, distributive) will
resemble the arithmetic laws you learned in grade school. Others are very different. The most
common ones are given in Table 4.3.
We will provide proofs of some of these so you can get the hang of how to prove propositions
are equivalent. One method is to demonstrate that the propositions have the same truth tables.
That is, they have the same value on every row of the truth table. But just drawing a truth table
isn’t enough. A statement like “since p and q have the same truth table, p = q” is necessary to
make a connection between the truth table and the equivalence of the propositions. Let’s see a
few examples.
Name               Equivalence
commutativity      p ∨ q = q ∨ p
                   p ∧ q = q ∧ p
associativity      p ∨ (q ∨ r) = (p ∨ q) ∨ r
                   p ∧ (q ∧ r) = (p ∧ q) ∧ r
distributive       p ∧ (q ∨ r) = (p ∧ q) ∨ (p ∧ r)
                   p ∨ (q ∧ r) = (p ∨ q) ∧ (p ∨ r)
identity           p ∨ F = p
                   p ∧ T = p
negation           p ∨ ¬p = T
                   p ∧ ¬p = F
domination         p ∨ T = T
                   p ∧ F = F
idempotent         p ∨ p = p
                   p ∧ p = p
double negation    ¬(¬p) = p
De Morgan’s        ¬(p ∨ q) = ¬p ∧ ¬q
                   ¬(p ∧ q) = ¬p ∨ ¬q
absorption         p ∨ (p ∧ q) = p
                   p ∧ (p ∨ q) = p
p ¬p ¬(¬p)
T F T
F T F
Since the entries for both p and ¬(¬p) are the same for every row, ¬(¬p) = p.
The two versions of De Morgan’s Law are among the most important propositional equivalences
for computer scientists. It is easy to make a mistake when trying to simplify expressions in
conditional statements, and a solid understanding of De Morgan’s Laws goes a long way. In light
of this, let’s take a look at them next.
Proof: We can prove this by showing that both expressions have
the same truth table. Below is the truth table for ¬(p ∨ q) and ¬p ∧ ¬q (the fourth
and last columns).
p   q   p ∨ q   ¬(p ∨ q)   ¬p   ¬q   ¬p ∧ ¬q
T   T     T        F       F    F      F
T   F     T        F       F    T      F
F   T     T        F       T    F      F
F   F     F        T       T    T      T
Since they are the same for every row of the table, ¬(p ∨ q) = ¬p ∧ ¬q.
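The same row-by-row comparison can be carried out in code. The following Java sketch is an illustration only (the class and method names are invented here); it compares !(p || q) with !p && !q on all four rows, mirroring the truth-table proof above.

```java
class DeMorganSketch {
    // Returns true if NOT (p OR q) and (NOT p) AND (NOT q) have the same
    // value on every row of the truth table.
    static boolean firstLawHolds() {
        boolean[] values = {true, false};
        for (boolean p : values) {
            for (boolean q : values) {
                boolean left = !(p || q);
                boolean right = !p && !q;
                if (left != right) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(firstLawHolds()); // true
    }
}
```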
⋆Exercise 4.56. Prove the second version of De Morgan’s Law: ¬(p ∧ q) = ¬p ∨ ¬q.
Proof
p q p∧q ¬(p ∧ q) ¬p ¬q ¬p ∨ ¬q
T T
T F
F T
F F
Truth tables aren’t the only way to prove that two propositions are equivalent. You can also
use other equivalences. Let’s see an example.
⋆Fill in the details 4.57. Prove the idempotent laws (p ∨ p = p and p ∧ p = p) by using
the other equivalences.
Proof: We have
p = p∨F (by identity)
= p ∨ (p ∧ ¬p) (by )
= (p ∨ p) ∧ (p ∨ ¬p) (by )
= (p ∨ p) ∧ T (by negation)
= (by identity)
Thus, p ∨ p = p. Similarly,
p = (by identity)
= (by negation)
= (by distributive)
= (by negation)
= p∧p (by )
Thus, .
Although it is helpful to specifically state which rules are being used at every step, it isn’t
always required.
(p ∧ q) ∨ (p ∧ ¬q) = p ∧ (q ∨ ¬q) = p ∧ T = p.
⋆Exercise 4.59. Use the other equivalences (not a truth table) to prove the Absorption
laws.
Solution: Using DeMorgan’s Law and double negation, we can see that
As the previous example demonstrates, you can apply the rules to propositions in various
forms. Sometimes it is useful to explicitly define p and q (and sometimes r) and write expressions
using formal mathematical notation, but at other times it is just as easy to apply the rules to
the expressions as they are. In the previous example, we didn’t gain that much by defining p and
q. But with more complicated expressions it certainly can be helpful.
Note: A common mistake is to forget to use De Morgan’s law when dealing with negation. For
instance, in the last example, replacing the code !(a==null || a.size()<=0 ) with the code
!(a==null) || !(a.size()<=0) would be incorrect. You cannot just distribute a negation
among other terms. Always remember to use De Morgan’s law: ¬(p ∨ q) ≠ ¬p ∨ ¬q.
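To see the mistake concretely, here is a small Java sketch that evaluates the condition from the note, the correct De Morgan rewrite, and the incorrect rewrite on an empty list. The class and method names are hypothetical, and the list type (List<Integer>) is an assumption made only so that the code compiles.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; the class and method names are hypothetical.
final class NegationDemo {
    // The condition as written in the note.
    static boolean original(List<Integer> a) {
        return !(a == null || a.size() <= 0);
    }

    // Correct: De Morgan's law turns the OR into an AND.
    static boolean correct(List<Integer> a) {
        return a != null && a.size() > 0;
    }

    // Incorrect: the negation was distributed without changing || to &&.
    static boolean incorrect(List<Integer> a) {
        return !(a == null) || !(a.size() <= 0);
    }

    public static void main(String[] args) {
        List<Integer> empty = new ArrayList<>();
        System.out.println(original(empty));  // false
        System.out.println(correct(empty));   // false
        System.out.println(incorrect(empty)); // true, so it is not equivalent
    }
}
```

The incorrect version has a second problem: if a is null, the left side of the || is false, so the right side is evaluated and a.size() throws a NullPointerException.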
Evaluation
Although some of these examples may seem a bit contrived, in some sense they are realistic.
As code is refactored, code is added and removed in various places, conditionals are combined or
separated, etc. and sometimes it leads to conditionals that are more complicated than they need
to be. In addition, when working on large teams, you will often work on code written by others.
Since some programmers don’t have a good grasp on logic, you will certainly run into conditional
statements that are way more complicated and convoluted than necessary. As I believe these
examples demonstrate, simplifying conditionals is not nearly as easy as one might think. It takes
great care to ensure that your simplified version is still correct.
Note: There is an important difference between the logical operators as discussed here and how
they are implemented in programming languages such as Java, C, and C++. It is something
that is sometimes called short circuiting. You are probably familiar with the concept even
if you haven’t heard it called that before. It exploits the domination laws:
F ∧q =F
T ∨q =T
Let’s see an example.
Example 4.65. Consider the statement if(x>=0 && a[x]!=0). The first domination law
implies that when x < 0, the expression in the if statement will evaluate to false regardless of
the truth value of a[x]!=0. Therefore, many languages will simply not evaluate the second
part of the expression—they will short circuit.
The same thing happens for statements like if(x<0 || x>a.length). When x < 0, the
expression is true regardless of the truth value of x>a.length. Again, many languages don’t
evaluate the second part of this expression if the first part is true. Of course, if the first part
is false, the second part is evaluated since the truth value now depends on the truth value of
the second part.
There are at least two benefits of this. First, it is more efficient since sometimes less code
needs to be executed. Second, it allows the checking of one condition before checking a second
condition that might cause a crash. You have probably used it in statements like the above to
make sure you don’t index outside the bounds of an array. Another use is to avoid attempting
to access methods or fields when a variable refers to null (e.g. if(a!=null && a.size()>0)).
But this has at least two consequences that can cause subtle problems if you aren’t careful.
First, although the AND and OR operators are commutative (e.g. p ∨ q and q ∨ p are equiv-
alent), that is not always the case for Boolean expressions in these languages. For instance,
the statement if(x>=0 && a[x]!=0) is not equivalent to if(a[x]!=0 && x>=0) since the
second one will cause a crash if x < 0. Second, if the second part of the expression is code
that you expect will always be executed, you may spend a long time tracking down the bug
that this creates.
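The following Java sketch illustrates the points made in this note. It uses the condition from Example 4.65 in both orders, plus a version that uses Java's single & operator, which evaluates both sides and does not short circuit. The class and method names are made up for illustration.

```java
// Illustrative sketch; class and method names are hypothetical.
final class ShortCircuitDemo {
    // Safe for negative x: a[x] is only evaluated when x >= 0 is true.
    // (Like the example in the text, this guards only the lower bound.)
    static boolean guarded(int[] a, int x) {
        return x >= 0 && a[x] != 0;
    }

    // Unsafe: a[x] is evaluated first, so a negative x throws
    // ArrayIndexOutOfBoundsException before x >= 0 is ever checked.
    static boolean reversed(int[] a, int x) {
        return a[x] != 0 && x >= 0;
    }

    // Java's single & on booleans does not short circuit, so both
    // sides are always evaluated and this also throws for negative x.
    static boolean nonShortCircuit(int[] a, int x) {
        return x >= 0 & a[x] != 0;
    }

    public static void main(String[] args) {
        int[] a = { 5, 0, 7 };
        System.out.println(guarded(a, -1)); // false, no exception
        try {
            reversed(a, -1);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("reversed order crashed");
        }
    }
}
```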
⋆Evaluate 4.66. Rewrite the following segment of code so that it is as simple as possible
and logically equivalent.
if ( !(list.isEmpty() && list.get(0)>=100) && !(list.get(0)<100) )
{
    x++;
} else {
    x--;
}
Solution 1: The second and third statements mean the same thing. Also,
if the second is true then we got a value so we know the list is not
empty, so the first statement is unnecessary. This leads to the following
equivalent code:
Evaluation
Evaluation
So my simplified code is
if ( !list.isEmpty() && list.get(0) >= 100 ) {
    x++;
} else {
    x--;
}
Evaluation
⋆Question 4.67. In the solutions to the previous problem we said that the final solution
was correct. But there might be a catch. Go back to the original code and the final solution
and look closer. Is the final solution really equivalent to the original? Explain why or why
not.
Evaluation
The previous question serves as a reinforcement of a point previously made. When dealing
with logical expressions in programs, we have to be careful about our notion of equivalence. This
is because of short-circuiting and the fact that expressions in programs, unlike logical statements,
can crash instead of being true or false.
⋆Exercise 4.68. Let p be “x > 0”, q be “y > 0,” and r be “Exactly one of x or y is greater
than 0.”
Answer
Answer
Table 4.4 contains some important identities involving →, ↔, and ⊕. Since these operators
are not always present in a programming language, identities that express them in terms of ∨, ∧,
and ¬ are particularly important.
p ⊕ q = (p ∨ q) ∧ ¬(p ∧ q) p ↔ q = (p → q) ∧ (q → p)
p ⊕ q = (p ∧ ¬q) ∨ (¬p ∧ q) p ↔ q = ¬p ↔ ¬q
¬(p ⊕ q) = p ↔ q p ↔ q = (p ∧ q) ∨ (¬p ∧ ¬q)
p → q = ¬q → ¬p ¬(p ↔ q) = p ↔ ¬q
p → q = ¬p ∨ q ¬(p ↔ q) = p ⊕ q
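Since Java has no operator for →, the identities in the table are exactly what you would use to write one. Here is a minimal sketch that builds →, ⊕, and ↔ from !, &&, and ||, and then checks the last two against Java's != and == on boolean values. All of the names are hypothetical.

```java
// Illustrative sketch; names are hypothetical. Each method uses one
// identity from the table to express an operator with !, && and ||.
final class ConnectiveIdentities {
    // p -> q = (not p) or q
    static boolean implies(boolean p, boolean q) {
        return !p || q;
    }

    // p xor q = (p or q) and not (p and q)
    static boolean xor(boolean p, boolean q) {
        return (p || q) && !(p && q);
    }

    // p <-> q = (p -> q) and (q -> p)
    static boolean iff(boolean p, boolean q) {
        return implies(p, q) && implies(q, p);
    }

    // Compares the hand-built versions with != and == on every row.
    static boolean agreeWithBuiltIns() {
        boolean[] values = { true, false };
        for (boolean p : values) {
            for (boolean q : values) {
                if (xor(p, q) != (p != q)) return false;
                if (iff(p, q) != (p == q)) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeWithBuiltIns()); // prints true
    }
}
```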
The previous example demonstrates an important general principle. When proving identities
(or equations of any sort), sometimes it works best to start from the right hand side. Try to keep
this in mind in the future.
⋆Evaluate 4.70. Show that p ↔ q and (p ∧ q) ∨ (¬p ∧ ¬q) are logically equivalent.
Evaluation
Proof 2: They are both true when p and q are both true or both false.
Therefore they are logically equivalent.
Evaluation
Proof 3: Each of these is true precisely when p and q are both true.
Evaluation
Proof 4: Each of these is true when p and q have the same truth value
and false otherwise, so they are equivalent.
Evaluation
In the previous example, you should have noticed that just a subtle change in wording can be
the difference between a correct and an incorrect proof. When writing proofs, remember to be very
precise in how you word things. You may know what you mean when you write something, but
a reader can only see what you actually wrote.
Example 4.72. In a previous example we saw that “x < 0” was a contingency, “x² < 0” was
a contradiction, and “x² ≥ 0” was a tautology. Each of these is actually a predicate since
until we assign a value to x, they are not propositions.
Example 4.73. Let P (x) be “x < 0”. Notice that until we assign some value to x, P (x) is
neither true nor false.
P (3) is the proposition “3 < 0,” which is false.
If we let Q(x) be “x² ≥ 0,” then Q(3) is “3² ≥ 0,” which is true.
Notice that both P (x) and “x < 0” are propositional functions. In other words, we don’t
have to use functional notation to represent a propositional function. Any statement that has a
variable in it is a propositional function, whether we label it or not.
(a) x² + 2x + 1 = 0
(b) 3² + 2 · 3 + 1 = 0
Definition 4.75. The symbol ∀ is the universal quantifier, and it is read as “for all”, “for
each”, “for every”, etc. For instance, ∀x means “for all x”. When it precedes a statement,
it means that the statement is true for all values of x.
As the name suggests, the “all” refers to everything in the universe of discourse (or
domain of discourse, or simply domain), which is simply the set of objects to which the
current discussion relates.
Hopefully you recall that N is the set of natural numbers (i.e. {0, 1, 2, . . .}), Z is the set of
integers, and Z+ is the set of positive integers (i.e. {1, 2, 3, . . .}). We will use these in some of the
following examples.
Predicates and Quantifiers 103
Example 4.76. Let P (x)=“x < 0”. Then P (x) is a propositional function, and ∀xP (x)
means “all values of x are negative.” If the domain is Z, ∀xP (x) is false. However, if the
domain is negative integers, ∀xP (x) is true.
Example 4.77. Express each of the following English sentences using the universal quantifier.
Don’t worry about whether or not the statements are true. Assume the domain is real
numbers.
⋆Exercise 4.78. Express each of the following using the universal quantifier. Assume the
domain is Z.
(a) Two times any number is less than three times that number.
Answer
Answer
Example 4.79. The expression ∀x(x² ≥ 0) means “for all values of x, x² is non-negative”.
But what constitutes all values? In other words, what is the domain? In this case the most
logical possibilities are the integers or real numbers since it seems to be stating something
about numbers (rather than people, for example). In most situations the context should make
it clear what the domain is.
⋆Exercise 4.81. Use the universal quantifier to express the fact that the square of any
integer is not zero as long as the integer is not zero.
Answer
Definition 4.82. The symbol ∃ is the existential quantifier, and it is read as “there
exists”, “there is”, “for some”, etc. For instance, ∃x means “For some x”. When it precedes
a statement, it means that the statement is true for at least one value of x in the universe.
Example 4.83. Prove that ∃x(√x = 2) is true when the domain is the integers.
Proof. Notice that when x = 4, √x = √4 = 2, proving the statement.
⋆Exercise 4.84. Express the sentence “Some integers are positive” using quantifiers. You
may assume the domain of the variable(s) is Z.
Answer
Sometimes you will see nested quantifiers. Let’s see a few examples.
Example 4.85. Use quantifiers to express the sentence “all positive numbers have a square
root,” where the domain is real numbers.
Solution: We can express this as ∀(x > 0)∃y(√x = y).
⋆Evaluate 4.86. Express the sentence “Some integers are even” using quantifiers. You may
assume the domain of the variable(s) is the integers.
Evaluation
Evaluation
Evaluation
⋆Exercise 4.88. Express the following statement using quantifiers: “Every integer can be
expressed as the sum of two squares.” Assume the domain for all three variables (did you
catch the hint?) is Z.
Answer
⋆Fill in the details 4.89. Prove or disprove the statement from the previous example.
y ≠ 0 since 3 is not .
If y ≥ 2, y² ≥ 4, so we need , .
There is a subtlety in the proof in Exercise 4.90 that you might have overlooked. The next
evaluate exercise will help to see if you caught it.
Evaluation
Evaluate 4.91 is a good reminder of why mathematicians are so fussy about the details in
proofs. Now it’s your turn to give a proper solution to that problem. If you have been paying
attention, this should be an easy one.
Example 4.93. Prove or disprove that the following statement is equivalent to the statement
from Example 4.90.
∃m ∈ Z+ ∀n ∈ Z+ (n + 7)² > 49 + m
Solution: This is almost the same as the expression from the previous example,
but the ∀n ∈ Z+ and ∃m ∈ Z+ have been reversed. Does that change the meaning?
Let’s find out.
The expression in Example 4.90 is saying something like “For any positive integer
n, there is some positive integer m . . .” In English, the statement in this example is
saying something like “There exists a positive integer m such that for any positive
integer n . . .” Are these different? Indeed. The one from Example 4.90 lets us
pick a value of m based on the value of n. In other words, we can pick a different
value of m for each value of n. The one from this example requires that we pick
a value of m that will work for all values of n. Can you see how that is saying
something different?
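One way to get a feel for the difference is to write both orders as nested loops. The statements in this example range over all positive integers, which a loop cannot cover, so the sketch below assumes a small finite domain and a made-up predicate purely for illustration. The class name, the method names, and the predicate “(x + y) is divisible by 3” are all hypothetical.

```java
import java.util.function.BiPredicate;

// Illustrative sketch over a finite domain; all names are hypothetical.
final class QuantifierOrder {
    // "For all x there exists y": the y that works may depend on x,
    // so the search for y restarts for every x.
    static boolean forAllExists(int[] domain, BiPredicate<Integer, Integer> p) {
        for (int x : domain) {
            boolean found = false;
            for (int y : domain) {
                if (p.test(x, y)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    // "There exists y such that for all x": a single y must work
    // for every x, so the loop over y is on the outside.
    static boolean existsForAll(int[] domain, BiPredicate<Integer, Integer> p) {
        for (int y : domain) {
            boolean worksForAll = true;
            for (int x : domain) {
                if (!p.test(x, y)) {
                    worksForAll = false;
                    break;
                }
            }
            if (worksForAll) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] domain = { 0, 1, 2 };
        BiPredicate<Integer, Integer> p = (x, y) -> (x + y) % 3 == 0;
        System.out.println(forAllExists(domain, p)); // true
        System.out.println(existsForAll(domain, p)); // false
    }
}
```

Notice that the only structural difference is which loop is on the outside, and that is exactly the difference in the order of the quantifiers.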
⋆Exercise 4.95. Find a predicate P (x, y) such that ∀x∃yP (x, y) and ∃y∀xP (x, y) have
different truth values. Justify your answer. (Hint: Think simple. Will something like “x = y”
or “x < y” work if you choose the appropriate domain?)
Answer:
Example 4.96. Let P (x)=“x < 0”. Then ¬∀xP (x) means “it is not the case that all values
of x are negative.” Put more simply, it means “some value of x is not negative”, which we
can write as ∃x¬P (x).
What we saw in the last example actually holds for any propositional function.
Theorem 4.97 (DeMorgan’s Laws for quantifiers). If P (x) is a propositional function, then
¬∀xP (x) = ∃x¬P (x) and ¬∃xP (x) = ∀x¬P (x).
Proof: We will prove the first statement. The proof of the second is very
similar. Notice that ¬∀xP (x) is true if and only if ∀xP (x) is false. ∀xP (x) is
false if and only if there is at least one x for which P (x) is false. This is true
if and only if ¬P (x) is true for some x. But this is exactly the same thing as
∃x¬P (x), proving the result.
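On a finite domain you can watch this law hold in code. The sketch below assumes the domain {0, 1, . . . , n − 1} and uses the standard library methods IntStream.allMatch and IntStream.anyMatch to play the roles of ∀ and ∃. The class and method names are made up for illustration.

```java
import java.util.function.IntPredicate;
import java.util.stream.IntStream;

// Illustrative sketch over the finite domain {0, ..., n-1};
// the class and method names are hypothetical.
final class QuantifierDeMorgan {
    // not (for all x, P(x))
    static boolean notForAll(int n, IntPredicate p) {
        return !IntStream.range(0, n).allMatch(p);
    }

    // there exists x such that not P(x)
    static boolean existsNot(int n, IntPredicate p) {
        return IntStream.range(0, n).anyMatch(p.negate());
    }

    public static void main(String[] args) {
        IntPredicate lessThan50 = x -> x < 50;
        System.out.println(notForAll(100, lessThan50)); // true
        System.out.println(existsNot(100, lessThan50)); // true
    }
}
```

Trying a few predicates is not a proof, of course. The proof above is what guarantees the two methods agree for every predicate.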
Example 4.98. Negate the following expression, but simplify it so it does not contain the ¬
symbol.
∀n∃m(2m = n)
Solution:
¬∀n∃m(2m = n) = ∃n¬∃m(2m = n)
= ∃n∀m¬(2m = n)
= ∃n∀m(2m ≠ n)
⋆Exercise 4.99. Answer the following questions about the expression from Example 4.98,
assuming the domain is Z.
(a) Write the expression in English. You can start with a direct translation, but then smooth
it out as much as possible.
Answer
(b) Write the negation of the expression in English. State it as simply as possible.
Answer
Answer
Let’s see how quantifiers connect to algorithms. If you want to determine whether or not
something (e.g. P (x)) is true for all values in a domain (e.g., you want to determine the truth
value of ∀xP (x)), one method is to simply loop through all of the values and test whether or not
P (x) is true. If it is false for any value, you know the answer is false. If you test them all and
none of them were false, you know it is true.
Example 4.100. Here is how you might determine if ∀xP (x) is true or false for the domain
{0, 1, 2, . . . , 99}:
boolean isTrueForAll() {
    for (int i=0; i<100; i++) {
        if ( !P(i) ) {
            return false;
        }
    }
    return true;
}
Notice the negation in the code—this can trip you up if you aren’t careful.
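The existential quantifier can be handled with the same kind of loop, except that the roles of true and false are swapped and no negation is needed. Here is a minimal sketch for the same domain. The predicate P is a stand-in chosen only so that the code runs, and the names ExistsCheck and isTrueForSome are hypothetical.

```java
// Illustrative sketch; class, method, and predicate are hypothetical.
final class ExistsCheck {
    // Stand-in for P(x), chosen only so the example is runnable.
    static boolean P(int x) {
        return x * x == 49;
    }

    // Determines the truth value of "there exists x such that P(x)"
    // for the domain {0, 1, ..., 99}.
    static boolean isTrueForSome() {
        for (int i = 0; i < 100; i++) {
            if (P(i)) {
                return true;  // one witness is enough
            }
        }
        return false;         // no value in the domain worked
    }

    public static void main(String[] args) {
        System.out.println(isTrueForSome()); // true, since P(7) is true
    }
}
```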
Example 4.101. Let P (x) and Q(x) be predicates and the domain be {0, 1, 2, . . . , 99}. What
is isTrueForAll2() determining?
boolean isTrueForAll2() {
    for (int i=0; i<100; i++) {
        if ( !P(i) && !Q(i) )
            return false;
    }
    return true;
}
Solution: Notice that if both P (i) and Q(i) are false for the same value of i, it
returns false, and otherwise it returns true. Put another way, it returns true if for
every value of i, either P (i) or Q(i) is true. Thus, isTrueForAll2 is determining
the truth value of ∀i(P (i) ∨ Q(i)).
⋆Exercise 4.102. Rewrite the expression ( !P(i) && !Q(i) ) from the previous example
so that it uses only one negation.
Answer:
⋆Exercise 4.103. Let P (x) and Q(x) be predicates and the domain be {0, 1, 2, . . . , 99}.
What is isTrueForAll3() determining?
boolean isTrueForAll3() {
    boolean result = true;
    for (int i=0; i<100; i++) {
        if ( !P(i) ) {
            result = false;
        }
    }
    if ( result == true ) {
        return true;
    }
    for (int i=0; i<100; i++) {
        if ( !Q(i) ) {
            return false;
        }
    }
    return true;
}
Answer
Example 4.104. Now we are ready for the million dollar question: Are isTrueForAll2
and isTrueForAll3 determining the same thing?
Solution: At first glance, it looks like they might be. But we need to dig
deeper, and we need to prove one way or the other. To prove it, we would need to
show that these expressions evaluate to the same truth value, regardless of what
P and Q are. To disprove it, we just need to find a P and a Q for which these
expressions have different truth values. But let’s first talk it through to see if we
can figure out which answer seems to be correct.
∀i(P (i) ∨ Q(i)) is saying that for every value of i, either P (i) or Q(i) has to be
true. ∀iP (i) ∨ ∀iQ(i) is saying that either P (i) has to be true for every i, or that
Q(i) has to be true for every i. These sound similar, but not exactly the same,
so we cannot be sure yet. In particular, we cannot jump to the conclusion that
they are not equivalent because we described each with different words. There are
many ways of wording the same concept.
At this point we either need to try to tweak the wording so that we can see that
they are really saying the same thing, or we need to try to convince ourselves they
aren’t. Let’s try the latter.
What if P (i) is always true and Q(i) is always false? Then both statements are
true. But that doesn’t prove that they are always both true, so this doesn’t help.
Let’s try something else. What if we can find a P (i) and a Q(i) such that for any
given value of i, we can ensure that either P (i) or Q(i) is true, but also that there
is some value j such that P (j) is false and some value k ≠ j such that Q(k) is
false? Then ∀i(P (i) ∨ Q(i)) would be true, but ∀iP (i) ∨ ∀iQ(i) false, so this would
work. But in order to be certain, we have to know that such a P and Q exist.
What if we let P (i) be “i is even”, Q(i) be “i is odd”, and the universe be Z?
Then ∀iP (i) = ∀iQ(i) = F , so ∀iP (i) ∨ ∀iQ(i) = F , but ∀i(P (i) ∨ Q(i)) = T .
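You can also run this counterexample. The sketch below assumes the domain {0, 1, . . . , 99} used by the two methods rather than all of Z, and passes the predicates in as parameters so that “even” and “odd” can be plugged in. The first method follows the logic of isTrueForAll2. The second computes the same value as isTrueForAll3 but is written more directly. The names ForAllComparison, forAllOfOr, and orOfForAlls are hypothetical.

```java
import java.util.function.IntPredicate;

// Illustrative sketch; class and method names are hypothetical.
final class ForAllComparison {
    // Same logic as isTrueForAll2: for all i, P(i) or Q(i).
    static boolean forAllOfOr(IntPredicate p, IntPredicate q) {
        for (int i = 0; i < 100; i++) {
            if (!p.test(i) && !q.test(i)) {
                return false;
            }
        }
        return true;
    }

    // Same value as isTrueForAll3: (for all i, P(i)) or (for all i, Q(i)).
    static boolean orOfForAlls(IntPredicate p, IntPredicate q) {
        boolean allP = true;
        boolean allQ = true;
        for (int i = 0; i < 100; i++) {
            if (!p.test(i)) allP = false;
            if (!q.test(i)) allQ = false;
        }
        return allP || allQ;
    }

    public static void main(String[] args) {
        IntPredicate even = i -> i % 2 == 0;
        IntPredicate odd = i -> i % 2 == 1;
        System.out.println(forAllOfOr(even, odd));  // true
        System.out.println(orOfForAlls(even, odd)); // false
    }
}
```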
Now we have all of the pieces. Let’s put this all together in the form of a proof.
Example 4.106. Let p, q, and r be boolean variables. Then p, ¬p, q, ¬q, r, and ¬r are all
literals.
On the other hand, p ∧ q, ¬p → q, and p ∧ q ∧ r are not literals because they include
boolean operations of two or more variables. In other words, none of them are a variable or
the negation of a variable.
⋆Exercise 4.107. Let p, q, and r be boolean variables. Which of the following are literals?
q ∨ r, ¬p, q, p ∧ q ∧ r, ¬p ∧ q, r.
Answer
Example 4.110. Literals are conjunctive clauses since they are a conjunction of a single
variable. This might sound weird because if you only have a single variable, there is nothing
to “conjoin” it to. But it is just like if someone asked you to add up all of the money in your
pocket: if you only have a single dollar, you will say you have one dollar, having “added up”
the dollar.
Therefore, p, ¬p, q, ¬q, r, and ¬r are all conjunctive clauses.
Normal Forms 113
⋆Exercise 4.111. Let p, q, and r be boolean variables. Which of the following are con-
junctive clauses?
q ∨ r, ¬p, q, p ∧ q ∧ r, ¬p → q, ¬p ∧ q, r, ¬r ∧ p ∧ q, q ∨ ¬r, p ∧ ¬(r ∧ q)
Answer
Definition 4.112. A logical expression is in disjunctive normal form (DNF) (or sum-
of-products expansion) if it is expressed as a disjunction (OR) of conjunctive clauses.
Example 4.113. Let p, q, and r be boolean variables. Then the following are in disjunctive
normal form:
• q (It is the disjunction of a single conjunctive clause that consists of just a literal.)
• p ∨ ¬q (It is the disjunction of two conjunctive clauses, each of which is just a literal.)
• (p ∧ q ∧ r) ∨ (¬p ∧ r)
• p ∨ (q ∧ ¬p) ∨ (r ∧ ¬p)
• r ∧ ¬q ∧ p
However, the following are not in disjunctive normal form:
• p→q
• p ∧ (q ∨ r)
• p ∨ ¬(r ∧ q)
• p ∨ (q ∧ ¬p) ∧ (r ∨ ¬q)
• (p ↔ q) ∨ (q ∧ ¬r) ∨ ¬p
⋆Exercise 4.114. Let p, q, and r be boolean variables. Which of the following are in
disjunctive normal form?
¬p, q ∨ r, ¬q ∧ r, p ∧ q ∧ r, (¬p → q) ∨ (q ∧ r), ¬(p ∧ ¬q), ¬r ∧ p ∧ q, ¬(p ∨ q), p ∧ ¬(r ∧ q),
(p ∧ ¬r) ∨ (r ∧ q) ∨ (¬q ∧ p), (p ∨ ¬r) ∧ (r ∨ q) ∧ (¬q ∨ p), (p ∧ ¬r) ∨ (r ∨ q) ∨ (¬q ∧ p),
(p ∧ ¬r) ∨ (r ∧ q) ∧ (¬q ∧ p), (p ∧ ¬r) ∨ (r ∧ q) ∨ (¬q ∧ p ∧ r)
Answer
Make sure you have a clear understanding of when an expression is and is not a literal, a
conjunctive clause, or in disjunctive normal form.
Now you understand what disjunctive normal form is and can recognize when an expression
is in this form. Next we describe how to convert any expression to an equivalent one that is in
disjunctive normal form. The procedure involves constructing a truth table for the expression.
The process is pretty straightforward once you get the hang of it, but it can be a little tricky at
first so pay careful attention!
Procedure 4.115. This will convert a boolean expression to disjunctive normal form.
3. For each such row, create a conjunctive clause that includes all of the variables which
are true on that row and the negation of all of the variables that are false.
Solution: The truth table for p ⊕ q is given below. The second and third
rows of the table are true, so we use those to construct the disjunctive normal
form.
p   q   p⊕q
T   T    F
T   F    T
F   T    T
F   F    F
The second row yields conjunctive clause p ∧ ¬q, and the third row yields
conjunctive clause ¬p ∧ q. The disjunction of these is (p ∧ ¬q) ∨ (¬p ∧ q).
Thus, p ⊕ q = (p ∧ ¬q) ∨ (¬p ∧ q).
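The steps used in this solution (find the rows that are true, build one conjunctive clause per row, and take the disjunction of the clauses) can be sketched in Java. This is only one possible way to code it. The names DnfBuilder and toDnf are hypothetical, the expression is passed in as a function of an array of truth values, and the output uses Java's !, &&, and || in place of ¬, ∧, and ∨.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch; class and method names are hypothetical.
final class DnfBuilder {
    // Builds a DNF string for expr over the given variable names by
    // walking every row of the truth table (T before F, as in the text).
    // If no row is true, the result is the empty string.
    static String toDnf(String[] names, Predicate<boolean[]> expr) {
        int n = names.length;
        List<String> clauses = new ArrayList<>();
        for (int row = 0; row < (1 << n); row++) {
            boolean[] values = new boolean[n];
            for (int i = 0; i < n; i++) {
                // A 0 bit means true, so the first row is all T.
                values[i] = ((row >> (n - 1 - i)) & 1) == 0;
            }
            if (!expr.test(values)) {
                continue; // only rows where the expression is true are used
            }
            List<String> literals = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                // Use the variable if it is true on this row, else its negation.
                literals.add(values[i] ? names[i] : "!" + names[i]);
            }
            clauses.add("(" + String.join(" && ", literals) + ")");
        }
        return String.join(" || ", clauses);
    }

    public static void main(String[] args) {
        // p XOR q, written as p != q
        String dnf = toDnf(new String[] { "p", "q" }, v -> v[0] != v[1]);
        System.out.println(dnf); // (p && !q) || (!p && q)
    }
}
```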
The previous example is essentially just another proof of the identity that was proven in
Example 4.69.
p q r Z
T T T T
T T F T
T F T F
T F F F
F T T F
F T F T
F F T F
F F F T
The solution from the previous example can be simplified to Z = (p ∧ q) ∨ (¬p ∧ ¬r). Although
this can be done by applying the logical equivalences we learned about earlier, there are more
sophisticated techniques that can be used to simplify expressions that are in disjunctive normal
form. This is beyond our scope, but you will likely learn more about this when you take a
computer organization class and discuss circuit minimization. The important point I want to
make here is that computing the disjunctive normal form of an expression using the technique
we describe will not always produce the simplest form of the expression. In fact, much of the
time it won’t be.
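The simplification of Z can be double-checked by brute force. The sketch below assumes the unsimplified form is the one read off the four true rows of the truth table for Z above (rows TTT, TTF, FTF, and FFF). The class and method names are made up for illustration.

```java
// Illustrative sketch; class and method names are hypothetical.
final class DnfSimplificationCheck {
    // One conjunctive clause for each true row of the table for Z.
    static boolean fullDnf(boolean p, boolean q, boolean r) {
        return (p && q && r) || (p && q && !r)
            || (!p && q && !r) || (!p && !q && !r);
    }

    // The simplified form given in the text.
    static boolean simplified(boolean p, boolean q, boolean r) {
        return (p && q) || (!p && !r);
    }

    // Compares the two forms on all eight rows.
    static boolean equivalent() {
        boolean[] values = { true, false };
        for (boolean p : values) {
            for (boolean q : values) {
                for (boolean r : values) {
                    if (fullDnf(p, q, r) != simplified(p, q, r)) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(equivalent()); // prints true
    }
}
```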
There is another important form that is very similar to disjunctive normal form.
There are several methods for converting to conjunctive normal form. They generally involve
using double negation, distributive, and De Morgan’s laws either based on the truth table or
based on the disjunctive normal form. However, we won’t discuss these techniques here.
          AND       OR        XOR       IFF
p   q   (p&&q)    (p||q)    (p!=q)    (p==q)
1   1      1         1         0         1
1   0      0         1         1         0
0   1      0         1         1         0
0   0      0         0         0         1
We don’t usually think about != being XOR and == being IFF (or biconditional). We usually
think of them in their more natural interpretation: ‘not equal’ and ‘equal’.
Note: A note of caution: Although Java is a lot like C and C++, how it deals with logical
expressions is very different. Java has an explicit boolean type and you can only use the
logical operators on boolean values. Further, conditional statements in Java require boolean
values. In C and C++, the int type is used as a boolean value, where 0 is false, and anything
else is true. This is very convenient, but can also cause some confusion.
Example 4.121. In C/C++, (5&&6), (5||0), (4!=5) are all true. In Java the first two
expressions are illegal.
Now it’s time to extend the concept of Boolean operators to integer data types (including int,
short, long, byte, etc.).
Definition 4.122. A bitwise operation is a boolean operation that operates on the indi-
vidual bits of its argument(s).
Definition 4.123. The complement or bitwise NOT, usually denoted by ~, just flips each
bit.
Bitwise Operations 117
Note: For simplicity, the rest of the examples will assume numbers are represented with 8
bits. The concept is exactly the same regardless of how many bits are used for a particular
data type.
~ = , which is in decimal.
• The bitwise AND, usually denoted by &, applies ∧ to the corresponding bits of each
argument.
• The bitwise OR, usually denoted by |, applies ∨ to the corresponding bits of each
argument.
• The bitwise XOR, usually denoted by ^, applies ⊕ to the corresponding bits of each
argument.
We will present examples in table form rather than ‘code form’ since it is much easier to see
what is going on when the bits are lined up.
Note: It is important to remember that & and && are not the same thing! The same holds for
| and ||. It is equally important to remember that ^ does not mean exponentiation in most
programming languages.
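Here is a short Java sketch of the four bitwise operators. Java's int has 32 bits, so the helper bits8 (a hypothetical name, as is the class name) keeps only the low 8 bits to match the 8-bit assumption made above. The values 12 and 10 were chosen arbitrarily.

```java
// Illustrative sketch; class and method names are hypothetical.
final class BitwiseDemo {
    // Formats the low 8 bits of v as a binary string, e.g. 12 -> "00001100".
    static String bits8(int v) {
        String s = Integer.toBinaryString(v & 0xFF);
        return "00000000".substring(s.length()) + s;
    }

    public static void main(String[] args) {
        int a = 0b00001100; // 12
        int b = 0b00001010; // 10
        System.out.println(bits8(~a));    // 11110011
        System.out.println(bits8(a & b)); // 00001000
        System.out.println(bits8(a | b)); // 00001110
        System.out.println(bits8(a ^ b)); // 00000110
        System.out.println(a ^ b);        // 6, not 12 to the 10th power
    }
}
```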
Note: A final reminder: It is important to understand the difference between the Boolean
operators and the bitwise operators.
Reading Comprehension Questions 119
⋆Question 4.2. What are the six logical operators that were introduced in this chapter? Draw
a truth table for each.
⋆Question 4.3. Explain the difference between (inclusive) or and exclusive or.
⋆Question 4.8. If p ∨ q is true and p is false, what can you say about q?
⋆Question 4.9. If p ∨ q is true and p is true, what can you say about q?
⋆Question 4.11. If p ↔ q is true and p is false, what can you say about q?
⋆Question 4.12. If p → q is true and p is true, what can you say about q?
⋆Question 4.13. If p → q is true and p is false, what can you say about q?
⋆Question 4.14. Consider the following code: if(...) { // do something } (where the de-
tails of the conditional statement in the if have been omitted). Is it more likely that the condi-
tional statement is a tautology or a contingency? Explain.
⋆Question 4.15. Prove the domination laws. That is, prove that
(a) p ∨ T = T
(b) p ∧ F = F .
⋆Question 4.16. Consider the conditional statements if( i<a.length && a[i]!=0 ) and
if( a[i]!=0 && i<a.length ). Are they equivalent in most programming languages? Explain
why or why not. If they are not equivalent, explain exactly how they differ. In particular, is one
right and one wrong, or do they accomplish different goals?
⋆Question 4.17. Simplify the conditional statement if( !(x<=0 || y<=0) ) as much as pos-
sible.
⋆Question 4.18. If a programming language does not implement short-circuiting with logical
expressions, how would you write the following code instead? (Note: In case it isn’t clear, without
short-circuiting this code will crash if a is null, and that is a bad thing.)
⋆Question 4.19. Explain why ¬p ∧ ¬q and ¬(p ∧ q) are not logically equivalent. (Hint: The
answer is not “because that’s not what DeMorgan’s law says.” Why isn’t that a sufficient answer?
Because maybe there is a different law that says they are equivalent.)
⋆Question 4.21. Does ¬∀xP (x) mean P (x) is never true? If so, convince me. If not, what does
it mean?
⋆Question 4.22. Does ¬∃xP (x) mean P (x) is never true? If so, convince me. If not, what does
it mean?
⋆Question 4.23. Give two equivalent (but different) ways of expressing ∀x¬∃yQ(x, y).
⋆Question 4.24. Give three equivalent (but different) ways of expressing ¬∃x(x < 0 ∧ x > 0).
⋆Question 4.25. Express the sentence “Everybody hurts sometimes” using predicates and quan-
tifiers. To get you started, let H(x, y) =“x hurts at time y”.
⋆Question 4.26. Express the sentence “Nothing ever changes, nothing ever stays the same”
using predicates and quantifiers. Hint: You will need to define one or two predicates, depending
on how you interpret the sentence and how clever you are.
⋆Question 4.27. Let P (x, y) =“x ≤ y” and assume the universe of discourse is the set of
integers.
(c) Do the statements in parts (a) and (b) seem to be saying the same thing? Explain.
(f) Hopefully you answered that one of the statements is true and the other is false (If not, go
back to the previous two questions and try again!). Can you change the universe of discourse
so that the two statements have the same truth values?
(g) If you said no to the previous question, go back and try harder before continuing. So, you can
make them have the same truth value by changing the universe of discourse. Does that mean
with this universe of discourse the statements are saying the same thing? (This is a subtle but
important point, so if you are not totally confident in your answer, ask about this one!)
⋆Question 4.28. Let A be an array. Consider the statement ∀x(A[x] != 0), where the universe
of discourse is {0, 1, . . . , n − 1}.
(b) Write a function boolean foo(int []A, int n) that computes and returns the truth value
of the statement.
(c) What is a better name for foo? In other words, what does it do?
⋆Question 4.29. Let A be an array. Consider the statement ¬∃x(A[x] == 0), where the
universe of discourse is {0, 1, . . . , n − 1}.
(b) Write a function boolean bar(int []A, int n) that computes and returns the truth value
of the statement.
(c) What is a better name for bar? In other words, what does it do?
⋆Question 4.30. Let p, q and r be boolean variables. Which of the following are literals? ¬p,
¬p ∧ r, q, ¬r, p → r, r.
⋆Question 4.31. Let p, q and r be boolean variables. Which of the following are conjunctive
clauses? ¬p, p ∧ r, q ∨ ¬r, ¬p ∧ r, q, ¬r, p → r, ¬(p ∧ q).
⋆Question 4.32. Let p, q and r be boolean variables. Which of the following are in disjunctive
normal form? ¬p, p∧r, q∨¬r, ¬p∧r, p → r, ¬(p∧q), (p∧q)∨(q∧¬r)∨¬p, (p∨q)∧(q∨¬r)∧¬p,
(p ∧ r) ∨ ¬(r ∧ ¬q) ∨ (¬p ∧ q), (p ∧ ¬r ∧ q) ∨ (¬p ∧ r ∧ ¬q) ∨ (p ∧ r ∧ q) ∨ (¬p ∧ ¬r ∧ ¬q) .
⋆Question 4.33. Assume x and y are integers stored on a computer using 8 bits. Let x = 37
(00100101 in binary), y = 112 (01110000 in binary). Compute ~x, x&y, x|y, and x^y.
4.7 Problems
Problem 4.1. Draw a truth table to represent the following.
(a) ¬p ∨ q
(b) (p → q) ∨ ¬p
(c) (p ∧ ¬q) ∨ r
(e) (p ∨ ¬r) ∧ q
(f) (p ⊕ q) ∧ (q ∨ r)
(g) p ↔ (p ∧ q)
(a) p ∧ (q ∨ r) = (p ∧ q) ∨ (p ∧ r)
(b) p ∨ (q ∧ r) = (p ∨ q) ∧ (p ∨ r)
(a) p ∨ F = p
(b) p ∧ T = p
(a) p ∨ ¬p = T
(b) p ∧ ¬p = F
(a) p → q = ¬p ∨ q
(b) p → q = ¬q → ¬p
(a) p ↔ q = (p → q) ∧ (q → p)
(b) p ↔ q = ¬p ↔ ¬q
(c) ¬(p ↔ q) = p ↔ ¬q
Problem 4.11. The NAND of p and q, denoted by p|q, is the proposition “not both p and q”.
The NAND of p and q is false when p and q are both true and true otherwise.
(b) Express p|q using ∨, ∧, and/or ¬ (you may not need all of them).
(c) Express p∧q using only |. (That means you cannot use ¬, ∨, ∧, or any other boolean operator
except for |. Thus, your answer should only involve p, q, | and parentheses.) Your answer
should be as simple as possible. Give a truth table that shows they are the same.
(d) Express ¬p ∨ q using only |. Your answer should be as simple as possible. Give a truth table
that shows they are the same.
Problem 4.12. The NOR of p and q, denoted by p ↓ q, is the proposition “neither p nor q”. The
NOR of p and q is true when p and q are both false and false otherwise.
(b) Express p ↓ q using ∨, ∧, and/or ¬ (you may not need all of them).
(c) Express p∧q using only ↓. (That means you cannot use ¬, ∨, ∧, or any other boolean operator
except for ↓. Thus, your answer should only involve p, q, ↓ and parentheses.) Your answer
should be as simple as possible. Give a truth table that shows they are the same.
(d) Express ¬p ∨ q using only ↓. Your answer should be as simple as possible. Give a truth table
that shows they are the same.
Problem 4.13. A set of logical operators is functionally complete if any possible operator can
be implemented using only operators from that set. It turns out that {¬, ∧} is functionally
complete. So is {¬, ∨}. To show that a set is functionally complete, all one needs to do is show
how to implement all of the operators from another functionally complete set. Given this,
(a) Show that {|} is functionally complete. (Hint: Since {¬, ∧} is functionally complete, one way
is to show how to implement both ∧ and ¬ using just |.)
Problem 4.14. Write each of the following expressions so that negations are only applied to
propositional functions (and not quantifiers or connectives).
Problem 4.15. Let P (x, y)=“x likes y”, where the universe of discourse for x and y is the set of
all people. Translate each of the following into English, smoothing them out as much as possible.
Then give the truth value of each.
Problem 4.16. Let P (x, y, z)=“x² + y² = z²”, where the universe of discourse for all variables
is the set of integers. What are the truth values of each of the following?
Problem 4.17. Write each of the following sentences using quantifiers and propositional func-
tions. Define propositional functions as necessary (e.g. Let D(x) be the proposition ‘x plays disc
golf.’)
(b) If all students in my class do their homework, then some of the students will pass.
(c) If none of the students in my class study, then all of the students in my class will fail.
(e) Some people like ice cream, and some people like cake, but everybody needs to drink water.
(j) You can’t please all of the people all of the time, but you can please some of the people some
of the time.
(k) If only somebody would give me some money, I would buy a new house.
(l) Nobody loves me, everybody hates me, I’m going to eat some worms.
(m) Every rose has its thorn, and every night has its dawn.
Problem 4.18. Express the following phrase using quantifiers. “There is some constant c such
that f (x) is no greater than c · g(x) for all x ≥ x₀ for some constant x₀.” Your solution should
contain no English words.
(b) (Difficult if you have not had calculus.) This is the definition of something. What is it?
Problem 4.20. You are helping a friend debug the code below. He tells you “The code in the
if statement never executes. I have tried it for x=2, x=4, and even x=-1, and it never gets to the
code inside the if statement.”
if ((x % 2 == 0 && x < 0) || !(x % 2 == 0 || x < 0)) {
    // Do something.
}
(a) Is he correct that the code inside the if statement does not execute for his chosen values?
Justify your answer.
(b) Under what conditions, if any, will the code in the if statement execute? Be specific and
complete.
Problem 4.22. Simplify the following code as much as possible. (It can be simplified into a
single if statement that is about as complex as the original outer if statement).
if ((!(x.size() <= 0) && x.get(0) != 11) || x.size() > 0) {
    if (!(x.get(0) == 11 && (x.size() > 13 || x.size() < 13))
            && (x.size() > 0 || x.size() == 13)) {
        // Do a few things.
    }
}
boolean unknown1(int x, int y) {
    if (x != 0 && y != 0) {
        return true;
    } else {
        return false;
    }
}
boolean unknown2(int x, int y) {
    if (x != 0 || y != 0) {
        return true;
    } else {
        return false;
    }
}
(c) Are unknown1 and unknown2 equivalent to each other? Prove or disprove it.
Problem 4.24. The following method returns true if and only if none of the entries of the array
are 0:
boolean noZeroElements(int[] a, int n) {
    for (int i = 0; i < n; i++) {
        if (a[i] == 0)
            return false;
    }
    return true;
}
The two methods below implement this idea for two arrays. Assume list1 and list2 have
the same size for both of these methods.
boolean unknown1(int[] list1, int[] list2, int n) {
    for (int i = 0; i < n; i++) {
        if (list1[i] == 0 && list2[i] == 0)
            return false;
    }
    return true;
}
(a) What is unknown1 determining? (Give answer in terms of list1 and list2 and the appro-
priate quantifier(s).)
(b) What is unknown2 determining? (Give answer in terms of list1 and list2 and the appro-
priate quantifier(s).)
(c) Prove or disprove that unknown1 and unknown2 are determining the same thing.
Problem 4.25. Use Procedure 4.115 to find the disjunctive normal form for each of the expres-
sions from Problem 4.1.
Chapter 5: Sets, Functions, and Relations
5.1 Sets
• The objects in the set are called the elements of the set.
• If a does not belong to the set A, we write a ∉ A, read “a is not an element of A.”
Note: The symbol ∈ should be read as is an element of, not exists in.
Example 5.2. The sets A = {1, 2, 3}, B = {3, 2, 1}, and C = {1, 1, 1, 2, 2, 3} actually
represent the same set since repeated values are ignored and the order in which elements are
listed does not matter. Notice that 1 ∈ A and 3 ∈ A, but 4 ∉ A.
Let D = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} be the set of decimal digits. Then 4 ∈ D but 11 ∉ D.
Notice that the elements in a set are listed between curly braces. Thus, {1, 2, 3} is a set (where
order does not matter and duplicates are ignored), but [1, 2, 3] is a list (where order does matter
and duplicates are allowed). Also, 1, 2, 3 is just a list of three numbers whereas {1, 2, 3} is the
set containing the numbers 1, 2, and 3.
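The distinction between sets and lists can be sketched in Java. This is only an illustration: the class name SetVersusList and the helpers setOf and listOf are ours, and HashSet and ArrayList are used as stand-ins for the mathematical notions of set and list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SetVersusList {
    // Builds a set from the given values; duplicates collapse and order is ignored.
    static Set<Integer> setOf(Integer... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    // Builds a list from the given values; duplicates and order are both kept.
    static List<Integer> listOf(Integer... values) {
        return new ArrayList<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        Set<Integer> a = setOf(1, 2, 3);
        Set<Integer> c = setOf(1, 1, 1, 2, 2, 3);
        System.out.println(a.equals(c));    // sets with the same elements are equal
        System.out.println(a.contains(4));  // asks whether 4 is an element of A

        List<Integer> p = listOf(1, 2, 3);
        List<Integer> q = listOf(3, 2, 1);
        System.out.println(p.equals(q));    // for lists, order matters
    }
}
```

Comparing the two equals calls shows the point of Example 5.2: the sets compare as equal, while the lists do not.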
• The number of elements in a set A, also known as the cardinality of A, will be
denoted by |A|.
• If the set A has infinitely many elements, we write |A| = ∞ and we refer to A as an
infinite set.
Example 5.4. If A, B, C, and D are the sets from Example 5.2, then |A| = 3, |B| = 3,
|C| = 3, and |D| = 10.
⋆Exercise 5.5. Give the set of prime numbers less than 10. What is its cardinality?
Answer
Since we cannot list every element of an infinite set, we need a way of expressing the set so
that it is clear what elements it contains. If the elements of the set follow some pattern, it is
common to list the first several elements and then conclude with . . ., indicating that the pattern
continues. There is no “right” number of elements to list when using this notation, but there
needs to be enough so that the pattern is evident. Often 3 to 5 elements suffice.
Example 5.6. The set of positive integers can be expressed as Z+ = {1, 2, 3, . . .}. Notice
that |Z+ | = ∞.
The set of positive integers that are a multiple of 5 can be expressed as {5, 10, 15, 20, . . .}.
Hopefully it is clear that |{5, 10, 15, 20, . . .}| = ∞.
The set of integer multiples of 5 can be expressed as {. . . , −15, −10, −5, 0, 5, 10, 15, . . .}.
Hopefully it is clear that this is also an infinite set.
Definition 5.7. We say two sets are equal if they contain the same elements. That is, A and B
are equal if ∀x(x ∈ A ↔ x ∈ B). If A and B are equal sets, we write A = B.
Note: We will normally denote sets by capital letters, like A, B, S, N, etc. Elements will be
denoted by lowercase letters, like a, b, r, etc.
Definition 5.9. The following notation is pretty standard, and we will follow it in this book.
N = {0, 1, 2, 3, . . .} the natural numbers.
Z = {. . . , −2, −1, 0, 1, 2, . . .} the integers.
Z+ = {1, 2, 3, . . .} the positive integers.
Z− = {−1, −2, −3, . . .} the negative integers.
Q the rational numbers.
R the real numbers.
C the complex numbers.
∅ = {} the empty set or null set.
Example 5.10. Notice that |N| = |Z| = |R| = ∞. But this may be a bit misleading. Do all
of these sets have the same number of elements? Believe it or not, it turns out that N and Z
do, but that R has many more elements than both of these. If it seems strange to talk about
whether or not two infinite sets have the same number of elements, don’t worry too much
about it. We probably won’t bring it up again.
Another common notation that is used to express sets is called set builder notation. It is
easier to understand through examples than by giving a formal definition.
Example 5.12. Let S be the set of the squares of integers. We can express this as S =
{n^2 | n ∈ Z} or S = {n^2 : n ∈ Z}. We call this set builder notation. We read the : or | as “such
that.” Thus, S is the set containing numbers of the form n^2 such that n is an integer.
Example 5.13. Consider the set of integers that are one less than a positive power of 2.
This set contains elements such as 1 = 2^1 − 1, 3 = 2^2 − 1, and 255 = 2^8 − 1. We can express
this set in set builder notation as {2^n − 1 | n ∈ Z+}.
We could also use the notation {2^1 − 1, 2^2 − 1, 2^3 − 1, . . .}, since the pattern is evident.
However, it would be unwise to write it as {1, 3, 7, 15, . . .} since it may or may not be evident
what the pattern is when expressed in this way.
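Set builder notation reads a lot like a recipe for generating elements. As a hedged sketch (the class and method names are ours, and only finitely many elements can ever be produced), the first few elements of the set of integers one less than a positive power of 2 could be generated like this:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SetBuilderSketch {
    // First `count` elements of {2^n - 1 | n in Z+}, taking n = 1, 2, ..., count.
    // Assumes count <= 62 so that the shift stays within the range of a long.
    static List<Long> onesLessThanPowersOfTwo(int count) {
        return IntStream.rangeClosed(1, count)
                .mapToObj(n -> (1L << n) - 1)   // 1L << n computes 2^n
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(onesLessThanPowersOfTwo(8));
    }
}
```

The part before the bar (the expression 2^n − 1) becomes the mapping step, and the part after the bar (n ∈ Z+) becomes the range of values that are fed into it.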
⋆Exercise 5.14. Use two different notations to express the set of even integers.
Answer
Example 5.15. Let T be the set of all integers that can be expressed as the sum of the
squares of two positive integers. This set contains elements such as 2 = 1^2 + 1^2, 5 = 1^2 + 2^2,
and 25 = 3^2 + 4^2. Then T = {n^2 + m^2 | n, m ∈ Z+}.
In this case, expressing the set as something like {2, 5, 8, 10, . . .} does not make sense at all
because there is no way of discerning a pattern.
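To get a feel for how irregular T looks when listed, here is a minimal sketch that collects the elements of T up to a bound. The class name SumsOfSquares and the method upTo are hypothetical names chosen for this illustration.

```java
import java.util.TreeSet;

class SumsOfSquares {
    // Elements of T = {n^2 + m^2 | n, m in Z+} that are at most `bound`.
    // A TreeSet keeps the elements sorted and silently drops repeats.
    static TreeSet<Integer> upTo(int bound) {
        TreeSet<Integer> t = new TreeSet<>();
        for (int n = 1; n * n < bound; n++) {
            // Starting m at n avoids generating both n^2 + m^2 and m^2 + n^2.
            for (int m = n; n * n + m * m <= bound; m++) {
                t.add(n * n + m * m);
            }
        }
        return t;
    }

    public static void main(String[] args) {
        System.out.println(upTo(30));
    }
}
```

Running it for a few different bounds makes the point of the example concrete: the gaps between consecutive elements follow no obvious rule, so set builder notation is the clearer description.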
Example 5.16. Use set builder notation to express C, the set of complex numbers.
Solution: C = {a + bi : a, b ∈ R}.
⋆Exercise 5.17. Use set builder notation to express Q, the set of all rational numbers.
Answer
Note: Some authors use ⊂ to mean the same thing as ⊆. You will need to consider the
context in order to interpret it correctly.
Example 5.19. Let S = {1, 2, . . . , 20}, that is, the set of integers between 1 and 20, inclusive.
Let E = {2, 4, 6, . . . , 20}, the set of all even integers between 2 and 20, inclusive. Notice that
E ⊆ S. Let P = {2, 3, 5, 7, 11, 13, 17, 19}, the set of primes less than 20. Then P ⊆ S, but
P ⊈ E and E ⊈ P.
⋆Exercise 5.20. Let S = {n^2 | n ∈ Z} and A = {1, 4, 9, 16}. Answer each of the following,
including a brief justification.
(a) Is A ⊆ S?
(b) Is A ⊂ S?
(c) Is S ⊆ S?
(d) Is S ⊂ S?
(e) Is S ⊂ A?
⋆Exercise 5.21. Let A be the set of integers divisible by 6, B be the set of integers divisible
by 2, and C be the set of integers divisible by 3. Answer each of the following, giving a brief
justification.
(a) Is A ⊆ B?
(b) Is A ⊆ C?
(c) Is B ⊆ A?
(d) Is B ⊆ C?
(e) Is C ⊆ A?
(f) Is C ⊆ B?
is the set of students in a particular course. This set can be split into two subsets: the set
F = {Roxan, Jacquelin, Fatimah, Wakeelah, Ashley, Madeline} of females in the class, and the
set M = {Sean, Ruben, Leslie} of males in the class. Thus we have F ⊆ S and M ⊆ S. Notice
that it is not true that F ⊆ M or that M ⊆ F . Put another way, F ⊈ M and M ⊈ F.
Solution: They are ∅, {a}, {b}, {c}, {a, b}, {b, c}, {a, c}, and {a, b, c}.
Notice that there are 8 subsets. Also notice that 8 = 2^3. As we will see shortly, that is not a
coincidence.
Notice that we wrote ∅ and not {∅} in the previous example. It turns out that ∅ ≠ {∅}. ∅
is the empty set–that is, the set that has no elements. {∅} is the set containing the empty set.
Thus, {∅} is a set containing the single element ∅. You can use either ∅ or {} to denote the
empty set, but not {∅}.
Definition 5.25. The power set of a set is the set of all of its subsets. The power set of
a set A is denoted by P (A).
Example 5.26. If A = {a, b, c}, Example 5.23 implies that P (A) = {∅, {a}, {b}, {c}, {a, b},
{b, c}, {a, c}, {a, b, c}}. Notice that the solution is a set, the elements of which are also sets.
An incorrect answer would be {∅, a, b, c, {a, b}, {b, c}, {a, c}, {a, b, c}}. This is incorrect
because a is not the same thing as {a} (the set containing a). {a} ∈ P (A), but a ∉ P (A).
This is a subtle but important distinction.
We will prove the following theorem in the next section after we have developed the appropriate
notation to do so.
(a) |P (A)| = .
(b) |P (P (A))| = .
(c) |P (P (P (A)))| = .
⋆Exercise 5.30. If one element is added to a finite set A, how much larger is the power
set of A after the element is added (relative to the size of the power set before it is added)?
Explain your answer.
Answer
Definition 5.31.
Example 5.32. Let A = {1, 2, 3, 4, 5, 6}, and B = {1, 3, 5, 7}. Then A∪B = {1, 2, 3, 4, 5, 6, 7}.
⋆Exercise 5.33. Let A be the set of even integers and B be the set of odd integers. Then
A ∪ B=
Definition 5.34.
Example 5.35. Let A = {1, 2, 3, 4, 5, 6}, and B = {1, 3, 5, 7, 9}. Then A ∩ B = {1, 3, 5}.
⋆Exercise 5.36. Let A be the set of even integers and B be the set of odd integers. Then
A ∩ B=
Set Operations 137
Definition 5.37.
The difference (or set-difference) of sets A and B
is the set containing elements from A that are not in B.
More formally,
A \ B = {x : x ∈ A and x ∉ B}.
Example 5.38. Let A = {1, 2, 3, 4, 5, 6}, and B = {1, 3, 5, 7, 9}. Then A \ B = {2, 4, 6} and
B \ A = {7, 9}.
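Union, intersection, and difference map directly onto three methods of Java's Set interface (addAll, retainAll, and removeAll). The sketch below is one way to wrap them; the class name SetOperations and its three helpers are our own, and each helper copies its first argument so that neither input is modified.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class SetOperations {
    // A ∪ B: everything in a, plus everything in b.
    static Set<Integer> union(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // A ∩ B: keep only the elements of a that are also in b.
    static Set<Integer> intersection(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // A \ B: remove from a every element that is in b.
    static Set<Integer> difference(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> a = new TreeSet<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        Set<Integer> b = new TreeSet<>(Arrays.asList(1, 3, 5, 7, 9));
        System.out.println(intersection(a, b));
        System.out.println(difference(a, b));
        System.out.println(difference(b, a));
        System.out.println(union(a, b));
    }
}
```

With the sets from Example 5.38, the second and third lines printed correspond to A \ B and B \ A, which shows that set difference is not symmetric.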
⋆Exercise 5.39. Let A be the set of even integers and B be the set of odd integers. Then
A \ B= and B \ A= .
Proof: We use induction^a and the idea from the solution to Exercise 5.24.
Clearly if |A| = 1, A has 2^1 = 2 subsets: ∅ and A itself.
Assume every set with n − 1 elements has 2^(n−1) subsets. Let A be a set with n
elements. Choose some x ∈ A. Every subset of A either contains x or it doesn’t.
Those that do not contain x are subsets of A \ {x}. Since A \ {x} has n − 1
elements, the induction hypothesis implies that it has 2^(n−1) subsets. Every subset
that does contain x corresponds to one of the subsets of A \ {x} with the element
x added. That is, for each subset S ⊆ A \ {x}, S ∪ {x} is a subset of A containing
x. Clearly there are 2^(n−1) such new subsets. Since this accounts for all subsets of
A, A has 2^(n−1) + 2^(n−1) = 2^n subsets.
a. We will cover induction more fully and formally later. But since this use of induction is pretty intuitive,
especially in light of Example 5.24, it serves as a useful foreshadowing of things to come.
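The counting argument in the proof is also a recipe for building the power set. The following sketch follows the same idea: remove some x, list the subsets of what remains, and then add a copy of each with x put back. The names PowerSetSketch and powerSet are ours, and, unlike the proof, the sketch uses the empty set as its base case (an assumption made so that it also handles A = ∅).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PowerSetSketch {
    // Returns every subset of a, built the way the proof counts them.
    static <T> List<Set<T>> powerSet(Set<T> a) {
        List<Set<T>> subsets = new ArrayList<>();
        if (a.isEmpty()) {
            subsets.add(new HashSet<>());    // the only subset of ∅ is ∅
            return subsets;
        }
        T x = a.iterator().next();           // choose some x ∈ A
        Set<T> rest = new HashSet<>(a);
        rest.remove(x);                      // rest = A \ {x}
        for (Set<T> s : powerSet(rest)) {
            subsets.add(s);                  // a subset that does not contain x
            Set<T> withX = new HashSet<>(s);
            withX.add(x);
            subsets.add(withX);              // the matching subset S ∪ {x}
        }
        return subsets;
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>(Arrays.asList("a", "b", "c"));
        List<Set<String>> subsets = powerSet(a);
        System.out.println(subsets.size());
        System.out.println(subsets);
    }
}
```

Each recursive call doubles the number of subsets, which is exactly the step 2^(n−1) + 2^(n−1) = 2^n in the proof.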
Definition 5.41.
Let A ⊆ U . The complement of A with respect to U is
just the set difference U \ A. More formally,
Ā = {x ∈ U : x ∉ A} = U \ A.
Note: Often the set U , which is called the universe or universal set, is implied and we
just use Ā to denote the complement. We usually follow this convention here. Further, when
talking about several sets, we will usually assume they have the same universal set.
Example 5.42. Let U = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} be the universal set of decimal digits and
A = {0, 2, 4, 6, 8} ⊂ U be the set of even digits. Then Ā = {1, 3, 5, 7, 9} is the set of odd
digits.
⋆Exercise 5.43. Let A be the set of even integers and B be the set of odd integers, and let
It should not be too difficult to convince yourself that the following theorem is true.
A ∩ Ā = ∅, and
A ∪ Ā = U.
The various intersecting regions for two and three sets can be seen in Figures 5.1 and 5.2.
Figure 5.1: Venn diagram for two sets. [Diagram not reproduced.]
Figure 5.2: Venn diagram for three sets. [Diagram not reproduced.]
Example 5.46. Let A be the set of prime numbers, B be the set of perfect squares, and C
be the set of even numbers. Then A and B are clearly disjoint since if a number is a perfect
square, it cannot possibly be prime (although 0 and 1 are not prime for different reasons than
the rest of the elements of B). On the other hand, A and C are not disjoint since they both
contain 2, and B and C are not disjoint because they both contain 4.
⋆Exercise 5.47. Let A be the set of even integers and B be the set of odd integers. Are A
and B disjoint? Explain.
Answer
Set identities can be used to show that two sets are the same. Table 5.1 gives some of the
most common set identities. In these identities, U is the universal set. We won’t provide proofs
for most of these, but we will present a few examples and a technique that will allow you to verify
that they are correct.
Name              Identity
commutativity     A ∪ B = B ∪ A
                  A ∩ B = B ∩ A
associativity     A ∪ (B ∪ C) = (A ∪ B) ∪ C
                  A ∩ (B ∩ C) = (A ∩ B) ∩ C
distributive      A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
                  A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
identity          A ∪ ∅ = A
                  A ∩ U = A
complement        A ∪ Ā = U
                  A ∩ Ā = ∅
domination        A ∪ U = U
                  A ∩ ∅ = ∅
idempotent        A ∪ A = A
                  A ∩ A = A
complementation   \overline{Ā} = A
De Morgan’s       \overline{A ∪ B} = Ā ∩ B̄
                  \overline{A ∩ B} = Ā ∪ B̄
absorption        A ∪ (A ∩ B) = A
                  A ∩ (A ∪ B) = A
These identities may look somewhat familiar. They are essentially the same as the logical
equivalences presented in Table 4.3. In fact, if we equate T to U , F to ∅, ∨ to ∪, ∧ to ∩, and ¬ to
¯, the laws are identical. This is because logic operations and sets are both what we call Boolean
algebras. We won’t go into detail about this connection, but in case you run into the concept in
the future, you heard it here first!
The following theorem can be used to prove set identities.
Theorem 5.48. Two sets A and B are equal if and only if A ⊆ B and B ⊆ A.
That was the long, drawn-out version of the proof. The purpose of all of the detail is to
make the technique clear. Here is a proof without any extraneous details.
The proofs in the previous example are called set containment proofs since we showed set
containment both ways. The technique is pretty straightforward: Theorem 5.48 tells us that if
X ⊆ Y and Y ⊆ X, then X = Y . Thus, to prove X = Y , we just need to show that X ⊆ Y and
Y ⊆ X. But how do we show that one set is a subset of another? This is easy: To show that
X ⊆ Y , we show that every element from X is also in Y . In other words, we assume that x ∈ X
and use definitions and logic to show that x ∈ Y . Assuming we do not use any special properties
about x other than the fact that x ∈ X, then x is an arbitrary element from X, so this shows
that X ⊆ Y . Showing that Y ⊆ X uses exactly the same technique.
Note: Be careful. To prove that X = Y , you generally need to prove two things: X ⊆ Y and
Y ⊆ X. Do not forget to do both. On the other hand, if you are asked to prove that X ⊆ Y ,
you do not need to (and should not) show that Y ⊆ X.
Let’s see another example of this type of proof. This proof will provide a few more details
than necessary in order to further explain the technique.
Theorem 5.50. Prove the first of De Morgan’s Laws: Given sets A and B, \overline{A ∪ B} = Ā ∩ B̄.
which is the same as x ∉ A ∪ B. But this last statement asserts that x ∈ \overline{A ∪ B}.
Hence Ā ∩ B̄ ⊆ \overline{A ∪ B}.
Since we have shown that the two sets contain each other, they are equal by
Theorem 5.48.
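A set containment proof establishes the law for all sets. As a sanity check only (it proves nothing beyond the cases it tries), the first of De Morgan's Laws can be tested by brute force on every pair of subsets of a small universe. In this sketch, which is ours and not part of the text, a subset of U = {0, 1, . . . , n − 1} is encoded as an int whose bit i is 1 exactly when i is in the subset.

```java
class DeMorganCheck {
    // Checks that the complement of (A ∪ B) equals (complement of A) ∩ (complement of B)
    // for every pair of subsets A, B of U = {0, 1, ..., n-1}. Assumes 0 <= n <= 15
    // so that the double loop stays small.
    static boolean holdsForAllSubsets(int n) {
        int universe = (1 << n) - 1;          // all n bits set: the whole universe U
        for (int a = 0; a <= universe; a++) {
            for (int b = 0; b <= universe; b++) {
                int left = universe & ~(a | b);                 // complement of A ∪ B
                int right = (universe & ~a) & (universe & ~b);  // Ā ∩ B̄
                if (left != right) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holdsForAllSubsets(4));
    }
}
```

The bitwise operators |, &, and ~ play the roles of ∪, ∩, and complement here, which is another view of the correspondence between set identities and logical equivalences.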
You have already seen a few correct ways to prove that A \ B = A ∩ B̄. Can you spot the
problem(s) in the following ‘proofs’ of this? These proofs use the alternative notation of A − B
for set difference.
⋆Evaluate 5.51. Use a set containment proof to prove that if A and B are sets, then
A − B = A ∩ B̄.
Evaluation
Proof 2: B̄ is the other part of the universal that does not contain any
part of B. A ∪ B̄ means all intersection part of A and the universal that
does not contain any part of B. Therefore it returns all elements that
are in A but not in B which are A − B. Thus, A − B = A ∩ B̄.
Evaluation
Evaluation
Sometimes we can do a set containment proof in one step instead of two. This only works if
every step of the proof is reversible. We illustrate this idea next.
Proof: We have
x ∈ A \ (B ∪ C) ↔ x ∈ A ∧ x ∉ (B ∪ C)
                ↔ (x ∈ A) ∧ ((x ∉ B) ∧ (x ∉ C))
                ↔ (x ∈ A ∧ x ∉ B) ∧ (x ∈ A ∧ x ∉ C)
                ↔ (x ∈ A \ B) ∧ (x ∈ A \ C)
                ↔ x ∈ (A \ B) ∩ (A \ C).
Note: The proof in the previous example works because every step is reversible. You can only
write something like ‘α ↔ β’ in a proof if α → β and β → α are both true. When attempting
to shortcut proofs with this technique, make sure each step truly is reversible.
⋆Fill in the details 5.53. Use a set containment proof to show that
(A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C).
Solution: We have,
x ∈ (A ∪ B) ∩ C
↔ x ∈ (A ∪ B) ∧ ________          by def. of intersection
↔ (x ∈ A ∨ ________) ∧ x ∈ C      by ________
↔ (x ∈ A ∧ x ∈ C) ∨ ________      by ________
↔ ________ ∨ (x ∈ B ∩ C)          by ________
↔ x ∈ (A ∩ C) ∪ (B ∩ C).          by ________
Example 5.54. In Java, the TreeSet class is one implementation of a set that has several
methods with perhaps unfamiliar names, but they do what should be familiar things. Let’s
discuss a few of them. Let A and B be TreeSets.
(a) The method retainAll(TreeSet other) “retains only the elements in this TreeSet that
are contained in the other TreeSet. In other words, removes from this TreeSet all of its el-
ements that are not contained in other.” It is not too difficult to see that A.retainAll(B)
is computing A ∩ B.
(b) The method boolean containsAll(TreeSet other) “returns true if this set contains
all of the elements of other (and false otherwise).” Thus, A.containsAll(B) returns
true iff B ⊆ A.
(c) Even without documentation, it seems likely that A.size() is determining |A|.
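Here is a small sketch of these three methods in action. Two caveats: in the actual Java library, retainAll and containsAll are declared to take a Collection rather than specifically a TreeSet, and retainAll changes the set it is called on, so after A.retainAll(B) the variable A itself holds A ∩ B. The class name TreeSetSketch and the helper of are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.TreeSet;

class TreeSetSketch {
    // Convenience helper (ours) for building a TreeSet from a few values.
    static TreeSet<Integer> of(Integer... values) {
        return new TreeSet<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        TreeSet<Integer> a = of(1, 2, 3, 4, 5, 6);
        TreeSet<Integer> b = of(1, 3, 5, 7, 9);

        System.out.println(a.containsAll(b));  // true iff B ⊆ A
        System.out.println(a.size());          // |A| before the call below

        a.retainAll(b);                        // a is overwritten with A ∩ B
        System.out.println(a);
        System.out.println(a.size());          // the size of a has changed
        System.out.println(b.containsAll(a));  // A ∩ B is always a subset of B
    }
}
```

If the original contents of A are still needed afterwards, one option is to call retainAll on a copy, for example new TreeSet<>(a).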
Sometimes you need to find the number of elements in the union of several sets. This is easy
if the sets do not intersect. If they do intersect, more care is needed to make sure no elements are
missed or counted more than once. In the following examples we will use Venn diagrams to help us
do this correctly. Later, we will learn about a more powerful tool to do this—inclusion-exclusion.
Example 5.55. Of 40 people, 28 smoke and 16 chew tobacco. It is also known that 10 both
smoke and chew. How many among the 40 neither smoke nor chew?
[Venn diagram with two overlapping circles labeled Smoke and Chew: 18 in Smoke only, 10 in both, and 6 in Chew only.]
We should note that we truly hope that these numbers are not representative of
the number of people who smoke and/or chew in real life. It’s bad for you. Don’t
do it. Really.
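The arithmetic behind Example 5.55 can be written out as a short computation. This is a sketch under the assumption that the two groups are counted with the usual rule |A ∪ B| = |A| + |B| − |A ∩ B| for finite sets; the class and method names are ours.

```java
class NeitherCount {
    // Number of people in neither of two groups, given the total, the size of
    // each group, and the size of their overlap.
    static int neither(int total, int sizeA, int sizeB, int sizeBoth) {
        // Subtracting the overlap once corrects for counting those people twice.
        int inUnion = sizeA + sizeB - sizeBoth;
        return total - inUnion;
    }

    public static void main(String[] args) {
        // 40 people, 28 smoke, 16 chew, 10 do both.
        System.out.println(neither(40, 28, 16, 10));
    }
}
```

With the numbers from the example, 28 + 16 − 10 = 34 people smoke or chew, so 40 − 34 = 6 do neither.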
⋆Exercise 5.56. In a group of 30 people, 8 speak English, 12 speak Spanish and 10 speak
French. It is known that 5 speak English and Spanish, 7 Spanish and French, and 5 English
and French. The number of people speaking all three languages is 3. How many people speak
at least one of these languages?
Definition 5.57. The Cartesian product of sets A and B is the set A × B = {(a, b)|a ∈
A ∧ b ∈ B}. In other words, it is the set of all ordered pairs of elements from A and B.
A × B = {(1, a), (1, b), (2, a), (2, b), (3, a), (3, b)}, and
B × A = {(a, 1), (a, 2), (a, 3), (b, 1), (b, 2), (b, 3)}.
Notice that A × B ≠ B × A. If A ≠ B and neither set is empty, this is always the case.
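A pair of nested loops is a natural way to list a Cartesian product. The sketch below assumes, based on the pairs listed above, that A = {1, 2, 3} and B = {a, b}; the class name CartesianProduct and the choice to represent each ordered pair as a string are ours.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CartesianProduct {
    // Lists A × B as strings of the form "(x, y)", with x from a and y from b.
    static <S, T> List<String> product(List<S> a, List<T> b) {
        List<String> pairs = new ArrayList<>();
        for (S x : a) {
            for (T y : b) {
                pairs.add("(" + x + ", " + y + ")");
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3);
        List<String> b = Arrays.asList("a", "b");
        System.out.println(product(a, b));  // A × B
        System.out.println(product(b, a));  // B × A
    }
}
```

The outer loop runs |A| times and the inner loop |B| times for each of those, so the list always has |A| · |B| entries when the inputs contain no repeated elements.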
A×B =
B^3 = {(a, a, a), (a, b, a), (b, a, a), (b, b, a), (a, a, b), (a, b, b), (b, a, b), (b, b, b)}
A^2 =
A^3 =
Theorem 5.63. If A and B are finite sets with |A| = n and |B| = m, then |A × B| = n · m.
Example 5.64. Let A and B be finite sets with |A| = 100 and |B| = 5. Then |A × B| =
100 · 5 = 500, |A^2| = 100 · 100 = 10,000, and |B^4| = 5^4 = 625.
⋆Exercise 5.65. Let A, B, and C be sets with |A| = 10, |B| = 50, and |C| = 20. Determine
the following
(a) |A × B| =
(b) |A × C| =
(c) |A^2| =
(d) |B^3| =
(e) |A × B × C| =
Solution 1: Assume A and B are not empty. We know the Cartesian prod-
uct of A and B, denoted by A × B, is the set of all ordered pairs (a, b), where
a ∈ A and b ∈ B. Therefore, we can conclude that our assumption was in-
correct because if each set is not empty, (a, b) is in the cross product, but
A × B = ∅, so at least one of the sets must be empty.
Evaluation
Evaluation
Solution 3: We can conclude that both A and B are empty. I’ll prove it by
contradiction. Assume that A × B = ∅, but that it is not the case that
both A and B are empty. Then neither A nor B is empty. But then there is
some a ∈ A and some b ∈ B, and (a, b) ∈ A × B, which implies that A × B ≠ ∅.
This contradicts our assumption. Therefore both A and B are empty.
Evaluation
Evaluation
5.3 Functions
This section is meant as a review of what you hopefully already learned in an earlier course,
probably in high school. Thus, it is pretty brief. But we do try to cover all of the important
material and provide enough examples to illustrate the concepts.
Definition 5.67. Let A and B be sets. Then a function f from A to B assigns to each
element of A exactly one element from B. We write f : A → B if f is a function from A to
B. If a ∈ A and f assigns to a the value b ∈ B, we write f (a) = b. We also say that f maps
a to b.
If A = B, we sometimes say f is a function on A.
Example 5.69. Notice that we can define f (x) = √x on the positive real numbers, but
we cannot define it on the positive integers since √2 is not an integer. Similarly, since
√(−1) = i ∉ R, we cannot define it on the real numbers. We can let it be a function from
R to C, though. But we won’t because this course is complex enough even without complex
numbers.
3. The range of f is the set {b | f (a) = b for some a ∈ A}. In other words, the range is the
subset of B consisting of the elements that are actually mapped to by f .
Figure 5.3 gives a pictorial representation of a function. Notice that in this example every
element in A has precisely one arrow going from it. So if I ask “what is f (x)?”, there is always
an answer and it is always unique. On the other hand, there is a point in B that has two arrows
going to it and several points that have no arrows going to them. This is fine.
Figure 5.4 does not represent a function since there are several points in A which have two
arrows going from them and several with no arrows at all. The problem here is that if I ask “what
is f (x)?”, sometimes there is no answer and sometimes there are multiple answers. Thus, f would
not represent a function.
Note: In Figures 5.3 and 5.4, the dots represent all of the elements of the sets A and B and
the gray ovals are mainly there to help identify which dots are in which set. However, in these
sorts of diagrams it is more common for the dots to represent only some of the elements. You
need to let the context help you determine how to properly interpret these diagrams.
Example 5.72. Give a formal definition of a function that assigns to an age the number of
complete decades someone of that age has lived. For instance, f (34) = 3 and f (5) = 0. Be
sure to indicate what the domain and codomain are.
Solution: It isn’t hard to see that the domain and codomain are both N. Thus
we want a function f : N → N. One way to define f is by f (x) = ⌊x/10⌋.
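In code, this function is a one-liner. The sketch below uses hypothetical names (Decades, decades) and represents N with Java's int, rejecting negative inputs since they are outside the domain.

```java
class Decades {
    // f : N -> N with f(x) = floor(x / 10).
    // For non-negative ints, Java's integer division discards the fractional
    // part, so x / 10 agrees with the floor.
    static int decades(int age) {
        if (age < 0) {
            throw new IllegalArgumentException("age must be a natural number");
        }
        return age / 10;
    }

    public static void main(String[] args) {
        System.out.println(decades(34));
        System.out.println(decades(5));
    }
}
```

The domain check matters: for negative operands Java's / rounds toward zero rather than down, so x / 10 and ⌊x/10⌋ agree only on the domain N used in the example.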
⋆Exercise 5.73. Give a formal definition of a function that returns the parity of an integer.
That is, it returns 0 for even numbers and 1 for odd numbers. Be sure to indicate what the
domain and codomain are.
Answer
• f is said to be surjective or onto if and only if for every b ∈ B, there exists some
a ∈ A such that f (a) = b. In other words, every element in B gets mapped to by some
element in A.
Procedure 5.75. To show that a function f is one-to-one, you just need to show that when-
ever f (a) = f (b), then a = b.
Example 5.76. Let f (x) = 2x − 3 be a function on the integers. Show that f is one-to-one.
⋆Question 5.77. Previously we mentioned that ‘working both sides’ was not an appropriate
proof technique. Why is it O.K. in the previous example?
Answer
⋆Exercise 5.78. Prove that f (x) = 5x is one-to-one over the real numbers.
Proof
Procedure 5.79. To show that a function f is not one-to-one, we simply need to find two
values a 6= b in the domain such that f (a) = f (b). That is, we just need to show that there
are two different numbers in the domain that are mapped to the same value in the codomain.
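When the function is defined on the integers, a computer can hunt for such a pair. The sketch below (our own names throughout) scans a range of inputs and reports the first two different inputs that produce the same output. Note the asymmetry: finding a pair proves the function is not one-to-one, but finding none says nothing about inputs outside the range that was searched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

class InjectivitySearch {
    // Searches the inputs lo, lo+1, ..., hi for two different inputs with the
    // same output. Returns {a, b} with a != b and f(a) == f(b), or null if no
    // such pair exists within the range.
    static int[] findCollision(IntUnaryOperator f, int lo, int hi) {
        Map<Integer, Integer> seen = new HashMap<>();  // output -> first input that produced it
        for (int x = lo; x <= hi; x++) {
            int y = f.applyAsInt(x);
            Integer earlier = seen.putIfAbsent(y, x);
            if (earlier != null) {
                return new int[] {earlier, x};
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int[] pair = findCollision(x -> x * x, -3, 3);
        System.out.println(pair[0] + " and " + pair[1] + " map to the same value");
        System.out.println(findCollision(x -> 2 * x - 3, -100, 100) == null);
    }
}
```

For f(x) = x^2 a collision turns up quickly. For f(x) = 2x − 3 the search comes back empty, which is consistent with that function being one-to-one but is not a proof of it.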
Example 5.80. Let f (x) = x^2 be a function on the integers. Show that f is not one-to-one.
⋆Exercise 5.81. Let f (x) = ⌊x⌋ be a function on R. Prove that f is not one-to-one.
Proof
Procedure 5.82. To show that a function f is onto, we need to show that for an arbitrary
b ∈ B, there is some a ∈ A such that f (a) = b. That is, show that every value in B is mapped
to by f .
Example 5.83. Let f (x) = x^3 be a function on the real numbers. Show that f is onto.
Solution: Let b ∈ R. Then f (∛b) = (∛b)^3 = b^(3/3) = b. Since every b ∈ R is
mapped to (from ∛b), f is onto.
Proof
Procedure 5.85. To show that a function f is not onto, we just need to find some b ∈ B
such that there is no a ∈ A with f (a) = b. In other words, we just need to find one value that
isn’t mapped to by f .
Example 5.86. Let f (x) = x^3 be a function on the integers. Show that f is not onto.
⋆Exercise 5.87. Let f (x) = ⌊x⌋ be a function on R. Prove that f is not onto.
Proof
It is important to remember that whether or not a function is one-to-one or onto might depend
on the domain/codomain over which the function is defined. For instance, notice that in the last
two examples we used the same function but on different domains/codomains. In one case the
function was onto, and in the other case it wasn’t.
Answer
Answer
Answer
(a) f (x) = x + 2
Answer
(b) g(x) = x^2
Answer
(c) h(x) = 2x
Answer
Answer
The functions in the previous exercise were specifically chosen to demonstrate that all four
combinations are possible: one-to-one and onto, one-to-one but not onto, onto but not
one-to-one, and neither one-to-one nor onto.
The following theorem should come as no surprise if you take a few minutes to think about it
(and you should take a few minutes to think about it until you are convinced it is correct).
⋆Exercise 5.91. Let’s test your understanding of the material so far. Answer each of the
following true/false questions, giving a very brief justification/counterexample.
(a) If f : A → B is onto, then the domain and range are not only the same size, but they
are the same set.
(c) If f : A → B is both one-to-one and onto, then A and B have the same number of
elements.
(f) Let f : R+ → R be defined by f (x) = √x. Then f is a function that is neither
one-to-one nor onto.
Note: It is important to note that the function f^(−1) is not the same thing as 1/f . This is an
unfortunate case where a notation can be interpreted in two different ways. That is, in some
cases, a^(−1) means the inverse function and in other cases it means 1/a. Usually the context
will help you determine which one is the correct interpretation.
Procedure 5.93. One method of finding the inverse of a function is to replace f (x) (or
whatever the name of the function is) with y and solve for x (or whatever the variable is).
Finally, replace y with x and you have the inverse.
Example 5.95. Let f : R → R be defined by f (x) = x^2. Then f does not have an inverse
since it is not one-to-one.
⋆Exercise 5.97. Let f (x) = 3x − 5 be a function over R. Prove that f has an inverse and
then find it.
In other words, to compose f with g, we first compute g(x). Then we plug g(x) into the
formula for f .
Note: Look closely at the notation. f ◦ g has f before g, so it might seem like it should be
g(f (x)); in other words, apply f first, then g. But that is not how it is defined.
Also notice that to compose f with g, it is necessary that the range of g is a subset of the
domain of f since otherwise it would be impossible to compute.
Solution:
Notice that in the previous example, f ◦ g ≠ g ◦ f . In other words, the order in which we
compose functions matters since the result is not always the same (although occasionally it is).
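Java's Function interface has a compose method that follows the same convention as the mathematical notation: f.compose(g) applies g first and then f. The sketch below uses two hypothetical functions chosen only for illustration, f(x) = x^2 and g(x) = x + 1; they are not the functions from the example above.

```java
import java.util.function.Function;

class CompositionSketch {
    // Hypothetical functions for illustration.
    static final Function<Integer, Integer> F = x -> x * x;  // f(x) = x^2
    static final Function<Integer, Integer> G = x -> x + 1;  // g(x) = x + 1

    // (f ◦ g)(x) = f(g(x)): g is applied first, then f.
    static int fAfterG(int x) {
        return F.compose(G).apply(x);
    }

    // (g ◦ f)(x) = g(f(x)): f is applied first, then g.
    static int gAfterF(int x) {
        return G.compose(F).apply(x);
    }

    public static void main(String[] args) {
        System.out.println(fAfterG(2));  // f(g(2)) = f(3)
        System.out.println(gAfterF(2));  // g(f(2)) = g(4)
    }
}
```

Here (f ◦ g)(2) = (2 + 1)^2 = 9 while (g ◦ f)(2) = 2^2 + 1 = 5, so the two orders give different functions.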
⋆Exercise 5.100. Let f and g be functions on R defined by f (x) = ⌊x⌋ and g(x) = x/2.
Compute f ◦ g and g ◦ f , simplifying your answers.
(f ◦ g)(x) =
(g ◦ f )(x) =
Direct Proof:
For any distinct elements x, y ∈ A, g(x) ≠ g(y), since g is one-to-one. Since f is
also one-to-one, then f (g(x)) ≠ f (g(y)), which is the same as (f ◦ g)(x) ≠ (f ◦ g)(y).
Therefore f ◦ g is one-to-one.
Proof by Contradiction:
Assume f ◦ g is not one-to-one. Then there exist distinct elements x, y ∈ A such
that (f ◦ g)(x) = (f ◦ g)(y). This is equivalent to f (g(x)) = f (g(y)). Since f is
one-to-one, it must be the case that g(x) = g(y). But x ≠ y, and g is one-to-one,
so g(x) ≠ g(y). This is a contradiction. Therefore f ◦ g is one-to-one.
Example 5.104. Prove or disprove that f (x) = 2x + 1 and g(x) = 2x − 1, defined over the
real numbers, are inverses.
⋆Exercise 5.105. Let’s test your understanding of the material so far. Answer each of the
following true/false questions, giving a very brief justification/counterexample.
(e) Let n be a positive integer. Then the function ⁿ√x is invertible on R.
(f) Let n be a positive integer. Then the function ⁿ√x is invertible on N.
(g) Let n be a positive integer. Then the function ⁿ√x is invertible on R+ (the positive
real numbers).
(h) Let f and g be functions on Z+ defined by f (x) = x^2 and g(x) = 1/x. Then
f ◦ g = g ◦ f.
(i) Let f and g be functions on Z defined by f (x) = (x + 1)^2 and g(x) = x + 1. Then
f ◦ g = g ◦ f.
(j) Let f (x) = ⌊x⌋ and g(x) = ⌈x⌉ be defined on the real numbers. Then f ◦ g = g ◦ f .
(k) Let f (x) = ⌊x⌋ and g(x) = ⌈x⌉ be defined on the real numbers. Then f and g are
inverses of each other.
(l) Let f (x) = x^2 and g(x) = √x be defined over the positive real numbers. Then f and
g are inverses of each other.
Example 5.106. Consider the following function that returns n! if n ≥ 0, and returns −1 if
n < 0 (n! is undefined for negative values of n, but we have to return something, so why not
a negative number?)
int factorial(int n) {
    if (n < 0) { return -1; }
    else if (n == 0) { return 1; }
    else {
        int fact = 1;
        for (int i = 1; i <= n; i++) {
            fact = fact * i;
        }
        return fact;
    }
}
What values of n should we use to test factorial?
Example 5.108. Define E = {2k : k ∈ Z} and O = {2k + 1 : k ∈ Z}. Clearly E is the set of
even integers and O is the set of odd integers. Since E ∩ O = ∅ and E ∪ O = Z, {E, O} is a
partition of Z. Put another way, we can partition the integers based on parity.
Example 5.109. We can partition the socks in our sock drawer by color. In other words,
we put all of the black socks in one set, the white ones in another, the green ones in another,
etc. For simplicity, we can put all of the multi-color socks in a single set.
Partitions and Equivalence Relations 159
Example 5.110. We can partition the set of all humans by putting each person into a set
based on the first letter of their first name. So Adam and Adele go into set A and Zeek goes
into set Z, for instance. The sets in the partition are A, B, . . . , Z.^a
a. For simplicity, we assume everyone’s name is written using the Roman alphabet.
Example 5.111. Let A = {1, 5, 8}, B = {2, 3}, C = {4}, D = {6, 9}, and E = {7, 10, 11, 12}.
Then the sets A, B, C, D, and E form a partition of the set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}.
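Checking whether a collection of finite sets forms a partition is mechanical: no element may appear in two of the sets, and together they must cover the whole set. The sketch below does exactly that. The names are ours, and it also rejects empty blocks, which is an assumption based on the usual definition of a partition.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PartitionCheck {
    // Convenience helper (ours) for building a set from a few values.
    static Set<Integer> setOf(Integer... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    // True when the blocks are non-empty, pairwise disjoint, and their union is `whole`.
    static boolean isPartition(List<Set<Integer>> blocks, Set<Integer> whole) {
        Set<Integer> seen = new HashSet<>();
        for (Set<Integer> block : blocks) {
            if (block.isEmpty()) {
                return false;
            }
            for (int x : block) {
                if (!seen.add(x)) {
                    return false;   // x already appeared in an earlier block
                }
            }
        }
        return seen.equals(whole);  // the blocks must cover exactly the whole set
    }

    public static void main(String[] args) {
        List<Set<Integer>> blocks = Arrays.asList(
                setOf(1, 5, 8), setOf(2, 3), setOf(4), setOf(6, 9), setOf(7, 10, 11, 12));
        Set<Integer> whole = new HashSet<>();
        for (int i = 1; i <= 12; i++) {
            whole.add(i);
        }
        System.out.println(isPartition(blocks, whole));
    }
}
```

The main method uses the sets from Example 5.111; moving an element into a second block, or dropping one entirely, makes the check fail.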
Example 5.112. When choosing test cases for the factorial method in Example 5.106, we
thought about 3 subsets of Z: {0}, Z^+, and Z^−. These cases form a partition of Z since they
are disjoint and Z = {0} ∪ Z^+ ∪ Z^−. This is good since it means we covered at least one value
of the different types, and we didn’t ‘overtest’ any of the cases by unknowingly duplicating
values from the same case.
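As a rough sketch of what such tests might look like, the following Java code calls factorial on representatives from each of the three sets {0}, Z^+, and Z^−. The class name and the particular representatives are our choices; the method body is the one from Example 5.106.

```java
final class FactorialTests {
    // The method from Example 5.106, repeated here so the sketch is self-contained.
    static int factorial(int n) {
        if (n < 0) { return -1; }
        else if (n == 0) { return 1; }
        else {
            int fact = 1;
            for (int i = 1; i <= n; i++) {
                fact = fact * i;
            }
            return fact;
        }
    }

    public static void main(String[] args) {
        System.out.println(factorial(0));   // representative of {0}: prints 1
        System.out.println(factorial(1));   // representative of Z+: prints 1
        System.out.println(factorial(5));   // representative of Z+: prints 120
        System.out.println(factorial(-3));  // representative of Z-: prints -1
    }
}
```

One caveat that the partition above does not capture: 13! is larger than the largest int value, so for n > 12 this method silently overflows. A more careful tester might split Z^+ further for that reason.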
⋆Exercise 5.113. You must decide on test cases for a method int maximum(int a,int b)
that returns the maximum of its arguments. How would you partition the possible inputs
into sets such that if it is correct for one (or a few) tests of cases from that set, it is probably
correct for the rest of the cases in that set? Notice that the set of inputs is Z × Z.
Answer
Most of the partitions we talk about will be based on some meaningful characteristic of the
elements of a set, such as parity, color, or sign. But this is not inherent in the definition. For instance,
the sets in the partition from Example 5.111 do not seem to have any significant meaning. Some,
like the one in Example 5.108, will have a precise mathematical definition. Others, like the one
in Example 5.109, will not.
⋆Exercise 5.114. Define a partition on Z that contains more than one subset.
Answer
{3Z, 3Z + 1, 3Z + 2} is a partition of Z.
Note: The notation in this example may seem a bit odd at first. How are you supposed to interpret “3Z + 1”?
Is this 3 times the set Z plus 1? What does it mean to do algebra with sets and numbers? I won’t get into
all of the technical details, but here is a short answer. You can think of “3Z + 1” as just a name. Sure, it
may seem like an odd name, but why can’t we name a set whatever we want? Some people name their kids
Jon Blake Cusack 2.0 and get away with it. You can also think of “3Z + 1” as describing how to create the
set—by taking every element from Z, multiplying it by 3, and then adding 1. Thus, you can think of “3Z + 1”
as being both an algebraic expression and a name.
⋆Exercise 5.116. Let I = R \ Q (the set of irrational numbers). Prove that {Q, I} is a
partition of R.
Proof
Recall that when a list of numbers is given between parentheses (e.g. (1, 2, 3)), it typically
denotes an ordered list. That is, the order in which the elements are listed matters. So, for instance,
(1, 2) and (2, 1) are not the same thing.
Next we will develop an alternative way of thinking about partitions: equivalence relations.
After defining some terms and providing a few examples, we will make the connection between
partitions and equivalence relations more clear.
Example 5.118. Let A be the set of all students at this school and B be the set of all courses
at this school. We can define a relation R by saying that xRy if student x has taken course
y. Said another way, we can define R by saying that (x, y) ∈ R if student x has taken course
y.
Example 5.119. We can define a relation R = {(a, a^2) : a ∈ Z}. That is, x is related to y if
y = x^2.
Example 5.120. We can define a relation on Z by saying that x is related to y if they have
the same parity. Thus, (2, 0), (234, −342), (3, 17) are all in R, but (2, 127) is not.
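In code, a relation on Z is often represented as a method that takes two values and returns whether they are related. Here is a minimal Java sketch of the parity relation from Example 5.120. The names are ours. We use Math.floorMod rather than the % operator because x % 2 evaluates to -1 for negative odd x in Java, which would make 3 and -3 look unrelated.

```java
final class ParityRelation {
    // Returns true if (x, y) is in R, that is, if x and y have the same parity.
    static boolean sameParity(int x, int y) {
        // floorMod(x, 2) is always 0 or 1, even when x is negative.
        return Math.floorMod(x, 2) == Math.floorMod(y, 2);
    }

    public static void main(String[] args) {
        System.out.println(sameParity(2, 0));       // prints true
        System.out.println(sameParity(234, -342));  // prints true
        System.out.println(sameParity(3, 17));      // prints true
        System.out.println(sameParity(2, 127));     // prints false
    }
}
```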
Answer
⋆Question 5.122. Is {(1, 2), (345, 7), (43, 8675309), (11, 11)} a relation on Z^+? Explain.
Answer
Definition 5.123. A relation R on set A is said to be reflexive if for all x ∈ A, xRx (or
(x, x) ∈ R).
⋆Exercise 5.124. Let P be the set of all people. Which of the following relations on P are
reflexive? Explain why or why not.
(b) N is the relation with a related to b iff a’s name starts with the same letter as b’s name.
(c) C is the relation defined by (a, b) ∈ C if a and b have been to the same city.
(a) T :
(b) N :
(c) C:
(d) K:
(e) R:
⋆Exercise 5.126. Which of the relations from Exercise 5.124 are symmetric? Explain why
or why not.
(a) T :
(b) N :
(c) C:
(d) K:
(e) R:
Answer
(b) What if (1, 2) and (2, 1) are both in R? Can you tell whether or not R is anti-symmetric?
Answer
Answer
⋆Exercise 5.130. Which of the relations from Exercise 5.124 are anti-symmetric? Explain
why or why not.
(a) T :
(b) N :
(c) C:
(d) K:
(e) R:
Answer
Answer
Answer
⋆Exercise 5.132. Give an example of a relation on any set of your choice that is both
symmetric and anti-symmetric. Justify your answer.
Answer
⋆Exercise 5.134. Which of the relations from Exercise 5.124 are transitive? Explain why
or why not.
(a) T :
(b) N :
(c) C:
(d) K:
(e) R:
Definition 5.135. A relation which is reflexive, symmetric and transitive is called an equiv-
alence relation.
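On a small finite set, the three properties can be checked by brute force. The following Java sketch is one way to do so, assuming the relation is given as a predicate on pairs. The class and method names are ours. Note that such a check only covers the elements listed; for an infinite set like Z it is a sanity check, not a proof.

```java
import java.util.List;
import java.util.function.BiPredicate;

final class EquivalenceCheck {
    // Returns true if 'related' is reflexive, symmetric, and transitive on 'elements'.
    static <T> boolean isEquivalence(List<T> elements, BiPredicate<T, T> related) {
        for (T x : elements) {
            if (!related.test(x, x)) {
                return false; // not reflexive
            }
            for (T y : elements) {
                if (related.test(x, y) && !related.test(y, x)) {
                    return false; // not symmetric
                }
                for (T z : elements) {
                    if (related.test(x, y) && related.test(y, z) && !related.test(x, z)) {
                        return false; // not transitive
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> sample = List.of(-3, -2, -1, 0, 1, 2, 3);
        // Same parity: x - y is even.
        System.out.println(isEquivalence(sample, (x, y) -> Math.floorMod(x - y, 2) == 0)); // prints true
        // "Less than or equal" is reflexive and transitive, but not symmetric.
        System.out.println(isEquivalence(sample, (x, y) -> x <= y)); // prints false
    }
}
```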
Example 5.136. Let S = {All Human Beings}, and define the relation M by (a, b) ∈ M
if a has the same (biological) mother as b. Show that M is an equivalence relation.
⋆Exercise 5.137. Which of the relations from Exercise 5.124 are equivalence relations?
Explain why or why not.
(a) T :
(b) N :
(c) C:
(d) K:
(e) R:
⋆Exercise 5.139. Which of the relations from Exercise 5.124 are partial orders? Explain
why or why not.
(a) T :
(b) N :
(c) C:
(d) K:
(e) R:
⋆Exercise 5.140. Let X be a collection of sets. Let R be the relation on X such that A is
related to B if A ⊆ B. Prove that R is a partial order on X.
Proof: (Reflexive)
(Anti-symmetric)
(Transitive)
Labeling the lines of these proofs with what property we are proving isn’t strictly necessary.
⋆Exercise 5.141. Consider the relation R = {(1, 2), (1, 3), (1, 5), (2, 2), (3, 5), (5, 5)} on the
set {1, 2, 3, 4, 5}. Prove or disprove each of the following.
(a) R is reflexive
Answer
(b) R is symmetric
Answer
(c) R is anti-symmetric
Answer
(d) R is transitive
Answer
Answer
Answer
It turns out that congruence modulo n is an equivalence relation. (See Definition 3.13 if
necessary).
Theorem 5.142. Let n be a positive integer. Let R be the relation on the set of integers
defined by R = {(a, b) : a ≡ b (mod n)}. Then R is an equivalence relation.
a − c = (a − b) + (b − c) = kn + ln = (k + l)n.
Notice that if we let n = 2 in the previous theorem, we essentially have the relation from
Example 5.120.
⋆Fill in the details 5.143. Let R be the relation on the set of ordered pairs of positive
integers (that is, Z^+ × Z^+) such that ((a, b), (c, d)) ∈ R if and only if ad = bc. Show that R
is an equivalence relation.
(Transitive) Assume that ((a, b), (c, d)) ∈ R and ((c, d), (e, f )) ∈ R. Then
Definition 5.144. Let R be an equivalence relation on a set S. Then the equivalence class
of a, denoted by [a], is the subset of S containing all of the elements that are related to a.
More formally,
[a] = {x ∈ S : xRa}.
If x ∈ [a], we say that x is a representative of the equivalence class [a]. Note that any
element of an equivalence class can serve as a representative.
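When S is finite, the definition [a] = {x ∈ S : xRa} translates directly into a loop. Here is a minimal Java sketch, assuming the relation is supplied as a predicate. The names classOf and EquivalenceClasses are ours.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

final class EquivalenceClasses {
    // Returns [a] = { x in set : x R a }, listed in the order the elements appear in set.
    static <T> List<T> classOf(T a, List<T> set, BiPredicate<T, T> related) {
        List<T> result = new ArrayList<>();
        for (T x : set) {
            if (related.test(x, a)) {
                result.add(x);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> s = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            s.add(i);
        }
        // Congruence modulo 3 on S = {0, 1, ..., 11}.
        BiPredicate<Integer, Integer> mod3 = (x, y) -> Math.floorMod(x - y, 3) == 0;
        System.out.println(classOf(1, s, mod3));   // prints [1, 4, 7, 10]
        System.out.println(classOf(10, s, mod3));  // prints [1, 4, 7, 10]
    }
}
```

The two calls return the same list because 1 and 10 are representatives of the same class.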
Example 5.145. The equivalence class of 3 modulo 8 is [3] = {8k + 3 : k ∈ Z}. Notice that
[11] = {8k + 11 : k ∈ Z} = {8k + 3 : k ∈ Z} = [3]. In fact, [3] = [8l + 3] for all integers l. In
other words, any element of the form 8l +3, where l is an integer, can serve as a representative
of [3]. Further, we can call this class [3], [11], [19], etc. It doesn’t really matter since they all
represent the same set of integers. Of course, [3] is the most logical choice.
Example 5.146. Notice that if our relation is congruence modulo 3, we can define three
equivalence classes:
It isn’t too difficult to see that Z = [1] ∪ [2] ∪ [3], and that these three sets are disjoint. In
other words, the equivalence classes {[1], [2], [3]} form a partition of Z. As we will see shortly,
this is not a coincidence.
Lemma 5.147. Let R be an equivalence relation on a set S. Then two equivalence classes
are either identical or disjoint.
Proof: Let a, b ∈ S, and assume [a] ∩ [b] ≠ ∅ (that is, that they are not
disjoint). We need to show that [a] = [b]. First, let x ∈ [a] ∩ [b] (which exists since
[a] ∩ [b] ≠ ∅). Then xRa and xRb, so by symmetry aRx and by transitivity aRb.
Now let y ∈ [a]. Then yRa. Since we just showed that aRb, then yRb by transi-
tivity. Thus y ∈ [b]. Therefore [a] ⊆ [b].
A symmetric argument proves that [b] ⊆ [a]. Therefore, [a] = [b].
Let’s bring together some of the examples of partitions with examples of equivalence relations
and classes.
Example 5.148. We just saw that congruence modulo 3 is an equivalence relation with three
equivalence classes, {3k : k ∈ Z}, {3k + 1 : k ∈ Z}, and {3k + 2 : k ∈ Z}. In Example 5.115,
we defined a partition of Z using these same three subsets.
Example 5.150. In Example 5.110 we defined a partition of people according to the first
letter of their first name. The sets in the partition were A, B, . . . , Z.
We can define an equivalence relation on the set of all people by saying a is related to b if a’s
name starts with the same letter of the alphabet as b’s name. In a series of previous exercises,
you proved that this defines an equivalence relation. Notice that the equivalence classes
are the sets A, B, . . . , Z (which we can think of as, for instance [Adam], [Betty], . . . , [Zeek]).
Again, these are the same sets that we used to partition people into in Example 5.110.
In these examples, there seems to be a connection between the equivalence classes of the
relation and the sets in a partition. As the next theorem illustrates, this is no coincidence.
and [a] ∩ [b] = ∅ if a is not related to b. This proves the first half of the theorem.
Conversely, let
S = ⋃_α S_α, where S_α ∩ S_β = ∅ if α ≠ β,
Example 5.152. In light of Theorem 5.151, we can say that the relation defined by congru-
ence modulo 4 partitions the set of integers into precisely 4 equivalence classes: [0], [1], [2],
and [3]. That is, given any integer, it is contained in one (and only one) of these classes.
More generally, if n > 2, Z can be partitioned into n sets, [0], [1], . . . , [n − 1], each of which
is an equivalence class of the relation defined by congruence modulo n.
When we think about the partition, we are focused on the concept that each number x
goes into one of the n subsets based on the value x mod n. On the other hand, when we think
about the relation of congruence modulo n, we are focused on the idea that x and y are in
the same equivalence class iff x ≡ y (mod n).
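Both points of view can be sketched in Java. In the code below (all names are ours), partitionByMod takes the partition view and places each x in the subset labeled x mod n, while congruent takes the relation view and answers whether x ≡ y (mod n). We use Math.floorMod so that negative integers land in one of [0], [1], . . . , [n − 1]; Java's % operator can return a negative remainder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ModPartition {
    // Partition view: group each x under the label floorMod(x, n), a value in 0..n-1.
    static Map<Integer, List<Integer>> partitionByMod(List<Integer> xs, int n) {
        Map<Integer, List<Integer>> classes = new TreeMap<>();
        for (int x : xs) {
            int label = Math.floorMod(x, n);
            classes.computeIfAbsent(label, k -> new ArrayList<>()).add(x);
        }
        return classes;
    }

    // Relation view: x and y are related iff they leave the same remainder modulo n.
    static boolean congruent(int x, int y, int n) {
        return Math.floorMod(x, n) == Math.floorMod(y, n);
    }

    public static void main(String[] args) {
        List<Integer> xs = List.of(-4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7);
        System.out.println(partitionByMod(xs, 4));
        System.out.println(congruent(-3, 5, 4)); // prints true
    }
}
```

Two numbers end up in the same list of partitionByMod exactly when congruent returns true for them, which is the connection between the two views.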
It might be helpful to see another example of where equivalence relations and partitions are
useful in computer science.
Example 5.153. Consider a method setRewardChance(double chance) that sets the per-
centage chance that a player will be rewarded for completing some task. When the chance is
set, we may want to take some action based on the value. For instance, the chance should
certainly be non-negative, so we might want to throw some sort of error if it is negative.
Similarly, the chance should not be above 100 (We are assuming the values are being inter-
preted as whole percentages, so 100 means 100%). We may also want to treat the case of a
0% chance in a special way. Further, chances below 20%, between 20% and 50%, and above 50%
might all be treated in different ways. This might lead to code such as the following.
void setRewardChance(double chance) {
    if (chance < 0) { /* Throw an error */ }
    else if (chance == 0) { /* deal with chance == 0 */ }
    else if (chance < 20) { /* deal with 0 < chance < 20 */ }
    else if (chance < 50) { /* deal with 20 <= chance < 50 */ }
    else if (chance <= 100) { /* deal with 50 <= chance <= 100 */ }
    else { /* Throw an error */ }
}
Notice that a given value of chance will lead us down one of 6 different execution paths. If
we want to write tests for this code, we need to make sure that every path of execution is
tested.
So what does this have to do with partitions and equivalence relations? It is simple: The
code described above partitions the set of real numbers into 6 subsets, based on which section
of code will be executed:
R = (−∞, 0) ∪ {0} ∪ (0, 20) ∪ [20, 50) ∪ [50, 100] ∪ (100, ∞).
Therefore, to ensure the tests cover all of the code, we need to choose at least one value from
each subset in the partition. In the words of equivalence classes, we need to choose one or
more representatives from each equivalence class.
Hopefully it is pretty straightforward to see where partitions come into play here. But
perhaps the concepts related to equivalence relations are more difficult to see. In this case our
equivalence relation is a little more difficult to describe succinctly. The most straightforward
way is a bit vague: two numbers are related if they are in the same subset. Although vague,
it is somewhat precise (assuming I know that it is referring to the subsets given above). Are
40 and 2 related? No, because 40 ∈ [20, 50), but 2 ∈ (0, 20). Are 75 and 98 related? Yes,
because 75 ∈ [50, 100] and 98 ∈ [50, 100].
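One way to make "two numbers are related if they are in the same subset" precise is to write a method that returns which subset a number belongs to, and then compare the results. The following Java sketch does this. The names block and related, and the numbering of the subsets from 0 to 5, are our own choices.

```java
final class RewardChancePartition {
    // Returns the index of the subset containing chance:
    // 0: (-inf, 0)   1: {0}   2: (0, 20)   3: [20, 50)   4: [50, 100]   5: (100, inf)
    // (We ignore the special double value NaN, which would fall through to 5.)
    static int block(double chance) {
        if (chance < 0) { return 0; }
        else if (chance == 0) { return 1; }
        else if (chance < 20) { return 2; }
        else if (chance < 50) { return 3; }
        else if (chance <= 100) { return 4; }
        else { return 5; }
    }

    // x is related to y iff both fall in the same subset of the partition.
    static boolean related(double x, double y) {
        return block(x) == block(y);
    }

    public static void main(String[] args) {
        System.out.println(related(40, 2));   // prints false
        System.out.println(related(75, 98));  // prints true
    }
}
```

The branches of block mirror the branches of setRewardChance, so choosing one test value for each possible return value of block covers every execution path.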
The previous two examples also demonstrate another important concept: Although equivalence
relations and partitions are two ways of thinking about the same concepts, sometimes it is
easier to think in terms of one than the other. In Example 5.153, it is much easier to think in terms
of partitioning a set than to think about the formal definition of an equivalence relation. In that
example it is easy to write the partition (R = (−∞, 0) ∪ {0} ∪ (0, 20) ∪ [20, 50) ∪ [50, 100] ∪ (100, ∞)),
but not so easy to write down a complete and succinct definition of the equivalence relation (which
is why we did not do so). On the other hand, in Example 5.152, it is easy to define the equivalence
relation (x is related to y iff x ≡ y (mod n)), but a bit harder to write down the partition (again,
this is why we did not do so).
Put another way, in some cases it is easier to talk about what it means for two things to be
equivalent, and in other cases it is easier to talk about how to partition a set into disjoint subsets.
But remember that both of these are accomplishing the same thing.
Reading Comprehension Questions 173
⋆Question 5.1. Let A = {1, 2, 3, 4, 5, 6} and B = {2, 4, 6}. Which of the following notations
makes sense? Explain what is wrong with the ones that do not make sense.
(a) 3 ⊆ A. (b) 3 ∈ A. (c) {3} ∈ A. (d) {3} ⊆ A. (e) B ∈ A. (f) B ⊆ A.
⋆Question 5.5. Use a reasonable mathematical notation to express the set of perfect cubes (e.g.
numbers like 8 = 2^3 and −27 = (−3)^3).
⋆Question 5.6. Use a reasonable mathematical notation to express the set of all numbers that
have at most 2 digits past the decimal point. For instance, the set contains 7, 3.4, and 45.98, but
does not contain 867.5309. (Hint: There is a really easy way to express this if you give it a little
bit of thought. On the other hand, do not overthink it or you will come up with something more
complicated than necessary.)
⋆Question 5.8. Let A be a set with |A| = 5. How many subsets does A have? (Hint: don’t
work too hard on this one!)
(b) Is {{a}, {b, c}, {a, c, e}} ⊆ A? If not, explain why not.
(e) Is {{a}, {b, c}, {a, c, e}} ⊆ P (A)? If not, explain why not.
⋆Question 5.10. Let U = {a, b, c, d, . . . , z} (the letters in the English alphabet) be the universal
set, V = {a, e, i, o, u} (the vowels), C = V̄ (the consonants), and R = {a, b, d, g, k, p, v} (some
random letters). Find each of the following: (a) C ∪ R (b) C ∩ R (c) V ∩ R (d) R \ C (e) C ∪ R
(a) Let’s say that I can prove that whenever x ∈ A, then x ∈ B. What did I just prove?
(b) Let’s assume I have the proof from part (a), but I can also prove that whenever x ∈ B, then
x ∈ A. Now what have I proven?
⋆Question 5.12. Give an informal proof of the second version of De Morgan’s law (See Table 5.1)
by describing the sets on both sides of the equation and concluding that they are the same.
⋆Question 5.13. Use a set containment proof to prove the first complement law. That is, if A
is a set and U is the universal set, prove that A ∪ Ā = U.
⋆Question 5.17. Let A = {1, 2, 3, 4} and B = {2, 4, 6, 8}. Give an example of each of the
following. If it is not possible, explain why.
⋆Question 5.18. Let A = {1, 2, 3, 4} and B = {2, 4, 6, 8, 10, 12}. Give an example of each of
the following. If it is not possible, explain why.
⋆Question 5.19. Is each of the following true or false? If it is false, explain why.
(a) To show that f is one-to-one, you need to show that if a = b, then f (a) = f (b).
(b) To show that f is not one-to-one, you only need to find two values in the domain that map
to the same element of the codomain.
(c) To show that f is onto, you need to show that every element of the domain gets mapped to
some element of the codomain.
(d) To show that f is onto, you can show that the range and codomain are exactly the same set.
(e) To show that f is not onto, you need to show that no elements of the range are mapped to.
(f) To show that f is invertible, you need to show that f is one-to-one and that f is onto.
(b) Define a set C such that B, C is not a partition of A because the sets are not disjoint.
(c) Define a set C such that B, C is not a partition of A because the union of the sets is not A.
⋆Question 5.22. Is {(1, 3), (456, 901), (867, 5309)} a relation on Z? Explain.
⋆Question 5.23. Define a partial order on the set of all human beings. Briefly explain why it
is a partial order.
⋆Question 5.24. Define an equivalence relation on the set of all cars. Briefly explain why it
is an equivalence relation. Then define a partition of the set of all cars that corresponds to the
equivalence relation. Give a clear definition of each equivalence class (that is, each set in the
partition), and if possible give a representative element from each subset.
⋆Question 5.25. Let B be the relation on Z^+ such that (x, y) ∈ B if x and y have the same
number of 1s in their binary representation. For example, 3 = 11_2 and 5 = 101_2, so both 3 and 5
have two 1s in their binary representation. Thus, (3, 5) ∈ B and (5, 3) ∈ B. On the other hand,
(3, 2) ∉ B since 2 = 10_2 has only one 1 in its binary representation.
(c) Define a partition of Z^+ based on the relation B. (Hint: The fact that I am asking this
question should clue you in on the answer to one or more of the previous questions.) In other
words, define sets B_1, B_2, B_3, . . . such that Z^+ = B_1 ∪ B_2 ∪ B_3 ∪ · · · and B_i ∩ B_j = ∅ if i ≠ j.
To be clear, I am looking for a clear definition of B_i for a given value of i.
(d) Give the most obvious choice of a representative for each subset B_i. That is, choose an a_i
such that [a_i] = B_i.
5.6 Problems
Problem 5.1. Draw a Venn diagram showing A ∩ (B ∪ C), where A, B, and C are sets.
Problem 5.2. Assume A, B, and C are sets. Prove each of the following set identities using a set
containment proof based on the basic definitions of ∩, ∪, etc. (see examples 5.49, 5.52, and 5.53).
(a) (A ∩ B ∩ C) ⊆ (A ∩ B).
(b) A ∩ B ⊆ A ∪ B.
(c) (A ∪ B) \ (A ∩ B) = (A \ B) ∪ (B \ A).
(d) (A \ B) \ C ⊆ A \ C.
Problem 5.3. Prove each of the following set identities using a set containment proof based on
the basic definitions of ∩, ∪, etc. (see examples 5.49, 5.52, and 5.53).
(a) A ∪ (A ∩ B) = A.
(b) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
(c) (A \ B) \ C = (A \ C) \ (B \ C).
Problem 5.4. Rusty has 20 marbles of different colors: black, blue, green, and yellow. Seventeen
of the marbles are not green, five are black, and 12 are not yellow. How many blue marbles does
he have?
(a) The method addAll(TreeSet other) adds all of the elements in other to this set if they’re
not already present. What is the result of A.addAll(B) (in terms of A and B and set operators)?
(b) The method removeAll(TreeSet other) removes from this set all of its elements that are
contained in other. What is the result of A.removeAll(B) (in terms of A and B and set
operators)?
(c) Write A.contains(x) using set notation, where x is an element that can be stored in a
TreeSet.
Problem 5.6. You need to settle an argument between your boss (who can fire you) and your
professor (who can fail you). They are trying to decide who to invite to the Young Accountants
Volleyball League. They want to invite freshmen who are studying accounting and are at least 6
feet tall. They have a list of all students.
(a) Your boss says they should make a list of all freshmen, a list of all accounting majors, and a
list of everyone at least 6 feet tall. They should then combine the lists (removing duplicates)
and invite those on the combined list. Is he correct? Explain. If he is not correct, describe in
the simplest possible terms who ends up on his guest list.
(b) Your professor says they should make a list of everyone who is not a freshman, a list of
everyone who does not do accounting, and a list of everyone who is under 6 feet tall. They
should make a fourth list that contains everyone who is on all three of the prior lists. Finally,
they should remove from the original list everyone on this fourth list, and invite the remaining
students. Is he correct? Explain. If he is not correct, describe in the simplest possible terms
who ends up on his guest list.
(c) Give a simple description of how the guest list should be created.
Problem 5.8. Let a and b be real numbers with a ≠ 0. Show that the function f(x) = ax + b
is invertible.
Problem 5.9. Prove or disprove: if a, b, and c are real numbers with a ≠ 0, then the function
f(x) = ax^2 + bx + c is invertible.
Problem 5.10. Prove that if f and g are onto, then f ◦ g is also onto.
Problem 5.11. Let f (x) = x + ⌊x⌋ be a function on R. (This one is a little tricky.)
Problem 5.12. Find the inverse of the function f(x) = x^3 + 1 over the real numbers.
Problem 5.13. Let f be the function on Z^+ that maps x to the number of bits required to
represent x in binary. For instance, f(1) = 1, f(2) = 2, f(3) = 2, f(4) = 3, f(10) = 4, etc. Hint:
The number 2^n requires n + 1 bits to represent (a single 1 followed by n zeros). You may be able
to use this fact in one of your proofs.
Problem 5.14.
Consider the relation R = {(1, 2), (1, 3), (3, 5), (2, 2), (5, 5), (5, 3), (2, 1), (3, 1)} on set {1, 2, 3, 4, 5}.
Is R reflexive? symmetric? anti-symmetric? transitive? an equivalence relation? a partial order?
Problem 5.15. Let X be the set of all people. Which of the following are equivalence relations?
Prove it.
Problem 5.16. Repeat the previous problem, but which are partial orders? Prove it.
Problem 5.17. Define three different equivalence relations on the set of all TV shows. For each,
give examples of the equivalence classes, including one representative from each. Prove that each
is an equivalence relation.
Problem 5.18. Define a relation on the set of all Movies that is not an equivalence relation.
Problem 5.19. Let A = {1, 2, . . . , n}. Let R be the relation on P (A) (the power set of A) such
that a, b ∈ P (A) are related iff |a| = |b|. Prove that R is an equivalence relation. What are the
equivalence classes of R?
Problem 5.20. The class Relation is a partial implementation of a relation on a set A. It has a
list of Element objects.
• An Element stores an ordered pair from A. Element has methods getFrom() and getTo()
(using the language of the directed graph representation). So if an Element is storing (a, b),
getFrom() returns a and getTo() returns b. The constructor Element(Object a, Object b)
creates an element (a, b).
• The Relation class has methods like areRelated(Object a,Object b), getElements( ), and
getUniverse( ).
• Methods in the Relation class can use for(Element e : getElements()) to iterate over
elements of the relation.
Given all of this, implement the following methods in the Relation class:
(a) isReflexive()
(b) isSymmetric()
(c) isAntiSymmetric()
Chapter 6: Sequences and Summations
6.1 Sequences
Definition 6.1. A sequence of real numbers is a function whose domain is the set of natural
numbers and whose output is a subset of the real numbers. We usually denote a sequence by
one of the notations
a_0, a_1, a_2, . . .
or
{a_n}_{n=0}^{+∞}
or
{a_n}.
The last notation is just a shorthand for the second notation.
Note: Since sequences are functions, sometimes function notation is used. That is, a(n)
instead of a_n.
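The view of a sequence as a function carries over directly to code: a sequence can be represented as a function from an index n to a value. Here is a small Java sketch using a made-up sequence a_n = 1/(n + 1); the sequence and the names are ours, chosen only to illustrate the idea.

```java
import java.util.function.IntToDoubleFunction;

final class SequenceAsFunction {
    // A hypothetical example sequence: a_n = 1 / (n + 1) for n = 0, 1, 2, ...
    static final IntToDoubleFunction A = n -> 1.0 / (n + 1);

    public static void main(String[] args) {
        // Print the terms a_0 through a_4 using function notation a(n).
        for (int n = 0; n < 5; n++) {
            System.out.println("a_" + n + " = " + A.applyAsDouble(n));
        }
    }
}
```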
We will be mostly interested in two types of sequences. The first type are sequences that have
an explicit formula for their n-th term. They are said to be in closed form.
a_m, a_{m+1}, a_{m+2}, . . . ,
or
{a_n}_{n=m}^{+∞},
where m is a non-negative integer. Most sequences we will deal with will start with m = 0 or
m = 1.
182 Chapter 6
(a) x_0 =
(b) x_1 =
(c) x_2 =
(d) x_3 =
(e) x_4 =
⋆Exercise 6.4. Find the first five terms of the following sequences.
(a) x_n = 1 + (−1/2)^n, n = 0, 1, 2, . . .
x_0 = x_1 = x_2 =
x_3 = x_4 =
(b) x_n = n! + 1, n = 0, 1, 2, . . .
x_0 = x_1 = x_2 =
x_3 = x_4 =
(c) x_n = 1/(n! + (−1)^n), n = 2, 3, 4, . . .
x_2 = x_3 = x_4 =
x_5 = x_6 =
Sequences 183
(d) x_n = (1 + 1/n)^n, n = 1, 2, . . .
x_1 = x_2 = x_3 =
x_4 = x_5 =
Definition 6.5. A recurrence relation is an equation that defines each term of a sequence
based on one or more previous terms of the sequence. More specifically, a recurrence relation
for a sequence {a_n} will define a_n based on (some of) the values of a_0, a_1, . . . , a_{n−1}.
Example 6.6. Let x_0 = 1, x_n = (1 + 1/n) x_{n−1}, for n = 1, 2, . . . . Then {x_n}_{n=0}^{+∞} is a
recursively defined sequence. The terms x_1, x_2, . . . , x_5 are
x_1 = (1 + 1/1) x_0 = (1 + 1/1) · 1 = 1 + 1 = 2.
x_2 = (1 + 1/2) x_1 = (1 + 1/2) · 2 = 2 + 1 = 3.
x_3 = (1 + 1/3) x_2 = (1 + 1/3) · 3 = 3 + 1 = 4.
x_4 = (1 + 1/4) x_3 = (1 + 1/4) · 4 = 4 + 1 = 5.
x_5 = (1 + 1/5) x_4 = (1 + 1/5) · 5 = 5 + 1 = 6.
Notice that in the previous example, we gave an explicit definition of x_0. This is called an
initial condition. In order to specify a sequence, a recurrence relation needs one or more initial
conditions. Without them, we have an abstract definition of a sequence, but cannot compute
any values since there is no “starting point.” Also note that different initial conditions can be
specified for the same recurrence relation, resulting in different sequences being generated.
When we find an explicit formula (or closed formula) for a recurrence relation, we say we have
solved the recurrence relation.
Example 6.7. Given the values we computed in Example 6.6, it seems relatively clear that
x_n = n + 1 is a solution for that recurrence relation.
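A quick way to gather more evidence for a guessed closed form is to compute many terms from the recurrence and compare them with the formula. The Java sketch below does this for the recurrence of Example 6.6 and the guess x_n = n + 1. The names and the tolerance 1e-9 are our choices; a tolerance is needed because the computation uses floating-point arithmetic. Agreement on finitely many terms is only evidence, not a proof.

```java
final class RecurrenceCheck {
    // Computes x_n from x_0 = 1 and x_k = (1 + 1/k) x_{k-1} by iterating the recurrence.
    static double byRecurrence(int n) {
        double x = 1.0; // initial condition x_0 = 1
        for (int k = 1; k <= n; k++) {
            x = (1.0 + 1.0 / k) * x;
        }
        return x;
    }

    // Returns true if the recurrence agrees with the guess n + 1 for n = 0, ..., limit.
    static boolean agreesUpTo(int limit) {
        for (int n = 0; n <= limit; n++) {
            if (Math.abs(byRecurrence(n) - (n + 1)) > 1e-9) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreesUpTo(50));
    }
}
```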
Note: It is important to be careful about jumping to conclusions too quickly when solving
recurrence relations. Although it turns out that in the previous example, x_n = n + 1 is the
correct closed form (we will prove it shortly), just because it works for the first 5 terms does
not necessarily imply that the pattern continues. (These comments also apply to other problems
that involve seeing a pattern and finding an explicit formula.)
x_0 = 1, x_n = 5 · x_{n−1}, for n = 1, 2, . . . .
Find a closed form for x_n. (Hint: Start by computing x_1, x_2, x_3, etc. until you see the
pattern.)
x_0 = 1, x_n = n · x_{n−1}, for n = 1, 2, . . . .
(You can verify these with a calculator). At this point it seems relatively
clear that a_n = 2^n.
Evaluation
Did you catch what happened in the previous Evaluate exercise? The ‘obvious’ solution wasn’t
correct. If you missed this, go back and read the solution.
Generally speaking, you need to prove that the closed form is correct. One way to do this
is to plug it back into the recursive definition. If we can plug it into the right hand side of the
recursive definition and are able to simplify it to the left hand side, then it must be a solution.
We also have to verify that it works for the initial condition(s).
As an analogy, how do you know that x = −1 is a solution to the equation x^2 + 2x + 1 = 0?
You plug it in to get (−1)^2 + 2(−1) + 1 = 1 − 2 + 1 = 0. Since we got 0, x = −1 is a solution. We
do something similar for recurrence relations, except that what we are plugging in is a formula
instead of just a number.
x_0 = 1, x_n = (1 + 1/n) x_{n−1}, n = 1, 2, . . . .
Proof: To prove that x_n = n + 1 is a solution for n ≥ 0, we need to show two
things. First, that it works for the initial condition. Since x_0 = 1 = 0 + 1, it works
for the initial condition. Second, that if we plug it into the right hand side of the
recursive definition, we can simplify it to x_n. Doing so, we get
(1 + 1/n) x_{n−1} = (1 + 1/n)((n − 1) + 1)
= ((n + 1)/n) · n
= n + 1
= x_n
⋆Evaluate 6.14. Determine what ferzle(n) (below) returns for n = 0, 1, 2, 3, 4 and then
re-write ferzle without using recursion, making it as efficient as possible. (See footnote a.)
int ferzle(int n) {
    if (n <= 0) {
        return 3;
    } else {
        return ferzle(n - 1) + 2;
    }
}
int ferzle(int n) {
return 2*n+3;
}
Evaluation
Footnote a: Although we have not formally covered recursion yet, we expect that you have seen it
before and know enough to follow this example.
⋆Exercise 6.15. Fix the code from the solution given in Evaluate 6.14 so that it still uses
the closed form, but works correctly for all values of n.
int ferzle ( int n ) {
Example 6.16. The Fibonacci sequence is a sequence of numbers that is of interest in various
mathematical and computing applications. It is defined using the following recurrence
relation (see footnote a):
f_n = 0 if n = 0,
f_n = 1 if n = 1,
f_n = f_{n−1} + f_{n−2} if n > 1.
In words, each Fibonacci number (beyond the first two) is the sum of the previous two. The
first few are f_0 = 0, f_1 = 1,
f_2 = f_1 + f_0 = 1 + 0 = 1,
f_3 = f_2 + f_1 = 1 + 1 = 2,
f_4 = f_3 + f_2 = 2 + 1 = 3,
f_5 = f_4 + f_3 = 3 + 2 = 5,
f_6 = f_5 + f_4 = 5 + 3 = 8,
f_7 = f_6 + f_5 = 8 + 5 = 13.
Later we will see the closed form for the Fibonacci sequence. If you are really adventurous,
you might consider trying to determine it yourself. But be warned: It is not a simple formula
that you will come up with by just looking at some of the Fibonacci numbers.
Footnote a: In the remainder of the book, when you see f_k, you should assume it refers to the
k-th Fibonacci number unless otherwise specified.
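The recurrence for the Fibonacci numbers can be turned into a loop that keeps track of two consecutive terms. The following Java sketch is one way to do it; the class and variable names are ours. It uses long, which can hold f_n only up to n = 92, so it is not suitable for larger n.

```java
final class Fibonacci {
    // Returns f_n for n >= 0 by applying f_n = f_{n-1} + f_{n-2} repeatedly.
    static long fib(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        long prev = 0; // f_0
        long curr = 1; // f_1
        // Invariant: after i passes through the loop, prev == f_i and curr == f_{i+1}.
        for (int i = 0; i < n; i++) {
            long next = prev + curr;
            prev = curr;
            curr = next;
        }
        return prev;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 7; n++) {
            System.out.println("f_" + n + " = " + fib(n));
        }
    }
}
```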
• increasing if a_n ≤ a_{n+1} ∀n ∈ N
• decreasing if a_n ≥ a_{n+1} ∀n ∈ N
Some people call these sequences non-decreasing, increasing, non-increasing, and de-
creasing, respectively.
A sequence is called monotonic if it is any of these, and non-monotonic if it is none
of these.
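These definitions quantify over all n, so a program can only test a finite prefix of a sequence. Such a test can show that a sequence is not increasing (or not decreasing), but passing it proves nothing about the terms that were not examined. With that caveat, here is a minimal Java sketch; the names are ours, and only the two non-strict properties are shown. The strict versions would compare with < and > instead.

```java
final class MonotonicPrefix {
    // True if a[n] <= a[n+1] for every pair of consecutive terms in the prefix.
    static boolean increasing(double[] a) {
        for (int n = 0; n + 1 < a.length; n++) {
            if (a[n] > a[n + 1]) {
                return false;
            }
        }
        return true;
    }

    // True if a[n] >= a[n+1] for every pair of consecutive terms in the prefix.
    static boolean decreasing(double[] a) {
        for (int n = 0; n + 1 < a.length; n++) {
            if (a[n] < a[n + 1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] alternating = {1, -1, 1, -1};
        // Neither property holds on this prefix, so the full sequence is non-monotonic.
        System.out.println(increasing(alternating)); // prints false
        System.out.println(decreasing(alternating)); // prints false
    }
}
```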
⋆Question 6.19. Notice in this first example we concluded that the sequence is strictly
increasing since we showed that x_n > x_{n−1}. But according to the definition we need to show
that x_n < x_{n+1}. So did we do something wrong? Explain.
Answer
Example 6.20. Prove that the sequence x_n = 2 + 1/2^n, n = 0, 1, 2, . . . is strictly decreasing.
Proof: We have
x_{n+1} − x_n = (2 + 1/2^{n+1}) − (2 + 1/2^n)
= 1/2^{n+1} − 1/2^n
= −1/2^{n+1}
< 0.
Thus, x_{n+1} − x_n < 0, so x_n > x_{n+1}, i.e., the sequence is strictly decreasing.
⋆Exercise 6.21. Prove that the sequence x_n = (n^2 + 1)/n, n = 1, 2, . . . is strictly increasing.
⋆Exercise 6.22. Decide whether the following sequences are increasing, strictly increasing,
decreasing, strictly decreasing, or non-monotonic. You do not need to prove your answer,
but give a brief justification.
(a) x_n = n, n = 0, 1, 2, . . .
Answer
(b) x_n = (−1)^n n, n = 0, 1, 2, . . .
Answer
(c) x_n = 1/n!, n = 0, 1, 2, . . .
Answer
(d) x_n = n/(n + 1), n = 0, 1, 2, . . .
Answer
(e) x_n = n^2 − n, n = 1, 2, . . .
Answer
(f) x_n = n^2 − n, n = 0, 1, 2, . . .
Answer
(g) x_n = (−1)^n, n = 0, 1, 2, . . .
Answer
(h) x_n = 1 − 1/2^n, n = 0, 1, 2, . . .
Answer
(i) x_n = 1 + 1/2^n, n = 0, 1, 2, . . .
Answer
There are two types of sequences that come up often. We will briefly discuss each.
a, ar, ar^2, ar^3, ar^4, . . . ,
where a (the initial term) and r (the common ratio) are real numbers. That is, a geometric
progression is a sequence in which every term is produced from the preceding one by multiplying
it by a fixed number.
Notice that the first term can be written as ar^0, so like an array in many programming
languages, the terms of a geometric progression are indexed starting at 0. Thus, the n-th term
is ar^{n−1}. If a = 0 then every term is 0. If ar ≠ 0, we can find r by dividing any term by the
previous term.
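The formula ar^{n−1} for the n-th term is easy to get wrong by one, so it can help to see it in code. The following Java sketch (names ours) computes the n-th term, counting the first term as n = 1, and uses the progression with a = 3 and r = 2 as a check.

```java
final class GeometricProgression {
    // Returns the n-th term (n = 1, 2, 3, ...) of a, ar, ar^2, ..., which is a * r^(n-1).
    static double term(double a, double r, int n) {
        return a * Math.pow(r, n - 1);
    }

    public static void main(String[] args) {
        // With a = 3 and r = 2, the first four terms are 3, 6, 12, 24.
        for (int n = 1; n <= 4; n++) {
            System.out.println(term(3, 2, n));
        }
    }
}
```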
3, 6, 12, 24, . . . .
Example 6.27. The fourth term of a geometric progression is 24 and its seventh term is 192.
Find its second term.
Solution: We are given that ar^3 = 24 and ar^6 = 192, for some a and r. Clearly,
ar ≠ 0, and so we find
(ar^6)/(ar^3) = r^3 = 192/24 = 8.
Thus, r = 2. Now, a(2)^3 = 24, giving a = 3. The second term is thus ar = 6.
⋆Exercise 6.28. The 6-th term of a geometric progression is 20 and the 10-th is 320. Find
the absolute value of its third term.
where a (the initial term) and d (the common difference) are real numbers. That is, an
arithmetic progression is a sequence in which every term is produced from the preceding one
by adding a fixed number.
Note: Notice that geometric progressions are essentially a discrete version of an exponential
function and arithmetic progressions are a discrete version of a linear function. One conse-
quence of this is that a sequence cannot be both of these unless it is the sequence a, a, a, . . .
for some a.
Solution: It is easy to see that each term is 3 more than the previous term.
Thus, this is an arithmetic progression with a = 4 and d = 3. Clearly it is therefore
not geometric.
⋆Question 6.32. Tests like the SAT and ACT often have questions such as the following.
23. Given the sequence of numbers 2, 9, 16, 23, what will the 8th term of the
sequence be? (a) 60 (b) 58 (c) 49 (d) 51 (e) 56
Answer
Answer
Now let’s see if you can correctly identify geometric and/or arithmetic sequences.
⋆Question 6.33. Determine whether or not the following sequences are geometric and/or
arithmetic. Explain your answer.
Answer
Answer
(c) The sequence generated by ferzle(n) in Evaluate 6.14 on the non-negative inputs.
Answer
Sums and Products 195
Definition 6.34. Let $\{a_n\}$ be a sequence. Then for $1 \le m \le n$, where $m$ and $n$ are integers,
we define
\[ \sum_{k=m}^{n} a_k = a_m + a_{m+1} + \cdots + a_n. \]
We call $k$ the index of summation and $m$ and $n$ the limits of the summation. More
specifically, $m$ is the lower limit and $n$ is the upper limit. Each $a_k$ is a term of the sum.
Note: We often use i, j, and k as index variables for sums, although any letters can be used.
\[ 1 - y + y^2 - y^3 + y^4 - y^5 + \cdots - y^{99} + y^{100} \]
Solution: This is a lot like the previous exercise, except that every other term
is negative. So how do we get those terms to be negative? The standard trick
relies on the fact that $(-1)^i$ is 1 if $i$ is even and $-1$ if $i$ is odd. Thus, we can
multiply each term by $(-1)^i$ for an appropriate choice of $i$. Since the odd powers
are the negative ones, this is easy:
\[ \sum_{i=0}^{100} (-1)^i y^i \quad\text{or}\quad \sum_{i=0}^{100} (-y)^i \]
Note: You might be tempted to give the following solution to the previous problem:
\[ \sum_{i=0}^{100} -y^i, \]
which is not the correct answer. The bottom line: Always use parentheses in the appropriate
locations, especially when negative numbers are involved!
Note: If you struggled understanding the two solutions to the previous example, it might be
time to review the basic algebra rules involving exponents. We will just give a few of them
here. You can find more extensive lists in an algebra book or various reputable online sources.
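We have already used the fact that if $x \neq 0$, then $x^0 = 1$. In addition, if $x, a, b \in \mathbb{R}$ with
$x > 0$, then
\[ (x^a)^b = x^{ab}, \quad x^a x^b = x^{a+b}, \quad x^{-a} = \frac{1}{x^a}, \quad\text{and}\quad x^{a/b} = \sqrt[b]{x^a} = \left(\sqrt[b]{x}\right)^a. \]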
As with sequences, we are often interested in obtaining closed forms for sums. We will present
several important formulas, along with a few techniques to find closed forms for sums.
since this sum is adding 20 terms, each of which is 1. But notice that
\[ \sum_{k=0}^{19} 1 = \sum_{k=200}^{219} 1 = 20 \]
since both of these sums are also adding 20 terms, each of which is 1. In other words, if the
variable of summation (the k) does not appear in the sum, then the only thing that matters
is how many terms the sum involves.
(a) $\sum_{k=5}^{6} 1 =$
(b) $\sum_{k=20}^{30} 1 =$
(c) $\sum_{k=1}^{100} 1 =$
(d) $\sum_{k=0}^{100} 1 =$
Hopefully you noticed that the previous example and exercise can be generalized as follows.
Proof: This sum has $b - a + 1$ terms since there are that many integers between
$a$ and $b$, inclusive. Since each of the terms is 1, the sum is $b - a + 1$.
Example 6.42. If we apply the previous theorem to the sums in Example 6.39, we would
obtain 20 − 1 + 1 = 20, 19 − 0 + 1 = 20, and 219 − 200 + 1 = 20.
Next is a simple theorem based on the distributive law that you learned in grade school.
Example 6.44. Using Theorems 6.41 and 6.43, we can see that
\[ \sum_{k=5}^{17} 4 = 4 \sum_{k=5}^{17} 1 = 4 \cdot (17 - 5 + 1) = 4 \cdot 13 = 52. \]
(a) $\sum_{k=5}^{6} 5 =$
(b) $\sum_{k=20}^{30} 200 =$
Example 6.47. We can compute the sum from Example 6.44 by using Theorem 6.46 to
obtain
\[ \sum_{k=5}^{17} 4 = (17 - 5 + 1)4 = 52. \]
Both ways of computing this sum are valid, so feel free to use whichever you prefer.
(a) $\sum_{k=20}^{30} 200 =$
(b) $\sum_{k=1}^{100} 9 =$
(c) $\sum_{k=0}^{100} 9 =$
⋆Evaluate 6.49. Compute $\sum_{k=25}^{75} 10$.
Evaluation
The following sum comes up often and should be committed to memory. The proof involves
a nice technique that adds the terms in the sum twice, in a different order, and then divides the
result by two. This is known as Gauss’ trick.
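Proof: Let $S = \sum_{k=1}^{n} k$ for shorthand. Then we can see that
\[ S = 1 + 2 + 3 + \cdots + n \quad\text{and}\quad S = n + (n - 1) + \cdots + 1. \]
Adding these two equations term by term gives
\[
\begin{aligned}
S &= 1 + 2 + \cdots + n \\
S &= n + (n - 1) + \cdots + 1 \\
2S &= (n + 1) + (n + 1) + \cdots + (n + 1) \\
&= n(n + 1),
\end{aligned}
\]
since there are $n$ terms. Dividing by 2, we obtain $S = \frac{n(n + 1)}{2}$, as was to be
proved.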
Example 6.51.
\[ \sum_{k=1}^{10} k = \frac{10(10 + 1)}{2} = \frac{10 \cdot 11}{2} = 55. \]
(a) $\sum_{k=1}^{20} k =$
(b) $\sum_{k=1}^{100} k =$
(c) $\sum_{k=1}^{1000} k =$
⋆Evaluate 6.53. Compute $\sum_{k=1}^{30} k$.
Solution 1: $\sum_{k=1}^{30} k = 29 * 30/2 = 435$.
Evaluation
Solution 2: $\sum_{k=1}^{30} k = k \sum_{k=1}^{30} 1 = k(30 - 1 + 1) = 30k$.
Evaluation
Note: A common error is to think that the sum of the first n integers is n(n − 1)/2 instead of
n(n + 1)/2. Whenever I use the formula, I double check my memory by computing 1 + 2 + 3.
In this case, n = 3. So is the correct answer 3 · 2/2 = 3 or 3 · 4/2 = 6? Clearly it is the latter.
Then I know that the correct formula is n(n + 1)/2. You can use any positive value of n to
check the formula. I use 3 out of habit.
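The sanity check described in this note can also be automated. The following is a minimal Java sketch, not code from the book: the class name GaussCheck and its method names are hypothetical, chosen only for illustration. It compares a brute-force sum against both the correct closed form and the commonly misremembered one.

```java
// Hypothetical helper class (an assumption for illustration, not from the book).
class GaussCheck {
    // Adds 1 + 2 + ... + n directly, one term at a time.
    static long sumByLoop(int n) {
        long sum = 0;
        for (int k = 1; k <= n; k++) {
            sum += k;
        }
        return sum;
    }

    // The closed form n(n + 1)/2.
    static long closedForm(int n) {
        return (long) n * (n + 1) / 2;
    }

    // The common misremembering n(n - 1)/2.
    static long commonMistake(int n) {
        return (long) n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        int n = 3;
        System.out.println(sumByLoop(n));     // 1 + 2 + 3 = 6
        System.out.println(closedForm(n));    // 3 * 4 / 2 = 6
        System.out.println(commonMistake(n)); // 3 * 2 / 2 = 3
    }
}
```

Only the second formula agrees with the loop, which is the same conclusion the note reaches by hand.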
⋆Question 6.54. Is it true that $\sum_{k=0}^{n} k = \sum_{k=1}^{n} k = \frac{n(n + 1)}{2}$? Explain.
Answer
Theorem 6.55. If $\{x_k\}$ and $\{y_k\}$ are sequences, then for any $n \in \mathbb{Z}^+$,
\[ \sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i. \]
Example 6.56.
\[ \sum_{i=1}^{20} (i + 5) = \sum_{i=1}^{20} i + \sum_{i=1}^{20} 5 = \frac{20 \cdot 21}{2} + 5 \cdot 20 = 210 + 100 = 310. \]
⋆Exercise 6.58. Prove that the sum of the first $n$ odd integers is $n^2$.
The following example contains something called a telescoping series. It demonstrates that
evaluating a telescoping series is fairly simple.
Example 6.59. Let $\{a_k\}$ be a sequence of real numbers. Show that $\sum_{i=1}^{n} (a_i - a_{i-1}) = a_n - a_0$.
Proof: We can see that
\[
\begin{aligned}
\sum_{i=1}^{n} (a_i - a_{i-1}) &= \left(\sum_{i=1}^{n} a_i\right) - \left(\sum_{i=1}^{n} a_{i-1}\right) \\
&= (a_1 + a_2 + \cdots + a_{n-1} + a_n) - (a_0 + a_1 + a_2 + \cdots + a_{n-1}) \\
&= a_1 + a_2 + \cdots + a_{n-1} + a_n - a_0 - a_1 - a_2 - \cdots - a_{n-1} \\
&= (a_1 - a_1) + (a_2 - a_2) + \cdots + (a_{n-1} - a_{n-1}) + a_n - a_0 \\
&= a_n - a_0.
\end{aligned}
\]
Example 6.60. Given what we know so far, how can we compute the following:
\[ \sum_{k=50}^{100} k = ? \]
It turns out that this is not that hard. Notice that it is almost a sum we know. We know
how to compute $\sum_{k=1}^{100} k$, but that has too many terms. Can we just subtract those terms to get
the answer? What terms don't we want? Well, we don't want terms 1 through 49. But that
is just $\sum_{k=1}^{49} k$. In other words,
\[
\begin{aligned}
\sum_{k=50}^{100} k &= \sum_{k=1}^{100} k - \sum_{k=1}^{49} k \\
&= \frac{100 \cdot 101}{2} - \frac{49 \cdot 50}{2} \\
&= 5050 - 1225 = 3825.
\end{aligned}
\]
(a) $\sum_{k=10}^{20} k =$
(b) $\sum_{k=21}^{40} k =$
Solution 1:
\[ \sum_{k=30}^{100} k = \sum_{k=1}^{100} k - \sum_{k=1}^{30} k = 100 \cdot 101/2 - 30 \cdot 31/2 = 5050 - 465 = 4585 \]
Evaluation
Solution 2:
\[ \sum_{k=30}^{100} k = \sum_{k=1}^{100} k - \sum_{k=1}^{30} k = 99 \cdot 100/2 - 29 \cdot 30/2 = 4950 - 435 = 4515 \]
Evaluation
Solution 3:
\[ \sum_{k=30}^{100} k = \sum_{k=1}^{100} k - \sum_{k=1}^{29} k = 100 \cdot 101/2 - 29 \cdot 30/2 = 5050 - 435 = 4615 \]
Evaluation
⋆Question 6.63. Explain why the following computation is incorrect. Then explain why
the answer is correct even with the error(s).
\[ \sum_{k=30}^{100} k = \sum_{k=1}^{100} k - \sum_{k=1}^{30} k = 100 \cdot 101/2 - 29 \cdot 30/2 = 5050 - 435 = 4615 \]
Answer
We will prove Theorem 6.64 in the chapter on mathematical induction since that is perhaps the
easiest way to prove these results. It is probably a good idea to attempt to commit the first two
of these sums to memory since they come up on occasion.
⋆Question 6.65. Why does the third formula from Theorem 6.64 have a lower index of 2
(instead of 1 or 0, for instance)?
Answer
Sometimes double sums are necessary to express a summation. As a general rule, these should
be evaluated from the inside out.
Example 6.67. Evaluate the double sum $\sum_{i=1}^{n} \sum_{j=1}^{n} 1$.
Solution: We have $\sum_{i=1}^{n} \sum_{j=1}^{n} 1 = \sum_{i=1}^{n} n = n \cdot n = n^2$.
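A double sum corresponds directly to a pair of nested loops, and evaluating it "from the inside out" means finishing the inner loop before adding its result to the outer total. The following Java sketch illustrates this for the sum in Example 6.67. The class name DoubleSum and the method countPairs are hypothetical names introduced here, not taken from the book.

```java
// Hypothetical illustration of Example 6.67 (names are assumptions).
class DoubleSum {
    // Computes the sum over i = 1..n of the sum over j = 1..n of 1.
    static long countPairs(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            // Inner sum: adds 1 a total of n times, so it always equals n.
            long inner = 0;
            for (int j = 1; j <= n; j++) {
                inner += 1;
            }
            // Outer sum: adds the inner result once for each value of i.
            total += inner;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countPairs(7)); // 7 * 7 = 49
    }
}
```

The inner loop contributes n on each of the n outer iterations, matching the closed form $n^2$.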
(b) $\sum_{i=1}^{n} \sum_{j=1}^{i} j =$
(c) $\sum_{i=1}^{n} \sum_{j=1}^{n} ij =$
There is a formula for the sum of a geometric sequence, sometimes referred to as a geometric
series.
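Proof: First, let $S = \sum_{k=0}^{n} x^k$. Then
\[ xS = x \sum_{k=0}^{n} x^k = \sum_{k=0}^{n} x^{k+1} = \sum_{k=1}^{n+1} x^k. \]
So
\[
\begin{aligned}
xS - S &= \sum_{k=1}^{n+1} x^k - \sum_{k=0}^{n} x^k \\
&= (x^1 + x^2 + \cdots + x^n + x^{n+1}) - (x^0 + x^1 + \cdots + x^n) \\
&= x^{n+1} - x^0 = x^{n+1} - 1.
\end{aligned}
\]
So we have $(x - 1)S = x^{n+1} - 1$, so $S = \frac{x^{n+1} - 1}{x - 1}$, since $x \neq 1$.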
Example 6.70.
\[ \sum_{k=0}^{n} 3^k = \frac{1 - 3^{n+1}}{1 - 3} = \frac{1 - 3^{n+1}}{-2} = \frac{3^{n+1} - 1}{2}. \]
Example 6.71.
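\[ \sum_{k=0}^{n} \frac{1}{5^k} = \sum_{k=0}^{n} \frac{1^k}{5^k} = \sum_{k=0}^{n} \left(\frac{1}{5}\right)^k = \frac{1 - (1/5)^{n+1}}{1 - 1/5} = \frac{1 - 1/(5^{n+1})}{4/5} = \frac{5}{4}\left(1 - \frac{1}{5^{n+1}}\right) = \frac{5}{4} - \frac{1}{4 \cdot 5^n}. \]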
⋆Exercise 6.74. Find the sum of the following geometric series. Assume $y \neq 1$.
(a) $1 + y + y^2 + y^3 + \cdots + y^{100} =$
(b) $1 - y + y^2 - y^3 + y^4 - y^5 + \cdots - y^{99} + y^{100} =$
(c) $1 + y^2 + y^4 + y^6 + \cdots + y^{100} =$
\[ x^N - 1 = (x - 1)(x^{N-1} + x^{N-2} + \cdots + x + 1). \]
\[ x^2 - 1 = (x - 1)(x + 1), \]
\[ x^3 - 1 = (x - 1)(x^2 + x + 1), \text{ and} \]
\[ x^4 - 1 = (x - 1)(x^3 + x^2 + x + 1). \]
$x^5 - 1 =$
Let’s use the technique from the proof of Theorem 6.69 in the special case where x = 2.
\[ 2^0 + 2^1 + 2^2 + 2^3 + 2^4 + \cdots + 2^n. \]
Solution: We could just use the formula from Theorem 6.69, but that would be
boring. Instead, let's work it out. Let $S = 2^0 + 2^1 + 2^2 + 2^3 + \cdots + 2^n$. Then $2S =$ \underline{\hspace{4cm}}
\[
\begin{aligned}
S = 2S - S &= (2^1 + 2^2 + 2^3 + \cdots + 2^n + 2^{n+1}) \\
&\quad - (2^0 + 2^1 + 2^2 + 2^3 + \cdots + 2^n) \\
&= \underline{\hspace{4cm}} \\
&= 2^{n+1} - 1.
\end{aligned}
\]
Thus, $\sum_{k=0}^{n} 2^k = 2^{n+1} - 1$.
Since powers of 2 are very prominent in computer science, you should definitely commit the
formula from the previous example to memory.
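Because this identity shows up so often in programming, it can be handy to see it checked in code. The following Java sketch is only an illustration under stated assumptions: the class name PowersOfTwo and its methods are hypothetical, and the argument is assumed to satisfy 0 ≤ n ≤ 61 so that every intermediate value fits in a signed 64-bit long.

```java
// Hypothetical check of the identity 2^0 + 2^1 + ... + 2^n = 2^(n+1) - 1.
// Assumes 0 <= n <= 61 so that nothing overflows a long.
class PowersOfTwo {
    // Adds the powers of 2 one at a time.
    static long sumByLoop(int n) {
        long sum = 0;
        long power = 1; // holds 2^k at the start of iteration k
        for (int k = 0; k <= n; k++) {
            sum += power;
            power *= 2;
        }
        return sum;
    }

    // Closed form: shifting 1 left by (n + 1) bits gives 2^(n+1).
    static long closedForm(int n) {
        return (1L << (n + 1)) - 1;
    }

    public static void main(String[] args) {
        System.out.println(sumByLoop(10));  // 1 + 2 + ... + 1024 = 2047
        System.out.println(closedForm(10)); // 2^11 - 1 = 2047
    }
}
```

In binary, the sum is a run of n + 1 ones, which is one less than a 1 followed by n + 1 zeros. That is another way to see why the closed form holds.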
Together, Theorems 6.43 and 6.69 imply the following:
Theorem 6.79. Let $r \neq 1$. Then $\sum_{k=0}^{n} ar^k = \frac{a - ar^{n+1}}{1 - r}$.
⋆Fill in the details 6.80. Use Theorems 6.43 and 6.69 to prove Theorem 6.79.
\[ = \frac{a - ar^{n+1}}{1 - r}. \]
⋆Exercise 6.81. Prove Theorem 6.79 without using Theorems 6.43 and 6.69. In other
words, mimic the proof of Theorem 6.69.
Notice that if $|r| < 1$ then $r^n$ gets closer to 0 the larger $n$ gets. More formally, if $|r| < 1$,
$\lim_{n \to \infty} r^n = 0$. This implies the following (which we will not formally prove beyond what we have
already said here).
Example 6.83. A fly starts at the origin and goes 1 unit up, 1/2 unit right, 1/4 unit down,
1/8 unit left, 1/16 unit up, etc., ad infinitum. In what coordinates does it end up?
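Solution: Its $x$ coordinate is
\[ \frac{1}{2} - \frac{1}{8} + \frac{1}{32} - \cdots = \frac{1}{2}\left(-\frac{1}{4}\right)^0 + \frac{1}{2}\left(-\frac{1}{4}\right)^1 + \frac{1}{2}\left(-\frac{1}{4}\right)^2 + \cdots = \frac{\frac{1}{2}}{1 - \left(-\frac{1}{4}\right)} = \frac{2}{5}. \]
Its $y$ coordinate is
\[ 1 - \frac{1}{4} + \frac{1}{16} - \cdots = \left(-\frac{1}{4}\right)^0 + \left(-\frac{1}{4}\right)^1 + \left(-\frac{1}{4}\right)^2 + \cdots = \frac{1}{1 - \left(-\frac{1}{4}\right)} = \frac{4}{5}. \]
Therefore, the fly ends up in $\left(\frac{2}{5}, \frac{4}{5}\right)$.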
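The limiting position can also be approximated numerically by simulating the fly's moves. The following Java sketch is an illustration only. The class name FlyWalk and the method position are hypothetical, and double-precision arithmetic is assumed to be accurate enough for this purpose.

```java
// Hypothetical simulation of the fly from Example 6.83 (names are assumptions).
class FlyWalk {
    // Returns {x, y} after the given number of moves.
    // Directions repeat in the order up, right, down, left,
    // and each move is half as long as the one before it.
    static double[] position(int moves) {
        int[] dx = {0, 1, 0, -1};
        int[] dy = {1, 0, -1, 0};
        double x = 0.0;
        double y = 0.0;
        double step = 1.0;
        for (int m = 0; m < moves; m++) {
            x += dx[m % 4] * step;
            y += dy[m % 4] * step;
            step /= 2.0;
        }
        return new double[] {x, y};
    }

    public static void main(String[] args) {
        double[] p = position(60);
        // Prints values very close to 2/5 and 4/5.
        System.out.println(p[0] + ", " + p[1]);
    }
}
```

After a few dozen moves the remaining steps are far smaller than the precision of a double, so the simulated position agrees with the exact answer from the infinite geometric series to many decimal places.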
Product notation is very similar to sum notation, except we multiply the terms instead of
adding them.
Definition 6.85. Let $\{a_n\}$ be a sequence. Then for $1 \le m \le n$, where $m$ and $n$ are integers,
we define
\[ \prod_{k=m}^{n} a_k = a_m a_{m+1} \cdots a_n. \]
As with sums, we call k the index and m and n the lower limit and upper limit, respec-
tively.
Example 6.86. Notice that $n! = \prod_{k=1}^{n} k$.
Note: An alternative way to express the variable and limits of sums and products is
\[ \sum_{m \le k \le n} a_k \quad\text{instead of}\quad \sum_{k=m}^{n} a_k \]
and
\[ \prod_{m \le k \le n} a_k \quad\text{instead of}\quad \prod_{k=m}^{n} a_k \]
(c) Prove that your closed form is correct by following the technique from Example 6.11.
⋆Question 6.5. Let $\{a_n\}$ be a sequence that gives the number of steps required to run some
algorithm on an input of size $n$ (e.g. imagine the input is an array). For instance, it takes $a_1$
steps to run the algorithm if the input is an array of size 1, $a_2$ steps if the input is an array of
size 2, etc. Would you expect this sequence to be increasing, decreasing, or neither? Explain.
⋆Question 6.6. Are geometric progressions always, sometimes, or never monotonic? Explain.
Similarly, are they always, sometimes, or never increasing?
⋆Question 6.7. Are arithmetic progressions always, sometimes, or never monotonic? Explain.
Similarly, are they always, sometimes, or never increasing?
⋆Question 6.8. Give an example (not from the book) of each of the following.
⋆Question 6.14. Compute $\sum_{k=1}^{23} \frac{11}{(-7)^k}$. Simplify your answer (although you don't need to
compute the actual number). (I have thrown several subtle tricks at you on this one, but if you read
carefully and apply what you know, you should be able to do it!)
6.4 Problems
Problem 6.1. Find at least three different sequences that begin with 1, 3, 7 whose terms are
generated by a simple formula or rule. By different, I mean none of the sequences can have exactly
the same terms. In other words, your answer cannot simply be three different ways to generate
the same sequence.
22 n log2 n
X X X n
(b) (2j+1 − 2j ) (e) k(k − 1) (h)
j=5 k=1
2i
i=0
n n
X n X
X j
i X
X
(c) 5k (f) 5j (i) 1
k=0 j=1 i=1 j=1 k=1
Problem 6.7. Here is a standard interview question for prospective computer programmers: You
are given a list of 1,000,001 positive integers from the set $\{1, 2, \ldots, 1{,}000{,}000\}$. In the list, every
member of $\{1, 2, \ldots, 1{,}000{,}000\}$ is listed once, except for $x$, which is listed twice, and the numbers
are listed in some unknown order. How do you find what $x$ is without doing a 1,000,000 step
search (e.g. check if 1 is on the list twice, then check if 2 is on the list twice, etc.)? How much
faster is your solution than the naive solution?
\[ T_n = 1^2 - 2^2 + 3^2 - 4^2 + \cdots + (-1)^{n-1} n^2. \]
\[ 1 + 3 + 5 + \cdots + (2n - 1) = n^2. \]
2 + 4 + 6 + · · · + 2n
Problem 6.12. Legend says that the inventor of the game of chess, Sissa ben Dahir, asked the
King Shirham of India to place a grain of wheat on the first square of the chessboard, 2 on the
second square, 4 on the third square, 8 on the fourth square, etc.
(a) How many grains of wheat are to be put on the last (64-th) square?
(b) How many grains, total, are needed in order to satisfy the greedy inventor?
(c) Given that 15 grains of wheat weigh approximately one gram, what is the approximate weight,
in kg, of the wheat needed?
(d) Given that the annual production of wheat is 350 million tonnes, how many years, approx-
imately, are needed in order to satisfy the inventor (assume that production of wheat stays
constant)?
Problem 6.13. It is easy to see that we can define n! recursively by defining 0! = 1, and if n > 0,
n! = n · (n − 1)!. Does the following method correctly compute n!? If not, state what is wrong
with it and fix it.
int factorial(int n) {
    return n * factorial(n - 1);
}
Problem 6.14. Find a closed formula for $\sum_{k=1}^{n} k^2(k - 1)$. Simplify the formula as much as possible.
Problem 6.15. Find a closed formula for $\sum_{k=1}^{n} k \cdot k!$. (Hint: What is $(k + 1)! - k!$, and why does
it matter?) Simplify the formula as much as possible.
Problem 6.16. Prove that for $n \ge 1$, $\sum_{k=1}^{n} k^3 = \left(\sum_{k=1}^{n} k\right)^2$.
Problem 6.17. A student turned in the code below (which does as its name suggests). I gave
them a ‘C’ on the assignment because although it works, it is very inefficient.
int sumFromOneToN(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++) {
        sum = sum + i;
    }
    return sum;
}
(a) Write the ‘A’ version of the algorithm (in other words, a more efficient version). You can
assume that n ≥ 1.
Problem 6.18. A student turned in the code below (which does as its name suggests). I gave
them a ‘C’ on the assignment because although it works, it is very inefficient.
int sumFromMToN(int m, int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++) {
        sum = sum + i;
    }
    for (int i = 1; i < m; i++) {
        sum = sum - i;
    }
    return sum;
}
(a) Write the ‘A’ version of the algorithm (in other words, a more efficient version). You can
assume that 1 ≤ m ≤ n.
Problem 6.19. How many times does the function foo get called in the following code?
for (int i = 0; i < n; i++) {
    for (int j = 0; j < i; j++) {
        foo(i, j);
    }
}
(a) What function does the code compute? In other words, what is g(n)?
(a) What function does the code compute? In other words, what is h(n)?
218 Chapter 7
Note: The “=” in the statement “f (n) = O(g(n))” should be read and thought of as “is”,
not “equals.” You can think of it as a one-way equals. So saying f (n) = O(g(n)) is not
the same thing as saying O(g(n)) = f (n), for instance (with the latter statement not really
making sense).
An alternative notation is to write f (n) ∈ O(g(n)) instead of f (n) = O(g(n)). It turns
out that O(g(n)) is actually the set of all functions that grow no faster than g(n), so the set
notation is actually in some sense more correct. The “=” notation is used because it comes in
handy when doing algebra. You can essentially think of these as being two different notations
(= and ∈) for the same thing. Similar statements are true for the other asymptotic notations.
\[ n^2 + n \le n^3 + n^3 = 2n^3 \]
Thus,
\[ n^2 + n \le 2n^3 \text{ for all } n \ge 1. \]
Thus, we have shown that $n^2 + n = O(n^3)$ by definition of Big-O, with $n_0 = 1$,
and $c = 2$.
The following fact is a generalization of what was used in the previous example. It is used
often in proofs involving asymptotic notation.
Proof: We will not provide a proof, but it should be fairly clear intuitively that
this is true. If you cannot see why this is true, you should work out a few examples
to convince yourself.
Sometimes the easiest way to prove that f (n) = O(g(n)) is to take c to be the sum of
the positive coefficients of f (n), although this trick doesn’t always work. We can usually easily
eliminate the lower order terms with negative coefficients if we make the appropriate assumption.
Let’s see how to do this in the next few examples.
Asymptotic Notation 219
Also notice that if n ≥ 1, then n ≥ 0. Thus, our first step is still valid if we assume
n ≥ 1 since n ≥ 1 is a stronger condition than n ≥ 0. Putting this all together, if
we assume n ≥ 1, then
Since we have shown that $3n^3 - 2n^2 + 13n - 15 \le 16n^3$ for all $n \ge 1$, we have
proven that $3n^3 - 2n^2 + 13n - 15 = O(n^3)$.
We used n0 = 1 and c = 16 in our proof. It is not necessary to explicitly point this
out in our proof, though. We only do so to help you see the connection between
the proof and the definition of Big-O.
Solution: If n ≥ 1,
Evaluation
Note: The values of the constants used in the proofs do not need to be the best possible. For
instance, if you can show that f (n) ≤ 345 g(n) for all n ≥ 712, then f (n) = O(g(n)). It
doesn’t matter whether or not it is actually true that f (n) ≤ 3 g(n) for all n ≥ 5.
⋆Question 7.8. Answer each of the following questions related to Example 7.5. Include a
brief justification.
Answer
Answer
Answer
Answer
⋆Exercise 7.9. Prove that $5n^5 - 4n^4 + 3n^3 - 2n^2 + n = O(n^5)$. (Hint: Use the same
techniques you saw in Example 7.5.)
⋆Question 7.10. What values did you use for n0 and c in your solution to Exercise 7.9?
n0 = , c=
Things are not always so easy. How would you show that $(\sqrt{2})^{\log n} + \log^2 n + n^4 = O(2^n)$?
Or that $n^2 = O(n^2 - 13n + 23)$? In general, we simply (or in some cases with much effort) find
values c and n0 that work. This gets easier with practice.
Big-O is a notation to express the idea that one function is an upper bound for another
function. The next notation allows us to express the opposite idea—that one function is a lower
bound for another function.
⋆Exercise 7.13. Prove that $4n^2 + n + 1 = \Omega(n^2)$. (This one should be really easy: follow
the technique from the previous example and don't overthink it.)
⋆Question 7.14. What values did you use for n0 and c in your solution to Exercise 7.13?
n0 = , c=
Proving that f (n) = Ω(g(n)) often requires more thought than proving that f (n) = O(g(n)).
Although the lower-order terms with positive coefficients can be easily dealt with, those with
negative coefficients make things a bit more complicated. Often, we have to pick c < 1. A good
strategy is to pick a value of c that you think will work, and determine which value of n0 is
needed. Being able to do some algebra helps. As it turns out, we won’t have to worry a whole
lot about this, though. We will see a different technique to prove bounds shortly that, when it
works, makes things much easier.
Our third notation allows us to express the idea that two functions grow at the same rate.
\[ n^2 \le n^2 + 5n + 7 \le 13n^2, \]
⋆Question 7.17. In the previous example, we combined two inequalities. One of them
assumed n ≥ 0, the other assumed that n ≥ 1. In the combined inequality, we said it held if
n ≥ 1. Is that really O.K., or did we make a subtle error?
Answer
Using the definition of Θ can be inconvenient since it involves a double inequality. Luckily,
the following theorem provides us with an easier approach.
Theorem 7.18. If f and g are nonnegative functions, then f (n) = Θ(g(n)) if and only if
f (n) = O(g(n)) and f (n) = Ω(g(n)).
Proof: The result follows almost immediately from the definitions. We leave
the details to the reader.
This theorem implies that no new strategies are necessary for Θ proofs since they can be split
into two proofs—a Big-O proof and a Ω proof. Let’s see an example of this approach.
How do you use asymptotic notation to express that f (n) grows slower than g(n)? Saying
f (n) = O(g(n)) doesn’t work, because that only tells us that f (n) grows no faster than g(n). It
might grow slower, but it also might grow at the same rate. With the notation we have, the best
way to express this idea is to say that $f(n) = O(g(n))$ and $f(n) \neq \Theta(g(n))$. But that is awkward.
Let’s learn a new notation for this instead. For technical reasons that we won’t get into, this
notation has to be defined somewhat differently than the others.
Definition 7.20. Let f and g be nonnegative functions, with g being eventually non-zero.
We say that $f(n)$ is little-o of $g(n)$, written $f(n) = o(g(n))$ iff
\[ \lim_{n \to \infty} \frac{f(n)}{g(n)} = 0. \]
Example 7.21. You should be able to convince yourself that $3n + 2 = o(n^2)$, but $3n + 2 \neq o(n)$.
Similarly, $n^2 + n + 4 = o(n^3)$ and $n^2 + n + 4 = o(n^4)$, but $n^2 + n + 4 \neq o(n^2)$ and $n^2 + n + 4 \neq o(n)$.
If you are not comfortable with limits you can still convince yourself of these statements
by thinking of the informal definition. For instance, $n^2 + n + 4$ grows slower than $n^3$ so
$n^2 + n + 4 = o(n^3)$. On the other hand, $n^2 + n + 4$ grows at the same rate (so not slower
than) $n^2$, so $n^2 + n + 4 \neq o(n^2)$.
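One informal way to get a feel for the limit in Definition 7.20 is to print the ratio f(n)/g(n) for increasingly large n. The following Java sketch does this for the first pair of claims in Example 7.21. It is numerical evidence, not a proof, and the class name LittleO and its method names are hypothetical.

```java
// Hypothetical numerical illustration of little-o (names are assumptions).
// Watching a ratio shrink suggests a limit of 0 but does not prove it.
class LittleO {
    // Ratio for the claim 3n + 2 = o(n^2): tends toward 0 as n grows.
    static double ratioAgainstSquare(double n) {
        return (3 * n + 2) / (n * n);
    }

    // Ratio for the claim 3n + 2 != o(n): tends toward 3, not 0.
    static double ratioAgainstLinear(double n) {
        return (3 * n + 2) / n;
    }

    public static void main(String[] args) {
        for (double n = 10; n <= 1e6; n *= 10) {
            System.out.println(n + ": " + ratioAgainstSquare(n)
                    + "  " + ratioAgainstLinear(n));
        }
    }
}
```

The first column of ratios keeps shrinking toward 0, while the second settles near 3, which matches the two statements in the example.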
⋆Question 7.22. Why do we require that g(n) be eventually non-zero in the definition of
little-o?
Answer
Little-omega (ω) can be defined similarly to little-o, but the value of the limit is ∞ instead of
0. We won’t use ω very often.
⋆Question 7.23. Big-O notation is analogous to ≤ in certain ways. If so, what would be
the similar analogies for o and ω?
Answer
Note:
• It is important to remember that an O-bound is only an upper bound, and that it may
or may not be a tight bound. So if $f(n) = O(n^2)$, it is possible that $f(n) = 3n^2 + 4$,
$f(n) = \log n$, or any other function that grows no faster than $n^2$. But we also know
that $f(n) \neq n^3$ or any other function that grows faster than $n^2$.
• Conversely, an Ω-bound is only a lower bound. Thus, if $f(n) = \Omega(n \log n)$, it might be
the case that $f(n) = 2^n$, but we know that $f(n) \neq 3n$, for instance.
• Unlike the others, Θ-bounds are precise. So, if $f(n) = \Theta(n^2)$, then we know that $f$ has
quadratic growth rate. It might be that $f(n) = 3n^2$, $2n^2 - 43n - 4$, or even $n^2 + n \log n$.
But we are certain that the fastest growing term of $f$ is $c\,n^2$ for some constant $c$.
⋆Question 7.24. Answer the following questions about the asymptotic notations.
Solution 1: We can first eliminate all of the constants since they become
irrelevant as $n$ grows large enough. This leaves us with $n^k + n^{k-1} + \cdots + n =
O(n^k)$. Next we can eliminate all terms growing slower than $n^k$, since they
also become irrelevant as $n$ grows. This leaves us with $n^k = O(n^k)$, and
since they are the same, they are effectively theta of each other, and by
definition, anything that is theta of something is also omega and O, so
we can correctly say that $n^k = O(n^k)$, thus proving that $a_k n^k + a_{k-1} n^{k-1} + \cdots +
a_1 n + a_0 = O(n^k)$.
Evaluation
Solution 2: Let $c = \sum_{i=0}^{k} |a_i|$. Then if $n \ge 1$,
Evaluation
⋆Exercise 7.26. Assume that $f(n) = O(n^2)$ and $g(n) = O(n^3)$. What can you say about
the relative growth rates of $f(n)$ and $g(n)$? In particular, does $g(n)$ grow faster than $f(n)$?
Answer
Keep in mind that asymptotic notation only allows you to compare the asymptotic behavior
of functions. Except for Θ-notation, it only provides a bound on the growth rate. For instance,
knowing that f (n) = O(g(n)) only tells you that f (n) grows no faster than g(n). It is possible
that f (n) grows a lot slower than g(n).
⋆Exercise 7.27. Let’s test your understanding of the material so far. Answer each of the
following true/false questions, giving a very brief justification/counterexample. Justifications
can appeal to a definition and/or theorem. For counterexamples, use simple functions. For
instance, $f(n) = n$ and $g(n) = n^2$.
(c) If f (n) = O(g(n)), then f (n) grows at the same rate as g(n)
Theorem 7.28. The transitive property holds for Big-O, Θ, and Ω. That is,
Proof: You will prove the transitive property of Big-O in Exercise 7.49. The
proofs of the other two are very similar.
Theorem 7.28 is pretty intuitive. For instance, when applied to Big-O notation, Theorem 7.28
is essentially stating that if g(n) is an upper bound on f (n) and h(n) is an upper bound on g(n),
then h(n) is an upper bound for f (n). Put another way, if f (n) grows no faster than g(n) and
g(n) grows no faster than h(n), then f (n) grows no faster than h(n). This makes perfect sense if
you think about it for a few minutes.
Example 7.29. Let's take it for granted that $4n^2 + 3n + 17 = O(n^3)$ and $n^3 = O(n^4)$ (both
of which you should be able to easily prove at this point). According to Theorem 7.28, we
can conclude that $4n^2 + 3n + 17 = O(n^4)$.
Proof: We will give the proof for Big-O notation. The other two proofs are
similar. Assume f (n) = O(g(n)). Then by the definition of Big-O, there are
positive constants c and n0 such that f (n) ≤ c g(n) for all n ≥ n0 . Thus, if
n ≥ n0 ,
k f (n) ≤ k c g(n) = c′ g(n),
where c′ = k c is a positive constant. By the definition of Big-O, kf (n) = O(g(n)).
Example 7.31. Example 7.19 showed that $\frac{1}{2} n^2 + 3n = \Theta(n^2)$. We can use Theorem 7.30 to
conclude that $n^2 + 6n = \Theta(n^2)$ since $n^2 + 6n = 2\left(\frac{1}{2} n^2 + 3n\right)$.
Perhaps now is a good time to point out a related issue. Typically, we do not include constants
inside asymptotic notations. For instance, although it is technically correct to say that $34n^3 +
2n^2 - 45n + 5 = O(5n^3)$ (or $O(50n^3)$, or any other constant you care to place there), it is best to
just say it is $O(n^3)$. In particular, $\Theta(1)$ may be preferable to $\Theta(k)$.
Proof: We will prove the assertion for Big-O. Assume $f_1(n) = O(g_1(n))$ and
$f_2(n) = O(g_2(n))$. Then there exist positive constants $c_1$ and $n_1$ such that for all
$n \ge n_1$,
\[ f_1(n) \le c_1 g_1(n), \]
and there exist positive constants $c_2$ and $n_2$ such that for all $n \ge n_2$,
\[ f_2(n) \le c_2 g_2(n). \]
where $c = 2c_0$. By the definition of Big-O, we have shown that $f_1(n) + f_2(n) =
O(\max\{g_1(n), g_2(n)\})$.
Example 7.33. Since we have previously shown that $5n^2 - 3n + 20 = O(n^2)$ and that
$3n^3 - 2n^2 + 13n - 15 = O(n^3)$, we know that $(5n^2 - 3n + 20) + (3n^3 - 2n^2 + 13n - 15) =
O(n^2 + n^3) = O(n^3)$.
Example 7.35. Since we have previously shown that $5n^2 - 3n + 20 = O(n^2)$ and that
$3n^3 - 2n^2 + 13n - 15 = O(n^3)$, we know that $(5n^2 - 3n + 20)(3n^3 - 2n^2 + 13n - 15) =
O(n^2 n^3) = O(n^5)$. Notice that we could arrive at this same conclusion by multiplying the
two polynomials and taking the highest term. However, this would require a lot more work
than is necessary.
The next theorem essentially says that if g(n) is an upper bound on f (n), then f (n) is a lower
bound on g(n). This makes perfect sense if you think about it.
It turns out that Θ defines an equivalence relation on the set of functions from $\mathbb{Z}^+$ to $\mathbb{Z}^+$.
That is, it defines a partition on these functions, with two functions being in the same partition
(or the same equivalence class) if and only if they have the same growth rate. But don’t take our
word for it. You will help to prove this fact next.
⋆Fill in the details 7.37. Let $R$ be the relation on the set of functions from $\mathbb{Z}^+$ to $\mathbb{Z}^+$
such that $(f, g) \in R$ if and only if $f = \Theta(g)$. Show that $R$ is an equivalence relation.
n0 such that
This implies that
\[ g(n) \le \frac{1}{c_1} f(n) \quad\text{and}\quad g(n) \ge \frac{1}{c_2} f(n) \quad\text{for all } n \ge n_0 \]
which is equivalent to
such that
Then
f (n) ≥ c1 g(n) ≥ c1 c3 h(n) for all n ≥ max{n0 , n1 },
and
definition of , so R is .
Example 7.38. The functions $n^2$, $3n^2 - 4n + 4$, $n^2 + \log n$, and $3n^2 + n + 1$ are all $\Theta(n^2)$.
That is, they all have the same rate of growth and all belong to the same equivalence class.
⋆Exercise 7.39. Let’s test your understanding of the material so far. Answer each of the
following true/false questions, giving a very brief justification/counterexample. Justifications
can appeal to a definition and/or theorem. For counterexamples, use simple functions. For
instance, $f(n) = n$ and $g(n) = n^2$.
(c) If $f_1(n) = O(g_1(n))$ and $f_2(n) = O(g_2(n))$, then $f_1(n) + f_2(n) = O(\max(g_1(n), g_2(n)))$
In this section we provide more examples and exercises that use the definitions to prove bounds.
The first example is annotated with comments (given in footnotes) about the techniques that
are used in many of these proofs. We use the following terminology in our explanation. By lower
order term we mean a term that grows slower, and higher order means a term that grows faster.
The dominating term is the term that grows the fastest. For instance, in $x^3 + 7x^2 - 4$, the $x^2$ term
is a lower order term than $x^3$, and $x^3$ is the dominating term. We will discuss common growth
rates, including how they relate to each other, in Section 7.2. But for now we assume you know
that $x^5$ grows faster than $x^3$, for instance.
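Example 7.40. Find a tight bound on $f(n) = n^8 + 7n^7 - 10n^5 - 2n^4 + 3n^2 - 17$.
Solution: We will prove that $f(n) = \Theta(n^8)$. First, we will prove an upper
bound for $f(n)$. It is clear that when $n \ge 1$,
\[
\begin{aligned}
n^8 + 7n^7 - 10n^5 - 2n^4 + 3n^2 - 17 &\le n^8 + 7n^7 + 3n^2 && \text{(a)} \\
&\le n^8 + 7n^8 + 3n^8 && \text{(b)} \\
&= 11n^8
\end{aligned}
\]
Thus, we have $f(n) = O(n^8)$. For the lower bound, when $n \ge 1$,
\[
\begin{aligned}
n^8 + 7n^7 - 10n^5 - 2n^4 + 3n^2 - 17 &\ge n^8 - 10n^5 - 2n^4 - 17 && \text{(c)} \\
&\ge n^8 - 10n^7 - 2n^7 - 17n^7 && \text{(d)} \\
&= n^8 - 29n^7
\end{aligned}
\]
Next, we need to find a value $c > 0$ such that $n^8 - 29n^7 \ge cn^8$. Doing a little
algebra, we see that this is equivalent to $(1 - c)n^8 \ge 29n^7$. When $n \ge 1$, we can
divide by $n^7$ and obtain $(1 - c)n \ge 29$. Solving for $c$ we obtain
\[ c \le 1 - \frac{29}{n}. \]
If $n \ge 58$, then $c = 1/2$ suffices. We have just shown that if $n \ge 58$, then
\[ f(n) = n^8 + 7n^7 - 10n^5 - 2n^4 + 3n^2 - 17 \ge \frac{1}{2} n^8. \]
Thus, $f(n) = \Omega(n^8)$. Since we have shown that $f(n) = \Omega(n^8)$ and that $f(n) =
O(n^8)$, we have shown that $f(n) = \Theta(n^8)$.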
a
We can upper bound any function by removing the lower order terms with negative coefficients, as long
as n ≥ 0.
b
We can upper bound any function by replacing lower order terms that have positive coefficients by the
dominating term with the same coefficients. Here, we must make sure that the dominating term is larger than
the given term for all values of n larger than some threshold n0 , and we must make note of the threshold value
n0 .
c
We can lower bound any function by removing the lower order terms with positive coefficients, as long as
n ≥ 0.
d
We can lower bound any function by replacing lower order terms with negative coefficients by a sub-
dominating term with the same coefficients. (By sub-dominating, I mean one which dominates all but the
dominating term.) Here, we must make sure that the sub-dominating term is larger than the given term for
all values of n larger than some threshold n0 , and we must make note of the threshold value n0 . Making a
wise choice for which sub-dominating term to use is crucial in finishing the proof.
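The constants found in Example 7.40 (c = 1/2 and c = 11, with threshold n = 58) can be spot-checked by machine. The following Java sketch is an illustration only and is not a substitute for the proof, since it tests finitely many values. The class name TightBoundCheck and its methods are hypothetical, and BigInteger is used as an assumption-free way to avoid overflow in the eighth powers.

```java
import java.math.BigInteger;

// Hypothetical spot check of the bounds from Example 7.40 (names are assumptions).
class TightBoundCheck {
    // f(n) = n^8 + 7n^7 - 10n^5 - 2n^4 + 3n^2 - 17, computed exactly.
    static BigInteger f(long n) {
        BigInteger x = BigInteger.valueOf(n);
        return x.pow(8)
                .add(x.pow(7).multiply(BigInteger.valueOf(7)))
                .subtract(x.pow(5).multiply(BigInteger.valueOf(10)))
                .subtract(x.pow(4).multiply(BigInteger.valueOf(2)))
                .add(x.pow(2).multiply(BigInteger.valueOf(3)))
                .subtract(BigInteger.valueOf(17));
    }

    // True when (1/2) n^8 <= f(n) <= 11 n^8.
    // The lower bound is tested as 2 f(n) >= n^8 to stay in integers.
    static boolean boundsHold(long n) {
        BigInteger n8 = BigInteger.valueOf(n).pow(8);
        BigInteger fn = f(n);
        boolean upper = fn.compareTo(n8.multiply(BigInteger.valueOf(11))) <= 0;
        boolean lower = fn.multiply(BigInteger.valueOf(2)).compareTo(n8) >= 0;
        return upper && lower;
    }

    public static void main(String[] args) {
        boolean allHold = true;
        for (long n = 58; n <= 1000; n++) {
            allHold = allHold && boundsHold(n);
        }
        System.out.println(allHold); // true for every n tested
    }
}
```

As the earlier note on constants points out, these values do not need to be the best possible. The check simply confirms that the ones chosen in the example do work on the range tested.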
Let’s see another example of an Ω proof. You should note the similarities between this and the
second half of the proof in the previous example.
Proof: We need to show that there exist positive constants c and $n_0$ such that
$$c\, n \log n \le n \log n - 2n,$$
which is equivalent to
$$c \le 1 - \frac{2}{\log n}, \quad \text{when } n > 1.$$
If n ≥ 8, then 2/(log n) ≤ 2/3, and picking c = 1/3 suffices. In other words, we
have just shown that if n ≥ 8,
$$\frac{1}{3}\, n \log n \le n \log n - 2n.$$
Thus if c = 1/3 and n0 = 8, then for all n ≥ n0 , we have
____ ≤ $\frac{1}{2}n^2 - 3n$ ≤ ____ for all $n \ge n_0$
Dividing by $n^2$, we get $c_1 \le$ ____ $\le c_2$.
____ ≤ $\frac{1}{2}n^2 - 3n$ ≤ ____ for all n ≥ ____.
Therefore, $\frac{1}{2}n^2 - 3n = \Theta(n^2)$.
Answer
Example 7.44. Show that $(\sqrt{2})^{\log n} = O(\sqrt{n})$, where the base of the log is 2.
Note: You may be confused by the previous proof. It seems that we never showed that
$(\sqrt{2})^{\log n} \le c\sqrt{n}$ for some constant c. But we essentially did by showing that
$(\sqrt{2})^{\log n} = \sqrt{n}$, since this implies that $(\sqrt{2})^{\log n} \le 1 \cdot \sqrt{n}$.
We actually proved something stronger than was required. That is, since we proved the
two functions are equal, it is in fact true that $(\sqrt{2})^{\log n} = \Theta(\sqrt{n})$. But we were only asked to
prove that $(\sqrt{2})^{\log n} = O(\sqrt{n})$.
In general, if you need to prove a Big-O bound, you may instead prove a Θ bound, and
the Big-O bound essentially comes along for the ride.
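The equality behind Example 7.44 can also be observed numerically. The short Python sketch below (the helper name `lhs` is ours) evaluates both sides for a few values of n. Floating-point results may differ in the last digits, so treat the output as supporting evidence rather than a proof.

```python
import math


def lhs(n):
    """(sqrt 2) raised to the power log base 2 of n."""
    return math.sqrt(2) ** math.log2(n)


for n in [1, 2, 16, 1000, 10**6]:
    # The last two columns should agree up to floating-point rounding.
    print(n, lhs(n), math.sqrt(n))
```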
⋆Question 7.45. In our previous note we mentioned that if you prove a Θ bound, you get
the Big-O bound for free.
Answer
(b) If we prove f (n) = O(g(n)), does that imply that f (n) = Θ(g(n))? In other words, does
it work the other way around? Explain, giving an appropriate example.
Answer
⋆Exercise 7.46. Show that $n! = O(n^n)$. (Don’t give up too easily on this one; the proof
is very short and only uses elementary algebra.)
The last step used the fact that $\log(f(n)^a) = a \log(f(n))$, a fact that we assume you have
seen previously (but may have forgotten).
Proving properties of the asymptotic notations is actually no more difficult than the rest of
the proofs we have seen. You have already seen a few and helped write one. Here we provide one
more example and then ask you to prove another result on your own.
Example 7.48. Prove that if f (n) = O(g(n)) and g(n) = O(f (n)), then f (n) = Θ(g(n)).
Proof: If f(n) = O(g(n)), then there are positive constants $c_2$ and $n_0'$ such that
$$f(n) \le c_2\, g(n) \quad \text{for all } n \ge n_0'.$$
Similarly, if g(n) = O(f(n)), then there are positive constants $c_1'$ and $n_0''$ such that
$$g(n) \le \frac{1}{c_1'} f(n) \quad \text{for all } n \ge n_0''.$$
⋆Exercise 7.49. Let f (x) = O(g(x)) and g(x) = O(h(x)). Show that f (x) = O(h(x)).
That is, prove Theorem 7.28 for Big-O notation.
$$\lim_{n\to\infty} \frac{f(n)}{g(n)} = A.$$
Then
1. If A = 0, then f(n) = O(g(n)), and f(n) ≠ Θ(g(n)). That is, f(n) = o(g(n)).
2. If A = ∞, then f(n) = Ω(g(n)), and f(n) ≠ Θ(g(n)). That is, f(n) = ω(g(n)).
If the above limit does not exist, then you need to resort to using the definitions or using
some other technique. Luckily, in the analysis of algorithms the above approach works most of
the time.
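Before computing such a limit by hand, it can help to look at the ratio f(n)/g(n) for a few growing values of n. The Python sketch below is only a heuristic under our own choice of sample points (the helper `ratios` is hypothetical, not from the text): a table of ratios can suggest what the limit is, but it cannot prove that the limit exists.

```python
def ratios(f, g, ns=(10, 100, 1000, 10_000, 100_000)):
    """Return f(n)/g(n) at a few increasing values of n."""
    return [f(n) / g(n) for n in ns]


# n(n+1)/2 versus n^3: the ratios shrink toward 0, suggesting little-o.
print(ratios(lambda n: n * (n + 1) / 2, lambda n: n**3))

# n(n+1)/2 versus n^2: the ratios settle near 1/2, a finite nonzero value.
print(ratios(lambda n: n * (n + 1) / 2, lambda n: n**2))

# n^2 versus n: the ratios grow without bound, suggesting little-omega.
print(ratios(lambda n: n**2, lambda n: n))
```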
Before we see some examples, let’s review a few limits you should know.
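(a) $\lim_{n\to\infty} a = a$
(a) $\lim_{n\to\infty} 13 = 13$
(b) $\lim_{n\to\infty} n = \infty$
(c) $\lim_{n\to\infty} n^4 = \infty$
(g) $\lim_{n\to\infty} 2^n = \infty$
(b) $\lim_{n\to\infty} n^3 =$
(c) $\lim_{n\to\infty} 3^n =$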
(d) $\lim_{n\to\infty} \left(\frac{3}{2}\right)^n =$
(e) $\lim_{n\to\infty} \left(\frac{2}{3}\right)^n =$
The following theorem often comes in handy when using Theorem 7.50.
Theorem 7.55. If $\lim_{n\to\infty} f(n) = \infty$, then $\lim_{n\to\infty} \frac{1}{f(n)} = 0$.
⋆Question 7.57. The proof in the previous example used Theorems 7.51 and 7.55. How
and where?
Answer
⋆Exercise 7.58. Prove that $3x^3 = \Omega(x^2)$ using Theorem 7.50. Which case did you use?
Here are a few more useful properties of limits. Read carefully. These do not apply in all
situations.
Theorem 7.59. Let a be a finite real number and let $\lim_{n\to\infty} f(n) = A$ and $\lim_{n\to\infty} g(n) = B$,
where A and B are finite real numbers. Then
(d) If $B \ne 0$, $\lim_{n\to\infty} \frac{f(n)}{g(n)} = \frac{A}{B}$
We usually use the results from the previous theorem without explicitly mentioning them.
Example 7.60. Find a tight bound on $f(x) = x^8 + 7x^7 - 10x^5 - 2x^4 + 3x^2 - 17$ using
Theorem 7.50.
Solution: We guess (or know, if we remember the solution to Example 7.40) that
$f(x) = \Theta(x^8)$. To prove this, notice that
Compare the proof above with the proof given in Example 7.40. It should be pretty obvious
that using Theorem 7.50 makes the proof a lot easier. Let’s see another example that lets us
compare the two proof methods.
Proof #1
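We will use the definition of Θ. It is clear that when x ≥ 1,
$$x^4 - 23x^3 + 12x^2 + 15x - 21 \le x^4 + 12x^2 + 15x \le x^4 + 12x^4 + 15x^4 = 28x^4.$$
Also, when x ≥ 88,
$$x^4 - 23x^3 + 12x^2 + 15x - 21 \ge x^4 - 23x^3 - 21 \ge x^4 - 23x^3 - 21x^3 = x^4 - 44x^3 \ge \frac{1}{2}x^4.$$
Thus
$$\frac{1}{2}x^4 \le x^4 - 23x^3 + 12x^2 + 15x - 21 \le 28x^4, \quad \text{for all } x \ge 88.$$
We have shown that $f(x) = x^4 - 23x^3 + 12x^2 + 15x - 21 = \Theta(x^4)$.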
If you did not follow the steps in this first proof, you should really review your
algebra rules.
Proof #2
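Since
$$\begin{aligned}
\lim_{x\to\infty} \frac{x^4 - 23x^3 + 12x^2 + 15x - 21}{x^4}
&= \lim_{x\to\infty} \left( \frac{x^4}{x^4} - \frac{23x^3}{x^4} + \frac{12x^2}{x^4} + \frac{15x}{x^4} - \frac{21}{x^4} \right) \\
&= \lim_{x\to\infty} \left( 1 - \frac{23}{x} + \frac{12}{x^2} + \frac{15}{x^3} - \frac{21}{x^4} \right) \\
&= 1 - 0 + 0 + 0 - 0 = 1,
\end{aligned}$$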
Example 7.62. Prove that $n(n + 1)/2 = O(n^3)$ using Theorem 7.50.
Proof: Because
$$\lim_{n\to\infty} \frac{n(n + 1)/2}{n^3} = \lim_{n\to\infty} \frac{n^2 + n}{2n^3} = \lim_{n\to\infty} \left( \frac{1}{2n} + \frac{1}{2n^2} \right) = 0 + 0 = 0,$$
$n(n + 1)/2 = o(n^3)$, which implies that $n(n + 1)/2 = O(n^3)$.
⋆Exercise 7.63. Prove that $n(n + 1)/2 = \Theta(n^2)$ using Theorem 7.50.
Now is probably a good time to recall a very useful theorem for computing limits, called
l’Hopital’s Rule. The version presented here is restricted to limits where the variable approaches
infinity since those are the only limits of interest in our context.
Theorem 7.65 (l’Hopital’s Rule). Let f(x) and g(x) be differentiable functions. If
$$\lim_{x\to\infty} f(x) = \lim_{x\to\infty} g(x) = \infty,$$
then
$$\lim_{x\to\infty} \frac{f(x)}{g(x)} = \lim_{x\to\infty} \frac{f'(x)}{g'(x)}.$$
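$$\begin{aligned}
\lim_{x\to\infty} \frac{3x}{x^2} &= \lim_{x\to\infty} \frac{3}{2x} \quad \text{(l'Hopital)} \\
&= \frac{3}{2} \lim_{x\to\infty} \frac{1}{x} \\
&= \frac{3}{2} \cdot 0 \\
&= 0.
\end{aligned}$$
$$\begin{aligned}
\lim_{x\to\infty} \frac{3x^2 + 4x - 9}{12x} &= \lim_{x\to\infty} \frac{6x + 4}{12} \quad \text{(l'Hopital)} \\
&= \lim_{x\to\infty} \left( \frac{1}{2}x + \frac{1}{3} \right) \\
&= \infty
\end{aligned}$$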
Now let’s apply it to proving asymptotic bounds.
We should mention that applying l’Hopital’s Rule in the first step is legal since
Therefore, $x^3 = O(2^x)$.
As in the previous example, at each step we checked that the functions on both
the top and bottom go to infinity as n goes to infinity before applying l’Hopital’s
Rule. Notice that we did not apply it in the final step since 6 does not go to
infinity.
⋆Evaluate 7.70. Prove that $7^x$ is an upper bound for $5^x$, but that it is not a tight bound.
Proof 1: This is true if and only if $7^x$ always grows faster than $5^x$ which
means $7^x - 5^x > 0$ for all $x \ne 0$. If it is a tight bound, then $7^x - 5^x = 0$,
which is only true for x = 0. So $7^x$ is an upper bound on $5^x$, but not a
tight bound.
Evaluation
Proof 2: $\lim_{x\to\infty} \frac{5^x}{7^x} = \lim_{x\to\infty} \frac{x \log 5}{x \log 7}$. Both go to infinity, but x log 7 gets there
faster, showing that $5^x = O(7^x)$.
Evaluation
Proof 3: $\lim_{x\to\infty} \frac{7^x}{5^x} = \lim_{x\to\infty} \left(\frac{7}{5}\right)^x = \infty$ since 7/5 > 1. Thus $5^x = O(7^x)$ by the limit
theorem.
Evaluation
We should mention that it is important to remember to verify that l’Hopital’s Rule applies before
just blindly taking derivatives. You can actually get the incorrect answer if you apply it when it
should not be applied.
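To see how a misapplied rule can give the wrong answer, consider a pair of functions of our own choosing (they are not from the text): f(x) = 1 + 1/x and g(x) = 2 + 1/x. Neither goes to infinity, so l’Hopital’s Rule as stated above does not apply. The ratio f(x)/g(x) approaches 1/2, but the ratio of the derivatives is always 1. The Python sketch below tabulates both.

```python
def f(x):
    return 1 + 1 / x


def g(x):
    return 2 + 1 / x


def f_prime(x):
    """Derivative of f, computed by hand: -1/x^2."""
    return -1 / x**2


def g_prime(x):
    """Derivative of g, computed by hand: -1/x^2."""
    return -1 / x**2


for x in [10.0, 1e3, 1e6]:
    # Second column approaches 1/2; third column is always 1.
    print(x, f(x) / g(x), f_prime(x) / g_prime(x))
```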
Example 7.71. Find and prove a simple tight bound for $\sqrt{5n^2 - 4n + 12}$.
Solution: We will show that $\sqrt{5n^2 - 4n + 12} = \Theta(n)$. Since we are letting n
go to infinity, we can assume that n > 0. In this case, $n = \sqrt{n^2}$. Using this, we
can see that
$$\lim_{n\to\infty} \frac{\sqrt{5n^2 - 4n + 12}}{n} = \lim_{n\to\infty} \sqrt{\frac{5n^2 - 4n + 12}{n^2}} = \lim_{n\to\infty} \sqrt{5 - \frac{4}{n} + \frac{12}{n^2}} = \sqrt{5}.$$
Therefore, $\sqrt{5n^2 - 4n + 12} = \Theta(n)$.
⋆Exercise 7.72. Find and prove a good simple upper bound on $n \ln(n^2 + 1) + n^2 \ln n$.
(b) Using Theorem 7.50. You will probably need to use l’Hopital’s Rule a few times.
Example 7.73. Find and prove a simple tight bound for $n \log(n^2) + (n - 1)^2 \log(n/2)$.
⋆Exercise 7.74. Find and prove a simple tight bound for $(n^2 - 1)^5$. You may use either
the formal definition of Θ or Theorem 7.50. (The solution uses Theorem 7.50.)
⋆Exercise 7.75. Find and prove a simple tight bound for $2^{n+1} + 5^{n-1}$. You may use either
the formal definition of Θ or Theorem 7.50. (The solution uses Theorem 7.50.)
Common Growth Rates 247
Figure 7.2: Constants don’t matter
Figure 7.3: Lower-order terms don’t matter
Figures 7.4 through 7.8 give a graphical representation of relative growth rates of functions.
In these diagrams, ** means exponentiation. For instance, x**2 means $x^2$.
It is important to point out that you should never rely on the graphs of functions to determine
relative growth rates. That is the point of Figures 7.6 and 7.7. Although graphs sometimes give
you an accurate picture of the relative growth rates of the functions, they might just as well
present a distorted view of the data depending on the values that are used on the axes. Instead,
you should use the techniques we develop in this section.
Figure 7.6: Polynomials and an exponential. It looks like $x^4$ grows faster than $2^x$, but see Fig 7.7.
Figure 7.7: Polynomials and an exponential with larger n. Clearly $2^n$ grows faster than $n^4$.
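The warning about graphs can be reproduced with a few numbers. In the Python sketch below (the helper `compare` is ours), $n^4$ is the larger value for small n and $2^n$ is the larger value once n is big enough, so a plot that stops too early tells the wrong story.

```python
def compare(n):
    """Return which of n^4 and 2^n is larger at this particular n."""
    poly, expo = n**4, 2**n
    if poly > expo:
        return "n^4"
    if expo > poly:
        return "2^n"
    return "tie"


for n in [2, 5, 10, 15, 16, 17, 20, 40]:
    print(n, n**4, 2**n, compare(n))
```

A table like this still only samples finitely many points, which is why the limit-based techniques are needed to settle the question.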
Next we present some of the most important results about the relative growth rate of some
common functions. We will ask you to prove each of them. Theorems 7.50 and 7.65 will help
you do so. You will notice that most of the theorems are using little-o, not Big-O. Hopefully you
understand the difference. If not, review those definitions before continuing.
We begin with something that is pretty intuitive: higher powers grow faster than lower powers.
⋆Exercise 7.78. Prove Theorem 7.76. (Hint: Use Theorem 7.50 and do a little algebra
before you try to compute the limit.)
The next theorem tells us that exponentials with different bases do not grow at the same rate.
More specifically, the higher the base, the faster the growth rate.
⋆Exercise 7.81. Prove Theorem 7.79. (See the hint for Exercise 7.78.)
Recall that a logarithmic function is the inverse of an exponential function. That is, $b^x = n$
is equivalent to $x = \log_b n$. The following identity is very useful.
Theorem 7.82. Let a, b, and x be positive real numbers with $a \ne 1$ and $b \ne 1$. Then
$$\log_a x = \frac{\log_b x}{\log_b a}.$$
Example 7.83. Most calculators can compute $\ln n$ or $\log_{10} n$, but are unable to compute
logarithms with any given base. But Theorem 7.82 allows you to do so. For instance, you
can compute $\log_2 39$ as $\log_{10} 39 / \log_{10} 2$.
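The same computation can be written in a few lines of Python. This is a minimal sketch; the helper name `log_base` is ours, and `math.log10` and `math.log2` are functions from Python's standard `math` module.

```python
import math


def log_base(a, x):
    """Log base a of x, computed from base-10 logs as in Theorem 7.82."""
    return math.log10(x) / math.log10(a)


print(log_base(2, 39))  # via the change-of-base formula
print(math.log2(39))    # Python's built-in base-2 log, for comparison
```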
Notice that the formula in Theorem 7.82 can be rearranged as $(\log_b a)(\log_a x) = \log_b x$. This
form should make it evident that changing the base of a logarithm just changes the value by a
constant factor. This leads to the following result.
Example 7.85. According to Corollary 7.84, $\log_2 n = \Theta(\log_{10} n)$ and $\ln n = \Theta(\log_2 n)$.
Corollary 7.84 is stating that all logarithms have the same rate of growth regardless of their
bases. That is, the base of a logarithm does not matter when it is used in asymptotic notation.
Because of this, the base is often omitted in asymptotic notation. In computer science, it is
usually safe to assume that the base of logarithms is 2 if it is not specified.
⋆Exercise 7.86. Indicate whether each of the following is true (T) or false (F).
(a) 2n = Θ(3n )
(b) 2n = o(3n )
(c) 3n = O(2n )
Theorem 7.87. Let b > 0 and c > 0 be real numbers. Then $\log_c(n) = o(n^b)$.
Example 7.88. According to Theorem 7.87, $\log_2 n = o(n^2)$, $\log_{10} n = o(n^{1.01})$, and
$\ln n = o(\sqrt{n})$.
⋆Exercise 7.89. Prove Theorem 7.87. (Hint: This is easy if you use Theorems 7.50
and 7.65)
More generally, the next theorem states that any positive power of a logarithm grows slower
than any positive power of n. Since this one is a little tricky, we will provide the proof. In case
you have not seen this notation before, you should know that $\log^a n$ means $(\log n)^a$, which is not
the same thing as $\log(n^a)$.
Theorem 7.90. Let a > 0, b > 0, and c > 0 be real numbers. Then $\log_c^a(n) = o(n^b)$. In
other words, any power of a log grows slower than any polynomial.
Proof: First, we need to know that if a > 0 is a constant, and $\lim_{n\to\infty} f(n) = C$,
then
$$\lim_{n\to\infty} (f(n))^a = \left( \lim_{n\to\infty} f(n) \right)^a = C^a.$$
Using this and the limit computed in the proof of Theorem 7.87, we have that
Example 7.91. According to Theorem 7.90, $\log_2^4 n = o(n^2)$, $\ln^{10} n = o(\sqrt{n})$, and
$\log_{10}^{1,000,000} n = o(n^{.00000001})$.
Finally, any exponential function with base larger than 1 grows faster than any polynomial.
Theorem 7.92. Let a > 0 and b > 1 be real numbers. Then $n^a = o(b^n)$.
Example 7.93. According to Theorem 7.92, it is easy to see that $n^2 = o(2^n)$, $n^{15} = o(1.5^n)$,
and $n^{1,000,000} = o(1.0000001^n)$.
There are several ways to prove Theorem 7.92, including using repeated applications of
l’Hopital’s rule, using induction, or doing a little algebraic manipulation and using one of several
clever tricks. But the techniques are beyond what we generally need in the course, so we will omit
a proof (and, perhaps more importantly, we will not ask you to provide a proof!).
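Although we omit the proof, the claim of Theorem 7.92 can be explored numerically. The Python sketch below is our own illustration, not a proof: it compares $n^a$ and $b^n$ through their natural logs (to avoid overflow), using the identity $\ln(n^a / b^n) = a \ln n - n \ln b$. A positive value means the polynomial is still larger at that n; a negative value means the exponential has overtaken it.

```python
import math


def log_ratio(a, b, n):
    """ln(n^a / b^n) = a ln n - n ln b; negative means b^n is the larger one."""
    return a * math.log(n) - n * math.log(b)


# n^15 versus 1.5^n: the polynomial leads for a while, then falls behind.
for n in [10, 100, 200, 300, 1000]:
    print(n, log_ratio(15, 1.5, n))
```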
⋆Fill in the details 7.94. Fill in the following blanks with Θ, Ω, O, or o. You should give
the most precise answer possible. (e.g. If you put O, but the correct answer is o, your answer
is correct but not precise enough.)
(d) log2 n2 = (log22 (n)).
(f) 5n = (3n ).
(h) n3 = (2n ).
An alternative notation for little-o is ≪. In other words, f (n) = o(g(n)) iff f (n) ≪ g(n).
This notation is useful in certain contexts, including the following comparison of the growth rate
of common functions. The previous theorems in this section provide proofs of some of these
relationships. The others are given without proof.
Theorem 7.95. Here are some relationships between the growth rates of common functions:
$$c \ll \log n \ll \log^2 n \ll \sqrt{n} \ll n \ll n \log n \ll n^{1.1} \ll n^2 \ll n^3 \ll n^4 \ll 2^n \ll 3^n \ll n! \ll n^n$$
You should convince yourself that each of the relationships given in the previous theorem is
correct. You should also memorize them or (preferably) understand why each one is correct so
you can ‘recreate’ the theorem.
⋆Exercise 7.96. Give a Θ bound for each of the following functions. You do not need to
prove them.
(b) $f(n) = (n^2 + 23n + 19)(n^2 + 23n + n^3 + 19)n^3$ (Don’t make this one harder than it is)
(d) f (n) = 49 ∗ 2n + 34 ∗ 3n
(e) f (n) = 2n + n5 + n3
⋆Exercise 7.97. Rank the following functions in increasing rate of growth. Clearly indicate
if two or more functions have the same growth rate. Assume the logs are base 2.
$x$, $x^2$, $2^x$, $10000$, $\log^{300} x$, $x^5$, $\log x$, $x^{\log 3}$, $x^{.000001}$, $3^x$, $x \log(x)$, $\log(x^{300})$,
$\log(2^x)$
Algorithm Analysis 255
⋆Question 7.98. Why aren’t wall-clock time and CPU time the same?
Answer
Because the running time of an algorithm is greatly affected by the characteristics of the
computer system (e.g. processor speed, number of processors, amount of memory, file-system
type, etc.), the running time does not necessarily provide a comparable measure, regardless of
whether you use CPU time or wall-clock time. The next question asks you to think about why.
⋆Question 7.99. Sue and Stu were competing to write the fastest algorithm to solve a
problem. After a week, Sue informs Stu that her program took 1 hour to run. Stu declared
himself victorious since his program took only 3 minutes. But the real question is this:
Whose algorithm was more efficient? Can we be certain Stu’s algorithm was better than
Sue’s? Explain. (Hint: Make sure you don’t jump to any conclusion too quickly. Think
about all of the possibilities.)
Answer
The answer to the previous question should make it clear that you cannot compare the running
times of algorithms if they were run on different machines. Even if two algorithms are run on the
same computer, the wall-clock times may not be comparable.
⋆Question 7.100. Why isn’t the wall-clock time of two algorithms that are run on the same
computer always a reliable indicator of their relative performances?
Answer
In fact, if you run the same algorithm on the same machine multiple times, it will not always
take the same amount of time. Sometimes the differences between trial runs can be significant.
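You can observe this variation yourself. The Python sketch below (the function names and the workload are ours, chosen only for illustration) runs the same deterministic loop several times and records both the wall-clock time, using `time.perf_counter`, and the CPU time of the process, using `time.process_time`. The actual numbers depend on your machine and on what else it is doing, so no particular output is shown.

```python
import time


def busy_work(n):
    """A deterministic loop: the same instructions on every call."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_runs(n, trials=5):
    """Return (wall-clock, CPU) seconds for each of several identical runs."""
    results = []
    for _ in range(trials):
        wall_start, cpu_start = time.perf_counter(), time.process_time()
        busy_work(n)
        results.append((time.perf_counter() - wall_start,
                        time.process_time() - cpu_start))
    return results


for wall, cpu in time_runs(200_000):
    print(f"wall {wall:.6f}s  cpu {cpu:.6f}s")
```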
⋆Question 7.101. If two algorithms are run on the same machine, can we reliably compare
the CPU-times?
Answer
So the CPU-time turns out to be a pretty good measure of algorithm performance. Unfor-
tunately, it does not really allow one to compare two algorithms. It only allows us to compare
specific implementations of the algorithms. It also requires us to implement the algorithm in an
actual programming language before we even know how good the algorithm is (that is, before we
know if we should even spend the time to implement it).
But we can analyze and compare algorithms before they are implemented if we use the number
of instructions as our measure of performance. There is still a problem with this measure. What
is meant by an “instruction”? When you write a program in a language such as Java or C++,
it is not executed exactly as you wrote it. It is compiled into some sort of machine language.
The process of compiling does not generally involve a one-to-one mapping of instructions, so
counting Java instructions versus C++ instructions wouldn’t necessarily be fair. On the other
hand, we certainly do not want to look at the machine code in order to count instructions—
machine code is ugly. Further, when analyzing an algorithm, should we even take into account
the exact implementation in a particular language, or should we analyze the algorithm apart from
implementation?
O.K., that’s enough of the complications. Let’s get to the bottom line. When analyzing
algorithms, we generally want to ignore what sort of machine it will run on and what language
it will be implemented in. We also generally do not want to know exactly how many instructions
it will take. Instead, we want to know the rate of growth of the number of instructions. This is
sometimes called the asymptotic running time of an algorithm. In other words, as the size of the
input increases, how does that affect the number of instructions executed? We will typically use
the notation from Section 7.1 to specify the running time of an algorithm. We will call this the
time complexity (or often just complexity) of the algorithm.
generally say the input is of size n. For a graph, it is usually the number of vertices or the number
of vertices and edges. When the input is a single number, things get more complicated for reasons
I do not want to get into right now. We usually don’t need to worry about this, though.
Algorithm analysis involves determining the size of the input, n, and then finding a function
based on n that tells us how long the algorithm will take if the input is of size n. By “how long”
we of course mean how many operations.
Example 7.102 (Sequential Search). Given an array of n elements, often one needs to
determine if a given number val is in the array. One way to do this is with the sequential
search algorithm that simply looks through all of the elements in the array until it finds it or
reaches the end. The most common version of this algorithm returns the index of the element,
or −1 if the element is not in the array. Here is one implementation.
int sequentialSearch(int a[], int n, int val) {
    // Scan the n elements from left to right.
    for (int i = 0; i < n; i++) {
        if (a[i] == val) {
            return i;   // found: report the index
        }
    }
    return -1;          // val is not in the array
}
What is the size of the input to this algorithm?
Solution: There are a few possible answers to this question. The input
technically consists of an array of n elements, the number n, and the value we are
searching for. So we could consider the size of the input to be n + 2. However,
typically we ignore constants with input sizes. So we will say the size of the input
is n.
In general, if an algorithm takes as input an array of size n and some constant number of
other numeric parameters, we will consider the size of the input to be n.
Answer
Example 7.104. How many operations does sequentialSearch take on an array of size n?
Let’s deal with the possible outcomes question first. Generally speaking, when we analyze an
algorithm we want to know what happens in one of three cases: The best case, the average case,
or the worst case. When thinking about these cases, we always consider them for a given value
of n (the input size). We will see in a moment why this matters.
As the name suggests, when performing a best case analysis, we are trying to determine the
smallest possible number of instructions an algorithm will take. Typically, this is the least useful
type of analysis. If you have experienced a situation when someone said something like “it will
only take an hour (or a day) to fix your cell phone,” and it actually took 3 hours (or days), you
will understand why.
When determining the best-case performance of an algorithm, remember that we need to
determine the best-case performance for a given input size n. This is important since otherwise
every algorithm would take a constant amount of time in the best case simply by giving it an
input of the smallest possible size (typically 0 or 1). That sort of analysis is not very informative.
Note: When you are asked to do a best-case analysis of an algorithm, remember that it is
implied that what is being asked is the best-case analysis for an input of size n. This actually
applies to average and worst-case analysis as well, but it is easier to make this mistake when
doing a best-case analysis.
Worst case analysis considers what is the largest number of instructions that will execute
(again, for a given input size n). This is probably the most common analysis, and typically the
most useful. When you pay Amazon for guaranteed 2-day delivery, you are paying for them to
guarantee a worst-case delivery time. However, this analogy is imperfect. When you do a worst-
case analysis, you know the algorithm will never take longer than what your analysis specified,
but occasionally an Amazon delivery is lost or delayed. When you perform a worst-case analysis
of an algorithm, you always consider what can happen that will make an algorithm take as long
as possible, so it will never take longer than the worst-case analysis implies.
The average case is a little more complicated, both to define and to compute. The first problem
is determining what “average” means for a particular input and/or algorithm. For instance, what
does an “average” array of values look like? The second problem is that even with a good
definition, computing the average case complexity is usually much more difficult than the other
two. It also must be used appropriately. If you know what the average number of instructions for
an algorithm is, you need to remember that sometimes it might take less time and sometimes it
might take more time–possibly significantly more time.
I sometimes use the term expected running time instead of one of these three. This is almost
synonymous with average case, but when I use this term I am being less formal. Thus, I will not
necessarily do a complete average case analysis to determine what I call the expected running
time. Think of it as being how long an algorithm will usually take. It will typically coincide with
either the average- or worst-case complexity.
Example 7.105. Continuing the sequentialSearch example, notice that our analysis above
reveals that the best-case performance is 7 = Θ(1) operations (if the element sought is the
first one in the array) and the worst-case performance is 2 + 5n = Θ(n) operations (if the
element is not in the array). If we assume that the element we are searching for is equally
likely to be anywhere in the array or not in the array, then the average-case performance
should be about 2 + 5(n/2) = Θ(n) operations. We will do a more thorough average-case
analysis of this algorithm shortly.
Notice that in the previous example, the average- and worst-case complexities are the same.
This makes sense. We estimate that the average case takes about half as long as the worst case.
But no matter how large n gets, it is still just half as long. That is, the rate of growth of the
average and worst-case running times are the same. Also note the logic we used to obtain the
best-case complexity of Θ(1). We did not say the best case was Θ(1) because the best-case input
was an array of size one. Instead it is Θ(1) because in the best case the element we are searching
for is the first element of the array, no matter how large the array is.
Here is another important question: How do we know we counted all of the operations? As
it turns out, we don’t actually care. This is good because determining the exact number is very
difficult, if not impossible. Recall that we said we wanted to know the rate of growth of an
algorithm, not the exact number of instructions. As long as we count all of the “important” ones,
we will get the correct rate of growth. But what are the “important” ones? The term abstract
operation is sometimes used to describe the operations that we will count. Typically you choose
one type of operation or a set of operations that you know will be performed the most often and
consider those as the abstract operation(s).
Example 7.106. The analysis of sequentialSearch can be done more easily than in the
previous example. We repeat the algorithm here for convenience.
int sequentialSearch(int a[], int n, int val) {
    // Scan the n elements from left to right.
    for (int i = 0; i < n; i++) {
        if (a[i] == val) {   // the comparison we count
            return i;
        }
    }
    return -1;               // val is not in the array
}
Notice that the comparison (a[i]==val) is executed as often as any other instruction. There-
fore if we count the number of times that instruction executes, we can use that to determine
the rate of growth of the running time.
In the best case the comparison is executed once (if the element being searched for is the
first one in the array), so the best-case complexity is Θ(1).
In the worst case the comparison is executed n = Θ(n) times (if the element being searched
for is either at the end or not present in the array).
As before, we expect the average case to be about n/2 = Θ(n), although in the next
example we will do a more complete analysis.
Notice that we obtained the same answers here as we did above when we tried to take
into account every operation.
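The counts in this example can be checked by instrumenting the search. The following Python sketch is our own translation of the algorithm, with a counter added for the comparison; the name `sequential_search_count` and the sample array are hypothetical.

```python
def sequential_search_count(a, val):
    """Return (index, comparisons); index is -1 if val is not in a."""
    comparisons = 0
    for i, item in enumerate(a):
        comparisons += 1  # one execution of the comparison a[i] == val
        if item == val:
            return i, comparisons
    return -1, comparisons


a = list(range(100, 110))  # an array of n = 10 distinct values
print(sequential_search_count(a, 100))  # best case: first element
print(sequential_search_count(a, 109))  # worst case: last element
print(sequential_search_count(a, -5))   # worst case: not present
```

For this array the best case uses 1 comparison and both worst cases use n = 10, matching the analysis above.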
Our analysis simplified things a bit—we didn’t take into account the possibility that the
element was not in the array. To do so, let’s assume the element searched for is equally likely
to be anywhere in the array or not in the array. That is, there is now a 1/(n + 1) chance that
it will be in any of the n spots in the array and a 1/(n + 1) chance that it is not in the array.
(We divide by n + 1 because there are now n + 1 possibilities, each equally likely.) If it is not
in the array, the number of comparisons is n. In this case the expected time would be
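$$\sum_{k=1}^{n} \left( \frac{k}{n+1} \right) + \frac{n}{n+1} = \frac{1}{n+1} \left( \sum_{k=1}^{n} k + n \right) = \frac{1}{n+1} \left( \frac{n(n+1)}{2} + n \right) = \frac{n^2 + 3n}{2(n+1)} = \Theta(n).$$
We’ll leave it to you to prove that $\frac{n^2 + 3n}{2(n+1)} = \Theta(n)$. (Use Theorem 7.50 and a little algebra.)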
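The closed form can be double-checked by brute force under the same assumption of n + 1 equally likely cases. The Python sketch below (function names are ours) averages the comparison counts directly, using exact fractions, and compares the result with $(n^2 + 3n)/(2(n + 1))$. The last printed column divides by n to show the value settling near 1/2, consistent with Θ(n).

```python
from fractions import Fraction


def expected_comparisons(n):
    """Average comparisons over n + 1 equally likely cases (n positions + absent)."""
    cases = list(range(1, n + 1)) + [n]  # k comparisons at position k; n if absent
    return Fraction(sum(cases), n + 1)


def closed_form(n):
    """The closed form (n^2 + 3n) / (2(n + 1)) from the example."""
    return Fraction(n * n + 3 * n, 2 * (n + 1))


for n in [1, 10, 100, 1000]:
    print(n, expected_comparisons(n), closed_form(n), float(closed_form(n)) / n)
```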
The previous example demonstrates how performing an average-case analysis is typically much
more difficult than the other two, even with a relatively simple algorithm. In fact, did we even do
it correctly? Is it a valid assumption that there is a 1/(n + 1) chance that the element searched
for is not in the array? If we are searching for a lot of values in a small array, perhaps it is the
case that most of the values we are searching for are not in the array. Maybe it is more realistic
to assume there is a 50% chance it is in the array and 50% chance that it is not in the array. I
could propose several other reasonable assumptions, too. As stated before, it can be difficult to
define “average.” In this case it actually doesn’t matter a whole lot because under any reasonable
assumptions the average-case analysis will always come out as Θ(n).
As you might be able to imagine, things get much more complicated as the algorithms get
more complex. This is one of the reasons that in some cases we will skip or gloss over the details
of the average-case analysis of an algorithm.
It is important to make sure that you choose the operation(s) you will count carefully so your
analysis is correct. In addition, you need to look at every instruction in the algorithm to determine
whether or not it can be accomplished in constant time. If some step takes longer than constant
time, that needs to be properly taken into consideration. In particular, consider function/method
calls and operations on data structures very carefully. For instance, if you see a method call like
insert(x) or get(x), you cannot just assume they take constant time. You need to determine
how much time they actually take.
Note: When you are asked for the complexity of an algorithm, you should do the following
three things:
1. Give the best, average, and worst-case complexities unless otherwise specified. Some-
times the average case is quite complicated and can be skipped.
2. Give answers in the form of Θ(f(n)) for some function f(n), or O(f(n)) if a tight
bound is not possible. The function f(n) you choose should be as simple as possible.
For instance, instead of $\Theta(3n^2 + 2n + 89)$, you should use $\Theta(n^2)$ since the constants and
lower order terms don’t matter.
3. Clearly justify your answers by explaining how you arrived at them in sufficient detail.
⋆Exercise 7.109. Analyze the following algorithm that finds the maximum value in an
array. Start by deciding which operation(s) should be counted. Don’t forget to give the best,
worst, and average-case complexities.
int maximum(int a[], int n) {
    int max = Integer.MIN_VALUE;   // no int is smaller than this
    for (int i = 0; i < n; i++)
        max = Math.max(max, a[i]);
    return max;
}
When an algorithm has no conditional statements (like the maximum algorithm from the pre-
vious exercise), or at least none that can cause the algorithm to end earlier, the best, average,
and worst-case complexities will usually be the same. I say usually because there is always the
possibility of a weird algorithm that I haven’t thought of that could be an exception.
Solution: This algorithm has two independent loops, each of which does slightly
different things. Thus, we cannot pick a single operation to count. Instead we will
pick the assignment statements that involve q. That is, we will use both q=q+i*i
and q=q*j. The first assignment executes n times since the first loop executes for
every value of i from 1 to n. The second loop also executes its assignment n times
for the same reason. Since the loops happen one after another, we add the number
of operations, so the total is n + n = 2n assignment statements. Since there are
no conditional statements, this is the best, worst, and average-case number of
assignment statements. Thus, the complexity for all three cases is Θ(n).
Solution: Clearly the assignment (V=A[i]*A[j]) occurs the most often.[a] The
inner loop always executes n times, each time doing one assignment. The outer
loop executes n times, and each time it executes, it executes the inner loop.
Therefore the total time is $n \cdot n = \Theta(n^2)$. This is the best, worst, and average case
complexity since nothing about the input can change what the algorithm does.
Here is another way to think about it. The inner loop executes the assignment
statement n times every time it executes. The first time through the outer loop,
the whole inner loop executes and performs the assignment n times. The second time
through the outer loop, the whole inner loop executes and performs the assignment n
times. This happens all the way until the nth time through the outer loop, during
which the whole inner loop executes and performs the assignment n times. Thus, the
total number of times the assignment is performed is n + n + · · · + n (where there
are n terms in the sum), which is just n · n. Thus the complexity is $\Theta(n^2)$.
[a] Always analyze from the inside out. The more practice you get, the more it will be obvious that this is
the only way that will consistently work.
Sometimes people mistakenly think the algorithm in Example 7.110 takes $\Theta(n^2)$ operations.
But it is not executing one loop inside another loop. It is executing one loop n times followed by
another loop n times. On the other hand, the algorithm in Example 7.111 does not take n + n
operations. It is not executing one loop n times followed by another loop n times. It is executing
one loop n times, and each of those n times it is executing another loop that takes n time.
Here is an analogy. If you climb a flight of 10 stairs followed by another flight of 10 stairs,
you climbed a total of 10 + 10 = 20 stairs. Now assume you go into a building that has 10 floors
above ground level. There are 10 steps between floors (so it takes 10 steps to get from ground
level to floor 1, from floor 1 to 2, etc.). If you climb to the top of the building, how many
stairs did you climb? It is 10 + 10 + · · · + 10 (where there
are 10 terms in the sum), which is 100 = 10^2. How does this relate to the previous examples?
Simple. In the first case, you executed:
for (stair 1 through 10)
    climb stair
for (stair 1 through 10)
    climb stair
and in the second case you executed:
for (floors 1 through 10)
    for (stair 1 through 10)
        climb stair
Do you see the resemblance to the code from Examples 7.110 and 7.111? And do you see how we
are really performing the same analysis?
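The stair analogy can be checked by simply counting steps. The following is a minimal sketch, not code from the examples themselves; the class name LoopShapes and the methods sequential and nested are illustrative names chosen here.

```java
// Counts "climb stair" steps for the two loop shapes in the stair analogy.
final class LoopShapes {
    // One loop followed by another: the step counts add (n + n).
    static long sequential(int n) {
        long steps = 0;
        for (int stair = 1; stair <= n; stair++) steps++;
        for (int stair = 1; stair <= n; stair++) steps++;
        return steps;
    }

    // One loop inside another: the step counts multiply (n * n).
    static long nested(int n) {
        long steps = 0;
        for (int floor = 1; floor <= n; floor++) {
            for (int stair = 1; stair <= n; stair++) steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(sequential(10)); // 10 + 10
        System.out.println(nested(10));     // 10 * 10
    }
}
```

Doubling n doubles the result of sequential but quadruples the result of nested, which is the difference between Θ(n) and Θ(n^2).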
It is important to be careful not to jump to conclusions when analyzing algorithms. For
instance, a double-nested for-loop should always take Θ(n^2) to execute, right?
If you read the solution to the previous exercise (which you definitely should have—always
read the solutions!), you will see that you need to be careful not to jump to conclusions too
quickly. A double-nested loop does not always mean an algorithm takes Θ(n^2) time. But does it
guarantee it will take O(n^2) (in other words, no more than quadratic time)?
• Logarithmic: Θ(log_k n)
• Linear: Θ(n)
• Quadratic: Θ(n^2)
• Polynomial: Θ(n^k)
• Exponential: Θ(k^n)
Definition 7.114 (Constant). An algorithm with running time Θ(1) (or Θ(k) for some
constant k) is said to have constant complexity. Note that this does not necessarily mean
that the algorithm takes exactly the same amount of time for all inputs, but it does mean
that there is some number K such that it always takes no more than K operations.
⋆Exercise 7.116. Which of the following algorithms have constant complexity? Briefly
justify your answers.
Answer
Answer
Answer
Definition 7.117 (Logarithmic). Algorithms with running time Θ(log n) are said to have
logarithmic complexity. As the input size n increases, so does the running time, but very
slowly. Logarithmic algorithms are typically found when the algorithm can systematically
ignore fractions of the input.
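To get a feel for how slowly a logarithmic running time grows, here is a small sketch that counts how many times a problem of size n can be cut in half before nothing is left. The class name Halving and the method halvings are illustrative names, not part of the text.

```java
// Counts how many times n can be halved (integer division) before reaching 0.
final class Halving {
    static int halvings(int n) {
        int count = 0;
        while (n > 0) {
            n = n / 2;   // discard half of what remains
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(halvings(1000));      // about log2(1000) steps
        System.out.println(halvings(1_000_000)); // about log2(1000000) steps
    }
}
```

Multiplying the input size by 1000 only adds about 10 more halving steps.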
Example 7.118. In Example 7.161 we will see that binary search has complexity Θ(log n).
Definition 7.119 (Linear). Algorithms with running time Θ(n) are said to have linear
complexity. As n increases, the run time increases in proportion with n. Linear algorithms
access each of their n inputs at most some constant number of times.
void sumFirstN(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++)
        sum = sum + i;
}

void mSumFirstN(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++)
        for (int k = 1; k < 7; k++)
            sum = sum + i;
}
It is pretty easy to see that sumFirstN takes linear time since it contains a single for loop
that executes n times and does a constant amount of work each time.
At first glance it may seem that mSumFirstN takes Θ(n^2) time since it has a double nested
loop. You will think about why it is actually Θ(n) in the next question.
⋆Question 7.121. Why is the complexity of mSumFirstN from the previous example Θ(n)
and not Θ(n^2)?
Answer
Definition 7.122 (n log n). Many divide-and-conquer algorithms have complexity Θ(n log n).
These algorithms break the input into a constant number of subproblems of the same type, solve
them independently, and then combine the solutions together. Not all divide-and-conquer
algorithms have this complexity, however.
Example 7.123. Two of the most well known sorting algorithms, Quicksort and Mergesort,
have an average case complexity of Θ(n log n). We will do a complete analysis of both
algorithms in Chapter 8.
Definition 7.124 (Quadratic). Algorithms with a running time of Θ(n^2) are said to have
quadratic complexity. As n doubles, the running time roughly quadruples, since (2n)^2 = 4n^2.
⋆Exercise 7.126. Which of the following algorithms have quadratic complexity? Briefly
justify your answers.
Answer
(b) An algorithm that tries to find the smallest element in an array of size n × n by searching
through the entire array.
Answer
⋆Question 7.127. In a previous course you may have encountered several quadratic sorting
algorithms. Name them. (Note: We will analyze two of them soon.)
Answer
Definition 7.128 (Polynomial). Algorithms with running time Θ(n^k) for some constant k
are said to have polynomial complexity. We call them polynomial-time algorithms.
Note that linear and quadratic are special cases of polynomial. When we say an efficient
algorithm exists to solve a problem, we typically mean a polynomial-time algorithm.
Example 7.129. As we will see in Example 7.150, MatrixMultiply takes Θ(n^3) time. Since
3 is a constant, that is a polynomial-time algorithm. We will also mention Strassen’s algorithm
that has a complexity of about Θ(n^2.8). That is also a polynomial-time algorithm. Its actual
complexity is Θ(n^(log_2 7)).
Definition 7.130 (Exponential). Algorithms with running time Θ(k^n) for some constant k > 1
are said to have exponential complexity. Since exponential algorithms can only be run for
small values of n, they are not considered to be efficient. Brute-force algorithms are often
exponential.
Example 7.131. Since there are 2^n binary numbers of length n, an algorithm that lists all
binary numbers of length n would take Θ(2^n) time, which is exponential.
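As a concrete illustration of Example 7.131, the following sketch lists every binary number of length n. The class name BinaryStrings and the method all are hypothetical names chosen for this illustration, and the sketch assumes 0 ≤ n ≤ 30 so that 2^n fits in an int. It counts each listed number as one unit of work, as the example does.

```java
import java.util.ArrayList;
import java.util.List;

// Lists all 2^n binary numbers of length n (assumes 0 <= n <= 30).
final class BinaryStrings {
    static List<String> all(int n) {
        List<String> out = new ArrayList<>();
        for (int v = 0; v < (1 << n); v++) {          // 2^n values
            StringBuilder sb = new StringBuilder();
            for (int bit = n - 1; bit >= 0; bit--) {  // highest-order bit first
                sb.append((v >> bit) & 1);
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(all(3));        // the 8 strings from 000 to 111
        System.out.println(all(10).size());
    }
}
```

Each increase of n by 1 doubles the length of the list, which is why this approach is only practical for small n.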
Note: As we have already seen, exponentials with different bases do not grow at the same
rate. Thus, two exponential algorithms do not belong to the same complexity class unless the
base of the exponent is the same. In other words, a^n ≠ Θ(b^n) unless a = b.
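One way to convince yourself of the note above is to look at the ratio of two exponentials. The bases 2 and 3 in the numeric line are just an illustrative choice.

```latex
\[
  \text{If } 1 < a < b, \text{ then }
  \frac{a^n}{b^n} = \left(\frac{a}{b}\right)^{n} \to 0 \text{ as } n \to \infty,
  \text{ so } a^n = o(b^n) \text{ and therefore } a^n \neq \Theta(b^n).
\]
\[
  \text{For example, } \frac{2^{10}}{3^{10}} = \frac{1024}{59049} \approx 0.017 .
\]
```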
Let me end on a very important note regarding analysis of algorithms and asymptotic growth
of functions. If algorithm A is faster than algorithm B, then the running time of A is less than
the running time of B. On the other hand, if A’s running time grows asymptotically faster than the
running time of B, that means B is the faster algorithm for large enough inputs! In other words, the words fast/slow need
to be reversed when discussing algorithm speeds versus the growth of the functions. Put simply:
A faster growing complexity means a slower algorithm, and vice-versa.
Example 7.132. Find the complexity of bubblesort, where n is the size of the array a.
void bubblesort(int a[], int n) {
    for (int i = n - 1; i > 0; i--) {
        for (int j = 0; j < i; j++) {
            if (a[j] > a[j + 1]) {
                swap(a, j, j + 1);
            }
        }
    }
}
Solution: First, notice that the input size is n since we are sorting an array
with n elements.
Example 3.46 gives an implementation of swap that takes constant time (verify
this!). The conditional statement, including the swap, takes constant time (we’ll
call it c, as usual), regardless of whether or not the condition is true. It takes
longer if the condition is true, but it is constant either way—about 3 operations
(array indexing (×2) and comparison) versus about 6 (the swap adds about 3).
The inner loop goes from j = 0 to j = i − 1, so it executes i times and takes ci
time. But what is i? This is where things get a little more complicated than in
the previous examples. Notice that the outer loop is changing the value of i. We
need to look at this a little more carefully.
1. The first time through the outer loop i = n − 1. So the inner loop takes
c(n − 1) time.
2. The second time through the outer loop i = n − 2. So the inner loop takes
c(n − 2) time.
3. The kth time through the outer loop i = n − k. So the inner loop takes
c(n − k) time.
4. This goes all the way to the (n − 1)th time through the outer loop when i = 1 and
the inner loop takes c · 1 time.
The outer loop is simply causing the inner loop to be executed over and over
again, but with different parameters (specifically, it is changing the limit on the
inner loop). Thus, we need to add up the time taken for all of these calls to the
inner loop. Doing so, we see that the total time required for bubblesort is
\[ \sum_{k=1}^{n-1} c(n-k) = c \sum_{i=1}^{n-1} i = c \, \frac{(n-1)n}{2} = \Theta(n^2). \]
Note: Part way through our analysis of bubblesort we had k as part of our complexity. But
notice that the k did not show up as part of the final complexity. This is because in the context
of the entire algorithm, k has no meaning. It is a local variable from the algorithm that we
needed to use to determine the overall complexity of the algorithm. The only variables that
should appear in the complexity of an algorithm are those that are related to the
size of the input.
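The sum from the bubblesort analysis can be checked empirically. The sketch below uses the same loop structure as Example 7.132 but adds a counter for the comparison a[j] > a[j+1]; the class name BubbleCount and the method sortAndCount are illustrative names, and the swap is written inline here rather than calling the swap from Example 3.46.

```java
// Bubblesort with the loop structure of Example 7.132, instrumented to
// count how many times the comparison a[j] > a[j+1] is evaluated.
final class BubbleCount {
    static long sortAndCount(int[] a) {
        int n = a.length;
        long comparisons = 0;
        for (int i = n - 1; i > 0; i--) {
            for (int j = 0; j < i; j++) {
                comparisons++;
                if (a[j] > a[j + 1]) {
                    int t = a[j];      // inline constant-time swap
                    a[j] = a[j + 1];
                    a[j + 1] = t;
                }
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] a = {5, 4, 3, 2, 1};
        System.out.println(sortAndCount(a)); // (n - 1) n / 2 with n = 5
    }
}
```

Notice that the returned count depends only on n. The loop variables i and j are gone from the answer, just as the note above says they should be.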
⋆Question 7.133. In the best case, the code in the conditional statement in bubblesort
never executes. Why does this still result in a complexity of Θ(n^2)?
Answer
In reality, the best and worst case performance of bubblesort are different—the worst case
is about twice as many operations. But when we are discussing the complexity of algorithms, we
care about the asymptotic behavior—that is, what happens as n gets larger. In that case, the
difference is still just a factor of 2. The best and worst-case complexities have the same growth
rate (quadratic).
Consider how this is different if the best-case complexity of an algorithm is Θ(n) and the
worst-case complexity is Θ(n^2). As n gets larger, the gap between the performance in the best
and worst cases also gets larger. In this case, the best and worst-case complexities are not the
same since one is linear and the other is quadratic.
Note: If an algorithm contains nested loops and the limit on one or more of the inner loops
depends on a variable from an outer loop, analyzing the algorithm will generally involve one
or more summations, as it did with the previous example. As mentioned previously, variables
related to those loops that are used in your analysis (e.g. i, j, k, etc.) should never show up
in your final answer! They have no meaning in that context.
Example 7.134. Find the complexity of insertionSort, where n is the size of the array a.
void insertionSort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int v = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > v) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = v;
    }
}
Solution: The code inside the while loop takes constant time. The loop can
end for one of two reasons: j drops below 0, or a[j] ≤ v. In the worst case, it goes
until j = −1. Since j starts out being i − 1 at the beginning, and it is decremented in
the loop, that means the loop executes i times in the worst case.
The for loop (the outer loop) changes the value of i from 1 to n − 1, executing a
constant amount of code plus the while loop each time. So the ith time through
the outer loop takes c_1 + c_2 i operations. We will simplify this to just i operations;
you can think of it as counting the number of assignments in the while loop if you
wish. So the worst-case complexity is
\[ \sum_{i=1}^{n-1} i = \frac{(n-1)n}{2} = \Theta(n^2). \]
This happens, by the way, if the elements in the array start out in reverse order.
In the best case, the body of the while loop never executes because a[j]>v is always
false (which happens if the array is already sorted), so the loop condition is checked
only once each time. In this case, the complexity is
Θ(n) since the outer loop executes n − 1 times, each time doing a constant amount
of work.
We should point out that if we had done our computations using c_1 + c_2 i instead of i we
would have arrived at the same answer, but it would have been more work:
\[ \sum_{i=1}^{n-1} (c_1 + c_2 i) = \sum_{i=1}^{n-1} c_1 + \sum_{i=1}^{n-1} c_2 i = c_1 \cdot (n-1) + c_2 \sum_{i=1}^{n-1} i = c_1 \cdot (n-1) + c_2 \frac{(n-1)n}{2} = \Theta(n^2). \]
The advantage of including the constants is that we can stop short of the final step and get a
better estimate of the actual number of operations used by the algorithm. In other words, if
we want an exact answer, we need to include the constants and lower order terms. If we just
want a bound, the constants and lower order terms can often be ignored.
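The best- and worst-case counts for insertionSort from Example 7.134 can be observed directly. This is a minimal sketch that adds a counter for the assignments inside the while loop; the class name InsertionCount and the method sortAndCountShifts are illustrative names, not from the text.

```java
// Insertion sort as in Example 7.134, instrumented to count how many
// times the assignment a[j+1] = a[j] inside the while loop executes.
final class InsertionCount {
    static long sortAndCountShifts(int[] a) {
        long shifts = 0;
        for (int i = 1; i < a.length; i++) {
            int v = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > v) {
                a[j + 1] = a[j];
                j--;
                shifts++;
            }
            a[j + 1] = v;
        }
        return shifts;
    }

    public static void main(String[] args) {
        System.out.println(sortAndCountShifts(new int[]{5, 4, 3, 2, 1})); // reverse order
        System.out.println(sortAndCountShifts(new int[]{1, 2, 3, 4, 5})); // already sorted
    }
}
```

For an array in reverse order the count is (n − 1)n/2, matching the summation, while for a sorted array the while loop body never runs and only the linear work of the outer loop remains.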
Note: There are rare cases when ignoring constants and lower order terms can cause trouble
(meaning that it can lead to an incorrect answer) for subtle reasons that are beyond the scope
of this book. Unless you take more advanced courses dealing with these topics, you most likely
won’t run into those problems.
Example 7.135. Consider the following implementation of insertion sort that works on lists
of integers (written using Java syntax).
void insertionSort(List<Integer> A, int n) {
    for (int i = 1; i < n; i++) {
        Integer T = A.get(i);
        int j = i - 1;
        while (j >= 0 && A.get(j).compareTo(T) > 0) {
            A.set(j + 1, A.get(j));
            j--;
        }
        A.set(j + 1, T);
    }
}
Note: In some languages the size of the list, n, does not need to be passed in since it can be
obtained with a method call (e.g. A.size()). We will pass it in just to be clear that the size
of the list is n.
Give the complexity of insertionSort assuming that the list is an array-based
implementation (e.g. an ArrayList in Java). To be clear, this means that both get and set take
constant time. We also assume that comparing two integers (with compareTo) takes constant
time (which should be true for any reasonable implementation).
Solution: Notice that this algorithm is almost identical to the earlier version
that is implemented on an array. Array indexing and assignment are simply
replaced with calls to get and set. Since the time of these calls remains constant,
the earlier analysis still holds. Thus, the algorithm has a worst-case complexity of Θ(n^2).
⋆Question 7.136. What are the complexities of the methods set(i,x) (set the ith element
of the list to x) and get(i) (return the ith element of the list) for a linked list, assuming
a reasonable implementation?
Answer
Example 7.137. Analyze the previous insertionSort algorithm assuming that the list is
now a linked list.
Solution: The analysis here is a bit more complicated than we have previously
seen, but we can still do it. We start by analyzing the code inside the while loop.
In the worst case, each iteration of the loop makes two calls to A.get(j) and one
call to A.set(j+1) and a constant amount of other work. The total time for each
iteration is therefore about 2j + (j + 1) + c = 3j + c + 1 = 3j + c′, where c′ = c + 1
is still just some constant. The index of the while loop starts at j = i and can
go until j = 1 (with j decrementing each iteration). Thus, the complexity of the
while loop is about
\[ \sum_{j=1}^{i} (3j + c') = 3 \sum_{j=1}^{i} j + \sum_{j=1}^{i} c' = 3 \, \frac{(i+1)i}{2} + i c' = \Theta(i^2). \]
The rest of the code inside the for loop takes constant time, except the one call
to get and one call to set which take time Θ(i) (see footnote a). Thus, the code inside the for
loop has complexity Θ(i^2 + i) = Θ(i^2). The outer for loop makes i go from 1 to
n − 1. Thus, the overall complexity is
\[ \sum_{i=1}^{n-1} \Theta(i^2) = \Theta\left( \sum_{i=1}^{n-1} i^2 \right) = \Theta\left( \frac{(n-1)n(2(n-1)+1)}{6} \right) = \Theta(n^3). \]
Clearly using a linked list in this implementation of insertion sort is a bad idea.
Footnote a: Note that there is a tricky part here. It is subtle, but important. The call to set actually takes j as
a parameter. However, we cannot use Θ(j) as the complexity of this because the j has no meaning in this
context. Therefore we use the fact that 1 ≤ j ≤ i to instead call it Θ(i).
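The cubic growth in Example 7.137 can be observed with a simple cost model. The sketch below is an assumption-laden illustration rather than how java.util.LinkedList is implemented: CountingLinkedList is a hypothetical, minimal singly linked list of int values in which get and set walk from the head, and the field hops counts every node visited. The sort itself follows the code of Example 7.135.

```java
// Hypothetical minimal singly linked list that counts node visits ("hops").
final class CountingLinkedList {
    private static final class Node {
        int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    private Node head;
    long hops = 0; // total nodes visited by get and set

    CountingLinkedList(int[] values) {
        for (int k = values.length - 1; k >= 0; k--) {
            Node node = new Node(values[k]);
            node.next = head;
            head = node;
        }
    }

    // Walks from the head to position index, visiting index + 1 nodes.
    private Node walkTo(int index) {
        Node cur = head;
        hops++;
        for (int k = 0; k < index; k++) {
            cur = cur.next;
            hops++;
        }
        return cur;
    }

    int get(int index) { return walkTo(index).value; }
    void set(int index, int v) { walkTo(index).value = v; }

    // Insertion sort from Example 7.135 run on the counting list;
    // returns the total number of node visits.
    static long insertionSortHops(int[] values) {
        CountingLinkedList A = new CountingLinkedList(values);
        int n = values.length;
        for (int i = 1; i < n; i++) {
            int T = A.get(i);
            int j = i - 1;
            while (j >= 0 && A.get(j) > T) {
                A.set(j + 1, A.get(j));
                j--;
            }
            A.set(j + 1, T);
        }
        return A.hops;
    }

    public static void main(String[] args) {
        for (int n = 100; n <= 400; n *= 2) {
            int[] reversed = new int[n];
            for (int k = 0; k < n; k++) reversed[k] = n - k;
            System.out.println(n + " -> " + insertionSortHops(reversed));
        }
    }
}
```

Under this model, doubling n on a reverse-ordered input multiplies the hop count by roughly 8, which is the signature of Θ(n^3) growth.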
⋆Exercise 7.138. For each of the following implementations of a stack, give a tight bound
(using Θ-notation, of course) on the expected running time of the given operations, assuming
that the data structure has n items in it before the operation is performed.
Stack    | array | linked list
push     |       |
pop      |       |
peek     |       |
size     |       |
isEmpty  |       |
⋆Exercise 7.139. For each of the following implementations of a queue, give a tight bound
on the expected running time of the given operations, assuming that the data structure has
n items in it before the operation is performed.
Queue    | array | linked list | circular array
enqueue  |       |             |
dequeue  |       |             |
first    |       |             |
size     |       |             |
isEmpty  |       |             |
⋆Exercise 7.140. For each of the following implementations of a list, give a tight bound
on the expected running time of the given operations, assuming that the data structure has
n items in it before the operation is performed.
List         | array | linked list
addToFront   |       |
addToEnd     |       |
removeFirst  |       |
contains     |       |
size         |       |
isEmpty      |       |
⋆Exercise 7.141. For each of the following implementations of a binary search tree (BST),
give a tight bound on the expected running time of the given operations, assuming that
the data structure has n items in it before the operation is performed. Assume linked
implementations (rather than arrays). For balanced, assume an implementation like a
red-black tree or AVL tree.
BST              | unbalanced | balanced
insert/add       |            |
delete/remove    |            |
search/contains  |            |
maximum          |            |
successor        |            |
⋆Exercise 7.142. Give the average- and worst-case complexity of the following operations
on a hash table (implemented with open-addressing or chaining–it doesn’t matter), assuming
that the data structure has n items in it before the operation is performed.
Hash Table       | average | worst
insert/add       |         |
delete/remove    |         |
search/contains  |         |
1. We can usually replace constants with 1. For instance, if something performs 30 operations,
we can say it is constant and call it 1. This is only valid if it really is always 30, of course.
3. Nested loops must be treated with caution. If the limits in an inner loop change based on
the outer loop, we generally need to write this as a summation.
4. We should generally work from the inside-out. Until you know how much time it takes to
execute the code inside a loop, you cannot determine how much time the loop takes.
5. Function calls must be examined carefully. Do not assume that a function takes constant
time unless you know that to be true. We already saw a few examples where function calls
did not take constant time, and the next example will demonstrate it again.
6. Only the size of the input should appear as a variable in the complexity of an algorithm. If
you have variables like i, j, or k in your complexity (because they were indexes of a loop,
for instance), you should probably rethink your analysis of the algorithm. Loop variables
should never appear in the complexity of an algorithm.
Now it’s time to see if you can spot where someone didn’t follow some of these principles.
Evaluation
Solution 2: The worst-case is ni since power(a,i) takes i time and the for
loop executes n times.
Evaluation
Solution 3: The for loop executes n times. Each time it executes, it calls
power(a,i), which takes i time. In the worst case, i = n − 1, so the complexity
is (n − 1)n = O(n^2).
Evaluation
⋆Exercise 7.144. What is the worst-case complexity of addPowers from Evaluate 7.143?
Justify your answer.
⋆Exercise 7.145. Give an implementation of the addPowers algorithm that takes Θ(n)
time. Justify the fact that it takes Θ(n) time. (Hint: Why compute a^5 (for instance) from
scratch if you have already computed a^4?)
double addPowers(double a, int n) {
}
Justification of complexity:
⋆Exercise 7.146. Give an implementation of the addPowers algorithm that takes Θ(n)
time but does not use a loop. Justify the fact that it takes Θ(n) time. (Hint: This solution
should be much shorter than your previous one.)
double addPowers(double a, int n) {
}
Justification of complexity:
Example 7.147. A student turned in the code below (which does as its name suggests). I
gave them a ‘C’ on the assignment because although it works, it is very inefficient. About
how many operations does their implementation require?
int sumFromMToN(int m, int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++) {
        sum = sum + i;
    }
    for (int i = 1; i < m; i++) {
        sum = sum - i;
    }
    return sum;
}
Solution: The first loop takes about 1 + 4n operations, and the second loop
takes about 1 + 4(m − 1) operations. The first statement and return statement
add 2 operations. So the total number of operations is about 4 + 4n + 4(m − 1) =
4(n + m) = Θ(n + m).
⋆Evaluate 7.148. Write an ‘A’ version of the method from Example 7.147. You can assume
that 1 ≤ m ≤ n. For each solution, determine how many operations are required and evaluate
it based on that as well as whether or not it is correct.
Solution 1:
int sumFromMToN(int m, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum = sum + i;
    }
    for (int i = 0; i < m; i++) {
        sum = sum - i;
    }
    return sum;
}
Evaluation
Solution 2:
int sumFromMToN(int m, int n) {
    int sum = 0;
    for (int i = m; i < n; i++) {
        sum = sum + i;
    }
    return sum;
}
Evaluation
Solution 3:
int sumFromMToN(int m, int n) {
    return (n * (n - 1) / 2 - m * (m - 1) / 2);
}
Evaluation
⋆Exercise 7.149. Write an ‘A’ version of the method from Example 7.147. You can assume
that 1 ≤ m ≤ n. Explain why your solution is correct and give its efficiency.
int sumFromMToN(int m, int n) {
}
Justification
Example 7.150. The MatrixMultiply algorithm given below is the standard algorithm used
to compute the product of two matrices. Find the worst-case complexity of MatrixMultiply.
Assume that A and B are n × n matrices.
Matrix MatrixMultiply(Matrix A, Matrix B) {
    Matrix C;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            C[i][j] = 0;
            for (int k = 0; k < n; k++) {
                C[i][j] += A[i][k] * B[k][j];
            }
        }
    }
    return C;
}
Solution: The code inside the inner loop does array indexing, multiplication,
addition, and assignment. All of these together take just constant time. Therefore,
let’s count the number of times the statement C[i][j]+=A[i][k]*B[k][j]
executes. We will ignore the calls to C[i][j]=0 since it executes just once every
time the entire inner loop executes, so it has a negligible contribution. Similarly,
the statement C[i][j]+=A[i][k]*B[k][j] executes about as often as
any of the code in the for loops (i.e. the comparisons and increments) so we
will ignore that code as well. The bottom line is that if we count the number of
times C[i][j]+=A[i][k]*B[k][j] executes, it will give us a tight bound on the
complexity of MatrixMultiply.
The inner loop executes the statement n times. The middle loop executes n
times, each time executing the inner loop (which executes the statement n times).
Thus, the middle loop executes the statement n × n = n^2 times. The outer loop
simply executes the middle loop n times. Therefore the outer loop (and thus the
whole algorithm) executes the statement n × n^2 = n^3 times. Thus, the worst-case
complexity of MatrixMultiply is Θ(n^3). Notice that this is also the best and
average-case complexity since there are no conditional statements in this code.
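The count of n^3 can be confirmed with a runnable variant of the algorithm. This is only a sketch under stated assumptions: it represents matrices as square double[][] arrays instead of the Matrix type used above, and the class name MatrixMultiplyCount and the counter innerStatements are names invented for this illustration.

```java
// Standard matrix multiplication on n x n arrays, instrumented to count
// executions of the statement C[i][j] += A[i][k] * B[k][j].
final class MatrixMultiplyCount {
    // Number of times the innermost statement ran in the most recent call.
    static long innerStatements;

    static double[][] multiply(double[][] A, double[][] B) {
        int n = A.length;
        double[][] C = new double[n][n];
        innerStatements = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                C[i][j] = 0;
                for (int k = 0; k < n; k++) {
                    C[i][j] += A[i][k] * B[k][j];
                    innerStatements++;
                }
            }
        }
        return C;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 8; n *= 2) {
            multiply(new double[n][n], new double[n][n]);
            System.out.println(n + " -> " + innerStatements); // n cubed
        }
    }
}
```

The counter does not depend on the entries of A and B, which is consistent with the best, worst, and average cases all being Θ(n^3).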
Example 7.151. In Java, the ArrayList retainAll method is implemented as follows (this
code is simplified a little from the actual implementation, but the changes do not affect
the complexity of the code). Note that Object[] elementData and int size are fields of
ArrayList whose meaning should be obvious.
public boolean retainAll(Collection<?> c) {
    boolean modified = false;
    int w = 0;
    for (int r = 0; r < size; r++) {
        if (c.contains(elementData[r])) {
            elementData[w++] = elementData[r];
        }
    }
    if (w != size) {
        for (int i = w; i < size; i++)
            elementData[i] = null;
        size = w;
        modified = true;
    }
    return modified;
}
Let al1 be an ArrayList of size n.
(b) What is the complexity of al1.retainAll(ts2), where ts2 is a TreeSet with m elements?
of the code takes Θ(n log m) time. In the worst case (w = 0), the second half
of the code takes Θ(n) time. Thus, the worst-case complexity of the method is
Θ(n log m + n) = Θ(n log m).
⋆Exercise 7.152. Answer the following two questions based on the code from
Example 7.151.
elements? Answer
(b) What is the complexity of al1.retainAll(hs2), where hs2 is a HashSet with m
elements? Answer
Example 7.153. In Java, the retainAll method is implemented as follows for LinkedList,
TreeSet, and HashSet.
public boolean retainAll(Collection<?> c) {
    boolean modified = false;
    Iterator<E> iter = iterator();
    while (iter.hasNext()) {
        if (!c.contains(iter.next())) {
            iter.remove();
            modified = true;
        }
    }
    return modified;
}
Assume that the calls iter.hasNext() and iter.next() take constant time. Let ts1 be a
TreeSet of size n. Find the worst-case complexity of each of the following method calls.
Solution: The call to iter.remove() takes Θ(log n) time since the iterator
is over a TreeSet. The call to contains takes Θ(m) time since in this case
c is the ArrayList al2. The other operations in the loop take constant time.
Thus, each iteration of the while loop takes Θ(log n + m) time in the worst case
(which occurs if the conditional statement is always true and remove is called
every time). Since the loop executes n times, and the rest of the code takes
constant time, the overall complexity is Θ(n(log n + m)).
⋆Exercise 7.154. Using the setup and code from Example 7.153, determine the complexity
of the following method calls.
It is important to note that the number of examples related to the retainAll method is not
reflective of the importance of this method. It just turns out to be an interesting method to
analyze the complexity of given different data structures.
We end this section with a comment that perhaps too few people think about. Theory and
practice don’t always agree. Since asymptotic notation ignores the constants, two algorithms that
have the same complexity are not always equally good in practice. For instance, if one takes 4 · n^2
operations and the other 10,000 · n^2 operations, clearly the first will be preferred even though
they are both Θ(n^2) algorithms.
As another example, consider matrix multiplication, which is used extensively in many
scientific applications. As we saw, the standard algorithm has complexity Θ(n^3). Strassen’s algorithm
for matrix multiplication (the details of which are beyond the scope of this book) has complexity
of about Θ(n^2.8). Clearly, Strassen’s algorithm is better asymptotically. In other words, if your
matrices are large enough, Strassen’s algorithm is certainly the better choice. However, it turns
out that if n = 50, the standard algorithm performs better. There is debate about the “crossover
point.” This is the point at which the more efficient algorithm is worth using. For smaller inputs,
the overhead associated with the cleverness of the algorithm isn’t worth the extra time it takes.
For larger inputs, the extra overhead is far outweighed by the benefits of the algorithm. For
Strassen’s algorithm, this point may be somewhere between 75 and 100, but don’t quote me on
that. The point is that for small enough matrices, the standard algorithm should be used. For
matrices that are large enough, Strassen’s algorithm should be used. Neither one is always better
to use.
Analyzing recursive algorithms can be a little more complex. We will consider such algorithms
in Chapter 8, where we develop the necessary tools.
Example 7.155. How is the binary representation of a number n related to the binary
representation of ⌊n/2⌋? Let’s try some examples. If n = 9, ⌊n/2⌋ = 4. Notice that the
binary representation of 9 is 1001 and the binary representation of 4 is 100. If n = 22,
⌊n/2⌋ = 11. The binary representation of 22 is 10110 and the binary representation of 11 is
1011. Is there a pattern here? This probably isn’t enough data to be certain yet.
Let’s see if you can find the pattern with just a few more data points.
⋆Exercise 7.156. Fill in the following table with the binary representations.
n (decimal) | n (binary) | ⌊n/2⌋ (decimal) | ⌊n/2⌋ (binary)
12          |            | 6               |
13          |            | 6               |
32          |            | 16              |
33          |            | 16              |
118         |            | 59              |
119         |            | 59              |
⋆Question 7.157. How are the binary representations of n and ⌊n/2⌋ related?
Answer
Hopefully you observed a clear pattern in the previous exercise. The next theorem formalizes
this idea. We provide a proof of the theorem to make it clear what is going on.
Theorem 7.158. The binary representation of ⌊n/2⌋ is the binary representation of n shifted
to the right one bit. That is, the binary representation of ⌊n/2⌋ is the same as that of n with
the last bit (the lowest order bit) chopped off.
\[ n = a_m 2^m + a_{m-1} 2^{m-1} + \cdots + a_2 2^2 + a_1 2^1 + a_0 2^0 . \]
Notice that in the last step, a_0/2 is chopped off by the floor since it is either 0/2
or 1/2 and the other numbers are integers. From this we can see that the binary
representation of ⌊n/2⌋ is a_m a_{m−1} a_{m−2} . . . a_2 a_1, which is the binary representation
of n shifted to the right one bit.
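Theorem 7.158 is easy to spot-check by machine for nonnegative integers. The sketch below relies only on the standard library method Integer.toBinaryString; the class name HalfShift and the method halfIsShift are illustrative names.

```java
// Checks, for a nonnegative int n, that the binary representation of
// floor(n/2) is the binary representation of n with its last bit removed.
final class HalfShift {
    static boolean halfIsShift(int n) {
        String bits = Integer.toBinaryString(n);
        String halfBits = Integer.toBinaryString(n / 2);
        // Chop off the lowest-order bit; a single-bit number becomes "0".
        String chopped = bits.length() > 1
                ? bits.substring(0, bits.length() - 1)
                : "0";
        // n >> 1 is the right shift by one bit described in the theorem.
        return halfBits.equals(chopped) && (n / 2) == (n >> 1);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(22)); // binary form of 22
        System.out.println(Integer.toBinaryString(11)); // binary form of 22 / 2
        System.out.println(halfIsShift(22));
    }
}
```

Passing this check for many values of n is evidence, not a proof. The proof above is what guarantees the pattern for every n.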
Corollary 7.159. If the number n requires exactly k bits to represent in binary, then ⌊n/2⌋
requires exactly k − 1 bits to represent in binary.
Proof: Recall that log_c b is defined as “the number that c must be raised to in
order to get b.” That is, if k = log_c b, then c^k = b. Also, it should be clear that
2^k is the smallest number that requires k + 1 bits to represent in binary. (If you
are not convinced of this, write out some binary numbers near powers of two until
you see it.) Let k be the number such that
\[ 2^{k-1} \le n < 2^k . \qquad (7.4) \]
Since writing 2^(k−1) takes k bits and 2^k is the smallest number that requires k + 1
bits, it should be clear that n requires exactly k bits to represent in binary. Taking
the logarithm of equation 7.4, we get
\[ \log_2 2^{k-1} \le \log_2 n < \log_2 2^k , \]
which leads to
k − 1 ≤ log_2 n < k.
Clearly ⌊log_2 n⌋ = k − 1 since k − 1 is an integer. Thus, k = ⌊log_2 n⌋ + 1, so it takes
⌊log_2 n⌋ + 1 bits to represent n in binary.
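The formula for the number of bits can be checked numerically for positive integers. In this sketch the names BitLength, bitsNeeded, and floorLog2 are illustrative; floorLog2 uses the standard library method Integer.numberOfLeadingZeros to compute ⌊log_2 n⌋ without floating-point rounding.

```java
// Compares the length of the binary representation of n (n >= 1)
// with floor(log2 n) + 1.
final class BitLength {
    // Number of bits in the binary representation of n, for n >= 1.
    static int bitsNeeded(int n) {
        return Integer.toBinaryString(n).length();
    }

    // floor(log2 n) for n >= 1, computed from the position of the highest set bit.
    static int floorLog2(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        int n = 9;
        System.out.println(bitsNeeded(n));     // length of the binary form of 9
        System.out.println(floorLog2(n) + 1);  // floor(log2 9) + 1
    }
}
```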
Example 7.161. You are probably already familiar with the binary search algorithm. It is
given here for reference (see footnote a).
int binarySearch(int a[], int n, int val) {
    int left = 0, right = n - 1;
    while (right - left >= 0) {
        int middle = (left + right) / 2;
        if (val == a[middle])
            return middle;
        else if (val < a[middle])
            right = middle - 1;
        else
            left = middle + 1;
    }
    return -1;
}
Binary search finds the index of a value in a sorted array by comparing the value being
searched for with the middle element of the array. If they are the same, it returns the index
of the element. Otherwise it continues the search in only half of the array. In other words,
it removes from consideration half of the array. Which half depends on whether the search
value was greater than or less than the middle value.
We will show that binary search has worst-case complexity Θ(log n). More precisely, we
will prove that the while loop executes no more than ⌊log2 n⌋ + 1 times.
Proof: Since the code inside the while loop takes a constant amount of time,
the complexity of binary search depends only on the number of iterations of the
loop. Clearly the worst case is when a value is not in the array since otherwise
the loop ends early with the return statement. Thus we will assume the value is
not in the array.
Notice that the value right-left+1 is the number of entries of the array that are still
under consideration by the algorithm. The loop executes until right-left < 0, that is,
until no entries remain.
Before the first iteration, right-left+1 = n. During each iteration, either right
or left is set to the middle value between right and left (plus or minus 1). So
after the first iteration, right-left+1 ≤ ⌊n/2⌋. In other words, the algorithm has
discarded at least half of the entries of the array. During each subsequent iteration,
right-left+1 continues to be no more than the floor of half of its previous value,
so the algorithm continues to discard at least half of the entries of the array each time
through the loop.
According to Corollary 7.159, each iteration of the loop reduces the number of
bits used to represent right-left+1 by at least one. According to Theorem 7.160, it takes
⌊log_2 n⌋ + 1 bits to represent n in binary, and right-left+1 started out as n. Therefore,
after ⌊log_2 n⌋ iterations through the loop, right-left+1 is at most 1, and the
next iteration ensures that right-left becomes negative and the loop terminates
(check this!). Since the loop executes at most ⌊log_2 n⌋ + 1 times, the worst-case
complexity of binary search is Θ(log n).
a
You may be familiar with the recursive version of this algorithm instead of this iterative implementation.
We analyze the iterative version because we have not yet covered recursion or the analysis of recursive
algorithms.
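The bound from the proof can also be checked empirically. The following is a minimal sketch, not code from the text: the class name and helper methods are hypothetical, and it assumes the usual initialization left = 0 and right = n − 1. It counts loop iterations for unsuccessful searches and compares the worst count with ⌊log2 n⌋ + 1.

```java
// Illustrative sketch only: counts while-loop iterations of an iterative
// binary search so the bound floor(log2 n) + 1 can be checked for small n.
class BinarySearchIterations {
    // Number of loop iterations performed when searching sorted array a for val.
    static int iterations(int[] a, int val) {
        int left = 0, right = a.length - 1; // assumed initialization
        int count = 0;
        while (right - left >= 0) {
            count++;
            int middle = (left + right) / 2;
            if (val == a[middle]) {
                return count;
            } else if (val < a[middle]) {
                right = middle - 1;
            } else {
                left = middle + 1;
            }
        }
        return count;
    }

    // floor(log2 n) + 1 for n >= 1, i.e. the number of bits needed to write n.
    static int bound(int n) {
        return 32 - Integer.numberOfLeadingZeros(n);
    }

    // Largest iteration count over unsuccessful searches in an array of size n.
    static int worstCase(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = 2 * i + 1; // odd values 1, 3, 5, ...
        }
        int worst = 0;
        for (int v = 0; v <= 2 * n; v += 2) { // even values are never in the array
            worst = Math.max(worst, iterations(a, v));
        }
        return worst;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 16; n++) {
            System.out.println("n=" + n + " worst=" + worstCase(n) + " bound=" + bound(n));
        }
    }
}
```

A check like this is evidence, not a proof; the proof above is what guarantees the bound for every n.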
(b) Does that mean that f (n) ≤ c g(n) for some constant c and for all values of n? Explain.
(c) Is it possible that g(10) (for instance) is a lot smaller than f (10)? Explain.
⋆Question 7.3. If f(n) = O(g(n)), does that imply that f(n) = Θ(g(n))? If so, explain why. If
not, give an example of functions f and g such that f(n) = O(g(n)) but f(n) ≠ Θ(g(n)).
⋆Question 7.4. If f(n) = Θ(g(n)), does that imply that f(n) = O(g(n))? If so, explain why. If
not, give an example of functions f and g such that f(n) = Θ(g(n)) but f(n) ≠ O(g(n)).
⋆Question 7.5. Explain the difference between f (n) = o(g(n)) and f (n) = O(g(n)).
⋆Question 7.6. If you know that f (n) = Θ(g(n)), does that give you more, less, or the same
amount of information about the relationship between f and g than if you knew that f (n) =
O(g(n))? Explain.
⋆Question 7.7. Give two different proofs that $7n^3 + 4n^2 - 8n + 27 = O(n^3)$. (Do not forget to
use Theorem 7.18 when necessary.)
⋆Question 7.8. Prove that $3^n = o(3.1^n)$.
From Section 7.2
⋆Question 7.9. Explain why n log n grows faster than cn for any constant c > 0. That is,
explain why cn = o(n log n). Note that I am not asking for a proof of this, but an explanation of
why it makes sense.
⋆Question 7.10. We think of log n as a slow growing function. Does that mean that given
another function f (n), f (n) log n grows slower than f (n)? (In general, if we multiply a function
by a slow growing function, does it make the function grow slower?) Explain.
⋆Question 7.11. Rank the following functions in increasing order of rate of growth. Clearly
indicate if two of the functions have the same growth rate:
$n^n$, $7 \log_{10} n$, $n^2 + n + 1$, $7^n$, $3n^2$, $2^n$, $n!$, $\log_3 n$, $7n \log_2 n$, $n^3$, $27n$, $8675309$, $n^3 + n^2 \log_e n$
From Section 7.3
⋆Question 7.12. Explain why “My algorithm only took 5 minutes to run and yours took 15
minutes, so mine is better” is not a complete and valid argument. (What other information is
needed to make the conclusion?)
⋆Question 7.13. If one algorithm always takes 3 times as long to run as another algorithm, re-
gardless of the size of the input, do the two algorithms have different computational complexities?
Explain.
⋆Question 7.14. If algorithm A has computational complexity f (n), algorithm B has computa-
tional complexity g(n), and f (n) = o(g(n)) (that is, g(n) grows faster than f (n)), which algorithm
is faster, A or B? Explain.
⋆Question 7.15. Someone claims to have an algorithm that can sort an array of n elements in
Θ(log n) time. Why can you be certain that they are incorrect about their algorithm?
⋆Question 7.16. Someone has an algorithm that can search an array in time Θ(n log n). Is that
a good or bad algorithm to solve the search problem (or is it impossible to tell)? Explain.
⋆Question 7.17. Is an algorithm that can sort an array in time $\Theta(n^{1.5})$ better or worse than
insertionSort? Explain.
⋆Question 7.18. Give at least two reasons why a double-nested for loop does not always have
complexity $\Theta(n^2)$.
⋆Question 7.19. Algorithm A has complexity $\Theta(n^2)$ and algorithm B has complexity $\Theta(n \log n)$.
(b) Are there potentially cases in which the “worse” algorithm is actually faster? If so, when
might it be faster and why? If not, how do you know that it is never faster?
⋆Question 7.20. Algorithm A has complexity $\Theta(n \log n)$ and algorithm B has complexity $O(n^2)$.
Which algorithm is faster? Explain.
⋆Question 7.21. If two algorithms have the same complexity, what other factors should be
taken into account when choosing which one to use? List as many as you can think of.
7.5 Problems
Problem 7.1. Prove Theorem 7.18.
Problem 7.2. Θ can be thought of as a relation on the set of positive functions, where (f , g) ∈ Θ
iff f (n) = Θ(g(n)). Prove that Θ is an equivalence relation.
Problem 7.3. Rank the following functions in increasing rate of growth. Indicate if two or more
functions have the same growth rate.
Å ãx
3 2 log2 3 √ x 2 x 3/2 log3 7 2 3
x!, x , x log x, x, x , x, 3 , x log x, x , x , x , x , x log(x ), x log(log(x)),
2
Problem 7.4. Prove that $3n^3 - 4n^2 + 13n = O(n^3)$
(a) Using the definition of O.
(b) Using limits.
Problem 7.5. Prove that $5n^2 - 7n = \Theta(n^2)$
(a) Using the definition of Θ and/or Theorem 7.18.
(b) Using limits.
Problem 7.6. Prove that $n \log n = o(n^2)$.
Problem 7.7. Prove that $\log(x^2 + x) = \Theta(\log x)$.
Problem 7.8. Prove that $\sqrt{5x^2 + 11x} = \Theta(x)$.
Problem 7.9. Prove that $n^2 = o(1.01^n)$.
Problem 7.10. Give tight bounds for the best and worst case running times of each of the
following in terms of the size of the input. (Assume A.length = n).
(a) void foo1 ( int n ) {
int foo = 0;
for ( int i = 0 ; i < n ; i ++)
foo += i ;
}
Problem 7.11. Consider the problem of computing the product of two matrices, A and B, where
A is l × m and B is m × n.
(a) Give an efficient algorithm to compute the product A × B. Assume you have a Matrix type
with fields rows and columns that specify the number of rows/columns the matrix has. Thus,
you can call A.rows to get the number of rows A has, for instance. Also assume you can index
a Matrix like an array. Thus, A[i][j] accesses the element in row i and column j.
(a) Give the worst-case complexity of the array version of selection sort.
(b) Give the worst-case complexity of the list version of selection sort assuming the list is an
array-based implementation (e.g. ArrayList).
(c) Give the worst-case complexity of the list version of selection sort assuming the list is a
linked-list implementation.
(d) Compare the three options. Is one of them the clear choice to use? Should any of them never
be used? Explain.
Problem 7.13. Using the code from Example 7.153, determine the complexity of the following
method calls.
(a) hs1.retainAll(al2), where hs1 is a HashSet of size n and al2 is an ArrayList of size m.
(b) hs1.retainAll(ll2), where hs1 is a HashSet of size n and ll2 is a LinkedList of size m.
(c) hs1.retainAll(ts2), where hs1 is a HashSet of size n and ts2 is a TreeSet of size m.
(d) hs1.retainAll(hs2), where hs1 is a HashSet of size n and hs2 is a HashSet of size m.
(e) ll1.retainAll(al2), where ll1 is a LinkedList of size n and al2 is an ArrayList of size m.
(f) ll1.retainAll(ll2), where ll1 is a LinkedList of size n and ll2 is a LinkedList of size m.
(g) ll1.retainAll(ts2), where ll1 is a LinkedList of size n and ts2 is a TreeSet of size m.
(h) ll1.retainAll(hs2), where ll1 is a LinkedList of size n and hs2 is a HashSet of size m.
Problem 7.14. You need to choose data structures for two collections of data, A and B, and the
only thing you know is that the most common operation you will perform is A.retainAll(B).
Given this, what are your best choices for A and B? Clearly justify your choices.
Problem 7.15. In Java, a TreeMap is an implementation of the Map interface that uses a
balanced binary search tree (a red-black tree) to store the keys and values. In particular,
the keys are used as keys in a BST with each key having an associated value. TreeMaps
have methods like1 put(Object key, Object value) (add the key-value pair to the map),
Object get(Object key) (returns the value associated with the key), and ArrayList keySet()
(returns an ArrayList of all the keys). As should be expected, put and get both take Θ(log n)
time. You can assume that keySet takes Θ(n) time.
A method that might be useful on a TreeMap is ArrayList getAll(ArrayList keys) that
returns an ArrayList containing the values associated to the keys passed into the method. Consider
the following implementation of this method.
public ArrayList getAll(ArrayList keys) {
    ArrayList toReturn = new ArrayList();
    for (Object key : keySet()) {
        for (Object k : keys) {
            if (k.equals(key)) {
                toReturn.add(get(key));
            }
        }
    }
    return toReturn;
}
(a) Does this method work properly? Explain why it does or does not.
(c) Rewrite the method so that it is as efficient as possible and give the worst-case complexity of
the new version.
1
For simplicity we are ignoring generics here. If you don’t know what that means but you understand what this
problem is saying, don’t worry about it.
(a) Give the worst-case complexity of this algorithm if A is an array-based list (e.g., an ArrayList).
(c) Would it ever make sense to implement binary search on a linked list? Explain.
Chapter 8: Recursion, Recurrences, and
Mathematical Induction
In this chapter we will explore a proof technique, an algorithmic technique, and a mathematical
technique. Each topic is in some ways very different than the others, yet they have a whole lot in
common. They are also often used in conjunction.
You have already seen recurrence relations. Recall that a recurrence relation is a way of
defining a sequence of numbers with a formula that is based on previous numbers in the sequence.
You are probably also familiar with recursion, an algorithmic technique in which an algorithm
calls itself (such an algorithm is called recursive), typically with “smaller” input. Finally, the
principle of mathematical induction is a slick proof technique that works so well that sometimes
it feels like you are cheating.
We will see that induction can be used to prove formulas, prove that algorithms—especially
recursive ones—are correct, and help solve recurrence relations. Among other things, recurrence
relations can be used to analyze recursive algorithms. Recursive algorithms can be used to compute
the values defined by recurrence relations and to solve problems that can be broken into smaller
versions of themselves.
As we will see, each of these has one or more base cases that can be proved/computed/de-
termined directly and a recursive or inductive step that relies on previous steps. With each, the
inductive/recursive steps must eventually lead to a base case.
Because induction can be used to prove things about the other two, we will begin there.
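To make the connection between recurrences and recursion concrete, here is a small sketch. The recurrence a(0) = 0, a(n) = 2a(n − 1) + 1 is a hypothetical example chosen for illustration, not one taken from the text. The method has one base case and one recursive step, and every call eventually reaches the base case.

```java
// Hypothetical example: a recursive method that computes the values of the
// recurrence a(0) = 0, a(n) = 2 * a(n - 1) + 1 for n >= 1.
class RecurrenceDemo {
    static long a(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be nonnegative");
        }
        if (n == 0) {
            return 0;            // base case: determined directly
        }
        return 2 * a(n - 1) + 1; // recursive step: relies on a smaller input
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 10; n++) {
            System.out.println("a(" + n + ") = " + a(n));
        }
    }
}
```

Looking at the values suggests the closed form a(n) = 2^n − 1, which is exactly the kind of formula that induction can be used to prove.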
Proof: We use induction and the idea from the solution to Exercise 5.24.
Clearly if |A| = 1, A has $2^1 = 2$ subsets: ∅ and A itself.
Assume every set with n − 1 elements has $2^{n-1}$ subsets. Let A be a set with n
elements. Choose some x ∈ A. Every subset of A either contains x or it doesn’t.
Those that do not contain x are subsets of A \ {x}. Since A \ {x} has n − 1
elements, the induction hypothesis implies that it has $2^{n-1}$ subsets. Every subset
that does contain x corresponds to one of the subsets of A \ {x} with the element
x added. That is, for each subset S ⊆ A \ {x}, S ∪ {x} is a subset of A containing
x. Clearly there are $2^{n-1}$ such new subsets. Since this accounts for all subsets of
A, A has $2^{n-1} + 2^{n-1} = 2^n$ subsets.
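The structure of this proof translates almost directly into a recursive method. The sketch below is only an illustration under stated assumptions: the class and method names are hypothetical, sets are represented as lists of distinct integers, and the recursion bottoms out at the empty set (which has exactly one subset) rather than at a one-element set as in the proof.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: builds all subsets of A from the subsets of A \ {x},
// mirroring the inductive step of the proof above.
class SubsetCount {
    static List<List<Integer>> subsets(List<Integer> a) {
        List<List<Integer>> result = new ArrayList<>();
        if (a.isEmpty()) {
            result.add(new ArrayList<>()); // the empty set has one subset: itself
            return result;
        }
        Integer x = a.get(a.size() - 1);                 // choose some x in A
        List<Integer> rest = a.subList(0, a.size() - 1); // A \ {x}
        for (List<Integer> s : subsets(rest)) {
            result.add(s);                               // subset that does not contain x
            List<Integer> withX = new ArrayList<>(s);
            withX.add(x);
            result.add(withX);                           // the same subset with x added
        }
        return result;
    }
}
```

Each subset of A \ {x} contributes exactly two subsets of A, which is the doubling that the proof counts.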
Now we will go into detail about how and why induction works. You should come back and
reread Example 8.1 after reading section 8.1.1.
⋆Exercise 8.2. Based on the description so far, which of the following statements might we
be able to prove with mathematical induction (indicate with ‘Y’ or ‘N’)? Briefly justify.
(b) Every positive integer can be written as the sum of two other positive integers.
(c) Every integer greater than 1 can be written as the product of prime numbers.
(d) If n ≥ 1, $\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$
The following example illustrates the idea behind induction. It uses modus ponens. Recall
that modus ponens states that if p is true and p → q is true, then q is true. In English, “If p is
true, and whenever p is true q is true, then q is true.”1
Example 8.3. Assume that we know that P (1) is true and that whenever k ≥ 1, P (k) →
P (k + 1) is true. What can we conclude?
Solution: Let’s start from the ground up. We know that P (1) is true. We
also know that P (k) → P (k + 1) is true for any integer k ≥ 1. For instance, since
4 ≥ 1, we know that P (4) → P (5) is true. It should be noted that we don’t (yet)
1
We can also write this as the tautology [p ∧ (p → q)] → q.
• We know P (1) is true and 1 ≥ 1, P (1) → P (2) is true, therefore P (2) is true.
• Since P (2) is true and 2 ≥ 1, P (2) → P (3) is true, therefore P (3) is true.
• Since P (3) is true and 3 ≥ 1, P (3) → P (4) is true, therefore P (4) is true.
• Since P (4) is true and 4 ≥ 1, P (4) → P (5) is true, therefore P (5) is true.
• Since P (5) is true and 5 ≥ 1, P (5) → P (6) is true, therefore P (6) is true.
It seems pretty clear that this pattern continues for all values of k > 6 as well, so
P (k) is true for all k ≥ 1.
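The chain of modus ponens steps in this example can be mimicked with a loop. This is a rough sketch with hypothetical names: known[k] records that P(k) has been derived, and the rule P(k) → P(k + 1) is applied once for each k ≥ 1, in increasing order.

```java
// Illustrative sketch: repeated modus ponens, starting from one known fact.
class ModusPonensChain {
    // Returns an array where known[k] is true if P(k) was derived for 1 <= k <= limit.
    static boolean[] derive(int start, int limit) {
        boolean[] known = new boolean[limit + 1];
        known[start] = true; // the fact we are given
        for (int k = 1; k < limit; k++) {
            // P(k) -> P(k + 1) holds for every k >= 1, but modus ponens
            // only lets us conclude P(k + 1) once P(k) itself is known.
            if (known[k]) {
                known[k + 1] = true;
            }
        }
        return known;
    }
}
```

Calling derive(1, 6) marks P(1) through P(6) as known, matching the bulleted steps above. The loop can only reach finitely many values, which is why a real proof of the principle is still needed.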
⋆Question 8.4. Example 8.3 had several statements like the following:
“Since P (4) is true and 4 ≥ 1, P (4) → P (5) is true, therefore P (5) is true.”
Answer
Example 8.3 did not give a formal proof of the conclusion. The idea is to get you thinking about
how mathematical induction works, not to provide a formal proof that it does (yet). Hopefully
this example will help prime your brain for the proof that mathematical induction is a valid proof
technique that we will give shortly.
Before moving on, we should make sure you understand what has already been said.
⋆Question 8.5. If you know that P (5) is true, and you also know that P (k) → P (k + 1)
whenever k ≥ 1, what can you conclude?
Answer
⋆Question 8.6. If you know that P (17) is true and you also know that P (k) → P (k + 1)
whenever k ≥ 1, what can you conclude about P (10)?
Answer
Now it is time to get more formal with our discussion. Mathematical induction is based on
the fact that if P (a) is true for some a ≥ 0 (the base case), and for any k ≥ a, if P (k) is true,
then P (k + 1) is true (the inductive case), then P (n) is true for all n ≥ a. In other words, the
principle of mathematical induction is based on the fact that
[P (a) ∧ ∀k(P (k) → P (k + 1))] → (∀nP (n))
is true.
⋆Exercise 8.7. Restate [P (a) ∧ ∀k(P (k) → P (k + 1))] → (∀nP (n)) (where the universe is
{a, a + 1, a + 2, . . .}) in English.
Answer
The proof that [P (a) ∧ ∀k(P (k) → P (k + 1))] → (∀nP (n)) is true is based on something
called the well-ordering principle which states that every nonempty subset of the natural numbers
has a least element. Read the following proof very carefully, making sure you understand the
justification of every step. If you are not sure about any of the steps, it is important that you get
them clarified!
Theorem 8.8. Assume we are working over the universe {a, a + 1, a + 2, . . .}. The statement
[P (a) ∧ ∀k(P (k) → P (k + 1))] → (∀nP (n)) is true.
Proof: If the statement is false, then it must be that P (a) ∧ ∀k(P (k) → P (k + 1))
is true but that ∀nP (n) is false. Let S = {s ∈ {a, a + 1, a + 2, . . .} | ¬P (s)}. That
is, S is the set of integers in the universe for which P (n) is false. Since ∀nP (n) is false, S
is nonempty. Clearly S is a subset of the natural numbers, so the well-ordering
principle applies. Therefore there is some least element b ∈ S. Since b ∈ S, P (b)
is false. Since P (a) is true, b ≠ a, so b > a and b − 1 is in the universe. Since b is
the least element of S, b − 1 ∉ S, so P (b − 1) is true.
But we know that ∀k(P (k) → P (k + 1)) is true, so P (b − 1) → P (b). By modus
ponens, P (b) is true, a contradiction. Therefore the statement is true.
It is definitely worth your time to convince yourself that mathematical induction is a valid
technique. If you aren’t convinced, reread the proof, think about it some more, and/or ask
someone to help you understand it.
⋆Question 8.9. Are you convinced that [P (a) ∧ ∀k(P (k) → P (k + 1))] → (∀nP (n)) is true?
Answer
We call P (a) the base case. Sometimes we actually need to prove several base cases (we will
see why later). For instance, we might need to prove P (a), P (a + 1), and P (a + 2) are all true.
The inductive step involves proving that ∀k(P (k) → P (k + 1)) is true. To prove it, we show
that if P (k) is true for any k which is at least as large as the base case(s), then P (k + 1) is true.
The assumption that P (k) is true is called the inductive hypothesis.
Based on our discussion so far, here is the procedure for writing induction proofs.
Procedure 8.10. To use induction to prove that ∀nP (n) is true on domain {a, a + 1, . . .}:
1. Base Case: Show that P (a) is true (and possibly one or more additional base cases).
(a) Inductive Hypothesis: Let k ≥ a be an integer and assume that P (k) is true.
(b) Inductive Step: Prove that P (k + 1) is true, typically using the fact that P (k) is
true.
Assuming we used no special facts about k other than k ≥ a, this means we have
shown that ∀k(P (k) → P (k + 1)) (again, where it is understood that the domain is
{a, a + 1, . . .}).
3. Summary: Conclude that ∀nP (n) is true, usually by saying something like “Since P (a)
and P (k) → P (k + 1) for all k ≥ a, ∀nP (n) is true by induction.”
As you will quickly learn, the base case is generally pretty easy, as is writing down the inductive
hypothesis. The summary is even easier, since it almost always says the same thing. The inductive
step is the longest and most complicated step. In fact, in mathematics and theoretical computer
science journals, induction proofs often only include the inductive step since anyone reading papers
in such journals can generally fill in the details of the other three parts. But keep in mind that
you are not (yet) writing papers for such journals, so you cannot omit these steps!
Let’s see another example.
Example 8.11. Prove that the sum of the first n odd integers is $n^2$. That is, show that
\[ \sum_{i=1}^{n} (2i - 1) = n^2 \]
for all n ≥ 1.
Proof: Let P (n) be the statement “$\sum_{i=1}^{n} (2i - 1) = n^2$”. We need to show that
P (n) is true for all n ≥ 1.
Base Case: Since $\sum_{i=1}^{1} (2i - 1) = 2 \cdot 1 - 1 = 1 = 1^2$, P (1) is true.
Inductive Hypothesis: Let k ≥ 1 and assume that P (k) is true. That is, assume
that $\sum_{i=1}^{k} (2i - 1) = k^2$ when k ≥ 1.
Inductive Step: Then
\begin{align*}
\sum_{i=1}^{k+1} (2i - 1) &= \sum_{i=1}^{k} (2i - 1) + (2(k + 1) - 1) && \text{(take the $k + 1$ term from the sum)} \\
&= k^2 + (2k + 2 - 1) && \text{(by the inductive hypothesis)} \\
&= k^2 + 2k + 1 \\
&= (k + 1)^2
\end{align*}
Thus P (k + 1) is true.
Summary: Since we proved that P (1) is true, and that P (k) → P (k+1) whenever
k ≥ 1, P (n) is true for all n ≥ 1 by the principle of mathematical induction.
The previous proof had the four components we discussed. We proved the base case. We then
assumed it was true for k. That is, we made the inductive hypothesis. Next we proved that it was
true for k + 1 based on the assumption that it is true for k. That is, we did the inductive step.
Finally, we appealed to the principle of mathematical induction in the summary.
Did you notice the quotes? It is important that you include these. This is particularly important
if you use notation such as P (n) = “$\sum_{i=1}^{n} (2i - 1) = n^2$”. Without the quotes, this becomes
$P(n) = \sum_{i=1}^{n} (2i - 1) = n^2$, which is defining P (n) to be $\sum_{i=1}^{n} (2i - 1)$ and saying that it is also
equal to $n^2$. These are not saying the same thing. With the quotes, P (n) is a propositional
function. Without them, it is a function from Z to Z.
In fact, to avoid this confusion, I recommend that you never use the equals sign with
propositional functions, especially when writing induction proofs.
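Before attempting an induction proof of a formula such as the one in Example 8.11, it can be reassuring to test it numerically. The following sketch uses hypothetical names and is not part of the text; it computes the sum of the first n odd integers with a loop so the result can be compared with n².

```java
// Illustrative sketch: numerically checks the formula from Example 8.11.
class OddSum {
    // Sum of the first n odd integers: 1 + 3 + ... + (2n - 1).
    static long sumOfFirstOdds(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += 2L * i - 1;
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++) {
            System.out.println(n + ": " + sumOfFirstOdds(n) + " vs " + (long) n * n);
        }
    }
}
```

Agreement for the first few values of n only suggests that the formula is right. The induction proof is what establishes it for all n ≥ 1.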
Now it’s your turn to try to fill in the details of an induction proof.
⋆Fill in the details 8.12. Reprove Theorem 6.50 using induction. That is, prove that for
n ≥ 1, $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$.
Proof: Let P (k) be the statement “$\sum_{i=1}^{k} i = \frac{k(k+1)}{2}$”. We need to show that
P (n) is true for all n ≥ 1.
Base Case: When k = 1, we have $\sum_{i=1}^{1} i = 1 =$ ________. Therefore, ________
[This is not part of the proof, but it will help us see what’s next. Our
$\sum_{i=1}^{k+1} i$ = ________ + (k + 1)
= (k + 1)
Thus, .
8.1.2 Equalities/Inequalities
The last few example induction proofs have dealt with statements of the form
LHS(k) = RHS(k),
where LHS stands for left hand side and RHS stands for right hand side. For instance, in
Example 8.11, the statement was
\[ \sum_{i=1}^{n} (2i - 1) = n^2, \]
so $\mathrm{LHS}(k) = \sum_{i=1}^{k} (2i - 1)$ and $\mathrm{RHS}(k) = k^2$.
⋆Question 8.13. Let P (n) be the statement “$\sum_{i=1}^{n} i \cdot i! = (n + 1)! - 1$.” Determine each of
the following:
(c) LHS(k) =
(d) RHS(k) =
(e) LHS(k + 1) =
(f) RHS(k + 1) =
For statements of this form, the goal of the inductive step is to show that LHS(k + 1) =
RHS(k + 1) given the fact that LHS(k)=RHS(k) (the inductive hypothesis). The way this
should generally be done is as follows:
Procedure 8.14. Given a proposition of the form “LHS(n) = RHS(n),” the algebra in the
inductive step of an induction proof should be done as follows:
\begin{align*}
\mathrm{LHS}(k + 1) &= \mathrm{LHS}(k) + \mathit{stuff} && \text{(apply algebra to separate LHS($k$) from the rest)} \\
&= \mathrm{RHS}(k) + \mathit{stuff} && \text{(use the inductive hypothesis to replace LHS($k$) with RHS($k$))} \\
&= \cdots && \text{(1 or more steps, usually involving algebra, that} \\
&= \mathrm{RHS}(k + 1) && \text{result in the goal of getting to RHS($k + 1$))}
\end{align*}
The last few examples followed this procedure, and your proofs should also follow it. Notice that
these examples do not begin the inductive step by writing out LHS(k + 1) = RHS(k + 1). One
of them wrote it out, but it was before the inductive step for the purpose of making the goal in
the inductive step clear. The inductive step should always begin by writing just LHS(k + 1), and
should then use algebra, the inductive hypothesis, etc., until RHS(k + 1) is obtained.
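As a further illustration of this pattern, here is the inductive step for a statement that is not one of the exercises in this section (it is chosen only as an example): $\sum_{i=0}^{n} 2^i = 2^{n+1} - 1$ for all n ≥ 0. Here LHS(k) is the sum up to k and RHS(k) is $2^{k+1} - 1$. Assuming LHS(k) = RHS(k), the step starts from LHS(k + 1) alone and works toward RHS(k + 1):

```latex
\begin{align*}
\mathrm{LHS}(k+1) &= \sum_{i=0}^{k+1} 2^i \\
&= \sum_{i=0}^{k} 2^i + 2^{k+1} && \text{(separate LHS($k$) from the rest)} \\
&= (2^{k+1} - 1) + 2^{k+1} && \text{(by the inductive hypothesis)} \\
&= 2 \cdot 2^{k+1} - 1 \\
&= 2^{k+2} - 1 = \mathrm{RHS}(k+1)
\end{align*}
```

Note that the goal $2^{k+2} - 1$ appears only at the end of the chain, never as a starting assumption.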
This technique also works (with the appropriate slight modifications) with inequalities, e.g.
LHS(k) ≤ RHS(k) and
LHS(k) ≥ RHS(k).
For instance, if P (k) is the statement “$k > 2^k$”, LHS(k) = k, and RHS(k) = $2^k$. In addition,
the ‘+stuff’ is not always literally addition. For instance, it might be LHS(k) × stuff.
Here is another example of this type of induction proof–this time using an inequality.
Proof: Let P (n) be the statement “$n < 2^n$”. We want to prove that P (n) is
true for all n ≥ 1.
Base Case: Since $1 < 2^1$, P (1) is clearly true.
Hypothesis: We assume P (k) is true if k ≥ 1. That is, $k < 2^k$.
In the previous example, LHS(k) = k, so LHS(k + 1) is already in the form LHS(k) + stuff
since LHS(k + 1) = k + 1 = LHS(k) + 1. So the first step of algebra is unnecessary and we were
able to apply the inductive hypothesis immediately. Don’t let this confuse you. This is essentially
the same as the other examples minus the need for algebra in the first step.
Note: By the time you are done with this section, you will likely be tired of hearing this,
but since it is the most common mistake made in induction proofs, it is worth repeating ad
nauseam. Never begin the inductive step of an induction proof by writing down
P(k + 1). You do not know it is true yet, so it is not valid to write it down as if it were true
so that you can use a technique such as working both sides to verify that it is true (which, as
we have also previously stated, is not a valid proof technique).
You can (and sometimes should) write down P (k + 1) on another piece of paper or with
a comment such as “We need to prove that” preceding it so that you have a clear direction
for the inductive step.
If you can complete the next exercise without too much difficulty, you are well on your way
to understanding how to write induction proofs.
⋆Exercise 8.16. Use induction to prove that for all n ≥ 1, $\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$.
(Hint: Follow the techniques and format of the previous examples and be smart about your
algebra and it will go a lot easier. Also, you will need to factor a polynomial in the inductive
step, but if you determine what the goal is ahead of time, it shouldn’t be too difficult.)
8.1.3 Variations
In this section we will discuss a few slight variations of the details we have presented so far. First
we discuss the fact that we do not need to use a propositional function. Then we will discuss a
variation regarding the inductive hypothesis.
It is not always necessary to explicitly define P (k) for use in an induction proof. P (k) is
used mostly for convenience and clarity. For instance, in the solution to the previous exercise, it
allowed us to just say
“P (k) is true”
instead of saying
“$\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$” (which is long)
or
Here is an example that does not use P (k). It also does not label the four parts of the proof.
That is perfectly fine. The main reason we have done so in previous examples is to help you
identify them more clearly.
Example 8.17. Let fn be the n-th Fibonacci number. Prove that for all integers n ≥ 1,
and so the assertion is true for k = 1. Suppose k ≥ 1, and that the assertion is
true for k. That is,
$f_{k-1} f_{k+1} = f_k^2 + (-1)^k$.
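The identity in the inductive hypothesis can be spot-checked numerically. The sketch below is an illustration with hypothetical names, and it assumes the indexing f_0 = 0 and f_1 = 1, which may differ from the convention used elsewhere in the book.

```java
// Illustrative sketch: checks f(k-1) * f(k+1) = f(k)^2 + (-1)^k for small k,
// assuming the indexing f(0) = 0, f(1) = 1.
class CassiniCheck {
    // Returns the Fibonacci numbers f(0), ..., f(max), for max >= 1.
    static long[] fib(int max) {
        long[] f = new long[max + 1];
        f[0] = 0;
        f[1] = 1;
        for (int i = 2; i <= max; i++) {
            f[i] = f[i - 1] + f[i - 2];
        }
        return f;
    }

    // True if the identity holds at index k, where 1 <= k < f.length - 1.
    static boolean holds(int k, long[] f) {
        long sign = (k % 2 == 0) ? 1 : -1;
        return f[k - 1] * f[k + 1] == f[k] * f[k] + sign;
    }
}
```

The range of k is kept small here so that the products fit comfortably in a long.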
This can be rewritten as
\[ 1 \cdot 2 + 2 \cdot 2^2 + 3 \cdot 2^3 + \cdots + n \cdot 2^n = 2 + (n - 1)2^{n+1} \]
or if you prefer,
\[ \sum_{i=1}^{n} i \cdot 2^i = 2 + (n - 1)2^{n+1}. \]
Do so without using a propositional function. You may label the four parts of your proof,
but it is not required.
Example 8.19. Prove the generalized form of DeMorgan’s law. That is, show that for any
n ≥ 2, if p1 , p2 , . . ., pn are propositions, then
We provide several appropriate proofs of this one (and one inappropriate one).
Proof 1: (A typical proof)
Thus P (k+1) is true. Since we proved that P (2) is true, and that P (k) → P (k+1)
if k ≥ 2, by PMI, P (n) is true for all n ≥ 2.
We know that ¬(p1 ∨ p2 ) = (¬p1 ∧ ¬p2 ) since this is simply DeMorgan’s law.
Assume the statement is true for k. That is, ¬(p1 ∨ p2 ∨ · · · ∨ pk ) = (¬p1 ∧ ¬p2 ∧
· · · ∧ ¬pk ). Then we can see that
Thus the statement is true for k + 1. Since we have shown that the statement is
true for n = 2, and that whenever it is true for k it is true for k + 1, by PMI, the
statement is true for all n ≥ 2.
Sometimes it is acceptable to omit the justification in the summary. That is, there
isn’t necessarily a need to restate what you have proven and you can just jump to
the conclusion. So the previous proof could end as follows:
Thus the statement is true for k + 1. By PMI, the statement is true for
all n ≥ 2.
⋆Evaluate 8.20. Prove that for all positive integers n, $\sum_{i=1}^{n} i \cdot i! = (n + 1)! - 1$.
Solution: Base: n = 1
\begin{align*}
1 \cdot 1! &= (1 + 1)! - 1 \\
1 &= 2! - 1 \\
1 &= 1
\end{align*}
Assume $\sum_{i=1}^{n} i \cdot i! = (n + 1)! - 1$ for n ≥ 1.
Induction:
\begin{align*}
\sum_{i=1}^{n+1} i \cdot i! &= \sum_{i=1}^{n} i \cdot i! + (n + 1)(n + 1)! \\
&= (n + 1)! - 1 + (n + 1)(n + 1)! \\
&= (n + 1 + 1)(n + 1)! - 1 \\
&= (n + 2)(n + 1)! - 1 \\
&= (n + 2)! - 1
\end{align*}
Evaluation
The second variation we wish to discuss has to do with the inductive hypothesis/step. In the
inductive step, we can replace P (k) → P (k + 1) with P (k − 1) → P (k) as long as we prove the
statement for all k larger than any of the base cases. In general, we can use whatever index we
want for the inductive hypothesis as long as we use it to prove that the statement is true for the
next index, and as long as we are sure to cover all of the indices down to the base case. For
instance, if we prove P (k + 3) → P (k + 4), then we need to show it for all k + 3 ≥ a (that is, all
k ≥ a − 3), assuming a is the base case. Put simply, the assumption we make about the value of
k must guarantee that the inductive hypothesis includes the base case(s).
⋆Question 8.21. Consider a ‘proof’ of ∀nP (n) that shows that P (1) is true and that
P (k) → P (k + 1) for k > 1. What is wrong with such a proof?
Answer
Note: Whether you assume P (k) or P (k−1) is true, you must specify the values of k precisely
based on your choice. For instance, if you assume P (k) is true for all k > a, you have a
problem. Although you know P (a) is true (because it is a base case), when you assume
P (k) is true for k > a, the smallest k can be is a + 1. In other words, when you prove
P (k) → P (k + 1), you leave out P (a) → P (a + 1). But that means you can’t get anywhere
from the base case, so the whole proof is invalid.
If you are wondering why we would use P (k − 1) as the inductive hypothesis instead of P (k),
it is because sometimes it makes the proof easier–for instance, the algebra steps involved might
be simpler.
$3^{3n+3} - 26n - 27$
Proof: Let P (k) be the statement “$3^{3k+3} - 26k - 27 = 169N$ for some N ∈ N.”
We will prove that P (0) is true and that P (k − 1) → P (k).
⋆Question 8.23. Did you notice that in the previous example we assumed k > 0 instead
of k ≥ 0? Why did we do that?
Answer
[P (a) ∧ P (a + 1) ∧ · · · ∧ P (k)] → P (k + 1) if k ≥ a.
This may look more complicated, but practically speaking, there is really very little difference.
Essentially, strong induction just allows us to assume more than weak induction. Let’s see an
example of why we might need strong induction.
Example 8.24. Show that every integer n ≥ 2 can be written as the product of primes.
Proof: Let P (n) be the statement “n can be written as the product of primes.”
We need to show that for all n ≥ 2, P (n) is true.
Since 2 is clearly prime, it can be written as the product of one prime. Thus P (2)
is true.
Assume [P (2) ∧ P (3) ∧ · · · ∧ P (k − 1)] is true for k > 2. In other words, assume
all of the numbers from 2 to k − 1 can be written as the product of primes.
We need to show that P (k) is true. If k is prime, clearly P (k) is true. If k is not
prime, then we can write k = a · b, where 2 ≤ a ≤ b < k. By hypothesis, P (a) and
P (b) are true, so a and b can be written as the product of primes. Therefore, k can
be written as the product of primes, namely the primes from the factorizations of
a and b. Thus P (k) is true.
Since we proved that P (2) is true, and that [P (2) ∧ P (3) ∧ · · · ∧ P (k − 1)] → P (k)
if k > 2, by the principle of mathematical induction, P (n) is true for all n ≥ 2.
That is, every integer n ≥ 2 can be written as the product of primes.
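The proof in Example 8.24 is constructive, so it can be turned into a recursive method. The sketch below uses hypothetical names and is only one way to do it: when k is not prime it splits k as a · b with 2 ≤ a ≤ b < k and recurses on both factors, just as the strong inductive hypothesis is applied to both P(a) and P(b).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: writes k >= 2 as a product of primes by following
// the strong induction argument of Example 8.24.
class PrimeProduct {
    // Smallest divisor of k that is at least 2 (equals k exactly when k is prime).
    static int smallestFactor(int k) {
        for (int a = 2; (long) a * a <= k; a++) {
            if (k % a == 0) {
                return a;
            }
        }
        return k;
    }

    static List<Integer> factor(int k) {
        if (k < 2) {
            throw new IllegalArgumentException("k must be at least 2");
        }
        List<Integer> primes = new ArrayList<>();
        int a = smallestFactor(k);
        if (a == k) {
            primes.add(k);           // k is prime: a product of one prime
            return primes;
        }
        int b = k / a;               // k = a * b with 2 <= a <= b < k
        primes.addAll(factor(a));    // uses P(a)
        primes.addAll(factor(b));    // uses P(b)
        return primes;
    }
}
```

The recursive call on b is the reason weak induction is not enough here: b is smaller than k, but it is usually not k − 1.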
Example 8.25. In the country of SmallPesia coins only come in values of 3 and 5 pesos.
Show that any quantity of pesos greater than or equal to 8 can be paid using the available
coins.
Notice that there is no way we could have used weak induction in either of the previous
examples.
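For the coin problem of Example 8.25, one possible strong induction argument (which may differ from the book's own proof) uses the three base cases 8 = 3 + 5, 9 = 3 + 3 + 3, and 10 = 5 + 5, and then pays n ≥ 11 by paying n − 3 and adding one 3-peso coin. The following sketch, with hypothetical names, follows that argument.

```java
// Illustrative sketch: pays n >= 8 pesos with 3- and 5-peso coins by
// following a strong induction argument with base cases 8, 9, and 10.
class CoinPayment {
    // Returns {threes, fives} such that 3 * threes + 5 * fives == n.
    static int[] pay(int n) {
        if (n < 8) {
            throw new IllegalArgumentException("n must be at least 8");
        }
        if (n == 8) {
            return new int[] {1, 1};   // 8 = 3 + 5
        }
        if (n == 9) {
            return new int[] {3, 0};   // 9 = 3 + 3 + 3
        }
        if (n == 10) {
            return new int[] {0, 2};   // 10 = 5 + 5
        }
        int[] smaller = pay(n - 3);    // n - 3 >= 8, so the hypothesis applies
        return new int[] {smaller[0] + 1, smaller[1]};
    }
}
```

Three base cases are needed because the recursive step jumps back by 3, so every n ≥ 11 must land on 8, 9, or 10.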
Example 8.26. What is wrong with the following (supposed) proof that $a^n = 1$ for n ≥ 0:
\[ a^{k+1} = \frac{a^k \cdot a^k}{a^{k-1}} = \frac{1 \cdot 1}{1} = 1. \]
Summary: Therefore by PMI, $a^n = 1$ for all n ≥ 0.
Solution: The base case is correct, and there is nothing wrong with the
summary, assuming the inductive step is correct. $a^k = 1$ and $a^{k-1} = 1$ are correct
by the inductive hypothesis since we are assuming k > 0. The algebra is also
correct. So what is wrong? The problem is that when k = 0, $a^{-1}$ would be in
the denominator. But we don’t know whether or not $a^{-1} = 1$. Thus we needed
to assume k > 0. As it turns out, that is precisely where the problem lies. We
proved that P (0) is true and that P (k) → P (k + 1) is true when k > 0. Thus,
we know that P (1) → P (2), and P (2) → P (3), etc., but we never showed that
P (0) → P (1) because, of course, it isn’t true. The induction doesn’t work without
P (0) → P (1).
⋆Evaluate 8.27. Prove or disprove that all goats are the same color.
Evaluation
The next example deals with binary palindromes. Binary palindromes can be defined recur-
sively by λ, 0, 1 ∈ P , and whenever p ∈ P , then 1p1 ∈ P and 0p0 ∈ P . (Note: λ is the notation
sometimes used to denote the empty string—that is, the string of length 0. Also, 1p1 means the
binary string obtained by appending 1 to the beginning and end of string p. Similarly for 0p0.) Notice
that there is 1 palindrome of length 0 (λ), 2 of length 1 (0, 1), 2 of length 2 (00, 11), 4 of length
3 (000, 010, 101, 111), etc.
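The recursive definition can be used directly to generate palindromes. In the sketch below (names are hypothetical), the palindromes of length n are built from those of length n − 2 by wrapping each one in a pair of 0s or a pair of 1s, with the empty string λ represented by "".

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: generates all binary palindromes of a given length
// from the recursive definition (lambda, 0, 1, and the rules 0p0 and 1p1).
class BinaryPalindromes {
    static List<String> ofLength(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("length must be nonnegative");
        }
        List<String> result = new ArrayList<>();
        if (n == 0) {
            result.add("");          // lambda, the empty string
            return result;
        }
        if (n == 1) {
            result.add("0");
            result.add("1");
            return result;
        }
        for (String p : ofLength(n - 2)) {
            result.add("0" + p + "0");
            result.add("1" + p + "1");
        }
        return result;
    }
}
```

Calling ofLength for lengths 0 through 3 produces lists of sizes 1, 2, 2, and 4, in agreement with the counts listed above.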
⋆Evaluate 8.28. Use induction to prove that the number of binary palindromes of length
2n (even length) is $2^n$ for all n ≥ 0.
Evaluation
Evaluation
Proof 3: The empty string is the only string of length 0, and it is a palindrome.
Thus there is $1 = 2^0$ palindromes of length 0. Let 2n be the length,
assume 2n → $2^n$ palindromes. Now we look at n + 1 so we know the length
is 2n + 2 and it starts and ends with either 0 or 1 and has 2n values in
between. Both possibilities imply $2^n$ palindromes, so $2^n + 2^n = 2^{n+1}$.
Evaluation
⋆Exercise 8.29. Based on the feedback from the previous Evaluate exercise, construct a
proper proof that the number of binary palindromes of length $2n$ is $2^n$ for all $n \ge 0$.
8.1.6 Summary/Tips
Induction proofs are both intuitive and non-intuitive. On the one hand, when you talk through
the idea, it seems to make sense. On the other hand, it almost seems like you are using circular
reasoning. It is important to understand that induction proofs do not rely on circular reasoning.
Circular reasoning is when you assume p in order to prove p. But here we are not doing that. We
are assuming P (k) and using that fact to prove P (k + 1), a different statement. However, we are
not assuming that P (k) is true for all k ≥ a. We are proving that if we assume that P (k) is
true, then P (k + 1) is true. The difference between these statements may seem subtle, but it is
important.
Let’s summarize our approach to writing an induction proof. This is similar to Procedure 8.10
except we include several of the unofficial steps we have been using that often come in handy.
You are not required to use this procedure, but if you are having a difficult time with induction
proofs, try this out. Here is the brief version. After this we provide some further comments about
each step.
1. Define: (optional) Define P (n) based on the statement you need to prove.
2. Rephrase: (optional) Rephrase the statement you are trying to prove using P (n). This
step is mostly to help you be clear on what you need to prove.
5. Goal: (optional) Write out the goal of the inductive step (coming next). It is usually
“I need to show that P (k + 1) is true.” It can be helpful to explicitly write out P (k + 1),
although see important comments about this step below. This is another step that is
mostly for your own clarity.
6. Inductive: Prove the goal statement, usually using the inductive hypothesis.
1. Define: P (n) should be a statement about a single instance, not about a series of instances.
For example, it should be statements like “$2n$ is even” or “A set with n elements has $2^n$
subsets.” It should NOT be of the form “$2n$ is even if n > 1,” “$n^2 > 0$ if $n \neq 0$,” or “For all
n > 1, a set with n elements has $2^n$ subsets.”
2. Rephrase: In almost all cases, the rephrased statement should be “For all n ≥ a, P (n) is
true,” where a is some constant, often 0 or 1. If the statement cannot be phrased in this
way, induction may not be appropriate.
3. Base Case: For most statements, this means showing that P (a) is true, where a is the
value from the rephrased statement. Although usually one base case suffices, sometimes one
must prove multiple base cases, usually P (a), P (a + 1), . . . , P (a + i) for some i > 0. This
depends on the details of the inductive step.
Sometimes it is helpful to write out the hypothesis explicitly (that is, write down the whole
statement with k or k − 1 plugged in).
5. Goal: As previously stated, this is almost always “I need to show that P (k + 1) is true” (or
“I need to show that P (k) is true”). But it can be very helpful to explicitly write out what
P (k + 1) is so you have a clear direction for the next step. However, it is very important
that you do not just write out P (k + 1) without prefacing it with a statement like “I need to
show that...”. Since you are about to prove that P (k + 1) is true, you don’t know that it is
true yet, so writing it down as if it is a fact is incorrect and confusing. In fact, it is probably
better to write the goal separately from the rest of the proof (e.g. on another piece of paper).
The goal does not need to be written down and is not really part of the proof. The only
purpose of doing so is to help you see what you need to do in the next step. For instance,
knowing the goal often helps you to figure out the required algebra steps to get there.
6. Inductive: This is the longest, and most varied, part of the proof. Once you get the hang
of induction, you will typically only think about two parts of the proof—the base case and
this step. The rest will become second nature.
The inductive step should not start with writing down P (k + 1). Some students want to
write out P (k + 1) and work both sides until they get them to be the same. As we have
emphasized on several occasions, this is not a proper proof technique. You cannot start with
something you do not know and then work it until you get to something you do know and
then declare it is true.
“Since we proved that P (a) is true, and that P (k) → P (k + 1), for k ≥ a, then
we know that P (n) is true for all n ≥ a by PMI, ” or
The details change a bit depending on what your inductive hypothesis was (e.g. if it was
P (k − 1) instead of P (k)). Technically speaking, you can just summarize your proof by
saying
As long as someone can look back and see that you included the two necessary parts of the
proof, you do not necessarily need to point them out again.
8.2 Recursion
You have seen examples of recursion if you have seen Russian Matryoshka dolls (Google it), two
almost parallel mirrors, a video camera pointed at the monitor, or a picture of a painter painting
a picture of a painter painting a picture of a painter... More importantly for us, recursion is a very
useful tool to implement algorithms. You probably already learned about recursion in a previous
programming course, but we present the concept in this brief section for the sake of review, and
because it ties in nicely with the other two topics in this chapter.
Examples of recursion that you may have already seen include binary search, Quicksort, and
Mergesort.
Answer
If a subroutine/function simply called itself as a part of its execution, it would result in infinite
recursion. This is a bad thing. Therefore, when using recursion, one must ensure that at some
point, the subroutine/function terminates without calling itself. We will return to this point after
we see what is perhaps the quintessential example of recursion.
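\begin{align*}
0! &= 1 \\
1! &= 1 = 1 \times 0! \\
2! &= 2 \times 1 = 2 \times 1! \\
3! &= 3 \times 2 \times 1 = 3 \times 2! \\
4! &= 4 \times 3 \times 2 \times 1 = 4 \times 3! \\
5! &= 5 \times 4 \times 3 \times 2 \times 1 = 5 \times 4!
\end{align*}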
and in general, when n > 1,
n! = n × (n − 1) × · · · × 2 × 1 = n × (n − 1)!
int factorial(int n) {
    if (n <= 0) {
        return 1;
    } else {
        return n * factorial(n - 1);
    }
}
To guarantee that they will terminate, every recursive algorithm needs all of the following.
1. Base case(s): One or more cases which are solved non-recursively. In other words, when an
algorithm gets to the base case, it does not call itself again. This is also called a stopping
case or terminating condition.
2. Inductive case(s): One or more recursive rules for all cases except the base case.
3. Progress: The inductive case(s) should always progress toward the base case. Often this
means the arguments will get smaller until they approach the base case, but sometimes it
is more complicated than this.
Example 8.34. Let’s take a closer look at the factorial algorithm from Example 8.33. If
n ≤ 0, factorial does not make a recursive call. Thus, it has a base case. When n > 0, it
is clearly making a recursive call, so it has inductive cases. When a recursive call is made to
factorial, the argument is smaller, so it is approaching a base case (i.e. making progress).
⋆Question 8.35. Consider the ferzle algorithm from Question 8.32 above.
Answer
Answer
Answer
Example 8.36. Prove that the recursive factorial(n) algorithm from Example 8.33 returns
n! for all n ≥ 0.
Example 8.37. Implement an algorithm countdown(int n) that outputs the integers from n
down to 1, where n > 0. So, for example, countdown(5) would output “5 4 3 2 1”.
Solution: One way to do this is with a simple loop:
void countdown(int n) {
    for (int i = n; i > 0; i--)
        print(i);
}
We wouldn’t learn anything about recursion if we used this solution. So let’s
consider how to do it with recursion. Notice that countdown(n) outputs n followed
by the numbers from n − 1 down to 1. But the numbers n − 1 down to 1 are the
output from countdown(n-1). This leads to the following recursive algorithm:
void countdown(int n) {
    print(n);
    countdown(n - 1);
}
To see if this is correct, we can trace through the execution of countdown(3). The
following table gives the result.
⋆Exercise 8.38. Prove that the recursive countdown(n) algorithm from Example 8.37 works
correctly. (Hint: Use induction.)
2. Find a way to break up the problem into smaller instances of the same problem.
Example 8.39. Consider the binary search algorithm to find an item v on a sorted list of
size n. The algorithm works as follows.
• Else (v > m), we binary search the right half of the array.
• Now, we have the same problem, but only half the size.
Example 8.40. Prove that the recursive binarySearch algorithm from Example 8.39 is
correct.
Proof: We will prove it by induction on n = right − left + 1 (that is, the size
of the array).
Base case: If n = 0, that means right < left, and binarySearch returns −1 as
it should (since val cannot possibly be in an empty array). So it works correctly
for n = 0.
Inductive Hypothesis: Assume that binarySearch works for arrays of size 0
through k − 1 (we need strong induction for this proof).
Inductive step: Assume binarySearch is called on an array of size k. There
are three cases.
• If val = a[middle], the algorithm returns middle which is the correct answer.
• If val < a[middle], a recursive call is made on the first half of the array (from
left to middle − 1). Because a is sorted, if val is in the array, it is in that
half of the array, so we just need to prove that the recursive call returns the
correct value. Notice that the first half of the array has fewer than k elements
(it does not contain middle or anything to the right of middle, so it is clearly
smaller by at least one element). Thus, by the inductive hypothesis, it returns
the correct index or −1 if val is not in that part of the array. Therefore it
returns the correct value.
• The case for val > a[middle] is symmetric to the previous case and the details
are left to the reader.
Note: You might think the base case in the previous proof should be n = 1, but that is not
actually correct. A failed search will always make a final call to binarySearch with n = 0. If
we don’t prove it works for an empty array then we cannot be certain that it works for failed
searches.
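To see the point of this note concretely, here is a small Java sketch, not from the original text, that instruments the binarySearch routine and records the size of the range seen by the most recent call. The class name BinarySearchTrace, the counters calls and lastSize, and the wrapper search are hypothetical additions for illustration.

```java
// Instrumented binary search: records how many calls were made and the
// size (right - left + 1) of the range handed to the final call.
class BinarySearchTrace {
    static int calls;     // calls made since the last reset
    static int lastSize;  // range size seen by the most recent call

    static int binarySearch(int[] a, int left, int right, int val) {
        calls++;
        lastSize = right - left + 1;
        if (right >= left) {
            int middle = (left + right) / 2;
            if (val == a[middle]) {
                return middle;
            } else if (val < a[middle]) {
                return binarySearch(a, left, middle - 1, val);
            } else {
                return binarySearch(a, middle + 1, right, val);
            }
        } else {
            return -1;  // empty range: val is not present
        }
    }

    // Resets the counters and searches the whole (sorted) array.
    static int search(int[] a, int val) {
        calls = 0;
        return binarySearch(a, 0, a.length - 1, val);
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5, 7, 9, 11, 13};
        System.out.println(search(a, 4) + " after " + calls
                + " calls, final range size " + lastSize);
    }
}
```

For the failed searches tried here, the final call is always on a range of size 0, which is why the proof needs n = 0 as its base case.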
Example 8.41. Recall the Fibonacci sequence, defined by the recurrence relation
\[ f_n = \begin{cases} 0 & \text{if } n = 0 \\ 1 & \text{if } n = 1 \\ f_{n-1} + f_{n-2} & \text{if } n > 1. \end{cases} \]
Let’s see an iterative and a recursive algorithm to compute fn . The iterative algorithm (on
the left) starts with f0 and f1 and computes each fi based on fi−1 and fi−2 for i from 2 to
n. As it goes, it needs to keep track of the previous two values. The recursive algorithm (on
the right) just uses the definition and is pretty straightforward.
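As a rough sketch, the two algorithms described in this example might be written in Java as follows. The names fib and fibR mirror Fib and FibR from the text, but the bodies, the long return type, and the class name FibonacciSketch are assumptions made for illustration. Both assume n ≥ 0.

```java
class FibonacciSketch {
    // Iterative version: starts from f_0 and f_1 and keeps only the
    // previous two values while i runs from 2 up to n.
    static long fib(int n) {
        if (n == 0) {
            return 0;
        }
        long prev = 0;  // f_{i-2}
        long curr = 1;  // f_{i-1}
        for (int i = 2; i <= n; i++) {
            long next = prev + curr;  // f_i
            prev = curr;
            curr = next;
        }
        return curr;
    }

    // Recursive version: a direct transcription of the recurrence relation.
    static long fibR(int n) {
        if (n == 0) {
            return 0;
        }
        if (n == 1) {
            return 1;
        }
        return fibR(n - 1) + fibR(n - 2);
    }

    public static void main(String[] args) {
        System.out.println(fib(10) + " " + fibR(10));
    }
}
```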
⋆Question 8.42. Which algorithm is better, Fib or FibR? Give several reasons to justify
your answer.
Answer
Although recursion is a great technique to solve many problems, care must be taken when
using it. It is easy to make simple mistakes like we did in Example 8.37. Recursive algorithms can also be very
inefficient on occasion, as we alluded to in the previous example (and will prove later). In addition,
recursive algorithms often take more memory than iterative ones, as we will see next.
Example 8.43. Consider our algorithms for n!. The iterative one from Example 3.40 uses
memory to store four numbers: n, f , i, and return value.a The recursive one from Example
8.33 uses memory to store two numbers: n and the return value. Although the recursive
algorithm uses less memory, it is called multiple times, and every call needs its own memory.
For instance, a call to factorial(3) will call factorial(2) which will call factorial(1). Thus,
computing 3! requires enough memory to store 6 numbers, which is more than the 4 required
by the iterative algorithm. In general, the recursive algorithm to compute n! will need to
store 2n numbers, whereas the iterative one will still just need 4, no matter how large n gets.
a
I won’t get technical here, but memory needs to be allocated for the value returned by a function.
Since computers have a finite amount of memory, and since every call to a function requires
its own memory, there is a limit to how many recursive calls can be made in practice. In fact,
some language runtimes, including Java's, cap how deep the recursion can go (in Java the cap follows from the thread's stack size, and exceeding it raises a StackOverflowError). Even for
those that don’t have a limit, if you run out of memory, you can certainly expect bad things to
happen. This is one of the reasons recursion is avoided when possible.
Good compilers attempt to remove recursion, but it is not always possible. Good programmers
do the same. Since recursive algorithms are often more intuitive, it often makes sense to think
in terms of them. But many recursive algorithms can be turned into iterative algorithms that
are as efficient and use less memory. There is no single technique to do so, and it is not always
necessary, but it is a good thing to keep in mind.
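For instance, the recursive factorial from Example 8.33 can be rewritten as a loop. The following is a minimal sketch and not necessarily the same as the iterative version referred to in Example 8.43; the names FactorialIterative and factorialIterative are hypothetical. It uses long, so it is only meaningful up to 20!.

```java
class FactorialIterative {
    // Computes n! with a loop instead of recursion. Like the recursive
    // version, it returns 1 for any n <= 0.
    static long factorialIterative(int n) {
        long f = 1;
        for (int i = 2; i <= n; i++) {
            f *= i;  // f now holds i!
        }
        return f;
    }

    public static void main(String[] args) {
        System.out.println(factorialIterative(5));
    }
}
```

However large n is, this version keeps only n, f, and i alive at once, whereas the recursive version needs a separate set of values for every pending call.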
Let’s see a few more examples of the subtle problems that we can run into when using recursion.
Example 8.44. The following algorithm is supposed to sum the numbers from 1 to n:
int Sum1toN(int n) {
    if (n == 0) return 0;
    else return n + Sum1toN(n - 1);
}
Although this algorithm works fine for non-negative values of n, it will go into infinite
recursion if n < 0. Like our original solution to the countdown problem, the mistake here is
an improper base case.
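One possible repair, sketched here under the assumption that a negative n should simply produce an empty sum of 0, is to widen the base case from n == 0 to n <= 0. The class name SumSketch and the method name sum1toN are hypothetical.

```java
class SumSketch {
    // Sums the integers from 1 to n. Any n <= 0 is treated as an empty
    // sum (an assumption), so every call eventually reaches the base case.
    static int sum1toN(int n) {
        if (n <= 0) {
            return 0;
        }
        return n + sum1toN(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(sum1toN(4));
        System.out.println(sum1toN(-3));
    }
}
```

Another reasonable design would be to reject negative arguments with an exception. Which one is right depends on what the caller expects.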
It is easy to get things backwards when recursion is involved. Consider the following example.
⋆Question 8.45. One of these routines prints from 1 up to n, the other from n down to 1.
Which does which?
void PrintN(int n) {
    if (n > 0) {
        PrintN(n - 1);
        print(n);
    }
}

void NPrint(int n) {
    if (n > 0) {
        print(n);
        NPrint(n - 1);
    }
}
Answer
We conclude this section by summarizing some of the advantages and disadvantages of recur-
sion.
The advantages include:
1. Recursion often mimics the way we think about a problem, thus the recursive solutions can
be very intuitive to program.
2. Often recursive algorithms to solve problems are much shorter than iterative ones. This can
make the code easier to understand, modify, and/or debug.
3. The best known algorithms for many problems are based on a divide-and-conquer approach:
These divide-and-conquer techniques are often best thought of in terms of recursive algo-
rithms.
Perhaps the main disadvantage of recursion is the extra time and space required. We have
already discussed the extra space. The extra time comes from the fact that when a recursive
call is made, the operating system has to record how to restart the calling subroutine later on,
pass the parameters from the calling subroutine to the called subroutine (often by pushing the
parameters onto a stack controlled by the system), set up space for the called subroutine’s local
variables, etc. The bottom line is that calling a function is not “free”.
Another disadvantage is the fact that sometimes a slick-looking recursive algorithm turns
out to be very inefficient. We alluded to this in Question 8.42. On the other hand, if such
inefficiencies are found, there are techniques that can often easily remove them (e.g. a technique
called memoization2 ). But you first have to remember to analyze your algorithm to determine
whether or not there might be an efficiency problem.
2
No, that’s not a typo. Google it.
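To give a flavor of memoization, here is a minimal Java sketch applied to the recursive Fibonacci computation: each value is stored the first time it is computed and looked up afterwards. The names FibMemo, fibMemo, and cache are hypothetical, and the sketch assumes n ≥ 0 and values that fit in a long.

```java
import java.util.HashMap;
import java.util.Map;

class FibMemo {
    // Values of f_n that have already been computed.
    private static final Map<Integer, Long> cache = new HashMap<>();

    static long fibMemo(int n) {
        if (n <= 1) {
            return n;  // f_0 = 0 and f_1 = 1
        }
        Long known = cache.get(n);
        if (known != null) {
            return known;  // reuse the stored value instead of recomputing it
        }
        long value = fibMemo(n - 1) + fibMemo(n - 2);
        cache.put(n, value);
        return value;
    }

    public static void main(String[] args) {
        System.out.println(fibMemo(50));
    }
}
```

The recursive structure is unchanged, but each f_i is now computed only once.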
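\begin{align*}
t_n &= n \cdot t_{n-1} + 4 \cdot t_{n-3} \\
r_n &= r_{n/2} + 1 \\
a_n &= a_{n-1} + 2 \cdot a_{n-2} + 3 \cdot a_{n-3} + 4 \cdot a_{n-4} \\
p_n &= p_{n-1} \cdot p_{n-2} \\
s_n &= s_{n-3} + n^2 - 4n + 32
\end{align*}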
We have not given any initial conditions for these recurrence relations. Without initial con-
ditions, we cannot compute particular values. We also cannot solve the recurrence relation
uniquely.
Recurrence relations have 2 types of terms: recursive term(s) and the non-recursive terms. In
the previous example, the recursive term of $s_n$ is $s_{n-3}$ and the non-recursive term is $n^2 - 4n + 32$.
⋆Question 8.48. Consider the recurrence relations rn and an from Example 8.47.
Answer
Answer
Answer
Answer
3
You might also see recurrence relations written using function notation, like a(n). Although there are technical
differences between these notations, you can think of them as being essentially equivalent in this context.
In computer science, the most common place we use recurrence relations is to analyze recursive
algorithms. We won’t get too technical yet, but let’s see a simple example.
Example 8.49. How many multiplications are required to compute n! using the factorial
algorithm given in Example 8.33 (repeated below)?
int factorial(int n) {
    if (n <= 0) {
        return 1;
    } else {
        return n * factorial(n - 1);
    }
}
Given a recurrence relation for $a_n$, you can’t just plug in n and get an answer. For instance,
if $a_n = n \cdot a_{n-1}$, and $a_1 = 1$, what is $a_{100}$? The only obvious way to compute it is to compute
$a_2, a_3, \ldots, a_{99}$, and then finally $a_{100}$. That is the reason why solving recurrence relations is so
important. As mentioned previously, solving a recurrence relation simply means finding a closed
form expression for it.
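The step-by-step computation described above can be sketched in a few lines of Java. This is only an illustration; the names RecurrenceIterate and a are hypothetical, and BigInteger is used because the values quickly outgrow a long.

```java
import java.math.BigInteger;

class RecurrenceIterate {
    // Computes a_n for the recurrence a_n = n * a_{n-1} with a_1 = 1 by
    // stepping through a_2, a_3, ..., a_n in order. Assumes n >= 1.
    static BigInteger a(int n) {
        BigInteger value = BigInteger.ONE;  // a_1
        for (int i = 2; i <= n; i++) {
            value = value.multiply(BigInteger.valueOf(i));  // a_i = i * a_{i-1}
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(a(100));
    }
}
```

Even with a program, reaching the 100th term means computing every term before it. A closed form would avoid that.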
Example 8.50. It is not too difficult to see that the recurrence from Example 8.49 has the
solution $M_n = n$. To prove it, notice that with this assumption,
$M_{n-1} + 1 = (n - 1) + 1 = n = M_n$, so the solution is consistent with the recurrence relation.
We can also prove it with induction: We know that $M_0 = 0$, so the base case of k = 0 is
true. Assume $M_k = k$ for some $k \ge 0$. Then we have
\[ M_{k+1} = M_k + 1 = k + 1, \]
so the formula is correct for k + 1. Thus, by PMI, the formula is correct for all k ≥ 0.
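The claim can also be checked empirically by instrumenting the factorial routine with a counter. This is a small sketch; the names FactorialCount, multiplications, and countFor are hypothetical additions, and the return type is widened to long only to delay overflow.

```java
class FactorialCount {
    static int multiplications;  // multiplications performed since the last reset

    static long factorial(int n) {
        if (n <= 0) {
            return 1;  // base case: no multiplication
        }
        multiplications++;  // one multiplication at this level
        return n * factorial(n - 1);
    }

    // Returns the number of multiplications used to compute n!.
    static int countFor(int n) {
        multiplications = 0;
        factorial(n);
        return multiplications;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 5; n++) {
            System.out.println("n = " + n + ": " + countFor(n) + " multiplications");
        }
    }
}
```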
The last example demonstrates an important fact about recurrence relations used to analyze
algorithms. The recursive terms come from when a recursive function calls itself. The non-
recursive terms come from the other work that is done by the function, including any splitting or
combining of data that must be done.
Example 8.51. Consider the recursive binary search algorithm we saw in Example 8.39:
int binarySearch(int[] a, int left, int right, int val) {
    if (right >= left) {
        int middle = (left + right) / 2;
        if (val == a[middle])
            return middle;
        else if (val < a[middle])
            return binarySearch(a, left, middle - 1, val);
        else
            return binarySearch(a, middle + 1, right, val);
    } else {
        return -1;
    }
}
Find a recurrence relation for the worst-case complexity of binarySearch.
We will discuss using recurrence relations to analyze recursive algorithms in more detail in
section 8.4. But first we will discuss how to solve recurrence relations. There is no general method
to solve recurrences. There are many strategies, however. In the next few sections we will discuss
four common techniques: the substitution method, the iteration method, the Master Theorem (or
Master Method), and the characteristic equation method for linear recurrences.
⋆Question 8.52. Let’s see if you have been paying attention. What does it mean to solve
a recurrence relation?
Answer
As we continue our discussion of recurrence relations, you will notice that we will begin to
sometimes use the function notation (e.g. T (n) instead of Tn ). We do this for several reasons.
The first is so that you are comfortable with either notation. The second is that in algorithm
analysis, this notation seems to be more common, at least in my experience.
need to actually prove it. Because of the close tie between recurrence relations and induction, it
is the most natural technique to use. Let’s see an example.
Prove that the solution is $S(n) = \frac{n(n+1)}{2}$.
Proof: When n = 1, $S(1) = 1 = \frac{1(1+1)}{2}$. Assume that $S(k-1) = \frac{(k-1)k}{2}$. Then
⋆Exercise 8.54. Recall that in Example 8.51, we developed the recurrence relation T (n) =
T (n/2) + 1, T (0) = 1 for the complexity of binarySearch. For technical reasons, ignore T (0)
and assume T(1) = 1 is the base case. Use substitution to prove that $T(n) = \log_2 n + 1$ is a
solution to this recurrence relation.
\begin{align*}
H_{n+1} &= 2H_n + 1 \\
&= 2(2^n - 1) + 1 \\
&= 2^{n+1} - 1,
\end{align*}
⋆Exercise 8.56. Solve the following recurrence relation and use induction to prove your
solution is correct: A(n) = A(n − 1) + 2, A(1) = 2.
Example 8.57. Why was the recursive algorithm to compute fn from Example 8.41 so bad?
Solution: Let’s count the number of additions FibR(n) computes since that is
the main thing that the algorithm does. Let F(n) be the number of additions required
to compute $f_n$ using FibR(n). Since FibR(n) calls FibR(n-1) and FibR(n-2)
and then performs one addition, it is easy to see that
F (n) = F (n − 1) + F (n − 2) + 1,
where F (0) = F (1) = 0 is clear from the algorithm. We could use the method for
linear recurrences that will be outlined later to solve this, but the algebra gets a
bit messy. Instead, let’s see if we can figure it out by computing some values.
F (0) = 0
F (1) = 0
F (2) = F (1) + F (0) + 1 = 1
F (3) = F (2) + F (1) + 1 = 2
F (4) = F (3) + F (2) + 1 = 4
F (5) = F (4) + F (3) + 1 = 7
F (6) = F (5) + F (4) + 1 = 12
F (7) = F (6) + F (5) + 1 = 20
No pattern is evident unless you add one to each of these. If you do, you will get
1, 1, 2, 3, 5, 8, 13, 21, etc., which looks a lot like the Fibonacci sequence starting with
$f_1$. So it appears $F(n) = f_{n+1} - 1$. To verify this, first notice that $F(0) = 0 = f_1 - 1$
and $F(1) = 0 = f_2 - 1$. Assume it holds for all values less than k. Then
\begin{align*}
F(k) &= F(k-1) + F(k-2) + 1 \\
&= f_k - 1 + f_{k-1} - 1 + 1 \\
&= f_k + f_{k-1} - 1 \\
&= f_{k+1} - 1.
\end{align*}
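The same relationship can be observed by counting additions in code. This is a sketch only; the class name FibAdditions, the counter additions, and the helper additionsFor are hypothetical, and fibR is an assumed transcription of the FibR algorithm from the text.

```java
class FibAdditions {
    static long additions;  // additions performed since the last reset

    // Recursive Fibonacci, counting each addition it performs.
    static long fibR(int n) {
        if (n == 0) {
            return 0;
        }
        if (n == 1) {
            return 1;
        }
        additions++;
        return fibR(n - 1) + fibR(n - 2);
    }

    // Returns F(n), the number of additions used by fibR(n).
    static long additionsFor(int n) {
        additions = 0;
        fibR(n);
        return additions;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 7; n++) {
            System.out.println("F(" + n + ") = " + additionsFor(n));
        }
    }
}
```

Since the number of additions tracks the Fibonacci numbers themselves, the work done by the recursive algorithm grows as fast as the value it is computing.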
Proof: We have
Using this method requires a little abstract thinking and pattern recognition. It also requires
good algebra skills. Care must be taken when doing algebra, especially with the non-recursive
terms. Sometimes you should add/multiply (depending on context) them all together, and other
times you should leave them as is. The problem is that it takes experience (i.e. practice) to
determine which one is better in a given situation. The key is flexibility. If you try doing it one
way and don’t see a pattern, try another way.
Here is my suggestion for using this method:
1. Iterate enough times so you are certain of what the pattern is. Typically this means at least
3 or 4 iterations.
2. As you iterate, make adjustments to your algebra as necessary so you can see the pattern.
For instance, whether you write $2^3$ or 8 can make a difference in seeing the pattern.
3. Once you see the pattern, generalize it, writing what it should look like after k iterations.
4. Determine the value of k that will get you to the base case, and then plug it in.
5. Simplify.
⋆Question 8.59. The iteration method is probably not a good choice to solve the following
recurrence relation. Explain why.
Answer
\begin{align*}
T(n) &= 2T(n/2) + n^3 \\
&= 2[2T(n/4) + (n/2)^3] + n^3 \\
&= 2[2T(n/4) + n^3/8] + n^3 \\
&= 2^2 T(n/4) + n^3/4 + n^3
\end{align*}
Notice that in the second line we have $(n/2)^3$ and not $n^3$. This may be more clear
if we rewrite the formula using k: $T(k) = 2T(k/2) + k^3$. When applying the formula
to T(n/2), we have k = n/2, so we get
\[ T(n/2) = 2T(n/4) + (n/2)^3. \]
Back to the second line, also notice that the 2 is multiplied by both the 2T(n/4)
and the $(n/2)^3$ terms. A common error is to lose one of the 2s on the T(n/4) term
or miss it on the $(n/2)^3$ term when simplifying. Also, $(n/2)^3 = n^3/8$, not $n^3/2$.
This is another common mistake. Continuing,
\begin{align*}
T(n) &= \cdots \\
&= 2^2 T(n/4) + n^3/4 + n^3 \\
&= 2^2[2T(n/8) + (n/4)^3] + n^3/4 + n^3 \\
&= 2^2[2T(n/8) + n^3/4^3] + n^3/4 + n^3 \\
&= 2^3 T(n/8) + n^3/4^2 + n^3/4 + n^3.
\end{align*}
By now you should have noticed that I use 2 or more steps for every iteration–I do
one substitution and then simplify it before moving on to the next substitution.
This helps to ensure I don’t make algebra mistakes and that I can write it out in
a way that helps me see a pattern.
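Next, notice that we can write the last line as
\[ 2^3 T(n/2^3) + n^3/4^2 + n^3/4^1 + n^3/4^0, \]
which suggests that after k iterations we will have
\[ T(n) = 2^k T(n/2^k) + \sum_{i=0}^{k-1} n^3/4^i. \]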
The sum starts at i = 0 (not 1) and goes to k − 1 (not k). It is easy to get either
(or both) of these wrong if you aren’t careful. We should be careful to make sure
we have seen the correct pattern. Too often I have seen students make a pattern
out of 2 iterations. Not only is this not enough iterations to be sure of anything,
the pattern they usually come up with only holds for the last iteration they did.
The pattern has to match every iteration. To be safe, go one more iteration after
you identify the pattern to verify that it is correct.
Continuing (with a few more steps shown to make all of the algebra as clear as
possible), we get
\begin{align*}
T(n) &= \cdots \\
&= 2^3 T(n/2^3) + n^3/4^2 + n^3/4^1 + n^3/4^0 \\
&= 2^3[2T(n/2^4) + (n/2^3)^3] + n^3/4^2 + n^3/4^1 + n^3/4^0 \\
&= 2^3[2T(n/2^4) + n^3/2^9] + n^3/4^2 + n^3/4^1 + n^3/4^0 \\
&= 2^4 T(n/2^4) + n^3/2^6 + n^3/4^2 + n^3/4^1 + n^3/4^0 \\
&= 2^4 T(n/2^4) + n^3/4^3 + n^3/4^2 + n^3/4^1 + n^3/4^0 \\
&= \cdots \\
&= 2^k T(n/2^k) + \sum_{i=0}^{k-1} n^3/4^i.
\end{align*}
Notice that this does seem to match the pattern we saw above. We can evaluate
the sum to simplify it a little more:
\begin{align*}
T(n) &= \cdots \\
&= 2^k T(n/2^k) + \sum_{i=0}^{k-1} n^3/4^i \\
&= 2^k T(n/2^k) + n^3 \sum_{i=0}^{k-1} 1/4^i \\
&= 2^k T(n/2^k) + n^3 \sum_{i=0}^{k-1} (1/4)^i \\
&= 2^k T(n/2^k) + n^3 \left( \frac{1 - (1/4)^k}{1 - 1/4} \right) \\
&= 2^k T(n/2^k) + n^3 (4/3)(1 - (1/4)^k)
\end{align*}
We are almost done. We just need to find a k that allows us to get rid of the
recursion. Thus, we need to determine what value of k makes $T(n/2^k) = T(1) = 1$.
In other words, we need k such that
\[ n/2^k = 1. \]
This is equivalent to
\[ n = 2^k. \]
Taking log (base 2) of both sides, we obtain
\[ \log_2 n = \log_2(2^k) = k. \]
So $k = \log_2 n$. We plug in k and use the fact that $2^{\log_2 n} = n$ along with the
exponent rules to obtain
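\begin{align*}
T(n) &= \cdots \\
&= 2^k T(n/2^k) + n^3 (4/3)(1 - (1/4)^k) \\
&= 2^{\log_2 n} T(n/2^{\log_2 n}) + n^3 (4/3)(1 - (1/4)^{\log_2 n}) \\
&= n T(1) + n^3 (4/3)\left(1 - \frac{1}{(2^2)^{\log_2 n}}\right) \\
&= n \cdot 1 + n^3 (4/3)\left(1 - \frac{1}{(2^{\log_2 n})^2}\right) \\
&= n + n^3 (4/3)\left(1 - \frac{1}{n^2}\right) \\
&= n + \frac{4}{3}n^3 - \frac{4}{3}n \\
&= \frac{4}{3}n^3 - \frac{1}{3}n.
\end{align*}
So we have that $T(n) = \frac{4}{3}n^3 - \frac{1}{3}n$.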
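A quick way to catch algebra slips in a derivation like this one is to evaluate the recurrence directly and compare it with the closed form. The sketch below does so for powers of 2, assuming the base case T(1) = 1 used above. The names IterationCheck, t, and matchesClosedForm are hypothetical. To stay in integer arithmetic it compares 3T(n) with 4n^3 − n.

```java
class IterationCheck {
    // Evaluates T(n) = 2T(n/2) + n^3 with T(1) = 1 directly.
    // n is assumed to be a power of 2.
    static long t(long n) {
        if (n == 1) {
            return 1;
        }
        return 2 * t(n / 2) + n * n * n;
    }

    // Checks T(n) = (4/3)n^3 - (1/3)n in the equivalent form 3T(n) = 4n^3 - n.
    static boolean matchesClosedForm(long n) {
        return 3 * t(n) == 4 * n * n * n - n;
    }

    public static void main(String[] args) {
        for (long n = 1; n <= 1024; n *= 2) {
            System.out.println("n = " + n + ": T(n) = " + t(n)
                    + ", matches closed form: " + matchesClosedForm(n));
        }
    }
}
```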
Example 8.62. Give a tight bound for the recurrence $T(n) = T(\sqrt{n}) + 1$, where $T(2) = 1$.
\begin{align*}
T(n) &= T(n^{1/2}) + 1 \\
&= T(n^{1/4}) + 1 + 1 \\
&= T(n^{1/8}) + 1 + 1 + 1 \\
&= T(n^{1/2^k}) + k
\end{align*}
If we can determine when $n^{1/2^k} = 2$, we can obtain a solution. Taking logs (base
2) on both sides, we get
\[ \log_2(n^{1/2^k}) = \log_2 2. \]
We apply the power-inside-a-log rule and the fact that $\log_2 2 = 1$ to get
\[ (1/2^k) \log_2 n = 1. \]
Multiplying both sides by $2^k$, we get
\[ 2^k = \log_2 n. \]
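From here the derivation can be finished as follows. This is a sketch of the remaining algebra, assuming n has the form $2^{2^k}$ so that repeated square roots reach the base case T(2) = 1 exactly.

```latex
\begin{align*}
2^k = \log_2 n \quad &\Longrightarrow \quad k = \log_2 \log_2 n, \\
T(n) &= T(n^{1/2^k}) + k \\
     &= T(2) + \log_2 \log_2 n \\
     &= 1 + \log_2 \log_2 n,
\end{align*}
so $T(n) = \Theta(\log \log n)$.
```

As a spot check, T(16) = T(4) + 1 = T(2) + 2 = 3, and the formula gives $1 + \log_2 \log_2 16 = 1 + 2 = 3$.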
⋆Exercise 8.63. Use iteration to solve the recurrence relation that we developed in Example 8.51
for the complexity of binarySearch: T(n) = T(n/2) + 1, T(1) = 1.
If you can do the following exercise correctly, then you have a firm grasp of the iteration
method and your algebra skills are superb. If you have difficulty, keep working at it and/or get
some assistance. I strongly recommend that you do your best to solve this one on your own.
⋆Exercise 8.64. Solve the recurrence relation T (n) = 2T (n − 1) + n, T (1) = 1. (Hint: You
will need the result from Exercise 8.18.)
Theorem 8.65 (Master Theorem). Let T (n) be a monotonically increasing function satisfying
Wow. That was easy.4 But the ease of use of the Master Method comes with a cost. Well, two
actually. First, notice that we do not get an exact solution, but only an asymptotic bound on the
solution. Depending on the context, this may be good enough. If you need an exact numerical
solution, the Master Method will do you no good. But when analyzing algorithms, typically we
are more interested in the asymptotic behavior. In that case, it works great. Second, it only
works for recurrences that have the exact form T (n) = aT (n/b) + f (n). It won’t even work on
similar recurrences, such as T (n) = T (n/b) + T (n/c) + f (n).
4
Almost too easy.
Example 8.70. Let’s redo one from a previous section. Use the Master Theorem to solve
the recurrence
\[ R(n) = \begin{cases} 1 & \text{when } n = 1 \\ 2R(n/2) + n/2 & \text{otherwise} \end{cases} \]
Answer
⋆Exercise 8.73. We saw in Example 8.51 that the complexity of binary search is given by
the recurrence relation T (n) = T (n/2) + 1, T (0) = 1 (and you may assume that T (1) = 1).
Use the Master Theorem to solve this recurrence.
The order of the recurrence is the difference between the highest and the lowest subscripts.
Example 8.75. $u_n = u_{n-1} + 2$ is of the first order, and $u_n = 9u_{n-4} + n^5$ is of the fourth
order.
There is a general technique that can be used to solve linear homogeneous recurrence relations.
However, we will restrict our discussion to certain first and second order recurrences.
Procedure 8.76. Let f(n) be a polynomial and $a \neq 1$. Then the following technique can be
used to solve a first order linear recurrence relation of the form
\[ x_n = a x_{n-1} + f(n). \]
1. First, ignore f(n). That is, solve the homogeneous recurrence $x_n = a x_{n-1}$. This is done
as follows:
(a) ‘Raise the subscripts’, so $x_n = a x_{n-1}$ becomes $x^n = a x^{n-1}$. This is called the
characteristic equation.
2. Assume that the solution to the original recurrence relation, $x_n = a x_{n-1} + f(n)$, is of
the form $x_n = A a^n + g(n)$, where g is a polynomial of the same degree as f(n).
3. Plug in enough values to determine the correct constants for the coefficients of g(n).
This procedure is a bit abstract, so let’s just jump into seeing it in action.
⋆Exercise 8.79. Let $x_0 = 2$, $x_n = 9x_{n-1} - 56n + 63$. Find a closed form for this recurrence.
Procedure 8.80. Here is how to solve a second-order homogeneous linear recurrence relation
of the form
\[ x_n = a x_{n-1} + b x_{n-2}. \]
3. If the roots are different, the solution will be of the form $x_n = A(r_1)^n + B(r_2)^n$, where
A, B are constants.
4. If the roots are identical, the solution will be of the form $x_n = A(r_1)^n + Bn(r_1)^n$.
Example 8.82. Find a closed form for the Fibonacci recurrence $f_0 = 0$, $f_1 = 1$, $f_n = f_{n-1} + f_{n-2}$.
Solution: The characteristic equation is $f^2 - f - 1 = 0$. This has roots $\frac{1 \pm \sqrt{5}}{2}$.
Therefore, a solution will have the form
\[ f_n = A \left( \frac{1+\sqrt{5}}{2} \right)^n + B \left( \frac{1-\sqrt{5}}{2} \right)^n. \]
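The initial conditions pin down the constants: $f_0 = 0$ gives A + B = 0, and $f_1 = 1$ then gives $A\sqrt{5} = 1$, so $A = 1/\sqrt{5}$ and $B = -1/\sqrt{5}$. The Java sketch below checks the resulting formula numerically against values computed from the recurrence. The names FibClosedForm, fibClosed, and fibIterative are hypothetical, and because it relies on floating-point arithmetic it is only trustworthy for moderate n.

```java
class FibClosedForm {
    // Reference values computed directly from the recurrence (n >= 0).
    static long fibIterative(int n) {
        long prev = 0;  // f_0
        long curr = 1;  // f_1
        for (int i = 0; i < n; i++) {
            long next = prev + curr;
            prev = curr;
            curr = next;
        }
        return prev;  // after n steps, prev holds f_n
    }

    // Closed form A*r1^n + B*r2^n with A = 1/sqrt(5) and B = -1/sqrt(5),
    // rounded to the nearest integer to absorb floating-point error.
    static long fibClosed(int n) {
        double sqrt5 = Math.sqrt(5.0);
        double r1 = (1 + sqrt5) / 2;
        double r2 = (1 - sqrt5) / 2;
        double a = 1 / sqrt5;
        double b = -1 / sqrt5;
        return Math.round(a * Math.pow(r1, n) + b * Math.pow(r2, n));
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 10; n++) {
            System.out.println(n + ": " + fibIterative(n) + " " + fibClosed(n));
        }
    }
}
```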
⋆Exercise 8.83. Find a closed form for the recurrence $x_0 = 1$, $x_1 = 4$, $x_n = 4x_{n-1} - 4x_{n-2}$.
Solution: The algorithm for Mergesort is below. Let T (n) be the worst-case
running time of Mergesort on an array of size n = right − left. Recall that
Merge takes two sorted arrays and merges them into one sorted array in time
Θ(n), where n is the number of elements in both arrays.a Since the two recursive
calls to Mergesort are on arrays of half the size, they each require time T (n/2)
in the worst-case. The other operations take constant time. Below we annotate
the Mergesort algorithm with these running times.
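For concreteness, here is one way Mergesort could be written in Java, with each line annotated by its cost. This is a sketch under assumptions: it sorts the half-open range a[left..right-1] so that n = right − left, and the class name MergesortSketch and the particular merge routine are illustrative choices rather than the text's own listing.

```java
class MergesortSketch {
    // Sorts a[left..right-1]; n = right - left is the number of elements.
    static void mergesort(int[] a, int left, int right) {
        if (right - left > 1) {                      // constant time
            int middle = left + (right - left) / 2;  // constant time
            mergesort(a, left, middle);              // T(n/2)
            mergesort(a, middle, right);             // T(n/2)
            merge(a, left, middle, right);           // Theta(n)
        }
    }

    // Merges the sorted runs a[left..middle-1] and a[middle..right-1]
    // into one sorted run, in time linear in right - left.
    static void merge(int[] a, int left, int middle, int right) {
        int[] merged = new int[right - left];
        int i = left;
        int j = middle;
        int k = 0;
        while (i < middle && j < right) {
            merged[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        }
        while (i < middle) {
            merged[k++] = a[i++];
        }
        while (j < right) {
            merged[k++] = a[j++];
        }
        System.arraycopy(merged, 0, a, left, merged.length);
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 9, 1, 5, 6};
        mergesort(a, 0, a.length);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

Adding up the annotations gives two recursive calls of cost T(n/2), a linear-time merge, and a constant amount of other work.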
Notice that we absorbed the constants C1 and C2 into the Θ(n) term. For sim-
plicity, we will also replace the Θ(n) term with cn (where c is a constant) and
rewrite this as
T (n) = 2T (n/2) + cn.
We could use the Master Theorem to prove that T (n) = Θ(n log n), but that would
be too easy. Instead, we will use induction to prove that T (n) = O(n log n), and
leave the Ω-bound to the reader.
By definition, T(n) = O(n log n) if and only if there exist constants k and $n_0$
such that T(n) ≤ kn log n for all n ≥ $n_0$.
For the base case, notice that T(2) = a for some constant a, and a ≤ k · 2 log 2 = 2k
as long as we pick k ≥ a/2. Now, assume that T(n/2) ≤ k(n/2) log(n/2). Then
\begin{align*}
T(n) &= 2T(n/2) + cn \\
&\le 2(k(n/2)\log(n/2)) + cn \\
&= kn\log(n/2) + cn \\
&= kn\log n - kn\log 2 + cn \\
&= kn\log n + (c - k)n \\
&\le kn\log n \quad \text{if } k \ge c
\end{align*}
As long as we pick k = max{a/2, c}, we have T (n) ≤ kn log n, so T (n) = O(n log n)
as desired.
a
Since our goal here is to analyze the algorithm, we won’t provide a detailed implementation of Merge. All
we need to know is its complexity.
⋆Exercise 8.85. We stated in the previous example that we could use the Master Theorem
to prove that if T (n) = 2T (n/2) + cn, then T (n) = Θ(n log n). Verify this.
⋆Question 8.86. Answer the following questions about points that were made in Exam-
ple 8.84.
(a) Why were we allowed to absorb the constants C1 and C2 into the Θ(n) term?
Answer
(b) Why were we able to replace the Θ(n) term with cn?
Answer
Example 8.87 (Towers of Hanoi). The following legend is attributed to French mathemati-
cian Edouard Lucas in 1883. In an Indian temple there are 64 gold disks resting on three
pegs. At the beginning of time, God placed these disks on the first peg and ordained that a
group of priests should transfer them to the third peg according to the following rules:
1. The disks are initially stacked on peg A, in decreasing order (from bottom to top).
2. The disks must be moved to another peg in such a way that only one disk is moved at
a time and without stacking a larger disk onto a smaller disk.
When they finish, the Tower will crumble and the world will end. How many moves does it
take to solve the Towers of Hanoi problem with n disks?
Solution: The usual (and best) algorithm to solve the Towers of Hanoi is:
The only question is how to move the top n − 1 disks. The answer is simple: use
recursion but switch the peg numbers. Here is an implementation of this idea:
void solveHanoi(int N, int source, int dest, int spare) {
    if (N == 1) {
        moveDisk(source, dest);                  // Base case: move the only disk
    } else {
        solveHanoi(N - 1, source, spare, dest);  // Move the top N-1 disks out of the way
        moveDisk(source, dest);                  // Move the largest disk
        solveHanoi(N - 1, spare, dest, source);  // Move the N-1 disks onto the largest disk
    }
}
Don’t worry if you don’t see why this algorithm works. Our main concern here is
analyzing the algorithm.
The exact details of moveDisk depend on how the pegs/disks are implemented,
so we won’t provide an implementation of it. But it doesn’t actually matter
anyway since we just need to count the number of times moveDisk is called. As
it turns out, any reasonable implementation of moveDisk will take constant time,
so the complexity of the algorithm is essentially the same as the number of calls
to moveDisk.
Let H(n) be the number of moves it takes to solve the Towers of Hanoi problem
with n disks. Then H(n) is the number of times moveDisk is called when running
solveHanoi(n,1,2,3). It should be clear that H(1) = 1 since the algorithm
simply makes a single call to moveDisk and quits. When n > 1, the algorithm
makes two calls to solveHanoi with the first parameter being n − 1 and one call
to moveDisk. Therefore, we can see that
H(n) = 2H(n − 1) + 1.
As with the first example, we want a closed form for H(n). But we already showed
that $H(n) = 2^n - 1$ in Examples 8.55 and 8.61.
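To see the recurrence and the closed form agree, one can count the calls directly. The sketch below reuses solveHanoi from above, with a stand-in moveDisk that only increments a counter (the counter and the class name HanoiCount are assumptions made for this illustration, not part of the original algorithm).

```java
// Sketch: count calls to moveDisk and compare with 2^n - 1.
final class HanoiCount {
    static long moves = 0;

    // Stand-in for moveDisk: it does not move anything, it only counts the call.
    static void moveDisk(int source, int dest) {
        moves++;
    }

    static void solveHanoi(int n, int source, int dest, int spare) {
        if (n == 1) {
            moveDisk(source, dest);
        } else {
            solveHanoi(n - 1, source, spare, dest);
            moveDisk(source, dest);
            solveHanoi(n - 1, spare, dest, source);
        }
    }

    // Returns H(n), the number of moves made by solveHanoi(n, 1, 2, 3).
    static long countMoves(int n) {
        moves = 0;
        solveHanoi(n, 1, 2, 3);
        return moves;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++) {
            System.out.println(n + ": " + countMoves(n) + " moves, 2^n - 1 = " + ((1L << n) - 1));
        }
    }
}
```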
⋆Exercise 8.88. Let T (n) be the complexity of blarg(n). Give a recurrence relation for
T (n).
int blarg ( int n ) {
if ( n >5) {
return blarg (n -1) + blarg (n -1) + blarg (n -5) + blarg ( sqrt ( n ) ) ;
}
else {
return n ;
}
}
Answer
⋆Exercise 8.89. Give a recurrence relation for the running time of stoogeSort(A,0,n-1).
(Hint: Start by letting T (n) be the running time of stoogeSort on an array of size n.)
void stoogeSort(int[] A, int L, int R) {
    if (R <= L) return;              // Array has at most one element
    if (A[R] < A[L]) {               // Swap first and last element
        Swap(A, L, R);               // if they are out of order
    }
    if (R - L > 1) {                 // If the list has at least 3 elements
        int third = (R - L + 1) / 3;
        stoogeSort(A, L, R - third); // Sort first two-thirds
        stoogeSort(A, L + third, R); // Sort last two-thirds
        stoogeSort(A, L, R - third); // Sort first two-thirds again
    }
}
Answer
⋆Exercise 8.90. Solve the recurrence relation you developed for StoogeSort in the previous
exercise. (Make sure you verify your solution to the previous problem before you attempt to
solve your recurrence relation).
Answer
⋆Exercise 8.92. Give and solve a recurrence relation for the running time of an algorithm
that does as follows: The algorithm is given an input array of size n. If n < 3, the algorithm
does nothing. If n ≥ 3, create 5 separate arrays, each one-third of the size of the original
array. This takes Θ(n) to accomplish. Then call the same algorithm on each of the 5 arrays.
We will base our analysis on this version of Quicksort. It is straightforward to see that the
runtime of Partition is Θ(n) (Problem 8.14 asks you to prove this). We start by developing a
recurrence relation for the average case runtime of Quicksort.
Theorem 8.94. Let T (n) be the average case runtime of Quicksort on an array of size n.
Then
$$T(n) = \frac{2}{n}\sum_{k=1}^{n-1} T(k) + \Theta(n).$$
Proof: Since the pivot element is chosen randomly, it is equally likely that the pivot will
end up at any position from l to r. That is, the probability that the pivot ends up at location
l + i is 1/n for each i = 0, . . . , r − l. If we average over all of the possible pivot locations, we
obtain (the last step holds since T (0) = 0)
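$$\begin{aligned}
T(n) &= \frac{1}{n}\left(\sum_{k=0}^{n-1}\bigl(T(k) + T(n-k-1)\bigr)\right) + \Theta(n) \\
&= \frac{1}{n}\sum_{k=0}^{n-1} T(k) + \frac{1}{n}\sum_{k=0}^{n-1} T(n-k-1) + \Theta(n) \\
&= \frac{1}{n}\sum_{k=0}^{n-1} T(k) + \frac{1}{n}\sum_{k=0}^{n-1} T(k) + \Theta(n) \\
&= \frac{2}{n}\sum_{k=0}^{n-1} T(k) + \Theta(n) \\
&= \frac{2}{n}\sum_{k=1}^{n-1} T(k) + \Theta(n).
\end{aligned}$$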
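Before solving this recurrence, it can help to tabulate it. The sketch below makes two assumptions purely for illustration: the Θ(n) term is replaced by exactly n, and T(0) = 0. It then computes T(n) bottom-up with a running prefix sum and reports the ratio T(n)/(n log n), with log taken base 2. A ratio that stays bounded is consistent with (but of course does not prove) the Θ(n log n) bound.

```java
// Sketch: evaluate T(n) = (2/n) * (T(1) + ... + T(n-1)) + n numerically.
final class QuicksortAverage {
    // Returns an array t with t[n] = T(n) for 0 <= n <= maxN, where T(0) = 0.
    static double[] table(int maxN) {
        double[] t = new double[maxN + 1];
        double prefix = 0.0; // holds T(1) + ... + T(n-1) at the top of each iteration
        for (int n = 1; n <= maxN; n++) {
            t[n] = 2.0 * prefix / n + n;
            prefix += t[n];
        }
        return t;
    }

    // Returns T(n) / (n * log2(n)) for n >= 2.
    static double ratio(int n) {
        double log2n = Math.log(n) / Math.log(2.0);
        return table(n)[n] / (n * log2n);
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 100000; n *= 10) {
            System.out.println(n + ": ratio = " + ratio(n));
        }
    }
}
```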
We will need the following result in order to solve the recurrence relation.
Then we can bound (k log k) by (k log(n/2)) = k(log n − 1) in the first sum, and
by (k log n) in the second sum. This gives
Theorem 8.96. Let T (n) be the average case runtime of Quicksort on an array of size n.
Then
T (n) = Θ(n log n).
Proof: We need to show that T (n) = O(n log n) and T (n) = Ω(n log n). To
prove that T (n) = O(n log n), we will show that for some constant a,
T (n) ≤ an log n for all n ≥ 2.ᵃ
When n = 2,
an log n = a2 log 2 = 2a,
and a can be chosen large enough so that T (2) ≤ 2a. Thus, the inequality holds
for the base case. Let T (1) = C, for some constant C. For 2 ≤ k < n, assume
T (k) ≤ ak log k. Then
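$$\begin{aligned}
T(n) &= \frac{2}{n}\sum_{k=1}^{n-1} T(k) + \Theta(n) \\
&\le \frac{2}{n}\sum_{k=2}^{n-1} ak\log k + \frac{2}{n}T(1) + \Theta(n) && \text{(by assumption)} \\
&= \frac{2a}{n}\sum_{k=2}^{n-1} k\log k + \frac{2}{n}C + \Theta(n) \\
&\le \frac{2a}{n}\sum_{k=2}^{n-1} k\log k + C + \Theta(n) && \text{(since } \tfrac{2}{n} \le 1\text{)} \\
&\le \frac{2a}{n}\left(\frac{1}{2}n^2\log n - \frac{1}{8}n^2\right) + C + \Theta(n) && \text{(by Lemma 2)} \\
&= an\log n - \frac{a}{4}n + C + \Theta(n) \\
&= an\log n + \left(\Theta(n) + C - \frac{a}{4}n\right).
\end{aligned}$$
Once a is large enough that (a/4)n dominates the Θ(n) + C terms, the quantity in parentheses is at most 0.
We have shown that with an appropriate choice of a, T (n) ≤ an log n for all n ≥ 2,
so T (n) = O(n log n).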
We leave it to the reader to show that T (n) = Ω(n log n).
ᵃ We pick 2 for the base case since n log n = 0 if n = 1, so we cannot make the inequality hold. Another
solution would be to show that T (n) ≤ an log n + b. In this case, b can be chosen so that the inequality holds
for n = 1.
⋆Question 8.2. The inductive step involves proving that if P (k) is true, then P (k + 1) is true.
So it almost seems like you are using a statement to prove the same statement–in other words,
circular reasoning. Explain why it is not circular reasoning.
⋆Question 8.3. Recall that [P (a) ∧ ∀k(P (k) → P (k + 1))] → (∀nP (n)) is a tautology, where
the universe is {a, a + 1, a + 2, . . .}.
(b) Use modus ponens to explain what this has to do with induction.
⋆Question 8.4. If I show that P (0) is true and that for all k > 0, P (k) → P (k + 1), then can I
conclude that P (k) is true for all k ≥ 0? Explain.
⋆Question 8.5. Use induction to prove that if k ≥ 1, then the number of binary strings of
length k is $2^k$.
⋆Question 8.6. Student A proves that P (n) is true for all n ≥ 1 by proving that P (1) is true
and that if P (k) is true, then P (k + 1) is true whenever k ≥ 1. Student B proves it by proving
that P (1) is true and that if k > 1, P (k − 1) → P (k) is true. Which one has a correct proof
technique?
⋆Question 8.7. What is the difference between weak and strong induction?
⋆Question 8.8. Come up with an analogy that helps to explain why proof by induction makes
sense. (A common one uses dominoes.)
⋆Question 8.9. If you go to the PDF of this book and look at Definition 8.31, you will notice
that the word recursive contains a hyperlink. What does it link to and why does it make sense?
⋆Question 8.10. What two or three things (depending on how you count and/or describe it)
are required for a recursive algorithm to be correct? Explain why each requirement is necessary.
⋆Question 8.11. Why are mathematical induction and recursion covered in the same chapter?
⋆Question 8.12. Write a recursive algorithm that searches for a given value in an array of
integers and returns the index of the location of the number in the array, or −1 if the number
is not present in the array. (Note: There are a few reasonable ways this might be accomplished,
and since the argument list to the function might be different based on the exact algorithm, you
have to come up with the function definition yourself. Likely your algorithm will need either 2 or
3 arguments.)
⋆Question 8.16. In a sentence or two, describe how each of the following techniques is used to
solve a recurrence relation
⋆Question 8.17. (a) Give one advantage of the substitution and iteration methods over the
Master Theorem.
(b) Give one advantage of the Master Theorem over the substitution and iteration methods.
(f) Which of these three techniques would you rather use? Why?
⋆Question 8.18. Why is the topic of solving recurrence relations in the same chapter as math-
ematical induction and recursion?
⋆Question 8.19. Why is the section on analyzing recursive algorithms in this chapter?
⋆Question 8.20. Based on the examples in this section, outline a procedure to analyze a recur-
sive algorithm. Be as specific as possible.
⋆Question 8.21. Analyze your algorithm from Question 8.12 by developing and solving a re-
currence relation for it. Does this analysis provide a best or worst case complexity?
8.6 Problems
Problem 8.1. Use induction to prove that $\sum_{k=1}^{n} k^3 = \frac{n^2(n+1)^2}{4}$ for all n ≥ 1.
Problem 8.3. Prove that for all positive integers n, $f_1^2 + f_2^2 + \cdots + f_n^2 = f_n f_{n+1}$, where $f_n$ is the
nth Fibonacci number.
Problem 8.4. Prove the following generalized De Morgan’s Law for sets (where n ≥ 2):
$$\overline{A_1 \cup A_2 \cup \cdots \cup A_n} = \overline{A_1} \cap \overline{A_2} \cap \cdots \cap \overline{A_n}.$$
(Note: There is a second law just like it that swaps the ∩s and ∪s.)
Problem 8.6. Prove that the number of binary palindromes of length 2k + 1 (odd length) is $2^{k+1}$
for all k ≥ 0.
Problem 8.8. Prove that the FibR(n) algorithm from Example 8.41 correctly computes fn .
(Hint: Use induction. How many base cases do you need? Do you need weak or strong induction?)
Problem 8.9. In Example 8.82 we gave a solution to the recurrence $f_n = f_{n-1} + f_{n-2}$, $f_0 = 0$,
$f_1 = 1$. Use the substitution method to re-prove this. (Hint: Recall that the roots to the
polynomial $x^2 - x - 1 = 0$ are $\frac{1 \pm \sqrt{5}}{2}$. This is equivalent to $x^2 = x + 1$. You will find this helpful
Problem 8.10. Explain why the following joke never ends: Pete and Repete got in a boat. Pete
fell off. Who’s left?
Problem 8.11. Find and prove a solution for each of the following recurrence relations using
two different techniques (this will not only help you verify that your solutions are correct, but it
will also give you more practice using each of the techniques). At least one of the techniques must
yield an exact formula if possible.
Problem 8.12. Give an exact solution for each of the following recurrence relations.
(a) $a_n = 3a_{n-1}$, $a_1 = 5$.
Problem 8.13. Use the Master Theorem to find a tight bound for each of the following recurrence
relations.
Problem 8.14. Prove that the Partition algorithm from Example 8.93 has complexity Θ(n).
Problem 8.15. Consider the classic bubble sort algorithm (see Example 7.132).
(a) Write a recursive version of the bubble sort algorithm. (Hint: The algorithm I have in mind
should contain one recursive call and one loop.)
(b) Let B(n) be the complexity of your recursive version of bubble sort. Give a recurrence relation
for B(n).
(c) Solve the recurrence relation for B(n) that you developed in part (b).
(d) Is your recursive implementation better, worse, or the same as the iterative one given in
Example 7.132? Justify your answer.
Problem 8.16. Consider the following algorithm (remember that integer division truncates):
int halfIt ( int n ) {
if (n >0) {
return 1 + halfIt ( n /2) ;
} else {
return 0;
}
}
(b) Prove that the algorithm is correct. That is, prove that it returns the answer you gave in
part (a).
(c) What is the complexity of halfIt(n)? Give and prove an exact formula. (Hint: This will
probably involve developing and solving a recurrence relation.)
Problem 8.17. This problem involves an algorithm to compute the sum of the first n squares
(i.e. $\sum_{k=1}^{n} k^2$) using recursion.
(a) Write an algorithm to compute $\sum_{k=1}^{n} k^2$ that uses recursion and only uses the increment/decrement
operator for arithmetic (e.g., you cannot use addition or multiplication). (Hint: The
algorithm I have in mind has one recursive call and one or two loops. Also, you will probably
need a global variable or to assume you can pass a variable by reference.)
(b) Let S(n) be the complexity of your algorithm from part (a). Give a recurrence relation for
S(n).
(c) Solve the recurrence relation for S(n) that you developed in part (b).
(d) Give a recursive linear-time algorithm to solve this same problem (with no restrictions on
what operations you may use). Prove that the algorithm is linear.
(e) Give a constant-time algorithm to solve this same problem (with no restrictions on what you
may use). Prove that the algorithm is constant.
(f) Discuss the relative merits of the three algorithms. Which algorithm is best? Worst? Justify.
Problem 8.18. Assuming the priests can move one disk per second, that they started moving
disks 6000 years ago, and that the legend of the Towers of Hanoi is true, when will the world end?
Problem 8.19. Prove that the stoogeSort algorithm given in Exercise 8.89 correctly sorts an
array of n integers.
Chapter 9: Counting
In this chapter we provide a very brief introduction to a field called combinatorics. We are
actually only going to scratch the surface of this very broad and deep subfield of mathematics
and theoretical computer science. We will focus on a subfield of combinatorics that is sometimes
called enumeration. That is, we will mostly concern ourselves with how to count things.
It turns out that combinatorial problems are notoriously deceptive. Sometimes they can seem
much harder than they are, and at other times they seem easier than they are. In fact, there are
many cases in which one combinatorial problem will be relatively easy to solve, but a very closely
related problem that seems almost identical will be very difficult to solve.
When solving combinatorial problems, you need to make sure you fully understand what is
being asked and make sure you are taking everything into account appropriately. I used to tell
students that combinatorics was easy. I don’t say that anymore. In some sense it is easy. But it
is also easy to make mistakes.
Theorem 9.1 (Sum Rule). Let E1 , E2 , . . . , Ek be pairwise disjoint finite sets. Then
$$|E_1 \cup E_2 \cup \cdots \cup E_k| = |E_1| + |E_2| + \cdots + |E_k|.$$
Another way of putting the sum rule is this: If you have to accomplish some task and you
can do it in one of n1 ways, or one of n2 ways, etc., up to one of nk ways, and no
way of doing the task appears on more than one of these lists, then there are n1 + n2 + · · · + nk ways
of doing the task.
Example 9.2. I have 5 brown shirts, 4 green shirts, 10 red shirts, and 3 blue shirts. How
many choices do I have if I intend to wear one shirt?
Solution: Since each list of shirts is independent of the others, I can use the
sum rule. Therefore I can choose any of my 5 + 4 + 10 + 3 = 22 shirts.
Example 9.3. How many ordered pairs of integers (x, y) are there such that 0 < |xy| ≤ 5?
⋆Exercise 9.4. For dessert you can have cake, ice cream or fruit. There are 3 kinds of cake,
8 kinds of ice cream and 5 different kinds of fruit. How many choices do you have for dessert?
Answer
Another way of putting the product rule is this: If you need to accomplish some task that
takes k steps, and there are n1 ways of accomplishing the first step, n2 ways of accomplishing
the second step, etc., and nk ways of accomplishing the kth step, then there are n1 n2 · · · nk
ways of accomplishing the task.
Example 9.6. I have 5 pairs of socks, 10 pairs of shorts, and 8 t-shirts. How many choices
do I have if I intend to wear one of each?
Solution: I can think of choosing what to wear as a task broken into 3 steps:
I have to choose a pair of socks (5 ways), a pair of shorts (10 ways), and finally a
t-shirt (8 ways). Thus I have 5 × 10 × 8 = 400 choices.
⋆Exercise 9.7. If license plates are required to have 3 letters followed by 3 digits, how
many license plates are possible?
Answer
Example 9.8. The positive divisors of 400 are written in increasing order
1, 2, 4, 5, 8, . . . , 200, 400.
How many integers are there in this sequence? How many of the divisors of 400 are perfect
squares?
Solution: Since $400 = 2^4 \cdot 5^2$, any positive divisor of 400 has the form $2^a 5^b$
where 0 ≤ a ≤ 4 and 0 ≤ b ≤ 2. Thus there are 5 choices for a and 3 choices for b
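The product rule gives 5 · 3 = 15 choices for the pair (a, b). A brute-force count is an easy cross-check. The sketch below is not from the text (the class and method names are made up for this illustration); it simply tries every candidate divisor, and every candidate square divisor, of n.

```java
// Sketch: brute-force counts of divisors and of perfect-square divisors.
final class DivisorCount {
    // Counts the positive divisors of n by trial division.
    static int countDivisors(int n) {
        int count = 0;
        for (int d = 1; d <= n; d++) {
            if (n % d == 0) {
                count++;
            }
        }
        return count;
    }

    // Counts the positive divisors of n that are perfect squares.
    static int countSquareDivisors(int n) {
        int count = 0;
        for (int d = 1; d * d <= n; d++) {
            if (n % (d * d) == 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("divisors of 400: " + countDivisors(400));
        System.out.println("square divisors of 400: " + countSquareDivisors(400));
    }
}
```

For the square divisors, the same product-rule reasoning applies with a and b both restricted to even values.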
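Theorem 9.9. Let the positive integer n have the prime factorization
$$n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k},$$
where the $p_i$ are distinct primes, and the $a_i$ are integers ≥ 1. If d(n) denotes the number of
positive divisors of n, then
$$d(n) = (a_1 + 1)(a_2 + 1) \cdots (a_k + 1).$$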
⋆Exercise 9.10. Prove Theorem 9.9. (Hint: Follow the idea from Example 9.8.)
⋆Question 9.11. Whether or not you realize it, you used the fact that the pi were distinct
primes in your proof of Theorem 9.9 (assuming you did the proof correctly). Explain where
that fact was used (perhaps implicitly).
Answer
Example 9.12. What is the value of sum after each of the following segments of code?
Code on the left:
int sum = 0;
for (int i = 0; i < n; i++) {
    for (int j = 0; j < m; j++) {
        sum = sum + 1;
    }
}
Code on the right:
int sum = 0;
for (int i = 0; i < n; i++) {
    sum = sum + 1;
}
for (int i = 0; i < m; i++) {
    sum = sum + 1;
}
Solution: In the code on the left, the inner loop executes m times, so every
time the inner loop executes, sum gets m added to it. The outer loop executes n
times, each time calling the inner loop. Therefore m is added to sum n times, so
sum = n × m at the end.
In the code on the right, the first loop adds n to sum, and then the second loop
adds m to sum. Therefore, sum = n + m at the end.
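The two segments can be wrapped in methods and run directly. This is a small sketch with hypothetical method names (nested and sequential) so that the product rule and the sum rule can be seen side by side.

```java
// Sketch: nested loops illustrate the product rule, sequential loops the sum rule.
final class LoopCounts {
    // The inner loop body runs m times for each of the n outer iterations.
    static int nested(int n, int m) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                sum = sum + 1;
            }
        }
        return sum;
    }

    // The first loop body runs n times, then the second runs m times.
    static int sequential(int n, int m) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum = sum + 1;
        }
        for (int i = 0; i < m; i++) {
            sum = sum + 1;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(nested(4, 7) + " versus " + sequential(4, 7));
    }
}
```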
The following problem can be solved using the product rule–you just need to figure out how.
⋆Exercise 9.13. The number 3 can be expressed as a sum of one or more positive integers
in four ways, namely, as 3, 1 + 2, 2 + 1, and 1 + 1 + 1. Show that any positive integer n can
be so expressed in $2^{n-1}$ ways.
Answer
Example 9.14. Each day I need to decide between wearing a t-shirt or a polo shirt. I have
50 t-shirts and 5 polo shirts. I also have to decide whether to wear jeans, shorts, or slacks. I
have 5 pairs of jeans, 15 pairs of shorts, and 4 pairs of slacks. How many different choices do
I have when I am getting dressed?
⋆Exercise 9.15. If license plates are required to have 5 characters, each of which is either
a digit or a letter, how many license plates are possible?
Answer
Answer
Example 9.17. The integers from 1 to 1000 are written in succession. Find the sum of all
the digits.
Solution: When writing the integers from 000 to 999 (with three digits),
3 × 1000 = 3000 digits are used. Each of the 10 digits is used an equal number of
times, so each digit is used 300 times. The sum of the digits in the interval
000 to 999 is thus
(0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9) · 300 = 13500.
Therefore, the sum of the digits when writing the integers from 1 to 1000 is
13500 + 1 = 13501.
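A counting argument like this one is easy to confirm by brute force. The following sketch (the names are hypothetical) adds up the digits of every integer in a range.

```java
// Sketch: brute-force check of the digit-sum computation.
final class DigitSum {
    // Returns the sum of the decimal digits of a nonnegative integer x.
    static int digitSum(int x) {
        int sum = 0;
        while (x > 0) {
            sum += x % 10;
            x /= 10;
        }
        return sum;
    }

    // Returns the sum of all digits used when writing the integers from..to (inclusive).
    static int totalDigitSum(int from, int to) {
        int total = 0;
        for (int x = from; x <= to; x++) {
            total += digitSum(x);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalDigitSum(1, 1000));
    }
}
```

Leading zeros contribute nothing to a digit sum, which is why padding to 000 through 999 in the solution does not change the answer.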
⋆Fill in the details 9.18. In C++, identifiers (e.g. variable and function names) can
contain only letters (upper and/or lower case), digits, and the underscore character. They
may not begin with a digit.a
(b) There are 53 · (26 + 26 + 10 + 1) = 53 · 63 = 3339 possible identifiers with two characters.
Example 9.20. In any group of 13 people, there are always two who have their birthday in
the same month. Similarly, if there are 32 people, at least two people were born on the same
day of the month.
⋆Exercise 9.21. What can you say about the digits in a number that is 11 digits long?
Answer
The pigeonhole principle can be generalized.
Theorem 9.22 (The Generalized Pigeonhole Principle). If n objects are placed into k boxes,
then there is at least one box that contains at least ⌈n/k⌉ objects.
Proof: Assume not. Then each of the k boxes contains no more than ⌈n/k⌉ − 1
objects. Notice that ⌈n/k⌉ < n/k + 1 (convince yourself that this is always true).
Thus, the total number of objects in the k boxes is at most
$$k\left(\lceil n/k \rceil - 1\right) < k\left((n/k + 1) - 1\right) = n,$$
contradicting the fact that there are n objects in the boxes. Therefore, some box
contains at least ⌈n/k⌉ objects.
The tricky part about using the pigeonhole principle is identifying the boxes and objects. Once
that is done, applying either form of the pigeonhole principle is straightforward. Actually, often
the trickiest thing is identifying that the pigeonhole principle even applies to the problem you are
trying to solve.
Example 9.23. A drawer contains an infinite supply of white, black, and blue socks. What
is the smallest number of socks you must take from the drawer in order to be guaranteed that
you have a matching pair?
Solution: Clearly I could grab one of each color, so three is not enough. But
according to the Pigeonhole Principle, if I take 4 socks, then I will get at least
⌈4/3⌉ = 2 of the same color (the colors correspond to the boxes). So 4 socks will
guarantee a matched pair.
Notice that I showed two things in this proof. I showed that 4 socks was enough,
but I also showed that 3 was not enough. This is important. For instance, 5 is
enough, but it isn’t the smallest number that works.
⋆Exercise 9.24. An urn contains 28 blue marbles, 20 red marbles, 12 white marbles, 10
yellow marbles, and 8 magenta marbles. How many marbles must be drawn from the urn in
order to assure that there will be 15 marbles of the same color? Justify your answer.
Answer
⋆Exercise 9.25. You are in line to get tickets to a concert. Each person can get at most
4 tickets. There are only 100 tickets available. The girl behind you in line says “I sure hope
there are enough tickets for me. You’re lucky, though. You will get as many as you want.”
What does she know, and under what circumstances will she get any tickets?
Answer
The pigeonhole principle is useful in existence proofs–that is, proofs that show that something
exists without actually identifying it concretely.
Example 9.26. Show that amongst any seven distinct positive integers not exceeding 126,
one can find two of them, say a and b, which satisfy
b < a ≤ 2b.
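Solution: Split the numbers {1, 2, 3, . . . , 126} into the six sets
{1, 2}, {3, 4, 5, 6}, {7, 8, . . . , 13, 14}, {15, 16, . . . , 29, 30},
{31, 32, . . . , 61, 62}, and {63, 64, . . . , 125, 126}.
By the pigeonhole principle, two of the seven numbers lie in the same set. Within each set the
largest element is at most twice the smallest, so calling the larger of the two numbers a and
the smaller b gives b < a ≤ 2b.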
Example 9.27. Given any 9 integers whose prime factors lie in the set {3, 7, 11} prove that
there must be two whose product is a square.
Solution: For an integer to be a square, all the exponents of its prime factori-
sation must be even. Any integer in the given set has a prime factorisation of the
form $3^a 7^b 11^c$. Now each triplet (a, b, c) has one of the following 8 parity patterns:
(even, even, even), (even, even, odd), (even, odd, even), (even, odd, odd), (odd,
even, even), (odd, even, odd), (odd, odd, even), (odd, odd, odd). In a group of 9
such integers, there must be two with the same parity patterns in the exponents.
Take these two. Their product is a square, since the sum of each corresponding
exponent will be even.
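The argument is an existence proof, but it also suggests a simple search. In the sketch below, each integer is represented only by its exponent triple (a, b, c), and the parity pattern is packed into a 3-bit mask (this encoding and the names are assumptions made for the illustration). The first time a mask repeats, we have found two integers whose product is a square.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: find two exponent triples with the same parity pattern.
final class SquareProductPair {
    // Each row of exps is (a, b, c), standing for the integer 3^a * 7^b * 11^c.
    // Returns the indices of two rows with matching parities, or null if none exist.
    static int[] findPair(int[][] exps) {
        Map<Integer, Integer> firstIndexWithMask = new HashMap<>();
        for (int i = 0; i < exps.length; i++) {
            // Bit 0 is the parity of a, bit 1 the parity of b, bit 2 the parity of c.
            int mask = (exps[i][0] & 1) | ((exps[i][1] & 1) << 1) | ((exps[i][2] & 1) << 2);
            Integer earlier = firstIndexWithMask.putIfAbsent(mask, i);
            if (earlier != null) {
                return new int[] {earlier, i};
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int[][] exps = {
            {0, 0, 0}, {0, 0, 1}, {0, 1, 0}, {0, 1, 1},
            {1, 0, 0}, {1, 0, 1}, {1, 1, 0}, {1, 1, 1}, {2, 4, 6}
        };
        int[] pair = findPair(exps);
        System.out.println("rows " + pair[0] + " and " + pair[1]);
    }
}
```

Since there are only 8 possible masks, any input with 9 or more rows must produce a non-null result, which is exactly the pigeonhole principle at work.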
⋆Exercise 9.28. The nine entries of a 3×3 grid are filled with −1, 0, or 1. Prove that among
the eight resulting sums (three columns, three rows, or two diagonals) there will always be
two that add to the same number.
Answer
Example 9.30. Given any set of ten natural numbers between 1 and 99 inclusive, prove that
there are two distinct nonempty subsets of the set with equal sums of their elements. (Hint:
How many possible subsets are there, and what are the possible sums of the elements within
the subsets?)
Solution: There are $2^{10} - 1 = 1023$ non-empty subsets that one can form with a
given 10-element set. To each of these subsets we associate the sum of its elements.
Every such sum is at least 1, and the maximum possible value is 90 + 91 + · · · + 99 = 945.
Since the number of possible sums is no more than 945 < 1023, there must be at least two
different subsets that have the same sum.
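With only 1023 subsets, the collision promised by the pigeonhole principle can also be found by direct search. The sketch below encodes each subset as a bitmask over the ten elements (an encoding chosen for this illustration) and remembers the first mask seen for each sum.

```java
// Sketch: find two distinct non-empty subsets with equal sums by exhaustive search.
final class EqualSubsetSums {
    // Returns the sum of the elements of set selected by the bits of mask.
    static int sumOf(int[] set, int mask) {
        int sum = 0;
        for (int i = 0; i < set.length; i++) {
            if ((mask & (1 << i)) != 0) {
                sum += set[i];
            }
        }
        return sum;
    }

    // Returns two distinct non-zero masks with equal sums, or null if there are none.
    // Assumes set holds positive integers and has fewer than 31 elements.
    static int[] findEqualSums(int[] set) {
        int total = sumOf(set, (1 << set.length) - 1);
        int[] firstMaskWithSum = new int[total + 1]; // 0 means "sum not seen yet"
        for (int mask = 1; mask < (1 << set.length); mask++) {
            int s = sumOf(set, mask);
            if (firstMaskWithSum[s] != 0) {
                return new int[] {firstMaskWithSum[s], mask};
            }
            firstMaskWithSum[s] = mask;
        }
        return null;
    }

    public static void main(String[] args) {
        int[] set = {90, 91, 92, 93, 94, 95, 96, 97, 98, 99};
        int[] masks = findEqualSums(set);
        System.out.println(Integer.toBinaryString(masks[0]) + " and "
                + Integer.toBinaryString(masks[1]) + " both sum to " + sumOf(set, masks[0]));
    }
}
```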
⋆Exercise 9.31. An eccentric old man has five cats. These cats have 16 kittens among
themselves. What is the largest integer n for which one can say that at least one of the five
cats has n kittens?
Answer
⋆Evaluate 9.32. Prove that at a party with at least two people, there are two people who
have shaken hands with the same number of people.
Proof 1: There are n − 1 people 1 person can shake hands with–4 others if
there are 5 people at the party. At one given time two people cannot
shake hands with 0 people and n − 1 people simultaneously because there are
4 slots to fill and 5 people therefore by the pigeonhole principle at least
two people shake hands with the same number of others.
Evaluation
Evaluation
Evaluation
⋆Exercise 9.33. Give a correct proof of the problem stated in Evaluate 9.32.
⋆Exercise 9.34. There are seventeen friends from high school that all keep in touch by
writing letters to each other.ᵃ To be clear, each person writes separate letters to each of the
others. In their letters only three different topics are discussed. Each pair only corresponds
about one of these topics. Prove that there are at least three people who all write to each other
about the same topic.
ᵃ You do know what letters are, right? They are like e-mail, only they are written on paper, are sent to
just one person, and are delivered to your physical mail box.
Example 9.35. Consider the set {a, b, c, d}. Suppose we “select” two letters from these four.
Depending on our interpretation, we may obtain the following answers.
(a) Permutations with repetitions. The order of listing the letters is important, and
repetition is allowed. In this case there are 4 · 4 = 16 possible selections:
aa ab ac ad
ba bb bc bd
ca cb cc cd
da db dc dd
(b) Permutations without repetitions. The order of listing the letters is important, and
repetition is not allowed. In this case there are 4 · 3 = 12 possible selections:
ab ac ad
ba bc bd
ca cb cd
da db dc
(c) Combinations with repetitions. The order of listing the letters is not important, and
repetition is allowed. In this case there are $\frac{4 \cdot 3}{2} + 4 = 10$ possible selections:
aa ab ac ad
bb bc bd
cc cd
dd
(d) Combinations without repetitions. The order of listing the letters is not important,
and repetition is not allowed. In this case there are $\frac{4 \cdot 3}{2} = 6$ possible selections:
ab ac ad
bc bd
cd
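The four interpretations in this example can be reproduced by filtering the same double loop in four different ways. This is a sketch with made-up names; the positions i and j index into the set, and each filter corresponds to one of the cases (a) through (d).

```java
// Sketch: count selections of two letters under the four interpretations.
final class SelectTwo {
    // Returns {permutations with repetition, permutations without repetition,
    //          combinations with repetition, combinations without repetition}
    // for selecting two items from a set of the given size.
    static int[] counts(int size) {
        int permRep = 0, permNoRep = 0, combRep = 0, combNoRep = 0;
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < size; j++) {
                permRep++;                   // (a) every ordered pair counts
                if (i != j) permNoRep++;     // (b) ordered, no repeated letter
                if (i <= j) combRep++;       // (c) unordered: count each pair once, repeats allowed
                if (i < j) combNoRep++;      // (d) unordered, no repeated letter
            }
        }
        return new int[] {permRep, permNoRep, combRep, combNoRep};
    }

    public static void main(String[] args) {
        int[] c = counts(4); // the set {a, b, c, d}
        System.out.println(c[0] + " " + c[1] + " " + c[2] + " " + c[3]);
    }
}
```

Treating an unordered selection as an ordered pair with i ≤ j is the usual trick for counting each combination exactly once.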
Although most of the simple types of counting problems we want to solve can be reduced
to one of these four, care must be taken. The previous example assumed that we had a set of
distinguishable objects. When objects are not distinguishable, the situation is more complicated.
MATH MAHT MTAH MTHA MHTA MHAT
AMTH AMHT ATMH ATHM AHTM AHMT
TAMH TAHM TMAH TMHA THMA THAM
HATM HAMT HTAM HTMA HMTA HMAT
Answer
Proof: The first object can be chosen in n ways, the second object in n − 1
ways, the third in n − 2, etc. This gives
$$n(n-1)(n-2) \cdots 2 \cdot 1 = n!.$$
Example 9.40. Previously we saw that there are 24 = 4! permutations of the letters in
MATH and 6 = 3! permutations of the letters in EAT.
⋆Exercise 9.41. How many permutations are there of the letters in uncopyrightable?
Answer
Let’s see some slightly more complicated examples.
Example 9.42. A bookshelf contains 5 German books, 7 Spanish books and 8 French books.
Each book is different from the others. How many different arrangements of
these books are possible if
Solution:
13 · 8! · 12! = 251073478656000.
(d) As with (c), we align the 12 German and Spanish books first, creating 13
spaces. To assure that no two French books are next to each other, we
put them into these spaces. The first French book can be put into any of
13 spaces, the second into any of 12 remaining spaces, etc., and the eighth
French book can be put into any of the 6 remaining spaces. Now, the non-French
books can be permuted in 12! ways. Thus the total number of permutations
is
13 · 12 · 11 · 10 · 9 · 8 · 7 · 6 · 12! = 24856274386944000.
⋆Exercise 9.43. Telephone numbers in Land of the Flying Camels have 7 digits, and the
only digits available are {0, 1, 2, 3, 4, 5, 7, 8}. No telephone number may begin in 0, 1 or 5.
Find the number of telephone numbers possible that meet the following criteria:
Answer
(b) You may not repeat the digits and the phone numbers must be odd.
Answer
The previous example and exercise should demonstrate that counting often requires thinking
about things in different ways depending on the exact situation. This can be tricky, and it is
very easy to make mistakes that lead to undercounting or overcounting possibilities. As you are solving
problems, think very carefully about what you are counting so you don’t fall into this trap.
Example 9.44. In how many ways may the letters of the word MASSACHUSETTS be
permuted to form different strings?
$$M\,A_1\,S_1\,S_2\,A_2\,C\,H\,U\,S_3\,E\,T_1\,T_2\,S_4.$$
There are now 13 distinguishable objects, which can be permuted in 13! different
ways by Theorem 9.39. But this counts some arrangements multiple times since
in reality the duplicated letters are not distinguishable. Consider a single permutation
of all of the distinguishable letters. If I permute the letters $A_1 A_2$, I get the
same permutation when ignoring the subscripts. The same thing is true of $T_1 T_2$.
Similarly, there are 4! permutations of $S_1 S_2 S_3 S_4$, so there are 4! permutations that
look the same (without the subscripts). Since I can do all of these independently,
there are 2! · 2! · 4! permutations that look identical when the subscripts are removed.
This is true of every permutation. Therefore, the actual number of permutations
is $\frac{13!}{2! \cdot 4! \cdot 2!} = 64864800$.
The following exercises should help the technique used in the previous example to sink in.
⋆Exercise 9.45. Use an argument similar to that in Example 9.44 to determine the number
of permutations of the letters in TALL.
Answer
Answer
⋆Exercise 9.47. How many permutations are there in the letters of AEEEI?
Answer
Answer
The arguments from the previous examples and exercises can be generalized to prove the
following.
Theorem 9.49. Let there be k types of objects: $n_1$ of type 1; $n_2$ of type 2; etc. Then the
number of ways in which these $n_1 + n_2 + \cdots + n_k$ objects can be rearranged is
$$\frac{(n_1 + n_2 + \cdots + n_k)!}{n_1! \cdot n_2! \cdots n_k!}.$$
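The formula in this theorem is straightforward to compute. The sketch below makes a few assumptions for illustration: the objects are the letters of a word written in uppercase A through Z, and the word is short enough that the result fits in a long. Rather than computing large factorials and dividing, it builds the answer one letter at a time so that every intermediate value is an integer.

```java
// Sketch: number of distinct rearrangements of the letters of a word.
final class MultisetPermutations {
    // Returns (n1 + ... + nk)! / (n1! * ... * nk!), where ni is the number of
    // copies of the i-th distinct letter. Assumes word uses only 'A'..'Z'.
    static long count(String word) {
        int[] freq = new int[26];
        for (char ch : word.toCharArray()) {
            if (ch < 'A' || ch > 'Z') {
                throw new IllegalArgumentException("expected uppercase letters only");
            }
            freq[ch - 'A']++;
        }
        long result = 1;
        int placed = 0; // number of letters accounted for so far
        for (int c : freq) {
            // Multiply by C(placed + c, c), one factor at a time; each division is exact.
            for (int i = 1; i <= c; i++) {
                placed++;
                result = result * placed / i;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(count("MASSACHUSETTS"));
    }
}
```

For MASSACHUSETTS this agrees with the value 64864800 found in Example 9.44.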
Example 9.50. How many permutations of the letters from MASSACHUSETTS contain
MASS?
Solution: We can consider MASS as one block along with the remaining 9
letters A, C, H, U, S, E, T, T, S. Thus, we are permuting 10 ‘letters’. There are
two S’sᵃ and two T’s and so the total number of permutations sought is
$$\frac{10!}{2! \cdot 2!} = 907200.$$
ᵃ Remember, the other two S’s are part of MASS, which we are now treating as a single object.
⋆Exercise 9.51. How many permutations of the letters from the word ALGORITHMS
contain SMITH?
Answer
Example 9.52. In how many ways may we write the number 9 as the sum of three positive
integer summands? Here order counts, so, for example, 1 + 7 + 1 is to be regarded different
from 7 + 1 + 1.
determine how many ways each triple can be reordered. The possibilities are:
Example 9.53. In how many ways can the letters of the word MURMUR be arranged
without allowing two of the same letters next to each other?
M U R _ R _
M U R _ _ R
M U _ R _ R
In the first case there are 2! = 2 ways of putting the remaining M and U, in the
second there are 2! = 2 ways and in the third there is only 1! way. Thus starting
the word with MU gives 2 + 2 + 1 = 5 possible arrangements. In the general case,
we can choose the first letter of the word in 3 ways, and the second in 2 ways.
Thus the number of ways sought is 3 · 2 · 5 = 30.a
a
It should be noted that this analysis worked because the three letters each occurred twice. If this was not
the case we would have had to work harder to solve the problem.
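Since MURMUR has only six letters, the count can be confirmed by listing every arrangement. This is a brute-force sketch added for illustration (the helper name is hypothetical); it does not replace the counting argument.

```python
from itertools import permutations


def no_adjacent_repeats(word):
    """True if no two neighboring letters of word are equal."""
    return all(a != b for a, b in zip(word, word[1:]))


# A set removes the duplicates caused by the repeated letters.
arrangements = {p for p in permutations("MURMUR") if no_adjacent_repeats(p)}
starting_with_mu = [p for p in arrangements if p[:2] == ("M", "U")]

print(len(arrangements))      # 30
print(len(starting_with_mu))  # 5
```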
⋆Exercise 9.54. Telephone numbers in Land of the Flying Camels have 7 digits, and the
only digits available are {0, 1, 2, 3, 4, 5, 7, 8}. No telephone number may begin with 0, 1 or 5.
Find the number of telephone numbers possible that meet the following criteria:
Answer
(b) You may repeat digits, but the last digit must be even.
Answer
(c) You may repeat digits, but the last digit must be odd.
Answer
Example 9.55. In how many ways can the letters of the word AFFECTION be arranged,
keeping the vowels in their natural order and not letting the two F’s come together?
Solution: There are $\frac{9!}{2!}$ ways of permuting the letters of AFFECTION.
The 4 vowels can be permuted in 4! ways, and in only one of these will they be
in their natural order. Thus there are $\frac{9!}{2! \cdot 4!}$ ways of permuting the letters of
AFFECTION in which their vowels keep their natural order. If we treat FF
as a single letter, there are 8! ways of permuting the letters so that the F’s stay
together. Hence there are $\frac{8!}{4!}$ permutations of AFFECTION where the vowels
occur in their natural order and the F’s are together. In conclusion, the number
of permutations sought is
\[ \frac{9!}{2! \cdot 4!} - \frac{8!}{4!} = \frac{8!}{4!}\left(\frac{9}{2} - 1\right) = 8 \cdot 7 \cdot 6 \cdot 5 \cdot \frac{7}{2} = 5880. \]
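The answer can be double-checked by exhaustive search, since AFFECTION has only 9!/2! = 181440 distinct arrangements. The sketch below is an illustration rather than part of the original solution; it assumes that "natural order" means the vowels appear in the order A, E, I, O.

```python
from itertools import permutations
from math import factorial

VOWELS = "AEIO"


def vowels_in_order(word):
    """True if the vowels of word appear in the order A, E, I, O."""
    return [c for c in word if c in VOWELS] == list(VOWELS)


count = 0
for p in set(permutations("AFFECTION")):   # set() removes the duplicate F swaps
    word = "".join(p)
    if vowels_in_order(word) and "FF" not in word:
        count += 1

by_formula = factorial(9) // (factorial(2) * factorial(4)) - factorial(8) // factorial(4)
print(count, by_formula)  # 5880 5880
```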
An alternative notation is C(n, k). This notation is particularly useful when you want to
express a binomial coefficient in the middle of text since it doesn’t take up two lines.
Note: Observe that in the last fraction, there are k factors in both the numerator and
denominator. Also, observe the boundary conditions
\[ \binom{n}{0} = \binom{n}{n} = 1, \qquad \binom{n}{1} = \binom{n}{n-1} = n. \]
(b) $\binom{12}{2} =$
(c) $\binom{10}{5} =$
(d) $\binom{200}{4} =$
(e) $\binom{67}{0} =$
If there are n kittens and you decide to take k of them home, you have also decided not to take
n − k of them home. This idea leads to the following important theorem.
Example 9.60.
\[ \binom{11}{9} = \binom{11}{2} = 55. \]
\[ \binom{12}{5} = \binom{12}{7} = 792. \]
\[ \binom{110}{109} = \binom{110}{1} = 110. \]
(b) $\binom{12}{10} =$
(c) $\binom{200}{196} =$
(d) $\binom{67}{66} =$
XY, XZ, XW, YZ, YW, WZ.
Notice that Y X (for instance) is not on the list because XY is already on the list and order
does not matter.
XYZ, XYW, XZW, YWZ.
Answer
Proof: The number of ways of picking k objects if the order matters is
n(n − 1)(n − 2) · · · (n − k + 1) since there are n ways of choosing the first object, n − 1
ways of choosing the second object, etc. Since each k-combination can be ordered
in k! ways, the number of ordered lists of size k is k! times the number of k-
combinations. Put another way, the number of k-combinations is the number
above divided by k!. That is, the total number of k-combinations is
\[ \frac{n(n-1)(n-2)\cdots(n-k+1)}{k!} = \binom{n}{k}. \]
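The two counts in the proof can be compared against explicit listings for small values. The sketch below is illustrative only; the function names and the choice n = 10, k = 4 are assumptions made for the demonstration.

```python
from itertools import combinations, permutations
from math import factorial


def ordered_selections(n, k):
    """n(n-1)...(n-k+1): ways to pick k of n objects when order matters."""
    result = 1
    for i in range(k):
        result *= n - i
    return result


def k_combinations(n, k):
    """Each k-combination is counted k! times among the ordered selections."""
    return ordered_selections(n, k) // factorial(k)


n, k = 10, 4
items = range(n)
assert ordered_selections(n, k) == len(list(permutations(items, k)))
assert k_combinations(n, k) == len(list(combinations(items, k)))
print(ordered_selections(n, k), k_combinations(n, k))  # 5040 210
```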
Example 9.67. From a group of 10 people, we may choose a committee of 4 in $\binom{10}{4} = 210$
ways.
⋆Evaluate 9.68. A family has seven women and nine men. They need five of them to get
together to plan a party. If at least one of the five must be a woman, how many ways are
there to select the five?
Evaluation
Evaluation
Solution 3: There are $\binom{16}{5}$ possible committees, $\binom{9}{5}$ of which contain only
men. Thus, there are $\binom{16}{5} - \binom{9}{5}$ committees that contain at least one
woman.
Evaluation
[Figure: a grid with corner points A and B, with one shortest route from A to B drawn.]
To count the number of shortest routes from A to B (one of which is given), observe
that any shortest path must consist of 6 horizontal moves and 3 vertical ones for a total of
6 + 3 = 9 moves. Once we choose which 6 of these 9 moves are horizontal the 3 vertical ones
are determined. For instance, if I choose to go horizontal on moves 1, 2, 4, 6, 7, and 8, then
moves 3, 5 and 9 must be vertical. Since there are 9 moves, I just need to choose which 6 of
these are the horizontal moves. Thus there are $\binom{9}{6} = 84$ paths.
Another way to think about it is that we need to compute the number of permutations
of EEEEEENNN, where E means move east, and N means move north. The number of
permutations is $9!/(6! \cdot 3!) = \binom{9}{6}$.
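The same count can be obtained by a small table-filling computation: every grid point is reached either from the point to its west or from the point to its south. This sketch is an illustration, assuming the grid is 6 units wide and 3 units tall as described above; the function name is hypothetical.

```python
from math import comb


def count_shortest_routes(east, north):
    """Count routes that use only east and north moves on an east-by-north grid."""
    # ways[i][j] = number of routes from the start to the point i east, j north.
    # Points on the bottom row or left column can be reached in exactly one way.
    ways = [[1] * (north + 1) for _ in range(east + 1)]
    for i in range(1, east + 1):
        for j in range(1, north + 1):
            ways[i][j] = ways[i - 1][j] + ways[i][j - 1]
    return ways[east][north]


print(count_shortest_routes(6, 3), comb(9, 6))  # 84 84
```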
⋆Exercise 9.70. Count the number of shortest routes from A to B that pass through point
O in the following grid. (Hint: Break it into two subproblems and combine the solutions.)
[Figure: a grid with the points A, O, and B marked.]
⋆Evaluate 9.71. A family has seven women and nine men. How many ways are there to
select five of them to plan a party if at least one man and one woman must be selected?
Solution 1: There are 7 choices for the first woman, 9 choices for the
first man, and $\binom{14}{3}$ choices for the rest of the committee. Thus, there are
$\binom{14}{3} \cdot 7 \cdot 9$ possible committees.
Evaluation
Solution 2: Since one has to be a woman and one has to be a man, then
they really just need to select 3 more members from the remaining 14
people, so the answer is $\binom{14}{3}$.
Evaluation
Now it’s your turn to give a correct solution to the previous problem.
⋆Exercise 9.72. A family has seven women and nine men. How many ways are there to
select five of them to plan a party if at least one man and one woman must be selected?
⋆Question 9.73. In the answer to the previous problem, we pointed out that two sets of
committees did not overlap. Why was that important?
Answer
Example 9.74. Three different integers are drawn from the set {1, 2, . . . , 20}. In how many
ways may they be drawn so that their sum is divisible by 3?
The sum of three numbers will be divisible by 3 when (a) the three numbers are
divisible by 3; (b) one of the numbers is divisible by 3, one leaves remainder 1 and
the third leaves remainder 2 upon division by 3; (c) all three leave remainder 1
upon division by 3; (d) all three leave remainder 2 upon division by 3. Hence the
number of ways is
\[ \binom{6}{3} + \binom{6}{1}\binom{7}{1}\binom{7}{1} + \binom{7}{3} + \binom{7}{3} = 384. \]
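With only 1140 ways to choose three numbers from {1, 2, . . . , 20}, the case analysis can be checked directly. The following sketch is for illustration only and simply compares a brute-force count with the formula above.

```python
from itertools import combinations
from math import comb

# Every unordered choice of three different integers whose sum is divisible by 3.
triples = [t for t in combinations(range(1, 21), 3) if sum(t) % 3 == 0]

# Cases (a) through (d): 6 multiples of 3, 7 numbers with remainder 1, 7 with remainder 2.
by_formula = comb(6, 3) + comb(6, 1) * comb(7, 1) * comb(7, 1) + comb(7, 3) + comb(7, 3)

print(len(triples), by_formula)  # 384 384
```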
⋆Evaluate 9.75. The 300-level courses in the CS department are split into three groups:
Foundations (361, 385), Applications (321, 342, 392), and Systems (335, 354, 376). In order
to get a BS in computer science at Hope you need to take at least one course from each
group. If you take four 300-level courses, how many different possibilities do you have that
satisfy the requirements?
Solution 1: You have to take one from each group and then you can take
any of the remaining 5 courses. So the total is 2 ∗ 3 ∗ 3 ∗ 5 = 90.
Evaluation
Solution 2: $\binom{8}{4} = 70$
Evaluation
⋆Evaluate 9.76. Using the same requirements from Evaluate 9.75, how many total ways
are there to take 300-level courses that satisfy the requirements?
Solution 1: Take one from each group and then choose between 0 and 5
of the remaining 5. The total is therefore 2 ∗ 3 ∗ 3 ∗ $\sum_{k=0}^{5} \binom{5}{k}$.
Evaluation
Solution 2: Since you can take anywhere between 3 and 8 courses, the
number of possibilities is $\binom{8}{3} + \binom{8}{4} + \binom{8}{5} + \binom{8}{6} + \binom{8}{7} + \binom{8}{8}$.
Evaluation
In Problem 9.31 you will have a chance to properly solve the problems from the previous two
Evaluate exercises.
Example 9.77. How many ways are there to put 10 ping pong balls into 4 buckets?
Solution: We will solve this using a technique sometimes called stars and bars.
We will represent the balls with stars and the dividers between neighboring buckets
with bars. We will use 10
stars and 3 bars. To see why this is 3 and not 4, let’s see how we represent the
situation of having 3 balls in the first bucket, 5 in the second, none in the third,
and 2 in the fourth:
***|*****||**
Do you see it? The bars act as separators between the buckets, which is why we
have one less bar than the number of buckets.
Given this formulation, aren’t we just trying to find all possible orderings of bars
and stars? Indeed. To do so, all we need to do is determine where to put the
stars, and the bars ‘fall into place’. Alternatively, we can determine where to put
the bars and let the stars fall into place. There are 13 spots and we need to choose
10 spots for the balls (the ‘stars’) or 3 spots for the bucket separators (the ‘bars’).
So the solution is
\[ \binom{13}{10} = \binom{13}{3} = 286. \]
Notice that Theorem 9.59 implies that these two methods of solving the problem
will always be the same, which is a really good thing.
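The stars and bars count can be compared with a direct enumeration. The sketch below assumes, as the solution implicitly does, that the balls are identical and the buckets are distinguishable; the function names are illustrative.

```python
from itertools import product
from math import comb


def distributions(balls, buckets):
    """Count ways to put identical balls into distinct buckets by brute force."""
    return sum(1 for counts in product(range(balls + 1), repeat=buckets)
               if sum(counts) == balls)


def stars_and_bars(balls, buckets):
    """Choose positions for the buckets - 1 bars among balls + buckets - 1 spots."""
    return comb(balls + buckets - 1, buckets - 1)


def picture(counts):
    """Draw one distribution, e.g. (3, 5, 0, 2), as stars separated by bars."""
    return "|".join("*" * c for c in counts)


print(picture((3, 5, 0, 2)))                          # ***|*****||**
print(distributions(10, 4), stars_and_bars(10, 4))    # 286 286
```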
Example 9.78. How many ways are there to choose 10 pieces of fruit if you can take any
number of bananas, oranges, apples, or pears and the order in which you select them does not matter?
Solution: Again we can use stars and bars to solve this problem. We need
10 stars to represent the chosen fruits and 3 bars to divide the four fruits we can
choose from. The stars before the first bar represent bananas, those between the
first and second bar are oranges, between the second and third are apples, and
after the third are pears. Thus, we need to count the number of ways we can
arrange 10 stars and 3 bars. Notice that this is exactly the same thing we did in
the previous example, so the answer is
\[ \binom{13}{10} = \binom{13}{3} = 286. \]
⋆Exercise 9.79. I want to make a sandwich that has 3 slices of meat. My refrigerator is
well stocked because I have 11 different meats to choose from. How many choices do I have
for my sandwich if I allow myself to have multiple slices of the same meat and the order the
slices appear on the sandwich does not matter?
Answer
The previous theorem can be applied to various situations. As with the pigeonhole principle,
the trickiest part is recognizing when and how to apply it.
Again, multiplying
\[ (a + b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3 \tag{9.2} \]
by a + b one obtains
\[ (a + b)^4 = a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4. \]
Theorem 9.81 (Binomial Theorem). Let x and y be variables and n be a nonnegative integer.
Then
\[ (x + y)^n = \sum_{i=0}^{n} \binom{n}{i} x^{n-i} y^i. \]
Solution:
\begin{align*}
(4x + 5)^3 &= \binom{3}{0}(4x)^3 5^0 + \binom{3}{1}(4x)^2 (5)^1 + \binom{3}{2}(4x)^1 (5)^2 + \binom{3}{3}(4x)^0 5^3 \\
&= (4x)^3 + 3(4x)^2 (5) + 3(4x)(5)^2 + 5^3 \\
&= 64x^3 + 240x^2 + 300x + 125
\end{align*}
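The expansion can be reproduced term by term with a short computation. This is a sketch for checking work, assuming a binomial of the form (ax + b)^n with integer a and b; the function name is hypothetical.

```python
from math import comb


def expand_binomial(a, b, n):
    """Coefficients of (a*x + b)**n, from the x**n term down to the constant."""
    # The power applies to the whole term a*x, so a is raised to n - i as well.
    return [comb(n, i) * a ** (n - i) * b ** i for i in range(n + 1)]


coefficients = expand_binomial(4, 5, 3)
print(coefficients)  # [64, 240, 300, 125]

# Spot check at x = 2: both sides should give (4*2 + 5)**3.
x = 2
value = sum(c * x ** (3 - i) for i, c in enumerate(coefficients))
print(value == (4 * x + 5) ** 3)  # True
```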
Example 9.83. In the following, $i = \sqrt{-1}$, so that $i^2 = -1$.
Notice that we skipped the step of explicitly writing out the binomial coefficient for this
example. You can do it either way–just make sure you aren’t forgetting anything or making
algebra mistakes by taking shortcuts.
The most important things to remember when using the binomial theorem are not to forget
the binomial coefficients, and not to forget that the powers (i.e. $x^{n-i}$ and $y^i$) apply to the whole
term, including any coefficients. A specific case that is easy to forget is a negative sign on the
coefficient. Did you make any of these mistakes when doing the last exercise? Be sure to identify
your errors so you can avoid them in the future.
⋆Exercise 9.85. Expand $(\sqrt{3} + \sqrt{5})^4$, simplifying as much as possible.
Example 9.86. Let n ≥ 1. Find a closed form for $\sum_{k=0}^{n} \binom{n}{k} (-1)^k$.
Solution: Using a little algebra and the binomial theorem, we can see that
\[ \sum_{k=0}^{n} \binom{n}{k} (-1)^k = \sum_{k=0}^{n} \binom{n}{k} 1^{n-k} (-1)^k = (1 - 1)^n = 0. \]
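A quick numerical check of this closed form for a few small values of n, offered only as a sanity test and not as a proof (the function name is illustrative):

```python
from math import comb


def alternating_sum(n):
    """Sum of C(n, k) * (-1)**k for k = 0, ..., n."""
    return sum(comb(n, k) * (-1) ** k for k in range(n + 1))


print([alternating_sum(n) for n in range(1, 9)])  # [0, 0, 0, 0, 0, 0, 0, 0]
```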
⋆Exercise 9.87. Find a closed form for $\sum_{k=1}^{n} \binom{n}{k} 3^k$.
If we ignore the variables in the Binomial Theorem and write down the coefficients for increas-
ing values of n, a pattern, called Pascal’s Triangle, emerges (see Figure 9.1).
                    1
                  1   1
                1   2   1
              1   3   3   1
            1   4   6   4   1
          1   5  10  10   5   1
        1   6  15  20  15   6   1
      1   7  21  35  35  21   7   1
    1   8  28  56  70  56  28   8   1
  1   9  36  84 126 126  84  36   9   1
1  10  45 120 210 252 210 120  45  10   1
                   ...
Figure 9.1: Pascal’s Triangle
Notice that each entry (except for the 1s) is the sum of the two neighbors just above it. This
observation leads to Pascal’s Identity.
Theorem 9.88 (Pascal’s Identity). Let n and k be positive integers with k ≤ n. Then
\[ \binom{n+1}{k} = \binom{n}{k-1} + \binom{n}{k}. \]
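Pascal's Identity gives a way to build the triangle using nothing but addition. The sketch below is illustrative (the function name is an assumption); it builds the rows shown in Figure 9.1 and can be compared with the binomial coefficients computed directly.

```python
from math import comb


def pascal_rows(count):
    """Build rows 0 through count - 1 of Pascal's Triangle using only addition."""
    rows = [[1]]
    for _ in range(count - 1):
        prev = rows[-1]
        # Each interior entry is the sum of the two neighbors just above it.
        interior = [prev[k - 1] + prev[k] for k in range(1, len(prev))]
        rows.append([1] + interior + [1])
    return rows


rows = pascal_rows(11)
print(rows[7])  # [1, 7, 21, 35, 35, 21, 7, 1]
print(all(rows[n][k] == comb(n, k) for n in range(11) for k in range(n + 1)))  # True
```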
9.5 Inclusion-Exclusion
The Sum Rule (Theorem 9.1) gives us the cardinality for unions of finite sets that are mutually
disjoint. In this section we will drop the disjointness requirement and obtain a formula for the
cardinality of unions of general finite sets.
The Principle of Inclusion-Exclusion is attributed to both Sylvester and to Poincaré. We will
only consider the cases involving two and three sets, although the principle easily generalizes to
k sets.
Theorem 9.89 (Inclusion-Exclusion for Two Sets). Let A and B be sets. Then
|A ∪ B| = |A| + |B| − |A ∩ B|
Proof: Clearly there are |A ∩ B| elements that are in both A and B. Therefore,
|A| + |B| is the number of elements in A and B, where the elements in A ∩ B are
counted twice. From this it is clear that |A ∪ B| = |A| + |B| − |A ∩ B|.
Example 9.90. Of 40 people, 28 smoke and 16 chew tobacco. It is also known that 10 both
smoke and chew. How many among the 40 neither smoke nor chew?
Solution: Let A denote the set of smokers and B the set of chewers. Then
|A ∪ B| = |A| + |B| − |A ∩ B| = 28 + 16 − 10 = 34,
meaning that there are 34 people that either smoke or chew (or possibly both).
Therefore the number of people that neither smoke nor chew is 40 − 34 = 6.
⋆Exercise 9.91. In a group of 100 camels, 46 eat wheat, 57 eat barley, and 10 eat neither.
How many camels eat both wheat and barley?
Example 9.92. Consider the set A of positive multiples of 2 that are no greater than 114. That is,
A = {2, 4, 6, . . . , 114}.
(a) Notice that the elements are 2 = 2(1), 4 = 2(2), . . . , 114 = 2(57). Thus
|A| = 57.
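(b) Notice that
A_3 = {6, 12, 18, . . . , 114} = {6(1), 6(2), . . . , 6(19)},
so |A_3| = 19.
(c) Notice that
A_5 = {10, 20, 30, . . . , 110} = {10(1), 10(2), . . . , 10(11)},
so |A_5| = 11.
(d) Notice that A_15 = {30, 60, 90}, so |A_15| = 3.
(e) First notice that A_3 ∩ A_5 = A_15. Then it is clear that the answer is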
(f) We want
(g) We want
Theorem 9.93 (Inclusion-Exclusion for Three Sets). Let A, B, and C be sets. Then
|A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |B ∩ C| − |C ∩ A| + |A ∩ B ∩ C|.
Proof: Using the associativity and distributivity of unions of sets, we see that
|A ∪ B ∪ C| = |A ∪ (B ∪ C)|
= |A| + |B ∪ C| − |A ∩ (B ∪ C)|
= |A| + |B ∪ C| − |(A ∩ B) ∪ (A ∩ C)|
= |A| + |B| + |C| − |B ∩ C| − |A ∩ B| − |A ∩ C| + |(A ∩ B) ∩ (A ∩ C)|
= |A| + |B| + |C| − |B ∩ C| − (|A ∩ B| + |A ∩ C| − |A ∩ B ∩ C|)
= |A| + |B| + |C| − |A ∩ B| − |B ∩ C| − |C ∩ A| + |A ∩ B ∩ C|.
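Both inclusion-exclusion formulas can be tested on randomly generated sets. The sketch below is a sanity check rather than a proof; the universe of 50 elements, the number of trials, and the fixed random seed are arbitrary choices made for the illustration.

```python
import random


def check_inclusion_exclusion(a, b, c):
    """Check the two-set and three-set formulas on concrete sets a, b, c."""
    two = len(a | b) == len(a) + len(b) - len(a & b)
    three = len(a | b | c) == (len(a) + len(b) + len(c)
                               - len(a & b) - len(b & c) - len(c & a)
                               + len(a & b & c))
    return two and three


rng = random.Random(0)        # fixed seed so the run is repeatable
universe = range(50)
results = []
for _ in range(1000):
    a, b, c = (set(rng.sample(universe, rng.randint(0, 50))) for _ in range(3))
    results.append(check_inclusion_exclusion(a, b, c))

print(all(results))  # True
```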
Example 9.94. At Medieval High there are forty students. Amongst them, fourteen like
Mathematics, sixteen like theology, and eleven like alchemy. It is also known that seven like
Mathematics and theology, eight like theology and alchemy and five like Mathematics and
alchemy. All three subjects are favored by four students. How many students like neither
Mathematics, nor theology, nor alchemy?
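Solution: Let A be the set of students liking Mathematics, B the set of students liking
theology, and C be the set of students liking alchemy. We are given that
|A| = 14, |B| = 16, |C| = 11, |A ∩ B| = 7, |B ∩ C| = 8, |A ∩ C| = 5,
and
|A ∩ B ∩ C| = 4.
Using Theorem 9.93, along with some set identities, we can see that (with U denoting
the set of all forty students)
\begin{align*}
|\overline{A} \cap \overline{B} \cap \overline{C}| &= |\overline{A \cup B \cup C}| \\
&= |U| - |A \cup B \cup C| \\
&= |U| - |A| - |B| - |C| + |A \cap B| + |A \cap C| + |B \cap C| - |A \cap B \cap C| \\
&= 40 - 14 - 16 - 11 + 7 + 5 + 8 - 4 \\
&= 15.
\end{align*}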
⋆Exercise 9.95. A survey of a group’s viewing habits revealed the percentages that watch
a given sport. The results are given below. Calculate the percentage of the group that
watched none of the three sports.
28% gymnastics    14% gymnastics & baseball    8% all three sports
29% baseball      10% gymnastics & soccer
19% soccer        12% baseball & soccer
⋆Exercise 9.96. Would you believe a market investigator that reports that of 1000 people,
816 like candy, 723 like ice cream, 645 like cake, while 562 like both candy and ice cream, 463
like both candy and cake, 470 like both ice cream and cake, while 310 like all three? State
your reasons!
Example 9.97. An auto insurance company has 10,000 policyholders. Each policyholder
is classified as
• young or old,
• male or female,
• married or single.
Of these policyholders, 3000 are young, 4600 are male, and 7000 are married. The policyhold-
ers can also be classified as 1320 young males, 3010 married males, and 1400 young married
persons. Finally, 600 of the policyholders are young married males.
How many of the company’s policyholders are young, female, and single?
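One way to reach the requested count is sketched below, under the assumption that every policyholder is exactly one of young/old, male/female, and married/single. The idea is to work only within the young policyholders and use two-set inclusion-exclusion to remove those who are male or married. The variable names are illustrative.

```python
young = 3000
young_male = 1320
young_married = 1400
young_married_male = 600

# Among the young policyholders, count those who are male or married (or both).
young_male_or_married = young_male + young_married - young_married_male

# Whoever is left is young, female, and single.
young_female_single = young - young_male_or_married
print(young_female_single)  # 880
```

Notice that the totals for all males (4600), all married policyholders (7000), and married males (3010) are not needed for this particular question.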
The following problem is a little more challenging than the others we have seen, but you have
all of the tools you need to tackle it.
⋆Exercise 9.98 (Lewis Carroll in A Tangled Tale.). In a very hotly fought battle, at least
70% of the combatants lost an eye, at least 75% an ear, at least 80% an arm, and at least
85% a leg. What can be said about the percentage who lost all four members?
⋆Question 9.1. For each of the following, make up an example that requires the given rule
to solve. Your example should not just rehash an example from the book. Bonus points if it is
computer science related. For each, also give and explain the solution.
⋆Question 9.2. If there are 12 objects in 10 boxes, does the pigeonhole principle allow you to
conclude that one box has at least 3 objects? Or that two boxes have at least two items? Or that
every box has at least one item? What is the most precise thing that you can conclude from it?
⋆Question 9.3. Assume 30 balls are placed in 7 bins. Give several different possibilities for how
many balls are in each bin, trying to make the examples as different from each other as possible.
What is true of all of your examples, as predicted by the generalized pigeonhole principle?
⋆Question 9.4. I have 21 disc golf discs, including putters, approach discs, fairway drivers, and
distance drivers. Tell me everything you can say for certain about how many of each type of disc
I have.
⋆Question 9.5. Twelve people each pick a number from 1 to 1000. Prove that at least two of
them picked numbers that have the same number of 1s in their binary representation or show why
it is possible that this is not the case (i.e. give a counterexample).
⋆Question 9.7. How many are there of each of the following (where digit means decimal digit).
⋆Question 9.8. (a) How many permutations are there of the set {8, 6, 7, 5, 3, 0, 9}? (b) How
many of these permutations begin with 8 and end with 9?
⋆Question 9.9. You know somebody’s PIN number uses the digits 3, 3, 6, 6, 8, but you do not
know the order. How many possible PIN numbers have these digits?
⋆Question 9.10. You need to choose a team of 45 people out of a possible 50 people. What is
probably a much easier way of thinking about this problem?
⋆Question 9.11. Compute $\binom{25}{22}$ by hand. (Hint: Be smart!)
⋆Question 9.12. The board of directors for the Holland Running Club has 11 members. The
executive board is a subcommittee of the board of directors consisting of 4 members (from the
11). The executive board consists of the president, vice-president, treasurer, and secretary.
(a) How many different possibilities are there for the executive board if we do not care which
office they hold?
(b) If we are given the four members of the executive board, how many ways are there of assigning
the offices?
(c) How many different possible executive boards are there (choosing from the whole board) if
we do care about who is in which office?
⋆Question 9.13. What are two important factors that influence how you go about counting
things (i.e. help you determine which technique you will use)?
(b) Compute the previous sum for n = 0, 1, 2, 3, 4, 5. (Hint: Use your solution to part (a)!)
(c) Attempt to make a connection between this question and Pascal’s Triangle. It may be a bit
subtle, but it is kind of neat if you see it.
⋆Question 9.15. Use the Binomial Theorem to expand $(2x - 3y)^5$, simplifying as much as
possible.
⋆Question 9.16. Use the Binomial Theorem to prove that $\sum_{k=0}^{n} \binom{n}{k} = 2^n$.
⋆Question 9.17. In a class of 20 students, 7 show up late and 4 sleep during class. How many
students do neither? Give both a minimum and maximum since there is not enough information
to know for sure.
⋆Question 9.18. You want to use Inclusion-Exclusion on 3 sets (e.g. your goal is to compute
how many things are in the union of 3 sets). You are given 6 pieces of information. Is that enough
to solve the problem? Explain.
⋆Question 9.19. Given Inclusion-Exclusion on two and three sets, can you generalize it to four
sets? Thus, given sets A, B, C, and D, what is a formula for |A ∪ B ∪ C ∪ D|?
9.7 Problems
Problem 9.1. How many license plates can be made using either three letters followed by three
digits or four letters followed by two digits?
Problem 9.2. How many license plates can be made using 4 letters and 3 numbers if the letters
cannot be repeated and the letters and numbers may appear in any order?
Problem 9.3. How many bit strings of length 8 either begin with three 1s or end with four 0s?
Problem 9.4. How many alphabetic strings are there whose length is at most 5?
Problem 9.5. How many bit strings are there of length at least 4 and at most 6?
Problem 9.6. How many subsets with 4 or more elements does a set of size 30 have?
Problem 9.7. Given a group of ten people, prove that at least 4 are male or at least 7 are female.
Problem 9.8. My family wants to take a group picture. There are 7 men and 5 women, and we
want none of the women to stand next to each other. How many different ways are there for us
to line up?
Problem 9.9. My family (7 men and 5 women) wants to select a group of 5 of us to plan
Christmas. We want at least 1 man and 1 woman in the group. How many ways are there for us
to select the members of this group?
Problem 9.10. Compute each of the following: $\binom{8}{4}$, $\binom{9}{9}$, $\binom{7}{3}$, 8!, and 5!
Problem 9.11. For what value(s) of k is $\binom{18}{k}$ largest? smallest?
Problem 9.12. For what value(s) of k is $\binom{19}{k}$ largest? smallest?
Problem 9.13. A computer network consists of 10 computers. Each computer is directly con-
nected to zero or more of the other computers.
(a) Prove that there are at least two computers in the network that are directly connected to the
same number of other computers.
(b) Prove that there are an even number of computers that are connected to an odd number of
other computers.
Problem 9.14. Simplify the following expression so it does not involve any factorials or binomial
coefficients: $\binom{x}{y} / \binom{x+1}{y-1}$
Problem 9.15. Prove that amongst six people in a room there are at least three who know one
another, or at least three who do not know one another.
Problem 9.16. Suppose that the letters of the English alphabet are listed in an arbitrary order.
(b) Give a list to show that there need not be five consecutive consonants.
(c) Suppose that all the letters are arranged in a circle. Prove that there must be five consecutive
consonants.
Problem 9.17. Bob has ten pockets and forty-four silver dollars. He wants to put his dollars
into his pockets so distributed that each pocket contains a different number of dollars.
(b) Generalize the problem, considering p pockets and n dollars. Why is the problem most
interesting when $n = \frac{(p - 1)(p - 2)}{2}$?
Problem 9.18. Expand and simplify the following.
(a) $(x - 4y)^3$
(b) $(x^3 + y^2)^4$
(c) $(2 + 3x)^6$
Problem 9.21. Prove Pascal’s Identity (Theorem 9.88). (Hint: Just use the definition of the
binomial coefficient and do a little algebra.)
Problem 9.22. Prove that for any positive integer n, $\sum_{k=0}^{n} \binom{n}{k} (-2)^k = (-1)^n$. (Hint: Don’t use
induction.)
Problem 9.24. There are approximately 7,000,000,000 people on the planet. Assume that ev-
eryone has a name that consists of exactly k lower-case letters from the English alphabet.
(a) If k = 8, is it guaranteed that two people have the same name? Explain.
(b) What is the maximum value of k that would guarantee that at least two people have the same
name?
(c) What is the maximum value of k that would guarantee that at least 100 people have the same
name?
(d) Now assume that names can be between 1 and k characters long. What is the maximum
value of k that would guarantee that at least two people have the same name?
Problem 9.25. Password cracking is the process of determining someone’s password, typically
using a computer. One way to crack passwords is to perform an exhaustive search that tries every
possible string of a given length until it (hopefully) finds it. Assume your computer can test
10,000,000 passwords per second. How long would it take to crack passwords with the following
restrictions? Give answers in seconds, minutes, hours, days, or years depending on how large the
answer is (e.g. 12,344,440 seconds isn’t very helpful). Start by determining how many possible
passwords there are in each case.
(a) 8 lower-case alphabetic characters.
(d) 8 alphabetic (upper or lower), numeric characters, and special characters (assume there are
32 allowable special characters).
(f) 10 alphabetic (upper or lower), numeric characters, and special characters (assume there are
32 allowable special characters).
(g) 8 characters, with at least one upper-case, one lower-case, one number, and one special char-
acter.
Problem 9.26. IP addresses are used to identify computers on a network. In IPv4, IP addresses
are 32 bits long. They are usually written using dotted-decimal notation, where the 32 bits
are split up into four 8-bit segments, and each 8-bit segment is represented in decimal. So the IP
address 10000001 11000000 00011011 00000100 is represented as 129.192.27.4. The subnet mask
of a network is a string of k ones followed by 32 − k zeros, where the value of k can be different on
different networks. For instance, the subnet mask might be 11111111111111111111111100000000,
which is 255.255.255.0 in dotted decimal. To determine the netid, an IP address is bitwise ANDed
with the subnet mask. To determine the hostid, an IP address is bitwise ANDed with the bitwise
complement of the subnet mask. Since every computer on a network needs to have a different
hostid, the number of possible hostids determines the maximum number of computers that can
be on a network.
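The two bitwise operations described above can be sketched as follows. This is an illustration only: it uses the sample address 129.192.27.4 from the text together with the sample mask 255.255.255.0, and the helper names to_int, to_dotted, and split_address are hypothetical.

```python
def to_int(dotted):
    """Convert a dotted-decimal IPv4 address to a 32-bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def to_dotted(value):
    """Convert a 32-bit integer back to dotted-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def split_address(address, mask):
    """Return (netid, hostid) for the given address and subnet mask."""
    addr, m = to_int(address), to_int(mask)
    netid = addr & m                   # AND with the subnet mask
    hostid = addr & (~m & 0xFFFFFFFF)  # AND with the 32-bit complement of the mask
    return to_dotted(netid), to_dotted(hostid)


print(format(to_int("129.192.27.4"), "032b"))
# 10000001110000000001101100000100
print(split_address("129.192.27.4", "255.255.255.0"))
# ('129.192.27.0', '0.0.0.4')
```

The extra `& 0xFFFFFFFF` is needed because Python integers are not limited to 32 bits, so `~m` alone would be negative.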
Assume that the subnet mask on my computer is currently 255.255.255.0 and my IP address
is 209.140.209.27.
(a) What are the netid and hostid of my computer?
(b) How many computers can be on the network that my computer is on?
(c) In 2010, Hope College’s network was not split into subnetworks like it is currently, so all of
the computers were on a single network that had a subnet mask of 255.255.240.0. How many
computers could be on Hope’s network in 2010?
Problem 9.27. Prove that $\sum_{k=0}^{n} \binom{n}{k} = 2^n$ by counting the number of binary strings of length n
in two ways.
Problem 9.28. In March of every year people fill out brackets for the NCAA Basketball Tourna-
ment. They pick the winner of each game in each round. We will assume the tournament starts
with 64 teams (it has become a little more complicated than this recently). The first round of the
tournament consists of 32 games, the second 16 games, the third 8, the fourth 4, the fifth 2, and
the final 1. So the total number of games is 32 + 16 + 8 + 4 + 2 + 1 = 63. You can arrive at the
number of games in a different way. Every game has a loser who is out of the tournament. Since
only 1 of the 64 teams remains at the end, there must be 63 losers, so there must be 63 games.
Notice that we can also write 1 + 2 + 4 + 8 + 16 + 32 = 63 as $\sum_{k=0}^{5} 2^k = 2^6 - 1$.
(a) Use a combinatorial proof to show that for any n > 0, $\sum_{k=0}^{n} 2^k = 2^{n+1} - 1$. (That is, define an
appropriate set and count the cardinality of the set in two ways to obtain the identity.)
(b) When you fill out a bracket you are picking who you think the winner will be of each game.
How many different ways are there to fill out a bracket? (Hint: If you think about this in the
proper way, this is pretty easy.)
(c) If everyone on the planet (7,000,000,000) filled out a bracket, is it guaranteed that two people
will have the same bracket? Explain.
(d) Assume that everyone on the planet fills out k different brackets and that no brackets are
repeated (either by an individual or by anybody else). How large would k have to be before it
is guaranteed that somebody has a bracket that correctly predicts the winner of every game?
(e) Assume every pair of people on the planet gets together to fill out a bracket (so everyone has
6,999,999,999 brackets, one with every other person on the planet). What is the smallest and
largest number of possible repeated brackets?
Problem 9.29. Mega Millions has 56 white balls, numbered 1 through 56, and 46 red balls,
numbered 1 through 46. To play you pick 5 numbers between 1 and 56 (corresponding to white
balls) and 1 number between 1 and 46 (corresponding to a red ball). Then 5 of the 56 balls and 1
of the 46 balls are drawn randomly (or so they would have us believe). You win if your numbers
match all 6 balls.
(a) How many different draws are possible?
(b) If everyone in the U.S.A. bought a ticket (about 314,000,000), is it guaranteed that two people
have the same numbers? Three people?
(c) If everyone in the U.S.A. bought a ticket, what is the maximum number of people that are
guaranteed to share the jackpot?
(d) Which is more likely: Winning Mega Millions or picking every winner in the NCAA Basketball
Tournament (see previous question)? How many more times likely is one than the other?
(e) I purchased a ticket last week and was surprised when none of my six numbers matched.
Should I have been surprised? What are the chances that a randomly selected ticket will
match none of the numbers?
(f) (hard) What is the largest value of k such that you are more likely to pick at least k winners
in the NCAA Basketball Tournament than you are to win Mega Millions?
Problem 9.30. You get a new job and your boss gives you 2 choices for your salary. You can
either make $100 per day or you can start at $.01 on the first day and have your salary doubled
every day. You know that you will work for k days. For what values of k should you take the first
offer and for which should you take the second offer? Explain.
Problem 9.31. The 300-level courses in the CS department are split into three groups: Founda-
tions (361, 385), Applications (321, 342, 392), and Systems (335, 354, 376). In order to get a BS
in computer science at Hope you need to take at least one course from each group.
(a) How many different ways are there of satisfying this requirement by taking exactly 3 courses?
(b) If you take four 300-level courses, how many different possibilities do you have that satisfy
the requirements?
(c) How many total ways are there to take 300-level courses that satisfy the requirements?
(d) What is the smallest k such that no matter which k 300-level courses you choose, it is guar-
anteed that you will satisfy the requirement?
Problem 9.32. I am implementing a data structure that consists of k lists. I want to store a
total of n objects in this data structure, with each item being stored on one of the lists. All of
the lists will have the same capacity (e.g. perhaps each list can hold up to 10 elements).
Write a method minimumCapacity(int n, int k) that computes the minimum capacity each of
the k lists must have to accommodate n objects. In other words, if the capacity is less than this,
then there is no way the objects can all be stored on the lists. You may assume integer arithmetic
truncates (essentially giving you the floor function), but that there is no ceiling function available.
Problem 9.33. Write a method choose(int n, int k) (in a Java-like language) that computes
$\binom{n}{k}$. Your implementation should be as efficient as possible. Make sure to give and prove the
efficiency of your algorithm.
Chapter 10: Graph Theory
In this chapter we will provide a very brief and very selective introduction to graphs. Graph
theory is a very wide field and there are many thick textbooks on the subject. The main point of
this chapter is to provide you with the basic notion of what a graph is, some of the terminology
used, a few applications, and a few interesting and/or important results.
Example 10.2. Here is an example of a graph with the set of vertices and edges listed on
the right. Vertices are usually represented by means of dots on the plane, and the edges by
means of lines connecting these dots.
[Figure: the graph drawn with its five vertices as dots and its six edges as lines.]
V = {A, B, C, D, E}
E = {(A,D), (A,E), (B,D), (B,E), (C,D), (C,E)}
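In a program, a graph like this one is often stored as a collection of vertices together with, for each vertex, the set of vertices it shares an edge with. The following is one possible sketch of that idea using the vertex and edge sets listed above; the variable names are illustrative, and other representations are equally reasonable.

```python
vertices = {"A", "B", "C", "D", "E"}
edges = [("A", "D"), ("A", "E"), ("B", "D"), ("B", "E"), ("C", "D"), ("C", "E")]

# Adjacency sets: each undirected edge is recorded in both directions.
adjacent = {v: set() for v in vertices}
for u, v in edges:
    adjacent[u].add(v)
    adjacent[v].add(u)

print(sorted(adjacent["D"]))  # ['A', 'B', 'C']
print(sorted(adjacent["A"]))  # ['D', 'E']
```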
Example 10.3. Sometimes we just care about the visual representation of a graph. Here are
three examples.
There are several variations of graphs. We will provide definitions and examples of the most
common ones.
As you would probably suspect, the only difference between simple graphs and directed graphs
is that the edges in directed graphs have a direction. We should note that simple graphs are
sometimes called undirected graphs to make it clear that the graphs are not directed.
Example 10.6. In a simple graph, {u, v} and {v, u} are just two different ways of talking
about the same edge–the edge between u and v. In a directed graph, (u, v) is the edge from
u to v and (v, u) is the edge from v to u. These are not the same, and they may or may not
both be present.
• V , a set of vertices,
Two edges e1 and e2 with f (e1 ) = f (e2 ) are called multiple edges.
Definition 10.11. A weighted graph is a graph (or digraph) with the additional property
that each edge e has associated with it a real number w(e) called its weight.
A weighted digraph is often called a network.
Example 10.12. Here are two examples of weighted graphs and one weighted directed graph.
[Figure: two weighted graphs and one weighted directed graph, with each weight written next to its edge]
As we have seen, there are several ways of categorizing graphs:
• Directed or undirected edges.
Note: When writing graph algorithms, it is important to know what characteristics the graphs
have. For instance, if a graph might have loops, the algorithm should be able to handle it.
Some algorithms do not work if a graph has loops and/or multiple edges, and some only apply
to directed (or undirected) graphs.
Graph Terminology 401
Definition 10.14. Let u and v be vertices and e = {u, v} be an edge in undirected graph G.
• The degree of a vertex, denoted deg(v), is the number of edges incident with it.
[Figure: the graphs G1, G2, and G3]
In graph G1 , we can say:
• w is adjacent to x.
• w and x are the endpoints of the edge (w, x).
• (w, x) is incident with both w and x.
• (w, x) connects vertices w and x.
The following table gives the degree of each of the vertices in the graphs above.
G1          G2          G3
deg(u)=3    deg(u)=2    deg(u)=2
deg(v)=5    deg(v)=3    deg(v)=4
deg(w)=3    deg(w)=2    deg(w)=3
deg(x)=2    deg(x)=4    deg(x)=2
deg(y)=2    deg(y)=3    deg(y)=3
deg(z)=3                deg(z)=2
H1 H2 H3
Notice that H2 is a subgraph of H1 and that H3 is a subgraph of both H1 and H2 .
You can think of a walk as follows: Put your pencil down on a vertex and trace around edges
however you like until you reach some destination vertex. You are allowed to repeat edges and
vertices as often as you like–just like you may repeat sidewalks and paths when you go for a walk
(thus the name).
Definition 10.20. A u − v path is a walk that does not repeat any vertex.
It should be relatively easy to see that paths cannot repeat an edge (because to repeat an
edge you have to repeat a vertex).
Example 10.21. In the first graph, the trail abecde is indicated with the dark lines. It is not
a path since it repeats the vertex e. The second and third graphs show examples of paths.
[Figure: graphs on the vertices a, b, c, d, e with the walks abecde, bedc, bec, and cdec highlighted]
⋆Exercise 10.24. Find a cycle of length 4 and a cycle of length 5 in the graph from
Example 10.23. Is there a cycle of length 6? Explain why or why not.
Answer
Definition 10.25. A graph is called connected if there is a path between every pair of
distinct vertices.
A connected component of a graph is a maximal connected subgraph.
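Definition 10.25 can be checked mechanically. The following is a minimal sketch, not something from the text: it assumes the vertices are numbered 0 through n − 1 and that the graph is given as a list of undirected edges, and the names ComponentCounter, countComponents, and isConnected are made up for illustration. It explores the graph outward from each vertex that has not been reached yet, and each such fresh start corresponds to one connected component.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper (hypothetical names): counts connected components.
final class ComponentCounter {
    // Vertices are assumed to be 0..n-1; each edge is a pair {u, v}.
    static int countComponents(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        boolean[] seen = new boolean[n];
        int components = 0;
        for (int s = 0; s < n; s++) {
            if (seen[s]) {
                continue;
            }
            components++;                 // s starts a component not seen before
            ArrayDeque<Integer> queue = new ArrayDeque<>();
            queue.add(s);
            seen[s] = true;
            while (!queue.isEmpty()) {
                int u = queue.remove();
                for (int v : adj.get(u)) {
                    if (!seen[v]) {
                        seen[v] = true;
                        queue.add(v);
                    }
                }
            }
        }
        return components;
    }

    // Connected means exactly one component.
    static boolean isConnected(int n, int[][] edges) {
        return countComponents(n, edges) == 1;
    }

    public static void main(String[] args) {
        int[][] fiveCycle = {{0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 0}};
        int[][] twoPieces = {{0, 1}, {1, 2}, {2, 0}, {3, 4}};
        System.out.println(isConnected(5, fiveCycle));
        System.out.println(countComponents(5, twoPieces));
    }
}
```

The second graph in main is a triangle together with a separate edge, so it is not connected.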
Example 10.26. Below are two graphs, each drawn inside dashed boxes. The graph on the
left is connected. The one on the right is not connected. It has two connected components.
[Figure: two graphs on the vertices a, b, c, d, e, each drawn inside a dashed box]
⋆Exercise 10.27. Draw a graph that has two connected components, one that is a cycle of
length 4 and one that is a cycle of length 3.
Definition 10.28. A tree (or unrooted tree) is a connected acyclic graph. That is, a
graph with no cycles.
A forest is a collection of trees.
Example 10.29. Here are four trees. If they were all part of the same graph, we could
consider the graph a forest.
[Figure: four trees]
⋆Exercise 10.30. Draw a tree that has 5 vertices, one vertex with degree 4 and the others
with degree 1.
Note: These trees are not to be confused with rooted trees (e.g. binary trees). When com-
puter scientists use the term tree, they usually mean rooted trees, not the trees we are dis-
cussing here. When you see/hear the term ‘tree,’ it is important to be clear about which one
the writer/speaker has in mind.
Definition 10.32. A spanning tree of G is a subgraph which is a tree and contains all of
the vertices of G.
Example 10.33. Below is a graph (on the left) and one of several possible spanning trees
(on the right).
G spanning tree of G
• u is said to be adjacent to v.
• The in-degree of u, denoted by deg − (u), is the number of edges in G which have u as
their terminal vertex.
• The out-degree of u, denoted by deg + (u), is the number of edges in G which have u
as their initial vertex.
[Figure: the directed graphs G4, G5, and G6]
Consider the edge (w, x) in G4 .
• w is the initial vertex and x is the terminal vertex of the edge (w, x).
This table gives the in-degree and out-degree for the vertices in graphs G4 , G5 , and G6 .
G4                        G5                        G6
deg−(u)=2  deg+(u)=4      deg−(u)=1  deg+(u)=0      deg−(u)=1  deg+(u)=1
deg−(v)=2  deg+(v)=2      deg−(v)=1  deg+(v)=2      deg−(v)=2  deg+(v)=2
deg−(w)=1  deg+(w)=1      deg−(w)=1  deg+(w)=1      deg−(w)=2  deg+(w)=2
deg−(x)=2  deg+(x)=3      deg−(x)=1  deg+(x)=1      deg−(x)=1  deg+(x)=1
deg−(y)=3  deg+(y)=0      deg−(y)=2  deg+(y)=2      deg−(y)=2  deg+(y)=2
Vertex z appears in only one of the three graphs; there deg−(z)=1 and deg+(z)=1.
Some Special Graphs 407
Definition 10.36. The complete graph with n vertices Kn is the graph where every pair
of vertices is adjacent. Thus Kn has $\binom{n}{2} = n(n-1)/2$ edges.
K2 K3 K4 K5
Definition 10.38. Cn denotes a cycle of length n. It is a graph with n edges, and n vertices
$v_1, \ldots, v_n$, where $v_i$ is adjacent to $v_{i+1}$ for $i = 1, \ldots, n-1$, and $v_1$ is adjacent to $v_n$.
C3 C4 C5
We won’t provide an example of the paths because they are pretty easy to visualize. For
instance, P3 is simply C4 with one edge removed.
Definition 10.41. Qn denotes the n-dimensional cube (or hypercube). One way to
define Qn is that it is a simple graph with $2^n$ vertices, which we label with n-tuples of 0’s
and 1’s. Vertices of Qn are connected by an edge if and only if they differ in exactly one
coordinate. Observe that Qn has $n2^{n-1}$ edges.
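The definition of Qn translates almost directly into code. Here is a minimal sketch under some assumptions that are not from the text: each n-tuple of 0's and 1's is stored as the bits of an int, so the vertices are 0 through 2^n − 1, and the class name Hypercube and method name edges are illustrative only. Flipping one bit with exclusive-or gives a vertex that differs in exactly one coordinate.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch (hypothetical names): lists the edges of Q_n.
final class Hypercube {
    // Each vertex is an int whose low n bits are its n-tuple label.
    // Intended for small n (at most 30, so that 1 << n fits in an int).
    static List<int[]> edges(int n) {
        List<int[]> result = new ArrayList<>();
        int vertexCount = 1 << n;                 // 2^n vertices
        for (int u = 0; u < vertexCount; u++) {
            for (int bit = 0; bit < n; bit++) {
                int v = u ^ (1 << bit);           // differs from u in exactly one coordinate
                if (u < v) {
                    result.add(new int[] {u, v}); // record each undirected edge once
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 4; n++) {
            int formula = n * (1 << (n - 1));     // n * 2^(n-1)
            System.out.println("Q" + n + ": " + edges(n).size() + " edges, formula gives " + formula);
        }
    }
}
```

Every vertex has n neighbors and every edge is found from both of its endpoints, which is one way to see where the count $n2^{n-1}$ comes from.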
Example 10.42. Here are Q2 and Q3 , with vertices labeled as mentioned in the definition.
[Figure: Q2 with vertices labeled 00, 01, 10, 11, and Q3 with vertices labeled 000 through 111]
Notice that in Q2 , the vertex labeled 11 is adjacent to the vertices labeled 10 and 01
since each of these differ in one bit. Similarly, the vertex labeled 101 in Q3 is adjacent to the
vertices labeled 001, 111, and 100 for the same reason. Next is Q4 , also labeled according to
the definition.
[Figure: Q4 with vertices labeled 0000 through 1111]
It should not be too difficult to see that Q1 is the same as P1 which is the same as K2 .
Definition 10.43. A simple graph G is called bipartite if the vertex set V can be partitioned
into two disjoint nonempty sets V1 and V2 such that every edge connects a vertex in V1 to a
vertex in V2 .
Put another way, no vertices in V1 are connected to each other, and no vertices in V2 are
connected to each other.
Note that there may be different ways of assigning the vertices to V1 and V2 . That is not
important. As long as there is at least one way to do so such that all edges go between V1 and
V2 , then a graph is bipartite.
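One way to search for such a partition is to try to assign each vertex to side 0 or side 1 so that every edge has its endpoints on opposite sides. The following is a minimal sketch of that idea and not a method given in the text; it assumes vertices numbered 0 through n − 1 and an edge list as input, and the names BipartiteCheck and isBipartite are illustrative. Within a connected component, the side of one vertex forces the side of every other vertex, so the search either succeeds or runs into an edge with both endpoints on the same side.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch (hypothetical names): tests whether a simple graph is bipartite.
final class BipartiteCheck {
    static boolean isBipartite(int n, int[][] edges) {
        if (n < 2) {
            return false;                 // V1 and V2 must both be nonempty
        }
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        int[] side = new int[n];
        Arrays.fill(side, -1);            // -1 means "not assigned yet"
        for (int s = 0; s < n; s++) {
            if (side[s] != -1) {
                continue;
            }
            side[s] = 0;
            ArrayDeque<Integer> queue = new ArrayDeque<>();
            queue.add(s);
            while (!queue.isEmpty()) {
                int u = queue.remove();
                for (int v : adj.get(u)) {
                    if (side[v] == -1) {
                        side[v] = 1 - side[u];   // neighbors go on the opposite side
                        queue.add(v);
                    } else if (side[v] == side[u]) {
                        return false;            // an edge inside one side
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] square = {{0, 1}, {1, 2}, {2, 3}, {3, 0}};
        int[][] triangle = {{0, 1}, {1, 2}, {2, 0}};
        System.out.println(isBipartite(4, square));
        System.out.println(isBipartite(3, triangle));
    }
}
```

The sketch only reports whether a partition exists. As noted above, when one exists it need not be the only one.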
Notice that although these are drawn to make it clear what the partition is (i.e. V1 is the
top row of vertices and V2 is the bottom row), a graph does not have to be drawn as such in
order to be bipartite. They are often drawn this way out of convenience. For instance, the
hypercubes are all bipartite even though they are not drawn this way.
Definition 10.45. Km,n denotes the complete bipartite graph with m + n vertices. That
is, it is the graph with m + n vertices that is partitioned into two sets, one of size n and the
other of size m such that every possible edge between the two sets is in the graph.
Example 10.46. The first four graphs from Example 10.44 are complete bipartite graphs.
The first is K1,1 , the second is K1,2 (or K2,1 ), the third is K2,2 , and the fourth is K3,2 (or
K2,3 ).
The proof in the previous theorem is an example of a combinatorial proof. It is a neat technique
where you prove a formula by counting the number of objects in a set in two different ways.
[Figure: the graphs G1, G2, and G3]
A quick tabulation of the degrees of the vertices and the number of edges reveals the
following:
Graph                        G1    G2    G3
|E|                           9     7     8
$\sum_{v \in V} \deg(v)$     18    14    16
Undirected graphs have an interesting property that is really easy to prove using Theo-
rem 10.47.
Handshaking Lemma 411
Corollary 10.49. Every graph has an even number of vertices of odd degree.
Proof: The sum of an odd number of odd numbers is odd. Since the sum of the
degrees of the vertices in a simple graph is always even, one cannot have an odd
number of odd degree vertices.
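Both facts are easy to check on a concrete graph. Below is a minimal sketch, not taken from the text, that computes the degrees from an edge list, adds them up, and counts the vertices of odd degree. The names DegreeSum, degrees, sum, and oddCount are illustrative. The graph in main is assumed to have the five vertices A through E (stored as 0 through 4) and the eight edges AB, AE, AD, BE, BC, CD, CE, and DE.

```java
// Illustrative sketch (hypothetical names): degree sum and odd-degree count.
final class DegreeSum {
    // Vertices are assumed to be 0..n-1; each edge {u, v} adds one to deg(u) and one to deg(v).
    static int[] degrees(int n, int[][] edges) {
        int[] deg = new int[n];
        for (int[] e : edges) {
            deg[e[0]]++;
            deg[e[1]]++;
        }
        return deg;
    }

    static int sum(int[] deg) {
        int total = 0;
        for (int d : deg) {
            total += d;
        }
        return total;
    }

    static int oddCount(int[] deg) {
        int count = 0;
        for (int d : deg) {
            if (d % 2 == 1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // A=0, B=1, C=2, D=3, E=4
        int[][] edges = {{0, 1}, {0, 4}, {0, 3}, {1, 4}, {1, 2}, {2, 3}, {2, 4}, {3, 4}};
        int[] deg = degrees(5, edges);
        System.out.println("sum of degrees = " + sum(deg) + ", twice the edges = " + 2 * edges.length);
        System.out.println("vertices of odd degree = " + oddCount(deg));
    }
}
```

Here the degrees are 3, 3, 3, 3, and 4, so the sum is 16 and four vertices have odd degree.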
The situation is slightly different, but not too surprising, for directed graphs.
We won’t provide a proof of this theorem (it’s almost obvious), but you should verify it for the
graphs in Example 10.35 by adding up the degrees in each column and comparing the appropriate
sums.
Definition 10.51. The adjacency list representation of a graph maintains, for each vertex,
a list of all of the vertices adjacent to that vertex. This can be implemented in many ways,
but often an array of linked lists is used.
Example 10.52. A drawing of C5 is given below on the left. An adjacency list representation
is given below on the right.
[Figure: a drawing of C5 on the vertices A, B, C, D, E]
A → E → B
B → A → C
C → B → D
D → C → E
E → D → A
Example 10.53. A drawing of a directed cycle of length 5 is given below on the left. An
adjacency list representation is given next to it.
[Figure: a drawing of the directed cycle on the vertices A, B, C, D, E]
A → B
B → C
C → D
D → E
E → A
Notice that this is a lot like the previous example except that each list only
has one element on it. That is because (A, B) is an edge (for instance),
but (B, A) is not an edge. So B is on A’s list, but A is not on B’s list.
Example 10.54. Here is another example of a graph on the left with the adjacency list
representation on the right.
[Figure: a drawing of the graph on the vertices A, B, C, D, E]
A → B → E → D
B → A → E → C
C → B → D → E
D → C → A → E
E → A → B → C → D
Note that the order the vertices are listed does not matter.
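To make the adjacency list representation concrete, here is a minimal Java sketch of one possible implementation. It is only an illustration, not the implementation the text has in mind: it uses a list of linked lists in place of an array of linked lists, assumes the mapping A = 0, B = 1, C = 2, D = 3, E = 4, and the class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Illustrative sketch (hypothetical names): an undirected graph stored as adjacency lists.
final class AdjacencyListGraph {
    private final List<LinkedList<Integer>> lists = new ArrayList<>();

    // Creates a graph with vertices 0..n-1 and no edges.
    AdjacencyListGraph(int n) {
        for (int i = 0; i < n; i++) {
            lists.add(new LinkedList<>());
        }
    }

    // An undirected edge {u, v} puts v on u's list and u on v's list.
    void addEdge(int u, int v) {
        lists.get(u).add(v);
        lists.get(v).add(u);
    }

    List<Integer> neighbors(int u) {
        return lists.get(u);
    }

    // Total number of linked-list nodes over all of the lists.
    int totalListNodes() {
        int total = 0;
        for (LinkedList<Integer> list : lists) {
            total += list.size();
        }
        return total;
    }

    // The graph of Example 10.54 with A=0, B=1, C=2, D=3, E=4.
    static AdjacencyListGraph example() {
        AdjacencyListGraph g = new AdjacencyListGraph(5);
        int[][] edges = {{0, 1}, {0, 4}, {0, 3}, {1, 4}, {1, 2}, {2, 3}, {2, 4}, {3, 4}};
        for (int[] e : edges) {
            g.addEdge(e[0], e[1]);
        }
        return g;
    }

    public static void main(String[] args) {
        AdjacencyListGraph g = example();
        for (int u = 0; u < 5; u++) {
            System.out.println((char) ('A' + u) + " -> " + g.neighbors(u));
        }
        System.out.println("list nodes: " + g.totalListNodes());
    }
}
```

The lists this produces may come out in a different order than the ones in Example 10.54. As noted above, that does not matter.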
Graph Representation 413
⋆Exercise 10.55. Give the adjacency list representation for K3,3 as drawn below.
[Figure: K3,3 drawn with the vertices A, B, C in the top row and D, E, F in the bottom row]
A →
B →
C →
D →
E →
F →
⋆Exercise 10.56. Give the adjacency list representation for the directed graph similar to
K3,3 drawn below.
[Figure: a directed graph drawn with the vertices A, B, C in the top row and D, E, F in the bottom row]
A →
B →
C →
D →
E →
F →
The graph in Example 10.54 has 5 vertices and 8 edges (so n = 5 and m = 8). The adjacency
list uses an array of size 5 and there are 5 linked lists that contain a total of 3 + 3 + 3 + 3 + 4 =
16 = 2 ∗ 8 nodes. Notice that this is twice the number of edges because each edge is stored twice
(because if (u, v) is an edge, u is stored on v’s list and v is stored on u’s list). For each node we
need to store the value and the next node, so the linked lists take up about 2(2 ∗ 8) = 4 ∗ 8 = 4m
memory. Since the array takes about 5 = n memory, the memory requirement for an adjacency
list representation of the graph is approximately n + 4m = Θ(n + m).
Notice that the discussion in the previous paragraph generalizes to all graphs. That is, the
space requirement for the adjacency list representation of a graph is approximately n + 4m =
Θ(n + m).
Hopefully it is not too difficult to see that for directed graphs, the amount of memory required
is about n + 2m = Θ(n + m) because each edge is only stored once.
For weighted graphs, an additional field can be stored in each node for the weight of each
edge. So for undirected weighted graphs, the memory requirement goes up to about n + 6m, and
for directed weighted graphs it is about n + 3m. In both cases, it is still Θ(n + m).
The second method of storing a graph makes it so you can ask directly “Is (u, v) an edge?”
This is accomplished by storing a matrix whose rows and columns are indexed by the vertices.
We often assume that the vertices are numbered 0, 1, . . . , n−1 since that is how we typically index
matrices. In the next few examples we will continue with our examples with vertices labeled A,
B, etc. To make the interpretation of the matrices clear, we label the rows and columns. You can
also just think of a mapping of A to 0, B to 1, etc.
Example 10.58. A drawing of C5 is given below on the left, the adjacency list in the middle,
and the adjacency matrix on the right.
[Figure: a drawing of C5 on the vertices A, B, C, D, E]
Adjacency list:
A → E → B
B → A → C
C → B → D
D → C → E
E → D → A
Adjacency matrix:
  A B C D E
A 0 1 0 0 1
B 1 0 1 0 0
C 0 1 0 1 0
D 0 0 1 0 1
E 1 0 0 1 0
Example 10.59. A drawing of a directed cycle of length 5 is given below on the left. An
adjacency list representation is given in the middle and the adjacency matrix on the right.
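[Figure: a drawing of the directed cycle on the vertices A, B, C, D, E]
Adjacency list:
A → B
B → C
C → D
D → E
E → A
Adjacency matrix:
  A B C D E
A 0 1 0 0 0
B 0 0 1 0 0
C 0 0 0 1 0
D 0 0 0 0 1
E 1 0 0 0 0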
Example 10.60. Here is another example of a graph on the left, the adjacency list repre-
sentation on the center, and the adjacency matrix on the right.
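[Figure: a drawing of the graph on the vertices A, B, C, D, E]
Adjacency list:
A → B → E → D
B → A → E → C
C → B → D → E
D → C → A → E
E → A → B → C → D
Adjacency matrix:
  A B C D E
A 0 1 0 1 1
B 1 0 1 0 1
C 0 1 0 1 1
D 1 0 1 0 1
E 1 1 1 1 0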
From these examples, it should be relatively clear that the amount of space needed to store
an adjacency matrix for a graph with n vertices and m edges is about $n^2 = \Theta(n^2)$. Notice that it does not
depend on m, since a larger m just means more 1s and fewer 0s in the matrix.
If G is weighted, we can store the weights in the matrix instead of just 0 or 1. For non-adjacent
vertices, we store ∞, or MAX_INT (or −1 if only positive weights are valid). If done this way,
the space requirement remains $n^2 = \Theta(n^2)$. Alternatively, a second matrix can be used to store
the weights, doubling the space requirement, which is still $\Theta(n^2)$.
Notice that the amount of space required to store both directed and undirected graphs is the same
with the adjacency matrix.
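As a concrete illustration, here is a minimal Java sketch of an adjacency matrix, assuming the vertices are numbered 0 through n − 1 (so A is 0, B is 1, and so on). The class and method names are invented for this example and are not from the text. An undirected edge sets two entries of the matrix and a directed edge sets one, but the matrix itself is n by n either way.

```java
// Illustrative sketch (hypothetical names): a graph stored as an adjacency matrix.
final class AdjacencyMatrixGraph {
    private final int[][] matrix;

    // Creates a graph with vertices 0..n-1 and no edges (all entries 0).
    AdjacencyMatrixGraph(int n) {
        matrix = new int[n][n];
    }

    // Undirected edge {u, v}: both (u, v) and (v, u) are marked.
    void addEdge(int u, int v) {
        matrix[u][v] = 1;
        matrix[v][u] = 1;
    }

    // Directed edge (u, v): only the entry in row u, column v is marked.
    void addDirectedEdge(int u, int v) {
        matrix[u][v] = 1;
    }

    // Answers "Is (u, v) an edge?" by looking at a single entry.
    boolean isEdge(int u, int v) {
        return matrix[u][v] == 1;
    }

    // Row u of the matrix, with entries separated by spaces.
    String row(int u) {
        StringBuilder sb = new StringBuilder();
        for (int v = 0; v < matrix.length; v++) {
            if (v > 0) {
                sb.append(' ');
            }
            sb.append(matrix[u][v]);
        }
        return sb.toString();
    }

    // A cycle 0, 1, ..., n-1, 0, either undirected or directed.
    static AdjacencyMatrixGraph cycle(int n, boolean directed) {
        AdjacencyMatrixGraph g = new AdjacencyMatrixGraph(n);
        for (int u = 0; u < n; u++) {
            int v = (u + 1) % n;
            if (directed) {
                g.addDirectedEdge(u, v);
            } else {
                g.addEdge(u, v);
            }
        }
        return g;
    }

    public static void main(String[] args) {
        AdjacencyMatrixGraph c5 = cycle(5, false);
        for (int u = 0; u < 5; u++) {
            System.out.println((char) ('A' + u) + " " + c5.row(u));
        }
    }
}
```

With this numbering, cycle(5, false) reproduces the rows of the matrix in Example 10.58 and cycle(5, true) reproduces those in Example 10.59.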
⋆Exercise 10.61. Give the adjacency matrix representation for K3,3 as drawn below.
[Figure: K3,3 drawn with the vertices A, B, C in the top row and D, E, F in the bottom row]
⋆Exercise 10.62. Give the adjacency matrix representation for the directed graph similar
to K3,3 drawn below.
[Figure: a directed graph drawn with the vertices A, B, C in the top row and D, E, F in the bottom row]
Obviously, how much space is required to store a graph is of importance, but so is how much
time is required to do basic operations on a graph. For instance, the most common things one
might want to do on a graph are determine whether or not two vertices are adjacent and iterate
over the edges that are incident with a vertex (put another way, iterate over all of the neighbors of
a vertex). For a weighted graph, one would probably ask the weight of an edge somewhat often.
There are certainly other important operations one might want to perform on a graph. Since you
have all of the tools you need to answer such questions, we will ask you to explore them at the
end of the chapter.
So which representation is better? We will also let you think about that at the end of the
chapter, but hopefully it is somewhat clear that answering that question requires you to consider
both time and space requirements.
• Connectivity: Is there a way to get between any two vertices in the graph?
• Shortest Path: What is the shortest path from A to B? (weighted and unweighted ver-
sions)
• Longest Path: What is the longest path from A to B? (weighted and unweighted versions)
• Minimum Spanning Tree: What is the “most efficient” way to connect the vertices
(weighted graphs)?
• Traveling Salesman: What is the shortest route that visits every vertex and returns to
the starting vertex? (weighted graphs)
Knowing what graph problems have been studied and what is known about each is very
important. Many problems can be modeled using graphs, and once a problem has been mapped
to a particular graph problem, it can be helpful to know the best way to solve it.
We finish the chapter by giving several examples of problems whose solutions become simpler
when using a graph-theoretic model as well as develop some new graph terminology. It is impor-
tant to mention that whole books are written just about graph theory, and even they have to pick
a small subset of the topic. Thus, what is presented in the remainder of this chapter should not
be interpreted in any way to be the most important topics in graph theory. It is just a very small
selection of easy to understand topics that are related to interesting problems. Dozens–maybe
even hundreds–of other topics could have been chosen. It should be noted that the author has
even resisted the urge to include one of his favorite graph topics, graph pebbling, even though it
is a somewhat interesting topic. Well, to him anyway.
Problem Solving with Graphs 417
Example 10.63. A wolf, a goat, and a cabbage are on one bank of a river. The ferryman
wants to take them across, but his boat is too small to accommodate more than one of them
at a time. He cannot leave the wolf and the goat together (the wolf will eat the goat), or
the cabbage and the goat (the goat will eat the cabbage) unless he is with them. Can the
ferryman still get all of them across the river?
Solution: Represent the position of a single item by 0 for one bank of the river
and 1 for the other bank. The position of the three items can now be given as an
ordered triplet, say (W , G, C). For example, (0, 0, 0) means that the three items
are on one bank of the river, (1, 0, 0) means that the wolf is on one bank of the
river while the goat and the cabbage are on the other bank. The object of the
puzzle is now seen to be to move from (0, 0, 0) to (1, 1, 1) by traversing certain
edges of Q3 while avoiding other edges. Note that Q3 is the correct set of edges
to consider since he can only move one of the three items at a time.
But there are some edges he cannot use. For instance, 000 → 100 is illegal since
it would mean he takes the wolf to the other side, leaving the goat and cabbage
together. Similarly, 000 → 001 is illegal. Thus, from 000, the only choice is to go
to 010. Continuing this analysis, it can be determined that the set of legal edges
is as in the following graph:
[Figure: the subgraph of Q3 formed by the legal edges]
Based on this, one answer is 000 → 010 → 011 → 001 → 101 → 111. This means
that the ferryman (i) takes the goat across, (ii) returns and takes the cabbage
over, (iii) brings back the goat, (iv) takes the wolf over, (v) returns and takes the
goat over.
Another answer is 000 → 010 → 110 → 100 → 101 → 111. This means that the
ferryman (i) takes the goat across, (ii) returns and takes the wolf over, (iii) brings
back the goat, (iv) takes the cabbage over, (v) returns and takes the goat over.
Go to https://xkcd.com/1134/ to see a funny, but incorrect, solution.
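A search of this kind can also be carried out by a program. The sketch below is a variation on the model above rather than the same model: it adds a fourth bit for the ferryman's own position, so that the legal moves can be worked out by the code instead of by hand. A state is safe when the goat is not left with the wolf or the cabbage on a bank without the ferryman. The class name RiverCrossing, the bit layout, and the method names are all assumptions made for this illustration. A breadth-first search then finds a sequence of crossings that is as short as possible.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Illustrative sketch (hypothetical names): the river puzzle with the ferryman as a 4th bit.
final class RiverCrossing {
    // Bit set means "on the far bank". Start is 0000 and the goal is 1111.
    static final int FERRYMAN = 8, WOLF = 4, GOAT = 2, CABBAGE = 1;

    // Unsafe when the goat shares a bank with the wolf or the cabbage and the ferryman is elsewhere.
    static boolean safe(int s) {
        boolean f = (s & FERRYMAN) != 0;
        boolean w = (s & WOLF) != 0;
        boolean g = (s & GOAT) != 0;
        boolean c = (s & CABBAGE) != 0;
        if (g == w && g != f) {
            return false;
        }
        if (g == c && g != f) {
            return false;
        }
        return true;
    }

    // Breadth-first search from 0000 to 1111; returns the states visited, in order.
    static List<Integer> shortestPlan() {
        int start = 0;
        int goal = FERRYMAN | WOLF | GOAT | CABBAGE;
        int[] parent = new int[16];
        Arrays.fill(parent, -1);
        parent[start] = start;
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            int s = queue.remove();
            if (s == goal) {
                break;
            }
            boolean ferrymanFar = (s & FERRYMAN) != 0;
            for (int item : new int[] {0, WOLF, GOAT, CABBAGE}) {   // 0 means he crosses alone
                if (item != 0 && ((s & item) != 0) != ferrymanFar) {
                    continue;             // the item is not on the ferryman's bank
                }
                int t = s ^ FERRYMAN ^ item;   // the ferryman (and the item) change banks
                if (safe(t) && parent[t] == -1) {
                    parent[t] = s;
                    queue.add(t);
                }
            }
        }
        LinkedList<Integer> path = new LinkedList<>();
        for (int s = goal; s != start; s = parent[s]) {
            path.addFirst(s);
        }
        path.addFirst(start);
        return path;
    }

    public static void main(String[] args) {
        for (int s : shortestPlan()) {
            // Print the last three bits only: the positions of wolf, goat, and cabbage.
            String bits = Integer.toBinaryString(8 | (s & 7)).substring(1);
            System.out.println(bits);
        }
    }
}
```

The plan found has eight states and therefore seven crossings, because the trips on which the ferryman returns alone are now counted as moves. Ignoring the ferryman's bit, the states it passes through should match one of the two answers given above.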
Example 10.64. Prove that amongst six people in a room there are at least three who know
one another, or at least three who do not know one another.
Solution: Consider an arbitrary person of this group (call him Peter). There
are five other people, and of these, either three of them know Peter or else, three
of them do not know Peter.
Let us assume three know Peter. If two of these three people know one another,
then we have a triangle of three people who know each other (Peter and these
two–see the graph below on the left, where the acquaintances are marked by solid
lines). If no two of these three people know one another, then we have three
people who do not know one another.
[Figure: two graphs containing Peter, one for each of these cases]
The argument for the case when three do not know Peter is similar and is left to
the reader.
Example 10.65. Mr. and Mrs. Landau invite four other married couples for dinner. Some
people shook hands with some others, and the following rules were noted: (i) a person did not
shake hands with himself, (ii) no one shook hands with his spouse, (iii) no one shook hands
more than once with the same person. After the introductions, Mr. Landau asks the nine
people how many hands they shook. Each of the nine people asked gives a different number.
How many hands did Mrs. Landau shake?
[Figures 10.1, 10.2, and 10.3 (Example 10.65): nine vertices labeled 0 through 8 arranged around Mr. Landau]
Definition 10.66. Recall that a trail is a walk where all the edges are distinct. An Eulerian
trail on a graph G is a trail that traverses every edge of G. A tour of G is a closed walk
that traverses each edge of G at least once. An Euler tour (or Euler cycle) on G is a tour
traversing each edge of G exactly once, that is, a closed Euler trail. A graph is Eulerian if
it contains an Euler tour.
It turns out there is a very easy way to determine whether or not a graph has an Euler tour.
Theorem 10.67. A nonempty connected graph is Eulerian if and only if it has no vertices
of odd degree.
Proof: Assume first that G is Eulerian, and let C be an Euler tour of G starting
and ending at vertex u. Each time a vertex v is encountered along C, two of the
edges incident to v are accounted for. Since C contains every edge of G, d(v) is
then even for all v ≠ u. Also, since C begins and ends in u, d(u) must also be
even.
Conversely, assume that G is a connected non-Eulerian graph with at least one
edge and no vertices of odd degree. Let W be the longest walk in G that traverses
every edge at most once:
$W = v_0, v_0v_1, v_1, v_1v_2, v_2, \ldots, v_{n-1}, v_{n-1}v_n, v_n.$
Example 10.68 (Königsberg Bridge Problem). The town of Königsberg (now called Kaliningrad)
was built on an island in the Pregel River. The island sat near where two branches
of the river join, and the borders of the town spread over to the banks of the river as well as
a nearby promontory. Between these four land masses, seven bridges had been erected. The
townsfolk used to amuse themselves by crossing over the bridges and asked whether it was
possible to find a trail starting and ending in the same location allowing one to traverse each
of the bridges exactly once. Figure 10.4 has a graph-theoretic model of the town, with the
seven edges of the graph representing the seven bridges. By Theorem 10.67, this graph is not
Eulerian so it is impossible to find a trail as the townsfolk asked.
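Theorem 10.67 turns the question into a simple degree check. Here is a minimal sketch of that check, with the assumptions stated up front: the graph is taken to be nonempty and connected (the code does not verify this), vertices are numbered 0 through n − 1, and the names EulerCheck and allDegreesEven are illustrative only. Since the edges of the Königsberg graph are only shown in the figure, the sketch uses C5 and K4 as its examples instead.

```java
// Illustrative sketch (hypothetical names): the degree condition of Theorem 10.67.
final class EulerCheck {
    // Returns true when no vertex has odd degree.
    // By the theorem this decides whether the graph is Eulerian,
    // but only if the graph is nonempty and connected, which is NOT checked here.
    static boolean allDegreesEven(int n, int[][] edges) {
        int[] deg = new int[n];
        for (int[] e : edges) {           // multiple edges are simply counted again
            deg[e[0]]++;
            deg[e[1]]++;
        }
        for (int d : deg) {
            if (d % 2 != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] c5 = {{0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 0}};
        int[][] k4 = {{0, 1}, {0, 2}, {0, 3}, {1, 2}, {1, 3}, {2, 3}};
        System.out.println("C5: " + allDegreesEven(5, c5));
        System.out.println("K4: " + allDegreesEven(4, k4));
    }
}
```

Every vertex of C5 has degree 2, so C5 is Eulerian. Every vertex of K4 has degree 3, so K4 is not.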
[Figure 10.4: Model of the bridges in Königsberg from Example 10.68, a graph on the vertices A, B, C, and D.]
Definition 10.69. A Hamiltonian cycle in a graph is a cycle passing through every vertex.
G is Hamiltonian if it contains a Hamiltonian cycle.
Unlike Theorem 10.67, there is no simple characterization of all graphs with a Hamiltonian cycle.
In fact, the problem of determining whether or not a graph contains a Hamiltonian cycle is one
of the most famous NP-Complete problems. The details are beyond the scope of this book, but
briefly (and oversimplifying a bit), NP-Complete is a class of problems that are all equivalent in
the sense that if any of them can be solved in polynomial time, then they can all be solved in
polynomial time. Further, nobody currently knows whether or not any of them can be solved in
polynomial time. This leads to the so-called P versus NP problem, one of the most important
open problems in theoretical computer science. (Again, the details of precisely what this means
are beyond the scope of this book.)
Coming back to the Hamiltonian cycle problem, we do have the following one-way result.
$v_1 v_2 \cdots v_i v_n v_{n-1} \cdots v_{i+1} v_1,$
[Figure: the vertices $v_1, v_2, \ldots, v_i, v_{i+1}, \ldots, v_{n-1}, v_n$]
But then
$d(a) + d(b) = |S| + |T| = |S \cup T| + |S \cap T| < n.$
But since we are assuming that $d(a) \ge \frac{n}{2}$ and $d(b) \ge \frac{n}{2}$, we have arrived at a
contradiction.
Definition 10.71. A graph is planar if it can be drawn in a plane with no intersecting edges.
Such a drawing is called a planar embedding of the graph.
Example 10.72. Although the usual way K4 is drawn has two edges intersect, it is planar
as shown in figure 10.5. It is important to understand that being planar means you can draw
it with no intersecting edges, not that every way of drawing it has no edges intersecting.
[Figure 10.5: A planar embedding of K4 on the vertices A, B, C, D, with its faces labeled 1 through 4.]
⋆Exercise 10.73. Draw a planar embedding of K4 that does not have curved edges.
Definition 10.74. A face of a planar graph is a region bounded by the edges of the graph.
Example 10.75. K4 has 4 faces, labeled 1 through 4 in Figure 10.5. Face 1, which extends
indefinitely, is called the outside face.
Here are a few results about planar graphs. These theorems use v and e instead of n and m
because although computer scientists often use n and m, graph theorists seem to prefer v and e.
And you should get used to the fact that not everybody uses the same notation, so it’s good for
you to see different letters used.
Theorem 10.76 (Euler’s Formula). For every drawing of a connected planar graph with v
vertices, e edges, and f faces the following formula holds:
v − e + f = 2.
Proof: The proof is by induction on e. Let P (e) be the proposition that v−e+f =
2 for every drawing of a connected planar graph G with e edges. If e = 0, then since
G is connected we must have v = 1 and hence f = 1, since there is only the outside face.
Therefore, v − e + f = 1 − 0 + 1 = 2, establishing P (0).
Assume now P (e) is true, and consider a connected graph G with e + 1 edges.
Either
➊ G has no cycles. Then there is only the outside face, and so f = 1. Since
there are e + 1 edges and G is connected, we must have v = e + 2. This gives
(e + 2) − (e + 1) + 1 = 2 − 1 + 1 = 2, establishing P (e + 1).
➋ or G has at least one cycle. Consider a spanning tree of G and an edge uv
in the cycle, but not in the tree. Such an edge is guaranteed by the fact that
a tree has no cycles. Deleting uv merges the two faces on either side of the
edge and leaves a graph G′ with only e edges, v vertices, and f faces. G′
is connected since there is a path between every pair of vertices within the
spanning tree. So v − e + f = 2 by the induction assumption P (e). Since G has
v vertices, e + 1 edges, and f + 1 faces, we get (v) − (e + 1) + (f + 1) = v − e + f = 2,
establishing P (e + 1).
Theorem 10.77. (a) Every simple planar graph with v ≥ 3 vertices has e ≤ 3v − 6 edges.
(b) Every simple planar graph with v ≥ 3 vertices and which does not have C3 as a subgraph
has e ≤ 2v − 4 edges.
Proof: If v = 3, both statements are plainly true so assume that G is a maximal
planar graph with v ≥ 4. We may also assume that G is connected, otherwise, we
may add an edge to G. Since G is simple, every face has at least 3 edges in its
boundary. If there are f faces, let Fk denote the number of edges on the k-th face,
for 1 ≤ k ≤ f . We then have
$F_1 + F_2 + \cdots + F_f \ge 3f.$
Also, every edge lies in the boundary of at most two faces. Hence if Ej denotes
the number of faces that the j-th edge has, then
2e ≥ E1 + E2 + · · · + Ee .
To be clear, Theorem 10.77 part (a) implies that a graph with at least 3 vertices and more
than 3v − 6 edges cannot be planar (the contrapositive of the statement). Similarly for part (b).
Example 10.78. K5 is not planar by Theorem 10.77 since K5 has $\binom{5}{2} = 10$ edges and
10 > 9 = 3(5) − 6.
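The two bounds in Theorem 10.77 are simple enough to wrap in a pair of helper methods. The following is a minimal sketch with names invented for this illustration (PlanarityBounds, tooManyEdges, tooManyEdgesNoTriangle). It assumes the graph is simple with v ≥ 3 vertices, and for the second method it also assumes the caller already knows the graph has no C3 as a subgraph. Each method can only certify that a graph is not planar. A result of false says nothing either way.

```java
// Illustrative sketch (hypothetical names): the edge bounds of Theorem 10.77.
final class PlanarityBounds {
    // Part (a): a simple graph with v >= 3 vertices and more than 3v - 6 edges is not planar.
    static boolean tooManyEdges(int v, int e) {
        if (v < 3) {
            throw new IllegalArgumentException("the bound requires v >= 3");
        }
        return e > 3 * v - 6;
    }

    // Part (b): if the graph also has no C3 as a subgraph, more than 2v - 4 edges is already too many.
    static boolean tooManyEdgesNoTriangle(int v, int e) {
        if (v < 3) {
            throw new IllegalArgumentException("the bound requires v >= 3");
        }
        return e > 2 * v - 4;
    }

    public static void main(String[] args) {
        // K5: 5 vertices and 10 edges, as in Example 10.78.
        System.out.println("K5 by part (a): " + tooManyEdges(5, 10));
        // K4: 4 vertices and 6 edges; the bound is met with equality, so nothing is ruled out.
        System.out.println("K4 by part (a): " + tooManyEdges(4, 6));
        // K3,3: 6 vertices and 9 edges; it is bipartite, so it has no C3.
        System.out.println("K3,3 by part (a): " + tooManyEdges(6, 9));
        System.out.println("K3,3 by part (b): " + tooManyEdgesNoTriangle(6, 9));
    }
}
```

For K3,3, part (a) is inconclusive since 9 ≤ 12, but part (b) applies and gives 9 > 8 = 2(6) − 4.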
Answer
(c) A network
⋆Question 10.2. Give an example of a problem that might be modeled using the following types
of graphs. Make sure it is clear what the vertices and edges represent.
(a) A network
⋆Question 10.4. Draw an unconnected graph such that one component contains a cycle of
length 4 and another component is a tree.
⋆Question 10.5. (a) How many edges does a tree with n ≥ 2 vertices have? Draw a few trees
of various sizes and you should see an obvious pattern.
(b) (a bit challenging) Prove that your formula is correct. (Hint: Use induction. But you have
to be a little careful in how you do it. Also, you may assume that every tree contains at least
one vertex with degree 1.)
(d) Is L connected?
(i) Draw a spanning tree of L. How many edges does it have? Does your spanning tree contain
any cycles? Explain why or why not.
Reading Comprehension Questions 425
(j) What is the minimum number of edges that you can remove to make the graph disconnected
(that is, not connected)? Which ones?
(k) Find a cycle of length 3 in L. Then find one of length 4. Repeat for 5, 6, 7, and 8.
⋆Question 10.10. Draw Q5 . Just kidding. That would be a bit difficult to visualize. Instead,
describe how you could construct Q5 recursively. For instance, can you see how to go from Q1
to Q2 ? And from Q2 to Q3 ? And from Q3 to Q4 ? Once you observe the pattern it is pretty
straightforward to see how to construct Qk+1 from Qk .
⋆Question 10.11. Give a partition of the vertices of Q3 to show that it is bipartite. In other
words, which vertices go in V1 and which go in V2 ? Use 3-bit numbers to list the vertices (since
that is the natural way to construct the graph).
⋆Question 10.12. Draw K3,5 . Then draw a graph G such that G is a subgraph of K3,5 .
⋆Question 10.13. Give an informal proof of Theorem 10.47. That is, argue why it makes sense
by talking about edges, degrees, and vertices.
⋆Question 10.14. You are at a party with some friends and one of them claims “I just did a
quick count, and it turns out that at this party, there are an odd number of people who have
shaken hands with an odd number of other people at the party.” Prove or disprove that this
friend is correct.
⋆Question 10.15. (a) If a graph has very few edges, which representation is a better choice if
space is the only consideration? Explain.
(b) If a graph has many edges, which representation is a better choice if space is the only consid-
eration? Explain.
⋆Question 10.16. Are space considerations actually that important? In other words, practically
speaking, if you can store a graph using one of the representations, can you store it in the other
without worrying too much about space? Explain. (This is an important question, so think
carefully about it!) (Hint: Think about storing the graph of friends on Facebook or another
social media site.)
⋆Question 10.17. Given an adjacency list representation of a graph with n vertices and m
edges, how long do the following operations take?
⋆Question 10.18. Given an adjacency matrix representation of a graph with n vertices and m
edges, how long do the following operations take?
⋆Question 10.19. (a) If adding and removing edges is an important operation, is one of the
representations a better choice? Explain.
(b) If adding and removing vertices is an important operation, is one of the representations a
better choice? Explain.
(c) Is Q3 Eulerian? If so, number the edges of Q3 in order to demonstrate the Euler tour. If not,
explain why not.
(d) Is Q4 Eulerian? If so, number the edges of Q4 in order to demonstrate the Euler tour. If not,
explain why not.
(e) Is Q3 Hamiltonian? If so, draw a Hamiltonian cycle on Q3 . If not, explain why not.
(f) Is Q4 Hamiltonian? If so, draw a Hamiltonian cycle on Q4 . If not, explain why not.
⋆Question 10.22. Does Theorem 10.77 imply that if a graph with v ≥ 3 vertices has fewer than
3v − 6 edges that it is planar? Explain, using an example if appropriate.
10.8 Problems
Problem 10.1. Give the degrees of the vertices of each of the following graphs. Assume m and
n are positive integers. For instance, for Pn , n − 1 of the vertices have degree 2, and 2 vertices
have degree 1.
(a) Cn
(b) Qn
(c) Kn
(d) Km,n
Problem 10.2. Can a graph with 6 vertices have vertices with the following degrees: 3, 4, 1, 5, 4, 2?
If so, draw it. If not, prove it.
Problem 10.3. Prove or disprove that Qn is bipartite for n ≥ 1.
Problem 10.4. For what values of n is Kn bipartite?
Problem 10.5. Give the adjacency matrix representation of Q3 , numbering the vertices in the
obvious order.
Problem 10.6. (a) Give the adjacency matrix representation for K4 .
Problem 10.8. Describe what the adjacency matrix looks like for Cn for n > 1.
Problem 10.9. Given an adjacency matrix for Cn , with n > 1, how can you modify it to make
it the adjacency matrix for Pn ?
Problem 10.10. What property does the adjacency matrix of every undirected graph have that
is not necessarily true of directed graphs?
(b) If G is directed and there is a path from u to v, is there necessarily a path from v to u?
Explain, giving an example if possible.
Problem 10.12. For what values of n is Qn Eulerian? Prove your claim.
Problem 10.13. Is Cn Eulerian for all n ≥ 3? Prove it or give a counterexample.
Problem 10.17. A graph is Eulerian if and only if its adjacency matrix has what property?
Problem 10.18. What properties does an adjacency matrix for graph G need in order to use
Theorem 10.70 to prove it is Hamiltonian?
Problem 10.19. Let G be a bipartite graph with v vertices and e edges. Prove that if e > 2v − 4,
then G is not planar.
Problem 10.20. For each of the following, either give a planar embedding or prove the graph is
not planar.
(a) Q3
(b) Q5
(c) K2,3
(d) K6
Problem 10.21. Let G be a graph with n vertices and m edges and let u and v be arbitrary
vertices of G. Describe an algorithm that accomplishes each of the following assuming G is
represented using an adjacency matrix. Then give a tight bound on the worst-case complexity of
the algorithm. Your bounds might be based on n, m, deg(u), and/or deg(v).
(c) Iterate over the neighbors of u (and doing something for each neighbor, but don’t worry about
what and assume it takes constant time for each neighbor).
Problem 10.22. Repeat Problem 10.21, but this time assume that G is represented using adja-
cency lists.
(c) Iterate over the neighbors of u (and doing something for each neighbor, but don’t worry about
what).
Problem 10.23. (a) List several advantages that the adjacency matrix representation has over
the adjacency list representation.
(b) List several advantages that the adjacency list representation has over the adjacency matrix
representation.
Chapter 11: Exercise Solutions
2.4 2d + 1; c + d + 1; even
2.6 2n; 2o + 1; some integers n and o; 4no + 2n = 2(2no + n) or 2(n(2o + 1)). (Your steps might
vary slightly, but you should end up with either 2(2no + n) or 2(n(2o + 1)) in the final step);
2no + 1 or n(2o + 1); ‘an even integer’ or ‘even’.
2.7 Let a and b be even integers. Then a = 2m and b = 2n for some integers m and n. Their
product is ab = (2m)(2n) = 2(2mn) which is even since 2mn is an integer.
2.8 Here are my comments on the proof.
• The first sentence is phrased weird–we are not letting a and b be odd by the definition of
odd. We are using the definition.
• Although it is not incorrect, using n and q is just weird in this context. It is customary to
use adjacent letters, like n and m, or q and r.
• Given the above problems, I would rephrase the first sentence as ‘Let a and b be odd
numbers. Then a = 2n + 1 and b = 2m + 1 for some integers n and m.’
• If you replace 2nq + 1 with 2nq + q + n (twice) in the last sentence (see the previous item)
it would be a perfect finish to the proof.
2.9 Hopefully it is clear to you that the proof can’t be correct since the sum of an even and an
odd number is odd, not even. The algebra is correct. The problem is that n + m + 1/2 is not an
integer. In order to be even, a number must be expressed in the form 2k where k is an integer.
Any number can be written as 2x if we don’t require that x be an integer, so you cannot say that
a number is even because it is of the form 2x unless x is an integer.
2.13 a an integer; (3x + 2); (5x − 7); 7; 7 divides 15x^2 − 11x − 14.
2.15 This proof is correct. Not all of the Evaluate problems have an error!
2.17 The number 2 is positive and even but is clearly not composite since it is prime. Since the
statement is false the proof must be incorrect. So where is the error? It is in the final statement.
Although a can be written as the product of 2 and k, what if k = 1 (that is, a = 2)? In that case
we have not demonstrated that a has a factor other than a or 1, so we can’t be sure that it is
composite.
2.18 Let a > 2 be an even integer. Then a = 2k for some integer k. Since a ≠ 2, a has a factor
other than a or 1. Therefore a is not prime. Therefore 2 is the only even prime number.
2.19 It was O.K. because according to the definition of prime, only positive integers can be
prime. Therefore we only needed to consider positive even integers.
2.23 This one has a combination of two subtle errors. First of all, if a|c and b|c, that does not
necessarily imply that ab|c. For instance, 6|12 and 4|12, but it should be clear that 6 · 4 ∤ 12.
Second, what if a = b? We’ll see how to fix the proof in the next example.
2.25 Since n is not a perfect square, we know that a ≠ b. Therefore a < b or b < a. Since
a and b are just labels for two factors of n, it doesn’t matter which one is larger. So we can
just assume a is the smaller one without any loss of generality. By definition of composite, we
know that a > 1. Finally, it should be pretty clear that b < n − 1 since if b = n − 1, then
n = ab = a(n − 1) ≥ 2(n − 1) = 2n − 2 = n + (n − 2) > n since n > 4. But clearly n > n is
impossible.
2.26 We assumed that n = a^2 > 4, so clearly a > 2.
2.28
1. Experiment. If you aren’t sure what to do, don’t be afraid to try things.
2. Read Examples. But don’t just read. Make sure you understand them.
2.33 Only when you read xkcd and you don’t laugh.
2.34 If you build it and they don’t come, the proposition is false. This is the only case where it
is false. To see this, notice that if you build it and they do come, it is true. If you don’t build it,
then it doesn’t matter whether or not they come–it is true.
2.38 If you don’t know a programming language, then you don’t know Java.
2.40 true; ¬p; false; p; p is true; q is false (the last two can be in either order).
2.42 If you don’t know Java, then you don’t know a programming language.
2.43 They are not equivalent. Since Java is a programming language, the proposition seems
obviously true. However, what if someone knows C++ but not Java? Then they know a pro-
gramming language but they don’t know Java. Thus, the inverse is false. Since one is true and
the other is false, the proposition and its inverse are clearly not equivalent.
2.45 If you know a programming language, then you know Java.
2.46 They are not equivalent. Since Java is a programming language, the proposition seems
obviously true. However, what if someone knows C++ but not Java? Then they know a pro-
gramming language but they don’t know Java. Thus, the converse is false. Since one is true and
the other is false, the proposition and its converse are clearly not equivalent.
2.48 (a) The implication states that if I get to watch “The Army of Darkness” that I will be
happy. However, it doesn’t say that it is the only thing that will make me happy. For instance,
if I get to see “Iron Man” instead, that would also make me happy. Thus, the inverse statement
is false.
(b) I will use the fact that p → q is true unless p is true and q is false. The implication is true unless
I watch “The Army of Darkness” and I am not happy. The contrapositive is “If I am not happy,
then I didn’t get to watch ‘The Army of Darkness.’ ” This is true unless I am not happy and I
watched “The Army of Darkness.” Since these are exactly the same cases in which the implication
is true, the implication and its contrapositive are equivalent.
2.51 √35; 10√35; 3481 ≥ 3500; nonsense or false or a contradiction.
2.52
• It is proving the wrong thing. This proves that the product of an even number and an
odd number is even. But it doesn’t even do that quite correctly as we will see next.
• The first sentence is phrased weird–we are not letting a be even by the definition of
even. We are using the definition.
• It does not state that n and q need to be integers.
• Although it is not incorrect, using n and q is just weird. It is customary to use adjacent
letters, like n and m, or q and r.
• Given the above problems, I would rephrase the first sentence as ‘Let a be an even
number and b be an odd number. Then a = 2n and b = 2m + 1 for some integers n
and m.’
• There is an algebra mistake. The product should be 2(2nq + n).
• The last sentence is actually perfect (again, except for the fact that it isn’t proving the
right thing).
Evaluation of Proof 2: This proof is incorrect. It actually proves the converse of the statement.
(We’ll learn more about converse statements later.) In other words, it proves that if at least
one of a or b is even, then ab is even. This is not the same thing. It is a pretty good proof
of the wrong thing, but it can be improved in at least 4 ways.
• It defines a and b but never really uses them. They should be used at the beginning
of the algebra steps (i.e. a · b = · · · ) to make it clear that the algebra is related to the
product of these two numbers.
• It needs to state that k and x are integers.
• As above, using k and x is weird (but not wrong). It would be better to use k and l,
or x and y.
• It needs a few words to bring the steps together. In particular, sentences should not
generally begin with algebra.
Taking into account these things, the second part could be rephrased as follows.
2.56 (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), and (3, 2, 1).
2.58 Since it wasn’t obvious how to do a direct proof of the fact, proof by contradiction seemed
like the way to go. So we begin by assuming what we want to prove (that the product is even)
is false. The short answer: Because contradiction proofs generally begin by assuming the negation
of what you want to prove.
2.59 The proof gives the justification for this, but you may have to think about it for it to
entirely sink in. Consider carefully the definition of S: S = (a_1 − 1) + (a_2 − 2) + · · · + (a_n − n).
Notice it adds and subtracts terms. If S = 0, then the amount added and subtracted must be the
same. And if you think about it for a few minutes, especially in light of the justification given in
the proof, you should see why. If you can’t see it right away, go back to how the a_k’s are defined
and think a little more. If you get totally stuck, try an example with n = 3 or 4.
2.62 Because a^2 = a · a, so to list the factors of a^2 you can list the factors of a twice. Thus, a^2
has twice as many factors as a, so it must be an even number.
2.65 (1) No. (2) Yes. (3) No. (4) No. (5) Statements of the form “p implies q” are false precisely
when p is true and q is false. (6) No. Whether or not you are 21, you aren’t breaking the rule.
(7) No. If p is false, whether or not q is true or false doesn’t matter–the statement is true. Let’s
consider the previous question–if you do not drink alcohol, you are following the rule regardless
of whether or not the statement “you are 21” is true or false.
2.66 a > b; (a − b)/2; (a − b)/2; b + a/2 − b/2 = a/2 + b/2; subtract a/2 from both sides and multiply both sides by
2; a > b; contradiction; a ≤ b.
2.68 a(p/q)^2 + b(p/q) + c; multiply both sides by q^2; odd; 0; ap^2 + bpq is even and cq^2 is odd, so
ap^2 + bpq + cq^2 is odd; bpq + cq^2 is even and ap^2 is odd, so ap^2 + bpq + cq^2 is odd; ax^2 + bx + c = 0
does not have a rational solution if a, b, and c are odd.
2.72
Evaluation of Proof 1: This is attempting to prove the converse, not the contrapositive. Since
the converse of a statement is not equivalent to the original statement, this is not a valid
proof. Further, the proof contains an algebra mistake. Finally, it uses the property that the
sum of two even integers is even. Although this is true, the problem specifically asked to
prove it using the definition of even/odd.
Evaluation of Proof 2: This proof starts out correctly by using the contrapositive statement and
the definition of odd. Unfortunately, the writer claims that 5((6/5)k + 1) is ‘clearly odd.’ This
is not at all clear. What about this number makes it odd? Is it expressed as 2a + 1 for some
integer a? No. Even worse, there is a fraction in it, obscuring the fact that the number is
even an integer.
Evaluation of Proof 3: This proof is really close. The only problem is that we don’t know that
6k + 5 is odd using the definition of odd. All the writer needed to do is take their algebra a
little further to obtain 2(3k + 2) + 1, which is odd by the definition of odd since 3k + 2 is
an integer.
2.78 Answers will vary greatly, but one proof is: 3 and 5 are prime but 3 + 5 = 8 = 2^3 is clearly
not prime.
2.81 2s is a power of two that is in the closed interval.; 2^r = 2 · 2^(r−1) < 2s < 2 · 2^r = 2^(r+1), so
s < 2^r < 2s < 2^(r+1), and so the interval [s, 2s] contains 2^r, a power of 2.
2.82 Because these statements are contrapositives of each other. In other words, they are
equivalent. Therefore you can prove either form of the statement.
2.84 If x is odd, then x = 2k + 1 for some integer k. Then x + 20 = 2k + 1 + 20 = 2(k + 10) + 1,
which is odd since k + 10 is an integer. If x + 20 is odd, then x + 20 = 2k + 1 for some integer k.
Then x = (x + 20) − 20 = 2k + 1 − 20 = 2(k − 10) + 1, which is odd since k − 10 is an integer. Therefore
x is odd iff x + 20 is odd.
2.85 If x is odd, then x = 2k + 1 for some integer k. Then x + 20 = 2k + 1 + 20 = 2(k + 10) + 1,
which is odd since k + 10 is an integer. If x is even, then x = 2k for some integer k. Then
x + 20 = 2k + 20 = 2(k + 10). Since k + 10 is an integer, then x + 20 is even. Therefore x is odd
iff x + 20 is odd.
2.86 p implies q; q implies p; p implies q; ¬p implies ¬q
2.87
Evaluation of Proof 1: For the forward direction, they didn’t use the definition of odd. Other-
wise, that part is fine. For the backward direction, their proof is nonsense. They assumed
that x = 2k + 1 when they wrote (2k + 1) − 4 in the second sentence. This needs to be
proven.
Evaluation of Proof 2: For the forward direction, they didn’t specify that k was an integer. Oth-
erwise it is correct. The second part of the proof is not proving the converse. It is proving
the forward direction a second time using a proof by contraposition. In other words, this
proof just proves the forward direction twice and does not prove the backward direction.
2.89 This only proves that 4 + 6 is even. It says nothing about the sum of any other two even
numbers.
2.91 The problem is that this is actually a proof that x + x is even if x is even since x = 2a = y
was assumed.
2.92 Notice that 4 and 6 are even, but 4 + 6 = 10 is not divisible by 4. So clearly the statement
is incorrect. Therefore, there must be something wrong with the proof. The problem is the same
as in the previous example–the proof assumed x = y, even if that was not the intent of the writer.
So what was proven was that if x is even, then x + x is divisible by 4.
2.93 Since it should be clear that the result (−1 = 1) is false, the proof can’t possibly be correct.
2.94 No! Example 2.93 should have made it clear that this approach is flawed.
2.95 No, you should not be convinced. As we just mentioned, whether or not the equation is
true, sometimes you can work both sides to get the same thing. Thus the technique of working
both sides is not valid. It doesn’t guarantee anything unless you already know that the equation
is valid.
2.96 Since p and q are odd, we know that p + q is even, and so (p + q)/2 is an integer. But p < q
gives 2p < p + q < 2q and so p < (p + q)/2 < q, that is, the average of p and q lies between them.
Since p and q are consecutive primes, any number between them is composite, and so divisible by
at least two primes. So p + q = 2 · (p + q)/2 is divisible by the prime 2 and by at least two other
primes dividing (p + q)/2.
2.97
Evaluation of Proof 1: This is not correct. It needs to be shown that xy can be written as c/d,
where c and d are integers with d ≠ 0. Ask yourself this: Are ay and by necessarily integers?
Evaluation of Proof 2: This is not correct. If y = 3/2, what does it mean to multiply x by itself
one and a half times?
Evaluation of Proof 1: This solution has two serious flaws. First, we absolutely cannot assume
x is an integer. The only thing we can assume about x is that it is rational, and not every
rational number is an integer. The other problem is that the writer proved the inverse, not
the contrapositive. What they needed to prove was that if 1/x is rational, then x is rational.
So in actuality, what we know is that 1/x is rational, not x. We need to prove that x is rational
based on the assumption that 1/x is rational.
Evaluation of Proof 2: This is not really a proof. It just takes the statement of the problem one
step further. Is the writer sure that 1/x can’t be expressed as an integer over an integer?
Why? There are just too many details omitted.
Evaluation of Proof 3: The biggest flaw is that this is a proof of the inverse statement, not the
contrapositive. So even if the rest of the proof were correct, it would be proving the wrong
thing since the inverse and contrapositive are not equivalent. But the rest is not even entirely
correct because the inverse statement is not quite true. If x = 0, then p = 0 as well and the
statement and proof fall apart for the same reason–you can’t divide by 0.
Evaluation of Proof 4: This proof is almost correct. It does correctly try to prove the contrapos-
itive, and if it had done so correctly, that would imply the original statement is true. But
there is one small problem: If a = 0 the proof would fall apart because it would divide by
0. This possibility needs to be dealt with. This is actually not too difficult to fix. We just
need to add the following sentence before the last sentence: “Since 0 ≠ 1/x for any value
of x, we know that a ≠ 0.”
Evaluation of Proof 1: As you will prove next, the statement is actually false. Therefore the
proof has to be incorrect. But where did it go wrong? It turns out that they tried to
prove the wrong thing. What needed to be proved was “If p is prime then 2^p − 1 is prime.”
They attempted to prove the converse statement, which is not equivalent. We can still learn
something by evaluating their proof. It turns out that the converse is actually true, and
the proof has a lot of correct elements. Unfortunately, they are not put together properly.
First of all, the proof seems to be a combination of a contradiction proof and a proof by
contrapositive. They needed to pick one and stick with it. Second, the arrows (→) are
confusing. What do they mean? I think they are supposed to be read as “implies”, but
a few more words are needed to make the connections between these phrases. Finally, the
final statement is incorrect. This does not prove that all numbers of the form 2^p − 1 are
prime when p is prime.
Evaluation of Proof 2: This proof is not even close. This is a case of “I wasn’t sure how to prove
it so I just said stuff that sounded good.” You can’t argue anything about the factors of
2^p − 1 based on the factors of 2^p. Further, although 2^p − 1 being odd means 2 is not a factor,
it doesn’t tell us whether or not the number might have other factors.
2.101 Notice that 11 is prime but that 2^11 − 1 = 23 · 89 is not. Therefore, not all numbers of the
form 2^p − 1, where p is prime, are prime.
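If you want to check this counterexample by machine, here is a minimal sketch. The class and method names are my own choices for illustration (they are not from the text), and the primality test is plain trial division. It computes 2^p − 1 for a few small primes p and reports which results are prime.

```java
class MersenneCheck {
    // Trial division: n is prime if n >= 2 and no d with d*d <= n divides it.
    static boolean isPrime(long n) {
        if (n < 2) {
            return false;
        }
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    // Returns 2^p - 1. Assumes 0 <= p < 63 so the shift does not overflow.
    static long mersenne(int p) {
        return (1L << p) - 1;
    }

    public static void main(String[] args) {
        int[] primes = {2, 3, 5, 7, 11, 13};
        for (int p : primes) {
            long m = mersenne(p);
            String verdict = isPrime(m) ? "is prime" : "is not prime";
            System.out.println("p = " + p + ": 2^p - 1 = " + m + " " + verdict);
        }
    }
}
```

Of the primes tried here, only p = 11 should be reported as giving a composite number.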
3.2
double areaSquare(double s) { return s * s; }
3.7 It does not work. To see why, notice that if we pass in a and b, then x = a and y = b at
the beginning. After the first line, x = b and y = b. After the second line x = b and y = b. The
problem is that the first line overwrites the value stored in x (a), and we can’t recover it.
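To see the failure and one standard fix side by side, here is a small sketch. It assumes the two lines in question were x = y followed by y = x, which is what the trace above suggests. Since Java passes int values by copy, each method returns the resulting pair in an array; that return convention and all of the names are just for demonstration.

```java
class SwapDemo {
    // The broken attempt: the first assignment destroys the old value of x.
    static int[] brokenSwap(int x, int y) {
        x = y;
        y = x;
        return new int[] {x, y};
    }

    // A working version: save the old value of x before overwriting it.
    static int[] swap(int x, int y) {
        int temp = x;
        x = y;
        y = temp;
        return new int[] {x, y};
    }

    public static void main(String[] args) {
        int[] bad = brokenSwap(1, 2);
        int[] good = swap(1, 2);
        System.out.println("broken: x = " + bad[0] + ", y = " + bad[1]);
        System.out.println("fixed:  x = " + good[0] + ", y = " + good[1]);
    }
}
```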
3.12 (a) 45; (b) 8; (c) 3; (d) 6; (e) 0; (f) 7; (g) 7; (h) 7; (i) 11.
3.21 -15; -7; 9; 13; 21. Notice that it is every 4th number along the number line, both in the
positive and negative directions.
3.22 Either −1 or 3 are possible answers if we are uncertain whether it will return a positive or
negative answer. But we know it is one of these. It won’t be -5, for instance.
3.23
Evaluation of Solution 1: This solution is both incorrect and a bit confusing. The phrase ‘both
sides’ is confusing–both sides of what? We don’t have an equation in this problem. But
there is a more serious problem. If you thought it was correct, go back and try to figure
out why it is incorrect before you continue reading this solution. The main problem is that
although this may return a value in the correct range, it doesn’t always return the correct
value. In fact, what if (a (mod b) + b − 1) is odd? Mathematically, this would result in a
non-integer result which is clearly incorrect. In most programming languages it would at
least truncate and return an integer–but again, not always the correct one. This person
focused on the wrong thing–getting the number in a particular range. Although that is
important, they needed to think more about how to get the correct number. They should
have plugged in a few more values to double-check their logic.
Evaluation of Solution 3: Incorrect. If you think about it for a few minutes you should see that
this is just one way of implementing the idea from the previous solution.
Evaluation of Solution 4: Incorrect. If (a mod b) is negative, performing another mod will still
leave it negative.
3.24 There are several possible answers, but the slickest is probably: (b+(a mod b)) mod b.
Try it with both positive and negative numbers for a and convince yourself that it is correct.
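Here is a sketch of that expression in Java, where the % operator returns a result with the same sign as the dividend, so a % b can be negative when a is. The class and method names are assumptions made for illustration, and the method assumes b > 0.

```java
class ModDemo {
    // Returns a mod b in the range 0..b-1, assuming b > 0.
    // a % b lies strictly between -b and b, so adding b makes it non-negative,
    // and the final % b brings it back below b.
    // (For very large b the sum b + (a % b) could overflow an int.)
    static int mod(int a, int b) {
        return (b + (a % b)) % b;
    }

    public static void main(String[] args) {
        int[] samples = {-9, -5, -4, -1, 0, 1, 5, 7};
        for (int a : samples) {
            System.out.println(a + " % 4 = " + (a % 4) + ", mod(" + a + ", 4) = " + mod(a, 4));
        }
    }
}
```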
3.27 (1) 9; (2) 10; (3) 9; (4) 10; (5) 9; (6) 9.
3.29
Evaluation of Solution 1: This does not work. What happens when x = 3.508, for instance?
Evaluation of Solution 2: This is incorrect for two reasons. First, 1/2 = 0 in most programming
languages, so this will always round down. Second, even if we replaced this with .5 or
1.0/2.0, it would round up at .5.
Evaluation of Solution 3: Nice try, but still no good. What if x = 2.5? This will round up to 3.
Worse, what if x = 2.0001? Again, it rounds up to 3 which is really bad.
Evaluation of Solution 4: This one is correct. Plug in values like 2, 2.1, and 2.5 to see that it
rounds down to 2 and values like 2.51, 2.7, and 2.9 to see that it rounds up to 3.
3.31 (a) 0; (b) 1; (c) 1; (d) 1; (e) 1 (f) 1; (g) 2; (h) 9; (i) 0; (j) -1; (k) -1; (l) -2.
3.32
Evaluation of Solution 1: 0.5 is not an integer, and the floor function is not allowed.
Evaluation of Solution 2: The floor function is not allowed. Even if it were, this solution doesn’t
work. 1/2 is evaluated to 0 so it doesn’t help.
Evaluation of Solution 3: This one works, but 0.5 is not allowed so it does not follow the
directions.
• If y is the maximum, the argument is essentially the same as the previous case.
3.42
Evaluation of Solution 1: This algorithm always returns 0. If you don’t see it right away, carefully
work through the algorithm with a few values of n.
Evaluation of Solution 2: This is correct. It doesn’t multiply by 1, but that doesn’t change the
answer.
Evaluation of Solution 3: This is also correct. It is just multiplying the values in the reverse
order of the other examples.
Evaluation of Solution 4: This is incorrect. It actually computes (n − 1)!. To fix this, does i
have to start at 0, or go to n (instead of n − 1)? We’ll leave it to you to work out which is
correct.
3.43 This isn’t really that much different than the algorithms to compute n!. Here is one
algorithm that does the job:
double power(double x, int n) {
    // Running product; starts at 1 so that n = 0 gives x^0 = 1.
    double result = 1;
    for (int i = 0; i < n; i++) {
        result = result * x;
    }
    return result;
}
The loop could also have been for(int i=1;i<=n;i++), or any loop that executes n times since
the loop index, i, is not used as part of the calculation.
3.48 As we already discussed, many languages truncate when performing integer division. When
the numbers are positive (as they are here), that is the same thing as taking the floor. Even if a
language does not do this, Theorem 3.28 implies that it would still work.
3.49 It is easiest to see that this is correct by comparing with the previous solution. The only
difference is that the condition went from i <= (n − 2)/2 to i < n/2. Notice that (n − 2)/2 + 1 =
n/2 − 2/2 + 1 = n/2. But since we also replaced <= with <, it still stops at the same point.
3.51 It is correct. To convince yourself (but this is not a proof), plug the numbers 1 through
5 (or some other set of both even and odd values) into both sides to see that you get the same
number.
3.57 In the first iteration in the loop, n = 5, i = 1, n ∗ i > 4 and thus x = 10. Next, n = 3, i = 2,
and we go through the loop again. Since n ∗ i > 4, x = 10 + 2 ∗ 3 = 16. Finally, n = 1, i = 3, and
the loop stops. Hence x = 16 is returned.
3.60 ⌊√101⌋ = 10. The only primes less than 10 are 2, 3, 5, and 7. Since 101 mod 2 = 1,
101 mod 3 = 2, 101 mod 5 = 1, and 101 mod 7 = 3, none of which are 0, Theorem 3.58 tells us
that 101 is prime.
3.61 323 = 17 ∗ 19, so it is not prime. I determined this by seeing if any of the primes no greater
than ⌊√323⌋ = 17 were factors. Although 2, 3, 5, 7, 11, and 13 are not, 17 is.
3.63 It allows the loop to increment by 2 each time instead of by one, making it about twice as
fast. This is sensible since if 2 is not a factor, no even number is, so why check them all?
A bonus thought: If you think about it, the same thing could be said of 3, 5, etc. That is,
once we know that a number is not divisible by 3, we don’t really need to ask if it is divisible by
6, 9, 12, etc. But doing this in general (not just for 2) complicates the algorithm quite a bit. So
we’ll just settle for an algorithm that is about twice as fast.
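To make the idea concrete, here is one way such an algorithm might look. This is only a sketch under my own naming and is not necessarily the algorithm from the text: it handles 2 separately and then tries only odd candidates up to the square root of n.

```java
class PrimeCheck {
    // Trial division that skips even candidates after checking 2.
    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        if (n == 2) {
            return true;
        }
        if (n % 2 == 0) {
            return false;          // no even number other than 2 is prime
        }
        // Only odd divisors remain; stop once d*d exceeds n.
        // The cast to long avoids overflow in d * d for large n.
        for (int d = 3; (long) d * d <= n; d += 2) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("101 prime? " + isPrime(101));
        System.out.println("323 prime? " + isPrime(323));
    }
}
```

The two calls in main correspond to the numbers from Exercises 3.60 and 3.61.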
3.64 The following algorithm does the job.
int reverseDigits(int n) {
    int x = 0;
    while (n != 0) {
        // Shift x left one decimal place and append the last digit of n.
        x = x * 10 + n % 10;
        // Drop the last digit of n.
        n = n / 10;
    }
    return x;
}
4.3 (a) false. (b) true. (c) true. (d) false. If you don’t know the story behind this, Google it.
4.5 (a) Not a proposition. (b) I would like to think this is true. However, this is not a proposition
since not everyone agrees with me. (c) Also not a proposition. (d) true. (e) false. This one is a
bit tricky to think about, so the next example will ask you to prove it.
4.9 “I am not learning discrete mathematics.” You could also have “It is not the case that I am
learning discrete mathematics,” although it is better to smooth out the English when possible.;
False. Since you are currently reading the book, you are learning discrete mathematics.
4.10 (a) The statement “This is not a proposition” is just the negation of the proposition “This
is a proposition,” so it is a proposition. Since “This is a proposition” is true, “This is not a
proposition” is false.
(b) Assume that “This is not a proposition” is not a proposition. Then the statement “This
is not a proposition” is true, which means it is a proposition. But we assumed it wasn’t a
proposition. Since we have a contradiction, our assumption that “This is not a proposition” is
not a proposition was incorrect, so it is a proposition. Since it is a proposition that says it isn’t,
its truth value is false.
4.11 list.size()!=0 and !(list.size()==0) are the most obvious solutions.
4.15 Either “I like cake and I like ice cream,” or “I like cake and ice cream” are correct.
4.19 “x > 0 or x < 10”; true; true; x < 10; true.
4.20 (a) “It is not the case that Jill is tall,” which is awkward, so could be shortened to “Jill is not
tall” or perhaps “Jill is short.” (b) “Jill is tall or Jill is smart,” or more compactly, “Jill is tall or
smart.” (c) “Jill is tall and Jill is smart,” or more compactly, “Jill is tall and smart.”
4.21 Here is one possible solution. Note that the parentheses are necessary.
boolean startsOrEndsWithZero(int[] a, int n) {
    if (n > 0 && (a[0] == 0 || a[n - 1] == 0)) {
        return true;
    } else {
        return false;
    }
}
4.22 The solution uses the expression n>0 && (a[0]==0 || a[n-1]==0). If n = 0, the expres-
sion is false because of the &&, so the algorithm returns false as it should since an array with no
elements certainly does not begin or end with a 0. If n = 1, first note that n − 1 = 0, so a[0] and
a[n − 1] refer to the same element. Although this is redundant, it isn’t a problem. If a[0] = 0, the
expression evaluates to T ∧ (T ∨ T ) = T , and the algorithm returns true as expected. If a[0] ≠ 0,
the expression evaluates to T ∧ (F ∨ F ) = F , and the algorithm returns false as expected.
4.25 (a) XOR; (b) OR. This one is a little tricky because parts can’t be simultaneously true so it
sounds like an XOR. But since the point of the statement is not to prevent both from being true,
it is an OR. (c) Without more context, this one is difficult to answer. I would suspect that most of
the time this is probably OR. The purpose of this example is to demonstrate that sometimes life
contains ambiguities. This is particularly true with software specifications. Generally speaking,
you should not just assume one of the alternative possibilities. Instead, get the ambiguity clarified.
(d) When course prerequisites are involved, OR is almost certainly in mind. (e) The way this is
phrased, it is almost certainly an XOR.
4.26 p ∨ q is “either list 1 or list 2 is empty.” To be completely unambiguous, you could rephrase
it as “at least one of list 1 or list 2 is empty.” p ⊕ q is “either list 1 or list 2 is empty, but not
both,” or “precisely one of list 1 or list 2 is empty.” They are different because if both lists are
empty, p ∨ q is true, but p ⊕ q is false.
4.27 (a) No. If p and q are both true, then p ∨ q is true, but p ⊕ q is false, so they do not mean
the same thing.
(b) We have to be very careful here. In general, the answer to this would be absolutely not (we’ll
discuss this more next). However, for this particular p and q, they actually essentially are the
same. But the reason is that it is impossible for x to be less than 5 and greater than 15 at the
same time. In other words, p and q can’t both be true at the same time. The only other way for
p ⊕ q to be false is if both p and q are false, which is exactly when p ∨ q is false.
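Both parts can be checked with a few lines of code. The sketch below uses Java's || for OR and ^ for XOR on boolean values. For part (b) it assumes p is “x < 5” and q is “x > 15”, as the discussion above suggests. The class and method names are made up for illustration.

```java
class OrVersusXor {
    // True when inclusive OR and exclusive OR disagree for this p and q.
    static boolean differ(boolean p, boolean q) {
        return (p || q) != (p ^ q);
    }

    // Assumed propositions for part (b): p is "x < 5" and q is "x > 15".
    static boolean agreeFor(int x) {
        boolean p = x < 5;
        boolean q = x > 15;
        return (p || q) == (p ^ q);
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        for (boolean p : values) {
            for (boolean q : values) {
                System.out.println("p=" + p + " q=" + q
                        + " OR=" + (p || q) + " XOR=" + (p ^ q));
            }
        }
        for (int x = 0; x <= 20; x++) {
            if (!agreeFor(x)) {
                System.out.println("OR and XOR disagree at x = " + x);
            }
        }
    }
}
```

The only row where the two operators disagree is the one with both p and q true, and no value of x can produce that row.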
4.31 (a) An A. (b) We can’t be sure. We know that earning 94% is enough for an A, but we
don’t know whether or not there are other ways of earning an A. (c) We can’t be sure. If the
premise is false, we don’t know anything about conclusion.
4.35 (a) An A. (b) Yes. Because it is a biconditional statement that we assumed to be true, the
statements “you will receive an A in the course” and “you earn at least 94%” have the same truth
value. Since the former is true, the latter has to be true. (c) Yes. Notice that p ↔ q is equivalent
to ¬p ↔ ¬q (You should convince yourself that this is true). Thus the statements “you don’t
earn at least 94%” and “you didn’t get an A” have the same truth value.
4.37 The answers are in bold.
p q p → q (p → q) ∧ q
T T T T
T F F F
F T T T
F F T F
4.42 Here is the truth table with two intermediate columns. For consistency, your table should
have the rows in the same order.
a b c ¬b a ∨ ¬b (a ∨ ¬b) ∧ c
T T T F T T
T T F F T F
T F T T T T
T F F T T F
F T T F F F
F T F F F F
F F T T T T
F F F T T F
4.45 (a ∧ b) ∨ c
4.46 They are not equivalent. For instance, when a = F , b = F , and c = T , (a ∧ b) ∨ c is true
but a ∧ (b ∨ c) is false.
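The same thing shows up in code. In Java, && binds more tightly than ||, so a && b || c is read as (a && b) || c, which matches the parenthesization given in 4.45. The following sketch (with names chosen only for illustration) compares the three forms.

```java
class PrecedenceDemo {
    // No parentheses: Java applies && before ||.
    static boolean unparenthesized(boolean a, boolean b, boolean c) {
        return a && b || c;
    }

    static boolean andFirst(boolean a, boolean b, boolean c) {
        return (a && b) || c;
    }

    static boolean orFirst(boolean a, boolean b, boolean c) {
        return a && (b || c);
    }

    public static void main(String[] args) {
        // The assignment from 4.46: a = F, b = F, c = T.
        System.out.println("(a && b) || c: " + andFirst(false, false, true));
        System.out.println("a && (b || c): " + orFirst(false, false, true));
        System.out.println("a && b || c:   " + unparenthesized(false, false, true));
    }
}
```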
4.47 Since (a → b) → c is how it should be interpreted, the first statement is correct. The second
statement is incorrect. We’ll leave it to you to find truth values for a, b, and c that result in these
two parenthesizations having different truth values.
4.50 (a) tautology (b) contradiction; p and ¬p cannot both be true. (c) contingency; it can be
either true or false depending on the truth values of p and q.
4.52
Evaluation of Proof 1: Nice truth table, but what does it mean? It is just a bunch of symbols
on a page. Why does this truth table prove that the proposition is a tautology? The proof
needs to include a sentence or two to make the connection between the truth table and the
proposition being a tautology.
Evaluation of Proof 2: This is mostly correct, but the phrasing could be improved. For instance,
the phrase ‘they all return true’ is problematic. Who/what are ‘they’ ? And what does it
mean that they ‘return’ true? Propositions don’t ‘return’ anything. Replace ‘Since they all
return true’ with ‘Since every row of the table is true’ and the proof would be good.
Evaluation of Proof 3: While I applaud the attempt at completeness, this proof is way too com-
plicated. It is hard to understand because of the incredibly long sentences and the math-
ematical statements written in English in the middle of sentences. But I suppose that
technically speaking it is correct. Here are a few specific examples of problems with the
proof (not exhaustive). The first three sentences are confusing as stated. The point that the
author is trying to make is that whenever q is true, the statement must be true regardless
of the value of p, so there is nothing further to verify. Thus the only case left is when q is
false. This point could be made with far fewer words and more clearly. The phrase ‘we would
have true and (true implies false), which is false,’ is very confusing, as are a few similar
statements in the proof. The problem is that the writer is trying to express mathematical
statements in sentence form instead of using mathematical notation. There is a reason we
learn mathematical notation–to use it!
Evaluation of Proof 4: This proof is correct and is not too difficult to understand. It is a lot
better than the previous proof for a few reasons. First of all, it starts off in a better place–
focusing in on the single case of importance. Second, it uses the appropriate mathematical
notation and refers to definitions and previous facts to clarify the argument.
Evaluation of Proof 5: While I appreciate the patriotism (in case you don’t know, some people
use ‘merica as a shorthand for America), this has nothing to do with the question. Sorry,
no points for you! By the way, I did not make this solution up. Although it wasn’t really
used on this particular problem, one student was in the habit of giving answers like this if
he didn’t know how to do a problem.
4.56 Below is the truth table for ¬(p ∧ q) and ¬p ∨ ¬q (the gray columns).
p q p ∧ q ¬(p ∧ q) ¬p ¬q ¬p ∨ ¬q
T T T F F F F
T F F T F T T
F T F T T F T
F F F T T T T
Since they are the same for every row of the table, ¬(p ∧ q) = ¬p ∨ ¬q.
4.57 For the first part: negation; distribution; p ∨ p. For the second part:
p = p ∧ T (identity)
= p ∧ (p ∨ ¬p) (negation)
= (p ∧ p) ∨ (p ∧ ¬p) (distributive)
= (p ∧ p) ∨ F (negation)
= p ∧ p (identity)
Thus, p ∧ p = p.
4.59 (a) We can use the identity, distributive, and domination laws to see that
p ∨ (p ∧ q) = (p ∧ T) ∨ (p ∧ q) = p ∧ (T ∨ q) = p ∧ T = p.
(b) We can prove this similarly to the previous one, or we can use the previous one along with
distribution and idempotent laws:
p ∧ (p ∨ q) = (p ∧ p) ∨ (p ∧ q) = p ∨ (p ∧ q) = p.
4.62 Let p =“x > 0” and q =“x < y”. Then the conditional above can be expressed as
(p ∧ q) ∨ (p ∧ ¬q). According to Example 4.58, this is just p. Therefore the code simplifies to:
if (x > 0) {
    x = y;
}
4.63 This is not equivalent to the original code. Consider the case when x = −1 and y = 1, for
instance.
4.64 Since both conditions need to be true when if statements are nested, it is the same thing
as a conjunction. In other words, the two ifs are equivalent to if( x>0 && (x<y || x>0) ). By
absorption, this is equivalent to if(x>0). So the simplified code is:
if (x > 0) {
    x = y;
}
You can also think about it this way. The assignment x=y cannot happen unless x>0 due to the
outer if.1 But the inner if has a disjunction, one part of which is x>0, which we already know
is true. In other words, it doesn’t matter whether or not x<y. This argument also leads to the
solution we just gave.
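If you want some extra reassurance, you can compare the two versions on a range of inputs. The sketch below is in Python rather than the language of the exercise, and the nested version is reconstructed from the description above (outer test x>0, inner test x<y || x>0), so treat `original`, `simplified`, and `same_on_range` as illustrative names rather than the book's code.

```python
def original(x, y):
    # Nested ifs as described in the solution (a reconstruction, not the book's listing).
    if x > 0:
        if x < y or x > 0:
            x = y
    return x

def simplified(x, y):
    # The version obtained by absorption.
    if x > 0:
        x = y
    return x

def same_on_range(lo=-5, hi=5):
    # True if both versions leave x with the same value for every pair tested.
    return all(original(x, y) == simplified(x, y)
               for x in range(lo, hi + 1)
               for y in range(lo, hi + 1))

print(same_on_range())
```

Agreement on a finite range is only evidence, not a proof. The proof is the absorption argument.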
1 If you think about it, this is why the solution to this in the previous example failed.
4.66
Evaluation of Solution 1: This solution is incorrect. There are a few problems. The obvious one
is that the first statement actually prevents the program from crashing so it is certainly
not unnecessary! Also, the second and third statements may be equivalent, but how are
they connected? For instance, given the expression ¬(A ∧ B) ∨ ¬A, I cannot simply remove
the ‘redundant’ ¬A to obtain an “equivalent” expression of ¬(A ∧ B) (if necessary, plug in
different truth values for A and B to convince yourself that these are not the same).
Evaluation of Solution 2: This is not correct. The second part of the expression seems to have
disappeared. But how can we know it isn’t equivalent? We just need to find a scenario
where the two versions do different things. Notice that the ‘simplified’ expression is true
when the list is not empty regardless of the value of element 0. But what if the list is not
empty and element 0 is 50? The original expression is false and the ‘simplified’ expression
is true. Clearly not the same.
Evaluation of Solution 3: This solution is not only correct, but it is very well argued.
4.67 Technically speaking, the final solution is not equivalent. However, it turns out that it is
better than the original. This is because the original code would actually crash if the list is empty.
Go back and look at the code and verify that this is the case. Then verify that the final simplified
version will not crash.
4.68 (a) p ⊕ q; (b) (p ∧ ¬q) ∨ (¬p ∧ q) or (p ∨ q) ∧ ¬(p ∧ q). Other answers are possible, but most
likely you came up with one of these. If not, construct a truth table to determine whether or not
your answer is correct.
4.70
Evaluation of Proof 1: This is an incomplete proof. It only proves that in one case (p and q both
being true) they are equivalent. It says nothing about, for instance, whether or not they
have the same truth value when p is true and q is false.
Evaluation of Proof 2: This proof is also incomplete. It proves that in two cases they have the
same truth value, but is silent about the other cases. Are we supposed to assume that in
all other cases the expressions are both false?
Evaluation of Proof 3: This is either incomplete or incorrect, depending on how you read it. If
by “precisely” the writer means “exactly when”, then it is incorrect since the propositions
are also true when both p and q are false. Otherwise the proof is incomplete because it does
not deal with every case.
Evaluation of Proof 4: This is correct because it exhausts all of the cases. It is perhaps a bit
brief, however. The only way I know the proof is actually correct is that I have to verify
what the writer said. By the definition of p ↔ q, what they said is clearly true. But to see
that it is true of (p ∧ q) ∨ (¬p ∧ ¬q) I have to actually plug in a few values and/or think
about the meaning of the expression.
4.74 (a) is a predicate since it can be true or false depending on the value of x; (b) is not a
predicate since it is simply a false statement–it doesn’t contain any variables; (c) is a predicate
since it can be true or false depending on the value of M; (d) is not a predicate. This one is
tricky. This is a definition. In this statement, x is not a variable but a label for a number so that
it can be referred to later in the sentence.
4.78
(a) ∀x(2x < 3x). In case it isn’t obvious, there is nothing magical about x. You could also write
your answer as ∀a(2a < 3a), for instance.
Evaluation of Solution 1: While perhaps technically correct, this solution is not very good. It at
least uses a quantifier. But the fact that it includes the phrase “is even” suggests that it
could be phrased a bit more ‘mathematically.’
Evaluation of Solution 2: This solution is pretty good. It is concise, but expresses the idea with
mathematical precision. Although it doesn’t directly appeal to the definition of even, it
does use a fact that we all know to be true of even numbers.
Evaluation of Solution 3: This solution is also good. It clearly uses the definition of even. It is
a bit more complicated since it uses two quantifiers, but I prefer this one slightly over the
second solution. But that may be because I didn’t come up with the second solution and I
refuse to admit that someone had a better solution than what I thought of (which was this
one).
4.88 ∀x∃y∃z(x = y^2 + z^2).
4.89 y and z; non-negative integers; a perfect square; z^2 = 2; z^2 ≤ −1; which is also impossible;
exhausted/tried all possible values of y.
4.91 Notice that the student seems to have copied part of the proof from Example 4.90, so it
looks pretty convincing. Unfortunately, the proof is incorrect. Notice that in this exercise the
universe is N, not Z^+. Why does that matter? Because in this case we cannot always choose
m = n^2 + 14n − 1 as suggested in the proof. If n = 0, this would give m = −1 which is not a
natural number, so it is not a valid choice. Therefore there is some value of n (namely, n = 0) for
which the statement is false. Since it is not true for all values of n, it is false.
4.92 It is easy to disprove this. If n = 0, then (n + 7)^2 = 49, so we need to find a natural number
m such that 49 > 49 + m. This requires we choose m < 0. But m needs to be a natural number,
so m ≥ 0, a contradiction. Thus, for n = 0, there is no m ∈ N that works. Therefore since the
statement is not true for all values of n, the statement is false.
4.95 You may have a different answer, but here is one possibility based on the hint. If we let
P (x, y) be x < y where the domain for both is the real numbers, then ∀x∃y(x < y) is true since
for any given x, we can choose y = x + 1. However, ∃y∀x(x < y) is false since no matter what
value we pick for y, x < y is false for x = y + 1. In other words, it is not true for all values of x.
As with the previous examples, the difference is that in this case we need to have a single value
of y that works for all values of x.
4.99
(a) It is saying that every integer can be written as two times another integer. Simplified, it is
saying that every integer is even.
(b) The most direct translation of the final line of the solution is “There is some integer that
cannot be written as two times another integer for any integer.” A smoothed-out translation
would be “There is at least one odd integer.”
S1 = ∅ S9 = {d}
S2 = {a} S10 = {a, d}
S3 = {b} S11 = {b, d}
S4 = {c} S12 = {c, d}
S5 = {a, b} S13 = {a, b, d}
S6 = {b, c} S14 = {b, c, d}
S7 = {a, c} S15 = {a, c, d}
S8 = {a, b, c} S16 = {a, b, c, d}
5.27 Based on the answer to Exercise 5.24, we have that P ({a, b, c, d}) = {∅, {a}, {b}, {c}, {a, b},
{b, c}, {a, c}, {a, b, c}, {d}, {a, d}, {b, d}, {c, d}, {a, b, d}, {b, c, d}, {a, c, d}, {a, b, c, d}}. Notice that
a list of these 16 sets not separated by commas and not enclosed in {} is not correct. It may have
the correct content, but it is not in the proper form.
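The list above has a useful structure: the second column is the first column with d added to every set. The following Python sketch (the function name `power_set` is our own) builds a power set the same way, one element at a time, which also shows why adding an element doubles the size.

```python
def power_set(elements):
    subsets = [frozenset()]
    for e in elements:
        # Each existing subset appears twice: once without e and once with e.
        subsets = subsets + [s | {e} for s in subsets]
    return subsets

ps = power_set(["a", "b", "c", "d"])
print(len(ps))
print([sorted(s) for s in ps])
```

The order in which the subsets come out may differ from the order used in the list above. Since the power set is itself a set, the order doesn't matter.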
5.29 (a) By Theorem 5.28, |P(A)| = 2^4 = 16. (b) Similarly, |P(P(A))| = 2^16 = 65536. (c) This
is just getting a bit ridiculous, but the answer is |P(P(P(A)))| = 2^65536.
5.30 Applying Theorem 5.28, it is not too hard to see that the power set will be twice as big
after a single element is added.
5.33 Z, or the set of (all) integers.
5.36 ∅.
5.39 A; B.
5.43 B; A.
5.47 Since no integer is both even and odd, A and B are disjoint.
5.51
Evaluation of Proof 2: Overall, this proof is very confusing and unclear. More specifically,
1. This is an attempt at working through what each set is by using the definitions. That
would be fine except for two things. First, they were asked to give a set containment
proof. Second, the wording of the proof is confusing and hard to follow. I do not come
away from this with a sense that anything has been proven.
2. They are not using the terminology properly. The terms ‘universe’ or ‘universal set’
would be appropriate, but not ‘universal’ on its own (used twice). Similarly, what does
the phrase ‘all intersection part’ mean? Also, a set doesn’t ‘return’ anything. A set is
just a set. It contains elements, but it doesn’t ‘do’ anything.
Evaluation of Proof 3: This proof contains a lot of correct elements. In fact, the first half is on
the right track. However, they jumped from x ∈ A and x ∉ B to x ∈ A ∩ B̄. Between
these statements they should say something like ‘x ∉ B is equivalent to x ∈ B̄’ since the
latter statement is really needed before they can conclude that x ∈ A ∩ B̄. Also, it would be
better if they had ‘by the definition of intersection’ before or after the statement x ∈ A ∩ B̄.
Finally, it would help clarify the proof if the end was something like ‘We have shown that
whenever x ∈ A − B, x ∈ A ∩ B̄. Thus, A − B ⊆ A ∩ B̄.’
The second half of the proof starts out well, but has serious flaws. The statement ‘This
means that x ∈ A and x ∉ B’ should be justified by the definitions of complement and
intersection, and might even involve two steps. This is the same problem they had in the
first half of the proof. More serious is the statement ‘which is what we just proved in the
previous statement’. What exactly does that mean? It is unclear how ‘what we just proved’
immediately leads us to the conclusion that A − B = A ∩ B̄. First we need to establish that
x ∈ A − B based on the previous statements (easy). Then we can say that A ∩ B̄ ⊆ A − B.
Finally, we can combine this with the first part of the proof to say that A − B = A ∩ B̄.
In summary, the first half is pretty good. It should at least make the connection between
x ∉ B and x ∈ B̄. The other suggestions clarify the proof a little, but the proof would
be O.K. if they were omitted. The second half is another story. It doesn’t really prove
anything, but instead makes a vague appeal to something that was proven before. Not only
is what they are referring to unclear, but how the proof of one direction is related to the
proof of the other direction is also unclear.
Evaluation of Solution 1: Although it is on the right track, this solution has several problems.
First, it would be better to make it more clear that the assumption is that both A and B
are not empty. But the bigger problem is the statement ‘(a, b) is in the cross product’. The
problem is that a and b are not defined anywhere. Saying ‘where a ∈ A and b ∈ B’ earlier
does not guarantee that there is such an a or b. The proof needs to say something along the
lines of ‘Since A and B are not empty, then there exist some a ∈ A and b ∈ B. Therefore
(a, b) ∈ A × B. . . ’
Evaluation of Solution 2: This one is way off. The proof is essentially saying ‘Notice that p → q.
Therefore q → p.’ But these are not equivalent statements. Although it is true that if both
A and B are the empty set, then A × B is also the empty set, this does not prove that both
A and B must be empty in order for A × B to be empty. In fact, this isn’t the correct
conclusion.
Evaluation of Solution 3: The conclusion is incorrect, as is the proof. The problem is that the
negation of ‘both A and B are empty’ is ‘it is not the case that both A and B are empty’
or ‘at least one of A or B is not empty,’ which is not the same thing as ‘neither A nor B
is empty.’ So although the proof seems to be correct, it is not. The reason it seems almost
correct is that except for this error, the rest of the proof follows proper proof techniques.
Unfortunately, all it takes is one error to make a proof invalid.
5.73 f (x) = x mod 2 works. The domain is Z, and the codomain can be a variety of things. Z,
N, and {0, 1} are the most obvious choices. Note that we can pick any of these since the only
requirement of the codomain is that the range is a subset of it. On the other hand, R, C and Q
could also all be given as the codomain, but they wouldn’t make nearly as much sense.
5.77 We never said it was always wrong to work both sides of an equation. If you are working on
an equation that you know to be true, there is absolutely nothing wrong with it. It is a problem
only when you are starting with something you don’t know to be true. In this case, we know that
2a − 3 = 2b − 3 is true given the assumption made. Therefore, we are free to ‘work both sides’.
5.78 Let a, b ∈ R. If f(a) = f(b), then 5a = 5b. Dividing both sides by 5, we get a = b. Thus, f
is one-to-one.
5.81 Notice that f(4.5) = f(4) = 4, so clearly f is not one-to-one. (Your proof may involve
different numbers, but should be this simple.)
5.84 Notice that if y = 2x + 1, then y − 1 = 2x and x = (y − 1)/2. Let b ∈ R. Then
f((b − 1)/2) = 2((b − 1)/2) + 1 = b − 1 + 1 = b. Thus, every b ∈ R is mapped to by f, so f is onto.
5.87 Since the floor of any number is an integer, there is no a such that f(a) = 4.5 (for instance).
Thus, f is not onto.
5.88 (a) f is not one-to-one. See Example 5.80 for a proof. (b) The same proof from
Example 5.80 works over the reals. But I guess it doesn’t hurt to repeat it: Since f(−1) = f(1) =
1, f is not one-to-one. (c) Let a, b ∈ N. If f(a) = f(b), that means a^2 = b^2. Taking the square
root of both sides, we obtain √(a^2) = √(b^2), or |a| = |b| (if you didn’t remember that √(x^2) = |x|, you
f is not onto, but it is one-to-one. (g) T. By definition of range, it is a subset of the codomain.
(h) F. We have seen several counterexamples to this. (i) F. If a = 2 and b = 0, the odd numbers
are not in the range. (j) F. Same counterexample as the previous question. (k) T. The proof is
similar to several previous proofs.
5.97 If f(a) = f(b), then 3a − 5 = 3b − 5. Adding 5 to both sides and then dividing both sides
by 3, we get a = b. Thus, f is one-to-one. If b ∈ R, notice that f((b + 5)/3) = 3((b + 5)/3) − 5 =
b + 5 − 5 = b, so there is some value that maps to b. Therefore, f is onto. Since f is one-to-one and
onto, it has an inverse. To find the inverse, we let y = 3x − 5. Then 3x = y + 5, so x = (y + 5)/3.
Thus, f^{−1}(x) = (x + 5)/3 (or x/3 + 5/3).
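A quick way to catch algebra slips when computing an inverse is to check that the two functions undo each other. The Python sketch below does this for f(x) = 3x − 5 and the candidate inverse (x + 5)/3 on a handful of rational inputs. The names `f`, `f_inv`, and `samples` are ours, and exact fractions are used so that rounding cannot hide a mistake.

```python
from fractions import Fraction

def f(x):
    return 3 * x - 5

def f_inv(x):
    # Candidate inverse: add 5, then divide by 3.
    return (x + 5) / Fraction(3)

# Rational sample points between -5 and 5 in steps of 1/4.
samples = [Fraction(n, 4) for n in range(-20, 21)]
print(all(f_inv(f(x)) == x and f(f_inv(x)) == x for x in samples))
```

Passing this check on sample points does not replace the proof that f is one-to-one and onto, but failing it would tell you right away that the formula is wrong.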
5.100 (f ◦ g)(x) = f (x/2) = ⌊x/2⌋, and (g ◦ f )(x) = g(⌊x⌋) = (⌊x⌋)/2.
5.105 (a) F. f might not be onto–e.g. if a = 2 and b = 0. (b) F. Same reason as the previous
question. (c) T. Since over the reals, f is one-to-one and onto. (d) F. There are several problems.
First, x^2 may not even have an inverse depending on the domain (which was not specified).
Second, even if it had an inverse, it certainly wouldn’t be 1/x^2. That’s its reciprocal, not its
inverse. Its inverse would be √x (again, assuming the domain was chosen so that it is invertible).
(e) F. This is only true if n is odd. (f) F. √2 ∉ N, so not only is it not invertible, it can’t even be
defined on N. (g) T. The nth root of a positive number is defined for all positive real numbers,
so the function is well defined. It is not too difficult to convince yourself that the function is
both one-to-one and onto when restricted to positive numbers, so it is invertible. (h) T. In both
cases you get 1/x^2. (i) F. (f ◦ g)(x) = f(x + 1) = (x + 1 + 1)^2 = (x + 2)^2 = x^2 + 4x + 4, and
(g ◦ f)(x) = g((x + 1)^2) = (x + 1)^2 + 1 = x^2 + 2x + 2, which are clearly not the same. (j) F.
(f ◦ g)(x) = ⌈x⌉, and (g ◦ f)(x) = ⌊x⌋. (We’ll leave it to you to see why this is the case.) (k) F.
Certainly not. f(3.5) = 3, but g(3) = 3, not 3.5. (l) T. With the restricted domain, they are
indeed inverses.
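Part (i) is easy to experiment with. In the Python sketch below, `compose` is a small helper of our own, and f and g are the functions from part (i), namely f(x) = (x + 1)^2 and g(x) = x + 1 as read off from the computation above.

```python
def compose(outer, inner):
    # Returns the function x -> outer(inner(x)).
    return lambda x: outer(inner(x))

def f(x):
    return (x + 1) ** 2

def g(x):
    return x + 1

f_after_g = compose(f, g)   # (f ◦ g)(x) = (x + 2)^2
g_after_f = compose(g, f)   # (g ◦ f)(x) = (x + 1)^2 + 1

print([(x, f_after_g(x), g_after_f(x)) for x in range(3)])
```

One input on which the two compositions differ is all it takes to show that f ◦ g and g ◦ f are different functions.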
5.113 The following three cases probably make the most sense: When a = b, when a < b and
when a > b. These make sense because these are likely different cases in the code. Mathematically,
we can think of it as follows. The possible inputs are from the set Z × Z. The partition we have
in mind is A = {(a, a) : a ∈ Z}, B = {(a, b) : a, b ∈ Z, a < b}, and C = {(a, b) : a, b ∈ Z, a > b}.
Convince yourself that these sets form a partition of Z × Z. That is, they are all disjoint from
each other and Z × Z = A ∪ B ∪ C.
Alternatively, you might have thought in terms of a and/or b being positive, negative, or 0.
Although that may make some sense, given that we are comparing a and b with each other, it
probably doesn’t matter exactly what values a and b have (i.e. whether they are positive, negative,
or 0), but what values they have relative to each other. That is why the first answer is much
better. With that being said, it wouldn’t hurt to include several tests for each of our three cases
that involve various combinations of positive, negative, and zero values.
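To see the partition in action, here is a small Python sketch. The function `classify` and the labels "A", "B", "C" are ours, chosen to match the three sets above. It assigns each pair to exactly one block and counts the pairs in each block over a small grid of test inputs that mixes negative, zero, and positive values.

```python
def classify(a, b):
    # Exactly one of the three branches applies to any pair of integers.
    if a == b:
        return "A"
    if a < b:
        return "B"
    return "C"

pairs = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
counts = {}
for a, b in pairs:
    label = classify(a, b)
    counts[label] = counts.get(label, 0) + 1
print(counts)
```

Because `classify` returns exactly one label for every pair, the three blocks are disjoint and together cover every input, which is what makes them a partition.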
5.114 Did you define two or more subsets of Z? Are they all non-empty? Do none of them
intersect with each other? If you take the union of all of them, do you get Z? If so, your answer
is correct! If not, try again.
5.116 Since R = Q ∪ I and Q ∩ I = ∅, {Q, I} is a partition of R. Hopefully this comes as no
surprise.
5.121 R is a subset of Z × Z, so it is a relation. By the way, this relation should look familiar.
Did you read the solution to Exercise 5.113?
5.122 Is it a subset of Z^+ × Z^+? It is. So it is a relation on Z^+.
5.124 (a) T is not reflexive since you cannot be taller than yourself. (b) N is reflexive because
everybody’s name starts with the same letter as their name does. (c) C is reflexive because
everybody has been to the same city as they have been to. (d) K is not reflexive because you
know who you are, so it is not the case that you don’t know who you are. That is, (a, a) ∉ K for
any a. (e) R is not reflexive because (Donald Knuth, Donald Knuth) (for instance) is not in the
relation.
5.126 (a) T is not symmetric since if a is taller than b, b is clearly not taller than a. (b) N
is symmetric since if a’s name starts with the same letter as b’s name, clearly b’s name starts
with the same letter as a’s name. (c) C is symmetric since it is worded such that it doesn’t
distinguish between the first and second item in the pair. In other words, if a and b have been
to the same city, then b and a have been to the same city. (d) K is not symmetric since
(David Letterman, Chuck Cusack) ∈ K, but (Chuck Cusack, David Letterman) ∉ K. (e) R is not
symmetric since (Barack Obama, George W. Bush) ∈ R, but (George W. Bush, Barack Obama) ∉
R.
5.128 (a) Just knowing that (1, 1) ∈ R is not enough to tell either way. (b) On the other hand,
if (1, 2) and (2, 1) are both in R, it is clearly not anti-symmetric.
5.129 This is just the contrapositive of the original definition.
5.130 (a) T is anti-symmetric since whenever a ≠ b, if a is taller than b, then b is not taller
than a, so if (a, b) ∈ T, then (b, a) ∉ T. (b) N is not anti-symmetric since (Bono, Boy George)
and (Boy George, Bono) are both in N. (c) C is not anti-symmetric since (Bono, The Edge)
and (The Edge, Bono) are both in C (since they have played many concerts together, they
have certainly been in the same city at least once). (d) K is not anti-symmetric because both
(Dirk Benedict, Jon Blake Cusack 2.0) and (Jon Blake Cusack 2.0, Dirk Benedict) are in K. (e)
R is anti-symmetric since it only contains one element, (Barack Obama, George W. Bush), and
(George W. Bush, Barack Obama) ∉ R.
5.131 (a) No. The relation R = {(1, 2), (2, 1), (1, 3)} is neither symmetric ((3, 1) ∉ R) nor
anti-symmetric ((1, 2) and (2, 1) are both in R). (b) No. For example, R from answer (a) is not
anti-symmetric, but isn’t symmetric either. (c) Yes. If you answered incorrectly, don’t worry.
You get to think about why the answer is ‘yes’ in the next exercise.
5.132 Many answers will work, but they all have the same thing in common: They only contain
‘diagonal’ elements (but not necessarily all of the diagonal elements). For instance, let R =
{(a, a) : a ∈ Z}. Go back to the definitions for symmetric and anti-symmetric and verify that this
is indeed both. Another example is R = {(Ken, Ken)} on the set of English words.
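For finite relations, these definitions translate almost word for word into code. The following Python sketch (the function names and the small relation `D` are ours) represents a relation as a set of pairs and checks the relation from 5.131 as well as a small ‘diagonal’ relation.

```python
def is_symmetric(rel):
    # Whenever (a, b) is in the relation, (b, a) must be too.
    return all((b, a) in rel for (a, b) in rel)

def is_antisymmetric(rel):
    # Whenever both (a, b) and (b, a) are in the relation, a must equal b.
    return all(a == b for (a, b) in rel if (b, a) in rel)

R = {(1, 2), (2, 1), (1, 3)}   # the relation from 5.131
D = {(1, 1), (2, 2)}           # contains only diagonal elements

print(is_symmetric(R), is_antisymmetric(R))
print(is_symmetric(D), is_antisymmetric(D))
```

Notice that the empty relation passes both tests as well, since there is nothing to check.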
5.134 (a) T is transitive since if a is taller than b, and b is taller than c, clearly a is taller
than c. In other words (a, b) ∈ T and (b, c) ∈ T implies that (a, c) ∈ T. (b) N is transitive
because if a’s name starts with the same letter as b’s name, and b’s name starts with the same
letter as c’s name, clearly it is the same letter in all of them, so a’s name starts with the same
letter as c’s. (c) C is not transitive. You might think a similar argument as in (a) and (b)
works here, but it doesn’t. The proof from (b) works because names start with a single letter,
so transitivity holds. But if (a, b) ∈ C and (b, c) ∈ C, it might be because a and b have both
been to Chicago, and b and c have both been to New York, but that a has never been to New
York. In this case, (a, c) ∉ C. So C is not transitive. (d) K is not transitive. For instance,
(David Letterman, Chuck Cusack) ∈ K and (Chuck Cusack, David Letterman’s son) ∈ K, but
(David Letterman, David Letterman’s son) ∉ K since I sure hope he knows his own son. (e) R is
transitive since there aren’t even any a, b, and c such that (a, b) and (b, c) are both in R, so it holds
vacuously.
5.137 (a) T is not an equivalence relation since it is not symmetric. (b) N is an equivalence
relation since it is reflexive, symmetric, and transitive. (c) C is not an equivalence relation since
it is not transitive. (d) K is not an equivalence relation since it is not reflexive, symmetric, or
transitive. This one isn’t even close! (e) R is not an equivalence relation since it is not reflexive.
5.139 (a) T is not a partial order because it is not reflexive. (b) N is not a partial order since
it is not anti-symmetric. (c) C is not a partial order since it is not anti-symmetric or transitive.
(d) K is not a partial order since it is not reflexive, anti-symmetric, or transitive. (e) R is not a
partial order since it is not reflexive.
5.140 In the following, A, B, and C are elements of X. As such, they are sets.
(Reflexive) Since A ⊆ A, (A, A) ∈ R, so R is reflexive.
(Anti-symmetric) If (A, B) ∈ R and (B, A) ∈ R, then we know that A ⊆ B and B ⊆ A. By
Theorem 5.48, this implies that A = B. Therefore R is anti-symmetric.
(Transitive) If (A, B) ∈ R and (B, C) ∈ R, then A ⊆ B and B ⊆ C. But the definition of ⊆
implies that A ⊆ C, so (A, C) ∈ R, and R is transitive.
Since R is reflexive, anti-symmetric, and transitive, it is a partial order.
5.141 (a) Since (1, 1) ∉ R, R is not reflexive. (b) Since (1, 2) ∈ R, but (2, 1) ∉ R, R is not
symmetric. (c) A careful examination of the elements reveals that it is anti-symmetric. (d) A
careful examination of the elements reveals that it is transitive. (e) Since it is not reflexive or
symmetric, it is not an equivalence relation. (f) Since it is not reflexive, it is not a partial order.
5.143 ((a, b), (a, b)); bc; da; ((c, d), (a, b)); symmetric; ad = bc; cf = de; de/f ; b(de/f ); af = be;
((a, b), (e, f ))
6.3 (a) x_0 = 1 + (−2)^0 = 1 + 1 = 2 (b) x_1 = 1 + (−2)^1 = 1 − 2 = −1 (c) x_2 = 1 + (−2)^2 = 1 + 4 = 5
(d) x_3 = 1 + (−2)^3 = 1 − 8 = −7 (e) x_4 = 1 + (−2)^4 = 1 + 16 = 17
6.4 We will just provide the final answer for these. If you can’t get these answers, you may need
to brush up on your algebra skills. (a) 2, 1/2, 5/4, 7/8, 17/16; (b) 2, 2, 3, 7, 25; (c) 1/3, 1/5,
1/25, 1/119, 1/721; (d) 2, 9/4, 64/27, 625/256, 7776/3125
6.8 Notice that x_0 = 1, x_1 = 5 · 1 = 5, x_2 = 5 · 5 = 5^2, x_3 = 5 · 5^2 = 5^3, etc. Looking back, we can
see that 1 = 5^0, so x_0 = 5^0. Also, x_1 = 5 = 5^1. So it seems likely that the solution is x_n = 5^n.
This is not a proof, though!
6.9 Notice that x_0 = 1, x_1 = 1 · 1 = 1, x_2 = 2 · 1 = 2, x_3 = 3 · 2 = 6, x_4 = 4 · 6 = 24,
x_5 = 5 · 24 = 120, etc. Written this way, no obvious pattern is emerging. Sometimes how you
write the numbers matters. Let’s try this again: x_1 = 1 · 1 = 1!, x_2 = 2 · 1 = 2!, x_3 = 3 · 2 · 1 = 3!,
x_4 = 4 · 3 · 2 · 1 = 4!, x_5 = 5 · 4 · 3 · 2 · 1 = 5!, etc. Now we can see that x_n = n! is a likely solution.
Again, this isn’t a proof.
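Computing terms by hand gets tedious, so it can help to let a program iterate the recurrence and compare against your guess. The Python sketch below does this for the two recurrences x_n = 5 · x_{n−1} and x_n = n · x_{n−1}, both with x_0 = 1. The helper `iterate` and the variable names are our own.

```python
from math import factorial

def iterate(first, step, count):
    # Builds [x_0, x_1, ..., x_{count-1}] where x_n = step(n, x_{n-1}).
    terms = [first]
    for n in range(1, count):
        terms.append(step(n, terms[-1]))
    return terms

powers = iterate(1, lambda n, prev: 5 * prev, 8)
factorials = iterate(1, lambda n, prev: n * prev, 8)

print(powers == [5 ** n for n in range(8)])
print(factorials == [factorial(n) for n in range(8)])
```

Matching on the first eight terms makes the guesses more believable, but it is still not a proof.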
6.10 Their calculations are correct (Did you check them with a calculator? You should have!
How else can you tell whether or not their solution is correct?). So it does seem like a_n = 2^n is
the correct solution. However,
a_5 = ⌊((1 + √5)/2) × a_4⌋ + a_3 = ⌊((1 + √5)/2) × 16⌋ + 8 = 33 ≠ 2^5,
so the solution that seems ‘obvious’ turns out to be incorrect. We won’t give the actual solution
since the point of this example is to demonstrate that just because a pattern holds for the first
several terms of a sequence, it does not guarantee that it holds for the whole sequence.
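You can watch the pattern break with a few lines of code. The Python sketch below assumes the recurrence is a_n = ⌊((1 + √5)/2) · a_{n−1}⌋ + a_{n−2}, which is how we read the computation of a_5 above, and it assumes the initial values a_0 = 1 and a_1 = 2. Those initial values are not stated in this solution and are chosen here only because they agree with a_n = 2^n. The names `PHI` and `sequence` are ours.

```python
from math import floor, sqrt

PHI = (1 + sqrt(5)) / 2

def sequence(count, a0=1, a1=2):
    # Assumed recurrence: a_n = floor(PHI * a_{n-1}) + a_{n-2}.
    terms = [a0, a1]
    while len(terms) < count:
        terms.append(floor(PHI * terms[-1]) + terms[-2])
    return terms[:count]

terms = sequence(6)
print(terms)
print([2 ** n for n in range(6)])
```

The two lists agree up to index 4 and differ at index 5, which is the whole point of the exercise.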
6.12 Hopefully you came up with the solution x_n = 5^n. Since x_0 = 1 = 5^0, it works for the
initial condition. If we plug this back into the right hand side of x_n = 5 · x_{n−1}, we get
5 · x_{n−1} = 5 · 5^{n−1}
= 5^n
= x_n,
condition. If we plug this back into the right hand side of x_n = n · x_{n−1}, we get
n · x_{n−1} = n · (n − 1)!
= n!
= x_n,
the last step since 1/(n(n + 1)) < 1 when n ≥ 1. Therefore, x_{n+1} − x_n > 0, so x_{n+1} > x_n, i.e., the
sequence is strictly increasing. If your solution is significantly different from this, make sure you
determine one way or another if it is correct.
6.22 We could go into much more detail than we do here, and hopefully you did when you
wrote down your solutions. But we’ll settle for short, informal arguments this time. (a) This
is just a linear function. It is strictly increasing. (b) Since this keeps going from positive to
negative to positive, etc., it is non-monotonic. (c) We know that n! is strictly increasing. Since
this is the reciprocal of that function, it is almost strictly decreasing (since we are dividing by a
number that is getting larger). However, since 1/0! = 1/1! = 1, it is just decreasing. (d) This is
getting closer to 1 as n increases. It is strictly increasing. (e) This is n(n − 1). x_1 = 0, x_2 = 2,
x_3 = 6, etc. Each term is multiplying two numbers that are both getting larger, so it is strictly
increasing. (f) This is similar to the previous one, but x_0 = x_1 = 0, so it is just increasing.
(g) This alternates between −1 and 1, so it is non-monotonic. (h) Each term subtracts from
1 a smaller number than the last term, so it is strictly increasing. (i) Each term adds to 1 a
smaller number than the last term, so it is strictly decreasing.
6.26 You should have concluded that a = −2/3^17 and that r = (2/3^16)/(−2/3^17) = −3^17/3^16 = −3 (or
you could have divided the second and third terms). Then the n-th term is −(2/3^17)(−3)^{n−1} = 2(−1)^n/3^{18−n}
(Make sure you can do the algebra to get to this simplified form). Finally, the 17th term is
2(−1)^17/3^{18−17} = −2/3.
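If the algebra with all of those powers of 3 makes you nervous, exact rational arithmetic can confirm it. The Python sketch below takes a = −2/3^17 and r = −3, and compares the n-th term a · r^{n−1} with the simplified form 2(−1)^n/3^{18−n}. The function names `nth_term` and `simplified` are ours.

```python
from fractions import Fraction

a = Fraction(-2, 3 ** 17)   # first term
r = Fraction(-3)            # common ratio

def nth_term(n):
    # Terms are numbered starting at n = 1.
    return a * r ** (n - 1)

def simplified(n):
    # The simplified closed form, valid as written for 1 <= n <= 18.
    return Fraction(2 * (-1) ** n, 3 ** (18 - n))

print(nth_term(17))
print(all(nth_term(n) == simplified(n) for n in range(1, 18)))
```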
6.28 We are given that ar^5 = 20 and ar^9 = 320. Dividing, we can see that r^4 = 16. Thus
r = ±2. (We don’t have enough information to know which it is). Since ar^5 = 20, we know
that a = 20/r^5 = ±20/32. So the third term is ar^2 = (±20/32)(±2)^2 = ±80/32 = ±5/2. Thus
|ar^2| = 5/2.
6.32
(a) The difference between each consecutive pair of the first 4 terms of the sequence is 7, so it appears to be
an arithmetic sequence. Doing a little math, the correct answer appears to be (d) 51.
(b) Although the sequence appears to be arithmetic, we cannot be certain that it is. If you are told
it is arithmetic, then 51 is absolutely the correct answer. Notice that the previous example
specifically stated that you should assume that the pattern continues. This one did not.
Without being told this, the rest of the sequence could be anything. The 8th term could be
0 or 8,675,309 for all we know. Of the choices given, 51 is the most obvious choice, but any
of the answers could be correct. This is one reason I hate these sorts of questions on tests.
Although I think it is important to point out the flaw in these sorts of questions, it is also
important to conform to the expectations when answering such questions on standardized
tests. In other words, instead of disputing the question (as some students might be inclined
to do), just go with the obvious interpretation.
6.33 (a) The closed form was x_n = 5^n, which is clearly geometric (with a = 1 and r = 5) and not
arithmetic. (b) Since the solution for this one is x_n = n!, this is neither arithmetic nor geometric.
(c) Since the sequence is essentially f_n = 2n + 3, with initial condition f_0 = 3, it is an arithmetic
sequence. It is clearly not geometric.
6.36 ∑_{i=0}^{100} y^i
6.38 ∑_{i=0}^{50} (y^2)^i or ∑_{i=0}^{50} y^{2i}
6.40 (a) 2 (b) 11 (c) 100 (d) 101
6.45 (a) ∑_{k=5}^{6} 5 = 5 ∑_{k=5}^{6} 1 = 5 · 2 = 10. (b) ∑_{k=20}^{30} 200 = 200 ∑_{k=20}^{30} 1 = 200(30 − 20 + 1) = 2200.
6.48 Using Theorem 6.46, we get the following answers: (a) (30 − 20 + 1)200 = 11 · 200 = 2200.
(b) 900 (c) 909. Notice that this one has one more term than the previous one. The fact that the
additional index is 0 doesn’t matter since it is adding 9 for that term.
6.49 This solution contains an ‘off by one’ error. The correct answer is 10(75 − 25 + 1) = 10 · 51 =
510.
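The off-by-one error comes from miscounting the terms: a sum whose index runs from 25 to 75 has 75 − 25 + 1 = 51 terms, not 50. The Python sketch below (the function name `constant_sum` is ours) computes the sum of a constant with the counting formula and compares it with adding the terms one at a time.

```python
def constant_sum(c, first, last):
    # The number of terms is last - first + 1, not last - first.
    return c * (last - first + 1)

print(constant_sum(10, 25, 75))
print(sum(10 for k in range(25, 76)))   # range(25, 76) covers k = 25, ..., 75
```

Notice that Python's `range` excludes its upper end, which is itself a common source of off-by-one errors.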
6.52 (a) 20 · 21/2 = 210 (b) 100 · 101/2 = 5050 (c) 1000 · 1001/2 = 500500
6.53
Evaluation of Solution 1: Another example of the ‘off by one error’. They are using the formula
n(n − 1)/2 instead of n(n + 1)/2.
Evaluation of Solution 2: This answer doesn’t even make sense. What is k in the answer? k is
just an index of the summation. The index should never appear in the answer. The problem
is that you can’t pull the k out of the sum since each term in the sum depends on it.
6.54 It is true. The additional term that the sum adds is 0, so the sum is the same whether it
starts at 0 or at 1.
6.57 $\sum_{i=1}^{100} (2 - i) = \sum_{i=1}^{100} 2 - \sum_{i=1}^{100} i = 200 - \frac{100 \cdot 101}{2} = 200 - 5050 = -4850$.
6.58 The sum of the first $n$ odd integers is
\[ \sum_{k=1}^{n} (2k - 1) = \sum_{k=1}^{n} 2k - \sum_{k=1}^{n} 1 = 2\sum_{k=1}^{n} k - \sum_{k=1}^{n} 1 = 2 \cdot \frac{n(n+1)}{2} - n = n^2 + n - n = n^2. \]
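A quick numerical check can make the identity above feel less abstract. The following Java sketch (class and method names are ours) sums the first n odd integers by brute force so the result can be compared with n². A finite check like this is only a sanity test, not a proof.

```java
// Hypothetical names; a sanity check, not a proof.
final class OddSum {
    // Sums (2k - 1) for k = 1, 2, ..., n by brute force.
    static long sumFirstOdds(int n) {
        long sum = 0;
        for (int k = 1; k <= n; k++) {
            sum += 2L * k - 1;
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 5; n++) {
            // Print the brute-force sum next to n squared.
            System.out.println(n + ": " + sumFirstOdds(n) + " vs " + (long) n * n);
        }
    }
}
```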
6.61 (a) $\sum_{k=10}^{20} k = \sum_{k=1}^{20} k - \sum_{k=1}^{9} k = 20 \cdot 21/2 - 9 \cdot 10/2 = 210 - 45 = 165$.
(b) $\sum_{k=21}^{40} k = \sum_{k=1}^{40} k - \sum_{k=1}^{20} k = 40 \cdot 41/2 - 20 \cdot 21/2 = 820 - 210 = 610$.
6.62
Evaluation of Solution 1: Another example of the off-by-one error. The second sum should end
at 29, not 30.
Evaluation of Solution 2: This one has two errors, one of which is repeated twice. It has the
same error as the previous solution, but it also uses the incorrect formula for each of the
sums (the off-by-one error).
6.63 Two errors are made that cancel each other out. The first error is that the second sum in
the second step should go to 29, not 30. But in the computation of that sum in the next step,
the formula n(n − 1)/2 is used instead of n(n + 1)/2 (The correct formula was used for the first
sum). This is a rare case where an off-by-one error is followed by the opposite off-by-one error
that results in the correct answer.
It should be emphasized that even though the correct answer is obtained, this is an incorrect
solution. They obtained the correct answer by sheer luck.
6.65 There are two ways to answer this. The smart aleck answer is ‘because it is correct.’ But
why is it correct with 2, and couldn’t it be slightly modified to work with 1 or 0? The answer is
no because if you plug 1 or 0 into $\frac{1}{(k-1)k}$, you get a division by 0. Hopefully I don’t need to tell
you that this is a bad thing.
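6.66
\begin{align*}
\sum_{k=1}^{n} (k^3 + k) &= \sum_{k=1}^{n} k^3 + \sum_{k=1}^{n} k \\
&= \frac{n^2(n+1)^2}{4} + \frac{n(n+1)}{2} \\
&= \frac{n(n+1)}{2}\left(\frac{n(n+1)}{2} + 1\right) \\
&= \frac{n(n+1)}{2}\left(\frac{n^2 + n + 2}{2}\right) \\
&= \frac{n(n+1)(n^2 + n + 2)}{4}
\end{align*}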
6.68 (a) $\sum_{i=1}^{n} \sum_{j=1}^{i} 1 = \sum_{i=1}^{n} i = \frac{n(n+1)}{2}$.
(b) $\sum_{i=1}^{n} \sum_{j=1}^{i} j = \sum_{i=1}^{n} \frac{i(i+1)}{2} = \cdots = \frac{n(n+1)(n+2)}{6}$. (This one involves doing a little algebra,
applying two formulas, and then doing a little more algebra. Make sure you work it out until you
get this answer.)
(c) $\sum_{i=1}^{n} \sum_{j=1}^{n} ij = \sum_{i=1}^{n} \left( i \sum_{j=1}^{n} j \right) = \sum_{i=1}^{n} i \, \frac{n(n+1)}{2} = \frac{n(n+1)}{2} \sum_{i=1}^{n} i = \frac{n(n+1)}{2} \cdot \frac{n(n+1)}{2} = \frac{n^2(n+1)^2}{4}$.
6.72 $\frac{3^{50} - 1}{2} = 358948993845926294385124$.
6.73 This is equivalent to $\sum_{k=0}^{34} (-2)^k$, so the summation is $(1 - (-2)^{35})/(1 - (-2)) = (1 - (-1)^{35} 2^{35})/3 = (1 + 2^{35})/3 = 11453246123$.
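Sums like these are too large to check comfortably by hand, and too large for an int. As a hedged sketch (the class and method names are ours), the following Java code uses BigInteger to add the terms of a geometric sum one at a time and also to evaluate the closed form, so the two can be compared.

```java
import java.math.BigInteger;

// Hypothetical names; integer ratio r with r != 1 is assumed for the closed form.
final class GeometricSum {
    // Adds r^0 + r^1 + ... + r^n term by term.
    static BigInteger byLoop(long r, int n) {
        BigInteger base = BigInteger.valueOf(r);
        BigInteger term = BigInteger.ONE;   // current power of r
        BigInteger sum = BigInteger.ZERO;
        for (int k = 0; k <= n; k++) {
            sum = sum.add(term);
            term = term.multiply(base);
        }
        return sum;
    }

    // Closed form (1 - r^(n+1)) / (1 - r); the division is exact for integer r != 1.
    static BigInteger byFormula(long r, int n) {
        BigInteger base = BigInteger.valueOf(r);
        BigInteger numerator = BigInteger.ONE.subtract(base.pow(n + 1));
        BigInteger denominator = BigInteger.ONE.subtract(base);
        return numerator.divide(denominator);
    }

    public static void main(String[] args) {
        System.out.println(byLoop(-2, 34));    // the sum from Exercise 6.73
        System.out.println(byFormula(-2, 34)); // should print the same value
    }
}
```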
6.74 (a) $\frac{1 - y^{101}}{1 - y}$ or $\frac{y^{101} - 1}{y - 1}$ (We won’t give the alternatives for the rest. If your answer differs, do
some algebra to make sure it is equivalent.) (b) $\frac{1 - (-y)^{101}}{1 - (-y)} = \frac{1 + y^{101}}{1 + y}$ (c) $\frac{1 - y^{102}}{1 - y^2}$.
6.77 $x^5 - 1 = (x - 1)(x^4 + x^3 + x^2 + x + 1)$.
6.78 $2^1 + 2^2 + 2^3 + \cdots + 2^{n+1}$; $2^0$; $2^{n+1}$; $2^{n+1} - 2^0$
6.80 $\sum_{k=0}^{n} a r^k$; $a \, \frac{1 - r^{n+1}}{1 - r}$.
6.81 Let $S = a + ar + ar^2 + \cdots + ar^n$. Then $rS = ar + ar^2 + \cdots + ar^{n+1}$, so
\begin{align*}
S - rS &= a + ar + ar^2 + \cdots + ar^n - ar - ar^2 - \cdots - ar^{n+1} \\
&= a - ar^{n+1}.
\end{align*}
7.6 (a) Since we assumed that $n \ge 1$, $-3n$ is certainly negative. In other words, $-3n \le 0$. That’s
why in the first step we could say that $5n^2 - 3n + 20 \le 5n^2 + 20$. (b) We used the fact that
$20 \le 20n^2$ whenever $n \ge 1$. If either of these solutions is not clear to you, you need to brush up
on your algebra.
7.7 This is incorrect. It is not true that $-12n \le -12n^2$ when $n \ge 1$. (If this isn’t clear to you
after thinking about it for a few minutes, you may need to do some algebra review.) In fact, that
error led to the statement $4n^2 - 12n + 10 \le 2n^2$, which cannot possibly be true as $n$ gets larger
since it would require that $2n^2 - 12n + 10 \le 0$. This is not true as $n$ gets larger. In fact, when
$n = 10$, for instance, it is clearly not true. But it is true that $-12n < 0$ when $n \ge 1$, so instead
of replacing it with $-12n^2$, it should be replaced with 0 as in previous examples.
7.8 (a) Sure. Add the final step of $25n^2 \le 50n^2$ to the algebra in the proof. In fact, any number
above 25 can easily be used. Some values under 25 can also be used, but they would require a
modification of the algebra used in the proof. The bottom line is that there is generally no ‘right’
value to use for $c$. If you find a value that works, then it’s fine. (b) Clearly not. For this to work,
we would need $5n^2 - 3n + 20 < 2n^2$ to hold as $n$ increases towards $\infty$. But this would imply that
$3n^2 - 3n + 20 < 0$. But when $n \ge 1$, $3n^2$ is positive and at least as large as $3n$, so $3n^2 - 3n + 20 > 0$. (c)
Sure. The proof used the fact that the inequality is true when $n \ge 1$, so it is clearly also true if
$n \ge 100$. And the definition of Big-O does not require that we use the smallest possible value for
$n_0$. (d) No. We would need a constant $c$ such that $5 \cdot 0^2 - 3 \cdot 0 + 20 = 20 \le 0 = c \cdot 0^2$, which is
clearly impossible.
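The answers above can be spot-checked numerically. The Java sketch below (class and method names are ours) tests whether $5n^2 - 3n + 20 \le c \cdot n^2$ holds for every n from n0 up to some limit. Passing such a check does not prove a Big-O bound, since only finitely many values are tried, but a failure does show that a particular pair (c, n0) cannot work.

```java
// Hypothetical names; a finite spot check, not a proof of a Big-O bound.
final class BigOWitness {
    // The function from the exercise: f(n) = 5n^2 - 3n + 20.
    static long f(long n) {
        return 5 * n * n - 3 * n + 20;
    }

    // Returns true if f(n) <= c * n^2 for every integer n in [n0, limit].
    static boolean holds(long c, long n0, long limit) {
        for (long n = n0; n <= limit; n++) {
            if (f(n) > c * n * n) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holds(25, 1, 10000));   // c = 25, n0 = 1
        System.out.println(holds(50, 1, 10000));   // part (a): c = 50
        System.out.println(holds(2, 1, 10000));    // part (b): c = 2
        System.out.println(holds(25, 100, 10000)); // part (c): n0 = 100
        System.out.println(holds(25, 0, 10000));   // part (d): n0 = 0
    }
}
```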
7.9 If n ≥ 1,
7.25
Evaluation of Solution 1: Although this proof sounds somewhat reasonable, it is way too informal
and convoluted. Here are some of the problems.
1. This student misunderstands the concept behind ‘ignoring the constants.’ We can
ignore the constants after we know that f (n) = O(g(n)). We can’t ignore them in
order to prove it.
2. The phrase ‘become irrelevant’ (used twice) is not precise. We have developed mathe-
matical notation for a reason—it allows us to make statements like these precise. It’s
kind of like saying that a certain car costs ‘a lot’. What is ‘a lot’ ? Although $30,000
might be a lot for most of us, people with a lot more money than I have might not
think that $500,000 is a lot.
3. The phrase ‘This leaves us with $n^k + n^{k-1} + \cdots + n = O(n^k)$’ is odd. What precisely
do they mean? That this is true or that this is what we need to prove now? In either
case, it is incorrect. Similarly for the second time they use the phrase ‘This leaves us
with’.
4. The second half of the proof is unnecessarily convoluted. They essentially are claiming
that their proof has boiled down to showing that $n^k = O(n^k)$. To prove this, they use
an incredibly drawn out, yet vague, explanation that is in a single unnecessarily long
sentence. Why are they even bringing Θ and Ω into this proof? Why don’t they just
say something like ‘since $n^k \le 1 \cdot n^k$ for all $n \ge 1$, $n^k = O(n^k)$’? I believe the answer is
obvious: they don’t really understand what they are doing here. They clearly have a
vague understanding of the notation, but they don’t understand the formal definition.
The bottom line is that this student understands that the statement they needed to prove
is correct, and they have a vague sense of why it is true, but they did not have a clear
understanding of how to use the definition of Big-O to prove it. The most important thing
to take away from this example is this: Be precise, use the notation and definitions you have
learned, and if your proofs look a lot different than those in the book, you might question
whether or not you are on the right track.
7.26 We cannot say anything about the relative growth rates of $f(n)$ and $g(n)$ because we are
only given upper bounds for each. It is possible that $f(n) = n^2$ and $g(n) = n$, so that $f(n)$ grows
faster, or vice-versa. They could also both be $n$.
7.27 (a) F. This is saying that $f(n)$ grows no faster than $g(n)$. (b) F. They grow at the same
rate. (c) F. $f(n)$ might grow slower than $g(n)$. For instance, $f(n) = n$ and $g(n) = n^2$. (d) F. They
might grow at the same rate. For instance, $f(n) = g(n) = n$. (e) F. If $f(n) = n$ and $g(n) = n^2$,
$f(n) = O(g(n))$, but $f(n) \ne \Omega(g(n))$. (f) T. By Theorem 7.18. (g) F. If $f(n) = n$ and $g(n) = n^2$,
$f(n) = O(g(n))$, but $f(n) \ne \Theta(g(n))$. (h) F. If $f(n) = n$ and $g(n) = n^2$, $f(n) = O(g(n))$, but
$g(n) \ne O(f(n))$.
7.37 $c_1 g(n) \le f(n) \le c_2 g(n)$ for all $n \ge n_0$; $\frac{1}{c_2} f(n)$; $\frac{1}{c_1} f(n)$; $c_3 h(n) \le g(n) \le c_4 h(n)$ for all $n \ge n_1$;
$c_2$; $c_2 c_4$; $\max\{n_0, n_1\}$; $c_1 c_3 h(n)$; $c_2 c_4 h(n)$; $\Theta(h(n))$; $\Theta$; transitive.
7.39 (a) T. By Theorem 7.36. (b) T. By Theorem 7.18. (c) T. By Theorem 7.32. (d) F. The
backwards implication is true, but the forward one is not. For instance, if $f(n) = n$ and $g(n) = n^2$,
clearly $f(n) = O(g(n))$, but $f(n) \ne \Theta(g(n))$. (e) F. Neither direction is true. For instance, if
$f(n) = n$ and $g(n) = n^2$, $f(n) = O(g(n))$, but $g(n) \ne O(f(n))$. (f) T. By Theorem 7.36. (g) T.
By Theorem 7.18. (h) T. By Theorem 7.28.
7.42 $c_1 n^2$; $c_2 n^2$; $\frac{1}{2} - \frac{3}{n}$; $\frac{10 - 6}{20} = \frac{1}{5}$; $\frac{1}{5} n^2$; $\frac{1}{2} n^2$; 10.
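7.43 There are a few ways to think about this. First, the larger $n$ is, the smaller $\frac{3}{n}$ is, so a
smaller amount is being subtracted. But that’s perhaps too fuzzy. Let’s look at it this way:
\[ n \ge 10 \Rightarrow \frac{10}{3} \le \frac{n}{3} \Rightarrow \frac{3}{n} \le \frac{3}{10} \Rightarrow -\frac{3}{n} \ge -\frac{3}{10} \Rightarrow \frac{1}{2} - \frac{3}{n} \ge \frac{1}{2} - \frac{3}{10}. \]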
7.45 (a) Theorem 7.18. (b) Absolutely not! Theorem 7.18 requires that we also prove $f(n) = \Omega(g(n))$.
Here is a counterexample: $n = O(n^2)$, but $n \ne \Theta(n^2)$. So $f(n) = O(g(n))$ does not
imply that $f(n) = \Theta(g(n))$.
7.46 Notice that when $n \ge 1$, $n! = 1 \cdot 2 \cdot 3 \cdots n \le n \cdot n \cdots n = n^n$. Therefore $n! = O(n^n)$. (We
used $n_0 = 1$ and $c = 1$.)
7.49 If $f(x) = O(g(x))$, then there are positive constants $c_1$ and $n_0'$ such that
and if $g(x) = O(h(x))$, then there are positive constants $c_2$ and $n_0''$ such that
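7.63 Notice that $\lim_{n\to\infty} \frac{n(n+1)/2}{n^2} = \lim_{n\to\infty} \frac{n^2 + n}{2n^2} = \lim_{n\to\infty} \left( \frac{1}{2} + \frac{1}{2n} \right) = \frac{1}{2} + 0 = \frac{1}{2}$, so $n(n+1)/2 = \Theta(n^2)$.
7.64 (a) Since $\lim_{x\to\infty} \frac{2^x}{3^x} = \lim_{x\to\infty} \left( \frac{2}{3} \right)^x = 0$, the result follows.
(b) If $x \ge 1$, then clearly $(3/2)^x \ge 1$, so $2^x \le 2^x \left( \frac{3}{2} \right)^x = \left( \frac{2 \times 3}{2} \right)^x = 3^x$. Therefore, $2^x = O(3^x)$.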
7.70
Evaluation of Proof 1: That $7^x$ grows faster than $5^x$ does not mean $7^x - 5^x > 0$ for all $x \ne 0$. For one
thing, we are really only concerned about positive values of x. Further, we are specifically
concerned about very large values of x. In other words, we want something to be true for all
x that are ‘large enough’. Also, this statement does not take into account constant factors.
Similarly, a tight bound does not imply that $7^x - 5^x = 0$. The bottom line: This one is
way off. They are not conveying an understanding of what ‘upper bound’ really means, and
they certainly haven’t proven anything. Frankly, I don’t think they have a clue what they
are trying to say in this proof.
Evaluation of Proof 2: This one has several problems. First, the application of l’Hopital’s rule is
incorrect. The result should be $\lim_{x\to\infty} \frac{5^x \log 5}{7^x \log 7}$, which should make it obvious that l’Hopital’s
rule doesn’t actually help in this case. (The key to this one is to do a little algebra.) The
next problem is the statement ‘but x log 7 gets there faster’. What exactly does that mean?
Asymptotically faster, or just faster? If the former, it needs to be proven. If the latter, that
isn’t enough to prove relative growth rates. Finally, even if this showed that $5^x = O(7^x)$,
that only shows that $7^x$ is an upper bound on $5^x$. It does not show that the bound is not
tight. The bottom line is that bad algebra combined with vague statements falls way short
of a correct proof.
Evaluation of Proof 3: This proof is very close to being correct. The main problem is that they
only stated that $5^x = O(7^x)$, but they also needed to show that $5^x \ne \Theta(7^x)$. It turns out
that the theorem they mention also gives them that. So all they needed to add is ‘and
$5^x \ne \Theta(7^x)$’ at the end. Technically, there is another problem: they should have taken the
limit of $5^x/7^x$. What they really showed using the limit theorem is that $7^x = \omega(5^x)$, which
is equivalent to $5^x = o(7^x)$. It isn’t a major problem, but technically the limit theorem does
not directly give them the result they say it does. If you are trying to prove that f (x) is
bounded by g(x), put f (x) on the top and g(x) on the bottom.
7.72 You should have come up with $n^2 \log n$ for the upper bound. If you didn’t, now that you
know the answer, go back and try to write the proofs before reading them here. (a) If $n > 1$,
Thus, $n \ln(n^2 + 1) + n^2 \ln n = O(n^2 \ln n)$. (You may have different algebra in your proof. Just
make certain that, however you did it, it is correct.)
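(b)
\begin{align*}
\lim_{n\to\infty} \frac{n \ln(n^2 + 1) + n^2 \ln n}{n^2 \ln n} &= \lim_{n\to\infty} \left( \frac{n \ln(n^2 + 1)}{n^2 \ln n} + 1 \right) \\
&= 1 + \lim_{n\to\infty} \frac{\ln(n^2 + 1)}{n \ln n} \\
&= 1 + \lim_{n\to\infty} \frac{\frac{2n}{n^2 + 1}}{1 \cdot \ln n + n \cdot \frac{1}{n}} && \text{(l’Hopital)} \\
&= 1 + \lim_{n\to\infty} \frac{2n}{(n^2 + 1)(\ln n + 1)} \\
&= 1 + \lim_{n\to\infty} \frac{2}{2n(\ln n + 1) + (n^2 + 1) \cdot \frac{1}{n}} && \text{(l’Hopital)} \\
&= 1 + \lim_{n\to\infty} \frac{2}{2n(\ln n + 1) + n + \frac{1}{n}} \\
&= 1 + 0 = 1.
\end{align*}
Therefore, $n \ln(n^2 + 1) + n^2 \ln n = \Theta(n^2 \log n)$.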
7.74 We can see that $(n^2 - 1)^5 = \Theta(n^{10})$ since
\[ \lim_{n\to\infty} \frac{(n^2 - 1)^5}{n^{10}} = \lim_{n\to\infty} \left( \frac{n^2 - 1}{n^2} \right)^5 = \lim_{n\to\infty} \left( 1 - \frac{1}{n^2} \right)^5 = 1. \]
Note that we could also have shown that $2^{n+1} + 5^{n-1} = \Theta(5^{n-1})$, but that is not as simple of a
function.
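7.78 Since $a < b$, $b - a > 0$. Therefore, $\lim_{n\to\infty} \frac{n^a}{n^b} = \lim_{n\to\infty} n^{a-b} = \lim_{n\to\infty} \frac{1}{n^{b-a}} = 0$. By Theorem 7.50,
$n^a = o(n^b)$.
7.81 Since $a < b$, $a/b < 1$. Therefore, $\lim_{n\to\infty} \frac{a^n}{b^n} = \lim_{n\to\infty} \left( \frac{a}{b} \right)^n = 0$. By Theorem 7.50, $a^n = o(b^n)$.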
7.86 (a) False since $3^n$ grows faster than $2^n$. (b) True since $2^n$ grows slower than $3^n$. (c) False
since $3^n$ grows faster than $2^n$, which means it does not grow slower or at the same rate. (d) True
since they both have the same growth rate. Remember, exponentials with different bases have
different growth rates, but logarithms with different bases have the same growth rate. (e) True
since they have the same growth rate. Remember that if $f(n) = \Theta(g(n))$, then $f(n) = O(g(n))$
and $f(n) = \Omega(g(n))$. (f) False since they have the same growth rate, so $\log_{10} n$ does not grow
slower than $\log_3 n$.
7.89 Using l’Hopital’s rule, we have
\[ \lim_{n\to\infty} \frac{\log_c(n)}{n^b} = \lim_{n\to\infty} \frac{\frac{1}{n \ln(c)}}{b \, n^{b-1}} = \lim_{n\to\infty} \frac{1}{\ln(c) \, b \, n^b} = 0 \]
since $b > 0$. Thus, Theorem 7.50 tells us that $\log_c n = o(n^b)$.
7.94 (a) Θ; (b) o (O is correct, but not precise enough.); (c) Θ; (d) o (O is correct, but not
precise enough.); (e) Θ since $2^n = 2 \cdot 2^{n-1}$; (f) Ω (Technically it is ω, but I’ll let it slide if you put
Ω since we haven’t used ω much.); (g) through (j) are all o (O is correct, but not precise enough.)
7.96 If your answers do not all start with Θ, go back and redo them before reading the answers.
Your answers should match the following exactly. (a) $\Theta(n^7)$. (b) $\Theta(n^8)$. (c) $\Theta(n^2)$. (d) $\Theta(3^n)$.
(e) $\Theta(2^n)$. (f) $\Theta(n^2)$. (g) $\Theta(n^{.000001})$. (h) $\Theta(n^n)$.
7.97 Here is the correct ranking (where ∼ indicates two functions grow at the same rate):
10000, $\log x \sim \log(x^{300})$, $\log^{300} x$, $x^{.000001}$, $x \sim \log(2^x)$, $x \log(x)$, $x^{\log_2 3}$, $x^2$, $x^5$, $2^x$, $3^x$.
7.98 Modern computers use multitasking to perform several tasks (seemingly) at the same time.
Therefore, if an algorithm takes 1 minute of real time (wall-clock time), it might be that 58
seconds of that time was spent running the algorithm, but it could also be the case that only
30 seconds of that time were spent on that algorithm, and the other 30 seconds spent on other
processes. In this case, the CPU time would be 30 seconds, but the wall-clock time 60 seconds.
Further complicating matters is the increasing availability of machines with multiple processors.
If an algorithm runs on 4 processors rather than one, it might take 1/4th the time in terms of
wall-clock time, but it will probably take the same amount of CPU time (or close to it).
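The difference between wall-clock time and CPU time can be observed directly. The following Java sketch (class and method names are ours) measures both for a task run on the current thread, using `System.nanoTime()` for wall-clock time and the standard `ThreadMXBean` for per-thread CPU time. It assumes the JVM supports and has enabled thread CPU timing; when it does not, the sketch reports -1 for the CPU figure.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical names; a sketch assuming the JVM supports thread CPU timing.
final class TimingDemo {
    // Returns {wallNanos, cpuNanos} for running the task on the current thread.
    // cpuNanos is -1 if per-thread CPU timing is unsupported or disabled.
    static long[] measure(Runnable task) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        boolean cpuAvailable =
                bean.isCurrentThreadCpuTimeSupported() && bean.isThreadCpuTimeEnabled();
        long cpuStart = cpuAvailable ? bean.getCurrentThreadCpuTime() : -1;
        long wallStart = System.nanoTime();
        task.run();
        long wall = System.nanoTime() - wallStart;
        long cpu = cpuAvailable ? bean.getCurrentThreadCpuTime() - cpuStart : -1;
        return new long[] { wall, cpu };
    }

    public static void main(String[] args) {
        // A sleeping thread lets wall-clock time pass while using almost no CPU.
        long[] t = measure(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("wall-clock ns: " + t[0]);
        System.out.println("CPU ns:        " + t[1]);
    }
}
```

Replacing the sleep with a CPU-heavy loop should bring the two numbers much closer together on a lightly loaded machine.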
7.99 We cannot be certain whose algorithm is better with the given information. Maybe Sue
used a TRS-80 Model IV from the 1980s to run her program and Stu used Tianhe-2 (The fastest
computer in the world from about 2013-2014). In this case, it is possible that if Sue ran her
program on Tianhe-2 it would have only taken 2 minutes, making her the real winner.
7.100 As has already been mentioned, other processes on the machine can have a significant
influence on the wall-clock time. For instance, if I run two CPU-intensive programs at once, the
wall-clock time of each might be about twice what it would be if I ran them one at a time. If they
are run on a machine with multiple cores the wall-clock time might be closer to the CPU-time.
But other processes that are running can still throw off the numbers.
7.101 For the most part, yes. This is especially true if the running times of the algorithms are
not too close to each other (in other words, if one of the algorithms is significantly faster than the
other). However, the number of other processes running on the machine can have an influence
on CPU-time. For instance, if there are more processes running, there are more context switches,
and depending on how the CPU-time is counted, these context switches can influence the runtime.
So although comparing the CPU-time of two algorithms that are run on the same computer gives
a pretty good indication of which is better, it is still not perfect.
7.103 This one is a little more tricky. The answer is n · m since this is how many entries are
in the matrix. Sometimes we need to use two numbers to specify the input size. As suggested
previously, we will ignore the size of the two other pieces of data.
7.109 We focus on the assignment (=) inside the loop and ignore the other instructions. This
should be fine since assignment occurs at least as often as any other instruction. In addition,
it is important to note that max takes constant time (did you remember to explicitly say this?),
as do all of the other operations, so we aren’t under-counting. It isn’t too difficult to see that
the assignment will occur n times for an array of size n since the code goes through a loop with
i = 0, . . . , n − 1. Thus, the complexity of maximum is always Θ(n). That is, Θ(n) is the best,
average, and worst-case complexity of maximum.
7.112 The line in the inner for loop takes constant time (let’s call it c). The inner loop executes
k = 50 times, each time doing c operations. Thus the inner loop does 50 · c operations, which is
still just a constant. The outer loop executes n times, each time executing the inner loop, which
takes 50 · c operations. Thus, the whole algorithm takes 50 · c · n = Θ(n) time.
7.113 The line in the inner for loop takes constant time (let’s call it c). The inner loop executes
$n^2$ times since $j$ is going from 0 to $n^2 - 1$, so each time the inner loop executes, it does $cn^2$
operations. The outer loop executes $n$ times, each time executing the inner loop. Thus, the total
time is $n \times cn^2 = \Theta(n^3)$.
This is an example of an algorithm with a double-nested loop that is worse than $\Theta(n^2)$. The
point of this exercise is to make it clear that you should never jump to conclusions too quickly
when analyzing algorithms. Read the limits on loops very carefully!
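Counting iterations directly is a good way to check an analysis like the two above. The Java sketch below uses loops shaped like the ones described in 7.112 and 7.113 (the exact code from those exercises is not reproduced here, so the class and method names and the loop bodies are our own stand-ins) and returns how many times the innermost statement runs.

```java
// Hypothetical stand-ins for the loops described in 7.112 and 7.113.
final class LoopCounts {
    // Outer loop runs n times; inner loop runs a fixed k times (k = 50 in 7.112).
    static long fixedInner(int n, int k) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < k; j++) {
                count++; // stands in for the constant-time line
            }
        }
        return count;
    }

    // Outer loop runs n times; inner loop runs n*n times (as in 7.113).
    static long squaredInner(int n) {
        long count = 0;
        long innerLimit = (long) n * n;
        for (int i = 0; i < n; i++) {
            for (long j = 0; j < innerLimit; j++) {
                count++; // stands in for the constant-time line
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(fixedInner(100, 50)); // grows like 50 * n
        System.out.println(squaredInner(20));    // grows like n^3
    }
}
```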
7.116 (a) AreaTrapezoid is constant. (b) factorial is not constant. It should be easy to see
that it has a complexity of Θ(n). (c) absoluteValue is constant if we assume sqrt takes constant
time.
7.121 Although it has a nested loop, the inside loop always executes 6 times, which is a constant.
So the algorithm takes about $6 \cdot c \cdot n = \Theta(n)$ operations, not $\Theta(n^2)$.
7.126 (a) Since factorial has a complexity of $\Theta(n)$, it is not quadratic. (b) Since there are $n^2$
entries to consider, the algorithm takes $\Theta(n^2)$ time, so it would be quadratic.²
7.127 Bubble sort, selection sort, and insertion sort are three of them that you may have seen
before.
7.133 As we mentioned in our analysis, executing the conditional statement takes about 3
operations, and if it is true, about 3 additional operations are performed. So the worst case is no
more than about twice as many operations as the best case. In other words, we are comparing
$c \cdot n^2$ to $2c \cdot n^2$, both of which are $\Theta(n^2)$.
7.136 Since both of these methods require accessing the ith element of the list for some integer
i, and since we must traverse the list from the head, clearly the complexity of both methods is
Θ(i).
We could be less specific and say that the complexity is Θ(n) since 0 ≤ i < n. However, when
analyzing algorithms that make repeated calls to these methods, using Θ(i) might give a more
accurate answer overall. It makes the analysis more difficult, but sometimes it is worth it.
² Technically this is linear with respect to the size of the input since the size of the input is $n^2$. But it is quadratic
in $n$. In either case, it is $\Theta(n^2)$.
Note: For doubly-linked lists, some implementations traverse starting at the tail if the index is
closer to the end of the list. However, that just means the complexity is no worse than Θ(n/2) =
Θ(n). In other words, it only changes the complexity by a constant factor.
7.138 All of them should be Θ(1), assuming we keep track of how many elements are currently
in the stack (which is a reasonable thing to do).
7.139 For an array, either enqueue or dequeue (but not both) will be Θ(n). All of the others will
be Θ(1), assuming we keep track of how many elements are currently in the queue. Note that the
advantage of the circular array implementation is that both enqueue and dequeue are Θ(1).
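To see why the circular array avoids the Θ(n) shifting cost, here is a minimal fixed-capacity sketch in Java. The class name, field names, and the choice to throw an exception when the queue is full are our assumptions, not part of the exercise.

```java
import java.util.NoSuchElementException;

// Hypothetical fixed-capacity circular-array queue of ints.
final class CircularQueue {
    private final int[] data;
    private int head = 0; // index of the front element
    private int size = 0; // number of stored elements

    CircularQueue(int capacity) {
        data = new int[capacity];
    }

    int size() {
        return size;
    }

    boolean isEmpty() {
        return size == 0;
    }

    // Theta(1): writes just past the last element, wrapping around the array end.
    void enqueue(int value) {
        if (size == data.length) {
            throw new IllegalStateException("queue is full");
        }
        data[(head + size) % data.length] = value;
        size++;
    }

    // Theta(1): reads the front element and advances head, wrapping around.
    int dequeue() {
        if (size == 0) {
            throw new NoSuchElementException("queue is empty");
        }
        int value = data[head];
        head = (head + 1) % data.length;
        size--;
        return value;
    }

    public static void main(String[] args) {
        CircularQueue q = new CircularQueue(3);
        q.enqueue(1);
        q.enqueue(2);
        q.enqueue(3);
        System.out.println(q.dequeue()); // front element leaves first
        q.enqueue(4);                    // reuses the slot freed at index 0
        while (!q.isEmpty()) {
            System.out.println(q.dequeue());
        }
    }
}
```

Neither operation moves any other element, which is what keeps both at constant time.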
7.140 For the array implementation, addToFront, removeFirst and contains will all be Θ(n) and
the rest will be Θ(1). For the linked list implementation, contains will be Θ(n) and the rest will be
constant if we assume there is both a head and tail pointer. If there is no tail pointer, addToEnd
will be Θ(n).
7.141 For unbalanced, all operations can be done in Θ(h) time, where h is the height of the tree.
This is the best answer you can give. You could also say O(n) time since the height is no more
than n, but this answer is not precise enough to be of much use. You cannot say Θ(log n) since
this is not necessarily true for an unbalanced tree.
For balanced (red-black, AVL, etc.), all operations can be implemented with complexity
Θ(log n).
7.142 The average-case complexity for all of these operations is Θ(1), and the worst-case com-
plexity is Θ(n).
7.143
Evaluation of Solution 1: I have no idea what logic they are trying to use here. Sure, $a^n$ is an
exponential function, but what does that have to do with how long this algorithm takes?
This solution is way off.
Evaluation of Solution 2: Having the i in the answer is nonsense since it doesn’t mean anything
in the context of a complexity—it is just a variable that happens to be used to index a
loop. Further, the answer should be given using Θ-notation. So this solution is just plain
wrong. Since having an i in the complexity does not make any sense, this person either has
a fundamental misunderstanding of how to analyze algorithms or they didn’t think about
their final answer. Don’t be like this person!
Evaluation of Solution 3: This solution is O.K., but it has a slight problem. Although the anal-
ysis given estimates the worst-case behavior, it over-estimates it. By replacing i with n − 1,
they are over-estimating how long the algorithm takes. The call to pow only takes n − 1
time once. This solution can really only tell us that the complexity is $O(n^2)$. Is it possible
the over-estimation of time resulted in a bound that isn’t tight? Even if it turns out that
$\Theta(n^2)$ is the correct bound, this solution does not prove it. Although they are on the right
track, this person needed to be a little more careful in their analysis.
7.144 Before you read too far: if you did not use a summation in your solution, go back and try
again! This is very similar to the analysis of bubblesort. The for loop takes i from 0 to n − 1,
and each time the code in the loop takes i time (since that is how long power(a,i) takes). Thus,
the complexity is
\[ \sum_{i=0}^{n-1} i = \frac{(n-1)n}{2} = \Theta(n^2). \]
Notice that just because the answer is $\Theta(n^2)$, that does not mean that the third solution to
Evaluate 7.143 was correct. As we stated in the solution to that problem, because they overestimated
the number of operations, they only proved that the algorithm has complexity $O(n^2)$.
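The summation in this analysis can be confirmed by instrumenting the code. The Java sketch below is our own reconstruction: it assumes power(a, i) is a naive loop that performs i multiplications, and that addPowers adds power(a, i) for i from 0 to n − 1, as the analysis describes. The counter and the class name are hypothetical additions.

```java
// Hypothetical reconstruction: naive power with i multiplications is assumed.
final class AddPowersCount {
    static long multiplications = 0;

    // Computes a^i with i multiplications, counting each one.
    static double power(double a, int i) {
        double result = 1;
        for (int j = 0; j < i; j++) {
            result *= a;
            multiplications++;
        }
        return result;
    }

    // Adds a^0 + a^1 + ... + a^(n-1), calling power once per term.
    static double addPowers(double a, int n) {
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += power(a, i);
        }
        return sum;
    }

    // Returns how many multiplications addPowers performs for a given n.
    static long countFor(int n) {
        multiplications = 0;
        addPowers(2.0, n);
        return multiplications;
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 1000; n *= 10) {
            // Compare the measured count with (n - 1) * n / 2.
            System.out.println(n + ": " + countFor(n) + " vs " + (long) (n - 1) * n / 2);
        }
    }
}
```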
7.145 Here is one solution.
double addPowers(double a, int n) {
    if (a == 1) {
        return n;
    } else {
        double sum = 1; // for the a^0 term
        double pow = 1;
        for (int i = 1; i < n; i++) {
            pow = pow * a;
            sum += pow;
        }
        return sum;
    }
}
If a = 1, the algorithm takes constant time. Otherwise, it executes a constant number of opera-
tions and a single for loop n times. The code in the loop takes constant time. Thus the algorithm
takes Θ(n) time.
7.146 If you used recursion instead of a loop, cool idea. However, go back and do it again. There
is an even simpler way to do it. Need a hint? Apply some of that discrete mathematics material
you have been learning! When you have a solution that does not use a loop or recursion (or you
get stuck), keep reading.
The trick is to use the formula for a geometric series (did you recognize that this is what
addPowers is really computing?). We need a special case for a = 1 because the formula requires
that $a \ne 1$.
double addPowers(double a, int n) {
    if (a == 1) {
        return n;
    } else {
        // a^0 + a^1 + ... + a^(n-1) = (1 - a^n) / (1 - a)
        return (1 - power(a, n)) / (1 - a);
    }
}
If a = 1, the algorithm takes constant time. Otherwise, it executes a constant number of
operations and a single call to power(a,n), which takes n time. Thus the algorithm takes
Θ(n) time.
It is worth noting that a = 0 is a tricky case. addPowers can’t really be computed for a = 0
since $0^0$ is undefined. It is for this reason that the first term of a geometric sequence is technically
1, not $a^0$. Since $a^0 = 1$ for all other values of a, the case of a = 0 is usually glossed over. If you
don’t understand what the fuss is about, don’t worry too much about it.
7.148
Evaluation of Solution 1: This takes about 2 + 4n + 4m operations, which is essentially the same
as the ‘C’ version. Unfortunately, it is slightly worse than the original solution since it is
now incorrect. All they did is omit adding the final n in the first sum. This went from a
‘C’ to a ‘D’ (at best).
so it is incorrect. It also is not as efficient as possible. I’d say this is still a ‘C’.
Evaluation of Solution 3: This student figured out the trick—they know a formula to compute
the sum, so they tried to use it. Unfortunately, they used the formula incorrectly and/or
they made a mistake when manipulating the sum (it is impossible to tell exactly what they
did wrong—they made either one or two errors), so the algorithm is not correct. In terms of
efficiency, their solution is great because it takes a constant number of operations no matter
what n and m are. Because their answer is efficient and very close to being correct, I’d
probably give them a ‘B’.
7.149 We use the fact that $\sum_{k=m}^{n} k = \sum_{k=1}^{n} k - \sum_{k=1}^{m-1} k = \frac{n(n+1)}{2} - \frac{(m-1)m}{2}$ to give us the following
solution:
int sumFromMToN(int m, int n) {
    return n * (n + 1) / 2 - (m - 1) * m / 2;
}
Since this is just doing a fixed number of arithmetic operations no matter what the values of m
and n are, it takes constant time.
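Since off-by-one mistakes are so easy to make with this formula, it is worth comparing it against a plain loop. In the Java sketch below, sumFromMToN is the closed form from the solution, while the class name and the brute-force method sumByLoop are our own additions for testing.

```java
// sumFromMToN follows the solution; sumByLoop and the class name are hypothetical.
final class SumRangeCheck {
    // Closed form: (sum of 1..n) minus (sum of 1..m-1).
    static int sumFromMToN(int m, int n) {
        return n * (n + 1) / 2 - (m - 1) * m / 2;
    }

    // Brute force: adds m + (m+1) + ... + n directly.
    static int sumByLoop(int m, int n) {
        int sum = 0;
        for (int k = m; k <= n; k++) {
            sum += k;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumFromMToN(10, 20) + " vs " + sumByLoop(10, 20));
        System.out.println(sumFromMToN(21, 40) + " vs " + sumByLoop(21, 40));
    }
}
```

Changing (m - 1) to m in the closed form makes the two methods disagree, which is the mistake discussed in the evaluations above.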
7.152 The analysis of these is very similar to the analysis of Example 7.151, so the details are
omitted. (a) For a LinkedList the contains method takes Θ(m) time, so the overall complexity
is Θ(nm). (b) For a HashSet it takes Θ(1) time to call contains (on average), so the overall
complexity is Θ(n + n) = Θ(n).
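The code from Example 7.151 is not reproduced here, so the following Java sketch is only a stand-in with the same shape: a loop over n query values that calls contains once per value on a collection of m items. The class and method names are hypothetical. The same method works for both collection types; only the cost of each contains call changes.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedList;

// Hypothetical stand-in for an algorithm that calls contains inside a loop.
final class ContainsDemo {
    // Counts how many of the n query values appear in the collection.
    // Each contains call costs Theta(m) for a LinkedList and
    // Theta(1) on average for a HashSet.
    static int countPresent(int[] queries, Collection<Integer> items) {
        int count = 0;
        for (int q : queries) {
            if (items.contains(q)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] queries = { 2, 3, 7 };
        Collection<Integer> list = new LinkedList<>(Arrays.asList(1, 2, 3, 4));
        Collection<Integer> set = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        System.out.println(countPresent(queries, list)); // same answer either way
        System.out.println(countPresent(queries, set));
    }
}
```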
7.154 (a) contains takes Θ(m) time so the complexity is Θ(n(log n + m)).
(b) Here the contains method takes Θ(log m) time, so the overall complexity is Θ(n(log n +
log m)) (or Θ(n log n + n log m) if you prefer to write it that way).
Note that we don’t know the relationship between n and m so we can’t simplify either answer.
7.156
         n                  ⌊n/2⌋
decimal   binary     decimal   binary
12        1100       6         110
13        1101       6         110
32        100000     16        10000
33        100001     16        10000
118       1110110    59        111011
119       1110111    59        111011
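The table can be regenerated, and extended to other values, with a few lines of Java. In this sketch (class and method names are ours), integer division by 2 computes ⌊n/2⌋ for non-negative n, and `Integer.toBinaryString` gives the binary representation.

```java
// Hypothetical names; assumes n is non-negative so that n / 2 equals floor(n/2).
final class HalveBinary {
    // Binary representation of floor(n/2).
    static String halfInBinary(int n) {
        return Integer.toBinaryString(n / 2);
    }

    public static void main(String[] args) {
        int[] values = { 12, 13, 32, 33, 118, 119 };
        for (int n : values) {
            // Columns: n in decimal, n in binary, floor(n/2) in decimal and binary.
            System.out.println(n + " " + Integer.toBinaryString(n)
                    + " " + (n / 2) + " " + halfInBinary(n));
        }
    }
}
```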
7.157 The next theorem answers the question about the pattern.
8.2 (a) No. The domain is $\mathbb{Z}$, which does not have a ‘starting point’. (b) Yes. The domain is
$\mathbb{Z}^+$. (c) Yes. The domain is {2, 3, 4, . . .}. (d) Yes. The domain is $\mathbb{Z}^+$. (e) No. The domain is $\mathbb{R}$,
which is not a subset of $\mathbb{Z}$. Thus, not only is there no ‘starting point,’ there is no clear ordering
of the real numbers from one to the next.
8.4 Modus ponens.
8.5 You can immediately conclude that P (6) is true using modus ponens. If that was your
answer, good. But you can keep going. Since P (6) is true, you can conclude that P (7) is true
(also by modus ponens). But then you can conclude that P (8) is true. And so on. The most
complete answer you can give is that P (n) is true for all n ≥ 5. You cannot conclude that P (n)
is true for all n ≥ 1 because we don’t know anything about the truth values of P (1), P (2), P (3),
and P (4).
8.6 Nothing. We can conclude that P (n) is true for any n ≥ 17, but there is not enough
information to say anything about values of n less than 17.
8.7 There are various ways to say this, including what was said in the paragraph above. Here is
another way to say it:
If P (a) is true, and for any value of k ≥ a, P (k) true implies that P (k + 1) is true,
then P (n) is true for all n ≥ a.
8.9 If you answered yes and you aren’t lying, great! If you answered no or you answered yes but
you lied, it is important that you think about it some more and/or get some help. If you want to
succeed at writing induction proofs, understanding this is an important step!
8.12 $\frac{1(1+1)}{2}$; $P(1)$ is true; $P(k)$ is true; $\sum_{i=1}^{k} i = \frac{k(k+1)}{2}$; $P(k+1)$; $\sum_{i=1}^{k+1} i = \frac{(k+1)(k+2)}{2}$;
$\sum_{i=1}^{k} i$; $\frac{k(k+1)}{2}$; $\frac{k}{2} + 1$; $\frac{(k+1)(k+2)}{2}$; $P(k+1)$ is true; $P(1)$ is true; $k \ge 1$; all $n \ge 1$; induction
or the principle of mathematical induction or PMI.
8.13
(a) $P(k)$ is the statement “$\sum_{i=1}^{k} i \cdot i! = (k+1)! - 1$”
(b) $P(k+1)$ is the statement “$\sum_{i=1}^{k+1} i \cdot i! = (k+2)! - 1$”
(c) $LHS(k) = \sum_{i=1}^{k} i \cdot i!$
I am only writing this down now so that I know what my goal is. I am not going to start working
both sides of this or otherwise manipulate it. I can’t because I don’t know whether or not it is
true yet.)
Inductive Step: Notice that
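\begin{align*}
\sum_{i=1}^{k+1} i^2 &= \sum_{i=1}^{k} i^2 + (k+1)^2 \\
&= \frac{k(k+1)(2k+1)}{6} + (k+1)^2 \\
&= (k+1)\left( \frac{k(2k+1)}{6} + (k+1) \right) \\
&= (k+1)\left( \frac{k(2k+1) + 6(k+1)}{6} \right) \\
&= (k+1)\left( \frac{2k^2 + k + 6k + 6}{6} \right) \\
&= (k+1)\left( \frac{2k^2 + 7k + 6}{6} \right) \\
&= (k+1)\left( \frac{(2k+3)(k+2)}{6} \right) \\
&= \frac{(k+1)(k+2)(2k+3)}{6}.
\end{align*}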
Therefore P (k + 1) is true.
Summary: Since P (1) is true and P (k) → P (k + 1) is true when k ≥ 1, P (n) is true for all n ≥ 1
by induction.
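An induction proof settles the formula for every n, but a quick computation is a useful guard against algebra slips along the way. This Java sketch (class and method names are ours) compares the brute-force sum of squares with the closed form n(n + 1)(2n + 1)/6 that the inductive step above arrives at.

```java
// Hypothetical names; a finite check that complements, not replaces, the proof.
final class SumOfSquares {
    // Adds 1^2 + 2^2 + ... + n^2 directly.
    static long byLoop(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    // Closed form n(n+1)(2n+1)/6; the product is always divisible by 6.
    static long byFormula(int n) {
        return (long) n * (n + 1) * (2L * n + 1) / 6;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + ": " + byLoop(n) + " vs " + byFormula(n));
        }
    }
}
```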
8.18 For $k = 1$ we have $1 \cdot 2 = 2 + (1 - 1)2^2$, and so the statement is true for $n = 1$. Let $k \ge 1$
and assume the statement is true for $k$. That is, assume
\[ 1 \cdot 2 + 2 \cdot 2^2 + 3 \cdot 2^3 + \cdots + k \cdot 2^k = 2 + (k - 1)2^{k+1}. \]
We need to show that
\[ 1 \cdot 2 + 2 \cdot 2^2 + 3 \cdot 2^3 + \cdots + (k + 1) \cdot 2^{k+1} = 2 + k 2^{k+2}. \]
Using some algebra and the inductive hypothesis, we can see that
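One way that algebra can go, as a sketch that uses only the inductive hypothesis stated above for the first $k$ terms:

```latex
\begin{align*}
1 \cdot 2 + 2 \cdot 2^2 + \cdots + k \cdot 2^k + (k+1) \cdot 2^{k+1}
  &= \bigl( 2 + (k-1) 2^{k+1} \bigr) + (k+1) 2^{k+1} \\
  &= 2 + \bigl( (k-1) + (k+1) \bigr) 2^{k+1} \\
  &= 2 + 2k \cdot 2^{k+1} \\
  &= 2 + k \cdot 2^{k+2}.
\end{align*}
```

The first line replaces the first $k$ terms using the inductive hypothesis, and the remaining lines only collect the two multiples of $2^{k+1}$.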
• For the sake of clarity, it might have been better to use k throughout most of the proof
instead of n. The exception is in the final sentence where n is correct.
• The base case is just some algebra without context. A few words are needed. For instance,
‘notice that when n = 1,’.
• The base case is presented incorrectly. Notice that the writer starts by writing down what
she wants to be true and then deduces that it is indeed correct by doing algebra on both
sides of the equation. As we have already mentioned, you should never start with what you
want to prove and work both sides! It is not only sloppy, but it can lead to incorrect proofs.
Whenever I see students do this, I always tell them to use what I call the U method. What
I mean is rewrite your work by starting at the upper left, going down the left side, then
going up the right side. So the above should be rewritten as:
1 · 1! = 1 = 2! − 1 = (1 + 1)! − 1.
Notice that if the U method does not work (because one or more steps isn’t correct), it is
probably an indication of an incorrect proof. Consider what happens if you try it on the
proof in Exercise 2.93. You would write $-1 = (-1)^2 = 1 = 1^2 = 1$. Notice that the first
equality is incorrect.
The U method can sometimes apply to inequalities as well.
• When the writer makes her assumption, she says ‘for n ≥ 1’. This is O.K., but there is some
ambiguity here. Does she mean for all n, or for a particular value of n? She must mean the
latter since the former is what she is trying to prove. It would have been better for her to
say ‘for some n ≥ 1.’
• The algebra in the inductive step is perfect. However, what does it mean? She should
include something like ‘Notice that’ before her algebra just to give it a little context. It
often doesn’t take a lot of words, but adding a few phrases here and there goes a long way
to help a proof flow more clearly.
• She says ‘Therefore it is true for n’. She must have meant n + 1 since that is what she just
proved.
• As with her assumption, her final statement could be clarified by saying ‘for all n ≥ 1.’
Overall, the proof has almost all of the correct content. Most of the problems have to do with
presentation. But as we have seen with other types of proofs, the details are really important to
get right!
8.21 Given this proof, we know that P (1) is true. We also know that P (2) → P (3), P (3) → P (4),
etc, are all true. Unfortunately, the proof omits showing that P (2) is true, so modus ponens never
applies. In other words, knowing that P (2) → P (3) is true does us no good unless we know P (2)
is true, which we don’t. Because of this, we don’t know anything about the truth values of P (3),
P (4), etc. The proof either needs to show that P (2) is true as part of the base case, or the
inductive step needs to start at 1 instead of 2.
8.23 Because our inductive hypothesis was that P (k − 1) is true instead of P (k). If we assumed
that k ≥ 0, then when k = 0 it would mean we are assuming P (−1) is true, and we don’t know
whether or not it is since we never discussed P (−1).
8.27 This contains a very subtle error. Did you find it? If not, go back and carefully re-read the
proof and think carefully–at least one thing said in the proof must be incorrect. What is it?
O.K., here it is: The statement ‘goat 2 is in both collections’ is not always true. If n = 1, then
the first collection contains goats 1 through 1, and the second collection contains goats 2 through
2. In this case, there is no overlap of goat 2, so the proof falls apart.
8.28
Evaluation of Proof 1: This solution is on the right track, but it has several technical problems.
Evaluation of Proof 2: The base case is correct. Unfortunately, that is about the only thing that
is correct.
• The second sentence is wrong. We cannot say that ‘it is true for all n’–that is precisely
what we are trying to prove. We need to assume it is true for a particular n and then
prove it is true for n + 1.
• The rest of the proof is one really long sentence that is difficult to follow. It should be
split into much shorter sentences, each of which provides one step of the proof.
• The term ‘binary number’ should be replaced with ‘binary palindrome’ throughout. It
causes confusion, especially when the words ‘add’ and ‘consecutive’ are used. These
mean something very different if we have numbers in mind instead of strings.
• I don’t think the phrase ‘each consecutive binary number’ means what the writer thinks
it means. The binary numbers 1001 and 1010 are consecutive (representing 9 and 10),
but that is probably not what the writer has in mind.
• The term ‘permutations’ shows up for some reason. I think they might have meant
‘strings’ or something else.
• Why bring up the 4 possible ways to extend a binary string by adding to the beginning
and end if only two of them are relevant? Why not just consider the ones of interest
in the first place?
• In the context of a proof, the phrase ‘you are adding’ doesn’t make sense. Why am
I adding something and what am I adding it to? And do they mean addition (of the
binary numbers) or appending (of strings)?
• They switch from n to k in the middle of the proof to provide further confusion.
Evaluation of Proof 3: This proof has most of the right ideas, but it does not put them together
well. The base case is correct. It sounds like the writer understands what is going on with
the inductive step, but needs to communicate it more clearly. More specifically, what does
‘assume 2k → 2^k palindromes’ mean? I think I am supposed to read this as ‘assume that
there are 2^k palindromes of length 2k.’ [3]
The final sentence is also problematic. The first phrase tries to connect to the previous
sentence, but the connection needs to be a little more clear. The final phrase is not a
complete thought. In the first place, I know that 2^k + 2^k = 2^{k+1} and this has nothing
to do with the previous phrases. In other words, the ‘so’ connecting the phrases doesn’t
make sense. But more seriously, why do I care that 2^k + 2^k = 2^{k+1}? What he meant was
something like ‘so there are 2^k + 2^k = 2^{k+1} palindromes of length 2k + 2’.
8.29 The empty string is the only string of length 0, and it is a palindrome. Thus there is 1 = 2^0
palindrome of length 0.
Now assume there are 2^n binary palindromes of length 2n. For every palindrome of length 2n,
exactly two palindromes of length 2(n + 1) can be constructed by appending either a 0 or a 1 to
both the beginning and the end. Further, every palindrome of length 2(n + 1) can be constructed
this way. Thus, there are twice as many palindromes of length 2(n + 1) as there are of length 2n.
By the inductive hypothesis, there are 2 · 2^n = 2^{n+1} binary palindromes of length 2(n + 1).
The result follows by PMI.
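If you want some reassurance that the count in 8.29 is right, a brute-force check is easy to write. The following is only a sketch: the class and method names are mine, not from the text, and it simply tries every bit string of length 2n (stored in an int, so it assumes the length is at most 30) and counts the ones that read the same in both directions.

```java
// Brute-force check of the palindrome count (all names here are illustrative).
class PalindromeCount {
    // Returns true if the lowest `length` bits of `bits` read the same in both directions.
    static boolean isPalindrome(int bits, int length) {
        for (int i = 0; i < length / 2; i++) {
            int left = (bits >> (length - 1 - i)) & 1;
            int right = (bits >> i) & 1;
            if (left != right) {
                return false;
            }
        }
        return true;
    }

    // Counts the binary palindromes of the given length by trying every bit string.
    static int countPalindromes(int length) {
        int count = 0;
        for (int bits = 0; bits < (1 << length); bits++) {
            if (isPalindrome(bits, length)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 8; n++) {
            System.out.println("length " + (2 * n) + ": " + countPalindromes(2 * n)
                    + " palindromes, 2^n = " + (1 << n));
        }
    }
}
```

A check like this is not a proof, of course, but the two numbers on each line should agree if the induction argument is sound.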
8.32 Yes. It clearly calls itself in the else clause.
8.35 (a) The base cases are n ≤ 0. (b) The inductive cases are n > 0. (c) Yes. For any value
n > 0, the recursive call uses the value n − 1, which is getting closer to the base case of 0.
8.38 Notice that if n ≤ 0, countdown(n) prints nothing, so it works in that case. For k ≥ 0, assume
countdown(k) works correctly. [4] Then countdown(k+1) will print ‘k + 1’ and call countdown(k). By
the inductive hypothesis, countdown(k) will print ‘k k-1 . . . 2 1’, so countdown(k+1) will print
‘k+1 k k-1 . . . 2 1’, so it works properly. By PMI, countdown(n) works for all n ≥ 0.
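To see the shape of the argument in code, here is a minimal sketch of a countdown method that matches the behavior described in 8.38. It is an assumption on my part that the original looks like this; in particular, this version builds a string instead of printing so that it is easy to test, and the class name is made up.

```java
// A sketch of countdown as described above (not necessarily the original code).
class CountdownSketch {
    // Returns the text "n n-1 ... 2 1", or the empty string when n <= 0.
    static String countdown(int n) {
        if (n <= 0) {
            return "";                       // base case: nothing to print
        }
        String rest = countdown(n - 1);      // inductive case: trust the call on n - 1
        return rest.isEmpty() ? String.valueOf(n) : n + " " + rest;
    }

    public static void main(String[] args) {
        System.out.println(countdown(5));
    }
}
```

Notice how the code mirrors the proof: the base case returns immediately, and the inductive case handles n and relies on the call with n − 1 being correct.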
8.42 It is pretty clear that the recursive algorithm is much shorter and was a lot easier to write.
It is also a lot easier to make a mistake implementing the iterative algorithm. So far, it looks
like the recursive algorithm is the clear winner. However, in the next section we will show you
why the recursive algorithm we gave should never be implemented. It turns out that it is very
inefficient.
The bottom line is that the iterative algorithm is better in this case. Don’t feel bad if you
thought the recursive algorithm was better. After the next section, you will be better prepared
to compare recursive and iterative algorithms in terms of efficiency.
8.45 PrintN will print from 1 to n, and NPrint will print from n to 1. If you got the answer
wrong, go back and convince yourself that this is correct.
8.48 (a) r_{n/2}. (b) 1. (c) a_{n−1} + 2 · a_{n−2} + 3 · a_{n−3} + 4 · a_{n−4}. (d) There are none.
[3] In general, avoid the use of mathematical symbols in constructing the grammar of an English sentence. One of
the most common abuses I see is the use of → in the middle of a sentence.
[4] We are letting n = 0 be the base case. You could also let n = 1 be the base case, but then you would need to
prove that countdown(1) works.
8.52 It means to find a closed-form expression for it. In other words, one that does not define
the sequence recursively.
8.54 When n = 1, T(1) = 1 = 0 + 1 = log_2 1 + 1. Assume that T(k) = log_2 k + 1 for all 1 ≤ k < n
(we are using strong induction). Then
T(n) = T(n/2) + 1
= (log_2(n/2) + 1) + 1
= log_2 n − log_2 2 + 2
= log_2 n − 1 + 2
= log_2 n + 1.
H(n) = 2H(n − 1) + 1
= 2(2H(n − 2) + 1) + 1
= 2^2 H(n − 2) + 2 + 1
= 2^2 (2H(n − 3) + 1) + 2 + 1
= 2^3 H(n − 3) + 2^2 + 2 + 1
⋮
= 2^{n−1} H(1) + 2^{n−2} + 2^{n−3} + · · · + 2 + 1
= 2^{n−1} + 2^{n−2} + 2^{n−3} + · · · + 2 + 1
= 2^n − 1
Thus, H(n) = 2^n − 1. Luckily, this matches our answer from Example 8.55.
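Iteration is easy to get wrong, so it never hurts to compare the closed form against the recurrence for small values. Here is a quick sketch that does so. The class and method names are hypothetical, and the base case H(1) = 1 is the one used in the derivation above.

```java
// Compares the recurrence H(n) = 2H(n-1) + 1, H(1) = 1 with the closed form 2^n - 1.
class HanoiMoves {
    // Evaluates the recurrence directly (assumes n >= 1).
    static long movesRecursive(int n) {
        if (n == 1) {
            return 1;
        }
        return 2 * movesRecursive(n - 1) + 1;
    }

    // The closed form obtained by iteration.
    static long movesClosedForm(int n) {
        return (1L << n) - 1;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++) {
            System.out.println(n + ": " + movesRecursive(n) + " vs " + movesClosedForm(n));
        }
    }
}
```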
8.63 Iterating a few steps, we discover:
T(n) = T(n/2) + 1
= T(n/4) + 1 + 1
= T(n/2^2) + 2 (I think I see a pattern!)
= T(n/2^3) + 1 + 2
= T(n/2^3) + 3 (I do see a pattern!)
⋮
= T(n/2^k) + k
We need to find k such that n/2^k = 1. We already saw in Example 8.60 that k = log_2 n is the
solution. Thus,
T(n) = T(n/2^k) + k
= T(n/2^{log_2 n}) + log_2 n
= T(1) + log_2 n
= 1 + log_2 n
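Here is a small sketch that checks the result T(n) = 1 + log_2 n against the recurrence. It only makes sense when n is a power of 2 (so that n/2 is always an integer), and the names are made up for illustration. For a power of 2, log_2 n is just the number of trailing zero bits, which Java can compute with Integer.numberOfTrailingZeros.

```java
// Compares T(n) = T(n/2) + 1, T(1) = 1 with the closed form 1 + log_2 n for powers of 2.
class HalvingRecurrence {
    // Evaluates the recurrence directly (assumes n is a power of 2).
    static int t(int n) {
        if (n == 1) {
            return 1;
        }
        return t(n / 2) + 1;
    }

    // Closed form; for a power of 2, log_2 n equals the number of trailing zero bits.
    static int closedForm(int n) {
        return 1 + Integer.numberOfTrailingZeros(n);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 1024; n *= 2) {
            System.out.println(n + ": " + t(n) + " vs " + closedForm(n));
        }
    }
}
```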
T(n) = 2T(n − 1) + n
= 2(2T(n − 2) + (n − 1)) + n (having n instead of (n − 1) is a common error)
= 2^2 T(n − 2) + 3n − 2 (it is unclear yet if I should have 3n − 2 or some other form)
⋮ (many skipped steps)
= 2^k T(n − k) + (2^k − 1)n − Σ_{i=1}^{k−1} i · 2^i (the all-important pattern revealed)
⋮ (plug in appropriate value of k and simplify)
= 2^{n+1} − n − 2.
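With so many skipped steps, it is worth checking the final answer numerically. The sketch below does that. Be aware that the base case is my assumption: the derivation above does not state it, and I am using T(1) = 1 because that is the value the closed form 2^{n+1} − n − 2 gives at n = 1. The class and method names are also made up.

```java
// Compares T(n) = 2T(n-1) + n with the closed form 2^(n+1) - n - 2.
// Assumption: T(1) = 1, which is what the closed form gives at n = 1.
class LinearRecurrenceCheck {
    // Evaluates the recurrence directly (assumes n >= 1).
    static long t(int n) {
        if (n == 1) {
            return 1;
        }
        return 2 * t(n - 1) + n;
    }

    // The closed form from the end of the derivation.
    static long closedForm(int n) {
        return (1L << (n + 1)) - n - 2;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++) {
            System.out.println(n + ": " + t(n) + " vs " + closedForm(n));
        }
    }
}
```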
9.13 Write n = 1 + 1 + · · · + 1, where the sum contains n − 1 plus signs. There are two choices
for each plus sign–leave it or perform
the addition. Each of the 2^{n−1} ways of making choices leads to a different expression, and every
expression can be constructed this way. Therefore, there are 2^{n−1} such ways of expressing n.
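The argument in 9.13 translates almost directly into code. In the following sketch (the names are hypothetical), each bit of a mask records the choice made for one plus sign: 1 means perform the addition and 0 means leave the plus sign alone. Collecting the resulting expressions in a set lets us confirm that different choices really do give different expressions. It assumes n ≥ 1.

```java
import java.util.HashSet;
import java.util.Set;

// Enumerates the ways of writing n as an ordered sum by deciding, for each of the
// n - 1 plus signs in 1 + 1 + ... + 1, whether to perform the addition.
class CompositionCount {
    static Set<String> compositions(int n) {
        Set<String> result = new HashSet<>();
        for (int mask = 0; mask < (1 << (n - 1)); mask++) {
            StringBuilder expression = new StringBuilder();
            int part = 1;
            for (int i = 0; i < n - 1; i++) {
                if (((mask >> i) & 1) == 1) {
                    part++;                                   // perform this addition
                } else {
                    expression.append(part).append(" + ");    // leave this plus sign
                    part = 1;
                }
            }
            expression.append(part);
            result.add(expression.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(compositions(4));
    }
}
```

If two different masks ever produced the same expression, the set would end up with fewer than 2^{n−1} elements.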
9.15 This combines the product and sum rules. We now have 10 + 26 = 36 choices for each
character, and there are 5 characters, so the answer is 36^5.
9.16 Each bit can be either 0 or 1, so there are 2^n bit strings of length n.
9.18 53 · 63^2; 53 · 63^3; 53 · 63^{k−1}.
9.21 It contains at least one repeated digit. The wording of your answer is very important. Your
answer should not be “it has some digit twice” since this is vague–do you mean ‘exactly twice’ ?
If so, that is incorrect. If you mean ‘at least twice’, then it is better to be explicit and say it that
way or just say ‘repeated’. To be clear, we don’t know that it contains any digit exactly twice,
and we also don’t know how many unique digits the number has–it might be 22222222222, but it
also might be 98765432101.
9.24 If all the magenta, all the yellow, all the white, 14 of the red and 14 of the blue marbles are
drawn, then among these 8 + 10 + 12 + 14 + 14 = 58 there are no 15 marbles of the same color.
Thus we need 59 marbles in order to ensure that there will be 15 marbles of the same color.
9.25 She knows that you are the 25th person in line. If everyone gets 4 tickets, she will get none,
but you will get the 4 you want. She can get one or more tickets if one or more people in front of
her, including you, get less than 4.
9.28 There are seven possible sums, each one a number in {−3, −2, −1, 0, 1, 2, 3}. By the
Pigeonhole Principle, two of the eight sums must add up to the same number.
9.31 We have ⌈16/5⌉ = 4, so some cat has at least four kittens.
9.32
Evaluation of Proof 1: This proof is incomplete. It kind of argues it for 5, not n in general. Even
then, the proof is neither clear nor complete. For instance, what are the 4 ‘slots’ ?
Evaluation of Proof 2: They only prove it for n = 2. It needs to be proven for any n.
Evaluation of Proof 3: You can’t assume somebody had shaken hands with everyone else without
some justification. You certainly can’t assume it was any particular person (i.e. person n).
Similarly, you can’t assume the next person has shaken n − 2 hands without justifying
it. The final statement is weird (what does ‘fulfills the contradiction’ mean?) and needs
justification (why is it a problem that the last person shakes no hands?).
9.33 Notice that if someone shakes n − 1 hands, then nobody shakes 0 hands and vice versa.
Thus, we have two cases. If someone shakes n − 1 hands, then the n people can shake hands with
between 1 and n − 1 other people. If nobody shakes hands with n − 1 people, then the n people
can shake hands with between 0 and n − 2 other people. In either case, there are n − 1 possibilities
for the number of hands that the n people can shake. The pigeonhole principle implies that two
people shake hands with the same number of people.
Note: You cannot say that the two cases are that someone shakes hands with n − 1 or someone
shakes hands with 0. It may be that neither of these is true. The two cases are someone shakes
hands with n − 1 others or nobody does. Alternatively, you could say someone shakes hands with
0 others or nobody does.
9.34 Choose a particular person of the group, say Charlie. He corresponds with sixteen others.
By the pigeonhole principle, Charlie must write to at least six of the people about one topic, say
topic I. If any pair of these six people corresponds about topic I, then Charlie and this pair do
the trick, and we are done. Otherwise, these six correspond amongst themselves only on topics II
or III. Choose a particular person from this group of six, say Eric. By the Pigeonhole Principle,
there must be three of the five remaining that correspond with Eric about one of the topics, say
topic II. If amongst these three there is a pair that corresponds with each other on topic II, then
Eric and this pair correspond on topic II, and we are done. Otherwise, these three people only
correspond with one another on topic III, and we are done again.
9.38 EAT, ETA, ATE, AET, TAE, and TEA.
9.41 Since there are 15 letters and none of them repeat, there are 15! permutations of the letters
in the word uncopyrightable.
9.43 (a) 5 · 7 · 6 · 5 · 4 · 3 · 2 = 25,200. (b) We condition on the last digit. If the last digit were 1
or 5 then we would have 5 choices for the first digit and 2 for the last digit. Then there are 6 left
to choose from for the second, 5 for the third, etc. So this leads to
5 · 6 · 5 · 4 · 3 · 2 · 2 = 7,200
possible phone numbers. If the last digit were either 3 or 7, then we would have 4 choices for the
first digit and 2 for the last. The rest of the digits have the same number of possibilities as above,
so we would have
4 · 6 · 5 · 4 · 3 · 2 · 2 = 5,760
possible phone numbers. Thus the total number of phone numbers is 7,200 + 5,760 = 12,960.
9.45 Label the letters T1, A1, L1, and L2. There are 4! permutations of these letters. However,
every permutation that has L1 before L2 is actually identical to one having L2 before L1, so we
have double-counted. Therefore, there are 4!/2 = 12 permutations of the letters in TALL.
9.46 TALL, TLAL, TLLA, ATLL, ALTL, ALLT, LLAT, LALT, LATL, LLTA, LTLA, and
LTAL. That makes 12 permutations, which is exactly what we said it should be in Exercise 9.45.
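Listing permutations by hand is error-prone, so here is a sketch of how you might let the computer do it. The class and method names are my own. It generates every ordering of the letters and stores them in a set, which silently discards the duplicates caused by repeated letters.

```java
import java.util.Set;
import java.util.TreeSet;

// Generates the distinct permutations of the letters of a word.
class DistinctPermutations {
    static Set<String> permutations(String letters) {
        Set<String> result = new TreeSet<>();   // a set, so duplicates are stored once
        permute("", letters, result);
        return result;
    }

    // Extends prefix with each remaining letter in turn.
    private static void permute(String prefix, String remaining, Set<String> result) {
        if (remaining.isEmpty()) {
            result.add(prefix);
            return;
        }
        for (int i = 0; i < remaining.length(); i++) {
            permute(prefix + remaining.charAt(i),
                    remaining.substring(0, i) + remaining.substring(i + 1),
                    result);
        }
    }

    public static void main(String[] args) {
        Set<String> tall = permutations("TALL");
        System.out.println(tall.size() + " permutations: " + tall);
    }
}
```

This approach does 4! units of work to find the 12 answers, so it is only sensible for short words, but it is a handy way to check a count like 4!/2.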
9.47 Following similar logic to the previous few examples, since we have one letter that is repeated
three times, and a total of 5 letters, the answer is 5!/3! = 20.
9.48 Ten of them are AIEEE, AEIEE, AEEIE, AEEEI, EAIEE, EAEIE, EAEEI,
EEAIE, EEAEI, EEEAI. The other ten are identical to these, but with the A and I swapped.
9.51 We can consider SMITH as one block along with the remaining 5 letters A, L, G, O, and
R. Thus, we are permuting 6 ‘letters’, all of which are unique. So there are 6! = 720 possible
permutations.
9.54 (a) 5 · 8^6 = 1,310,720. (b) 5 · 8^5 · 4 = 655,360. (c) 5 · 8^5 · 4 = 655,360.
9.58 (a) (7 · 6 · 5 · 4 · 3)/(1 · 2 · 3 · 4 · 5) = 21. (b) (12 · 11)/(1 · 2) = 66. (c) (10 · 9 · 8 · 7 · 6)/(1 · 2 · 3 · 4 · 5) = 252.
(d) (200 · 199 · 198 · 197)/(1 · 2 · 3 · 4) = 64,684,950. (e) 1.
9.61 (a) \binom{17}{15} = \binom{17}{2} = (17 · 16)/(1 · 2) = 136. (b) \binom{12}{10} = \binom{12}{2} = (12 · 11)/(1 · 2) = 66.
(c) \binom{200}{196} = \binom{200}{4} = (200 · 199 · 198 · 197)/(1 · 2 · 3 · 4) = 64,684,950. (d) \binom{67}{66} = \binom{67}{1} = 67/1 = 67.
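The trick used in 9.61, replacing k with n − k when that is smaller, is also what you would do in code. Here is one possible sketch (the names are hypothetical, and it assumes 0 ≤ k ≤ n). It multiplies and divides one factor at a time so the intermediate values stay small, and each division is exact because the running value is itself a binomial coefficient.

```java
// Computes binomial coefficients using the symmetry C(n, k) = C(n, n - k).
class Binomial {
    // Assumes 0 <= k <= n.
    static long choose(int n, int k) {
        k = Math.min(k, n - k);                 // use the smaller of k and n - k
        long result = 1;
        for (int i = 1; i <= k; i++) {
            // After this step, result equals C(n - k + i, i), so the division is exact.
            result = result * (n - k + i) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(choose(17, 15));
        System.out.println(choose(200, 196));
    }
}
```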
9.65 12, 13, 14, 15, 23, 24, 25, 34, 35, 45.
9.68
Evaluation of Solution 1: This solution does not take into account which woman was selected
and which 15 of the original 16 are left, so this is not correct.
Evaluation of Solution 2: This solution has two problems. First, it counts things multiple times.
For instance, any selection that contains both Sally and Kim will be counted twice–once
when Sally is the first woman selected and again when Kim is selected first. Second, the
product rule should have been used instead of the sum rule. Of course, that hardly matters
since it would have been wrong anyway.
Evaluation of Solution 3: This solution is incorrect since it does not take into account which man
and woman were selected and which 14 of the original 16 are left.
9.72 There are \binom{16}{5} possible committees. Of these, \binom{9}{5} contain only men and \binom{7}{5} contain only
women. Clearly these two sets of committees do not overlap. Therefore, the number of committees
that contain at least one man and at least one woman is \binom{16}{5} − \binom{9}{5} − \binom{7}{5}.
9.73 Because we subtracted the size of both of these from the total number of possible committees.
If the sets intersected, we would have subtracted some possibilities twice and the answer would
have been incorrect.
9.75
Evaluation of Solution 1: This solution is incorrect since it double counts some of the
possibilities.
Evaluation of Solution 2: This solution is incorrect because it does not take into account the
requirement that one course from each group must be taken.
9.76
Evaluation of Solution 1: This solution is incorrect since it counts some of the possibilities
multiple times.
Evaluation of Solution 2: This solution is incorrect because it does not take into account the
requirement that one course from each group must be taken.
9.79 Using 10 bars to separate the meat and 3 stars to represent the slices, we can see that this
is exactly the same as the previous two examples. Thus, the solution is \binom{13}{10} = \binom{13}{3} = 286.
9.84
(2x − y^2)^4 = \binom{4}{0}(2x)^4 + \binom{4}{1}(2x)^3(−y^2) + \binom{4}{2}(2x)^2(−y^2)^2 + \binom{4}{3}(2x)(−y^2)^3 + \binom{4}{4}(−y^2)^4
= (2x)^4 + 4(2x)^3(−y^2) + 6(2x)^2(−y^2)^2 + 4(2x)(−y^2)^3 + (−y^2)^4
= 16x^4 − 32x^3 y^2 + 24x^2 y^4 − 8xy^6 + y^8
9.85
(√3 + √5)^4 = (√3)^4 + 4(√3)^3(√5) + 6(√3)^2(√5)^2 + 4(√3)(√5)^3 + (√5)^4
= 9 + 12√15 + 90 + 20√15 + 25
= 124 + 32√15
9.87 Using a little algebra and the binomial theorem, we can see that
Σ_{k=1}^{n} \binom{n}{k} 3^k = Σ_{k=0}^{n} \binom{n}{k} 3^k − 1 = Σ_{k=0}^{n} \binom{n}{k} 1^{n−k} 3^k − 1 = (1 + 3)^n − 1 = 4^n − 1.
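If the algebra in 9.87 feels like sleight of hand, you can always add up the terms directly and compare. The sketch below does exactly that for small n. All of the names are made up, and the helper choose is just one straightforward way of computing binomial coefficients.

```java
// Checks that the sum of C(n, k) * 3^k for k = 1..n equals 4^n - 1.
class BinomialSumCheck {
    // Binomial coefficient, assuming 0 <= k <= n.
    static long choose(int n, int k) {
        k = Math.min(k, n - k);
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i;   // exact: result is C(n - k + i, i)
        }
        return result;
    }

    // Adds up the terms of the sum one at a time.
    static long sum(int n) {
        long total = 0;
        long powerOfThree = 1;
        for (int k = 1; k <= n; k++) {
            powerOfThree *= 3;                   // now equals 3^k
            total += choose(n, k) * powerOfThree;
        }
        return total;
    }

    // The closed form 4^n - 1, written as 2^(2n) - 1.
    static long closedForm(int n) {
        return (1L << (2 * n)) - 1;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++) {
            System.out.println(n + ": " + sum(n) + " vs " + closedForm(n));
        }
    }
}
```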
9.91 Let A be the set of camels eating wheat and B be the set of camels eating barley. We know
that |A| = 46, |B| = 57, and |A ∪ B| = 100 − 10 = 90. We want |A ∩ B|. By Theorem 9.89
(solving it for |A ∩ B|), |A ∩ B| = |A| + |B| − |A ∪ B| = 46 + 57 − 90 = 13.
n ≥ |A ∪ B|
= |A| + |B| − |A ∩ B|
= 0.7n + 0.75n − |A ∩ B|,
n ≥ |C ∪ D|
= |C| + |D| − |C ∩ D|
= 0.8n + 0.85n − |C ∩ D|.
This gives
|A ∩ B| ≥ 0.45n,
|C ∩ D| ≥ 0.65n.
This means that
n ≥ |(A ∩ B) ∪ (C ∩ D)|
= |A ∩ B| + |C ∩ D| − |A ∩ B ∩ C ∩ D|
≥ 0.45n + 0.65n − |A ∩ B ∩ C ∩ D|,
whence
|A ∩ B ∩ C ∩ D| ≥ 0.45n + 0.65n − n = 0.1n.
This means that at least 10% of the combatants lost all four members.
10.24 abed is a cycle of length 4 and ecdab is a cycle of length 5. Other answers are possible.
There is no cycle of length 6 since there are only 5 vertices in the graph and a cycle cannot repeat
a vertex.
10.27 You should have drawn something like this (but probably bigger and with dots on the
corners): △
10.30 You should have drawn something like this (but probably bigger and with dots on the
corners and center): ×
10.31 You should have drawn a path of length 2 and 4 vertices not connected to anything.
Something like this: | . . . .
10.55
A →D→E→F
B →D→E→F
C →D→E→F
D →A→B→C
E →A→B→C
F →A→B→C
10.56
A →D→E→F
B →D→E→F
C →D→E→F
D →
E →
F →
10.61
  A B C D E F
A 0 0 0 1 1 1
B 0 0 0 1 1 1
C 0 0 0 1 1 1
D 1 1 1 0 0 0
E 1 1 1 0 0 0
F 1 1 1 0 0 0
10.62
  A B C D E F
A 0 0 0 1 1 1
B 0 0 0 1 1 1
C 0 0 0 1 1 1
D 0 0 0 0 0 0
E 0 0 0 0 0 0
F 0 0 0 0 0 0
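The adjacency lists from 10.55 and 10.56 and the matrices from 10.61 and 10.62 contain the same information, and converting from one to the other is mechanical. Here is a sketch of the conversion. The representation is my own choice for brevity: the vertices are the characters of a string, and each vertex's neighbors are given as a string of vertex names.

```java
import java.util.Arrays;

// Converts adjacency lists into an adjacency matrix (representation chosen for illustration).
class AdjacencyMatrixSketch {
    // vertices: the vertex names in order; neighbors[i]: the names adjacent to vertex i.
    static int[][] toMatrix(String vertices, String[] neighbors) {
        int n = vertices.length();
        int[][] matrix = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (char c : neighbors[i].toCharArray()) {
                matrix[i][vertices.indexOf(c)] = 1;   // edge from vertex i to vertex c
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        // The undirected graph: every list from the first solution above.
        String[] undirected = {"DEF", "DEF", "DEF", "ABC", "ABC", "ABC"};
        for (int[] row : toMatrix("ABCDEF", undirected)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

For the directed version, the lists for D, E, and F are empty, so the last three rows of the matrix are all zeros and the matrix is no longer symmetric.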
10.73 Did you draw a triangle with a vertex in the middle connected to the three vertices of the
triangle? I thought so!
10.79 Notice that K_{3,3} does not have C_3 as a subgraph. Since K_{3,3} has 3 · 3 = 9 edges and
9 > 8 = 2(6) − 4, Theorem 10.77 part (b) implies that K_{3,3} is not planar.
Chapter 12: Reading Question Solutions
2.1 Answers will vary, but it should say something like “An argument using logic and/or math
to demonstrate or show the nature of a conclusion,” or “the process or an instance of establishing
the validity of a statement especially by derivation from other statements in accordance with
principles of reasoning” (the latter is from Merriam-Webster).
2.2 If someone correctly proves statement A, that means it is a true statement, regardless of
whether or not you understand the proof. That’s because, by definition, a proof establishes the
validity of a statement.
2.3 True. An even integer is one of the form 2k, where k is an integer. An odd integer is one of
the form 2k + 1 where k is an integer. Thus, even numbers are divisible by 2 whereas odd numbers
are not. A number cannot both be divisible by 2 and not divisible by 2 at the same time.
2.4 No. For instance, 6 is divisible by 2, but 2 is not divisible by 6.
2.5 No. A number cannot be both composite and prime because by definition, a composite
integer is a positive integer c > 1 that is not prime. Thus, every integer greater than 1 is either
prime or composite, but never both.
2.6 A proposition is a statement that is either true or false, which means x ≥ 0 is indeed a proposition
because, given a value of x, the statement is either true or false since either x ≥ 0 (in which case
it is true) or x < 0 (in which case it is false).
2.7 By definition, a proposition and its negation can never both be true. If the proposition is
true, its negation is false, and if the proposition is false, its negation is true.
2.8 No. Take the example from the book, “If you know Java, then you know a programming
language” (where A is the proposition “you know Java” and B is the proposition “you know a
programming language”). Since Java is a programming language, this proposition is true. Then
B implies A is the proposition “if you know a programming language, then you know Java.” But
that is clearly not true. You might know C++, which is a programming language, but not know
Java.
2.9 False. From the previous answer, consider the implication “if you know a programming
language, then you know Java,” and its inverse, “if you do not know a programming language,
then you do not know Java.” It shouldn’t be too difficult to see that although the inverse is true,
the implication is false.
2.10 This one is true since the inverse and converse of an implication are contrapositives of each
other, and an implication and its contrapositive are equivalent.
2.11 An implication and its contrapositive are equivalent. Therefore, if one is true, then the
other is true (and if one is false, the other is false, of course).
2.12 Your answer will likely be different, and hopefully more detailed than the one provided
here. But you should say something about how you assume that what you want to prove is false,
then use logic to arrive at a contradiction (that is, a statement that you know to be false). Since
you “proved” a false statement using correct logic, it must be that your premise (that is, the
statement that you assumed was false) is incorrect. Since your premise was that the statement
you wanted to prove was false, then it must be that the statement is true (again, because you
showed that if it is false, then you can prove something that is not true).
2.13 Since an implication and its contrapositive are equivalent, if you prove the contrapositive
of an implication is true, then the implication must also be true. And that is exactly what proof
by contraposition does.
2.14 Contradiction: Assume that p and ¬q are both true. Get a contradiction. Conclude that
if p is true, ¬q cannot be true, so q must be true. Thus, p → q is true.
Contraposition: Prove that ¬q → ¬p, which is equivalent to p → q.
In both cases, you assume that ¬q is true. Often the contradiction you get is that ¬p is true
(which is the goal in contraposition proof), so the majority of the proofs look the same. But they
begin and end slightly differently.
2.15 True. Every integer a can be written as a = a/1, where a and 1 are both integers and 1 ≠ 0,
so it is rational.
2.16 True. From the previous question, we know that every integer is rational, so integers are
not irrational. Therefore, if a number is irrational, it cannot be an integer. Alternatively, you can
recognize that this is essentially the contrapositive of the previous question, so it is also true.
2.17 False. For instance, 1.5 = 3/2 is rational, but clearly not an integer.
2.18 No, you would have to show the reverse as well. That is, if I want to prove A if and only if
B, I would have to prove either the proposition (A → B) or the contrapositive (¬B → ¬A) and
I would have to prove the inverse (¬A → ¬B) or the converse (B → A).
2.19 (a) No. This is proving the forward direction twice by proving the implication and its
contrapositive. One of these needs to be replaced with either the inverse or converse. (b) Yes.
This is proving the contrapositive and the converse.
2.20 You use proof by counterexample to disprove statements. You only need to show one
instance where the statement is not true to demonstrate that it is not always true, so proof by
counterexample is valid. On the other hand, proof by example is just demonstrating the truth of
a statement given some specific values. Just because it works for the given values, that does not
prove that it works for all values, so it is not a valid proof technique.
2.21 This is definitely not a valid proof technique! The problem is that when you write down
an equation and start working both sides of it, you are implicitly assuming that the equation you
are starting with is true. But since you are trying to prove that it is true, you can’t start your
proof with the fact that it is true–that is circular reasoning. For instance, do you remember the
supposed proof of −1 = 1 that started by working both sides of the equation?
3.1 It means that the result of the algorithm is some value that can be assigned to a variable.
For instance, when a line of code like int a = min(4,5) executes, the min method (algorithm)
“returns” the minimum value so it can be assigned to the variable a.
3.2 It certainly can be used. If the current time is t, and you are on 24-hour time, then in 8 hours
the time will be (t + 8) mod 24. If you are using 12-hour time, it is a little more complicated. You
can compute (t + 8) mod 12 to get the time in 8 hours, but if the result is 0, you need to treat it
like 12. Also, this does not take into account whether there was a switch from am to pm. That
is more complicated and we would need to use additional logic to get that part correct.
3.3 No. In fact, they are usually not going to be the same (unless you start with an integer). As
an example, the floor of 3.2 is 3, and the ceiling of 3 is 3, so the ceiling of the floor of 3.2 is 3.
The ceiling of 3.2 is 4, and the floor of 4 is 4, so the floor of the ceiling of 3.2 is 4.
3.4 Here is one possibility: keep subtracting n from a until you get to a number less than n:
int modN(int a, int n) {
    // Repeatedly subtract n until what is left is smaller than n.
    while (a >= n) {
        a = a - n;
    }
    return a;
}
That is not very efficient if a is large, however. A more efficient solution would be as follows:
int modN(int a, int n) {
    return a - (a / n) * n;
}
Make sure you understand why this solution works! Also, it should be noted that both of these
solutions require that a ≥ 0.
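One way to convince yourself that both versions are correct is to compare them with Java's built-in % operator on a range of inputs. Here is a sketch that does so. The two methods are renamed (hypothetically) so they can live in the same class, and the check is restricted to a ≥ 0 and n > 0, as required above.

```java
// Compares the two modN solutions with Java's % operator for a >= 0 and n > 0.
class ModNCheck {
    // Repeated subtraction: simple, but slow when a is much larger than n.
    static int modNBySubtraction(int a, int n) {
        while (a >= n) {
            a = a - n;
        }
        return a;
    }

    // Integer division discards the remainder, so (a / n) * n is the largest
    // multiple of n that is at most a.
    static int modNByDivision(int a, int n) {
        return a - (a / n) * n;
    }

    public static void main(String[] args) {
        System.out.println(modNBySubtraction(17, 5) + " " + modNByDivision(17, 5) + " " + (17 % 5));
    }
}
```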
3.5 Here is one possible solution that is very concise. You might have a similar solution that has
an if-else clause. That’s fine, too.
boolean congruentModN(int a, int b, int n) {
    return ((a - b) % n == 0);
}
3.6 This one is pretty straightforward: if(x%2==1)
3.7 The easy answer is as follows:
if (x % 2 == 0) {
    x++;
} else {
    x--;
}
One way to do it without a conditional statement is: x+=1-2*(x%2);
3.8 n − 1. Notice that the loop starts at 1 and ends at n − 1, so that’s n − 1 times (not n as you
might have guessed).
3.9 Here is the solution I expected you to come up with:
int sumFirstN(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++) {
        sum += i;
    }
    return sum;
}
O.K., I can’t resist giving the better answer. The following is much more efficient, but it may be
unclear to you why it is correct. Do not worry–we will see the idea behind it later.
int sumFirstN(int n) {
    return n * (n + 1) / 2;
}
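If you are skeptical that the formula version gives the same answers, here is a small sketch that runs both and compares them. The two methods are given different (made up) names so they can sit side by side, and the comparison assumes n is small enough that n(n + 1) does not overflow an int.

```java
// Compares the loop version of sumFirstN with the formula version.
class SumFirstNCheck {
    // Adds 1 + 2 + ... + n one term at a time.
    static int sumByLoop(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // Uses the closed form n(n + 1)/2; one of n and n + 1 is even, so the division is exact.
    static int sumByFormula(int n) {
        return n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 10; n++) {
            System.out.println(n + ": " + sumByLoop(n) + " vs " + sumByFormula(n));
        }
    }
}
```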
3.10
int minimum(int a[], int n) {
    int min = a[0];
    for (int i = 1; i < n; i++) {
        if (a[i] < min) {
            min = a[i];
        }
    }
    return min;
}
3.11 Does your solution use a boolean variable to keep track of whether or not you found a 0?
If so, try to rewrite your code so that it does not do so before you read the solution given below.
If you used a boolean variable, your solution is both more complicated and less efficient than it
should be, and it is best that you figure out the better way on your own so that you are less likely
to write similar code in the future.
Again, before reading the solution given below, does your solution contain an else clause? If
so, think about whether or not it is correct by walking through your code on a small array. Since
I can’t read your code, I cannot be certain, but if you have an else clause, your solution is likely
to be incorrect.
boolean containsZero(int a[], int n) {
    for (int i = 0; i < n; i++) {
        if (a[i] == 0) {
            return true;
        }
    }
    return false;
}
3.12 Here is the first version:
boolean noZeroes(int a[], int n) {
    return !containsZero(a, n);
}
Here is the second version, which looks almost identical to the solution to the previous question:
boolean noZeroes(int a[], int n) {
    for (int i = 0; i < n; i++) {
        if (a[i] == 0) {
            return false;
        }
    }
    return true;
}
3.13
int i = 1;
while (i < n) {
    // do something
    i++;
}
4.1 A proposition is a statement that is either true or false.
4.2 The operators are negation, or, and, exclusive-or, conditional, and biconditional. See Table 4.1
and Table 4.2 for the truth tables.
4.3 p ∨ q is true if p is true, q is true, or both p and q are true, whereas p ⊕ q is true if and only
if exactly one of p or q is true. So the difference is that if both p and q are true, p ∨ q is true, but
p ⊕ q is false.
4.4 It is true unless p is true and q is false.
4.5 Here is a truth table with some intermediate columns to help. As long as you have the same
final column, yours is probably fine.
p q p ∧ q ¬p (p ∧ q) ∨ ¬p
T T T F T
T F F F F
F T F T T
F F F T T
4.6 Here is a truth table with some intermediate columns to help. As long as you have the same
final column, yours is probably fine.
p q r p ∧ q (p ∧ q) ∨ r
T T T T T
T T F T T
T F T F T
T F F F F
F T T F T
F T F F F
F F T F T
F F F F F
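Since Java has boolean operators that behave like ∧ and ∨, you can have the computer produce the final column of a truth table like this one. The following is just a sketch with made-up names. It loops over the eight combinations in the same order as the table above and prints (p ∧ q) ∨ r for each.

```java
// Prints the truth table for (p AND q) OR r in the order T T T, T T F, ..., F F F.
class TruthTableSketch {
    // The compound proposition from the table.
    static boolean formula(boolean p, boolean q, boolean r) {
        return (p && q) || r;
    }

    // Converts a truth value to the T/F notation used in the tables.
    static String letter(boolean value) {
        return value ? "T" : "F";
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        for (boolean p : values) {
            for (boolean q : values) {
                for (boolean r : values) {
                    System.out.println(letter(p) + " " + letter(q) + " " + letter(r)
                            + " " + letter(formula(p, q, r)));
                }
            }
        }
    }
}
```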
4.7 By definition, no. If p is true, ¬p is false, and if p is false, ¬p is true.
4.8 q must be true since one of them has to be and p is not.
4.9 Nothing. It might be true, but it also might be false.
4.10 They are both true.
4.11 q has to also be false since p ↔ q implies they have the same truth value.
4.12 In this case q also has to be true.
4.13 Since p → q is true whenever p is false, we cannot say anything about q, so it could be true
or false.
4.14 It is probably a contingency. A tautology is a proposition that is always true. Since there
is a conditional statement involved, it is likely that different cases happen. In other words, there
would be no need for the if statement if the expression inside it is a tautology.
4.15 (a) p ∨ T is true if and only if either p or T is true. Since clearly T is true, p ∨ T is always
true. Therefore, p ∨ T = T . Alternatively, you can give a truth table for p ∨ T and then comment
something like “since the final column of the truth table is always true, p ∨ T = T .” (b) We will
do this one with a truth table: Notice that the column for p ∧ F in the truth table below is always
false. Therefore p ∧ F = F .
p p∧F
T F
F F
4.16 Although logically these two conditional statements are equivalent, they are actually not
equivalent in most programming languages due to short-circuiting. The first one is correct because
it makes sure that the index is valid before indexing into the array. The second one will probably
crash if i indexes outside of the array, so it is not correct.
4.17 if( x > 0 && y > 0 ) (using DeMorgan’s law). Note that if( x >= 0 && y >= 0 ) is
not correct since x>0 is the negation of x<=0.
4.18 We need to use a nested conditional statement:
if (a != null) {
    if (a.size() > 0) {
        a.clear();
    }
}
4.19 You could draw a truth table and show that the columns for these two differ. However,
there is an easier approach. Notice that if p is true and q is false, ¬p ∧ ¬q is false, but ¬(p ∧ q)
is true. Therefore they are not equivalent.
4.20 A propositional function is a statement that contains one or more variables and depending
on the values of the variable(s), the statement is true or false. In other words, a propositional
function is a function whose outputs are propositions.
4.21 ¬∀xP (x) means that it is not the case that P (x) is true for all values of x. So it doesn’t
mean it is never true. It means that for one or more values of x it is false. That is, it means
5.6 {n/100|n ∈ Z}
5.7 Yes (because the power set of A is the set of all subsets of A, and A ⊆ A, so A ∈ P (A)) and
No (because the elements of P (A) are subsets of A, so a subset of P (A) would be a
set of subsets of A, but A is a set of elements (of whatever type A consists of)). This is a subtle
point, so do not worry too much about it unless you plan to major in mathematics.
5.8 $2^5 = 32$.
5.9 (a) |A| = 5 and $|P(A)| = 2^5 = 32$. (b) No. The elements of A are a, b, etc. If that statement
were true, {a} ∈ A, but it is not. Remember, a ∈ A (which is true) and {a} ∈ A (which is not
true) are not saying the same thing. (c) Yes. (d) No. As in part (b), the elements of A are a,
b, etc. but {b, c, e} is a set of those elements. Remember: ∈ means “element of,” and ⊆ means
“subset of,” which are not the same thing! (e) Yes because each of the three sets in the set on
the left are subsets of A. (f) No. To be a subset of the power set, it needs to be a set of subsets.
This is a set of elements. (g) Yes because the set {b, c, e} ⊆ A.
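The count in part (a) can be checked by enumerating the power set directly. This is only a sketch: it assumes A = {a, b, c, d, e}, which is consistent with the answers above but is not restated in this section, and the class name is made up.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: enumerates every subset of a small set using bit masks.
// The set contents passed in by callers are an assumption for illustration.
final class PowerSetDemo {
    // Bit i of mask decides whether items[i] belongs to the current subset.
    static List<List<Character>> powerSet(char[] items) {
        List<List<Character>> subsets = new ArrayList<>();
        for (int mask = 0; mask < (1 << items.length); mask++) {
            List<Character> subset = new ArrayList<>();
            for (int i = 0; i < items.length; i++) {
                if ((mask & (1 << i)) != 0) {
                    subset.add(items[i]);
                }
            }
            subsets.add(subset);
        }
        return subsets;
    }

    public static void main(String[] args) {
        char[] a = {'a', 'b', 'c', 'd', 'e'};
        System.out.println(powerSet(a).size());
    }
}
```

Every element of the returned list is itself a list, which mirrors the point of parts (b) through (g): members of the power set are sets, not individual elements of A.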
5.10 (a) {a, b, c, d, f , g, h, j, k, l, m, n, p, q, r, s, t, v, w, x, y, z} (b) {b, d, g, k, p, v} (c) {a} (NOT a!)
(d) {a} (e) {e, i, o, u}
5.11 (a) A ⊆ B. (b) A = B.
5.12 First, $A \cap B$ is the set of elements that are in both A and B. Therefore, $\overline{A \cap B}$ is the set
of elements that are not in both A and B. (This includes elements that are in A but not B, in
B but not A, or in neither A nor B.) $\overline{A} \cup \overline{B}$ is the union of the elements that are not in A and
the elements that are not in B. That is, it is any element that is either not in A or not in B.
So it is all elements that are not in both A and B, which is exactly what $\overline{A \cap B}$ is. Therefore,
$\overline{A \cap B} = \overline{A} \cup \overline{B}$.
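5.13 First, notice that since U is the universal set, $A \subseteq U$ and $\overline{A} \subseteq U$. Let $x \in A \cup \overline{A}$. Then
either $x \in A$ or $x \in \overline{A}$. In either case, $x \in U$ (since $A \subseteq U$ and $\overline{A} \subseteq U$). Therefore $A \cup \overline{A} \subseteq U$.
Now, assume $x \in U$. If $x \in A$, then $x \in A \cup \overline{A}$. If $x \notin A$, then by definition, $x \in \overline{A}$, and
therefore $x \in A \cup \overline{A}$. In either case, $x \in A \cup \overline{A}$. Therefore $U \subseteq A \cup \overline{A}$.
Since we proved containment both ways, $A \cup \overline{A} = U$.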
5.14 (a) Answers will vary, but should be things like (1, z), (3, x), and (2, v). (b) Answers will
vary, but should be things like (1, 2), (3, 3), and (4, 1). (c) 24. (d) $4^3 = 64$. (e) $2^{4^2 \cdot 6} = 2^{96}$.
5.15 The area being pointed to is “with AND without you,” so it is not at all correct.
That is definitely not where Bono can’t live. Notice that “with you” and “without you” are
complements, so the Venn diagram shown here demonstrates a proper relationship between them.
So where can’t Bono live? He can’t live anywhere in the rectangle (the universe) because he
can’t live in the circle (“with you”) or outside the circle (“without you”). Put another way,
the entire diagram is the place where Bono can’t live.
[Figure: Venn diagram titled “Where Bono can’t live”: a rectangle containing a single circle labeled “with you”; the region outside the circle is labeled “without you.”]
5.16 (a) Z. (b) Z. (c) {2z|z ∈ Z}. That is, the set of all even numbers.
5.17 Here are possible answers. There are many other possible correct answers. (a) f (x) = 2x.
(b) f (x) = (2x) mod 6 (so f (4) = 2 instead of 8). (c) f (x) = 2x. (d) f (x) = 4. (e) f (x) = 2x.
5.18 Here are possible answers. There are many other possible correct answers. (a) f (x) = 2x.
(b) f (x) = (2x) mod 6 (so f (4) = 2 instead of 8). (c) Impossible. The domain only has 4 numbers,
and the codomain has 6, so it is impossible to map to all of them. (d) f (x) = 2x (it doesn’t map
to 10 or 12). (e) Impossible. Since an onto function is impossible, a bijective one is as well.
5.19 (a) False. If a = b, then clearly f (a) = f (b), so that’s meaningless. You need to show that
if f (a) = f (b), then a = b. (b) True. (c) False. That’s the definition of a function. To show
onto, you need to show that every element of the codomain gets mapped to by some element of
the domain. (d) True. Since the range is the set of values actually mapped to, if it equals the
codomain, then every value is mapped to and the function is onto. (e) False. For two reasons.
First, elements of the range are always mapped to–that’s the definition of range. Even if you
change range to codomain, it is still false. You only need to show that there is at least one
element of the codomain that is not mapped to. (f) True.
5.20 $(f \circ g)(x) = 2^{x+2}$ and $(g \circ f)(x) = 2^x + 2$. Did you get them backwards? If so, look at the
definition of composition of functions again.
5.21 (a) C = {1, 3, 5, 7, 9}. (b) C = {1, 2, 3, 5, 7, 9} (or any set that contains all of 1,3,5,7,9,
and at least one of 2, 4, 6, 8, or 10). (c) C = {1, 3, 7, 9} (or any proper subset of {1, 3, 5, 7, 9}).
(d) C = {1, 3, 5} and D = {7, 9} (or any two disjoint sets such that C ∪ D = {1, 3, 5, 7, 9}). (e)
C = {1, 2, 3, 5} and D = {7, 9} (or many other possibilities).
5.22 Yes. It is a subset of Z × Z.
5.23 There are many possible answers. Here is one. Define T to be the relation on human beings
such that xT y if and only if x is at least as tall as y. It is not hard to see that T is reflexive,
anti-symmetric, and transitive, so it is a partial order.
5.24 Answers will vary, but here is one possibility. Let C be the set of all cars. Define the relation
R on C by xRy if and only if x was manufactured in the same year as y. It is not hard to see that R
is symmetric, reflexive, and transitive, so it is an equivalence relation. Let $C_y$ be the set of all cars
manufactured in year y. Then it is not hard to see that $C = C_{1886} \cup C_{1887} \cup \cdots \cup C_{2021} \cup C_{2022} \cup C_{2023}$
is a partition of C (assuming you aren’t reading this in 2024 or later, and assuming we regard
1886 as the first year a car was made, which turns out to be a question without a clear answer).
For a representative for class $C_y$, simply pick any car that was manufactured in year y.
5.25 (a) It is straightforward to prove that B is symmetric, reflexive, and transitive, so it is an
equivalence relation. (b) It is not. It is not anti-symmetric. (c) Let $B_i$ be the positive integers
that have i ones in their binary representation. It is easy to see that $B_i \cap B_j = \emptyset$ if $i \neq j$ and
that $\mathbb{Z}^+ = B_1 \cup B_2 \cup B_3 \cup \cdots$, so this definition gives us a partition of $\mathbb{Z}^+$. (d) We just let $a_i$ be
the number whose binary representation is i 1s. For instance, $a_1 = 1$, $a_2 = 3$, $a_3 = 7$, etc. Notice
that we can do better by defining it explicitly: $a_i = 2^i - 1$. (e) The smallest elements of $B_2$ would
be $3 = 11_2$, $5 = 101_2$, $6 = 110_2$, and $9 = 1001_2$. There are infinitely many other possible answers.
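Parts (d) and (e) are easy to spot-check with Java's built-in `Integer.bitCount`. The sketch below is one possible way to do so; the class and method names are assumptions made for illustration.

```java
// Sketch: checks the representatives and smallest members of the classes
// "positive integers with exactly i ones in binary". Names are illustrative.
final class BinaryOnes {
    // The representative 2^i - 1, whose binary representation is i ones.
    static int representative(int i) {
        return (1 << i) - 1;
    }

    // Returns the first `count` positive integers having exactly `ones` 1-bits.
    static int[] smallestWithOnes(int ones, int count) {
        int[] result = new int[count];
        int found = 0;
        for (int n = 1; found < count; n++) {
            if (Integer.bitCount(n) == ones) {
                result[found++] = n;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(smallestWithOnes(2, 4)));
    }
}
```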
6.1 $x_1 = 3 - 2 = 1$, $x_2 = 3^2 - 2^2 = 5$, $x_3 = 3^3 - 2^3 = 19$, $x_4 = 3^4 - 2^4 = 65$, and $x_5 = 3^5 - 2^5 = 211$.
6.2 It means to find an explicit formula (or closed formula) for it. In other words, a formula that
is not recursively defined.
6.3 $x_2 = 2x_1 + 3 = 2 \cdot 1 + 3 = 5$, $x_3 = 2x_2 + 3 = 2 \cdot 5 + 3 = 13$, $x_4 = 2x_3 + 3 = 2 \cdot 13 + 3 = 29$,
and $x_5 = 2x_4 + 3 = 2 \cdot 29 + 3 = 61$.
6.4 (a) $x_2 = x_1 + 3 = 2 + 3 = 5$, $x_3 = x_2 + 3 = 5 + 3 = 8$, $x_4 = x_3 + 3 = 8 + 3 = 11$,
and $x_5 = x_4 + 3 = 11 + 3 = 14$. (b) $x_n = 3n - 1$. (c) Notice that if $x_n = 3n - 1$, then
$x_{n-1} = 3(n - 1) - 1 = 3n - 4$. So $x_{n-1} + 3 = 3n - 4 + 3 = 3n - 1 = x_n$, verifying that it
works for the recurrence relation. But we also need to show that it works for the base case:
$x_1 = 2 = 3(1) - 1$, so it also works for the base case and we are done.
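The verification in part (c) is the real proof, but a quick numeric comparison can catch an incorrect guess early. Here is a minimal sketch comparing the recurrence from 6.4 with the closed form; the class and method names are made up for illustration.

```java
// Sketch: compares the recursive definition x_1 = 2, x_n = x_(n-1) + 3
// with the closed form x_n = 3n - 1. Names are illustrative assumptions.
final class ArithmeticRecurrence {
    // Direct translation of the recurrence relation, valid for n >= 1.
    static int recursive(int n) {
        return n == 1 ? 2 : recursive(n - 1) + 3;
    }

    // The proposed closed formula.
    static int closedForm(int n) {
        return 3 * n - 1;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(recursive(n) + " " + closedForm(n));
        }
    }
}
```

Agreement on finitely many values does not prove the formula; it only suggests the guess is worth proving.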
6.5 Increasing because for most algorithms, if you have more input it takes longer to even read
the input, so it would also take longer to run the algorithm. In rare cases it might be constant–for
instance, an algorithm that returns the fifth element of an array will take the same amount of
time regardless of the size of the array.
6.6 They are sometimes monotonic. If r < 0 they will not be monotonic, but if r > 0 they are
monotonic. They are also sometimes increasing. For instance, if r > 1, it will be increasing, but
if 0 < r < 1 it will be decreasing. If r < 0, they are neither increasing nor decreasing since they
go back and forth between positive and negative.
6.7 They are always monotonic. A given arithmetic progression is either always increasing or
always decreasing, depending on whether d is positive or negative.
6.8 Many answers are possible, but your answer should be very similar in form to the ones given.
(a) $a_n = 3(2/3)^n$. (b) $b_n = 2 + 8n$. (c) $c_n = (5/3)c_{n-1}$, $c_0 = 3$. (d) $d_n = d_{n-1} + 9/4$, $d_0 = 8$.
6.9 They are not because $-x^i = -(x^i)$ and $(-x)^i = (-1)^i (x)^i$. Depending on whether x is
positive or negative and depending on whether i is even or odd, these may have opposite signs.
6.10 $\sum_{i=0}^{6} -(-3)^i$ or $\sum_{i=0}^{6} (-1)^{i+1} 3^i$.
6.11 $\sum_{k=0}^{30} (5k - 7) = \sum_{k=0}^{30} 5k - \sum_{k=0}^{30} 7 = 5 \sum_{k=0}^{30} k - 7 \cdot 31 = 5 \cdot \frac{30 \cdot 31}{2} - 217 = 2108$
6.12 $\sum_{k=0}^{n} 2^k = \frac{2^{n+1} - 1}{2 - 1} = 2^{n+1} - 1$ or $\sum_{k=0}^{n} 2^k = \frac{1 - 2^{n+1}}{1 - 2} = \frac{1 - 2^{n+1}}{-1} = 2^{n+1} - 1$.
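The closed forms in 6.11 and 6.12 can be spot-checked by adding the terms directly. This is a small sketch with made-up class and method names, not something from the text.

```java
// Sketch: evaluates the sums from 6.11 and 6.12 term by term so the results
// can be compared with the closed formulas. Names are illustrative.
final class SumChecks {
    // Sum of (5k - 7) for k = 0..n.
    static long linearSum(int n) {
        long total = 0;
        for (int k = 0; k <= n; k++) {
            total += 5L * k - 7;
        }
        return total;
    }

    // Sum of 2^k for k = 0..n; keep n at most 60 so the result fits in a long.
    static long powersOfTwoSum(int n) {
        long total = 0;
        for (int k = 0; k <= n; k++) {
            total += 1L << k;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(linearSum(30));
        System.out.println(powersOfTwoSum(10));
    }
}
```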
6.13 Sure. Whenever x0 = 0.
6.14 If you can get this one without mistakes, you are doing really well! If you don’t quite get it,
keep trying to do it on your own. You will learn a lot in the process and it will be a good algebra
refresher.
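\begin{align*}
\sum_{k=1}^{23} \frac{11}{(-7)^k}
&= 11 \sum_{k=1}^{23} \frac{1}{(-7)^k}
 = 11 \sum_{k=1}^{23} \left(\frac{1}{-7}\right)^k
 = 11 \sum_{k=1}^{23} \left(-\frac{1}{7}\right)^k
 = 11 \left( \sum_{k=0}^{23} \left(-\frac{1}{7}\right)^k - \left(-\frac{1}{7}\right)^0 \right) \\
&= 11 \left( \frac{1 - \left(-\frac{1}{7}\right)^{24}}{1 - \left(-\frac{1}{7}\right)} - 1 \right)
 = 11 \left( \frac{1 - \left(\frac{1}{7}\right)^{24}}{\frac{8}{7}} - 1 \right)
 = 11 \left( \frac{7}{8}\left(1 - \left(\frac{1}{7}\right)^{24}\right) - 1 \right) \\
&= 11 \left( \frac{7}{8} - \frac{7}{8}\left(\frac{1}{7}\right)^{24} - 1 \right)
 = 11 \left( -\frac{1}{8} - \frac{1}{8}\left(\frac{1}{7}\right)^{23} \right)
 = -\frac{11}{8} \left( 1 + \left(\frac{1}{7}\right)^{23} \right).
\end{align*}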
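If you want to check your algebra on this one, a floating-point comparison is a reasonable sanity test. The sketch below assumes the sum runs from k = 1 to 23 with terms 11/(−7)^k and compares it with the final closed form; the names are illustrative, and floating-point agreement is only approximate.

```java
// Sketch: numerically compares the sum of 11/(-7)^k for k = 1..23 with the
// closed form -(11/8)(1 + (1/7)^23). Names are illustrative assumptions.
final class GeometricSumCheck {
    // Adds the 23 terms directly using double arithmetic.
    static double directSum() {
        double total = 0.0;
        for (int k = 1; k <= 23; k++) {
            total += 11.0 / Math.pow(-7.0, k);
        }
        return total;
    }

    // Evaluates the simplified closed form.
    static double closedForm() {
        return -(11.0 / 8.0) * (1.0 + Math.pow(1.0 / 7.0, 23));
    }

    public static void main(String[] args) {
        System.out.println(directSum());
        System.out.println(closedForm());
    }
}
```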
6.15 According to Theorem 6.84, $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots + (-1)^n \frac{x^{2n}}{(2n)!} + \cdots$. To approximate
cos(1), we can just use several terms of the infinite sum. So,
7.5 If f (n) = o(g(n)), then we know that f (n) grows slower than g(n). If f (n) = O(g(n)), then
f (n) grows no faster than g(n). That is, f (n) either grows slower than g(n) or they grow at the
same rate.
7.6 More. If you know that f (n) = Θ(g(n)), then you know that they grow at the same rate.
But if you only know that f (n) = O(g(n)), it is possible that they grow at the same rate, but it is
also possible that f (n) grows slower than g(n).
7.7 Proof 1: Notice that if $n \geq 1$, $7n^3 + 4n^2 - 8n + 27 \leq 7n^3 + 4n^2 + 27 \leq 7n^3 + 4n^3 + 27n^3 = 38n^3$.
By definition of Big-O, $7n^3 + 4n^2 - 8n + 27 = O(n^3)$.
Proof 2: Notice that
$$\lim_{n\to\infty} \frac{7n^3 + 4n^2 - 8n + 27}{n^3} = \lim_{n\to\infty} \left( \frac{7n^3}{n^3} + \frac{4n^2}{n^3} - \frac{8n}{n^3} + \frac{27}{n^3} \right) = \lim_{n\to\infty} \left( 7 + \frac{4}{n} - \frac{8}{n^2} + \frac{27}{n^3} \right) = 7 + 0 + 0 + 0 = 7.$$
Thus, $7n^3 + 4n^2 - 8n + 27 = \Theta(n^3)$ by Theorem 7.50 (part 3). By
Theorem 7.18, $7n^3 + 4n^2 - 8n + 27 = O(n^3)$.
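The inequality used in Proof 1 can be spot-checked numerically. A finite check like the sketch below is not a proof, but it helps catch an incorrect constant; the class and method names are assumptions for illustration.

```java
// Sketch: checks the bound 7n^3 + 4n^2 - 8n + 27 <= 38n^3 for sample n >= 1.
// This is a sanity check only, not a substitute for the proof.
final class BigOWitness {
    // The polynomial from 7.7.
    static long f(long n) {
        return 7 * n * n * n + 4 * n * n - 8 * n + 27;
    }

    // True when the Big-O witness inequality holds at this n (constant c = 38).
    static boolean boundHolds(long n) {
        return f(n) <= 38 * n * n * n;
    }

    public static void main(String[] args) {
        for (long n = 1; n <= 5; n++) {
            System.out.println(n + ": " + f(n) + " <= " + (38 * n * n * n) + " is " + boundHolds(n));
        }
    }
}
```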
7.8 Notice that $\lim_{n\to\infty} \frac{3^n}{3.1^n} = \lim_{n\to\infty} \left(\frac{3}{3.1}\right)^n = 0$ since $\frac{3}{3.1} < 1$. By Theorem 7.50 (part 1),
$3^n = o(3.1^n)$.
7.9 Notice that both functions have a factor of n. Since log n grows faster than c (which doesn’t
grow at all since it is a constant), n log n grows faster than cn.
It is actually easy to prove this formally: $\lim_{n\to\infty} \frac{n \log n}{c\,n} = \lim_{n\to\infty} \frac{\log n}{c} = \infty$, so by Theorem 7.50,
$n \log n = \omega(cn)$. In other words, n log n grows faster than cn.
7.10 No! The function may grow slowly, but it is still growing. So you are multiplying a function
by another function that is growing (even if slowly), which grows faster than 1. Thus, f (n) log n
definitely grows faster than f (n).
As with the previous question, a simple limit computation gives a clear proof of this fact.
$\lim_{n\to\infty} \frac{f(n) \log n}{f(n)} = \lim_{n\to\infty} \frac{\log n}{1} = \infty$, so by Theorem 7.50, $f(n) \log n = \omega(f(n))$. In other words,
f (n) log n grows faster than f (n).
7.11 8675309, $7 \log_{10} n \sim \log_3 n$, $27n$, $7n \log_2 n$, $n^2 + n + 1 \sim 3n^2$, $n^3 \sim n^3 + n^2 \log_e n$, $2^n$, $7^n$,
$n!$, $n^n$
7.12 It is unclear what types of machines the algorithms were run on. If one was run on a
cellphone and another on a supercomputer, it is definitely not a fair comparison. If they were run
on the same machine and on the same data, then we can compare them.
7.13 No. If one always takes 3 times as long, it’s just a constant multiple difference, so they have
the same growth rate. E.g. if one takes f (n) time, the other takes 3f (n) time, and these have the
same growth rate. Remember, when we are talking about growth rates, the constant multiples
do not matter.
7.14 Since g(n) grows faster than f (n), as n increases, algorithm B will take (relatively) longer
than A. In other words, algorithm A is faster. Remember: Faster growth rate of computational
complexity means slower algorithm!
7.15 You cannot even read all of the inputs in Θ(log n) time, so this is clearly ludicrous. It
would have to take at least Ω(n) time (and even that is not attainable, but we aren’t ready for
that proof yet).
7.16 This would be an awful algorithm since searching should not take more than Θ(n) time.
7.17 Better because insertionSort takes $\Theta(n^2)$ time and $n^{1.5} = o(n^2)$.
7.18 One (or both) of the loops could only execute a constant number of times. Also, maybe
the index variable(s) aren’t just incremented/decremented by 1 each time through the loop. If
instead the variables are doubled or halved, for instance, the complexity might involve a logarithm
or something (e.g. maybe Θ(n log n)). Here’s a third (for free!): Maybe the outer loop executes
n times and the inner loop executes m times. Then the complexity is Θ(n m).
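To make the first two possibilities concrete, the sketch below counts how many times the innermost statement runs for two hypothetical nested loops. Neither loop is from the text; they are just examples of nested loops that are not Θ(n²).

```java
// Sketch: counts inner-loop executions for two nested loops whose running
// time is not quadratic. Both loops are hypothetical examples.
final class LoopCounts {
    // Inner index doubles each time, so the inner loop runs about log2(n)
    // times per outer iteration, giving roughly n log n steps overall.
    static long doublingInner(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 1; j < n; j *= 2) {
                count++;
            }
        }
        return count;
    }

    // Inner loop always runs 3 times, so the total is 3n, which is linear.
    static long constantInner(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < 3; j++) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(doublingInner(1024));
        System.out.println(constantInner(1024));
    }
}
```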
7.19 (a) Generally speaking, B is faster. Since we are given Θ bounds on the complexities, we
know the exact growth rates and $n \log n = o(n^2)$, so B is definitely a faster algorithm. We know
that B is faster for large enough input. We do not know if it is faster for small inputs. (b) As
mentioned in the previous part, for small values of n, A could possibly be faster because the
constants involved in B might be much larger.
7.20 This is a trick question! The answer is unclear. We know that A has complexity Θ(n log n),
but we are only given a Big-O bound on the complexity of B. It is possible that B has complexity
Θ(n) and it is faster, or that it has complexity Θ(n log n) and it is the same as A, or that it has
complexity $\Theta(n^2)$ and it is slower than A.
7.21 Memory usage; how complicated the algorithms are to implement, maintain, and debug;
the constants; how convinced you are of the correctness of each algorithm.
8.1 We need a “starting point.” Induction proofs involve proving a statement of the form
P (k) → P (k + 1). That is, we prove that if P is true for some value, then it is true for the next
value. But that does not imply that it is ever true–only that if it is true for some value, it is true
for the next value. So we need to prove that P is true for some value to get things started.
8.2 This is not circular reasoning because we are not proving that P (k + 1) is true. We are
proving that P (k) → P (k + 1) is true. In other words we are assuming P (k) is true, and then using
that fact to prove that P (k + 1), which is a different statement, is true. But again, we did not
prove that P (k + 1) is always true. We only proved that IF P (k) is true, then P (k + 1) is true.
8.3 (a) It is saying that if P (a) is true and it is also true that whenever P (k) is true that
P (k + 1) is true, then P (n) is true for all n ≥ a. (b) We know that P (a) is true. We also know
that P (k) → P (k + 1) is true for all k ≥ a. In particular, we know that P (a) → P (a + 1) is true.
Using modus ponens, we can conclude that P (a + 1) is true. But then we can use modus ponens
again to conclude that P (a + 2) is true (since we know P (a + 1) and P (a + 1) → P (a + 2) are both
true). We can keep doing this over and over again so that eventually we can show that P (a + k)
is true for any k ≥ 0. Thus, P (n) is true for all n ≥ a.
8.4 No. Notice that we know that P (0) is true, but we only know that P (k) → P (k + 1) for
k > 0. So we know P (1) → P (2), but we do not know whether or not P (0) → P (1). In other
words, we have a base case and an inductive case, but the inductive case does not go all the way
down to the base case, so we cannot connect them. (Note: this is not a failure of induction. It is
a failure in trying to use induction. A proper induction proof would show that P (k) → P (k + 1)
for k ≥ 0 so that the inductive step applies to the base case.)
8.5 There are clearly $2 = 2^1$ binary strings of length 1 (‘0’ and ‘1’). Assume that there are $2^k$
binary strings of length k. Every string of length k + 1 ends with either a ‘0’ or a ‘1’, and the first
k characters can be any of the possible binary strings of length k. In other words, the number
of binary strings of length k + 1 is $2 \cdot 2^k = 2^{k+1}$ since we can append to each of the $2^k$ strings of
length k either a ‘0’ (producing $2^k$ strings of length k + 1) or a ‘1’ (producing a different $2^k$ strings
of length k + 1). Since we proved the base and inductive cases, we have shown that the number of
binary strings of length k is $2^k$ for all k ≥ 1.
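The inductive step translates almost directly into a recursive construction. The following sketch builds the strings of length k from those of length k − 1 by appending a character; it is an illustration with made-up names, not part of the proof.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: builds all binary strings of length k by appending '0' or '1' to
// each string of length k - 1, mirroring the inductive step.
final class BinaryStrings {
    // Returns every binary string of length k, for k >= 1.
    static List<String> ofLength(int k) {
        List<String> result = new ArrayList<>();
        if (k == 1) {
            // Base case: the two strings of length 1.
            result.add("0");
            result.add("1");
            return result;
        }
        for (String shorter : ofLength(k - 1)) {
            result.add(shorter + "0");
            result.add(shorter + "1");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ofLength(3));
    }
}
```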
8.6 These are both correct techniques.
8.7 In weak induction you assume P is true for a given value and show P is true for the next
value. For instance, you might assume P (k) is true and prove that P (k + 1) is true. In strong
induction you assume that P is true for every value from the base case up to a certain value, and
then you prove it for the next value. For instance, you assume P (1) ∧ P (2) ∧ · · · ∧ P (k − 1) is
true and prove that P (k) is true. (By the way, I purposely used P (k) and P (k + 1) for one and
P (k − 1) and P (k) for the other to re-emphasize that you can do it either way.)
8.8 Induction is like a bunch of dominoes lined up. If you push the first one over, it will push
the next one over, which will push the next one over, etc., until they have all fallen down. The
first domino is the base case. The fact that the dominoes are placed close enough to each other
is like the inductive case (because they are close enough, if one falls, the next one will).
8.9 It links to itself, kind of like how a recursive algorithm calls itself.
8.10 (1) A recursive algorithm must have one or more inductive cases (or recursive cases) where
the algorithm calls itself. This is required since otherwise the algorithm is not recursive. (2) A
recursive algorithm must have one or more base cases that can be solved directly (or at least non-
recursively). This is required since otherwise the algorithm would never finish. (3) The recursive
calls must be making progress toward the base cases. This is also required since otherwise the
algorithm would never finish.
8.11 They are both based on the same ideas, particularly that of base cases and inductive cases.
8.12 Here is one solution. This should be called with n being the size of the array. Notice
that it searches from the end of the array to the beginning of the array because it is the most
straightforward way to implement the recursive idea.
int search(int[] a, int n, int value) {
    if (n <= 0) {
        return -1;
    } else if (a[n - 1] == value) {
        return n - 1;
    } else {
        return search(a, n - 1, value);
    }
}
8.13 Neither is better. Iterative algorithms are often faster than their recursive equivalents, but
recursive algorithms are often easier to come up with and implement. So they both have their
merits.
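For comparison, here is a sketch that places a recursive search (following the solution to 8.12) next to one possible iterative equivalent. The iterative version and the class name are assumptions added for illustration; both scan from index n − 1 down to 0.

```java
// Sketch: a recursive linear search and an iterative equivalent that return
// the same index. The iterative version is an illustrative assumption.
final class SearchComparison {
    // Recursive search over the first n entries, scanning from the end.
    static int searchRecursive(int[] a, int n, int value) {
        if (n <= 0) {
            return -1;
        } else if (a[n - 1] == value) {
            return n - 1;
        } else {
            return searchRecursive(a, n - 1, value);
        }
    }

    // Iterative search that visits the same indices in the same order.
    static int searchIterative(int[] a, int n, int value) {
        for (int i = n - 1; i >= 0; i--) {
            if (a[i] == value) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 3};
        System.out.println(searchRecursive(a, a.length, 3));
        System.out.println(searchIterative(a, a.length, 3));
    }
}
```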
8.14 A recurrence relation is a recursively defined formula for the values of a sequence. In other
words, it is a formula to compute $a_n$ based on one or more values of $a_i$ where i < n.
8.15 It means to come up with a closed formula. That is, a formula to compute the nth term of
the sequence directly (not based on previous values of the sequence).
8.16 (a) The substitution method essentially involves guessing the correct formula and using
induction to prove it. (b) The iteration method keeps applying the recursive definition to the
right side of the formula until it can be simplified down to (generally speaking) a summation and
the base case(s) that is then simplified. (c) To use the Master Theorem, one verifies that the
recurrence relation is in the correct form, identifies the constants from the theorem (a, b, and
d), determines which case the formula falls into based on the values of the constants, and writes
down the answer based on which case it is. This technique only gives an asymptotic bound on
the solution, not an exact solution.
8.17 (a) They both give an exact formula whereas the Master Theorem only gives an asymptotic
bound. (b) The Master Method is much easier to use than the other two techniques. (c) You need
to be able to determine the answer before you prove it. If the formula is complicated, you may
not be able to determine what it is. (d) The main downside is that it is messy. It also only works
well on simple recurrence relations. For instance, if a recurrence relation has several recursive
terms (e.g. T (n) = T (n − 1) − 2T (n − 3)), it would probably be quite complicated to try to solve
it using iteration. (e) The Master Theorem also only works on one specific type of recurrence
relation and it does not give an exact solution. (f) If I don’t care about an exact solution, the
Master Theorem is by far the easiest, so I would use that one. If I want an exact solution, I would
prefer substitution if I can see an obvious pattern and find the formula. If I can’t find a formula,
I would prefer iteration because I should be able to work it out using that technique.
8.18 Because the three topics have a lot in common. Recurrence relations are just a form of
recursion, and they can be solved using induction.
8.19 Because recursive algorithms are often analyzed by developing and solving recurrence rela-
tions.
8.20 Develop a recurrence relation that describes the running time of the algorithm, including
one or more base cases. Use one of the techniques from the previous section to solve the recurrence
relation.
8.21 Your answer may be different depending on your algorithm. However, if your algorithm
does not have a complexity of Θ(n), you either have an incorrect algorithm (if its complexity is
better than this), a really bad algorithm (if its complexity is worse than this), or you analyzed it
incorrectly.
Let T (n) be the time it takes to run search(int[] a, int n, int value). If n ≤ 0, the
algorithm just returns -1 which takes constant time. So T (0) = 1. If n > 0, the algorithm
checks one value of the array, which takes constant time, and then either returns or (in the worst
case) makes a recursive call on an array of size n − 1. (Technically the array still has size n, but
we are telling the algorithm that it only has size n − 1.) So in this case, T (n) = T (n − 1) + 1.
This one is easy to solve by a variety of techniques. Let’s do it using iteration. First, notice that
if T (n) = T (n − 1) + 1, then if we plug in n − 1 for n, we would get T (n − 1) = T (n − 2) + 1.
Therefore, T (n) = T (n − 1) + 1 = (T (n − 2) + 1) + 1 = T (n − 2) + 2. But T (n − 2) = T (n − 3) + 1,
so T (n) = T (n − 2) + 2 = (T (n − 3) + 1) + 2 = T (n − 3) + 3. It’s not difficult to see that we can
generalize this to T (n) = T (n−k)+k. When k = n, we get T (n) = T (n−n)+n = T (0)+n = n+1.
So the worst-case performance of the algorithm on an array of size n is n + 1 steps.
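One way to see the n + 1 concretely is to count calls. The sketch below instruments a recursive search with a counter and runs it on a worst-case input (the value is absent). Treating each call as one step is an assumption that matches T(0) = 1 and T(n) = T(n − 1) + 1; the counter and class name are illustrative additions.

```java
// Sketch: counts calls made by a recursive linear search in the worst case.
// The counter and class name are illustrative; each call counts as one step.
final class SearchCallCounter {
    static int calls = 0;

    // Same structure as the recursive search, plus a call counter.
    static int search(int[] a, int n, int value) {
        calls++;
        if (n <= 0) {
            return -1;
        } else if (a[n - 1] == value) {
            return n - 1;
        } else {
            return search(a, n - 1, value);
        }
    }

    // Runs the search on an array of n zeros looking for 1, which is absent.
    static int worstCaseCalls(int n) {
        calls = 0;
        search(new int[n], n, 1);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(worstCaseCalls(10));
    }
}
```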
9.1 Answers will vary, but here are a few examples. (a) I want to adopt a dog. At the pet shop,
there are four golden retrievers, two poodles and 5 corgis. How many choices do I have if I plan
to adopt one dog? You can choose either a golden retriever (4 options), a poodle (2 options), or
a corgi (5 options). So the total number of choices is 4 + 2 + 5 = 11.
(b) I go to a restaurant and see that there are 15 dessert choices, 10 main courses, and 2
appetizers. How many choices do I have if I want to order one of each? You have 15 choices for
desserts, and independent of that you have 10 choices for a main course, and independent of both
of those you can choose one of two appetizers. So you have 15 ∗ 10 ∗ 2 = 300 choices.
(c) Your password can be from 4 to 8 lower case alphabetic characters. How many possibilities
are there? There are 26 possible characters. If you use 4 characters, there are $26 \times 26 \times 26 \times 26 = 26^4$
possible passwords. Similarly, with 5 characters there are $26^5$ possible passwords. Likewise, for
6, 7, and 8 characters, there are $26^6$, $26^7$, and $26^8$ possible passwords. Since you have to choose
one of these lengths, the total number of possible passwords is $26^4 + 26^5 + 26^6 + 26^7 + 26^8$.
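The password count in part (c) combines the product rule (for a fixed length) with the sum rule (over the allowed lengths). A minimal sketch of that computation follows; the method name and parameters are assumptions for illustration.

```java
// Sketch: counts passwords over an alphabet with lengths from minLen to
// maxLen, using the product rule per length and the sum rule across lengths.
final class PasswordCount {
    // Returns alphabet^minLen + ... + alphabet^maxLen (must fit in a long).
    static long count(int alphabet, int minLen, int maxLen) {
        long total = 0;
        long power = 1;
        for (int len = 1; len <= maxLen; len++) {
            power *= alphabet;      // product rule: alphabet^len
            if (len >= minLen) {
                total += power;     // sum rule: add this length's count
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(count(26, 4, 8));
    }
}
```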
9.2 You cannot conclude that a box has at least 3 objects, that two boxes have at least two items,
or that every box has at least one item. (For each of these you can come up with a distribution
of objects in boxes that does not fit the description.) The most you can conclude is that at least
one box has at least 2 objects.
9.3 You might have all 30 balls in one bin. You might have 15 balls in one bin, and 15 in a
second bin. You might have 1 ball in each of the first 6 bins and 24 in the 7th bin. You might
have 4 balls in each of the first 6 bins and 6 balls in the 7th bin. You might have 4 balls in each
of the first 5 bins, and 5 balls in each of the final two bins. Notice that in all of these examples,
there is at least one box which has at least ⌈30/7⌉ = 5 balls as guaranteed by the generalized
pigeonhole principle.
9.4 There are 4 types of discs. Therefore by the generalized pigeonhole principle, I know that
I have at least ⌈21/4⌉ = 6 discs of at least one type. But that is all I can say. I do not know
which type I have at least 6 of. For instance, it is possible that all 21 are putters, or that all 21 are
distance drivers, so I do not even know if I have a single disc of any particular type.
9.5 The numbers between 1 and 1000 can all be represented with 10 bits. A number between 1
and 1000 can have between 1 and 10 bits that are 1s. So the 12 numbers can each be placed in
one of 10 “bins” based on how many 1s they have in their binary representation. Since we are
placing 12 numbers in 10 bins, at least one bin has at least 2 numbers. In other words, two of
the numbers have the same number of 1s in their binary representation. (None of the numbers
can actually have 10 1s because the number with 10 1s is 1023 which is larger than 1000. So
technically there are only 9 bins. But the argument is the same either way. However, if I said
that ten people picked numbers, then we would need to take this into account to solve the problem
correctly.)
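The parenthetical remark about there really being only 9 bins can be confirmed by brute force. The sketch below uses `Integer.bitCount` over the range 1 to 1000; the class and method names are illustrative.

```java
// Sketch: finds how many distinct "number of 1 bits" values occur among the
// integers in a range, i.e. how many pigeonhole bins are actually used.
final class OnesBins {
    // Largest number of 1 bits among the integers lo..hi.
    static int maxOnes(int lo, int hi) {
        int max = 0;
        for (int n = lo; n <= hi; n++) {
            max = Math.max(max, Integer.bitCount(n));
        }
        return max;
    }

    // Number of distinct bit counts that occur among the integers lo..hi.
    static int distinctBins(int lo, int hi) {
        boolean[] seen = new boolean[33];
        int bins = 0;
        for (int n = lo; n <= hi; n++) {
            int ones = Integer.bitCount(n);
            if (!seen[ones]) {
                seen[ones] = true;
                bins++;
            }
        }
        return bins;
    }

    public static void main(String[] args) {
        System.out.println(maxOnes(1, 1000) + " " + distinctBins(1, 1000));
    }
}
```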
9.6 Permutations are ordered and combinations are not. More specifically, a permutation is
a reordering of objects, whereas a combination is a selection of objects. They are basically
completely different things.
9.7 (a) $10 \cdot 10 \cdot 10 = 10^3 = 1000$ (assuming you regard 0 = 000, 12 = 012, etc. as 3 digit numbers).
(b) 10 · 9 · 8 = 720. (c) Because sets cannot have repeats, there are $\binom{10}{3} = \frac{10 \cdot 9 \cdot 8}{3 \cdot 2 \cdot 1} = 120$. Notice
that there are 6 times as many three-digit numbers with no repeated digits as sets of three digits
because each set with three digits leads to 6 different three-digit numbers (e.g. the set {1, 2, 3} is
the same as the set {3, 1, 2}, but the numbers 123, 132, 213, 231, 312, 321 are all
different). (d) Because lists can have repeats, there are $10 \cdot 10 \cdot 10 = 10^3 = 1000$.
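Counts this small can be confirmed by brute force. The sketch below enumerates digit triples directly; it is an illustration with made-up names, treating leading zeros as allowed, as the answer to part (a) assumes.

```java
// Sketch: brute-force counts for ordered digit triples without repeats and
// for three-element sets of digits. Names are illustrative assumptions.
final class DigitCounts {
    // Ordered triples (a, b, c) of digits 0-9 with all three different.
    static int orderedNoRepeats() {
        int count = 0;
        for (int a = 0; a < 10; a++)
            for (int b = 0; b < 10; b++)
                for (int c = 0; c < 10; c++)
                    if (a != b && a != c && b != c) count++;
        return count;
    }

    // Sets {a, b, c} of digits, counted once each by requiring a < b < c.
    static int sets() {
        int count = 0;
        for (int a = 0; a < 10; a++)
            for (int b = a + 1; b < 10; b++)
                for (int c = b + 1; c < 10; c++)
                    count++;
        return count;
    }

    public static void main(String[] args) {
        System.out.println(orderedNoRepeats() + " " + sets());
    }
}
```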
9.8 (a) 7!. (b) 5!.
9.9 Using Theorem 9.49, there are $\frac{5!}{2! \cdot 2! \cdot 1!} = \frac{120}{4} = 30$.
9.10 Choose 5 people who are not on the team.
9.11 $\binom{25}{22} = \binom{25}{3} = \frac{25 \cdot 24 \cdot 23}{3 \cdot 2 \cdot 1} = \frac{25 \cdot (6 \cdot 4) \cdot 23}{6} = 25 \cdot 4 \cdot 23 = 2300$.
9.12 (a) $\binom{11}{4}$. (b) 4!. (c) We can choose the 4 members ($\binom{11}{4}$ ways) and then place them into the
offices (i.e. order them, so 4! ways) for a total of $\binom{11}{4} \cdot 4! = 11 \cdot 10 \cdot 9 \cdot 8 = 7920$ ways. Alternatively,
we have 11 choices for president, and once the president has been decided there are now 10 choices
for vice-president, then 9 for treasurer, and finally 8 for secretary, for a total of 11 · 10 · 9 · 8 = 7920
ways of choosing.
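Both counting arguments in part (c) can be checked with a small binomial-coefficient helper. This is a sketch using the standard multiplicative formula; the helper names are assumptions and the method is only intended for small inputs.

```java
// Sketch: computes C(n, k) and n! for small inputs to check that choosing
// 4 of 11 members and then ordering them gives 11 * 10 * 9 * 8.
final class OfficerCount {
    // C(n, k) via the multiplicative formula; each intermediate value is
    // itself a binomial coefficient, so the division is always exact.
    static long choose(int n, int k) {
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i;
        }
        return result;
    }

    // n! for small n.
    static long factorial(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(choose(11, 4) * factorial(4));
    }
}
```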
9.13 Whether or not order matters, whether or not there are repetitions, and whether or not objects
are distinguishable.
9.14 (a) $\sum_{k=0}^{n} \binom{n}{k} 10^k = \sum_{k=0}^{n} \binom{n}{k} 10^k 1^{n-k} = (10 + 1)^n = 11^n$. (b) $11^0 = 1$, $11^1 = 11$, $11^2 = 121$,
$11^3 = 1331$, $11^4 = 14641$, $11^5 = 161051$. (c) The rows of Pascal’s triangle look kind of like
the powers of 11. When the numbers in the triangle are longer than 1 digit, you have to actually
line them up and add them to get the correct result. But the connection is more clear when you
consider the formula for the Binomial Theorem and think about what it says about a row of the
triangle.
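The "line them up and add them" idea in part (c) amounts to weighting the k-th entry of a row by 10^k. Here is a sketch that does this for a row of Pascal's triangle and compares the result with a power of 11; the names are illustrative.

```java
// Sketch: treats row n of Pascal's triangle as digits weighted by powers of
// 10 (with carrying handled by ordinary addition) and compares with 11^n.
final class PascalElevens {
    // C(n, k) via the multiplicative formula, exact for small inputs.
    static long choose(int n, int k) {
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i;
        }
        return result;
    }

    // Sum of C(n, k) * 10^k for k = 0..n.
    static long rowAsNumber(int n) {
        long total = 0;
        long powerOfTen = 1;
        for (int k = 0; k <= n; k++) {
            total += choose(n, k) * powerOfTen;
            powerOfTen *= 10;
        }
        return total;
    }

    // 11^n by repeated multiplication.
    static long powerOfEleven(int n) {
        long result = 1;
        for (int i = 0; i < n; i++) {
            result *= 11;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(rowAsNumber(5) + " " + powerOfEleven(5));
    }
}
```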
9.15 Plugging in 2x and −3y into the formula, we obtain
\begin{align*}
(2x - 3y)^5 &= \binom{5}{0}(2x)^0(-3y)^5 + \binom{5}{1}(2x)^1(-3y)^4 + \binom{5}{2}(2x)^2(-3y)^3 + \binom{5}{3}(2x)^3(-3y)^2 \\
&\quad + \binom{5}{4}(2x)^4(-3y)^1 + \binom{5}{5}(2x)^5(-3y)^0 \\
&= -243y^5 + 810xy^4 - 1080x^2y^3 + 720x^3y^2 - 240x^4y + 32x^5.
\end{align*}
Notice that the negative sign goes inside the parentheses so that it is included in the powers, and
that the constants are also inside the parentheses so they are included in the powers.
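Because sign and exponent slips are easy to make in an expansion like this, it can help to compute the coefficients mechanically. The sketch below evaluates the coefficient of x^k y^(n−k) in (ax + by)^n from the Binomial Theorem; the class and helper names are assumptions for illustration.

```java
// Sketch: computes coefficients of (a x + b y)^n term by term using the
// Binomial Theorem. Intended for small n only; names are illustrative.
final class BinomialExpansion {
    // C(n, k) via the multiplicative formula.
    static long choose(int n, int k) {
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i;
        }
        return result;
    }

    // base^exp by repeated multiplication, for exp >= 0.
    static long power(long base, int exp) {
        long result = 1;
        for (int i = 0; i < exp; i++) {
            result *= base;
        }
        return result;
    }

    // Coefficient of x^k y^(n-k) in (a x + b y)^n.
    static long coefficient(long a, long b, int n, int k) {
        return choose(n, k) * power(a, k) * power(b, n - k);
    }

    public static void main(String[] args) {
        for (int k = 0; k <= 5; k++) {
            System.out.println("x^" + k + " y^" + (5 - k) + ": " + coefficient(2, -3, 5, k));
        }
    }
}
```

Note that the constants 2 and −3 are raised to the powers along with x and y, which is the point made at the end of the solution.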
9.16 Notice that $\sum_{k=0}^{n} \binom{n}{k} = \sum_{k=0}^{n} \binom{n}{k} 1^k 1^{n-k} = (1 + 1)^n = 2^n$, where the second-to-last step uses
the Binomial Theorem.
9.17 It is possible that the 4 students who slept were also late, in which case 13 students did
neither. At the other extreme, it is possible that the 4 students who slept are all different from
the 7 who came late, in which case 9 students did neither. So at least 9 and at most 13 students did
neither.
9.18 No. There are 7 things on the right side of the equation, so if you only have 6 pieces of
information you cannot fully solve the problem.
9.19 |A ∪ B ∪ C ∪ D| = |A| + |B| + |C| + |D|
−|A ∩ B| − |A ∩ C| − |A ∩ D| − |B ∩ C| − |B ∩ D| − |C ∩ D|
+|A ∩ B ∩ C| + |A ∩ B ∩ D| + |A ∩ C ∩ D| + |B ∩ C ∩ D|
−|A ∩ B ∩ C ∩ D|
10.1 (a) answers will vary. Your graph should have numbers (weights) on every edge, no arrows
on the edges, and can contain loops (edges from a vertex to itself). (b) answers will vary. Your
graph should not have any numbers on the edges, the edges should have arrows, and there may be
repeated edges–that is, there might be two different edges from some vertex u to another vertex
v. (c) answers will vary. A network is just a weighted directed graph. So your graph should have
numbers on the edges and every edge should have an arrow. It should not have loops or repeated
edges.
10.2 (a) A road system, where the vertices are intersections and the edges are the segments of
road between intersections. The weights on the edges might be distances, speed limits, expected
time to traverse, etc. (b) A ski trail map, where the vertices are intersections and the edges are
the segments of trail between intersections. The edges are directed because on some trails you
are only allowed to go one direction. Since sometimes a trail splits and comes back together,
multiple edges are allowed between vertices. The weights can be distances or difficulty ratings.
(c) Representing connections on a social media app, where it is assumed that being connected is
two-way (e.g. on Facebook when you are friends versus on Instagram where following is one-way),
and where you are allowed to connect to yourself (I actually do not know of a social media site
that allows this, but I suppose it is possible).
10.3 The easiest proof is to realize that edges are just pairs of vertices. There are n vertices.
How many ways are there of choosing pairs of vertices? You are choosing 2 things out of n things,
so $\binom{n}{2}$.
Here is a proof by induction: A graph with 2 vertices has $1 = \binom{2}{2}$ possible edge, so it holds
for the base case (we could use 1 vertex as a base case, but it is more confusing so we start
at 2). Assume a graph with $k-1$ vertices, where $k \ge 3$, has $\binom{k-1}{2}$ possible edges. If you
add a vertex, it can be connected to each of the $k-1$ vertices, so a graph with $k$ vertices has
$$\binom{k-1}{2} + (k-1) = \frac{(k-1)(k-2)}{2} + (k-1) = (k-1)\left(\frac{k-2}{2} + 1\right) = (k-1)\,\frac{k-2+2}{2} = \frac{(k-1)k}{2} = \binom{k}{2}$$
possible edges. Since the formula is true for k = 2, and whenever it is true for k − 1 it is true for
k, it is true for all n ≥ 2 by induction.
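As a sanity check on 10.3, here is a minimal Python sketch that counts the possible edges three ways: by listing all pairs, by mimicking the inductive step of adding one vertex at a time, and by the binomial coefficient. The function names are made up for this illustration and do not come from the text.

```python
from itertools import combinations
from math import comb


def possible_edges(n):
    """Return every unordered pair of distinct vertices from 0..n-1."""
    return list(combinations(range(n), 2))


def count_by_adding_vertices(n):
    """Mirror the induction: start at 2 vertices, then add one at a time."""
    count = 1  # a graph with 2 vertices has 1 possible edge
    for k in range(3, n + 1):
        count += k - 1  # the new vertex can connect to each of the k-1 old ones
    return count


for n in range(2, 10):
    # All three counts should agree for every n >= 2.
    assert len(possible_edges(n)) == count_by_adding_vertices(n) == comb(n, 2)
```

The loop only tests small values of n, so it is evidence for the formula rather than a replacement for the proof.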
10.4 answers will vary, but here are two examples: | or ⊠ ∧.
10.5 (a) n − 1. (b) Clearly a tree with 2 vertices has 2 − 1 = 1 edge. Assume all trees with
n − 1 vertices have n − 2 edges. Let T be a tree with n > 2 vertices. Since it is a tree, it contains
at least one vertex v of degree 1. Let T ′ be the tree T with vertex v deleted. Then T ′ has n − 1
vertices and is clearly still a tree since removing a vertex of degree 1 cannot either add a cycle or
disconnect the graph. Thus T ′ has n − 2 edges. But it was created from T by removing a single
vertex and edge. Therefore T has n − 1 edges. Since the formula holds for n = 2 and whenever
it holds for n − 1 it holds for n, every tree with n ≥ 2 vertices has n − 1 edges.
10.6 (a) unweighted. (b) undirected. (c) e and x are not adjacent, but f and b are adjacent.
(d) L is connected. (e) 8. (f) 17. (g) deg(v) = 2 and deg(c) = 5. (h) 34. It is twice the number
of edges which makes sense because of the Handshaking Lemma. (i) answers will vary, but if you
drew a subgraph of L that contains exactly 7 edges and contains no cycles, then it is a spanning
tree. Of course it cannot contain cycles because it is a tree. (j) If you remove (e, v) and (f, v),
then v is disconnected from the rest of the graph. No single edge will disconnect the graph so
2 edges is the minimum number. (k) answers will vary, but mine are efv, bcdf, abcdx, axdfve,
aefbcdx, and finally aevfbcdx. Notice in all of these, the vertices listed next to each other are
adjacent and the first and last one on the list are adjacent.
10.7 Here are the graphs for the next 3 questions. The vertex labels are optional.
[Figure: drawings of K6, C8, and P5 with numbered vertices.]
10.8 See above.
10.9 See above.
10.10 Notice to draw Q2 , you can draw two copies of Q1 (i.e. two lines) and connect corresponding
vertices (i.e. if you draw the two lines vertically, connect the top vertices and the bottom vertices)
to make a square (notice that Q2 is the same as C4 ). To draw Q3 (a cube), you can draw two
copies of Q2 (squares) and connect corresponding vertices. Using the same idea, to draw Q5 , I
would draw 2 copies of Q4 , and then connect corresponding vertices between the two copies.
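The "two copies plus connecting edges" idea in 10.10 translates directly into a recursive construction. The sketch below is one possible way to write it in Python; the function name `hypercube` and the choice to label vertices with bit strings are assumptions made for this example.

```python
def hypercube(n):
    """Return (vertices, edges) of Q_n, with vertices as n-bit strings."""
    if n == 1:
        return ["0", "1"], [("0", "1")]
    verts, edges = hypercube(n - 1)
    # Two copies of Q_{n-1}: prefix one copy with 0 and the other with 1.
    new_verts = ["0" + v for v in verts] + ["1" + v for v in verts]
    new_edges = [("0" + u, "0" + v) for u, v in edges]
    new_edges += [("1" + u, "1" + v) for u, v in edges]
    # Connect corresponding vertices between the two copies.
    new_edges += [("0" + v, "1" + v) for v in verts]
    return new_verts, new_edges


verts, edges = hypercube(5)
print(len(verts), len(edges))
```

Each level of recursion doubles the previous graph and adds one new edge per old vertex, which is exactly the drawing procedure described above.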
10.11 Notice that two vertices are adjacent iff they differ by exactly 1 bit. Also notice that
if two numbers have the same parity, they cannot differ by exactly 1 bit. So we can pick V1 =
{000, 011, 101, 110}, and V2 = {001, 010, 100, 111}. The numbers in V1 have even parity, and the
numbers in V2 have odd parity. So within each subset, none of the numbers differ by exactly one
bit since all of the numbers in each set have the same parity. Thus, this is a valid partition.
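The parity argument in 10.11 can be checked mechanically. The following Python sketch splits the 3-bit strings by parity and confirms that no two strings in the same part differ by exactly one bit. The helper names are hypothetical and chosen only for this illustration.

```python
from itertools import product


def differ_by_one_bit(a, b):
    """True when bit strings a and b differ in exactly one position."""
    return sum(x != y for x, y in zip(a, b)) == 1


def parity_partition(n):
    """Split all n-bit strings by the parity of their number of 1 bits."""
    vertices = ["".join(bits) for bits in product("01", repeat=n)]
    even = [v for v in vertices if v.count("1") % 2 == 0]
    odd = [v for v in vertices if v.count("1") % 2 == 1]
    return even, odd


def has_edge_inside(part):
    """True if two vertices in the same part are adjacent in the hypercube."""
    return any(differ_by_one_bit(a, b) for a in part for b in part)


V1, V2 = parity_partition(3)
print(V1, V2)
print(has_edge_inside(V1), has_edge_inside(V2))  # both False for a valid partition
```

Because `parity_partition` takes n as a parameter, the same check can be tried on larger hypercubes.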
10.12 Here is K3,5 and one possible subgraph.
[Figure: drawing of K3,5 next to one of its subgraphs.]
n^2, so Θ(n + m) would be smaller and the adjacency list would be appropriate. (b) If the number
of edges is larger, then either might be appropriate, depending on how large. If m ≈ n^2, then we
are comparing Θ(n + m) = Θ(n + n^2) = Θ(n^2) with Θ(n^2), so there is minimal difference between
the two. But if m is large but smaller than n^2, the adjacency list might be the better choice.
10.16 If you are storing a small graph (e.g. hundreds of vertices or fewer), it probably does not
matter a whole lot. But imagine storing the Facebook friendship graph. As of 2021, there are
2.85 billion Facebook users and each has an average of 350 friends (as of 2019 it was about 338,
so this number should be close). Since we are talking about exact numbers, we will compare
without using Θ notation. Recall that an adjacency list takes Θ(n + m) space and an adjacency
matrix takes Θ(n^2) space. We will just treat these as n + m and n^2. In our example, n =
2,850,000,000, and m = 2,850,000,000 × 350 = 997,500,000,000. So an adjacency list would
take about n + m = 2,850,000,000 + 997,500,000,000 = 1,000,350,000,000 space. An adjacency
matrix would take about n^2 = 2,850,000,000^2 = 8,122,500,000,000,000,000 space. In case it
isn’t clear, the adjacency matrix takes about 8,119,658 times as much space! To be more precise,
assuming it takes 32 bits to store each number in an adjacency list or matrix (and that is pushing
it), the adjacency list requires about 4 TB (terabytes) of space, which is doable if you want to fill
up the majority of the hard drive on a very new top-of-the-line computer. On the other hand, the
adjacency matrix requires about 32.49 EB (exabytes). The largest hard drive you can currently
buy is about 18 TB, so you would need about 1,805,000 hard drives to store the adjacency matrix.
(Even if you encoded the matrix densely, using only 1 bit per entry, it would still require about
56,400 hard drives.) So yes, it does matter.
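The arithmetic in 10.16 is easy to redo by machine. Here is a small Python sketch that recomputes the estimates from the same inputs (2.85 billion users, 350 friends on average, 32 bits per entry, 18 TB drives). The function name and the use of decimal units (1 TB = 10^12 bytes, 1 EB = 10^18 bytes) are assumptions of this example.

```python
TB = 10**12  # decimal terabyte, in bytes (an assumption of this sketch)
EB = 10**18  # decimal exabyte, in bytes


def storage_estimates(n, avg_degree, bytes_per_entry=4, drive_bytes=18 * TB):
    """Compare adjacency list and adjacency matrix storage, as in the text."""
    m = n * avg_degree            # the text's estimate of m
    list_entries = n + m          # treat Theta(n + m) as exactly n + m
    matrix_entries = n * n        # treat Theta(n^2) as exactly n^2
    return {
        "list_entries": list_entries,
        "matrix_entries": matrix_entries,
        "ratio": matrix_entries / list_entries,
        "list_TB": list_entries * bytes_per_entry / TB,
        "matrix_EB": matrix_entries * bytes_per_entry / EB,
        "drives": matrix_entries * bytes_per_entry / drive_bytes,
        # One bit per matrix entry means 8 entries per byte.
        "drives_bit_packed": matrix_entries / 8 / drive_bytes,
    }


for name, value in storage_estimates(2_850_000_000, 350).items():
    print(f"{name}: {value:,.2f}")
```

Changing the inputs (say, to a graph with a few hundred vertices) shows how quickly the gap between the two representations closes for small graphs.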
10.17 (a) Arbitrarily pick either u or v and check its list for the other–we will look at u’s list.
Since it might have to traverse the entire list of the neighbors of u, it would be O(deg(u)), which
might be as large as n − 1. So O(n) in the worst case (although O(deg(u)) is more precise).
I use big-O notation on this one because it is possible it finds it sooner. (b) Either Θ(deg(u))
if it has to traverse the list and count, or Θ(1) if this is maintained in the data structure. (c)
Θ(deg(u)) = Θ(k).
10.18 (a) Θ(1) since it can just look at the (u, v) entry of the matrix. (b) Θ(n) since it has to
look through an entire row of the matrix to determine which vertices are neighbors. (c) Same
answer and reason as (b).
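The running times in 10.17 and 10.18 can be seen side by side in a short sketch. The Python below is an illustration only: the function names and the representation (a list of neighbor lists, and a list of 0/1 rows) are choices made for this example, not definitions from the text.

```python
def build(n, edges):
    """Build both representations of an undirected graph on vertices 0..n-1."""
    adj_list = [[] for _ in range(n)]
    adj_matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj_list[u].append(v)
        adj_list[v].append(u)
        adj_matrix[u][v] = adj_matrix[v][u] = 1
    return adj_list, adj_matrix


def adjacent_list(adj_list, u, v):
    return v in adj_list[u]          # scans u's list: O(deg(u))


def adjacent_matrix(adj_matrix, u, v):
    return adj_matrix[u][v] == 1     # one lookup: Theta(1)


def degree_list(adj_list, u):
    return len(adj_list[u])          # Python stores the length, so Theta(1)


def degree_matrix(adj_matrix, u):
    return sum(adj_matrix[u])        # scans a whole row: Theta(n)


def neighbors_list(adj_list, u):
    return list(adj_list[u])         # copies u's list: Theta(deg(u))


def neighbors_matrix(adj_matrix, u):
    # Must look at every column of row u: Theta(n)
    return [v for v, bit in enumerate(adj_matrix[u]) if bit]


adj_list, adj_matrix = build(4, [(0, 1), (0, 2), (1, 2), (2, 3)])
print(adjacent_list(adj_list, 0, 3), adjacent_matrix(adj_matrix, 0, 3))
print(degree_list(adj_list, 2), degree_matrix(adj_matrix, 2))
```

Note that `degree_list` corresponds to the case where the degree is maintained by the data structure; a linked list without a stored length would need the Θ(deg(u)) traversal instead.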
10.19 (a) Either one works fine, but it is more efficient with an adjacency matrix since an edge
(u, v) can be added and removed in constant time by just changing the matrix entry (u, v) to a 0
or 1. For an adjacency list, to remove (u, v), you would have to find u on v’s list and v on u’s list
and then remove them from the lists, so it would take longer. (b) Adjacency list by far. If you
add a vertex, you can just add an adjacency list for it to your current list. With a matrix, you
need to create an entirely new matrix with one more row and column and copy all of the entries,
so it is not very efficient. Similar problems exist when removing vertices.
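To make the trade-offs in 10.19 concrete, here is a hedged Python sketch of the update operations on both representations. The function names are hypothetical, and the list-of-lists layout is just one way an adjacency list or matrix might be stored.

```python
def remove_edge_matrix(matrix, u, v):
    """Constant time: just clear two entries."""
    matrix[u][v] = matrix[v][u] = 0


def remove_edge_list(adj, u, v):
    """Must search u's list for v and v's list for u before removing."""
    adj[u].remove(v)
    adj[v].remove(u)


def add_vertex_list(adj):
    """Append an empty neighbor list for the new vertex."""
    adj.append([])


def add_vertex_matrix(matrix):
    """Build a new (n+1) x (n+1) matrix and copy every old entry into it."""
    n = len(matrix)
    bigger = [row + [0] for row in matrix]  # copies all n*n entries
    bigger.append([0] * (n + 1))
    return bigger


# A triangle on vertices 0, 1, 2 in both representations.
adj = [[1, 2], [0, 2], [0, 1]]
matrix = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

remove_edge_list(adj, 0, 1)
remove_edge_matrix(matrix, 0, 1)
add_vertex_list(adj)
matrix = add_vertex_matrix(matrix)
print(adj)
print(matrix)
```

The matrix version of adding a vertex touches every existing entry, which is why the adjacency list is preferred when vertices change often.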
10.20 (a) Since every vertex of K5 has degree 4, it is Eulerian by Theorem 10.67. (b) Since
every vertex of K6 has degree 5, it is not Eulerian by Theorem 10.67. (c) Since every vertex of
Q3 has degree 3, it is not Eulerian by Theorem 10.67. (d) Since every vertex of Q4 has degree 4,
it is Eulerian by Theorem 10.67. Here is one of many possible orderings of the edges that form
an Eulerian tour:
[Figure: Q4 with its 32 edges numbered 1 through 32 in the order of one Eulerian tour.]
(e) and (f) are both Hamiltonian. Here is one possible Hamiltonian cycle for each:
[Figure: one Hamiltonian cycle drawn on Q4 and one on Q3.]
(g) Yes. Since it is just a cycle with all of the vertices, it is clearly a Hamiltonian cycle.
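Parts (a) through (d) of 10.20 all come down to checking whether every vertex has even degree. The Python sketch below builds K_n and Q_n and applies that check. It assumes, as the answers above do, that the even-degree condition is the criterion being applied and that the graphs are connected (which K_n and Q_n are); the function names are made up for this example.

```python
from itertools import combinations


def complete_graph(n):
    """Adjacency sets for K_n on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        adj[u].add(v)
        adj[v].add(u)
    return adj


def hypercube(n):
    """Adjacency sets for Q_n: neighbors differ in exactly one bit."""
    return {v: {v ^ (1 << i) for i in range(n)} for v in range(2**n)}


def all_degrees_even(adj):
    """The degree test used above; connectivity is assumed, not checked."""
    return all(len(nbrs) % 2 == 0 for nbrs in adj.values())


for name, graph in [("K5", complete_graph(5)), ("K6", complete_graph(6)),
                    ("Q3", hypercube(3)), ("Q4", hypercube(4))]:
    print(name, all_degrees_even(graph))
```

The test says nothing about Hamiltonian cycles; parts (e) through (g) need a different argument.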
10.22 It is not saying that. No planar graphs have more edges than that, but not every graph
with fewer edges is planar. For instance, K3,3 has v = 6 and e = 9. So e = 9 < 12 = 3v − 6, but
as we saw earlier, K3,3 is not planar.
10.23 (a) I am too lazy to do another drawing, but draw it as a box inside a box with the corners
connected to each other and it is clearly planar. (b) Notice that v = 16, e = 32, and C3 is not
a subgraph of Q4 . Then e = 32 > 28 = 2v − 4, so by part (b) of Theorem 10.77, Q4 cannot be
planar.
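The edge-count bounds used in 10.22 and 10.23 are simple enough to wrap in a helper. The sketch below is a minimal Python illustration; the function name and the `triangle_free` flag are inventions for this example. It only reports whether a graph passes the bound, since passing does not prove planarity.

```python
def passes_edge_bound(v, e, triangle_free=False):
    """Check the necessary condition for planarity (for v >= 3).

    General graphs:       e <= 3v - 6
    Triangle-free graphs: e <= 2v - 4
    Failing means "not planar"; passing is inconclusive.
    """
    bound = 2 * v - 4 if triangle_free else 3 * v - 6
    return e <= bound


# K3,3 passes the general bound even though it is not planar.
print(passes_edge_bound(6, 9))
# K3,3 is bipartite, so it has no triangles; the stronger bound rules it out.
print(passes_edge_bound(6, 9, triangle_free=True))
# Q4 has v = 16, e = 32 and no triangles, so it fails the stronger bound.
print(passes_edge_bound(16, 32, triangle_free=True))
```

This matches the point of 10.22: the inequality can show a graph is not planar, but it can never certify that a graph is planar.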
GNU Free Documentation License
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
Preamble
The purpose of this License is to make a manual, textbook, or other functional and useful document “free” in the sense of freedom: to assure
everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily,
this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications
made by others.
This License is a kind of “copyleft”, which means that derivative works of the document must themselves be free in the same sense. It
complements the GNU General Public License, which is a copyleft license designed for free software.
We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free
program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals;
it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License
principally for works whose purpose is instruction or reference.
2. VERBATIM COPYING
You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the
copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other
conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the
copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies
you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly display copies.
3. COPYING IN QUANTITY
If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and
the Document’s license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts:
Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the
publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other
material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy
these conditions, can be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on
the actual cover, and continue the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable
Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general
network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of
added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity,
to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute
an Opaque copy (directly or through your agents or retailers) of that edition to the public.
It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to
give them a chance to provide you with an updated version of the Document.
4. MODIFICATIONS
You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release
the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and
modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:
A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which
should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the
original publisher of that version gives permission.
B. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version,
together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they
release you from this requirement.
C. State on the Title page the name of the publisher of the Modified Version, as the publisher.
E. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.
F. Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the
terms of this License, in the form shown in the Addendum below.
G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document’s license notice.
I. Preserve the section Entitled “History”, Preserve its Title, and add to it an item stating at least the title, year, new authors, and
publisher of the Modified Version as given on the Title Page. If there is no section Entitled “History” in the Document, create one
stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified
Version as stated in the previous sentence.
J. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the
network locations given in the Document for previous versions it was based on. These may be placed in the “History” section. You may
omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the
version it refers to gives permission.
K. For any section Entitled “Acknowledgements” or “Dedications”, Preserve the Title of the section, and preserve in the section all the
substance and tone of each of the contributor acknowledgements and/or dedications given therein.
L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are
not considered part of the section titles.
M. Delete any section Entitled “Endorsements”. Such a section may not be included in the Modified Version.
N. Do not retitle any existing section to be Entitled “Endorsements” or to conflict in title with any Invariant Section.
If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied
from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant
Sections in the Modified Version’s license notice. These titles must be distinct from any other section titles.
You may add a section Entitled “Endorsements”, provided it contains nothing but endorsements of your Modified Version by various
parties–for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a
standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end
of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or
through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or
by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit
permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or
imply endorsement of any Modified Version.
5. COMBINING DOCUMENTS
You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified
versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them
all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.
The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single
copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding
at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same
adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections Entitled “History” in the various original documents, forming one section Entitled
“History”; likewise combine any sections Entitled “Acknowledgements”, and any sections Entitled “Dedications”. You must delete all sections
Entitled “Endorsements”.
6. COLLECTIONS OF DOCUMENTS
You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies
of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License
for verbatim copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of
this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.
8. TRANSLATION
Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing
Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all
Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the
license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and
the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License
or a notice or disclaimer, the original version will prevail.
If a section in the Document is Entitled “Acknowledgements”, “Dedications”, or “History”, the requirement (section 4) to Preserve its Title
(section 1) will typically require changing the actual title.
9. TERMINATION
You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt
to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties
who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full
compliance.
Copyright © YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms
of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled
“GNU Free Documentation License”.
If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the “with . . . Texts.” line with this:
with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the Back-Cover
Texts being LIST.
If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the
situation.
If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of
free software license, such as the GNU General Public License, to permit their use in free software.
Index
∀ (for all), 102
O(f(n)) (Big-O), 218
Ω(f(n)) (Big-Omega), 221
Θ(f(n)) (Big-Theta), 222
ω(f(n)) (little-omega), 224
o(f(n)) (little-o), 224
& (bitwise AND), 117
∼ (bitwise complement), 116
| (bitwise OR), 117
^ (bitwise XOR), 117
$\binom{n}{k}$ (binomial coefficient), 374
≡ (congruence modulo n), 51
∃ (there exists), 104
! (factorial), 12
⌊ ⌋ (floor), 54
⌈ ⌉ (ceiling), 54
| (divides), 10
∧ (AND), 80
¬ (NOT), 79, 107
∨ (OR), 81
⊕ (XOR), 82
→ (conditional), 83
↔ (biconditional), 85
= (logically equivalent), 92
mod operator, 51
% (modulus), 51
|A| (set cardinality), 129
∈ (element of set), 129
∉ (not element of set), 129
C (complex numbers), 130
N (natural numbers), 130
Q (rational numbers), 130
R (real numbers), 130
Z (integers), 130
Z+ (positive integers), 130
Z− (negative integers), 130
∅ (empty set), 130
{} (empty set), 130
∩ (intersection), 136
∪ (union), 136
$\overline{A}$ (complement of A), 137
\ (set-difference), 137
× (Cartesian product), 144
P(A) (power set), 134
⊆ (subset), 132
⊈ (not a subset), 132
⊂ (proper subset), 132
∑ (summation), 195
∏ (product), 210

algorithm, 47
AND, 80
AND (bitwise), 117
anti-symmetric relation, 162
arithmetic progression, 192
arithmetic sequence, 192
array, 63
assignment operator, 48
asymptotic notation, 217

base case, 298
base case (induction), 297
base case (recursion), 317
biconditional, 85
Big-O, 218
Big-Omega, 221
Big-Theta, 222
binary search, 285, 326
binomial coefficient, 374
Binomial Theorem, 382
bitwise operator
  AND, 117
  complement, 116
  NOT, 116
  OR, 117
  XOR, 117
boolean
  operator
    negation, 66
    not, 66
  proposition, 16, 77
  variable, 66

cardinality, set, 129
Cartesian product, 144
ceiling, 54
characteristic equation, 340, 342
choose, 374