Advanced Precalculus

Daniel Kim

January 12, 2019


Contents

1 Logic
1.1 Logical Operators
1.2 Logical Arguments
1.3 Quantified Statements

2 Set Theory
2.1 Introduction
2.2 Operations on Sets

3 Fields
3.1 Field Axioms
3.2 Subtraction, Division

4 Sequences and Series
4.1 Sequences
4.2 Series
4.3 Limits of Sequences
4.4 Summation and Product Notation

5 Mathematical Induction
5.1 Standard Applications
5.2 The Binomial Theorem

6 Basic Trigonometry
6.1 Review
6.2 The Unit Circle
6.3 Graphing
6.4 Inverse Trigonometric Functions
6.5 More Trigonometric Identities
6.6 Areas, Law of Sines, Law of Cosines

7 Advanced Trigonometry
7.1 Polar Coordinates
7.2 Parametric Equations
7.3 Complex Numbers
7.3.1 Review
7.3.2 The Complex Plane
7.3.3 Rotation

8 Linear Algebra
8.1 Vectors
8.2 Linear Transformations and Matrices
8.3 3-D Geometry

9 Limits
9.1 Linear Functions
9.2 Non-linear Functions
9.3 Limit Properties
9.4 Other Limits
9.5 Trigonometric Limits
9.6 Advanced Concepts
9.7 Continuity

10 Derivatives
10.1 Introduction
10.2 Implicit Differentiation
10.3 Related Rates
10.4 Significance of the Derivative
10.5 Optimization
10.6 L’Hôpital’s Rule
10.7 Inverses
10.8 Parametric and Polar Equations
10.9 Review

Chapter 1

Logic

To understand how proofs work, it is essential to understand the underlying mathematical logic
involved. In this chapter, we will only provide a relatively brief overview of how logic works.

1.1 Logical Operators

The Law of Excluded Middle states that every statement is always either true or false. We
usually use variables p, q, r, etc. to denote such a statement.

Definition 1.1.1. The negation of a statement p reverses the original truth value, and is denoted
as ∼p. This is pronounced as “not p.”

For instance, if p was true, then ∼p is false, and vice-versa.

Definition 1.1.2. The disjunction of statements p and q is denoted as p ∨ q, and it is pronounced
as “p or q.” This new statement is true if either or both of p and q are true.

In other words, as long as at least one of p or q is true, then p ∨ q is true. Otherwise, it is false.
This relationship can be depicted in a truth table, where all possible combinations of truth
values for the considered statements are enumerated in an organized table form. For brevity, the
values of true and false will respectively be denoted as T and F.
Here is the truth table for p ∨ q:
p q p∨q
T T T
T F T
F T T
F F F

For each pair of truth values for p and q, we read the truth table by horizontal rows. For example,
the first row of this table would indicate, “if p is true and q is true, then p ∨ q is true.”
Review the table and confirm that it matches with your understanding of disjunction.

Since there are two initial statements to consider (p and q), each with two possible truth values
(true or false), there are 2^2 = 4 rows in the truth table.
For a truth table demonstrating negation, we only have to consider the initial statement p, so there
are only 2^1 = 2 rows:
p ∼p
T F
F T
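
The row counts above can be checked mechanically. The following is a minimal Python sketch, not part of the original text: the helper name `truth_table` is an assumption made for illustration. It enumerates every combination of truth values and evaluates a statement on each one.

```python
from itertools import product

def truth_table(statement, num_vars):
    """Return (inputs, result) rows, listing T before F as in the tables above."""
    rows = []
    for values in product([True, False], repeat=num_vars):
        rows.append((values, statement(*values)))
    return rows

# Disjunction p ∨ q: two initial statements, so 2**2 = 4 rows.
or_table = truth_table(lambda p, q: p or q, 2)
# Negation ∼p: one initial statement, so 2**1 = 2 rows.
not_table = truth_table(lambda p: not p, 1)

# Print the disjunction table using T and F.
for values, result in or_table:
    print(*("T" if v else "F" for v in values), "T" if result else "F")
```

With n initial statements the loop visits 2^n rows, which is why the tables grow quickly as more statements are added.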

Definition 1.1.3. The conjunction of statements p and q is denoted as p ∧ q, and pronounced as
“p and q.” This new statement is true if both p and q are true.

In other words, if at least one of the statements p and q is false, then p ∧ q will necessarily be
false. Here is the truth table representing conjunction:

p q p∧q
T T T
T F F
F T F
F F F

Again, convince yourself that this truth table is valid.


Negation, disjunction, and conjunction are the most basic logical operators, on which all other
aspects of mathematical logic will be based.

Example 1.1.4
Construct a truth table for ∼p ∧ q.

Solution. When we break down this statement, notice that we will have to evaluate the possible
truth values of p, q, ∼p, and lastly, ∼p ∧ q. The truth table illustrates this process:

p q ∼p ∼p ∧ q
T T F F
T F F F
F T T T
F F T F

Once we get the truth values for ∼p, we perform the conjunction on the ∼p column and the q
column to get our result.

Definition 1.1.5. The operation exclusive-or, shortened to “XOR,” is denoted as p ⊕ q. This new
statement is true as long as p and q have different truth values.

In short, we would expect the truth table to look like:

p q p⊕q
T T F
T F T
F T T
F F F

But in fact, it is possible to express p ⊕ q in terms of the basic logical operators! We need to find
a statement that only uses negation, disjunction, and conjunction operators which fulfills the same
purpose as p ⊕ q.
Definition 1.1.6. Two statements are logically equivalent if they have the same truth value for
every possible combination of truth values of their initial statements. If p and q are two such
statements, then their equivalence is represented by p ≡ q.

In other words, their columns must be identical in a truth table.


Problem 1.1.7. Prove that p ⊕ q ≡ (p ∨ q) ∧ ∼(p ∧ q).

Proof. We end up with a large truth table since we have to break up that unwieldy statement.

p q p ∨ q p ∧ q ∼(p ∧ q) (p ∨ q) ∧ ∼(p ∧ q) p ⊕ q
T T T T F F F
T F T F T T T
F T T F T T T
F F F F T F F

It can then be confirmed that the columns for (p ∨ q) ∧ ∼(p ∧ q) and p ⊕ q have the same values.
Note that they must be the same for each row (i.e. when viewing the two columns top-to-bottom or
vice versa, the order of the truth values must match).
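
The column-by-column comparison in this proof can also be sketched in Python. The helper `equivalent` below is a hypothetical name introduced for illustration; it simply checks that two statements agree on every row of the truth table.

```python
from itertools import product

def equivalent(f, g, num_vars):
    """True when f and g have the same truth value on every row."""
    return all(f(*v) == g(*v) for v in product([True, False], repeat=num_vars))

xor = lambda p, q: p != q                      # p ⊕ q: true when the values differ
rhs = lambda p, q: (p or q) and not (p and q)  # (p ∨ q) ∧ ∼(p ∧ q)

print(equivalent(xor, rhs, 2))  # True
```

This is the same proof-by-truth-table method, with the rows generated by the computer rather than written out by hand.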
Problem 1.1.8. Prove that p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r).

Proof. Since we have three initial statements to consider (p, q, and r), our truth table will have
2^3 = 8 rows.
p q r q ∨ r p ∧ q p ∧ r p ∧ (q ∨ r) (p ∧ q) ∨ (p ∧ r)
T T T T T T T T
T T F T T F T T
T F T T F T T T
T F F F F F F F
F T T T F F F F
F T F T F F F F
F F T T F F F F
F F F F F F F F

Note that this equivalence resembles the distributive property from algebra.

Problem 1.1.9. Is this equivalence true or false?

(p ∧ q) ∨ r ≡ p ∧ (q ∨ r)

Proof. This equivalence is false. It suffices to supply one counterexample, since a logical equivalence
must hold for all possible truth value combinations for the initial statements (which would be p, q,
and r in this problem).
Consider p = F, q = F, and r = T. These truth values lead to the following equivalences:

(p ∧ q) ∨ r ≡ T,
p ∧ (q ∨ r) ≡ F.
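
A counterexample like this one can be found by searching all eight rows. The snippet below is a small sketch under that approach; the names `lhs`, `rhs`, and `counterexamples` are illustrative only.

```python
from itertools import product

lhs = lambda p, q, r: (p and q) or r   # (p ∧ q) ∨ r
rhs = lambda p, q, r: p and (q or r)   # p ∧ (q ∨ r)

# Collect every assignment (p, q, r) on which the two sides disagree.
counterexamples = [v for v in product([True, False], repeat=3)
                   if lhs(*v) != rhs(*v)]
print(counterexamples)
```

The assignment p = F, q = F, r = T used above appears in the resulting list; a single such row is enough to show the equivalence fails.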

In fact, there is an abundance of logical equivalences, and the most important laws are listed in
the table below.

Logical Equivalences

Commutative laws       p ∧ q ≡ q ∧ p                       p ∨ q ≡ q ∨ p

Associative laws       p ∧ (q ∧ r) ≡ (p ∧ q) ∧ r           p ∨ (q ∨ r) ≡ (p ∨ q) ∨ r

Distributive laws      p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r)     p ∨ (q ∧ r) ≡ (p ∨ q) ∧ (p ∨ r)

Identity laws          p ∧ t ≡ p                           p ∨ c ≡ p

Negation laws          p ∧ ∼p ≡ c                          p ∨ ∼p ≡ t

Double negation law    ∼(∼p) ≡ p

Idempotent laws        p ∧ p ≡ p                           p ∨ p ≡ p

DeMorgan’s laws        ∼(p ∧ q) ≡ ∼p ∨ ∼q                  ∼(p ∨ q) ≡ ∼p ∧ ∼q

Universal bound laws   p ∨ t ≡ t                           p ∧ c ≡ c

Absorption laws        p ∨ (p ∧ q) ≡ p                     p ∧ (p ∨ q) ≡ p

Dichotomies            ∼t ≡ c                              ∼c ≡ t

In this table, we use t and c to denote tautology and contradiction respectively. These are
interchangeable with T and F.

Exercise 1.1.10. Go through each equivalence listed here and develop a truth table to prove it,
until you are familiar with the proof-by-truth-table method.

It is recommended that you familiarize yourself with these rules, as invoking them tremendously
simplifies proofs later on.

Example 1.1.11
Simplify ((q ∨ p) ∧ q) ∨ (r ∧ (∼r ∧ ∼q)).

Solution. To keep it brief and simple, we will gloss over steps using the Commutative laws.

((q ∨ p) ∧ q) ∨ (r ∧ (∼r ∧ ∼q)) ≡ q ∨ (r ∧ (∼r ∧ ∼q)) (Absorption)
≡ q ∨ ((r ∧ ∼r) ∧ ∼q) (Associative)
≡ q ∨ (c ∧ ∼q) (Negation)
≡ (q ∨ ∼q) ∧ (q ∨ c) (Distributive)
≡ t ∧ (q ∨ c) (Negation)
≡ t ∧ q (Identity)
≡ q. (Identity)

This method is much easier and less tedious than using a truth table, as long as the laws are invoked
correctly.
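
As a sanity check on the laws invoked in this example, the simplification can be compared against the original statement on all 2^3 = 8 rows. This is only a verification sketch; the function name `original` is an assumption for illustration.

```python
from itertools import product

def original(p, q, r):
    # ((q ∨ p) ∧ q) ∨ (r ∧ (∼r ∧ ∼q))
    return ((q or p) and q) or (r and ((not r) and (not q)))

# The simplified form is just q, so both must agree on every row.
agrees = all(original(p, q, r) == q
             for p, q, r in product([True, False], repeat=3))
print(agrees)  # True
```

A check like this does not replace the algebraic derivation, but it catches a misapplied law quickly.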

Often in life, things happen as a result of other things. In fact, there is a special operation in
logic that illustrates this cause-and-effect dynamic.

Definition 1.1.12. The conditional of two statements p and q is denoted as p → q. This is
pronounced “if p, then q.” It could also be read as “p implies q.” This new statement is only false
when p is true but q is false.

Consider the situation in which p is false. Regardless of what truth value q is, p → q will
automatically be true. In this case, p → q is vacuously true, since the conditional is irrelevant in
the first place.
Otherwise, consider p to be true. If q is true, then p → q will be true, since our “if-then”
relationship is satisfied. However, if q is false, then this fails our notion of “if p, then q” so p → q
would be considered false in this case.
These observations can be illustrated in the following truth table.

p q p→q
T T T
T F F
F T T
F F T

For future reference, whenever we are given a statement of the form p → q to prove, we
should assume p to be true in our proof, because the statement is vacuously true when p is false
and there is nothing left to show in that case.

Problem 1.1.13. Is this equivalence true or false?

p → (q → r) ≡ (p → q) → r.

Solution. This equivalence is false. Let p = F, q = F, and r = F. Then,

p → (q → r) ≡ T,
(p → q) → r ≡ F.

Problem 1.1.14. Is this statement always true?

(p → r) ∧ (q → r) → ((p ∧ q) → r)

Solution. Yes, this statement is always true. Note that it is a conditional rather than an
equivalence. We can construct a truth table and then analyze the truth values.
p  q  r   p → r   q → r   p ∧ q   (p → r) ∧ (q → r)   (p ∧ q) → r   (p → r) ∧ (q → r) → ((p ∧ q) → r)
T  T  T     T       T       T             T                 T                       T
T  T  F     F       F       T             F                 F                       T
T  F  T     T       T       F             T                 T                       T
T  F  F     F       T       F             F                 T                       T
F  T  T     T       T       F             T                 T                       T
F  T  F     T       F       F             F                 T                       T
F  F  T     T       T       F             T                 T                       T
F  F  F     T       T       F             T                 T                       T
The last column (which represents the possible truth values of the given statement) only has T
as a value, so the statement is true for every combination of truth values.

How do we represent the conditional in terms of the basic logical operators? Note that p → q is
guaranteed to be true when p is false, i.e. ∼p is true. The only other case in which p → q is true is
when both p and q are true. Ultimately, we either want ∼p to be true or q to be true. Thus, we
would expect that
p → q ≡ ∼p ∨ q.
Exercise 1.1.15. Use a truth table to demonstrate p → q ≡ ∼p ∨ q.
Definition 1.1.16. Given the conditional p → q, there are three variations:

1. The inverse is ∼p → ∼q.

2. The converse is q → p.

3. The contrapositive is ∼q → ∼p.

Are these variations necessarily logically equivalent to p → q? It turns out that the original
conditional and its contrapositive are logically equivalent, and so are the converse and inverse.
In some cases, we may find it easier to prove the contrapositive of some theorem statement,
rather than prove the given implication.
Given the original conditional, directly assuming that the converse is true is called the converse
error, and likewise for the inverse it is called the inverse error.
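
The relationships among the conditional and its three variations can be confirmed by listing their truth-table columns. The sketch below uses p → q ≡ ∼p ∨ q from above; the helper name `implies` is chosen for illustration.

```python
from itertools import product

def implies(p, q):
    # p → q is false only when p is true and q is false.
    return (not p) or q

rows = list(product([True, False], repeat=2))

conditional    = [implies(p, q) for p, q in rows]
converse       = [implies(q, p) for p, q in rows]
inverse        = [implies(not p, not q) for p, q in rows]
contrapositive = [implies(not q, not p) for p, q in rows]

print(conditional == contrapositive)  # True
print(converse == inverse)            # True
print(conditional == converse)        # False
```

The last line is the converse error in miniature: the conditional and its converse differ on the rows where p and q have different truth values.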
Definition 1.1.17. The biconditional of statements p and q is denoted as p ↔ q. We read this as
“p if and only if q.” This is only true when p and q have the same truth value.

Some mathematical texts will abbreviate “if and only if” to simply “iff.”
Exercise 1.1.18. Prove that p ↔ q ≡ (p → q) ∧ (q → p).

Notice that p ↔ q is true exactly when p → q and its converse are both true. Usually, for
mathematical theorems, when we want to show that an implication and its converse are both true, we
use ↔ to emphasize both directions. When we need to prove such theorems, it is often the case that
we prove p → q and q → p separately. They are respectively referred to as the ‘right direction’ and
‘left direction.’
Notice that the conditions needed for p ↔ q to be true are the opposite of those for p ⊕ q. Indeed,
the following equivalence also holds true.
Exercise 1.1.19. Prove that p ↔ q ≡ ∼(p ⊕ q).
Problem 1.1.20. Express the disjunction, conditional, biconditional, and exclusive-or only in terms
of conjunction and negation.

Solution. We take previous results and apply DeMorgan’s laws when needed.
p ∨ q ≡ ∼(∼p ∧ ∼q)
p → q ≡ ∼p ∨ q
≡ ∼(p ∧ ∼q)
p ↔ q ≡ (p → q) ∧ (q → p)
≡ ∼(p ∧ ∼q) ∧ ∼(q ∧ ∼p)
p ⊕ q ≡ ∼(p ↔ q)
≡ ∼(∼(p ∧ ∼q) ∧ ∼(q ∧ ∼p))
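
Each of these rewrites can be checked against the usual meaning of the operator. In the Python sketch below, the four function names are illustrative, and each body uses only `and` and `not`, mirroring the solution.

```python
from itertools import product

# Each operator written using only conjunction (`and`) and negation (`not`).
def or_(p, q):
    return not ((not p) and (not q))

def implies(p, q):
    return not (p and (not q))

def iff(p, q):
    return (not (p and (not q))) and (not (q and (not p)))

def xor(p, q):
    return not ((not (p and (not q))) and (not (q and (not p))))

# Compare against Python's own operators on all four rows.
for p, q in product([True, False], repeat=2):
    assert or_(p, q) == (p or q)
    assert implies(p, q) == ((not p) or q)
    assert iff(p, q) == (p == q)
    assert xor(p, q) == (p != q)
print("all four rewrites match")
```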

1.2 Logical Arguments


Again consider the conditional p → q. Given that p is true and the implication holds, it is clear that
q must be true as a result. This reasoning can be organized in an argument form:

p→q
p
∴q

This form is known as modus ponens, and it is the most basic rule of inference in logic. The
statements p → q and p are called premises, while q is the conclusion.
Consider another valid argument form,

p→q
∼q
∴ ∼p

and this is called modus tollens. By considering the contrapositive, it should be clear why this
is true.
An argument is valid if, whenever all the premises are true, the conclusion is also true.

Example 1.2.1
Prove that modus ponens is valid.

Proof. Once again, we use the truth table.

Premises Conclusion
p q p→q p q
T T T T T
T F F T F
F T T F T
F F T F F

We only consider the row(s) in which all of the premises are true. A row which satisfies this is called
a critical row. The argument is valid if for each critical row, the conclusion(s) is also true. In this
example, we see that the only critical row is the first row, and q is true in that row, so the argument
is valid.
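
The critical-row test translates directly into a short program. The following Python sketch assumes a hypothetical helper `is_valid` that takes the premises and the conclusion as functions of the initial statements; it is an illustration of the method, not a definitive implementation.

```python
from itertools import product

def is_valid(premises, conclusion, num_vars):
    """Valid when the conclusion is true in every critical row."""
    for values in product([True, False], repeat=num_vars):
        if all(prem(*values) for prem in premises):  # a critical row
            if not conclusion(*values):
                return False
    return True

implies = lambda a, b: (not a) or b

# Modus ponens: p → q, p, therefore q.
modus_ponens = is_valid([lambda p, q: implies(p, q), lambda p, q: p],
                        lambda p, q: q, 2)
# Converse error: p → q, q, therefore p.
converse_error = is_valid([lambda p, q: implies(p, q), lambda p, q: q],
                          lambda p, q: p, 2)
print(modus_ponens, converse_error)  # True False
```

The converse error fails because the row p = F, q = T is critical, yet the conclusion p is false there.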

Of course, argument forms can be much more complex, rendering the truth-table method tedious
and time-inefficient. Like before, we have a set of tools at our disposal, called the rules of inference:

Modus Ponens

p → q
p
∴ q

Modus Tollens

p → q
∼q
∴ ∼p

Disjunctive Syllogism

p ∨ q
∼p
∴ q

Disjunctive Addition

p
∴ p ∨ q

Conjunctive Simplification

p ∧ q
∴ p

Conjunctive Addition

p
q
∴ p ∧ q

Hypothetical Syllogism

p → q
q → r
∴ p → r

Dilemma: Proof by Cases

p ∨ q
p → r
q → r
∴ r

It can be verified using the truth table that all of these are valid.
Lastly, there is another important argument form called the rule of contradiction:

∼p → F
∴p

This is the essence of the popular “proof by contradiction”: if the negation of a given statement
leads to a false conclusion, then the statement has to be true. Proof by contradiction will become a
very useful tool for proofs in the future.

Example 1.2.2
Demonstrate the following argument’s validity or invalidity.

p∨q (1)
q→r (2)
(p ∧ s) → t (3)
∼r (4)
∼q → (u ∧ s) (5)
∴t

Note that the variable t here represents a statement rather than a tautology.

Proof. We will show that this is valid, through a proof with multiple steps.
Step 1:

q→r (by 2)
∼r (by 4)
∴ ∼q (modus tollens)

Step 2:

p∨q (by 1)

∼q (by step 1)
∴p (disjunctive syllogism)

Step 3:

∼q → (u ∧ s) (by 5)
∼q (by step 1)
∴u∧s (modus ponens)

Step 4:

u∧s (by step 3)


∴s (conjunctive simplification)

Step 5:

p (by step 2)
s (by step 4)
∴p∧s (conjunctive addition)

Step 6:

(p ∧ s) → t (by 3)
p∧s (by step 5)
∴t (modus ponens)
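
The six-step proof can be cross-checked with the truth-table method, which here means 2^6 = 64 rows. The sketch below is an assumption-labelled illustration: `premises_hold` and `critical_rows` are names invented for this check.

```python
from itertools import product

implies = lambda a, b: (not a) or b

def premises_hold(p, q, r, s, t, u):
    return (
        (p or q)                         # (1) p ∨ q
        and implies(q, r)                # (2) q → r
        and implies(p and s, t)          # (3) (p ∧ s) → t
        and (not r)                      # (4) ∼r
        and implies(not q, u and s)      # (5) ∼q → (u ∧ s)
    )

# Each row is a tuple (p, q, r, s, t, u); keep only the critical rows.
critical_rows = [v for v in product([True, False], repeat=6)
                 if premises_hold(*v)]
# The conclusion t is the entry at index 4 of each row.
valid = all(v[4] for v in critical_rows)
print(len(critical_rows), valid)
```

Only one critical row survives, and it matches the values forced by the proof: p, s, t, u true and q, r false.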

Problem 1.2.3. Prove that this argument is valid.

p→q∧r (1)
∼q (2)
∴ ∼p

Proof. You may use a truth table and examine the truth values of the critical rows, but rules of
inference are still applicable:
Step 1:

∼q (by 2)
∴ ∼q ∨ ∼r (disjunctive addition)

Step 2:

∼q ∨ ∼r (by step 1)
∴ ∼(q ∧ r) (DeMorgan’s laws)

Step 3:

p→q∧r (by 1)
∼(q ∧ r) (by step 2)
∴ ∼p (modus tollens)

Alternative Proof. The argument is valid if, whenever the premises are all true, the conclusion
must also be true. Thus, assume that the premises are true.
If ∼q is true, then q must be false. This then implies that q ∧ r is also false, by universal bound
laws. By modus tollens, p must be false, i.e. ∼p is true, and we are done.

1.3 Quantified Statements

A predicate is a declaration involving unknown variables, and if values were assigned to these
variables, the predicate would become a statement with a truth value. For example:

P (x) : x > 3 ← Predicate


P (7) : 7 > 3 ← Statement (Truth value: T)

The domain is the set of all values that can be assigned to the predicate variables.
Quantifiers serve to specify how many elements of the domain result in a true statement when
substituted into the predicate. There are two types:

Definition 1.3.1. The universal quantifier, denoted by ∀, indicates that all elements in the
domain result in a true statement when substituted into the predicate.

For instance, ∀x ∈ D, P (x) is read as “for all x in D, P (x) is true.” Thus, this quantified
statement is true precisely when P (x) is true for every x in D.
The symbol ∈ when we have x ∈ D indicates that x is an element of set D, where a set is a
collection of elements (refer to Definition 2.1.1). For example, 1 ∈ Z is true, but π ∈ Q is false.
We will go over sets in greater detail in the next chapter.

Definition 1.3.2. The existential quantifier, denoted by ∃, indicates that there is some element
in the domain that results in a true statement when substituted into the predicate.

For example, ∃x ∈ D, P (x) is read as “there exists an x in D such that P (x) is true.” Thus, this
quantified statement is true precisely when P (x) is true for some x in D.
Often, in math, we use quantified statements with respect to sets of numbers. Here is a list of
the generally accepted symbols for well-known sets.

R − Real numbers
Q − Rational numbers

Z − Integers
W − Whole numbers
N − Natural numbers
C − Complex numbers
H − Quaternions

In particular, you should be able to distinguish and understand the first five sets. As a note,
the natural numbers are defined starting from 1, while the whole numbers are the natural numbers
including 0.
Lastly, before proceeding further, it will be assumed that you are familiar with interval notation
from previous experience with algebra.

Problem 1.3.3. For each quantified statement, determine if it is true or false.

1. ∀x ∈ R, x^2 ≥ 0

2. ∃x ∈ Z, x^2 < 1

3. ∀x ∈ R, x ∈ Q

4. ∀x ∈ Z, x ∈ Q → x ∈ R

Solution. The quantifier can change the whole meaning of the statement, so be sure not to get
confused.

1. True. This is the statement of the Trivial Inequality.

2. True. There is an existential quantifier, so the statement will be true provided that we can
give at least one example: note that x = 0 satisfies x^2 < 1.

3. False. Analogously, for a universal quantifier, we can provide at least one counterexample to
disprove the statement: note that x = π is a real but not rational number.

4. True. All integers are rational numbers, and all rational numbers are real numbers.

Problem 1.3.4. For each quantified statement, determine if it is true or false.

1. ∀n ∈ Z, 2 | n → 4 | n

2. ∀n ∈ Z, 4 | n → 2 | n

3. ∃n ∈ Z, 2 | n → 4 | n

4. ∀x ∈ R, x^2 > 0

5. ∃x ∈ R, x^3 − x^2 + 4972x − 11.62π = 0

6. ∃x ∈ (0, 2π), sec x = 29

Note: the notation “a | b” indicates that a divides b, i.e. a is a factor of b.



Solution. You may not have encountered some of the math used here, and that is alright. We will
eventually examine some of these closely in later chapters.

1. False. Counterexample: n = 6.
2. True. Let n = 4k for some k ∈ Z. Then n = 4k =⇒ n = 2(2k), and since k ∈ Z, we must
have 2 | n.
3. True. Example: n = 8.
4. False. Counterexample: x = 0.
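5. True. By the Intermediate Value Theorem, every cubic with real coefficients has at least one real root.
6. True. The range of sec x is (−∞, −1] ∪ [1, +∞), which contains 29. Indeed, sec x = 29 means
cos x = 1/29, and cos x takes every value between 0 and 1 on the interval (0, π/2). Therefore there
must be some x in (0, 2π) which yields this number.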
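
Python's `all` and `any` behave like the universal and existential quantifiers over a finite collection, which makes it easy to experiment with the first three statements. The sketch below rests on a loud assumption: the finite window `sample` stands in for Z, so it can exhibit counterexamples and witnesses but cannot prove a universal statement about all integers. The names `sample`, `divides`, and `implies` are illustrative.

```python
# A finite window of integers standing in for Z (an assumption for illustration).
sample = range(-50, 51)

divides = lambda a, b: b % a == 0     # a | b
implies = lambda p, q: (not p) or q

# 1. ∀n, 2 | n → 4 | n : search for counterexamples.
counterexamples = [n for n in sample
                   if not implies(divides(2, n), divides(4, n))]
# 2. ∀n, 4 | n → 2 | n : no counterexample appears in the window.
claim_2_holds_on_sample = all(implies(divides(4, n), divides(2, n))
                              for n in sample)
# 3. ∃n, 2 | n → 4 | n : a single witness suffices.
claim_3_has_witness = any(implies(divides(2, n), divides(4, n))
                          for n in sample)

print(6 in counterexamples, claim_2_holds_on_sample, claim_3_has_witness)
```

The second result only says that no counterexample exists between −50 and 50; the actual proof is the algebraic argument n = 4k = 2(2k) given above.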
Problem 1.3.5. Let L(x, y) denote “x likes y.” For each of the following statements with double
quantifiers,

∀x ∀y L(x, y),
∀x ∃y L(x, y),
∃x ∀y L(x, y),
∃x ∃y L(x, y),
∃x ∀y L(y, x),

find a brief sentence in words that is equivalent to each statement.

Solution. This problem is meant to develop your understanding of universal and existential quantifiers,
as well as their relations to each other in a symbolic statement. Notice how switching the order or
type of quantifier can drastically change the meaning of the sentence.

∀x ∀y L(x, y) ←→ Everybody likes everyone.


∀x ∃y L(x, y) ←→ Everybody likes someone.
∃x ∀y L(x, y) ←→ Somebody likes everyone.
∃x ∃y L(x, y) ←→ Somebody likes someone.
∃x ∀y L(y, x) ←→ There’s someone whom everybody likes.

To clarify some ambiguity, the statement “everyone likes someone” suggests that each person
likes somebody else, but the person who is liked can vary depending on the person who is liking.
However, the statement “there’s someone whom everybody likes” suggests that everybody likes one
particular, common person.

How would we negate a quantified statement? First, let’s consider one with the existential
quantifier, ∃x ∈ D, P (x).
There just has to be at least one value of x, when substituted into the predicate, that yields a
true statement. Therefore, we can rewrite it as a series of predicates all connected by disjunctions:

∃x ∈ D, P(x) ≡ P(x_1) ∨ P(x_2) ∨ P(x_3) ∨ . . . ∨ P(x_{n−1}) ∨ P(x_n),

where D = {x_1, x_2, x_3, . . . , x_n}, since a statement with a disjunction only needs at least one of the
P(x_i), with x_i ∈ D, to be true.
We wish to find the negation of this. In fact, DeMorgan’s laws can be generalized for a series of
disjunctions, such that

∼(P(x_1) ∨ P(x_2) ∨ P(x_3) ∨ . . . ∨ P(x_{n−1}) ∨ P(x_n)) ≡
∼P(x_1) ∧ ∼P(x_2) ∧ ∼P(x_3) ∧ . . . ∧ ∼P(x_{n−1}) ∧ ∼P(x_n).

But if we have a series of P(x_i), with x_i ∈ D, joined together by conjunctions, then this suggests that the
universal quantifier should be applied. It follows that

∴ ∼P(x_1) ∧ ∼P(x_2) ∧ ∼P(x_3) ∧ . . . ∧ ∼P(x_{n−1}) ∧ ∼P(x_n) ≡ ∀x ∈ D, ∼P(x).

Thus, ∼(∃x ∈ D, P (x)) ≡ ∀x ∈ D, ∼P (x).
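
For a finite domain, this negation rule and its universal counterpart can be observed directly with `any` and `all`. In the sketch below, the domain `D` is an arbitrary small list chosen for illustration, and `P` reuses the predicate x > 3 from earlier in this section.

```python
D = [1, 2, 3, 4, 5]        # a small finite domain, chosen for illustration
P = lambda x: x > 3        # the predicate P(x): x > 3

# ∼(∃x ∈ D, P(x)) compared with ∀x ∈ D, ∼P(x)
lhs_exists = not any(P(x) for x in D)
rhs_exists = all(not P(x) for x in D)

# ∼(∀x ∈ D, P(x)) compared with ∃x ∈ D, ∼P(x)
lhs_forall = not all(P(x) for x in D)
rhs_forall = any(not P(x) for x in D)

print(lhs_exists == rhs_exists, lhs_forall == rhs_forall)  # True True
```

Here `any` plays the role of the chain of disjunctions and `all` the chain of conjunctions, so the agreement is DeMorgan's laws applied across the whole domain.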

Exercise 1.3.6. Use the same reasoning as above to show that ∼ (∀x ∈ D, P (x)) ≡ ∃x ∈ D, ∼P (x).

Problem 1.3.7. Negate the statement: ∃n ∈ Z, 2 | n → 4 | n.

Solution. We simply switch the ∃ symbol to a ∀ symbol, and then negate the remainder of the
statement, which is 2 | n → 4 | n. Recall that p → q ≡ ∼p ∨ q, so ∼(2 | n → 4 | n) ≡ ∼(2 ∤ n ∨ 4 | n)
≡ 2 | n ∧ 4 ∤ n by DeMorgan’s law. Thus, we conclude

∀n ∈ Z, 2 | n ∧ 4 ∤ n.

Problem 1.3.8. Negate:

∀ε > 0 ∃δ > 0 ∀x, 0 < |x − a| < δ → |f (x) − L| < ε

Note: this is the epsilon-delta definition of the limit, and we will go over this extensively in a
later chapter.

Solution. This statement has a lot of quantifiers strung together, but we can dissect each part one
by one:

∼(∀ε > 0 ∃δ > 0 ∀x, 0 < |x − a| < δ → |f (x) − L| < ε)


≡ ∃ε > 0 ∼(∃δ > 0 ∀x, 0 < |x − a| < δ → |f (x) − L| < ε)
≡ ∃ε > 0 ∀δ > 0 ∼(∀x, 0 < |x − a| < δ → |f (x) − L| < ε)
≡ ∃ε > 0 ∀δ > 0 ∃x, ∼(0 < |x − a| < δ → |f (x) − L| < ε)
≡ ∃ε > 0 ∀δ > 0 ∃x, ∼(∼(0 < |x − a| < δ) ∨ |f (x) − L| < ε)
≡ ∃ε > 0 ∀δ > 0 ∃x, 0 < |x − a| < δ ∧ |f (x) − L| ≥ ε.

Problem 1.3.9. Determine if the following statement is true or false, and explain. Then, find its
negation.
∀x ∈ Q ∃y ∈ Z+ , xy ∈ Z.
Solution. The statement is true. By the definition of a rational number, let x = m/n such that
m ∈ Z, n ∈ Z+. Let y = n (this choice of y is allowed because the existential quantifier indicates
that a certain y can be found based on any given x). Therefore, xy = (m/n) · n = m, which has already
been established as an integer.
The negation would be ∃x ∈ Q ∀y ∈ Z+, xy ∉ Z.

Problem 1.3.10. Determine if the following statement is true or false for U = R and U = Z
respectively, and explain. Then, find its negation.

∀x ∈ U, ∀y ∈ U, (x > y → ∃z ∈ U, x > z > y) .

Solution. If U = R, then the statement is true. To demonstrate this, let z = (x + y)/2, i.e. take the
average of x and y. Again, as z was defined using an existential quantifier, we should express z in
terms of x and y, which represent any real numbers (because of their universal quantifiers). As all
are real numbers and x > y, the inequality x > (x + y)/2 > y is clearly satisfied.
If U = Z, then the statement is false. It suffices to show one counterexample: let x = 2, y = 1.
Then there cannot be any integer z for which 2 > z > 1.
To find the negation, we dissect the statement similar to before:

∼(∀x ∈ U, ∀y ∈ U, (x > y → ∃z ∈ U, x > z > y))


≡ ∃x ∈ U, ∼(∀y ∈ U, (x > y → ∃z ∈ U, x > z > y))
≡ ∃x ∈ U, ∃y ∈ U, ∼ (x > y → ∃z ∈ U, x > z > y)
≡ ∃x ∈ U, ∃y ∈ U, ∼ (∼(x > y) ∨ ∃z ∈ U, x > z > y)
≡ ∃x ∈ U, ∃y ∈ U, (x > y ∧ ∼(∃z ∈ U, x > z > y))
≡ ∃x ∈ U, ∃y ∈ U, (x > y ∧ ∀z ∈ U, ∼(x > z > y)) .

Remember that x > z > y is actually an “AND” statement: x > z > y ≡ x > z ∧ z > y. Therefore
its negation would be x ≤ z ∨ z ≤ y, so the complete negation would be

∃x ∈ U, ∃y ∈ U, (x > y ∧ ∀z ∈ U, x ≤ z ∨ z ≤ y) .
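
The two cases can be illustrated numerically. This is a hedged sketch rather than a proof: the helper `has_between` is a made-up name, the integer search covers only a finite window, and exact rationals from `fractions.Fraction` stand in for real numbers.

```python
from fractions import Fraction

def has_between(x, y, candidates):
    """True when some z among the candidates satisfies x > z > y."""
    return any(x > z > y for z in candidates)

# U = Z: no integer in the window lies strictly between 2 and 1.
print(has_between(2, 1, range(-100, 101)))  # False

# U = R, illustrated with exact rationals: the average works when x > y.
x, y = Fraction(7, 3), Fraction(9, 4)
z = (x + y) / 2
print(x > z > y)                             # True
```

Python's chained comparison `x > z > y` means `x > z and z > y`, which matches the “AND” reading of the inequality used in the negation above.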

Problem 1.3.11. Determine for each statement whether it is true or false, and justify.

1. ∀x ∈ R, ∃y ∈ R, y^2 = x

2. ∀x ∈ Z, ∃y ∈ Q, xy ∈ Z

3. ∀x ∈ Q, ∃y ∈ Z, xy ∈ Z

4. ∃x ∈ R, ∃y ∈ R, |x − y| > 7 ∧ x^2 + y^2 = 22

Solution.

1. False. Consider x < 0. There is no real number whose square is a negative real number. You
could specify a particular negative value of x to be specific in demonstrating the failure.

2. True. Some rational numbers are also integers. Then, choose y to be a rational number that
is also an integer, such that xy will be an integer. We have freedom in choosing y because of
the existential quantifier.

3. True. Simply let y = 0 for any given rational number x. Clearly x · 0 = 0 ∈ Z, so we’re done.

4. False. The condition |x − y| > 7 indicates that x and y are more than 7 apart, so (x − y)^2 > 49.
On the other hand, 2(x^2 + y^2) − (x − y)^2 = (x + y)^2 ≥ 0 by the Trivial Inequality, so
(x − y)^2 ≤ 2(x^2 + y^2). If x^2 + y^2 = 22, this gives (x − y)^2 ≤ 44 < 49, a contradiction. Thus, it is
impossible to have both |x − y| > 7 and x^2 + y^2 = 22 for any real numbers x and y.

Problem 1.3.12. Determine for each statement whether it is true or false, and justify.

1. ∀x ∈ Q, ∃y ∈ Z+ , xy ∈ Z

2. ∀x ∈ Z ∃y ∈ Z ∀z ∈ Z, x ≠ z −→ |x − y| ≤ |x − z|

3. ∀x ∈ R, ∀y ∈ R, ∃z ∈ R, xz = y

Solution.

1. True. Let x = m/n, where m ∈ Z and n ∈ Z+, and let y = n. Then xy = (m/n) · n = m, which is
in Z, so we’re done.

2. True. As proof, let y = x. Then |x − y| = 0. No matter what z is, |x − z| will always be 0 or
greater, satisfying the given inequality. Furthermore, x ≠ z implies that x − z ≠ 0. Therefore,
|x − z| must be positive. Thus, the implication x ≠ z −→ |x − y| ≤ |x − z| is always true when
y = x.

3. False. As a counterexample, let x = 0 and y = 1. Demonstrate why it fails.

For a brief interlude, let’s see how we can combine quantifiers with argument forms. Consider
the following premises:

P (1)
∀n ∈ Z+ , P (n) → P (n + 1)

What would be the conclusion?


Recall that a statement with a universal quantifier can be rewritten as a series of conjunctions.
Given that Z+ = {1, 2, 3, . . .}, realize that ∀n ∈ Z+ , P (n) → P (n + 1) implies infinitely many
predicates:

∀n ∈ Z+ , P (n) → P (n + 1) ≡ (P (1) → P (2)) ∧ (P (2) → P (3)) ∧ (P (3) → P (4)) ∧ . . .



Given the initial case P (1), we can repeatedly apply modus ponens, so the second premise results
in infinitely many conclusions.

P(1), P(1) → P(2)   ∴ P(2)
P(2), P(2) → P(3)   ∴ P(3)
P(3), P(3) → P(4)   ∴ P(4)
. . .


As shown above, the statements P(1), P(2), P(3), P(4), . . . are all true; in other words, P(n) is
true for all n ∈ Z+. Therefore, the complete argument is expressed as:

P (1)
∀n ∈ Z+ , P (n) → P (n + 1)
∴ ∀n ∈ Z+ , P (n)

This argument form is known as mathematical induction, with base case P(1) and inductive
step P(n) → P(n + 1). This is a very important technique for proving certain theorems that we will
go over closely in a later chapter.
Chapter 2

Set Theory

Every field of mathematics uses or refers to sets in one way or another. As the last chapter may
have shown, we need to use sets when dealing with quantified statements. In this chapter, we will
review some basic aspects of set theory, building from the background of the last chapter.

2.1 Introduction

First we start with a definition that seems general and vague.

Definition 2.1.1. A set is a collection of things, called elements.

Even then, we should have a basic sense of what sets are. You probably already have encountered
sets in your mathematical education so far, including R, Q, Z.
For notation, we will use capital letters to refer to sets and lowercase letters to refer to elements.
We state x ∈ A if x is an element of A.

Definition 2.1.2. The universal set, U, is the set that contains all elements.

Definition 2.1.3. The empty set, ∅, is the set that contains no elements.

We now introduce some relations between sets.

Definition 2.1.4. For two sets A and B, A is equal to B when

A = B ←→ ∀x ∈ U, x ∈ A ↔ x ∈ B.

Definition 2.1.5. For two sets A and B, A is a subset of B when

A ⊆ B ←→ ∀x ∈ U, x ∈ A → x ∈ B.

Exercise 2.1.6. Under what condition would A ⊈ B (A is not a subset of B)? Hint: negate
Definition 2.1.5.

Daniel Kim 20

Theorem 2.1.7
A = B ←→ A ⊆ B ∧ B ⊆ A.

Proof. We simply apply standard logical equivalences.

A = B ←→ ∀x ∈ U, x ∈ A ↔ x ∈ B
A = B ←→ ∀x ∈ U, (x ∈ A → x ∈ B) ∧ (x ∈ B → x ∈ A)
∴ A = B ←→ A ⊆ B ∧ B ⊆ A

Theorem 2.1.8
A ⊆ B ∧ B ⊆ C −→ A ⊆ C.

Proof. First, note A ⊆ B ∧ B ⊆ C −→ ∀x ∈ U, (x ∈ A → x ∈ B) ∧ (x ∈ B → x ∈ C). We know


that both x ∈ A → x ∈ B and x ∈ B → x ∈ C are true by conjunctive simplification, and therefore
we must have x ∈ A → x ∈ C by hypothetical syllogism.
As such, we have A ⊆ B ∧ B ⊆ C −→ ∀x ∈ U, x ∈ A → x ∈ C, or A ⊆ B ∧ B ⊆ C −→ A ⊆ C,
and we are done.
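As a sanity check on Theorems 2.1.7 and 2.1.8, we can test them on every subset of a small universe. This is only a sketch: the helper `subsets` and the universe {1, 2, 3} are our own choices, and checking finitely many cases supports the theorems but does not replace the proofs above. Python's `<=` on sets is the subset test.

```python
from itertools import combinations

def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

U = {1, 2, 3}
all_sets = subsets(U)

# Theorem 2.1.7: A = B exactly when A is a subset of B and B is a subset of A.
thm_217 = all((A == B) == (A <= B and B <= A)
              for A in all_sets for B in all_sets)

# Theorem 2.1.8: A subset of B and B subset of C together force A subset of C.
thm_218 = all(A <= C
              for A in all_sets for B in all_sets for C in all_sets
              if A <= B and B <= C)

print(thm_217, thm_218)  # True True
```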

Problem 2.1.9. Prove A ⊆ ∅ −→ A = ∅.

Proof. Consider the definition of A ⊆ ∅, i.e. ∀x ∈ U, x ∈ A → x ∈ ∅. The empty set contains no
elements, so x ∈ ∅ ≡ F. Then, we have ∀x ∈ U, x ∈ A → F. An implication with a false conclusion
can only be true when its hypothesis is false, and therefore ∀x ∈ U, x ∉ A. But this is just the
definition of the empty set, so A = ∅.

Definition 2.1.10. Let the set \overline{A} denote the complement of set A. Then,

x ∈ \overline{A} ←→ x ∈ U ∧ x ∉ A.

We can alternatively express this as

∀x ∈ U, x ∈ \overline{A} ←→ x ∉ A ←→ ∼(x ∈ A).

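Problem 2.1.11. Prove \overline{\overline{A}} = A.

Proof. This is a simple application of the double negative law.

∀x ∈ U, x ∈ \overline{\overline{A}} ←→ ∼(x ∈ \overline{A})
∀x ∈ U, x ∈ \overline{\overline{A}} ←→ ∼(∼(x ∈ A))
∀x ∈ U, x ∈ \overline{\overline{A}} ←→ x ∈ A
∴ \overline{\overline{A}} = A.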

Problem 2.1.12. Prove A ⊆ B −→ \overline{B} ⊆ \overline{A}.

Proof. This results from the implication being logically equivalent to its contrapositive.
A ⊆ B −→ ∀x ∈ U, (x ∈ A → x ∈ B)
A ⊆ B −→ ∀x ∈ U, (∼(x ∈ B) → ∼(x ∈ A))
A ⊆ B −→ ∀x ∈ U, x ∈ \overline{B} → x ∈ \overline{A}
∴ A ⊆ B −→ \overline{B} ⊆ \overline{A}.

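Problem 2.1.13. Prove A ⊆ B ←→ \overline{B} ⊆ \overline{A}.

Proof. Problem 2.1.12 gives us the right direction. For the left direction, note that Problem 2.1.12
also tells us that \overline{B} ⊆ \overline{A} −→ \overline{\overline{A}} ⊆ \overline{\overline{B}}. Then by Problem 2.1.11, \overline{\overline{B}} = B and \overline{\overline{A}} = A, so
\overline{B} ⊆ \overline{A} −→ A ⊆ B. It follows that A ⊆ B ←→ \overline{B} ⊆ \overline{A}.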

Problem 2.1.14. Prove \overline{U} = ∅.

Proof. By our previous results,

∀x ∈ U, x ∈ \overline{U} ←→ x ∉ U
                  ←→ ∼(x ∈ U)
                  ←→ F
                  ←→ x ∈ ∅.

Thus, \overline{U} = ∅. For clarification: x was defined to be an element of U, thus ∼(x ∈ U) would be F.

Problem 2.1.15. What is \overline{{1, 2, 3}}?

Solution. It depends on the universe. Any set that does NOT contain any of 1, 2, or 3 is an answer.

\overline{{1, 2, 3}}                         U
∅                                            {1, 2, 3}
{4, 5}                                       {1, 2, 3, 4, 5}
(−∞, 1) ∪ (1, 2) ∪ (2, 3) ∪ (3, +∞)          R

The last example uses notation that may not be familiar; we will introduce it in detail in the
next section.
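The dependence of a complement on the universe is easy to see with finite sets. The sketch below is illustrative only: the function `complement` is our own helper (Python has no built-in universal set), and it assumes A is a subset of U.

```python
def complement(A, U):
    """Complement of A relative to the universe U (assumes A is a subset of U)."""
    return U - A

A = {1, 2, 3}

# The same set A has different complements in different universes.
print(sorted(complement(A, {1, 2, 3})))        # []
print(sorted(complement(A, {1, 2, 3, 4, 5})))  # [4, 5]

# Taking the complement twice (in the same universe) returns A.
U = {1, 2, 3, 4, 5}
print(complement(complement(A, U), U) == A)    # True
```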

2.2 Operations on Sets


Before we define some common operations on sets, we must first review set builder notation. In
this fashion, we define the set by specifying the type of element and the conditions that must be
met in order to qualify a thing as an element of the set.
For instance, the set of all integers whose squares are less than 7 can be written as:
{x ∈ Z | x^2 < 7} = {−2, −1, 0, 1, 2}.
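Set builder notation translates almost symbol for symbol into a Python set comprehension. Since Z is infinite, the sketch below searches only the window from −10 to 10; that window is our own assumption, and it is safe here because any integer with a square below 7 has absolute value at most 2.

```python
# {x in Z | x^2 < 7}, searched over a finite window of integers
S = {x for x in range(-10, 11) if x ** 2 < 7}

print(sorted(S))  # [-2, -1, 0, 1, 2]
```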

We can then define two very important operations in this manner:



Definition 2.2.1. The intersection of A and B is defined as

A ∩ B = {x ∈ U | x ∈ A ∧ x ∈ B}.

Definition 2.2.2. The union of A and B is defined as

A ∪ B = {x ∈ U | x ∈ A ∨ x ∈ B}.

Problem 2.2.3. Given A = {1, 2, 3} and B = {3, 4, 5}, find A ∩ B and A ∪ B.

Solution. Remember that the intersection contains only elements both A and B have in common,
while the union contains elements from either A or B.

A ∩ B = {3}.

A ∪ B = {1, 2, 3, 4, 5}.
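We can reproduce this computation directly from Definitions 2.2.1 and 2.2.2. In this sketch the universe U = {1, . . . , 5} is our own choice; the comprehensions mirror the definitions, and Python's built-in operators `&` and `|` give the same sets.

```python
U = {1, 2, 3, 4, 5}   # assumed universe for this example
A = {1, 2, 3}
B = {3, 4, 5}

# Straight from the definitions: "and" for intersection, "or" for union.
intersection = {x for x in U if x in A and x in B}
union = {x for x in U if x in A or x in B}

print(sorted(intersection))  # [3]
print(sorted(union))         # [1, 2, 3, 4, 5]

# The built-in set operators agree with the definitions.
print(intersection == A & B, union == A | B)  # True True
```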

Consider the following diagram. This gives a visual representation of the intersection and union
of two sets.

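[Figure: Venn diagram of two overlapping sets A and B, with a legend identifying the shaded regions A ∩ B and A ∪ B.]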

From this, we can quickly deduce two seemingly obvious results:

Problem 2.2.4. Prove A ∩ B ⊆ A.

Proof. Let x ∈ U and suppose x ∈ A ∩ B. By definition, x ∈ A ∩ B ←→ x ∈ A ∧ x ∈ B. By
conjunctive simplification, we conclude x ∈ A. Thus, ∀x ∈ U, x ∈ A ∩ B → x ∈ A, which yields
A ∩ B ⊆ A, as desired.

Problem 2.2.5. Prove A ⊆ A ∪ B.

Proof. For a given element x ∈ U, assume that x ∈ A. Then by disjunctive addition,
x ∈ A ∨ x ∈ B, which means that x ∈ A ∪ B. Thus, ∀x ∈ U, x ∈ A −→ x ∈ A ∪ B, which
gives us A ⊆ A ∪ B.

Exercise 2.2.6. Prove A ∩ B = B −→ B ⊆ A.



Theorem 2.2.7 (DeMorgan’s Laws on Sets)


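For given sets A and B, \overline{A ∪ B} = \overline{A} ∩ \overline{B} and \overline{A ∩ B} = \overline{A} ∪ \overline{B}.

Proof. The proofs involve direct application of DeMorgan’s laws. I encourage you to go through and
justify each step of the process.

∀x ∈ U, x ∈ \overline{A ∪ B} ←→ ∼(x ∈ A ∪ B)
∀x ∈ U, x ∈ \overline{A ∪ B} ←→ ∼(x ∈ A ∨ x ∈ B)
∀x ∈ U, x ∈ \overline{A ∪ B} ←→ ∼(x ∈ A) ∧ ∼(x ∈ B)
∀x ∈ U, x ∈ \overline{A ∪ B} ←→ x ∈ \overline{A} ∧ x ∈ \overline{B}
∀x ∈ U, x ∈ \overline{A ∪ B} ←→ x ∈ \overline{A} ∩ \overline{B}
∴ \overline{A ∪ B} = \overline{A} ∩ \overline{B}.

The proof for the second result is very similar.

∀x ∈ U, x ∈ \overline{A ∩ B} ←→ ∼(x ∈ A ∩ B)
∀x ∈ U, x ∈ \overline{A ∩ B} ←→ ∼(x ∈ A ∧ x ∈ B)
∀x ∈ U, x ∈ \overline{A ∩ B} ←→ ∼(x ∈ A) ∨ ∼(x ∈ B)
∀x ∈ U, x ∈ \overline{A ∩ B} ←→ x ∈ \overline{A} ∨ x ∈ \overline{B}
∀x ∈ U, x ∈ \overline{A ∩ B} ←→ x ∈ \overline{A} ∪ \overline{B}
∴ \overline{A ∩ B} = \overline{A} ∪ \overline{B}.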
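Both of DeMorgan's laws on sets can be spot-checked by brute force over a small universe. This is a sketch under our own assumptions: the universe {1, 2, 3, 4} and the helpers `subsets` and `complement` are illustrative, and a finite check is supporting evidence rather than a proof.

```python
from itertools import combinations

U = frozenset({1, 2, 3, 4})

def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def complement(S):
    """Complement of S relative to the universe U."""
    return U - S

de_morgan_holds = all(
    complement(A | B) == complement(A) & complement(B)
    and complement(A & B) == complement(A) | complement(B)
    for A in subsets(U) for B in subsets(U)
)

print(de_morgan_holds)  # True
```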

Theorem 2.2.8 (Associative Laws on Sets)


For given sets A, B, and C, A ∪ (B ∪ C) = (A ∪ B) ∪ C and A ∩ (B ∩ C) = (A ∩ B) ∩ C.

Exercise 2.2.9. By directly applying the associative law of logic, prove Theorem 2.2.8.

Theorem 2.2.10 (Distributive Laws on Sets)


For given sets A, B, and C, A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) and A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

Exercise 2.2.11. In the same style as the previous theorems, prove Theorem 2.2.10.

Example 2.2.12
Prove A ∩ B = A ∩ C ∧ A ∪ B = A ∪ C −→ B = C.

Proof. By Theorem 2.1.7, it suffices to prove B ⊆ C and C ⊆ B. First, we will prove B ⊆ C.


To do so, we want to prove that x ∈ B → x ∈ C. Assuming x ∈ B, we consider two cases: x ∈ A
or x ∉ A.

1. If x ∈ A, then by conjunctive addition, x ∈ A ∧ x ∈ B, or x ∈ A ∩ B. However, we are given


that A ∩ B = A ∩ C, thus x ∈ A ∩ C. But this reduces to x ∈ A ∧ x ∈ C, from which we invoke
conjunctive simplification to conclude x ∈ C, which is what we wanted.

2. If x ∉ A, then by disjunctive addition, x ∈ B ∨ x ∈ A (since we have assumed x ∈ B to be
true), or x ∈ A ∪ B. We are given that A ∪ B = A ∪ C, so x ∈ A ∪ C, from which we get
x ∈ A ∨ x ∈ C. However, we know that x ∉ A, so by disjunctive syllogism, x ∈ C, the same
result as the previous case.

Thus, whether x ∈ A or x ∉ A, we can conclude that x ∈ B → x ∈ C.
The proof of x ∈ C → x ∈ B is very similar, and will be left to the reader.

Exercise 2.2.13. Finish the proof of Example 2.2.12.

We can alternatively prove Example 2.2.12 using a more logic-based argument.

Alternative Proof. Note that the theorem’s statement can be broken down into:

((x ∈ A ∨ x ∈ B) ↔ (x ∈ A ∨ x ∈ C)) ∧ ((x ∈ A ∧ x ∈ B) ↔ (x ∈ A ∧ x ∈ C)) −→ x ∈ B ↔ x ∈ C.

Let p, q, r denote the statements x ∈ A, x ∈ B, x ∈ C respectively. Therefore, the logical


statement above can be translated into this argument form:

p ∧ q ←→ p ∧ r
p ∨ q ←→ p ∨ r
∴ q ←→ r

It is sufficient to prove that this argument form is valid, i.e. if the premises are true, then the
conclusion must also be true. We can either use a truth table or casework to do so.
Since the truth table method can be time-consuming and tedious, I will proceed by casework on
p in particular.
If p is true, then we have

T ∧ q ←→ T ∧ r,
T ∨ q ←→ T ∨ r.

By the Identity and Domination laws (T ∧ q ≡ q and T ∨ q ≡ T), this simplifies to

q ←→ r,
T ←→ T.

Clearly, T ←→ T is always true, so we can ignore this part. Remember that we want to show
that if the premises are true, then the conclusion must also be true. Indeed, if q ←→ r is true, then
the conclusion must obviously be true since it happens to be the same statement as this premise.
If p is false, then we have

F ∧ q ←→ F ∧ r,
F ∨ q ←→ F ∨ r.

By the Domination and Identity laws (F ∧ q ≡ F and F ∨ q ≡ q), this simplifies to

F ←→ F,
q ←→ r.

Obviously F ←→ F is always true. With the same reasoning as above, if the premise q ←→ r is
true, then the conclusion is true since it is the same statement.
We have shown that this argument form is valid, thus the theorem itself must be true.
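The truth table method that we set aside as tedious is quick for a computer. The sketch below enumerates all eight assignments of p, q, r and checks that the conclusion holds whenever both premises do; the helper name `iff` is our own. It also shows that the argument becomes invalid if the union premise is dropped.

```python
from itertools import product

def iff(x, y):
    """Biconditional: true when both sides have the same truth value."""
    return x == y

assignments = list(product([True, False], repeat=3))

# Valid: in every row where both premises are true, q <-> r is true.
valid = all(
    iff(q, r)
    for p, q, r in assignments
    if iff(p and q, p and r) and iff(p or q, p or r)
)

# With only the first premise, some row has a true premise and a false conclusion.
valid_without_union_premise = all(
    iff(q, r)
    for p, q, r in assignments
    if iff(p and q, p and r)
)

print(valid)                        # True
print(valid_without_union_premise)  # False
```

The second check fails on rows where p is false and q, r differ, so both premises are genuinely needed.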

We now introduce a new class of problems that requires some intuition with sets.
Problem 2.2.14. Prove or disprove the following statements.

1. A ∩ B = A ∩ C −→ B = C.

2. A ⊆ B ∪ C −→ A ⊆ B ∨ A ⊆ C.

3. A ∩ (B ∪ C) = (A ∩ B) ∪ C.

Solution. It is helpful to use Venn Diagrams as a visual aid when thinking about these statements.
Then consider counterexamples to disprove them. However, Venn Diagrams are not rigorous enough
to prove a true statement. Additionally, when disproving a statement, do not forget to demonstrate
how your counterexample actually fails it.

1. False. Consider the counterexample: Let A = {1, 2}, B = {2, 4}, C = {2, 3}. To demonstrate
its failure, we note that A ∩ B = {2} and A ∩ C = {2}, so the condition that A ∩ B = A ∩ C
is satisfied. However, B ≠ C, failing the statement.

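2. False. Consider the counterexample: Let A = {1, 2}, B = {1}, C = {2}. Then B ∪ C = {1, 2},
so A ⊆ B ∪ C is satisfied. However, 2 ∈ A but 2 ∉ B, and 1 ∈ A but 1 ∉ C, so neither A ⊆ B
nor A ⊆ C holds, failing the statement.
It is tempting to argue as follows:

A ⊆ B ∪ C −→ ∀x ∈ U, x ∈ A → (x ∈ B ∨ x ∈ C)
A ⊆ B ∪ C −→ ∀x ∈ U, (x ∈ A → x ∈ B) ∨ (x ∈ A → x ∈ C)

Both lines are correct, but the last line does not yield A ⊆ B ∨ A ⊆ C. That conclusion would
require (∀x ∈ U, x ∈ A → x ∈ B) ∨ (∀x ∈ U, x ∈ A → x ∈ C), and a universal quantifier cannot
be distributed over a disjunction: different elements x may satisfy different halves of it.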

3. False. Consider A = ∅, B = ∅, C = {1}. Then A ∩ (B ∪ C) = ∅, while (A ∩ B) ∪ C = {1}.


Definition 2.2.15. The set that contains all elements in A but not in B is denoted as A − B. This
is logically represented as
A − B = A ∩ \overline{B}.

For example, {1, 2, 3} − {1, 2, 4} = {3}, ∅ − A = ∅, and A − ∅ = A.



Problem 2.2.16. Prove or disprove the following statements.

1. (A − B) ∪ (B − C) = (A − C).
2. (A − C) ∪ (B − C) = (A ∪ B) − C.
3. (A − C) = C − A.
4. (A − B) − C = (A − C) − B.
5. A ⊆ C ∧ B ⊆ C ←→ (A ∪ B) ⊆ C.
6. A ⊆ C ←→ A − C = ∅.
7. A ⊆ B −→ A ∩ (B ∩ C) = ∅.

Solution. Note: we will leave it to the reader to demonstrate how the counterexamples fail their
statements.

1. False. Counterexample: Let A = ∅, B = {1}, C = ∅. Convince yourself why this fails.


2. True. Proof.

(A − C) ∪ (B − C) = (A ∩ \overline{C}) ∪ (B ∩ \overline{C})
∀x ∈ U, x ∈ (A − C) ∪ (B − C) ←→ (x ∈ A ∧ x ∉ C) ∨ (x ∈ B ∧ x ∉ C)
∀x ∈ U, x ∈ (A − C) ∪ (B − C) ←→ x ∉ C ∧ (x ∈ A ∨ x ∈ B)
∀x ∈ U, x ∈ (A − C) ∪ (B − C) ←→ x ∈ (A ∪ B) − C
∴ (A − C) ∪ (B − C) = (A ∪ B) − C.

3. False. Counterexample: Let A = {1}, C = {1, 2}.


4. True. Proof.

(A − B) − C = (A ∩ \overline{B}) ∩ \overline{C}
(A − B) − C = \overline{C} ∩ (A ∩ \overline{B})
(A − B) − C = (\overline{C} ∩ A) ∩ \overline{B}
(A − B) − C = (A ∩ \overline{C}) ∩ \overline{B}
∴ (A − B) − C = (A − C) − B.

5. True. Proof.

A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, (x ∈ A → x ∈ C) ∧ (x ∈ B → x ∈ C)
A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, (x ∉ A ∨ x ∈ C) ∧ (x ∉ B ∨ x ∈ C)
A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, (x ∈ C ∨ x ∉ A) ∧ (x ∈ C ∨ x ∉ B)
A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, x ∈ C ∨ (x ∉ A ∧ x ∉ B)
A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, (x ∉ A ∧ x ∉ B) ∨ x ∈ C
A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, ∼(x ∈ A ∨ x ∈ B) ∨ x ∈ C
A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, (x ∈ A ∨ x ∈ B) → x ∈ C
∴ A ⊆ C ∧ B ⊆ C ←→ (A ∪ B) ⊆ C.

6. True. We proceed by proving both directions.

(a) First, we must prove A ⊆ C −→ A − C = ∅.


Suppose A ⊆ C is true. Therefore, ∀x ∈ U, x ∈ A −→ x ∈ C, rewritten as ∀x ∈ U, x ∉ A ∨ x ∈ C,
is true.
We must prove: A − C = ∅, which is x ∈ A ∧ x ∉ C ↔ x ∈ ∅. However, x ∈ A ∧ x ∉ C
is the negation of x ∉ A ∨ x ∈ C (by DeMorgan’s Laws), which is true. Therefore,
x ∈ A ∧ x ∉ C is false.
Furthermore, by definition of an empty set, x ∈ ∅ is false.
Because both sides of the biconditional are false, the biconditional x ∈ A ∧ x ∉ C ←→ x ∈ ∅
itself is true. Thus, A − C = ∅ is true when A ⊆ C is true.
Therefore, A ⊆ C −→ A − C = ∅.
(b) Now we prove A − C = ∅ −→ A ⊆ C.
Suppose A − C = ∅ is true. Then ∀x ∈ U, x ∈ A ∧ x ∉ C ←→ x ∈ ∅ is true.
By definition of an empty set, x ∈ ∅ is false.
A biconditional is true when both sides have the same truth value. Therefore, x ∈ A ∧ x ∉ C
is false.
Thus, the negation of x ∈ A ∧ x ∉ C, which is x ∉ A ∨ x ∈ C (by DeMorgan’s Laws), is
true.
Therefore, x ∈ A −→ x ∈ C is true, so A ⊆ C is true.
Hence, A − C = ∅ −→ A ⊆ C.

We can now conclude the proof.

7. False. Let A = {1}, B = {1, 2}, and C = ∅.
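When hunting for counterexamples, a brute-force search over a tiny universe is a useful companion to Venn Diagrams. The sketch below is our own illustration (the names `subsets` and `find_counterexample` and the universe {1, 2} are assumptions) and tests statements 1, 2, 4, 5, and 6 of Problem 2.2.16, which involve only union, difference, and subset. Keep in mind that a search returning None is merely evidence; a true statement still needs a proof.

```python
from itertools import combinations, product

def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def find_counterexample(statement, universe=(1, 2)):
    """Return the first triple (A, B, C) of subsets of `universe` that makes
    `statement` false, or None if every triple satisfies it."""
    for A, B, C in product(subsets(universe), repeat=3):
        if not statement(A, B, C):
            return A, B, C
    return None

# In Python, A - B is set difference, | is union, and <= is the subset test.
s1 = lambda A, B, C: (A - B) | (B - C) == A - C
s2 = lambda A, B, C: (A - C) | (B - C) == (A | B) - C
s4 = lambda A, B, C: (A - B) - C == (A - C) - B
s5 = lambda A, B, C: (A <= C and B <= C) == ((A | B) <= C)
s6 = lambda A, B, C: (A <= C) == (A - C == frozenset())

# Statement 1 fails at A = {}, B = {1}, C = {}.
print(find_counterexample(s1))
# Statements 2, 4, 5, 6 have no counterexample in this universe.
print([find_counterexample(s) for s in (s2, s4, s5, s6)])  # [None, None, None, None]
```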

To solidify your understanding of sets, here is an assortment of problems.

Exercise 2.2.17. Prove or disprove (A − B) ∪ (A ∩ B) = A.

Exercise 2.2.18. Prove or disprove A − (B − C) = (A − B) − C.

Exercise 2.2.19. Prove or disprove (A − B) ∩ (C − B) = (A ∩ C) − B.

Exercise 2.2.20. Prove or disprove (A − B) ∩ (C − B) = A − (B ∪ C).

Exercise 2.2.21. Prove or disprove A ⊆ B −→ A ∩ C ⊆ B ∩ C.

Exercise 2.2.22. Prove or disprove A ∪ C = B ∪ C −→ A = B.

Exercise 2.2.23. Prove or disprove A ∩ C ⊆ B ∩ C ∧ A ∪ C ⊆ B ∪ C −→ A = B.

Exercise 2.2.24. Prove or disprove (A ∪ B) ∩ C = A ∪ (B ∩ C).

Exercise 2.2.25. Prove or disprove A ⊈ B ∧ B ⊈ C −→ A ⊈ C.

Exercise 2.2.26. Prove or disprove A ⊆ B ∧ B ⊈ C −→ A ⊈ C.

Exercise 2.2.27. Prove or disprove A ∩ (B − C) = (A ∩ B) − (A ∩ C).



Exercise 2.2.28. Prove or disprove B ∩ C ⊆ A −→ (A − B) ∩ (A − C) = ∅.


Exercise 2.2.29. Prove or disprove (A ∪ B) = A ∪ B.
Exercise 2.2.30. Prove or disprove A ∩ C ⊆ B ∩ C ∧ A ∪ C ⊆ B ∪ C −→ A ⊆ B.
Exercise 2.2.31. Prove or disprove A ⊆ B ∪ C ←→ A ⊆ B.
Exercise 2.2.32. Prove or disprove A ⊆ C ∧ B ⊆ C −→ A ∩ B ⊆ C.
Exercise 2.2.33. Prove or disprove A ∩ A = ∅.
Definition 2.2.34. We denote the symmetric difference of sets A and B as A ⋆ B. It is
symbolically defined as
A ⋆ B = (A − B) ∪ (B − A).

Theorem 2.2.35
(A ⋆ B) ⋆ C = A ⋆ (B ⋆ C).

Proof. First, draw a Venn Diagram to get a sense of this expression. Realize that membership in
A ⋆ B is the exclusive-or (XOR) of membership in A and membership in B:

∀x ∈ U, x ∈ A ⋆ B ←→ (x ∈ A ∧ ∼(x ∈ B)) ∨ (x ∈ B ∧ ∼(x ∈ A))
∀x ∈ U, x ∈ A ⋆ B ←→ x ∈ A ⊕ x ∈ B.

Using a truth table, we can show that for any given statements p, q, and r, (p⊕q)⊕r ≡ p⊕(q ⊕r),
or in other words, demonstrate that the XOR operation is associative.

p   q   r   p ⊕ q   q ⊕ r   (p ⊕ q) ⊕ r   p ⊕ (q ⊕ r)
T   T   T     F       F          T             T
T   T   F     F       T          F             F
T   F   T     T       T          F             F
T   F   F     T       F          T             T
F   T   T     T       F          F             F
F   T   F     T       T          T             T
F   F   T     F       T          T             T
F   F   F     F       F          F             F

The resulting truth values prove their logical equivalence. Therefore, we can conclude:

∀x ∈ U, (x ∈ A ⊕ x ∈ B) ⊕ x ∈ C ←→ x ∈ A ⊕ (x ∈ B ⊕ x ∈ C)
∀x ∈ U, x ∈ (A ⋆ B) ⋆ C ←→ x ∈ A ⋆ (B ⋆ C)
∴ (A ⋆ B) ⋆ C = A ⋆ (B ⋆ C).
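As a quick check, the sketch below builds the symmetric difference from its definition and tests associativity on every triple of subsets of a small universe. The helper names and the universe {1, 2, 3} are our own assumptions. Python's built-in `^` operator on sets computes the same operation, which the second check confirms.

```python
from itertools import combinations, product

def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def sym_diff(A, B):
    """Symmetric difference from the definition: (A - B) union (B - A)."""
    return (A - B) | (B - A)

all_sets = subsets({1, 2, 3})

associative = all(
    sym_diff(sym_diff(A, B), C) == sym_diff(A, sym_diff(B, C))
    for A, B, C in product(all_sets, repeat=3)
)
matches_builtin = all(sym_diff(A, B) == A ^ B
                      for A, B in product(all_sets, repeat=2))

print(associative, matches_builtin)  # True True
```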
Chapter 3

Fields

By middle school, we take many things in mathematics for granted, especially addition, subtraction,
multiplication, and division. But where do all of these come from? In this chapter, we discuss why
we can use such operations, and establish the foundations from scratch. After finishing
this chapter, you should have a deeper appreciation for some of the seemingly trivial algebraic
manipulations and techniques that we apply almost subconsciously in math problems.

3.1 Field Axioms

An axiom is a statement that we assume to be true, from which we develop further mathematical
results and consequences. In other words, it is a starting point. Here, we provide a list of field
axioms:
Definition 3.1.1. A field is a set F with two operations, typically called + and ×, satisfying:

a) Closure: ∀a, b ∈ F, a + b ∈ F and a × b ∈ F .

b) Commutative: ∀a, b ∈ F, a + b = b + a and ab = ba.

c) Associative: ∀a, b, c ∈ F, a + (b + c) = (a + b) + c and a(bc) = (ab)c.

d) Distributive: ∀a, b, c ∈ F, a(b + c) = ab + ac.

e) Identities: ∃0, 1 ∈ F, (0 ≠ 1) s.t. ∀a ∈ F, a + 0 = 0 + a = a and a · 1 = 1 · a = a.

f) Inverses: ∀a ∈ F, ∃ −a ∈ F s.t. a + −a = 0.
∀a ∈ F, a ≠ 0, ∃a^{-1} ∈ F s.t. a(a^{-1}) = 1.

So first, we start off with addition and multiplication. Notice that subtraction and division can
also arise from these properties, and we will get to those operations later.
Some examples of fields would be the sets Q, R, C - take a moment to convince yourself that
these properties hold for these sets.
In addition, the set Zp (all integers mod p, where p is prime) is a field; it is a well known result
in number theory that every nonzero integer modulo p has a multiplicative inverse. However, the set of
integers, Z, is not a field because not every integer has a multiplicative inverse that is also an
integer.
Remark 3.1.2. When we say “n mod m” or “n modulo m,” we are referring to the remainder when
n is divided by m. For example, 8 mod 5 is 3, and we write 8 ≡ 3 (mod 5) as shorter notation.
Other examples include 33 ≡ 3 (mod 10), 89 ≡ 5 (mod 7), etc. Furthermore, Zn refers to the set of
integers mod n. For instance, Z3 = {0, 1, 2}.

Considering numbers modulo some other number is part of an entire topic of mathematics called
modular arithmetic, and this is part of number theory. You are welcome to search for any outside
resources pertaining to this topic.
Problem 3.1.3. Do the integers mod 8 form a field? What about the integers mod 27?

Solution. In order for a set of numbers to be a field, all of the properties listed above must be
satisfied. Now for Z8 , consider if 4 (which is in this set) has a multiplicative inverse (let this be x).
Then we want
4x ≡ 1 (mod 8),
and this can be rewritten as 4x = 8k + 1, where k ∈ Z. Rearrange this to 4x − 8k = 1, and clearly
the left hand side is even while the right hand side is odd. Thus, there is no x which satisfies this, so
4 does not have a multiplicative inverse. Therefore, Z8 cannot be a field.
Likewise, for Z27 , consider the multiplicative inverse of 3.

3x ≡ 1 (mod 27).

Then we have 3x = 27k + 1 for k ∈ Z. Note that the left hand side is divisible by 3, while the
right hand side will leave a remainder of 1 when divided by 3, so we cannot find such an x. Thus, 3
does not have a multiplicative inverse, so Z27 is not a field.
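We can confirm these conclusions by brute force: for each nonzero a, try every candidate x and see whether ax leaves a remainder of 1. The function names below are our own, and the sketch checks only the inverse axiom, which is the one that fails here.

```python
def has_inverse(a, n):
    """True if some x in {0, ..., n - 1} satisfies a * x = 1 (mod n)."""
    return any((a * x) % n == 1 for x in range(n))

def nonzero_elements_without_inverse(n):
    """List the nonzero residues mod n that have no multiplicative inverse."""
    return [a for a in range(1, n) if not has_inverse(a, n)]

print(nonzero_elements_without_inverse(8))   # [2, 4, 6]
print(nonzero_elements_without_inverse(27))  # [3, 6, 9, 12, 15, 18, 21, 24]
print(nonzero_elements_without_inverse(7))   # []
```

The empty list for 7 agrees with the fact that the integers modulo a prime form a field.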

The following proofs may seem tedious, but they are necessary because we are building everything
from scratch, using just the field axioms listed above. Be careful not to skip steps, as we cannot
assume that the usual algebraic techniques we have learned apply. Make sure you know what property
was invoked for each step of a proof.

Theorem 3.1.4 (Cancellation Laws)


For a, b, c ∈ F ,

1. a + b = a + c → b = c

2. ab = ac ∧ a ≠ 0 → b = c

Proof. For the first part,

a + b = a + c
−a + (a + b) = −a + (a + c)
(−a + a) + b = (−a + a) + c
0 + b = 0 + c
b = c

For the second part, we know that a^{-1} exists because we are given that a ≠ 0. Then,

ab = ac
a^{-1}(ab) = a^{-1}(ac)
(a^{-1}a)b = (a^{-1}a)c
1 · b = 1 · c
b = c

Henceforth, when invoking either part of this theorem, we will simply refer to it by “Cancellation.”

Theorem 3.1.5
The additive and multiplicative identities and inverses are unique.

Proof. Aside from 0, suppose \tilde{0} is another additive identity. Then we have,

0 + \tilde{0} = 0 because \tilde{0} is an additive identity.
\tilde{0} + 0 = \tilde{0} because 0 is an additive identity.

Therefore, 0 = \tilde{0}, so it is unique.
Aside from −a, suppose −\tilde{a} is another additive inverse for a. Then we have,

0 = a + (−a),
0 = a + (−\tilde{a}).

Therefore, a + −a = a + −\tilde{a} −→ −a = −\tilde{a} by cancellation.
Aside from 1, suppose \tilde{1} is another multiplicative identity. Then,

1 · \tilde{1} = \tilde{1} because 1 is a multiplicative identity.
1 · \tilde{1} = 1 because \tilde{1} is a multiplicative identity.

Therefore 1 = \tilde{1}.
Aside from a^{-1}, suppose \tilde{a}^{-1} is another multiplicative inverse. ∀a ≠ 0, we have:

a · a^{-1} = 1
a · \tilde{a}^{-1} = 1

Thus, a · a^{-1} = a · \tilde{a}^{-1}. Therefore, a^{-1} = \tilde{a}^{-1} by cancellation.

Problem 3.1.6. Prove −(−a) = a.



Proof. −a + a = 0
−a + −(−a) = 0
∴ −a + −(−a) = −a + a
∴ −(−a) = a.

Problem 3.1.7. Prove −(a + b) = −a + −b.

Proof. Consider the quantity (−a + −b) + (a + b):

(−a + −b) + (a + b) = −a + (−b + (a + b))


= −a + ((−b + b) + a)
= −a + (0 + a)
= −a + a
= 0.

Therefore we have (−a + −b) + (a + b) = 0. Since a + b ∈ F , it has an additive inverse, so


−(a + b) + (a + b) = 0.
Then (−a + −b) + (a + b) = −(a + b) + (a + b). By cancellation (after using commutativity to move
a + b to the left of each side), we have (−a + −b) = −(a + b), or equivalently −(a + b) = −a + −b.

Problem 3.1.8. Prove a + (b + c) = c + (a + b).

Proof. a + (b + c) = (a + b) + c
= c + (a + b).

Problem 3.1.9. Prove 0 · a = 0.

Proof. 0+0=0
a(0 + 0) = a · 0
a·0+a·0=a·0
−a · 0 + (a · 0 + a · 0) = −a · 0 + a · 0
(−a · 0 + a · 0) + a · 0 = −a · 0 + a · 0
0 + a · 0 = −a · 0 + a · 0
a · 0 = −a · 0 + a · 0
a·0=0
∴ 0 · a = 0.

Problem 3.1.10. Prove −1 · a = −a.

Proof. We will have to use the result 0 · a = 0 from Problem 3.1.9.

1 + −1 = 0
a(1 + −1) = a · 0
a · 1 + a · −1 = a · 0

a + a · −1 = a · 0
a + a · −1 = 0 · a
a + a · −1 = 0
−a + (a + a · −1) = −a + 0
(−a + a) + a · −1 = −a + 0
0 + a · −1 = −a + 0
a · −1 = −a + 0
a · −1 = −a
∴ −1 · a = −a.

Problem 3.1.11. Prove ab = 0 −→ a = 0 ∨ b = 0.

Proof. If a = 0, we’re done. Suppose a ≠ 0. Then, a^{-1} exists, so we have

ab = 0
a^{-1} · (ab) = a^{-1} · 0
(a^{-1} · a)b = a^{-1} · 0
1 · b = a^{-1} · 0
1 · b = 0
∴ b = 0.

Note that the result we just proved helps us deduce roots of a polynomial after factoring.
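The proof relied on the existence of a^{-1}, so this property should not be expected outside a field. As a small illustration (the function name `zero_divisor_pairs` is our own), the sketch below lists the pairs of nonzero residues whose product is 0 modulo n. There are none modulo the prime 7, but several modulo 8.

```python
def zero_divisor_pairs(n):
    """Pairs (a, b) of nonzero residues with a <= b and a * b = 0 (mod n)."""
    return [(a, b) for a in range(1, n) for b in range(a, n)
            if (a * b) % n == 0]

print(zero_divisor_pairs(7))  # []
print(zero_divisor_pairs(8))  # [(2, 4), (4, 4), (4, 6)]
```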

Problem 3.1.12. Prove that x^2 = y^2 → x = y ∨ x = −y.

Proof. First, we rearrange x^2 = y^2:

x^2 = y^2
x^2 + −y^2 = y^2 + −y^2
x^2 + −y^2 = 0

Consider the quantity (x + y)(x + −y).

(x + y)(x + −y) = (x + y)x + (x + y) · −y


= x(x + y) + −y · (x + y)
= (x · x + x · y) + (−y · x + −y · y)
= (x · x + x · y) + ((−1 · y) · x + (−1 · y) · y)
= (x · x + x · y) + (−1 · (y · x) + −1 · (y · y))
= (x · x + x · y) + (−1 · (x · y) + −1 · (y · y))
= (x · x + x · y) + (−(x · y) + −(y · y))
= ((x · x + x · y) + −(x · y)) + −(y · y)
= (x · x + (x · y + −(x · y))) + −(y · y)

= (x · x + 0) + −(y · y)
= x · x + −(y · y)
= x^2 + −y^2.
Since x^2 + −y^2 = 0, we have (x + y)(x + −y) = 0. By Problem 3.1.11, x + y = 0 or x + −y = 0.
Consider each case separately:

1. x + y = 0
x+y =0
(x + y) + −y = 0 + −y
x + (y + −y) = 0 + −y
x + 0 = 0 + −y
x = 0 + −y
∴ x = −y.

2. x + −y = 0
x + −y = 0
(x + −y) + y = 0 + y
x + (−y + y) = 0 + y
x+0=0+y
x=0+y
∴ x = y.

Thus, we have x = y ∨ x = −y.

Problem 3.1.13. Prove (ab)^{-1} = a^{-1}b^{-1}, provided a ≠ 0 and b ≠ 0.

Proof. As a ≠ 0 and b ≠ 0, both a^{-1} and b^{-1} exist. Moreover, ab ≠ 0 by Problem 3.1.11, so
(ab)^{-1} exists. Then,

(ab) · (ab)^{-1} = 1
a^{-1}((ab) · (ab)^{-1}) = a^{-1} · 1
a^{-1}((ab) · (ab)^{-1}) = a^{-1}
(a^{-1}(ab)) · (ab)^{-1} = a^{-1}
((a^{-1}a)b) · (ab)^{-1} = a^{-1}
(1 · b) · (ab)^{-1} = a^{-1}
b · (ab)^{-1} = a^{-1}
b^{-1}(b · (ab)^{-1}) = b^{-1}a^{-1}
(b^{-1}b) · (ab)^{-1} = b^{-1}a^{-1}
1 · (ab)^{-1} = b^{-1}a^{-1}
(ab)^{-1} = b^{-1}a^{-1}
∴ (ab)^{-1} = a^{-1}b^{-1}.

Alternative Proof. Here is a slightly faster method. First, note that

ab(a^{-1}b^{-1}) = ((ab)a^{-1})b^{-1}
= (a^{-1}(ab))b^{-1}
= ((a^{-1}a)b)b^{-1}
= (a^{-1}a)(bb^{-1})
= 1 · 1
= 1.

We can then conclude:

ab(a^{-1}b^{-1}) = 1
(ab)^{-1}(ab(a^{-1}b^{-1})) = (ab)^{-1} · 1
(ab)^{-1}(ab(a^{-1}b^{-1})) = (ab)^{-1}
((ab)^{-1}ab)(a^{-1}b^{-1}) = (ab)^{-1}
1 · (a^{-1}b^{-1}) = (ab)^{-1}
a^{-1}b^{-1} = (ab)^{-1}
∴ (ab)^{-1} = a^{-1}b^{-1}.
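To see this identity in a concrete field, we can test it in the integers mod 7. The sketch below is illustrative: the constant `P` and the brute-force helper `inv` are our own choices, and checking one small field does not replace the axiomatic proof.

```python
P = 7  # a prime, so the integers mod 7 form a field

def inv(a, p=P):
    """Multiplicative inverse of a modulo p, found by brute force."""
    for x in range(1, p):
        if (a * x) % p == 1:
            return x
    raise ValueError(f"{a} has no inverse modulo {p}")

# Check that the inverse of ab equals the product of the inverses,
# for every pair of nonzero residues a and b.
ok = all(inv(a * b % P) == inv(a) * inv(b) % P
         for a in range(1, P) for b in range(1, P))

print(inv(3))  # 5, since 3 * 5 = 15 = 1 (mod 7)
print(ok)      # True
```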

3.2 Subtraction, Division

At this point, we should address the other two operations. We introduced these later because they
are, in fact, results from addition and multiplication defined by the field axioms.
Consider the equation x + b = a, where a, b are constants. If we solve for x,

x + b = a
(x + b) + −b = a + −b
x + (b + −b) = a + −b
x + 0 = a + −b
x = a + −b

We now define a new operation to represent the solution to this special equation, x = a + −b.
Definition 3.2.1. Subtraction is defined as a − b = a + −b.

Likewise, consider x · b = a, where b ≠ 0. If we solve for x,

x · b = a
b · x = a
b^{-1}(b · x) = b^{-1} · a
(b^{-1}b)x = b^{-1} · a
1 · x = b^{-1} · a
x = b^{-1} · a
x = a · b^{-1}

We define another new operation to represent the solution to this special equation, which is
x = ab^{-1}.
Definition 3.2.2. Division is defined as a/b = ab^{-1}, provided b ≠ 0.
Problem 3.2.3. Prove (a · d)/(b · d) = a/b, provided b, d ≠ 0.

Proof.
(a · d)/(b · d) = (a · d)(b · d)^{-1}
= (a · d)(b^{-1} · d^{-1})
= ((a · d) · b^{-1}) · d^{-1}
= ((d · a) · b^{-1}) · d^{-1}
= (d · (a · b^{-1})) · d^{-1}
= ((a · b^{-1}) · d) · d^{-1}
= (a · b^{-1})(d · d^{-1})
= (a · b^{-1}) · 1
= a · b^{-1}
= a/b.
Problem 3.2.4. Prove a/b + c/d = (a · d + b · c)/(b · d), provided b, d ≠ 0.

Proof. By Problem 3.2.3, a/b = (a · d)/(b · d) and c/d = (c · b)/(d · b). Then,

a/b + c/d = (a · d)/(b · d) + (c · b)/(d · b)
= (a · d)/(b · d) + (b · c)/(b · d)
= (a · d)(b · d)^{-1} + (b · c)(b · d)^{-1}
= (b · d)^{-1}(a · d) + (b · d)^{-1}(b · c)
= (b · d)^{-1}(a · d + b · c)
= (a · d + b · c)(b · d)^{-1}
= (a · d + b · c)/(b · d).
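Both fraction rules can be checked numerically in the field Q using Python's exact `Fraction` type. In this sketch the function `add_by_formula` and the sample values are our own choices.

```python
from fractions import Fraction

def add_by_formula(a, b, c, d):
    """Compute a/b + c/d as (a*d + b*c)/(b*d); requires b and d nonzero."""
    return Fraction(a * d + b * c, b * d)

# Problem 3.2.4: the formula agrees with ordinary fraction addition.
print(add_by_formula(1, 2, 1, 3))                                   # 5/6
print(Fraction(1, 2) + Fraction(1, 3) == add_by_formula(1, 2, 1, 3))  # True

# Problem 3.2.3: a common factor d cancels.
print(Fraction(3 * 5, 4 * 5) == Fraction(3, 4))                     # True
```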

Here are some review problems to help with field proofs. You may use previous results, or, for
added difficulty, prove them from scratch (i.e. using only the field axioms in the beginning of this
chapter).

Exercise 3.2.5. Prove that (a + c) − (b + c) = a − b.



Exercise 3.2.6. Prove (a + b)(c + d) = (ac + ad) + (bc + bd).

Exercise 3.2.7. Prove a/b = c/d ←→ ad = bc, provided that b, d ≠ 0.

Exercise 3.2.8. Prove (a + b) − (b + c) = a − c.

Exercise 3.2.9. Prove that (−a)(−b) = ab. (Hint: try proving (−a) · b = −(ab) first.)

Exercise 3.2.10. Prove that −(−a) = a and (a^{-1})^{-1} = a.

Exercise 3.2.11. Prove that (b/d)^{-1} = d/b.
Exercise 3.2.12. Prove that (−1)(−1) = 1.
Chapter 4

Sequences and Series

Now we move on from the theoretical aspects of math to a relatively more common topic: sequences
and series, and their behaviors.

4.1 Sequences

Definition 4.1.1. A sequence is a list of numbers. It can be represented as

{a_0, a_1, a_2, . . .}, or {a_n}.

Sequences can be defined in various ways:

1. Via Formula (Explicit Definition): For example, define the sequence

a_n = n/(n^2 + 1).

The first 6 terms of the sequence would be: 1/2, 2/5, 3/10, 4/17, 5/26, 6/37.

2. Implicitly defined: For example, let π_n = the nth digit after the decimal point for π.

3. Recursively defined: Each term is defined in terms of previous terms. We have the famous
example, the Fibonacci sequence: Given f_1 = 1, f_2 = 1, ∀n ≥ 1, f_{n+2} = f_{n+1} + f_n. Then the
first few terms are 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, . . .
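The explicit and recursive styles of definition look quite different as code. In this sketch (the function names `explicit` and `fibonacci` are our own), the explicit formula computes any term directly, while the recursive definition has to build up from the first two terms.

```python
from fractions import Fraction

def explicit(n):
    """Explicit definition: a_n = n / (n^2 + 1), as an exact fraction."""
    return Fraction(n, n * n + 1)

def fibonacci(count):
    """Recursive definition: f_1 = f_2 = 1 and f_{n+2} = f_{n+1} + f_n."""
    terms = [1, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]

print([str(explicit(n)) for n in range(1, 7)])
# ['1/2', '2/5', '3/10', '4/17', '5/26', '6/37']
print(fibonacci(10))
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```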

An arithmetic sequence is a recursive sequence defined as follows:

a_1 = a, ∀n ≥ 1, a_{n+1} = a_n + d, where a is the first term and d is the common difference.

A geometric sequence is a recursive sequence defined as follows:

a_1 = a, ∀n ≥ 1, a_{n+1} = r · a_n, where a is the first term and r is the common ratio.

From these definitions, we can derive explicit formulas for each type of sequence:

Arithmetic                 Geometric
a_1 = a                    a_1 = a
a_2 = a + d                a_2 = ar
a_3 = a + 2d               a_3 = ar^2
a_4 = a + 3d               a_4 = ar^3
⋮                          ⋮
a_n = a + (n − 1)d         a_n = ar^{n−1}

Check that this matches with the recursive definition.

Example 4.1.2
Consider a sequence starting with 1, 3, . . . Write the next five terms if the sequence was arithmetic
or geometric. Then, find the 100th term of each.

Solution. If the sequence was arithmetic, then note that the common difference is 2 (which can be
obtained by subtracting the first term from the second term). Thus, we simply keep adding 2 to get
the next five terms: 1, 3, 5, 7, 9, 11, 13 .
If the sequence was geometric, then we can get the common ratio by dividing the second term
by the first term to get 3. Then we can simply keep multiplying by the common ratio to get
1, 3, 9, 27, 81, 243, 729 .
Obviously, it would be time-consuming to keep adding on 2 or multiplying by 3 to reach the
100th term, so we find a shorter way. As defined above, in an arithmetic sequence, a_n = a + (n − 1)d.
We know that a = 1, d = 2, and n = 100, so we get a_100 = 1 + (100 − 1) · 2 = 199.
Likewise, for a geometric sequence, a_n = ar^{n−1}. We know that a = 1, r = 3, and n = 100, so
a_100 = 1 · 3^{100−1} = 3^99.
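The explicit formulas make this example a two-line computation. The function names below are our own; the sketch reproduces the first seven terms of each sequence and then jumps straight to the 100th term.

```python
def arithmetic_term(a, d, n):
    """n-th term of an arithmetic sequence: a + (n - 1)d."""
    return a + (n - 1) * d

def geometric_term(a, r, n):
    """n-th term of a geometric sequence: a * r^(n - 1)."""
    return a * r ** (n - 1)

print([arithmetic_term(1, 2, n) for n in range(1, 8)])  # [1, 3, 5, 7, 9, 11, 13]
print([geometric_term(1, 3, n) for n in range(1, 8)])   # [1, 3, 9, 27, 81, 243, 729]
print(arithmetic_term(1, 2, 100))                       # 199
print(geometric_term(1, 3, 100) == 3 ** 99)             # True
```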

Problem 4.1.3. An arithmetic sequence has the fifth term of 19 and 79th term of 22. What is its
2017th term?

Solution. As we have an arithmetic sequence, the fifth term of 19 translates to a + 4d = 19 for some
first term a and common difference d. Likewise, we have a + 78d = 22.
If we subtract the two equations, we get 74d = 3, which rearranges to d = 3/74.
The 2017th term would be a + 2016d. Instead of solving for a as well, we can just use the
given information. Note that a + 4d = 19, so a + 2016d = (a + 4d) + 2012d = 19 + 2012d. Then we
can simply compute to get 19 + 2012 · (3/74) = 3721/37.
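Exact rational arithmetic makes it easy to double-check this answer. The sketch below (variable names are our own) solves for d and a from the two given terms and then evaluates the 2017th term.

```python
from fractions import Fraction

# Fifth term is 19 and 79th term is 22, so 74 steps of d add up to 3.
d = Fraction(22 - 19, 79 - 5)
a = 19 - 4 * d                  # from a + 4d = 19
term_2017 = a + 2016 * d

print(d)          # 3/74
print(term_2017)  # 3721/37
```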
Problem 4.1.4. Let a, b, c be three consecutive terms of a sequence. Find

1. c in terms of a, b

2. b in terms of a, c

if the sequence was arithmetic or geometric.

Solution. If the sequence was arithmetic, then using our explicit formulas, we have

a = a_0 + (n − 1)d,
b = a_0 + nd,
c = a_0 + (n + 1)d.

Notice that a and b are only d apart, so b − a = d. We also have that b and c are d apart,
so c − b = d. We substitute b − a = d into the latter to get c − b = b − a, which rearranges to
c = 2b − a.
Also, b is evenly spaced apart from a and c. This should suggest that b is the average of a and c,
i.e. b = (a + c)/2. Indeed, we can confirm this using the explicit formulas.
If the sequence was geometric, then our explicit formulas would tell us that

a = a_0 r^{n−1},
b = a_0 r^n,
c = a_0 r^{n+1}.

Likewise, note that b/a = r and c/b = r, so c/b = b/a, which rearranges to c = b^2/a.
If we multiply a and c together, we get a_0^2 r^{2n}, which is just b^2, so b = ±√(ac). As we don’t know
what sign a_0 or r is, we cannot determine what sign b is with certainty.

Problem 4.1.5. Let {a_n} be an arithmetic sequence. Simplify a_29 + a_76 − a_51.

Solution. This is a direct application of the explicit formula for an arithmetic sequence.

(a + 28d) + (a + 75d) − (a + 50d) = a + 53d
= a_54.

Problem 4.1.6. Find all sequences a, b, c such that a, b, c and a + 1, b + 1, c + 1 are both geometric.

Solution. If a, b, c is geometric, then we know that b^2 = ac. Similarly, (b + 1)^2 = (a + 1)(c + 1).
Note that (b + 1)^2 = (a + 1)(c + 1) expands to b^2 + 2b + 1 = ac + a + c + 1, which simplifies to
b^2 + 2b = ac + a + c. But we know that b^2 = ac, so b^2 + 2b = b^2 + a + c, yielding b = (a + c)/2.
This implies that a, b, c is an arithmetic sequence. The only possible solution for a sequence that
is both geometric and arithmetic is a = b = c (when all terms are equal to each other).
We can also get this result by letting a = b − d and c = b + d, for some common difference d.
So b^2 = ac becomes b^2 = (b − d)(b + d) = b^2 − d^2, so d = 0, which implies the same result.

4.2 Series

Definition 4.2.1. A series is the sum of a sequence.

First, we tackle a problem that gives us a useful formula.


Problem 4.2.2. Find a formula for 1 + 2 + 3 + . . . + n.

Solution. Let S = 1 + 2 + 3 + . . . + n. Note that S can also be written backwards: n + (n − 1) +
(n − 2) + . . . + 1. Add these two together:

2S = (1 + 2 + 3 + . . . + n) + (n + (n − 1) + (n − 2) + . . . + 1)
2S = (1 + n) + (2 + (n − 1)) + (3 + (n − 2)) + . . . + (n + 1)
2S = (n + 1) + (n + 1) + . . . + (n + 1)     [n ‘copies’ of (n + 1)]
2S = n(n + 1)
∴ S = n(n + 1)/2.

So we have the formula 1 + 2 + 3 + . . . + n = n(n + 1)/2.

Now, we can discuss notions of an arithmetic series, which is simply the sum of all terms of
an arithmetic sequence.

Theorem 4.2.3
For some finite arithmetic series a_1 + a_2 + a_3 + . . . + a_n, we have general formulas:

a_1 + a_2 + a_3 + . . . + a_n = na + (n(n − 1)/2) d = (n/2)(a_1 + a_n).

Proof. Keeping in mind that a_k = a + (k − 1)d, we have

a_1 + a_2 + a_3 + . . . + a_n = a + (a + d) + (a + 2d) + . . . + (a + (n − 1)d)
= (a + a + a + . . . + a) + (d + 2d + 3d + . . . + (n − 1)d)
= na + d(1 + 2 + 3 + . . . + (n − 1))
= na + (n(n − 1)/2) d,

using the formula from Problem 4.2.2.
To get the other formula, we perform some clever manipulations:

na + (n(n − 1)/2) d = (n/2)(2a + (n − 1)d)
= (n/2)(a + (a + (n − 1)d))
= (n/2)(a_1 + a_n).

The second formula is much simpler, but can only be used when you know the last term of the
arithmetic sequence. Otherwise, use the first formula.
Problem 4.2.4. Find 18 + 19 + 20 + . . . + 53.

Solution. Identify the main components of this series: a = 18, d = 1, and n = 36. Then we apply
the second formula: (36/2)(18 + 53) = 18 · 71 = 1278.
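Both formulas for an arithmetic series are easy to compare against a direct sum. In this sketch the function names are our own, and the integer division assumes integer terms, for which n(n − 1) and n(a_1 + a_n) are always even.

```python
def arithmetic_sum_first(a, d, n):
    """First formula: n*a + n(n - 1)/2 * d (integer a and d assumed)."""
    return n * a + n * (n - 1) // 2 * d

def arithmetic_sum_ends(first, last, n):
    """Second formula: n/2 * (first + last) (integer terms assumed)."""
    return n * (first + last) // 2

print(arithmetic_sum_first(18, 1, 36))  # 1278
print(arithmetic_sum_ends(18, 53, 36))  # 1278
print(sum(range(18, 54)))               # 1278

# The special case a = d = 1 recovers 1 + 2 + ... + n = n(n + 1)/2.
print(all(arithmetic_sum_first(1, 1, n) == n * (n + 1) // 2
          for n in range(1, 100)))      # True
```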
Problem 4.2.5. The sum of the first ten terms of an arithmetic series is 10. The sum of the next
ten terms is 100. What is the sum of the next ten terms after that?

Solution. This time, the first formula, $na + \frac{n(n-1)}{2}\, d$, is useful, because we don’t know the 10th or 20th term, which is necessary to use the second formula. We are given the sum of the first ten terms and the sum of the next ten terms, so we also know the sum of the first twenty terms (just add them together).
We apply the formula on the first ten and the first twenty to get:

\begin{align*}
a_1 + a_2 + \dots + a_{10} &= 10 & a_1 + a_2 + \dots + a_{20} &= 110 \\
10a + \frac{10 \cdot 9}{2}\, d &= 10 & 20a + \frac{20 \cdot 19}{2}\, d &= 110 \\
\therefore\ 10a + 45d &= 10 & \therefore\ 20a + 190d &= 110
\end{align*}

We have a system of equations. Multiply the first one by 2 and subtract it from the second to get 100d = 90. Therefore, we have $d = \frac{9}{10}$ and $a = -\frac{61}{20}$.
Finding the sum of the next ten terms after this 20-term arithmetic sequence is the same thing as finding the sum of the 30-term arithmetic sequence and subtracting the sum of the first 20, which we found to be 110. Thus,
\begin{align*}
a_1 + a_2 + a_3 + \dots + a_{30} &= 30 \cdot \left(-\frac{61}{20}\right) + \frac{30 \cdot 29}{2} \cdot \frac{9}{10} \\
&= -\frac{183}{2} + \frac{783}{2} \\
&= \frac{600}{2} \\
&= 300.
\end{align*}
Therefore, $a_{21} + a_{22} + \dots + a_{30} = (a_1 + a_2 + \dots + a_{30}) - (a_1 + a_2 + \dots + a_{20}) = 300 - 110 = 190$.

Alternative Solution. The previous solution was laborious and inefficient. Let’s look for a better
way to approach this problem.
The important thing to notice about an arithmetic sequence is that the terms grow linearly: the same common difference d is added over and over again. Therefore, each term is exactly 10d larger than the term ten places before it:
\begin{align*}
a_{21} - a_{11} &= a_{11} - a_1 = 10d, \\
a_{22} - a_{12} &= a_{12} - a_2 = 10d, \\
&\ \,\vdots
\end{align*}

Adding these ten equations shows that a1 + a2 + a3 + . . . + a10, a11 + a12 + a13 + . . . + a20, and a21 + a22 + a23 + . . . + a30 themselves form an arithmetic sequence (with common difference 100d). Since a1 + a2 + a3 + . . . + a10 = 10 and a11 + a12 + a13 + . . . + a20 = 100, there is a common difference of 90, so we get a21 + a22 + a23 + . . . + a30 = 100 + 90 = 190. This solution was much more elegant!
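
Both solutions to Problem 4.2.5 can be cross-checked numerically. The sketch below is only a verification aid: the helper name `block_sums` is made up for this illustration, and it uses the values a = −61/20 and d = 9/10 found in the first solution, with exact fractions to avoid rounding.

```python
from fractions import Fraction

def block_sums(a, d, block=10, blocks=3):
    """Sums of consecutive blocks of `block` terms of the arithmetic sequence a, a+d, ..."""
    terms = [a + k * d for k in range(block * blocks)]
    return [sum(terms[i * block:(i + 1) * block]) for i in range(blocks)]

# Values of a and d obtained in the first solution.
sums = block_sums(Fraction(-61, 20), Fraction(9, 10))
print(sums)  # the three block sums should themselves be in arithmetic progression
```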

Problem 4.2.6. Let m, n ∈ Z such that m ≤ n. Find the sum of all the integers from m through n
inclusive.

Solution. Consider the sequence m, m + 1, m + 2, . . . , n. There are n − m + 1 terms in this sequence, including m and n, and the first and last terms are m and n. We can then apply the second general formula to get $\frac{(n - m + 1)(n + m)}{2}$.
Problem 4.2.7. Find all the ways to write 39 as a sum of consecutive integers.

Solution. As the problem suggests, 39 can be represented as m + (m + 1) + (m + 2) + . . . + (n − 1) + n for some integers m ≤ n. We apply the formula from Problem 4.2.6 to get

(n − m + 1)(n + m) = 78.

How are we supposed to find all possible values of m and n? We can take advantage of m and n being integers, and of the different parities of (n − m + 1) and (n + m): their sum is 2n + 1, so one is odd and one is even. Since n − m + 1 ≥ 1, both factors are also positive. We can enumerate all factorizations of 78 into such odd and even ‘pairs’ below:

(n − m + 1) · (n + m)    n − m    n + m    Solution (n, m)
2 · 39                   1        39       (20, 19)
6 · 13                   5        13       (9, 4)
26 · 3                   25       3        (14, −11)
78 · 1                   77       1        (39, −38)
39 · 2                   38       2        (20, −18)
13 · 6                   12       6        (9, −3)
3 · 26                   2        26       (14, 12)
1 · 78                   0        78       (39, 39)

Therefore, 39 can be expressed in the following ways:

19 + 20,
4 + 5 + 6 + . . . + 8 + 9,
(−11) + (−10) + (−9) + . . . + 13 + 14,
(−38) + (−37) + . . . + 39,
(−18) + (−17) + . . . + 20,
(−3) + (−2) + . . . + 9,
12 + 13 + 14,
39.
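
The enumeration above can be automated. The following Python sketch (the function name `consecutive_sums` is an illustrative choice) walks through the factorizations of 2 · 39 = 78 with factors of opposite parity, exactly as in the table, and recovers m and n from each one.

```python
def consecutive_sums(target):
    """Return all (m, n) with m <= n and m + (m+1) + ... + n == target, for target > 0."""
    results = []
    doubled = 2 * target                      # (n - m + 1)(n + m) = 2 * target
    for length in range(1, doubled + 1):      # length plays the role of n - m + 1
        if doubled % length != 0:
            continue
        total = doubled // length             # total plays the role of n + m
        if (length + total) % 2 == 0:         # length + total = 2n + 1 must be odd
            continue
        n = (length + total - 1) // 2
        m = total - n
        results.append((m, n))
    return results

ways = consecutive_sums(39)
for m, n in ways:
    print(m, "through", n)
```

Each pair printed corresponds to one row of the table, so eight pairs are expected.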

Likewise, consider the geometric series, which is the sum of all the terms of a geometric
sequence.

Theorem 4.2.8
For a finite geometric series $a + ar + ar^2 + ar^3 + \dots + ar^{n-1}$, we have the general formulas:
\[
a + ar + ar^2 + ar^3 + \dots + ar^{n-1} =
\begin{cases}
a \cdot \dfrac{1 - r^n}{1 - r} & \text{if } r \neq 1, \\[1ex]
na & \text{if } r = 1.
\end{cases}
\]

Proof. Define $S = a + ar + ar^2 + ar^3 + \dots + ar^{n-2} + ar^{n-1}$.

First, assume $r \neq 1$. Multiply both sides by r to get $rS = ar + ar^2 + ar^3 + ar^4 + \dots + ar^{n-1} + ar^n$. Notice that S and rS have the terms $ar, ar^2, ar^3, \dots, ar^{n-1}$ in common. To eliminate these intermediate terms, subtract rS from S:
\begin{align*}
S - rS &= (a + ar + ar^2 + \dots + ar^{n-2} + ar^{n-1}) - (ar + ar^2 + ar^3 + \dots + ar^{n-1} + ar^n) \\
S - rS &= a - ar^n \\
S(1 - r) &= a(1 - r^n) \\
\therefore S &= a \cdot \frac{1 - r^n}{1 - r}.
\end{align*}

Otherwise, if r = 1, then all the terms are just a. So our sum would be n terms of a added
together, which is just na.
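
A minimal Python sketch of Theorem 4.2.8, handling both cases, is shown below. The function name `geometric_sum` and the sample values are assumptions made for illustration; exact fractions are used so that the comparison with a term-by-term sum is not affected by rounding.

```python
from fractions import Fraction

def geometric_sum(a, r, n):
    """Sum a + a*r + ... + a*r**(n-1) using the closed form from Theorem 4.2.8."""
    a, r = Fraction(a), Fraction(r)
    if r == 1:
        return n * a                      # all n terms equal a
    return a * (1 - r**n) / (1 - r)

# Hypothetical sample: 3 + 6 + 12 + 24 + 48.
closed_form = geometric_sum(3, 2, 5)
term_by_term = sum(3 * 2**k for k in range(5))
print(closed_form, term_by_term)
```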

Problem 4.2.9. Which expression has a larger value?

\[
\underbrace{100 + 200 + 300 + \dots}_{28 \text{ terms}}
\qquad\qquad
\underbrace{1 + 2 + 4 + 8 + \dots}_{28 \text{ terms}}
\]

Solution. The first expression is an arithmetic series whose last term is 2800, so it evaluates to $\frac{100 + 2800}{2} \cdot 28 = 40600$. The second expression is a geometric series and it can be computed as $1 \cdot \frac{1 - 2^{28}}{1 - 2} = 2^{28} - 1$. It is clear that the latter is much greater (note that $2^{16} = 65536 > 40600$).

Problem 4.2.10. Evaluate the sum


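\[ 1 - \frac{1}{3} + \frac{1}{9} - \frac{1}{27} + \dots + \frac{1}{729}. \]

Solution. We start with the first term $\frac{1}{3^0}$ and end with the last term $\frac{1}{3^6}$, so there are 7 terms. The common ratio is $-\frac{1}{3}$, so the sum is
\[
1 \cdot \frac{1 - \left(-\frac{1}{3}\right)^7}{1 - \left(-\frac{1}{3}\right)} = \frac{1 + \frac{1}{3^7}}{\frac{4}{3}} = \frac{547}{729}.
\]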

Problem 4.2.11. Let $\{a_n\}$ be a geometric sequence. If $a_1 + a_2 + a_3 = 10$ and $a_2 + a_3 + a_4 = 20$, what is $a_3 + a_4 + a_5 + a_6 + a_7 + a_8$?

Solution. Using $a_k = ar^{k-1}$, we have the equations
\begin{align*}
a + ar + ar^2 &= 10, \\
ar + ar^2 + ar^3 &= 20.
\end{align*}

But notice that $ar + ar^2 + ar^3 = r(a + ar + ar^2) = 10r = 20$, so r = 2.

We want to compute $a_3 + a_4 + a_5 + a_6 + a_7 + a_8$, which can be rewritten as $ar^2 + ar^3 + \dots + ar^7$. But this is actually $r^2(a + ar + ar^2) + r^4(ar + ar^2 + ar^3) = 2^2 \cdot 10 + 2^4 \cdot 20 = 360$.

However, for all of the series we have been dealing with so far, we assumed them to be finite.
What if we had an infinite series? There would be no ‘last term’ to consider.
Examine the series $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots + \frac{1}{2^n}$. The formula gives $\frac{1}{2} \cdot \frac{1 - \frac{1}{2^n}}{1 - \frac{1}{2}} = 1 - \frac{1}{2^n}$. Let’s substitute in some values of n.
\begin{align*}
\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} &= \frac{15}{16} = 0.9375. \\
\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32} &= \frac{31}{32} = 0.96875. \\
\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots + \frac{1}{256} &= \frac{255}{256} \approx 0.99609.
\end{align*}

As n gets larger, we see that the sum is getting closer to 1. If we examine the formula $1 - \frac{1}{2^n}$, notice that as n gets large, $\frac{1}{2^n}$ becomes a tremendously small number, so $1 - \frac{1}{2^n}$ approaches 1. To express this notion, we write
\[ \lim_{n \to \infty} \left(1 - \frac{1}{2^n}\right) = 1. \]
Here, we express n getting arbitrarily large as n → ∞. This notation is read as “the limit as n goes to ∞ of $1 - \frac{1}{2^n}$ is 1.” Furthermore, we say that the series $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots + \frac{1}{2^n}$ converges to 1.
Problem 4.2.12. Find the value that this series converges to:

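\[ \frac{1}{3} + \frac{1}{9} + \frac{1}{27} + \dots + \frac{1}{3^n}. \]

Solution. First, find the sum in terms of n using the formula $a \cdot \frac{1 - r^n}{1 - r}$:
\begin{align*}
\frac{1}{3} + \frac{1}{9} + \frac{1}{27} + \dots + \frac{1}{3^n} &= \frac{1}{3} \cdot \frac{1 - \left(\frac{1}{3}\right)^n}{1 - \frac{1}{3}} \\
&= \frac{1 - \frac{1}{3^n}}{2} \\
&= \frac{1}{2}\left(1 - \frac{1}{3^n}\right).
\end{align*}

As n gets large, $\frac{1}{3^n}$ goes to 0. Then $1 - \frac{1}{3^n}$ goes to 1 and therefore the entire expression $\frac{1}{2}\left(1 - \frac{1}{3^n}\right)$ will tend toward $\frac{1}{2}$.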

Using this notion of the limit, it is possible to compute certain infinite series. Because some
series converge to one value, we can have a finite sum when there are infinitely many terms to add
together.

Theorem 4.2.13
For an infinite geometric series $a + ar + ar^2 + \dots$, if |r| < 1, then the sum is $\frac{a}{1 - r}$.

Proof. Disclaimer: this proof will not be fully rigorous, as we have not discussed limits in detail yet (we will in the next section).
Let $S_n = a + ar + ar^2 + \dots + ar^{n-1} = \frac{a(1 - r^n)}{1 - r}$. Then, we will consider n → ∞, so $S_n$ becomes an infinite geometric series. In the formula $\frac{a(1 - r^n)}{1 - r}$, the only value that changes with n is $r^n$. So, let’s analyze $r^n$ as n → ∞.

If we consider all cases of r, we have the following scenarios:

r > 1          r^n → ∞ (becomes arbitrarily large)
r = 1          r^n → 1 (S_n = na)
0 < r < 1      r^n → 0
r = 0          r^n → 0
−1 < r < 0     r^n → 0
r = −1         r^n alternates between −1 and 1, so there is no convergence
r < −1         r^n alternates in sign while growing in size, so there is no convergence

In fact, if r ≤ −1 or r > 1, we cannot find a proper formula for such a series, and we have already covered the case when r = 1.
Thus, if we restrict r to −1 < r < 1, then we know for sure that $r^n$ will go to 0. Then $\frac{a(1 - r^n)}{1 - r}$ will approach $\frac{a}{1 - r}$, which is the desired formula.
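
To see the convergence concretely, the sketch below compares partial sums with the claimed limit $\frac{a}{1-r}$ for the series $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots$ discussed earlier. The helper name `partial_sum` is hypothetical, and the script only illustrates the theorem numerically; it is not a proof.

```python
from fractions import Fraction

def partial_sum(a, r, n):
    """S_n = a + a*r + ... + a*r**(n-1), added term by term."""
    return sum(a * r**k for k in range(n))

a, r = Fraction(1, 2), Fraction(1, 2)
limit = a / (1 - r)                       # claimed infinite sum a / (1 - r)

for n in (4, 5, 8, 20):
    gap = limit - partial_sum(a, r, n)    # how far S_n still is from the limit
    print(n, float(partial_sum(a, r, n)), float(gap))
```

The gap shrinks by a factor of r each time n increases by one, which is the $ar^n/(1-r)$ term that vanishes when |r| < 1.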

Problem 4.2.14. Find the sum of the infinite geometric series


\[ 1 - \frac{2}{3} + \frac{4}{9} - \frac{8}{27} + \dots \]

Solution. The common ratio is $-\frac{2}{3}$ and the first term is 1. Using the formula, the sum is simply $\frac{1}{1 - \left(-\frac{2}{3}\right)} = \frac{3}{5}$.

Problem 4.2.15. Find the following sums:

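1. $\frac{9}{10} + \frac{9}{100} + \frac{9}{1000} + \dots$

2. $0.\overline{23}$

3. $0.1\overline{46}$

Solution.

1. Direct application of the formula gives $\frac{\frac{9}{10}}{1 - \frac{1}{10}} = 1$. Also notice that $\frac{9}{10} + \frac{9}{100} + \frac{9}{1000} + \dots$ is equivalent to 0.9999 . . ., or $0.\overline{9}$, which equals 1.

2. $0.\overline{23} = \frac{23}{100} + \frac{23}{10000} + \dots = \frac{0.23}{1 - \frac{1}{100}} = \frac{23}{99}$.

3. $0.1\overline{46} = \frac{1}{10} + \left(\frac{46}{1000} + \frac{46}{100000} + \dots\right) = \frac{1}{10} + \frac{\frac{46}{1000}}{1 - \frac{1}{100}} = \frac{29}{198}$.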
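
The same recipe works for any repeating decimal between 0 and 1: split off the non-repeating digits, then sum the repeating block as an infinite geometric series. The Python sketch below follows that recipe; the function name `repeating_decimal` and its string-based interface are assumptions made for this illustration.

```python
from fractions import Fraction

def repeating_decimal(prefix, block):
    """Exact value of 0.<prefix><block><block>... as a fraction.

    prefix: non-repeating digits after the decimal point (string, may be empty)
    block:  repeating digits (non-empty string)
    """
    p, b = len(prefix), len(block)
    head = Fraction(int(prefix), 10**p) if prefix else Fraction(0)
    first = Fraction(int(block), 10**(p + b))   # first term of the geometric series
    ratio = Fraction(1, 10**b)                  # common ratio, strictly between 0 and 1
    return head + first / (1 - ratio)           # head + a / (1 - r)

print(repeating_decimal("", "9"), repeating_decimal("", "23"), repeating_decimal("1", "46"))
```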

Theorem 4.2.16
A number is rational if and only if it has a decimal expansion that either terminates or repeats.

Proof. First, we prove the forward direction: if a number is rational, then it has a decimal expansion that either terminates or repeats.
A rational number can be expressed as $\frac{p}{q}$, where p, q ∈ Z and q > 0. If we perform long division with p as the dividend and q as the divisor, then there are only finitely many possible remainders, namely 0, 1, 2, . . . , q − 1. Thus, the long division will either terminate with a remainder of 0, or some non-zero remainder will eventually reappear, after which the same digits repeat in a cycle indefinitely.
For the converse, if the number has a decimal expansion that either terminates or repeats, then it can be represented as a finite sum of fractions (if it terminates), or as such a finite sum plus an infinite geometric series with common ratio between 0 and 1 (if it repeats), as in Problem 4.2.15. A finite sum of fractions is rational, and the formula $\frac{a}{1 - r}$ with rational a and r gives a rational number, so in either case the number is rational.
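
The remainder argument in the forward direction can be turned directly into a procedure. The sketch below performs long division digit by digit and stops as soon as a remainder is 0 or repeats; the function name `decimal_expansion` and its return format are illustrative assumptions, and it is restricted to 0 ≤ p < q for simplicity.

```python
def decimal_expansion(p, q):
    """Digits after the decimal point of p/q, for integers 0 <= p < q.

    Returns (non_repeating, repeating) as strings; `repeating` is empty
    when the expansion terminates.
    """
    digits = []
    seen = {}                      # remainder -> position where it first appeared
    r = p
    while r != 0 and r not in seen:
        seen[r] = len(digits)
        r *= 10
        digits.append(str(r // q))
        r %= q                     # only q possible remainders: 0, 1, ..., q - 1
    if r == 0:
        return "".join(digits), ""
    start = seen[r]                # the cycle begins where this remainder first occurred
    return "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(1, 8), decimal_expansion(23, 99), decimal_expansion(29, 198))
```

Because there are at most q distinct remainders, the loop runs at most q times, which is the finiteness claim used in the proof.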

Problem 4.2.17. The sum of an infinite geometric series is 10. The sum of the same series but
with each of its terms squared is 12. What is its fifth term?

Solution. We have the equations


\begin{align*}
a + ar + ar^2 + \dots &= \frac{a}{1 - r} = 10, \\
a^2 + a^2 r^2 + a^2 r^4 + \dots &= \frac{a^2}{1 - r^2} = 12.
\end{align*}

The first equation gives a = 10 − 10r, and the second gives $a^2 = 12 - 12r^2$. Therefore, we can solve for r with the equation
\[ 12 - 12r^2 = (10 - 10r)^2. \]

We end up with the quadratic $14r^2 - 25r + 11 = 0$, which can be factored into (r − 1)(14r − 11) = 0, giving r = 1 or $r = \frac{11}{14}$. Since the infinite series converges, |r| < 1, so the common ratio must be $\frac{11}{14}$.
Therefore $a = 10 - 10 \cdot \frac{11}{14} = \frac{15}{7}$, so the fifth term, which is $ar^4$, would be $\frac{15}{7} \cdot \left(\frac{11}{14}\right)^4$.
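
The values found above can be substituted back into both conditions as a check. This short Python sketch uses exact fractions; the variable names are illustrative only.

```python
from fractions import Fraction

r = Fraction(11, 14)
a = 10 * (1 - r)                       # from a / (1 - r) = 10

sum_of_terms = a / (1 - r)             # should reproduce 10
sum_of_squares = a**2 / (1 - r**2)     # squared terms have first term a^2, ratio r^2
fifth_term = a * r**4

print(a, sum_of_terms, sum_of_squares, fifth_term)
```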

4.3 Limits of Sequences

Earlier, we brought up the concept of the limit as a means to derive our formula for an infinite geometric series. In this section, we will go over limits with a rigorous and technical approach. If you are not yet familiar with quantifiers from the first chapter, make sure you review them, because they are essential to understanding how exactly limits are defined from a theoretical standpoint.
Definition 4.3.1. Let $\{a_n\}$ be an infinite sequence. We define the limit of the sequence as follows:

\[ \lim_{n \to \infty} a_n = L \ \longleftrightarrow\ \forall \varepsilon > 0\ \exists N > 0 \text{ such that } \forall n > N,\ |a_n - L| < \varepsilon. \]

Now this symbolic definition certainly seems unwieldy, so we will dissect one part at a time.
Often, we use the Greek letter ε, called “epsilon,” to represent a preferably small number. Likewise,
we are using the capital letter N to represent a preferably large number.
When we start off with ∀ε > 0 ∃N > 0, we are saying that for any small positive number ε we
choose, we can always find a large positive number N such that the rest of the statement is satisfied.
Keep in mind that ε and N are defined to be real numbers in this context.
Namely, the rest of the statement reads $\forall n > N,\ |a_n - L| < \varepsilon$. Remember that $|a_n - L|$ is the distance between $a_n$ and L, which is the value that the infinite sequence converges to. We are saying that every $a_n$ in the infinite sequence whose index n is greater than the large number N we found is less than ε away from L.
In other words, an can be made arbitrarily close to L by making n sufficiently large. This should
make sense - as we choose a term in the sequence farther down, we should be closer to the limit L
than before.

Then ε is our threshold of ‘closeness’ to L. As we began the statement with ∀ε > 0, this is the
‘arbitrarily close’ part of the definition. Making ε as small as we want yields terms further down the
sequence that are as close to L as we want.
Our choice of N , depending on the given number ε, represents how far we have to go down the
sequence in order to find terms that are within ε of L. Thus, N serves to represent the ‘make n
sufficiently large’ part of the definition.
As a last note, I will abbreviate “such that” to “s.t.” for concision.
Now we will proceed to prove that some sequences actually have the limits that we suspect them to have.

Example 4.3.2
Prove $\lim_{n \to \infty} \frac{1}{n} = 0$.

Proof. First, write out the symbolic definition:

\[ \forall \varepsilon > 0\ \exists N > 0 \text{ s.t. } \forall n > N,\ \left|\frac{1}{n} - 0\right| < \varepsilon, \]

where $a_n = \frac{1}{n}$ and L = 0. It is sufficient to find an N in terms of a given ε that satisfies $\forall n > N,\ \left|\frac{1}{n}\right| < \varepsilon$. How do we do so?
Usually, we can work backwards. Start from $\left|\frac{1}{n}\right| < \varepsilon$. Notice that n is always positive, given that N > 0 and n > N. So we can drop the absolute value signs to get $\frac{1}{n} < \varepsilon$. As ε is also positive, we can rearrange this to $n > \frac{1}{\varepsilon}$.
Recall that we must satisfy $\forall n > N,\ \frac{1}{n} < \varepsilon$, and we have just discovered that n must be greater than $\frac{1}{\varepsilon}$. In fact, if we let $N = \frac{1}{\varepsilon}$, then we have $\forall n > \frac{1}{\varepsilon},\ \frac{1}{n} < \varepsilon$, which is true because of the algebraic manipulations we have just done (the steps we have taken are reversible).
Thus, it is sufficient to choose $N = \frac{1}{\varepsilon}$ in order to prove the limit.
Unfortunately, the line of reasoning that we’ve just done cannot be used in a formal proof, because we have worked backwards. To formalize it, we must write the proof forwards:
For a given ε > 0, let $N = \frac{1}{\varepsilon}$. Then,
\[ n > N \implies n > \frac{1}{\varepsilon} \implies n\varepsilon > 1 \implies \frac{1}{n} < \varepsilon \implies \left|\frac{1}{n} - 0\right| < \varepsilon. \]

Therefore, for any given ε > 0, we have found an N which satisfies

\[ \forall \varepsilon > 0\ \exists N > 0 \text{ s.t. } \forall n > N,\ \left|\frac{1}{n} - 0\right| < \varepsilon, \]
and thus we have proven $\lim_{n \to \infty} \frac{1}{n} = 0$.
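
A computer cannot check infinitely many n, but it can spot-check that a proposed N behaves as the definition demands. The sketch below is such a spot-check, not a proof; the function `check_limit` and its parameters are hypothetical names introduced for this illustration.

```python
import math
from fractions import Fraction

def check_limit(a, L, N_of_eps, eps, how_many=1000):
    """Spot-check |a(n) - L| < eps for the first `how_many` integers n > N."""
    N = N_of_eps(eps)
    start = math.floor(N) + 1          # smallest integer strictly greater than N
    return all(abs(a(n) - L) < eps for n in range(start, start + how_many))

eps = Fraction(1, 100)
good = check_limit(lambda n: Fraction(1, n), 0, lambda e: 1 / e, eps)        # N = 1/eps
bad = check_limit(lambda n: Fraction(1, n), 0, lambda e: 1 / (2 * e), eps)   # N too small
print(good, bad)
```

With N = 1/ε every tested term lies within ε of 0, whereas the deliberately smaller N = 1/(2ε) fails already at n = 51.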

For the remainder of the problems, I will write out the steps taken backwards in order to show
the motivation behind choosing an N to satisfy the definition. However, I strongly recommend that
you write formal proofs forward in order to reinforce your understanding of the symbolic definition
of the limit. Furthermore, I will gloss over some algebraic steps for conciseness, but you should write out every step in your own proofs.
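Problem 4.3.3. Prove $\lim_{n \to \infty} \frac{1}{n^2} = 0$.

Proof. Our symbolic definition is now

\[ \forall \varepsilon > 0\ \exists N > 0 \text{ s.t. } \forall n > N,\ \left|\frac{1}{n^2} - 0\right| < \varepsilon, \]

and now we can work backwards again. We have $\left|\frac{1}{n^2}\right| < \varepsilon$, and again note that n is positive, so $\left|\frac{1}{n^2}\right| = \frac{1}{n^2} < \varepsilon$. We can then rearrange this to $n > \frac{1}{\sqrt{\varepsilon}}$. Then we can let $N = \frac{1}{\sqrt{\varepsilon}}$ to satisfy the definition.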
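Problem 4.3.4. Prove $\lim_{n \to \infty} \frac{n}{n+1} = 1$.

Proof. We have the definition,

\[ \forall \varepsilon > 0\ \exists N > 0 \text{ s.t. } \forall n > N,\ \left|\frac{n}{n+1} - 1\right| < \varepsilon. \]

The procedure is similar to that of the previous proofs. To make things easier, condense the absolute value expression: note that $\frac{n}{n+1} - 1 = -\frac{1}{n+1}$, so $\left|-\frac{1}{n+1}\right| = \frac{1}{n+1}$ using that n is positive.
Therefore, we have $\frac{1}{n+1} < \varepsilon$, which can be rearranged to $n > \frac{1}{\varepsilon} - 1$.
Thus, given ε > 0, let $N = \frac{1}{\varepsilon} - 1$, and it can be verified that this choice of N satisfies the definition. (If ε ≥ 1, this value is not positive; in that case any N > 0 works, since $\frac{1}{n+1} < 1 \leq \varepsilon$ for every positive n.)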
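Problem 4.3.5. Prove $\lim_{n \to \infty} \frac{-n+1}{2n+3} = -\frac{1}{2}$.

Proof. Again, write out the definition,

\[ \forall \varepsilon > 0\ \exists N > 0 \text{ s.t. } \forall n > N,\ \left|\frac{-n+1}{2n+3} - \left(-\frac{1}{2}\right)\right| < \varepsilon. \]

We can manipulate the inequality as follows:
\begin{align*}
\left|\frac{-n+1}{2n+3} + \frac{1}{2}\right| &< \varepsilon \\
\left|\frac{-n+1}{2n+3} + \frac{n + \frac{3}{2}}{2n+3}\right| &< \varepsilon \\
\left|\frac{\frac{5}{2}}{2n+3}\right| &< \varepsilon
\end{align*}

As n is positive, we have $\frac{\frac{5}{2}}{2n+3} < \varepsilon$. We can rearrange this to get $n > \frac{\frac{5}{2} - 3\varepsilon}{2\varepsilon}$. Thus, for a given ε > 0, let $N = \frac{\frac{5}{2} - 3\varepsilon}{2\varepsilon}$ so the overall symbolic statement is true.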

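Problem 4.3.6. Prove $\lim_{n \to \infty} \frac{2n+5}{3n-7} = \frac{2}{3}$.

Proof. Given ε > 0, let $N = \frac{29 + 21\varepsilon}{9\varepsilon}$, so that $3N - 7 = \frac{29}{3\varepsilon} > 0$. Note that n > N implies:
\begin{align*}
3n &> 3N \\
3n - 7 &> 3N - 7 \\
\frac{1}{3n-7} &< \frac{1}{3N-7} \\
\therefore\ \frac{\frac{29}{3}}{3n-7} &< \frac{\frac{29}{3}}{3N-7}
\end{align*}

Then,
\[
\left|\frac{2n+5}{3n-7} - \frac{2}{3}\right|
= \left|\frac{2n+5}{3n-7} - \frac{2n - \frac{14}{3}}{3n-7}\right|
= \left|\frac{2n + 5 - 2n + \frac{14}{3}}{3n-7}\right|
= \left|\frac{\frac{29}{3}}{3n-7}\right|
= \frac{\frac{29}{3}}{3n-7}.
\]

Thus, if n > N, then
\[
\left|\frac{2n+5}{3n-7} - \frac{2}{3}\right| = \frac{\frac{29}{3}}{3n-7} < \frac{\frac{29}{3}}{3N-7} = \frac{\frac{29}{3}}{3\left(\frac{29 + 21\varepsilon}{9\varepsilon}\right) - 7} = \varepsilon.
\]

In other words, for a given ε > 0, we have found an N such that for all n > N, $\left|\frac{2n+5}{3n-7} - \frac{2}{3}\right| < \varepsilon$.
Therefore, $\lim_{n \to \infty} \frac{2n+5}{3n-7} = \frac{2}{3}$.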

Exercise 4.3.7. Let bn = c for all n, where c is a constant real number. Prove lim bn = c.
n→∞

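Example 4.3.8
Prove that $\lim_{n \to \infty} (-1)^n$ does not exist.

Proof. Suppose it does exist, such that $\lim_{n \to \infty} (-1)^n = L$. We have its definition,

\[ \forall \varepsilon > 0\ \exists N \text{ s.t. } \forall n > N,\ |(-1)^n - L| < \varepsilon. \]

To prove that there exists no such L, consider the negation,

\[ \exists \varepsilon > 0 \text{ s.t. } \forall N\ \exists n > N,\ |(-1)^n - L| \geq \varepsilon. \]

Now, it becomes clear that we simply must find a value for ε such that requiring $|(-1)^n - L| < \varepsilon$ for all n > N leads to a contradiction.
Consider $\varepsilon = \frac{1}{2}$. Then we have $|(-1)^n - L| < \frac{1}{2}$. We break up the absolute value signs to get
\[ -\frac{1}{2} < (-1)^n - L < \frac{1}{2}, \]
which gives us two inequalities,

\[ (-1)^n - L > -\frac{1}{2} \quad \text{and} \quad (-1)^n - L < \frac{1}{2}. \]

Consider each one separately. First, based on the parity of the integer n, notice that there are only two possible values of $(-1)^n$: 1 and −1. Both values occur for some n > N, no matter how large N is, so each inequality must hold in both cases. Now we list all possibilities:
\begin{align*}
(-1)^n - L > -\frac{1}{2}: &\quad
\begin{cases}
-1 - L > -\frac{1}{2} \implies L < -\frac{1}{2} \\
\phantom{-}1 - L > -\frac{1}{2} \implies L < \frac{3}{2}
\end{cases}
\quad \therefore\ L < -\frac{1}{2}. \\
(-1)^n - L < \frac{1}{2}: &\quad
\begin{cases}
-1 - L < \frac{1}{2} \implies L > -\frac{3}{2} \\
\phantom{-}1 - L < \frac{1}{2} \implies L > \frac{1}{2}
\end{cases}
\quad \therefore\ L > \frac{1}{2}.
\end{align*}
There is no value of L which satisfies both $L < -\frac{1}{2}$ and $L > \frac{1}{2}$. Therefore, there exists no limit for the sequence $a_n = (-1)^n$.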

Theorem 4.3.9
Let k ∈ R. If $\lim_{n \to \infty} a_n = L$, then

a) $\lim_{n \to \infty} (a_n + k) = L + k$.

b) $\lim_{n \to \infty} k a_n = kL$.

Proof. The best approach is to rewrite everything in terms of the given definition of a limit. For all parts of the question, keep in mind:

\[ \lim_{n \to \infty} a_n = L \ \longleftrightarrow\ \forall \varepsilon > 0,\ \exists N \text{ s.t. } \forall n > N,\ |a_n - L| < \varepsilon. \]

a) Regarding this definition, note that
\begin{align*}
a_n - L &= a_n - L + (k - k) \\
&= (a_n + k) + (-L - k) \\
&= (a_n + k) - (L + k).
\end{align*}
Therefore, we have
\[ \forall \varepsilon > 0,\ \exists N \text{ s.t. } \forall n > N,\ |(a_n + k) - (L + k)| < \varepsilon, \]
which is $\lim_{n \to \infty} (a_n + k) = L + k$.

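b) If k = 0, then $k a_n = 0$ for all n, and the limit is clearly 0 = kL, so assume k ≠ 0. Note that we are able to replace the third quantifier with an implication statement without affecting the original definition, as such:
\[ \forall \varepsilon > 0,\ \exists N \text{ s.t. } n > N \to |a_n - L| < \varepsilon. \]

Consider the definition of $\lim_{n \to \infty} k a_n = kL$, using different variables to avoid confusion:
\begin{align*}
\forall \tilde{\varepsilon} > 0,\ \exists \tilde{N} \text{ s.t. } n > \tilde{N} \to\ & |k a_n - kL| < \tilde{\varepsilon} \\
\text{i.e. } & |k|\,|a_n - L| < \tilde{\varepsilon} \\
\text{i.e. } & |a_n - L| < \frac{\tilde{\varepsilon}}{|k|}.
\end{align*}
Let $\varepsilon = \frac{\tilde{\varepsilon}}{|k|}$, which is allowed because ε and $\tilde{\varepsilon}$ can be any positive real numbers. Then
\[ \forall \varepsilon > 0,\ \exists N \text{ s.t. } n > N \to |a_n - L| < \frac{\tilde{\varepsilon}}{|k|}, \text{ i.e. } |k a_n - kL| < \tilde{\varepsilon}, \]
as desired.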
A lemma is a minor result that we prove in order to help us with a harder proof of a theorem
or grander result. Here, we will prove a useful fact relating to absolute value and inequalities that
we will use in later proofs of major theorems.

Lemma 4.3.10 (Triangle Inequality)


For any a, b ∈ R, |a + b| ≤ |a| + |b|.

Proof. As the square of any number is nonnegative, it is clear that $|x|^2 = x^2$ for any number x. Then,
\begin{align*}
(|a + b|)^2 &= (a + b)^2 \\
&= a^2 + 2ab + b^2 \\
&= |a|^2 + 2ab + |b|^2.
\end{align*}

It should also be certain that for any x, x ≤ |x| (either x = |x| or x = −|x|). Therefore, we must have ab ≤ |ab| = |a| |b|. Using this fact, $|a|^2 + 2ab + |b|^2 \leq |a|^2 + 2|a||b| + |b|^2$, which is equal to $(|a| + |b|)^2$.
Therefore, $(|a + b|)^2 \leq (|a| + |b|)^2$. Since both |a + b| and |a| + |b| are nonnegative, we can take the square root of both sides to get |a + b| ≤ |a| + |b|.

Theorem 4.3.11 (Sum of Limits)


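If $\lim_{n \to \infty} a_n = L$ and $\lim_{n \to \infty} b_n = M$, then $\lim_{n \to \infty} (a_n + b_n) = L + M$.

Proof. We first write out the definitions of both limits given. For a given ε > 0,
\begin{align*}
\exists N_1\ \forall n > N_1,\ |a_n - L| &< \frac{\varepsilon}{2}, \\
\exists N_2\ \forall n > N_2,\ |b_n - M| &< \frac{\varepsilon}{2}.
\end{align*}
Why are we allowed to use $\frac{\varepsilon}{2}$ instead of ε? The definition holds for every positive number, and $\frac{\varepsilon}{2}$ is positive, so we may apply it with $\frac{\varepsilon}{2}$.
Let $N = \max\{N_1, N_2\}$, that is to say, the bigger value out of $N_1$ and $N_2$. Then conveniently, $\forall n > N,\ n > N_1 \wedge n > N_2$.
Therefore, ∀n > N,
\[ |a_n - L| < \frac{\varepsilon}{2} \ \wedge\ |b_n - M| < \frac{\varepsilon}{2}, \]
so by the Triangle Inequality,
\[ |(a_n + b_n) - (L + M)| = |(a_n - L) + (b_n - M)| \leq |a_n - L| + |b_n - M| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]

Thus, we have found N such that ∀n > N,

\[ |(a_n + b_n) - (L + M)| < \varepsilon, \]

so $\lim_{n \to \infty} (a_n + b_n) = L + M$.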

Theorem 4.3.12 (Squeeze Theorem)


If $\lim_{n \to \infty} a_n = 0$ and $0 < b_n < a_n$ ∀n, then $\lim_{n \to \infty} b_n = 0$.

Proof. Assuming $\lim_{n \to \infty} a_n = 0$, we have

\[ \forall \varepsilon > 0,\ \exists N \text{ s.t. } \forall n > N,\ |a_n - 0| < \varepsilon. \]

Since we are given that $0 < a_n$ (in other words, $a_n$ is positive for all n),

\[ |a_n - 0| = |a_n| = a_n < \varepsilon. \]

Now, we know that $0 < b_n < a_n < \varepsilon$. This means that $b_n < \varepsilon$, i.e. $|b_n - 0| < \varepsilon$. Therefore, ∀n > N, $|b_n - 0| < \varepsilon$, and this statement implies $\lim_{n \to \infty} b_n = 0$.

Lemma 4.3.13 (Bernoulli’s Inequality)


For h > 0 and n ∈ N, $(1 + h)^n \geq 1 + hn$.

We don’t have the tools required to prove this inequality yet, but it will be available as an
exercise in a later chapter when we learn proof by induction.

Theorem 4.3.14
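If 0 < a < 1, then $\lim_{n \to \infty} a^n = 0$.

Proof. If 0 < a < 1, then $\frac{1}{a} > 1$. Therefore, $\frac{1}{a} = 1 + h$ for some h > 0. By Bernoulli’s Inequality,
\[ \frac{1}{a^n} = \left(\frac{1}{a}\right)^n = (1 + h)^n \geq 1 + nh > nh. \]

Thus, $0 < a^n < \frac{1}{nh} = \frac{1}{n} \cdot \frac{1}{h}$. Since $\lim_{n \to \infty} \frac{1}{n} = 0$, by Theorem 4.3.9, $\lim_{n \to \infty} \frac{1}{n} \cdot \frac{1}{h} = 0 \cdot \frac{1}{h}$, which is still 0.
Since $0 < a^n < \frac{1}{nh}$ and $\lim_{n \to \infty} \frac{1}{nh} = 0$, by Theorem 4.3.12, $\lim_{n \to \infty} a^n = 0$.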

Beware that it is quite difficult to grasp the intuition or motivation behind the proofs for the
product and reciprocal of limits, so just try your best to understand how they work.

Theorem 4.3.15 (Product of Limits)


If $\lim_{n \to \infty} a_n = L$ and $\lim_{n \to \infty} b_n = M$, then $\lim_{n \to \infty} a_n b_n = LM$.

Proof. Given ε > 0, we know that, by the definition of a limit, we can find $N_1, N_2, N_3$ such that
\begin{align*}
\forall n > N_1,\ |a_n - L| &< \frac{\varepsilon}{2(|M| + 1)}, \\
\forall n > N_2,\ |b_n - M| &< \frac{\varepsilon}{2(|L| + 1)}, \\
\forall n > N_3,\ |b_n - M| &< 1.
\end{align*}

That third condition implies that, if $n > N_3$,

\[ |b_n| = |b_n - M + M| \leq |b_n - M| + |M| < 1 + |M|, \]

where we split the absolute value using the Triangle Inequality. Let $N = \max\{N_1, N_2, N_3\}$. Note that $n > N \to n > N_1 \wedge n > N_2 \wedge n > N_3$. Then if n > N,
\begin{align*}
|a_n b_n - LM| &= |a_n b_n - b_n L + b_n L - LM| \leq |a_n b_n - b_n L| + |b_n L - LM| \\
&= |b_n (a_n - L)| + |L (b_n - M)| = |b_n|\,|a_n - L| + |L|\,|b_n - M|.
\end{align*}

Now, we can substitute in the inequalities we derived earlier (and use the obvious fact that |L| < |L| + 1), to get that
\[ |b_n|\,|a_n - L| + |L|\,|b_n - M| < (|M| + 1)\,\frac{\varepsilon}{2(|M| + 1)} + (|L| + 1)\,\frac{\varepsilon}{2(|L| + 1)} = \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]

Thus, given an ε > 0, we have found an N such that $|a_n b_n - LM| < \varepsilon$ for all n > N. Therefore, $\lim_{n \to \infty} a_n b_n = LM$.

Theorem 4.3.16 (Reciprocal of Limits)


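If $\lim_{n \to \infty} a_n = L$ and L ≠ 0, then $\lim_{n \to \infty} \frac{1}{a_n} = \frac{1}{L}$.

Proof. Given ε > 0, we can find $N_1, N_2$ such that
\begin{align*}
\forall n > N_1,\ |a_n - L| &< \frac{|L|}{2}, \\
\forall n > N_2,\ |a_n - L| &< \frac{\varepsilon |L|^2}{2}.
\end{align*}

Let $N = \max\{N_1, N_2\}$, so that

\[ \forall n > N,\ |a_n - L| < \frac{|L|}{2} \ \wedge\ |a_n - L| < \frac{\varepsilon |L|^2}{2}. \]

By the Triangle Inequality, we have

\[ |L| = |L - a_n + a_n| \leq |L - a_n| + |a_n| = |a_n - L| + |a_n| < \frac{|L|}{2} + |a_n|, \]
so $|a_n| > \frac{|L|}{2}$. Both quantities are positive, so we can take the reciprocal and get $\frac{1}{|a_n|} < \frac{2}{|L|}$.
Therefore, for a given ε, we have found N such that

\[ \forall n > N,\ \left|\frac{1}{a_n} - \frac{1}{L}\right| = \left|\frac{L - a_n}{a_n L}\right| = \frac{|L - a_n|}{|a_n|\,|L|} = \frac{|a_n - L|}{|a_n|\,|L|} < \frac{\varepsilon |L|^2}{2} \cdot \frac{2}{|L|} \cdot \frac{1}{|L|} = \varepsilon, \]

so $\lim_{n \to \infty} \frac{1}{a_n} = \frac{1}{L}$.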

Corollary 4.3.17 (Quotient of Limits)


If $\lim_{n \to \infty} a_n = L$, $\lim_{n \to \infty} b_n = M$, and M ≠ 0, then $\lim_{n \to \infty} \frac{a_n}{b_n} = \frac{L}{M}$.

Proof. Since $\frac{a_n}{b_n} = a_n \cdot \frac{1}{b_n}$, the result follows by Theorem 4.3.15 and Theorem 4.3.16.

A corollary is a result that is a quick consequence of a previously proven theorem. Here, we used the theorems of product and reciprocal of limits to immediately prove the quotient of limits.
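Problem 4.3.18. If $\lim_{n \to \infty} a_n = 0$, then $\lim_{n \to \infty} \frac{1}{a_n}$ does not exist.

Proof. Assume for the sake of contradiction that $\lim_{n \to \infty} \frac{1}{a_n} = L$. If L ≠ 0, then by Theorem 4.3.16, $\lim_{n \to \infty} a_n = \frac{1}{L}$. We are given that this equals 0, so $\frac{1}{L} = 0 \implies 1 = 0$, which is clearly false. If L = 0, then Theorem 4.3.16 does not apply, but Theorem 4.3.15 gives $\lim_{n \to \infty} a_n \cdot \frac{1}{a_n} = 0 \cdot 0 = 0$, which contradicts $a_n \cdot \frac{1}{a_n} = 1$ for all n. Thus, the limit does not exist.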

Problem 4.3.19. Compute the following:

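1. $\lim_{n \to \infty} \frac{n}{n+1}$

2. $\lim_{n \to \infty} \left(2 + \frac{3}{n} + \frac{5}{n^2} + 88^{-n} + \pi\right)$

3. $\lim_{n \to \infty} \frac{2n^2 + 3n + 4}{3n^2 + 5n - 2}$

Solution.

1. Divide the numerator and denominator by n. Then we have
\[ \lim_{n \to \infty} \frac{1}{1 + \frac{1}{n}}, \]
but notice that $\lim_{n \to \infty} \frac{1}{n} = 0$, and 1 remains constant. Therefore, the limit is
\[ \lim_{n \to \infty} \frac{1}{1 + \frac{1}{n}} = \frac{1}{1 + 0} = 1. \]

2. Note that $\lim_{n \to \infty} \frac{3}{n} = 0$, $\lim_{n \to \infty} \frac{5}{n^2} = \lim_{n \to \infty} 5 \cdot \frac{1}{n} \cdot \frac{1}{n} = 0$, and $\lim_{n \to \infty} 88^{-n} = \lim_{n \to \infty} \left(\frac{1}{88}\right)^n = 0$ by Theorem 4.3.14. We are left with 2 + 0 + 0 + 0 + π = 2 + π.

3. Divide the numerator and denominator by the highest power of n, which in this case is $n^2$. Then all the terms with a constant in the numerator and a power of n in the denominator tend to 0, and the limit becomes clear:
\[ \lim_{n \to \infty} \frac{2n^2 + 3n + 4}{3n^2 + 5n - 2} = \lim_{n \to \infty} \frac{2 + \frac{3}{n} + \frac{4}{n^2}}{3 + \frac{5}{n} - \frac{2}{n^2}} = \frac{2}{3}. \]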

Problem 4.3.20. Compute the following:

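1. $\lim_{n \to \infty} \frac{2^n}{2^n + 3^n}$

2. $\lim_{n \to \infty} \frac{3^n}{2^n + 3^n}$

3. $\lim_{n \to \infty} \frac{5^n}{2^n + 3^n}$

Solution.

1. Divide the numerator and denominator by $3^n$ to get
\[ \lim_{n \to \infty} \frac{\left(\frac{2}{3}\right)^n}{\left(\frac{2}{3}\right)^n + 1}. \]
By Theorem 4.3.14, $\lim_{n \to \infty} \left(\frac{2}{3}\right)^n = 0$. By the various properties of limits we proved earlier,
\[ \lim_{n \to \infty} \frac{\left(\frac{2}{3}\right)^n}{\left(\frac{2}{3}\right)^n + 1} = \frac{0}{0 + 1} = 0. \]

2. Similarly, divide the numerator and denominator by $3^n$ to get
\[ \lim_{n \to \infty} \frac{1}{\left(\frac{2}{3}\right)^n + 1}. \]
We know that $\lim_{n \to \infty} \left(\frac{2}{3}\right)^n = 0$. Therefore,
\[ \lim_{n \to \infty} \frac{1}{\left(\frac{2}{3}\right)^n + 1} = \frac{1}{0 + 1} = 1. \]

3. We do the same thing to get $\lim_{n \to \infty} \frac{\left(\frac{5}{3}\right)^n}{\left(\frac{2}{3}\right)^n + 1}$. The denominator tends to 0 + 1 = 1, so the expression behaves like its numerator:
\[ \lim_{n \to \infty} \frac{\left(\frac{5}{3}\right)^n}{\left(\frac{2}{3}\right)^n + 1} = \frac{\lim_{n \to \infty} \left(\frac{5}{3}\right)^n}{0 + 1} = \lim_{n \to \infty} \left(\frac{5}{3}\right)^n. \]
Note that $\left(\frac{5}{3}\right)^n = \left(1 + \frac{2}{3}\right)^n \geq 1 + \frac{2n}{3}$ by Bernoulli’s Inequality. Since $1 + \frac{2n}{3}$ will go to infinity as n goes to infinity, so will $\left(\frac{5}{3}\right)^n$, therefore $\lim_{n \to \infty} \left(\frac{5}{3}\right)^n = \infty$.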
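
The three answers can be made plausible by evaluating the expressions for a few values of n. The sketch below is only a numerical illustration; the helper `ratio` is a made-up name, and exact fractions are used so that large powers do not overflow.

```python
from fractions import Fraction

def ratio(base, n):
    """The term base**n / (2**n + 3**n), computed exactly."""
    return Fraction(base**n, 2**n + 3**n)

for n in (1, 10, 50):
    # Columns: n, then the terms with numerator 2^n, 3^n and 5^n.
    print(n, float(ratio(2, n)), float(ratio(3, n)), float(ratio(5, n)))
```

The first column of values shrinks toward 0, the second climbs toward 1, and the third grows without bound, matching the three limits computed above.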

4.4 Summation and Product Notation

Moving on from sequences, series, and limits, we now turn to summation and product notation, which is useful shorthand to represent long sums and products in compact form.

Definition 4.4.1. Let m, n ∈ Z, m ≤ n. Then,

\[ \sum_{i=m}^{n} a_i = a_m + a_{m+1} + \dots + a_n, \]

where i is referred to as the dummy variable. This is read as “the summation from i = m to n of $a_i$.”

Here are some examples:

• $\sum_{r=3}^{7} r^2 = 9 + 16 + 25 + 36 + 49 = 135$.

• $\sum_{k=-1}^{3} k^3 = -1 + 0 + 1 + 8 + 27 = 35$.

• $\sum_{k=79}^{100} 7 = 22 \cdot 7 = 154$.

• $\sum_{j=2}^{5} j = 2 + 3 + 4 + 5 = 14$.
j=2

Theorem 4.4.2
We can add summations together and pull out constant factors.

1. $\sum_{k=m}^{n} a_k + \sum_{k=m}^{n} b_k = \sum_{k=m}^{n} (a_k + b_k)$

2. $c \sum_{k=m}^{n} a_k = \sum_{k=m}^{n} c\,a_k$

Proof. By using the definition, the proofs of these involve simple algebra:

1. Using the commutative and associative properties of addition yields:
\begin{align*}
\sum_{k=m}^{n} a_k + \sum_{k=m}^{n} b_k &= (a_m + a_{m+1} + a_{m+2} + \dots + a_n) + (b_m + b_{m+1} + b_{m+2} + \dots + b_n) \\
&= (a_m + b_m) + (a_{m+1} + b_{m+1}) + (a_{m+2} + b_{m+2}) + \dots + (a_n + b_n) \\
&= \sum_{k=m}^{n} (a_k + b_k).
\end{align*}

2. Likewise, we use the distributive property.
\begin{align*}
c \sum_{k=m}^{n} a_k &= c\,(a_m + a_{m+1} + a_{m+2} + \dots + a_n) \\
&= c\,a_m + c\,a_{m+1} + c\,a_{m+2} + \dots + c\,a_n \\
&= \sum_{k=m}^{n} c\,a_k.
\end{align*}

Problem 4.4.3. Write in summation form the general formulas for the sums of an arithmetic series
and a geometric series.

Solution. For an arithmetic series, we have

\[ \sum_{k=1}^{n} (a + (k-1)d) = na + \frac{n(n-1)}{2}\, d. \]

For a geometric series, we have

\[ \sum_{k=1}^{n} ar^{k-1} = a \cdot \frac{1 - r^n}{1 - r}, \quad \text{provided } r \neq 1. \]

Problem 4.4.4. Write the following as a summation:


\[ \frac{2}{3} + \frac{3}{4} + \frac{4}{5} + \frac{5}{6} + \dots + \frac{79}{80}. \]
Solution. The denominator is simply the numerator plus 1, so this sum can be expressed as
\[ \sum_{k=2}^{79} \frac{k}{k+1}. \]

Note that if we change the starting value of the dummy variable, we can still ‘shift over’ the ending value and the expression itself to represent the same sum. For instance, if we started at k = 41, we could express the same sum above as
\[ \sum_{k=41}^{118} \frac{k - 39}{k - 38}. \]
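
A reindexed sum is easy to get wrong by one, so it is worth checking that both forms list the same terms. The short Python sketch below does this with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Terms of the original sum, k = 2, ..., 79.
original_terms = [Fraction(k, k + 1) for k in range(2, 80)]

# Terms of the shifted sum, k = 41, ..., 118.
shifted_terms = [Fraction(k - 39, k - 38) for k in range(41, 119)]

print(original_terms[0], original_terms[-1], len(original_terms))
print(original_terms == shifted_terms, sum(original_terms) == sum(shifted_terms))
```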

Problem 4.4.5. Given the following formulas:
\begin{align*}
\sum_{k=1}^{n} k &= \frac{n(n+1)}{2}, \\
\sum_{k=1}^{n} 1 &= n, \\
\sum_{k=1}^{n} k^2 &= \frac{n(n+1)(2n+1)}{6}.
\end{align*}

Find the sum:

\[ \sum_{k=1}^{10} (2k^2 + 3k + 1). \]

Solution. We will eventually get around to proving the given formulas.

We can split the sum, by Theorem 4.4.2, and compute each part separately:
\begin{align*}
\sum_{k=1}^{10} (2k^2 + 3k + 1) &= 2 \sum_{k=1}^{10} k^2 + 3 \sum_{k=1}^{10} k + \sum_{k=1}^{10} 1 \\
&= 2 \cdot \frac{10 \cdot 11 \cdot 21}{6} + 3 \cdot \frac{10 \cdot 11}{2} + 10 = 945.
\end{align*}
 
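Problem 4.4.6. Compute $\sum_{i=1}^{4} \left( \sum_{j=1}^{3} (i + j) \right)$.

Solution. We evaluate the inner summation first, then the outer summation.
\begin{align*}
\sum_{i=1}^{4} \left( \sum_{j=1}^{3} (i + j) \right) &= \sum_{j=1}^{3} (1 + j) + \sum_{j=1}^{3} (2 + j) + \sum_{j=1}^{3} (3 + j) + \sum_{j=1}^{3} (4 + j) \\
&= (2 + 3 + 4) + (3 + 4 + 5) + (4 + 5 + 6) + (5 + 6 + 7) \\
&= 9 + 12 + 15 + 18 \\
&= 54.
\end{align*}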

Problem 4.4.7. Find another way to express $\sum_{k=400}^{1000} k^2$.

Solution. The complication is that our dummy variable k starts at 400. We can start at k = 1 instead and either subtract what we don’t want from the entire sum,
\[ \sum_{k=1}^{1000} k^2 - \sum_{k=1}^{399} k^2, \]
or, as shown before, ‘shift’ the ending value and the sequence formula as well:
\[ \sum_{k=1}^{601} (k + 399)^2. \]

Try listing out the first few terms of each to convince yourself that all three expressions represent the same sum.
Problem 4.4.8. Compute $\sum_{k=10}^{20} k^2$ in two ways by:

a) Subtracting two sums.

b) Reindexing the summation starting from 1 and breaking the sum into parts.

Solution.

a) The sum $10^2 + 11^2 + \dots + 20^2$ is the same as subtracting the sum $1^2 + 2^2 + \dots + 9^2$ from the total sum $1^2 + 2^2 + \dots + 20^2$, making this more manageable with the formulas we already know:
\[ \sum_{k=10}^{20} k^2 = \sum_{k=1}^{20} k^2 - \sum_{k=1}^{9} k^2 = \frac{20 \cdot 21 \cdot 41}{6} - \frac{9 \cdot 10 \cdot 19}{6} = 2870 - 285 = 2585. \]

b) Note that we can reindex the sum as:
\[ \sum_{k=10}^{20} k^2 = \sum_{k=1}^{11} (k + 9)^2 = \sum_{k=1}^{11} (k^2 + 18k + 81). \]
We can split $\sum_{k=1}^{11} (k^2 + 18k + 81)$ into a sum of partial sums, which we are able to compute individually using our formulas:
\begin{align*}
\sum_{k=1}^{11} (k^2 + 18k + 81) &= \sum_{k=1}^{11} k^2 + 18 \sum_{k=1}^{11} k + \sum_{k=1}^{11} 81 \\
&= \frac{11 \cdot 12 \cdot 23}{6} + 18 \left(\frac{11 \cdot 12}{2}\right) + 11(81) \\
&= 506 + 1188 + 891 = 2585.
\end{align*}
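
Both approaches can be confirmed by brute force. In the sketch below, `sum_squares` is a hypothetical helper implementing the given formula for $1^2 + 2^2 + \dots + n^2$; the other names are likewise illustrative.

```python
def sum_squares(n):
    """Closed form n(n+1)(2n+1)/6 for 1^2 + 2^2 + ... + n^2."""
    return n * (n + 1) * (2 * n + 1) // 6

direct = sum(k * k for k in range(10, 21))               # 10^2 + ... + 20^2, term by term
by_subtraction = sum_squares(20) - sum_squares(9)        # approach (a)
by_reindexing = sum((k + 9) ** 2 for k in range(1, 12))  # approach (b), k = 1, ..., 11

print(direct, by_subtraction, by_reindexing)
```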

You may recall from earlier that I represented an infinite geometric series by

\[ a + ar + ar^2 + \dots \]

Using this newly introduced summation notation, we can represent this same infinite sum as

\[ \sum_{k=1}^{\infty} ar^{k-1}, \]

and when |r| < 1, we can state the general formula as

\[ \sum_{k=1}^{\infty} ar^{k-1} = \frac{a}{1 - r}. \]

Lastly, keep in mind that we can turn an infinite summation into a finite summation, with the limit attached in front of it, as such:

\[ \sum_{k=1}^{\infty} a_k = \lim_{n \to \infty} \sum_{k=1}^{n} a_k. \]

In addition to summation, we also have a way to write the product of terms in shorthand form.

Definition 4.4.9 (Product Notation). Let m, n ∈ Z, m ≤ n. Then,

\[ \prod_{k=m}^{n} a_k = a_m \cdot a_{m+1} \cdot a_{m+2} \cdots a_n, \]

and product notation is analogous to summation notation.

Here are some examples:

1. $\prod_{k=1}^{n} k = n!$.

2. $\prod_{k=1}^{n} r = r^n$.

3. $\prod_{k=2}^{79} \frac{k}{k+1} = \frac{2}{3} \cdot \frac{3}{4} \cdot \frac{4}{5} \cdots \frac{79}{80} = \frac{2}{80} = \frac{1}{40}$. This occurrence, when nearly all terms ultimately cancel each other out, is called telescoping.

4. $\sum_{k=40}^{80} \ln k = \ln \left( \prod_{k=40}^{80} k \right)$, if you recall that ln(a) + ln(b) = ln(ab).
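
These examples can be checked with Python's `math.prod`, which multiplies the items of an iterable much as `sum` adds them (it is available in Python 3.8 and later). The variable names below are illustrative, and the logarithm identity is compared with a floating-point tolerance since `math.log` is not exact.

```python
from fractions import Fraction
from math import prod, factorial, log, isclose

# Example 1: the product of 1, 2, ..., n is n! (checked here for n = 5).
factorial_check = prod(range(1, 6)) == factorial(5)

# Example 3: the telescoping product over k = 2, ..., 79.
telescoped = prod(Fraction(k, k + 1) for k in range(2, 80))

# Example 4: a sum of logarithms equals the logarithm of the product.
log_sum = sum(log(k) for k in range(40, 81))
log_of_product = log(prod(range(40, 81)))

print(factorial_check, telescoped, isclose(log_sum, log_of_product))
```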

Theorem 4.4.10
Likewise, we can multiply two products together, and the power of the product is equal to the product of the powers.

1. $\prod_{k=m}^{n} a_k \cdot \prod_{k=m}^{n} b_k = \prod_{k=m}^{n} a_k b_k$.

2. $\left( \prod_{k=m}^{n} a_k \right)^r = \prod_{k=m}^{n} (a_k)^r$.

Exercise 4.4.11. Prove Theorem 4.4.10 (it should be relatively straightforward).


Chapter 5

Mathematical Induction

In this chapter, we will go over an essential technique of proof for certain claims, particularly when
the claim should be true for all positive integers (and variants).

5.1 Standard Applications

Recall the following valid argument form, which underlies the principle of mathematical induction.

P (1)
∀n ∈ Z+ , P (n) → P (n + 1)
∴ ∀n ∈ Z+ , P (n).

We have already demonstrated that this argument is valid at the end of Chapter 1. If we show
that P (1) is true and the implication is true, then we get P (2), P (3), P (4), etc. for the rest of the
positive integers.
Suppose you have a statement you are trying to prove for all positive integers n. Here are the
steps you should take in a proof by induction:

1. Prove the statement for n = 1. This is called the base case.

2. Assume that the statement is true for n = k. This is called the inductive hypothesis.

3. Prove that if the inductive hypothesis is true, then the statement is also true for n = k + 1.
This is the inductive step.

This kind of proof may seem mechanical, but this established, reliable structure is what makes
induction proofs relatively straightforward.

Example 5.1.1
Prove $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$.

Proof. For clarity, let P(n) denote the assertion that $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$.
Base Case: P(1): $\sum_{k=1}^{1} k = \frac{1(1+1)}{2} = 1$.
We have 1 = 1, which is clearly true.
Inductive Step: Suppose P(n): $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$, i.e. $1 + 2 + 3 + \dots + n = \frac{n(n+1)}{2}$, is true.
We want to prove that P(n + 1): $1 + 2 + 3 + \dots + n + (n+1) = \frac{(n+1)(n+2)}{2}$ is true.
We have
\begin{align*}
1 + 2 + 3 + \dots + n + (n+1) &= \frac{n(n+1)}{2} + (n+1) \\
&= (n+1)\left(\frac{n}{2} + 1\right) \\
&= (n+1)\left(\frac{n+2}{2}\right) \\
&= \frac{(n+1)(n+2)}{2}.
\end{align*}

Therefore our inductive step holds and we are done.


Problem 5.1.2. Prove $\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$.

Proof. Let $P(n) : \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$.
Base Case: $P(1) : \sum_{k=1}^{1} k^2 = \frac{1 \cdot 2 \cdot 3}{6} = 1$.
We have $1^2 = 1$, so we’re done.
Inductive Step: Suppose $P(n) : \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$ is true.
We want to prove that $P(n+1) : 1^2 + 2^2 + 3^2 + \ldots + n^2 + (n+1)^2 = \frac{(n+1)(n+2)(2n+3)}{6}$.
We have
\begin{align*}
1^2 + 2^2 + 3^2 + \ldots + n^2 + (n+1)^2 &= \frac{n(n+1)(2n+1)}{6} + (n+1)^2 \\
&= (n+1)\left(\frac{n(2n+1)}{6} + (n+1)\right) \\
&= (n+1)\left(\frac{2n^2+n}{6} + \frac{6n+6}{6}\right) \\
&= (n+1)\left(\frac{2n^2+7n+6}{6}\right) \\
&= (n+1)\left(\frac{(2n+3)(n+2)}{6}\right) \\
&= \frac{(n+1)(n+2)(2n+3)}{6}.
\end{align*}

Our inductive step holds, so we are done.

Problem 5.1.3.

1. Prove $\sum_{i=1}^{n} i^3 = \frac{n^2(n+1)^2}{4}$ by induction.

2. Then compute $\sum_{i=1}^{10} \left(2i^3 - 3i^2 + 5i - 7\right)$.

Proof. For the induction proof, let $P(n) : \sum_{i=1}^{n} i^3 = \frac{n^2(n+1)^2}{4}$.
Base Case: $P(1) : \sum_{i=1}^{1} i^3 = \frac{1^2(1+1)^2}{4} = \frac{4}{4} = 1$.
We have $1^3 = 1$, which is clearly true.
Inductive Step: Suppose $P(n) : \sum_{i=1}^{n} i^3 = \frac{n^2(n+1)^2}{4}$.
We must prove that $P(n+1) : 1^3 + 2^3 + 3^3 + \ldots + n^3 + (n+1)^3 = \frac{(n+1)^2(n+2)^2}{4}$.
Then, note that
\begin{align*}
1^3 + 2^3 + 3^3 + \ldots + n^3 + (n+1)^3 &= \frac{n^2(n+1)^2}{4} + (n+1)^3 \\
&= (n+1)^2\left(\frac{n^2}{4} + (n+1)\right) \\
&= (n+1)^2\left(\frac{n^2}{4} + \frac{4n+4}{4}\right) \\
&= (n+1)^2\left(\frac{n^2+4n+4}{4}\right) \\
&= (n+1)^2\left(\frac{(n+2)^2}{4}\right) \\
&= \frac{(n+1)^2(n+2)^2}{4},
\end{align*}

which completes the proof.


For part 2, simply split up the sum as follows:
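\begin{align*}
\sum_{i=1}^{10} \left(2i^3 - 3i^2 + 5i - 7\right) &= 2\sum_{i=1}^{10} i^3 - 3\sum_{i=1}^{10} i^2 + 5\sum_{i=1}^{10} i - \sum_{i=1}^{10} 7 \\
&= 2 \cdot \frac{10^2 \cdot 11^2}{4} - 3 \cdot \frac{10 \cdot 11 \cdot 21}{6} + 5 \cdot \frac{10 \cdot 11}{2} - 70 \\
&= 6050 - 1155 + 275 - 70 \\
&= 5100.
\end{align*}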
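
The value in part 2 can be cross-checked by computing the sum both ways. The following is a small sketch; the function names are illustrative, and the closed forms are the three formulas proven in this section.

```python
def power_sum_formulas(n):
    """Closed forms for the sums of i, i^2 and i^3 over i = 1..n."""
    s1 = n * (n + 1) // 2
    s2 = n * (n + 1) * (2 * n + 1) // 6
    s3 = n * n * (n + 1) ** 2 // 4
    return s1, s2, s3

def direct_sum(n):
    """Add up 2i^3 - 3i^2 + 5i - 7 term by term."""
    return sum(2 * i**3 - 3 * i**2 + 5 * i - 7 for i in range(1, n + 1))

def split_sum(n):
    """Split the sum into four pieces and use the closed forms."""
    s1, s2, s3 = power_sum_formulas(n)
    return 2 * s3 - 3 * s2 + 5 * s1 - 7 * n

print(direct_sum(10), split_sum(10))  # 5100 5100
```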
Problem 5.1.4. Let $a_n = \sum_{i=1}^{n} \frac{1}{i(i+1)}$.

1. Compute $a_1$, $a_2$, $a_3$, $a_4$, and $a_5$.
2. Hypothesize a formula for $a_n$.
3. Prove that formula using induction.
4. What is $\sum_{i=1}^{\infty} \frac{1}{i(i+1)}$?

Solution.

1. This is just direct computation of the summation.


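\begin{align*}
a_1 &= \frac{1}{2} \\
a_2 &= \frac{1}{2} + \frac{1}{6} = \frac{2}{3} \\
a_3 &= \frac{1}{2} + \frac{1}{6} + \frac{1}{12} = \frac{3}{4} \\
a_4 &= \frac{1}{2} + \frac{1}{6} + \frac{1}{12} + \frac{1}{20} = \frac{4}{5} \\
a_5 &= \frac{1}{2} + \frac{1}{6} + \frac{1}{12} + \frac{1}{20} + \frac{1}{30} = \frac{5}{6}
\end{align*}

2. Noticing that the denominator is always one more than the numerator, we can hypothesize that the formula is $a_n = \frac{n}{n+1}$.
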
3. We proceed to prove $\sum_{i=1}^{n} \frac{1}{i(i+1)} = \frac{n}{n+1}$ by induction.
Let $P(n) : \sum_{i=1}^{n} \frac{1}{i(i+1)} = \frac{n}{n+1}$.
Base Case: $P(1) : \sum_{i=1}^{1} \frac{1}{i(i+1)} = \frac{1}{2}$.
We also have that $\frac{n}{n+1} = \frac{1}{1+1} = \frac{1}{2}$, so the base case is true.
Inductive Step: Suppose that $P(n) : \sum_{i=1}^{n} \frac{1}{i(i+1)} = \frac{n}{n+1}$.
We wish to prove that $P(n+1) : \sum_{i=1}^{n+1} \frac{1}{i(i+1)} = \frac{n+1}{n+2}$, i.e.
\[
\frac{1}{1(1+1)} + \frac{1}{2(2+1)} + \ldots + \frac{1}{n(n+1)} + \frac{1}{(n+1)(n+2)} = \frac{n+1}{n+2}.
\]
Note that
\begin{align*}
\frac{1}{1(1+1)} + \frac{1}{2(2+1)} + \ldots + \frac{1}{n(n+1)} + \frac{1}{(n+1)(n+2)} &= \frac{n}{n+1} + \frac{1}{(n+1)(n+2)} \\
&= \frac{n(n+2) + 1}{(n+1)(n+2)} \\
&= \frac{n^2 + 2n + 1}{(n+1)(n+2)} \\
&= \frac{(n+1)^2}{(n+1)(n+2)} \\
&= \frac{n+1}{n+2}.
\end{align*}

This completes the inductive step, and we’re done.

4. Recall that we can rewrite an infinite summation as the limit of a finite summation:
\[
\sum_{i=1}^{\infty} \frac{1}{i(i+1)} = \lim_{n \to \infty} \sum_{i=1}^{n} \frac{1}{i(i+1)} = \lim_{n \to \infty} \frac{n}{n+1}.
\]
Divide the numerator and denominator by $n$:
\[
\lim_{n \to \infty} \frac{n}{n+1} = \lim_{n \to \infty} \frac{1}{1 + \frac{1}{n}}.
\]
Then $\frac{1}{n}$ tends to 0 as $n$ gets arbitrarily large, so we are left with $\frac{1}{1+0} = 1$. Therefore $\sum_{i=1}^{\infty} \frac{1}{i(i+1)} = 1$.
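
The partial sums in this problem are easy to reproduce exactly. The sketch below uses exact fractions to confirm the computed values, the hypothesized formula on a range of $n$, and that the gap to 1 shrinks as $\frac{1}{n+1}$; it checks finitely many cases only, so it illustrates the result rather than proving it.

```python
from fractions import Fraction

def a(n):
    """Exact partial sum 1/(1*2) + 1/(2*3) + ... + 1/(n*(n+1))."""
    return sum(Fraction(1, i * (i + 1)) for i in range(1, n + 1))

for n in range(1, 6):
    print(n, a(n))  # 1/2, 2/3, 3/4, 4/5, 5/6 in turn

# Distance from the limiting value 1 after 1000 terms.
print(1 - a(1000))  # 1/1001
```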
 
Problem 5.1.5. Let $f(n) = \prod_{k=1}^{n} \left(1 - \frac{1}{k+1}\right)$.

i) Compute $f(n)$ for $n = 1, 2, 3, 4, 5$.

ii) Hypothesize a formula.

iii) Prove your formula by induction.

iv) Compute $\prod_{k=1}^{\infty} \left(1 - \frac{1}{k+1}\right)$.

Solution.

i) This time, we have a product instead of a summation to evaluate.


 
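\begin{align*}
f(1) &= 1 - \frac{1}{2} = \frac{1}{2} \\
f(2) &= \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right) = \frac{1}{2} \cdot \frac{2}{3} = \frac{1}{3} \\
f(3) &= \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{4}\right) = \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4} = \frac{1}{4} \\
f(4) &= \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{4}\right)\left(1 - \frac{1}{5}\right) = \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4} \cdot \frac{4}{5} = \frac{1}{5} \\
f(5) &= \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{4}\right)\left(1 - \frac{1}{5}\right)\left(1 - \frac{1}{6}\right) = \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4} \cdot \frac{4}{5} \cdot \frac{5}{6} = \frac{1}{6}
\end{align*}

ii) The pattern of the first five terms suggests that the formula is $f(n) = \frac{1}{n+1}$.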
 
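iii) The statement we must prove is: $\prod_{k=1}^{n} \left(1 - \frac{1}{k+1}\right) = \frac{1}{n+1}$.
Let $P(n) : \prod_{k=1}^{n} \left(1 - \frac{1}{k+1}\right) = \frac{1}{n+1}$.
Base Case: $P(1) : \prod_{k=1}^{1} \left(1 - \frac{1}{k+1}\right) = 1 - \frac{1}{2} = \frac{1}{2}$.
But $\frac{1}{n+1}$ where $n = 1$ is $\frac{1}{2}$, so the base case is true.
Inductive Step: Assume $P(n) : \prod_{k=1}^{n} \left(1 - \frac{1}{k+1}\right) = \frac{1}{n+1}$.
We want to prove: $P(n+1) : \prod_{k=1}^{n+1} \left(1 - \frac{1}{k+1}\right) = \frac{1}{n+2}$.
We have
\begin{align*}
\left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{4}\right) \cdots \left(1 - \frac{1}{n+1}\right)\left(1 - \frac{1}{n+2}\right) &= \frac{1}{n+1}\left(1 - \frac{1}{n+2}\right) \\
&= \left(\frac{1}{n+1}\right)\left(\frac{n+1}{n+2}\right) \\
&= \frac{1}{n+2}.
\end{align*}

So the statement for $P(n+1)$ is true, which concludes the proof by induction.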

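iv) Like before, we can rewrite an infinite product as the limit of a finite product, and evaluate as such:
\[
\prod_{k=1}^{\infty} \left(1 - \frac{1}{k+1}\right) = \lim_{n \to \infty} \prod_{k=1}^{n} \left(1 - \frac{1}{k+1}\right) = \lim_{n \to \infty} \frac{1}{n+1} = 0.
\]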
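
The same kind of exact spot check works for the product. This is a sketch for illustration only; `f` mirrors the definition in the problem, and the loop confirms the conjectured formula for finitely many $n$.

```python
from fractions import Fraction
from math import prod

def f(n):
    """Exact value of the product of (1 - 1/(k+1)) for k = 1..n."""
    return prod((1 - Fraction(1, k + 1) for k in range(1, n + 1)), start=Fraction(1))

for n in range(1, 6):
    print(n, f(n))  # 1/2, 1/3, 1/4, 1/5, 1/6 in turn
```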

However, proofs by induction are not limited to summations. For the next few problems, we
establish a few facts about divisibility first.

Lemma 5.1.6 (Divisibility Lemmas)


Recall that a | b means “a divides b,” or “a is a factor of b.” Then,

1. a | b ∧ b | c −→ a | c

2. a | b ∧ a | c −→ a | (b + c)

3. a | b ∧ a | c −→ a | (b − c)

4. a | b −→ a | (bc)

Problem 5.1.7. Using induction, prove $\forall n \in \mathbb{Z}^+$, $7 \mid (8^n - 1)$.

Proof. Let $P(n) : 7 \mid (8^n - 1)$.

Base Case: $P(1) : 7 \mid (8^1 - 1)$, i.e. $7 \mid 7$, which is true.
Inductive Step: We assume $P(n) : 7 \mid (8^n - 1)$. We wish to prove $P(n+1) : 7 \mid (8^{n+1} - 1)$.
Note that $8^{n+1} = 8^n \cdot 8$. To prove $P(n+1)$, note that $7 \mid (8^n - 1) \to 7 \mid (8 \cdot (8^n - 1))$, i.e. $7 \mid (8^{n+1} - 8)$. As $7 \mid 7$, we have $7 \mid ((8^{n+1} - 8) + 7)$, i.e. $7 \mid (8^{n+1} - 1)$, and we’re done.

Problem 5.1.8. Prove $\forall n \in \mathbb{Z}^+$, $5 \mid (8^n - 3^n)$.

Proof. $\forall n \in \mathbb{Z}^+$, let $P(n) : 5 \mid (8^n - 3^n)$.

Base Case: $P(1) : 5 \mid (8^1 - 3^1) \to 5 \mid (8 - 3)$, i.e. $5 \mid 5$, which is true.
Inductive Step: Suppose $P(n) : 5 \mid (8^n - 3^n)$.
We wish to prove $P(n+1) : 5 \mid (8^{n+1} - 3^{n+1})$.
Note that $5 \mid (8^n - 3^n) \to 5 \mid (8 \cdot (8^n - 3^n))$, i.e. $5 \mid (8^{n+1} - 8 \cdot 3^n)$.
Using the fact that $3^{n+1} = 3 \cdot 3^n$ and $5 \mid (5 \cdot 3^n)$, we have
\begin{align*}
5 \mid (5 \cdot 3^n) \wedge 5 \mid (8^{n+1} - 8 \cdot 3^n) &\to 5 \mid ((8^{n+1} - 8 \cdot 3^n) + (5 \cdot 3^n)) \\
&\text{i.e. } 5 \mid (8^{n+1} - 3 \cdot 3^n) \\
&\text{i.e. } 5 \mid (8^{n+1} - 3^{n+1}).
\end{align*}

We have reached the conclusion that $P(n+1) : 5 \mid (8^{n+1} - 3^{n+1})$ is true, so our proof is complete.

Problem 5.1.9. Prove for all positive odd integers $n$, $11 \mid (8^n + 3^n)$.

Proof. There is something different about this problem - we must prove it for all positive odd integers $n$. In this case, it suffices to prove $P(1)$ and then $P(n) \to P(n+2)$.
Let $P(n) : 11 \mid (8^n + 3^n)$.
Base Case: $P(1) : 11 \mid (8^1 + 3^1) \to 11 \mid 11$, which is true.
Inductive Step: Suppose that $P(n) : 11 \mid (8^n + 3^n)$ is true.
Then we must prove $P(n+2) : 11 \mid (8^{n+2} + 3^{n+2})$.

First note that $11 \mid (8^n + 3^n) \to 11 \mid (8^2 (8^n + 3^n))$, i.e. $11 \mid (8^{n+2} + 64 \cdot 3^n)$.

However, notice that $64 = 55 + 9$, then we have
\begin{align*}
11 \mid (8^{n+2} + 64 \cdot 3^n) &\to 11 \mid (8^{n+2} + (55 + 9) \cdot 3^n) \\
&\text{i.e. } 11 \mid (8^{n+2} + 55 \cdot 3^n + 9 \cdot 3^n).
\end{align*}
As $11 \mid 55$, it must also be true that $11 \mid (55 \cdot 3^n)$, therefore
\begin{align*}
11 \mid (55 \cdot 3^n) \wedge 11 \mid (8^{n+2} + 55 \cdot 3^n + 9 \cdot 3^n) &\to 11 \mid ((8^{n+2} + 55 \cdot 3^n + 9 \cdot 3^n) - 55 \cdot 3^n) \\
&\text{i.e. } 11 \mid (8^{n+2} + 9 \cdot 3^n) \\
&\text{i.e. } 11 \mid (8^{n+2} + 3^{n+2}),
\end{align*}
and our proof by induction is complete.

Problem 5.1.10. Prove that for all $n$ which are positive odd multiples of 3, $91 \mid (3^n + 4^n)$.

Proof. Let $P(n) : 91 \mid (3^n + 4^n)$.

Base Case: Since we are talking about positive odd multiples of 3, our base case is $n = 3$. We have $P(3) : 91 \mid (3^3 + 4^3)$, or $91 \mid 91$, which is true.
Inductive Step: For a given positive odd multiple of 3, say $n$, we must increment it by 6 to get to the next positive odd multiple of 3. Thus, we want to demonstrate $P(n) \to P(n+6)$.
Suppose $P(n) : 91 \mid (3^n + 4^n)$. We must prove that $P(n+6) : 91 \mid (3^{n+6} + 4^{n+6})$.
First, we can show that $91 \mid (3^n + 4^n) \to 91 \mid (4^6 (3^n + 4^n))$, i.e. $91 \mid (4^6 \cdot 3^n + 4^{n+6})$.
Note that $4^6 - 3^6 = 4096 - 729 = 3367 = (4^3 + 3^3)(4^3 - 3^3) = 91 \cdot 37$.
Therefore, $91 \mid 3367$ or $91 \mid (4^6 - 3^6)$, i.e. $91 \mid (4^6 \cdot 3^n - 3^6 \cdot 3^n)$. Then we have
\begin{align*}
91 \mid (4^6 \cdot 3^n - 3^6 \cdot 3^n) \wedge 91 \mid (4^6 \cdot 3^n + 4^{n+6}) &\to 91 \mid (4^6 \cdot 3^n - (4^6 \cdot 3^n - 3^6 \cdot 3^n) + 4^{n+6}) \\
&\text{i.e. } 91 \mid (3^6 \cdot 3^n + 4^{n+6}) \\
&\text{i.e. } 91 \mid (3^{n+6} + 4^{n+6}).
\end{align*}

The statement $P(n+6) : 91 \mid (3^{n+6} + 4^{n+6})$ is true, and therefore our inductive step holds, and the proof is done.

Problem 5.1.11. Prove $\forall n \in \mathbb{Z}^+$, $3 \mid (n^3 - n)$.

Proof. Let $P(n) : 3 \mid (n^3 - n)$.

Base Case: $P(1) : 3 \mid (1 - 1)$, i.e. $3 \mid 0$, which is true.
Inductive Step: Suppose $P(n) : 3 \mid (n^3 - n)$. We must prove $P(n+1) : 3 \mid ((n+1)^3 - (n+1))$.
We know that $3 \mid (3 \cdot (n^2 + n))$, i.e. $3 \mid (3n^2 + 3n)$. Therefore,
\begin{align*}
3 \mid (3n^2 + 3n) \wedge 3 \mid (n^3 - n) &\to 3 \mid (n^3 - n + 3n^2 + 3n) \\
&\text{i.e. } 3 \mid (n^3 + 3n^2 + 3n - n) \\
&\text{i.e. } 3 \mid (n^3 + 3n^2 + 3n + 1 - n - 1) \\
&\text{i.e. } 3 \mid ((n+1)^3 - (n+1)).
\end{align*}

We have proven $P(n+1)$, which completes the inductive step, so we are done.
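
Before attempting an induction proof of a divisibility claim, it can help to test the claim on many values. The sketch below checks Problems 5.1.7 through 5.1.11 over a finite range, stepping by 2 or 6 exactly as the corresponding inductive steps do; the helper `divides` and the range bound 200 are choices made for this illustration. Passing these checks is evidence, not a proof.

```python
def divides(a, b):
    """Return True when a | b, i.e. b is an integer multiple of a."""
    return b % a == 0

checks = {
    "5.1.7": all(divides(7, 8**n - 1) for n in range(1, 200)),
    "5.1.8": all(divides(5, 8**n - 3**n) for n in range(1, 200)),
    # odd n only: start at 1 and step by 2
    "5.1.9": all(divides(11, 8**n + 3**n) for n in range(1, 200, 2)),
    # odd multiples of 3 only: start at 3 and step by 6
    "5.1.10": all(divides(91, 3**n + 4**n) for n in range(3, 200, 6)),
    "5.1.11": all(divides(3, n**3 - n) for n in range(1, 200)),
}
print(checks)

# The restriction to odd n in Problem 5.1.9 matters: n = 2 gives 73.
print(divides(11, 8**2 + 3**2))  # False
```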
Problem 5.1.12. Prove $\sum_{i=1}^{n} \frac{1}{i^2} \le 2 - \frac{1}{n}$ $\forall n \in \mathbb{Z}^+$ (and by implication $\sum_{i=1}^{n} \frac{1}{i^2} < 2$).

Proof. Let $P(n) : \sum_{i=1}^{n} \frac{1}{i^2} \le 2 - \frac{1}{n}$.
Base Case: $P(1) : \sum_{i=1}^{1} \frac{1}{i^2} = 1 \le 2 - \frac{1}{1}$.
We have $1 \le 1$, which is true.
Inductive Step: Assume $P(n) : \sum_{i=1}^{n} \frac{1}{i^2} \le 2 - \frac{1}{n}$.
We wish to prove that $P(n+1) : \sum_{i=1}^{n+1} \frac{1}{i^2} \le 2 - \frac{1}{n+1}$.
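In other words, given $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \ldots + \frac{1}{n^2} \le 2 - \frac{1}{n}$, we should prove that $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \ldots + \frac{1}{n^2} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1}$. As a first step, we can manipulate our assumption into something similar to our goal, as follows:
\begin{align*}
\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \ldots + \frac{1}{n^2} &\le 2 - \frac{1}{n} \quad \text{(by inductive hypothesis)} \\
\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \ldots + \frac{1}{n^2} + \frac{1}{(n+1)^2} &\le 2 - \frac{1}{n} + \frac{1}{(n+1)^2}
\end{align*}

How would we use this inequality to get a result like $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \ldots + \frac{1}{n^2} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1}$? Try to find a relation between $2 - \frac{1}{n} + \frac{1}{(n+1)^2}$ and $2 - \frac{1}{n+1}$ that would quickly enable us to finish the proof.
In fact, we want the inequality $2 - \frac{1}{n} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1}$ to be true.
Why? If the above is true, then $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \ldots + \frac{1}{n^2} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1} \implies \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \ldots + \frac{1}{n^2} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1}$, allowing us to quickly finish the proof.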
Indeed, this is true, and the proof is as follows:
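As $n$ is positive, $n^2 + 2n + 1 \ge n^2 + 2n$. Then, we have
\begin{align*}
(n+1)^2 &\ge n(n+2) \\
1 &\ge \frac{n(n+2)}{(n+1)^2} \\
\frac{1}{n} &\ge \frac{n+2}{(n+1)^2} \\
\frac{1}{n} &\ge \frac{(n+1)+1}{(n+1)^2} \\
\frac{1}{n} &\ge \frac{n+1}{(n+1)^2} + \frac{1}{(n+1)^2} \\
\frac{1}{n} &\ge \frac{1}{n+1} + \frac{1}{(n+1)^2} \\
\frac{1}{n} - \frac{1}{(n+1)^2} &\ge \frac{1}{n+1} \\
-\frac{1}{n} + \frac{1}{(n+1)^2} &\le -\frac{1}{n+1} \\
\therefore\ 2 - \frac{1}{n} + \frac{1}{(n+1)^2} &\le 2 - \frac{1}{n+1}
\end{align*}

Since $2 - \frac{1}{n} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1}$, we have therefore proven that $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \ldots + \frac{1}{n^2} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1}$. Our inductive step holds true, and so our proof is complete.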
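
Both the claim and the auxiliary inequality used in the inductive step can be checked exactly for small $n$. The following sketch uses exact fractions; the function names and the range bound are assumptions of the illustration, and the check covers only finitely many cases.

```python
from fractions import Fraction

def partial_sum(n):
    """Exact value of 1/1^2 + 1/2^2 + ... + 1/n^2."""
    return sum(Fraction(1, i * i) for i in range(1, n + 1))

def bound(n):
    """The right-hand side 2 - 1/n of the claim."""
    return 2 - Fraction(1, n)

def key_step_holds(n):
    """Check 2 - 1/n + 1/(n+1)^2 <= 2 - 1/(n+1), the inequality the proof needs."""
    return bound(n) + Fraction(1, (n + 1) ** 2) <= bound(n + 1)

print(all(partial_sum(n) <= bound(n) for n in range(1, 100)))
print(all(key_step_holds(n) for n in range(1, 100)))
```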
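Problem 5.1.13. Prove $\sum_{k=1}^{n} \frac{1}{\sqrt{k}} < 2\sqrt{n}$, $\forall n \in \mathbb{Z}^+$.

Proof. Let $P(n) : \sum_{k=1}^{n} \frac{1}{\sqrt{k}} < 2\sqrt{n}$.
Base Case: $P(1) : \sum_{k=1}^{1} \frac{1}{\sqrt{k}} = 1$.
Then 1 is less than $2\sqrt{1} = 2$, so the base case is true.
Inductive Step: Suppose $P(n) : \sum_{k=1}^{n} \frac{1}{\sqrt{k}} < 2\sqrt{n}$ is true.
We want to prove: $P(n+1) : \sum_{k=1}^{n+1} \frac{1}{\sqrt{k}} < 2\sqrt{n+1}$.
By assumption we have
\[
\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \ldots + \frac{1}{\sqrt{n}} < 2\sqrt{n}.
\]
Add $\frac{1}{\sqrt{n+1}}$ to both sides. Then,
\[
\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \ldots + \frac{1}{\sqrt{n}} + \frac{1}{\sqrt{n+1}} < 2\sqrt{n} + \frac{1}{\sqrt{n+1}}.
\]
We need $2\sqrt{n} + \frac{1}{\sqrt{n+1}} < 2\sqrt{n+1}$ to be true in order to complete the proof. Below, we will prove this fact.
Note that $n \in \mathbb{Z}^+$. Then,
\begin{align*}
\left(\sqrt{n} - \sqrt{n+1}\right)^2 &> 0 \\
n - 2\sqrt{n(n+1)} + n + 1 &> 0 \\
2\sqrt{n(n+1)} &< 2n + 1 \\
2\sqrt{n(n+1)} + 1 &< 2n + 2 \\
2\sqrt{n(n+1)} + 1 &< 2(n+1) \\
2\sqrt{n}\sqrt{n+1}\left(\frac{1}{\sqrt{n+1}}\right) + 1\left(\frac{1}{\sqrt{n+1}}\right) &< 2(n+1)\left(\frac{1}{\sqrt{n+1}}\right) \\
2\sqrt{n} + \frac{1}{\sqrt{n+1}} &< 2(n+1)\left(\frac{\sqrt{n+1}}{n+1}\right) \\
\therefore\ 2\sqrt{n} + \frac{1}{\sqrt{n+1}} &< 2\sqrt{n+1}.
\end{align*}

Thus,
\[
\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \ldots + \frac{1}{\sqrt{n}} + \frac{1}{\sqrt{n+1}} < 2\sqrt{n} + \frac{1}{\sqrt{n+1}} < 2\sqrt{n+1},
\]
or
\[
\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \ldots + \frac{1}{\sqrt{n}} + \frac{1}{\sqrt{n+1}} < 2\sqrt{n+1},
\]
which concludes the inductive step, so our proof by induction is complete.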
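
A numerical look at this inequality is possible as well. In the sketch below, the sum is evaluated in floating point (so it is approximate), while the key step is tested in an equivalent exact integer form: multiplying $2\sqrt{n} + \frac{1}{\sqrt{n+1}} < 2\sqrt{n+1}$ by $\sqrt{n+1}$ and squaring the positive sides of $2\sqrt{n(n+1)} < 2n+1$ gives $4n(n+1) < (2n+1)^2$. The function names are illustrative.

```python
import math

def partial_sum(n):
    """Floating-point value of 1/sqrt(1) + 1/sqrt(2) + ... + 1/sqrt(n)."""
    return sum(1 / math.sqrt(k) for k in range(1, n + 1))

def key_step_exact(n):
    """Integer form of the key step: 4n(n+1) < (2n+1)^2."""
    return 4 * n * (n + 1) < (2 * n + 1) ** 2

print(all(partial_sum(n) < 2 * math.sqrt(n) for n in range(1, 300)))
print(all(key_step_exact(n) for n in range(1, 300)))
```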

Here is an assortment of results that are provable using induction.

Exercise 5.1.14. Prove $\forall n \in \mathbb{Z}^+$, $7 \mid (13^n - 6^n)$.

Exercise 5.1.15. Prove $\forall n \in \mathbb{Z}^+$, $4 \mid (5^n - 1)$.

Exercise 5.1.16. Prove $\forall n \in \mathbb{Z}^+$, $3 \mid (n^3 + 2n)$.

Exercise 5.1.17. Prove for all positive odd integers $n$, $3 \mid (2^n + 1)$.

Exercise 5.1.18. Prove $\forall n \in \mathbb{Z}^+$, $\sum_{k=0}^{n} 2^k = 2^{n+1} - 1$.

Exercise 5.1.19. Prove $\forall n \ge 0$, $7 \mid (5^{2n+1} + 2^{2n+1})$.

Exercise 5.1.20. Prove for all positive odd integers $n$, $8 \mid (n^2 - 1)$.

Exercise 5.1.21. Prove for all positive odd integers $n$, $16 \mid (n^4 - 1)$.

Exercise 5.1.22. Prove $\forall n \in \mathbb{Z}^+$, $6 \mid (17n^3 + 103n)$.

Exercise 5.1.23. Prove $\forall n \in \mathbb{Z}^+$, $\sum_{k=1}^{n} (2k-1)^2 = \frac{4n^3 - n}{3}$.

Exercise 5.1.24. Prove $\forall n \in \mathbb{Z}^+$, $\sum_{k=1}^{n} \frac{1}{(2k-1)(2k+1)} = \frac{n}{2n+1}$.

Exercise 5.1.25. Prove $\forall n \in \mathbb{Z}^+$, $5 \mid (6^n + 4)$.

Exercise 5.1.26. Prove $\forall n \in \mathbb{Z}^+$, $8 \mid (9^n - 1)$.

Exercise 5.1.27. Prove $\forall n \in \mathbb{Z}^+$, $3 \mid (5^n - 2^n)$.

Exercise 5.1.28. Prove Lemma 4.3.13. That is, given $\forall n \in \mathbb{N}_0$, $h \ge -1$, prove that $(1+h)^n \ge 1 + hn$.

Exercise 5.1.29. Prove that if $n$ is an integer greater than 3, then $n! > 2^n$.

Exercise 5.1.30. Prove $\forall n \in \mathbb{Z}^+$, $\prod_{k=1}^{n} \left(\frac{2k-1}{2k}\right) \le \frac{1}{\sqrt{3n+1}}$.

Induction is not necessarily limited to simply one inductive step. Consider the following argument
form:

\[
\begin{array}{l}
\forall n \in \mathbb{Z}^+,\ P(n) \to P(2n) \\
\forall n \in \mathbb{Z}^+ \text{ with } n \ge 2,\ P(n) \to P(n-1) \\
P(2) \\
\therefore\ \forall n \in \mathbb{Z}^+,\ P(n)
\end{array}
\]

Given $P(2)$, we know that $P(1)$ and $P(4)$ must be true. If $P(4)$ is true, then $P(3)$ is true.
Given $P(4)$, we know that $P(8)$ must be true. If $P(8)$ is true, then $P(7)$, $P(6)$, and $P(5)$ are true.
Given $P(8)$, we know that $P(16)$ must be true. If $P(16)$ is true, then $P(15)$, $\ldots$, all the way down to $P(9)$ are true.
In other words, we can use $P(n) \to P(2n)$ to increase our scope while $P(n) \to P(n-1)$ will take care of all the statements in the gap between $P(n)$ and $P(2n)$.
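
The reachability reasoning above can be simulated. The sketch below starts from the single known statement $P(2)$ and repeatedly applies the two rules, collecting every index it can reach; the function name and the search cap of `2 * limit` are assumptions of this illustration (doubling may need to overshoot `limit` before stepping back down).

```python
def cauchy_reachable(limit):
    """Indices n in 1..limit reachable from P(2) via n -> 2n and n -> n - 1."""
    cap = 2 * limit          # allow doubling past limit, then stepping down
    proven = {2}
    frontier = [2]
    while frontier:
        n = frontier.pop()
        for m in (2 * n, n - 1):
            if 1 <= m <= cap and m not in proven:
                proven.add(m)
                frontier.append(m)
    return {n for n in proven if n <= limit}

print(cauchy_reachable(20) == set(range(1, 21)))  # True
```

Every positive integer up to the limit is reached, which mirrors why the argument form covers all of $\mathbb{Z}^+$.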
Thus, it follows that this argument form, called Cauchy Induction, is indeed valid. This form
will aid us in proving the following major result:

Theorem 5.1.31 (AM-GM Inequality)


The Arithmetic Mean-Geometric Mean Inequality states that $\forall n \in \mathbb{Z}^+$ and $a_1, a_2, a_3, \ldots, a_n > 0$,
\[
\frac{a_1 + a_2 + a_3 + \ldots + a_n}{n} \ge \sqrt[n]{a_1 a_2 a_3 \cdots a_n},
\]
where the left hand side is the arithmetic mean and the right hand side is the geometric mean.


Proof. Let $P(n) : \frac{a_1 + a_2 + a_3 + \ldots + a_n}{n} \ge \sqrt[n]{a_1 a_2 a_3 \cdots a_n}$.
Base Case: We want to prove that $P(2) : \frac{a_1 + a_2}{2} \ge \sqrt{a_1 a_2}$.
First, as the square of any number is nonnegative, we know that $(a_1 - a_2)^2 \ge 0$. Then,
\begin{align*}
a_1^2 - 2a_1 a_2 + a_2^2 &\ge 0 \\
a_1^2 + 2a_1 a_2 + a_2^2 &\ge 4a_1 a_2 \\
(a_1 + a_2)^2 &\ge 4a_1 a_2.
\end{align*}

As both sides are positive, we can take the square root to get $a_1 + a_2 \ge 2\sqrt{a_1 a_2}$, and this rearranges to the desired inequality, proving our base case.
Inductive Step: Assume $P(n) : \frac{a_1 + a_2 + a_3 + \ldots + a_n}{n} \ge \sqrt[n]{a_1 a_2 a_3 \cdots a_n}$ is true.
First, we will prove $P(n) \to P(2n)$.
Consider the positive terms $\frac{a_1 + a_2 + a_3 + \ldots + a_n}{n}$ and $\frac{b_1 + b_2 + b_3 + \ldots + b_n}{n}$, where $b_1, b_2, \ldots, b_n > 0$. Applying the base case to these two terms, we have
\[
\frac{\frac{a_1 + a_2 + a_3 + \ldots + a_n}{n} + \frac{b_1 + b_2 + b_3 + \ldots + b_n}{n}}{2} \ge \sqrt{\frac{a_1 + a_2 + a_3 + \ldots + a_n}{n} \cdot \frac{b_1 + b_2 + b_3 + \ldots + b_n}{n}},
\]
which simplifies to
\[
\frac{a_1 + a_2 + \ldots + a_n + b_1 + b_2 + \ldots + b_n}{2n} \ge \sqrt{\frac{a_1 + a_2 + \ldots + a_n}{n} \cdot \frac{b_1 + b_2 + \ldots + b_n}{n}}.
\]

By assumption of $P(n)$, we know that $\frac{a_1 + a_2 + a_3 + \ldots + a_n}{n} \ge \sqrt[n]{a_1 a_2 a_3 \cdots a_n}$, and similarly, $\frac{b_1 + b_2 + b_3 + \ldots + b_n}{n} \ge \sqrt[n]{b_1 b_2 b_3 \cdots b_n}$.
As all terms are positive, we can multiply the inequalities together to get
\[
\frac{a_1 + a_2 + a_3 + \ldots + a_n}{n} \cdot \frac{b_1 + b_2 + b_3 + \ldots + b_n}{n} \ge \sqrt[n]{a_1 a_2 a_3 \cdots a_n} \cdot \sqrt[n]{b_1 b_2 b_3 \cdots b_n}.
\]

Take the square root of both sides to get
\[
\sqrt{\frac{a_1 + a_2 + a_3 + \ldots + a_n}{n} \cdot \frac{b_1 + b_2 + b_3 + \ldots + b_n}{n}} \ge \sqrt{\sqrt[n]{a_1 a_2 a_3 \cdots a_n} \cdot \sqrt[n]{b_1 b_2 b_3 \cdots b_n}}.
\]

Thus, we have
\begin{align*}
\frac{a_1 + a_2 + \ldots + a_n + b_1 + b_2 + \ldots + b_n}{2n} &\ge \sqrt{\frac{a_1 + a_2 + \ldots + a_n}{n} \cdot \frac{b_1 + b_2 + \ldots + b_n}{n}} \\
&\ge \sqrt{\sqrt[n]{a_1 a_2 a_3 \cdots a_n} \cdot \sqrt[n]{b_1 b_2 b_3 \cdots b_n}} \\
&= \sqrt[n]{\sqrt{a_1 a_2 a_3 \cdots a_n b_1 b_2 b_3 \cdots b_n}} \\
&= \sqrt[2n]{a_1 a_2 a_3 \cdots a_n b_1 b_2 b_3 \cdots b_n}.
\end{align*}

We have now established that
\[
P(2n) : \frac{a_1 + a_2 + a_3 + \ldots + a_n + b_1 + b_2 + b_3 + \ldots + b_n}{2n} \ge \sqrt[2n]{a_1 a_2 a_3 \cdots a_n b_1 b_2 b_3 \cdots b_n},
\]
so we conclude the proof of our first inductive step.
Next, we need to demonstrate that P (n) → P (n − 1).
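Consider the terms $a_1, a_2, a_3, \ldots, a_{n-1}$, and $\frac{a_1 + a_2 + \ldots + a_{n-1}}{n-1}$, where the last term is the arithmetic mean of the previous terms. By assumption of $P(n)$, we have
\[
\frac{a_1 + a_2 + a_3 + \ldots + a_{n-1} + \frac{a_1 + a_2 + \ldots + a_{n-1}}{n-1}}{n} \ge \sqrt[n]{a_1 a_2 a_3 \cdots a_{n-1} \cdot \left(\frac{a_1 + a_2 + \ldots + a_{n-1}}{n-1}\right)}.
\]

Let $S = a_1 + a_2 + \ldots + a_{n-1}$. Then we rewrite the above as
\[
\frac{S + \frac{S}{n-1}}{n} \ge \sqrt[n]{a_1 a_2 a_3 \cdots a_{n-1} \cdot \left(\frac{S}{n-1}\right)}.
\]

We can simplify this expression further:
\begin{align*}
\frac{S + \frac{S}{n-1}}{n} &\ge \sqrt[n]{a_1 a_2 a_3 \cdots a_{n-1} \cdot \left(\frac{S}{n-1}\right)} \\
\frac{S}{n-1} &\ge \sqrt[n]{a_1 a_2 a_3 \cdots a_{n-1} \cdot \left(\frac{S}{n-1}\right)} \\
\frac{\frac{S}{n-1}}{\left(\frac{S}{n-1}\right)^{\frac{1}{n}}} &\ge (a_1 a_2 \cdots a_{n-1})^{\frac{1}{n}} \\
\left(\frac{S}{n-1}\right)^{\frac{n-1}{n}} &\ge (a_1 a_2 \cdots a_{n-1})^{\frac{1}{n}}.
\end{align*}
Take both sides to the $\frac{n}{n-1}$ power.
\begin{align*}
\left(\left(\frac{S}{n-1}\right)^{\frac{n-1}{n}}\right)^{\frac{n}{n-1}} &\ge \left((a_1 a_2 \cdots a_{n-1})^{\frac{1}{n}}\right)^{\frac{n}{n-1}} \\
\frac{S}{n-1} &\ge (a_1 a_2 \cdots a_{n-1})^{\frac{1}{n-1}} \\
\frac{S}{n-1} &\ge \sqrt[n-1]{a_1 a_2 \cdots a_{n-1}} \\
\therefore\ \frac{a_1 + a_2 + \ldots + a_{n-1}}{n-1} &\ge \sqrt[n-1]{a_1 a_2 \cdots a_{n-1}}.
\end{align*}
We obtain the result $\frac{a_1 + a_2 + \ldots + a_{n-1}}{n-1} \ge \sqrt[n-1]{a_1 a_2 \cdots a_{n-1}}$, proving that $P(n-1)$ is true.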
It has been proven that $P(n) \to P(2n)$ and $P(n) \to P(n-1)$. The base case, $P(2)$, was already proven. Thus, by Cauchy Induction, $P(n)$ holds for all $n \in \mathbb{Z}^+$, and we may conclude the proof.
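
To build some intuition for the inequality, one can compare the two means on random positive inputs. This is a floating-point sketch, so a small relative tolerance is allowed for rounding; the sample sizes, value range, seed, and tolerance are arbitrary choices for the illustration.

```python
import math
import random

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    # exp of the average logarithm equals the n-th root of the product
    # and avoids overflow in the raw product
    return math.exp(sum(math.log(v) for v in values) / len(values))

def am_gm_holds(values, rel_tol=1e-12):
    """Check AM >= GM up to a small relative rounding tolerance."""
    return arithmetic_mean(values) >= geometric_mean(values) * (1 - rel_tol)

random.seed(0)
samples = [
    [random.uniform(0.01, 100.0) for _ in range(random.randint(1, 8))]
    for _ in range(1000)
]
print(all(am_gm_holds(s) for s in samples))
```

For example, with the values 2 and 8 the arithmetic mean is 5 while the geometric mean is 4.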

5.2 The Binomial Theorem


Relevant to our discussion of mathematical induction, we take a brief look into combinatorics, a
branch of mathematics that deals with counting objects in an efficient manner.

Definition 5.2.1. Let $n, k \in \mathbb{Z}$ such that $n \ge k \ge 0$. Then,
\[
\binom{n}{k} = \frac{n!}{k!(n-k)!} \quad \text{where } 0! = 1,
\]
and this is pronounced “n choose k.”

For example, if you had a bag of $n$ balls, there would be $\binom{n}{k}$ ways to choose $k$ out of $n$ balls from the bag.

Problem 5.2.2. How many different 3-member committees can be formed from a group of 15
people?

Solution. Out of the total 15 people, we want to select 3 of them. Then the total number of ways is
\[
\binom{15}{3} = \frac{15!}{3!\,12!} = \frac{15 \cdot 14 \cdot 13}{3 \cdot 2} = 455.
\]
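
The count can be reproduced directly from Definition 5.2.1. In this sketch, `choose` is an illustrative helper that follows the factorial formula, compared against Python's built-in `math.comb`.

```python
from math import comb, factorial

def choose(n, k):
    """n choose k computed from the factorial definition."""
    return factorial(n) // (factorial(k) * factorial(n - k))

print(choose(15, 3))  # 455
print(comb(15, 3))    # 455
```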

Theorem 5.2.3 (Pascal’s Identity)
\[
\binom{n}{k} + \binom{n}{k+1} = \binom{n+1}{k+1}
\]

Proof. We can go for an algebra-based approach:
\begin{align*}
\frac{n!}{k!(n-k)!} + \frac{n!}{(k+1)!(n-k-1)!} &= \frac{n!(k+1)}{k!(k+1)(n-k)!} + \frac{n!(n-k)}{(k+1)!(n-k-1)!(n-k)} \\
&= \frac{n!(k+1)}{(k+1)!(n-k)!} + \frac{n!(n-k)}{(k+1)!(n-k)!} \\
&= \frac{n!(k+1+n-k)}{(k+1)!(n-k)!} \\
&= \frac{n!(n+1)}{(k+1)!(n-k)!} \\
&= \frac{(n+1)!}{(k+1)!((n+1)-(k+1))!} \\
&= \binom{n+1}{k+1}.
\end{align*}

Alternative Proof. In fact, we can also prove this identity using a combinatorial argument.
Consider a group of $n+1$ people and we want to choose $k+1$ of them to form a committee.
Now focus on a particular person - call that person A.
We can either include or not include person A in the committee. If we do choose to include person A, then there are $\binom{n}{k}$ ways to choose $k$ remaining people for the committee out of the $n$ remaining people, since person A took one spot on the committee.
If we choose to exclude person A, then we still have to select $k+1$ people out of the remaining $n$ people, and the number of ways to do this is $\binom{n}{k+1}$.
As person A is definitely either in or not in the committee, there are no other possibilities. We also know that these possibilities do not overlap, i.e. they are mutually exclusive. Then, we can establish that adding $\binom{n}{k}$ and $\binom{n}{k+1}$ will give us the total number of ways to select a committee of $k+1$ people from $n+1$ people in total. Thus,
\[
\binom{n}{k} + \binom{n}{k+1} = \binom{n+1}{k+1}.
\]
 
Problem 5.2.4. Prove that $\binom{n}{k} \in \mathbb{Z}$, where $0 \le k \le n$ and $n, k \in \mathbb{Z}$.

Proof. At first, it seems unclear how to prove this by induction, because we have two variables $n$ and $k$ to deal with. One approach may use double induction, where we let $P(n, k) : \binom{n}{k} \in \mathbb{Z}$ and then prove $P(n, k) \to P(n+1, k)$ and $P(n, k) \to P(n, k+1)$.
However, this seems tedious, and in fact there is a more elegant solution. Let $Q(n) : \binom{n}{k} \in \mathbb{Z}$ for $k = 0, 1, 2, \ldots, n$. Then we only have to worry about $n$ as the variable in question. By letting $Q(n)$ refer to a collection of statements, as such:
\[
Q(n) \equiv P(n, 0) \wedge P(n, 1) \wedge \ldots \wedge P(n, n),
\]
if we are able to prove that $Q(n)$ is true $\forall n$, then $P(n, k)$ will be true $\forall n$ and $k \le n$. This will finish the proof cleanly.
Now, we proceed by induction on $n$ of $Q(n)$. (The case $n = 0$ is immediate, since $\binom{0}{0} = 1$.)
Base Case: $Q(1) : \binom{1}{0} = \frac{1!}{0!\,1!} = 1$, and $\binom{1}{1} = \frac{1!}{1!\,0!} = 1$, which are integers, so the base case is done.
Inductive Step: Suppose $Q(n) : \binom{n}{k} \in \mathbb{Z}$ for $k = 0, 1, 2, \ldots, n$ is true.
We want to prove $Q(n+1) : \binom{n+1}{k} \in \mathbb{Z}$ for $k = 0, 1, 2, \ldots, n+1$.
Writing out the terms for $\binom{n+1}{k}$, $k = 0, 1, 2, \ldots, n, n+1$, we have:
\[
\binom{n+1}{0}, \binom{n+1}{1}, \binom{n+1}{2}, \ldots, \binom{n+1}{n}, \binom{n+1}{n+1}.
\]
By Theorem 5.2.3, each of these, apart from the first and the last, can be expressed as the sum of two terms from $\binom{n}{k}$, $k = 0, 1, 2, \ldots, n$, as follows:
\[
\binom{n+1}{0},\ \binom{n}{0} + \binom{n}{1},\ \binom{n}{1} + \binom{n}{2},\ \ldots,\ \binom{n}{n-1} + \binom{n}{n},\ \binom{n+1}{n+1}.
\]
Note that $\binom{n+1}{0} = \binom{n+1}{n+1} = 1$, so these two terms are obviously integers. From assumption of $Q(n)$, we have thereby assumed that each of $\binom{n}{0}, \binom{n}{1}, \binom{n}{2}, \ldots, \binom{n}{n}$ is an integer. Then any sum of some terms from these is also an integer. Therefore, each term from $\binom{n+1}{k}$, $k = 0, 1, 2, \ldots, n, n+1$ (except $\binom{n+1}{0}$ and $\binom{n+1}{n+1}$, which are integers anyway) can be expressed as the sum of two integers, and must necessarily be an integer.
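
The inductive step above is exactly how Pascal's triangle is built: each new row comes from the previous one using only integer addition. The sketch below constructs rows this way and compares them with the factorial-based values from `math.comb`; the function name `pascal_rows` is an illustrative choice.

```python
from math import comb

def pascal_rows(max_n):
    """Rows 0..max_n of Pascal's triangle, built with Pascal's Identity only."""
    rows = [[1]]                       # row 0: C(0, 0) = 1
    for n in range(max_n):
        prev = rows[-1]                # row n has entries C(n, 0..n)
        # C(n+1, k+1) = C(n, k) + C(n, k+1) for k = 0..n-1
        middle = [prev[k] + prev[k + 1] for k in range(n)]
        rows.append([1] + middle + [1])
    return rows

rows = pascal_rows(10)
print(rows[4])  # [1, 4, 6, 4, 1]
print(all(rows[n][k] == comb(n, k) for n in range(11) for k in range(n + 1)))  # True
```

Since no division ever occurs in this construction, every entry is visibly an integer, which is the content of Problem 5.2.4.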

Problem 5.2.5. Prove that $\forall n \in \mathbb{Z}^+$, $n!$ divides the product of $n$ consecutive integers.

Proof. Consider $\binom{n}{k}$, where $k \le n$. Then,
\begin{align*}
\binom{n}{k} &= \frac{n!}{(n-k)!\,k!} \\
&= \frac{n(n-1)(n-2) \cdots (n-k+1)(n-k)(n-k-1) \cdots 3 \cdot 2 \cdot 1}{(n-k)(n-k-1)(n-k-2) \cdots 3 \cdot 2 \cdot 1 \cdot k!} \\
&= \frac{n(n-1)(n-2) \cdots (n-k+1)}{k!}.
\end{align*}
There are $k$ consecutive terms in the numerator, and this is divided by $k!$ in the denominator. However, by Problem 5.2.4, this quotient is an integer, thus $k!$ divides the $k$ consecutive terms in the numerator, so we are done.

Now, we finally move on to the main theorem that is provable by induction.

Theorem 5.2.6 (The Binomial Theorem)

Let $n \in \mathbb{Z}^+$. Then,
\[
(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k.
\]

Proof. We proceed by induction on $n$.

Base Case: Consider $n = 1$. We have
\begin{align*}
\sum_{k=0}^{1} \binom{1}{k} x^{1-k} y^k &= \binom{1}{0} x^1 y^0 + \binom{1}{1} x^0 y^1 \\
&= x^1 + y^1 \\
&= x + y.
\end{align*}

Clearly $(x + y)^1 = x + y$, so therefore $(x + y)^1 = \sum_{k=0}^{1} \binom{1}{k} x^{1-k} y^k$, and the base case is true.
Inductive Step: Assume $(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k$ is true.
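We want to prove: $(x + y)^{n+1} = \sum_{k=0}^{n+1} \binom{n+1}{k} x^{n+1-k} y^k$.
We use Theorem 5.2.3 to proceed with the inductive step.
\begin{align*}
(x + y)^{n+1} &= (x + y) \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k \\
&= x \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k + y \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k \\
&= \sum_{k=0}^{n} \binom{n}{k} x^{n+1-k} y^k + \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^{k+1} \\
&= \left[\binom{n}{0} x^{n+1} + \binom{n}{1} x^n y + \binom{n}{2} x^{n-1} y^2 + \binom{n}{3} x^{n-2} y^3 + \ldots + \binom{n}{n} x y^n\right] \\
&\quad + \left[\binom{n}{0} x^n y + \binom{n}{1} x^{n-1} y^2 + \binom{n}{2} x^{n-2} y^3 + \ldots + \binom{n}{n} y^{n+1}\right] \\
&= \binom{n}{0} x^{n+1} + \binom{n}{1} x^n y + \binom{n}{0} x^n y + \binom{n}{2} x^{n-1} y^2 + \binom{n}{1} x^{n-1} y^2 \\
&\quad + \binom{n}{3} x^{n-2} y^3 + \binom{n}{2} x^{n-2} y^3 + \ldots + \binom{n}{n} x^1 y^n + \binom{n}{n-1} x^1 y^n + \binom{n}{n} y^{n+1} \\
&= \binom{n}{0} x^{n+1} + \left[\binom{n}{1} + \binom{n}{0}\right] x^n y + \left[\binom{n}{2} + \binom{n}{1}\right] x^{n-1} y^2 \\
&\quad + \left[\binom{n}{3} + \binom{n}{2}\right] x^{n-2} y^3 + \ldots + \left[\binom{n}{n} + \binom{n}{n-1}\right] x^1 y^n + \binom{n}{n} y^{n+1} \\
&= \binom{n}{0} x^{n+1} + \binom{n+1}{1} x^n y + \binom{n+1}{2} x^{n-1} y^2 + \ldots + \binom{n+1}{n} x y^n + \binom{n}{n} y^{n+1}.
\end{align*}
Note that $\binom{n}{0} = \binom{n+1}{0} = 1$ and $\binom{n}{n} = \binom{n+1}{n+1} = 1$. Then,
\begin{align*}
(x + y)^{n+1} &= \binom{n+1}{0} x^{n+1} + \binom{n+1}{1} x^n y + \binom{n+1}{2} x^{n-1} y^2 + \ldots + \binom{n+1}{n} x y^n + \binom{n+1}{n+1} y^{n+1} \\
&= \sum_{k=0}^{n+1} \binom{n+1}{k} x^{n+1-k} y^k,
\end{align*}

which concludes our inductive step.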

Problem 5.2.7. Expand the following:

1. $(x + y)^4$
2. $(2x + 3y)^3$

Solution. This problem is an exercise in directly applying the Binomial Theorem.
\begin{align*}
(x + y)^4 &= \binom{4}{0} x^4 y^0 + \binom{4}{1} x^{4-1} y^1 + \binom{4}{2} x^{4-2} y^2 + \binom{4}{3} x^{4-3} y^3 + \binom{4}{4} x^{4-4} y^4 \\
&= x^4 + 4x^3 y + 6x^2 y^2 + 4x y^3 + y^4.
\end{align*}
\begin{align*}
(2x + 3y)^3 &= \binom{3}{0} (2x)^3 (3y)^0 + \binom{3}{1} (2x)^2 (3y)^1 + \binom{3}{2} (2x)^1 (3y)^2 + \binom{3}{3} (2x)^0 (3y)^3 \\
&= (2x)^3 + 3(2x)^2 (3y) + 3(2x)(3y)^2 + (3y)^3 \\
&= 8x^3 + 36x^2 y + 54x y^2 + 27y^3.
\end{align*}
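
Expansions like these can be generated mechanically from the Binomial Theorem. The sketch below lists the coefficient of $x^{n-k} y^k$ in $(ax + by)^n$ for $k = 0, \ldots, n$; the function names are illustrative, and `expansion_value` simply re-evaluates the expanded sum at integer inputs so it can be compared with the unexpanded power.

```python
from math import comb

def binomial_coefficients(a, b, n):
    """Coefficients of x^(n-k) * y^k in (a*x + b*y)^n, for k = 0..n."""
    return [comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1)]

def expansion_value(a, b, n, x, y):
    """Evaluate the expanded sum at the point (x, y)."""
    coeffs = binomial_coefficients(a, b, n)
    return sum(c * x ** (n - k) * y ** k for k, c in enumerate(coeffs))

print(binomial_coefficients(1, 1, 4))  # [1, 4, 6, 4, 1]
print(binomial_coefficients(2, 3, 3))  # [8, 36, 54, 27]
print(expansion_value(2, 3, 3, 5, -2) == (2 * 5 + 3 * (-2)) ** 3)  # True
```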
 
Problem 5.2.8. When expanded, what is the constant term of $\left(5x^3 - \frac{7}{x^2}\right)^{150}$?

Solution. According to the binomial theorem, a single term for an arbitrary $k$ is
\[
\binom{150}{k} (5x^3)^{150-k} \left(-\frac{7}{x^2}\right)^k.
\]
We must solve for the value of $k$ at which the powers of $x$ in the factors $(5x^3)^{150-k}$ and $\left(-\frac{7}{x^2}\right)^k$ cancel. The latter has a power of $x$ in the denominator, which can cancel with the power of $x$ in the former, leaving a constant term.
Ignoring coefficients, the power of the first $x$ is $3(150 - k)$ and the power of the second $x$, which is in the denominator, is $-2k$.
Therefore, the overall power of $x$ will be the sum of these two powers. Since we are searching for the constant term, we want this overall power to be 0. Therefore, we solve the equation $3(150 - k) - 2k = 0$, which gives $k = 90$.
Substituting this back into the general form given earlier, our constant term is
\[
\binom{150}{90} (5x^3)^{60} \left(-\frac{7}{x^2}\right)^{90} = \binom{150}{90} 5^{60}\, 7^{90}.
\]
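
The search for the right $k$ can also be carried out by brute force. In this sketch, `constant_term_index` is a hypothetical helper that scans $k = 0, \ldots, n$ for the index where the exponent $p(n-k) - qk$ of $x$ vanishes; the defaults match this problem.

```python
from math import comb

def constant_term_index(n=150, p=3, q=2):
    """Return k with p*(n - k) - q*k == 0, or None if no term is constant."""
    for k in range(n + 1):
        if p * (n - k) - q * k == 0:
            return k
    return None

k = constant_term_index()
print(k)  # 90

# Coefficient of the constant term; the sign is positive because 90 is even.
term = comb(150, k) * 5 ** (150 - k) * (-7) ** k
print(term == comb(150, 90) * 5**60 * 7**90)  # True
```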

Problem 5.2.9. When expanded, what is the full term of the form $kx^{1776} y^r$ in $(4x - 5y)^{2017}$?

Solution. The Binomial Theorem states that a general term in a binomial expansion to the $n$th power is of the form
\[
\binom{n}{k} x^{n-k} y^k.
\]

Clearly $n = 2017$ and $n - k = 1776$, so $k = 241$. Therefore the term we have to find is
\[
\binom{2017}{241} (4x)^{1776} (-5y)^{241},
\]
which simplifies to
\[
-\binom{2017}{241} \left(4^{1776}\, 5^{241}\right) x^{1776} y^{241}.
\]

Problem 5.2.10. Find and prove a formula for:

a) $\sum_{k=0}^{n} \binom{n}{k}$

b) $\sum_{k=0}^{n} (-1)^k \binom{n}{k}$

c) $\sum_{k=0}^{n} \binom{2n}{2k}$

Proof. We consider special cases of the Binomial Theorem.

a) Consider the quantity $(1 + 1)^n$.
\[
(1 + 1)^n = \sum_{k=0}^{n} \binom{n}{k} 1^{n-k} 1^k.
\]
But also note that $(1 + 1)^n = 2^n$, therefore, $\sum_{k=0}^{n} \binom{n}{k} = 2^n$.

b) Similarly, consider $(1 + (-1))^n$.
\begin{align*}
(1 + (-1))^n &= \sum_{k=0}^{n} \binom{n}{k} 1^{n-k} (-1)^k \\
&= \sum_{k=0}^{n} (-1)^k \binom{n}{k}.
\end{align*}
We have $(1 + (-1))^n = 0^n = 0$, therefore, $\sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0$.

c) Based on the two previous examples, we know that $\sum_{k=0}^{2n} \binom{2n}{k} = 2^{2n}$ and $\sum_{k=0}^{2n} (-1)^k \binom{2n}{k} = 0$.
We can split these sums based on their parity (even or odd).
For $\sum_{k=0}^{2n} \binom{2n}{k}$, if $k$ is even, then $k = 2i$ for some integer $i$, so the sum of all the even terms is $\binom{2n}{0} + \binom{2n}{2} + \binom{2n}{4} + \ldots + \binom{2n}{2n}$, or $\sum_{i=0}^{n} \binom{2n}{2i}$. If $k$ is odd, then $k = 2i + 1$, so the sum of all the odd terms is $\binom{2n}{1} + \binom{2n}{3} + \binom{2n}{5} + \ldots + \binom{2n}{2n-1}$, or $\sum_{i=0}^{n-1} \binom{2n}{2i+1}$. Since $k$ can only be even or odd,
\[
\sum_{k=0}^{2n} \binom{2n}{k} = \sum_{i=0}^{n} \binom{2n}{2i} + \sum_{i=0}^{n-1} \binom{2n}{2i+1} = 2^{2n}.
\]
It is nearly the same thing for $\sum_{k=0}^{2n} (-1)^k \binom{2n}{k}$, except that when $k$ is odd, $(-1)^k = -1$, so we have
\[
\sum_{k=0}^{2n} (-1)^k \binom{2n}{k} = \sum_{i=0}^{n} \binom{2n}{2i} - \sum_{i=0}^{n-1} \binom{2n}{2i+1} = 0.
\]

Adding these two together, we get
\begin{align*}
\sum_{k=0}^{2n} \binom{2n}{k} + \sum_{k=0}^{2n} (-1)^k \binom{2n}{k} &= \sum_{i=0}^{n} \binom{2n}{2i} + \sum_{i=0}^{n-1} \binom{2n}{2i+1} + \sum_{i=0}^{n} \binom{2n}{2i} - \sum_{i=0}^{n-1} \binom{2n}{2i+1} \\
&= \sum_{i=0}^{n} \binom{2n}{2i} + \sum_{i=0}^{n} \binom{2n}{2i} \\
&= 2 \sum_{i=0}^{n} \binom{2n}{2i} \\
2^{2n} + 0 &= 2 \sum_{i=0}^{n} \binom{2n}{2i} \\
\therefore\ \sum_{i=0}^{n} \binom{2n}{2i} &= 2^{2n-1}.
\end{align*}
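
All three formulas are simple to test for small $n$. The sketch below evaluates each sum directly with `math.comb`; the function names are illustrative, and the checks are restricted to $n \ge 1$, matching the hypothesis $n \in \mathbb{Z}^+$ of the Binomial Theorem.

```python
from math import comb

def sum_all(n):
    """Sum of C(n, k) over k = 0..n."""
    return sum(comb(n, k) for k in range(n + 1))

def sum_alternating(n):
    """Sum of (-1)^k * C(n, k) over k = 0..n."""
    return sum((-1) ** k * comb(n, k) for k in range(n + 1))

def sum_even_index(n):
    """Sum of C(2n, 2k) over k = 0..n."""
    return sum(comb(2 * n, 2 * k) for k in range(n + 1))

for n in range(1, 5):
    print(n, sum_all(n), sum_alternating(n), sum_even_index(n))
```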

Problem 5.2.11. What is wrong with the following proof?

Claim: All horses are the same color.

We proceed with proof by induction: Let P (n) : Any set of n horses is the same color.
Base Case: P (1) : Any set of 1 horse is the same color.
This is obviously true, as each horse has its own color.
Inductive Step: Assume P (n): any set of n horses is the same color.
Consider a set of n horses and let x refer to one of the horses in that set, and let H refer to the
rest of the horses.
Consider another set of n horses consisting of H horses plus another horse (distinct from x) that
we shall call y.
Because x is in a set with H, horse x and the horses H have the same color. Because y is also in
a set with H, horse y and the horses H have the same color. Therefore, x, y, and H have the same
color.
Therefore, we can construct a set of n + 1 horses containing x, y, and H. The inductive step
holds, so all horses are the same color.

Solution. This proof by induction makes a critical logical error in that $P(1) \to P(2)$ is false. If we have a set of 1 horse, then there is no $H$ (“rest of the horses”) against which we could compare the colors of $x$ and $y$. To ‘fix’ this, if you’re thinking about starting the induction with $P(2) \to P(3)$, remember that $P(2)$ itself is false because one can choose two differently colored horses for a set of 2 horses. Overall, if the base case cannot imply the next statement, then the inductive form of reasoning cannot be applied, so this argument is invalid.

Chapter 6

Basic Trigonometry

The reader is expected to have some experience with basic right triangle trigonometry, but a brief
review of it is given first, for the convenience of the reader and for the sake of completeness. A basic
understanding of simple concepts in geometry is expected.

6.1 Review

First, we review the 6 trigonometric functions.

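[Figure: right triangle $ABC$ with the right angle at $C$, legs $a = BC$ and $b = AC$, and hypotenuse $c = AB$.]

Consider $\triangle ABC$ where $\angle ACB$ is a right angle and let $\angle CAB = \theta$. Then, we may define our 6 trigonometric functions given this angle, as follows:
\[
\begin{array}{ll}
\sin\theta = \dfrac{a}{c} & \csc\theta = \dfrac{c}{a} \\[2ex]
\cos\theta = \dfrac{b}{c} & \sec\theta = \dfrac{c}{b} \\[2ex]
\tan\theta = \dfrac{a}{b} & \cot\theta = \dfrac{b}{a}
\end{array}
\]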

To remember these, one can use the famous acronym SOH CAH TOA:

1. Sine-Opposite-Hypotenuse: $\sin\theta$ refers to the ratio $\frac{\text{opposite}}{\text{hypotenuse}}$, where the sides in discussion are in relation to $\theta$. In this diagram, $a$ would be the opposite side to $\theta$ and $c$ would be the hypotenuse, so $\sin\theta = \frac{a}{c}$.

2. Cosine-Adjacent-Hypotenuse: $\cos\theta$ refers to the ratio $\frac{\text{adjacent}}{\text{hypotenuse}}$, where the sides in discussion are in relation to $\theta$. In this diagram, $b$ would be the adjacent side to $\theta$ and $c$ is still the hypotenuse, so $\cos\theta = \frac{b}{c}$.

3. Tangent-Opposite-Adjacent: $\tan\theta$ refers to the ratio $\frac{\text{opposite}}{\text{adjacent}}$, where the sides in discussion are in relation to $\theta$. In this diagram, as previously mentioned, $a$ is the opposite, and $b$ is the adjacent, so $\tan\theta = \frac{a}{b}$.

In addition, recall the following definitions (and vice-versa):
\[
\sin\theta = \frac{1}{\csc\theta}, \qquad \cos\theta = \frac{1}{\sec\theta}, \qquad \tan\theta = \frac{1}{\cot\theta}.
\]

Noting that cosecant, secant, and cotangent are the reciprocals of sine, cosine, and tangent respectively, their definitions become obvious:
\begin{align*}
\csc\theta &= \frac{1}{\frac{\text{opposite}}{\text{hypotenuse}}} = \frac{\text{hypotenuse}}{\text{opposite}} \text{ or } \frac{c}{a} \\
\sec\theta &= \frac{1}{\frac{\text{adjacent}}{\text{hypotenuse}}} = \frac{\text{hypotenuse}}{\text{adjacent}} \text{ or } \frac{c}{b} \\
\cot\theta &= \frac{1}{\frac{\text{opposite}}{\text{adjacent}}} = \frac{\text{adjacent}}{\text{opposite}} \text{ or } \frac{b}{a}
\end{align*}

By using the helpful acronym SOH CAH TOA and remembering the reciprocal relationships
between each pair, we can intuitively derive the corresponding ratios for all of the functions.
Now, there are special angles associated with these trigonometric functions that you should
memorize.
[Figure: isosceles right triangle $ABC$ with the right angle at $C$, legs $AC = BC = x$, hypotenuse $AB = x\sqrt{2}$, and $\angle A = 45^\circ$.]

Consider an isosceles right triangle ABC with right angle C, and let the common length of the
legs be x. Then, we can derive the well-known side length proportions of this particular triangle
using the Pythagorean Theorem, as follows:

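\begin{align*}
AB^2 &= AC^2 + BC^2 \\
AB^2 &= x^2 + x^2 \\
AB^2 &= 2x^2 \\
\therefore\ AB &= x\sqrt{2}.
\end{align*}

We know that $\angle A = 45^\circ$, so we can find out the trigonometric ratios of the angle $45^\circ$ using the proportions of the sides we have just proven. Note that the $x$ terms cancel out in the numerator and denominator.
\[
\begin{array}{ll}
\sin 45^\circ = \dfrac{\sqrt{2}}{2} & \csc 45^\circ = \sqrt{2} \\[2ex]
\cos 45^\circ = \dfrac{\sqrt{2}}{2} & \sec 45^\circ = \sqrt{2} \\[2ex]
\tan 45^\circ = 1 & \cot 45^\circ = 1
\end{array}
\]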

Here is the other special triangle:

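[Figure: equilateral triangle $ABC$ with side length $x$. The altitude $AD$, of length $\frac{x\sqrt{3}}{2}$, meets $BC$ at $D$ with $BD = CD = \frac{x}{2}$; the angles at $B$ and $C$ are $60^\circ$, and $AD$ splits the angle at $A$ into two $30^\circ$ angles.]

Consider an equilateral triangle $ABC$ with common side length $x$. Drop the altitude from $A$ to $BC$, and let the foot be $D$. Note that $AD$ is also the median, as $AB = AC$. Furthermore, it is not hard to see that $AD$ is the angle bisector of $\angle BAC$. Therefore, $D$ is the midpoint of $BC$, so $BD = CD = \frac{x}{2}$. We also have that $\angle BAD = \angle CAD = 30^\circ$. We can then compute $AD$ by using the Pythagorean theorem on either $\triangle ABD$ or $\triangle ACD$.
\begin{align*}
AD^2 + \left(\frac{x}{2}\right)^2 &= x^2 \\
AD^2 &= x^2 - \frac{x^2}{4} \\
\therefore\ AD &= \frac{x\sqrt{3}}{2}.
\end{align*}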
Daniel Kim 90

The resulting proportions of the sides of the smaller triangle inside the larger equilateral triangle
yield the 30-60-90 triangle:

A
60◦

x 2x

30◦

C x 3 B

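We can now derive the trigonometric values for angles $60^\circ$ and $30^\circ$ respectively:
\begin{align*}
\sin 60^\circ &= \frac{\sqrt{3}}{2} & \csc 60^\circ &= \frac{2\sqrt{3}}{3} \\
\cos 60^\circ &= \frac{1}{2} & \sec 60^\circ &= 2 \\
\tan 60^\circ &= \sqrt{3} & \cot 60^\circ &= \frac{\sqrt{3}}{3} \\[6pt]
\sin 30^\circ &= \frac{1}{2} & \csc 30^\circ &= 2 \\
\cos 30^\circ &= \frac{\sqrt{3}}{2} & \sec 30^\circ &= \frac{2\sqrt{3}}{3} \\
\tan 30^\circ &= \frac{\sqrt{3}}{3} & \cot 30^\circ &= \sqrt{3}
\end{align*}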
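As a quick sanity check on the special-angle values, here is a minimal Python sketch that compares the exact forms from the two special triangles with floating-point values from the standard `math` module. The names `SPECIAL_VALUES` and `matches_library` are made up for this illustration and are not part of the text.

```python
import math

# Exact special-angle values, as read off the 45-45-90 and 30-60-90 triangles.
SPECIAL_VALUES = {
    30: {"sin": 1 / 2, "cos": math.sqrt(3) / 2, "tan": math.sqrt(3) / 3},
    45: {"sin": math.sqrt(2) / 2, "cos": math.sqrt(2) / 2, "tan": 1.0},
    60: {"sin": math.sqrt(3) / 2, "cos": 1 / 2, "tan": math.sqrt(3)},
}


def matches_library(degrees, tol=1e-12):
    """Return True if the table entry agrees with math.sin, math.cos, math.tan."""
    theta = math.radians(degrees)
    expected = SPECIAL_VALUES[degrees]
    return (
        math.isclose(math.sin(theta), expected["sin"], abs_tol=tol)
        and math.isclose(math.cos(theta), expected["cos"], abs_tol=tol)
        and math.isclose(math.tan(theta), expected["tan"], abs_tol=tol)
    )


for angle in SPECIAL_VALUES:
    print(angle, matches_library(angle))
```

The reciprocal values (csc, sec, cot) follow by taking `1 / value`, so they are not tabulated separately here.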

Lastly, we have the notion of radians vs. degrees, two ways of measuring angles. We define 2π
radians to be equivalent to 360◦ , or in simpler terms, π radians = 180◦ .

Exercise 6.1.1. Convert 0◦ , 30◦ , 45◦ , 60◦ , and 90◦ to radians. You should be familiar with all of
the radian equivalents for these special angles.

Sector of a circle with radius r and subtended angle θ

We can derive the formulae for the arc length and area of this sector of a circle for radians and
degrees:
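\[
\begin{array}{l|c|c}
 & \text{Degrees} & \text{Radians} \\ \hline
\text{Arc length} & \dfrac{\pi r \theta}{180^\circ} & r\theta \\[8pt]
\text{Area} & \pi r^2 \cdot \dfrac{\theta}{360^\circ} & \dfrac{r^2 \theta}{2}
\end{array}
\]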

Why should we use radians over degrees? As a start, the formulae in terms of radians are clearly
more aesthetically pleasing. However, it is not until we begin the topic of calculus that we realize
the true benefits of using radians.
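To see that the degree and radian formulae for a sector describe the same quantities, here is a small Python sketch. The function names `sector_radians` and `sector_degrees` are hypothetical and chosen only for this illustration.

```python
import math


def sector_radians(r, theta):
    """Arc length and area of a sector; theta is measured in radians."""
    return r * theta, r**2 * theta / 2


def sector_degrees(r, theta):
    """Arc length and area of a sector; theta is measured in degrees."""
    return math.pi * r * theta / 180, math.pi * r**2 * theta / 360


# A quarter of a circle of radius 2, described both ways.
print(sector_degrees(2, 90))
print(sector_radians(2, math.pi / 2))
```

Both calls describe the same quarter circle, so the two printed pairs should agree up to floating-point rounding.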
In this chapter, if the symbol ◦ is specified, then the angle is in degrees. Otherwise, it is in
radians.

6.2 The Unit Circle


Now, we tie this idea of right triangle trigonometry to the unit circle. Consider the diagram below:

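[Figure: the unit circle through (1, 0), (0, 1), (−1, 0), and (0, −1), with a radius of length 1 drawn to the point (cos θ, sin θ) at angle θ from the positive x-axis. The resulting right triangle has horizontal leg cos θ and vertical leg sin θ.]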

Draw a ray from the origin to some point on the graph of x2 + y 2 = 1, i.e. the unit circle. For
clarity I will illustrate such a ray in the first quadrant. If we drop a perpendicular from the point
to the x-axis, we actually form a right triangle, with the hypotenuse being the ray, which must have
length 1 since it is a radius of the circle.
Let θ be the angle between the hypotenuse and the x-axis. We start at θ = 0◦ , which is the point
(1, 0), and move counter-clockwise (in order of the four quadrants).
Definition 6.2.1. An angle will represent the rotation of a ray through the origin of the unit circle.
Definition 6.2.2. Two angles will be called coterminal if they end up at the same position on
the unit circle. In other words, angle x and angle y are coterminal ←→ 360◦ | (x − y), in terms of
degrees.
The terminal side is the hypotenuse of the triangle, which has length 1.

This means that rotating the ray by a full circle, or 360◦ , will not change the point (x, y) on the
circle. Thus, sin(120◦ ) = sin(480◦ ) = sin(840◦ ) = . . ., so the trigonometric functions are periodic.
We will revisit this property later.
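The divisibility condition in the definition of coterminal angles translates directly into code. The following is a minimal Python sketch; the function name `are_coterminal` and the tolerance parameter are assumptions made for this illustration.

```python
import math


def are_coterminal(x, y, tol=1e-9):
    """True if angles x and y (in degrees) differ by a multiple of 360."""
    remainder = (x - y) % 360
    # The remainder may land just below 360 because of rounding, so test both ends.
    return min(remainder, 360 - remainder) < tol


print(are_coterminal(120, 480))   # the angles differ by 360
print(are_coterminal(120, 300))   # the angles differ by 180
```

For coterminal angles, the sine values agree as well, which is the periodicity remark made above.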

Since the hypotenuse is 1, the side adjacent to θ would be cos θ, the side opposite to θ would be
sin θ, and the slope of the hypotenuse would be tan θ, using the mnemonic ‘SOH CAH TOA.’

Definition 6.2.3. For any angle θ, let the terminal side of θ hit the unit circle at (x, y). Then we
define

x = cos θ,
y = sin θ.

Keep in mind that the domains of cosine and sine are necessarily all real numbers.

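Definition 6.2.4. Although we have already stated these in the last section, I will emphasize that
the following remain the same under the context of the unit circle.
\begin{align*}
\csc\theta &= \frac{1}{\sin\theta} & \sec\theta &= \frac{1}{\cos\theta} \\
\tan\theta &= \frac{\sin\theta}{\cos\theta} & \cot\theta &= \frac{\cos\theta}{\sin\theta}
\end{align*}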

Definition 6.2.5. Quadrantal angles are angles that terminate on the x or y axis. For example,
we call the angles 90°, 180°, 270°, 360°, 450°, 540°, . . . quadrantal, as the terminal side of each of them meets the unit circle at one of the
four points (1, 0), (0, 1), (−1, 0), or (0, −1).

First, we can take advantage of the reflective/rotational symmetry that the circle offers in order
to develop our first few trigonometric identities.
[Figure: Reflections of the ray from (0, 0) to (cos θ, sin θ) to the four quadrants. The first-quadrant point (cos θ, sin θ) at angle θ is shown with its images (− cos θ, sin θ) at angle 180° − θ, (− cos θ, − sin θ) at angle 180° + θ, and (cos θ, − sin θ) at angle −θ.]

Recall the following facts: If a point (x, y) in the first quadrant is reflected across . . .

1. . . . the y-axis, then the new point is (−x, y).

2. . . . the origin, then the new point is (−x, −y).

3. . . . the x-axis, then the new point is (x, −y).

Applying these facts to the point (cos θ, sin θ), as shown in the diagram, we can express the three
other points in two ways each: the first way considering the reflection applied to the original, and
the second way considering the angle of the new point.
For the point in quadrant 2, notice how we have reflected (cos θ, sin θ) across the y-axis, so
this new point can be represented as (− cos θ, sin θ). However, this point is also the result of
rotating a ray 180◦ − θ in the counterclockwise direction, so the point can also be expressed as
(cos(180◦ − θ), sin(180◦ − θ)). As these two coordinates refer to the same point, we have the following
identities:

cos(180◦ − θ) = − cos θ
sin(180◦ − θ) = sin θ

Similarly, the point in the third quadrant can be represented as either (− cos θ, − sin θ) or (cos(180◦ +
θ), sin(180◦ + θ)). As a result, we can derive similar identities:

cos(180◦ + θ) = − cos θ

sin(180◦ + θ) = − sin θ

Lastly, the point in the fourth quadrant is either (cos θ, − sin θ) or (cos(−θ), sin(−θ)). So we have
the following identities:

cos(−θ) = cos θ
sin(−θ) = − sin θ

Thus, we put together our first set of trigonometric identities:

cos(180◦ − θ) = − cos θ cos(180◦ + θ) = − cos θ cos(−θ) = cos θ

sin(180◦ − θ) = sin θ sin(180◦ + θ) = − sin θ sin(−θ) = − sin θ

As you solve more problems, you will get used to these identities. However, if you imagine
reflecting and rotating angles on the unit circle, then it won’t be hard to rederive these relationships
if you forget them.
It is important that you remember that cos θ is an even function, while sin θ is an odd function.
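The six reflection identities can be spot-checked numerically. This is only a sketch for building confidence, not a proof; the function name `reflection_identities_hold` is hypothetical.

```python
import math


def reflection_identities_hold(theta_deg, tol=1e-12):
    """Check the six reflection identities at one angle given in degrees."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    # Each entry pairs (cos, sin) of the new angle with the reflected point.
    cases = [
        ((math.cos(math.pi - t), math.sin(math.pi - t)), (-c, s)),    # across the y-axis
        ((math.cos(math.pi + t), math.sin(math.pi + t)), (-c, -s)),   # through the origin
        ((math.cos(-t), math.sin(-t)), (c, -s)),                      # across the x-axis
    ]
    return all(
        abs(got[0] - want[0]) < tol and abs(got[1] - want[1]) < tol
        for got, want in cases
    )


print(all(reflection_identities_hold(a) for a in (10, 37, 120, 250, -45)))
```

Although the diagram uses a first-quadrant angle, the check above also passes for angles in other quadrants, which suggests the identities hold for every θ.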

Definition 6.2.6. The reference angle for a non-quadrantal angle θ is the measure of the acute
angle that the terminal side makes with the x-axis.

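Remember that the actual angle and the reference angle are not the same. Consider the diagram
below:

[Figure: an angle θ1 drawn on the unit circle with the legs cos θ1 and sin θ1 of its triangle labeled, together with its reference angle θ2 measured from the x-axis.]

In this example, $\pi \le \theta_1 \le \frac{3\pi}{2}$ would be the overall angle that we base cosine and sine on, while
$0 \le \theta_2 < \frac{\pi}{2}$ would be the reference angle.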
Problem 6.2.7. Find the reference angles for the following (which are in radians):

a) $\frac{7\pi}{11}$

b) $\frac{19\pi}{5}$

c) 8

Solution. Make sure you keep in mind the main quadrantal angles in radians: $0$, $\frac{\pi}{2}$, $\pi$, $\frac{3\pi}{2}$, $2\pi$, and
further multiples of $\frac{\pi}{2}$. Multiples of π lie on the x-axis, so pay attention to those when looking for
the reference angle. Also remember to reduce a given radian measure larger than 2π by subtracting
the appropriate multiple of 2π. To visualize the angle, try drawing it on the unit circle and then the
appropriate right triangle for it.

a) $\frac{7\pi}{11}$ is between $\frac{\pi}{2}$ and $\pi$, so the reference angle is $\pi - \frac{7\pi}{11} = \frac{4\pi}{11}$.

b) $\frac{19\pi}{5}$ is larger than $2\pi$, so subtract $2\pi$ to get $\frac{9\pi}{5}$, which is coterminal to $\frac{19\pi}{5}$. Then, notice
that $\frac{9\pi}{5}$ is between $\frac{3\pi}{2}$ and $2\pi$, so the reference angle is $2\pi - \frac{9\pi}{5} = \frac{\pi}{5}$.

c) In this case, we should figure out estimated values for $\pi$ and $\frac{\pi}{2}$, which are 3.14 and 1.57
respectively. 8 is larger than $2\pi$, so subtract $2\pi \approx 6.28$ to get 1.72, which is between 1.57 and
3.14, or $\frac{\pi}{2}$ and $\pi$. So the reference angle is $\pi - (8 - 2\pi) = 3\pi - 8$. Note that we are merely
using these estimates for π and its fractions so we can determine which quadrant the angle 8
(in radians) is located in.
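The procedure used in this solution (reduce by multiples of 2π, locate the quadrant, then measure to the x-axis) can be sketched in Python as follows. The function name `reference_angle` is an assumption for this illustration; for quadrantal inputs it simply returns the distance to the x-axis, even though the definition above excludes those angles.

```python
import math


def reference_angle(theta):
    """Reference angle (radians) of an angle theta given in radians."""
    t = theta % (2 * math.pi)        # coterminal angle in [0, 2*pi)
    if t <= math.pi / 2:             # quadrant I
        return t
    if t <= math.pi:                 # quadrant II
        return math.pi - t
    if t <= 3 * math.pi / 2:         # quadrant III
        return t - math.pi
    return 2 * math.pi - t           # quadrant IV


for angle in (7 * math.pi / 11, 19 * math.pi / 5, 8):
    print(reference_angle(angle))
```

The three printed values should agree, up to rounding, with 4π/11, π/5, and 3π − 8 from the solution.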

From our definitions, the following are true:

1. If two angles are coterminal, then they have the same trigonometric values.

2. If two angles have the same reference angle, then they have the same trigonometric values
up to sign. For example, sin(111°) = ± sin(69°).

Take a moment to convince yourself why these are true, especially the second result.

Now, we derive another important trigonometric identity by another kind of reflection.


[Figure: Reflecting the point (cos θ, sin θ) over the line y = x. The image point is labeled both (sin θ, cos θ) and (cos(90° − θ), sin(90° − θ)), and the angles θ and 90° − θ are marked.]

Similar to the reflections across the x-axis, y-axis, and origin, we can also reflect a point
(cos θ, sin θ) across the line y = x, yielding the point (sin θ, cos θ). Reflections also maintain angles,
so the reflected ray makes an angle θ with the positive y-axis, which means it makes an angle of $90^\circ - \theta$ with the positive x-axis. Thus, we can express the reflected point as $(\cos(90^\circ - \theta), \sin(90^\circ - \theta))$. These two coordinates refer to
the same point, so we have now derived the following identities:
\begin{align*}
\sin(90^\circ - \theta) &= \cos\theta \\
\cos(90^\circ - \theta) &= \sin\theta
\end{align*}
Taking the reciprocal of these equations, we get
\begin{align*}
\csc(90^\circ - \theta) &= \sec\theta \\
\sec(90^\circ - \theta) &= \csc\theta
\end{align*}
Furthermore, note that
\[
\tan(90^\circ - \theta) = \frac{\sin(90^\circ - \theta)}{\cos(90^\circ - \theta)} = \frac{\cos\theta}{\sin\theta} = \cot\theta,
\]
and vice-versa.
Therefore, we can conclude that if f is a trigonometric function, then
\[
f(\theta) = \underbrace{\operatorname{cof}}_{\text{cofunction of } f}(90^\circ - \theta),
\]
where the cofunction pairs are sine and cosine, secant and cosecant, and tangent and cotangent.
These relationships are known as the cofunction identities.
Definition 6.2.8. A function f is periodic if ∃p > 0 s.t. f (x + p) = f (x) ∀x. If p is the smallest
positive number satisfying this, then p is called the period.
Problem 6.2.9. What is the period of sine and cosine? What about tangent?

Solution. As previously stated regarding coterminal angles, rotating a ray by a full circle, which
is 360◦ , will not change the point that the ray is pointing to. Thus, sin θ = sin(θ + 360◦ ) and
cos θ = cos(θ + 360◦ ). It can then be confirmed that 360◦ is the period.
However, for tangent, notice that the slope of the ray from the origin is the same when it is
reflected across the origin. In other words, the tangent is the same when $180^\circ$ is added to the current
angle. The period of tangent is thus $180^\circ$.
We can also demonstrate this using some identities. Note that
\[
\tan(\theta + 180^\circ) = \frac{\sin(\theta + 180^\circ)}{\cos(\theta + 180^\circ)} = \frac{-\sin\theta}{-\cos\theta} = \tan\theta.
\]

Problem 6.2.10. Find all six trig values for each angle:

a) 150◦

b) 225◦

c) 300◦

Then find the sum of all eighteen values.

Solution.
\[
\begin{array}{l|ccc}
 & \text{a) } 150^\circ & \text{b) } 225^\circ & \text{c) } 300^\circ \\ \hline
\sin & \frac{1}{2} & -\frac{\sqrt{2}}{2} & -\frac{\sqrt{3}}{2} \\[4pt]
\cos & -\frac{\sqrt{3}}{2} & -\frac{\sqrt{2}}{2} & \frac{1}{2} \\[4pt]
\tan & -\frac{\sqrt{3}}{3} & 1 & -\sqrt{3} \\[4pt]
\csc & 2 & -\sqrt{2} & -\frac{2\sqrt{3}}{3} \\[4pt]
\sec & -\frac{2\sqrt{3}}{3} & -\sqrt{2} & 2 \\[4pt]
\cot & -\sqrt{3} & 1 & -\frac{\sqrt{3}}{3}
\end{array}
\]
The sums in parts a, b, and c respectively are $\frac{5}{2} - \frac{5\sqrt{3}}{2}$, $2 - 3\sqrt{2}$, and $\frac{5}{2} - \frac{5\sqrt{3}}{2}$, so the total sum
would be $7 - 5\sqrt{3} - 3\sqrt{2}$. The important idea to consider is that we should take advantage of the
identities to simplify the problem and reduce the workload. Note that the sums for $150^\circ$ and $300^\circ$
were the same, and this is because
\begin{align*}
f(150^\circ) &= \operatorname{cof}(90^\circ - 150^\circ) \\
&= \operatorname{cof}(-60^\circ) \\
&= \operatorname{cof}(-60^\circ + 360^\circ) \\
&= \operatorname{cof}(300^\circ),
\end{align*}
according to our cofunction identities. Therefore, each of the six trig values for $150^\circ$ equals the value of its cofunction at
$300^\circ$, so the two angles produce the same six numbers. We could have sped up the process of computing the sum by
finding the sum of either $150^\circ$ or $300^\circ$ and multiplying by two, then adding the sum for $225^\circ$.
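The arithmetic in this solution is easy to double-check numerically. Below is a minimal Python sketch; the helper name `six_trig_values` is made up for this illustration, and the comparison is done in floating point, so it confirms the closed form only up to rounding.

```python
import math


def six_trig_values(degrees):
    """Return [sin, cos, tan, csc, sec, cot] of an angle given in degrees."""
    t = math.radians(degrees)
    s, c = math.sin(t), math.cos(t)
    return [s, c, s / c, 1 / s, 1 / c, c / s]


# Sum of all eighteen values versus the closed form 7 - 5*sqrt(3) - 3*sqrt(2).
total = sum(sum(six_trig_values(d)) for d in (150, 225, 300))
closed_form = 7 - 5 * math.sqrt(3) - 3 * math.sqrt(2)
print(total, closed_form)
```

Comparing `sum(six_trig_values(150))` with `sum(six_trig_values(300))` also illustrates the cofunction shortcut described above.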

Problem 6.2.11. Write the following in terms of a trig function of an angle between $0^\circ$ and $45^\circ$:

1. $\sin(73^\circ)$

2. $\tan(109^\circ)$

3. $\sec(2017^\circ)$

4. $\cos\left(\frac{11\pi}{7}\right)$

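Solution. Apply the identities we have covered so far:

1. \begin{align*}
\sin(73^\circ) &= \cos(90^\circ - 73^\circ) \\
&= \cos(17^\circ).
\end{align*}

2. \begin{align*}
\tan(109^\circ) &= \cot(90^\circ - 109^\circ) \\
&= \cot(-19^\circ) \\
&= -\cot(19^\circ).
\end{align*}

3. \begin{align*}
\sec(2017^\circ) &= \sec(217^\circ) \\
&= \sec(217^\circ - 360^\circ) \\
&= \sec(-143^\circ) \\
&= \sec(143^\circ) \\
&= -\sec(37^\circ).
\end{align*}

4. \begin{align*}
\cos\left(\frac{11\pi}{7}\right) &= -\cos\left(\pi - \frac{11\pi}{7}\right) \\
&= -\cos\left(-\frac{4\pi}{7}\right) \\
&= -\cos\left(\frac{4\pi}{7}\right) \\
&= -\sin\left(\frac{\pi}{2} - \frac{4\pi}{7}\right) \\
&= -\sin\left(-\frac{\pi}{14}\right) \\
&= \sin\left(\frac{\pi}{14}\right).
\end{align*}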

Problem 6.2.12. Write the following in terms of a trig function of an angle between 0◦ and 45◦ :

1. sin(177◦ )

2. cos(111◦ )

3. tan(620◦ )

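Solution.

1. \begin{align*}
\sin(177^\circ) &= \sin(180^\circ - 177^\circ) \\
&= \sin(3^\circ).
\end{align*}

2. \begin{align*}
\cos(111^\circ) &= -\cos(180^\circ - 111^\circ) \\
&= -\cos(69^\circ) \\
&= -\sin(90^\circ - 69^\circ) \\
&= -\sin(21^\circ).
\end{align*}

3. \begin{align*}
\tan(620^\circ) &= \tan(620^\circ - 360^\circ) \\
&= \tan(260^\circ) \\
&= \frac{\sin(260^\circ)}{\cos(260^\circ)} \\
&= \frac{\sin(180^\circ + 80^\circ)}{\cos(180^\circ + 80^\circ)} \\
&= \frac{-\sin(80^\circ)}{-\cos(80^\circ)} \\
&= \frac{\cos(90^\circ - 80^\circ)}{\sin(90^\circ - 80^\circ)} \\
&= \cot(10^\circ).
\end{align*}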
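The final answers to Problems 6.2.11 and 6.2.12 can be spot-checked numerically. The sketch below uses only the standard `math` module; the helper names `deg`, `cot`, and `sec` and the list `claims` are hypothetical names introduced for this illustration.

```python
import math


def deg(fn, angle):
    """Apply a radian-based function to an angle given in degrees."""
    return fn(math.radians(angle))


def cot(t):
    return math.cos(t) / math.sin(t)


def sec(t):
    return 1 / math.cos(t)


# (description, left-hand side, right-hand side)
claims = [
    ("sin 73 = cos 17", deg(math.sin, 73), deg(math.cos, 17)),
    ("tan 109 = -cot 19", deg(math.tan, 109), -deg(cot, 19)),
    ("sec 2017 = -sec 37", deg(sec, 2017), -deg(sec, 37)),
    ("cos(11pi/7) = sin(pi/14)", math.cos(11 * math.pi / 7), math.sin(math.pi / 14)),
    ("sin 177 = sin 3", deg(math.sin, 177), deg(math.sin, 3)),
    ("cos 111 = -sin 21", deg(math.cos, 111), -deg(math.sin, 21)),
    ("tan 620 = cot 10", deg(math.tan, 620), deg(cot, 10)),
]

for label, lhs, rhs in claims:
    print(label, math.isclose(lhs, rhs, abs_tol=1e-9))
```

A numeric match does not replace the chain of identities, but a mismatch would immediately flag a sign error.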

Now consider the right triangle of the unit circle once again. The legs of the triangle are cos θ
and sin θ, and the hypotenuse is 1. Then, by the Pythagorean Theorem, we have
\[
\sin^2\theta + \cos^2\theta = 1.
\]
Furthermore, if we divide both sides by $\sin^2\theta$, we get
\[
\cot^2\theta + 1 = \csc^2\theta,
\]
and if we chose to divide both sides by $\cos^2\theta$ instead, we get
\[
1 + \tan^2\theta = \sec^2\theta.
\]
These three equations are known as the Pythagorean Identities.


Problem 6.2.13. Find the rest of the trigonometric values for $\sin\theta = \frac{4}{5}$.

Solution. First, it is obvious that $\csc\theta = \frac{5}{4}$, as it is the reciprocal of sin θ.
Furthermore, we use the identity $\cos^2\theta + \sin^2\theta = 1$ to have $\cos^2\theta + \left(\frac{4}{5}\right)^2 = 1 \longrightarrow \cos\theta = \pm\frac{3}{5}$.
Since secant is the reciprocal of cosine, we have $\sec\theta = \pm\frac{5}{3}$. Lastly, $\tan\theta = \frac{\sin\theta}{\cos\theta}$, so $\tan\theta = \pm\frac{4}{3}$.
Then $\cot\theta = \pm\frac{3}{4}$.
Keep in mind that if the quadrant of this angle is not specified, then some trig values can either
be positive or negative, so make sure you consider the sign as well.

For the following problems, we will use I, II, III, and IV to refer to the four quadrants.

Problem 6.2.14. Find the remaining trigonometric values when $\csc\theta = -\frac{20}{17}$, and θ ∈ III.

Solution. We can instantly see that $\sin\theta = -\frac{17}{20}$. Furthermore, we can use the identity $\csc^2\theta = 1 + \cot^2\theta$.
We get $\frac{400}{289} = 1 + \cot^2\theta$, or $\cot\theta = \pm\frac{\sqrt{111}}{17}$. However, as θ is in quadrant III, tangent must
be positive, so cotangent is also positive. Therefore, $\cot\theta = \frac{\sqrt{111}}{17}$. This implies $\tan\theta = \frac{17}{\sqrt{111}}$.
We also use the Pythagorean identity $\cos^2\theta + \sin^2\theta = 1$. This yields $\frac{289}{400} + \cos^2\theta = 1$, or
$\cos\theta = -\frac{\sqrt{111}}{20}$, as cosine is negative in the third quadrant. Since secant is the reciprocal of cosine,
we have $\sec\theta = -\frac{20}{\sqrt{111}}$.
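The sign bookkeeping in this kind of problem can be sketched in Python: the Pythagorean identity gives the size of cosine, and the quadrant gives its sign. The function name `trig_from_sine` is hypothetical, and the sketch assumes the caller passes a sine value whose sign is consistent with the stated quadrant and whose absolute value is strictly between 0 and 1.

```python
import math


def trig_from_sine(sin_value, quadrant):
    """All six trig values from sin(theta) and the quadrant (1 to 4) of theta."""
    # Cosine is positive in quadrants I and IV, negative in II and III.
    cos_sign = 1 if quadrant in (1, 4) else -1
    cos_value = cos_sign * math.sqrt(1 - sin_value**2)
    tan_value = sin_value / cos_value
    return {
        "sin": sin_value,
        "cos": cos_value,
        "tan": tan_value,
        "csc": 1 / sin_value,
        "sec": 1 / cos_value,
        "cot": 1 / tan_value,
    }


values = trig_from_sine(-17 / 20, 3)
for name, value in values.items():
    print(name, value)
```

With the inputs from Problem 6.2.14, the printed cosine and cotangent should match $-\frac{\sqrt{111}}{20}$ and $\frac{\sqrt{111}}{17}$ up to rounding.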

While we have been simply applying identities in order to find the rest of the values, choosing
the sign can be a hassle. We could opt for the alternative method of reference triangles for a cleaner
and more straightforward solution.
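For example, consider $\sin\theta = \frac{4}{5}$ where θ ∈ I. We can construct a reference triangle in the first
quadrant, as such:

[Figure: right triangle in the first quadrant with hypotenuse 5 and vertical leg 4; the horizontal leg is not yet labeled.]

We fill in the triangle with the given information. For simplicity, we let the opposite side be 4
and the hypotenuse be 5. We could then use the Pythagorean Theorem to get the remaining leg of
the right triangle.

[Figure: the same triangle with angle θ at the origin, hypotenuse 5, vertical leg 4, and horizontal leg 3.]

Now, we can read our information from the resulting reference triangle to derive the rest of the
trigonometric values. From the triangle above, we have
\begin{align*}
\csc\theta &= \frac{5}{4}, \\
\cos\theta &= \frac{3}{5}, \\
\sec\theta &= \frac{5}{3}, \\
\tan\theta &= \frac{4}{3}, \\
\cot\theta &= \frac{3}{4}.
\end{align*}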

It is much easier to read the information off a visual cue like a triangle, and this method is less
error-prone than applying identities and worrying over which sign to choose.
The true strength of reference triangles is shown for examples in other quadrants.
Consider $\sin\theta = \frac{4}{5}$, but this time θ ∈ II. Then our reference triangle would look different:

[Figure: reference triangle in the second quadrant with angle θ at the origin, hypotenuse 5, vertical leg 4, and horizontal leg −3.]

As θ ∈ II, it should be true that $90^\circ < \theta < 180^\circ$. However, in the context of a reference triangle,
we let the θ labeled in the triangle be the reference angle, so our calculations proceed smoothly.
As the triangle is to the left of the y-axis, the horizontal leg must be negative. Hence, we let it
be −3 instead of 3 after we apply the Pythagorean Theorem to the given sides 4 and 5.
By drawing a diagram which clearly represents which parts are positive and negative, it is easy
to identify the rest of the trigonometric values. For this example, the rest would be:
\begin{align*}
\csc\theta &= \frac{5}{4}, \\
\cos\theta &= -\frac{3}{5}, \\
\sec\theta &= -\frac{5}{3}, \\
\tan\theta &= -\frac{4}{3}, \\
\cot\theta &= -\frac{3}{4}.
\end{align*}
Exercise 6.2.15. Find the rest of the trigonometric values for $\cos\theta = \frac{4}{5}$, θ ∈ IV.

Exercise 6.2.16. Find the rest of the trigonometric values for $\tan\theta = \frac{12}{5}$, θ ∈ III.

Exercise 6.2.17. Find the rest of the trigonometric values for $\tan\theta = -\frac{7}{11}$, θ ∈ II.

Exercise 6.2.18. Find the rest of the trigonometric values for $\tan\theta = -\frac{1}{2}$, θ ∈ II.
Problem 6.2.19. Simplify $(4\sin\theta - 7\cos\theta)^2 + (7\sin\theta + 4\cos\theta)^2$.

Solution. Expand and simplify:
\begin{align*}
&(4\sin\theta - 7\cos\theta)^2 + (7\sin\theta + 4\cos\theta)^2 \\
&= 16\sin^2\theta - 56\sin\theta\cos\theta + 49\cos^2\theta + 49\sin^2\theta + 56\sin\theta\cos\theta + 16\cos^2\theta \\
&= 65\sin^2\theta + 65\cos^2\theta \\
&= 65(\sin^2\theta + \cos^2\theta) \\
&= 65.
\end{align*}
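Since the simplified result does not depend on θ, a quick numeric check is to evaluate the original expression at several angles and confirm that the same constant comes out each time. The function name `expression` and its parameters `a` and `b` are assumptions for this sketch; they stand for the two coefficients, here 4 and 7.

```python
import math


def expression(a, b, theta):
    """Evaluate (a sin t - b cos t)^2 + (b sin t + a cos t)^2 at t = theta (radians)."""
    s, c = math.sin(theta), math.cos(theta)
    return (a * s - b * c) ** 2 + (b * s + a * c) ** 2


for theta in (0.0, 0.3, 1.0, 2.5, -4.0):
    print(theta, expression(4, 7, theta))
```

Every printed value should equal 65 up to floating-point rounding, because the cross terms cancel exactly as in the expansion above.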

Exercise 6.2.20. Simplify $(2\sin\theta - 3\cos\theta)^2 + (3\sin\theta + 2\cos\theta)^2$.


Problem 6.2.21. If $\sin\theta + \cos\theta = \frac{1}{3}$, find all possible values of cos θ.

Solution. If $\sin\theta + \cos\theta = \frac{1}{3}$, then $(\sin\theta + \cos\theta)^2 = \frac{1}{9}$. We expand and simplify this to get
\begin{align*}
\sin^2\theta + 2\sin\theta\cos\theta + \cos^2\theta &= \frac{1}{9} \\
2\sin\theta\cos\theta &= -\frac{8}{9} \\
\sin\theta\cos\theta &= -\frac{4}{9}.
\end{align*}
Furthermore, from the original equation, we have $\sin\theta = \frac{1}{3} - \cos\theta$. We substitute this into the
equation above to get $\cos\theta\left(\frac{1}{3} - \cos\theta\right) = -\frac{4}{9}$. This expands to
\[
\frac{1}{3}\cos\theta - \cos^2\theta + \frac{4}{9} = 0.
\]
For legibility, let x = cos θ. Then we want to solve the quadratic $\frac{1}{3}x - x^2 + \frac{4}{9} = 0$. Multiply by
−9 to get $9x^2 - 3x - 4 = 0$, and by the quadratic formula, we get
\[
x = \cos\theta = \frac{1 \pm \sqrt{17}}{6}.
\]
Problem 6.2.22. Find all values of sin x when $\sin x + \cos x = \frac{1}{2}$.

Solution. Square both sides of $\sin x + \cos x = \frac{1}{2}$ to get:
\begin{align*}
\sin^2 x + \cos^2 x + 2\sin x\cos x &= \frac{1}{4} \\
1 + 2\sin x\cos x &= \frac{1}{4} \\
\sin x\cos x &= -\frac{3}{8}.
\end{align*}
From the original equation, we know that $\cos x = \frac{1}{2} - \sin x$. Then we have:
\[
\sin x\left(\frac{1}{2} - \sin x\right) = -\frac{3}{8}.
\]
Let y = sin x. Then we solve the quadratic $\frac{1}{2}y - y^2 = -\frac{3}{8}$, or $8y^2 - 4y - 3 = 0$. By the
quadratic formula, we have $y = \sin x = \frac{1 \pm \sqrt{7}}{4}$.
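The last two problems follow the same pattern, so it may help to see the pattern as a small Python sketch. Note that this sketch takes a slightly different route from the solutions above: it substitutes sin θ = s − cos θ directly into sin²θ + cos²θ = 1 instead of squaring the sum first, which leads to an equivalent quadratic. The function name `cosine_values_given_sum` is hypothetical.

```python
import math


def cosine_values_given_sum(s):
    """Possible values of cos(theta) when sin(theta) + cos(theta) = s."""
    # Substituting sin = s - cos into sin^2 + cos^2 = 1 gives
    #   2*cos^2 - 2*s*cos + (s^2 - 1) = 0.
    discriminant = 4 * s * s - 8 * (s * s - 1)
    if discriminant < 0:
        return []                      # no real angle satisfies the equation
    root = math.sqrt(discriminant)
    return sorted({(2 * s - root) / 4, (2 * s + root) / 4})


print(cosine_values_given_sum(1 / 3))   # compare with (1 ± sqrt(17)) / 6
print(cosine_values_given_sum(1 / 2))   # compare with (1 ± sqrt(7)) / 4
```

Because the condition sin θ + cos θ = s is symmetric in sine and cosine, the same two numbers are also the possible values of sin θ, which is why the second call matches the answer to Problem 6.2.22.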
Problem 6.2.23. Prove the following identities:

1. $\sec x - \cos x = \sin x \tan x$

2. $\frac{\sin x}{1 + \cos x} = \csc x - \cot x$

Proof.

1. We can simply combine fractions and use the Pythagorean Identity:
\begin{align*}
\sec x - \cos x &= \frac{1}{\cos x} - \cos x \\
&= \frac{1 - \cos^2 x}{\cos x} \\
&= \frac{\sin^2 x}{\cos x} \\
&= \sin x \tan x.
\end{align*}

2. The denominator 1 + cos x motivates us to multiply the top and bottom by 1 − cos x, so we
can apply the Pythagorean Identity:
\begin{align*}
\frac{\sin x}{1 + \cos x} &= \frac{\sin x}{1 + \cos x} \cdot \frac{1 - \cos x}{1 - \cos x} \\
&= \frac{\sin x (1 - \cos x)}{1 - \cos^2 x} \\
&= \frac{\sin x (1 - \cos x)}{\sin^2 x} \\
&= \frac{1 - \cos x}{\sin x} \\
&= \frac{1}{\sin x} - \frac{\cos x}{\sin x} \\
&= \csc x - \cot x.
\end{align*}

Problem 6.2.24. tan x + cot x equals the product of two trigonometric functions, both in terms of
x. What is this product?

Solution. Simply rewrite the given expression in terms of sine and cosine, then apply appropriate
identities:
\begin{align*}
\tan x + \cot x &= \frac{\sin x}{\cos x} + \frac{\cos x}{\sin x} \\
&= \frac{\sin^2 x + \cos^2 x}{\cos x \sin x} \\
&= \frac{1}{\cos x \sin x} \\
&= \csc x \sec x.
\end{align*}

Lastly, there is one more identity to discover, based on previous identities that we have already
found.
We have cos θ = sin(90◦ − θ). If we substitute θ → −θ, we get cos(−θ) = sin(90◦ + θ). But
notice that cos(−θ) = cos θ, therefore

cos θ = sin(θ + 90◦ ).

This identity will play a significant role in the next section.

6.3 Graphing

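First observe the behavior of y = sin x as x goes from $0^\circ$ to $90^\circ$. When we go counter-clockwise on
the unit circle from (1, 0), the y-coordinate, which is sin x, is increasing from 0 to 1. This can be
illustrated by the following values and approximations:
\begin{align*}
\sin 0^\circ &= 0, \\
\sin 30^\circ &= \frac{1}{2} = 0.5, \\
\sin 45^\circ &= \frac{\sqrt{2}}{2} \approx 0.7, \\
\sin 60^\circ &= \frac{\sqrt{3}}{2} \approx 0.9, \\
\sin 90^\circ &= 1.
\end{align*}
Then, when x goes from $90^\circ$ to $180^\circ$, sin x goes from 1 back to 0. In fact,
\begin{align*}
\sin 120^\circ &= \frac{\sqrt{3}}{2} \approx 0.9, \\
\sin 135^\circ &= \frac{\sqrt{2}}{2} \approx 0.7, \\
\sin 150^\circ &= \frac{1}{2} = 0.5, \\
\sin 180^\circ &= 0.
\end{align*}
These are symmetric with the previous values. Notice that the identity $\sin(180^\circ - \theta) = \sin\theta$
confirms this.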

Thus, as x goes from 0° to 180°, sin x rises from 0 to 1, then back to 0. This can be graphed as:

[Figure: the graph of sin x for 0° ≤ x ≤ 180°, a single arch that peaks at 90° and returns to 0 at 180°.]

What happens when x goes from 180° to 360°? By the identity sin(180° + θ) = − sin θ, the
graph of sin x from 180° to 360° is basically the reflection of the graph from 0° to 180° over the
x-axis, as such:

[Figure: the graph of sin x for 0° ≤ x ≤ 360°, with the arch from 0° to 180° followed by its mirror image below the x-axis, reaching −1 at 270°.]

Now remember that the period of sine is 2π, or 360°. Thus, the entire graph of sin x repeats
indefinitely (oscillating between −1 and 1) in both directions:

[Figure: The graph of sin x, shown for −360° ≤ x ≤ 360° with ticks every 90°.]

Now, recall the identity cos θ = sin(θ + 90°). This indicates that the graph of cos x is just the
graph of sin x shifted 90° to the left! Thus, the entire graph of cos x would be:

[Figure: The graph of cos x, shown for −360° ≤ x ≤ 360° with ticks every 90°.]

We usually express these graphs in radians, as such:

[Figure: The graph of sin x in radians, shown for −2π ≤ x ≤ 2π with ticks at multiples of π/2.]

[Figure: The graph of cos x in radians, shown for −2π ≤ x ≤ 2π with ticks at multiples of π/2.]

Notice how the graph of sin x is consistent with the earlier discovery that sin x is an odd function,
meaning that it is symmetric around the origin. Likewise, notice how the graph of cos x is symmetric
across the y-axis, since cos x is an even function.

As both functions continue oscillating between −1 and 1 in both directions forever, we can
conclude the following:

Domain Range

sin x R [−1, 1]

cos x R [−1, 1]

To graph csc x and sec x, we should use the fact that they are reciprocals of sin x and cos x
respectively.

Let’s look at csc x first. As sin x goes from 1 to an arbitrarily small positive number close to 0,
notice that $\csc x = \frac{1}{\sin x}$ goes from 1 to some arbitrarily large number. In other words, as sin x goes
from 1 to 0, csc x goes from 1 to ∞.

Similarly, as sin x goes from 0 to 1, csc x goes from ∞ to 1. Keeping this behavior in mind, here
is the graph of csc x along with sin x for one part:

[Figure: the graphs of y = sin x and y = csc x between 0 and π. The two curves meet at height 1, and csc x grows without bound toward x = 0 and x = π.]

Make sure you understand how sin x and csc x are related to each other with respect to their
graphs. Thus, our whole graph of csc x over many periods looks like this (the vertical asymptotes
are represented by dashed lines):

[Figure: The graph of csc x, shown for −2π ≤ x ≤ 2π with ticks at multiples of π/2 and dashed vertical asymptotes.]

Sketch sin x over this graph to reinforce your understanding of their relations.

Likewise, sec x has an analogous relationship to cos x, as such:

[Figure: The graph of sec x, shown for −2π ≤ x ≤ 2π with ticks at multiples of π/2 and dashed vertical asymptotes.]

Here, we also used dashed lines to denote the vertical asymptotes of sec x.
Again, sketch cos x over the graph of sec x as an exercise.
Now, as long as you remember the graphs of sin x and cos x, you should be able to easily graph
csc x and sec x as well.
Notice that csc x has vertical asymptotes at x = −2π, x = −π, x = 0, x = π, x = 2π, etc.
At those values of x, csc x is not defined. In other words, the domain of csc x is the real numbers
excluding any multiple of π. This can be expressed as $\mathbb{R} - \{\pi k \mid k \in \mathbb{Z}\}$.
What about the range of csc x? Notice that csc x never becomes any number between −1 and 1
exclusive. Thus, its range can be expressed as $(-\infty, -1] \cup [1, +\infty)$.
Similarly, sec x has vertical asymptotes at $x = -\frac{3}{2}\pi$, $x = -\frac{1}{2}\pi$, $x = \frac{1}{2}\pi$, $x = \frac{3}{2}\pi$, etc. Thus the
domain of sec x is $\mathbb{R} - \left\{\pi k + \frac{\pi}{2} \mid k \in \mathbb{Z}\right\}$. Since sec x is merely csc x with a horizontal shift, their
ranges are the same.
\[
\begin{array}{l|c|c}
 & \text{Domain} & \text{Range} \\ \hline
\csc x & \mathbb{R} - \{\pi k \mid k \in \mathbb{Z}\} & (-\infty, -1] \cup [1, +\infty) \\[4pt]
\sec x & \mathbb{R} - \left\{\pi k + \frac{\pi}{2} \mid k \in \mathbb{Z}\right\} & (-\infty, -1] \cup [1, +\infty)
\end{array}
\]

For the graph of tan x, let’s consider the function’s behavior as x goes from 0 inclusive to $\frac{\pi}{2}$
exclusive.
Since $\tan x = \frac{\sin x}{\cos x}$, notice that sin x is going from 0 to 1 while cos x is going from 1 to 0. Thus,
tan x is increasing without bound, i.e. it is going to ∞ as x goes to $\frac{\pi}{2}$.

We can think about this in another way: tan x is the slope of the hypotenuse of the triangle in
the unit circle, so as the angle approaches $\frac{\pi}{2}$, the hypotenuse gets steeper (until it becomes vertical
at $x = \frac{\pi}{2}$). This indicates that the slope, which is tan x, grows without bound.
Then, let’s graph tan x over $\left[0, \frac{\pi}{2}\right)$. We plug in some common values:
\begin{align*}
\tan 0 &= 0, \\
\tan\frac{\pi}{6} &= \frac{\sqrt{3}}{3} \approx 0.6, \\
\tan\frac{\pi}{4} &= 1, \\
\tan\frac{\pi}{3} &= \sqrt{3} \approx 1.7.
\end{align*}
Then we consider the fact that tan x goes to ∞ as x approaches $\frac{1}{2}\pi$ from the left.

[Figure: the graph of tan x on $\left[0, \frac{\pi}{2}\right)$, rising from 0 toward the vertical asymptote at $x = \frac{\pi}{2}$.]

How would we sketch the rest of the graph? We can take advantage of a few identities.

Recall that tan x = tan(x + π). Thus, the graph repeats every π: for example, the portion between $x = -\frac{1}{2}\pi$ and $x = \frac{1}{2}\pi$ repeats between $x = \frac{1}{2}\pi$ and $x = \frac{3}{2}\pi$, etc.
Furthermore, note that
\[
\tan(-x) = \frac{\sin(-x)}{\cos(-x)} = \frac{-\sin x}{\cos x} = -\tan x,
\]
so tan x is an odd function, indicating that it is symmetric around the origin.

Using these two pieces of information, we can sketch the rest of the graph (the dashed lines are
vertical asymptotes):

[Figure: The graph of tan x, shown for −2π ≤ x ≤ 2π with ticks at multiples of π/2 and dashed vertical asymptotes.]

What about cot x? It turns out that it is relatively simple to sketch cot x once you know how to
sketch tan x.
Recall the cofunction identity $\tan\left(\frac{\pi}{2} - x\right) = \cot x$ (since $90^\circ = \frac{\pi}{2}$). Replacing x with −x gives
us $\tan\left(\frac{\pi}{2} + x\right) = \cot(-x)$. But note that $\cot(-x) = \frac{\cos(-x)}{\sin(-x)} = \frac{\cos x}{-\sin x} = -\cot x$, so we have
\[
\cot x = -\tan\left(\frac{\pi}{2} + x\right).
\]
This indicates that the graph of cot x is the result of shifting the graph of tan x to the left by $\frac{\pi}{2}$,
then flipping it over the x-axis. Thus, we end up with:

[Figure: The graph of cot x, shown for −2π ≤ x ≤ 2π with ticks at multiples of π/2.]

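Notice that tan x has the same vertical asymptotes as sec x, so the domain of tan x is also
$\mathbb{R} - \left\{\pi k + \frac{\pi}{2} \mid k \in \mathbb{Z}\right\}$. However, if we look at the graph of tan x, we can see that tan x can be any
real number. Thus, the range of tan x is $\mathbb{R}$.

Likewise, cot x has the same vertical asymptotes as csc x, so the domain of cot x is $\mathbb{R} - \{\pi k \mid k \in \mathbb{Z}\}$.
For the same reason as tan x, the range of cot x is also $\mathbb{R}$.
Thus, we have the following results:
\[
\begin{array}{l|c|c}
 & \text{Domain} & \text{Range} \\ \hline
\tan x & \mathbb{R} - \left\{\pi k + \frac{\pi}{2} \mid k \in \mathbb{Z}\right\} & \mathbb{R} \\[4pt]
\cot x & \mathbb{R} - \{\pi k \mid k \in \mathbb{Z}\} & \mathbb{R}
\end{array}
\]
I will summarize all the discoveries made about the domain and range of every trigonometric
function below:
\[
\begin{array}{l|c|c}
 & \text{Domain} & \text{Range} \\ \hline
\sin\theta & \mathbb{R} & [-1, 1] \\[4pt]
\cos\theta & \mathbb{R} & [-1, 1] \\[4pt]
\tan\theta & \mathbb{R} - \left\{\pi k + \frac{\pi}{2} \mid k \in \mathbb{Z}\right\} & \mathbb{R} \\[4pt]
\cot\theta & \mathbb{R} - \{\pi k \mid k \in \mathbb{Z}\} & \mathbb{R} \\[4pt]
\sec\theta & \mathbb{R} - \left\{\pi k + \frac{\pi}{2} \mid k \in \mathbb{Z}\right\} & (-\infty, -1] \cup [1, +\infty) \\[4pt]
\csc\theta & \mathbb{R} - \{\pi k \mid k \in \mathbb{Z}\} & (-\infty, -1] \cup [1, +\infty)
\end{array}
\]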

It will not always be the case that you will be tasked with sketching a simple graph of sin x. You
may be asked to graph more complicated trigonometric functions. Consider the general equation of
sine,
y = A sin(Bx + C) + D.

Note that all equations defined in this form will have a “wave”-like form, which we call sinusoidal.
The following will still apply for an equation like y = A cos(Bx + C) + D, since sine and cosine
are merely phase shifts of each other.
Assume that B > 0 (whenever B is negative, we can always use the fact that sine is an odd
function and manipulate the equation into the given form above). With respect to these variables,
we define the following properties of the equation:

• The midline is y = D. It is the horizontal center line above and below which the sinusoidal wave
oscillates. Here is an example:

[Figure: the curve y = A sin(Bx + C) + D with its midline y = D drawn through the middle of the wave.]

• The amplitude is |A|. This represents the vertical distance from the midline to either the
highest or the lowest point on the graph.

[Figure: the curve y = A sin(Bx + C) + D with its midline y = D, and the amplitude |A| marked both above and below the midline.]

• The frequency is B. This denotes the number of cycles completed in an interval of 2π or 360°.
One cycle is the part of the graph that serves as a repeated pattern, since all trigonometric
functions are periodic. Here is an example of one cycle:

[Figure: a sinusoidal wave with one cycle highlighted.]

For y = sin x, one cycle happens every 2π, so the frequency of sin x is 1.

• The period is $\frac{2\pi}{B}$. This is simply the horizontal length of one cycle. This is consistent with
the definition of frequency: we divide the interval 2π by the number of cycles completed in
that interval, which is B, to get the length of each cycle.

• The phase shift is $-\frac{C}{B}$. This can be derived from the observation that $y = A\sin(Bx + C) + D \implies y = A\sin\left(B\left(x + \frac{C}{B}\right)\right) + D$. This represents how far the original sine function (if you
are given y = A sin(Bx + C) + D) or cosine function (if you are given y = A cos(Bx + C) + D)
has been shifted right, if the phase shift is positive, or left, if the phase shift is negative.

• The minimum is D − |A|, while the maximum is D + |A|. This follows from our understanding
of the midline and amplitude:

[Figure: the curve y = A sin(Bx + C) + D with its midline y = D, the amplitude |A| marked above and below the midline, the maximum D + |A|, and the minimum D − |A|.]
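The properties in the list above are all simple formulas in A, B, C, and D, so they can be collected in one small Python sketch. The function name `sinusoid_properties` and the `full_turn` parameter (2π for radians, 360 for degrees) are assumptions made for this illustration.

```python
import math


def sinusoid_properties(A, B, C, D, full_turn=2 * math.pi):
    """Properties of y = A sin(Bx + C) + D (or the cosine analogue), assuming B > 0."""
    if B <= 0:
        raise ValueError("rewrite the equation so that B > 0 first")
    return {
        "midline": D,
        "amplitude": abs(A),
        "frequency": B,
        "period": full_turn / B,
        "phase_shift": -C / B,
        "minimum": D - abs(A),
        "maximum": D + abs(A),
    }


# y = 3 sin(2x + pi) + 1, measured in radians.
for name, value in sinusoid_properties(3, 2, math.pi, 1).items():
    print(name, value)
```

Passing `full_turn=360` gives the period and phase shift in degrees instead.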

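Since graphs of trigonometric functions can go on forever, we will establish some criteria for a
‘sufficiently’ drawn graph (in the context of this book; this is NOT official or conventional). When a
problem asks for a sketch of some variant of sin x, then draw a curve of this form:

[Figure: one period of a sine curve with five dotted points, in order: midline, maximum, midline, minimum, midline.]

Then, label the coordinates of the five dotted points on that one period.
We will denote this portion of the graph as the first period. The first of the five points will
always have its x-coordinate equal to the phase shift (remember this, it is a useful tip!).
For some variant of − sin x, the first period would be:

[Figure: one period with five dotted points, in order: midline, minimum, midline, maximum, midline.]

For some variant of cos x, the first period would be:

[Figure: one period with five dotted points, in order: maximum, midline, minimum, midline, maximum.]

For some variant of − cos x, the first period would be:

[Figure: one period with five dotted points, in order: minimum, midline, maximum, midline, minimum.]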

Example 6.3.1
Sketch y = 2 sin(3x + 15) − 117.

Solution. For simplicity, we will use degrees.

First, notice that the phase shift is $-\frac{15}{3} = -5$, so the x-coordinate of the first point is $-5^\circ$.
Since we are sketching a variant of sin x, the first point lies on the midline, which is y = −117, so
the coordinates of the first point are $(-5^\circ, -117)$.
The period is $\frac{360^\circ}{3} = 120^\circ$, so the last point will have an x-coordinate of $-5^\circ + 120^\circ = 115^\circ$.
Notice that the five special points are equally spaced with respect to their x-coordinates.
Then, we can deduce that each point is horizontally a quarter of a full period away from its adjacent
points.

[Figure: one period of the curve divided by the five points into four equal horizontal pieces, each labeled "1/4 Period".]

Since the period is $120^\circ$, each point will horizontally be $30^\circ$ away from its adjacent points.

Thus, the second point will have an x-coordinate of $-5^\circ + 30^\circ = 25^\circ$, the third point will have an
x-coordinate of $25^\circ + 30^\circ = 55^\circ$, and the fourth point will have an x-coordinate of $55^\circ + 30^\circ = 85^\circ$.
We can also confirm that the x-coordinate of the last point is indeed $115^\circ$ by adding $30^\circ$ to $85^\circ$.

Now, we just have to find the y-coordinates. Here, we observe that the minimum is −117 − |2| =
−119 and the maximum is −117 + |2| = −115. Then, we can assign y-coordinates to the points
according to which are maximums, minimums, or lie on the midline. Thus, our first period sketch is:

[Figure: the first period of y = 2 sin(3x + 15) − 117 with the five labeled points $(-5^\circ, -117)$, $(25^\circ, -115)$, $(55^\circ, -117)$, $(85^\circ, -119)$, and $(115^\circ, -117)$.]

Exercise 6.3.2. Sketch $y = 2 - 3\cos\left(5x + \frac{\pi}{3}\right)$.
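The quarter-period bookkeeping from Example 6.3.1 can be sketched in Python for any sine variant y = A sin(Bx + C) + D with B > 0. The function name `first_period_points` and the `full_turn` parameter are hypothetical; a negative A automatically produces the − sin x shape, since the second point then lands on the minimum.

```python
def first_period_points(A, B, C, D, full_turn=360.0):
    """Five labeled points of the first period of y = A sin(Bx + C) + D."""
    start = -C / B                       # x-coordinate of the first point (phase shift)
    quarter = full_turn / B / 4          # horizontal spacing between adjacent points
    heights = [D, D + A, D, D - A, D]    # midline, extreme, midline, extreme, midline
    return [(start + k * quarter, h) for k, h in enumerate(heights)]


# The sketch from Example 6.3.1, in degrees.
for point in first_period_points(2, 3, 15, -117):
    print(point)
```

A cosine variant would need a different list of heights (starting on an extreme rather than on the midline), which is left out of this sketch.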

Example 6.3.3
Consider a portion of a sinusoidal function:

[Figure: a sinusoidal curve with a maximum at $A(30^\circ, 7)$, a minimum at B, and the points C and $D(60^\circ, 2)$ on the midline.]

Given that points C and D lie on the midline, find the coordinates of B and C. Then, find
four equations for the graph with the phase shift at A, B, C, and D.

Solution. Recall that a period can be broken up into four ‘quarters.’ While we are not directly
dealing with a single cycle this time, notice that there are seven ‘quarter’ pieces between points A
and D. Since the x-coordinates of points A and D are respectively $30^\circ$ and $60^\circ$, each ‘quarter’ piece
will be $\frac{60^\circ - 30^\circ}{7} = \frac{30^\circ}{7}$ long.
Since point B is two ‘quarter’ pieces away from point A, its x-coordinate will be $30^\circ + 2 \cdot \frac{30^\circ}{7} = \frac{270^\circ}{7}$.
Likewise, since point C is 5 ‘quarter’ pieces away from point A, its x-coordinate will be
$30^\circ + 5 \cdot \frac{30^\circ}{7} = \frac{360^\circ}{7}$.
Now, we just have to find the y-coordinates. Notice that point C also lies on the midline, so its
y-coordinate will be the same as point D, which is 2.
Since B lies on a minimum of the wave, its y-coordinate will be the midline minus the amplitude.
The amplitude is also the distance from the midline to the maximum, and since point A lies on
a maximum of 7, the amplitude must be 7 − 2 = 5. Thus, the minimum is 2 − 5 = −3, so the
y-coordinate of point B must be −3.
Thus, point B has coordinates $\left(\frac{270^\circ}{7}, -3\right)$ and point C has coordinates $\left(\frac{360^\circ}{7}, 2\right)$.
To find an equation for this graph, we need the following information:

• We have already found the amplitude to be 5.

• The period is four ‘quarters,’ which is $4 \cdot \frac{30^\circ}{7} = \frac{120^\circ}{7}$.

• Since the period is $\frac{120^\circ}{7} = \frac{360^\circ}{21}$, the frequency must be 21.

• The midline is y = 2.

All we need now is the phase shift. We look back to the four different shapes of ‘first periods’ for
sin x, − sin x, cos x, and − cos x to determine which function should be used for which point.

1. If the phase shift is at point A, then the function must be some variant of cos x. The phase
shift itself must be 30◦ (it is positive because we are shifting cosine to the right). Putting
together all the information, we have the equation,

y = 5 cos(21(x − 30◦ )) + 2 =⇒ y = 5 cos(21x − 630◦ ) + 2 .

2. Likewise, the phase shift at point B must be $\frac{270^\circ}{7}$. The function that should be used is some variant of − cos x. Thus, our equation is

$y = -5\cos\left(21\left(x - \frac{270^\circ}{7}\right)\right) + 2 \implies y = -5\cos(21x - 810^\circ) + 2$.

3. The phase shift at point C must be $\frac{360^\circ}{7}$. Our function at this point must be some variant of − sin x, so the equation is

$y = -5\sin\left(21\left(x - \frac{360^\circ}{7}\right)\right) + 2 \implies y = -5\sin(21x - 1080^\circ) + 2$.

4. The phase shift at point D is $60^\circ$. The function must be some variant of sin x, so the equation is

$y = 5\sin(21(x - 60^\circ)) + 2 \implies y = 5\sin(21x - 1260^\circ) + 2$.
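
As a quick numerical sanity check (a sketch, not part of the original solution), the four equations can be compared on a grid of sample angles. The helper names `deg_sin`, `deg_cos`, and `from_A` through `from_D` are made up for this illustration; all four functions should describe the same curve.

```python
import math

def deg_sin(d):
    # Sine of an angle given in degrees.
    return math.sin(math.radians(d))

def deg_cos(d):
    # Cosine of an angle given in degrees.
    return math.cos(math.radians(d))

# The four equations found above, one per phase-shift point.
def from_A(x): return 5 * deg_cos(21 * x - 630) + 2
def from_B(x): return -5 * deg_cos(21 * x - 810) + 2
def from_C(x): return -5 * deg_sin(21 * x - 1080) + 2
def from_D(x): return 5 * deg_sin(21 * x - 1260) + 2

# Sample x from 30 degrees to 60 degrees in half-degree steps.
samples = [30 + 0.5 * k for k in range(61)]
max_gap = max(
    abs(from_A(x) - g(x))
    for x in samples
    for g in (from_B, from_C, from_D)
)
print("largest disagreement:", max_gap)

# Values at A (maximum), B (minimum), and D (midline).
print(from_A(30), from_A(270 / 7), from_A(60))
```

The largest disagreement should be at the level of floating-point rounding, and the three printed values should match the y-coordinates 7, −3, and 2 found above.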

6.4 Inverse Trigonometric Functions

Clearly, the graphs of these trigonometric functions fail the horizontal line test: if any horizontal
line intersects the function at more than one point, then an inverse function does not exist.
Then how could there possibly be inverse trigonometric functions? It turns out that the only
reason for the failure of the horizontal line test is that the graph of the function repeats itself
periodically.
However, there is no need to consider all the repeated periods. We just need to look at one
period of the graph in order to define an appropriate inverse function.
Consider the graph of sin x on the interval $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$.

If we just consider this single portion of the graph, we can see that we have covered all possible
points of sin x between −1 and 1. In other words, every distinct value of x is mapped to its
corresponding, distinct value of sin x. This is the condition needed for the existence of an inverse
function.
By limiting the graph to this interval, the horizontal line test is satisfied, i.e. there exists an
inverse for this function.
So, if we limit the domain of sin x to $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, then there exists an inverse function. We could have chosen $\left[\frac{\pi}{2}, \frac{3\pi}{2}\right]$ or $\left[-\frac{3\pi}{2}, -\frac{\pi}{2}\right]$, but it is most convenient to select the interval centered at the origin.
We will call this inverse function $\sin^{-1} x$. It can also be called arcsin x. It is universally agreed that to define $\sin^{-1} x$, we will restrict the domain of sin x to $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$.
Then, the domain of $\sin^{-1} x$ is simply the range of sin x, which remains [−1, 1], and the range of $\sin^{-1} x$ is the newly restricted domain of sin x, which is $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$.
We can easily graph $\sin^{-1} x$ by reflecting the limited graph of sin x over the line y = x.

[Figure: graph of $\sin^{-1} x$ for $-1 \le x \le 1$, rising from $-\frac{\pi}{2}$ to $\frac{\pi}{2}$.]

Now, let’s consider cos x. Like before, we want to restrict the domain so we can define $\cos^{-1} x$.
This time, if we try to limit the domain to $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, we run into a couple of problems. First, that
portion of the graph still fails the horizontal line test! Second, we would completely leave out the
negative values of cos x. Remember, it is our goal to include every possible value of cos x between
−1 and 1 without repeated values.

However, if we limit the domain of cos x to [0, π], then the resulting portion of the graph
accomplishes what we wanted: it passes the horizontal line test, and it covers all possible values of
cos x between −1 and 1.

Thus, we define cos−1 x (or arccos x, if you prefer) to be the inverse of cos x, when the domain
of cos x is restricted to [0, π].

Of course, this interval is chosen for convenience - we could’ve chosen [−π, 0], but the negative
bounds may complicate matters in the future. It is nice to stick with a positive interval starting at 0.
As with sin x, it is generally agreed that the domain of cos x should be limited to [0, π] in order to
define cos−1 x.

Therefore, if the domain of cos x was limited to [0, π], then the range of cos−1 x should be [0, π].
As the range of cos x is still [−1, 1], the domain of cos−1 x is [−1, 1].

Again, we graph cos−1 x by reflecting through the line y = x.

[Figure: graph of $\cos^{-1} x$ for $-1 \le x \le 1$.]

Lastly, let’s move on to tan x. We still have the same problem as before, but now you should be
able to tell what slight modification is necessary.
[Figure: graph of tan x for $-2\pi \le x \le 2\pi$.]

We should limit the domain to $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. It is unnecessary to include more than one period of
tangent, since this selected portion already covers all values of tan x.
Then, since the range of tan x is R, the domain of $\tan^{-1} x$ is R. Likewise, the restricted domain
of tan x is $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, so the range of $\tan^{-1} x$ is $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$.
By reflecting over the line y = x, we can graph $\tan^{-1} x$. The dashed lines represent the horizontal
asymptotes of $\tan^{-1} x$.

[Figure: graph of $\tan^{-1} x$ with dashed horizontal asymptotes $y = \frac{\pi}{2}$ and $y = -\frac{\pi}{2}$.]

To recap what we have done so far, here are the domains and ranges of the three trigonometric
inverse functions.

                 Domain       Range
$\sin^{-1} x$    $[-1, 1]$    $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$
$\cos^{-1} x$    $[-1, 1]$    $[0, \pi]$
$\tan^{-1} x$    R            $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$

You should grow accustomed to some common values for these functions.
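$x$                      $\sin^{-1} x$       $\cos^{-1} x$
$1$                      $\frac{\pi}{2}$     $0$
$\frac{\sqrt{3}}{2}$     $\frac{\pi}{3}$     $\frac{\pi}{6}$
$\frac{\sqrt{2}}{2}$     $\frac{\pi}{4}$     $\frac{\pi}{4}$
$\frac{1}{2}$            $\frac{\pi}{6}$     $\frac{\pi}{3}$
$0$                      $0$                 $\frac{\pi}{2}$
$-\frac{1}{2}$           $-\frac{\pi}{6}$    $\frac{2\pi}{3}$
$-\frac{\sqrt{2}}{2}$    $-\frac{\pi}{4}$    $\frac{3\pi}{4}$
$-\frac{\sqrt{3}}{2}$    $-\frac{\pi}{3}$    $\frac{5\pi}{6}$
$-1$                     $-\frac{\pi}{2}$    $\pi$

$x$                      $\tan^{-1} x$
$\sqrt{3}$               $\frac{\pi}{3}$
$1$                      $\frac{\pi}{4}$
$\frac{1}{\sqrt{3}}$     $\frac{\pi}{6}$
$0$                      $0$
$-\frac{1}{\sqrt{3}}$    $-\frac{\pi}{6}$
$-1$                     $-\frac{\pi}{4}$
$-\sqrt{3}$              $-\frac{\pi}{3}$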

Take a moment to confirm that these values of x give the correct values of sin−1 x, cos−1 x, and
tan−1 x.
Problem 6.4.1. Find the values of the following:
1. $\sin^{-1}\left(\sin\frac{\pi}{7}\right)$

2. $\cos^{-1}\left(\cos\frac{\pi}{7}\right)$

3. $\tan^{-1}\left(\tan\frac{\pi}{7}\right)$

4. sin−1 (sin 2π)

5. sin−1 (sin 3)

Solution. Understand that the basic definition of an inverse implies f −1 (f (x)) = x. But be careful
of the domains and ranges of the inverse trig functions!
1. $\sin^{-1}\left(\sin\frac{\pi}{7}\right) = \frac{\pi}{7}$.

2. $\cos^{-1}\left(\cos\frac{\pi}{7}\right) = \frac{\pi}{7}$.

3. $\tan^{-1}\left(\tan\frac{\pi}{7}\right) = \frac{\pi}{7}$.

4. The answer is not 2π, because the range of inverse sine is $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$. So we must find a $\theta \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ such that sin θ = sin(2π), which is θ = 0. Therefore, $\sin^{-1}(\sin 2\pi) = \sin^{-1}(\sin 0) = 0$.

5. Similarly, 3 would not be the correct answer. But note that sin 3 = sin(π − 3), by the identity sin(π − θ) = sin θ. Since $\pi - 3 \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, $\sin^{-1}(\sin 3) = \sin^{-1}(\sin(\pi - 3)) = \pi - 3$.
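
The range restriction can also be observed numerically. The following is a small sketch using Python's standard `math.asin`, which returns values in $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$; the helper name `arcsin_of_sin` is made up for this illustration.

```python
import math

def arcsin_of_sin(t):
    # Composition sin^{-1}(sin t); the result always lands in [-pi/2, pi/2].
    return math.asin(math.sin(t))

# Part 1: pi/7 is already inside the range, so it comes back unchanged.
print(arcsin_of_sin(math.pi / 7), math.pi / 7)

# Part 4: 2*pi is outside the range; the result is 0 up to rounding error.
print(arcsin_of_sin(2 * math.pi))

# Part 5: 3 is outside the range; the result matches pi - 3.
print(arcsin_of_sin(3), math.pi - 3)
```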

Example 6.4.2
Find the values of the following:
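1. $\sin^{-1}\left(\cos\frac{\pi}{7}\right)$

2. $\sin^{-1}\left(\cos\left(-\frac{\pi}{7}\right)\right)$

3. $\sin\left(\cos^{-1}\frac{\pi}{7}\right)$

4. Find a general form for $\sin(\cos^{-1} x)$.

5. $\sin\left(\cos^{-1}\frac{\pi}{2}\right)$

6. $\sec(\tan^{-1}(2017))$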

Solution.

1. Use the cofunction identities to make the problem more manageable:

\[
\sin^{-1}\left(\cos\frac{\pi}{7}\right) = \sin^{-1}\left(\sin\left(\frac{\pi}{2} - \frac{\pi}{7}\right)\right) = \sin^{-1}\left(\sin\frac{5\pi}{14}\right) = \frac{5\pi}{14}.
\]

2. Note that cos(−θ) = cos θ. Thus, $\sin^{-1}\left(\cos\left(-\frac{\pi}{7}\right)\right) = \sin^{-1}\left(\cos\frac{\pi}{7}\right) = \frac{5\pi}{14}$, based on the previous problem.

3. We introduce another technique that is commonly applied to this type of problem.

Let $\theta = \cos^{-1}\frac{\pi}{7}$. This means that $\cos\theta = \frac{\pi}{7}$. We must find sin θ, and now the problem becomes familiar:

\[
\sin^2\theta + \cos^2\theta = 1, \qquad \sin^2\theta = 1 - \frac{\pi^2}{49}, \qquad \sin\theta = \pm\frac{\sqrt{49 - \pi^2}}{7}.
\]

Which sign do we choose? Now, we have to consider the domain and range of $\cos^{-1} x$. Since the range is [0, π], we must have θ ∈ [0, π]. In that interval, sin θ is never negative. Therefore, we choose the positive square root to get $\frac{\sqrt{49 - \pi^2}}{7}$.

4. First, note that

\[
\sin^2\theta + \cos^2\theta = 1, \qquad \sin^2\theta = 1 - \cos^2\theta, \qquad \sin\theta = \pm\sqrt{1 - \cos^2\theta}.
\]

Then, let $\theta = \cos^{-1} x$. So, we have

\[
\sin(\cos^{-1} x) = \pm\sqrt{1 - \cos^2(\cos^{-1} x)} = \pm\sqrt{1 - x^2}.
\]

However, $\cos^{-1} x$ has range [0, π], and sin is never negative in this interval, so we choose the positive square root: $\sin(\cos^{-1} x) = \sqrt{1 - x^2}$.

5. The domain of $\cos^{-1} x$ is [−1, 1], but we are given $x = \frac{\pi}{2}$, which is greater than 1. Therefore, the expression is undefined and there is no solution.

6. Using the identity $\tan^2\theta + 1 = \sec^2\theta$ and letting $\theta = \tan^{-1}(2017)$, or tan θ = 2017, we get $\sec\theta = \pm\sqrt{2017^2 + 1}$. Note that $\theta \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, so sec θ, being the reciprocal of cos θ, must be positive, so we choose the positive square root: $\sqrt{2017^2 + 1}$.
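
The closed forms from parts 4 and 6 can be compared against direct floating-point evaluation. This is only a sketch; the function name `sin_of_arccos` is an assumption introduced here, not notation from the text.

```python
import math

def sin_of_arccos(x):
    # Closed form from part 4: sin(arccos x) = sqrt(1 - x^2) for -1 <= x <= 1.
    return math.sqrt(1 - x * x)

# Compare the closed form with direct evaluation at a few sample inputs.
for x in (-1.0, -0.5, 0.0, 0.3, 1.0):
    direct = math.sin(math.acos(x))
    print(x, direct, sin_of_arccos(x))

# Part 6: sec(arctan 2017) versus sqrt(2017^2 + 1).
sec_value = 1 / math.cos(math.atan(2017))
print(sec_value, math.sqrt(2017 ** 2 + 1))
```

Part 5 can be observed as well: `math.acos(math.pi / 2)` raises a `ValueError`, since the input lies outside [−1, 1].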
    
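Exercise 6.4.3. Compute and simplify $\sin\left(\cos^{-1}\left(-\frac{3}{5}\right) + \sin^{-1}\left(-\frac{4}{5}\right)\right)$.

Exercise 6.4.4. Compute and simplify $\sin\left(\sin^{-1}\left(\frac{5}{13}\right) + \cos^{-1}\left(-\frac{5}{13}\right)\right)$.
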
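Problem 6.4.5. Prove that $\sin^{-1} x + \cos^{-1} x = \frac{\pi}{2}$.

Proof. Let $\sin^{-1} x = \theta$, or sin θ = x. By the cofunction identities, $\sin\theta = \cos\left(\frac{\pi}{2} - \theta\right) = x$. Since $\theta \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, the angle $\frac{\pi}{2} - \theta$ lies in [0, π], which is the range of $\cos^{-1}$, so $\cos^{-1} x = \frac{\pi}{2} - \theta$, or $\cos^{-1} x = \frac{\pi}{2} - \sin^{-1} x$. This rearranges to $\sin^{-1} x + \cos^{-1} x = \frac{\pi}{2}$, and we’re done.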

Problem 6.4.6. Find the following values:


 
1. $\sin\left(\cos^{-1}\left(-\frac{1}{3}\right)\right)$

2. $\csc(\tan^{-1}(-2))$

3. $\sec(\sin^{-1} x)$

Solution. It is essential that you understand the domains and ranges of the inverse functions so that
you are able to choose the proper sign.

1. Let $\theta = \cos^{-1}\left(-\frac{1}{3}\right)$. Then $\cos\theta = -\frac{1}{3}$. Because the range of $\cos^{-1}$ is [0, π], and sin is always nonnegative in that interval, sin θ must be nonnegative. Using the identity $\sin^2\theta + \cos^2\theta = 1$, we get that $\sin\theta = \pm\frac{2\sqrt{2}}{3}$, so choose the positive square root: $\sin\theta = \frac{2\sqrt{2}}{3}$.

2. This is a challenging problem.

First, let $\theta = \tan^{-1}(-2)$. Then tan θ = −2. Using the identity $\tan^2\theta + 1 = \sec^2\theta$, we get that $\sec\theta = \pm\sqrt{5}$. We know that tan θ < 0, so $\theta \in \left(-\frac{\pi}{2}, 0\right)$. cos θ must be positive in that interval, so sec θ is also positive. Thus, $\sec\theta = \sqrt{5}$, and $\cos\theta = \frac{1}{\sqrt{5}}$. Using the identity $\sin^2\theta + \cos^2\theta = 1$, we get $\sin\theta = \pm\frac{2}{\sqrt{5}}$, but since $\theta \in \left(-\frac{\pi}{2}, 0\right)$, sin θ < 0, so we choose $\sin\theta = -\frac{2}{\sqrt{5}}$. Therefore, $\csc\theta = -\frac{\sqrt{5}}{2}$.

3. Note that $\sec(\sin^{-1} x) = \frac{1}{\cos(\sin^{-1} x)}$. Let $\theta = \sin^{-1} x$, so sin θ = x. The range of inverse sine implies that $\theta \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, and therefore cos θ is never negative in that interval. Using the identity $\sin^2\theta + \cos^2\theta = 1$, $\cos\theta = \sqrt{1 - x^2}$ (we choose the positive square root). Thus, $\sec(\sin^{-1} x) = \frac{1}{\sqrt{1 - x^2}}$, which is defined for −1 < x < 1.

6.5 More Trigonometric Identities

So far, we have only discovered identities that dealt with 90◦ or 180◦ . What if we wanted to find an
expression for sin(θ + 30◦ )? If we know sin θ and sin 30◦ , it seems reasonable that we should also get
sin(θ + 30◦ ).

To develop our first set of new, useful identities, we first consider the following two triangles
which lie on the unit circle:
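[Figure: two copies of the unit circle. In the first, centered at $O$, the points $B(\cos a, \sin a)$ and $C(\cos b, \sin b)$ lie on the circle with $\angle COB = b - a$. In the second, centered at $O'$, the point $E$ lies on the positive x-axis and $D(\cos(b - a), \sin(b - a))$ lies on the circle with $\angle DO'E = b - a$.]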

Notice that $OC \cong O'D$ and $OB \cong O'E$, as they are radii of the unit circle. We have also
constructed the triangles to have the same angle $m\angle COB = m\angle DO'E = b - a$. Therefore, by SAS,
$\triangle OBC \cong \triangle O'ED$. Then, $BC \cong ED$, implying that the lengths $BC$ and $ED$ are equal.
Writing both lengths with the distance formula (with $E = (1, 0)$) and then squaring, we have basic algebraic manipulation:

\[
\sqrt{(\cos b - \cos a)^2 + (\sin b - \sin a)^2} = \sqrt{(\cos(b - a) - 1)^2 + \sin^2(b - a)}
\]
\[
(\cos b - \cos a)^2 + (\sin b - \sin a)^2 = (\cos(b - a) - 1)^2 + \sin^2(b - a).
\]

Expanding gives

\[
\cos^2 b - 2\cos b\cos a + \cos^2 a + \sin^2 b - 2\sin b\sin a + \sin^2 a = \cos^2(b - a) - 2\cos(b - a) + 1 + \sin^2(b - a).
\]

This reduces to
2 − 2(cos b cos a + sin b sin a) = 2 − 2 cos(b − a).

From here, we finally get our first new identity:

cos(b − a) = cos b cos a + sin b sin a.

Now, observe that cos(b + a) = cos(b − (−a)). Then,

cos(b + a) = cos(b − (−a))


cos(b + a) = cos b cos(−a) + sin b sin(−a)
cos(b + a) = cos b cos a − sin b sin a,

using the identities cos(−a) = cos a and sin(−a) = − sin a.


To find the formula for sin(b + a), apply the cofunction identity sin(b + a) = cos(90◦ − (b + a)).

sin(b + a) = cos(90◦ − (b + a))


sin(b + a) = cos((90◦ − b) − a)
sin(b + a) = cos(90◦ − b) cos a + sin(90◦ − b) sin a
sin(b + a) = sin b cos a + cos b sin a.

The angle difference for sine is similar to that of cosine: use the facts that sin(−θ) = − sin θ and
cos(−θ) = cos θ. We have,

sin(b − a) = sin(b + (−a))


sin(b − a) = sin b cos(−a) + cos b sin(−a)
sin(b − a) = sin b cos a − cos b sin a.

The four identities we have just discovered are known as the sum and difference identities:

sin(x ± y) = sin x cos y ± cos x sin y


cos(x ± y) = cos x cos y ∓ sin x sin y

Here are some exercises that require the use of these formulas.

Problem 6.5.1. Compute the following values:

1. cos(15◦ )

2. sin(195◦ )
 
3. $\cos\left(\cos^{-1}\frac{3}{5} + \cos^{-1}\frac{5}{13}\right)$

Solution.
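1. \[
\cos 15^\circ = \cos(45^\circ - 30^\circ) = \cos 45^\circ\cos 30^\circ + \sin 45^\circ\sin 30^\circ = \frac{\sqrt{2}}{2}\cdot\frac{\sqrt{3}}{2} + \frac{\sqrt{2}}{2}\cdot\frac{1}{2} = \frac{\sqrt{2}}{2}\left(\frac{\sqrt{3} + 1}{2}\right) = \frac{\sqrt{6} + \sqrt{2}}{4}.
\]

2. \[
\sin 195^\circ = \sin(150^\circ + 45^\circ) = \sin 150^\circ\cos 45^\circ + \cos 150^\circ\sin 45^\circ = \frac{\sqrt{2}}{2}\left(\frac{1}{2} - \frac{\sqrt{3}}{2}\right) = \frac{\sqrt{2}}{2}\left(\frac{1 - \sqrt{3}}{2}\right) = \frac{\sqrt{2} - \sqrt{6}}{4}.
\]

3. \[
\cos\left(\cos^{-1}\frac{3}{5} + \cos^{-1}\frac{5}{13}\right) = \cos\left(\cos^{-1}\frac{3}{5}\right)\cos\left(\cos^{-1}\frac{5}{13}\right) - \sin\left(\cos^{-1}\frac{3}{5}\right)\sin\left(\cos^{-1}\frac{5}{13}\right) = \frac{3}{5}\cdot\frac{5}{13} - \frac{4}{5}\cdot\frac{12}{13} = -\frac{33}{65}.
\]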
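
The three exact values above can be compared with floating-point evaluation. This is a minimal sketch; the helper `close` is a made-up convenience wrapper around the standard `math.isclose`.

```python
import math

def close(a, b):
    # True when a and b agree up to floating-point rounding.
    return math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-12)

cos15 = math.cos(math.radians(15))
sin195 = math.sin(math.radians(195))
nested = math.cos(math.acos(3 / 5) + math.acos(5 / 13))

print(close(cos15, (math.sqrt(6) + math.sqrt(2)) / 4))
print(close(sin195, (math.sqrt(2) - math.sqrt(6)) / 4))
print(close(nested, -33 / 65))
```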

Using these identities, we can also find the sum and difference formulas for tangent:
\[
\tan(x \pm y) = \frac{\sin(x \pm y)}{\cos(x \pm y)} = \frac{\sin x\cos y \pm \cos x\sin y}{\cos x\cos y \mp \sin x\sin y}.
\]
Divide the numerator and denominator by cos x cos y to get the desired relation:
\[
\tan(x \pm y) = \frac{\tan x \pm \tan y}{1 \mp \tan x\tan y}.
\]
Problem 6.5.2. Compute the following values:

1. tan(15◦ )
2. $\tan^{-1}\frac{1}{2} + \tan^{-1}\frac{1}{3}$
Solution.
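1. \[
\tan 15^\circ = \tan(45^\circ - 30^\circ) = \frac{\tan 45^\circ - \tan 30^\circ}{1 + \tan 45^\circ\tan 30^\circ} = \frac{1 - \frac{1}{\sqrt{3}}}{1 + \frac{1}{\sqrt{3}}} = 2 - \sqrt{3}.
\]

2. The addition of these inverse tan values suggests that we should take the tangent of this sum and analyze its result. Note that

\[
\tan\left(\tan^{-1}\frac{1}{2} + \tan^{-1}\frac{1}{3}\right) = \frac{\tan\left(\tan^{-1}\frac{1}{2}\right) + \tan\left(\tan^{-1}\frac{1}{3}\right)}{1 - \tan\left(\tan^{-1}\frac{1}{2}\right)\tan\left(\tan^{-1}\frac{1}{3}\right)} = \frac{\frac{1}{2} + \frac{1}{3}}{1 - \frac{1}{6}} = 1.
\]

If the slope (which is the tangent) is positive and less than 1, then the angle must be between 0 and $\frac{\pi}{4}$. Thus, $0 < \tan^{-1}\frac{1}{2} < \frac{\pi}{4}$ and $0 < \tan^{-1}\frac{1}{3} < \frac{\pi}{4}$, so $0 < \tan^{-1}\frac{1}{2} + \tan^{-1}\frac{1}{3} < \frac{\pi}{2}$.

Combining this with the result that $\tan\left(\tan^{-1}\frac{1}{2} + \tan^{-1}\frac{1}{3}\right) = 1$, we see that the only value possible for $\tan^{-1}\frac{1}{2} + \tan^{-1}\frac{1}{3}$ is $\frac{\pi}{4}$.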
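
A one-line numerical check of part 2, offered as a sketch using the standard `math.atan`:

```python
import math

# Sum of the two inverse tangents from part 2, compared with pi/4.
total = math.atan(1 / 2) + math.atan(1 / 3)
print(total, math.pi / 4)
```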

Using the sum and difference formulas, we can derive formulas for sin 2x, cos 2x, and tan 2x in
terms of sin x, cos x, and tan x.
Observe that 2x = x + x. Then,

sin(2x) = sin(x + x)
= sin x cos x + cos x sin x
= 2 sin x cos x .
cos(2x) = cos(x + x)
= cos x cos x − sin x sin x
= cos2 x − sin2 x .

The result for cos(2x) reminds us of the Pythagorean Identity sin2 x + cos2 x = 1. In fact, this
equation can be rewritten in two ways: sin2 x = 1 − cos2 x, and cos2 x = 1 − sin2 x. We then

substitute these two relations into the result cos2 x − sin2 x to derive two new double angle identities
for cosine:

cos(2x) = cos2 x − sin2 x


= cos2 x − (1 − cos2 x) = 2 cos2 x − 1 ,
= (1 − sin2 x) − sin2 x = 1 − 2 sin2 x .

Deriving the double angle identity for tangent is straightforward:

\[
\tan(2x) = \tan(x + x) = \frac{\tan x + \tan x}{1 - \tan x\tan x} = \frac{2\tan x}{1 - \tan^2 x}.
\]

Thus, our double angle identities for sine, cosine, and tangent are:

\[
\sin 2x = 2\sin x\cos x
\]
\[
\cos 2x = \cos^2 x - \sin^2 x = 2\cos^2 x - 1 = 1 - 2\sin^2 x
\]
\[
\tan 2x = \frac{2\tan x}{1 - \tan^2 x}
\]
Problem 6.5.3. Answer the following:

1. If $\cos x = \frac{1}{3}$, what is cos 2x?

2. If $\sin x = \frac{3}{5}$, what is sin 2x?

3. If $\tan x = \frac{1}{3}$, what is tan 2x?

4. Compute the value $\tan\left(2\tan^{-1}\frac{1}{2}\right)$.

5. Compute cos(2 tan−1 5).

Solution.
1. Directly apply the double angle formula for cosine: $\cos 2x = 2\cos^2 x - 1 = 2\left(\frac{1}{3}\right)^2 - 1 = -\frac{7}{9}$.

2. Since sin 2x = 2 sin x cos x, we also have to find the value of cos x. Using the Pythagorean Identity $\sin^2 x + \cos^2 x = 1$, we see that $\cos x = \pm\frac{4}{5}$. We must be careful as there are two possible values for sin 2x when we consider the sign of cos x:

• If cos x > 0, then $\sin 2x = 2\cdot\frac{3}{5}\cdot\frac{4}{5} = \frac{24}{25}$.

• If cos x < 0, then $\sin 2x = 2\cdot\frac{3}{5}\cdot\left(-\frac{4}{5}\right) = -\frac{24}{25}$.

3. Again, we can directly apply the double angle formula for tangent: $\tan 2x = \frac{2\cdot\frac{1}{3}}{1 - \left(\frac{1}{3}\right)^2} = \frac{3}{4}$.

4. We treat $\tan^{-1}\frac{1}{2}$ as the angle we are doubling. Applying the double angle formula in this fashion, we have

\[
\tan\left(2\tan^{-1}\frac{1}{2}\right) = \frac{2\tan\left(\tan^{-1}\frac{1}{2}\right)}{1 - \tan^2\left(\tan^{-1}\frac{1}{2}\right)} = \frac{1}{1 - \left(\frac{1}{2}\right)^2} = \frac{4}{3}.
\]

5. By the double angle identity for cosine, $\cos(2\tan^{-1} 5) = 2\cos^2(\tan^{-1} 5) - 1$. Now we must compute the value $\cos(\tan^{-1} 5)$ to find the answer.

Let $\theta = \tan^{-1} 5$, implying tan θ = 5. Using the identity $\tan^2\theta + 1 = \sec^2\theta$, we get $\sec\theta = \pm\sqrt{26}$. However, as θ is equal to an inverse tangent function, the range of inverse tangent implies that $\theta \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. In this interval, cos θ, and therefore sec θ, are always positive, so $\sec\theta = \sqrt{26}$ and $\cos\theta = \frac{1}{\sqrt{26}}$.

Therefore, $\cos(2\tan^{-1} 5) = 2\cos^2(\tan^{-1} 5) - 1 = 2\left(\frac{1}{\sqrt{26}}\right)^2 - 1 = \frac{1}{13} - 1 = -\frac{12}{13}$.
If we can double an angle, we can also halve an angle. This will lead us to our next set of
identities.
First, we can express sin θ in terms of cos 2θ:
\[
\cos 2\theta = 1 - 2\sin^2\theta \implies \sin\theta = \pm\sqrt{\frac{1 - \cos 2\theta}{2}}.
\]
Then, we can substitute $\theta \to \frac{\theta}{2}$ to get
\[
\sin\frac{\theta}{2} = \pm\sqrt{\frac{1 - \cos\theta}{2}}.
\]

Likewise, for cosine:

\[
\cos 2\theta = 2\cos^2\theta - 1 \implies \cos\theta = \pm\sqrt{\frac{1 + \cos 2\theta}{2}} \implies \cos\frac{\theta}{2} = \pm\sqrt{\frac{1 + \cos\theta}{2}}.
\]

There are two half angle identities for tangent, derived in different ways:

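\[
\tan\frac{\theta}{2} = \frac{\sin\frac{\theta}{2}}{\cos\frac{\theta}{2}} = \frac{\sin\frac{\theta}{2}}{\cos\frac{\theta}{2}}\cdot\frac{2\cos\frac{\theta}{2}}{2\cos\frac{\theta}{2}} = \frac{2\sin\frac{\theta}{2}\cos\frac{\theta}{2}}{2\cos^2\frac{\theta}{2}} = \frac{\sin\theta}{1 + \cos\theta}.
\]

\[
\tan\frac{\theta}{2} = \frac{\sin\frac{\theta}{2}}{\cos\frac{\theta}{2}} = \frac{\sin\frac{\theta}{2}}{\cos\frac{\theta}{2}}\cdot\frac{2\sin\frac{\theta}{2}}{2\sin\frac{\theta}{2}} = \frac{2\sin^2\frac{\theta}{2}}{2\sin\frac{\theta}{2}\cos\frac{\theta}{2}} = \frac{1 - \cos\theta}{\sin\theta}.
\]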

Thus, the half angle identities for sine, cosine, and tangent are:
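\[
\sin\frac{\theta}{2} = \pm\sqrt{\frac{1 - \cos\theta}{2}}
\]
\[
\cos\frac{\theta}{2} = \pm\sqrt{\frac{1 + \cos\theta}{2}}
\]
\[
\tan\frac{\theta}{2} = \frac{\sin\theta}{1 + \cos\theta} = \frac{1 - \cos\theta}{\sin\theta}
\]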
Problem 6.5.4. Answer the following:

1. If $\cos\theta = \frac{1}{5}$ and θ ∈ I, what is $\cos\frac{\theta}{2}$?

2. Compute and simplify $\tan\frac{\pi}{8}$.

Solution.
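1. The half angle formula for cosine gives us $\cos\frac{\theta}{2} = \pm\sqrt{\frac{1 + \frac{1}{5}}{2}} = \pm\sqrt{\frac{3}{5}}$. Be careful, $\cos\frac{\theta}{2}$ may not always be positive! Note that θ ∈ I implies that $0^\circ + 360k^\circ < \theta < 90^\circ + 360k^\circ$, which could suggest $360^\circ < \theta < 450^\circ$, leading to $180^\circ < \frac{\theta}{2} < 225^\circ$, and $\cos\frac{\theta}{2}$ is negative in this interval. Therefore $\cos\frac{\theta}{2}$ can be either positive or negative, even though θ is in quadrant I.

2. We simply use the half angle formula for tangent: $\tan\frac{\pi}{8} = \frac{\sin\frac{\pi}{4}}{1 + \cos\frac{\pi}{4}} = \frac{\frac{\sqrt{2}}{2}}{1 + \frac{\sqrt{2}}{2}} = \frac{\sqrt{2}}{2 + \sqrt{2}} = \sqrt{2} - 1$.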

Exercise 6.5.5. Compute tan 67.5◦ and simplify your answer.


 
Problem 6.5.6. Compute $\tan\left(\frac{1}{2}\tan^{-1}(2)\right)$.

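Solution. If we let $\theta = \tan^{-1}(2)$, then we have to compute $\tan\frac{\theta}{2}$. By our half angle identity, we have
\[
\tan\frac{\theta}{2} = \frac{\sin\theta}{1 + \cos\theta}.
\]

Now we have to find sin θ and cos θ. Note that $\theta = \tan^{-1}(2) \implies \tan\theta = 2$. Now we have a simple reference triangle problem.

Note that $\tan^2\theta + 1 = \sec^2\theta$, so $\sec^2\theta = 2^2 + 1 = 5 \implies \sec\theta = \pm\sqrt{5}$.
Recall that the range of $\tan^{-1}$ is $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. Since $\theta = \tan^{-1}(2)$ and 2 is a positive slope, we must conclude that θ is in the first quadrant, i.e. $\theta \in \left(0, \frac{\pi}{2}\right)$. Then cos θ and sec θ must be positive on that interval. Thus, we choose $\sec\theta = \sqrt{5}$, and thus $\cos\theta = \frac{1}{\sqrt{5}}$. Likewise, sin θ will also be positive on this interval, so $\sin\theta = \sqrt{1 - \left(\frac{1}{\sqrt{5}}\right)^2} = \frac{2}{\sqrt{5}}$.
Finally, we can plug in to get
\[
\tan\frac{\theta}{2} = \frac{\frac{2}{\sqrt{5}}}{1 + \frac{1}{\sqrt{5}}} = \frac{2}{\sqrt{5} + 1} = \frac{\sqrt{5} - 1}{2}.
\]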

In addition to the double angle identities, we also have the triple angle identities:

\[
\sin 3\theta = -4\sin^3\theta + 3\sin\theta
\]
\[
\cos 3\theta = 4\cos^3\theta - 3\cos\theta
\]
\[
\tan 3\theta = \frac{3\tan\theta - \tan^3\theta}{1 - 3\tan^2\theta}
\]

Problem 6.5.7. Prove the triple angle identities.

Proof. Using the fact that 3θ = 2θ+θ, we can use the sum formulas for sine and cosine, then repeatedly
apply double angle identities and Pythagorean identities sin2 θ = 1 − cos2 θ and cos2 θ = 1 − sin2 θ
to finish the proof.

sin 3θ = sin(2θ + θ)
= sin 2θ cos θ + cos 2θ sin θ
= (2 sin θ cos θ) cos θ + (1 − 2 sin2 θ) sin θ
= 2 sin θ cos2 θ + sin θ − 2 sin3 θ
= 2 sin θ(1 − sin2 θ) + sin θ − 2 sin3 θ
= 2 sin θ − 2 sin3 θ + sin θ − 2 sin3 θ
= −4 sin3 θ + 3 sin θ.
cos 3θ = cos(2θ + θ)
= cos 2θ cos θ − sin 2θ sin θ
= (2 cos2 θ − 1) cos θ − (2 sin2 θ cos θ)
= 2 cos3 θ − cos θ − (1 − cos2 θ)(2 cos θ)
= 2 cos3 θ − cos θ − 2 cos θ + 2 cos3 θ
= 4 cos3 θ − 3 cos θ.
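\[
\tan 3\theta = \tan(2\theta + \theta) = \frac{\tan 2\theta + \tan\theta}{1 - \tan 2\theta\tan\theta} = \frac{\frac{2\tan\theta}{1 - \tan^2\theta} + \tan\theta}{1 - \frac{2\tan\theta}{1 - \tan^2\theta}\cdot\tan\theta}
\]
\[
= \frac{\ \frac{2\tan\theta + \tan\theta(1 - \tan^2\theta)}{1 - \tan^2\theta}\ }{\frac{1 - 3\tan^2\theta}{1 - \tan^2\theta}} = \frac{2\tan\theta + \tan\theta - \tan^3\theta}{1 - 3\tan^2\theta} = \frac{3\tan\theta - \tan^3\theta}{1 - 3\tan^2\theta}.
\]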
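
A numerical spot check does not replace the proof, but it is a cheap way to catch sign errors in identities like these. The sketch below evaluates both sides of each triple angle identity at a few arbitrarily chosen angles (in radians) that avoid the points where tangent is undefined; the function name `triple_angle_gaps` is made up for this illustration.

```python
import math

def triple_angle_gaps(t):
    # Absolute difference between the two sides of each triple angle identity at angle t.
    s, c, tn = math.sin(t), math.cos(t), math.tan(t)
    return (
        abs(math.sin(3 * t) - (3 * s - 4 * s ** 3)),
        abs(math.cos(3 * t) - (4 * c ** 3 - 3 * c)),
        abs(math.tan(3 * t) - (3 * tn - tn ** 3) / (1 - 3 * tn ** 2)),
    )

# Sample angles chosen away from the singularities of tan(t) and tan(3t).
angles = (0.1, 0.4, 0.9, 1.3, 2.0, -0.7)
worst = max(max(triple_angle_gaps(t)) for t in angles)
print("largest gap:", worst)
```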

Now, here are some review problems that will require the use of the identities discovered in this
section.
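Problem 6.5.8. If $\sin\theta + \cos\theta = \frac{1}{2}$, what’s sin 2θ?

Solution. If we square both sides of the equation $\sin\theta + \cos\theta = \frac{1}{2}$, we get
\[
\sin^2\theta + 2\sin\theta\cos\theta + \cos^2\theta = \frac{1}{4}.
\]

Note that $\sin^2\theta + \cos^2\theta = 1$ and $\sin 2\theta = 2\sin\theta\cos\theta$. Therefore, $1 + \sin 2\theta = \frac{1}{4}$, so $\sin 2\theta = -\frac{3}{4}$.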

Problem 6.5.9. Simplify cos4 T − sin4 T as much as possible.

Solution. We have that cos4 T −sin4 T = (cos2 T +sin2 T )(cos2 T −sin2 T ). However, cos2 T +sin2 T =
1, and cos2 T − sin2 T = cos 2T , thus cos4 T − sin4 T = cos 2T .
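Problem 6.5.10. Prove $\cot\frac{\theta}{2} = \csc\theta + \cot\theta$.

Proof. The half angle identity for tangent is $\tan\frac{\theta}{2} = \frac{\sin\theta}{1 + \cos\theta}$. Then, $\cot\frac{\theta}{2} = \frac{1}{\tan\frac{\theta}{2}} = \frac{1 + \cos\theta}{\sin\theta} = \frac{1}{\sin\theta} + \frac{\cos\theta}{\sin\theta} = \csc\theta + \cot\theta$, as desired.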
Problem 6.5.11. Derive a formula for cot(α + β) in terms of cot α and cot β.

Solution. We can use the tangent sum identity:


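\[
\cot(\alpha + \beta) = \frac{1}{\tan(\alpha + \beta)} = \frac{1 - \tan\alpha\tan\beta}{\tan\alpha + \tan\beta} = \frac{1 - \frac{1}{\cot\alpha\cot\beta}}{\frac{1}{\cot\alpha} + \frac{1}{\cot\beta}} = \frac{\ \frac{\cot\alpha\cot\beta - 1}{\cot\alpha\cot\beta}\ }{\frac{\cot\alpha + \cot\beta}{\cot\alpha\cot\beta}} = \frac{\cot\alpha\cot\beta - 1}{\cot\alpha + \cot\beta}.
\]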

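Problem 6.5.12. If tan θ = 2, find all possible values of $\tan\frac{\theta}{2}$.

Solution. As tan θ is positive, θ is in either quadrant I or III.

If θ ∈ I, then $\sin\theta = \frac{2}{\sqrt{5}}$ and $\cos\theta = \frac{1}{\sqrt{5}}$, so $\tan\frac{\theta}{2} = \frac{\frac{2}{\sqrt{5}}}{1 + \frac{1}{\sqrt{5}}} = \frac{2}{\sqrt{5} + 1} = \frac{\sqrt{5} - 1}{2}$.

If θ ∈ III, then $\sin\theta = -\frac{2}{\sqrt{5}}$ and $\cos\theta = -\frac{1}{\sqrt{5}}$, so $\tan\frac{\theta}{2} = \frac{-\frac{2}{\sqrt{5}}}{1 - \frac{1}{\sqrt{5}}} = -\frac{2}{\sqrt{5} - 1} = \frac{-\sqrt{5} - 1}{2}$.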
Problem 6.5.13. If $\sin\theta = \frac{3}{5}$, what is sin 3θ?

Solution. The triple angle identity for sine is $\sin 3\theta = -4\sin^3\theta + 3\sin\theta$. Therefore, when $\sin\theta = \frac{3}{5}$, $\sin 3\theta = -4\left(\frac{3}{5}\right)^3 + 3\cdot\frac{3}{5} = \frac{117}{125}$.
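Problem 6.5.14. Prove $\cos 20^\circ\cos 40^\circ\cos 80^\circ = \frac{1}{8}$.

Proof. Using sin(180◦ − θ) = sin θ and the double angle identity for sine, notice that
\[
\cos 20^\circ\cos 40^\circ\cos 80^\circ\sin 20^\circ = \frac{1}{2}\sin 40^\circ\cos 40^\circ\cos 80^\circ = \frac{1}{4}\sin 80^\circ\cos 80^\circ = \frac{1}{8}\sin 160^\circ = \frac{1}{8}\sin 20^\circ.
\]
Since $\cos 20^\circ\cos 40^\circ\cos 80^\circ\sin 20^\circ = \frac{1}{8}\sin 20^\circ$ and $\sin 20^\circ \ne 0$, we may divide by $\sin 20^\circ$ to get $\cos 20^\circ\cos 40^\circ\cos 80^\circ = \frac{1}{8}$, so we’re done.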
Problem 6.5.15. Prove $\cos 36^\circ\cos 72^\circ = \frac{1}{4}$. Then, use this result to find the value of cos 36◦.

Solution. Similar to the previous problem,
\[
\cos 36^\circ\cos 72^\circ\sin 36^\circ = \frac{1}{2}\sin 72^\circ\cos 72^\circ = \frac{1}{4}\sin 144^\circ = \frac{1}{4}\sin 36^\circ,
\]
implying $\cos 36^\circ\cos 72^\circ = \frac{1}{4}$.

To find cos 36◦, rewrite $\cos 36^\circ\cos 72^\circ = \frac{1}{4}$ in terms of cos 36◦ only. To do this, notice that 72 is 2 · 36, so we can use the double angle formula to get $\cos 72^\circ = 2\cos^2 36^\circ - 1$, so we have
\[
\cos 36^\circ\left(2\cos^2 36^\circ - 1\right) = \frac{1}{4}.
\]
Letting cos 36◦ = x for convenience, the above equation leaves us with the polynomial $8x^3 - 4x - 1 = 0$. This factors into $2\left(x + \frac{1}{2}\right)(4x^2 - 2x - 1) = 0$, and since cos 36◦ clearly does not equal $-\frac{1}{2}$, we must find the roots of $4x^2 - 2x - 1$. Using the quadratic formula, we get $x = \frac{1 \pm \sqrt{5}}{4}$. Since cos 36◦ > 0, we have $\cos 36^\circ = \frac{1 + \sqrt{5}}{4}$.
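
The result can be checked numerically. The sketch below plugs the candidate value into the cubic from the solution and compares it with a direct floating-point evaluation of cos 36◦; the variable names are illustrative only.

```python
import math

# Candidate exact value for cos(36 degrees) from the quadratic formula.
x = (1 + math.sqrt(5)) / 4
cos36 = math.cos(math.radians(36))
cos72 = math.cos(math.radians(72))

print(x, cos36)                   # the two values should agree
print(8 * x ** 3 - 4 * x - 1)     # residual of the cubic, should be near 0
print(cos36 * cos72)              # should be near 1/4
```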
Problem 6.5.16. Prove $-\sqrt{2} \le \sin\theta + \cos\theta \le \sqrt{2}$ ∀θ.

Proof. Given the range of sine, we have that sin 2θ ≤ 1, which implies sin 2θ + 1 ≤ 2. However, note
that $\sin 2\theta + 1 = \sin 2\theta + \sin^2\theta + \cos^2\theta = 2\sin\theta\cos\theta + \sin^2\theta + \cos^2\theta = (\sin\theta + \cos\theta)^2$. Therefore,
we have $(\sin\theta + \cos\theta)^2 \le 2$, which means that $-\sqrt{2} \le \sin\theta + \cos\theta \le \sqrt{2}$, and we are done.

Problem 6.5.17. What’s the domain and range of $\cos^{-1}(\cos^{-1} x)$?

Solution. For $\cos^{-1}(\cos^{-1} x)$ to be defined, the input to the outer $\cos^{-1}$ must lie in [−1, 1]. Note
that the range of the inner $\cos^{-1} x$ is [0, π]; this does not fit inside the domain of the outer $\cos^{-1}$, so
we must restrict the values of the inner $\cos^{-1} x$ to [0, 1].
If the values of $\cos^{-1} x$ lie in [0, 1], then $0 \le \cos^{-1} x \le 1$. Taking the cosine of this inequality yields
1 ≥ x ≥ cos 1. We flip the inequality signs because cosine is decreasing on [0, π].
Therefore the domain of $\cos^{-1}(\cos^{-1} x)$ is [cos 1, 1].

Because the inner $\cos^{-1} x$ takes values in [0, 1] and $\cos^{-1}$ is decreasing, the range of $\cos^{-1}(\cos^{-1} x)$ is $[\cos^{-1} 1, \cos^{-1} 0]$
(because $\cos^{-1} 1 < \cos^{-1} 0$), or $\left[0, \frac{\pi}{2}\right]$.
2

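Problem 6.5.18. Prove $\tan\frac{\theta}{2} + \cot\frac{\theta}{2} = 2\csc\theta$.

Proof.
\[
\tan\frac{\theta}{2} + \cot\frac{\theta}{2} = \frac{\sin\theta}{1 + \cos\theta} + \frac{1 + \cos\theta}{\sin\theta} = \frac{\sin^2\theta + 1 + 2\cos\theta + \cos^2\theta}{(1 + \cos\theta)(\sin\theta)} = \frac{2 + 2\cos\theta}{(1 + \cos\theta)(\sin\theta)}
\]
\[
= \frac{2(1 + \cos\theta)}{(1 + \cos\theta)(\sin\theta)} = \frac{2}{\sin\theta} = 2\csc\theta.
\]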

Problem 6.5.19. Find all θ in radians such that $\sin\theta = \frac{1}{2}$.

Solution. It may be tempting to immediately declare $\theta = \frac{\pi}{6}$ as the solution. However, recall that all
trigonometric functions are periodic, and thus there are an infinite number of solutions. Since sine
has a period of 2π (and thus the value does not change when adding or subtracting multiples of 2π),
we see that $\theta = \frac{\pi}{6} + 2\pi k$ ∀k ∈ Z.

However, we are not finished. Note that sin θ = sin (π − θ). Therefore, $\theta = \frac{5\pi}{6}$ is also a valid
solution. We cannot forget that sine is periodic, so θ can actually be $\frac{5\pi}{6} + 2\pi k$ ∀k ∈ Z.

Combining these, we see that all solutions can be represented as:
\[
\theta = \frac{\pi}{6} + 2\pi k,\ \frac{5\pi}{6} + 2\pi k \quad \forall k \in \mathbb{Z}.
\]

It may be hard to see that $\theta = \frac{5\pi}{6}$ is also a solution. A helpful tip is to visualize the unit circle
on the coordinate plane, then plot the line $y = \frac{1}{2}$ and notice that it intersects the circle at two places,
namely $\left(\cos\frac{\pi}{6}, \sin\frac{\pi}{6}\right)$ and $\left(\cos\frac{5\pi}{6}, \sin\frac{5\pi}{6}\right)$. We can then proceed from here.
Problem 6.5.20. Find all 0◦ ≤ x < 360◦ in degrees such that $\cos x = \frac{1}{3}$. A calculator may be used.

Solution. First, we calculate $x = \cos^{-1}\left(\frac{1}{3}\right) \approx 70.5^\circ$. Then, observe that cos(360◦ − θ) = cos θ, so
we also have x = 360◦ − 70.5◦ = 289.5◦. All together, our solutions are 70.5◦, 289.5◦.
Like the previous problem, notice that the vertical line at $\frac{1}{3}$ on the horizontal axis intersects the unit circle at two places (the
first and fourth quadrants), so there must be two values of x ∈ [0◦, 360◦).

Problem 6.5.21. Find all 0 ≤ x < 2π in radians such that sin 2x = sin x.

Solution. First, we can use the double angle identity to get 2 sin x cos x = sin x. You may be tempted
to divide both sides by sin x, but this ignores the possibility that sin x = 0. It is better to rearrange
the equation to 2 sin x cos x − sin x = 0, from which we factor out sin x to get sin x(2 cos x − 1) = 0. Now,
we can notice that either sin x = 0 or $\cos x = \frac{1}{2}$. We can then get our solutions $x = 0, \frac{\pi}{3}, \pi, \frac{5\pi}{3}$.

Problem 6.5.22. Find all 0◦ ≤ θ < 360◦ in degrees such that $\cos 3\theta = -\frac{1}{2}$.

Solution. If 0◦ ≤ θ < 360◦, then 0◦ ≤ 3θ < 1080◦. Therefore, solving for all solutions of 3θ in
that range for the equation $\cos 3\theta = -\frac{1}{2}$, we have 3θ = 120◦, 240◦, 480◦, 600◦, 840◦, 960◦ (these are
obtained by determining the principal values 120◦ and 240◦, then adding multiples of 360◦ to each).
Dividing by 3, we have: θ = 40◦, 80◦, 160◦, 200◦, 280◦, 320◦.

Problem 6.5.23. Find all 0◦ ≤ x < 360◦ such that $\sin(2x + 44^\circ) = \frac{1}{2}$.

Solution. Let y = 2x + 44◦.

If 0◦ ≤ x < 360◦, then 44◦ ≤ y < 764◦.
If we only consider 0◦ ≤ y < 360◦, then the solutions of $\sin y = \frac{1}{2}$ are y = 30◦, 150◦.
To get all possible solutions 44◦ ≤ y < 764◦, we can simply add multiples of 360◦ to 30◦ and 150◦
respectively.

We end up with y = 150◦, 390◦, 510◦, 750◦ as our solutions when we take into account the
given restrictions.
Then, we replace y with 2x + 44◦ and solve for x, resulting in x = 53◦, 173◦, 233◦, 353◦.
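
Because the answers to this problem and to Problem 6.5.22 are whole numbers of degrees, a brute-force scan over integer angles gives an independent check. This is only a sketch: the helper `solve_on_degree_grid` is made up for this illustration, and it would miss any solution that is not an integer number of degrees.

```python
import math

def solve_on_degree_grid(f, target, lo=0, hi=360, tol=1e-9):
    # Return every integer angle x (in degrees) with lo <= x < hi and f(x) within tol of target.
    return [x for x in range(lo, hi) if abs(f(x) - target) < tol]

# Problem 6.5.23: sin(2x + 44 degrees) = 1/2.
sol_23 = solve_on_degree_grid(lambda x: math.sin(math.radians(2 * x + 44)), 0.5)

# Problem 6.5.22: cos(3 theta) = -1/2.
sol_22 = solve_on_degree_grid(lambda t: math.cos(math.radians(3 * t)), -0.5)

print(sol_23)
print(sol_22)
```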

Problem 6.5.24. Find all 0◦ ≤ x < 360◦ in degrees such that cos 2x = cos x.

Solution. By the double angle identity for cosine, we have $2\cos^2 x - 1 = \cos x$, which rearranges to
$2\cos^2 x - \cos x - 1 = 0$. We can factor this as (2 cos x + 1)(cos x − 1) = 0, from which we get $\cos x = -\frac{1}{2}$
and cos x = 1 as roots. Within the interval 0◦ ≤ x < 360◦, we have the solutions x = 0◦, 120◦, 240◦.

Exercise 6.5.25. Find all 0 ≤ θ < 2π such that cos 3θ = cos θ.

Problem 6.5.26. Classify all cases in which a, b, c and cos a, cos b, cos c are both arithmetic se-
quences.

Solution. Let a = b − d1 , c = b + d1 , and cos a = cos b − d2 , cos c = cos b + d2 . Then we have the
relations:

cos(b − d1 ) = cos b − d2 ,
cos(b + d1 ) = cos b + d2 .

Adding these, we eventually get 2 cos b cos d1 = 2 cos b, or 2 cos b(cos d1 − 1) = 0. This implies
cos d1 = 1 or cos b = 0. For the former, we get that d1 = 360k ◦ ∀k ∈ Z (d1 is the common difference
between a, b, c) and for the latter, b = 90◦ + 180k ◦ ∀k ∈ Z .

Problem 6.5.27. If $\sin x + \cos x = \frac{1}{2}$ and $\frac{\pi}{2} < x < \pi$, what is tan x?

Solution. First, note that
\[
(\sin x + \cos x)^2 = \sin^2 x + 2\sin x\cos x + \cos^2 x = 1 + \sin 2x = \left(\frac{1}{2}\right)^2,
\]
from which we get $\sin 2x = -\frac{3}{4}$.
If we know sin x + cos x, it may be helpful to also find out sin x − cos x. Observe that
\[
(\sin x - \cos x)^2 = \sin^2 x - 2\sin x\cos x + \cos^2 x = 1 - \sin 2x = \frac{7}{4}.
\]

Thus, taking the square root of both sides gives $\sin x - \cos x = \pm\frac{\sqrt{7}}{2}$. Now, we use the fact
that $\frac{\pi}{2} < x < \pi$. On this interval, sin x > cos x (this can be seen from looking at the unit circle or
comparing graphs of sine and cosine). Thus, since sin x − cos x > 0, we choose the positive sign:
$\sin x - \cos x = \frac{\sqrt{7}}{2}$.

Now we have the equations $\sin x + \cos x = \frac{1}{2}$ and $\sin x - \cos x = \frac{\sqrt{7}}{2}$. If we add them together,
we get $2\sin x = \frac{1 + \sqrt{7}}{2}$. If we subtract the second equation from the first, we get $2\cos x = \frac{1 - \sqrt{7}}{2}$.
Finally, we can compute
\[
\tan x = \frac{2\sin x}{2\cos x} = \frac{1 + \sqrt{7}}{1 - \sqrt{7}} = \frac{(1 + \sqrt{7})^2}{-6} = \frac{8 + 2\sqrt{7}}{-6} = -\frac{4 + \sqrt{7}}{3}.
\]
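
As a check on the algebra, the values of sin x and cos x found above can be tested against every condition in the problem. In this sketch, `s` and `c` are illustrative names for sin x and cos x, and `math.atan2` is used to recover the angle.

```python
import math

# Values implied by the solution: sin x = (1 + sqrt 7)/4 and cos x = (1 - sqrt 7)/4.
s = (1 + math.sqrt(7)) / 4
c = (1 - math.sqrt(7)) / 4

# Recover the angle and confirm that it lies strictly between pi/2 and pi.
x = math.atan2(s, c)

print(s + c)                            # should be 1/2
print(s * s + c * c)                    # should be 1
print(math.pi / 2 < x < math.pi)        # quadrant II check
print(s / c, -(4 + math.sqrt(7)) / 3)   # tan x, two ways
```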

6.6 Areas, Law of Sines, Law of Cosines


Theorem 6.6.1
The area of a triangle $\triangle ABC$ with side a opposite angle A, side b opposite angle B, and side c
opposite angle C is $\frac{1}{2}ab\sin C$.

Proof. We consider several cases: an acute triangle, a right triangle, an obtuse triangle in which
angle C is obtuse, and an obtuse triangle in which angle C is acute.

[Figure: acute triangle with sides a and b enclosing angle C, and altitude h dropped to base b.]

The area of this acute triangle above is $\frac{1}{2}bh$, where h is the altitude dropped to the base. Using
our knowledge of right triangle trig, it is clear that $\sin C = \frac{h}{a}$, which rearranges to h = a sin C.
Therefore, substituting h in the original area formula, the area is $\frac{1}{2}ab\sin C$.

[Figure: right triangle with legs a and b meeting at the right angle C, and hypotenuse c.]

Similarly, the area of this right triangle is $\frac{1}{2}ab$. Note that sin 90◦ = 1, so sin C = 1. Therefore,
the area is $\frac{1}{2}ab = \frac{1}{2}ab\sin C$.

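[Figure: obtuse triangle with sides a and b enclosing the obtuse angle C; the altitude h falls outside the triangle, onto the extension of base b, where it meets side a at the angle 180◦ − C.]

The area of this obtuse triangle with obtuse angle C is $\frac{1}{2}bh$, where the altitude h is dropped to
the extension of base b. Note that $h = a\sin(180^\circ - C) = a(\sin 180^\circ\cos C - \cos 180^\circ\sin C) = a(0 - (-1)\sin C) = a\sin C$, leading to the desired formula $\frac{1}{2}ab\sin C$.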
2

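[Figure: obtuse triangle in which angle C is acute, with sides a and b enclosing angle C and altitude h dropped to the line containing base b.]

Lastly, we consider the case of an obtuse triangle with acute angle C, with an area of $\frac{1}{2}bh$. Note
that $\sin C = \frac{h}{a} \implies h = a\sin C$, so the area can also be expressed by $\frac{1}{2}ab\sin C$, as desired.

We will henceforth use the notation [ABC] to denote the area of $\triangle ABC$.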

Theorem 6.6.2 (Law of Sines)


The law of sines states that for any given triangle $\triangle ABC$ with side lengths a, b, c opposite angles A, B, C respectively,
the following relationships hold:
\[
\frac{\sin A}{a} = \frac{\sin B}{b} = \frac{\sin C}{c}
\]

Proof. Recall that $[ABC] = \frac{1}{2}ab\sin C$. By the same argument applied to the other pairs of sides, $[ABC] = \frac{1}{2}bc\sin A$ and $[ABC] = \frac{1}{2}ac\sin B$.
Therefore, we have
\[
\frac{1}{2}ab\sin C = \frac{1}{2}bc\sin A = \frac{1}{2}ac\sin B.
\]
Dividing everything by $\frac{1}{2}abc$ yields
\[
\frac{\sin C}{c} = \frac{\sin A}{a} = \frac{\sin B}{b},
\]
as desired.

Theorem 6.6.3 (Law of Cosines)


The law of cosines states that for any given triangle $\triangle ABC$ with side lengths a, b, c, the
following equation is true:
\[
c^2 = a^2 + b^2 - 2ab\cos C.
\]

Proof. As before, we consider various cases depending on the triangle.

[Figure: acute triangle ABC with BC = a, AB = c, AC = b, and altitude h from B meeting side AC at X.]

Consider acute $\triangle ABC$. We have $\sin C = \frac{h}{a}$, so h = a sin C. We also have $\cos C = \frac{CX}{a}$, so
CX = a cos C. Therefore, AX = AC − CX = b − a cos C. Then, apply the Pythagorean Theorem
to $\triangle ABX$ to get $c^2 = h^2 + AX^2$. Substituting h = a sin C and AX = b − a cos C, we have

\[
c^2 = a^2\sin^2 C + (b - a\cos C)^2 = a^2\sin^2 C + b^2 - 2ab\cos C + a^2\cos^2 C = a^2\left(\sin^2 C + \cos^2 C\right) + b^2 - 2ab\cos C = a^2 + b^2 - 2ab\cos C.
\]

Secondly, for the case of a right triangle $\triangle ABC$ with right angle at C, it suffices to note that
cos C = cos 90◦ = 0, combined with the already given relationship $c^2 = a^2 + b^2$ by the Pythagorean
Theorem: $c^2 = a^2 + b^2 \implies c^2 = a^2 + b^2 - 0 \implies c^2 = a^2 + b^2 - 2ab\cos C$, so the right triangle
case is done.
[Figure: obtuse triangle ABC with obtuse angle at C, BC = a, AB = c, AC = b; the altitude h from B meets the extension of line AC at X, on the far side of C from A.]

Consider an obtuse triangle $\triangle ABC$ where ∠C is obtuse. Drop the altitude from B to X such
that it forms a perpendicular with line AC. Notice that ∠BCX = 180◦ − ∠BCA = 180◦ − ∠C.
Furthermore, as $\cos\angle BCX = \frac{CX}{a}$, CX = a cos ∠BCX = a cos(180◦ − C) = −a cos C. Similarly,
h = a sin ∠BCX = a sin(180◦ − C) = a sin C.

Applying the Pythagorean Theorem on $\triangle BXA$, we know that $c^2 = h^2 + AX^2$. It has already been
shown that h = a sin C and CX = −a cos C. Note that AX = b + CX, so AX = b − a cos C.
Substitute in the necessary values to get our desired result:

\[
c^2 = (a\sin C)^2 + (b - a\cos C)^2 = a^2\sin^2 C + b^2 - 2ab\cos C + a^2\cos^2 C = a^2\left(\sin^2 C + \cos^2 C\right) + b^2 - 2ab\cos C = a^2 + b^2 - 2ab\cos C.
\]

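[Figure: obtuse $\triangle ABC$ with acute angle $C$; the altitude $h$ from $B$ meets line $AC$ at $X$, beyond $A$.]

Lastly, consider an obtuse triangle $\triangle ABC$ where $\angle C$ is acute, and drop the altitude $h$ from $B$ to line $AC$, meeting it at $X$. First, right triangle $\triangle CBX$ yields $\sin C = \frac{h}{a} \implies h = a\sin C$. Furthermore, $\cos C = \frac{b + AX}{a} \implies AX = a\cos C - b$.
Apply the Pythagorean Theorem to $\triangle ABX$ to get $c^2 = AX^2 + h^2$. We know what $AX$ and $h$ are, so we substitute in their respective values:
\begin{align*}
c^2 &= (a\cos C - b)^2 + a^2\sin^2 C \\
&= a^2\cos^2 C - 2ab\cos C + b^2 + a^2\sin^2 C \\
&= a^2 + b^2 - 2ab\cos C.
\end{align*}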

We have covered all possible cases and have shown that the Law of Cosines works on any given
triangle.
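As a numerical sanity check of the theorem (not a replacement for the proof), the sketch below compares the Law of Cosines with a direct distance computation. The function names and the sample sides 3 and 5 are illustrative assumptions, not part of the text; the coordinate placement ($C$ at the origin, $A$ on the positive $x$-axis) is also an assumption made for convenience.

```python
import math

def third_side(a, b, C_deg):
    """Side c opposite angle C, computed from the Law of Cosines."""
    C = math.radians(C_deg)
    return math.sqrt(a * a + b * b - 2 * a * b * math.cos(C))

def third_side_by_coordinates(a, b, C_deg):
    """Same side computed directly: C at the origin, A on the x-axis, B at angle C."""
    C = math.radians(C_deg)
    ax, ay = b, 0.0                            # A is b units from C along the x-axis
    bx, by = a * math.cos(C), a * math.sin(C)  # B is a units from C at angle C
    return math.hypot(ax - bx, ay - by)

# One acute, one right, and one obtuse value of C (hypothetical sample data).
for C_deg in (40, 90, 130):
    print(C_deg, third_side(3, 5, C_deg), third_side_by_coordinates(3, 5, C_deg))
```

The two columns should agree up to floating-point rounding for every choice of $C$ between $0^\circ$ and $180^\circ$.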

Recall the triangle congruence laws:

1. SAS: Side-Angle-Side

2. SSS: Side-Side-Side

3. ASA: Angle-Side-Angle

4. AAS: Angle-Angle-Side

We also have HL (Hypotenuse-Leg), a special case of the SSS congruence law, but for right
triangles (as the third side is predetermined by Pythagorean Theorem).

These congruence laws suggest that when we are given 3 particular pieces of the triangle, we are
able to ‘solve’ the rest of the triangle (determining all remaining side lengths and angles). Here are
some examples:
ASA and AAS:

[Figure: $\triangle ABC$ with side $a$ labeled.]

When given two angles and a side length (whether or not the side is included between them), we can use the Law of Sines to find the rest of the information. We actually have four pieces of information (which we need in order to solve the proportions) because we can always find the third angle using the fact that the angles of a triangle always add up to $180^\circ$. This is why ASA and AAS are essentially the same.

\[ \frac{\sin A}{a} = \frac{\sin B}{b} = \frac{\sin C}{c} \]

Thus, we can compute side lengths $b$ and $c$ (which are not known to us immediately) using the Law of Sines.
SSS and SAS:

[Figure: $\triangle ABC$ with sides $a$, $b$, $c$ and angle $C$ labeled.]

When given two side lengths and the angle included between them (SAS), we can use the Law of Cosines to solve for the third side. Then, once the third side is known, we can use the Law of Sines to find the rest of the angles:

\[ \frac{\sin A}{a} = \frac{\sin B}{b} = \frac{\sin C}{c} \]

We can then solve for sin A and sin B, then take the inverse sine of those, to get our angles A
and B.
If we are given three side lengths but no angle measurements (SSS), we can find the measurement of one of the angles using the Law of Cosines ($c^2 = a^2 + b^2 - 2ab\cos C$) and solving for $\cos C$. Then, like before, we can use the Law of Sines to find the rest of the angles.
For the following problems, a calculator is required.

Problem 6.6.4. Solve for the rest of the side lengths and angles for the following triangles. Round angles in degrees to the nearest tenth and side lengths to the nearest hundredth.

1. $\triangle ABC$ with $m\angle A = 73^\circ$, $m\angle B = 42^\circ$, and $b = 7$

2. $\triangle ABC$ with $m\angle C = 12^\circ$, $b = 7$, and $a = 11$

Solution.

1. As all angles in a triangle add up to $180^\circ$, we get $m\angle C = 180^\circ - 73^\circ - 42^\circ = 65^\circ$. Then using the Law of Sines,
\[ \frac{\sin 73^\circ}{a} = \frac{\sin 42^\circ}{7} = \frac{\sin 65^\circ}{c}, \]
we can calculate the rest of the side lengths:
\[ a = \frac{7\sin 73^\circ}{\sin 42^\circ} \approx 10.00, \qquad c = \frac{7\sin 65^\circ}{\sin 42^\circ} \approx 9.48. \]

2. Using the Law of Cosines, $c^2 = 11^2 + 7^2 - 2 \cdot 11 \cdot 7 \cos 12^\circ$, we get that $c \approx 4.40$. By the Law of Sines, we have
\[ \frac{\sin 12^\circ}{c} = \frac{\sin A}{11} = \frac{\sin B}{7}. \]
Here are two caveats. First, we should use the unrounded value of $c$ (still stored in the calculator, not the approximate 4.40) so that rounding error does not carry into the angles. Second, we must find the smaller of the two remaining angles first. Why?
As inverse sine only returns values in the interval $[-90^\circ, 90^\circ]$, we could incorrectly label an obtuse angle acute. Therefore, we should choose the smaller angle of the two because we know for sure that a triangle cannot have two angles both greater than $90^\circ$. After calculating the smaller angle, we can then calculate the larger angle of the two by using the fact that all angles add up to $180^\circ$.
As $\angle B$ is the smaller of the two (it is opposite the shorter side), we calculate $m\angle B = \sin^{-1}\left(\frac{7\sin 12^\circ}{c}\right) \approx 19.3^\circ$. Then $m\angle A = 180^\circ - 12^\circ - 19.3^\circ = 148.7^\circ$.
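The SAS procedure used in part 2 can be sketched in code. This is a minimal illustration under stated assumptions: the function name `solve_sas` and its argument order are hypothetical, angles are in degrees, and the sketch keeps full floating-point precision for $c$ and finds the angle opposite the shorter given side first, as the solution recommends.

```python
import math

def solve_sas(a, b, C_deg):
    """Given sides a, b and the included angle C (degrees), return (c, A, B)."""
    C = math.radians(C_deg)
    # Law of Cosines for the third side; c is kept unrounded.
    c = math.sqrt(a * a + b * b - 2 * a * b * math.cos(C))
    # The angle opposite the shorter given side must be acute,
    # so inverse sine is safe for that angle.
    if a <= b:
        A = math.degrees(math.asin(a * math.sin(C) / c))
        B = 180.0 - C_deg - A
    else:
        B = math.degrees(math.asin(b * math.sin(C) / c))
        A = 180.0 - C_deg - B
    return c, A, B

c, A, B = solve_sas(11, 7, 12)
print(round(c, 2), round(A, 1), round(B, 1))
```

With the data of part 2, the rounded values should match the solution above ($c \approx 4.40$, $A \approx 148.7^\circ$, $B \approx 19.3^\circ$).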

Problem 6.6.5. Find all triangles with sides in arithmetic progression that have an angle of 60◦ .

Solution. If the triangle has an angle of 60◦ , then either the triangle is equiangular (and thus
equilateral) or for the other two angles, one is greater than 60◦ and one is less than 60◦ . Since
this angle of 60◦ is the average of the three angles in this kind of triangle, it must be between the
‘smallest’ and ‘largest’ side lengths (as the problem states that the sides are in arithmetic progression).
Therefore, if we let the side lengths be x − d, x, and x + d with a common difference of d, then this
angle of 60◦ is between x − d and x + d and is opposite the side x. We then apply Law of Cosines:

\begin{align*}
x^2 &= (x - d)^2 + (x + d)^2 - 2(x - d)(x + d)\cos 60^\circ \\
&= 2x^2 + 2d^2 - (x^2 - d^2) \\
&= x^2 + 3d^2,
\end{align*}
so $3d^2 = 0$ and therefore $d = 0$.

Therefore, the only kind of triangle that has sides in arithmetic progression and contains an angle of
60◦ is an equilateral triangle.

Why isn't SSA considered a valid triangle congruence law? When given two sides and an angle not included between them, there can be multiple scenarios. Henceforth, for $\triangle ABC$, we will denote sides $a = BC$, $b = AC$, and $c = AB$, and we take the given angle $A$ to be acute. Consider this diagram:

The line that forms an angle A with side length b will contain side c, depending on the side
length a. The following cases will demonstrate the various situations which can occur depending on
the length of a.

[Figure: side $b$ meeting the long line at angle $A$, with side $a$ too short to reach the line.]

Case 1: $a < b\sin A$. Possible triangles: 0.

If we choose a side length $a$ that is smaller than $b\sin A$ (which is the length of the perpendicular dropped from vertex $C$ to the long line), then we will not be able to create any triangle. One can visualize this by constructing a circle of radius $a$ around point $C$ and observing that this circle never intersects the long line.

[Figure: side $a$ perpendicular to the long line.]

Case 2: $a = b\sin A$. Possible triangles: 1.

If we let $a$ be that perpendicular, notice that we can construct exactly one triangle, which is the right triangle with altitude $a$. Note that the shortest distance from a point to a line is the perpendicular dropped from the point to the line.

[Figure: two possible positions $a_1$ and $a_2$ of side $a$ meeting the long line.]

Case 3: $b > a > b\sin A$. Possible triangles: 2.

If the side length $a$ is between the length of the altitude $b\sin A$ and the side length $b$, then there are actually two possible triangles. As mentioned in Case 1, it is easier to visualize this and understand the reasoning by constructing a circle of radius $a$ around point $C$ and seeing that it intersects the long line at two points.

[Figure: side $a$ at least as long as side $b$, meeting the long line once.]

Case 4: $a \ge b$. Possible triangles: 1.

If we make side length $a$ at least as long as side length $b$, then we can only make one triangle, where side $a$ is ‘pushed outward’ and cannot be ‘pushed inward’ as it was in Case 3.
Problem 6.6.6. Consider a triangle $\triangle ABC$ where $A = 30^\circ$ and $b = 10$. How many possible triangles are there for $a = 4$, $5$, $8$, and $12$ respectively?

Solution. Note that $\sin A = \frac{1}{2}$. If we solve the proportion (from the Law of Sines) $\frac{\sin B}{10} = \frac{1/2}{a}$ for $\sin B$, we get $\sin B = \frac{5}{a}$.

1. $a = 4$: We end up with $\sin B = \frac{5}{4} > 1$, which is clearly impossible as the range of sine is $[-1, 1]$. Therefore, there is no solution.

2. $a = 5$: We have $\sin B = \frac{5}{5} = 1$, so $B = 90^\circ$ is the only option. This corresponds to a right triangle with hypotenuse 10 and angle $30^\circ$.

3. $a = 8$: We have $\sin B = \frac{5}{8}$. Taking the inverse sine of this, we get $B \approx 38.7^\circ$. Using the identity $\sin(180^\circ - \theta) = \sin\theta$, we see that angle $B$ could also be $141.3^\circ$. If angle $B$ were $141.3^\circ$, such a triangle would be possible because $m\angle A + m\angle B = 30^\circ + 141.3^\circ < 180^\circ$. Therefore, there are two possible triangles here.

4. $a = 12$: This yields $\sin B = \frac{5}{12}$. Taking inverse sine and applying the identity $\sin(180^\circ - \theta) = \sin\theta$, we get $B \approx 24.6^\circ, 155.4^\circ$. However, if $m\angle B = 155.4^\circ$, then such a triangle would be impossible, because $m\angle A + m\angle B = 30^\circ + 155.4^\circ > 180^\circ$. Therefore, $B \approx 155.4^\circ$ is not valid, so there is only one possible triangle at $B \approx 24.6^\circ$.
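The four SSA cases can be condensed into a short counting routine. This is a sketch under assumptions: the function name `count_ssa_triangles` is hypothetical, the given angle $A$ is assumed acute (as in the cases above), and a floating-point tolerance is used to detect the boundary case $a = b\sin A$.

```python
import math

def count_ssa_triangles(A_deg, a, b):
    """Number of triangles with acute angle A (degrees), opposite side a, adjacent side b."""
    h = b * math.sin(math.radians(A_deg))  # altitude from C to the long line
    if math.isclose(a, h):
        return 1   # Case 2: a is exactly the perpendicular (right triangle)
    if a < h:
        return 0   # Case 1: a is too short to reach the line
    if a < b:
        return 2   # Case 3: the circle of radius a meets the line twice
    return 1       # Case 4: a >= b

# The four values of a from Problem 6.6.6 (A = 30 degrees, b = 10).
print([count_ssa_triangles(30, a, 10) for a in (4, 5, 8, 12)])
```

The printed list should read `[0, 1, 2, 1]`, matching the four answers in the solution above.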

Problem 6.6.7. Solve the following SSA scenario for $\triangle ABC$: $A = 38^\circ$, $a = 4$, $b = 9$.

Solution. By the Law of Sines, we have
\[ \frac{\sin 38^\circ}{4} = \frac{\sin B}{9}, \]
yielding $\sin B = \frac{9}{4}\sin 38^\circ \approx 1.39$. However, this is impossible, as $1.39 \notin [-1, 1]$, which is the range of sine. Therefore, there exists no such triangle.

Exercise 6.6.8. Solve the following triangle: a = 5, b = 6, and A = 50◦ .

Problem 6.6.9. Find the angles of a triangle with side lengths 12, 13, and 14.

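Solution. Given $\triangle ABC$, we let $c = 12$, $b = 13$, $a = 14$. By the Law of Cosines, we get
\[ 14^2 = 12^2 + 13^2 - 2 \cdot 12 \cdot 13 \cdot \cos A, \]
which yields $\cos A = \frac{14^2 - (12^2 + 13^2)}{-2 \cdot 12 \cdot 13} = \frac{3}{8}$. Using the Pythagorean identity $\sin^2 A + \cos^2 A = 1$ (and the fact that the sine of any angle of a triangle is positive), we get $\sin A = \frac{\sqrt{55}}{8}$. Since $\cos A > 0$, angle $A$ is acute, so taking inverse sine gives $A \approx 68.0^\circ$. Now that we know the value of $\sin A$, we can use the Law of Sines:
\[ \frac{\sin A}{14} = \frac{\sin B}{13} = \frac{\sin C}{12}. \]
Solve for $\sin B$ and $\sin C$, then take inverse sine (both angles are acute, since they are opposite the two shorter sides):
\[ \frac{\sqrt{55}/8}{14} = \frac{\sin B}{13} \implies B \approx 59.4^\circ, \]
\[ \frac{\sqrt{55}/8}{14} = \frac{\sin C}{12} \implies C \approx 52.6^\circ. \]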
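For comparison, here is a sketch of an SSS solver. Note that it swaps in a slightly different technique from the solution above: it applies the Law of Cosines to two angles (inverse cosine returns values in $[0^\circ, 180^\circ]$, so obtuse angles are handled without any extra care) and gets the third from the angle sum. The function name `angles_from_sides` is hypothetical.

```python
import math

def angles_from_sides(a, b, c):
    """Angles (A, B, C) in degrees opposite sides a, b, c, via the Law of Cosines."""
    A = math.degrees(math.acos((b * b + c * c - a * a) / (2 * b * c)))
    B = math.degrees(math.acos((a * a + c * c - b * b) / (2 * a * c)))
    C = 180.0 - A - B  # the angles of a triangle add up to 180 degrees
    return A, B, C

A, B, C = angles_from_sides(14, 13, 12)
print(round(A, 1), round(B, 1), round(C, 1))
```

For the sides 14, 13, 12 the rounded angles should agree with those found above.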
Exercise 6.6.10. Compute the largest angle in a triangle with side lengths 13, 15, and 17.
Chapter 7

Advanced Trigonometry

This is simply a continuation of the previous chapter; however, this chapter will focus on other ways of representing points in the coordinate plane, introducing more advanced concepts. The following material requires a solid understanding of the trigonometry covered earlier.

7.1 Polar Coordinates

In the Cartesian plane, we represent points by (x, y) according to an xy coordinate grid. Polar
coordinates are another way of representing these points.

[Figure: the point $(x, y)$ at distance $r$ from the origin, with the angle $\theta$ measured from the positive $x$-axis.]

Definition 7.1.1. We denote a point $(r; \theta)$ as a polar coordinate, where $r$ is the length of the ray from the origin to the point and $\theta$ is the angle through which that ray is rotated counter-clockwise from the positive $x$-axis, analogous to the unit circle definitions stated earlier.

On the Cartesian plane, the unit circle contains all points of the form $(\cos\theta, \sin\theta)$. This can be generalized to circles of any radius $k$: $(k\cos\theta, k\sin\theta)$. Therefore, we have that $(r\cos\theta, r\sin\theta)$ in Cartesian coordinates is $(r; \theta)$ in polar coordinates, or in other words $(x, y) \longleftrightarrow (r\cos\theta, r\sin\theta) \longleftrightarrow (r; \theta)$, implying:
\begin{align*}
x &= r\cos\theta, \\
y &= r\sin\theta.
\end{align*}

This suggests that $\frac{y}{x} = \tan\theta$ (when $x \neq 0$). Furthermore, note that $x^2 + y^2 = r^2\cos^2\theta + r^2\sin^2\theta = r^2(\sin^2\theta + \cos^2\theta) = r^2$. To simplify our work for converting between polar and Cartesian coordinates, we could establish $r \ge 0$ and $0 \le \theta \le 2\pi$ as stipulations, in which case $r = \sqrt{x^2 + y^2}$ and $\theta = \tan^{-1}\frac{y}{x}$, adding or subtracting multiples of $\pi$ as necessary (because inverse tangent only returns values between $-\frac{\pi}{2}$ and $\frac{\pi}{2}$, so the correct angle depends on which quadrant the point lies in).

Problem 7.1.2. Plot the following points and convert them into Cartesian coordinates.

1. $A\left(3; \frac{\pi}{2}\right)$

2. $B\left(2; \frac{5\pi}{6}\right)$

3. $C\left(4; -\frac{\pi}{3}\right)$

4. $D\left(0; 10^{10}\pi + 2\right)$

5. $E\,(-1; \pi)$

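[Figure: the points $A$ through $E$ plotted on the polar plane.]

Solution. For point $A$, we plot the point that is 3 away from the origin and $\frac{\pi}{2}$ radians from the starting line, which is the $x$-axis. To convert this into a point in the form $(x, y)$, recall that $x = r\cos\theta$ and $y = r\sin\theta$, so $x = 3\cos\frac{\pi}{2} = 0$ and $y = 3\sin\frac{\pi}{2} = 3$, so the Cartesian coordinate would be $(0, 3)$.
Similarly, for point $B$, note that $\left(2; \frac{5\pi}{6}\right) \longleftrightarrow \left(2\cos\frac{5\pi}{6}, 2\sin\frac{5\pi}{6}\right)$. We compute $\cos\frac{5\pi}{6} = -\frac{\sqrt{3}}{2}$ and $\sin\frac{5\pi}{6} = \frac{1}{2}$. Thus, $B = (-\sqrt{3}, 1)$.
Likewise, point $C = \left(4; -\frac{\pi}{3}\right) = \left(4\cos\left(-\frac{\pi}{3}\right), 4\sin\left(-\frac{\pi}{3}\right)\right)$. We have $\cos\left(-\frac{\pi}{3}\right) = \frac{1}{2}$ and $\sin\left(-\frac{\pi}{3}\right) = -\frac{\sqrt{3}}{2}$, so $C = (2, -2\sqrt{3})$.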
For point D, note that r = 0. In this case, D clearly has to be the origin, (0, 0) , regardless of
the angle given.
We are given a negative radius for point E. By intuition we can see that a point with −r reflects
the original point with r over the origin. Thus, (−1; π) is the same point as (1; 0), or (1, 0) in
Cartesian coordinates.
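The conversions above all use $x = r\cos\theta$ and $y = r\sin\theta$, which can be sketched in a few lines. The function name `to_cartesian` is a hypothetical helper; angles are assumed to be in radians, and results are only exact up to floating-point rounding.

```python
import math

def to_cartesian(r, theta):
    """Convert the polar point (r; theta), theta in radians, to Cartesian (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)

# Points A, B, C, and E from Problem 7.1.2.
for name, (r, theta) in {
    "A": (3, math.pi / 2),
    "B": (2, 5 * math.pi / 6),
    "C": (4, -math.pi / 3),
    "E": (-1, math.pi),
}.items():
    print(name, to_cartesian(r, theta))
```

A negative radius needs no special treatment here: the formulas automatically reflect the point through the origin.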

Problem 7.1.3. Convert these Cartesian coordinates into polar coordinates, with the stipulations
r ≥ 0 and 0 ≤ θ ≤ 2π.

1. (7, −7)

2. (−4, 3)

 
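Solution. For $(7, -7)$, we use the relation $r = \sqrt{x^2 + y^2}$ to get $r = 7\sqrt{2}$. Then, $\tan^{-1}\left(\frac{-7}{7}\right) = \tan^{-1}(-1) = -\frac{\pi}{4}$. However, the point lies in quadrant IV, so $\frac{3\pi}{2} < \theta < 2\pi$. Therefore, we add $2\pi$ to get $\theta = \frac{7\pi}{4}$. Then the polar coordinate is $\left(7\sqrt{2}; \frac{7\pi}{4}\right)$.

Similarly, for the point $(-4, 3)$, we get $r = 5$ and the candidate angle $\tan^{-1}\left(-\frac{3}{4}\right)$. However, $\tan^{-1}\left(-\frac{3}{4}\right)$ is negative (a quadrant IV angle), while the point lies in quadrant II, so we add $\pi$ to get $\theta = \pi + \tan^{-1}\left(-\frac{3}{4}\right)$. Therefore the polar coordinate is $\left(5; \pi + \tan^{-1}\left(-\frac{3}{4}\right)\right)$.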
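In code, the quadrant bookkeeping is usually delegated to a two-argument arctangent. The sketch below is an illustration rather than the method of the text: it uses Python's `math.atan2`, which already accounts for the quadrant, and reduces the angle modulo $2\pi$ so that it lies in $[0, 2\pi)$. The function name `to_polar` is hypothetical.

```python
import math

def to_polar(x, y):
    """Convert Cartesian (x, y) to polar (r, theta) with r >= 0 and 0 <= theta < 2*pi."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)  # atan2 picks the correct quadrant
    return r, theta

print(to_polar(7, -7))   # expected to match (7*sqrt(2), 7*pi/4)
print(to_polar(-4, 3))   # expected to match (5, pi + atan(-3/4))
```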

Now that we have dealt with plotting polar coordinates, it is time to graph polar functions.
First, we address some obvious special cases:

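• An equation of the form $r = k$ for some $k \in \mathbb{R}$ is simply a circle centered at the origin with radius $|k|$ (just the origin itself when $k = 0$).

• An equation of the form $\theta = \alpha$, where $\alpha$ is in radians or degrees, is a line through the origin that makes an angle $\alpha$ with the positive $x$-axis (or only a ray from the origin, if we require $r \ge 0$).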

For the remainder of the section, we will discuss the general r = f (θ) form for some function f
regarding θ.
Consider the graph of $r = 2\sin\theta$. Using the approximations $\sqrt{2} \approx 1.4$ and $\sqrt{3} \approx 1.7$, we can determine some points to plot:

θ       r            θ       r
0°      0            210°    −1
30°     1            225°    ≈ −1.4
45°     ≈ 1.4        240°    ≈ −1.7
60°     ≈ 1.7        270°    −2
90°     2            300°    ≈ −1.7
120°    ≈ 1.7        315°    ≈ −1.4
135°    ≈ 1.4        330°    −1
150°    1            360°    0
180°    0

We can then develop a basic sketch on the polar coordinate plane.


[Figure: sketch of $r = 2\sin\theta$ on the polar plane; scale 0.5.]

Note that the negative values on the right table result in the same points in the left table (recall
the identity sin(180◦ + θ) = − sin θ). After we plot these points and sketch the graph, notice that it
resembles a circle! We can then prove that this equation’s graph is a circle, by converting the polar
equation into a regular Cartesian equation:

\begin{align*}
r &= 2\sin\theta \\
r^2 &= 2r\sin\theta \\
x^2 + y^2 &= 2r\sin\theta \\
x^2 + y^2 &= 2y \\
x^2 + y^2 - 2y &= 0 \\
x^2 + y^2 - 2y + 1 &= 1 \\
x^2 + (y - 1)^2 &= 1.
\end{align*}

Thus, the graph is actually a circle of radius 1 centered at (0, 1).


Consider the graph of $r = 1 + \cos\theta$. Using the approximations $\frac{\sqrt{3}}{2} \approx 0.9$ and $\frac{\sqrt{2}}{2} \approx 0.7$:

θ       r            θ       r
0°      2            210°    ≈ 0.1
30°     ≈ 1.9        225°    ≈ 0.3
45°     ≈ 1.7        240°    0.5
60°     1.5          270°    1
90°     1            300°    1.5
120°    0.5          315°    ≈ 1.7
135°    ≈ 0.3        330°    ≈ 1.9
150°    ≈ 0.1        360°    2
180°    0

Then, we sketch:
[Figure: sketch of $r = 1 + \cos\theta$ on the polar plane; scale 0.5.]

This ‘heart-shaped’ graph is called a cardioid.


Now let’s graph $r = \frac{1}{2} + \cos\theta$.

θ       r            θ       r
0°      1.5          210°    ≈ −0.4
30°     ≈ 1.4        225°    ≈ −0.2
45°     ≈ 1.2        240°    0
60°     1            270°    0.5
90°     0.5          300°    1
120°    0            315°    ≈ 1.2
135°    ≈ −0.2       330°    ≈ 1.4
150°    ≈ −0.4       360°    1.5
180°    −0.5

[Figure: sketch of $r = \frac{1}{2} + \cos\theta$ on the polar plane; scale 0.5.]

The shape of this graph (a generalized cardioid) is called a limaçon.


There is a faster way to sketch these graphs; we analyze r increasing or decreasing as we increment
θ. For instance, consider the graph of r = sin 2θ. For some common values of θ, look at the behavior
of r:

θ    0°    45°    90°    135°    180°    225°    270°    315°    360°

r    0     1      0      −1      0       1       0       −1      0

The value of r is increasing from 0 to 1 as θ goes from 0◦ to 45◦ , then r decreases from 1 back to
0 as θ goes from 45◦ to 90◦ . This forms the quadrant I ‘petal’ of the graph, with a ‘peak’ at point
(1; 45◦ ).

We then see that r goes negative (from 0 to −1) when θ goes from 90◦ to 135◦ , and r goes back
to 0 as θ approaches 180◦ . As r is negative in this part, the petal which would have been in quadrant
II has been reflected across the origin, resulting in a petal that is in quadrant IV, with a peak at
(−1; 135◦ ).
We continue with the rest of the values of r, leaving us with four petals.
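[Figure: four-petaled graph of $r = \sin 2\theta$ with petal tips labeled $(1; 45^\circ)$, $(-1; 135^\circ)$, $(1; 225^\circ)$, and $(-1; 315^\circ)$; scale 0.5.]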

Exercise 7.1.4. Sketch the following polar functions:

1. r = sin 3θ

2. r = θ
3. $r = \frac{1}{\theta}$
4. r = 2 + cos 2θ

5. r = 2 + cos 3θ
Problem 7.1.5. Sketch r = sec θ.
Solution. Rearranging $r = \frac{1}{\cos\theta}$, we get $r\cos\theta = 1$. But $x = r\cos\theta$, so our graph is simply a line, $x = 1$.

Problem 7.1.6. Let a, b not be simultaneously 0. What do all graphs of the form r = a sin θ +b cos θ
have in common?

Solution. Multiply both sides of the equation by $r$ to get $r^2 = a \cdot r\sin\theta + b \cdot r\cos\theta$. Substitute in the relations $x^2 + y^2 = r^2$, $x = r\cos\theta$, and $y = r\sin\theta$, so we have $x^2 + y^2 = ay + bx$. Rearranging this and completing the square yields
\[ \left(x - \frac{b}{2}\right)^2 + \left(y - \frac{a}{2}\right)^2 = \frac{a^2 + b^2}{4}, \]
which is an equation for a circle, centered at $\left(\frac{b}{2}, \frac{a}{2}\right)$ with radius $\frac{\sqrt{a^2 + b^2}}{2}$. Therefore, all polar
graphs of the form r = a sin θ + b cos θ are circles which vary depending on the values of a and b.
Furthermore, note that all these circles go through the origin.
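The claim can be spot-checked numerically. The sketch below samples points of $r = a\sin\theta + b\cos\theta$ and measures their distance from $\left(\frac{b}{2}, \frac{a}{2}\right)$; the function names and the sample coefficients $a = 2$, $b = -4$ are illustrative assumptions.

```python
import math

def polar_point(a, b, theta):
    """Cartesian coordinates of the point on r = a*sin(theta) + b*cos(theta)."""
    r = a * math.sin(theta) + b * math.cos(theta)
    return r * math.cos(theta), r * math.sin(theta)

def distance_from_center(a, b, theta):
    """Distance from the point at angle theta to the claimed center (b/2, a/2)."""
    x, y = polar_point(a, b, theta)
    return math.hypot(x - b / 2, y - a / 2)

a, b = 2.0, -4.0                   # hypothetical sample coefficients
radius = math.hypot(a, b) / 2      # claimed radius sqrt(a^2 + b^2) / 2
for k in range(12):
    theta = k * math.pi / 6
    print(round(distance_from_center(a, b, theta), 12), round(radius, 12))
```

Every sampled distance should equal the claimed radius up to rounding, which is what it means for the points to lie on that circle.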

Exercise 7.1.7. Convert r = 2 sin θ − 4 cos θ into Cartesian form, and describe fully the graph.

Problem 7.1.8. Convert x + y = 10 into a polar equation r = f (θ) if possible.

Solution. Simply use the definitions x = r cos θ and y = r sin θ:

\begin{align*}
x + y &= 10 \\
r\cos\theta + r\sin\theta &= 10 \\
r(\cos\theta + \sin\theta) &= 10 \\
r &= \frac{10}{\cos\theta + \sin\theta}.
\end{align*}

Problem 7.1.9. Given the graph of r = f (θ), describe the transformation that results in

a) r = f (−θ).

b) r = k · f (θ), k ∈ R.

c) r = f (θ + α), α ∈ R.

Solution.

a) Recall that a ray on the unit circle with an angle −θ is rotated by θ clockwise from the initial
starting place (which is the x-axis) rather than counter-clockwise. Therefore, the graph of
r = f (−θ) is a reflection of r = f (θ) over the x-axis.

b) We have a little casework:

• If k > 0, then we simply scale the graph of r = f (θ) by a factor of k from the origin.
• If k = 0, then the graph becomes r = 0, which is simply the origin, (0, 0).
• If k < 0, then we first reflect the graph on the origin, then we scale the graph by a factor
of |k| from the origin.

c) The graph r = f (θ + α) is a rotation of r = f (θ) by an angle of |α|, clockwise when α is


positive and counter-clockwise when α is negative. This is analogous to left and right shifts for
normal graphs y = f (x), y = f (x + k) k ∈ R on the Cartesian plane.

Problem 7.1.10. If f is odd, what can you say about the graph of r = f (θ)? What about if f is
even?

Solution. Suppose we have a point P = (r; θ). We know that if f is odd, then r = f (θ) = −f (−θ),
or −r = f (−θ). This implies that (−r; −θ) lies on the graph as well. However, the point (−r; −θ) is
actually the reflection of point P across the y-axis.

To see why this is the case, note that (−r; θ) is a reflection of P across the origin, and (−r; −θ) is
the reflection of (−r; θ) across the x-axis. These two transformations ultimately result in a reflection
across the y-axis.
If f is even, then f (θ) = f (−θ). It has been established that f (−θ) is a reflection of f (θ) over
the x-axis. But if f (−θ) and f (θ) are the same graph, then we can say that a graph with even
function f is symmetric over the x-axis.

Problem 7.1.11. Convert the following polar equations into Cartesian equations:

1. r = cos θ

2. r = 1 + cos θ

Solution.

1. Multiply both sides by r, perform the necessary substitutions, then complete the square:

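\begin{align*}
r &= \cos\theta \\
r^2 &= r\cos\theta \\
x^2 + y^2 &= x \\
x^2 - x + y^2 &= 0 \\
x^2 - x + \frac{1}{4} + y^2 &= \frac{1}{4} \\
\left(x - \frac{1}{2}\right)^2 + y^2 &= \frac{1}{4}.
\end{align*}

2. Similar steps; however, the resulting equation is more complicated:

\begin{align*}
r &= 1 + \cos\theta \\
r^2 &= r + r\cos\theta \\
x^2 + y^2 &= r + x \\
r &= x^2 + y^2 - x \\
r^2 &= (x^2 + y^2 - x)^2 \\
x^2 + y^2 &= (x^2 + y^2 - x)^2.
\end{align*}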

Problem 7.1.12. Convert $x^2 - y^2 = 1$ into a polar equation.

Solution. Use the relations $x = r\cos\theta$ and $y = r\sin\theta$:

\begin{align*}
x^2 - y^2 &= 1 \\
r^2\cos^2\theta - r^2\sin^2\theta &= 1 \\
r^2(\cos^2\theta - \sin^2\theta) &= 1 \\
r^2\cos 2\theta &= 1 \\
r^2 &= \sec 2\theta \\
r &= \sqrt{\sec 2\theta}.
\end{align*}

Note that we do not put “±” before $\sqrt{\sec 2\theta}$ because it makes no difference whether $r$ is negative or not (every point $(r; \theta)$ can also be represented as $(-r; \theta + \pi)$, and $\sec 2(\theta + \pi) = \sec 2\theta$).

7.2 Parametric Equations


This section will introduce another way we can represent Cartesian points and graphs, by the use of
a third variable. Previous experience with conic sections is required.
Definition 7.2.1. Given some rectangular equation using variables x and y, we can relate x and y
to each other by expressing each in terms of a third variable t, the parameter. Given some arbitrary
functions f and g, we have the parametric equations:
x = f (t)
y = g(t)
The parametric curve is the collection of all points (f (t), g(t)) as t runs through some domain.

I present a well-known example: the unit circle. It can be represented by the parametric equations
x = cos θ, y = sin θ. In this case, θ would be the parameter.

• Let x = cos θ, y = sin θ and suppose 0 ≤ θ ≤ 2π.

The parametric curve would start at the point (1, 0) (the dot), go counter-clockwise in a
circular motion, and finish back at (1, 0).
• $-\pi \le \theta \le \pi$

The curve starts at $(-1, 0)$, goes counter-clockwise in a circular motion, and finishes at $(-1, 0)$.

• 0 ≤ θ ≤ 4π

The course of motion is similar to the first one (0 ≤ θ ≤ 2π) except that after the curve starts
at (1, 0), it goes around the same circle twice counter-clockwise and finishes at (1, 0).
• $\frac{\pi}{4} \le \theta \le \frac{\pi}{3}$

The result would be an arc of angle $\frac{\pi}{12}$ drawn counter-clockwise, as $\theta$ increases from $\frac{\pi}{4}$ to $\frac{\pi}{3}$. For clarity, the rest of the unit circle is shown through a dashed curve.

Consider the parametric equations x = cos 2θ, y = sin 2θ.

a) If there were no restriction on $\theta$, then we would have the normal unit circle, because $x^2 + y^2 = 1$, based on the Pythagorean Identity.
b) If we suppose that 0 ≤ θ ≤ 2π, then we would end up with a graph that starts at (1, 0),
goes counter-clockwise around the unit circle twice and finishes at (1, 0). Then how is it any
different than the parametric curve x = cos θ, y = sin θ, 0 ≤ θ ≤ 4π?
The curve for x = cos 2θ, y = sin 2θ is drawn twice as fast. For clarification, notice that when
θ = 2π, the graph for x = cos 2θ, y = sin 2θ is already complete (circle drawn over twice), but
for x = cos θ, y = sin θ, the circle has only been drawn once, with the ‘second circle’ remaining.

If we had parametric equations x = cos(−θ), y = sin(−θ), and restrictions 0 ≤ θ ≤ 2π, the


parametric curve would be the unit circle but drawn clockwise (going backwards), as opposed to
earlier examples.
Now, what if we defined x = sin θ, y = cos θ with 0 ≤ θ ≤ 2π? In fact, the graph would look the
same (but ‘drawn’ differently).

The graph would be a circle starting from (0, 1), going clockwise around the unit circle and
ending up back at (0, 1).

This can also result from reflecting the graph determined by x = cos θ, y = sin θ over the line
y = x, since we have switched the parametric functions assigned to x and y respectively.

Problem 7.2.2. Find the Cartesian equation represented by the parametric equations x = 1 + cos θ,
y = 2 + sin θ, with 0 ≤ θ ≤ 2π.

Solution. We use the Pythagorean identity $\sin^2\theta + \cos^2\theta = 1$. Note that $\cos\theta = x - 1$ and $\sin\theta = y - 2$. Substituting these into the identity, we therefore have $(x - 1)^2 + (y - 2)^2 = 1$, a circle of radius 1 centered at $(1, 2)$.

Problem 7.2.3. Find the Cartesian equation represented by the parametric equations $x = 88\cos\theta$, $y = 88\sin\theta$, with $0 \le \theta \le 2\pi$.

Solution. Similarly, note that $\cos\theta = \frac{x}{88}$ and $\sin\theta = \frac{y}{88}$, yielding the equation $\frac{x^2}{88^2} + \frac{y^2}{88^2} = 1$, or $x^2 + y^2 = 7744$, a circle centered at the origin with radius 88.

Problem 7.2.4. Find the Cartesian equation represented by the parametric equations x = 88 cos θ,
y = 89 sin θ, with 0 ≤ θ ≤ 2π.

Solution. We once again use our Pythagorean identity $\sin^2\theta + \cos^2\theta = 1$. Note that $x^2 = 88^2\cos^2\theta \implies \cos^2\theta = \frac{x^2}{88^2}$ and $y^2 = 89^2\sin^2\theta \implies \sin^2\theta = \frac{y^2}{89^2}$, yielding the equation $\frac{x^2}{88^2} + \frac{y^2}{89^2} = 1$, which is an ellipse.

Problem 7.2.5. Represent the ellipse with equation $\frac{(x + 2)^2}{25} + \frac{(y - 3)^2}{16} = 1$ with parametric equations.

Solution. Given the reasoning presented in the solution of the previous problem, we can work backwards. Using the identity $\sin^2\theta + \cos^2\theta = 1$, we have the relations $\frac{(x + 2)^2}{25} = \cos^2\theta$ and $\frac{(y - 3)^2}{16} = \sin^2\theta$, and solve to get the parametric equations $x = 5\cos\theta - 2$ and $y = 4\sin\theta + 3$.

Problem 7.2.6. Find parametric equations for the graph $x^2 - y^2 = 9$.

Solution. The form of the given equation is similar to another Pythagorean identity: $\sec^2\theta = 1 + \tan^2\theta$. Since we can rearrange the equation to $\frac{x^2}{3^2} - \frac{y^2}{3^2} = 1$, we then solve for the parametric equations $x = 3\sec\theta$ and $y = 3\tan\theta$.

Problem 7.2.7. Graph the parametric curve x = cos t, y = cos 2t.



Solution. Using the double-angle formulas, we have that $y = 2\cos^2 t - 1$, or $y = 2x^2 - 1$. At first glance the graph may seem like a normal parabola. However, we must consider the ranges of $x$ and $y$: as $x$ is equal to the cosine of the parameter $t$, $x$ necessarily has a range of $[-1, 1]$, so we must limit the graph to $x$-values from $-1$ to $1$ only.

[Figure: the arc of the parabola $y = 2x^2 - 1$ for $-1 \le x \le 1$.]

Problem 7.2.8. Find the Cartesian equation for parametric equations x = sin t, y = sin 2t.

Solution. The double-angle formula for sine states that $\sin 2t = 2\sin t\cos t$. Combining this with the fact that $\cos t = \pm\sqrt{1 - \sin^2 t}$, we have the equation $y = \pm 2x\sqrt{1 - x^2}$. To get rid of the plus-minus sign, we square both sides to get $y^2 = 4x^2(1 - x^2)$.

Problem 7.2.9. Sketch the parametric curve for equations $x = t^2 - t$, $y = t^2 + t$.

Solution. There is no clear relationship between $x$ and $y$, so first set up a table and calculate various $x$ and $y$ values:

t       x       y
−3      12      6
−2      6       2
−1      2       0
−1/2    3/4     −1/4
0       0       0
1/2     −1/4    3/4
1       0       2
2       2       6
3       6       12

Then, we can produce a basic sketch:

[Figure: sketch of the parametric curve through the tabulated points, with both axes marked from 2 to 12.]

The graph looks like a rotated parabola. We can investigate this by finding the Cartesian equation for this parametric curve. Given the parametric equations
\begin{align*}
x &= t^2 - t, \\
y &= t^2 + t,
\end{align*}
we can add these to get $t^2 = \frac{x + y}{2}$, and subtract these to get $t = \frac{y - x}{2}$. Therefore, we have:
\begin{align*}
\left(\frac{y - x}{2}\right)^2 &= \frac{x + y}{2} \\
\frac{y^2 - 2xy + x^2}{4} &= \frac{x + y}{2} \\
4x + 4y &= 2(y^2 - 2xy + x^2) \\
4x + 4y &= 2y^2 - 4xy + 2x^2,
\end{align*}
so we end up with the equation
\[ 2x^2 - 4xy + 2y^2 - 4x - 4y = 0. \]

We are left with an equation for a conic section. The conic discriminant is $(-4)^2 - 4(2)(2) = 0$, so we can confirm that it represents an oblique parabola.
Don’t worry if you don’t know about the conic discriminant, as it will be covered in depth in a
later chapter (Theorem 7.3.34).
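As a quick check of the elimination above, the sketch below verifies that sampled points of the parametric curve satisfy the Cartesian equation, and evaluates the discriminant $B^2 - 4AC$. The function names are hypothetical, and the sample parameter values are chosen to be exactly representable in floating point.

```python
def curve_point(t):
    """Point on the parametric curve x = t^2 - t, y = t^2 + t."""
    return t * t - t, t * t + t

def conic(x, y):
    """Left-hand side of 2x^2 - 4xy + 2y^2 - 4x - 4y = 0."""
    return 2 * x * x - 4 * x * y + 2 * y * y - 4 * x - 4 * y

for t in (-3, -2, -1, -0.5, 0, 0.5, 1, 2, 3):
    x, y = curve_point(t)
    print(t, (x, y), conic(x, y))   # the last entry should be 0 for every t

A, B, C = 2, -4, 2                  # coefficients of x^2, xy, y^2
print(B * B - 4 * A * C)            # discriminant; 0 indicates a parabola
```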

Suppose that we must find a set of parametric equations for $x$ and $y$ such that the parametric curve is the line segment starting from point $(a, b)$ at $t = 0$ and ending at point $(c, d)$ at $t = 1$ (i.e. $0 \le t \le 1$).

[Figure: line segment from $(a, b)$ at $t = 0$ to $(c, d)$ at $t = 1$.]

To start, we drop altitudes and form right triangles:

[Figure: the same segment with a right triangle at $(a, b)$ whose horizontal leg is $t(c - a)$ and whose vertical leg is $t(d - b)$.]

Note that since t goes from 0 to 1 and the graph in discussion is linear, t is the ratio of the
line segment that has been drawn starting from point (a, b). Therefore, the distance covered on the
x-axis would be t multiplied by the width difference c − a, and the distance covered on the y-axis
would be t multiplied by the height difference d − b.
Recall that we are starting from point (a, b), so the x-coordinate would be t(c − a) added to a
and the y-coordinate would be t(d − b) added to b. Therefore, we have the parametric equations:
x = f (t) = a + t(c − a),
y = g(t) = b + t(d − b).
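These two equations translate directly into code. The helper name `segment_point` is a hypothetical choice; the endpoints used below are those of Problem 7.2.10.

```python
def segment_point(start, end, t):
    """Point at parameter t (0 <= t <= 1) on the segment from start to end."""
    (a, b), (c, d) = start, end
    return a + t * (c - a), b + t * (d - b)

# Segment from (1, 0) to (7, -5), sampled at the start, midpoint, and end.
for t in (0, 0.5, 1):
    print(t, segment_point((1, 0), (7, -5), t))

# Swapping the endpoints traverses the same segment in the opposite direction.
print(segment_point((7, -5), (1, 0), 0.5))
```

Both parametrizations pass through the same midpoint at $t = \frac{1}{2}$, which reflects that only the direction of travel changes.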
Problem 7.2.10. Find parametric equations for the line segment connecting (1, 0) and (7, −5)
where

1. (1, 0) is the starting point.


2. (7, −5) is the starting point.

Solution.

1. Using the relationship proven in the previous example, we have the equations x = 1 + t(7 − 1)
and y = 0 + t(−5 − 0), or:
x = 1 + 6t,
y = −5t,
for 0 ≤ t ≤ 1.

2. Similarly, the parametric equations would be:

x = 7 − 6t,
y = −5 + 5t,

for 0 ≤ t ≤ 1.

Problem 7.2.11. Sketch the following graphs:

1. $x = e^t$, $y = e^{-t}$

2. $x = 2^t$, $y = 4^t$

3. x = sin t, y = sin t

Solution.

1. At first, one might conclude that the graph is simply the hyperbola $y = \frac{1}{x}$. However, remember that $x, y > 0$, as a positive number raised to any real exponent is always greater than zero. Therefore, we only draw the graph for positive $x$ and $y$ values, that is, the first-quadrant branch of the hyperbola.

2. Since $4^t = (2^t)^2$, it is clear that $y = x^2$. However, keep in mind that $x, y > 0$ (because a positive number raised to any exponent is always positive). Therefore we only draw the right side of the parabola, excluding the vertex.

3. The line is obviously $y = x$, but remember that the range of sine is $[-1, 1]$, so we limit $x, y$ to the interval $[-1, 1]$. Thus we are left with the line segment from $(-1, -1)$ to $(1, 1)$.

√ √
Exercise 7.2.12. Graph the parametric curve for equations x = t + 1, y = t − 1.

Problem 7.2.13. Find a parametric expression for the graph $2(x - 1)^2 + 3(y + 2)^2 = 11$.

Solution. We strive to turn this equation into a standard equation for a conic section. Therefore, divide both sides of the equation by 11 to make the RHS equal to 1, then express the LHS as the sum of fractions where only the terms $(x - 1)^2$ and $(y + 2)^2$ are the numerators, as such:

\begin{align*}
2(x - 1)^2 + 3(y + 2)^2 &= 11 \\
\frac{2(x - 1)^2}{11} + \frac{3(y + 2)^2}{11} &= 1 \\
\frac{(x - 1)^2}{11/2} + \frac{(y + 2)^2}{11/3} &= 1
\end{align*}

This is an ellipse! Just like examples shown earlier in the section, the parametric equations for $x$ and $y$ rely on the identity $\sin^2\theta + \cos^2\theta = 1$:
\begin{align*}
x &= \sqrt{\frac{11}{2}}\cos\theta + 1, \\
y &= \sqrt{\frac{11}{3}}\sin\theta - 2.
\end{align*}

Problem 7.2.14. Sketch the parametric graph $x = 2 + 4\cos\theta$, $y = -1 + 3\sin\theta$.

We conclude this section with one famous example:


[Figure: a circle resting on a line at the point $(0, 0)$.]

Example 7.2.15 (The Cycloid)


Consider a circle of radius 1, centered at (0, 1), rolling across a line. Let us trace the path of a
certain point (in this case, let it be initially (0, 0)) of the circle, as the circle rolls. What are the
parametric equations to describe this path?

Solution. First, let’s consider a circle of radius r, centered at (0, 0), to simplify the problem. Given
an angle of θ (in radians) from the point (0, −r) to the point A, what would be the coordinates of A?

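[Figure: circle of radius $r$ centered at $(0, 0)$, with the angle $\theta$ at the center measured from the bottom point $(0, -r)$ to the point $A$.]

Well, remember that the functions $\sin x$ and $\cos x$ start from the ‘east pole’ (denoted by the dotted line in the diagram) and go counter-clockwise. Therefore, the point $(0, -r)$ would be $\left(r\cos\left(-\frac{\pi}{2}\right), r\sin\left(-\frac{\pi}{2}\right)\right)$ (it is $-\frac{\pi}{2}$ instead of $\frac{\pi}{2}$ because we went clockwise).
Point $A$ is obtained by rotating the point $(0, -r)$ a further angle $\theta$ clockwise, so we subtract $\theta$ to get $A = \left(r\cos\left(-\frac{\pi}{2} - \theta\right), r\sin\left(-\frac{\pi}{2} - \theta\right)\right)$.
However, note that
\[ \cos\left(-\frac{\pi}{2} - \theta\right) = \cos\left(\frac{\pi}{2} + \theta\right) = \cos\frac{\pi}{2}\cos\theta - \sin\frac{\pi}{2}\sin\theta = -\sin\theta, \]
and
\[ \sin\left(-\frac{\pi}{2} - \theta\right) = -\sin\left(\frac{\pi}{2} + \theta\right) = -\left(\sin\frac{\pi}{2}\cos\theta + \cos\frac{\pi}{2}\sin\theta\right) = -\cos\theta, \]
therefore $A = (-r\sin\theta, -r\cos\theta)$.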
To get the graph that we want, we must translate this circle upwards by r. Therefore, we add r
to the y-coordinate of all points on the circle.
[Figure: circle of radius $r$ centered at $(0, r)$, resting on the origin $(0, 0)$, with the point $(-r\sin\theta, r - r\cos\theta)$ marked at angle $\theta$.]

This results in the point (−r sin θ, r − r cos θ) to describe the point on a circle with angle θ.
However, we’re not done yet. We have yet to factor in the circle moving sideways, on the x-axis.
If a circle ‘rolls’ on its surface by an angle θ, we are really just translating the circle horizontally
by the length of the arc that the angle θ subtends. For example, in this diagram, the length of the
arc between the point (0, 0) and (−r sin θ, r − r cos θ) is the horizontal distance that the initial point
(0, 0) will cover when that point becomes (−r sin θ, r − r cos θ) as a result of the circle rolling.
We can compute this length as $\frac{\theta}{2\pi} \cdot 2\pi r = r\theta$. We add this to the $x$-coordinate of $(-r\sin\theta, r - r\cos\theta)$ to get the point $(r\theta - r\sin\theta, r - r\cos\theta)$, which is the correct representation of the point on the rolling circle. Therefore, our parametric equations would be:

x = r(θ − sin θ),


y = r(1 − cos θ).
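To see these equations in action, the sketch below evaluates the cycloid at a few angles. The function name `cycloid_point` is hypothetical; $\theta$ is in radians and the default radius 1 matches the circle in the example.

```python
import math

def cycloid_point(theta, r=1.0):
    """Position of the traced point after the circle of radius r has rolled through angle theta."""
    return r * (theta - math.sin(theta)), r * (1 - math.cos(theta))

# Start, half a turn (highest point), and one full turn (back on the line).
for theta in (0.0, math.pi, 2 * math.pi):
    print(theta, cycloid_point(theta))
```

After half a turn the point should sit at height $2r$ directly above $x = \pi r$, and after one full turn it should return to the line at $x = 2\pi r$, one circumference from where it started.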

7.3 Complex Numbers

This section assumes some prior experience of complex numbers.

7.3.1 Review

We define $i = \sqrt{-1}$. A complex number can be expressed as $z = a + bi$, where $a$ is the real part and $b$ is the imaginary part. All real numbers are complex numbers, with their imaginary part equal to 0.

If $i = \sqrt{-1}$, then $i^2 = -1$, $i^3 = -i$, and $i^4 = 1$. When multiplying complex numbers together, make sure you get used to various powers of $i$.
Problem 7.3.1. Compute $\dfrac{2 + 3i}{4 + 5i}$ and $(2 + 3i)^3$.

Solution. For the first one, multiply the numerator and denominator by the conjugate of the denominator:
\[
\frac{2 + 3i}{4 + 5i} \cdot \frac{4 - 5i}{4 - 5i} = \frac{23 + 2i}{41}.
\]
Daniel Kim 168

For the second one, use the Binomial Theorem to expand and simplify:
\begin{align*}
(2 + 3i)^3 &= 2^3 + 3 \cdot 2^2 \cdot 3i + 3 \cdot 2 \cdot (3i)^2 + (3i)^3 \\
&= 8 + 36i - 54 - 27i \\
&= -46 + 9i.
\end{align*}
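
Both answers can be sanity-checked with Python's built-in `complex` type. This is only a numerical check under floating-point arithmetic, not a replacement for the algebra, and the variable names are illustrative.

```python
# Python writes the imaginary unit as j, so 2 + 3i is complex(2, 3).
z = complex(2, 3)
w = complex(4, 5)

quotient = z / w   # should be close to (23 + 2i) / 41
cube = z ** 3      # should be close to -46 + 9i

print(quotient, complex(23, 2) / 41)
print(cube)
```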

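Given a complex number $z = a + bi$, define the conjugate of $z$ as $\overline{z} = a - bi$.

Problem 7.3.2. Given $w, z \in \mathbb{C}$, prove:

a) $\overline{w + z} = \overline{w} + \overline{z}$.

b) $\overline{wz} = \overline{w} \cdot \overline{z}$.

c) $\overline{\left(\dfrac{1}{z}\right)} = \dfrac{1}{\overline{z}}$.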

Proof.

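a) Let $w = a + bi$ and $z = c + di$. Then $\overline{w + z} = \overline{(a + c) + (b + d)i} = (a + c) - (b + d)i$, and $\overline{w} + \overline{z} = \overline{a + bi} + \overline{c + di} = a - bi + c - di = (a + c) - (b + d)i$, therefore $\overline{w + z} = \overline{w} + \overline{z}$.

b) Similarly, let $w = a + bi$ and $z = c + di$. Expand both $\overline{wz}$ and $\overline{w} \cdot \overline{z}$ using the definitions of $w$ and $z$ and simplify, separating the real and imaginary parts.
\begin{align*}
\overline{w} \cdot \overline{z} &= \overline{a + bi} \cdot \overline{c + di} \\
&= (a - bi)(c - di) \\
&= ac - adi - bci - bd \\
&= ac - bd - adi - bci \\
&= (ac - bd) - (ad + bc)i.
\end{align*}
\begin{align*}
\overline{wz} &= \overline{(a + bi)(c + di)} \\
&= \overline{ac + adi + bci - bd} \\
&= \overline{ac - bd + adi + bci} \\
&= \overline{(ac - bd) + (ad + bc)i} \\
&= (ac - bd) - (ad + bc)i.
\end{align*}
We can compare the two and conclude that they are equal.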

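c) This time, let $z = a + bi$, and the structure of this proof is similar to the previous two:
\begin{align*}
\overline{\left(\frac{1}{z}\right)} &= \overline{\left(\frac{1}{a + bi}\right)} \\
&= \overline{\left(\frac{1}{a + bi} \cdot \frac{a - bi}{a - bi}\right)} \\
&= \overline{\left(\frac{a - bi}{a^2 + b^2}\right)} \\
&= \overline{\left(\frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\,i\right)} \\
&= \frac{a}{a^2 + b^2} + \frac{b}{a^2 + b^2}\,i
\end{align*}
\begin{align*}
\frac{1}{\overline{z}} &= \frac{1}{\overline{a + bi}} \\
&= \frac{1}{a - bi} \\
&= \frac{1}{a - bi} \cdot \frac{a + bi}{a + bi} \\
&= \frac{a + bi}{a^2 + b^2} \\
&= \frac{a}{a^2 + b^2} + \frac{b}{a^2 + b^2}\,i
\end{align*}
We find that they are equal, so we are done.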

Problem 7.3.3. Prove that if $P(x)$ is a polynomial with real coefficients and $z$ is a root of $P(x)$, then $\overline{z}$ is also a root of $P(x)$.

Proof. Let $P(x) = \sum_{k=0}^{n} a_k x^k$ where $a_k \in \mathbb{R}$. Let $0 = P(z) = \sum_{k=0}^{n} a_k z^k$. Then,
\[
0 = \overline{P(z)} = \overline{\sum_{k=0}^{n} a_k z^k} = \sum_{k=0}^{n} \overline{a_k}\,\overline{(z^k)}.
\]
It is evident that $\overline{(z^k)} = (\overline{z})^k$. It is also true that the conjugate of any real number is itself, since $\overline{a + 0i} = a - 0i$. Using those facts, we can conclude that $0 = \sum_{k=0}^{n} a_k (\overline{z})^k = P(\overline{z})$, i.e. $P(\overline{z}) = 0$. Thus, $\overline{z}$ is a root of $P(x)$.

Problem 7.3.4. Define $|a + bi| = \sqrt{a^2 + b^2}$. $\forall w, z \in \mathbb{C}$, prove:

a) |wz| = |w||z|.

b) |w + z| ≤ |w| + |z|.

Proof.
a) Let $w = a + bi$ and $z = c + di$. First, $|w||z| = |a + bi||c + di| = \sqrt{a^2 + b^2} \cdot \sqrt{c^2 + d^2} = \sqrt{a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2}$.
Then,
\begin{align*}
|wz| &= |(a + bi)(c + di)| \\
&= |ac - bd + (ad + bc)i| \\
&= \sqrt{(ac - bd)^2 + (ad + bc)^2} \\
&= \sqrt{a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2},
\end{align*}
so they are equal.

b) First, note that $|w + z|^2 = (a + c)^2 + (b + d)^2 = a^2 + b^2 + c^2 + d^2 + 2ac + 2bd$.

Also, $(|w| + |z|)^2 = |w|^2 + |z|^2 + 2|w||z| = a^2 + b^2 + c^2 + d^2 + 2\sqrt{(a^2 + b^2)(c^2 + d^2)}$.

Applying the AM-GM inequality to the nonnegative terms $a^2d^2$ and $b^2c^2$, we get $\dfrac{a^2d^2 + b^2c^2}{2} \geq \sqrt{a^2b^2c^2d^2} = |abcd| \geq abcd$, which rearranges to $a^2d^2 + b^2c^2 \geq 2abcd$.

Therefore, by adding $a^2c^2 + b^2d^2$ to both sides of that result, we have $a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2 \geq a^2c^2 + 2abcd + b^2d^2$, which factors into $(a^2 + b^2)(c^2 + d^2) \geq (ac + bd)^2$, i.e. $\sqrt{(a^2 + b^2)(c^2 + d^2)} \geq |ac + bd| \geq ac + bd$.

Now, we can apply that result to the expanded forms of $|w + z|^2$ and $(|w| + |z|)^2$ stated at the beginning of the proof; it is now clear that $a^2 + b^2 + c^2 + d^2 + 2\sqrt{(a^2 + b^2)(c^2 + d^2)} \geq a^2 + b^2 + c^2 + d^2 + 2ac + 2bd$, and so $(|w| + |z|)^2 \geq |w + z|^2$. Since magnitude is always nonnegative, we take the square root of both sides to get the intended result $|w + z| \leq |w| + |z|$.

7.3.2 The Complex Plane

Now, we introduce a new coordinate system using complex numbers.

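[Figure: the complex plane, with a horizontal real axis and a vertical imaginary axis. The point 2 + 3i is plotted. A point $z_1 = a + bi$ lies at distance $\sqrt{a^2 + b^2} = |z_1|$ from the origin, and the segment joining the points $z_2$ and $w$ has length $|z_2 - w|$.]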

We have the Cartesian point (a, b) correspond to the point a + bi on the complex plane, where
the x-axis is the real part and the y-axis is the imaginary part of the complex number.
For instance, the point (2, 3) on the Cartesian plane would be represented as 2 + 3i on the
complex plane. Here are a couple of observations:

a) Given a point $z_1 \in \mathbb{C}$, $|z_1|$ is the distance from the origin to $z_1$. Writing $z_1 = a + bi$, this is clear because $|z_1| = \sqrt{a^2 + b^2} = \sqrt{(a - 0)^2 + (b - 0)^2}$, which is simply the Cartesian distance formula with points $(a, b)$ and $(0, 0)$ plugged in.

b) Given two points $z_2$ and $w$, the distance between them is $|z_2 - w|$. For a brief proof, let $z_2 = a + bi$ (which denotes $(a, b)$) and $w = c + di$ (which denotes $(c, d)$). The Cartesian distance formula gives $\sqrt{(a - c)^2 + (b - d)^2}$, which is equivalent to $|(a - c) + (b - d)i|$, or $|z_2 - w|$.

Problem 7.3.5. Describe the graphs of the following equations in the complex plane:

1. $z = \overline{z}$

2. |z − 1| = |z|

3. |z − 2| + |z + 3i| = 10

4. |z − 2| = 2|z|

Solution.

a) If we let z = a + bi, then we have the equation a + bi = a − bi, and we solve to get b = 0,
implying that z must be real. Therefore the graph would simply be the real axis.

b) Like before, let $z = a + bi$. We algebraically simplify the equation:
\begin{align*}
|a + bi - 1| &= |a + bi| \\
\sqrt{(a - 1)^2 + b^2} &= \sqrt{a^2 + b^2} \\
(a - 1)^2 &= a^2 \\
a^2 - 2a + 1 &= a^2 \\
2a &= 1 \\
a &= \frac{1}{2}
\end{align*}
Therefore the graph is the vertical line of points $z$ whose real part is $\frac{1}{2}$.
c) Recall the definition of an ellipse, which is the set of points such that the sum of the distances from each point to two fixed points (the foci) is constant. In this equation, we are given that the sum of the distance from $z$ to $2$ and the distance from $z$ to $-3i$ is always 10. Since 10 is greater than the distance $\sqrt{13}$ between the two foci, this graph is an ellipse with foci at $2$ and $-3i$.

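d) Let $z = a + bi$. We simplify this equation algebraically:
\begin{align*}
|a + bi - 2| &= 2|a + bi| \\
\sqrt{(a - 2)^2 + b^2} &= 2\sqrt{a^2 + b^2} \\
(a - 2)^2 + b^2 &= 4a^2 + 4b^2 \\
a^2 - 4a + 4 + b^2 &= 4a^2 + 4b^2 \\
3a^2 + 4a + 3b^2 &= 4 \\
3\left(a^2 + \frac{4}{3}a\right) + 3b^2 &= 4 \\
3\left(a^2 + \frac{4}{3}a + \frac{4}{9}\right) + 3b^2 &= 4 + \frac{4}{3} \\
3\left(a + \frac{2}{3}\right)^2 + 3b^2 &= \frac{16}{3} \\
\left(a + \frac{2}{3}\right)^2 + b^2 &= \frac{16}{9}
\end{align*}
Therefore the graph is a circle centered at $\left(-\frac{2}{3}, 0\right)$ with radius $\frac{4}{3}$.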
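
The circle found in part d) can be checked numerically. The sketch below samples points on the circle with center −2/3 and radius 4/3 and measures how far each one is from satisfying |z − 2| = 2|z|. The helper name `point_on_circle` and the number of sample points are illustrative assumptions.

```python
import cmath
import math

CENTER = complex(-2 / 3, 0)
RADIUS = 4 / 3

def point_on_circle(t):
    """Point on the circle |z - CENTER| = RADIUS at parameter angle t."""
    return CENTER + RADIUS * cmath.exp(1j * t)

# Largest violation of |z - 2| = 2|z| over 12 evenly spaced sample points.
worst_gap = max(
    abs(abs(z - 2) - 2 * abs(z))
    for z in (point_on_circle(2 * math.pi * k / 12) for k in range(12))
)
print(worst_gap)
```

The printed gap should be on the order of floating-point rounding error, which is consistent with every sampled point lying on the graph.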

Recall the relationship between the Cartesian plane and the polar plane: (a, b) ⇐⇒ (r; θ). We
had defined a = r cos θ and b = r sin θ. We can now establish a relationship between the polar plane
and the complex plane:

a + bi = r cos θ + i · r sin θ
= r(cos θ + i sin θ)
= r cis θ.

We denote the polar form of the complex number a + bi as r cis θ, where r is the magnitude
and θ is the argument. If z is the complex number, then we can denote θ = Arg z. For notation
purposes, cis θ is the abbreviation for cos θ + i sin θ.
We now have four ways to represent a point in two-dimensional space:

a + bi ←→ (a, b) ←→ (r; θ) ←→ r cis θ.


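It turns out that $|\operatorname{cis}\theta| = |\cos\theta + i\sin\theta| = \sqrt{\cos^2\theta + \sin^2\theta} = 1$, so $|r\operatorname{cis}\theta| = |r||\operatorname{cis}\theta| = |r|$.
Thus,
\begin{align*}
a + bi = r\operatorname{cis}\theta &\implies |a + bi| = |r\operatorname{cis}\theta| \\
&\implies |a + bi| = |r| \\
&\implies |r| = \sqrt{a^2 + b^2}.
\end{align*}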

This concept of the ‘magnitude’ of a number is consistent throughout the various representations.

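Problem 7.3.6. Convert the following complex numbers from polar form to standard form ($a + bi$):

a) $6\operatorname{cis}\dfrac{\pi}{6}$

b) $8\operatorname{cis}(-\pi)$

c) $4\operatorname{cis}\dfrac{3\pi}{4}$

Solution.

a)
\begin{align*}
6\operatorname{cis}\frac{\pi}{6} &= 6\left(\cos\frac{\pi}{6} + i\sin\frac{\pi}{6}\right) \\
&= 6\left(\frac{\sqrt{3}}{2} + \frac{1}{2}i\right) \\
&= 3\sqrt{3} + 3i.
\end{align*}

b)
\begin{align*}
8\operatorname{cis}(-\pi) &= 8(\cos(-\pi) + i\sin(-\pi)) \\
&= 8(-1 + i \cdot 0) \\
&= -8.
\end{align*}

c)
\begin{align*}
4\operatorname{cis}\frac{3\pi}{4} &= 4\left(\cos\frac{3\pi}{4} + i\sin\frac{3\pi}{4}\right) \\
&= 4\left(-\frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2}i\right) \\
&= -2\sqrt{2} + 2\sqrt{2}i.
\end{align*}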

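Problem 7.3.7. Convert $18 - 18\sqrt{3}i$ to polar form.

Solution.
\begin{align*}
18 - 18\sqrt{3}i &= 36\left(\frac{1}{2} - \frac{\sqrt{3}}{2}i\right) \\
&= 36\left(\cos\frac{5\pi}{3} + i\sin\frac{5\pi}{3}\right) \\
&= 36\operatorname{cis}\frac{5\pi}{3}.
\end{align*}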

Problem 7.3.8. Convert $3 + 4i$ to polar form.

Solution. Note that $3 + 4i$ is in the first quadrant, so $\tan^{-1}\frac{4}{3}$ gives the correct angle. The magnitude is $\sqrt{3^2 + 4^2} = 5$, therefore $3 + 4i$ in polar form is $5\operatorname{cis}\left(\tan^{-1}\frac{4}{3}\right)$.
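
The conversions in Problems 7.3.7 and 7.3.8 can be reproduced with Python's standard `cmath` module. Note that `cmath.polar` returns an angle in (−π, π], so the sketch below shifts it into [0, 2π) to match the convention used in these solutions. The wrapper name `to_polar` is an illustrative choice.

```python
import cmath
import math

def to_polar(z):
    """Return (r, theta) with z = r cis(theta) and theta in [0, 2*pi)."""
    r, phi = cmath.polar(z)            # phi lies in (-pi, pi]
    return r, phi % (2 * math.pi)      # shift negative angles up by 2*pi

r1, theta1 = to_polar(complex(18, -18 * math.sqrt(3)))
r2, theta2 = to_polar(complex(3, 4))

print(r1, theta1, 5 * math.pi / 3)     # magnitude and angle for 18 - 18*sqrt(3)i
print(r2, theta2, math.atan(4 / 3))    # magnitude and angle for 3 + 4i
```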

Lemma 7.3.9
Given two complex numbers r1 cis θ1 and r2 cis θ2 ,

(r1 cis θ1 )(r2 cis θ2 ) = r1 r2 cis(θ1 + θ2 ).

Proof. Expand and simplify using the trigonometric angle addition formulas.

(cis α)(cis β) = (cos α + i sin α)(cos β + i sin β)


= cos α cos β + i cos α sin β + i cos β sin α − sin α sin β
= cos(α + β) + i sin(α + β)
= cis(α + β).

Therefore, the statement (r1 cis θ1 )(r2 cis θ2 ) = r1 r2 cis(θ1 + θ2 ) follows.


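Problem 7.3.10. Compute $(3\sqrt{3} + 3i)(18 - 18\sqrt{3}i)$ using Lemma 7.3.9, then compute it directly to confirm that it works.

Solution. First, we use Lemma 7.3.9:
\begin{align*}
(3\sqrt{3} + 3i)(18 - 18\sqrt{3}i) &= 6\operatorname{cis}\frac{\pi}{6} \cdot 36\operatorname{cis}\frac{5\pi}{3} \\
&= 6 \cdot 36 \cdot \operatorname{cis}\left(\frac{\pi}{6} + \frac{5\pi}{3}\right) \\
&= 216\operatorname{cis}\frac{11\pi}{6} \\
&= 216\left(\cos\frac{11\pi}{6} + i\sin\frac{11\pi}{6}\right) \\
&= 216\left(\frac{\sqrt{3}}{2} - \frac{1}{2}i\right) \\
&= 108\sqrt{3} - 108i.
\end{align*}
Then, we confirm by directly expanding the product:
\begin{align*}
(3\sqrt{3} + 3i)(18 - 18\sqrt{3}i) &= 54\sqrt{3} - 162i + 54i + 54\sqrt{3} \\
&= 108\sqrt{3} - 108i.
\end{align*}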
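
The same comparison can be done numerically. The sketch below multiplies in polar form, as in Lemma 7.3.9 (multiply the magnitudes, add the angles), and compares the result with the direct product. `cmath.rect(r, theta)` is the standard-library way to build r cis θ, and the function name `cis_product` is an illustrative choice.

```python
import cmath
import math

def cis_product(r1, theta1, r2, theta2):
    """(r1 cis theta1)(r2 cis theta2) computed as r1*r2 cis(theta1 + theta2)."""
    return cmath.rect(r1 * r2, theta1 + theta2)

via_lemma = cis_product(6, math.pi / 6, 36, 5 * math.pi / 3)
direct = complex(3 * math.sqrt(3), 3) * complex(18, -18 * math.sqrt(3))

print(via_lemma)
print(direct)
```

Both printed values should agree with 108√3 − 108i up to rounding.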

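Lemma 7.3.11
$\operatorname{cis}(-\theta) = (\operatorname{cis}\theta)^{-1}$.

Proof. By Lemma 7.3.9, we have $\operatorname{cis}(-\theta)\operatorname{cis}\theta = \operatorname{cis}(-\theta + \theta) = \operatorname{cis} 0$, which is equal to 1. Since $\operatorname{cis}(-\theta)\operatorname{cis}\theta = 1$, $\operatorname{cis}(-\theta) = \dfrac{1}{\operatorname{cis}\theta} = (\operatorname{cis}\theta)^{-1}$.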

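Problem 7.3.12. Compute $(\operatorname{cis}\theta)^2$ and $(\operatorname{cis}\theta)^3$.

Solution.
\begin{align*}
(\operatorname{cis}\theta)^2 &= \operatorname{cis}\theta\operatorname{cis}\theta = \operatorname{cis}(\theta + \theta) = \operatorname{cis} 2\theta \\
(\operatorname{cis}\theta)^3 &= (\operatorname{cis}\theta)^2\operatorname{cis}\theta = \operatorname{cis} 2\theta \cdot \operatorname{cis}\theta = \operatorname{cis}(2\theta + \theta) = \operatorname{cis} 3\theta
\end{align*}
The pattern suggests that $(\operatorname{cis}\theta)^n$ and $\operatorname{cis} n\theta$ are equal, which will be proven in the following theorem.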

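Theorem 7.3.13 (De Moivre’s Theorem)
Let $n \in \mathbb{Z}^+$. Then, $(\operatorname{cis}\theta)^n = \operatorname{cis}(n\theta)$.

Proof. We proceed by induction. Let $P(n) : (\operatorname{cis}\theta)^n = \operatorname{cis} n\theta$.

Base Case: $P(1) : (\operatorname{cis}\theta)^1 = \operatorname{cis}\theta = \operatorname{cis}(\theta \cdot 1)$, so the base case is true.
Inductive Step: Assume $P(n) : (\operatorname{cis}\theta)^n = \operatorname{cis} n\theta$ is true for some $n \in \mathbb{Z}^+$.
We want to prove $P(n + 1) : (\operatorname{cis}\theta)^{n+1} = \operatorname{cis}((n + 1)\theta)$. By Lemma 7.3.9, we have
\begin{align*}
(\operatorname{cis}\theta)^{n+1} &= (\operatorname{cis}\theta)^n(\operatorname{cis}\theta) \\
&= (\operatorname{cis} n\theta)(\operatorname{cis}\theta) \\
&= \operatorname{cis}(n\theta + \theta) \\
&= \operatorname{cis}((n + 1)\theta),
\end{align*}
which concludes the inductive step.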

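Corollary 7.3.14
$\forall n \in \mathbb{Z}$, $(\operatorname{cis}\theta)^n = \operatorname{cis} n\theta$.

Proof. We have already proven the case when $n > 0$, shown in the proof of Theorem 7.3.13.
If $n = 0$, then $(\operatorname{cis}\theta)^0 = 1$, and $\operatorname{cis}(0 \cdot \theta) = \operatorname{cis} 0 = 1$, so they are equal.
If $n < 0$, then $n + |n| = 0$. Then,
\begin{align*}
(\operatorname{cis}\theta)^n(\operatorname{cis}\theta)^{|n|} &= (\operatorname{cis}\theta)^{n + |n|} \\
&= (\operatorname{cis}\theta)^0 \\
&= 1.
\end{align*}
Therefore, we finish the proof for the case $n < 0$ by using Lemma 7.3.9, Lemma 7.3.11, Theorem 7.3.13, and the fact that $-|n| = n$:
\begin{align*}
(\operatorname{cis}\theta)^n(\operatorname{cis}\theta)^{|n|} &= 1 \\
(\operatorname{cis}\theta)^n &= \frac{1}{(\operatorname{cis}\theta)^{|n|}} \\
&= \left((\operatorname{cis}\theta)^{|n|}\right)^{-1} \\
&= (\operatorname{cis}(|n|\theta))^{-1} \\
&= \operatorname{cis}(-|n|\theta) \\
&= \operatorname{cis} n\theta.
\end{align*}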

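Problem 7.3.15. Compute $(\sqrt{3} + i)^{10}$.

Solution. First, we convert the complex number to cis form.
\begin{align*}
\sqrt{3} + i &= 2\left(\frac{\sqrt{3}}{2} + \frac{1}{2}i\right) \\
&= 2\left(\cos\frac{\pi}{6} + i\sin\frac{\pi}{6}\right) \\
&= 2\operatorname{cis}\frac{\pi}{6}.
\end{align*}
Therefore, by De Moivre’s Theorem,
\begin{align*}
\left(2\operatorname{cis}\frac{\pi}{6}\right)^{10} &= 2^{10} \cdot \left(\operatorname{cis}\frac{\pi}{6}\right)^{10} \\
&= 1024 \cdot \operatorname{cis}\frac{5\pi}{3} \\
&= 1024\left(\cos\frac{5\pi}{3} + i\sin\frac{5\pi}{3}\right) \\
&= 1024\left(\frac{1}{2} - \frac{\sqrt{3}}{2}i\right) \\
&= 512 - 512\sqrt{3}i.
\end{align*}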

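Problem 7.3.16. Compute $(\sqrt{3} - i)^{11}$.

Solution.
\begin{align*}
(\sqrt{3} - i)^{11} &= \left(2\left(\frac{\sqrt{3}}{2} - \frac{1}{2}i\right)\right)^{11} \\
&= \left(2\operatorname{cis}\left(-\frac{\pi}{6}\right)\right)^{11} \\
&= 2^{11}\operatorname{cis}\left(-\frac{11\pi}{6}\right) \\
&= 2^{11}\operatorname{cis}\frac{\pi}{6} \\
&= 2048\left(\frac{\sqrt{3}}{2} + \frac{1}{2}i\right) \\
&= 1024\sqrt{3} + 1024i.
\end{align*}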
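
As a rough numerical cross-check of the last two problems, the sketch below computes each power twice: once by De Moivre's Theorem (raise the magnitude to the n-th power, multiply the angle by n) and once by repeated multiplication with Python's `**` operator. The function name `de_moivre_power` is an illustrative choice.

```python
import cmath
import math

def de_moivre_power(r, theta, n):
    """(r cis theta)^n computed as r^n cis(n * theta)."""
    return cmath.rect(r ** n, n * theta)

tenth = de_moivre_power(2, math.pi / 6, 10)        # (sqrt(3) + i)^10
eleventh = de_moivre_power(2, -math.pi / 6, 11)    # (sqrt(3) - i)^11

print(tenth, complex(math.sqrt(3), 1) ** 10)
print(eleventh, complex(math.sqrt(3), -1) ** 11)
```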

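Problem 7.3.17. Now, $\operatorname{cis} 3\theta = (\operatorname{cis}\theta)^3$. By expanding $(\cos\theta + i\sin\theta)^3$ using the Binomial Theorem, derive formulas for $\cos 3\theta$ and $\sin 3\theta$ entirely in terms of $\cos\theta$ and $\sin\theta$ respectively.

Solution. By the Binomial Theorem,
\[
(\cos\theta + i\sin\theta)^3 = \cos^3\theta + 3\cos^2\theta \cdot i\sin\theta - 3\cos\theta\sin^2\theta - i\sin^3\theta.
\]
By De Moivre’s Theorem, we have
\[
\cos^3\theta + 3\cos^2\theta \cdot i\sin\theta - 3\cos\theta\sin^2\theta - i\sin^3\theta = \cos 3\theta + i\sin 3\theta.
\]
We equate the real and imaginary parts, then simplify to get the triple angle formulas:
\begin{align*}
\cos 3\theta &= \cos^3\theta - 3\cos\theta\sin^2\theta \\
&= \cos^3\theta - 3\cos\theta(1 - \cos^2\theta) \\
&= \cos^3\theta - 3\cos\theta + 3\cos^3\theta \\
\cos 3\theta &= 4\cos^3\theta - 3\cos\theta.
\end{align*}
\begin{align*}
i\sin 3\theta &= 3\cos^2\theta \cdot i\sin\theta - i\sin^3\theta \\
&= i\left(3\cos^2\theta\sin\theta - \sin^3\theta\right) \\
&= i\left(3(1 - \sin^2\theta)\sin\theta - \sin^3\theta\right) \\
&= i\left(3\sin\theta - 3\sin^3\theta - \sin^3\theta\right) \\
\sin 3\theta &= -4\sin^3\theta + 3\sin\theta.
\end{align*}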
Problem 7.3.18. By expanding $(\cos\theta + i\sin\theta)^5$ and rewriting, find polynomials in terms of $\cos\theta$, $\sin\theta$ respectively for $\cos 5\theta$, $\sin 5\theta$.

Solution. We expand $(\cos\theta + i\sin\theta)^5$ using the Binomial Theorem:
\[
\cos^5\theta + 5\cos^4\theta \cdot i\sin\theta - 10\cos^3\theta\sin^2\theta - 10\cos^2\theta \cdot i\sin^3\theta + 5\cos\theta\sin^4\theta + i\sin^5\theta.
\]
By De Moivre’s Theorem, this equals $\cos 5\theta + i\sin 5\theta$.

Then, we equate real and imaginary parts:
\begin{align*}
\cos 5\theta &= \cos^5\theta - 10\cos^3\theta\sin^2\theta + 5\cos\theta\sin^4\theta \\
&= \cos^5\theta - 10\cos^3\theta(1 - \cos^2\theta) + 5\cos\theta(1 - \cos^2\theta)^2 \\
&= \cos^5\theta - 10\cos^3\theta + 10\cos^5\theta + 5\cos\theta(1 - 2\cos^2\theta + \cos^4\theta) \\
&= 11\cos^5\theta - 10\cos^3\theta + 5\cos\theta - 10\cos^3\theta + 5\cos^5\theta \\
&= 16\cos^5\theta - 20\cos^3\theta + 5\cos\theta.
\end{align*}
\begin{align*}
i\sin 5\theta &= 5\cos^4\theta \cdot i\sin\theta - 10\cos^2\theta \cdot i\sin^3\theta + i\sin^5\theta \\
\sin 5\theta &= 5\cos^4\theta\sin\theta - 10\cos^2\theta\sin^3\theta + \sin^5\theta \\
&= 5(1 - \sin^2\theta)^2\sin\theta - 10(1 - \sin^2\theta)\sin^3\theta + \sin^5\theta \\
&= 5(1 - 2\sin^2\theta + \sin^4\theta)\sin\theta - 10\sin^3\theta + 10\sin^5\theta + \sin^5\theta \\
&= 5\sin\theta - 10\sin^3\theta + 5\sin^5\theta - 10\sin^3\theta + 11\sin^5\theta \\
&= 16\sin^5\theta - 20\sin^3\theta + 5\sin\theta.
\end{align*}
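
The triple-angle and quintuple-angle polynomials derived above can be spot-checked numerically. This sketch evaluates each polynomial at cos θ or sin θ and records the largest difference from the library value of cos nθ or sin nθ over a few arbitrary sample angles. The function names are illustrative, and a numerical check like this does not replace the derivation.

```python
import math

def cos3_poly(c):
    return 4 * c**3 - 3 * c

def sin3_poly(s):
    return 3 * s - 4 * s**3

def cos5_poly(c):
    return 16 * c**5 - 20 * c**3 + 5 * c

def sin5_poly(s):
    return 16 * s**5 - 20 * s**3 + 5 * s

max_error = 0.0
for theta in (0.0, 0.3, 1.0, 2.5, -1.7, math.pi / 7):
    c, s = math.cos(theta), math.sin(theta)
    max_error = max(
        max_error,
        abs(cos3_poly(c) - math.cos(3 * theta)),
        abs(sin3_poly(s) - math.sin(3 * theta)),
        abs(cos5_poly(c) - math.cos(5 * theta)),
        abs(sin5_poly(s) - math.sin(5 * theta)),
    )
print(max_error)
```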

Theorem 7.3.19
We have certain relationships between $\cos n\theta$, $\cos\theta$ and $\sin n\theta$, $\sin\theta$:

a) $\cos n\theta$ can be written as a polynomial in $\cos\theta$ $\forall n \in \mathbb{Z}^+$.

b) $\sin n\theta$ can be written as a polynomial in $\sin\theta$ $\forall$ odd $n \in \mathbb{Z}^+$.

Proof. Consider the binomial expansion of $(\cos\theta + i\sin\theta)^n$:
\[
\cos^n\theta + \binom{n}{1}\cos^{n-1}\theta \cdot i\sin\theta - \binom{n}{2}\cos^{n-2}\theta \cdot \sin^2\theta - \binom{n}{3}\cos^{n-3}\theta \cdot i\sin^3\theta + \binom{n}{4}\cos^{n-4}\theta\sin^4\theta + \dots
\]
If we consider the real parts only, we have
\[
\cos n\theta = \cos^n\theta - \binom{n}{2}\cos^{n-2}\theta\sin^2\theta + \binom{n}{4}\cos^{n-4}\theta\sin^4\theta - \dots
\]
Notice that all powers of $\sin\theta$ are even. We can use the identity $\sin^2\theta = 1 - \cos^2\theta$ to express any $\sin^{2k}\theta$ as $(1 - \cos^2\theta)^k$ $\forall k \in \mathbb{Z}^+$. Therefore, since every even power of $\sin\theta$ can be expressed in terms of $\cos\theta$, we are left with $\cos n\theta$ being equal to a polynomial in terms of $\cos\theta$.
Likewise, consider the imaginary parts:
\[
\sin(n\theta) = \binom{n}{1}\cos^{n-1}\theta\sin\theta - \binom{n}{3}\cos^{n-3}\theta\sin^3\theta + \binom{n}{5}\cos^{n-5}\theta\sin^5\theta - \dots
\]
When we consider all odd $n$ in $\mathbb{Z}^+$, $n$ minus some other odd number ($n - (2k + 1)$ $\forall k \in \mathbb{Z}$) must be even. Note that the powers of $\cos\theta$ in this expression are all of that form, therefore all $\cos\theta$ are raised to an even power.
We can then apply the identity $\cos^2\theta = 1 - \sin^2\theta$, similar to the $\cos(n\theta)$ case above, to establish that all even powers of $\cos\theta$ can be expressed in terms of $\sin\theta$. We can then conclude that $\sin(n\theta)$ can be expressed as a polynomial in terms of $\sin\theta$ only, for all odd positive integers $n$.

Problem 7.3.20. Given the equation $(a + bi)^2 = i$, solve for all such possible complex numbers.

Solution. We are given (with $a$ and $b$ real)
\[
a^2 + 2abi - b^2 = i.
\]
We equate the real and imaginary parts, resulting in the equations $a^2 - b^2 = 0$ and $ab = \frac{1}{2}$.
From the latter, we have $b = \frac{1}{2a}$, so we substitute this into the former equation, and simplify:
\[
a^2 - \left(\frac{1}{2a}\right)^2 = 0 \implies a^2 = \frac{1}{4a^2} \implies 4a^4 = 1,
\]
and solving for $a$ we get $\pm\frac{\sqrt{2}}{2}$. Therefore we have two solutions:

• $a = \frac{\sqrt{2}}{2} \implies b = \frac{\sqrt{2}}{2} \implies \frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2}i$.

• $a = -\frac{\sqrt{2}}{2} \implies b = -\frac{\sqrt{2}}{2} \implies -\frac{\sqrt{2}}{2} - \frac{\sqrt{2}}{2}i$.

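Problem 7.3.21. Solve the equation $z^2 = i$ (given $z \in \mathbb{C}$) using the polar form of complex numbers.

Solution. First, note that $|i| = 1$ and $\operatorname{Arg} i = \frac{\pi}{2}$ (the complex number $i$ on the complex plane is the same as the point $(0, 1)$, or $\left(\cos\frac{\pi}{2}, \sin\frac{\pi}{2}\right)$, on the Cartesian plane). Therefore, $i = 1 \cdot \operatorname{cis}\frac{\pi}{2}$. We let $z = r\operatorname{cis}\theta$ for arbitrary $r$ and $\theta$, so we have the equation
\[
(r\operatorname{cis}\theta)^2 = 1 \cdot \operatorname{cis}\frac{\pi}{2}.
\]
Using De Moivre’s Theorem, this results in $r^2\operatorname{cis} 2\theta = 1 \cdot \operatorname{cis}\frac{\pi}{2}$. We assume $r$ to be positive, so $r = 1$. We also have $\operatorname{cis} 2\theta = \operatorname{cis}\frac{\pi}{2}$. Because the sine and cosine functions are periodic with period $2\pi$, we conclude that $2\theta = \frac{\pi}{2} + 2\pi k$ for some $k \in \mathbb{Z}$, or $\theta = \frac{\pi}{4} + \pi k$. Therefore, limiting $\theta$ to $[0, 2\pi)$, we have the solutions $\theta = \frac{\pi}{4}, \frac{5\pi}{4}$. Our final solutions are $z = 1 \cdot \operatorname{cis}\frac{\pi}{4},\ 1 \cdot \operatorname{cis}\frac{5\pi}{4}$.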
Exercise 7.3.22. Confirm that the answers in Problem 7.3.20 and Problem 7.3.21 are equivalent.

Problem 7.3.23. Solve for $z \in \mathbb{C}$: $z^3 = 8$.

Solution. We can solve this in two ways. I will demonstrate both of them:

1. Algebra. We rewrite the equation as the polynomial equation $z^3 - 8 = 0$, which can then be factored into $(z - 2)(z^2 + 2z + 4) = 0$. Using the quadratic formula, we find that the roots are $z = 2$, $-1 + i\sqrt{3}$, $-1 - i\sqrt{3}$.

2. Using Polar Form. We know that $8 = 8\operatorname{cis} 0$, therefore if we let $z = r\operatorname{cis}\theta$, then we have the equation $(r\operatorname{cis}\theta)^3 = 8\operatorname{cis} 0$, or $r^3\operatorname{cis} 3\theta = 8\operatorname{cis} 0$. This implies $r^3 = 8$ and $\operatorname{cis} 3\theta = \operatorname{cis} 0$, thus $r = 2$ and $3\theta = 0 + 2\pi k$ for $k \in \mathbb{Z}$ (as cis has a period of $2\pi$). Thus $\theta = \frac{2\pi}{3}k$, i.e. $\theta = 0, \frac{2\pi}{3}, \frac{4\pi}{3}$ when $\theta \in [0, 2\pi)$. Finally we bring all the information together to state all solutions: $z = 2\operatorname{cis} 0$, $2\operatorname{cis}\frac{2\pi}{3}$, and $2\operatorname{cis}\frac{4\pi}{3}$.

Check that both sets of answers are the same.

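Problem 7.3.24. Write down all solutions to $z^5 = -243$.

Solution. As usual, rewrite $-243$ as $243\operatorname{cis}\pi$. Letting $z = r\operatorname{cis}\theta$, we get $(r\operatorname{cis}\theta)^5 = 243\operatorname{cis}\pi$, or $r^5\operatorname{cis} 5\theta = 243\operatorname{cis}\pi$ using De Moivre’s Theorem. We have $r = 3$ and $5\theta = \pi + 2\pi k$, or $\theta = \frac{\pi}{5} + \frac{2\pi}{5}k$, and when restricting $\theta$ to the interval $[0, 2\pi)$, we have the solutions $\theta = \frac{\pi}{5}, \frac{3\pi}{5}, \pi, \frac{7\pi}{5}$, and $\frac{9\pi}{5}$.
Therefore all the solutions for $z$ are $3\operatorname{cis}\frac{\pi}{5}$, $3\operatorname{cis}\frac{3\pi}{5}$, $3\operatorname{cis}\pi$, $3\operatorname{cis}\frac{7\pi}{5}$, and $3\operatorname{cis}\frac{9\pi}{5}$.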
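
The procedure used in the last few problems (take the n-th root of the magnitude, then divide the argument plus each multiple of 2π by n) can be sketched as a small Python function. The name `nth_roots` is an illustrative choice, and the arguments it produces come from `cmath.polar`, which reports angles in (−π, π], so the roots may be listed in a different order or with different (but equivalent) angles than in the solutions above.

```python
import cmath
import math

def nth_roots(w, n):
    """Return the n complex solutions z of z**n = w, for nonzero w."""
    r, phi = cmath.polar(w)
    root_r = r ** (1 / n)                      # positive real n-th root of |w|
    return [
        cmath.rect(root_r, (phi + 2 * math.pi * k) / n)
        for k in range(n)
    ]

roots_of_minus_243 = nth_roots(-243, 5)
roots_of_8 = nth_roots(8, 3)

for z in roots_of_minus_243:
    print(z, abs(z), z ** 5)   # each z**5 should be close to -243
```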

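Problem 7.3.25. What are the five fifth roots of $i$? Then, find their sum and product.

Solution. Following the usual procedure: $z^5 = i \implies (r\operatorname{cis}\theta)^5 = 1 \cdot \operatorname{cis}\frac{\pi}{2} \implies r^5\operatorname{cis} 5\theta = 1 \cdot \operatorname{cis}\frac{\pi}{2}$, yielding $r = 1$ and $\theta = \frac{\pi}{10} + \frac{2\pi}{5}k$, i.e. $\theta = \frac{\pi}{10}, \frac{\pi}{2}, \frac{9\pi}{10}, \frac{13\pi}{10}, \frac{17\pi}{10}$. Therefore the solutions for $z$ are $\operatorname{cis}\frac{\pi}{10}$, $\operatorname{cis}\frac{\pi}{2}$, $\operatorname{cis}\frac{9\pi}{10}$, $\operatorname{cis}\frac{13\pi}{10}$, $\operatorname{cis}\frac{17\pi}{10}$.
To find their sum and product, we must recall that the fifth roots of $i$ are essentially the roots of the equation $z^5 - i = 0$. This is a polynomial! Therefore we can apply Vieta’s Formulas to determine that the sum is $0$ and the product is $i$.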

Problem 7.3.26. If the five fifth roots of $i$ are $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$, and $\omega_5$, what is $\prod_{k=1}^{5}(2 - \omega_k)$?

Solution. As previously stated, the polynomial that has these roots is $z^5 - i$. Therefore
\[
z^5 - i = (z - \omega_1)(z - \omega_2)(z - \omega_3)(z - \omega_4)(z - \omega_5) = \prod_{k=1}^{5}(z - \omega_k).
\]
Thus our answer is simply plugging in $z = 2$, which gives us $32 - i$.
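
The sum, the product, and the value of the product of the factors (2 − ω_k) for the fifth roots of i can all be confirmed numerically. The sketch below builds the roots from the angles π/10 + 2πk/5 found in Problem 7.3.25; the variable names are illustrative.

```python
import cmath
import math

# The five fifth roots of i: cis(pi/10 + 2*pi*k/5) for k = 0, ..., 4.
omegas = [cmath.rect(1, math.pi / 10 + 2 * math.pi * k / 5) for k in range(5)]

root_sum = sum(omegas)

root_product = complex(1, 0)
shifted_product = complex(1, 0)
for omega in omegas:
    root_product *= omega            # product of the roots
    shifted_product *= (2 - omega)   # product of (2 - omega_k)

print(root_sum)          # expected to be close to 0
print(root_product)      # expected to be close to i
print(shifted_product)   # expected to be close to 32 - i
```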


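Problem 7.3.27. Find all solutions to $z^5 = 16\sqrt{3} - 16i$.

Solution. As usual, we use De Moivre’s Theorem and compare the magnitude and argument:
\begin{align*}
(r\operatorname{cis}\theta)^5 &= 32\left(\frac{\sqrt{3}}{2} - \frac{1}{2}i\right) \\
r^5\operatorname{cis} 5\theta &= 32\operatorname{cis}\left(-\frac{\pi}{6}\right) \\
r = 2, \quad 5\theta &= -\frac{\pi}{6} + 2\pi k \\
\theta &= -\frac{\pi}{30} + \frac{2\pi}{5}k.
\end{align*}
Therefore $z = 2\operatorname{cis}\frac{11\pi}{30}$, $2\operatorname{cis}\frac{23\pi}{30}$, $2\operatorname{cis}\frac{35\pi}{30}$, $2\operatorname{cis}\frac{47\pi}{30}$, and $2\operatorname{cis}\frac{59\pi}{30}$.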

Exercise 7.3.28. Write in polar form all solutions to $z^5 = -1 + i$, $z \in \mathbb{C}$.

Exercise 7.3.29. Write the six solutions to $z^6 = -64i$ in polar form, then convert two of them to standard form ($a + bi$).

7.3.3 Rotation

Expressing complex numbers in polar form allows us to easily rotate points. Consider the following
diagram:

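[Figure: the complex plane, with real and imaginary axes, showing the point (r; θ) and its image (r; θ + α) after a counter-clockwise rotation by the angle α about the origin.]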

Recall that r cis(θ + α) = (r cis θ) · cis α. Thus, if we are given some complex number z = a + bi =
r cis θ, we can easily rotate it around the origin counter-clockwise by an angle α by multiplying z by
cis α.
Thus, the complex number that results from rotating a + bi by an angle of α counterclockwise is:

(a + bi)(cos α + i sin α) = (a cos α − b sin α) + (a sin α + b cos α)i.

Then, if we convert this complex number back to coordinates, we conclude that the rotation of
the point (a, b) by the angle α around the origin is

(a cos α − b sin α, a sin α + b cos α).

Problem 7.3.30. Rotate $(3, 4)$ around the origin by $60^\circ$ counter-clockwise.

Solution. Simple application of the formula yields:
\[
(3\cos 60^\circ - 4\sin 60^\circ,\ 3\sin 60^\circ + 4\cos 60^\circ) = \left(\frac{3}{2} - 2\sqrt{3},\ \frac{3\sqrt{3}}{2} + 2\right).
\]

Problem 7.3.31. Rotate $(x, y)$ $90^\circ$ counter-clockwise.

Solution. Another straightforward application of the formula:
\[
(x\cos 90^\circ - y\sin 90^\circ,\ x\sin 90^\circ + y\cos 90^\circ) = (-y, x).
\]

Problem 7.3.32. Rotate $(1, -4)$ around $(2, 3)$ counterclockwise by $60^\circ$.

Solution. We must turn this problem into something we know how to do, which is rotating around the origin.
Observe how we can shift all points by $-2$ on the x-axis and $-3$ on the y-axis in order to have $(2, 3)$ ‘become’ the origin. Therefore this problem is equivalent to rotating $(1 - 2, -4 - 3) = (-1, -7)$ around the origin by $60^\circ$ counter-clockwise, then shifting the point back 2 right and 3 up. Using the formula, we have
\[
(-1 \cdot \cos 60^\circ + 7\sin 60^\circ,\ -1 \cdot \sin 60^\circ - 7\cos 60^\circ) = \left(\frac{7\sqrt{3} - 1}{2},\ -\frac{7 + \sqrt{3}}{2}\right)
\]
as the rotated point from $(-1, -7)$ around the origin. Now we can shift the point by adding 2 to the x-coordinate and 3 to the y-coordinate to get our answer:
\[
\left(\frac{7\sqrt{3} - 1}{2} + 2,\ -\frac{7 + \sqrt{3}}{2} + 3\right) = \left(\frac{7\sqrt{3} + 3}{2},\ -\frac{1 + \sqrt{3}}{2}\right).
\]
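
The shift, rotate, shift-back procedure of this solution translates directly into complex arithmetic: subtract the center, multiply by cis α, and add the center back. The sketch below is one way to write it; the function name `rotate_about` and the use of `cmath.exp(1j * alpha)` for cis α are illustrative choices.

```python
import cmath
import math

def rotate_about(point, center, alpha):
    """Rotate `point` counter-clockwise by `alpha` radians around `center`."""
    z = complex(*point) - complex(*center)            # shift center to the origin
    w = z * cmath.exp(1j * alpha) + complex(*center)  # rotate, then shift back
    return (w.real, w.imag)

answer = rotate_about((1, -4), (2, 3), math.radians(60))
expected = ((7 * math.sqrt(3) + 3) / 2, -(1 + math.sqrt(3)) / 2)

print(answer)
print(expected)
```

Passing the origin as the center reproduces the plain rotation formula, so the same function also covers Problems 7.3.30 and 7.3.31.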

Now we shift our focus to conic sections. The method of rotating points that we have discovered
earlier can be utilized in a certain fashion (also known as a coordinate substitution) to rotate general
functions, particularly conics.
While a function from complex numbers to complex numbers cannot be graphed in the plane (the input and the output each need two real dimensions), the set of solutions to an equation of the form f(z) = 0 can. Graphs of f(z) = 0 correspond to graphs of f(x, y) = 0 in the Cartesian plane, so we will instead deal with those. Consider the following diagram.

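[Figure: the complex plane (axes Re and Im) showing an ellipse in black and a copy of it in purple, rotated counterclockwise by an angle α about the origin, with a pair of corresponding points marked in red.]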

In this figure, the ellipse shown is rotated some angle α counterclockwise. Note that if we rotate
any point on the rotated (purple) ellipse clockwise by α, then we get a point on the original (black)
ellipse.
For example, the red point on the purple ellipse gives us the red point on the black ellipse when
rotated. Therefore, if the equation of the black ellipse is f (x, y) = 0, then the equation of the purple
ellipse should be f (the point (x, y) rotated α clockwise) = 0. However, we now have a coordinate
substitution that gives us the coordinates after rotation! Substituting in our formulas (and noting
that rotating clockwise by α is the same as rotating counterclockwise by −α), the equation for the
rotated ellipse is
f (x cos(−α) − y sin(−α), x sin(−α) + y cos(−α)) = 0.

In general, we can conclude that given an equation f(x, y) = 0, the new equation

f(x cos(−α) − y sin(−α), x sin(−α) + y cos(−α)) = 0

describes the graph of f(x, y) = 0 rotated counterclockwise around the origin by an angle of α.

Problem 7.3.33. Rotate $y = 2x$ by $60^\circ$ counter-clockwise.

Solution. For clarity, rearrange this to $y - 2x = 0$. We then use the stated formula and make the appropriate substitutions for $x$ and $y$:
\[
(x\sin(-60^\circ) + y\cos(-60^\circ)) - 2(x\cos(-60^\circ) - y\sin(-60^\circ)) = 0.
\]
This rearranges to $\left(-\frac{x\sqrt{3}}{2} + \frac{y}{2}\right) - 2\left(\frac{x}{2} + \frac{y\sqrt{3}}{2}\right) = 0$, or $\frac{y}{2} - y\sqrt{3} = x + \frac{x\sqrt{3}}{2}$. After much simplification, we end up with $y = -\frac{8 + 5\sqrt{3}}{11}x$.
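
One way to double-check the slope is to rotate a few actual points of the line y = 2x by 60° and compute the slope of each rotated point through the origin. The sketch below does this with the rotation formula from this section; the helper name `rotate` and the sample x-values are illustrative.

```python
import math

def rotate(point, alpha):
    """Rotate (x, y) counter-clockwise by alpha radians around the origin."""
    x, y = point
    return (x * math.cos(alpha) - y * math.sin(alpha),
            x * math.sin(alpha) + y * math.cos(alpha))

expected_slope = -(8 + 5 * math.sqrt(3)) / 11

slopes = []
for x in (1.0, -2.0, 3.5):
    xr, yr = rotate((x, 2 * x), math.radians(60))   # a point on y = 2x, rotated
    slopes.append(yr / xr)

print(slopes, expected_slope)
```

Every computed slope should match the closed form up to rounding, which agrees with the rotated graph being the line found above.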

Theorem 7.3.34
Consider the general equation for a conic section:
\[
Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0.
\]
We call the value $B^2 - 4AC$ the conic discriminant. Assuming that the conic is not degenerate:

• If $B^2 - 4AC < 0$, then the conic is an ellipse.

• If $B^2 - 4AC = 0$, then the conic is a parabola.

• If $B^2 - 4AC > 0$, then the conic is a hyperbola.

Proof. We first attempt to get rid of the $x$ and $y$ terms of the general conic equation by translating it. Translations are characterized by the transformations $x \to x - r$ and $y \to y - s$. Substituting, we have this equation:
\[
A(x - r)^2 + B(x - r)(y - s) + C(y - s)^2 + D(x - r) + E(y - s) + F = 0.
\]
After expanding, the resulting coefficient of $x$ is $D - 2Ar - Bs$ and the coefficient of $y$ is $E - 2Cs - Br$. In order for both of these to equal 0 (and therefore eliminate those terms), we need
\begin{align*}
D &= 2Ar + Bs, \\
E &= Br + 2Cs.
\end{align*}
We then solve for $r$ and $s$, which are the translations necessary to get rid of the $x$ and $y$ terms:
\[
r = \frac{BE - 2CD}{B^2 - 4AC}, \qquad s = \frac{BD - 2AE}{B^2 - 4AC}.
\]
Thus, if $B^2 - 4AC \neq 0$, then we can translate the conic such that there are no $x$ nor $y$ terms. Note also that the coefficients $A$, $B$, and $C$ all stay the same under this transformation; only the constant term changes.
We now split into two different cases: $B^2 - 4AC \neq 0$ and $B^2 - 4AC = 0$.

• Case 1: $B^2 - 4AC \neq 0$
Assuming $B^2 - 4AC \neq 0$, we have successfully eliminated the $x$ and $y$ terms, so we just want to deal with equations of the form $Ax^2 + Bxy + Cy^2 + F = 0$ (remember, the values $A$, $B$, and $C$ all stayed the same under that transformation; the constant term may have changed, but we continue to call it $F$).
We now try to rotate the conic by some angle $\theta$ to get rid of the $xy$ term. As stated in Theorem 8, we can make the transformations $x \to x\cos\theta - y\sin\theta$ and $y \to x\sin\theta + y\cos\theta$. Making our substitutions, we have the equation
\[
A(x\cos\theta - y\sin\theta)^2 + B(x\cos\theta - y\sin\theta)(x\sin\theta + y\cos\theta) + C(x\sin\theta + y\cos\theta)^2 + F = 0.
\]
Expanding and rearranging according to our $x^2$, $xy$, and $y^2$ terms, we are left with:
\begin{align*}
&(A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta)x^2 \\
&\quad + (B\cos 2\theta + (C - A)\sin 2\theta)xy \\
&\quad + (A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta)y^2 + F = 0.
\end{align*}
Our original purpose was to eliminate the $xy$ term, so we set the coefficient of the $xy$ term equal to 0:
\[
B\cos 2\theta + (C - A)\sin 2\theta = 0.
\]
Rearranging, we have:
\[
B\cos 2\theta = (A - C)\sin 2\theta.
\]
If $A \neq C$, then we can conclude that
\[
\tan 2\theta = \frac{B}{A - C}.
\]
If $A = C$, then we have $B\cos 2\theta = 0$, which is satisfied by $\theta = 45^\circ$. Either way, we have found an expression for $\theta$ that can eliminate the $xy$ term.
After finally eliminating the $xy$ term, we now have the new conic equation $\tilde{a}x^2 + \tilde{c}y^2 + F = 0$ (we cannot use the initial $A$, $B$, $C$ variables again because the coefficients of $x^2$ and $y^2$ have changed due to the rotation by $\theta$). Based on what we know about conic sections so far:

– If $\tilde{a}$, $\tilde{c}$ have the same sign, then the conic is an ellipse.

– If $\tilde{a}$, $\tilde{c}$ have opposite signs, then the conic is a hyperbola.

To figure out whether $\tilde{a}$ and $\tilde{c}$ have the same sign or not, we multiply them together: if the product is negative, then they have different signs, and if the product is positive, then they have the same sign.
Recall from the expanded equation stated earlier that $\tilde{a} = A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta$ and $\tilde{c} = A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta$. We have to figure out the sign of this unwieldy product:
\[
(A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta) \cdot (A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta).
\]
Without loss of generality, we can double each of these expressions, because multiplying by 2 would not influence the sign of the product.
\[
(A \cdot 2\cos^2\theta + B \cdot 2\sin\theta\cos\theta + C \cdot 2\sin^2\theta) \cdot (A \cdot 2\sin^2\theta - B \cdot 2\sin\theta\cos\theta + C \cdot 2\cos^2\theta).
\]
As one may observe, doubling these expressions allows us to conveniently apply our known trigonometric identities:
\[
(A(\cos 2\theta + 1) + B\sin 2\theta + C(1 - \cos 2\theta)) \cdot (A(1 - \cos 2\theta) - B\sin 2\theta + C(1 + \cos 2\theta)).
\]
Simplifying further, we have:
\[
((A + C) + (A - C)\cos 2\theta + B\sin 2\theta) \cdot ((A + C) - (A - C)\cos 2\theta - B\sin 2\theta).
\]
At this point, we cannot proceed without discussing the value of $\theta$. As concluded earlier, we have either $\tan 2\theta = \frac{B}{A - C}$ or $\theta = 45^\circ$. Consider both cases:
– Case 1.1: $\tan 2\theta = \dfrac{B}{A - C}$.
We must find the expressions of $\sin 2\theta$ and $\cos 2\theta$ in terms of $A$, $B$, and $C$. We can accomplish this with the use of a reference triangle (constructed such that the tangent of an angle $2\theta$ is equal to the ratio of the opposite side to the adjacent side), as shown:

[Figure: a right triangle containing the angle 2θ, with adjacent side $A - C$, opposite side $B$, and hypotenuse $\sqrt{B^2 + (A - C)^2}$.]

We can then read our values:
\[
\sin 2\theta = \frac{B}{\sqrt{B^2 + (A - C)^2}}, \qquad \cos 2\theta = \frac{A - C}{\sqrt{B^2 + (A - C)^2}}.
\]
Looking back at our expression,
\[
((A + C) + (A - C)\cos 2\theta + B\sin 2\theta) \cdot ((A + C) - (A - C)\cos 2\theta - B\sin 2\theta),
\]
we substitute in $\sin 2\theta$ and $\cos 2\theta$ to get:
\[
\left((A + C) + \sqrt{B^2 + (A - C)^2}\right)\left((A + C) - \sqrt{B^2 + (A - C)^2}\right).
\]
This is simply a difference of squares, and we simplify to end up with:
\[
(A + C)^2 - \left(B^2 + (A - C)^2\right) = 4AC - B^2.
\]
If $4AC - B^2$ is positive, then $\tilde{a}$ and $\tilde{c}$ have the same sign, therefore the conic is an ellipse. Note that $4AC - B^2 > 0 \implies B^2 - 4AC < 0$, and so we have proven the ‘ellipse’ case of the theorem.
If $4AC - B^2$ is negative, then $\tilde{a}$ and $\tilde{c}$ have different signs, therefore the conic is a hyperbola. Note that $4AC - B^2 < 0 \implies B^2 - 4AC > 0$, and so we have proven the ‘hyperbola’ case of the theorem.
– Case 1.2: $\theta = 45^\circ$.
Our expression,
\[
((A + C) + (A - C)\cos 2\theta + B\sin 2\theta) \cdot ((A + C) - (A - C)\cos 2\theta - B\sin 2\theta),
\]
is simplified tremendously when $\theta = 45^\circ$ is plugged in:
\[
(A + B + C)(A - B + C).
\]
However, recall that we chose $\theta = 45^\circ$ exactly when $A = C$. Therefore, the expression above is equal to $(2A + B)(2A - B) = 4A^2 - B^2 = 4AC - B^2$, and now we can make the same conclusion as in Case 1.1.

• Case 2: $B^2 - 4AC = 0$
Let’s go back to our original general equation of a conic section:
\[
Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0.
\]
Note that $Ax^2 + Bxy + Cy^2$ is a square if and only if $B^2 - 4AC = 0$. To see why this is true, recall the quadratic formula $\dfrac{-B \pm \sqrt{B^2 - 4AC}}{2A}$ (which is an expression for the roots of the quadratic). If $B^2 - 4AC = 0$, then the square root term is eliminated and there is no plus-minus case to consider, resulting in double roots (both roots are the same), so the expression is a square of another expression.
Thus, let $Ax^2 + Bxy + Cy^2 = (mx + ny)^2$ for some $m, n \in \mathbb{R}$ (multiplying the whole equation by $-1$ first if necessary, so that the leading coefficient is nonnegative). Substituting, we now have the equation
\[
(mx + ny)^2 + Dx + Ey + F = 0.
\]
We are trying to prove that $B^2 - 4AC = 0$ implies that the conic is a parabola. Therefore we seek to eliminate the $y^2$ term so we are left with an equation with only $y$, $x^2$, and $x$ as variables.
Like the first case, we perform the rotation $x \to x\cos\theta - y\sin\theta$ and $y \to x\sin\theta + y\cos\theta$ in hopes of getting rid of the $y^2$ term:
\[
(m(x\cos\theta - y\sin\theta) + n(x\sin\theta + y\cos\theta))^2 + D(x\cos\theta - y\sin\theta) + E(x\sin\theta + y\cos\theta) + F = 0.
\]
Rewrite this equation so that it is clear what the coefficients of $x^2$, $y^2$, $x$, and $y$ are:
\[
(x(m\cos\theta + n\sin\theta) + y(n\cos\theta - m\sin\theta))^2 + x(D\cos\theta + E\sin\theta) + y(E\cos\theta - D\sin\theta) + F = 0.
\]
Now it is clear that the only source of the $y^2$ term is $n\cos\theta - m\sin\theta$, which is the coefficient of the $y$ term inside the squared expression. Therefore we set $n\cos\theta - m\sin\theta = 0$, or $\tan\theta = \frac{n}{m}$.
Note that the equation now will resemble this:
\[
(x \cdot (\text{some expression}))^2 + \tilde{D}x + \tilde{E}y + \tilde{F} = 0,
\]
for some coefficients $\tilde{D}$, $\tilde{E}$, and $\tilde{F}$.
At last, we have shown that we can rotate the conic by a chosen angle $\theta$ (such that $\tan\theta = \frac{n}{m}$) to end up with an equation that is a parabola. Therefore, $B^2 - 4AC = 0 \implies$ the conic is a parabola.
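
The statement of Theorem 7.3.34 amounts to a three-way test on the sign of B² − 4AC. Here is a minimal Python sketch of that test. It assumes, as the theorem does, that the conic is not degenerate, and it assumes exact coefficients (integers or `fractions.Fraction`), since comparing a floating-point discriminant with 0 is unreliable. The function name `classify_conic` is an illustrative choice.

```python
def classify_conic(A, B, C):
    """Classify the non-degenerate conic Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0
    by the sign of its discriminant B^2 - 4AC (D, E, F do not matter)."""
    discriminant = B * B - 4 * A * C
    if discriminant < 0:
        return "ellipse"
    if discriminant == 0:
        return "parabola"
    return "hyperbola"

print(classify_conic(1, 0, 1))   # x^2 + y^2 - 1 = 0 (a circle, which is an ellipse)
print(classify_conic(0, 1, 0))   # xy - 1 = 0
print(classify_conic(1, 2, 1))   # x^2 + 2xy + y^2 - x + y = 0
```

Only the three quadratic coefficients enter the test, which mirrors the proof: translations and rotations change D, E, and F but leave the sign of B² − 4AC unchanged.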

Exercise 7.3.35. Prove that the parametrically defined graph $x = t^2 + t$, $y = t^2 + 1$ is a parabola.


Chapter 8

Linear Algebra

In this chapter, we will touch upon the basics of linear algebra, particularly vectors and matrices.
Then we discuss their applications in 2D and 3D geometry.
This chapter will only serve as a general introduction of linear algebra, so some proofs will be
omitted as they would be beyond the scope of this book. More emphasis will be placed on showing
how certain concepts are related to each other, as well as key observations to be made, rather than
utmost mathematical rigor. For in-depth study of linear algebra, feel free to explore courses offered
by universities or purchase books in Linear Algebra.

8.1 Vectors

Definition 8.1.1. A vector is characterized by its magnitude and direction.

The following would be some examples which constitute vectors:

• $\vec{v}$: 12 ft. east

• $\vec{w}$: 5 in. north

We represent vectors with arrows; however, a vector is determined uniquely by its magnitude and direction. Although we can draw more than one arrow with the same length and direction, remember that they all denote the same vector. Some representations of vectors are illustrated below:

[Figure: two arrows of length 12 units pointing in the same direction, both labeled $\vec{v}$ (these are the same vector $\vec{v}$, but different representations), and a separate arrow labeled $\vec{w}$.]

Furthermore, we define two important parts of the vector:

[Figure: an arrow whose starting end is labeled “tail” and whose pointed end is labeled “tip”.]

Now we consider the idea of adding two vectors together:

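Definition 8.1.2 (Vector Addition). $\vec{v} + \vec{w}$ can be described as follows: Take a representation of $\vec{v}$. Create a representation of $\vec{w}$ by placing the tail of $\vec{w}$ at the tip of $\vec{v}$. We form a new vector by going from the tail of $\vec{v}$ to the tip of $\vec{w}$.

This is demonstrated in the following diagram:

[Figure: the arrow $\vec{v}$ followed tip-to-tail by the arrow $\vec{w}$, with the arrow $\vec{v} + \vec{w}$ drawn from the tail of $\vec{v}$ to the tip of $\vec{w}$.]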

Example 8.1.3
Show that $\vec{v} + \vec{w}$ is unique.

Proof. Given the diagram below, we need to show that the lengths $AC$ and $A'C'$ are equal (that $\vec{v} + \vec{w}$ has a concrete value no matter what representations of $\vec{v}$ and $\vec{w}$ are taken).

[Figure: two copies of the vector-addition triangle. In the first, $\vec{v}$ runs from $A$ to $B$, $\vec{w}$ runs from $B$ to $C$, and “$\vec{v} + \vec{w}$” runs from $A$ to $C$. The second copy uses different representations of the same vectors, with vertices $A'$, $B'$, and $C'$.]

First, note that both vectors $\vec{v}$ are parallel (i.e. pointing in the same direction) and equal in magnitude. So are the two $\vec{w}$ vectors. Therefore, $ABB'A'$ and $BCC'B'$ are parallelograms.
This implies $AA' \cong BB'$ and $BB' \cong CC'$, and by the transitive property, $AA' \cong CC'$.
Furthermore, since $ABB'A'$ and $BCC'B'$ are parallelograms, $AA' \parallel BB'$ and $BB' \parallel CC'$, therefore $AA' \parallel CC'$.
Since $AA' \cong CC'$ and $AA' \parallel CC'$, we can conclude that $ACC'A'$ is a parallelogram, therefore $\overrightarrow{AC} = \overrightarrow{A'C'} = \vec{v} + \vec{w}$, a well-defined value.

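Problem 8.1.4. From this definition of vector addition, prove the following:

1. $\vec{v} + \vec{w} = \vec{w} + \vec{v}$
2. $\vec{u} + (\vec{v} + \vec{w}) = (\vec{u} + \vec{v}) + \vec{w}$

Proof.

1. Consider the following diagram:

[Figure: a parallelogram with one pair of opposite sides labeled $\vec{v}$ and the other pair labeled $\vec{w}$; the diagonal is labeled $\vec{v} + \vec{w} = \vec{w} + \vec{v}$.]

Use two different representations of vector $\vec{v}$ and two different representations of vector
$\vec{w}$, arranged as the sides of a parallelogram. Following $\vec{v}$ and then $\vec{w}$ traces the same diagonal as following $\vec{w}$ and then $\vec{v}$, so the resulting vector $\vec{v} + \vec{w}$ is the same as $\vec{w} + \vec{v}$.

2. Consider the following diagram:

[Figure: points $A$, $B$, $C$, $D$ joined by arrows $\vec{u}$ (from $A$ to $B$), $\vec{v}$ (from $B$ to $C$), and $\vec{w}$ (from $C$ to $D$), with $\overrightarrow{AC}$ labeled $\vec{u} + \vec{v}$, $\overrightarrow{BD}$ labeled $\vec{v} + \vec{w}$, and $\overrightarrow{AD}$ labeled $(\vec{u} + \vec{v}) + \vec{w} = \vec{u} + (\vec{v} + \vec{w})$.]

Let $\vec{u}$ be represented by $\overrightarrow{AB}$, $\vec{v}$ be represented by $\overrightarrow{BC}$, and $\vec{w}$ be represented by $\overrightarrow{CD}$. Therefore
$\vec{u} + \vec{v}$ is represented by $\overrightarrow{AC}$, and we can conclude that $\overrightarrow{AD}$ represents $(\vec{u} + \vec{v}) + \vec{w}$, or $\overrightarrow{AC} + \overrightarrow{CD}$.
However, $\overrightarrow{AD}$ also represents $\vec{u} + (\vec{v} + \vec{w})$, or $\overrightarrow{AB} + \overrightarrow{BD}$. Since $\overrightarrow{AD}$ represents both $(\vec{u} + \vec{v}) + \vec{w}$
and $\vec{u} + (\vec{v} + \vec{w})$, we can conclude that these are the same vector.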
Definition 8.1.5. Let a scalar be a real number. If $k \in \mathbb{R}^+$, we define $k\vec{v}$ to be the vector in the
same direction as $\vec{v}$ but with $k$ times the magnitude.

Exercise 8.1.6. From this definition of a scalar, two distributive properties follow:

1. $k(\vec{v} + \vec{w}) = k\vec{v} + k\vec{w}$

2. $(k + l)\vec{v} = k\vec{v} + l\vec{v}$

Prove these using similar triangles.

Definition 8.1.7. The zero vector, $\vec{0}$, is the vector of magnitude 0 in any direction.

Definition 8.1.8. $-\vec{v}$ is the vector which, when added to $\vec{v}$, gives us $\vec{0}$ (i.e. $-\vec{v}$ has the same magnitude
as $\vec{v}$, but points in the opposite direction).

It follows that we have another associative property: $(kl)\vec{v} = k(l\vec{v})$ for $k, l \in \mathbb{R}$. Feel free to prove
this on your own.

Problem 8.1.9. Prove $-1 \cdot \vec{v} = -\vec{v}$.

Proof. Using the distributive property, note that $-1 \cdot \vec{v} + 1 \cdot \vec{v} = (-1 + 1) \cdot \vec{v} = 0 \cdot \vec{v} = \vec{0}$. Therefore,
$-1 \cdot \vec{v} + \vec{v} = \vec{0}$, so by the previous definition, $-1 \cdot \vec{v} = -\vec{v}$.

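Definition 8.1.10 (Vector Subtraction). Given $\vec{v}$, $\vec{w}$, we can define $\vec{v} - \vec{w}$ to be that vector which,
when added to $\vec{w}$, gives us $\vec{v}$.

Here is a geometric interpretation:

[Figure: arrows $\vec{v}$ and $\vec{w}$ sharing the same tail, with $\vec{v} - \vec{w}$ drawn from the tip of $\vec{w}$ to the tip of $\vec{v}$.]

Furthermore, note that $-\vec{w}$ has the same magnitude as $\vec{w}$ but points in the opposite direction, therefore
we can have another representation:

[Figure: arrow $\vec{v}$ followed by arrow $-\vec{w}$ placed at the tip of $\vec{v}$, with $\vec{v} - \vec{w}$ drawn from the tail of $\vec{v}$ to the tip of $-\vec{w}$.]

We can see that the vector $\vec{v} - \vec{w}$ is the same in both of these scenarios.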

Definition 8.1.11. We now introduce new notation that is not universally standard, but we will
be using this notation from now on:
In the two-dimensional plane, denote by $\langle a, b \rangle$ the vector that represents the overall
path taken when going $a$ horizontally and $b$ vertically from any starting point.

Here is a geometric example:

[Figure: starting at the point $(2, 8)$, moving 5 right ($\langle 5, 0 \rangle$) and then 14 down ($\langle 0, -14 \rangle$) to reach the point $(7, -6)$; the overall path is the vector $\langle 5, -14 \rangle$.]

Given a vector $\langle a_1, a_2, \ldots, a_n \rangle$, we call $a_1, a_2, \ldots, a_n$ the components of that vector.

Problem 8.1.12. What would be the vector that connects the point (r, s) to the point (t, u)?

Definition 8.1.13. We define a set of special unit vectors:

• $\vec{\imath} = \langle 1, 0 \rangle$ in 2D and $\langle 1, 0, 0 \rangle$ in 3D.

• $\vec{\jmath} = \langle 0, 1 \rangle$ in 2D and $\langle 0, 1, 0 \rangle$ in 3D.

• $\vec{k} = \langle 0, 0, 1 \rangle$ in 3D.

A vector of magnitude 1 is called a unit vector.

We can define more unit vectors for greater dimensions, and we will do so at a later part of this
chapter.
Using these vectors, we can establish the relation

\[ \langle a, b \rangle = a\vec{\imath} + b\vec{\jmath}, \]

allowing us to break down any vector in terms of the unit vectors. We will revisit this idea later in
the chapter.

Definition 8.1.14. Let $\|\vec{v}\|$ denote the magnitude of $\vec{v}$.

For a two-dimensional vector $\langle a, b \rangle$, $\|\langle a, b \rangle\| = \sqrt{a^2 + b^2}$.

For a three-dimensional vector $\langle a, b, c \rangle$, $\|\langle a, b, c \rangle\| = \sqrt{a^2 + b^2 + c^2}$. The pattern follows for higher
dimensions.

In accordance with this definition, $\|\vec{v}\| \geq 0$ with equality iff $\vec{v} = \vec{0}$.


Problem 8.1.15. Find $\|\langle 3, 4 \rangle\|$, $\|\vec{\imath}\|$, and $\|\langle 1, 2, 3 \rangle\|$.

Solution. The vector $\langle 3, 4 \rangle$ means that we're going 3 to the right and 4 up, so we have
$\|\langle 3, 4 \rangle\| = \sqrt{3^2 + 4^2} = \sqrt{25} = 5$. It is evident that $\|\vec{\imath}\| = 1$, since we are only going 1 unit right. Lastly,
$\|\langle 1, 2, 3 \rangle\| = \sqrt{1^2 + 2^2 + 3^2} = \sqrt{14}$.
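As a quick numerical check of Definition 8.1.14, the following minimal sketch computes magnitudes in any dimension. The function name `magnitude` and the use of Python lists for vectors are illustrative assumptions, not notation from this book.

```python
import math

def magnitude(v):
    """Magnitude of a vector given as a sequence of components (hypothetical helper)."""
    # Square each component, add them up, and take the square root.
    return math.sqrt(sum(c * c for c in v))

print(magnitude([3, 4]))     # the vector <3, 4>
print(magnitude([1, 0]))     # the unit vector i in 2D
print(magnitude([1, 2, 3]))  # the vector <1, 2, 3>
```

The same function works for higher dimensions because the definition only ever sums the squares of the components.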

The magnitude of a vector satisfies a couple of properties:

1. $\|k\vec{v}\| = |k|\,\|\vec{v}\|$
2. $\|\vec{v} + \vec{w}\| \leq \|\vec{v}\| + \|\vec{w}\|$ with equality if and only if $\vec{v}$, $\vec{w}$ are in the same direction.

The first one is easily provable, while a proof for the second one is nearly identical to that of
Problem 7.3.4.
Problem 8.1.16. Find vectors in the same direction as $\langle 2, 3 \rangle$ of:

1. Magnitude 1.
2. Magnitude 7.

Solution.

1. We find that $\|\langle 2, 3 \rangle\| = \sqrt{2^2 + 3^2} = \sqrt{13}$. Therefore we must scale our vector by a factor of $\frac{1}{\sqrt{13}}$
in order to scale its magnitude to 1. Therefore our new vector is $\frac{1}{\sqrt{13}} \langle 2, 3 \rangle = \left\langle \frac{2}{\sqrt{13}}, \frac{3}{\sqrt{13}} \right\rangle$.

2. Similarly, since we have found our vector $\left\langle \frac{2}{\sqrt{13}}, \frac{3}{\sqrt{13}} \right\rangle$ of magnitude 1 in the same direction as
the original vector, we can simply scale that vector by a factor of 7 to get $\left\langle \frac{14}{\sqrt{13}}, \frac{21}{\sqrt{13}} \right\rangle$.
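The rescaling used in this solution can be sketched in a few lines. The helper name `scale_to` is hypothetical; it divides by the current magnitude to get a unit vector and then multiplies by the target magnitude.

```python
import math

def scale_to(v, target):
    """Return the vector in the same direction as v with the given magnitude (hypothetical helper)."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0:
        # The zero vector has no direction, so it cannot be rescaled.
        raise ValueError("cannot rescale the zero vector")
    return [target * c / norm for c in v]

unit = scale_to([2, 3], 1)   # direction of <2, 3>, magnitude 1
seven = scale_to([2, 3], 7)  # direction of <2, 3>, magnitude 7
print(unit, seven)
```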
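Now, we introduce the concept of angles to vectors. Consider the following vectors:

[Figure: a triangle formed by $\vec{v} = \langle a, b \rangle$ and $\vec{w} = \langle c, d \rangle$ sharing the same tail, with angle $\theta$ between them, and $\vec{v} - \vec{w} = \langle a - c, b - d \rangle$ as the third side.]

The vectors $\vec{v}$, $\vec{w}$, and $\vec{v} - \vec{w}$ form a triangle. We can then apply the Law of Cosines to find
the angle $\theta$ between vectors $\vec{v}$ and $\vec{w}$:

\[ \|\vec{v} - \vec{w}\|^2 = \|\vec{v}\|^2 + \|\vec{w}\|^2 - 2\|\vec{v}\|\|\vec{w}\| \cos\theta. \]

Solving for $\cos\theta$, this rearranges to

\[ \cos\theta = \frac{\|\vec{v}\|^2 + \|\vec{w}\|^2 - \|\vec{v} - \vec{w}\|^2}{2\|\vec{v}\|\|\vec{w}\|}. \]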

We arbitrarily let $\vec{v} = \langle a, b \rangle$ and $\vec{w} = \langle c, d \rangle$, which implies $\vec{v} - \vec{w} = \langle a - c, b - d \rangle$. Therefore, using
the definition of magnitude, we make substitutions in the numerator (we will keep the denominator
the same to keep our expression from becoming too messy):

\begin{align*}
\cos\theta &= \frac{a^2 + b^2 + c^2 + d^2 - \left( (a - c)^2 + (b - d)^2 \right)}{2\|\vec{v}\|\|\vec{w}\|} \\
&= \frac{2ac + 2bd}{2\|\vec{v}\|\|\vec{w}\|} \\
&= \frac{ac + bd}{\|\vec{v}\|\|\vec{w}\|}.
\end{align*}

Notice that $ac + bd$ is the result of taking the sum of the product of the first coordinates and the
product of the second coordinates of $\vec{v}$ and $\vec{w}$. This quantity is extremely useful and significant in
the use of vectors, and so we will define this unique operation:
Definition 8.1.17. Given vectors $\langle a_1, a_2, \ldots, a_n \rangle$ and $\langle b_1, b_2, \ldots, b_n \rangle$, the dot product is denoted
as:

\[ \langle a_1, a_2, \ldots, a_n \rangle \cdot \langle b_1, b_2, \ldots, b_n \rangle = \sum_{i=1}^{n} a_i b_i. \]

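Then, we have the following properties:

1. $\forall a \in \mathbb{R}$, $(a\vec{v}) \cdot \vec{w} = a(\vec{v} \cdot \vec{w})$.
2. $\vec{u} \cdot (\vec{v} + \vec{w}) = \vec{u} \cdot \vec{v} + \vec{u} \cdot \vec{w}$.

Proof. Let $\vec{u} = \langle a, b \rangle$, $\vec{v} = \langle c, d \rangle$, $\vec{w} = \langle e, f \rangle$. Then $\vec{u} \cdot (\vec{v} + \vec{w}) = \langle a, b \rangle \cdot \langle c + e, d + f \rangle =
a(c + e) + b(d + f) = ac + ae + bd + bf$.
Likewise, $\vec{u} \cdot \vec{v} + \vec{u} \cdot \vec{w} = \langle a, b \rangle \cdot \langle c, d \rangle + \langle a, b \rangle \cdot \langle e, f \rangle = ac + bd + ae + bf = ac + ae + bd + bf$,
therefore $\vec{u} \cdot (\vec{v} + \vec{w}) = \vec{u} \cdot \vec{v} + \vec{u} \cdot \vec{w}$.

3. $\vec{u} \cdot \vec{u} = \|\vec{u}\|^2$.

Proof. When we let $\vec{u} = \langle a, b \rangle$, we end up with $\langle a, b \rangle \cdot \langle a, b \rangle = a \cdot a + b \cdot b = a^2 + b^2 = \|\vec{u}\|^2$.

4. $(\vec{u} + \vec{v}) \cdot (\vec{u} - \vec{v}) = \|\vec{u}\|^2 - \|\vec{v}\|^2$.

Proof. We use our previously declared definitions and make the proper substitutions.
\begin{align*}
(\vec{u} + \vec{v}) \cdot (\vec{u} - \vec{v}) &= (\vec{u} + \vec{v}) \cdot \vec{u} + (\vec{u} + \vec{v}) \cdot (-\vec{v}) \\
&= \vec{u} \cdot \vec{u} + \vec{u} \cdot \vec{v} + (\vec{u} + \vec{v}) \cdot (-1 \cdot \vec{v}) \\
&= \|\vec{u}\|^2 + \vec{u} \cdot \vec{v} + (-1(\vec{u} + \vec{v})) \cdot \vec{v} \\
&= \|\vec{u}\|^2 + \vec{u} \cdot \vec{v} + (-\vec{u} - \vec{v}) \cdot \vec{v} \\
&= \|\vec{u}\|^2 + \vec{u} \cdot \vec{v} - \vec{u} \cdot \vec{v} - \vec{v} \cdot \vec{v} \\
&= \|\vec{u}\|^2 - \|\vec{v}\|^2.
\end{align*}

Now that we have defined the dot product, we can rewrite $\cos\theta$ with a simpler, cleaner expression.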

Theorem 8.1.18
Given two vectors $\vec{v}$ and $\vec{w}$ and the angle $\theta$ between their representations such that they share
the same tail, we have the following relationship:

\[ \cos\theta = \frac{\vec{v} \cdot \vec{w}}{\|\vec{v}\|\|\vec{w}\|}. \]

This formula extends to higher dimensions.

Problem 8.1.19. What is the angle between $\langle 1, 2 \rangle$ and $\langle 2, 1 \rangle$?

Solution. The dot product $\langle 1, 2 \rangle \cdot \langle 2, 1 \rangle = 1 \cdot 2 + 2 \cdot 1 = 4$, $\|\langle 1, 2 \rangle\| = \sqrt{5}$, and $\|\langle 2, 1 \rangle\| = \sqrt{5}$, therefore
$\cos\theta = \frac{4}{\sqrt{5} \cdot \sqrt{5}} = \frac{4}{5}$, and thus $\theta = \cos^{-1}\left(\frac{4}{5}\right)$ since $\langle 1, 2 \rangle$ and $\langle 2, 1 \rangle$ have representations in the
first quadrant in the Cartesian plane when their tails are placed at the origin.

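Problem 8.1.20. Find the angle between $\langle 1, 2, 3 \rangle$ and $\langle 3, 4, 5 \rangle$.

Solution. We have
\[ \cos\theta = \frac{\langle 1, 2, 3 \rangle \cdot \langle 3, 4, 5 \rangle}{\|\langle 1, 2, 3 \rangle\|\|\langle 3, 4, 5 \rangle\|} = \frac{1 \cdot 3 + 2 \cdot 4 + 3 \cdot 5}{\sqrt{1^2 + 2^2 + 3^2} \cdot \sqrt{3^2 + 4^2 + 5^2}} = \frac{13}{5\sqrt{7}}. \]

Therefore $\theta = \cos^{-1}\left(\frac{13}{5\sqrt{7}}\right)$.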
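The formula of Theorem 8.1.18 translates directly into code. This is a minimal sketch under the assumption that vectors are Python lists of equal length; the names `dot` and `angle_between` are illustrative only.

```python
import math

def dot(v, w):
    """Dot product of two vectors with the same number of components."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(v, w))

def angle_between(v, w):
    """Angle in radians between v and w, using cos(theta) = v.w / (|v||w|)."""
    cos_theta = dot(v, w) / (math.sqrt(dot(v, v)) * math.sqrt(dot(w, w)))
    # Clamp to [-1, 1] to guard against tiny floating-point overshoot.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

print(math.degrees(angle_between([1, 2], [2, 1])))        # Problem 8.1.19
print(math.degrees(angle_between([1, 2, 3], [3, 4, 5])))  # Problem 8.1.20
```

Note that the magnitude is computed as the square root of a vector's dot product with itself, which is property 3 of the dot product.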

Problem 8.1.21. Find all values of $m$ for which the angle between vectors $\langle 1, 1 \rangle$ and $\langle 1, m \rangle$ is $60^\circ$.

Solution. We have
\[ \cos 60^\circ = \frac{1 + m}{\sqrt{2} \cdot \sqrt{1 + m^2}}. \]

Since $\cos 60^\circ = \frac{1}{2}$, we end up with $\sqrt{2 + 2m^2} = 2 + 2m$. After squaring both sides and simplifying,
the equation reduces to $m^2 + 4m + 1 = 0$. The quadratic formula yields $m = -2 \pm \sqrt{3}$.
However, since $\cos 60^\circ = \frac{1}{2} > 0$, we cannot have $\frac{1 + m}{\sqrt{2} \cdot \sqrt{1 + m^2}}$ be negative. When we plug in
$m = -2 - \sqrt{3}$, we end up with a negative value. Therefore $-2 - \sqrt{3}$ cannot be a solution for $m$.
Then, $m = -2 + \sqrt{3}$ yields a positive value and thus it is the only value of $m$ which satisfies
the conditions.
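Squaring both sides can introduce an extraneous root, so it is worth checking both candidates numerically. The sketch below evaluates the right-hand side of the cosine equation for each root; the function name `cos_angle` is an assumption made for illustration.

```python
import math

def cos_angle(m):
    """Cosine of the angle between <1, 1> and <1, m>."""
    return (1 + m) / (math.sqrt(2) * math.sqrt(1 + m * m))

for m in (-2 + math.sqrt(3), -2 - math.sqrt(3)):
    # Only the root giving cos(theta) = +1/2 corresponds to a 60 degree angle.
    print(m, cos_angle(m))
```

The first candidate gives a cosine of one half, while the second gives negative one half, which corresponds to an angle of $120^\circ$ rather than $60^\circ$.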

Definition 8.1.22. $\vec{v}$ and $\vec{w}$ are orthogonal if any representation of $\vec{v}$ is perpendicular to any
representation of $\vec{w}$.

Theorem 8.1.23
$\vec{v}$ and $\vec{w}$ are orthogonal if and only if $\vec{v} \cdot \vec{w} = 0$.

Proof. If any representation of $\vec{v}$ is perpendicular to any representation of $\vec{w}$, then we can take a
representation of $\vec{v}$ and align it with the representation of $\vec{w}$ such that they share the same tail.
Then, the angle between those representations would be $90^\circ$. Therefore, we have $\cos 90^\circ = \frac{\vec{v} \cdot \vec{w}}{\|\vec{v}\|\|\vec{w}\|}$,
or $\vec{v} \cdot \vec{w} = 0$, so that concludes the first part.
If $\vec{v} \cdot \vec{w} = 0$, then $\cos\theta = 0$, where $\theta$ is the angle between the representation of $\vec{v}$ and the
representation of $\vec{w}$ such that they share the same tail. The only angle $\theta$ that is less than or equal to
$180^\circ$ which satisfies $\cos\theta = 0$ would be $\theta = 90^\circ$, therefore the representations of $\vec{v}$ and $\vec{w}$ would be
perpendicular, and we're done.
Problem 8.1.24. Prove that for any $k, l > 0$, the angle between $\vec{v}$ and $\vec{w}$ is the same as the angle
between $k\vec{v}$ and $l\vec{w}$.

Proof. Let the angle between the representation of $\vec{v}$ and the representation of $\vec{w}$ be $\theta_1$, and likewise $\theta_2$
for $k\vec{v}$ and $l\vec{w}$. Then,
\[ \cos\theta_1 = \frac{\vec{v} \cdot \vec{w}}{\|\vec{v}\|\|\vec{w}\|}. \]
We now show that $\cos\theta_2 = \cos\theta_1$ assuming that $k, l > 0$ and using the various properties of
magnitude and dot product that we have previously established:
\[ \cos\theta_2 = \frac{k\vec{v} \cdot l\vec{w}}{\|k\vec{v}\|\|l\vec{w}\|} = \frac{kl(\vec{v} \cdot \vec{w})}{|k|\|\vec{v}\||l|\|\vec{w}\|} = \frac{kl(\vec{v} \cdot \vec{w})}{k\|\vec{v}\| \cdot l\|\vec{w}\|} = \frac{\vec{v} \cdot \vec{w}}{\|\vec{v}\|\|\vec{w}\|} = \cos\theta_1. \]

Therefore we can conclude that $\theta_1 = \theta_2$.

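Consider the following diagram:

[Figure: vectors $\vec{v}$ and $\vec{w}$ sharing the same tail, the vector $k\vec{w}$ along $\vec{w}$, and the vector $\vec{v} - k\vec{w}$ from the tip of $k\vec{w}$ to the tip of $\vec{v}$, perpendicular to $\vec{w}$.]

For some scalar $k$, we define the vector $k\vec{w}$ as the projection of $\vec{v}$ onto $\vec{w}$, where the vector that
goes from the tip of $k\vec{w}$ to the tip of $\vec{v}$ is orthogonal to $\vec{w}$. By Theorem 8.1.23, we have
\[ (\vec{v} - k\vec{w}) \cdot \vec{w} = 0, \]
allowing us to find an expression for $k$.
This rearranges to $\vec{v} \cdot \vec{w} = k(\vec{w} \cdot \vec{w})$, and we have previously discovered that $\vec{w} \cdot \vec{w} = \|\vec{w}\|^2$,
therefore we end up with
\[ k = \frac{\vec{v} \cdot \vec{w}}{\|\vec{w}\|^2}. \]
Then we have:

Definition 8.1.25. The vector projection of $\vec{v}$ onto $\vec{w}$ is defined as

\[ \operatorname{proj}_{\vec{w}}(\vec{v}) = \frac{\vec{v} \cdot \vec{w}}{\|\vec{w}\|^2} \cdot \vec{w}. \]

This formula also works for vectors in higher dimensions.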


 
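Problem 8.1.26. Find $\operatorname{proj}_{\langle 1, 2 \rangle}(\langle 3, 4 \rangle)$, $\operatorname{proj}_{\langle 3, 4 \rangle}(\langle 1, 2 \rangle)$, and $\operatorname{proj}_{\vec{\imath} - \vec{\jmath} + \vec{k}}\left(\vec{\imath} + \vec{\jmath} + \vec{k}\right)$.

Solution. First, $\|\langle 1, 2 \rangle\|^2 = 1^2 + 2^2 = 5$, and $\langle 1, 2 \rangle \cdot \langle 3, 4 \rangle = 1 \cdot 3 + 2 \cdot 4 = 11$, therefore

\[ \operatorname{proj}_{\langle 1, 2 \rangle}(\langle 3, 4 \rangle) = \frac{11}{5} \cdot \langle 1, 2 \rangle = \left\langle \frac{11}{5}, \frac{22}{5} \right\rangle. \]

Likewise, $\|\langle 3, 4 \rangle\|^2 = 3^2 + 4^2 = 25$, therefore

\[ \operatorname{proj}_{\langle 3, 4 \rangle}(\langle 1, 2 \rangle) = \frac{11}{25} \cdot \langle 3, 4 \rangle = \left\langle \frac{33}{25}, \frac{44}{25} \right\rangle. \]

Lastly, note that $\vec{\imath} - \vec{\jmath} + \vec{k} = \langle 1, -1, 1 \rangle$ and $\vec{\imath} + \vec{\jmath} + \vec{k} = \langle 1, 1, 1 \rangle$. Therefore $\|\langle 1, -1, 1 \rangle\|^2 =
1^2 + (-1)^2 + 1^2 = 3$ and $\langle 1, -1, 1 \rangle \cdot \langle 1, 1, 1 \rangle = 1 - 1 + 1 = 1$, so we substitute in the appropriate
values:
\[ \operatorname{proj}_{\vec{\imath} - \vec{\jmath} + \vec{k}}\left(\vec{\imath} + \vec{\jmath} + \vec{k}\right) = \frac{1}{3} \langle 1, -1, 1 \rangle = \left\langle \frac{1}{3}, -\frac{1}{3}, \frac{1}{3} \right\rangle = \frac{1}{3}\vec{\imath} - \frac{1}{3}\vec{\jmath} + \frac{1}{3}\vec{k}. \]

Problem 8.1.27. Find $\operatorname{proj}_{\langle 1, 2, 3 \rangle}(\langle 3, 4, 5 \rangle)$.

Solution. We have
\[ \operatorname{proj}_{\langle 1, 2, 3 \rangle}(\langle 3, 4, 5 \rangle) = \frac{3 \cdot 1 + 4 \cdot 2 + 5 \cdot 3}{1^2 + 2^2 + 3^2} \langle 1, 2, 3 \rangle = \frac{13}{7} \langle 1, 2, 3 \rangle = \left\langle \frac{13}{7}, \frac{26}{7}, \frac{39}{7} \right\rangle. \]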
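The projection formula of Definition 8.1.25 can be checked with exact arithmetic. The sketch below assumes integer components and uses Python's `fractions.Fraction` so the results match the hand computations exactly; the name `proj_onto` is hypothetical.

```python
from fractions import Fraction

def dot(v, w):
    """Dot product of two vectors with the same number of components."""
    return sum(a * b for a, b in zip(v, w))

def proj_onto(w, v):
    """Vector projection of v onto w: ((v . w) / |w|^2) w, with integer inputs."""
    ww = dot(w, w)  # this is |w|^2
    if ww == 0:
        raise ValueError("cannot project onto the zero vector")
    k = Fraction(dot(v, w), ww)
    return [k * c for c in w]

print(proj_onto([1, 2], [3, 4]))        # Problem 8.1.26, first part
print(proj_onto([1, 2, 3], [3, 4, 5]))  # Problem 8.1.27
```

The argument order `proj_onto(w, v)` mirrors the notation $\operatorname{proj}_{\vec{w}}(\vec{v})$, with the vector being projected onto written first.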

8.2 Linear Transformations and Matrices

In this section, we will give vectors useful meaning by discussing how vectors of one set can be
transformed into vectors of another set. The discussion of linear transformations will enable us to
relate vectors to matrices, and to understand the purpose of matrices.

Definition 8.2.1. Let $\mathbb{R}^n$ denote the set of all $n$-dimensional vectors with components in $\mathbb{R}$.
A linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$ is a map $L : \mathbb{R}^n \mapsto \mathbb{R}^m$ such that:

1. $\forall \vec{v}, \vec{w} \in \mathbb{R}^n$, $L(\vec{v} + \vec{w}) = L(\vec{v}) + L(\vec{w})$.

2. $\forall \vec{v} \in \mathbb{R}^n$ and $k \in \mathbb{R}$, $L(k\vec{v}) = kL(\vec{v})$.

The notation $L : \mathbb{R}^n \mapsto \mathbb{R}^m$ describes $L$ as the name of the map, $\mathbb{R}^n$ as the domain, and $\mathbb{R}^m$ as
the codomain. Values in the domain $\mathbb{R}^n$ are mapped, or associated, to values in the codomain $\mathbb{R}^m$.
Note that the codomain is the set consisting of all possible values that can come out of the
function mapping, while the range is the actual set of all values that do come out of the mapping.
Therefore we can say that the range is a subset of the codomain.
As an example, what would $L(\vec{0})$ be? Clearly, $L(\vec{0}) = \vec{0}$; however, the zero vector that is
input to the linear transformation is different from the zero vector that is output. In other
words, the $\vec{0}$ given to $L$ is in $\mathbb{R}^n$ while the $\vec{0}$ that results is in $\mathbb{R}^m$.
Problem 8.2.2. We define the following transformation:

\[ L : \mathbb{R}^2 \mapsto \mathbb{R}^1, \qquad L(\langle a, b \rangle) = a. \]

Is this a linear transformation?

Solution. Recalling the definition, we must show that the transformation satisfies two
properties:

1. $\forall \vec{v}, \vec{w} \in \mathbb{R}^2$, $L(\vec{v} + \vec{w}) = L(\vec{v}) + L(\vec{w})$.

2. $\forall \vec{v} \in \mathbb{R}^2$ and $k \in \mathbb{R}$, $L(k\vec{v}) = kL(\vec{v})$.

Let $\vec{v} = \langle a, b \rangle$ and $\vec{w} = \langle c, d \rangle$. First, note that $L(\langle a, b \rangle) + L(\langle c, d \rangle) = a + c = L(\langle a + c, b + d \rangle)$,
therefore $L(\langle a + c, b + d \rangle) = L(\langle a, b \rangle) + L(\langle c, d \rangle)$, and we have shown that the transformation
satisfies the first property.
Then, note that $L(k \langle a, b \rangle) = L(\langle ka, kb \rangle) = ka = kL(\langle a, b \rangle)$, therefore $L(k \langle a, b \rangle) = kL(\langle a, b \rangle)$,
and our second property is satisfied, therefore the transformation is linear.

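Problem 8.2.3. Determine whether the following transformation is linear or not.

\[ L : \mathbb{R}^2 \mapsto \mathbb{R}^2, \qquad L(\langle a, b \rangle) = \langle 2a - 3b, 3b + 79a \rangle. \]

Solution. First, note that

\begin{align*}
L(\langle a, b \rangle) + L(\langle c, d \rangle) &= \langle 2a - 3b, 3b + 79a \rangle + \langle 2c - 3d, 3d + 79c \rangle \\
&= \langle 2a + 2c - 3b - 3d, 3b + 3d + 79a + 79c \rangle \\
&= \langle 2(a + c) - 3(b + d), 3(b + d) + 79(a + c) \rangle \\
&= L(\langle a + c, b + d \rangle) \\
&= L(\langle a, b \rangle + \langle c, d \rangle),
\end{align*}

which satisfies the first property of a linear transformation. Likewise, we show that the transformation
satisfies the second one as well:

\begin{align*}
kL(\langle a, b \rangle) &= k \langle 2a - 3b, 3b + 79a \rangle \\
&= \langle k(2a - 3b), k(3b + 79a) \rangle \\
&= \langle 2(ka) - 3(kb), 3(kb) + 79(ka) \rangle \\
&= L(\langle ka, kb \rangle) \\
&= L(k \langle a, b \rangle).
\end{align*}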
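A numerical spot-check is no substitute for the proof above, but it is a useful sanity test when guessing whether a map is linear. The sketch below tests both defining properties on random integer inputs; `looks_linear` and its parameters are hypothetical names, and passing the check only suggests linearity, whereas failing it disproves linearity.

```python
import random

def L(v):
    """The transformation from Problem 8.2.3."""
    a, b = v
    return (2 * a - 3 * b, 3 * b + 79 * a)

def add(v, w):
    return tuple(x + y for x, y in zip(v, w))

def scale(k, v):
    return tuple(k * x for x in v)

def looks_linear(f, trials=100, seed=0):
    """Test additivity and homogeneity of f on random integer vectors in R^2."""
    rng = random.Random(seed)
    for _ in range(trials):
        v = (rng.randint(-9, 9), rng.randint(-9, 9))
        w = (rng.randint(-9, 9), rng.randint(-9, 9))
        k = rng.randint(-9, 9)
        if f(add(v, w)) != add(f(v), f(w)):
            return False  # additivity fails
        if f(scale(k, v)) != scale(k, f(v)):
            return False  # homogeneity fails
    return True

print(looks_linear(L))
```

Integer inputs are used so that the comparisons are exact and free of floating-point rounding.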

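Problem 8.2.4. Is this a linear transformation?

\[ \operatorname{proj}_{\vec{w}} : \mathbb{R}^n \mapsto \mathbb{R}^n. \]

Solution. We treat the vector projection function as a mapping from $\mathbb{R}^n$ to itself, since the projection of an
$n$-dimensional vector onto $\vec{w} \in \mathbb{R}^n$ is a scalar multiple of $\vec{w}$. First, we check the first property:

\begin{align*}
\operatorname{proj}_{\vec{w}}(\vec{v}_1 + \vec{v}_2) &= \frac{(\vec{v}_1 + \vec{v}_2) \cdot \vec{w}}{\|\vec{w}\|^2} \vec{w} \\
&= \frac{\vec{v}_1 \cdot \vec{w} + \vec{v}_2 \cdot \vec{w}}{\|\vec{w}\|^2} \vec{w} \\
&= \frac{\vec{v}_1 \cdot \vec{w}}{\|\vec{w}\|^2} \vec{w} + \frac{\vec{v}_2 \cdot \vec{w}}{\|\vec{w}\|^2} \vec{w} \\
&= \operatorname{proj}_{\vec{w}}(\vec{v}_1) + \operatorname{proj}_{\vec{w}}(\vec{v}_2).
\end{align*}

Then we check the second property:

\begin{align*}
\operatorname{proj}_{\vec{w}}(k\vec{v}) &= \frac{k\vec{v} \cdot \vec{w}}{\|\vec{w}\|^2} \vec{w} \\
&= k \left( \frac{\vec{v} \cdot \vec{w}}{\|\vec{w}\|^2} \vec{w} \right) \\
&= k \operatorname{proj}_{\vec{w}}(\vec{v}).
\end{align*}

As it satisfies the two properties, the vector projection is a linear transformation.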

Exercise 8.2.5. Given $L : \mathbb{R}^2 \mapsto \mathbb{R}^2$, $L(\langle a, b \rangle) = \langle a^2, b^2 \rangle$, is it a linear transformation?

Recall that we defined the special unit vectors $\vec{\imath}$, $\vec{\jmath}$, and $\vec{k}$ of magnitude 1, and that we can express
a two-dimensional vector $\langle a, b \rangle$ as $a\vec{\imath} + b\vec{\jmath}$.
Since we are dealing with an arbitrary number of dimensions, we reestablish these definitions
with a more generalized outlook:

Definition 8.2.6. If $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n$ are vectors, a linear combination is any vector of the form
$a_1 \vec{v}_1 + a_2 \vec{v}_2 + \ldots + a_n \vec{v}_n$ for $a_1, a_2, \ldots, a_n \in \mathbb{R}$.

Definition 8.2.7. The standard basis of $\mathbb{R}^n$ will be denoted as:

\begin{align*}
\vec{\imath}_1 &= \langle 1, 0, 0, \ldots, 0 \rangle, \\
\vec{\imath}_2 &= \langle 0, 1, 0, \ldots, 0 \rangle, \\
&\;\;\vdots \\
\vec{\imath}_n &= \langle 0, 0, 0, \ldots, 1 \rangle.
\end{align*}

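Using these two definitions, we can write any vector $\langle a_1, a_2, a_3, \ldots, a_n \rangle$ as a linear combination
of the standard basis vectors:

\[ \langle a_1, a_2, a_3, \ldots, a_n \rangle = \langle a_1, 0, 0, \ldots, 0 \rangle + \langle 0, a_2, 0, \ldots, 0 \rangle + \langle 0, 0, a_3, \ldots, 0 \rangle + \ldots + \langle 0, 0, 0, \ldots, a_n \rangle. \]

We factor out scalars from each vector to get

\[ a_1 \langle 1, 0, 0, \ldots, 0 \rangle + a_2 \langle 0, 1, 0, \ldots, 0 \rangle + a_3 \langle 0, 0, 1, \ldots, 0 \rangle + \ldots + a_n \langle 0, 0, 0, \ldots, 1 \rangle. \]

We conclude that

\[ \langle a_1, a_2, a_3, \ldots, a_n \rangle = a_1 \vec{\imath}_1 + a_2 \vec{\imath}_2 + a_3 \vec{\imath}_3 + \ldots + a_n \vec{\imath}_n. \]

This conclusion is consistent with the remark made earlier that $\langle a, b \rangle = a\vec{\imath} + b\vec{\jmath}$.
For some general linear transformation $L : \mathbb{R}^n \mapsto \mathbb{R}^m$ such that

\begin{align*}
L(\vec{\imath}_1) &= \vec{v}_1, \\
L(\vec{\imath}_2) &= \vec{v}_2, \\
&\;\;\vdots \\
L(\vec{\imath}_n) &= \vec{v}_n,
\end{align*}

we have the linear combination:

\[ L\left( \sum_{k=1}^{n} a_k \vec{\imath}_k \right) = \sum_{k=1}^{n} a_k \vec{v}_k. \]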

We now introduce the following new notation for clarity and consistency later on:

Definition 8.2.8. We can express a vector as a column vector:

\[ \langle a_1, a_2, \ldots, a_n \rangle = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \]

Definition 8.2.9. An $n \times m$ matrix is an array of numbers with $n$ rows and $m$ columns.

For instance, a $3 \times 5$ matrix can be:

\[ \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 6 & 7 & 8 & 9 & 10 \\ 5 & 4 & 3 & 2 & 1 \end{bmatrix} \]

Other examples: $\begin{bmatrix} 117 \end{bmatrix}$ is a $1 \times 1$ matrix, and $\begin{bmatrix} 1 & 1 & 7 \end{bmatrix}$ is a $1 \times 3$ matrix.
We will call each number inside a matrix an entry.

Definition 8.2.10. Let $L : \mathbb{R}^n \mapsto \mathbb{R}^m$ and suppose

\begin{align*}
L(\vec{\imath}_1) &= \vec{v}_1, \\
L(\vec{\imath}_2) &= \vec{v}_2, \\
L(\vec{\imath}_3) &= \vec{v}_3, \\
&\;\;\vdots \\
L(\vec{\imath}_n) &= \vec{v}_n,
\end{align*}
where $\vec{v}_1, \vec{v}_2, \vec{v}_3, \ldots, \vec{v}_n$ are column vectors.
Then the matrix for the linear transformation $L$ will be
\[ \begin{bmatrix} \vec{v}_1 & \vec{v}_2 & \vec{v}_3 & \ldots & \vec{v}_n \end{bmatrix}. \]

This would be an $m \times n$ matrix because each of the column vectors $\vec{v}_1, \vec{v}_2, \vec{v}_3, \ldots, \vec{v}_n$ has $m$
entries, as they are in $\mathbb{R}^m$. Thus, an $m \times n$ matrix represents a transformation $\mathbb{R}^n \mapsto \mathbb{R}^m$.

The application of a linear transformation to a column vector is denoted by the product of the
matrix representing the transformation and that column vector. For instance,
\[ \begin{bmatrix} \vec{v}_1 & \vec{v}_2 & \vec{v}_3 & \ldots & \vec{v}_n \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \]
is the vector that results from applying the linear transformation to the initial vector $\langle a_1, a_2, \ldots, a_n \rangle$.
In other words, for some vector $\vec{v}$, we have
\[ L(\vec{v}) = M\vec{v}, \]
where $M$ is the matrix representing the linear transformation $L$.
Essentially, the matrix serves to represent a transformation in a compact fashion. Later examples
will highlight why the matrix is such a valuable and important tool. Every time we want to talk
about applying a linear transformation to a vector, we don’t want to be dealing with the hassle of
defining some transformation L(ha, bi) in order to convey what the actual transformation is. Instead,
we can simply express our transformation as an organized and condensed matrix.
Let’s put this into practice. Consider Problem 8.2.3 where
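\[ L : \mathbb{R}^2 \mapsto \mathbb{R}^2, \qquad L(\langle a, b \rangle) = \langle 2a - 3b, 3b + 79a \rangle. \]
We have that $L\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) = \begin{bmatrix} 2 \\ 79 \end{bmatrix}$ and $L\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right) = \begin{bmatrix} -3 \\ 3 \end{bmatrix}$. Then, to find the matrix that represents
the overall linear transformation, we put these column vectors together, as such: $\begin{bmatrix} 2 & -3 \\ 79 & 3 \end{bmatrix}$. If we
multiply this matrix by any vector $\begin{bmatrix} a \\ b \end{bmatrix}$, we have
\[ \begin{bmatrix} 2 & -3 \\ 79 & 3 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 2a - 3b \\ 79a + 3b \end{bmatrix}, \]
which is consistent with our definition that $L(\langle a, b \rangle) = \langle 2a - 3b, 3b + 79a \rangle$.
Notice that the top entry of the resulting column vector, $2a - 3b$, is the dot product of $\langle 2, -3 \rangle$
with $\langle a, b \rangle$. Similarly, the bottom entry, $79a + 3b$, is the dot product of $\langle 79, 3 \rangle$ with $\langle a, b \rangle$.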

Problem 8.2.11. Let $L : \mathbb{R}^2 \mapsto \mathbb{R}^3$ satisfy $L(\langle a, b \rangle) = \langle a - 2b, 2a + 3b, -b \rangle$. Determine the matrix
for $L$.

Solution. First, we calculate the linear transformations of the standard basis vectors in $\mathbb{R}^2$:

\begin{align*}
L(\langle 1, 0 \rangle) &= \langle 1, 2, 0 \rangle, \\
L(\langle 0, 1 \rangle) &= \langle -2, 3, -1 \rangle.
\end{align*}

Therefore our matrix is $\begin{bmatrix} 1 & -2 \\ 2 & 3 \\ 0 & -1 \end{bmatrix}$. Given an arbitrary $\mathbb{R}^2$ vector $\begin{bmatrix} a \\ b \end{bmatrix}$, we have a relationship
between the matrix and the vector:
\[ \begin{bmatrix} 1 & -2 \\ 2 & 3 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} a - 2b \\ 2a + 3b \\ -b \end{bmatrix}. \]

Again, notice that the first entry of the resulting column vector, $a - 2b$, is the dot product of
$\langle 1, -2 \rangle$ with $\langle a, b \rangle$.
Confirm for yourself that the rest follow the same rule.

From these past two examples, we can conclude that each entry in the resulting
column vector (that is, the vector that results from applying the linear transformation to the initial
vector) is the dot product of its corresponding row in the matrix with the initial vector.
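The recipe of Definition 8.2.10 (apply $L$ to each standard basis vector and use the results as columns) can be sketched as follows. The function name `matrix_of` and the list-of-rows representation of a matrix are assumptions made for illustration.

```python
def matrix_of(L, n):
    """Matrix (as a list of rows) of a linear map L defined on R^n.

    Column k of the matrix is L applied to the k-th standard basis vector.
    """
    columns = []
    for k in range(n):
        e_k = [1 if i == k else 0 for i in range(n)]  # k-th standard basis vector
        columns.append(list(L(e_k)))
    m = len(columns[0])  # dimension of the codomain
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def L(v):
    """The transformation from Problem 8.2.11."""
    a, b = v
    return (a - 2 * b, 2 * a + 3 * b, -b)

print(matrix_of(L, 2))
```

The result has 3 rows and 2 columns, matching the observation that a map from $\mathbb{R}^2$ to $\mathbb{R}^3$ is represented by a $3 \times 2$ matrix.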

Problem 8.2.12. Calculate the following product:

\[ \begin{bmatrix} 1 & 2 & 3 & 10 \\ 4 & 5 & 6 & 11 \\ 7 & 8 & 9 & \pi \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix}. \]

Solution. From our observation about the relationship between the dot product and the matrix,
it is not hard to see that for each row in the matrix, each number in that row is multiplied by its
corresponding number in the column vector (from top to bottom), then the sum of those products
becomes the corresponding entry in the final vector:
\[ \begin{bmatrix} 1 & 2 & 3 & 10 \\ 4 & 5 & 6 & 11 \\ 7 & 8 & 9 & \pi \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3 + 10 \cdot 4 \\ 4 \cdot 1 + 5 \cdot 2 + 6 \cdot 3 + 11 \cdot 4 \\ 7 \cdot 1 + 8 \cdot 2 + 9 \cdot 3 + \pi \cdot 4 \end{bmatrix} = \begin{bmatrix} 54 \\ 76 \\ 50 + 4\pi \end{bmatrix}. \]
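The row-by-row rule described in this solution is short enough to write out directly. This is a minimal sketch assuming a matrix is stored as a list of rows; `mat_vec` is an illustrative name.

```python
import math

def mat_vec(M, v):
    """Multiply matrix M (list of rows) by column vector v.

    Each output entry is the dot product of one row of M with v.
    """
    if any(len(row) != len(v) for row in M):
        raise ValueError("each row of M must have as many entries as v")
    return [sum(a * x for a, x in zip(row, v)) for row in M]

M = [[1, 2, 3, 10],
     [4, 5, 6, 11],
     [7, 8, 9, math.pi]]
print(mat_vec(M, [1, 2, 3, 4]))
```

The last entry is printed as a decimal approximation of $50 + 4\pi$, since floating-point numbers cannot represent $\pi$ exactly.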

Problem 8.2.13. Evaluate

\[ \begin{bmatrix} 1 & -1 & 2 & 3 \\ 4 & 6 & 1 & 5 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 1 \\ 5 \end{bmatrix}. \]

Definition 8.2.14. Suppose $M$ is an $n \times m$ matrix. Then $M = [a_{ij}]$ denotes all numbers in that
matrix where $i$ goes from 1 to $n$ while $j$ goes from 1 to $m$.

This will help us describe arbitrary matrices of any dimensions without writing out all entries in
a cumbersome fashion.
For example, if we had a $2 \times 3$ matrix $M$, the notation $M = [a_{ij}]$ means that
\[ M = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \]

for arbitrary numbers $a_{ij}$, where $i = 1, 2$, and $j = 1, 2, 3$.

Theorem 8.2.15 (Matrix Addition)
Suppose $M_1$, $M_2$ are matrices of the same dimensions, such that $M_1 = [a_{ij}]$ and $M_2 = [b_{ij}]$. Then,
\[ M_1 + M_2 = [a_{ij} + b_{ij}]. \]

Intuitively, this states that the sum of two linear transformations is also a linear transformation
represented by a matrix. We can obtain each entry of this matrix by adding its corresponding entry
in the matrix representing the first linear transformation to its corresponding entry in the matrix
representing the second linear transformation.

Proof. Let $M_1$, $M_2$ represent two linear transformations. Let $\vec{v} = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_m \end{bmatrix}$. Since we are able to add
functions together, we must have that $M_1\vec{v} + M_2\vec{v} = (M_1 + M_2)\vec{v}$. First, we simplify $M_1\vec{v} + M_2\vec{v}$:
\begin{align*}
M_1\vec{v} + M_2\vec{v} &= \begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1m} \\ a_{21} & a_{22} & \ldots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nm} \end{bmatrix} \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_m \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} & \ldots & b_{1m} \\ b_{21} & b_{22} & \ldots & b_{2m} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \ldots & b_{nm} \end{bmatrix} \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_m \end{bmatrix} \\
&= \begin{bmatrix} a_{11} r_1 + a_{12} r_2 + \ldots + a_{1m} r_m \\ a_{21} r_1 + a_{22} r_2 + \ldots + a_{2m} r_m \\ \vdots \\ a_{n1} r_1 + a_{n2} r_2 + \ldots + a_{nm} r_m \end{bmatrix} + \begin{bmatrix} b_{11} r_1 + b_{12} r_2 + \ldots + b_{1m} r_m \\ b_{21} r_1 + b_{22} r_2 + \ldots + b_{2m} r_m \\ \vdots \\ b_{n1} r_1 + b_{n2} r_2 + \ldots + b_{nm} r_m \end{bmatrix} \\
&= \begin{bmatrix} \sum_{i=1}^{m} (a_{1i} + b_{1i}) r_i \\ \sum_{i=1}^{m} (a_{2i} + b_{2i}) r_i \\ \vdots \\ \sum_{i=1}^{m} (a_{ni} + b_{ni}) r_i \end{bmatrix} \\
&= [a_{ij} + b_{ij}] \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_m \end{bmatrix}.
\end{align*}
Therefore, $M_1 + M_2 = [a_{ij} + b_{ij}]$.

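Problem 8.2.16. Evaluate the following sum:

\[ \begin{bmatrix} 3 & 4 \\ 5 & 6 \end{bmatrix} + \begin{bmatrix} -1 & 8 \\ 16 & -7 \end{bmatrix}. \]

Solution. We simply add each corresponding entry together:

\[ \begin{bmatrix} 3 & 4 \\ 5 & 6 \end{bmatrix} + \begin{bmatrix} -1 & 8 \\ 16 & -7 \end{bmatrix} = \begin{bmatrix} 3 + (-1) & 4 + 8 \\ 5 + 16 & 6 + (-7) \end{bmatrix} = \begin{bmatrix} 2 & 12 \\ 21 & -1 \end{bmatrix}. \]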

Now that we have established the sum of two linear transformations, consider the composition of
those transformations.

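Theorem 8.2.17 (Matrix Multiplication)
Suppose $M_1$ is an $m \times l$ matrix, and $M_2$ is an $n \times m$ matrix, such that $M_1 = [b_{ij}]$ and $M_2 = [a_{ij}]$.
Then, the $n \times l$ matrix $M_2 M_1 = [c_{ij}]$, where $c_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj}$.

Proof. Consider two linear transformations $T_1$, $T_2$, where $T_1 : \mathbb{R}^l \mapsto \mathbb{R}^m$ and $T_2 : \mathbb{R}^m \mapsto \mathbb{R}^n$. Let
matrix $M_1$ correspond to $T_1$ and matrix $M_2$ correspond to $T_2$.
Consider the composition of those two linear transformations:

\[ T_2 \circ T_1 (\vec{v}) = \vec{w}. \]

Then $T_2 \circ T_1 : \mathbb{R}^l \mapsto \mathbb{R}^n$, and we define $M_2 M_1$ to correspond with $T_2 \circ T_1$. As a result of the
composition, we consider the relation

\[ M_2 (M_1\vec{v}) = (M_2 M_1)\vec{v}. \]

Note that when we take the composition of $T_2$ and $T_1$, it is required that the codomain of $T_1$ be contained in the
domain of $T_2$. Then we must have that the product of an $n \times m$ matrix and an $m \times l$ matrix
results in an $n \times l$ matrix.

As stated, we let $M_1 = [b_{ij}]$ and $M_2 = [a_{ij}]$. Furthermore, let $\vec{v} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_l \end{bmatrix}$, where $\vec{v} \in \mathbb{R}^l$. First,
we compute $M_1\vec{v}$, which is a legal operation because $T_1 : \mathbb{R}^l \mapsto \mathbb{R}^m$:
\[ M_1\vec{v} = \begin{bmatrix} b_{11} & b_{12} & \ldots & b_{1l} \\ b_{21} & b_{22} & \ldots & b_{2l} \\ \vdots & \vdots & & \vdots \\ b_{m1} & b_{m2} & \ldots & b_{ml} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_l \end{bmatrix} = \begin{bmatrix} \sum_{k=1}^{l} b_{1k} x_k \\ \sum_{k=1}^{l} b_{2k} x_k \\ \vdots \\ \sum_{k=1}^{l} b_{mk} x_k \end{bmatrix} \]

We then compute $M_2 (M_1\vec{v})$:

\begin{align*}
M_2 (M_1\vec{v}) &= \begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1m} \\ a_{21} & a_{22} & \ldots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nm} \end{bmatrix} \begin{bmatrix} \sum_{k=1}^{l} b_{1k} x_k \\ \sum_{k=1}^{l} b_{2k} x_k \\ \vdots \\ \sum_{k=1}^{l} b_{mk} x_k \end{bmatrix} \\
&= \begin{bmatrix} \sum_{j=1}^{m} a_{1j} \left( \sum_{k=1}^{l} b_{jk} x_k \right) \\ \sum_{j=1}^{m} a_{2j} \left( \sum_{k=1}^{l} b_{jk} x_k \right) \\ \vdots \\ \sum_{j=1}^{m} a_{nj} \left( \sum_{k=1}^{l} b_{jk} x_k \right) \end{bmatrix}
\end{align*}

Consider the entry $\sum_{j=1}^{m} a_{pj} \left( \sum_{k=1}^{l} b_{jk} x_k \right)$, for $p = 1, 2, \ldots, n$. Note that

\begin{align*}
\sum_{j=1}^{m} a_{pj} \left( \sum_{k=1}^{l} b_{jk} x_k \right) &= \sum_{j=1}^{m} \left( a_{pj} \sum_{k=1}^{l} b_{jk} x_k \right) \\
&= \sum_{j=1}^{m} \sum_{k=1}^{l} (a_{pj} b_{jk}) x_k \\
&= \sum_{j=1}^{m} \left( (a_{pj} b_{j1}) x_1 + (a_{pj} b_{j2}) x_2 + \ldots + (a_{pj} b_{jl}) x_l \right) \\
&= x_1 \sum_{k=1}^{m} a_{pk} b_{k1} + x_2 \sum_{k=1}^{m} a_{pk} b_{k2} + \ldots + x_l \sum_{k=1}^{m} a_{pk} b_{kl}.
\end{align*}

Given that $p$ goes from 1 to $n$, we can rewrite the enormous column vector as:
\[ M_2 (M_1\vec{v}) = (M_2 M_1)\vec{v} = \begin{bmatrix} \sum_{k=1}^{m} a_{1k} b_{k1} & \sum_{k=1}^{m} a_{1k} b_{k2} & \ldots & \sum_{k=1}^{m} a_{1k} b_{kl} \\ \sum_{k=1}^{m} a_{2k} b_{k1} & \sum_{k=1}^{m} a_{2k} b_{k2} & \ldots & \sum_{k=1}^{m} a_{2k} b_{kl} \\ \vdots & \vdots & & \vdots \\ \sum_{k=1}^{m} a_{nk} b_{k1} & \sum_{k=1}^{m} a_{nk} b_{k2} & \ldots & \sum_{k=1}^{m} a_{nk} b_{kl} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_l \end{bmatrix} \]

Therefore, $M_2 M_1 = [c_{ij}]$ where $c_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj}$.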
k=1

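Problem 8.2.18. Evaluate the following products, if possible:

1. $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \begin{bmatrix} -1 & 1 \\ -2 & 2 \\ -3 & 3 \end{bmatrix}$

2. $\begin{bmatrix} -1 & 1 \\ -2 & 2 \\ -3 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$

3. $\begin{bmatrix} 1 & 2 \\ -5 & 6 \end{bmatrix} \begin{bmatrix} 7 & 4 \\ 2 & -1 \\ 0 & 3 \end{bmatrix}$

4. $\begin{bmatrix} 7 & 4 \\ 2 & -1 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ -5 & 6 \end{bmatrix}$

Solution.

1. $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \begin{bmatrix} -1 & 1 \\ -2 & 2 \\ -3 & 3 \end{bmatrix} = \begin{bmatrix} 1 \cdot -1 + 2 \cdot -2 + 3 \cdot -3 & 1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3 \\ 4 \cdot -1 + 5 \cdot -2 + 6 \cdot -3 & 4 \cdot 1 + 5 \cdot 2 + 6 \cdot 3 \end{bmatrix} = \begin{bmatrix} -14 & 14 \\ -32 & 32 \end{bmatrix}$.

2. $\begin{bmatrix} -1 & 1 \\ -2 & 2 \\ -3 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} = \begin{bmatrix} -1 \cdot 1 + 1 \cdot 4 & -1 \cdot 2 + 1 \cdot 5 & -1 \cdot 3 + 1 \cdot 6 \\ -2 \cdot 1 + 2 \cdot 4 & -2 \cdot 2 + 2 \cdot 5 & -2 \cdot 3 + 2 \cdot 6 \\ -3 \cdot 1 + 3 \cdot 4 & -3 \cdot 2 + 3 \cdot 5 & -3 \cdot 3 + 3 \cdot 6 \end{bmatrix} = \begin{bmatrix} 3 & 3 & 3 \\ 6 & 6 & 6 \\ 9 & 9 & 9 \end{bmatrix}$.

3. Taking the product $\begin{bmatrix} 1 & 2 \\ -5 & 6 \end{bmatrix} \begin{bmatrix} 7 & 4 \\ 2 & -1 \\ 0 & 3 \end{bmatrix}$ is not possible because each row of the first matrix is 2
entries long, while each column of the second matrix is 3 entries long. We can only take the
product when each row of the first matrix has the same number of entries as each column of
the second matrix.

4. $\begin{bmatrix} 7 & 4 \\ 2 & -1 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ -5 & 6 \end{bmatrix} = \begin{bmatrix} 7 \cdot 1 + 4 \cdot -5 & 7 \cdot 2 + 4 \cdot 6 \\ 2 \cdot 1 + -1 \cdot -5 & 2 \cdot 2 + -1 \cdot 6 \\ 0 \cdot 1 + 3 \cdot -5 & 0 \cdot 2 + 3 \cdot 6 \end{bmatrix} = \begin{bmatrix} -13 & 38 \\ 7 & -2 \\ -15 & 18 \end{bmatrix}$.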
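The entry formula $c_{ij} = \sum_{k} a_{ik} b_{kj}$ from Theorem 8.2.17, together with the dimension requirement seen in part 3, can be sketched as below. The name `mat_mul` and the list-of-rows representation are assumptions for illustration.

```python
def mat_mul(A, B):
    """Product AB of matrices stored as lists of rows.

    Entry (i, j) is the sum over k of A[i][k] * B[k][j]. The product is only
    defined when each row of A is as long as each column of B.
    """
    if any(len(row) != len(B) for row in A):
        raise ValueError("rows of A must be as long as columns of B")
    inner = len(B)
    cols = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(A))]

P = [[1, 2, 3], [4, 5, 6]]
Q = [[-1, 1], [-2, 2], [-3, 3]]
print(mat_mul(P, Q))  # part 1
print(mat_mul(Q, P))  # part 2
```

Calling `mat_mul` on the matrices from part 3 raises a `ValueError`, which mirrors the explanation that the product is not defined there.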

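Recall from the previous chapter that if we rotate a point $(x, y)$ by an angle of $\theta$ counter-clockwise
about the origin, it goes to
\[ (x \cos\theta - y \sin\theta, x \sin\theta + y \cos\theta). \]

We can associate the point $(x, y)$ with a vector whose tail is at the origin and tip at the point
$(x, y)$, which is $\langle x, y \rangle$.

Definition 8.2.19. We define the rotation matrix $R_\theta$ to be the matrix that transforms the vector $\langle x, y \rangle$
into the vector rotated counter-clockwise by angle $\theta$:
\[ \underbrace{\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}}_{R_\theta} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \cos\theta - y \sin\theta \\ x \sin\theta + y \cos\theta \end{bmatrix}. \]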

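Problem 8.2.20. Prove $R_\alpha R_\theta = R_{\alpha + \theta}$.

Proof. We can directly evaluate the product of the matrices, then apply sum and difference formulas
of sine and cosine appropriately.
\begin{align*}
R_\alpha R_\theta &= \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \\
&= \begin{bmatrix} \cos\alpha \cos\theta - \sin\alpha \sin\theta & -\cos\alpha \sin\theta - \sin\alpha \cos\theta \\ \sin\alpha \cos\theta + \cos\alpha \sin\theta & -\sin\alpha \sin\theta + \cos\alpha \cos\theta \end{bmatrix} \\
&= \begin{bmatrix} \cos(\alpha + \theta) & -\sin(\alpha + \theta) \\ \sin(\alpha + \theta) & \cos(\alpha + \theta) \end{bmatrix} \\
&= R_{\alpha + \theta}.
\end{align*}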
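The identity just proved can also be observed numerically for particular angles. The sketch below builds rotation matrices with angles in radians and compares both sides up to floating-point tolerance; the helper names and the sample angles are arbitrary choices made for illustration.

```python
import math

def rotation(theta):
    """2 x 2 rotation matrix for a counter-clockwise rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def matrices_close(A, B, tol=1e-12):
    """True if every pair of corresponding entries differs by at most tol."""
    return all(math.isclose(a, b, abs_tol=tol)
               for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

alpha, theta = 0.7, 1.9
print(matrices_close(mat_mul(rotation(alpha), rotation(theta)), rotation(alpha + theta)))
```

Because both products reduce to $R_{\alpha + \theta}$, rotation matrices also commute with each other, even though matrix multiplication in general does not.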

Now, what if we multiplied a scalar by some matrix? Intuitively, if $M$ is a matrix and $\vec{v}$ is a
vector, then $(kM)\vec{v}$ should equal $k(M\vec{v})$. The only matrix that makes this work for every $\vec{v}$ is
$[k a_{ij}]$, where $M = [a_{ij}]$. For example,
\[ 3 \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 \cdot 3 & 1 \cdot 3 \\ 1 \cdot 3 & 1 \cdot 3 \end{bmatrix} = \begin{bmatrix} 3 & 3 \\ 3 & 3 \end{bmatrix}. \]

Furthermore, a convenient property of matrix multiplication is that it is associative. In other
words, if $M$ is an $a \times b$ matrix, $N$ is a $b \times c$ matrix, and $P$ is a $c \times d$ matrix, then $(MN)P = M(NP)$.
You are welcome to prove this on your own.
One important caveat is that matrix multiplication is not commutative. For
example,
\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 2 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 2 \\ 0 & 0 \end{bmatrix}. \]

However, if we switch the matrices:

\[ \begin{bmatrix} 0 & 2 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \]

which is different from the first product we got. Thus, while matrix multiplication is associative, it is
not commutative.
Moving on, we introduce some new vocabulary to describe special matrices and parts of matrices.

Definition 8.2.21. We define $M_n(\mathbb{R})$ to be the set of $n \times n$ matrices with elements in $\mathbb{R}$. A special
name for an $n \times n$ matrix is a square matrix.

Definition 8.2.22. If $M$ is an $n \times n$ square matrix, i.e. $M = [a_{ij}]$ with $i, j = 1, \ldots, n$, then the main diagonal consists of
the entries of the form $a_{kk}$, where $k = 1, \ldots, n$.

Definition 8.2.23. Consider the linear transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ such that for all elements $\vec{\imath}_k$
in the standard basis, $T(\vec{\imath}_k) = \vec{\imath}_k$. Let $\vec{v}$ be a vector in $\mathbb{R}^n$ which can be written as $\langle a_1, a_2, \ldots, a_n \rangle$.
Using the various properties of a linear transformation as well as the observation that $\vec{v}$ can be
written as a linear combination of the standard basis vectors, we have
\begin{align*}
T(\vec{v}) &= T\left( \sum_{k=1}^{n} a_k \vec{\imath}_k \right) \\
&= \sum_{k=1}^{n} T(a_k \vec{\imath}_k) \\
&= \sum_{k=1}^{n} a_k T(\vec{\imath}_k) \\
&= \sum_{k=1}^{n} a_k \vec{\imath}_k \\
&= \vec{v}.
\end{align*}

Therefore, for all $\vec{v} \in \mathbb{R}^n$, $T(\vec{v}) = \vec{v}$. We call $T$ the identity linear transformation.

Now, we consider the matrix for this transformation. By looking at the column vectors, we can
see that it will look like:
\[ \begin{bmatrix} 1 & 0 & 0 & \ldots & 0 & 0 \\ 0 & 1 & 0 & \ldots & 0 & 0 \\ 0 & 0 & 1 & \ldots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \ldots & 1 & 0 \\ 0 & 0 & 0 & \ldots & 0 & 1 \end{bmatrix} \]

Definition 8.2.24. A matrix of this form is called the $n \times n$ identity matrix, denoted $I_n$. It takes
a vector to itself. All entries in the main diagonal of $I_n$ have the value of 1, and all other entries
have the value of 0.

Problem 8.2.25. Now, let us consider the product $I_n M$ for some $n \times m$ matrix $M$.
\[ \begin{bmatrix} 1 & 0 & \ldots & 0 \\ 0 & 1 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & 1 \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1m} \\ a_{21} & a_{22} & \ldots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nm} \end{bmatrix} \]
Prove $I_n M = M$.

Proof. If we let $M = [a_{ij}]$ and $I_n = [b_{ij}]$, then
\[ I_n M = \left[ \sum_{k=1}^{n} b_{ik} a_{kj} \right]. \]

However, we know that all of the values of $b_{ik}$ are 0 except for entries of the form $b_{ii}$, which are
all 1. Thus, all of the values of $b_{ik} a_{kj}$ are 0, except for $b_{ii} a_{ij} = a_{ij}$. This implies that
\[ \sum_{k=1}^{n} b_{ik} a_{kj} = a_{ij}. \]
Therefore, $I_n M = [a_{ij}] = M$, i.e. $I_n M = M$.

We have demonstrated that not only does $I_n$ represent the identity linear transformation, it also
serves as the identity for matrix multiplication!
Feel free to prove $M I_n = M$ using an analogous line of reasoning as shown above.
We can observe how the identity matrix has appeared in places we have seen before. Consider
$R_0$, the rotation matrix by an angle of 0. It is clear that rotating a vector by 0 will not change the
vector at all. In other words, $R_0$ should be an identity matrix. This can be easily demonstrated:
\[ R_0 = \begin{bmatrix} \cos 0^\circ & -\sin 0^\circ \\ \sin 0^\circ & \cos 0^\circ \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2. \]
As with fields, the existence of an identity means that we should look for inverses as well.
Definition 8.2.26. The inverse of a matrix $M$ is denoted as $M^{-1}$, such that $M^{-1} M = I$, i.e. the
identity matrix.

To get a sense of how matrix inverses can be applied in other parts of mathematics, consider the
following system of equations:
\begin{align*}
2x + 3y &= m, \\
3x + 5y &= n.
\end{align*}

Standard algebraic techniques yield $(x, y) = (5m - 3n, -3m + 2n)$ as the unique solution.
However, we can also express this system of equations as a matrix operation:
\[ \underbrace{\begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}}_{M} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} m \\ n \end{bmatrix} \]

Then our main objective is to find solutions to the equation $M\vec{x} = \vec{v}$, where $\vec{v}$ is a given vector, $\vec{x}$ is the unknown vector,
and $M$ is a matrix of an appropriate dimension. This is where our inverse matrix, denoted as $M^{-1}$,
fulfills its role as a multiplicative inverse in this equation:

\begin{align*}
M^{-1}(M\vec{x}) &= M^{-1}\vec{v} \\
(M^{-1} M)\vec{x} &= M^{-1}\vec{v} \\
I\vec{x} &= M^{-1}\vec{v} \\
\therefore \vec{x} &= M^{-1}\vec{v}.
\end{align*}

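Suppose $\vec{x}, \vec{v}$ are in the same dimension, and $M$ is a square matrix. When does $M^{-1}$ exist?
For now, we will deal with the $2 \times 2$ square matrix. Let $M = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, and $M^{-1} = \begin{bmatrix} x & y \\ z & w \end{bmatrix}$. We
should figure out what $x, y, z, w$ are in terms of $a, b, c, d$.
By our definition of the inverse, we must have:
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x & y \\ z & w \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]

Matrix multiplication yields:
\[
\begin{bmatrix} ax + bz & ay + bw \\ cx + dz & cy + dw \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]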

We end up with a system of four equations:

ax + bz = 1,
ay + bw = 0,
cx + dz = 0,
cy + dw = 1.

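Applying our usual algebraic techniques, we eventually find the solutions to $x, y, z, w$:
\[
M^{-1} = \begin{bmatrix} x & y \\ z & w \end{bmatrix} = \begin{bmatrix} \dfrac{d}{ad-bc} & \dfrac{-b}{ad-bc} \\[2ex] \dfrac{-c}{ad-bc} & \dfrac{a}{ad-bc} \end{bmatrix}.
\]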

Notice that the common denominator is ad − bc. Whether this quantity will equal 0 or not will
determine whether M has an inverse or not. This quantity will serve as an important trait of the
matrix.
 
Definition 8.2.27. Given a $2 \times 2$ matrix $M = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, the determinant of $M$ is denoted as
$\operatorname{Det}(M) = ad - bc$.

Definition 8.2.28. We will define the determinant of any identity matrix to be 1. In other words,
Det(In ) = 1.
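Lastly, we can factor out $\dfrac{1}{ad - bc}$ from $M^{-1}$:
\[
M^{-1} = \begin{bmatrix} \dfrac{d}{ad-bc} & \dfrac{-b}{ad-bc} \\[2ex] \dfrac{-c}{ad-bc} & \dfrac{a}{ad-bc} \end{bmatrix} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.
\]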

We can summarize everything we have just shown:

Theorem 8.2.29
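Given a $2 \times 2$ square matrix $M = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, $M$ has an inverse if and only if $\operatorname{Det}(M) \neq 0$, in which case
\[
M^{-1} = \frac{1}{\operatorname{Det}(M)} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.
\]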

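Going back to our initial example,
\[
\begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} m \\ n \end{bmatrix},
\]
we can apply our formula for a $2 \times 2$ matrix inverse:
\[
\begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}^{-1} = \frac{1}{1} \begin{bmatrix} 5 & -3 \\ -3 & 2 \end{bmatrix} = \begin{bmatrix} 5 & -3 \\ -3 & 2 \end{bmatrix}.
\]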

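Therefore,
\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}^{-1} \begin{bmatrix} m \\ n \end{bmatrix} = \begin{bmatrix} 5 & -3 \\ -3 & 2 \end{bmatrix} \begin{bmatrix} m \\ n \end{bmatrix} = \begin{bmatrix} 5m - 3n \\ -3m + 2n \end{bmatrix},
\]
which is indeed consistent with the solution $(x, y) = (5m - 3n, -3m + 2n)$ that we had found for
the system of equations using algebraic techniques.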
 −1     
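Problem 8.2.30. Compute $\begin{bmatrix} 7 & 5 \\ -1 & 2 \end{bmatrix}^{-1}$. Use this to solve the equation $\begin{bmatrix} 7 & 5 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 7 \\ 11 \end{bmatrix}$.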

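Solution. This is a straight application of our formula:
\[
\begin{bmatrix} 7 & 5 \\ -1 & 2 \end{bmatrix}^{-1} = \frac{1}{19} \begin{bmatrix} 2 & -5 \\ 1 & 7 \end{bmatrix} = \begin{bmatrix} \frac{2}{19} & \frac{-5}{19} \\[1ex] \frac{1}{19} & \frac{7}{19} \end{bmatrix}.
\]

Therefore
\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \frac{2}{19} & \frac{-5}{19} \\[1ex] \frac{1}{19} & \frac{7}{19} \end{bmatrix} \begin{bmatrix} 7 \\ 11 \end{bmatrix} = \begin{bmatrix} \frac{-41}{19} \\[1ex] \frac{84}{19} \end{bmatrix},
\]
i.e. the solution $(x, y) = \left( \frac{-41}{19}, \frac{84}{19} \right)$.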
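
The inverse formula of Theorem 8.2.29 is easy to check by machine. The following is a minimal sketch, not part of the original text: the function names `det2`, `inverse2`, and `apply2` are hypothetical helpers, and exact fractions are used so that results such as 84/19 are not rounded.

```python
from fractions import Fraction

def det2(m):
    """Determinant ad - bc of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

def inverse2(m):
    """Inverse of a 2x2 matrix via (1 / Det(M)) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = m
    det = det2(m)
    if det == 0:
        raise ValueError("Det(M) = 0, so M has no inverse")
    k = Fraction(1) / det
    return [[k * d, -k * b], [-k * c, k * a]]

def apply2(m, v):
    """Product of a 2x2 matrix with a column vector (x, y)."""
    (a, b), (c, d) = m
    x, y = v
    return [a * x + b * y, c * x + d * y]

# Problem 8.2.30: invert the matrix, then solve M (x, y) = (7, 11).
inv = inverse2([[7, 5], [-1, 2]])
print(inv)
print(apply2(inv, [7, 11]))
```

Raising an error when the determinant is 0 mirrors the "if and only if" in the theorem: the formula simply has nothing to divide by in that case.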

Problem 8.2.31. Using the formula for matrix inverse, find $R_{-\theta}$.

Solution. It is clear that $R_{-\theta}$ is the inverse of $R_\theta$, since rotating by $\theta$ and then rotating by $-\theta$
will result in no rotation being done to the initial vector at all.

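Then, we compute the inverse of $R_\theta$:
\begin{align*}
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}^{-1}
&= \frac{1}{\cos^2\theta + \sin^2\theta} \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \\
&= \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \\
&= \begin{bmatrix} \cos(-\theta) & -\sin(-\theta) \\ \sin(-\theta) & \cos(-\theta) \end{bmatrix} \\
&= R_{-\theta}.
\end{align*}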

For the following examples, we are assuming that M is a 2 × 2 square matrix, since we have only
discussed the determinant and inverse of that kind of matrix so far.

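Problem 8.2.32. Let $k \in \mathbb{R}$. Prove $\operatorname{Det}(kM) = k^2 \operatorname{Det}(M)$ for a $2 \times 2$ matrix $M$.

Proof. Let $M = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Then,
\[
\operatorname{Det}(kM) = \operatorname{Det}\left( k \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) = \operatorname{Det} \begin{bmatrix} ka & kb \\ kc & kd \end{bmatrix} = ka \cdot kd - kb \cdot kc = k^2(ad) - k^2(bc) = k^2(ad - bc) = k^2 \operatorname{Det}(M),
\]
and we are done.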

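Problem 8.2.33. Prove $\operatorname{Det}(MN) = \operatorname{Det}(M) \operatorname{Det}(N)$.

Proof. Let $M = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ and $N = \begin{bmatrix} e & f \\ g & h \end{bmatrix}$. Then
\[
\operatorname{Det}(M) \operatorname{Det}(N) = (ad - bc)(eh - fg) = adeh - adfg - bceh + bcfg.
\]
Note that
\[
MN = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{bmatrix},
\]
so $\operatorname{Det}(MN) = (ae + bg)(cf + dh) - (af + bh)(ce + dg)$, which, after expanding and cancelling the terms $acef$ and $bdgh$, leaves us with $adeh - adfg - bceh + bcfg$, which equals
$\operatorname{Det}(M) \operatorname{Det}(N)$, so we are done.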

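Problem 8.2.34. Does $\operatorname{Det}(M + N) = \operatorname{Det}(M) + \operatorname{Det}(N)$?

Solution. Using the same variables for $M$ and $N$ as in the previous example, we have that $M + N = \begin{bmatrix} a+e & b+f \\ c+g & d+h \end{bmatrix}$, so
\[
\operatorname{Det}(M + N) = (a+e)(d+h) - (b+f)(c+g) = (ad + ah + de + eh) - (bc + bg + cf + fg),
\]
which in general does not equal $\operatorname{Det}(M) + \operatorname{Det}(N) = ad - bc + eh - fg$. For example, $M = N = I_2$ gives $\operatorname{Det}(M + N) = 4$ but $\operatorname{Det}(M) + \operatorname{Det}(N) = 2$. So $\operatorname{Det}(M + N) \neq \operatorname{Det}(M) + \operatorname{Det}(N)$ in general.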

Problem 8.2.35. Prove $\operatorname{Det}(M^{-1}) = (\operatorname{Det}(M))^{-1}$.

We could approach this proof algebraically just like our previous examples. However, we can
proceed with a much cleaner proof by using previously proven results:

Proof. As proven in Problem 8.2.33, $\operatorname{Det}(M^{-1} M) = \operatorname{Det}(M^{-1}) \operatorname{Det}(M)$. But recall that
$M^{-1} M = I$, so we have $\operatorname{Det}(I) = \operatorname{Det}(M^{-1}) \operatorname{Det}(M) \implies \operatorname{Det}(M^{-1}) \operatorname{Det}(M) = 1$, and therefore
$\operatorname{Det}(M^{-1}) = (\operatorname{Det}(M))^{-1}$.
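
The determinant identities from Problems 8.2.32 through 8.2.34 can be spot-checked on random integer matrices. This is only an illustrative sketch (the helper names are hypothetical, and a finite number of random trials is evidence, not a proof):

```python
import random

def det2(m):
    """Determinant ad - bc of a 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

def matmul2(m, n):
    """Product of two 2x2 matrices."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale2(k, m):
    """Scalar multiple k * M."""
    return [[k * entry for entry in row] for row in m]

def add2(m, n):
    """Entrywise sum M + N."""
    return [[m[i][j] + n[i][j] for j in range(2)] for i in range(2)]

def random_matrix():
    return [[random.randint(-9, 9) for _ in range(2)] for _ in range(2)]

random.seed(0)
for _ in range(1000):
    M, N, k = random_matrix(), random_matrix(), random.randint(-9, 9)
    assert det2(matmul2(M, N)) == det2(M) * det2(N)   # Problem 8.2.33
    assert det2(scale2(k, M)) == k * k * det2(M)      # Problem 8.2.32

# Problem 8.2.34: the determinant is not additive.
I2 = [[1, 0], [0, 1]]
print(det2(add2(I2, I2)), det2(I2) + det2(I2))
```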

Now that we have dealt with rotation matrices, we can also consider the transformation of
reflection. We seek to reflect a vector over some line. To get a notion of how we should define this,
take a line through the origin:
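[Figure: a line through the origin making an angle $\theta$ with the x-axis; a vector $\vec{v} = \langle x, y \rangle$ at angle $\theta - \alpha$ and its reflection at angle $\theta + \alpha$.]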

Let the angle of line $l$ with respect to the x-axis be $\theta$. Let $T_\theta$ be the linear transformation that
reflects a vector across the line making an angle of $\theta$ with the x-axis.
Let the angle of the gap between the vector and the line be $\alpha$. The transformation will take a
vector making an angle of $\theta - \alpha$ to one making an angle of $\theta + \alpha$ (since reflection copies the angle).
We wish to express the initial vector as $\langle \cos\beta, \sin\beta \rangle$, so let $\beta = \theta - \alpha$, which implies $\theta + \alpha = 2\theta - \beta$.
Therefore, the vector $\langle \cos\beta, \sin\beta \rangle$ will reflect to the vector $\langle \cos(2\theta - \beta), \sin(2\theta - \beta) \rangle$ under $T_\theta$.
We can find the matrix that corresponds to $T_\theta$; denote it $H_\theta$. Under our definition, we must
have:
\[
H_\theta \cdot \begin{bmatrix} \cos\beta \\ \sin\beta \end{bmatrix} = \begin{bmatrix} \cos(2\theta - \beta) \\ \sin(2\theta - \beta) \end{bmatrix}.
\]

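But note that
\[
\begin{bmatrix} \cos(2\theta - \beta) \\ \sin(2\theta - \beta) \end{bmatrix} = \begin{bmatrix} \cos 2\theta \cos\beta + \sin 2\theta \sin\beta \\ \sin 2\theta \cos\beta - \cos 2\theta \sin\beta \end{bmatrix} = \begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix} \begin{bmatrix} \cos\beta \\ \sin\beta \end{bmatrix}.
\]

Therefore, we have our reflection matrix:
\[
H_\theta = \begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix}.
\]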

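Definition 8.2.36. The reflection matrix, denoted by $H_\theta$, reflects a vector over a line given the
angle $\theta$ between the line and the x-axis.
\[
\underbrace{\begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix}}_{H_\theta} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \cos 2\theta + y \sin 2\theta \\ x \sin 2\theta - y \cos 2\theta \end{bmatrix}.
\]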


Problem 8.2.37. Prove that the product of a reflection matrix and a rotation matrix is a reflection
matrix.

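Proof. Since matrix multiplication is not commutative, we must consider both orders: a reflection
matrix times a rotation matrix, and a rotation matrix times a reflection matrix. Recall that $H_\alpha$ has entries $\pm\cos 2\alpha$ and $\sin 2\alpha$, so a matrix of that shape built from the angle $2\varphi \mp \theta$ is the reflection matrix $H_{\varphi \mp \theta/2}$.
\begin{align*}
H_\varphi R_\theta &= \begin{bmatrix} \cos 2\varphi & \sin 2\varphi \\ \sin 2\varphi & -\cos 2\varphi \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \\
&= \begin{bmatrix} \cos 2\varphi \cos\theta + \sin 2\varphi \sin\theta & -\cos 2\varphi \sin\theta + \sin 2\varphi \cos\theta \\ \sin 2\varphi \cos\theta - \cos 2\varphi \sin\theta & -\sin\theta \sin 2\varphi - \cos 2\varphi \cos\theta \end{bmatrix} \\
&= \begin{bmatrix} \cos(2\varphi - \theta) & \sin(2\varphi - \theta) \\ \sin(2\varphi - \theta) & -\cos(2\varphi - \theta) \end{bmatrix} \\
&= H_{\varphi - \theta/2}.
\end{align*}
\begin{align*}
R_\theta H_\varphi &= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \cos 2\varphi & \sin 2\varphi \\ \sin 2\varphi & -\cos 2\varphi \end{bmatrix} \\
&= \begin{bmatrix} \cos 2\varphi \cos\theta - \sin 2\varphi \sin\theta & \cos\theta \sin 2\varphi + \sin\theta \cos 2\varphi \\ \sin\theta \cos 2\varphi + \cos\theta \sin 2\varphi & \sin\theta \sin 2\varphi - \cos\theta \cos 2\varphi \end{bmatrix} \\
&= \begin{bmatrix} \cos(2\varphi + \theta) & \sin(2\varphi + \theta) \\ \sin(2\varphi + \theta) & -\cos(2\varphi + \theta) \end{bmatrix} \\
&= H_{\varphi + \theta/2}.
\end{align*}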
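
A quick numerical check of both products is sketched below. The helper names are hypothetical, the angles are arbitrary test values, and floating-point comparison uses a small tolerance. Because $H_\alpha$ is built from the doubled angle $2\alpha$, the product with a rotation by $\theta$ is compared against the reflection over the line at angle $\varphi \mp \theta/2$.

```python
import math

def rotation(theta):
    """Rotation matrix R_theta."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

def reflection(theta):
    """Reflection matrix H_theta over the line at angle theta to the x-axis."""
    return [[math.cos(2 * theta), math.sin(2 * theta)],
            [math.sin(2 * theta), -math.cos(2 * theta)]]

def matmul2(m, n):
    """Product of two 2x2 matrices."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(m, n, tol=1e-12):
    """True when two 2x2 matrices agree entrywise up to tol."""
    return all(abs(m[i][j] - n[i][j]) < tol for i in range(2) for j in range(2))

phi, theta = 0.7, 1.9  # arbitrary test angles in radians
print(close(matmul2(reflection(phi), rotation(theta)), reflection(phi - theta / 2)))
print(close(matmul2(rotation(theta), reflection(phi)), reflection(phi + theta / 2)))
```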
   
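Problem 8.2.38. Compute $H_{\pi/4} \cdot \begin{bmatrix} x \\ y \end{bmatrix}$ and $R_{\pi/2} \cdot \begin{bmatrix} x \\ y \end{bmatrix}$.

Solution.
\begin{align*}
H_{\pi/4} \cdot \begin{bmatrix} x \\ y \end{bmatrix} &= \begin{bmatrix} \cos\frac{\pi}{2} & \sin\frac{\pi}{2} \\ \sin\frac{\pi}{2} & -\cos\frac{\pi}{2} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} y \\ x \end{bmatrix} \\
R_{\pi/2} \cdot \begin{bmatrix} x \\ y \end{bmatrix} &= \begin{bmatrix} \cos\frac{\pi}{2} & -\sin\frac{\pi}{2} \\ \sin\frac{\pi}{2} & \cos\frac{\pi}{2} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -y \\ x \end{bmatrix}
\end{align*}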
 
Exercise 8.2.39. Describe what the matrix $\begin{bmatrix} 5 & 0 \\ 0 & 6 \end{bmatrix}$ does to a vector.
Exercise 8.2.40. Find a matrix that takes each vector ~v to 2~v .
 
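Problem 8.2.41. Rotate the set of all points of the form $\begin{bmatrix} t \\ t^2 \end{bmatrix}$ by $\frac{\pi}{4}$ radians counter-clockwise, then
find the Cartesian equation which represents the set of those rotated points.

Solution. We have,
\[
R_{\pi/4} \begin{bmatrix} t \\ t^2 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\[1ex] \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix} \begin{bmatrix} t \\ t^2 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2} t - \frac{\sqrt{2}}{2} t^2 \\[1ex] \frac{\sqrt{2}}{2} t + \frac{\sqrt{2}}{2} t^2 \end{bmatrix}.
\]
We now have the parametrization $x = \frac{\sqrt{2}}{2} t - \frac{\sqrt{2}}{2} t^2$, $y = \frac{\sqrt{2}}{2} t + \frac{\sqrt{2}}{2} t^2$. Adding these equations gives $x + y = \sqrt{2}\, t$, and subtracting them gives $y - x = \sqrt{2}\, t^2$. Eliminating $t$ and simplifying, we get the conic $x^2 + 2xy + y^2 + \sqrt{2}\, x - \sqrt{2}\, y = 0$.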

The objective of this problem is to demonstrate that we can use rotation matrices to rotate graphs.
In that problem, we expressed the initial Cartesian equation $y = x^2$ as a parametrization,
which we then rotated by $\frac{\pi}{4}$ radians counter-clockwise, resulting in the oblique parabola
$x^2 + 2xy + y^2 + \sqrt{2}\, x - \sqrt{2}\, y = 0$.
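Problem 8.2.42. Use a rotation matrix to rotate $x^2 - y^2 = 1$ by $\frac{\pi}{4}$ radians counter-clockwise.

Solution. Recall the identity $\sec^2\theta = 1 + \tan^2\theta$, from which we can deduce that a parametrization
of this equation is $x = \sec t$, $y = \tan t$. We proceed with our transformation:
\[
R_{\pi/4} \begin{bmatrix} \sec t \\ \tan t \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\[1ex] \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix} \begin{bmatrix} \sec t \\ \tan t \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2} \sec t - \frac{\sqrt{2}}{2} \tan t \\[1ex] \frac{\sqrt{2}}{2} \sec t + \frac{\sqrt{2}}{2} \tan t \end{bmatrix}.
\]

We have the parametric equations $x = \frac{\sqrt{2}}{2} \sec t - \frac{\sqrt{2}}{2} \tan t$ and $y = \frac{\sqrt{2}}{2} \sec t + \frac{\sqrt{2}}{2} \tan t$. A
particularly clean method of finding the equation in terms of $x$ and $y$ is to multiply $x$ and $y$ together,
resulting in a difference of squares, which ultimately simplifies to $xy = \frac{1}{2}$.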
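
Both rotated curves can be checked numerically by rotating sample points and substituting them into the claimed Cartesian equations. This is a small sketch with hypothetical helper names; the sample values of $t$ are arbitrary (chosen so that $\cos t \neq 0$), and a residual near 0 means the rotated point lies on the curve.

```python
import math

def rotate(theta, point):
    """Apply the rotation matrix R_theta to the point (x, y)."""
    x, y = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

def parabola_residual(t):
    """Rotate (t, t^2) by pi/4 and plug into x^2 + 2xy + y^2 + sqrt(2)x - sqrt(2)y."""
    x, y = rotate(math.pi / 4, (t, t * t))
    s = math.sqrt(2)
    return x * x + 2 * x * y + y * y + s * x - s * y

def hyperbola_residual(t):
    """Rotate (sec t, tan t) by pi/4 and plug into xy - 1/2."""
    x, y = rotate(math.pi / 4, (1 / math.cos(t), math.tan(t)))
    return x * y - 0.5

for t in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    print(t, parabola_residual(t), hyperbola_residual(t))
```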
2

We have been going over various transformations, but there is a special situation that we should
address. After applying some linear transformation, what if the vector only changes by a scalar
factor?
Definition 8.2.43. A matrix $M$ has a non-zero eigenvector $\vec{v}$ with scalar eigenvalue $\lambda$ if
\[
M\vec{v} = \lambda\vec{v}.
\]

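Let's investigate this definition with some examples:
\[
\begin{bmatrix} 5 & 0 \\ 0 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 5 \\ 0 \end{bmatrix} = \overbrace{5}^{\text{eigenvalue}} \cdot \underbrace{\begin{bmatrix} 1 \\ 0 \end{bmatrix}}_{\text{eigenvector}}
\]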

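When we multiply the eigenvector $\langle 1, 0 \rangle$ by a scalar of 5, the resulting vector is equivalent to
taking the matrix transformation of that vector! In other words, the direction of the vector remains
the same after a transformation is applied.
Here are some more examples of eigenvectors with their eigenvalues for the given matrix above:
\begin{align*}
\begin{bmatrix} 5 & 0 \\ 0 & 6 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} &= \begin{bmatrix} 0 \\ 6 \end{bmatrix} = 6 \cdot \begin{bmatrix} 0 \\ 1 \end{bmatrix} \\
\begin{bmatrix} 5 & 0 \\ 0 & 6 \end{bmatrix} \begin{bmatrix} k \\ 0 \end{bmatrix} &= \begin{bmatrix} 5k \\ 0 \end{bmatrix} = 5 \cdot \begin{bmatrix} k \\ 0 \end{bmatrix}
\end{align*}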

According to our definition, an eigenvector must be non-zero. For which values of $\lambda$ does such a non-zero vector exist? Consider our equation,
\[
M\vec{v} = \lambda\vec{v}.
\]

This can be rewritten as $M\vec{v} = \lambda I\vec{v}$, i.e. $(M - \lambda I)\vec{v} = \vec{0}$.

For convenience, let $N = M - \lambda I$, so we have $N\vec{v} = \vec{0}$. Assume $N$ has an inverse. Then we can
multiply both sides of the equation by that: $N^{-1}(N\vec{v}) = N^{-1}\vec{0}$, i.e. $N^{-1}(N\vec{v}) = \vec{0}$. This rearranges
to $(N^{-1}N)\vec{v} = \vec{0} \implies I\vec{v} = \vec{0} \implies \vec{v} = \vec{0}$, which contradicts our definition.
Therefore, $N = M - \lambda I$ does not have an inverse. In order for a $\vec{v} \neq \vec{0}$ to exist, we would need
$M - \lambda I$ not to have an inverse, which happens iff $\operatorname{Det}(M - \lambda I) = 0$.
By evaluating the determinant of $M - \lambda I$, we can determine which values of $\lambda$ (i.e. eigenvalues)
force $M - \lambda I$ not to have an inverse and therefore ensure that an eigenvector exists for that value
of $\lambda$.

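Example 8.2.44
Find all eigenvectors and eigenvalues for the matrix $\begin{bmatrix} 5 & 0 \\ 0 & 6 \end{bmatrix}$.

Solution. Note that $M - \lambda I = \begin{bmatrix} 5 - \lambda & 0 \\ 0 & 6 - \lambda \end{bmatrix}$, therefore $\operatorname{Det}(M - \lambda I) = (5 - \lambda)(6 - \lambda)$, so the
values of $\lambda$ which force $\operatorname{Det}(M - \lambda I)$ to equal 0 are $\lambda = 5, 6$. We consider each value separately: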

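• Consider $\lambda = 5$. As $M\vec{v} = \lambda\vec{v}$, we have:
\begin{align*}
M\vec{v} &= \begin{bmatrix} 5 & 0 \\ 0 & 6 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 5x \\ 6y \end{bmatrix} \\
\lambda\vec{v} &= 5 \cdot \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 5x \\ 5y \end{bmatrix}
\end{align*}
We get $\begin{bmatrix} 5x \\ 6y \end{bmatrix} = \begin{bmatrix} 5x \\ 5y \end{bmatrix}$, therefore $y = 0$ for any value of $x$.
So our eigenvectors of $\lambda = 5$ are of the form $\begin{bmatrix} x \\ 0 \end{bmatrix} = x \begin{bmatrix} 1 \\ 0 \end{bmatrix}$.

• If $\lambda = 6$, we can follow a similar process: $M\vec{v} = \begin{bmatrix} 5 & 0 \\ 0 & 6 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 5x \\ 6y \end{bmatrix}$ and $\lambda\vec{v} = 6 \cdot \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 6x \\ 6y \end{bmatrix}$,
so $\begin{bmatrix} 5x \\ 6y \end{bmatrix} = \begin{bmatrix} 6x \\ 6y \end{bmatrix}$, implying $x = 0$ for any value of $y$.
Our eigenvectors of $\lambda = 6$ are of the form $\begin{bmatrix} 0 \\ y \end{bmatrix} = y \begin{bmatrix} 0 \\ 1 \end{bmatrix}$.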
 
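Problem 8.2.45. Find eigenvectors and eigenvalues for $M = \begin{bmatrix} 2 & -1 \\ -4 & 5 \end{bmatrix}$.

Solution. Note that
\[
\operatorname{Det}(M - \lambda I) = \operatorname{Det} \begin{bmatrix} 2 - \lambda & -1 \\ -4 & 5 - \lambda \end{bmatrix} = (2 - \lambda)(5 - \lambda) - (-1)(-4) = \lambda^2 - 7\lambda + 6 = (\lambda - 1)(\lambda - 6).
\]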

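• Case $\lambda = 1$:
\[
\begin{bmatrix} 2 & -1 \\ -4 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2x - y \\ -4x + 5y \end{bmatrix} = 1 \cdot \begin{bmatrix} x \\ y \end{bmatrix}
\]
leaving us with the system of equations $2x - y = x$, $-4x + 5y = y$, whose solution is $y = x$.
Eigenvectors of $\lambda = 1$ are of the form $\begin{bmatrix} x \\ x \end{bmatrix}$, i.e. scalar multiples of $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$.

• Case $\lambda = 6$:
\[
\begin{bmatrix} 2 & -1 \\ -4 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2x - y \\ -4x + 5y \end{bmatrix} = 6 \cdot \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 6x \\ 6y \end{bmatrix}
\]
The system of equations is $2x - y = 6x$ and $-4x + 5y = 6y$, so the solution is $y = -4x$.
Eigenvectors of $\lambda = 6$ are of the form $\begin{bmatrix} x \\ -4x \end{bmatrix}$, i.e. scalar multiples of $\begin{bmatrix} 1 \\ -4 \end{bmatrix}$.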
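
For a $2 \times 2$ matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$, expanding $\operatorname{Det}(M - \lambda I)$ always gives the quadratic $\lambda^2 - (a + d)\lambda + (ad - bc)$, so the eigenvalues can be computed with the quadratic formula. The sketch below is an illustration with hypothetical function names, not the text's own method; it uses `cmath` so that complex eigenvalues are handled as well.

```python
import cmath

def eigenvalues2(m):
    """Roots of lambda^2 - (a + d) lambda + (ad - bc) = 0 for [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace - root) / 2, (trace + root) / 2)

def is_eigenpair(m, lam, v, tol=1e-9):
    """Check M v = lam v for a non-zero vector v = (x, y)."""
    (a, b), (c, d) = m
    x, y = v
    if x == 0 and y == 0:
        return False  # the zero vector is excluded by definition
    return (abs(a * x + b * y - lam * x) < tol and
            abs(c * x + d * y - lam * y) < tol)

M = [[2, -1], [-4, 5]]          # the matrix from Problem 8.2.45
print(eigenvalues2(M))
print(is_eigenpair(M, 1, (1, 1)), is_eigenpair(M, 6, (1, -4)))
```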
 
Exercise 8.2.46. Find eigenvectors and eigenvalues of the matrix $\begin{bmatrix} 2 & 1 \\ -4 & -3 \end{bmatrix}$.

Exercise 8.2.47. Find eigenvectors and eigenvalues of the matrix $\begin{bmatrix} 1 & 3 \\ -4 & -6 \end{bmatrix}$.

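Problem 8.2.48. Find eigenvalues and eigenvectors for $R_\theta$.

Solution. The determinant of $R_\theta - \lambda I = \begin{bmatrix} \cos\theta - \lambda & -\sin\theta \\ \sin\theta & \cos\theta - \lambda \end{bmatrix}$ must equal 0, i.e. $(\cos\theta - \lambda)^2 + \sin^2\theta = 0$. This rearranges to $\cos\theta - \lambda = \pm\sqrt{-\sin^2\theta} \implies \lambda = \cos\theta \pm i\sin\theta = \operatorname{cis}(\pm\theta)$.

• Case $\lambda = \cos\theta + i\sin\theta$:
\[
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = (\cos\theta + i\sin\theta) \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x\cos\theta + xi\sin\theta \\ y\cos\theta + yi\sin\theta \end{bmatrix}
\]
This results in a system of equations:
\begin{align*}
x\cos\theta - y\sin\theta &= x\cos\theta + xi\sin\theta, \\
x\sin\theta + y\cos\theta &= y\cos\theta + yi\sin\theta.
\end{align*}
After some simplification (assuming $\sin\theta \neq 0$), we have $y = -ix$ and $x = yi$. We can conclude that all eigenvectors
are of the form $\begin{bmatrix} x \\ -xi \end{bmatrix}$, i.e. scalar multiples of $\begin{bmatrix} 1 \\ -i \end{bmatrix}$.

• Case $\lambda = \cos\theta - i\sin\theta$:
\[
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = (\cos\theta - i\sin\theta) \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x\cos\theta - xi\sin\theta \\ y\cos\theta - yi\sin\theta \end{bmatrix}
\]
We similarly solve a system of equations to get $y = xi$, $x = -yi$, implying that our eigenvectors
are of the form $\begin{bmatrix} x \\ xi \end{bmatrix}$, i.e. scalar multiples of $\begin{bmatrix} 1 \\ i \end{bmatrix}$.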

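Problem 8.2.49. Find eigenvalues and eigenvectors of $H_\theta$.

Solution. The determinant of $H_\theta - \lambda I = \begin{bmatrix} \cos 2\theta - \lambda & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta - \lambda \end{bmatrix}$ must equal zero, i.e. $-(\cos 2\theta + \lambda)(\cos 2\theta - \lambda) - \sin^2 2\theta = 0$, which rearranges to $\cos^2 2\theta - \lambda^2 + \sin^2 2\theta = 0$, giving solutions $\lambda = \pm 1$.

• Case $\lambda = 1$:
\[
\begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix}
\]
This gives the system of equations
\begin{align*}
x\cos 2\theta + y\sin 2\theta &= x, \\
x\sin 2\theta - y\cos 2\theta &= y.
\end{align*}
After simplification, we end up with $y = \dfrac{1 - \cos 2\theta}{\sin 2\theta}\, x$, which resembles the half-angle formula
for tangent, from which we can deduce that $y = x\tan\theta$, so our eigenvectors are of the form
$\begin{bmatrix} x \\ x\tan\theta \end{bmatrix}$, i.e. $x \begin{bmatrix} 1 \\ \tan\theta \end{bmatrix}$. As $x$ is a scalar which ranges over all of $\mathbb{R}$, we are allowed to make the
substitution $x \to x\cos\theta$, so we can express the form of the eigenvector as $x \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}$, i.e. all
scalar multiples of $\begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}$.

• Case $\lambda = -1$:
\[
\begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -x \\ -y \end{bmatrix}
\]
Similarly, we solve a system of equations to get the solution $y = -\dfrac{1 + \cos 2\theta}{\sin 2\theta}\, x = -x\cot\theta$, so
our eigenvectors can be expressed in the form $x \begin{bmatrix} 1 \\ -\cot\theta \end{bmatrix}$, and using the valid substitution
$x \to x\sin\theta$, we have a cleaner form $x \begin{bmatrix} \sin\theta \\ -\cos\theta \end{bmatrix}$, i.e. all scalar multiples of $\begin{bmatrix} \sin\theta \\ -\cos\theta \end{bmatrix}$.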
Definition 8.2.50. A set of vectors $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n$ is linearly independent if whenever $a_1\vec{v}_1 + a_2\vec{v}_2 + \cdots + a_n\vec{v}_n = \vec{0}$, then $a_1 = a_2 = \cdots = a_n = 0$.
Otherwise, the set of vectors is linearly dependent.
   
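Problem 8.2.51. Determine if the vectors $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$, $\begin{bmatrix} 4 \\ -1 \end{bmatrix}$ are linearly independent or not.

Solution. Consider scalars $a, b \in \mathbb{R}$ such that
\[
a \begin{bmatrix} 1 \\ -1 \end{bmatrix} + b \begin{bmatrix} 4 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]

We have the system of equations $a + 4b = 0$ and $-a - b = 0$, which imply $a = b = 0$, so these
vectors are indeed linearly independent.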
   
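Problem 8.2.52. Are the vectors $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ 4 \end{bmatrix}$ linearly independent?

Solution. Note that
\[
-2 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + 1 \begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]
Since the coefficients $-2$ and $1$ are not zero, $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ 4 \end{bmatrix}$ are linearly dependent.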
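
For two vectors in $\mathbb{R}^2$ there is a quick test, stated here as a known fact rather than something proven in this passage: $\langle a, b \rangle$ and $\langle c, d \rangle$ are linearly independent exactly when $ad - bc \neq 0$. A minimal sketch with a hypothetical function name, applied to the two problems above:

```python
def independent2(v, w):
    """True when the 2D vectors v and w are linearly independent.

    Uses the determinant of the matrix whose columns are v and w:
    the vectors are independent exactly when ad - bc is non-zero.
    """
    a, b = v
    c, d = w
    return a * d - b * c != 0

print(independent2((1, -1), (4, -1)))  # vectors from Problem 8.2.51
print(independent2((1, 2), (2, 4)))    # vectors from Problem 8.2.52
```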

Exercise 8.2.53. Prove that $\vec{v}, \vec{w}$ are linearly dependent if and only if one is a scalar multiple of
the other.

Theorem 8.2.54
Any three vectors in R2 are linearly dependent.

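Proof. Let our three vectors be
\[
\begin{bmatrix} a \\ b \end{bmatrix}, \quad \begin{bmatrix} c \\ d \end{bmatrix}, \quad \begin{bmatrix} e \\ f \end{bmatrix}.
\]

Suppose we try to solve the equation
\[
x \begin{bmatrix} a \\ b \end{bmatrix} + y \begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} e \\ f \end{bmatrix}.
\]

Note that this equation can be rewritten as
\[
\begin{bmatrix} a & c \\ b & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} e \\ f \end{bmatrix}.
\]

There exists a solution to this whenever the matrix $\begin{bmatrix} a & c \\ b & d \end{bmatrix}$ has an inverse, i.e. the determinant
$ad - bc \neq 0$. If we have a solution to this equation, then we're done, since the sum of some scalar
multiples of the first two vectors will indeed result in the third vector.
Otherwise, assume that $ad - bc = 0$. Without loss of generality, let $a \neq 0$. Then we have $ad = bc$,
so $d = \dfrac{bc}{a}$, so our first two vectors are
\[
\begin{bmatrix} a \\ b \end{bmatrix}, \quad \begin{bmatrix} c \\ \frac{bc}{a} \end{bmatrix}.
\]

But note that
\[
\frac{c}{a} \begin{bmatrix} a \\ b \end{bmatrix} + (-1) \begin{bmatrix} c \\ \frac{bc}{a} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]

We have shown that the three vectors are linearly dependent in either case, and so we are
done.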

It turns out that this theorem is true in general (i.e. that any $n + 1$ vectors in $\mathbb{R}^n$ are linearly
dependent).
Any set of $n$ linearly independent vectors in $\mathbb{R}^n$ is called a basis for $\mathbb{R}^n$. This is a generalization of the
standard basis vectors that we had defined earlier in the chapter.
Recall that we were able to express any vector as a linear combination of the standard basis
vectors: $\langle a_1, a_2, a_3, \ldots, a_n \rangle = a_1\vec{\imath}_1 + a_2\vec{\imath}_2 + a_3\vec{\imath}_3 + \cdots + a_n\vec{\imath}_n$.
This can be extended to any basis in general; any vector in $\mathbb{R}^n$ can be written as a linear
combination of vectors in the basis.
     
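Problem 8.2.55. Why can't $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, and $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$ be a basis in $\mathbb{R}^3$?

Solution. First, it is easy to observe that $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$, so they are not linearly independent.
Furthermore, we cannot express the vector $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ as any linear combination of these vectors
because their third entries are all 0. This violates the requirement that any vector in $\mathbb{R}^3$ can be written as a
linear combination of vectors in the basis for $\mathbb{R}^3$.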

We shift our discussion of matrices back to the determinant. It turns out that it has far greater
significance than one may initially conceive. Consider a triangle made up of vectors:
[Figure: a triangle formed by the vectors $\vec{v} = \langle a, b \rangle$ and $\vec{w} = \langle c, d \rangle$, with angle $\theta$ between them.]

Theorem 8.2.56
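The area of a triangle defined by the vectors $\vec{v} = \langle a, b \rangle$ and $\vec{w} = \langle c, d \rangle$ is:
\[
\frac{1}{2} |ad - bc| = \frac{1}{2} \left| \operatorname{Det} \begin{bmatrix} a & c \\ b & d \end{bmatrix} \right|.
\]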

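Proof. Let $\theta$ be the angle between $\vec{v}$ and $\vec{w}$. Then $\cos\theta = \dfrac{\vec{v} \cdot \vec{w}}{\|\vec{v}\| \|\vec{w}\|}$. From this, we can derive that
$\sin\theta = \sqrt{1 - \left( \dfrac{\vec{v} \cdot \vec{w}}{\|\vec{v}\| \|\vec{w}\|} \right)^2}$. By Theorem 6.6.1, we can apply the area formula $\frac{1}{2} ab \sin C$ (for any
two sides $a, b$ with included angle $C$) to our vectors which make up the triangle to get our desired
expression. Instead of sides $a$ and $b$, we have $\|\vec{v}\|$ and $\|\vec{w}\|$, and we replace $C$ by $\theta$ in the context of this
proof.
\begin{align*}
\text{Area} &= \frac{1}{2} \|\vec{v}\| \|\vec{w}\| \sin\theta \\
&= \frac{1}{2} \sqrt{\|\vec{v}\|^2 \|\vec{w}\|^2 - (\vec{v} \cdot \vec{w})^2} \\
&= \frac{1}{2} \sqrt{(a^2 + b^2)(c^2 + d^2) - (ac + bd)^2} \\
&= \frac{1}{2} \sqrt{(ac)^2 + (ad)^2 + (bc)^2 + (bd)^2 - (ac)^2 - 2(ac)(bd) - (bd)^2} \\
&= \frac{1}{2} \sqrt{(ad)^2 + (bc)^2 - 2(ad)(bc)} \\
&= \frac{1}{2} \sqrt{(ad - bc)^2} \\
&= \frac{1}{2} |ad - bc| \\
&= \frac{1}{2} \left| \operatorname{Det} \begin{bmatrix} a & c \\ b & d \end{bmatrix} \right|.
\end{align*}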

Now, we can easily derive the area of a parallelogram:

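[Figure: a parallelogram with vertices at the origin, $(a, b)$, $(c, d)$, and $(a + c, b + d)$, with angle $\theta$ between the two sides at the origin.]

Theorem 8.2.57
The area of a parallelogram formed by vectors $\langle a, b \rangle$ and $\langle c, d \rangle$ is
\[
\left| \operatorname{Det} \begin{bmatrix} a & c \\ b & d \end{bmatrix} \right| = |ad - bc|.
\]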

What if we place the triangle on a general coordinate system?

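[Figure: a triangle with vertices $(a, b)$, $(c, d)$, and $(e, f)$; the sides leaving $(a, b)$ are the vectors $\langle c - a, d - b \rangle$ and $\langle e - a, f - b \rangle$.]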

Theorem 8.2.58 (Shoelace Theorem)


Consider a triangle with general coordinates $(a, b)$, $(c, d)$, and $(e, f)$. Then the area of this
triangle is
\[
\frac{1}{2} |(ad + cf + eb) - (bc + de + fa)|.
\]

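Proof. Based on our coordinates $(a, b)$, $(c, d)$, and $(e, f)$, the two vectors with common tail at $(a, b)$
are $\langle c - a, d - b \rangle$ and $\langle e - a, f - b \rangle$. We can then apply Theorem 8.2.56:
\begin{align*}
\text{Area} &= \frac{1}{2} \left| \operatorname{Det} \begin{bmatrix} c - a & e - a \\ d - b & f - b \end{bmatrix} \right| \\
&= \frac{1}{2} |(c - a)(f - b) - (e - a)(d - b)| \\
&= \frac{1}{2} |cf - bc - af + ab - ed + eb + ad - ab| \\
&= \frac{1}{2} |(ad + cf + eb) - (bc + de + fa)|.
\end{align*}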

This theorem seems quite complex, but it is actually quite simple once you see the pattern.
What we do is list out the coordinates, and repeat the first point (which would be (a, b) in this
example) at the end as well:

a b
c d
e f
a b

Then, draw diagonal lines from a to d, c to f , and e to b, as such:


a b
c d
e f
a b

Take the product of each diagonal pair and add them all up. Here, we would have ad + cf + eb.
Then, draw diagonal lines from b to c, d to e, and f to a:
a b
c d
e f
a b

Do the same thing as last time: take the product of each diagonal pair and add all the products.
So we have the quantity bc + de + f a.
We have found the two quantities used in the formula $\frac{1}{2} |(ad + cf + eb) - (bc + de + fa)|$ we
discovered, so we can compute the area now.
This method of listing out the coordinates and taking diagonals is very helpful for applying
Shoelace Theorem correctly without having to completely memorize it.

Theorem 8.2.59 (Generalized Shoelace Theorem)


The area of a polygon defined by the points $(x_i, y_i)$ for $1 \le i \le n$ (in counterclockwise order) is
\[
\frac{1}{2} \big( (x_1 y_2 + x_2 y_3 + \cdots + x_{n-1} y_n + x_n y_1) - (y_1 x_2 + y_2 x_3 + \cdots + y_{n-1} x_n + y_n x_1) \big).
\]

This theorem can be proved by induction on the number of sides $n$ of the polygon. At the inductive step, we add
one more point outside the $n$-sided polygon. If we connect the two closest points on the polygon to
this new point, we end up with an $(n + 1)$-sided polygon.
We can then compute the area of this by adding the existing area of the n-sided polygon (which
would be the assumption made by the inductive hypothesis) to the area of the triangle formed by
the new point and the two closest points.
In general, to use the generalized Shoelace Theorem properly, we must list out the coordinates in
either clockwise or counter-clockwise order. Clockwise order only flips the sign of the expression, so the direction doesn't matter as long as the coordinates
are in order and we take the absolute value of the result. Do not forget to repeat the first pair of coordinates at the end too.
We can then apply our method of drawing diagonal lines to arrive at the expression given by the
theorem.

Problem 8.2.60. Find the area of the quadrilateral with points at (1, 1), (9, 0), (2, 4), and (7, 6).

Solution. We set up our list of coordinates in clockwise order (or counter-clockwise order; it doesn’t
matter because of the absolute value sign):

1 1
2 4
7 6
9 0
1 1

From the solid diagonal lines, we end up with the quantity 1 · 4 + 2 · 6 + 7 · 0 + 9 · 1 = 25.
From the dashed diagonal lines, we have 1 · 2 + 4 · 7 + 6 · 9 + 0 · 1 = 84.
Thus, our area is $\frac{1}{2} |25 - 84| = \frac{59}{2}$.
Note that drawing these diagonal lines makes the diagram resemble shoelaces - hence, the name
of the theorem.
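
The diagonal bookkeeping translates directly into a loop over consecutive pairs of vertices. The sketch below uses a hypothetical function name and takes the absolute value, so the vertices may be listed in either clockwise or counter-clockwise order; they must still be in order around the polygon.

```python
def shoelace_area(points):
    """Area of a simple polygon whose vertices are listed in order.

    For each vertex (x1, y1) and the next vertex (x2, y2), wrapping around
    to the first vertex at the end, accumulate x1*y2 - y1*x2.
    """
    n = len(points)
    total = 0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]   # repeats the first point at the end
        total += x1 * y2 - y1 * x2
    return abs(total) / 2

# Quadrilateral from Problem 8.2.60, in the same order as the worked solution.
print(shoelace_area([(1, 1), (2, 4), (7, 6), (9, 0)]))
```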

Lastly, we move on to row and column operations on matrices. This will enable us to compute
the determinant of a 3 × 3 matrix.
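Consider the following matrices:
\[
A = \begin{bmatrix} a & 0 \\ 0 & 1 \end{bmatrix} \quad B = \begin{bmatrix} 1 & 0 \\ 0 & b \end{bmatrix} \quad C = \begin{bmatrix} 1 & c \\ 0 & 1 \end{bmatrix} \quad D = \begin{bmatrix} 1 & 0 \\ d & 1 \end{bmatrix} \quad E = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\]
for any $a, b, c, d \in \mathbb{R}$. Let $M = \begin{bmatrix} r & s \\ t & u \end{bmatrix}$ represent a general $2 \times 2$ matrix.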
t u

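• If we multiply the matrix $A$ to the left of matrix $M$, we get
\[
AM = \begin{bmatrix} a & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} ar & as \\ t & u \end{bmatrix}.
\]
The top row of $M$ is multiplied by scalar $a$.

• If we multiply the matrix $A$ to the right of matrix $M$, we get
\[
MA = \begin{bmatrix} r & s \\ t & u \end{bmatrix} \begin{bmatrix} a & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} ar & s \\ at & u \end{bmatrix}.
\]
The left column of $M$ is multiplied by scalar $a$.

• If we multiply the matrix $B$ to the left of matrix $M$, we get
\[
BM = \begin{bmatrix} 1 & 0 \\ 0 & b \end{bmatrix} \begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} r & s \\ bt & bu \end{bmatrix}.
\]
The bottom row of $M$ is multiplied by scalar $b$.

• If we multiply the matrix $B$ to the right of matrix $M$, we get
\[
MB = \begin{bmatrix} r & s \\ t & u \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & b \end{bmatrix} = \begin{bmatrix} r & bs \\ t & bu \end{bmatrix}.
\]
The right column of $M$ is multiplied by scalar $b$.

• If we multiply the matrix $C$ to the left of matrix $M$, we get
\[
CM = \begin{bmatrix} 1 & c \\ 0 & 1 \end{bmatrix} \begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} r + ct & s + cu \\ t & u \end{bmatrix}.
\]
Then, $c$ times the bottom row is added to the top row of $M$.

• If we multiply the matrix $C$ to the right of matrix $M$, we get
\[
MC = \begin{bmatrix} r & s \\ t & u \end{bmatrix} \begin{bmatrix} 1 & c \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} r & s + cr \\ t & u + ct \end{bmatrix}.
\]
Then, $c$ times the left column is added to the right column of $M$.

• If we multiply the matrix $D$ to the left of matrix $M$, we get
\[
DM = \begin{bmatrix} 1 & 0 \\ d & 1 \end{bmatrix} \begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} r & s \\ t + dr & u + ds \end{bmatrix}.
\]
Then, $d$ times the top row is added to the bottom row of $M$.

• If we multiply the matrix $D$ to the right of matrix $M$, we get
\[
MD = \begin{bmatrix} r & s \\ t & u \end{bmatrix} \begin{bmatrix} 1 & 0 \\ d & 1 \end{bmatrix} = \begin{bmatrix} r + ds & s \\ t + du & u \end{bmatrix}.
\]
Then, $d$ times the right column is added to the left column of $M$.

• If we multiply the matrix $E$ to the left of matrix $M$, we get
\[
EM = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} t & u \\ r & s \end{bmatrix}.
\]
The rows of $M$ are swapped.

• If we multiply the matrix $E$ to the right of matrix $M$, we get
\[
ME = \begin{bmatrix} r & s \\ t & u \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} s & r \\ u & t \end{bmatrix}.
\]
The columns of $M$ are swapped.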

Note that Det(A) = a and Det(B) = b, which implies that Det(AM ) = Det(M A) = a · Det(M )
and Det(BM ) = Det(M B) = b · Det(M ). Therefore, if we multiply a row or column of a matrix M
by some scalar, then the determinant of M is also multiplied by that scalar.
Furthermore, Det(C) = Det(D) = 1, so Det(CM ) = Det(M C) = Det(DM ) = Det(M D) =
1 · Det(M ) = Det(M ). This means that adding a multiple of some row to another row or adding a
multiple of some column to another column does not affect the determinant.
Lastly, Det(E) = −1, which means that swapping a pair of rows or a pair of columns also flips
the sign of the determinant.
This result can be generalized to higher n × n matrices. However, without proof, we shall
summarize the information above in a general fashion:

Theorem 8.2.61 (Row and Column Operations)


Consider an n × n matrix M . We are allowed the following row and column operations:

1. When any row or any column is multiplied by some scalar k, the determinant of M is
also multiplied by k.

2. A multiple of some row added to another row, or a multiple of some column added to
another column, does not change the determinant of M .

3. When a pair of rows or a pair of columns is swapped, the sign of the determinant is flipped
(positive to negative, and vice-versa).
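
The three rules can be observed directly on a small example. The sketch below works only with $2 \times 2$ matrices, where the determinant formula $ad - bc$ is already available; the helper names and the sample matrix are illustrative choices, not part of the original text.

```python
def det2(m):
    """Determinant ad - bc of a 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

def scale_row(m, row, k):
    """Return a copy of m with the given row multiplied by k."""
    return [[k * x for x in r] if i == row else list(r) for i, r in enumerate(m)]

def add_multiple_of_row(m, src, dst, k):
    """Return a copy of m with k times row src added to row dst."""
    result = [list(r) for r in m]
    result[dst] = [x + k * y for x, y in zip(m[dst], m[src])]
    return result

def swap_rows(m):
    """Return a copy of the 2x2 matrix m with its two rows swapped."""
    return [list(m[1]), list(m[0])]

M = [[2, 3], [3, 5]]
print(det2(M))                                 # original determinant
print(det2(scale_row(M, 0, 4)))                # rule 1: multiplied by 4
print(det2(add_multiple_of_row(M, 0, 1, 7)))   # rule 2: unchanged
print(det2(swap_rows(M)))                      # rule 3: sign flipped
```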

Theorem 8.2.62
Consider a $3 \times 3$ matrix with arbitrary entries, as shown:
\[
\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}
\]

The determinant of this matrix is:
\[
aei - bdi - ceg - ahf + cdh + bfg.
\]

To prove the theorem, we will first need a definition and a lemma:

Definition 8.2.63. Consider a matrix $M$ in this particular form:
\[
M = \begin{bmatrix} r & s & t \\ 0 & u & v \\ 0 & 0 & w \end{bmatrix}.
\]

We call this upper triangular form, where all entries below the non-zero main diagonal are zeroes.

Lemma 8.2.64
If matrix M is in upper triangular form, then Det(M ) = ruw.

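Proof. Note that
\[
\begin{bmatrix} r & s & t \\ 0 & u & v \\ 0 & 0 & w \end{bmatrix} = \begin{bmatrix} r & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & w \end{bmatrix} \begin{bmatrix} 1 & r^{-1}s & r^{-1}t \\ 0 & 1 & u^{-1}v \\ 0 & 0 & 1 \end{bmatrix}.
\]

Observe the following factorization:
\[
\begin{bmatrix} r & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & w \end{bmatrix} = \begin{bmatrix} r & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & w \end{bmatrix}.
\]
But notice that $\begin{bmatrix} r & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ is the result of multiplying the first row of $I_3$ by $r$. Since the
determinant of $I_3$ is 1, the determinant of $\begin{bmatrix} r & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ is $r$. Likewise, $\begin{bmatrix} 1 & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & w \end{bmatrix}$
have determinants of $u$ and $w$ respectively. Thus,
\[
\operatorname{Det} \begin{bmatrix} r & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & w \end{bmatrix} = \operatorname{Det} \left( \begin{bmatrix} r & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & w \end{bmatrix} \right) = ruw.
\]

Now take a look at the matrix
\[
\begin{bmatrix} 1 & r^{-1}s & r^{-1}t \\ 0 & 1 & u^{-1}v \\ 0 & 0 & 1 \end{bmatrix}.
\]

Denoting the first row, second row, and third row as $R_1$, $R_2$, and $R_3$ respectively, note that we
can perform the row operation $-(u^{-1}v)R_3 + R_2 \to R_2$ to eliminate only the $u^{-1}v$ term from the
second row, and the row operation $-(r^{-1}t)R_3 + R_1 \to R_1$ to eliminate only the $r^{-1}t$ term from
the first row. Finally, we can do a row operation $-(r^{-1}s)R_2 + R_1 \to R_1$ to eliminate only the $r^{-1}s$
term from the first row. We are left with the identity matrix $I_3$. We have established that
\[
\operatorname{Det} \begin{bmatrix} 1 & r^{-1}s & r^{-1}t \\ 0 & 1 & u^{-1}v \\ 0 & 0 & 1 \end{bmatrix} = \operatorname{Det}(I_3),
\]
since these row operations do not affect the overall determinant of the matrix. Thus we can conclude
the proof of this lemma:
\begin{align*}
\operatorname{Det} \begin{bmatrix} r & s & t \\ 0 & u & v \\ 0 & 0 & w \end{bmatrix} &= \operatorname{Det} \begin{bmatrix} r & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & w \end{bmatrix} \operatorname{Det} \begin{bmatrix} 1 & r^{-1}s & r^{-1}t \\ 0 & 1 & u^{-1}v \\ 0 & 0 & 1 \end{bmatrix} \\
&= \operatorname{Det} \begin{bmatrix} r & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & w \end{bmatrix} \operatorname{Det}(I_3) = \operatorname{Det} \left( \begin{bmatrix} r & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & w \end{bmatrix} \cdot I_3 \right) \\
&= \operatorname{Det} \begin{bmatrix} r & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & w \end{bmatrix} = ruw.
\end{align*}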

Now, we proceed to prove the actual theorem of the 3 × 3 matrix determinant.

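Proof. We intend to take advantage of row operations to find our determinant.

To use Lemma 8.2.64 to our advantage, we must convert our initial matrix
\[
\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}
\]
to upper triangular form, through row operations. We use the same notation as used in the proof of
Lemma 8.2.64. First we eliminate the $d$ term:
\[
\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \xrightarrow{-\frac{d}{a} R_1 + R_2 \to R_2} \begin{bmatrix} a & b & c \\ 0 & e - \frac{bd}{a} & f - \frac{cd}{a} \\ g & h & i \end{bmatrix}
\]

Then we eliminate the $g$ term:
\[
\begin{bmatrix} a & b & c \\ 0 & e - \frac{bd}{a} & f - \frac{cd}{a} \\ g & h & i \end{bmatrix} \xrightarrow{-\frac{g}{a} R_1 + R_3 \to R_3} \begin{bmatrix} a & b & c \\ 0 & e - \frac{bd}{a} & f - \frac{cd}{a} \\ 0 & h - \frac{bg}{a} & i - \frac{cg}{a} \end{bmatrix}
\]

And lastly, we eliminate the $h$ term:
\[
\begin{bmatrix} a & b & c \\ 0 & e - \frac{bd}{a} & f - \frac{cd}{a} \\ 0 & h - \frac{bg}{a} & i - \frac{cg}{a} \end{bmatrix} \xrightarrow{-\left( \frac{h - \frac{bg}{a}}{e - \frac{bd}{a}} \right) R_2 + R_3 \to R_3} \begin{bmatrix} a & b & c \\ 0 & e - \frac{bd}{a} & f - \frac{cd}{a} \\ 0 & 0 & i - \frac{cg}{a} - \left( \frac{h - \frac{bg}{a}}{e - \frac{bd}{a}} \right) \left( f - \frac{cd}{a} \right) \end{bmatrix}
\]

By Lemma 8.2.64, the determinant is
\begin{align*}
a \left( e - \frac{bd}{a} \right) \left( i - \frac{cg}{a} - \left( \frac{h - \frac{bg}{a}}{e - \frac{bd}{a}} \right) \left( f - \frac{cd}{a} \right) \right) &= a \left( \left( i - \frac{cg}{a} \right) \left( e - \frac{bd}{a} \right) - \left( h - \frac{bg}{a} \right) \left( f - \frac{cd}{a} \right) \right) \\
&= \frac{1}{a} \big( (ai - cg)(ae - bd) - (ah - bg)(af - cd) \big).
\end{align*}

This simplifies to
\[
\frac{1}{a} \left( a^2 ei - abdi - aceg + bcdg - a^2 hf + acdh + abfg - bcdg \right) = aei - bdi - ceg - ahf + cdh + bfg.
\]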

This formula is certainly intimidating, but there is a visual memorization technique that is similar
to the method we used to invoke the Shoelace theorem.
First, consider this augmented matrix with the first two columns appended at the end.
\[
\left[ \begin{array}{ccc|cc} a & b & c & a & b \\ d & e & f & d & e \\ g & h & i & g & h \end{array} \right]
\]

Consider the three diagonals going from top-left to bottom-right, and take the sum of the
products of the three entries in each of the diagonals, i.e. $aei + bfg + cdh$.

[Figure: the augmented matrix shown three times, highlighting in turn the diagonals $a, e, i$; then $b, f, g$; then $c, d, h$.]

Then consider the three diagonals going the other way, i.e. going from top-right to bottom-left,
and, like above, take the sum of the products of the three entries in each of the diagonals, i.e.
$ceg + afh + bdi$.

[Figure: the augmented matrix shown three times, highlighting in turn the diagonals $c, e, g$; then $a, f, h$; then $b, d, i$.]

Then our determinant is simply the first sum minus the second sum, i.e. $aei + bfg + cdh - (ceg + afh + bdi)$, which is what the theorem states.
 
Problem 8.2.65. Compute the determinant of $\begin{bmatrix} 2 & 3 & 6 \\ 1 & -4 & 3 \\ 2 & 5 & 9 \end{bmatrix}$ using the method shown above.

Solution. The diagonal technique gives us
\[
(2)(-4)(9) + (3)(3)(2) + (6)(1)(5) - (6)(-4)(2) - (2)(3)(5) - (3)(1)(9) = -72 + 18 + 30 - (-48) - 30 - 27 = -33.
\]

There is also another way to evaluate the $3 \times 3$ determinant. With the matrix
\[
\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix},
\]

consider the first row, which is a, b, and c. For each entry, imagine that you delete the row and
column of the matrix which contains that entry you are considering. You will have exactly four
entries left that have not been deleted. Form a 2 × 2 matrix out of those four entries in the exact
same order. The determinant of this resulting 2 × 2 matrix is called a minor.
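For example, deleting the row and column containing $a$ shows that the minor for $a$ would be:
\[
\operatorname{Det} \begin{bmatrix} e & f \\ h & i \end{bmatrix}.
\]

Deleting the row and column containing $e$ shows that the minor for $e$ would be:
\[
\operatorname{Det} \begin{bmatrix} a & c \\ g & i \end{bmatrix}.
\]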


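We can find the determinant of the $3 \times 3$ matrix by considering one whole row or column, and
for each entry, finding the product of that entry and its minor. The determinant is then the
alternating sum of those products, where the sign attached to the entry in row $i$ and column $j$ is $(-1)^{i+j}$. Along the first or third row or column the signs are $+, -, +$, while along the second row or column they are $-, +, -$.
For example, we can evaluate the minors on the first row to get:
\[
\operatorname{Det} \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = a \cdot \operatorname{Det} \begin{bmatrix} e & f \\ h & i \end{bmatrix} - b \cdot \operatorname{Det} \begin{bmatrix} d & f \\ g & i \end{bmatrix} + c \cdot \operatorname{Det} \begin{bmatrix} d & e \\ g & h \end{bmatrix}.
\]

We can also find the same determinant by evaluating minors on the second column:
\[
\operatorname{Det} \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = -b \cdot \operatorname{Det} \begin{bmatrix} d & f \\ g & i \end{bmatrix} + e \cdot \operatorname{Det} \begin{bmatrix} a & c \\ g & i \end{bmatrix} - h \cdot \operatorname{Det} \begin{bmatrix} a & c \\ d & f \end{bmatrix}.
\]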
g h i

Problem 8.2.66. Solve Problem 8.2.65 using the minors method and confirm that you got the
same answer.
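
Both methods are short enough to compare in code. This is a sketch with hypothetical function names: `det3_diagonals` follows the diagonal rule of Theorem 8.2.62, and `det3_minors` expands along any chosen row or column, attaching the sign $(-1)^{i+j}$ to the entry in row $i$, column $j$ (indices here start at 0, which gives the same signs).

```python
def det2(m):
    """Determinant ad - bc of a 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

def det3_diagonals(m):
    """3x3 determinant via the diagonal rule: aei + bfg + cdh - ceg - afh - bdi."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * e * i + b * f * g + c * d * h - c * e * g - a * f * h - b * d * i

def minor(m, row, col):
    """Determinant of the 2x2 matrix left after deleting the given row and column."""
    rest = [[m[r][c] for c in range(3) if c != col] for r in range(3) if r != row]
    return det2(rest)

def det3_minors(m, row=None, col=None):
    """3x3 determinant by expanding along one row (or one column)."""
    if row is not None:
        cells = [(row, j) for j in range(3)]
    else:
        cells = [(i, col) for i in range(3)]
    return sum((-1) ** (i + j) * m[i][j] * minor(m, i, j) for i, j in cells)

M = [[2, 3, 6], [1, -4, 3], [2, 5, 9]]   # matrix from Problem 8.2.65
print(det3_diagonals(M), det3_minors(M, row=0), det3_minors(M, col=1))
```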

Example 8.2.67
Use matrices to solve the following system of equations:

2x + 3y + 4z = 4,
x − 2y + 5z = 1,
3x + 6y − z = 2.

Solution. We read off the coefficients of $x$, $y$, and $z$ from each of the three equations into a $3 \times 3$
matrix. However, we also read the constants 4, 1, and 2 into a separate column appended at the end
of the matrix; this addition of an 'extra column' results in an augmented matrix, as shown:
\[
\left[ \begin{array}{ccc|c} 2 & 3 & 4 & 4 \\ 1 & -2 & 5 & 1 \\ 3 & 6 & -1 & 2 \end{array} \right]
\]

If we convert this to upper triangular form, then we have a matrix in the form
\[
\left[ \begin{array}{ccc|c} a & b & c & r \\ 0 & d & e & s \\ 0 & 0 & f & u \end{array} \right],
\]
from which we read off the equations as $fz = u$, $dy + ez = s$, and $ax + by + cz = r$. Using the
equation $fz = u$, we can find the value of $z$, which we can substitute into $dy + ez = s$ to find
$y$, and then substitute $y$ and $z$ into $ax + by + cz = r$ to find $x$. This process is called back substitution.
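We can use row operations, as follows (keep in mind that we treat the fourth augmented column the same as the others!):

\begin{align*}
\begin{pmatrix} 2 & 3 & 4 & 4 \\ 1 & -2 & 5 & 1 \\ 3 & 6 & -1 & 2 \end{pmatrix}
&\xrightarrow{R_1 \leftrightarrow R_2}
\begin{pmatrix} 1 & -2 & 5 & 1 \\ 2 & 3 & 4 & 4 \\ 3 & 6 & -1 & 2 \end{pmatrix} \\
\begin{pmatrix} 1 & -2 & 5 & 1 \\ 2 & 3 & 4 & 4 \\ 3 & 6 & -1 & 2 \end{pmatrix}
&\xrightarrow[-3R_1 + R_3 \to R_3]{-2R_1 + R_2 \to R_2}
\begin{pmatrix} 1 & -2 & 5 & 1 \\ 0 & 7 & -6 & 2 \\ 0 & 12 & -16 & -1 \end{pmatrix} \\
\begin{pmatrix} 1 & -2 & 5 & 1 \\ 0 & 7 & -6 & 2 \\ 0 & 12 & -16 & -1 \end{pmatrix}
&\xrightarrow{7R_3 \to R_3}
\begin{pmatrix} 1 & -2 & 5 & 1 \\ 0 & 7 & -6 & 2 \\ 0 & 84 & -112 & -7 \end{pmatrix} \\
\begin{pmatrix} 1 & -2 & 5 & 1 \\ 0 & 7 & -6 & 2 \\ 0 & 84 & -112 & -7 \end{pmatrix}
&\xrightarrow{-12R_2 + R_3 \to R_3}
\begin{pmatrix} 1 & -2 & 5 & 1 \\ 0 & 7 & -6 & 2 \\ 0 & 0 & -40 & -31 \end{pmatrix}
\end{align*}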

At this point, we have −40z = −31 ⟹ z = 31/40 from the last row. We then plug this value into 7y − 6z = 2 (second row), yielding y = 19/20. Lastly, we have x − 2y + 5z = 1 from the first row, and plugging in the values of y and z gives x = −39/40. Therefore our solution is

\[
(x, y, z) = \left( -\frac{39}{40}, \frac{19}{20}, \frac{31}{40} \right).
\]
Alternatively, we could keep applying our row operations until we get to a form

\[
\begin{pmatrix} 1 & 0 & 0 & a \\ 0 & 1 & 0 & b \\ 0 & 0 & 1 & c \end{pmatrix}
\]

for arbitrary a, b, c ∈ R, from which we can directly read off our values for x, y, z.
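The elimination and back substitution steps above can be sketched in Python with exact fractions. The function name `solve_by_elimination` is hypothetical, and the sketch assumes the system has exactly one solution. It picks pivots in a slightly different order than the worked example (it does not swap rows unless a pivot is zero), but it reaches the same answer.

```python
from fractions import Fraction

def solve_by_elimination(aug):
    """Solve a square linear system given as an augmented matrix [A | b].

    Assumes the system has exactly one solution.
    """
    n = len(aug)
    rows = [[Fraction(v) for v in row] for row in aug]
    # Forward elimination to upper triangular form.
    for col in range(n):
        # Swap in a row with a nonzero pivot (a row swap Ri <-> Rj).
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            # -factor * R_col + R_r -> R_r, applied to the augmented column too.
            rows[r] = [x - factor * p for x, p in zip(rows[r], rows[col])]
    # Back substitution, starting from the last row.
    solution = [Fraction(0)] * n
    for r in range(n - 1, -1, -1):
        known = sum(rows[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (rows[r][n] - known) / rows[r][r]
    return solution

x, y, z = solve_by_elimination([[2, 3, 4, 4], [1, -2, 5, 1], [3, 6, -1, 2]])
print(x, y, z)
```

Using `Fraction` keeps every intermediate value exact, so the result can be compared directly with the fractions found by hand.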

We now generalize the concept of eigenvectors and eigenvalues to 3 × 3 matrices. This is very
complicated for general matrices, so we limit our discussion of eigenvalues and eigenvectors to 3 × 3
matrices in upper triangular form.

Example 8.2.68
Find eigenvectors and eigenvalues for

\[
M = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{pmatrix}.
\]

Solution. If λ is an eigenvalue of this matrix, then Det(M − λI) = 0. By Lemma 8.2.64, we can evaluate this as

\[
\operatorname{Det}\begin{pmatrix} 1-\lambda & 2 & 3 \\ 0 & 4-\lambda & 5 \\ 0 & 0 & 6-\lambda \end{pmatrix}
= (1-\lambda)(4-\lambda)(6-\lambda) = 0.
\]

Thus, the only possible values of λ are 1, 4, 6.

• Case λ = 1:
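Writing the equation out, we have

\[
\begin{pmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{pmatrix}
\begin{pmatrix} a \\ b \\ c \end{pmatrix}
= \begin{pmatrix} a + 2b + 3c \\ 4b + 5c \\ 6c \end{pmatrix}
= \begin{pmatrix} a \\ b \\ c \end{pmatrix}.
\]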

We now have 3 equations:

a + 2b + 3c = a,
4b + 5c = b,
6c = c.

The third equation gives us c = 0. Back-substituting, we also get b = 0. Then a = a, which is true for any a. Thus, the eigenvectors for λ = 1 are of the form

\[
\begin{pmatrix} a \\ 0 \\ 0 \end{pmatrix} = a \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},
\]

i.e. scalar multiples of $\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$.

• Case λ = 4:
Similarly, we have the equations

a + 2b + 3c = 4a,
4b + 5c = 4b,
6c = 4c.

Like before, the third equation implies c = 0. Then the second equation simplifies to 4b = 4b, i.e. b = b, which is true for any value of b.
Furthermore, since c = 0, the first equation a + 2b + 3c = 4a reduces to 2b = 3a, i.e. a = (2/3)b, therefore the eigenvectors are of the form

\[
\begin{pmatrix} \tfrac{2}{3} b \\ b \\ 0 \end{pmatrix} = b \begin{pmatrix} \tfrac{2}{3} \\ 1 \\ 0 \end{pmatrix},
\]

from which we can make the valid substitution b −→ 3b to get the cleaner form $b \begin{pmatrix} 2 \\ 3 \\ 0 \end{pmatrix}$, i.e. scalar multiples of $\begin{pmatrix} 2 \\ 3 \\ 0 \end{pmatrix}$.

• Case λ = 6:

a + 2b + 3c = 6a,
4b + 5c = 6b,
6c = 6c.

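The third equation implies c = c for any value of c. The second equation simplifies to b = (5/2)c and the first equation simplifies to a = (8/5)c, therefore our eigenvectors are of the form

\[
\begin{pmatrix} \tfrac{8}{5} c \\ \tfrac{5}{2} c \\ c \end{pmatrix} = c \begin{pmatrix} \tfrac{8}{5} \\ \tfrac{5}{2} \\ 1 \end{pmatrix}.
\]

Then we can substitute c −→ 10c to get $c \begin{pmatrix} 16 \\ 25 \\ 10 \end{pmatrix}$, i.e. scalar multiples of $\begin{pmatrix} 16 \\ 25 \\ 10 \end{pmatrix}$.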
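Each eigenvector found above can be checked by multiplying it by M and comparing with λ times the vector. A minimal sketch, with hypothetical helper names `mat_vec` and `is_eigenpair`:

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

def is_eigenpair(m, lam, v):
    """True when v is nonzero and M v equals lam * v."""
    return any(v) and mat_vec(m, v) == [lam * x for x in v]

M = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
pairs = [(1, [1, 0, 0]), (4, [2, 3, 0]), (6, [16, 25, 10])]
print([is_eigenpair(M, lam, v) for lam, v in pairs])
```

Integer entries keep the comparison exact; any nonzero scalar multiple of each vector would pass the same check.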

Problem 8.2.69. Find eigenvalues and eigenvectors for the matrix

\[
\begin{pmatrix} 2 & 1 & 4 \\ 0 & -3 & 2 \\ 0 & 0 & 1 \end{pmatrix}.
\]

Solution. We know that Det(M − λI) = 0 for the given matrix M and possible eigenvalues λ. Then, we have

\[
\operatorname{Det}\begin{pmatrix} 2-\lambda & 1 & 4 \\ 0 & -3-\lambda & 2 \\ 0 & 0 & 1-\lambda \end{pmatrix}
= (2-\lambda)(-3-\lambda)(1-\lambda) = 0.
\]

We have the roots λ = 2, −3, 1. We check each case:

1. For λ = 2, we have

\[
\begin{pmatrix} 2 & 1 & 4 \\ 0 & -3 & 2 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= 2 \begin{pmatrix} x \\ y \\ z \end{pmatrix}.
\]

We are left with the system of equations
2x + y + 4z = 2x,
−3y + 2z = 2y,
z = 2z.

The last equation implies z = 0, from which we determine y = 0 as well (from the second equation), and in the first equation we find x = x for any x, therefore our eigenvectors for λ = 2 are of the form $\begin{pmatrix} x \\ 0 \\ 0 \end{pmatrix}$, i.e. scalar multiples of $\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$.
2. For λ = −3, we similarly set up a system of equations:
2x + y + 4z = −3x,
−3y + 2z = −3y,
z = −3z.

We find z = 0, y = y, and x = −(1/5)y, therefore our eigenvectors are $\begin{pmatrix} -\tfrac{1}{5} y \\ y \\ 0 \end{pmatrix}$, i.e. scalar multiples of $\begin{pmatrix} -1 \\ 5 \\ 0 \end{pmatrix}$.
3. For λ = 1, we have the system of equations
2x + y + 4z = x,
−3y + 2z = y,
z = z.

We find z = z, z = 2y, and x = −9y, therefore our eigenvectors are $\begin{pmatrix} -9y \\ y \\ 2y \end{pmatrix}$, i.e. scalar multiples of $\begin{pmatrix} -9 \\ 1 \\ 2 \end{pmatrix}$.

Sometimes, an eigenvalue can be a double root when the determinant equation is written out, as
presented in the following problem:
Problem 8.2.70. Find eigenvectors and eigenvalues for

\[
M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{pmatrix}.
\]

Solution. If λ is an eigenvalue of this matrix, then Det(M − λI) = 0. Writing this out,

\[
\operatorname{Det}\begin{pmatrix} 1-\lambda & 0 & 0 \\ 0 & 4-\lambda & 0 \\ 0 & 0 & 4-\lambda \end{pmatrix}
= (1-\lambda)(4-\lambda)^2 = 0.
\]

Thus, the only possible values of λ are 1, 4, where 4 is a double root:

• Case λ = 1:
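Write the equation out:

\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{pmatrix}
\begin{pmatrix} a \\ b \\ c \end{pmatrix}
= \begin{pmatrix} a \\ 4b \\ 4c \end{pmatrix}
= \begin{pmatrix} a \\ b \\ c \end{pmatrix}.
\]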

We now have 3 equations:

a = a,
4b = b,
4c = c.

The third equation gives us c = 0. Similarly, b = 0. Then a = a, which is true for any a. Thus, the solution for λ = 1 is $\begin{pmatrix} a \\ 0 \\ 0 \end{pmatrix} = a \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$, i.e. scalar multiples of $\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$.
• Case λ = 4:
Similar to what we did previously, we have

a = 4a,
4b = 4b,
4c = 4c.

The last two equations give us c = c and b = b, which are always true. Finally, the first equation gives us a = 0. Thus, the solutions are of the form $\begin{pmatrix} 0 \\ b \\ c \end{pmatrix}$.

Unlike the previous example, this eigenvector has two different variables. When an eigenvalue is a double root, as it is here, its eigenvectors can have two degrees of freedom (i.e. the vector is defined by 2 variables), although this does not happen for every matrix with a repeated eigenvalue.

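Problem 8.2.71. What are the determinants of the following matrices?

\[
\begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 7 & 7 & 7 \end{pmatrix}, \qquad
\begin{pmatrix} 1 & 2 & 4 \\ 3 & 7 & 8 \\ 4 & 9 & 12 \end{pmatrix}
\]

Solution. For the first one, note that we can completely eliminate the second row by a row operation:

\[
\begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 7 & 7 & 7 \end{pmatrix}
\xrightarrow{-2R_1 + R_2 \to R_2}
\begin{pmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \\ 7 & 7 & 7 \end{pmatrix}.
\]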

As the Shoelace technique is applied, notice that all terms in the expression for the determinant
(i.e. aei − bdi − ceg − ahf + cdh + bf g) include a term from the second row. If all the entries in the
second row are zeroes, all the terms in the determinant are zeroes, and therefore the determinant is
zero.
Therefore, the determinant of the first matrix is 0.
We can make this same argument about any row or column. Thus, we conclude that if any row
or column is all zeroes, then the determinant is necessarily 0.
Similarly, for the second matrix, we can eliminate the third row by two special row operations:

\[
\begin{pmatrix} 1 & 2 & 4 \\ 3 & 7 & 8 \\ 4 & 9 & 12 \end{pmatrix}
\xrightarrow[-R_2 + R_3 \to R_3]{-R_1 + R_3 \to R_3}
\begin{pmatrix} 1 & 2 & 4 \\ 3 & 7 & 8 \\ 0 & 0 & 0 \end{pmatrix}.
\]

The determinant is clearly 0. From this, we can conclude in general that if some rows sum up to another row, then the determinant is necessarily 0.

From these two examples of matrices, we can even make a broader conclusion: if the rows (taken as vectors) are not linearly independent, then the determinant is 0. This is because our row operations allow us to replace a row with a linear combination of the rows, e.g. we can do aR1 + bR2 + cR3 → R1 for any real numbers a, b, and c with a ≠ 0 (the coefficient of the row being replaced must be nonzero; such an operation multiplies the determinant by a, so it does not change whether the determinant is 0). Therefore, if the rows are linearly dependent, then we can find a, b, c, not all 0, such that aR1 + bR2 + cR3 = 0. Then, we can apply this combination to a row whose coefficient is nonzero to get a row equal to 0, and thereby conclude that the determinant is 0.

8.3 3-D Geometry

Although we have had experience with the two-dimensional Cartesian plane (plotting points and lines), we need parametric equations, vectors, and matrices to establish a foundation for representing lines and planes in three dimensions.
First, we discuss how to represent lines in 3D space, through vectors.

[Figure: a line through the points P = (x0, y0, z0) and Q = (x1, y1, z1).]

Consider a point R on the line $\overleftrightarrow{PQ}$. Then $\overrightarrow{PR}$ will be a scalar multiple of $\overrightarrow{PQ}$. This results in the parameterization

\[
\overrightarrow{PR} = t\,\overrightarrow{PQ}.
\]

Denote $P = (x_0, y_0, z_0)$ and $Q = (x_1, y_1, z_1)$. Then $\overrightarrow{PQ} = \langle x_1 - x_0, y_1 - y_0, z_1 - z_0 \rangle$, therefore

\[
\overrightarrow{PR} = t \langle x_1 - x_0, y_1 - y_0, z_1 - z_0 \rangle,
\]

from which we can conclude

R = (x0 + t(x1 − x0 ), y0 + t(y1 − y0 ), z0 + t(z1 − z0 )).

Since R is an arbitrary point that lies on the line $\overleftrightarrow{PQ}$, we can establish the following:

Definition 8.3.1. A line in three-dimensional space can be expressed in parametric form with
the following:

x = x0 + t(x1 − x0 ),
y = y0 + t(y1 − y0 ),
z = z0 + t(z1 − z0 ).

Problem 8.3.2. Give a parametric representation for a line containing (1, 2, 3) and (−3, −2, 7).

Solution. Applying our definition, we have

x = 1 + t(−3 − 1) = 1 − 4t,
y = 2 + t(−2 − 2) = 2 − 4t,
z = 3 + t(7 − 3) = 3 + 4t.

Since each contains 4t, we can make the substitution 4t −→ t to simplify our parameterization:

x = 1 − t,
y = 2 − t,
z = 3 + t.

Problem 8.3.3. Does the line from Problem 8.3.2 intersect the line containing (0, 0, 0) and (1, 4, 9)?

Solution. Our second line has the parameterization

x = u,
y = 4u,
z = 9u.

We use u as the parameter here because we will use t as the parameter for the line from Problem 8.3.2. The parameters for the two lines are not necessarily the same.

To find the intersection, simply set the two parameterizations equal to each other:

1 − t = u,
2 − t = 4u,
3 + t = 9u.

Substituting the first equation into the second equation, we have 4(1 − t) = 2 − t, which yields t = 2/3 and u = 1/3. However, these values fail the third equation (the left side is 11/3 while the right side is 3), thus there is no solution, and the lines do not intersect.
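The same test can be sketched in Python: solve the x and y equations for the two parameters, then check whether z agrees. The names `line_point` and `intersect_lines` are hypothetical, each line is given as a point plus a direction vector with integer entries, and the sketch assumes the x and y equations determine t and u uniquely.

```python
from fractions import Fraction

def line_point(p, d, s):
    """Point p + s*d on the line through p with direction d."""
    return tuple(pi + s * di for pi, di in zip(p, d))

def intersect_lines(p1, d1, p2, d2):
    """Return (t, u) with p1 + t*d1 == p2 + u*d2, or None if the lines miss.

    Solves the x and y equations by Cramer's rule, then checks z.
    Assumes the x-y projections of the two directions are not parallel.
    """
    # System: t*d1x - u*d2x = p2x - p1x  and  t*d1y - u*d2y = p2y - p1y.
    det = d1[0] * (-d2[1]) - (-d2[0]) * d1[1]
    if det == 0:
        raise ValueError("x-y projections are parallel; use other coordinates")
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    t = Fraction(rx * (-d2[1]) - (-d2[0]) * ry, det)
    u = Fraction(d1[0] * ry - d1[1] * rx, det)
    if line_point(p1, d1, t) == line_point(p2, d2, u):
        return t, u
    return None

# Line from Problem 8.3.2 (x = 1 - t, ...) against the line x = u, y = 4u, z = 9u.
print(intersect_lines((1, 2, 3), (-1, -1, 1), (0, 0, 0), (1, 4, 9)))
```

A result of `None` means the values of t and u forced by the first two equations fail the third, exactly as in the solution above.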

Problem 8.3.4. Find the parametric equation of the line containing (1, −2, 3) and (2, 1, −5).

Solution. x = 1 + (2 − 1)t = 1 + t,
y = −2 + (1 − (−2))t = −2 + 3t,
z = 3 + (−5 − 3)t = 3 − 8t.

Now we seek to come up with a proper definition for planes in space. The following diagram will motivate us to define a plane in the following fashion:

[Figure: a plane containing the points (x0, y0, z0) and (x, y, z), with the vector ⟨a, b, c⟩ perpendicular to it.]

For any point (x, y, z) that is on the plane, notice that the vector ⟨a, b, c⟩ is orthogonal to the vector from $(x_0, y_0, z_0)$ to (x, y, z).

Definition 8.3.5. A plane in three-dimensional space is the set of points (x, y, z) such that the vector from $(x_0, y_0, z_0)$ to (x, y, z) is orthogonal to ⟨a, b, c⟩. We can then find the equation for a plane:

\begin{align*}
\langle a, b, c \rangle \cdot \langle x - x_0, y - y_0, z - z_0 \rangle &= 0, \\
ax - ax_0 + by - by_0 + cz - cz_0 &= 0.
\end{align*}

Let d be some constant equal to ax0 + by0 + cz0 , therefore our general equation for a plane (after
some rearranging) is
ax + by + cz = d.

Definition 8.3.6. A vector or line is called normal to some plane when it is perpendicular to that
plane.
Problem 8.3.7. Find the point on the line from Problem 8.3.4 closest to the origin (0, 0, 0).

Solution. A vector along the line through the points (1, −2, 3) and (2, 1, −5) is
⟨2 − 1, 1 − (−2), −5 − 3⟩ = ⟨1, 3, −8⟩.
Furthermore, the line from the origin to the closest point should be perpendicular to the line (the shortest distance from a point to a line is the perpendicular).
If we let that closest point on the line be (1 + t, −2 + 3t, 3 − 8t) using the parametric equations of the line, then the vector starting from (0, 0, 0) and ending at (1 + t, −2 + 3t, 3 − 8t) is simply ⟨1 + t, −2 + 3t, 3 − 8t⟩. This vector must be orthogonal to the vector ⟨1, 3, −8⟩, therefore we have
⟨1 + t, −2 + 3t, 3 − 8t⟩ · ⟨1, 3, −8⟩ = 0.

Evaluating the dot product gives

(1 + t) · 1 + (−2 + 3t) · 3 + (3 − 8t) · (−8) = 0,

which simplifies to 74t − 29 = 0. The solution is t = 29/74. Now, we can plug this value into our parametric equations of the line to get the point

\[
\left( \frac{103}{74}, -\frac{61}{74}, -\frac{10}{74} \right).
\]
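The perpendicularity condition used above, (p + t d) · d = 0, solves to t = −(p · d)/(d · d). A minimal sketch with exact fractions follows; the function names are hypothetical and the inputs are assumed to have integer entries.

```python
from fractions import Fraction

def dot(v, w):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(v, w))

def closest_point_to_origin(p, d):
    """Parameter t and point on the line p + t*d closest to the origin.

    Solves (p + t*d) . d = 0 for t, i.e. t = -(p . d) / (d . d).
    """
    t = Fraction(-dot(p, d), dot(d, d))
    return t, tuple(pi + t * di for pi, di in zip(p, d))

t, point = closest_point_to_origin((1, -2, 3), (1, 3, -8))
print(t, point)
```

`Fraction` reduces automatically, so the last coordinate appears as −5/37, which is the same number as −10/74.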

Problem 8.3.8. Find the intersection of the line from Problem 8.3.4 with the plane whose equation
is x + y + 4z = 11.

Solution. Plugging in our parametric equations for the line, we have

(1 + t) + (−2 + 3t) + 4(3 − 8t) = 11.

This simplifies to 11 − 28t = 11, i.e. t = 0, and so we plug this back into our parameterization for the line, resulting in the point (1 + 0, −2 + 3(0), 3 − 8(0)) = (1, −2, 3).

Problem 8.3.9. Find the intersection of planes x + y + 4z = 11 and x + y + 3z = 11.

Solution. Both left-hand sides equal 11, so we can set them equal to each other:
x + y + 4z = x + y + 3z.

This gives z = 0, and then x + y = 11, which is a line with a parameterization of
x = 1 + t,
y = 10 − t,
z = 0.

The intersection of the given planes is a line with that parameterization.

Note that we could also have used x = 2 + t, y = 9 − t or x = 6 + t, y = 5 − t, etc. It does not matter as long as x and y add up to 11 for every value of the parameter t.

Problem 8.3.10. Find the acute angle between the planes given in the previous problem.

Solution. Denote the acute angle as θ. The equations x + y + 4z = 11 and x + y + 3z = 11 suggest that their normal vectors are ⟨1, 1, 4⟩ and ⟨1, 1, 3⟩ respectively. We can take representations of these vectors such that their tips meet each other while their tails meet at the planes. Then a ‘cyclic quadrilateral’ is formed with θ such that the angle between the two vectors at their tips is 180° − θ (the other two angles are 90° each due to the normal vectors).
We can calculate the cosine of the angle 180° − θ between the tips of the vectors ⟨1, 1, 4⟩ and ⟨1, 1, 3⟩, from which we can get cos θ:

\[
\cos(180^\circ - \theta) = \frac{\langle 1, 1, 4 \rangle \cdot \langle 1, 1, 3 \rangle}{\|\langle 1, 1, 4 \rangle\| \, \|\langle 1, 1, 3 \rangle\|} = \frac{14}{3\sqrt{22}} = -\cos\theta.
\]

One may conclude that the angle is $\cos^{-1}\left(-\frac{14}{3\sqrt{22}}\right)$; however, this is incorrect because the cosine of an acute angle is always positive. To remedy this, we take the absolute value of the cosine of the angle between the two normal vectors. This is our general expression to find the acute angle:

\[
\cos\theta = \frac{|\vec{v} \cdot \vec{w}|}{\|\vec{v}\| \, \|\vec{w}\|},
\]

where $\vec{v}, \vec{w}$ are the vectors normal to the two planes, which in this case would be ⟨1, 1, 4⟩ and ⟨1, 1, 3⟩.
The answer is therefore $\theta = \cos^{-1}\left(\frac{14}{3\sqrt{22}}\right)$.
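As a rough numerical check, the acute-angle formula can be evaluated in Python. The function name is hypothetical; the absolute value on the dot product is what guarantees an angle between 0° and 90° regardless of which way the normals point.

```python
import math

def acute_angle_between_planes(n1, n2):
    """Acute angle, in radians, between planes with normal vectors n1 and n2."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm1 = math.sqrt(sum(a * a for a in n1))
    norm2 = math.sqrt(sum(b * b for b in n2))
    # min(...) guards against a ratio slightly above 1 from rounding.
    return math.acos(min(1.0, abs(dot) / (norm1 * norm2)))

theta = acute_angle_between_planes((1, 1, 4), (1, 1, 3))
print(math.degrees(theta))
```

Flipping the sign of either normal vector leaves the result unchanged.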

Theorem 8.3.11
The distance from a point $(x_0, y_0, z_0)$ to a plane ax + by + cz = d is

\[
\frac{|ax_0 + by_0 + cz_0 - d|}{\sqrt{a^2 + b^2 + c^2}}.
\]

Proof. By our definition of a plane, the vector that is normal to ax + by + cz = d is ⟨a, b, c⟩. Let a representation of this vector go through the point $(x_0, y_0, z_0)$, as shown:

[Figure: the normal vector ⟨a, b, c⟩ drawn through the point (x0, y0, z0).]

Consider the line that goes through $(x_0, y_0, z_0)$ with direction ⟨a, b, c⟩. The parameterization for this line would be

\begin{align*}
x &= x_0 + at, \\
y &= y_0 + bt, \\
z &= z_0 + ct.
\end{align*}

To find the intersection of this line and the plane, substitute the parameterization into the equation for the plane:

\[
a(x_0 + at) + b(y_0 + bt) + c(z_0 + ct) = d.
\]

This rearranges to

\[
ax_0 + by_0 + cz_0 - d = -t(a^2 + b^2 + c^2).
\]

Solving for the parameter t, we get

\[
t = -\frac{ax_0 + by_0 + cz_0 - d}{a^2 + b^2 + c^2}.
\]

The shortest distance from the plane to $(x_0, y_0, z_0)$ would be $|t| \, \|\langle a, b, c \rangle\| = |t| \sqrt{a^2 + b^2 + c^2}$, so we substitute in our value of t to get

\[
\left| -\frac{ax_0 + by_0 + cz_0 - d}{a^2 + b^2 + c^2} \right| \sqrt{a^2 + b^2 + c^2} = \frac{|ax_0 + by_0 + cz_0 - d|}{\sqrt{a^2 + b^2 + c^2}},
\]

as desired.
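The distance formula translates directly into a short function. This is a sketch with a hypothetical name, and the sample planes in the usage lines are made up for illustration.

```python
import math

def distance_point_to_plane(point, normal, d):
    """Distance from point (x0, y0, z0) to the plane a*x + b*y + c*z = d."""
    a, b, c = normal
    x0, y0, z0 = point
    return abs(a * x0 + b * y0 + c * z0 - d) / math.sqrt(a * a + b * b + c * c)

# Distance from the origin to the horizontal plane z = 5.
print(distance_point_to_plane((0, 0, 0), (0, 0, 1), 5))
# Distance from (1, 2, 3) to the plane x + 2y + 2z = 2.
print(distance_point_to_plane((1, 2, 3), (1, 2, 2), 2))
```

A point that lies on the plane gives distance 0, which is a quick sanity check on any plane equation.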
   
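Problem 8.3.12. Find $\left( a\vec{\imath} + b\vec{\jmath} + c\vec{k} \right) \cdot \left( d\vec{\imath} + e\vec{\jmath} + f\vec{k} \right)$.

Solution. The dot product is distributive over addition:

\begin{align*}
&\left( a\vec{\imath} + b\vec{\jmath} + c\vec{k} \right) \cdot \left( d\vec{\imath} + e\vec{\jmath} + f\vec{k} \right) \\
&= ad(\vec{\imath} \cdot \vec{\imath}) + ae(\vec{\imath} \cdot \vec{\jmath}) + af(\vec{\imath} \cdot \vec{k}) + bd(\vec{\jmath} \cdot \vec{\imath}) + be(\vec{\jmath} \cdot \vec{\jmath}) + bf(\vec{\jmath} \cdot \vec{k}) + cd(\vec{k} \cdot \vec{\imath}) + ce(\vec{k} \cdot \vec{\jmath}) + cf(\vec{k} \cdot \vec{k}) \\
&= ad + be + cf.
\end{align*}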

Now that we are dealing with three-dimensional vectors, we can find a vector that is orthogonal to two other given vectors simultaneously. Consider the following matrix:

\[
\begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \\ a & b & c \\ d & e & f \end{pmatrix}.
\]

This matrix is unlike any other matrix we have considered before. Although we initially defined matrices to be arrays of numbers, we can extend this definition to include vectors as possible entries as well.
The top row consists of the standard basis vectors for $\mathbb{R}^3$. Then, the determinant of this matrix is of the form

\[
\vec{\imath}\gamma_1 + \vec{\jmath}\gamma_2 + \vec{k}\gamma_3,
\]

where $\gamma_1, \gamma_2, \gamma_3$ are placeholders for the remaining terms. Notice that the determinant itself is a vector.
Now suppose we take the dot product of this with the vector $a\vec{\imath} + b\vec{\jmath} + c\vec{k}$. By Problem 8.3.12,

\[
\left( \vec{\imath}\gamma_1 + \vec{\jmath}\gamma_2 + \vec{k}\gamma_3 \right) \cdot \left( a\vec{\imath} + b\vec{\jmath} + c\vec{k} \right) = \gamma_1 a + \gamma_2 b + \gamma_3 c.
\]

But $\gamma_1 a + \gamma_2 b + \gamma_3 c$ is actually the determinant of the matrix

\[
\begin{pmatrix} a & b & c \\ a & b & c \\ d & e & f \end{pmatrix}
\]

(where we have replaced the entries $\vec{\imath}, \vec{\jmath}, \vec{k}$ with a, b, c).

For the matrix above, we can perform the row operation $-R_1 + R_2 \to R_2$ to reduce the second row to all zeroes. Thus, the determinant of this matrix, which is $\gamma_1 a + \gamma_2 b + \gamma_3 c$, is equal to 0.
Then $\left( \vec{\imath}\gamma_1 + \vec{\jmath}\gamma_2 + \vec{k}\gamma_3 \right) \cdot \left( a\vec{\imath} + b\vec{\jmath} + c\vec{k} \right) = 0$, indicating that $\vec{\imath}\gamma_1 + \vec{\jmath}\gamma_2 + \vec{k}\gamma_3$ and $a\vec{\imath} + b\vec{\jmath} + c\vec{k}$ are orthogonal.
By the same reasoning, we can take the dot product of $\vec{\imath}\gamma_1 + \vec{\jmath}\gamma_2 + \vec{k}\gamma_3$ and $d\vec{\imath} + e\vec{\jmath} + f\vec{k}$, which would be $\gamma_1 d + \gamma_2 e + \gamma_3 f$. Then this would be the determinant of the matrix

\[
\begin{pmatrix} d & e & f \\ a & b & c \\ d & e & f \end{pmatrix},
\]

and the row operation $-R_1 + R_3 \to R_3$ would indicate that the determinant is 0. Hence, we can likewise conclude that $\vec{\imath}\gamma_1 + \vec{\jmath}\gamma_2 + \vec{k}\gamma_3$ and $d\vec{\imath} + e\vec{\jmath} + f\vec{k}$ are orthogonal.
Thus, we have found that $\vec{\imath}\gamma_1 + \vec{\jmath}\gamma_2 + \vec{k}\gamma_3$ is orthogonal to both $a\vec{\imath} + b\vec{\jmath} + c\vec{k}$ and $d\vec{\imath} + e\vec{\jmath} + f\vec{k}$. This is the vector that we have been looking for.
This special vector is so significant in 3D geometry that we give it a special name:

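Definition 8.3.13. The cross product is defined as follows:

\[
\left( a\vec{\imath} + b\vec{\jmath} + c\vec{k} \right) \times \left( d\vec{\imath} + e\vec{\jmath} + f\vec{k} \right)
= \operatorname{Det}\begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \\ a & b & c \\ d & e & f \end{pmatrix}.
\]

This results in a vector that is orthogonal to both $a\vec{\imath} + b\vec{\jmath} + c\vec{k}$ and $d\vec{\imath} + e\vec{\jmath} + f\vec{k}$.

Problem 8.3.14. Find ⟨1, 1, 4⟩ × ⟨1, 1, 3⟩.

Solution.

\begin{align*}
\langle 1, 1, 4 \rangle \times \langle 1, 1, 3 \rangle
&= \operatorname{Det}\begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \\ 1 & 1 & 4 \\ 1 & 1 & 3 \end{pmatrix} \\
&= 3\vec{\imath} + 4\vec{\jmath} + \vec{k} - 4\vec{\imath} - 3\vec{\jmath} - \vec{k} \\
&= -\vec{\imath} + \vec{\jmath} \\
&= \langle -1, 1, 0 \rangle.
\end{align*}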

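Properties of the cross product:

1. $\vec{w} \times \vec{v} = -(\vec{v} \times \vec{w})$ (this is also known as anticommutativity).

Proof. Let $\vec{v} = a\vec{\imath} + b\vec{\jmath} + c\vec{k}$ and $\vec{w} = d\vec{\imath} + e\vec{\jmath} + f\vec{k}$. Then

\[
\vec{v} \times \vec{w} = \operatorname{Det}\begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \\ a & b & c \\ d & e & f \end{pmatrix},
\qquad
\vec{w} \times \vec{v} = \operatorname{Det}\begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \\ d & e & f \\ a & b & c \end{pmatrix}.
\]

For $\vec{w} \times \vec{v}$, rows 2 and 3 of the matrix have been swapped, so its determinant is the negative of the determinant of the matrix for $\vec{v} \times \vec{w}$. Therefore $\vec{w} \times \vec{v} = -(\vec{v} \times \vec{w})$.

2. $\vec{v} \times \vec{v} = \vec{0}$.

Proof. From the first property, $\vec{v} \times \vec{v} = -(\vec{v} \times \vec{v})$, from which the result follows.

3. The cross product is distributive over addition.

4. The standard basis vectors satisfy the following equations:

\begin{align*}
\vec{\imath} \times \vec{\jmath} &= \vec{k}, \\
\vec{\jmath} \times \vec{k} &= \vec{\imath}, \\
\vec{k} \times \vec{\imath} &= \vec{\jmath}.
\end{align*}

5. $(a\vec{v}) \times \vec{w} = a(\vec{v} \times \vec{w})$.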

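Example 8.3.15
Find the equation of the plane containing the points (2, 3, 7), (1, 5, 6), and (−4, 0, 1).

Solution. Let R = (2, 3, 7), M = (1, 5, 6), and J = (−4, 0, 1). Then $\overrightarrow{RM} = \langle -1, 2, -1 \rangle$ and $\overrightarrow{MJ} = \langle -5, -5, -5 \rangle$. Taking the cross product of these gives a vector normal to the plane, so we evaluate with minors:

\begin{align*}
\overrightarrow{RM} \times \overrightarrow{MJ}
&= \operatorname{Det}\begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \\ -1 & 2 & -1 \\ -5 & -5 & -5 \end{pmatrix} \\
&= \vec{\imath} \cdot \operatorname{Det}\begin{pmatrix} 2 & -1 \\ -5 & -5 \end{pmatrix}
 - \vec{\jmath} \cdot \operatorname{Det}\begin{pmatrix} -1 & -1 \\ -5 & -5 \end{pmatrix}
 + \vec{k} \cdot \operatorname{Det}\begin{pmatrix} -1 & 2 \\ -5 & -5 \end{pmatrix} \\
&= -15\vec{\imath} + 15\vec{k}.
\end{align*}

The normal vector is ⟨−15, 0, 15⟩, so our equation for the plane is of the form −15x + 15z = d. Now we plug in one of our three given points to find the value of d, i.e. plugging in point J we get −15(−4) + 15(1) = d = 75, therefore our equation for the plane is −15x + 15z = 75 ⟹ −x + z = 5.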
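The procedure in this example (two edge vectors, their cross product as the normal, then one point to fix d) can be sketched as follows. The names `cross` and `plane_through` are hypothetical, and the three points are assumed not to lie on one line.

```python
def cross(v, w):
    """Cross product of two 3D vectors, from the determinant definition."""
    a, b, c = v
    d, e, f = w
    return (b * f - c * e, c * d - a * f, a * e - b * d)

def plane_through(p, q, r):
    """Return (a, b, c, d) for the plane a*x + b*y + c*z = d through p, q, r.

    Assumes the three points are not collinear.
    """
    pq = tuple(qi - pi for pi, qi in zip(p, q))
    qr = tuple(ri - qi for qi, ri in zip(q, r))
    n = cross(pq, qr)
    # Plug the point p into a*x + b*y + c*z to find d.
    return (*n, sum(ni * pi for ni, pi in zip(n, p)))

print(cross((1, 1, 4), (1, 1, 3)))
print(plane_through((2, 3, 7), (1, 5, 6), (-4, 0, 1)))
```

The coefficients come out unsimplified; dividing all four by a common factor gives the same plane.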

Problem 8.3.16. Find the equation of the plane containing the points (1, 2, 4), (2, −1, 1), and
(4, 0, 5).

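Solution. Similarly, we find two vectors given the three points (1, 2, 4), (2, −1, 1), and (4, 0, 5), which are ⟨1, −3, −3⟩ and ⟨2, 1, 4⟩. We then take the cross product to find the normal vector of the plane:

\begin{align*}
\langle 1, -3, -3 \rangle \times \langle 2, 1, 4 \rangle
&= \operatorname{Det}\begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \\ 1 & -3 & -3 \\ 2 & 1 & 4 \end{pmatrix} \\
&= \vec{\imath} \cdot \operatorname{Det}\begin{pmatrix} -3 & -3 \\ 1 & 4 \end{pmatrix}
 - \vec{\jmath} \cdot \operatorname{Det}\begin{pmatrix} 1 & -3 \\ 2 & 4 \end{pmatrix}
 + \vec{k} \cdot \operatorname{Det}\begin{pmatrix} 1 & -3 \\ 2 & 1 \end{pmatrix} \\
&= -9\vec{\imath} - 10\vec{\jmath} + 7\vec{k}.
\end{align*}

So our equation for the plane is of the form −9x − 10y + 7z = d. We then plug in the point (4, 0, 5) to get −9(4) − 10(0) + 7(5) = d = −1, so the equation of the plane is −9x − 10y + 7z = −1.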

Problem 8.3.17. Find the intersection of the plane from Problem 8.3.16 with the plane x+y +z = 0.

Solution. We have the following equations:

−9x − 10y + 7z = −1, (1)


x + y + z = 0. (2)

Multiplying equation (2) by 7 and then subtracting equation (1) gives

16x + 17y = 1,

and multiplying equation (2) by 9 and then adding equation (1) gives

−y + 16z = −1.

We can use these equations to find two points which lie on the intersection of the two planes, and then determine the parameterization of the line using those two points.
Plugging in y = 1 gives the solutions (−1, 1, 0), and plugging in z = 1 gives the solutions
(−18, 17, 1). Therefore the parameterization of the line is

x = −1 − 17t,
y = 1 + 16t,
z = t.

Problem 8.3.18. Find the intersection of the plane from Problem 8.3.16 with the line that contains
(3, 4, 5) and (5, 12, 13).

Solution. The parameterization for the line with points (3, 4, 5) and (5, 12, 13) is:

x = 3 + 2t,
y = 4 + 8t,
z = 5 + 8t.

Note that we can divide coefficients of parameter t by 2, so our simplified parameterization is

x = 3 + t,
y = 4 + 4t,
z = 5 + 4t.

To find the intersection, we simply substitute in the parametric definitions into the equation of
the plane, which is −9x − 10y + 7z = −1, so we get

−9(3 + t) − 10(4 + 4t) + 7(5 + 4t) = −1.

Solving this gives t = −31/21. We plug this back into the parameterization of the line to get the point

\[
\left( \frac{32}{21}, -\frac{40}{21}, -\frac{19}{21} \right).
\]

Problem 8.3.19. Determine the distance from the origin to the line in Problem 8.3.18.

Solution. Recall that the line in discussion has the parameterization

x = 3 + t,
y = 4 + 4t,
z = 5 + 4t.

Consider the vector from (0, 0, 0) to an arbitrary point on the line, which can be represented as (3 + t, 4 + 4t, 5 + 4t). The vector going from the origin to this arbitrary point is just ⟨3 + t, 4 + 4t, 5 + 4t⟩.
We can determine a vector along the line by taking two points on the line, i.e. (3, 4, 5) and (5, 12, 13), which yields ⟨2, 8, 8⟩. The shortest segment from the origin to the line is perpendicular to the line, and by Theorem 8.1.23, $\vec{v}$ and $\vec{w}$ are orthogonal if and only if $\vec{v} \cdot \vec{w} = 0$. So we have

⟨3 + t, 4 + 4t, 5 + 4t⟩ · ⟨2, 8, 8⟩ = 0.

This evaluates to
6 + 2t + 32 + 32t + 40 + 32t = 0,
and solving gives t = −13/11. Therefore the distance is just the magnitude of the vector

⟨3 + t, 4 + 4t, 5 + 4t⟩ = ⟨20/11, −8/11, 3/11⟩,

which is

\[
\sqrt{\left( \frac{20}{11} \right)^2 + \left( -\frac{8}{11} \right)^2 + \left( \frac{3}{11} \right)^2} = \frac{\sqrt{473}}{11}.
\]

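Theorem 8.3.20
Let θ be the included angle between $\vec{v}$ and $\vec{w}$ when their tails are placed on each other. Then we have

\[
\|\vec{v} \times \vec{w}\| = \|\vec{v}\| \, \|\vec{w}\| \sin\theta.
\]

Proof. Since $\cos\theta = \dfrac{\vec{v} \cdot \vec{w}}{\|\vec{v}\| \, \|\vec{w}\|}$ and sin θ ≥ 0 for an angle between 0° and 180°, we get $\sin\theta = \sqrt{1 - \left( \dfrac{\vec{v} \cdot \vec{w}}{\|\vec{v}\| \, \|\vec{w}\|} \right)^2}$. Let $\vec{v} = \langle a, b, c \rangle$ and $\vec{w} = \langle d, e, f \rangle$. Then,

\begin{align*}
\|\vec{v}\| \, \|\vec{w}\| \sin\theta &= \sqrt{\|\vec{v}\|^2 \|\vec{w}\|^2 - (\vec{v} \cdot \vec{w})^2} \\
&= \sqrt{(a^2 + b^2 + c^2)(d^2 + e^2 + f^2) - (ad + be + cf)^2}.
\end{align*}

After much algebraic manipulation, we see that this rearranges to

\[
\sqrt{(bf - ce)^2 + (cd - af)^2 + (ae - bd)^2}.
\]

Given $\vec{v} = \langle a, b, c \rangle$ and $\vec{w} = \langle d, e, f \rangle$, we can find $\vec{v} \times \vec{w}$:

\[
\operatorname{Det}\begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \\ a & b & c \\ d & e & f \end{pmatrix}
= \vec{\imath}(bf - ce) + \vec{\jmath}(cd - af) + \vec{k}(ae - bd),
\]

and therefore $\|\vec{v} \times \vec{w}\| = \sqrt{(bf - ce)^2 + (cd - af)^2 + (ae - bd)^2}$, which we have found to be equal to $\|\vec{v}\| \, \|\vec{w}\| \sin\theta$. So, we are done.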
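For readers who want to see the algebraic manipulation in the proof, one way to carry it out is sketched below. The terms $a^2d^2$, $b^2e^2$, and $c^2f^2$ cancel in the first step, and the remaining terms regroup into three perfect squares.

```latex
\begin{align*}
(a^2+b^2+c^2)(d^2+e^2+f^2) - (ad+be+cf)^2
 &= a^2e^2 + a^2f^2 + b^2d^2 + b^2f^2 + c^2d^2 + c^2e^2 \\
 &\quad - 2abde - 2acdf - 2bcef \\
 &= (b^2f^2 - 2bcef + c^2e^2) + (c^2d^2 - 2acdf + a^2f^2) \\
 &\quad + (a^2e^2 - 2abde + b^2d^2) \\
 &= (bf-ce)^2 + (cd-af)^2 + (ae-bd)^2.
\end{align*}
```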

This allows us to compute either the magnitude of the cross product or sin θ if we have the other.
Now, recall the formula based on Theorem 6.6.1 which states that the area of the triangle formed by $\vec{v}$ and $\vec{w}$ with their tails placed on each other, and included angle θ, would be $\frac{1}{2} \|\vec{v}\| \, \|\vec{w}\| \sin\theta$. As a result of Theorem 8.3.20, we conclude that the area of such a triangle is $\frac{1}{2} \|\vec{v} \times \vec{w}\|$.
Then $\|\vec{v}\| \, \|\vec{w}\| \sin\theta = \|\vec{v} \times \vec{w}\|$ is the area of the parallelogram (since it is twice the area of the triangle):

[Figure: the parallelogram spanned by the vectors v and w, with included angle θ.]
Chapter 9

Limits

We are finally at the gates of calculus, and to begin, we must revisit the concept of limits.
I strongly recommend that you review the section on the limits of sequences, as we will be taking
it a step further when considering the limits of functions.
Note that we will only be covering an introduction of calculus, mainly limits (this chapter) and
derivatives (next chapter).

The Definition

Recall the definition for the limit of a sequence, i.e. $\lim_{n \to \infty} a_n = L$:

\[
\forall \varepsilon > 0 \ \exists N \text{ s.t. } \forall n > N, \ |a_n - L| < \varepsilon.
\]

We could also write it as

\[
\forall \varepsilon > 0 \ \exists N \text{ s.t. } n > N \longrightarrow |a_n - L| < \varepsilon,
\]

after replacing the universal quantifier with an implication.

In this section, we will discuss the finite limit of a function, which will be denoted by $\lim_{x \to a} f(x) = L$. This is read as “the limit as x approaches a of f(x) is L.” Its definition will be similar to that of the limit for a sequence.
In order to get a sense of how we can approach forming a definition for this, we keep in mind the
following idea:

We can make f (x) as close as we want to L by making x sufficiently close to a.

Keeping this principle in mind, here is the ε − δ (“epsilon-delta”) definition of the limit:

Definition 9.0.1. The limit of f(x) as x approaches a is L, i.e. $\lim_{x \to a} f(x) = L$, iff

∀ε > 0 ∃δ > 0 ∀x, 0 < |x − a| < δ → |f(x) − L| < ε.


Essentially, this definition is a slight modification of what we had for sequences.


For sequences, we were able to make $a_n$ arbitrarily close to L by going sufficiently far down the terms of the sequence.
However, for a function, we need to consider what it means for x to “approach” the value a. Here, we mean that the distance between x and a, which is |x − a|, is getting smaller.
Given some positive number ε > 0, we can find a positive number δ such that whenever the distance between x and a is less than δ (and x ≠ a), f(x) is within ε of L. Similar to the limit of a sequence, the value ε serves as a threshold for f(x) to be close to L, via the condition |f(x) − L| < ε.
Since ε can be ANY positive number, it serves as the “make f (x) as close as we want to L” part
of the principle.
We can find a sufficiently small distance between x and a (i.e. find some δ) in order to satisfy
this condition that f (x) must be within ε of L.
To prove that the limit is some particular value, we need to find an expression for δ in terms of ε
such that if |x − a| < δ, then |f (x) − L| < ε.

9.1 Linear Functions

For the following few problems, we will introduce proving limits of linear functions, as they are the
most basic and straightforward.

Example 9.1.1
Find and prove $\lim_{x \to 3} (3x + 5)$.

We can expect the limit of 3x + 5 as x approaches 3 to be 14 (we can simply plug 3 into the given function). To rigorously prove this limit, the existential quantifier in the definition suggests that we must find an expression for δ in terms of ε such that the implication

0 < |x − 3| < δ −→ |(3x + 5) − 14| < ε

is satisfied. Let’s take a closer look at |(3x + 5) − 14| < ε. We perform a series of algebraic
manipulations:

|(3x + 5) − 14| < ε
|3x − 9| < ε
|3(x − 3)| < ε
|3||x − 3| < ε
3|x − 3| < ε
|x − 3| < ε/3

Note that all of these steps are reversible, therefore we can rewrite our implication as

0 < |x − 3| < δ −→ |x − 3| < ε/3.

From here, it is clear that our implication is satisfied if we let δ = ε/3 (any smaller positive δ would also work). However, even though we have essentially ‘solved’ the proof going backwards, we must write the proof going forwards, as shown:

Proof. For a given ε > 0, let δ = ε/3. Then,
0 < |x − 3| < δ
|x − 3| < ε/3
3|x − 3| < ε
|3||x − 3| < ε
|3(x − 3)| < ε
|3x − 9| < ε
|(3x + 5) − 14| < ε

Therefore, for any given ε > 0, we have found a δ which satisfies

0 < |x − 3| < δ −→ |(3x + 5) − 14| < ε

which is the definition of $\lim_{x \to 3} (3x + 5) = 14$, so we’re done.
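A quick numerical experiment can make the role of δ concrete. The sketch below (the helper name `delta_works` is made up here) samples points with 0 < |x − a| < δ and checks |f(x) − L| < ε using exact fractions. A finite check like this can expose a δ that is too large, but it is not a substitute for the proof.

```python
from fractions import Fraction

def delta_works(f, a, L, eps, delta, samples=200):
    """Check |f(x) - L| < eps at sample points with 0 < |x - a| < delta.

    Passing this test does not prove the limit; failing it refutes the delta.
    """
    for k in range(1, samples + 1):
        offset = delta * Fraction(k, samples + 1)   # 0 < offset < delta
        for x in (a - offset, a + offset):
            if not abs(f(x) - L) < eps:
                return False
    return True

f = lambda x: 3 * x + 5
eps = Fraction(1, 100)
print(delta_works(f, 3, 14, eps, eps / 3))   # delta = eps/3 from the proof
print(delta_works(f, 3, 14, eps, eps))       # a delta that is too large
```

The first call uses the δ from the proof and passes at every sample; the second uses δ = ε and fails, because points farther than ε/3 from 3 push 3x + 5 more than ε away from 14.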

The proofs for the rest will be written forward, but for each problem, make sure you attempt the
proof and find an expression for δ in terms of ε first before looking at the solution.

Problem 9.1.2. Prove $\lim_{x \to 2} (2x + 3) = 7$.

Proof. For a given ε > 0, let δ = ε/2. Then,
0 < |x − 2| < δ
|x − 2| < ε/2
2|x − 2| < ε
|2||x − 2| < ε
|2(x − 2)| < ε
|2x − 4| < ε
|(2x + 3) − 7| < ε

Therefore, for any given ε > 0, we have found a δ which satisfies

0 < |x − 2| < δ −→ |(2x + 3) − 7| < ε

which is the definition for $\lim_{x \to 2} (2x + 3) = 7$, and we are done.

Problem 9.1.3. Prove $\lim_{x \to -2} (8 - 7x) = 22$.

Proof. For a given ε > 0, let δ = ε/7. Then,
0 < |x − (−2)| < δ
|x − (−2)| < ε/7
7|x − (−2)| < ε
|−7||x − (−2)| < ε
|−7(x − (−2))| < ε
|−7(x + 2)| < ε
|−7x − 14| < ε
|(8 − 7x) − 22| < ε

Therefore, for any given ε > 0, we have found a δ which satisfies

0 < |x − (−2)| < δ −→ |(8 − 7x) − 22| < ε

which is the definition for $\lim_{x \to -2} (8 - 7x) = 22$, so we are done.


Exercise 9.1.4. Prove $\lim_{x \to 2} (3x + 5555) = 5561$.


Example 9.1.5
Let
\[
f(x) = \begin{cases} 7 & x \le 4 \\ 5 & x > 4 \end{cases}.
\]
Prove that $\lim_{x \to 4} f(x)$ does not exist.

Proof. We proceed with proof by contradiction. Assume that $\lim_{x \to 4} f(x) = L$. Then,

∀ε > 0 ∃δ > 0 ∀x, 0 < |x − 4| < δ → |f (x) − L| < ε.

Consider the negation of this statement:

∃ε > 0 ∀δ > 0 ∃x, 0 < |x − 4| < δ ∧ |f (x) − L| ≥ ε.

As long as we can find a single value of ε such that the implication is false for any δ, then we will
have shown that the limit does not exist.
First, we rewrite the implication to get

4 − δ < x < 4 + δ → L − ε < f (x) < L + ε.

Assume ε = 1/2 (in fact, any value of ε ∈ (0, 1) can be used to demonstrate a contradiction). Then we have

4 − δ < x < 4 + δ → L − 1/2 < f(x) < L + 1/2,

which can be represented as

x ∈ (4 − δ, 4 + δ) → f(x) ∈ (L − 1/2, L + 1/2).

For any δ > 0, there are values of x in the interval (4 − δ, 4 + δ) such that x < 4 and x > 4, which imply f(x) = 7 and f(x) = 5 respectively. However, the interval (L − 1/2, L + 1/2) has a length of 1; it is impossible that 5 and 7 are both contained within this interval, so we have a contradiction.
Therefore the limit does not exist.
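The contradiction can also be seen computationally: whatever candidate L and δ are proposed, one of the two points 4 − δ/2 and 4 + δ/2 breaks the condition for ε = 1/2. The function name `counterexample` below is hypothetical, and the sketch only illustrates the argument rather than replacing it.

```python
from fractions import Fraction

def f(x):
    """The step function from Example 9.1.5."""
    return 7 if x <= 4 else 5

def counterexample(L, delta, eps=Fraction(1, 2)):
    """Return an x with 0 < |x - 4| < delta and |f(x) - L| >= eps."""
    for x in (4 - Fraction(delta) / 2, 4 + Fraction(delta) / 2):
        if abs(f(x) - L) >= eps:
            return x
    # Not reached for eps <= 1: f takes the values 7 and 5 at the two points,
    # and no L is within 1/2 of both.
    return None

for L in (5, 6, 7, Fraction(11, 2)):
    print(L, counterexample(L, Fraction(1, 1000)))
```

Every candidate L gets a violating x, no matter how small δ is made.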

Problem 9.1.6. Given the following function,

\[
f(x) = \begin{cases} 0 & x \le 0 \\ 1 & x > 0 \end{cases}
\]

prove that $\lim_{x \to 0} f(x)$ either exists and is equal to some value, or does not exist.

Proof. We shall show that there exists no limit for this function. First, assume that the limit is L.
Writing out the definition, we have

∀ε > 0 ∃δ > 0 ∀x, 0 < |x − 0| < δ −→ |f (x) − L| < ε.

As long as we find an ε that fails this definition, then we will have disproved the limit.
Consider ε = 1/4. When we take a closer look at the implication,

0 < |x − 0| < δ −→ |f(x) − L| < 1/4,

we see that |f(x) − L| < 1/4 implies that f(x) ∈ (L − 1/4, L + 1/4).
Let x = δ/2 and x = −δ/2, which are allowed because these values satisfy 0 < |x − 0| < δ. Since δ > 0, we know that f(δ/2) = 1 and f(−δ/2) = 0, based on the given definition of the function.
However, it has been demonstrated that f(x) ∈ (L − 1/4, L + 1/4), and this interval clearly cannot contain both 0 and 1 for any value of L. Therefore, we have a contradiction for ε = 1/4, so the limit does not exist.

Exercise 9.1.7. Prove $\lim_{x \to 0} \dfrac{|x|}{x}$ does not exist.

9.2 Non-linear Functions

Before we tackle some harder proofs, we first make the following key observation:
Remark 9.2.1. If the definition is true for a specific $\varepsilon_0$, then it is also true for any $\tilde{\varepsilon}_0 > \varepsilon_0$.
If a particular $\delta_0$ works, then any positive $\tilde{\delta}_0 < \delta_0$ will also work.

To see why, write out the implication part of the definition:

\[
|x - a| < \delta_0 \to |f(x) - L| < \varepsilon_0.
\]

Assume that this implication is true for $\delta_0$. Then as long as x and a are within $\delta_0$ of each other, they will be “sufficiently close” such that f(x) is within $\varepsilon_0$ of L.
If $\tilde{\delta}_0 < \delta_0$, then x and a are even closer to each other. They are more than sufficiently close in order to satisfy that f(x) is within $\varepsilon_0$ of L. Therefore, the implication

\[
|x - a| < \tilde{\delta}_0 \to |f(x) - L| < \varepsilon_0
\]

would also be true.

Likewise, if the implication is true for $\varepsilon_0$, then we have the result $|f(x) - L| < \varepsilon_0$. For $\tilde{\varepsilon}_0 > \varepsilon_0$, we have $|f(x) - L| < \varepsilon_0 < \tilde{\varepsilon}_0$, i.e. $|f(x) - L| < \tilde{\varepsilon}_0$, and thus this result must also be true.
Furthermore, we need to establish one more key point.

Lemma 9.2.2
Define nonempty open intervals P = (a, b) and Q = (c, d) for a, b, c, d ∈ R. If for every x ∈ R we have x ∈ P −→ x ∈ Q, then P ⊆ Q, i.e. (a, b) ⊆ (c, d), and subsequently,

c ≤ a < b ≤ d.

This lemma should be intuitive and does not need proof: if one interval is contained inside
another, then the bounds of the smaller interval must be between the bounds of the larger interval.
Note that for the following examples, I will be explaining the work that one must do before
writing a proper limit proof. Therefore these ‘proofs’ are written backwards, and it is left as exercises
to the reader to properly and formally write these proofs forward.

Example 9.2.3
Prove $\lim_{x \to 3} x^2 = 9$.

Proof. Focus on the implication:

|x − 3| < δ −→ |x² − 9| < ε

Replace the absolute value signs by compound inequalities, and rearrange:

3 − δ < x < 3 + δ −→ 9 − ε < x² < 9 + ε



At this point, we need to figure out a way to rewrite 9 − ε < x² < 9 + ε as an inequality on x, so we can compare the two inequalities and determine what value of δ in terms of ε is required to satisfy the definition.
The key point of this proof is to assume ε < 9. As long as we can prove that the limit exists for ε < 9, then it will automatically follow for ε ≥ 9, as explained in Remark 9.2.1.

Because we conveniently set ε < 9, we can take the square root of 9 − ε < x² < 9 + ε (for positive x), i.e.

\[
\sqrt{9 - \varepsilon} < x < \sqrt{9 + \varepsilon}.
\]

NOTE: Here is a caveat of this proof: we are assuming that the square root function is increasing in order to take the square root of that inequality. For now, we will take it for granted.
Our implication in discussion is now

\[
3 - \delta < x < 3 + \delta \longrightarrow \sqrt{9 - \varepsilon} < x < \sqrt{9 + \varepsilon}.
\]

We want to pick a δ such that $x \in (3 - \delta, 3 + \delta) \longrightarrow x \in \left( \sqrt{9 - \varepsilon}, \sqrt{9 + \varepsilon} \right)$. By Lemma 9.2.2, it is sufficient if

\[
\sqrt{9 - \varepsilon} \le 3 - \delta, \qquad 3 + \delta \le \sqrt{9 + \varepsilon}.
\]

Rearranging, we have

\[
\delta \le 3 - \sqrt{9 - \varepsilon}, \qquad \delta \le \sqrt{9 + \varepsilon} - 3.
\]

But now we have two inequalities for δ. How do we know which one to choose?
In fact, we don’t need to choose. We can define

\[
\delta = \min\left\{ 3 - \sqrt{9 - \varepsilon}, \ \sqrt{9 + \varepsilon} - 3 \right\}
\]
so that both inequalities become true. This guarantees that our choice of δ will satisfy the definition
of the given limit.
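As a numerical sanity check (not a proof), the choice of δ above can be tested by sampling. The sketch below is an illustration only: the function names `delta_for_square_limit` and `implication_holds` are hypothetical, and sampling finitely many points can only fail to find a counterexample, never establish the implication.

```python
import math

def delta_for_square_limit(eps: float) -> float:
    """delta = min{3 - sqrt(9 - eps), sqrt(9 + eps) - 3}, assuming 0 < eps < 9."""
    if not 0 < eps < 9:
        raise ValueError("this sketch assumes 0 < eps < 9")
    return min(3 - math.sqrt(9 - eps), math.sqrt(9 + eps) - 3)

def implication_holds(eps: float, samples: int = 1000) -> bool:
    """Sample x strictly inside (3 - delta, 3 + delta) and test |x^2 - 9| < eps."""
    delta = delta_for_square_limit(eps)
    for i in range(1, samples):
        x = 3 - delta + 2 * delta * i / samples
        if not abs(x * x - 9) < eps:
            return False
    return True

for eps in (7.0, 1.0, 0.01):
    print(eps, delta_for_square_limit(eps), implication_holds(eps))
```

For ε = 7 the two candidates are 3 − √2 and √16 − 3 = 1, so the minimum is 1, which matches the formula.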

Alternative Proof. We can proceed with a slightly easier proof. First, write out the implication:

\[ |x - 3| < \delta \to |x^2 - 9| < \varepsilon \]

Then we have $|x^2 - 9| = |(x - 3)(x + 3)| = |x - 3||x + 3|$. So we want to find a $\delta$ such that

\[ |x - 3| < \delta \to |x - 3||x + 3| < \varepsilon. \]

Now, assume δ < 1. We arbitrarily choose 1 since it is small and relatively straightforward to
deal with, but the proof would have been fine if we choose 2, 10, or any other positive number
instead.
Then, |x − 3| < δ means that |x − 3| < 1. By the Triangle Inequality,

|x| = |x − 3 + 3| ≤ |x − 3| + |3| < 1 + 3 = 4.



We can then apply Triangle Inequality again to find a bound on |x + 3|.

|x + 3| ≤ |x| + |3| < 4 + 3 = 7.

Now, we use the facts that $|x - 3| < \delta$ and $|x + 3| < 7$ to conclude that

\[ |x - 3||x + 3| < 7\delta, \]

which in turn we want to be less than $\varepsilon$. To accomplish this, we realize that we want $\delta < \frac{\varepsilon}{7}$.
However, we had assumed earlier that $\delta < 1$. Thus, we simply have to find a $\delta$ that is
less than both $\frac{\varepsilon}{7}$ and 1. This can be done by choosing

\[ \delta < \min\left\{1, \frac{\varepsilon}{7}\right\}, \]

guaranteeing that our definition of the limit will be satisfied.

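Problem 9.2.4. Prove $\lim_{x \to 99} x^2 = 9801$.

Solution. We must prove that

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, \ 0 < |x - 99| < \delta \to |x^2 - 9801| < \varepsilon. \]

We rewrite $0 < |x - 99| < \delta$ as $99 - \delta < x < 99 + \delta$, and $|x^2 - 9801| < \varepsilon$ as $9801 - \varepsilon < x^2 < 9801 + \varepsilon$.
We want to take the square root of the latter inequality, but this is not possible if $9801 - \varepsilon$ is negative.
Therefore, we assume $\varepsilon < 9801$ (remember, as long as $\varepsilon$ works, any $\tilde{\varepsilon} > \varepsilon$ will also work!). Then,
$9801 - \varepsilon$ is definitely positive, so we can take the square root to get $\sqrt{9801 - \varepsilon} < x < \sqrt{9801 + \varepsilon}$.

Therefore our goal is to show that

\[ 99 - \delta < x < 99 + \delta \to \sqrt{9801 - \varepsilon} < x < \sqrt{9801 + \varepsilon}. \]

With interval notation, we want to show that the interval $(99 - \delta, 99 + \delta)$ is contained in the
interval $\left(\sqrt{9801 - \varepsilon}, \sqrt{9801 + \varepsilon}\right)$ (Lemma 9.2.2). It is sufficient that

\[ \sqrt{9801 - \varepsilon} < 99 - \delta, \]
\[ 99 + \delta < \sqrt{9801 + \varepsilon}. \]

These rearrange to

\[ \delta < 99 - \sqrt{9801 - \varepsilon}, \]
\[ \delta < \sqrt{9801 + \varepsilon} - 99. \]

We want both of these inequalities to be true, so we let

\[ \delta < \min\left\{99 - \sqrt{9801 - \varepsilon}, \sqrt{9801 + \varepsilon} - 99\right\}, \]

so the implication $\forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, \ 0 < |x - 99| < \delta \to |x^2 - 9801| < \varepsilon$ is true.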
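Problem 9.2.5. Prove $\lim_{x \to 1} \frac{1}{x} = 1$.

Proof. We must have

\[ |x - 1| < \delta \to \left|\frac{1}{x} - 1\right| < \varepsilon \]

which we rewrite as

\[ 1 - \delta < x < 1 + \delta \to 1 - \varepsilon < \frac{1}{x} < 1 + \varepsilon. \]

We assume $\varepsilon < 1$. Then we can take the reciprocal of $1 - \varepsilon < \frac{1}{x} < 1 + \varepsilon$ to get $\frac{1}{1 + \varepsilon} < x < \frac{1}{1 - \varepsilon}$,
so we have

\[ 1 - \delta < x < 1 + \delta \to \frac{1}{1 + \varepsilon} < x < \frac{1}{1 - \varepsilon}, \]

which is essentially

\[ x \in (1 - \delta, 1 + \delta) \to x \in \left(\frac{1}{1 + \varepsilon}, \frac{1}{1 - \varepsilon}\right). \]

By Lemma 9.2.2, we have

\[ \frac{1}{1 + \varepsilon} \le 1 - \delta, \]
\[ 1 + \delta \le \frac{1}{1 - \varepsilon}, \]

which rearrange to

\[ \delta \le 1 - \frac{1}{1 + \varepsilon}, \]
\[ \delta \le \frac{1}{1 - \varepsilon} - 1. \]

Therefore we choose

\[ \delta = \min\left\{1 - \frac{1}{1 + \varepsilon}, \frac{1}{1 - \varepsilon} - 1\right\} \]

which satisfies the definition of our limit.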
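Problem 9.2.6. Prove $\lim_{x \to 3} \frac{1}{x} = \frac{1}{3}$.

Proof. Consider the implication $|x - 3| < \delta \to \left|\frac{1}{x} - \frac{1}{3}\right| < \varepsilon$. We break up the absolute value signs
to get

\[ 3 - \delta < x < 3 + \delta \to \frac{1}{3} - \varepsilon < \frac{1}{x} < \frac{1}{3} + \varepsilon. \]

Assume $\varepsilon < \frac{1}{3}$, which allows us to take the reciprocal of the latter inequality to get

\[ 3 - \delta < x < 3 + \delta \to \frac{1}{\frac{1}{3} + \varepsilon} < x < \frac{1}{\frac{1}{3} - \varepsilon}. \]

We want to show that the interval $(3 - \delta, 3 + \delta)$ is contained inside $\left(\frac{1}{\frac{1}{3} + \varepsilon}, \frac{1}{\frac{1}{3} - \varepsilon}\right)$. It is sufficient
that

\[ \frac{1}{\frac{1}{3} + \varepsilon} < 3 - \delta, \]
\[ 3 + \delta < \frac{1}{\frac{1}{3} - \varepsilon}. \]

These rearrange to

\[ \delta < 3 - \frac{1}{\frac{1}{3} + \varepsilon}, \]
\[ \delta < \frac{1}{\frac{1}{3} - \varepsilon} - 3. \]

Therefore we take

\[ \delta < \min\left\{3 - \frac{1}{\frac{1}{3} + \varepsilon}, \frac{1}{\frac{1}{3} - \varepsilon} - 3\right\} \]

to satisfy both inequalities.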

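Alternative Proof. We want to prove that

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, \ 0 < |x - 3| < \delta \to \left|\frac{1}{x} - \frac{1}{3}\right| < \varepsilon. \]

Assume $\delta < 1$, so $0 < |x - 3| < \delta < 1$. Then by the Triangle Inequality, $|3| \le |3 - x| + |x| =
|x - 3| + |x| < 1 + |x|$, which rearranges to $|x| > 3 - 1 = 2$. Clearly $|x|$ is greater than zero, so we
can take the reciprocal of this inequality to get $\frac{1}{|x|} < \frac{1}{2}$.
Now note that

\[ \left|\frac{1}{x} - \frac{1}{3}\right| = \left|\frac{3 - x}{3x}\right| = \frac{|3 - x|}{|3x|} = \frac{|x - 3|}{3|x|} = \frac{1}{3} \cdot \frac{1}{|x|} \cdot |x - 3| < \frac{1}{3} \cdot \frac{1}{2} \cdot \delta = \frac{\delta}{6}. \]

We want $\left|\frac{1}{x} - \frac{1}{3}\right|$ to be less than a given $\varepsilon$, so we should take $\delta < 6\varepsilon$. However, we have also
assumed $\delta < 1$, so it is sufficient to take $\delta < \min\{1, 6\varepsilon\}$ to satisfy both inequalities.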
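The bound just derived can be probed numerically. The following is a rough sketch rather than a proof: the helper names are hypothetical, and the factor 0.999 is an arbitrary way to pick a δ strictly below min{1, 6ε}.

```python
def delta_bound(eps: float) -> float:
    """Upper bound on delta from the argument above: min{1, 6 * eps}."""
    return min(1.0, 6.0 * eps)

def implication_holds(eps: float, samples: int = 2000) -> bool:
    """Test |1/x - 1/3| < eps at sample points with 0 < |x - 3| < delta."""
    delta = 0.999 * delta_bound(eps)  # any delta strictly below the bound
    for i in range(1, samples):
        x = 3 - delta + 2 * delta * i / samples
        if x == 3:
            continue  # the definition excludes x = a
        if not abs(1 / x - 1 / 3) < eps:
            return False
    return True

for eps in (0.5, 0.01, 1e-4):
    print(eps, delta_bound(eps), implication_holds(eps))
```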

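Problem 9.2.7. Prove $\lim_{x \to 1} \frac{x}{x + 1} = \frac{1}{2}$.

Proof. Again, we focus on the implication:

\[ |x - 1| < \delta \to \left|\frac{x}{x + 1} - \frac{1}{2}\right| < \varepsilon \]

Note that $|x - 1| < \delta$ rearranges to $1 - \delta < x < 1 + \delta$, and $\left|\frac{x}{x + 1} - \frac{1}{2}\right| < \varepsilon$ rearranges to:

\begin{align*}
\frac{1}{2} - \varepsilon &< \frac{x}{x + 1} < \frac{1}{2} + \varepsilon \\
-\frac{1}{2} + \varepsilon &> -\frac{x}{x + 1} > -\frac{1}{2} - \varepsilon \\
1 - \frac{1}{2} + \varepsilon &> 1 - \frac{x}{x + 1} > 1 - \frac{1}{2} - \varepsilon \\
\frac{1}{2} + \varepsilon &> \frac{1}{x + 1} > \frac{1}{2} - \varepsilon \\
\frac{1}{2} - \varepsilon &< \frac{1}{x + 1} < \frac{1}{2} + \varepsilon
\end{align*}

Assume $\varepsilon < \frac{1}{2}$. Then, we can take the reciprocal of $\frac{1}{2} - \varepsilon < \frac{1}{x + 1} < \frac{1}{2} + \varepsilon$, i.e. $\frac{1}{\frac{1}{2} - \varepsilon} > x + 1 > \frac{1}{\frac{1}{2} + \varepsilon}$,
which rearranges to $\frac{\frac{1}{2} - \varepsilon}{\frac{1}{2} + \varepsilon} < x < \frac{\frac{1}{2} + \varepsilon}{\frac{1}{2} - \varepsilon}$. Therefore our implication is

\[ 1 - \delta < x < 1 + \delta \to \frac{\frac{1}{2} - \varepsilon}{\frac{1}{2} + \varepsilon} < x < \frac{\frac{1}{2} + \varepsilon}{\frac{1}{2} - \varepsilon} \]

which can be rewritten as

\[ x \in (1 - \delta, 1 + \delta) \to x \in \left(\frac{\frac{1}{2} - \varepsilon}{\frac{1}{2} + \varepsilon}, \frac{\frac{1}{2} + \varepsilon}{\frac{1}{2} - \varepsilon}\right) \]

It is sufficient that

\[ \frac{\frac{1}{2} - \varepsilon}{\frac{1}{2} + \varepsilon} \le 1 - \delta, \]
\[ 1 + \delta \le \frac{\frac{1}{2} + \varepsilon}{\frac{1}{2} - \varepsilon}, \]

so we have $\delta \le \frac{2\varepsilon}{\frac{1}{2} - \varepsilon}$ and $\delta \le \frac{2\varepsilon}{\frac{1}{2} + \varepsilon}$. Therefore, we take

\[ \delta = \min\left\{\frac{2\varepsilon}{\frac{1}{2} - \varepsilon}, \frac{2\varepsilon}{\frac{1}{2} + \varepsilon}\right\} \]

to satisfy the definition of our limit.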
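Because every quantity in this problem is rational, the choice of δ can be spot-checked with exact arithmetic. The sketch below assumes 0 < ε < 1/2 as in the proof; the function names are hypothetical, and the sampling is illustrative rather than a proof.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def delta_for_problem(eps: Fraction) -> Fraction:
    """delta = min{2*eps/(1/2 - eps), 2*eps/(1/2 + eps)}, assuming 0 < eps < 1/2."""
    if not 0 < eps < HALF:
        raise ValueError("this sketch assumes 0 < eps < 1/2")
    return min(2 * eps / (HALF - eps), 2 * eps / (HALF + eps))

def implication_holds(eps: Fraction, samples: int = 200) -> bool:
    """Check |x/(x+1) - 1/2| < eps exactly at rational x in (1 - delta, 1 + delta)."""
    delta = delta_for_problem(eps)
    for i in range(1, samples):
        x = 1 - delta + 2 * delta * Fraction(i, samples)
        if not abs(x / (x + 1) - HALF) < eps:
            return False
    return True

print(delta_for_problem(Fraction(1, 10)))  # 1/3
print(implication_holds(Fraction(1, 10)))
```

Since ε < 1/2 forces δ < 1, every sampled x is positive, so the division by x + 1 is safe.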

Example 9.2.8
Prove $\lim_{x \to 4} (x^2 - 3x + 2) = 6$.

First, noticing that the quadratic $x^2 - 3x + 2 - 6$ is factorable, we rewrite the second part of the
implication:

\begin{align*}
|x - 4| < \delta \to{} & |x^2 - 3x + 2 - 6| < \varepsilon \\
& |x^2 - 3x - 4| < \varepsilon \\
& |(x - 4)(x + 1)| < \varepsilon
\end{align*}

i.e. $|x - 4| < \delta \to |x - 4||x + 1| < \varepsilon$.

Assume $\delta < 1$, therefore $|x - 4| < 1$.
By the Triangle Inequality, note that

\[ |x + 1| \le |x - 4| + |5| < 1 + 5 = 6. \]

We combine the facts $|x - 4| < \delta$ and $|x + 1| < 6$ to get that $|x - 4||x + 1| < 6\delta$.
This suggests that we let $\delta < \frac{\varepsilon}{6}$, but we have also assumed that $\delta < 1$, therefore we take the
minimum of those two, i.e. let

\[ \delta < \min\left\{1, \frac{\varepsilon}{6}\right\} \]

which will satisfy our definition. However, all of this work was done backwards, and cannot be
considered a proper, formal proof. Instead, when we write up the proof, we write it up forwards, as
shown:

Proof. Given $\varepsilon > 0$, choose $\delta < \min\left\{1, \frac{\varepsilon}{6}\right\}$. Note that

\[ |x - 4| < \delta \to |x - 4| < 1 \to |x + 1| \le |x - 4| + |5| < 6 \]

by the Triangle Inequality. Therefore,

\[ |x - 4| < \delta \to |x^2 - 3x + 2 - 6| = |x^2 - 3x - 4| = |(x - 4)(x + 1)| = |x - 4||x + 1| \le 6|x - 4| < 6 \cdot \frac{\varepsilon}{6} = \varepsilon. \]

Given $\varepsilon > 0$, we have found $\delta$ such that

\[ |x - 4| < \delta \to |x^2 - 3x + 2 - 6| < \varepsilon \]

i.e. $\lim_{x \to 4} (x^2 - 3x + 2) = 6$.

9.3 Limit Properties

Up to now, we have been proving limits of various functions. To make our lives easier, we will prove
some properties of limits that will allow us to take limits of many more kinds of functions.

Theorem 9.3.1 (Uniqueness of Limits)

If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} f(x) = M$, then $L = M$.

Proof. For the sake of contradiction, assume $L \ne M$. Then it follows that $|L - M| > 0$. Now
consider

\[ \lim_{x \to a} f(x) = L \leftrightarrow \forall \varepsilon > 0 \ \exists \delta_1 > 0 \ \forall x, \ 0 < |x - a| < \delta_1 \to |f(x) - L| < \varepsilon, \]
\[ \lim_{x \to a} f(x) = M \leftrightarrow \forall \varepsilon > 0 \ \exists \delta_2 > 0 \ \forall x, \ 0 < |x - a| < \delta_2 \to |f(x) - M| < \varepsilon. \]

Choose $\delta = \min\{\delta_1, \delta_2\}$, so we have

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, \ 0 < |x - a| < \delta \to |f(x) - L| < \varepsilon \wedge |f(x) - M| < \varepsilon. \]

Now consider $\varepsilon = \frac{|L - M|}{2}$ (since $\frac{|L - M|}{2} > 0$). It becomes true that

\[ 0 < |x - a| < \delta \to |f(x) - L| < \frac{|L - M|}{2} \wedge |f(x) - M| < \frac{|L - M|}{2}. \]

Then for this chosen value of $\varepsilon$, note that

\begin{align*}
|L - M| &= |L - f(x) + f(x) - M| \\
&\le |L - f(x)| + |f(x) - M| \\
&= |f(x) - L| + |f(x) - M| \\
&< \frac{|L - M|}{2} + \frac{|L - M|}{2} = |L - M|.
\end{align*}

We have $|L - M| < |L - M|$, a contradiction. Therefore $L = M$.

Problem 9.3.2. Prove that if $f(x) \ge 0 \ \forall x$ and $\lim_{x \to a} f(x) = L$, then $L \ge 0$.

Proof. Assume for the sake of contradiction that $L < 0$. We are given the definition

\[ \lim_{x \to a} f(x) = L \leftrightarrow \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, \ 0 < |x - a| < \delta \to |f(x) - L| < \varepsilon. \]

As $f(x) \ge 0$ is given, $f(x) > L$, i.e. $f(x) - L > 0$. Therefore $|f(x) - L| = f(x) - L$. Consider
$\varepsilon = -L$, which is valid because $-L > 0$. We can also choose $\varepsilon = -\frac{L}{2}$, or any other value in
terms of $L$ that would be positive. It follows that $\exists \delta > 0 \ \forall x, \ 0 < |x - a| < \delta \to f(x) - L < -L$.
But $f(x) - L < -L$ rearranges to $f(x) < 0$, which contradicts the given $f(x) \ge 0 \ \forall x$. Therefore
$L \ge 0$.

Theorem 9.3.3 (Limit of the Identity Function)

$\lim_{x \to a} x = a$.

Proof. It is sufficient to let $\delta = \varepsilon$. Then it is obvious that $0 < |x - a| < \delta \leftrightarrow 0 < |x - a| < \varepsilon \to
|x - a| < \varepsilon$.

Theorem 9.3.4 (Limit of a Constant)

$\lim_{x \to a} c = c$ for any constant $c$.

Proof. Any $\delta > 0$ is valid, since for every $\varepsilon > 0$ we have $|c - c| = 0 < \varepsilon$, which is always true.

Theorem 9.3.5 (Sum and Product of Limits)

Suppose $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$. Then

a) $\lim_{x \to a} (f(x) + g(x)) = L + M$.

b) $\lim_{x \to a} (f(x)g(x)) = LM$.

Proof. a) We are given $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$, therefore

\[ \forall \varepsilon > 0 \ \exists \delta_1 > 0 \ \forall x, \ 0 < |x - a| < \delta_1 \to |f(x) - L| < \frac{\varepsilon}{2}, \]
\[ \exists \delta_2 > 0 \ \forall x, \ 0 < |x - a| < \delta_2 \to |g(x) - M| < \frac{\varepsilon}{2}. \]

This is allowed because we can always choose a $\delta_1$ or $\delta_2$ that is small enough to make each of
$|f(x) - L|$ and $|g(x) - M|$ smaller than $\frac{\varepsilon}{2}$.
Let $\delta = \min\{\delta_1, \delta_2\}$. This is because we want to be able to use the same $\delta$ for both definitions.
Then,

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, \ 0 < |x - a| < \delta \to |f(x) - L| < \frac{\varepsilon}{2} \wedge |g(x) - M| < \frac{\varepsilon}{2}. \]

Note that $|f(x) + g(x) - (L + M)| = |(f(x) - L) + (g(x) - M)| \le |f(x) - L| + |g(x) - M|$ by
the Triangle Inequality. Therefore, given $\varepsilon > 0$, we have found a $\delta$ such that

\[ 0 < |x - a| < \delta \to |f(x) + g(x) - (L + M)| \le |f(x) - L| + |g(x) - M| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]

i.e. $\lim_{x \to a} (f(x) + g(x)) = L + M$, as desired.

b) Let $\tilde{\varepsilon}$ represent some quantity in terms of $\varepsilon$. Eventually we will figure out exactly what
expression we should set $\tilde{\varepsilon}$ equal to, by doing the proof ‘backwards,’ to demonstrate the
motivation and intuition behind the choice of $\tilde{\varepsilon}$. After we find out what $\tilde{\varepsilon}$ exactly is, we will
write the proof forwards.
As we are given $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$,

\[ \forall \varepsilon > 0 \ \exists \delta_1 > 0 \ \forall x, \ 0 < |x - a| < \delta_1 \to |f(x) - L| < \tilde{\varepsilon}, \]
\[ \exists \delta_2 > 0 \ \forall x, \ 0 < |x - a| < \delta_2 \to |g(x) - M| < \tilde{\varepsilon}. \]

Like before, let $\delta = \min\{\delta_1, \delta_2\}$. Then we can cover both cases:

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, \ 0 < |x - a| < \delta \to |f(x) - L| < \tilde{\varepsilon} \wedge |g(x) - M| < \tilde{\varepsilon}. \]

Now we apply the Triangle Inequality based on what we are trying to prove:

\begin{align*}
|f(x)g(x) - LM| &= |f(x)g(x) - f(x)M + f(x)M - LM| \\
&= |f(x)(g(x) - M) + M(f(x) - L)| \\
&\le |f(x)||g(x) - M| + |M||f(x) - L|.
\end{align*}

Now we should introduce a bound on $|f(x)|$. Therefore we should assume $\tilde{\varepsilon} < 1$ (which will hold
once we assume $\varepsilon < 1$, as Remark 9.2.1 permits). Then it
follows that $|f(x) - L| < 1$. By another application of the Triangle Inequality,

\[ |f(x)| \le |f(x) - L| + |L| < 1 + |L|. \]

We can now conclude that

\begin{align*}
|f(x)g(x) - LM| &\le |f(x)||g(x) - M| + |M||f(x) - L| \\
&< (1 + |L|)\tilde{\varepsilon} + |M|\tilde{\varepsilon} \\
&= (1 + |L| + |M|)\tilde{\varepsilon}.
\end{align*}

It would be sufficient to let $\tilde{\varepsilon} = \frac{\varepsilon}{1 + |L| + |M|}$, so $|f(x)g(x) - LM| < \tilde{\varepsilon}(1 + |L| + |M|) = \varepsilon$, as
desired.

Now we should write the proof forwards:

As we are given $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$,

\[ \forall \varepsilon > 0 \ \exists \delta_1 > 0 \ \forall x, \ 0 < |x - a| < \delta_1 \to |f(x) - L| < \frac{\varepsilon}{1 + |L| + |M|}, \]
\[ \exists \delta_2 > 0 \ \forall x, \ 0 < |x - a| < \delta_2 \to |g(x) - M| < \frac{\varepsilon}{1 + |L| + |M|}. \]

Let $\delta = \min\{\delta_1, \delta_2\}$. Then $\exists \delta > 0$ such that

\[ 0 < |x - a| < \delta \to |f(x) - L| < \frac{\varepsilon}{1 + |L| + |M|} \wedge |g(x) - M| < \frac{\varepsilon}{1 + |L| + |M|}. \]

Assume $\varepsilon < 1$, so that $\frac{\varepsilon}{1 + |L| + |M|} < 1$ and hence $|f(x) - L| < 1$. Then $|f(x)| \le |f(x) - L| + |L| < 1 + |L|$. Therefore,
we conclude that

\begin{align*}
|f(x)g(x) - LM| &= |f(x)g(x) - f(x)M + f(x)M - LM| \\
&= |f(x)(g(x) - M) + M(f(x) - L)| \\
&\le |f(x)||g(x) - M| + |M||f(x) - L| \\
&< (1 + |L|)\frac{\varepsilon}{1 + |L| + |M|} + |M|\frac{\varepsilon}{1 + |L| + |M|} \\
&= (1 + |L| + |M|)\frac{\varepsilon}{1 + |L| + |M|} \\
&= \varepsilon.
\end{align*}

Corollary 9.3.6 (Difference of Limits)

\[ \lim_{x \to a} (f(x) - g(x)) = \lim_{x \to a} f(x) - \lim_{x \to a} g(x). \]

Proof. It follows from Theorem 9.3.4 and Theorem 9.3.5 that

\[ \lim_{x \to a} (f(x) - g(x)) = \lim_{x \to a} f(x) + \lim_{x \to a} (-g(x)) = \lim_{x \to a} f(x) + \lim_{x \to a} (-1) \cdot \lim_{x \to a} g(x) = \lim_{x \to a} f(x) - \lim_{x \to a} g(x). \]

Problem 9.3.7. Prove $\lim_{x \to 5} (2x^3 - x^2) = 225$ using the limit properties we have just established.

Proof. We can ‘break up’ the polynomial into smaller ‘pieces,’ by applying sum, difference, and
product rules of limits.

\begin{align*}
\lim_{x \to 5} (2x^3 - x^2) &= \lim_{x \to 5} 2x^3 - \lim_{x \to 5} x^2 \\
&= \lim_{x \to 5} 2 \cdot \lim_{x \to 5} x \cdot \lim_{x \to 5} x \cdot \lim_{x \to 5} x - \lim_{x \to 5} x \cdot \lim_{x \to 5} x \\
&= 2 \cdot 5 \cdot 5 \cdot 5 - 5 \cdot 5 \\
&= 225.
\end{align*}

Lemma 9.3.8
If $0 \le j(x) \le k(x) \ \forall x$ and $\lim_{x \to a} k(x) = 0$, then $\lim_{x \to a} j(x) = 0$.

Proof. It follows that $k(x) - j(x) \ge 0 \ \forall x$. By Problem 9.3.2, $\lim_{x \to a} (k(x) - j(x)) \ge 0$. But note that
$\lim_{x \to a} (k(x) - j(x)) = \lim_{x \to a} k(x) - \lim_{x \to a} j(x) = -\lim_{x \to a} j(x)$ (this step presupposes that $\lim_{x \to a} j(x)$ exists), therefore $\lim_{x \to a} j(x) \le 0$. However, we are also
given $j(x) \ge 0 \ \forall x$, so by Problem 9.3.2, $\lim_{x \to a} j(x) \ge 0$. Thus we can only conclude $\lim_{x \to a} j(x) = 0$.

Theorem 9.3.9 (Squeeze Theorem)

Suppose $f(x) \le g(x) \le h(x) \ \forall x$ and $\lim_{x \to a} f(x) = \lim_{x \to a} h(x) = L$. Then $\lim_{x \to a} g(x) = L$.

Proof. The given inequality rearranges to $0 \le g(x) - f(x) \le h(x) - f(x)$. Note that $\lim_{x \to a} (h(x) - f(x)) =
\lim_{x \to a} h(x) - \lim_{x \to a} f(x) = L - L = 0$. By Lemma 9.3.8, $\lim_{x \to a} (g(x) - f(x)) = 0$, i.e. $\lim_{x \to a} g(x) = \lim_{x \to a} f(x) =
L$, as desired.

Alternative Proof. In fact, we could prove this theorem from scratch, without using previous lemmas.
Given $\varepsilon > 0$, by definition of limits, there exist $\delta_1, \delta_2$ such that

\[ 0 < |x - a| < \delta_1 \to |f(x) - L| < \varepsilon, \]
\[ 0 < |x - a| < \delta_2 \to |h(x) - L| < \varepsilon. \]

Let $\delta = \min\{\delta_1, \delta_2\}$. It follows that

\[ 0 < |x - a| < \delta \to L - \varepsilon < f(x), \ h(x) < L + \varepsilon. \]

Then

\[ L - \varepsilon < f(x) \le g(x) \le h(x) < L + \varepsilon \implies |g(x) - L| < \varepsilon, \]

so this choice of $\delta$ establishes that $\lim_{x \to a} g(x) = L$.

Theorem 9.3.10 (Reciprocal of Limits)

Suppose $\lim_{x \to a} g(x) = L$, and $L \ne 0$. Then $\lim_{x \to a} \frac{1}{g(x)} = \frac{1}{L}$.

Proof. Given $\varepsilon > 0$, there exists $\delta_1$ such that $0 < |x - a| < \delta_1 \to |g(x) - L| < \frac{|L|}{2}$, since $\frac{|L|}{2}$ is a
number greater than 0. Likewise, there exists $\delta_2$ such that $0 < |x - a| < \delta_2 \to |g(x) - L| < \frac{\varepsilon |L|^2}{2}$.
Then take $\delta = \min\{\delta_1, \delta_2\}$, such that

\[ 0 < |x - a| < \delta \to |g(x) - L| < \frac{|L|}{2} \wedge |g(x) - L| < \frac{\varepsilon |L|^2}{2}. \]

By the Triangle Inequality, $|L| \le |L - g(x)| + |g(x)| = |g(x) - L| + |g(x)| < \frac{|L|}{2} + |g(x)|$, therefore
$|g(x)| > \frac{|L|}{2}$. As both quantities are positive, we can take the reciprocal to get $\frac{1}{|g(x)|} < \frac{2}{|L|}$. Note
that we applied the Triangle Inequality on $|L|$ because we seek a lower bound for $|g(x)|$, such that
we have an upper bound for $\frac{1}{|g(x)|}$, which is needed to finish the proof.
Therefore, we have found a $\delta$ such that

\[ \left|\frac{1}{g(x)} - \frac{1}{L}\right| = \left|\frac{L - g(x)}{g(x)L}\right| = \frac{|g(x) - L|}{|g(x)||L|} < \frac{2}{|L|} \cdot \frac{1}{|L|} \cdot \frac{\varepsilon |L|^2}{2} = \varepsilon, \]

given $\varepsilon > 0$, and we can conclude that $\lim_{x \to a} \frac{1}{g(x)} = \frac{1}{L}$.

Corollary 9.3.11 (Quotient of Limits)

Provided $\lim_{x \to a} g(x) \ne 0$,
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)}. \]

Proof. This follows from application of the reciprocal rule with the product rule.

Lemma 9.3.12 (Continuity of Polynomials)

Let $P(x)$ be a polynomial. Then $\lim_{x \to a} P(x) = P(a)$.

Proof. Let $P(x) = \sum_{k=0}^{n} b_k x^k$. Then,

\begin{align*}
\lim_{x \to a} P(x) &= \lim_{x \to a} \sum_{k=0}^{n} b_k x^k \\
&= \sum_{k=0}^{n} \lim_{x \to a} b_k \cdot \left(\lim_{x \to a} x\right)^k \\
&= \sum_{k=0}^{n} b_k a^k \\
&= P(a).
\end{align*}

This lemma officially establishes that we can find the limit of any polynomial by simply plugging
in the number and computing the answer.
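Since the lemma reduces a polynomial limit to evaluating P(a), a short evaluator is enough to check such limits. The sketch below uses Horner's method, which is a standard evaluation scheme and not something the text itself introduces; the function name and the coefficient-list convention are assumptions made for illustration.

```python
def evaluate_polynomial(coefficients, a):
    """Evaluate b_0 + b_1*x + ... + b_n*x^n at x = a.

    `coefficients` is the list [b_0, b_1, ..., b_n] (hypothetical convention).
    Horner's method: start from b_n and repeatedly multiply by a, then add the next coefficient.
    """
    result = 0
    for b in reversed(coefficients):
        result = result * a + b
    return result

# 2x^3 - x^2 has b_0 = 0, b_1 = 0, b_2 = -1, b_3 = 2.
print(evaluate_polynomial([0, 0, -1, 2], 5))  # 225
# x^2 - 3x + 2 has b_0 = 2, b_1 = -3, b_2 = 1.
print(evaluate_polynomial([2, -3, 1], 4))     # 6
```

The two sample values agree with Problem 9.3.7 and Example 9.2.8.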

Lemma 9.3.13 (Continuity of Rational Functions)

Let $P(x), Q(x)$ be polynomials. If $Q(a) \ne 0$, then

\[ \lim_{x \to a} \frac{P(x)}{Q(x)} = \frac{P(a)}{Q(a)}. \]

Proof. We combine the results of Corollary 9.3.11 with Lemma 9.3.12.

Problem 9.3.14. Compute the following limits:

1. $\lim_{x \to 2} \frac{x^2 - 5}{x^2 + 3x + 1}$

2. $\lim_{x \to 2} \frac{x^2 - 4}{x^2 - 3x + 2}$

3. $\lim_{x \to 2} \frac{x^3 - 8}{x^2 - 5x + 6}$

4. $\lim_{x \to -3} \frac{x^3 + 27}{x^4 - 81}$

Proof.

1. $\lim_{x \to 2} \frac{x^2 - 5}{x^2 + 3x + 1} = \frac{2^2 - 5}{2^2 + 3 \cdot 2 + 1} = -\frac{1}{11}$.

2. We cannot directly plug in $x = 2$, since the denominator becomes 0. However, the polynomials
in the numerator and denominator both share $x - 2$ as a common factor, so we cancel
those out, leaving us with an expression for which we can plug in $x = 2$ without issues:
\[ \lim_{x \to 2} \frac{x^2 - 4}{x^2 - 3x + 2} = \lim_{x \to 2} \frac{(x + 2)(x - 2)}{(x - 2)(x - 1)} = \lim_{x \to 2} \frac{x + 2}{x - 1} = 4. \]

3. $\lim_{x \to 2} \frac{x^3 - 8}{x^2 - 5x + 6} = \lim_{x \to 2} \frac{(x - 2)(x^2 + 2x + 4)}{(x - 2)(x - 3)} = \lim_{x \to 2} \frac{x^2 + 2x + 4}{x - 3} = -12$.

4. \begin{align*}
\lim_{x \to -3} \frac{x^3 + 27}{x^4 - 81} &= \lim_{x \to -3} \frac{(x + 3)(x^2 - 3x + 9)}{(x^2 + 9)(x^2 - 9)} \\
&= \lim_{x \to -3} \frac{(x + 3)(x^2 - 3x + 9)}{(x^2 + 9)(x + 3)(x - 3)} \\
&= \lim_{x \to -3} \frac{x^2 - 3x + 9}{(x^2 + 9)(x - 3)} \\
&= \frac{(-3)^2 - 3(-3) + 9}{((-3)^2 + 9)(-3 - 3)} \\
&= -\frac{1}{4}.
\end{align*}
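The four values can be cross-checked by evaluating each original quotient at rational points very close to the limit point. This is only a sketch with hypothetical names (`CASES`, `gap`); closeness at a few sample points is evidence, not a proof, and exact `Fraction` arithmetic is used so that the near-0/0 forms do not suffer from rounding.

```python
from fractions import Fraction

# (description, function, limit point a, value computed in the solution)
CASES = [
    ("(x^2 - 5)/(x^2 + 3x + 1)", lambda x: (x**2 - 5) / (x**2 + 3 * x + 1), 2, Fraction(-1, 11)),
    ("(x^2 - 4)/(x^2 - 3x + 2)", lambda x: (x**2 - 4) / (x**2 - 3 * x + 2), 2, Fraction(4)),
    ("(x^3 - 8)/(x^2 - 5x + 6)", lambda x: (x**3 - 8) / (x**2 - 5 * x + 6), 2, Fraction(-12)),
    ("(x^3 + 27)/(x^4 - 81)", lambda x: (x**3 + 27) / (x**4 - 81), -3, Fraction(-1, 4)),
]

def gap(f, a, claimed, h):
    """Exact distance between f(a + h) and the claimed limit, for rational h != 0."""
    return abs(f(a + h) - claimed)

h = Fraction(1, 10**6)
for name, f, a, claimed in CASES:
    # Approach a from the right and from the left.
    print(name, float(gap(f, a, claimed, h)), float(gap(f, a, claimed, -h)))
```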

9.4 Other Limits

When we say “x approaches a,” it can either indicate that x is greater than a so x would be decreasing
to get closer to a, or x is less than a and x would be increasing to get closer to a.
In our definition of the limit so far, we dealt with this by taking the absolute value of x − a to
get the distance between them. However, we can also specify whether x > a or x < a, through two
types of limits:

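Definition 9.4.1. The right-hand limit as $x \to a$ of $f(x)$ is $L$ when

\[ \lim_{x \to a^+} f(x) = L \leftrightarrow \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, \ 0 < x - a < \delta \to |f(x) - L| < \varepsilon. \]

Definition 9.4.2. The left-hand limit as $x \to a$ of $f(x)$ is $L$ when

\[ \lim_{x \to a^-} f(x) = L \leftrightarrow \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, \ 0 < a - x < \delta \to |f(x) - L| < \varepsilon. \]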

The right-hand limit deals with x approaching a from the right, while the left-hand limit deals
with x approaching a from the left.
Notice that the only difference between their definitions and the original definition is the
replacement of |x − a| with either x − a or a − x. These conditions indicate x > a and x < a
respectively.
Since |x − a| can only be one of these two expressions, we have an obvious result that you should
attempt to prove.
Exercise 9.4.3. Given $\lim_{x \to a^+} f(x)$ and $\lim_{x \to a^-} f(x)$ exist, prove that $\lim_{x \to a^+} f(x) = \lim_{x \to a^-} f(x) = L \leftrightarrow
\lim_{x \to a} f(x) = L$.

Problem 9.4.4. Let $f(x)$ be the function considered in Example 9.1.5. Evaluate $\lim_{x \to 4^+} f(x)$ and
$\lim_{x \to 4^-} f(x)$.

Solution. For $\lim_{x \to 4^+} f(x)$, we are considering $x > 4$. By the piecewise definition, we must have
$\lim_{x \to 4^+} f(x) = 5$.
Likewise, for $\lim_{x \to 4^-} f(x)$, $x < 4$ in this context, so $\lim_{x \to 4^-} f(x) = 7$.

Problem 9.4.5. Prove $\lim_{x \to 0^+} \frac{|x|}{x} = 1$.

Proof. Based on the definition of the right-hand limit, we must prove

\[ 0 < x - 0 < \delta \to \left|\frac{|x|}{x} - 1\right| < \varepsilon. \]

Since $x > 0$, we know that $\frac{|x|}{x} = \frac{x}{x} = 1$, i.e. $\left|\frac{|x|}{x} - 1\right| = 0 < \varepsilon$, which is true.

Problem 9.4.6. Prove $\lim_{x \to 0^-} \frac{|x|}{x} = -1$.

Proof. Based on the definition of the left-hand limit, we must prove

\[ 0 < 0 - x < \delta \to \left|\frac{|x|}{x} - (-1)\right| < \varepsilon. \]

Since $-\delta < x < 0$, we know that $\frac{|x|}{x} = \frac{-x}{x} = -1$. Then,

\[ \left|\frac{|x|}{x} - (-1)\right| = |-1 - (-1)| = 0, \]

which is always less than $\varepsilon$ since $\varepsilon$ is any positive number.

Problem 9.4.7. Consider the function

\[ f(x) = \begin{cases} x^2 - 1 & x \ge 5 \\ 7x - 11 & x < 5 \end{cases} \]

Prove or disprove that $\lim_{x \to 5} f(x)$ exists.

Proof. We use the result of Exercise 9.4.3. Note that $\lim_{x \to 5^+} f(x) = \lim_{x \to 5^+} (x^2 - 1) = 24$, and $\lim_{x \to 5^-} f(x) =
\lim_{x \to 5^-} (7x - 11) = 24$. As $\lim_{x \to 5^+} f(x) = \lim_{x \to 5^-} f(x) = 24$, $\lim_{x \to 5} f(x) = 24$.
Problem 9.4.8. Find $\lim_{x \to 4} \frac{|x^2 - 16|}{x - 4}$.

Solution. First, we factor: $\lim_{x \to 4} \frac{|x^2 - 16|}{x - 4} = \lim_{x \to 4} \frac{|x - 4||x + 4|}{x - 4}$.
Now, we consider the right-hand and left-hand limits separately.
For the right-hand limit, $\lim_{x \to 4^+} \frac{|x - 4||x + 4|}{x - 4}$, note that $x > 4$ in this case. Then, $|x - 4| = x - 4$.
Thus,

\[ \lim_{x \to 4^+} \frac{|x - 4||x + 4|}{x - 4} = \lim_{x \to 4^+} \frac{(x - 4)|x + 4|}{x - 4} = \lim_{x \to 4^+} |x + 4| = 8. \]

For the left-hand limit, $\lim_{x \to 4^-} \frac{|x - 4||x + 4|}{x - 4}$, we must have $x < 4$. Thus, $|x - 4| = 4 - x$, and we
have

\[ \lim_{x \to 4^-} \frac{|x - 4||x + 4|}{x - 4} = \lim_{x \to 4^-} \frac{(4 - x)|x + 4|}{x - 4} = \lim_{x \to 4^-} -|x + 4| = -8. \]

Since $\lim_{x \to 4^+} \frac{|x - 4||x + 4|}{x - 4} \ne \lim_{x \to 4^-} \frac{|x - 4||x + 4|}{x - 4}$, the limit $\lim_{x \to 4} \frac{|x - 4||x + 4|}{x - 4}$, i.e. $\lim_{x \to 4} \frac{|x^2 - 16|}{x - 4}$, does not
exist.
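The disagreement between the two one-sided limits is easy to see numerically. The snippet below is a small illustration (the name `f` and the sample step sizes are arbitrary choices): values just to the right of 4 settle near 8, and values just to the left settle near −8.

```python
def f(x: float) -> float:
    """The quotient |x^2 - 16| / (x - 4), defined for x != 4."""
    return abs(x**2 - 16) / (x - 4)

for h in (1e-1, 1e-3, 1e-6):
    # f(4 + h) approaches the right-hand limit, f(4 - h) the left-hand limit.
    print(h, f(4 + h), f(4 - h))
```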

Up to now, we have been dealing with x approaching some finite number a. However, as we did
for sequences, we can define x going to infinity as well. In fact, since we are dealing with functions,
it is possible for x to go to negative infinity as well!
As the Cartesian plane extends infinitely in both dimensions, we could also have f (x) going to
infinity or negative infinity for some function f .
Keeping these possibilities in mind, we define a new kind of limits for these:

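Definition 9.4.9 (Infinite Limits). We introduce new variables $N$ and $M$ to deal with cases when
$x$ or $f(x)$ go to $\infty$ or $-\infty$. Then,

• $\lim_{x \to \infty} f(x) = L \leftrightarrow \forall \varepsilon > 0 \ \exists N \ \forall x, \ x > N \to |f(x) - L| < \varepsilon$.

• $\lim_{x \to -\infty} f(x) = L \leftrightarrow \forall \varepsilon > 0 \ \exists N \ \forall x, \ x < N \to |f(x) - L| < \varepsilon$.

• $\lim_{x \to \infty} f(x) = \infty \leftrightarrow \forall M \ \exists N \ \forall x, \ x > N \to f(x) > M$.

• $\lim_{x \to -\infty} f(x) = \infty \leftrightarrow \forall M \ \exists N \ \forall x, \ x < N \to f(x) > M$.

• $\lim_{x \to \infty} f(x) = -\infty \leftrightarrow \forall M \ \exists N \ \forall x, \ x > N \to f(x) < M$.

• $\lim_{x \to -\infty} f(x) = -\infty \leftrightarrow \forall M \ \exists N \ \forall x, \ x < N \to f(x) < M$.

• $\lim_{x \to a} f(x) = \infty \leftrightarrow \forall M \ \exists \delta > 0 \ \forall x, \ 0 < |x - a| < \delta \to f(x) > M$.

• $\lim_{x \to a} f(x) = -\infty \leftrightarrow \forall M \ \exists \delta > 0 \ \forall x, \ 0 < |x - a| < \delta \to f(x) < M$.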

How did we even come up with these definitions? Well, if we have x or f (x) approaching some
finite value, then we NEED to signify that the distance between the two is less than some chosen
positive number: we use δ for x approaching a and ε for f (x) approaching the limit L. Of course,
the “distance” is represented by taking the absolute value of the difference between the two.
Otherwise, if we have ∞ or −∞ involved, then we need to indicate that the value in question (x
or f (x)) increases or decreases without bound.
For example, if we wanted to show $x \to \infty$, then we would use the idea that for ANY real number
$N$ you choose (even when $N$ is an incredibly large number), $x$ would eventually be greater than that
number. This fits with the notion of infinity: no fixed number can bound $x$ from above, since $x$ keeps
increasing.
Likewise, if we had $x \to -\infty$, then $x$ would eventually be less than any real number we choose, since
it keeps decreasing.
The exact reasoning applies to f (x) as well.
Thus, whenever we have f (x) going to some finite limit L, then we would include ∀ε > 0 and
|f (x) − L| < ε.
However, if f (x) went to ∞ or −∞, then we would have ∀M and f (x) > M or f (x) < M
respectively.
If x went to some finite number a, then we would say ∃δ > 0 and 0 < |x − a| < δ.
If x went to ∞ or −∞, then we would define ∃N and x > N or x < N respectively.
Then, when we consider the right-hand or left-hand limits as f (x) goes to ∞ or −∞, we can just
replace 0 < |x − a| < δ (in the original definition) with 0 < x − a < δ or 0 < a − x < δ respectively.
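Problem 9.4.10. Prove $\lim_{x \to \infty} \frac{1}{x} = 0$.

Proof. We want to prove that

\[ \forall \varepsilon > 0 \ \exists N \ \forall x, \ x > N \to \left|\frac{1}{x} - 0\right| < \varepsilon. \]

Assume $N > 0$ (we have the same freedom in choosing $N$ as we had with $\delta$). Then $x > N >
0 \to \frac{1}{x} < \frac{1}{N}$. Since $x > 0$ by assumption, $x = |x|$, so we have $\frac{1}{|x|} < \frac{1}{N}$, i.e. $\left|\frac{1}{x} - 0\right| < \frac{1}{N}$. We
want $\left|\frac{1}{x} - 0\right|$ to be less than a given $\varepsilon$, which suggests that $\frac{1}{N} < \varepsilon$. Therefore it is sufficient to take
$N > \frac{1}{\varepsilon}$.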
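Problem 9.4.11. Prove $\lim_{x \to 0^-} \frac{1}{x} = -\infty$.

Proof. Consider the corresponding definition

\[ \forall M \ \exists \delta > 0 \ \forall x, \ 0 < 0 - x < \delta \to \frac{1}{x} < M. \]

We rewrite the former inequality as $0 > x > -\delta$. Since $x$ and $-\delta$ are both negative, we can take
the reciprocal to get $\frac{1}{x} < -\frac{1}{\delta}$. However, we want $\frac{1}{x}$ to be less than a given $M$, so it is sufficient to
take $-\frac{1}{\delta} < M$. We may assume $M < 0$, since a $\delta$ that works for some $M$ also works for any larger $M$;
then $-\frac{1}{\delta} < M$ rearranges to $\delta < -\frac{1}{M}$.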
Problem 9.4.12. Prove $\lim_{x \to 3^-} \frac{1}{x - 3} = -\infty$.

Proof. Writing out the definition,

\[ \forall N \ \exists \delta > 0 \ \forall x, \ 0 < 3 - x < \delta \to \frac{1}{x - 3} < N. \]

First, note that $0 < 3 - x < \delta$ can be rewritten as $0 > x - 3 > -\delta$.
If a $\delta$ works for some value of $N$, then it must also work for any greater value of $N$. Therefore, we can
assume that $N < 0$. Now, we can reciprocate the inequality $\frac{1}{x - 3} < N$ to get $x - 3 > \frac{1}{N}$. Now,
it becomes clear that we want $-\delta = \frac{1}{N}$, or $\delta = -\frac{1}{N}$. Then, $0 > x - 3 > \frac{1}{N} \to \frac{1}{x - 3} < N$ as
desired.
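Problem 9.4.13. Prove $\lim_{x \to \infty} \frac{\sin x}{x} = 0$.

Proof. Consider the definition of the limit:

\[ \forall \varepsilon > 0 \ \exists N \ \forall x, \ x > N \to \left|\frac{\sin x}{x}\right| < \varepsilon. \]

Rewrite $\left|\frac{\sin x}{x}\right|$ as $\frac{|\sin x|}{|x|}$. Note that $-1 \le \sin x \le 1$, so $|\sin x| \le 1$. Therefore, $\frac{|\sin x|}{|x|} \le \frac{1}{|x|}$. As
before, we can assume $N > 0$ so that $x > N \to x > 0 \to |x| = x$, and $\frac{1}{|x|} = \frac{1}{x}$. Putting all of this
together, we need

\[ \left|\frac{\sin x}{x}\right| = \frac{|\sin x|}{|x|} \le \frac{1}{|x|} = \frac{1}{x} < \varepsilon \]

to be true. It then suffices to prove that

\[ \forall \varepsilon > 0 \ \exists N \ \forall x, \ x > N \to \frac{1}{x} < \varepsilon. \]

However, this is just the definition of $\lim_{x \to \infty} \frac{1}{x} = 0$, and we have already proven this in Problem 9.4.10.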

Problem 9.4.14. Prove $\lim_{x \to \infty} \sin x$ does not exist.

Proof. Assume for the sake of contradiction that the limit $L$ exists. The definition for $\lim_{x \to \infty} \sin x = L$ is:

\[ \forall \varepsilon > 0 \ \exists N \ \forall x, \ x > N \to |\sin x - L| < \varepsilon. \]

Consider the negation of this statement:

\[ \exists \varepsilon > 0 \ \forall N \ \exists x, \ x > N \wedge |\sin x - L| \ge \varepsilon. \]

If we can choose a value of $\varepsilon$ such that no value of $N$ works, then there will be no limit $L$ that
exists. I claim that $\varepsilon = \frac{1}{4}$ fails the definition (as a note, we can choose any $\varepsilon$ less than 1 and greater
than 0 to demonstrate a contradiction).
For any $N$, there is a multiple of $2\pi$ that is at least $N$ (in particular, it would be $2\pi \left\lceil \frac{N}{2\pi} \right\rceil$,
but that is not important in this proof). Let $T$ be this multiple of $2\pi$. Then $T + \frac{\pi}{2}$ and $T + \frac{3\pi}{2}$ are both
greater than $N$, with $\sin\left(T + \frac{\pi}{2}\right) = 1$ and
$\sin\left(T + \frac{3\pi}{2}\right) = -1$.
Our definition of the limit suggests $|\sin x - L| < \varepsilon$, i.e. $L - \varepsilon < \sin x < L + \varepsilon$. This implies
$\sin x \in (L - \varepsilon, L + \varepsilon)$. Noting that we chose $\varepsilon = \frac{1}{4}$, we see that this is an interval of size $\frac{1}{2}$. This
interval cannot possibly contain both $-1$ and $1$, and so no value of $N$ could possibly work for $\varepsilon = \frac{1}{4}$.
We have arrived at a contradiction, thus, no value of $L$ works, and the limit does not exist.
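The two witness points used in the argument can be produced explicitly. The sketch below is illustrative only: the function names are hypothetical, and it picks the multiple of 2π as 2π(⌊N/2π⌋ + 1), a slightly different but equally valid choice that is strictly greater than N.

```python
import math

def witnesses_beyond(n: float):
    """Return two points greater than n where sin equals 1 and -1 respectively."""
    t = 2 * math.pi * (math.floor(n / (2 * math.pi)) + 1)  # a multiple of 2*pi exceeding n
    return t + math.pi / 2, t + 3 * math.pi / 2

def fits_in_window(candidate_limit: float, n: float, eps: float = 0.25) -> bool:
    """Would sin at both witness points lie within eps of the candidate limit?"""
    return all(abs(math.sin(x) - candidate_limit) < eps for x in witnesses_beyond(n))

for n in (0, 10, 1000):
    x1, x2 = witnesses_beyond(n)
    print(n, math.sin(x1), math.sin(x2), fits_in_window(0.0, n))
```

Whatever candidate limit is tried, the window of width 1/2 cannot hold both 1 and −1, so `fits_in_window` reports False.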

9.5 Trigonometric Limits


We define sine, cosine via the unit circle.

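Example 9.5.1
Prove that $\lim_{\theta \to 0} \sin\theta = 0$ and $\lim_{\theta \to 0} \cos\theta = 1$.

Proof. Assume $0 < \theta < \frac{\pi}{2}$, where $\theta$ is in radians. We only care about $\theta$ being close to 0, which is
why we restrict $\theta$ to be less than $\frac{\pi}{2}$ radians. Consider the unit circle:

[Figure: unit circle centered at $O$ with radius 1 and central angle $\theta$. $B$ is the point of the circle on the horizontal axis, $C$ is the point of the circle at angle $\theta$, and $A$ is the foot of the perpendicular from $C$ to $OB$, so that $OA = \cos\theta$, $AC = \sin\theta$, and $AB = 1 - \cos\theta$.]

We will take for granted that $\overset{\frown}{BC} > BC$ (we will gloss over the rigorous proof for this), and
since $BC$ is the hypotenuse of $\triangle ABC$, $BC > AC$ and $BC > AB$. Therefore, $\overset{\frown}{BC} > AC$ and
$\overset{\frown}{BC} > AB$. Since $\theta$ is in radians, the length of $\overset{\frown}{BC}$ is $\theta$. Note that $AC = \sin\theta$ and $AB = 1 - \cos\theta$.
We have the inequalities:

\[ 0 < \sin\theta < \theta, \]
\[ 0 < 1 - \cos\theta < \theta. \]

As $\lim_{\theta \to 0^+} 0 = \lim_{\theta \to 0^+} \theta = 0$, by Theorem 9.3.9, $\lim_{\theta \to 0^+} \sin\theta = \lim_{\theta \to 0^+} (1 - \cos\theta) = 0$; the latter rearranges
to $\lim_{\theta \to 0^+} \cos\theta = 1$.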

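Lemma 9.5.2
Assuming both limits exist, $\lim_{x \to a^+} f(x) = \lim_{x \to -a^-} f(-x)$.

Proof. Consider the definition of $\lim_{x \to a^+} f(x) = L$,

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, \ 0 < x - a < \delta \to |f(x) - L| < \varepsilon. \]

Substitute $x \to -x$. Then,

\[ 0 < (-x) - a < \delta \to |f(-x) - L| < \varepsilon \]
\[ 0 < -a - x < \delta \to |f(-x) - L| < \varepsilon \]

which is the definition of $\lim_{x \to -a^-} f(-x) = L$. Therefore the definitions are equivalent, i.e. $\lim_{x \to a^+} f(x) =
\lim_{x \to -a^-} f(-x)$.

By Lemma 9.5.2, it follows that $\lim_{\theta \to 0^+} \sin\theta = \lim_{\theta \to -0^-} \sin(-\theta) = -\lim_{\theta \to -0^-} \sin\theta = -\lim_{\theta \to 0^-} \sin\theta$.
Since $\lim_{\theta \to 0^+} \sin\theta = 0$, $-\lim_{\theta \to 0^-} \sin\theta = 0$, i.e. $\lim_{\theta \to 0^-} \sin\theta = 0$. By Exercise 9.4.3, we conclude that
$\lim_{\theta \to 0} \sin\theta = 0$.
Similarly, we note that $\lim_{\theta \to 0^+} \cos\theta = \lim_{\theta \to -0^-} \cos(-\theta) = \lim_{\theta \to -0^-} \cos\theta = \lim_{\theta \to 0^-} \cos\theta$ by Lemma 9.5.2.
Since $\lim_{\theta \to 0^+} \cos\theta = 1$, we have $\lim_{\theta \to 0^-} \cos\theta = 1$. By Exercise 9.4.3, we conclude that $\lim_{\theta \to 0} \cos\theta = 1$.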

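Lemma 9.5.3
$\lim_{x \to a} f(x) = L \leftrightarrow \lim_{h \to 0} f(a + h) = L$.

Proof. The definition of $\lim_{x \to a} f(x) = L$ is

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, \ 0 < |x - a| < \delta \to |f(x) - L| < \varepsilon. \]

Substitute $x \to a + h$. Then,

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall h, \ 0 < |a + h - a| < \delta \to |f(a + h) - L| < \varepsilon, \]

which simplifies to

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall h, \ 0 < |h - 0| < \delta \to |f(a + h) - L| < \varepsilon, \]

i.e. the definition of $\lim_{h \to 0} f(a + h) = L$. Since the definitions are equivalent, $\lim_{x \to a} f(x) = L \leftrightarrow
\lim_{h \to 0} f(a + h) = L$.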

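Theorem 9.5.4 (Continuity of Trigonometric Functions)

If $a$ is in the domain, then $\lim_{x \to a} \mathrm{Trig}(x) = \mathrm{Trig}(a)$, where $\mathrm{Trig} = \sin, \cos, \tan, \csc, \sec, \cot$.

Proof. We will prove this for $\sin x$ and $\cos x$, and the rest will follow by Theorem 9.3.10 and Corollary 9.3.11.
We proceed by Lemma 9.5.3 on $\sin x$ and $\cos x$.

\begin{align*}
\lim_{x \to a} \sin x &= \lim_{h \to 0} \sin(a + h) \\
&= \lim_{h \to 0} (\sin a \cos h + \cos a \sin h) \\
&= \lim_{h \to 0} \sin a \cdot \lim_{h \to 0} \cos h + \lim_{h \to 0} \cos a \cdot \lim_{h \to 0} \sin h \\
&= \sin a \cdot 1 + \cos a \cdot 0 \\
&= \sin a.
\end{align*}

\begin{align*}
\lim_{x \to a} \cos x &= \lim_{h \to 0} \cos(a + h) \\
&= \lim_{h \to 0} (\cos a \cos h - \sin a \sin h) \\
&= \lim_{h \to 0} \cos a \cdot \lim_{h \to 0} \cos h - \lim_{h \to 0} \sin a \cdot \lim_{h \to 0} \sin h \\
&= \cos a \cdot 1 - \sin a \cdot 0 \\
&= \cos a.
\end{align*}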

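Theorem 9.5.5
$\lim_{\theta \to 0} \frac{\sin\theta}{\theta} = 1$.

Proof. Assume $0 < \theta < \frac{\pi}{2}$, where $\theta$ is in radians. Consider the following diagram:

[Figure: unit circle sector with center $O$ and central angle $\theta$. $A$ and $B$ lie on the horizontal axis with $OA = \cos\theta$ and $OB = 1$; $C$ is the point of the circle above $A$, with $AC = \sin\theta$; the segment of length $\tan\theta$ is the side $BD$ of $\triangle OBD$.]

It is clear that we have the following inequality:

\[ \text{Area of } \triangle OAC < \text{Area of sector } OBC < \text{Area of } \triangle OBD. \]

We find our respective areas:

\[ \frac{\sin\theta \cos\theta}{2} < \frac{\theta}{2} < \frac{\tan\theta}{2}. \]

This rearranges to

\[ \frac{1}{\cos\theta} > \frac{\sin\theta}{\theta} > \cos\theta. \]

Note that $\lim_{\theta \to 0^+} \frac{1}{\cos\theta} = 1$ and $\lim_{\theta \to 0^+} \cos\theta = 1$, therefore by Theorem 9.3.9, we have that
$\lim_{\theta \to 0^+} \frac{\sin\theta}{\theta} = 1$.
We can then proceed with Lemma 9.5.2, and eventually conclude that $\lim_{\theta \to 0^-} \frac{\sin\theta}{\theta} = 1$, so therefore
$\lim_{\theta \to 0} \frac{\sin\theta}{\theta} = 1$.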
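The squeeze used in this proof can be watched numerically as θ shrinks. This is an illustration with a hypothetical helper name, not part of the proof: it simply tabulates cos θ, sin θ / θ and 1/cos θ for a few angles in (0, π/2).

```python
import math

def squeeze_bounds(theta: float):
    """Return (cos(theta), sin(theta)/theta, 1/cos(theta)) for 0 < theta < pi/2."""
    return math.cos(theta), math.sin(theta) / theta, 1 / math.cos(theta)

for theta in (1.0, 0.1, 0.01, 0.001):
    lower, middle, upper = squeeze_bounds(theta)
    print(f"{theta:>6}: {lower:.8f} < {middle:.8f} < {upper:.8f}")
```

Both outer columns close in on 1, trapping the middle one between them.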

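Lemma 9.5.6
$\lim_{x \to 0} f(x) = L \leftrightarrow \lim_{x \to 0} f(kx) = L$ provided $k \ne 0$.

Proof. The definition of $\lim_{x \to 0} f(x) = L$ is

\[ \forall \varepsilon > 0 \ \exists \delta_1 > 0 \ \forall x, \ 0 < |x - 0| < \delta_1 \to |f(x) - L| < \varepsilon. \]

Substitute $x \to kx$. Then we have,

\[ \forall \varepsilon > 0 \ \exists \delta_1 > 0 \ \forall x, \ 0 < |kx - 0| < \delta_1 \to |f(kx) - L| < \varepsilon, \]

which rearranges to

\[ \forall \varepsilon > 0 \ \exists \delta_1 > 0 \ \forall x, \ 0 < |k||x| < \delta_1 \to |f(kx) - L| < \varepsilon. \]

Choose $\delta_2 = \frac{\delta_1}{|k|}$. It follows that

\[ \forall \varepsilon > 0 \ \exists \delta_2 > 0 \ \forall x, \ 0 < |x - 0| < \delta_2 \to |f(kx) - L| < \varepsilon, \]

which is just the definition of $\lim_{x \to 0} f(kx) = L$.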

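Lemma 9.5.7
$\lim_{x \to a} f(x) = L \leftrightarrow \lim_{x \to \frac{a}{k}} f(kx) = L$ provided $k \ne 0$.

The proof is very similar to that of the previous lemma.

Proof. The definition of $\lim_{x \to a} f(x) = L$ is

\[ \forall \varepsilon > 0 \ \exists \delta_1 > 0 \ \forall x, \ 0 < |x - a| < \delta_1 \to |f(x) - L| < \varepsilon. \]

Substitute $x \to kx$. Then we have,

\[ \forall \varepsilon > 0 \ \exists \delta_1 > 0 \ \forall x, \ 0 < |kx - a| < \delta_1 \to |f(kx) - L| < \varepsilon, \]

which rearranges to

\[ \forall \varepsilon > 0 \ \exists \delta_1 > 0 \ \forall x, \ 0 < |k| \left|x - \frac{a}{k}\right| < \delta_1 \to |f(kx) - L| < \varepsilon. \]

Choose $\delta_2 = \frac{\delta_1}{|k|}$. It follows that

\[ \forall \varepsilon > 0 \ \exists \delta_2 > 0 \ \forall x, \ 0 < \left|x - \frac{a}{k}\right| < \delta_2 \to |f(kx) - L| < \varepsilon, \]

which is just the definition of $\lim_{x \to \frac{a}{k}} f(kx) = L$.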

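Problem 9.5.8. Find and justify the following limits:

1. $\lim_{x \to 0} \cos 2x$

2. $\lim_{x \to 0} \frac{\sin 2x}{2x}$

3. $\lim_{x \to 0} \frac{\sin 2x}{3x}$

4. $\lim_{x \to 0} \frac{\sin 6x}{\sin 5x}$

5. $\lim_{x \to 0} \frac{\tan 3x}{x}$

6. $\lim_{x \to 0} \frac{1 - \cos x}{x}$

Solution. We apply the results of Example 9.5.1, Theorem 9.5.5 with Lemma 9.5.6 after some clever
algebraic manipulations.

1. $\lim_{x \to 0} \cos 2x = 1$.

2. $\lim_{x \to 0} \frac{\sin 2x}{2x} = 1$.

3. $\lim_{x \to 0} \frac{\sin 2x}{3x} = \lim_{x \to 0} \frac{\sin 2x}{2x} \cdot \frac{2}{3} = \frac{2}{3}$.

4. $\lim_{x \to 0} \frac{\sin 6x}{\sin 5x} = \lim_{x \to 0} \frac{\frac{\sin 6x}{x}}{\frac{\sin 5x}{x}} = \lim_{x \to 0} \frac{\frac{\sin 6x}{6x} \cdot 6}{\frac{\sin 5x}{5x} \cdot 5} = \frac{6}{5} \lim_{x \to 0} \frac{\frac{\sin 6x}{6x}}{\frac{\sin 5x}{5x}} = \frac{6}{5} \cdot \frac{1}{1} = \frac{6}{5}$.

5. $\lim_{x \to 0} \frac{\tan 3x}{x} = \lim_{x \to 0} \frac{\sin 3x}{x \cos 3x} = \lim_{x \to 0} \frac{\frac{\sin 3x}{3x}}{\frac{x \cos 3x}{3x}} = \lim_{x \to 0} \frac{\frac{\sin 3x}{3x}}{\frac{1}{3} \cos 3x} = \frac{\lim_{x \to 0} \frac{\sin 3x}{3x}}{\frac{1}{3} \lim_{x \to 0} \cos 3x} = \frac{1}{\frac{1}{3} \cdot 1} = 3$.

6. $\lim_{x \to 0} \frac{1 - \cos x}{x} = \lim_{x \to 0} \frac{1 - \cos x}{x} \cdot \frac{1 + \cos x}{1 + \cos x} = \lim_{x \to 0} \frac{1 - \cos^2 x}{x(1 + \cos x)} = \lim_{x \to 0} \frac{\sin^2 x}{x(1 + \cos x)} = \lim_{x \to 0} \frac{\sin x}{x} \cdot
\lim_{x \to 0} \sin x \cdot \lim_{x \to 0} \frac{1}{1 + \cos x} = 1 \cdot 0 \cdot \frac{1}{2} = 0$.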
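A quick numerical comparison can catch algebra slips in answers like these. The sketch below is illustrative (the names `CASES` and `largest_gap` are hypothetical): it evaluates each expression at small nonzero x and reports how far the values are from the limits found above.

```python
import math

# (expression, function, value found in the solution)
CASES = [
    ("cos 2x", lambda x: math.cos(2 * x), 1.0),
    ("sin 2x / (2x)", lambda x: math.sin(2 * x) / (2 * x), 1.0),
    ("sin 2x / (3x)", lambda x: math.sin(2 * x) / (3 * x), 2 / 3),
    ("sin 6x / sin 5x", lambda x: math.sin(6 * x) / math.sin(5 * x), 6 / 5),
    ("tan 3x / x", lambda x: math.tan(3 * x) / x, 3.0),
    ("(1 - cos x) / x", lambda x: (1 - math.cos(x)) / x, 0.0),
]

def largest_gap(x: float) -> float:
    """Largest distance, over all six cases, between f(x) and the stated limit."""
    return max(abs(f(x) - value) for _, f, value in CASES)

for x in (1e-1, 1e-2, 1e-4):
    print(x, largest_gap(x))
```

The largest gap shrinks as x approaches 0 from either side, which is consistent with the six answers.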
Exercise 9.5.9. Find and justify $\lim_{x \to 0} \frac{\tan^2 x + 2x}{x + x^2}$ and $\lim_{x \to 0} \frac{x^2 (3 + \sin x)}{(x + \sin x)^2}$.

9.6 Advanced Concepts


Theorem 9.6.1
P
n P
m
Let P (x), Q(x) be polynomials such that P (x) = ai xi and Q(x) = bk xk . Then,
i=0 k=0

 ± ∞ deg P > deg Q

P (x)  an
lim = deg P = deg Q
x→∞ Q(x) 
 bn

0 deg P < deg Q
Daniel Kim 278

Proof. If $\deg P > \deg Q$, then $n > m$, and we have
\[ \lim_{x\to\infty} \frac{P(x)}{Q(x)} = \lim_{x\to\infty} \frac{a_n x^n + a_{n-1}x^{n-1} + \ldots + a_1 x + a_0}{b_m x^m + b_{m-1}x^{m-1} + \ldots + b_1 x + b_0}. \]
We divide the numerator and denominator by $x^n$. Then,
\[ \lim_{x\to\infty} \frac{a_n x^n + a_{n-1}x^{n-1} + \ldots + a_1 x + a_0}{b_m x^m + b_{m-1}x^{m-1} + \ldots + b_1 x + b_0} = \lim_{x\to\infty} \frac{a_n + a_{n-1}x^{-1} + \ldots + a_1 x^{1-n} + a_0 x^{-n}}{b_m x^{m-n} + b_{m-1}x^{m-n-1} + \ldots + b_1 x^{1-n} + b_0 x^{-n}}. \]
As $x$ goes to $\infty$, any term with $x$ raised to a negative power goes to 0. Since $m - n < 0$, every term of the denominator goes to 0, while the numerator goes to $a_n$. Therefore the limit is $\pm\infty$, where the sign depends on the sign of $a_n$ and on whether the denominator approaches $0^+$ or $0^-$.
If $\deg P = \deg Q$, then $n = m$. We will primarily use $n$. We have
\[ \lim_{x\to\infty} \frac{P(x)}{Q(x)} = \lim_{x\to\infty} \frac{a_n x^n + a_{n-1}x^{n-1} + \ldots + a_1 x + a_0}{b_n x^n + b_{n-1}x^{n-1} + \ldots + b_1 x + b_0}. \]
We divide the numerator and denominator by $x^n$ to get
\[ \lim_{x\to\infty} \frac{P(x)}{Q(x)} = \lim_{x\to\infty} \frac{a_n + a_{n-1}x^{-1} + \ldots + a_1 x^{1-n} + a_0 x^{-n}}{b_n + b_{n-1}x^{-1} + \ldots + b_1 x^{1-n} + b_0 x^{-n}}. \]
All terms with $x$ raised to a negative power approach 0 as $x$ goes to $\infty$, so we are left with $\frac{a_n}{b_n}$ as our limit.
Lastly, if $\deg P < \deg Q$, then $n < m$. We have
\[ \lim_{x\to\infty} \frac{P(x)}{Q(x)} = \lim_{x\to\infty} \frac{a_n x^n + a_{n-1}x^{n-1} + \ldots + a_1 x + a_0}{b_m x^m + b_{m-1}x^{m-1} + \ldots + b_1 x + b_0}. \]
Similar to the first case, we divide the numerator and denominator by $x^m$. Then,
\[ \lim_{x\to\infty} \frac{a_n x^n + a_{n-1}x^{n-1} + \ldots + a_1 x + a_0}{b_m x^m + b_{m-1}x^{m-1} + \ldots + b_1 x + b_0} = \lim_{x\to\infty} \frac{a_n x^{n-m} + a_{n-1}x^{n-m-1} + \ldots + a_1 x^{1-m} + a_0 x^{-m}}{b_m + b_{m-1}x^{-1} + \ldots + b_1 x^{1-m} + b_0 x^{-m}}. \]
As $x$ goes to $\infty$, all terms in the numerator go to 0, and all terms except for $b_m$ in the denominator go to 0, so the limit is $\frac{0}{b_m} = 0$.

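
The three cases of Theorem 9.6.1 can be sketched as a small decision procedure. The code below is a minimal illustration under stated assumptions: polynomials are passed as coefficient lists with the lowest degree first, the leading coefficients are nonzero, and in the first case the sign is resolved for $x \to +\infty$ as the sign of $a_n / b_m$. The function names `limit_at_infinity` and `evaluate` are hypothetical helpers, not part of the text.

```python
import math

def limit_at_infinity(p, q):
    """Limit of P(x)/Q(x) as x -> +infinity.

    p, q: coefficient lists [c0, c1, ..., c_deg], leading coefficient nonzero.
    """
    n, m = len(p) - 1, len(q) - 1
    if n > m:
        # Unbounded; for x -> +infinity the sign is that of a_n / b_m.
        return math.copysign(math.inf, p[-1] / q[-1])
    if n == m:
        return p[-1] / q[-1]
    return 0.0

def evaluate(coeffs, x):
    """Evaluate a polynomial with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

p = [1, 3, 2]    # 2x^2 + 3x + 1
q = [3, -2, 5]   # 5x^2 - 2x + 3
print(limit_at_infinity(p, q))            # ratio of leading coefficients
print(evaluate(p, 1e8) / evaluate(q, 1e8))  # numerical value at a large x
```

Comparing the returned value with the quotient at a large $x$ is a quick way to catch a misread degree or leading coefficient.
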
Problem 9.6.2. Evaluate and justify the following limits:

1. $\lim_{x\to 3} \frac{x^2 - 7x + 12}{x^3 - 27}$

2. $\lim_{x\to\infty} \left(\sqrt{x^2 + 2x} - x\right)$

3. $\lim_{x\to-\infty} \left(\sqrt{x^2 + 2x} - x\right)$

Solution.

1. As we cannot directly plug in and compute, we first factor and cancel the common factor:
\[ \lim_{x\to 3} \frac{x^2 - 7x + 12}{x^3 - 27} = \lim_{x\to 3} \frac{(x-3)(x-4)}{(x-3)(x^2 + 3x + 9)} = \lim_{x\to 3} \frac{x-4}{x^2 + 3x + 9} = -\frac{1}{27}. \]

2. The radical motivates us to ‘rationalize the numerator’:
\[ \lim_{x\to\infty} \left(\sqrt{x^2+2x} - x\right) = \lim_{x\to\infty} \left(\sqrt{x^2+2x} - x\right)\cdot\frac{\sqrt{x^2+2x} + x}{\sqrt{x^2+2x} + x} = \lim_{x\to\infty} \frac{2x}{\sqrt{x^2+2x} + x}. \]
As we are considering $x$ going to $\infty$, we assume $x > 0$ and divide the numerator and denominator by $x$, which means that we divide the inner content of the radical by $x^2$, as such:
\[ \lim_{x\to\infty} \frac{2x}{\sqrt{x^2+2x} + x} = \lim_{x\to\infty} \frac{2}{\sqrt{1 + \frac{2}{x}} + 1} = \frac{2}{2} = 1. \]

3. As shown in the previous problem, $\lim_{x\to-\infty} \left(\sqrt{x^2+2x} - x\right) = \lim_{x\to-\infty} \frac{2x}{\sqrt{x^2+2x} + x}$. However, as we are considering $x$ going to $-\infty$, we can safely assume $x < 0$, so $x = -\sqrt{x^2}$. Thus, when we divide the numerator and denominator by $x$, we divide the inner content of the radical by $x^2$ and the radical becomes a negative term:
\[ \lim_{x\to-\infty} \frac{2x}{\sqrt{x^2+2x} + x} = \lim_{x\to-\infty} \frac{2}{-\sqrt{1 + \frac{2}{x}} + 1}. \]
As $x$ goes to $-\infty$, $\sqrt{1 + \frac{2}{x}} \to 1^-$, so $-\sqrt{1 + \frac{2}{x}} \to -1^+$, i.e. the denominator $-\sqrt{1 + \frac{2}{x}} + 1$ approaches $0^+$. The numerator stays at 2, so since the overall sign is positive, the limit is $\infty$.
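
To see both behaviours numerically, the sketch below evaluates the original expression and its rationalized form at large positive and negative inputs. The function names `naive` and `rationalized` are illustrative assumptions; the rationalized form is the one derived in the solution above.

```python
import math

def naive(x):
    """sqrt(x^2 + 2x) - x, evaluated directly."""
    return math.sqrt(x * x + 2 * x) - x

def rationalized(x):
    """Equivalent form 2x / (sqrt(x^2 + 2x) + x) from rationalizing the numerator."""
    return 2 * x / (math.sqrt(x * x + 2 * x) + x)

for x in (1e2, 1e4, 1e8):
    print(x, naive(x), rationalized(x))   # values approach 1

for x in (-1e2, -1e4, -1e8):
    print(x, naive(x))                    # values grow without bound
```

For large positive $x$ the rationalized form is also the numerically safer one, since the direct form subtracts two nearly equal numbers.
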

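Problem 9.6.3. Evaluate and justify the following limits:

1. $\lim_{x\to-2^+} \frac{x+5}{x^2-4}$

2. $\lim_{x\to\infty} \left(\sqrt{x+1} - \sqrt{x}\right)$

3. $\lim_{x\to\infty} \left(\sqrt{x^2+x} - x\right)$

4. $\lim_{x\to-\infty} \left(\sqrt{x^2+x} + x\right)$

5. $\lim_{x\to\infty} \frac{2x^2 + 3x + 1}{5x^2 - 2x + 3}$

6. $\lim_{x\to-\infty} \frac{x^3 + x^2 + x + 1}{2x^2 - 4x + 1}$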

Solution.

1. Note that we can rewrite this as $\lim_{x\to-2^+} \frac{x+5}{(x+2)(x-2)}$. As $x$ approaches $-2$ from the positive side, we see that $x + 5$ will approach 3, $x - 2$ will approach $-4$, and $x + 2$ will approach 0 from the positive side, i.e. $0^+$.
Note that the signs of 3 and $0^+$ are positive, but the sign of $-4$ is negative, therefore the overall sign of the quotient is negative. Since the denominator approaches 0 while the numerator does not, the limit goes to $-\infty$.

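2. Again, we rationalize the numerator:
\begin{align*}
\lim_{x\to\infty} \left(\sqrt{x+1} - \sqrt{x}\right) &= \lim_{x\to\infty} \frac{\sqrt{x+1} - \sqrt{x}}{\sqrt{x+1} + \sqrt{x}}\cdot\left(\sqrt{x+1} + \sqrt{x}\right) \\
&= \lim_{x\to\infty} \frac{x + 1 - x}{\sqrt{x+1} + \sqrt{x}} \\
&= \lim_{x\to\infty} \frac{1}{\sqrt{x+1} + \sqrt{x}} \\
&= 0.
\end{align*}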

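3. Likewise,
\begin{align*}
\lim_{x\to\infty} \left(\sqrt{x^2+x} - x\right) &= \lim_{x\to\infty} \frac{\sqrt{x^2+x} - x}{\sqrt{x^2+x} + x}\cdot\left(\sqrt{x^2+x} + x\right) \\
&= \lim_{x\to\infty} \frac{x^2 + x - x^2}{\sqrt{x^2+x} + x} \\
&= \lim_{x\to\infty} \frac{x}{\sqrt{x^2+x} + x}.
\end{align*}
We then divide both the numerator and denominator by $x$. To deal with the radical in the denominator, note that we are considering $x$ approaching $\infty$, so we can assume $x > 0$, from which it follows that $x = \sqrt{x^2}$, i.e. divide that radical by $\sqrt{x^2}$, resulting in:
\[ \lim_{x\to\infty} \frac{1}{\sqrt{1 + \frac{1}{x}} + 1} = \frac{1}{2}. \]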

4. This problem is very similar to the previous one. Again, we rationalize the numerator:
\begin{align*}
\lim_{x\to-\infty} \left(\sqrt{x^2+x} + x\right) &= \lim_{x\to-\infty} \frac{\sqrt{x^2+x} + x}{\sqrt{x^2+x} - x}\cdot\left(\sqrt{x^2+x} - x\right) \\
&= \lim_{x\to-\infty} \frac{x^2 + x - x^2}{\sqrt{x^2+x} - x} \\
&= \lim_{x\to-\infty} \frac{x}{\sqrt{x^2+x} - x}.
\end{align*}
Like before, we divide the numerator and denominator by $x$. However, for the radical in the denominator, we cannot simply divide by $\sqrt{x^2}$, as that would result in a denominator approaching 0. Instead, note that since we are considering $x$ approaching $-\infty$, we assume $x < 0$, from which it follows that $x = -\sqrt{x^2}$, i.e. divide the radical by $-\sqrt{x^2}$, resulting in:
\[ \lim_{x\to-\infty} \frac{1}{-\sqrt{1 + \frac{1}{x}} - 1} = -\frac{1}{2}. \]

5. Divide both the numerator and denominator by the highest power of $x$ common to both polynomials, i.e. $x^2$, resulting in:
\[ \lim_{x\to\infty} \frac{2 + \frac{3}{x} + \frac{1}{x^2}}{5 - \frac{2}{x} + \frac{3}{x^2}}. \]
All fractions with a power of $x$ in the denominator go to 0 as $x$ goes to $\infty$, therefore we are left with $\frac{2}{5}$.

6. Unlike the previous problem, the highest degree in the numerator is 3, but the highest degree in the denominator is 2, therefore we can deduce that the limit goes to either $\infty$ or $-\infty$. We divide the numerator and denominator by the power of $x$ with the lesser degree, i.e. $x^2$, to get:
\[ \lim_{x\to-\infty} \frac{x + 1 + \frac{1}{x} + \frac{1}{x^2}}{2 - \frac{4}{x} + \frac{1}{x^2}}. \]
The denominator approaches 2, and we are left with a term $x$ in the numerator. Since the problem asks for the limit as $x$ goes to $-\infty$, we can conclude that the overall limit goes to $-\infty$.

Exercise 9.6.4. Evaluate and justify the following limits:

1. $\lim_{x\to 1} \frac{\sin(3x-3)}{\sin(2x-2)}$

2. $\lim_{x\to-\infty} \left(\sqrt{x^2+4x} - x\right)$

3. $\lim_{x\to\infty} x\left(\sqrt{x+2} - \sqrt{x}\right)$

4. $\lim_{x\to\infty} \frac{x^3 + 4x - 7}{7x^2 - x + 1}$

Similar to Theorem 4.3.14, we have

Lemma 9.6.5
\[ \lim_{x\to\infty} f(x) = \lim_{y\to 0^+} f\left(\frac{1}{y}\right). \]

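Proof. Assume $\lim_{x\to\infty} f(x) = L$, and consider its definition:
\[ \forall\varepsilon > 0\ \exists N\ \forall x,\quad x > N \to |f(x) - L| < \varepsilon. \]
We may assume $N > 0$, since replacing $N$ by a larger number keeps the statement true. Then substitute $x \to \frac{1}{y}$. Since we only consider $x > N > 0$, we have $\frac{1}{y} > 0$, and hence $y > 0$. The definition becomes
\[ \forall\varepsilon > 0\ \exists N\ \forall y,\quad \frac{1}{y} > N \to \left|f\left(\frac{1}{y}\right) - L\right| < \varepsilon. \]
As $N > 0$ and $\frac{1}{y} > 0$, we can take the reciprocal of $\frac{1}{y} > N$ to get $y < \frac{1}{N}$. As $\frac{1}{N}$ is some positive number, we let $\delta = \frac{1}{N}$, and it follows that
\[ \forall\varepsilon > 0\ \exists\delta > 0\ \forall y,\quad 0 < y < \delta \to \left|f\left(\frac{1}{y}\right) - L\right| < \varepsilon, \]
which is the definition of $\lim_{y\to 0^+} f\left(\frac{1}{y}\right) = L$, as desired.
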
   
1 1
Exercise 9.6.6. Evaluate and justify lim x sin . What about lim x sin
2 ?
x→∞ x x→∞ x

9.7 Continuity

Although we have mentioned the notion of continuity before, we formally define it here:

Definition 9.7.1. $f(x)$ is continuous at $x = a$ if $\lim_{x\to a} f(x) = f(a)$.

We have already proven the following results:

1. Polynomials are continuous everywhere, by Lemma 9.3.12.

2. Rational functions are continuous where they are defined, by Lemma 9.3.13.

3. Trigonometric functions are continuous where they are defined, by Theorem 9.5.4.

We have notions of different kinds of discontinuity:

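Definition 9.7.2. When $f(a)$ exists and $\lim_{x\to a} f(x)$ exists but $\lim_{x\to a} f(x) \neq f(a)$, there is a removable discontinuity at $x = a$.

[Figure: graph with a removable discontinuity; the point $(a, f(a))$ is marked.]

Definition 9.7.3. When $f(a)$ exists and $\lim_{x\to a} f(x)$ does not exist, there is an essential discontinuity at $x = a$.

[Figure: graph with an essential discontinuity; the point $(a, f(a))$ is marked.]

There can also be a discontinuity when $f(a)$ does not exist at all.

[Figure: graph of a function that is undefined at $x = a$.]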

Problem 9.7.4. Is $f(x) = \frac{\sin x}{x}$ continuous? If not, what modifications can we make to $f$ such that it is continuous?

Solution. Note that $\frac{\sin x}{x}$ does not exist at $x = 0$, so we have a discontinuity. We can define the piecewise function
\[ f(x) = \begin{cases} \dfrac{\sin x}{x} & x \neq 0 \\ 1 & x = 0 \end{cases} \]
and since $\lim_{x\to 0} \frac{\sin x}{x} = 1$, the function is now continuous.
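
A minimal sketch of the patched function, assuming we simply special-case the input 0 (the name `f` mirrors the text; the sample points are arbitrary choices):

```python
import math

def f(x):
    """sin(x)/x, with the removable discontinuity at 0 filled by the limit value 1."""
    if x == 0:
        return 1.0
    return math.sin(x) / x

# Values near 0 agree with the value assigned at 0.
for x in (0.1, 0.001, 1e-6, 0.0, -1e-6, -0.001):
    print(x, f(x))
```
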

Theorem 9.7.5 (Continuity of Composite Functions)


If g is continuous at a and f is continuous at g(a), then f ◦ g is continuous at a.

Proof. Given that $g$ is continuous at $a$, it is true that $\lim_{x\to a} g(x) = g(a)$, i.e.
\[ \forall\varepsilon_1 > 0\ \exists\delta_1 > 0\ \forall x,\quad 0 < |x - a| < \delta_1 \to |g(x) - g(a)| < \varepsilon_1. \]
We are given that $f$ is continuous at $g(a)$, therefore $\lim_{y\to g(a)} f(y) = f(g(a))$, i.e.
\[ \forall\varepsilon_2 > 0\ \exists\delta_2 > 0\ \forall y,\quad 0 < |y - g(a)| < \delta_2 \to |f(y) - f(g(a))| < \varepsilon_2. \]
Let $y = g(x)$, so the definition above can be rewritten as
\[ \forall\varepsilon_2 > 0\ \exists\delta_2 > 0\ \forall x,\quad 0 < |g(x) - g(a)| < \delta_2 \to |f(g(x)) - f(g(a))| < \varepsilon_2. \]
Our objective is to show that $0 < |x - a| < \delta_1 \to |f(g(x)) - f(g(a))| < \varepsilon_2$ from the two given definitions. It suffices to let $\varepsilon_1 = \delta_2$, then it follows that $0 < |x - a| < \delta_1 \to |g(x) - g(a)| < \delta_2$. By the rewritten form of the second given definition, $0 < |g(x) - g(a)| < \delta_2 \to |f(g(x)) - f(g(a))| < \varepsilon_2$; and if $g(x) = g(a)$, then $|f(g(x)) - f(g(a))| = 0 < \varepsilon_2$ trivially. Then, by hypothetical syllogism,
\[ \forall\varepsilon_2 > 0\ \exists\delta_1 > 0\ \forall x,\quad 0 < |x - a| < \delta_1 \to |f(g(x)) - f(g(a))| < \varepsilon_2, \]
so $\lim_{x\to a} f(g(x)) = f(g(a))$, i.e. $f \circ g$ is continuous at $a$, as desired.

Exercise 9.7.6. Evaluate $\lim_{x\to\pi} \sin^3(7x)$.

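Problem 9.7.7. Prove $f(x) = \sqrt{x}$ is continuous $\forall x > 0$.

Proof. Let $a > 0$. For a given $\varepsilon > 0$, choose $\delta$ with $0 < \delta < \varepsilon\sqrt{a}$ and $\delta \leq a$, so that $\sqrt{x}$ is defined whenever $|x - a| < \delta$. Note that
\[ |x - a| < \delta \longrightarrow \left|\sqrt{x} - \sqrt{a}\right| = \left|\sqrt{x} - \sqrt{a}\right|\cdot\frac{\sqrt{x} + \sqrt{a}}{\sqrt{x} + \sqrt{a}} = \frac{|x - a|}{\sqrt{x} + \sqrt{a}} \leq \frac{|x - a|}{\sqrt{a}} < \frac{\delta}{\sqrt{a}} < \varepsilon. \]
Therefore, $\forall\varepsilon > 0\ \exists\delta > 0\ \forall x,\ |x - a| < \delta \longrightarrow |\sqrt{x} - \sqrt{a}| < \varepsilon$, i.e. $\lim_{x\to a} \sqrt{x} = \sqrt{a}$.

Exercise 9.7.8. Prove that $f(x) = \sqrt{x}$ is right-continuous at 0, i.e. $\lim_{x\to 0^+} f(x) = f(0)$.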

Theorem 9.7.9
Suppose that f is continuous at x = a and f (a) > 0. Then ∃δ > 0 such that ∀x ∈ (a − δ, a + δ),
f (x) > 0.

Proof. Assuming $f$ is continuous at $x = a$, then $\lim_{x\to a} f(x) = f(a)$, i.e.
\[ \forall\varepsilon > 0\ \exists\delta > 0\ \forall x,\quad 0 < |x - a| < \delta \to |f(x) - f(a)| < \varepsilon. \]
Now consider $\varepsilon = \frac{f(a)}{2}$, as $f(a) > 0$ implies that $\frac{f(a)}{2} > 0$, and $\varepsilon$ can be any positive number. Then, by the definition,
\[ \exists\delta > 0\ \forall x,\quad 0 < |x - a| < \delta \to |f(x) - f(a)| < \frac{f(a)}{2}. \]
We expand the absolute value inequalities (the case $x = a$ holds trivially) to get
\[ \exists\delta > 0\ \forall x,\quad a - \delta < x < a + \delta \to \frac{f(a)}{2} < f(x) < \frac{3f(a)}{2}, \]
and we rewrite this based on the statement of the theorem, as such:
\[ \exists\delta > 0\ \forall x \in (a - \delta, a + \delta),\quad \frac{f(a)}{2} < f(x) < \frac{3f(a)}{2}. \]
We have $f(x) > \frac{f(a)}{2}$, which clearly implies that $f(x) > 0$, so we are done.

Theorem 9.7.10
Suppose that f is continuous at x = a and f (a) < 0. Then ∃δ > 0 such that ∀x ∈ (a − δ, a + δ),
f (x) < 0.

Proof. The proof is virtually identical to that of Theorem 9.7.9.



Problem 9.7.11. Find all values of $a$ and $b$ such that
\[ f(x) = \begin{cases} 2x^2 - x - 7 & x \leq 1 \\ ax + b & 1 < x \leq 3 \\ x^3 & x > 3 \end{cases} \]
is continuous everywhere.

Solution. Recall that $f(x)$ is continuous at $x = c$ if $\lim_{x\to c} f(x) = f(c)$. Each piece of $f$ is a polynomial, so it remains to make $f$ continuous at $x = 1$ and $x = 3$. At $x = 1$, $\lim_{x\to 1} f(x)$ should exist and equal $f(1)$. We consider the one-sided limits separately: $\lim_{x\to 1^-} f(x) = \lim_{x\to 1^-} (2x^2 - x - 7) = -6 = f(1)$, therefore we need $\lim_{x\to 1^+} f(x) = -6$, i.e. $\lim_{x\to 1^+} (ax + b) = a + b = -6$.
Likewise, $\lim_{x\to 3} f(x)$ should exist and equal $f(3)$. Note that $\lim_{x\to 3^+} f(x) = \lim_{x\to 3^+} x^3 = 27$, so we need $\lim_{x\to 3^-} f(x) = \lim_{x\to 3^-} (ax + b) = 3a + b = 27$.
We now have the system of equations $a + b = -6$ and $3a + b = 27$, from which the solutions are $a = \frac{33}{2}$, $b = -\frac{45}{2}$.

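
The system can be checked with exact rational arithmetic. This is a small sketch using Python's standard `fractions` module; the elimination step and the function name `f` follow the solution above.

```python
from fractions import Fraction

# Continuity conditions from the solution: a + b = -6 and 3a + b = 27.
# Subtracting the first equation from the second eliminates b.
a = Fraction(27 - (-6), 3 - 1)
b = Fraction(-6) - a

def f(x):
    """The piecewise function of Problem 9.7.11 with the solved a and b."""
    if x <= 1:
        return 2 * x**2 - x - 7
    if x <= 3:
        return a * x + b
    return x**3

print(a, b)  # 33/2 -45/2
# The middle piece meets its neighbours at x = 1 and x = 3.
print(f(1), a * 1 + b, f(3), 3**3)
```
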
Chapter 10

Derivatives

Limits barely scrape the surface of calculus. With derivatives, we will observe how limits are used to
help us analyze functions in greater depth than we were able to do in early algebra classes.

10.1 Introduction
Consider a function f . When, at any part, the function is increasing, we have

b > a −→ f (b) > f (a),

for two values x = a, b in that part of the function. We rewrite this as

b − a > 0 −→ f (b) − f (a) > 0.

When we consider the two points $(a, f(a))$ and $(b, f(b))$, the slope of the line containing both points is also positive:
\[ b - a > 0 \longrightarrow \frac{f(b) - f(a)}{b - a} > 0. \]
Likewise, we can use analogous reasoning for decreasing parts of a graph as well.
Definition 10.1.1. We call this line connecting $(a, f(a))$ and $(b, f(b))$ the secant line, as shown below:

[Figure: graph of $f$ with the secant line through $(a, f(a))$ and $(b, f(b))$.]

Now what happens to the slope of the secant line connecting $(a, f(a))$ and $(b, f(b))$ as $b$ is approaching $a$? Observe the secant line in the figure below:

[Figure: graph of $f$ with the secant line through $(a, f(a))$ as $b$ approaches $a$.]

Since we are considering b approaching a, this leads us to use limits to introduce a new, important
property of functions:
Definition 10.1.2. For some constant $a$, $\lim_{x\to a} \frac{f(x) - f(a)}{x - a} = f'(a)$ is called the derivative of $f(x)$ at $x = a$. By Lemma 9.5.3, note that
\[ f'(a) = \lim_{x\to a} \frac{f(x) - f(a)}{x - a} = \lim_{h\to 0} \frac{f(a + h) - f(a)}{h}, \]
which is another form of the definition of the derivative.

Definition 10.1.3. The tangent line of $f(x)$ at $x = a$ is the line of slope $f'(a)$ through $(a, f(a))$.

[Figure: graph of $f$ with the tangent line at $(a, f(a))$.]

Definition 10.1.4. If $f'(a)$ exists, $f$ is said to be differentiable at $x = a$.

Example 10.1.5
Let $f(x) = mx + b$ (i.e. an arbitrary linear function with slope $m$). For any $a$, prove that $f'(a) = m$.

Proof. We have
\begin{align*}
f'(a) &= \lim_{x\to a} \frac{f(x) - f(a)}{x - a} \\
&= \lim_{x\to a} \frac{(mx + b) - (ma + b)}{x - a} \\
&= \lim_{x\to a} \frac{m(x - a)}{x - a} \\
&= \lim_{x\to a} m \\
&= m.
\end{align*}
Therefore, for $f(x) = mx + b$, $f'(a) = m$.

Problem 10.1.6. Let $f(x) = x^2$. Find $f'(2)$.

Solution. As usual, we apply the definition:
\begin{align*}
f'(2) &= \lim_{x\to 2} \frac{f(x) - f(2)}{x - 2} \\
&= \lim_{x\to 2} \frac{x^2 - 4}{x - 2} \\
&= \lim_{x\to 2} \frac{(x + 2)(x - 2)}{x - 2} \\
&= \lim_{x\to 2} (x + 2) \\
&= 4.
\end{align*}
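
The limit in this solution can be watched numerically by shrinking $h$ in the second form of the definition. The helper `difference_quotient` below is a hypothetical name introduced for illustration; the step sizes are arbitrary.

```python
def difference_quotient(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h

def square(x):
    return x * x

# As h shrinks (from either side), the secant slope approaches f'(2) = 4.
for h in (1.0, 0.1, 0.001, -0.001, 1e-6):
    print(h, difference_quotient(square, 2.0, h))
```

For $f(x) = x^2$ the quotient simplifies to $4 + h$, which matches the printed values up to rounding.
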

Exercise 10.1.7. Let $f(x) = x^3 - 3x$. Find $f'(2)$.

Exercise 10.1.8. Let $f(x) = \frac{1}{x}$. Find $f'(-3)$.

Exercise 10.1.9. Let $f(x) = \frac{2x - 5}{1 - 3x}$. Find $f'(4)$.

Exercise 10.1.10. Let $f(x) = \sqrt{x}$. Find $f'(4)$.

Problem 10.1.11. What is the equation of the tangent line of $f(x)$ at $x = a$?

Solution. The slope of the tangent line is $f'(a)$. This line must pass through the point $(a, f(a))$, so by the point-slope form of a line, the equation of the tangent line would be $y - f(a) = f'(a)(x - a)$, which rearranges to
\[ y = f'(a)(x - a) + f(a). \]

Problem 10.1.12. Find all tangent lines of $y = x^2$ that go through $(7, 1)$.

Solution. Define $f(x) = x^2$. Consider some point $(a, f(a))$ that lies on the graph of $f$ and has a tangent line that goes through $(7, 1)$. By Problem 10.1.11, the general equation of the tangent line would be
\[ y = f'(a)(x - a) + f(a). \]
Note that $f(a) = a^2$, and
\[ f'(a) = \lim_{x\to a} \frac{f(x) - f(a)}{x - a} = \lim_{x\to a} \frac{x^2 - a^2}{x - a} = \lim_{x\to a} \frac{(x - a)(x + a)}{x - a} = \lim_{x\to a} (x + a) = 2a, \]
so our tangent line is
\[ y = 2a(x - a) + a^2. \]
Since our tangent line must also contain $(7, 1)$, we plug those values into the equation:
\[ 1 = 2a(7 - a) + a^2. \]
This rearranges to $a^2 - 14a + 1 = 0$, and the quadratic formula yields $a = 7 \pm 4\sqrt{3}$, and in fact, both are solutions to the problem. Thus, our tangent lines are
\[ y = 2(7 + 4\sqrt{3})(x - (7 + 4\sqrt{3})) + 97 + 56\sqrt{3}, \]
\[ y = 2(7 - 4\sqrt{3})(x - (7 - 4\sqrt{3})) + 97 - 56\sqrt{3}. \]
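
Both candidate tangent lines can be checked in floating point: each should pass (approximately) through $(7, 1)$. The helper `tangent_line` is a hypothetical name; it encodes the formula $y = 2a(x - a) + a^2$ derived above.

```python
import math

def tangent_line(a):
    """Tangent to y = x^2 at x = a, returned as a function of x."""
    return lambda x: 2 * a * (x - a) + a * a

for a in (7 + 4 * math.sqrt(3), 7 - 4 * math.sqrt(3)):
    line = tangent_line(a)
    # line(7) should be approximately 1 for both points of tangency.
    print(a, a * a, line(7))
```
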

Example 10.1.13
Let $f(x) = |x|$. Find $f'(7)$ and $f'(0)$.

Solution. Note that $f'(7) = \lim_{x\to 7} \frac{|x| - |7|}{x - 7}$. As we are considering the limit as $x$ goes to 7, we can assume that $x$ is positive because we essentially only care about the values of $x$ that are close to 7. Then,
\[ \lim_{x\to 7} \frac{|x| - |7|}{x - 7} = \lim_{x\to 7} \frac{x - 7}{x - 7} = 1. \]
For $f'(0)$, we want to find $\lim_{x\to 0} \frac{|x|}{x}$. However, evaluating the right-hand and left-hand limits separately gives us 1 and $-1$ respectively, therefore $\lim_{x\to 0} \frac{|x|}{x}$ does not exist, and it follows that there is no tangent line at $x = 0$. In fact, if we graph $f(x) = |x|$, we see that there is a sharp corner at $x = 0$, so we can intuitively figure out that there cannot be a tangent line at that point.
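
The disagreement between the two one-sided limits at 0 shows up directly in the one-sided difference quotients. The helper name `one_sided_quotients` and the step size are illustrative assumptions.

```python
def one_sided_quotients(f, a, h=1e-3):
    """Secant slopes using a point just to the right and just to the left of a."""
    right = (f(a + h) - f(a)) / h
    left = (f(a - h) - f(a)) / (-h)
    return right, left

print(one_sided_quotients(abs, 7.0))  # both slopes are close to 1
print(one_sided_quotients(abs, 0.0))  # (1.0, -1.0)
```

At $x = 7$ the two slopes agree, while at $x = 0$ they stay at $1$ and $-1$ no matter how small $h$ is chosen.
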

Definition 10.1.14. For a function $f(x)$, the derivative of $f(x)$ is the function denoted $f'(x)$, whose value at each $x$ is the derivative of $f$ at that point, wherever this limit exists. As stated earlier, the limit can be written in two forms:
\[ f'(x) = \lim_{z\to x} \frac{f(z) - f(x)}{z - x} = \lim_{h\to 0} \frac{f(x + h) - f(x)}{h}. \]
When evaluating derivatives using the limit definition, note that the former involves factoring, while the latter involves expanding, so choose the appropriate definition depending on the function being dealt with.

Definition 10.1.15. There are two different ways to denote the derivative of $f$: either $f'(x)$, which is Lagrange's (prime) notation, or $\frac{d}{dx}(f(x))$, which is Leibniz's notation. Leibniz notation is useful since we don't have to name the function to take its derivative. For example, we can write $\frac{d}{dx}(x^3)$, but not $x^3{}'$.

If $y = f(x)$, the following all mean the same thing:
\[ \frac{d}{dx}(f(x)) = f'(x), \qquad \frac{d}{dx}(y) = y' = \frac{dy}{dx}. \]

Definition 10.1.16. To differentiate a function is to evaluate the derivative of that function.

Example 10.1.17
Differentiate the following functions:

1. $x^2$

2. $x^3$

3. $x^5$

4. $x^n$, $\forall n \in \mathbb{Z}^+$

5. $\frac{3x + 11}{2x - 9}$

6. $\sqrt{x}$

7. $\sin x$

Solution. We may take the limit of the difference quotient either as $z \to x$ or as $h \to 0$.

1. Either method works:
\[ \lim_{z\to x} \frac{f(z) - f(x)}{z - x} = \lim_{z\to x} \frac{z^2 - x^2}{z - x} = \lim_{z\to x} \frac{(z - x)(z + x)}{z - x} = \lim_{z\to x} (z + x) = 2x. \]
\[ \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h\to 0} \frac{(x + h)^2 - x^2}{h} = \lim_{h\to 0} \frac{x^2 + 2xh + h^2 - x^2}{h} = \lim_{h\to 0} (2x + h) = 2x. \]

2. Factoring seems like a relatively quicker solution:
\[ \lim_{z\to x} \frac{f(z) - f(x)}{z - x} = \lim_{z\to x} \frac{z^3 - x^3}{z - x} = \lim_{z\to x} \frac{(z - x)(z^2 + zx + x^2)}{z - x} = \lim_{z\to x} (z^2 + zx + x^2) = 3x^2. \]

3. Using the helpful factorization technique $a^n - b^n = (a - b)(a^{n-1} + a^{n-2}b + \ldots + ab^{n-2} + b^{n-1})$, we can quickly deduce
\[ \lim_{z\to x} \frac{f(z) - f(x)}{z - x} = \lim_{z\to x} \frac{z^5 - x^5}{z - x} = \lim_{z\to x} (z^4 + z^3 x + z^2 x^2 + z x^3 + x^4) = 5x^4. \]
Of course, we also have the option of expanding:
\begin{align*}
\lim_{h\to 0} \frac{f(x + h) - f(x)}{h} &= \lim_{h\to 0} \frac{(x + h)^5 - x^5}{h} \\
&= \lim_{h\to 0} \frac{x^5 + 5x^4 h + 10x^3 h^2 + 10x^2 h^3 + 5xh^4 + h^5 - x^5}{h} \\
&= \lim_{h\to 0} (5x^4 + 10x^3 h + 10x^2 h^2 + 5xh^3 + h^4) = 5x^4.
\end{align*}

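4. From the previous examples, it seems that we have a pattern here. In fact, we can show that $\frac{d}{dx}(x^n) = nx^{n-1}$ using the Binomial Theorem.
\begin{align*}
\frac{d}{dx}(x^n) &= \lim_{h\to 0} \frac{(x + h)^n - x^n}{h} \\
&= \lim_{h\to 0} \frac{\binom{n}{0}x^n + \binom{n}{1}x^{n-1}h + \ldots + \binom{n}{n-1}xh^{n-1} + \binom{n}{n}h^n - x^n}{h} \\
&= \lim_{h\to 0} \frac{\binom{n}{1}x^{n-1}h + \binom{n}{2}x^{n-2}h^2 + \ldots + \binom{n}{n-2}x^2 h^{n-2} + \binom{n}{n-1}xh^{n-1} + h^n}{h} \\
&= \lim_{h\to 0} \left(\binom{n}{1}x^{n-1} + \binom{n}{2}x^{n-2}h + \ldots + \binom{n}{n-2}x^2 h^{n-3} + \binom{n}{n-1}xh^{n-2} + h^{n-1}\right) \\
&= nx^{n-1}.
\end{align*}
We can also use the fact $a^n - b^n = (a - b)(a^{n-1} + a^{n-2}b + \ldots + ab^{n-2} + b^{n-1})$ as mentioned before:
\begin{align*}
\frac{d}{dx}(x^n) &= \lim_{z\to x} \frac{z^n - x^n}{z - x} \\
&= \lim_{z\to x} \frac{(z - x)(z^{n-1} + z^{n-2}x + \ldots + zx^{n-2} + x^{n-1})}{z - x} \\
&= \lim_{z\to x} (z^{n-1} + z^{n-2}x + \ldots + zx^{n-2} + x^{n-1}) \\
&= x^{n-1} + x^{n-2}\cdot x + \ldots + x\cdot x^{n-2} + x^{n-1} \\
&= nx^{n-1}.
\end{align*}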

5. It is up to you which method you prefer. The following uses expanding.
\begin{align*}
\frac{d}{dx}\left(\frac{3x + 11}{2x - 9}\right) &= \lim_{h\to 0} \frac{\frac{3(x+h) + 11}{2(x+h) - 9} - \frac{3x + 11}{2x - 9}}{h} \\
&= \lim_{h\to 0} \frac{\frac{3x + 3h + 11}{2x + 2h - 9} - \frac{3x + 11}{2x - 9}}{h} \\
&= \lim_{h\to 0} \frac{(3x + 3h + 11)(2x - 9) - (3x + 11)(2x + 2h - 9)}{h(2x + 2h - 9)(2x - 9)} \\
&= \lim_{h\to 0} \frac{6x^2 - 27x + 6xh - 27h + 22x - 99 - 6x^2 - 6xh + 27x - 22x - 22h + 99}{h(2x + 2h - 9)(2x - 9)} \\
&= \lim_{h\to 0} \frac{-49h}{h(2x + 2h - 9)(2x - 9)} \\
&= \lim_{h\to 0} -\frac{49}{(2x + 2h - 9)(2x - 9)} \\
&= -\frac{49}{(2x - 9)^2}.
\end{align*}

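6. A common strategy in this situation is to multiply by the radical conjugate.
\begin{align*}
\frac{d}{dx}\left(\sqrt{x}\right) &= \lim_{h\to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} \\
&= \lim_{h\to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h}\cdot\frac{\sqrt{x + h} + \sqrt{x}}{\sqrt{x + h} + \sqrt{x}} \\
&= \lim_{h\to 0} \frac{x + h - x}{h\left(\sqrt{x + h} + \sqrt{x}\right)} \\
&= \lim_{h\to 0} \frac{h}{h\left(\sqrt{x + h} + \sqrt{x}\right)} \\
&= \lim_{h\to 0} \frac{1}{\sqrt{x + h} + \sqrt{x}} \\
&= \frac{1}{2\sqrt{x}}.
\end{align*}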

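7. This is one of the more nontrivial functions to differentiate using the limit definition.
\begin{align*}
\frac{d}{dx}(\sin x) &= \lim_{h\to 0} \frac{\sin(x + h) - \sin x}{h} \\
&= \lim_{h\to 0} \frac{\sin x\cos h + \cos x\sin h - \sin x}{h} \\
&= \lim_{h\to 0} \frac{\cos x\sin h - \sin x(1 - \cos h)}{h} \\
&= \lim_{h\to 0} \frac{\cos x\sin h - \sin x\cdot\frac{\sin^2 h}{1 + \cos h}}{h} \\
&= \lim_{h\to 0} \frac{\sin h\left(\cos x - \frac{\sin x\sin h}{1 + \cos h}\right)}{h} \\
&= \lim_{h\to 0} \frac{\sin h}{h}\cdot\lim_{h\to 0}\left(\cos x - \frac{\sin x\sin h}{1 + \cos h}\right) \\
&= \lim_{h\to 0}\left(\cos x - \frac{\sin x\sin h}{1 + \cos h}\right) \\
&= \lim_{h\to 0}\cos x - \lim_{h\to 0}\frac{\sin x\sin h}{1 + \cos h} \\
&= \cos x - \frac{\lim_{h\to 0}(\sin x\sin h)}{\lim_{h\to 0}(1 + \cos h)} \\
&= \cos x - \frac{\sin x\cdot\lim_{h\to 0}\sin h}{1 + \lim_{h\to 0}\cos h} \\
&= \cos x - \frac{\sin x\cdot 0}{1 + 1} \\
&= \cos x.
\end{align*}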
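
The formulas obtained in this example can be compared against a numerical derivative. The sketch below uses a central difference, which is a standard approximation and not the method of the text; the helper name `central_difference`, the sample point $x = 2$, and the choice $n = 7$ for the power $x^n$ are illustrative assumptions.

```python
import math

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric secant slope (f(x+h) - f(x-h)) / 2h."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 2.0
# (label, function, derivative value claimed in the solution at x)
table = [
    ("x^2", lambda t: t**2, 2 * x),
    ("x^3", lambda t: t**3, 3 * x**2),
    ("x^5", lambda t: t**5, 5 * x**4),
    ("x^7", lambda t: t**7, 7 * x**6),
    ("(3x+11)/(2x-9)", lambda t: (3 * t + 11) / (2 * t - 9), -49 / (2 * x - 9) ** 2),
    ("sqrt(x)", math.sqrt, 1 / (2 * math.sqrt(x))),
    ("sin x", math.sin, math.cos(x)),
]

for label, f, claimed in table:
    print(f"{label}: numeric={central_difference(f, x):.6f}, claimed={claimed:.6f}")
```
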

Theorem 10.1.18 (Linearity of Derivatives)
Let $h(x) = f(x) + g(x)$ and $j(x) = k\cdot f(x)$ for some $k \in \mathbb{R}$. Assume that $f(x)$ and $g(x)$ are differentiable.

a) $h'(x) = f'(x) + g'(x)$.

b) $j'(x) = k\cdot f'(x)$.

Proof. a) By Theorem 9.3.5,
\begin{align*}
h'(x) &= \lim_{z\to x} \frac{h(z) - h(x)}{z - x} \\
&= \lim_{z\to x} \frac{f(z) + g(z) - (f(x) + g(x))}{z - x} \\
&= \lim_{z\to x} \frac{(f(z) - f(x)) + (g(z) - g(x))}{z - x} \\
&= \lim_{z\to x} \left(\frac{f(z) - f(x)}{z - x} + \frac{g(z) - g(x)}{z - x}\right) \\
&= \lim_{z\to x} \frac{f(z) - f(x)}{z - x} + \lim_{z\to x} \frac{g(z) - g(x)}{z - x} \\
&= f'(x) + g'(x).
\end{align*}

b) We can always factor out constants from limits, which follows from Theorem 9.3.4 and Theorem 9.3.5:
\begin{align*}
j'(x) &= \lim_{z\to x} \frac{j(z) - j(x)}{z - x} \\
&= \lim_{z\to x} \frac{k\cdot f(z) - k\cdot f(x)}{z - x} \\
&= \lim_{z\to x} \frac{k(f(z) - f(x))}{z - x} \\
&= \lim_{z\to x} k\cdot\frac{f(z) - f(x)}{z - x} \\
&= k\cdot\lim_{z\to x} \frac{f(z) - f(x)}{z - x} \\
&= k\cdot f'(x).
\end{align*}

Problem 10.1.19. Let $g(x) = f(2x)$. If $f'(x) = \alpha(x)$, what is $g'(x)$?

Solution. First, note that
\begin{align*}
g'(x) &= \lim_{z\to x} \frac{g(z) - g(x)}{z - x} \\
&= \lim_{z\to x} \frac{f(2z) - f(2x)}{z - x} \\
&= 2\cdot\lim_{z\to x} \frac{f(2z) - f(2x)}{2z - 2x}.
\end{align*}
As $z$ approaches $x$, $2z$ approaches $2x$, so this is equivalent to
\[ 2\cdot\lim_{2z\to 2x} \frac{f(2z) - f(2x)}{2z - 2x} = 2\alpha(2x). \]

Theorem 10.1.20
If $f(x)$ is differentiable at $x = a$, then it is continuous at $x = a$.

Proof. We want to show that if $\lim_{x\to a} \frac{f(x) - f(a)}{x - a}$ exists, then $\lim_{x\to a} f(x) = f(a)$, or equivalently, $\lim_{x\to a} (f(x) - f(a)) = 0$.
We know that $\lim_{x\to a} \frac{f(x) - f(a)}{x - a}$ and $\lim_{x\to a} (x - a)$ both exist. We have that
\[ \lim_{x\to a} \frac{f(x) - f(a)}{x - a}\cdot\lim_{x\to a} (x - a) = \lim_{x\to a} \left(\frac{f(x) - f(a)}{x - a}\cdot(x - a)\right) = \lim_{x\to a} (f(x) - f(a)). \]
But note that $\lim_{x\to a} (x - a) = 0$, so the left-hand side equals $f'(a)\cdot 0 = 0$. Hence the right-hand side, i.e. $\lim_{x\to a} (f(x) - f(a))$, is necessarily 0.

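Problem 10.1.21. Define the following functions:
\[ f(x) = \begin{cases} x\sin\left(\dfrac{1}{x}\right) & x \neq 0 \\ 0 & x = 0 \end{cases} \qquad g(x) = \begin{cases} x^2\sin\left(\dfrac{1}{x}\right) & x \neq 0 \\ 0 & x = 0 \end{cases} \]
Are these functions differentiable at $x = 0$?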

Solution. Consider $f(x)$ at $x = 0$. By the limit definition of a derivative, we have
\begin{align*}
f'(0) &= \lim_{x\to 0} \frac{f(x) - f(0)}{x - 0} \\
&= \lim_{x\to 0} \frac{x\sin\left(\frac{1}{x}\right) - 0}{x - 0} \\
&= \lim_{x\to 0} \sin\left(\frac{1}{x}\right).
\end{align*}
For this limit to exist, the right-hand limit $\lim_{x\to 0^+} \sin\left(\frac{1}{x}\right)$ must exist. By Lemma 9.6.5, $\lim_{x\to 0^+} \sin\left(\frac{1}{x}\right) = \lim_{y\to\infty} \sin y$, and this limit does not exist, since $\sin y$ oscillates between $-1$ and 1. Therefore, $f'(0)$ does not exist, so $f$ is not differentiable at $x = 0$.

Now consider $g(x)$ at $x = 0$. By the limit definition of a derivative, we have
\begin{align*}
g'(0) &= \lim_{x\to 0} \frac{g(x) - g(0)}{x - 0} \\
&= \lim_{x\to 0} \frac{x^2\sin\left(\frac{1}{x}\right) - 0}{x - 0} \\
&= \lim_{x\to 0} x\sin\left(\frac{1}{x}\right).
\end{align*}
We can split this limit into right-hand and left-hand limits, then apply Squeeze Theorem on both, to determine that $\lim_{x\to 0} x\sin\left(\frac{1}{x}\right) = 0$. Thus, $g'(0) = 0$, so $g$ is differentiable at $x = 0$.
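
The contrast between $f$ and $g$ can be seen in their difference quotients at 0, which simplify to $\sin(1/h)$ and $h\sin(1/h)$ respectively. In the sketch below, the function names and the particular sequences of $h$ (chosen so that $1/h$ lands near the peaks and troughs of sine) are illustrative assumptions.

```python
import math

def f_quotient(h):
    """(f(h) - f(0)) / h for f(x) = x sin(1/x), f(0) = 0."""
    return math.sin(1 / h)

def g_quotient(h):
    """(g(h) - g(0)) / h for g(x) = x^2 sin(1/x), g(0) = 0."""
    return h * math.sin(1 / h)

for k in (1, 10, 100):
    h_top = 1 / (math.pi / 2 + 2 * math.pi * k)         # sin(1/h) is close to 1
    h_bottom = 1 / (3 * math.pi / 2 + 2 * math.pi * k)  # sin(1/h) is close to -1
    print(h_top, f_quotient(h_top), f_quotient(h_bottom), g_quotient(h_top))
```

However small $h$ becomes, the quotient for $f$ keeps returning to values near $1$ and $-1$, while the quotient for $g$ is squeezed between $-|h|$ and $|h|$.
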

Theorem 10.1.22 (Product Rule of Derivatives)
Assuming that the derivatives of $f(x)$ and $g(x)$ exist,
\[ \frac{d}{dx}(f(x)g(x)) = f(x)g'(x) + f'(x)g(x). \]

Proof. This proof uses a similar strategy to that of Theorem 9.3.5.
\begin{align*}
\frac{d}{dx}(f(x)g(x)) &= \lim_{z\to x} \frac{f(z)g(z) - f(x)g(x)}{z - x} \\
&= \lim_{z\to x} \frac{f(z)g(z) - f(z)g(x) + f(z)g(x) - f(x)g(x)}{z - x} \\
&= \lim_{z\to x} \frac{f(z)(g(z) - g(x)) + g(x)(f(z) - f(x))}{z - x} \\
&= \lim_{z\to x} \left(\frac{f(z)(g(z) - g(x))}{z - x} + \frac{g(x)(f(z) - f(x))}{z - x}\right) \\
&= \lim_{z\to x} f(z)\cdot\lim_{z\to x} \frac{g(z) - g(x)}{z - x} + \lim_{z\to x} \frac{f(z) - f(x)}{z - x}\cdot\lim_{z\to x} g(x) \\
&= \lim_{z\to x} f(z)\cdot g'(x) + f'(x)g(x).
\end{align*}
As we have assumed that $f(x)$ and $g(x)$ are differentiable, by Theorem 10.1.20, $\lim_{z\to x} f(z) = f(x)$, therefore
\[ \frac{d}{dx}(f(x)g(x)) = f(x)g'(x) + f'(x)g(x). \]

Theorem 10.1.23 (Quotient Rule of Derivatives)
Assuming that the derivatives of $f(x)$ and $g(x)$ exist and $g(x) \neq 0$,
\[ \frac{d}{dx}\left(\frac{f(x)}{g(x)}\right) = \frac{g(x)f'(x) - f(x)g'(x)}{g(x)^2}. \]

Proof. First, we combine the fractions, then apply a similar strategy used in the previous proof.
\begin{align*}
\frac{d}{dx}\left(\frac{f(x)}{g(x)}\right) &= \lim_{z\to x} \frac{\frac{f(z)}{g(z)} - \frac{f(x)}{g(x)}}{z - x} \\
&= \lim_{z\to x} \frac{f(z)g(x) - f(x)g(z)}{g(x)g(z)(z - x)} \\
&= \lim_{z\to x} \frac{f(z)g(x) - f(x)g(x) + f(x)g(x) - f(x)g(z)}{g(x)g(z)(z - x)} \\
&= \lim_{z\to x} \frac{g(x)(f(z) - f(x)) + f(x)(g(x) - g(z))}{g(x)g(z)(z - x)} \\
&= \lim_{z\to x} \frac{1}{g(x)}\cdot\lim_{z\to x} \frac{1}{g(z)}\cdot\left(\lim_{z\to x} \frac{f(z) - f(x)}{z - x}\cdot\lim_{z\to x} g(x) + \lim_{z\to x} f(x)\cdot\lim_{z\to x} \frac{g(x) - g(z)}{z - x}\right).
\end{align*}
By our assumptions and Theorem 10.1.20, we have
\begin{align*}
\frac{d}{dx}\left(\frac{f(x)}{g(x)}\right) &= \frac{1}{g(x)}\cdot\frac{1}{g(x)}\cdot\left(f'(x)\cdot g(x) + f(x)\cdot(-g'(x))\right) \\
&= \frac{g(x)f'(x) - f(x)g'(x)}{g(x)^2}.
\end{align*}

Problem 10.1.24. Evaluate the derivatives of $f(x) = x^3\sin x$ and $f(x) = \frac{2x - 1}{3x + 2}$.

Solution. For the first function, we apply Theorem 10.1.22.
\begin{align*}
\frac{d}{dx}(x^3\sin x) &= \frac{d}{dx}(x^3)\cdot\sin x + x^3\cdot\frac{d}{dx}(\sin x) \\
&= 3x^2\cdot\sin x + x^3\cdot\cos x \\
&= x^2(3\sin x + x\cos x).
\end{align*}
For the latter, apply Theorem 10.1.23.
\begin{align*}
\frac{d}{dx}\left(\frac{2x - 1}{3x + 2}\right) &= \frac{(3x + 2)\cdot\frac{d}{dx}(2x - 1) - (2x - 1)\cdot\frac{d}{dx}(3x + 2)}{(3x + 2)^2} \\
&= \frac{2(3x + 2) - 3(2x - 1)}{(3x + 2)^2} \\
&= \frac{7}{(3x + 2)^2}.
\end{align*}

Problem 10.1.25. By Example 10.1.17, we have established that the derivative of $\sin x$ is $\cos x$. Using Theorem 10.1.22 and Theorem 10.1.23, find the derivatives of the rest of the trigonometric functions.

Solution. The computation of $\frac{d}{dx}(\cos x)$ is similar to that of $\frac{d}{dx}(\sin x)$.
\begin{align*}
\frac{d}{dx}(\cos x) &= \lim_{h\to 0} \frac{\cos(x + h) - \cos x}{h} \\
&= \lim_{h\to 0} \frac{\cos x\cos h - \sin x\sin h - \cos x}{h} \\
&= \lim_{h\to 0} \frac{\cos x(\cos h - 1) - \sin x\sin h}{h} \\
&= \lim_{h\to 0} \frac{\cos x\cdot\left(-\frac{\sin^2 h}{\cos h + 1}\right) - \sin x\sin h}{h} \\
&= \lim_{h\to 0} \frac{-\sin h\left(\frac{\cos x\sin h}{\cos h + 1} + \sin x\right)}{h} \\
&= \lim_{h\to 0} -\frac{\sin h}{h}\cdot\lim_{h\to 0}\left(\frac{\cos x\sin h}{\cos h + 1} + \sin x\right) \\
&= -\lim_{h\to 0}\left(\frac{\cos x\sin h}{\cos h + 1} + \sin x\right) \\
&= -\sin x.
\end{align*}

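Now that we have found the derivatives of sine and cosine, we can apply Theorem 10.1.23 to find $\frac{d}{dx}(\tan x)$.
\begin{align*}
\frac{d}{dx}(\tan x) &= \frac{d}{dx}\left(\frac{\sin x}{\cos x}\right) \\
&= \frac{\cos x\cdot\frac{d}{dx}(\sin x) - \sin x\cdot\frac{d}{dx}(\cos x)}{\cos^2 x} \\
&= \frac{\cos^2 x + \sin^2 x}{\cos^2 x} \\
&= \frac{1}{\cos^2 x} \\
&= \sec^2 x.
\end{align*}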

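The computation of $\frac{d}{dx}(\cot x)$ is analogous to that of $\frac{d}{dx}(\tan x)$.
\begin{align*}
\frac{d}{dx}(\cot x) &= \frac{d}{dx}\left(\frac{\cos x}{\sin x}\right) \\
&= \frac{\sin x\cdot\frac{d}{dx}(\cos x) - \cos x\cdot\frac{d}{dx}(\sin x)}{\sin^2 x} \\
&= \frac{-\sin^2 x - \cos^2 x}{\sin^2 x} \\
&= -\frac{1}{\sin^2 x} \\
&= -\csc^2 x.
\end{align*}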

For $\frac{d}{dx}(\sec x)$, use Theorem 10.1.23 and note that the derivative of 1 is simply 0.
\begin{align*}
\frac{d}{dx}(\sec x) &= \frac{d}{dx}\left(\frac{1}{\cos x}\right) \\
&= \frac{\cos x\cdot\frac{d}{dx}(1) - 1\cdot\frac{d}{dx}(\cos x)}{\cos^2 x} \\
&= \frac{\sin x}{\cos^2 x} \\
&= \tan x\sec x.
\end{align*}

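The computation of $\frac{d}{dx}(\csc x)$ is analogous to that of $\frac{d}{dx}(\sec x)$.
\begin{align*}
\frac{d}{dx}(\csc x) &= \frac{d}{dx}\left(\frac{1}{\sin x}\right) \\
&= \frac{\sin x\cdot\frac{d}{dx}(1) - 1\cdot\frac{d}{dx}(\sin x)}{\sin^2 x} \\
&= -\frac{\cos x}{\sin^2 x} \\
&= -\cot x\csc x.
\end{align*}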
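
As a quick cross-check of the six trigonometric derivatives, we can compare each closed form with a central-difference approximation at a sample point. The helper `central_difference` and the sample point $x = 0.7$ are illustrative assumptions; the secant, cosecant and cotangent helpers are defined by hand because Python's `math` module does not provide them.

```python
import math

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric secant slope."""
    return (f(x + h) - f(x - h)) / (2 * h)

def sec(t):
    return 1 / math.cos(t)

def csc(t):
    return 1 / math.sin(t)

def cot(t):
    return math.cos(t) / math.sin(t)

# name -> (function, derivative formula found in the solution)
rules = {
    "sin": (math.sin, lambda t: math.cos(t)),
    "cos": (math.cos, lambda t: -math.sin(t)),
    "tan": (math.tan, lambda t: sec(t) ** 2),
    "cot": (cot, lambda t: -(csc(t) ** 2)),
    "sec": (sec, lambda t: math.tan(t) * sec(t)),
    "csc": (csc, lambda t: -cot(t) * csc(t)),
}

x = 0.7
for name, (f, df) in rules.items():
    print(f"{name}: numeric={central_difference(f, x):.6f}, formula={df(x):.6f}")
```
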

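Problem 10.1.26. Prove the following results about the derivatives of function transformations.

a) $\frac{d}{dx}(f(x + k)) = f'(x + k)$.

b) $\frac{d}{dx}(f(kx)) = kf'(kx)$.

Proof. The general strategy is to substitute the transformations $x + k$ and $kx$ with convenient variables to facilitate the manipulation of the limit definition of the derivative.

a) Let $\tilde{z} = z + k$ and $\tilde{x} = x + k$. Note that $\tilde{z} - \tilde{x} = z - x$, and that $z$ goes to $x$ as $\tilde{z}$ goes to $\tilde{x}$. Then,
\begin{align*}
\frac{d}{dx}(f(x + k)) &= \lim_{z\to x} \frac{f(z + k) - f(x + k)}{z - x} \\
&= \lim_{\tilde{z}\to\tilde{x}} \frac{f(\tilde{z}) - f(\tilde{x})}{\tilde{z} - \tilde{x}} \\
&= f'(\tilde{x}) \\
&= f'(x + k).
\end{align*}

b) If $k = 0$, both sides are 0, so assume $k \neq 0$. Let $\tilde{z} = kz$ and $\tilde{x} = kx$. Note that $z$ goes to $x$ as $\tilde{z}$ goes to $\tilde{x}$. Then,
\begin{align*}
\frac{d}{dx}(f(kx)) &= \lim_{z\to x} \frac{f(kz) - f(kx)}{z - x} \\
&= \lim_{z\to x} \frac{f(kz) - f(kx)}{kz - kx}\cdot k \\
&= \lim_{\tilde{z}\to\tilde{x}} \frac{f(\tilde{z}) - f(\tilde{x})}{\tilde{z} - \tilde{x}}\cdot k \\
&= f'(\tilde{x})\cdot k \\
&= kf'(kx).
\end{align*}
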
Remark. Keep in mind that $f'(x + k)$ and $f'(kx)$ do NOT mean the same thing as $\frac{d}{dx}(f(x + k))$ and $\frac{d}{dx}(f(kx))$, even though part a) shows that the first pair happens to be equal. If something is some function of $x$, then $f'(\text{something})$ means taking the derivative of $f(x)$ and then plugging in $x \to \text{something}$, while $\frac{d}{dx}(f(\text{something}))$ means differentiating the function $x \mapsto f(\text{something})$ with respect to $x$.
For instance, if we let $f(x) = \sin x$, then $f'(2x) = \cos 2x$ while $\frac{d}{dx}(f(2x)) = 2\cos 2x$ according to part b).

Problem 10.1.27. Using the results of Problem 10.1.26, quickly evaluate $\frac{d}{dx}(\cos x)$.

Solution. We can take advantage of the cofunction identity of sine and cosine:
\[ \frac{d}{dx}(\cos x) = \frac{d}{dx}\left(\sin\left(x + \frac{\pi}{2}\right)\right) = \sin'\left(x + \frac{\pi}{2}\right) = \cos\left(x + \frac{\pi}{2}\right) = -\sin x. \]

Problem 10.1.28. Use the product rule twice to find $\frac{d}{dx}(f(x)g(x)h(x))$.

Solution. Initially suppose that $f(x)g(x)$ is a single function for the first application of Theorem 10.1.22, then apply the theorem again to break up $f(x)g(x)$.
\begin{align*}
\frac{d}{dx}(f(x)g(x)h(x)) &= f(x)g(x)\frac{d}{dx}(h(x)) + \frac{d}{dx}(f(x)g(x))h(x) \\
&= f(x)g(x)h'(x) + (f(x)g'(x) + f'(x)g(x))h(x) \\
&= f(x)g(x)h'(x) + f(x)g'(x)h(x) + f'(x)g(x)h(x).
\end{align*}

It seems that there is a pattern for the derivative of the product of an arbitrary number of functions. Here is the general statement:

Theorem 10.1.29 (Generalized Product Rule)
Consider differentiable functions $f_1, f_2, \ldots, f_n$. Then, wherever none of the $f_i(x)$ is zero,
\[ \frac{d}{dx}\left(\prod_{k=1}^{n} f_k(x)\right) = \left(\prod_{k=1}^{n} f_k(x)\right)\left(\sum_{i=1}^{n} \frac{f_i'(x)}{f_i(x)}\right). \]

This result can be proved by mathematical induction.
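
A numerical spot check of the generalized product rule for three factors is sketched below. The three sample functions, the sample point, and the helper names are illustrative assumptions; the factors are chosen to be nonzero at the sample point so that the quotients $f_i'(x)/f_i(x)$ make sense.

```python
import math

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric secant slope."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Each pair is (f_i, f_i'); sample choices, all nonzero at x = 0.5.
factors = [
    (math.sin, math.cos),
    (math.exp, math.exp),
    (lambda t: t * t + 1, lambda t: 2 * t),
]

def product(t):
    """The product f_1(t) * f_2(t) * f_3(t)."""
    return math.prod(f(t) for f, _ in factors)

def product_rule(t):
    """Right-hand side of Theorem 10.1.29: product times the sum of f_i'/f_i."""
    return product(t) * sum(df(t) / f(t) for f, df in factors)

x = 0.5
print(central_difference(product, x), product_rule(x))
```
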

Corollary 10.1.30 (General Power Rule)
\[ \frac{d}{dx}(f(x)^n) = nf(x)^{n-1}f'(x). \]

Proof. We consider a special case of Theorem 10.1.29 where $f_1 = f_2 = \ldots = f_n = f$.
\begin{align*}
\frac{d}{dx}(f(x)^n) &= \left(\prod_{i=1}^{n} f(x)\right)\left(\sum_{i=1}^{n} \frac{f'(x)}{f(x)}\right) \\
&= f(x)^n\left(\frac{nf'(x)}{f(x)}\right) \\
&= nf(x)^{n-1}f'(x).
\end{align*}

Problem 10.1.31. Recall the result of Example 10.1.17 where it was proven that $\frac{d}{dx}(x^n) = nx^{n-1}$ for all $n \in \mathbb{Z}^+$. Demonstrate that this identity holds for all negative integers $n$ as well.

Proof. Let $n \in \mathbb{Z}^+$. By Theorem 10.1.23, note that
\begin{align*}
\frac{d}{dx}\left(x^{-n}\right) &= \frac{d}{dx}\left(\frac{1}{x^n}\right) \\
&= \frac{x^n\cdot\frac{d}{dx}(1) - 1\cdot\frac{d}{dx}(x^n)}{(x^n)^2} \\
&= \frac{-nx^{n-1}}{x^{2n}} \\
&= -nx^{-n-1}.
\end{align*}
Let $m = -n$, such that $m$ is a negative integer. Then we have $\frac{d}{dx}(x^m) = mx^{m-1}$, and we are done.

Clearly, for $n = 0$, $\frac{d}{dx}(x^0) = \frac{d}{dx}(1) = 0 = 0\cdot x^{0-1}$. We have now established that $\frac{d}{dx}(x^n) = nx^{n-1}$ for all integers $n$. However, we can extend this rule to the rational numbers:

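Problem 10.1.32. Let $n \in \mathbb{Z}^+$. Evaluate $\frac{d}{dx}\left(x^{\frac{1}{n}}\right)$.

Proof. We take advantage of the following factorization:
\begin{align*}
z - x &= \left(z^{\frac{1}{n}}\right)^n - \left(x^{\frac{1}{n}}\right)^n \\
&= \left(z^{\frac{1}{n}} - x^{\frac{1}{n}}\right)\left(z^{\frac{n-1}{n}} + z^{\frac{n-2}{n}}x^{\frac{1}{n}} + \ldots + z^{\frac{1}{n}}x^{\frac{n-2}{n}} + x^{\frac{n-1}{n}}\right).
\end{align*}
Thus, we have
\begin{align*}
\frac{d}{dx}\left(x^{\frac{1}{n}}\right) &= \lim_{z\to x} \frac{z^{\frac{1}{n}} - x^{\frac{1}{n}}}{z - x} \\
&= \lim_{z\to x} \frac{z^{\frac{1}{n}} - x^{\frac{1}{n}}}{\left(z^{\frac{1}{n}} - x^{\frac{1}{n}}\right)\left(z^{\frac{n-1}{n}} + z^{\frac{n-2}{n}}x^{\frac{1}{n}} + \ldots + z^{\frac{1}{n}}x^{\frac{n-2}{n}} + x^{\frac{n-1}{n}}\right)} \\
&= \lim_{z\to x} \frac{1}{z^{\frac{n-1}{n}} + z^{\frac{n-2}{n}}x^{\frac{1}{n}} + \ldots + z^{\frac{1}{n}}x^{\frac{n-2}{n}} + x^{\frac{n-1}{n}}} \\
&= \frac{1}{nx^{\frac{n-1}{n}}} \\
&= \frac{1}{n}x^{\frac{1}{n} - 1}.
\end{align*}
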
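Problem 10.1.33. Use Theorem 10.1.23 to find $\frac{d}{dx}\left(x^{-\frac{1}{n}}\right)$ for $n \in \mathbb{Z}^+$.

Proof. We take the same approach as we did for Problem 10.1.31.
\begin{align*}
\frac{d}{dx}\left(x^{-\frac{1}{n}}\right) &= \frac{d}{dx}\left(\frac{1}{x^{\frac{1}{n}}}\right) \\
&= \frac{x^{\frac{1}{n}}\cdot\frac{d}{dx}(1) - 1\cdot\frac{d}{dx}\left(x^{\frac{1}{n}}\right)}{\left(x^{\frac{1}{n}}\right)^2} \\
&= \frac{-\frac{1}{n}x^{\frac{1}{n} - 1}}{x^{\frac{2}{n}}} \\
&= -\frac{1}{n}x^{-\frac{1}{n} - 1}.
\end{align*}

We now have shown that $\frac{d}{dx}\left(x^{\frac{1}{n}}\right) = \frac{1}{n}x^{\frac{1}{n} - 1}$ for all nonzero integers $n$. Finally, we deal with the general case for rational numbers $\frac{p}{q}$, using results that we have discovered so far: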

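Theorem 10.1.34 (Power Rule)
\[ \forall n \in \mathbb{Q},\quad \frac{d}{dx}(x^n) = nx^{n-1}. \]

Proof. Using the previous result and Corollary 10.1.30, we will evaluate $\frac{d}{dx}\left(x^{\frac{p}{q}}\right)$ for $p, q \in \mathbb{Z}$ and $q \neq 0$.
\begin{align*}
\frac{d}{dx}\left(x^{\frac{p}{q}}\right) &= \frac{d}{dx}\left(\left(x^{\frac{1}{q}}\right)^p\right) \\
&= p\left(x^{\frac{1}{q}}\right)^{p-1}\left(\frac{1}{q}x^{\frac{1}{q} - 1}\right) \\
&= \frac{p}{q}\left(x^{\frac{1}{q}}\right)^{p-1}\left(x^{\frac{1}{q} - 1}\right) \\
&= \frac{p}{q}x^{\frac{p}{q} - 1}.
\end{align*}
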
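
The power rule for rational exponents can be spot-checked numerically at a positive sample point. In the sketch below, the helper names, the sample exponents, and the point $x = 2$ are illustrative assumptions; exponents are stored as `Fraction` objects only to make the rational values explicit.

```python
from fractions import Fraction

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric secant slope."""
    return (f(x + h) - f(x - h)) / (2 * h)

def power_rule(n, x):
    """n * x^(n-1), the derivative of x^n claimed by Theorem 10.1.34."""
    return float(n) * x ** (float(n) - 1)

x = 2.0
for n in (Fraction(1, 2), Fraction(-3, 4), Fraction(5, 3), Fraction(-2)):
    exponent = float(n)
    numeric = central_difference(lambda t: t ** exponent, x)
    print(n, numeric, power_rule(n, x))
```
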
Problem 10.1.35. Let $P(x)$ be the polynomial $\sum_{k=0}^{n} a_k x^k$. Find $P'(x)$.

Proof. As a direct result of Theorem 10.1.18 and Theorem 10.1.34, we have
\begin{align*}
P'(x) &= \sum_{k=0}^{n} \frac{d}{dx}(a_k x^k) \\
&= \sum_{k=0}^{n} ka_k x^{k-1}.
\end{align*}

Now, notice the similarity among the following:

• $\frac{d}{dx}(f(x + c)) = f'(x + c)$.
When we evaluate $\frac{d}{dx}(f(x + c))$, we are actually evaluating $\frac{d}{dx}(f \circ g(x))$, where $g(x) = x + c$.

• $\frac{d}{dx}(f(kx)) = kf'(kx)$.
When we evaluate $\frac{d}{dx}(f(kx))$, we are actually evaluating $\frac{d}{dx}(f \circ g(x))$, where $g(x) = kx$.

• $\frac{d}{dx}(f(x)^n) = nf(x)^{n-1}f'(x)$.
When we evaluate $\frac{d}{dx}(f(x)^n)$, we are actually evaluating $\frac{d}{dx}(g \circ f(x))$, where $g(x) = x^n$.

Considering the composition of functions will lead us into our next important property of
derivatives:

Theorem 10.1.36 (Chain Rule)
Under appropriate conditions (i.e. functions are well-behaved, continuous; there are no domain or range issues, etc.),
\[ \frac{d}{dx}(g(f(x))) = g'(f(x))f'(x). \]
For convenience of explanations, we will refer to $f$ in this statement as the ‘inner function’ and $g$ as the ‘outside function.’

If we let $y = f(x)$ and $z = g(y)$, then it follows from Theorem 10.1.36 that
\[ \frac{dz}{dx} = \frac{dz}{dy}\frac{dy}{dx}. \]

The proof of this theorem lies beyond the scope of this book.

Problem 10.1.37. Use Theorem 10.1.36 to find $\frac{d}{dx}(f(x^n))$.

Solution. Note that the inner function is $x^n$, and the outside function is $f$. Using Theorem 10.1.36 gives us
\[ \frac{d}{dx}(f(x^n)) = f'(x^n)\cdot\frac{d}{dx}(x^n) = f'(x^n)\cdot nx^{n-1}. \]

Exercise 10.1.38. Use Theorem 10.1.36 to demonstrate Corollary 10.1.30.
Problem 10.1.39. Let $f$ be a function such that $f'(x) = \frac{1}{\sqrt{1 - x^2}}$. For $x \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, evaluate $\frac{d}{dx}(f(\sin x))$.

Solution. By Theorem 10.1.36, we have $\frac{d}{dx}(f(\sin x)) = f'(\sin x)\frac{d}{dx}(\sin x) = f'(\sin x)\cos x$. To evaluate $f'(\sin x)$, we simply plug in $\sin x$ into the input of $f'$, i.e. $f'(\sin x) = \frac{1}{\sqrt{1 - \sin^2 x}}$. Since $x \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, we have $\cos x > 0$, so $\sqrt{1 - \sin^2 x} = \cos x$, such that $f'(\sin x) = \frac{1}{\cos x}$. Thus, $\frac{d}{dx}(f(\sin x)) = \frac{1}{\cos x}\cdot\cos x = 1$.

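
The problem does not name $f$, but one function with the stated derivative is the inverse sine, so we can use it for a numerical spot check. Treating $f$ as `math.asin` is our assumption for illustration only; the helper name `central_difference` and the sample points are likewise arbitrary.

```python
import math

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric secant slope."""
    return (f(x + h) - f(x - h)) / (2 * h)

def composite(x):
    """f(sin x) with f taken to be asin, one function whose derivative is 1/sqrt(1 - x^2)."""
    return math.asin(math.sin(x))

# Sample points inside (-pi/2, pi/2); each slope should be close to 1.
for x in (-1.2, -0.5, 0.0, 0.5, 1.2):
    print(x, central_difference(composite, x))
```
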
Problem 10.1.40. Differentiate the following:

1. $\sin^3(x^2)$

2. $\sqrt{1 + 4\sin x}$

3. $\frac{x^2}{1 + x^2}$

4. $\tan(x^3 + x^2)$

5. $\sqrt[3]{\cos^4\left(\sin^5(x^6)\right)}$

Solution. You will need to be familiar with applying Theorem 10.1.36:

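1. After applying Theorem 10.1.36 twice, we get
\[ \frac{d}{dx}(\sin^3(x^2)) = 3 \sin^2(x^2) \cdot \frac{d}{dx}(\sin(x^2)) = 3 \sin^2(x^2) \cdot \cos(x^2) \cdot 2x = 6x \sin^2(x^2) \cos(x^2). \]

2. Similar to the previous one,
\[ \frac{d}{dx}\left(\sqrt{1 + 4 \sin x}\right) = \frac{1}{2}(1 + 4 \sin x)^{-\frac{1}{2}} \cdot \frac{d}{dx}(1 + 4 \sin x) = \frac{1}{2}(1 + 4 \sin x)^{-\frac{1}{2}} \cdot 4 \cos x = \frac{2 \cos x}{\sqrt{1 + 4 \sin x}}. \]

3. We use Theorem 10.1.23:
\begin{align*}
\frac{d}{dx}\left(\frac{x^2}{1 + x^2}\right) &= \frac{(1 + x^2) \cdot \frac{d}{dx}(x^2) - x^2 \cdot \frac{d}{dx}(1 + x^2)}{(1 + x^2)^2} \\
&= \frac{2x(1 + x^2) - 2x \cdot x^2}{(1 + x^2)^2} \\
&= \frac{2x}{(1 + x^2)^2}.
\end{align*}

4. By Theorem 10.1.36,
\[ \frac{d}{dx}(\tan(x^3 + x^2)) = \sec^2(x^3 + x^2) \cdot \frac{d}{dx}(x^3 + x^2) = \sec^2(x^3 + x^2) \cdot (3x^2 + 2x). \]

5. Repeatedly apply Theorem 10.1.36:
\begin{align*}
\frac{d}{dx}\left(\sqrt[3]{\cos^4\left(\sin^5(x^6)\right)}\right) &= \frac{4}{3} \cos^{\frac{1}{3}}\left(\sin^5(x^6)\right) \cdot \frac{d}{dx}\left(\cos\left(\sin^5(x^6)\right)\right) \\
&= \frac{4}{3} \cos^{\frac{1}{3}}\left(\sin^5(x^6)\right) \cdot \left(-\sin\left(\sin^5(x^6)\right)\right) \cdot \frac{d}{dx}\left(\sin^5(x^6)\right) \\
&= \frac{4}{3} \cos^{\frac{1}{3}}\left(\sin^5(x^6)\right) \cdot \left(-\sin\left(\sin^5(x^6)\right)\right) \cdot 5 \sin^4(x^6) \cdot \frac{d}{dx}\left(\sin(x^6)\right) \\
&= \frac{4}{3} \cos^{\frac{1}{3}}\left(\sin^5(x^6)\right) \cdot \left(-\sin\left(\sin^5(x^6)\right)\right) \cdot 5 \sin^4(x^6) \cdot \cos(x^6) \cdot 6x^5 \\
&= -40 x^5 \sin\left(\sin^5(x^6)\right) \sin^4(x^6) \cos^{\frac{1}{3}}\left(\sin^5(x^6)\right) \cos(x^6).
\end{align*}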

Generally, this is as far as problems involving the application of Theorem 10.1.36 go.
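
Long chain-rule answers like parts 1 and 5 of Problem 10.1.40 are easy to get wrong by a sign or a factor, so a quick numerical comparison can help. The sketch below is an illustration only: the function names, the sample point $x = 0.9$, and the central-difference helper are assumptions introduced here.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def part1(x):
    return math.sin(x ** 2) ** 3

def part1_answer(x):
    # 6x sin^2(x^2) cos(x^2)
    return 6 * x * math.sin(x ** 2) ** 2 * math.cos(x ** 2)

def part5(x):
    # cos(sin^5(x^6)) is positive because |sin^5(x^6)| <= 1 < pi/2,
    # so the real cube root can be taken with a fractional power.
    inner = math.cos(math.sin(x ** 6) ** 5)
    return (inner ** 4) ** (1 / 3)

def part5_answer(x):
    s = math.sin(x ** 6)
    u = s ** 5
    return (-40 * x ** 5 * math.sin(u) * s ** 4
            * math.cos(u) ** (1 / 3) * math.cos(x ** 6))

x0 = 0.9  # arbitrary sample point
print(derivative(part1, x0), part1_answer(x0))
print(derivative(part5, x0), part5_answer(x0))
```

Each printed pair should agree to several decimal places if the closed-form answer is right.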

Definition 10.1.41. Let $n \in \mathbb{Z}^+$. The $n$th derivative of $f(x)$ is what you get when you take the derivative of the function $n$ times. We write this as $f^{(n)}(x)$ or repeat the prime symbol $'$ $n$ times, as shown:
\begin{align*}
y &= f(x) \\
y' &= f'(x) \\
y'' &= f''(x) \\
y''' &= f'''(x)
\end{align*}
and so on.
We could also use Roman numerals, i.e. we would represent the fourth derivative of $f(x)$ as $f^{IV}(x)$.
Using Leibniz notation, we can represent the second derivative of $y$ as
\[ y'' = \frac{d}{dx}\left(y'\right) = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d^2 y}{dx^2}. \]

Problem 10.1.42. Let $y = x^3 + x^2 + x + 1$. Find $y'$, $y''$, $y'''$, $y^{IV}$, and $y^{(100)}$.

Solution.
\begin{align*}
y' &= 3x^2 + 2x + 1 \\
y'' &= 6x + 2 \\
y''' &= 6 \\
y^{IV} &= 0 \\
y^{(100)} &= 0
\end{align*}
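
Repeated differentiation of a polynomial is mechanical, which makes it a natural candidate for a short program. The following is a sketch under an assumed representation (a list of coefficients with the constant term first, and the empty list standing for the zero polynomial); the function names are hypothetical.

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial stored as [a0, a1, a2, ...] (constant term first)."""
    # The term a_k x^k becomes k * a_k x^(k-1); the constant term drops out.
    return [k * a for k, a in enumerate(coeffs)][1:]

def nth_derivative(coeffs, n):
    """Apply poly_derivative n times."""
    for _ in range(n):
        coeffs = poly_derivative(coeffs)
    return coeffs

y = [1, 1, 1, 1]  # 1 + x + x^2 + x^3
for n in (1, 2, 3, 4, 100):
    print(n, nth_derivative(y, n))
```

With this representation, the first derivative comes out as `[1, 2, 3]`, i.e. $1 + 2x + 3x^2$, and every derivative past the third is the empty list, matching the solution above.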

Exercise 10.1.43. Let $y = x^n$. Find $y^{(n)}$.

Exercise 10.1.44. Let $y = \sin(3x)$. Find $y^{(99)}$.

Exercise 10.1.45. Find $\frac{d^{2017}}{dx^{2017}}(\cos(6x))$.

Exercise 10.1.46. Find $\frac{d^2}{dx^2}(f(x)g(x))$. What does the result resemble?

10.2 Implicit Differentiation

We introduce the concept with an illuminating example.

Example 10.2.1
Find the slope of the tangent line to the curve $x^2 + y^2 = 25$ at the point (3, 4). What about at (4, −3)?

Solution. This is a circle of radius 5 centered at the origin, so consider the following diagram:

[Figure: the circle $x^2 + y^2 = 25$ with the points (3, 4) and (4, −3) marked.]

The graph of this clearly fails the vertical line test, implying that y cannot be a function of x (there exist multiple y values for a single x value). Therefore, we cannot apply our usual tactics.
One approach is to consider cases separately. First, we could solve for y and get two separate equations, $y = \sqrt{25 - x^2}$ and $y = -\sqrt{25 - x^2}$, then evaluate the derivatives separately for (3, 4), which lies in the top half, and (4, −3), which lies in the bottom half, as shown:

[Figure: the upper semicircle $y = \sqrt{25 - x^2}$ through (3, 4) and the lower semicircle $y = -\sqrt{25 - x^2}$ through (4, −3).]

For the former, we get $y' = -\frac{x}{\sqrt{25 - x^2}}$, and then we plug in 3 from the point (3, 4) to get the slope $-\frac{3}{4}$. For the latter, we get $y' = \frac{x}{\sqrt{25 - x^2}}$, and then we plug in 4 to get the slope $\frac{4}{3}$.
However, this method is inefficient and time-consuming. It is not even guaranteed that we are able to find an explicit function for y; the example $x^2 + y^2 = 25$ was conveniently simple. In fact, we can do better.
Note that the relation $x^2 + y^2 = 25$ implies that x and y are tied to each other (i.e. share a relation with one another) in some way, but not in a way such that one is an explicit function of the other. In this situation, we acknowledge that y is an implicit function of x, even though it cannot be written explicitly as a single formula in x.
We can still find the derivative of an implicit function. As x and y have some sort of relation to each other, we let $y = f(x)$, and the following relation holds:
\[ x^2 + f(x)^2 = 25. \]
As this is an identity, the derivatives of both sides must also be equal:
\[ \frac{d}{dx}\left(x^2 + f(x)^2\right) = \frac{d}{dx}(25). \]
Applying our various rules, we end up with $2x + 2f(x)f'(x) = 0$, i.e.
\[ f'(x) = -\frac{x}{f(x)}. \]
If we plug in the point (3, 4), we have that $x = 3$ and $f(3) = 4$, giving our correct answer $-\frac{3}{4}$. If we plug in the point (4, −3), we have that $x = 4$ and $f(4) = -3$, giving our correct answer $\frac{4}{3}$.
To take this a step further, we don’t even need to bother with substituting $y = f(x)$. Leibniz notation allows us to take the derivative of both sides of the given equation and solve for $\frac{dy}{dx}$, while treating y as some function of x. For the particular equation $x^2 + y^2 = 25$, we would do the following work:
\begin{align*}
x^2 + y^2 &= 25 \\
\frac{d}{dx}\left(x^2 + y^2\right) &= \frac{d}{dx}(25) \\
\frac{d}{dx}\left(x^2\right) + \frac{d}{dx}\left(y^2\right) &= 0 \\
2x + \frac{d}{dx}\left(y^2\right) &= 0.
\end{align*}
How would we evaluate $\frac{d}{dx}\left(y^2\right)$? Recall that we are treating y as a function of x, therefore we use Theorem 10.1.36:
\[ 2x + 2y \frac{d}{dx}(y) = 0. \]
Noting that $\frac{d}{dx}(y)$ is just $\frac{dy}{dx}$, we conclude that
\begin{align*}
2x + 2y \frac{dy}{dx} &= 0 \\
\frac{dy}{dx} &= -\frac{x}{y}.
\end{align*}

Now, we can simply use this formula by plugging in the appropriate values when given points like
(3, 4) and (4, −3), and we can confirm that this formula yields the correct answers as well.
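
The agreement between the implicit formula and the two explicit branches can also be confirmed numerically. This is a minimal sketch; the helper names and the central-difference approximation are assumptions made for illustration.

```python
import math

def implicit_slope(x, y):
    """dy/dx = -x / y, obtained by implicit differentiation of x^2 + y^2 = 25."""
    return -x / y

def upper(x):
    return math.sqrt(25 - x * x)   # top half of the circle

def lower(x):
    return -math.sqrt(25 - x * x)  # bottom half of the circle

def derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

print(implicit_slope(3, 4), derivative(upper, 3))
print(implicit_slope(4, -3), derivative(lower, 4))
```

The first value in each pair uses only the point on the curve, while the second needs the explicit branch through that point.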

Remark. The result from Example 10.2.1 is consistent with geometric results, i.e. the tangent line to a circle is always perpendicular to the radius at the point of tangency. Consider the following diagram:

[Figure: the radius from (0, 0) to the point (x, y) on the circle, with slope $m = \frac{y}{x}$.]

As the slope of the radius through the point of tangency is $\frac{y}{x}$, the slope of the tangent line must be $-\frac{x}{y}$, since the tangent is perpendicular to the radius.
Problem 10.2.2. Find $\frac{dy}{dx}$ (i.e. $y'$) for $xy^2 + x^2 y^3 = 7$.

Solution. As always, we take the derivative of both sides of the equation and then simplify and solve for $\frac{dy}{dx}$. Never forget that we must treat y as a function of x, as $\frac{dy}{dx}$ means that we are evaluating the derivative of y with respect to x. Hence, we must use Theorem 10.1.36 in the process.
\begin{align*}
\frac{d}{dx}\left(xy^2 + x^2 y^3\right) &= \frac{d}{dx}(7) \\
1 \cdot y^2 + x \cdot 2y \cdot \frac{dy}{dx} + 2x \cdot y^3 + x^2 \cdot 3y^2 \cdot \frac{dy}{dx} &= 0 \\
\frac{dy}{dx} &= \frac{-y^2 - 2xy^3}{2xy + 3x^2 y^2}.
\end{align*}

Problem 10.2.3. Find $\frac{dy}{dx}$ (i.e. $y'$) for $x^2 + xy + y^2 = 12$.

Solution.
\begin{align*}
\frac{d}{dx}\left(x^2 + xy + y^2\right) &= \frac{d}{dx}(12) \\
2x + y + x \frac{dy}{dx} + 2y \frac{dy}{dx} &= 0 \\
\frac{dy}{dx} &= -\frac{2x + y}{x + 2y}.
\end{align*}

Problem 10.2.4. Find all points on the oblique ellipse from Problem 10.2.3 where the tangent lines are horizontal and vertical, respectively.

Solution. When the tangent line is horizontal, $-\frac{2x + y}{x + 2y} = 0$. This simplifies to $y = -2x$, and then we substitute this into the equation $x^2 + xy + y^2 = 12$ to get $x^2 - 2x^2 + 4x^2 = 12$, i.e. $x^2 = 4$, so $x = \pm 2$, and we compute the respective y-values to get the points (2, −4) and (−2, 4).
When the tangent line is vertical, its slope is undefined. The only way $-\frac{2x + y}{x + 2y}$ can be undefined is when $x + 2y = 0$, i.e. $y = -\frac{x}{2}$. We plug this into the first equation to get $x^2 - \frac{x^2}{2} + \frac{x^2}{4} = 12$, i.e. $x^2 = 16$, so $x = \pm 4$, and we solve for the respective y-values to get the points (4, −2) and (−4, 2).
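
Each of the four points can be verified directly: it must lie on the curve, and either the numerator or the denominator of $\frac{dy}{dx} = -\frac{2x + y}{x + 2y}$ must vanish there. The sketch below does this with exact integer arithmetic; the function names are illustrative assumptions.

```python
def on_curve(x, y):
    """True if (x, y) satisfies x^2 + xy + y^2 = 12."""
    return x * x + x * y + y * y == 12

def numerator(x, y):
    return 2 * x + y   # zero where the tangent is horizontal

def denominator(x, y):
    return x + 2 * y   # zero where the tangent is vertical

horizontal = [(2, -4), (-2, 4)]
vertical = [(4, -2), (-4, 2)]

for x, y in horizontal:
    print((x, y), on_curve(x, y), numerator(x, y), denominator(x, y))
for x, y in vertical:
    print((x, y), on_curve(x, y), numerator(x, y), denominator(x, y))
```

Checking that the other quantity is nonzero at each point rules out the indeterminate case where numerator and denominator vanish together.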

Problem 10.2.5. Find all horizontal and vertical tangent lines on the oblique ellipse $x^2 - xy + y^2 = 7$.

Solution. First, we find $\frac{dy}{dx}$. Note that
\begin{align*}
\frac{d}{dx}\left(x^2 - xy + y^2\right) &= \frac{d}{dx}(7) \\
2x - \left(y + x \frac{dy}{dx}\right) + 2y \frac{dy}{dx} &= 0 \\
2x - y - x \frac{dy}{dx} + 2y \frac{dy}{dx} &= 0 \\
\frac{dy}{dx} &= \frac{y - 2x}{2y - x}.
\end{align*}

The tangent line is horizontal when $\frac{y - 2x}{2y - x} = 0$, or $y = 2x$. We substitute this into the equation $x^2 - xy + y^2 = 7$ to get $x^2 - x(2x) + (2x)^2 = 7$, which simplifies to $x = \pm\sqrt{\frac{7}{3}}$, so our horizontal tangent lines are $y = 2\sqrt{\frac{7}{3}}$ and $y = -2\sqrt{\frac{7}{3}}$.
The tangent line is vertical when its slope is undefined. The only way $\frac{y - 2x}{2y - x}$ can be undefined is when $2y - x = 0$, i.e. $x = 2y$. We plug this into the initial equation to get $(2y)^2 - (2y)y + y^2 = 7$ and obtain $y = \pm\sqrt{\frac{7}{3}}$, so our vertical tangent lines are $x = 2\sqrt{\frac{7}{3}}$ and $x = -2\sqrt{\frac{7}{3}}$.
Problem 10.2.6. Find $\frac{dy}{dx}$ (i.e. $y'$) for $x \sin(xy^3) + \sin^2(y) = 1$.

Solution. We repeatedly apply Theorem 10.1.36, so you should be familiar with using it.
\begin{align*}
\frac{d}{dx}\left(x \sin(xy^3) + \sin^2(y)\right) &= \frac{d}{dx}(1) \\
\frac{d}{dx}(x) \cdot \sin(xy^3) + x \cdot \frac{d}{dx}\left(\sin(xy^3)\right) + 2 \sin(y) \cdot \frac{d}{dx}(\sin(y)) &= 0 \\
\sin(xy^3) + x \cdot \cos(xy^3) \cdot \frac{d}{dx}\left(xy^3\right) + 2 \sin(y) \cos(y) \frac{dy}{dx} &= 0 \\
\sin(xy^3) + x \cdot \cos(xy^3) \left(y^3 + x \cdot 3y^2 \frac{dy}{dx}\right) + 2 \sin(y) \cos(y) \frac{dy}{dx} &= 0 \\
\sin(xy^3) + xy^3 \cos(xy^3) + 3x^2 y^2 \cos(xy^3) \frac{dy}{dx} + 2 \sin(y) \cos(y) \frac{dy}{dx} &= 0 \\
\frac{dy}{dx}\left(3x^2 y^2 \cos(xy^3) + 2 \sin(y) \cos(y)\right) &= -\left(\sin(xy^3) + xy^3 \cos(xy^3)\right)
\end{align*}

Thus, we have
\[ \frac{dy}{dx} = -\frac{\sin(xy^3) + xy^3 \cos(xy^3)}{3x^2 y^2 \cos(xy^3) + \sin(2y)}. \]

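Problem 10.2.7. Find $\frac{dy}{dx}$ for the equation $\sin(xy) = x^2 + y^2$.

Solution.
\begin{align*}
\frac{d}{dx}(\sin(xy)) &= \frac{d}{dx}\left(x^2 + y^2\right) \\
\cos(xy) \frac{d}{dx}(xy) &= 2x + 2y \frac{dy}{dx} \\
\cos(xy) \left(y + x \frac{dy}{dx}\right) &= 2x + 2y \frac{dy}{dx} \\
y \cos(xy) + x \cos(xy) \frac{dy}{dx} &= 2x + 2y \frac{dy}{dx} \\
\frac{dy}{dx} &= \frac{y \cos(xy) - 2x}{2y - x \cos(xy)}.
\end{align*}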

Exercise 10.2.8. Find $\frac{dy}{dx}$ for $x^3 + y^3 = 4$.

Exercise 10.2.9. Find $\frac{dy}{dx}$ for $y = \sin(3x + 4y)$.

Exercise 10.2.10. Find $\frac{dy}{dx}$ for $y = x^2 y^3 + x^3 y^2$.

Exercise 10.2.11. Find $\frac{dy}{dx}$ for $\cos^2 x + \cos^2 y = \cos(2x + 2y)$.

Exercise 10.2.12. Find $\frac{dy}{dx}$ for $x = \sqrt{x^2 + y^2}$.

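Problem 10.2.13. Find $\frac{d^2 y}{dx^2}$ for $x^2 + y^2 = 25$.

Solution. From Example 10.2.1, we have found that $\frac{dy}{dx} = -\frac{x}{y}$. Note that $\frac{d^2 y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right)$, so $\frac{d^2 y}{dx^2} = \frac{d}{dx}\left(-\frac{x}{y}\right)$. We then use Theorem 10.1.23 and substitute $\frac{dy}{dx} = -\frac{x}{y}$ to get
\begin{align*}
\frac{d^2 y}{dx^2} &= \frac{y \cdot (-1) - (-x) \frac{dy}{dx}}{y^2} \\
&= \frac{-y - \frac{x^2}{y}}{y^2} \\
&= \frac{-x^2 - y^2}{y^3} \\
&= -\frac{x^2 + y^2}{y^3}.
\end{align*}
We are almost done! At this point, we can simply use the equation we are given, $x^2 + y^2 = 25$, and substitute in the appropriate value, to get $\frac{d^2 y}{dx^2} = -\frac{25}{y^3}$.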
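
As a sanity check, the formula $-\frac{25}{y^3}$ can be compared with a numerical second derivative of each explicit branch of the circle. This sketch assumes a standard second-order central difference and the hypothetical helper names shown.

```python
import math

def upper(x):
    return math.sqrt(25 - x * x)   # branch through (3, 4)

def lower(x):
    return -math.sqrt(25 - x * x)  # branch through (4, -3)

def second_derivative(func, x, h=1e-4):
    """Second-order central-difference approximation of func''(x)."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / (h * h)

def formula(y):
    """d^2y/dx^2 = -25 / y^3 from implicit differentiation."""
    return -25 / y ** 3

print(second_derivative(upper, 3), formula(4))
print(second_derivative(lower, 4), formula(-3))
```

Note that the sign of the second derivative follows the sign of $-y$: the upper branch bends downward and the lower branch bends upward.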

Exercise 10.2.14. Find $\frac{d^2 y}{dx^2}$ for $x^2 + xy + y^2 = 1$.

10.3 Related Rates

Consider the well-known formula
\[ \text{rate} = \frac{\text{distance}}{\text{time}}. \]
Depending on the problem, rate can also refer to speed or velocity.
When applying derivatives to problems about rate, we consider all measurements as functions of time, as such:
\[ \underbrace{s(\overbrace{t}^{t \,=\, \text{time}})}_{\text{something you can measure}} \]

Definition 10.3.1. The average rate of change of $s$ on the interval $[t_0, t_1]$ is equal to
\[ \frac{s(t_1) - s(t_0)}{t_1 - t_0} = \frac{\Delta s}{\Delta t}. \]

Note that this is just the slope of the secant line connecting points $(t_0, s(t_0))$ and $(t_1, s(t_1))$.

We then consider the rate of change at the point $t_0$:
\[ \lim_{t_1 \to t_0} \frac{s(t_1) - s(t_0)}{t_1 - t_0} = s'(t_0). \]

We will appropriately use $\Delta s$ and $\Delta t$ to represent the differences $s(t_1) - s(t_0)$ and $t_1 - t_0$, respectively.

Definition 10.3.2. The instantaneous rate of change of $s$ at the point $t$ is equal to $s'(t)$:
\[ \lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t} = \frac{ds}{dt}. \]

This represents the rate of change at a single point in time, $t$, rather than the average over the interval between two points.

We now introduce the concept of related rates problems with a series of examples.

Example 10.3.3
Consider a spherical balloon which is being blown up. Its volume is changing at a constant rate,
K. How does the radius of the balloon change? What about its surface area?

Solution. If the volume of the balloon is changing at a constant rate K, then
\[ \frac{dV}{dt} = K. \]

We want to find the change in radius, i.e. $\frac{dr}{dt}$.
However, we keep in mind that V and r are functions of time t (always remember that measurements are functions of time!).
Since the balloon is spherical, we can use the formula for the volume of a sphere, i.e.
\[ V = \frac{4}{3}\pi r^3. \]

This is our main relation between V and r, hence ‘related rates.’

We proceed with implicit differentiation; take the derivative of both sides and then use Theorem 10.1.36 on V and r as necessary:
\begin{align*}
\frac{d}{dt}(V) &= \frac{d}{dt}\left(\frac{4}{3}\pi r^3\right) \\
\frac{dV}{dt} &= 4\pi r^2 \frac{dr}{dt} \\
\frac{dr}{dt} &= \frac{K}{4\pi r^2}.
\end{align*}

Let S denote the surface area of the balloon. The surface area of a sphere is known to be
\[ S = 4\pi r^2. \]

We implicitly differentiate to get $\frac{dS}{dt} = \frac{d}{dt}(4\pi r^2)$, which simplifies to $\frac{dS}{dt} = 8\pi r \frac{dr}{dt}$. We have already found that $\frac{dr}{dt} = \frac{K}{4\pi r^2}$, so we plug this back into the equation to get $\frac{dS}{dt} = \frac{2K}{r}$.

Example 10.3.4
A 13 ft. ladder leans against a wall. How fast is the top of the ladder falling down the wall when the top is at a height of 12 ft. while the bottom of the ladder is being pushed outward at 6 inches per minute?

Solution. First, we draw a diagram that simplifies the problem:

[Figure: a right triangle with horizontal leg x, vertical leg y, and hypotenuse 13.]

For the changing lengths, we label them with variables x and y.

As stated in the problem, the bottom of the ladder is moving outward at 6 inches per minute. We can interpret this as $\frac{dx}{dt} = \frac{1}{2}$ ft. per minute.
As we want to find the rate at which y is falling, we want $\frac{dy}{dt}$ when $y = 12$.
As we have a right triangle, we take advantage of the Pythagorean Theorem: $x^2 + y^2 = 169$. In particular, when $y = 12$, we have $x = 5$.
We differentiate both sides to get $\frac{d}{dt}\left(x^2 + y^2\right) = \frac{d}{dt}(169)$. Remember, we always treat x and y as functions of time, therefore we use the chain rule to get
\[ 2x \frac{dx}{dt} + 2y \frac{dy}{dt} = 0. \]
We plug in $x = 5$, $y = 12$, and $\frac{dx}{dt} = \frac{1}{2}$ to get $10 \cdot \frac{1}{2} + 24 \cdot \frac{dy}{dt} = 0$, which yields $\frac{dy}{dt} = -\frac{5}{24}$ ft. per minute. It is numerically negative, but based on the word problem, we interpret this as the top of the ladder falling down at $\frac{5}{2}$ in. per minute.
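
The ladder answer can be checked by modelling the motion directly. In the sketch below, time $t = 0$ is taken (as an assumption for illustration) to be the instant when the top is at 12 ft., so the bottom is at $x = 5 + \frac{1}{2}t$; the names and the central-difference helper are hypothetical.

```python
import math

LADDER = 13.0  # ladder length in feet
DXDT = 0.5     # bottom moves outward at 6 in. per minute = 0.5 ft. per minute

def top_height(t):
    """Height of the top of the ladder t minutes after it was at 12 ft."""
    x = 5.0 + DXDT * t
    return math.sqrt(LADDER ** 2 - x ** 2)

def derivative(func, t, h=1e-6):
    """Central-difference approximation of func'(t)."""
    return (func(t + h) - func(t - h)) / (2 * h)

dydt = derivative(top_height, 0.0)   # numerical dy/dt in ft. per minute
formula = -(5.0 * DXDT) / 12.0       # from 2x dx/dt + 2y dy/dt = 0 at x = 5, y = 12
print(dydt, formula, 12 * formula)   # last value converts to inches per minute
```

Multiplying by 12 converts feet per minute to inches per minute, which is where the $\frac{5}{2}$ in the final answer comes from.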

Problem 10.3.5. You are standing 500 ft. away from a tiny rocket. The rocket rises at a rate of 100 feet per second. How fast is its angle of elevation from you changing when the rocket is $500\sqrt{3}$ ft. high?

Solution. Again, we draw a simplified diagram to interpret the problem mathematically:

[Figure: a right triangle with horizontal leg 500, vertical leg y, and angle of elevation θ.]

We are given that y increases at 100 feet per second, i.e. $\frac{dy}{dt} = 100$.
We want the change in θ, which is $\frac{d\theta}{dt}$, when $y = 500\sqrt{3}$.
Using this right triangle relationship, we can deduce that $\tan\theta = \frac{y}{500}$. We differentiate this to get
\[ \sec^2\theta \, \frac{d\theta}{dt} = \frac{1}{500} \frac{dy}{dt}. \]
Note that $y = 500\sqrt{3}$ when $\theta = \frac{\pi}{3}$, i.e. $\sec\theta = 2$. Substituting that and $\frac{dy}{dt} = 100$ into the equation, we therefore have $4 \frac{d\theta}{dt} = \frac{1}{500} \cdot 100$, from which we solve to get $\frac{d\theta}{dt} = \frac{1}{20}$ radian per second, i.e. the angle of elevation is increasing at $\frac{1}{20}$ radian per second.

Example 10.3.6
Consider a cone-shaped cauldron (with a radius of 50 meters and a height of 100 meters) that
holds a potion. The potion leaks out at 2 cubic meters per minute. How fast is the height of
the potion changing when the height of the potion in the cauldron is 80 meters?

[Figure: a cone-shaped cauldron of radius 50 m and height 100 m, partly filled with potion.]

Solution. Let r and h be the radius and height of the cone formed by the leaking potion. We are given that $\frac{dV}{dt} = -2$. We want $\frac{dh}{dt}$ when $h = 80$.
We use the fact that the cone formed by the potion is similar to the cone of radius 50 meters and height 100 meters. Since the radius to height ratio is $\frac{50}{100} = \frac{1}{2}$, we know that $\frac{r}{h} = \frac{1}{2}$. Furthermore, we know the volume of the cone is $V = \frac{1}{3}\pi r^2 h$.
We rearrange to get $r = \frac{h}{2}$ and substitute this into the volume formula to get $V = \frac{1}{12}\pi h^3$, then differentiate it to get
\[ \frac{dV}{dt} = \frac{1}{4}\pi h^2 \frac{dh}{dt}. \]

We plug in the given $\frac{dV}{dt} = -2$ and $h = 80$ to get that $\frac{dh}{dt} = -\frac{1}{800\pi}$ meters per minute, i.e. the height is decreasing at $\frac{1}{800\pi}$ meters per minute.

Problem 10.3.7. Eric is walking along the path of the graph $y = x^2$, as shown:

[Figure: the parabola $y = x^2$.]

His x-coordinate increases at 10 meters per second. How fast is the angle of inclination from the origin changing when his x-coordinate is 3 meters?

Solution. We let x denote Eric’s x-coordinate. We are given that his x-coordinate increases at 10 meters per second, which means that $\frac{dx}{dt} = 10$. We let θ be the angle of inclination from the origin. We want to find $\frac{d\theta}{dt}$ when $x = 3$.
We note that the angle of inclination from the origin is simply the angle between the x-axis and the line through the origin and Eric’s point on $y = x^2$. We also have to consider x, the x-coordinate. These relationships motivate us to draw a simplified diagram, as such:

[Figure: a right triangle with horizontal leg x, vertical leg $x^2$, and angle θ at the origin.]

From this right triangle, it becomes apparent that $\tan\theta = \frac{x^2}{x} = x$, and this is our relation. We differentiate both sides with respect to t to get $\sec^2\theta \, \frac{d\theta}{dt} = \frac{dx}{dt}$.
How would we find $\sec^2\theta$? We can use the identity $\tan^2\theta + 1 = \sec^2\theta$ (which can be derived from the Pythagorean identity $\sin^2\theta + \cos^2\theta = 1$).
When $x = 3$, then $\tan\theta = 3$, therefore $\sec^2\theta = 3^2 + 1 = 10$, so we have
\[ 10 \frac{d\theta}{dt} = 10, \]
as we are given that $\frac{dx}{dt} = 10$. We solve to get $\frac{d\theta}{dt} = 1$, therefore we can conclude that the angle of inclination from the origin increases at a rate of 1 radian per second.

Problem 10.3.8. The length of a rectangle increases by 10 cm. per hour. Its width decreases by 2
cm. per hour. When the length is 60 cm. and the width is 80 cm., determine how fast each of the
following is changing:

a) Perimeter
b) Area
c) Length of diagonal

Solution. Let the length be l and the width be w. The problem statement implies that $\frac{dw}{dt} = -2$ and $\frac{dl}{dt} = 10$. Let the perimeter, area, and length of diagonal be P, A, and L respectively. Therefore the problem is asking us to find $\frac{dP}{dt}$, $\frac{dA}{dt}$, and $\frac{dL}{dt}$ when $w = 80$ and $l = 60$.

a) We have our relation $P = 2(l + w)$. We differentiate both sides to get
\[ \frac{dP}{dt} = \frac{d}{dt}(2(l + w)). \]
Differentiating term by term and substituting in the given information gives us
\[ \frac{dP}{dt} = 2\left(\frac{dl}{dt} + \frac{dw}{dt}\right) = 2(10 - 2) = 16. \]
Therefore, the perimeter is increasing at a rate of 16 cm. per hour.

b) Similarly, we have that $A = lw$, and differentiating both sides and simplifying yields:
\begin{align*}
\frac{dA}{dt} &= \frac{d}{dt}(lw) \\
&= \frac{dl}{dt} w + l \frac{dw}{dt} \\
&= 10 \cdot 80 + 60 \cdot (-2) \\
&= 680.
\end{align*}
Thus, the area increases at a rate of 680 cm.$^2$ per hour.

c) We can either use the relation $L = \sqrt{l^2 + w^2}$, or $L^2 = l^2 + w^2$, then differentiate accordingly. I will focus on $L = \sqrt{l^2 + w^2}$:
\begin{align*}
\frac{dL}{dt} &= \frac{d}{dt}\left(\sqrt{l^2 + w^2}\right) \\
&= \frac{1}{2\sqrt{l^2 + w^2}} \cdot \frac{d}{dt}\left(l^2 + w^2\right) \\
&= \frac{1}{2\sqrt{l^2 + w^2}} \cdot \left(2l \frac{dl}{dt} + 2w \frac{dw}{dt}\right) \\
&= \frac{1}{2\sqrt{60^2 + 80^2}} \cdot (2 \cdot 60 \cdot 10 + 2 \cdot 80 \cdot (-2)) \\
&= \frac{1}{200}(1200 - 320) \\
&= \frac{22}{5}.
\end{align*}
The length of the diagonal increases at a rate of $\frac{22}{5}$ cm. per hour.
Problem 10.3.9. Observe the clock below. The minute hand is 15 cm. long and the hour hand is 8 cm. long. How fast is the distance between the tips of the hour hand and minute hand changing at the given time?

[Figure: a clock face showing 4 o’clock, with the 15 cm. minute hand, the 8 cm. hour hand, the angle θ between the hands, and the distance x between their tips.]

Solution. Let the distance between the tips of the hour hand and minute hand be x, and the angle between the two hands be θ. We intend to use the Law of Cosines on the triangle formed by the two hands of the clock and the third side, which is x. The problem is asking us to find $\frac{dx}{dt}$ when $\theta = \frac{2\pi}{3}$, which signifies the time of 4 o’clock.
Starting off with the Law of Cosines, we have:
\begin{align*}
x^2 &= 15^2 + 8^2 - 2(15)(8)\cos\theta \\
&= 225 + 64 - 240\cos\theta \\
&= 289 - 240\cos\theta.
\end{align*}

Differentiate both sides with respect to t to get
\[ 2x \frac{dx}{dt} = 240 \sin\theta \, \frac{d\theta}{dt}. \]
When $\theta = \frac{2\pi}{3}$, $x = \sqrt{289 - 240\cos\left(\frac{2\pi}{3}\right)} = \sqrt{409}$.

Now we approach the main question: what is $\frac{d\theta}{dt}$? Note that the minute hand is turning at $2\pi$ radians per hour, and the hour hand is turning at $\frac{\pi}{6}$ radians per hour.
We then consider the context of this problem: at 4 o’clock, the quicker minute hand is approaching the slower hour hand, and therefore θ (which is the angle between the two hands) is decreasing, so $\frac{d\theta}{dt} = \frac{\pi}{6} - 2\pi = -\frac{11\pi}{6}$ radians per hour.
If the time was instead 8 o’clock, $\frac{d\theta}{dt}$ would instead be $2\pi - \frac{\pi}{6} = \frac{11\pi}{6}$ radians per hour, as the minute hand would have to keep turning in the clockwise direction until it meets the hour hand (which will have slowly moved two-thirds of the way from the 8 to the 9 markings), and so θ would be increasing in that scenario.
Plugging in our information, we have
\[ 2\sqrt{409} \, \frac{dx}{dt} = 240 \cdot \frac{\sqrt{3}}{2} \cdot \left(-\frac{11\pi}{6}\right). \]

Solving, we get $\frac{dx}{dt} = -\frac{110\pi\sqrt{3}}{\sqrt{409}}$, so our final answer would be that the distance between the tips of the hour hand and the minute hand is changing at a rate of $-\frac{110\pi\sqrt{3}}{\sqrt{409}}$ cm. per hour.
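
Because the sign of $\frac{d\theta}{dt}$ is the delicate step in this problem, it is worth checking the final rate against a direct model of the hands. The sketch below assumes $t$ is measured in hours after 4 o’clock and that the angle between the hands shrinks linearly at $\frac{11\pi}{6}$ radians per hour; the function names are illustrative.

```python
import math

def angle(t):
    """Angle between the hands t hours after 4 o'clock (valid for small t)."""
    return 2 * math.pi / 3 - (11 * math.pi / 6) * t

def tip_distance(t):
    """Distance between the tips, from the Law of Cosines with sides 15 and 8."""
    return math.sqrt(289 - 240 * math.cos(angle(t)))

def derivative(func, t, h=1e-6):
    """Central-difference approximation of func'(t)."""
    return (func(t + h) - func(t - h)) / (2 * h)

numeric = derivative(tip_distance, 0.0)
closed_form = -110 * math.pi * math.sqrt(3) / math.sqrt(409)
print(numeric, closed_form)  # both in cm. per hour
```

Both values are negative, which matches the picture: just after 4 o’clock the minute hand is closing in on the hour hand, so the tips are getting closer.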

10.4 Significance of the Derivative

First, we introduce some definitions to help us with stating the next few theorems.

Definition 10.4.1. $f(x)$ has a local maximum at $x = a$ if ∃δ > 0 such that ∀x ∈ (a − δ, a + δ), $f(x) \le f(a)$.

Definition 10.4.2. $f(x)$ has a local minimum at $x = a$ if ∃δ > 0 such that ∀x ∈ (a − δ, a + δ), $f(x) \ge f(a)$.

Definition 10.4.3. An extremum refers to either a local maximum or a local minimum.

Theorem 10.4.4
If the function $f$ has a local extremum at $x = a$, then either $f'(a) = 0$ or $f'(a)$ does not exist.

Proof. Without loss of generality, let there be a local maximum at $x = a$. The proof for the local minimum will be analogous.
Suppose $f'(a)$ exists, so we have
\[ f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}. \]
This implies that $f'(a) = \lim_{x \to a^+} \frac{f(x) - f(a)}{x - a} = \lim_{x \to a^-} \frac{f(x) - f(a)}{x - a}$. We analyze the one-sided limits separately:

1. If $x \to a^+$, then $x - a > 0$ by assumption. As $x = a$ is a local maximum, $f(x) - f(a) \le 0$ by Definition 10.4.1, therefore $\frac{f(x) - f(a)}{x - a} \le 0$, i.e. $\lim_{x \to a^+} \frac{f(x) - f(a)}{x - a} \le 0$.

2. If $x \to a^-$, then $x - a < 0$ by assumption. Since $f(x) - f(a) \le 0$ by Definition 10.4.1, we conclude that $\frac{f(x) - f(a)}{x - a} \ge 0$, i.e. $\lim_{x \to a^-} \frac{f(x) - f(a)}{x - a} \ge 0$.

We have that $\lim_{x \to a^+} \frac{f(x) - f(a)}{x - a} \le 0$ and $\lim_{x \to a^-} \frac{f(x) - f(a)}{x - a} \ge 0$, and since these one-sided limits must be equal, we necessarily have $\lim_{x \to a^+} \frac{f(x) - f(a)}{x - a} = \lim_{x \to a^-} \frac{f(x) - f(a)}{x - a} = 0$. Therefore $f'(a)$ must be 0.
If $f'(a)$ does not exist (for instance, when there is a cusp at $(a, f(a))$), then the second alternative of the theorem holds and there is nothing more to prove.

Definition 10.4.5. The function f has a critical point at x = a if f 0 (a) = 0 or f 0 (a) does not
exist.

Theorem 10.4.6 (Extreme Value Theorem)


The maximum and minimum values of a continuous function f on [a, b] occur either at the end
points or critical points.

The proof of this theorem lies beyond the scope of this book.
The following problems will demonstrate the usefulness of this theorem regarding minimizing
and maximizing functions.
Problem 10.4.7. Maximize and minimize $f(x) = x^2 - x$ on [−4, 4].

Solution. Note that $f'(x) = 2x - 1$, so the only critical point is $x = \frac{1}{2}$. Now, we examine all end points and critical points:
\begin{align*}
f(-4) &= 20, \\
f\left(\tfrac{1}{2}\right) &= -\tfrac{1}{4}, \\
f(4) &= 12.
\end{align*}

Therefore, the maximum and minimum values of $f$ on [−4, 4] are 20 and $-\frac{1}{4}$ respectively.
Problem 10.4.8. Maximize and minimize $f(x) = x^3 - 3x + 1$ on [−4, 4].

Solution. We have $f'(x) = 3x^2 - 3$, so the critical points are $x = 1, -1$. Now, we examine all end points and critical points:
\begin{align*}
f(-4) &= -51, \\
f(-1) &= 3, \\
f(1) &= -1, \\
f(4) &= 53.
\end{align*}

Therefore, the maximum and minimum values are 53 and −51 respectively.

Problem 10.4.9. Maximize and minimize $f(x) = \sin 2x$ on [0, π].

Solution. Since $f'(x) = 2\cos 2x$, the critical points are $x = \frac{\pi}{4}, \frac{3\pi}{4}$. Now, we examine all end points and critical points:
\begin{align*}
f(0) &= 0, \\
f\left(\tfrac{\pi}{4}\right) &= 1, \\
f\left(\tfrac{3\pi}{4}\right) &= -1, \\
f(\pi) &= 0.
\end{align*}

Hence, the maximum and minimum values of $f$ on [0, π] are 1 and −1 respectively.
 
Problem 10.4.10. Maximize and minimize $f(x) = 4x + \frac{3}{x}$ on $\left[\frac{1}{2}, 5\right]$.

Solution. After we obtain $f'(x) = 4 - \frac{3}{x^2}$, we solve for the critical points: $4 - \frac{3}{x^2} = 0 \rightarrow 4x^2 = 3$, which gives us $x = \pm\frac{\sqrt{3}}{2}$. Since we are considering the interval $\left[\frac{1}{2}, 5\right]$, we discard the negative value of x. Now, we examine all end points and critical points:
\begin{align*}
f\left(\tfrac{1}{2}\right) &= 8, \\
f\left(\tfrac{\sqrt{3}}{2}\right) &= 4\sqrt{3}, \\
f(5) &= \tfrac{103}{5}.
\end{align*}
Thus, the maximum and minimum values of $f$ on $\left[\frac{1}{2}, 5\right]$ are $\frac{103}{5}$ and $4\sqrt{3}$ respectively.
Problem 10.4.11. Squares of side length x will be cut from each corner of a 10 × 16 in. piece of cardboard such that the remainder will be folded up into an open box. Find the value of x which will maximize the volume of the box.

[Figure: a 10 in. by 16 in. sheet of cardboard with a square of side x cut from each corner.]

Solution. We are asked to maximize the function $V(x)$ which represents the volume of the resulting box. First, we must consider the constraints of the cardboard. Considering the 10 inch side, the side length of the squares cannot be greater than 5 inches. Therefore we must maximize $V(x)$ over the interval [0, 5]. We include the edge cases so we can apply Theorem 10.4.6.
The height of the box is x, and the dimensions of the base of the box would be $10 - 2x$ and $16 - 2x$, since each side is reduced by the 2 squares with side length x cut off from the corners. Therefore, $V(x) = x(16 - 2x)(10 - 2x) = 4x^3 - 52x^2 + 160x$. Then $V'(x) = 12x^2 - 104x + 160 = 4(3x - 20)(x - 2)$, so the critical points are $x = \frac{20}{3}, 2$. However, since $\frac{20}{3}$ is not in the interval [0, 5], we ignore this value, so we only check $x = 0, 2, 5$, as shown:
\begin{align*}
V(0) &= 0, \\
V(2) &= 144, \\
V(5) &= 0.
\end{align*}

Therefore, the volume is maximized at $x = 2$ in., where the maximum volume is 144 in.$^3$.
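
The procedure used in Problems 10.4.7 through 10.4.11 (evaluate the function at the end points and at the critical points inside the interval, then compare) can be sketched as a small routine. The function `extreme_values` and its signature are assumptions for illustration; the critical points are supplied by hand rather than computed.

```python
def extreme_values(f, endpoints, critical_points):
    """Compare f at the end points and at the critical points inside the interval.

    Returns ((x_max, f(x_max)), (x_min, f(x_min))).
    """
    a, b = endpoints
    # Keep only critical points that lie strictly inside (a, b).
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    values = {x: f(x) for x in candidates}
    x_max = max(values, key=values.get)
    x_min = min(values, key=values.get)
    return (x_max, values[x_max]), (x_min, values[x_min])

def volume(x):
    return x * (16 - 2 * x) * (10 - 2 * x)

def cubic(x):
    return x ** 3 - 3 * x + 1

box_max, box_min = extreme_values(volume, (0, 5), [2, 20 / 3])
cubic_max, cubic_min = extreme_values(cubic, (-4, 4), [-1, 1])
print(box_max, cubic_max, cubic_min)
```

For the box, the critical point $\frac{20}{3}$ is filtered out automatically because it lies outside [0, 5].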


Problem 10.4.12. Consider the following diagram:

[Figure: right triangles ABC and ADE sharing the vertex A, which lies on segment BD, with BC = w, DE = y, and BD = z.]

Suppose △ABC and △ADE are right triangles, with a common vertex at A. Let BC = w, DE = y, and BD = z. Locate A such that the sum of the hypotenuses AC and AE is minimal. Then, demonstrate that regardless of the values of w, y, z, the minimum sum of AC and AE occurs when ∠BAC ≅ ∠DAE.

Solution. For some x ∈ (0, z), let BA = z − x and DA = x, as this problem would not make sense if x was equal to either of the end points (resulting in one triangle disappearing completely). By the Pythagorean Theorem, we have $AC = \sqrt{w^2 + (z - x)^2}$ and $AE = \sqrt{y^2 + x^2}$. Then consider
\[ f(x) = \sqrt{y^2 + x^2} + \sqrt{w^2 + (z - x)^2}, \]
which will represent the sum of the hypotenuses based on x. We wish to minimize this function over (0, z). Then
\[ f'(x) = \frac{x}{\sqrt{x^2 + y^2}} - \frac{z - x}{\sqrt{(z - x)^2 + w^2}}. \]
We solve for the critical point(s):
\begin{align*}
\frac{x}{\sqrt{x^2 + y^2}} - \frac{z - x}{\sqrt{(z - x)^2 + w^2}} &= 0 \\
\frac{x}{\sqrt{x^2 + y^2}} &= \frac{z - x}{\sqrt{(z - x)^2 + w^2}} \\
\frac{x^2}{x^2 + y^2} &= \frac{(z - x)^2}{(z - x)^2 + w^2} \\
\frac{x^2 + y^2}{x^2} &= \frac{(z - x)^2 + w^2}{(z - x)^2} \\
1 + \frac{y^2}{x^2} &= 1 + \frac{w^2}{(z - x)^2} \\
\frac{y^2}{x^2} &= \frac{w^2}{(z - x)^2} \\
\frac{y}{x} &= \frac{w}{z - x}.
\end{align*}

We then solve to get $x = \frac{yz}{w + y}$ as a critical point. After much simplification, we get
\[ f\left(\frac{yz}{w + y}\right) = \sqrt{(w + y)^2 + z^2}, \]
which is our minimum sum of the hypotenuses.

In order to achieve this minimal sum, we must have $\frac{y}{x} = \frac{w}{z - x}$, as established earlier. This implies that $\frac{DE}{DA} = \frac{BC}{BA}$, so △ABC ∼ △ADE. Thus, ∠BAC ≅ ∠DAE.

Next, we introduce two very significant results.



Theorem 10.4.13 (Rolle’s Theorem)


If $f$ is continuous on [a, b] and differentiable on (a, b), and $f(a) = f(b)$, then ∃a < c < b such that $f'(c) = 0$.

Proof. First, we will assume that $f$ has a maximum and a minimum on [a, b] in the first place. By Theorem 10.4.6, these can occur either at end points or critical points.
If either is at a critical point c where a < c < b, then $f'(c) = 0$ since $f$ is differentiable on (a, b). In this case, we are done.
Otherwise, both occur at end points. Since $f(a) = f(b)$, this common value is both the maximum and the minimum, so $f$ must be constant on [a, b]. Thus, we have $f'(c) = 0$ ∀c ∈ (a, b).

Next, we introduce a well-known theorem that will also not be proved in this chapter.

Theorem 10.4.14 (Intermediate Value Theorem)


Let f (x) be continuous on [a, b] and suppose f (a) < 0 < f (b). Then ∃a < c < b such that
f (c) = 0.

Problem 10.4.15. Prove $f(x) = x^3 + x + 1$ has exactly one root.

Proof. Note that $f(-1) = -1$ and $f(0) = 1$, so by Theorem 10.4.14, ∃ −1 < a < 0 such that $f(a) = 0$. Thus, there exists at least one root.
Assume there is another root b ≠ a. Then $f(a) = f(b) = 0$. Clearly $f$ is continuous and differentiable over the domain, so by Theorem 10.4.13, there exists c strictly between a and b such that $f'(c) = 0$. However, $f'(x) = 3x^2 + 1 > 0$ ∀x, thus $f'(c)$ cannot be 0. We arrive at a contradiction, so a is the one and only root of $f(x)$.
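
The proof shows that the root exists and is unique, but not where it is. Since $f(-1) < 0 < f(0)$, repeatedly halving the interval while keeping a sign change (the bisection method, which is not covered in the text and is introduced here only as an illustration) narrows down its location. The function name and tolerance are assumptions.

```python
def bisect_root(f, lo, hi, tol=1e-10):
    """Locate a root of a continuous f on [lo, hi], assuming f changes sign there."""
    if f(lo) * f(hi) > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Keep the half-interval on which the sign change persists.
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def f(x):
    return x ** 3 + x + 1

root = bisect_root(f, -1.0, 0.0)
print(root, f(root))  # f(root) should be very close to 0
```

Each step relies on the same sign-change reasoning as Theorem 10.4.14, applied to a smaller interval.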

Theorem 10.4.16 (Mean Value Theorem)


If $f$ is continuous on [a, b] and differentiable on (a, b), then ∃a < c < b such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]

Proof. Consider the following function:
\[ g(x) = f(x) - \frac{f(b) - f(a)}{b - a}(x - a). \]

Note that $g(a) = f(a)$, and $g(b) = f(b) - (f(b) - f(a)) = f(a)$.
As $g$ is defined by subtracting a linear term from $f(x)$ (so all of its components are continuous and differentiable), $g$ is also continuous on [a, b] and differentiable on (a, b).
Thus, by Theorem 10.4.13, ∃a < c < b such that $g'(c) = 0$, i.e.
\[ f'(c) - \frac{f(b) - f(a)}{b - a} = 0, \]
which rearranges to
\[ f'(c) = \frac{f(b) - f(a)}{b - a}, \]
as desired.
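
The theorem guarantees that such a c exists without saying how to find it. For a concrete function it can be located numerically. The sketch below is an illustration under stated assumptions: the example $f(x) = x^3$ on [0, 2] is hypothetical, and the routine assumes $f'(x)$ minus the secant slope changes sign between the end points, which holds for this example but not for every function.

```python
import math

def mean_value_point(f, f_prime, a, b, tol=1e-12):
    """Find c in (a, b) with f'(c) equal to the secant slope, by bisection.

    Assumes f'(x) - slope has opposite signs at a and b.
    """
    slope = (f(b) - f(a)) / (b - a)

    def g(c):
        return f_prime(c) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("f' - slope must change sign on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def f(x):
    return x ** 3

def f_prime(x):
    return 3 * x ** 2

c = mean_value_point(f, f_prime, 0.0, 2.0)
print(c, f_prime(c), (f(2.0) - f(0.0)) / 2.0)
```

Here the secant slope is 4, so solving $3c^2 = 4$ by hand gives $c = \frac{2}{\sqrt{3}}$, which the numerical result should match.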

From this theorem, we have immediate results:

Corollary 10.4.17
If $f'(c) = 0$ ∀c on a certain interval, then $f$ is constant on that interval.

Proof. Let a < b be any two points on that interval. Then by Theorem 10.4.16, ∃a < c < b such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a} = 0. \]
Thus, $f(a) = f(b)$ for any points a, b on that interval, so $f$ is constant on that interval.

Corollary 10.4.18
If $f' = g'$ on an interval, then ∃c ∈ R such that $f(x) = g(x) + c$ on that interval.

Proof. Let $h(x) = f(x) - g(x)$. Then $h'(x) = f'(x) - g'(x) = 0$ on the interval. By Corollary 10.4.17, $h$ is constant on that interval, i.e. $h(x) = c$, so $f(x) - g(x) = c$, i.e. $f(x) = g(x) + c$.

Definition 10.4.19. $f$ is increasing on an interval if ∀a, b on the interval, $a < b \rightarrow f(a) < f(b)$.
Definition 10.4.20. $f$ is decreasing on an interval if ∀a, b on the interval, $a < b \rightarrow f(a) > f(b)$.

Corollary 10.4.21
If $f' > 0$ on an interval, $f$ is increasing on that interval. Similarly, if $f' < 0$ on an interval, $f$ is decreasing on that interval.

Proof. For all a, b on the interval, let a < b without loss of generality. By Theorem 10.4.16, ∃a < c < b such that $f'(c) = \frac{f(b) - f(a)}{b - a} > 0$, so $f(b) - f(a) > 0$, i.e. $f(a) < f(b)$. Thus, ∀a, b on the interval, $a < b \rightarrow f(a) < f(b)$, so $f$ is increasing.
The proof for the latter statement is analogous.

From Corollary 10.4.21, we can now determine which parts of a given function are increasing or
decreasing, using a method called the first derivative number line, which will be demonstrated
through the following examples.
Problem 10.4.22. Determine the intervals on which the function $f(x) = x + \frac{1}{x}$ is increasing and decreasing.

Solution. We find the first derivative to be $f'(x) = 1 - \frac{1}{x^2}$, which has critical points at −1 and 1. We use the first derivative number line, as shown below. Note that the function is undefined at x = 0, so we have an open circle at 0 and must consider the intervals on both sides of the open circle. Keep in mind that x = 0 is technically not considered a critical point, but we still include it on the first derivative number line anyway.

[First derivative number line: the sign of $f'$ is + on (−∞, −1), − on (−1, 0), − on (0, 1), and + on (1, ∞), with an open circle at 0.]

Testing values within each of the intervals, we find that the function is increasing on (−∞, −1], decreasing on [−1, 0), decreasing on (0, 1], and increasing on [1, ∞).

Problem 10.4.23. Determine the intervals on which the following functions are increasing and
decreasing.

1. f (x) = 3x − 7

2. f (x) = x2 − 4x + 3

3. f (x) = x3 − 3x

4. f (x) = x3

Solution. We proceed with the first derivative number line:

1. Since $f'(x) = 3$, it is always positive, therefore the function is increasing on the entire domain.

2. We get $f'(x) = 2x - 4$, so we get the critical point x = 2. Thus, our number line is

[First derivative number line: the sign of $f'$ is − on (−∞, 2) and + on (2, ∞).]

Thus, $f$ is decreasing on (−∞, 2] and increasing on [2, ∞).

3. We evaluate $f'(x) = 3x^2 - 3$ to get the critical points x = ±1, so the number line is

[First derivative number line: the sign of $f'$ is + on (−∞, −1), − on (−1, 1), and + on (1, ∞).]

We conclude that $f$ is increasing on (−∞, −1], decreasing on [−1, 1], and increasing on [1, ∞).

4. We have $f'(x) = 3x^2$, which is positive everywhere except at x = 0, where it equals 0. Thus, $f$ is increasing on (−∞, 0] and on [0, ∞), and hence over the entire domain.
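
The first derivative number line amounts to sampling the sign of $f'$ once inside each interval cut out by the critical points (and any points where $f$ is undefined). A minimal sketch of that bookkeeping follows; the function `sign_chart` and its choice of sample points (midpoints, plus one unit beyond the outermost breakpoints) are assumptions for illustration.

```python
def sign_chart(f_prime, breakpoints):
    """Return the sign of f' on each interval determined by the breakpoints.

    Samples one test value per interval; a value of exactly 0 is reported as '0'.
    """
    pts = sorted(breakpoints)
    samples = [pts[0] - 1]
    samples += [(p + q) / 2 for p, q in zip(pts, pts[1:])]
    samples += [pts[-1] + 1]
    signs = []
    for s in samples:
        value = f_prime(s)
        signs.append('+' if value > 0 else '-' if value < 0 else '0')
    return signs

# f(x) = x^3 - 3x has f'(x) = 3x^2 - 3 with critical points -1 and 1.
print(sign_chart(lambda x: 3 * x ** 2 - 3, [-1, 1]))

# f(x) = x + 1/x has f'(x) = 1 - 1/x^2; 0 is included because f is undefined there.
print(sign_chart(lambda x: 1 - 1 / x ** 2, [-1, 0, 1]))
```

A single sample per interval is enough only because $f'$ cannot change sign between consecutive breakpoints when it is continuous there.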

[Figure: some secant lines of a convex function.]

Definition 10.4.24. $f(x)$ is convex on an interval if ∀a, b in the interval, the secant line connecting $(a, f(a))$ to $(b, f(b))$ lies above the graph on that part of the interval.

[Figure: some secant lines of a concave function.]

Definition 10.4.25. $f(x)$ is concave on an interval if ∀a, b in the interval, the secant line connecting $(a, f(a))$ to $(b, f(b))$ lies below the graph on that part of the interval.

Let two arbitrary points in a convex interval be a, b, and WLOG a < b. Consider any point a < x < b.

[Figure: a convex graph with the points $(a, f(a))$, $(x, f(x))$, and $(b, f(b))$ marked, and the secant line from $(a, f(a))$ to $(b, f(b))$.]

By our definition of convexity, $f(x)$ must be less than the corresponding value on the secant line connecting a and b. We can algebraically represent this as
\[ \forall a < x < b, \quad f(x) < \frac{f(b) - f(a)}{b - a}(x - a) + f(a), \]
where $y = \frac{f(b) - f(a)}{b - a}(x - a) + f(a)$ is the equation of the secant line. We can rearrange this inequality as
\[ \forall a < x < b, \quad \frac{f(x) - f(a)}{x - a} < \frac{f(b) - f(a)}{b - a}, \]
which is another way of representing the condition for convexity.

Definition 10.4.26. For any two points a < b in a convex interval, ∀a < x < b, $\frac{f(x) - f(a)}{x - a} < \frac{f(b) - f(a)}{b - a}$.

Definition 10.4.27. For any two points a < b in a concave interval, ∀a < x < b, $\frac{f(x) - f(a)}{x - a} > \frac{f(b) - f(a)}{b - a}$.

Theorem 10.4.28
If $f'' > 0$ over an interval, then $f$ is convex on that interval.

Proof. Let a, b be two points on the interval, and WLOG a < b. Let x be some point between a and b.
First, note that $f'$ is increasing by Corollary 10.4.21. Applying Theorem 10.4.16 on the intervals (a, x) and (x, b), ∃c ∈ (a, x) and ∃d ∈ (x, b) such that $f'(c) = \frac{f(x) - f(a)}{x - a}$ and $f'(d) = \frac{f(b) - f(x)}{b - x}$. Since c < d and $f'$ is increasing, we have $f'(c) < f'(d)$, or $\frac{f(x) - f(a)}{x - a} < \frac{f(b) - f(x)}{b - x}$. Then, note that
\begin{align*}
\frac{f(x) - f(a)}{x - a} &< \frac{f(b) - f(x)}{b - x} \\
(f(x) - f(a))(b - x) &< (f(b) - f(x))(x - a) \\
bf(x) - xf(x) - bf(a) + xf(a) &< xf(b) - af(b) - xf(x) + af(x) \\
bf(x) - bf(a) + xf(a) &< xf(b) - af(b) + af(x) \\
bf(x) - bf(a) + xf(a) + af(a) &< xf(b) - af(b) + af(x) + af(a) \\
bf(x) - bf(a) - af(x) + af(a) &< xf(b) - af(b) - xf(a) + af(a) \\
(b - a)(f(x) - f(a)) &< (x - a)(f(b) - f(a)) \\
\frac{f(x) - f(a)}{x - a} &< \frac{f(b) - f(a)}{b - a}.
\end{align*}

Therefore, the interval between a and b is convex, as desired.

Exercise 10.4.29. Is the converse of Theorem 10.4.28 true? If not, what would be a counterexample?

Theorem 10.4.30
If $f'' < 0$ over an interval, then $f$ is concave on that interval.

Proof. The proof is analogous to that of the previous theorem.



Theorem 10.4.31
If $f'' > 0$ on an interval, its tangent lines lie below the graph.

Proof. By Corollary 10.4.21, $f'$ is increasing.

Consider a fixed point $a$ on the interval. Then for any arbitrary $x$ on the interval, we have two cases:

• If $x > a$, then by Theorem 10.4.16, $\exists a < b < x$ such that $f'(b) = \frac{f(x) - f(a)}{x - a}$. As $f'$ is increasing, we have $f'(a) < f'(b)$, i.e. $f'(a) < \frac{f(x) - f(a)}{x - a}$. This rearranges to $f'(a)(x - a) + f(a) < f(x)$, which is what we wanted.
• If $x < a$, then by Theorem 10.4.16, $\exists x < b < a$ such that $f'(b) = \frac{f(a) - f(x)}{a - x}$. As $f'$ is increasing, we have $f'(b) < f'(a)$, i.e. $\frac{f(a) - f(x)}{a - x} < f'(a)$. This rearranges to $f'(a)(x - a) + f(a) < f(x)$, the same result as before.

We now have all the tools in calculus to sketch graphs effectively. When given such a task, we
consider:

1. The first derivative, which tells us on which intervals f is increasing or decreasing.

2. The second derivative, which tells us on which intervals f is convex or concave.

3. Any asymptotes; consider values of x at which f is undefined, the behavior of f as x goes to infinity or negative infinity, and any other special values.

4. The function’s roots and y-intercept.

Example 10.4.32
Sketch f (x) = x3 − 3x and label all relevant points.

Solution. We find that $f'(x) = 3x^2 - 3$, so its critical points are $\pm 1$, and we appropriately set up our first derivative number line and find the signs in each interval:

[First derivative number line: $+$ on $(-\infty, -1)$, $-$ on $(-1, 1)$, $+$ on $(1, \infty)$]

Now we know that $f$ is increasing on $(-\infty, -1]$, decreasing on $[-1, 1]$, and increasing on $[1, \infty)$.
We take the derivative of $f'(x)$ to get that $f''(x) = 6x$. Our only possible point of inflection is $0$, so we now set up our second derivative number line and find the signs:

[Second derivative number line: $-$ on $(-\infty, 0)$, $+$ on $(0, \infty)$]

Now we know that $f$ is concave down on $(-\infty, 0]$ and concave up on $[0, \infty)$, and that $0$ is a point of inflection.

Lastly, we find the roots, which are $0$ and $\pm\sqrt{3}$; in particular, the graph passes through the origin. It can be noted that since this function is a cubic, there are no asymptotes.

[Figure: graph of $f(x) = x^3 - 3x$ with labeled points $(-\sqrt{3}, 0)$, $(-1, 2)$, $(0, 0)$, $(1, -2)$, and $(\sqrt{3}, 0)$]

Starting from the left, we begin by sketching a sharply increasing concave curve until we reach
−1. This portion is signified as red in the diagram.
Then, according to the first derivative number line, the function starts decreasing. Keep in mind
that the shape is still concave. This portion is represented as blue in the diagram.
Next, the concavity changes at x = 0, an inflection point, but the function is still decreasing.
This is the pink portion of the graph.
Lastly, we finish the sketch by sharply increasing outwards while maintaining the convex shape.
This is the green part of the graph.
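The two number lines used in this example can be reproduced mechanically by evaluating the derivative at one test point per interval. The following is a minimal sketch under stated assumptions: the helper `sign_chart` is a hypothetical name, the breakpoints are supplied by hand, and the derivative is assumed to be nonzero at every test point.

```python
def sign_chart(g, breakpoints, pad=1.0):
    """Return the sign ('+' or '-') of g on each interval cut out by the breakpoints.

    One test point is used per interval: a point left of the first
    breakpoint, the midpoints between consecutive breakpoints, and a
    point right of the last breakpoint.
    """
    pts = sorted(breakpoints)
    tests = [pts[0] - pad]
    tests += [(p + q) / 2 for p, q in zip(pts, pts[1:])]
    tests.append(pts[-1] + pad)
    return ['+' if g(t) > 0 else '-' for t in tests]


def f_prime(x):
    return 3 * x**2 - 3   # derivative of x^3 - 3x


def f_double_prime(x):
    return 6 * x          # second derivative of x^3 - 3x


print(sign_chart(f_prime, [-1, 1]))     # signs on (-inf,-1), (-1,1), (1,inf)
print(sign_chart(f_double_prime, [0]))  # signs on (-inf,0), (0,inf)
```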
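Problem 10.4.33. Sketch $y = \frac{1}{x^2 + 1}$.

Solution. Its first derivative is $y' = \frac{-2x}{(x^2 + 1)^2}$, so its first derivative number line would be:

[First derivative number line: $+$ on $(-\infty, 0)$, $-$ on $(0, \infty)$]

Next, we find our second derivative:
\begin{align*}
y'' &= \frac{-2(x^2 + 1)^2 + 2x \cdot 2(x^2 + 1) \cdot 2x}{(x^2 + 1)^4} \\
&= \frac{-2(x^4 + 2x^2 + 1) + 8x^2(x^2 + 1)}{(x^2 + 1)^4} \\
&= \frac{6x^4 + 4x^2 - 2}{(x^2 + 1)^4} \\
&= \frac{2(x^2 + 1)(3x^2 - 1)}{(x^2 + 1)^4}.
\end{align*}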

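The possible points of inflection are $\pm\frac{\sqrt{3}}{3}$, and our second derivative number line would be:

[Second derivative number line: $+$ on $\left(-\infty, -\frac{\sqrt{3}}{3}\right)$, $-$ on $\left(-\frac{\sqrt{3}}{3}, \frac{\sqrt{3}}{3}\right)$, $+$ on $\left(\frac{\sqrt{3}}{3}, \infty\right)$]

Note that $\lim_{x \to \infty} \frac{1}{x^2 + 1} = \lim_{x \to -\infty} \frac{1}{x^2 + 1} = 0$, so $y = 0$ is an asymptote as $x$ goes to negative and positive infinity.

[Figure: graph of $y = \frac{1}{x^2 + 1}$ with labeled points $\left(-\frac{\sqrt{3}}{3}, \frac{3}{4}\right)$, $(0, 1)$, and $\left(\frac{\sqrt{3}}{3}, \frac{3}{4}\right)$]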

Considering all of this information, we start off our sketch with the red line very close to the asymptote $y = 0$, increasing and convex.
When we hit $x = -\frac{\sqrt{3}}{3}$, we have the line (now represented by a blue line in the figure) become concave down, but still increasing.
When $x = 0$, the graph now is decreasing, so we continue with the line (now green) decreasing.
Lastly, when we reach $x = \frac{\sqrt{3}}{3}$, the line (which is now pink) becomes concave up again, but continues to decrease and approach the asymptote $y = 0$ as $x$ goes to infinity.
Exercise 10.4.34. Sketch $y = x^2 - \frac{1}{x}$.
Exercise 10.4.35. Sketch $y = \frac{x}{x^2 + 1}$.
Problem 10.4.36. Sketch y = x + sin x.

Solution. Note that $y' = 1 + \cos x$. Since $\cos x \in [-1, 1]$, we have $1 + \cos x \geq 0$. Thus, $y'$ is always positive except for critical points $(\ldots, -\pi, \pi, 3\pi, \ldots)$, indicating that $y$ is always increasing except at the critical points, at which the tangent lines would be horizontal.

Don't be afraid to deal with infinitely many critical points, because there will probably be a recognizable pattern.
Then, we evaluate the second derivative to be $-\sin x$, so our second derivative number line would be:

[Second derivative number line: $\ldots$, $+$ on $(-\pi, 0)$, $-$ on $(0, \pi)$, $+$ on $(\pi, 2\pi)$, $-$ on $(2\pi, 3\pi)$, $\ldots$]

Notice the pattern that $+$ and $-$ are infinitely alternating. This suggests that the graph switches concavity at every multiple of $\pi$.


After considering this information, the sketch should be similar to this:

[Figure: graph of $y = x + \sin x$ with labeled points $(-2\pi, -2\pi)$, $(-\pi, -\pi)$, $(0, 0)$, $(\pi, \pi)$, and $(2\pi, 2\pi)$]

Theorem 10.4.37
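If $f'(a) = 0$ and $f''(a) > 0$ then $f$ has a local minimum at $x = a$.

Proof. Note that, since $f'(a) = 0$,
\begin{align*}
f''(a) &= \lim_{h \to 0} \frac{f'(a + h) - f'(a)}{h} \\
&= \lim_{h \to 0} \frac{f'(a + h)}{h},
\end{align*}
and we are given that $f''(a) > 0$, so $\lim_{h \to 0} \frac{f'(a + h)}{h} > 0$. Hence, for all $h$ sufficiently close to $0$, the quotient $\frac{f'(a + h)}{h}$ is positive: if $h > 0$, then $f'(a + h) > 0$, and if $h < 0$, then $f'(a + h) < 0$. There is no need to consider $h = 0$ since we are taking the limit. Thus, if we consider the first derivative number line near $a$, then

[First derivative number line: $-$ to the left of $a$, $+$ to the right]

which indicates that $x = a$ is a local minimum.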

Theorem 10.4.38
If $f'(a) = 0$ and $f''(a) < 0$ then $f$ has a local maximum at $x = a$.

Proof. The proof is nearly identical to that of the previous theorem.
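As an illustration of how the two second derivative tests are used in practice, here is a minimal numerical sketch. It assumes the caller already knows that $f'(a) = 0$, and it approximates $f''(a)$ with a central difference; the function name `classify_critical_point` and the step size `h` are illustrative choices, not part of the text.

```python
def classify_critical_point(f, a, h=1e-4):
    """Second derivative test at a point a where f'(a) = 0 is assumed.

    f''(a) is approximated by the central difference
    (f(a+h) - 2 f(a) + f(a-h)) / h^2.
    """
    second = (f(a + h) - 2 * f(a) + f(a - h)) / h**2
    if second > 0:
        return "local minimum"
    if second < 0:
        return "local maximum"
    return "inconclusive"


def cubic(x):
    return x**3 - 3 * x   # critical points at x = -1 and x = 1


print(classify_critical_point(cubic, 1.0))
print(classify_critical_point(cubic, -1.0))
```

When the approximated second derivative is zero or very close to it, neither theorem applies and a first derivative number line is the safer tool.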

10.5 Optimization
Example 10.5.1
If the radius of the sphere is 12, what is the volume of the largest cylinder that can be inscribed
in the sphere?

Solution. Let $r, h$ denote the radius and height of the cylinder respectively. Notice that the diagonal of the cylinder is twice the radius of the sphere, and then we can take advantage of a right triangle relationship:

[Figure: right triangle with hypotenuse $24$ and legs $h$ and $2r$]

Consider the formula for the volume of a cylinder, $V = \pi r^2 h$. Since there are two variables, we cannot yet maximize this function with our single-variable techniques. However, we have $h^2 + 4r^2 = 576$ from the right triangle, so we solve $r^2 = \frac{576 - h^2}{4}$, and substitute this back into the volume formula to get
\[ V(h) = \pi \left( \frac{576 - h^2}{4} \right) h = \frac{\pi}{4}(576h - h^3). \]
We differentiate to get $V'(h) = \frac{\pi}{4}(576 - 3h^2)$, from which we get the critical point $h = 8\sqrt{3}$ (the negative root is discarded since $h > 0$). Instead of going through the trouble to set up a first derivative number line, we can plug it into the second derivative. Note that $V''(h) = -\frac{3\pi}{2}h$ and thus $V''(8\sqrt{3}) < 0$. Then, by Theorem 10.4.38, $h = 8\sqrt{3}$ is a local maximum. However, since $8\sqrt{3}$ is the only critical point, it is therefore the global maximum, so the volume is maximized at $V(8\sqrt{3}) = 768\pi\sqrt{3}$.
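A coarse grid search offers an independent sanity check on this answer. The sketch below is only a numerical check under stated assumptions: the function name `volume`, the grid resolution, and the default radius `R = 12` are illustrative choices.

```python
import math


def volume(h, R=12.0):
    """Volume of the inscribed cylinder of height h, using r^2 = (4 R^2 - h^2) / 4."""
    return math.pi * (4 * R**2 - h**2) / 4 * h


# Search heights strictly between 0 and the sphere's diameter, 24.
candidates = (k * 24 / 100000 for k in range(1, 100000))
best_h = max(candidates, key=volume)

print(best_h, 8 * math.sqrt(3))                       # grid maximizer vs. critical point
print(volume(best_h), 768 * math.pi * math.sqrt(3))   # corresponding volumes
```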

Problem 10.5.2. Inside a hemisphere of radius 12 is inscribed a box with a square base. What
dimensions will maximize its volume?

Solution. This diagram represents the box with side length $s$ and height $h$.

[Figure: box inscribed in the hemisphere, with the radius $12$, the height $h$, and the half-diagonal $\frac{s\sqrt{2}}{2}$ of the base labeled]

Given that it is inscribed in a hemisphere, the distance between the center of the square base (and the base of the hemisphere) and one of the four upper corners of the box must be the radius of the hemisphere, or $12$. By properties of a square, the distance from the center of the base to one of the four lower corners must be half the diagonal of the square, or $\frac{s\sqrt{2}}{2}$.
Then, we have a right-triangle relationship and we can use the Pythagorean Theorem to relate all three sides: $h^2 + \left(\frac{s\sqrt{2}}{2}\right)^2 = 144$. This rearranges to $2h^2 + s^2 = 288$.
Given that the formula for the volume of this box is $s^2 h$, we solve the prior equation to get $s^2 = 288 - 2h^2$. Thus, we end up with $V(h) = (288 - 2h^2)h = 288h - 2h^3$.

We differentiate to get $V'(h) = -6h^2 + 288$, and the critical points are $h = \pm 4\sqrt{3}$. However, we discard $h = -4\sqrt{3}$ since length cannot be negative.

We confirm that $h = 4\sqrt{3}$ is a global maximum by substituting it into the second derivative (we have $V''(h) = -12h$, so $V''(4\sqrt{3}) < 0$). Then $s^2 = 288 - 2 \cdot 48 = 192$, so $s = 8\sqrt{3}$. Thus, the box has an $8\sqrt{3} \times 8\sqrt{3}$ base and height $4\sqrt{3}$, and its maximum volume is $V(4\sqrt{3}) = 768\sqrt{3}$.

Problem 10.5.3. Consider a circle with a sector cut out, at angle θ. Let this resulting figure have
a fixed area A. What values of r and θ will minimize the perimeter of this figure?

[Figure: circle of radius $r$ with a sector of angle $\theta$ cut out]

Solution. We are given that the area $A$ is constant, so the area formula $A = \frac{2\pi - \theta}{2\pi} \cdot \pi r^2 = \frac{2\pi - \theta}{2} r^2$ establishes a relation between $r$ and $\theta$. Since we wish to minimize the perimeter, which is $2r + 2\pi r - r\theta$, we need to find the expression for the perimeter in terms of one variable only, so we could apply our usual optimization techniques.
Note that $A = \frac{2\pi - \theta}{2} r^2$ rearranges to $\theta = 2\pi - \frac{2A}{r^2}$. We substitute this back into the initial expression for the perimeter to get $P(r) = 2r + 2\pi r - r\left(2\pi - \frac{2A}{r^2}\right) = r\left(2 + \frac{2A}{r^2}\right) = 2r + \frac{2A}{r}$.
We can then evaluate $P'(r) = 2 - \frac{2A}{r^2}$, from which we get the critical points $r = \pm\sqrt{A}$. We discard the solution $r = -\sqrt{A}$ as length cannot be negative, so we are left with only $r = \sqrt{A}$. We get $P''(r) = \frac{4A}{r^3}$, so $P''(\sqrt{A}) = \frac{4}{\sqrt{A}} > 0$, thus we confirm that $r = \sqrt{A}$ and therefore $\theta = 2\pi - 2$ yield the minimum perimeter.
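To see the result on concrete numbers, the sketch below fixes an arbitrary area and compares the perimeter at $r = \sqrt{A}$ with nearby radii. The value `A = 9.0` and the function name `perimeter` are illustrative assumptions.

```python
import math


def perimeter(r, A):
    """Perimeter 2r + 2A/r of the figure once theta is eliminated via the area constraint."""
    return 2 * r + 2 * A / r


A = 9.0                                     # arbitrary fixed area for the check
r_star = math.sqrt(A)                       # claimed minimizing radius
theta_star = 2 * math.pi - 2 * A / r_star**2

print(r_star, theta_star)
print(perimeter(0.9 * r_star, A), perimeter(r_star, A), perimeter(1.1 * r_star, A))
```

The minimizing angle does not depend on the chosen area, which is consistent with $\theta = 2\pi - 2$ containing no $A$.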

Problem 10.5.4. Show that if the point (a, b) on the parabola y = x2 is the closest point to (0, c)
(given c > 0), then the line connecting (a, b) to (0, c) is perpendicular to the tangent at (a, b).

Solution. The distance from $(a, b)$ to $(0, c)$ is $\sqrt{a^2 + (c - b)^2}$. We wish to minimize this distance, but it is sufficient to minimize the expression inside the radical, as this simplifies our work tremendously when differentiating. Note that $b = a^2$, as the point lies on the parabola $y = x^2$.
Consider $D(a) = a^2 + (c - a^2)^2$, so $D'(a) = 2a + 2(c - a^2)(-2a) = 2a(1 - 2c + 2a^2)$. The critical points are $a = 0, \pm\sqrt{\frac{2c - 1}{2}}$. However, if $c \leq \frac{1}{2}$, then $a = 0$ would be the only critical point.
Assume $c > \frac{1}{2}$. Then $D''(a) = 2 - 4c + 12a^2$, so $D''\left(\sqrt{\frac{2c - 1}{2}}\right) = 8c - 4 > 0$, so $a = \sqrt{\frac{2c - 1}{2}}$ is a local minimum. The same follows for $a = -\sqrt{\frac{2c - 1}{2}}$.
Thus, the two equally closest points are $\left(\sqrt{\frac{2c - 1}{2}}, \frac{2c - 1}{2}\right)$ and $\left(-\sqrt{\frac{2c - 1}{2}}, \frac{2c - 1}{2}\right)$. The slope of the line connecting $(0, c)$ and $\left(\sqrt{\frac{2c - 1}{2}}, \frac{2c - 1}{2}\right)$ is
\[ \frac{\frac{2c - 1}{2} - c}{\sqrt{\frac{2c - 1}{2}}} = -\frac{1}{2\sqrt{\frac{2c - 1}{2}}}. \]
Note that the derivative of $y = x^2$ is $y' = 2x$, so the slope of the tangent line through $\left(\sqrt{\frac{2c - 1}{2}}, \frac{2c - 1}{2}\right)$ is $2\sqrt{\frac{2c - 1}{2}}$, and we confirm that these slopes are negative reciprocals of each other, so the lines are perpendicular. The same reasoning can be applied to $\left(-\sqrt{\frac{2c - 1}{2}}, \frac{2c - 1}{2}\right)$ as well.

Problem 10.5.5. Maximize and minimize $a \sin x + b \cos x$ using calculus. Assume $a, b \neq 0$.

Solution. Let $f(x) = a \sin x + b \cos x$. Then $f'(x) = a \cos x - b \sin x$, so $x$ is a critical point only when $a \cos x = b \sin x$, or $\tan x = \frac{a}{b}$. This indicates that $\sin x = \frac{a}{\sqrt{a^2 + b^2}}$ and $\cos x = \frac{b}{\sqrt{a^2 + b^2}}$, or $\sin x = -\frac{a}{\sqrt{a^2 + b^2}}$ and $\cos x = -\frac{b}{\sqrt{a^2 + b^2}}$.
When we substitute the former values into the second derivative, $f''(x) = -a \sin x - b \cos x$, we get $-a \cdot \frac{a}{\sqrt{a^2 + b^2}} - b \cdot \frac{b}{\sqrt{a^2 + b^2}} = \frac{-(a^2 + b^2)}{\sqrt{a^2 + b^2}} = -\sqrt{a^2 + b^2} < 0$, so there is a local maximum at the value of $x$ that satisfies $\sin x = \frac{a}{\sqrt{a^2 + b^2}}$ and $\cos x = \frac{b}{\sqrt{a^2 + b^2}}$.

However, notice that $f''(x) = -f(x)$, so we can immediately conclude that $-\left(-\sqrt{a^2 + b^2}\right) = \sqrt{a^2 + b^2}$ is the maximum value.

Substituting in the latter values eventually leads to the similar conclusion that $-\sqrt{a^2 + b^2}$ is the minimum value. Note that in solving this problem we did not have to explicitly solve for the critical points to be able to plug them into the second derivative.
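The extreme values $\pm\sqrt{a^2 + b^2}$ can be checked by dense sampling over one period. This is a minimal sketch; the function name `extremes`, the sample count, and the coefficients $3$ and $4$ are illustrative assumptions.

```python
import math


def extremes(a, b, samples=100000):
    """Smallest and largest sampled values of a sin x + b cos x over one period [0, 2 pi)."""
    values = [
        a * math.sin(2 * math.pi * k / samples) + b * math.cos(2 * math.pi * k / samples)
        for k in range(samples)
    ]
    return min(values), max(values)


low, high = extremes(3.0, 4.0)
print(low, high, math.sqrt(3.0**2 + 4.0**2))
```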

10.6 L’Hôpital’s Rule

First, we introduce a precursory theorem to aid us in proving the main formula for this section.

Theorem 10.6.1 (Cauchy Mean Value Theorem)


Suppose $f, g$ are continuous on $[a, b]$ and differentiable on $(a, b)$. Then $\exists a < x < b$ such that $(f(b) - f(a))g'(x) = (g(b) - g(a))f'(x)$, or (when the denominators are nonzero)
\[ \frac{g'(x)}{f'(x)} = \frac{g(b) - g(a)}{f(b) - f(a)}. \]

Proof. Let $h(x) = (f(b) - f(a))g(x) - (g(b) - g(a))f(x)$. Note that $h(a) = (f(b) - f(a))g(a) - (g(b) - g(a))f(a) = f(b)g(a) - f(a)g(b)$, and $h(b) = (f(b) - f(a))g(b) - (g(b) - g(a))f(b) = f(b)g(a) - f(a)g(b)$, so $h(a) = h(b)$. Therefore, by Theorem 10.4.13, $\exists a < x < b$ such that $h'(x) = 0$, i.e. $(f(b) - f(a))g'(x) - (g(b) - g(a))f'(x) = 0$, which rearranges to our desired result.
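The proof is constructive enough to locate such a point numerically: it is a zero of $h'$. The sketch below bisects on $h'(x) = (f(b) - f(a))g'(x) - (g(b) - g(a))f'(x)$. The function name `cauchy_point`, the bracketing interval, and the example functions $f(x) = x^2$, $g(x) = x^3$ on $[0, 1]$ are assumptions chosen for illustration, and the bracket must be supplied so that $h'$ changes sign on it.

```python
def cauchy_point(f, g, df, dg, a, b, lo, hi, iters=100):
    """Find x in [lo, hi] with (f(b)-f(a)) g'(x) = (g(b)-g(a)) f'(x) by bisection.

    Assumes h'(x) = (f(b)-f(a)) g'(x) - (g(b)-g(a)) f'(x) changes sign on [lo, hi].
    """
    F, G = f(b) - f(a), g(b) - g(a)

    def h_prime(x):
        return F * dg(x) - G * df(x)

    for _ in range(iters):
        mid = (lo + hi) / 2
        if h_prime(lo) * h_prime(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


x_star = cauchy_point(
    f=lambda x: x**2, g=lambda x: x**3,
    df=lambda x: 2 * x, dg=lambda x: 3 * x**2,
    a=0.0, b=1.0, lo=0.1, hi=1.0,
)
print(x_star)
```

For this pair the condition reads $3x^2 = 2x$, so the point found inside $(0, 1)$ should be close to $2/3$.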

Theorem 10.6.2 (L’Hôpital’s Rule)


Suppose $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$ and that $\lim_{x \to a} \frac{g'(x)}{f'(x)}$ exists. Then $\lim_{x \to a} \frac{g(x)}{f(x)} = \lim_{x \to a} \frac{g'(x)}{f'(x)}$.

Proof. Assuming that $f$ and $g$ are differentiable, note that they must also be continuous, by Theorem 10.1.20. Thus, $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$ indicates that $f(a) = g(a) = 0$.

For some $x$ near $a$, by Theorem 10.6.1, $\exists b$ between $a$ and $x$ such that
\[ \frac{g'(b)}{f'(b)} = \frac{g(x) - g(a)}{f(x) - f(a)} = \frac{g(x)}{f(x)}. \]
Then $\lim_{x \to a} \frac{g(x)}{f(x)} = \lim_{x \to a} \frac{g'(b)}{f'(b)}$. However, note that as $x$ approaches $a$, $b$ will also approach $a$ since $b$ is between $x$ and $a$. Thus, $\lim_{x \to a} \frac{g(x)}{f(x)} = \lim_{x \to a} \frac{g'(b)}{f'(b)} = \frac{g'(a)}{f'(a)} = \lim_{x \to a} \frac{g'(x)}{f'(x)}$, as desired.

Problem 10.6.3. Evaluate $\lim_{x \to 2} \frac{x^2 - 3x + 2}{x^2 - 5x + 6}$ using Theorem 10.6.2.

Solution. When we plug in $x = 2$, the numerator and denominator both become $0$, so we can apply Theorem 10.6.2 to get $\lim_{x \to 2} \frac{x^2 - 3x + 2}{x^2 - 5x + 6} = \lim_{x \to 2} \frac{2x - 3}{2x - 5} = -1$.

Exercise 10.6.4. Evaluate $\lim_{x \to 0} \frac{\sin x}{x}$ and $\lim_{x \to 0} \frac{\sin 3x}{\sin 5x}$ using Theorem 10.6.2.

Problem 10.6.5. Evaluate $\lim_{x \to 0} \frac{1 - \cos 3x}{1 - \cos 5x}$ using Theorem 10.6.2.

Solution. First, we have $\lim_{x \to 0} \frac{1 - \cos 3x}{1 - \cos 5x} = \lim_{x \to 0} \frac{3 \sin 3x}{5 \sin 5x}$, but this still gives the form $\frac{0}{0}$ when $x = 0$ is plugged in. Therefore, we apply Theorem 10.6.2 again: $\lim_{x \to 0} \frac{3 \sin 3x}{5 \sin 5x} = \lim_{x \to 0} \frac{9 \cos 3x}{25 \cos 5x} = \frac{9}{25}$.

Problem 10.6.6. Evaluate $\lim_{x \to 2} \frac{x^2 - 5x + 6}{x^2 - 4x + 4}$ using Theorem 10.6.2.

Solution. We have $\lim_{x \to 2} \frac{x^2 - 5x + 6}{x^2 - 4x + 4} = \lim_{x \to 2} \frac{2x - 5}{2x - 4}$, but this limit does not exist.

Therefore, $\lim_{x \to 2} \frac{x^2 - 5x + 6}{x^2 - 4x + 4}$ does not exist.

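Problem 10.6.7. Compute $\lim_{x \to 0} \frac{3x - \sin 3x}{x^3}$.

Solution. We repeatedly apply Theorem 10.6.2:
\begin{align*}
\lim_{x \to 0} \frac{3x - \sin 3x}{x^3} &= \lim_{x \to 0} \frac{3 - 3\cos 3x}{3x^2} \\
&= \lim_{x \to 0} \frac{1 - \cos 3x}{x^2} \\
&= \lim_{x \to 0} \frac{3 \sin 3x}{2x} \\
&= \frac{3}{2} \lim_{x \to 0} \frac{\sin 3x}{x} \\
&= \frac{3}{2} \lim_{x \to 0} 3\cos 3x \\
&= \frac{9}{2}.
\end{align*}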

Remark. Theorem 10.6.2 extends to $x \to \pm\infty$, $x \to a^+$, and $x \to a^-$.

Theorem 10.6.8
If $\lim_{x \to a} \frac{f'(x)}{g'(x)} = \pm\infty$, then $\lim_{x \to a} \frac{f(x)}{g(x)} = \pm\infty$. The same holds for right-hand and left-hand limits.

In other words, Theorem 10.6.2 extends to $\pm\frac{\infty}{\infty}$.

Proof. Suppose $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = \infty$. Assuming that $\lim_{x \to a} \frac{f(x)}{g(x)}$ exists, we rearrange the fraction,
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{\frac{1}{g(x)}}{\frac{1}{f(x)}}, \]
and now it is appropriate to apply Theorem 10.6.2, as the numerator and denominator now go to $0$. Then we differentiate,
\[ \lim_{x \to a} \frac{\frac{1}{g(x)}}{\frac{1}{f(x)}} = \lim_{x \to a} \frac{-\frac{g'(x)}{g(x)^2}}{-\frac{f'(x)}{f(x)^2}} = \lim_{x \to a} \frac{f(x)}{g(x)} \cdot \frac{f(x)}{g(x)} \cdot \frac{g'(x)}{f'(x)}. \]
Thus,
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f(x)}{g(x)} \cdot \lim_{x \to a} \frac{f(x)}{g(x)} \cdot \lim_{x \to a} \frac{g'(x)}{f'(x)}, \]
so $\lim_{x \to a} \frac{f'(x)}{g'(x)} = \lim_{x \to a} \frac{f(x)}{g(x)}$, which is what we want.

10.7 Inverses

We begin the section with a warm-up.


Problem 10.7.1. Prove that if $f$ is increasing, then $f^{-1}$ is increasing.

Proof. Looking back to Definition 10.4.19, $a < b \to f(a) < f(b)$ for all $a, b$. Then the contrapositive of this implication must be true, i.e. $f(b) \leq f(a) \to b \leq a$. Let $\tilde{b} = f(b)$ and $\tilde{a} = f(a)$. It follows that $f^{-1}(\tilde{b}) = b$ and $f^{-1}(\tilde{a}) = a$, so we have $\tilde{b} \leq \tilde{a} \to f^{-1}(\tilde{b}) \leq f^{-1}(\tilde{a})$. However, note that $f$ and $f^{-1}$ are one-to-one (or else the inverse would not exist), so we can safely get rid of the equal signs to conclude $\tilde{b} < \tilde{a} \to f^{-1}(\tilde{b}) < f^{-1}(\tilde{a})$. Thus, $f^{-1}$ is increasing.

Theorem 10.7.2
Suppose $f$ is continuous and increasing on $[a, b]$. Then $f^{-1}$ is continuous on $[f(a), f(b)]$.

The proof, which will involve the ε − δ definition of continuity, is left to the reader.
Now it makes sense to consider the derivative of inverses. We now prove the general formula:

Theorem 10.7.3 (Derivative of the Inverse)


Let $f^{-1}$ denote the inverse function of $f$. Then,
\[ (f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))}. \]

Proof. Assume $f$ is continuous and one-to-one, or else it would not have an inverse. Let $f^{-1}(x) = y$, or $x = f(y)$. Note that
\[ (f^{-1})'(x) = \lim_{h \to 0} \frac{f^{-1}(x + h) - f^{-1}(x)}{h}. \]
When $h$ is small and approaches $0$, $f^{-1}(x + h)$ approaches $f^{-1}(x) = y$. So, let $f^{-1}(x + h) = y + \tilde{h}$, for some small $\tilde{h}$. This also suggests that $x + h = f(y + \tilde{h})$. Thus,
\begin{align*}
\lim_{h \to 0} \frac{f^{-1}(x + h) - f^{-1}(x)}{h} &= \lim_{h \to 0} \frac{y + \tilde{h} - y}{h} \\
&= \lim_{h \to 0} \frac{\tilde{h}}{x + h - x} \\
&= \lim_{h \to 0} \frac{\tilde{h}}{f(y + \tilde{h}) - f(y)},
\end{align*}
and note that as $h$ goes to $0$, $\tilde{h}$ must also go to $0$, as we keep in mind that $f^{-1}(x) = y$. Thus,
\[ \lim_{h \to 0} \frac{\tilde{h}}{f(y + \tilde{h}) - f(y)} = \lim_{\tilde{h} \to 0} \frac{\tilde{h}}{f(y + \tilde{h}) - f(y)} = \frac{1}{f'(y)} = \frac{1}{f'(f^{-1}(x))}, \]
so $(f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))}$.
Exercise 10.7.4. Use Theorem 10.1.36 to derive this formula. Why can’t this method be a rigorous
proof for Theorem 10.7.3?

Example 10.7.5
Find $\frac{d}{dx}(\sin^{-1}(x))$, $\frac{d}{dx}(\cos^{-1}(x))$, and $\frac{d}{dx}(\tan^{-1}(x))$.

Solution. We can use three methods:

1. We can directly apply the formula from Theorem 10.7.3: $\frac{d}{dx}(\sin^{-1}(x)) = \frac{1}{\cos(\sin^{-1}(x))} = \frac{1}{\sqrt{1 - x^2}}$.

2. Alternatively, let $y = \cos^{-1}(x)$, so $\cos y = x$. Then, we can implicitly differentiate,
\[ \frac{d}{dx}(\cos y) = \frac{d}{dx}(x) \longleftrightarrow -\sin(y)\frac{dy}{dx} = 1, \]
so $\frac{dy}{dx} = -\frac{1}{\sin y} = -\frac{1}{\sin(\cos^{-1}(x))} = -\frac{1}{\sqrt{1 - x^2}}$.

3. Lastly, consider the identity $x = \tan(\tan^{-1}(x))$. We implicitly differentiate to get
\[ \frac{d}{dx}(x) = \frac{d}{dx}(\tan(\tan^{-1}(x))) \longleftrightarrow 1 = \sec^2(\tan^{-1}(x)) \frac{d}{dx}(\tan^{-1}(x)), \]
so $\frac{d}{dx}(\tan^{-1}(x)) = \frac{1}{\sec^2(\tan^{-1}(x))} = \frac{1}{x^2 + 1}$.

Exercise 10.7.6. Evaluate the derivatives of the rest of the inverse trigonometric functions.
Exercise 10.7.7. Use Theorem 10.7.3 to prove that $\forall n \in \mathbb{N}$, $\frac{d}{dx}\left(x^{\frac{1}{n}}\right) = \frac{1}{n} x^{\frac{1}{n} - 1}$. Then, prove the general case: $\forall r \in \mathbb{Q}$, $\frac{d}{dx}(x^r) = rx^{r-1}$.
Exercise 10.7.8. Find $\frac{d}{dx}\left(\tan^{-1}\left(x^2\right)\right)^7$.

10.8 Parametric and Polar Equations


Theorem 10.8.1 (Derivative of Parametric Equations)
Assume $f(t), g(t)$ are differentiable, such that
\begin{align*}
x &= f(t), \\
y &= g(t).
\end{align*}
If $f'(t) \neq 0$, then $\frac{dy}{dx} = \frac{g'(t)}{f'(t)}$.

Proof. Since the equations $x = f(t)$, $y = g(t)$ do not necessarily define a function (there could be loops or spirals), we need to impose some restrictions. Fix some $t_0$, and suppose $y$ is a "function" of $x$ locally near $t_0$.
We can restrict the parametric equation to a certain domain and range such that it is an actual function. In this area, let $y = h(x)$, so $y = g(t) = h(x) = h(f(t))$. Then we implicitly differentiate to get
\[ g'(t) = h'(f(t))f'(t) \longleftrightarrow h'(x) = \frac{g'(t)}{f'(t)}, \]
which is what we wanted.

Exercise 10.8.2. Given the parametric equations $x = \cos\theta$, $y = \sin\theta$ (what is this graph?), find $\frac{dy}{dx}$.
Exercise 10.8.3. Given the parametric equations $x = k(\theta - \sin\theta)$, $y = k(1 - \cos\theta)$ (where $k \in \mathbb{R}$ is some constant), find $\frac{dy}{dx}$.
Problem 10.8.4. How can we get the slope of $r = f(\theta)$ at some fixed $\theta = \theta_0$?

Solution. Note that $x = r\cos\theta$, $y = r\sin\theta$, so substitute $r = f(\theta)$ to get $x = f(\theta)\cos\theta$ and $y = f(\theta)\sin\theta$. Apply Theorem 10.8.1 to this, and we get
\[ \frac{dy}{dx} = \frac{f'(\theta)\sin\theta + f(\theta)\cos\theta}{f'(\theta)\cos\theta - f(\theta)\sin\theta}. \]
Therefore the slope of the tangent line at $\theta = \theta_0$ is $\frac{f'(\theta_0)\sin\theta_0 + f(\theta_0)\cos\theta_0}{f'(\theta_0)\cos\theta_0 - f(\theta_0)\sin\theta_0}$.
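The polar slope formula translates directly into a small function. The sketch below is an illustration under stated assumptions: the name `polar_slope` is hypothetical, the caller supplies both $f$ and $f'$, and the denominator is assumed to be nonzero at the chosen angle.

```python
import math


def polar_slope(f, df, theta):
    """Slope dy/dx of the polar curve r = f(theta), given f and its derivative df.

    Assumes the denominator f'(theta) cos(theta) - f(theta) sin(theta) is nonzero.
    """
    numerator = df(theta) * math.sin(theta) + f(theta) * math.cos(theta)
    denominator = df(theta) * math.cos(theta) - f(theta) * math.sin(theta)
    return numerator / denominator


# Unit circle r = 1: the tangent at theta = pi/4 has slope -1.
print(polar_slope(lambda t: 1.0, lambda t: 0.0, math.pi / 4))

# Spiral r = theta, evaluated at theta = pi.
print(polar_slope(lambda t: t, lambda t: 1.0, math.pi))
```

For the spiral at $\theta = \pi$ the formula reduces to $\frac{-\pi}{-1} = \pi$, which the second call should reproduce up to rounding.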

10.9 Review

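Problem 10.9.1. Consider $\triangle ABC$ where $AC = 3$, $AB = 4$, and $BC = 5$. Optimize (minimize or maximize) the perimeter of $\triangle DEF$.

[Figure: right triangle $ABC$ with $AB = 4$, $AC = 3$, $BC = 5$, and points $D$ on $AB$, $E$ on $BC$, and $F$ on $AC$]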
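Solution. Let $DE = x$. By AA similarity, $\triangle BDE \sim \triangle BAC$, so $\frac{BD}{4} = \frac{x}{3}$, which rearranges to $BD = \frac{4}{3}x$. Therefore $DA = EF = 4 - \frac{4}{3}x$, and $DF = \sqrt{x^2 + \left(4 - \frac{4}{3}x\right)^2}$ by the Pythagorean Theorem. We want to optimize the perimeter, so consider the function,
\[ f(x) = x + 4 - \frac{4}{3}x + \sqrt{x^2 + \left(4 - \frac{4}{3}x\right)^2}. \]
We differentiate to get
\[ f'(x) = -\frac{1}{3} + \frac{\frac{25}{9}x - \frac{16}{3}}{\sqrt{\frac{25}{9}x^2 - \frac{32}{3}x + 16}}, \]
and we set this equal to $0$ and square both sides to find the critical points, resulting in the quadratic $25x^2 - 96x + 90 = 0$. Its roots are $x = \frac{48 - 3\sqrt{6}}{25}$ and $x = \frac{48 + 3\sqrt{6}}{25}$. Because squaring can introduce extraneous roots, each root must be checked against $f'(x) = 0$, which requires $\frac{25}{9}x - \frac{16}{3} > 0$. Only $x = \frac{48 + 3\sqrt{6}}{25}$ satisfies this, so it is the only critical point, and it is left to the reader to verify whether it yields a local maximum or minimum.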
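Setting $f'(x) = 0$ here involves squaring, which can introduce extraneous roots, so it is worth evaluating $f'$ at both roots of the quadratic $25x^2 - 96x + 90 = 0$. The sketch below does exactly that; the function name `perimeter_prime` is an illustrative choice, and the formula is the derivative stated in the solution.

```python
import math


def perimeter_prime(x):
    """f'(x) = -1/3 + (25x/9 - 16/3) / sqrt(25x^2/9 - 32x/3 + 16)."""
    radicand = 25 * x**2 / 9 - 32 * x / 3 + 16
    return -1 / 3 + (25 * x / 9 - 16 / 3) / math.sqrt(radicand)


small_root = (48 - 3 * math.sqrt(6)) / 25
large_root = (48 + 3 * math.sqrt(6)) / 25

print(small_root, perimeter_prime(small_root))
print(large_root, perimeter_prime(large_root))
```

At the larger root $f'$ vanishes up to rounding, whereas at the smaller root it evaluates to about $-\frac{2}{3}$, so only the larger root is a genuine critical point.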
Exercise 10.9.2. Find $\frac{d}{dx}\left(\tan^{-1} \tan^{-1}(x)\right)$ and $\frac{d}{dx}\left(x \sin^{-1} x\right)$.
Problem 10.9.3. Let $f(x) = x^3 + x + 1$. Find $(f^{-1})'(31)$.

Solution. By Theorem 10.7.3, $(f^{-1})'(x) = \frac{1}{3(f^{-1}(x))^2 + 1}$. To find $(f^{-1})'(31)$, we need to find $f^{-1}(31)$ in order to use the formula. Note that $3$ is the only real root that satisfies $x^3 + x + 1 = 31$, so $f^{-1}(31) = 3$, so $(f^{-1})'(31) = \frac{1}{3(f^{-1}(31))^2 + 1} = \frac{1}{3 \cdot 3^2 + 1} = \frac{1}{28}$.
Problem 10.9.4. Find $\lim_{x \to \frac{\pi}{2}^-} \left(x - \frac{\pi}{2}\right)\tan x$, $\lim_{x \to 0} \frac{\sin^{-1} x}{\tan^{-1} x}$, and $\lim_{x \to 1} \frac{x^{100} - 1}{x^{99} - 1}$.

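Solution. We apply Theorem 10.6.2.

1. For the first limit, rewrite $\tan x = \frac{\sin x}{\cos x}$ to get
\[ \lim_{x \to \frac{\pi}{2}^-} \frac{\left(x - \frac{\pi}{2}\right)\sin x}{\cos x}, \]
so it is now appropriate to differentiate the numerator and denominator to get
\[ \lim_{x \to \frac{\pi}{2}^-} \frac{\sin x + (\cos x)\left(x - \frac{\pi}{2}\right)}{-\sin x} = -1. \]

2. Recalling that $\frac{d}{dx}(\sin^{-1}(x)) = \frac{1}{\sqrt{1 - x^2}}$ and $\frac{d}{dx}(\tan^{-1}(x)) = \frac{1}{x^2 + 1}$, we have
\[ \lim_{x \to 0} \frac{\sin^{-1} x}{\tan^{-1} x} = \lim_{x \to 0} \frac{\frac{1}{\sqrt{1 - x^2}}}{\frac{1}{x^2 + 1}} = \lim_{x \to 0} \frac{x^2 + 1}{\sqrt{1 - x^2}} = 1. \]

3. Likewise,
\[ \lim_{x \to 1} \frac{x^{100} - 1}{x^{99} - 1} = \lim_{x \to 1} \frac{100x^{99}}{99x^{98}} = \frac{100}{99}. \]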
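Problem 10.9.5. Consider a $3$-$4$-$5$ right triangle as shown.

[Figure: right triangle with legs $3$ and $4$ and hypotenuse $5$, with a horizontal segment $AB$ drawn across it]

The horizontal segment $AB$ moves downward at a rate of $\frac{1}{3}$ unit per year. How fast is the length $AB$ changing when $AB = 2$ units?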

Solution. Let $AB = y$, and the altitude from $AB$ to base of side length $3$ be $x$. We label the triangle as such:

[Figure: the same triangle, with the segment $AB$ labeled $y$ and the vertical leg of the small triangle above it labeled $4 - x$]

Note that the small right triangle with side lengths $4 - x$ and $y$ is similar to the $3$-$4$-$5$ triangle by AA similarity. Thus, $\frac{y}{3} = \frac{4 - x}{4}$, which rearranges to $4y + 3x = 12$.
Interpreting the problem, we are given $\frac{dx}{dt} = -\frac{1}{3}$, and we want to find $\frac{dy}{dt}$ when $y = 2$. We implicitly differentiate the equation above:
\[ \frac{d}{dt}(4y + 3x) = \frac{d}{dt}(12) \longleftrightarrow 4\frac{dy}{dt} + 3\frac{dx}{dt} = 0, \]
and we substitute the given values to get $4\frac{dy}{dt} + 3\left(-\frac{1}{3}\right) = 0$ to get $\frac{dy}{dt} = \frac{1}{4}$ unit per year. Note that we did not even need to use $y = 2$, because the horizontal segment's rate of change is constant, given the relation $4y + 3x = 12$.

Problem 10.9.6. A box with a square base and an open top has volume 1000 cubic inches. What
dimensions will minimize its surface area?

Solution. Let the side length of the square base be $x$, and therefore the height of the box will be $\frac{1000}{x^2}$. Then the surface area can be represented by the function
\[ f(x) = x^2 + 4 \cdot \frac{1000}{x}, \]
and we differentiate to get $f'(x) = 2x - \frac{4000}{x^2}$, yielding the critical point $x = 10\sqrt[3]{2}$. Since $f''(x) = 2 + \frac{8000}{x^3}$, $f''(10\sqrt[3]{2}) > 0$, so $x = 10\sqrt[3]{2}$ yields the minimum surface area. Thus, the dimensions of the box will be $10\sqrt[3]{2} \times 10\sqrt[3]{2} \times 5\sqrt[3]{2}$.
3 3 3

Problem 10.9.7. A boy flies a kite at the height of 300 feet. The wind carries the kite horizontally
from the boy at a rate of 25 feet per second. How fast must he let out the string when the kite is
500 feet away from him?

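Solution. We interpret the problem using a right triangle.

[Figure: right triangle with vertical leg $300$ and hypotenuse $y$]

We are given $\frac{dx}{dt} = 25$, and we want to find $\frac{dy}{dt}$ when $y = 500$. By the Pythagorean Theorem, $300^2 + x^2 = y^2$, so we implicitly differentiate it to get $2x\frac{dx}{dt} = 2y\frac{dy}{dt}$. When $y = 500$, the Pythagorean Theorem gives $x = 400$. Then we substitute in values: $400 \cdot 25 = 500\frac{dy}{dt}$, i.e. $\frac{dy}{dt} = 20$ feet per second.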
