Advanced Precalculus 2 1
Daniel Kim
1 Logic
1.1 Logical Operators
1.2 Logical Arguments
1.3 Quantified Statements
2 Set Theory
2.1 Introduction
2.2 Operations on Sets
3 Fields
3.1 Field Axioms
3.2 Subtraction, Division
5 Mathematical Induction
5.1 Standard Applications
5.2 The Binomial Theorem
6 Basic Trigonometry
6.1 Review
6.2 The Unit Circle
6.3 Graphing
6.4 Inverse Trigonometric Functions
6.5 More Trigonometric Identities
6.6 Areas, Law of Sines, Law of Cosines
9 Limits
9.1 Linear Functions
9.2 Non-linear Functions
9.3 Limit Properties
9.4 Other Limits
9.5 Trigonometric Limits
9.6 Advanced Concepts
9.7 Continuity
10 Derivatives
10.1 Introduction
10.2 Implicit Differentiation
10.3 Related Rates
10.4 Significance of the Derivative
10.5 Optimization
10.6 L’Hôpital’s Rule
10.7 Inverses
10.8 Parametric and Polar Equations
10.9 Review
Chapter 1
Logic
To understand how proofs work, it is essential to understand the underlying mathematical logic
involved. In this chapter, we will only provide a relatively brief overview of how logic works.
The Law of Excluded Middle states that every statement is always either true or false. We
usually use variables p, q, r, etc. to denote such a statement.
Definition 1.1.1. The negation of a statement p reverses the original truth value, and is denoted
as ∼p. This is pronounced as “not p.”
In other words, as long as at least one of p or q is true, then p ∨ q is true. Otherwise, it is false.
This relationship can be depicted in a truth table, where all possible combinations of truth
values for the considered statements are enumerated in an organized table form. For brevity, the
values of true and false will respectively be denoted as T and F.
Here is the truth table for p ∨ q:
p q p∨q
T T T
T F T
F T T
F F F
For each pair of truth values for p and q, we read the truth table by horizontal rows. For example,
the first row of this table would indicate, “if p is true and q is true, then p ∨ q is true.”
Review the table and confirm that it matches with your understanding of disjunction.
Since there are two initial statements that we consider (p and q) with two possible truth values
for each (true or false), there are 2^2 = 4 rows in the truth table.
For a truth table demonstrating negation, we only have to consider the initial statement p, so there
are only two rows:
p ∼p
T F
F T
In other words, if at least one of the statements p and q is false, then p ∧ q will necessarily be
false. Here is the truth table representing conjunction:
p q p∧q
T T T
T F F
F T F
F F F
Example 1.1.4
Construct a truth table for ∼p ∧ q.
Solution. When we break down this statement, notice that we will have to evaluate the possible
truth values of p, q, ∼p, and lastly, ∼p ∧ q. The truth table illustrates this process:
p q ∼p ∼p ∧ q
T T F F
T F F F
F T T T
F F T F
Once we get the truth values for ∼p, we perform the conjunction on the ∼p column and the q
column to get our result.
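The row-by-row construction above can also be carried out mechanically. The following Python sketch is only an illustration: the helper names `truth_table` and `show` are our own and do not come from the text. It lists every combination of truth values with `itertools.product` and evaluates ∼p ∧ q on each row.

```python
from itertools import product

def truth_table(num_vars, formula):
    """Return one row per assignment: the input values followed by the formula's value."""
    rows = []
    # product([True, False], ...) lists T before F, matching the row order used in the text.
    for values in product([True, False], repeat=num_vars):
        rows.append(values + (formula(*values),))
    return rows

def show(value):
    """Render a Python boolean as T or F."""
    return "T" if value else "F"

# Example 1.1.4: the statement ∼p ∧ q.
table = truth_table(2, lambda p, q: (not p) and q)
for row in table:
    print(" ".join(show(v) for v in row))
```

Each printed row holds p, q, and ∼p ∧ q, so the last entry of every row can be compared against the final column of the table above.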
Definition 1.1.5. The operation exclusive-or, shortened to “XOR,” is denoted as p ⊕ q. This new
statement is true as long as p and q have different truth values.
p q p⊕q
T T F
T F T
F T T
F F F
But in fact, it is possible to express p ⊕ q in terms of the basic logical operators! We need to find
a statement that only uses negation, disjunction, and conjunction operators which fulfills the same
purpose as p ⊕ q.
Definition 1.1.6. Two statements are logically equivalent if they have corresponding equal
possible truth values. If p and q are two such statements, then their equivalence is represented by
p ≡ q.
Proof. We end up with a large truth table since we have to break up that unwieldy statement.
p q p ∨ q p ∧ q ∼(p ∧ q) (p ∨ q) ∧ ∼(p ∧ q) p ⊕ q
T T T T F F F
T F T F T T T
F T T F T T T
F F F F T F F
It can then be confirmed that the columns for (p ∨ q) ∧ ∼(p ∧ q) and p ⊕ q have the same values.
Note that they must be the same for each row (i.e. when viewing the two columns top-to-bottom or
vice versa, the order of the truth values must match).
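Comparing two columns row by row is exactly what a short program can do. As a minimal sketch (the function name `equivalent` is an assumption made for illustration), two statements are treated as logically equivalent when they agree on every assignment of truth values:

```python
from itertools import product

def equivalent(f, g, num_vars):
    """True when f and g have the same truth value on all 2**num_vars assignments."""
    return all(f(*v) == g(*v) for v in product([True, False], repeat=num_vars))

def xor(p, q):
    # p ⊕ q is true exactly when p and q have different truth values.
    return p != q

def candidate(p, q):
    # (p ∨ q) ∧ ∼(p ∧ q), built only from negation, disjunction, and conjunction.
    return (p or q) and not (p and q)

print(equivalent(xor, candidate, 2))
```

Replacing `candidate` with plain disjunction makes the check fail, since p ∨ q and p ⊕ q disagree when both p and q are true.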
Problem 1.1.8. Prove that p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r).
Proof. Since we have three initial statements to consider (p, q, and r), our truth table will have
2^3 = 8 rows.
p q r q ∨ r p ∧ q p ∧ r p ∧ (q ∨ r) (p ∧ q) ∨ (p ∧ r)
T T T T T T T T
T T F T T F T T
T F T T F T T T
T F F F F F F F
F T T T F F F F
F T F T F F F F
F F T T F F F F
F F F F F F F F
Note that this equivalence resembles the distributive property from algebra.
(p ∧ q) ∨ r ≡ p ∧ (q ∨ r)
Proof. This equivalence is false. It suffices to supply one counterexample, since a logical equivalence
must hold for all possible truth value combinations for the initial statements (which would be p, q,
and r in this problem).
Consider p = F, q = F, and r = T. These truth values lead to the following equivalences:
(p ∧ q) ∨ r ≡ T,
p ∧ (q ∨ r) ≡ F.
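A counterexample like this one can also be found by exhaustive search. The sketch below (variable names are ours) collects every assignment on which the two sides disagree:

```python
from itertools import product

def lhs(p, q, r):
    return (p and q) or r      # (p ∧ q) ∨ r

def rhs(p, q, r):
    return p and (q or r)      # p ∧ (q ∨ r)

# Each entry is a triple (p, q, r) on which the two statements take different values.
counterexamples = [v for v in product([True, False], repeat=3) if lhs(*v) != rhs(*v)]
print(counterexamples)
```

A single entry in the list is enough to disprove the equivalence; the assignment p = F, q = F, r = T used above is one of them.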
In fact, there is an abundance of logical equivalences, and the most important laws are listed in
the table below.
Logical Equivalences
Associative laws     p ∧ (q ∧ r) ≡ (p ∧ q) ∧ r           p ∨ (q ∨ r) ≡ (p ∨ q) ∨ r
Distributive laws    p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r)     p ∨ (q ∧ r) ≡ (p ∨ q) ∧ (p ∨ r)
Negation laws        p ∧ ∼p ≡ c                           p ∨ ∼p ≡ t
Absorption laws      p ∨ (p ∧ q) ≡ p                      p ∧ (p ∨ q) ≡ p
Dichotomies          ∼t ≡ c                               ∼c ≡ t
In this table, we use t and c to denote tautology and contradiction respectively. These are
interchangeable with T and F.
Exercise 1.1.10. Go through each equivalence listed here and develop a truth table to prove it,
until you are familiar with the proof-by-truth-table method.
It is recommended that you familiarize yourself with these rules, as invoking them tremendously
simplifies proofs later on.
Example 1.1.11
Simplify ((q ∨ p) ∧ q) ∨ (r ∧ (∼r ∧ ∼q)).
Solution. To keep it brief and simple, we will gloss over steps using the Commutative laws.
This method is much easier and less tedious than using a truth table, as long as the laws are invoked
correctly.
Often in life, things happen as a result of other things. In fact, there is a special operation in
logic that illustrates this cause-and-effect dynamic.
Consider the situation in which p is false. Regardless of what truth value q is, p → q will
automatically be true. In this case, p → q is vacuously true, since the conditional is irrelevant in
the first place.
Otherwise, consider p to be true. If q is true, then p → q will be true, since our “if-then”
relationship is satisfied. However, if q is false, then this fails our notion of “if p, then q” so p → q
would be considered false in this case.
These observations can be illustrated in the following truth table.
p q p→q
T T T
T F F
F T T
F F T
For future reference, whenever we are given a statement of the form p → q to prove, then we
should assume p to be true in our proof, because the statement is meaningless when p is false.
p → (q → r) ≡ (p → q) → r.
p → (q → r) ≡ T,
(p → q) → r ≡ F.
(p → r) ∧ (q → r) → ((p ∧ q) → r)
Solution. This statement is always true. We can construct a truth table and then analyze the truth values.
p q r p → r q → r p ∧ q (p → r) ∧ (q → r) (p ∧ q) → r (p → r) ∧ (q → r) → ((p ∧ q) → r)
T T T T T T T T T
T T F F F T F F T
T F T T T F T T T
T F F F T F F T T
F T T T T F T T T
F T F T F F F T T
F F T T T F T T T
F F F T T F T T T
The last column (which represents the possible truth values of the given statement) only has T
as a value, so the statement is always true.
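The same check can be sketched in Python. The helper `implies` below is our own name for the conditional; the code confirms that the implication holds in all eight rows and also tests the reverse implication, which turns out to fail in some row.

```python
from itertools import product

def implies(a, b):
    # The conditional a → b is false only when a is true and b is false.
    return not (a and not b)

def statement(p, q, r):
    # ((p → r) ∧ (q → r)) → ((p ∧ q) → r)
    return implies(implies(p, r) and implies(q, r), implies(p and q, r))

def reverse(p, q, r):
    # ((p ∧ q) → r) → ((p → r) ∧ (q → r))
    return implies(implies(p and q, r), implies(p, r) and implies(q, r))

assignments = list(product([True, False], repeat=3))
is_tautology = all(statement(*v) for v in assignments)
reverse_is_tautology = all(reverse(*v) for v in assignments)
print(is_tautology, reverse_is_tautology)
```

Because the reverse direction fails (for instance when p = T, q = F, r = F), the two sides of the arrow are not logically equivalent; only the one-way implication is always true.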
How do we represent the conditional in terms of the basic logical operators? Note that p → q is
guaranteed to be true when p is false, i.e. ∼p is true. The only other case in which p → q is true is
when both p and q are true. Ultimately, we either want ∼p to be true or q to be true. Thus, we
would expect that
p → q ≡ ∼p ∨ q.
Exercise 1.1.15. Use a truth table to demonstrate p → q ≡ ∼p ∨ q.
Definition 1.1.16. Given the conditional p → q, there are three variations:
2. The converse is q → p.
Are these variations necessarily logically equivalent to p → q? It turns out that the original
conditional and its contrapositive are logically equivalent, and so are the converse and inverse.
In some cases, we may find it easier to prove the contrapositive of some theorem statement,
rather than prove the given implication.
Given the original conditional, directly assuming that the converse is true is called the converse
error, and likewise for the inverse it is called the inverse error.
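These relationships can be confirmed by comparing truth-table columns. The sketch below assumes the standard forms of the variations, namely the contrapositive ∼q → ∼p and the inverse ∼p → ∼q; the helper `implies` is our own name.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

# Rows in the usual order: TT, TF, FT, FF.
rows = list(product([True, False], repeat=2))

original       = [implies(p, q) for p, q in rows]
converse       = [implies(q, p) for p, q in rows]
inverse        = [implies(not p, not q) for p, q in rows]
contrapositive = [implies(not q, not p) for p, q in rows]

print(original == contrapositive)  # conditional vs. contrapositive
print(converse == inverse)         # converse vs. inverse
print(original == converse)        # conditional vs. converse
```

The first two comparisons agree in every row while the third does not, which is why assuming the converse (or the inverse) from the original conditional is an error.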
Definition 1.1.17. The biconditional of statements p and q is denoted as p ↔ q. We read this as
“p if and only if q.” This is only true when p and q have the same truth value.
Some mathematical texts will abbreviate “if and only if” to simply “iff.”
Exercise 1.1.18. Prove that p ↔ q ≡ (p → q) ∧ (q → p).
Notice that p ↔ q is true when p → q and its converse are both true. Usually, for mathematical
theorems, when we want to show that the initial implication and its converse are both true, we
use ↔ to emphasize both directions. When we need to prove such theorems, we often
prove p → q and q → p separately. They are respectively referred to as the ‘right direction’ and
‘left direction.’
Notice that the conditions needed for p ↔ q to be true are the opposite of those for p ⊕ q. Indeed,
the following equivalence also holds true.
Exercise 1.1.19. Prove that p ↔ q ≡ ∼(p ⊕ q).
Problem 1.1.20. Express the disjunction, conditional, biconditional, and exclusive-or only in terms
of conjunction and negation.
Solution. We take previous results and apply DeMorgan’s laws when needed.
p ∨ q ≡ ∼(∼p ∧ ∼q)
p → q ≡ ∼p ∨ q
≡ ∼(p ∧ ∼q)
p ↔ q ≡ (p → q) ∧ (q → p)
≡ ∼(p ∧ ∼q) ∧ ∼(q ∧ ∼p)
p ⊕ q ≡ ∼(p ↔ q)
≡ ∼(∼(p ∧ ∼q) ∧ ∼(q ∧ ∼p))
p→q
p
∴q
This form is known as modus ponens, and it is the most basic rule of inference in logic. The
statements p → q and p are called premises, while q is the conclusion.
Consider another valid argument form,
p→q
∼q
∴ ∼p
and this is called modus tollens. By considering the contrapositive, it should be clear why this
is true.
An argument is valid if, whenever all the premises are true, the conclusion is also true.
Example 1.2.1
Prove that modus ponens is valid.
Premises Conclusion
p q p→q p q
T T T T T
T F F T F
F T T F T
F F T F F
We only consider the row(s) in which all of the premises are true. A row which satisfies this is called
a critical row. The argument is valid if for each critical row, the conclusion(s) is also true. In this
example, we see that the only critical row is the first row, and q is true in that row, so the argument
is valid.
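The critical-row test translates directly into code. In this sketch, `is_valid` is a hypothetical helper of our own: it scans every row, keeps only the critical rows, and reports whether the conclusion holds in all of them. It is applied to modus ponens and, for contrast, to the converse error.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def is_valid(premises, conclusion, num_vars):
    """An argument form is valid when the conclusion is true in every critical row."""
    for values in product([True, False], repeat=num_vars):
        if all(premise(*values) for premise in premises):   # this is a critical row
            if not conclusion(*values):
                return False
    return True

# Modus ponens: p → q, p, therefore q.
modus_ponens = is_valid([lambda p, q: implies(p, q), lambda p, q: p],
                        lambda p, q: q, 2)

# Converse error: p → q, q, therefore p.
converse_error = is_valid([lambda p, q: implies(p, q), lambda p, q: q],
                          lambda p, q: p, 2)

print(modus_ponens, converse_error)
```

The converse error is rejected because the row p = F, q = T is critical for its premises while its conclusion p is false there.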
Of course, argument forms can be much more complex, rendering the truth-table method tedious
and time-inefficient. Like before, we have a set of tools at our disposal, called the rules of inference:
It can be verified using the truth table that all of these are valid.
Lastly, there is another important argument form called the rule of contradiction:
∼p → F
∴p
This is the essence of the popular “proof by contradiction”: if the negation of a given statement
leads to a false conclusion, then the statement has to be true. Proof by contradiction will become a
very useful tool for proofs in the future.
Example 1.2.2
Demonstrate the following argument’s validity or invalidity.
p∨q (1)
q→r (2)
(p ∧ s) → t (3)
∼r (4)
∼q → (u ∧ s) (5)
∴t
Note that the variable t here represents a statement rather than a tautology.
Proof. We will show that this is valid, through a proof with multiple steps.
Step 1:
q→r (by 2)
∼r (by 4)
∴ ∼q (modus tollens)
Step 2:
p∨q (by 1)
∼q (by step 1)
∴p (disjunctive syllogism)
Step 3:
∼q → (u ∧ s) (by 5)
∼q (by step 1)
∴u∧s (modus ponens)
Step 4:
u ∧ s (by step 3)
∴s (conjunctive simplification)
Step 5:
p (by step 2)
s (by step 4)
∴p∧s (conjunctive addition)
Step 6:
(p ∧ s) → t (by 3)
p∧s (by step 5)
∴t (modus ponens)
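As a cross-check on the step-by-step proof, the argument can also be tested by brute force over all 2^6 = 64 assignments of p, q, r, s, t, u. This is only a sketch; the helper `implies` and the list name `critical_rows` are our own.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

critical_rows = []
for p, q, r, s, t, u in product([True, False], repeat=6):
    premises = [
        p or q,                    # (1)
        implies(q, r),             # (2)
        implies(p and s, t),       # (3)
        not r,                     # (4)
        implies(not q, u and s),   # (5)
    ]
    if all(premises):
        critical_rows.append((p, q, r, s, t, u))

# The argument is valid when the conclusion t is true in every critical row.
argument_is_valid = all(row[4] for row in critical_rows)
print(critical_rows, argument_is_valid)
```

Only one row makes all five premises true, and the values it forces (q and r false; p, s, t, u true) are the same ones the six steps derive in turn.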
p→q∧r (1)
∼q (2)
∴ ∼p
Proof. You may use a truth table and examine the truth values of the critical rows, but rules of
inference are still applicable:
Step 1:
∼q (by 2)
∴ ∼q ∨ ∼r (disjunctive addition)
Step 2:
∼q ∨ ∼r (by step 1)
∴ ∼(q ∧ r) (DeMorgan’s laws)
Step 3:
p→q∧r (by 1)
∼(q ∧ r) (by step 2)
∴ ∼p (modus tollens)
Alternative Proof. The argument is valid precisely when the conclusion must be true whenever the
premises are all true. Thus, assume that the premises are true.
If ∼q is true, then q must be false. This then implies that q ∧ r is also false, by universal bound
laws. By modus tollens, p must be false, i.e. ∼p is true, and we are done.
A predicate is a declaration involving unknown variables, and if values were assigned to these
variables, the predicate would become a statement with a truth value. For example:
The domain is the set of all values that can be assigned to the predicate variables.
Quantifiers specify how many elements of the domain yield a true statement when substituted
into the predicate. There are two types:
Definition 1.3.1. The universal quantifier, denoted by ∀, indicates that all elements in the
domain result in a true statement when substituted into the predicate.
For instance, ∀x ∈ D, P (x) is read as “for all x in D, P (x) is true.” Thus, this quantified
statement is true precisely when P (x) is true for every x in D.
The symbol ∈ when we have x ∈ D indicates that x is an element of set D, where a set is a
collection of elements (refer to Definition 2.1.1). For example, 1 ∈ Z is true, but π ∈ Q is false.
We will go over sets in greater detail in the next chapter.
Definition 1.3.2. The existential quantifier, denoted by ∃, indicates that there is some element
in the domain that results in a true statement when substituted into the predicate.
For example, ∃x ∈ D, P (x) is read as “there exists an x in D such that P (x) is true.” Thus, this
quantified statement is true precisely when P (x) is true for some x in D.
Often, in math, we use quantified statements with respect to sets of numbers. Here is a list of
the generally accepted symbols for well-known sets.
R − Real numbers
Q − Rational numbers
Z − Integers
W − Whole numbers
N − Natural numbers
C − Complex numbers
H − Quaternions
In particular, you should be able to distinguish and understand the first five sets. As a note,
the natural numbers are defined starting from 1, while the whole numbers are the natural numbers
including 0.
Lastly, before proceeding further, it will be assumed that you are familiar with interval notation
from previous experience with algebra.
1. ∀x ∈ R, x^2 ≥ 0
2. ∃x ∈ Z, x^2 < 1
3. ∀x ∈ R, x ∈ Q
4. ∀x ∈ Z, x ∈ Q → x ∈ R
Solution. The quantifier can change the whole meaning of the statement, so be sure not to get
confused.
2. True. There is an existential quantifier, so the statement will be true provided that we can
give at least one example: note that x = 0 satisfies x^2 < 1.
3. False. Analogously, for a universal quantifier, we can provide at least one counterexample to
disprove the statement: note that x = π is a real but not rational number.
4. True. All integers are rational numbers, and all rational numbers are real numbers.
1. ∀n ∈ Z, 2 | n → 4 | n
2. ∀n ∈ Z, 4 | n → 2 | n
3. ∃n ∈ Z, 2 | n → 4 | n
4. ∀x ∈ R, x^2 > 0
5. ∃x ∈ R, x^3 − x^2 + 4972x − 11.62π = 0
Solution. You may not have encountered some of the math used here, and that is alright. We will
eventually examine some of these closely in later chapters.
1. False. Counterexample: n = 6.
2. True. Let n = 4k for some k ∈ Z. Then n = 2(2k), and since 2k ∈ Z, we must
have 2 | n.
3. True. Example: n = 8.
4. False. Counterexample: x = 0.
5. True. By the Intermediate Value Theorem, all cubics have at least one real root.
6. True. The range of sec x is (−∞, −1] ∪ [1, +∞), which contains 29. Therefore there must be
some x which yields this number.
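Python's built-in `all` and `any` behave like ∀ and ∃ over a finite collection, which makes them handy for hunting counterexamples and witnesses. The sketch below uses a small sample of integers chosen only for illustration. A finite search can disprove a universal statement or confirm an existential one, but it cannot prove a universal statement over all of Z.

```python
def implies(a, b):
    return (not a) or b

sample = range(-20, 21)   # a finite, hypothetical sample of integers

# Statement 1: ∀n ∈ Z, 2 | n → 4 | n. Any counterexample disproves it.
counterexamples_1 = [n for n in sample if not implies(n % 2 == 0, n % 4 == 0)]

# Statement 2: ∀n ∈ Z, 4 | n → 2 | n. Holding on the sample is evidence, not a proof.
holds_2_on_sample = all(implies(n % 4 == 0, n % 2 == 0) for n in sample)

# Statement 3: ∃n ∈ Z, 2 | n → 4 | n. A single witness proves it.
witnesses_3 = [n for n in sample if implies(n % 2 == 0, n % 4 == 0)]

print(6 in counterexamples_1, holds_2_on_sample, 8 in witnesses_3)
```

Note that every odd n is also a witness for statement 3, because the conditional is vacuously true when 2 does not divide n.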
Problem 1.3.5. Let L(x, y) denote “x likes y.” For each of the following statements with double
quantifiers,
∀x ∀y L(x, y),
∀x ∃y L(x, y),
∃x ∀y L(x, y),
∃x ∃y L(x, y),
∃x ∀y L(y, x),
Solution. This problem is meant to develop your understanding of universal and existential quantifiers,
as well as their relations to each other in a symbolic statement. Notice how switching the order or
type of quantifier can drastically change the meaning of the sentence.
To clarify some ambiguity, the statement “everyone likes someone” suggests that each person
likes somebody else, but the person who is liked can vary depending on the person who is liking.
However, the statement “there’s someone whom everybody likes” suggests that everybody likes one
particular, common person.
How would we negate a quantified statement? First, let’s consider one with the existential
quantifier, ∃x ∈ D, P (x).
There just has to be at least one value of x, when substituted into the predicate, that yields a
true statement. Therefore, we can rewrite it as a series of predicates all connected by disjunctions:
∃x ∈ D, P(x) ≡ P(x_1) ∨ P(x_2) ∨ P(x_3) ∨ · · · ∨ P(x_n),
where D = {x_1, x_2, x_3, . . . , x_n}, since a statement with a disjunction only needs at least one of the
P(x_i) to be true.
We wish to find the negation of this. In fact, DeMorgan’s laws can be generalized for a series of
disjunctions, such that
∼(P(x_1) ∨ P(x_2) ∨ · · · ∨ P(x_n)) ≡ ∼P(x_1) ∧ ∼P(x_2) ∧ · · · ∧ ∼P(x_n).
But if we have a series of negated predicates ∼P(x_i) joined together by conjunctions, then this
suggests that the universal quantifier should be applied. It follows that
∼(∃x ∈ D, P(x)) ≡ ∀x ∈ D, ∼P(x).
Exercise 1.3.6. Use the same reasoning as above to show that ∼ (∀x ∈ D, P (x)) ≡ ∃x ∈ D, ∼P (x).
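On a finite domain, both negation rules can be observed directly with `any` (for ∃) and `all` (for ∀). The domain and the three predicates below are arbitrary choices made for illustration; this is a sanity check on examples, not a proof of the rules.

```python
D = range(-3, 4)   # a small finite domain chosen for illustration

predicates = {
    "x^2 < 1": lambda x: x * x < 1,
    "x is even": lambda x: x % 2 == 0,
    "x > 10": lambda x: x > 10,
}

results = {}
for name, P in predicates.items():
    not_exists = not any(P(x) for x in D)    # ∼(∃x ∈ D, P(x))
    forall_not = all(not P(x) for x in D)    # ∀x ∈ D, ∼P(x)
    not_forall = not all(P(x) for x in D)    # ∼(∀x ∈ D, P(x))
    exists_not = any(not P(x) for x in D)    # ∃x ∈ D, ∼P(x)
    results[name] = (not_exists == forall_not, not_forall == exists_not)

print(results)
```

Every pair in `results` is `(True, True)`: in each case the negated existential matches the universal of the negation, and the negated universal matches the existential of the negation.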
Solution. We simply switch the ∃ symbol to a ∀ symbol, and then negate the remainder of the
statement, which is 2 | n → 4 | n. Recall that p → q ≡ ∼p ∨ q, so ∼(2 | n → 4 | n) ≡ ∼(2 ∤ n ∨ 4 | n)
≡ 2 | n ∧ 4 ∤ n by DeMorgan’s law. Thus, we conclude
∀n ∈ Z, 2 | n ∧ 4 ∤ n.
Note: this is the epsilon-delta definition of the limit, and we will go over this extensively in a
later chapter.
Solution. This statement has a lot of quantifiers strung together, but we can dissect each part one
by one:
Problem 1.3.9. Determine if the following statement is true or false, and explain. Then, find its
negation.
∀x ∈ Q ∃y ∈ Z+ , xy ∈ Z.
Solution. The statement is true. By the definition of a rational number, let x = m/n such that
m ∈ Z, n ∈ Z+. Let y = n (this choice of y is allowed because the existential quantifier indicates
that a certain y can be found based on any given x). Therefore, xy = (m/n) · n = m, which has already
been established as an integer.
The negation would be ∃x ∈ Q ∀y ∈ Z+, xy ∉ Z.
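The choice y = n can be tried out with Python's `fractions.Fraction`, which stores each rational number in lowest terms with a positive denominator. The function name `witness` and the sample values are our own; the sketch only illustrates the construction on a few inputs.

```python
from fractions import Fraction

def witness(x):
    """Return a positive integer y such that x * y is an integer, using y = denominator of x."""
    return x.denominator   # Fraction always keeps the denominator positive

samples = [Fraction(3, 4), Fraction(-7, 5), Fraction(6, 3), Fraction(0, 9)]
products = [(x, witness(x), x * witness(x)) for x in samples]

for x, y, xy in products:
    print(x, y, xy)
```

Since y depends on x, this mirrors the quantifier order ∀x ∃y: a different y may be chosen for each rational number x.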
Problem 1.3.10. Determine if the following statement is true or false for U = R and U = Z
respectively, and explain. Then, find its negation.
Solution. If U = R, then the statement is true. To demonstrate this, let z = (x + y)/2, i.e. take the
average of x and y. Again, as z was defined using an existential quantifier, we should express z in
terms of x and y, which represent any real numbers (because of their universal quantifiers). As all
are real numbers, x > (x + y)/2 > y is clearly satisfied.
If U = Z, then the statement is false. It suffices to show one counterexample: let x = 2, y = 1.
Then there cannot be any integer z for which 2 > z > 1.
To find the negation, we dissect the statement similar to before:
Remember that x > z > y is actually an “AND” statement: x > z > y ≡ x > z ∧ z > y. Therefore
its negation would be x ≤ z ∨ z ≤ y, so the complete negation would be
∃x ∈ U, ∃y ∈ U, (x > y ∧ ∀z ∈ U, x ≤ z ∨ z ≤ y) .
Problem 1.3.11. Determine for each statement whether it is true or false, and justify.
1. ∀x ∈ R, ∃y ∈ R, y^2 = x
2. ∀x ∈ Z, ∃y ∈ Q, xy ∈ Z
3. ∀x ∈ Q, ∃y ∈ Z, xy ∈ Z
4. ∃x ∈ R, ∃y ∈ R, |x − y| > 7 ∧ x^2 + y^2 = 22
Solution.
1. False. Consider x < 0. There is no real number whose square is a negative real number. You
could specify a particular negative value of x to be specific in demonstrating the failure.
2. True. Some rational numbers are also integers. Then, choose y to be a rational number that
is also an integer, such that xy will be an integer. We have freedom in choosing y because of
the existential quantifier.
3. True. Simply let y = 0 for any given rational number x. Clearly x · 0 = 0 ∈ Z, so we’re done.
4. False. The condition |x − y| > 7 indicates that x and y are more than 7 apart, so
(x − y)^2 > 49. By the Trivial Inequality, (x + y)^2 ≥ 0. Adding these two inequalities gives
2(x^2 + y^2) = (x − y)^2 + (x + y)^2 > 49, so x^2 + y^2 > 24.5, which fails x^2 + y^2 = 22.
Thus, it is impossible to have both |x − y| > 7 and x^2 + y^2 = 22 for any real numbers x and y.
Problem 1.3.12. Determine for each statement whether it is true or false, and justify.
1. ∀x ∈ Q, ∃y ∈ Z+ , xy ∈ Z
2. ∀x ∈ Z ∃y ∈ Z ∀z ∈ Z, x ≠ z −→ |x − y| ≤ |x − z|
3. ∀x ∈ R, ∀y ∈ R, ∃z ∈ R, xz = y
Solution.
1. True. Let x = m/n, where m ∈ Z and n ∈ Z+, and let y = n. Then xy = (m/n) · n = m, which is
in Z, so we’re done.
For a brief interlude, let’s see how we can combine quantifiers with argument forms. Consider
the following premises:
P (1)
∀n ∈ Z+ , P (n) → P (n + 1)
Given the initial case P (1), we can repeatedly apply modus ponens, so the second premise results
in infinitely many conclusions.
P(1), P(1) → P(2)    ∴ P(2)
P(2), P(2) → P(3)    ∴ P(3)
P(3), P(3) → P(4)    ∴ P(4)
. . .
As shown above: ∀n ∈ Z+ , statements P (1), P (2), P (3), P (4), . . . , P (n) are all true. Therefore,
the complete argument is expressed as:
P (1)
∀n ∈ Z+ , P (n) → P (n + 1)
∴ ∀n ∈ Z+ , P (n)
This argument form is known as mathematical induction, with base case P (1) and inductive
step P (n) → P (n + 1). This is a very important technique for proving certain theorems that we will
go over closely in a later chapter.
Chapter 2
Set Theory
Every field of mathematics uses or refers to sets in one way or another. As the last chapter may
have shown, we need to use sets when dealing with quantified statements. In this chapter, we will
review some basic aspects of set theory, building from the background of the last chapter.
2.1 Introduction
Even then, we should have a basic sense of what sets are. You probably already have encountered
sets in your mathematical education so far, including R, Q, Z.
For notation, we will use capital letters to refer to sets and lowercase letters to refer to elements.
We state x ∈ A if x is an element of A.
Definition 2.1.2. The universal set, U, is the set that contains all elements.
Definition 2.1.3. The empty set, ∅, is the set that contains no elements.
A = B ←→ ∀x ∈ U, x ∈ A ↔ x ∈ B.
A ⊆ B ←→ ∀x ∈ U, x ∈ A → x ∈ B.
Exercise 2.1.6. Under what condition would A ⊈ B (A is not a subset of B)? Hint: negate
Definition 2.1.5.
Theorem 2.1.7
A = B ←→ A ⊆ B ∧ B ⊆ A.
A = B ←→ ∀x ∈ U, x ∈ A ↔ x ∈ B
A = B ←→ ∀x ∈ U, (x ∈ A → x ∈ B) ∧ (x ∈ B → x ∈ A)
∴ A = B ←→ A ⊆ B ∧ B ⊆ A
Theorem 2.1.8
A ⊆ B ∧ B ⊆ C −→ A ⊆ C.
Definition 2.1.10. Let the set A^c denote the complement of set A. Then,
x ∈ A^c ←→ x ∈ U ∧ x ∉ A.
∀x ∈ U, x ∈ A^c ←→ x ∉ A ←→ ∼(x ∈ A).
Problem 2.1.11. (A^c)^c = A.
∀x ∈ U, x ∈ (A^c)^c ←→ ∼(x ∈ A^c)
∀x ∈ U, x ∈ (A^c)^c ←→ ∼(∼(x ∈ A))
∀x ∈ U, x ∈ (A^c)^c ←→ x ∈ A
∴ (A^c)^c = A.
Problem 2.1.12. A ⊆ B −→ B^c ⊆ A^c.
Proof. This results from the implication being logically equivalent to its contrapositive.
A ⊆ B −→ ∀x ∈ U, (x ∈ A → x ∈ B)
A ⊆ B −→ ∀x ∈ U, (∼(x ∈ B) → ∼(x ∈ A))
A ⊆ B −→ ∀x ∈ U, x ∈ B^c → x ∈ A^c
∴ A ⊆ B −→ B^c ⊆ A^c.
Proof. Problem 2.1.12 gives us the right direction. For the left direction, note that Problem 2.1.12
also tells us that B^c ⊆ A^c −→ (A^c)^c ⊆ (B^c)^c. Then by Problem 2.1.11, (B^c)^c = B and (A^c)^c = A, so
B^c ⊆ A^c −→ A ⊆ B. It follows that A ⊆ B ←→ B^c ⊆ A^c.
Solution. It depends on the universe. Any set that does NOT contain any of 1, 2, or 3 is an answer.
{1, 2, 3}^c                                  U
∅                                            {1, 2, 3}
{4, 5}                                       {1, 2, 3, 4, 5}
(−∞, 1) ∪ (1, 2) ∪ (2, 3) ∪ (3, +∞)          R
The last example uses notation that may not be familiar; we will introduce it in detail in the
next section.
A ∩ B = {x ∈ U | x ∈ A ∧ x ∈ B}.
A ∪ B = {x ∈ U | x ∈ A ∨ x ∈ B}.
Solution. Remember that the intersection contains only elements both A and B have in common,
while the union contains elements from either A or B.
A ∩ B = {3}.
A ∪ B = {1, 2, 3, 4, 5}.
Consider the following diagram. This gives a visual representation of the intersection and union
of two sets.
[Figure: Venn diagram of two overlapping sets A and B, with the regions A ∩ B and A ∪ B shaded.]
Proof. For a given element x ∈ U, if we assume that x ∈ A, then by disjunctive addition,
x ∈ A ∨ x ∈ B. But this implies that x ∈ A ∪ B. Thus, ∀x ∈ U, x ∈ A −→ x ∈ A ∨ x ∈ B, which
gives us A ⊆ A ∪ B.
Proof. The proofs involve direct application of DeMorgan’s laws. I encourage you to go through and
justify each step of the process.
∀x ∈ U, x ∈ (A ∪ B)^c ←→ ∼(x ∈ A ∪ B)
∀x ∈ U, x ∈ (A ∪ B)^c ←→ ∼(x ∈ A ∨ x ∈ B)
∀x ∈ U, x ∈ (A ∪ B)^c ←→ ∼(x ∈ A) ∧ ∼(x ∈ B)
∀x ∈ U, x ∈ (A ∪ B)^c ←→ x ∈ A^c ∧ x ∈ B^c
∀x ∈ U, x ∈ (A ∪ B)^c ←→ x ∈ A^c ∩ B^c
∴ (A ∪ B)^c = A^c ∩ B^c.
∀x ∈ U, x ∈ (A ∩ B)^c ←→ ∼(x ∈ A ∩ B)
∀x ∈ U, x ∈ (A ∩ B)^c ←→ ∼(x ∈ A ∧ x ∈ B)
∀x ∈ U, x ∈ (A ∩ B)^c ←→ ∼(x ∈ A) ∨ ∼(x ∈ B)
∀x ∈ U, x ∈ (A ∩ B)^c ←→ x ∈ A^c ∨ x ∈ B^c
∀x ∈ U, x ∈ (A ∩ B)^c ←→ x ∈ A^c ∪ B^c
∴ (A ∩ B)^c = A^c ∪ B^c.
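Python's built-in `set` type offers `|` for union, `&` for intersection, and `-` for set difference, so both laws can be tried on a concrete example. The universe and the sets A and B below are hypothetical values picked for illustration, and `complement` is our own helper that takes the complement relative to U.

```python
U = set(range(1, 11))      # hypothetical universe {1, ..., 10}
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

def complement(S):
    """Complement of S relative to the universe U."""
    return U - S

# The complement of a union is the intersection of the complements.
first_law = complement(A | B) == complement(A) & complement(B)

# The complement of an intersection is the union of the complements.
second_law = complement(A & B) == complement(A) | complement(B)

print(complement(A | B), first_law, second_law)
```

A single example cannot replace the proof above, but it is a quick way to catch a mistaken guess, such as swapping ∩ and ∪ on only one side.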
Exercise 2.2.9. By directly applying the associative law of logic, prove Theorem 2.2.8.
Exercise 2.2.11. In the same style as the previous theorems, prove Theorem 2.2.10.
Example 2.2.12
Prove A ∩ B = A ∩ C ∧ A ∪ B = A ∪ C −→ B = C.
2. If x ∉ A, then by disjunctive addition, x ∈ B ∨ x ∈ A (since we have assumed x ∈ B to be
true), or x ∈ A ∪ B. We are given that A ∪ B = A ∪ C, so x ∈ A ∪ C, from which we get
x ∈ A ∨ x ∈ C. However, we know that x ∉ A, so by disjunctive syllogism, x ∈ C, the same
result as the previous case.
Thus, whether x ∈ A or x ∉ A, we can conclude that x ∈ B → x ∈ C.
The proof of x ∈ C → x ∈ B is very similar, and will be left to the reader.
Alternative Proof. Note that the theorem’s statement can be broken down into:
p ∧ q ←→ p ∧ r
p ∨ q ←→ p ∨ r
∴ q ←→ r
It is sufficient to prove that this argument form is valid, i.e. if the premises are true, then the
conclusion must also be true. We can either use a truth table or casework to do so.
Since the truth table method can be time-consuming and tedious, I will proceed by casework on
p in particular.
If p is true, then we have
T ∧ q ←→ T ∧ r,
T ∨ q ←→ T ∨ r.
q ←→ r,
T ←→ T.
Clearly, T ←→ T is always true, so we can ignore this part. Remember that we want to show
that if the premises are true, then the conclusion must also be true. Indeed, if q ←→ r is true, then
the conclusion must obviously be true since it happens to be the same statement as this premise.
If p is false, then we have
F ∧ q ←→ F ∧ r,
F ∨ q ←→ F ∨ r.
F ←→ F,
q ←→ r.
Obviously F ←→ F is always true. With the same reasoning as above, if the premise q ←→ r is
true, then the conclusion is true since it is the same statement.
We have shown that this argument form is valid, thus the theorem itself must be true.
We now introduce a new class of problems that requires some intuition with sets.
Problem 2.2.14. Prove or disprove the following statements.
1. A ∩ B = A ∩ C −→ B = C.
2. A ⊆ B ∪ C −→ A ⊆ B ∨ A ⊆ C.
3. A ∩ (B ∪ C) = (A ∩ B) ∪ C.
Solution. It is helpful to use Venn Diagrams as a visual aid when thinking about these statements.
Then consider counterexamples to disprove them. However, Venn Diagrams are not rigorous enough
to prove a true statement. Additionally, when disproving a statement, do not forget to demonstrate
how your counterexample actually fails it.
1. False. Consider the counterexample: Let A = {1, 2}, B = {2, 4}, C = {2, 3}. To demonstrate
its failure, we note that A ∩ B = {2} and A ∩ C = {2}, so the condition that A ∩ B = A ∩ C
is satisfied. However, B ≠ C, failing the statement.
A ⊆ B ∪ C −→ ∀x ∈ U, x ∈ A → x ∈ B ∪ C
A ⊆ B ∪ C −→ ∀x ∈ U, x ∈ A → (x ∈ B ∨ x ∈ C)
A ⊆ B ∪ C −→ ∀x ∈ U, ∼(x ∈ A) ∨ (x ∈ B ∨ x ∈ C)
A ⊆ B ∪ C −→ ∀x ∈ U, ∼(x ∈ A) ∨ ∼(x ∈ A) ∨ x ∈ B ∨ x ∈ C
A ⊆ B ∪ C −→ ∀x ∈ U, (∼(x ∈ A) ∨ x ∈ B) ∨ (∼(x ∈ A) ∨ x ∈ C)
A ⊆ B ∪ C −→ ∀x ∈ U, (x ∈ A → x ∈ B) ∨ (x ∈ A → x ∈ C)
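However, the last line does not yield A ⊆ B ∨ A ⊆ C, because ∀ does not distribute over ∨:
different elements x may satisfy different disjuncts. The statement is in fact false. For example,
A = {1, 2}, B = {1}, C = {2} satisfies A ⊆ B ∪ C, yet A ⊈ B and A ⊈ C.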
1. (A − B) ∪ (B − C) = (A − C).
2. (A − C) ∪ (B − C) = (A ∪ B) − C.
3. (A − C) = C − A.
4. (A − B) − C = (A − C) − B.
5. A ⊆ C ∧ B ⊆ C ←→ (A ∪ B) ⊆ C.
6. A ⊆ C ←→ A − C = ∅.
7. A ⊆ B −→ A ∩ (B ∩ C) = ∅.
Solution. Note: we will leave it to the reader to demonstrate how the counterexamples fail their
statements.
(A − C) ∪ (B − C) ←→ (A ∩ C^c) ∪ (B ∩ C^c)
(A − C) ∪ (B − C) ←→ ∀x ∈ U, (x ∈ A ∧ x ∉ C) ∨ (x ∈ B ∧ x ∉ C)
(A − C) ∪ (B − C) ←→ ∀x ∈ U, x ∉ C ∧ (x ∈ A ∨ x ∈ B)
∴ (A − C) ∪ (B − C) ←→ (A ∪ B) − C.
(A − B) − C = (A ∩ B^c) ∩ C^c
(A − B) − C = C^c ∩ (A ∩ B^c)
(A − B) − C = (C^c ∩ A) ∩ B^c
(A − B) − C = (A ∩ C^c) ∩ B^c
∴ (A − B) − C = (A − C) − B.
5. True. Proof.
A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, (x ∈ A → x ∈ C) ∧ (x ∈ B → x ∈ C)
A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, (x ∉ A ∨ x ∈ C) ∧ (x ∉ B ∨ x ∈ C)
A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, (x ∈ C ∨ x ∉ A) ∧ (x ∈ C ∨ x ∉ B)
A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, x ∈ C ∨ (x ∉ A ∧ x ∉ B)
A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, (x ∉ A ∧ x ∉ B) ∨ x ∈ C
A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, ∼(x ∈ A ∨ x ∈ B) ∨ x ∈ C
A ⊆ C ∧ B ⊆ C ←→ ∀x ∈ U, (x ∈ A ∨ x ∈ B) → x ∈ C
∴ A ⊆ C ∧ B ⊆ C ←→ (A ∪ B) ⊆ C.
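Before attempting a proof, it can help to test a set identity on every choice of A, B, and C drawn from a small universe. The sketch below does this for statements 1, 2, 4, and 5 of the problem, using a hypothetical three-element universe. Any failure is a genuine counterexample, while zero failures is only evidence that a proof is worth attempting.

```python
from itertools import combinations, product

U = (1, 2, 3)   # small hypothetical universe: 8 subsets, 512 triples (A, B, C)
subsets = [frozenset(c) for k in range(len(U) + 1) for c in combinations(U, k)]

# Keys are the statement numbers from the problem; <= is Python's subset test.
statements = {
    "1": lambda A, B, C: (A - B) | (B - C) == A - C,
    "2": lambda A, B, C: (A - C) | (B - C) == (A | B) - C,
    "4": lambda A, B, C: (A - B) - C == (A - C) - B,
    "5": lambda A, B, C: (A <= C and B <= C) == ((A | B) <= C),
}

failure_counts = {
    name: sum(1 for triple in product(subsets, repeat=3) if not holds(*triple))
    for name, holds in statements.items()
}
print(failure_counts)
```

Statement 1 fails on some triples (for example A = ∅, B = {1}, C = ∅), whereas statements 2, 4, and 5 show no failures on this universe.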
Theorem 2.2.35
(A △ B) △ C = A △ (B △ C).
Proof. First, draw a Venn Diagram to get a sense of this expression. Realize that A △ B is the
exclusive-or (XOR) of sets A and B. Then,
A △ B ←→ ∀x ∈ U, x ∈ A △ B
A △ B ←→ ∀x ∈ U, (x ∈ A ∧ ∼(x ∈ B)) ∨ (x ∈ B ∧ ∼(x ∈ A))
A △ B ←→ ∀x ∈ U, x ∈ A ⊕ x ∈ B.
Using a truth table, we can show that for any given statements p, q, and r, (p⊕q)⊕r ≡ p⊕(q ⊕r),
or in other words, demonstrate that the XOR operation is associative.
p q r p ⊕ q q ⊕ r (p ⊕ q) ⊕ r p ⊕ (q ⊕ r)
T T T F F T T
T T F F T F F
T F T T T F F
T F F T F T T
F T T T F F F
F T F T T T T
F F T F T T T
F F F F F F F
The resulting truth values prove their logical equivalence. Therefore, we can conclude:
∀x ∈ U, (x ∈ A ⊕ x ∈ B) ⊕ x ∈ C ←→ x ∈ A ⊕ (x ∈ B ⊕ x ∈ C)
∀x ∈ U, x ∈ (A △ B) △ C ←→ x ∈ A △ (B △ C)
∴ (A △ B) △ C = A △ (B △ C).
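The truth-table argument above can also be checked mechanically. The following is a minimal sketch, not part of the original proof: the helper names and the three sample sets are hypothetical, and the only library fact relied on is that Python's `^` operator on sets computes the symmetric difference.

```python
from itertools import product


def xor(p: bool, q: bool) -> bool:
    """Exclusive or: true when exactly one of p, q is true."""
    return p != q


def xor_is_associative() -> bool:
    """Compare (p XOR q) XOR r with p XOR (q XOR r) on all eight truth assignments."""
    return all(
        xor(xor(p, q), r) == xor(p, xor(q, r))
        for p, q, r in product([True, False], repeat=3)
    )


# Hypothetical sample sets, chosen only for illustration.
A, B, C = {1, 2, 3, 4}, {3, 4, 5, 6}, {1, 4, 6, 7}
left = (A ^ B) ^ C   # set ^ set is the symmetric difference
right = A ^ (B ^ C)

print(xor_is_associative())
print(left == right)
```

A check on one triple of sets is only an illustration; the truth table is what proves the statement for all sets.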
Chapter 3
Fields
By middle school, we take many things in mathematics for granted, especially addition, subtraction,
multiplication, and division. But where do all of these come from? In this chapter, we discuss why
we can use such operations, and establish the foundations from scratch. Hopefully, after finishing
this chapter, you should have a deeper appreciation for some of the seemingly trivial algebraic
manipulations and techniques that we apply almost subconsciously in math problems.
An axiom is a statement that we assume to be true, from which we develop further mathematical
results and consequences. In other words, it is a starting point. Here, we provide a list of field
axioms:
Definition 3.1.1. A field is a set F with two operations, typically called + and ×, satisfying:
f) Inverses: ∀a ∈ F, ∃ −a ∈ F s.t. a + −a = 0.
∀a ∈ F, a ≠ 0, ∃ a⁻¹ ∈ F s.t. a(a⁻¹) = 1.
So first, we start off with addition and multiplication. Notice that subtraction and division can
also arise from these properties, and we will get to those operations later.
Some examples of fields are the sets Q, R, and C; take a moment to convince yourself that
these properties hold for these sets.
In addition, the set Zp (all integers mod p, where p is prime) is a field; it is a well-known result
in number theory that every nonzero integer modulo p has a multiplicative inverse. However, the set of
integers, Z, is not a field because not every nonzero integer has a multiplicative inverse that is also an
integer.
Remark 3.1.2. When we say “n mod m” or “n modulo m,” we are referring to the remainder when
n is divided by m. For example, 8 mod 5 is 3, and we write 8 ≡ 3 (mod 5) as shorter notation.
Other examples include 33 ≡ 3 (mod 10), 89 ≡ 5 (mod 7), etc. Furthermore, Zn refers to the set of
integers mod n. For instance, Z3 = {0, 1, 2}.
Considering numbers modulo some other number is part of an entire topic of mathematics called
modular arithmetic, and this is part of number theory. You are welcome to search for any outside
resources pertaining to this topic.
Problem 3.1.3. Do the integers mod 8 form a field? What about the integers mod 27?
Solution. In order for a set of numbers to be a field, all of the properties listed above must be
satisfied. Now for Z8 , consider if 4 (which is in this set) has a multiplicative inverse (let this be x).
Then we want
4x ≡ 1 (mod 8),
and this can be rewritten as 4x = 8k + 1, where k ∈ Z. Rearrange this to 4x − 8k = 1, and clearly
the left hand side is even while the right hand side is odd. Thus, there is no x which satisfies this, so
4 does not have a multiplicative inverse. Therefore, Z8 cannot be a field.
Likewise, for Z27 , consider the multiplicative inverse of 3.
3x ≡ 1 (mod 27).
Then we have 3x = 27k + 1 for k ∈ Z. Note that the left hand side is divisible by 3, while the
right hand side will leave a remainder of 1 when divided by 3, so we cannot find such an x. Thus, 3
does not have a multiplicative inverse, so Z27 is not a field.
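The two arguments above can be cross-checked by brute force. This is a small sketch under the assumption that a direct search over {1, ..., n − 1} is acceptable for small n; the function names are illustrative, not from the text.

```python
def inverse_mod(a: int, n: int):
    """Return x in {1, ..., n-1} with a*x ≡ 1 (mod n), or None if no such x exists."""
    for x in range(1, n):
        if (a * x) % n == 1:
            return x
    return None


def every_nonzero_element_invertible(n: int) -> bool:
    """Check the multiplicative-inverse axiom for the integers mod n."""
    return all(inverse_mod(a, n) is not None for a in range(1, n))


print(inverse_mod(4, 8))    # no inverse for 4 in Z8
print(inverse_mod(3, 27))   # no inverse for 3 in Z27
print(every_nonzero_element_invertible(7))
```

The search only tests the inverse axiom; the remaining field axioms hold for the integers mod n for every n, so this axiom is the one that decides the question.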
The following proofs may seem tedious, but they are necessary, as we are building everything
from scratch using just the field axioms listed above. Be careful not to skip steps, as we cannot
assume that the usual algebraic techniques we have learned apply. Make sure you know which property
was invoked for each step of a proof.
1. a + b = a + c → b = c
2. ab = ac ∧ a ≠ 0 → b = c
a + b = a + c
−a + (a + b) = −a + (a + c)
(−a + a) + b = (−a + a) + c
0 + b = 0 + c
b = c
For the second part, we know that a⁻¹ exists because we are given that a ≠ 0. Then,
ab = ac
a⁻¹(ab) = a⁻¹(ac)
(a⁻¹a)b = (a⁻¹a)c
1 · b = 1 · c
b = c
Henceforth, when invoking either part of this theorem, we will simply refer to it by “Cancellation.”
Theorem 3.1.5
The additive and multiplicative identities and inverses are unique.
$0 + \tilde{0} = 0$ because $\tilde{0}$ is an additive identity.
$0 + \tilde{0} = \tilde{0}$ because $0$ is an additive identity.
Therefore, $0 = \tilde{0}$, so it is unique.
Aside from $-a$, suppose $-\tilde{a}$ is another additive inverse for $a$. Then we have,
$0 = a + (-a)$,
$0 = a + (-\tilde{a})$.
Therefore, $a + -a = a + -\tilde{a} \longrightarrow -a = -\tilde{a}$ by cancellation.
Aside from $1$, suppose $\tilde{1}$ is another multiplicative identity. Then,
$1 \cdot \tilde{1} = \tilde{1}$ because $1$ is a multiplicative identity.
$1 \cdot \tilde{1} = 1$ because $\tilde{1}$ is a multiplicative identity.
Therefore $1 = \tilde{1}$.
Aside from $a^{-1}$, suppose $\tilde{a}^{-1}$ is another multiplicative inverse. $\forall a \neq 0$, we have:
$a \cdot a^{-1} = 1$
$a \cdot \tilde{a}^{-1} = 1$
Thus, $a \cdot a^{-1} = a \cdot \tilde{a}^{-1}$. Therefore, $a^{-1} = \tilde{a}^{-1}$ by cancellation.
Proof. −a + a = 0
−a + −(−a) = 0
∴ −a + −(−a) = −a + a
∴ −(−a) = a.
Proof. a + (b + c) = (a + b) + c
= c + (a + b).
Proof. 0+0=0
a(0 + 0) = a · 0
a·0+a·0=a·0
−a · 0 + (a · 0 + a · 0) = −a · 0 + a · 0
(−a · 0 + a · 0) + a · 0 = −a · 0 + a · 0
0 + a · 0 = −a · 0 + a · 0
a · 0 = −a · 0 + a · 0
a·0=0
∴ 0 · a = 0.
1 + −1 = 0
a(1 + −1) = a · 0
a · 1 + a · −1 = a · 0
a + a · −1 = a · 0
a + a · −1 = 0 · a
a + a · −1 = 0
−a + (a + a · −1) = −a + 0
(−a + a) + a · −1 = −a + 0
0 + a · −1 = −a + 0
a · −1 = −a + 0
a · −1 = −a
∴ −1 · a = −a.
ab = 0
a⁻¹ · (ab) = a⁻¹ · 0
(a⁻¹ · a)b = a⁻¹ · 0
1 · b = a⁻¹ · 0
1 · b = 0
∴ b = 0.
Note that the result we just proved helps us deduce roots of a polynomial after factoring.
x² = y²
x² + −y² = y² + −y²
x² + −y² = 0
= (x · x + 0) + −(y · y)
= x · x + −(y · y)
= x² + −y².
Therefore, (x + y)(x + −y) = 0. By Problem 3.1.11, x + y = 0 or x + −y = 0. Consider each case
separately:
1. x + y = 0
x+y =0
(x + y) + −y = 0 + −y
x + (y + −y) = 0 + −y
x + 0 = 0 + −y
x = 0 + −y
∴ x = −y.
2. x + −y = 0
x + −y = 0
(x + −y) + y = 0 + y
x + (−y + y) = 0 + y
x+0=0+y
x=0+y
∴ x = y.
ab(a⁻¹b⁻¹) = 1
(ab)⁻¹(ab(a⁻¹b⁻¹)) = (ab)⁻¹ · 1
(ab)⁻¹(ab(a⁻¹b⁻¹)) = (ab)⁻¹
((ab)⁻¹ab)(a⁻¹b⁻¹) = (ab)⁻¹
1 · (a⁻¹b⁻¹) = (ab)⁻¹
a⁻¹b⁻¹ = (ab)⁻¹
∴ (ab)⁻¹ = a⁻¹b⁻¹.
At this point, we should address the other two operations. We introduced these later because they
are, in fact, results from addition and multiplication defined by the field axioms.
Consider the equation x + b = a, where a, b are constants. If we solve for x,
x + b = a
(x + b) + −b = a + −b
x + (b + −b) = a + −b
x + 0 = a + −b
x = a + −b
We now define a new operation to represent the solution to this special equation, x = a + −b.
Definition 3.2.1. Subtraction is defined as a + −b = a − b.
x · b = a
b · x = a
b⁻¹(b · x) = b⁻¹(a)
(b⁻¹b)x = b⁻¹(a)
1 · x = b⁻¹ · a
x = b⁻¹ · a
x = a · b⁻¹
We define another new operation to represent the solution to this special equation, which is
x = ab⁻¹.
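Definition 3.2.2. Division is defined as $ab^{-1} = \frac{a}{b}$, provided $b \neq 0$.
Problem 3.2.3. Prove $\frac{a \cdot d}{b \cdot d} = \frac{a}{b}$.
Proof.
$\frac{a \cdot d}{b \cdot d} = (a \cdot d)(b \cdot d)^{-1}$
$= (a \cdot d)(b^{-1} \cdot d^{-1})$
$= ((a \cdot d) \cdot b^{-1}) \cdot d^{-1}$
$= ((d \cdot a) \cdot b^{-1}) \cdot d^{-1}$
$= (d \cdot (a \cdot b^{-1})) \cdot d^{-1}$
$= ((a \cdot b^{-1}) \cdot d) \cdot d^{-1}$
$= (a \cdot b^{-1})(d \cdot d^{-1})$
$= (a \cdot b^{-1}) \cdot 1$
$= a \cdot b^{-1}$
$= \frac{a}{b}$.
Problem 3.2.4. Prove $\frac{a}{b} + \frac{c}{d} = \frac{a \cdot d + b \cdot c}{b \cdot d}$.
Proof. By Problem 3.2.3, $\frac{a}{b} = \frac{a \cdot d}{b \cdot d}$ and $\frac{c}{d} = \frac{c \cdot b}{d \cdot b}$. Then,
$\frac{a}{b} + \frac{c}{d} = \frac{a \cdot d}{b \cdot d} + \frac{c \cdot b}{d \cdot b}$
$= \frac{a \cdot d}{b \cdot d} + \frac{b \cdot c}{b \cdot d}$
$= (a \cdot d)(b \cdot d)^{-1} + (b \cdot c)(b \cdot d)^{-1}$
$= (b \cdot d)^{-1}(a \cdot d) + (b \cdot d)^{-1}(b \cdot c)$
$= (b \cdot d)^{-1}(a \cdot d + b \cdot c)$
$= (a \cdot d + b \cdot c)(b \cdot d)^{-1}$
$= \frac{a \cdot d + b \cdot c}{b \cdot d}$.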
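Because the proof uses only the field axioms, the same identity should hold in any field, not just the rationals. The sketch below checks it exhaustively in the integers mod 7. The modulus 7 and the helper names are assumptions made for illustration; `pow(b, -1, p)` is Python's built-in modular inverse (Python 3.8 or later).

```python
P = 7  # hypothetical prime modulus, so the integers mod P form a field


def div(a: int, b: int, p: int = P) -> int:
    """a / b in the integers mod p, defined as a * b^(-1); b must be nonzero mod p."""
    return (a * pow(b, -1, p)) % p


def fraction_sum_identity_holds(p: int = P) -> bool:
    """Check a/b + c/d == (a*d + b*c)/(b*d) for every a, c and every nonzero b, d."""
    for a in range(p):
        for c in range(p):
            for b in range(1, p):
                for d in range(1, p):
                    lhs = (div(a, b, p) + div(c, d, p)) % p
                    rhs = div((a * d + b * c) % p, (b * d) % p, p)
                    if lhs != rhs:
                        return False
    return True


print(div(3, 5))                      # 3 * 5^(-1) mod 7
print(fraction_sum_identity_holds())
```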
Here are some review problems to help with field proofs. You may use previous results, or, for
added difficulty, prove them from scratch (i.e. using only the field axioms in the beginning of this
chapter).
Exercise 3.2.9. Prove that (−a)(−b) = ab. (Hint: try proving (−a) · b = −(ab) first.)
Now we move on from the theoretical aspects of math to a relatively more common topic: sequences
and series, and their behaviors.
4.1 Sequences
The first 6 terms of the sequence would be: $\frac{1}{2}, \frac{2}{5}, \frac{3}{10}, \frac{4}{17}, \frac{5}{26}, \frac{6}{37}$.
2. Implicitly defined: For example, let $\pi_n$ = the nth digit after the decimal point for π.
3. Recursively defined: Each term is defined in terms of previous terms. We have the famous
example, the Fibonacci sequence: Given $f_1 = 1$, $f_2 = 1$, $\forall n \geq 1$, $f_{n+2} = f_{n+1} + f_n$. Then the
first few terms are 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, . . .
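The difference between an explicit and a recursive definition shows up clearly in code. In the sketch below, the explicit rule n/(n² + 1) is an assumption inferred from the six listed terms (the defining formula is not shown in this excerpt), and the function names are illustrative.

```python
from fractions import Fraction


def explicit_term(n: int) -> Fraction:
    """Assumed explicit rule n / (n^2 + 1), inferred from the listed terms."""
    return Fraction(n, n * n + 1)


def fibonacci(count: int) -> list:
    """First `count` Fibonacci numbers from f1 = f2 = 1 and f(n+2) = f(n+1) + f(n)."""
    terms = []
    current, following = 1, 1
    for _ in range(count):
        terms.append(current)
        current, following = following, current + following
    return terms


print([str(explicit_term(n)) for n in range(1, 7)])
print(fibonacci(10))
```

An explicit term can be computed directly from n, while a recursive term requires every earlier term first.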
From these definitions, we can derive explicit formulas for each type of sequence:
Arithmetic              Geometric
$a_1 = a$               $a_1 = a$
$a_2 = a + d$           $a_2 = ar$
$a_3 = a + 2d$          $a_3 = ar^2$
$a_4 = a + 3d$          $a_4 = ar^3$
$\vdots$                $\vdots$
$a_n = a + (n - 1)d$    $a_n = ar^{n-1}$
Example 4.1.2
Consider a sequence starting with 1, 3, . . . Write the next five terms if the sequence was arithmetic
or geometric. Then, find the 100th term of each.
Solution. If the sequence was arithmetic, then note that the common difference is 2 (which can be
obtained by subtracting the first term from the second term). Thus, we simply keep adding 2 to get
the next five terms: 1, 3, 5, 7, 9, 11, 13 .
If the sequence was geometric, then we can get the common ratio by dividing the second term
by the first term to get 3. Then we can simply keep multiplying by the common ratio to get
1, 3, 9, 27, 81, 243, 729 .
Obviously, it would be time-consuming to keep adding on 2 or multiplying by 3 to reach the
100th term, so we find a shorter way. As defined above, in an arithmetic sequence, $a_n = a + (n - 1)d$.
We know that a = 1, d = 2, and n = 100, so we get $a_{100} = 1 + (100 - 1) \cdot 2 = 199$.
Likewise, for a geometric sequence, $a_n = ar^{n-1}$. We know that a = 1, r = 3, and n = 100, so
$a_{100} = 1 \cdot 3^{100-1} = 3^{99}$.
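The two explicit formulas translate directly into code. This is a minimal sketch with illustrative function names; it reproduces the terms listed in the solution and the two 100th terms.

```python
def arithmetic_term(a, d, n: int):
    """n-th term of an arithmetic sequence: a + (n - 1) d."""
    return a + (n - 1) * d


def geometric_term(a, r, n: int):
    """n-th term of a geometric sequence: a * r^(n - 1)."""
    return a * r ** (n - 1)


print([arithmetic_term(1, 2, n) for n in range(1, 8)])
print([geometric_term(1, 3, n) for n in range(1, 8)])
print(arithmetic_term(1, 2, 100))
print(geometric_term(1, 3, 100) == 3 ** 99)
```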
Problem 4.1.3. An arithmetic sequence has the fifth term of 19 and 79th term of 22. What is its
2017th term?
Solution. As we have an arithmetic sequence, the fifth term of 19 translates to a + 4d = 19 for some
first term a and common difference d. Likewise, we have a + 78d = 22.
If we subtract the two equations, we get 74d = 3, which rearranges to $d = \frac{3}{74}$.
The 2017th term would be a + 2016d. Instead of solving for a as well, we can just use the
given information. Note that a + 4d = 19, so a + 2016d = (a + 4d) + 2012d = 19 + 2012d. Then we
can simply compute to get $19 + 2012 \cdot \frac{3}{74} = \frac{3721}{37}$.
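For comparison, here is a sketch that takes the longer route of solving for both a and d with exact rational arithmetic. The helper name is hypothetical; the result should agree with the shortcut above.

```python
from fractions import Fraction


def arithmetic_from_two_terms(i: int, a_i, j: int, a_j):
    """Recover the first term a and common difference d from the i-th and j-th terms."""
    d = Fraction(a_j - a_i, j - i)
    a = a_i - (i - 1) * d
    return a, d


a, d = arithmetic_from_two_terms(5, 19, 79, 22)
term_2017 = a + 2016 * d
print(d, term_2017)
```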
Problem 4.1.4. Let a, b, c be three consecutive terms of a sequence. Find
1. c in terms of a, b
2. b in terms of a, c
Solution. If the sequence was arithmetic, then using our explicit formulas, we have
$a = a_0 + (n - 1)d$,
$b = a_0 + nd$,
$c = a_0 + (n + 1)d$.
Notice that a and b are only d apart, so b − a = d. We also have that b and c are d apart,
so c − b = d. We substitute b − a = d into the latter to get c − b = b − a, which rearranges to
c = 2b − a.
Also, b is evenly spaced apart from a and c. This should suggest that b is the average of a and c,
i.e. $b = \frac{a + c}{2}$. Indeed, we can confirm this using the explicit formulas.
If the sequence was geometric, then our explicit formulas would tell us that
$a = a_0 r^{n-1}$,
$b = a_0 r^n$,
$c = a_0 r^{n+1}$.
Likewise, note that $\frac{b}{a} = r$ and $\frac{c}{b} = r$, so $\frac{c}{b} = \frac{b}{a}$, which rearranges to $c = \frac{b^2}{a}$.
If we multiply a and c together, we get $a_0^2 r^{2n}$, which is just $b^2$, so $b = \pm\sqrt{ac}$. As we don’t know
what sign $a_0$ or r is, we cannot determine what sign b is with certainty.
Problem 4.1.5. Let {an } be an arithmetic sequence. Simplify a29 + a76 − a51 .
Solution. This is a direct application of the explicit formula for an arithmetic sequence.
Problem 4.1.6. Find all sequences a, b, c such that a, b, c and a + 1, b + 1, c + 1 are both geometric.
Solution. If a, b, c is geometric, then we know that $b^2 = ac$. Similarly, $(b + 1)^2 = (a + 1)(c + 1)$.
Note that $(b + 1)^2 = (a + 1)(c + 1)$ simplifies to $b^2 + 2b = ac + a + c$. But we know that $b^2 = ac$,
so $b^2 + 2b = b^2 + a + c$, yielding $b = \frac{a + c}{2}$.
This implies that a, b, c is an arithmetic sequence. The only possible solution for a sequence that
is both geometric and arithmetic is a = b = c (when all terms are equal to each other).
We can also get this result by letting a = b − d and c = b + d, for some common difference d.
So $b^2 = ac$ becomes $b^2 = (b - d)(b + d)$, so d = 0, which implies the same result.
4.2 Series
So we have the formula $1 + 2 + 3 + \ldots + n = \frac{n(n + 1)}{2}$.
Now, we can discuss notions of an arithmetic series, which is simply the sum of all terms of
an arithmetic sequence.
Theorem 4.2.3
For some finite arithmetic series $a_1 + a_2 + a_3 + \ldots + a_n$, we have general formulas:
$a_1 + a_2 + a_3 + \ldots + a_n = na + \frac{n(n - 1)}{2}d = \frac{n}{2}(a_1 + a_n).$
$a_1 + a_2 + a_3 + \ldots + a_n = a + (a + d) + (a + 2d) + \ldots + (a + (n - 1)d)$
$= (a + a + a + \ldots + a) + (d + 2d + 3d + \ldots + (n - 1)d)$
$= na + d(1 + 2 + 3 + \ldots + (n - 1))$
$= na + \frac{n(n - 1)}{2}d,$
using the formula from Problem 4.2.2.
To get the other formula, we perform some clever manipulations:
$na + \frac{n(n - 1)}{2}d = \frac{n}{2}(2a + (n - 1)d)$
$= \frac{n}{2}(a + (a + (n - 1)d))$
$= \frac{n}{2}(a_1 + a_n).$
The second formula is much simpler, but can only be used when you know the last term of the
arithmetic sequence. Otherwise, use the first formula.
Problem 4.2.4. Find 18 + 19 + 20 + . . . + 53.
Solution. Identify the main components of this series: a = 18, d = 1, and n = 36. Then we apply
the second formula: $\frac{36}{2}(18 + 53) = 18 \cdot 71 = 1278$.
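Both formulas from Theorem 4.2.3 can be compared against a direct sum. The sketch below uses illustrative function names and exact rational arithmetic so that non-integer differences would also work.

```python
from fractions import Fraction


def arithmetic_sum_from_difference(a, d, n: int):
    """First formula: n*a + n(n - 1)/2 * d."""
    return n * a + Fraction(n * (n - 1), 2) * d


def arithmetic_sum_from_ends(first, last, n: int):
    """Second formula: n/2 * (first + last)."""
    return Fraction(n, 2) * (first + last)


print(arithmetic_sum_from_difference(18, 1, 36))
print(arithmetic_sum_from_ends(18, 53, 36))
print(sum(range(18, 54)))  # direct sum of 18 + 19 + ... + 53
```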
Problem 4.2.5. The sum of the first ten terms of an arithmetic series is 10. The sum of the next
ten terms is 100. What is the sum of the next ten terms after that?
Solution. This time, the first formula, $na + \frac{n(n - 1)}{2}d$, is useful, because we don’t know the 10th
or 20th term, which is necessary to use the second formula. We are given the sum of the first ten
terms, and the sum of the next ten terms. This indicates that we know the sum of the first twenty
terms (just add them together).
We apply the formula on the first ten and the first twenty to get:
$10a + 45d = 10$,
$20a + 190d = 110$.
We have a system of equations. Multiply the first one by 2 and subtract the two equations to
get $100d = 90$. Therefore, we have $d = \frac{9}{10}$ and $a = -\frac{61}{20}$.
Finding the sum of the next ten terms after this 20-term arithmetic sequence is the same thing
as finding the sum of the 30-term arithmetic sequence and subtracting the sum of the first 20, which
we found to be 110. Thus,
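$a_1 + a_2 + a_3 + \ldots + a_{30} = 30 \cdot \left(-\frac{61}{20}\right) + \frac{30 \cdot 29}{2} \cdot \frac{9}{10}$
$= -\frac{183}{2} + \frac{783}{2}$
$= \frac{600}{2}$
$= 300.$
Therefore, $a_{21} + a_{22} + \ldots + a_{30} = (a_1 + a_2 + a_3 + \ldots + a_{30}) - (a_1 + a_2 + \ldots + a_{20}) = 300 - 110 = 190$.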
Alternative Solution. The previous solution was laborious and inefficient. Let’s look for a better
way to approach this problem.
The important thing to notice about an arithmetic sequence is that the terms develop linearly.
There is a constant, common difference that is added over and over again. Therefore, the difference
between the first and last term of each set of 10 terms is constant.
$a_{21} - a_{11} = a_{11} - a_1$
Therefore, $a_1 + a_2 + a_3 + \ldots + a_{10}$, $a_{11} + a_{12} + a_{13} + \ldots + a_{20}$, and $a_{21} + a_{22} + a_{23} + \ldots + a_{30}$ form
an arithmetic sequence. Since $a_1 + a_2 + a_3 + \ldots + a_{10} = 10$ and $a_{11} + a_{12} + a_{13} + \ldots + a_{20} = 100$,
there is a common difference of 90, so we get $a_{21} + a_{22} + a_{23} + \ldots + a_{30} = 100 + 90 = 190$. This
solution was much more elegant!
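Both solutions can be confirmed numerically from the values of a and d found in the first one. This is a small sketch with hypothetical helper names; it sums each block of ten terms exactly.

```python
from fractions import Fraction

a, d = Fraction(-61, 20), Fraction(9, 10)  # values found in the first solution


def term(n: int) -> Fraction:
    """n-th term of the arithmetic sequence, a + (n - 1) d."""
    return a + (n - 1) * d


def block_sum(first: int, last: int) -> Fraction:
    """Sum of the terms with indices first, ..., last (inclusive)."""
    return sum(term(n) for n in range(first, last + 1))


blocks = [block_sum(1, 10), block_sum(11, 20), block_sum(21, 30)]
print(blocks)
```

The three block sums differ by the same amount, which is exactly the observation the alternative solution relies on.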
Problem 4.2.6. Let m, n ∈ Z such that m ≤ n. Find the sum of all the integers from m through n
inclusive.
(n − m + 1)(n + m) = 78.
How are we supposed to find all possible values of m and n? We can take advantage of m and
n being integers, and the different parities of (n − m + 1) and (n + m) (i.e. one is odd and one is
even). We can enumerate all possibilities below:
(n − m + 1)   | (n + m)     | Factorization of 78 into odd and even ‘pairs’ | Solution (n, m)
n − m = 1     | n + m = 39  | 2 · 39                                        | (20, 19)
n − m = 5     | n + m = 13  | 6 · 13                                        | (9, 4)
n − m = 25    | n + m = 3   | 26 · 3                                        | (14, −11)
n − m = 77    | n + m = 1   | 78 · 1                                        | (39, −38)
n − m = 38    | n + m = 2   | 39 · 2                                        | (20, −18)
n − m = 12    | n + m = 6   | 13 · 6                                        | (9, −3)
n − m = 2     | n + m = 26  | 3 · 26                                        | (14, 12)
n − m = 0     | n + m = 78  | 1 · 78                                        | (39, 39)
19 + 20,
4 + 5 + 6 + . . . + 8 + 9,
−11 + −10 + −9 + . . . + 13 + 14,
−38 + −37 + . . . + 39,
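The enumeration in the table can be reproduced by looping over the divisors of 78. This sketch assumes, as the equation (n − m + 1)(n + m) = 78 suggests, that the target sum of consecutive integers is 39; the function name is illustrative.

```python
def consecutive_runs(product_target: int = 78) -> list:
    """All integer pairs (n, m), m <= n, with (n - m + 1)(n + m) == product_target."""
    solutions = []
    for count in range(1, product_target + 1):   # count = n - m + 1, the number of terms
        if product_target % count != 0:
            continue
        total = product_target // count           # total = n + m
        if (total + count - 1) % 2 != 0:          # 2n = total + count - 1 must be even
            continue
        n = (total + count - 1) // 2
        m = total - n
        solutions.append((n, m))
    return solutions


for n, m in consecutive_runs():
    print((n, m), sum(range(m, n + 1)))
```

Every divisor of 78 produces a valid pair here because the two factors always have opposite parity.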
Likewise, consider the geometric series, which is the sum of all the terms of a geometric
sequence.
Theorem 4.2.8
For some finite geometric series $a + ar + ar^2 + ar^3 + \ldots + ar^{n-1}$, we have general formulas:
$a + ar + ar^2 + ar^3 + \ldots + ar^{n-1} = a \cdot \frac{1 - r^n}{1 - r}$ if $r \neq 1$,
$= na$ if $r = 1$.
Otherwise, if r = 1, then all the terms are just a. So our sum would be n terms of a added
together, which is just na.
$1 + 2 + 4 + 8 + \ldots$ (28 terms)
Solution. The first expression is an arithmetic series, which evaluates to $\frac{100 + 2800}{2} \cdot 28 = 40600$.
The second expression is a geometric series and it can be computed as $1 \cdot \frac{1 - 2^{28}}{1 - 2} = 2^{28} - 1$. It is
clear that the latter is much greater (note that $2^{16} = 65536 > 40600$).
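The closed form from Theorem 4.2.8 can be compared with a term-by-term sum. In this sketch the function name is illustrative, and the arithmetic series is assumed to be 100 + 200 + ... + 2800, which is what the first and last terms and the count of 28 in the solution imply.

```python
from fractions import Fraction


def geometric_sum(a, r, n: int):
    """Sum of the n terms a + ar + ... + ar^(n-1), using the closed form when r != 1."""
    if r == 1:
        return n * a
    r = Fraction(r)
    return a * (1 - r ** n) / (1 - r)


arithmetic_total = sum(range(100, 2801, 100))   # assumed series 100 + 200 + ... + 2800
geometric_total = geometric_sum(1, 2, 28)

print(arithmetic_total, geometric_total)
```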
$a + ar + ar^2 = 10$,
$ar + ar^2 + ar^3 = 20$.
However, for all of the series we have been dealing with so far, we assumed them to be finite.
What if we had an infinite series? There would be no ‘last term’ to consider.
Examine the series $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots + \frac{1}{2^n}$. The formula gives $\frac{1}{2} \cdot \frac{1 - \frac{1}{2^n}}{1 - \frac{1}{2}} = 1 - \frac{1}{2^n}$. Let’s substitute
in some values of n.
$\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} = \frac{15}{16} = 0.9375.$
$\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32} = \frac{31}{32} = 0.96875.$
$\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots + \frac{1}{256} = \frac{255}{256} \approx 0.99609.$
As n gets larger, we see that the series is getting closer to 1. If we examine the formula $1 - \frac{1}{2^n}$,
notice that as n gets large, $\frac{1}{2^n}$ becomes a tremendously small number, so $1 - \frac{1}{2^n}$ approaches 1. To
express this notion, we write
$\lim_{n \to \infty} \left(1 - \frac{1}{2^n}\right) = 1.$
Here, we express n getting arbitrarily large as $n \to \infty$. This notation is read as “the limit as n
goes to ∞ of $1 - \frac{1}{2^n}$ is 1.” Furthermore, we say that the series $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots + \frac{1}{2^n}$ converges to 1.
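The partial sums above are easy to tabulate exactly. The following sketch, with an illustrative function name, reproduces the three values and shows the gap to 1 shrinking as n grows.

```python
from fractions import Fraction


def partial_sum(n: int) -> Fraction:
    """1/2 + 1/4 + ... + 1/2^n, added term by term."""
    return sum(Fraction(1, 2 ** k) for k in range(1, n + 1))


for n in (4, 5, 8, 20):
    s = partial_sum(n)
    print(n, s, float(1 - s))  # the gap to 1 equals 1/2^n
```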
Problem 4.2.12. Find the value that this series converges to:
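$\frac{1}{3} + \frac{1}{9} + \frac{1}{27} + \ldots + \frac{1}{3^n}.$
Solution. First, find the sum in terms of n using the formula $a \cdot \frac{1 - r^n}{1 - r}$:
$\frac{1}{3} + \frac{1}{9} + \frac{1}{27} + \ldots + \frac{1}{3^n} = \frac{1}{3} \cdot \frac{1 - \left(\frac{1}{3}\right)^n}{1 - \frac{1}{3}}$
$= \frac{1 - \left(\frac{1}{3}\right)^n}{2}$
$= \frac{1}{2}\left(1 - \frac{1}{3^n}\right).$
As n gets large, $\frac{1}{3^n}$ goes to 0. Then, $1 - \frac{1}{3^n}$ goes to 1 and therefore, the entire expression
$\frac{1}{2}\left(1 - \frac{1}{3^n}\right)$ will tend toward $\frac{1}{2}$.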
Using this notion of the limit, it is possible to compute certain infinite series. Because some
series converge to one value, we can have a finite sum when there are infinitely many terms to add
together.
Theorem 4.2.13
For some infinite geometric series $a + ar + ar^2 + \ldots$, if $|r| < 1$, then the sum is $\frac{a}{1 - r}$.
Proof. Disclaimer: this proof will not be so rigorous as we have not discussed limits in detail yet (in
the next section, we will).
Let $S_n = a + ar + ar^2 + \ldots + ar^{n-1} = \frac{a(1 - r^n)}{1 - r}$. Then, we will consider $n \to \infty$, so $S_n$ becomes
an infinite geometric series. In the formula $\frac{a(1 - r^n)}{1 - r}$, the only value that is changing because of n
is $r^n$. So, let’s analyze $r^n$ as $n \to \infty$.
In fact, if $r \leq -1$ or $r > 1$, we cannot find a proper formula for such a series, and we have already
covered the case when $r = 1$.
Thus, if we restrict r to $-1 < r < 1$, then we know for sure that $r^n$ will go to 0. Then, $\frac{a(1 - r^n)}{1 - r}$
will approach $\frac{a}{1 - r}$, which is the desired formula.
Solution. The common ratio is $-\frac{2}{3}$ and the first term is 1. Using the formula, the sum is simply
$\frac{1}{1 - \left(-\frac{2}{3}\right)} = \frac{3}{5}$.
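1. $\frac{9}{10} + \frac{9}{100} + \frac{9}{1000} + \ldots$
2. $0.\overline{23}$
3. $0.1\overline{46}$
Solution.
1. Direct application of the formula gives $\frac{\frac{9}{10}}{1 - \frac{1}{10}} = 1$. Also notice that $\frac{9}{10} + \frac{9}{100} + \frac{9}{1000} + \ldots$
is equivalent to 0.9999 . . ., or $0.\overline{9}$, which equals 1.
2. $0.\overline{23} = \frac{23}{100} + \frac{23}{10000} + \ldots = \frac{0.23}{1 - \frac{1}{100}} = \frac{23}{99}$.
3. $0.1\overline{46} = \frac{1}{10} + \left(\frac{46}{1000} + \frac{46}{100000} + \ldots\right) = \frac{1}{10} + \frac{\frac{46}{1000}}{1 - \frac{1}{100}} = \frac{29}{198}$.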
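The infinite-series formula a/(1 − r) can be applied with exact fractions to confirm these values. This is a minimal sketch; the function name is illustrative, and the guard on |r| mirrors the hypothesis of Theorem 4.2.13.

```python
from fractions import Fraction


def infinite_geometric_sum(a: Fraction, r: Fraction) -> Fraction:
    """Sum a + ar + ar^2 + ... using a / (1 - r); requires |r| < 1."""
    if abs(r) >= 1:
        raise ValueError("the series converges only when |r| < 1")
    return a / (1 - r)


alternating = infinite_geometric_sum(Fraction(1), Fraction(-2, 3))
nines = infinite_geometric_sum(Fraction(9, 10), Fraction(1, 10))
repeating_23 = infinite_geometric_sum(Fraction(23, 100), Fraction(1, 100))
repeating_146 = Fraction(1, 10) + infinite_geometric_sum(Fraction(46, 1000), Fraction(1, 100))

print(alternating, nines, repeating_23, repeating_146)
```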
Theorem 4.2.16
A number is rational if and only if it has a decimal expansion that either terminates or repeats.
Proof. First, we prove the right direction: if a number is rational, then it has a decimal expansion
that either terminates or repeats.
A rational number can be expressed as $\frac{p}{q}$, where p, q ∈ Z and q > 0. If we perform long division with p as
the dividend and q as the divisor, then there will only be finitely many possible remainders, namely
0, 1, 2, . . . , q − 1. Thus, the long division will either terminate with a remainder of 0, or
continue indefinitely through a repeating cycle of non-zero remainders.
For the left direction, if the number has a decimal expansion that either terminates or repeats,
then it can be represented as a geometric series or infinite geometric series with common ratio
between 0 and 1. In either case, we have an appropriate formula which shows that the sum is a
rational number: $\frac{a(1 - r^n)}{1 - r}$ or $\frac{a}{1 - r}$.
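The remainder argument in the first half of the proof is itself an algorithm: perform long division and stop as soon as a remainder is 0 or repeats. The sketch below assumes p ≥ 0 and q > 0; the function name and the returned triple are illustrative choices.

```python
def decimal_expansion(p: int, q: int):
    """Long division of p by q (p >= 0, q > 0).

    Returns (integer_part, non_repeating_digits, repeating_digits) as
    (int, str, str); repeating_digits is empty when the expansion terminates.
    """
    integer_part, remainder = divmod(p, q)
    digits = []
    first_seen = {}  # remainder -> index of the digit it produced
    while remainder != 0 and remainder not in first_seen:
        first_seen[remainder] = len(digits)
        remainder *= 10
        digits.append(remainder // q)
        remainder %= q
    text = "".join(str(digit) for digit in digits)
    if remainder == 0:
        return integer_part, text, ""
    start = first_seen[remainder]  # the cycle begins where this remainder first appeared
    return integer_part, text[:start], text[start:]


print(decimal_expansion(1, 8))
print(decimal_expansion(23, 99))
print(decimal_expansion(29, 198))
```

The loop must stop after at most q steps, since only q distinct remainders are possible.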
Problem 4.2.17. The sum of an infinite geometric series is 10. The sum of the same series but
with each of its terms squared is 12. What is its fifth term?
The first equation gives $a = 10 - 10r$, and the second gives $a^2 = 12 - 12r^2$. Therefore, we can
solve for r with the equation
$12 - 12r^2 = (10 - 10r)^2.$
We end up with the quadratic $14r^2 - 25r + 11 = 0$, which can be factored into $(r - 1)(14r - 11) = 0$,
giving $r = 1, \frac{11}{14}$. Since it is an infinite geometric sequence, $|r| < 1$, so the common ratio must be
$\frac{11}{14}$.
Therefore $a = 10 - 10 \cdot \frac{11}{14} = \frac{15}{7}$, so the fifth term, which is $ar^4$, would be $\frac{15}{7} \cdot \left(\frac{11}{14}\right)^4$.
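The values found for a and r can be substituted back into both conditions. This sketch assumes the two conditions are a/(1 − r) = 10 for the series and a²/(1 − r²) = 12 for the squared series, which is what the rearranged equations in the solution correspond to.

```python
from fractions import Fraction

a, r = Fraction(15, 7), Fraction(11, 14)

series_sum = a / (1 - r)              # sum of a + ar + ar^2 + ...
squared_sum = a ** 2 / (1 - r ** 2)   # sum of a^2 + (ar)^2 + (ar^2)^2 + ...
fifth_term = a * r ** 4

print(series_sum, squared_sum, fifth_term)
```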
Earlier, we brought up the concept of the limit as a means to derive our formula for an infinite
geometric series. In this section, we will go over them with a very rigorous and technical approach.
If you are not yet familiar with quantifiers from the first chapter, make sure you review it, because
it is essential to understanding how exactly limits are defined from a theoretical standpoint.
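Definition 4.3.1. Let $\{a_n\}$ be an infinite sequence. We define the limit of the sequence as follows:
$\lim_{n \to \infty} a_n = L \longleftrightarrow \forall \varepsilon > 0\ \exists N > 0$ s.t. $\forall n > N, |a_n - L| < \varepsilon.$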
Now this symbolic definition certainly seems unwieldy, so we will dissect one part at a time.
Often, we use the Greek letter ε, called “epsilon,” to represent a preferably small number. Likewise,
we are using the capital letter N to represent a preferably large number.
When we start off with ∀ε > 0 ∃N > 0, we are saying that for any small positive number ε we
choose, we can always find a large positive number N such that the rest of the statement is satisfied.
Keep in mind that ε and N are defined to be real numbers in this context.
Namely, the rest of the statement reads $\forall n > N, |a_n - L| < \varepsilon$. Remember that $|a_n - L|$ is the
distance between $a_n$ and L, which is the value that the infinite sequence converges to. We are saying
that every $a_n$ in the infinite sequence whose index n is greater than the large number N we found is less than
ε away from L.
In other words, $a_n$ can be made arbitrarily close to L by making n sufficiently large. This should
make sense: as we choose a term farther down the sequence, we should be closer to the limit L
than before.
Then ε is our threshold of ‘closeness’ to L. As we began the statement with ∀ε > 0, this is the
‘arbitrarily close’ part of the definition. Making ε as small as we want yields terms further down the
sequence that are as close to L as we want.
Our choice of N , depending on the given number ε, represents how far we have to go down the
sequence in order to find terms that are within ε of L. Thus, N serves to represent the ‘make n
sufficiently large’ part of the definition.
As a last note, I will abbreviate “such that” to “s.t.” for concision.
Now we will proceed to prove that some sequences actually have the limits that we suspect them to
have.
Example 4.3.2
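Prove $\lim_{n \to \infty} \frac{1}{n} = 0$.
We want to show that $\forall \varepsilon > 0\ \exists N > 0$ s.t. $\forall n > N, \left|\frac{1}{n} - 0\right| < \varepsilon$,
where $a_n = \frac{1}{n}$ and L = 0. It is sufficient to find an N in terms of a given ε that satisfies
$\forall n > N, \left|\frac{1}{n}\right| < \varepsilon$. How do we do so?
Usually, we can work backwards. Start from $\left|\frac{1}{n}\right| < \varepsilon$. Notice that n is always positive, given the
precedent that N > 0 and ∀n > N. So we can lose the absolute value signs to get $\frac{1}{n} < \varepsilon$. As ε is
also positive, we can rearrange this to $n > \frac{1}{\varepsilon}$.
Recall that we must satisfy $\forall n > N, \left|\frac{1}{n}\right| < \varepsilon$, and we have just discovered that n must be greater
than $\frac{1}{\varepsilon}$. In fact, if we let $N = \frac{1}{\varepsilon}$, then we have $\forall n > \frac{1}{\varepsilon}, \left|\frac{1}{n}\right| < \varepsilon$, which is true because of the
algebraic manipulations we have just done (the steps we have taken are reversible).
Thus, it is sufficient to say that $N = \frac{1}{\varepsilon}$ in order to prove the limit.
Unfortunately, the line of reasoning that we’ve just done cannot be used in a formal proof,
because we have worked backwards. To formalize it, we must write the proof forwards:
For a given ε > 0, let $N = \frac{1}{\varepsilon}$. Then,
$n > N \implies n > \frac{1}{\varepsilon} \implies n\varepsilon > 1 \implies \frac{1}{n} < \varepsilon \implies \left|\frac{1}{n} - 0\right| < \varepsilon.$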
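The choice N = 1/ε can be spot-checked numerically. The sketch below is only a sanity check over finitely many n, not a substitute for the forward proof; the helper name, the sample values of ε, and the sample size are all assumptions made for illustration.

```python
from fractions import Fraction
from math import floor


def holds_beyond_N(term, L, N, eps, samples: int = 1000) -> bool:
    """Check |term(n) - L| < eps for the first `samples` integers n > N."""
    start = floor(N) + 1
    return all(abs(term(n) - L) < eps for n in range(start, start + samples))


def a(n: int) -> Fraction:
    """The sequence a_n = 1/n."""
    return Fraction(1, n)


for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 1000)):
    N = 1 / eps  # the choice made in the proof
    print(eps, N, holds_beyond_N(a, 0, N, eps))
```

Note that n = N itself fails the strict inequality, since 1/N equals ε exactly; this is why the definition asks only about n > N.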
For the remainder of the problems, I will write out the steps taken backwards in order to show
the motivation behind choosing an N to satisfy the definition. However, I strongly recommend that
you write formal proofs forward in order to reinforce your understanding of the symbolic definition
of the limit. Furthermore, I will gloss over some algebraic steps for conciseness, but you should be
writing out every step in your proof to be specific.
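Problem 4.3.3. Prove $\lim_{n \to \infty} \frac{1}{n^2} = 0$.
$\forall \varepsilon > 0\ \exists N > 0$ s.t. $\forall n > N, \left|\frac{1}{n^2} - 0\right| < \varepsilon$,
and now we can work backwards again. We have $\left|\frac{1}{n^2}\right| < \varepsilon$, and again note that n is positive, so
$\left|\frac{1}{n^2}\right| = \frac{1}{n^2} < \varepsilon$. We can then rearrange this to $n > \frac{1}{\sqrt{\varepsilon}}$. Then we can let $N = \frac{1}{\sqrt{\varepsilon}}$ to satisfy the
definition.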
Problem 4.3.4. Prove $\lim_{n \to \infty} \frac{n}{n + 1} = 1$.
$\forall \varepsilon > 0\ \exists N > 0$ s.t. $\forall n > N, \left|\frac{n}{n + 1} - 1\right| < \varepsilon.$
The procedure is similar to that of the previous proofs. To make things easier, condense the absolute
value expression: note that $\frac{n}{n + 1} - 1 = -\frac{1}{n + 1}$, so $\left|-\frac{1}{n + 1}\right| = \frac{1}{n + 1}$ using that n is positive.
Therefore, we have $\frac{1}{n + 1} < \varepsilon$, which can be rearranged to $n > \frac{1}{\varepsilon} - 1$.
Thus, given ε > 0, let $N = \frac{1}{\varepsilon} - 1$, and it can be verified that this choice of N satisfies the
definition.
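Problem 4.3.5. Prove $\lim_{n \to \infty} \frac{-n + 1}{2n + 3} = -\frac{1}{2}$.
$\left|\frac{-n + 1}{2n + 3} + \frac{1}{2}\right| < \varepsilon$
$\left|\frac{-n + 1}{2n + 3} + \frac{n + \frac{3}{2}}{2n + 3}\right| < \varepsilon$
$\left|\frac{\frac{5}{2}}{2n + 3}\right| < \varepsilon$
As n is positive, we have $\frac{\frac{5}{2}}{2n + 3} < \varepsilon$. We can rearrange this to get $n > \frac{\frac{5}{2} - 3\varepsilon}{2\varepsilon}$. Thus, for a
given ε > 0, let $N = \frac{\frac{5}{2} - 3\varepsilon}{2\varepsilon}$ so the overall symbolic statement is true.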
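Problem 4.3.6. Prove $\lim_{n \to \infty} \frac{2n + 5}{3n - 7} = \frac{2}{3}$.
Proof. Let $N = \frac{29 + 21\varepsilon}{9\varepsilon}$. Note that n > N suggests:
$3n > 3N$
$3n - 7 > 3N - 7$
$\frac{1}{3n - 7} < \frac{1}{3N - 7}$
$\therefore \frac{\frac{29}{3}}{3n - 7} < \frac{\frac{29}{3}}{3N - 7}$
Then,
$\left|\frac{2n + 5}{3n - 7} - \frac{2}{3}\right| = \left|\frac{2n + 5}{3n - 7} - \frac{2n - \frac{14}{3}}{3n - 7}\right| = \left|\frac{2n + 5 - \left(2n - \frac{14}{3}\right)}{3n - 7}\right| = \left|\frac{\frac{29}{3}}{3n - 7}\right| = \frac{\frac{29}{3}}{3n - 7}.$
Since $3N - 7 = \frac{29}{3\varepsilon}$, the bound above gives $\frac{\frac{29}{3}}{3n - 7} < \frac{\frac{29}{3}}{3N - 7} = \varepsilon$.
In other words, for a given ε > 0, we have found an N such that for all n > N, $\left|\frac{2n + 5}{3n - 7} - \frac{2}{3}\right| < \varepsilon$.
Therefore, $\lim_{n \to \infty} \frac{2n + 5}{3n - 7} = \frac{2}{3}$.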
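The choices of N in Problems 4.3.4 through 4.3.6 can be spot-checked in the same spirit. This is a hedged sketch: it tests only finitely many n for a few sample values of ε, the helper and the table layout are illustrative, and the start index is clamped to 1 because N can be non-positive for large ε.

```python
from fractions import Fraction
from math import floor


def holds_beyond_N(term, L, N, eps, samples: int = 1000) -> bool:
    """Check |term(n) - L| < eps for the first `samples` positive integers n > N."""
    start = max(1, floor(N) + 1)
    return all(abs(term(n) - L) < eps for n in range(start, start + samples))


# (label, sequence, claimed limit, choice of N as a function of eps)
cases = [
    ("4.3.4", lambda n: Fraction(n, n + 1), Fraction(1),
     lambda e: 1 / e - 1),
    ("4.3.5", lambda n: Fraction(-n + 1, 2 * n + 3), Fraction(-1, 2),
     lambda e: (Fraction(5, 2) - 3 * e) / (2 * e)),
    ("4.3.6", lambda n: Fraction(2 * n + 5, 3 * n - 7), Fraction(2, 3),
     lambda e: (29 + 21 * e) / (9 * e)),
]

results = {
    label: all(
        holds_beyond_N(term, L, choose_N(eps), eps)
        for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 1000))
    )
    for label, term, L, choose_N in cases
}
print(results)
```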
Exercise 4.3.7. Let $b_n = c$ for all n, where c is a constant real number. Prove $\lim_{n \to \infty} b_n = c$.
Example 4.3.8
Prove that $\lim_{n \to \infty} (-1)^n$ does not exist.
Proof. Suppose it does exist, such that $\lim_{n \to \infty} (-1)^n = L$. We have its definition,
$\forall \varepsilon > 0\ \exists N > 0$ s.t. $\forall n > N, |(-1)^n - L| < \varepsilon.$
Now, it becomes clear that we simply must find a value for ε such that we arrive at a contradiction
for $|(-1)^n - L| < \varepsilon$.
Consider $\varepsilon = \frac{1}{2}$. Then we have $|(-1)^n - L| < \frac{1}{2}$. We break up the absolute value signs to get
$-\frac{1}{2} < (-1)^n - L < \frac{1}{2},$
which gives us two inequalities,
$(-1)^n - L > -\frac{1}{2}$ and $(-1)^n - L < \frac{1}{2}$.
Consider each one separately. First, based on the parity of integer n, notice that there are only
two possible values of $(-1)^n$: 1 and −1. Now we list all possibilities:
$(-1)^n - L > -\frac{1}{2}$: $-1 - L > -\frac{1}{2} \implies L < -\frac{1}{2}$ and $1 - L > -\frac{1}{2} \implies L < \frac{3}{2}$. $\therefore L < -\frac{1}{2}$.
$(-1)^n - L < \frac{1}{2}$: $-1 - L < \frac{1}{2} \implies L > -\frac{3}{2}$ and $1 - L < \frac{1}{2} \implies L > \frac{1}{2}$. $\therefore L > \frac{1}{2}$.
There is no value of L which satisfies both $L < -\frac{1}{2}$ and $L > \frac{1}{2}$. Therefore, there exists no limit
for the sequence $a_n = (-1)^n$.
Theorem 4.3.9
Let k ∈ R. If $\lim_{n \to \infty} a_n = L$, then
a) $\lim_{n \to \infty} (a_n + k) = L + k$.
Proof. The best approach is to rewrite everything in terms of the given definition of a limit. For all
parts of the question, keep in mind:
b) Note that we are able to replace the third quantifier with an implication statement without
affecting the original definition, as such:
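$\forall \varepsilon > 0, \exists N$ s.t. $n > N \to |a_n - L| < \varepsilon.$
Consider the definition of $\lim_{n \to \infty} ka_n = kL$, using different variables to avoid confusion:
$\forall \tilde{\varepsilon} > 0, \exists \tilde{N}$ s.t. $n > \tilde{N} \to |ka_n - kL| < \tilde{\varepsilon}$
i.e. $|k| |a_n - L| < \tilde{\varepsilon}$
i.e. $|a_n - L| < \frac{\tilde{\varepsilon}}{|k|}$.
Let $\varepsilon = \frac{\tilde{\varepsilon}}{|k|}$, which is allowed because $\varepsilon$ and $\tilde{\varepsilon}$ can be any positive real numbers. Then
$\forall \varepsilon > 0, \exists N$ s.t. $n > N \to |a_n - L| < \frac{\tilde{\varepsilon}}{|k|}$ i.e. $|ka_n - kL| < \tilde{\varepsilon}$, as desired.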
A lemma is a minor result that we prove in order to help us with a harder proof of a theorem
or grander result. Here, we will prove a useful fact relating to absolute value and inequalities that
we will use in later proofs of major theorems.
Proof. As the square of any number is nonnegative, it is clear that $|x|^2 = x^2$ for any number x.
Then,
$(|a + b|)^2 = (a + b)^2$
$= a^2 + 2ab + b^2$
$= |a|^2 + 2ab + |b|^2.$
It should also be certain that for any x, $x \leq |x|$ (either x = |x| or x = −|x|). Therefore, we must
have $ab \leq |ab| = |a| |b|$. Using this fact, $|a|^2 + 2ab + |b|^2 \leq |a|^2 + 2|a||b| + |b|^2$, which is equal to
$(|a| + |b|)^2$.
Therefore, $(|a + b|)^2 \leq (|a| + |b|)^2$. Since both terms are nonnegative, we can take the square
root of both sides to get $|a + b| \leq |a| + |b|$.
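As a quick sanity check of the inequality just proved, the sketch below tests it on randomly chosen integer pairs. The seed, the range, and the sample count are arbitrary assumptions, and passing the check is of course no replacement for the proof.

```python
import random


def triangle_inequality_holds(a: int, b: int) -> bool:
    """Check |a + b| <= |a| + |b| for one pair of numbers."""
    return abs(a + b) <= abs(a) + abs(b)


rng = random.Random(0)  # fixed seed so the sample is reproducible
pairs = [(rng.randint(-100, 100), rng.randint(-100, 100)) for _ in range(1000)]

print(all(triangle_inequality_holds(a, b) for a, b in pairs))
print(abs(3 + -5), abs(3) + abs(-5))  # strict inequality when the signs differ
```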
Proof. We first write out the definitions of both limits given. For a given ε > 0,
$\exists N_1\ \forall n > N_1, |a_n - L| < \frac{\varepsilon}{2},$
$\exists N_2\ \forall n > N_2, |b_n - M| < \frac{\varepsilon}{2}.$
Why are we allowed to use $\frac{\varepsilon}{2}$ instead of ε? We can make ε be any positive number we want, so it
does not matter.
Let $N = \max\{N_1, N_2\}$, that is to say, the bigger value out of $N_1$ and $N_2$. Then conveniently,
$\forall n > N, n > N_1 \wedge n > N_2.$
Therefore, ∀n > N,
$|a_n - L| < \frac{\varepsilon}{2} \wedge |b_n - M| < \frac{\varepsilon}{2},$
so by the Triangle Inequality,
$|(a_n + b_n) - (L + M)| = |(a_n - L) + (b_n - M)| \leq |a_n - L| + |b_n - M| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$
$|(a_n + b_n) - (L + M)| < \varepsilon,$
so $\lim_{n \to \infty} (a_n + b_n) = L + M$.
Since we are given that $0 < a_n$ (in other words, $a_n$ is positive for all n),
Now, we know that $0 < b_n < a_n < \varepsilon$. This means that $b_n < \varepsilon$, i.e. $|b_n - 0| < \varepsilon$. Therefore, ∀n > N,
$|b_n - 0| < \varepsilon$, and this statement implies $\lim_{n \to \infty} b_n = 0$.
We don’t have the tools required to prove this inequality yet, but it will be available as an
exercise in a later chapter when we learn proof by induction.
Theorem 4.3.14
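If 0 < a < 1, then $\lim_{n \to \infty} a^n = 0$.
Proof. If 0 < a < 1, then $\frac{1}{a} > 1$. Therefore, $\frac{1}{a} = 1 + h$ for some h > 0. By Bernoulli’s Inequality,
$\frac{1}{a^n} = \left(\frac{1}{a}\right)^n = (1 + h)^n \geq 1 + nh > nh.$
Thus, $0 < a^n < \frac{1}{nh} = \frac{1}{n} \cdot \frac{1}{h}$. Since $\lim_{n \to \infty} \frac{1}{n} = 0$, by Theorem 4.3.9, $\lim_{n \to \infty} \frac{1}{n} \cdot \frac{1}{h} = 0 \cdot \frac{1}{h}$, which is
still 0.
Since $0 < a^n < \frac{1}{nh}$ and $\lim_{n \to \infty} \frac{1}{nh} = 0$, by Theorem 4.3.12, $\lim_{n \to \infty} a^n = 0$.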
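The key step of the proof is the bound 0 < aⁿ < 1/(nh). The sketch below checks it for one sample base; the value a = 3/4 is a hypothetical choice, and the function name is illustrative.

```python
from fractions import Fraction

a = Fraction(3, 4)   # hypothetical base with 0 < a < 1
h = 1 / a - 1        # so that 1/a = 1 + h, here h = 1/3


def squeezed(n: int) -> bool:
    """Check the bound 0 < a^n < 1/(n h) that the proof derives from Bernoulli's Inequality."""
    return 0 < a ** n < 1 / (n * h)


print(all(squeezed(n) for n in range(1, 201)))
print(float(a ** 200))  # already extremely close to 0
```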
Beware that it is quite difficult to grasp the intuition or motivation behind the proofs for the
product and reciprocal of limits, so just try your best to understand how they work.
Proof. Given ε > 0, we know that, by the definition of a limit, we can find $N_1, N_2, N_3$ such that
$\forall n > N_1, |a_n - L| < \frac{\varepsilon}{2(|M| + 1)},$
$\forall n > N_2, |b_n - M| < \frac{\varepsilon}{2(|L| + 1)},$
$\forall n > N_3, |b_n - M| < 1.$
where we split the absolute value using the Triangle Inequality. Let N = max{N1 , N2 , N3 }. Note
that n > N → n > N1 ∧ n > N2 ∧ n > N3 . Then if n > N ,
Now, we can substitute in the inequalities we derived earlier (and using the obvious fact that
$|L| < |L| + 1$), to get that
$|b_n| |a_n - L| + |L| |b_n - M| < (|M| + 1) \cdot \frac{\varepsilon}{2(|M| + 1)} + (|L| + 1) \cdot \frac{\varepsilon}{2(|L| + 1)} = \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$
Thus, given an ε > 0, we have found an N such that $|a_n b_n - LM| < \varepsilon$ for all n > N. Therefore,
$\lim_{n \to \infty} a_n b_n = LM$.
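$\forall n > N_1, |a_n - L| < \frac{|L|}{2},$
$\forall n > N_2, |a_n - L| < \frac{\varepsilon |L|^2}{2}.$
$\forall n > N, |a_n - L| < \frac{|L|}{2} \wedge |a_n - L| < \frac{\varepsilon |L|^2}{2}.$
$|L| = |L - a_n + a_n| \leq |L - a_n| + |a_n| = |a_n - L| + |a_n| < \frac{|L|}{2} + |a_n|,$
so $|a_n| > \frac{|L|}{2}$. Both quantities are positive, so we can take the reciprocal and get $\frac{1}{|a_n|} < \frac{2}{|L|}$.
Therefore, for a given ε, we have found N such that
$\forall n > N, \left|\frac{1}{a_n} - \frac{1}{L}\right| = \left|\frac{L - a_n}{a_n L}\right| = \frac{|L - a_n|}{|a_n| |L|} = \frac{|a_n - L|}{|a_n| |L|} < \frac{\varepsilon |L|^2}{2} \cdot \frac{2}{|L|} \cdot \frac{1}{|L|} = \varepsilon,$
so $\lim_{n \to \infty} \frac{1}{a_n} = \frac{1}{L}$.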
Proof. Assume for the sake of contradiction that $\lim_{n \to \infty} \frac{1}{a_n} = L$. By Theorem 4.3.16, $\lim_{n \to \infty} a_n = \frac{1}{L}$.
We are given that this equals 0, so $\frac{1}{L} = 0 \implies 1 = 0$, which is clearly false. Thus, the limit does
not exist.
n
1. lim
n→∞ n+1
3 5
2. lim 2 + + 2 + 88−n + π
n→∞ n n
2n2 + 3n + 4
3. lim
n→∞ 3n2 + 5n − 2
Solution.
1
but notice that lim = 0, and 1 remains constant. Therefore, the limit is
n→∞ n
1 1
lim 1 = lim = 1.
n→∞ 1 + n→∞ 1 + 0
n
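2. Note that $\lim_{n\to\infty} \frac{3}{n} = 0$, $\lim_{n\to\infty} \frac{5}{n^2} = \lim_{n\to\infty} 5 \cdot \frac{1}{n} \cdot \frac{1}{n} = 0$, and $\lim_{n\to\infty} 88^{-n} = \lim_{n\to\infty} \left(\frac{1}{88}\right)^n = 0$ by Theorem 4.3.14. We are left with $2 + 0 + 0 + 0 + \pi = 2 + \pi$.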
3. Divide the numerator and denominator by the highest power of n, which in this case is $n^2$. Then all the terms with a constant in the numerator and a power of n in the denominator tend to 0, and the limit becomes clear:
$$\lim_{n\to\infty} \frac{2n^2 + 3n + 4}{3n^2 + 5n - 2} = \lim_{n\to\infty} \frac{2 + \frac{3}{n} + \frac{4}{n^2}}{3 + \frac{5}{n} - \frac{2}{n^2}} = \frac{2}{3}.$$
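A numerical sanity check can make these three limits more tangible. The following is a minimal Python sketch, not part of the argument above; the function names `seq1`, `seq2`, and `seq3` are illustrative choices, and evaluating at a few large n only suggests each limit rather than proving it.

```python
import math

def seq1(n):
    # n / (n + 1), expected to approach 1
    return n / (n + 1)

def seq2(n):
    # 2 + 3/n + 5/n^2 + 88^(-n) + pi, expected to approach 2 + pi
    return 2 + 3 / n + 5 / n**2 + 88.0 ** (-n) + math.pi

def seq3(n):
    # (2n^2 + 3n + 4) / (3n^2 + 5n - 2), expected to approach 2/3
    return (2 * n**2 + 3 * n + 4) / (3 * n**2 + 5 * n - 2)

for n in (10, 1_000, 1_000_000):
    print(n, seq1(n), seq2(n), seq3(n))
```

The printed values should settle near 1, 2 + π, and 2/3 as n grows.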
1. $\lim_{n\to\infty} \frac{2^n}{2^n + 3^n}$
2. $\lim_{n\to\infty} \frac{3^n}{2^n + 3^n}$
3. $\lim_{n\to\infty} \frac{5^n}{2^n + 3^n}$
Solution.
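$$\lim_{n\to\infty} \frac{\left(\frac{2}{3}\right)^n}{\left(\frac{2}{3}\right)^n + 1} = \frac{0}{0 + 1} = 0.$$
$$\lim_{n\to\infty} \frac{1}{\left(\frac{2}{3}\right)^n + 1}.$$
We know that $\lim_{n\to\infty} \left(\frac{2}{3}\right)^n = 0$. Therefore,
$$\lim_{n\to\infty} \frac{1}{\left(\frac{2}{3}\right)^n + 1} = \lim_{n\to\infty} \frac{1}{0 + 1} = 1.$$
3. We do the same thing to get $\lim_{n\to\infty} \frac{\left(\frac{5}{3}\right)^n}{\left(\frac{2}{3}\right)^n + 1}$. By properties of limits, we have
$$\lim_{n\to\infty} \frac{\left(\frac{5}{3}\right)^n}{\left(\frac{2}{3}\right)^n + 1} = \frac{\lim_{n\to\infty} \left(\frac{5}{3}\right)^n}{1 + 0} = \lim_{n\to\infty} \left(\frac{5}{3}\right)^n.$$
Note that $\left(\frac{5}{3}\right)^n = \left(1 + \frac{2}{3}\right)^n \ge 1 + \frac{2n}{3}$ by Bernoulli's Inequality. Since $1 + \frac{2n}{3}$ will go to infinity as n goes to infinity, so will $\left(\frac{5}{3}\right)^n$, therefore $\lim_{n\to\infty} \left(\frac{5}{3}\right)^n = \infty$.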
Moving aside from sequences, series, and limits, another section will be dedicated to summation and
product notation, which is useful shorthand to represent long sums and products in compact form.
where i is referred to as the dummy variable. This is read as “the summation from i = m to n of
ai .”
• $\sum_{r=3}^{7} r^2 = 9 + 16 + 25 + 36 + 49 = 135$.
• $\sum_{k=-1}^{3} k^3 = -1 + 0 + 1 + 8 + 27 = 35$.
• $\sum_{k=79}^{100} 7 = 22 \cdot 7 = 154$.
• $\sum_{j=2}^{5} j = 2 + 3 + 4 + 5 = 14$.
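Summation notation maps directly onto a loop over the dummy variable. As a minimal sketch (the helper name `summation` is an illustrative choice, not notation from the text), the four examples above can be recomputed as follows; note that the upper limit is inclusive, unlike Python's `range`.

```python
def summation(term, m, n):
    """Sum term(i) for i = m, m + 1, ..., n (both ends inclusive)."""
    return sum(term(i) for i in range(m, n + 1))

print(summation(lambda r: r**2, 3, 7))    # squares from r = 3 to 7
print(summation(lambda k: k**3, -1, 3))   # cubes from k = -1 to 3
print(summation(lambda k: 7, 79, 100))    # a constant term, added 22 times
print(summation(lambda j: j, 2, 5))       # j from 2 to 5
```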
Theorem 4.4.2
We can add summations together and pull out constant factors.
1. $\sum_{k=m}^{n} a_k + \sum_{k=m}^{n} b_k = \sum_{k=m}^{n} (a_k + b_k)$
2. $c \sum_{k=m}^{n} a_k = \sum_{k=m}^{n} c\,a_k$
Proof. By using the definition, the proofs of these involve simple algebra:
Problem 4.4.3. Write in summation form the general formulas for the sums of an arithmetic series
and a geometric series.
Note that if we changed the starting value of the dummy variable, we can still ‘shift over’ the
ending value and the expression itself to represent the same sum. For instance, if we had k = 41, we
could express the same sum above as
$$\sum_{k=41}^{118} \frac{k - 39}{k - 38}.$$
Solution. We evaluate the inner summation first, then the outer summation.
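$$\begin{aligned}
\sum_{i=1}^{4} \sum_{j=1}^{3} (i + j) &= \sum_{j=1}^{3} (1 + j) + \sum_{j=1}^{3} (2 + j) + \sum_{j=1}^{3} (3 + j) + \sum_{j=1}^{3} (4 + j) \\
&= (2 + 3 + 4) + (3 + 4 + 5) + (4 + 5 + 6) + (5 + 6 + 7) \\
&= 9 + 12 + 15 + 18 \\
&= 54.
\end{aligned}$$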
Problem 4.4.7. Find another way to express $\sum_{k=400}^{1000} k^2$.
Solution. The complication is that our dummy variable k starts at 400. We can start the index at k = 1 instead and either subtract what we don't want from the entire sum, or, as shown before, 'shift' the ending value and the sequence formula as well:
$$\sum_{k=1}^{1000} k^2 - \sum_{k=1}^{399} k^2$$
$$\sum_{k=1}^{601} (k + 399)^2$$
Try listing out the first few terms of each to convince yourself that all three ways represent the
same sum.
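If listing terms by hand feels tedious, the three expressions can also be compared numerically. This is only a quick check written as a Python sketch; the variable names are illustrative.

```python
# Original sum: k = 400, ..., 1000
direct = sum(k**2 for k in range(400, 1001))

# Start at k = 1 and subtract the unwanted terms k = 1, ..., 399
by_subtraction = sum(k**2 for k in range(1, 1001)) - sum(k**2 for k in range(1, 400))

# Shift the index: k = 1, ..., 601 with term (k + 399)^2
by_shifting = sum((k + 399) ** 2 for k in range(1, 602))

print(direct, by_subtraction, by_shifting)
```

All three printed values agree, since each expression adds exactly the squares of 400 through 1000.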
Problem 4.4.8. Compute $\sum_{k=10}^{20} k^2$ in two ways by:
b) Reindexing the summation starting from 1 and breaking the sum into parts.
Solution.
a) The sum $10^2 + 11^2 + \dots + 20^2$ is the same as subtracting the sum $1^2 + 2^2 + \dots + 9^2$ from the total sum $1^2 + 2^2 + \dots + 20^2$, making this more manageable with the formulas we already know:
$$\sum_{k=10}^{20} k^2 = \sum_{k=1}^{20} k^2 - \sum_{k=1}^{9} k^2 = \frac{20 \cdot 21 \cdot 41}{6} - \frac{9 \cdot 10 \cdot 19}{6} = 2870 - 285 = 2585.$$
b)
$$\sum_{k=10}^{20} k^2 = \sum_{k=1}^{11} (k + 9)^2 = \sum_{k=1}^{11} (k^2 + 18k + 81)$$
We can split $\sum_{k=1}^{11} (k^2 + 18k + 81)$ into a sum of partial sums, which we are able to compute individually using our formulas:
$$\begin{aligned}
\sum_{k=1}^{11} (k^2 + 18k + 81) &= \sum_{k=1}^{11} k^2 + 18 \sum_{k=1}^{11} k + \sum_{k=1}^{11} 81 \\
&= \frac{11 \cdot 12 \cdot 23}{6} + 18 \cdot \frac{11 \cdot 12}{2} + 11(81) \\
&= 506 + 1188 + 891 = 2585.
\end{aligned}$$
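Both routes can be replayed with the closed-form formulas and compared against a brute-force sum. A minimal sketch in Python, where `sum_of_squares` and `sum_of_integers` are illustrative names for the two standard formulas:

```python
def sum_of_squares(n):
    # 1^2 + 2^2 + ... + n^2 = n(n + 1)(2n + 1) / 6
    return n * (n + 1) * (2 * n + 1) // 6

def sum_of_integers(n):
    # 1 + 2 + ... + n = n(n + 1) / 2
    return n * (n + 1) // 2

brute_force = sum(k**2 for k in range(10, 21))
way_a = sum_of_squares(20) - sum_of_squares(9)                      # subtract the first 9 squares
way_b = sum_of_squares(11) + 18 * sum_of_integers(11) + 81 * 11     # reindex, then split

print(brute_force, way_a, way_b)
```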
You may recall from earlier that I represented an infinite geometric series by
$$a + ar + ar^2 + \dots$$
Using this newly introduced summation notation, we can represent this same infinite sum as
$$\sum_{k=0}^{\infty} ar^k.$$
Lastly, keep in mind that we can turn an infinite summation into a finite summation, with the limit attached in front of it, as such:
$$\sum_{k=1}^{\infty} a_k = \lim_{n\to\infty} \sum_{k=1}^{n} a_k.$$
In addition to summation, we also have a way to write the product of terms in shorthand form.
1. $\prod_{k=1}^{n} k = n!$.
2. $\prod_{k=1}^{n} r = r^n$.
3. $\prod_{k=2}^{79} \frac{k}{k+1} = \frac{2}{3} \cdot \frac{3}{4} \cdot \frac{4}{5} \cdots \frac{79}{80} = \frac{2}{80} = \frac{1}{40}$. This occurrence, when nearly all terms ultimately cancel each other out, is called telescoping.
4. $\sum_{k=40}^{80} \ln k = \ln \prod_{k=40}^{80} k$, if you recall that $\ln(a) + \ln(b) = \ln(ab)$.
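Product notation can be checked the same way summation notation can. The sketch below uses exact fractions for the telescoping product and floating point for the logarithm identity; it assumes Python 3.8 or later for `math.prod`, and the variable names are illustrative.

```python
import math
from fractions import Fraction

# Example 1 with n = 6: the product of 1..6 is 6!
factorial_6 = math.prod(range(1, 7))

# Example 3: (2/3)(3/4)...(79/80) telescopes to 2/80
telescoping = math.prod(Fraction(k, k + 1) for k in range(2, 80))

# Example 4: a sum of logarithms equals the logarithm of the product
sum_of_logs = sum(math.log(k) for k in range(40, 81))
log_of_product = math.log(math.prod(range(40, 81)))

print(factorial_6, telescoping, sum_of_logs, log_of_product)
```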
Theorem 4.4.10
Likewise, we can multiply two products together, and the power of the product is equal to the
product of the powers.
1. $\prod_{k=m}^{n} a_k \cdot \prod_{k=m}^{n} b_k = \prod_{k=m}^{n} a_k b_k$.
2. $\left(\prod_{k=m}^{n} a_k\right)^r = \prod_{k=m}^{n} (a_k)^r$.
Mathematical Induction
In this chapter, we will go over an essential technique of proof for certain claims, particularly when
the claim should be true for all positive integers (and variants).
Recall the following valid argument form, which underlies the principle of mathematical induction.
P (1)
∀n ∈ Z+ , P (n) → P (n + 1)
∴ ∀n ∈ Z+ , P (n).
We have already demonstrated that this argument form is valid at the end of Chapter 1. If we show that P(1) is true and the implication is true, then we get P(2), P(3), P(4), etc. for the rest of the positive integers.
Suppose you have a statement you are trying to prove for all positive integers n. Here are the steps you should take in a proof by induction:
1. Prove that the statement is true for n = 1. This is the base case.
2. Assume that the statement is true for n = k. This is called the inductive hypothesis.
3. Prove that if the inductive hypothesis is true, then the statement is also true for n = k + 1. This is the inductive step.
This kind of proof may seem mechanical, but this established, reliable structure is what makes induction proofs relatively straightforward.
Example 5.1.1
Prove $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$.
Proof. For clarity, let P(n) denote the assertion that $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$.
Base Case: $P(1): \sum_{k=1}^{1} k = \frac{1(1+1)}{2} = 1$.
We have 1 = 1, which is clearly true.
Inductive Step: Suppose $P(n): \sum_{k=1}^{n} k = \frac{n(n+1)}{2}$, i.e. $1 + 2 + 3 + \dots + n = \frac{n(n+1)}{2}$, is true.
We want to prove that $P(n+1): 1 + 2 + 3 + \dots + n + (n+1) = \frac{(n+1)(n+2)}{2}$ is true.
We have
$$\begin{aligned}
1 + 2 + 3 + \dots + n + (n+1) &= \frac{n(n+1)}{2} + (n+1) \\
&= (n+1)\left(\frac{n}{2} + 1\right) \\
&= (n+1)\left(\frac{n+2}{2}\right) \\
&= \frac{(n+1)(n+2)}{2}.
\end{aligned}$$
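The two ingredients of this proof can be mirrored in a short numerical experiment. The sketch below is only a sanity check over finitely many n, not a substitute for the induction; `closed_form` is an illustrative name for the right-hand side.

```python
def closed_form(n):
    # Claimed value of 1 + 2 + ... + n
    return n * (n + 1) // 2

# Base case: the formula gives 1 when n = 1
base_case_holds = closed_form(1) == 1

# Inductive step, checked numerically: adding (n + 1) to the formula
# for n must give the formula for n + 1
step_holds = all(closed_form(n) + (n + 1) == closed_form(n + 1) for n in range(1, 1000))

# Direct comparison against the actual sums
matches_sums = all(sum(range(1, n + 1)) == closed_form(n) for n in range(1, 200))

print(base_case_holds, step_holds, matches_sums)
```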
Proof. Let $P(n): \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$.
Base Case: $P(1): \sum_{k=1}^{1} k^2 = \frac{1 \cdot 2 \cdot 3}{6} = 1$.
We have $1^2 = 1$, so we're done.
Inductive Step: Suppose $P(n): \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$ is true.
We want to prove that $P(n+1): 1^2 + 2^2 + 3^2 + \dots + n^2 + (n+1)^2 = \frac{(n+1)(n+2)(2n+3)}{6}$.
We have
$$\begin{aligned}
1^2 + 2^2 + 3^2 + \dots + n^2 + (n+1)^2 &= \frac{n(n+1)(2n+1)}{6} + (n+1)^2 \\
&= (n+1)\left(\frac{n(2n+1)}{6} + (n+1)\right) \\
&= (n+1)\left(\frac{2n^2 + n}{6} + \frac{6n + 6}{6}\right) \\
&= (n+1)\left(\frac{2n^2 + 7n + 6}{6}\right) \\
&= (n+1)\left(\frac{(2n+3)(n+2)}{6}\right) \\
&= \frac{(n+1)(n+2)(2n+3)}{6}.
\end{aligned}$$
Problem 5.1.3.
1. Prove $\sum_{i=1}^{n} i^3 = \frac{n^2(n+1)^2}{4}$ by induction.
2. Then compute $\sum_{i=1}^{10} (2i^3 - 3i^2 + 5i - 7)$.
Proof. For the induction proof, let $P(n): \sum_{i=1}^{n} i^3 = \frac{n^2(n+1)^2}{4}$.
Base Case: $P(1): \sum_{i=1}^{1} i^3 = \frac{1^2(1+1)^2}{4} = \frac{4}{4} = 1$.
We have $1^3 = 1$, which is clearly true.
Inductive Step: Suppose $P(n): \sum_{i=1}^{n} i^3 = \frac{n^2(n+1)^2}{4}$.
We must prove that $P(n+1): 1^3 + 2^3 + 3^3 + \dots + n^3 + (n+1)^3 = \frac{(n+1)^2(n+2)^2}{4}$.
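Then, note that
$$\begin{aligned}
1^3 + 2^3 + 3^3 + \dots + n^3 + (n+1)^3 &= \frac{n^2(n+1)^2}{4} + (n+1)^3 \\
&= (n+1)^2\left(\frac{n^2}{4} + (n+1)\right) \\
&= (n+1)^2\left(\frac{n^2}{4} + \frac{4n+4}{4}\right) \\
&= (n+1)^2\left(\frac{n^2 + 4n + 4}{4}\right) \\
&= (n+1)^2\left(\frac{(n+2)^2}{4}\right) \\
&= \frac{(n+1)^2(n+2)^2}{4},
\end{aligned}$$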
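For the computation, we split the sum into pieces and apply the formulas for each power:
$$\begin{aligned}
\sum_{i=1}^{10} (2i^3 - 3i^2 + 5i - 7) &= 2\sum_{i=1}^{10} i^3 - 3\sum_{i=1}^{10} i^2 + 5\sum_{i=1}^{10} i - \sum_{i=1}^{10} 7 \\
&= 2 \cdot \frac{10^2 \cdot 11^2}{4} - 3 \cdot \frac{10 \cdot 11 \cdot 21}{6} + 5 \cdot \frac{10 \cdot 11}{2} - 70 \\
&= 6050 - 1155 + 275 - 70 \\
&= 5100.
\end{aligned}$$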
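The final value can be confirmed by brute force, and the cube formula can be spot-checked at the same time. A small Python sketch with illustrative names:

```python
def sum_of_cubes(n):
    # Claimed closed form: n^2 (n + 1)^2 / 4
    return n**2 * (n + 1) ** 2 // 4

# Spot-check the closed form against direct sums
formula_matches = all(sum(i**3 for i in range(1, n + 1)) == sum_of_cubes(n) for n in range(1, 100))

# Evaluate the requested sum term by term
total = sum(2 * i**3 - 3 * i**2 + 5 * i - 7 for i in range(1, 11))

print(formula_matches, total)
```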
Problem 5.1.4. Let $a_n = \sum_{i=1}^{n} \frac{1}{i(i+1)}$.
1. Compute $a_1$, $a_2$, $a_3$, $a_4$, and $a_5$.
2. Hypothesize a formula for $a_n$.
3. Prove that formula using induction.
4. What is $\sum_{i=1}^{\infty} \frac{1}{i(i+1)}$?
Solution.
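Note that
$$\begin{aligned}
\frac{1}{1(1+1)} + \frac{1}{2(2+1)} + \dots + \frac{1}{n(n+1)} + \frac{1}{(n+1)(n+2)} &= \frac{n}{n+1} + \frac{1}{(n+1)(n+2)} \\
&= \frac{n(n+2) + 1}{(n+1)(n+2)} \\
&= \frac{n^2 + 2n + 1}{(n+1)(n+2)} \\
&= \frac{(n+1)^2}{(n+1)(n+2)} \\
&= \frac{n+1}{n+2}.
\end{aligned}$$
4. Recall that we can rewrite an infinite summation as the limit of a finite summation:
$$\sum_{i=1}^{\infty} \frac{1}{i(i+1)} = \lim_{n\to\infty} \sum_{i=1}^{n} \frac{1}{i(i+1)} = \lim_{n\to\infty} \frac{n}{n+1}.$$
Dividing the numerator and denominator by n gives $\frac{n}{n+1} = \frac{1}{1 + \frac{1}{n}}$. Then $\frac{1}{n}$ tends to 0 as n gets arbitrarily large, so we are left with $\frac{1}{1+0} = 1$. Therefore $\sum_{i=1}^{\infty} \frac{1}{i(i+1)} = 1$.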
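Parts 1 and 2 of this problem lend themselves to experimentation with exact arithmetic. The sketch below computes the partial sums with Python's `fractions` module; the function name `partial_sum` is an illustrative stand-in for $a_n$.

```python
from fractions import Fraction

def partial_sum(n):
    """Exact value of 1/(1*2) + 1/(2*3) + ... + 1/(n(n+1))."""
    return sum(Fraction(1, i * (i + 1)) for i in range(1, n + 1))

first_five = [partial_sum(n) for n in range(1, 6)]
print(first_five)

# The hypothesized formula n / (n + 1), checked for many n
formula_matches = all(partial_sum(n) == Fraction(n, n + 1) for n in range(1, 100))
print(formula_matches)

# The partial sums creep toward 1, in line with part 4
print(float(partial_sum(10_000)))
```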
Problem 5.1.5. Let $f(n) = \prod_{k=1}^{n} \left(1 - \frac{1}{k+1}\right)$.
Solution.
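ii) The pattern of the first five terms suggests that the formula is $f(n) = \frac{1}{n+1}$.
iii) The statement we must prove is: $\prod_{k=1}^{n} \left(1 - \frac{1}{k+1}\right) = \frac{1}{n+1}$.
Let $P(n): \prod_{k=1}^{n} \left(1 - \frac{1}{k+1}\right) = \frac{1}{n+1}$.
Base Case: $P(1): \prod_{k=1}^{1} \left(1 - \frac{1}{k+1}\right) = 1 - \frac{1}{2} = \frac{1}{2}$.
But $\frac{1}{n+1}$ where n = 1 is $\frac{1}{2}$, so the base case is true.
Inductive Step: Assume $P(n): \prod_{k=1}^{n} \left(1 - \frac{1}{k+1}\right) = \frac{1}{n+1}$.
We want to prove: $P(n+1): \prod_{k=1}^{n+1} \left(1 - \frac{1}{k+1}\right) = \frac{1}{n+2}$.
We have
$$\begin{aligned}
\left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{4}\right) \cdots \left(1 - \frac{1}{n+1}\right)\left(1 - \frac{1}{n+2}\right) &= \frac{1}{n+1}\left(1 - \frac{1}{n+2}\right) \\
&= \frac{1}{n+1} \cdot \frac{n+1}{n+2} \\
&= \frac{1}{n+2}.
\end{aligned}$$
iv) Like before, we can rewrite an infinite product as the limit of a finite product, and evaluate as such:
$$\prod_{k=1}^{\infty} \left(1 - \frac{1}{k+1}\right) = \lim_{n\to\infty} \prod_{k=1}^{n} \left(1 - \frac{1}{k+1}\right) = \lim_{n\to\infty} \frac{1}{n+1} = 0.$$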
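The same kind of exact experiment works for this product. In the sketch below, `f` mirrors the definition in the problem; it assumes Python 3.8 or later for `math.prod`.

```python
import math
from fractions import Fraction

def f(n):
    """Exact value of the product of (1 - 1/(k+1)) for k = 1, ..., n."""
    return math.prod(1 - Fraction(1, k + 1) for k in range(1, n + 1))

print([f(n) for n in range(1, 6)])

# Compare with the conjectured closed form 1 / (n + 1)
formula_matches = all(f(n) == Fraction(1, n + 1) for n in range(1, 100))
print(formula_matches)
```

Since each factor equals k/(k + 1), the product telescopes, which is why the values shrink toward 0.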
However, proofs by induction are not limited to summations. For the next few problems, we
establish a few facts about divisibility first.
1. a | b ∧ b | c −→ a | c
2. a | b ∧ a | c −→ a | (b + c)
3. a | b ∧ a | c −→ a | (b − c)
4. a | b −→ a | (bc)
$$5 \mid (5 \cdot 3^n) \wedge 5 \mid (8^{n+1} - 8 \cdot 3^n) \longrightarrow 5 \mid \left((8^{n+1} - 8 \cdot 3^n) + (5 \cdot 3^n)\right)$$
i.e. $5 \mid (8^{n+1} - 3 \cdot 3^n)$
i.e. $5 \mid (8^{n+1} - 3^{n+1})$.
We have reached the conclusion that $P(n+1): 5 \mid (8^{n+1} - 3^{n+1})$ is true, so our proof is complete.
Proof. There is something different about this problem: we must prove it for all positive odd integers n. In this case, it suffices to prove P(1) and then P(n) → P(n + 2).
Let $P(n): 11 \mid (8^n + 3^n)$.
Base Case: $P(1): 11 \mid (8^1 + 3^1)$, i.e. $11 \mid 11$, which is true.
Inductive Step: Suppose that $P(n): 11 \mid (8^n + 3^n)$ is true.
Then we must prove $P(n+2): 11 \mid (8^{n+2} + 3^{n+2})$.
The statement $P(n+6): 91 \mid (3^{n+6} + 4^{n+6})$ is true, and therefore our inductive step holds, and the proof is done.
Problem 5.1.11. Prove ∀n ∈ Z+, $3 \mid (n^3 - n)$.
We have proven P (n + 1), which completes the inductive step, so we are done.
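Divisibility claims like these are easy to test for many values of n before attempting a proof. The sketch below checks three of the statements from this section using Python's exact integer arithmetic; the helper name `divides` is illustrative, and passing these checks is evidence, not proof.

```python
def divides(a, b):
    """Return True when a | b, i.e. b is a multiple of a."""
    return b % a == 0

# 5 | (8^n - 3^n) for positive integers n
claim_5 = all(divides(5, 8**n - 3**n) for n in range(1, 200))

# 11 | (8^n + 3^n) for positive odd integers n (step size 2, as in P(n) -> P(n + 2))
claim_11 = all(divides(11, 8**n + 3**n) for n in range(1, 200, 2))

# 3 | (n^3 - n) for positive integers n
claim_3 = all(divides(3, n**3 - n) for n in range(1, 200))

print(claim_5, claim_11, claim_3)
```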
Problem 5.1.12. Prove $\sum_{i=1}^{n} \frac{1}{i^2} \le 2 - \frac{1}{n}$ ∀n ∈ Z+ (and by implication $\sum_{i=1}^{n} \frac{1}{i^2} < 2$).
Proof. Let $P(n): \sum_{i=1}^{n} \frac{1}{i^2} \le 2 - \frac{1}{n}$.
Base Case: $P(1): \sum_{i=1}^{1} \frac{1}{i^2} = 1 \le 2 - \frac{1}{1}$.
We have 1 ≤ 1, which is true.
Inductive Step: Assume $P(n): \sum_{i=1}^{n} \frac{1}{i^2} \le 2 - \frac{1}{n}$.
We wish to prove that $P(n+1): \sum_{i=1}^{n+1} \frac{1}{i^2} \le 2 - \frac{1}{n+1}$.
In other words, given $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots + \frac{1}{n^2} \le 2 - \frac{1}{n}$, we should prove that $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots + \frac{1}{n^2} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1}$. As a first step, we can manipulate our assumption into something similar to our goal, as follows:
$$\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots + \frac{1}{n^2} \le 2 - \frac{1}{n} \quad \text{(by inductive hypothesis)}$$
$$\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots + \frac{1}{n^2} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n} + \frac{1}{(n+1)^2}$$
How would we use this inequality to get a result like $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots + \frac{1}{n^2} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1}$?
Try to find a relation between $2 - \frac{1}{n} + \frac{1}{(n+1)^2}$ and $2 - \frac{1}{n+1}$ that would quickly enable us to finish the proof.
In fact, we want the inequality $2 - \frac{1}{n} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1}$ to be true.
Why? If the above is true, then $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots + \frac{1}{n^2} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1} \implies \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots + \frac{1}{n^2} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1}$, allowing us to quickly finish the proof.
Indeed, this is true, and the proof is as follows:
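As n is positive, $n^2 + 2n + 1 \ge n^2 + 2n$. Then, we have
$$\begin{aligned}
(n+1)^2 &\ge n(n+2) \\
1 &\ge \frac{n(n+2)}{(n+1)^2} \\
\frac{1}{n} &\ge \frac{n+2}{(n+1)^2} \\
\frac{1}{n} &\ge \frac{(n+1) + 1}{(n+1)^2} \\
\frac{1}{n} &\ge \frac{n+1}{(n+1)^2} + \frac{1}{(n+1)^2} \\
\frac{1}{n} &\ge \frac{1}{n+1} + \frac{1}{(n+1)^2} \\
\frac{1}{n} - \frac{1}{(n+1)^2} &\ge \frac{1}{n+1} \\
-\frac{1}{n} + \frac{1}{(n+1)^2} &\le -\frac{1}{n+1} \\
\therefore\ 2 - \frac{1}{n} + \frac{1}{(n+1)^2} &\le 2 - \frac{1}{n+1}
\end{aligned}$$
Since $2 - \frac{1}{n} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1}$, we have therefore proven that $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots + \frac{1}{n^2} + \frac{1}{(n+1)^2} \le 2 - \frac{1}{n+1}$. Our inductive step holds true, and so our proof is complete.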
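Both the bound itself and the auxiliary inequality used in the inductive step can be tested with exact rational arithmetic. This is a hedged Python sketch with illustrative function names; it only examines finitely many n.

```python
from fractions import Fraction

def bound_holds_up_to(n_max):
    """Check 1/1^2 + ... + 1/n^2 <= 2 - 1/n for every n from 1 to n_max."""
    total = Fraction(0)
    for n in range(1, n_max + 1):
        total += Fraction(1, n * n)
        if total > 2 - Fraction(1, n):
            return False
    return True

def step_inequality_holds(n):
    """Check 2 - 1/n + 1/(n+1)^2 <= 2 - 1/(n+1), the key fact in the inductive step."""
    return 2 - Fraction(1, n) + Fraction(1, (n + 1) ** 2) <= 2 - Fraction(1, n + 1)

print(bound_holds_up_to(300))
print(all(step_inequality_holds(n) for n in range(1, 300)))
```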
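Problem 5.1.13. Prove $\sum_{k=1}^{n} \frac{1}{\sqrt{k}} < 2\sqrt{n}$, ∀n ∈ Z+.
Proof. Let $P(n): \sum_{k=1}^{n} \frac{1}{\sqrt{k}} < 2\sqrt{n}$.
Base Case: $P(1): \sum_{k=1}^{1} \frac{1}{\sqrt{k}} = 1$.
Then 1 is less than $2\sqrt{1} = 2$, so the base case is true.
Inductive Step: Suppose $P(n): \sum_{k=1}^{n} \frac{1}{\sqrt{k}} < 2\sqrt{n}$ is true.
We want to prove: $P(n+1): \sum_{k=1}^{n+1} \frac{1}{\sqrt{k}} < 2\sqrt{n+1}$.
By assumption we have
$$\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \dots + \frac{1}{\sqrt{n}} < 2\sqrt{n}.$$
Add $\frac{1}{\sqrt{n+1}}$ to both sides. Then,
$$\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \dots + \frac{1}{\sqrt{n}} + \frac{1}{\sqrt{n+1}} < 2\sqrt{n} + \frac{1}{\sqrt{n+1}}.$$
We need $2\sqrt{n} + \frac{1}{\sqrt{n+1}} < 2\sqrt{n+1}$ to be true in order to complete the proof. Below, we will prove this fact.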
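Note that n ∈ Z+. Then,
$$\begin{aligned}
\left(\sqrt{n} - \sqrt{n+1}\right)^2 &> 0 \\
n - 2\sqrt{n(n+1)} + n + 1 &> 0 \\
2\sqrt{n(n+1)} &< 2n + 1 \\
2\sqrt{n(n+1)} + 1 &< 2n + 2 \\
2\sqrt{n(n+1)} + 1 &< 2(n+1) \\
2\sqrt{n}\sqrt{n+1} \cdot \frac{1}{\sqrt{n+1}} + 1 \cdot \frac{1}{\sqrt{n+1}} &< 2(n+1) \cdot \frac{1}{\sqrt{n+1}} \\
2\sqrt{n} + \frac{1}{\sqrt{n+1}} &< 2(n+1) \cdot \frac{\sqrt{n+1}}{n+1} \\
\therefore\ 2\sqrt{n} + \frac{1}{\sqrt{n+1}} &< 2\sqrt{n+1}.
\end{aligned}$$
Thus,
$$\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \dots + \frac{1}{\sqrt{n}} + \frac{1}{\sqrt{n+1}} < 2\sqrt{n} + \frac{1}{\sqrt{n+1}} < 2\sqrt{n+1},$$
or
$$\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \dots + \frac{1}{\sqrt{n}} + \frac{1}{\sqrt{n+1}} < 2\sqrt{n+1},$$
which concludes the inductive step, so our proof by induction is complete.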
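A floating-point experiment can illustrate both the final inequality and the auxiliary fact used in the inductive step. The sketch below is only a numerical illustration under the usual caveats of rounding error, with illustrative function names.

```python
import math

def sqrt_bound_holds_up_to(n_max):
    """Check 1/sqrt(1) + ... + 1/sqrt(n) < 2 sqrt(n) for every n from 1 to n_max."""
    total = 0.0
    for n in range(1, n_max + 1):
        total += 1 / math.sqrt(n)
        if not total < 2 * math.sqrt(n):
            return False
    return True

def step_inequality_holds(n):
    """Check 2 sqrt(n) + 1/sqrt(n+1) < 2 sqrt(n+1), the key fact in the inductive step."""
    return 2 * math.sqrt(n) + 1 / math.sqrt(n + 1) < 2 * math.sqrt(n + 1)

print(sqrt_bound_holds_up_to(2000))
print(all(step_inequality_holds(n) for n in range(1, 2000)))
```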
Exercise 5.1.17. Prove for all positive odd integers n, $3 \mid (2^n + 1)$.
Exercise 5.1.18. Prove ∀n ∈ Z+, $\sum_{k=0}^{n} 2^k = 2^{n+1} - 1$.
Exercise 5.1.20. Prove for all positive odd integers n, $8 \mid (n^2 - 1)$.
Exercise 5.1.21. Prove for all positive odd integers n, $16 \mid (n^4 - 1)$.
Exercise 5.1.28. Prove Lemma 4.3.13. That is, given ∀n ∈ N0, h ≥ −1, prove that $(1 + h)^n \ge 1 + hn$.
Induction is not necessarily limited to simply one inductive step. Consider the following argument
form:
∀n ∈ Z+ P (n) −→ P (2n)
∀n ∈ Z+ P (n) −→ P (n − 1)
P (2)
∴ ∀n ∈ Z+ P (n)
Given P (2), we know that P (4) must be true. If P (4) is true, then P (3) is true.
Given P (4), we know that P (8) must be true. If P (8) is true, then P (7), P (6), and P (5) are
true.
Given P (8), we know that P (16) must be true. If P (16), then P (15), . . . all the way down to
P (9) are true.
In other words, we can use P (n) −→ P (2n) to increase our scope while P (n) −→ P (n − 1) will
take care of all the statements in the gap between P (n) and P (2n).
Thus, it follows that this argument form, called Cauchy Induction, is indeed valid. This form
will aid us in proving the following major result:
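To see how the two implications cover every positive integer, we can simulate which statements become known when starting from P(2). The sketch below is illustrative only: the function name `cauchy_reachable` and the cap of `2 * limit` on the doubling step are assumptions made so that the search terminates.

```python
def cauchy_reachable(limit):
    """Indices n for which P(n) follows from P(2), using P(n) -> P(2n) and P(n) -> P(n - 1).

    Doubling is only followed up to 2 * limit so that the search is finite.
    """
    cap = 2 * limit
    known = {2}          # the base case P(2)
    to_visit = [2]
    while to_visit:
        n = to_visit.pop()
        for m in (2 * n, n - 1):      # the two implications
            if 1 <= m <= cap and m not in known:
                known.add(m)
                to_visit.append(m)
    return known

reached = cauchy_reachable(100)
print(all(n in reached for n in range(1, 101)))
```

Doubling pushes the frontier up to a power of 2 at least as large as the limit, and stepping down by 1 then fills in every index below it.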
Proof. Let $P(n): \frac{a_1 + a_2 + a_3 + \dots + a_n}{n} \ge \sqrt[n]{a_1 a_2 a_3 \cdots a_n}$.
Base Case: We want to prove that $P(2): \frac{a_1 + a_2}{2} \ge \sqrt{a_1 a_2}$.
First, as the square of any number is nonnegative, we know that $(a_1 - a_2)^2 \ge 0$. Then,
By assumption of P(n), we know that $\frac{a_1 + a_2 + a_3 + \dots + a_n}{n} \ge \sqrt[n]{a_1 a_2 a_3 \cdots a_n}$, and similarly, $\frac{b_1 + b_2 + b_3 + \dots + b_n}{n} \ge \sqrt[n]{b_1 b_2 b_3 \cdots b_n}$.
As all terms are positive, we can multiply the inequalities together to get
$$\frac{a_1 + a_2 + a_3 + \dots + a_n}{n} \cdot \frac{b_1 + b_2 + b_3 + \dots + b_n}{n} \ge \sqrt[n]{a_1 a_2 a_3 \cdots a_n} \cdot \sqrt[n]{b_1 b_2 b_3 \cdots b_n}.$$
Thus, we have
$$\begin{aligned}
\frac{a_1 + a_2 + \dots + a_n + b_1 + b_2 + \dots + b_n}{2n} &\ge \sqrt{\frac{a_1 + a_2 + \dots + a_n}{n} \cdot \frac{b_1 + b_2 + \dots + b_n}{n}} \\
&\ge \sqrt{\sqrt[n]{a_1 a_2 a_3 \cdots a_n} \cdot \sqrt[n]{b_1 b_2 b_3 \cdots b_n}} \\
&\ge \sqrt{\sqrt[n]{a_1 a_2 a_3 \cdots a_n b_1 b_2 b_3 \cdots b_n}} \\
&\ge \sqrt[2n]{a_1 a_2 a_3 \cdots a_n b_1 b_2 b_3 \cdots b_n}.
\end{aligned}$$
Problem 5.2.2. How many different 3-member committees can be formed from a group of 15
people?
Solution.
Out of the total 15 people, we want to select 3 of them. Then the total number of ways is
$$\binom{15}{3} = \frac{15!}{3!\,12!} = \frac{15 \cdot 14 \cdot 13}{3 \cdot 2} = 455.$$
$$\begin{aligned}
\frac{n!}{k!(n-k)!} + \frac{n!}{(k+1)!(n-k-1)!} &= \frac{n!(k+1)}{k!(k+1)(n-k)!} + \frac{n!(n-k)}{(k+1)!(n-k-1)!(n-k)} \\
&= \frac{n!(k+1)}{(k+1)!(n-k)!} + \frac{n!(n-k)}{(k+1)!(n-k)!} \\
&= \frac{n!(k+1+n-k)}{(k+1)!(n-k)!} \\
&= \frac{n!(n+1)}{(k+1)!(n-k)!} \\
&= \frac{(n+1)!}{(k+1)!((n+1)-(k+1))!} \\
&= \binom{n+1}{k+1}.
\end{aligned}$$
Alternative Proof. In fact, we can also prove this identity using a combinatorial argument.
Consider a group of n + 1 people and we want to choose k + 1 of them to form a committee.
Now focus on a particular person - call that person A.
We can either include or not include person A in the committee. If we do choose to include person A, then there are $\binom{n}{k}$ ways to choose the k remaining people for the committee out of the n remaining people, since person A took one spot on the committee.
Proof. At first, it seems unclear how to prove this by induction, because we have two variables n and k to deal with. One approach may use double induction, where we let $P(n, k): \binom{n}{k} \in \mathbb{Z}$ and then prove P(n, k) → P(n + 1, k) and P(n, k) → P(n, k + 1).
However, this seems tedious, and in fact there is a more elegant solution. Let $Q(n): \binom{n}{k} \in \mathbb{Z}$ for k = 0, 1, 2, . . . , n. Then we only have to worry about n as the variable in question. By letting Q(n) refer to a collection of statements, as such:
if we are able to prove that Q(n) ∀n is true, then P(n, k) ∀n, k ≤ n will be true. This will finish the proof cleanly.
Now, we proceed by induction on n of Q(n).
Base Case: $Q(1): \binom{1}{0} = \frac{1!}{0!\,1!} = 1$, and $\binom{1}{1} = \frac{1!}{1!\,0!} = 1$, which are integers, so the base case is done.
Inductive Step: Suppose $Q(n): \binom{n}{k} \in \mathbb{Z}$ for k = 0, 1, 2, . . . , n is true.
We want to prove $Q(n+1): \binom{n+1}{k} \in \mathbb{Z}$ for k = 0, 1, 2, . . . , n + 1.
Writing out the terms for $\binom{n+1}{k}$, k = 0, 1, 2, . . . , n, n + 1, we have:
$$\binom{n+1}{0}, \binom{n+1}{1}, \binom{n+1}{2}, \dots, \binom{n+1}{n}, \binom{n+1}{n+1}.$$
By Theorem 5.2.3, each of these except the first and the last can be expressed as the sum of two terms from $\binom{n}{k}$, k = 0, 1, 2, . . . , n, as follows:
$$\binom{n+1}{0},\ \binom{n}{0} + \binom{n}{1},\ \binom{n}{1} + \binom{n}{2},\ \dots,\ \binom{n}{n-1} + \binom{n}{n},\ \binom{n+1}{n+1}.$$
Note that $\binom{n+1}{0} = \binom{n+1}{n+1} = 1$, and obviously these are integers. From the assumption of Q(n), we have thereby assumed that each of $\binom{n}{0}, \binom{n}{1}, \binom{n}{2}, \dots, \binom{n}{n}$ is an integer. Then any sum of some terms from these is also an integer. Therefore, each term from $\binom{n+1}{k}$, k = 0, 1, 2, . . . , n, n + 1 (except $\binom{n+1}{0}$ and $\binom{n+1}{n+1}$, which are integers anyway) can be expressed as the sum of two integers, and must necessarily be an integer.
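The inductive step is exactly the rule used to build Pascal's triangle row by row: each interior entry is the sum of two entries from the previous row, so only integer addition is ever needed. A minimal sketch, assuming Python 3.8 or later for `math.comb` and using the illustrative name `pascal_rows`:

```python
import math

def pascal_rows(n_max):
    """Rows 0..n_max of Pascal's triangle, built only by adding adjacent entries."""
    row = [1]
    rows = [row]
    for _ in range(n_max):
        # Ends are 1; each interior entry is the sum of two neighbours above it
        row = [1] + [row[k] + row[k + 1] for k in range(len(row) - 1)] + [1]
        rows.append(row)
    return rows

rows = pascal_rows(12)
print(rows[5])

# Each entry agrees with the factorial definition of the binomial coefficient
print(all(rows[n][k] == math.comb(n, k) for n in range(13) for k in range(n + 1)))
```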
Clearly $(x + y)^1 = x + y$, so therefore $(x + y)^1 = \sum_{k=0}^{1} \binom{1}{k} x^{1-k} y^k$, and the base case is true.
Inductive Step: Assume $(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k$ is true.
We want to prove: $(x + y)^{n+1} = \sum_{k=0}^{n+1} \binom{n+1}{k} x^{n+1-k} y^k$.
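We use Theorem 5.2.3 to proceed with the inductive step.
$$\begin{aligned}
(x + y)^{n+1} &= (x + y) \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k \\
&= x \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k + y \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k \\
&= \sum_{k=0}^{n} \binom{n}{k} x^{n+1-k} y^k + \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^{k+1} \\
&= \binom{n}{0} x^{n+1} + \binom{n}{1} x^n y + \binom{n}{2} x^{n-1} y^2 + \binom{n}{3} x^{n-2} y^3 + \dots + \binom{n}{n} x y^n \\
&\quad + \binom{n}{0} x^n y + \binom{n}{1} x^{n-1} y^2 + \binom{n}{2} x^{n-2} y^3 + \dots + \binom{n}{n} y^{n+1} \\
&= \binom{n}{0} x^{n+1} + \binom{n}{1} x^n y + \binom{n}{0} x^n y + \binom{n}{2} x^{n-1} y^2 + \binom{n}{1} x^{n-1} y^2 \\
&\quad + \binom{n}{3} x^{n-2} y^3 + \binom{n}{2} x^{n-2} y^3 + \dots + \binom{n}{n} x y^n + \binom{n}{n-1} x y^n + \binom{n}{n} y^{n+1} \\
&= \binom{n}{0} x^{n+1} + \left[\binom{n}{1} + \binom{n}{0}\right] x^n y + \left[\binom{n}{2} + \binom{n}{1}\right] x^{n-1} y^2 \\
&\quad + \left[\binom{n}{3} + \binom{n}{2}\right] x^{n-2} y^3 + \dots + \left[\binom{n}{n} + \binom{n}{n-1}\right] x y^n + \binom{n}{n} y^{n+1} \\
&= \binom{n}{0} x^{n+1} + \binom{n+1}{1} x^n y + \binom{n+1}{2} x^{n-1} y^2 + \dots + \binom{n+1}{n} x y^n + \binom{n}{n} y^{n+1}.
\end{aligned}$$
Note that $\binom{n}{0} = \binom{n+1}{0} = 1$ and $\binom{n}{n} = \binom{n+1}{n+1} = 1$. Then,
$$\begin{aligned}
(x + y)^{n+1} &= \binom{n+1}{0} x^{n+1} + \binom{n+1}{1} x^n y + \binom{n+1}{2} x^{n-1} y^2 + \dots + \binom{n+1}{n} x y^n + \binom{n+1}{n+1} y^{n+1} \\
&= \sum_{k=0}^{n+1} \binom{n+1}{k} x^{n+1-k} y^k,
\end{aligned}$$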
1. $(x + y)^4$
2. $(2x + 3y)^3$
$$\begin{aligned}
(2x + 3y)^3 &= \binom{3}{0} (2x)^3 (3y)^0 + \binom{3}{1} (2x)^2 (3y)^1 + \binom{3}{2} (2x)^1 (3y)^2 + \binom{3}{3} (2x)^0 (3y)^3 \\
&= (2x)^3 + 3(2x)^2(3y) + 3(2x)(3y)^2 + (3y)^3 \\
&= 8x^3 + 36x^2 y + 54xy^2 + 27y^3.
\end{aligned}$$
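An expansion like this can be cross-checked in two independent ways: from the Binomial Theorem directly, and by multiplying out the factors one at a time. The following Python sketch does both for a binomial of the form $(ax + by)^n$; the function names are illustrative, and each coefficient list is ordered by the power of y (from $y^0$ up to $y^n$).

```python
import math

def binomial_theorem_coefficients(a, b, n):
    """Coefficient of x^(n-k) y^k in (a x + b y)^n for k = 0..n, via the Binomial Theorem."""
    return [math.comb(n, k) * a ** (n - k) * b**k for k in range(n + 1)]

def coefficients_by_multiplication(a, b, n):
    """Same coefficients, obtained by multiplying by (a x + b y) n times."""
    coeffs = [1]                       # the constant polynomial 1
    for _ in range(n):
        nxt = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += a * c            # multiplying by a x keeps the power of y
            nxt[k + 1] += b * c        # multiplying by b y raises the power of y
        coeffs = nxt
    return coeffs

print(binomial_theorem_coefficients(2, 3, 3))
print(coefficients_by_multiplication(2, 3, 3))
print(binomial_theorem_coefficients(1, 1, 4))
```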
7 150
Problem 5.2.8. When expanded, what is the constant term of 5x − 2
3 ?
x
Problem 5.2.9. When expanded, what is the full term of the form $kx^{1776} y^r$ in $(4x - 5y)^{2017}$?
Solution. The Binomial Theorem states that a general term in a binomial expansion to the nth power is of the form
$$\binom{n}{k} x^{n-k} y^k.$$
Clearly n = 2017 and n − k = 1776, so k = 241. Therefore the term we have to find is
$$\binom{2017}{241} (4x)^{1776} (-5y)^{241},$$
which simplifies to
$$-\binom{2017}{241} 4^{1776}\, 5^{241}\, x^{1776} y^{241}.$$
c) Based on the two previous examples, we know that $\sum_{k=0}^{2n} \binom{2n}{k} = 2^{2n}$ and $\sum_{k=0}^{2n} (-1)^k \binom{2n}{k} = 0$. We can split these sums based on their parity (even or odd).
For $\sum_{k=0}^{2n} \binom{2n}{k}$, if k is even, then k = 2i for some integer i, so the sum of all the even terms is $\binom{2n}{0} + \binom{2n}{2} + \binom{2n}{4} + \dots + \binom{2n}{2n}$, or $\sum_{i=0}^{n} \binom{2n}{2i}$. If k is odd, then k = 2i + 1, so the sum of all the odd terms is $\binom{2n}{1} + \binom{2n}{3} + \binom{2n}{5} + \dots + \binom{2n}{2n-1}$, or $\sum_{i=0}^{n-1} \binom{2n}{2i+1}$. Since k can only be even or odd,
$$\sum_{k=0}^{2n} \binom{2n}{k} = \sum_{i=0}^{n} \binom{2n}{2i} + \sum_{i=0}^{n-1} \binom{2n}{2i+1} = 2^{2n}.$$
It is nearly the same thing for $\sum_{k=0}^{2n} (-1)^k \binom{2n}{k}$, except that when k is odd, $(-1)^k = -1$, so we have
$$\sum_{k=0}^{2n} (-1)^k \binom{2n}{k} = \sum_{i=0}^{n} \binom{2n}{2i} - \sum_{i=0}^{n-1} \binom{2n}{2i+1} = 0.$$
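The two displayed identities can be verified for small n by computing the even-indexed and odd-indexed sums separately. A short sketch, assuming Python 3.8 or later for `math.comb`; the names `even_sum` and `odd_sum` are illustrative.

```python
import math

def even_sum(n):
    # Sum of C(2n, 2i) for i = 0, ..., n
    return sum(math.comb(2 * n, 2 * i) for i in range(n + 1))

def odd_sum(n):
    # Sum of C(2n, 2i + 1) for i = 0, ..., n - 1
    return sum(math.comb(2 * n, 2 * i + 1) for i in range(n))

for n in range(1, 6):
    print(n, even_sum(n), odd_sum(n), 2 ** (2 * n))
```

For each n the two sums add up to $2^{2n}$ and their difference is 0, matching the two equations above.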
We proceed with proof by induction: Let P (n) : Any set of n horses is the same color.
Base Case: P (1) : Any set of 1 horse is the same color.
This is obviously true, as each horse has its own color.
Inductive Step: Assume P (n): any set of n horses is the same color.
Consider a set of n horses and let x refer to one of the horses in that set, and let H refer to the
rest of the horses.
Consider another set of n horses consisting of H horses plus another horse (distinct from x) that
we shall call y.
Because x is in a set with H, horse x and the horses H have the same color. Because y is also in
a set with H, horse y and the horses H have the same color. Therefore, x, y, and H have the same
color.
Therefore, we can construct a set of n + 1 horses containing x, y, and H. The inductive step
holds, so all horses are the same color.
Solution. This proof by induction makes a critical logical error in that P (1) → P (2) is false. If we
have a set of 1 horse, then there is no H (“rest of the horses”) from which we could compare the
colors of x and y. To ‘fix’ this, if you’re thinking about starting the induction with P (2) → P (3),
remember that P (2) itself is false because one can choose two differently colored horses for a set
of 2 horses. Overall, if the base case cannot imply the next statement, then the inductive form of
reasoning cannot be applied, so this argument is invalid.
Chapter 6
Basic Trigonometry
The reader is expected to have some experience with basic right triangle trigonometry, but a brief
review of it is given first, for the convenience of the reader and for the sake of completeness. A basic
understanding of simple concepts in geometry is expected.
6.1 Review
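[Figure: right triangle ABC with the right angle at C, legs BC = a and AC = b, and hypotenuse AB = c.]
Consider △ABC where ∠ACB is a right angle and let ∠CAB = θ. Then, we may define our 6 trigonometric functions given this angle, as follows:
$$\sin\theta = \frac{a}{c} \qquad \csc\theta = \frac{c}{a}$$
$$\cos\theta = \frac{b}{c} \qquad \sec\theta = \frac{c}{b}$$
$$\tan\theta = \frac{a}{b} \qquad \cot\theta = \frac{b}{a}$$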
To remember these, one can use the famous acronym SOH CAH TOA:
1. Sine-Opposite-Hypotenuse: sin θ refers to the ratio $\frac{\text{opposite}}{\text{hypotenuse}}$, where the sides in discussion are in relation to θ. In this diagram, a would be the opposite side to θ and c would be the hypotenuse, so $\sin\theta = \frac{a}{c}$.
2. Cosine-Adjacent-Hypotenuse: cos θ refers to the ratio $\frac{\text{adjacent}}{\text{hypotenuse}}$, where the sides in discussion are in relation to θ. In this diagram, b would be the adjacent side to θ and c is still the hypotenuse, so $\cos\theta = \frac{b}{c}$.
3. Tangent-Opposite-Adjacent: tan θ refers to the ratio $\frac{\text{opposite}}{\text{adjacent}}$, where the sides in discussion are in relation to θ. In this diagram, as previously mentioned, a is the opposite, and b is the adjacent, so $\tan\theta = \frac{a}{b}$.
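In addition, recall the following definitions (and vice-versa):
$$\sin\theta = \frac{1}{\csc\theta}, \qquad \cos\theta = \frac{1}{\sec\theta}, \qquad \tan\theta = \frac{1}{\cot\theta}$$
Noting that cosecant, secant, and cotangent are the reciprocals of sine, cosine, and tangent respectively, their definitions become obvious:
$$\csc\theta = \frac{1}{\frac{\text{opposite}}{\text{hypotenuse}}} = \frac{\text{hypotenuse}}{\text{opposite}} \text{ or } \frac{c}{a}$$
$$\sec\theta = \frac{1}{\frac{\text{adjacent}}{\text{hypotenuse}}} = \frac{\text{hypotenuse}}{\text{adjacent}} \text{ or } \frac{c}{b}$$
$$\cot\theta = \frac{1}{\frac{\text{opposite}}{\text{adjacent}}} = \frac{\text{adjacent}}{\text{opposite}} \text{ or } \frac{b}{a}$$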
By using the helpful acronym SOH CAH TOA and remembering the reciprocal relationships
between each pair, we can intuitively derive the corresponding ratios for all of the functions.
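As a concrete instance, the six ratios can be written out for a 3-4-5 right triangle. The side lengths below are an assumed example, not one taken from the text; a is the side opposite θ, b is the adjacent side, and c is the hypotenuse.

```python
from fractions import Fraction

# Assumed example triangle: opposite = 3, adjacent = 4, hypotenuse = 5
a, b, c = 3, 4, 5

ratios = {
    "sin": Fraction(a, c),   # opposite / hypotenuse
    "cos": Fraction(b, c),   # adjacent / hypotenuse
    "tan": Fraction(a, b),   # opposite / adjacent
    "csc": Fraction(c, a),   # reciprocal of sine
    "sec": Fraction(c, b),   # reciprocal of cosine
    "cot": Fraction(b, a),   # reciprocal of tangent
}

for name, value in ratios.items():
    print(name, value)
```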
Now, there are special angles associated with these trigonometric functions that you should
memorize.
[Figure: isosceles right triangle ABC with the right angle at C, ∠A = 45°, legs AC = BC = x, and hypotenuse AB = x√2.]
Consider an isosceles right triangle ABC with right angle C, and let the common length of the legs be x. Then, we can derive the well-known side length proportions of this particular triangle using the Pythagorean Theorem, as follows:
$$\begin{aligned}
AB^2 &= AC^2 + BC^2 \\
AB^2 &= x^2 + x^2 \\
AB^2 &= 2x^2 \\
\therefore\ AB &= x\sqrt{2}.
\end{aligned}$$
We know that ∠A = 45°, so we can find out the trigonometric ratios of the angle 45° using the proportions of the sides we have just proven. Note that the x terms cancel out in the numerator and denominator.
$$\sin 45^\circ = \frac{\sqrt{2}}{2} \qquad \csc 45^\circ = \sqrt{2}$$
$$\cos 45^\circ = \frac{\sqrt{2}}{2} \qquad \sec 45^\circ = \sqrt{2}$$
[Figure: equilateral triangle ABC with side length x and altitude AD of length x√3/2, where D is the midpoint of BC, BD = DC = x/2, ∠B = ∠C = 60°, and ∠BAD = ∠CAD = 30°.]
Consider an equilateral triangle ABC with common side length x. Drop the altitude from A to BC, and let the foot be D. Note that AD is also the median, as AB = AC. Furthermore, it is not hard to see that AD is the angle bisector of ∠BAC. Therefore, D is the midpoint of BC, so $BD = CD = \frac{x}{2}$. We also have that ∠BAD = ∠CAD = 30°. We can then compute AD by using the Pythagorean theorem on either △ABD or △ACD.
$$\begin{aligned}
AD^2 + \left(\frac{x}{2}\right)^2 &= x^2 \\
AD^2 &= x^2 - \frac{x^2}{4} \\
\therefore\ AD &= \frac{x\sqrt{3}}{2}.
\end{aligned}$$
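The resulting proportions of the sides of the smaller triangle inside the larger equilateral triangle yield the 30-60-90 triangle:
[Figure: right triangle ABC with the right angle at C, ∠A = 60°, ∠B = 30°, AC = x, BC = x√3, and hypotenuse AB = 2x.]
We can now derive the trigonometric values for angles 60° and 30° respectively:
$$\sin 60^\circ = \frac{\sqrt{3}}{2} \qquad \csc 60^\circ = \frac{2\sqrt{3}}{3}$$
$$\sin 30^\circ = \frac{1}{2} \qquad \csc 30^\circ = 2$$
$$\cos 30^\circ = \frac{\sqrt{3}}{2} \qquad \sec 30^\circ = \frac{2\sqrt{3}}{3}$$
$$\tan 30^\circ = \frac{\sqrt{3}}{3} \qquad \cot 30^\circ = \sqrt{3}$$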
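These exact values can be compared against floating-point evaluations. The sketch below pairs each listed value with its numerical counterpart; it uses `math.isclose` because floating-point results only approximate the exact surds, and the dictionary name `checks` is illustrative.

```python
import math

def deg(angle):
    # math.sin and math.cos expect radians, so convert from degrees first
    return math.radians(angle)

checks = {
    "sin 45": (math.sin(deg(45)), math.sqrt(2) / 2),
    "csc 45": (1 / math.sin(deg(45)), math.sqrt(2)),
    "sin 60": (math.sin(deg(60)), math.sqrt(3) / 2),
    "csc 60": (1 / math.sin(deg(60)), 2 * math.sqrt(3) / 3),
    "sin 30": (math.sin(deg(30)), 1 / 2),
    "cos 30": (math.cos(deg(30)), math.sqrt(3) / 2),
    "sec 30": (1 / math.cos(deg(30)), 2 * math.sqrt(3) / 3),
    "tan 30": (math.tan(deg(30)), math.sqrt(3) / 3),
    "cot 30": (1 / math.tan(deg(30)), math.sqrt(3)),
}

for name, (computed, exact) in checks.items():
    print(name, computed, exact, math.isclose(computed, exact))
```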
Lastly, we have the notion of radians vs. degrees, two ways of measuring angles. We define 2π
radians to be equivalent to 360◦ , or in simpler terms, π radians = 180◦ .
Exercise 6.1.1. Convert 0◦ , 30◦ , 45◦ , 60◦ , and 90◦ to radians. You should be familiar with all of
the radian equivalents for these special angles.
We can derive the formulae for the arc length and area of this sector of a circle for radians and
degrees:
              Degrees            Radians
Arc length    πrθ/180°           rθ
Area          πr² · θ/360°       r²θ/2
Why should we use radians over degrees? As a start, the formulae in terms of radians are clearly
more aesthetically pleasing. However, it is not until we begin the topic of calculus that we realize
the true benefits of using radians.
In this chapter, if the symbol ◦ is specified, then the angle is in degrees. Otherwise, it is in
radians.
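The two columns of the table describe the same quantities, which can be confirmed by converting an angle and evaluating both versions. A small Python sketch with illustrative function names; the radius 3 and angle 60° are arbitrary example values.

```python
import math

def arc_length_degrees(r, theta_deg):
    return math.pi * r * theta_deg / 180

def arc_length_radians(r, theta_rad):
    return r * theta_rad

def sector_area_degrees(r, theta_deg):
    return math.pi * r**2 * theta_deg / 360

def sector_area_radians(r, theta_rad):
    return r**2 * theta_rad / 2

r, theta_deg = 3, 60
theta_rad = math.radians(theta_deg)   # 60 degrees is pi/3 radians

print(arc_length_degrees(r, theta_deg), arc_length_radians(r, theta_rad))
print(sector_area_degrees(r, theta_deg), sector_area_radians(r, theta_rad))
```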
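[Figure: the unit circle with the points (1, 0), (0, 1), (−1, 0), and (0, −1) marked, and a radius of length 1 drawn at angle θ to the point (cos θ, sin θ); the legs of the resulting right triangle are labeled cos θ and sin θ.]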
Draw a ray from the origin to some point on the graph of x2 + y 2 = 1, i.e. the unit circle. For
clarity I will illustrate such a ray in the first quadrant. If we drop a perpendicular from the point
to the x-axis, we actually form a right triangle, with the hypotenuse being the ray, which must be
length 1 since it is a radius of the circle.
Let θ be the angle between the hypotenuse and the x-axis. We start at θ = 0◦ , which is the point
(1, 0), and move counter-clockwise (in order of the four quadrants).
Definition 6.2.1. An angle will represent the rotation of a ray through the origin of the unit circle.
Definition 6.2.2. Two angles will be called coterminal if they end up at the same position on
the unit circle. In other words, angle x and angle y are coterminal ←→ 360◦ | (x − y), in terms of
degrees.
The terminal side is the hypotenuse of the triangle, which has length 1.
This means that rotating the ray by a full circle, or 360◦ , will not change the point (x, y) on the
circle. Thus, sin(120◦ ) = sin(480◦ ) = sin(840◦ ) = . . ., so the trigonometric functions are periodic.
We will revisit this property later.
Since the hypotenuse is 1, the side adjacent to θ would be cos θ, the side opposite to θ would be
sin θ, and the slope of the hypotenuse would be tan θ, using the mnemonic ‘SOH CAH TOA.’
Definition 6.2.3. For any angle θ, let the terminal side of θ hit the unit circle at (x, y). Then we
define
x = cos θ,
y = sin θ.
Keep in mind that the domains of cosine and sine are necessarily all real numbers.
Definition 6.2.4. Although we have already stated these in the last section, I will emphasize that
the following remain the same under the context of the unit circle.
1 1
csc θ = sec θ =
sin θ cos θ
sin θ cos θ
tan θ = cot θ =
cos θ sin θ
Definition 6.2.5. Quadrantal angles are angles that terminate on the x or y axis. For example, we call the angles 90°, 180°, 270°, 360°, 450°, 540°, . . . quadrantal, as each of them lies on one of the four points (1, 0), (0, 1), (−1, 0), or (0, −1).
First, we can take advantage of the reflective/rotational symmetry that the circle offers in order to develop our first few trigonometric identities.
[Figure: Reflections of the ray from (0, 0) to (cos θ, sin θ) to the four quadrants, with the angles θ, 180° − θ, and −θ marked.]
Recall the following facts: If a point (x, y) in the first quadrant is reflected across . . .
Applying these facts to the point (cos θ, sin θ), as shown in the diagram, we can express the three
other points in two ways each: the first way considering the reflection applied to the original, and
the second way considering the angle of the new point.
For the point in quadrant 2, notice how we have reflected (cos θ, sin θ) across the y-axis, so
this new point can be represented as (− cos θ, sin θ). However, this point is also the result of
rotating a ray 180◦ − θ in the counterclockwise direction, so the point can also be expressed as
(cos(180◦ − θ), sin(180◦ − θ)). As these two coordinates refer to the same point, we have the following
identities:
cos(180◦ − θ) = − cos θ
sin(180◦ − θ) = sin θ
Similarly, the point in the third quadrant can be represented as either (− cos θ, − sin θ) or (cos(180◦ + θ), sin(180◦ + θ)). As a result, we can derive similar identities:
cos(180◦ + θ) = − cos θ
sin(180◦ + θ) = − sin θ
Lastly, the point in the fourth quadrant is either (cos θ, − sin θ) or (cos(−θ), sin(−θ)). So we have
the following identities:
cos(−θ) = cos θ
sin(−θ) = − sin θ
As you solve more problems, you will get used to these identities. However, if you imagine
reflecting and rotating angles on the unit circle, then it won’t be hard to rederive these relationships
again if you forget.
It is important that you remember that cos θ is an even function, while sin θ is an odd function.
Definition 6.2.6. The reference angle for a non-quadrantal angle θ is the measure of the acute
angle that the terminal side makes with the x-axis.
Remember that the actual angle and the reference angle are not the same. Consider the diagram
below:
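[Figure: an angle θ1 terminating in the third quadrant of the unit circle, with its reference angle θ2 and the legs cos θ1 and sin θ1 marked.]
In this example, $\pi \le \theta_1 \le \frac{3\pi}{2}$ would be the overall angle that we base cosine and sine on, while $0 \le \theta_2 < \frac{\pi}{2}$ would be the reference angle.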
Problem 6.2.7. Find the reference angles for the following (which are in radians):
a) $\frac{7\pi}{11}$
b) $\frac{19\pi}{5}$
c) 8
Solution. Make sure you keep in mind the main quadrantal angles in radians: 0, $\frac{\pi}{2}$, π, $\frac{3\pi}{2}$, 2π, and further multiples of $\frac{\pi}{2}$. Multiples of π lie on the x-axis, so pay attention to those when looking for the reference angle. Also remember to reduce a given radian measure larger than 2π by subtracting the appropriate multiple of 2π. To visualize the angle, try drawing it on the unit circle and then the appropriate right triangle for it.
a) $\frac{7\pi}{11}$ is between $\frac{\pi}{2}$ and π, so the reference angle is $\pi - \frac{7\pi}{11} = \frac{4\pi}{11}$.
b) $\frac{19\pi}{5}$ is larger than 2π, so subtract 2π to get $\frac{9\pi}{5}$, which is coterminal to $\frac{19\pi}{5}$. Then, notice that $\frac{9\pi}{5}$ is between $\frac{3\pi}{2}$ and 2π, so the reference angle is $2\pi - \frac{9\pi}{5} = \frac{\pi}{5}$.
c) In this case, we should figure out estimated values for π and $\frac{\pi}{2}$, which are 3.14 and 1.57 respectively. 8 is larger than 2π, so subtract ≈ 6.28 to get 1.72, which is between 1.57 and 3.14, or $\frac{\pi}{2}$ and π. So the reference angle is $\pi - (8 - 2\pi) = 3\pi - 8$. Note that we are merely using these estimates for π and its fractions so we can determine which quadrant the angle 8 (in radians) is located in.
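The procedure used in all three parts (reduce to a coterminal angle in one full turn, find the quadrant, then measure to the x-axis) can be written as a short function. This is a sketch with an illustrative name; it works in radians, and for quadrantal inputs, which Definition 6.2.6 excludes, it simply returns the distance to the x-axis.

```python
import math

def reference_angle(theta):
    """Acute angle between the terminal side of theta (radians) and the x-axis."""
    t = theta % (2 * math.pi)          # coterminal angle in [0, 2*pi)
    if t <= math.pi / 2:               # quadrant I
        return t
    if t <= math.pi:                   # quadrant II
        return math.pi - t
    if t <= 3 * math.pi / 2:           # quadrant III
        return t - math.pi
    return 2 * math.pi - t             # quadrant IV

print(reference_angle(7 * math.pi / 11), 4 * math.pi / 11)
print(reference_angle(19 * math.pi / 5), math.pi / 5)
print(reference_angle(8), 3 * math.pi - 8)
```

Each line prints the computed reference angle next to the answer found by hand, and the two agree up to rounding.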
1. If two angles are coterminal, then they have the same trigonometric values.
2. If two angles have the same reference angle, then they have the same trigonometric functions
up to the sign. For example, sin(111◦ ) = ± sin(69◦ ).
Take a moment to convince yourself why these are true, especially the second result.
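[Figure: the point (cos θ, sin θ) at angle θ and its reflection across the line y = x, the point (sin θ, cos θ) = (cos(90° − θ), sin(90° − θ)) at angle 90° − θ.]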
Similar to the reflections across the x-axis, y-axis, and origin, we can also reflect a point
(cos θ, sin θ) across the line y = x, yielding the point (sin θ, cos θ). Reflections also maintain angles,
so we can express the reflected point as (cos(90◦ − θ), sin(90◦ − θ)). These two coordinates refer to
the same point, so we have now derived the following identities:
sin(90◦ − θ) = cos θ
cos(90◦ − θ) = sin θ
Taking the reciprocal of these equations, we get
csc(90◦ − θ) = sec θ
sec(90◦ − θ) = csc θ
Furthermore, note that
$$\tan(90^\circ - \theta) = \frac{\sin(90^\circ - \theta)}{\cos(90^\circ - \theta)} = \frac{\cos\theta}{\sin\theta} = \cot\theta,$$
and vice-versa.
Therefore, we can conclude that if $f$ is a trigonometric function, then
$$f(\theta) = \operatorname{cof}(90^\circ - \theta),$$
where $\operatorname{cof}$ denotes the cofunction of $f$, and the cofunction pairs are sine and cosine, secant and cosecant, and tangent and cotangent. These relationships are known as the cofunction identities.
Definition 6.2.8. A function f is periodic if ∃p > 0 s.t. f (x + p) = f (x) ∀x. If p is the smallest positive number satisfying this, then p is called the period of f.
Problem 6.2.9. What is the period of sine and cosine? What about tangent?
Solution. As previously stated regarding coterminal angles, rotating a ray by a full circle, which
is 360◦ , will not change the point that the ray is pointing to. Thus, sin θ = sin(θ + 360◦ ) and
cos θ = cos(θ + 360◦ ). It can then be confirmed that 360◦ is the period.
However, for tangent, notice that the slope of the ray from the origin is the same when it is
reflected across the origin. In other words, the tangent is the same when 180◦ is added to the current
angle. The period of tangent is thus 180◦ .
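We can also demonstrate this using some identities. Note that
$$\tan(\theta + 180^\circ) = \frac{\sin(\theta + 180^\circ)}{\cos(\theta + 180^\circ)} = \frac{-\sin\theta}{-\cos\theta} = \tan\theta.$$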
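Problem 6.2.10. Find all six trig values for each angle:
a) 150°
b) 225°
c) 300°
Solution.
a) $\sin 150^\circ = \frac{1}{2}$, $\cos 150^\circ = -\frac{\sqrt{3}}{2}$, $\tan 150^\circ = -\frac{\sqrt{3}}{3}$, $\csc 150^\circ = 2$, $\sec 150^\circ = -\frac{2\sqrt{3}}{3}$, $\cot 150^\circ = -\sqrt{3}$.
b) $\sin 225^\circ = -\frac{\sqrt{2}}{2}$, $\cos 225^\circ = -\frac{\sqrt{2}}{2}$, $\tan 225^\circ = 1$, $\csc 225^\circ = -\sqrt{2}$, $\sec 225^\circ = -\sqrt{2}$, $\cot 225^\circ = 1$.
c) $\sin 300^\circ = -\frac{\sqrt{3}}{2}$, $\cos 300^\circ = \frac{1}{2}$, $\tan 300^\circ = -\sqrt{3}$, $\csc 300^\circ = -\frac{2\sqrt{3}}{3}$, $\sec 300^\circ = 2$, $\cot 300^\circ = -\frac{\sqrt{3}}{3}$.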
The sums in parts a, b, and c respectively are $\frac{5}{2} - \frac{5\sqrt{3}}{2}$, $2 - 3\sqrt{2}$, and $\frac{5}{2} - \frac{5\sqrt{3}}{2}$, so the total sum would be $7 - 5\sqrt{3} - 3\sqrt{2}$. The important idea to consider is that we should take advantage of the identities to simplify the problem and reduce the workload. Note that the sums for 150° and 300° were the same, and this is because
$$f(150^\circ) = \operatorname{cof}(90^\circ - 150^\circ) = \operatorname{cof}(-60^\circ) = \operatorname{cof}(300^\circ)$$
according to our cofunction identities. Therefore, the six trig values for 150° will be the same as those for 300°, but with each function swapped for its cofunction. We could have sped up the process of computing the sum by finding the sum of either 150° or 300° and multiplying by two, then adding the sum for 225°.
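As a numerical cross-check of the sums discussed above, the following sketch computes all six values for each angle in floating point and adds them. The helper name `six_trig_values` is an illustrative choice, not something defined in the text.

```python
import math

def six_trig_values(degrees):
    """Return (sin, cos, tan, csc, sec, cot) for an angle given in degrees."""
    t = math.radians(degrees)
    s, c = math.sin(t), math.cos(t)
    return (s, c, s / c, 1 / s, 1 / c, c / s)

sum_150 = sum(six_trig_values(150))
sum_225 = sum(six_trig_values(225))
sum_300 = sum(six_trig_values(300))
total = sum_150 + sum_225 + sum_300

# sum_150 and sum_300 should agree, as the cofunction argument predicts
print(sum_150, sum_225, sum_300, total)
```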
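Problem 6.2.11. Write the following in terms of a trig function of an angle between 0° and 45°:
1. $\sin(73^\circ)$
2. $\tan(109^\circ)$
3. $\sec(2017^\circ)$
4. $\cos\frac{11\pi}{7}$
3.
$$\begin{aligned}
\sec(2017^\circ) &= \sec(217^\circ) \\
&= \sec(217^\circ - 360^\circ) \\
&= \sec(-143^\circ) \\
&= \sec(143^\circ) \\
&= -\sec(37^\circ).
\end{aligned}$$
4.
$$\begin{aligned}
\cos\frac{11\pi}{7} &= -\cos\left(\pi - \frac{11\pi}{7}\right) \\
&= -\cos\left(-\frac{4\pi}{7}\right) \\
&= -\cos\frac{4\pi}{7} \\
&= -\sin\left(\frac{\pi}{2} - \frac{4\pi}{7}\right) \\
&= -\sin\left(-\frac{\pi}{14}\right) \\
&= \sin\frac{\pi}{14}.
\end{aligned}$$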
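A quick floating-point check of the two reductions above; this is only a sanity test, and `sec` is a small helper defined here because Python's `math` module has no secant function.

```python
import math

def sec(x):
    """Secant of x (radians), defined as 1 / cos x."""
    return 1 / math.cos(x)

# Part 3: sec(2017 degrees) should equal -sec(37 degrees)
lhs3 = sec(math.radians(2017))
rhs3 = -sec(math.radians(37))

# Part 4: cos(11*pi/7) should equal sin(pi/14)
lhs4 = math.cos(11 * math.pi / 7)
rhs4 = math.sin(math.pi / 14)

print(lhs3, rhs3)
print(lhs4, rhs4)
```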
Problem 6.2.12. Write the following in terms of a trig function of an angle between 0◦ and 45◦ :
1. sin(177◦ )
2. cos(111◦ )
3. tan(620◦ )
Solution.
Now consider the right triangle of the unit circle once again. The legs of the triangle are cos θ
and sin θ, and the hypotenuse is 1. Then, by the Pythagorean Theorem, we have
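$$\sin^2\theta + \cos^2\theta = 1.$$
Solution. First, it is obvious that $\csc\theta = \frac{5}{4}$, as it is the reciprocal of $\sin\theta$.
Furthermore, we use the identity $\cos^2\theta + \sin^2\theta = 1$ to have $\cos^2\theta + \left(\frac{4}{5}\right)^2 = 1 \longrightarrow \cos\theta = \pm\frac{3}{5}$. Since secant is the reciprocal of cosine, we have $\sec\theta = \pm\frac{5}{3}$. Lastly, $\tan\theta = \frac{\sin\theta}{\cos\theta}$, so $\tan\theta = \pm\frac{4}{3}$. Then $\cot\theta = \pm\frac{3}{4}$.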
Keep in mind that if the quadrant of this angle is not specified, then some trig values can either
be positive or negative, so make sure you consider the sign as well.
For the following problems, we will use I, II, III, and IV to refer to the four quadrants.
Problem 6.2.14. Find the remaining trigonometric values when $\csc\theta = -\frac{20}{17}$, and θ ∈ III.
Solution. We can instantly see that $\sin\theta = -\frac{17}{20}$. Furthermore, we can use the identity $\csc^2\theta = 1 + \cot^2\theta$. We get $\frac{400}{289} = 1 + \cot^2\theta$, or $\cot\theta = \pm\frac{\sqrt{111}}{17}$. However, as θ is in quadrant III, tangent must be positive, so cotangent is also positive. Therefore, $\cot\theta = \frac{\sqrt{111}}{17}$. This implies $\tan\theta = \frac{17}{\sqrt{111}}$.
We also use the Pythagorean identity $\cos^2\theta + \sin^2\theta = 1$. This yields $\frac{289}{400} + \cos^2\theta = 1$, or $\cos\theta = -\frac{\sqrt{111}}{20}$, as cosine is negative in the third quadrant. Since secant is the reciprocal of cosine, we have $\sec\theta = -\frac{20}{\sqrt{111}}$.
While we have been simply applying identities in order to find the rest of the values, choosing
the sign can be a hassle. We could opt for the alternative method of reference triangles for a cleaner
and more straightforward solution.
For example, consider $\sin\theta = \frac{4}{5}$ where θ ∈ I. We can construct a reference triangle in the first quadrant, as such:
[Figure: right triangle in quadrant I with hypotenuse 5 and vertical leg 4.]
We fill in the triangle with the given information. For simplicity, we let the opposite side be 4 and the hypotenuse be 5. We could then use the Pythagorean Theorem to get the remaining leg of the right triangle.
[Figure: the same triangle with angle θ, hypotenuse 5, vertical leg 4, and horizontal leg 3.]
Now, we can read our information from the resulting reference triangle to derive the rest of the trigonometric values. From the triangle above, we have
$$\csc\theta = \frac{5}{4}, \quad \cos\theta = \frac{3}{5}, \quad \sec\theta = \frac{5}{3}, \quad \tan\theta = \frac{4}{3}, \quad \cot\theta = \frac{3}{4}.$$
It is much easier to read the information off a visual cue like a triangle, and this method is less
error-prone than applying identities and worrying over which sign to choose.
The true strength of reference triangles is shown for examples in other quadrants.
Consider $\sin\theta = \frac{4}{5}$, but this time θ ∈ II. Then our reference triangle would look different:
[Figure: right triangle in quadrant II with angle θ at the origin, hypotenuse 5, vertical leg 4, and horizontal leg −3.]
As θ ∈ II, it should be true that 90° < θ < 180°. However, in the context of a reference triangle, we let the θ under discussion be the reference angle of the triangle, so our calculations proceed smoothly.
As the triangle is to the left of the y-axis, the horizontal leg must be negative. Hence, we let it be −3 instead of 3 after we apply the Pythagorean Theorem to the given sides 4 and 5.
By drawing a diagram which clearly represents which parts are positive and negative, it is easy to identify the rest of the trigonometric values. For this example, the rest would be:
$$\csc\theta = \frac{5}{4}, \quad \cos\theta = -\frac{3}{5}, \quad \sec\theta = -\frac{5}{3}, \quad \tan\theta = -\frac{4}{3}, \quad \cot\theta = -\frac{3}{4}.$$
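The reference-triangle method can be mirrored in code: build the triangle from the given ratio, then attach signs to the legs according to the quadrant. The sketch below assumes the sine ratio is given; `QUADRANT_SIGNS` and `from_sine` are illustrative names, not notation from the text.

```python
import math

# Signs of the horizontal and vertical legs (x, y) in each quadrant.
QUADRANT_SIGNS = {1: (1, 1), 2: (-1, 1), 3: (-1, -1), 4: (1, -1)}

def from_sine(opposite, hypotenuse, quadrant):
    """Build the reference triangle from |sin| = opposite/hypotenuse, then sign the legs."""
    adjacent = math.sqrt(hypotenuse**2 - opposite**2)    # Pythagorean Theorem
    sx, sy = QUADRANT_SIGNS[quadrant]
    x, y, r = sx * adjacent, sy * opposite, hypotenuse   # signed legs, positive hypotenuse
    return {"sin": y / r, "cos": x / r, "tan": y / x,
            "csc": r / y, "sec": r / x, "cot": x / y}

print(from_sine(4, 5, 2))     # sin = 4/5 in quadrant II
print(from_sine(17, 20, 3))   # sin = -17/20 in quadrant III (Problem 6.2.14)
```

Reading the signs off a lookup table plays the same role as drawing the triangle on the correct side of each axis.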
Exercise 6.2.15. Find the rest of the trigonometric values for $\cos\theta = \frac{4}{5}$, θ ∈ IV.
Exercise 6.2.16. Find the rest of the trigonometric values for $\tan\theta = \frac{12}{5}$, θ ∈ III.
Exercise 6.2.17. Find the rest of the trigonometric values for $\tan\theta = -\frac{7}{11}$, θ ∈ II.
Exercise 6.2.18. Find the rest of the trigonometric values for $\tan\theta = -\frac{1}{2}$, θ ∈ II.
Problem 6.2.19. Simplify $(4\sin\theta - 7\cos\theta)^2 + (7\sin\theta + 4\cos\theta)^2$.
Solution. If $\sin\theta + \cos\theta = \frac{1}{3}$, then $(\sin\theta + \cos\theta)^2 = \frac{1}{9}$. We expand and simplify this to get
$$\begin{aligned}
\sin^2\theta + 2\sin\theta\cos\theta + \cos^2\theta &= \frac{1}{9} \\
2\sin\theta\cos\theta &= -\frac{8}{9} \\
\sin\theta\cos\theta &= -\frac{4}{9}.
\end{aligned}$$
Furthermore, from the original equation, we have $\sin\theta = \frac{1}{3} - \cos\theta$. We substitute this into the equation above to get $\cos\theta\left(\frac{1}{3} - \cos\theta\right) = -\frac{4}{9}$. This expands to
$$\frac{1}{3}\cos\theta - \cos^2\theta + \frac{4}{9} = 0.$$
For legibility, let $x = \cos\theta$. Then we want to solve the quadratic $\frac{1}{3}x - x^2 + \frac{4}{9} = 0$. Multiply by $-9$ to get $9x^2 - 3x - 4 = 0$, and by the quadratic formula, we get
$$x = \cos\theta = \frac{1 \pm \sqrt{17}}{6}.$$
Problem 6.2.22. Find all values of $\sin x$ when $\sin x + \cos x = \frac{1}{2}$.
Solution. Square both sides of $\sin x + \cos x = \frac{1}{2}$ to get:
$$\begin{aligned}
\sin^2 x + \cos^2 x + 2\sin x\cos x &= \frac{1}{4} \\
1 + 2\sin x\cos x &= \frac{1}{4} \\
\sin x\cos x &= -\frac{3}{8}.
\end{aligned}$$
From the original equation, we know that $\cos x = \frac{1}{2} - \sin x$. Then we have:
$$\sin x\left(\frac{1}{2} - \sin x\right) = -\frac{3}{8}.$$
Let $y = \sin x$. Then we solve the quadratic $\frac{1}{2}y - y^2 = -\frac{3}{8}$, or $8y^2 - 4y - 3 = 0$. By the quadratic formula, we have $y = \sin x = \frac{1 \pm \sqrt{7}}{4}$.
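The square-and-substitute technique used here works for any right-hand side $s$ with $|s| \le \sqrt{2}$. The sketch below (with the hypothetical helper name `sine_values`) carries out the same steps for a general $s$ and reproduces the answers for $s = \frac{1}{2}$.

```python
import math

def sine_values(s):
    """Solve sin x + cos x = s for y = sin x.

    Squaring gives sin x cos x = (s^2 - 1)/2; substituting cos x = s - sin x
    yields the quadratic 2y^2 - 2sy + (s^2 - 1) = 0.
    Assumes |s| <= sqrt(2), so the discriminant is nonnegative.
    """
    disc = (2 * s) ** 2 - 4 * 2 * (s * s - 1)   # b^2 - 4ac
    root = math.sqrt(disc)
    return [(2 * s + root) / 4, (2 * s - root) / 4]

for y in sine_values(1 / 2):
    # Each root, paired with cos x = s - y, should lie on the unit circle.
    print(y, y * y + (1 / 2 - y) ** 2)
```

Because the equation is symmetric in sine and cosine, the same two numbers are also the possible values of $\cos x$.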
Problem 6.2.23. Prove the following identities:
Proof.
2. The denominator $1 + \cos x$ motivates us to multiply the top and bottom by $1 - \cos x$, so we can apply the Pythagorean Identity:
$$\begin{aligned}
\frac{\sin x}{1 + \cos x} &= \frac{\sin x}{1 + \cos x} \cdot \frac{1 - \cos x}{1 - \cos x} \\
&= \frac{\sin x(1 - \cos x)}{1 - \cos^2 x} \\
&= \frac{\sin x(1 - \cos x)}{\sin^2 x} \\
&= \frac{1 - \cos x}{\sin x} \\
&= \frac{1}{\sin x} - \frac{\cos x}{\sin x} \\
&= \csc x - \cot x.
\end{aligned}$$
Problem 6.2.24. tan x + cot x equals the product of two trigonometric functions, both in terms of
x. What is this product?
Solution. Simply rewrite the given expression in terms of sine and cosine, then apply appropriate identities:
$$\begin{aligned}
\tan x + \cot x &= \frac{\sin x}{\cos x} + \frac{\cos x}{\sin x} \\
&= \frac{\sin^2 x + \cos^2 x}{\cos x\sin x} \\
&= \frac{1}{\cos x\sin x} \\
&= \csc x\sec x.
\end{aligned}$$
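A numerical spot check is not a proof, but it is a useful way to catch algebra slips in identities such as the two derived above. The sample angles below are arbitrary choices that avoid multiples of $\frac{\pi}{2}$, where some of the functions are undefined.

```python
import math

def csc(x): return 1 / math.sin(x)
def sec(x): return 1 / math.cos(x)
def cot(x): return math.cos(x) / math.sin(x)

samples = [0.3, 1.0, 2.0, 4.0, 5.5]   # radians

# sin x / (1 + cos x) = csc x - cot x
max_err_1 = max(abs(math.sin(x) / (1 + math.cos(x)) - (csc(x) - cot(x))) for x in samples)
# tan x + cot x = csc x sec x
max_err_2 = max(abs(math.tan(x) + cot(x) - csc(x) * sec(x)) for x in samples)

print(max_err_1, max_err_2)   # both should be at rounding-error level
```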
Lastly, there is one more identity to discover, based on previous identities that we have already
found.
We have cos θ = sin(90° − θ). If we substitute θ → −θ, we get cos(−θ) = sin(90° + θ). But notice that cos(−θ) = cos θ, therefore
$$\cos\theta = \sin(90^\circ + \theta).$$
This identity will play a significant role in the next section.
6.3 Graphing
First observe the behavior of y = sin x as x goes from 0° to 90°. When we go counter-clockwise on the unit circle from (1, 0), the y-coordinate, which is sin x, is increasing from 0 to 1. This can be illustrated by the approximations,
$$\begin{aligned}
\sin 0^\circ &= 0, \\
\sin 30^\circ &= \tfrac{1}{2} = 0.5, \\
\sin 45^\circ &= \tfrac{\sqrt{2}}{2} \approx 0.7, \\
\sin 60^\circ &= \tfrac{\sqrt{3}}{2} \approx 0.9, \\
\sin 90^\circ &= 1.
\end{aligned}$$
Then, when x goes from 90° to 180°, sin x goes from 1 back to 0. In fact,
$$\begin{aligned}
\sin 120^\circ &= \tfrac{\sqrt{3}}{2} \approx 0.9, \\
\sin 135^\circ &= \tfrac{\sqrt{2}}{2} \approx 0.7, \\
\sin 150^\circ &= \tfrac{1}{2} = 0.5, \\
\sin 180^\circ &= 0.
\end{aligned}$$
These are symmetric with the previous values. Notice that the identity sin(180° − θ) = sin θ confirms this.
Thus, as x goes from 0° to 180°, sin x rises from 0 to 1, then back to 0. This can be graphed as:
[Figure: graph of sin x from 0° to 180°, with 90° and 180° marked on the x-axis.]
What happens when x goes from 180° to 360°? By the identity sin(180° + θ) = − sin θ, the graph of sin x from 180° to 360° is the reflection of the graph from 0° to 180° over the x-axis, shifted 180° to the right, as such:
[Figure: graph of sin x from 0° to 360°.]
Now remember that the period of sine is 2π, or 360°. Thus, the entire graph of sin x repeats indefinitely (oscillating between −1 and 1) in both directions:
[Figure: graph of sin x from −360° to 360°, with the x-axis marked every 90°.]
Now, recall the identity cos θ = sin(θ + 90°). This indicates that the graph of cos x is just the graph of sin x shifted 90° to the left! Thus, the entire graph of cos x would be:
[Figure: graph of cos x from −360° to 360°, with the x-axis marked every 90°.]
[Figures: the graphs of sin x and cos x with the x-axis labeled in radians, from −2π to 2π in steps of ½π.]
Notice how the graph of sin x is consistent with the earlier discovery that sin x is an odd function, meaning that it is symmetric about the origin. Likewise, notice how the graph of cos x is symmetric across the y-axis, since cos x is an even function.
As both functions continue oscillating between −1 and 1 in both directions forever, we can
conclude the following:
Domain Range
sin x R [−1, 1]
cos x R [−1, 1]
To graph csc x and sec x, we should use the fact that they are reciprocals of sin x and cos x
respectively.
Let’s look at csc x first. As sin x goes from 1 to an arbitrarily small positive number close to 0, notice that $\csc x = \frac{1}{\sin x}$ goes from 1 to some arbitrarily large number. In other words, as sin x goes from 1 to 0, csc x goes from 1 to ∞.
Similarly, as sin x goes from 0 to 1, csc x goes from ∞ to 1. Keeping this behavior in mind, here is the graph of csc x along with sin x for one part:
[Figure: graphs of y = csc x and y = sin x on the interval from 0 to π, touching at the point where both equal 1 at x = ½π.]
Make sure you understand how sin x and csc x are related to each other with respect to their
graphs. Thus, our whole graph of csc x over many periods looks like this (the vertical asymptotes
are represented by dashed lines):
[Figure: graph of csc x from −2π to 2π, with dashed vertical asymptotes.]
Sketch sin x over this graph to reinforce your understanding of their relations.
[Figure: graph of sec x from −2π to 2π, with dashed vertical asymptotes.]
Here, we also used dashed lines to denote the vertical asymptotes of sec x.
Again, sketch cos x over the graph of sec x as an exercise.
Now, as long as you remember the graphs of sin x and cos x, you should be able to easily graph
csc x and sec x as well.
Notice that csc x has vertical asymptotes at x = −2π, x = −π, x = 0, x = π, x = 2π, etc.
At those values of x, csc x is not defined. In other words, the domain of csc x is the real numbers
excluding any multiple of π. This can be expressed as R − {πk | k ∈ Z}.
What about the range of csc x? Notice that csc x never becomes any number between −1 and 1
exclusive. Thus, its range can be expressed as (−∞, −1] ∪ [1, +∞).
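Similarly, sec x has vertical asymptotes at $x = -\frac{3}{2}\pi$, $x = -\frac{1}{2}\pi$, $x = \frac{1}{2}\pi$, $x = \frac{3}{2}\pi$, etc. Thus the domain of sec x is $\mathbb{R} - \left\{\pi k + \frac{\pi}{2} \mid k \in \mathbb{Z}\right\}$. Since sec x is merely csc x with a horizontal shift, their ranges are the same.
         Domain                                                                  Range
csc x    $\mathbb{R} - \{\pi k \mid k \in \mathbb{Z}\}$                          $(-\infty, -1] \cup [1, +\infty)$
sec x    $\mathbb{R} - \left\{\pi k + \frac{\pi}{2} \mid k \in \mathbb{Z}\right\}$    $(-\infty, -1] \cup [1, +\infty)$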
For the graph of tan x, let’s consider the function’s behavior as x goes from 0 inclusive to $\frac{\pi}{2}$ exclusive.
Since $\tan x = \frac{\sin x}{\cos x}$, notice that sin x is going from 0 to 1 while cos x is going from 1 to 0. Thus, tan x is increasing without bound, i.e. it is going to ∞ as x goes to $\frac{\pi}{2}$.
We can think about this in another way: tan x is the slope of the hypotenuse of the triangle in the unit circle, so as the angle approaches $\frac{\pi}{2}$, the hypotenuse gets steeper (until it becomes vertical at $x = \frac{\pi}{2}$). This indicates that the slope, which is tan x, is increasing rapidly.
Then, let’s graph tan x over $\left[0, \frac{\pi}{2}\right)$. We plug in some common values:
$$\begin{aligned}
\tan 0 &= 0, \\
\tan\frac{\pi}{6} &= \frac{\sqrt{3}}{3} \approx 0.6, \\
\tan\frac{\pi}{4} &= 1, \\
\tan\frac{\pi}{3} &= \sqrt{3} \approx 1.7.
\end{aligned}$$
Then we consider the fact that tan x goes to ∞ as x approaches $\frac{1}{2}\pi$ from the left.
[Figure: graph of tan x from 0 to ½π, rising toward a vertical asymptote at x = ½π.]
How would we sketch the rest of the graph? We can take advantage of a few identities.
Recall that tan x = tan(x + π). Thus, the graph above will repeat for $x = \pi$ to $\frac{3}{2}\pi$, etc.
Furthermore, note that
$$\tan(-x) = \frac{\sin(-x)}{\cos(-x)} = \frac{-\sin x}{\cos x} = -\tan x,$$
so tan x is an odd function.
Using these two pieces of information, we can sketch the rest of the graph (the dashed lines are
vertical asymptotes):
[Figure: graph of tan x from −2π to 2π, with dashed vertical asymptotes.]
What about cot x? It turns out that it is relatively simple to sketch cot x once you know how to sketch tan x.
Recall the cofunction identity $\tan\left(\frac{\pi}{2} - x\right) = \cot x$ (since $90^\circ = \frac{\pi}{2}$). Replacing x with −x gives us $\tan\left(\frac{\pi}{2} + x\right) = \cot(-x)$. But note that $\cot(-x) = \frac{\cos(-x)}{\sin(-x)} = \frac{\cos x}{-\sin x} = -\cot x$, so we have
$$\cot x = -\tan\left(\frac{\pi}{2} + x\right).$$
This indicates that the graph of cot x is the result of shifting the graph of tan x to the left by $\frac{\pi}{2}$, then flipping it over the x-axis. Thus, we end up with:
[Figure: graph of cot x from −2π to 2π, with dashed vertical asymptotes.]
Notice that tan x has the same vertical asymptotes as sec x, so the domain of tan x is also $\mathbb{R} - \left\{\pi k + \frac{\pi}{2} \mid k \in \mathbb{Z}\right\}$. However, if we look at the graph of tan x, we can see that tan x can be any real number. Thus, the range of tan x is $\mathbb{R}$.
Likewise, cot x has the same vertical asymptotes as csc x, so the domain of cot x is $\mathbb{R} - \{\pi k \mid k \in \mathbb{Z}\}$. For the same reason as tan x, the range of cot x is also $\mathbb{R}$.
Thus, we have the following results:
         Domain                                                                  Range
tan x    $\mathbb{R} - \left\{\pi k + \frac{\pi}{2} \mid k \in \mathbb{Z}\right\}$    $\mathbb{R}$
cot x    $\mathbb{R} - \{\pi k \mid k \in \mathbb{Z}\}$                          $\mathbb{R}$
I will summarize all the discoveries made about the domain and range of every trigonometric
function below:
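         Domain                                                                  Range
sin θ    $\mathbb{R}$                                                            $[-1, 1]$
cos θ    $\mathbb{R}$                                                            $[-1, 1]$
tan θ    $\mathbb{R} - \left\{\pi k + \frac{\pi}{2} \mid k \in \mathbb{Z}\right\}$    $\mathbb{R}$
cot θ    $\mathbb{R} - \{\pi k \mid k \in \mathbb{Z}\}$                          $\mathbb{R}$
sec θ    $\mathbb{R} - \left\{\pi k + \frac{\pi}{2} \mid k \in \mathbb{Z}\right\}$    $(-\infty, -1] \cup [1, +\infty)$
csc θ    $\mathbb{R} - \{\pi k \mid k \in \mathbb{Z}\}$                          $(-\infty, -1] \cup [1, +\infty)$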
It will not always be the case that you will be tasked with sketching a simple graph of sin x. You
may be asked to graph more complicated trigonometric functions. Consider the general equation of
sine,
y = A sin(Bx + C) + D.
Note that all equations defined in this form will have a “wave”-like form, which we call sinusoidal.
The following will still apply for an equation like y = A cos(Bx + C) + D, since sine and cosine
are merely phase shifts of each other.
Assume that B > 0 (whenever B is negative, we can always use the fact that sine is an odd
function and manipulate the equation into the given form above). With respect to these variables,
we define the following properties of the equation:
• The midline is y = D. It is the horizontal center line that the sinusoidal wave oscillates above and below. Here is an example:
[Figure: the curve y = A sin(Bx + C) + D with its midline y = D.]
• The amplitude is |A|. This represents the vertical distance from the midline to either the highest or the lowest point on the graph.
[Figure: the curve y = A sin(Bx + C) + D with the midline y = D and the amplitude |A| marked above and below it.]
• The frequency is B. This denotes the number of cycles completed in an interval of 2π or 360°. One cycle is the part of the graph that serves as a repeated pattern, since all trigonometric functions are periodic. Here is an example of one cycle:
[Figure: a sinusoidal curve with one cycle marked.]
For y = sin x, one cycle happens every 2π, so the frequency of sin x is 1.
• The period is $\frac{2\pi}{B}$. This is simply the horizontal length of one cycle. This is consistent with the definition of frequency: we divide the interval 2π by the number of cycles completed in that interval, which is B, to get the length of each cycle.
• The phase shift is $-\frac{C}{B}$. This can be derived from the observation that $y = A\sin(Bx + C) + D \implies y = A\sin\left(B\left(x + \frac{C}{B}\right)\right) + D$. This represents how far the original sine function (if you are given y = A sin(Bx + C) + D) or cosine function (if you are given y = A cos(Bx + C) + D) has been shifted right, if the phase shift is positive, or left, if the phase shift is negative.
• The minimum is D − |A|, while the maximum is D + |A|. This follows from our understanding of the midline and amplitude:
[Figure: the curve y = A sin(Bx + C) + D with the maximum D + |A|, the midline y = D, the minimum D − |A|, and the amplitude |A| marked.]
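The properties listed above translate directly into a small helper. This is a minimal sketch, assuming B > 0 as in the text; the function name `sinusoid_properties` and the `full_turn` parameter (360 for degrees, $2\pi$ for radians) are illustrative choices, and the sample coefficients are made up.

```python
def sinusoid_properties(A, B, C, D, full_turn=360.0):
    """Properties of y = A sin(Bx + C) + D (or the cosine form), assuming B > 0."""
    return {
        "midline": D,
        "amplitude": abs(A),
        "frequency": B,
        "period": full_turn / B,      # horizontal length of one cycle
        "phase_shift": -C / B,        # positive means shifted right
        "minimum": D - abs(A),
        "maximum": D + abs(A),
    }

# Hypothetical example: y = 3 sin(2x + 60 degrees) + 1
props = sinusoid_properties(3, 2, 60, 1)
print(props)
```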
Since graphs of trigonometric functions can go on forever, we will establish some criteria for a
‘sufficiently’ drawn graph (in the context of this book; this is NOT official or conventional). When a
problem asks for a sketch of some variant of sin x, then draw a curve of this form:
Then, label the coordinates of the five dotted points on that one period.
We will denote this portion of the graph as the first period. The first of the five points will
always have its x-coordinate equal to the phase shift (remember this, it is a useful tip!).
For some variant of − sin x, the first period would be:
Example 6.3.1
Sketch $y = 2\sin(3x + 15^\circ) - 117$.
Here the phase shift is $-\frac{15^\circ}{3} = -5^\circ$, so the first point has an x-coordinate of −5°, and the period is $\frac{360^\circ}{3} = 120^\circ$.
[Figure: one period of the curve divided into four equal parts, each labeled "¼ Period".]
Since the period is 120°, each point will horizontally be 30° away from its adjacent points. Thus, the second point will have an x-coordinate of −5° + 30° = 25°, the third point will have an x-coordinate of 25° + 30° = 55°, and the fourth point will have an x-coordinate of 55° + 30° = 85°. We can also confirm that the x-coordinate of the last point is indeed 115° by adding 30° to 85°.
Now, we just have to find the y-coordinates. Here, we observe that the minimum is −117 − |2| = −119 and the maximum is −117 + |2| = −115. Then, we can assign y-coordinates to the points according to which are maximums, minimums, or lie on the midline. Thus, our first period sketch is:
[Figure: first-period sketch with the maximum (25°, −115) and the minimum (85°, −119) labeled.]
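The five labeled points of a first period can be generated mechanically: start at the phase shift, step by a quarter period, and cycle through midline, extreme, midline, opposite extreme, midline. The sketch below assumes a sine-type curve measured in degrees; `first_period_points` is a hypothetical helper name.

```python
def first_period_points(A, B, C, D, full_turn=360.0):
    """Five key points of the first period of y = A sin(Bx + C) + D, assuming B > 0."""
    period = full_turn / B
    start = -C / B                      # the phase shift is the first x-coordinate
    offsets = [0, A, 0, -A, 0]          # midline, +A, midline, -A, midline
    return [(start + k * period / 4, D + dy) for k, dy in enumerate(offsets)]

# Example 6.3.1: y = 2 sin(3x + 15 degrees) - 117
points = first_period_points(2, 3, 15, -117)
for p in points:
    print(p)
```

For a negative A the same list of offsets automatically produces the flipped shape used for variants of − sin x.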
Exercise 6.3.2. Sketch $y = 2 - 3\cos\left(5x + \frac{\pi}{3}\right)$.
Example 6.3.3
Consider a portion of a sinusoidal function:
[Figure: a sinusoidal curve with labeled points A(30°, 7) at a maximum, B at a minimum, C on the midline, and D(60°, 2) on the midline.]
Given that points C and D lie on the midline, find the coordinates of B and C. Then, find four equations for the graph with the phase shift at A, B, C, and D.
Solution. Recall that a period can be broken up into four ‘quarters.’ While we are not directly dealing with a single cycle this time, notice that there are seven ‘quarter’ pieces between points A and D. Since the x-coordinates of points A and D are respectively 30° and 60°, each ‘quarter’ piece will be $\frac{60^\circ - 30^\circ}{7} = \frac{30^\circ}{7}$ long.
Since point B is two ‘quarter’ pieces away from point A, its x-coordinate will be $30^\circ + 2 \cdot \frac{30^\circ}{7} = \frac{270^\circ}{7}$. Likewise, since point C is 5 ‘quarter’ pieces away from point A, its x-coordinate will be $30^\circ + 5 \cdot \frac{30^\circ}{7} = \frac{360^\circ}{7}$.
Now, we just have to find the y-coordinates. Notice that point C also lies on the midline, so its y-coordinate will be the same as point D, which is 2.
Since B lies on a minimum of the wave, its y-coordinate will be the midline minus the amplitude. The amplitude is also the distance from the midline to the maximum, and since point A lies on a maximum of 7, the amplitude must be 7 − 2 = 5. Thus, the minimum is 2 − 5 = −3, so the y-coordinate of point B must be −3.
Thus, point B has coordinates $\left(\frac{270^\circ}{7}, -3\right)$ and point C has coordinates $\left(\frac{360^\circ}{7}, 2\right)$.
To find an equation for this graph, we need the following information:
• The amplitude is 5.
• The midline is y = 2.
• The period is four ‘quarter’ pieces, or $4 \cdot \frac{30^\circ}{7} = \frac{120^\circ}{7}$.
• The frequency is $360^\circ \div \frac{120^\circ}{7} = 21$.
All we need now is the phase shift. We look back to the four different shapes of ‘first periods’ for sin x, − sin x, cos x, and − cos x to determine which function should be used for which point.
1. If the phase shift is at point A, then the function must be some variant of cos x. The phase shift itself must be 30° (it is positive because we are shifting cosine to the right). Putting together all the information, we have the equation,
$$y = 5\cos(21(x - 30^\circ)) + 2 \implies y = 5\cos(21x - 630^\circ) + 2.$$
2. Likewise, the phase shift at point B must be $\frac{270^\circ}{7}$. The function that should be used is some variant of − cos x. Thus, our equation is
$$y = -5\cos\left(21\left(x - \frac{270^\circ}{7}\right)\right) + 2 \implies y = -5\cos(21x - 810^\circ) + 2.$$
3. The phase shift at point C must be $\frac{360^\circ}{7}$. Our function at this point must be some variant of − sin x, so the equation is
$$y = -5\sin\left(21\left(x - \frac{360^\circ}{7}\right)\right) + 2 \implies y = -5\sin(21x - 1080^\circ) + 2.$$
4. The phase shift at point D is 60°. The function must be some variant of sin x, so the equation is
$$y = 5\sin(21(x - 60^\circ)) + 2 \implies y = 5\sin(21x - 1260^\circ) + 2.$$
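Since all four equations are meant to describe the same curve, each one should pass through all four labeled points. The sketch below checks this numerically. The first equation, $y = 5\cos(21x - 630^\circ) + 2$, is written out here by following the same pattern as the other three (cosine with a 30° phase shift), and `sind` / `cosd` are helper names for degree-based sine and cosine.

```python
import math

def sind(d): return math.sin(math.radians(d))
def cosd(d): return math.cos(math.radians(d))

equations = [
    lambda x: 5 * cosd(21 * x - 630) + 2,     # phase shift at A
    lambda x: -5 * cosd(21 * x - 810) + 2,    # phase shift at B
    lambda x: -5 * sind(21 * x - 1080) + 2,   # phase shift at C
    lambda x: 5 * sind(21 * x - 1260) + 2,    # phase shift at D
]
points = {"A": (30, 7), "B": (270 / 7, -3), "C": (360 / 7, 2), "D": (60, 2)}

# Largest deviation of any equation from any of the four points
max_error = max(abs(f(x) - y) for f in equations for x, y in points.values())
print(max_error)
```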
Clearly, the graphs of these trigonometric functions fail the horizontal line test: if any horizontal line intersects the graph of a function at more than one point, then an inverse function does not exist.
Then how could there possibly be inverse trigonometric functions? It turns out that the only
reason for the failure of the horizontal line test is that the graph of the function repeats itself
periodically.
However, there is no need to consider all the repeated periods. We just need to look at one
period of the graph in order to define an appropriate inverse function.
Consider the graph of sin x on the interval $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$.
If we just consider this single portion of the graph, we can see that we have covered all possible values of sin x between −1 and 1. In other words, every distinct value of x is mapped to its corresponding, distinct value of sin x. This is the condition needed for the existence of an inverse function.
By limiting the graph to this interval, the horizontal line test is satisfied, i.e. there exists an inverse for this function.
So, if we limit the domain of sin x to $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, then there exists an inverse function. We could have chosen $\left[\frac{\pi}{2}, \frac{3\pi}{2}\right]$ or $\left[-\frac{3\pi}{2}, -\frac{\pi}{2}\right]$, but it is most convenient to select the interval centered at the origin.
We will call this inverse function $\sin^{-1} x$. It can also be called arcsin x. It is universally agreed that to define $\sin^{-1} x$, we will restrict the domain of sin x to $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$.
Then, the domain of $\sin^{-1} x$ is simply the range of sin x, which remains [−1, 1], and the range of $\sin^{-1} x$ is the newly restricted domain of sin x, which is $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$.
We can easily graph $\sin^{-1} x$ by reflecting the limited graph of sin x over the line y = x.
[Figure: graph of $\sin^{-1} x$ on [−1, 1], ranging from −π/2 to π/2.]
Now, let’s consider cos x. Like before, we want to restrict the domain so we can define $\cos^{-1} x$.
This time, if we try to limit the domain to $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, we run into a couple of problems. First, that portion of the graph still fails the horizontal line test! Second, we would completely leave out the negative values of cos x. Remember, it is our goal to include every possible value of cos x between −1 and 1 without repeated values.
However, if we limit the domain of cos x to [0, π], then the resulting portion of the graph accomplishes what we wanted: it passes the horizontal line test, and it covers all possible values of cos x between −1 and 1.
Thus, we define $\cos^{-1} x$ (or arccos x, if you prefer) to be the inverse of cos x, when the domain of cos x is restricted to [0, π].
Of course, this interval is chosen for convenience: we could’ve chosen [−π, 0], but the negative bounds may complicate matters in the future. It is nice to stick with a positive interval starting at 0. As with sin x, it is generally agreed that the domain of cos x should be limited to [0, π] in order to define $\cos^{-1} x$.
Therefore, since the domain of cos x was limited to [0, π], the range of $\cos^{-1} x$ is [0, π]. As the range of cos x is still [−1, 1], the domain of $\cos^{-1} x$ is [−1, 1].
[Figure: graph of $\cos^{-1} x$ on [−1, 1].]
Lastly, let’s move on to tan x. We still have the same problem as before, but now you should be
able to tell what slight modification is necessary.
[Figure: graph of tan x from −2π to 2π, with dashed vertical asymptotes.]
We should limit the domain to $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. It is unnecessary to include more than one period of tangent, since this selected portion already covers all values of tan x.
Then, since the range of tan x is $\mathbb{R}$, the domain of $\tan^{-1} x$ is $\mathbb{R}$. Likewise, the restricted domain of tan x is $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, so the range of $\tan^{-1} x$ is $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$.
By reflecting over the line y = x, we can graph $\tan^{-1} x$. The dashed lines represent the horizontal asymptotes of $\tan^{-1} x$.
[Figure: graph of $\tan^{-1} x$ with dashed horizontal asymptotes $y = \frac{\pi}{2}$ and $y = -\frac{\pi}{2}$.]
To recap what we have done so far, here are the domains and ranges of the three trigonometric
inverse functions.
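                 Domain          Range
$\sin^{-1} x$    $[-1, 1]$       $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$
$\cos^{-1} x$    $[-1, 1]$       $[0, \pi]$
$\tan^{-1} x$    $\mathbb{R}$    $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$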
You should grow accustomed to some common values for these functions.
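$x$                      $\sin^{-1} x$        $\cos^{-1} x$
$1$                      $\frac{\pi}{2}$      $0$
$\frac{\sqrt{3}}{2}$     $\frac{\pi}{3}$      $\frac{\pi}{6}$
$\frac{\sqrt{2}}{2}$     $\frac{\pi}{4}$      $\frac{\pi}{4}$
$\frac{1}{2}$            $\frac{\pi}{6}$      $\frac{\pi}{3}$
$0$                      $0$                  $\frac{\pi}{2}$
$-\frac{1}{2}$           $-\frac{\pi}{6}$     $\frac{2\pi}{3}$
$-\frac{\sqrt{2}}{2}$    $-\frac{\pi}{4}$     $\frac{3\pi}{4}$
$-\frac{\sqrt{3}}{2}$    $-\frac{\pi}{3}$     $\frac{5\pi}{6}$
$-1$                     $-\frac{\pi}{2}$     $\pi$

$x$                      $\tan^{-1} x$
$\sqrt{3}$               $\frac{\pi}{3}$
$1$                      $\frac{\pi}{4}$
$\frac{1}{\sqrt{3}}$     $\frac{\pi}{6}$
$0$                      $0$
$-\frac{1}{\sqrt{3}}$    $-\frac{\pi}{6}$
$-1$                     $-\frac{\pi}{4}$
$-\sqrt{3}$              $-\frac{\pi}{3}$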
Take a moment to confirm that these values of x give the correct values of sin−1 x, cos−1 x, and
tan−1 x.
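One way to carry out this confirmation is to compare each table entry against Python's `math.asin`, `math.acos`, and `math.atan`, which use the same restricted ranges as defined above. The list names below are illustrative.

```python
import math

r2, r3 = math.sqrt(2), math.sqrt(3)

# (x, expected arcsin x, expected arccos x)
sin_cos_rows = [
    (1, math.pi / 2, 0),
    (r3 / 2, math.pi / 3, math.pi / 6),
    (r2 / 2, math.pi / 4, math.pi / 4),
    (1 / 2, math.pi / 6, math.pi / 3),
    (0, 0, math.pi / 2),
    (-1 / 2, -math.pi / 6, 2 * math.pi / 3),
    (-r2 / 2, -math.pi / 4, 3 * math.pi / 4),
    (-r3 / 2, -math.pi / 3, 5 * math.pi / 6),
    (-1, -math.pi / 2, math.pi),
]
# (x, expected arctan x)
tan_rows = [
    (r3, math.pi / 3), (1, math.pi / 4), (1 / r3, math.pi / 6), (0, 0),
    (-1 / r3, -math.pi / 6), (-1, -math.pi / 4), (-r3, -math.pi / 3),
]

worst = 0.0
for x, a, c in sin_cos_rows:
    worst = max(worst, abs(math.asin(x) - a), abs(math.acos(x) - c))
for x, t in tan_rows:
    worst = max(worst, abs(math.atan(x) - t))
print(worst)   # largest discrepancy between the tables and the library values
```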
Problem 6.4.1. Find the values of the following:
1. $\sin^{-1}\left(\sin\frac{\pi}{7}\right)$
2. $\cos^{-1}\left(\cos\frac{\pi}{7}\right)$
3. $\tan^{-1}\left(\tan\frac{\pi}{7}\right)$
4. $\sin^{-1}(\sin 2\pi)$
5. $\sin^{-1}(\sin 3)$
Solution. Understand that the basic definition of an inverse implies $f^{-1}(f(x)) = x$. But be careful of the domains and ranges of the inverse trig functions!
1. $\sin^{-1}\left(\sin\frac{\pi}{7}\right) = \frac{\pi}{7}$.
2. $\cos^{-1}\left(\cos\frac{\pi}{7}\right) = \frac{\pi}{7}$.
3. $\tan^{-1}\left(\tan\frac{\pi}{7}\right) = \frac{\pi}{7}$.
4. The answer is not 2π, because the range of inverse sine is $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$. So we must find a $\theta \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ such that sin θ = sin(2π), which is θ = 0. Therefore, $\sin^{-1}(\sin 2\pi) = \sin^{-1}(\sin 0) = 0$.
5. Similarly, 3 would not be the correct answer. But note that sin 3 = sin(π − 3), by the identity sin(π − θ) = sin θ. Since $\pi - 3 \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, $\sin^{-1}(\sin 3) = \sin^{-1}(\sin(\pi - 3)) = \pi - 3$.
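The reasoning in parts 4 and 5 (replace the angle by one in $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ with the same sine) can be written as a short procedure. The sketch below uses only coterminal reduction and the identity sin(π − θ) = sin θ, then compares against the library's `asin`; the name `arcsin_of_sin` is a hypothetical helper.

```python
import math

def arcsin_of_sin(x):
    """Return the angle in [-pi/2, pi/2] whose sine equals sin(x), without calling asin."""
    # Shift x by a multiple of 2*pi into [-pi/2, 3*pi/2)
    t = (x + math.pi / 2) % (2 * math.pi) - math.pi / 2
    if t > math.pi / 2:
        t = math.pi - t    # reflect using sin(pi - t) = sin(t)
    return t

for x in (math.pi / 7, 2 * math.pi, 3):
    print(x, arcsin_of_sin(x), math.asin(math.sin(x)))
```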
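Example 6.4.2
Find the values of the following:
1. $\sin^{-1}\left(\cos\frac{\pi}{7}\right)$
2. $\sin^{-1}\left(\cos\left(-\frac{\pi}{7}\right)\right)$
3. $\sin\left(\cos^{-1}\frac{\pi}{7}\right)$
4. Find a general form for $\sin(\cos^{-1} x)$.
5. $\sin\left(\cos^{-1}\frac{\pi}{2}\right)$
6. $\sec(\tan^{-1}(2017))$
Solution.
2. Note that cos(−θ) = cos θ. Thus, $\sin^{-1}\left(\cos\left(-\frac{\pi}{7}\right)\right) = \sin^{-1}\left(\cos\frac{\pi}{7}\right) = \frac{5\pi}{14}$, based on the previous part.
3. Let $\theta = \cos^{-1}\frac{\pi}{7}$, so that $\cos\theta = \frac{\pi}{7}$. Then
$$\begin{aligned}
\sin^2\theta + \cos^2\theta &= 1 \\
\sin^2\theta &= 1 - \frac{\pi^2}{49} \\
\sin\theta &= \pm\frac{\sqrt{49 - \pi^2}}{7}.
\end{aligned}$$
Which sign do we choose? Now, we have to consider the domain and range of $\cos^{-1} x$. Since the range is [0, π], we must have θ ∈ [0, π]. Then, in that interval, sin θ is never negative. Therefore, we choose the positive square root to get $\frac{\sqrt{49 - \pi^2}}{7}$.
4. Let $\theta = \cos^{-1} x$, so that cos θ = x. First, note that
$$\begin{aligned}
\sin^2\theta + \cos^2\theta &= 1 \\
\sin^2\theta &= 1 - \cos^2\theta \\
\sin\theta &= \pm\sqrt{1 - \cos^2\theta}.
\end{aligned}$$
However, $\cos^{-1}(x)$ has range [0, π], and sine is never negative in this interval, so we choose the positive square root: $\sin(\cos^{-1}(x)) = \sqrt{1 - x^2}$.
5. The domain of $\cos^{-1}(x)$ is [−1, 1], but we are given $x = \frac{\pi}{2}$, which is greater than 1. Therefore, there is no solution.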
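The general form from part 4, together with the domain restriction from part 5, can be sketched as follows. Returning `None` for inputs outside [−1, 1] is an illustrative way to represent "no solution", and the name `sin_of_arccos` is a hypothetical helper.

```python
import math

def sin_of_arccos(x):
    """sin(arccos x) = sqrt(1 - x^2) for x in [-1, 1]; None outside the domain of arccos."""
    if not -1 <= x <= 1:
        return None
    return math.sqrt(1 - x * x)   # positive root, since arccos x lies in [0, pi]

print(sin_of_arccos(math.pi / 7))   # part 3
print(sin_of_arccos(math.pi / 2))   # part 5: outside the domain
print(math.asin(math.cos(-math.pi / 7)), 5 * math.pi / 14)   # part 2
```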
2. $\csc(\tan^{-1}(-2))$
3. $\sec(\sin^{-1} x)$
Solution. It is essential that you understand the domains and ranges of the inverse functions so that you are able to choose the proper sign.
1. Let $\theta = \cos^{-1}\left(-\frac{1}{3}\right)$. Then $\cos\theta = -\frac{1}{3}$. Because the range of $\cos^{-1}$ is [0, π], and sine is always nonnegative in that interval, sin θ must be nonnegative. Using the identity $\sin^2\theta + \cos^2\theta = 1$, we get that $\sin\theta = \pm\frac{2\sqrt{2}}{3}$, so choose the positive square root: $\sin\theta = \frac{2\sqrt{2}}{3}$.
2. First, let $\theta = \tan^{-1}(-2)$. Then tan θ = −2. Using the identity $\tan^2\theta + 1 = \sec^2\theta$, we get that $\sec\theta = \pm\sqrt{5}$. We know that tan θ < 0, so $\theta \in \left(-\frac{\pi}{2}, 0\right)$. cos θ must be positive in that interval, so sec θ is also positive. Thus, $\sec\theta = \sqrt{5}$, and $\cos\theta = \frac{1}{\sqrt{5}}$. Using the identity $\sin^2\theta + \cos^2\theta = 1$, we get $\sin\theta = \pm\frac{2}{\sqrt{5}}$, but since $\theta \in \left(-\frac{\pi}{2}, 0\right)$, sin θ < 0, so we choose $\sin\theta = -\frac{2}{\sqrt{5}}$. Therefore, $\csc\theta = -\frac{\sqrt{5}}{2}$.
3. Note that $\sec(\sin^{-1} x) = \frac{1}{\cos(\sin^{-1} x)}$. Let $\theta = \sin^{-1} x$, so sin θ = x. The range of inverse sine implies that $\theta \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, and therefore cos θ is never negative in that interval. Using the identity $\sin^2\theta + \cos^2\theta = 1$, $\cos\theta = \sqrt{1 - x^2}$ (we choose the positive square root). Thus, $\sec(\sin^{-1} x) = \frac{1}{\sqrt{1 - x^2}}$.
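These three results can be cross-checked numerically by letting the library pick the angle in the correct range and then evaluating the outer function directly. The sketch assumes |x| < 1 in the last formula so that the secant is defined; `sec_of_arcsin` is an illustrative name.

```python
import math

# Part 1: sin(arccos(-1/3)) should be 2*sqrt(2)/3
value1 = math.sin(math.acos(-1 / 3))

# Part 2: csc(arctan(-2)) should be -sqrt(5)/2
value2 = 1 / math.sin(math.atan(-2))

def sec_of_arcsin(x):
    """Closed form 1/sqrt(1 - x^2) for sec(arcsin x), assuming -1 < x < 1."""
    return 1 / math.sqrt(1 - x * x)

print(value1, 2 * math.sqrt(2) / 3)
print(value2, -math.sqrt(5) / 2)
print(sec_of_arcsin(0.6), 1 / math.cos(math.asin(0.6)))
```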
So far, we have only discovered identities that dealt with 90◦ or 180◦ . What if we wanted to find an
expression for sin(θ + 30◦ )? If we know sin θ and sin 30◦ , it seems reasonable that we should also get
sin(θ + 30◦ ).
To develop our first set of new, useful identities, we first consider the following two triangles
which lie on the unit circle:
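[Figure: two copies of the unit circle. In the first, the points B (cos a, sin a) and C (cos b, sin b) subtend the angle b − a at the center O. In the second, with center O′, the point E lies on the positive x-axis and the point D lies at angle b − a from it.]
Notice that $OC \cong O'D$ and $OB \cong O'E$, as they are radii of the unit circle. We have also constructed the triangles to have the same angle $m\angle COB = m\angle DO'E = b - a$. Therefore, by SAS, $\triangle OBC \cong \triangle O'ED$. Then, $BC \cong ED$, implying that $BC = ED$.
These are equal, so now we have basic algebraic manipulation:
$$\begin{aligned}
\sqrt{(\cos b - \cos a)^2 + (\sin b - \sin a)^2} &= \sqrt{(\cos(b - a) - 1)^2 + \sin^2(b - a)} \\
(\cos b - \cos a)^2 + (\sin b - \sin a)^2 &= (\cos(b - a) - 1)^2 + \sin^2(b - a).
\end{aligned}$$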
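Expanding gives
$$\cos^2 b - 2\cos b\cos a + \cos^2 a + \sin^2 b - 2\sin b\sin a + \sin^2 a = \cos^2(b - a) - 2\cos(b - a) + 1 + \sin^2(b - a).$$
This reduces to
$$2 - 2(\cos b\cos a + \sin b\sin a) = 2 - 2\cos(b - a),$$
so $\cos(b - a) = \cos b\cos a + \sin b\sin a$.
The angle difference for sine is similar to that of cosine: use the facts that sin(−θ) = − sin θ and cos(−θ) = cos θ. We have,
$$\sin(a - b) = \sin(a + (-b)) = \sin a\cos(-b) + \cos a\sin(-b) = \sin a\cos b - \cos a\sin b.$$
The four identities we have just discovered are known as the sum and difference identities:
$$\begin{aligned}
\sin(a \pm b) &= \sin a\cos b \pm \cos a\sin b, \\
\cos(a \pm b) &= \cos a\cos b \mp \sin a\sin b.
\end{aligned}$$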
Here are some exercises that require the use of these formulas.
1. $\cos(15^\circ)$
2. $\sin(195^\circ)$
3. $\cos\left(\cos^{-1}\frac{3}{5} + \cos^{-1}\frac{5}{13}\right)$
Solution.
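As a numerical cross-check of what the sum and difference identities give for these three values, the sketch below applies $\cos(a - b) = \cos a\cos b + \sin a\sin b$ with 15° = 45° − 30°, $\sin(a + b) = \sin a\cos b + \cos a\sin b$ with 195° = 150° + 45°, and exact fractions for the third expression. The decompositions are one possible choice among several, and the variable names are illustrative.

```python
import math
from fractions import Fraction

def rad(d):
    return math.radians(d)

# cos 15 = cos(45 - 30)
cos_15 = math.cos(rad(45)) * math.cos(rad(30)) + math.sin(rad(45)) * math.sin(rad(30))

# sin 195 = sin(150 + 45)
sin_195 = math.sin(rad(150)) * math.cos(rad(45)) + math.cos(rad(150)) * math.sin(rad(45))

# cos(arccos(3/5) + arccos(5/13)): both angles lie in [0, pi], so their sines
# are the nonnegative values 4/5 and 12/13 from the 3-4-5 and 5-12-13 triangles.
cos_a, sin_a = Fraction(3, 5), Fraction(4, 5)
cos_b, sin_b = Fraction(5, 13), Fraction(12, 13)
cos_sum = cos_a * cos_b - sin_a * sin_b

print(cos_15, sin_195, cos_sum)
```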
Using these identities, we can also find the sum and difference formulas for tangent:
$$\begin{aligned}
\tan(x \pm y) &= \frac{\sin(x \pm y)}{\cos(x \pm y)} \\
&= \frac{\sin x\cos y \pm \cos x\sin y}{\cos x\cos y \mp \sin x\sin y}.
\end{aligned}$$
Divide the numerator and denominator by cos x cos y to get the desired relation:
$$\tan(x \pm y) = \frac{\tan x \pm \tan y}{1 \mp \tan x\tan y}.$$
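Problem 6.5.2. Compute the following values:
1. $\tan(15^\circ)$
2. $\tan^{-1}\frac{1}{2} + \tan^{-1}\frac{1}{3}$
Solution.
2. The addition of these inverse tangent values suggests that we should take the tangent of this sum and analyze its result. Note that
$$\begin{aligned}
\tan\left(\tan^{-1}\frac{1}{2} + \tan^{-1}\frac{1}{3}\right) &= \frac{\tan\left(\tan^{-1}\frac{1}{2}\right) + \tan\left(\tan^{-1}\frac{1}{3}\right)}{1 - \tan\left(\tan^{-1}\frac{1}{2}\right)\tan\left(\tan^{-1}\frac{1}{3}\right)} \\
&= \frac{\frac{1}{2} + \frac{1}{3}}{1 - \frac{1}{6}} \\
&= 1.
\end{aligned}$$
If the slope (which is the tangent) is positive and less than 1, then the angle must be between 0 and $\frac{\pi}{4}$. Thus, $0 < \tan^{-1}\frac{1}{2} < \frac{\pi}{4}$ and $0 < \tan^{-1}\frac{1}{3} < \frac{\pi}{4}$, so $0 < \tan^{-1}\frac{1}{2} + \tan^{-1}\frac{1}{3} < \frac{\pi}{2}$.
Combining this with the result that $\tan\left(\tan^{-1}\frac{1}{2} + \tan^{-1}\frac{1}{3}\right) = 1$, we see that the only value possible for $\tan^{-1}\frac{1}{2} + \tan^{-1}\frac{1}{3}$ is $\frac{\pi}{4}$.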
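The tangent sum formula only needs the two tangent values, so the computation in part 2 can be repeated with exact fractions and then compared with a floating-point evaluation of the angle itself. `tan_sum` is a hypothetical helper name.

```python
import math
from fractions import Fraction

def tan_sum(tx, ty):
    """tan(x + y) computed from tx = tan x and ty = tan y."""
    return (tx + ty) / (1 - tx * ty)

t = tan_sum(Fraction(1, 2), Fraction(1, 3))   # exact arithmetic
angle = math.atan(1 / 2) + math.atan(1 / 3)   # floating-point angle

print(t)                     # tangent of the sum
print(angle, math.pi / 4)    # the sum itself, next to pi/4
```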
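Using the sum and difference formulas, we can derive formulas for sin 2x, cos 2x, and tan 2x in terms of sin x, cos x, and tan x.
Observe that 2x = x + x. Then,
$$\begin{aligned}
\sin(2x) &= \sin(x + x) \\
&= \sin x\cos x + \cos x\sin x \\
&= 2\sin x\cos x.
\end{aligned}$$
$$\begin{aligned}
\cos(2x) &= \cos(x + x) \\
&= \cos x\cos x - \sin x\sin x \\
&= \cos^2 x - \sin^2 x.
\end{aligned}$$
The result for cos(2x) reminds us of the Pythagorean Identity $\sin^2 x + \cos^2 x = 1$. In fact, this equation can be rewritten in two ways: $\sin^2 x = 1 - \cos^2 x$, and $\cos^2 x = 1 - \sin^2 x$. We then substitute these two relations into the result $\cos^2 x - \sin^2 x$ to derive two new double angle identities for cosine:
$$\begin{aligned}
\cos(2x) &= \cos^2 x - (1 - \cos^2 x) = 2\cos^2 x - 1, \\
\cos(2x) &= (1 - \sin^2 x) - \sin^2 x = 1 - 2\sin^2 x.
\end{aligned}$$
For tangent,
$$\begin{aligned}
\tan(2x) &= \tan(x + x) \\
&= \frac{\tan x + \tan x}{1 - \tan x\tan x} \\
&= \frac{2\tan x}{1 - \tan^2 x}.
\end{aligned}$$
Thus, our double angle identities for sine, cosine, and tangent are:
$$\begin{aligned}
\sin 2x &= 2\sin x\cos x, \\
\cos 2x &= \cos^2 x - \sin^2 x = 2\cos^2 x - 1 = 1 - 2\sin^2 x, \\
\tan 2x &= \frac{2\tan x}{1 - \tan^2 x}.
\end{aligned}$$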
1. If $\cos x = \frac{1}{3}$, what is $\cos 2x$?
2. If $\sin x = \frac{3}{5}$, what is $\sin 2x$?
3. If $\tan x = \frac{1}{3}$, what is $\tan 2x$?
4. Compute the value $\tan\left(2\tan^{-1}\frac{1}{2}\right)$.
5. Compute $\cos(2\tan^{-1} 5)$.
Solution.
1. Directly apply the double angle formula for cosine: $\cos 2x = 2\cos^2 x - 1 = 2\left(\frac{1}{3}\right)^2 - 1 = -\frac{7}{9}$.
2. Since $\sin 2x = 2\sin x\cos x$, we also have to find the value of $\cos x$. Using the Pythagorean Identity $\sin^2 x + \cos^2 x = 1$, we see that $\cos x = \pm\frac{4}{5}$. We must be careful as there are two possible values for $\sin 2x$ when we consider the sign of $\cos x$:
• If $\cos x > 0$, then $\sin 2x = 2\cdot\frac{3}{5}\cdot\frac{4}{5} = \frac{24}{25}$.
• If $\cos x < 0$, then $\sin 2x = 2\cdot\frac{3}{5}\cdot\left(-\frac{4}{5}\right) = -\frac{24}{25}$.
3. Again, we can directly apply the double angle formula for tangent: $\tan 2x = \frac{2\cdot\frac{1}{3}}{1 - \left(\frac{1}{3}\right)^2} = \frac{3}{4}$.
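4. We treat $\tan^{-1}\frac{1}{2}$ as the angle we are doubling. Applying the double angle formula in this fashion, we have
\[
\tan\left(2\tan^{-1}\frac{1}{2}\right)
= \frac{2\tan\left(\tan^{-1}\frac{1}{2}\right)}{1 - \tan^2\left(\tan^{-1}\frac{1}{2}\right)}
= \frac{2\cdot\frac{1}{2}}{1 - \left(\frac{1}{2}\right)^2}
= \frac{4}{3}.
\]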
5. By the double angle identity for cosine, $\cos(2\tan^{-1} 5) = 2\cos^2(\tan^{-1} 5) - 1$. Now we must compute the value $\cos(\tan^{-1} 5)$ to find the answer.
Let $\theta = \tan^{-1} 5$, implying $\tan\theta = 5$. Using the identity $\tan^2\theta + 1 = \sec^2\theta$, we get $\sec\theta = \pm\sqrt{26}$. However, as $\theta$ is a value of the inverse tangent function, the range of inverse tangent implies that $\theta \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. In this interval, $\cos\theta$, and therefore $\sec\theta$, are always positive, so $\sec\theta = \sqrt{26}$ and $\cos\theta = \frac{1}{\sqrt{26}}$.
Therefore, $\cos(2\tan^{-1} 5) = 2\cos^2(\tan^{-1} 5) - 1 = 2\left(\frac{1}{\sqrt{26}}\right)^2 - 1 = \frac{1}{13} - 1 = -\frac{12}{13}$.
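The last computation can be checked exactly with rational arithmetic. This sketch assumes only the identities $\cos 2\theta = 2\cos^2\theta - 1$ and $\cos^2\theta = \frac{1}{1 + \tan^2\theta}$; the helper name `cos_double_arctan` is hypothetical.

```python
import math
from fractions import Fraction

def cos_double_arctan(t):
    """cos(2*atan(t)) via cos(2θ) = 2cos²θ − 1 with cos²θ = 1 / (1 + t²)."""
    t = Fraction(t)
    return 2 / (1 + t * t) - 1

exact = cos_double_arctan(5)              # exact rational value
numeric = math.cos(2 * math.atan(5))      # floating-point comparison
print(exact, numeric)
```

The rational result should match the answer $-\frac{12}{13}$ found above, and the floating-point value should agree with it to rounding error.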
If we can double an angle, we can also halve an angle. This will lead us to our next set of
identities.
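First, we can express $\sin\theta$ in terms of $\cos 2\theta$:
\[
\cos 2\theta = 1 - 2\sin^2\theta
\implies
\sin\theta = \pm\sqrt{\frac{1 - \cos 2\theta}{2}}.
\]
Then, we can substitute $\theta \to \frac{\theta}{2}$ to get
\[
\sin\frac{\theta}{2} = \pm\sqrt{\frac{1 - \cos\theta}{2}}.
\]
Similarly, for cosine:
\[
\cos 2\theta = 2\cos^2\theta - 1
\implies
\cos\theta = \pm\sqrt{\frac{1 + \cos 2\theta}{2}}
\implies
\cos\frac{\theta}{2} = \pm\sqrt{\frac{1 + \cos\theta}{2}}.
\]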
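There are two half angle identities for tangent, derived in different ways:
\[
\begin{aligned}
\tan\frac{\theta}{2} &= \frac{\sin\frac{\theta}{2}}{\cos\frac{\theta}{2}} \\
&= \frac{\sin\frac{\theta}{2}}{\cos\frac{\theta}{2}} \cdot \frac{2\cos\frac{\theta}{2}}{2\cos\frac{\theta}{2}} \\
&= \frac{2\sin\frac{\theta}{2}\cos\frac{\theta}{2}}{2\cos^2\frac{\theta}{2}} \\
&= \frac{\sin\theta}{1 + \cos\theta}.
\end{aligned}
\]
\[
\begin{aligned}
\tan\frac{\theta}{2} &= \frac{\sin\frac{\theta}{2}}{\cos\frac{\theta}{2}} \\
&= \frac{\sin\frac{\theta}{2}}{\cos\frac{\theta}{2}} \cdot \frac{2\sin\frac{\theta}{2}}{2\sin\frac{\theta}{2}} \\
&= \frac{2\sin^2\frac{\theta}{2}}{2\sin\frac{\theta}{2}\cos\frac{\theta}{2}} \\
&= \frac{1 - \cos\theta}{\sin\theta}.
\end{aligned}
\]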
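Thus, the half angle identities for sine, cosine, and tangent are:
\[
\sin\frac{\theta}{2} = \pm\sqrt{\frac{1 - \cos\theta}{2}}, \qquad
\cos\frac{\theta}{2} = \pm\sqrt{\frac{1 + \cos\theta}{2}}, \qquad
\tan\frac{\theta}{2} = \frac{\sin\theta}{1 + \cos\theta} = \frac{1 - \cos\theta}{\sin\theta}.
\]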
Problem 6.5.4. Answer the following:
1. If $\cos\theta = \frac{1}{5}$ and $\theta \in \text{I}$, what is $\cos\frac{\theta}{2}$?
2. Compute and simplify $\tan\frac{\pi}{8}$.
Solution.
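1. The half angle formula for cosine gives us $\cos\frac{\theta}{2} = \pm\sqrt{\frac{1 + \frac{1}{5}}{2}} = \pm\sqrt{\frac{3}{5}}$. Be careful, $\cos\frac{\theta}{2}$ may not always be positive! Note that $\theta \in \text{I}$ implies that $0^\circ + 360k^\circ < \theta < 90^\circ + 360k^\circ$, which could suggest $360^\circ < \theta < 450^\circ$, leading to $180^\circ < \frac{\theta}{2} < 225^\circ$, and $\cos\frac{\theta}{2}$ is negative in this interval. Therefore $\cos\frac{\theta}{2}$ can either be positive or negative, even though $\theta$ is in quadrant I.
2. We simply use the half angle formula for tangent:
\[
\tan\frac{\pi}{8} = \frac{\sin\frac{\pi}{4}}{1 + \cos\frac{\pi}{4}} = \frac{\frac{\sqrt{2}}{2}}{1 + \frac{\sqrt{2}}{2}} = \frac{\sqrt{2}}{2 + \sqrt{2}} = \sqrt{2} - 1.
\]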
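The sign ambiguity in part 1 can be made concrete in code. The following is a minimal sketch, assuming we know the actual angle $\theta$ (not just its cosine); the function name `half_angle_cos` is an illustrative choice.

```python
import math

def half_angle_cos(theta):
    """cos(theta/2) from cos(theta), with the sign read off from where theta/2 lies."""
    magnitude = math.sqrt((1 + math.cos(theta)) / 2)
    # cos(theta/2) is negative exactly when theta/2 (reduced mod 2*pi)
    # lies in quadrant II or III
    half = (theta / 2) % (2 * math.pi)
    sign = -1.0 if math.pi / 2 < half < 3 * math.pi / 2 else 1.0
    return sign * magnitude

theta = math.acos(1 / 5)                     # a quadrant I angle with cos(theta) = 1/5
print(half_angle_cos(theta))                 # positive square root of 3/5
print(half_angle_cos(theta + 2 * math.pi))   # same cos(theta), negative result
```

Both inputs have $\cos\theta = \frac{1}{5}$ and lie in quadrant I, yet the two outputs differ in sign, which is the point made in the solution.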
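Solution. If we let $\theta = \tan^{-1}(2)$, then we have to compute $\tan\frac{\theta}{2}$. By our half angle identity, we have
\[
\tan\frac{\theta}{2} = \frac{\sin\theta}{1 + \cos\theta}.
\]
Now we have to find $\sin\theta$ and $\cos\theta$. Note that $\theta = \tan^{-1}(2) \implies \tan\theta = 2$. Now we have a simple reference triangle problem.
Note that $\tan^2\theta + 1 = \sec^2\theta$, so $\sec^2\theta = 2^2 + 1 = 5 \implies \sec\theta = \pm\sqrt{5}$.
Recall that the range of $\tan^{-1}$ is $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. Since $\theta = \tan^{-1}(2)$ and 2 is a positive slope, we must conclude that $\theta$ is in the first quadrant, i.e. $\theta \in \left(0, \frac{\pi}{2}\right)$. Then $\cos\theta$ and $\sec\theta$ must be positive on that interval. Thus, we choose $\sec\theta = \sqrt{5}$, and thus $\cos\theta = \frac{1}{\sqrt{5}}$. Likewise, $\sin\theta$ will also be positive on this interval, so $\sin\theta = \sqrt{1 - \left(\frac{1}{\sqrt{5}}\right)^2} = \frac{2}{\sqrt{5}}$.
Finally, we can plug in to get
\[
\tan\frac{\theta}{2} = \frac{\frac{2}{\sqrt{5}}}{1 + \frac{1}{\sqrt{5}}} = \frac{2}{\sqrt{5} + 1} = \frac{\sqrt{5} - 1}{2}.
\]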
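In addition to the double angle identities, we also have the triple angle identities:
\[
\sin 3\theta = 3\sin\theta - 4\sin^3\theta, \qquad
\cos 3\theta = 4\cos^3\theta - 3\cos\theta, \qquad
\tan 3\theta = \frac{3\tan\theta - \tan^3\theta}{1 - 3\tan^2\theta}.
\]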
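Proof. Using the fact that $3\theta = 2\theta + \theta$, we can use the sum formulas for sine, cosine, and tangent, then repeatedly apply double angle identities and the Pythagorean identities $\sin^2\theta = 1 - \cos^2\theta$ and $\cos^2\theta = 1 - \sin^2\theta$ to finish the proof.
\[
\begin{aligned}
\sin 3\theta &= \sin(2\theta + \theta) \\
&= \sin 2\theta\cos\theta + \cos 2\theta\sin\theta \\
&= (2\sin\theta\cos\theta)\cos\theta + (1 - 2\sin^2\theta)\sin\theta \\
&= 2\sin\theta\cos^2\theta + \sin\theta - 2\sin^3\theta \\
&= 2\sin\theta(1 - \sin^2\theta) + \sin\theta - 2\sin^3\theta \\
&= 2\sin\theta - 2\sin^3\theta + \sin\theta - 2\sin^3\theta \\
&= -4\sin^3\theta + 3\sin\theta.
\end{aligned}
\]
\[
\begin{aligned}
\cos 3\theta &= \cos(2\theta + \theta) \\
&= \cos 2\theta\cos\theta - \sin 2\theta\sin\theta \\
&= (2\cos^2\theta - 1)\cos\theta - 2\sin^2\theta\cos\theta \\
&= 2\cos^3\theta - \cos\theta - (1 - \cos^2\theta)(2\cos\theta) \\
&= 2\cos^3\theta - \cos\theta - 2\cos\theta + 2\cos^3\theta \\
&= 4\cos^3\theta - 3\cos\theta.
\end{aligned}
\]
\[
\begin{aligned}
\tan 3\theta &= \tan(2\theta + \theta) \\
&= \frac{\tan 2\theta + \tan\theta}{1 - \tan 2\theta\tan\theta} \\
&= \frac{\frac{2\tan\theta}{1 - \tan^2\theta} + \tan\theta}{1 - \frac{2\tan\theta}{1 - \tan^2\theta}\cdot\tan\theta} \\
&= \frac{\;\frac{2\tan\theta + \tan\theta(1 - \tan^2\theta)}{1 - \tan^2\theta}\;}{\frac{1 - 3\tan^2\theta}{1 - \tan^2\theta}} \\
&= \frac{2\tan\theta + \tan\theta - \tan^3\theta}{1 - 3\tan^2\theta} \\
&= \frac{3\tan\theta - \tan^3\theta}{1 - 3\tan^2\theta}.
\end{aligned}
\]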
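A numerical spot check is a cheap way to catch sign errors in derivations like the ones above. This sketch evaluates the three right-hand sides at a given angle so they can be compared with `sin(3θ)`, `cos(3θ)`, and `tan(3θ)`; the function name `triple_angle_values` is hypothetical.

```python
import math

def triple_angle_values(theta):
    """Right-hand sides of the triple angle identities for sine, cosine, tangent."""
    s, c, t = math.sin(theta), math.cos(theta), math.tan(theta)
    sin3 = 3 * s - 4 * s**3
    cos3 = 4 * c**3 - 3 * c
    tan3 = (3 * t - t**3) / (1 - 3 * t**2)   # undefined when tan²θ = 1/3
    return sin3, cos3, tan3

theta = 1.0  # an arbitrary test angle in radians
print(triple_angle_values(theta))
print(math.sin(3 * theta), math.cos(3 * theta), math.tan(3 * theta))
```

The two printed triples should agree to rounding error. A check at one angle is evidence, not a proof; the algebra above is what establishes the identities.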
Now, here are some review problems that will require the use of the identities discovered in this
section.
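Problem 6.5.8. If $\sin\theta + \cos\theta = \frac{1}{2}$, what is $\sin 2\theta$?
Solution. If we square both sides of the equation $\sin\theta + \cos\theta = \frac{1}{2}$, we get
\[
\sin^2\theta + 2\sin\theta\cos\theta + \cos^2\theta = \frac{1}{4}.
\]
Note that $\sin^2\theta + \cos^2\theta = 1$ and $\sin 2\theta = 2\sin\theta\cos\theta$. Therefore, $1 + \sin 2\theta = \frac{1}{4}$, so $\sin 2\theta = -\frac{3}{4}$.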
Solution. We have that $\cos^4 T - \sin^4 T = (\cos^2 T + \sin^2 T)(\cos^2 T - \sin^2 T)$. However, $\cos^2 T + \sin^2 T = 1$, and $\cos^2 T - \sin^2 T = \cos 2T$, thus $\cos^4 T - \sin^4 T = \cos 2T$.
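Problem 6.5.10. Prove $\cot\frac{\theta}{2} = \csc\theta + \cot\theta$.
Proof. The half angle identity for tangent is $\tan\frac{\theta}{2} = \frac{\sin\theta}{1 + \cos\theta}$. Then,
\[
\cot\frac{\theta}{2} = \frac{1}{\tan\frac{\theta}{2}} = \frac{1 + \cos\theta}{\sin\theta} = \frac{1}{\sin\theta} + \frac{\cos\theta}{\sin\theta} = \csc\theta + \cot\theta,
\]
as desired.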
Problem 6.5.11. Derive a formula for cot(α + β) in terms of cot α and cot β.
\[
\cot(\alpha + \beta) = \frac{\cot\alpha\cot\beta - 1}{\cot\alpha + \cot\beta}.
\]
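Problem 6.5.12. If $\tan\theta = 2$, find all possible values of $\tan\frac{\theta}{2}$.
Solution. As $\tan\theta$ is positive, $\theta$ is in either quadrant I or III.
If $\theta \in \text{I}$, then $\sin\theta = \frac{2}{\sqrt{5}}$ and $\cos\theta = \frac{1}{\sqrt{5}}$, so
\[
\tan\frac{\theta}{2} = \frac{\frac{2}{\sqrt{5}}}{1 + \frac{1}{\sqrt{5}}} = \frac{2}{\sqrt{5} + 1} = \frac{\sqrt{5} - 1}{2}.
\]
If $\theta \in \text{III}$, then $\sin\theta = -\frac{2}{\sqrt{5}}$ and $\cos\theta = -\frac{1}{\sqrt{5}}$, so
\[
\tan\frac{\theta}{2} = \frac{-\frac{2}{\sqrt{5}}}{1 - \frac{1}{\sqrt{5}}} = -\frac{2}{\sqrt{5} - 1} = \frac{-\sqrt{5} - 1}{2}.
\]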
Problem 6.5.13. If $\sin\theta = \frac{3}{5}$, what is $\sin 3\theta$?
Solution. The triple angle identity for sine is $\sin 3\theta = -4\sin^3\theta + 3\sin\theta$. Therefore, when $\sin\theta = \frac{3}{5}$,
\[
\sin 3\theta = -4\left(\frac{3}{5}\right)^3 + 3\cdot\frac{3}{5} = \frac{117}{125}.
\]
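Problem 6.5.14. Prove $\cos 20^\circ\cos 40^\circ\cos 80^\circ = \frac{1}{8}$.
Proof. Using $\sin(180^\circ - \theta) = \sin\theta$ and the double angle identity for sine, notice that
\[
\begin{aligned}
\cos 20^\circ\cos 40^\circ\cos 80^\circ\sin 20^\circ &= \frac{1}{2}\sin 40^\circ\cos 40^\circ\cos 80^\circ \\
&= \frac{1}{4}\sin 80^\circ\cos 80^\circ \\
&= \frac{1}{8}\sin 160^\circ \\
&= \frac{1}{8}\sin 20^\circ.
\end{aligned}
\]
Since $\cos 20^\circ\cos 40^\circ\cos 80^\circ\sin 20^\circ = \frac{1}{8}\sin 20^\circ$ and $\sin 20^\circ \neq 0$, we have $\cos 20^\circ\cos 40^\circ\cos 80^\circ = \frac{1}{8}$, so we're done.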
Problem 6.5.15. Prove $\cos 36^\circ\cos 72^\circ = \frac{1}{4}$. Then, use this result to find the value of $\cos 36^\circ$.
Solution. Similar to the previous problem,
\[
\begin{aligned}
\cos 36^\circ\cos 72^\circ\sin 36^\circ &= \frac{1}{2}\sin 72^\circ\cos 72^\circ \\
&= \frac{1}{4}\sin 144^\circ \\
&= \frac{1}{4}\sin 36^\circ,
\end{aligned}
\]
implying $\cos 36^\circ\cos 72^\circ = \frac{1}{4}$.
To find $\cos 36^\circ$, rewrite $\cos 36^\circ\cos 72^\circ = \frac{1}{4}$ in terms of $\cos 36^\circ$ only. To do this, notice that $72 = 2\cdot 36$, so we can use the double angle formula to get $\cos 72^\circ = 2\cos^2 36^\circ - 1$, so we have:
\[
\cos 36^\circ\,(2\cos^2 36^\circ - 1) = \frac{1}{4}.
\]
Letting $\cos 36^\circ = x$ for convenience, the above equation leaves us with the polynomial $8x^3 - 4x - 1 = 0$. This factors into $2\left(x + \frac{1}{2}\right)(4x^2 - 2x - 1) = 0$, and since $\cos 36^\circ$ clearly does not equal $-\frac{1}{2}$, we must find the roots of $4x^2 - 2x - 1$. Using the quadratic formula, we get $x = \frac{1 \pm \sqrt{5}}{4}$. Since $\cos 36^\circ > 0$, we have $\cos 36^\circ = \frac{1 + \sqrt{5}}{4}$.
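The closed form for $\cos 36^\circ$ is easy to confirm numerically. This short sketch checks the candidate root against the cubic $8x^3 - 4x - 1 = 0$ and against a direct evaluation; the variable names are illustrative.

```python
import math

x = (1 + math.sqrt(5)) / 4               # candidate value of cos(36 degrees)
residual = 8 * x**3 - 4 * x - 1          # should vanish if x solves the cubic
direct = math.cos(math.radians(36))      # floating-point value of cos(36 degrees)
product = direct * math.cos(math.radians(72))

print(x, direct)       # the two values should agree
print(residual)        # close to 0
print(product)         # close to 1/4
```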
Problem 6.5.16. Prove $-\sqrt{2} \leq \sin\theta + \cos\theta \leq \sqrt{2}$ $\forall\theta$.
Proof. Given the range of sine, we have that $\sin 2\theta \leq 1$, which implies $\sin 2\theta + 1 \leq 2$. However, note that $\sin 2\theta + 1 = \sin 2\theta + \sin^2\theta + \cos^2\theta = 2\sin\theta\cos\theta + \sin^2\theta + \cos^2\theta = (\sin\theta + \cos\theta)^2$. Therefore, we have $(\sin\theta + \cos\theta)^2 \leq 2$, which means that $-\sqrt{2} \leq \sin\theta + \cos\theta \leq \sqrt{2}$, and we are done.
Problem 6.5.17. What are the domain and range of $\cos^{-1}(\cos^{-1} x)$?
Solution. For $\cos^{-1}(\cos^{-1} x)$ to be defined, the input to the outer $\cos^{-1}$ must lie in $[-1, 1]$. Note that the range of the inner $\cos^{-1} x$ is $[0, \pi]$; this does not fit inside the domain of the outer $\cos^{-1}$, so we must restrict the values of the inner $\cos^{-1} x$ to $[0, 1]$.
If the inner $\cos^{-1} x$ takes values in $[0, 1]$, then $0 \leq \cos^{-1} x \leq 1$. Taking the cosine of this inequality yields $1 \geq x \geq \cos 1$. We flip the inequality signs because cosine is decreasing on $[0, \pi]$.
Therefore the domain of $\cos^{-1}(\cos^{-1} x)$ is $[\cos 1, 1]$.
Because the inner $\cos^{-1} x$ takes values in $[0, 1]$ and $\cos^{-1}$ is decreasing, the range of $\cos^{-1}(\cos^{-1} x)$ is $[\cos^{-1} 1, \cos^{-1} 0]$, or $\left[0, \frac{\pi}{2}\right]$.
Problem 6.5.18. Prove $\tan\frac{\theta}{2} + \cot\frac{\theta}{2} = 2\csc\theta$.
Problem 6.5.19. Find all $\theta$ in radians such that $\sin\theta = \frac{1}{2}$.
Solution. It may be tempting to immediately declare $\theta = \frac{\pi}{6}$ as the solution. However, recall that all trigonometric functions are periodic, and thus there are an infinite number of solutions. Since sine has a period of $2\pi$ (and thus the value does not change when adding or subtracting multiples of $2\pi$), we see that $\theta = \frac{\pi}{6} + 2\pi k$ $\forall k \in \mathbb{Z}$.
However, we are not finished. Note that $\sin\theta = \sin(\pi - \theta)$. Therefore, $\theta = \frac{5\pi}{6}$ is also a valid solution. We cannot forget that sine is periodic, so $\theta$ can actually be $\frac{5\pi}{6} + 2\pi k$ $\forall k \in \mathbb{Z}$. Altogether,
\[
\theta = \frac{\pi}{6} + 2\pi k, \ \frac{5\pi}{6} + 2\pi k \quad \forall k \in \mathbb{Z}.
\]
It may be hard to see that $\theta = \frac{5\pi}{6}$ is also a solution. A helpful tip is to visualize the unit circle on the coordinate plane, then plot the line $y = \frac{1}{2}$ and notice that it intersects the circle at two places, namely $\left(\cos\frac{\pi}{6}, \sin\frac{\pi}{6}\right)$ and $\left(\cos\frac{5\pi}{6}, \sin\frac{5\pi}{6}\right)$. We can then proceed from here.
Problem 6.5.20. Find all $0^\circ \leq x < 360^\circ$ in degrees such that $\cos x = \frac{1}{3}$. A calculator may be used.
Solution. First, we calculate $x = \cos^{-1}\frac{1}{3} \approx 70.5^\circ$. Then, observe that $\cos(360^\circ - \theta) = \cos\theta$, so we also have $x = 360^\circ - 70.5^\circ = 289.5^\circ$. All together, our solutions are $70.5^\circ, 289.5^\circ$.
Like the previous problem, notice that the line $x = \frac{1}{3}$ intersects the unit circle at two places (the first and fourth quadrants), so there must be two values of $x \in [0^\circ, 360^\circ)$.
Problem 6.5.21. Find all $0 \leq x < 2\pi$ in radians such that $\sin 2x = \sin x$.
Solution. First, we can use the double angle identity to get $2\sin x\cos x = \sin x$. You may be tempted to divide both sides by $\sin x$, but this ignores the possibility that $\sin x = 0$. It is better to rearrange the equation to $2\sin x\cos x - \sin x = 0$, from which we factor out $\sin x$ to get $\sin x(2\cos x - 1) = 0$. Now, we can notice that either $\sin x = 0$ or $\cos x = \frac{1}{2}$. We can then get our solutions $x = 0, \frac{\pi}{3}, \pi, \frac{5\pi}{3}$.
Problem 6.5.22. Find all $0^\circ \leq \theta < 360^\circ$ in degrees such that $\cos 3\theta = -\frac{1}{2}$.
Solution. If $0^\circ \leq \theta < 360^\circ$, then $0^\circ \leq 3\theta < 1080^\circ$. Therefore, solving for all solutions of $3\theta$ in that range for the equation $\cos 3\theta = -\frac{1}{2}$, we have $3\theta = 120^\circ, 240^\circ, 480^\circ, 600^\circ, 840^\circ, 960^\circ$ (these are obtained by determining the principal values $120^\circ$ and $240^\circ$, then adding multiples of $360^\circ$ to each).
Dividing by 3, we have: $\theta = 40^\circ, 80^\circ, 160^\circ, 200^\circ, 280^\circ, 320^\circ$.
Problem 6.5.23. Find all $0^\circ \leq x < 360^\circ$ such that $\sin(2x + 44^\circ) = \frac{1}{2}$.
We end up with $y = 150^\circ, 390^\circ, 510^\circ, 750^\circ$ as our solutions when we take into account the given restrictions.
Then, we replace $y$ with $2x + 44^\circ$ and solve for $x$, resulting in $x = 53^\circ, 173^\circ, 233^\circ, 353^\circ$.
Problem 6.5.24. Find all $0^\circ \leq x < 360^\circ$ in degrees such that $\cos 2x = \cos x$.
Solution. By the double angle identity for cosine, we have $2\cos^2 x - 1 = \cos x$, which rearranges to $2\cos^2 x - \cos x - 1 = 0$. We can factor this as $(2\cos x + 1)(\cos x - 1) = 0$, from which we get $\cos x = -\frac{1}{2}$ and $\cos x = 1$ as roots. Within the interval $0^\circ \leq x < 360^\circ$, we have the solutions $x = 0^\circ, 120^\circ, 240^\circ$.
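When the answers to an equation like these are expected to be whole numbers of degrees, a brute-force scan is a simple way to double-check them. The sketch below only tests integer degree values in $[0^\circ, 360^\circ)$, so it would miss answers such as $70.5^\circ$; the helper name `integer_degree_solutions` is hypothetical.

```python
import math

def integer_degree_solutions(f, target, tol=1e-9):
    """Whole-degree angles d in [0, 360) with f(d in radians) equal to target."""
    return [d for d in range(360) if abs(f(math.radians(d)) - target) < tol]

# cos(3θ) = -1/2
print(integer_degree_solutions(lambda t: math.cos(3 * t), -0.5))
# cos(2x) = cos(x), rewritten as cos(2x) - cos(x) = 0
print(integer_degree_solutions(lambda t: math.cos(2 * t) - math.cos(t), 0.0))
```

The two printed lists should match the solution sets found by hand in the two problems above.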
Problem 6.5.26. Classify all cases in which $a, b, c$ and $\cos a, \cos b, \cos c$ are both arithmetic sequences.
Solution. Let $a = b - d_1$, $c = b + d_1$, and $\cos a = \cos b - d_2$, $\cos c = \cos b + d_2$. Then we have the relations:
\[
\cos(b - d_1) = \cos b - d_2, \qquad \cos(b + d_1) = \cos b + d_2.
\]
Adding these and expanding with the sum and difference formulas, we get $2\cos b\cos d_1 = 2\cos b$, or $2\cos b(\cos d_1 - 1) = 0$. This implies $\cos d_1 = 1$ or $\cos b = 0$. For the former, we get that $d_1 = 360k^\circ$ $\forall k \in \mathbb{Z}$ ($d_1$ is the common difference between $a, b, c$) and for the latter, $b = 90^\circ + 180k^\circ$ $\forall k \in \mathbb{Z}$.
Problem 6.5.27. If $\sin x + \cos x = \frac{1}{2}$ and $\frac{\pi}{2} < x < \pi$, what is $\tan x$?
Squaring both sides of $\sin x + \cos x = \frac{1}{2}$ gives $1 + \sin 2x = \frac{1}{4}$, from which we get $\sin 2x = -\frac{3}{4}$.
If we know $\sin x + \cos x$, it may be helpful to also find out $\sin x - \cos x$. Observe that
\[
(\sin x - \cos x)^2 = \sin^2 x - 2\sin x\cos x + \cos^2 x = 1 - \sin 2x = \frac{7}{4}.
\]
Thus, taking the square root of both sides, $\sin x - \cos x = \pm\frac{\sqrt{7}}{2}$. Now, we use the fact that $\frac{\pi}{2} < x < \pi$. On this interval, $\sin x > \cos x$ (this can be seen from looking at the unit circle or comparing graphs of sine and cosine). Thus, since $\sin x - \cos x > 0$, we choose the positive sign: $\sin x - \cos x = \frac{\sqrt{7}}{2}$.
Now we have the equations $\sin x + \cos x = \frac{1}{2}$ and $\sin x - \cos x = \frac{\sqrt{7}}{2}$. If we add them together, we get $2\sin x = \frac{1 + \sqrt{7}}{2}$. If we subtract the second equation from the first, we get $2\cos x = \frac{1 - \sqrt{7}}{2}$.
Finally, we can compute
\[
\tan x = \frac{2\sin x}{2\cos x} = \frac{1 + \sqrt{7}}{1 - \sqrt{7}} = \frac{(1 + \sqrt{7})^2}{-6} = \frac{8 + 2\sqrt{7}}{-6} = -\frac{4 + \sqrt{7}}{3}.
\]
Proof. We consider several cases: an acute triangle, a right triangle, an obtuse triangle in which
angle C is obtuse, and an obtuse triangle in which angle C is acute.
[Figure: acute triangle with sides a and b meeting at angle C, and altitude h dropped to base b]
The area of the acute triangle above is $\frac{1}{2}bh$, where $h$ is the altitude dropped to the base. Using our knowledge of right triangle trig, it is clear that $\sin C = \frac{h}{a}$, which rearranges to $h = a\sin C$. Therefore, substituting $h$ in the original area formula, the area is $\frac{1}{2}ab\sin C$.
[Figure: right triangle with legs a and b meeting at the right angle C, and hypotenuse c]
Similarly, the area of this right triangle is $\frac{1}{2}ab$. Note that $\sin 90^\circ = 1$, so $\sin C = 1$. Therefore, the area is $\frac{1}{2}ab = \frac{1}{2}ab\sin C$.
[Figure: obtuse triangle with obtuse angle C, exterior angle 180° − C, sides a and b, and altitude h]
The area of this obtuse triangle with obtuse angle $C$ is $\frac{1}{2}bh$, when altitude $h$ is dropped to the line containing base $b$. Note that $h = a\sin(180^\circ - C) = a(\sin 180^\circ\cos C - \cos 180^\circ\sin C) = a(0 - (-1)\cdot\sin C) = a\sin C$, leading to the desired formula $\frac{1}{2}ab\sin C$.
[Figure: obtuse triangle with acute angle C, sides a and b, and altitude h]
Lastly, we consider the case of an obtuse triangle with acute angle $C$, with an area of $\frac{1}{2}bh$. Note that $\sin C = \frac{h}{a} \implies h = a\sin C$, so the area can also be expressed by $\frac{1}{2}ab\sin C$, as desired.
We will henceforth use the notation $[ABC]$ to denote the area of $\triangle ABC$.
Proof. Recall that $[ABC] = \frac{1}{2}ab\sin C$. It is evident that $[ABC] = \frac{1}{2}bc\sin A$ and $[ABC] = \frac{1}{2}ac\sin B$. Therefore, we have
\[
\frac{1}{2}ab\sin C = \frac{1}{2}bc\sin A = \frac{1}{2}ac\sin B.
\]
Dividing everything by $\frac{1}{2}abc$ yields
\[
\frac{\sin C}{c} = \frac{\sin A}{a} = \frac{\sin B}{b},
\]
as desired.
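[Figure: acute triangle ABC with sides a, b, c, and altitude h from B meeting AC at X]
Consider acute $\triangle ABC$. We have $\sin C = \frac{h}{a}$, so $h = a\sin C$. We also have $\cos C = \frac{CX}{a}$, so $CX = a\cos C$. Therefore, $AX = AC - CX = b - a\cos C$. Then, apply the Pythagorean Theorem to $\triangle ABX$ to get $c^2 = h^2 + AX^2$. Substituting $h = a\sin C$ and $AX = b - a\cos C$, we have
\[
c^2 = a^2\sin^2 C + (b - a\cos C)^2 = a^2\sin^2 C + a^2\cos^2 C + b^2 - 2ab\cos C = a^2 + b^2 - 2ab\cos C.
\]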
Secondly, for the case of a right triangle $\triangle ABC$ with right angle at $C$, it suffices to note that $\cos C = \cos 90^\circ = 0$ and combine this with the relationship $c^2 = a^2 + b^2$ given by the Pythagorean Theorem: $c^2 = a^2 + b^2 \implies c^2 = a^2 + b^2 + 0 \implies c^2 = a^2 + b^2 - 2ab\cos C$, so the right triangle case is done.
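[Figure: obtuse triangle ABC with obtuse angle C, sides a, b, c, and altitude h from B meeting the extension of AC at X]
Consider an obtuse triangle $\triangle ABC$ where $\angle C$ is obtuse. Drop the altitude from $B$ to $X$ such that it is perpendicular to line $AC$. Notice that $\angle BCX = 180^\circ - \angle BCA = 180^\circ - \angle C$. Furthermore, as $\cos\angle BCX = \frac{CX}{a}$, $CX = a\cos\angle BCX = a\cos(180^\circ - C) = -a\cos C$. Similarly, $h = a\sin\angle BCX = a\sin(180^\circ - C) = a\sin C$. Then $AX = AC + CX = b - a\cos C$, and the Pythagorean Theorem on $\triangle ABX$ gives
\[
c^2 = h^2 + AX^2 = a^2\sin^2 C + (b - a\cos C)^2 = a^2 + b^2 - 2ab\cos C.
\]
[Figure: obtuse triangle ABC with acute angle C, sides a, b, c, and altitude h from B meeting the extension of CA at X]
Lastly, consider an obtuse triangle $\triangle ABC$ where $\angle C$ is acute, with altitude $h$ dropped to line $AC$. First, right triangle $\triangle CBX$ yields $\sin C = \frac{h}{a} \implies h = a\sin C$. Furthermore, $\cos C = \frac{b + AX}{a} \implies AX = a\cos C - b$.
Apply the Pythagorean Theorem on $\triangle ABX$ to get $c^2 = AX^2 + h^2$. However, we know what $AX$ and $h$ are, so we substitute in their respective values:
\[
c^2 = (a\cos C - b)^2 + a^2\sin^2 C = a^2 + b^2 - 2ab\cos C.
\]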
We have covered all possible cases and have shown that the Law of Cosines works on any given
triangle.
1. SAS: Side-Angle-Side
2. SSS: Side-Side-Side
3. ASA: Angle-Side-Angle
4. AAS: Angle-Angle-Side
We also have HL (Hypotenuse-Leg), a special case of the SSS congruence law, but for right
triangles (as the third side is predetermined by Pythagorean Theorem).
These congruence laws suggest that when we are given 3 particular pieces of the triangle, we are
able to ‘solve’ the rest of the triangle (determining all remaining side lengths and angles). Here are
some examples:
ASA and AAS:
[Figure: triangle ABC with side a marked]
When given 2 angles and a side length included between them, we can use the Law of Sines to find the rest of the information. We actually have four pieces of information (that we need in order to solve the proportions) because we can always find the third angle using the fact that the angles of a triangle always sum to $180^\circ$. This is why ASA and AAS are essentially the same.
Thus, we can compute side lengths $b$ and $c$ (which are not known to us immediately) using the Law of Sines.
SSS and SAS:
[Figure: triangle with sides a, b, c and angle C between sides a and b]
When given two side lengths and an angle included between them (SAS), we can use the Law of Cosines to solve for the third side. Then once the third side is known, we can use the Law of Sines to find the rest of the angles:
\[
\frac{\sin A}{a} = \frac{\sin B}{b} = \frac{\sin C}{c}.
\]
We can then solve for $\sin A$ and $\sin B$, then take the inverse sine of those, to get our angles $A$ and $B$.
If we are given three side lengths but no angle measurements (SSS), we can find the measurement of one of the angles using the Law of Cosines ($c^2 = a^2 + b^2 - 2ab\cos C$) and solving for $\cos C$. Then, like before, we can use the Law of Sines to find the rest of the angles.
For the following problems, a calculator is required.
Problem 6.6.4. Solve for the rest of the side lengths and angles for the following triangles. Round
angles in degrees to the nearest tenth and side lengths to the nearest hundredth.
Solution.
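1. As all angles in a triangle add up to $180^\circ$, we can easily get that $m\angle C = 65^\circ$. Then using the Law of Sines,
\[
\frac{\sin 73^\circ}{a} = \frac{\sin 42^\circ}{7} = \frac{\sin 65^\circ}{c},
\]
we can calculate the rest of the side lengths:
\[
a = \frac{7\sin 73^\circ}{\sin 42^\circ} \approx 10.00, \qquad
c = \frac{7\sin 65^\circ}{\sin 42^\circ} \approx 9.48.
\]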
2. Using the Law of Cosines, we get that $c \approx 4.40$. The Law of Sines then relates the two remaining angles to the known sides.
Here is a caveat: we must use the exact answer to $c$ (that is still stored in the calculator; not the approximate 4.40) to find the smaller of the two remaining angles. Why?
As inverse sine only returns values in the interval $[-90^\circ, 90^\circ]$, we could incorrectly label an obtuse angle acute. Therefore, we should choose the smaller angle of the two because we know for sure that a triangle cannot have two angles both greater than $90^\circ$. After calculating the smaller angle, we can then calculate the larger angle of the two by using the fact that all angles add up to $180^\circ$.
As $\angle B$ is the smaller of the two, we calculate $m\angle B = \sin^{-1}\left(\frac{7\sin 12^\circ}{c}\right) \approx 19.3^\circ$. Then $m\angle A = 180^\circ - 12^\circ - 19.3^\circ = 148.7^\circ$.
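The Law of Sines computation in part 1 can be sketched in a few lines. The labelling $A = 73^\circ$, $B = 42^\circ$, $b = 7$ is an assumption read off from the Law of Sines line in that solution (the original figure is not reproduced here), and the function name `solve_aas` is hypothetical.

```python
import math

def solve_aas(angle_a_deg, angle_b_deg, side_b):
    """Given angles A, B and side b (opposite B), return (angle C, side a, side c)."""
    angle_c_deg = 180.0 - angle_a_deg - angle_b_deg
    # common ratio b / sin B from the Law of Sines
    ratio = side_b / math.sin(math.radians(angle_b_deg))
    side_a = ratio * math.sin(math.radians(angle_a_deg))
    side_c = ratio * math.sin(math.radians(angle_c_deg))
    return angle_c_deg, side_a, side_c

angle_c, a, c = solve_aas(73, 42, 7)
print(angle_c, round(a, 2), round(c, 2))
```

Keeping the unrounded values of `a` and `c` until the final step mirrors the caveat above about using the exact value stored in the calculator.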
Problem 6.6.5. Find all triangles with sides in arithmetic progression that have an angle of 60◦ .
Solution. If the triangle has an angle of 60◦ , then either the triangle is equiangular (and thus
equilateral) or for the other two angles, one is greater than 60◦ and one is less than 60◦ . Since
this angle of 60◦ is the average of the three angles in this kind of triangle, it must be between the
‘smallest’ and ‘largest’ side lengths (as the problem states that the sides are in arithmetic progression).
Therefore, if we let the side lengths be x − d, x, and x + d with a common difference of d, then this
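angle of $60^\circ$ is between $x - d$ and $x + d$ and is opposite the side $x$. We then apply the Law of Cosines:
\[
x^2 = (x - d)^2 + (x + d)^2 - 2(x - d)(x + d)\cos 60^\circ = 2x^2 + 2d^2 - (x^2 - d^2) = x^2 + 3d^2,
\]
so $3d^2 = 0$ and $\therefore d = 0$.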
Therefore, the only kind of triangle that has sides in arithmetic progression and contains an angle of
60◦ is an equilateral triangle.
Why isn't SSA considered a valid triangle congruence law? In fact, when given two sides and an angle not included between them, there can be multiple scenarios. Henceforth, for $\triangle ABC$, we will denote sides $a = BC$, $b = AC$, and $c = AB$. Consider this diagram:
The line that forms an angle $A$ with side length $b$ will contain side $c$, depending on the side length $a$. The following cases will demonstrate the various situations which can occur depending on the length of $a$.
[Figure: sides a and b, with a too short to reach the long line]
If we choose a side length $a$ that is smaller than $b\sin A$ (which is the length of the perpendicular dropped from vertex $C$ to the long line), then we will not be able to create any triangle. One can visualize this by constructing a circle of radius $a$ around point $C$, and realize that this circle will never intersect with the long line.
[Figure: sides b and a, with a perpendicular to the long line]
If we let $a$ be that perpendicular, notice that we can construct exactly one triangle, which is the right triangle with altitude $a$. Note that the shortest distance from a point to a line is the perpendicular dropped from the point to the line.
[Figure: side b and two possible positions a1 and a2 of side a]
If the side length $a$ is between the length of the altitude $b\sin A$ and the side length $b$, then there are actually two possible triangles. As mentioned in case 1, it is easier to visualize this and understand the reasoning by constructing a circle of radius $a$ and seeing that it intersects the long line at two points.
[Figure: sides b and a]
such a triangle would be possible because $m\angle A + m\angle B = 30^\circ + 141.3^\circ < 180^\circ$. Therefore, there are two possible triangles here.
4. $a = 12$: This yields $\sin B = \frac{5}{12}$. Taking inverse sine and applying the identity $\sin(180^\circ - \theta) = \sin\theta$, we get $B \approx 24.6^\circ, 155.4^\circ$. However, if $m\angle B = 155.4^\circ$, then such a triangle would be impossible, because $m\angle A + m\angle B = 30^\circ + 155.4^\circ > 180^\circ$. Therefore, $B \approx 155.4^\circ$ is not valid, so there is only one possible triangle at $B \approx 24.6^\circ$.
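The case analysis for SSA can be sketched as a small function that lists the admissible values of angle $B$. The data used here, $A = 30^\circ$ and $b = 10$, are assumptions inferred from the surviving lines of the solution ($m\angle A = 30^\circ$ and $\sin B = \frac{5}{12}$ when $a = 12$); the function name `ssa_angles_b` is hypothetical.

```python
import math

def ssa_angles_b(angle_a_deg, side_a, side_b):
    """Possible values of angle B (degrees) given angle A, its opposite side a, and side b."""
    sin_b = side_b * math.sin(math.radians(angle_a_deg)) / side_a
    if math.isclose(sin_b, 1.0):
        candidates = [90.0]          # a equals the altitude b sin A: one right triangle
    elif sin_b > 1:
        return []                    # a is shorter than the altitude: no triangle
    else:
        acute = math.degrees(math.asin(sin_b))
        candidates = [acute, 180.0 - acute]
    # keep only candidates that leave room for a positive third angle
    return [b for b in candidates if angle_a_deg + b < 180.0]

for side_a in (4, 5, 8, 12):
    print(side_a, [round(b, 1) for b in ssa_angles_b(30, side_a, 10)])
```

The number of angles returned for each `side_a` corresponds to the number of possible triangles in that case.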
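Problem 6.6.7. Solve the following SSA scenario for $\triangle ABC$: $A = 38^\circ$, $a = 4$, $b = 9$.
Problem 6.6.9. Find the angles of a triangle with side lengths 12, 13, and 14.
Solution. Given $\triangle ABC$, we let $c = 12$, $b = 13$, $a = 14$. By the Law of Cosines, we get
\[
\cos A = \frac{b^2 + c^2 - a^2}{2bc} = \frac{169 + 144 - 196}{2\cdot 13\cdot 12} = \frac{3}{8},
\]
so $A \approx 68.0^\circ$ and $\sin A = \frac{\sqrt{55}}{8}$.
Solve for $\sin B$ and $\sin C$, then take inverse sine, as necessary:
\[
\frac{\frac{\sqrt{55}}{8}}{14} = \frac{\sin B}{13} \implies B \approx 59.4^\circ.
\]
\[
\frac{\frac{\sqrt{55}}{8}}{14} = \frac{\sin C}{12} \implies C \approx 52.6^\circ.
\]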
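As a cross-check, all three angles can also be computed directly from the Law of Cosines, which avoids the inverse sine ambiguity altogether. This is a swapped-in variant of the method above (the solution uses the Law of Sines for $B$ and $C$); the function name `sss_angles` is hypothetical.

```python
import math

def sss_angles(a, b, c):
    """Angles (A, B, C) in degrees opposite sides a, b, c, via the Law of Cosines."""
    def angle_opposite(x, y, z):
        # angle opposite side x, where y and z are the other two sides
        return math.degrees(math.acos((y * y + z * z - x * x) / (2 * y * z)))
    return angle_opposite(a, b, c), angle_opposite(b, a, c), angle_opposite(c, a, b)

A, B, C = sss_angles(14, 13, 12)
print(round(A, 1), round(B, 1), round(C, 1))
```

Because inverse cosine returns values in $[0^\circ, 180^\circ]$, an obtuse angle would be reported correctly without any extra case checking.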
Exercise 6.6.10. Compute the largest angle in a triangle with side lengths 13, 15, and 17.
Chapter 7
Advanced Trigonometry
This is simply a continuation of the previous chapter; however, this chapter will focus on the other
ways of representing the coordinate plane, introducing more advanced concepts. The following
material will require good understanding and foundation of the trigonometry covered earlier.
In the Cartesian plane, we represent points by (x, y) according to an xy coordinate grid. Polar
coordinates are another way of representing these points.
[Figure: a point (x, y) at distance r from the origin, with angle θ measured from the positive x-axis]
Definition 7.1.1. We denote a point (r; θ) as a polar coordinate, where r is the length of the
ray from the origin and θ is the angle which represents the rotation of the ray through the origin of
the unit circle, analogous to the definitions of the unit circle stated earlier.
On the Cartesian plane, the unit circle contains all points of the form $(\cos\theta, \sin\theta)$. This can be generalized to circles for any radius $k$: $(k\cos\theta, k\sin\theta)$. Therefore, we have that $(r\cos\theta, r\sin\theta)$ in Cartesian coordinates is $(r; \theta)$ in polar coordinates, or in other words: $(x, y) \longleftrightarrow (r\cos\theta, r\sin\theta) \longleftrightarrow (r; \theta)$, implying:
\[
x = r\cos\theta, \qquad y = r\sin\theta.
\]
This suggests that $\frac{y}{x} = \tan\theta$. Furthermore, note that $x^2 + y^2 = r^2\cos^2\theta + r^2\sin^2\theta = r^2(\sin^2\theta + \cos^2\theta) = r^2$. To simplify our work for converting between polar and Cartesian coordinates, we could establish $r \geq 0$ and $0 \leq \theta \leq 2\pi$ as stipulations, in which case, $r = \sqrt{x^2 + y^2}$ and $\theta = \tan^{-1}\frac{y}{x}$, adding or subtracting multiples of $\pi$ as necessary (because inverse tan can return negative values, and it depends on which quadrant the point lies).
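The conversion rules can be sketched in Python as follows. Instead of computing $\tan^{-1}\frac{y}{x}$ and correcting by multiples of $\pi$ by hand, this sketch uses `math.atan2`, which takes the quadrant into account automatically; that is a substitution for the hand method described above, not the text's own procedure. The function names are illustrative, and the angle is normalized to $0 \leq \theta < 2\pi$.

```python
import math

def to_polar(x, y):
    """Return (r, theta) with r >= 0 and 0 <= theta < 2*pi."""
    r = math.hypot(x, y)                       # sqrt(x^2 + y^2)
    theta = math.atan2(y, x) % (2 * math.pi)   # quadrant-aware angle, shifted into [0, 2*pi)
    return r, theta

def to_cartesian(r, theta):
    """Return (x, y) = (r cos(theta), r sin(theta))."""
    return r * math.cos(theta), r * math.sin(theta)

r, theta = to_polar(7, -7)
print(r, theta)                          # compare with 7*sqrt(2) and 7*pi/4
print(to_cartesian(2, 5 * math.pi / 6))  # compare with (-sqrt(3), 1)
```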
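Problem 7.1.2. Plot the following points and convert them into Cartesian coordinates.
1. $A\left(3; \frac{\pi}{2}\right)$
2. $B\left(2; \frac{5\pi}{6}\right)$
3. $C\left(4; -\frac{\pi}{3}\right)$
4. $D\left(0; 10^{10}\pi + 2\right)$
5. $E\,(-1; \pi)$
[Figure: the points plotted on the polar plane]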
Solution. For point $A$, we plot the point that is 3 away from the origin and $\frac{\pi}{2}$ radians from the starting line, which is the $x$-axis. To convert this into a point in the form $(x, y)$, recall that $x = r\cos\theta$ and $y = r\sin\theta$, so $x = 3\cos\frac{\pi}{2} = 0$ and $y = 3\sin\frac{\pi}{2} = 3$, so the Cartesian coordinate would be $(0, 3)$.
Similarly, for point $B$, note that $\left(2; \frac{5\pi}{6}\right) \longleftrightarrow \left(2\cos\frac{5\pi}{6}, 2\sin\frac{5\pi}{6}\right)$. We compute $\cos\frac{5\pi}{6} = -\frac{\sqrt{3}}{2}$ and $\sin\frac{5\pi}{6} = \frac{1}{2}$. Thus, $B = (-\sqrt{3}, 1)$.
Likewise, point $C = \left(4; -\frac{\pi}{3}\right) = \left(4\cos\left(-\frac{\pi}{3}\right), 4\sin\left(-\frac{\pi}{3}\right)\right)$. We have $\cos\left(-\frac{\pi}{3}\right) = \frac{1}{2}$ and $\sin\left(-\frac{\pi}{3}\right) = -\frac{\sqrt{3}}{2}$, so $C = (2, -2\sqrt{3})$.
For point $D$, note that $r = 0$. In this case, $D$ clearly has to be the origin, $(0, 0)$, regardless of the angle given.
We are given a negative radius for point $E$. By intuition we can see that a point with $-r$ reflects the original point with $r$ over the origin. Thus, $(-1; \pi)$ is the same point as $(1; 0)$, or $(1, 0)$ in Cartesian coordinates.
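Problem 7.1.3. Convert these Cartesian coordinates into polar coordinates, with the stipulations $r \geq 0$ and $0 \leq \theta \leq 2\pi$.
1. $(7, -7)$
2. $(-4, 3)$
Solution. For $(7, -7)$, we use the relation $r = \sqrt{x^2 + y^2}$ to get $r = 7\sqrt{2}$. Then, $\tan^{-1}\frac{y}{x} = \tan^{-1}\frac{-7}{7} = \tan^{-1}(-1) = -\frac{\pi}{4}$. However, the point lies in quadrant IV, so $\frac{3\pi}{2} < \theta < 2\pi$. Therefore, we add $2\pi$ to get $\theta = \frac{7\pi}{4}$. Then the polar coordinate is $\left(7\sqrt{2}; \frac{7\pi}{4}\right)$.
Similarly, for point $(-4, 3)$, we get $r = 5$ and $\tan^{-1}\frac{y}{x} = \tan^{-1}\left(-\frac{3}{4}\right)$. However, $\tan^{-1}\left(-\frac{3}{4}\right)$ is negative (a quadrant IV angle), while the point lies in quadrant II, so add $\pi$ to get $\theta = \pi + \tan^{-1}\left(-\frac{3}{4}\right)$. Therefore the polar coordinate is $\left(5; \pi + \tan^{-1}\left(-\frac{3}{4}\right)\right)$.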
Now that we have dealt with plotting polar coordinates, it is time to graph polar functions.
First, we address some obvious special cases:
• An equation of the form $r = k$ for some $k \in \mathbb{R}$ is simply a circle centered at the origin with radius $|k|$.
• An equation of the form $\theta = \alpha$ where $\alpha$ is in radians or degrees is a ray from the origin rotated by the angle $\alpha$ around the origin.
For the remainder of the section, we will discuss the general r = f (θ) form for some function f
regarding θ.
Consider the graph of $r = 2\sin\theta$. Using the approximations $\sqrt{2} \approx 1.4$ and $\sqrt{3} \approx 1.7$, we can determine some points to plot:

  θ       r            θ       r
  0°      0            210°    −1
  30°     1            225°    ≈ −1.4
  45°     ≈ 1.4        240°    ≈ −1.7
  60°     ≈ 1.7        270°    −2
  90°     2            300°    ≈ −1.7
  120°    ≈ 1.7        315°    ≈ −1.4
  135°    ≈ 1.4        330°    −1
  150°    1            360°    0
  180°    0

[Figure: the points plotted on a polar grid, scale 0.5]
Note that the negative values on the right table result in the same points in the left table (recall the identity $\sin(180^\circ + \theta) = -\sin\theta$). After we plot these points and sketch the graph, notice that it resembles a circle! We can then prove that this equation's graph is a circle, by converting the polar equation into a regular Cartesian equation:
\[
\begin{aligned}
r &= 2\sin\theta \\
r^2 &= 2r\sin\theta \\
x^2 + y^2 &= 2r\sin\theta \\
x^2 + y^2 &= 2y \\
x^2 + y^2 - 2y &= 0 \\
x^2 + y^2 - 2y + 1 &= 1 \\
x^2 + (y - 1)^2 &= 1.
\end{aligned}
\]
  θ       r            θ       r
  0°      2            210°    ≈ 0.1
  30°     ≈ 1.9        225°    ≈ 0.3
  45°     ≈ 1.7        240°    0.5
  60°     1.5          270°    1
  90°     1            300°    1.5
  120°    0.5          315°    ≈ 1.7
  135°    ≈ 0.3        330°    ≈ 1.9
  150°    ≈ 0.1        360°    2
  180°    0

Then, we sketch:
[Figure: sketch of the plotted points on a polar grid, scale 0.5]
  θ       r            θ       r
  0°      1.5          210°    ≈ −0.4
  30°     ≈ 1.4        225°    ≈ −0.2
  45°     ≈ 1.2        240°    0
  60°     1            270°    0.5
  90°     0.5          300°    1
  120°    0            315°    ≈ 1.2
  135°    ≈ −0.2       330°    ≈ 1.4
  150°    ≈ −0.4       360°    1.5
  180°    −0.5

[Figure: sketch of the plotted points on a polar grid, scale 0.5]
  θ    0°   45°   90°   135°   180°   225°   270°   315°   360°
  r    0    1     0     −1     0      1      0      −1     0
The value of r is increasing from 0 to 1 as θ goes from 0◦ to 45◦ , then r decreases from 1 back to
0 as θ goes from 45◦ to 90◦ . This forms the quadrant I ‘petal’ of the graph, with a ‘peak’ at point
(1; 45◦ ).
We then see that r goes negative (from 0 to −1) when θ goes from 90◦ to 135◦ , and r goes back
to 0 as θ approaches 180◦ . As r is negative in this part, the petal which would have been in quadrant
II has been reflected across the origin, resulting in a petal that is in quadrant IV, with a peak at
(−1; 135◦ ).
We continue with the rest of the values of r, leaving us with four petals.
[Figure: the four-petal graph, scale 0.5]
1. $r = \sin 3\theta$
2. $r = \theta$
3. $r = \frac{1}{\theta}$
4. $r = 2 + \cos 2\theta$
5. $r = 2 + \cos 3\theta$
Problem 7.1.5. Sketch $r = \sec\theta$.
Solution. Rearranging $r = \frac{1}{\cos\theta}$, we get $r\cos\theta = 1$. But $x = r\cos\theta$, so our graph is simply a line, $x = 1$.
Problem 7.1.6. Let $a, b$ not be simultaneously 0. What do all graphs of the form $r = a\sin\theta + b\cos\theta$ have in common?
Solution. Multiply both sides of the equation by $r$ to get: $r^2 = a\cdot r\sin\theta + b\cdot r\cos\theta$. Substitute in the definitions $x^2 + y^2 = r^2$, $x = r\cos\theta$, and $y = r\sin\theta$, so we have: $x^2 + y^2 = ay + bx$. Rearranging this and completing the square yields:
\[
\left(x - \frac{b}{2}\right)^2 + \left(y - \frac{a}{2}\right)^2 = \frac{a^2 + b^2}{4},
\]
which is an equation for a circle, centered at $\left(\frac{b}{2}, \frac{a}{2}\right)$ with radius $\frac{\sqrt{a^2 + b^2}}{2}$. Therefore, all polar graphs of the form $r = a\sin\theta + b\cos\theta$ are circles which vary depending on the values of $a$ and $b$. Furthermore, note that all these circles go through the origin.
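The circle claim can be spot-checked numerically by sampling points on the polar curve and testing them against the Cartesian equation. This is a minimal sketch; the function name `on_circle` and the sample values $a = 2$, $b = -4$ are illustrative choices.

```python
import math

def on_circle(a, b, theta, tol=1e-9):
    """Check that the point of r = a sin(theta) + b cos(theta) lies on the predicted circle."""
    r = a * math.sin(theta) + b * math.cos(theta)
    x, y = r * math.cos(theta), r * math.sin(theta)
    # circle centered at (b/2, a/2) with radius sqrt(a^2 + b^2) / 2
    return abs((x - b / 2) ** 2 + (y - a / 2) ** 2 - (a * a + b * b) / 4) < tol

# sample the curve every 15 degrees
print(all(on_circle(2, -4, math.radians(d)) for d in range(0, 360, 15)))
```

Sampling cannot replace the algebraic argument, but a `False` here would signal a sign or center error in the completed square.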
Exercise 7.1.7. Convert r = 2 sin θ − 4 cos θ into Cartesian form, and describe fully the graph.
\[
\begin{aligned}
x + y &= 10 \\
r\cos\theta + r\sin\theta &= 10 \\
r(\cos\theta + \sin\theta) &= 10 \\
r &= \frac{10}{\cos\theta + \sin\theta}.
\end{aligned}
\]
Problem 7.1.9. Given the graph of r = f (θ), describe the transformation that results in
a) r = f (−θ).
b) r = k · f (θ), k ∈ R.
c) r = f (θ + α), α ∈ R.
Solution.
a) Recall that a ray on the unit circle with an angle −θ is rotated by θ clockwise from the initial
starting place (which is the x-axis) rather than counter-clockwise. Therefore, the graph of
r = f (−θ) is a reflection of r = f (θ) over the x-axis.
b) We consider three cases:
• If $k > 0$, then we simply scale the graph of $r = f(\theta)$ by a factor of $k$ from the origin.
• If $k = 0$, then the graph becomes $r = 0$, which is simply the origin, $(0, 0)$.
• If $k < 0$, then we first reflect the graph through the origin, then we scale the graph by a factor of $|k|$ from the origin.
Problem 7.1.10. If f is odd, what can you say about the graph of r = f (θ)? What about if f is
even?
Solution. Suppose we have a point P = (r; θ). We know that if f is odd, then r = f (θ) = −f (−θ),
or −r = f (−θ). This implies that (−r; −θ) lies on the graph as well. However, the point (−r; −θ) is
actually the reflection of point P across the y-axis.
To see why this is the case, note that (−r; θ) is a reflection of P across the origin, and (−r; −θ) is
the reflection of (−r; θ) across the x-axis. These two transformations ultimately result in a reflection
across the y-axis.
If f is even, then f (θ) = f (−θ). It has been established that f (−θ) is a reflection of f (θ) over
the x-axis. But if f (−θ) and f (θ) are the same graph, then we can say that a graph with even
function f is symmetric over the x-axis.
Problem 7.1.11. Convert the following polar equations into Cartesian equations:
1. r = cos θ
2. r = 1 + cos θ
Solution.
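1. Multiply both sides by $r$, perform the necessary substitutions, then complete the square:
\[
\begin{aligned}
r &= \cos\theta \\
r^2 &= r\cos\theta \\
x^2 + y^2 &= x \\
x^2 - x + y^2 &= 0 \\
x^2 - x + \frac{1}{4} + y^2 &= \frac{1}{4} \\
\left(x - \frac{1}{2}\right)^2 + y^2 &= \frac{1}{4}.
\end{aligned}
\]
2. Multiply both sides by $r$, isolate the remaining $r$, then square:
\[
\begin{aligned}
r &= 1 + \cos\theta \\
r^2 &= r + r\cos\theta \\
x^2 + y^2 &= r + x \\
r &= x^2 + y^2 - x \\
r^2 &= (x^2 + y^2 - x)^2 \\
x^2 + y^2 &= (x^2 + y^2 - x)^2.
\end{aligned}
\]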
\[
\begin{aligned}
x^2 - y^2 &= 1 \\
r^2\cos^2\theta - r^2\sin^2\theta &= 1 \\
r^2(\cos^2\theta - \sin^2\theta) &= 1 \\
r^2\cos 2\theta &= 1 \\
r^2 &= \sec 2\theta \\
r &= \sqrt{\sec 2\theta}.
\end{aligned}
\]
Note that we do not put "±" before $\sqrt{\sec 2\theta}$ because there's no difference whether $r$ is negative or not (all points $(r; \theta)$ can be represented as $(-r; \theta + \pi)$).
I present a well-known example: the unit circle. It can be represented by the parametric equations
x = cos θ, y = sin θ. In this case, θ would be the parameter.
• $0 \leq \theta \leq 2\pi$
The parametric curve would start at the point $(1, 0)$ (the dot), go counter-clockwise in a circular motion, and finish back at $(1, 0)$.
• $-\pi \leq \theta \leq \pi$
The curve starts at $(-1, 0)$, goes counter-clockwise in a circular motion, and finishes at $(-1, 0)$.
• $0 \leq \theta \leq 4\pi$
The course of motion is similar to the first one ($0 \leq \theta \leq 2\pi$) except that after the curve starts at $(1, 0)$, it goes around the same circle twice counter-clockwise and finishes at $(1, 0)$.
• $\frac{\pi}{4} \leq \theta \leq \frac{\pi}{3}$
The result would be an arc of angle $\frac{\pi}{12}$ drawn counter-clockwise, as $\theta$ increases from $\frac{\pi}{4}$ to $\frac{\pi}{3}$. For clarity, the rest of the unit circle is shown through a dashed curve.
a) If there was no restriction on θ, then we would have the normal unit circle, because x2 + y 2 = 1,
based on the Pythagorean Identity.
b) If we suppose that 0 ≤ θ ≤ 2π, then we would end up with a graph that starts at (1, 0),
goes counter-clockwise around the unit circle twice and finishes at (1, 0). Then how is it any
different than the parametric curve x = cos θ, y = sin θ, 0 ≤ θ ≤ 4π?
The curve for x = cos 2θ, y = sin 2θ is drawn twice as fast. For clarification, notice that when
θ = 2π, the graph for x = cos 2θ, y = sin 2θ is already complete (circle drawn over twice), but
for x = cos θ, y = sin θ, the circle has only been drawn once, with the ‘second circle’ remaining.
The graph would be a circle starting from (0, 1), going clockwise around the unit circle and
ending up back at (0, 1).
This can also result from reflecting the graph determined by x = cos θ, y = sin θ over the line
y = x, since we have switched the parametric functions assigned to x and y respectively.
Problem 7.2.2. Find the Cartesian equation represented by the parametric equations x = 1 + cos θ,
y = 2 + sin θ, with 0 ≤ θ ≤ 2π.
Solution. We use the Pythagorean identity $\sin^2\theta + \cos^2\theta = 1$. Note that $\cos\theta = x - 1$ and
$\sin\theta = y - 2$. Substituting these into the identity, we therefore have $(x - 1)^2 + (y - 2)^2 = 1$, a
circle of radius 1 centered at (1, 2).
Problem 7.2.3. Find the Cartesian equation represented by the parametric equations x = 88 cos θ,
y = 88 sin θ, with 0 ≤ θ ≤ 2π.
Solution. Similarly, note that $\cos\theta = \frac{x}{88}$ and $\sin\theta = \frac{y}{88}$, yielding the equation $\frac{x^2}{88^2} + \frac{y^2}{88^2} = 1$,
or $x^2 + y^2 = 7744$, a circle centered at the origin with radius 88.
Problem 7.2.4. Find the Cartesian equation represented by the parametric equations x = 88 cos θ,
y = 89 sin θ, with 0 ≤ θ ≤ 2π.
Solution. We once again use our Pythagorean identity $\sin^2\theta + \cos^2\theta = 1$. Note that
$x^2 = 88^2\cos^2\theta \implies \cos^2\theta = \frac{x^2}{88^2}$ and $y^2 = 89^2\sin^2\theta \implies \sin^2\theta = \frac{y^2}{89^2}$, yielding the equation
\[ \frac{x^2}{88^2} + \frac{y^2}{89^2} = 1, \]
which is an ellipse.
Problem 7.2.5. Represent the ellipse with equation $\frac{(x + 2)^2}{25} + \frac{(y - 3)^2}{16} = 1$ with parametric
equations.
Solution. Given the reasoning presented in the solution of the previous problem, we can work
backwards. Using the identity $\sin^2\theta + \cos^2\theta = 1$, we have the relations $\frac{(x + 2)^2}{25} = \cos^2\theta$ and
$\frac{(y - 3)^2}{16} = \sin^2\theta$, and solve to get the parametric equations $x = 5\cos\theta - 2$ and $y = 4\sin\theta + 3$.
Solution. The form of the given equation is similar to another Pythagorean identity: $\sec^2\theta = 1 + \tan^2\theta$.
Since we can rearrange the equation to $\frac{x^2}{3^2} - \frac{y^2}{3^2} = 1$, we then solve for the parametric
equations $x = 3\sec\theta$ and $y = 3\tan\theta$.
Solution. Using the double-angle formulas, we have that $y = 2\cos^2 t - 1$, or $y = 2x^2 - 1$. At first
glance the graph may seem like a normal parabola. However, we must consider the ranges of x and
y: as x is equal to the cosine function of parameter t, x necessarily has a range of [−1, 1], so we must
limit the graph to x-values from −1 to 1 only.
Problem 7.2.8. Find the Cartesian equation for parametric equations x = sin t, y = sin 2t.
Solution. There is no clear relationship between x and y, so first set up a table and calculate various
x and y values:
   t  |   x  |   y
 -----+------+------
  −3  |  12  |   6
  −2  |   6  |   2
  −1  |   2  |   0
 −1/2 |  3/4 | −1/4
   0  |   0  |   0
  1/2 | −1/4 |  3/4
   1  |   0  |   2
   2  |   2  |   6
   3  |   6  |  12
The graph looks like a rotated parabola. We can investigate this by finding the Cartesian equation
for this parametric curve. Given the parametric equations,
\[ x = t^2 - t, \qquad y = t^2 + t, \]
we can add these to get $t^2 = \frac{x + y}{2}$, and subtract these to get $t = \frac{y - x}{2}$. Therefore, we have:
\begin{align*}
\left(\frac{y - x}{2}\right)^2 &= \frac{x + y}{2} \\
\frac{y^2 - 2xy + x^2}{4} &= \frac{x + y}{2} \\
4x + 4y &= 2(y^2 - 2xy + x^2) \\
4x + 4y &= 2y^2 - 4xy + 2x^2, \\
2x^2 - 4xy + 2y^2 - 4x - 4y &= 0.
\end{align*}
We are left with an equation for a conic section. The conic discriminant is $(-4)^2 - 4(2)(2) = 0$,
so we can confirm that it represents an oblique parabola.
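The elimination above can be spot-checked numerically. The following is a minimal sketch, not part of the original solution: the helper names `point` and `conic_residual` are made up for illustration. It samples the curve x = t² − t, y = t² + t and evaluates the left-hand side of the derived Cartesian equation, which should vanish at every sample.

```python
# Hypothetical helpers (names are illustrative, not from the text).

def point(t):
    """Point on the parametric curve x = t^2 - t, y = t^2 + t."""
    return t * t - t, t * t + t

def conic_residual(x, y):
    """Left-hand side of 2x^2 - 4xy + 2y^2 - 4x - 4y = 0."""
    return 2 * x**2 - 4 * x * y + 2 * y**2 - 4 * x - 4 * y

# The same t-values used in the table of the solution.
samples = [-3, -2, -1, -0.5, 0, 0.5, 1, 2, 3]
residuals = [conic_residual(*point(t)) for t in samples]

# Conic discriminant B^2 - 4AC with A = 2, B = -4, C = 2.
discriminant = (-4) ** 2 - 4 * 2 * 2

print(residuals)      # every entry should be 0
print(discriminant)   # 0 corresponds to the parabola case
```

A check like this only confirms that sampled points satisfy the equation; it does not replace the algebraic derivation.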
Don’t worry if you don’t know about the conic discriminant, as it will be covered in depth later
in this chapter (Theorem 7.3.34).
[Figure: a line segment from the point (a, b), where t = 0, to the point (c, d), where t = 1.]
we must find a set of parametric equations for x and y such that the parametric curve is this line
segment starting from point (a, b) at t = 0 and ending at point (c, d) with t = 1 (i.e. 0 ≤ t ≤ 1).
To start, we drop altitudes and form right triangles:
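[Figure: the same segment from (a, b) at t = 0 to (c, d) at t = 1, with a right triangle whose horizontal leg is t(c − a) and whose vertical leg is t(d − b).]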
Note that since t goes from 0 to 1 and the graph in discussion is linear, t is the fraction of the
line segment that has been drawn starting from point (a, b). Therefore, the horizontal distance
covered would be t multiplied by the width difference c − a, and the vertical distance covered
would be t multiplied by the height difference d − b.
Recall that we are starting from point (a, b), so the x-coordinate would be t(c − a) added to a
and the y-coordinate would be t(d − b) added to b. Therefore, we have the parametric equations:
x = f (t) = a + t(c − a),
y = g(t) = b + t(d − b).
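As a small illustration of these two equations, here is a sketch in Python. The function name `segment_point` is an assumption made for this example only; it simply evaluates x = a + t(c − a) and y = b + t(d − b).

```python
def segment_point(a, b, c, d, t):
    """Point at parameter t (0 <= t <= 1) on the segment from (a, b) to (c, d)."""
    x = a + t * (c - a)   # start at a, move a fraction t of the width c - a
    y = b + t * (d - b)   # start at b, move a fraction t of the height d - b
    return x, y

# Segment from (1, 0) to (7, -5): the endpoints and the midpoint.
start = segment_point(1, 0, 7, -5, 0)
middle = segment_point(1, 0, 7, -5, 0.5)
end = segment_point(1, 0, 7, -5, 1)
print(start, middle, end)
```

Swapping the roles of the two endpoints in the call reverses the direction in which the segment is traced.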
Problem 7.2.10. Find parametric equations for the line segment connecting (1, 0) and (7, −5)
where
Solution.
1. Using the relationship proven in the previous example, we have the equations x = 1 + t(7 − 1)
and y = 0 + t(−5 − 0), or:
x = 1 + 6t,
y = −5t,
for 0 ≤ t ≤ 1.
x = 7 − 6t,
y = −5 + 5t,
for 0 ≤ t ≤ 1.
1. $x = e^t$, $y = e^{-t}$
2. $x = 2^t$, $y = 4^t$
3. $x = \sin t$, $y = \sin t$
Solution.
1. At first, one might conclude that the graph is simply the hyperbola $y = \frac{1}{x}$. However, remember
that x, y > 0, as a positive number raised to an exponent is always greater than zero. Therefore,
we only draw the graph for positive x and y values.
2. It is clear that $y = x^2$. However, keep in mind that x, y > 0 (because a positive number raised
to any exponent is always positive). Therefore we only draw the right side of the parabola.
3. The line is obviously y = x, but remember that the range of sine is [−1, 1], so we limit
x, y to the interval [−1, 1]. Thus we are left with a line segment.
√ √
Exercise 7.2.12. Graph the parametric curve for equations x = t + 1, y = t − 1.
Problem 7.2.13. Find a parametric expression for the graph 2(x − 1)2 + 3(y + 2)2 = 11.
Solution. We strive to turn this equation into a standard equation for a conic section. Therefore,
divide both sides of the equation by 11 to make the RHS equal to 1, then express the LHS as the
sum of fractions where only the terms (x − 1)2 and (y + 2)2 are the numerators, as such:
\[ \frac{(x - 1)^2}{11/2} + \frac{(y + 2)^2}{11/3} = 1. \]
This is an ellipse! Just like examples shown earlier in the section, the parametric equations for x
and y rely on the identity sin2 θ + cos2 θ = 1:
\[ x = \sqrt{\tfrac{11}{2}}\cos\theta + 1, \qquad y = \sqrt{\tfrac{11}{3}}\sin\theta - 2. \]
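One way to gain confidence in a parametrization like this is to plug sampled points back into the original equation. The sketch below does that for 2(x − 1)² + 3(y + 2)² = 11; the helper names `ellipse_point` and `lhs` are hypothetical and exist only for this check.

```python
import math

def ellipse_point(theta):
    """Parametrization x = sqrt(11/2) cos(theta) + 1, y = sqrt(11/3) sin(theta) - 2."""
    x = math.sqrt(11 / 2) * math.cos(theta) + 1
    y = math.sqrt(11 / 3) * math.sin(theta) - 2
    return x, y

def lhs(x, y):
    """Left-hand side of the original equation 2(x - 1)^2 + 3(y + 2)^2 = 11."""
    return 2 * (x - 1) ** 2 + 3 * (y + 2) ** 2

# Twelve evenly spaced parameter values around the ellipse.
values = [lhs(*ellipse_point(k * math.pi / 6)) for k in range(12)]
print(values)  # each value should be 11 up to rounding error
```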
Solution. First, let’s consider a circle of radius r, centered at (0, 0), to simplify the problem. Given
an angle of θ (in radians) from the point (0, −r) to the point A, what would be the coordinates of A?
[Figure: a circle of radius r centered at (0, 0), with the angle θ measured from the bottom point (0, −r).]
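Well, remember that the functions sin x and cos x start from the ‘east pole’ (denoted by the
dotted line in the diagram) and go counter-clockwise. Therefore, the point (0, −r) would be
$\left(r\cos\left(-\frac{\pi}{2}\right), r\sin\left(-\frac{\pi}{2}\right)\right)$ (it is $-\frac{\pi}{2}$ instead of $\frac{\pi}{2}$ because we went clockwise).
Point A is simply the point (0, −r) rotated by a further angle θ clockwise, so we subtract θ to get
\[ A = \left(r\cos\left(-\frac{\pi}{2} - \theta\right),\; r\sin\left(-\frac{\pi}{2} - \theta\right)\right). \]
However, note that
\[ \cos\left(-\frac{\pi}{2} - \theta\right) = \cos\left(\frac{\pi}{2} + \theta\right) = \cos\frac{\pi}{2}\cos\theta - \sin\frac{\pi}{2}\sin\theta = -\sin\theta, \]
and
\[ \sin\left(-\frac{\pi}{2} - \theta\right) = -\sin\left(\frac{\pi}{2} + \theta\right) = -\left(\sin\frac{\pi}{2}\cos\theta + \cos\frac{\pi}{2}\sin\theta\right) = -\cos\theta, \]
therefore A = (−r sin θ, −r cos θ).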
To get the graph that we want, we must translate this circle upwards by r. Therefore, we add r
to the y-coordinate of all points on the circle.
[Figure: the same circle translated upwards by r, now centered at (0, r) and passing through (0, 0), with the angle θ marked.]
This results in the point (−r sin θ, r − r cos θ) to describe the point on a circle with angle θ.
However, we’re not done yet. We have yet to factor in the circle moving sideways, on the x-axis.
If a circle ‘rolls’ on its surface by an angle θ, we are really just translating the circle horizontally
by the length of the arc that the angle θ subtends. For example, in this diagram, the length of the
arc between the point (0, 0) and (−r sin θ, r − r cos θ) is the horizontal distance that the initial point
(0, 0) will cover when that point becomes (−r sin θ, r − r cos θ) as a result of the circle rolling.
We can compute this length as $\frac{\theta}{2\pi}\cdot 2\pi r = r\theta$. We add this to the x-coordinate of
(−r sin θ, r − r cos θ) to get the point (rθ − r sin θ, r − r cos θ), which is the correct representation of the point on
the rolling circle. Therefore, our parametric equations would be:
\[ x = r\theta - r\sin\theta, \qquad y = r - r\cos\theta. \]
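The point (rθ − r sin θ, r − r cos θ) reached at the end of this derivation can be evaluated directly. The following is a sketch under the same setup (a circle of radius r rolling along the x-axis, with the traced point starting at the origin); the function name `cycloid_point` is an assumption for illustration.

```python
import math

def cycloid_point(r, theta):
    """Position of the traced point after the circle of radius r has rolled by angle theta."""
    x = r * theta - r * math.sin(theta)   # arc length rolled, minus the horizontal offset
    y = r - r * math.cos(theta)           # center height r, minus the vertical offset
    return x, y

r = 2
for theta in (0, math.pi / 2, math.pi, 2 * math.pi):
    print(theta, cycloid_point(r, theta))
```

After half a turn (θ = π) the point sits at the top of the circle, at height 2r, and after a full turn (θ = 2π) it is back on the x-axis, a distance 2πr from where it started.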
7.3.1 Review
We define $i = \sqrt{-1}$. A complex number can be expressed as z = a + bi, where a is the real part
and b is the imaginary part. All real numbers are complex numbers, with their imaginary part
equal to 0.
If $i = \sqrt{-1}$, then $i^2 = -1$, $i^3 = -i$, and $i^4 = 1$. When multiplying complex numbers together,
make sure you get used to various powers of i.
Problem 7.3.1. Compute $\frac{2 + 3i}{4 + 5i}$ and $(2 + 3i)^3$.
Solution. For the first one, multiply the numerator and denominator by the conjugate of the
denominator:
\[ \frac{2 + 3i}{4 + 5i}\cdot\frac{4 - 5i}{4 - 5i} = \frac{23 + 2i}{41}. \]
For the second one, use the Binomial Theorem to expand and simplify:
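Both computations in this problem can be cross-checked with Python's built-in complex type, where `1j` plays the role of i. This is only a numerical check under that assumption, not the hand expansion the solution asks for; the value −46 + 9i in the comment comes from expanding 8 + 36i + 54i² + 27i³ and is not quoted from the text.

```python
# Python writes the imaginary unit as 1j.
quotient = (2 + 3j) / (4 + 5j)
cube = (2 + 3j) ** 3

print(quotient)            # compare with (23 + 2i) / 41
print((23 + 2j) / 41)
print(cube)                # compare with 8 + 36i + 54i^2 + 27i^3 = -46 + 9i
```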
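a) $\overline{w + z} = \overline{w} + \overline{z}$.
b) $\overline{wz} = \overline{w}\cdot\overline{z}$.
c) $\overline{\left(\frac{1}{z}\right)} = \frac{1}{\overline{z}}$.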
Proof.
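b) Similarly, let w = a + bi and z = c + di. Expand both $\overline{wz}$ and $\overline{w}\cdot\overline{z}$ using the definitions of w
and z and simplify, separating the real and imaginary parts.
\begin{align*}
\overline{w}\cdot\overline{z} &= \overline{a + bi}\cdot\overline{c + di} \\
&= (a - bi)(c - di) \\
&= ac - adi - bci - bd \\
&= ac - bd - adi - bci \\
&= (ac - bd) - (ad + bc)i.
\end{align*}
\begin{align*}
\overline{wz} &= \overline{(a + bi)(c + di)} \\
&= \overline{ac + adi + bci - bd} \\
&= \overline{ac - bd + adi + bci} \\
&= \overline{(ac - bd) + (ad + bc)i} \\
&= (ac - bd) - (ad + bc)i.
\end{align*}
We can compare the two and conclude that they are equal.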
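c) This time, let z = a + bi, and the structure of this proof is similar to the previous two:
\begin{align*}
\overline{\left(\frac{1}{z}\right)} &= \overline{\left(\frac{1}{a + bi}\right)} \\
&= \overline{\left(\frac{1}{a + bi}\cdot\frac{a - bi}{a - bi}\right)} \\
&= \overline{\left(\frac{a - bi}{a^2 + b^2}\right)} \\
&= \overline{\frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}i} \\
&= \frac{a}{a^2 + b^2} + \frac{b}{a^2 + b^2}i.
\end{align*}
\begin{align*}
\frac{1}{\overline{z}} &= \frac{1}{\overline{a + bi}} \\
&= \frac{1}{a - bi} \\
&= \frac{1}{a - bi}\cdot\frac{a + bi}{a + bi} \\
&= \frac{a + bi}{a^2 + b^2} \\
&= \frac{a}{a^2 + b^2} + \frac{b}{a^2 + b^2}i.
\end{align*}
We find that they are equal, so we are done.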
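The three conjugate properties can also be spot-checked on concrete numbers. The sketch below uses Python's built-in complex type and `cmath.isclose`; the sample values of w and z are arbitrary choices for illustration, and a check on one pair of numbers is of course not a proof.

```python
import cmath

# Arbitrary sample values (assumptions for this check only).
w = 3 - 2j
z = -1 + 4j

# a) conjugate of a sum equals the sum of the conjugates
sum_ok = cmath.isclose((w + z).conjugate(), w.conjugate() + z.conjugate())

# b) conjugate of a product equals the product of the conjugates
prod_ok = cmath.isclose((w * z).conjugate(), w.conjugate() * z.conjugate())

# c) conjugate of a reciprocal equals the reciprocal of the conjugate
inv_ok = cmath.isclose((1 / z).conjugate(), 1 / z.conjugate())

print(sum_ok, prod_ok, inv_ok)
```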
Problem 7.3.3. Prove that if P(x) is a polynomial with real coefficients and z is a root of P(x),
then $\overline{z}$ is also a root of P(x).
Proof. Let $P(x) = \sum_{k=0}^{n} a_k x^k$ where $a_k \in \mathbb{R}$. Let $0 = P(z) = \sum_{k=0}^{n} a_k z^k$. Then,
\[ 0 = \overline{P(z)} = \overline{\sum_{k=0}^{n} a_k z^k} = \sum_{k=0}^{n} a_k \overline{(z^k)}. \]
It is evident that $\overline{(z^k)} = (\overline{z})^k$. It is also true that the conjugate of any real number is itself, since
$\overline{a + 0i} = a - 0i$. Using those facts, we can conclude that $0 = \sum_{k=0}^{n} a_k (\overline{z})^k = P(\overline{z})$, i.e. $P(\overline{z}) = 0$. Thus,
$\overline{z}$ is a root of P(x).
Problem 7.3.4. Define $|a + bi| = \sqrt{a^2 + b^2}$. ∀w, z ∈ C, prove:
a) |wz| = |w||z|.
b) |w + z| ≤ |w| + |z|.
Proof.
a) First, $|w||z| = |a + bi||c + di| = \sqrt{a^2 + b^2}\cdot\sqrt{c^2 + d^2} = \sqrt{a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2}$.
Then,
ac + bd.
Now, we can apply that result on the expanded forms of |w + z| and |w| + |z| stated in
the beginning of the proof; it is now clear that
$a^2 + b^2 + c^2 + d^2 + 2\sqrt{(a^2 + b^2)(c^2 + d^2)} \ge a^2 + b^2 + c^2 + d^2 + 2ac + 2bd$,
and so $(|w| + |z|)^2 \ge (|z + w|)^2$. Since magnitude is always
nonnegative, we take the square root of both sides to get the intended result |w + z| ≤ |w| + |z|.
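[Figure: the complex plane, with the real axis horizontal and the imaginary axis vertical, showing the points $z_1 = a + bi$, $z_2$, and $2 + 3i$, the length $|z_1| = \sqrt{a^2 + b^2}$, and the distance $|z_2 - w|$.]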
We have the Cartesian point (a, b) correspond to the point a + bi on the complex plane, where
the x-axis is the real part and the y-axis is the imaginary part of the complex number.
For instance, the point (2, 3) on the Cartesian plane would be represented as 2 + 3i on the
complex plane. Here are a couple of observations:
a) Given a point $z_1 \in \mathbb{C}$, $|z_1|$ is the distance from the origin to $z_1$. This is obvious as
$|z_1| = \sqrt{a^2 + b^2} = \sqrt{(a - 0)^2 + (b - 0)^2}$, which is simply the Cartesian distance formula with points
(a, b) and (0, 0).
b) Given two points $z_2$ and w, the distance between them is $|z_2 - w|$. For a brief proof, let
$z_2 = a + bi$ (which denotes (a, b)) and w = c + di (which denotes (c, d)). The Cartesian distance
formula gives $\sqrt{(a - c)^2 + (b - d)^2}$, which is equivalent to |(a − c) + (b − d)i|, or $|z_2 - w|$.
Problem 7.3.5. Describe the graphs of the following equations in the complex plane:
1. $z = \overline{z}$
2. |z − 1| = |z|
3. |z − 2| + |z + 3i| = 10
4. |z − 2| = 2|z|
Solution.
a) If we let z = a + bi, then we have the equation a + bi = a − bi, and we solve to get b = 0,
implying that z must be real. Therefore the graph would simply be the real axis.
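b) Let z = a + bi:
\begin{align*}
|a + bi - 1| &= |a + bi| \\
\sqrt{(a - 1)^2 + b^2} &= \sqrt{a^2 + b^2} \\
(a - 1)^2 &= a^2 \\
a^2 - 2a + 1 &= a^2 \\
2a &= 1 \\
a &= \frac{1}{2}
\end{align*}
Therefore the graph is a line of z such that the real part of z is $\frac{1}{2}$.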
c) Recall the definition of an ellipse, which is the set of points such that the sum of the distances
from each point to two fixed points (the foci) is constant. In this equation, we are given that
the sum of the distance from z to 2 and the distance from z to −3i is always 10. Therefore this
graph is an ellipse with foci at 2 and −3i.
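d) Again let z = a + bi:
\begin{align*}
|a + bi - 2| &= 2|a + bi| \\
\sqrt{(a - 2)^2 + b^2} &= 2\sqrt{a^2 + b^2} \\
(a - 2)^2 + b^2 &= 4a^2 + 4b^2 \\
a^2 - 4a + 4 + b^2 &= 4a^2 + 4b^2 \\
3a^2 + 4a + 3b^2 &= 4 \\
3\left(a^2 + \frac{4}{3}a\right) + 3b^2 &= 4 \\
3\left(a^2 + \frac{4}{3}a + \frac{4}{9}\right) + 3b^2 &= 4 + \frac{4}{3} \\
3\left(a + \frac{2}{3}\right)^2 + 3b^2 &= \frac{16}{3} \\
\left(a + \frac{2}{3}\right)^2 + b^2 &= \frac{16}{9}
\end{align*}
Therefore the graph is a circle centered at $\left(-\frac{2}{3}, 0\right)$ with radius $\frac{4}{3}$.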
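The circle obtained for |z − 2| = 2|z| can be sanity-checked by sampling points on it. In the sketch below, the points are generated as center + radius·(cos φ + i sin φ); the variable names are illustrative assumptions, and the check only confirms that the sampled points satisfy the original equation.

```python
import math

center = -2 / 3   # the circle's center, as a real number on the real axis
radius = 4 / 3

# Eight evenly spaced points on the circle.
points = [
    center + radius * complex(math.cos(k * math.pi / 4), math.sin(k * math.pi / 4))
    for k in range(8)
]

# For each point, |z - 2| - 2|z| should be zero up to rounding error.
gaps = [abs(z - 2) - 2 * abs(z) for z in points]
print(gaps)
```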
Recall the relationship between the Cartesian plane and the polar plane: (a, b) ⇐⇒ (r; θ). We
had defined a = r cos θ and b = r sin θ. We can now establish a relationship between the polar plane
and the complex plane:
a + bi = r cos θ + i · r sin θ
= r(cos θ + i sin θ)
= r cis θ.
We denote the polar form of the complex number a + bi as r cis θ, where r is the magnitude
and θ is the argument. If z is the complex number, then we can denote θ = Arg z. For notation
purposes, cis θ is the abbreviation for cos θ + i sin θ.
We now have four ways to represent a point in two-dimensional space:
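Problem 7.3.6. Convert the following complex numbers from polar form to standard form (a + bi):
a) $6\operatorname{cis}\frac{\pi}{6}$
b) $8\operatorname{cis}(-\pi)$
c) $4\operatorname{cis}\frac{3\pi}{4}$
Solution.
a)
\begin{align*}
6\operatorname{cis}\frac{\pi}{6} &= 6\left(\cos\frac{\pi}{6} + i\sin\frac{\pi}{6}\right) \\
&= 6\left(\frac{\sqrt{3}}{2} + \frac{1}{2}i\right) \\
&= 3\sqrt{3} + 3i.
\end{align*}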
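Conversions like these can be checked with Python's standard `cmath` module: `cmath.rect(r, theta)` returns r(cos θ + i sin θ), and `cmath.polar(z)` returns the pair (magnitude, argument). The sketch below applies them to the three parts of this problem; the variable names are arbitrary, and the floating-point results are approximations of the exact values.

```python
import cmath
import math

# r cis(theta) -> a + bi
part_a = cmath.rect(6, math.pi / 6)       # 6 cis(pi/6)
part_b = cmath.rect(8, -math.pi)          # 8 cis(-pi)
part_c = cmath.rect(4, 3 * math.pi / 4)   # 4 cis(3pi/4)
print(part_a, part_b, part_c)

# a + bi -> (r, theta), going the other way
r, theta = cmath.polar(3 + 4j)
print(r, theta)
```

Note that `cmath.polar` reports the argument in the interval (−π, π], so a number in the third or fourth quadrant comes back with a negative angle.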
Solution. Note that 3 + 4i is in the first quadrant, so $\tan^{-1}\frac{4}{3}$ gives the correct angle. The magnitude
is $\sqrt{3^2 + 4^2} = 5$, therefore 3 + 4i in polar form is $5\operatorname{cis}\left(\tan^{-1}\frac{4}{3}\right)$.
Lemma 7.3.9
Given two complex numbers r1 cis θ1 and r2 cis θ2 ,
\[ (r_1\operatorname{cis}\theta_1)(r_2\operatorname{cis}\theta_2) = r_1 r_2\operatorname{cis}(\theta_1 + \theta_2). \]
Proof. Expand and simplify using the trigonometric angle addition formulas.
Lemma 7.3.11
$\operatorname{cis}(-\theta) = (\operatorname{cis}\theta)^{-1}$.
Proof. By Lemma 7.3.9, we have cis(−θ) cis θ = cis(−θ + θ) = cis 0, which is equal to 1. Since
cis(−θ) cis θ = 1, $\operatorname{cis}(-\theta) = \frac{1}{\operatorname{cis}\theta} = (\operatorname{cis}\theta)^{-1}$.
Solution.
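\begin{align*}
(\operatorname{cis}\theta)^2 &= \operatorname{cis}\theta\,\operatorname{cis}\theta = \operatorname{cis}(\theta + \theta) = \operatorname{cis} 2\theta \\
(\operatorname{cis}\theta)^3 &= (\operatorname{cis}\theta)^2\operatorname{cis}\theta = \operatorname{cis} 2\theta\cdot\operatorname{cis}\theta = \operatorname{cis}(2\theta + \theta) = \operatorname{cis} 3\theta
\end{align*}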
The pattern suggests that (cis θ)n and cis nθ are equal, which will be proven in the following
theorem.
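Before the proof, the pattern can be tested numerically for a range of exponents. The following sketch defines a hypothetical helper `cis` directly from cos θ + i sin θ and compares (cis θ)ⁿ with cis nθ for integers n from −5 to 5 at one arbitrarily chosen angle; agreement at sampled values is evidence for the pattern, not a proof of it.

```python
import cmath
import math

def cis(theta):
    """cos(theta) + i sin(theta), as a Python complex number."""
    return complex(math.cos(theta), math.sin(theta))

theta = 0.7  # an arbitrary test angle in radians

# Compare (cis theta)^n with cis(n * theta) for n = -5, ..., 5.
checks = [cmath.isclose(cis(theta) ** n, cis(n * theta)) for n in range(-5, 6)]
print(checks)
```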
Corollary 7.3.14
$\forall n \in \mathbb{Z}$, $(\operatorname{cis}\theta)^n = \operatorname{cis} n\theta$.
Proof. We have already proven the case when n > 0, shown in the proof of Theorem 7.3.13.
If n = 0, then $(\operatorname{cis}\theta)^0 = 1$, and cis(0 · θ) = cis 0 = 1, so they are equal.
If n < 0, then n + |n| = 0. Then,
Therefore, we finish the proof for case n < 0 by using Lemma 7.3.9, Lemma 7.3.11, Theorem 7.3.13,
and the fact that −|n| = n:
Problem 7.3.17. Now, $\operatorname{cis} 3\theta = (\operatorname{cis}\theta)^3$. By expanding $(\cos\theta + i\sin\theta)^3$ using the Binomial Theorem,
derive formulas for cos 3θ and sin 3θ entirely in terms of cos θ and sin θ respectively.
We equate the real and imaginary parts, then simplify to get the triple angle formulas:
\begin{align*}
\cos 3\theta &= \cos^3\theta - 3\cos\theta\sin^2\theta \\
&= \cos^3\theta - 3\cos\theta(1 - \cos^2\theta) \\
&= \cos^3\theta - 3\cos\theta + 3\cos^3\theta \\
\cos 3\theta &= 4\cos^3\theta - 3\cos\theta.
\end{align*}
Theorem 7.3.19
We have certain relationships between cos nθ, cos θ and sin nθ, sin θ:
Notice that all powers of sin θ are even. We can use the identity $\sin^2\theta = 1 - \cos^2\theta$ to be able to
express any $\sin^{2k}\theta$ as $(1 - \cos^2\theta)^k$ ∀k ∈ Z+. Therefore, since all sin θ raised to an even power can
be expressed in terms of cos θ, we are left with cos nθ being equal to a polynomial in terms of cos θ.
Likewise, consider the imaginary parts:
\[ \sin(n\theta) = \binom{n}{1}\cos^{n-1}\theta\sin\theta - \binom{n}{3}\cos^{n-3}\theta\sin^3\theta + \binom{n}{5}\cos^{n-5}\theta\sin^5\theta - \dots \]
When we consider all odd n in Z+, n minus some other odd number (n − (2k + 1) ∀k ∈ Z) must
be even. Note that the powers of cos θ in this expression are all of that form, therefore all cos θ are
raised to an even power.
We can then apply the identity $\cos^2\theta = 1 - \sin^2\theta$ similar to the cos(nθ) example above, to
establish that all even powers of cos θ can be expressed in terms of sin θ. We can then conclude that
sin(nθ) can be expressed as a polynomial in terms of sin θ only, for all odd positive integers n.
Problem 7.3.20. Given the equation $(a + bi)^2 = i$, solve for all such possible complex numbers.
Problem 7.3.21. Solve the equation $z^2 = i$ (given z ∈ C) using the polar form of complex numbers.
Solution. First, note that |i| = 1 and $\operatorname{Arg} i = \frac{\pi}{2}$ (the complex number i on the complex plane is
the same as point (0, 1), or $\left(\cos\frac{\pi}{2}, \sin\frac{\pi}{2}\right)$ on the Cartesian plane). Therefore, $i = 1\cdot\operatorname{cis}\frac{\pi}{2}$. We let
z = r cis θ for arbitrary r and θ, so we have the equation
\[ (r\operatorname{cis}\theta)^2 = 1\cdot\operatorname{cis}\frac{\pi}{2}. \]
Using De Moivre’s Theorem, this results in $r^2\operatorname{cis} 2\theta = 1\cdot\operatorname{cis}\frac{\pi}{2}$. We assume r to be positive, so
r = 1. We also have $\operatorname{cis} 2\theta = \operatorname{cis}\frac{\pi}{2}$. Because the sine and cosine functions are periodic with period 2π,
we conclude that $2\theta = \frac{\pi}{2} + 2\pi k$ for some k ∈ Z, or $\theta = \frac{\pi}{4} + \pi k$. Therefore, limiting θ to [0, 2π), we
have the solutions $\theta = \frac{\pi}{4}, \frac{5\pi}{4}$. Our final solutions are $z = 1\cdot\operatorname{cis}\frac{\pi}{4}$, $1\cdot\operatorname{cis}\frac{5\pi}{4}$.
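The procedure used here (write the right-hand side in polar form, take the n-th root of the magnitude, divide the argument plus multiples of 2π by n) can be sketched as a small function. The name `nth_roots` is hypothetical; `cmath.polar` and `cmath.rect` are standard-library helpers. Note that `cmath.polar` reports the argument in (−π, π], so the roots may come out listed from a different starting angle than in a hand solution that restricts θ to [0, 2π).

```python
import cmath
import math

def nth_roots(w, n):
    """All n solutions z of z**n = w, found through polar form."""
    r, phi = cmath.polar(w)      # w = r cis(phi)
    root_r = r ** (1 / n)        # the positive real n-th root of the magnitude
    # n * theta = phi + 2*pi*k, for k = 0, 1, ..., n - 1
    return [cmath.rect(root_r, (phi + 2 * math.pi * k) / n) for k in range(n)]

square_roots_of_i = nth_roots(1j, 2)
print(square_roots_of_i)   # compare with cis(pi/4) and cis(5pi/4)

cube_roots_of_8 = nth_roots(8, 3)
print(cube_roots_of_8)     # compare with 2 cis 0, 2 cis(2pi/3), 2 cis(4pi/3)
```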
Exercise 7.3.22. Confirm that the answers in Problem 7.3.20 and Problem 7.3.21 are equivalent.
Solution. We can solve this in two ways. I will demonstrate both of them:
1. Algebra. We rewrite the equation into a polynomial $z^3 - 8 = 0$, which can then be factored
into $(z - 2)(z^2 + 2z + 4) = 0$. Using the quadratic formula, we find that the roots are
$z = 2$, $-1 + i\sqrt{3}$, $-1 - i\sqrt{3}$.
2. Using Polar Form. We know that 8 = 8 cis 0, therefore if we let z = r cis θ, then we have the
equation $(r\operatorname{cis}\theta)^3 = 8\operatorname{cis} 0$, or $r^3\operatorname{cis} 3\theta = 8\operatorname{cis} 0$. This implies $r^3 = 8$ and cis 3θ = cis 0, thus
r = 2 and 3θ = 0 + 2πk for some k ∈ Z (as cis has a period of 2π). Thus $\theta = \frac{2\pi}{3}k$, i.e. $\theta = 0, \frac{2\pi}{3}, \frac{4\pi}{3}$
when θ ∈ [0, 2π). Finally we bring all information together to state all solutions: $z = 2\operatorname{cis} 0$,
$2\operatorname{cis}\frac{2\pi}{3}$, and $2\operatorname{cis}\frac{4\pi}{3}$.
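Solution. As usual, rewrite −243 as 243 cis π. Letting z = r cis θ, we get $(r\operatorname{cis}\theta)^5 = 243\operatorname{cis}\pi$, or
$r^5\operatorname{cis} 5\theta = 243\operatorname{cis}\pi$ using De Moivre’s Theorem. We have r = 3 and 5θ = π + 2πk, or $\theta = \frac{\pi}{5} + \frac{2\pi}{5}k$,
and when restricting θ to the interval [0, 2π), we have the solutions $\theta = \frac{\pi}{5}, \frac{3\pi}{5}, \pi, \frac{7\pi}{5}$, and $\frac{9\pi}{5}$.
Therefore all the solutions for z are $3\operatorname{cis}\frac{\pi}{5}$, $3\operatorname{cis}\frac{3\pi}{5}$, $3\operatorname{cis}\pi$, $3\operatorname{cis}\frac{7\pi}{5}$, and $3\operatorname{cis}\frac{9\pi}{5}$.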
Problem 7.3.25. What are the five fifth roots of i? Then, find their sum and product.
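Solution. Following the usual procedure: $z^5 = i \implies (r\operatorname{cis}\theta)^5 = 1\cdot\operatorname{cis}\frac{\pi}{2} \implies r^5\operatorname{cis} 5\theta = 1\cdot\operatorname{cis}\frac{\pi}{2}$,
yielding r = 1 and $\theta = \frac{\pi}{10} + \frac{2\pi}{5}k$, i.e. $\theta = \frac{\pi}{10}, \frac{\pi}{2}, \frac{9\pi}{10}, \frac{13\pi}{10}, \frac{17\pi}{10}$. Therefore the solutions for z are
$\operatorname{cis}\frac{\pi}{10}$, $\operatorname{cis}\frac{\pi}{2}$, $\operatorname{cis}\frac{9\pi}{10}$, $\operatorname{cis}\frac{13\pi}{10}$, $\operatorname{cis}\frac{17\pi}{10}$.
To find their sum and product, we must recall that the fifth roots of i are essentially the roots of
the equation $z^5 - i = 0$. This is a polynomial! Therefore we can apply Vieta’s Formulas to determine
that the sum is 0 and the product is i.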
Problem 7.3.26. If the five fifth roots of i are $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$, and $\omega_5$, what is $\prod_{k=1}^{5}(2 - \omega_k)$?
Solution. As previously stated, the polynomial that has these roots is $z^5 - i = 0$. Therefore
\[ z^5 - i = (z - \omega_1)(z - \omega_2)(z - \omega_3)(z - \omega_4)(z - \omega_5) = \prod_{k=1}^{5}(z - \omega_k). \]
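The Vieta values from Problem 7.3.25 and the factorization above can both be checked numerically. The sketch below builds the five fifth roots of i from their polar forms and compares their sum, their product, and the product of (2 − ωₖ) against 0, i, and the polynomial z⁵ − i evaluated at z = 2. The helper `product` is a hypothetical name introduced for this example, and the comparisons hold only up to floating-point error.

```python
import cmath
import math

# The five fifth roots of i: cis(pi/10 + 2*pi*k/5) for k = 0, ..., 4.
roots = [cmath.rect(1, math.pi / 10 + 2 * math.pi * k / 5) for k in range(5)]

def product(values):
    """Multiply a sequence of complex numbers together."""
    result = 1
    for v in values:
        result *= v
    return result

total = sum(roots)                          # Vieta: should be close to 0
prod = product(roots)                       # Vieta: should be close to i
value_at_2 = product(2 - w for w in roots)  # factorization evaluated at z = 2

print(total, prod)
print(value_at_2, 2 ** 5 - 1j)              # the two values should agree
```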
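Problem 7.3.27. Find all solutions to $z^5 = 16\sqrt{3} - 16i$.
Solution. As usual, we use De Moivre’s Theorem and compare the magnitude and argument:
\begin{align*}
(r\operatorname{cis}\theta)^5 &= 32\left(\frac{\sqrt{3}}{2} - \frac{1}{2}i\right) \\
r^5\operatorname{cis} 5\theta &= 32\operatorname{cis}\left(-\frac{\pi}{6}\right) \\
r = 2, \quad 5\theta &= -\frac{\pi}{6} + 2\pi k \\
\theta &= -\frac{\pi}{30} + \frac{2\pi}{5}k.
\end{align*}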
Exercise 7.3.29. Write the six solutions to $z^6 = -64i$ in polar form, then convert two of them to
standard form (a + bi).
7.3.3 Rotation
Expressing complex numbers in polar form allows us to easily rotate points. Consider the following
diagram:
Imaginary axis
(r; θ)
(r; θ + α)
θ+α
α
θ
Real axis
Recall that r cis(θ + α) = (r cis θ) · cis α. Thus, if we are given some complex number z = a + bi =
r cis θ, we can easily rotate it around the origin counter-clockwise by an angle α by multiplying z by
cis α.
Thus, the complex number that results from rotating a + bi by an angle of α counterclockwise is:
\[ (a + bi)(\cos\alpha + i\sin\alpha) = (a\cos\alpha - b\sin\alpha) + (a\sin\alpha + b\cos\alpha)i. \]
Then, if we convert this complex number back to coordinates, we conclude that the rotation of
the point (a, b) by the angle α around the origin is
\[ (a\cos\alpha - b\sin\alpha,\; a\sin\alpha + b\cos\alpha). \]
Solution. We must turn this problem into something we know how to do, which is rotating around
the origin.
Observe how we can shift all points by −2 on the x-axis and −3 on the y-axis in order to have
(2, 3) ‘become’ the origin. Therefore this problem is equivalent to rotating (1 − 2, −4 − 3) = (−1, −7)
around the origin by 60◦ counter-clockwise, then shifting the point back 2 right and 3 up. Using the
formula, we have
\[ (-1\cdot\cos 60^\circ + 7\sin 60^\circ,\; -1\cdot\sin 60^\circ - 7\cos 60^\circ) = \left(\frac{7\sqrt{3} - 1}{2},\; -\frac{7 + \sqrt{3}}{2}\right) \]
as the rotated point from (−1, −7) around the origin. Now we can shift the point by adding 2 to the
x-coordinate and 3 to the y-coordinate to get our answer:
\[ \left(\frac{7\sqrt{3} - 1}{2} + 2,\; -\frac{7 + \sqrt{3}}{2} + 3\right) = \left(\frac{7\sqrt{3} + 3}{2},\; -\frac{1 + \sqrt{3}}{2}\right). \]
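The shift, rotate, shift-back procedure used in this solution can be sketched with complex multiplication. The function name `rotate` and its `center` parameter are assumptions made for illustration; the rotation itself is the multiplication by cis α described earlier in this section, carried out with the standard-library call `cmath.rect(1, angle)`.

```python
import cmath
import math

def rotate(point, angle, center=(0.0, 0.0)):
    """Rotate point counter-clockwise by angle (in radians) about center."""
    c = complex(*center)
    z = complex(*point) - c          # shift so that the center becomes the origin
    z = z * cmath.rect(1, angle)     # multiply by cis(angle) to rotate
    z = z + c                        # shift back
    return z.real, z.imag

# The worked example: rotate (1, -4) about (2, 3) by 60 degrees.
x, y = rotate((1, -4), math.radians(60), center=(2, 3))
print(x, y)
print((7 * math.sqrt(3) + 3) / 2, -(1 + math.sqrt(3)) / 2)  # the hand-computed answer
```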
Now we shift our focus to conic sections. The method of rotating points that we have discovered
earlier can be utilized in a certain fashion (also known as a coordinate substitution) to rotate general
functions, particularly conics.
While functions from complex numbers to complex numbers are impossible to graph, functions
of the form f (z) = 0 are possible. Graphs of f (z) = 0 correspond to graphs of f (x, y) = 0 in the
Cartesian plane, so we will instead deal with those. Consider the following diagram.
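[Figure: an ellipse (black) and the same ellipse rotated counterclockwise by an angle α (purple), drawn in the complex plane with axes Re and Im.]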
In this figure, the ellipse shown is rotated some angle α counterclockwise. Note that if we rotate
any point on the rotated (purple) ellipse clockwise by α, then we get a point on the original (black)
ellipse.
For example, the red point on the purple ellipse gives us the red point on the black ellipse when
rotated. Therefore, if the equation of the black ellipse is f (x, y) = 0, then the equation of the purple
ellipse should be f (the point (x, y) rotated α clockwise) = 0. However, we now have a coordinate
substitution that gives us the coordinates after rotation! Substituting in our formulas (and noting
that rotating clockwise by α is the same as rotating counterclockwise by −α), the equation for the
rotated ellipse is
f (x cos(−α) − y sin(−α), x sin(−α) + y cos(−α)) = 0.
In general, we can conclude that given a function f (x, y) = 0, the new equation
Solution. For clarity, rearrange this to y − 2x = 0. We then use the stated formula and make
appropriate substitutions to x and y:
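This rearranges to $\left(-\frac{x\sqrt{3}}{2} + \frac{y}{2}\right) - 2\left(\frac{x}{2} + \frac{y\sqrt{3}}{2}\right) = 0$, or $\frac{y}{2} - y\sqrt{3} = x + \frac{x\sqrt{3}}{2}$. After much
simplification, we end up with $y = -\frac{8 + 5\sqrt{3}}{11}x$.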
11
Theorem 7.3.34
Consider the general equation for a conic section:
\[ Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0. \]
We call the value $B^2 - 4AC$ the conic discriminant. Assuming that the conic is not
degenerate:
Proof. We first attempt to get rid of the x and y terms of that general conic equation by translating
it. Translations are characterized by the transformations x → x − r and y → y − s. Substituting, we
have this equation:
After expanding, the resulting coefficient of x is D−2Ar−Bs and the y coefficient is E −2Cs−Br.
In order for both of these to equal 0 (and therefore eliminate those terms), we have
D = 2Ar + Bs,
E = Br + 2Cs.
We then solve for r and s, which are the translations necessary to get rid of the x and y terms:
\[ r = \frac{BE - 2CD}{B^2 - 4AC}, \qquad s = \frac{BD - 2AE}{B^2 - 4AC}. \]
Thus, if the quantity $B^2 - 4AC \neq 0$, then we can translate the conic such that there are no x
nor y terms. Note also that the coefficients A, B, and C all stay the same under this transformation.
We now split into two different cases: $B^2 - 4AC \neq 0$ and $B^2 - 4AC = 0$.
• Case 1: $B^2 - 4AC \neq 0$
Assuming $B^2 - 4AC \neq 0$, then we have successfully eliminated the x and y terms, so we just
want to deal with equations of the form $Ax^2 + Bxy + Cy^2 + F = 0$ (remember, the values A,
B, and C all stayed the same under that transformation; the constant term may have changed,
but we keep calling it F for this new equation).
We now try to rotate the conic by some angle θ to get rid of the xy term. As stated in Theorem
8, we can make the transformations x → x cos θ − y sin θ and y → x sin θ + y cos θ. Making our
substitutions, we have the equation
\[ A(x\cos\theta - y\sin\theta)^2 + B(x\cos\theta - y\sin\theta)(x\sin\theta + y\cos\theta) + C(x\sin\theta + y\cos\theta)^2 + F = 0. \]
Expanding and rearranging according to our x2 , xy, and y 2 terms, we are left with:
\[ x^2(A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta) + xy(B\cos 2\theta + (C - A)\sin 2\theta) + y^2(A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta) + F = 0. \]
Our original purpose was to eliminate the xy term, so we set the coefficient of the xy term
equal to 0:
B cos 2θ + (C − A) sin 2θ = 0.
Rearranging, we have:
B cos 2θ = (A − C) sin 2θ.
If A ≠ C, we can divide both sides to get $\tan 2\theta = \frac{B}{A - C}$. If A = C, then we have cos 2θ = 0, or θ = 45°. Either way, we have found an expression for θ
that can eliminate the xy term.
After finally eliminating the xy term, we now have the new conic equation $\tilde{a}x^2 + \tilde{c}y^2 + F = 0$
(we cannot use the initial A, B, C variables again because the coefficients of $x^2$ and $y^2$ have
changed due to the rotation by θ). Based on what we know about conic sections so far:
To figure out whether $\tilde{a}$ and $\tilde{c}$ have the same sign or not, we multiply them together: if the
product is negative, then they have different signs, and if the product is positive, then they
have the same sign.
Recall from the expanded equation stated earlier that $\tilde{a} = A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta$ and
$\tilde{c} = A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta$. We have to figure out the sign of this unwieldy product:
Without loss of generality, we can double each of these expressions, because multiplying by 2
would not influence the sign of the product.
As one may observe, doubling these expressions has enabled us to be able to conveniently
apply our known trigonometric identities:
At this point, we cannot proceed without discussing the value of θ. As concluded earlier, we
have either $\tan 2\theta = \frac{B}{A - C}$ or θ = 45°. Consider both cases:
– Case 1.1: $\tan 2\theta = \frac{B}{A - C}$.
We must find the expressions of sin 2θ and cos 2θ in terms of A, B, and C. We can
accomplish this with the use of a reference triangle (constructed such that the tangent of
an angle 2θ is equal to the ratio of the opposite side to the adjacent side), as shown:
[Figure: a right triangle with angle 2θ, adjacent side A − C, opposite side B, and hypotenuse $\sqrt{B^2 + (A - C)^2}$.]
theorem.
If $4AC - B^2$ is negative, then $\tilde{a}$ and $\tilde{c}$ have different signs, therefore the conic is a
hyperbola. Note that $4AC - B^2 < 0 \implies B^2 - 4AC > 0$, and so we have proven the
‘hyperbola’ part of the theorem.
– Case 1.2: θ = 45◦ .
Our expression,
(A + B + C)(A − B + C).
However, recall that θ = 45° iff A = C. Therefore, the expression above is equal to
$(2A + B)(2A - B) = 4A^2 - B^2 = 4AC - B^2$, and now we can make the same conclusion
as Case 1.1.
• Case 2: B 2 − 4AC = 0
Let’s go back to our original general equation of a conic section:
Ax2 + Bxy + Cy 2 + Dx + Ey + F = 0.
Note that $Ax^2 + Bxy + Cy^2$ is a square if and only if $B^2 - 4AC = 0$. To see why this is
true, recall the quadratic formula $\frac{-B \pm \sqrt{B^2 - 4AC}}{2A}$ (which is an expression for the roots
of the quadratic). If $B^2 - 4AC = 0$, then the square root term is eliminated and there is
no plus-minus case to consider, resulting in double roots (both roots are the same), so the
expression is a square of another expression.
Thus, let $Ax^2 + Bxy + Cy^2 = (mx + ny)^2$ for some m, n ∈ R. Substituting, we now have
the equation
\[ (mx + ny)^2 + Dx + Ey + F = 0. \]
We are trying to prove that $B^2 - 4AC = 0$ implies that the conic is a parabola. Therefore we
seek to eliminate the $y^2$ term so we are left with an equation with only y, $x^2$, and x terms.
Like the first case, we perform the rotation x → x cos θ − y sin θ and y → x sin θ + y cos θ in
hopes of getting rid of the $y^2$ term:
\[ (m(x\cos\theta - y\sin\theta) + n(x\sin\theta + y\cos\theta))^2 + D(x\cos\theta - y\sin\theta) + E(x\sin\theta + y\cos\theta) + F = 0. \]
Rewrite this equation so that it is clear what the coefficients of $x^2$, $y^2$, x, and y are:
\[ (x(m\cos\theta + n\sin\theta) + y(n\cos\theta - m\sin\theta))^2 + x(D\cos\theta + E\sin\theta) + y(E\cos\theta - D\sin\theta) + F = 0. \]
Now it is clear that the only source of the $y^2$ term is n cos θ − m sin θ, which is the coefficient of
the y term inside the squared expression. Therefore we set n cos θ − m sin θ = 0, or $\tan\theta = \frac{n}{m}$.
Note that the equation now will resemble this:
\[ (x\cdot(\text{some expression}))^2 + \tilde{D}x + \tilde{E}y + \tilde{F} = 0, \]
for some coefficients $\tilde{D}$, $\tilde{E}$, and $\tilde{F}$.
At last, we have shown that we can rotate the conic by a chosen angle θ (such that
$\tan\theta = \frac{n}{m}$) to end up with an equation that is a parabola. Therefore, $B^2 - 4AC = 0 \implies$
the conic is a parabola.
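The conclusions of this proof can be collected into a short classification routine. This is a sketch under two stated assumptions: the conic is non-degenerate, as the theorem requires, and the negative-discriminant case is the ellipse, which is the standard result. The function name `classify_conic` is hypothetical. The comparison with zero is exact, so it is intended for integer or rational coefficients; floating-point coefficients would need a tolerance.

```python
def classify_conic(A, B, C):
    """Classify Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 (assumed non-degenerate)
    by the sign of the conic discriminant B^2 - 4AC."""
    disc = B * B - 4 * A * C
    if disc > 0:
        return "hyperbola"
    if disc == 0:
        return "parabola"
    return "ellipse"   # assumption: standard result for a negative discriminant

# 2x^2 - 4xy + 2y^2 - 4x - 4y = 0, the oblique parabola found earlier in the chapter
print(classify_conic(2, -4, 2))
# x^2 - y^2 = 1
print(classify_conic(1, 0, -1))
# 2(x - 1)^2 + 3(y + 2)^2 = 11 has A = 2, B = 0, C = 3
print(classify_conic(2, 0, 3))
```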
Linear Algebra
In this chapter, we will touch upon the basics of linear algebra, particularly vectors and matrices.
Then we discuss their applications in 2D and 3D geometry.
This chapter will only serve as a general introduction of linear algebra, so some proofs will be
omitted as they would be beyond the scope of this book. More emphasis will be placed on showing
how certain concepts are related to each other, as well as key observations to be made, rather than
utmost mathematical rigor. For in-depth study of linear algebra, feel free to explore courses offered
by universities or books on the subject.
8.1 Vectors
• $\vec{v}$: 12 ft. east
• $\vec{w}$: 5 in. north
We represent vectors with arrows; however, a vector is determined only by its magnitude
and direction. Although we can draw more than one arrow with the same length and direction,
remember that they all denote the same vector. Some representations of vectors are illustrated
below:
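[Figure: two arrows of length 12 units, each labeled $\vec{v}$ (“These are the same vector $\vec{v}$, but different representations.”), and a separate arrow labeled $\vec{w}$.]
[Figure: the vectors $\vec{v}$, $\vec{w}$, and $\vec{v} + \vec{w}$, with a tail and a tip labeled.]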
Example 8.1.3
Show that $\vec{v} + \vec{w}$ is unique.
Proof. Given the diagram below, we need to show that the lengths AC and A′C′ are equal (that
$\vec{v} + \vec{w}$ has a concrete value no matter what representations of $\vec{v}$ and $\vec{w}$ are taken).
[Figure: a triangle ABC with $\vec{v}$ along AB, $\vec{w}$ ending at C, and “$\vec{v} + \vec{w}$” along AC, together with a second copy A′B′C′ built from other representations of $\vec{v}$ and $\vec{w}$.]
First, note that both vectors $\vec{v}$ are parallel (i.e. pointing in the same direction) and equal in
magnitude. So are the two $\vec{w}$ vectors. Therefore, ABB′A′ and BCC′B′ are parallelograms.
This implies AA′ ≅ BB′ and BB′ ≅ CC′, and by the transitive property, AA′ ≅ CC′.
191 Chapter 8. Linear Algebra
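Problem 8.1.4. From this definition of vector addition, prove the following:
1. $\vec{v} + \vec{w} = \vec{w} + \vec{v}$
2. $\vec{u} + (\vec{v} + \vec{w}) = (\vec{u} + \vec{v}) + \vec{w}$
Proof.
1. Consider the following diagram:
[Figure: a parallelogram with sides $\vec{v}$ and $\vec{w}$, whose diagonal is labeled $\vec{v} + \vec{w} = \vec{w} + \vec{v}$.]
Simply use two different representations of vector $\vec{v}$ and two different representations of vector
$\vec{w}$ such that the resulting vector $\vec{v} + \vec{w}$ is the same as $\vec{w} + \vec{v}$.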
2. Consider the following diagram:
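[Figure: the vectors $\vec{u}$, $\vec{v}$, and $\vec{w}$ with the points B, C, and D marked, showing the sums $\vec{u} + \vec{v}$ and $\vec{v} + \vec{w}$ and the common result $\vec{u} + (\vec{v} + \vec{w}) = (\vec{u} + \vec{v}) + \vec{w}$.]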
Exercise 8.1.6. From this definition of a scalar, two distributive properties follow:
1. $k(\vec{v} + \vec{w}) = k\vec{v} + k\vec{w}$
Definition 8.1.7. The zero vector, $\vec{0}$, is the vector of magnitude 0 in any direction.
Definition 8.1.8. $-\vec{v}$ is the vector which, when added to $\vec{v}$, gives us $\vec{0}$ (i.e. $-\vec{v}$ has the same magnitude
as $\vec{v}$, but points in the opposite direction).
It follows that we have another associative property: $(kl)\vec{v} = k(l\vec{v})$ for k, l ∈ R. Feel free to prove
this on your own.
Proof. Using the distributive property, note that $-1\cdot\vec{v} + 1\cdot\vec{v} = (-1 + 1)\cdot\vec{v} = 0\cdot\vec{v} = \vec{0}$. Therefore,
$-1\cdot\vec{v} + \vec{v} = \vec{0}$, so by the previous definition, $-1\cdot\vec{v} = -\vec{v}$.
[Figure: the vectors $\vec{v}$, $\vec{w}$, and $-\vec{w}$, with the difference $\vec{v} - \vec{w}$ drawn in two ways.]
Definition 8.1.11. We now introduce new notation that is not universally standard, but we will be using this notation from now on:
In the two-dimensional plane, denote the vector $\langle a, b \rangle$ as the vector that represents the overall path taken when going $a$ horizontally and $b$ vertically from any starting point.
[Figure: the path from $(2, 8)$ to $(7, -6)$: 5 right, $\langle 5, 0 \rangle$, then 14 down, $\langle 0, -14 \rangle$, giving the overall vector $\langle 5, -14 \rangle$.]
Given a vector $\langle a_1, a_2, \ldots, a_n \rangle$, we would call $a_1, a_2, \ldots, a_n$ the components of that vector.
Problem 8.1.12. What would be the vector that connects the point (r, s) to the point (t, u)?
• $\vec{k} = \langle 0, 0, 1 \rangle$ in 3D.
We can define more unit vectors for greater dimensions, and we will do so at a later part of this chapter.
Using these vectors, we can establish the relation
\[ \langle a, b \rangle = a\vec{\imath} + b\vec{\jmath}, \]
allowing us to break down any vector in terms of the unit vectors. We will revisit this idea later in the chapter.
Solution. The vector $\langle 3, 4 \rangle$ means that we're going 3 to the right and 4 up, so we have $\|\langle 3, 4 \rangle\| = \sqrt{3^2 + 4^2} = \sqrt{25} = 5$. It is evident that $\|\vec{\imath}\| = 1$, since we are only going 1 unit right. Lastly, $\|\langle 1, 2, 3 \rangle\| = \sqrt{1^2 + 2^2 + 3^2} = \sqrt{14}$.
1. $\|k\vec{v}\| = |k|\,\|\vec{v}\|$
2. $\|\vec{v} + \vec{w}\| \le \|\vec{v}\| + \|\vec{w}\|$, with equality if and only if $\vec{v}, \vec{w}$ are in the same direction.
The first one is easily provable, while a proof for the second one is nearly identical to that of Problem 7.3.4.
Problem 8.1.16. Find vectors in the same direction as $\langle 2, 3 \rangle$ of:
1. Magnitude 1.
2. Magnitude 7.
Solution.
1. We find that $\|\langle 2, 3 \rangle\| = \sqrt{2^2 + 3^2} = \sqrt{13}$. Therefore we must scale our vector by a factor of $\frac{1}{\sqrt{13}}$ in order to scale its magnitude to 1. Therefore our new vector is $\frac{1}{\sqrt{13}} \langle 2, 3 \rangle = \left\langle \frac{2}{\sqrt{13}}, \frac{3}{\sqrt{13}} \right\rangle$.
2. Similarly, since we have found our vector $\left\langle \frac{2}{\sqrt{13}}, \frac{3}{\sqrt{13}} \right\rangle$ of magnitude 1 in the same direction as the original vector, we can simply scale that vector by a factor of 7 to get: $\left\langle \frac{14}{\sqrt{13}}, \frac{21}{\sqrt{13}} \right\rangle$.
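The scaling argument above can be checked numerically. The following is a minimal Python sketch; the helper names `magnitude` and `scale_to` are made up for this illustration and do not come from the text.

```python
import math

def magnitude(v):
    """Euclidean length of a vector given as a sequence of components."""
    return math.sqrt(sum(c * c for c in v))

def scale_to(v, length):
    """Return the vector in the same direction as v with the requested magnitude."""
    norm = magnitude(v)
    if norm == 0:
        raise ValueError("the zero vector has no direction")
    # First shrink to magnitude 1 (divide by the norm), then stretch to `length`.
    return [length * c / norm for c in v]

unit = scale_to([2, 3], 1)    # components 2/sqrt(13) and 3/sqrt(13)
seven = scale_to([2, 3], 7)   # components 14/sqrt(13) and 21/sqrt(13)
```

Dividing by the norm first and multiplying afterwards mirrors the two steps of the solution: normalize, then rescale.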
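Now, we introduce the concept of angles to vectors. Consider the following vectors:
[Figure: triangle formed by $\vec{v} = \langle a, b \rangle$, $\vec{w} = \langle c, d \rangle$, and $\vec{v} - \vec{w} = \langle a - c, b - d \rangle$, with angle $\theta$ between $\vec{v}$ and $\vec{w}$.]
The vectors $\vec{v}$, $\vec{w}$, and $\vec{v} - \vec{w}$ form a triangle. We can then apply the Law of Cosines to find out the value of angle $\theta$ between vectors $\vec{v}$ and $\vec{w}$:
\[ \|\vec{v} - \vec{w}\|^2 = \|\vec{v}\|^2 + \|\vec{w}\|^2 - 2\|\vec{v}\|\|\vec{w}\| \cos\theta. \]
In components this reads $(a - c)^2 + (b - d)^2 = (a^2 + b^2) + (c^2 + d^2) - 2\|\vec{v}\|\|\vec{w}\| \cos\theta$, which simplifies to $ac + bd = \|\vec{v}\|\|\vec{w}\| \cos\theta$.
Notice that $ac + bd$ is the result of taking the sum of the product of the first coordinates and the product of the second coordinates of $\vec{v}$ and $\vec{w}$. This quantity is extremely useful and significant in the use of vectors, and so we will define this unique operation: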
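Definition 8.1.17. Given vectors $\langle a_1, a_2, \ldots, a_n \rangle$ and $\langle b_1, b_2, \ldots, b_n \rangle$, the dot product is denoted as:
\[ \langle a_1, a_2, \ldots, a_n \rangle \cdot \langle b_1, b_2, \ldots, b_n \rangle = \sum_{i=1}^{n} a_i b_i. \]
1. $\forall a \in \mathbb{R}$, $a\vec{v} \cdot \vec{w} = a(\vec{v} \cdot \vec{w})$.
2. $\vec{u} \cdot (\vec{v} + \vec{w}) = \vec{u} \cdot \vec{v} + \vec{u} \cdot \vec{w}$.
3. $\vec{u} \cdot \vec{u} = \|\vec{u}\|^2$.
Proof. When we let $\vec{u} = \langle a, b \rangle$, we end up with $\langle a, b \rangle \cdot \langle a, b \rangle = a \cdot a + b \cdot b = a^2 + b^2 = \|\vec{u}\|^2$.
Proof. We use our previously declared definitions and make the proper substitutions.
\[
\begin{aligned}
(\vec{u} + \vec{v}) \cdot (\vec{u} - \vec{v}) &= (\vec{u} + \vec{v}) \cdot \vec{u} + (\vec{u} + \vec{v}) \cdot (-\vec{v}) \\
&= \vec{u} \cdot \vec{u} + \vec{u} \cdot \vec{v} + (\vec{u} + \vec{v}) \cdot (-1 \cdot \vec{v}) \\
&= \|\vec{u}\|^2 + \vec{u} \cdot \vec{v} + (-1(\vec{u} + \vec{v})) \cdot \vec{v} \\
&= \|\vec{u}\|^2 + \vec{u} \cdot \vec{v} + (-\vec{u} - \vec{v}) \cdot \vec{v} \\
&= \|\vec{u}\|^2 + \vec{u} \cdot \vec{v} - \vec{u} \cdot \vec{v} - \vec{v} \cdot \vec{v} \\
&= \|\vec{u}\|^2 - \|\vec{v}\|^2.
\end{aligned}
\]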
Now that we have defined the dot product, we can rewrite $\cos\theta$ with a simpler, cleaner expression.
Theorem 8.1.18
Given two vectors $\vec{v}$ and $\vec{w}$ and the angle $\theta$ between their representations such that they share the same tail, we have the following relationship:
\[ \cos\theta = \frac{\vec{v} \cdot \vec{w}}{\|\vec{v}\|\|\vec{w}\|}. \]
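Problem 8.1.19. What is the angle between $\langle 1, 2 \rangle$ and $\langle 2, 1 \rangle$?
Solution. The dot product $\langle 1, 2 \rangle \cdot \langle 2, 1 \rangle = 1 \cdot 2 + 2 \cdot 1 = 4$, $\|\langle 1, 2 \rangle\| = \sqrt{5}$, and $\|\langle 2, 1 \rangle\| = \sqrt{5}$, therefore $\cos\theta = \frac{4}{\sqrt{5} \cdot \sqrt{5}} = \frac{4}{5}$, and thus $\theta = \cos^{-1}\left(\frac{4}{5}\right)$ since $\langle 1, 2 \rangle$ and $\langle 2, 1 \rangle$ have representations in the first quadrant in the Cartesian plane when their tails are placed at the origin.
Problem 8.1.20. Find the angle between $\langle 1, 2, 3 \rangle$ and $\langle 3, 4, 5 \rangle$.
Solution. We have
\[ \cos\theta = \frac{\langle 1, 2, 3 \rangle \cdot \langle 3, 4, 5 \rangle}{\|\langle 1, 2, 3 \rangle\|\|\langle 3, 4, 5 \rangle\|} = \frac{1 \cdot 3 + 2 \cdot 4 + 3 \cdot 5}{\sqrt{1^2 + 2^2 + 3^2} \cdot \sqrt{3^2 + 4^2 + 5^2}} = \frac{13}{5\sqrt{7}}. \]
Therefore $\theta = \cos^{-1}\left(\frac{13}{5\sqrt{7}}\right)$.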
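As a numerical cross-check of Theorem 8.1.18 on the two problems above, here is a short Python sketch. The function names `dot` and `angle_between` are illustrative choices, not notation from the text, and the clamp before `acos` is an added safeguard against floating-point rounding.

```python
import math

def dot(v, w):
    """Dot product: sum of products of matching components."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(v, w))

def angle_between(v, w):
    """Angle in radians between non-zero vectors v and w, via cos(theta) = v.w / (|v||w|)."""
    cos_theta = dot(v, w) / (math.sqrt(dot(v, v)) * math.sqrt(dot(w, w)))
    # Rounding can push the ratio slightly outside [-1, 1]; clamp before acos.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

theta_2d = angle_between([1, 2], [2, 1])        # cosine should be 4/5
theta_3d = angle_between([1, 2, 3], [3, 4, 5])  # cosine should be 13/(5*sqrt(7))
```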
Problem 8.1.21. Find all values of $m$ for which the angle between vectors $\langle 1, 1 \rangle$ and $\langle 1, m \rangle$ is $60^\circ$.
Solution. We have
\[ \cos 60^\circ = \frac{1 + m}{\sqrt{2} \cdot \sqrt{1 + m^2}}. \]
Since $\cos 60^\circ = \frac{1}{2}$, we end up with $\sqrt{2 + 2m^2} = 2 + 2m$. After squaring both sides and simplifying, the equation reduces to $m^2 + 4m + 1 = 0$. The quadratic formula yields $m = -2 \pm \sqrt{3}$.
However, since $\cos 60^\circ = \frac{1}{2} > 0$, we cannot have $\frac{1 + m}{\sqrt{2} \cdot \sqrt{1 + m^2}}$ be negative. When we plug in $m = -2 - \sqrt{3}$, we end up with a negative value. Therefore $-2 - \sqrt{3}$ cannot be a solution for $m$.
Then, $m = -2 + \sqrt{3}$ yields a positive value and thus it is the only value of $m$ which satisfies the conditions.
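Theorem 8.1.23
$\vec{v}$ and $\vec{w}$ are orthogonal if and only if $\vec{v} \cdot \vec{w} = 0$.
[Figure: the vectors $\vec{v}$, $\vec{w}$, $k\vec{w}$, and $\vec{v} - k\vec{w}$.]
\[ \operatorname{proj}_{\vec{w}}(\vec{v}) = \frac{\vec{v} \cdot \vec{w}}{\|\vec{w}\|^2} \cdot \vec{w}. \]
Lastly, note that $\vec{\imath} - \vec{\jmath} + \vec{k} = \langle 1, -1, 1 \rangle$ and $\vec{\imath} + \vec{\jmath} + \vec{k} = \langle 1, 1, 1 \rangle$. Therefore $\|\langle 1, -1, 1 \rangle\|^2 = 1^2 + (-1)^2 + 1^2 = 3$ and $\langle 1, -1, 1 \rangle \cdot \langle 1, 1, 1 \rangle = 1 - 1 + 1 = 1$, so we substitute in the appropriate values:
\[ \operatorname{proj}_{\vec{\imath} - \vec{\jmath} + \vec{k}}\left(\vec{\imath} + \vec{\jmath} + \vec{k}\right) = \frac{1}{3} \langle 1, -1, 1 \rangle = \left\langle \frac{1}{3}, -\frac{1}{3}, \frac{1}{3} \right\rangle = \frac{1}{3}\vec{\imath} - \frac{1}{3}\vec{\jmath} + \frac{1}{3}\vec{k}. \]
Solution. We have
\[ \operatorname{proj}_{\langle 1, 2, 3 \rangle}(\langle 3, 4, 5 \rangle) = \frac{3 \cdot 1 + 4 \cdot 2 + 5 \cdot 3}{1^2 + 2^2 + 3^2} \langle 1, 2, 3 \rangle = \frac{13}{7} \langle 1, 2, 3 \rangle = \left\langle \frac{13}{7}, \frac{26}{7}, \frac{39}{7} \right\rangle. \]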
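The projection formula can be reproduced with exact rational arithmetic. This is a small Python sketch assuming integer components; `project` is a name chosen here for illustration.

```python
from fractions import Fraction

def dot(v, w):
    """Dot product of two vectors with the same number of components."""
    return sum(a * b for a, b in zip(v, w))

def project(v, w):
    """Projection of v onto w: (v . w / |w|^2) w, kept exact with Fraction."""
    ww = dot(w, w)
    if ww == 0:
        raise ValueError("cannot project onto the zero vector")
    k = Fraction(dot(v, w), ww)  # the scalar multiple of w
    return [k * c for c in w]

p1 = project([3, 4, 5], [1, 2, 3])    # 13/7 times <1, 2, 3>
p2 = project([1, 1, 1], [1, -1, 1])   # 1/3 times <1, -1, 1>
```

Using `Fraction` rather than floats keeps the results in the same form as the worked solutions, so they can be compared entry by entry.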
In this section, we will give vectors useful meaning by discussing how vectors of one set can be
transformed into vectors of another set. The discussion of linear transformations will enable us to
relate vectors to matrices, and to understand the purpose of matrices.
Definition 8.2.1. Let $\mathbb{R}^n$ denote the set of all $n$-dimensional vectors with components in $\mathbb{R}$.
A linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$ is a map $L : \mathbb{R}^n \mapsto \mathbb{R}^m$ such that:
1. $\forall \vec{v}, \vec{w} \in \mathbb{R}^n$, $L(\vec{v} + \vec{w}) = L(\vec{v}) + L(\vec{w})$.
2. $\forall k \in \mathbb{R}$ and $\vec{v} \in \mathbb{R}^n$, $L(k\vec{v}) = kL(\vec{v})$.
The notation $L : \mathbb{R}^n \mapsto \mathbb{R}^m$ describes $L$ as the name of the map, $\mathbb{R}^n$ as the domain, and $\mathbb{R}^m$ as the codomain. Values in the domain $\mathbb{R}^n$ are mapped, or associated, to values in the codomain $\mathbb{R}^m$.
Note that the codomain is the set consisting of all possible values that can come out of the function mapping, while the range is the actual set of all values that do come out of the mapping. Therefore we can say that the range is a subset of the codomain.
As an example, what would $L(\vec{0})$ be? Clearly, $L(\vec{0}) = \vec{0}$; however, the zero vector that is inputted into the linear transformation is different from the zero vector that is outputted. In other words, the $\vec{0}$ given to $L$ is in $\mathbb{R}^n$ while the $\vec{0}$ that results is in $\mathbb{R}^m$.
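Problem 8.2.2. We define the following transformation:
\[ L : \mathbb{R}^2 \mapsto \mathbb{R}^1, \qquad L(\langle a, b \rangle) = a. \]
Solution. Recalling the definition, we must show that the transformation satisfies two properties:
1. $\forall \vec{v}, \vec{w} \in \mathbb{R}^2$, $L(\vec{v} + \vec{w}) = L(\vec{v}) + L(\vec{w})$.
2. $\forall k \in \mathbb{R}$ and $\vec{v} \in \mathbb{R}^2$, $L(k\vec{v}) = kL(\vec{v})$.
Let $\vec{v} = \langle a, b \rangle$ and $\vec{w} = \langle c, d \rangle$. First, note that $L(\langle a, b \rangle) + L(\langle c, d \rangle) = a + c = L(\langle a + c, b + d \rangle)$, therefore $L(\langle a + c, b + d \rangle) = L(\langle a, b \rangle) + L(\langle c, d \rangle)$, and we have shown that the transformation satisfies the first property.
Then, note that $L(k \langle a, b \rangle) = L(\langle ka, kb \rangle) = ka = kL(\langle a, b \rangle)$, therefore $L(k \langle a, b \rangle) = kL(\langle a, b \rangle)$, and our second property is satisfied. Therefore the transformation is linear.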
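\[ L : \mathbb{R}^2 \mapsto \mathbb{R}^2, \]
\[
\begin{aligned}
L(\langle a, b \rangle) + L(\langle c, d \rangle) &= \langle 2a - 3b, 3b + 79a \rangle + \langle 2c - 3d, 3d + 79c \rangle \\
&= \langle 2a + 2c - 3b - 3d, 3b + 3d + 79a + 79c \rangle \\
&= \langle 2(a + c) - 3(b + d), 3(b + d) + 79(a + c) \rangle \\
&= L(\langle a + c, b + d \rangle) \\
&= L(\langle a, b \rangle + \langle c, d \rangle),
\end{aligned}
\]
which satisfies the first property of a linear transformation. Likewise, we show that the transformation satisfies the second one as well: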
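\[ \operatorname{proj}_{\vec{w}} : \mathbb{R}^n \mapsto \mathbb{R}^m. \]
Solution. We essentially treat the vector projection function as a mapping from any $n$-dimensional vector to an $m$-dimensional vector.
\[
\begin{aligned}
\operatorname{proj}_{\vec{w}}(\vec{v}_1 + \vec{v}_2) &= \frac{(\vec{v}_1 + \vec{v}_2) \cdot \vec{w}}{\|\vec{w}\|^2}\, \vec{w} \\
&= \frac{\vec{v}_1 \cdot \vec{w} + \vec{v}_2 \cdot \vec{w}}{\|\vec{w}\|^2}\, \vec{w} \\
&= \frac{\vec{v}_1 \cdot \vec{w}}{\|\vec{w}\|^2}\, \vec{w} + \frac{\vec{v}_2 \cdot \vec{w}}{\|\vec{w}\|^2}\, \vec{w} \\
&= \operatorname{proj}_{\vec{w}}(\vec{v}_1) + \operatorname{proj}_{\vec{w}}(\vec{v}_2).
\end{aligned}
\]
\[
\begin{aligned}
\operatorname{proj}_{\vec{w}}(k\vec{v}) &= \frac{k\vec{v} \cdot \vec{w}}{\|\vec{w}\|^2}\, \vec{w} \\
&= k\, \frac{\vec{v} \cdot \vec{w}}{\|\vec{w}\|^2}\, \vec{w} \\
&= k \operatorname{proj}_{\vec{w}}(\vec{v}).
\end{aligned}
\]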
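Recall that we defined the special unit vectors $\vec{\imath}$, $\vec{\jmath}$, and $\vec{k}$ of magnitude 1, and that we can express a two-dimensional vector $\langle a, b \rangle$ as $a\vec{\imath} + b\vec{\jmath}$.
Since we are dealing with an arbitrary number of dimensions, we reestablish these definitions with a more generalized outlook:
Definition 8.2.6. If $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n$ are vectors, a linear combination is any vector of the form $a_1\vec{v}_1 + a_2\vec{v}_2 + \ldots + a_n\vec{v}_n$ for $a_1, a_2, \ldots, a_n \in \mathbb{R}$.
\[
\begin{aligned}
\vec{\imath}_1 &= \langle 1, 0, 0, \ldots, 0 \rangle, \\
\vec{\imath}_2 &= \langle 0, 1, 0, \ldots, 0 \rangle, \\
&\ \,\vdots \\
\vec{\imath}_n &= \langle 0, 0, 0, \ldots, 1 \rangle.
\end{aligned}
\]
Using these two definitions, we can write any vector $\langle a_1, a_2, a_3, \ldots, a_n \rangle$ as a linear combination of the standard basis vectors. We conclude that
\[ \langle a_1, a_2, a_3, \ldots, a_n \rangle = a_1\vec{\imath}_1 + a_2\vec{\imath}_2 + a_3\vec{\imath}_3 + \ldots + a_n\vec{\imath}_n. \]
This conclusion is consistent with the remark made earlier that $\langle a, b \rangle = a\vec{\imath} + b\vec{\jmath}$.
For some general linear transformation $L : \mathbb{R}^n \mapsto \mathbb{R}^m$ such that
\[
\begin{aligned}
L(\vec{\imath}_1) &= \vec{v}_1, \\
L(\vec{\imath}_2) &= \vec{v}_2, \\
&\ \,\vdots \\
L(\vec{\imath}_n) &= \vec{v}_n,
\end{aligned}
\]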
We now introduce the following new notation for clarity and consistency later on:
Other examples: $\begin{bmatrix} 117 \end{bmatrix}$ is a $1 \times 1$ matrix, and $\begin{bmatrix} 1 & 1 & 7 \end{bmatrix}$ is a $1 \times 3$ matrix.
We will call each number inside a matrix an entry.
This would be an $m \times n$ matrix because each of the column vectors $\vec{v}_1, \vec{v}_2, \vec{v}_3, \ldots, \vec{v}_n$ has $m$ vertical entries, as they are in $\mathbb{R}^m$. Thus, an $m \times n$ matrix represents the transformation $\mathbb{R}^n \mapsto \mathbb{R}^m$.
The application of a linear transformation on a column vector is denoted by the product of the matrix representing the transformation and that column vector. For instance,
\[ \begin{bmatrix} \vec{v}_1 & \vec{v}_2 & \vec{v}_3 & \cdots & \vec{v}_n \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \]
is the vector that results from applying a linear transformation on the initial vector $\langle a_1, a_2, \ldots, a_n \rangle$.
In other words, for some vector $\vec{v}$, we have
\[ L(\vec{v}) = M\vec{v}, \]
where $M$ is the matrix representing the linear transformation $L$.
Essentially, the matrix serves to represent a transformation in a compact fashion. Later examples
will highlight why the matrix is such a valuable and important tool. Every time we want to talk
about applying a linear transformation to a vector, we don’t want to be dealing with the hassle of
defining some transformation L(ha, bi) in order to convey what the actual transformation is. Instead,
we can simply express our transformation as an organized and condensed matrix.
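Let’s put this into practice. Consider Problem 8.2.3 where
\[ L : \mathbb{R}^2 \mapsto \mathbb{R}^2, \qquad L(\langle a, b \rangle) = \langle 2a - 3b, 3b + 79a \rangle. \]
We have that $L\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} 2 \\ 79 \end{bmatrix}$ and $L\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} -3 \\ 3 \end{bmatrix}$. Then, to find the matrix that represents the overall linear transformation, we put these column vectors together, as such: $\begin{bmatrix} 2 & -3 \\ 79 & 3 \end{bmatrix}$. If we multiply this matrix by any vector $\begin{bmatrix} a \\ b \end{bmatrix}$, we have
\[ \begin{bmatrix} 2 & -3 \\ 79 & 3 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 2a - 3b \\ 79a + 3b \end{bmatrix}, \]
which is consistent with our definition that $L(\langle a, b \rangle) = \langle 2a - 3b, 3b + 79a \rangle$.
Notice that the top entry of the resulting column vector, $2a - 3b$, is the dot product of $\langle 2, -3 \rangle$ with $\langle a, b \rangle$. Similarly, the bottom entry, $79a + 3b$, is the dot product of $\langle 79, 3 \rangle$ with $\langle a, b \rangle$.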
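The recipe used above (apply $L$ to each standard basis vector and use the results as columns) can be sketched in Python. The helpers `matrix_of` and `apply` are hypothetical names for this illustration, and matrices are stored as lists of rows.

```python
def L(v):
    """The transformation L(<a, b>) = <2a - 3b, 3b + 79a> from the text."""
    a, b = v
    return [2 * a - 3 * b, 3 * b + 79 * a]

def matrix_of(transform, n):
    """Matrix of a linear map on R^n: column k is the image of the k-th basis vector."""
    columns = [transform([1 if i == k else 0 for i in range(n)]) for k in range(n)]
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]

def apply(matrix, v):
    """Each output entry is the dot product of one matrix row with v."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]

M = matrix_of(L, 2)        # rows: [2, -3] and [79, 3]
image = apply(M, [5, 7])   # agrees with L([5, 7])
```

Storing the matrix as rows makes the row-by-row dot product observation at the end of the passage directly visible in `apply`.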
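Problem 8.2.11. Let $L : \mathbb{R}^2 \mapsto \mathbb{R}^3$ satisfy $L(\langle a, b \rangle) = \langle a - 2b, 2a + 3b, -b \rangle$. Determine the matrix for $L$.
Solution. First, we calculate the linear transformations of the standard basis vectors in $\mathbb{R}^2$:
\[ L(\langle 1, 0 \rangle) = \langle 1, 2, 0 \rangle, \qquad L(\langle 0, 1 \rangle) = \langle -2, 3, -1 \rangle. \]
Therefore our matrix is $\begin{bmatrix} 1 & -2 \\ 2 & 3 \\ 0 & -1 \end{bmatrix}$. Given an arbitrary $\mathbb{R}^2$ vector $\begin{bmatrix} a \\ b \end{bmatrix}$, we have a relationship between the matrix and the vector:
\[ \begin{bmatrix} 1 & -2 \\ 2 & 3 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} a - 2b \\ 2a + 3b \\ -b \end{bmatrix}. \]
Again, notice that the first entry of the resulting column vector, $a - 2b$, is the dot product of $\langle 1, -2 \rangle$ with $\langle a, b \rangle$.
Confirm for yourself that the rest follow the same rule.
From these past two examples, we can make the conclusion that each entry in the resulting column vector (that is, the vector that results from applying the linear transformation on the initial vector) is the dot product of its corresponding row in the matrix with the initial vector.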
Solution. From our observation about the relationship between the dot product and the matrix, it is not hard to see that for each row in the matrix, each number in that row is multiplied by its corresponding number in the column vector (from top to bottom), then the sum of those products becomes the corresponding entry in the final vector:
\[ \begin{bmatrix} 1 & 2 & 3 & 10 \\ 4 & 5 & 6 & 11 \\ 7 & 8 & 9 & \pi \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3 + 10 \cdot 4 \\ 4 \cdot 1 + 5 \cdot 2 + 6 \cdot 3 + 11 \cdot 4 \\ 7 \cdot 1 + 8 \cdot 2 + 9 \cdot 3 + \pi \cdot 4 \end{bmatrix} = \begin{bmatrix} 54 \\ 76 \\ 50 + 4\pi \end{bmatrix}. \]
This will help us describe arbitrary matrices of any dimensions without writing out all entries in a cumbersome fashion.
For example, if we had a $2 \times 3$ matrix $M$, the notation $M = [a_{ij}]$ means that
\[ M = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \]
Intuitively, this states that the sum of two linear transformations is also a linear transformation
represented by a matrix. We can obtain each entry of this matrix by adding its corresponding entry
in the matrix representing the first linear transformation to its corresponding entry in the matrix
representing the second linear transformation.
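Proof. Let $M_1, M_2$ represent two linear transformations. Let $\vec{v} = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_m \end{bmatrix}$. Since we are able to add functions together, we must have that $M_1\vec{v} + M_2\vec{v} = (M_1 + M_2)\vec{v}$. First, we simplify $M_1\vec{v} + M_2\vec{v}$:
\[
\begin{aligned}
M_1\vec{v} + M_2\vec{v} &= \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix} \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_m \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1m} \\ b_{21} & b_{22} & \cdots & b_{2m} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nm} \end{bmatrix} \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_m \end{bmatrix} \\
&= \begin{bmatrix} a_{11}r_1 + a_{12}r_2 + \ldots + a_{1m}r_m \\ a_{21}r_1 + a_{22}r_2 + \ldots + a_{2m}r_m \\ \vdots \\ a_{n1}r_1 + a_{n2}r_2 + \ldots + a_{nm}r_m \end{bmatrix} + \begin{bmatrix} b_{11}r_1 + b_{12}r_2 + \ldots + b_{1m}r_m \\ b_{21}r_1 + b_{22}r_2 + \ldots + b_{2m}r_m \\ \vdots \\ b_{n1}r_1 + b_{n2}r_2 + \ldots + b_{nm}r_m \end{bmatrix} \\
&= \begin{bmatrix} \sum_{j=1}^{m} (a_{1j} + b_{1j}) r_j \\ \sum_{j=1}^{m} (a_{2j} + b_{2j}) r_j \\ \vdots \\ \sum_{j=1}^{m} (a_{nj} + b_{nj}) r_j \end{bmatrix} \\
&= [a_{ij} + b_{ij}] \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_m \end{bmatrix}.
\end{aligned}
\]
Therefore, $M_1 + M_2 = [a_{ij} + b_{ij}]$.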
Now that we have established the sum of two linear transformations, consider the composition of
those transformations.
\[ T_2 \circ T_1(\vec{v}) = \vec{w}. \]
Note that when we take the composition of $T_2$ and $T_1$, it is required that the codomain of $T_1$ $\subseteq$ domain of $T_2$. Then we must have that the product of an ($n$ by $m$) matrix and an ($m$ by $l$) matrix results in an ($n$ by $l$) matrix.
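As stated, we let $M_1 = [b_{ij}]$ and $M_2 = [a_{ij}]$. Furthermore, let $\vec{v} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_l \end{bmatrix}$, where $\vec{v} \in \mathbb{R}^l$. First, we compute $M_1\vec{v}$, which is a legal operation because $T_1 : \mathbb{R}^l \mapsto \mathbb{R}^m$:
\[ M_1\vec{v} = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1l} \\ b_{21} & b_{22} & \cdots & b_{2l} \\ \vdots & \vdots & & \vdots \\ b_{m1} & b_{m2} & \cdots & b_{ml} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_l \end{bmatrix} = \begin{bmatrix} \sum_{k=1}^{l} b_{1k} x_k \\ \sum_{k=1}^{l} b_{2k} x_k \\ \vdots \\ \sum_{k=1}^{l} b_{mk} x_k \end{bmatrix} \]
Consider the entry $\sum_{j=1}^{m} a_{pj} \sum_{k=1}^{l} b_{jk} x_k$, for $p = 1, 2, \ldots, n$. Note that
\[
\begin{aligned}
\sum_{j=1}^{m} a_{pj} \left( \sum_{k=1}^{l} b_{jk} x_k \right) &= \sum_{j=1}^{m} \left( \sum_{k=1}^{l} a_{pj} b_{jk} x_k \right) \\
&= \sum_{j=1}^{m} \sum_{k=1}^{l} (a_{pj} b_{jk})\, x_k \\
&= \sum_{j=1}^{m} \left( (a_{pj} b_{j1})\, x_1 + (a_{pj} b_{j2})\, x_2 + \ldots + (a_{pj} b_{jl})\, x_l \right) \\
&= x_1 \sum_{k=1}^{m} a_{pk} b_{k1} + x_2 \sum_{k=1}^{m} a_{pk} b_{k2} + \ldots + x_l \sum_{k=1}^{m} a_{pk} b_{kl}.
\end{aligned}
\]
Given that $p$ goes from 1 to $n$, we can rewrite the enormous column vector as:
\[ M_2(M_1\vec{v}) = (M_2 M_1)\vec{v} = \begin{bmatrix} \sum_{k=1}^{m} a_{1k} b_{k1} & \sum_{k=1}^{m} a_{1k} b_{k2} & \cdots & \sum_{k=1}^{m} a_{1k} b_{kl} \\ \sum_{k=1}^{m} a_{2k} b_{k1} & \sum_{k=1}^{m} a_{2k} b_{k2} & \cdots & \sum_{k=1}^{m} a_{2k} b_{kl} \\ \vdots & \vdots & & \vdots \\ \sum_{k=1}^{m} a_{nk} b_{k1} & \sum_{k=1}^{m} a_{nk} b_{k2} & \cdots & \sum_{k=1}^{m} a_{nk} b_{kl} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_l \end{bmatrix} \]
Therefore, $M_2 M_1 = [c_{ij}]$ where $c_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj}$.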
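Solution.
1. $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \begin{bmatrix} -1 & 1 \\ -2 & 2 \\ -3 & 3 \end{bmatrix} = \begin{bmatrix} 1 \cdot -1 + 2 \cdot -2 + 3 \cdot -3 & 1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3 \\ 4 \cdot -1 + 5 \cdot -2 + 6 \cdot -3 & 4 \cdot 1 + 5 \cdot 2 + 6 \cdot 3 \end{bmatrix} = \begin{bmatrix} -14 & 14 \\ -32 & 32 \end{bmatrix}$.
2. $\begin{bmatrix} -1 & 1 \\ -2 & 2 \\ -3 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} = \begin{bmatrix} -1 \cdot 1 + 1 \cdot 4 & -1 \cdot 2 + 1 \cdot 5 & -1 \cdot 3 + 1 \cdot 6 \\ -2 \cdot 1 + 2 \cdot 4 & -2 \cdot 2 + 2 \cdot 5 & -2 \cdot 3 + 2 \cdot 6 \\ -3 \cdot 1 + 3 \cdot 4 & -3 \cdot 2 + 3 \cdot 5 & -3 \cdot 3 + 3 \cdot 6 \end{bmatrix} = \begin{bmatrix} 3 & 3 & 3 \\ 6 & 6 & 6 \\ 9 & 9 & 9 \end{bmatrix}$.
3. Taking the product $\begin{bmatrix} 1 & 2 \\ -5 & 6 \end{bmatrix} \begin{bmatrix} 7 & 4 \\ 2 & -1 \\ 0 & 3 \end{bmatrix}$ is not possible because each row of the first matrix is 2 entries long, while each column of the second matrix is 3 entries long. We can only take the product when each row of the first matrix has the same number of entries as each column of the second matrix.
4. $\begin{bmatrix} 7 & 4 \\ 2 & -1 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ -5 & 6 \end{bmatrix} = \begin{bmatrix} 7 \cdot 1 + 4 \cdot -5 & 7 \cdot 2 + 4 \cdot 6 \\ 2 \cdot 1 + -1 \cdot -5 & 2 \cdot 2 + -1 \cdot 6 \\ 0 \cdot 1 + 3 \cdot -5 & 0 \cdot 2 + 3 \cdot 6 \end{bmatrix} = \begin{bmatrix} -13 & 38 \\ 7 & -2 \\ -15 & 18 \end{bmatrix}$.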
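The four products above can be reproduced with a direct implementation of the entry rule (entry $(i, j)$ of the product is the sum over $k$ of $a_{ik} b_{kj}$). This is a minimal Python sketch with matrices stored as lists of rows; `mat_mul` is a name chosen here for illustration.

```python
def mat_mul(A, B):
    """Product AB of an n x m matrix A and an m x l matrix B, both lists of rows."""
    m = len(B)
    if any(len(row) != m for row in A):
        # Each row of A must be as long as each column of B.
        raise ValueError("row length of A must equal the number of rows of B")
    l = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(l)]
            for i in range(len(A))]

P1 = mat_mul([[1, 2, 3], [4, 5, 6]], [[-1, 1], [-2, 2], [-3, 3]])
P2 = mat_mul([[-1, 1], [-2, 2], [-3, 3]], [[1, 2, 3], [4, 5, 6]])
P4 = mat_mul([[7, 4], [2, -1], [0, 3]], [[1, 2], [-5, 6]])
```

Calling `mat_mul([[1, 2], [-5, 6]], [[7, 4], [2, -1], [0, 3]])` raises `ValueError`, matching part 3 of the solution.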
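Recall from the previous chapter that if we rotate a point $(x, y)$ by an angle of $\theta$ counter-clockwise about the origin, it goes to
\[ (x \cos\theta - y \sin\theta,\ x \sin\theta + y \cos\theta). \]
We can associate the point $(x, y)$ with a vector whose tail is at the origin and tip at the point $(x, y)$, which is $\langle x, y \rangle$.
Definition 8.2.19. The rotation matrix $R_\theta$ is defined to be the matrix that transforms the vector $\langle x, y \rangle$ into the vector rotated counter-clockwise by angle $\theta$:
\[ \underbrace{\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}}_{R_\theta} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \cos\theta - y \sin\theta \\ x \sin\theta + y \cos\theta \end{bmatrix}. \]
Proof. We can directly evaluate the product of the matrices, then apply sum and difference formulas of sine and cosine appropriately.
\[
\begin{aligned}
R_\alpha R_\theta &= \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \\
&= \begin{bmatrix} \cos\alpha \cos\theta - \sin\alpha \sin\theta & -\cos\alpha \sin\theta - \sin\alpha \cos\theta \\ \sin\alpha \cos\theta + \cos\alpha \sin\theta & -\sin\alpha \sin\theta + \cos\alpha \cos\theta \end{bmatrix} \\
&= \begin{bmatrix} \cos(\alpha + \theta) & -\sin(\alpha + \theta) \\ \sin(\alpha + \theta) & \cos(\alpha + \theta) \end{bmatrix} \\
&= R_{\alpha + \theta}.
\end{aligned}
\]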
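The identity for the product of two rotation matrices is easy to spot-check in floating point. The sketch below uses hypothetical helpers `rotation`, `mat_mul2`, and `close`; the sample angles are arbitrary.

```python
import math

def rotation(theta):
    """2 x 2 matrix rotating counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def mat_mul2(A, B):
    """Product of two 2 x 2 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-12):
    """Entrywise comparison that tolerates rounding error."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

alpha, theta = 0.7, 1.1
same = close(mat_mul2(rotation(alpha), rotation(theta)), rotation(alpha + theta))
```

Rotations about the origin also commute with each other, so swapping the two factors gives the same matrix here, unlike a general matrix product.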
which is different from the first matrix we got. Thus, while matrix multiplication is associative, it is
not commutative.
Moving on, we introduce some new vocabulary to describe special matrices and parts of matrices.
Definition 8.2.21. We define $M_n(\mathbb{R})$ to be the set of $n \times n$ matrices with elements in $\mathbb{R}$. A special name for an $n \times n$ matrix is a square matrix.
Definition 8.2.22. If $M = [a_{ij}]$ is an $n \times n$ square matrix, then the main diagonal consists of the entries of the form $a_{kk}$, where $k = 1, \ldots, n$.
Definition 8.2.23. Consider the linear transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ such that for all elements $\vec{\imath}_k$ in the standard basis, $T(\vec{\imath}_k) = \vec{\imath}_k$. Let $\vec{v}$ be a vector in $\mathbb{R}^n$ which can be written as $\langle a_1, a_2, \ldots, a_n \rangle$.
Using the various properties of a linear transformation as well as the observation that $\vec{v}$ can be written as a linear combination of the standard basis vectors, we have
\[
\begin{aligned}
T(\vec{v}) &= T\left( \sum_{k=1}^{n} a_k \vec{\imath}_k \right) \\
&= \sum_{k=1}^{n} T(a_k \vec{\imath}_k) \\
&= \sum_{k=1}^{n} a_k T(\vec{\imath}_k) \\
&= \sum_{k=1}^{n} a_k \vec{\imath}_k \\
&= \vec{v}.
\end{aligned}
\]
Now, we consider the matrix for this transformation. By looking at the column vectors, we can see that it will look like:
\[ \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 1 \end{bmatrix} \]
Definition 8.2.24. A matrix of this form is called the $n \times n$ identity matrix, denoted $I_n$. It takes a vector to itself. All entries in the main diagonal of $I_n$ have the value of 1, and all other entries have the value of 0.
Problem 8.2.25. Now, let us consider the product $I_n M$ for some $n \times m$ matrix $M$.
\[ \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix} \]
Prove $I_n M = M$.
Proof. If we let $M = [a_{ij}]$ and $I_n = [b_{ij}]$, then
\[ I_n M = \left[ \sum_{k=1}^{n} b_{ik} a_{kj} \right]. \]
However, we know that all of the values of $b_{ik}$ are 0 except for entries of the form $b_{ii}$, which are all 1. Thus, all of the values of $b_{ik} a_{kj}$ are 0, except for $b_{ii} a_{ij} = a_{ij}$. This implies that
\[ \sum_{k=1}^{n} b_{ik} a_{kj} = a_{ij}. \]
Therefore, $I_n M = [a_{ij}] = M$, i.e. $I_n M = M$.
We have demonstrated that not only does In represent the identity linear transformation, it also
serves as the identity for matrix multiplication!
Feel free to prove M In = M using an analogous line of reasoning as shown above.
We can observe how the identity matrix has appeared in places we have seen before. Consider $R_0$, the rotation matrix by an angle of 0. It is clear that rotating a vector by 0 will not change the vector at all. In other words, $R_0$ should be an identity matrix. This can be easily demonstrated:
\[ R_0 = \begin{bmatrix} \cos 0^\circ & -\sin 0^\circ \\ \sin 0^\circ & \cos 0^\circ \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2. \]
As with fields, the existence of an identity means that we should look for inverses as well.
Definition 8.2.26. The inverse of a matrix $M$ is denoted as $M^{-1}$, such that $M^{-1} M = I$, i.e. the identity matrix.
To get a sense of how matrix inverses can be applied in other parts of mathematics, consider the following system of equations:
\[ 2x + 3y = m, \qquad 3x + 5y = n. \]
Standard algebraic techniques yield $(x, y) = (5m - 3n, -3m + 2n)$ as the solution.
However, we can also express this system of equations as a matrix operation:
\[ \underbrace{\begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}}_{M} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} m \\ n \end{bmatrix} \]
Then our main objective is to find solutions to the equation $M\vec{x} = \vec{v}$, where $\vec{x}, \vec{v}$ are vectors and $M$ is a matrix of an appropriate dimension. This is where our inverse matrix, denoted as $M^{-1}$, fulfills its role as a multiplicative inverse to this equation:
\[
\begin{aligned}
M^{-1}(M\vec{x}) &= M^{-1}\vec{v} \\
(M^{-1}M)\vec{x} &= M^{-1}\vec{v} \\
I\vec{x} &= M^{-1}\vec{v} \\
\therefore\ \vec{x} &= M^{-1}\vec{v}.
\end{aligned}
\]
Suppose $\vec{x}, \vec{v}$ are in the same dimension, and $M$ is a square matrix. When does $M^{-1}$ exist?
For now, we will deal with the $2 \times 2$ square matrix. Let $M = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, and $M^{-1} = \begin{bmatrix} x & y \\ z & w \end{bmatrix}$. We should figure out what $x, y, z, w$ are in terms of $a, b, c, d$.
By our definition of the inverse, we must have:
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x & y \\ z & w \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \]
\[
\begin{aligned}
ax + bz &= 1, \\
ay + bw &= 0, \\
cx + dz &= 0, \\
cy + dw &= 1.
\end{aligned}
\]
Notice that the common denominator is ad − bc. Whether this quantity will equal 0 or not will
determine whether M has an inverse or not. This quantity will serve as an important trait of the
matrix.
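Definition 8.2.27. Given a $2 \times 2$ matrix $M = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, the determinant of $M$ is denoted as $\operatorname{Det}(M) = ad - bc$.
Definition 8.2.28. We will define the determinant of any identity matrix to be 1. In other words, $\operatorname{Det}(I_n) = 1$.
Lastly, we can factor out $\frac{1}{ad - bc}$ from $M^{-1}$:
\[ M^{-1} = \begin{bmatrix} \frac{d}{ad - bc} & \frac{-b}{ad - bc} \\ \frac{-c}{ad - bc} & \frac{a}{ad - bc} \end{bmatrix} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}. \]
Theorem 8.2.29
Given a $2 \times 2$ square matrix $M$, $M$ has an inverse if and only if $\operatorname{Det}(M) \neq 0$, such that
\[ M^{-1} = \frac{1}{\operatorname{Det}(M)} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}. \]
Therefore,
\[ \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}^{-1} \begin{bmatrix} m \\ n \end{bmatrix} = \begin{bmatrix} 5 & -3 \\ -3 & 2 \end{bmatrix} \begin{bmatrix} m \\ n \end{bmatrix} = \begin{bmatrix} 5m - 3n \\ -3m + 2n \end{bmatrix}, \]
which is indeed consistent with the solutions $(x, y) = (5m - 3n, -3m + 2n)$ that we had found for the system of equations using algebraic techniques.
Problem 8.2.30. Compute $\begin{bmatrix} 7 & 5 \\ -1 & 2 \end{bmatrix}^{-1}$. Use this to solve the equation $\begin{bmatrix} 7 & 5 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 7 \\ 11 \end{bmatrix}$.
Therefore $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \frac{2}{19} & \frac{-5}{19} \\ \frac{1}{19} & \frac{7}{19} \end{bmatrix} \begin{bmatrix} 7 \\ 11 \end{bmatrix} = \begin{bmatrix} \frac{-41}{19} \\ \frac{84}{19} \end{bmatrix}$, i.e. the solution $(x, y) = \left( \frac{-41}{19}, \frac{84}{19} \right)$.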
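The $2 \times 2$ inverse formula and its use for solving a linear system can be sketched in Python with exact fractions. The function names `inverse_2x2` and `solve_2x2` are assumptions made for this example, and integer entries are assumed.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of [[a, b], [c, d]] as (1/Det) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix has no inverse: determinant is 0")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def solve_2x2(M, v):
    """Solve M x = v by computing x = M^{-1} v."""
    inv = inverse_2x2(M)
    return [inv[0][0] * v[0] + inv[0][1] * v[1],
            inv[1][0] * v[0] + inv[1][1] * v[1]]

inv_first = inverse_2x2([[2, 3], [3, 5]])         # determinant 1
solution = solve_2x2([[7, 5], [-1, 2]], [7, 11])  # determinant 19
```

The determinant check mirrors Theorem 8.2.29: the function refuses to return an inverse exactly when $ad - bc = 0$.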
Problem 8.2.31. Using the formula for matrix inverse, find $R_{-\theta}$.
Solution. It is obvious that $R_{-\theta}$ is the inverse of $R_\theta$, since rotating by $\theta$ and then rotating by $-\theta$ will result in no rotation being done to the initial vector at all.
For the following examples, we are assuming that M is a 2 × 2 square matrix, since we have only
discussed the determinant and inverse of that kind of matrix so far.
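Solution. Using the same variables for $M$ and $N$ in the previous example, we have that $M + N = \begin{bmatrix} a + e & b + f \\ c + g & d + h \end{bmatrix}$, so $\operatorname{Det}(M + N) = (a + e)(d + h) - (b + f)(c + g) = (ad + ah + de + eh) - (bc + bg + cf + fg)$, which in general does not equal $\operatorname{Det}(M) + \operatorname{Det}(N) = ad - bc + eh - fg$, so $\operatorname{Det}(M + N) \neq \operatorname{Det}(M) + \operatorname{Det}(N)$.
We could approach this proof algebraically just like our previous examples. However, we can proceed with a much cleaner proof by using previously proven results:
Proof. As previously proven in an example, $\operatorname{Det}(M^{-1} M) = \operatorname{Det}(M^{-1}) \operatorname{Det}(M)$. But recall that $M^{-1} M = I$, so we have $\operatorname{Det}(I) = \operatorname{Det}(M^{-1}) \operatorname{Det}(M) \implies \operatorname{Det}(M^{-1}) \operatorname{Det}(M) = 1$, so therefore $\operatorname{Det}(M^{-1}) = (\operatorname{Det}(M))^{-1}$.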
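A concrete pair of matrices makes the contrast between the product and the sum visible. In the Python sketch below the matrices `M` and `N` are arbitrary sample values chosen for this illustration, not taken from the text.

```python
def det2(M):
    """Determinant ad - bc of the 2 x 2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = M
    return a * d - b * c

def mat_mul2(A, B):
    """Product of two 2 x 2 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add2(A, B):
    """Entrywise sum of two 2 x 2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

M = [[2, 3], [3, 5]]    # sample matrix, Det(M) = 1
N = [[1, 4], [2, -1]]   # sample matrix, Det(N) = -9

product_det = det2(mat_mul2(M, N))   # equals Det(M) * Det(N)
sum_det = det2(mat_add2(M, N))       # differs from Det(M) + Det(N) here
```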
Now that we have dealt with rotation matrices, we can also consider the transformation of
reflection. We seek to reflect a vector over some line. To get a notion of how we should define this,
take a line through the origin:
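[Figure: a line through the origin at angle $\theta$ to the $x$-axis, with the vector $\vec{v} : \langle x, y \rangle$ and its mirror image at angles $\theta - \alpha$ and $\theta + \alpha$.]
Let the angle of line $l$ with respect to the $x$-axis be $\theta$. Let $T_\theta$ be the linear transformation that reflects a vector across the line making an angle of $\theta$ with the $x$-axis.
Let the angle of the gap between the vector and the line be $\alpha$. The transformation will take a vector making an angle of $\theta - \alpha$ to one making an angle of $\theta + \alpha$ (since reflection copies the angle).
We wish to express the initial vector as $\langle \cos\beta, \sin\beta \rangle$, so let $\beta = \theta - \alpha$, which implies $\theta + \alpha = 2\theta - \beta$, so therefore, the vector $\langle \cos\beta, \sin\beta \rangle$ will reflect to the vector $\langle \cos(2\theta - \beta), \sin(2\theta - \beta) \rangle$ under $T_\theta$.
We can find the matrix that corresponds to $T_\theta$; denote it $H_\theta$. Under our definition, we must have:
\[ H_\theta \cdot \begin{bmatrix} \cos\beta \\ \sin\beta \end{bmatrix} = \begin{bmatrix} \cos(2\theta - \beta) \\ \sin(2\theta - \beta) \end{bmatrix}. \]
Definition 8.2.36. The reflection matrix, denoted by $H_\theta$, reflects a vector over a line given the angle $\theta$ between the line and the $x$-axis.
\[ \underbrace{\begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix}}_{H_\theta} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \cos 2\theta + y \sin 2\theta \\ x \sin 2\theta - y \cos 2\theta \end{bmatrix}. \]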
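Problem 8.2.37. Prove that the product of a reflection matrix and a rotation matrix is a reflection matrix.
Proof. Since matrix multiplication is not commutative, we must consider both cases: a reflection matrix multiplied by a rotation matrix, and the other way around.
\[
\begin{aligned}
H_\varphi R_\theta &= \begin{bmatrix} \cos 2\varphi & \sin 2\varphi \\ \sin 2\varphi & -\cos 2\varphi \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \\
&= \begin{bmatrix} \cos 2\varphi \cos\theta + \sin 2\varphi \sin\theta & -\cos 2\varphi \sin\theta + \sin 2\varphi \cos\theta \\ \sin 2\varphi \cos\theta - \cos 2\varphi \sin\theta & -\sin\theta \sin 2\varphi - \cos 2\varphi \cos\theta \end{bmatrix} \\
&= \begin{bmatrix} \cos(2\varphi - \theta) & \sin(2\varphi - \theta) \\ \sin(2\varphi - \theta) & -\cos(2\varphi - \theta) \end{bmatrix} \\
&= H_{\varphi - \theta/2}.
\end{aligned}
\]
\[
\begin{aligned}
R_\theta H_\varphi &= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \cos 2\varphi & \sin 2\varphi \\ \sin 2\varphi & -\cos 2\varphi \end{bmatrix} \\
&= \begin{bmatrix} \cos 2\varphi \cos\theta - \sin 2\varphi \sin\theta & \cos\theta \sin 2\varphi + \sin\theta \cos 2\varphi \\ \sin\theta \cos 2\varphi + \cos\theta \sin 2\varphi & \sin\theta \sin 2\varphi - \cos\theta \cos 2\varphi \end{bmatrix} \\
&= \begin{bmatrix} \cos(2\varphi + \theta) & \sin(2\varphi + \theta) \\ \sin(2\varphi + \theta) & -\cos(2\varphi + \theta) \end{bmatrix} \\
&= H_{\varphi + \theta/2}.
\end{aligned}
\]
In both cases the subscript is half of the angle appearing in the entries, because $H_\theta$ is written in terms of $\cos 2\theta$ and $\sin 2\theta$.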
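The two products can be spot-checked numerically. With the convention that the reflection matrix for a line at angle $t$ has entries $\cos 2t$ and $\sin 2t$, a product whose entries involve $2\varphi \mp \theta$ corresponds to a reflection line at angle $\varphi \mp \theta/2$. The Python sketch below uses hypothetical helpers and arbitrary sample angles.

```python
import math

def rotation(theta):
    """Rotation by theta radians counter-clockwise."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def reflection(theta):
    """Reflection over the line through the origin at angle theta to the x-axis."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return [[c, s], [s, -c]]

def mat_mul2(A, B):
    """Product of two 2 x 2 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-12):
    """Entrywise comparison that tolerates rounding error."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

phi, theta = 0.9, 0.4
reflect_then = close(mat_mul2(reflection(phi), rotation(theta)), reflection(phi - theta / 2))
rotate_then = close(mat_mul2(rotation(theta), reflection(phi)), reflection(phi + theta / 2))
```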
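Problem 8.2.38. Compute $H_{\frac{\pi}{4}} \cdot \begin{bmatrix} x \\ y \end{bmatrix}$ and $R_{\frac{\pi}{2}} \cdot \begin{bmatrix} x \\ y \end{bmatrix}$.
Solution.
\[ H_{\frac{\pi}{4}} \cdot \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \cos\frac{\pi}{2} & \sin\frac{\pi}{2} \\ \sin\frac{\pi}{2} & -\cos\frac{\pi}{2} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} y \\ x \end{bmatrix} \]
\[ R_{\frac{\pi}{2}} \cdot \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \cos\frac{\pi}{2} & -\sin\frac{\pi}{2} \\ \sin\frac{\pi}{2} & \cos\frac{\pi}{2} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -y \\ x \end{bmatrix} \]
Exercise 8.2.39. Describe what the matrix $\begin{bmatrix} 5 & 0 \\ 0 & 6 \end{bmatrix}$ does to a vector.
Exercise 8.2.40. Find a matrix that takes each vector $\vec{v}$ to $2\vec{v}$.
Problem 8.2.41. Rotate the set of all points of the form $\begin{bmatrix} t \\ t^2 \end{bmatrix}$ by $\frac{\pi}{4}$ radians counter-clockwise, then find the Cartesian equation which represents the set of those rotated points.
Solution. We have,
\[ R_{\frac{\pi}{4}} \begin{bmatrix} t \\ t^2 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix} \begin{bmatrix} t \\ t^2 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2} t - \frac{\sqrt{2}}{2} t^2 \\ \frac{\sqrt{2}}{2} t + \frac{\sqrt{2}}{2} t^2 \end{bmatrix}. \]
We now have the parametrization $x = \frac{\sqrt{2}}{2} t - \frac{\sqrt{2}}{2} t^2$, $y = \frac{\sqrt{2}}{2} t + \frac{\sqrt{2}}{2} t^2$, from which we can add these equations together and simplify to get the conic $x^2 + 2xy + y^2 + \sqrt{2}x - \sqrt{2}y = 0$.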
The objective of this problem is to demonstrate that we can use rotation matrices to rotate graphs. In that problem, we expressed the initial Cartesian equation such as $y = x^2$ as a parametrization, through which we can rotate it $\frac{\pi}{4}$ radians counter-clockwise, resulting in the oblique parabola $x^2 + 2xy + y^2 + \sqrt{2}x - \sqrt{2}y = 0$.
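One way to gain confidence in the rotated equation is to rotate a few sample points of $y = x^2$ and substitute them into the conic. The Python sketch below does this; the sample values of $t$ and the helper names are arbitrary choices for illustration.

```python
import math

def rotate(point, theta):
    """Rotate a point counter-clockwise about the origin by theta radians."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s, x * s + y * c)

def conic(x, y):
    """Left-hand side of x^2 + 2xy + y^2 + sqrt(2) x - sqrt(2) y = 0."""
    return x * x + 2 * x * y + y * y + math.sqrt(2) * x - math.sqrt(2) * y

samples = [-2.0, -0.5, 0.0, 1.0, 3.0]
# Each residual should vanish up to floating-point rounding.
residuals = [abs(conic(*rotate((t, t * t), math.pi / 4))) for t in samples]
```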
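Problem 8.2.42. Use a rotation matrix to rotate $x^2 - y^2 = 1$ by $\frac{\pi}{4}$ radians counter-clockwise.
Solution. Recall the identity $\sec^2\theta = 1 + \tan^2\theta$, from which we can deduce that the parametrization of this equation is $x = \sec t$, $y = \tan t$. We proceed with our transformation:
\[ R_{\frac{\pi}{4}} \begin{bmatrix} \sec t \\ \tan t \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix} \begin{bmatrix} \sec t \\ \tan t \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2} \sec t - \frac{\sqrt{2}}{2} \tan t \\ \frac{\sqrt{2}}{2} \sec t + \frac{\sqrt{2}}{2} \tan t \end{bmatrix}. \]
We have the parametric equations $x = \frac{\sqrt{2}}{2} \sec t - \frac{\sqrt{2}}{2} \tan t$ and $y = \frac{\sqrt{2}}{2} \sec t + \frac{\sqrt{2}}{2} \tan t$. A particularly clean method of finding the equation in terms of $x$ and $y$ is to multiply $x$ and $y$ together, resulting in a difference of squares, which ultimately simplifies to $xy = \frac{1}{2}$.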
We have been going over various transformations, but there is a special situation that we should
address. After applying some linear transformation, what if the vector only changes by a scalar
factor?
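Definition 8.2.43. A matrix $M$ has a non-zero eigenvector $\vec{v}$ with a scalar eigenvalue $\lambda$ if
\[ M\vec{v} = \lambda\vec{v}. \]
When we multiply the eigenvector $\langle 1, 0 \rangle$ by a scalar of 5, the resulting vector is equivalent to taking the matrix transformation of that vector! In other words, the direction of the vector remains the same after a transformation is applied.
Here are some more examples of eigenvectors with their eigenvalues for the given matrix above:
\[ \begin{bmatrix} 5 & 0 \\ 0 & 6 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 6 \end{bmatrix} = 6 \cdot \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]
\[ \begin{bmatrix} 5 & 0 \\ 0 & 6 \end{bmatrix} \begin{bmatrix} k \\ 0 \end{bmatrix} = \begin{bmatrix} 5k \\ 0 \end{bmatrix} = 5 \cdot \begin{bmatrix} k \\ 0 \end{bmatrix} \]
In general, according to our definition, the eigenvector exists iff it is non-zero. When do we know this is the case? Consider our equation,
\[ M\vec{v} = \lambda\vec{v} \]
Example 8.2.44
Find all eigenvectors and eigenvalues for the matrix $\begin{bmatrix} 5 & 0 \\ 0 & 6 \end{bmatrix}$.
Solution. Note that $M - \lambda I = \begin{bmatrix} 5 - \lambda & 0 \\ 0 & 6 - \lambda \end{bmatrix}$, therefore $\operatorname{Det}(M - \lambda I) = (5 - \lambda)(6 - \lambda)$, so the values of $\lambda$ which force $\operatorname{Det}(M - \lambda I)$ to equal 0 are $\lambda = 5, 6$. We consider each value separately: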
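Solution. Note that $\operatorname{Det}(M - \lambda I) = \operatorname{Det}\begin{bmatrix} 2 - \lambda & -1 \\ -4 & 5 - \lambda \end{bmatrix} = (2 - \lambda)(5 - \lambda) - (-1)(-4) = \lambda^2 - 7\lambda + 6 = (\lambda - 1)(\lambda - 6)$.
• Case $\lambda = 1$:
\[ \begin{bmatrix} 2 & -1 \\ -4 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2x - y \\ -4x + 5y \end{bmatrix} = 1 \cdot \begin{bmatrix} x \\ y \end{bmatrix} \]
leaving us with the system of equations $2x - y = x$, $-4x + 5y = y$, whose solution is $y = x$. Eigenvectors of $\lambda = 1$ are of the form $\begin{bmatrix} x \\ x \end{bmatrix}$, i.e. scalar multiples of $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$.
• Case $\lambda = 6$:
\[ \begin{bmatrix} 2 & -1 \\ -4 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2x - y \\ -4x + 5y \end{bmatrix} = 6 \cdot \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 6x \\ 6y \end{bmatrix} \]
The system of equations is $2x - y = 6x$ and $-4x + 5y = 6y$, so the solution is $y = -4x$. Eigenvectors of $\lambda = 6$ are of the form $\begin{bmatrix} x \\ -4x \end{bmatrix}$, i.e. scalar multiples of $\begin{bmatrix} 1 \\ -4 \end{bmatrix}$.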
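The computation above can be mirrored in a few lines of Python. Expanding $\operatorname{Det}(M - \lambda I)$ for a $2 \times 2$ matrix gives $\lambda^2 - (a + d)\lambda + (ad - bc)$, which is the form this sketch solves; the function names are hypothetical, and the sketch deliberately handles only real eigenvalues.

```python
import math

def eigenvalues_2x2(M):
    """Real roots of lambda^2 - (a + d) lambda + (ad - bc) = 0, smaller root first."""
    (a, b), (c, d) = M
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return ((trace - root) / 2, (trace + root) / 2)

def is_eigenpair(M, v, lam, tol=1e-9):
    """Check that v is non-zero and that M v equals lam * v entry by entry."""
    if all(x == 0 for x in v):
        return False
    Mv = [sum(m * x for m, x in zip(row, v)) for row in M]
    return all(abs(p - lam * x) <= tol for p, x in zip(Mv, v))

M = [[2, -1], [-4, 5]]
values = eigenvalues_2x2(M)   # roots of lambda^2 - 7 lambda + 6
```

Any non-zero scalar multiple of an eigenvector passes `is_eigenpair` with the same eigenvalue, which matches the "scalar multiples of" wording in the two cases.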
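Exercise 8.2.46. Find eigenvectors and eigenvalues of the matrix $\begin{bmatrix} 2 & 1 \\ -4 & -3 \end{bmatrix}$.
Exercise 8.2.47. Find eigenvectors and eigenvalues of the matrix $\begin{bmatrix} 1 & 3 \\ -4 & -6 \end{bmatrix}$.
We similarly solve a system of equations to get $y = xi$, $x = -yi$, implying that our eigenvectors are of the form $\begin{bmatrix} x \\ xi \end{bmatrix}$, i.e. scalar multiples of $\begin{bmatrix} 1 \\ i \end{bmatrix}$.
• Case $\lambda = 1$:
\[ \begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} \]
This gives the system of equations
\[
\begin{aligned}
x \cos 2\theta + y \sin 2\theta &= x, \\
x \sin 2\theta - y \cos 2\theta &= y.
\end{aligned}
\]
After simplification, we end up with $y = \frac{1 - \cos 2\theta}{\sin 2\theta}\, x$, which resembles the half-angle formula for tangent, from which we can deduce that $y = x \tan\theta$, so our eigenvectors are of the form $\begin{bmatrix} x \\ x \tan\theta \end{bmatrix}$, i.e. $x \begin{bmatrix} 1 \\ \tan\theta \end{bmatrix}$. As $x$ is a scalar which ranges over all of $\mathbb{R}$, we are allowed to make the substitution $x \longrightarrow x \cos\theta$, so we can express the form of the eigenvector as $x \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}$, i.e. all scalar multiples of $\begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}$.
• Case $\lambda = -1$:
\[ \begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -x \\ -y \end{bmatrix} \]
Similarly, we solve a system of equations to get the solution $y = -\frac{1 + \cos 2\theta}{\sin 2\theta}\, x = -x \cot\theta$, so our eigenvectors can be expressed in the form $x \begin{bmatrix} 1 \\ -\cot\theta \end{bmatrix}$, and using the valid substitution $x \longrightarrow x \sin\theta$, we have a cleaner form $x \begin{bmatrix} \sin\theta \\ -\cos\theta \end{bmatrix}$, i.e. all scalar multiples of $\begin{bmatrix} \sin\theta \\ -\cos\theta \end{bmatrix}$.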
Definition 8.2.50. A set of vectors $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n$ is linearly independent if whenever $a_1\vec{v}_1 + a_2\vec{v}_2 + \ldots + a_n\vec{v}_n = \vec{0}$, then $a_1 = a_2 = \ldots = a_n = 0$.
Otherwise, the set of vectors is linearly dependent.
Problem 8.2.51. Determine if the vectors $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$, $\begin{bmatrix} 4 \\ -1 \end{bmatrix}$ are linearly independent or not.
Theorem 8.2.54
Any three vectors in $\mathbb{R}^2$ are linearly dependent.
There exists a solution to this whenever the matrix $\begin{bmatrix} a & c \\ b & d \end{bmatrix}$ has an inverse, i.e. the determinant $ad - bc \neq 0$. If we have a solution to this equation, then we’re done, since the sum of some scalar multiples of the first two vectors will indeed result in the third vector.
Otherwise, assume that $ad - bc = 0$. Without loss of generality, let $a \neq 0$. Then we have $ad = bc$, so $d = \frac{bc}{a}$, so our first two vectors are
\[ \begin{bmatrix} a \\ b \end{bmatrix}, \begin{bmatrix} c \\ \frac{bc}{a} \end{bmatrix}. \]
We have shown that the three vectors are linearly dependent in either case, and so we are done.
It turns out that this theorem is true in general (i.e. that any $n + 1$ vectors in $\mathbb{R}^n$ are linearly dependent).
Any set of $n$ linearly independent vectors in $\mathbb{R}^n$ is called a basis for $\mathbb{R}^n$. This is a generalization of the standard basis vectors that we had defined earlier in the chapter.
Recall that we were able to express any vector as a linear combination of the standard basis vectors: $\langle a_1, a_2, a_3, \ldots, a_n \rangle = a_1\vec{\imath}_1 + a_2\vec{\imath}_2 + a_3\vec{\imath}_3 + \ldots + a_n\vec{\imath}_n$.
This can be extended to any basis in general; any vector in $\mathbb{R}^n$ can be written as a linear combination of vectors in the basis.
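Problem 8.2.55. Why can’t $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, and $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$ be a basis in $\mathbb{R}^3$?
Solution. First, it is easy to observe that $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$, so they are not linearly independent.
Furthermore, we cannot express the vector $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ as any linear combination of these vectors because their third entries are all 0. This contradicts the fact that any vector in $\mathbb{R}^3$ can be written as a linear combination of vectors in the basis for $\mathbb{R}^3$.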
We shift our discussion of matrices back to the determinant. It turns out that it has far greater
significance than one may initially conceive. Consider a triangle made up of vectors:
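[Figure: triangle formed by $\vec{v} = \langle a, b \rangle$ and $\vec{w} = \langle c, d \rangle$, with angle $\theta$ between them.]
Theorem 8.2.56
The area of a triangle defined by the vectors $\vec{v} = \langle a, b \rangle$ and $\vec{w} = \langle c, d \rangle$ is:
\[ \frac{1}{2} |ad - bc| = \frac{1}{2} \left| \operatorname{Det}\begin{bmatrix} a & c \\ b & d \end{bmatrix} \right|. \]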
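Proof. Let $\theta$ be the angle between $\vec{v}$ and $\vec{w}$. Then $\cos\theta = \frac{\vec{v} \cdot \vec{w}}{\|\vec{v}\|\|\vec{w}\|}$. From this, we can derive that $\sin\theta = \sqrt{1 - \left( \frac{\vec{v} \cdot \vec{w}}{\|\vec{v}\|\|\vec{w}\|} \right)^2}$. By Theorem 6.6.1, we can apply the area formula $\frac{1}{2} ab \sin C$ (for any two sides $a, b$ with included angle $C$) to our vectors which make up the triangle to get our desired result:
\[
\begin{aligned}
\text{Area} &= \frac{1}{2} \|\vec{v}\|\|\vec{w}\| \sin\theta \\
&= \frac{1}{2} \sqrt{\|\vec{v}\|^2 \|\vec{w}\|^2 - (\vec{v} \cdot \vec{w})^2} \\
&= \frac{1}{2} \sqrt{(a^2 + b^2)(c^2 + d^2) - (ac + bd)^2} \\
&= \frac{1}{2} \sqrt{(ac)^2 + (ad)^2 + (bc)^2 + (bd)^2 - (ac)^2 - 2(ac)(bd) - (bd)^2} \\
&= \frac{1}{2} \sqrt{(ad)^2 + (bc)^2 - 2(ad)(bc)} \\
&= \frac{1}{2} \sqrt{(ad - bc)^2} \\
&= \frac{1}{2} |ad - bc| \\
&= \frac{1}{2} \left| \operatorname{Det}\begin{bmatrix} a & c \\ b & d \end{bmatrix} \right|.
\end{aligned}
\]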
(c, d)
(a + c, b + d)
θ
(a, b)
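Theorem 8.2.57
The area of a parallelogram formed by vectors $\langle a, b \rangle$ and $\langle c, d \rangle$ is
\[ \left|\operatorname{Det}\begin{pmatrix} a & c \\ b & d \end{pmatrix}\right|. \]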
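[Figure: a triangle with vertices $(a, b)$, $(c, d)$, and $(e, f)$; the two sides from $(a, b)$ are labeled with the vectors $\langle c - a, d - b \rangle$ and $\langle e - a, f - b \rangle$.]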
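Proof. Based on our coordinates $(a, b)$, $(c, d)$, and $(e, f)$, the two vectors with common tail on $(a, b)$ are $\langle c - a, d - b \rangle$ and $\langle e - a, f - b \rangle$. We can then apply Theorem 8.2.56:
\begin{align*}
\text{Area} &= \frac{1}{2}\left|\operatorname{Det}\begin{pmatrix} c - a & e - a \\ d - b & f - b \end{pmatrix}\right| \\
&= \frac{1}{2}|(c - a)(f - b) - (e - a)(d - b)| \\
&= \frac{1}{2}|cf - bc - af + ab - ed + eb + ad - ab| \\
&= \frac{1}{2}|(ad + cf + eb) - (bc + de + fa)|.
\end{align*}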
This theorem seems quite complex, but it is actually quite simple once you see the pattern.
What we do is list out the coordinates, and repeat the first point (which would be (a, b) in this
example) at the end as well:
a b
c d
e f
a b
Take the product of each diagonal pair and add them all up. Here, we would have ad + cf + eb.
Then, draw diagonal lines from b to c, d to e, and f to a:
a b
c d
e f
a b
Do the same thing as last time: take the product of each diagonal pair and add all the products.
So we have the quantity bc + de + f a.
We have found the two quantities used in the formula $\frac{1}{2}|(ad + cf + eb) - (bc + de + fa)|$ we discovered, so we can compute the area now.
This method of listing out the coordinates and taking diagonals is very helpful for applying the Shoelace Theorem correctly without having to completely memorize it.
This theorem can be proved by induction on the number of sides $n$ of the polygon. At the inductive step, we add one more point outside the $n$-sided polygon. If we connect the two closest points on the polygon to this new point, we end up with an $(n + 1)$-sided polygon.
We can then compute the area of this by adding the existing area of the n-sided polygon (which
would be the assumption made by the inductive hypothesis) to the area of the triangle formed by
the new point and the two closest points.
In general, to use the generalized Shoelace Theorem properly, we must list out the coordinates in
either clockwise or counter-clockwise order (the direction doesn’t matter, as long as the coordinates
are in order). Do not forget to repeat the first pair of coordinates at the end too.
We can then apply our method of drawing diagonal lines to arrive at the expression given by the
theorem.
Problem 8.2.60. Find the area of the quadrilateral with points at (1, 1), (9, 0), (2, 4), and (7, 6).
Solution. We set up our list of coordinates in clockwise order (or counter-clockwise order; it doesn’t
matter because of the absolute value sign):
1 1
2 4
7 6
9 0
1 1
From the solid diagonal lines, we end up with the quantity 1 · 4 + 2 · 6 + 7 · 0 + 9 · 1 = 25.
From the dashed diagonal lines, we have 1 · 2 + 4 · 7 + 6 · 9 + 0 · 1 = 84.
Thus, our area is $\frac{1}{2}|25 - 84| = \frac{59}{2}$.
Note that drawing these diagonal lines makes the diagram resemble shoelaces - hence, the name
of the theorem.
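The diagonal bookkeeping above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `shoelace_area` is hypothetical, and it assumes the vertices are already listed in clockwise or counter-clockwise order.

```python
from fractions import Fraction

def shoelace_area(points):
    """Area of a polygon whose vertices are listed in order around the boundary."""
    n = len(points)
    down = 0  # sum of the products x_i * y_(i+1)
    up = 0    # sum of the products y_i * x_(i+1)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrapping around repeats the first point
        down += x1 * y2
        up += y1 * x2
    return Fraction(abs(down - up), 2)

quad = [(1, 1), (2, 4), (7, 6), (9, 0)]
print(shoelace_area(quad))  # 59/2
```

Using `Fraction` keeps the answer exact, so the result can be compared directly with the hand computation.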
Lastly, we move on to row and column operations on matrices. This will enable us to compute
the determinant of a 3 × 3 matrix.
Consider the following matrices:
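\[ A = \begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0 \\ 0 & b \end{pmatrix}, \quad C = \begin{pmatrix} 1 & c \\ 0 & 1 \end{pmatrix}, \quad D = \begin{pmatrix} 1 & 0 \\ d & 1 \end{pmatrix}, \quad E = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \]
for any $a, b, c, d \in \mathbb{R}$. Let $M = \begin{pmatrix} r & s \\ t & u \end{pmatrix}$ represent a general $2 \times 2$ matrix.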
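Note that $\operatorname{Det}(A) = a$ and $\operatorname{Det}(B) = b$. Since the determinant of a product is the product of the determinants, $\operatorname{Det}(AM) = \operatorname{Det}(MA) = a \cdot \operatorname{Det}(M)$ and $\operatorname{Det}(BM) = \operatorname{Det}(MB) = b \cdot \operatorname{Det}(M)$. Multiplying $M$ by $A$ or $B$ on the left scales a row of $M$, and multiplying on the right scales a column. Therefore, if we multiply a row or column of a matrix $M$ by some scalar, then the determinant of $M$ is also multiplied by that scalar.
Furthermore, $\operatorname{Det}(C) = \operatorname{Det}(D) = 1$, so $\operatorname{Det}(CM) = \operatorname{Det}(MC) = \operatorname{Det}(DM) = \operatorname{Det}(MD) = 1 \cdot \operatorname{Det}(M) = \operatorname{Det}(M)$. Multiplying by $C$ or $D$ adds a multiple of one row (or column) of $M$ to another, so adding a multiple of some row to another row, or a multiple of some column to another column, does not affect the determinant.
Lastly, $\operatorname{Det}(E) = -1$, and multiplying by $E$ swaps the two rows (or the two columns) of $M$, which means that swapping a pair of rows or a pair of columns flips the sign of the determinant.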
This result can be generalized to higher n × n matrices. However, without proof, we shall
summarize the information above in a general fashion:
1. When any row or any column is multiplied by some scalar k, the determinant of M is
also multiplied by k.
2. A multiple of some row added to another row, or a multiple of some column added to
another column, does not change the determinant of M .
3. When a pair of rows or a pair of columns is swapped, the sign of the determinant is flipped
(positive to negative, and vice-versa).
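As a quick sanity check of the three rules, the following sketch applies each operation to one concrete $2 \times 2$ matrix. The helper `det2` and the sample matrix are illustrative choices, not part of the text.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix given as [[r, s], [t, u]]."""
    (r, s), (t, u) = m
    return r * u - s * t

M = [[3, 5], [2, 7]]                       # Det(M) = 11
scaled = [[4 * 3, 4 * 5], [2, 7]]          # 4*R1 -> R1
added = [[3, 5], [2 + 6 * 3, 7 + 6 * 5]]   # 6*R1 + R2 -> R2
swapped = [[2, 7], [3, 5]]                 # R1 <-> R2

print(det2(M), det2(scaled), det2(added), det2(swapped))  # 11 44 11 -11
```

Scaling multiplies the determinant by 4, adding a multiple of a row leaves it unchanged, and swapping flips its sign.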
Theorem 8.2.62
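Consider a $3 \times 3$ matrix with arbitrary entries, as shown:
\[ \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}. \]
Its determinant is $aei + bfg + cdh - (ceg + afh + bdi)$.
Consider a matrix of the form
\[ M = \begin{pmatrix} r & s & t \\ 0 & u & v \\ 0 & 0 & w \end{pmatrix}. \]
We call this upper triangular form, where all entries below the non-zero main diagonal are zeroes.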
Lemma 8.2.64
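If matrix $M$ is in upper triangular form, then $\operatorname{Det}(M) = ruw$.
Denoting the first row, second row, and third row as $R_1$, $R_2$, and $R_3$ respectively, note that we can perform the row operation $-(u^{-1}v)R_3 + R_2 \to R_2$ to eliminate only the $u^{-1}v$ term from the second row, and the row operation $-(r^{-1}t)R_3 + R_1 \to R_1$ to eliminate only the $r^{-1}t$ term from the first row. Finally, we can do a row operation $-(r^{-1}s)R_2 + R_1 \to R_1$ to eliminate only the $r^{-1}s$ term from the first row. We are left with the identity matrix $I_3$. We have established that
\[ \operatorname{Det}\begin{pmatrix} 1 & r^{-1}s & r^{-1}t \\ 0 & 1 & u^{-1}v \\ 0 & 0 & 1 \end{pmatrix} = \operatorname{Det}(I_3), \]
since these row operations do not affect the overall determinant of the matrix. Thus we can conclude the proof of this lemma:
\begin{align*}
\operatorname{Det}\begin{pmatrix} r & s & t \\ 0 & u & v \\ 0 & 0 & w \end{pmatrix}
&= \operatorname{Det}\begin{pmatrix} r & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & w \end{pmatrix} \operatorname{Det}\begin{pmatrix} 1 & r^{-1}s & r^{-1}t \\ 0 & 1 & u^{-1}v \\ 0 & 0 & 1 \end{pmatrix} \\
&= \operatorname{Det}\begin{pmatrix} r & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & w \end{pmatrix} \operatorname{Det}(I_3) = \operatorname{Det}\left(\begin{pmatrix} r & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & w \end{pmatrix} \cdot I_3\right) \\
&= \operatorname{Det}\begin{pmatrix} r & 0 & 0 \\ 0 & u & 0 \\ 0 & 0 & w \end{pmatrix} = ruw.
\end{align*}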
This simplifies to
\[ \frac{1}{a}\left(a^2ei - abdi - aceg + bcdg - a^2hf + acdh + abfg - bcdg\right) = aei - bdi - ceg - ahf + cdh + bfg. \]
This formula is certainly intimidating, but there is a visual memorization technique that is similar
to the method we used to invoke the Shoelace theorem.
First, consider this augmented matrix with the first two columns appended at the end.
\[ \begin{pmatrix} a & b & c & a & b \\ d & e & f & d & e \\ g & h & i & g & h \end{pmatrix} \]
Consider the three diagonals going from top-left to bottom-right, and take the sum of the
products of the three entries in each of the diagonals, i.e. aei + bf g + cdh, as shown:
[Figure: three copies of the augmented matrix, each highlighting one of the diagonals $aei$, $bfg$, and $cdh$.]
Then consider the three diagonals going the other way, i.e. going from top-right to bottom-left,
and, like above, take the sum of the products of the three entries in each of the diagonals, i.e.
ceg + af h + bdi, as shown:
[Figure: three copies of the augmented matrix, each highlighting one of the diagonals $ceg$, $afh$, and $bdi$.]
Then our determinant is simply the first sum minus the second sum, i.e. aei + bf g + cdh − (ceg +
af h + bdi), which is what the theorem states.
Problem 8.2.65. Compute the determinant of $\begin{pmatrix} 2 & 3 & 6 \\ 1 & -4 & 3 \\ 2 & 5 & 9 \end{pmatrix}$ using the method shown above.
There is also another way to evaluate the 3 × 3 determinant. With the matrix
\[ \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}, \]
consider the first row, which is a, b, and c. For each entry, imagine that you delete the row and
column of the matrix which contains that entry you are considering. You will have exactly four
entries left that have not been deleted. Form a 2 × 2 matrix out of those four entries in the exact
same order. The determinant of this resulting 2 × 2 matrix is called a minor.
For example, the minor for a would be:
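\[ \operatorname{Det}\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} \implies \operatorname{Det}\begin{pmatrix} e & f \\ h & i \end{pmatrix}. \]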
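We can find the determinant of the $3 \times 3$ matrix by considering one whole row or column, and for each entry, finding the product of that entry and its minor. The determinant is then the alternating sum of those products, where the entry in row $i$ and column $j$ carries the sign $(-1)^{i+j}$. Along the first or third row or column, the signs are $+, -, +$ (i.e. you add the first product, subtract the second, and add the third); along the second row or column, they are $-, +, -$.
For example, we can evaluate the minors on the first row to get:
\[ \operatorname{Det}\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} = a \cdot \operatorname{Det}\begin{pmatrix} e & f \\ h & i \end{pmatrix} - b \cdot \operatorname{Det}\begin{pmatrix} d & f \\ g & i \end{pmatrix} + c \cdot \operatorname{Det}\begin{pmatrix} d & e \\ g & h \end{pmatrix}. \]
We can also find the same determinant by evaluating minors on the second column:
\[ \operatorname{Det}\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} = -b \cdot \operatorname{Det}\begin{pmatrix} d & f \\ g & i \end{pmatrix} + e \cdot \operatorname{Det}\begin{pmatrix} a & c \\ g & i \end{pmatrix} - h \cdot \operatorname{Det}\begin{pmatrix} a & c \\ d & f \end{pmatrix}. \]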
Problem 8.2.66. Solve Problem 8.2.65 using the minors method and confirm that you got the
same answer.
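To compare the two techniques mechanically, here is a small sketch that evaluates a $3 \times 3$ determinant by the diagonal rule and by minors, with the sign of the entry in row $i$, column $j$ taken as $(-1)^{i+j}$. The function names and the sample matrix are illustrative; the sample is deliberately not the matrix from Problem 8.2.65.

```python
def det2(p, q, r, s):
    """Determinant of the 2 x 2 matrix [[p, q], [r, s]]."""
    return p * s - q * r

def det3_diagonals(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a*e*i + b*f*g + c*d*h - (c*e*g + a*f*h + b*d*i)

def det3_first_row(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * det2(e, f, h, i) - b * det2(d, f, g, i) + c * det2(d, e, g, h)

def det3_second_column(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    # the second column carries the signs -, +, -
    return -b * det2(d, f, g, i) + e * det2(a, c, g, i) - h * det2(a, c, d, f)

sample = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
print(det3_diagonals(sample), det3_first_row(sample), det3_second_column(sample))  # 39 39 39
```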
Example 8.2.67
Use matrices to solve the following system of equations:
2x + 3y + 4z = 4,
x − 2y + 5z = 1,
3x + 6y − z = 2.
Solution. We read off the coefficients of x, y, and z from each of the three equations into a 3 × 3
matrix. However, we also read the constants 4, 1, and 2 into a separate column appended at the end
of the matrix; this addition of an ‘extra column’ results in an augmented matrix, as shown:
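\[ \left(\begin{array}{ccc|c} 2 & 3 & 4 & 4 \\ 1 & -2 & 5 & 1 \\ 3 & 6 & -1 & 2 \end{array}\right) \]
If we convert this to upper triangular form, then we have a matrix in the form
\[ \left(\begin{array}{ccc|c} a & b & c & r \\ 0 & d & e & s \\ 0 & 0 & f & u \end{array}\right), \]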
At this point, we have that $-40z = -31 \implies z = \frac{31}{40}$ from the last row. We then plug this value into $7y - 6z = 2$ (second row), yielding $y = \frac{19}{20}$. Lastly, we have $x - 2y + 5z = 1$ from the first row, and plugging in the respective values of $y$ and $z$ gives $x = -\frac{39}{40}$. Therefore our solution is
\[ (x, y, z) = \left(-\frac{39}{40}, \frac{19}{20}, \frac{31}{40}\right). \]
Alternatively, we could keep applying our row operations until we get to a form
\[ \left(\begin{array}{ccc|c} 1 & 0 & 0 & a \\ 0 & 1 & 0 & b \\ 0 & 0 & 1 & c \end{array}\right) \]
for arbitrary a, b, c ∈ R, from which we can directly read off our values for x, y, z.
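The row-reduction procedure can be sketched as follows. This is a minimal illustration, assuming the system has a unique solution; the function name `solve` is hypothetical, and exact fractions are used so the output matches the hand computation.

```python
from fractions import Fraction

def solve(aug):
    """Row-reduce an n x (n+1) augmented matrix and read off the solution."""
    rows = [[Fraction(v) for v in row] for row in aug]
    n = len(rows)
    for col in range(n):
        # swap in a row with a non-zero pivot (assumes a unique solution exists)
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # scale the pivot row so that the pivot becomes 1
        p = rows[col][col]
        rows[col] = [v / p for v in rows[col]]
        # subtract multiples of the pivot row from every other row
        for r in range(n):
            if r != col:
                k = rows[r][col]
                rows[r] = [v - k * w for v, w in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

x, y, z = solve([[2, 3, 4, 4], [1, -2, 5, 1], [3, 6, -1, 2]])
print(x, y, z)  # -39/40 19/20 31/40
```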
We now generalize the concept of eigenvectors and eigenvalues to 3 × 3 matrices. This is very
complicated for general matrices, so we limit our discussion of eigenvalues and eigenvectors to 3 × 3
matrices in upper triangular form.
Example 8.2.68
Find eigenvectors and eigenvalues for
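\[ M = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{pmatrix}. \]
Solution. If $\lambda$ is an eigenvalue of this matrix, then $\operatorname{Det}(M - \lambda I) = 0$. By Lemma 8.2.64, we can evaluate this as
\[ \operatorname{Det}\begin{pmatrix} 1 - \lambda & 2 & 3 \\ 0 & 4 - \lambda & 5 \\ 0 & 0 & 6 - \lambda \end{pmatrix} = (1 - \lambda)(4 - \lambda)(6 - \lambda) = 0. \]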
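• Case $\lambda = 1$:
Writing the equation out, we have
\[ \begin{pmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{pmatrix}\begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} a + 2b + 3c \\ 4b + 5c \\ 6c \end{pmatrix} = \begin{pmatrix} a \\ b \\ c \end{pmatrix}, \]
which gives the equations
a + 2b + 3c = a,
4b + 5c = b,
6c = c.
The third equation implies $c = 0$, the second equation then gives $b = 0$, and the first equation reduces to $a = a$, which is true for any value of $a$. Therefore the eigenvectors are scalar multiples of $\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$.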
• Case λ = 4:
Similarly, we have the equations
a + 2b + 3c = 4a,
4b + 5c = 4b,
6c = 4c.
Like before, the third equation implies $c = 0$. Then the second equation simplifies to $4b = 4b$, i.e. $b = b$, which is true for any value of $b$.
Furthermore, since $c = 0$, the first equation $a + 2b + 3c = 4a$ reduces to $2b = 3a$, i.e. $a = \frac{2}{3}b$, therefore the eigenvectors are of the form $\begin{pmatrix} \frac{2}{3}b \\ b \\ 0 \end{pmatrix} = b\begin{pmatrix} \frac{2}{3} \\ 1 \\ 0 \end{pmatrix}$, from which we can make the valid substitution $b \to 3b$ to get the cleaner form $b\begin{pmatrix} 2 \\ 3 \\ 0 \end{pmatrix}$, i.e. scalar multiples of $\begin{pmatrix} 2 \\ 3 \\ 0 \end{pmatrix}$.
• Case λ = 6:
a + 2b + 3c = 6a,
4b + 5c = 6b,
6c = 6c.
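The third equation implies $c = c$ for any value of $c$. The second equation simplifies to $b = \frac{5}{2}c$ and the first equation simplifies to $a = \frac{8}{5}c$, therefore our eigenvectors are of the form $\begin{pmatrix} \frac{8}{5}c \\ \frac{5}{2}c \\ c \end{pmatrix} = c\begin{pmatrix} \frac{8}{5} \\ \frac{5}{2} \\ 1 \end{pmatrix}$.
Then we can substitute $c \to 10c$ to get $c\begin{pmatrix} 16 \\ 25 \\ 10 \end{pmatrix}$, i.e. scalar multiples of $\begin{pmatrix} 16 \\ 25 \\ 10 \end{pmatrix}$.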
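Solution. We know that $\operatorname{Det}(M - \lambda I) = 0$ for the given matrix $M$ and possible eigenvalues $\lambda$. Then, we have
\[ \operatorname{Det}\begin{pmatrix} 2 - \lambda & 1 & 4 \\ 0 & -3 - \lambda & 2 \\ 0 & 0 & 1 - \lambda \end{pmatrix} = (2 - \lambda)(-3 - \lambda)(1 - \lambda) = 0. \]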
We have the roots λ = 2, −3, 1. We check each case:
1. For λ = 2, we have
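\[ \begin{pmatrix} 2 & 1 & 4 \\ 0 & -3 & 2 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = 2\begin{pmatrix} x \\ y \\ z \end{pmatrix}. \]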
We are left with the system of equations
2x + y + 4z = 2x,
−3y + 2z = 2y,
z = 2z.
The last equation implies $z = 0$, from which we determine $y = 0$ as well (from the second equation), and in the first equation we find $x = x$ for any $x$, therefore our eigenvectors for $\lambda = 2$ are of the form $\begin{pmatrix} x \\ 0 \\ 0 \end{pmatrix}$, i.e. scalar multiples of $\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$.
2. For λ = −3, we similarly set up a system of equations:
2x + y + 4z = −3x,
−3y + 2z = −3y,
z = −3z.
We find $z = 0$, $y = y$, and $x = -\frac{1}{5}y$, therefore our eigenvectors are $\begin{pmatrix} -\frac{1}{5}y \\ y \\ 0 \end{pmatrix}$, i.e. scalar multiples of $\begin{pmatrix} -1 \\ 5 \\ 0 \end{pmatrix}$.
3. For λ = 1, we have the system of equations
2x + y + 4z = x,
−3y + 2z = y,
z = z.
We find $z = z$, $z = 2y$, and $x = -9y$, therefore our eigenvectors are $\begin{pmatrix} -9y \\ y \\ 2y \end{pmatrix}$, i.e. scalar multiples of $\begin{pmatrix} -9 \\ 1 \\ 2 \end{pmatrix}$.
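Each eigenvector found above can be checked by multiplying it by its matrix and comparing with $\lambda$ times the vector. The sketch below does this for the two upper triangular matrices in these solutions; the helper names `mat_vec` and `is_eigenpair` are illustrative.

```python
def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

def is_eigenpair(m, lam, v):
    """True when v is a non-zero vector with m v = lam v."""
    return any(v) and mat_vec(m, v) == [lam * x for x in v]

M1 = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
M2 = [[2, 1, 4], [0, -3, 2], [0, 0, 1]]

print(is_eigenpair(M1, 4, [2, 3, 0]), is_eigenpair(M1, 6, [16, 25, 10]))  # True True
print(is_eigenpair(M2, -3, [-1, 5, 0]), is_eigenpair(M2, 1, [-9, 1, 2]))  # True True
```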
Sometimes, an eigenvalue can be a double root when the determinant equation is written out, as
presented in the following problem:
Problem 8.2.70. Find eigenvectors and eigenvalues for
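\[ M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{pmatrix}. \]
Solution. If $\lambda$ is an eigenvalue of this matrix, then $\operatorname{Det}(M - \lambda I) = 0$. Writing this out,
\[ \operatorname{Det}\begin{pmatrix} 1 - \lambda & 0 & 0 \\ 0 & 4 - \lambda & 0 \\ 0 & 0 & 4 - \lambda \end{pmatrix} = (1 - \lambda)(4 - \lambda)^2 = 0. \]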
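• Case $\lambda = 1$:
Write the equation out:
\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{pmatrix}\begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} a \\ 4b \\ 4c \end{pmatrix} = \begin{pmatrix} a \\ b \\ c \end{pmatrix}. \]
a = a,
4b = b,
4c = c.
The last two equations give $b = 0$ and $c = 0$, while the first is true for any value of $a$. Thus, the solutions are scalar multiples of $\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$.
• Case $\lambda = 4$:
Similarly, we have the equations
a = 4a,
4b = 4b,
4c = 4c.
The last two equations give us $c = c$ and $b = b$, which are always true. Finally, the first equation gives us $a = 0$. Thus, the solutions are of the form $\begin{pmatrix} 0 \\ b \\ c \end{pmatrix}$.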
Unlike the previous example, this eigenvector has two different variables. When an eigenvalue is a double root, the eigenvectors may have two degrees of freedom, as they do here (i.e. the vector is defined by 2 variables).
Solution. For the first one, note that we can completely eliminate the second row by a row operation:
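\[ \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 7 & 7 & 7 \end{pmatrix} \xrightarrow{-2R_1 + R_2 \to R_2} \begin{pmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \\ 7 & 7 & 7 \end{pmatrix}. \]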
When the diagonal technique is applied, notice that every term in the expression for the determinant (i.e. $aei - bdi - ceg - ahf + cdh + bfg$) includes an entry from the second row. If all the entries in the second row are zeroes, all the terms in the determinant are zeroes, and therefore the determinant is zero.
Therefore, the determinant of the first matrix is 0.
We can make this same argument about any row or column. Thus, we conclude that if any row
or column is all zeroes, then the determinant is necessarily 0.
Similarly, for the second matrix, we can eliminate the third row by two special row operations:
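\[ \begin{pmatrix} 1 & 2 & 4 \\ 3 & 7 & 8 \\ 4 & 9 & 12 \end{pmatrix} \xrightarrow[-R_2 + R_3 \to R_3]{-R_1 + R_3 \to R_3} \begin{pmatrix} 1 & 2 & 4 \\ 3 & 7 & 8 \\ 0 & 0 & 0 \end{pmatrix}. \]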
The determinant is clearly 0. From this, we can conclude in general that if some rows sum up to
another row, then the determinant is necessarily 0.
From these two examples of matrices, we can even make a broader conclusion: if the rows (taken as vectors) are not linearly independent, then the determinant is 0. This is because our row operations allow us to replace a row with a linear combination of the rows, i.e. we can do $aR_1 + bR_2 + cR_3 \to R_1$ for any real numbers $a$, $b$, and $c$ with $a \neq 0$ (which multiplies the determinant by $a$). Therefore, if the rows are linearly dependent, then we can find $a$, $b$, $c$, not all 0, such that $aR_1 + bR_2 + cR_3 = 0$. Then, we can apply row operations (replacing a row whose coefficient is non-zero) to get a row equal to 0, and thereby conclude that the determinant is 0.
Although we have had experience with the two-dimensional Cartesian plane (plotting points and lines), we need parametric equations, vectors, and matrices to establish a foundation for representing lines and planes in three-dimensional space.
First, we discuss how to represent lines in 3D space, through vectors.
[Figure: a line through the points $P = (x_0, y_0, z_0)$ and $Q = (x_1, y_1, z_1)$.]
Consider a point $R$ on the line $\overleftrightarrow{PQ}$. Then $\overrightarrow{PR}$ will be a scalar multiple of $\overrightarrow{PQ}$. This results in the parameterization
\[ \overrightarrow{PR} = t\,\overrightarrow{PQ}. \]
Denote $P = (x_0, y_0, z_0)$ and $Q = (x_1, y_1, z_1)$. Then $\overrightarrow{PQ} = \langle x_1 - x_0, y_1 - y_0, z_1 - z_0 \rangle$, therefore
\[ \overrightarrow{PR} = t\,\langle x_1 - x_0, y_1 - y_0, z_1 - z_0 \rangle. \]
Since $R$ is an arbitrary point that lies on the line $\overleftrightarrow{PQ}$, we can establish the following:
Definition 8.3.1. A line in three-dimensional space can be expressed in parametric form with
the following:
x = x0 + t(x1 − x0 ),
y = y0 + t(y1 − y0 ),
z = z0 + t(z1 − z0 ).
Problem 8.3.2. Give a parametric representation for a line containing (1, 2, 3) and (−3, −2, 7).
x = 1 + t(−3 − 1) = 1 − 4t,
y = 2 + t(−2 − 2) = 2 − 4t,
z = 3 + t(7 − 3) = 3 + 4t.
Since each contains 4t, we can make the substitution 4t −→ t to simplify our parameterization:
x = 1 − t,
y = 2 − t,
z = 3 + t.
Problem 8.3.3. Does the line from Problem 8.3.2 intersect the line containing (0, 0, 0) and (1, 4, 9)?
x = u,
y = 4u,
z = 9u.
We use $u$ as the parameter here because we will use $t$ as the parameter for the line from Problem 8.3.2. The parameters for the two lines are not necessarily the same.
To find the intersection, simply set the two parameterizations equal to each other:
1 − t = u,
2 − t = 4u,
3 + t = 9u.
Substituting the first equation into the second equation, we have $4(1 - t) = 2 - t$, which yields $t = \frac{2}{3}$ and $u = \frac{1}{3}$. However, these values fail the third equation, thus there is no solution, and the lines do not intersect.
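The check above can be replayed numerically. In this sketch the helper names `line1` and `line2` are illustrative; they encode the two parameterizations, and the values of $t$ and $u$ come from the first two equations.

```python
from fractions import Fraction

def line1(t):
    """Line from Problem 8.3.2."""
    return (1 - t, 2 - t, 3 + t)

def line2(u):
    """Line through (0, 0, 0) and (1, 4, 9)."""
    return (u, 4 * u, 9 * u)

t = Fraction(2, 3)   # solves 4(1 - t) = 2 - t
u = 1 - t            # from the first equation, u = 1 - t
p, q = line1(t), line2(u)
print(p[:2] == q[:2])  # True: the x- and y-coordinates agree
print(p[2] == q[2])    # False: the z-coordinates differ, so the lines miss each other
```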
Problem 8.3.4. Find the parametric equation of the line containing (1, −2, 3) and (2, 1, −5).
Solution. x = 1 + (2 − 1)t = 1 + t,
y = −2 + (1 − (−2))t = −2 + 3t,
z = 3 + (−5 − 3)t = 3 − 8t.
Now we seek to come up with a proper definition for planes in space. The following diagram will
motivate us to define a plane in the following fashion:
For any point $(x, y, z)$ that is on the plane, notice that the vector $\langle a, b, c \rangle$ is orthogonal to the vector from $(x_0, y_0, z_0)$ to $(x, y, z)$.
Definition 8.3.5. A plane in three-dimensional space is the set of points $(x, y, z)$ such that the vector from $(x_0, y_0, z_0)$ to $(x, y, z)$ is orthogonal to $\langle a, b, c \rangle$. We can then find the equation for a plane:
\[ \langle a, b, c \rangle \cdot \langle x - x_0, y - y_0, z - z_0 \rangle = 0, \]
\[ ax - ax_0 + by - by_0 + cz - cz_0 = 0. \]
Let $d$ be the constant $ax_0 + by_0 + cz_0$; then our general equation for a plane (after some rearranging) is
\[ ax + by + cz = d. \]
Definition 8.3.6. A vector or line is called normal to some plane when it is perpendicular to that
plane.
Problem 8.3.7. Find the point on the line from Problem 8.3.4 closest to the origin (0, 0, 0).
Solution. A vector along the line through the points $(1, -2, 3)$ and $(2, 1, -5)$ is
\[ \langle 2 - 1, 1 - (-2), -5 - 3 \rangle = \langle 1, 3, -8 \rangle. \]
Furthermore, the line from the origin to the closest point should be perpendicular to the line (the shortest distance from a point to a line is the perpendicular).
If we let that closest point on the line be $(1 + t, -2 + 3t, 3 - 8t)$ using the parametric equations of the line, then the vector starting from $(0, 0, 0)$ and ending at $(1 + t, -2 + 3t, 3 - 8t)$ is simply $\langle 1 + t, -2 + 3t, 3 - 8t \rangle$. This vector must be orthogonal to the vector $\langle 1, 3, -8 \rangle$, therefore we have
\[ \langle 1 + t, -2 + 3t, 3 - 8t \rangle \cdot \langle 1, 3, -8 \rangle = 0. \]
The solution is $t = \frac{29}{74}$. Now, we can plug this value into our parametric equations of the line to get the point $\left(\frac{103}{74}, -\frac{61}{74}, -\frac{10}{74}\right)$.
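The orthogonality condition has a closed form: for a line through the point $p$ with direction $\vec{d}$, setting $(p + t\vec{d}) \cdot \vec{d} = 0$ gives $t = -\dfrac{p \cdot \vec{d}}{\vec{d} \cdot \vec{d}}$. The sketch below applies this to the line above; the function names are illustrative.

```python
from fractions import Fraction

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def closest_point_to_origin(p, d):
    """Point on the line p + t*d whose position vector is orthogonal to d."""
    t = Fraction(-dot(p, d), dot(d, d))
    return t, tuple(a + t * b for a, b in zip(p, d))

t, point = closest_point_to_origin((1, -2, 3), (1, 3, -8))
print(t)      # 29/74
print(point)  # the fractions 103/74, -61/74, and -5/37 (the reduced form of -10/74)
```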
Problem 8.3.8. Find the intersection of the line from Problem 8.3.4 with the plane whose equation
is x + y + 4z = 11.
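Solution. Substituting the parametric equations of the line into the equation of the plane gives $(1 + t) + (-2 + 3t) + 4(3 - 8t) = 11$. This simplifies to $t = 0$, and so we plug this back into our parameterization for the line, resulting in the point $(1 + 0, -2 + 3(0), 3 - 8(0)) = (1, -2, 3)$.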
Solution. As we have to find the intersection, we simply substitute x + y + 4z into the second
equation, as such:
x + y + 4z = x + y + 3z.
This results in the solutions z = 0 and x + y = 11, which is a line with a parameterization of
x = 1 + t,
y = 10 − t,
z = 0.
Problem 8.3.10. Find the acute angle between the planes given in the previous example.
\[ \cos(180^\circ - \theta) = \frac{\langle 1, 1, 4 \rangle \cdot \langle 1, 1, 3 \rangle}{\|\langle 1, 1, 4 \rangle\|\|\langle 1, 1, 3 \rangle\|} = \frac{14}{3\sqrt{22}} = -\cos\theta \]
One may conclude that the angle is $\cos^{-1}\left(-\dfrac{14}{3\sqrt{22}}\right)$; however, this is incorrect because the cosine of an acute angle is always positive. To remedy this, we take the absolute value of the dot product of the two normal vectors. This is our general expression to find the acute angle:
\[ \cos\theta = \frac{|\vec{v} \cdot \vec{w}|}{\|\vec{v}\|\|\vec{w}\|} \]
where $\vec{v}$, $\vec{w}$ are the vectors normal to the two planes, which in this case would be $\langle 1, 1, 4 \rangle$ and $\langle 1, 1, 3 \rangle$.
The answer is therefore $\theta = \cos^{-1}\left(\dfrac{14}{3\sqrt{22}}\right)$.
Theorem 8.3.11
The distance from a point $(x_0, y_0, z_0)$ to a plane $ax + by + cz = d$ is
\[ \frac{|ax_0 + by_0 + cz_0 - d|}{\sqrt{a^2 + b^2 + c^2}}. \]
Proof. By our definition of a plane, the vector that is normal to $ax + by + cz = d$ is $\langle a, b, c \rangle$. Let a representation of this vector go through the point $(x_0, y_0, z_0)$, as shown:
Consider the line that goes through $(x_0, y_0, z_0)$ and is parallel to $\langle a, b, c \rangle$. The parameterization for this line would be
x = x0 + at,
y = y0 + bt,
z = z0 + ct.
To find the intersection of this line and the plane, substitute the parameters into the equation
for the plane:
a(x0 + at) + b(y0 + bt) + c(z0 + ct) = d.
This rearranges to
\[ ax_0 + by_0 + cz_0 - d = -t(a^2 + b^2 + c^2). \]
The shortest distance from the plane to $(x_0, y_0, z_0)$ would be $|t|\,\|\langle a, b, c \rangle\| = |t|\sqrt{a^2 + b^2 + c^2}$, so we substitute in our value of $t$ to get
\[ \left|-\frac{ax_0 + by_0 + cz_0 - d}{a^2 + b^2 + c^2}\right|\sqrt{a^2 + b^2 + c^2} = \frac{|ax_0 + by_0 + cz_0 - d|}{\sqrt{a^2 + b^2 + c^2}}, \]
as desired.
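The distance formula translates directly into code. This is a minimal sketch with a hypothetical function name; the sample plane $x + y + 4z = 11$ is the one from Problem 8.3.8.

```python
import math

def distance_to_plane(point, normal, d):
    """Distance from point to the plane a*x + b*y + c*z = d, where normal = (a, b, c)."""
    a, b, c = normal
    x0, y0, z0 = point
    return abs(a * x0 + b * y0 + c * z0 - d) / math.sqrt(a * a + b * b + c * c)

# Distance from the origin to the plane x + y + 4z = 11, i.e. 11 / sqrt(18)
print(distance_to_plane((0, 0, 0), (1, 1, 4), 11))
# A point that lies on the plane is at distance 0
print(distance_to_plane((1, -2, 3), (1, 1, 4), 11))  # 0.0
```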
Problem 8.3.12. Find $\left(a\vec{\imath} + b\vec{\jmath} + c\vec{k}\right) \cdot \left(d\vec{\imath} + e\vec{\jmath} + f\vec{k}\right)$.
Now that we are dealing with three dimensional vectors, we can find a vector that is orthogonal
to two other given vectors simultaneously. Consider the following matrix:
\[ \begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \\ a & b & c \\ d & e & f \end{pmatrix}. \]
This matrix is unlike any other matrix we have considered before. Although we have initially
defined matrices to be an array of numbers, we can extend this definition to include vectors as
possible entries as well.
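The top row consists of the standard basis vectors for $\mathbb{R}^3$. Then, the determinant of this matrix is of the form
\[ \vec{\imath}\gamma_1 + \vec{\jmath}\gamma_2 + \vec{k}\gamma_3, \]
where $\gamma_1, \gamma_2, \gamma_3$ are placeholders for the remaining terms. Notice that the determinant itself is a vector.
Now suppose we take the dot product of this with the vector $a\vec{\imath} + b\vec{\jmath} + c\vec{k}$. By Problem 8.3.12,
\[ \left(\vec{\imath}\gamma_1 + \vec{\jmath}\gamma_2 + \vec{k}\gamma_3\right) \cdot \left(a\vec{\imath} + b\vec{\jmath} + c\vec{k}\right) = \gamma_1 a + \gamma_2 b + \gamma_3 c. \]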
and the row operation $-R_1 + R_3 \to R_3$ would indicate that the determinant is 0. Hence, we can likewise conclude that $\vec{\imath}\gamma_1 + \vec{\jmath}\gamma_2 + \vec{k}\gamma_3$ and $d\vec{\imath} + e\vec{\jmath} + f\vec{k}$ are orthogonal.
Thus, we have found that $\vec{\imath}\gamma_1 + \vec{\jmath}\gamma_2 + \vec{k}\gamma_3$ is orthogonal to both $a\vec{\imath} + b\vec{\jmath} + c\vec{k}$ and $d\vec{\imath} + e\vec{\jmath} + f\vec{k}$. This is the vector that we have been looking for.
This special vector will be so significant in 3D geometry that we will define it with a special name:
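Proof. Let $\vec{v} = a\vec{\imath} + b\vec{\jmath} + c\vec{k}$ and $\vec{w} = d\vec{\imath} + e\vec{\jmath} + f\vec{k}$. Then
\[ \vec{v} \times \vec{w} = \operatorname{Det}\begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \\ a & b & c \\ d & e & f \end{pmatrix}, \]
\[ \vec{w} \times \vec{v} = \operatorname{Det}\begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \\ d & e & f \\ a & b & c \end{pmatrix}. \]
For $\vec{w} \times \vec{v}$, rows 2 and 3 of the matrix have been swapped, so its determinant is the negative of the determinant of the matrix for $\vec{v} \times \vec{w}$. Therefore $\vec{w} \times \vec{v} = -(\vec{v} \times \vec{w})$.
2. $\vec{v} \times \vec{v} = \vec{0}$.
Proof. From the first property, $\vec{v} \times \vec{v} = -(\vec{v} \times \vec{v})$, from which the result follows.
\[ \vec{\imath} \times \vec{\jmath} = \vec{k}, \]
\[ \vec{\jmath} \times \vec{k} = \vec{\imath}, \]
\[ \vec{k} \times \vec{\imath} = \vec{\jmath}. \]
5. $(a\vec{v}) \times \vec{w} = a(\vec{v} \times \vec{w})$.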
Example 8.3.15
Find the equation of the plane containing the points (2, 3, 7), (1, 5, 6), and (−4, 0, 1).
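Solution. Let $R = (2, 3, 7)$, $M = (1, 5, 6)$, and $J = (-4, 0, 1)$. Then $\overrightarrow{RM} = \langle -1, 2, -1 \rangle$ and $\overrightarrow{MJ} = \langle -5, -5, -5 \rangle$. Taking the cross product of these gives a vector normal to the plane, so we evaluate with minors:
\begin{align*}
\overrightarrow{RM} \times \overrightarrow{MJ} &= \operatorname{Det}\begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \\ -1 & 2 & -1 \\ -5 & -5 & -5 \end{pmatrix} \\
&= \vec{\imath} \cdot \operatorname{Det}\begin{pmatrix} 2 & -1 \\ -5 & -5 \end{pmatrix} - \vec{\jmath} \cdot \operatorname{Det}\begin{pmatrix} -1 & -1 \\ -5 & -5 \end{pmatrix} + \vec{k} \cdot \operatorname{Det}\begin{pmatrix} -1 & 2 \\ -5 & -5 \end{pmatrix} \\
&= -15\vec{\imath} + 15\vec{k}.
\end{align*}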
The normal vector is $\langle -15, 0, 15 \rangle$, so our equation for the plane is of the form $-15x + 15z = d$. Now we plug in one of our three given points to find the value of $d$, i.e. plugging in point $J$ we get $-15(-4) + 15(1) = d = 75$, therefore our equation for the plane is $-15x + 15z = 75 \implies -x + z = 5$.
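The same steps (two edge vectors, cross product, then solving for $d$) can be sketched as a short routine. The names `cross` and `plane_through` are illustrative, and the three points are assumed not to lie on one line.

```python
def cross(v, w):
    """Cross product of two 3D vectors, expanded along the top row of the determinant."""
    a, b, c = v
    d, e, f = w
    return (b * f - c * e, c * d - a * f, a * e - b * d)

def plane_through(p, q, r):
    """Return (normal, d) such that normal . (x, y, z) = d on the plane through p, q, r."""
    v = tuple(b - a for a, b in zip(p, q))   # vector from p to q
    w = tuple(b - a for a, b in zip(q, r))   # vector from q to r
    n = cross(v, w)
    return n, sum(a * b for a, b in zip(n, p))

print(plane_through((2, 3, 7), (1, 5, 6), (-4, 0, 1)))  # ((-15, 0, 15), 75)
```

Dividing the result through by 15 recovers the simplified equation $-x + z = 5$.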
Problem 8.3.16. Find the equation of the plane containing the points (1, 2, 4), (2, −1, 1), and
(4, 0, 5).
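Solution. Similarly, we find two vectors given the three points $(1, 2, 4)$, $(2, -1, 1)$, and $(4, 0, 5)$, which are $\langle 1, -3, -3 \rangle$ and $\langle 2, 1, 4 \rangle$. We then take the cross product to find the normal vector of the plane:
\begin{align*}
\langle 1, -3, -3 \rangle \times \langle 2, 1, 4 \rangle &= \operatorname{Det}\begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \\ 1 & -3 & -3 \\ 2 & 1 & 4 \end{pmatrix} \\
&= \vec{\imath} \cdot \operatorname{Det}\begin{pmatrix} -3 & -3 \\ 1 & 4 \end{pmatrix} - \vec{\jmath} \cdot \operatorname{Det}\begin{pmatrix} 1 & -3 \\ 2 & 4 \end{pmatrix} + \vec{k} \cdot \operatorname{Det}\begin{pmatrix} 1 & -3 \\ 2 & 1 \end{pmatrix} \\
&= -9\vec{\imath} - 10\vec{\jmath} + 7\vec{k}.
\end{align*}
So our equation for the plane is of the form $-9x - 10y + 7z = d$. We then plug in the point $(4, 0, 5)$ to get $-9(4) - 10(0) + 7(5) = d = -1$, so the equation of the plane is $-9x - 10y + 7z = -1$.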
Problem 8.3.17. Find the intersection of the plane from Problem 8.3.16 with the plane x+y +z = 0.
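Solution. Label the plane equations $-9x - 10y + 7z = -1$ as equation (1) and $x + y + z = 0$ as equation (2). Multiplying equation (2) by 7 and subtracting it from equation (1) gives $-16x - 17y = -1$, i.e.
16x + 17y = 1,
and multiplying equation (2) by 9 and then adding it to equation (1) gives
−y + 16z = −1.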
We can use these equations to find two points which lie on the intersection of the two planes, and then determine the parameterization of the line through those two points.
Plugging in $y = 1$ gives the point $(-1, 1, 0)$, and plugging in $z = 1$ gives the point $(-18, 17, 1)$. Therefore the parameterization of the line is
x = −1 − 17t,
y = 1 + 16t,
z = t.
Problem 8.3.18. Find the intersection of the plane from Problem 8.3.16 with the line that contains
(3, 4, 5) and (5, 12, 13).
Solution. The parameterization for the line with points $(3, 4, 5)$ and $(5, 12, 13)$ is:
x = 3 + 2t,
y = 4 + 8t,
z = 5 + 8t.
Making the substitution $2t \to t$ simplifies this to
x = 3 + t,
y = 4 + 4t,
z = 5 + 4t.
To find the intersection, we simply substitute the parametric equations into the equation of the plane, which is $-9x - 10y + 7z = -1$, so we get
\[ -9(3 + t) - 10(4 + 4t) + 7(5 + 4t) = -1. \]
Solving this gives $t = -\frac{31}{21}$. We plug this back into the parameterization of the line to get the point $\left(\frac{32}{21}, -\frac{40}{21}, -\frac{19}{21}\right)$.
Problem 8.3.19. Determine the distance from the origin to the line in Problem 8.3.18.
x = 3 + t,
y = 4 + 4t,
z = 5 + 4t.
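Consider the vector from $(0, 0, 0)$ to an arbitrary point on the line, which can be represented as $(3 + t, 4 + 4t, 5 + 4t)$. The vector going from the origin to this arbitrary point is just $\langle 3 + t, 4 + 4t, 5 + 4t \rangle$.
We can determine a vector along the line by taking two points on the line, i.e. $(3, 4, 5)$ and $(5, 12, 13)$, which yields $\langle 2, 8, 8 \rangle$. By Theorem 8.1.23, $\vec{v}$ and $\vec{w}$ are orthogonal if and only if $\vec{v} \cdot \vec{w} = 0$. The shortest distance is attained when the vector from the origin is orthogonal to the line, so we have
\[ \langle 3 + t, 4 + 4t, 5 + 4t \rangle \cdot \langle 2, 8, 8 \rangle = 0. \]
This evaluates to
\[ 6 + 2t + 32 + 32t + 40 + 32t = 0, \]
and solving gives $t = -\frac{13}{11}$. Therefore the distance is just the magnitude of the vector
\[ \langle 3 + t, 4 + 4t, 5 + 4t \rangle, \]
which is
\[ \sqrt{\left(\frac{20}{11}\right)^2 + \left(-\frac{8}{11}\right)^2 + \left(\frac{3}{11}\right)^2} = \frac{\sqrt{473}}{11}. \]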
Theorem 8.3.20
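Let $\theta$ be the included angle between $\vec{v}$ and $\vec{w}$ when their tails are placed on each other. Then we have
\[ \|\vec{v} \times \vec{w}\| = \|\vec{v}\|\|\vec{w}\|\sin\theta. \]
Proof. Since $\cos\theta = \dfrac{\vec{v} \cdot \vec{w}}{\|\vec{v}\|\|\vec{w}\|}$, we get $\sin\theta = \sqrt{1 - \left(\dfrac{\vec{v} \cdot \vec{w}}{\|\vec{v}\|\|\vec{w}\|}\right)^2}$. Let $\vec{v} = \langle a, b, c \rangle$ and $\vec{w} = \langle d, e, f \rangle$.
Then,
\begin{align*}
\|\vec{v}\|\|\vec{w}\|\sin\theta &= \sqrt{\|\vec{v}\|^2\|\vec{w}\|^2 - (\vec{v} \cdot \vec{w})^2} \\
&= \sqrt{(a^2 + b^2 + c^2)(d^2 + e^2 + f^2) - (ad + be + cf)^2}.
\end{align*}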
This allows us to compute either the dot product or sin θ if we have the other.
Now, recall the formula based on Theorem 6.6.1 which states that the area of the triangle formed by $\vec{v}$ and $\vec{w}$ with their tails placed on each other, and included angle $\theta$, would be $\frac{1}{2}\|\vec{v}\|\|\vec{w}\|\sin\theta$. As a result of Theorem 8.3.20, we conclude that the area of such a triangle is $\frac{1}{2}\|\vec{v} \times \vec{w}\|$.
Then $\|\vec{v}\|\|\vec{w}\|\sin\theta = \|\vec{v} \times \vec{w}\|$ is the area of the parallelogram (since it is twice the area of the triangle):
[Figure: the parallelogram spanned by $\vec{v}$ and $\vec{w}$, with included angle $\theta$.]
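Squaring both sides of Theorem 8.3.20 gives $\|\vec{v} \times \vec{w}\|^2 = \|\vec{v}\|^2\|\vec{w}\|^2 - (\vec{v} \cdot \vec{w})^2$, which can be checked with integer arithmetic alone. The sketch below does so for one sample pair of vectors; the helper names and the sample vectors are illustrative.

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    a, b, c = v
    d, e, f = w
    return (b * f - c * e, c * d - a * f, a * e - b * d)

v, w = (1, -3, -3), (2, 1, 4)
n = cross(v, w)
lhs = dot(n, n)                                # ||v x w||^2
rhs = dot(v, v) * dot(w, w) - dot(v, w) ** 2   # ||v||^2 ||w||^2 - (v . w)^2
print(lhs, rhs)  # 230 230
```

The parallelogram spanned by these two vectors therefore has area $\sqrt{230}$, and the triangle has half of that.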
Chapter 9
Limits
We are finally at the gates of calculus, and to begin, we must revisit the concept of limits.
I strongly recommend that you review the section on the limits of sequences, as we will be taking
it a step further when considering the limits of functions.
Note that we will only be covering an introduction to calculus, mainly limits (this chapter) and derivatives (next chapter).
The Definition
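Keeping this principle in mind, here is the $\varepsilon$-$\delta$ (“epsilon-delta”) definition of the limit:
Definition 9.0.1. The limit of $f(x)$ as $x$ approaches $a$ is $L$, i.e. $\lim_{x \to a} f(x) = L$, iff
\[ \forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x, \quad 0 < |x - a| < \delta \to |f(x) - L| < \varepsilon. \]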
For the following few problems, we will introduce proving limits of linear functions, as they are the
most basic and straightforward.
Example 9.1.1
Find and prove $\lim_{x \to 3} (3x + 5)$.
We can expect the limit of $3x + 5$ as $x$ approaches 3 to be 14 (we can simply plug 3 into the given function). To rigorously prove this limit, the existential quantifier in the definition suggests that we must find an expression for $\delta$ in terms of $\varepsilon$ such that the implication
\[ 0 < |x - 3| < \delta \to |(3x + 5) - 14| < \varepsilon \]
is satisfied. Let’s take a closer look at $|(3x + 5) - 14| < \varepsilon$. We perform a series of algebraic manipulations:
\begin{align*}
|3x - 9| &< \varepsilon \\
3|x - 3| &< \varepsilon \\
|x - 3| &< \frac{\varepsilon}{3}
\end{align*}
Note that all of these steps are reversible, therefore we can rewrite our implication as
\[ 0 < |x - 3| < \delta \to |x - 3| < \frac{\varepsilon}{3}. \]
From here, it is clear that our implication is satisfied if we let $\delta = \frac{\varepsilon}{3}$. However, even though we have essentially ‘solved’ the proof going backwards, we must write the proof going forwards, as shown:
Proof. For a given $\varepsilon > 0$, let $\delta = \frac{\varepsilon}{3}$. Then,
\begin{align*}
0 < |x - 3| &< \delta \\
|x - 3| &< \frac{\varepsilon}{3} \\
3|x - 3| &< \varepsilon \\
|3||x - 3| &< \varepsilon \\
|3(x - 3)| &< \varepsilon \\
|3x - 9| &< \varepsilon \\
|(3x + 5) - 14| &< \varepsilon
\end{align*}
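A numerical spot check is no substitute for the proof, but it is a useful way to test a candidate $\delta$ before writing the proof forwards. The sketch below samples points with $0 < |x - a| < \delta$ and tests the conclusion; the function name `check_delta` and the sampling scheme are illustrative assumptions.

```python
def check_delta(f, a, L, eps, delta, samples=1000):
    """Sample x with 0 < |x - a| < delta and test whether |f(x) - L| < eps."""
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)   # strictly between 0 and delta
        for x in (a - offset, a + offset):
            if not abs(f(x) - L) < eps:
                return False
    return True

f = lambda x: 3 * x + 5
for eps in (1.0, 0.1, 0.001):
    print(eps, check_delta(f, 3, 14, eps, eps / 3))  # True for each eps

# A delta that is too large fails: with delta = eps the implication breaks.
print(check_delta(f, 3, 14, 1.0, 1.0))  # False
```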
The proofs for the rest will be written forward, but for each problem, make sure you attempt the
proof and find an expression for δ in terms of ε first before looking at the solution.
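Proof. For a given $\varepsilon > 0$, let $\delta = \frac{\varepsilon}{2}$. Then,
\begin{align*}
0 < |x - 2| &< \delta \\
|x - 2| &< \frac{\varepsilon}{2} \\
2|x - 2| &< \varepsilon \\
|2||x - 2| &< \varepsilon \\
|2(x - 2)| &< \varepsilon \\
|2x - 4| &< \varepsilon \\
|(2x + 3) - 7| &< \varepsilon
\end{align*}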
Example 9.1.5
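Let $f(x) = \begin{cases} 7 & x \le 4 \\ 5 & x > 4 \end{cases}$. Prove that $\lim_{x \to 4} f(x)$ does not exist.
Proof. We proceed with proof by contradiction. Assume that $\lim_{x \to 4} f(x) = L$. Then,
\[ \forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x, \quad 0 < |x - 4| < \delta \to |f(x) - L| < \varepsilon. \]
As long as we can find a single value of $\varepsilon$ such that the implication is false for any $\delta$, then we will have shown that the limit does not exist.
First, we rewrite the implication to get
\[ 4 - \delta < x < 4 + \delta \to L - \varepsilon < f(x) < L + \varepsilon. \]
Assume $\varepsilon = \frac{1}{2}$ (in fact, any value of $\varepsilon \in (0, 1)$ can be used to demonstrate a contradiction). Then we have
\[ 4 - \delta < x < 4 + \delta \to L - \frac{1}{2} < f(x) < L + \frac{1}{2}, \]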
prove that $\lim_{x \to 0} f(x)$ either exists and is equal to some value, or does not exist.
Proof. We shall show that there exists no limit for this function. First, assume that the limit is L.
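Writing out the definition, we have
\[ \forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x, \quad 0 < |x - 0| < \delta \to |f(x) - L| < \varepsilon. \]
As long as we find an $\varepsilon$ that fails this definition, then we will have disproved the limit.
Consider $\varepsilon = \frac{1}{4}$. When we take a closer look at the implication,
\[ 0 < |x - 0| < \delta \to |f(x) - L| < \frac{1}{4}, \]
we see that $|f(x) - L| < \frac{1}{4}$ implies that $f(x) \in \left(L - \frac{1}{4}, L + \frac{1}{4}\right)$.
Let $x = \frac{\delta}{2}$ and $x = -\frac{\delta}{2}$, which are allowed because these values satisfy $0 < |x - 0| < \delta$. Since $\delta > 0$, we know that $f\left(\frac{\delta}{2}\right) = 1$ and $f\left(-\frac{\delta}{2}\right) = 0$, based on the given definition of the function.
However, it has been demonstrated that $f(x) \in \left(L - \frac{1}{4}, L + \frac{1}{4}\right)$, and this interval clearly cannot contain both 0 and 1 for any value of $L$. Therefore, we have a contradiction for $\varepsilon = \frac{1}{4}$, so the limit does not exist.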
Exercise 9.1.7. Prove $\lim_{x \to 0} \dfrac{|x|}{x}$ does not exist.
Before we tackle some harder proofs, we first make the following key observation:
Remark 9.2.1. If the definition is true for a specific $\varepsilon_0$, then it is also true for any $\tilde{\varepsilon}_0 > \varepsilon_0$.
If a particular $\delta_0$ works, then any $\tilde{\delta}_0 < \delta_0$ will also work.
Assume that this implication is true for $\delta_0$. Then as long as $x$ and $a$ are within $\delta_0$ apart, they will be “sufficiently close” such that $f(x)$ is within $\varepsilon$ of $L$.
If $\tilde{\delta}_0 < \delta_0$, then $x$ and $a$ are even closer to each other. They are more than sufficiently close in order to satisfy that $f(x)$ is within $\varepsilon$ of $L$. Therefore, the implication $0 < |x - a| < \tilde{\delta}_0 \to |f(x) - L| < \varepsilon$ is also true.
Lemma 9.2.2
Define non-strict intervals P = (a, b) and Q = (c, d) for a, b, c, d ∈ R. For any x ∈ R, if we have
x ∈ P −→ x ∈ Q, then P ⊆ Q i.e. (a, b) ⊆ (c, d), and subsequently,
c ≤ a < b ≤ d.
This lemma should be intuitive and does not need proof: if one interval is contained inside
another, then the bounds of the smaller interval must be between the bounds of the larger interval.
Note that for the following examples, I will be explaining the work that one must do before
writing a proper limit proof. Therefore these ‘proofs’ are written backwards, and it is left as exercises
to the reader to properly and formally write these proofs forward.
Example 9.2.3
Prove $\lim_{x \to 3} x^2 = 9$.
At this point, we need to figure out a way to rewrite $9 - \varepsilon < x^2 < 9 + \varepsilon$ as an inequality on $x$, so we can compare the two inequalities and determine what value of $\delta$ in terms of $\varepsilon$ is required to satisfy the definition.
The key point of this proof is to assume $\varepsilon < 9$. As long as we can prove that the limit exists for $\varepsilon < 9$, then it will automatically follow for $\varepsilon \ge 9$, as explained in Remark 9.2.1.
Because we conveniently set $\varepsilon < 9$, we can take the square root of $9 - \varepsilon < x^2 < 9 + \varepsilon$, i.e. $\sqrt{9 - \varepsilon} < x < \sqrt{9 + \varepsilon}$.
NOTE: Here is a caveat of this proof: we are assuming that the square root function is increasing
in order to take the square root of that inequality. For now, we will take it for granted.
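Our implication in discussion is now
\[ 3 - \delta < x < 3 + \delta \to \sqrt{9 - \varepsilon} < x < \sqrt{9 + \varepsilon}. \]
We want to pick a $\delta$ such that $x \in (3 - \delta, 3 + \delta) \to x \in \left(\sqrt{9 - \varepsilon}, \sqrt{9 + \varepsilon}\right)$. By Lemma 9.2.2, it is sufficient if
\[ \sqrt{9 - \varepsilon} \le 3 - \delta, \qquad 3 + \delta \le \sqrt{9 + \varepsilon}. \]
Rearranging, we have
\[ \delta \le 3 - \sqrt{9 - \varepsilon}, \]
\[ \delta \le \sqrt{9 + \varepsilon} - 3. \]
But now we have two inequalities for $\delta$. How do we know which one to choose?
In fact, we don’t need to choose. We can define
\[ \delta = \min\left\{3 - \sqrt{9 - \varepsilon}, \sqrt{9 + \varepsilon} - 3\right\} \]
so that both inequalities become true. This guarantees that our choice of $\delta$ will satisfy the definition of the given limit.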
Alternative Proof. We can proceed with a slightly easier proof. First, write out the implication:
$$|x - 3| < \delta \longrightarrow |x^2 - 9| < \varepsilon.$$
Then we have $|x^2 - 9| = |(x - 3)(x + 3)| = |x - 3|\,|x + 3|$. So we want to find a $\delta$ such that
$$|x - 3| < \delta \longrightarrow |x - 3|\,|x + 3| < \varepsilon.$$
Now, assume $\delta < 1$. We arbitrarily choose 1 since it is small and relatively straightforward to
deal with, but the proof would have been fine had we chosen 2, 10, or any other positive number
instead.
Then, $|x - 3| < \delta$ means that $|x - 3| < 1$. By the Triangle Inequality,
$$|x + 3| = |(x - 3) + 6| \le |x - 3| + 6 < 1 + 6 = 7.$$
Now, we use the facts that $|x - 3| < \delta$ and $|x + 3| < 7$ to conclude that
$$|x - 3|\,|x + 3| < 7\delta,$$
which in turn we want to be less than $\varepsilon$. To accomplish this, we want $\delta < \frac{\varepsilon}{7}$.
However, we had assumed earlier that $\delta < 1$. Thus, we simply have to find a $\delta$ that is
less than both $\frac{\varepsilon}{7}$ and 1. This can be done by choosing
$$\delta < \min\left\{1, \frac{\varepsilon}{7}\right\},$$
guaranteeing that our definition of the limit will be satisfied.
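The two choices of $\delta$ found above can be spot-checked numerically. The sketch below is not part of the proof: the helper names (`delta_sqrt`, `delta_simple`, `holds`) and the sampling scheme are assumptions made for illustration, and testing finitely many points can only fail to find a counterexample, never establish the limit.

```python
import math

def delta_sqrt(eps):
    """delta from the first proof; assumes 0 < eps < 9."""
    return min(3 - math.sqrt(9 - eps), math.sqrt(9 + eps) - 3)

def delta_simple(eps):
    """One delta strictly below min{1, eps/7}, as in the alternative proof."""
    return 0.5 * min(1.0, eps / 7)

def holds(delta, eps, samples=2001):
    """Check |x^2 - 9| < eps at sample points strictly inside (3 - delta, 3 + delta)."""
    for i in range(1, samples):
        x = 3 - delta + 2 * delta * i / samples
        if abs(x * x - 9) >= eps:
            return False
    return True
```

For instance, `holds(delta_sqrt(1), 1)` tests the first choice with $\varepsilon = 1$, while `holds(1.0, 0.5)` shows that a carelessly large $\delta$ fails.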
We rewrite $0 < |x - 99| < \delta$ as $99 - \delta < x < 99 + \delta$, and $|x^2 - 9801| < \varepsilon$ as $9801 - \varepsilon < x^2 < 9801 + \varepsilon$.
We want to take the square root of the latter inequality, but this is not possible if $9801 - \varepsilon$ is negative.
Therefore, we assume $\varepsilon < 9801$ (remember, as long as $\varepsilon$ works, any $\tilde{\varepsilon} > \varepsilon$ will also work!). Then,
$9801 - \varepsilon$ is definitely positive, so we can take the square root to get $\sqrt{9801 - \varepsilon} < x < \sqrt{9801 + \varepsilon}$.
These rearrange to
$$\delta < 99 - \sqrt{9801 - \varepsilon},$$
$$\delta < \sqrt{9801 + \varepsilon} - 99,$$
so the implication $\forall \varepsilon > 0\ \exists \delta > 0\ \forall x,\ 0 < |x - 99| < \delta \to |x^2 - 9801| < \varepsilon$ is true.
Problem 9.2.5. Prove $\lim_{x \to 3} \frac{1}{x} = \frac{1}{3}$.
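Proof. Consider the implication $|x - 3| < \delta \to \left|\frac{1}{x} - \frac{1}{3}\right| < \varepsilon$. We break up the absolute value signs
to get
$$3 - \delta < x < 3 + \delta \to \frac{1}{3} - \varepsilon < \frac{1}{x} < \frac{1}{3} + \varepsilon.$$
Assume $\varepsilon < \frac{1}{3}$, which allows us to take the reciprocal of the latter inequality to get
$$3 - \delta < x < 3 + \delta \to \frac{1}{\frac{1}{3} + \varepsilon} < x < \frac{1}{\frac{1}{3} - \varepsilon}.$$
We want to show that the interval $(3 - \delta, 3 + \delta)$ is contained inside $\left(\frac{1}{\frac{1}{3} + \varepsilon}, \frac{1}{\frac{1}{3} - \varepsilon}\right)$. It is sufficient
that
$$\frac{1}{\frac{1}{3} + \varepsilon} < 3 - \delta,$$
$$3 + \delta < \frac{1}{\frac{1}{3} - \varepsilon}.$$
These rearrange to
$$\delta < 3 - \frac{1}{\frac{1}{3} + \varepsilon},$$
$$\delta < \frac{1}{\frac{1}{3} - \varepsilon} - 3.$$
Therefore we take
$$\delta < \min\left\{3 - \frac{1}{\frac{1}{3} + \varepsilon},\ \frac{1}{\frac{1}{3} - \varepsilon} - 3\right\}$$
$$\forall \varepsilon > 0\ \exists \delta > 0\ \forall x,\ 0 < |x - 3| < \delta \to \left|\frac{1}{x} - \frac{1}{3}\right| < \varepsilon.$$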
Assume $\delta < 1$, so $0 < |x - 3| < \delta < 1$. Then by the Triangle Inequality, $|3| \le |3 - x| + |x| =
|x - 3| + |x| < 1 + |x|$, which rearranges to $|x| > 3 - 1 = 2$. Clearly $|x|$ is greater than zero, so we
can take the reciprocal of this inequality to get $\frac{1}{|x|} < \frac{1}{2}$.
Now note that
$$\left|\frac{1}{x} - \frac{1}{3}\right| = \left|\frac{3 - x}{3x}\right| = \frac{|3 - x|}{|3x|} = \frac{|x - 3|}{3|x|} = \frac{1}{3} \cdot \frac{1}{|x|} \cdot |x - 3| < \frac{1}{3} \cdot \frac{1}{2} \cdot \delta = \frac{\delta}{6}.$$
We want $\left|\frac{1}{x} - \frac{1}{3}\right|$ to be less than a given $\varepsilon$, so we should take $\delta < 6\varepsilon$. However, we have also
assumed $\delta < 1$, so it is sufficient to take $\delta < \min\{1, 6\varepsilon\}$ to satisfy both inequalities.
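A quick numerical spot-check of the choice $\delta < \min\{1, 6\varepsilon\}$ is sketched below. The function names and the particular value $\delta = \frac{1}{2}\min\{1, 6\varepsilon\}$ are illustrative assumptions; the check samples points near 3 and is no substitute for the argument above.

```python
def delta_for(eps):
    """One admissible delta: half of min{1, 6*eps} (any smaller positive value also works)."""
    return 0.5 * min(1.0, 6 * eps)

def holds(eps, samples=2000):
    """Check |1/x - 1/3| < eps at sample points with 0 < |x - 3| < delta."""
    d = delta_for(eps)
    for i in range(1, samples):
        x = 3 - d + 2 * d * i / samples
        if x != 3 and abs(1 / x - 1 / 3) >= eps:
            return False
    return True
```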
Problem 9.2.7. Prove $\lim_{x \to 1} \frac{x}{x + 1} = \frac{1}{2}$.
$$|x - 1| < \delta \longrightarrow \left|\frac{x}{x + 1} - \frac{1}{2}\right| < \varepsilon$$
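Note that $|x - 1| < \delta$ rearranges to $1 - \delta < x < 1 + \delta$, and $\left|\frac{x}{x + 1} - \frac{1}{2}\right| < \varepsilon$ rearranges to:
$$\begin{aligned}
\frac{1}{2} - \varepsilon &< \frac{x}{x + 1} < \frac{1}{2} + \varepsilon \\
-\frac{1}{2} + \varepsilon &> -\frac{x}{x + 1} > -\frac{1}{2} - \varepsilon \\
1 - \frac{1}{2} + \varepsilon &> 1 - \frac{x}{x + 1} > 1 - \frac{1}{2} - \varepsilon \\
\frac{1}{2} + \varepsilon &> \frac{1}{x + 1} > \frac{1}{2} - \varepsilon \\
\frac{1}{2} - \varepsilon &< \frac{1}{x + 1} < \frac{1}{2} + \varepsilon
\end{aligned}$$
Assume $\varepsilon < \frac{1}{2}$. Then, we can take the reciprocal of $\frac{1}{2} - \varepsilon < \frac{1}{x + 1} < \frac{1}{2} + \varepsilon$, i.e. $\frac{1}{\frac{1}{2} - \varepsilon} > x + 1 > \frac{1}{\frac{1}{2} + \varepsilon}$, which rearranges to $\frac{\frac{1}{2} - \varepsilon}{\frac{1}{2} + \varepsilon} < x < \frac{\frac{1}{2} + \varepsilon}{\frac{1}{2} - \varepsilon}$. Therefore our implication is
$$1 - \delta < x < 1 + \delta \longrightarrow \frac{\frac{1}{2} - \varepsilon}{\frac{1}{2} + \varepsilon} < x < \frac{\frac{1}{2} + \varepsilon}{\frac{1}{2} - \varepsilon}$$
It is sufficient that
$$\frac{\frac{1}{2} - \varepsilon}{\frac{1}{2} + \varepsilon} \le 1 - \delta,$$
$$1 + \delta \le \frac{\frac{1}{2} + \varepsilon}{\frac{1}{2} - \varepsilon},$$
so we have $\delta \le \frac{2\varepsilon}{\frac{1}{2} - \varepsilon}$ and $\delta \le \frac{2\varepsilon}{\frac{1}{2} + \varepsilon}$. Therefore, we take
$$\delta = \min\left\{\frac{2\varepsilon}{\frac{1}{2} - \varepsilon},\ \frac{2\varepsilon}{\frac{1}{2} + \varepsilon}\right\}$$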
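The rearrangement from the two sufficient conditions to the two bounds on $\delta$ is easy to get wrong, so it can be reassuring to confirm it in exact rational arithmetic. The following sketch uses Python's `fractions` module; the names `delta_bounds` and `delta` are assumptions introduced only for this check, and $\varepsilon$ is assumed to satisfy $0 < \varepsilon < \frac{1}{2}$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def delta_bounds(eps):
    """Return (1 - lower endpoint, upper endpoint - 1) for 0 < eps < 1/2."""
    lower = (HALF - eps) / (HALF + eps)
    upper = (HALF + eps) / (HALF - eps)
    return 1 - lower, upper - 1

def delta(eps):
    """The smaller of the two bounds, matching the min{...} above."""
    return min(delta_bounds(eps))

# Each bound should equal the closed form 2*eps / (1/2 +/- eps).
for eps in (Fraction(1, 10), Fraction(1, 4), Fraction(49, 100)):
    from_lower, from_upper = delta_bounds(eps)
    assert from_lower == 2 * eps / (HALF + eps)
    assert from_upper == 2 * eps / (HALF - eps)
```

With $\varepsilon = \frac{1}{10}$ the two bounds come out to $\frac{1}{3}$ and $\frac{1}{2}$, so the minimum is $\frac{1}{3}$.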
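Example 9.2.8
Prove $\lim_{x \to 4} (x^2 - 3x + 2) = 6$.
First, noticing that the quadratic $x^2 - 3x + 2 - 6$ is factorable, we rewrite the second part of the
implication:
$$|x^2 - 3x + 2 - 6| = |x^2 - 3x - 4| = |x - 4|\,|x + 1| < \varepsilon.$$
Assume $\delta < 1$, so that $|x - 4| < 1$. By the Triangle Inequality,
$$|x + 1| \le |x - 4| + |5| < 1 + 5 = 6.$$
We combine the facts $|x - 4| < \delta$ and $|x + 1| < 6$ to get that $|x - 4|\,|x + 1| < 6\delta$.
This suggests that we let $\delta < \frac{\varepsilon}{6}$, but we have also assumed that $\delta < 1$, therefore we take the
minimum of those two, i.e. let
$$\delta < \min\left\{1, \frac{\varepsilon}{6}\right\},$$
which will satisfy our definition. However, all of this work was done backwards, and cannot be
considered a proper, formal proof. Instead, when we write up the proof, we write it up forwards, as
shown:
Proof. Given an $\varepsilon > 0$, choose $\delta < \min\left\{1, \frac{\varepsilon}{6}\right\}$. Note that
$$|x - 4| < \delta \longrightarrow |x - 4| < 1 \longrightarrow |x + 1| \le |x - 4| + |5| < 6.$$
Then $|x^2 - 3x + 2 - 6| = |x - 4|\,|x + 1| < 6\delta < \varepsilon$,
i.e. $\lim_{x \to 4} (x^2 - 3x + 2) = 6$.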
Up to now, we have been proving limits of various functions. To make our lives easier, we will prove
some properties of limits that will allow us to take limits of many more kinds of functions.
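Proof. For the sake of contradiction, assume $L \ne M$. Then it follows that $|L - M| > 0$. Now
consider
$$\lim_{x \to a} f(x) = L \longleftrightarrow \forall \varepsilon > 0\ \exists \delta_1 > 0\ \forall x,\ 0 < |x - a| < \delta_1 \to |f(x) - L| < \varepsilon,$$
$$\lim_{x \to a} f(x) = M \longleftrightarrow \forall \varepsilon > 0\ \exists \delta_2 > 0\ \forall x,\ 0 < |x - a| < \delta_2 \to |f(x) - M| < \varepsilon.$$
Now consider $\varepsilon = \frac{|L - M|}{2}$ (since $\frac{|L - M|}{2} > 0$). Taking $\delta = \min\{\delta_1, \delta_2\}$, it becomes true that
$$0 < |x - a| < \delta \to |f(x) - L| < \frac{|L - M|}{2} \wedge |f(x) - M| < \frac{|L - M|}{2}.$$
Then for this chosen value of $\varepsilon$, note that
$$\begin{aligned}
|L - M| &= |L - f(x) + f(x) - M| \\
&\le |L - f(x)| + |f(x) - M| \\
&= |f(x) - L| + |f(x) - M| \\
&< \frac{|L - M|}{2} + \frac{|L - M|}{2} = |L - M|.
\end{aligned}$$
That is, $|L - M| < |L - M|$, which is a contradiction. Therefore $L = M$.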
Proof. Assume for the sake of contradiction that $L < 0$. We are given the definition
$$\forall \varepsilon > 0\ \exists \delta > 0\ \forall x,\ 0 < |x - a| < \delta \to |f(x) - L| < \varepsilon.$$
As $f(x) \ge 0$ is given, $f(x) > L$, i.e. $f(x) - L > 0$. Therefore $|f(x) - L| = f(x) - L$. Consider
$\varepsilon = -L$, which is valid because $-L > 0$. We can also choose $\varepsilon = -\frac{L}{2}$, or any other value in
terms of $L$ that would be positive. It follows that $\exists \delta > 0\ \forall x,\ 0 < |x - a| < \delta \to f(x) - L < -L$.
But $f(x) - L < -L$ rearranges to $f(x) < 0$, which contradicts the given $f(x) \ge 0\ \forall x$. Therefore
$L \ge 0$.
Proof. It is sufficient to let δ = ε. Then it is obvious that 0 < |x − a| < δ ←→ 0 < |x − a| < ε →
|x − a| < ε.
Proof. Any δ is valid, since ∀ε > 0 we have |c − c| = 0 < ε which is always true.
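b) $\lim_{x \to a} (f(x)g(x)) = LM$.
$$\forall \varepsilon > 0\ \exists \delta_1 > 0\ \forall x,\ 0 < |x - a| < \delta_1 \longrightarrow |f(x) - L| < \frac{\varepsilon}{2},$$
$$\exists \delta_2 > 0\ \forall x,\ 0 < |x - a| < \delta_2 \longrightarrow |g(x) - M| < \frac{\varepsilon}{2}.$$
This is allowed because we can always choose a $\delta_1$ or $\delta_2$ that is small enough to make each of
$|f(x) - L|$ and $|g(x) - M|$ smaller than $\frac{\varepsilon}{2}$.
Let $\delta = \min\{\delta_1, \delta_2\}$. This is because we want to be able to use the same $\delta$ for both definitions.
Then,
$$\forall \varepsilon > 0\ \exists \delta > 0\ \forall x,\ 0 < |x - a| < \delta \longrightarrow |f(x) - L| < \frac{\varepsilon}{2} \wedge |g(x) - M| < \frac{\varepsilon}{2}.$$
Note that $|f(x) + g(x) - (L + M)| = |(f(x) - L) + (g(x) - M)| \le |f(x) - L| + |g(x) - M|$ by
the Triangle Inequality. Therefore, given $\varepsilon > 0$, we have found a $\delta$ such that
$$|x - a| < \delta \longrightarrow |f(x) + g(x) - (L + M)| \le |f(x) - L| + |g(x) - M| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,$$
i.e. $\lim_{x \to a} (f(x) + g(x)) = L + M$, as desired.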
b) Let $\tilde{\varepsilon}$ represent some quantity in terms of $\varepsilon$. Eventually we will figure out exactly what
expression we should set $\tilde{\varepsilon}$ equal to, by doing the proof ‘backwards,’ to demonstrate the
motivation and intuition behind the choice of $\tilde{\varepsilon}$. After we find out what $\tilde{\varepsilon}$ exactly is, we will
write the proof forwards.
As we are given $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$, there exist $\delta_1, \delta_2 > 0$ such that
$$0 < |x - a| < \delta_1 \longrightarrow |f(x) - L| < \tilde{\varepsilon}, \qquad 0 < |x - a| < \delta_2 \longrightarrow |g(x) - M| < \tilde{\varepsilon}.$$
Like before, let $\delta = \min\{\delta_1, \delta_2\}$. Then we can cover both cases:
$$\forall \varepsilon > 0\ \exists \delta > 0\ \forall x,\ 0 < |x - a| < \delta \longrightarrow |f(x) - L| < \tilde{\varepsilon} \wedge |g(x) - M| < \tilde{\varepsilon}.$$
Now we apply the Triangle Inequality based on what we are trying to prove:
$$\begin{aligned}
|f(x)g(x) - LM| &= |f(x)g(x) - f(x)M + f(x)M - LM| \\
&= |f(x)(g(x) - M) + M(f(x) - L)| \\
&\le |f(x)|\,|g(x) - M| + |M|\,|f(x) - L|.
\end{aligned}$$
Now we should introduce a bound on $|f(x)|$. Therefore we should assume $\varepsilon < 1$. Then it
follows that $|f(x) - L| < 1$. By another application of the Triangle Inequality,
$$|f(x)| \le |f(x) - L| + |L| < 1 + |L|.$$
Assume $\varepsilon < 1$, so that $|f(x) - L| < 1$. Then $|f(x)| \le |f(x) - L| + |L| < 1 + |L|$. Therefore,
we conclude that
$$\begin{aligned}
|f(x)g(x) - LM| &= |f(x)g(x) - f(x)M + f(x)M - LM| \\
&= |f(x)(g(x) - M) + M(f(x) - L)| \\
&\le |f(x)|\,|g(x) - M| + |M|\,|f(x) - L| \\
&< (1 + |L|)\,\frac{\varepsilon}{1 + |L| + |M|} + |M|\,\frac{\varepsilon}{1 + |L| + |M|} \\
&= (1 + |L| + |M|)\,\frac{\varepsilon}{1 + |L| + |M|} \\
&= \varepsilon.
\end{aligned}$$
Proof. It follows from Theorem 9.3.4 and Theorem 9.3.5 that $\lim_{x \to a} (f(x) - g(x)) = \lim_{x \to a} f(x) +
\lim_{x \to a} (-g(x)) = \lim_{x \to a} f(x) + \lim_{x \to a} (-1) \lim_{x \to a} g(x) = \lim_{x \to a} f(x) - \lim_{x \to a} g(x)$.
Problem 9.3.7. Prove $\lim_{x \to 5} (2x^3 - x^2) = 225$ using the limit properties we have just established.
Proof. We can ‘break up’ the polynomial into smaller ‘pieces’ by applying the sum, difference, and
product rules of limits.
Lemma 9.3.8
If $0 \le j(x) \le k(x)\ \forall x$ and $\lim_{x \to a} k(x) = 0$, then $\lim_{x \to a} j(x) = 0$.
Proof. It follows that $k(x) - j(x) \ge 0\ \forall x$. By Problem 9.3.2, $\lim_{x \to a} (k(x) - j(x)) \ge 0$. But note that
$\lim_{x \to a} (k(x) - j(x)) = \lim_{x \to a} k(x) - \lim_{x \to a} j(x) = -\lim_{x \to a} j(x)$, therefore $\lim_{x \to a} j(x) \le 0$. However, we are also
given $j(x) \ge 0\ \forall x$, so by Problem 9.3.2, $\lim_{x \to a} j(x) \ge 0$. Thus we can only conclude $\lim_{x \to a} j(x) = 0$.
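Proof. The given inequality rearranges to $0 \le g(x) - f(x) \le h(x) - f(x)$. Note that $\lim_{x \to a} (h(x) - f(x)) =
\lim_{x \to a} h(x) - \lim_{x \to a} f(x) = L - L = 0$. By Lemma 9.3.8, $\lim_{x \to a} (g(x) - f(x)) = 0$, i.e. $\lim_{x \to a} g(x) = \lim_{x \to a} f(x) =
L$, as desired.
Alternative Proof. In fact, we could prove this theorem from scratch, without using previous lemmas.
By definition of limits, given $\varepsilon > 0$ there exist $\delta_1, \delta_2$ such that
$$0 < |x - a| < \delta_1 \to |f(x) - L| < \varepsilon, \qquad 0 < |x - a| < \delta_2 \to |h(x) - L| < \varepsilon.$$
Take $\delta = \min\{\delta_1, \delta_2\}$. Then, for $0 < |x - a| < \delta$,
$$L - \varepsilon < f(x) \le g(x) \le h(x) < L + \varepsilon \implies |g(x) - L| < \varepsilon,$$
so this choice of $\delta$ establishes that $\lim_{x \to a} g(x) = L$.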
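Proof. Given $\varepsilon > 0$, there exists $\delta_1$ such that $0 < |x - a| < \delta_1 \to |g(x) - L| < \frac{|L|}{2}$, since $\frac{|L|}{2}$ is a
number greater than 0. Likewise, there exists $\delta_2$ such that $0 < |x - a| < \delta_2 \to |g(x) - L| < \frac{\varepsilon |L|^2}{2}$.
Then take $\delta = \min\{\delta_1, \delta_2\}$, such that
$$0 < |x - a| < \delta \to |g(x) - L| < \frac{|L|}{2} \wedge |g(x) - L| < \frac{\varepsilon |L|^2}{2}.$$
By the Triangle Inequality, $|L| \le |L - g(x)| + |g(x)| = |g(x) - L| + |g(x)| < \frac{|L|}{2} + |g(x)|$, therefore
$|g(x)| > \frac{|L|}{2}$. As both quantities are positive, we can take the reciprocal to get $\frac{1}{|g(x)|} < \frac{2}{|L|}$. Note
that we applied the Triangle Inequality on $|L|$ because we seek a lower bound for $|g(x)|$, such that
we have an upper bound for $\frac{1}{|g(x)|}$, which is needed to finish the proof.
Therefore, we have found a $\delta$ such that
$$0 < |x - a| < \delta \to \left|\frac{1}{g(x)} - \frac{1}{L}\right| = \frac{|g(x) - L|}{|g(x)|\,|L|} < \frac{\varepsilon |L|^2}{2} \cdot \frac{2}{|L|} \cdot \frac{1}{|L|} = \varepsilon$$
given $\varepsilon > 0$, and we can conclude that $\lim_{x \to a} \frac{1}{g(x)} = \frac{1}{L}$.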
Proof. This follows from application of the reciprocal rule with the product rule.
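Proof. Let $P(x) = \sum_{k=0}^{n} b_k x^k$. Then,
$$\begin{aligned}
\lim_{x \to a} P(x) &= \lim_{x \to a} \sum_{k=0}^{n} b_k x^k \\
&= \sum_{k=0}^{n} \lim_{x \to a} b_k \cdot \left(\lim_{x \to a} x\right)^k \\
&= \sum_{k=0}^{n} b_k (a)^k \\
&= P(a).
\end{aligned}$$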
This theorem officially establishes that we can find the limit of any polynomial by simply plugging
in the number and computing the answer.
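$$\lim_{x \to a} \frac{P(x)}{Q(x)} = \frac{P(a)}{Q(a)}.$$
1. $\lim_{x \to 2} \frac{x^2 - 5}{x^2 + 3x + 1}$
2. $\lim_{x \to 2} \frac{x^2 - 4}{x^2 - 3x + 2}$
3. $\lim_{x \to 2} \frac{x^3 - 8}{x^2 - 5x + 6}$
4. $\lim_{x \to -3} \frac{x^3 + 27}{x^4 - 81}$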
Proof.
1. $\lim_{x \to 2} \frac{x^2 - 5}{x^2 + 3x + 1} = \frac{2^2 - 5}{2^2 + 3 \cdot 2 + 1} = -\frac{1}{11}$.
2. We cannot directly plug in $x = 2$, since the denominator becomes 0. However, the polynomials
in the numerator and denominator both share $x - 2$ as a common factor, so we cancel
those out, leaving us with an expression for which we can plug in $x = 2$ without issues:
$$\lim_{x \to 2} \frac{x^2 - 4}{x^2 - 3x + 2} = \lim_{x \to 2} \frac{(x + 2)(x - 2)}{(x - 2)(x - 1)} = \lim_{x \to 2} \frac{x + 2}{x - 1} = 4.$$
3. $\lim_{x \to 2} \frac{x^3 - 8}{x^2 - 5x + 6} = \lim_{x \to 2} \frac{(x - 2)(x^2 + 2x + 4)}{(x - 2)(x - 3)} = \lim_{x \to 2} \frac{x^2 + 2x + 4}{x - 3} = -12$.
4. $$\begin{aligned}
\lim_{x \to -3} \frac{x^3 + 27}{x^4 - 81} &= \lim_{x \to -3} \frac{(x + 3)(x^2 - 3x + 9)}{(x^2 + 9)(x^2 - 9)} \\
&= \lim_{x \to -3} \frac{(x + 3)(x^2 - 3x + 9)}{(x^2 + 9)(x + 3)(x - 3)} \\
&= \lim_{x \to -3} \frac{x^2 - 3x + 9}{(x^2 + 9)(x - 3)} \\
&= \frac{(-3)^2 - 3(-3) + 9}{((-3)^2 + 9)(-3 - 3)} \\
&= -\frac{1}{4}.
\end{aligned}$$
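The four values above can be spot-checked by evaluating each rational function at points very close to the limit point. The sketch below uses exact rational arithmetic from Python's `fractions` module so that the 0/0 cancellations do not suffer from rounding; the function names (`q1` to `q4`, `approach`) are illustrative assumptions, and the check is only evidence, not a proof.

```python
from fractions import Fraction

def q1(x): return (x**2 - 5) / (x**2 + 3*x + 1)
def q2(x): return (x**2 - 4) / (x**2 - 3*x + 2)
def q3(x): return (x**3 - 8) / (x**2 - 5*x + 6)
def q4(x): return (x**3 + 27) / (x**4 - 81)

def approach(f, a, k=6):
    """Evaluate f exactly at a + 10^-k and a - 10^-k."""
    h = Fraction(1, 10**k)
    return f(a + h), f(a - h)

# (function, limit point, value claimed in the worked solutions)
CLAIMS = [
    (q1, 2, Fraction(-1, 11)),
    (q2, 2, Fraction(4)),
    (q3, 2, Fraction(-12)),
    (q4, -3, Fraction(-1, 4)),
]

for f, a, claimed in CLAIMS:
    right, left = approach(f, a)
    print(f.__name__, float(right), float(left), float(claimed))
```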
When we say “x approaches a,” it can either indicate that x is greater than a so x would be decreasing
to get closer to a, or x is less than a and x would be increasing to get closer to a.
In our definition of the limit so far, we dealt with this by taking the absolute value of x − a to
get the distance between them. However, we can also specify whether x > a or x < a, through two
types of limits:
The right-hand limit deals with x approaching a from the right, while the left-hand limit deals
with x approaching a from the left.
Notice that the only difference between their definitions and the original definition is the
replacement of |x − a| with either x − a or a − x. These conditions indicate x > a and x < a
respectively.
Since |x − a| can only be one of these two expressions, we have an obvious result that you should
attempt to prove.
Exercise 9.4.3. Given $\lim_{x \to a^+} f(x)$ and $\lim_{x \to a^-} f(x)$ exist, prove that $\lim_{x \to a^+} f(x) = \lim_{x \to a^-} f(x) = L \longleftrightarrow
\lim_{x \to a} f(x) = L$.
Problem 9.4.4. Let $f(x)$ be the function considered in Example 9.1.5. Evaluate $\lim_{x \to 4^+} f(x)$ and
$\lim_{x \to 4^-} f(x)$.
Solution. For $\lim_{x \to 4^+} f(x)$, we are considering $x > 4$. By the piecewise definition, we must have
$\lim_{x \to 4^+} f(x) = 5$.
Likewise, for $\lim_{x \to 4^-} f(x)$, $x < 4$ in this context, so $\lim_{x \to 4^-} f(x) = 7$.
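Problem 9.4.5. Prove $\lim_{x \to 0^+} \frac{|x|}{x} = 1$.
$$0 < x - 0 < \delta \longrightarrow \left|\frac{|x|}{x} - 1\right| < \varepsilon.$$
Since $x > 0$, we know that $\frac{|x|}{x} = \frac{x}{x} = 1$, i.e. $\left|\frac{|x|}{x} - 1\right| = 0 < \varepsilon$ which is true.
Problem 9.4.6. Prove $\lim_{x \to 0^-} \frac{|x|}{x} = -1$.
$$0 < 0 - x < \delta \longrightarrow \left|\frac{|x|}{x} - (-1)\right| < \varepsilon.$$
Since $-\delta < x < 0$, we know that $\frac{|x|}{x} = \frac{-x}{x} = -1$. Then,
$$\left|\frac{|x|}{x} - (-1)\right| = |-1 - (-1)| = 0,$$
which is less than any $\varepsilon > 0$.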
Proof. We use the result of Exercise 9.4.3. Note that $\lim_{x \to 5^+} f(x) = \lim_{x \to 5^+} (x^2 - 1) = 24$, and $\lim_{x \to 5^-} f(x) =
\lim_{x \to 5^-} (7x - 11) = 24$. As $\lim_{x \to 5^+} f(x) = \lim_{x \to 5^-} f(x) = 24$, $\lim_{x \to 5} f(x) = 24$.
Problem 9.4.8. Find $\lim_{x \to 4} \frac{|x^2 - 16|}{x - 4}$.
Solution. First, we factor: $\lim_{x \to 4} \frac{|x^2 - 16|}{x - 4} = \lim_{x \to 4} \frac{|x - 4|\,|x + 4|}{x - 4}$.
Now, we consider the right-hand and left-hand limits separately.
For the right-hand limit, $\lim_{x \to 4^+} \frac{|x - 4|\,|x + 4|}{x - 4}$, note that $x > 4$ in this case. Then, $|x - 4| = x - 4$.
Thus,
$$\lim_{x \to 4^+} \frac{|x - 4|\,|x + 4|}{x - 4} = \lim_{x \to 4^+} \frac{(x - 4)\,|x + 4|}{x - 4} = \lim_{x \to 4^+} |x + 4| = 8.$$
For the left-hand limit, $\lim_{x \to 4^-} \frac{|x - 4|\,|x + 4|}{x - 4}$, we must have $x < 4$. Thus, $|x - 4| = 4 - x$, and we
have
$$\lim_{x \to 4^-} \frac{|x - 4|\,|x + 4|}{x - 4} = \lim_{x \to 4^-} \frac{(4 - x)\,|x + 4|}{x - 4} = \lim_{x \to 4^-} -|x + 4| = -8.$$
Since $\lim_{x \to 4^+} \frac{|x - 4|\,|x + 4|}{x - 4} \ne \lim_{x \to 4^-} \frac{|x - 4|\,|x + 4|}{x - 4}$, $\lim_{x \to 4} \frac{|x - 4|\,|x + 4|}{x - 4}$, i.e. $\lim_{x \to 4} \frac{|x^2 - 16|}{x - 4}$, does not
exist.
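The disagreement between the two one-sided limits is easy to see numerically. The following is a small sketch (the names `g` and `one_sided_values` are assumptions for illustration) that evaluates the function at points approaching 4 from each side using ordinary floating-point arithmetic.

```python
def g(x):
    """The function |x^2 - 16| / (x - 4), undefined at x = 4."""
    return abs(x * x - 16) / (x - 4)

def one_sided_values(f, a, steps=(1e-2, 1e-4, 1e-6)):
    """Return values of f at a + h (right side) and a - h (left side) for shrinking h."""
    right = [f(a + h) for h in steps]
    left = [f(a - h) for h in steps]
    return right, left

right, left = one_sided_values(g, 4.0)
print("from the right:", right)  # values near 8
print("from the left: ", left)   # values near -8
```

Since the right-hand values stay near 8 and the left-hand values stay near $-8$, no single number can serve as the two-sided limit, in agreement with the solution.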
Up to now, we have been dealing with x approaching some finite number a. However, as we did
for sequences, we can define x going to infinity as well. In fact, since we are dealing with functions,
it is possible for x to go to negative infinity as well!
As the Cartesian plane extends infinitely in both dimensions, we could also have f (x) going to
infinity or negative infinity for some function f .
Keeping these possibilities in mind, we define new kinds of limits for these cases:
Definition 9.4.9 (Infinite Limits). We introduce new variables N and M to deal with cases when
x or f (x) go to ∞ or −∞. Then,
How did we even come up with these definitions? Well, if we have x or f (x) approaching some
finite value, then we NEED to signify that the distance between the two is less than some chosen
positive number: we use δ for x approaching a and ε for f (x) approaching the limit L. Of course,
the “distance” is represented by taking the absolute value of the difference between the two.
Otherwise, if we have ∞ or −∞ involved, then we need to indicate that the value in question (x
or f (x)) increases or decreases without bound.
For example, if we wanted to show x → ∞, then we would use the idea that for ANY real number
N you choose (even when N is an incredibly large number), x would always be greater than that
number. This fits with the notion of infinity - x cannot be less than any number since it is always
increasing.
Likewise, if we had x → −∞, then x would always be less than any real number we choose, since
it is decreasing.
The exact same reasoning applies to f (x) as well.
Thus, whenever we have f (x) going to some finite limit L, then we would include ∀ε > 0 and
|f (x) − L| < ε.
However, if f (x) went to ∞ or −∞, then we would have ∀M and f (x) > M or f (x) < M
respectively.
If x went to some finite number a, then we would say ∃δ > 0 and 0 < |x − a| < δ.
If x went to ∞ or −∞, then we would define ∃N and x > N or x < N respectively.
Then, when we consider the right-hand or left-hand limits as f (x) goes to ∞ or −∞, we can just
replace 0 < |x − a| < δ (in the original definition) with 0 < x − a < δ or 0 < a − x < δ respectively.
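Problem 9.4.10. Prove $\lim_{x \to \infty} \frac{1}{x} = 0$.
$$\forall \varepsilon > 0\ \exists N\ \forall x,\ x > N \to \left|\frac{1}{x} - 0\right| < \varepsilon.$$
Assume $N > 0$ (we have the same freedom in choosing $N$ as we had with $\delta$). Then $x > N >
0 \to \frac{1}{x} < \frac{1}{N}$. Since $x > 0$ by assumption, $x = |x|$, so we have $\frac{1}{|x|} < \frac{1}{N}$, i.e. $\left|\frac{1}{x} - 0\right| < \frac{1}{N}$. We
want $\left|\frac{1}{x} - 0\right|$ to be less than a given $\varepsilon$, which suggests that $\frac{1}{N} < \varepsilon$. Therefore it is sufficient to take
$N > \frac{1}{\varepsilon}$.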
Problem 9.4.11. Prove $\lim_{x \to 0^-} \frac{1}{x} = -\infty$.
$$\forall M\ \exists \delta > 0\ \forall x,\ 0 < 0 - x < \delta \to \frac{1}{x} < M.$$
We rewrite the former inequality as $0 > x > -\delta$. Since $x$ and $-\delta$ are both negative, we can take
the reciprocal to get $\frac{1}{x} < -\frac{1}{\delta}$. However, we want $\frac{1}{x}$ to be less than a given $M$, which we may assume to be negative, so it is sufficient to
take $-\frac{1}{\delta} < M$, i.e. $\delta < -\frac{1}{M}$.
Problem 9.4.12. Prove $\lim_{x \to 3^-} \frac{1}{x - 3} = -\infty$.
First, note that $0 < 3 - x < \delta$ can be rewritten as $0 > x - 3 > -\delta$.
If some value of $N$ works, then any value greater than it must also work. Therefore, we can
assume that $N < 0$. Now, we can reciprocate the inequality $\frac{1}{x - 3} < N$ to get $x - 3 > \frac{1}{N}$. Now,
it becomes clear that we want $-\delta = \frac{1}{N}$, or $\delta = -\frac{1}{N}$. Then, $0 > x - 3 > \frac{1}{N} \longrightarrow \frac{1}{x - 3} < N$ as
desired.
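Problem 9.4.13. Prove $\lim_{x \to \infty} \frac{\sin x}{x} = 0$.
$$\forall \varepsilon > 0\ \exists N\ \forall x,\ x > N \longrightarrow \left|\frac{\sin x}{x}\right| < \varepsilon.$$
Proof. Assume by contradiction that the limit $L$ exists. The definition for $\lim_{x \to \infty} \sin x = L$ is: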
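Example 9.5.1
Prove that $\lim_{\theta \to 0} \sin\theta = 0$ and $\lim_{\theta \to 0} \cos\theta = 1$.
Proof. Assume $0 < \theta < \frac{\pi}{2}$, where $\theta$ is in radians. We only care about $\theta$ being close to 0, which is
why we restrict $\theta$ to be less than $\frac{\pi}{2}$ radians. Consider the unit circle:
[Figure: unit circle with labeled points O, A, B, the angle θ, and the lengths 1, sin θ, cos θ, and 1 − cos θ]
We will take for granted that $\overset{\frown}{BC} > BC$ (we will gloss over the rigorous proof for this), and
since $BC$ is the hypotenuse of $\triangle ABC$, $BC > AC$ and $BC > AB$. Therefore, $\overset{\frown}{BC} > AC$ and
$\overset{\frown}{BC} > AB$. Since $\theta$ is in radians, the length of $\overset{\frown}{BC}$ is $\theta$. Note that $AC = \sin\theta$ and $AB = 1 - \cos\theta$.
We have the inequalities:
$$0 < \sin\theta < \theta, \qquad 0 < 1 - \cos\theta < \theta.$$
As $\lim_{\theta \to 0^+} 0 = \lim_{\theta \to 0^+} \theta = 0$, by Theorem 9.3.9, $\lim_{\theta \to 0^+} \sin\theta = \lim_{\theta \to 0^+} (1 - \cos\theta) = 0$; the latter rearranges
to $\lim_{\theta \to 0^+} \cos\theta = 1$.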
Lemma 9.5.2
Assuming both limits exist, $\lim_{x \to a^+} f(x) = \lim_{x \to -a^-} f(-x)$.
which is the definition of $\lim_{x \to -a^-} f(-x) = L$. Therefore the definitions are equivalent, i.e. $\lim_{x \to a^+} f(x) =
\lim_{x \to -a^-} f(-x)$.
By Lemma 9.5.2, it follows that $\lim_{\theta \to 0^+} \sin\theta = \lim_{\theta \to -0^-} \sin(-\theta) = -\lim_{\theta \to -0^-} \sin\theta = -\lim_{\theta \to 0^-} \sin\theta$.
Since $\lim_{\theta \to 0^+} \sin\theta = 0$, $-\lim_{\theta \to 0^-} \sin\theta = 0$, i.e. $\lim_{\theta \to 0^-} \sin\theta = 0$. By Exercise 9.4.3, we conclude that
$\lim_{\theta \to 0} \sin\theta = 0$.
Similarly, we note that $\lim_{\theta \to 0^+} \cos\theta = \lim_{\theta \to -0^-} \cos(-\theta) = \lim_{\theta \to -0^-} \cos\theta = \lim_{\theta \to 0^-} \cos\theta$ by Lemma 9.5.2.
Since $\lim_{\theta \to 0^+} \cos\theta = 1$, we have $\lim_{\theta \to 0^-} \cos\theta = 1$. By Exercise 9.4.3, we conclude that $\lim_{\theta \to 0} \cos\theta = 1$.
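Lemma 9.5.3
$\lim_{x \to a} f(x) = L \longleftrightarrow \lim_{h \to 0} f(a + h) = L$.
Substitute $x \to a + h$. Then,
$$\forall \varepsilon > 0\ \exists \delta > 0\ \forall h,\ 0 < |(a + h) - a| < \delta \to |f(a + h) - L| < \varepsilon,$$
which simplifies to
$$\forall \varepsilon > 0\ \exists \delta > 0\ \forall h,\ 0 < |h| < \delta \to |f(a + h) - L| < \varepsilon,$$
i.e. the definition of $\lim_{h \to 0} f(a + h) = L$. Since the definitions are equivalent, $\lim_{x \to a} f(x) = L \longleftrightarrow
\lim_{h \to 0} f(a + h) = L$.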
Proof. We will prove this for sin x and cos x, and the rest will follow by Theorem 9.3.10.
We proceed by Lemma 9.5.3 on sin x and cos x.
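Theorem 9.5.5
$$\lim_{\theta \to 0} \frac{\sin\theta}{\theta} = 1.$$
Proof. Assume $0 < \theta < \frac{\pi}{2}$, where $\theta$ is in radians. Consider the following diagram:
[Figure: unit-circle diagram with labeled points O, A, B, C, the angle θ, and the lengths 1, sin θ, cos θ, and tan θ]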
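Lemma 9.5.6
$\lim_{x \to 0} f(x) = L \longleftrightarrow \lim_{x \to 0} f(kx) = L$ provided $k \ne 0$.
which rearranges to
$$\forall \varepsilon > 0\ \exists \delta_1 > 0\ \forall x,\ 0 < |k|\,|x| < \delta_1 \to |f(kx) - L| < \varepsilon.$$
Choose $\delta_2 = \frac{\delta_1}{|k|}$. It follows that
$$\forall \varepsilon > 0\ \exists \delta_2 > 0\ \forall x,\ 0 < |x| < \delta_2 \to |f(kx) - L| < \varepsilon.$$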
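Lemma 9.5.7
$\lim_{x \to a} f(x) = L \longleftrightarrow \lim_{x \to \frac{a}{k}} f(kx) = L$ provided $k \ne 0$.
which rearranges to
$$\forall \varepsilon > 0\ \exists \delta_1 > 0\ \forall x,\ 0 < |k| \left|x - \frac{a}{k}\right| < \delta_1 \to |f(kx) - L| < \varepsilon.$$
Choose $\delta_2 = \frac{\delta_1}{|k|}$. It follows that
$$\forall \varepsilon > 0\ \exists \delta_2 > 0\ \forall x,\ 0 < \left|x - \frac{a}{k}\right| < \delta_2 \to |f(kx) - L| < \varepsilon,$$
which is just the definition of $\lim_{x \to \frac{a}{k}} f(kx) = L$.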
1. $\lim_{x \to 0} \cos 2x$
2. $\lim_{x \to 0} \frac{\sin 2x}{2x}$
3. $\lim_{x \to 0} \frac{\sin 2x}{3x}$
4. $\lim_{x \to 0} \frac{\sin 6x}{\sin 5x}$
5. $\lim_{x \to 0} \frac{\tan 3x}{x}$
6. $\lim_{x \to 0} \frac{1 - \cos x}{x}$
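Solution. We apply the results of Example 9.5.1, Theorem 9.5.5 with Lemma 9.5.6 after some clever
algebraic manipulations.
1. $\lim_{x \to 0} \cos 2x = 1$.
2. $\lim_{x \to 0} \frac{\sin 2x}{2x} = 1$.
3. $\lim_{x \to 0} \frac{\sin 2x}{3x} = \lim_{x \to 0} \frac{\sin 2x}{2x} \cdot \frac{2}{3} = \frac{2}{3}$.
4. $\lim_{x \to 0} \frac{\sin 6x}{\sin 5x} = \lim_{x \to 0} \frac{\frac{\sin 6x}{x}}{\frac{\sin 5x}{x}} = \lim_{x \to 0} \frac{\frac{\sin 6x}{6x} \cdot 6}{\frac{\sin 5x}{5x} \cdot 5} = \frac{6}{5} \lim_{x \to 0} \frac{\frac{\sin 6x}{6x}}{\frac{\sin 5x}{5x}} = \frac{6}{5} \cdot \frac{1}{1} = \frac{6}{5}$.
5. $\lim_{x \to 0} \frac{\tan 3x}{x} = \lim_{x \to 0} \frac{\sin 3x}{x \cos 3x} = \lim_{x \to 0} \frac{\frac{\sin 3x}{3x}}{\frac{x}{3x} \cdot \cos 3x} = \lim_{x \to 0} \frac{\frac{\sin 3x}{3x}}{\frac{1}{3} \cos 3x} = \frac{\lim_{x \to 0} \frac{\sin 3x}{3x}}{\frac{1}{3} \lim_{x \to 0} \cos 3x} = \frac{1}{\frac{1}{3} \cdot 1} = 3$.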
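The five values worked out above can be compared against direct numerical evaluation near $x = 0$. This is only a sanity check in floating-point arithmetic; the dictionary `CLAIMS` and the helper `near_zero` are names assumed for this sketch, and the step size $10^{-5}$ is an arbitrary choice.

```python
import math

# expression name -> (function, value obtained in the solutions above)
CLAIMS = {
    "cos 2x": (lambda x: math.cos(2 * x), 1.0),
    "sin 2x / 2x": (lambda x: math.sin(2 * x) / (2 * x), 1.0),
    "sin 2x / 3x": (lambda x: math.sin(2 * x) / (3 * x), 2 / 3),
    "sin 6x / sin 5x": (lambda x: math.sin(6 * x) / math.sin(5 * x), 6 / 5),
    "tan 3x / x": (lambda x: math.tan(3 * x) / x, 3.0),
}

def near_zero(f, h=1e-5):
    """Evaluate f just to the right and just to the left of 0."""
    return f(h), f(-h)

for name, (f, claimed) in CLAIMS.items():
    right, left = near_zero(f)
    print(f"{name}: right={right:.10f} left={left:.10f} claimed={claimed:.10f}")
```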
$$\lim_{x \to \infty} \frac{a_n x^n + a_{n-1} x^{n-1} + \ldots + a_1 x + a_0}{b_m x^m + b_{m-1} x^{m-1} + \ldots + b_1 x + b_0}$$
As x goes to ∞, any term of x raised to a negative power goes to 0. Therefore, the denominator
goes to 0 and the numerator goes to an , so the limit is ±∞, where the sign would depend on the
sign of an and whether the other terms approached 0+ or 0− .
If deg P = deg Q, then n = m. We will primarily use n. We have
All terms with $x$ raised to a negative power approach 0 as $x$ goes to $\infty$, so we are left with $\frac{a_n}{b_n}$ as
our limit.
Lastly, if deg P < deg Q, then n < m. We have
Similar to the first case, we divide the numerator and denominator by xm . Then,
$$\lim_{x \to \infty} \frac{a_n x^n + a_{n-1} x^{n-1} + \ldots + a_1 x + a_0}{b_m x^m + b_{m-1} x^{m-1} + \ldots + b_1 x + b_0}$$
As $x$ goes to $\infty$, all terms in the numerator go to 0, and all terms except for $b_m$ in the denominator
go to 0, so the limit is $\frac{0}{b_m} = 0$.
Problem 9.6.2. Evaluate and justify the following limits:
1. $\lim_{x \to 3} \frac{x^2 - 7x + 12}{x^3 - 27}$
2. $\lim_{x \to \infty} \left(\sqrt{x^2 + 2x} - x\right)$
3. $\lim_{x \to -\infty} \left(\sqrt{x^2 + 2x} - x\right)$
Solution. 1. As we cannot directly plug in and compute, we first factor and cancel out like terms:
$$\lim_{x \to 3} \frac{x^2 - 7x + 12}{x^3 - 27} = \lim_{x \to 3} \frac{(x - 3)(x - 4)}{(x - 3)(x^2 + 3x + 9)} = \lim_{x \to 3} \frac{x - 4}{x^2 + 3x + 9} = -\frac{1}{27}.$$
2. The radical motivates us to ‘rationalize the numerator’:
$$\lim_{x \to \infty} \left(\sqrt{x^2 + 2x} - x\right) = \lim_{x \to \infty} \left(\sqrt{x^2 + 2x} - x\right) \cdot \frac{\sqrt{x^2 + 2x} + x}{\sqrt{x^2 + 2x} + x} = \lim_{x \to \infty} \frac{2x}{\sqrt{x^2 + 2x} + x}.$$
As we are considering $x$ going to $\infty$, we assume
$x > 0$ and divide the numerator and denominator by $x$, which means that we divide the inner
content of the radical by $x^2$, as such:
$$\lim_{x \to \infty} \frac{2x}{\sqrt{x^2 + 2x} + x} = \lim_{x \to \infty} \frac{2}{\sqrt{1 + \frac{2}{x}} + 1} = \frac{2}{2} = 1.$$
3. As shown in the previous problem, $\lim_{x \to -\infty} \left(\sqrt{x^2 + 2x} - x\right) = \lim_{x \to -\infty} \frac{2x}{\sqrt{x^2 + 2x} + x}$. However, as
we are considering $x$ going to $-\infty$, we can safely assume $x < 0$, so $x = -\sqrt{x^2}$, suggesting that
when we divide the numerator and denominator by $x$, we divide the inner content of the radical
by $x^2$, then make the radical a negative term:
$$\lim_{x \to -\infty} \frac{2x}{\sqrt{x^2 + 2x} + x} = \lim_{x \to -\infty} \frac{2}{-\sqrt{1 + \frac{2}{x}} + 1}.$$
As $x$ goes to $-\infty$, $\sqrt{1 + \frac{2}{x}} \to 1^-$, so $-\sqrt{1 + \frac{2}{x}} \to -1^+$, i.e. the denominator $-\sqrt{1 + \frac{2}{x}} + 1$
approaches $0^+$. The numerator stays at 2, so since the overall sign is positive, the limit is
$\infty$.
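Parts 2 and 3 can be illustrated numerically. The sketch below (function names are assumptions for illustration) evaluates both the original expression and its rationalized form for large positive $x$, and the original expression for large negative $x$. Note that for large positive $x$ the original form subtracts two nearly equal numbers, so the rationalized form is the more trustworthy one in floating-point arithmetic.

```python
import math

def original(x):
    """sqrt(x^2 + 2x) - x, as written in the problem."""
    return math.sqrt(x * x + 2 * x) - x

def rationalized(x):
    """2x / (sqrt(x^2 + 2x) + x), the form obtained after rationalizing (x > 0)."""
    return 2 * x / (math.sqrt(x * x + 2 * x) + x)

for x in (1e2, 1e4, 1e6):
    print(x, original(x), rationalized(x))   # both columns drift toward 1

for x in (-1e2, -1e4, -1e6):
    print(x, original(x))                    # grows without bound
```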
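1. $\lim_{x \to -2^+} \frac{x + 5}{x^2 - 4}$
2. $\lim_{x \to \infty} \left(\sqrt{x + 1} - \sqrt{x}\right)$
3. $\lim_{x \to \infty} \left(\sqrt{x^2 + x} - x\right)$
4. $\lim_{x \to -\infty} \left(\sqrt{x^2 + x} + x\right)$
5. $\lim_{x \to \infty} \frac{2x^2 + 3x + 1}{5x^2 - 2x + 3}$
6. $\lim_{x \to -\infty} \frac{x^3 + x^2 + x + 1}{2x^2 - 4x + 1}$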
Solution.
1. Note that we can rewrite this to $\lim_{x \to -2^+} \frac{x + 5}{(x + 2)(x - 2)}$. As $x$ approaches $-2$ from the positive
side, we see that $x + 5$ will approach 3, $x - 2$ will approach $-4$, and $x + 2$ will approach 0
from the positive side, i.e. $0^+$.
Note that the signs of 3 and $0^+$ are positive, but the sign of $-4$ is negative, therefore the
overall sign of the limit is negative. We also note that 0 is in the denominator, thus we can say
that the limit goes to $-\infty$.
3. Likewise,
$$\begin{aligned}
\lim_{x \to \infty} \left(\sqrt{x^2 + x} - x\right) &= \lim_{x \to \infty} \frac{\sqrt{x^2 + x} - x}{\sqrt{x^2 + x} + x} \cdot \left(\sqrt{x^2 + x} + x\right) \\
&= \lim_{x \to \infty} \frac{x^2 + x - x^2}{\sqrt{x^2 + x} + x} \\
&= \lim_{x \to \infty} \frac{x}{\sqrt{x^2 + x} + x}.
\end{aligned}$$
We then divide both the numerator and denominator by $x$. To deal with the radical in the
denominator, note that we are considering $x$ approaching $\infty$, so we can assume $x > 0$, from
which it follows that $x = \sqrt{x^2}$, i.e. divide that radical by $\sqrt{x^2}$, resulting in:
$$\lim_{x \to \infty} \frac{1}{\sqrt{1 + \frac{1}{x}} + 1} = \frac{1}{2}.$$
4. This problem is very similar to the previous one. Again, we rationalize the numerator:
$$\begin{aligned}
\lim_{x \to -\infty} \left(\sqrt{x^2 + x} + x\right) &= \lim_{x \to -\infty} \frac{\sqrt{x^2 + x} + x}{\sqrt{x^2 + x} - x} \cdot \left(\sqrt{x^2 + x} - x\right) \\
&= \lim_{x \to -\infty} \frac{x^2 + x - x^2}{\sqrt{x^2 + x} - x} \\
&= \lim_{x \to -\infty} \frac{x}{\sqrt{x^2 + x} - x}.
\end{aligned}$$
$$\lim_{x \to -\infty} \frac{1}{-\sqrt{1 + \frac{1}{x}} - 1} = -\frac{1}{2}.$$
5. Divide both the numerator and denominator by the highest power of $x$ common to both
polynomials, i.e. $x^2$, resulting in:
$$\lim_{x \to \infty} \frac{2 + \frac{3}{x} + \frac{1}{x} \cdot \frac{1}{x}}{5 - \frac{2}{x} + 3 \cdot \frac{1}{x} \cdot \frac{1}{x}}.$$
It then becomes obvious that all fractions with $x$ as the denominator go to 0 as $x$ goes to $\infty$,
therefore we are left with $\frac{2}{5}$.
6. Unlike the previous exercise, the highest degree in the numerator is 3, but the highest degree
in the denominator is 2, therefore we can deduce that the limit goes to either $\infty$ or $-\infty$. We
divide the numerator and denominator by the lesser degree, i.e. $x^2$, to get:
$$\lim_{x \to -\infty} \frac{x + 1 + \frac{1}{x} + \frac{1}{x^2}}{2 - \frac{4}{x} + \frac{1}{x^2}}$$
We are left with a term $x$ in the numerator, and since the problem asks for the limit as $x$ goes
to $-\infty$, we can conclude that the overall limit goes to $-\infty$.
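1. $\lim_{x \to 1} \frac{\sin(3x - 3)}{\sin(2x - 2)}$
2. $\lim_{x \to -\infty} \left(\sqrt{x^2 + 4x} - x\right)$
3. $\lim_{x \to \infty} \sqrt{x}\left(\sqrt{x + 2} - \sqrt{x}\right)$
4. $\lim_{x \to \infty} \frac{x^3 + 4x - 7}{7x^2 - x + 1}$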
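Lemma 9.6.5
$$\lim_{x \to \infty} f(x) = \lim_{y \to 0^+} f\left(\frac{1}{y}\right).$$
As $N > 0$ and $\frac{1}{y} > 0$, we can take the reciprocal of $\frac{1}{y} > N$ to get $y < \frac{1}{N}$. As $\frac{1}{N}$ is some positive
number, we let $\delta = \frac{1}{N}$, and it follows that
$$\forall \varepsilon > 0\ \exists \delta > 0\ \forall y,\ 0 < y < \delta \to \left|f\left(\frac{1}{y}\right) - L\right| < \varepsilon,$$
which is the definition of $\lim_{y \to 0^+} f\left(\frac{1}{y}\right) = L$, as desired.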
Exercise 9.6.6. Evaluate and justify $\lim_{x \to \infty} x \sin\frac{1}{x}$. What about $\lim_{x \to \infty} x \sin\frac{1}{x^2}$?
9.7 Continuity
Although we have mentioned the notion of continuity before, we formally define it here:
2. Rational functions are continuous where they are defined, by Lemma 9.3.13.
3. Trigonometric functions are continuous where they are defined, by Theorem 9.5.4.
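Definition 9.7.2. When $f(a)$ exists and $\lim_{x \to a} f(x) \ne f(a)$, there is a removable discontinuity at
$x = a$.
[Figure: graph with a removable discontinuity, point (a, f(a)) marked]
Definition 9.7.3. When $f(a)$ exists and $\lim_{x \to a} f(x)$ does not exist, there is an essential discontinuity
at $x = a$.
[Figure: graph with an essential discontinuity, point (a, f(a)) marked]
There can also be a discontinuity when $f(a)$ does not exist at all.
[Figure: graph with a discontinuity at x = a where f(a) is undefined]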
Problem 9.7.4. Is $f(x) = \frac{\sin x}{x}$ continuous? If not, what modifications can we make to $f$ such that
it is continuous?
Solution. Note that $\frac{\sin x}{x}$ does not exist at $x = 0$, so we have a discontinuity. We can define the
piecewise function
$$f(x) = \begin{cases} \dfrac{\sin x}{x} & x \ne 0 \\ 1 & x = 0 \end{cases}$$
and since $\lim_{x \to 0} \frac{\sin x}{x} = 1$, the function is now continuous.
Proof. Given that $g$ is continuous at $a$, it is true that $\lim_{x \to a} g(x) = g(a)$, i.e.
$$\forall \varepsilon_1 > 0\ \exists \delta_1 > 0\ \forall x,\ 0 < |x - a| < \delta_1 \to |g(x) - g(a)| < \varepsilon_1.$$
We are given that $f$ is continuous at $g(a)$, therefore $\lim_{y \to g(a)} f(y) = f(g(a))$, i.e.
$$\forall \varepsilon_2 > 0\ \exists \delta_2 > 0\ \forall y,\ 0 < |y - g(a)| < \delta_2 \to |f(y) - f(g(a))| < \varepsilon_2.$$
$$\forall \varepsilon_2 > 0\ \exists \delta_2 > 0\ \forall x,\ 0 < |g(x) - g(a)| < \delta_2 \to |f(g(x)) - f(g(a))| < \varepsilon_2.$$
Our objective is to show that $0 < |x - a| < \delta_1 \to |f(g(x)) - f(g(a))| < \varepsilon_2$ from the two given
definitions. It suffices to let $\varepsilon_1 = \delta_2$, then it follows that $0 < |x - a| < \delta_1 \to |g(x) - g(a)| < \delta_2$. By
the rewritten form of the second given definition, $0 < |g(x) - g(a)| < \delta_2 \to |f(g(x)) - f(g(a))| < \varepsilon_2$.
Then, by hypothetical syllogism,
$$\forall \varepsilon_2 > 0\ \exists \delta_1 > 0\ \forall x,\ 0 < |x - a| < \delta_1 \to |f(g(x)) - f(g(a))| < \varepsilon_2,$$
Theorem 9.7.9
Suppose that f is continuous at x = a and f (a) > 0. Then ∃δ > 0 such that ∀x ∈ (a − δ, a + δ),
f (x) > 0.
Now consider $\varepsilon = \frac{f(a)}{2}$, as $f(a) > 0$ implies that $\frac{f(a)}{2} > 0$, and $\varepsilon$ can be any positive number.
Then, by the definition,
$$\exists \delta > 0\ \forall x,\ 0 < |x - a| < \delta \to |f(x) - f(a)| < \frac{f(a)}{2}.$$
Theorem 9.7.10
Suppose that f is continuous at x = a and f (a) < 0. Then ∃δ > 0 such that ∀x ∈ (a − δ, a + δ),
f (x) < 0.
is continuous everywhere.
Solution. Recall that $f(x)$ is continuous at $x = a$ if $\lim_{x \to a} f(x) = f(a)$. Therefore $f$ should be
continuous at $x = 1$, i.e. $\lim_{x \to 1} f(x)$ should exist and equal $f(1)$. We consider the one-sided limits
separately: $\lim_{x \to 1^-} f(x) = \lim_{x \to 1^-} (2x^2 - x - 7) = -6$, therefore $\lim_{x \to 1^+} f(x) = -6$, i.e. $\lim_{x \to 1^+} (ax + b) =
a + b = -6$.
Likewise, $\lim_{x \to 3} f(x)$ should exist and equal $f(3)$. Note that $\lim_{x \to 3^+} f(x) = \lim_{x \to 3^+} x^3 = 27$, so
$\lim_{x \to 3^-} f(x) = \lim_{x \to 3^-} (ax + b) = 3a + b = 27$.
We now have the system of equations $a + b = -6$ and $3a + b = 27$, from which the solutions are
$a = \frac{33}{2}$, $b = -\frac{45}{2}$.
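The small linear system and the resulting matching conditions can be verified in exact arithmetic. In the sketch below, `solve_2x2` is a hypothetical helper using Cramer's rule, and the three pieces are taken from the one-sided limits in the solution ($2x^2 - x - 7$ to the left of 1, $ax + b$ between 1 and 3, $x^3$ to the right of 3); exactly which piece owns the endpoints is an assumption that does not affect the check.

```python
from fractions import Fraction

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*a + b1*b = c1 and a2*a + b2*b = c2 by Cramer's rule."""
    det = Fraction(a1 * b2 - a2 * b1)
    if det == 0:
        raise ValueError("system has no unique solution")
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

# a + b = -6 and 3a + b = 27
a, b = solve_2x2(1, 1, -6, 3, 1, 27)

def left_piece(x):
    return 2 * x * x - x - 7

def middle_piece(x):
    return a * x + b

def right_piece(x):
    return x ** 3

print(a, b)  # 33/2 -45/2
```

With these values the middle piece meets the left piece at $x = 1$ and the right piece at $x = 3$, which is exactly what continuity at those two points requires.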
Chapter 10
Derivatives
Limits barely scrape the surface of calculus. With derivatives, we will observe how limits are used to
help us analyze functions in greater depth than we were able to do in early algebra classes.
10.1 Introduction
Consider a function f . When, at any part, the function is increasing, we have
When we consider the two points (a, f (a)) and (b, f (b)), the slope of the line containing both points
is also positive:
$$b - a > 0 \longrightarrow \frac{f(b) - f(a)}{b - a} > 0.$$
Likewise, we can use analogous reasoning for decreasing parts of a graph as well.
Definition 10.1.1. We call this line connecting $(a, f(a))$ and $(b, f(b))$ the secant line, as shown
below:
[Figure: secant line through the points (a, f(a)) and (b, f(b))]
Now what happens to the slope of the secant line connecting $(a, f(a))$ and $(b, f(b))$ as $b$ is
approaching $a$? Observe the secant line in the figure below:
[Figure: secant line through (a, f(a)) as b approaches a]
Since we are considering b approaching a, this leads us to use limits to introduce a new, important
property of functions:
Definition 10.1.2. For some constant $a$, $\lim_{x \to a} \frac{f(x) - f(a)}{x - a} = f'(a)$ is called the derivative of $f(x)$
at $x = a$. By Lemma 9.5.3, note that
$$f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h},$$
which is another form of the definition of the derivative.
Definition 10.1.3. The tangent line of $f(x)$ at $x = a$ is the line of slope $f'(a)$ through $(a, f(a))$.
[Figure: tangent line at the point (a, f(a))]
Example 10.1.5
Let $f(x) = mx + b$ (i.e. an arbitrary linear function with slope $m$). For any $a$, prove that
$f'(a) = m$.
Proof. We have
$$\begin{aligned}
f'(a) &= \lim_{x \to a} \frac{f(x) - f(a)}{x - a} \\
&= \lim_{x \to a} \frac{(mx + b) - (ma + b)}{x - a} \\
&= \lim_{x \to a} \frac{m(x - a)}{x - a} \\
&= \lim_{x \to a} m \\
&= m.
\end{aligned}$$
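$$\begin{aligned}
f'(2) &= \lim_{x \to 2} \frac{f(x) - f(2)}{x - 2} \\
&= \lim_{x \to 2} \frac{x^2 - 4}{x - 2} \\
&= \lim_{x \to 2} \frac{(x + 2)(x - 2)}{x - 2} \\
&= \lim_{x \to 2} (x + 2) \\
&= 4.
\end{aligned}$$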
Solution. The slope of the tangent line is $f'(a)$. This line must pass through the point $(a, f(a))$, so
by the point-slope form of a line, the equation of the tangent line would be $y - f(a) = f'(a)(x - a)$,
which rearranges to
$$y = f'(a)(x - a) + f(a).$$
Problem 10.1.12. Find all tangent lines of $y = x^2$ that go through $(7, 1)$.
Solution. Define $f(x) = x^2$. Consider some point $(a, f(a))$ that lies on $f$ and has a tangent line that
goes through $(7, 1)$. By Problem 10.1.11, the general equation of the tangent line would be
$$y = f'(a)(x - a) + f(a).$$
Since $f'(a) = 2a$ and $f(a) = a^2$, substituting the point $(x, y) = (7, 1)$ gives
$$1 = 2a(7 - a) + a^2.$$
This rearranges to $a^2 - 14a + 1 = 0$, and the quadratic formula yields $a = 7 \pm 4\sqrt{3}$, and in fact, both
are solutions to the problem. Thus, our tangent lines are
$$y = 2(7 + 4\sqrt{3})(x - (7 + 4\sqrt{3})) + 97 + 56\sqrt{3},$$
$$y = 2(7 - 4\sqrt{3})(x - (7 - 4\sqrt{3})) + 97 - 56\sqrt{3}.$$
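Both tangent lines can be checked by confirming that they really pass through $(7, 1)$. The sketch below writes the tangent at $x = a$ as $y = 2ax - a^2$, which is the expanded form of $y = 2a(x - a) + a^2$; the names `tangent_at` and `passes_through` and the tolerance are assumptions chosen for illustration.

```python
import math

def tangent_at(a):
    """Slope and y-intercept of the tangent to y = x^2 at x = a: y = 2a*x - a^2."""
    return 2 * a, -a * a

def passes_through(a, px=7.0, py=1.0, tol=1e-9):
    """Does the tangent at x = a pass (numerically) through the point (px, py)?"""
    slope, intercept = tangent_at(a)
    return abs(slope * px + intercept - py) < tol

# The two roots of a^2 - 14a + 1 = 0 found above.
roots = [7 + 4 * math.sqrt(3), 7 - 4 * math.sqrt(3)]

for a in roots:
    print(a, a * a, passes_through(a))
```

The printed squares agree with $97 \pm 56\sqrt{3}$ up to rounding, matching the constant terms in the two tangent-line equations.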
Example 10.1.13
Let $f(x) = |x|$. Find $f'(7)$ and $f'(0)$.
Solution. Note that $f'(7) = \lim_{x \to 7} \frac{|x| - |7|}{x - 7}$. As we are considering the limit as $x$ goes to 7, we can
assume that $x$ is positive because we essentially only care about the values of $x$ that are close to 7.
Then,
$$\lim_{x \to 7} \frac{|x| - |7|}{x - 7} = \lim_{x \to 7} \frac{x - 7}{x - 7} = 1.$$
For $f'(0)$, we want to find $\lim_{x \to 0} \frac{|x|}{x}$. However, evaluating the right-hand and left-hand limits
separately gives us 1 and $-1$ respectively, therefore $\lim_{x \to 0} \frac{|x|}{x}$ does not exist, and it follows that there
is no tangent line at $x = 0$. In fact, if we graph $f(x) = |x|$, we see that there is a cusp (a pointed
end) at $x = 0$, so we can intuitively figure out that there cannot be a tangent line at that point.
Definition 10.1.14. For a function $f(x)$, the derivative of $f(x)$ is denoted as $f'(x)$ with respect
to all $x$ defined on $f$. As stated earlier, the limit can be written in two forms:
$$f'(x) = \lim_{z \to x} \frac{f(z) - f(x)}{z - x} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$
Definition 10.1.15. There are two different ways to denote the derivative of $f$: either $f'(x)$, which
is Lagrange’s notation, or $\frac{d}{dx}(f(x))$, which is Leibniz’s notation. Leibniz notation is useful since
we don’t have to define the function to take its derivative. For example, we can write $\frac{d}{dx}(x^3)$, but
not ${x^3}'$.
$$\frac{d}{dx}(f(x)) = f'(x).$$
$$\frac{d}{dx}(y) = y' = \frac{dy}{dx}.$$
Definition 10.1.16. To differentiate a function is to evaluate the derivative of that function.
Example 10.1.17
Differentiate the following functions:
1. $x^2$
2. $x^3$
3. $x^5$
4. $x^n$, $\forall n \in \mathbb{Z}^+$
5. $\frac{3x + 11}{2x - 9}$
6. $\sqrt{x}$
7. $\sin x$
f (x + h) − f (x) (x + h)5 − x5
lim = lim
h→0 h h→0 h
x + 5x h + 10x3 h2 + 10x2 h3 + 5xh4 + h5 − x5
5 4
= lim
h→0 h
= lim 5x4 + 10x3 h + 10x2 h2 + 5xh3 + h4 = 5x4 .
h→0
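4. From the previous examples, it seems that there is a pattern. In fact, we can show that $\frac{d}{dx}(x^n) = nx^{n-1}$ using the Binomial Theorem.
\[
\begin{aligned}
\frac{d}{dx}(x^n) &= \lim_{h \to 0} \frac{(x + h)^n - x^n}{h} \\
&= \lim_{h \to 0} \frac{\binom{n}{0} x^n + \binom{n}{1} x^{n-1} h + \dots + \binom{n}{n-1} x h^{n-1} + \binom{n}{n} h^n - x^n}{h} \\
&= \lim_{h \to 0} \frac{\binom{n}{1} x^{n-1} h + \binom{n}{2} x^{n-2} h^2 + \dots + \binom{n}{n-2} x^2 h^{n-2} + \binom{n}{n-1} x h^{n-1} + h^n}{h} \\
&= \lim_{h \to 0} \left( \binom{n}{1} x^{n-1} + \binom{n}{2} x^{n-2} h + \dots + \binom{n}{n-2} x^2 h^{n-3} + \binom{n}{n-1} x h^{n-2} + h^{n-1} \right) \\
&= nx^{n-1}.
\end{aligned}
\]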
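We can also use the fact that $a^n - b^n = (a - b)(a^{n-1} + a^{n-2} b + \dots + ab^{n-2} + b^{n-1})$, as mentioned before:
\[
\begin{aligned}
\frac{d}{dx}(x^n) &= \lim_{z \to x} \frac{z^n - x^n}{z - x} \\
&= \lim_{z \to x} \frac{(z - x)(z^{n-1} + z^{n-2} x + \dots + zx^{n-2} + x^{n-1})}{z - x} \\
&= \lim_{z \to x} \left( z^{n-1} + z^{n-2} x + \dots + zx^{n-2} + x^{n-1} \right) \\
&= x^{n-1} + x^{n-2} \cdot x + \dots + x \cdot x^{n-2} + x^{n-1} \\
&= nx^{n-1}.
\end{aligned}
\]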
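The Binomial Theorem argument can be mirrored in a few lines of code. This is a sketch under stated assumptions: the function name `binomial_difference_quotient` and the sample values $x = 2$, $n = 5$ are illustrative and do not come from the text.

```python
from math import comb

def binomial_difference_quotient(x, n, h):
    """((x + h)^n - x^n) / h, written out with the Binomial Theorem.

    The k = 0 term cancels x^n, so the sum starts at k = 1, and one
    factor of h has already been divided out of every remaining term.
    """
    return sum(comb(n, k) * x ** (n - k) * h ** (k - 1) for k in range(1, n + 1))

x, n = 2.0, 5
for h in (1e-1, 1e-3, 1e-6):
    print(h, binomial_difference_quotient(x, n, h))

# Only the k = 1 term survives as h goes to 0.
print("n * x**(n - 1) =", n * x ** (n - 1))
```

Because the factor of $h$ is cancelled symbolically before any numbers are plugged in, the function can even be evaluated at $h = 0$, where it returns exactly the $k = 1$ term.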
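5. Which form of the limit definition to use is up to you. The following uses the $h \to 0$ form and expands.
\[
\begin{aligned}
\frac{d}{dx}\left( \frac{3x + 11}{2x - 9} \right) &= \lim_{h \to 0} \frac{\frac{3(x + h) + 11}{2(x + h) - 9} - \frac{3x + 11}{2x - 9}}{h} \\
&= \lim_{h \to 0} \frac{\frac{3x + 3h + 11}{2x + 2h - 9} - \frac{3x + 11}{2x - 9}}{h} \\
&= \lim_{h \to 0} \frac{(3x + 3h + 11)(2x - 9) - (3x + 11)(2x + 2h - 9)}{h(2x + 2h - 9)(2x - 9)} \\
&= \lim_{h \to 0} \frac{6x^2 - 27x + 6xh - 27h + 22x - 99 - 6x^2 - 6xh + 27x - 22x - 22h + 99}{h(2x + 2h - 9)(2x - 9)} \\
&= \lim_{h \to 0} \frac{-49h}{h(2x + 2h - 9)(2x - 9)} \\
&= \lim_{h \to 0} -\frac{49}{(2x + 2h - 9)(2x - 9)} \\
&= -\frac{49}{(2x - 9)^2}.
\end{aligned}
\]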
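7. This is one of the more nontrivial functions to differentiate using the limit definition.
\[
\begin{aligned}
\frac{d}{dx}(\sin x) &= \lim_{h \to 0} \frac{\sin(x + h) - \sin x}{h} \\
&= \lim_{h \to 0} \frac{\sin x \cos h + \cos x \sin h - \sin x}{h} \\
&= \lim_{h \to 0} \frac{\cos x \sin h - \sin x (1 - \cos h)}{h} \\
&= \lim_{h \to 0} \frac{\cos x \sin h - \sin x \cdot \frac{\sin^2 h}{1 + \cos h}}{h} \\
&= \lim_{h \to 0} \frac{\sin h \left( \cos x - \frac{\sin x \sin h}{1 + \cos h} \right)}{h} \\
&= \lim_{h \to 0} \frac{\sin h}{h} \cdot \lim_{h \to 0} \left( \cos x - \frac{\sin x \sin h}{1 + \cos h} \right) \\
&= \lim_{h \to 0} \left( \cos x - \frac{\sin x \sin h}{1 + \cos h} \right) \\
&= \lim_{h \to 0} \cos x - \lim_{h \to 0} \frac{\sin x \sin h}{1 + \cos h} \\
&= \cos x - \frac{\lim_{h \to 0} (\sin x \sin h)}{\lim_{h \to 0} (1 + \cos h)} \\
&= \cos x - \frac{\sin x \cdot \lim_{h \to 0} \sin h}{1 + \lim_{h \to 0} \cos h} \\
&= \cos x - \frac{\sin x \cdot 0}{1 + 1} \\
&= \cos x.
\end{aligned}
\]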
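As a sanity check on the algebra above, the factored form of the difference quotient can be compared with the direct one. This is only an illustrative sketch; the name `sine_difference_quotient` and the sample point $x = 1$ are assumptions made here.

```python
import math

def sine_difference_quotient(x, h):
    """Factored form from the derivation:
    (sin h / h) * (cos x - sin x * sin h / (1 + cos h))."""
    return (math.sin(h) / h) * (
        math.cos(x) - math.sin(x) * math.sin(h) / (1 + math.cos(h))
    )

x = 1.0
for h in (1e-1, 1e-3, 1e-5):
    direct = (math.sin(x + h) - math.sin(x)) / h
    # Columns: h, direct quotient, factored quotient, cos(x)
    print(h, direct, sine_difference_quotient(x, h), math.cos(x))
```

The first two quotient columns agree for each $h$ (they are the same expression rewritten), and both drift toward $\cos x$ as $h$ shrinks.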
b) $j'(x) = k \cdot f'(x)$.
b) We can always factor constants out of limits, which follows from Theorem 9.3.4 and Theorem 9.3.5:
\[
\begin{aligned}
j'(x) &= \lim_{z \to x} \frac{j(z) - j(x)}{z - x} \\
&= \lim_{z \to x} \frac{k \cdot f(z) - k \cdot f(x)}{z - x} \\
&= \lim_{z \to x} \frac{k(f(z) - f(x))}{z - x} \\
&= \lim_{z \to x} k \cdot \frac{f(z) - f(x)}{z - x} \\
&= k \cdot \lim_{z \to x} \frac{f(z) - f(x)}{z - x} \\
&= k \cdot f'(x).
\end{aligned}
\]
Theorem 10.1.20
If $f(x)$ is differentiable at $x = a$, then it is continuous at $x = a$.
Proof. We want to show that if $\lim_{x \to a} \frac{f(x) - f(a)}{x - a}$ exists, then $\lim_{x \to a} f(x) = f(a)$, or equivalently, $\lim_{x \to a} (f(x) - f(a)) = 0$.
We know that $\lim_{x \to a} \frac{f(x) - f(a)}{x - a}$ and $\lim_{x \to a} (x - a)$ both exist. We have that
\[ \lim_{x \to a} (f(x) - f(a)) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a} \cdot \lim_{x \to a} (x - a). \]
But note that $\lim_{x \to a} (x - a) = 0$, and since the RHS will be 0, the LHS, i.e. $\lim_{x \to a} (f(x) - f(a))$, will necessarily be 0.
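\[
\begin{aligned}
f'(0) &= \lim_{x \to 0} \frac{f(x) - f(0)}{x - 0} \\
&= \lim_{x \to 0} \frac{x \sin\left(\frac{1}{x}\right) - 0}{x - 0} \\
&= \lim_{x \to 0} \sin\left(\frac{1}{x}\right).
\end{aligned}
\]
For this limit to exist, the right-hand limit $\lim_{x \to 0^+} \sin\left(\frac{1}{x}\right)$ must exist. By Lemma 9.6.5, $\lim_{x \to 0^+} \sin\left(\frac{1}{x}\right) = \lim_{y \to \infty} \sin y$, and this limit does not exist, since $\sin y$ oscillates between $-1$ and $1$. Therefore, $f'(0)$ does not exist, so $f$ is not differentiable at $x = 0$.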
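\[
\begin{aligned}
g'(0) &= \lim_{x \to 0} \frac{g(x) - g(0)}{x - 0} \\
&= \lim_{x \to 0} \frac{x^2 \sin\left(\frac{1}{x}\right) - 0}{x - 0} \\
&= \lim_{x \to 0} x \sin\left(\frac{1}{x}\right).
\end{aligned}
\]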
\[ \frac{d}{dx}(f(x)g(x)) = f(x)g'(x) + f'(x)g(x). \]
\[
\begin{aligned}
\frac{d}{dx}(f(x)g(x)) &= \lim_{z \to x} \frac{f(z)g(z) - f(x)g(x)}{z - x} \\
&= \lim_{z \to x} \frac{f(z)g(z) - f(z)g(x) + f(z)g(x) - f(x)g(x)}{z - x} \\
&= \lim_{z \to x} \frac{f(z)(g(z) - g(x)) + g(x)(f(z) - f(x))}{z - x} \\
&= \lim_{z \to x} \left( \frac{f(z)(g(z) - g(x))}{z - x} + \frac{g(x)(f(z) - f(x))}{z - x} \right) \\
&= \lim_{z \to x} f(z) \cdot \lim_{z \to x} \frac{g(z) - g(x)}{z - x} + \lim_{z \to x} \frac{f(z) - f(x)}{z - x} \cdot \lim_{z \to x} g(x) \\
&= \lim_{z \to x} f(z) \cdot g'(x) + f'(x)g(x).
\end{aligned}
\]
As we have assumed that $f(x)$ and $g(x)$ are differentiable, by Theorem 10.1.20, $\lim_{z \to x} f(z) = f(x)$, therefore
\[ \frac{d}{dx}(f(x)g(x)) = f(x)g'(x) + f'(x)g(x). \]
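Proof. First, we combine the fractions, then apply a strategy similar to the one used in the previous proof.
\[
\begin{aligned}
\frac{d}{dx}\left( \frac{f(x)}{g(x)} \right) &= \lim_{z \to x} \frac{\frac{f(z)}{g(z)} - \frac{f(x)}{g(x)}}{z - x} \\
&= \lim_{z \to x} \frac{f(z)g(x) - f(x)g(z)}{g(x)g(z)(z - x)} \\
&= \lim_{z \to x} \frac{f(z)g(x) - f(x)g(x) + f(x)g(x) - f(x)g(z)}{g(x)g(z)(z - x)} \\
&= \lim_{z \to x} \frac{g(x)(f(z) - f(x)) + f(x)(g(x) - g(z))}{g(x)g(z)(z - x)} \\
&= \lim_{z \to x} \frac{1}{g(x)} \cdot \lim_{z \to x} \frac{1}{g(z)} \cdot \left( \lim_{z \to x} \frac{f(z) - f(x)}{z - x} \cdot \lim_{z \to x} g(x) + \lim_{z \to x} f(x) \cdot \lim_{z \to x} \frac{g(x) - g(z)}{z - x} \right).
\end{aligned}
\]
By our assumptions and Theorem 10.1.20, we have
\[
\begin{aligned}
\frac{d}{dx}\left( \frac{f(x)}{g(x)} \right) &= \frac{1}{g(x)} \cdot \frac{1}{g(x)} \cdot \bigl( f'(x) \cdot g(x) + f(x) \cdot (-g'(x)) \bigr) \\
&= \frac{g(x)f'(x) - f(x)g'(x)}{g(x)^2}.
\end{aligned}
\]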
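Both rules can be checked numerically against a difference quotient. The sketch below is illustrative only: the helper `central_difference`, the functions $f(x) = x^2$ and $g(x) = 2 + \sin x$, and the sample point are assumptions made for the example, not taken from the text.

```python
import math

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for the limit."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Illustrative choices: f(x) = x^2 and g(x) = 2 + sin x (g is never zero).
f, df = (lambda x: x ** 2), (lambda x: 2 * x)
g, dg = (lambda x: 2 + math.sin(x)), (lambda x: math.cos(x))

x = 0.7
product_rule = f(x) * dg(x) + df(x) * g(x)
quotient_rule = (g(x) * df(x) - f(x) * dg(x)) / g(x) ** 2

product_numeric = central_difference(lambda t: f(t) * g(t), x)
quotient_numeric = central_difference(lambda t: f(t) / g(t), x)

print(product_rule, product_numeric)
print(quotient_rule, quotient_numeric)
```

Each printed pair should agree to several decimal places; the small remaining gap comes from the finite step size, not from the rules.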
Problem 10.1.24. Evaluate the derivatives of $f(x) = x^3 \sin x$ and $f(x) = \frac{2x - 1}{3x + 2}$.
Problem 10.1.25. By Example 10.1.17, we have established that the derivative of sin x is cos x.
Using Theorem 10.1.22 and Theorem 10.1.23, find the derivatives of the rest of the trigonometric
functions.
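Solution. The proof of $\frac{d}{dx}(\cos x)$ is similar to that of $\frac{d}{dx}(\sin x)$.
\[
\begin{aligned}
\frac{d}{dx}(\cos x) &= \lim_{h \to 0} \frac{\cos(x + h) - \cos x}{h} \\
&= \lim_{h \to 0} \frac{\cos x \cos h - \sin x \sin h - \cos x}{h} \\
&= \lim_{h \to 0} \frac{\cos x (\cos h - 1) - \sin x \sin h}{h} \\
&= \lim_{h \to 0} \frac{\cos x \cdot \left( -\frac{\sin^2 h}{\cos h + 1} \right) - \sin x \sin h}{h} \\
&= \lim_{h \to 0} \frac{-\sin h \left( \frac{\cos x \sin h}{\cos h + 1} + \sin x \right)}{h} \\
&= \lim_{h \to 0} \left( -\frac{\sin h}{h} \right) \cdot \lim_{h \to 0} \left( \frac{\cos x \sin h}{\cos h + 1} + \sin x \right) \\
&= -\lim_{h \to 0} \left( \frac{\cos x \sin h}{\cos h + 1} + \sin x \right) \\
&= -\sin x.
\end{aligned}
\]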
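Now that we have found the derivatives of sine and cosine, we can apply Theorem 10.1.23 to find $\frac{d}{dx}(\tan x)$.
\[
\begin{aligned}
\frac{d}{dx}(\tan x) &= \frac{d}{dx}\left( \frac{\sin x}{\cos x} \right) \\
&= \frac{\cos x \cdot \frac{d}{dx}(\sin x) - \sin x \cdot \frac{d}{dx}(\cos x)}{\cos^2 x} \\
&= \frac{\cos^2 x + \sin^2 x}{\cos^2 x} \\
&= \frac{1}{\cos^2 x} \\
&= \sec^2 x.
\end{aligned}
\]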
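The proof of $\frac{d}{dx}(\cot x)$ is analogous to that of $\frac{d}{dx}(\tan x)$.
\[
\begin{aligned}
\frac{d}{dx}(\cot x) &= \frac{d}{dx}\left( \frac{\cos x}{\sin x} \right) \\
&= \frac{\sin x \cdot \frac{d}{dx}(\cos x) - \cos x \cdot \frac{d}{dx}(\sin x)}{\sin^2 x} \\
&= \frac{-\sin^2 x - \cos^2 x}{\sin^2 x} \\
&= -\frac{1}{\sin^2 x} \\
&= -\csc^2 x.
\end{aligned}
\]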
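For $\frac{d}{dx}(\sec x)$, use Theorem 10.1.23 and note that the derivative of 1 is simply 0.
\[
\begin{aligned}
\frac{d}{dx}(\sec x) &= \frac{d}{dx}\left( \frac{1}{\cos x} \right) \\
&= \frac{\cos x \cdot \frac{d}{dx}(1) - 1 \cdot \frac{d}{dx}(\cos x)}{\cos^2 x} \\
&= \frac{\sin x}{\cos^2 x} \\
&= \tan x \sec x.
\end{aligned}
\]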
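The proof of $\frac{d}{dx}(\csc x)$ is analogous to that of $\frac{d}{dx}(\sec x)$.
\[
\begin{aligned}
\frac{d}{dx}(\csc x) &= \frac{d}{dx}\left( \frac{1}{\sin x} \right) \\
&= \frac{\sin x \cdot \frac{d}{dx}(1) - 1 \cdot \frac{d}{dx}(\sin x)}{\sin^2 x} \\
&= -\frac{\cos x}{\sin^2 x} \\
&= -\cot x \csc x.
\end{aligned}
\]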
Problem 10.1.26. Prove the following results about the derivatives of function transformations.
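a) $\frac{d}{dx}(f(x + k)) = f'(x + k)$.
b) $\frac{d}{dx}(f(kx)) = kf'(kx)$.
Proof. The general strategy is to replace the transformed inputs $x + k$ and $kx$ with convenient variables to facilitate the manipulation of the limit definition of the derivative.
a) Let $\tilde{z} = z + k$ and $\tilde{x} = x + k$. Note that $\tilde{z} - \tilde{x} = z - x$, and that $z$ goes to $x$ as $\tilde{z}$ goes to $\tilde{x}$. Then,
\[
\begin{aligned}
\frac{d}{dx}(f(x + k)) &= \lim_{z \to x} \frac{f(z + k) - f(x + k)}{z - x} \\
&= \lim_{\tilde{z} \to \tilde{x}} \frac{f(\tilde{z}) - f(\tilde{x})}{\tilde{z} - \tilde{x}} \\
&= f'(\tilde{x}) \\
&= f'(x + k).
\end{aligned}
\]
b) Let $\tilde{z} = kz$ and $\tilde{x} = kx$. Note that $z$ goes to $x$ as $\tilde{z}$ goes to $\tilde{x}$. Then,
\[
\begin{aligned}
\frac{d}{dx}(f(kx)) &= \lim_{z \to x} \frac{f(kz) - f(kx)}{z - x} \\
&= \lim_{z \to x} \frac{f(kz) - f(kx)}{kz - kx} \cdot k \\
&= \lim_{\tilde{z} \to \tilde{x}} \frac{f(\tilde{z}) - f(\tilde{x})}{\tilde{z} - \tilde{x}} \cdot k \\
&= f'(\tilde{x}) \cdot k \\
&= kf'(kx).
\end{aligned}
\]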
Remark. Keep in mind that $f'(x + k)$ and $f'(kx)$ are NOT the same as $\frac{d}{dx}(f(x + k))$ and $\frac{d}{dx}(f(kx))$. If "something" is some function of $x$, then $f'(\text{something})$ means taking the derivative of $f(x)$ and then plugging in $x \to \text{something}$, while $\frac{d}{dx}(f(\text{something}))$ is the derivative of the whole expression $f(\text{something})$ with respect to $x$.
For instance, if we let $f(x) = \sin x$, then $f'(2x) = \cos 2x$ while $\frac{d}{dx}(f(2x)) = 2\cos 2x$ according to part b) of Problem 10.1.26.
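The distinction in this remark is easy to see numerically. The following sketch assumes $f(x) = \sin x$ as in the remark; the helper `central_difference` and the sample point $x = 0.4$ are illustrative choices.

```python
import math

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for d/dx."""
    return (func(x + h) - func(x - h)) / (2 * h)

x = 0.4

# f'(2x): differentiate f(x) = sin x first, then plug in 2x.
f_prime_at_2x = math.cos(2 * x)

# d/dx (f(2x)): differentiate the whole expression sin(2x) with respect to x.
derivative_of_composite = central_difference(lambda t: math.sin(2 * t), x)

print(f_prime_at_2x)            # close to cos(0.8)
print(derivative_of_composite)  # close to 2 * cos(0.8)
```

The second value is twice the first, which is the extra factor $k = 2$ from part b) of Problem 10.1.26.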
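Problem 10.1.27. Using the results of Problem 10.1.26, quickly evaluate $\frac{d}{dx}(\cos x)$.
Solution. We can take advantage of the cofunction identity of sine and cosine:
\[ \frac{d}{dx}(\cos x) = \frac{d}{dx}\left( \sin\left( x + \frac{\pi}{2} \right) \right) = \sin'\left( x + \frac{\pi}{2} \right) = \cos\left( x + \frac{\pi}{2} \right) = -\sin x. \]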
Problem 10.1.28. Use the product rule twice to find $\frac{d}{dx}(f(x)g(x)h(x))$.
Solution. Initially treat $f(x)g(x)$ as a single function for the first application of Theorem 10.1.22, then apply the theorem again to break up $f(x)g(x)$.
\[
\begin{aligned}
\frac{d}{dx}(f(x)g(x)h(x)) &= f(x)g(x) \frac{d}{dx}(h(x)) + \frac{d}{dx}(f(x)g(x)) \, h(x) \\
&= f(x)g(x)h'(x) + (f(x)g'(x) + f'(x)g(x))h(x) \\
&= f(x)g(x)h'(x) + f(x)g'(x)h(x) + f'(x)g(x)h(x).
\end{aligned}
\]
It seems that there is a pattern for the derivative of the product of an arbitrary number of functions. Here is the general statement:
\[ \frac{d}{dx}(f(x)^n) = nf(x)^{n-1} f'(x). \]
Problem 10.1.31. Recall the result of Example 10.1.17 where it was proven that $\frac{d}{dx}(x^n) = nx^{n-1}$ for all $n \in \mathbb{Z}^+$. Demonstrate that this identity holds for all negative integers $n$ as well.
Let $m = -n$, such that $m$ is a negative integer. Then we have $\frac{d}{dx}(x^m) = mx^{m-1}$, and we are done.
Clearly, for $n = 0$, $\frac{d}{dx}(x^0) = \frac{d}{dx}(1) = 0 = 0 \cdot x^{0-1}$. We have now established that $\frac{d}{dx}(x^n) = nx^{n-1}$ for all integers $n$. However, we can extend this rule to the rational numbers:
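Problem 10.1.32. Let $n \in \mathbb{Z}^+$. Evaluate $\frac{d}{dx}\left( x^{\frac{1}{n}} \right)$.
Thus, we have
\[
\begin{aligned}
\frac{d}{dx}\left( x^{\frac{1}{n}} \right) &= \lim_{z \to x} \frac{z^{\frac{1}{n}} - x^{\frac{1}{n}}}{z - x} \\
&= \lim_{z \to x} \frac{z^{\frac{1}{n}} - x^{\frac{1}{n}}}{\left( z^{\frac{1}{n}} - x^{\frac{1}{n}} \right) \left( z^{\frac{n-1}{n}} + z^{\frac{n-2}{n}} x^{\frac{1}{n}} + \dots + z^{\frac{1}{n}} x^{\frac{n-2}{n}} + x^{\frac{n-1}{n}} \right)} \\
&= \lim_{z \to x} \frac{1}{z^{\frac{n-1}{n}} + z^{\frac{n-2}{n}} x^{\frac{1}{n}} + \dots + z^{\frac{1}{n}} x^{\frac{n-2}{n}} + x^{\frac{n-1}{n}}} \\
&= \frac{1}{n x^{\frac{n-1}{n}}} \\
&= \frac{1}{n} x^{\frac{1}{n} - 1}.
\end{aligned}
\]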
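Problem 10.1.33. Use Theorem 10.1.23 to find $\frac{d}{dx}\left( x^{-\frac{1}{n}} \right)$ for $n \in \mathbb{Z}^+$.
We now have shown that $\frac{d}{dx}\left( x^{\frac{1}{n}} \right) = \frac{1}{n} x^{\frac{1}{n} - 1}$ for all nonzero integers $n$. Finally, we deal with the general case for rational numbers $\frac{p}{q}$, using results that we have discovered so far:
\[ \forall n \in \mathbb{Q}, \quad \frac{d}{dx}(x^n) = nx^{n-1}. \]
Proof. Using the previous result and Corollary 10.1.30, we will evaluate $\frac{d}{dx}\left( x^{\frac{p}{q}} \right)$ for $p, q \in \mathbb{Z}$ and $q \neq 0$.
\[
\begin{aligned}
\frac{d}{dx}\left( x^{\frac{p}{q}} \right) &= \frac{d}{dx}\left( \left( x^{\frac{1}{q}} \right)^p \right) \\
&= p \left( x^{\frac{1}{q}} \right)^{p-1} \cdot \frac{1}{q} x^{\frac{1}{q} - 1} \\
&= \frac{p}{q} \left( x^{\frac{1}{q}} \right)^{p-1} x^{\frac{1}{q} - 1} \\
&= \frac{p}{q} x^{\frac{p}{q} - 1}.
\end{aligned}
\]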
Problem 10.1.35. Let $P(x)$ be the polynomial $\sum_{k=0}^{n} a_k x^k$. Find $P'(x)$.
• $\frac{d}{dx}(f(x + c)) = f'(x + c)$.
When we evaluate $\frac{d}{dx}(f(x + c))$, we are actually evaluating $\frac{d}{dx}(f \circ g(x))$, where $g(x) = x + c$.
• $\frac{d}{dx}(f(kx)) = kf'(kx)$.
When we evaluate $\frac{d}{dx}(f(kx))$, we are actually evaluating $\frac{d}{dx}(f \circ g(x))$, where $g(x) = kx$.
• $\frac{d}{dx}(f(x)^n) = nf(x)^{n-1} f'(x)$.
When we evaluate $\frac{d}{dx}(f(x)^n)$, we are actually evaluating $\frac{d}{dx}(g \circ f(x))$, where $g(x) = x^n$.
Considering the composition of functions will lead us into our next important property of
derivatives:
If we let $y = f(x)$ and $z = g(y)$, then it follows from Theorem 10.1.36 that
\[ \frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx}. \]
The proof of this theorem lies beyond the scope of this book.
Problem 10.1.37. Use Theorem 10.1.36 to find $\frac{d}{dx}(f(x^n))$.
Solution. Note that the inner function is $x^n$, and the outer function is $f$. Using Theorem 10.1.36 gives us
\[ \frac{d}{dx}(f(x^n)) = f'(x^n) \cdot \frac{d}{dx}(x^n) = f'(x^n) \cdot nx^{n-1}. \]
Exercise 10.1.38. Use Theorem 10.1.36 to demonstrate Corollary 10.1.30.
Problem 10.1.39. Let $f$ be a function such that $f'(x) = \frac{1}{\sqrt{1 - x^2}}$. For $x \in \left( -\frac{\pi}{2}, \frac{\pi}{2} \right)$, evaluate $\frac{d}{dx}(f(\sin x))$.
Solution. By Theorem 10.1.36, we have $\frac{d}{dx}(f(\sin x)) = f'(\sin x) \frac{d}{dx}(\sin x) = f'(\sin x) \cos x$. To evaluate $f'(\sin x)$, we simply plug $\sin x$ into the input of $f'$, i.e. $f'(\sin x) = \frac{1}{\sqrt{1 - \sin^2 x}}$. Since $x \in \left( -\frac{\pi}{2}, \frac{\pi}{2} \right)$, we have that $\sqrt{1 - \sin^2 x} = \cos x$, so that $f'(\sin x) = \frac{1}{\cos x}$. Thus, $\frac{d}{dx}(f(\sin x)) = \frac{1}{\cos x} \cdot \cos x = 1$.
Problem 10.1.40. Differentiate the following:
1. $\sin^3(x^2)$
2. $\sqrt{1 + 4\sin x}$
3. $\frac{x^2}{1 + x^2}$
4. $\tan(x^3 + x^2)$
5. $\sqrt[3]{\cos^4\left( \sin^5(x^6) \right)}$
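1. After applying Theorem 10.1.36 twice, we get $\frac{d}{dx}(\sin^3(x^2)) = 3\sin^2(x^2) \cdot \frac{d}{dx}(\sin(x^2)) = 3\sin^2(x^2) \cdot \cos(x^2) \cdot 2x = 6x \sin^2(x^2) \cos(x^2)$.
2. Similar to the previous one, $\frac{d}{dx}\left( \sqrt{1 + 4\sin x} \right) = \frac{1}{2}(1 + 4\sin x)^{-\frac{1}{2}} \cdot \frac{d}{dx}(1 + 4\sin x) = \frac{1}{2}(1 + 4\sin x)^{-\frac{1}{2}} \cdot 4\cos x = \frac{2\cos x}{\sqrt{1 + 4\sin x}}$.
4. By Theorem 10.1.36, $\frac{d}{dx}(\tan(x^3 + x^2)) = \sec^2(x^3 + x^2) \cdot \frac{d}{dx}(x^3 + x^2) = \sec^2(x^3 + x^2) \cdot (3x^2 + 2x)$.
5. Repeatedly apply Theorem 10.1.36:
\[
\begin{aligned}
\frac{d}{dx}\left( \sqrt[3]{\cos^4\left( \sin^5(x^6) \right)} \right) &= \frac{4}{3} \cos^{\frac{1}{3}}\left( \sin^5 x^6 \right) \cdot \frac{d}{dx}\left( \cos\left( \sin^5 x^6 \right) \right) \\
&= \frac{4}{3} \cos^{\frac{1}{3}}\left( \sin^5 x^6 \right) \cdot \left( -\sin\left( \sin^5 x^6 \right) \right) \cdot \frac{d}{dx}\left( \sin^5 x^6 \right) \\
&= \frac{4}{3} \cos^{\frac{1}{3}}\left( \sin^5 x^6 \right) \cdot \left( -\sin\left( \sin^5 x^6 \right) \right) \cdot 5\sin^4 x^6 \cdot \frac{d}{dx}\left( \sin x^6 \right) \\
&= \frac{4}{3} \cos^{\frac{1}{3}}\left( \sin^5 x^6 \right) \cdot \left( -\sin\left( \sin^5 x^6 \right) \right) \cdot 5\sin^4 x^6 \cdot \cos(x^6) \cdot 6x^5 \\
&= -40 \sin\left( \sin^5 x^6 \right) \sin^4 (x^6) \cos^{\frac{1}{3}}\left( \sin^5 x^6 \right) \cos(x^6) \, x^5.
\end{aligned}
\]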
Generally, this is as far as problems involving the application of Theorem 10.1.36 go.
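Long chain rule computations like items 1 and 5 of Problem 10.1.40 are easy to get wrong by a sign or a constant, so a numerical spot check can be reassuring. The sketch below is an illustration only; the helper `central_difference`, the function names, and the sample point $x = 0.9$ are assumptions made here.

```python
import math

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for d/dx."""
    return (func(x + h) - func(x - h)) / (2 * h)

def item1(x):
    return math.sin(x ** 2) ** 3

def item1_derivative(x):
    # 6x sin^2(x^2) cos(x^2)
    return 6 * x * math.sin(x ** 2) ** 2 * math.cos(x ** 2)

def item5(x):
    # Cube root of cos^4(sin^5(x^6)); the cosine here is always positive
    # because its argument sin^5(x^6) lies in [-1, 1].
    return math.cos(math.sin(x ** 6) ** 5) ** (4 / 3)

def item5_derivative(x):
    inner = math.sin(x ** 6) ** 5
    return (-40 * math.sin(inner) * math.sin(x ** 6) ** 4
            * math.cos(inner) ** (1 / 3) * math.cos(x ** 6) * x ** 5)

x = 0.9
print(item1_derivative(x), central_difference(item1, x))
print(item5_derivative(x), central_difference(item5, x))
```

If a closed form disagrees with the numerical value by more than a tiny amount, one of the chain rule factors has probably been dropped.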
Definition 10.1.41. Let $n \in \mathbb{Z}^+$. The $n$th derivative of $f(x)$ is what you get when you take the derivative of the function $n$ times. We write this as $f^{(n)}(x)$ or repeat the $'$ symbol $n$ times, as shown:
\[
\begin{aligned}
y &= f(x) \\
y' &= f'(x) \\
y'' &= f''(x) \\
y''' &= f'''(x)
\end{aligned}
\]
and so on.
We could also use Roman numerals, i.e. we would represent the fourth derivative of $f(x)$ as $f^{IV}(x)$.
Using Leibniz notation, we can represent the second derivative of $y$ as
\[ y'' = \frac{d}{dx}(y') = \frac{d}{dx}\left( \frac{dy}{dx} \right) = \frac{d^2 y}{dx^2}. \]
Solution.
\[
\begin{aligned}
y' &= 3x^2 + 2x + 1 \\
y'' &= 6x + 2 \\
y''' &= 6 \\
y^{IV} &= 0 \\
y^{(100)} &= 0
\end{aligned}
\]
Exercise 10.1.45. Find $\frac{d^{2017}}{dx^{2017}}(\cos(6x))$.
Exercise 10.1.46. Find $\frac{d^2}{dx^2}(f(x)g(x))$. What does the result resemble?
Example 10.2.1
Find the slope of the tangent line at the point $(3, 4)$ of the equation $x^2 + y^2 = 25$. What about $(4, -3)$?
Solution. This is a circle of radius 5 centered at the origin, so consider the following diagram:
[Figure: the circle $x^2 + y^2 = 25$ with the points $(3, 4)$ and $(4, -3)$ marked.]
The graph of this clearly fails the vertical line test, implying that $y$ cannot be a function of $x$ (there exist multiple $y$ values given one $x$ value). Therefore, we cannot apply our usual tactics.
One approach is to consider cases separately. First, we could solve for $y$ and get two separate equations, $y = \sqrt{25 - x^2}$ and $y = -\sqrt{25 - x^2}$, then evaluate the derivatives separately for $(3, 4)$, which lies in the top half, and $(4, -3)$, which lies in the bottom half, as shown:
[Figure: the upper semicircle $y = \sqrt{25 - x^2}$ through $(3, 4)$ and the lower semicircle $y = -\sqrt{25 - x^2}$ through $(4, -3)$.]
For the former, we get $y' = -\frac{x}{\sqrt{25 - x^2}}$, and then we plug in 3 from the point $(3, 4)$ to get the slope $-\frac{3}{4}$. For the latter, we get $y' = \frac{x}{\sqrt{25 - x^2}}$, and then we plug in 4 to get the slope $\frac{4}{3}$.
However, this method was inefficient and time-consuming. It is not even guaranteed that we are
able to find an explicit function for y, as the example x2 + y 2 = 25 was conveniently simple. In fact,
we can do better.
Note that the relation x2 + y 2 = 25 implies that x and y are tied to each other (i.e. share a
relation with one another) in some way, but not in a way such that one is an explicit function of
\[ x^2 + f(x)^2 = 25. \]
If we plug in the point $(3, 4)$, we have that $x = 3$ and $f(3) = 4$, giving our correct answer $-\frac{3}{4}$. If we plug in the point $(4, -3)$, we have that $x = 4$ and $f(4) = -3$, giving our correct answer $\frac{4}{3}$.
To take this a step further, we don't even need to bother with substituting $y = f(x)$. Leibniz notation allows us to take the derivative of both sides of the given equation and solve for $\frac{dy}{dx}$, while treating $y$ as some function of $x$. For the particular equation $x^2 + y^2 = 25$, we would do the following work:
\[
\begin{aligned}
x^2 + y^2 &= 25 \\
\frac{d}{dx}\left( x^2 + y^2 \right) &= \frac{d}{dx}(25) \\
\frac{d}{dx}\left( x^2 \right) + \frac{d}{dx}\left( y^2 \right) &= 0 \\
2x + \frac{d}{dx}\left( y^2 \right) &= 0.
\end{aligned}
\]
How would we evaluate $\frac{d}{dx}\left( y^2 \right)$? Recall that we are treating $y$ as a function of $x$, therefore we use Theorem 10.1.36:
\[ 2x + 2y \frac{d}{dx}(y) = 0. \]
Noting that $\frac{d}{dx}(y)$ is just $\frac{dy}{dx}$, we conclude that
\[
\begin{aligned}
2x + 2y \frac{dy}{dx} &= 0 \\
\frac{dy}{dx} &= -\frac{x}{y}.
\end{aligned}
\]
Now, we can simply use this formula by plugging in the appropriate values when given points like $(3, 4)$ and $(4, -3)$, and we can confirm that this formula yields the correct answers as well.
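The agreement between the implicit formula and the two explicit branches can also be confirmed numerically. This is a sketch with illustrative helper names (`central_difference`, `implicit_slope`, `upper_half`, `lower_half`) that are not from the text.

```python
import math

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for the derivative."""
    return (func(x + h) - func(x - h)) / (2 * h)

def implicit_slope(x, y):
    """dy/dx = -x / y, from differentiating x^2 + y^2 = 25 implicitly."""
    return -x / y

def upper_half(x):
    return math.sqrt(25 - x ** 2)

def lower_half(x):
    return -math.sqrt(25 - x ** 2)

# The implicit formula needs only the point; the explicit route needs the right branch.
print(implicit_slope(3, 4), central_difference(upper_half, 3.0))
print(implicit_slope(4, -3), central_difference(lower_half, 4.0))
```

Note that `implicit_slope` never has to decide which half of the circle the point lies on; the sign of $y$ takes care of that.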
Remark. The result from Example 10.2.1 is consistent with geometric results, i.e. the tangent line is always perpendicular to the radius at the point of intersection. Consider the following diagram:
[Figure: the radius from $(0, 0)$ to the point $(x, y)$ on the circle, with slope $m = \frac{y}{x}$.]
As the slope of the radius drawn to the point of intersection is $\frac{y}{x}$, the slope of the tangent line must be $-\frac{x}{y}$, since the tangent is perpendicular.
Problem 10.2.2. Find $\frac{dy}{dx}$ (i.e. $y'$) for $xy^2 + x^2 y^3 = 7$.
Solution. As always, we take the derivative of both sides of the equation and then simplify and solve for $\frac{dy}{dx}$. Never forget that we must treat $y$ as a function of $x$, as $\frac{dy}{dx}$ means that we are evaluating the derivative of $y$ with respect to $x$. Hence, we must use Theorem 10.1.36 in the process.
\[
\begin{aligned}
\frac{d}{dx}\left( xy^2 + x^2 y^3 \right) &= \frac{d}{dx}(7) \\
1 \cdot y^2 + x \cdot 2y \cdot \frac{dy}{dx} + 2x \cdot y^3 + x^2 \cdot 3y^2 \cdot \frac{dy}{dx} &= 0 \\
\frac{dy}{dx} &= \frac{-y^2 - 2xy^3}{2xy + 3x^2 y^2}.
\end{aligned}
\]
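Problem 10.2.3. Find $\frac{dy}{dx}$ (i.e. $y'$) for $x^2 + xy + y^2 = 12$.
Solution.
\[
\begin{aligned}
\frac{d}{dx}\left( x^2 + xy + y^2 \right) &= \frac{d}{dx}(12) \\
2x + y + x \frac{dy}{dx} + 2y \frac{dy}{dx} &= 0 \\
\frac{dy}{dx} &= -\frac{2x + y}{x + 2y}.
\end{aligned}
\]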
Problem 10.2.4. Find all points on the oblique ellipse from Problem 10.2.3 where the tangent lines are horizontal and vertical, respectively.
Solution. When the tangent line is horizontal, $-\frac{2x + y}{x + 2y} = 0$. This simplifies to $y = -2x$, and then we substitute this into the equation $x^2 + xy + y^2 = 12$ to get $x^2 - 2x^2 + 4x^2 = 12$, i.e. $x^2 = 4$, so $x = \pm 2$, and we find the respective $y$-values to get the points $(2, -4)$ and $(-2, 4)$.
When the tangent line is vertical, its slope is undefined. The only way $-\frac{2x + y}{x + 2y}$ can be undefined is when $x + 2y = 0$, i.e. $y = -\frac{x}{2}$. We plug this into the first equation to get $x^2 - \frac{x^2}{2} + \frac{x^2}{4} = 12$, i.e. $x^2 = 16$, so $x = \pm 4$, and we solve for the respective $y$-values to get the points $(4, -2)$ and $(-4, 2)$.
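The four points can be verified with exact integer arithmetic. The following is a minimal sketch; the helper names `on_curve` and `slope_parts` are illustrative and not from the text.

```python
def on_curve(x, y):
    """True when (x, y) satisfies x^2 + xy + y^2 = 12."""
    return x * x + x * y + y * y == 12

def slope_parts(x, y):
    """Numerator and denominator of dy/dx = -(2x + y) / (x + 2y)."""
    return -(2 * x + y), x + 2 * y

horizontal = [(2, -4), (-2, 4)]
vertical = [(4, -2), (-4, 2)]

for x, y in horizontal:
    numerator, denominator = slope_parts(x, y)
    # Horizontal tangent: numerator is 0 while the denominator is not.
    print((x, y), on_curve(x, y), numerator == 0, denominator != 0)

for x, y in vertical:
    numerator, denominator = slope_parts(x, y)
    # Vertical tangent: denominator is 0 while the numerator is not.
    print((x, y), on_curve(x, y), denominator == 0, numerator != 0)
```

Checking that the other part of the fraction is nonzero matters: a point where both vanish would need separate treatment.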
Problem 10.2.5. Find all horizontal and vertical tangent lines on the oblique ellipse $x^2 - xy + y^2 = 7$.
Solution. First, we find $\frac{dy}{dx}$. Note that
\[
\begin{aligned}
\frac{d}{dx}\left( x^2 - xy + y^2 \right) &= \frac{d}{dx}(7) \\
2x - \left( y + x \frac{dy}{dx} \right) + 2y \frac{dy}{dx} &= 0 \\
2x - y - x \frac{dy}{dx} + 2y \frac{dy}{dx} &= 0 \\
\frac{dy}{dx} &= \frac{y - 2x}{2y - x}.
\end{aligned}
\]
The tangent line is horizontal when $\frac{y - 2x}{2y - x} = 0$, or $y = 2x$. We substitute this into the equation $x^2 - xy + y^2 = 7$ to get $x^2 - x(2x) + (2x)^2 = 7$, which simplifies to $x = \pm\sqrt{\frac{7}{3}}$, so our horizontal tangent lines are $y = 2\sqrt{\frac{7}{3}}$ and $y = -2\sqrt{\frac{7}{3}}$.
The tangent line is vertical when its slope is undefined. The only way $\frac{y - 2x}{2y - x}$ can be undefined is when $2y - x = 0$, i.e. $x = 2y$. We plug this into the initial equation to get $(2y)^2 - (2y)y + y^2 = 7$ and obtain $y = \pm\sqrt{\frac{7}{3}}$, so our vertical tangent lines are $x = 2\sqrt{\frac{7}{3}}$ and $x = -2\sqrt{\frac{7}{3}}$.
Problem 10.2.6. Find $\frac{dy}{dx}$ (i.e. $y'$) for $x \sin(xy^3) + \sin^2(y) = 1$.
Solution. We repeatedly apply Theorem 10.1.36, so you should be familiar with using it.
\[
\begin{aligned}
\frac{d}{dx}\left( x \sin(xy^3) + \sin^2(y) \right) &= \frac{d}{dx}(1) \\
\frac{d}{dx}(x) \cdot \sin(xy^3) + x \cdot \frac{d}{dx}\left( \sin(xy^3) \right) + 2\sin(y) \cdot \frac{d}{dx}(\sin(y)) &= 0 \\
\sin(xy^3) + x \cdot \cos(xy^3) \cdot \frac{d}{dx}\left( xy^3 \right) + 2\sin(y)\cos(y) \frac{dy}{dx} &= 0 \\
\sin(xy^3) + x \cdot \cos(xy^3) \left( y^3 + x \cdot 3y^2 \frac{dy}{dx} \right) + 2\sin(y)\cos(y) \frac{dy}{dx} &= 0 \\
\sin(xy^3) + xy^3 \cos(xy^3) + 3x^2 y^2 \cos(xy^3) \frac{dy}{dx} + 2\sin(y)\cos(y) \frac{dy}{dx} &= 0 \\
\frac{dy}{dx}\left( 3x^2 y^2 \cos(xy^3) + 2\sin(y)\cos(y) \right) &= -\left( \sin(xy^3) + xy^3 \cos(xy^3) \right)
\end{aligned}
\]
Thus, we have
\[ \frac{dy}{dx} = -\frac{\sin(xy^3) + xy^3 \cos(xy^3)}{3x^2 y^2 \cos(xy^3) + \sin(2y)}. \]
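Problem 10.2.7. Find $\frac{dy}{dx}$ for the equation $\sin(xy) = x^2 + y^2$.
Solution.
\[
\begin{aligned}
\frac{d}{dx}(\sin(xy)) &= \frac{d}{dx}\left( x^2 + y^2 \right) \\
\cos(xy) \frac{d}{dx}(xy) &= 2x + 2y \frac{dy}{dx} \\
\cos(xy) \left( y + x \frac{dy}{dx} \right) &= 2x + 2y \frac{dy}{dx} \\
y\cos(xy) + x\cos(xy) \frac{dy}{dx} &= 2x + 2y \frac{dy}{dx} \\
\frac{dy}{dx} &= \frac{y\cos(xy) - 2x}{2y - x\cos(xy)}.
\end{aligned}
\]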
Exercise 10.2.8. Find $\frac{dy}{dx}$ for $x^3 + y^3 = 4$.
Exercise 10.2.9. Find $\frac{dy}{dx}$ for $y = \sin(3x + 4y)$.
Exercise 10.2.10. Find $\frac{dy}{dx}$ for $y = x^2 y^3 + x^3 y^2$.
Exercise 10.2.11. Find $\frac{dy}{dx}$ for $\cos^2 x + \cos^2 y = \cos(2x + 2y)$.
Exercise 10.2.12. Find $\frac{dy}{dx}$ for $x = \sqrt{x^2 + y^2}$.
Problem 10.2.13. Find $\frac{d^2 y}{dx^2}$ for $x^2 + y^2 = 25$.
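Solution. From Example 10.2.1, we have found that $\frac{dy}{dx} = -\frac{x}{y}$. Note that $\frac{d^2 y}{dx^2} = \frac{d}{dx}\left( \frac{dy}{dx} \right)$, so $\frac{d^2 y}{dx^2} = \frac{d}{dx}\left( -\frac{x}{y} \right)$. We then use Theorem 10.1.23 to get
\[
\begin{aligned}
\frac{d^2 y}{dx^2} &= \frac{y \cdot (-1) - (-x) \frac{dy}{dx}}{y^2} \\
&= \frac{-y - \frac{x^2}{y}}{y^2} \\
&= \frac{-x^2 - y^2}{y^3} \\
&= -\frac{x^2 + y^2}{y^3}.
\end{aligned}
\]
We are almost done! At this point, we can simply use the equation we are given, $x^2 + y^2 = 25$, and substitute in the appropriate value, to get $\frac{d^2 y}{dx^2} = -\frac{25}{y^3}$.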
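The formula $-\frac{25}{y^3}$ can be spot-checked on each half of the circle with a second difference. This is a sketch only; the helper names and the step size are illustrative assumptions.

```python
import math

def second_difference(func, x, h=1e-4):
    """Symmetric second difference, a numerical stand-in for the second derivative."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h ** 2

def upper_half(x):
    return math.sqrt(25 - x ** 2)

def lower_half(x):
    return -math.sqrt(25 - x ** 2)

def second_derivative_formula(y):
    """d^2y/dx^2 = -25 / y^3 on the circle x^2 + y^2 = 25."""
    return -25 / y ** 3

# At (3, 4) on the upper half and at (4, -3) on the lower half.
print(second_derivative_formula(4), second_difference(upper_half, 3.0))
print(second_derivative_formula(-3), second_difference(lower_half, 4.0))
```

The sign flips between the two points, matching the picture: the upper semicircle bends downward and the lower semicircle bends upward.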
Exercise 10.2.14. Find $\frac{d^2 y}{dx^2}$ for $x^2 + xy + y^2 = 1$.
Definition 10.3.1. The average rate of change of $s$ on the interval $[t_0, t_1]$ is equal to
\[ \frac{s(t_1) - s(t_0)}{t_1 - t_0} = \frac{\Delta s}{\Delta t}. \]
Note that this is just the slope of the secant line connecting points $(t_0, s(t_0))$ and $(t_1, s(t_1))$.
We will use $\Delta s$ and $\Delta t$ to represent the differences $s(t_1) - s(t_0)$ and $t_1 - t_0$, respectively.
This represents the rate of change at a single point in time, t, rather than taking the average of
two points.
We now introduce the concept of related rates problems with a series of examples.
Example 10.3.3
Consider a spherical balloon which is being blown up. Its volume is changing at a constant rate,
K. How does the radius of the balloon change? What about its surface area?
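We want to find the change in radius, i.e. $\frac{dr}{dt}$.
However, we keep in mind that $V$ and $r$ are functions of time $t$ (always remember that measurements are functions of time!).
Since the balloon is a sphere, we can use the volume formula for a sphere, i.e.
\[ V = \frac{4}{3}\pi r^3. \]
Differentiating both sides with respect to $t$ and using $\frac{dV}{dt} = K$ gives $K = 4\pi r^2 \frac{dr}{dt}$, so $\frac{dr}{dt} = \frac{K}{4\pi r^2}$.
Let $S$ denote the surface area of the balloon. Then the surface area of a sphere is known to be
\[ S = 4\pi r^2. \]
We implicitly differentiate to get $\frac{dS}{dt} = \frac{d}{dt}(4\pi r^2)$, which simplifies to $\frac{dS}{dt} = 8\pi r \frac{dr}{dt}$. We have already found that $\frac{dr}{dt} = \frac{K}{4\pi r^2}$, so we plug this back into the equation to get $\frac{dS}{dt} = \frac{2K}{r}$.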
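A small simulation makes the two rates concrete. Everything numeric below is a hypothetical choice made for illustration (the rate `K`, the starting volume `V0`, and the time `t`); only the formulas come from the example.

```python
import math

K = 3.0    # hypothetical constant rate of volume change
V0 = 10.0  # hypothetical volume at time t = 0

def radius(t):
    """Radius at time t, from V = V0 + K t and V = (4/3) pi r^3."""
    volume = V0 + K * t
    return (3 * volume / (4 * math.pi)) ** (1 / 3)

def surface_area(t):
    return 4 * math.pi * radius(t) ** 2

def central_difference(func, t, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for d/dt."""
    return (func(t + h) - func(t - h)) / (2 * h)

t = 2.0
r = radius(t)
dr_dt_formula = K / (4 * math.pi * r ** 2)
dS_dt_formula = 2 * K / r

print(dr_dt_formula, central_difference(radius, t))
print(dS_dt_formula, central_difference(surface_area, t))
```

Both rates depend on the current radius, so although the volume grows steadily, the radius and surface area grow more slowly as the balloon gets bigger.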
Example 10.3.4
How fast is the top of the ladder falling down the wall when the top is at a height of 12 ft. while the bottom of the ladder is being pushed outward 6 inches per minute?
[Figure: a 13 ft ladder leaning against a wall, with its top at height $y$.]
\[ 2x \frac{dx}{dt} + 2y \frac{dy}{dt} = 0. \]
We plug in $\frac{dx}{dt} = \frac{1}{2}$ to get $10 \cdot \frac{1}{2} + 24 \cdot \frac{dy}{dt} = 0$, which yields $\frac{dy}{dt} = -\frac{5}{24}$ feet per minute. It is numerically negative, but based on the word problem, we interpret this as the ladder falling down at $\frac{5}{2}$ in. per min.
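The arithmetic, including the feet-to-inches conversion, can be reproduced as follows. This sketch assumes the setup implied by the numbers in the example (a 13 ft ladder, so the foot of the ladder is 5 ft from the wall when the top is at 12 ft); the variable names are illustrative.

```python
import math

LADDER_FT = 13.0

def top_height(x):
    """Height of the top of the ladder when its foot is x feet from the wall."""
    return math.sqrt(LADDER_FT ** 2 - x ** 2)

x = 5.0       # feet from the wall, so that the top is at 12 ft
dx_dt = 0.5   # feet per minute (6 inches per minute)
y = top_height(x)

# From 2x dx/dt + 2y dy/dt = 0:
dy_dt = -x * dx_dt / y

# Cross-check by nudging the foot of the ladder for a short time dt.
dt = 1e-6
numeric = (top_height(x + dx_dt * dt) - top_height(x - dx_dt * dt)) / (2 * dt)

print(y, dy_dt, numeric)
print("inches per minute:", dy_dt * 12)
```

Keeping every length in feet until the final line avoids mixing units inside the differentiated equation.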
Problem 10.3.5. You are standing 500 ft. away from a tiny rocket. The rocket rises at a rate of 100 feet per second. How fast is its angle of elevation from you changing when the rocket is $500\sqrt{3}$ ft. high?
[Figure: right triangle with horizontal leg 500 and angle of elevation $\theta$.]
We are given that $y$ increases 100 feet per second, i.e. $\frac{dy}{dt} = 100$.
We want the change in $\theta$, which is $\frac{d\theta}{dt}$, when $y = 500\sqrt{3}$.
Using this right triangle relationship, we can deduce that $\tan\theta = \frac{y}{500}$. We differentiate this to get
\[ \sec^2\theta \, \frac{d\theta}{dt} = \frac{1}{500} \frac{dy}{dt}. \]
Note that $y = 500\sqrt{3}$ when $\theta = \frac{\pi}{3}$, i.e. $\sec\theta = 2$. Substituting that and $\frac{dy}{dt} = 100$ into the equation, we therefore have $4 \frac{d\theta}{dt} = \frac{1}{500} \cdot 100$, from which we solve to get $\frac{d\theta}{dt} = \frac{1}{20}$ radian per second, i.e. the angle of elevation is increasing at $\frac{1}{20}$ radian per second.
Example 10.3.6
Consider a cone-shaped cauldron (with a radius of 50 meters and a height of 100 meters) that
holds a potion. The potion leaks out at 2 cubic meters per minute. How fast is the height of
the potion changing when the height of the potion in the cauldron is 80 meters?
[Figure: a cone-shaped cauldron of radius 50 m and height 100 m, partly filled with potion.]
Solution. Let $r$ and $h$ be the radius and height of the cone formed by the leaking potion. We are given that $\frac{dV}{dt} = -2$. We want $\frac{dh}{dt}$ when $h = 80$.
We use the fact that the cone formed by the potion is similar to the cone of radius 50 meters and height 100 meters. Since the radius to height ratio is $\frac{50}{100} = \frac{1}{2}$, we know that $\frac{r}{h} = \frac{1}{2}$. Furthermore, we know the volume of the cone is $V = \frac{1}{3}\pi r^2 h$.
We rearrange to get $r = \frac{h}{2}$ and substitute this into the volume formula to get $V = \frac{1}{12}\pi h^3$, then differentiate it to get
\[ \frac{dV}{dt} = \frac{1}{4}\pi h^2 \frac{dh}{dt}. \]
We plug in the given $\frac{dV}{dt} = -2$ and $h = 80$ to get that $\frac{dh}{dt} = -\frac{1}{800\pi}$ meters per minute, i.e. the height is decreasing at $\frac{1}{800\pi}$ meters per minute.
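The same answer can be recovered by letting a little potion leak out and watching the height. This is an illustrative sketch; the function names and the time step `dt` are assumptions made for the example.

```python
import math

def volume(h):
    """Potion volume at height h, using r = h / 2: V = pi h^3 / 12."""
    return math.pi * h ** 3 / 12

def height(v):
    """Inverse of volume(): the potion height for a given volume."""
    return (12 * v / math.pi) ** (1 / 3)

dV_dt = -2.0  # cubic meters per minute
h0 = 80.0     # meters

# From dV/dt = (1/4) pi h^2 dh/dt:
dh_dt_formula = dV_dt / (math.pi * h0 ** 2 / 4)

# Cross-check: change the volume for a short time dt and measure the height change.
dt = 1e-3
v0 = volume(h0)
numeric = (height(v0 + dV_dt * dt) - height(v0 - dV_dt * dt)) / (2 * dt)

print(dh_dt_formula, numeric, -1 / (800 * math.pi))
```

The height falls very slowly here because the potion's surface is wide at $h = 80$, so a loss of 2 cubic meters is spread over a large area.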
Problem 10.3.7. Eric is walking along the path of the graph $y = x^2$, as shown:
[Figure: the parabola $y = x^2$.]
His $x$-coordinate increases 10 meters per second. How fast is the angle of inclination from the origin changing when his $x$-coordinate is 3 meters?
Solution. We let $x$ denote Eric's $x$-coordinate. We are given that his $x$-coordinate increases 10 meters per second, which means that $\frac{dx}{dt} = 10$. We let $\theta$ be the angle of inclination from the origin. We want to find $\frac{d\theta}{dt}$ when $x = 3$.
We note that the angle of inclination from the origin is simply the angle between the $x$-axis and the line through the origin and Eric's point on $y = x^2$. We also have to consider $x$, the $x$-coordinate. These relationships motivate us to draw a simplified diagram, as such:
[Figure: right triangle with horizontal leg $x$, vertical leg $x^2$, and angle $\theta$ at the origin.]
From this right triangle, it becomes apparent that $\tan\theta = x$, and this is our relation. We differentiate both sides with respect to $t$ to get $\sec^2\theta \, \frac{d\theta}{dt} = \frac{dx}{dt}$.
How would we find $\sec^2\theta$? We can use the identity $\tan^2\theta + 1 = \sec^2\theta$ (which can be derived from the Pythagorean identity $\sin^2\theta + \cos^2\theta = 1$).
When $x = 3$, then $\tan\theta = 3$, therefore $\sec^2\theta = 3^2 + 1 = 10$, so we have
\[ 10 \frac{d\theta}{dt} = 10, \]
as we are given that $\frac{dx}{dt} = 10$. We solve to get $\frac{d\theta}{dt} = 1$, therefore we can conclude that the angle of inclination from the origin increases at a rate of 1 radian per second.
Problem 10.3.8. The length of a rectangle increases by 10 cm. per hour. Its width decreases by 2 cm. per hour. When the length is 60 cm. and the width is 80 cm., determine how fast each of the following is changing:
a) Perimeter
b) Area
c) Length of diagonal
Solution. Let the length be $l$ and width be $w$. The problem statement implies that $\frac{dw}{dt} = -2$ and $\frac{dl}{dt} = 10$. Let the perimeter, area, and length of diagonal be $P$, $A$, and $L$ respectively. Therefore the problem is asking us to find $\frac{dP}{dt}$, $\frac{dA}{dt}$, and $\frac{dL}{dt}$, when $w = 80$ and $l = 60$.
b) Similarly, we have that $A = lw$, and differentiating both sides and simplifying yields:
\[
\begin{aligned}
\frac{dA}{dt} &= \frac{d}{dt}(lw) \\
&= \frac{dl}{dt} w + l \frac{dw}{dt} \\
&= 10 \cdot 80 + 60 \cdot (-2) \\
&= 680.
\end{aligned}
\]
[Figure: a clock face with hands of length 15 cm. and 8 cm.; $\theta$ is the angle between the hands and $x$ is the distance between their tips.]
Solution. Let the distance between the tips of the hour hand and minute hand be $x$, and the angle between the two hands be $\theta$. We intend to use the Law of Cosines on the triangle formed by the two hands of the clock and the third side, which is $x$. The problem is asking us to find $\frac{dx}{dt}$ when $\theta = \frac{2\pi}{3}$, which signifies the time of 4 o'clock.
Starting off with the Law of Cosines, we have:
First, we introduce some definitions to help us with stating the next few theorems.
Theorem 10.4.4
If the function $f$ has a local extremum at $x = a$, then either $f'(a) = 0$ or $f'(a)$ does not exist.
Proof. Without loss of generality, let there be a local maximum at $x = a$. The proof for the local minimum will be analogous.
Suppose $f'(a)$ exists, so we have
\[ f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}. \]
This implies that $f'(a) = \lim_{x \to a^+} \frac{f(x) - f(a)}{x - a} = \lim_{x \to a^-} \frac{f(x) - f(a)}{x - a}$. We analyze the one-sided limits separately:
Definition 10.4.5. The function $f$ has a critical point at $x = a$ if $f'(a) = 0$ or $f'(a)$ does not exist.
The proof of this theorem lies beyond the scope of this book.
The following problems will demonstrate the usefulness of this theorem regarding minimizing
and maximizing functions.
Problem 10.4.7. Maximize and minimize $f(x) = x^2 - x$ on $[-4, 4]$.
Solution. Note that $f'(x) = 2x - 1$, so the only critical point is $x = \frac{1}{2}$. Now, we examine all end points and critical points:
\[ f(-4) = 20, \qquad f\left( \frac{1}{2} \right) = -\frac{1}{4}, \qquad f(4) = 12. \]
Therefore, the maximum and minimum values of $f$ on $[-4, 4]$ are $20$ and $-\frac{1}{4}$ respectively.
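The procedure used in Problem 10.4.7 (evaluate the function at the end points and at the critical points inside the interval, then compare) is mechanical enough to sketch in code. The function name `closed_interval_extrema` is hypothetical, and the critical points are supplied by hand rather than computed.

```python
def closed_interval_extrema(f, endpoints, critical_points):
    """Compare f at the end points and at the critical points inside the interval.

    Returns ((x_max, f(x_max)), (x_min, f(x_min))).
    """
    a, b = endpoints
    # Critical points outside the open interval (a, b) are ignored.
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    values = {x: f(x) for x in candidates}
    x_max = max(values, key=values.get)
    x_min = min(values, key=values.get)
    return (x_max, values[x_max]), (x_min, values[x_min])

def f(x):
    return x ** 2 - x

# f'(x) = 2x - 1 vanishes only at x = 1/2.
maximum, minimum = closed_interval_extrema(f, (-4, 4), [0.5])
print(maximum)
print(minimum)
```

The filter on `critical_points` mirrors the habit of discarding critical points that fall outside the interval being considered.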
Problem 10.4.8. Maximize and minimize $f(x) = x^3 - 3x + 1$ on $[-4, 4]$.
Solution. We have $f'(x) = 3x^2 - 3$, so the critical points are $x = 1, -1$. Now, we examine all end points and critical points:
\[ f(-4) = -51, \qquad f(-1) = 3, \qquad f(1) = -1, \qquad f(4) = 53. \]
Therefore, the maximum and minimum values are $53$ and $-51$ respectively.
Solution. Since $f'(x) = 2\cos 2x$, the critical points are $x = \frac{\pi}{4}, \frac{3\pi}{4}$. Now, we examine all end points and critical points:
\[ f(0) = 0, \qquad f\left( \frac{\pi}{4} \right) = 1, \qquad f\left( \frac{3\pi}{4} \right) = -1, \qquad f(\pi) = 0. \]
Hence, the maximum and minimum values of $f$ on $[0, \pi]$ are $1$ and $-1$ respectively.
Problem 10.4.10. Maximize and minimize $f(x) = 4x + \frac{3}{x}$ on $\left[ \frac{1}{2}, 5 \right]$.
Solution. After we obtain $f'(x) = 4 - \frac{3}{x^2}$, we solve for the critical points: $4 - \frac{3}{x^2} = 0 \to 4x^2 = 3$, which gives us $x = \pm\frac{\sqrt{3}}{2}$. Since we are considering the interval $\left[ \frac{1}{2}, 5 \right]$, we discard the negative value of $x$. Now, we examine all end points and critical points:
\[ f\left( \frac{1}{2} \right) = 8, \qquad f\left( \frac{\sqrt{3}}{2} \right) = 4\sqrt{3}, \qquad f(5) = \frac{103}{5}. \]
Thus, the maximum and minimum values of $f$ on $\left[ \frac{1}{2}, 5 \right]$ are $\frac{103}{5}$ and $4\sqrt{3}$ respectively.
Problem 10.4.11. Squares of side length $x$ will be cut from each corner of a $10 \times 16$ in. piece of cardboard such that the remainder will be folded up into an open box. Find the value of $x$ which will maximize the volume of the box.
[Figure: a 10 in. by 16 in. rectangle with a square of side length $x$ marked at each corner.]
Solution. We are asked to maximize the function $V(x)$ which represents the volume of the resulting box. First, we must consider the constraints of the cardboard. Considering the 10 inch side, the side length of the squares cannot be greater than 5 inches. Therefore we must maximize $V(x)$ over the interval $[0, 5]$. We include the edge cases so we can apply Theorem 10.4.6.
The height of the box is $x$, and the dimensions of the base of the box would be $10 - 2x$ and $16 - 2x$, since each side is reduced by the 2 squares with side length $x$ cut off from the corners. Therefore, $V(x) = x(16 - 2x)(10 - 2x) = 4x^3 - 52x^2 + 160x$. Then $V'(x) = 12x^2 - 104x + 160 = 4(3x - 20)(x - 2)$, so the critical points are $x = \frac{20}{3}, 2$. However, since $\frac{20}{3}$ is not in the interval $[0, 5]$, we ignore this value, so we only check $x = 0, 2, 5$, as shown:
\[ V(0) = 0, \qquad V(2) = 144, \qquad V(5) = 0. \]
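[Figure: right triangles $ABC$ and $ADE$ sharing the vertex $A$, with $A$ between $B$ and $D$; the sides are labeled $BC = w$, $DE = y$, and $BD = z$.]
Suppose $\triangle ABC$ and $\triangle ADE$ are right triangles, with a common vertex at $A$. Let $BC = w$, $DE = y$, and $BD = z$. Locate $A$ such that the sum of the hypotenuses $AC$ and $AE$ is minimal. Then, demonstrate that regardless of the values of $w, y, z$, the minimum sum of $AC$ and $AE$ occurs when $\angle BAC \cong \angle DAE$.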
Solution. For some $x \in (0, z)$, let $BA = z - x$ and $DA = x$, as this problem would not make sense if $x$ was equal to either of the end points (resulting in one triangle disappearing completely). By the Pythagorean Theorem, we have $AC = \sqrt{w^2 + (z - x)^2}$ and $AE = \sqrt{y^2 + x^2}$. Then consider
\[ f(x) = \sqrt{y^2 + x^2} + \sqrt{w^2 + (z - x)^2}, \]
which will represent the sum of the hypotenuses based on $x$. We wish to minimize this function over $(0, z)$. Then
\[ f'(x) = \frac{x}{\sqrt{x^2 + y^2}} - \frac{z - x}{\sqrt{(z - x)^2 + w^2}}. \]
We solve for the critical point(s):
\[
\begin{aligned}
\frac{x}{\sqrt{x^2 + y^2}} - \frac{z - x}{\sqrt{(z - x)^2 + w^2}} &= 0 \\
\frac{x}{\sqrt{x^2 + y^2}} &= \frac{z - x}{\sqrt{(z - x)^2 + w^2}} \\
\frac{x^2}{x^2 + y^2} &= \frac{(z - x)^2}{(z - x)^2 + w^2} \\
\frac{x^2 + y^2}{x^2} &= \frac{(z - x)^2 + w^2}{(z - x)^2} \\
1 + \frac{y^2}{x^2} &= 1 + \frac{w^2}{(z - x)^2} \\
\frac{y^2}{x^2} &= \frac{w^2}{(z - x)^2} \\
\frac{y}{x} &= \frac{w}{z - x}.
\end{aligned}
\]
We then solve to get $x = \frac{yz}{w + y}$ as a critical point. After much simplification, we get
\[ f\left( \frac{yz}{w + y} \right) = \sqrt{(w + y)^2 + z^2}, \]
Proof. First, we will assume that $f$ has a maximum and a minimum on $[a, b]$ in the first place. By Theorem 10.4.6, these can occur either at end points or critical points.
If either is at a critical point $c$ where $a < c < b$, then $f'(c) = 0$ since $f$ is differentiable on $(a, b)$. In this case, we are done.
Otherwise, if both are at end points, $f$ must be constant on $[a, b]$ since $f(a) = f(b)$ and this common value would be both the maximum and the minimum. Thus, we have $f'(c) = 0$ $\forall c \in (a, b)$.
Next, we introduce a well-known theorem that will also not be proved in this chapter.
Proof. Note that $f(-1) = -1$ and $f(0) = 1$, so by Theorem 10.4.14, $\exists a$ with $-1 < a < 0$ such that $f(a) = 0$. Thus, there exists at least one root.
Assume there is another root $b$. Then $f(a) = f(b) = 0$. Clearly $f$ is continuous and differentiable over the domain, so by Theorem 10.4.13, there exists $c$ between $a$ and $b$ such that $f'(c) = 0$. However, $f'(x) = 3x^2 + 1 > 0$ $\forall x$, thus $f'(c)$ cannot be 0. We arrive at a contradiction, so $a$ is the one and only root of $f(x)$.
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
\[ g(x) = f(x) - \frac{f(b) - f(a)}{b - a}(x - a). \]
Note that $g(a) = f(a)$, and $g(b) = f(b) - (f(b) - f(a)) = f(a)$.
As $g$ is defined by subtracting a linear term from $f(x)$ (so all of its components are continuous and differentiable), $g$ is also continuous on $[a, b]$ and differentiable on $(a, b)$.
Corollary 10.4.17
If $f'(c) = 0$ $\forall c$ on a certain interval, then $f$ is constant on that interval.
Proof. Let $a, b$ be any two points on that interval with $a < b$. Then by Theorem 10.4.16, $\exists c$ with $a < c < b$ such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a} = 0. \]
Thus, $f(a) = f(b)$ for any points $a, b$ on that interval, so $f$ is constant on that interval.
Corollary 10.4.18
If $f' = g'$ on an interval, then $\exists c \in \mathbb{R}$ such that $f(x) = g(x) + c$.
Proof. Let $h(x) = f(x) - g(x)$. Then $h'(x) = f'(x) - g'(x) = 0$ on the interval. By Corollary 10.4.17, $h$ is constant on that interval, i.e. $h(x) = c$, so $f(x) - g(x) = c$, i.e. $f(x) = g(x) + c$.
Definition 10.4.19. $f$ is increasing on an interval if $\forall a, b$ on the interval, $a < b \to f(a) < f(b)$.
Definition 10.4.20. $f$ is decreasing on an interval if $\forall a, b$ on the interval, $a < b \to f(a) > f(b)$.
Corollary 10.4.21
If $f' > 0$ on an interval, $f$ is increasing on that interval. Similarly, if $f' < 0$ on an interval, $f$ is decreasing on that interval.
Proof. For all $a, b$ on the interval, let $a < b$ without loss of generality. By Theorem 10.4.16, $\exists c$ with $a < c < b$ such that $f'(c) = \frac{f(b) - f(a)}{b - a} > 0$, so $f(b) - f(a) > 0$, i.e. $f(a) < f(b)$. Thus, $\forall a, b$ on the interval, $a < b \to f(a) < f(b)$, so $f$ is increasing.
The proof for the latter statement is analogous.
From Corollary 10.4.21, we can now determine which parts of a given function are increasing or
decreasing, using a method called the first derivative number line, which will be demonstrated
through the following examples.
Problem 10.4.22. Determine the intervals on which the function $f(x) = x + \frac{1}{x}$ is increasing and decreasing.
Solution. We find the first derivative to be $f'(x) = 1 - \frac{1}{x^2}$, which has critical points at $-1$ and $1$.
We use the first derivative number line, as shown below. Note that the function is undefined at $x = 0$, so we have an open circle at $0$ and must consider the intervals on both sides of the open circle. Keep in mind that $x = 0$ is technically not considered a critical point, but we still include it on the first derivative number line anyway.
+ − − +
−1 0 1
Testing a value in each of the intervals, we find that the function is increasing on $(-\infty, -1]$, decreasing on $[-1, 0)$, decreasing on $(0, 1]$, and increasing on $[1, \infty)$.
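The "test a value in each interval" step of the first derivative number line can be sketched in code. The helper name `sign_chart` and the choice of test points (midpoints, plus one point beyond each end) are illustrative assumptions, not part of the text.

```python
def sign_chart(fprime, breakpoints, pad=1.0):
    """Sign of fprime at one test point inside each interval cut out by the breakpoints."""
    pts = sorted(breakpoints)
    # One test point left of the first breakpoint, the midpoints between
    # neighbouring breakpoints, and one test point right of the last breakpoint.
    tests = [pts[0] - pad]
    tests += [(p + q) / 2 for p, q in zip(pts, pts[1:])]
    tests += [pts[-1] + pad]
    # Test points lie strictly inside the intervals, so fprime is nonzero there.
    return ["+" if fprime(t) > 0 else "-" for t in tests]

# f(x) = x + 1/x, so f'(x) = 1 - 1/x^2; the marks are -1, 0 (undefined), and 1.
signs = sign_chart(lambda x: 1 - 1 / x**2, [-1, 0, 1])
print(signs)  # ['+', '-', '-', '+']
```

The same helper applies to any of the number lines in this section once the critical points and points of discontinuity are known.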
Problem 10.4.23. Determine the intervals on which the following functions are increasing and
decreasing.
1. $f(x) = 3x - 7$
2. $f(x) = x^2 - 4x + 3$
3. $f(x) = x^3 - 3x$
4. $f(x) = x^3$
Solution.
1. Since $f'(x) = 3$ is always positive, the function is increasing on the entire domain.
2. We get $f'(x) = 2x - 4$, so we get the critical point $x = 2$. Thus, our number line is
− +
2
We conclude that $f$ is decreasing on $(-\infty, 2]$ and increasing on $[2, \infty)$.
3. We evaluate $f'(x) = 3x^2 - 3$ to get the critical points $x = \pm 1$, so the number line is
+ − +
−1 1
We conclude that $f$ is increasing on $(-\infty, -1]$, decreasing on $[-1, 1]$, and increasing on $[1, \infty)$.
4. We have $f'(x) = 3x^2$, which is positive everywhere except at $x = 0$, where it equals $0$. A single point with $f' = 0$ does not stop $f$ from increasing (indeed, $a < b \to a^3 < b^3$), so $f$ is increasing over the entire domain.
Definition 10.4.24. f (x) is convex on an interval if ∀a, b in the interval, the secant line connecting
(a, f (a)) to (b, f (b)) lies above the graph on that part of the interval.
Definition 10.4.25. f (x) is concave on an interval if ∀a, b in the interval, the secant line connecting
(a, f (a)) to (b, f (b)) lies below the graph on that part of the interval.
Let two arbitrary points in a convex interval be a, b, and WLOG a < b. Consider any point
a < x < b.
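[Figure: a convex curve with the points $(a, f(a))$, $(x, f(x))$, and $(b, f(b))$ marked, together with the secant line from $(a, f(a))$ to $(b, f(b))$.]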
By our definition of convexity, $f(x)$ must be less than the corresponding point on the secant line connecting $(a, f(a))$ and $(b, f(b))$. We can algebraically represent this as
\[ \forall\, a < x < b, \quad f(x) < \frac{f(b) - f(a)}{b - a}(x - a) + f(a), \]
where $y = \frac{f(b) - f(a)}{b - a}(x - a) + f(a)$ is the equation of the secant line. We can rearrange this inequality as
\[ \forall\, a < x < b, \quad \frac{f(x) - f(a)}{x - a} < \frac{f(b) - f(a)}{b - a}, \]
which is another way of representing the condition for convexity.
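The slope form of the convexity condition is easy to test at sample points. The sketch below checks the inequality at evenly spaced $x$ in $(a, b)$; the function name `satisfies_convexity` and the sample count are illustrative assumptions, and a sampled check is evidence rather than a proof.

```python
def satisfies_convexity(f, a, b, samples=99):
    """Check (f(x) - f(a)) / (x - a) < (f(b) - f(a)) / (b - a) at evenly spaced x in (a, b)."""
    secant_slope = (f(b) - f(a)) / (b - a)
    for k in range(1, samples + 1):
        x = a + (b - a) * k / (samples + 1)
        if not (f(x) - f(a)) / (x - a) < secant_slope:
            return False
    return True

print(satisfies_convexity(lambda x: x**2, -1.0, 3.0))   # True: x^2 is convex
print(satisfies_convexity(lambda x: -x**2, -1.0, 3.0))  # False: -x^2 is concave
```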
Definition 10.4.26. For any two points $a, b$ in a convex interval, $\forall\, a < x < b$,
\[ \frac{f(x) - f(a)}{x - a} < \frac{f(b) - f(a)}{b - a}. \]
Definition 10.4.27. For any two points $a, b$ in a concave interval, $\forall\, a < x < b$,
\[ \frac{f(x) - f(a)}{x - a} > \frac{f(b) - f(a)}{b - a}. \]
Theorem 10.4.28
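If $f'' > 0$ over an interval, then $f$ is convex on that interval.
Proof. Let $a, b$ be two points on the interval, and WLOG $a < b$. Let $x$ be some point between $a$ and $b$.
First, note that $f'$ is increasing by Corollary 10.4.21. Applying Theorem 10.4.16 on the intervals $(a, x)$ and $(x, b)$, $\exists\, c \in (a, x)$ and $d \in (x, b)$ such that
\[ f'(c) = \frac{f(x) - f(a)}{x - a} \quad \text{and} \quad f'(d) = \frac{f(b) - f(x)}{b - x}. \]
Since $c < d$ and $f'$ is increasing, we have $f'(c) < f'(d)$, or
\[ \frac{f(x) - f(a)}{x - a} < \frac{f(b) - f(x)}{b - x}. \]
Then, note that the slope of the secant line from $a$ to $b$ is a weighted average of these two slopes,
\[ \frac{f(b) - f(a)}{b - a} = \frac{x - a}{b - a} \cdot \frac{f(x) - f(a)}{x - a} + \frac{b - x}{b - a} \cdot \frac{f(b) - f(x)}{b - x}, \]
with positive weights that sum to $1$, so it is strictly greater than the smaller of the two. That is, $\frac{f(x) - f(a)}{x - a} < \frac{f(b) - f(a)}{b - a}$, which is the condition for convexity in Definition 10.4.26.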
Exercise 10.4.29. Is the converse of Theorem 10.4.28 true? If not, what would be a counterexample?
Theorem 10.4.30
If $f'' < 0$ over an interval, then $f$ is concave on that interval.
Theorem 10.4.31
If $f'' > 0$ on an interval, its tangent lines lie below the graph.
Proof. Fix a point $a$ on the interval. The tangent line at $a$ is $y = f'(a)(x - a) + f(a)$, so we want to show that $f'(a)(x - a) + f(a) < f(x)$ for every other point $x$ on the interval. Note that $f'$ is increasing by Corollary 10.4.21.
• If $x > a$, then by Theorem 10.4.16, $\exists\, b$ with $a < b < x$ such that $f'(b) = \frac{f(x) - f(a)}{x - a}$. As $f'$ is increasing, we have $f'(a) < f'(b)$, i.e. $f'(a) < \frac{f(x) - f(a)}{x - a}$. This rearranges to $f'(a)(x - a) + f(a) < f(x)$, which is what we wanted.
• If $x < a$, then by Theorem 10.4.16, $\exists\, b$ with $x < b < a$ such that $f'(b) = \frac{f(a) - f(x)}{a - x}$. As $f'$ is increasing, we have $f'(b) < f'(a)$, i.e. $\frac{f(a) - f(x)}{a - x} < f'(a)$. This rearranges to $f'(a)(x - a) + f(a) < f(x)$, the same result as before.
We now have all the tools in calculus to sketch graphs effectively. When given such a task, we consider:
1. The first derivative, which tells us the intervals on which $f$ is increasing or decreasing.
2. The second derivative, which tells us the intervals on which $f$ is convex or concave.
3. Any asymptotes: consider the values of $x$ at which $f$ is undefined, the behavior of $f$ as $x$ goes to infinity or negative infinity, and any other special values.
Example 10.4.32
Sketch $f(x) = x^3 - 3x$ and label all relevant points.
Solution. We find that $f'(x) = 3x^2 - 3$, so its critical points are $\pm 1$, and we set up our first derivative number line and find the signs in each interval:
+ − +
−1 1
Now we know that $f$ is increasing on $(-\infty, -1]$, decreasing on $[-1, 1]$, and increasing on $[1, \infty)$.
We take the derivative of $f'(x)$ to get that $f''(x) = 6x$. Our only possible point of inflection is $0$, so we now set up our second derivative number line and find the signs:
− +
0
Now we know that $f$ is concave down on $(-\infty, 0]$ and concave up on $[0, \infty)$, and that $0$ is a point of inflection.
Lastly, we find the roots, which are $0$ and $\pm\sqrt{3}$; in particular, the graph passes through the origin. Since this function is a cubic, there are no asymptotes.
[Figure: graph of $f(x) = x^3 - 3x$ with the labeled points $(-\sqrt{3}, 0)$, $(-1, 2)$, $(0, 0)$, $(1, -2)$, and $(\sqrt{3}, 0)$.]
Starting from the left, we begin by sketching a sharply increasing concave curve until we reach
−1. This portion is signified as red in the diagram.
Then, according to the first derivative number line, the function starts decreasing. Keep in mind
that the shape is still concave. This portion is represented as blue in the diagram.
Next, the concavity changes at x = 0, an inflection point, but the function is still decreasing.
This is the pink portion of the graph.
Lastly, we finish the sketch by sharply increasing outwards while maintaining the convex shape.
This is the green part of the graph.
Problem 10.4.33. Sketch $y = \frac{1}{x^2 + 1}$.
Solution. Its first derivative is $y' = \frac{-2x}{(x^2 + 1)^2}$, so its first derivative number line would be:
+ −
0
Its second derivative is
\[ y'' = \frac{6x^4 + 4x^2 - 2}{(x^2 + 1)^4} = \frac{2(x^2 + 1)(3x^2 - 1)}{(x^2 + 1)^4}. \]
The possible points of inflection are $\pm\frac{\sqrt{3}}{3}$, and our second derivative number line would be:
+ − +
−√3/3 √3/3
Note that $\lim_{x \to \infty} \frac{1}{x^2 + 1} = \lim_{x \to -\infty} \frac{1}{x^2 + 1} = 0$, so $y = 0$ is an asymptote as $x$ goes to negative and positive infinity.
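[Figure: graph of $y = \frac{1}{x^2 + 1}$ with the labeled points $\left(-\frac{\sqrt{3}}{3}, \frac{3}{4}\right)$, $(0, 1)$, and $\left(\frac{\sqrt{3}}{3}, \frac{3}{4}\right)$.]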
Considering all of this information, we start off our sketch with the red line very close to the asymptote $y = 0$, increasing and convex.
When we hit $x = -\frac{\sqrt{3}}{3}$, we have the line (now represented by a blue line in the figure) become concave down, but still increasing.
When $x = 0$, the graph starts decreasing, so we continue with the line (now green) decreasing.
Lastly, when we reach $x = \frac{\sqrt{3}}{3}$, the line (which is now pink) becomes concave up again, but continues to decrease and approach the asymptote $y = 0$ as $x$ goes to infinity.
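As a sanity check on the second derivative used in this sketch, we can compare it with a finite-difference approximation. The code below uses the cancelled form $\frac{6x^2 - 2}{(x^2 + 1)^3}$, which equals the expression above after dividing out one factor of $x^2 + 1$; the function names and the step size `h` are illustrative assumptions.

```python
import math

def y(x):
    return 1 / (x**2 + 1)

def y2_formula(x):
    # Second derivative after cancelling the common factor (x^2 + 1).
    return (6 * x**2 - 2) / (x**2 + 1) ** 3

def y2_numeric(x, h=1e-4):
    # Central second difference approximation to y''(x).
    return (y(x + h) - 2 * y(x) + y(x - h)) / h**2

for x in (-2.0, -0.3, 0.0, 0.3, 2.0):
    print(x, y2_formula(x), y2_numeric(x))  # the two columns agree closely

inflection = math.sqrt(3) / 3
print(y2_formula(inflection))  # approximately 0
print(y(inflection))           # approximately 0.75
```

The printed signs match the second derivative number line: positive outside $\pm\frac{\sqrt{3}}{3}$ and negative between them.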
Exercise 10.4.34. Sketch $y = x^2 - \frac{1}{x}$.
Exercise 10.4.35. Sketch $y = \frac{x}{x^2 + 1}$.
Problem 10.4.36. Sketch y = x + sin x.
Solution. Note that $y' = 1 + \cos x$. Since $\cos x \in [-1, 1]$, we have $1 + \cos x \ge 0$. Thus, $y'$ is always positive except at the critical points $(\ldots, -\pi, \pi, 3\pi, \ldots)$, indicating that $y$ is always increasing, with horizontal tangent lines at the critical points.
Don't be afraid to deal with infinitely many critical points, because there will probably be a recognizable pattern.
Then, we evaluate the second derivative to be $-\sin x$, so our second derivative number line would be:
... + − + − ...
0 π 2π
Notice the pattern that + and − are infinitely alternating. This suggests that the graph switches between convex and concave at every multiple of $\pi$.
[Figure: graph of $y = x + \sin x$ with the labeled points $(-2\pi, -2\pi)$, $(-\pi, -\pi)$, $(0, 0)$, $(\pi, \pi)$, and $(2\pi, 2\pi)$.]
Theorem 10.4.37
If $f'(a) = 0$ and $f''(a) > 0$ then $f$ has a local minimum at $x = a$.
− +
a
Theorem 10.4.38
If $f'(a) = 0$ and $f''(a) < 0$ then $f$ has a local maximum at $x = a$.
10.5 Optimization
Example 10.5.1
If the radius of a sphere is 12, what is the volume of the largest cylinder that can be inscribed in the sphere?
Solution. Let r, h denote the radius and height of the cylinder respectively. Notice that the diagonal
of the cylinder is twice the radius of the sphere, and then we can take advantage of a right triangle
relationship:
[Figure: right triangle in the cross-section of the cylinder, with legs $2r$ and $h$ and hypotenuse $24$.]
Consider the formula for the volume of a cylinder, $V = \pi r^2 h$. Since it involves two variables, we cannot maximize it directly with our single-variable techniques. However, we have $h^2 + 4r^2 = 576$ from the right triangle, so we solve $r^2 = \frac{576 - h^2}{4}$, and substitute this back into the volume formula to get
\[ V(h) = \pi \cdot \frac{576 - h^2}{4} \cdot h = \frac{\pi}{4}(576h - h^3). \]
We differentiate to get $V'(h) = \frac{\pi}{4}(576 - 3h^2)$, from which we get the critical point $h = 8\sqrt{3}$ (the negative solution is discarded, as a height cannot be negative). Instead of going through the trouble to set up a first derivative number line, we can plug it into the second derivative. Note that $V''(h) = -\frac{3\pi}{2}h$ and thus $V''(8\sqrt{3}) < 0$. Then, by Theorem 10.4.38, $h = 8\sqrt{3}$ is a local maximum. However, since $8\sqrt{3}$ is the only critical point, it is therefore the global maximum, so the volume is maximized at $V(8\sqrt{3}) = 768\pi\sqrt{3}$.
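A coarse numerical search gives an independent check on the largest cylinder volume. The sketch below scans heights $0 < h < 24$ on a grid; the grid resolution and the names `volume` and `best_h` are illustrative assumptions.

```python
import math

def volume(h):
    # V(h) = (pi / 4) * (576h - h^3), valid for 0 < h < 24.
    return math.pi / 4 * (576 * h - h**3)

# Coarse grid search over the admissible heights 0 < h < 24.
grid = [24 * k / 100000 for k in range(1, 100000)]
best_h = max(grid, key=volume)

print(best_h, 8 * math.sqrt(3))                      # both close to 13.856
print(volume(best_h), 768 * math.pi * math.sqrt(3))  # both close to the largest volume
```

The grid search finds the same height as the critical point from $V'(h) = 0$, up to the grid spacing.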
Problem 10.5.2. Inside a hemisphere of radius 12 is inscribed a box with a square base. What
dimensions will maximize its volume?
Solution. This diagram represents the box with side length s and height h.
[Figure: right triangle inside the hemisphere with legs $h$ and $\frac{s\sqrt{2}}{2}$ and hypotenuse $12$.]
Given that it is inscribed in a hemisphere, the distance between the center of the square base (which is also the center of the base of the hemisphere) and one of the four upper corners of the box must be the radius of the hemisphere, or $12$. By properties of a square, the distance from the center of the base to one of the four lower corners must be half the diagonal of the square, or $\frac{s\sqrt{2}}{2}$.
Then, we have a right-triangle relationship and we can use the Pythagorean Theorem to relate all three sides: $h^2 + \left(\frac{s\sqrt{2}}{2}\right)^2 = 144$. This rearranges to $2h^2 + s^2 = 288$.
Given that the formula for the volume of this box is $s^2 h$, we solve the prior equation to get $s^2 = 288 - 2h^2$. Thus, we end up with $V(h) = (288 - 2h^2)h = 288h - 2h^3$.
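We differentiate to get $V'(h) = -6h^2 + 288$, and the critical points are $h = \pm 4\sqrt{3}$. However, we discard $h = -4\sqrt{3}$, as a height cannot be negative. Since $V''(h) = -12h$ is negative at $h = 4\sqrt{3}$, this critical point is a local maximum, and as the only positive critical point it is the global maximum. Then $s^2 = 288 - 2(4\sqrt{3})^2 = 192$, so $s = 8\sqrt{3}$, and the box measures $8\sqrt{3} \times 8\sqrt{3} \times 4\sqrt{3}$.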
Problem 10.5.3. Consider a circle with a sector cut out, at angle θ. Let this resulting figure have
a fixed area A. What values of r and θ will minimize the perimeter of this figure?
[Figure: a circle of radius $r$ with a sector of angle $\theta$ removed.]
Solution. We are given that the area $A$ is constant, so the area formula $A = \frac{2\pi - \theta}{2\pi} \cdot \pi r^2 = \frac{2\pi - \theta}{2} r^2$ establishes a relation between $r$ and $\theta$. Since we wish to minimize the perimeter, which is $2r + 2\pi r - r\theta$, we need to express the perimeter in terms of one variable only, so that we can apply our usual optimization techniques.
Note that $A = \frac{2\pi - \theta}{2} r^2$ rearranges to $\theta = 2\pi - \frac{2A}{r^2}$. We substitute this back into the initial expression for the perimeter to get
\[ P(r) = 2r + 2\pi r - r\left(2\pi - \frac{2A}{r^2}\right) = r\left(2 + \frac{2A}{r^2}\right) = 2r + \frac{2A}{r}. \]
We can then evaluate $P'(r) = 2 - \frac{2A}{r^2}$, from which we get the critical points $r = \pm\sqrt{A}$. We discard the solution $r = -\sqrt{A}$ as length cannot be negative, so we are left with only $r = \sqrt{A}$. We get $P''(r) = \frac{4A}{r^3}$, so $P''(\sqrt{A}) = \frac{4}{\sqrt{A}} > 0$, thus we confirm that $r = \sqrt{A}$ and therefore $\theta = 2\pi - 2$ yield the minimum perimeter.
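To see the minimum concretely, we can tabulate $P(r)$ near $r = \sqrt{A}$ for a sample area. The value `area = 5.0` is a hypothetical choice made only for illustration, as are the function names.

```python
import math

def perimeter(r, area):
    # P(r) = 2r + 2A/r, obtained by eliminating theta with the area constraint.
    return 2 * r + 2 * area / r

def theta_for(r, area):
    # theta = 2*pi - 2A/r^2, from the same constraint.
    return 2 * math.pi - 2 * area / r**2

area = 5.0  # hypothetical fixed area, chosen only for illustration
r_star = math.sqrt(area)
for r in (0.5 * r_star, 0.9 * r_star, r_star, 1.1 * r_star, 2 * r_star):
    print(round(r, 4), round(perimeter(r, area), 4))  # smallest perimeter at r = r_star

print(theta_for(r_star, area), 2 * math.pi - 2)  # the two values agree
```

Changing `area` moves the optimal radius but leaves the optimal angle at $2\pi - 2$, as the solution predicts.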
Problem 10.5.4. Show that if the point (a, b) on the parabola y = x2 is the closest point to (0, c)
(given c > 0), then the line connecting (a, b) to (0, c) is perpendicular to the tangent at (a, b).
Solution. The distance from $(a, b)$ to $(0, c)$ is $\sqrt{a^2 + (c - b)^2}$. We wish to minimize this distance, but it is sufficient to minimize the expression inside the radical, as this simplifies our work tremendously when differentiating. Note that $b = a^2$, as the point lies on the parabola $y = x^2$.
Consider $D(a) = a^2 + (c - a^2)^2$, so $D'(a) = 2a + 2(c - a^2)(-2a) = 2a(1 - 2c + 2a^2)$. The critical points are $a = 0, \pm\sqrt{\frac{2c - 1}{2}}$. However, if $c \le \frac{1}{2}$, then $a = 0$ would be the only critical point; in that case the closest point is the vertex $(0, 0)$, where the tangent is horizontal and the line to $(0, c)$ is vertical, so the two are perpendicular.
Assume $c > \frac{1}{2}$. Then $D''(a) = 2 - 4c + 12a^2$, so $D''\left(\sqrt{\frac{2c - 1}{2}}\right) = 8c - 4 > 0$, so $a = \sqrt{\frac{2c - 1}{2}}$ is a local minimum. The same follows for $a = -\sqrt{\frac{2c - 1}{2}}$.
Thus, the two equally closest points are $\left(\sqrt{\frac{2c - 1}{2}}, \frac{2c - 1}{2}\right)$ and $\left(-\sqrt{\frac{2c - 1}{2}}, \frac{2c - 1}{2}\right)$. The slope of the line connecting $(0, c)$ and $\left(\sqrt{\frac{2c - 1}{2}}, \frac{2c - 1}{2}\right)$ is
\[ \frac{\frac{2c - 1}{2} - c}{\sqrt{\frac{2c - 1}{2}}} = -\frac{1}{2\sqrt{\frac{2c - 1}{2}}}. \]
Note that the derivative of $y = x^2$ is $y' = 2x$, so the slope of the tangent line through $\left(\sqrt{\frac{2c - 1}{2}}, \frac{2c - 1}{2}\right)$ is $2\sqrt{\frac{2c - 1}{2}}$, and we confirm that these slopes are negative reciprocals of each other, so the lines are perpendicular. The same reasoning can be applied to $\left(-\sqrt{\frac{2c - 1}{2}}, \frac{2c - 1}{2}\right)$ as well.
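The perpendicularity claim can be spot-checked for a few values of $c > \frac{1}{2}$. The sample values of `c` and the function name `slope_product` are illustrative assumptions.

```python
import math

def slope_product(c):
    """For c > 1/2: (slope of the line to (0, c)) * (slope of the tangent) at the closest point."""
    a = math.sqrt((2 * c - 1) / 2)   # positive critical point of D(a)
    b = a**2                         # the point (a, b) lies on y = x^2
    connecting_slope = (b - c) / (a - 0)
    tangent_slope = 2 * a
    return connecting_slope * tangent_slope

for c in (0.75, 1.0, 3.0, 10.0):
    print(c, slope_product(c))  # each product is approximately -1
```

A product of $-1$ is exactly the negative-reciprocal condition used in the solution.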
Problem 10.5.5. Maximize and minimize $a \sin x + b \cos x$ using calculus. Assume $a, b \ne 0$.
Solution. Let $f(x) = a \sin x + b \cos x$. Then $f'(x) = a \cos x - b \sin x$, so $x$ is a critical point only when $a \cos x = b \sin x$, or $\tan x = \frac{a}{b}$. This indicates that $\sin x = \frac{a}{\sqrt{a^2 + b^2}}$ and $\cos x = \frac{b}{\sqrt{a^2 + b^2}}$, or $\sin x = -\frac{a}{\sqrt{a^2 + b^2}}$ and $\cos x = -\frac{b}{\sqrt{a^2 + b^2}}$.
When we substitute the former values into the second derivative, $f''(x) = -a \sin x - b \cos x$, we get
\[ -a \cdot \frac{a}{\sqrt{a^2 + b^2}} - b \cdot \frac{b}{\sqrt{a^2 + b^2}} = \frac{-(a^2 + b^2)}{\sqrt{a^2 + b^2}} = -\sqrt{a^2 + b^2} < 0, \]
so there is a local maximum at the value of $x$ that satisfies $\sin x = \frac{a}{\sqrt{a^2 + b^2}}$ and $\cos x = \frac{b}{\sqrt{a^2 + b^2}}$.
However, notice that $f''(x) = -f(x)$, so we can immediately conclude that $-(-\sqrt{a^2 + b^2}) = \sqrt{a^2 + b^2}$ is the maximum value of $f$. By the same reasoning, the latter values give $f''(x) = \sqrt{a^2 + b^2} > 0$, so $-\sqrt{a^2 + b^2}$ is the minimum value.
First, we introduce a precursory theorem to aid us in proving the main formula for this section.
Proof. Let $h(x) = (f(b) - f(a))g(x) - (g(b) - g(a))f(x)$. Note that $h(a) = (f(b) - f(a))g(a) - (g(b) - g(a))f(a) = f(b)g(a) - f(a)g(b)$, and $h(b) = (f(b) - f(a))g(b) - (g(b) - g(a))f(b) = f(b)g(a) - f(a)g(b)$, so $h(a) = h(b)$. Therefore, by Theorem 10.4.13, $\exists\, x$ with $a < x < b$ such that $h'(x) = 0$, i.e. $(f(b) - f(a))g'(x) - (g(b) - g(a))f'(x) = 0$, which rearranges to our desired result.
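Proof. Assuming that $f$ and $g$ are differentiable, note that they must also be continuous, by Theorem 10.1.20. Thus, $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$ indicates that $f(a) = g(a) = 0$.
By the preceding theorem, for each $x$ near $a$ there is some $b$ between $x$ and $a$ with $\frac{g(x)}{f(x)} = \frac{g(x) - g(a)}{f(x) - f(a)} = \frac{g'(b)}{f'(b)}$. Then $\lim_{x \to a} \frac{g(x)}{f(x)} = \lim_{x \to a} \frac{g'(b)}{f'(b)}$. However, note that as $x$ approaches $a$, $b$ will also approach $a$ since $b$ is between $x$ and $a$. Thus,
\[ \lim_{x \to a} \frac{g(x)}{f(x)} = \lim_{x \to a} \frac{g'(b)}{f'(b)} = \frac{g'(a)}{f'(a)} = \lim_{x \to a} \frac{g'(x)}{f'(x)}, \]
as desired.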
Problem 10.6.3. Evaluate $\lim_{x \to 2} \frac{x^2 - 3x + 2}{x^2 - 5x + 6}$ using Theorem 10.6.2.
Solution. When we plug in $x = 2$, the numerator and denominator both become $0$, so we can apply Theorem 10.6.2 to get
\[ \lim_{x \to 2} \frac{x^2 - 3x + 2}{x^2 - 5x + 6} = \lim_{x \to 2} \frac{2x - 3}{2x - 5} = -1. \]
Exercise 10.6.4. Evaluate $\lim_{x \to 0} \frac{\sin x}{x}$ and $\lim_{x \to 0} \frac{\sin 3x}{\sin 5x}$ using Theorem 10.6.2.
Problem 10.6.5. Evaluate $\lim_{x \to 0} \frac{1 - \cos 3x}{1 - \cos 5x}$ using Theorem 10.6.2.
Solution. First, we have $\lim_{x \to 0} \frac{1 - \cos 3x}{1 - \cos 5x} = \lim_{x \to 0} \frac{3 \sin 3x}{5 \sin 5x}$, but this still evaluates to $\frac{0}{0}$ when $x = 0$ is plugged in. Therefore, we apply Theorem 10.6.2 again: $\lim_{x \to 0} \frac{3 \sin 3x}{5 \sin 5x} = \lim_{x \to 0} \frac{9 \cos 3x}{25 \cos 5x} = \frac{9}{25}$.
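Problem 10.6.6. Evaluate $\lim_{x \to 2} \frac{x^2 - 5x + 6}{x^2 - 4x + 4}$ using Theorem 10.6.2.
Solution. We have $\lim_{x \to 2} \frac{x^2 - 5x + 6}{x^2 - 4x + 4} = \lim_{x \to 2} \frac{2x - 5}{2x - 4}$, but this limit does not exist: the numerator approaches $-1$ while the denominator approaches $0$ with opposite signs from the two sides, so the one-sided limits are $+\infty$ and $-\infty$.
Therefore, $\lim_{x \to 2} \frac{x^2 - 5x + 6}{x^2 - 4x + 4}$ does not exist.
Problem 10.6.7. Compute $\lim_{x \to 0} \frac{3x - \sin 3x}{x^3}$.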
Theorem 10.6.8
If $\lim_{x \to a} \frac{f'(x)}{g'(x)} = \pm\infty$, then $\lim_{x \to a} \frac{f(x)}{g(x)} = \pm\infty$. The same holds for right-hand and left-hand limits.
In other words, Theorem 10.6.2 extends to $\pm\frac{\infty}{\infty}$.
Proof. Suppose $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = \infty$. Assuming that $\lim_{x \to a} \frac{f(x)}{g(x)}$ exists, we rearrange the fraction,
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{\frac{1}{g(x)}}{\frac{1}{f(x)}}, \]
and now it is appropriate to apply Theorem 10.6.2, as the numerator and denominator now go to $0$. Then we differentiate,
\[ \lim_{x \to a} \frac{\frac{1}{g(x)}}{\frac{1}{f(x)}} = \lim_{x \to a} \frac{-\frac{g'(x)}{g(x)^2}}{-\frac{f'(x)}{f(x)^2}} = \lim_{x \to a} \frac{f(x)}{g(x)} \cdot \frac{f(x)}{g(x)} \cdot \frac{g'(x)}{f'(x)}. \]
Thus,
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f(x)}{g(x)} \cdot \lim_{x \to a} \frac{f(x)}{g(x)} \cdot \lim_{x \to a} \frac{g'(x)}{f'(x)}, \]
so $\lim_{x \to a} \frac{f'(x)}{g'(x)} = \lim_{x \to a} \frac{f(x)}{g(x)}$, which is what we want.
10.7 Inverses
Proof. Looking back to Definition 10.4.19, $\forall a, b$, $a < b \to f(a) < f(b)$. Then the contrapositive of this implication must be true, i.e. $f(b) \le f(a) \to b \le a$. Let $\tilde{b} = f(b)$ and $\tilde{a} = f(a)$. It follows that $f^{-1}(\tilde{b}) = b$ and $f^{-1}(\tilde{a}) = a$, so we have $\tilde{b} \le \tilde{a} \to f^{-1}(\tilde{b}) \le f^{-1}(\tilde{a})$. However, note that $f$ and $f^{-1}$ are one-to-one (or else the inverse would not exist), so we can safely get rid of the equal signs to conclude $\tilde{b} < \tilde{a} \to f^{-1}(\tilde{b}) < f^{-1}(\tilde{a})$. Thus, $f^{-1}$ is increasing.
Theorem 10.7.2
Suppose $f$ is continuous and increasing on $[a, b]$. Then $f^{-1}$ is continuous on $[f(a), f(b)]$.
The proof, which will involve the $\varepsilon$-$\delta$ definition of continuity, is left to the reader.
Now it makes sense to consider the derivative of inverses. We now prove the general formula:
Proof. Assume $f$ is continuous and one-to-one, or else it would not have an inverse. Let $f^{-1}(x) = y$, or $x = f(y)$. Note that
\[ (f^{-1})'(x) = \lim_{h \to 0} \frac{f^{-1}(x + h) - f^{-1}(x)}{h}. \]
When $h$ is small and approaches $0$, $f^{-1}(x + h)$ approaches $f^{-1}(x) = y$. So, let $f^{-1}(x + h) = y + \tilde{h}$, for some small $\tilde{h}$. This also suggests that $x + h = f(y + \tilde{h})$, so that $h = f(y + \tilde{h}) - f(y)$. Thus, the difference quotient above becomes
\[ \lim_{\tilde{h} \to 0} \frac{\tilde{h}}{f(y + \tilde{h}) - f(y)} = \lim_{\tilde{h} \to 0} \frac{1}{\frac{f(y + \tilde{h}) - f(y)}{\tilde{h}}} = \frac{1}{f'(y)} = \frac{1}{f'(f^{-1}(x))}, \]
so $(f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))}$.
Exercise 10.7.4. Use Theorem 10.1.36 to derive this formula. Why can’t this method be a rigorous
proof for Theorem 10.7.3?
Example 10.7.5
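Find $\frac{d}{dx}(\sin^{-1}(x))$, $\frac{d}{dx}(\cos^{-1}(x))$, and $\frac{d}{dx}(\tan^{-1}(x))$.
1. We can directly apply the formula from Theorem 10.7.3: $\frac{d}{dx}(\sin^{-1}(x)) = \frac{1}{\cos(\sin^{-1}(x))} = \frac{1}{\sqrt{1 - x^2}}$.
2. Let $y = \cos^{-1}(x)$, so that $\cos y = x$. Differentiating both sides implicitly,
\[ \frac{d}{dx}(\cos y) = \frac{d}{dx}(x) \longleftrightarrow -\sin(y)\frac{dy}{dx} = 1, \]
so $\frac{dy}{dx} = -\frac{1}{\sin y} = -\frac{1}{\sin(\cos^{-1}(x))} = -\frac{1}{\sqrt{1 - x^2}}$.
3. Differentiating both sides of $x = \tan(\tan^{-1}(x))$,
\[ \frac{d}{dx}(x) = \frac{d}{dx}(\tan(\tan^{-1}(x))) \longleftrightarrow 1 = \sec^2(\tan^{-1}(x)) \frac{d}{dx}(\tan^{-1}(x)), \]
so $\frac{d}{dx}(\tan^{-1}(x)) = \frac{1}{\sec^2(\tan^{-1}(x))} = \frac{1}{x^2 + 1}$.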
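These three derivative formulas can be compared against central difference quotients. The sample point `x = 0.4`, the step size, and the helper name `central_diff` are illustrative assumptions.

```python
import math

def central_diff(func, x, h=1e-6):
    # Symmetric difference quotient approximating func'(x).
    return (func(x + h) - func(x - h)) / (2 * h)

x = 0.4  # any sample point with -1 < x < 1
print(central_diff(math.asin, x), 1 / math.sqrt(1 - x**2))
print(central_diff(math.acos, x), -1 / math.sqrt(1 - x**2))
print(central_diff(math.atan, x), 1 / (x**2 + 1))
```

In each printed pair, the numerical estimate and the closed form agree to several decimal places.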
Exercise 10.7.6. Evaluate the derivatives of the rest of the inverse trigonometric functions.
Exercise 10.7.7. Use Theorem 10.7.3 to prove that $\forall n \in \mathbb{N}$, $\frac{d}{dx}\left(x^{\frac{1}{n}}\right) = \frac{1}{n} x^{\frac{1}{n} - 1}$. Then, prove the general case: $\forall r \in \mathbb{Q}$, $\frac{d}{dx}(x^r) = r x^{r - 1}$.
Exercise 10.7.8. Find $\frac{d}{dx} \tan^{-1}\left(x^{\frac{7}{2}}\right)$.
\[ x = f(t), \qquad y = g(t). \]
If $f'(t) \ne 0$, then $\frac{dy}{dx} = \frac{g'(t)}{f'(t)}$.
Proof. Since the equations x = f (t), y = g(t) do not necessarily define a function (there could be
loops or spirals), we need to impose some restrictions. For some t0 , y is a “function” of x locally
near t0 .
We can restrict the parametric equation to a certain domain and range such that it is an actual
function. In this area, let y = h(x), so y = g(t) = h(x) = h(f (t)). Then we implicitly differentiate
to get
\[ g'(t) = h'(f(t))f'(t) \longleftrightarrow h'(x) = \frac{g'(t)}{f'(t)}, \]
which is what we wanted.
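The formula $\frac{dy}{dx} = \frac{g'(t)}{f'(t)}$ can be compared with direct differentiation on a curve where $y$ is explicitly a function of $x$. The curve $x = t^2$, $y = t^3$ with $t > 0$ is a hypothetical example chosen for illustration (it is not one of the exercises), and the function name `dy_dx` is likewise an assumption.

```python
def dy_dx(f_prime, g_prime, t):
    """Slope of the curve x = f(t), y = g(t) at parameter t, assuming f'(t) != 0."""
    return g_prime(t) / f_prime(t)

# Hypothetical curve for illustration: x = t^2, y = t^3 with t > 0,
# which is locally the graph of h(x) = x^(3/2).
t = 2.0
slope_parametric = dy_dx(lambda t: 2 * t, lambda t: 3 * t**2, t)
slope_direct = 1.5 * (t**2) ** 0.5   # h'(x) = (3/2) x^(1/2) evaluated at x = t^2

print(slope_parametric, slope_direct)  # 3.0 3.0
```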
Exercise 10.8.2. Given the parametric equations $x = \cos\theta$, $y = \sin\theta$ (what is this graph?), find $\frac{dy}{dx}$.
Exercise 10.8.3. Given the parametric equations $x = k(\theta - \sin\theta)$, $y = k(1 - \cos\theta)$ (where $k \in \mathbb{R}$ is some constant), find $\frac{dy}{dx}$.
Problem 10.8.4. How can we get the slope of r = f (θ) at some fixed θ = θ0 ?
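Solution. Note that $x = r\cos\theta$, $y = r\sin\theta$, so substitute $r = f(\theta)$ to get $x = f(\theta)\cos\theta$ and $y = f(\theta)\sin\theta$. Apply Theorem 10.8.1 to this, and we get
\[ \frac{dy}{dx} = \frac{f'(\theta)\sin\theta + f(\theta)\cos\theta}{f'(\theta)\cos\theta - f(\theta)\sin\theta}, \]
evaluated at $\theta = \theta_0$, provided the denominator is nonzero there.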
10.9 Review
[Figure: right triangle with side lengths $3$, $4$, and $5$, with the points $A$, $C$, $D$, $E$, and $F$ labeled.]
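Solution. Let $DE = x$. By AA similarity, $\triangle BDE \sim \triangle BAC$, so $\frac{BD}{4} = \frac{x}{3}$, which rearranges to $BD = \frac{4}{3}x$. Therefore $DA = EF = 4 - \frac{4}{3}x$, and $DF = \sqrt{x^2 + \left(4 - \frac{4}{3}x\right)^2}$ by the Pythagorean Theorem. We want to optimize the perimeter, so consider the function
\[ f(x) = x + 4 - \frac{4}{3}x + \sqrt{x^2 + \left(4 - \frac{4}{3}x\right)^2}. \]
We differentiate to get
\[ f'(x) = -\frac{1}{3} + \frac{\frac{25}{9}x - \frac{16}{3}}{\sqrt{\frac{25}{9}x^2 - \frac{32}{3}x + 16}}. \]
Setting $f'(x) = 0$ and squaring gives the candidates $x = \frac{48 - 3\sqrt{6}}{25}$ and $x = \frac{48 + 3\sqrt{6}}{25}$. Since squaring can introduce extraneous solutions, each candidate must be checked in $f'(x) = 0$: the fraction has to equal $+\frac{1}{3}$, which requires $x > \frac{48}{25}$, so only $x = \frac{48 + 3\sqrt{6}}{25}$ is a genuine critical point. It is left to the reader to verify whether it yields a local maximum or minimum.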
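Because solving $f'(x) = 0$ here involves squaring both sides, it can help to evaluate $f'$ numerically at each value $x = \frac{48 \pm 3\sqrt{6}}{25}$. The sketch below does exactly that; the names `f_prime` and `candidates` are illustrative assumptions.

```python
import math

def f_prime(x):
    # f'(x) = -1/3 + (25x/9 - 16/3) / sqrt(25x^2/9 - 32x/3 + 16)
    u = 25 / 9 * x**2 - 32 / 3 * x + 16
    return -1 / 3 + (25 / 9 * x - 16 / 3) / math.sqrt(u)

candidates = [(48 - 3 * math.sqrt(6)) / 25, (48 + 3 * math.sqrt(6)) / 25]
for x in candidates:
    print(x, f_prime(x))
```

A candidate at which the printed derivative is not close to $0$ is an extraneous solution introduced by the squaring step.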
Exercise 10.9.2. Find $\frac{d}{dx}\left(\tan^{-1}\left(\tan^{-1}(x)\right)\right)$ and $\frac{d}{dx}\left(x \sin^{-1} x\right)$.
Problem 10.9.3. Let $f(x) = x^3 + x + 1$. Find $(f^{-1})'(31)$.
Solution. By Theorem 10.7.3, $(f^{-1})'(x) = \frac{1}{3(f^{-1}(x))^2 + 1}$. To find $(f^{-1})'(31)$, we need to find $f^{-1}(31)$ in order to use the formula. Note that $3$ is the only real root that satisfies $x^3 + x + 1 = 31$, so $f^{-1}(31) = 3$, so
\[ (f^{-1})'(31) = \frac{1}{3(f^{-1}(31))^2 + 1} = \frac{1}{3 \cdot 3^2 + 1} = \frac{1}{28}. \]
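The value $\frac{1}{28}$ can be cross-checked without the formula by inverting $f$ numerically and taking a difference quotient. The bisection bracket $[-10, 10]$, the tolerance, and the step size are illustrative assumptions.

```python
def f(x):
    return x**3 + x + 1

def f_inverse(y, lo=-10.0, hi=10.0, tol=1e-13):
    """Invert the increasing function f by bisection on the assumed bracket [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

h = 1e-5
numeric = (f_inverse(31 + h) - f_inverse(31 - h)) / (2 * h)
print(f_inverse(31))     # approximately 3
print(numeric, 1 / 28)   # both approximately 0.0357
```

Bisection works here because $f$ is increasing, which is the same property that guarantees the inverse exists.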
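Problem 10.9.4. Find $\lim_{x \to \frac{\pi}{2}^-} \left(x - \frac{\pi}{2}\right)\tan x$, $\lim_{x \to 0} \frac{\sin^{-1} x}{\tan^{-1} x}$, and $\lim_{x \to 1} \frac{x^{100} - 1}{x^{99} - 1}$.
Solution.
1. For the first limit, rewrite $\tan x = \frac{\sin x}{\cos x}$ to get
\[ \lim_{x \to \frac{\pi}{2}^-} \frac{\left(x - \frac{\pi}{2}\right)\sin x}{\cos x}, \]
whose numerator and denominator both go to $0$. Applying Theorem 10.6.2 gives
\[ \lim_{x \to \frac{\pi}{2}^-} \frac{\sin x + \left(x - \frac{\pi}{2}\right)\cos x}{-\sin x} = -1. \]
2. Recalling that $\frac{d}{dx}(\sin^{-1}(x)) = \frac{1}{\sqrt{1 - x^2}}$ and $\frac{d}{dx}(\tan^{-1}(x)) = \frac{1}{x^2 + 1}$, we have
\[ \lim_{x \to 0} \frac{\sin^{-1} x}{\tan^{-1} x} = \lim_{x \to 0} \frac{\frac{1}{\sqrt{1 - x^2}}}{\frac{1}{x^2 + 1}} = \lim_{x \to 0} \frac{x^2 + 1}{\sqrt{1 - x^2}} = 1. \]
3. Likewise,
\[ \lim_{x \to 1} \frac{x^{100} - 1}{x^{99} - 1} = \lim_{x \to 1} \frac{100x^{99}}{99x^{98}} = \frac{100}{99}. \]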
Problem 10.9.5. Consider a $3$-$4$-$5$ right triangle as shown.
[Figure: right triangle with legs $3$ and $4$ and hypotenuse $5$, with a horizontal segment $AB$ drawn inside it.]
The horizontal segment $AB$ moves downward at a rate of $\frac{1}{3}$ unit per year. How fast is the length $AB$ changing when $AB = 2$ units?
Solution. Let $AB = y$, and let the altitude from $AB$ to the base of side length $3$ be $x$. We label the triangle as such:
[Figure: the same triangle with the labels $4 - x$, $y$, $A$, and $B$.]
Note that the small right triangle with side lengths $4 - x$ and $y$ is similar to the $3$-$4$-$5$ triangle by AA similarity. Thus, $\frac{y}{3} = \frac{4 - x}{4}$, which rearranges to $4y + 3x = 12$.
Interpreting the problem, we are given $\frac{dx}{dt} = -\frac{1}{3}$, and we want to find $\frac{dy}{dt}$ when $y = 2$. We implicitly differentiate the equation above:
\[ \frac{d}{dt}(4y + 3x) = \frac{d}{dt}(12) \longleftrightarrow 4\frac{dy}{dt} + 3\frac{dx}{dt} = 0, \]
and we substitute the given values to get $4\frac{dy}{dt} + 3\left(-\frac{1}{3}\right) = 0$, so $\frac{dy}{dt} = \frac{1}{4}$ unit per year. Note that we did not even need to use $y = 2$, because the horizontal segment's rate of change is constant, given the relation $4y + 3x = 12$.
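The related-rates answer can be checked by stepping $x$ forward a short time and measuring how $y$ responds. The time step `dt` and the variable names are illustrative assumptions.

```python
def y_of_x(x):
    # From the relation 4y + 3x = 12.
    return (12 - 3 * x) / 4

dx_dt = -1 / 3
dt = 1e-6
x0 = 4 / 3            # the value of x at which y = 2
x1 = x0 + dx_dt * dt  # x a short time later
dy_dt = (y_of_x(x1) - y_of_x(x0)) / dt
print(y_of_x(x0))  # approximately 2
print(dy_dt)       # approximately 0.25
```

Repeating the computation from a different starting `x0` gives the same rate, which reflects the remark that the value $y = 2$ is not actually needed.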
Problem 10.9.6. A box with a square base and an open top has volume 1000 cubic inches. What
dimensions will minimize its surface area?
Solution. Let the side length of the square base be $x$, and therefore the height of the box will be $\frac{1000}{x^2}$. Then the surface area can be represented by the function
\[ f(x) = x^2 + 4 \cdot \frac{1000}{x}, \]
and we differentiate to get $f'(x) = 2x - \frac{4000}{x^2}$, yielding the critical point $x = 10\sqrt[3]{2}$. Since $f''(x) = 2 + \frac{8000}{x^3}$, $f''(10\sqrt[3]{2}) > 0$, so $x = 10\sqrt[3]{2}$ yields the minimum surface area. Thus, the dimensions of the box will be $10\sqrt[3]{2} \times 10\sqrt[3]{2} \times 5\sqrt[3]{2}$.
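A short computation confirms the height and shows that nearby base lengths give a larger surface area. The 1% perturbations and the names `surface_area`, `x_star`, and `height` are illustrative assumptions.

```python
def surface_area(x):
    # Open-top box with square base of side x and volume 1000: base plus four sides.
    return x**2 + 4000 / x

x_star = 10 * 2 ** (1 / 3)
height = 1000 / x_star**2

print(x_star, height, 5 * 2 ** (1 / 3))  # the height matches 5 * cbrt(2)
print(surface_area(x_star) < surface_area(0.99 * x_star))  # True
print(surface_area(x_star) < surface_area(1.01 * x_star))  # True
```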
Problem 10.9.7. A boy flies a kite at the height of 300 feet. The wind carries the kite horizontally
from the boy at a rate of 25 feet per second. How fast must he let out the string when the kite is
500 feet away from him?
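[Figure: right triangle with a vertical leg of $300$ and hypotenuse $y$.]
Solution. Let $x$ be the horizontal distance from the boy to the kite and $y$ the length of string let out, both in feet. We are given $\frac{dx}{dt} = 25$, and we want to find $\frac{dy}{dt}$ when $y = 500$. By the Pythagorean Theorem, $300^2 + x^2 = y^2$, so we implicitly differentiate it to get $2x\frac{dx}{dt} = 2y\frac{dy}{dt}$. When $y = 500$, we have $x = 400$, so we substitute in values: $400 \cdot 25 = 500\frac{dy}{dt}$, i.e. $\frac{dy}{dt} = 20$ feet per second.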
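As with the previous related-rates problem, a small time step gives a numerical check on the answer. The step size `dt` and the names below are illustrative assumptions.

```python
import math

def string_length(x):
    # Pythagorean relation 300^2 + x^2 = y^2, solved for y.
    return math.sqrt(300**2 + x**2)

x0 = 400.0      # horizontal distance at which the string length is 500
dx_dt = 25.0
dt = 1e-6
dy_dt = (string_length(x0 + dx_dt * dt) - string_length(x0)) / dt
print(string_length(x0))  # 500.0
print(dy_dt)              # approximately 20
```

Unlike the segment in Problem 10.9.5, this rate does depend on the position: the relation between $x$ and $y$ is not linear, so the value $y = 500$ is genuinely needed.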