Irena Swanson
Purdue University
AMS Open Math Notes: Works in Progress; Reference # OMN:201911.110809; Last Revised: 2020-07-25 17:23:27
Table of contents
Preface
The briefest overview, motivation, notation
Chapter 1: How we will do mathematics
Section 1.1: Statements and proof methods
Section 1.2: Statements with quantifiers
Section 1.3: More proof methods
Section 1.4: Logical negation
Section 1.5: Summation
Section 1.6: Proofs by (mathematical) induction
Section 1.7: Pascal’s triangle
Chapter 2: Concepts with which we will do mathematics
Section 2.1: Sets
Section 2.2: Cartesian product
Section 2.3: Relations, equivalence relations
Section 2.4: Functions
Section 2.5: Binary operations
Section 2.6: Fields
Section 2.7: Order on sets, ordered fields
Section 2.8: What are the integers and the rational numbers?
Section 2.9: Increasing and decreasing functions
Section 2.10: The Least upper bound property of R
Section 2.11: Absolute values
Chapter 3: The field of complex numbers, and topology
Section 3.1: Complex numbers
Section 3.2: Functions related to complex numbers
Section 3.3: Absolute value in C
Section 3.4: Polar coordinates
Section 3.5: Topology on the fields of real and complex numbers
Section 3.6: The Heine-Borel theorem
Chapter 4: Limits of functions
Section 4.1: Limit of a function
Section 4.2: When a number is not a limit
Section 4.3: More on the definition of a limit
Section 4.4: Limit theorems
Section 4.5: Infinite limits (for real-valued functions)
Section 4.6: Limits at infinity
Chapter 5: Continuity
Section 5.1: Continuous functions
Section 5.2: Topology and the Extreme value theorem
Section 5.3: Intermediate value theorem
Section 5.4: Radical functions
Section 5.5: Uniform continuity
Chapter 6: Differentiation
Section 6.1: Definition of derivatives
Section 6.2: Basic properties of derivatives
Section 6.3: The Mean value theorem
Section 6.4: L’Hôpital’s rule
Section 6.5: Higher-order derivatives, Taylor polynomials
Chapter 7: Integration
Section 7.1: Approximating areas
Section 7.2: Computing integrals from upper and lower sums
Section 7.3: What functions are integrable?
Section 7.4: The Fundamental theorem of calculus
Section 7.5: Integration of complex-valued functions
Section 7.6: Natural logarithm and the exponential functions
Section 7.7: Applications of integration
Chapter 8: Sequences
Section 8.1: Introduction to sequences
Section 8.2: Convergence of infinite sequences
Section 8.3: Divergence of infinite sequences and infinite limits
Section 8.4: Convergence theorems via epsilon-N proofs
Section 8.5: Convergence theorems via functions
Preface
These notes were written expressly for Mathematics 112 at Reed College, with first
use in the spring of 2013. The title of the course is “Introduction to Analysis”. The
prerequisite is calculus. Recently used textbooks have been Steven R. Lay’s “Analysis,
With an Introduction to Proof” (Prentice Hall, Inc., Englewood Cliffs, NJ, 1986, 4th
edition), and Ray Mayer’s in-house notes “Introduction to Analysis” (2006, available at
http://www.reed.edu/~mayer/math112.html/index.html). Ray Mayer’s notes strongly
influenced the coverage in this book.
In Math 112 at Reed College, students learn to write proofs while at the same time
learning about binary operations, orders, fields, ordered fields, complete fields, complex
numbers, sequences, and series. We also review limits, continuity, differentiation, and
integration. My aim is for these notes to constitute a self-contained book that covers the
standard topics of a course in introductory analysis, that handles complex-valued functions,
sequences, and series, that has enough examples and exercises, that is rigorous, and that is
accessible to undergraduates. I maintain two versions of these notes, one in which the
natural, rational and real numbers are constructed and the Least upper bound theorem
is proved for the ordered field of real numbers, and one version in which the Least upper
bound property is assumed for the ordered field of real numbers. You are reading the
shorter, latter version.
Chapter 1 is about how we do mathematics: basic logic, proof methods, and Pascal’s
triangle for practicing proofs. Chapter 2 introduces foundational concepts: sets, Cartesian
products, relations, functions, binary operations, fields, ordered fields, and the Archimedean
property for the set of real numbers. In particular, we assume that the set of familiar
real numbers forms an ordered field with the Least upper bound property. In Chapter 3
we construct the very useful field of complex numbers, and introduce topology which is
indispensable for the rigorous treatment of limits. I cover topology more lightly than what
is in the written notes. Subsequent chapters cover standard material for introduction to
analysis: limits, continuity, differentiation, integration, sequences, series, ending with the
development of the power series $\sum_{k=0}^{\infty} \frac{x^k}{k!}$, the exponential and the trigonometric functions.
Since students have seen limits, continuity, differentiation and integration before, I
go through chapters 4 through 7 quickly. I slow down for sequences and series (the last
three chapters).
An effort is made throughout to use only what has been proved. For this reason, the
chapters on differentiation and integration do not have the usual palette of trigonometric
and exponential examples of other books. The final chapter makes up for it and works out
much trigonometry in great detail and depth.
I acknowledge and thank the support from the Dean of Faculty of Reed College
to fund exercise and proofreading support in the summer of 2012 for Maddie Brandt,
Munyo Frey-Edwards, and Kelsey Houston-Edwards. I also thank the following people
for their valuable feedback: Mark Angeles, Josie Baker, Marcus Bamberger, Anji Bodony,
Zachary Campbell, Nick Chaiyachakorn, Safia Chettih, Laura Dallago, Andrew Erlanger,
Joel Franklin, Darij Grinberg, Rohr Hautala, Palak Jain, Ya Jiang, Albyn Jones, Wil-
low Kelleigh, Mason Kennedy, Christopher Keane, Michael Keppler, Ryan Kobler, Oleks
Lushchyk, Molly Maguire, Benjamin Morrison, Samuel Olson, Kyle Ormsby, Angélica Os-
orno, Shannon Pearson, David Perkinson, Jeremy Rachels, Ezra Schwartz, Jacob Sharkan-
sky, Marika Swanberg, Simon Swanson, Matyas Szabo, Ruth Valsquier, Xingyi Wang,
Emerson Webb, Livia Xu, Qiaoyu Yang, Dean Young, Eric Zhang, Jialun Zhao, and two
anonymous reviewers. If you have further comments or corrections, please send them to
[email protected].
The briefest overview, motivation, notation
Introduction to Analysis
with Complex Numbers
Chapter 1: How we will do mathematics
* This statement can also be written in plain English as “One equals two.” In mathematics it is acceptable
to use symbolic notation to some extent, but keep in mind that too many symbols can make a sentence hard to
read. In general we avoid starting sentences with a symbol. In particular, do not make the following sentence.
“=” is a verb. Instead make a sentence such as the following one. Note that “=” is a verb.
get the cookie. On the other hand, if “Hello” were to be true or false, I would not be able
to make any further deductions about the world or my next action, so that “Hello” is not
a statement, but only a sentence.
A useful tool for manipulating statements is a truth table: it is a table in which the
first few columns may set up a situation, and the subsequent columns record truth values of
statements applying in those particular situations. Here are two examples of truth tables,
where “T ” of course stands for “true” and “F ” for “false”:
f               constant   continuous   differentiable everywhere
f(x) = x^2      F          T            T
f(x) = |x|      F          T            F
f(x) = 7        T          T            T

x        y        xy > 0   xy ≤ 0   xy < 0
x > 0    y > 0    T        F        F
x > 0    y ≤ 0    F        T        F
x < 0    y > 0    F        T        T
x < 0    y ≤ 0    F        F        F
Note that in the second row of the last table, in the exceptional case y = 0, the
statement xy < 0 is false, but in “the majority” of the cases in that row xy < 0 is true.
The one counterexample is enough to declare xy < 0 not true, i.e., false.
Statements can be manipulated just like numbers and variables can be manipulated,
and rather than adding or multiplying statements, we connect them (by compounding the
sentences in grammatical ways) with connectors such as “not”, “and”, “or”, and so on.
Statement connecting:
(1) Negation of a statement P is a statement whose truth values are exactly opposite
from the truth values of P (under any specific circumstance). The negation of P
is denoted “not P” (or “¬P”).
Some simple examples: the negation of “A = B” is “A ≠ B”; the negation of
“A ≤ B” is “A > B”; the negation of “I am here” is “I am not here” or “It is not
the case that I am here”.
Now go back to the last truth table. Note that in the last line, the truth values
of “xy > 0” and “xy ≤ 0” are both false. But one should think that “xy > 0”
and “xy ≤ 0” are negations of each other! So what is going on, why are the
two truth values not opposites of each other? The problem is of course that the
circumstances x < 0 and y ≤ 0 are not specific enough. The statement “xy > 0”
is under these circumstances false precisely when y = 0, but then “xy ≤ 0” is true.
Similarly, the statement “xy ≤ 0” is under the given circumstances false precisely
when y < 0, but then “xy > 0” is true. Thus, once we make the conditions specific
enough, then the truth values of “xy > 0” and “xy ≤ 0” are opposite, so that the
two statements are indeed negations of each other.
(2) Conjunction of statements P and Q is a statement that is true precisely when
both P and Q are true, and it is false otherwise. It is denoted “P and Q” or
“P ∧ Q”. We can record this in a truth table as follows:
P    Q    P and Q
T    T    T
T    F    F
F    T    F
F    F    F
(3) Disjunction of statements P and Q is a statement that is false precisely when
both P and Q are false, and it is true otherwise. We denote it as “P or Q” or as
“P ∨ Q”. In other words, as long as either P or Q is true, then P or Q is true. In
plain language, unfortunately, we use “or” in two different ways: “You may take
cream or sugar” says you may take cream or sugar or both, just like in the proper
logical way, but “Tonight we will go to the movies or to the baseball game” implies
that we will either go to the movies or to the baseball game but we will not do
both. The latter connection of two sentences is in logic called exclusive or, often
denoted xor. Even “either-or” does not disambiguate between “or” and “xor”.
The truth table for the two disjunctions is:
P    Q    P or Q    P xor Q
T    T    T         F
T    F    T         T
F    T    T         T
F    F    F         F
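The three connectives so far can also be tabulated mechanically. The following is a minimal Python sketch, not part of the original notes; the function name `truth_table` is chosen here for illustration, and Python's `!=` on booleans is used as exclusive or.

```python
from itertools import product

def truth_table():
    """Return rows (P, Q, P and Q, P or Q, P xor Q) for all four cases."""
    rows = []
    for p, q in product([True, False], repeat=2):
        # For booleans, p != q is true exactly when one of p, q is true but not both.
        rows.append((p, q, p and q, p or q, p != q))
    return rows

print("P Q and or xor")
for row in truth_table():
    print(" ".join("T" if value else "F" for value in row))
```

The rows come out in the same order as in the tables above: (T, T), (T, F), (F, T), (F, F).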
(4) Implication or a conditional statement is a statement of the form “P im-
plies Q,” or variants thereof, such as all of the following:
P implies Q.
If P then Q.
P is a sufficient condition for Q.
P only if Q.
Q if P .
Q provided P .
Q given P .
Q whenever P .
Q is a necessary condition for P .
Given P, Q follows.
P is called the antecedent and Q the consequent. A symbolic abbreviation is
“P ⇒ Q.”
An implication is true when a true conclusion follows a true assumption, or when-
ever the assumption is false. In other words, P ⇒ Q is false exactly when P is
true and Q is false.
P    Q    P ⇒ Q
T    T    T
T    F    F
F    T    T
F    F    T
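As a small illustration, the rule "P ⇒ Q is false exactly when P is true and Q is false" can be written as a one-line function. This is a sketch only; the names `implies` and `implication_rows` are introduced here and do not appear in the notes.

```python
def implies(p, q):
    """P => Q: false exactly when P is true and Q is false."""
    return not (p and not q)

# Rows in the same order as the table: (T, T), (T, F), (F, T), (F, F).
implication_rows = [(p, q, implies(p, q))
                    for p in (True, False)
                    for q in (True, False)]

for p, q, value in implication_rows:
    print("T" if p else "F", "T" if q else "F", "T" if value else "F")
```

Both rows with a false antecedent come out true, matching the last two lines of the table.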
It may be counterintuitive that a false antecedent always makes the implication
true. Bertrand Russell once lectured on this and claimed that if 1 = 2 then he
(Bertrand Russell) was the pope. An audience member challenged him to prove
it. So Russell reasoned somewhat like this: “If I am the pope, then the consequent
is true. If the consequent is false, then I am not the pope. But if I am not the
pope, then the pope and I are two different people. By assumption 1 = 2, so we
two people are one, so I am the pope. Thus no matter what, I am the pope.”
Furthermore, if 1 = 2, then Bertrand Russell is also not the pope. Namely, if he is
not the pope, the consequent is true, but if he is the pope, then the pope and he
are one, and since one equals two, then the pope and he are two people, so Russell
cannot be the pope.
A further discussion of why a false antecedent makes the implication true is in
the next discussion (5).
Unfortunately, the implication statement is not used consistently in informal spo-
ken language. For example, your grandmother may say: “You may have ice cream
if you eat your broccoli” when she means “You may have ice cream only if you eat
your broccoli.” Be nice to your grandmother and eat that broccoli even if she does
not express herself precisely because you know precisely what she means. But
in mathematics you do have to express yourself precisely! (Well, read the next
paragraph.)
Even in mathematics some shortcuts in precise expressions are acceptable. Here
is an example. The statements “An object x has property P if somethingorother
holds” and “An object x has property P if and only if somethingorother holds” (see
(5) below for “if and only if”) in general have different truth values and the proof
of the second is longer. However, the definition of what it means for an object
to have property P in terms of somethingorother is usually phrased with “if”, but
“if and only if” is meant. For example, the following is standard: “Definition:
A positive integer strictly bigger than 1 is prime if whenever it can be written as
a product of two positive integers, one of the two factors must be 1.” The given
definition, if read logically precisely, since it said nothing about numbers such as
4 = 2 · 2, would allow us to call 4 prime. However, it is an understood shortcut
that only the numbers with the stated property are called prime.
(5) Equivalence or the logical biconditional of P and Q stands for the compound
statement (P ⇒ Q) and (Q ⇒ P ). It is abbreviated “P ⇔ Q” or “P iff Q”, and is
true precisely when P and Q have the same truth values.
For example, for real numbers x and y, the statement “x ≤ y + 1” is equivalent to
“x − 1 ≤ y.” Another example: “$2x = 4x^2$” is equivalent to “$x = 2x^2$,” but it is
not equivalent to “1 = 2x.” (Say why!)
(6) Proof of P is a series of steps (in statement form) that establish beyond doubt that
P is true under all circumstances, weather conditions, political regimes, time of
day... The logical reasoning that goes into mathematical proofs is called deductive
reasoning. Whereas both guessing and intuition can help you find the next step
in your mathematical proof, only the logical parts are trusted and get written
down. Proofs are a mathematician’s most important tool; the book contains many
examples, and the next few pages give some examples and ideas of what proofs
are.
Ends of proofs are usually marked by □, ■, //, or QED (for “quod erat demonstrandum”,
which is Latin for “that which was to be proved”). It is a good idea to
mark the completions of proofs especially when they are long or with many parts
and steps — that helps the readers know that nothing else is to be added.
The most trivial proofs simply invoke a definition or axiom, such as “An even
integer is of the form 2 times an integer,” “An odd integer is of the form 1 plus 2
times an integer,” or, “A positive integer is prime if whenever it can be written
as a product of two positive integers, one of the two factors is 1.”
Another type of proof consists of filling in a truth table. For example, P or (not P)
is always true, no matter what the truth value of P is, and this can be easily verified
with the truth table:
P    not P    P or not P
T    F        T
F    T        T
A formula using logical statements that is always true is called a tautology.
So P or not P is a tautology. Here is another example of tautology: ((P ⇒
Q) and P ) ⇒ Q, and it is proved below with the truth table:
P    Q    P ⇒ Q    (P ⇒ Q) and P    ((P ⇒ Q) and P) ⇒ Q
T    T    T        T                T
T    F    F        F                T
F    T    T        F                T
F    F    T        F                T
This particular tautology is called modus ponens, and its most famous example
is the following:
Every man is mortal. (If X is a man, then X is mortal.)
Socrates is a man.
Therefore, Socrates is mortal.
Here is a more mathematical example of modus ponens:
Every differentiable function is continuous.
f is differentiable.
Therefore, f is continuous.
Another tautology is modus tollens: ((P ⇒ Q) and (not Q)) ⇒ (not P). To
prove it, one constructs a truth table as before for modus ponens. It is a
common proof technique to invoke the similarity principle with previous work
that allows one to not carry out all the steps, as I just did. However, whenever
you invoke the proof-similarity principle, you had better be convinced in your mind
that the similar proof indeed does the job; if you have any doubts, show all work
instead! In this case, I am sure that the truth table does the job, but if you are
seeing this for the first time, you may want to do the actual truth table explicitly
to get a better grasp on these concepts.
Here is a mathematical example of modus tollens:
Every differentiable function is continuous.
f is not continuous.
Therefore, f is not differentiable.
Here is another example on more familiar ground:
If you are in Oregon, then you are in the USA.
You are not in the USA.
Therefore, you are not in Oregon.
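Filling in a truth table is a finite check, so it can be delegated to a short program. The sketch below is an illustration, not the method of the notes: the helper `is_tautology` and the formula names are hypothetical, and the last formula (inferring P from P ⇒ Q and Q) is added here only as a contrast that fails the check.

```python
from itertools import product

def implies(p, q):
    """P => Q: false exactly when P is true and Q is false."""
    return not (p and not q)

def is_tautology(formula, num_vars):
    """True if the formula is true for every assignment of truth values."""
    return all(formula(*values)
               for values in product([True, False], repeat=num_vars))

def excluded_middle(p):
    return p or not p

def modus_ponens(p, q):
    return implies(implies(p, q) and p, q)

def modus_tollens(p, q):
    return implies(implies(p, q) and (not q), not p)

def converse_inference(p, q):
    # Contrast case added for illustration: fails when P is false and Q is true.
    return implies(implies(p, q) and q, p)

print(is_tautology(excluded_middle, 1))
print(is_tautology(modus_ponens, 2))
print(is_tautology(modus_tollens, 2))
print(is_tautology(converse_inference, 2))
```

Such a check covers only the finitely many rows of a truth table; it does not replace the reasoning needed for statements about infinitely many objects.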
Some proofs can be pictorial/graphical. Here we prove with this method that for
any real numbers x and y, |x| < y if and only if −y < x < y. (We will see many
uses of absolute values.) Proof: [For a biconditional P ⇔ Q we need to
prove P ⇒ Q and Q ⇒ P .] The assumption |x| < y implies that y must be
positive, and the assumption −y < x < y implies that −y < y, which also says
that y must be positive. So, with either assumption, we can draw the following
part of the real number line:
[Figure: the real number line with the points −y, 0, and y marked.]
Now, by drawing, the real numbers x with |x| < y are precisely those real num-
bers x with −y < x < y. A fancier way of saying this is that |x| < y if and only if
−y < x < y.
Similarly, for all real numbers x and y, |x| ≤ y if and only if −y ≤ x ≤ y. (Here,
the word “similarly” is a clue that I am invoking the proof-similarity principle,
and a reader who wants to practice proofs or is not convinced should at this point
work through a proof by mimicking the steps in the previous one.)
Some (or actually most) proofs invoke previous results without re-doing the previ-
ous work. In this way we prove the triangle inequality, which asserts that for all
real numbers x and y, |x ± y| ≤ |x| + |y|. (By the way, we will use the triangle in-
equality intensely, so understand it well.) Proof: Note that always −|x| ≤ x ≤ |x|,
−|y| ≤ ±y ≤ |y|. Since the sum of smaller numbers is always less than or equal to
the sum of larger numbers, we then get that −|x| − |y| ≤ x ± y ≤ |x| + |y|. But
−|x| − |y| = −(|x| + |y|), so that −(|x| + |y|) ≤ x ± y ≤ |x| + |y|. But then by the
previous result, |x ± y| ≤ |x| + |y|.
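A numerical spot check can build confidence in the two absolute-value facts above, although it proves nothing. The following sketch tests both statements on randomly chosen integers; the function names, the sample range, and the seed are arbitrary choices made here.

```python
import random

def abs_lt_equivalence(x, y):
    """Check that (|x| < y) and (-y < x < y) have the same truth value."""
    return (abs(x) < y) == (-y < x < y)

def triangle_inequality(x, y):
    """Check |x + y| <= |x| + |y| and |x - y| <= |x| + |y|."""
    bound = abs(x) + abs(y)
    return abs(x + y) <= bound and abs(x - y) <= bound

def check_abs_facts(trials=10_000, seed=0):
    """Test both facts on random integer pairs; integers keep the arithmetic exact."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.randint(-100, 100)
        y = rng.randint(-100, 100)
        if not (abs_lt_equivalence(x, y) and triangle_inequality(x, y)):
            return False
    return True

print(check_abs_facts())
```

Note that the equivalence is tested for negative y as well, where both sides are false, in line with the observation in the proof that either assumption forces y to be positive.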
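Here is a pictorial proof establishing the basis of trigonometry and the definition
of slope as rise over run: namely that $\frac{B}{A} = \frac{b}{a}$.
[Figure: a right triangle with base A and height B, containing a smaller right triangle with base a and height b that shares the same acute angle.]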
Proof: The areas of the big and small triangles are $\frac{1}{2}AB$ and $\frac{1}{2}ab$, and the area of
the difference is base times the average height, i.e., it is $(A - a)\frac{b+B}{2}$. Thus
$$\frac{1}{2}AB = \frac{1}{2}ab + (A - a)\frac{b+B}{2}.$$
By multiplying through by 2 we get that AB = ab + (A − a)(b + B) = ab + Ab +
AB − ab − aB, so that, after cancellations, Ab = aB. Then, after dividing through
by aA we get $\frac{B}{A} = \frac{b}{a}$.
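The area bookkeeping in this proof can be tried on concrete numbers. The sketch below uses exact rational arithmetic; the function name `decomposition_holds` and the sample side lengths are assumptions made for illustration. It checks whether the big triangle's area equals the small triangle's area plus the trapezoid's area.

```python
from fractions import Fraction

def decomposition_holds(A, B, a, b):
    """Does (1/2)AB equal (1/2)ab + (A - a)(b + B)/2 for these side lengths?"""
    half = Fraction(1, 2)
    big = half * A * B
    small = half * a * b
    trapezoid = half * (A - a) * (b + B)   # base times the average height
    return big == small + trapezoid

# With b chosen so that b/a = B/A, the areas add up.
print(decomposition_holds(7, 3, 2, Fraction(6, 7)))
# With a different b (so that Ab differs from aB), they do not.
print(decomposition_holds(7, 3, 2, 1))
```

This agrees with the algebra in the proof: the area identity holds exactly when Ab = aB.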
1.1.2. Sometimes statements are not written precisely enough. For example, “It is not the
case that 3 is prime and 5 is even” may be saying “not (3 is prime and 5 is even),” or it
may be saying “(not (3 is prime)) and (5 is even).” The first option is true and the second
is false.
Similarly analyze several possible interpretations of the following ambiguous sentences:
i) If 6 is prime then 7 is even or 5 is odd.
ii) It is not the case that 3 is prime or if 6 is prime then 7 is even or 5 is odd.
General advice: Write precisely; aim to not be misunderstood.
1.1.3. Add the following columns to the truth table in (5): P and not Q, (not P) and (not Q),
(not P) or (not Q), P ⇒ not Q, (not P) ⇒ not Q. Are any of the new columns negations
of the columns in the truth table (5) or of each other?
1.1.4. Suppose that P ⇒ Q is true and Q is false. Prove that P is false.
1.1.5. Prove that P ⇒ Q is equivalent to ( not Q) ⇒ ( not P ).
1.1.6. Simplify the following statements:
i) (P and P ) or P .
ii) P ⇒ P .
iii) (P and Q) or (P or Q).
1.1.7. Prove with truth tables that the following statements are true.
i) (P ⇔ Q) ⇔ [(P ⇒ Q) and (Q ⇒ P )].
ii) (P ⇒ Q) ⇔ (Q or not P ).
iii) (P and Q) ⇔ [P and (P ⇒ Q)].
iv) [P ⇒ (Q or R)] ⇔ [(P and not Q) ⇒ R].
1.1.8. Assume that P or Q is true and that R ⇒ Q is false. Determine with proof the
truth values of P, Q, R, or explain if there is not enough information.
1.1.9. Assume that (P and Q) ⇒ R is false. Determine with proof the truth values of
P, Q, R, or explain if there is not enough information.
1.1.10. Suppose that x is any real number such that |x + 2| < 3. Prove that $|x^3 - 3x| < 200$.
1.1.11. Suppose that x is any real number such that |x − 1| < 5. Find with proof a positive
constant B such that for all such x, $|3x^4 - x| < B$.
1.1.12. (Odd-even integers)
i) Prove that the sum of two odd integers is an even integer.
ii) Prove that the product of two integers is odd if and only if the two integers are
both odd.
iii) Suppose that the product of two integers is odd. Prove that the sum of those two
integers is even.
iv) Suppose that the sum of the squares of two integers is odd. Prove that one of the
two integers is even and the other is odd.
v) Prove that the product of two consecutive integers is even. Prove that the product
of three consecutive integers is an integer multiple of 6.
vi) Prove that the sum of two consecutive integers is odd. Prove that the sum of three
consecutive integers is an integer multiple of 3.
1.1.13. (The quadratic formula) Let a, b, c be real numbers with a non-zero. Prove
(with algebra) that all solutions x of the quadratic equation $ax^2 + bx + c = 0$ are of the
form
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$
1.1.14. Prove that $\sqrt{3}$ is not a rational number. (Remark: It is harder to prove that π
and e are not rational.)
1.1.15. For which integers n is $\sqrt{n}$ not a rational number?
1.1.16. (Cf. Exercise 10.2.3.) Draw a unit circle and a line segment from the center to the
circle. Any real number x uniquely determines a point P on the circle at angle x radians
from the line. Draw the line from that point that is perpendicular to the first line. The
length of this perpendicular line is called sin(x), and the distance from the intersection of
the two perpendicular lines to the center of the circle is called cos(x). This is our definition
of cos and sin.
Consider the following picture inside the circle of radius 1:
[Figure: picture inside the unit circle, with labels x and y.]
1.1.18. Assuming that the area of the circle of radius r is $\pi r^2$, convince yourself with
a proportionality argument that the area of the region below, where x is measured in radians,
is $\frac{1}{2}xr^2$.
[Figure: a circular sector of radius r and central angle x.]
† 1.1.19. (Invoked in Theorem 10.2.5.) Let x be a small positive real number. Consider
the following picture with a circular segment of radius 1 and two right triangles:
i) Assert that the area of the small triangle is strictly smaller than the area of the
wedge, which in turn is strictly smaller than the area of the big triangle.
ii) Using the previous part and ratio geometry (from page 20), prove that
$$\frac{1}{2}\sin(x)\cos(x) < \frac{1}{2}x < \frac{1}{2}\tan(x).$$
iii) Using the previous part, prove that $0 < \cos(x) < \frac{x}{\sin(x)} < \frac{1}{\cos(x)}$.
*1.1.20. (Logic circuits) Logic circuits are simple circuits which take as inputs logical
values of true and false (or 1 and 0) and give a single output. Logic circuits are composed
of logic gates. Each logic gate stands for a logical connective you are familiar with: it could
be and, or, or not (more complex logic circuits incorporate more). The shapes for logical
and, or, not are as follows:
Given inputs, each of these logic gates outputs values equal to the values in the associated
truth table. For instance, an “and” gate only outputs “on” if both of the wires leading
into it are “on”. From these three logic gates we can build many others. For example, the
following circuit is equivalent to xor.
[Figure: a logic circuit with two inputs and one output, equivalent to xor.]
Make logic circuits that complete the following tasks. (It may be helpful to make logic
tables for each one.)
i) xor in a different way than the circuit above.
ii) “P implies Q” implication.
iii) xor for three inputs.
iv) Is a 3-digit binary number greater than 2?
v) Is a 4-digit binary string a palindrome?
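Logic gates can be modeled as small functions, which makes it easy to test a proposed circuit against its intended truth table. The sketch below is one possible way to assemble xor from the three basic gates, namely (P or Q) and not (P and Q); it is an assumption made here and need not match the circuit drawn in the notes. All function names are hypothetical.

```python
def and_gate(p, q):
    return p and q

def or_gate(p, q):
    return p or q

def not_gate(p):
    return not p

def xor_circuit(p, q):
    """One wiring for xor: (P or Q) and not (P and Q)."""
    return and_gate(or_gate(p, q), not_gate(and_gate(p, q)))

def matches_table(circuit, expected):
    """Compare a two-input circuit with expected outputs for (T,T), (T,F), (F,T), (F,F)."""
    inputs = [(True, True), (True, False), (False, True), (False, False)]
    return [circuit(p, q) for p, q in inputs] == expected

print(matches_table(xor_circuit, [False, True, True, False]))
```

The helper `matches_table` can be reused to test a candidate circuit for any two-input task once its expected outputs are written down.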
Section 1.2: Statements with quantifiers
“The number x equals 1” is true for some x and false for some x. For determining
a statement’s veracity we possibly need a further qualification. We can use the universal
quantifier “for all”, “for every”, or the existential quantifier “there exists”, “for some”.
The statement above could be modified to one of the following:
(1) “There exists a real number x such that x = 1.”
(2) “For all real numbers x, x = 1,” which is logically the same as “For every real
number x, x = 1,” and the same as “Every real number equals 1.”
Certainly the first statement is true and the second is false.
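On a finite collection, the two quantifiers correspond to Python's built-in `any` ("there exists") and `all` ("for all"). The following sketch uses a short list as a stand-in for the real numbers, which is of course an assumption: a program cannot run through all real numbers.

```python
# A finite stand-in for "real numbers x"; the particular values are arbitrary.
sample = [-2, -1, 0, 1, 2, 3]

# (1) "There exists x such that x = 1."
exists_x_equal_1 = any(x == 1 for x in sample)

# (2) "For all x, x = 1."
for_all_x_equal_1 = all(x == 1 for x in sample)

print(exists_x_equal_1, for_all_x_equal_1)
```

A single witness makes the existential statement true, and a single exception makes the universal statement false.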
For shorthand we abbreviate “for all” with the symbol ∀, and “there exists” with ∃.
These abbreviations come in handy when we manipulate logical statements. The general
forms of abbreviated statements with quantifiers are:
“ ∀x with a certain specification, P (x) holds” = “ ∀x P (x)”
“ ∃x with a certain specification, P (x) holds” = “ ∃x P (x)”
where P is some property that can be applied to objects x in question. The forms on the
left have explicit specifications of the scope of the x, and in the forms on the right the
scope of the x is implicit.
Warning: For ease of readability it may be better to write out full words rather than
symbolic abbreviations.
We read the displayed statements above as “for all x, P of x [holds/is true]” and
“there exists x such that P of x [holds/is true]”, respectively. The part “such that” only
appears with the existential quantifier as a grammar filler but without any logical meaning;
it can be replaced with “for which”, and can sometimes be shortened further. For example:
“There exists a function f such that for all real numbers x, f (x) = f (−x)” can be rewritten
with equal meaning as “There exists a function f that is defined for all real numbers and
is even,” or even shorter as “There exists an even function.” (No “such that” appears in
the last two versions.)
Read the following symbolic statement (it defines the limit of the function f at a to
be L; see Definition 4.1.1):
∀ε > 0 ∃δ > 0 ∀x, 0 < |x − a| < δ ⇒ |f(x) − L| < ε.
What are the truth values and the negations of statements with quantifiers? We first
write a truth table with all the possible situations with regard to P in the first column,
and other columns give the truth values of the quantifier statements:
a possible situation                    ∀x P(x)         ∀x not P(x)     ∃x not P(x)   ∃x P(x)
there are no x of the specified type    T (vacuously)   T (vacuously)   F             F
A “for all” statement is true precisely when without exception all x with the given
description have the property P , and a “there exists” statement is true precisely when at
least one x with the given description satisfies property P . One proves a “for all” statement
by determining that each x with the given description has the property P , and one proves a
“there exists” statement by producing one specimen x with the given description and then
proving that that specimen has property P . If there are no x with the given specification,
then any property holds for those no-things x vacuously. For example, any positive real
number that is strictly smaller than −1 is also zero, equal to 15, greater than 20, a product
of distinct prime integers, and has any other fine property you can think of.
Notice that among the columns with truth values, one and three have opposite values,
and two and four have opposite values. This proves the following:
Theorem 1.2.1. The negation of “ ∀x P (x)” is “ ∃x not P (x)”. The negation of “ ∃x P (x)”
is “ ∀x not P (x).”
Thus “ ∀x P (x)” is false if there is even one tiny tiniest example to the contrary.
“Every prime number is odd” is false because 2 is an even prime number. “Every whole
number divisible by 3 is divisible by 2” is false because 3 is divisible by 3 and is not divisible
by 2.
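Searching for a counterexample is easy to automate on a finite range. The sketch below looks for counterexamples to the two false "for all" statements above among the whole numbers from 1 to 49, and also checks the first half of Theorem 1.2.1 on a finite domain. The range and all function names are assumptions chosen for illustration.

```python
from math import isqrt

def is_prime(n):
    """Trial division; adequate for the small range used here."""
    return n >= 2 and all(n % d != 0 for d in range(2, isqrt(n) + 1))

def counterexamples(domain, prop):
    """All x in the domain for which prop(x) fails."""
    return [x for x in domain if not prop(x)]

def negation_rule_holds(domain, prop):
    """On a finite domain: not (for all x, P(x)) agrees with (exists x, not P(x))."""
    return (not all(prop(x) for x in domain)) == any(not prop(x) for x in domain)

whole_numbers = range(1, 50)   # finite stand-in for "all whole numbers"
primes = [n for n in whole_numbers if is_prime(n)]
multiples_of_3 = [n for n in whole_numbers if n % 3 == 0]

# "Every prime number is odd."
odd_prime_failures = counterexamples(primes, lambda n: n % 2 == 1)
# "Every whole number divisible by 3 is divisible by 2."
div_by_2_failures = counterexamples(multiples_of_3, lambda n: n % 2 == 0)

print(odd_prime_failures)
print(div_by_2_failures[0])
print(negation_rule_holds(primes, lambda n: n % 2 == 1))
```

Finding one counterexample settles that a "for all" statement is false; failing to find one in a finite range settles nothing about the full statement.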
Remark 1.2.2. The statement “For all whole numbers x between 1/3 and 2/3, $x^2$ is
irrational” is true vacuously. Another reason why “For all whole numbers x between 1/3
and 2/3, $x^2$ is irrational” is true is that its negation, “There exists a whole number x
between 1/3 and 2/3 for which $x^2$ is rational”, is false because there is no whole number
between 1/3 and 2/3: since the negation is false, we get yet more motivation to declare the
original statement true.
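The vacuous truth in Remark 1.2.2 and the negation rule of Theorem 1.2.1 can be illustrated on finite collections. The following is only a sketch: the helper names `for_all` and `there_exists`, the finite window of whole numbers, and the short list of primes are assumptions made for illustration, and a finite check is of course not a proof.

```python
from fractions import Fraction

def for_all(xs, prop):
    """Finite stand-in for "for all x in xs, prop(x)"."""
    return all(prop(x) for x in xs)

def there_exists(xs, prop):
    """Finite stand-in for "there exists x in xs with prop(x)"."""
    return any(prop(x) for x in xs)

# A finite window of whole numbers (an assumption of this sketch).
whole_numbers = range(-100, 101)
between = [x for x in whole_numbers if Fraction(1, 3) < x < Fraction(2, 3)]

def square_is_irrational(x):
    # The square of a whole number is a whole number, so never irrational.
    return False

# "For all" over an empty collection is true vacuously ...
vacuous_for_all = for_all(between, square_is_irrational)
# ... and its negation, a "there exists" statement, is false.
negation = there_exists(between, lambda x: not square_is_irrational(x))

# "Every prime number is odd" versus its negation, on a few small primes.
small_primes = [2, 3, 5, 7, 11, 13]

def is_odd(p):
    return p % 2 == 1

every_prime_is_odd = for_all(small_primes, is_odd)
some_prime_is_not_odd = there_exists(small_primes, lambda p: not is_odd(p))
```

In both cases the “for all” statement and the “there exists ... not” statement come out with opposite truth values, as the theorem says they must.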
1.2.5. (Contrast with the switching of quantifiers in the previous two exercises.) Explain
why the following two statements do not have the same truth values:
i) For every x > 0 there exists y > 0 such that xy = 1.
ii) There exists y > 0 such that for every x > 0, xy = 1.
1.2.6. Rewrite the following statements using quantifiers:
i) 7 is prime.
ii) There are infinitely many prime numbers.
iii) Everybody loves Raymond.
iv) Spring break is always in March.
1.2.7. Let “xLy” represent the statement that x loves y. Rewrite the following statements
symbolically: “Everybody loves somebody,” “Somebody loves everybody,” “Somebody is
loved by everybody,” “Everybody is loved by somebody.” At least one statement should
be of the form “∀x ∃y, xLy”. Compare its truth value with that of “∃x ∀y, xLy”.
1.2.8. Find a property P of real numbers x, y, z such that “∀y ∃x ∀z, P (x, y, z)” and
“∀z ∃x ∀y, P (x, y, z)” have different truth values.
1.2.9. Suppose that it is true that there exists x of some kind with property P . Can we
conclude that all x of that kind have property P ? (A mathematician and a few other jokers
are on a train and see a cow through the window. One of them generalizes: “All cows in
this state are brown,” but the mathematician corrects: “This state has a cow whose one
side is brown.”)
When statements are compound, they can be harder to prove. Fortunately, proofs
can be broken down into simpler statements. The essentials of this breaking down are
summarized in the chart on the next page.
Example 1.3.1. Integers 2 and 3 are prime, i.e., 2 is a prime integer and 3 is a prime
integer.
Proof. Let m and n be whole numbers strictly greater than 1. If m · n = 2, then 1 <
m, n ≤ 2, so m = n = 2, but 2 · 2 is not equal to 2. Thus 2 cannot be written as a product
of two positive numbers different from 1, so 2 is a prime number. If instead m · n = 3, then
1 < m, n ≤ 3. Then all combinations of products are 2 · 2, 2 · 3, 3 · 2, 3 · 3, none of which
is 3. Thus 3 is a prime number.
Example 1.3.2. A positive prime number is either odd or it equals 2. (Often the term
“prime” implicitly assumes positivity, but −2 can be thought of as a prime number as
well.)
Proof. Let p be a positive prime number. Suppose that p is not odd. Then p must be even.
Thus p = 2 · q for some positive whole number q. Since p is a prime, it follows that q = 1,
so that p = 2.
Proof. Observe that 2 is a real number and that $2^3 - 3 \cdot 2 = 2$. Thus x = 2 satisfies the
conditions.
Example 1.3.7. (Mixture of methods) For every real number x strictly between 0 and 1
there exists a positive real number y such that $\frac{1}{x} + \frac{1}{y} = \frac{1}{xy}$.
Proof. [We have to prove that for all x as specified some property holds.]
Let x be in (0, 1). [For this x we have to find y ...] Set y = 1 − x. [Was this
a lucky find? No matter how we got inspired to determine this y, we now
verify that the stated properties hold for x and y.] Since x is strictly smaller
than 1, it follows that y is positive. Thus also xy is positive, and y + x = 1. After dividing
the last equation by the positive number xy we get that
$\frac{1}{x} + \frac{1}{y} = \frac{y+x}{xy} = \frac{1}{xy}$.
Furthermore, y in the example above has no choice but to be 1 − x.
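The choice y = 1 − x in Example 1.3.7 can be spot-checked with exact rational arithmetic. This is a sketch only: the function name `check` and the sample values of x are assumptions for illustration, and checking finitely many x does not replace the proof.

```python
from fractions import Fraction

def check(x):
    """For a given x in (0, 1), test the claimed witness y = 1 - x."""
    y = 1 - x
    return y > 0 and 1 / x + 1 / y == 1 / (x * y)

# Sample points 1/10, 2/10, ..., 9/10 strictly between 0 and 1.
samples = [Fraction(k, 10) for k in range(1, 10)]
all_ok = all(check(x) for x in samples)
```

Using `Fraction` rather than floating-point numbers keeps the equality test exact.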
* Recall that this font in brackets and in red color indicates the reasoning that should go on in the
background in your head; these statements are not part of a proof.
Proof. (This proof is harder; it is fine to skip it.) Suppose for contradiction that the
conclusion fails for some positive integer n. Then on the list 2, 3, 4, . . . , n let m be the
smallest integer for which the conclusion fails. If m is a prime, take k = 1 and $p_1 = m$,
$a_1 = 1$, and so the conclusion does not fail. Thus m cannot be a prime number, and so
$m = m_1 m_2$ for some positive integers $m_1, m_2$ strictly bigger than 1. Necessarily
$2 \le m_1, m_2 < m$. By the choice of m, the conclusion is true for $m_1$ and $m_2$. Write
$m_1 = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$ and $m_2 = q_1^{b_1} q_2^{b_2} \cdots q_l^{b_l}$ for some positive prime
integers $p_1 < \cdots < p_k$, $q_1 < \cdots < q_l$ and some positive integers $a_1, \ldots, a_k, b_1, \ldots, b_l$.
Thus $m = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k} q_1^{b_1} q_2^{b_2} \cdots q_l^{b_l}$ is a product
of positive prime numbers, and after sorting and merging the $p_i$ and $q_j$, the conclusion
follows also for m. But we assumed that the conclusion fails for m, which yields the
desired contradiction. Hence the conclusion does not fail for any positive integer.
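A factorization of the kind used in this proof, with increasing primes and positive exponents, can be computed for small integers by trial division. The sketch below is an illustration only: the function name `prime_factorization` is an assumption, and trial division is a standard method rather than the argument of the proof.

```python
def prime_factorization(n):
    """Return [(p_1, a_1), ..., (p_k, a_k)] with p_1 < ... < p_k prime,
    each a_i >= 1, and n equal to the product of the p_i ** a_i."""
    if n < 2:
        raise ValueError("expected an integer n >= 2")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:   # divide out every copy of p
                n //= p
                exponent += 1
            factors.append((p, exponent))
        p += 1
    if n > 1:                   # whatever remains is itself a prime
        factors.append((n, 1))
    return factors
```

For instance, `prime_factorization(360)` returns `[(2, 3), (3, 2), (5, 1)]`, which records that 360 = 2³ · 3² · 5.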
Example 1.3.9. Any positive rational number can be written as $\frac{a}{b}$, where a and b are
positive whole numbers and where in any prime factorizations of a and b as in the previous
example, the prime factors for a are distinct from the prime factors for b.
Proof. [We have to prove that for all ...] Let x be a(n arbitrary) positive rational
number. [We rewrite the meaning of this in a more concrete and usable
form next.] Thus $x = \frac{a}{b}$ for some whole numbers a, b. If a is negative, since x is positive
necessarily b has to be negative. But then −a, −b are positive numbers, and $x = \frac{-a}{-b}$. Thus
by possibly replacing a, b with −a, −b we may assume that a, b are positive. [A rewriting
trick.] There may be many different pairs of a, b, and we choose a pair for which a is the
smallest of all possibilities. [A choosing trick. But does the smallest a exist?]
Such a does exist because in a collection of given positive integers there is always a smallest
one. Suppose that a and b have a (positive) prime factor p in common. Write $a = a_0 p$ and
$b = b_0 p$ for some positive whole numbers $a_0, b_0$. Then $x = \frac{a_0}{b_0}$, and since $0 < a_0 < a$, this
contradicts the choice of the pair a, b. Thus a and b could not have had a prime factor in
common.
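In computations, the representation promised by Example 1.3.9 is usually found by dividing out the greatest common divisor instead of searching for the smallest numerator as the proof does. A minimal sketch of that alternative, with the function name `lowest_terms` chosen only for illustration:

```python
from math import gcd

def lowest_terms(a, b):
    """Write the positive rational number a/b with positive numerator and
    denominator that share no prime factor."""
    if b == 0:
        raise ZeroDivisionError("denominator must be non-zero")
    if a * b <= 0:
        raise ValueError("expected a positive rational number")
    a, b = abs(a), abs(b)   # the rewriting trick: replace a, b by -a, -b
    g = gcd(a, b)
    return a // g, b // g
```

For example, `lowest_terms(-12, -18)` returns `(2, 3)`.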
Example 1.4.1. There are infinitely many (positive) prime numbers. Proof by contradiction:
(Due to Euclid.) Suppose that there are only finitely many prime numbers. Then
we can enumerate them all: $p_1, p_2, \ldots, p_n$. Let $a = (p_1 p_2 \cdots p_n) + 1$. Since we know
that 2, 3, 5 are primes, necessarily n ≥ 3 and so a > 1. By the Fundamental theorem of
arithmetic (Example 1.3.8), a has a prime factor p. Since $p_1, p_2, \ldots, p_n$ are all the primes,
necessarily $p = p_i$ for some i. But then $p = p_i$ divides a and $p_1 p_2 \cdots p_n$, whence it divides
$1 = a - (p_1 p_2 \cdots p_n)$, which is a contradiction. So it is not the case that there are only
finitely many prime numbers, so there must be infinitely many.
Another proof by contradiction: (Due to R. Meštrović, American Mathematical
Monthly 124 (2017), page 562.) Suppose that there are only finitely many prime numbers.
Then we can enumerate them all: $p_1 = 2, p_2 = 3, \ldots, p_n$. The positive integer $p_2 p_3 \cdots p_n - 2$
has no odd prime factors, and since it is odd, it must be equal to 1. Hence $p_2 p_3 \cdots p_n = 3$,
which is false since $p_2 = 3$, $p_3 = 5$, and so on.
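Euclid's construction can be tried on any finite list of primes: the number a, one more than their product, always has a prime factor missing from the list. The sketch below is an illustration under assumptions: the function names are invented here, and the list of six primes is just a sample.

```python
def smallest_prime_factor(a):
    """Return the smallest prime factor of an integer a >= 2 (trial division)."""
    p = 2
    while p * p <= a:
        if a % p == 0:
            return p
        p += 1
    return a  # no divisor up to sqrt(a), so a itself is prime

def euclid_new_prime(primes):
    """Return (a, q): a is the product of the listed primes plus 1,
    and q is a prime factor of a."""
    product = 1
    for p in primes:
        product *= p
    a = product + 1
    return a, smallest_prime_factor(a)

sample = [2, 3, 5, 7, 11, 13]
a, q = euclid_new_prime(sample)
```

Here a = 30031 = 59 · 509 is not itself prime, but its prime factor q = 59 is not on the list, which is all that the argument needs.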
1.5 Summation
There are many reasons for not writing out the sum of the first hundred numbers in
full length: it would be too long, it would not be any clearer, and we might start doubting
the intelligence of the writer. Instead we express such sums with the summation sign Σ:
\[ \sum_{k=1}^{100} k \quad \text{or} \quad \sum_{n=1}^{100} n. \]
The counters k and n above are dummy variables; they vary from 1 to 100, as indicated
below and above the summation sign. We could use any other name in place of k or n.
In general, if f is a function defined at m, m + 1, m + 2, . . . , n, we use the summation
shortening as follows:
\[ \sum_{k=m}^{n} f(k) = f(m) + f(m+1) + f(m+2) + \cdots + f(n). \]
This is one example where good notation saves effort and often clarifies the concept. For
typographical reasons, to prevent lines jamming into each other, we also write this as
$\sum_{k=m}^{n} f(k)$.
Now is a good time to discuss polynomials. A polynomial function is a function
of the form $f(x) = a_0 + a_1 x + \cdots + a_n x^n$ for some non-negative integer n and some
numbers $a_0, a_1, \ldots, a_n$. We call $a_0 + a_1 x + \cdots + a_n x^n$ a polynomial, and if $a_n$ is non-zero,
we say that the polynomial has degree n. (More on the degrees of polynomial functions
is in Example 1.6.5, Exercise 2.6.15.) It is convenient to write this polynomial with the
shorthand notation
\[ f(x) = a_0 + a_1 x + \cdots + a_n x^n = \sum_{k=0}^{n} a_k x^k. \]
Here, of course, $x^0$ stands for 1. When we evaluate f at 0, we get
$a_0 = a_0 + a_1 \cdot 0 + \cdots + a_n \cdot 0^n = \sum_{k=0}^{n} a_k 0^k$, and we deduce that notationally $0^0$ stands for 1 here.
Remark 1.5.1. $0^0$ could possibly be thought of also as $\lim_{x \to 0^+} 0^x$, which is surely equal to 0.
But then one can wonder whether $0^0$ equals 0 or 1 or something else entirely. Well, it
turns out that $0^0$ is not equal to that zero limit – you surely know of other functions f for
which $\lim_{x \to c} f(x)$ exists but the limit is not equal to f(c). (Check out also Exercise 7.6.9.)
Examples 1.5.2.
(1) $\sum_{k=1}^{5} 2 = 2 + 2 + 2 + 2 + 2 = 10$.
(2) $\sum_{k=1}^{5} k = 1 + 2 + 3 + 4 + 5 = 15$.
(3) $\sum_{k=1}^{4} k^2 = 1^2 + 2^2 + 3^2 + 4^2 = 30$.
(4) $\sum_{k=10}^{12} \cos(k\pi) = \cos(10\pi) + \cos(11\pi) + \cos(12\pi) = 1 - 1 + 1 = 1$.
(5) $\sum_{k=-1}^{2} (4k^3) = -4 + 0 + 4 + 4 \cdot 8 = 32 = 4(-1 + 0 + 1 + 1 \cdot 8) = 4 \sum_{k=-1}^{2} k^3$.
(6) $\sum_{k=1}^{n} 3$ = 3 added to itself n times = 3n.
(7) $\sum_{k=a}^{b} 2$ = 2 added to itself b − a + 1 times = 2(b − a + 1).
We can even deal with empty sums such as $\sum_{k=1}^{0} a_k$: here the index starts at k = 1
and keeps increasing and we stop at k = 0, but there are no such indices k. What could
possibly be the meaning of such an empty sum? Note that
\[ \sum_{k=1}^{4} a_k = \sum_{k=1}^{2} a_k + \sum_{k=3}^{4} a_k = \sum_{k=1}^{1} a_k + \sum_{k=2}^{4} a_k = \sum_{k=1}^{0} a_k + \sum_{k=1}^{4} a_k, \]
from which we deduce that this empty sum must be 0. Similarly, every empty sum equals 0.
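The summation notation translates directly into a short function, which can then be used to recompute the sums in Examples 1.5.2 and to see the empty-sum convention at work. This is a sketch; the name `summation` and its argument order are assumptions made for illustration.

```python
from math import cos, pi, isclose

def summation(f, m, n):
    """Return f(m) + f(m+1) + ... + f(n); the sum is empty, hence 0, when n < m."""
    return sum(f(k) for k in range(m, n + 1))

constant_sum = summation(lambda k: 2, 1, 5)           # Examples 1.5.2 (1)
first_five = summation(lambda k: k, 1, 5)             # (2)
squares = summation(lambda k: k ** 2, 1, 4)           # (3)
cosines = summation(lambda k: cos(k * pi), 10, 12)    # (4), floating point
cubes = summation(lambda k: 4 * k ** 3, -1, 2)        # (5)
empty = summation(lambda k: k, 1, 0)                  # no indices k at all
```

The cosine sum is computed in floating point, so it should be compared with 1 using a tolerance (for example `isclose`) rather than with exact equality.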
In particular, for all non-negative integers n, the product $\prod_{k=1}^{n} k$ is used often and is
abbreviated as $n! = \prod_{k=1}^{n} k$. See Exercise 1.5.6 for the fact that 0! = 1.
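The product notation can be sketched in the same way as a small function. The names `product` and `fact` are assumptions for illustration; the sketch relies on the fact that Python's `math.prod` returns 1 for an empty collection, which is the convention under which 0! = 1.

```python
from math import prod

def product(f, m, n):
    """Return f(m) * f(m+1) * ... * f(n); the product is empty, hence 1, when n < m."""
    return prod(f(k) for k in range(m, n + 1))

def fact(n):
    """n! written as the product of k for k = 1, ..., n."""
    return product(lambda k: k, 1, n)
```

With these definitions `fact(5)` is 120, and `fact(0)` is the empty product 1.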
So far we have learned a few proof methods. There is another type of proof that
deserves special mention, and this is proof by (mathematical) induction, sometimes
referred to as the principle of mathematical induction. This method can be used when
one wants to prove that a property P holds for all integers n greater than or equal to an
integer $n_0$. Typically, $n_0$ is either 0 or 1, but it can be any integer, even a negative one.
Induction is a two-step procedure:
(1) Base case: Prove that P holds for $n_0$.
(2) Inductive step: Let $n > n_0$. Assume that P holds for all integers $n_0, n_0 + 1, n_0 + 2, \ldots, n - 1$.
Prove that P holds for n.
Why does induction succeed in proving that P holds for all $n \ge n_0$? By the base
case we know that P holds for $n_0$. The inductive step then proves that P also holds for
$n_0 + 1$. So then we know that the property holds for $n_0$ and $n_0 + 1$, whence the inductive
step implies that it also holds for $n_0 + 2$. So then the property holds for $n_0$, $n_0 + 1$ and
$n_0 + 2$, whence the inductive step implies that it also holds for $n_0 + 3$. This establishes that
the property holds for $n_0$, $n_0 + 1$, $n_0 + 2$, and $n_0 + 3$, so that by the inductive step it also holds
for $n_0 + 4$. We keep going. For any integer $n > n_0$, in $n - n_0$ steps we similarly establish
that the property holds for $n_0, n_0 + 1, n_0 + 2, \ldots, n_0 + (n - n_0) = n$. Thus for any
integer $n \ge n_0$, we eventually prove that P holds for it.
The same method can be phrased with a slightly different two-step process, with the
same result, and the same name:
(1) Base case: Prove that P holds for $n_0$.
(2) Inductive step: Let $n > n_0$. Assume that P holds for the integer n − 1. Prove that
P holds for n.
Similar reasoning as in the previous case also shows that this induction principle
succeeds in proving that P holds for all $n \ge n_0$.
Example 1.6.1. Prove the equality $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$ for all n ≥ 1.
Proof. Base case n = 1: The left side of the equation is $\sum_{k=1}^{1} k$ which equals 1. The right
side is $\frac{1(1+1)}{2}$ which also equals 1. This verifies the base case.
Inductive step: Let n > 1 and we assume that the equality holds for n − 1. [We
want to prove the equality for n. We start with the expression on the
left (messier) side of the desired and not-yet-proved equation for n and
manipulate the expression until it resembles the desired right side.] Then
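\[
\begin{aligned}
\sum_{k=1}^{n} k &= \left( \sum_{k=1}^{n-1} k \right) + n \\
&= \frac{(n-1)(n-1+1)}{2} + n \quad \text{(by induction assumption for } n-1\text{)} \\
&= \frac{n^2 - n}{2} + \frac{2n}{2} \quad \text{(by algebra)} \\
&= \frac{n^2 + n}{2} \\
&= \frac{n(n+1)}{2},
\end{aligned}
\]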
as was to be proved.
We can even prove the equality $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$ for all n ≥ 0. Since we have already
proved this equality for all n ≥ 1, it remains to prove it for n = 0. The left side $\sum_{k=1}^{0} k$ is
an empty sum and hence 0, and the right side is $\frac{0(0+1)}{2}$, which is also 0.
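The two ingredients of the induction in Example 1.6.1 can be spot-checked numerically: the base case, and the algebraic identity used in the inductive step. The sketch below is not a proof, only a sanity check over a finite range; the function name `closed_form` and the bound 1000 are assumptions for illustration.

```python
def closed_form(n):
    """The right side n(n+1)/2 of the claimed equality (always an integer)."""
    return n * (n + 1) // 2

# Base case n = 1: the one-term sum equals the closed form.
base_case_ok = sum(range(1, 2)) == closed_form(1) == 1

# Inductive step: adding n to the formula for n - 1 gives the formula for n.
inductive_step_ok = all(closed_form(n - 1) + n == closed_form(n)
                        for n in range(2, 1001))

# The case n = 0: the empty sum and the closed form are both 0.
empty_case_ok = sum(range(1, 1)) == closed_form(0) == 0
```

The inductive-step check mirrors the displayed computation: it never adds up 1 + 2 + ... + n directly, it only verifies the passage from n − 1 to n.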
Example 1.6.2. Prove the equality $\sum_{k=1}^{n} k(k+1)(k+2)(k+3) = \frac{n(n+1)(n+2)(n+3)(n+4)}{5}$
for all n ≥ 1.
Proof. Base case n = 1: $\sum_{k=1}^{1} k(k+1)(k+2)(k+3) = 1(1+1)(1+2)(1+3) = 1 \cdot 2 \cdot 3 \cdot 4 =
\frac{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5}{5} = \frac{1(1+1)(1+2)(1+3)(1+4)}{5}$, which verifies the base case.
Inductive step: Let n > 1 and we assume that the equality holds for n − 1. [We
want to prove the equality for n. We start with the expression on the
left side of the desired and not-yet-proved equation for n (the messier of
the two) and manipulate the expression until it resembles the desired right
side.] Then
\[
\begin{aligned}
\sum_{k=1}^{n} k(k+1)(k+2)(k+3)
&= \left( \sum_{k=1}^{n-1} k(k+1)(k+2)(k+3) \right) + n(n+1)(n+2)(n+3) \\
&= \frac{(n-1)(n-1+1)(n-1+2)(n-1+3)(n-1+4)}{5} + n(n+1)(n+2)(n+3) \quad \text{(by induction assumption)} \\
&= \frac{(n-1)n(n+1)(n+2)(n+3)}{5} + \frac{5n(n+1)(n+2)(n+3)}{5} \\
&= \frac{n(n+1)(n+2)(n+3)}{5} (n - 1 + 5) \quad \text{(by factoring)} \\
&= \frac{n(n+1)(n+2)(n+3)(n+4)}{5},
\end{aligned}
\]
as was to be proved.
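Before attempting an induction proof of an equality like the one in Example 1.6.2, it can be reassuring to compare both sides for a range of n. The sketch below does this with exact integer arithmetic; the function names `left_side` and `right_side` and the range of n are assumptions for illustration, and agreement on finitely many n is evidence, not a proof.

```python
def left_side(n):
    """The sum of k(k+1)(k+2)(k+3) for k = 1, ..., n, added term by term."""
    return sum(k * (k + 1) * (k + 2) * (k + 3) for k in range(1, n + 1))

def right_side(n):
    """The closed form n(n+1)(n+2)(n+3)(n+4)/5."""
    numerator = n * (n + 1) * (n + 2) * (n + 3) * (n + 4)
    # Among five consecutive integers one is a multiple of 5,
    # so the division below is exact.
    return numerator // 5

agree = all(left_side(n) == right_side(n) for n in range(1, 201))
```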
Example 1.6.3. Assuming that the derivative of x is 1 and the product rule for derivatives,
prove that for all n ≥ 1, $\frac{d}{dx}(x^n) = n x^{n-1}$. (We introduce derivatives formally in
Section 6.1.)
Proof. When n = 1,
Proof. We keep g(x) fixed and we prove by induction on the degree n of f(x) that the claim
holds for all polynomials f(x). If n < m, then we are done with q(x) = 0 and r(x) = f(x).
If n = m, then we set $q(x) = \frac{a_m}{b_m}$ and (necessarily)
\[
\begin{aligned}
r(x) &= f(x) - \frac{a_m}{b_m} g(x) \\
&= a_0 + a_1 x + \cdots + a_m x^m - \frac{a_m}{b_m} (b_0 + b_1 x + \cdots + b_m x^m) \\
&= \left( a_0 - \frac{a_m}{b_m} b_0 \right) + \left( a_1 - \frac{a_m}{b_m} b_1 \right) x + \cdots + \left( a_{m-1} - \frac{a_m}{b_m} b_{m-1} \right) x^{m-1},
\end{aligned}
\]
which has degree strictly smaller than m. These are the base cases.
Now suppose that n > m. Set $h(x) = a_1 + a_2 x + a_3 x^2 + \cdots + a_n x^{n-1}$. By induction
on n, there exist polynomials $q_1(x)$ and $r_1(x)$ such that $h(x) = q_1(x) \cdot g(x) + r_1(x)$ and such
that the degree of $r_1(x)$ is strictly smaller than m. Then $x h(x) = x q_1(x) \cdot g(x) + x r_1(x)$.
Since the degree of $x r_1(x)$ is at most m, by the second base case there exist polynomials
$q_2(x)$ and $r_2(x)$ such that $x r_1(x) = q_2(x) g(x) + r_2(x)$ and such that the degree of $r_2(x)$ is
strictly smaller than m. Now set $q(x) = x q_1(x) + q_2(x)$ and $r(x) = r_2(x) + a_0$. Then the
degree of r(x) is strictly smaller than m, and
\[
\begin{aligned}
q(x) g(x) + r(x) &= x q_1(x) g(x) + q_2(x) g(x) + r_2(x) + a_0 \\
&= x q_1(x) g(x) + x r_1(x) + a_0 \\
&= x h(x) + a_0 \\
&= f(x).
\end{aligned}
\]
Remark 1.6.6. A common usage of the Euclidean algorithm is in finding the greatest
common divisor of two polynomials. A polynomial d(x) divides f (x) and g(x) exactly
when it divides g(x) and r(x) = f (x) − q(x) · g(x). It is easier to find factors of polynomials
of smaller degree. As an example, let $f(x) = x^4 + 4x^3 + 6x^2 + 4x + 1$ and $g(x) = x^3 + 2x^2 + x$.
The first step of the Euclidean algorithm gives
\[ r(x) = f(x) - (x + 2) g(x) = x^2 + 2x + 1. \]
So to find the greatest common divisor of f(x) and g(x) it suffices to find the greatest
common divisor of g(x) and r(x). The Euclidean algorithm on these two gives
$r_1(x) = g(x) - x r(x) = 0$, so that to find the greatest common divisor of f(x) and g(x) it suffices
to find the greatest common divisor of r(x) and 0. But the latter is clearly r(x). In fact,
$f(x) = (x + 1)^4$ and $g(x) = x (x + 1)^2$.
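The two division steps in Remark 1.6.6 can be reproduced by machine. The sketch below uses ordinary polynomial long division, not the inductive construction of the proof above, and it rests on assumptions made only for illustration: polynomials are stored as coefficient lists starting with the constant term, and the names `trim` and `poly_divmod` are invented here.

```python
from fractions import Fraction

def trim(p):
    """Drop zero coefficients from the highest-degree end of the list."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and deg r < deg g.
    The zero polynomial is represented by the empty list."""
    f, g = trim(f), trim(g)
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    r = [Fraction(c) for c in f]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coeff = r[-1] / Fraction(g[-1])    # cancel the leading term of r
        q[shift] = coeff
        for i, c in enumerate(g):
            r[i + shift] -= coeff * c
        r = trim(r)
    return trim(q), r

f = [1, 4, 6, 4, 1]      # x^4 + 4x^3 + 6x^2 + 4x + 1
g = [0, 1, 2, 1]         # x^3 + 2x^2 + x
q, r = poly_divmod(f, g)       # first step of the Euclidean algorithm
q1, r1 = poly_divmod(g, r)     # second step
```

With these inputs, q represents x + 2 and r represents x² + 2x + 1, while the second step leaves the zero remainder, in agreement with the remark.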
Example 1.6.7. For all positive integers n, $\sqrt[n]{n} < 2$.
Proof. Base case: n = 1, so $\sqrt[n]{n} = 1 < 2$.
Inductive step: Suppose that n is an integer with n ≥ 2 and that $\sqrt[n-1]{n-1} < 2$.
This means that $n - 1 < 2^{n-1}$. Hence $n < 2^{n-1} + 1 < 2^{n-1} + 2^{n-1} = 2 \cdot 2^{n-1} = 2^n$, so that
$\sqrt[n]{n} < 2$.
Remark 1.6.8. There are two other equivalent formulations of mathematical induction
for proving a property P for all integers $n \ge n_0$:
Mathematical induction, version III:
(1) Base case: Prove that P holds for $n_0$.
(2) Inductive step: Let $n \ge n_0$. Assume that P holds for all integers $n_0, n_0 + 1, n_0 + 2, \ldots, n$.
Prove that P holds for n + 1.
Exercises for Section 1.6: Prove the following properties for n ≥ 1 by induction.
1.6.1. $\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$.
1.6.2. $\sum_{k=1}^{n} k^3 = \left( \frac{n(n+1)}{2} \right)^2$.
1.6.3. The sum of the first n odd positive integers is $n^2$.
1.6.4. $\sum_{k=1}^{n} (2k - 1) = n^2$.
1.6.5. (Triangle inequality) For all positive integers n and for all real numbers $a_1, \ldots, a_n$,
$|a_1 + a_2 + \cdots + a_n| \le |a_1| + |a_2| + \cdots + |a_n|$. (Hint: there may be more than one base case.
Why is that?)
1.6.6. $\sum_{k=1}^{n} (3k^2 - k) = n^2 (n + 1)$.
1.6.7. $1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + n(n+1) = \frac{1}{3} n(n+1)(n+2)$.
1.6.8. $7^n + 2$ is a multiple of 3.
1.6.9. $3^{n-1} < (n+1)!$.
1.6.10. $\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \cdots + \frac{1}{\sqrt{n}} \ge \sqrt{n}$.
1.6.11. $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{n^2} \le 2 - \frac{1}{n}$.
1.6.12. Let $a_1 = 2$, and for n ≥ 2, $a_n = 3 a_{n-1}$. Formulate and prove a theorem giving $a_n$
in terms of n (no dependence on other $a_i$).
1.6.13. 8 divides $5^n + 2 \cdot 3^{n-1} + 1$.
1.6.14. 1(1!) + 2(2!) + 3(3!) + · · · + n(n!) = (n + 1)! − 1.
1.6.15. $2^{n-1} \le n!$.
1.6.16. $\prod_{k=2}^{n} \left( 1 - \frac{1}{k} \right) = \frac{1}{n}$.
1.6.17. $\prod_{k=2}^{n} \left( 1 - \frac{1}{k^2} \right) = \frac{n+1}{2n}$.
1.6.18. $\sum_{k=0}^{n} 2^k (k + 1) = 2^{n+1} n + 1$.
1.6.19. $\sum_{k=1}^{2^n} \frac{1}{k} \ge \frac{n+2}{2}$.
† 1.6.20. (Invoked in Example 9.1.9.) $\sum_{k=1}^{2^n - 1} \frac{1}{k^2} \le \sum_{k=0}^{n} \frac{1}{2^k}$.
† 1.6.21. (Invoked in the proof of Theorem 9.4.1.) For all numbers x, y,
$x^n - y^n = (x - y)(x^{n-1} + x^{n-2} y + x^{n-3} y^2 + \cdots + y^{n-1})$, i.e., $x^n - y^n = (x - y) \sum_{k=0}^{n-1} x^{n-1-k} y^k$.
† 1.6.22. (Invoked in the Ratio tests Theorems 8.6.6 and 9.2.3.) Let r be a positive real
number.
i) Suppose that for all positive integers $n \ge n_0$, $a_{n+1} < r a_n$. Prove that for all
positive integers $n > n_0$, $a_n < r^{n-n_0} a_{n_0}$.
ii) Suppose that $a_{n+1} \mathrel{\square} r a_n$, where $\square$ is one of ≤, >, ≥. Prove that $a_n \mathrel{\square} r^{n-n_0} a_{n_0}$.
1.6.23. Let $A_n = 1^2 + 2^2 + 3^2 + \cdots + (2n-1)^2$ and $B_n = 1^2 + 3^2 + 5^2 + \cdots + (2n-1)^2$.
Discover formulas for $A_n$ and $B_n$, and prove them (by using algebra and previous problems,
and possibly not with induction).
1.6.24. (From the American Mathematical Monthly 123 (2016), page 87, by K. Gaitanas)
Prove that for every n ≥ 2, $\sum_{k=1}^{n-1} \frac{k}{(k+1)!} = 1 - \frac{1}{n!}$.
1.6.25. Let $A_n = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{n(n+1)}$. Discover a formula for $A_n$ and prove it.
1.6.26. How many handshakes happen at a gathering of n people if everybody shakes
everybody else’s hands exactly once?
1.6.27. Find with proof an integer $n_0$ such that $n^2 < 2^n$ for all integers $n \ge n_0$.
1.6.28. Find with proof an integer $n_0$ such that $2^n < n!$ for all integers $n \ge n_0$.
1.6.29. For any positive integer n and real number x define $S_n = 1 + x + x^2 + \cdots + x^n$.
i) Prove that for any n ≥ 2, $x S_{n-1} + 1 = S_n$, and that $S_n (1 - x) = 1 - x^{n+1}$.
ii) Prove that if x = 1, then $S_n = n + 1$.
iii) Prove that if x ≠ 1, then $S_n = \frac{1 - x^{n+1}}{1 - x}$. Compare with the proof by induction in
Example 1.6.4.
1.6.30. (Fibonacci numbers) Let $s_1 = 1$, $s_2 = 1$, and for all n ≥ 2, let $s_{n+1} = s_n + s_{n-1}$.
This sequence starts with 1, 1, 2, 3, 5, 8, 13, 21, 34, . . .. (Many parts below are taken from the
book Fibonacci Numbers by N. N. Vorob’ev, published by Blaisdell Publishing Company,
1961, translated from the Russian by Halina Moss; there is a new edition of the book with
author’s last name written as Vorobiev, published by Springer Basel AG, 2002, translated
from the Russian by Mircea Martin.)
i) Fibonacci numbers are sometimes “motivated” as follows. You get the rare gift of
a pair of newborn Fibonacci rabbits. Fibonacci rabbits are the type of rabbits who
never die and each month starting in their second month produce another pair of
rabbits. At the beginning of months one and two you have exactly that 1 pair of
rabbits. In the second month, that pair gives you another pair of rabbits, so at
the beginning of the third month you have 2 pairs of rabbits. In the third month,
the original pair produces another pair of rabbits, so that at the beginning of the
fourth month, you have 3 pairs of rabbits. Justify why the number of rabbits at
the beginning of the nth month is $s_n$.
ii) Prove that for all n ≥ 1, $s_n = \frac{1}{\sqrt{5}} \left( \frac{1+\sqrt{5}}{2} \right)^n - \frac{1}{\sqrt{5}} \left( \frac{1-\sqrt{5}}{2} \right)^n$. (It may seem amazing
that these expressions with square roots of 5 always yield positive integers.) Note
that the base case requires proving this for n = 1 and n = 2, and that the inductive
step uses knowing the property for the previous two integers.
iii) Prove that $s_1 + s_3 + s_5 + \cdots + s_{2n-1} = s_{2n}$.
iv) Prove that $s_2 + s_4 + s_6 + \cdots + s_{2n} = s_{2n+1} - 1$.
v) Prove that $s_1 + s_2 + s_3 + \cdots + s_n = s_{n+2} - 1$.
vi) Prove that $s_1 - s_2 + s_3 - s_4 + \cdots + s_{2n-1} - s_{2n} = 1 - s_{2n-1}$.
vii) Prove that $s_1 - s_2 + s_3 - s_4 + \cdots + s_{2n-1} - s_{2n} + s_{2n+1} = s_{2n} + 1$.
viii) Prove that $s_1 - s_2 + s_3 - s_4 + \cdots + (-1)^{n+1} s_n = (-1)^{n+1} s_{n-1} + 1$.
ix) Prove that for all n ≥ 3, $s_n > \left( \frac{1+\sqrt{5}}{2} \right)^{n-2}$.
x) Prove that for all n ≥ 1, $s_1^2 + s_2^2 + \cdots + s_n^2 = s_n s_{n+1}$.
xi) Prove that $s_{n+1} s_{n-1} - s_n^2 = (-1)^n$.
xii) Prove that $s_1 s_2 + s_2 s_3 + \cdots + s_{2n-1} s_{2n} = s_{2n}^2$.
xiii) Prove that $s_1 s_2 + s_2 s_3 + \cdots + s_{2n} s_{2n+1} = s_{2n+1}^2 - 1$.
xiv) Prove that $n s_1 + (n-1) s_2 + (n-2) s_3 + \cdots + 2 s_{n-1} + s_n = s_{n+4} - (n+3)$.
xv) Prove that for all n ≥ 1 and all k ≥ 2, $s_{n+k} = s_k s_{n+1} + s_{k-1} s_n$.
xvi) Prove that for all n, k ≥ 1, $s_{kn}$ is a multiple of $s_n$. (Use the previous part.)
xvii) Prove that $s_{2n+1} = s_{n+1}^2 + s_n^2$.
xviii) Prove that $s_{2n} = s_{n+1}^2 - s_{n-1}^2$.
xix) Prove that $s_{3n} = s_{n+1}^3 + s_n^3 - s_{n-1}^3$.
xx) Prove that $s_{n+1} = \frac{1+\sqrt{5}}{2} s_n + \left( \frac{1-\sqrt{5}}{2} \right)^n$.
xxi) Prove that $s_3 + s_6 + s_9 + \cdots + s_{3n} = \frac{s_{3n+2} - 1}{2}$. (Use the previous part.)
xxii) Prove that $s_1^3 + s_2^3 + s_3^3 + \cdots + s_n^3 = \frac{s_{3n+2} + (-1)^{n+1} 6 s_{n-1} + 5}{10}$.
xxiii) Prove that $\left| s_n - \frac{(1+\sqrt{5})^n}{2^n \sqrt{5}} \right| < \frac{1}{2}$.
(xxiv)* If you know a bit of number theory, prove that for all positive integers m, n, the
greatest common divisor of $s_m$ and $s_n$ is $s_{\gcd(m,n)}$.
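Before proving identities such as those in Exercise 1.6.30, it can help to test them numerically on the first few Fibonacci numbers. The sketch below does this for parts ii), iii), v) and xi). It is not a proof and proves none of the parts; the names `fib_list`, `phi`, `psi`, the padding entry `s[0] = 0`, and the cutoff 40 are all assumptions made for illustration.

```python
from math import sqrt

def fib_list(count):
    """Return [s_1, ..., s_count] with s_1 = s_2 = 1 and s_{n+1} = s_n + s_{n-1}."""
    s = [1, 1]
    while len(s) < count:
        s.append(s[-1] + s[-2])
    return s[:count]

# Pad with a leading 0 so that s[n] is s_n; the entry s[0] is a convenience only.
s = [0] + fib_list(40)

phi = (1 + sqrt(5)) / 2
psi = (1 - sqrt(5)) / 2

# Part ii), in floating point, so we round to the nearest integer.
binet_ok = all(round((phi ** n - psi ** n) / sqrt(5)) == s[n]
               for n in range(1, 41))
# Part iii): s_1 + s_3 + ... + s_{2n-1} = s_{2n}.
odd_sum_ok = all(sum(s[1:2 * n:2]) == s[2 * n] for n in range(1, 20))
# Part v): s_1 + s_2 + ... + s_n = s_{n+2} - 1.
total_ok = all(sum(s[1:n + 1]) == s[n + 2] - 1 for n in range(1, 39))
# Part xi), checked for n >= 2 so that only s_1, s_2, ... are used.
cassini_ok = all(s[n + 1] * s[n - 1] - s[n] ** 2 == (-1) ** n
                 for n in range(2, 40))
```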
i) Prove that for every positive integer n, any $2^n \times 2^n$ square grid with exactly one
of the squares removed can be tiled with trominoes.
ii) Prove that for every positive integer n, $4^n - 1$ is an integer multiple of 3.
1.6.32. Pick a vertex V in a triangle. Draw n distinct lines from V to the opposite edge
of the triangle. If n = 1, you get the original triangle and two smaller triangles, for a total
of three triangles. Determine the number of distinct triangles obtained in this way with
arbitrary n.
1.6.33. (Spiral of Theodorus) Draw a triangle with vertices at (0, 0), (1, 0), (1, 1). The
hypotenuse has length $\sqrt{2}$.
i) One of the vertices of the hypotenuse is at (0, 0). At the other vertex of the
hypotenuse, draw an edge of length 1 at the right angle away from the first triangle.
Make a triangle from the old hypotenuse and this new edge. What is the length of
the hypotenuse of the new triangle?
ii) Repeat the previous step twice.
iii) Prove that one can draw $\sqrt{n}$ for every positive integer n.
1.6.34. (Tower of Hanoi) There are 3 pegs on a board. On one peg, there are n disks,
stacked from largest to smallest. The task is to move all of the disks from one peg to a
different peg, given the following constraints: you may only move one disk at a time, and
you may only place a smaller disk on a larger one (never a larger one on a smaller one).
Let $S_n$ be the least number of moves to complete the task for n disks.
i) If n = 1, then what is the least number of moves it takes to complete the task?
What if there are 2 disks? Repeat for 3, 4, 5 disks.
ii) Make a recursive formula (defining $S_n$ based on $S_{n-1}$) for this $S_n$. Then, make a
guess for a non-recursive formula for $S_n$ (defining $S_n$ based on n without invoking
$S_{n-1}$). Prove your guess using induction and the recursive formula that you wrote.
1.6.35. What is wrong with the following “proof by induction”? I will prove that $5^n + 1$
is a multiple of 4. Assume that this is true for n − 1. Then we can write $5^{n-1} + 1 = 4m$
for some integer m. Multiply this equation through by 5 to get that
\[ 5^n + 5 = 20m, \]
Pascal’s triangle is very useful, so read this section with the exercises.
The following are rows 0 through 7 of Pascal’s triangle, and the pattern is obvious
for continuation into further rows:
row 0:                         1
row 1:                       1   1
row 2:                     1   2   1
row 3:                   1   3   3   1
row 4:                 1   4   6   4   1
row 5:               1   5  10  10   5   1
row 6:             1   6  15  20  15   6   1
row 7:           1   7  21  35  35  21   7   1
Note that the leftmost and rightmost numbers in each row are all 1, and each of the other
numbers is the sum of the two numbers nearest to it in the row above it. We number the
slanted columns from left to right starting from 0: the 0th slanted column consists of all 1s,
the 1st slanted column consists of consecutive numbers 1, 2, 3, 4, . . ., the 2nd slanted column
consists of consecutive numbers 1, 3, 6, 10, . . ., and so on for the subsequent columns.
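The rule just described (1 at both ends of each row, and every other entry the sum of the two nearest entries in the row above) is easy to turn into a short program. This is a sketch; the function name `pascal_rows` is an assumption for illustration.

```python
def pascal_rows(last_row):
    """Return rows 0 through last_row of Pascal's triangle as lists."""
    rows = [[1]]
    for _ in range(last_row):
        prev = rows[-1]
        # Each inner entry is the sum of the two entries nearest to it above.
        middle = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + middle + [1])
    return rows

rows = pascal_rows(7)
# The 2nd slanted column: the entry in position 2 of every row that has one.
second_column = [row[2] for row in rows if len(row) > 2]
```

With `last_row = 7` the last row produced is 1 7 21 35 35 21 7 1, and `second_column` begins 1, 3, 6, 10.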
Let the entry in the nth row and kth column be denoted $\binom{n}{k}$. We read this as “n
choose k”. These are loaded words, however, and we will eventually justify them.
Pascal’s triangle is defined so that for all n ≥ 1 and all k = 0, 1, . . . , n − 1,
\[ \binom{n}{k} + \binom{n}{k+1} = \binom{n+1}{k+1}. \]
What would it take to compute $\binom{100}{5}$? It seems like we would need to write down rows
0 through 100 of Pascal’s triangle, or actually a little less, only slanted columns 0 through
5 of these 101 rows. That is too much drudgery! We will instead be smart mathematicians
and we will prove many properties of Pascal’s triangle in general, including shortcuts for
computing $\binom{100}{5}$. We will accomplish this through exercises, most of which can be proved
by mathematical induction.
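The drudgery described above, keeping only slanted columns 0 through 5 of rows 0 through 100, is exactly the kind of work a computer does happily. The sketch below carries it out using nothing but the defining addition rule. The function name `entry_by_columns` is an assumption for illustration, and so is the convention of treating positions outside the triangle as 0. Python's built-in `math.comb` is used only as an independent check.

```python
from math import comb

def entry_by_columns(n, k):
    """Compute the entry in row n, slanted column k of Pascal's triangle,
    storing only columns 0..k of one row at a time."""
    # Row 0: a single 1 in column 0; positions outside the triangle count as 0.
    row = [1] + [0] * k
    for _ in range(n):
        # Update right to left so row[j - 1] still belongs to the previous row.
        for j in range(k, 0, -1):
            row[j] += row[j - 1]
    return row[k]

value = entry_by_columns(100, 5)
matches_builtin = value == comb(100, 5)
```

The computation uses 100 passes over a list of six numbers, far less than writing out the whole triangle, though still more work than the shortcuts developed in the exercises.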
1.7.3. Prove that every integer n ≥ 0 has the property that for all k = 0, 1, 2, . . . , n,
$\binom{n}{k} = \frac{n!}{k!(n-k)!}$.
1.7.6. Prove that $\binom{n}{k}$ is the number of possible k-member teams in a club with exactly n
† 1.7.7. (Invoked in Theorem 2.9.2, Example 8.2.9, Theorem 6.2.3.) Prove that for all
non-negative integers n,
\[ (a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}. \]
(Since a + b contains two summands, it is called a binomial, and the expansion of $(a + b)^n$
is called the binomial expansion, with coefficients $\binom{n}{i}$ being called by yet another name
1.7.8. Express each of the following as $a + b\sqrt{2}$ for some integers a, b:
i) $\sqrt{2} - 1$, $(\sqrt{2} - 1)^2$, $(\sqrt{2} - 1)^3$, $(\sqrt{2} - 1)^4$, $(\sqrt{2} - 1)^5$.
ii) Write each of the five expressions in the previous part in the form $\sqrt{c} - \sqrt{d}$ for
some positive integers c, d.
(iii)* Do you see a relation between c and d for each expression in the previous part? Is
there a general rule? Can you prove it?
1.7.9. Prove that for all positive integers n, $\sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0$. Compute $\sum_{k=0}^{n} (-1)^k \binom{n}{k}$
in case n = 0.
1.7.10. Prove that for any non-negative integer k,
\[ \sum_{j=1}^{n} j(j+1)(j+2) \cdots (j+k) = \frac{n(n+1)(n+2) \cdots (n+k+1)}{k+2}. \]
i) Note that $j^2 = j(j+1) - j$. Use the simplifications from above to prove that
$\sum_{j=1}^{n} j^2 = \frac{n(n+1)(2n+1)}{6}$.
ii) From $j^3 = j(j+1)(j+2) - 3j^2 - 2j = j(j+1)(j+2) - 3j(j+1) + j$ develop the
formula for $\sum_{j=1}^{n} j^3$.
iii) Mimic the previous work to develop the formula for $\sum_{j=1}^{n} j^4$.
1.7.12. Prove that for all non-negative integers n and all k = 0, 1, . . . , n, $\binom{n}{k} \le \frac{n^k}{k!}$.
1.7.13. Fix a positive integer k. Prove that there exists a positive number C such that
for all sufficiently large integers n, $C n^k \le \binom{n}{k}$.
1.7.14. Give reasons why we should have $\binom{n}{k} = 0$ for n < k or if either k or n is negative.
1.7.15. Let d be a positive integer. This is about summing entries in Pascal’s triangle
along the dth northwest-southeast slanted column: Prove by induction on n ≥ 0 that
$\sum_{k=0}^{n} \binom{d+k}{k} = \binom{d+n+1}{n}$.
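*1.7.16. Prove that for all non-negative integers n,
\[ \left( \sum_{k=0}^{n} \binom{2n}{2k} 2^k \right)^2 - 1 = 2 \left( \sum_{k=0}^{n-1} \binom{2n}{2k+1} 2^k \right)^2 \]
and
\[ \left( \sum_{k=0}^{n} \binom{2n+1}{2k} 2^k \right)^2 + 1 = 2 \left( \sum_{k=0}^{n} \binom{2n+1}{2k+1} 2^k \right)^2. \]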
For notation’s sake you may want to label $E_n = \sum_{k=0}^{e_n} \binom{n}{2k} 2^k$ and $O_n = \sum_{k=0}^{o_n} \binom{n}{2k+1} 2^k$,
where $e_n, o_n$ are the largest integers such that $2e_n \le n$ and $2o_n + 1 \le n$. The claim is then
that for all n ≥ 0, $E_{2n}^2 - 1 = 2 O_{2n}^2$ and $E_{2n+1}^2 + 1 = 2 O_{2n+1}^2$. (Hint: use the definition
of $\binom{n}{k}$ to rewrite $E_n$ in terms of $E_{n-1}$, $O_{n-1}$. Proceed with induction.)
Chapter 2: Concepts with which we will do mathematics
2.1 Sets
What is a set? Don’t we already have an idea of what a set is? The following
informal definition relies on our intuitive idea of a set while making precise some notation
and vocabulary of membership.
Definition 2.1.1. A set is a collection of objects. These objects are called members or
elements of that set. If m is a member of a set A, we write m ∈ A, and also say that A
contains m. If m is not a member of a set A, we write m ∉ A.
The set of all polygons contains triangles, squares, rectangles, pentagons, and so on.
The set of all polygons does not contain circles or disks. The set of all functions contains
the trigonometric, logarithmic, exponential, constant functions, and so on.
worry when “C” appears in the text). The set of all non-negative real numbers
equals [0, ∞), and we also write it as R_{≥0}.
(8) We can define sets propositionally: if P is a property, then the set
{x : P (x)} or {x ∈ A : P (x)}
consists of all x (or x ∈ A) for which P holds. Here are some explicit examples:
(i) {x ∈ R : x^2 = x}, and this happens to be the set {0, 1}.
(ii) {x ∈ R : x > 0 and x < 1}, and this happens to be the interval (0, 1).
(iii) Q = {a/b : a, b ∈ Z and b ≠ 0}.
(iv) The set of all positive prime numbers equals {n ∈ N : n > 1 and if n = pq for
some integers p, q then |p| = 1 or |q| = 1 }.
(v) Set A = {x : x is a positive integer that equals the sum of its proper factors}.
(Elements of A are called perfect numbers.) It is easy to verify that 1, 2, 3, 4, 5
are not in A. But 6 has factors 1, 2, 3, 6, and the sum of the factors other than
6 equals 6. Thus 6 is an element of A. You can verify that the three numbers
28, 496, and 8128 are also in A. (If you and your computer have a lot of time,
write a program to verify that no other number smaller than 33 million is in A.)
(9) Proving that a property P holds for all integers n ≥ n_0 is the same as proving that
the set A = {n ∈ Z : P holds for n} contains {n_0, n_0 + 1, n_0 + 2, . . .}. By the
principle of mathematical induction, for this it suffices to prove that n_0 ∈ A and that
for every integer n > n_0, n − 1 ∈ A implies that n ∈ A.
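The program suggested in example (8)(v) could look roughly like the following Python sketch. The function names and the bound 10,000 are choices made here for illustration, not part of the text; checking all numbers below 33 million with this simple method works in principle but takes much longer.

```python
def sum_of_proper_factors(x: int) -> int:
    """Return the sum of the positive factors of x other than x itself."""
    if x <= 1:
        return 0
    total = 1  # 1 is a proper factor of every integer x > 1
    d = 2
    while d * d <= x:
        if x % d == 0:
            total += d
            partner = x // d
            if partner != d:  # do not count a square root twice
                total += partner
        d += 1
    return total


def perfect_numbers_below(bound: int) -> list[int]:
    """List all perfect numbers x with 1 <= x < bound."""
    return [x for x in range(1, bound) if sum_of_proper_factors(x) == x]


print(perfect_numbers_below(10_000))  # [6, 28, 496, 8128]
```

The loop over d stops at the square root of x because factors come in pairs d and x/d, which keeps the search fast enough for small bounds.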
Just like numbers, functions, and logical statements, sets and their elements can also
be related and combined in meaningful ways. The list below introduces quite a few new
concepts that may be overwhelming at first, but in a few weeks you will be very comfortable
with them.
Subsets: A set A is a subset of a set B if every element of A is an element of B. In that
case we write A ⊆ B. For example, N+ ⊆ N0 ⊆ Z ⊆ Q ⊆ R. The non-subset relation
is expressed with the symbol ⊈: R ⊈ N.
Every set is a subset of itself, i.e., for every set A, A ⊆ A.
The empty set is a subset of every set, i.e., for every set A, ∅ ⊆ A.
If A is a subset of B and A is not equal to B (so B contains at least one element that
is not in A), then we say that A is a proper subset of B, and we write A ⊊ B. For
example, N+ ⊊ N0 ⊊ Z ⊊ Q ⊊ R.
Equality: Two sets are equal if they consist of exactly the same elements. In other words,
A = B if and only if A ⊆ B and B ⊆ A.
Intersection: The intersection of sets A and B is the set of all objects that are in A
and in B:
A ∩ B = {x : x ∈ A and x ∈ B}.
When A ∩ B = ∅, we say that A and B are disjoint.
Union: The union of sets A and B is the set of all objects that are either in A or in B:
A ∪ B = {x : x ∈ A or x ∈ B}.
$$\bigcup_{k=1}^{n} A_k = A_1 \cup A_2 \cup \cdots \cup A_n = \{a : \exists k = 1, \ldots, n \text{ such that } a \in A_k\}.$$
When I = {1, 2, . . . , n}, then the intersections and unions in the two lines above are
the same as those in the previous display.
When I is the empty index set, one can argue as for empty sums in Section 1.5 that
$$\bigcup_{k \in \emptyset} A_k = \emptyset,$$
i.e., the empty union is that set which when unioned with any other set returns that
other set. The only set which satisfies this property is the empty set. Similarly,
the empty intersection should be that set which when intersected with any other set
returns that other set. However, this empty intersection depends on the context: when
the allowed other sets vary over all subsets of a set X, then the empty intersection
equals X. We return to this theme in Section 2.5.
We often have an implicit or explicit universal set that contains all elements of our
current interest. Perhaps we are talking only about real numbers, or perhaps we are
talking about all functions defined on the interval [0, 1] with values being real numbers.
In that case, for any subset A of the universal set U , the complement of A is the
complement of A in U , thus U \ A, and this is denoted as A^c.
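As an informal illustration of the operations in this list, here is a minimal Python sketch. The sets U, A, and B are hypothetical choices, not from the text, and Python's built-in set type only models finite sets. The helpers union_of_family and intersection_of_family are names made up here; the second one takes the universal set as an argument, which mirrors the remark that the empty intersection depends on the context.

```python
from functools import reduce

U = set(range(1, 11))  # hypothetical universal set {1, ..., 10}
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

is_subset = {3, 4} <= A          # subset test
is_proper_subset = {3, 4} < A    # proper subset test
intersection = A & B
union = A | B
difference = A - B               # A \ B
complement_of_A = U - A          # complement of A in U
are_disjoint = A.isdisjoint({7, 8})


def union_of_family(family):
    """Union of a list of sets; the empty family gives the empty set."""
    return reduce(lambda X, Y: X | Y, family, set())


def intersection_of_family(family, universe):
    """Intersection of a list of subsets of `universe`;
    the empty family gives the whole universe."""
    return reduce(lambda X, Y: X & Y, family, set(universe))
```

Starting the intersection from the universe is a design choice that makes the empty case agree with the convention described above for subsets of a fixed set X.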
Example 2.1.3. We prove that Z = {3m + 4n : m, n ∈ Z}. Certainly for any integers m
and n, 3m + 4n is also an integer, so that {3m + 4n : m, n ∈ Z} ⊆ Z. Now let x ∈ Z. Then
x = 1 · x = (4 − 3) · x = 3(−x) + 4x,
so that x ∈ {3m + 4n : m, n ∈ Z}, whence Z ⊆ {3m + 4n : m, n ∈ Z}. Since we already
proved the other inclusion, the proof is done.
Example 2.1.4. We prove that A = {6m + 14n : m, n ∈ Z} equals the set B of all even
integers. Certainly for any integers m and n, 6m + 14n is an even integer, so that A ⊆ B.
Now let x ∈ B. Then x is even, so x = 2n for some integer n. Write
x = 2n = (14 − 2 · 6)n = 6(−2n) + 14n,
so that x ∈ {6m + 14n : m, n ∈ Z} = A. Thus B ⊆ A. Together with the first part this
implies that A = B.
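The witnesses used in Examples 2.1.3 and 2.1.4 can be spot-checked numerically. The Python sketch below is only a sanity check on a finite window of integers, not a replacement for the proofs; the function names and the window sizes are assumptions made here.

```python
def witness_3_4(x):
    """Return (m, n) with 3m + 4n == x, following x = 3(-x) + 4x."""
    return (-x, x)


def witness_6_14(x):
    """For an even integer x = 2n, return (m, n) with 6m + 14n == x,
    following x = 6(-2n) + 14n."""
    n = x // 2
    return (-2 * n, n)


# Example 2.1.3: every integer in the window is of the form 3m + 4n.
for x in range(-50, 51):
    m, n = witness_3_4(x)
    assert 3 * m + 4 * n == x

# Example 2.1.4: every even integer in the window is of the form 6m + 14n.
for x in range(-50, 51, 2):
    m, n = witness_6_14(x)
    assert 6 * m + 14 * n == x

# Conversely, every value 6m + 14n in a small window is even.
all_even = all((6 * m + 14 * n) % 2 == 0
               for m in range(-20, 21) for n in range(-20, 21))
```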
Example 2.1.5. The complement in Z of the set of even integers is the set of odd integers.
The complement in Q of the set of even integers contains many more elements than odd
integers. For example, it contains 1/2, 2/3, . . . .
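Example 2.1.8. For each i ∈ N+ , let A_i = [i, ∞), B_i = {i, i + 1, i + 2, i + 3, . . .}, and
C_i = (−i, i). Think through the following:
$$\bigcap_{k \in \mathbb{N}} A_k = \emptyset, \quad \bigcap_{k \in \mathbb{N}} B_k = \emptyset, \quad \bigcap_{k \in \mathbb{N}} C_k = (-1, 1),$$
$$\bigcup_{k \in \mathbb{N}} A_k = [1, \infty), \quad \bigcup_{k \in \mathbb{N}} B_k = \mathbb{N}^+, \quad \bigcup_{k \in \mathbb{N}} C_k = \mathbb{R}.$$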
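Example 2.1.9. For each real number r, let A_r = {r}, B_r = [0, |r|]. Then
$$\bigcap_{r \in \mathbb{R}} A_r = \emptyset, \quad \bigcap_{r \in \mathbb{R}} B_r = \{0\}, \quad \bigcup_{r \in \mathbb{R}} A_r = \mathbb{R}, \quad \bigcup_{r \in \mathbb{R}} B_r = [0, \infty).$$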
Set operations can be represented with a Venn diagram, especially in the presence
of a universal set U . Here is an example:
[Figure: Venn diagram with two overlapping circles A and B inside the universal set U.]
On this Venn diagram, sets are represented by the geometric regions: A is the set
represented by the left circle, B is represented by the right circle, A ∩ B is the part of the
two circles that is both in A and in B, A ∪ B is represented by the region that is either in
A or in B, A \ B is the left crescent after B is chopped out of A, etc. (There is no reason
why the regions for sets A and B are drawn as circles, but this is traditional.)
Sometimes we draw a few (or all) elements into the diagram. For example, in
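[Figure: Venn diagram of A and B in U with the elements t, u, w, x, y, z drawn in.]
[Figure: Venn diagram of A and B in U.]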
Proof. With the Venn diagram below, B ∪ C is the region filled with either horizontal
or vertical lines, A is the region filled with the southeast-northwest slanted lines, and so
A ∩ (B ∪ C) is the region that has simultaneously the southeast-northwest slanted lines
and either horizontal or vertical lines. Also, A ∩ C is the region that has horizontal and
southeast-northwest slanted lines, A ∩ B is the region that has vertical and southeast-
northwest slanted lines, so that their union (A ∩ B) ∪ (A ∩ C) represents the total region of
southeast-northwest slanted lines that either have horizontal or vertical cross lines as well,
which is the same as the region for A ∩ (B ∪ C).
[Figure: Venn diagram of A, B, and C in U, shaded as described in the proof.]
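The identity A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) argued with the Venn diagram can also be checked by brute force on a small universal set. The following Python sketch is a finite check under assumed names (all_subsets, distributive_law_holds) and a hypothetical three-element universe; it supports the statement but does not prove it for arbitrary sets.

```python
from itertools import combinations


def all_subsets(universe):
    """Return a list of all subsets of a finite set."""
    items = sorted(universe)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def distributive_law_holds(universe):
    """Check A & (B | C) == (A & B) | (A & C) for all subsets A, B, C."""
    subsets = all_subsets(universe)
    return all(A & (B | C) == (A & B) | (A & C)
               for A in subsets for B in subsets for C in subsets)


print(distributive_law_holds({1, 2, 3}))  # True
```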
2.1.3. Assume that A and B are disjoint subsets of U . For each part below, draw a Venn
diagram with A and B in U , and shade in the region described by the set: (i) U \ B,
(ii) A ∩ B, (iii) U \ (U \ A), (iv) (U \ A) ∩ (U \ B), (v) (U ∩ B) ∪ (U \ (B ∪ A)), (vi)
(B ∩ U ) ∪ (B \ U ), (vii) (U \ (U \ A)) ∪ B, (viii) (A ∪ B) ∪ (U \ A).
2.1.4. Prove the following:
i) {x ∈ R : x^2 = 3} = {√3, −√3}.
ii) {x^3 : x ∈ R} = R.
iii) {x^2 : x ∈ R} = {x ∈ R : x ≥ 0} = [0, ∞).
iv) {2, 2, 5} = {2, 5} = {5, 2}.
v) {x ≥ 0 : x is an even prime number} = {2}.
vi) ∅ is a subset of every set. Elements of ∅ are green, smart, sticky, hairy, feathery,
prime, whole, negative, positive,...
vii) {x : x can be written as a sum of three consecutive integers} = {3n : n ∈ Z}.
viii) If A ⊆ B, then A ∩ B = A and A ∪ B = B.
2.1.5. Let U = {1, 2, 3, 4, 5, 6}, A = {1, 3, 5}, and B = {4, 5, 6}. Find the following sets:
i) (A \ B) ∪ (B \ A).
ii) U \ (B \ A).
iii) U ∪ (B \ A).
iv) U \ (A ∪ B).
v) (U ∩ A) ∪ (U ∩ B).
vi) A \ (A \ B).
vii) B \ (B \ A).
viii) {A} ∩ {B}.
2.1.6. Let A, B ⊆ U .
i) Prove that there exist at most 16 distinct subsets of U obtained from A, B, U by
intersections, unions, and complementation.
ii) If A and B are disjoint, prove that there exist at most 8 such distinct subsets.
iii) If A = B, prove that there exist at most 4 such distinct subsets.
iv) If A = B = U , prove that there exist at most 2 such distinct subsets.
v) If A = B = U = ∅, prove that there exists at most 1 such subset.
2.1.7. Let A, B, C ⊆ U . Prove the following statements:
i) (A ∩ C) \ B = (A \ B) ∩ (C \ B).
ii) (A \ B) ∪ (B \ A) = (A ∪ B) \ (A ∩ B).
iii) (A ∩ B) ∪ (U \ (A ∪ B)) = (U \ (A \ B)) \ (B \ A).
iv) U \ (A \ B) = (U \ A) ∪ B.
v) If U = A ∩ B, then A = B = U .
2.1.8. Let A, B ⊆ U .
i) Prove that (U \ A) ∩ (U \ B) = U \ (A ∪ B). (The intersection of the complements
is the complement of the union.)
ii) Prove that (U \ A) ∪ (U \ B) = U \ (A ∩ B). (The union of the complements is the
complement of the intersection.)
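2.1.9. Compute $\bigcap_{k \in \mathbb{N}^+} (-1/k, 1/k)$, $\bigcap_{k \in \mathbb{N}^+} [-1/k, 1/k]$, $\bigcap_{k \in \mathbb{N}^+} \{-1/k, 1/k\}$.
2.1.10. Compute $\bigcup_{k \in \mathbb{N}^+} (-1/k, 1/k)$, $\bigcup_{k \in \mathbb{N}^+} [-1/k, 1/k]$, $\bigcup_{k \in \mathbb{N}^+} \{-1/k, 1/k\}$.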
2.2 Cartesian product
The set {a, b} is the same as the set {b, a}, as any element of either set is also an
element of the other set. Thus, the order of the listing of elements does not matter. But
sometimes we want the order to matter. We can then simply make another new notation
for ordered pairs, but in general it is not a good idea to be inventing many new notations
and concepts; it is better if we can reuse and recycle old ones. We do this next:
Definition 2.2.1. An ordered pair (a, b) is defined as the set {{a}, {a, b}}.
So here we defined (a, b) in terms of already known constructions: (a, b) is the set one
of whose elements is the set {a} with exactly one element a, and the remaining element of
(a, b) is the set {a, b} that has exactly two elements a, b if a ≠ b and has exactly one element
otherwise. Thus for example the familiar ordered pair (2, 3) really stands for {{2}, {2, 3}},
(3, 2) stands for {{3}, {2, 3}}, and (2, 2) stands for {{2}, {2, 2}} = {{2}, {2}} = {{2}}.
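Definition 2.2.1 can be modeled quite literally in Python with frozenset, the immutable set type that is allowed to be an element of another set. This is only a sketch; the function name ordered_pair is an assumption made here.

```python
def ordered_pair(a, b):
    """Model the ordered pair (a, b) as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


# Unordered sets ignore the order of listing, ...
assert {2, 3} == {3, 2}
# ... but the set-theoretic ordered pairs (2, 3) and (3, 2) differ.
assert ordered_pair(2, 3) != ordered_pair(3, 2)
# (2, 2) collapses to {{2}}, a set with exactly one element.
assert ordered_pair(2, 2) == frozenset({frozenset({2})})
```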
Definition 2.2.3. For any sets A and B, the Cartesian product A × B of A and B is
the set {(a, b) : a ∈ A and b ∈ B} of all ordered pairs where the first component varies over
all elements of A and the second component varies over all elements of B.
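For finite sets, Definition 2.2.3 can be tried out directly. In the Python sketch below the sets A and B are hypothetical examples with 4 and 3 elements, the function name cartesian_product is an assumption, and Python tuples stand in for ordered pairs.

```python
from itertools import product


def cartesian_product(A, B):
    """Return the set of all pairs (a, b) with a in A and b in B."""
    return {(a, b) for a, b in product(A, B)}


A = {1, 2, 3, 4}
B = {"x", "y", "z"}
AxB = cartesian_product(A, B)
print(len(AxB))  # 12
```

The count 12 = 4 · 3 reflects that each of the 4 choices for the first component can be combined with each of the 3 choices for the second.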
In general, one can think of A × B as the “rectangle” with A on the horizontal side
and B on the perpendicular side.
Say, if A has 4 elements and B has 3 elements, then A × B is represented by the
12 points in the rectangle with base consisting of elements of A and height consisting of
elements of B as follows:
[Figure: the 12 points of A × B arranged in a rectangle, with the elements of A along the base.]
If instead A and B are intervals as above, then A × B is the indicated rectangle.
When A and B extend infinitely far, then A × B is correspondingly a “large” rectangle:
The familiar real plane is the Cartesian product R × R.
(The three-dimensional space can be written as the Cartesian product R × (R ×
R) or the Cartesian product (R × R) × R. In the former case we write elements in the
form (a, (b, c)), and in the latter case we write them in the form ((a, b), c). Those extra
parentheses are there only for notation and to slow us down; they serve no better function,
so by convention we write elements simply in the form (a, b, c).)
2.3 Relations, equivalence relations
In this section we introduce relations in a formal way. Most relations that we eventually
analyze are of a familiar kind, such as ≤, <, is cousin, taller than, has same birth date,
et cetera, but we get more structure with a formal approach.
Examples 2.3.2.
(1) Some relations on R are ≤, <, =, ≥, >. We can write (1, 2) ∈ ≤, or more familiarly,
1 ≤ 2. As a subset of R × R, ≤ consists of all points on or above the line y = x.
This relation can be drawn (and read off) easily.
(2) Draw anything in R × R. That defines a relation on R (which most likely cannot
be expressed with a formula). The relation R = {(a, b) : a, b ∈ R and a^2 < b + 1}
is drawn as the set of all points (x, y) above the parabola y = x^2 − 1.
(3) The following are all the possible relations on A = {1, 2} and B = {a, b}:
{(1, a), (1, b), (2, a), (2, b)},
{(1, b), (2, a), (2, b)},
{(1, a), (2, a), (2, b)},
(4) A relation can be known by more than one name. Among the students in a typical
college classroom, the relations cousin and aunt are identical, i.e., no one student
is a cousin or aunt of another, so this relation equals {}, the empty relation.
Examples 2.3.4.
(1) ≤ on R is reflexive and transitive but not symmetric.
(2) < on R is transitive but not reflexive or symmetric.
(3) = on any set A is reflexive, symmetric, and transitive.
(4) Being a cousin is symmetric and neither reflexive nor transitive.
(5) Being taller than is ...
(6) Let A = {u, v}. Any equivalence relation on A needs to contain (u, u) and (v, v)
in order to achieve reflexivity. The relation
{(u, u), (v, v)}
is reflexive, symmetric and transitive. The relations
{(u, u), (v, v), (u, v)} and {(u, u), (v, v), (v, u)}
Definition 2.3.5. Let R be an equivalence relation on a set A. For each a ∈ A, the set
of all elements b of A such that aRb is called the equivalence class of a. We denote the
equivalence class of a with the shorthand [a].
For example, if R is the equality relation, then the equivalence class of a is {a}. If
R = A × A, then the equivalence class of any a is A. If A is the set of all students in
Math 112 this year, and aRb if students a and b are in the same section of Math 112, then
[a] is the set of all students that are in the same section as student a.
Proof. Let a, b ∈ A, and suppose that their equivalence classes have an element in common.
Call the element c.
We now prove that the equivalence class of a is a subset of the equivalence class
of b. Let d be any element in the equivalence class of a. Then aRd, aRc and bRc imply by
symmetry that dRa and cRb, so that by transitivity dRc. Then dRc, cRb and transitivity
give dRb, so that by symmetry bRd, which says that d is in the equivalence class of b. Thus
the equivalence class of a is a subset of the equivalence class of b.
A symmetric proof shows that the equivalence class of b is a subset of the equivalence
class of a, so that the two equivalence classes are identical.
Remark 2.3.7. What this says is that whenever R is an equivalence relation on a set A,
then every element of A is in a unique equivalence class. Thus A is the disjoint union of
distinct equivalence classes. Conversely, if A = ⋃_{i∈I} A_i where the A_i are pairwise disjoint,
define R ⊆ A × A as (a, b) ∈ R precisely if a and b are elements of the same A_i. Then R is
an equivalence relation: reflexivity and symmetry are obvious, and for transitivity, suppose
that a and b are in the same A_i and b and c are in the same A_j. Since A_i and A_j have the
element b in common, by the pairwise disjoint assumption necessarily i = j, so that a and
c are both in A_i. Thus R is an equivalence relation.
Example 2.3.8. Let A = {1, 2, 3, 4, 5}. The writing of A as {1, 2} ∪ {3, 4} ∪ {5} makes
the following equivalence relation on A:
{(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4), (5, 5)}.
This means that counting all the possible equivalence relations on A is the same as
counting all the possible writings of A as unions of pairwise disjoint subsets. (In contrast,
the number of all possible relations on a set A equals the number of subsets of A × A.)
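The passage from a partition to an equivalence relation in Remark 2.3.7 and Example 2.3.8 can be sketched in Python as follows. The function names are assumptions made here, relations are modeled as sets of tuples, and the three property checks simply test the defining conditions on a finite set.

```python
def relation_from_partition(blocks):
    """Relate a and b exactly when they lie in the same block."""
    return {(a, b) for block in blocks for a in block for b in block}


def is_reflexive(R, A):
    return all((a, a) in R for a in A)


def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)


def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)


A = {1, 2, 3, 4, 5}
R = relation_from_partition([{1, 2}, {3, 4}, {5}])
print(sorted(R))
# [(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4), (5, 5)]
print(is_reflexive(R, A), is_symmetric(R), is_transitive(R))  # True True True
```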
Important example 2.3.9. Let n be a positive integer. Let R be the relation on Z given
by aRb if a − b is a multiple of n. This relation is called congruence modulo n. It is
reflexive because for every a ∈ Z, a − a = 0 is an integer multiple of n. It is symmetric
because for all a, b ∈ Z, if aRb, then a − b = x · n for some integer x, and so b − a = (−x) · n,
and since −x is an integer, this proves that bRa. Finally, this relation R is transitive: let
a, b, c ∈ Z, and suppose that aRb and bRc. This means that a − b = x · n and b − c = y · n for
some integers x and y. Then a − c = a + (−b + b) − c = (a − b) + (b − c) = x · n + y · n = (x + y) · n,
and since x + y is an integer, this proves that aRc. Thus R is an equivalence relation. If aRb
for this relation R, we say that a is congruent to b modulo n, or that a is congruent
to b mod n. (Normally in the literature this is written as a ≡ b mod n.) We denote the
set of all equivalence classes with Z/nZ, and we read this as “Z mod n Z”. This set
consists of [0], [1], [2], . . . , [n − 1], [n] = [0], [n + 1] = [1], et cetera, so that Z/nZ has at
most n equivalence classes. Since any two distinct numbers among 0, 1, . . . , n − 1 have
absolute difference strictly between 0 and n, it follows that this difference is not an integer
multiple of n, so that [0], [1], [2], . . . , [n − 1] are distinct. Thus Z/nZ has exactly n equivalence classes. Two
natural lists of representatives of equivalence classes are 0, 1, 2, . . . , n − 1 and 1, 2, . . . , n.
(But there are infinitely many other representatives as well.)
For example, modulo 12, the equivalence class of 1 is the set {1, 13, 25, 37, . . .} ∪
{−11, −23, −35, . . .}, and the equivalence class of 12 is the set of all multiples of 12
(including 0).
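Congruence modulo n is easy to experiment with. The Python sketch below uses assumed function names and a hypothetical finite window of integers, since the actual equivalence classes are infinite; it relies on the fact that Python's % operator returns a value in 0, 1, ..., n − 1 for a positive modulus n, also for negative a.

```python
def is_congruent(a, b, n):
    """Return True if a - b is an integer multiple of the positive integer n."""
    return (a - b) % n == 0


def residue_classes(n, window):
    """Group the integers in `window` by their class in Z/nZ,
    indexed by the representatives 0, 1, ..., n - 1."""
    classes = {r: [] for r in range(n)}
    for a in window:
        classes[a % n].append(a)
    return classes


classes = residue_classes(12, range(-36, 38))
print(classes[1])  # [-35, -23, -11, 1, 13, 25, 37]
print(classes[0])  # [-36, -24, -12, 0, 12, 24, 36]
```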
In everyday life we use congruence modulo 12 (or sometimes 24) for hours, modulo
12 for months, modulo 7 for days of the week, modulo 4 for seasons of the year, modulo 3
for meals of the day ...
There are exactly two equivalence classes for the congruences modulo 2: one consists
of all the even integers and the other of all the odd integers. There is exactly one equivalence
class for the congruences modulo 1: all integers are congruent modulo 1 to each other. For
the congruences modulo 0, each equivalence class consists of precisely one element.
2.3.3. In each part below, find a relation with the given properties. You may contrive a
relation on a contrived set A.
i) Reflexive, but not symmetric and not transitive.
ii) Reflexive and symmetric, but not transitive.
iii) Reflexive and transitive, but not symmetric.
iv) Symmetric, but not reflexive and not transitive.
v) Transitive, but not symmetric and not reflexive.
vi) Transitive and symmetric, but not reflexive.
2.3.4. Let A be a set with 2 elements. Count all equivalence relations on A. Repeat first
for A with 3 elements, then for A with 4 elements.
2.3.5. Let A be a set with n elements. Let R be an equivalence relation on A with fewest
members. How many members does R have?
2.3.6. Let R be the relation on R given by aRb if a − b is a rational number. Prove that
R is an equivalence relation. Find at least three disjoint equivalence classes.
2.3.7. Let R be the relation on R given by aRb if a − b is an integer.
i) Prove that R is an equivalence relation.
ii) Prove that for any a ∈ R there exists b ∈ [0, 1) such that [a] = [b].
2.3.8. Let R be a relation on R × R given by (a, b)R(c, d) if and only if a − c and b − d are
integers.
i) Prove that R is an equivalence relation.
ii) Prove that for any (a, b) ∈ R × R there exists (c, d) ∈ [0, 1) × [0, 1) such that
[(a, b)] = [(c, d)].
iii) Prove that the set of equivalence classes can be identified with [0, 1) × [0, 1).
iv) For fun: check out the video game Asteroids online for a demonstration of this
equivalence relation. Do not get addicted to the game.
2.3.9. Let A be the set of all lines in the plane.
i) Prove that the relation “is parallel to” is an equivalence relation. Note that the
equivalence class of a non-vertical line can be identified by the (same) slope of the
lines in that class. Note that the vertical lines are in their own equivalence class.
ii) Prove that the relation “is perpendicular to” is not an equivalence relation.
2.3.10. For (a, b), (a′, b′) ∈ Z × Z \ {0}, define (a, b) o (a′, b′) if a · b′ = a′ · b.
i) Prove that o is an equivalence relation. (Possibly mimic Example 2.3.10.)
ii) Describe the equivalence classes of (0, 1), (1, 1), (2, 3).
iii) Find a natural identification between the equivalence classes and elements of Q.
2.4 Functions
Definition 2.4.1. Let A and B be sets. A function from A to B is a rule that assigns to
each element of A a unique element of B. We express this with “f : A → B is a function.”
The set A is the domain of f and B is the codomain of f . The range or image of f is
Image(f ) = Range(f ) = {b ∈ B : b = f (a) for some a ∈ A}.
But in the spirit of introducing few new notions, let’s instead define functions with
the concepts we already know. Convince yourself that the two definitions are the same:
Definition 2.4.2. Let A and B be sets. A relation f on A and B is a function if for all
a ∈ A there exists b ∈ B such that (a, b) ∈ f and if for all (a, b), (a, c) ∈ f , b = c. In this
case we say that A is the domain of f , B is the codomain of f , and we write f : A → B.
The range of f is Range(f ) = {b ∈ B : there exists a ∈ A such that (a, b) ∈ f }.
Note that this second formulation is also familiar: it gives us all elements of the
graph of the function: b = f (a) if and only if (a, b) is on the graph of f . We freely change
between notations f (a) = b and (a, b) ∈ f .
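Definition 2.4.2 translates into a direct test for finite sets. The Python sketch below models a relation as a set of pairs; the names is_function and image and the sample data are assumptions made here for illustration.

```python
def is_function(f, A, B):
    """Check whether the relation f on A and B is a function from A to B:
    every a in A is paired with exactly one b in B."""
    if not all(a in A and b in B for (a, b) in f):
        return False  # f is not even a relation on A and B
    for a in A:
        values = {b for (x, b) in f if x == a}
        if len(values) != 1:
            return False  # a has no value, or more than one value
    return True


def image(f):
    """Return the range of f, the set of all second components."""
    return {b for (_, b) in f}


A = {1, 2, 3}
B = {1, 2, 3}
f = {(1, 1), (2, 1), (3, 2)}
print(is_function(f, A, B), image(f))  # True {1, 2}
```

A relation such as {(1, 1), (1, 2), (2, 1), (3, 2)} fails the test because 1 is paired with two different elements, and {(1, 1), (2, 1)} fails because 3 is paired with nothing.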
One should be aware that if f is a function, then f (x) is an element of the range
and is not by itself a function. (But often we speak loosely of f (x) being a function, such
as “x^2 is a function”.)
To specify a function one needs to present its domain and its codomain, and to
show what the function does to each element of the domain.
Examples 2.4.3.
(1) A function can be given with a formula.
For example, let f : R → R be given by f(x) = 1/(1 + x^2). The range is (0, 1]: For all
x, 1 + x^2 ≥ 1 with equality when x = 0. Thus f(x) ∈ (0, 1]. For any y ∈ (0, 1],
1/y ≥ 1, so 1/y − 1 ≥ 0, so x = √(1/y − 1) is a non-negative real number and f(x) = y.
So indeed the range of f is (0, 1].
(2) Here are formula definitions of two functions with domains [0, ∞): f(x) = √x and
g(x) = −√x. Note, however, that h(x) = ±√x is NOT a function!
(3) There may be more than one formula for a function, each of which is applied to
distinct elements of the domain. For example, define f : N+ → Z by
$$f(n) = \begin{cases} \dfrac{n-1}{2}, & \text{if $n$ is odd;} \\ -\dfrac{n}{2}, & \text{if $n$ is even.} \end{cases}$$
(4) Let f : N+ → R be given by the description that f (n) equals the nth prime.
By Euclid’s theorem (proved on page 33) there are infinitely many primes so that
f is indeed defined for all positive integers. We know that f (1) = 2, f (2) = 3,
f (3) = 5, and with a computer’s help I get that f (100) = 541, f (500) = 3571. There
is no algebraic formula for the nth prime.
(5) For any set A, the identity function id_A : A → A takes each x to itself.
(6) Let b ∈ B. A function f : A → B given by f (a) = b for all a is called a constant
function.
(7) The constant function f : R → R given by f (x) = 1 for all x is not the identity
function.
(8) A function may be presented by a table. Here is an example.
x f (x)
1 1
2 1
3 2
(9) A function may be presented in a pie chart, histogram, with words, in a weather
map...
(10) A function f : N+ → R can be given recursively, such as the Fibonacci numbers
f (1) = 1, f (2) = 1, and for all n ≥ 2, f (n+1) = f (n)+f (n−1). See Exercise 1.6.30
for more on these numbers.
(11) A function can be given by its graph:
From this particular graph we surmise that f (0) = 0, but with the precision of
the drawing and our eyesight it might be the case that f (0) = 0.000000000004.
Without any further labels on the axes we cannot estimate the numerical values
of f at other points. Typically the graph should be filled in with more information.
The arrows on the graphs indicate increasing values.
[Figure: graph of y = f(x) with labeled axes; the x-axis is marked from −3 to 5 and the y-axis from 1 to 3.]
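Two of the functions in Examples 2.4.3, the nth prime from (4) and the recursively given Fibonacci numbers from (10), can be computed with short programs. The Python sketch below is one possible way to do it; the function names are assumptions, and the trial-division method for primes is a simple choice made here, not a method taken from the text.

```python
def nth_prime(n):
    """Return the nth prime for a positive integer n, by trial division."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime if no smaller prime p with p*p <= candidate divides it
        if all(candidate % p != 0 for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes[-1]


def fibonacci(n):
    """Return f(n) where f(1) = f(2) = 1 and f(n+1) = f(n) + f(n-1)."""
    previous, current = 1, 1  # f(1), f(2)
    for _ in range(n - 2):
        previous, current = current, previous + current
    return current if n >= 2 else previous


print(nth_prime(100), nth_prime(500))      # 541 3571
print([fibonacci(n) for n in range(1, 11)])
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

The Fibonacci function uses a loop rather than a direct recursive call so that each value is computed only once.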
Remark 2.4.4. Two functions are the same if they have the same domains, the same
codomains, and if to each element of the domain they assign the same element of the
codomain.
For example, f : R → R and g : R → [0, ∞) given by f(x) = x^2 and g(x) = x^2 are
not the same! On the other hand, the functions h, k : R → R given by h(x) = |x| and
k(x) = √(x^2) are the same.
Notation 2.4.5. It is common to not specify the domain, in which case the domain is
implicitly the largest possible subset of R on which the function is defined. For example, the
domain of f(x) = 1/x is the set of all non-zero real numbers, and the domain of f(x) = √x is
the set of all non-negative real numbers. (After we introduce complex numbers the domain
will implicitly be the largest possible subset of C on which the function is defined, see
page 109.)
Sometimes instead we want to take smaller domains than possible:
For example, the identity function is always bijective. Constant functions are injective
when the domain consists of one element, and are surjective when the codomain
consists of one element. The function f : R → R_{≥0} given by f(x) = x^2 is not injective
because f(−1) = 1 = f(1). It is, however, surjective. The function f : R_{≥0} → R given
by f(x) = √x is injective because the square root function is strictly increasing (more
about that in Section 2.9), but it is not surjective because −1 is not the square root of
any non-negative real number. The function f : R_{≥0} → R_{≥0} given by f(x) = √x is both
injective and surjective, thus bijective.
The following are all the possible functions f : {1, 2} → {1, 2}, and they are given in
tabular form:
x f (x) x f (x) x f (x) x f (x)
1 1 1 1 1 2 1 2
2 1 2 2 2 1 2 2
The first and the last are neither injective nor surjective, but the middle two are bijective.
The following are the eight possible functions f : {1, 2, 3} → {1, 2}:
In this case no functions are injective, and all non-constant ones are surjective.
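The counts in these small cases can be confirmed by listing all functions between two finite sets. In the Python sketch below a function is modeled as a dictionary from domain elements to values; the names all_functions, is_injective, and is_surjective are assumptions made here.

```python
from itertools import product


def all_functions(A, B):
    """Return all functions from the finite set A to the finite set B,
    each represented as a dict mapping elements of A to elements of B."""
    domain = sorted(A)
    return [dict(zip(domain, values))
            for values in product(sorted(B), repeat=len(domain))]


def is_injective(f):
    """Distinct inputs have distinct outputs."""
    return len(set(f.values())) == len(f)


def is_surjective(f, B):
    """Every element of the codomain B is a value of f."""
    return set(f.values()) == set(B)


functions = all_functions({1, 2, 3}, {1, 2})
print(len(functions))                                # 8
print(sum(is_injective(f) for f in functions))       # 0
print(sum(is_surjective(f, {1, 2}) for f in functions))  # 6
```

The six surjective functions are exactly the non-constant ones, in agreement with the statement above.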
Proof. Let c ∈ C. Since g is surjective, there exists b ∈ B such that g(b) = c. Since f is
surjective, there exists a ∈ A such that f (a) = b. Thus (g ◦ f )(a) = g(f (a)) = g(b) = c, so
that g ◦ f is surjective.
The last statement follows from the first part and Theorem 2.4.11.
Definition 2.4.13. We have seen polynomial functions in Section 1.5: recall that for any
subset S ⊆ R, a function f : S → R is a polynomial function if there exist a non-negative
integer n and c_0, c_1, . . . , c_n ∈ R such that for all x ∈ S, f(x) = c_0 + c_1 x + c_2 x^2 + · · · + c_n x^n.
A function f : S → R is a rational function if there exist polynomial functions f_1, f_2 :
S → R such that for all x ∈ S, f_2(x) ≠ 0 and f(x) = f_1(x)/f_2(x).
Similarly there are polynomial and rational functions if all occurrences of “R” above
are replaced by “Q” or “C” .
Polynomial and rational functions are a workhorse of analysis. Below are some special
properties. Further special properties of polynomial functions in greater generality appear
in Exercise 2.6.13. (The reader has of course encountered trigonometric, exponential, and
logarithmic functions, which are not polynomial or rational.)
(This meaning of “root” is different from the meaning of “root” in “square root”,
“cube root” or “100th root” of a number.)
Theorem 2.4.15. If a polynomial function with real coefficients is not constant zero,
then it has only finitely many roots. Specifically, if f(x) = c_0 + c_1 x + c_2 x^2 + · · · + c_n x^n,
then the number of roots is at most n.
The domain of a rational function is the complement of a finite subset of R.
(The same statement and proof work if the coefficients are complex numbers; complex
numbers are introduced in Chapter 3.)
Proof. At least one of the c_i is non-zero since f(x) is not constant 0. By possibly renaming we
may assume that c_n ≠ 0, and by the non-constant assumption, n ≥ 1. If n = 1, then f has
only one root, namely −c_0/c_1. Suppose that n ≥ 2.* In general, let a be any root of f . By
the Euclidean algorithm (Example 1.6.5), there exist polynomial functions q and r such
that
f (x) = q(x)(x − a) + r(x),
and the degree of r is strictly smaller than 1, i.e., r(x) is a constant. But if we plug x = a
into both sides, since a is a root of f , we get that r is the constant zero function, so that
(You are aware that the trigonometric functions sine and cosine have infinitely many
zeroes and that tangent and cotangent are not defined at infinitely many real numbers. This
fact, together with the previous theorem, establishes that these four trigonometric functions
are not polynomial or rational functions. For similar reasons the logarithmic functions are
not polynomial or rational. We have to work harder to prove that the exponential functions
are not polynomial or rational.)
2.4.9. For each of the following functions, state the domains, codomains, and how they
are compositions of f, g, h from the previous exercise.
i) {(2, 2), (4, 2)}.
ii) {(9, 4), (3, 2), (6, 2)}.
iii) {(2, 6), (1, 3)}.
2.4.10. Let A be a set with 3 elements and B be a set with 2 elements.
i) Count all functions from A to B.
ii) Count all injective functions from A to B.
iii) Count all surjective functions from A to B. Repeat for functions from B to A.
iv) Count all bijective functions from A to B.
v) Count all functions from A to B that are injective but not surjective.
vi) Count all functions from A to B that are surjective but not injective.
vii) Count all functions from A to B that are neither surjective nor injective.
2.4.11. In each part below, find f : R → R with the specified condition.
i) f is a bijective function.
ii) f is neither injective nor surjective.
iii) f is injective, but not surjective.
iv) f is surjective, but not injective.
2.4.12. The pigeonhole principle states that if n items (such as pigeons) are put into
m holes with n > m, then at least one hole has more than one item. Let A and B be sets
with only finitely many elements.
i) Use the pigeonhole principle to demonstrate that if A has more elements than B,
then f : A → B cannot be injective.
ii) Use the pigeonhole principle to demonstrate that if A has fewer elements than B,
then f : A → B cannot be surjective.
iii) Use the pigeonhole principle to demonstrate that if A and B do not have the same
number of elements, then f : A → B cannot be bijective.
2.4.13. Let A be a set with m elements and B a set with n elements. (To solve this
problem, you may want to first examine the case m = 1, then m = 2, followed by m = 3,
or possibly you have to start with small n, after which you will probably see a pattern.
Once you have the correct pattern, the proof is straightforward.)
i) Count all functions from A to B.
ii) For which combinations of m, n are there injective functions from A to B?
iii) For m, n as in (ii), count all injective functions from A to B.
iv) For which combinations of m, n are there surjective functions from A to B?
v) For m, n as in (iv), count all surjective functions from A to B.
vi) For which combinations of m, n are there bijective functions from A to B?
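When looking for the pattern suggested in Exercise 2.4.13, it can help to generate data for small sets by brute force. The Python sketch below is only an aid for experimentation: the function count_functions and the choice to model the sets as {0, ..., m − 1} and {0, ..., n − 1} are assumptions of this example. It lists every function as a tuple of values and counts those that are injective, surjective, or bijective.

```python
from itertools import product


def count_functions(m, n):
    """Count functions from an m-element set to an n-element set by enumeration."""
    codomain = range(n)
    counts = {"all": 0, "injective": 0, "surjective": 0, "bijective": 0}
    # A function is one choice of a codomain value for each of the m inputs.
    for values in product(codomain, repeat=m):
        injective = len(set(values)) == m
        surjective = set(values) == set(codomain)
        counts["all"] += 1
        counts["injective"] += injective
        counts["surjective"] += surjective
        counts["bijective"] += injective and surjective
    return counts


# Tabulate a few small cases to look for a pattern.
for m, n in [(1, 4), (2, 4), (4, 2)]:
    print(m, n, count_functions(m, n))
```

The enumeration grows like n^m, so it is only practical for small m and n, and the counts it produces are data for a conjecture rather than a proof.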
10^{10^{10}}. We know from page 33 that such a prime exists, so this f is well-defined
as a function. However, we do not know its numerical value due to computational
limitations, which makes this function useless for some purposes.
2.4.30. Let A be the set of all differentiable functions f : R → R. Define a relation R on
A by f Rg if f (0) = g(0).
i) Prove that R is an equivalence relation.
(ii)* Let S be the set of all equivalence classes. Define F : S → R by F ([f ]) = f (0).
Prove that F is a well-defined function, in the sense that if [f ] = [g], then F ([f ]) =
F ([g]). Also prove that F is bijective.
(In this chapter we use “e” for the identity element; this is a symbol that is unrelated
to the base of the exponential function as in Definition 7.6.6.)
Theorem 2.5.5. Let ◦ be a binary operation on A. Suppose that e and f are both
identities for ◦. Then e = f . In other words, if an identity exists for a binary operation, it
is unique. Hence we talk about the identity for ◦.
Proof. Since for all a ∈ A, e ◦ a = a, we get in particular that e ◦ f = f . Also, for every
a ∈ A, a ◦ f = a, hence e ◦ f = e. Thus e = e ◦ f = f .
Note: we used symmetry and transitivity of the equality relation.
Definition 2.5.6. Let ◦ be a binary operation on A and suppose that e is its identity. Let
x be an element of A. An inverse of x is an element y ∈ A such that x ◦ y = e = y ◦ x.
To emphasize what the operation is, we may also say that y is a ◦-inverse of x (or see
specific terms below).
(6) Here is a new binary operation ◦ on the set S = {a, b, c, d} presented via its
◦-table:
◦ a b c d
a a b c d
b b c d a
c c d a b
d d a b c
Note that a is the identity element, the inverse of a is a, the inverse of b is d, the
inverse of c is c, the inverse of d is b.
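The claims in example (6) can be checked mechanically from the ◦-table. The following Python sketch is an illustration under stated assumptions: the dictionary encoding of the table and the helper names (op, find_identity, inverses_of, is_associative) are chosen for this example and do not come from the text.

```python
# The table of example (6), stored so that TABLE[x][y] = x ◦ y.
ELEMENTS = "abcd"
ROWS = {"a": "abcd", "b": "bcda", "c": "cdab", "d": "dabc"}
TABLE = {x: dict(zip(ELEMENTS, ROWS[x])) for x in ELEMENTS}


def op(x, y):
    """Look up x ◦ y in the table."""
    return TABLE[x][y]


def find_identity():
    """Return an element e with e ◦ x = x = x ◦ e for all x, or None."""
    for e in ELEMENTS:
        if all(op(e, x) == x == op(x, e) for x in ELEMENTS):
            return e
    return None


def inverses_of(x, e):
    """Return all y with x ◦ y = e = y ◦ x."""
    return [y for y in ELEMENTS if op(x, y) == e == op(y, x)]


def is_associative():
    """Check (x ◦ y) ◦ z = x ◦ (y ◦ z) for all triples."""
    return all(
        op(op(x, y), z) == op(x, op(y, z))
        for x in ELEMENTS for y in ELEMENTS for z in ELEMENTS
    )


e = find_identity()
print("identity:", e)
for x in ELEMENTS:
    print("inverses of", x, ":", inverses_of(x, e))
print("associative:", is_associative())
```

The same helpers work for any finite table once ELEMENTS and ROWS are replaced.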
Definition 2.5.11. We say that x is invertible if x has an inverse. The (abstract) inverse
is usually denoted x^{-1}.
You may want to find four specific integers a, b, c, d for which you get four distinct values
with different placements of parentheses.
Notation 2.5.13. Just as for functions in Remark 2.4.8, for an arbitrary associative
binary operation ◦ we abbreviate a ◦ a with a^2, (a ◦ a) ◦ a with a^3, et cetera, and in general
for all positive integers n we write
a^n = a^{n−1} ◦ a = a ◦ a^{n−1}.
This notation is familiar when ◦ equals multiplication: then 2^5 stands for 32. When
◦ is addition, then the abstract “2^5” stands for 10, but of course we prefer to not write 10
this way; instead we write it in the additive notation 2 + 2 + 2 + 2 + 2, or briefly, 5 · 2. The
empty product a^0 makes sense if the set has the identity, and in that case a^0 = e. (See
Exercise 1.5.6 for the first occurrence of an empty product.)
If a has a multiplicative inverse, then a^{-1} is that inverse, and in that case if ◦ is also
associative,
a^{-n} = a^{-(n−1)} ◦ a^{-1}.
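The power notation depends only on the operation, which a short computation makes concrete. In the Python sketch below, the function power and its optional identity argument are assumptions made for illustration; the operation is passed in as an ordinary two-argument function.

```python
import operator


def power(op, a, n, identity=None):
    """Compute the n-fold power of a under an associative operation op.

    For n = 0 the result is the identity element, which the caller must supply.
    """
    if n < 0:
        raise ValueError("this sketch handles only non-negative exponents")
    if n == 0:
        if identity is None:
            raise ValueError("the empty product needs an identity element")
        return identity
    result = a
    for _ in range(n - 1):
        result = op(result, a)  # the next power is the previous power ◦ a
    return result


print(power(operator.mul, 2, 5))  # multiplication: 32
print(power(operator.add, 2, 5))  # addition: 10
```

The two printed values are the 32 and the 10 discussed above: the same abstract power, read with two different operations.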
Theorem 2.5.15. If x is invertible, then its inverse is also invertible, and the inverse of
the inverse is x.
Proof. By definition of inverses of x, x^{-1} ◦ x = e = x ◦ x^{-1}, which also reads as “the inverse
of x^{-1} is x.”
We end this section with an important example. The reader is familiar with manip-
ulations below when n = 12 or n = 24 for hours of the day (we do not say “28 o’clock”),
when n = 7 for days of the week, when n = 3 for standard meals of the day, et cetera.
Important example 2.5.19. Let n be a positive integer. Recall the definition of Z/nZ
from Example 2.3.9: elements are equivalence classes [0], [1], [2], . . . , [n − 1]. Define + on
Z/nZ as follows: [a] + [b] = [a + b]. Well, first of all, is this even a function? Namely, we
need to verify that whenever [a] = [a′] and [b] = [b′], then [a + b] = [a′ + b′], which says
that any choice of representatives gives the same final answer. Well, a − a′ and b − b′ are
integer multiples of n, hence (a + b) − (a′ + b′) = (a − a′) + (b − b′) is a sum of two multiples
of n and hence also a multiple of n. Thus [a + b] = [a′ + b′], which says that + is indeed a
binary operation on Z/nZ. It is straightforward to verify that + on Z/nZ is commutative
and associative, the identity element is [0], and every element [a] has an additive inverse
[−a] = [n − a].
Similarly, we can define · on Z/nZ as follows: [a] · [b] = [a · b]. It is left to the
reader to verify that this is a binary operation that is commutative and associative, and that
the identity element is [1]. The multiplication tables for n = 2, 3, 4 are below, where, for ease of
notation, we abbreviate “[a]” with “a”:
Z/2Z:
· 0 1
0 0 0
1 0 1
Z/3Z:
· 0 1 2
0 0 0 0
1 0 1 2
2 0 2 1
Z/4Z:
· 0 1 2 3
0 0 0 0 0
1 0 1 2 3
2 0 2 0 2
3 0 3 2 1
Note that for multiplication in Z/4Z, [3] = [−1] is the multiplicative inverse of itself, and
[2] has no multiplicative inverse.
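The tables above, and the remark about inverses in Z/4Z, can be reproduced with a few lines of code. This Python sketch is an illustration only: it represents each class [a] by its representative in {0, ..., n − 1}, and the function names are assumptions of this example.

```python
def multiplication_table(n):
    """Rows of the multiplication table of Z/nZ, with [a] written as a."""
    return [[(a * b) % n for b in range(n)] for a in range(n)]


def invertible_classes(n):
    """Representatives a in 0..n-1 whose class has a multiplicative inverse."""
    one = 1 % n  # in Z/1Z the class [1] equals [0]
    return [a for a in range(n) if any((a * b) % n == one for b in range(n))]


def addition_is_well_defined(n, multiples=range(-3, 4)):
    """Spot-check that [a + b] does not depend on the chosen representatives.

    Each representative is shifted by sampled multiples of n; this is a finite
    check and not a substitute for the argument in the text.
    """
    for a in range(n):
        for b in range(n):
            for s in multiples:
                for t in multiples:
                    a2, b2 = a + s * n, b + t * n  # other representatives
                    if (a + b) % n != (a2 + b2) % n:
                        return False
    return True


for row in multiplication_table(4):
    print(*row)
print("invertible in Z/4Z:", invertible_classes(4))
```

For n = 4 the list of invertible classes is [1, 3], in agreement with the note that [3] is its own inverse and [2] has none.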
2.5.3. Find a set A with a binary operation ◦ such that for some invertible f, g ∈ A,
(g ◦ f)^{-1} ≠ g^{-1} ◦ f^{-1}. (This is sometimes called the socks-and-shoes problem. Say why.)
2.5.4. Refer to Example 2.5.19: Write addition tables for Z/2Z, Z/3Z, Z/4Z.
2.5.5. Write multiplication tables for Z/5Z, Z/6Z, Z/7Z.
2.5.6. Determine all [a] ∈ Z/12Z that have a multiplicative inverse.
*2.5.7. Determine all [a] ∈ Z/pZ that have a multiplicative inverse if p is a prime number.
2.5.8. Here is an opportunity to practice induction. Let f, g : A → A be functions, with
f invertible. Prove that for all positive integers n, (f ◦ g ◦ f^{-1})^n = f ◦ g^n ◦ f^{-1}.
(Notation is from Remark 2.4.8.)
2.5.9. Consider the following binary operation ◦ on {a, b, c}:
◦ a b c
a a b c
b b a a
c c a a
i) Show that ◦ is commutative and has an identity element.
ii) Show that every element has an inverse and that inverses need not be unique.
iii) Prove that ◦ is not associative. (Hint: Theorem 2.5.10.)
*2.5.10. Let S be the set of all logical statements. We define a relation ∼ on S as follows:
P ∼ Q if P and Q have the same truth values in all conditions. For example, “1 = 1” and
“2 = 2” are related via ∼.
i) Prove that ∼ is an equivalence relation on S. Let A be the set of all equivalence
classes.
ii) Verify that “and”, “or” and “xor” are naturally binary operations on A.
iii) Find the identity elements, if they exist, for each of the three binary operations.
iv) For each binary operation above with identity, which elements of A have inverses?
2.5.11. Define a binary operation ⊕ on R as a ⊕ b = a + b + 2, where + is the ordinary
addition on R. Prove that ⊕ is commutative and associative. Find the identity element of
⊕, and for each x ∈ R, find its inverse.
2.5.12. Define a binary operation ⊙ on R as a ⊙ b = a·b + 2·a + 2·b + 2, where + and · are the
ordinary addition and multiplication on R. Prove that ⊙ is commutative and associative.
Find the identity element of ⊙, and for each x ∈ R \ {−2}, find its inverse.
2.6 Fields
The motivation for the abstract definition of fields below comes from the familiar
properties of the set of all real numbers.
Definition 2.6.1. A set F is a field if it has two binary operations on it, typically denoted
+ and ·, and special elements 0, 1 ∈ F , such that the following identities hold for all
m, n, p ∈ F :
(1) (Additive identity) m + 0 = m = 0 + m.
(2) (Associativity of addition) m + (n + p) = (m + n) + p.
(3) (Commutativity of addition) m + n = n + m.
(4) (Multiplicative identity) m · 1 = m = 1 · m.
(5) (Distributivity) m · (n + p) = (m · n) + (m · p).
(6) (Associativity of multiplication) m · (n · p) = (m · n) · p.
(7) (Commutativity of multiplication) m · n = n · m.
(8) (Existence of additive inverses) There exists r ∈ F such that m + r = r + m = 0.
(9) (Existence of multiplicative inverses) If m ≠ 0, there exists r ∈ F such that
m · r = r · m = 1.
(10) 1 ≠ 0.
0 is called the additive identity and 1 is called the multiplicative identity.
The familiar N+ , N0 , Z, Q, R all have the familiar binary operations + and · on them.
Among these, N+ lacks the additive identity, but all others have the additive identity 0. In
N0 , all non-zero elements lack additive inverses, and in Z, all non-zero elements other than
1 and −1 lack a multiplicative inverse. Thus N+ , N0 and Z are not fields.
We take it for granted that Q and R are fields. In Section 3.1 we construct a new
field, the field of complex numbers. There are many other fields out there, such as the set
of all real-valued rational functions with real coefficients. A few fields are developed in the
exercises to this section.
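For a finite set, the axioms of Definition 2.6.1 can be tested exhaustively. The Python sketch below does this for Z/nZ with the operations of Example 2.5.19; the function name is_field_mod and the use of representatives 0, ..., n − 1 are assumptions of this example. Such a check covers one value of n at a time, so it can support a conjecture about which n work but does not replace a proof.

```python
from itertools import product


def is_field_mod(n):
    """Test axioms (1)-(10) of Definition 2.6.1 for Z/nZ by exhaustive search."""
    elements = range(n)
    zero, one = 0, 1 % n

    def add(a, b):
        return (a + b) % n

    def mul(a, b):
        return (a * b) % n

    if one == zero:  # axiom (10)
        return False
    for m in elements:
        if not (add(m, zero) == m == add(zero, m)):  # axiom (1)
            return False
        if not (mul(m, one) == m == mul(one, m)):  # axiom (4)
            return False
        if not any(add(m, r) == zero == add(r, m) for r in elements):  # axiom (8)
            return False
        if m != zero and not any(mul(m, r) == one == mul(r, m) for r in elements):
            return False  # axiom (9)
    for m, k, p in product(elements, repeat=3):
        if add(m, add(k, p)) != add(add(m, k), p):  # axiom (2)
            return False
        if add(m, k) != add(k, m):  # axiom (3)
            return False
        if mul(m, add(k, p)) != add(mul(m, k), mul(m, p)):  # axiom (5)
            return False
        if mul(m, mul(k, p)) != mul(mul(m, k), p):  # axiom (6)
            return False
        if mul(m, k) != mul(k, m):  # axiom (7)
            return False
    return True


print([n for n in range(1, 12) if is_field_mod(n)])
```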
Notation 2.6.2. By Section 2.5, we know that the additive and multiplicative identities
and inverses are unique in a field. The additive inverse of m is denoted −m, and the
multiplicative inverse of a non-zero m is denoted m^{-1}, or also 1/m. The sum n + (−m) of
n and −m is also written as n − m, and the product n · m^{-1} of n and m^{-1} is also written as
n/m. The latter two operations are also called subtraction and division. The functions
− and ^{-1} are unary (see definition on page 76) with domains F and F \ {0}, respectively.
By Theorem 2.5.15, −(−m) = m, and for any non-zero m, 1/(1/m) = (m^{-1})^{-1} = m.
It is standard to omit “·” when no confusion arises. Note that 2 · 222 + 4 is different
from 2 222 + 4, but 2 · x + 4 is the same as 2x + 4.
Another bit of notation: · takes precedence over addition, so that “(a · b) + c” can be
written simply as “a · b + c”, or with the omission of the multiplication symbol, as “ab + c”.
Theorem 2.6.3. (The other distributive property) If F is a field, then for all m, n, p ∈ F ,
(m + n)p = mp + np.
2.6.1. Verify that the set {0} satisfies axioms (1)–(9) of fields, with 0 being the additive
and the multiplicative identity. Obviously {0} fails axiom (10).
2.6.2. Use the set-up in Example 2.3.9. Prove that Z/2Z is a field. Note that in this field
[2] = [0], so [2] does not have a multiplicative inverse. Also note that in this field, every
number has a square and cube root.
2.6.3. Use the set-up in Example 2.3.9. Prove that Z/3Z is a field. Note that in this field
[3] = [0], so [3] does not have a multiplicative inverse. Note that in this field, [2] is not the
square of any number.
2.6.4. Use the set-up in Example 2.3.9, and let n be a positive integer strictly bigger
than 1 that is not a prime integer. Prove that Z/nZ is not a field.
*2.6.5. Use the set-up in Example 2.3.9. Prove that Z/pZ is a field for any prime integer p.
Note that in Z/7Z, [2] is the square of [3] and of [4].
2.6.6. Prove using only the axioms of fields that for any x in a field, (−1) · x is the additive
inverse of x.
2.6.7. Let F be a field. Prove that for any x ∈ F , −(−x) = x. Prove that for any non-zero
x ∈ F , 1/(1/x) = x.
2.6.8. Let F be a field. Prove that for any x, y ∈ F , (−x) · y = −(x · y) = x · (−y). (Hint:
Use the definition of additive inverses.)
2.6.9. Let F be a field. Prove that for any x, y ∈ F , (−x) · (−y) = x · y.
2.6.10. Let x be a non-zero element of a field F. Prove that (−x)^{-1} = −(x^{-1}).
2.6.11. Let A be a set and F a field. For any functions f, g : A → F we define new
functions f + g, f · g : A → F as (f + g)(x) = f(x) + g(x) and (f · g)(x) = f(x) · g(x).
Here, the second + and · are the binary operations on F , and the first + and · are getting
defined. Let S be the set of all functions from A to F .
i) Prove that + and · are binary operations on S.
ii) If A = F , then S includes polynomial functions. Let T be the set of all polynomial
functions from F to F . Prove that +, · and ◦ are binary operations on T .
2.6.12. Let F be a field and n a non-negative integer. We define exponentiation by n
to be a function f : F → F given as f(x) = x^n, where x^0 = 1 for all x and where for
positive n, x^n = x · x^{n−1}. In this exponentiation, n is called the exponent, or power, and
x is called the base.
i) Prove by induction on n that exponentiation is a well-defined function.
ii) We want to define exponentiation by negative integers as well. What is the largest
subset D of F such that x^{-1} is defined for all x ∈ D? Is D = F ? Why or why not?
iii) Prove that for any integers m and n, if x ∈ D, then (x^m)^n = (x^n)^m.
2.6.13. (Euclidean algorithm over arbitrary fields) Let F be a field. Let f(x) = a_0 + a_1 x +
· · · + a_n x^n and g(x) = b_0 + b_1 x + · · · + b_m x^m for some a_0, a_1, . . . , a_n, b_0, b_1, . . . , b_m ∈ F and
with a_n b_m ≠ 0.
i) Suppose that m, n ≥ 1. Prove that there exist polynomials q(x) and r(x) such that
f (x) = q(x) · g(x) + r(x) and such that the degree of r(x) is strictly smaller than m.
ii) Prove that there are at most n elements c in F such that f (c) = 0.
2.6.14. (Degree of a polynomial function) Let F be an infinite field. Let f : F → F be a
polynomial function given by f(x) = a_0 + a_1 x + · · · + a_n x^n for some non-negative integer n
and a_0, a_1, . . . , a_n ∈ F.
i) Suppose that f is the zero function. Prove that a_0 = a_1 = · · · = a_n = 0. (Hint:
Exercise 2.6.13.)
ii) Prove that the coefficients a_0, a_1, . . . , a_n are uniquely determined. In particular,
the degree of a polynomial function is uniquely determined.
2.6.15. (Degree of a polynomial function)
i) Let f, g : Z/2Z → Z/2Z be defined by f(x) = x^2, g(x) = x. Show that f and g are
the same function, although they are given by polynomials of different degrees.
ii) Find a non-zero polynomial p(x) of degree 3 that equals the zero function on Z/2Z.
2.6.16. (An unusual field.) Let ⊕ and ⊙ be binary operations on R as defined in
Exercises 2.5.11 and 2.5.12. Prove that R is a field with these two binary operations.
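Before writing proofs for Exercise 2.6.16, one can test conjectures about the two operations on sample values. The Python sketch below is an experiment aid under stated assumptions: it takes the first operation to be a + b + 2 as in Exercise 2.5.11 and assumes that the second operation is a·b + 2·a + 2·b + 2; the function names and the sample set of rational numbers are hypothetical choices for this example. The helper is_identity lets the reader test a candidate identity element without the code giving it away.

```python
from fractions import Fraction
from itertools import product


def oplus(a, b):
    """First operation: a + b + 2 (Exercise 2.5.11)."""
    return a + b + 2


def odot(a, b):
    """Second operation, assumed here to be a*b + 2a + 2b + 2."""
    return a * b + 2 * a + 2 * b + 2


# Exact rational sample points; a finite sample is evidence, not a proof.
SAMPLES = [Fraction(k, 3) for k in range(-9, 10)]


def is_commutative(op, samples=SAMPLES):
    return all(op(a, b) == op(b, a) for a, b in product(samples, repeat=2))


def is_associative(op, samples=SAMPLES):
    return all(
        op(a, op(b, c)) == op(op(a, b), c)
        for a, b, c in product(samples, repeat=3)
    )


def distributes(mul, add, samples=SAMPLES):
    return all(
        mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
        for a, b, c in product(samples, repeat=3)
    )


def is_identity(op, e, samples=SAMPLES):
    """Test whether the candidate e acts as an identity on the samples."""
    return all(op(a, e) == a == op(e, a) for a in samples)


print(is_commutative(oplus), is_associative(oplus))
print(is_commutative(odot), is_associative(odot), distributes(odot, oplus))
```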
which by before also defines >, ≥. Similarly, each of >, ≥ determines all four relations of
this form. Thus one of these relations on a set S implies that we have all four relations
naturally derived from the one. These relations impose the familiar notion of order.
We are familiar with these relations <, ≤, >, ≥ in R. We can also use them in other
contexts:
Examples 2.7.1.
(1) < can be the relation “is a proper subset of” on a set S of all subsets of some uni-
versal set U . In this case, ≤ means “is a subset of”, > means “properly contains”,
and ≥ means “contains”.
(2) If < is the relation “has strictly fewer elements” on the set S of all subsets of the
set {1, 2, 3, . . . , 100}, then ≤ means “has fewer elements or is the same set” (rather
than “has fewer or the same number of elements”).
Examples 2.7.3.
(1) The set N0 has minimum 0. It is not bounded above, for any upper bound u
would be strictly smaller than the positive integer ⌈u⌉ + 1 (where ⌈·⌉ is the ceiling function),
thus contradicting the assumption of upper bounds.
(2) The set T = {1/n : n ∈ N+ } has maximum 1, it is bounded below, the infimum
is 0, and there is no minimum.
Proof: In long form, the set equals {1, 1/2, 1/3, 1/4, 1/5, . . .}. From this re-writing
it is clear that 1 is the maximum, that 0 is a lower bound and that 0 is not in the
set, so 0 cannot be the minimum. Why is 0 the largest lower bound, i.e., why is
0 the infimum of T ? Suppose that r is a positive real number. Set n = ⌈1/r⌉ + 1.
Then n is a positive integer, and n > 1/r. By cross multiplying we get that r > 1/n,
which proves that r is not a lower bound on T . Since r was arbitrary, this proves
that no positive number is a lower bound on T , so that 0 is the greatest lower
bound on T .
*A sentence of the form “P is Q (resp. Q′) if R (resp. R′)” is shorthand for two sentences: “P is Q if R”
and “P is Q′ if R′”.
(3) The set {1/p : p is a positive prime number} has maximum 1/2 and infimum 0.
(There are infinitely many prime numbers; see the proof on page 33.)
(4) The sets {(−1)^n : n ∈ N+ } and {sin(x) : x ∈ R} both have maximum 1 and
minimum −1.
(5) The set of all positive rational numbers that are strictly smaller than π has infimum
0 and supremum π, but it has no minimum and no maximum.
(6) The set {e^x : x ∈ R} has no upper bound, it is bounded below with infimum 0 and
no minimum.
(7) The empty subset has no minimum nor maximum. Every element of S is vacuously
an upper and a lower bound of the empty set.
(8) The set {x ∈ R : −3 < x − 5 < 3} has no minimum and maximum, but the
infimum is 2 and the supremum is 8. The set {x ∈ R : −3 ≤ x − 5 < 3} has
minimum 2, supremum 8, and no maximum. The set {x ∈ R : −3 < x − 5 ≤ 3}
has maximum 8, infimum 2, and no minimum. The set {x ∈ R : −3 ≤ x − 5 ≤ 3}
has minimum 2 and maximum 8.
(9) If T = {{}, {1}, {2}}, then the inclusion relation on T has minimum {}, and no
upper bounds in T . If we think of T as a subset of the set S of all subsets of {1, 2}
(or of the set S of all subsets of R), then T has supremum {1, 2}.
(10) If S is the set of all subsets of the set {1, 2, 3, . . . , 100} and < is the rela-
tion “has strictly fewer elements”, then the empty set is the minimum and
{1, 2, 3, . . . , 100} is the maximum. If T is the subset of S consisting only
of sets with at most two elements, then the minimum of T is the empty
set, and there is no maximum or supremum in T. The (100 choose 2) elements
{1, 2}, {1, 3}, . . . , {1, 100}, {2, 3}, {2, 4}, . . . , {99, 100} are each greater than or
equal to all elements of T and they are not strictly smaller than any other element
of T.
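The argument in example (2) is constructive: for each positive r it names an element 1/n of T that lies below r. The Python sketch below simply carries out that recipe with exact rational arithmetic; the function name witness is an assumption of this example, and the input is taken to be a rational number or anything that Fraction can convert exactly.

```python
from fractions import Fraction
from math import ceil


def witness(r):
    """For positive r, return n = ceil(1/r) + 1, a positive integer with 1/n < r."""
    r = Fraction(r)
    if r <= 0:
        raise ValueError("r must be positive")
    return ceil(1 / r) + 1


for r in [Fraction(1, 10), Fraction(3, 7), Fraction(5)]:
    n = witness(r)
    # The element 1/n of T is strictly smaller than r, so r is not a lower bound.
    print(r, n, Fraction(1, n) < r)
```

Since every positive r fails to be a lower bound in this way, and 0 is a lower bound, the infimum of T is 0.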
In the sequel we restrict < to relations that satisfy the trichotomy property:
Definition 2.7.4. A relation < on a set S satisfies the trichotomy if for all s, t ∈ S,
exactly one of the following relations holds:
s = t, s < t, t < s.
Examples 2.7.5.
(1) The familiar < on R satisfies the trichotomy.
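(2) If S is the set of all subsets of a universal set U with at most one element, then the
(proper) inclusion relation on S satisfies the trichotomy. If U has two distinct elements
u and v, the trichotomy fails, because neither of {u}, {v} is contained in the other.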
(3) If S = {{}, {1}, {2}}, then the inclusion relation on S does not satisfy the tri-
chotomy.
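For a finite set the trichotomy can be checked pair by pair. The Python sketch below does so for an arbitrary finite collection and a relation given as a function; the names satisfies_trichotomy and proper_subset are assumptions of this example, and sets are modelled as frozensets so that Python's built-in strict-subset comparison can play the role of proper inclusion.

```python
from itertools import product
import operator


def satisfies_trichotomy(elements, less):
    """Check that exactly one of s = t, s < t, t < s holds for every pair."""
    for s, t in product(elements, repeat=2):
        holds = [s == t, less(s, t), less(t, s)]
        if sum(holds) != 1:
            return False
    return True


def proper_subset(s, t):
    """Proper inclusion of frozensets."""
    return s < t


incomparable = [frozenset(), frozenset({1}), frozenset({2})]
chain = [frozenset(), frozenset({1}), frozenset({1, 2})]

print(satisfies_trichotomy(incomparable, proper_subset))  # the set from example (3)
print(satisfies_trichotomy(chain, proper_subset))
print(satisfies_trichotomy(range(-3, 4), operator.lt))
```

The first collection is the one from example (3): the sets {1} and {2} are distinct and neither contains the other, so none of the three relations holds for that pair.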
Theorem 2.7.6. Let < on S satisfy the trichotomy. Then a supremum (resp. infimum)
of a non-empty subset T of S, if it exists, is unique.
Proof. Suppose that c, c′ are suprema of T in S. Both c and c′ are upper bounds on T, and
since c is a least upper bound, necessarily c ≤ c′. Similarly c′ ≤ c, so that by trichotomy
c = c′. This proves that suprema are unique, and a similar proof shows that infima are
unique.
Why did we assume that the subset T of S above be non-empty? By definition every
element of S is an upper bound for ∅, so in particular if S has no minimum, then ∅ has no
least upper bound.
Definition 2.7.7. A set S with relation ≤ is well-ordered if inf(T ) = min(T ) for every
non-empty subset T of S. The element min(T ) is called the least element of T .
Examples 2.7.8.
(1) Any finite set with relation ≤ is well-ordered (simply check the finitely many
pairings for which element is smaller).
(2) Z is not well-ordered as there is no smallest whole number.
(3) Similarly, the set of all positive rational numbers is not well-ordered.
(4) N0 is well-ordered because for any non-empty subset T of N0 , by the fact that
the set is not empty there exists an element n ∈ T , and after that one has to
check which of the finitely many numbers 0, 1 through n is the smallest one in T .
Similarly, N+ is well-ordered.
Definition 2.7.9. Let F be a set with a binary operation +, with (additive) identity
0 ∈ F, and with a relation < satisfying the trichotomy. Define F^+ = {x ∈ F : 0 < x}, and
F^− = {x ∈ F : x < 0}. Elements of F^+ are called positive, and elements of F^− are called
negative. Elements of F \ F^+ are called non-positive and elements of F \ F^− are called
non-negative.
We define intervals in F to be sets of the following form, where a, b ∈ F with a < b:
(a, b) = {x ∈ F : a < x < b},
(a, b] = {x ∈ F : a < x ≤ b},
[a, b) = {x ∈ F : a ≤ x < b},
[a, b] = {x ∈ F : a ≤ x ≤ b},
(a, ∞) = {x ∈ F : a < x},
[a, ∞) = {x ∈ F : a ≤ x},
(−∞, b) = {x ∈ F : x < b},
(−∞, b] = {x ∈ F : x ≤ b}.
Definition 2.7.10. We say that a field F is an ordered field if it has a relation < with
the following properties:
(1) < satisfies the trichotomy, i.e., for all x, y ∈ F , exactly one of the following is true:
x < y, x = y, y < x.
(2) (Transitivity of <) For all x, y, z ∈ F , if x < y and y < z then x < z.
(3) (Compatibility of < with addition) For all x, y, z ∈ F , if x < y then x + z <
y + z.
(4) (Compatibility of < with multiplication by positive elements) For all
x, y, z ∈ F , if x < y and 0 < z then xz < yz.
A subset of an ordered field is called an ordered set.
It follows that Q is an ordered field as well: it is a field, and the trichotomy, transitiv-
ity, and compatibilities of < hold on Q as they hold on the larger set R. These properties
of < also hold on the subsets N+ , N0 and Z, even if the latter sets are not fields.
Proof. (1) x ∈ F^+ if and only if 0 < x, and by compatibility of < with addition this implies
that −x = 0 − x < x − x = 0, so that −x ∈ F^−. The rest of (1) is equally easy.
(2) By assumption 1 ≠ 0. If 1 ∉ F^+, then by trichotomy 1 < 0, and by (1), 0 < −1.
Thus by compatibility of < with multiplication by positive numbers, since −1 is supposedly
positive, 0 = 0·(−1) < (−1)·(−1). By Exercise 2.6.9, (−1)·(−1) = 1, which
says that 0 < 1. Since we also assumed that 1 < 0, we get a contradiction to the trichotomy.
So necessarily 1 ∈ F^+.
(3) Suppose that x ∈ F^+. By trichotomy then exactly one of the following three
relations holds: x^{-1} < 0, x^{-1} = 0, x^{-1} > 0. Let ⋆ stand for the correct inequality (or
equality), so that x^{-1} ⋆ 0. By compatibility of < with multiplication by the positive number x, we then
have 1 = x · x^{-1} ⋆ x · 0 = 0. By (2), the relation ⋆ must equal >, so that x^{-1} > 0.
If instead x ∈ F^−, then by (1), −x ∈ F^+, and by what we have proved of (3),
(−x)^{-1} ∈ F^+. By Exercise 2.6.10 then −x^{-1} = (−x)^{-1} ∈ F^+, so that x^{-1} ∈ F^− by (1).
Proof. (1) By assumption, 0 < x and 0 < y. Then by compatibility of < with addition,
y = 0 + y < x + y, and since 0 < y, by transitivity of <, 0 < x + y, i.e., x + y ∈ F^+.
(2) By assumption, 0 < x and 0 < y. Then by compatibility of < with multiplication
by positive numbers, 0 = 0 · y < x · y, which proves that x · y ∈ F^+.
The proofs of the rest are similar.
† 2.7.7. (Invoked in Theorem 7.3.4.) Let S and T be subsets of an ordered field F . Let
S + T = {s + t : s ∈ S and t ∈ T }.
i) If S and T are bounded above, prove that sup(S + T ) ≤ sup S + sup T .
ii) If S and T are bounded below, prove that inf(S + T ) ≥ inf S + inf T .
2.7.8. Let F be an ordered field. Prove that 2, 3 are positive (and so not zero).
2.7.9. Let F be a field and x ∈ F. Prove that x^2 = 0 if and only if x = 0.
2.7.10. Let F be an ordered field. Let x, y ∈ F be non-negative (resp. non-positive)
such that x + y = 0. Prove that x = y = 0.
2.7.11. Let F be an ordered field. Suppose that x ≤ y and p ≤ q. Prove that x + p ≤ y + q.
If in addition x < y or p < q, prove that x + p < y + q.
2.7.12. Let F be an ordered field and x, y ∈ F . Prove that x < y if and only if 0 < y − x.
Prove that x ≤ y if and only if 0 ≤ y − x.
2.7.13. Let F be an ordered field, and x, y ∈ F^+ with x < y. Prove that 1/y < 1/x.
2.7.14. Let F be an ordered field. Suppose that x < y and that x, y are non-zero. Does
it follow that 1/y < 1/x? Prove or give a counterexample.
†2.7.15. (In-betweenness in an ordered field) Let F be an ordered field. Let x, y ∈ F with
x < y. Prove that x < (x + y)/2 < y. (Why are we allowed to divide by 2?)
2.7.16. Find an ordered set without a minimum.
2.7.17. Let F be an ordered set. Prove that any non-empty finite subset S of F has a
maximum and a minimum. Prove that for all s ∈ S, min(S) ≤ s ≤ max(S).
2.7.18. Let n > 1 be an integer and F = Z/nZ. (This was defined in Example 2.3.9.)
Prove that F is not an ordered set. In particular, using Exercise 2.6.5, for any prime
integer p, Z/pZ is a field that is not ordered.
So far we have taken it for granted that elements of N0 and Z are special and that
Q and R are ordered fields. This section contains a formal definition of N0 as a subset of
R with derivations of a few important properties. I recommend covering this section very
lightly if at all. Regardless of whether you read this section or not, you should be able to
do all the exercises at the end.
Once we have a definition of the set N0 of non-negative integers, we define the set Z
of all integers as N0 ∪ {n ∈ R : −n ∈ N0 } and the set Q of all rational numbers as
{x · y^{-1} : x ∈ Z, y ∈ N+ }. In this section we derive no special properties of Z or of Q.
We use Axiom 2.7.11 accepting that R is an ordered field.
Theorem 2.8.2. There exists an inductive subset N0 of R that is a subset of every induc-
tive subset of R.
(1) N0 is the smallest inductive subset of R (in the sense that any inductive subset of
R contains N0 ).
(2) If m ∈ N0 is non-zero, then m = n + 1 for some n ∈ N0 .
(3) All elements of N0 \ {0} are positive.
Proof. Let T be the subset of N0 consisting of all n that satisfy the property that there are
no elements of N0 strictly between n and n + 1. We will prove that T is an inductive set.
Suppose that there exists m ∈ N0 strictly between 0 and 0 + 1 = 1. Then by
Theorem 2.8.2, m = p + 1 for some p ∈ N0 . By compatibility of order with addition,
p = m − 1 < 1 − 1 = 0, contradicting Theorem 2.8.2 which asserts that elements of N0 are
non-negative. This proves that 0 ∈ T .
Now suppose that n ∈ T . We want to prove that n+1 ∈ T . Suppose for contradiction
that there exists m ∈ N0 that is strictly between n + 1 and (n + 1) + 1. Since m > n + 1 ≥ 1,
m is not zero, so that by Theorem 2.8.2, m = p + 1 for some p ∈ N0 . Then by compatibility
of order with addition, p is strictly between n and n + 1, which contradicts the assumption
that n ∈ T . So necessarily there is no m with the stated property, so that n + 1 ∈ T .
2.8.1. Let F be an ordered field and let x ∈ F satisfy x > 1. Prove that for all positive
integers n, x^n > 1 and x^{n+1} > x.
2.8.2. Let F be an ordered field and let x ∈ F satisfy 0 < x < 1. Prove that for all
positive integers n, 0 < x^n < 1 and x^{n+1} < x.
2.8.3. (Bernoulli’s inequality) Prove that for all x ∈ R≥0 and all n ∈ N0 , (1 + x)^n ≥
1 + nx.
2.8.4. Does the set {2^n/n : n ∈ N+ } have a lower (resp. upper) bound? Justify. Repeat
with {n/2^n : n ∈ N+ }.
2.8.5. Prove that for all n ∈ Z there exists no integer strictly between n and n + 1.
(Hint: if n is negative, then the interval (n, n + 1) can be mirrored across 0 to the interval
(−n − 1, (−n − 1) + 1).)
2.8.6. Prove that for all x ∈ R, the interval (x, x + 1) can contain at most one integer.
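Before proving Bernoulli's inequality from Exercise 2.8.3 by induction, it may be reassuring to test it on a grid of values. The Python sketch below is only a sanity check: the function name bernoulli_holds and the grid of rational sample points are assumptions of this example, and exact fractions are used so that no rounding can affect the comparison.

```python
from fractions import Fraction


def bernoulli_holds(x, n):
    """Test (1 + x)^n >= 1 + n*x for one pair (x, n)."""
    return (1 + x) ** n >= 1 + n * x


# Non-negative rational x in steps of 1/4 up to 5, and n from 0 to 10.
grid = [(Fraction(k, 4), n) for k in range(0, 21) for n in range(0, 11)]
print(all(bernoulli_holds(x, n) for x, n in grid))
```

A finite grid cannot establish the inequality for all x and n; it only shows that no counterexample appears among the sampled pairs.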
Theorem 2.9.2. Let n be a positive integer and F an ordered field. Then the function
f : F → F defined by f(x) = x^n when restricted to F^+ ∪ {0} is strictly increasing with
the range in F^+ ∪ {0}.
Corollary 2.9.3. Let n be a positive integer and F an ordered field. Suppose that x, y ∈
F^+ ∪ {0} have the property that x^n < y^n. Then x < y.
Proof. Let y ∈ Range(f ). Then y = f (x) for some x ∈ F . If also y = f (z) for some z ∈ F ,
since f is strictly monotone, x = z. So f is injective and x is unique. Thus we define
g : Range(f ) → F by g(y) = x. Then by definition for all x ∈ F , g(f (x)) = x and for all
y ∈ Range(f ), f (g(y)) = y. If f is increasing and y1 , y2 ∈ Range(f ) such that y1 < y2 , then
g(y1 ) < g(y2 ) for otherwise by the increasing property of f , y1 = f (g(y1 )) ≥ f (g(y2 )) = y2 ,
which is a contradiction. Thus if f is increasing, so is g = f^{-1}. Thus if g = f^{-1} is
increasing, so is f = (f^{-1})^{-1}. The same reasoning goes for the decreasing property.
If the exponentiation function in Corollary 2.9.3 with exponent n takes F^+ ∪ {0} onto
F^+ ∪ {0}, then by Theorem 2.9.4, we can define its inverse function F^+ ∪ {0} → F^+ ∪ {0}.
However, the function need not be surjective or have an inverse, witness F = Q and n = 2
as proved on page 21.
Proof of (3): Let x, y ∈ F with x < y. Suppose that f and g are both strictly increasing
functions, so that f(x) < f(y) < 0 and g(x) < g(y) < 0. Then −f(y), −g(x)
are positive numbers, so by compatibility of < with multiplication by positive numbers,
f(x)(−g(x)) < f(y)(−g(x)) and (−f(y))g(x) < (−f(y))g(y). By Exercise 2.6.8, this
says that −(f(x)g(x)) < −(f(y)g(x)) and −(f(y)g(x)) < −(f(y)g(y)). By transitivity
of < then −(f(x)g(x)) < −(f(y)g(y)). By compatibility of < with addition, by
adding f(x)g(x) + f(y)g(y) we get that f(y)g(y) < f(x)g(x). With function notation,
(fg)(y) < (fg)(x), and since x and y were arbitrary, this says that fg is strictly decreasing.
The proof in the case where both f and g are strictly decreasing is similar.
In this section we formalize and analyze the important Least upper bound property
of R that we accept without proof.
Axiom 2.10.1. (Least upper/greatest lower bound property) For any non-empty
subset T of R that is bounded above, sup(T ) exists in R.
Similarly, for any non-empty subset T of R that is bounded below, inf(T ) exists in R.
With a proof similar to that in Exercise 2.7.5 one can deduce the existence of infima
from the existence of suprema, and vice versa.
By Theorem 2.7.6, sup(T ) and inf(T ) are unique.
The non-empty bounded set {x ∈ Q : x^2 < 2} in R has infimum −√2 and supremum
√2 in R. In contrast, the same set has neither an infimum nor a supremum in Q. So, R satisfies the
Least upper bound property but Q does not.
We use this Least upper bound property in the proofs of the Archimedean property
Theorem 2.10.3, the Intermediate value theorem Theorem 5.3.1, the Extreme value theorem
Theorem 5.2.2, and in many limit tricks for functions and sequences. Here is the first
example of its usage.
Theorem 2.10.2. The ceiling function makes sense on R, i.e., there exists a function
⌈ ⌉ : R → Z such that for all x ∈ R, ⌈x⌉ ≥ x and there exists no integer n with the property
⌈x⌉ > n ≥ x.
Similarly, there exists the floor function ⌊ ⌋ : R → Z such that for all x ∈ R, ⌊x⌋ ≤ x
and there exists no integer n with the property ⌊x⌋ < n ≤ x.
Furthermore, ⌊x⌋ = −⌈−x⌉.
suprema (Theorem 2.7.6), r − 0.5 is not an upper bound on G, so that there exists p ∈ G
such that r − 0.5 < p. Necessarily p ≤ r ≤ x. If n is an integer such that p < n ≤ x, then
n ∈ G so that n ≤ r. It follows that r − 0.5 < p < n ≤ r, so that by compatibility of <
with addition, 0 < n − p < r − (r − 0.5) < 1. But n − p ∈ N0 by Theorem 2.8.4, which
contradicts Theorem 2.8.3. Thus necessarily p is the largest integer less than or equal to x.
This proves the existence of the floor function for positive real numbers.
Clearly ⌈0⌉ = 0 = ⌊0⌋.
Now let x be negative. Then −x is positive and m = −⌈−x⌉ exists by the first
paragraph. By definition then m is an integer with the properties that −x ≤ −m and that
there is no integer n such that −x ≤ n < −m. Hence by compatibility of < with addition,
x ≥ m and there is no integer n such that x ≥ −n > m, i.e., there is no integer n such that
x ≥ n > m. This proves that m = ⌊x⌋. A similar proof shows that ⌈x⌉ = −⌊−x⌋.
Theorem 2.10.4. Between any two distinct real numbers there is a rational number
(strictly between them).
Proof. Let x, y ∈ R with x < y. Then y − x > 0, and by the Archimedean property, there
exists a positive integer p such that 2 < p(y − x). Let r = (1/p)(⌈px⌉ + 1) (recall that ⌈px⌉
is the ceiling function of px). So r is a rational number. By the definition of the ceiling
function, px ≤ ⌈px⌉ < ⌈px⌉ + 1. Since p is positive, so is p^{-1}, and by compatibility of < with
multiplication by positive numbers, x < (1/p)(⌈px⌉ + 1) = r. Furthermore, ⌈px⌉ < px + 1 by the
definition of the ceiling function, so that ⌈px⌉ + 1 < px + 2 < py by the choice of p,
and finally r = (1/p)(⌈px⌉ + 1) < y.
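The construction in this proof is explicit enough to try on a computer. The following is only a sketch under simplifying assumptions: the inputs are rational numbers (or floating-point numbers, which are converted to exact fractions), and the function name rational_between is hypothetical and not part of the text. The integer p is chosen as one concrete witness of the Archimedean property.

```python
from fractions import Fraction
from math import ceil, floor

def rational_between(x, y):
    """Return a rational r with x < r < y, following the proof's construction."""
    x, y = Fraction(x), Fraction(y)
    if not x < y:
        raise ValueError("need x < y")
    # Archimedean step: a positive integer p with 2 < p(y - x).
    p = floor(2 / (y - x)) + 1
    # r = (ceil(px) + 1) / p, exactly as in the proof.
    return Fraction(ceil(p * x) + 1, p)

r = rational_between(Fraction(1, 3), Fraction(1, 2))
print(r, Fraction(1, 3) < r < Fraction(1, 2))
```

The choice of p here is just one convenient value; any larger positive integer works equally well in the proof.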
Remark 2.10.5. It is also true that between any two real numbers there is an irrational
number. Namely, let x < y be real numbers. It is proved on page 21 that √2 is a
positive irrational number. By compatibility of < with multiplication by positive numbers
then x√2 < y√2. By the Archimedean property, there is a rational number r such that
x√2 < r < y√2. If r = 0, again by the Archimedean property there exists a rational
number s such that x√2 < 0 < s < y√2. So by possibly replacing r by this s we
may assume that r is a non-zero rational number. Then again by compatibility and by
Theorem 2.7.12, x < r/√2 < y. But r is non-zero, so that r/√2 is an irrational number
strictly between the given real numbers x and y.
I recommend that in the first pass through the next theorem the reader works with
concrete n = 2.
Theorem 2.10.6. (Radicals exist in R.) Let y be a positive real number and n ∈ N+.
Then there exists a unique positive real number s such that s^n = y.
(s + 1/p)^n < y. Since s + 1/p > s ≥ 0, it follows that s + 1/p ∈ T, which contradicts the
fact that s = sup T. This proves that s^n ≥ y.
Now suppose that s^n > y. Then in particular s > 0. By the Archimedean property
there exist positive integers p_1 and p_2 such that 1 < p_1 s and $\sum_{k=0}^{n-1} \binom{n}{k} s^k < p_2 (s^n - y)$. Let
p = max{p_1, p_2}. Then 1 < ps and $\sum_{k=0}^{n-1} \binom{n}{k} s^k < p(s^n - y)$. Thus s_0 = s − 1/p is positive,
and
\[
\begin{aligned}
s^n - \left(s - \frac{1}{p}\right)^n &= \left(s_0 + \frac{1}{p}\right)^n - s_0^n \\
&= \sum_{k=0}^{n-1} \binom{n}{k} \frac{s_0^k}{p^{n-k}} \\
&\le \sum_{k=0}^{n-1} \binom{n}{k} \frac{s^k}{p} \quad \text{(since $0 < s_0 < s$ and $p^{n-k} \ge p$)} \\
&< s^n - y.
\end{aligned}
\]
Hence by compatibility of < with addition, y < (s − 1/p)^n. But then for all r ∈ T, r^n ≤ y <
(s − 1/p)^n. Since s − 1/p is positive, by the increasing property of the power functions on R+
(Theorem 2.9.2) we have that r ≤ s − 1/p for all r ∈ T. But then s − 1/p is an upper bound
on T, which contradicts the fact that s is the supremum of T.
Then by trichotomy on R we conclude that s^n = y.
Now suppose that t is another positive real number such that t^n = y. By trichotomy,
either t < s or s < t. Then by the increasing property of the power functions on R+,
t^n ≠ s^n, which is a contradiction. This proves that s is unique.
We call the number s such that s^n = y the nth root of y, and we write it as ⁿ√y or
as y^{1/n}.
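The proof obtains the nth root as a supremum, which is not a recipe for computing it. As a numerical illustration only, the supremum can be approximated by bisection. The sketch below assumes that T is the set of non-negative real numbers r with r^n < y (the definition of T is not repeated in this passage), works with floating-point numbers rather than real numbers, and uses the hypothetical name nth_root_approx.

```python
def nth_root_approx(y, n, steps=100):
    """Approximate the positive n-th root of y > 0 by bisection on [0, max(1, y)]."""
    if y <= 0 or n < 1:
        raise ValueError("need y > 0 and a positive integer n")
    lo, hi = 0.0, max(1.0, float(y))  # lo**n < y <= hi**n holds at the start
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid ** n < y:
            lo = mid  # mid belongs to the assumed set T
        else:
            hi = mid  # mid is an upper bound on T
    return (lo + hi) / 2

print(nth_root_approx(2, 2))
```

Each step halves the interval that traps sup T, so the returned value agrees with the nth root up to floating-point precision.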
The following theorem lists the familiar properties of absolute values, and the reader
may wish to prove them without reading the given proof.
The last part of the theorem above shows that the absolute value works well with
multiplication: the absolute value of the product is the product of absolute values. It is
not the case that the absolute value of the sum of two numbers is always the sum of their
absolute values. Instead we have triangle inequalities as in the theorem below. We use the
standard notation “±” to mean that the result holds with either + or −.
Proof. (1) By the first part of the previous theorem, −|x| ≤ x ≤ |x| and −|y| ≤ y ≤ |y|.
Thus
Chapter 3: The field of complex numbers, and topology
In this chapter we construct complex numbers from the real numbers. Even if we
want to measure only real numbers in the real world, the more general complex numbers
streamline many constructions, so they are an important tool. We prove the important
basic properties of complex numbers and their field C in the first two sections. As a starting
point we take it as fact that R is an ordered field with the Least upper bound property
and the Archimedean property.
The last two sections are about the Euclidean topology on R and C, which is crucial
in the subsequent chapters on limits. I cover those two sections lightly but invoke them
later in the book as needed.
Note that there is no real number x such that x^2 = −1. In this section we build the
smallest possible field containing R with an element whose square is −1. We proceed by
defining a new set C with new operations + and ·, and we then verify that the result is a field that
contains R as a subfield and that has two elements whose squares equal −1. It is left to an
interested reader to show that there are no fields strictly between R and C (that contain a
root of −1).
[Figure: the points (0, 1), (−3, 1), (2, 0.5), (0, 0), (−1, 0), (1, 0), and (0, −1) plotted in the plane.]
The horizontal axis, on which all complex numbers are of the form (r, 0), is called
the real axis, and the vertical axis, on which all complex numbers are of the form (0, r),
is called the imaginary axis.
The illustration below shows a geometric interpretation of addition: to add (a, b) and
(c, d), draw the parallelogram using these two points and (0, 0) as three of the four vertices.
Think through why the sum is the fourth vertex. Because of this picture we loosely say
that addition in C obeys the parallelogram rule.
[Figure: the parallelogram with vertices (0, 0), (a, b), (c, d), and (a + c, b + d).]
However, C is not an ordered field in the sense of Definition 2.7.10, and a rigorous
proof is left for Exercise 3.1.6.
Rather than giving a formal proof that C is a field, below is a list of the necessary
easy verifications. The reader is encouraged to verify all.
(1) ·, + are associative and commutative.
(2) · distributes over +.
(3) For all x ∈ C, (0, 0) + x = x. In other words, C has the additive identity 0 = (0, 0).
(The additive identity of a field is written as “0” even when it is an ordered pair
of real numbers.)
(4) For all (a, b) ∈ C, (−a, −b)+(a, b) = (0, 0). In other words, every element (a, b) has
an additive inverse −(a, b) = (−a, −b). By Theorem 2.5.10, the additive inverse is
unique.
(5) For all x ∈ C, (1, 0) · x = x. In other words, C has the multiplicative identity
1 = (1, 0). (The multiplicative identity of a field is written as “1” even when it is
an ordered pair of real numbers.)
(6) (1, 0) ≠ (0, 0), i.e., 1 ≠ 0.
(7) Every non-zero element has a multiplicative inverse. Namely, for any (a, b) ≠ (0, 0),
(a/(a^2 + b^2), −b/(a^2 + b^2)) ∈ C and (a/(a^2 + b^2), −b/(a^2 + b^2)) · (a, b) = (1, 0). By Theorem 2.5.10, the
multiplicative inverse is unique so that (a, b)^{-1} = (a/(a^2 + b^2), −b/(a^2 + b^2)).
For example, the multiplicative inverse of (1, 0) is (1, 0), the multiplicative inverse of
(0, 1) is (0, −1), and the multiplicative inverse of (3, 5) is (3/34, −5/34).
The squares of the complex numbers (0, 1) and (0, −1) are (−1, 0) (check!). Thus we
have found two complex roots of the polynomial function x^2 + (1, 0). An identical proof to
that of Theorem 2.4.15 shows that these two complex numbers are the only two roots.
Notation 3.1.3. There is another notation for elements of C that is in some ways better:
(a, b) = a + bi, with (a, 0) = a and (0, b) = bi. This notational convention does not lose any
information, but it does save a few writing strokes. Addition is easy: (a + bi) + (c + di) =
(a + c) + (b + d)i, the additive inverse of a + bi is −a − bi, the additive identity is 0. This
notation justifies the possibly strange earlier definition of multiplication in C:
(a, b) · (c, d) = (a + bi)(c + di) = ac + adi + bic + bidi = (ac − bd, ad + bc).
The multiplicative identity is 1 and the multiplicative inverse of a non-zero a + bi is (a − bi)/(a^2 + b^2).
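The ordered-pair arithmetic described above is easy to experiment with. The following sketch is an illustration only: it models a complex number as a pair of exact rational numbers, and the function names add, mul, and inverse are hypothetical, not notation from the text. It checks the inverse of (3, 5) and the square of (0, 1) mentioned above.

```python
from fractions import Fraction

def add(x, y):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = x, y
    return (a + c, b + d)

def mul(x, y):
    """(a, b) · (c, d) = (ac − bd, ad + bc)."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)

def inverse(x):
    """Multiplicative inverse (a/(a^2 + b^2), −b/(a^2 + b^2)) of a non-zero pair."""
    a, b = x
    norm_sq = a * a + b * b
    if norm_sq == 0:
        raise ZeroDivisionError("(0, 0) has no multiplicative inverse")
    return (Fraction(a, norm_sq), Fraction(-b, norm_sq))

print(inverse((3, 5)))                 # the pair (3/34, −5/34) as exact fractions
print(mul((0, 1), (0, 1)))             # the square of (0, 1)
print(mul(inverse((3, 5)), (3, 5)))    # should be the multiplicative identity
```

Exact fractions are used so that the check of the identity (a, b)^{-1} · (a, b) = (1, 0) involves no rounding.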
Definition 3.1.4. The real part of (a, b) is Re(a, b) = a, and the imaginary part is
Im(a, b) = b. In alternate notation, Re(a + bi) = a and Im(a + bi) = b. Note that both the
real and the imaginary part of a complex number are real numbers.
These number systems progressively contain more numbers and more solutions of more
equations. For example, the equation 1 + x = 0 does not have any solutions in N0 but it
does have one in Z; the equation 1 + 2x = 0 does not have any solutions in Z, but it does
have one in Q; the equation 2 − x^2 = 0 does not have any solutions in Q, but it does have
two in R (namely √2 and −√2); the equation 2 + x^2 = 0 does not have any solutions in R,
but it does have two in C (namely √2 i and −√2 i). Furthermore, the standard quadratic
formula always yields roots of quadratic equations in C.*
* One of the excellent properties of C is the Fundamental Theorem of Algebra: every non-constant polynomial with
coefficients in C or R has roots in C. This fact is proved in a junior-level class on complex analysis or
in a senior-level class on algebra. The theorem does not say how to find the roots, only that they exist. In fact,
there is another theorem in Galois theory that says that in general it is impossible to find roots of a polynomial
by using radicals, sums, differences, products, and quotients.
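\[
\begin{aligned}
\overline{x \cdot y} &= \overline{(a + bi) \cdot (c + di)} \\
&= \overline{ac - bd + (ad + bc)i} \\
&= ac - bd - (ad + bc)i \\
&= (a - bi) \cdot (c - di) \\
&= \overline{x} \cdot \overline{y}.
\end{aligned}
\]
If y ≠ 0, then by (1) and (5), $\overline{x} = \overline{(x/y)y} = \overline{x/y} \cdot \overline{y}$, and so (6) follows.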
Remark 3.2.3. All functions above with domain and codomain equal to C were given
with some sort of algebraic formulation or description. How else can we represent such
a function? We certainly cannot give a tabular function formulation since the domain is
infinite. But we cannot draw such a function either: for the domain we would need to draw
the two-dimensional real plane, and the same for the codomain, so we would have to draw
the four-dimensional picture to see it all, and that is something we cannot do. So we need
to be satisfied with the algebraic or verbal descriptions of functions.
We have seen the absolute value function in ordered fields. The Pythagorean theorem
in the plane R × R motivates the natural definition of distance in C:
[Figure: the point (a, b) = a + bi in the plane, with bi marked on the imaginary axis.]
Since the absolute value is a real number, this gives a way to partially compare complex
numbers, say by their lengths, or by their real components. But recall Exercise 3.1.6:
C is not an ordered field.
The absolute value of (a, 0) = a or (0, a) = ia is |a|; the absolute value of (1, 1) = 1 + i
is √2; the absolute value of (1, √2) = 1 + i√2 is √3; the absolute value of (1, √3) = 1 + i√3
is √4 = 2, et cetera.
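\[
\begin{aligned}
&= x\overline{x} \pm x\overline{y} \pm y\overline{x} + y\overline{y} && \text{(by algebra)} \\
&= |x|^2 \pm x\overline{y} \pm \overline{x\overline{y}} + |y|^2 && \text{(by Theorem 3.2.2 (2))} \\
&= |x|^2 \pm 2\operatorname{Re}(x\overline{y}) + |y|^2 && \text{(by Exercise 3.2.1 iii))} \\
&\le |x|^2 + 2|\operatorname{Re}(x\overline{y})| + |y|^2 && \text{(comparison of real numbers)} \\
&\le |x|^2 + 2|x\overline{y}| + |y|^2 && \text{(by (4))} \\
&= |x|^2 + 2|x||\overline{y}| + |y|^2 && \text{(by (6))} \\
&= |x|^2 + 2|x||y| + |y|^2 && \text{(by (1))} \\
&= (|x| + |y|)^2,
\end{aligned}
\]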
and since the squaring function is strictly increasing on the set of non-negative real numbers,
it follows that |x ± y| ≤ |x| + |y|. This proves (7), and by Theorem 2.11.3 also (8).
A consequence of part (6) of this theorem is that if x ∈ C has absolute value greater
than 1, then positive integer powers of x have increasingly larger absolute values; if |x| < 1,
then positive integer powers of x have increasingly smaller absolute values; and if |x| = 1, then
all powers of x have absolute value equal to 1.
The absolute value allows the definition of bounded sets in C (despite not having an
order on C):
For example, any set with only finitely many elements is bounded: if A =
{x_1, . . . , x_n}, set M = max{|x_1|, . . . , |x_n|} + 1, and then certainly for all x ∈ A, |x| < M.
The subset Z of C is not bounded. The infinite set {x ∈ C : |x| = 5} is bounded.
The set {i^n : n ∈ N+} is bounded. The set {1/n : n ∈ N+} is bounded. The set of complex
numbers at angle π/4 from the positive real axis is not bounded. (Draw these sets.)
3.3.4. Let A be a subset of C. Prove that the following statements are equivalent:
i) A is a bounded set.
ii) There exist a ∈ C and a positive real number M such that A ⊆ B(a, M ).
iii) For all a ∈ C there exists a positive real number M such that A ⊆ B(a, M ).
iv) For all a ∈ R there exists a positive real number M such that A ⊆ B(a, M ).
v) There exist a ∈ R and a positive real number M such that A ⊆ B(a, M ).
3.3.5. Let F be either C or an ordered field, so that the absolute value function is defined
on F . Let S be a subset of F . Prove that S is bounded if and only if {|s| : s ∈ S} is
bounded.
3.3.6. Let A and B be subsets of C. Define A + B = {a + b : a ∈ A and b ∈ B},
A · B = {a · b : a ∈ A and b ∈ B}, and for any c ∈ C, let cA = {c · a : a ∈ A}. Compute
A + B, A · B, and c · A for the following A, B, c:
i) A = {1, 2, i}, B = {−1, i}, c = 4.
ii) A = R+ , B = R+ , c = −1.
3.3.7. Let A be a bounded subset of C.
i) Prove that for any complex number c, {ca : a ∈ A} is bounded.
ii) Prove that for any complex number c, {a + c : a ∈ A} is bounded.
iii) Prove that {a^2 : a ∈ A} is bounded.
iv) Prove that for any positive integer n, {a^n : a ∈ A} is bounded.
v) Prove that for any polynomial function f , {f (a) : a ∈ A} is bounded.
3.3.8. Let a, b ∈ C. Suppose that for all real numbers ε > 0, |a − b| < ε. Prove that a = b.
(Hint: Theorem 2.11.4.)
*3.3.9. (Keep in mind that a square root function on C is yet to be discussed carefully;
see Exercise 5.4.6.) Discuss correctness/incorrectness issues in the following equalities:
i) −6 = (√3 i)(√12 i) = √(−3) √(−12) = √((−3)(−12)) = √36 = 6.
ii) (R. Bombelli, 1560, when solving the equation x^3 = 15x + 4.)
4 = ∛(2 + √(−121)) + ∛(2 − √(−121)).
iii) (G. Leibniz, 1675) √(1 + √(−3)) + √(1 − √(−3)) = √6.
So far we have expressed complex numbers with pairs of real numbers either in
ordered-pair notation (x, y) or in the form x + yi. But a complex number can also be
uniquely determined from its absolute value and the angle measured counterclockwise from
the positive real axis to the line connecting (0, 0) and (x, y).
Any choice of θ works for the complex number zero. The angles are measured in
radians. (While you may say degrees out loud, get into the habit of writing down radians;
later we will see that radians work better.) The angle is not unique; addition of any integer
multiple of 2π to it does not change the complex number.
For further examples, (1 + i√3)/2 is on the unit circle centered at the origin and is at angle
π/3 counterclockwise from the positive real axis, (1 − i√3)/2 is on the same unit circle and at
angle −π/3 counterclockwise from the positive real axis, and (−1 + i√3)/2 is on the same circle
and at angle 2π/3 counterclockwise from the positive real axis.
We refer to the entries in the ordered pair (x, y) ∈ R × R = C as Cartesian coordinates.
The coordinates (r, θ) consisting of the absolute value r of a complex number
and its angle θ (measured counterclockwise from the positive real axis) are referred to as
polar coordinates.
Numerical conversions between the two coordinate systems use trigonometry. If we
know r and θ, then x and y are given by:
x = r cos θ,
y = r sin θ,
and if we know x and y, then r and θ are given by:
\[
r = \sqrt{x^2 + y^2},
\]
\[
\theta = \begin{cases}
\text{anything}, & \text{if } x = 0 = y; \\
\pi/2, & \text{if } x = 0 \text{ and } y > 0; \\
-\pi/2, & \text{if } x = 0 \text{ and } y < 0; \\
\arctan(y/x) \in (-\pi/2, \pi/2), & \text{if } x > 0; \\
\arctan(y/x) \in (\pi/2, 3\pi/2), & \text{if } x < 0.
\end{cases}
\]
Note that the angle is ±π/2 precisely when Re x = 0, that the angle is 0 when x is a
positive real number, that it is π when x is a negative real number, et cetera. Furthermore,
if the angle is not ±π/2, then the tangent of this angle is precisely Im x/ Re x.
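The conversions between the two coordinate systems can be tried numerically. The sketch below is an illustration with floating-point numbers; the function names to_polar and to_cartesian are hypothetical. It relies on Python's math.atan2, which handles all the cases of the sign of x at once but returns the angle in the interval (−π, π], so for points with x < 0 and y < 0 its answer differs by 2π from an angle chosen in (π/2, 3π/2); both describe the same complex number.

```python
import math

def to_polar(x, y):
    """Return (r, theta) for the point (x, y), with theta in (−pi, pi]."""
    r = math.hypot(x, y)       # sqrt(x^2 + y^2)
    theta = math.atan2(y, x)   # for (0, 0) this returns 0.0; any angle would do
    return r, theta

def to_cartesian(r, theta):
    """Return (x, y) = (r cos(theta), r sin(theta))."""
    return r * math.cos(theta), r * math.sin(theta)

print(to_polar(0.5, math.sqrt(3) / 2))    # approximately (1, pi/3)
print(to_polar(-0.5, math.sqrt(3) / 2))   # approximately (1, 2*pi/3)
```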
We will show in Chapter 9 that the polar coordinates r, θ determine the complex
number as re^{iθ}, but at this point we cannot yet make sense out of this exponentiation.
Nevertheless, this notation hints at multiplication of complex numbers re^{iθ} and se^{iβ} as
resulting in rse^{i(θ+β)}, confirming that the absolute value of the product is the product of
the absolute values, and hinting that the angle of the product is the sum of the two angles
of the numbers.
We next prove this beautiful fact of how multiplication works geometrically.
Theorem 3.4.1. (Fun fact) Let z be a complex number in polar coordinates r and θ.
Define functions M, S, R : C → C as follows:
M (x) = zx = Multiply x by z,
S(x) = rx = Stretch x by a factor of r,
R(x) = Rotate x by angle θ counterclockwise around (0, 0).
Then
M = S ◦ R = R ◦ S,
[Figure: four points (c_k, d_k), k = 1, 2, 3, 4, plotted together with the corresponding points (−d_k, c_k).]
The complex number M (i) = (−d, c) has length equal to |(c, d)| = r. The angle between
(c, d) and (−d, c) is 90°, or π/2 radians, and more precisely, to get from z = (c, d) to
M (i) = (−d, c) we have to rotate counterclockwise by π/2. Thus the angle formed by M (i)
counterclockwise from the positive real axis is θ + π/2. But (R ◦ S)(i) = R(ri) also has the
same angle and length as M (i), so that M (i) = (R ◦ S)(i).
Now let x be general in C. Write x = a + bi for some a, b ∈ R. By the geometry of
rotation, R(a + bi) = R(a) + R(bi) = aR(1) + bR(i). Then
R ◦ S(x) = S ◦ R(x) (as established from geometry)
= rR(x)
= rR(a + bi)
= r(aR(1) + bR(i))
= arR(1) + brR(i)
= aS(R(1)) + bS(R(i))
= aM (1) + bM (i) (by previously proved cases)
= az · 1 + bz · i
= z(a + bi)
= M (x).
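The statement M = S ◦ R = R ◦ S can be sanity-checked numerically. The following sketch is an illustration, not part of the proof: it uses Python's built-in complex numbers, implements the rotation R with the usual sine and cosine formulas for rotating a point of the plane (an assumption about how one would compute a rotation, since the text treats rotation geometrically), and picks the sample values z = −3 + i and x = 2 + 0.5i arbitrarily.

```python
import cmath
import math

def rotate(x, theta):
    """R: rotate x counterclockwise by theta around (0, 0)."""
    c, s = math.cos(theta), math.sin(theta)
    return complex(c * x.real - s * x.imag, s * x.real + c * x.imag)

def stretch(x, r):
    """S: stretch x by the real factor r."""
    return r * x

def multiply(z, x):
    """M: multiply x by z."""
    return z * x

z = complex(-3, 1)
r, theta = cmath.polar(z)   # polar coordinates of z
x = complex(2, 0.5)

gap_sr = abs(multiply(z, x) - stretch(rotate(x, theta), r))
gap_rs = abs(multiply(z, x) - rotate(stretch(x, r), theta))
print(gap_sr, gap_rs)       # both gaps should be at the level of rounding error
```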
Theorem 3.4.2. For any non-zero complex number x and any integer n, the angle of x^n
counterclockwise away from the positive x-axis is n times the angle of x. Also, |x^n| = |x|^n.
Proof. If n = 1, this is trivially true. Now suppose that n ≥ 2 and that the theorem is true
for n − 1. Then the angle of x^{n−1} counterclockwise away from the positive x-axis
is n − 1 times the angle of x, and by Theorem 3.4.1, the angle of x^n = x x^{n−1} is the sum
of the angles of x and x^{n−1}, so that it is n times the angle of x. Similarly, by part (6) of
Theorem 3.3.2, |x^n| = |x x^{n−1}| = |x| |x^{n−1}| = |x||x|^{n−1} = |x|^n.
Thus by induction the theorem is proved for all positive n.
Still keep n positive. Since 1 = x^{−n} x^n has angle 0 and x^n has angle n times the
angle of x, by Theorem 3.4.1, x^{−n} must have angle −n times the angle of x. Also, by
part (6) of Theorem 3.3.2, 1 = |x^{−n}| |x^n| = |x^{−n}| |x|^n, so that |x^{−n}| = |x|^{−n}. Thus the
theorem holds for all non-zero n.
Finally, if n = 0, then the angle of x^n = 1 is 0, which is 0 times the angle of x, and
|x^0| = |1| = 1 = |x|^0.
For example, (−1 + √3 i)/2 is on the unit circle at angle 2π/3 counterclockwise from the
positive x-axis (i.e., at angle 120° in degrees), and so the second power of (−1 + √3 i)/2 is on the
unit circle at angle 4π/3 counterclockwise from the positive x-axis, and the cube power is
on the unit circle at angle 2π, i.e., at angle 0, so that ((−1 + √3 i)/2)^3 = 1.
Corollary 3.4.3. Let n be a positive integer. Let A be the set of all complex numbers
on the unit circle at angles 0, 2π/n, 2 · 2π/n, 3 · 2π/n, . . . , (n − 1) · 2π/n. Then A equals the set of all the
complex number solutions to the equation x^n = 1.
Proof. Let a ∈ A. By the previous theorem, a^n has length 1 and angle an integer multiple
of 2π, so that a^n = 1. If b ∈ C satisfies b^n = 1, then |b|^n = |b^n| = 1, so that the
non-negative real number |b| equals 1. Thus b is on the unit circle. If θ is its angle, then the
angle of b^n = 1 is by the previous theorem equal to nθ, so that nθ must be an integer
multiple of 2π. It follows that θ is an integer multiple of 2π/n, but all those angles appear for
the elements of A. Thus every element of A is a root, and every root is an element of A,
which proves the corollary.
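The set A of the corollary can be listed numerically. The sketch below is an illustration with floating-point arithmetic; the function name roots_of_unity is hypothetical, and cmath.rect(r, θ) is used as a convenient way to produce the complex number with absolute value r and angle θ.

```python
import cmath
import math

def roots_of_unity(n):
    """Complex numbers on the unit circle at angles 0, 2*pi/n, ..., (n - 1)*2*pi/n."""
    return [cmath.rect(1.0, 2 * math.pi * k / n) for k in range(n)]

for w in roots_of_unity(6):
    # Each w**6 should equal 1 up to rounding error.
    print(w, abs(w ** 6 - 1))
```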
Theorem 3.4.4. Let x be a non-zero complex number and let n be a positive integer.
Then there exist exactly n complex numbers whose nth power equals x.
Proof. By Theorem 3.3.2 we know that r = |x| is positive. By Theorem 2.10.6 there exists
a positive real number s such that s^n = r. Let α be the angle of x in radians measured
counterclockwise from the positive real axis. For any positive integer j let u_j be the
complex number on the unit circle whose angle from the positive real axis is (α + 2πj)/n.
By Theorem 3.4.2, u_j^n is on the unit circle at angle α + 2πj counterclockwise from
the positive real axis. But this is the same as the complex number on the unit circle at
angle α counterclockwise from the positive real axis. Hence (su_j)^n = s^n u_j^n = r u_j^n is the
complex number on the circle of radius r at angle α measured counterclockwise from the
positive real axis. This says that (su_j)^n = x. By angle considerations, su_1, su_2, . . . , su_n
are distinct. This proves that there exist n complex numbers whose nth power equals x.
Now let y be any complex number whose nth power equals x. By Theorem 3.4.2,
|y| = s. Let β be the angle of y measured counterclockwise from the positive real
axis. Since y^n = x, by Theorem 3.4.2, nβ − α = 2πk for some integer k. Hence
β = (α + 2πk)/n. We can write k = k′n + j for some integer j ∈ {1, 2, . . . , n} and some
integer k′. Then
β = (α + 2πk)/n = (α + 2πj)/n + 2πk′,
so that the angle and the length of y are the same as those of su_j, so that y = su_j. Thus
there exist exactly n complex numbers whose nth powers equal x.
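The numbers su_1, . . . , su_n from this proof can be computed approximately. The following sketch is an illustration with floating-point numbers; the function name nth_roots is hypothetical, cmath.polar is used to read off r and α, and the real nth root s is computed with ordinary floating-point exponentiation rather than via Theorem 2.10.6.

```python
import cmath
import math

def nth_roots(x, n):
    """Return the numbers s*u_j, j = 1, ..., n, from the proof, for non-zero x."""
    if x == 0:
        raise ValueError("x must be non-zero")
    r, alpha = cmath.polar(x)   # r = |x|, alpha = an angle of x
    s = r ** (1.0 / n)          # positive real number with s**n approximately r
    return [cmath.rect(s, (alpha + 2 * math.pi * j) / n) for j in range(1, n + 1)]

for y in nth_roots(complex(-8, 0), 3):
    print(y, abs(y ** 3 - (-8)))   # each gap should be at the level of rounding error
```

For x = −8 and n = 3, one of the three numbers is the familiar real cube root −2, and the other two are not real.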
Whereas for a non-negative real number we choose its non-negative nth root as
the nth root, there is no natural choice for an nth root of a complex number; more on that
is in the chapter on continuity in Exercise 5.4.6.
3.4.4. Prove that for any non-zero z ∈ C there exist exactly two elements in C whose
square equals z. (Hint: Theorem 2.10.6 and Theorem 3.4.1.)
3.4.5. Let z be non-zero in C with polar coordinates r and θ and let n ∈ N+. For any
integer k, let z_k be the complex number whose absolute value equals r^{1/n} and whose angle
measured counterclockwise from the positive x-axis is (θ + 2πk)/n.
i) Prove that z_k is uniquely determined.
ii) Prove that for all k, z_k^n = z. (Hint: Theorem 3.4.1.)
iii) Prove that the set {z_k : k ∈ N} contains exactly n elements.
3.4.6. “Square” the pentagon drawn below. Namely, estimate the coordinates (real, imaginary
or length, angle) of various points on the pentagon, square the point, and draw its
image on a different real plane. You need to plot the image not only of the five vertices, but
of several representative points from each side to see how the squaring curves the edges.
(Hint: You may want to use Theorem 3.4.1.)
When reading this section, absorb the following main points of topology: open ball,
open set, limit point, closed set. The main object of this section is to introduce limit points
of sets so that we can in subsequent chapters talk about limits of functions, sequences, and
series.
By a topology on a set we mean that some sets are declared open, subject to the
conditions that the empty set and the whole set have to be open, that arbitrary unions of
open sets be open, and that finite intersections of open sets be open. In any topology, the
complement of an open set is called closed, but a set may be neither open nor closed. A
topology can be imposed on any set, not just R or C, but we focus on these two cases, and
in fact we work only with the “standard”, or “Euclidean” topology.
The following are both B(0, 1), but the left one is a ball in R and the right one is a
ball in C. Note that they are different: by definition the left set is an open subset of R,
but if you think of it as a subset of C, it is not open (see Exercise 3.5.1).
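[Figure: B(0, 1) drawn as an interval from −1 to 1 on the real line (left) and as a disc of radius 1 in the plane (right).]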
Examples 3.5.2.
(1) B(a, r) is open.
(2) F = ∪_{a∈F} B(a, 1) is an open set.
(3) The empty set is an open set because it is vacuously a union of open sets (see
page 52).
(4) For real numbers a < b, the interval in R of the form (a, b) is an open set in R
because it is equal to B((a + b)/2, (b − a)/2). The interval (a, ∞) is open because
it equals ∪_{n=1}^{∞} B(a + n, 1).
(5) The set A = {x ∈ C : 1 < Re x < 3 and 0.5 < Im x < 2} is open in C. Namely,
this set is the union ∪_{a∈A} B(a, min{Re a − 1, 3 − Re a, Im a − 0.5, 2 − Im a}).
(6) The set A = {x ∈ C : Re x < 1 and Im x < 2} is open in C. Namely, this set is
the union ∪_{a∈A} B(a, min{1 − Re a, 2 − Im a}).
Proof. For each integer n ≥ 2, a + r/n ∈ B(a, r). Since r > 0, these numbers are all
distinct. Since N0 is infinite, so is the set of all integers that are at least 2.
Theorem 3.5.5. Let A be an open set and let a ∈ A. Then there exists r > 0 such that
B(a, r) ⊆ A.
Proof. Since A is open, it is a union of open balls. Thus a is an element of one such ball
B(b, s), with B(b, s) ⊆ A.
Since a ∈ B(b, s), we have that |a − b| < s, so that r = s − |a − b| is a positive real
number. (In the illustration, this is the distance between a and the outside of the circle.)
We claim that B(a, r) ⊆ B(b, s). To prove this, let x ∈ B(a, r). Then |x − b| =
|x − a + a − b| ≤ |x − a| + |a − b| by the triangle inequality, and since
|x − a| < r = s − |a − b|, it means that |x − b| < s, so that x ∈ B(b, s).
This proves the claim, and hence it proves that B(a, r) ⊆ A.
[Figure: the ball B(b, s) with center b and a point a inside it.]
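The choice r = s − |a − b| can be tested on random points. This is only a numerical illustration in C with floating-point numbers, not a substitute for the proof; the function names inner_radius and sample_ball, the sample values of a, b, s, and the fixed random seed are all assumptions made for the example.

```python
import cmath
import math
import random

def inner_radius(a, b, s):
    """Return r = s − |a − b|, which is positive when a lies in B(b, s)."""
    d = abs(a - b)
    if d >= s:
        raise ValueError("a is not in B(b, s)")
    return s - d

def sample_ball(center, radius, rng):
    """Return a point of the open ball B(center, radius) chosen at random."""
    rho = radius * math.sqrt(rng.random())   # rng.random() lies in [0, 1)
    phi = 2 * math.pi * rng.random()
    return center + cmath.rect(rho, phi)

rng = random.Random(0)
a, b, s = complex(1.0, 0.5), complex(0.2, 0.1), 2.0
r = inner_radius(a, b, s)
points = [sample_ball(a, r, rng) for _ in range(1000)]
print(all(abs(x - b) < s for x in points))   # every sampled point of B(a, r) lies in B(b, s)
```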
Proof. The empty set can be written as an empty union of open balls, so it is open
vacuously, and F = ∪a∈F B(a, 1), so that F is open. This proves (1).
Every open set is a union of open balls, and so the union of open sets is a union of
open balls, hence open. This proves (2).
Now let A1 , . . . , An be open sets. Let a ∈ A1 ∩ · · · ∩ An . By Theorem 3.5.5, for each
k = 1, . . . , n, there exists rk > 0 such that B(a, rk ) ⊆ Ak . Set r = min{r1 , . . . , rn }. Then
B(a, r) ⊆ ∩_{k=1}^n B(a, rk ) ⊆ ∩_{k=1}^n Ak . Thus for each a ∈ ∩_{k=1}^n Ak there exists ra > 0 such
that B(a, ra ) ⊆ ∩_{k=1}^n Ak . It follows that
∩_{k=1}^n Ak = ∪ B(a, ra ), where the union is taken over all a ∈ ∩_{k=1}^n Ak .
This proves that ∩_{k=1}^n Ak is a union of open balls, so it is open.
Examples 3.5.8.
(1) If A = {a}, then the set of limit points of A is the empty set.
(2) If A = Q, then the set of limit points of A is R.
(3) The set of limit points of ∅ is empty, the set of limit points of R is R, and the set
of limit points of C is C.
(4) The set of limit points of B(a, r) equals {x ∈ F : |x − a| ≤ r}.
Definition 3.5.9. A set A is a closed set if it contains all of its limit points.
Proof. Suppose that A is open. Let x ∈ A. By Theorem 3.5.5, there exists r > 0 such that
B(x, r) ⊆ A. Thus B(x, r) ∩ (F \ A) = ∅, so that x is not a limit point of F \ A. Thus
no point of A is a limit point of F \ A, which proves that any limit points of F \ A are in
F \ A. Thus F \ A is closed.
Now suppose that F \ A is closed. Let x ∈ A. Since F \ A contains all of its limit
points, then x is not a limit point of F \ A. Thus by the definition of limit points, there
exists r > 0 such that B(x, r) ∩ (F \ A) is empty. This means that B(x, r) ⊆ A. Thus
A = ∪x∈A B(x, rx ) for appropriate rx > 0, so that A is open.
The following is now almost immediate from previous results:
Proof. Exercise 2.1.8 proves that the union of the complements of two sets equals the
complement of the intersection, and that the intersection of the complements of two
sets equals the complement of the union. An equally easily proved
mathematical truth is the following generalization to possibly many more sets:
F \ ∪k∈I Ak = ∩k∈I (F \ Ak ),    F \ ∩k∈I Ak = ∪k∈I (F \ Ak ).
With this, (2) and (3) follow from the last two theorems, and the proof of (1) is trivial.
Both ∅ and F are open and closed, and these turn out to be the only sets that
are both open and closed (see Exercise 3.5.2). Some sets are neither open nor closed (see
Exercise 3.5.1).
Remark 3.5.12. (This remark puts Theorems 3.5.6 and 3.5.11 in the more general context
and is not needed in the first course in analysis.) Any set F (not necessarily a field) is a
topological space if there exists a collection T of subsets of F such that the following
properties are satisfied:
(1) ∅, F ∈ T,
(2) Arbitrary unions of elements in T are in T.
(3) Finite intersections of elements in T are in T.
Elements of T are called open. Subsets of F that are complements of open sets are called
closed. Theorem 3.5.11 for closed sets in this topological space is proved in
the same way.
3.5.6. Let a be a limit point of a set A. Suppose that a set B contains A. Prove that a is
a limit point of B.
3.5.7. Give examples of sets A ⊆ B ⊆ C and a ∈ C such that a is a limit point of B but
not of A.
3.5.8. Let A be a subset of C all of whose elements are real numbers. Prove that every
limit point of A is a real number.
Closed and bounded sets in C and R have many excellent properties – we will for
example see in Section 5.2 that when a good (say continuous) real-valued function has a
closed and bounded domain, then that function achieves a maximum and minimum value,
et cetera. The concept of uniform continuity (introduced in Section 5.5) needs the fairly
technical Heine-Borel theorems proved in this section.
Construction 3.6.1. (Halving closed and bounded subsets of R and quartering closed
and bounded subsets of C) Let A be a bounded subset of R or of C, and let P be
a property that applies to some subsets of A. Boundedness of A guarantees that A fits
inside a closed bounded rectangle R0 of the form [a0 , b0 ] × [c0 , d0 ] in C, with c0 = 0 = d0
if A is a subset of R. The rectangle can be halved lengthwise and crosswise to get four
equal closed subrectangles. In the next iteration we pick, if possible, one of these four
closed quarter subrectangles such that its intersection with A has property P . We call this
subrectangle R1 . If A is a subset of R, then the length of R1 is half the length of R0 , and
otherwise the area of R1 is a quarter of the area of R0 . In general, once we have Rn , we
similarly pick a subrectangle Rn+1 such that Rn+1 ∩ A has property P and such that the
sides of Rn+1 are half the lengths of the sides in Rn . Write Rn = [an , bn ] × [cn , dn ] for some
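real numbers an ≤ bn and cn ≤ dn . By construction, for all n, bn − an = (b0 − a0 )/2^n , and
a0 ≤ a1 ≤ a2 ≤ · · · ≤ an ≤ · · · ≤ bn ≤ · · · ≤ b2 ≤ b1 ≤ b0 .
By the least upper bound property of R, and because bn − an becomes arbitrarily small,
a = sup{a1 , a2 , a3 , . . .} = inf{b1 , b2 , b3 , . . .}.
Similarly,
c = sup{c1 , c2 , c3 , . . .} = inf{d1 , d2 , d3 , . . .}.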
This means that the intersection of all the Rn equals the set {a + ci}, consisting of exactly
one complex number. By the shrinking property of the subrectangles, for every δ > 0 there
exists a positive integer N such that RN ∩ A ⊆ B(a + ci, δ).
In particular, “quartering” of the closed and bounded region (interval) A =
[a0 , b0 ] ⊆ R, means halving the rectangle (interval), and the intersection of all the chosen
closed half-rectangles is a set with exactly one element. That element is in A, so a real
number.
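The halving step of the construction can be sketched in Python for an interval in R. This is a minimal illustration under stated assumptions: the names halve and contains_sqrt2 are hypothetical, and the property P used here ("the interval contains √2") was chosen only because it can be tested by a computer; the properties P used in the proofs below are not of this kind.

```python
def halve(lo, hi, has_property, steps):
    """Return the nested intervals [a_n, b_n] for n = 0, ..., steps.

    At each step the current interval is cut in half and a half with the
    property is kept (the left half is tried first; this order is an assumption).
    """
    intervals = [(lo, hi)]
    for _ in range(steps):
        mid = (lo + hi) / 2
        if has_property(lo, mid):
            hi = mid
        elif has_property(mid, hi):
            lo = mid
        else:
            raise ValueError("neither half has the property")
        intervals.append((lo, hi))
    return intervals

def contains_sqrt2(lo, hi):
    """Hypothetical property P: the interval [lo, hi] (with lo >= 0) contains sqrt(2)."""
    return lo * lo <= 2 <= hi * hi

nested = halve(0.0, 2.0, contains_sqrt2, 20)
for n in (0, 1, 2, 20):
    print(n, nested[n])
```

The left endpoints in the list increase, the right endpoints decrease, and the n-th interval has length (b0 − a0)/2^n, matching the displayed chain of inequalities in the construction.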
Theorem 3.6.2. (The Heine-Borel theorem (in R, C)) Let A be a closed and bounded
subset of R or C. For each c ∈ A let δc be a positive number. Then there exists a finite
subset S of A such that A ⊆ ∪c∈S B(c, δc ).
Proof. We declare that a subset B of A satisfies (property) P if there exists a finite subset
S of B such that B ⊆ ∪c∈S B(c, δc ). We want to prove that A has P .
Suppose for contradiction that A does not have P . Since A is closed and bounded, it
fits inside a closed rectangle R0 . With Construction 3.6.1, we construct iteratively nested
subrectangles R0 ⊇ R1 ⊇ R2 ⊇ · · ·. The quarter subrectangles are chosen so that each
Rn ∩ A does not have P . This is true if n = 0 by assumption. Suppose that Rn has been
chosen so that Rn ∩A does not have P . If the intersection with A of each of the four quarter
subrectangles of Rn has P , i.e., if each of the four subrectangles (as in Construction 3.6.1)
intersected with A is contained in the union of finitely many balls B(c, δc ), then Rn ∩ A is
covered by finitely many such balls as well, which contradicts the assumption on Rn . Thus
it is possible to choose Rn+1 so that Rn+1 ∩ A does not have P . By construction, ∩_{n=1}^∞ Rn
contains exactly one point. Let that point be x. Since each Rn ∩ A has infinitely many
points, x is a limit point of A, and since A is closed, necessarily x ∈ A.
By the shrinking sizes of the Rn , there exists a positive integer N such that RN ⊆
B(x, δx ). But then RN ∩ A has P , which contradicts the construction. Thus A has P .
Remark 3.6.3. Let F be R or C and let A be a closed and bounded subset of F . Let T
be a collection of open subsets of F such that A ⊂ ∪U ∈T U . This set containment is usually
referred to as T being an open cover of A. By the definition of set containment, for each
c ∈ A there exists Uc ∈ T such that c ∈ Uc . Since Uc is open, by Theorem 3.5.5, there
exists δc > 0 such that B(c, δc ) ⊆ Uc . The Heine-Borel theorem (Theorem 3.6.2) asserts that
there exists a finite subset S of A such that A ⊆ ∪c∈S B(c, δc ) ⊆ ∪c∈S Uc . In other words, the
Heine-Borel theorem asserts that every open cover of a closed and bounded subset
of C has a finite subcover.
Remark 3.6.4. In the more general context of topological spaces as in Remark 3.5.12,
there need not be a notion of bounded sets. A set for which every open cover has a finite
subcover is called compact. So Theorem 3.6.2 proves that every closed and bounded
subset of C is compact.
Theorem 3.6.5. Let A be a closed and bounded subset of R or C, and for each a ∈ A
let δa be a positive number. Then there exist a finite subset S of A and a positive real
number δ such that A ⊆ ∪c∈S B(c, δc ) and such that for all x ∈ A there exists c ∈ S such
that B(x, δ) ⊆ B(c, δc ).
Proof. By Theorem 3.6.2, there exists a finite subset S of A such that A ⊆ ∪c∈S B(c, δc /2).
Let δ = (1/2) min{δc : c ∈ S}. Since S is a finite set, δ is a positive real number.
Let x ∈ A. By the choice of S there exists c ∈ S such that x ∈ B(c, δc /2). Let
y ∈ B(x, δ). Then
|y − c| = |y − x + x − c| ≤ |y − x| + |x − c| < δ + δc /2 ≤ δc ,
so that y ∈ B(c, δc ). It follows that B(x, δ) ⊆ B(c, δc ).
Chapter 4: Limits of functions
All calculus classes teach about limits, but the domains there are typically intervals
in R. Here we learn a more general definition for (more interesting) domains in C.
It is important to note that we are not asking for f (a). For one thing, a may or may
not be in the domain of f ; we only know that a is a limit point of the domain of f . We
are asking for the behavior of the function f at points near a.
We can give a simple geometric picture of this in case the domain and codomain
are subsets of R (refer to Remark 3.2.3 for why we cannot draw functions when domains
and codomains are subsets of C). Below are three graphs of real-valued functions defined
on a subset of R and with a being a limit point of the domain. In each, on the graph of
y = f (x) we cover the vertical line x = a, and with that information, we conclude that
limx→a f (x) = L.
[Figure: three graphs, each with the vertical line x = a covered and the value L marked on the vertical axis.]
The function f from the first graph above might be any of the following:
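[Figure: three possible completions of the first graph, each with a marked on the horizontal axis and L on the vertical axis.]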
Intuitively we are hoping that f (x) for x near a can predict a trend for the value of
f as we get arbitrarily close to a. For example, we may not be able to bring x to 0 Kelvin,
but if we can take measurements f (x) for x getting colder and colder, perhaps we can
predict what may happen at 0 Kelvin. But how believable is our prediction? Perhaps for
our theory to be satisfactory, we need to run experiments at temperatures x that give us
f (x) within ε = 10 of the predicted value. Or when instruments get better, perhaps ε gets
smaller, say one thousandth. Or a new material is discovered which allows even smaller
ε. But no matter what ε is determined ahead of time, for the prediction to be believable,
we need to determine a fixed range of x, within a δ of a but not equal to a, for which the
f -values are within the given ε of the prediction.
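[Figure: graph of f near a, with the horizontal band from L − ε to L + ε and the vertical band from a − δ to a + δ.]
If ε gets smaller, δ has to get smaller too; but we may keep the old δ for larger ε.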
While these pictures can help our intuition, they do not constitute a proof: the
definition is an algebraic formulation, and as such it requires algebraic proofs. In the rest
of the section we examine many examples algebraically, with the goal of mastering the
epsilon-delta proofs. But epsilon-delta proofs are time-consuming, so in the future we will
want to replace them with some shortcuts. We will have to prove that those shortcuts
are logically correct, and the proofs will require mastering abstract epsilon-delta proofs.
Naturally, before we can master abstract epsilon-delta proofs, we need to be comfortable
with epsilon-delta proofs on concrete examples. In short, in order to be able to avoid
epsilon-delta proofs, we have to master them. (Ha!)
“|(4x − 5) − 7| < ε” because we do not know that yet. We write the left side
of this inequality, and manipulate it algebraically, often with triangle
inequalities and several steps, until we get < ε.] Then
The commentary in the proof above is describing the thought process behind the
proof but need not and should not be written out in homework solutions. Below is a
homework-style solution:
Proof of Example 4.1.2 without the commentary: The function that takes x to 4x − 5 is
a polynomial function, so it is defined for all complex numbers. Thus the domain of the
function is C and 3 is a limit point of the domain. Let ε > 0. Set δ = ε/4. Then δ is a
positive real number. Let x be an arbitrary complex number. Assume that 0 < |x − 3| < δ.
Then
|(4x − 5) − 7| = |4x − 12|
= 4|x − 3|
< 4δ
= 4 · ε/4
= ε.
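The choice δ = ε/4 can be sanity-checked by sampling complex numbers x with 0 < |x − 3| < δ. The sketch below is an illustration under stated assumptions, not a substitute for the proof: the helper name check_delta, the number of samples, and the factor 0.999 (which keeps samples slightly inside the punctured ball to avoid rounding effects at the boundary) are choices made here.

```python
import cmath
import random

def check_delta(f, a, L, eps, delta, samples=20_000, seed=1):
    """Return True if every sampled x with 0 < |x - a| < delta has |f(x) - L| < eps."""
    rng = random.Random(seed)
    for _ in range(samples):
        # 1.0 - rng.random() lies in (0, 1], so 0 < radius < delta.
        radius = 0.999 * delta * (1.0 - rng.random())
        angle = 2 * cmath.pi * rng.random()
        x = a + cmath.rect(radius, angle)
        if abs(f(x) - L) >= eps:
            return False
    return True

def f(x):
    return 4 * x - 5

for eps in (1.0, 0.1, 1e-3):
    print(eps, check_delta(f, 3, 7, eps, eps / 4))
```

Passing a δ that is too large, for instance δ = ε, makes the same helper report a failure, which is one way to see that the factor 1/4 is doing real work.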
= 25 · |x + 2| [Goal #2 accomplished.]
[Now go back to specifying δ at the beginning of the proof by filling in
with: δ = min{1, ε/B}. This means that δ is the smaller of 1 and ε/B, and in
particular δ ≤ 1 and δ ≤ ε/B.]
< 25 · δ
≤ 25 · ε/25
= ε.
The final version of the proof of limx→−2 (4x² − 5x + 2) = 28 then looks like this:
The function that takes x to 4x² − 5x + 2 is a polynomial function and it is defined
for all complex numbers. Thus −2 is a limit point of the domain. Let ε > 0. Set δ =
min{1, ε/25}. Then δ is a positive real number. Let x be any complex number. Suppose
that 0 < |x + 2| < δ. Then
|(4x² − 5x + 2) − 28| = |4x² − 5x − 26|
= |(x + 2)(4x − 13)|
= |4x − 13| · |x + 2|
= |4(x + 2 − 2) − 13| · |x + 2| (adding a clever zero)
= |4(x + 2) − 21| · |x + 2|
≤ (|4(x + 2)| + 21) · |x + 2|
(by the triangle inequality |a ± b| ≤ |a| + |b|)
< (4 · 1 + 21) · |x + 2| (since |x + 2| < δ ≤ 1)
= 25 · |x + 2|
< 25 · δ
≤ 25 · ε/25
= ε.
The next example has the same type of discovery work with fewer comments.
Proof. The function that takes x to 4/x² is defined for all non-zero complex numbers, so
−1 is a limit point of the domain. Let ε > 0. Set δ = ______. Then δ is a positive
real number. Let x be any complex number that satisfies 0 < |x + 1| < δ. Then
|4/x² − 4| = |(4 − 4x²)/x²|
= 4 |(1 − x)(1 + x)/x²|
= 4 |(1 − x)/x²| · |1 + x|
[Goal #1 accomplished: something times |x − a|.]
[Now we want the something 4|(1 − x)/x²| to be at most some constant. Certainly
if we allow x to get close to 0, then (1 − x)/x² is very large, so in order to
find an upper bound, we need to make sure that x stays away from 0. Since x
is within δ of −1, in order to avoid 0 we need to make sure that δ is strictly
smaller than 1. For example, make sure that δ ≤ 0.4. Thus, on the δ line
write: δ = min{0.4, ______}.]
= 4 (|2 − (x + 1)|/|x|²) · |1 + x| (by rewriting 1 − x = 2 − (x + 1))
≤ 4 ((2 + |x + 1|)/|x|²) · |1 + x| (by the triangle inequality)
≤ 4 (2.4/|x|²) · |1 + x| (since δ ≤ 0.4)
= (9.6/|x + 1 − 1|²) · |1 + x| (by rewriting x = x + 1 − 1)
≤ (9.6/0.6²) · |1 + x| (by the reverse triangle inequality because
|x + 1 − 1| ≥ 1 − |x + 1| > 1 − δ ≥ 1 − 0.4 = 0.6,
so that 1/|x + 1 − 1|² < 1/0.6²)
[On the δ-line now write: δ = min{0.4, 0.6²ε/9.6}.]
< (9.6/0.6²) · δ
≤ (9.6/0.6²) · (0.6²ε/9.6)
= ε.
Example 4.1.5. limx→2i (4x³ + x)/(8x − i) = −2.
Proof. The domain of (4x³ + x)/(8x − i) consists of all complex numbers different from i/8, so 2i is a
limit point of the domain. Let ε > 0. Set δ = min{1, ε/9}. [It is so obvious that this
minimum of two positive numbers is positive that we skip the assertion “Thus
δ is a positive real number.” Do not omit the assertion or the checking of
its veracity for more complicated specifications of δ.] Let x be any complex
number different from i/8 such that 0 < |x − 2i| < δ. [Here we merged: “Let x be any
complex number different from i/8. Let x satisfy 0 < |x − 2i| < δ.” into one
shorter and logically equivalent statement “Let x be any complex number
different from i/8 such that 0 < |x − 2i| < δ.”] Then
|(4x³ + x)/(8x − i) − (−2)| = |(4x³ + x)/(8x − i) + 2|
= |(4x³ + x + 16x − 2i)/(8x − i)|
= |(4x³ + 17x − 2i)/(8x − i)|
= |(4x² + 8ix + 1)(x − 2i)/(8x − i)|
= |(4x² + 8ix + 1)/(8x − i)| · |x − 2i|
(Goal #1 is accomplished: x − a is a factor.)
≤ ((|4x²| + |8ix| + 1)/|8x − i|) · |x − 2i| (by the triangle inequality)
≤ ((4|x − 2i + 2i|² + 8|x − 2i + 2i| + 1)/|8(x − 2i) + 15i|) · |x − 2i|
≤ ((4(|x − 2i| + 2)² + 8(|x − 2i| + 2) + 1)/(−8|x − 2i| + 15)) · |x − 2i|
(by the triangle and reverse triangle inequalities)
≤ ((4(1 + 2)² + 8(1 + 2) + 1)/(−8 + 15)) · |x − 2i|
(since |x − 2i| < δ ≤ 1)
= (61/7) · |x − 2i|
< 9δ
≤ 9 · ε/9
= ε.
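Two steps of this proof are easy to get wrong by hand: the factorization of 4x³ + 17x − 2i and the constant 61/7. The Python sketch below spot-checks both numerically. It is a sanity check under stated assumptions, not a proof: the function names, the sample point, and the sampling range (radii between 0.001 and 0.999, to stay inside 0 < |x − 2i| < 1 and away from rounding trouble) are choices made here.

```python
import cmath
import random

def f(x):
    """The function from Example 4.1.5."""
    return (4 * x**3 + x) / (8 * x - 1j)

def factorization_gap(x):
    """|(4x^2 + 8ix + 1)(x - 2i) - (4x^3 + 17x - 2i)|, which should be 0 up to rounding."""
    return abs((4 * x**2 + 8j * x + 1) * (x - 2j) - (4 * x**3 + 17 * x - 2j))

def max_ratio(samples=50_000, seed=2):
    """Largest sampled value of |f(x) - (-2)| / |x - 2i| over 0 < |x - 2i| < 1."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        radius = 0.001 + 0.998 * rng.random()
        x = 2j + cmath.rect(radius, 2 * cmath.pi * rng.random())
        worst = max(worst, abs(f(x) + 2) / abs(x - 2j))
    return worst

print(factorization_gap(0.3 + 1.7j))
print(max_ratio(), 61 / 7)
```

The sampled ratio stays below 61/7, as the chain of inequalities predicts; the bound is not claimed to be sharp.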
Proof. Any a is a limit point of the domain of the given polynomial function. Let ε > 0.
Set δ = min{1, ε/(1 + |2a − 2|)}. Let x satisfy 0 < |x − a| < δ. Then
|(x² − 2x) − (a² − 2a)| = |(x² − a²) − (2x − 2a)| (by algebra)
= |(x + a)(x − a) − 2(x − a)| (by algebra)
Remark 4.1.7. Note that δ depends on ε and a, which are constants in the problem; δ is
not allowed to depend on x, as the definition goes:
“for all ε > 0 there exists δ > 0 such that for all x, etc”
so that x depends on δ, but δ does not depend on x.
By the definition of limits, δ is supposed to be a positive real number, not a function of x.
(See also Exercise 4.1.1.)
Remark 4.1.8. In all cases of rational functions, such as in examples above, Goal #1 is
to factor x − a from f (x) − L: if x − a is not a factor, check your limit or algebra for any
mistakes. In the next example, x − a is not a factor, but √(x − a) is.
Example 4.1.9. limx→3 √(2x − 6) = 0.
Proof. The domain here is all x ≥ 3. So 3 is a limit point of the domain. Let ε > 0. Set
δ = ε²/2. Let x > 3 satisfy 0 < |x − 3| < δ. Then
√(2x − 6) = √2 · √(x − 3)
< √2 · √δ (because √ is an increasing function)
= √2 · √(ε²/2)
= ε.
Often books consider the last example as a case of a one-sided limit (see definition
below) since we can only take the x from one side of 3. Our definition handles both-sided
and one-sided and all sorts of other limits with one simple notation, but we do have a use
for one-sided limits as well, so we define them next.
With this, Example 4.1.9 can be phrased as limx→3+ √(2x − 6) = 0, and the proof goes
as follows: The domain A consists of all x ≥ 3, and 3 is a limit point of
A ∩ {x : x > 3} = A \ {3}.
Let ε > 0. Set δ = ε²/2. Let x satisfy 0 < x − 3 < δ. Then
√(2x − 6) = √2 · √(x − 3)
< √2 · √δ
= √2 · √(ε²/2)
= ε.
Thus, the two proofs are almost identical. Note that limx→3− √(2x − 6) does not exist
because 3 is not a limit point of A ∩ {x ∈ R : x < 3} = ∅.
One-sided limits can also be used in contexts where limx→a f (x) does not exist.
Below is one example.
Example 4.1.11. Let f : R → R be given by
f (x) = x² + 4, if x > 1;
f (x) = x − 2, if x ≤ 1.
Then limx→1+ f (x) = 5, limx→1− f (x) = −1.
Proof. Let ε > 0. Set δ = min{1, ε/3}. Let x satisfy 0 < x − 1 < δ. Then
|f (x) − 5| = |x² + 4 − 5| (since x > 1)
= |x² − 1|
= |(x + 1)(x − 1)|
< |x + 1|δ (since x > 1, so x + 1 is positive)
= |x − 1 + 2|δ (by adding a clever 0)
≤ (|x − 1| + 2)δ (by the triangle inequality)
< (1 + 2)δ (since 0 < x − 1 < δ ≤ 1)
≤ 3 · ε/3
= ε.
This proves that limx→1+ f (x) = 5.
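The right-sided claim can be checked on sample points, and the left-sided value can be observed by evaluating f just below 1. The following Python sketch is only an illustration: the helper name right_limit_ok and the sampling range are assumptions made here, and the left-sided limit still needs its own epsilon-delta proof.

```python
import random

def f(x):
    """The piecewise function from Example 4.1.11."""
    return x * x + 4 if x > 1 else x - 2

def right_limit_ok(eps, samples=10_000, seed=3):
    """Check |f(x) - 5| < eps on sampled x with 0 < x - 1 < delta, delta = min{1, eps/3}."""
    delta = min(1.0, eps / 3)
    rng = random.Random(seed)
    for _ in range(samples):
        # x - 1 lies between 0.001*delta and 0.999*delta, hence strictly inside (0, delta).
        x = 1 + delta * (0.001 + 0.998 * rng.random())
        if abs(f(x) - 5) >= eps:
            return False
    return True

for eps in (10.0, 1.0, 1e-4):
    print(eps, right_limit_ok(eps))

# Values just to the left of 1 are close to -1, consistent with the left-sided limit.
print(f(1 - 1e-6))
```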
4.1.2. Fill in the blanks of the following proof that limx→2 (x² − 3x) = −2. Explain why
none of the inequalities can be changed into equalities.
4.1.3. Determine the following limits and prove them with the epsilon-delta proofs.
i) limx→1 (x³ − 4).
ii) limx→2 1/x.
iii) limx→3 (x − 4)/(x² + 2).
iv) limx→4 √(x + 5).
v) limx→3 (x² − 9)/(x − 3).
4.1.4. Rework Example 4.1.3 with choosing δ to be at most 2 rather than at most 1.
4.1.5. Let b ∈ C and f, g : C → C with
f (x) = x³ − 4x², if x ≠ 5;   f (x) = b, if x = 5,
g(x) = x³ − 4x², if x = 5;   g(x) = b, if x ≠ 5.
Prove that the limit of f (x) as x approaches 5 is independent of b, but that the limit of
g(x) as x approaches 5 depends on b.
4.1.6. Suppose that a is a limit point of {x ∈ A : x > a} and of {x ∈ A : x < a}. Prove
that limx→a f (x) = L if and only if limx→a+ f (x) = L and limx→a− f (x) = L.
4.1.7. Suppose that a is a limit point of {x ∈ A : x > a} but not of {x ∈ A : x < a}.
Prove that limx→a f (x) = L if and only if limx→a+ f (x) = L.
4.1.8. Prove that limx→a (mx + l) = ma + l, where m and l are constants.
4.1.9. Let f : R → R be given by f (x) = x/|x|. Prove that limx→0+ f (x) = 1 and that
limx→0− f (x) = −1.
4.1.10. Find a function f : R → R such that limx→0− f (x) = 2 and limx→0+ f (x) = −5.
4.1.11. Find a function f : R → R such that limx→0− f (x) = 2, limx→0+ f (x) = −5,
limx→1− f (x) = 3, and limx→1+ f (x) = 0. (Try to define such a function with fewest
possible words or symbols, but do use full grammatical sentences.)
Recall that limx→a f (x) = L means that a is a limit point of the domain of f , and
that for all real numbers ε > 0 there exists a real number δ > 0 such that for all x in the
domain of f , if 0 < |x − a| < δ then |f (x) − L| < ε. [Think of limx→a f (x) = L as
statement P , a being a limit point of the domain as statement Q, and the
epsilon-delta part as statement R. By definition, P is logically the same as
the statement Q and R.]
Thus if limx→a f (x) ≠ L, then either a is not a limit point of the domain of f or else
it is not true that for all real numbers ε > 0 there exists a real number δ > 0 such that for
all x in the domain of f , if 0 < |x − a| < δ then |f (x) − L| < ε. [This simply says that
not P is the same as ( not Q) or ( not R).]
In particular,
limx→a f (x) ≠ L and a is a limit point of the domain of f
means that it is not true that for all real numbers ε > 0 there exists a real number δ > 0
such that for all x in the domain of f , if 0 < |x − a| < δ then |f (x) − L| < ε. [This says
that ( not P ) and Q is the same as not R. You may want to write truth tables
for yourself.]
Negations of compound sentences, such as in the previous paragraph, are typically
hard to process and to work with in proofs. But by the usual negation rules of compound
statements (see chart on page 32), we successively rewrite this last negation into a form
that is easier to handle:
not For all real numbers ε > 0 there exists a real number δ > 0
such that for all x in the domain of f , if 0 < |x − a| < δ
then |f (x) − L| < ε.
[Negation of “For all z of some kind, property P holds” is “There is some z
of that kind for which P is false.” Hence the following rephrasing:]
= There exists a real number ε > 0 such that not there exists a
real number δ > 0 such that for all x in the domain of f , if
0 < |x − a| < δ then |f (x) − L| < ε.
[Negation of “There exists z of some kind such that property P holds” is
“For all z of that kind, P is false.” Hence the following rephrasing:]
= There exists a real number ε > 0 such that for all real numbers
δ > 0, not for all x in the domain of f , if 0 < |x − a| < δ
then |f (x) − L| < ε.
[Negation of “For all z of some kind, property P holds” is “There is some z
of that kind for which P is false.” Hence the following rephrasing:]
= There exists a real number ε > 0 such that for all real numbers
δ > 0, there exists x in the domain of f such that not if
0 < |x − a| < δ then |f (x) − L| < ε.
[Negation of “If P then Q” is “P and not Q.” Hence the following rephrasing:]
= There exists a real number ε > 0 such that for all real numbers
δ > 0, there exists x in the domain of f such that 0 <
|x − a| < δ and not |f (x) − L| < ε.
= There exists a real number ε > 0 such that for all real numbers
δ > 0, there exists x in the domain of f such that 0 <
|x − a| < δ and |f (x) − L| ≥ ε.
Theorem 4.2.1. If a is a limit point of the domain of f , then limx→a f (x) ≠ L means
that there exists a real number ε > 0 such that for all real numbers δ > 0, there exists x
in the domain of f such that 0 < |x − a| < δ and |f (x) − L| ≥ ε.
Example 4.2.2. The limit of x/|x| as x approaches 0 does not exist. In other words, for all
complex numbers L, limx→0 x/|x| ≠ L.
The domain of the function that takes x to x/|x| is the set of all non-zero complex
numbers. For each non-zero x, x/|x| is a complex number of length 1 and with the same
angle as x. Thus the image of this function is the unit circle in C. Note that it is possible
to take two non-zero x very close to 0 but at different angles so that their images on the
unit circle are far apart. This is a geometric reasoning why the limit cannot exist. Next
we give an epsilon-delta proof.
Proof. The domain of the function that takes x to x/|x| is the set of all non-zero complex
numbers, so that 0 is a limit point of the domain. [Thus if the limit is not L, then
it must be the epsilon-delta condition that fails.] Set ε = 1. Let δ > 0 be an
arbitrary positive number. Let x = −δ/2 if Re(L) ≥ 0, and let x = δ/2 otherwise. Then
0 < |x| = |x − 0| < δ. If Re(L) ≥ 0, then
Re(x/|x| − L) = Re((−δ/2)/|−δ/2| − L) = −1 − Re(L) ≤ −1,
so that |x/|x| − L| ≥ 1 = ε, and if Re(L) < 0, then
Re(x/|x| − L) = Re((δ/2)/|δ/2| − L) = 1 − Re(L) > 1,
so that again |x/|x| − L| > 1 = ε. This proves the claim of the example.
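The proof is constructive: for each candidate L and each δ it names a specific x. A short Python sketch can follow that recipe and confirm the two required inequalities for a few sample values. The function names witness and gap and the sample values of L and δ are assumptions made here for illustration.

```python
def witness(L, delta):
    """The point x chosen in the proof: -delta/2 if Re(L) >= 0, and delta/2 otherwise."""
    L = complex(L)
    return -delta / 2 if L.real >= 0 else delta / 2

def gap(L, delta):
    """|x/|x| - L| at the witness x; the proof shows that this is at least 1."""
    x = witness(L, delta)
    return abs(x / abs(x) - complex(L))

for L in (0, 1, -1, 1j, 0.5 - 2j, -0.3 + 0.1j):
    for delta in (1.0, 1e-3, 1e-9):
        x = witness(L, delta)
        print(L, delta, 0 < abs(x) < delta, gap(L, delta) >= 1)
```

Every printed pair of booleans is expected to be True, True, which is the statement of Theorem 4.2.1 with ε = 1 restricted to these samples.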
Example 4.2.3. For all L ∈ C, limx→2 i/(x − 2) ≠ L.
A geometric reason for the non-existence of this limit is that as x gets closer to 2
(but not equal to 2), the size of i/(x − 2) gets larger and larger.
Proof. Set ε = 1. Let δ > 0 be an arbitrary positive number. Set δ′ = min{δ, 1/(|L| + 1)}.
Let x = 2 + δ′/2. Then 0 < |x − 2| < δ′ ≤ δ, and
|i/(x − 2) − L| = |2i/δ′ − L|
≥ |2i/δ′| − |L| (by the reverse triangle inequality)
≥ 2(|L| + 1) − |L| (since δ′ ≤ 1/(|L| + 1))
≥ 1
= ε.
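As with the previous example, the witness x = 2 + δ′/2 can be computed explicitly. The Python sketch below follows the recipe of the proof for a few sample values; the function names and the samples are assumptions made here. The values of δ are kept moderately small only, because computing x − 2 in floating point loses accuracy when δ′ is extremely small.

```python
def witness(L, delta):
    """The point x = 2 + delta'/2 from the proof, with delta' = min{delta, 1/(|L| + 1)}."""
    delta_prime = min(delta, 1 / (abs(complex(L)) + 1))
    return 2 + delta_prime / 2

def gap(L, delta):
    """|i/(x - 2) - L| at the witness x; the proof shows that this is at least 1."""
    x = witness(L, delta)
    return abs(1j / (x - 2) - complex(L))

for L in (0, 3 + 4j, -10, 1j):
    for delta in (1.0, 0.1, 1e-6):
        x = witness(L, delta)
        print(L, delta, 0 < abs(x - 2) < delta, gap(L, delta) >= 1)
```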
Example 4.2.4. For f : R → R given by the graph below, limx→2 f (x) does not exist because
of the jump in the function at 2.
[Figure: graph of f with horizontal axis ticks at 1, 2, 3 and a jump at x = 2.]
Here is an epsilon-delta proof. Say that the limit exists. Call it L. Set ε = 1/4. Let δ
be an arbitrary positive number. If L ≥ 3/2, set x = 2 + min{1/4, δ/2}, and if L < 3/2, set
4.2.4. Prove that for all a ∈ R, limx→a f (x) does not exist, where f : R → R is defined by
f (x) = 1, if x is rational;
f (x) = 0, if x is not rational.
4.2.5. Prove that limx→0 (√x − √(−x)) does not exist. (Hint: The reason is different from
the reasons in other examples in this section.)
The purpose of this section is to show that even small changes in the definition of
limits affect the meaning significantly. A lesson to be learned is that it is important to
remember any formal statement precisely.
Here is a restatement of Definition 4.1.1 for limx→a f (x) = L when a is a limit point
of the domain A of f :
∀ε > 0 ∃δ > 0 ∀x ∈ A if 0 < |x − a| < δ then |f (x) − L| < ε. (4.3.1)
Example 4.3.2. Suppose that in Statement (4.3.1) we switch the order of the first two
quantifiers:
∃δ > 0 ∀ε > 0 ∀x ∈ A if 0 < |x − a| < δ then |f (x) − L| < ε.
Let f : C → C be the function given by f (x) = x. By Exercise 4.1.8, limx→a f (x) = a, but
this f does not satisfy the modified definition above because no matter what δ is taken, the
conditional fails for any ε < δ/2. Thus this modification of the definition of limits changes
the meaning.
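For f (x) = x and L = a, the failure of the swapped statement can be made concrete: given any δ, one explicit choice is ε = δ/4 (which is smaller than δ/2) together with x = a + δ/2. The Python sketch below checks that this pair breaks the conditional; the function name and the particular choices of ε and x are assumptions made here, and many other choices work equally well.

```python
def counterexample(a, delta):
    """For f(x) = x and L = a, return (eps, x) with 0 < |x - a| < delta but |f(x) - L| >= eps."""
    eps = delta / 4        # any positive eps below delta/2 would do
    x = a + delta / 2      # a point strictly inside the punctured ball around a
    return eps, x

for a in (0, 1 + 2j):
    for delta in (1.0, 1e-3):
        eps, x = counterexample(a, delta)
        print(a, delta, 0 < abs(x - a) < delta, abs(x - a) >= eps)
```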
Example 4.3.3. Suppose that in Statement (4.3.1) we switch the order of the second and
third quantifiers:
Example 4.3.4. Suppose that in Statement (4.3.1) we replace the first ∀ with ∃:
∃ε > 0 ∃δ > 0 ∀x ∈ A if 0 < |x − a| < δ then |f (x) − L| < ε.
Then whenever limx→a f (x) = L, the modified statement is satisfied with any complex
number L′ in place of L. Namely, set ε = 1 + |L′ − L|. By the definition of limx→a f (x) = L
there exists δ > 0 such that for all x ∈ A, if 0 < |x − a| < δ then |f (x) − L| < 1. Then
|f (x) − L′| = |f (x) − L + L − L′|
≤ |f (x) − L| + |L − L′| (by the triangle inequality)
< 1 + |L − L′|
= ε.
This means that with this modification of the definition of limits, limits would not be
unique (but they are by Theorem 4.4.1).
Furthermore, the function x/|x|, for which we proved in Example 4.2.2 that no limit
exists at a = 0, satisfies this last modified statement for any L.
Example 4.3.5. Suppose that in Statement (4.3.1) we replace the first ∃ with ∀:
Example 4.3.6. Suppose that in Statement (4.3.1) we replace the conditional with the
conjunction:
∀ε > 0 ∃δ > 0 ∀x ∈ A 0 < |x − a| < δ and |f (x) − L| < ε.
This modification fails for every function f that is not equal to the constant L on A \ {a}.
Namely, for the condition 0 < |x − a| < δ to hold, necessarily δ must be so large
that A ⊆ B(a, δ); and for |f (x) − L| < ε to hold for all x ∈ A \ {a} and all ε > 0, by
Theorem 2.11.4, f (x) = L. Thus again, this modification of the definition of limits changes
the meaning.
So far we have examined modifications of Statement (4.3.1) in which we switched
the order of quantifiers, we switched a quantifier, or we changed the conditional into a
conjunction. In all cases the modification resulted in a different meaning. We can modify
Statement (4.3.1) in many other ways and get even further meanings.
The lack of the quantifier on x would then again tacitly signal the universal quantifier,
whereas the correct negation calls for the existential quantifier. In short, we cannot omit
a quantifier.
We finish the section with a modification of Statement (4.3.1) that does not change
the meaning: we replace the last two occurrences of “<” in the statement with “≤”.
Proof. First suppose that limx→a f (x) = L. We will prove that the modified statement
also holds. Let ε > 0. By assumption there exists δ′ > 0 such that for all x ∈ A, if
0 < |x − a| < δ′ then |f (x) − L| < ε. Set δ = δ′/2. Let x ∈ A such that 0 < |x − a| ≤ δ.
Then 0 < |x − a| < δ′, so that |f (x) − L| < ε, and hence |f (x) − L| ≤ ε. Thus the modified
statement holds.
Conversely, suppose that the modified statement holds. We will prove that
limx→a f (x) = L. Let ε > 0. By assumption there exists δ > 0 such that for all x ∈ A,
if 0 < |x − a| ≤ δ then |f (x) − L| ≤ ε/2. Let x ∈ A such that 0 < |x − a| < δ. Then
0 < |x − a| ≤ δ, so that |f (x) − L| ≤ ε/2 < ε. Thus limx→a f (x) = L.
While epsilon-delta proofs are a reliable method for proving limits, they do not help
in deciding what a limit may be. In this section we prove theorems that will efficiently
establish the limits for many functions. The proofs of these theorems require the
epsilon-delta machinery, as this is the definition of limits, but subsequent applications of these
theorems allow us to omit the time-consuming epsilon-delta proofs.
which says that |L1 − L2| < ε. Since ε was arbitrary, an application of Theorem 2.11.4
gives that |L1 − L2| = 0, so that L1 − L2 = 0, and hence that L1 = L2.
(If in the definition of limx→a f (x) we did not require that a be a limit point of the
domain of f , then x as in the proof above would not exist for small δ, so any complex
number L would vacuously satisfy the definition of limits. We would thus not be able
to guarantee that limits are unique. Perhaps the definition of limits is technical, but the
technicalities are there for a good reason.)
Theorem 4.4.2. Let a be a limit point of the domain of a function f with lim_{x→a} f(x) =
L. Suppose that L ≠ 0. Then there exists δ > 0 such that for all x in the domain of f, if
0 < |x − a| < δ, then |f(x)| > |L|/2.
In particular, there exists δ > 0 such that for all x in the domain of f, if 0 < |x − a| <
δ, then f(x) ≠ 0.
Proof. Since |L|/2 > 0, there exists δ > 0 such that for all x in the domain of f, if
0 < |x − a| < δ, then |f(x) − L| < |L|/2. Hence by the reverse triangle inequality, for the
same x, |L|/2 > |f(x) − L| ≥ |L| − |f(x)|, so that by adding |f(x)| − |L|/2 to both sides we
get that |f(x)| > |L|/2. In particular f(x) ≠ 0.
The following theorem is very important, so study it carefully.
Theorem 4.4.3. Let A be the domain of f and g, and let a be a limit point of A, and let
c ∈ C. Suppose that limx→a f (x) and limx→a g(x) both exist. Then
(1) (Constant rule) lim_{x→a} c = c.
(2) (Linear rule) lim_{x→a} x = a.
(3) (Scalar rule) lim_{x→a} c f(x) = c lim_{x→a} f(x).
(4) (Sum/difference rule) lim_{x→a} (f(x) ± g(x)) = lim_{x→a} f(x) ± lim_{x→a} g(x).
(5) (Product rule) lim_{x→a} (f(x) · g(x)) = lim_{x→a} f(x) · lim_{x→a} g(x).
(6) (Quotient rule) If lim_{x→a} g(x) ≠ 0, then lim_{x→a} (f(x)/g(x)) = (lim_{x→a} f(x)) / (lim_{x→a} g(x)).
(4) Let ε > 0. Since L = lim_{x→a} f(x), there exists δ1 > 0 such that for all x ∈ A,
if 0 < |x − a| < δ1, then |f(x) − L| < ε/2. Similarly, since K = lim_{x→a} g(x), there exists
δ2 > 0 such that for all x ∈ A, if 0 < |x − a| < δ2, then |g(x) − K| < ε/2.
Set δ = min{δ1, δ2}. Then δ is a positive number. Let x ∈ A satisfy 0 < |x − a| < δ.
Then (4) follows from:
|(f(x) ± g(x)) − (L ± K)| = |(f(x) − L) ± (g(x) − K)| (by algebra)
≤ |f(x) − L| + |g(x) − K| (by the triangle inequality)
< ε/2 + ε/2 (because 0 < |x − a| < δ ≤ δ1, δ2)
= ε.
(5) Let ε > 0. Then min{ε/(2|L| + 2), 1}, min{ε/(2|K| + 1), 1} are positive numbers.
Since L = lim_{x→a} f(x), there exists δ1 > 0 such that for all x ∈ A, if 0 < |x − a| < δ1, then
|f(x) − L| < min{ε/(2|K| + 1), 1}. Similarly, since K = lim_{x→a} g(x), there exists δ2 > 0
such that for all x ∈ A, if 0 < |x − a| < δ2, then |g(x) − K| < ε/(2|L| + 2).
Set δ = min{δ1, δ2}. Then δ is a positive number. Let x ∈ A satisfy 0 < |x − a| < δ.
Then by the triangle inequality,
|f(x)| = |f(x) − L + L| ≤ |f(x) − L| + |L| < 1 + |L|,
and so
|f(x) · g(x) − L · K| = |f(x) · g(x) − f(x)K + f(x)K − L · K| (by adding a clever zero)
≤ |f(x) · g(x) − f(x)K| + |f(x)K − L · K| (by the triangle inequality)
= |f(x)| · |g(x) − K| + |f(x) − L| · |K| (by factoring)
< (1 + |L|) · ε/(2|L| + 2) + ε/(2|K| + 1) · |K| (since δ ≤ δ1, δ2)
< ε/2 + ε/2
= ε.
(6) Let ε > 0. Since K ≠ 0, by Theorem 4.4.2, there exists δ0 > 0 such that for all
x ∈ A, if 0 < |x − a| < δ0, then |g(x)| > |K|/2.
The numbers ε|K|/4, ε|K|²/(4|L| + 1) are positive numbers. Since L = lim_{x→a} f(x),
there exists δ1 > 0 such that for all x ∈ A, if 0 < |x − a| < δ1, then |f(x) − L| < ε|K|/4.
Similarly, since K = lim_{x→a} g(x), there exists δ2 > 0 such that for all x ∈ A, if 0 < |x − a| <
δ2, then |g(x) − K| < ε|K|²/(4|L| + 1).
Set δ = min{δ0, δ1, δ2}. Then δ is a positive number. Let x ∈ A satisfy 0 < |x − a| < δ.
Then
|f(x)/g(x) − L/K| = |(f(x)K − Lg(x)) / (Kg(x))| (by algebra)
= |(f(x)K − LK + LK − Lg(x)) / (Kg(x))| (by adding a clever zero)
≤ |(f(x)K − LK) / (Kg(x))| + |(LK − Lg(x)) / (Kg(x))| (by the triangle inequality)
= |f(x) − L| / |g(x)| + (|L|/|K|) · |K − g(x)| / |g(x)| (by factoring)
< (ε|K|/4) · (2/|K|) + (|L|/|K|) · (ε|K|²/(4|L| + 1)) · (2/|K|) (since δ ≤ δ0, δ1, δ2)
< ε/2 + ε/2
= ε.
This proves (6) and thus the theorem.
Theorem 4.4.4. (Power rule for limits) Let n be a positive integer. If lim_{x→a} f(x) = L,
then lim_{x→a} f(x)^n = L^n.
Proof. The case n = 1 is the assumption. Suppose that we know the result for n − 1. Then
lim_{x→a} f(x)^n = lim_{x→a} (f(x)^(n−1) · f(x)) (by algebra)
= lim_{x→a} f(x)^(n−1) · lim_{x→a} f(x) (by the product rule)
= L^(n−1) · L (by induction assumption)
= L^n.
So the result holds for n, and we are done by mathematical induction.
Theorem 4.4.5. (Polynomial function rule for limits) Let f be a polynomial func-
tion. Then for all complex (or real) a, limx→a f (x) = f (a).
= c_0 + c_1 a + c_2 a^2 + ··· + c_n a^n
= f(a).
Theorem 4.4.6. (Rational function rule for limits) Let f be a rational function.
Then for all complex (or real) a in the domain of f , limx→a f (x) = f (a).
Proof. Let a be in the domain of f. Write f(x) = g(x)/h(x) for some polynomial functions
g, h such that h(a) ≠ 0. By Theorem 2.4.15, the domain of f is the set of all except finitely
many numbers, so that in particular a is a limit point of the domain. By the polynomial
function rule for limits, lim_{x→a} g(x) = g(a) and lim_{x→a} h(x) = h(a) ≠ 0. Thus by the
quotient rule, lim_{x→a} f(x) = g(a)/h(a) = f(a).
Theorem 4.4.7. (Absolute value rule for limits) For all a ∈ C, lim_{x→a} |x| = |a|.
Proof. This function is defined for all complex numbers, and so every a ∈ C is a limit point
of the domain. Let ε > 0. Set δ = ε. Then for all x ∈ C with 0 < |x − a| < δ, by the reverse
triangle inequality,
||x| − |a|| ≤ |x − a| < δ = ε.
Proof. First suppose that lim_{x→a} f(x) = L. Let ε > 0. By assumption there exists δ > 0
such that for all x ∈ A, 0 < |x − a| < δ implies that |f(x) − L| < ε. Then for the same x,
|Re f(x) − Re L| = |Re(f(x) − L)| ≤ |f(x) − L| < ε,
and similarly |Im f(x) − Im L| < ε, which proves that lim_{x→a} Re f(x) = Re L and
lim_{x→a} Im f(x) = Im L.
Now suppose that lim_{x→a} Re f(x) = Re L and lim_{x→a} Im f(x) = Im L. By the scalar
and sum rules in Theorem 4.4.3, and by the definition of real and imaginary parts,
lim_{x→a} f(x) = lim_{x→a} (Re f(x) + i Im f(x))
= lim_{x→a} Re f(x) + i lim_{x→a} Im f(x)
= Re L + i Im L
= L.
Proof. Let ε > 0. Since lim_{x→L} g(x) = g(L), there exists δ1 > 0 such that for all x in the
domain of g, if 0 < |x − L| < δ1 then |g(x) − g(L)| < ε. Since lim_{x→a} f(x) = L, there exists δ > 0
such that for all x in the domain of f, if 0 < |x − a| < δ then |f(x) − L| < δ1. Thus for the
same δ, if x is in the domain of h and 0 < |x − a| < δ, then |h(x) − g(L)| = |g(f(x)) − g(L)| < ε:
if f(x) = L the difference is 0, and otherwise 0 < |f(x) − L| < δ1.
Perhaps the hypotheses on g in the theorem above seem overly restrictive, and you
think that the limit of g(x) as x approaches L need not be g(L) but an arbitrary K?
Consider the following example which shows that lim_{x→a} g(f(x)) then need not be K. Let
f(x) = 5 and
g(x) = 3, if x ≠ 5;
g(x) = 7, otherwise.
Then
lim_{x→a} f(x) = 5, lim_{x→5} g(x) = 3, and lim_{x→a} g(f(x)) = 7.
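The counterexample can be tabulated numerically. The following is only an illustrative sketch: the choice a = 2 and the sample points are our own assumptions (any real a would do, since f is constant), and finitely many samples illustrate the limits rather than prove them.

```python
def f(x):
    """The constant function f(x) = 5."""
    return 5.0


def g(x):
    """g(x) = 3 for x != 5 and g(5) = 7."""
    return 3.0 if x != 5 else 7.0


a = 2.0  # arbitrary choice for illustration; f is constant, so any a works

# Sample points approaching a and approaching 5, never equal to the target.
near_a = [a + 10.0 ** (-k) for k in range(1, 8)]
near_5 = [5 + 10.0 ** (-k) for k in range(1, 8)]

values_g = [g(x) for x in near_5]      # values of g near 5 (but not at 5)
values_gf = [g(f(x)) for x in near_a]  # values of the composition near a

print(values_g)
print(values_gf)
```

Every sampled value of g near 5 equals 3, while every sampled value of g(f(x)) equals 7, because f(x) always lands exactly on the point 5 where g takes its exceptional value.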
Theorem 4.4.10. Suppose that f, g : A → R, that a is a limit point of A, that lim_{x→a} f(x)
and lim_{x→a} g(x) both exist, and that for all x ∈ A, f(x) ≤ g(x). Then
lim_{x→a} f(x) ≤ lim_{x→a} g(x).
Proof. Let L = lim_{x→a} f(x), K = lim_{x→a} g(x). Let ε > 0. By assumption there exists δ > 0
such that for all x ∈ A, if 0 < |x − a| < δ, then |f(x) − L|, |g(x) − K| < ε/2. Then for the
same x,
K − L = K − g(x) + g(x) − f(x) + f(x) − L ≥ K − g(x) + f(x) − L
≥ −|K − g(x) + f(x) − L|
≥ −|K − g(x)| − |f(x) − L| (by the triangle inequality)
> −ε/2 − ε/2 = −ε.
Since this is true for all ε > 0, by Theorem 2.11.4, K ≥ L, as desired.
Proof. [If we knew that limx→a g(x) existed, then by the previous theorem,
limx→a f (x) ≤ limx→a g(x) ≤ limx→a h(x) = limx→a f (x) would give that the three
limits are equal. But we have yet to prove that limx→a g(x) exists.]
Let L = lim_{x→a} f(x) = lim_{x→a} h(x). Let ε > 0. Since lim_{x→a} f(x) = L, there
exists δ1 > 0 such that for all x, if 0 < |x − a| < δ1 then |f(x) − L| < ε. Similarly,
since lim_{x→a} h(x) = L, there exists δ2 > 0 such that for all x, if 0 < |x − a| < δ2 then
|h(x) − L| < ε. Now set δ = min{δ1, δ2}. Let x satisfy 0 < |x − a| < δ. Then
−ε < f(x) − L ≤ g(x) − L ≤ h(x) − L < ε,
where the first inequality holds because δ ≤ δ1, and the last inequality holds because δ ≤ δ2.
Hence −ε < g(x) − L < ε, which says that |g(x) − L| < ε, so that lim_{x→a} g(x) = L.
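To see the choice δ = min{δ1, δ2} at work, here is a small numerical sketch on an example of our own choosing (not from the text): f(x) = −x², g(x) = x² sin(1/x), h(x) = x², a = 0, L = 0. For these particular f and h one may take δ1 = δ2 = √ε. The function names are hypothetical, and a finite sample is only a sanity check, not a proof.

```python
import math


def delta_for(eps):
    """For f(x) = -x^2 and h(x) = x^2, both deltas can be taken to be sqrt(eps)."""
    delta1 = math.sqrt(eps)
    delta2 = math.sqrt(eps)
    return min(delta1, delta2)


def g(x):
    """The squeezed function; it is not defined at x = 0."""
    return x * x * math.sin(1.0 / x)


def check(eps, samples=1000):
    """Sample points with 0 < |x| < delta and test the chain f <= g <= h and |g - 0| < eps."""
    d = delta_for(eps)
    for k in range(1, samples + 1):
        x = d * k / (samples + 1)
        for s in (x, -x):
            squeezed = -s * s <= g(s) <= s * s
            if not (squeezed and abs(g(s)) < eps):
                return False
    return True


print(all(check(eps) for eps in (1.0, 0.1, 1e-3, 1e-6)))
```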
Section 4.5: Infinite limits (for real-valued functions)
When the codomain of a function is a subset of an ordered field such as R, the values
of a function may grow larger and larger with no upper bound, or more and more negative
with no lower bound. In that case we may want to declare the limit to be ∞ or −∞. Naturally
both the definition and how we operate with infinite limits require different handling.
Note that we cannot use epsilon-delta proofs: no real numbers are within ε of infinity.
So instead we approximate infinity with huge numbers. In fact, infinity stands for that thing
which is comparable to and larger than any real number. Thus saying that for all M we can
find x near a with f(x) > M is simply saying that we can take f(x) arbitrarily large, which is
more succinctly expressed as saying that f(x) goes to ∞. (As far as many applications are
concerned, a real number larger than the number of atoms in the universe is as close to
infinity as realistically possible, but for proofs, the number of atoms in the universe is not
large enough.)
Example 4.5.2. lim_{x→0} 1/|x|² = ∞.
Proof. 0 is a limit point of the domain (in the field of real or complex numbers) of the
function that takes x to 1/|x|². Let M be a positive real number. Set δ = 1/√M. Let x
satisfy 0 < |x − 0| < δ, i.e., let x satisfy 0 < |x| < δ. Then
1/|x|² > 1/δ² (because 0 < |x| < δ)
= M.
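The choice δ = 1/√M can be sanity-checked numerically. The sketch below samples real points with 0 < |x| < δ for a few values of M; the helper names and the sample grid are our own assumptions, and passing the check does not replace the proof.

```python
import math


def delta_for(M):
    """The delta from the proof: 1/sqrt(M), for M > 0."""
    return 1.0 / math.sqrt(M)


def check(M, samples=1000):
    """Test 1/|x|^2 > M at sampled points with 0 < |x| < delta."""
    d = delta_for(M)
    xs = [d * k / (samples + 1) for k in range(1, samples + 1)]
    xs = xs + [-x for x in xs]
    return all(1.0 / abs(x) ** 2 > M for x in xs)


print(all(check(M) for M in (0.5, 1.0, 100.0, 1e6)))
```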
Example 4.5.3. lim_{x→5⁺} (x + 2)/(x² − 25) = ∞.
Proof. Certainly 5 is a limit point of the domain of the given function. Let M > 0. Set
δ = min{1, 7/(11M)}. Let x satisfy 0 < x − 5 < δ. Then
(x + 2)/(x² − 25) > (5 + 2)/(x² − 25) (because x > 5 and x² − 25 > 0)
= 7/(x + 5) · 1/(x − 5) (by algebra)
> 7/11 · 1/(x − 5) (because 0 < x − 5 < δ ≤ 1, so 0 < x + 5 < 11)
> 7/11 · 1/δ (because 0 < x − 5 < δ)
≥ 7/11 · 1/(7/(11M)) (because δ ≤ 7/(11M), so 1/δ ≥ 1/(7/(11M)))
= M.
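As a quick numerical sanity check of the choice δ = min{1, 7/(11M)}, the sketch below evaluates the function at sample points in (5, 5 + δ). The function names and the sampling grid are assumptions made for illustration only.

```python
def delta_for(M):
    """The delta from the proof: min{1, 7/(11 M)}, for M > 0."""
    return min(1.0, 7.0 / (11.0 * M))


def q(x):
    """The function x -> (x + 2)/(x^2 - 25)."""
    return (x + 2.0) / (x * x - 25.0)


def check(M, samples=1000):
    """Test q(x) > M at sampled points with 0 < x - 5 < delta."""
    d = delta_for(M)
    return all(q(5.0 + d * k / (samples + 1)) > M for k in range(1, samples + 1))


print(all(check(M) for M in (0.1, 0.7, 1.0, 10.0, 1e3, 1e6)))
```

Note that the cap δ ≤ 1 matters for small M: it is what keeps x + 5 below 11 in the third step of the proof.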
Example 4.5.4. lim_{x→0⁺} 1/x = ∞.
Proof. Let M > 0. Set δ = 1/M. Then for all x with 0 < x − 0 < δ, 1/x > 1/δ = M.
Example 4.5.5. lim_{x→0⁻} 1/x = −∞.
Proof. Let M < 0. Set δ = −1/M. Then δ is a positive number. Let x satisfy
0 < 0 − x < δ. By compatibility of order with multiplication by the positive real number
1/(−xδ),
1/δ < −1/x,
and so by compatibility of order with addition of 1/x − 1/δ,
1/x < −1/δ = M.
Example 4.5.6. We conclude that lim_{x→0} 1/x cannot be a real number, and it cannot be
either ∞ or −∞. Thus lim_{x→0} 1/x does not exist.
The following theorem is straightforward to prove, and it is left to the exercises.
but we do not have enough information to determine whether the limit of g(x) + h(x) as x
approaches a exists.
(3) (Product rule)
lim_{x→a} (f(x) · g(x)) = ∞, if L > 0; −∞, if L < 0;
lim_{x→a} (f(x) · h(x)) = −∞, if L > 0; ∞, if L < 0;
lim_{x→a} (g(x) · h(x)) = −∞.
We do not have enough information to determine the existence (or value) of lim_{x→a} (f(x) ·
g(x)) and of lim_{x→a} (f(x) · h(x)) in case L = 0.
(4) (Quotient rule)
lim_{x→a} f(x)/g(x) = 0,
lim_{x→a} f(x)/h(x) = 0,
lim_{x→a} g(x)/f(x) = ∞, if L > 0; −∞, if L < 0;
lim_{x→a} h(x)/f(x) = −∞, if L > 0; ∞, if L < 0.
We do not have enough information to determine the existence (or value) of lim_{x→a} (g(x)/f(x))
and of lim_{x→a} (h(x)/f(x)) in case L = 0. We also do not have enough information to determine
the existence (or value) of lim_{x→a} (g(x)/h(x)).
Section 4.6: Limits at infinity
Proof. Let M > 0. Set N = max{17, M^(1/4)}. Then for all x > N,
x^5 − 16x^4 = x^4 (x − 16)
≥ x^4 (because x > N ≥ 17)
> N^4 (because x > N)
≥ (M^(1/4))^4 (because N ≥ M^(1/4))
= M.
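The threshold N = max{17, M^(1/4)} can be tested on sample points. This is only a sketch under our own assumptions (helper names, the grid x = N + k/2): it checks finitely many x > N, whereas the proof covers all of them.

```python
def N_for(M):
    """The threshold from the proof: max{17, M^(1/4)}, for M > 0."""
    return max(17.0, M ** 0.25)


def p(x):
    """The function x -> x^5 - 16 x^4."""
    return x ** 5 - 16 * x ** 4


def check(M, samples=1000):
    """Test p(x) > M at sampled points x > N."""
    N = N_for(M)
    return all(p(N + 0.5 * k) > M for k in range(1, samples + 1))


print(all(check(M) for M in (1.0, 1e4, 1e8, 1e12)))
```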
Example 4.6.3. lim_{x→∞} (x^5 − 16x^4)/(x^5 + 4x^2) = 1.
Proof. Let ε > 0. Set N = max{1, 20/ε}. Then for all x > N,
|(x^5 − 16x^4)/(x^5 + 4x^2) − 1| = |(x^5 − 16x^4 − x^5 − 4x^2)/(x^5 + 4x^2)|
= |(−16x^4 − 4x^2)/(x^5 + 4x^2)|
= |(16x^4 + 4x^2)/(x^5 + 4x^2)|
= (16x^4 + 4x^2)/(x^5 + 4x^2) (because x > N ≥ 0)
≤ (16x^4 + 4x^2)/x^5 (because x^5 + 4x^2 ≥ x^5 > 0)
≤ (16x^4 + 4x^4)/x^5 (because x > N ≥ 1 so that x^2 < x^4)
= 20x^4/x^5
= 20/x
< 20/N (because x > N)
≤ 20/(20/ε) (because N ≥ 20/ε)
= ε.
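The bound N = max{1, 20/ε} from this proof can also be spot-checked. In the sketch below the helper names and the sample points x = N(1 + k/10) are assumptions chosen for illustration; the check is a plausibility test, not a substitute for the chain of inequalities.

```python
def N_for(eps):
    """The threshold from the proof: max{1, 20/eps}, for eps > 0."""
    return max(1.0, 20.0 / eps)


def q(x):
    """The function x -> (x^5 - 16 x^4)/(x^5 + 4 x^2)."""
    return (x ** 5 - 16 * x ** 4) / (x ** 5 + 4 * x ** 2)


def check(eps, samples=1000):
    """Test |q(x) - 1| < eps at sampled points x > N."""
    N = N_for(eps)
    return all(abs(q(N * (1 + k / 10)) - 1.0) < eps for k in range(1, samples + 1))


print(all(check(eps) for eps in (100.0, 1.0, 1e-2, 1e-6)))
```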
Chapter 5: Continuity
Continuous functions from an interval in R to R are the ones that we can graph without
any holes or jumps, i.e., without lifting the pencil from the paper, so the range of such
functions is an interval in R as well. We make this more formal below, and not just for
functions with domains and codomains in R. The formal definition involves limits of functions.
All the hard work for that was already done in Chapter 4, so this chapter, after absorbing
the definition, is really straightforward. The big new results are the Extreme value
theorem (Section 5.2) and the Intermediate value theorem (Section 5.3) for real-valued functions,
existence of radical functions (Section 5.4), and the new notion of uniform continuity (Section 5.5).
Proof. In the first case, there exists δ > 0 such that B(a, δ) ∩ A = {a}. Thus by definition,
the only x ∈ A with |x − a| < δ is x = a, whence |f(x) − f(a)| = 0 is strictly smaller than
an arbitrary positive ε. Thus f : A → B is continuous at a.
Now assume that a is a limit point of A. Suppose that f is continuous at a. We need
to prove that lim_{x→a} f(x) = f(a). Let ε > 0. Since f is continuous at a, there exists δ > 0
such that for all x ∈ A, if |x − a| < δ, then |f(x) − f(a)| < ε. Hence for all x ∈ A with
0 < |x − a| < δ, |f(x) − f(a)| < ε, and since a is a limit point of A, this proves that
lim_{x→a} f(x) = f(a).
Finally suppose that lim_{x→a} f(x) = f(a). We need to prove that f is continuous at
a. Let ε > 0. By assumption there exists δ > 0 such that for all x ∈ A, if 0 < |x − a| < δ
then |f(x) − f(a)| < ε. If x = a, then |f(x) − f(a)| = 0 < ε, so that for all x ∈ A, if
|x − a| < δ, then |f(x) − f(a)| < ε. Thus f is continuous at a.
The next theorem is an easy application of Theorems 4.4.3, 4.4.4, 4.4.5, 4.4.6, 4.4.7,
4.4.8, and 4.4.9, at least when the points in question are limit points of the domain. When
the points in question are not limit points of the domain, the results below need somewhat
different but still easy proofs.
The theorem covers many continuous functions, but the following function for exam-
ple has to be verified differently.
g(x) = x² − 4, if x > 1;
g(x) = −3x³, if x ≤ 1.
so that (by Exercise 4.1.6), lim_{x→1} g(x) = −3 = −3·1³ = g(1). Thus g is indeed continuous
at 1. By the polynomial rules, g is continuous at all other real numbers as well. Thus g
is continuous, even if it is a restriction of a non-continuous function. It is worth noting
that precisely because of continuity we can write g in the following ways as well:
g(x) = x² − 4, if x ≥ 1; −3x³, if x < 1;
g(x) = x² − 4, if x ≥ 1; −3x³, if x ≤ 1.
*5.1.8. (The Thomae function, also called the popcorn function, the raindrop function,
and more) Let f : R → R be defined as
f(x) = 1/q, if x = p/q, where p ∈ Z and q ∈ N⁺ and p and q have no common prime factors;
f(x) = 0, if x is irrational.
The exercises below are a further play on Section 4.3: they modify the definition of
continuity to get very different types of functions. The moral is that the order of quantifiers
and implications is very important!
5.1.9. (This is from page 1177 of Edward Nelson’s article “Internal set theory: a new
approach to nonstandard analysis.” Bull. Amer. Math. Soc. 83 (1977), no. 6, 1165–1198.)
A function f : A → B is suounitnoc at a ∈ A if for all real numbers ε > 0 there exists a
real number δ > 0 such that for all x ∈ A, |x − a| < ε implies that |f(x) − f(a)| < δ.
i) Prove that if f is suounitnoc at some a ∈ A, it is suounitnoc at every b ∈ A.
ii) Let f : R+ → R be given by f (x) = 1/x. Prove that at every a ∈ R+ , f is
continuous but not suounitnoc.
iii) Let f : R → R be given by f (x) = 1 if x is irrational and f (x) = 0 if x is rational.
Prove that at every a ∈ R, f is suounitnoc but not continuous.
5.1.10. A function f : A → B is ticonnuous at a ∈ A if there exists a real number
δ > 0 such that for all real numbers ε > 0 and for all x ∈ A, |x − a| < δ implies that
|f(x) − f(a)| < ε.
i) Suppose that f is ticonnuous at some a ∈ A. Prove that there exist a real number
δ > 0 and b ∈ B such that for all x ∈ A, |x − a| < δ implies that f (x) = b.
ii) Give an example of a continuous function that is not ticonnuous at every a in the
domain.
iii) Prove that every function that is ticonnuous at every point in the domain is con-
tinuous.
5.1.11. A function f : A → B is connuousti at a ∈ A if for all real numbers ε > 0 there
exists a real number δ > 0 such that for all x ∈ A, |f(x) − f(a)| < ε implies that |x − a| < δ.
i) Let f : R → R be a constant function. Prove that f is not connuousti at any a ∈ R.
ii) Let f : R → R be given by f (x) = 1 if x is irrational and f (x) = 0 if x is rational.
Prove that f is not connuousti at any a ∈ R.
iii) Let f : R → R be given by f (x) = x if x is irrational and f (x) = x + 1 if x is
rational. Prove that at every a ∈ R, f is connuousti.
Topology and continuity go hand in hand (see Exercise 5.2.5), but not in the obvious
way, as the next two examples show.
(1) If f : A → B is continuous and A is open, it need not be true that the range of f
is open in B, even if A is bounded. For example, let f : R → R or f : (−1, 1) → R
be the squaring function. Certainly f is continuous and R is an open set, but the
image of f is [0, ∞) or [0, 1), which is not open.
(2) If f : A → B is continuous and A is closed, it need not be true that the range of
f is closed in B. For example, let f : R → R be given by f(x) = 1/(1 + x²). This f is
Proof. We first prove that the range is closed. Let b be a limit point of the range. We want
to prove that b is in the range. Since b is a limit point, by definition for every positive real
number r, B(b, r) contains an element of the range (even an element of the range that is
different from b). In particular, for every positive integer m there exists x_m ∈ A such that
f(x_m) ∈ B(b, 1/m). If for some m we have f(x_m) = b, then we are done, so we may assume
that for all m, f(x_m) ≠ b. Thus there are infinitely many x_m. As in Construction 3.6.1,
we can construct nested quarter subrectangles R_n that contain infinitely many x_m. There
is a unique complex number c that is contained in all the R_n. By construction, c is a
limit point of the set of the x_m, hence of A. As A is closed, c ∈ A. But f is continuous
at c, so that for all ε > 0 there exists δ > 0 such that for all x ∈ A, if |x − c| < δ then
|f(x) − f(c)| < ε/2. In particular, for infinitely many large m, |x_m − c| < δ, so that for
these same m, |f(x_m) − f(c)| < ε/2. But for all large m we also have |f(x_m) − b| < ε/2,
so that by the triangle inequality, |f(c) − b| < ε. Since this is true for all positive ε, it
follows that f(c) = b. Thus any limit point of the range is in the range, so that the range
is closed.
Next we prove that the range is bounded. If not, then for every positive integer
m there exists xm ∈ A such that |f (xm )| > m. Again we use Construction 3.6.1, and
this time we construct nested quarter subrectangles Rn that contain infinitely many of
these xm . As before, there is a unique complex number c that is contained in all the Rn
and in A, and as before (with ε = 2), we get that for infinitely many m, |f(x_m) − f(c)| < 1.
But |f (xm )| > m, so that by the reverse triangle inequality,
|f (c)| ≥ |f (xm ) − (f (xm ) − f (c))| ≥ |f (xm )| − |f (xm ) − f (c)| > m − 1
for infinitely many positive integers m. Since |f (c)| is a fixed real number, it cannot be
larger than all positive integers. Thus we get a contradiction to the assumption that the
range is not bounded, which means that the range must be bounded.
Theorem 5.2.2. (Extreme value theorem) Let A be a closed and bounded subset of C,
and let f : A → R be a continuous function. Then there exist l, u ∈ A such that for all
x ∈ A,
f (l) ≤ f (x) ≤ f (u).
In other words, f achieves its maximum value at u and its minimum value at l.
Proof. By Theorem 5.2.1, the range {f (x) : x ∈ A} of f is a closed and bounded subset
of R, so that its infimum L and supremum U are real numbers which are by closedness in
the range. Thus there exist u, l ∈ A such that L = f (l) and U = f (u).
Proof. We need to prove that f⁻¹ is continuous at every b ∈ B. Let ε > 0. Set a = f⁻¹(b).
The set B(a, ε) is open, so its complement is closed. Therefore C = A \ B(a, ε) = A ∩
(F \ B(a, ε)) is closed. As C ⊆ A, then C is also bounded. Thus by Theorem 5.2.1,
{f(x) : x ∈ C} is a closed and bounded subset of F. By injectivity of f, this set does not
contain b = f(a). The complement of this set contains b and is open, so that there exists
δ > 0 such that B(b, δ) ⊆ F \ {f(x) : x ∈ C}. Now let y ∈ B with |y − b| < δ. Since f
is invertible, there exists x ∈ A such that y = f(x). By the choice of δ, x ∉ C, so that
f⁻¹(y) = x ∈ B(a, ε) = B(f⁻¹(b), ε). In short, for every ε > 0 there exists δ > 0 such that
for all y ∈ B, if |y − b| < δ, then |f⁻¹(y) − f⁻¹(b)| < ε. This proves that f⁻¹ is continuous
at b.
Prove that f is continuous and invertible but that f⁻¹ is not continuous. Compare with
Theorem 5.2.4.
5.2.4. Let A be a closed and bounded subset of C, and let f : A → C be continuous.
i) Why are we not allowed to talk about f achieving its maximum or minimum?
ii) Can we talk about the maximum and minimum absolute values of f ? Justify your
answer.
iii) Can we talk about the maximum and minimum of the real or of the imaginary
components of f ? Justify your answer.
*5.2.5. Let F be R or C, and let A and B be subsets of F . Prove that f : A → B is
continuous if and only if for every open subset U of F there exists an open subset V of F
such that the set {x ∈ A : f (x) ∈ U } = V ∩ A.
5.2.6. Let f : (0, 1) → R be defined by f(x) = 1/x. Prove that f is continuous and that f
has neither a minimum nor a maximum. Explain why this does not contradict the Extreme
value theorem (Theorem 5.2.2).
5.2.7. (Outline of another proof of the Extreme value theorem for closed intervals in R, due
to Samuel J. Ferguson, “A one-sentence line-of-sight proof of the Extreme value theorem”,
American Mathematical Monthly 121 (2014), page 331.) Let f : [a, b] → R be a continuous
function. The goal is to prove that f achieves its maximum at a point u ∈ [a, b]. Set
L = {x ∈ [a, b] : for all y ∈ [a, b], if y < x then f (y) ≤ f (x)}.
i) Prove that a ∈ L.
ii) Let c ∈ Bd (L) (boundary of L). Prove that c ∈ [a, b].
iii) Suppose that c ∉ L.
− Prove that there exists y ∈ [a, c) such that f (y) > f (c).
− Prove that there exists δ > 0 such that for all x ∈ [a, b] ∩ B(c, δ), |f (x) − f (c)| <
(f (y) − f (c))/2. (Hint: use continuity at c.)
− Prove that there exists x ∈ L ∩ B(c, min{δ, c − y}).
− Prove that f (y) − f (x) > 0.
− Prove that y ≥ x.
− Prove that c − y > |x − c| ≥ c − x ≥ c − y, which is a contradiction.
iv) Conclude that c ∈ L and that L is a closed set.
v) Let u = sup(L). Prove that u exists and is an element of L.
vi) Let k > f(u). Prove that the set S_k = {x ∈ [c, b] : f(x) ≥ k} is closed, and in
particular that S_k, if non-empty, has a minimum.
vii) Suppose that for some x ∈ [a, b], f(x) > f(u).
− Prove that x > u and x ∉ L.
− Prove that there exists x_1 ∈ (u, x) such that f(x_1) > f(x).
− Prove that there exist x_1 ∈ (u, x), x_2 ∈ (u, x_1), x_3 ∈ (u, x_2), . . ., x_n ∈ (u, x_{n−1}),
. . ., such that · · · > f(x_n) > f(x_{n−1}) > · · · > f(x_3) > f(x_2) > f(x_1) > f(x) >
f(u).
− Prove that the set S_{f(u)} does not have a minimum.
viii) Conclude that f (u) is the maximum of f on [a, b].
5.2.8. Let A be a subset of R, and let f : A → R be a function with a maximum at
u ∈ A. What can you say about the slope of the line between (a, f (a)) and (u, f (u)) for
a ∈ A \ {u}?
In this section, all functions are real-valued. The reason is that we can only make
comparisons in an ordered field, and R is an ordered field (Axiom 2.7.11), and C is not
(Exercise 3.1.6).
Theorem 5.3.1. (Intermediate value theorem) Let a, b ∈ R with b > a, and let
f : [a, b] → R be continuous. Let k be a real number strictly between f (a) and f (b). Then
there exists c ∈ (a, b) such that f (c) = k.
Here is a picture that illustrates the Intermediate value theorem: for value k on the y-
axis between f (a) and f (b) there happen to be two c for which f (c) = k. The Intermediate
value theorem guarantees that one such c exists but does not say how many such c exist.
[Figure: graph of f over [a, b], with a, c1, c2, b marked on the x-axis and f(a), f(b) marked on the y-axis.]
Example 5.3.2. There exists a real number c such that c^5 − 4 = (c² − 2)/(c² + 2).
Proof. Let f : R → R be defined by f(x) = x^5 − 4 − (x² − 2)/(x² + 2). This function is a rational
function and defined for all real numbers, so that by Theorem 5.1.3, f is continuous. Note
that f(0) = −3 < 0 < f(2), so that by Theorem 5.3.1 there exists c in (0, 2) such that
f(c) = 0. In other words, c^5 − 4 = (c² − 2)/(c² + 2).
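The Intermediate value theorem only asserts that such a c exists. One standard way to approximate it numerically is bisection, which is our own addition here and not part of the proof: we repeatedly halve an interval on which f changes sign. The function name bisect and the tolerance are assumptions made for this sketch.

```python
def f(x):
    """The function from the example: x^5 - 4 - (x^2 - 2)/(x^2 + 2)."""
    return x ** 5 - 4 - (x ** 2 - 2) / (x ** 2 + 2)


def bisect(func, lo, hi, tol=1e-12):
    """Approximate a zero of func in [lo, hi], assuming func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_mid == 0:
            return mid
        if (f_lo < 0) == (f_mid < 0):
            # Same sign as at the left endpoint: the sign change is in [mid, hi].
            lo, f_lo = mid, f_mid
        else:
            # The sign change is in [lo, mid].
            hi = mid
    return (lo + hi) / 2


c = bisect(f, 0.0, 2.0)
print(c, f(c))
```

Each halving step is itself an application of the same sign-change reasoning as in the proof, restricted to a smaller interval.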
Proof. For any c and d in the image of f , by the Intermediate value theorem (The-
orem 5.3.1), any real number between c and d is in the image of f , which proves the
theorem.
However, if f : A → B is continuous and injective and A is open, then the range of
f is open in B. We prove this first for A and B subsets of R, and then for subsets of C.
Compare the next theorem to Example 5.1.6.
Proof. Let a < b be in I. Since f has an inverse, f(a) ≠ f(b), so that by trichotomy, either
f(a) < f(b) or f(a) > f(b).
For now we assume that f (a) < f (b). With that we prove that f is an increasing
function, i.e., that for any x, y ∈ I, if x < y then f (x) < f (y). First suppose that x < y < z
are in I. By invertibility, f (x), f (y), f (z) are distinct. If f (x) is between f (y) and f (z), then
an application of the Intermediate value theorem gives c ∈ (y, z) such that f (c) = f (x).
But x < y < c, so x and c are distinct, and f (c) = f (x) contradicts invertibility of f . So
f (x) is not between f (y) and f (z), and similarly f (z) is not between f (x) and f (y). Thus
necessarily f (x) < f (y) < f (z) or f (x) > f (y) > f (z). By setting x = a and z = b we
get that f is increasing on [a, b], by setting y = a and z = b we get that f is increasing on
I ∩ (−∞, a], and by setting x = a and y = b we get that f is increasing on I ∩ [b, ∞). Thus
f is increasing on I.
By Theorem 5.3.3 we know that B is an interval. By definition of inverses, f⁻¹ is
strictly increasing.
If I is in addition open, let c ∈ B and a = f⁻¹(c). Since I is open and f is
increasing, there exist b1, b2 ∈ A such that b1 < a < b2. Thus f(b1) < f(a) = c < f(b2),
and by Theorem 5.3.3 we know that (f(b1), f(b2)) is an open subset of B that contains c.
Thus B is open.
5.3.6. The goal of this exercise is to prove that every polynomial of odd degree has a real
root. Write f(X) = a_0 + a_1 X + a_2 X^2 + · · · + a_n X^n for some real numbers a_0, a_1, a_2, . . . , a_n
such that n is odd and a_n ≠ 0. Set b = (n/|a_n|) max{|a_0|, |a_1|, . . . , |a_{n−1}|, |a_n|}.
i) Prove that b is a positive real number. If f (b) = 0 or if f (−b) = 0, we have found
the root. So we may assume that f (b) and f (−b) are not zero.
ii) Justify all steps below:
|a_0 + a_1 (±b) + a_2 (±b)^2 + · · · + a_{n−1} (±b)^(n−1)|
≤ |a_0| + |a_1 b| + |a_2 b^2| + · · · + |a_{n−1} b^(n−1)| (by the triangle inequality)
≤ |a_n|b/n + |a_n|b^2/n + |a_n|b^3/n + · · · + |a_n|b^n/n (by the choice of b)
≤ (|a_n|b/n) (1 + b + b^2 + · · · + b^(n−1))
≤ (|a_n|b/n) (b^(n−1) + b^(n−1) + b^(n−1) + · · · + b^(n−1)) (because b ≥ 1; the n summands correspond to the exponents 0, 1, 2, . . . , n − 1)
≤ (|a_n|b/n) · n b^(n−1)
= |a_n| b^n
= |a_n b^n|.
iv) Prove that f(b) has the same sign (positive or negative) as a_n b^n and that f(−b)
has the same sign as a_n (−b)^n.
v) Prove that f (b) and f (−b) have opposite signs.
vi) Prove that f has a real root in (−b, b).
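For a concrete feel for the number b in this exercise, the sketch below computes b = (n/|a_n|) max{|a_0|, . . . , |a_n|} for one sample polynomial of odd degree and checks that f(b) and f(−b) have opposite signs. The sample polynomial f(X) = 7 − 3X + 2X^3 and the function names are our own assumptions; a single example illustrates the claim but proves nothing.

```python
def exercise_bound(coeffs):
    """Given coeffs = [a_0, a_1, ..., a_n] with a_n != 0, return b = (n/|a_n|) * max |a_i|."""
    n = len(coeffs) - 1
    return n / abs(coeffs[-1]) * max(abs(c) for c in coeffs)


def evaluate(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n by Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


coeffs = [7.0, -3.0, 0.0, 2.0]  # sample polynomial f(X) = 7 - 3X + 2X^3, degree n = 3
b = exercise_bound(coeffs)
value_at_b = evaluate(coeffs, b)
value_at_minus_b = evaluate(coeffs, -b)
print(b, value_at_b, value_at_minus_b)
```

Here n = 3, |a_n| = 2 and the largest coefficient in absolute value is 7, so b = (3/2) · 7 = 10.5.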
The Intermediate value theorem guarantees that there exists r ∈ (a − 1, 0) such that a = r^n.
So for odd n all real numbers are in the range. If n is even, then n/2 is an integer and
for any x ∈ R, x^n = (x²)^(n/2) is a positive-integer power of x². By Theorem 2.7.13, x² is
positive, and so by Theorem 2.9.2, x^n ∈ R⁺ ∪ {0}.
We now re-prove Theorem 2.10.6 that the nth radical function exists when n is a
positive integer. Let f be as in the theorem above. By Theorem 2.9.2 and Exercise 2.9.1,
f is strictly increasing. Thus by Theorem 2.9.4, f has an inverse function f −1 . By the power
rule, f is continuous, so that by Theorem 5.3.4, f −1 is strictly increasing and continuous.
We call this inverse the nth radical function. For any a in its domain we write
$f^{-1}(a) = \sqrt[n]{a}$ or $f^{-1}(a) = a^{1/n}$, and we call this value the nth root of a.
We just established that for any positive integer n, $\sqrt[n]{\ }$ is defined on $\mathbb{R}_{\ge 0}$, or even on
R if n is odd. We also just established that $\sqrt[n]{\ }$ is a strictly increasing continuous function.
By continuity we immediately get the following:
Theorem 5.4.2. (Radical rule for limits) For any positive integer n and any a in the
domain of $\sqrt[n]{\ }$,
\[
\lim_{x \to a} \sqrt[n]{x} = \sqrt[n]{a}.
\]
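Now let m and n be positive integers. For any $x \in \mathbb{R}^+ \cup \{0\}$, the elements $\sqrt[n]{x^m}$ and
$(\sqrt[n]{x})^m$ are well-defined in $\mathbb{R}^+ \cup \{0\}$. Their nth powers are the same value $x^m$:
\[
\left(\sqrt[n]{x^m}\right)^n = x^m
\]
by definition, and
\[
\left((\sqrt[n]{x})^m\right)^n = (\sqrt[n]{x})^{mn} = \left((\sqrt[n]{x})^n\right)^m = x^m.
\]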
We record this function as $\sqrt[n]{(\ )^m} = (\ )^{m/n}$, and call it the exponentiation by m/n. This
shows that we can handle rational exponents as well.
Proof. Write r = m/n, where n and m are integers and n ≠ 0. Since m/n = (−m)/(−n),
by possibly multiplying by −1 we may assume that n > 0. Then f is a composition of
exponentiation by m with exponentiation by 1/n, in either order. Exponentiation by non-
negative m is continuous by the constant or power rule, exponentiation by negative m is
continuous by the quotient rule, and exponentiation by 1/n is continuous by Theorem 5.4.2.
Thus f is continuous by the composite rule.
If r > 0, then m, n > 0, and then f is the composition of two strictly increasing
functions, hence strictly increasing. If r < 0, then m < 0 and n > 0, so f is the composition
of a strictly increasing and a strictly decreasing function, hence strictly decreasing by
Exercise 2.9.7.
For any non-zero integer p, let $g_p, h_p : A \to A$ be defined as $g_p(x) = x^p$ and
$h_p(x) = x^{1/p}$. By associativity of function composition (Exercise 2.4.6),
\[
\begin{aligned}
(f(a))^{n/m} &= h_m \circ g_n (f(a)) \\
&= h_m \circ g_n (h_n \circ g_m (a)) \\
&= (h_m \circ (g_n \circ h_n))(g_m(a)) \\
&= h_m \circ g_m (a) \\
&= a,
\end{aligned}
\]
and similarly $f(a^{n/m}) = a$. This proves that exponentiation by n/m = 1/r is the inverse
of f.
The last part is obvious.
Theorem 7.6.4 shows more generally that exponentiation by arbitrary real numbers
(not just by rational numbers) is continuous.
5.4.1. Let n, m ∈ Q, and suppose that a and b are in the domain of exponentiation by n
and m. Prove:
i) $a^n \cdot b^n = (ab)^n$.
ii) $(a^n)^m = a^{mn}$.
iii) $a^n \cdot a^m = a^{n+m}$.
iv) If $a \neq 0$, then $a^{-n} = 1/a^n$.
5.4.2. Determine the following limits, and justify all steps by invoking the relevant
theorems/rules:
i) $\lim_{x \to 2} \sqrt{x^2 - 3x + 4}$.
ii) $\lim_{x \to 2} \dfrac{\sqrt{x} - \sqrt{2}}{x^2 + 4}$.
iii) $\lim_{x \to 2} \dfrac{x - 2}{x^2 - 4}$.
iv) $\lim_{x \to 2} \dfrac{\sqrt{x} - \sqrt{2}}{x^2 - 4}$.
5.4.3. Let $f(x) = \sqrt{x} - \sqrt{-x}$.
i) Prove that for all a in the domain of f, $\lim_{x \to a} f(x)$ is inapplicable.
ii) Prove that f is continuous.
5.4.4. Determine the domain of the function f given by $f(x) = \sqrt{-x^2}$.
5.4.5. Here is an alternate proof of Theorem 5.4.2. Study the proof, and provide any
missing commentary. Let A be the domain of $\sqrt[n]{\ }$.
i) Prove that an element of A is a limit point of A.
ii) Suppose that a = 0. Set $\delta = \epsilon^n$. Let x ∈ A satisfy 0 < |x − a| < δ. Since the nth
root function is an increasing function on A, it follows that
\[
\left|\sqrt[n]{x} - \sqrt[n]{a}\right| = \left|\sqrt[n]{x}\right| = \sqrt[n]{|x|} < \sqrt[n]{\delta} = \epsilon.
\]
iii) Suppose that a > 0. First let $\epsilon' = \min\{\epsilon, \sqrt[n]{a}\}$. So $\epsilon'$ is a positive number. Set
$\delta = \min\{(\epsilon + \sqrt[n]{a})^n - a,\ a - (\sqrt[n]{a} - \epsilon')^n\}$. Note that
\[
\begin{aligned}
(\epsilon + \sqrt[n]{a})^n - a &> (\sqrt[n]{a})^n - a = 0, \\
0 \le \sqrt[n]{a} - \epsilon' &< \sqrt[n]{a}, \\
a - (\sqrt[n]{a} - \epsilon')^n &> 0,
\end{aligned}
\]
so that δ is positive. Let x ∈ A satisfy 0 < |x − a| < δ. Then −δ < x − a < δ and
a − δ < x < δ + a. Since $\delta = \min\{(\epsilon + \sqrt[n]{a})^n - a,\ a - (\sqrt[n]{a} - \epsilon')^n\}$, it follows that
\[
a - \left(a - (\sqrt[n]{a} - \epsilon')^n\right) \le a - \delta < x < a + \delta \le a + \left((\epsilon + \sqrt[n]{a})^n - a\right),
\]
or in other words, $(\sqrt[n]{a} - \epsilon')^n < x < (\epsilon + \sqrt[n]{a})^n$. Since the nth radical function is
5.4.6. Recall from Exercise 3.4.4 that for every non-zero complex number a there exist
exactly two complex numbers whose squares are a. Let’s try to create a continuous square
root function f : C → C. (We will fail.)
i) Say that for all a in the first quadrant we choose f (a) in the first quadrant. Where
are then f (a) for a in the remaining quadrants?
ii) Is it possible to extend this square root function to a function f on all of C
(the positive and negative real and imaginary axes) in such a way as to make
limx→a f (x) = f (a) for all a ∈ C?
iii) Explain away the problematic claims $\sqrt{-4} \cdot \sqrt{-9} = \sqrt{(-4)(-9)} = \sqrt{36} = 6$,
$\sqrt{-4} \cdot \sqrt{-9} = 2i \cdot 3i = -6$ from page 9.
iv) Let D be the set of all complex numbers that are not on the negative real axis.
Prove that we can define a continuous square root function f : D → C.
(v)* Let θ be any real number, and let D be the set of all complex numbers whose
counterclockwise angle from the positive real axis is θ. Prove that we can define a
continuous square root function f : D → C.
5.4.7. (The goal of this exercise and the next one is to develop exponential functions
without derivatives and integrals. We will see in Section 7.4 that derivatives and integrals
give a more elegant approach.) Let c ∈ (1, ∞) and let $f : \mathbb{Q} \to \mathbb{R}^+$ be the function
$f(x) = c^x$.
i) Why is f a function? (Is it well-defined, i.e., are we allowed to raise positive real
numbers to rational exponents?)
ii) Prove that f is strictly increasing. (Hint: Theorem 5.4.3.)
iii) Let $\epsilon > 0$. Justify the following:
\[
\begin{aligned}
(\epsilon + 1)^{n+1} - 1 &= \sum_{i=1}^{n+1} \left((\epsilon + 1)^i - (\epsilon + 1)^{i-1}\right) \\
&= \sum_{i=1}^{n+1} \epsilon (\epsilon + 1)^{i-1} \\
&\ge \sum_{i=1}^{n+1} \epsilon \\
&= (n + 1)\epsilon.
\end{aligned}
\]
Use the Archimedean property of R (Theorem 2.10.3) to prove that the set
$\{(\epsilon + 1)^n : n \in \mathbb{N}_0\}$ is not bounded above.
iv) Prove that there exists a positive integer n such that $c^{1/n} < \epsilon + 1$.
v) Prove that there exists $\delta_1 > 0$ such that for all $x \in (0, \delta_1) \cap \mathbb{Q}$, $|c^x - 1| < \epsilon$.
vi) Prove that there exists $\delta_2 > 0$ such that for all $x \in (-\delta_2, 0) \cap \mathbb{Q}$, $|c^x - 1| < \epsilon$.
vii) Prove that $\lim_{x \to 0, x \in \mathbb{Q}} c^x = 1$.
viii) Prove that for any r ∈ R, $\lim_{x \to r, x \in \mathbb{Q}} c^x$ exists and is a real number.
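Parts iii) and iv) of this exercise can be checked on concrete numbers. The sketch below is only an illustration under stated assumptions: `telescoping_bound_holds` tests the inequality of part iii) in exact rational arithmetic, and `root_index` returns one integer N > (c − 1)/ε, for which part iii) gives (ε + 1)^N ≥ 1 + Nε > c. Both function names are hypothetical, and the particular choice of N is one of many that work.

```python
import math
from fractions import Fraction


def telescoping_bound_holds(eps: Fraction, n: int) -> bool:
    """Check (eps + 1)^(n+1) - 1 >= (n + 1) * eps exactly, for rational eps > 0."""
    return (eps + 1) ** (n + 1) - 1 >= (n + 1) * eps


def root_index(c: float, eps: float) -> int:
    """Return an integer N with N > (c - 1) / eps, assuming c > 1 and eps > 0.

    By part iii), (1 + eps)^N >= 1 + N * eps > c, so the N-th root of c
    is smaller than 1 + eps.
    """
    if c <= 1 or eps <= 0:
        raise ValueError("need c > 1 and eps > 0")
    return math.floor((c - 1) / eps) + 1
```

With c = 10 and ε = 0.01, the returned N satisfies `(1 + 0.01) ** N > 10`, which is the numerical counterpart of $c^{1/N} < \epsilon + 1$.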
5.4.8. (Related to the previous exercise.) Let $c \in \mathbb{R}^+$.
i) Prove that for any r ∈ R, $\lim_{x \to r, x \in \mathbb{Q}} c^x$ exists and is a real number. (Hint: Case
c = 1 is special; case c > 1 done; relate the case c < 1 to the case c > 1 and the
quotient rule for limits.)
ii) We denote the limit in the previous part with $c^r$. Prove that for all real numbers
$c, c_1, c_2, r, r_1, r_2$, with $c, c_1, c_2 > 0$,
\[
c^{-r} = \frac{1}{c^r}, \quad (c_1 c_2)^r = c_1^r c_2^r, \quad c^{r_1 + r_2} = c^{r_1} c^{r_2}, \quad c^{r_1 r_2} = (c^{r_1})^{r_2}.
\]
iii) Prove that the function g : R → R given by $g(x) = c^x$ is continuous. (This is easy.)
Definition 5.5.1. A function f is uniformly continuous if for all real numbers $\epsilon > 0$
there exists a real number δ > 0 such that for all x and y in the domain, if |x − y| < δ,
then $|f(x) - f(y)| < \epsilon$.
Proof. Let $\epsilon > 0$. Since f is continuous, for each a ∈ A there exists $\delta_a > 0$ such that for
all x ∈ A, $|x - a| < \delta_a$ implies that $|f(x) - f(a)| < \epsilon/2$. Note that $A \subseteq \bigcup_{a \in A} B(a, \delta_a)$.
By Theorem 3.6.5 there exists δ > 0 such that for all x ∈ A there exists a ∈ A such that
$B(x, \delta) \subseteq B(a, \delta_a)$. Let x, y ∈ A with |x − y| < δ. Since $x, y \in B(x, \delta) \subseteq B(a, \delta_a)$, it follows
that $|x - a|, |y - a| < \delta_a$. Thus
\[
\begin{aligned}
|f(x) - f(y)| &= |f(x) - f(a) + f(a) - f(y)| \\
&\le |f(x) - f(a)| + |f(a) - f(y)| && \text{(by the triangle inequality)} \\
&< \epsilon/2 + \epsilon/2 = \epsilon.
\end{aligned}
\]
Example 5.5.5. The continuous function $\sqrt{\ }$ is uniformly continuous.
Proof. We established in Section 5.4 that $\sqrt{\ }$ is continuous. Let $\epsilon > 0$. We divide the
domain into two regions, one closed and bounded so we can invoke the theorem above, and
the other unbounded but where $\sqrt{\ }$ has a bounded derivative.
The first region is the closed and bounded interval [0, 2]. By the previous theorem
there exists $\delta_1 > 0$ such that for all a, x ∈ [0, 2], if $|x - a| < \delta_1$ then $|\sqrt{x} - \sqrt{a}| < \epsilon$.
The second region is the unbounded interval [1, ∞). For a, x ∈ [1, ∞) with $|x - a| < \epsilon$
we have
\[
\begin{aligned}
|\sqrt{x} - \sqrt{a}| &= \left|(\sqrt{x} - \sqrt{a}) \frac{\sqrt{x} + \sqrt{a}}{\sqrt{x} + \sqrt{a}}\right| = \frac{|x - a|}{\sqrt{x} + \sqrt{a}} \\
&\le \frac{|x - a|}{2} && \text{(because } \sqrt{x}, \sqrt{a} \ge 1\text{)} \\
&< \epsilon.
\end{aligned}
\]
Finally, set $\delta = \min\{\delta_1, \epsilon, 1\}$. Let a and x be in the domain of $\sqrt{\ }$ such that |x − a| < δ.
Since |x − a| < δ ≤ 1, necessarily either x, a ∈ [0, 2] or x, a ∈ [1, ∞). We have analyzed
both cases, and we conclude that $|\sqrt{x} - \sqrt{a}| < \epsilon$.
Proof. Let $\epsilon > 0$. Since f is uniformly continuous, there exists δ > 0 such that for all
x, y ∈ A, if |x − y| < δ then $|f(x) - f(y)| < \epsilon$. But then for any x, y ∈ C, if |x − y| < δ,
then $|g(x) - g(y)| = |f(x) - f(y)| < \epsilon$.
Example 5.5.7. By Example 5.5.3 the squaring function is not uniformly continuous on
C or R, but when the domain is restricted to any bounded subset of C, that domain is a
subset of a closed and bounded subset of C, and so since squaring is continuous, it follows
by the previous theorem that squaring is uniformly continuous on the closed and bounded
set and hence on any subset of that.
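The contrast between the square root and the squaring function can be illustrated numerically. The sketch below is an experiment rather than a proof. For a fixed δ it measures how far f moves between pairs of points at distance δ/2, at sample points spread far out on the real line. The helper name `largest_jump`, the sample points, and the choice δ = ε² for the square root are assumptions for illustration. That choice comes from the standard inequality $|\sqrt{x} - \sqrt{y}| \le \sqrt{|x - y|}$ and is not the δ constructed in the proof of Example 5.5.5.

```python
import math


def largest_jump(f, delta, points):
    """Largest |f(x + delta/2) - f(x)| over the sample points.

    Each pair (x, x + delta/2) is closer together than delta.
    """
    return max(abs(f(x + delta / 2) - f(x)) for x in points)


POINTS = [0.0, 1.0, 1e2, 1e4, 1e6]  # sample points, chosen arbitrarily
EPS = 0.1
DELTA = EPS ** 2

# One delta serves the square root at every sample point ...
sqrt_jump = largest_jump(math.sqrt, DELTA, POINTS)
# ... but the same delta fails for squaring once x is large.
square_jump = largest_jump(lambda x: x * x, DELTA, POINTS)
```

Here `sqrt_jump` stays below `EPS`, while `square_jump` exceeds it by several orders of magnitude. This is consistent with squaring being uniformly continuous only after the domain is restricted to a bounded set.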
Chapter 6: Differentiation
The geometric motivation for differentiation comes from lines tangent to a graph of
a function f at a point (a, f (a)). For example, on the graph below are two gray secant
lines through (a, f (a)):
[Figure: graph of f with two gray secant lines through (a, f(a)); the horizontal axis marks a and x.]
It appears that the line through (a, f (a)) and (x, f (x)) is closer to the tangent line to the
graph of f at (a, f (a)) if x is closer to a. Intuitively, the slope of the tangent line is the
limit of the slopes of the secant lines.
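Example 6.1.2. Let f(x) = mx + l, where m and l are complex numbers. Then for any
a ∈ C,
\[
\begin{aligned}
f'(a) &= \lim_{x \to a} \frac{f(x) - f(a)}{x - a} \\
&= \lim_{x \to a} \frac{(mx + l) - (ma + l)}{x - a} \\
&= \lim_{x \to a} \frac{m(x - a)}{x - a} \\
&= m.
\end{aligned}
\]
Alternatively,
\[
\begin{aligned}
f'(a) &= \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \\
&= \lim_{h \to 0} \frac{(m(a + h) + l) - (ma + l)}{h} \\
&= \lim_{h \to 0} \frac{mh}{h} \\
&= m.
\end{aligned}
\]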
= 2a.
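Example 6.1.6. Let $f(x) = x^{3/2}$. The domain of f is $\mathbb{R}_{\ge 0}$, and for all a ≥ 0,
\[
\begin{aligned}
f'(a) &= \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \\
&= \lim_{h \to 0} \frac{(a + h)^{3/2} - a^{3/2}}{h} \\
&= \lim_{h \to 0} \frac{(a + h)^{3/2} - a^{3/2}}{h} \cdot \frac{(a + h)^{3/2} + a^{3/2}}{(a + h)^{3/2} + a^{3/2}} \\
&= \lim_{h \to 0} \frac{(a + h)^3 - a^3}{h\left((a + h)^{3/2} + a^{3/2}\right)} && \text{(since } (x - y)(x + y) = x^2 - y^2\text{)} \\
&= \lim_{h \to 0} \frac{a^3 + 3a^2 h + 3a h^2 + h^3 - a^3}{h\left((a + h)^{3/2} + a^{3/2}\right)} \\
&= \lim_{h \to 0} \frac{(3a^2 + 3ah + h^2)h}{h\left((a + h)^{3/2} + a^{3/2}\right)} \\
&= \lim_{h \to 0} \frac{3a^2 + 3ah + h^2}{(a + h)^{3/2} + a^{3/2}} \\
&= \begin{cases}
\dfrac{3a^2}{a^{3/2} + a^{3/2}}, & \text{if } a > 0; \\[2ex]
\lim_{h \to 0} \dfrac{h^2}{h^{3/2}} = \lim_{h \to 0} h^{1/2} = 0, & \text{if } a = 0;
\end{cases} \\
&= \frac{3}{2} a^{1/2}
\end{aligned}
\]
by the linear, radical, composite, and quotient rules for limits. (Note that this f is
differentiable even at 0, whereas the square root function (previous example) is not differentiable
at 0.)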
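The limit computed in this example can be compared with difference quotients for small h. The following is a minimal numerical sketch, with floating-point arithmetic standing in for the limit; the names `difference_quotient` and `three_halves_power` and the step sizes in the usage note are assumptions for illustration.

```python
def difference_quotient(f, a: float, h: float) -> float:
    """The quotient (f(a + h) - f(a)) / h whose limit as h -> 0 defines f'(a)."""
    return (f(a + h) - f(a)) / h


def three_halves_power(x: float) -> float:
    """f(x) = x^(3/2) on the domain x >= 0."""
    return x ** 1.5
```

For instance, `difference_quotient(three_halves_power, 4.0, 1e-6)` is close to $\frac{3}{2} \cdot 4^{1/2} = 3$. At a = 0 only positive h keep a + h in the domain, and `difference_quotient(three_halves_power, 0.0, 1e-8)` is close to $h^{1/2} = 10^{-4}$, in line with f'(0) = 0.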
Note that in all these examples, f' is a function from some subset of the domain of
f to a subset of C, and we can compute f' at a number labeled x rather than a:
\[
f'(x) = \frac{df(x)}{dx} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{z \to x} \frac{f(z) - f(x)}{z - x}.
\]
The h-limit is perhaps preferable to the last limit, where it is z that varies and gets closer
and closer to x.
Proof. This function is not differentiable whether the domain is C or R. The reason is
that the limit of $\frac{|h| - |0|}{h}$ as h goes to 0 does not exist. Namely, if h varies over positive real
numbers, this limit is 1, and if h varies over negative real numbers, the limit is −1, so that
the limit indeed does not exist, and hence the derivative does not exist.
This gives an example of a continuous function that is not differentiable. (Any
continuous function with a jagged graph is not differentiable.)
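The two one-sided values in this proof are easy to reproduce. The sketch below evaluates the quotient (|h| − |0|)/h for a step to the right and a step to the left of 0; the function name `one_sided_quotients` is an assumption for illustration.

```python
def one_sided_quotients(h: float) -> tuple[float, float]:
    """Difference quotients of the absolute value function at 0.

    For h > 0, returns the quotient with step +h and the quotient with step -h.
    """
    right = (abs(h) - abs(0.0)) / h
    left = (abs(-h) - abs(0.0)) / (-h)
    return right, left
```

For every positive h the result is `(1.0, -1.0)`, so the quotients approach different values from the two sides, however small h is.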
Proof. Parts (1) and (2) were already proved in part (1) of Example 6.1.2.
(3) follows from
\[
\begin{aligned}
\lim_{h \to 0} \frac{(cf)(a + h) - (cf)(a)}{h} &= \lim_{h \to 0} \frac{cf(a + h) - cf(a)}{h} \\
&= c \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \\
&= cf'(a),
\end{aligned}
\]
and (4) follows from the sum rule for limits and from
\[
\begin{aligned}
\lim_{h \to 0} \frac{(f + g)(a + h) - (f + g)(a)}{h}
&= \lim_{h \to 0} \frac{f(a + h) + g(a + h) - f(a) - g(a)}{h} \\
&= \lim_{h \to 0} \frac{f(a + h) - f(a) + g(a + h) - g(a)}{h} \\
&= \lim_{h \to 0} \left(\frac{f(a + h) - f(a)}{h} + \frac{g(a + h) - g(a)}{h}\right) \\
&= f'(a) + g'(a).
\end{aligned}
\]
Proof #1: Part (1) of Example 6.1.2 with m = 1 and l = 0 proves that $(x^1)' = 1$, so that
the theorem is true when n = 1. Now suppose that n ≥ 2 and that the theorem holds for
the positive integer n − 1. Then
\[
\begin{aligned}
(x^n)' &= (x^{n-1} x)' \\
&= (x^{n-1})' x + x^{n-1} x' && \text{(by the product rule)} \\
&= (n - 1)x^{n-2} x + x^{n-1} && \text{(by induction assumption)} \\
&= (n - 1)x^{n-1} + x^{n-1} \\
&= nx^{n-1},
\end{aligned}
\]
so that the theorem holds also for n, and so by induction also for all positive n.
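Proof #2: The second proof uses binomial expansions as in Exercise 1.7.7:
\[
\begin{aligned}
(x^n)' &= \lim_{h \to 0} \frac{(x + h)^n - x^n}{h} \\
&= \lim_{h \to 0} \frac{\sum_{k=0}^{n} \binom{n}{k} x^k h^{n-k} - x^n}{h} \\
&= \lim_{h \to 0} \frac{\sum_{k=0}^{n-1} \binom{n}{k} x^k h^{n-k}}{h} \\
&= \lim_{h \to 0} \frac{h \sum_{k=0}^{n-1} \binom{n}{k} x^k h^{n-k-1}}{h} \\
&= \lim_{h \to 0} \sum_{k=0}^{n-1} \binom{n}{k} x^k h^{n-k-1} \\
&= \sum_{k=0}^{n-1} \binom{n}{k} x^k 0^{n-k-1} && \text{(by the polynomial rule for limits)} \\
&= \binom{n}{n-1} x^{n-1} \\
&= nx^{n-1}.
\end{aligned}
\]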
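The algebra in Proof #2 can be spot-checked in exact rational arithmetic. The sketch below evaluates the reduced sum $\sum_{k=0}^{n-1} \binom{n}{k} x^k h^{n-k-1}$ and compares it with the difference quotient for a non-zero h, and with $nx^{n-1}$ at h = 0. The function name `reduced_quotient` is an assumption for illustration, and the check relies on Python's convention that the zeroth power of 0 is 1, which matches the use of $0^{n-k-1}$ in the proof.

```python
from fractions import Fraction
from math import comb


def reduced_quotient(n: int, x: Fraction, h: Fraction) -> Fraction:
    """Sum over k = 0, ..., n-1 of C(n, k) * x^k * h^(n-k-1).

    For h != 0 this equals ((x + h)^n - x^n) / h; at h = 0 only the
    k = n - 1 term survives, because 0^0 evaluates to 1.
    """
    return sum(comb(n, k) * x ** k * h ** (n - k - 1) for k in range(n))
```

For example, with n = 5, x = 3/2 and h = 1/7 the value agrees exactly with the difference quotient, and `reduced_quotient(5, Fraction(3, 2), Fraction(0))` equals `5 * Fraction(3, 2) ** 4`.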
Theorem 6.2.4. (Polynomial, rational function rule for derivatives) Polynomial functions
are differentiable at all real/complex numbers and rational functions are differentiable at
all points in the domain.
Proof. The proof is an application of the sum, scalar, and power rules from Theorems 6.2.2
and 6.2.3.
Theorem 6.2.5. (The composite function rule for derivatives, aka the chain rule)
Suppose that f is differentiable at a, that g is differentiable at f (a), and that a is a limit
point of the domain of g ◦ f . (If f : A → B, g : B → C, and f is differentiable at a, then
automatically a is a limit point of A and hence of the domain of g ◦ f .) Then g ◦ f is
differentiable at a, and $(g \circ f)'(a) = g'(f(a)) \cdot f'(a)$.
Proof. Let $\epsilon > 0$. Since f is differentiable at a, there exists $\delta_1 > 0$ such that for all a + h
in the domain of f, if $0 < |h| < \delta_1$ then
$\left|\frac{f(a+h) - f(a)}{h} - f'(a)\right| < \min\{1, \epsilon/(2|g'(f(a))| + 2)\}$.
For all such h, by the triangle inequality, $\left|\frac{f(a+h) - f(a)}{h}\right| < |f'(a)| + 1$. By assumption g is
differentiable at f(a), so that there exists $\delta_2 > 0$ such that for all x in the domain of g, if
$0 < |x - f(a)| < \delta_2$, then $\left|\frac{g(x) - g(f(a))}{x - f(a)} - g'(f(a))\right| < \epsilon/(2|f'(a)| + 2)$. Since f is differentiable
and hence continuous at a, there exists $\delta_3 > 0$ such that for all x in the domain of f, if
$|x - a| < \delta_3$, then $|f(x) - f(a)| < \delta_2$. Set $\delta = \min\{\delta_1, \delta_2, \delta_3\}$. Let a + h be arbitrary in
the domain of g ◦ f such that 0 < |h| < δ. In particular a + h is in the domain of f. If
$f(a + h) \neq f(a)$, then
\[
\begin{aligned}
&\left|\frac{(g \circ f)(a + h) - (g \circ f)(a)}{h} - g'(f(a)) \cdot f'(a)\right| \\
&\quad = \left|\frac{g(f(a + h)) - g(f(a))}{h} - g'(f(a)) \cdot f'(a)\right| \\
&\quad = \left|\frac{g(f(a + h)) - g(f(a))}{f(a + h) - f(a)} \cdot \frac{f(a + h) - f(a)}{h} - g'(f(a)) \cdot f'(a)\right| \\
&\quad = \left|\left(\frac{g(f(a + h)) - g(f(a))}{f(a + h) - f(a)} - g'(f(a))\right) \cdot \frac{f(a + h) - f(a)}{h}
 + g'(f(a)) \cdot \frac{f(a + h) - f(a)}{h} - g'(f(a)) \cdot f'(a)\right| \\
&\quad \le \left|\frac{g(f(a + h)) - g(f(a))}{f(a + h) - f(a)} - g'(f(a))\right| \cdot \left|\frac{f(a + h) - f(a)}{h}\right|
 + |g'(f(a))| \left|\frac{f(a + h) - f(a)}{h} - f'(a)\right| \\
&\quad \le \frac{\epsilon}{2|f'(a)| + 2} \cdot (|f'(a)| + 1) + |g'(f(a))| \frac{\epsilon}{2|g'(f(a))| + 2} \\
&\quad < \frac{\epsilon}{2} + \frac{\epsilon}{2} \\
&\quad = \epsilon.
\end{aligned}
\]
Thus if there exists δ as above but possibly smaller such that $f(a + h) \neq f(a)$ for all a + h
in the domain with 0 < |h| < δ, the above proves the theorem.
Now suppose that for all δ > 0 there exists h such that a + h is in the domain of f,
0 < |h| < δ, and f(a + h) = f(a). Then in particular when h varies over those infinitely
many h, $f'(a) = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} = \lim_{h \to 0} \frac{0}{h} = 0$. Also, for such h, (g ◦ f)(a + h) − (g ◦
Theorem 6.2.6. (Real and imaginary parts) Let A ⊆ R, and let a ∈ A be a limit
point of A. Let f : A → C. Then f is differentiable at a if and only if Re f and Im f are
differentiable at a, and in that case, $f' = (\operatorname{Re} f)' + i(\operatorname{Im} f)'$.
Let $\epsilon > 0$. Since g is differentiable at a, and g(a) = f(a) = b and g'(a) = f'(a), by
the limit definition of derivatives, there exists δ > 0 such that for all x ∈ C, if |x − a| < δ
then $\left|\frac{g(x) - b}{x - a} - f'(a)\right| < \epsilon$. By continuity of $g^{-1}$, there exists γ > 0 such that for all
y ∈ D, if |y − b| < γ then $|g^{-1}(y) - a| = |g^{-1}(y) - g^{-1}(b)| < \delta$. In particular, for all
such y, $\left|\frac{y - b}{g^{-1}(y) - a} - f'(a)\right| = \left|\frac{g(g^{-1}(y)) - b}{g^{-1}(y) - a} - f'(a)\right| < \epsilon$. This says that for all h ∈ C, if
|h| < γ and b + h ∈ D, then
$\left|\frac{h}{g^{-1}(b+h) - a} - f'(a)\right| = \left|\frac{g(g^{-1}(b+h)) - b}{g^{-1}(b+h) - a} - f'(a)\right| < \epsilon$. Thus
$\lim_{h \to 0} \frac{h}{g^{-1}(b+h) - a} = f'(a)$, and by the quotient rule for limits,
\[
\lim_{h \to 0} \frac{g^{-1}(b + h) - g^{-1}(b)}{h} = \lim_{h \to 0} \frac{g^{-1}(b + h) - a}{h} = \frac{1}{f'(a)} = \frac{1}{f'(f^{-1}(b))}.
\]
But this holds for every C (and g which is f restricted to C), so that in particular,
\[
(f^{-1})'(b) = \lim_{h \to 0} \frac{f^{-1}(b + h) - f^{-1}(b)}{h} = \frac{1}{f'(f^{-1}(b))}.
\]
It should be noted that if we know that $f^{-1}$ is differentiable, the proof of the last
part of the theorem above goes as follows. For all x ∈ B, $(f \circ f^{-1})(x) = x$, so that
$(f \circ f^{-1})'(x) = 1$, and by the chain rule, $f'(f^{-1}(x)) \cdot (f^{-1})'(x) = 1$. Then if f' is never 0,
we get that $(f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))}$.
Example 6.2.8. Let f : [0, ∞) → [0, ∞) be the function $f(x) = x^2$. We know that f
is differentiable at all points in the domain and that f'(x) = 2x. By Example 6.1.5 and
Exercise 6.1.4, the inverse of f, namely the square root function, is differentiable at all
positive x, but not at 0. The theorem above applies to positive x (but not to x = 0):
\[
(\sqrt{x})' = (f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))} = \frac{1}{2f^{-1}(x)} = \frac{1}{2\sqrt{x}}.
\]
Theorem 6.2.9. Let n be a positive integer. Then for all non-zero x in the domain of $\sqrt[n]{\ }$,
\[
(\sqrt[n]{x})' = \frac{1}{n} x^{1/n - 1}.
\]
Proof. The power rule and quotient rules prove this in case r is an integer, and the previous
theorem proves it in case r is one over a positive integer. Now suppose that r = m/n for
some integers m, n with n ≠ 0. Since m/n = (−m)/(−n) is also a quotient of two integers,
we may write r = m/n so that m ∈ Z and n is a positive integer. Thus f is the composition
of exponentiation by m and by 1/n. By the chain rule,
\[
\begin{aligned}
f'(x) &= m(x^{1/n})^{m-1} \cdot \frac{1}{n} x^{1/n - 1} \\
&= \frac{m}{n} x^{(m-1)/n + 1/n - 1} \\
&= r x^{m/n - 1/n + 1/n - 1} \\
&= r x^{r-1}.
\end{aligned}
\]
In this section the domains and codomains of all functions are subsets of R.
Theorem 6.3.1. Let f : [a, b] → R, and let c ∈ [a, b] such that f achieves an extreme
value at c (i.e., either for all x ∈ [a, b], f (c) ≤ f (x) or for all x ∈ [a, b], f (c) ≥ f (x)). Then
at least one of the following holds:
(1) c = a;
(2) c = b;
(3) f is not continuous at c;
(4) f is not differentiable at c;
Proof. It suffices to prove that if the first four conditions do not hold, then the fifth one
has to hold. So we assume that c ≠ a, c ≠ b, and that f is differentiable at c.
Suppose that f'(c) > 0. By the definition of derivative, $f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}$.
Thus for all x very near c but larger than c, $\frac{f(x) - f(c)}{x - c} > 0$, so that f(x) − f(c) > 0, so
that f does not achieve its maximum at c. Also, for all x very near c but smaller than c,
$\frac{f(x) - f(c)}{x - c} > 0$, so that f(x) − f(c) < 0, so that f does not achieve its minimum at c. This
is a contradiction, so that f'(c) cannot be positive. Similarly, f'(c) cannot be negative.
Thus f'(c) = 0.
Thus to find extreme values of a function, one only has to check if extreme values
occur at the endpoints of the domain, at points where the function is not continuous or
non-differentiable, or where it is differentiable and the derivative is 0. One should be aware
that just because any of the five conditions is satisfied, we need not have an extreme value
of the function. Here are some examples:
(1) The function f : [−1, 1] → R given by $f(x) = x^3 - x$ has neither the maximum
nor the minimum at the endpoints.
(2) Let f : [−1, 1] → R be given by
$f(x) = \begin{cases} x, & \text{if } x > 0; \\ 1/2, & \text{if } x \le 0. \end{cases}$
Then f is not continuous at 0 but f does not have a minimum or maximum at 0.
(3) Let f : [−1, 1] → R be given by
$f(x) = \begin{cases} x, & \text{if } x > 0; \\ 2x, & \text{if } x \le 0. \end{cases}$
Then f is continuous and not differentiable at 0, yet f does not have a minimum or maximum at 0.
(4) Let f : [−1, 1] → R be given by $f(x) = x^3$. Then f is differentiable, f'(0) = 0, but
f does not have a minimum or maximum at 0.
Theorem 6.3.2. (Darboux’s theorem) Let a < b be real numbers, and let f : [a, b] →
R be differentiable. Then f' has the intermediate value property, i.e., for all k between
f'(a) and f'(b) there exists c ∈ [a, b] such that f'(c) = k.
Theorem 6.3.3. (Rolle’s theorem) Let a, b ∈ R with a < b, and let f : [a, b] → R be a
continuous function such that f is differentiable on (a, b). If f (a) = f (b), then there exists
c ∈ (a, b) such that f'(c) = 0.
Proof. By the Extreme value theorem (Theorem 5.2.2) there exist l, u ∈ [a, b] such that
f achieves its minimum at l and its maximum at u. If f(l) = f(u), then the minimum
value of f is the same as the maximum value of f, so that f is a constant function, and so
f'(c) = 0 for all c ∈ (a, b).
Thus we may assume that f(l) ≠ f(u). It may be that f achieves its minimum at
the two endpoints, in which case u must be strictly between a and b. Similarly, it may be
that f achieves its maximum at the two endpoints, in which case l must be strictly between
a and b. In all cases of f(l) ≠ f(u), either l or u is not the endpoint.
Say that l is not the endpoint. Then a < l < b. For all x ∈ [a, b], f(x) ≥ f(l),
so that in particular for all x ∈ (a, l), $\frac{f(x) - f(l)}{x - l} \le 0$ and for all x ∈ (l, b), $\frac{f(x) - f(l)}{x - l} \ge 0$.
Since $f'(l) = \lim_{x \to l} \frac{f(x) - f(l)}{x - l}$ exists, it must be both non-negative and non-positive, so
necessarily it has to be 0.
If instead u is not the endpoint, then a < u < b and for all x ∈ [a, b], f(x) ≤ f(u).
Thus in particular for all x ∈ (a, u), $\frac{f(x) - f(u)}{x - u} \ge 0$ and for all x ∈ (u, b), $\frac{f(x) - f(u)}{x - u} \le 0$.
Since $f'(u) = \lim_{x \to u} \frac{f(x) - f(u)}{x - u}$ exists, it must be both non-negative and non-positive, so
necessarily it has to be 0. Thus in all cases we found c ∈ (a, b) such that f'(c) = 0.
Theorem 6.3.4. (Mean value theorem) Let a, b ∈ R with a < b, and let f : [a, b] → R
be a continuous function such that f is differentiable on (a, b). Then there exists c ∈ (a, b)
such that $f'(c) = \frac{f(b) - f(a)}{b - a}$.
[Figure: illustration for the Mean value theorem; the horizontal axis is labeled x, with a and b marked.]
= f(a) = g(a). Thus by Rolle’s theorem, there exists c ∈ (a, b) such that g'(c) = 0. But
\[
g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a},
\]
so that $0 = g'(c) = f'(c) - \frac{f(b) - f(a)}{b - a}$, whence $f'(c) = \frac{f(b) - f(a)}{b - a}$.
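For a concrete function, a point c as in the Mean value theorem can be approximated numerically. The sketch below is an illustration under extra assumptions that the theorem itself does not need: the derivative is supplied as a second function `fprime`, it is continuous, and f' minus the secant slope has opposite signs at a and b, so that bisection applies. The name `mean_value_point` and the tolerance are hypothetical.

```python
def mean_value_point(f, fprime, a: float, b: float, tol: float = 1e-12) -> float:
    """Approximate c in (a, b) with fprime(c) = (f(b) - f(a)) / (b - a) by bisection.

    Assumes fprime is continuous and fprime - slope changes sign between a and b.
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    g_lo = fprime(lo) - slope
    g_hi = fprime(hi) - slope
    if g_lo * g_hi >= 0:
        raise ValueError("fprime - slope must have opposite signs at a and b")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = fprime(mid) - slope
        if g_mid == 0:
            return mid
        if (g_mid > 0) == (g_hi > 0):
            hi = mid  # sign change is in [lo, mid]
        else:
            lo = mid  # sign change is in [mid, hi]
    return (lo + hi) / 2
```

For $f(x) = x^3$ on [0, 2] the secant slope is 4, and `mean_value_point(lambda x: x**3, lambda x: 3*x*x, 0.0, 2.0)` approximates $c = 2/\sqrt{3}$, the solution of $3c^2 = 4$ in (0, 2).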
The rest of this section consists of various applications of the Mean value theorem.
More concrete examples are left for the exercises.
Theorem 6.3.5. Let a, b ∈ R with a < b, and let f : [a, b] → R be a continuous function
such that f is differentiable on (a, b).
(1) If f'(c) ≥ 0 for all c ∈ (a, b), then f is non-decreasing on [a, b].
(2) If f'(c) > 0 for all c ∈ (a, b), then f is strictly increasing on [a, b].
(3) If f'(c) ≤ 0 for all c ∈ (a, b), then f is non-increasing on [a, b].
(4) If f'(c) < 0 for all c ∈ (a, b), then f is strictly decreasing on [a, b].
(5) If f'(c) = 0 for all c ∈ (a, b), then f is a constant function.
Proof of part (2): Let x, y ∈ [a, b] with x < y. By Theorem 6.3.4 there exists c ∈ (x, y)
such that $f'(c) = \frac{f(x) - f(y)}{x - y}$. Since f'(c) > 0 and x < y, necessarily f(x) < f(y). Since x
and y were arbitrary with x < y, then f is strictly increasing on [a, b].
Theorem 6.3.8. (Cauchy’s mean value theorem) Let a < b be real numbers and let
f, g : [a, b] → R be continuous functions that are differentiable on (a, b). Then there exists
c ∈ (a, b) such that
\[
f'(c)(g(b) - g(a)) = g'(c)(f(b) - f(a)).
\]
In particular, if g'(c) ≠ 0 and g(b) ≠ g(a), this says that $\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)}$.
Proof. Define h : [a, b] → R by h(x) = f(x)(g(b) − g(a)) − g(x)(f(b) − f(a)). Then h is
continuous on [a, b] and differentiable on (a, b). Note that
h(a) = f(a)(g(b) − g(a)) − g(a)(f(b) − f(a)) = f(a)g(b) − g(a)f(b) = h(b). Then by the Mean value
theorem (Theorem 6.3.4) there exists c ∈ (a, b) such that h'(c) = 0, i.e.,
0 = f'(c)(g(b) − g(a)) − g'(c)(f(b) − f(a)).
Theorem 6.3.9. (Cauchy’s mean value theorem, II) Let a < b be real numbers and
let f, g : [a, b] → R be continuous functions that are differentiable on (a, b) and such that
g' is non-zero on (a, b). Then g(b) ≠ g(a), and there exists c ∈ (a, b) such that
\[
\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)}.
\]
Proof. By the Mean value theorem (Theorem 6.3.4) there exists c ∈ (a, b) such that
$g'(c) = \frac{g(b) - g(a)}{b - a}$. By assumption, g'(c) ≠ 0, so that g(b) ≠ g(a). The rest follows by
Cauchy’s mean value theorem (Theorem 6.3.8).
6.3.6. Let A be an open interval in R, and let f : A → R be differentiable such that the
range of the derivative function is bounded. Prove that f is uniformly continuous.
Proof. Let $\epsilon > 0$. Since $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$, there exists δ > 0 such that for all x in the domain, if
0 < |x − a| < δ then $\left|\frac{f'(x)}{g'(x)} - L\right| < \epsilon$. Let x be one such number. By Theorem 6.3.9 there exists
a number c strictly between a and x such that $\frac{f'(c)}{g'(c)} = \frac{f(x) - f(a)}{g(x) - g(a)}$. Since f(a) = g(a) = 0,
this says that $\frac{f'(c)}{g'(c)} = \frac{f(x)}{g(x)}$. Since 0 < |x − a| < δ and c is between a and x, it follows that
0 < |c − a| < δ. Hence
\[
\left|\frac{f(x)}{g(x)} - L\right| = \left|\frac{f'(c)}{g'(c)} - L\right| < \epsilon.
\]
With our definition of derivatives, this last version includes the one-sided cases where
the domains for f and g are of the form [a, b), or of the form (b, a].
I should note that some books omit hypothesis (3). A counterexample to the con-
clusion if we omit this hypothesis but keep all the others is in Exercise 10.4.7.
The versions of L’Hôpital’s rule so far deal with limits of the form “zero over zero”.
There are versions for the form “infinity over infinity”; I write and prove only the right-sided
version, but similarly there is a left-sided version and a both-sided version.
Theorem 6.4.3. (L’Hôpital’s rule) Let L and a < b be real numbers. Let f, g : (a, b) →
R be differentiable with the following properties:
(1) $\lim_{x \to b^-} f(x) = \lim_{x \to b^-} g(x) = \infty$.
(2) For all x ∈ (a, b), g'(x) ≠ 0.
(3) $\lim_{x \to b^-} \frac{f'(x)}{g'(x)} = L$.
Then $\lim_{x \to b^-} \frac{f(x)}{g(x)} = L$.
Proof. Let $\epsilon > 0$. By assumption there exists $\delta_1 > 0$ such that for all x ∈ (a, b), if
$0 < b - x < \delta_1$, then $\left|\frac{f'(x)}{g'(x)} - L\right| < \epsilon/4$. By possibly replacing $\delta_1$ by $\min\{\delta_1, (b - a)/2\}$ we
may assume that $b - \delta_1 > a$.
Set $a_0 = b - \delta_1$. Fix x such that $0 < b - x < \delta_1$. Then $x \in (a_0, b) \subseteq (a, b)$.
Suppose that $g(x) = g(a_0)$. Then by the Mean value theorem (Theorem 6.3.4) there exists
$u \in (a_0, x)$ such that g'(u) = 0, which contradicts the assumption (2). Thus $g(x) \neq g(a_0)$.
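By Theorem 6.3.9 there exists $c \in (a_0, x)$ such that $\frac{f'(c)}{g'(c)} = \frac{f(x) - f(a_0)}{g(x) - g(a_0)}$, and so
\[
\left|\frac{f(x) - f(a_0)}{g(x) - g(a_0)} - L\right| = \left|\frac{f'(c)}{g'(c)} - L\right| < \epsilon/4.
\]
Since $\lim_{x \to b^-} f(x) = \lim_{x \to b^-} g(x) = \infty$, there exists $\delta_2 > 0$ such that for all x with
$0 < b - x < \delta_2$, f(x) and g(x) are non-zero, and so we can define $h : (b - \delta_2, b) \to \mathbb{R}$ as
\[
h(x) = \frac{1 - \frac{f(a_0)}{f(x)}}{1 - \frac{g(a_0)}{g(x)}}.
\]
By Theorem 4.5.7, $\lim_{x \to b^-} h(x) = \frac{1 - 0}{1 - 0} = 1$. Thus there exists $\delta_3 > 0$ such that for all x,
if $0 < b - x < \delta_3$, then $|h(x) - 1| < \min\{\epsilon/(4(|L| + 1)), \frac{1}{2}\}$. The $(\frac{1}{2})$-restriction in particular
means that $h(x) > \frac{1}{2}$ and thus non-zero. Set $\delta = \min\{\delta_1, \delta_2, \delta_3\}$. Then for all x with
0 < b − x < δ,
\[
\begin{aligned}
\left|\frac{f(x)}{g(x)} - L\right| &= \left|\frac{f(x)}{g(x)} h(x) - L h(x)\right| \frac{1}{|h(x)|} \\
&= \left|\frac{f(x) - f(a_0)}{g(x) - g(a_0)} - L h(x)\right| \frac{1}{|h(x)|} \\
&\le 2 \left|\frac{f(x) - f(a_0)}{g(x) - g(a_0)} - L h(x)\right| && \text{(since } |h(x)| > 1/2\text{)} \\
&= 2 \left|\frac{f(x) - f(a_0)}{g(x) - g(a_0)} - L + L - L h(x)\right| && \text{(by addition of a clever zero)} \\
&\le 2 \left(\left|\frac{f(x) - f(a_0)}{g(x) - g(a_0)} - L\right| + |L - L h(x)|\right) && \text{(by the triangle inequality)} \\
&= 2 \left(\left|\frac{f'(c)}{g'(c)} - L\right| + |L| \, |1 - h(x)|\right) \\
&< 2 \left(\frac{\epsilon}{4} + \frac{\epsilon}{4}\right) \\
&= \epsilon.
\end{aligned}
\]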
Example 6.4.4. Compute $\lim_{x \to 1} \frac{x^2 - 1}{x^3 - 1}$.
Proof #1: By Example 1.6.4, $\frac{x^2 - 1}{x^3 - 1} = \frac{(x - 1)(x + 1)}{(x - 1)(x^2 + x + 1)}$, so that by Exercise 4.4.8 and
Theorem 4.4.6, $\lim_{x \to 1} \frac{x^2 - 1}{x^3 - 1} = \lim_{x \to 1} \frac{x + 1}{x^2 + x + 1} = \frac{1 + 1}{1^2 + 1 + 1} = \frac{2}{3}$.
Proof #2: By Theorem 4.4.5, $\lim_{x \to 1} (x^2 - 1) = 0 = \lim_{x \to 1} (x^3 - 1)$, $\lim_{x \to 1} (x^2 - 1)' = \lim_{x \to 1} 2x = 2$,
and $\lim_{x \to 1} (x^3 - 1)' = \lim_{x \to 1} 3x^2 = 3$. Thus by L’Hôpital’s rule (Theorem 6.4.2), $\lim_{x \to 1} \frac{x^2 - 1}{x^3 - 1}$
equals $\frac{2}{3}$.
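Both computations of this limit can be sanity-checked with exact rational arithmetic. The sketch below evaluates the original quotient at points x = 1 + 10^(−k) and the quotient of derivatives at x = 1; the function names `ratio` and `derivative_ratio` are assumptions for illustration. A numerical check of this kind only supports the value 2/3 and does not replace either proof.

```python
from fractions import Fraction


def ratio(x: Fraction) -> Fraction:
    """The quotient (x^2 - 1) / (x^3 - 1), defined for x != 1."""
    return (x ** 2 - 1) / (x ** 3 - 1)


def derivative_ratio(x: Fraction) -> Fraction:
    """The quotient of derivatives (2x) / (3x^2), defined for x != 0."""
    return (2 * x) / (3 * x ** 2)


# Distance from 2/3 at the points x = 1 + 10^(-k), for k = 1, ..., 8.
gaps = [abs(ratio(1 + Fraction(1, 10 ** k)) - Fraction(2, 3)) for k in range(1, 9)]
```

The entries of `gaps` shrink roughly like 10^(−k)/3, and `derivative_ratio(Fraction(1))` is exactly 2/3.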
More forms of L’Hôpital’s rule are in the exercises below and also in Sections 7.6 and
10.4 after the exponential and trigonometric functions have been introduced.
and for a = 1,
\[
\begin{aligned}
T_{0,f,1}(x) &= -1, \\
T_{1,f,1}(x) &= -1 + 10(x - 1), \\
T_{2,f,1}(x) &= -1 + 10(x - 1) + (x - 1)^2, \\
T_{3,f,1}(x) &= -1 + 10(x - 1) + (x - 1)^2 + (x - 1)^3, \\
T_{n,f,1}(x) &= -1 + 10(x - 1) + (x - 1)^2 + (x - 1)^3 + (x - 1)^4 \quad \text{for all } n \ge 4.
\end{aligned}
\]
Note that for all n ≥ 4, $T_{n,f,0}(x) = T_{n,f,1}(x) = f(x)$.
The following is a generalization of this observation:
Theorem 6.5.4. If f is a polynomial of degree at most d, then for any a ∈ C and any
integer n ≥ d, the nth-order Taylor polynomial of f centered at a equals f .
Proof. Write $f(x) = c_0 + c_1 x + \cdots + c_d x^d$ for some $c_0, c_1, \ldots, c_d \in \mathbb{C}$. By elementary algebra,
it is possible to rewrite $f$ in the form $f(x) = e_0 + e_1(x-a) + e_2(x-a)^2 + \cdots + e_d(x-a)^d$
for some $e_0, e_1, \ldots, e_d \in \mathbb{C}$. Now observe that
\[
\begin{aligned}
f(a) &= e_0, \\
f'(a) &= e_1, \\
f''(a) &= 2e_2, \\
f'''(a) &= 6e_3 = 3!\,e_3, \\
f^{(4)}(a) &= 24e_4 = 4!\,e_4, \\
&\;\;\vdots \\
f^{(k)}(a) &= k!\,e_k \quad\text{if } k \le d, \\
f^{(k)}(a) &= 0 \quad\text{if } k > d.
\end{aligned}
\]
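The relations $f^{(k)}(a) = k!\,e_k$ can be tried out on a concrete polynomial. The sketch below uses exact rational arithmetic; the polynomial $2 - x + 3x^3$, the center $a = 1$, and all function names are hypothetical choices made only for this illustration. It computes the coefficients $f^{(k)}(a)/k!$ and checks that the resulting Taylor polynomial of order $n \ge 3$ reproduces the polynomial.

```python
from fractions import Fraction
from math import factorial


def derivative(coeffs):
    """Coefficients of the derivative of c0 + c1*x + c2*x^2 + ..."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x (Horner's rule)."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result


def taylor_coefficients(coeffs, a, n):
    """Return the list [f(a)/0!, f'(a)/1!, ..., f^(n)(a)/n!]."""
    out = []
    current = list(coeffs)
    for k in range(n + 1):
        out.append(evaluate(current, a) / factorial(k))
        current = derivative(current)
    return out


def taylor_value(coeffs, a, n, x):
    """Value at x of the n-th order Taylor polynomial centered at a."""
    e = taylor_coefficients(coeffs, a, n)
    return sum(ek * (x - a) ** k for k, ek in enumerate(e))


# Hypothetical example: f(x) = 2 - x + 3x^3, centered at a = 1.
f_coeffs = [2, -1, 0, 3]
print(taylor_coefficients(f_coeffs, Fraction(1), 5))
print(taylor_value(f_coeffs, Fraction(1), 5, Fraction(5, 2)),
      evaluate(f_coeffs, Fraction(5, 2)))
```

With order 2 instead of 3 or more, the two printed values in the last line would differ, which matches the requirement $n \ge d$ in the theorem.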
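Proof. Let $\epsilon > 0$. Let $M = 1 + \max\{|f(a)|, |f'(a)|, \ldots, |f^{(n)}(a)|\}$. Since $f$ is differentiable
at $a$, it is continuous at $a$, so there exists $\delta_1 > 0$ such that for all $x \in B(a, r)$, if $|x - a| < \delta_1$,
then $|f(x) - f(a)| < \epsilon/(n+1)$. Set
\[
\delta = \min\left\{r, \delta_1, \frac{\epsilon}{M(n+1)}, \sqrt{\frac{\epsilon}{M(n+1)}}, \ldots, \sqrt[n]{\frac{\epsilon}{M(n+1)}}\right\}.
\]
Let $x$ satisfy $|x - a| < \delta$. Then $x$ is in the domain of $f$ and $T_{n,f,a}$, and
\[
\begin{aligned}
|f(x) - T_{n,f,a}(x)|
&= \left|f(x) - f(a) - f'(a)(x-a) - \frac{f''(a)}{2!}(x-a)^2 - \cdots - \frac{f^{(n)}(a)}{n!}(x-a)^n\right| \\
&\le |f(x) - f(a)| + |f'(a)|\,|x-a| + \left|\frac{f''(a)}{2!}\right| |x-a|^2 + \cdots + \left|\frac{f^{(n)}(a)}{n!}\right| |x-a|^n \\
&< \frac{\epsilon}{n+1} + M\,\frac{\epsilon}{M(n+1)} + M\,\frac{\epsilon}{M(n+1)} + \cdots + M\,\frac{\epsilon}{M(n+1)} \\
&= \epsilon.
\end{aligned}
\]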
More on Taylor polynomials and Taylor series is in the exercises in this section, in
Exercise 7.4.9, and in Section 9.3.
6.5.2. Prove that if f is a polynomial function, then for every a ∈ R, Tn,f,a = f for all n
greater than or equal to the degree of f .
6.5.3. (Generalized product rule.) Suppose that $f$ and $g$ have derivatives of orders $1, \ldots, n$
at $a$. Prove that
\[
(f \cdot g)^{(n)}(a) = \sum_{k=0}^{n} \binom{n}{k} f^{(k)}(a)\, g^{(n-k)}(a).
\]
6.5.4. (Generalized quotient rule.) Prove the following generalization of the product rule:
Suppose that $f$ and $g$ have derivatives of orders $1, \ldots, n$ at $a$ and that $g(a) \ne 0$. Find and
prove a formula for the $n$th derivative of the function $f/g$.
6.5.5. Compute the Taylor polynomial of $f(x) = \sqrt{1+x}$ of degree 5 centered at $a = 0$.
Justify your work.
6.5.6. Compute the Taylor polynomial of $f(x) = \sqrt{1-x}$ of degree 10 centered at $a = 0$.
Justify your work.
6.5.7. Let $f(x) = \frac{1}{1-x}$.
i) Compute $f^{(k)}(x)$ for all integers $k \ge 0$.
ii) Compute the Taylor polynomial of $f$ of arbitrary degree $n$ and centered at $a = 0$.
iii) Compute $T_{n,f,0}(0.5)$. (Hint: Use Example 1.6.4.)
iv) Compute $T_{n,f,0}(0.5) - f(0.5)$.
v) Compute $n$ such that $|T_{n,f,0}(0.5) - f(0.5)| < 0.001$.
vi) Try to use Theorem 6.5.5 to determine $n$ such that $|T_{n,f,0}(0.5) - f(0.5)| < 0.001$.
Note that this usage is not fruitful.
vii) Use Theorem 6.5.5 to determine $n$ such that $|T_{n,f,0}(0.4) - f(0.4)| < 0.001$.
6.5.8. Let f : C → R be the absolute value function.
i) Prove that f has derivatives of all orders at all non-zero numbers.
ii) Compute T2,f,i .
6.5.9. Let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = |x|^3$.
i) Prove that $f$ is differentiable, and compute $f'$.
ii) Prove that $f'$ is differentiable, and compute $f''$.
iii) Prove that $f''$ is not differentiable.
6.5.10. Find a differentiable function $f : \mathbb{R} \to \mathbb{R}$ such that $f'$ is not differentiable.
6.5.11. Find a function $f : \mathbb{R} \to \mathbb{R}$ that has derivatives of orders 1, 2, 3, 4, but such that
$f^{(4)}$ is not differentiable.
Chapter 7: Integration
The basic motivation for integration is computing areas of regions bounded by graphs
of functions. In this chapter we develop the theory of integration for functions whose
domains are subsets of R. The first two sections handle only codomains in R, and at the
end of Section 7.4 we extend integration to functions with codomains in C. We do not
extend to domains being subsets of C as that would require multi-variable methods and
complex analysis, which are not the subject of this course.
In this section, domains and codomains of all functions are subsets of R. Thus we
can draw the regions and build the geometric intuition together with the formalism.
Let f : [a, b] → R. The basic aim is to compute the signed area of the region
bounded by the x-axis, the graph of y = f (x), and the lines x = a and x = b. By “signed”
area we mean that we add up the areas of the regions above the x-axis and subtract the
areas of the regions below the x-axis. Thus a signed area may be positive, negative or zero.
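[Figure: the graph of y = f(x) in the xy-plane over the interval from a to b, with the region between the graph and the x-axis shaded.]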
In the plot above, there are many (eight) regions whose boundaries are some of the
listed curves, but only the shaded region (comprising two of the eight regions in the count)
is bounded as a subset of the plane.
The simplest case of an area is when f is a constant function with constant value c.
Then the signed area is c · (b − a), which is positive if c > 0 and non-positive if c ≤ 0.
For a general f , we can try to approximate the area by rectangles, such as in the
following approximations with crosshatched rectangles:
[Figure: the graph of y = f(x) over the interval from a to b, with the region approximated by crosshatched rectangles.]
It may be hard to decide how close the approximation is to the true value. But we
can approximate the region more systematically, by having heights of the rectangles be
either the least possible height or the largest possible height, as below:
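[Figure: two approximations over the interval from a to b; on the left the rectangles have the least possible heights, on the right the largest possible heights.]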
Then clearly the true area is larger than the sum of the areas of the darker rectangles
on the left and smaller than the sum of the areas of the darker rectangles on the right.
We establish some notation for all this.
Remark 7.1.2. If f is a bounded function with codomain R, then by the Least upper
bound property (Axiom 2.10.1), for any subset B of the domain, sup{f (x) : x ∈ B} and
inf{f (x) : x ∈ B} are real numbers.
Definition 7.1.3. A partition of [a, b] is a finite subset of [a, b] that contains a and b.
We typically write a partition in the form P = {x0 , x1 , . . . , xn }, where x0 = a < x1 < x2 <
· · · < xn−1 < xn = b.
The sub-intervals of the partition P are [x0 , x1 ], [x1 , x2 ], . . ., [xn−1 , xn ].
In all illustrated examples above, the partition of [a, b] uses n = 10. Note that the
sub-interval [x1 , x5 ] of the interval [a, b] is not a sub-interval of the partition!
Theorem 7.1.6. If f (x) = c for all x ∈ [a, b], then for any partition P of [a, b], L(f, P ) =
U (f, P ) = c(b − a).
so that
\[
L(f, P) = U(f, P) = \sum_{k=1}^{n} c\,(x_k - x_{k-1}) = c \sum_{k=1}^{n} (x_k - x_{k-1}) = c\,(x_n - x_0) = c\,(b - a).
\]
Example 7.1.7. Approximate the area under the curve $y = f(x) = 30x^4 + 2x$ between
$x = 1$ and $x = 4$. We first establish a partition $P_n = \{x_0, \ldots, x_n\}$ of $[1, 4]$ into $n$ equal
sub-intervals. The length of each sub-interval is $(4 - 1)/n$, and $x_0 = 1$, so that $x_1 =
x_0 + 3/n = 1 + 3/n$, $x_2 = x_1 + 3/n = 1 + 2 \cdot 3/n$, and in general, $x_k = 1 + k \cdot 3/n$.
Note that $x_n = 1 + n \cdot 3/n = 4$, as needed. Since $f'(x) = 120x^3 + 2$ is positive on $[1, 4]$,
it follows that $f$ is increasing on $[1, 4]$. Thus necessarily for each $k$, the infimum of all
values of $f$ on the $k$th sub-interval is achieved at the left endpoint, and the supremum at
the right endpoint. In symbols, this says that $\inf\{f(x) : x \in [x_{k-1}, x_k]\} = f(x_{k-1})$ and
$\sup\{f(x) : x \in [x_{k-1}, x_k]\} = f(x_k)$. For example, with $n = 1$, $L(f, P_1) = f(1) \cdot 3 = 96$ and
$U(f, P_1) = f(4) \cdot 3 = 3 \cdot (30 \cdot 4^4 + 2 \cdot 4) = 23064$. Thus the true area is some number between
96 and 23064. Admittedly, this is not much information. The problem is that our partition
is too rough. A computer program produced the following better numerical approximations
for lower and upper sums with respect to partitions $P_n$ into $n$ equal sub-intervals:
n L(f, Pn ) U (f, Pn )
10 5061.2757 7358.0757
100 6038.727 6268.407
1000 6141.5217 6164.48967
10000 6151.8517 6154.148
100000 6152.885 6153.1148
1000000 6152.988 6153.011
Notice how the lower sums get larger and the upper sums get smaller as we take finer
partitions. We would like to conclude that the true area is between 6152.988 and 6153.011.
This is getting closer but may still be insufficient precision. For more precision, partitions
would have to get even finer, but the calculations slow down too.
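A table of this kind can be produced with a few lines of code. The following is a minimal sketch, not the program used for the table above: the function name `lower_upper_sums` is chosen here for illustration, and the code relies on the fact from this example that f is increasing on [1, 4], so the infimum on each sub-interval is at the left endpoint and the supremum at the right endpoint.

```python
def f(x):
    """The function from Example 7.1.7."""
    return 30 * x**4 + 2 * x


def lower_upper_sums(n, a=1.0, b=4.0):
    """Lower and upper sums of the increasing function f over n equal parts of [a, b]."""
    width = (b - a) / n
    points = [a + k * width for k in range(n + 1)]
    # f is increasing: inf at the left endpoint, sup at the right endpoint.
    lower = sum(f(points[k - 1]) * width for k in range(1, n + 1))
    upper = sum(f(points[k]) * width for k in range(1, n + 1))
    return lower, upper


for n in (1, 10, 100, 1000):
    lower, upper = lower_upper_sums(n)
    print(n, lower, upper)
```

Floating-point rounding affects the last printed digits, so the output agrees with the table only up to small rounding differences.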
The observed monotonicity is not a coincidence:
Theorem 7.1.10. For any partitions P and Q of [a, b], and for any bounded function
f : [a, b] → R,
L(f, P ) ≤ U (f, Q).
We say that $f$ is integrable over $[a, b]$ when $L(f) = U(f)$. We call this common
value the integral of $f$ over $[a, b]$, and we write it as
\[
\int_a^b f = \int_a^b f(x)\,dx = \int_a^b f(t)\,dt.
\]
Theorem 7.1.10 shows that always $L(f) \le U(f)$. Example 7.1.9 shows that sometimes
strict inequality holds. Note that we have not yet proved that the function in
Example 7.1.7 is integrable, but in Section 7.3 we prove more generally that every continuous
function is integrable over a closed bounded interval.
By Exercise 7.1.3, for non-constant functions the upper and lower sums with respect
to a partition are never equal. In this case for every partition P , L(f, P ) < U (f, P ), yet
for good, i.e., integrable, functions it can happen that L(f ) = U (f ).
The simplest examples of integrable functions are constant functions:
Theorem 7.1.12. (Constant rule for integrals.) If $f(x) = c$ for all $x \in [a, b]$, then for
any partition $P$ of $[a, b]$, $\int_a^b f = L(f) = U(f) = c(b - a)$.
Theorem 7.1.13. Let $f : [a, b] \to \mathbb{R}$ be bounded. Then $f$ is integrable over $[a, b]$ if and
only if for all $\epsilon > 0$ there exists a partition $P$ of $[a, b]$ such that $U(f, P) - L(f, P) < \epsilon$.
Proof. Suppose that $f$ is integrable over $[a, b]$. Then $L(f) = U(f)$. Let $\epsilon > 0$. Since $L(f)$
is the supremum of all lower sums $L(f, P)$ as $P$ varies over partitions of $[a, b]$, there exists
a partition $P_1$ of $[a, b]$ such that $L(f) - L(f, P_1) < \epsilon/2$. Similarly, there exists a partition
$P_2$ of $[a, b]$ such that $U(f, P_2) - U(f) < \epsilon/2$. Let $P = P_1 \cup P_2$. Then $P$ is a partition of
$[a, b]$, and by Equation (7.1.5) and by Theorem 7.1.8,
\[
L(f, P_1) \le L(f, P) \le U(f, P) \le U(f, P_2).
\]
Hence, since $L(f) = U(f)$,
\[
U(f, P) - L(f, P) \le U(f, P_2) - L(f, P_1) = (U(f, P_2) - U(f)) + (L(f) - L(f, P_1)) < \epsilon.
\]
Now suppose that for every $\epsilon > 0$ there exists a partition $P$ of $[a, b]$ such that
$U(f, P) - L(f, P) < \epsilon$. By the supremum/infimum definitions of lower and upper integrals,
\[
0 \le U(f) - L(f) \le U(f, P) - L(f, P) < \epsilon.
\]
Since the non-negative constant $U(f) - L(f)$ is strictly smaller than every positive number $\epsilon$,
it means, by Theorem 2.11.4, that $U(f) - L(f) = 0$, so that $f$ is integrable.
Notation 7.1.15. In the definition of integral there is no need to write “$dx$” when we
are simply integrating a function $f$, as in “$\int_a^b f$”: we seek the signed area determined by
$f$ over the domain from $a$ to $b$. For this it does not matter if we like to plug $x$ or $t$ or
anything else into $f$. But when we write “$f(x)$” rather than “$f$”, then we need to write
“$dx$”, and the reason is that $f(x)$ is an element of the codomain and is not a function.
Theorem 7.1.16. Let $f : [a, b] \to \mathbb{R}$ be bounded and integrable. For each real number
$r > 0$ let $P_r = \{x_0^{(r)}, x_1^{(r)}, \ldots, x_{n_r}^{(r)}\}$ be a partition of $[a, b]$ such that each sub-interval
$[x_{k-1}^{(r)}, x_k^{(r)}]$ has length at most $r$. Then
\[
\lim_{r \to 0^+} L(f, P_r) = \lim_{r \to 0^+} U(f, P_r) = \int_a^b f.
\]
Furthermore, if for each $r > 0$ and each $k = 1, 2, \ldots, n_r$, $c_k^{(r)}$ is arbitrary in
$[x_{k-1}^{(r)}, x_k^{(r)}]$, then
\[
\lim_{r \to 0^+} \sum_{k=1}^{n_r} f(c_k^{(r)})\,(x_k^{(r)} - x_{k-1}^{(r)}) = \int_a^b f.
\]
Proof. Let $\epsilon > 0$. By Theorem 7.1.13, there exists a partition $P$ of $[a, b]$ such that
$U(f, P) - L(f, P) < \epsilon/2$. Let $P = \{y_0, y_1, \ldots, y_n\}$. Since $f$ is bounded, there exists a
positive real number $M$ such that for all $x \in [a, b]$, $|f(x)| < M$. Let $r$ be any positive real
number such that
\[
r < \frac{1}{2} \min\left\{\frac{\epsilon}{4Mn},\; y_1 - y_0,\; y_2 - y_1,\; \ldots,\; y_n - y_{n-1}\right\},
\]
and let $P_r = \{x_0, x_1, \ldots, x_{m_r}\}$ (omitting the superscript $(r)$). By the definition of $r$,
each sub-interval $[x_{k-1}, x_k]$ contains at most one element of $P$. When it does contain an
element of $P$, we call such $k$ special and we denote its point of $P$ as $y_{j(k)}$. Thus $[x_{k-1}, x_k] =
[x_{k-1}, y_{j(k)}] \cup [y_{j(k)}, x_k]$. The elements $y_0$ and $y_n$ are each in exactly one sub-interval of
$P_r$, whereas the $n - 1$ points $y_1, \ldots, y_{n-1}$ may each be contained in two sub-intervals of $P_r$
(when it is also in $P_r$), so that there are at most $2 + 2 \cdot (n - 1) = 2n$ special integers. For
non-special $k$ there exists a unique $i(k) \in \{1, \ldots, n\}$ such that $[x_{k-1}, x_k] \subseteq [y_{i(k)-1}, y_{i(k)}]$.
Then
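\[
\begin{aligned}
L(f, P_r) &= \sum_{k=1}^{m_r} \inf\{f(x) : x \in [x_{k-1}, x_k]\}\,(x_k - x_{k-1}) \\
&= \sum_{\text{special } k=1}^{m_r} \inf\{f(x) : x \in [x_{k-1}, x_k]\}\,(x_k - x_{k-1}) \\
&\quad + \sum_{\text{non-special } k=1}^{m_r} \inf\{f(x) : x \in [x_{k-1}, x_k]\}\,(x_k - x_{k-1}),
\end{aligned}
\]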
which is not quite equal to $L(f, P)$ because for each $i \in \{1, \ldots, n\}$, the union of the
sub-intervals $[x_{k-1}, x_k]$ contained in $[y_{i-1}, y_i]$ need not be all of $[y_{i-1}, y_i]$. In fact, $L(f, P)$ equals
the summand in the last row plus
\[
\sum_{\text{special } k=1}^{m_r} \inf\{f(x) : x \in [y_{j(k)-1}, y_{j(k)}]\}\,(y_{j(k)} - x_{k-1})
+ \sum_{\text{special } k=1}^{m_r} \inf\{f(x) : x \in [y_{j(k)}, y_{j(k)+1}]\}\,(x_k - y_{j(k)}).
\]
Thus
\[
\begin{aligned}
L(f, P_r) &\ge \sum_{\text{special } k=1}^{m_r} \inf\{f(x) : x \in [x_{k-1}, x_k]\}\,(x_k - x_{k-1}) + L(f, P) \\
&\quad - \sum_{\text{special } k=1}^{m_r} \inf\{f(x) : x \in [y_{j(k)-1}, y_{j(k)}]\}\,(y_{j(k)} - x_{k-1}) \\
&\quad - \sum_{\text{special } k=1}^{m_r} \inf\{f(x) : x \in [y_{j(k)}, y_{j(k)+1}]\}\,(x_k - y_{j(k)}) \\
&\ge -\sum_{\text{special } k=1}^{m_r} M\,(x_k - x_{k-1}) + L(f, P) \\
&\quad - \sum_{\text{special } k=1}^{m_r} M\,(y_{j(k)} - x_{k-1}) - \sum_{\text{special } k=1}^{m_r} M\,(x_k - y_{j(k)}) \\
&= L(f, P) - 2M \sum_{\text{special } k=1}^{m_r} (x_k - x_{k-1}) \\
&\ge L(f, P) - 2M \cdot 2n \cdot r \\
&> L(f, P) - \frac{\epsilon}{2}.
\end{aligned}
\]
Since $U(f, P) \ge U(f) = L(f) \ge L(f, P)$ and $U(f, P) - L(f, P) < \frac{\epsilon}{2}$, it follows that
$L(f) - L(f, P) < \epsilon/2$. Thus by the triangle inequality and minding which quantities are
larger to be able to omit absolute value signs,
\[
L(f) - L(f, P_r) < L(f) - L(f, P) + \frac{\epsilon}{2} < \epsilon.
\]
This then proves that $\lim_{r \to 0^+} L(f, P_r) = L(f) = \int_a^b f$. Similarly we can prove that
$U(f, P_r) - U(f) < \epsilon$ and so that $\lim_{r \to 0^+} U(f, P_r) = U(f) = \int_a^b f$.
Furthermore, by compatibility of order with multiplication by positive numbers and
addition, we have that for each $r$,
\[
L(f, P_r) \le \sum_{k=1}^{n_r} f(c_k^{(r)})\,(x_k^{(r)} - x_{k-1}^{(r)}) \le U(f, P_r),
\]
so that for all $r$ sufficiently close to 0, $\sum_{k=1}^{n_r} f(c_k^{(r)})(x_k^{(r)} - x_{k-1}^{(r)})$ is within $\epsilon$ of $\int_a^b f$. Thus
by the definition of limits, $\lim_{r \to 0^+} \sum_{k=1}^{n_r} f(c_k^{(r)})(x_k^{(r)} - x_{k-1}^{(r)}) = \int_a^b f$.
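The sandwich between lower and upper sums used at the end of this proof can be observed numerically. The sketch below is only an illustration under stated assumptions: it reuses the increasing function from Example 7.1.7 on [1, 4], uses equal sub-intervals, and picks each tag $c_k$ at random; the function name `tagged_sum` and the fixed random seed are choices made here, not part of the text.

```python
import random


def f(x):
    """The increasing function from Example 7.1.7, used here as a test case."""
    return 30 * x**4 + 2 * x


def tagged_sum(n, a=1.0, b=4.0, seed=0):
    """Return (lower sum, tagged sum, upper sum) over n equal parts of [a, b].

    The tag c_k is chosen at random in the k-th sub-interval. Because f is
    increasing, the lower sum uses left endpoints and the upper sum uses
    right endpoints.
    """
    rng = random.Random(seed)
    width = (b - a) / n
    lower = tagged = upper = 0.0
    for k in range(1, n + 1):
        left = a + (k - 1) * width
        right = a + k * width
        c = rng.uniform(left, right)
        lower += f(left) * width
        tagged += f(c) * width
        upper += f(right) * width
    return lower, tagged, upper


for n in (10, 100, 1000, 10000):
    print(n, tagged_sum(n))
```

As the sub-interval length 3/n shrinks, all three printed numbers approach each other, in line with the statement of Theorem 7.1.16.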
7.1.8. Use Eantegrability from the previous exercise. Let
\[
f(x) = \begin{cases} 1, & \text{if $x$ is rational;} \\ 0, & \text{if $x$ is irrational.} \end{cases}
\]
By Example 7.1.9 we know that f is not integrable over [0, 2].
2
i) Prove that f is Eantegrable over [0, 2] and find its Eantegral f .
0
√2 2 √2 2 2
ii) Compute f, √ f , and prove that f + √ f 6= f .
0 2 0 2 0
The definition of integrals appears daunting: we seem to need to compute all the
possible lower sums to get at the lower integral, all the possible upper sums to get at the
upper integral, and only if the lower and upper integrals are the same do we have the precise
integral. In Example 7.1.7 in the previous section we have already seen that numerically
we can often compute the integral to within desired precision by taking finer and finer
partitions. In this section we compute some precise numerical values of integrals, and
without computing all the possible upper and lower sums. Admittedly, the computations
are time-consuming, but the reader is encouraged to read through them to get an idea of
what calculations are needed to follow the definition of integrals. In Section 7.4 we will see
very efficient shortcuts for computing integrals, but only for easy/good functions.
Example 7.2.1. Let $f(x) = x$ on $[2, 6]$. We know that the area under the curve between
$x = 2$ and $x = 6$ is 16. Here we compute that indeed $\int_2^6 f = 16$. For any positive integer $n$
let $P_n = \{x_0, \ldots, x_n\}$ be the partition of $[2, 6]$ into $n$ equal parts. By Exercise 7.1.1,
$x_k = 2 + k\frac{4}{n}$. Since $f$ is increasing, on each sub-interval $[x_{k-1}, x_k]$ the minimum is $x_{k-1}$
and the maximum is $x_k$. Thus
\[
\begin{aligned}
U(f, P_n) &= \sum_{k=1}^{n} x_k (x_k - x_{k-1}) \\
&= \sum_{k=1}^{n} \left(2 + k\,\frac{4}{n}\right) \frac{4}{n} \\
&= \sum_{k=1}^{n} 2\,\frac{4}{n} + \sum_{k=1}^{n} k \left(\frac{4}{n}\right)^2 \\
&= 2\,\frac{4}{n}\,n + \frac{n(n+1)}{2} \left(\frac{4}{n}\right)^2 \\
&= 8 + \frac{8(n+1)}{n}.
\end{aligned}
\]
Example 7.2.2. We compute the integral for $f(x) = x^2$ over $[1, 7]$. For any positive integer
$n$ let $P_n = \{x_0, \ldots, x_n\}$ be the partition of $[1, 7]$ into $n$ equal parts. By Exercise 7.1.1,
$x_k = 1 + k\frac{6}{n}$. Since $f$ is increasing on $[1, 7]$, on each sub-interval $[x_{k-1}, x_k]$ the minimum
is $x_{k-1}^2$ and the maximum is $x_k^2$. Thus, by using Exercise 1.6.1 in one of the steps below:
\[
\begin{aligned}
U(f, P_n) &= \sum_{k=1}^{n} x_k^2 (x_k - x_{k-1}) \\
&= \sum_{k=1}^{n} \left(1 + k\,\frac{6}{n}\right)^2 \frac{6}{n} \\
&= \sum_{k=1}^{n} \left(1 + \frac{12}{n}\,k + \frac{36}{n^2}\,k^2\right) \frac{6}{n} \\
&= \sum_{k=1}^{n} \frac{6}{n} + \sum_{k=1}^{n} k\,\frac{12}{n}\,\frac{6}{n} + \sum_{k=1}^{n} k^2 \left(\frac{6}{n}\right)^3 \\
&= \frac{6}{n}\,n + \frac{n(n+1)}{2}\,\frac{12}{n}\,\frac{6}{n} + \frac{n(n+1)(2n+1)}{6} \left(\frac{6}{n}\right)^3 \\
&= 6 + \frac{36(n+1)}{n} + \frac{36(n+1)(2n+1)}{n^2},
\end{aligned}
\]
so that $U(f) \le \inf\{U(f, P_n) : n\} = 6 + 36 + 72 = 114$. Similarly,
\[
L(f) \ge \sup\{L(f, P_n) : n\} = 114,
\]
whence $114 \le L(f) \le U(f) \le 114$ and $\int_1^7 f = 114$.
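The closed form for the upper sums in this example can be double-checked with exact rational arithmetic. The sketch below is only a verification aid: the function names `upper_sum`, `lower_sum` and `closed_form` are introduced here, and the sums are computed directly from the definition for the equal partition of [1, 7].

```python
from fractions import Fraction


def upper_sum(n):
    """U(f, P_n) for f(x) = x^2 on [1, 7], using right endpoints (f is increasing)."""
    width = Fraction(6, n)
    return sum((1 + k * width) ** 2 * width for k in range(1, n + 1))


def lower_sum(n):
    """L(f, P_n) for f(x) = x^2 on [1, 7], using left endpoints (f is increasing)."""
    width = Fraction(6, n)
    return sum((1 + (k - 1) * width) ** 2 * width for k in range(1, n + 1))


def closed_form(n):
    """The formula 6 + 36(n+1)/n + 36(n+1)(2n+1)/n^2 derived in Example 7.2.2."""
    n = Fraction(n)
    return 6 + 36 * (n + 1) / n + 36 * (n + 1) * (2 * n + 1) / n**2


for n in (1, 10, 100, 1000):
    print(n, float(lower_sum(n)), float(upper_sum(n)), upper_sum(n) == closed_form(n))
```

The lower sums stay below 114 and the upper sums stay above 114, and both approach 114 as n grows.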
Example 7.2.3. The goal of this exercise is to compute $\int_0^2 \sqrt{x}\,dx$. For the first attempt,
let $P_n = \{x_0, \ldots, x_n\}$ be the partition of $[0, 2]$ into $n$ equal intervals. By Exercise 7.1.1,
$x_k = \frac{2k}{n}$. The square root function is increasing, so that
\[
\begin{aligned}
U(f, P_n) &= \sum_{k=1}^{n} \sqrt{x_k}\,(x_k - x_{k-1}) \\
&= \sum_{k=1}^{n} \sqrt{\frac{2k}{n}}\;\frac{2}{n} \\
&= \left(\frac{2}{n}\right)^{3/2} \sum_{k=1}^{n} \sqrt{k}.
\end{aligned}
\]
But now we are stuck: we have no simplification of $\sum_{k=1}^{n} \sqrt{k}$, and we have no other
immediate tricks to compute $\inf\left\{\left(\frac{2}{n}\right)^{3/2} \sum_{k=1}^{n} \sqrt{k} : n\right\}$.
But it is possible to compute enough upper and lower sums for this function to get
the integral. Namely, for each positive integer $n$ let
$Q_n = \left\{0, \frac{2}{n^2}, \frac{2 \cdot 4}{n^2}, \frac{2 \cdot 9}{n^2}, \ldots, \frac{2 \cdot (n-1)^2}{n^2}, \frac{2 \cdot n^2}{n^2} = 2\right\}$.
This is a partition of $[0, 2]$ into $n$ (unequal) parts, with $x_k = \frac{2k^2}{n^2}$. Since $\sqrt{\ }$ is an
increasing function, on each sub-interval $[x_{k-1}, x_k]$ the minimum is achieved at $x_{k-1}$ and
the maximum at $x_k$, so that
\[
\begin{aligned}
U(f, Q_n) &= \sum_{k=1}^{n} \frac{\sqrt{2}\,k}{n} \left(\frac{2k^2}{n^2} - \frac{2(k-1)^2}{n^2}\right) \\
&= \frac{2\sqrt{2}}{n^3} \sum_{k=1}^{n} k \left(k^2 - (k-1)^2\right) \\
&= \frac{2\sqrt{2}}{n^3} \sum_{k=1}^{n} k\,(2k - 1) \\
&= \frac{4\sqrt{2}}{n^3} \sum_{k=1}^{n} k^2 - \frac{2\sqrt{2}}{n^3} \sum_{k=1}^{n} k \\
&= \frac{4\sqrt{2}}{n^3}\,\frac{n(n+1)(2n+1)}{6} - \frac{2\sqrt{2}}{n^3}\,\frac{n(n+1)}{2} \quad\text{(by Exercise 1.6.1)} \\
&= \frac{2\sqrt{2}}{6n^2}\,(n+1)\bigl(2(2n+1) - 3\bigr) \quad\text{(by factoring)} \\
&= \frac{\sqrt{2}}{3n^2}\,(n+1)(4n-1).
\end{aligned}
\]
Thus $U(f) \le \inf\{U(f, Q_n) : n\} = \inf\left\{\frac{\sqrt{2}}{3n^2}(n+1)(4n-1) : n\right\} = \frac{4\sqrt{2}}{3}$. Similarly, $L(f) \ge
\frac{4\sqrt{2}}{3}$, so that $\int_0^2 \sqrt{x}\,dx = \frac{4\sqrt{2}}{3}$.
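The closed form for the upper sums over the unequal partitions $Q_n$ can be checked numerically. The sketch below is only a sanity check in floating-point arithmetic; the names `upper_sum_Q` and `closed_form` are introduced here for illustration.

```python
from math import sqrt


def upper_sum_Q(n):
    """U(f, Q_n) for f(x) = sqrt(x) on [0, 2], with partition points x_k = 2k^2/n^2."""
    points = [2 * k**2 / n**2 for k in range(n + 1)]
    # sqrt is increasing, so the supremum on [x_{k-1}, x_k] is sqrt(x_k).
    return sum(sqrt(points[k]) * (points[k] - points[k - 1]) for k in range(1, n + 1))


def closed_form(n):
    """The formula sqrt(2) (n+1)(4n-1) / (3 n^2) derived in Example 7.2.3."""
    return sqrt(2) * (n + 1) * (4 * n - 1) / (3 * n**2)


for n in (1, 10, 100, 1000):
    print(n, upper_sum_Q(n), closed_form(n))
print(4 * sqrt(2) / 3)
```

The two columns agree up to rounding, and both decrease toward $4\sqrt{2}/3$ as n grows.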
Proof. Let $\epsilon > 0$. Since $f$ is bounded, there exists a positive real number $M$ such that
for all $x \in [a, b]$, $|f(x)| < M$. Let $e = \frac{1}{3} \min\{s_1 - s_0, s_2 - s_1, \ldots, s_m - s_{m-1}\}$ and $d =
\min\{e, \epsilon/(4M(m+1)(2m+1))\}$. By assumption, for each $k = 1, \ldots, m - 1$, $f$ is integrable
on the interval $[s_{k-1} + d, s_k - d]$. Thus by Theorem 7.1.13, there exists a partition $P_k$ of
$[s_{k-1} + d, s_k - d]$ such that $U(f, P_k) - L(f, P_k) < \epsilon/(2m+1)$. Now let $P = \{a\} \cup P_1 \cup P_2 \cup
\cdots \cup P_{m-1} \cup \{b\}$. Then $P$ is a partition of $[a, b]$, and
\[
\begin{aligned}
U(f, P) - L(f, P)
&= \sum_{k=1}^{m} \bigl(U(f, P \cap [s_{k-1} + d, s_k - d]) - L(f, P \cap [s_{k-1} + d, s_k - d])\bigr) \\
&\quad + \bigl(U(f, P \cap [a, a + d]) - L(f, P \cap [a, a + d])\bigr) \\
&\quad + \sum_{k=1}^{m-1} \bigl(U(f, P \cap [s_k - d, s_k + d]) - L(f, P \cap [s_k - d, s_k + d])\bigr) \\
&\quad + \bigl(U(f, P \cap [b - d, b]) - L(f, P \cap [b - d, b])\bigr) \\
&\le \sum_{k=1}^{m} \bigl(U(f, P_k) - L(f, P_k)\bigr) + 2M \cdot 2d + \sum_{k=1}^{m-1} 2M \cdot 2d + 2M \cdot 2d \\
&< \sum_{k=1}^{m} \frac{\epsilon}{2m+1} + 4M(m+1)d \\
&< \epsilon.
\end{aligned}
\]
Thus by Theorem 7.3.1, $f$ is integrable. Furthermore, $\int_a^b f - \sum_{k=1}^{m} \int_{s_{k-1}}^{s_k} f$ is bounded
above by $U(f, P) - \sum_{k=1}^{m} L(f, P_k)$ and below by $L(f, P) - \sum_{k=1}^{m} U(f, P_k)$, and the upper
and lower bounds are within $2\epsilon$ of 0, so that $\int_a^b f - \sum_{k=1}^{m} \int_{s_{k-1}}^{s_k} f$ is within $2\epsilon$ of 0. Since
$\epsilon$ is arbitrary, it follows by Theorem 2.11.4 that $\int_a^b f = \sum_{k=1}^{m} \int_{s_{k-1}}^{s_k} f$.
Notation 7.3.3. What could possibly be the meaning of $\int_b^a f$ if $a < b$? In our definition
of integrals, all partitions started from a smaller $a$ to a larger $b$ to get $\int_a^b f$. If we did
reverse $b$ and $a$, then the widths of the sub-intervals in each partition would be negative
($x_k - x_{k-1} = -(x_{k-1} - x_k)$), so that all the partial sums and both the lower and the upper
integrals would get the negative value. Thus it seems reasonable to declare
\[
\int_b^a f = -\int_a^b f.
\]
In fact, this is exactly what makes Theorem 7.3.2 work without any order assumptions
on the $s_k$. For example, if $a < c < b$, by Theorem 7.3.2, $\int_a^b f = \int_a^c f + \int_c^b f$, whence
$\int_a^c f = \int_a^b f - \int_c^b f = \int_a^b f + \int_b^c f$.
Theorem 7.3.4. Suppose that $f$ and $g$ are integrable over $[a, b]$, and that $c \in \mathbb{R}$. Then
$f + cg$ is integrable over $[a, b]$ and $\int_a^b (f + cg) = \int_a^b f + c \int_a^b g$.
Here is a picture that illustrates this theorem: the values of g are at each point in
the domain greater than or equal to the values of f , and the area under the graph of g is
larger than the area under the graph of f :
[Figure: the graphs of y = g(x) and y = f(x) over the interval from a to b, with the graph of g lying above the graph of f.]
Proof. By assumption on every sub-interval $I$ of $[a, b]$, $\inf\{f(x) : x \in I\} \le \inf\{g(x) : x \in I\}$.
Thus for all partitions $P$ of $[a, b]$, $L(f, P) \le L(g, P)$. Hence $L(f) \le L(g)$, and since $f$ and
$g$ are integrable, this says that $\int_a^b f \le \int_a^b g$.
Despite first appearances, it turns out that integration and differentiation are related.
For this we have two versions of the Fundamental theorem of calculus.
Proof. Let $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of $[a, b]$. Since $g$ is differentiable on $[a, b]$, it
is continuous on each $[x_{k-1}, x_k]$ and differentiable on each $(x_{k-1}, x_k)$. Thus by the Mean
value theorem (Theorem 6.3.4), there exists $c_k \in (x_{k-1}, x_k)$ such that $f(c_k) = g'(c_k) =
\frac{g(x_k) - g(x_{k-1})}{x_k - x_{k-1}}$. By the definition of lower and upper sums,
\[
L(f, P) \le \sum_{k=1}^{n} f(c_k)(x_k - x_{k-1}) \le U(f, P).
\]
But
\[
\begin{aligned}
\sum_{k=1}^{n} f(c_k)(x_k - x_{k-1}) &= \sum_{k=1}^{n} \frac{g(x_k) - g(x_{k-1})}{x_k - x_{k-1}}\,(x_k - x_{k-1}) \\
&= \sum_{k=1}^{n} \bigl(g(x_k) - g(x_{k-1})\bigr) \\
&= g(x_n) - g(x_0) \\
&= g(b) - g(a),
\end{aligned}
\]
so that $L(f, P) \le g(b) - g(a) \le U(f, P)$, whence
\[
\begin{aligned}
L(f) &= \sup\{L(f, P) : P \text{ a partition of } [a, b]\} \\
&\le g(b) - g(a) \\
&\le \inf\{U(f, P) : P \text{ a partition of } [a, b]\} \\
&= U(f).
\end{aligned}
\]
Since $f$ is integrable over $[a, b]$, by definition $L(f) = U(f)$, and so all inequalities above
have to be equalities, so that necessarily $\int_a^b f = g(b) - g(a)$.
The general notation for applying Theorem 7.4.1 is as follows: if $g' = f$, then
\[
\int_a^b f = g(x)\Big|_a^b = g(b) - g(a).
\]
For example, since $\frac{1 + 2x - x^2}{(1 + x^2)^2}$ is the derivative of $\frac{x - 1}{1 + x^2}$, it follows that
\[
\int_0^1 \frac{1 + 2x - x^2}{(1 + x^2)^2}\,dx = \frac{x - 1}{1 + x^2}\bigg|_0^1 = \frac{1 - 1}{1 + 1^2} - \frac{0 - 1}{1 + 0^2} = 1.
\]
If we instead had to compute this integral with upper and lower sums, it would take us a
lot longer and a lot more effort to come up with the answer.
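To get a feel for the comparison, the value 1 obtained from the antiderivative can be set against a direct numerical approximation. The sketch below is an illustration only: it uses midpoint sums (one convenient choice of tags $c_k$ in each sub-interval), and the function name `midpoint_sum` is introduced here.

```python
def f(x):
    """The integrand (1 + 2x - x^2) / (1 + x^2)^2."""
    return (1 + 2 * x - x**2) / (1 + x**2) ** 2


def g(x):
    """An antiderivative of f, namely (x - 1) / (1 + x^2)."""
    return (x - 1) / (1 + x**2)


def midpoint_sum(func, a, b, n):
    """Sum of func at the midpoints of n equal sub-intervals of [a, b], times the width."""
    width = (b - a) / n
    return sum(func(a + (k + 0.5) * width) for k in range(n)) * width


print(g(1) - g(0))  # value from the Fundamental theorem of calculus
for n in (10, 100, 1000):
    print(n, midpoint_sum(f, 0.0, 1.0, n))
```

The antiderivative needs two function evaluations, while the sums need n evaluations and still give only an approximation.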
In general, upper and lower sums and integrals are time-consuming and we want to
avoid them if possible. The fundamental theorem of calculus that we just proved enables
us to do that for many functions: to integrate f over [a, b] one needs to find g with g 0 = f .
Such $g$ is called an antiderivative of $f$. For example, if $r$ is a rational number different
from $-1$, then by the power rule (Theorem 6.2.10), an antiderivative of $x^r$ is $\frac{x^{r+1}}{r+1}$. By the
scalar rule for derivatives, for any constant $C$, $\frac{x^{r+1}}{r+1} + C$ is also an antiderivative. It does
not matter which antiderivative we choose to compute the integral:
\[
\int_a^b x^r\,dx = \left(\frac{b^{r+1}}{r+1} + C\right) - \left(\frac{a^{r+1}}{r+1} + C\right) = \frac{b^{r+1}}{r+1} - \frac{a^{r+1}}{r+1},
\]
so that the choice of the antiderivative is irrelevant.
and so on. (Study the differences and similarities of the last three.)
So far we have seen $2^x$ for rational exponents $x$. Exercise 5.4.7 also allows real
exponents, and proves that this function of $x$ is continuous. Thus by Theorem 7.3.1 this
function is integrable. We do not yet know $\int 2^x\,dx$, but in Theorem 7.6.5 we will see that
$\int 2^x\,dx = \frac{1}{\ln 2}\,2^x + C$. For $\int 2^{(x^2)}\,dx$ instead, you and I do not know an antiderivative,
we will not know one by the end of the course, and there actually is no “closed-form”
antiderivative. This fact is due to a theory of Joseph Liouville (1809–1882). What is the
meaning of “closed-form”? Here is an oblique answer: Exercise 10.1.5 claims that there
exists an infinite power series (sum of infinitely many terms) that is an antiderivative
of $2^{(x^2)}$. Precisely because of this infinite sum nature, the values of any antiderivative
of $2^{(x^2)}$ cannot be computed precisely, only approximately. Furthermore, according to
Liouville’s theory, that infinite sum cannot be expressed in terms of the more familiar
standard functions, and neither can any other expression for an antiderivative. It is in this
sense that we say that $2^{(x^2)}$ does not have a “closed-form” antiderivative.
(It is a fact that in the ocean of all functions, those for which there is a “closed-form”
Proof. Since $f$ is continuous over $[a, b]$, it is continuous over $[a, x]$, so that by Theorem 7.3.1,
$f$ is integrable over $[a, x]$. Thus $g$ is a well-defined function. Let $c \in (a, b)$. We will prove
that $g$ is differentiable at $c$.
Let $\epsilon > 0$. By continuity of $f$ at $c$, there exists $\delta > 0$ such that for all $x \in [a, b]$, if
$|x - c| < \delta$ then $|f(x) - f(c)| < \epsilon$. Thus on $[c - \delta, c + \delta] \cap [a, b]$, $f(c) - \epsilon < f(x) < f(c) + \epsilon$,
so that by Theorem 7.3.5,
\[
\int_{\min\{x,c\}}^{\max\{x,c\}} (f(c) - \epsilon) \le \int_{\min\{x,c\}}^{\max\{x,c\}} f \le \int_{\min\{x,c\}}^{\max\{x,c\}} (f(c) + \epsilon).
\]
Thus
\[
|x - c|\,(f(c) - \epsilon) \le \int_{\min\{x,c\}}^{\max\{x,c\}} f \le |x - c|\,(f(c) + \epsilon).
\]
\[
\begin{aligned}
&= \left|\frac{\int_c^x f}{x - c} - f(c)\right| \quad\text{(by Theorem 7.3.2 and Notation 7.3.3)} \\
&< \epsilon.
\end{aligned}
\]
Thus $\lim_{x \to c} \frac{g(x) - g(c)}{x - c}$ exists and equals $f(c)$, i.e., $g'(c) = f(c)$.
It is probably a good idea to review the notation again. The integral $\int_a^x f$ can also
be written as
\[
\int_a^x f = \int_a^x f(t)\,dt = \int_a^x f(z)\,dz.
\]
This is a function of $x$ because $x$ appears in the bound of the domain of integration. Note
similarly that $\int_a^x f(t)\,dz$ is a function of $t$ and $x$ but not of $z$. Thus by the Fundamental
theorem of calculus, II,
\[
\frac{d}{dx} \int_a^x f = f(x), \quad\text{and}\quad \frac{d}{dx} \int_a^x f(t)\,dz = f(t).
\]
Do not write “$\int_a^x f(x)\,dx$”: this is trying to say that $x$ varies from $a$ to $x$, so
one occurrence of the letter $x$ is constant and the other occurrence varies from $a$ to that
constant, which mixes up the symbols too much.
\[
\begin{aligned}
&= \frac{d}{dx}\bigl(g(h(x)) - g(k(x))\bigr) \\
&= g'(h(x))\,h'(x) - g'(k(x))\,k'(x) \quad\text{(by the chain rule)} \\
&= f(h(x))\,h'(x) - f(k(x))\,k'(x).
\end{aligned}
\]
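Since $g' = f$, the difference $g(h(x)) - g(k(x))$ equals $\int_{k(x)}^{h(x)} f$ by the first version of the Fundamental theorem of calculus, so the computation above says that the derivative of this integral is $f(h(x))h'(x) - f(k(x))k'(x)$. The sketch below tests this numerically on a hypothetical choice $f(t) = t^2$, $h(x) = 3x$, $k(x) = x^2$; these functions, the midpoint sums, and the central difference quotient are all assumptions made only for this illustration.

```python
def f(t):
    """Hypothetical integrand chosen for the illustration."""
    return t**2


def h(x):
    """Hypothetical upper bound of integration, with h'(x) = 3."""
    return 3 * x


def k(x):
    """Hypothetical lower bound of integration, with k'(x) = 2x."""
    return x**2


def midpoint_sum(func, a, b, n=2000):
    """Approximate the integral of func from a to b by midpoint sums."""
    width = (b - a) / n
    return sum(func(a + (j + 0.5) * width) for j in range(n)) * width


def G(x):
    """Approximate value of the integral of f from k(x) to h(x)."""
    return midpoint_sum(f, k(x), h(x))


def numerical_derivative(func, x, step=1e-4):
    """Central difference quotient approximating func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)


def formula(x):
    """f(h(x)) h'(x) - f(k(x)) k'(x) for the hypothetical f, h, k above."""
    return f(h(x)) * 3 - f(k(x)) * 2 * x


for x in (0.5, 1.0, 1.5):
    print(x, numerical_derivative(G, x), formula(x))
```

The two printed columns agree to several digits, which is all a numerical check of this kind can show.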
7.4.1. Compute the integrals below. You may want to use clever guessing and rewriting.
i) $\int_0^1 \sqrt{x}\,dx =$
ii) $\int_0^1 16x(x^2 + 4)^7\,dx =$
iii) $\int_0^1 (3\sqrt{x} + 4)(x^{3/2} + 2x + 3)^7\,dx =$
7.4.2. Compute the integrals below, assuming that $t$ and $x$ do not depend on each other.
i) $\int_0^1 x^3\,dx =$
ii) $\int_0^1 x^3\,dt =$
iii) $\int_0^x x^3\,dt =$
7.4.3. Below $t$ and $x$ do not depend on each other. Compute the following derivatives,
possibly using Theorem 7.4.3.
i) $\frac{d}{dx} \int_0^x t^3\,dt =$
ii) $\frac{d}{dx} \int_0^x x^3\,dt =$
iii) $\frac{d}{dx} \int_0^t x^3\,dx =$
iv) $\frac{d}{dx} \int_0^t t^3\,dx =$
7.4.4. Suppose that f : [a, b] → R+ is continuous and that f (c) > 0 for some c ∈ [a, b].
Prove that $\int_a^b f > 0$.
7.4.5. (Integration by substitution) By the chain rule for differentiation, $(f \circ g)'(x) =
f'(g(x))\,g'(x)$.
i) Prove that $\int_a^b f'(g(x))\,g'(x)\,dx = f(g(b)) - f(g(a))$.
ii) Prove that $\int f'(g(x))\,g'(x)\,dx = f(g(x)) + C$.
iii) Compute the following integrals applying this rule explicitly stating f, g:
$\int_2^3 (2x - 4)^{10}\,dx =$
$\int_1^3 \frac{4x + 3}{\sqrt[9]{2x^2 + 3x}}\,dx =$
$\int_1^3 (8x + 6)\sqrt[3]{2x^2 + 3x}\,dx =$
7.4.6. (Integration by parts) By the product rule for differentiation, $(f \cdot g)'(x) =
f'(x)g(x) + f(x)g'(x)$.
i) Prove that $\int_a^b f'(x)g(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f(x)g'(x)\,dx$.
ii) Prove that $\int f'(x)g(x)\,dx = f(x)g(x) - \int f(x)g'(x)\,dx$.
iii) Compute the following integrals applying this rule explicitly stating f, g:
$\int_{-1}^{1} (4x + 3)(5x + 1)^{10}\,dx =$
$\int_{-1}^{1} \frac{4x + 3}{\sqrt{2x + 4}}\,dx =$
7.4.7. Compute the following derivatives. (Hint: the Fundamental theorem of calculus
and the chain rule.)
i) $\frac{d}{dx} \int_2^{3x} \sqrt{t^4 + 5\sqrt{t}}\,dt$.
ii) $\frac{d}{dx} \int_{-x^2}^{\sqrt{x}} \frac{t + 5\sqrt{t}}{t^{100} - 2t^{50} + t^7 - 2}\,dt$.
iii) $\frac{d}{dx} \int_{h(x)}^{-h(x)} f(t)\,dt$.
7.4.8. (Mean value theorem for integrals) Let $f : [a, b] \to \mathbb{R}$ be continuous. Prove
that there exists $c \in (a, b)$ such that
\[
f(c) = \frac{1}{b - a} \int_a^b f.
\]
ii) Integrate the integral above by parts, and rewrite, to get that
\[
f(x) = f(a) + (x - a)f'(a) + \int_a^x (x - t)f''(t)\,dt.
\]
iii) Use induction, integration by parts, and rewritings, to get that $f(x)$ equals
\[
f(a) + \frac{f'(a)}{1!}(x - a) + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + \int_a^x \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt.
\]
iv) Say why you cannot apply the Fundamental theorem of calculus II, to compute
$\frac{d}{dx} \int_a^x (x - t)^{n+1} f^{(n+1)}(t)\,dt$.
v) (Taylor’ remainder formula in integral form) Consult Section 6.5 for Taylor poly-
Definition 7.5.1. Let $f : [a, b] \to \mathbb{C}$ be a function such that $\operatorname{Re} f$ and $\operatorname{Im} f$ are integrable
over $[a, b]$. The integral of $f$ over $[a, b]$ is
\[
\int_a^b f = \int_a^b \operatorname{Re} f + i \int_a^b \operatorname{Im} f.
\]
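The definition reduces a complex-valued integral to two real-valued ones. The sketch below illustrates this on a hypothetical function $f(t) = t + i t^2$ on [0, 1], whose real and imaginary parts integrate to 1/2 and 1/3; the function names and the use of midpoint sums as the approximation are choices made here for illustration.

```python
def midpoint_sum(func, a, b, n):
    """Approximate the integral of a real-valued func over [a, b] by midpoint sums."""
    width = (b - a) / n
    return sum(func(a + (k + 0.5) * width) for k in range(n)) * width


def complex_integral(f, a, b, n=1000):
    """Integrate the real and imaginary parts separately, as in Definition 7.5.1."""
    real_part = midpoint_sum(lambda t: f(t).real, a, b, n)
    imag_part = midpoint_sum(lambda t: f(t).imag, a, b, n)
    return complex(real_part, imag_part)


def f(t):
    """Hypothetical example: f(t) = t + i t^2."""
    return complex(t, t**2)


print(complex_integral(f, 0.0, 1.0))
```

The printed value is close to 1/2 + i/3, the result of integrating t and t^2 over [0, 1] separately.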
The following are then immediate generalizations of the two versions of the fundamental
theorem of calculus (Theorems 7.4.1 and 7.4.3):
The function that takes a non-zero $x$ to $1/x$ is continuous everywhere on its domain
since it is a rational function. Thus by Theorem 7.3.1 and Notation 7.3.3, for all $x > 0$,
$\int_1^x \frac{1}{x}\,dx$ is well-defined. This function has a familiar name:
Remark 7.6.2.
(1) $\ln 1 = \int_1^1 \frac{1}{t}\,dt = 0$.
(2) By geometry, for $x > 1$, $\ln x = \int_1^x \frac{1}{t}\,dt > 0$, and for $x \in (0, 1)$,
$\ln x = \int_1^x \frac{1}{t}\,dt = -\int_x^1 \frac{1}{t}\,dt < 0$.
(3) By the Fundamental theorem of calculus (Theorem 7.4.3), for all $b \in \mathbb{R}^+$, $\ln$ is
differentiable on $(0, b)$, so that $\ln$ is differentiable on $\mathbb{R}^+$. Furthermore, $\ln'(x) = \frac{1}{x}$.
(4) $\ln$ is continuous (since it is differentiable) on $\mathbb{R}^+$.
(5) The derivative of $\ln$ is always positive. Thus by Theorem 6.3.5, $\ln$ is everywhere
increasing.
(6) Let $c \in \mathbb{R}^+$, and set $g(x) = \ln(cx)$. By the chain rule, $g$ is differentiable, and
$g'(x) = \frac{1}{cx}\,c = \frac{1}{x} = \ln'(x)$. Thus the function $g - \ln$ has constant derivative 0. It
Section 7.6: Natural logarithm and the exponential functions 233
$= \ln^{-1}(x - y)$.
We have proved that for all $c \in \mathbb{R}^+$ and $r \in \mathbb{Q}$, $\ln^{-1}(r \ln(c)) = \ln^{-1}(\ln(c^r)) = c^r$,
and we have proved that for all $r \in \mathbb{R}$, $\ln^{-1}(r \ln(c))$ is well-defined. This allows us to define
exponentiation with real (not just rational) exponents:
Proof. By definition, $f(x) = \ln^{-1}(r \ln(x))$, which is differentiable by the chain and scalar
rules and the fact that $\ln$ and its inverse are differentiable. Furthermore, the derivative is
$$f'(x) = \ln^{-1}(r \ln(x)) \cdot \frac{r}{x} = r\,\frac{\ln^{-1}(r \ln(x))}{\ln^{-1}(\ln(x))} = r \ln^{-1}(r \ln(x) - \ln(x)) = r \ln^{-1}((r - 1) \ln(x)) = r x^{r-1}.$$
The monotone properties then follow from Theorem 6.3.5.
Proof. By definition, $f(x) = \ln^{-1}(x \ln(c))$, which is differentiable by the chain and scalar
rules and the fact that $\ln^{-1}$ is differentiable. Furthermore, the derivative is
$f'(x) = \ln^{-1}(x \ln(c)) \cdot \ln(c) = f(x) \cdot \ln(c) = (\ln(c))\,c^x$. The monotone properties then follow from
Theorem 6.3.5.
We next give a more concrete form to the exponential function $\ln^{-1}$.
Definition 7.6.6. Let $e = \ln^{-1}(1)$ (so that $\ln(e) = 1$). The constant $e$ is called Euler's
constant.
Since $\ln(e) = 1 > 0 = \ln(1)$, by the increasing property of $\ln$ it follows that $e > 1$.
Now let $f(x) = 1/x$. This function is non-negative on $[1, \infty)$. If $P$ is a partition
of $[1, 3]$ into 5 equal parts, then $L(f, P) \cong 0.976934$, if $P$ is a partition of $[1, 3]$ into 6
equal parts, then $L(f, P) \cong 0.995635$, and if $P$ is a partition of $[1, 3]$ into 7 equal parts,
then $L(f, P) \cong 1.00937$. This proves that $L(f) > 1$ over the interval $[1, 3]$. On $[1, 3]$, the
function $f$ is continuous and thus integrable, so that $\ln 3 = \int_1^3 f > 1 = \ln e$. Since $\ln$ is an
increasing function, this means that $e < 3$. By geometry $\ln(2) < U(f, \{1, 2\}) = 1 = \ln e$,
so that similarly $e > 2$. We conclude that $e$ is a number strictly between 2 and 3.
Note that
$$U(f, \{1, 1.25, 1.5, 1.75, 2, 2.25, 2.5\}) = 0.25\left(1 + \frac{1}{1.25} + \frac{1}{1.5} + \frac{1}{1.75} + \frac{1}{2} + \frac{1}{2.25}\right) = \frac{2509}{2520} < 1,$$
so that $\ln 2.5 = \int_1^{2.5} f$ is strictly smaller than this upper sum. It follows that
$\ln 2.5 < 1 = \ln e$ and $2.5 < e$. If $P$ is a partition of $[1, 2.71828]$ into a million pieces of equal
length, a computer gives that $U(f, P)$ is just barely smaller than 1, so that $2.71828 < e$.
If $P$ is a partition of $[1, 2.718285]$ into a million pieces of equal length, then $L(f, P)$ is just
barely bigger than 1, so that $e < 2.718285$. Thus $e \cong 2.71828$.
A reader may want to run further computer calculations for greater precision. A different
and perhaps easier computation is in Exercise 7.6.14.
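The computer calculations mentioned above can be reproduced with a short script. The following is only a sketch and not part of the original notes: the helper name `riemann_sums` is an assumption made for illustration, and it relies on $f(x) = 1/x$ being decreasing on $[1, \infty)$, so that on each sub-interval the supremum is at the left endpoint and the infimum is at the right endpoint.

```python
import math

def riemann_sums(a, b, n):
    """Return (lower, upper) sums of f(x) = 1/x over [a, b] cut into n equal parts.

    Hypothetical helper for illustration; assumes 0 < a < b. Because 1/x is
    decreasing there, the lower sum uses right endpoints and the upper sum
    uses left endpoints.
    """
    h = (b - a) / n
    points = [a + k * h for k in range(n + 1)]
    lower = h * math.fsum(1.0 / x for x in points[1:])
    upper = h * math.fsum(1.0 / x for x in points[:-1])
    return lower, upper

# Lower sums on [1, 3] with 5, 6 and 7 equal parts.
for parts in (5, 6, 7):
    print(parts, round(riemann_sums(1, 3, parts)[0], 6))

# One million parts: upper sum on [1, 2.71828], lower sum on [1, 2.718285].
print(riemann_sums(1, 2.71828, 10**6)[1] < 1)
print(riemann_sums(1, 2.718285, 10**6)[0] > 1)
```

Floating-point rounding is ignored in this sketch; `math.fsum` is used only to keep the accumulated error well below the margins involved.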
We have already proved on page 233 that the derivative of $\ln^{-1}$ is $\ln^{-1}$:
7.6.5. Let $c \in \mathbb{R}^+$. Use integration by parts (Exercise 7.4.6) to compute $\int x c^x\,dx$.
7.6.6. Use integration by parts (Exercise 7.4.6) to compute $\int \ln(x)\,dx$.
7.6.7. Prove by integration by parts the following improper integral value for a non-negative
integer $n$:
$$\int_0^1 (-x^{2n} \ln x)\,dx = \frac{1}{(2n + 1)^2}.$$
Section 7.7: Applications of integration 237
7.6.16. Let $f : \mathbb{R} \to \mathbb{R}$ be given by
$$f(x) = \begin{cases} e^{-1/x^2}, & \text{if } x \ne 0; \\ 0, & \text{if } x = 0. \end{cases}$$
i) Prove by induction on $n \ge 0$ that for each $n$ there exists a polynomial function
$h_n : \mathbb{R} \to \mathbb{R}$ such that
$$f^{(n)}(x) = \begin{cases} h_n\!\left(\frac{1}{x}\right) \cdot e^{-1/x^2}, & \text{if } x \ne 0; \\ 0, & \text{if } x = 0. \end{cases}$$
(At non-zero $x$ you can use the chain rule, the derivative of the exponential function,
and the power rule for derivatives. However, $f^{(n+1)}(0)$ is (but of course) computed
as $\lim_{h \to 0} \frac{f^{(n)}(h) - f^{(n)}(0)}{h}$, and then you have to use L'Hôpital's rule. You do not
have to be explicit about the polynomial functions $h_n$.)
ii) Compute the $n$th Taylor polynomial for $f$ centered at 0.
7.6.17. (A friendly competition between $e$ and $\pi$) Use calculus and not a calculator to
determine which number is bigger, $e^\pi$ or $\pi^e$. You may assume that $1 < e < \pi$. (Hint:
compute the derivative of some function.)
Whether this is an approximation of the true length depends on the partition, but geometrically
it makes sense that the true length of the curve equals
$$\lim \sum_{k=1}^n \sqrt{(x_k - x_{k-1})^2 + (f(x_k) - f(x_{k-1}))^2},$$
as the partitions $\{x_0, x_1, \ldots, x_n\}$ get finer and finer. But this is not yet in the form of
Theorem 7.1.16. For that we need to furthermore assume that $f$ is differentiable on $(a, b)$.
Then by the Mean value theorem (Theorem 6.3.4) for each $k = 1, \ldots, n$ there exists
$c_k \in (x_{k-1}, x_k)$ such that $f(x_k) - f(x_{k-1}) = f'(c_k)(x_k - x_{k-1})$. If in addition we assume
that $f'$ is continuous, then it is integrable by Theorem 7.3.1, and by Theorem 7.1.16
We just proved:
Theorem 7.7.2. If f : [a, b] → R is continuous, then the volume of the solid of revolution
obtained by rotating around the x-axis the region between x = a and x = b and bounded
by the x-axis and the graph of f is
$$\pi \int_a^b (f(x))^2\,dx.$$
Theorem 7.7.3. If 0 ≤ a ≤ b and f, g : [a, b] → R are continuous, then the volume of the
solid of revolution obtained by rotating around the x-axis the region between y = a and
y = b and bounded by the graphs of x = f (y) and x = g(y), is
$$2\pi \int_a^b y\,(g(y) - f(y))\,dy.$$
Proof. We rotate the upper-half circle of radius $r$ centered at the origin around the $x$-axis.
The circle of radius $r$ centered at the origin consists of all points $(x, y)$ such that
$x^2 + y^2 = r^2$, so we have $g(y) = \sqrt{r^2 - y^2}$ and $f(y) = -\sqrt{r^2 - y^2}$. Thus the volume is
$$4\pi \int_0^r y \sqrt{r^2 - y^2}\,dy = -\frac{4}{3}\pi (r^2 - y^2)^{3/2} \Big|_0^r = \frac{4}{3}\pi (r^2)^{3/2} = \frac{4}{3}\pi r^3.$$
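As a numerical sanity check of this computation, the sketch below approximates $2\pi \int_0^r y(g(y) - f(y))\,dy$ by a midpoint sum and compares it with $\frac{4}{3}\pi r^3$. The function name `shell_volume`, the use of midpoint sums, and the sample radius are assumptions made for illustration; they are not taken from the notes.

```python
import math

def shell_volume(f, g, a, b, n=100_000):
    """Midpoint-sum approximation of 2*pi * integral from a to b of y*(g(y) - f(y)) dy."""
    h = (b - a) / n
    midpoints = (a + (k + 0.5) * h for k in range(n))
    total = math.fsum(y * (g(y) - f(y)) for y in midpoints)
    return 2 * math.pi * h * total

r = 2.0  # sample radius, chosen arbitrarily

def g(y):
    return math.sqrt(r * r - y * y)    # right half of the circle: x = g(y)

def f(y):
    return -math.sqrt(r * r - y * y)   # left half of the circle: x = f(y)

approx = shell_volume(f, g, 0.0, r)
exact = 4.0 / 3.0 * math.pi * r**3
print(approx, exact)
```

Agreement of the two printed numbers to several digits is only evidence, not a proof; the proof is the antiderivative computation above.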
In this way we obtain a right circular cone of height $b$ and base radius $|m|b$. The
perimeter of that base circle is of course $2\pi|m|b$. If we cut the cone in a straight line from
a side to the vertex, we cut along an edge of length $\sqrt{b^2 + (mb)^2}$, and we get the wedge as
follows:
[Figure: the wedge, with perimeter $2\pi|m|b$ and radius $\sqrt{b^2 + (mb)^2}$.]
Without the clip in the disc, the perimeter would be $2\pi\sqrt{b^2 + (mb)^2}$, but our perimeter
is only $2\pi|m|b$. Thus the angle subtended by the wedge is by proportionality equal to
$\frac{2\pi|m|b}{2\pi\sqrt{b^2 + (mb)^2}}\,2\pi = \frac{2\pi|m|}{\sqrt{1 + m^2}}$. The area of the full circle is radius squared times one half of
the full angle, and so proportionally the area of our wedge is radius squared times one half
of our angle, i.e., the surface area of this surface of revolution is
$$\left(\sqrt{b^2 + (mb)^2}\right)^2 \frac{2\pi|m|}{2\sqrt{1 + m^2}} = \pi|m|b^2\sqrt{1 + m^2}.$$
Note that even if $b < 0$, the surface area is the absolute value of $\pi m b^2 \sqrt{1 + m^2}$.
Thus if $m$ is not zero and $0 \le a < b$ or $a < b \le 0$, then the surface area of revolution
obtained by rotating the line $y = mx$ from $x = a$ to $x = b$ equals the absolute value of
$\pi m (b^2 - a^2)\sqrt{1 + m^2}$. Note the geometric requirement that at $a$ and $b$ the line is on the
same side of the $x$-axis.
Now suppose that we rotate the line $y = mx + l$ around the $x$-axis, with $m \ne 0$
and $a < b$ and both on the same side of the intersection of the line with the $x$-axis. This
intersection is at $x = -l/m$. By shifting the graph by $l/m$ to the right, this is the same as
rotating the line $y = mx$ from $x = a + l/m$ to $x = b + l/m$, and by the previous case the
surface area of this is the absolute value of
$$\begin{aligned}
&\pi m\left((b + l/m)^2 - (a + l/m)^2\right)\sqrt{1 + m^2} \\
&= \pi m\left(b^2 + 2bl/m + l^2/m^2 - (a^2 + 2al/m + l^2/m^2)\right)\sqrt{1 + m^2} \\
&= \pi m\left(b^2 - a^2 + 2(b - a)l/m\right)\sqrt{1 + m^2} \\
&= \pi m (b - a)(b + a + 2l/m)\sqrt{1 + m^2} \\
&= \pi (b - a)(m(b + a) + 2l)\sqrt{1 + m^2}.
\end{aligned}$$
If instead we rotate the line $y = l$ (with $m = 0$) around the $x$-axis, we get a ring whose
surface area is $(b - a)2\pi|l|$, which is the absolute value of $\pi(b - a)(m(b + a) + 2l)\sqrt{1 + m^2}$.
Thus for all $m$, the surface area of the surface of revolution obtained by rotating the line
$y = mx + l$ from $x = a$ to $x = b$ around the $x$-axis is the absolute value of
$$\pi (b - a)(m(b + a) + 2l)\sqrt{1 + m^2},$$
with the further restriction in case $m \ne 0$ that $a$ and $b$ are both on the same side of the
$x$-intercept.
Now let $f : [a, b] \to \mathbb{R}_{\ge 0}$ be a differentiable (and not necessarily linear) function. Let
$P = \{x_0, x_1, \ldots, x_n\}$ be a partition of $[a, b]$; on each sub-interval $[x_{k-1}, x_k]$ we approximate
the curve with the line from $(x_{k-1}, f(x_{k-1}))$ to $(x_k, f(x_k))$. By the assumption that $f$
takes on only non-negative values we have that $x_{k-1}$ and $x_k$ are on the same side of
the $x$-intercept of that line. The equation of the line is
$y = \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}(x - x_k) + f(x_k)$,
so that $m = \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}$ and $l = -\frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}\,x_k + f(x_k)$. Since $f$ is differentiable, by
the Mean value theorem (Theorem 6.3.4) there exists $c_k \in (x_{k-1}, x_k)$ such that
$f'(c_k) = \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}$. Thus $m = f'(c_k)$ and $l = f(x_k) - f'(c_k)x_k$.
We rotate that line segment around the $x$-axis, compute the surface area of the resulting
surface of revolution, and add up the surface areas for all the subparts:
$$\begin{aligned}
&\sum_{k=1}^n \pi (x_k - x_{k-1}) \left| f'(c_k)(x_k + x_{k-1}) + 2\left(f(x_k) - f'(c_k)x_k\right) \right| \sqrt{1 + (f'(c_k))^2} \\
&= \sum_{k=1}^n \pi \left| f'(c_k)(-x_k + x_{k-1}) + 2f(x_k) \right| \sqrt{1 + (f'(c_k))^2}\,(x_k - x_{k-1}).
\end{aligned}$$
Theorem 7.7.5. If f : [a, b] → R≥0 is differentiable with continuous derivative, then the
surface area of the solid of revolution obtained by rotating around the x-axis the curve
y = f (x) between x = a and x = b is
$$\pi \int_a^b \left| f'(x)(-x + x) + 2f(x) \right| \sqrt{1 + (f'(x))^2}\,dx = 2\pi \int_a^b f(x)\sqrt{1 + (f'(x))^2}\,dx.$$
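The approximation by rotated line segments that leads to this theorem can be tried numerically. The sketch below is an illustration under stated assumptions, not the notes' own code: the name `frustum_area_sum` is hypothetical, the partition is taken with equal sub-intervals, and the test curve is the arc $f(x) = \sqrt{1 - x^2}$ on $[-\frac12, \frac12]$, for which $f(x)\sqrt{1 + (f'(x))^2} = 1$ and the integral formula gives $2\pi$.

```python
import math

def frustum_area_sum(f, a, b, n):
    """Add up the surface areas obtained by rotating the chords of f around the x-axis.

    On each sub-interval [x0, x1] the chord is y = m*x + l, and its contribution is
    the absolute value of pi * (x1 - x0) * (m*(x1 + x0) + 2*l) * sqrt(1 + m^2).
    Assumes f is non-negative on [a, b].
    """
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x0, x1 = a + k * h, a + (k + 1) * h
        m = (f(x1) - f(x0)) / (x1 - x0)   # slope of the chord
        l = f(x1) - m * x1                # intercept of the chord
        total += math.pi * (x1 - x0) * abs(m * (x1 + x0) + 2 * l) * math.sqrt(1 + m * m)
    return total

def arc(x):
    return math.sqrt(1 - x * x)

approx = frustum_area_sum(arc, -0.5, 0.5, 1000)
exact = 2 * math.pi   # 2*pi * integral of f*sqrt(1 + f'^2) = 2*pi * integral of 1 over [-1/2, 1/2]
print(approx, exact)
```

For a linear function the sum is exact (up to rounding) for every partition, which matches the formula for the rotated line $y = mx + l$.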
Chapter 8: Sequences
In this chapter, Sections 8.5 and 8.4 contain identical results in identical order, but
the proofs are different. You may want to learn both perspectives, or you may choose to
omit one of the two sections.
$$s = \{s_1, s_2, s_3, \ldots\} = \{s_n\}_{n=1}^\infty = \{s_n\}_{n \ge 1} = \{s_n\}_{n \in \mathbb{N}^+} = \{s_n\}_n = \{s_{n+3}\}_{n=-2}^\infty = \{s_{n-4}\}_{n > 4} = \{s_n\}.$$
The $n$th element $s_n = s(n)$ in the ordered list is called the $n$th term of the sequence.
The notation $\{s_1, s_2, s_3, \ldots\}$ usually stands for the set consisting of the elements
$s_1, s_2, s_3, \ldots$, and the order of a listing of elements in a set is irrelevant. Here, however,
$\{s_1, s_2, s_3, \ldots\}$ stands for the sequence, and the order matters. When the usage is not
clear from the context, we add the word "sequence" or "set" as appropriate.
The first term of the sequence $\{2n - 1\}_{n \ge 4}$ is 7, the second term is 9, et cetera. The
point is that even though the notating of a sequence can start with an arbitrary integer,
the counting of the terms always starts with 1.
Note that $s_n$ is the $n$th term of the sequence $s$, whereas $\{s_n\} = \{s_n\}_{n \ge 1}$ is the
sequence in which $n$ plays the role of a dummy variable. Thus
$$s = \{s_n\} \ne s_n.$$
Section 8.1: Introduction to sequences 245
to the second row as we never finish the first row. So we need a cleverer way of
counting, and that is done as follows. We start counting at (1, 1), which stands
for 1/1 = 1. We then proceed through all the other integer points in the positive
quadrant of the plane via diagonals as in Plot 8.1.2. The given instructions would
enumerate positive rational numbers as 1/1, 2/1, 1/2, 1/3, 2/2. Ah, but 2/2 has
already been counted as 1/1, so we do not count 2/2. Thus, the proper counting
of positive rational numbers in this scheme starts with:
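$$1/1,\ 2/1,\ 1/2,\ 1/3,\ \cancel{2/2},\ 3/1,\ 4/1,\ 3/2,\ 2/3,\ 1/4,\ 1/5,\ \cancel{2/4},\ \cancel{3/3},\ \cancel{4/2},\ 5/1,\ 6/1,\ 5/2,\ 4/3,\ 3/4,\ 2/5,\ 1/6,$$
et cetera, where the crossed out numbers are not part of the sequence because they
had been counted earlier. Thus in this count the fifth term is 3.
[Plot: the integer points in the positive quadrant traversed along diagonals, with axis ticks 1 through 6.]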
It is important to note that every positive rational number appears on this list,
and because we are skipping any repetitions, it follows that every positive rational
number appears on this list exactly once. Thus this gives an enumeration of
positive rational numbers. *
A different enumeration of Q+ is given with an algebraic formulation in Exer-
cise 2.4.28.
(13) If {sn } is a sequence of positive rational numbers in which every positive rational
number appears exactly (or at least) once, we can construct from it a sequence in
* Here is a fun exercise: look at the ordered list of positive rational numbers above, including the crossed-out
fractions. Verify for a few of them that $n/m$ is in position $\frac{(n+m-2)^2}{2} + \frac{3n+m-4}{2} + 1$ on the list. Namely, it is
a fact that $f(x, y) = \frac{(x+y-2)^2}{2} + \frac{3x+y-4}{2} + 1$ gives a bijection of $(\mathbb{N}^+)^2$ with $\mathbb{N}^+$. This was first proved by Rudolf
Fueter and George Pólya, but the proof is surprisingly hard, using transcendence of $e^r$ for algebraic numbers $r$,
so do not attempt to prove this without more number theory background.
which every rational number appears exactly (or at least) once as follows:
$$q_n = \begin{cases} 0, & \text{if } n = 1; \\ s_{n/2}, & \text{if } n \text{ is even}; \\ -s_{(n-1)/2}, & \text{if } n \ge 3 \text{ is odd}. \end{cases}$$
This sequence starts with $0, s_1, -s_1, s_2, -s_2, s_3, -s_3, \ldots$. Since every positive rational
number is one of the $s_n$ (exactly once/at least once), so every rational number
is on this new list (exactly once/at least once).
Incidentally, it is impossible to scramble R or C into a sequence. This can be proved
with a so-called Cantor’s diagonal argument, which we are not presenting here,
but an interested reader can consult other sources.
(14) Sequences are functions, and if all terms of the sequence are real numbers, we can
plot sequences in the usual manner for plotting functions. The following is part of
a plot of the sequence {1/n}.
[Plot 8.1.2: $s_n = 1/n$, plotted against $n = 1, 2, \ldots, 15$.]
(15) Another way to plot a sequence is to simply plot and label each sn in the complex
plane or on the real line. We plot three examples below.
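[Three plots: a sequence on the real line with $s_1, s_3, s_5, \ldots$ at $-1$ and $s_2, s_4, s_6, \ldots$ at $1$; a sequence on the real line with $s_3, s_2, s_1$ marked between $0$ and $1$; and a sequence in the complex plane with $s_1, s_2, s_3, s_4$ marked, axis marks at $-1$, $1$ and $i/2$.]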
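The diagonal enumeration of the positive rational numbers described above, and the sequence $q_n$ of item (13) built from it, can be sketched as generators. This is a hedged illustration only: the names `positive_rationals` and `all_rationals` are assumptions, and the direction in which each diagonal is traversed is read off from the list $1/1, 2/1, 1/2, 1/3, 2/2, 3/1, 4/1, \ldots$ given in the text.

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """Yield every positive rational exactly once, one diagonal at a time."""
    seen = set()
    for total in count(2):                  # total = numerator + denominator
        numerators = range(1, total)
        if total % 2 == 1:                  # odd diagonals run from total-1 down to 1
            numerators = reversed(numerators)
        for n in numerators:
            q = Fraction(n, total - n)
            if q not in seen:               # skip repeats such as 2/2 = 1/1
                seen.add(q)
                yield q

def all_rationals():
    """Yield 0, s1, -s1, s2, -s2, ... where s is the enumeration above."""
    yield Fraction(0)
    for q in positive_rationals():
        yield q
        yield -q

print([str(q) for q in islice(positive_rationals(), 11)])
print([str(q) for q in islice(all_rationals(), 7)])
```

Keeping a set of already seen values is the simplest way to skip repetitions; it is a design choice of this sketch, not something prescribed by the text.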
One has to make sure to add/multiply/divide equally numbered terms of the two
Here are a few further examples of arithmetic on sequences, with + and · binary
operations on the set of sequences:
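$$\begin{aligned}
\{2^n\} + \{-2^n\} &= \{0\}, \\
\{2^n\} + \{(-2)^n\} &= \{0, 8, 0, 32, 0, 128, 0, 512, 0, \ldots\}, \\
2\{2^n\} &= \{2^{n+1}\}, \\
\{2^n\} \cdot \{2^{-n}\} &= \{1\}, \\
\{2n - 1\} + \{1\} &= \{2n\}, \\
\{(-1)^n n\} + \{2/n\} &= \{(-1)^n n + 2/n\}, \\
\{i^n\}/\{(-i)^n\} &= \{(-1)^n\}.
\end{aligned}$$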
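Termwise arithmetic of this kind is easy to experiment with. In the sketch below a sequence is modeled as a Python function of $n = 1, 2, 3, \ldots$; the helper names `termwise` and `first_terms` are assumptions introduced for illustration.

```python
from operator import add, mul, truediv

def termwise(op, s, t):
    """Combine two sequences term by term: the nth term is op(s(n), t(n))."""
    return lambda n: op(s(n), t(n))

def first_terms(s, how_many):
    """List the first how_many terms s(1), s(2), ..."""
    return [s(n) for n in range(1, how_many + 1)]

def powers(n):
    return 2 ** n

def alternating(n):
    return (-2) ** n

# {2^n} + {(-2)^n}
print(first_terms(termwise(add, powers, alternating), 9))
# {2^n} * {2^(-n)}
print(first_terms(termwise(mul, powers, lambda n: 2.0 ** (-n)), 5))
# {i^n} / {(-i)^n}
print(first_terms(termwise(truediv, lambda n: 1j ** n, lambda n: (-1j) ** n), 4))
```

Only equally numbered terms are ever combined, which is the point of the definition.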
Section 8.2: Convergence of infinite sequences 249
and, to save vertical space, just like for limits of functions, we also use a variation on the
last three: $\lim_{n \to \infty} s_n = L$, $\lim_{n \to \infty} \{s_n\} = L$, $\lim_{n \to \infty} (s_n) = L$.
We say that a sequence is convergent if it has a limit.
For example, the constant sequence $s = \{c\}$ converges to $L = c$ because for all $n$,
$|s_n - L| = |c - c| = 0$ is strictly smaller than any positive real number $\epsilon$.
The sequence $s = \{300, -5, \pi, 4, 0.5, 10^6, 2, 2, 2, 2, 2, \ldots\}$ converges to $L = 2$ because
for all $n \ge 7$, $|s_n - L| = |2 - 2| = 0$ is strictly smaller than any positive real number $\epsilon$.
In conceptual terms, a sequence $\{s_n\}$ converges if the tail end of the sequence gets
closer and closer to $L$; you can make all $s_n$ with $n > N$ get arbitrarily close to $L$ by simply
increasing $N$ a sufficient amount.
We work out examples of epsilon-N proofs; they are similar to the epsilon-delta
proofs, and we go through them slowly at first. Depending on the point of view of your
class, the reader may wish to skip the rest of this section for an alternative treatment
in Section 8.5 in terms of limits of functions. More epsilon-N proofs are in Section 8.4.
Be aware that this section is more concrete; the next section assumes greater ease with
abstraction.
Example 8.2.2. Consider the sequence $s = \{1/n\}$. Example 8.1.2 gives a hunch that
$\lim s_n = 0$, and now we prove it. [(Recall that any text between square brackets
in this font and in red color is what should approximately be going through
your thoughts, but it is not something to write down in a final solution.)
By the definition of convergence, we have to show that for all $\epsilon > 0$ some
property holds. All proofs of this form start with:] Let $\epsilon$ be an arbitrary
positive number. [Now we have to show that there exists an $N$ for which
some other property holds. Thus we have to construct an $N$. Usually
this is done in retrospect, one cannot simply guess an $N$, but in the final
write-up, readers see simply that educated guess (more about how to guess
educatedly later):] Set $N = \frac{1}{\epsilon}$. Then $N$ is a positive real number. [Now we have to
show that for all integers $n > N$, $|s_n - 0| < \epsilon$. All proofs of statements of
the form "for all integers $n > N$" start with:] Let $n$ be an (arbitrary) integer
with $n > N$. [Finally, we have to prove the inequality $|s_n - 0| < \epsilon$. We do that
by algebraically manipulating the left side until we get the desired final
$< \epsilon$:]
$$\begin{aligned}
|s_n - 0| &= |1/n - 0| \\
&= |1/n| \\
&= 1/n && \text{(because $n$ is positive)} \\
&< 1/N && \text{(because $n > N > 0$)} \\
&= \frac{1}{(1/\epsilon)} && \text{(because $N = 1/\epsilon$) [That was a clever guess!]} \\
&= \epsilon.
\end{aligned}$$
So we conclude that $|s_n - 0| < \epsilon$, which proves that $\lim s_n = 0$.
Just as in the epsilon-delta proofs where one has to find a δ, similarly how does one
divine an N ? In the following two examples we indicate this step-by-step, not as a book
or your final homework solution would have it recorded.
Example 8.2.3. Let $s_n = \{\frac{1}{n}((-1)^n + i(-1)^{n+1})\}$. If we write out the first few terms, we
find that $\{s_n\} = \{-1 + i, 1/2 - i/2, -1/3 + i/3, \ldots\}$, and we may speculate that $\lim s_n = 0$.
Here is a plot of the image set of this sequence in the complex plane:
[Plot: the points $s_1, s_2, s_3, s_4$ in the complex plane, with axis marks at $-1$, $1$ and $i/2$.]
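$$\begin{aligned}
&= \left|\frac{1}{n}\right| \cdot |(-1)^n + i(-1)^{n+1}| && \text{(because $|ab| = |a||b|$)} \\
&= \frac{1}{n} \cdot |(-1)^n + i(-1)^{n+1}| && \text{(because $n$ is positive)} \\
&= \frac{1}{n} \cdot \sqrt{((-1)^n)^2 + ((-1)^{n+1})^2} \\
&= \frac{1}{n} \cdot \sqrt{2} \\
&< \frac{1}{N} \cdot \sqrt{2} && \text{(because $n > N$)}
\end{aligned}$$
[Aside: we want/need $\sqrt{2}/N \le \epsilon$, and $\sqrt{2}/N = \epsilon$ is a possibility, so set $N = \sqrt{2}/\epsilon$.
Now go ahead, write that missing information on $N$ in line 1 of this proof!]
$$\begin{aligned}
&= \frac{\sqrt{2}}{\sqrt{2}/\epsilon} && \text{(because $N = \sqrt{2}/\epsilon$)} \\
&= \epsilon,
\end{aligned}$$
which proves that for all $n > N = \sqrt{2}/\epsilon$, $|s_n - 0| < \epsilon$. Since $\epsilon$ is arbitrary, this proves that
$\lim s_n = 0$.
Thus a polished version of the example just worked out looks like this:
We prove that $\lim \left(\frac{1}{n}((-1)^n + i(-1)^{n+1}) - 0\right) = 0$. Let $\epsilon > 0$. Set $N = \sqrt{2}/\epsilon$. Then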
Example 8.2.4. Claim: $\lim \frac{2n + 3n^2}{3 + 4n + n^2} = 3$. Proof: Let $\epsilon > 0$. Set $N = \underline{\qquad\qquad}$. Let $n$ be
an integer strictly bigger than $N$. Then
$$\begin{aligned}
\left|\frac{2n + 3n^2}{3 + 4n + n^2} - 3\right| &= \left|\frac{2n + 3n^2}{3 + 4n + n^2} - \frac{3(3 + 4n + n^2)}{3 + 4n + n^2}\right| \\
&= \left|\frac{-9 - 10n}{3 + 4n + n^2}\right| \\
&= \frac{9 + 10n}{3 + 4n + n^2} && \text{(because $n > 0$) [Assuming that $N > 0$.]} \\
&\le \frac{n + 10n}{3 + 4n + n^2} && \text{(because $n \ge 9$) [Assuming that $N \ge 8$.]} \\
&= \frac{11n}{3 + 4n + n^2} \\
&\le \frac{11n}{n^2} && \text{(because $3 + 4n + n^2 > n^2$, so $1/(3 + 4n + n^2) < 1/n^2$)} \\
&= \frac{11}{n} \\
&< \frac{11}{N} && \text{(because $n > N$)} \\
&\le \frac{11}{11/\epsilon} && \text{(because $N \ge 11/\epsilon$ so $1/N \le 1/(11/\epsilon)$) [Assuming this.]} \\
&= \epsilon,
\end{aligned}$$
which was desired. Now (on scratch paper) we gather all the information we used about $N$:
$N > 0$, $N \ge 8$, $N \ge 11/\epsilon$, and that is it. Thus on the first line we fill in the blank part:
Set $N = \max\{8, 11/\epsilon\}$, which says that $N$ is either 8 or $11/\epsilon$, whichever is greater, so that
$N \ge 8$ and $N \ge 11/\epsilon$.
The polished version of this proof would go as follows:
Example 8.2.5. Claim: $\lim \frac{2n + 3n^2}{3 + 4n + n^2} = 3$. Proof: Let $\epsilon > 0$. Set $N = \max\{8, 11/\epsilon\}$. Then
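$$\begin{aligned}
&\le \frac{n + 10n}{3 + 4n + n^2} && \text{(because $n \ge 9$)} \\
&= \frac{11n}{3 + 4n + n^2} \\
&\le \frac{11n}{n^2} && \text{(because $3 + 4n + n^2 > n^2$, so $1/(3 + 4n + n^2) < 1/n^2$)} \\
&= \frac{11}{n} \\
&< \frac{11}{N} && \text{(because $n > N$)} \\
&\le \frac{11}{11/\epsilon} && \text{(because $N \ge 11/\epsilon$ so $1/N \le 1/(11/\epsilon)$)} \\
&= \epsilon,
\end{aligned}$$
which proves that for all $n > N$, $|s_n - 3| < \epsilon$. Since $\epsilon$ is arbitrary, this proves that the limit
of this sequence is 3.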
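The choice $N = \max\{8, 11/\epsilon\}$ can be spot-checked numerically. The sketch below is only an illustration (the names `s`, `N_for` and `check` are assumptions): it tests finitely many $n > N$ for a few values of $\epsilon$, which can expose a wrong $N$ but can never replace the proof.

```python
import math

def s(n):
    """The nth term (2n + 3n^2) / (3 + 4n + n^2)."""
    return (2 * n + 3 * n * n) / (3 + 4 * n + n * n)

def N_for(eps):
    """The N from the proof: max{8, 11/eps}."""
    return max(8, 11 / eps)

def check(eps, how_many=1000):
    """Test |s_n - 3| < eps for the first how_many integers n strictly bigger than N."""
    start = math.floor(N_for(eps)) + 1
    return all(abs(s(n) - 3) < eps for n in range(start, start + how_many))

for eps in (5, 1, 0.1, 1e-3, 1e-6):
    print(eps, N_for(eps), check(eps))
```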
Below is a polished proof of a very similar problem.
Example 8.2.6. Claim: $\lim \frac{-2n + 3n^2}{9 - 4n + n^2} = 3$. Proof: Let $\epsilon > 0$. Set $N = \max\{2, 20/\epsilon\}$. Then
$N$ is a positive real number. Let $n$ be an integer strictly bigger than $N$. Then
$$\begin{aligned}
\left|\frac{-2n + 3n^2}{9 - 4n + n^2} - 3\right| &= \left|\frac{-2n + 3n^2}{9 - 4n + n^2} - \frac{3(9 - 4n + n^2)}{9 - 4n + n^2}\right| \\
&= \left|\frac{-27 + 10n}{9 - 4n + n^2}\right| \\
&= \frac{-27 + 10n}{9 - 4n + n^2} && \text{(because $n > N \ge 2$, so $n \ge 3$,} \\
& && \text{so $10n - 27 > 0$ and $n^2 - 4n + 9 = (n - 3)^2 + 2n > 0$)} \\
&< \frac{10n}{9 - 4n + n^2} && \text{(because $10n - 27 < 10n$)} \\
&= \frac{10n}{9 - 4n + (1/2)n^2 + (1 - 1/2)n^2} \\
&< \frac{10n}{(1 - 1/2)n^2} && \text{(because $0 < (n - 4)^2 + 2 = 2(9 - 4n + (1/2)n^2)$)} \\
&= \frac{10}{0.5n} && \text{(because $1 - 1/2 = 0.5$)} \\
&= \frac{20}{n} \\
&< \frac{20}{N} && \text{(because $n > N$)} \\
&\le \frac{20}{20/\epsilon} && \text{(because $N \ge 20/\epsilon$ so $1/N \le 1/(20/\epsilon)$)} \\
&= \epsilon,
\end{aligned}$$
which proves that for all $n > N$, $|s_n - 3| < \epsilon$. Since $\epsilon$ is arbitrary, this proves that the limit
of this sequence is 3.
Proof. If $r = 0$, the sequence $\{r^n\}$ is the constant zero sequence so certainly the limit is 0.
So we may assume without loss of generality that $r \ne 0$. Let $\epsilon > 0$. Set
$N = \frac{\ln(\min\{\epsilon, 0.5\})}{\ln |r|}$. Since $\ln$ of numbers between 0 and 1 is negative, we have that $N$ is a positive
number. Let $n$ be an integer with $n > N$. Then
$$\begin{aligned}
|r^n - 0| &= |r|^n \\
&= |r^{n-N}|\,|r|^N \\
&< 1^{n-N}\,|r|^N && \text{(by Theorem 7.6.5)} \\
&= \min\{\epsilon, 0.5\} \\
&\le \epsilon.
\end{aligned}$$
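The formula for $N$ in this proof can be evaluated directly. The following sketch is an illustration under the assumption $0 < |r| < 1$ (which the proof uses when it says that $\ln |r|$ is negative); the names `N_for` and `check` are hypothetical, and only finitely many $n$ are tested.

```python
import math

def N_for(r, eps):
    """N = ln(min{eps, 0.5}) / ln|r|, assuming 0 < |r| < 1 and eps > 0."""
    return math.log(min(eps, 0.5)) / math.log(abs(r))

def check(r, eps, how_many=200):
    """Test |r^n - 0| < eps for the first how_many integers n strictly bigger than N."""
    start = math.floor(N_for(r, eps)) + 1
    return all(abs(r ** n) < eps for n in range(start, start + how_many))

# Real and complex sample values of r, chosen arbitrarily for the illustration.
for r, eps in ((0.9, 1e-3), (-0.5, 0.3), (0.6 + 0.3j, 1e-6)):
    print(r, eps, N_for(r, eps), check(r, eps))
```

Taking $\min\{\epsilon, 0.5\}$ rather than $\epsilon$ keeps the numerator negative, so that $N$ is positive even when $\epsilon \ge 1$.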
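Example 8.2.8. $\lim \left(1 + \frac{1}{n}\right)^n = e$.
Proof. All the hard work for this has been done already in Exercise 7.6.10. Let $\epsilon > 0$. By
Exercise 7.6.10, $\lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^x = e$. Thus there exists $N > 0$ such that for all $x > N$,
$\left|\left(1 + \frac{1}{x}\right)^x - e\right| < \epsilon$. In particular, for any integer $n > N$, $\left|\left(1 + \frac{1}{n}\right)^n - e\right| < \epsilon$.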
Each of the summands is non-negative, and if we only use the summands with $k = 0$ and
$k = 2$, we then get that
$$n \ge 1 + \frac{1}{2} n(n - 1)(n^{1/n} - 1)^2.$$
By subtracting 1 we get that $n - 1 \ge \frac{1}{2} n(n - 1)(n^{1/n} - 1)^2$, so that for $n \ge 2$, $\frac{2}{n} \ge (n^{1/n} - 1)^2$,
and hence that $\frac{\sqrt{2}}{\sqrt{n}} \ge n^{1/n} - 1$. Certainly $n^{1/n} - 1 \ge 0$ for all $n \ge 1$. It follows that for all
$n \ge 2$, and even for all $n \ge 1$, $0 \le n^{1/n} - 1 \le \frac{\sqrt{2}}{\sqrt{n}}$. Now let $\epsilon > 0$. Set $N = \max\{2, 2/\epsilon^2\}$.
Then $N$ is a positive real number. Let $n > N$ be an integer. Then
$0 \le n^{1/n} - 1 \le \frac{\sqrt{2}}{\sqrt{n}} < \frac{\sqrt{2}}{\sqrt{N}} \le \epsilon$, which proves that $|n^{1/n} - 1| < \epsilon$, and hence proves this limit.
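The inequality $0 \le n^{1/n} - 1 \le \sqrt{2}/\sqrt{n}$ and the resulting choice $N = \max\{2, 2/\epsilon^2\}$ can be spot-checked in floating point. This is a sketch with hypothetical names (`bound_holds`, `N_for`); it tests finitely many $n$ and is not a substitute for the argument above.

```python
import math

def bound_holds(n):
    """Check 0 <= n^(1/n) - 1 <= sqrt(2)/sqrt(n) for one positive integer n."""
    gap = n ** (1.0 / n) - 1.0
    return 0.0 <= gap <= math.sqrt(2.0 / n)

def N_for(eps):
    """The N from the proof: max{2, 2/eps^2}."""
    return max(2.0, 2.0 / eps ** 2)

print(all(bound_holds(n) for n in range(1, 5000)))

eps = 0.01
start = math.floor(N_for(eps)) + 1   # smallest integer strictly bigger than N
print(all(abs(n ** (1.0 / n) - 1.0) < eps for n in range(start, start + 100)))
```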
Proof. First suppose that $M \ge 1$. Then certainly for all integers $n \ge M$, we have that
$1 \le M^{1/n} \le n^{1/n}$. Let $\epsilon > 0$. By the previous example, there exists $N > 0$ such
that for all integers $n > N$, $0 < n^{1/n} - 1 < \epsilon$. Then for all integers $n > \max\{M, N\}$,
$0 \le M^{1/n} - 1 \le n^{1/n} - 1 < \epsilon$, which implies that $|M^{1/n} - 1| < \epsilon$. This proves the example
in case $M \ge 1$.
Now suppose that $M < 1$. By assumption, $1/M > 1$, so by the previous case
(and by using that $(1/M)^{1/n} = 1/M^{1/n}$), there exists $N > 0$ such that for all integers
$n > N$, $0 \le 1/M^{1/n} - 1 < \epsilon$. By adding 1 to all three parts in this inequality we get that
$1 \le 1/M^{1/n} < \epsilon + 1$, so that by compatibility of $<$ with multiplication by positive numbers,
$\frac{1}{\epsilon + 1} < M^{1/n} \le 1$. Hence by compatibility of $<$ with addition,
$$0 \le 1 - M^{1/n} < 1 - \frac{1}{\epsilon + 1} = \frac{\epsilon}{\epsilon + 1} < \epsilon,$$
since $\epsilon + 1 > 1$. Thus $|1 - M^{1/n}| < \epsilon$, which proves that $\lim M^{1/n} = 1$.
8.2.9. Suppose that the terms of a sequence are given by $s_n = \sum_{k=1}^n \frac{1}{k(k+1)}$.
i) Using induction on $n$, prove that $\sum_{k=1}^n \frac{1}{k(k+1)} = 1 - \frac{1}{n+1}$.
ii) Use this to find and prove the limit of $\{s_n\}$.
8.2.10. Prove the following limits:
i) $\lim \frac{\ln(n)}{\ln(n+1)} = 1$. (Hint: epsilon-N proof, continuity of $\ln$.)
ii) $\lim \frac{\ln(\ln(n))}{\ln(\ln(n+1))} = 1$.
8.2.11. Let $r \in \mathbb{R}$. By Theorem 2.10.4, for every positive integer $n$ there exists a rational
number $s_n \in (r - \frac{1}{n}, r + \frac{1}{n})$. Prove that $\{s_n\}$ converges to $r$.
8.2.12. Let $r$ be a real number with a known decimal expansion. Let $s_n$ be the rational
number whose digits $n + 1, n + 2, n + 3$, et cetera, beyond the decimal point are all 0, and
all other digits agree with the digits of $r$. (For example, if $r = \pi$, then $s_1 = 3.1$, $s_2 = 3.14$,
$s_7 = 3.1415926$, et cetera.) Prove that $\lim\{s_n\} = r$. (Repeat with binary expansions if you
know what a binary expansion is.)
8.2.13. Prove that $\lim_n \{\sqrt[n!]{n!}\} = 1$.
8.2.14. What is wrong with the following "proof" that $\lim_{n \to \infty} \frac{n}{2n+1} = \frac{1}{2}$?
"Proof." Let $\epsilon > 0$. Set $N = \frac{\frac{1}{2\epsilon} - 1}{2}$. Let $n > N$. Then
$$\begin{aligned}
\left|\frac{n}{2n + 1} - \frac{1}{2}\right| &= \left|\frac{2n - (2n + 1)}{2(2n + 1)}\right| \\
&= \left|\frac{-1}{2(2n + 1)}\right| \\
&= \frac{1}{2(2n + 1)} \\
&< \frac{1}{2(2N + 1)} && \text{(because all terms are positive)} \\
&= \epsilon.
\end{aligned}$$
The sequence {(−1)n } alternates in value between −1 and 1, and does not seem to
converge to a single number. The following definition addresses this situation.
Section 8.3: Divergence of infinite sequences and infinite limits 257
Definition 8.3.1. A sequence diverges if it does not converge. In other words, $\{s_n\}$
diverges if for all complex numbers $L$, $\lim\{s_n\} \ne L$.
not (For all real numbers $\epsilon > 0$ there exists a positive real number
$N$ such that for all integers $n > N$, $|s_n - L| < \epsilon$.)
= There exists a real number $\epsilon > 0$ such that for all positive real
numbers $N$, not (for all integers $n > N$, $|s_n - L| < \epsilon$).
= There exists a real number $\epsilon > 0$ such that for all positive real
numbers $N$ there exists an integer $n > N$ such that not
($|s_n - L| < \epsilon$).
= There exists a real number $\epsilon > 0$ such that for all positive
real numbers $N$ there exists an integer $n > N$ such that
$|s_n - L| \ge \epsilon$.
Example 8.3.2. $\{(-1)^n\}$ is divergent. Namely, for all complex numbers $L$, $\lim s_n \ne L$.
Proof. Set $\epsilon = 1$ (half the distance between the two values of the sequence). Let $N$ be
an arbitrary positive number. If $\operatorname{Re}(L) > 0$, let $n$ be an odd integer greater than $N$,
and if $\operatorname{Re}(L) \le 0$, let $n$ be an even integer greater than $N$. In either case,
$|s_n - L| \ge |\operatorname{Re}(s_n) - \operatorname{Re}(L)| \ge 1 = \epsilon$.
The sequence in the previous example has no limit, whereas the sequence in the next
example has no finite limit:
Example 8.3.3. For all complex numbers $L$, $\lim\{n\} \ne L$.
Proof. Set $\epsilon = 53$ (any positive number works). Let $N$ be a positive real number. Let $n$
be any integer that is strictly bigger than $N$ and strictly bigger than $|L| + 53$ (say strictly
bigger than $N + |L| + 53$). Such an integer exists. Then by the reverse triangle inequality,
$|n - L| \ge |n| - |L| \ge 53 = \epsilon$.
The last two examples are different: the first one has no limit at all since the terms
oscillate wildly, but for the second example we have a sense that its limit is infinity. We
formalize this:
Definition 8.3.4. A real-valued sequence {sn } diverges to ∞ if for every positive real
number M there exists a positive number N such that for all integers n > N , sn > M . We
write this as lim sn = ∞.
A real-valued sequence {sn } diverges to −∞ if for every negative real number M
there exists a positive real number N such that for all integers n > N , sn < M . We write
this as lim sn = −∞.
Proof. Let M > 0. Set N = M . (As in epsilon-delta or epsilon-N proofs, we must figure
out what to set N to. In this case, N = M works). Let n ∈ N+ with n > N . Since N = M ,
we conclude that n > M , and the proof is complete.
Example 8.3.6. $\lim \sqrt[n]{n!} = \infty$.
Proof. Let $M > 0$. Let $M_0$ be an integer that is strictly greater than $M$. By Example 8.2.10
there exists $N_1 > 0$ such that for all integers $n > N_1$,
$$\left|\sqrt[n]{\frac{M_0!}{(M_0 + 1)^{M_0}}} - 1\right| < \frac{M_0 + 1 - M}{M_0 + 1}.$$
$$n! = n(n - 1) \cdots (M_0 + 1) \cdot M_0! \ge (M_0 + 1)^{n - M_0} \cdot M_0! = (M_0 + 1)^n \frac{M_0!}{(M_0 + 1)^{M_0}},$$
so that
$$\sqrt[n]{n!} \ge (M_0 + 1) \sqrt[n]{\frac{M_0!}{(M_0 + 1)^{M_0}}} > (M_0 + 1) \frac{M}{M_0 + 1} = M.$$
Theorem 8.3.7. (Comparison theorem (for sequences with infinite limits)) Let
{sn }, {tn } be real-valued sequences such that for all sufficiently large n (say for n ≥ N for
some fixed N ), sn ≤ tn .
(1) If lim sn = ∞, then lim tn = ∞.
(2) If lim tn = −∞, then lim sn = −∞.
Proof. (1) By the assumption $\lim s_n = \infty$, for every positive $M$ there exists a positive $N'$ such
that for all integers $n > N'$, $s_n > M$. Hence by the assumption that $t_n \ge s_n$ for all $n \ge N$, we get
that for every positive $M$ and for all integers $n > \max\{N, N'\}$, $t_n > M$. Thus by definition
$\lim t_n = \infty$.
Part (2) has an analogous proof.
Example 8.3.8. $\lim \frac{n^2 + 1}{n} = \infty$.
Proof. Note that for all $n \in \mathbb{N}^+$, $\frac{n^2 + 1}{n} = n + \frac{1}{n} \ge n$, and since we already know that
$\lim n = \infty$, it follows by the comparison theorem above that $\lim \frac{n^2 + 1}{n} = \infty$.
Example 8.3.9. $\lim \frac{n^2 - 1}{n} = \infty$.
Proof. Note that for all integers $n > 2$, $\frac{n^2 - 1}{n} = n - \frac{1}{n} \ge \frac{n}{2}$. We already know that
$\lim n = \infty$, and it is straightforward to prove that $\lim \frac{n}{2} = \infty$. Hence by the comparison
theorem, $\lim \frac{n^2 - 1}{n} = \infty$.
Or, we can give an $M$-$N$ proof. Let $M > 0$. Set $N = \max\{2, 2M\}$. Let $n$ be an
integer strictly bigger than $N$. Then
$$\frac{n^2 - 1}{n} = n - \frac{1}{n} \ge \frac{n}{2} > \frac{N}{2} \ge M.$$
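The $M$-$N$ argument can be spot-checked in the same spirit as an epsilon-$N$ argument. In the sketch below the names `s`, `N_for` and `check` are assumptions made for illustration, and only finitely many integers $n > N$ are tested for each sample $M$.

```python
import math

def s(n):
    """The nth term (n^2 - 1) / n."""
    return (n * n - 1) / n

def N_for(M):
    """The N from the M-N proof: max{2, 2M}."""
    return max(2, 2 * M)

def check(M, how_many=1000):
    """Test s_n > M for the first how_many integers n strictly bigger than N."""
    start = math.floor(N_for(M)) + 1
    return all(s(n) > M for n in range(start, start + how_many))

for M in (0.5, 10, 1000, 12345.6):
    print(M, N_for(M), check(M))
```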
Theorem 8.3.10. Let $\{s_n\}$ be a sequence of positive numbers. Then $\lim s_n = \infty$ if and
only if $\lim \frac{1}{s_n} = 0$.
Proof. Suppose that $\lim s_n = \infty$. Let $\epsilon > 0$. By the definition of infinite limits, there
exists a positive number $N$ such that for all integers $n > N$, $s_n > 1/\epsilon$. Then for the same
$n$, $0 < \frac{1}{s_n} < \epsilon$, so that $|\frac{1}{s_n}| < \epsilon$. This proves that $\lim \frac{1}{s_n} = 0$.
Now suppose that $\lim \frac{1}{s_n} = 0$. Let $M$ be a positive number. By the assumption
$\lim \frac{1}{s_n} = 0$, there exists a positive number $N$ such that for all integers $n > N$, $|\frac{1}{s_n} - 0| < 1/M$. Since
each $s_n$ is positive, it follows that for the same $n$, $\frac{1}{s_n} < 1/M$, so that $s_n > M$. This proves
that $\lim s_n = \infty$.
8.3.4. Give an example of two divergent sequences {sn }, {tn } such that {sn +tn } converges.
8.3.5. Let $\{s_n\}$ be a sequence of negative numbers. Prove that $\lim s_n = -\infty$ if and only
if $\lim \frac{1}{s_n} = 0$.
8.3.6. Given the following sequences, find and prove the limits, finite or infinite, if they
exist. Otherwise, prove divergence:
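i) $\left\{\frac{n+5}{n^3-5}\right\}$.
ii) $\left\{\frac{2n^2-n}{3n^2-5}\right\}$.
iii) $\left\{\frac{(-1)^n}{n-3}\right\}$.
iv) $\left\{\frac{(-1)^n n}{n+1}\right\}$.
v) $\left\{\frac{n^3-8n}{n^2+8n}\right\}$.
vi) $\left\{\frac{1-n^2}{n}\right\}$.
vii) $\left\{\frac{2^n}{n!}\right\}$.
viii) $\left\{\frac{n!}{(n+1)!}\right\}$.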
8.3.7. Prove or give a counterexample:
i) If {sn } and {tn } both diverge, then {sn + tn } diverges.
ii) If {sn } converges and {tn } diverges, then {sn + tn } diverges.
iii) If {sn } and {tn } both diverge, then {sn · tn } diverges.
iv) If {sn } converges and {tn } diverges, then {sn · tn } diverges.
8.3.8. Find examples of the following:
i) A sequence $\{s_n\}$ of non-zero terms such that $\lim\left\{\frac{s_n}{s_{n+1}}\right\} = 0$.
ii) A sequence $\{s_n\}$ of non-zero terms such that $\lim\left\{\frac{s_{n+1}}{s_n}\right\} = 1$.
iii) A sequence $\{s_n\}$ of non-zero terms such that $\lim\left\{\frac{s_{n+1}}{s_n}\right\} = \infty$.
iv) A sequence $\{s_n\}$ such that $\lim\{s_{n+1} - s_n\} = 0$.
v) A sequence $\{s_n\}$ such that $\lim\{s_{n+1} - s_n\} = \infty$.
8.3.9. Suppose that the sequence {sn }n diverges to ∞ (or to −∞). Prove that {sn }n
diverges. (The point of this exercise is to parse the definitions correctly.)
All of the theorems in this section are also proved in Section 8.5 with a different
method; here we use the epsilon-N formulation for proofs without explicitly resorting to
functions whose domains have a limit point. I recommend reading this section and omitting
Section 8.5.
Proof. Let $\{s_n\}$ be a convergent sequence. Suppose that $\{s_n\}$ converges to both $L$ and $L'$.
Let $\epsilon > 0$. There exists an $N$ such that $|s_n - L| < \epsilon/2$ for all integers $n > N$. Likewise,
there exists an $N'$ such that $|s_n - L'| < \epsilon/2$ for all integers $n > N'$. Then for any integer
$n > \max\{N, N'\}$, by the triangle inequality,
$|L - L'| = |L - s_n + s_n - L'| \le |L - s_n| + |s_n - L'| < \epsilon/2 + \epsilon/2 = \epsilon$. Since $\epsilon$ is
arbitrary, by Theorem 2.11.4 it must be the case that $|L - L'| = 0$, i.e., that $L = L'$.
Theorem 8.4.2. Suppose that $\lim\{s_n\} = L$ and that $L \ne 0$. Then there exists a positive
number $N$ such that for all integers $n > N$, $|s_n| > |L|/2$. In particular, there exists a
positive number $N$ such that for all integers $n > N$, $s_n \ne 0$.
Proof. Note that $p = |L|/2$ is a positive real number. Since $\lim\{s_n\} = L$, it follows that
there exists a real number $N$ such that for all integers $n > N$, $|s_n - L| < |L|/2$. Then by
the reverse triangle inequality (proved in Theorem 2.11.3),
\[ |s_n| = |s_n - L + L| = |(s_n - L) + L| \ge |L| - |s_n - L| > |L| - |L|/2 = |L|/2. \]
Proof. Part (1) was proved immediately after Definition 8.2.1. Part (2) was Example 8.2.2.
Part (3): Let $\epsilon > 0$. Since $\lim s_n = L$, there exists a positive real number $N_1$ such
that for all integers $n > N_1$, $|s_n - L| < \epsilon/2$. Since $\lim t_n = K$, there exists a positive real
number $N_2$ such that for all integers $n > N_2$, $|t_n - K| < \epsilon/2$. Let $N = \max\{N_1, N_2\}$. Then
for all integers $n > N$,
$n$, $|cs_n - cL| = |c| \cdot |s_n - L| \le |c|\,\epsilon/(|c| + 1) = \frac{|c|}{|c|+1}\,\epsilon < \epsilon$. Since $\epsilon$ is any positive number,
we conclude that $\{cs_n\}$ converges to $cL$.
Part (5): Let $\epsilon > 0$. Since $\lim\{s_n\} = L$, there exists $N_1$ such that for all integers
$n > N_1$, $|s_n - L| < 1$. Thus for all such $n$, $|s_n| = |s_n - L + L| \le |s_n - L| + |L| < 1 + |L|$. There
also exists $N_2$ such that for all integers $n > N_2$, $|s_n - L| < \epsilon/(2|K| + 1)$. Since
$\lim\{t_n\} = K$, there exists $N_3$ such that for all integers $n > N_3$, $|t_n - K| < \epsilon/(2|L| + 2)$. Let
$N = \max\{N_1, N_2, N_3\}$. Then for all integers $n > N$,
\begin{align*}
|s_n t_n - LK| &= |s_n t_n - s_n K + s_n K - LK| && \text{(by adding a clever 0)} \\
&= |(s_n t_n - s_n K) + (s_n K - LK)| \\
&\le |s_n t_n - s_n K| + |s_n K - LK| && \text{(by the triangle inequality)} \\
&= |s_n (t_n - K)| + |(s_n - L) K| \\
&= |s_n| \cdot |t_n - K| + |s_n - L| \cdot |K| \\
&< (|L| + 1) \cdot \frac{\epsilon}{2|L| + 2} + \frac{\epsilon}{2|K| + 1} \cdot |K| \\
&< \frac{\epsilon}{2} + \frac{\epsilon}{2} \\
&= \epsilon,
\end{align*}
which proves (5).
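Part (6): Let $\epsilon > 0$. Since $\lim\{s_n\} = L$, there exists $N_1$ such that for all integers
$n > N_1$, $|s_n - L| < 1$. Thus for all such $n$, $|s_n| = |s_n - L + L| \le |s_n - L| + |L| < 1 + |L|$.
There also exists $N_2$ such that for all integers $n > N_2$, $|s_n - L| < \frac{|K|\epsilon}{2}$. Since
$\lim\{t_n\} = K$, there exists $N_3$ such that for all integers $n > N_3$, $|t_n - K| < \frac{|K|^2 \epsilon}{4(1+|L|)}$. Since
$K \ne 0$, by Theorem 8.4.2 there exists a positive number $N_4$ such that for all integers
$n > N_4$, $|t_n| > |K|/2$. Let $N = \max\{N_1, N_2, N_3, N_4\}$. Then for all integers $n > N$,
\begin{align*}
\left| \frac{s_n}{t_n} - \frac{L}{K} \right| &= \left| \frac{K \cdot s_n - L \cdot t_n}{K \cdot t_n} \right| \\
&= \left| \frac{K \cdot s_n - t_n s_n + t_n s_n - L \cdot t_n}{K \cdot t_n} \right| && \text{(by adding a clever 0)} \\
&\le \left| \frac{K \cdot s_n - t_n s_n}{K \cdot t_n} \right| + \left| \frac{t_n s_n - L \cdot t_n}{K \cdot t_n} \right| && \text{(by the triangle inequality)} \\
&= |K - t_n| \, |s_n| \, \frac{1}{|K|} \, \frac{1}{|t_n|} + |L - s_n| \, \frac{1}{|K|} \\
&< \frac{|K|^2 \epsilon}{4(1 + |L|)} \, (1 + |L|) \, \frac{1}{|K|} \, \frac{2}{|K|} + \frac{|K| \epsilon}{2} \, \frac{1}{|K|} \\
&= \frac{\epsilon}{2} + \frac{\epsilon}{2} \\
&= \epsilon,
\end{align*}
which proves (6).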
Part (7): The proof is by induction on $m$, the base case being the assumption. If
$\lim_{n\to\infty}\{1/n^{m-1}\} = 0$, then by the product rule, $\lim_{n\to\infty}\{1/n^m\} = \lim_{n\to\infty}\{1/n^{m-1} \cdot 1/n\}
= \lim_{n\to\infty}\{1/n^{m-1}\} \cdot \lim_{n\to\infty}\{1/n\} = 0$. This proves (7).
Example 8.4.4. Suppose $s_n = \frac{5n-2}{3n+4}$. To prove that $\lim s_n = 5/3$, we note that
$s_n = \frac{5n-2}{3n+4} \cdot \frac{1/n}{1/n} = \frac{5-2/n}{3+4/n}$.
By the linear rule, $\lim(1/n) = 0$, so by the scalar rule, $\lim(2/n) =
\lim(4/n) = 0$. Thus by the constant, sum, and difference rules, $\lim\{5 - 2/n\} = 5$ and
$\lim\{3 + 4/n\} = 3$, so that by the quotient rule, $\lim s_n = 5/3$.
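As a numerical companion to this example, one may compute the distance $|s_n - 5/3|$ exactly. A direct calculation gives $|s_n - 5/3| = \frac{26}{9n+12}$, which visibly tends to 0. The Python sketch below is only an illustration; the function names `s` and `error` are hypothetical, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def s(n):
    """Return s_n = (5n - 2)/(3n + 4) exactly."""
    return Fraction(5 * n - 2, 3 * n + 4)

def error(n):
    """Return |s_n - 5/3| exactly."""
    return abs(s(n) - Fraction(5, 3))

for n in (1, 10, 100, 1000):
    # Print the index, the exact term, and a decimal view of the error.
    print(n, s(n), float(error(n)))
```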
Example 8.4.5. Let $s_n = \frac{3n+2}{n^2-3}$. Note that $s_n = \frac{3n+2}{n^2-3} \cdot \frac{1/n^2}{1/n^2} = \frac{3/n+2/n^2}{1-3/n^2}$. By the linear
rule, $\lim 1/n = 0$, so that by the scalar rule, $\lim 3/n = 0$, and by the product and scalar
rules, $\lim 2/n^2 = \lim 3/n^2 = 0$. Thus by the sum and difference rules, $\lim\{3/n + 2/n^2\} = 0$
and $\lim\{1 - 3/n^2\} = 1$. Finally, by the quotient rule, $\lim s_n = 0/1 = 0$.
Theorem 8.4.6. (Power, polynomial, rational rules for sequences) For any positive
integer $m$, $\lim_{n\to\infty}\{1/n^m\} = 0$. If $f$ is a polynomial function, then $\lim\{f(1/n)\} = f(0)$. If
$f$ is a rational function that is defined at 0, then $\lim\{f(1/n)\} = f(0)$.
Proof. By the linear rule (in Theorem 8.4.3), $\lim_{n\to\infty}\{1/n\} = 0$, and by the power rule,
for all positive integers $m$, $\lim_{n\to\infty}\{1/n^m\} = 0$.
Now write $f(x) = a_0 + a_1 x + \cdots + a_k x^k$ for some non-negative integer $k$ and some
complex numbers $a_0, a_1, \ldots, a_k$. By the constant, power and repeated sum rules,
\begin{align*}
\lim_{n\to\infty} f(1/n) &= \lim_{n\to\infty} \left( a_0 + a_1 (1/n) + a_2 (1/n)^2 + \cdots + a_k (1/n)^k \right) \\
&= a_0 + \lim_{n\to\infty} a_1 (1/n) + \lim_{n\to\infty} a_2 (1/n)^2 + \cdots + \lim_{n\to\infty} a_k (1/n)^k \\
&= a_0 + a_1 \cdot 0 + a_2 \cdot 0^2 + \cdots + a_k \cdot 0^k \\
&= a_0 = f(0).
\end{align*}
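The polynomial rule can be watched in action on a concrete polynomial. The sketch below is an illustration under assumptions: the polynomial $f(x) = 7 - 3x + 2x^2$ and the helper name `poly` are hypothetical choices made here, and only finitely many $n$ are sampled.

```python
from fractions import Fraction

def poly(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_k x^k by Horner's rule; coeffs = [a_0, ..., a_k]."""
    result = Fraction(0)
    for a in reversed(coeffs):
        result = result * x + a
    return result

coeffs = [7, -3, 2]  # hypothetical example: f(x) = 7 - 3x + 2x^2, so f(0) = 7

# Distances |f(1/n) - f(0)| for a few values of n.
gaps = [abs(poly(coeffs, Fraction(1, n)) - coeffs[0]) for n in (1, 10, 100, 1000)]
print([float(g) for g in gaps])
```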
Theorem 8.4.7. (The composite rule for sequences) Suppose that lim sn = L. Let
g be a function whose domain contains L and all terms sn . Suppose that g is continuous
at L. Then lim g(sn ) = g(L).
Proof. Let $\epsilon > 0$. Since $g$ is continuous at $L$, there exists a positive number $\delta$ such
that for all $x$ in the domain of $g$, if $|x - L| < \delta$ then $|g(x) - g(L)| < \epsilon$. Since $\lim s_n = L$,
there exists a positive number $N$ such that for all integers $n > N$, $|s_n - L| < \delta$. Hence for
the same $n$, $|g(s_n) - g(L)| < \epsilon$.
In particular, since the absolute value function, the real part, and the imaginary part
functions are continuous everywhere, we immediately conclude the following:
Since the real and imaginary parts determine a complex number, we moreover get:
Theorem 8.4.9. A sequence $\{s_n\}$ of complex numbers converges if and only if the
sequences $\{\operatorname{Re} s_n\}$ and $\{\operatorname{Im} s_n\}$ of real numbers converge.
Proof. Let $L = \lim s_n$, $K = \lim t_n$. By Theorem 8.4.8, $\lim |s_n| = |L|$, $\lim |t_n| = |K|$.
Suppose that $|L| > |K|$. Set $\epsilon = (|L| - |K|)/2$. By the definition of convergence,
there exist $N_1, N_2 > 0$ such that for integers $n$, $n > N_1$ implies that $-\epsilon < |s_n| - |L| < \epsilon$,
and $n > N_2$ implies that $-\epsilon < |t_n| - |K| < \epsilon$. Let $N_3$ be a positive number such that for all
integers $n > N_3$, $|s_n| \le |t_n|$. If we let $N = \max\{N_1, N_2, N_3\}$, then for integers $n > N$ we
have that $|t_n| < |K| + \epsilon = (|L| + |K|)/2 = |L| - \epsilon < |s_n|$. This contradicts the assumption
$|s_n| \le |t_n|$, so that necessarily $|L| \le |K|$.
The proof of the second part is similar and left to the exercises.
Theorem 8.4.11. (The squeeze theorem for sequences) Suppose that $s, t, u$ are
sequences of real numbers and that for all $n \in \mathbb{N}^+$, $s_n \le t_n \le u_n$. If $\lim s$ and $\lim u$ both
exist and are equal, then $\lim t$ exists as well and
\[ \lim s = \lim t = \lim u. \]
Proof. Set $L = \lim s = \lim u$. Let $\epsilon > 0$. Since $\lim s = L$, there exists a positive $N_1$ such
that for all integers $n > N_1$, $|s_n - L| < \epsilon$. Since $\lim u = L$, there exists a positive $N_2$ such
that for all integers $n > N_2$, $|u_n - L| < \epsilon$. Set $N = \max\{N_1, N_2\}$. Let $n$ be an integer
strictly greater than $N$. Then $-\epsilon < s_n - L \le t_n - L \le u_n - L < \epsilon$, so that $|t_n - L| < \epsilon$.
Since $\epsilon$ was arbitrary, this proves that $\lim t = L$.
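Proof. For $n \ge 2$, $1 \le \left(\frac{n+1}{n}\right)^{1/n} \le \left(\frac{3}{2}\right)^{1/n} \le n^{1/n}$. Thus by the previous theorem and
by Example 8.2.9, $\lim\left\{\left(\frac{n+1}{n}\right)^{1/n}\right\} = 1$. Hence by Theorem 8.4.3 and by Example 8.2.9,
\[ \lim\{(n+1)^{1/n}\} = \lim\left\{ \left(\frac{n+1}{n}\right)^{1/n} n^{1/n} \right\} = \lim\left\{ \left(\frac{n+1}{n}\right)^{1/n} \right\} \lim\{n^{1/n}\} = 1. \]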
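The squeeze used in this proof rests on the chain $1 \le \frac{n+1}{n} \le \frac{3}{2} \le n$ for $n \ge 2$; taking $n$th roots preserves the order. The Python sketch below checks that chain exactly for a range of $n$ and prints floating-point approximations of $(n+1)^{1/n}$. It is a finite illustration only, and the function names are hypothetical.

```python
from fractions import Fraction

def bases_ordered(n):
    """Check 1 <= (n+1)/n <= 3/2 <= n exactly; taking n-th roots preserves the order."""
    return 1 <= Fraction(n + 1, n) <= Fraction(3, 2) <= n

def root_of_n_plus_1(n):
    """Floating-point approximation of (n + 1)^(1/n)."""
    return (n + 1) ** (1.0 / n)

print(all(bases_ordered(n) for n in range(2, 10_000)))
for n in (2, 10, 100, 10_000):
    print(n, root_of_n_plus_1(n))
```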
8.4.2. Give an example of a sequence {sn } and a number L such that lim |sn | = |L| but
{sn } does not converge. Justify your example.
8.4.3. Prove that $\lim\left\{\frac{(-1)^n}{n}\right\} = 0$.
†8.4.4. Let $\{s_n\}$ be a convergent sequence of positive real numbers. Prove that $\lim s_n \ge 0$.
8.4.5. Prove that for all integers $m$, $\lim_{n\to\infty} \left(\frac{n+1}{n}\right)^m = 1$.
8.4.6. Prove that $\lim_{n\to\infty}\left\{3 - \frac{2}{n}\right\} = 3$.
8.4.7. Prove that $\lim\left(\sqrt[3]{n+1} - \sqrt[3]{n}\right) = 0$.
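8.4.8. By Example 8.2.8, $\lim\left(1 + \frac{1}{n}\right)^n = e$. Determine the following limits:
i) $\left\{\left(\frac{n+1}{n}\right)^n\right\}$.
ii) $\left\{\left(1 + \frac{1}{2n}\right)^{2n}\right\}$.
iii) $\left\{\left(1 + \frac{1}{n}\right)^{2n}\right\}$.
iv) $\left\{\left(1 + \frac{1}{2n}\right)^{n}\right\}$.
v) $\left\{\left(1 + \frac{1}{n}\right)^{n+1}\right\}$.
8.4.9. Let $s_1$ be a positive real number. For each $n \ge 1$, let $s_{n+1} = \sqrt{s_n}$. Prove that
$\lim s_n = 1$. (Hint: Prove that $\lim s_n = \lim s_n^2$.)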
The results in this section are the same as those in Section 8.4, but here they are
proved with theorems about limits of functions that were proved in Chapter 4. So, a
connection is made between limits of functions and limits of sequences. The reader may
omit this section (or the previous one). This section is more abstract; one has to keep
in mind connections with functions as well as theorems about limits of functions to get
at theorems about limits of sequences. Exercises for this section appear at the end of
Section 8.4.
For any sequence s we can define a function f : {1/n : n ∈ N+ } → C with f (1/n) =
sn . Conversely, for every function f : {1/n : n ∈ N+ } → C we can define a sequence s with
sn = f (1/n).
The domain of $f$ has exactly one limit point, namely 0. With this we have the usual
notion of $\lim_{x\to 0} f(x)$, together with the standard theorems from Section 4.4.
Theorem 8.5.1. Let s, f be as above. Then lim sn = L if and only if limx→0 f (x) = L.
Proof. (⇒) Suppose that $\lim s_n = L$. We have to prove that $\lim_{x\to 0} f(x) = L$. Let $\epsilon > 0$.
Since $\lim s_n = L$, there exists a positive real number $N$ such that for all integers
$n > N$, $|s_n - L| < \epsilon$. Let $\delta = 1/N$. Then $\delta$ is a positive real number. Let $x$ be in the
domain of $f$ such that $0 < |x - 0| < \delta$. Necessarily $x = 1/n$ for some positive integer
$n$. Thus $0 < |x - 0| < \delta$ simply says that $1/n < \delta = 1/N$, so that $N < n$. But then by
assumption $|f(1/n) - L| = |s_n - L| < \epsilon$, which proves that $\lim_{x\to 0} f(x) = L$.
(⇐) Now suppose that $\lim_{x\to 0} f(x) = L$. We have to prove that $\lim s_n = L$. Let
$\epsilon > 0$. Since $\lim_{x\to 0} f(x) = L$, there exists a positive real number $\delta$ such that for
all $x$ in the domain of $f$, if $0 < |x - 0| < \delta$ then $|f(x) - L| < \epsilon$. Set $N = 1/\delta$. Then $N$
is a positive real number. Let $n$ be an integer greater than $N$. Then $0 < 1/n < 1/N = \delta$,
so that by assumption $|f(1/n) - L| < \epsilon$. Hence $|s_n - L| = |f(1/n) - L| < \epsilon$, which proves
that $\lim s_n = L$.
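The dictionary between sequences and functions on $\{1/n : n \in \mathbb{N}^+\}$, and the translations $\delta = 1/N$ and $N = 1/\delta$ used in the two directions of the proof, can be sketched in Python. This is an illustration under assumptions: the sample sequence $s_n = \frac{5n-2}{3n+4}$ and all function names are hypothetical choices made here.

```python
from fractions import Fraction

def s(n):
    """Hypothetical example sequence s_n = (5n - 2)/(3n + 4)."""
    return Fraction(5 * n - 2, 3 * n + 4)

def f(x):
    """The function on {1/n : n a positive integer} with f(1/n) = s_n."""
    n = 1 / Fraction(x)
    if n.denominator != 1 or n < 1:
        raise ValueError("x is not of the form 1/n")
    return s(int(n))

def delta_from_N(N):
    """Forward direction of the proof: delta = 1/N."""
    return 1 / Fraction(N)

def N_from_delta(delta):
    """Backward direction of the proof: N = 1/delta."""
    return 1 / Fraction(delta)

# 0 < 1/n < delta holds exactly when n > N.
print(all((Fraction(1, n) < delta_from_N(9)) == (n > 9) for n in range(1, 50)))
```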
Example 8.5.2. (Compare the reasoning in this example with the epsilon-N proofs of
Section 8.2.) Let $s_n = \frac{5n-2}{3n+4}$. We note that $s_n = \frac{5n-2}{3n+4} \cdot \frac{1/n}{1/n} = \frac{5-2/n}{3+4/n}$. The corresponding
function $f : \{1/n : n \in \mathbb{N}^+\} \to \mathbb{C}$ is $f(x) = \frac{5-2x}{3+4x}$, and by the scalar, sum, difference, and
quotient rules for limits of functions, $\lim_{x\to 0} f(x) = \frac{5-0}{3+0} = 5/3$, so that by Theorem 8.5.1,
$\lim s_n = 5/3$.
Proof. Let $\{s_n\}$ be a convergent sequence. Suppose that $\{s_n\}$ converges to both $L$
and $L'$. Let $f : \{1/n : n \in \mathbb{N}^+\} \to \mathbb{C}$ be the function corresponding to $s$. By Theorem 8.5.1,
$\lim_{x\to 0} f(x) = L$ and $\lim_{x\to 0} f(x) = L'$. By Theorem 4.4.1, $L = L'$. This proves uniqueness
of limits for sequences.
Theorem 8.5.5. Suppose that $\lim\{s_n\} = L$ and that $L \ne 0$. Then there exists a positive
number $N$ such that for all integers $n > N$, $|s_n| > |L|/2$. In particular, there exists a
positive number $N$ such that for all integers $n > N$, $s_n \ne 0$.
Proof. Let $f$ be the function corresponding to $s$. Then $\lim_{x\to 0} f(x) = L$, so by Theorem 4.4.2
there exists $\delta > 0$ such that for all $x$ in the domain of $f$, if $x < \delta$ then $|f(x)| > |L|/2$.
Set $N = 1/\delta$. Let $n$ be an integer strictly greater than $N$. Then $1/n < 1/N = \delta$, so
$|s_n| = |f(1/n)| > |L|/2$.
Theorem 8.5.8. (The composite rule for sequences) Suppose that lim sn = L. Let
g be a function whose domain contains L and all terms sn . Suppose that g is continuous
at L. Then lim g(sn ) = g(L).
Since the real and imaginary parts determine a complex number, we also have:
Theorem 8.5.10. A sequence $\{s_n\}$ of complex numbers converges if and only if the
sequences $\{\operatorname{Re} s_n\}$ and $\{\operatorname{Im} s_n\}$ of real numbers converge.
Proof. Let $A$ be the set of those $1/n$ for which $|s_n| \le |t_n|$ in the first case and for which
$s_n \le t_n$ in the second case. Let $f, g : A \to \mathbb{R}$ be the functions $f(1/n) = |s_n|$, $g(1/n) = |t_n|$
in the first case, and $f(1/n) = s_n$, $g(1/n) = t_n$ in the second case. By assumption,
for all $x$ in the domain, $f(x) \le g(x)$. Since 0 is a limit point of the domain (despite
omitting finitely many $1/n$) and since by Theorem 8.5.1, $\lim_{x\to 0} f(x)$ and $\lim_{x\to 0} g(x)$
both exist, by Theorem 4.4.10, $\lim_{x\to 0} f(x) \le \lim_{x\to 0} g(x)$. In the first case, this translates
to $|\lim s_n| \le |\lim t_n|$, and in the second case it translates to $\lim s_n \le \lim t_n$.
Theorem 8.5.12. (The squeeze theorem for sequences) Suppose that $s, t, u$ are
sequences of real numbers and that for all $n \in \mathbb{N}^+$, $s_n \le t_n \le u_n$. If $\lim s$ and $\lim u$ both
exist and are equal, then $\lim t$ exists as well and
\[ \lim s = \lim t = \lim u. \]
Definition 8.6.1. A sequence {sn } is bounded if there exists a positive real number B
such that for all integers n, |sn | ≤ B.
If all sn are real numbers, we say that {sn } is bounded above (resp. below) if
there exists a real number M such that for all positive integers n, sn ≤ M (resp. sn ≥ M ).
In other words, {sn } is bounded if the set {sn : n ∈ N+ } is a subset of B(0, M ) for some
real number M .
Sequences $\{4 + 5i\}$, $\{1/n\}$, $\left\{\left(\frac{4+i}{5+2i}\right)^n\right\}$, $\{(-1)^n\}$, $\{3 - 4i^n\}$, $\left\{\frac{4n}{3n^2-4}\right\}$, $\{\sqrt[n]{n}\}$ are
bounded (the latter by Example 1.6.7), but $\{n\}$, $\{(-1)^n n\}$, $\left\{\frac{n^2+1}{n}\right\}$, $\{\sqrt[n]{n!}\}$, $\{2^{\sqrt{n}}\}$ are not
bounded. The next theorem provides many examples of bounded sequences.
Proof. Let $\{s_n\}$ be a convergent sequence with limit $L$. Thus there exists a positive integer
$N$ such that for all integers $n > N$, $|s_n - L| < 1$. Set $B = \max\{|s_1|, |s_2|, |s_3|, \ldots, |s_{N+1}|, |L| + 1\}$.
Then for all positive integers $n \le N$, $|s_n| \le B$ by definition of $B$, and for $n > N$,
$|s_n| = |s_n - L + L| \le |s_n - L| + |L| < 1 + |L| \le B$.
Another proof of the theorem above is given in the next section via Cauchy sequences.
Definition 8.6.3. A sequence {sn } of real numbers is called non-decreasing (resp. non-
increasing, strictly increasing, strictly decreasing) if for all n, sn ≤ sn+1 (resp.
sn ≥ sn+1 , sn < sn+1 , sn > sn+1 ). Any such sequence is called monotone.
Sequences $\{1/n\}$, $\{-n\}$ are strictly decreasing, $\left\{\frac{n^2+1}{n}\right\}$ is strictly increasing,
$\{(-1)^n n\}$ is neither increasing nor decreasing, $\left\{\frac{n^2+5}{n}\right\}_{n\ge 1}$ is neither increasing nor
decreasing, but $\left\{\frac{n^2+5}{n}\right\}_{n\ge 2}$ is strictly increasing.
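The dependence on the starting index in the last two examples is easy to see by listing terms: $\frac{n^2+5}{n}$ takes the values $6, \frac{9}{2}, \frac{14}{3}, \ldots$. The Python sketch below compares consecutive terms over a finite range, which is evidence rather than proof; the helper names `classify` and `t` are hypothetical.

```python
from fractions import Fraction

def classify(terms):
    """Classify a finite list of terms by comparing consecutive entries."""
    pairs = list(zip(terms, terms[1:]))
    if all(a < b for a, b in pairs):
        return "strictly increasing"
    if all(a > b for a, b in pairs):
        return "strictly decreasing"
    return "neither"

def t(n):
    """Return (n^2 + 5)/n exactly."""
    return Fraction(n * n + 5, n)

print(classify([t(n) for n in range(1, 50)]))  # indices starting at n = 1
print(classify([t(n) for n in range(2, 50)]))  # indices starting at n = 2
```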
Proof. Suppose that for all $n \ge N$, $s_n \le s_{n+1}$. By the Least upper bound property
(Axiom 2.10.1), the least upper bound of the set $\{s_N, s_{N+1}, s_{N+2}, \ldots\}$ exists. Call it $L$.
Let $\epsilon > 0$. Since $L$ is the least upper bound, there exists a positive integer $N' \ge N$
such that $0 \le L - s_{N'} < \epsilon$. Hence for all integers $n > N'$, $s_{N'} \le s_n$, so that
\[ 0 \le L - s_n \le L - s_{N'} < \epsilon, \]
which proves that for all $n > N'$, $|s_n - L| < \epsilon$. Thus $\lim s_n = L$.
The proof of the case of $s_n \ge s_{n+1}$ for all $n \ge N$ is similar.
The theorem below was already proved in Example 8.2.7.
Theorem 8.6.5. (Ratio test for sequences) Let $r \in \mathbb{C}$ with $|r| < 1$. Then $\lim r^n = 0$.
Proof. If $r = 0$, the sequence is the constant zero sequence, so of course its limit is 0. Now
suppose that $r \ne 0$. By Exercise 2.8.2, for all positive integers $n$, $0 < |r|^{n+1} < |r|^n$. Thus
the sequence $\{|r|^n\}$ is a non-increasing sequence that is bounded below by 0 and above
by 1. By Theorem 8.6.4, $L = \lim |r|^n = \inf\{|r|^n : n \in \mathbb{N}^+\}$. Since 0 is a lower bound and
$L$ is the greatest of lower bounds of $\{|r|^n : n \in \mathbb{N}^+\}$, necessarily $0 \le L$.
Suppose that $L > 0$. Then $L(1 - |r|)/(2|r|)$ is a positive number. Since $L$ is the
infimum of the set $\{|r|, |r|^2, |r|^3, \ldots\}$, there exists a positive integer $N$ such that
$0 \le |r|^N - L < L\,\frac{1-|r|}{2|r|}$. By multiplying by $|r|$ we get that
\begin{align*}
|r|^{N+1} &< |r| L + L\,\frac{1 - |r|}{2} = L |r| + \frac{L(1 - |r|)}{2} \\
&= L\,\frac{1 + |r|}{2} \\
&< L\,\frac{1 + 1}{2} = L,
\end{align*}
and since $L$ is the infimum of all powers of $|r|$, we get that $L \le |r|^{N+1} < L$, which is a
contradiction. So necessarily $L = 0$. Hence for every $\epsilon > 0$ there exists $N \in \mathbb{R}^+$ such that
for all integers $n > N$, $||r^n| - 0| < \epsilon$. But this says that $|r^n - 0| < \epsilon$, so that $\lim r^n = 0$ as
well.
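For a concrete complex $r$ with $|r| < 1$ one can search for the first power whose absolute value drops below a given $\epsilon$. The Python sketch below is an illustration in floating-point arithmetic, so it only approximates the exact statement; the function name `first_index_below`, the cap `max_n`, and the sample value of $r$ are hypothetical choices made here.

```python
def first_index_below(r, eps, max_n=10_000):
    """Return the least n >= 1 with |r^n| < eps, or None if none is found up to max_n."""
    power = complex(1)
    for n in range(1, max_n + 1):
        power *= r
        if abs(power) < eps:
            return n
    return None

r = complex(0.6, 0.6)  # hypothetical example with |r| = 0.6 * sqrt(2) < 1
print(abs(r), first_index_below(r, 1e-6))
```

When $|r| \ge 1$ the search finds nothing, which the function reports by returning `None`.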
Theorem 8.6.6. (Ratio test for sequences) Let $\{s_n\}$ be a sequence of non-zero complex
numbers and let $L$ be a real number in the interval $[0, 1)$. Assume that $\lim \left|\frac{s_{n+1}}{s_n}\right| = L$ or
that there exists a positive integer $K$ such that for any integer $n \ge K$, $\left|\frac{s_{n+1}}{s_n}\right| \le L$.
Then $\lim s_n = 0$.
Proof. Let $r$ be a real number strictly between $L$ and 1. Then $r$ and $r - L$ are positive
numbers.
Under the first (limit) condition, there exists a positive number $K$ such that for all
integers $n > K$, $\left| \left|\frac{s_{n+1}}{s_n}\right| - L \right| < r - L$. Then for all $n > K$,
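than $N$. Then
\begin{align*}
|s_n - 0| &= \left| \frac{s_n}{s_{n-1}} \cdot \frac{s_{n-1}}{s_{n-2}} \cdot \frac{s_{n-2}}{s_{n-3}} \cdots \frac{s_{K+1}}{s_K} \cdot s_K \right| \\
&= \left| \frac{s_n}{s_{n-1}} \right| \cdot \left| \frac{s_{n-1}}{s_{n-2}} \right| \cdot \left| \frac{s_{n-2}}{s_{n-3}} \right| \cdots \left| \frac{s_{K+1}}{s_K} \right| \cdot |s_K| \\
&\le r^{n-K} |s_K| \\
&= r^n |s_K| r^{-K} \\
&< \epsilon.
\end{align*}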
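The bound $|s_n| \le r^{n-K}|s_K|$ from the display above can be checked on an example. The Python sketch below assumes the hypothetical sequence $s_n = n/2^n$, for which $\left|\frac{s_{n+1}}{s_n}\right| = \frac{n+1}{2n} \le \frac{3}{4}$ whenever $n \ge 2$, so that $K = 2$ and $r = \frac{3}{4}$ work; it verifies the ratio condition and the bound for finitely many $n$ only.

```python
from fractions import Fraction

def s(n):
    """Hypothetical example: s_n = n / 2^n."""
    return Fraction(n, 2 ** n)

def ratio(n):
    """Return |s_{n+1} / s_n| exactly."""
    return abs(s(n + 1) / s(n))

K, r = 2, Fraction(3, 4)  # for n >= K the ratio (n+1)/(2n) is at most 3/4

ratios_ok = all(ratio(n) <= r for n in range(K, 200))
bound_ok = all(abs(s(n)) <= r ** (n - K) * abs(s(K)) for n in range(K, 200))
print(ratios_ok, bound_ok, float(s(199)))
```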
8.6.11. Let {sn } be a non-decreasing sequence. Prove that the set {s1 , s2 , s3 , . . .} is
bounded below.
8.6.12. (Monotone sequences) Let {sn } be a monotone sequence of real numbers.
i) Suppose that {sn } is not bounded above. Prove that {sn } is non-decreasing and
that lim sn = ∞.
ii) Suppose that {sn } is not bounded below. Prove that {sn } is non-increasing and
that lim sn = −∞.
iii) Prove that lim sn is a real number if and only if {sn } is bounded.
Definition 8.7.1. A sequence $\{s_n\}$ is Cauchy if for all $\epsilon > 0$ there exists a positive real
number $N$ such that for all integers $m, n > N$, $|s_n - s_m| < \epsilon$.
Proof. Let $\{s_n\}$ be a Cauchy sequence. Thus for $\epsilon = 1$ there exists a positive integer $N$
such that for all integers $m, n > N$, $|s_n - s_m| < 1$. Then the set $\{|s_1|, |s_2|, \ldots, |s_N|, |s_{N+1}|\}$
is a finite and hence a bounded subset of $\mathbb{R}$. Let $M'$ be an upper bound of this set, and let
$M = M' + 1$. It follows that for all $n = 1, \ldots, N$, $|s_n| < M$, and for $n > N$, $|s_n| =
|s_n - s_{N+1} + s_{N+1}| \le |s_n - s_{N+1}| + |s_{N+1}| < 1 + M' = M$. Thus $\{s_n\}$ is bounded by $M$.
Proof. Let $\{s_n\}$ be a convergent sequence. Let $L$ be the limit. Let $\epsilon > 0$. Since $\lim s_n = L$,
there exists a positive real number $N$ such that for all integers $n > N$, $|s_n - L| < \epsilon/2$. Thus for all
integers $m, n > N$,
\[ |s_n - s_m| = |s_n - L + L - s_m| \le |s_n - L| + |L - s_m| < \epsilon/2 + \epsilon/2 = \epsilon. \]
Remark 8.7.4. It follows that every convergent sequence is bounded (this was already
proved in Theorem 8.6.2).
The converse of Theorem 8.7.3 is not true if the field in which we are working is $\mathbb{Q}$.
For example, let $s_n$ be the decimal approximation of $\sqrt{2}$ to $n$ digits after the decimal point.
Then $\{s_n\}$ is a Cauchy sequence of rational numbers: for every $\epsilon > 0$, let $N$ be a positive
integer such that $1/10^N < \epsilon$. Then for all integers $n, m > N$, $s_n$ and $s_m$ differ at most
in digits $N + 1, N + 2, \ldots$ beyond the decimal point, so that $|s_n - s_m| \le 1/10^N < \epsilon$. But
$\{s_n\}$ does not have a limit in $\mathbb{Q}$, so that $\{s_n\}$ is a Cauchy but not a convergent sequence.
Another way of writing this is:
\[ \lim \left\{ \frac{\lfloor 10^n \sqrt{2} \rfloor}{10^n} \right\} = \sqrt{2}. \]
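The decimal approximations of $\sqrt{2}$ can be produced exactly with integer arithmetic. The Python sketch below reads "approximation to $n$ digits" as truncation (an assumption made here), uses the standard-library function `math.isqrt`, and checks the Cauchy estimate $|s_n - s_m| \le 1/10^N$ over a finite range; the names `s` and `cauchy_gap` are hypothetical.

```python
from fractions import Fraction
from math import isqrt

def s(n):
    """Return sqrt(2) truncated to n digits after the decimal point, as an exact rational."""
    # isqrt(2 * 10^(2n)) is the integer part of 10^n * sqrt(2).
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

def cauchy_gap(N, upto=40):
    """Return the largest |s_n - s_m| over N < m, n <= upto."""
    terms = [s(n) for n in range(N + 1, upto + 1)]
    return max(terms) - min(terms)

print(s(5), float(cauchy_gap(5)))
```

Every term is rational and its square stays strictly below 2, consistent with the sequence having no limit in $\mathbb{Q}$.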
But over R and C, all Cauchy sequences are convergent, as we prove next.
Proof. First let $\{s_n\}$ be a Cauchy sequence in $\mathbb{R}$ (as opposed to in $\mathbb{C}$). By Theorem 8.7.2,
$\{s_n\}$ is bounded. It follows that all subsets of $\{s_1, s_2, s_3, \ldots\}$ are bounded too. In particular,
by the Least upper bound property (Axiom 2.10.1), $u_n = \sup\{s_n, s_{n+1}, s_{n+2}, \ldots\}$ is a real
number. For all $n$, $u_n \ge u_{n+1}$ because $u_n$ is the supremum of a larger set. Any lower
bound on $\{s_1, s_2, s_3, \ldots\}$ is also a lower bound on $\{u_1, u_2, u_3, \ldots\}$. Thus by Theorem 8.6.4,
the monotone sequence $\{u_n\}$ has a limit $L = \inf\{u_1, u_2, u_3, \ldots\}$.
We claim that $L = \lim\{s_n\}$. Let $\epsilon > 0$. Since $\{s_n\}$ is Cauchy, there exists $N_1 > 0$
such that for all integers $m \ge n > N_1$, $|s_n - s_m| < \epsilon/2$. Thus if we fix $n > N_1$, then for
all $m \ge n$, we have that $s_m < s_n + \epsilon/2$. But $u_n$ is the least upper bound on all $s_m$ for
$m \ge n$, so that $s_m \le u_n \le s_n + \epsilon/2$, and in particular, $s_n \le u_n \le s_n + \epsilon/2$. It follows
that $|s_n - u_n| \le \epsilon/2$ for all integers $n > N_1$. Since $L = \inf\{u_1, u_2, u_3, \ldots\}$, there exists an
integer $N_2$ such that $0 \le u_{N_2} - L < \epsilon/2$. Set $N = \max\{N_1, N_2\}$. Let $n > N$ be an integer.
By the definition of the $u_n$, $L \le u_n \le u_N \le u_{N_2}$, so that $0 \le u_n - L \le u_{N_2} - L < \epsilon/2$.
Hence
The center of mass of the top two books is clearly at $\frac{3}{4}$ from the right-hand edge of the
bottom book, so that the third book from the top down should protrude from underneath
the second one by $\frac{1}{4}$.
The center of mass of this system, measured from the rightmost edge, is $\frac{2 \cdot 1 + 1 \cdot \frac{1}{2}}{3} = \frac{5}{6}$,
so that the fourth book has to protrude out $\frac{1}{6}$.
The center of mass of this system, measured from the rightmost edge, is $\frac{3 \cdot 1 + 1 \cdot \frac{1}{2}}{4} = \frac{7}{8}$,
so that the fifth book has to protrude out $\frac{1}{8}$ units.
In general, the center of mass of the top $n$ books is at $\frac{(n-1) \cdot 1 + 1 \cdot \frac{1}{2}}{n} = \frac{2n-1}{2n}$ measured
from the rightmost edge, so that the $(n + 1)$st book should protrude out by $\frac{1}{2n}$ units.
Thus the total protrusion of the top $n$ books equals
\[ \frac{1}{2} \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n-1} \right) \]
units. We just proved that this sum grows without bound. Thus we can reach the Moon
with enough books (and a platform to stand on).
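The growth of the overhang can be explored numerically. The Python sketch below sums the protrusions $\frac{1}{2k}$ exactly and finds how many books are needed for a given overhang, measured in book lengths. It is an illustration only; the function names `total_protrusion` and `books_needed` are hypothetical.

```python
from fractions import Fraction

def total_protrusion(n):
    """Return (1/2)(1 + 1/2 + ... + 1/(n-1)), the overhang of the top n books."""
    return sum(Fraction(1, 2 * k) for k in range(1, n))

def books_needed(target):
    """Return the least n whose top-n overhang is at least `target` book lengths."""
    n, total = 1, Fraction(0)
    while total < target:
        total += Fraction(1, 2 * n)  # protrusion of book n + 1 is 1/(2n)
        n += 1
    return n

print(total_protrusion(5), books_needed(1), books_needed(2))
```

Because the sum grows only logarithmically, `books_needed` increases very quickly with the target, so large targets are impractical to compute this way.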
8.8 Subsequences
Definition 8.8.1. A subsequence of an infinite sequence $\{s_n\}$ is an infinite sequence
$\{s_{k_1}, s_{k_2}, s_{k_3}, \ldots\}$ where $1 \le k_1 < k_2 < k_3 < \cdots$ are integers. Notations for such a
subsequence are:
\[ \{s_{k_n}\}, \quad \{s_{k_n}\}_n, \quad \{s_{k_n}\}_{n \ge 1}^{\infty}, \quad \{s_{k_n}\}_{n \ge 1}, \quad \{s_{k_n}\}_{n \in \mathbb{N}^+}. \]
Proof. Let $\{s_n\}$ be a convergent sequence, with limit $L$, and let $\{s_{k_n}\}$ be a subsequence.
Let $\epsilon > 0$. By assumption there exists a positive number $N$ such that for all integers
$n > N$, $|s_n - L| < \epsilon$. Since $n \le k_n$, for all integers $n > N$ we have $k_n > N$ and hence
$|s_{k_n} - L| < \epsilon$. Thus $\{s_{k_n}\}$ converges to $L$.
The proof of the second part is similar.
Proof. This proof uses the halving construction already encountered in Construction 3.6.1,
and here the property P is that the subset contains infinitely many elements of the sequence.
Let {sn } be a bounded sequence (of real or complex numbers). Let M be a positive
real number such that for all n, |sn | ≤ M . Let a0 = c0 = −M and b0 = d0 = M . The
sequence {sn } has infinitely many (all) terms in the rectangle R0 = [a0 , b0 ] × [c0 , d0 ]. Set
l0 = 0. (If all sn are real, we may take c0 = d0 = 0, or perhaps better, ignore the second
coordinates.)
We prove below that for all $m \in \mathbb{N}^+$ there exists a subsequence $\{s_{k_n}\}$ all of whose
terms are in the rectangle $R_m = [a_m, b_m] \times [c_m, d_m]$, where $b_m - a_m = 2^{-m}(b_0 - a_0)$,
$[a_m, b_m] \subseteq [a_{m-1}, b_{m-1}]$, $d_m - c_m = 2^{-m}(d_0 - c_0)$, $[c_m, d_m] \subseteq [c_{m-1}, d_{m-1}]$. Furthermore,
we prove that there exists $l_m > l_{m-1}$ such that $s_{l_m} \in R_m$.
Namely, given the $(m - 1)$st rectangle $R_{m-1}$, an integer $l_{m-1}$ such that $s_{l_{m-1}} \in R_{m-1}$,
and a subsequence $\{s_{k_n}\}$ all of whose terms are in $R_{m-1}$, divide $R_{m-1}$ into four equal-sized
subrectangles. Necessarily at least one of these four subrectangles contains infinitely many
elements of $\{s_{k_n}\}$, so pick one such subrectangle, and call it $R_m$. Therefore there exists
a subsequence of $\{s_{k_n}\}$ that is contained in $R_m$, and that subsequence of $\{s_{k_n}\}$ is also a
subsequence of $\{s_n\}$. We call it $\{s_{k'_n}\}$. Since we have infinitely many $k'_n$, in particular
there exists $k'_n > l_{m-1}$, and we set $l_m = k'_n$. Thus $s_{l_m} \in R_m$.
By construction, $\{s_{l_n}\}_n$ is a subsequence of $\{s_n\}_n$. We next prove that $\{s_{l_n}\}_n$ is
a Cauchy sequence. Let $\epsilon > 0$. Since either side length of the $m$th subrectangle $R_m$
equals the corresponding side length of $R_0$ divided by $2^m$, by Exercise 2.10.6 there exists
a positive integer $N$ such that any side length of $R_N$ is strictly smaller than the constant
$\epsilon/2$. Let $m, n > N$ be integers. Then $s_{l_m}, s_{l_n}$ are in $R_N$, so that
\[ |s_{l_m} - s_{l_n}| \le \sqrt{(\text{one side length of } R_N)^2 + (\text{other side length of } R_N)^2} < \epsilon. \]
The following is now an immediate consequence of Theorem 8.7.5:
Example 8.8.6. We work out the construction of a subsequence as in the proof above for the
bounded sequence $\{(-1)^n - 1\}$. For example, all terms lie on the interval $[a_0, b_0] = [-4, 4]$.
Infinitely many terms lie on [a1 , b1 ] = [−4, 0], and on this sub-interval I arbitrarily choose
the second term, which equals 0. Infinitely many terms lie on [a2 , b2 ] = [−4, −2], in
particular, I choose the third term −2. After this all terms of the sequence in [a2 , b2 ] are
−2, so that we have built the Cauchy subsequence {0, −2, −2, −2, . . .} (and subsequent
[an , bn ] all have bn = −2). We could have built the Cauchy subsequence {−2, −2, . . .},
or, if we started with the interval [−8, 8], we could have built the Cauchy subsequences
{0, 0, −2, −2, . . .} or {−2, 0, 0, −2, −2, . . .}, and so on.
Theorem 8.8.8. Every unbounded sequence of real numbers has a subsequence that has
limit −∞ or ∞.
Proof. If $\{s_n\}$ is not bounded, choose $k_1 \in \mathbb{N}^+$ such that $|s_{k_1}| \ge 1$, and once $k_{n-1}$ has been
chosen, choose an integer $k_n > k_{n-1}$ such that $|s_{k_n}| \ge n$. Now $\{s_{k_n}\}_n$ is a subsequence
of $\{s_n\}$. Either infinitely many among the $s_{k_n}$ are positive or else infinitely many among
the $s_{k_n}$ are negative. Choose a subsequence $\{s_{l_n}\}_n$ of $\{s_{k_n}\}_n$ such that all terms in $\{s_{l_n}\}$
have the same sign. If they are all positive, then since $s_{l_n} \ge n$ for all $n$, it follows that
$\lim_{n\to\infty} s_{l_n} = \infty$, and if they are all negative, then since $s_{l_n} \le -n$ for all $n$, it follows that
$\lim_{n\to\infty} s_{l_n} = -\infty$.
Recall that by Axiom 2.10.1 every non-empty subset $T$ of $\mathbb{R}$ bounded above has a
least upper bound $\sup T = \operatorname{lub} T$ in $\mathbb{R}$, and that every non-empty subset $T$ bounded below
has a greatest lower bound $\inf T = \operatorname{glb} T$ in $\mathbb{R}$. We extend this definition by declaring
$\sup T = \infty$ if $T$ is not bounded above,
$\inf T = -\infty$ if $T$ is not bounded below.
The same definitions apply to sequences thought of as sets: $\sup\{1/n\}_n = 1$, $\inf\{1/n\}_n = 0$,
$\sup\{n\}_n = \infty$, $\inf\{n\}_n = 1$, $\sup\{(-1)^n\}_n = 1$, $\inf\{(-1)^n\}_n = -1$, et cetera.
Much analysis of sequences has to do with their long-term behavior rather than with
their first three, first hundred, or first million terms — think convergence or the Cauchy
property of sequences. Further usage of such tail-end analysis is in the next chapter (for
convergence criteria for series). Partly with this goal in mind, we apply infima and suprema
to sequences of tail ends of sequences:
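Definition 8.9.1. Let $\{s_n\}$ be a real-valued sequence. The limit superior $\limsup$ and
limit inferior $\liminf$ of $\{s_n\}$ are:
\[ \limsup s_n = \inf\{\sup\{s_n : n \ge m\} : m \ge 1\}, \qquad \liminf s_n = \sup\{\inf\{s_n : n \ge m\} : m \ge 1\}. \]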
For lim inf sn to be a real number it is not enough for the sequence to be bounded
below: for example, {n} is bounded below but lim inf{n} = sup{inf{n : n ≥ m} : m ≥
1} = sup{m : m ≥ 1} = ∞. Similarly, for lim sup sn to be a real number it is not enough
for the sequence to be bounded above.
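The tail suprema and infima that enter into $\limsup$ and $\liminf$ can be approximated on a computer by cutting each tail off at a finite horizon. The Python sketch below does this for the hypothetical sequence $s_n = (-1)^n (1 + 1/n)$, whose tail suprema decrease toward 1 and whose tail infima increase toward $-1$. The finite horizon is an assumption of the sketch: for a general sequence a truncated tail can miss the true supremum or infimum.

```python
def tail_sup(s, m, horizon):
    """Approximate sup{s_n : n >= m} by the maximum over m <= n <= horizon."""
    return max(s(n) for n in range(m, horizon + 1))

def tail_inf(s, m, horizon):
    """Approximate inf{s_n : n >= m} by the minimum over m <= n <= horizon."""
    return min(s(n) for n in range(m, horizon + 1))

def s(n):
    """Hypothetical example: s_n = (-1)^n (1 + 1/n)."""
    return (-1) ** n * (1 + 1 / n)

horizon = 10_000
for m in (1, 10, 100, 1000):
    print(m, tail_sup(s, m, horizon), tail_inf(s, m, horizon))
```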
Proof. Let $\epsilon > 0$. Then there exists $N > 0$ such that for all integers $n > N$, $|s_n - L| < \epsilon$.
Thus for all integers $m > N$, $L - \epsilon \le \sup\{s_n : n \ge m\} \le L + \epsilon$, so that $L - \epsilon \le \limsup s_n \le L + \epsilon$. Since
this is true for all $\epsilon > 0$, it follows by Theorem 2.11.4 that $L = \limsup s_n$. The other part
is left to the reader.
Remark 8.9.3. (Ratio test for sequences) With the new language, Theorem 8.6.6
can be rephrased as follows: If $\{s_n\}$ is a sequence of non-zero complex numbers such that
$\limsup\{|s_{n+1}/s_n|\} < 1$, then $\lim s_n = 0$. The proof there already accomplishes this. On the
other hand, the ratio test for divergence in Exercise 8.6.6 is not phrased in the most general
form. One generalization is that if $\liminf\{|s_{n+1}/s_n|\} > 1$, then $\{s_n\}$ diverges. The proof
is simple. Let $r \in (1, \liminf\{|s_{n+1}/s_n|\})$. By definition of liminf as the supremum of some
infima, this means that there exists an integer $m$ such that $\inf\{|s_{n+1}/s_n| : n \ge m\} > r$.
Thus by an easy induction, for all $n > m$, $|s_n| > r^{n-m}|s_m|$, and then by the Comparison
test (Theorem 8.3.7), $\{|s_n|\}$ diverges to infinity, hence $\{s_n\}$ does not converge to a complex
number.
It turns out that there is an important connection between limsup, liminf, and sub-
sequential limits:
Theorem 8.9.4. Let $\{s_n\}$ be a bounded sequence of real numbers. Then the supremum
of the set of all subsequential limits equals $\limsup s_n$, and the infimum of the set of all
subsequential limits equals $\liminf s_n$.
Proof. Proof of the limsup part only: Let $A = \limsup\{s_n\}$, let $S$ be the set of all subsequential
limits of $\{s_n\}$, and let $U = \sup(S)$. Since the sequence is bounded, $A$ and $U$ are
real numbers.
Let $\epsilon > 0$. Since $A = \inf\{\sup\{s_n : n \ge m\} : m \ge 1\}$, there exists $m_0 \ge 1$ such that
$\sup\{s_n : n \ge m_0\} - A < \epsilon$. Thus for all $n \ge m_0$, $s_n - A < \epsilon$. But then any subsequential
limit of $\{s_n\}$ is a subsequential limit of $\{s_n\}_{n \ge m_0}$, so that this limit must be at most $A + \epsilon$.
Thus $A + \epsilon$ is an upper bound on all subsequential limits of $\{s_n\}$, so that $U \le A + \epsilon$. Since
$\epsilon$ is an arbitrary positive number, by Theorem 2.11.4 this means that $U \le A$.
By definition of $U$, there exists a convergent subsequence $\{s_{k_n}\}$ such that
$U - \lim\{s_{k_n}\}_n < \epsilon/2$. Let $L = \lim\{s_{k_n}\}_n$. So $U - L < \epsilon/2$, and there exists a positive
real number $N$ such that for all integers $n > N$, $|s_{k_n} - L| < \epsilon/2$. Thus for all $n > N$,
$$s_{k_n} > L - \epsilon/2 > (U - \epsilon/2) - \epsilon/2 = U - \epsilon.$$
Thus for any integer $m \ge N$, the supremum of $\{s_n : n > m\}$ must be at least $U - \epsilon$, so
that $A$, the limsup of $\{s_n\}$, must be at least $U - \epsilon$. Hence by Theorem 2.11.4 this means
that $A \le U$.
It follows that $A = U$.
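The definition of limsup as an infimum of tail suprema can be explored numerically. The following is only a sketch: the sequence $s_n = (-1)^n(1 + 1/n)$ is a hypothetical example chosen for illustration (its subsequential limits are $-1$ and $1$), and each tail supremum is approximated by scanning finitely many terms up to an arbitrary cutoff `horizon`, which happens to be exact for this particular sequence.

```python
def s(n):
    # Hypothetical example sequence: s_n = (-1)^n (1 + 1/n).
    return (-1) ** n * (1 + 1 / n)


def tail_sup(seq, m, horizon):
    """Approximate sup{seq(n) : n >= m} by scanning n = m, ..., horizon."""
    return max(seq(n) for n in range(m, horizon + 1))


def tail_inf(seq, m, horizon):
    """Approximate inf{seq(n) : n >= m} by scanning n = m, ..., horizon."""
    return min(seq(n) for n in range(m, horizon + 1))


horizon = 10_000  # assumed cutoff; not part of the definition
starts = (1, 10, 100, 1000)
sups = [tail_sup(s, m, horizon) for m in starts]
infs = [tail_inf(s, m, horizon) for m in starts]

for m, hi, lo in zip(starts, sups, infs):
    print(f"m = {m:5d}: tail sup ~ {hi:.6f}, tail inf ~ {lo:.6f}")
```

The tail suprema decrease toward $1$ and the tail infima increase toward $-1$, in agreement with the largest and smallest subsequential limits of this sequence.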
Theorem 8.9.5. Let $\{s_n\}$, $\{t_n\}$ be bounded sequences in $\mathbb{R}$. Then $\limsup s_n + \limsup t_n \ge \limsup(s_n + t_n)$ and $\liminf s_n + \liminf t_n \le \liminf(s_n + t_n)$.
Theorem 8.9.6. Let $\{s_n\}$ and $\{t_n\}$ be sequences of non-negative real numbers such that
$\lim s_n$ is a positive real number $L$. Then $\limsup(s_n t_n) = L \limsup t_n$ and $\liminf(s_n t_n) = L \liminf t_n$.
Proof. Let $\epsilon > 0$. Set $\epsilon' = \min\{L/2, \epsilon\}$. By assumption there exists $N > 0$ such that for all
integers $n > N$, $|s_n - L| < \epsilon'$. Then $L - \epsilon' < s_n < L + \epsilon'$. Thus each such $s_n$ is positive, and in
fact $|s_n| > L/2$. It follows that $(L - \epsilon')t_n \le s_n t_n \le (L + \epsilon')t_n$. But then since $L \pm \epsilon' \ge 0$,
$$\begin{aligned}
(L - \epsilon') \limsup t_n &= \limsup (L - \epsilon') t_n \\
&\le \limsup s_n t_n \\
&\le \limsup (L + \epsilon') t_n \\
&= (L + \epsilon') \limsup t_n.
\end{aligned}$$
8.9.6. Find bounded real-valued sequences $\{s_n\}$, $\{t_n\}$ such that $\limsup s_n + \limsup t_n > \limsup(s_n + t_n)$. (Compare with Theorem 8.9.5.)
8.9.7. Let $\{s_n\}$, $\{t_n\}$ be bounded sequences in $\mathbb{R}$.
i) Finish the proof of Theorem 8.9.5, namely prove that $\liminf s_n + \liminf t_n \le \liminf(s_n + t_n)$.
ii) Find such $\{s_n\}$, $\{t_n\}$ so that $\liminf s_n + \liminf t_n < \liminf(s_n + t_n)$.
8.9.8. Suppose that $\lim s_n = \infty$. Prove that the set of subsequential limits of $\{s_n\}$ is
empty.
8.9.9. Finish the proof of Theorem 8.9.4, namely prove that the infimum of the set of all
subsequential limits of a bounded sequence equals the liminf of the sequence.
8.9.10. Let $\{s_n\}$ be a sequence of positive real numbers. Prove that $\limsup \frac{1}{s_n} = \frac{1}{\liminf s_n}$.
Chapter 9: Infinite series and power series
In this chapter we introduce infinite sums and what it means for them to make sense,
and we introduce functions that are infinite sums of higher and higher powers of a variable
x. The workhorse of infinite sums is the geometric series, and geometric series are almost the only
type of infinite sums that we can compute numerically. The more technical sections on
differentiability of power series then allow us to compute many more infinite sums.
Warning: Finite sums are possible by the field axioms, but infinite sums need not
make any sense at all. For example,
1 + (−1) + 1 + (−1) + 1 + (−1) + 1 + (−1) + 1 + (−1) + 1 + (−1) + 1 + (−1) · · ·
may be taken to be 0 or 1 depending on which consecutive pairs are grouped together in
a sum, or it could even be summed to exactly 3 by taking the first three positive 1s, and
then matching each successive −1 in the sum with the next not-yet-used +1. In this way
each ±1 in the expression is used exactly once, so that the sum can indeed be taken to be
3. Similarly, we can make the limit be 4, −17, et cetera.
This should convince you that in infinite sums the order of addition matters. For
more on the order of addition, see Theorem 9.2.8 and Exercise 9.2.16.
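The regrouping described in the warning can be made concrete. The sketch below is an illustration only (the generator names are invented for this example): it compares the partial sums of the terms in their original order with the partial sums of the regrouped expression, in which the first three $+1$s are taken alone and every later group is a $-1$ paired with the next unused $+1$.

```python
from itertools import islice


def alternating_terms():
    """Yield 1, -1, 1, -1, ... in the original order."""
    sign = 1
    while True:
        yield sign
        sign = -sign


def regrouped_terms():
    """Yield the groups 1, 1, 1, (-1 + 1), (-1 + 1), ...

    Every +1 and every -1 of the original expression is used exactly once.
    """
    yield from (1, 1, 1)
    while True:
        yield (-1) + 1


def partial_sums(terms, count):
    """Return the first `count` partial sums of an iterable of numbers."""
    total = 0
    sums = []
    for t in islice(terms, count):
        total += t
        sums.append(total)
    return sums


plain = partial_sums(alternating_terms(), 10)
grouped = partial_sums(regrouped_terms(), 10)
print(plain)    # oscillates between 1 and 0
print(grouped)  # stabilizes at 3
```

The ungrouped partial sums never settle, while the regrouped ones are constantly 3 from the third group on; this is exactly why the expression has no well-defined value without further conventions.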
Infinite sums require special handling. Limits of sequences prepared the ground.
Definition 9.1.1. For an infinite sequence $\{a_n\}$ of complex numbers, define the corresponding
sequence of partial sums
$$\{a_1,\ a_1 + a_2,\ a_1 + a_2 + a_3,\ a_1 + a_2 + a_3 + a_4,\ \ldots\}.$$
We denote the $n$th term of this sequence $s_n = \sum_{k=1}^n a_k$. The (infinite) series corresponding
to the sequence $\{a_n\}$ is $\sum_{k=1}^\infty a_k$ (whether this “infinite sum” makes sense or
not).
When the range of indices is clear, we write simply $\sum_k a_k$ or $\sum a_k$.
Example 9.1.2. For the sequence $\{1\}$, the sequence of partial sums is $\{n\}$. If $a \ne 1$, by
Example 1.6.4 the sequence of partial sums of $\{a^n\}$ is $\{\sum_{k=1}^n a^k\}_n = \{\frac{a^{n+1} - a}{a - 1}\}_n$. In particular,
the sequence of partial sums of $\{(-1)^n\}$ is $\{\frac{(-1)^n - 1}{2}\}_n = \{-1, 0, -1, 0, -1, 0, \ldots\}$.
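The closed form for the partial sums of $\{a^n\}$ can be checked against direct summation. This is a small sketch using exact rational arithmetic; the function names and the sample values of $a$ are chosen only for illustration.

```python
from fractions import Fraction


def partial_sum_direct(a, n):
    """Compute a^1 + a^2 + ... + a^n term by term."""
    return sum(a ** k for k in range(1, n + 1))


def partial_sum_closed(a, n):
    """Closed form (a^(n+1) - a) / (a - 1), valid for a != 1."""
    return (a ** (n + 1) - a) / (a - 1)


# Sample bases (assumed for illustration); Fraction keeps the arithmetic exact.
for a in (Fraction(1, 2), Fraction(-1), Fraction(3)):
    for n in range(1, 8):
        assert partial_sum_direct(a, n) == partial_sum_closed(a, n)

alternating = [partial_sum_closed(Fraction(-1), n) for n in range(1, 7)]
print(alternating)  # partial sums of (-1)^n
```

For $a = -1$ the computed partial sums alternate between $-1$ and $0$, matching the sequence displayed in the example.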
We have encountered shifted sequences, such as $\{a_n\}_{n \ge m}$, and similarly there are
shifted series: $\sum_{k=m}^\infty a_k$ stands for the limit of the sequence of partial sums, but in this
case, the $n$th partial sum is $s_n = a_m + a_{m+1} + a_{m+2} + \cdots + a_{m+n-1}$.
Definition 9.1.3. (Most of the time and by default we take $m = 1$.) The series $\sum_{k=m}^\infty a_k$
converges to $L \in \mathbb{C}$ if the sequence $\{\sum_{k=m}^{m+n-1} a_k\}_n$ converges to $L$. We say then that $L$
is the sum of the series and we write $\sum_{k=m}^\infty a_k = L$.
If the series does not converge, it diverges.
Just like for sequences, when a series diverges, it may diverge to ∞ or to −∞, or it
may simply have no limit.
Since a sequence $\{s_n\}$ converges if and only if $\{s_n + c\}$ converges (where $c$ is any
constant), it follows that $\sum_{k=1}^\infty a_k$ converges if and only if $\sum_{k=m}^\infty a_k$ converges, and then
$$\sum_{k=1}^\infty a_k = a_1 + a_2 + \cdots + a_{m-1} + \sum_{k=m}^\infty a_k.$$
The following follows immediately from the corresponding results for sequences:
Theorem 9.1.4. Let $A = \sum_{k=1}^\infty a_k$, $B = \sum_{k=1}^\infty b_k$, and $c \in \mathbb{C}$.
(1) If $A, B \in \mathbb{C}$, then
$$\sum_{k=1}^\infty (a_k + c b_k) = A + cB.$$
Section 9.1: Infinite series 287
Remark 9.1.5. This theorem justifies the binary operation of addition on the set of
convergent infinite series:
$$\left(\sum_{k=1}^\infty a_k\right) + \left(\sum_{k=1}^\infty b_k\right) = \sum_{k=1}^\infty (a_k + b_k).$$
Example 9.1.9. The series $\sum_{k=1}^\infty \frac{1}{k^2}$ converges.
Theorem 9.1.11. Let $\{a_n\}$ be a sequence of non-negative real numbers. If the sequence
$\{a_1 + a_2 + \cdots + a_n\}$ of partial sums is bounded above, then $\sum a_n$ converges.
Proof. The sequence {a1 + a2 + · · · + an } of partial sums is monotone and bounded above,
so it converges by Theorem 8.6.4.
Theorem 9.1.12. Let $\{a_n\}$ be a sequence of complex numbers, and let $m$ be a positive
integer. Then $\sum_{k=1}^\infty a_k$ converges if and only if $\sum_{k=m}^\infty a_k$ converges. Furthermore in this
case, $\sum_{k=1}^\infty a_k = (a_1 + a_2 + \cdots + a_{m-1}) + \sum_{k=m}^\infty a_k$.
Section 9.2: Convergence and divergence theorems for series 289
9.1.1. Let $r \in \mathbb{C}$ satisfy $|r| \ge 1$. Prove that $\sum_{k=1}^\infty r^k$ diverges.
9.1.2. Compute and justify the following sums: $\sum_{k=1}^\infty \frac{1}{2^k}$, $\sum_{k=6}^\infty \frac{1}{3^k}$, $\sum_{k=8}^\infty \frac{2}{5^k}$.
9.1.3. Prove that $\sum_{k=1}^\infty \frac{2k+1}{k^2 \cdot (k+1)^2}$ converges, and find the sum. (Hint: Do some initial
experimentation with partial sums, find a pattern for partial sums, and prove the pattern
with mathematical induction.)
9.1.4. Let $a_n = (-1)^n$. Prove that the sequence of partial sums $\{a_1 + a_2 + \cdots + a_n\}$ is
bounded but does not converge. How does this not contradict Theorem 9.1.11?
9.1.5. For each $k \in \mathbb{N}^+$ let $x_k$ be an integer between 0 and 9.
i) Prove that $\sum_{k=1}^\infty \frac{x_k}{10^k}$ converges.
rational number.
9.1.6. Prove that $\sum_{k=1}^\infty a_k$ converges if and only if $\sum_{k=1}^\infty \operatorname{Re} a_k$ and $\sum_{k=1}^\infty \operatorname{Im} a_k$ converge.
Furthermore, $\sum_{k=1}^\infty a_k = \sum_{k=1}^\infty \operatorname{Re} a_k + i \sum_{k=1}^\infty \operatorname{Im} a_k$.
9.1.7. Suppose that $\lim a_n \ne 0$. Prove that $\sum_{k=1}^\infty a_k$ diverges.
9.1.8. Determine with proof which series converge.
i) $\sum_{k=1}^\infty \frac{1}{k^k}$.
ii) $\sum_{k=1}^\infty \left(\frac{1}{k^3} + ik\right)$.
iii) $\sum_{k=1}^\infty \frac{1}{k!}$.
9.1.9. Let $\{a_n\}$ and $\{b_n\}$ be complex sequences, and let $m \in \mathbb{N}^+$ such that for all $n \ge 1$,
$a_n = b_{n+m}$. Prove that $\sum_{k=1}^\infty a_k$ converges if and only if $\sum_{k=1}^\infty b_k$ converges.
Now suppose that for all real numbers $\epsilon > 0$ there exists a real number $N > 0$ such
that for all integers $n \ge m > N$, $|a_{m+1} + a_{m+2} + \cdots + a_n| < \epsilon$. This means that the
sequence $\{s_n\}$ of partial sums is Cauchy. By Theorem 8.7.5, $\{s_n\}$ is convergent. Then by
the definition of series, $\sum_k a_k$ converges.
Theorem 9.2.2. (Comparison test (for series)) Let $\{a_n\}$ be a real and $\{b_n\}$ a complex
sequence such that for all $n$, $a_n \ge |b_n|$.
(1) If $\sum a_k$ converges then $\sum b_k$ converges.
(2) If $\sum b_k$ diverges then $\sum a_k$ diverges.
Proof. Note that all $a_n$ are non-negative, and for all integers $n \ge m$,
Theorem 9.2.3. (Ratio test) (Compare to Remark 8.9.3.) Let $\{a_n\}$ be a sequence of
non-zero complex numbers.
(1) If $\limsup \left|\frac{a_{n+1}}{a_n}\right| < 1$, then $\sum |a_k|$ and $\sum a_k$ converge.
(2) If $\liminf \left|\frac{a_{n+1}}{a_n}\right| > 1$, then $\sum |a_k|$ and $\sum a_k$ diverge.
This ratio test for convergence of series does not apply when $\limsup \left|\frac{a_{n+1}}{a_n}\right| = 1$ or
$\liminf \left|\frac{a_{n+1}}{a_n}\right| = 1$. The reason is that under these assumptions the series $\sum_k |a_k|$ and
$\sum_k a_k$ sometimes converge and sometimes diverge. For example, if $a_n = 1/n$ for all $n$,
then $\limsup \left|\frac{a_{n+1}}{a_n}\right| = \liminf \left|\frac{a_{n+1}}{a_n}\right| = 1$, and $\sum_{k=1}^\infty \frac{1}{k}$ diverges; whereas if $a_n = 1/n^2$ for all $n$,
then $\limsup \left|\frac{a_{n+1}}{a_n}\right| = \liminf \left|\frac{a_{n+1}}{a_n}\right| = 1$, and $\sum_{k=1}^\infty \frac{1}{k^2}$ converges.
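The two examples can be compared numerically. The sketch below (function names are illustrative, and floating-point arithmetic is only approximate) prints the ratio $|a_{n+1}/a_n|$ and a partial sum for $a_n = 1/n$ and for $a_n = 1/n^2$: both ratios approach 1, while only the second family of partial sums stays bounded. The numbers are suggestive, not a proof.

```python
def harmonic(n):
    return 1 / n


def inverse_square(n):
    return 1 / n ** 2


def ratio(a, n):
    """Return |a(n+1) / a(n)|, the quantity examined by the ratio test."""
    return abs(a(n + 1) / a(n))


def partial_sum(a, n):
    """Return a(1) + a(2) + ... + a(n)."""
    return sum(a(k) for k in range(1, n + 1))


for n in (10, 1_000, 100_000):
    print(
        f"n = {n:6d}: "
        f"ratio 1/n -> {ratio(harmonic, n):.6f}, sum -> {partial_sum(harmonic, n):.4f}; "
        f"ratio 1/n^2 -> {ratio(inverse_square, n):.6f}, sum -> {partial_sum(inverse_square, n):.6f}"
    )
```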
Theorem 9.2.4. (Root test for series) Let $\{a_n\}$ be a sequence of complex numbers.
Let $L = \limsup |a_n|^{1/n}$.
(1) If $L < 1$, then $\sum_k |a_k|$, $\sum_k a_k$ converge.
(2) If $L > 1$, then $\sum_k |a_k|$, $\sum_k a_k$ diverge.
Proof. If $L < 1$, choose $r \in (L, 1)$. Since $L = \inf\{\sup\{|a_n|^{1/n} : n \ge m\} : m \ge 1\}$ and $r > L$,
there exists $m \ge 1$ such that $r > \sup\{|a_n|^{1/n} : n \ge m\}$. Thus for all $n \ge m$, $r^n \ge |a_n|$.
Thus by the Comparison test (Theorem 9.2.2), since the geometric series $\sum r^k$ converges,
we have that $\sum a_k$ and $\sum |a_k|$ converge. The proof of (2) is similar, and is omitted here.
Example 9.2.6. Recall from Example 9.1.7 that the harmonic series $\sum_k 1/k$ diverges.
But the alternating series $\sum_k (-1)^k/k$ converges by this theorem. (In fact, $\sum_k (-1)^k/k$
converges to $-\ln 2$, but proving the limit is harder; see the proof after Example 9.7.7.)
We examine this infinite series more carefully:
$$-1 + \frac12 - \frac13 + \frac14 - \frac15 + \frac16 - \frac17 + \frac18 - \frac19 + \frac1{10} - \cdots$$
We cannot rearrange the terms in this series as $\frac12 + \frac14 + \frac16 + \frac18 + \frac1{10} + \cdots$ minus
$1 + \frac13 + \frac15 + \frac17 + \frac19 + \cdots$ because both of these series diverge to infinity. More on
Definition 9.2.7. A series $\sum_{k=1}^\infty a_k$ is called absolutely convergent if $\sum_{k=1}^\infty |a_k|$ converges.
Theorem 9.2.8. Let $\sum_{k=1}^\infty a_k$ be absolutely convergent. Let $r : \mathbb{N}^+ \to \mathbb{N}^+$ be a bijective
function. Then $\sum_{k=1}^\infty a_{r(k)}$ converges.
Proof. By the comparison test (Theorem 9.2.2), $\sum_{k=1}^\infty a_k$ converges to some number $L \in \mathbb{C}$.
We will prove that $\sum_{k=1}^\infty a_{r(k)}$ converges to $L$.
Let $\epsilon > 0$. Since $\sum_{k=1}^\infty a_k = L$, there exists $N_0 > 0$ such that for all integers $n > N_0$,
$|\sum_{k=1}^n a_k - L| < \epsilon/2$. Since $\sum_{k=1}^\infty |a_k|$ converges, by Cauchy’s criterion for sequences there
exists $N_1 > 0$ such that for all integers $n \ge m > N_1$, $\sum_{k=m+1}^n |a_k| < \epsilon/2$. Pick an
integer $N > \max\{N_0, N_1\}$. Since $r$ is a bijective and hence an invertible function, we can
define $M = \max\{r^{-1}(1), r^{-1}(2), \ldots, r^{-1}(N)\}$. Let $n$ be an integer strictly bigger than $M$.
Then by definition the set $\{r(1), r(2), \ldots, r(n)\}$ contains $1, 2, \ldots, N$. Let $K = \{r(k) : k \le n\} \setminus \{1, 2, \ldots, N\}$. Then
$$\left|\sum_{k=1}^n a_{r(k)} - L\right| = \left|\sum_{k=1}^N a_k + \sum_{k \in K} a_k - L\right| \le \left|\sum_{k=1}^N a_k - L\right| + \sum_{k \in K} |a_k| < \epsilon$$
since $N > N_0$ and all the finitely many indices in $K$ are strictly bigger than $N > N_1$.
Theorem 9.2.9. (Integral test for series convergence) Let $f : [1, \infty) \to [0, \infty)$ be a
decreasing function. Suppose that for all $n \in \mathbb{N}^+$, $\int_1^n f$ exists. Then $\sum_{k=1}^\infty f(k)$ converges
if and only if $\lim_{n\to\infty} \int_1^n f$ exists and is a real number.
(It is not necessarily the case that $\sum_{k=1}^\infty f(k)$ equals $\lim_{n\to\infty} \int_1^n f$.)
Proof. Since $f$ is decreasing, for all $x \in [n, n+1]$, $f(n) \ge f(x) \ge f(n+1)$. Thus
$$f(n+1) = \int_n^{n+1} f(n+1)\,dx \le \int_n^{n+1} f(x)\,dx \le \int_n^{n+1} f(n)\,dx = f(n).$$
Suppose that $\sum_k f(k)$ converges. Then by the definition this means that
$\lim_{n\to\infty} (f(1) + f(2) + \cdots + f(n))$ exists. By the displayed inequalities, $\int_1^{n+1} f = \int_1^2 f + \int_2^3 f + \cdots + \int_n^{n+1} f \le f(1) + f(2) + \cdots + f(n)$, so that $\{\int_1^{n+1} f\}_n$ is a bounded
increasing sequence of real numbers, so that $\lim_{n\to\infty} \int_1^{n+1} f$ exists, and hence that
$\lim_{n\to\infty} \int_1^n f$ exists.
Conversely, suppose that $\lim_{n\to\infty} \int_1^n f$ exists. Let $L \in \mathbb{R}$ be this limit. By the
displayed inequalities, $f(2) + f(3) + \cdots + f(n+1) \le \int_1^2 f + \int_2^3 f + \cdots + \int_n^{n+1} f = \int_1^{n+1} f$. Since
$f$ takes on only non-negative values, this says that $f(2) + f(3) + \cdots + f(n+1) \le L$. Thus
$\{f(2) + \cdots + f(n+1)\}_n$ is a non-decreasing sequence that is bounded above by $L$. Thus
by Theorem 8.6.4, this sequence converges. By adding the constant $f(1)$, the sequence
$\{f(1) + \cdots + f(n+1)\}_n$ converges, so that by the definition of series, $\sum_k f(k)$ converges.
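The sandwich used in this proof can be checked on a concrete function. The sketch below takes $f(x) = 1/x^2$ as an assumed example (it is decreasing and non-negative on $[1, \infty)$) and uses the exact value $\int_1^n x^{-2}\,dx = 1 - 1/n$; the helper names are illustrative.

```python
from fractions import Fraction


def f(k):
    """The assumed example f(x) = 1/x^2, evaluated exactly at an integer k."""
    return Fraction(1, k * k)


def integral_from_1_to(n):
    """Exact value of the integral of 1/x^2 over [1, n], namely 1 - 1/n."""
    return 1 - Fraction(1, n)


def sandwich_holds(n):
    """Check f(2)+...+f(n+1) <= integral over [1, n+1] <= f(1)+...+f(n)."""
    lower = sum(f(k) for k in range(2, n + 2))
    upper = sum(f(k) for k in range(1, n + 1))
    return lower <= integral_from_1_to(n + 1) <= upper


print(all(sandwich_holds(n) for n in range(1, 200)))
```

Since the integrals are bounded by 1 here, the partial sums $f(1) + \cdots + f(n)$ are bounded by $f(1) + 1 = 2$, which is the mechanism the converse direction of the proof relies on.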
Theorem 9.2.10. (The p-series convergence test) Let $p$ be a real number. The series
$\sum_k k^p$ converges if $p < -1$ and diverges if $p \ge -1$.
Proof. If $p = -1$, then the series is the harmonic series and hence diverges. If $p \ge -1$,
then $n^p \ge n^{-1}$ for all $n$ by Theorem 7.6.5. Thus by the comparison test (Theorem 9.2.2),
$\sum_k k^p$ diverges.
Now suppose that $p < -1$. The function $f : [1, \infty) \to \mathbb{R}$ given by $f(x) = x^p$ is
differentiable, continuous, and decreasing. Since $f$ is continuous, for all positive integers $n$,
$\int_1^n f$ exists. By the Fundamental theorem of calculus, $\int_1^n f = \int_1^n x^p\,dx = \frac{n^{p+1} - 1}{p+1}$. By the
composite rule for sequences (either Theorem 8.5.8 or Theorem 8.4.7), since the function
that exponentiates by the positive $-(p+1)$ is continuous at all real numbers and $\lim \frac1n = 0$,
it follows that $\lim n^{p+1} = \lim \left(\frac1n\right)^{-(p+1)} = 0^{-(p+1)} = 0$, so that $\lim \int_1^n f$ exists and equals
$\frac{-1}{p+1}$. Thus by the Integral test (Theorem 9.2.9), $\sum_k k^p$ converges.
9.2.3. For each of the following series, determine with proof whether they converge or
diverge. You may need to use Examples 8.2.9 and 8.3.6.
i) $\sum_{k=1}^\infty \frac{3i}{\sqrt{k^2 + k^4}}$.
ii) $\sum_{k=1}^\infty \frac{1}{\sqrt{k}}$.
iii) $\sum_{k=1}^\infty \frac{1}{k^3}$.
iv) $\sum_{k=1}^\infty \frac{2^k}{k!}$.
v) $\sum_{k=1}^\infty \frac{2^k}{k^3}$.
vi) $\sum_{k=1}^\infty \frac{1}{k^{\sqrt 2}}$.
9.2.4. Find a convergent series $\sum_k a_k$ and a divergent series $\sum_k b_k$ with $\limsup |a_n|^{1/n} = \limsup |b_n|^{1/n} = 1$.
9.2.5. Make a list of all encountered criteria of convergence for series.
9.2.6. The goal of this exercise is to show that if the ratio test (Theorem 9.2.3) determines
the convergence/divergence of a series, then the root test (Theorem 9.2.4) determines it as
well. Let {an } be a sequence of non-zero complex numbers.
i) Suppose that $\limsup \left|\frac{a_{n+1}}{a_n}\right| < 1$. Prove that $\limsup |a_n|^{1/n} < 1$.
ii) Suppose that $\liminf \left|\frac{a_{n+1}}{a_n}\right| > 1$. Prove that $\limsup |a_n|^{1/n} > 1$.
9.2.7. Apply the ratio test (Theorem 9.2.3) and the root test (Theorem 9.2.4) to $\sum_{k=1}^\infty \frac{5^k}{k!}$.
Was one test easier? Repeat for $\sum_{k=1}^\infty \frac{1}{k^k}$.
9.2.8. Let $\{a_n\}$ be a complex sequence, and let $c \in \mathbb{C}$. Is it true that $\sum_{k=1}^\infty a_k$ converges
if and only if $\sum_{k=1}^\infty c a_k$ converges? If true, prove; if false, give a counterexample.
9.2.9. Let $a_n = (-1)^n/n$, $b_n = 2(-1)^n$. Prove that $\sum_{k=1}^\infty a_k$ converges, that for all $n$,
$|b_n| = 2$, and that $\sum_{k=1}^\infty \frac{a_k}{b_k}$ diverges.
9.2.10. (Compare with Exercise 9.2.9.) Suppose that $\sum_{k=1}^\infty |a_k|$ converges. Let $\{b_n\}$ be a
sequence of complex numbers such that for all $n$, $|b_n| > 1$. Prove that $\sum_{k=1}^\infty \frac{a_k}{b_k}$ converges.
9.2.11. (Summation by parts) Let $\{a_n\}$, $\{b_n\}$ be complex sequences. Prove that
$$\sum_{k=1}^n a_k b_k = a_n \sum_{k=1}^n b_k - \sum_{k=1}^{n-1} (a_{k+1} - a_k) \sum_{j=1}^k b_j.$$
Section 9.3: Power series 295
9.2.15. The following guides through another proof of the p-series convergence test for
$p < -1$. (Confer Theorem 9.2.10 for the first proof.)
i) Prove that for each positive integer $n$ there exists $c \in (n, n+1)$ such that $(p+1)c^p = (n+1)^{p+1} - n^{p+1}$. (Hint: Mean value theorem (Theorem 6.3.4).)
ii) Prove that for all positive integers $n$,
$$(n+1)^p < \frac{(n+1)^{p+1} - n^{p+1}}{p+1} \quad\text{and}\quad n^p < \frac{n^{p+1} - (n-1)^{p+1}}{p+1}.$$
iii) Prove by induction on $n \ge 1$ that
$$\frac{1}{p+1}\, n^{p+1} + \sum_{k=1}^n k^p \le \frac{p}{p+1}.$$
iv) Prove that the positive sequence $\{\sum_{k=1}^n k^p\}_n$ of partial sums is an increasing sequence
bounded above.
v) Prove that the sequence $\{\sum_{k=1}^n k^p\}_n$ converges, and so that $\sum_{k=1}^\infty k^p$ converges.
9.2.16. (Order of summation in infinite sums is important.)
i) Prove that $\sum_{k=1}^\infty \frac{1}{2k}$ and $\sum_{k=1}^\infty \frac{1}{2k+1}$ diverge. Refer to Example 9.2.6.
ii) Observe that $\sum_{k=1}^n \frac{1}{2k} - \sum_{k=1}^n \frac{1}{2k+1} = \sum_{k=2}^{2n+1} \frac{(-1)^k}{k}$.
iii) Argue that $\sum_{k=1}^\infty \frac{(-1)^k}{k} \ne \sum_{k=1}^\infty \frac{1}{2k} - \sum_{k=1}^\infty \frac{1}{2k+1}$. Why does this not contradict
the “expected” summation and difference rules?
9.2.17. (Raabe’s test) Let $a_1, a_2, \ldots$ be positive real numbers such that for some $\alpha > 1$
and for some $N \in \mathbb{N}$, $\frac{a_{n+1}}{a_n} \le 1 - \frac{\alpha}{n}$ for all $n \ge N$. Prove that $\sum_n a_n$ converges. (Hint:
Let $f(x) = 1 - x^\alpha$. Use the Mean value theorem to get $c \in (x, 1)$ such that $f'(c)(1 - x) = f(1) - f(x)$. Conclude that $1 - x^\alpha \le (1 - x)\alpha$. Apply this to $x = 1 - \frac1n$. Use that $\sum n^{-\alpha}$
converges.)
In this section we deal with sums where the index varies through N0 , and furthermore,
the terms of the sequence are special functions rather than constants:
where a0 , a1 , a2 , . . . are fixed complex numbers, and x is a variable that can be replaced by
any complex number. (By convention as on page 34, $0^0 = 1$.)
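power series                                                            $a_n$
$\sum_{k=0}^\infty x^k$                                                  $1$
$\sum_{k=0}^\infty x^{2k+1} = x + x^3 + x^5 + x^7 + \cdots$              $1$, if $n$ is odd; $0$, if $n$ is even
$\sum_{k=0}^\infty x^{2k+12} = x^{12} + x^{14} + x^{16} + x^{18} + \cdots$   $0$, if $n$ is odd; $0$, if $n$ is even and $n < 12$; $1$, if $n \ge 12$ is even
$\sum_{k=0}^\infty k x^k$                                                $n$
$\sum_{k=0}^\infty \frac{x^k}{k!}$                                       $\frac{1}{n!}$
$\sum_{k=0}^\infty (kx)^k$                                               $n^n$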
The partial sums of power series are polynomials, so a power series is a limit of
polynomials.
A power series is a function of $x$, and the domain is to be determined. Clearly, 0
is in the domain of every power series: plugging in $x = 0$ returns $\sum_{k=0}^\infty a_k 0^k = a_0$. If all
except finitely many $a_n$ are 0, then the power series is actually a polynomial, and is thus
defined on all of $\mathbb{C}$. When 1 is in the domain, evaluation of the power series $\sum_k a_k x^k$ at
$x = 1$ is the (ordinary) series $\sum_k a_k$.
The most important question that we address in this section is: which x are in the
domain of the power series, i.e., for which x does such an infinite series converge. We prove
that for every power series whose domain is not all of C there exists a non-negative real
number R such that the series converges for all x ∈ C with |x| < R and the series diverges
for all x ∈ C with |x| > R. What happens at x with |x| = R depends on the series.
Example 9.3.2. Let $f(x) = \sum_{k=0}^\infty x^k$. By Theorem 9.1.6, the domain of $f$ contains all
complex numbers with absolute value strictly smaller than 1, and by Theorem 9.1.10, the
domain of $f$ contains no other numbers, so that the domain equals $\{x \in \mathbb{C} : |x| < 1\}$.
Moreover, for all $x$ in the domain of $f$, by Theorem 9.1.6, $\sum_{k=0}^\infty x^k = \frac{1}{1-x}$. Note that the
domain of $\frac{1}{1-x}$ is strictly larger than the domain of $f$.
Whereas for general power series it is impossible to get a true numerical infinite sum,
for geometric series this is easy: $f(\frac12) = \sum_{k \ge 0} \frac{1}{2^k} = \frac{1}{1 - \frac12} = 2$, $f(\frac13) = \sum_{k \ge 0} \frac{1}{3^k} = \frac{1}{1 - \frac13} = \frac32$,
$f(0.6) = \sum_{k \ge 0} 0.6^k = \frac{1}{1 - 0.6} = 2.5$.
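These three evaluations can be compared with partial sums. The following sketch uses floating-point arithmetic, so agreement is only up to rounding; the function name and the cutoff of 200 terms are arbitrary choices for illustration.

```python
def geometric_partial_sum(x, n):
    """Return 1 + x + x^2 + ... + x^n, accumulating powers iteratively."""
    total = 0.0
    power = 1.0
    for _ in range(n + 1):
        total += power
        power *= x
    return total


for x in (1 / 2, 1 / 3, 0.6):
    approx = geometric_partial_sum(x, 200)
    exact = 1 / (1 - x)
    print(f"x = {x:.4f}: partial sum = {approx:.12f}, 1/(1-x) = {exact:.12f}")

# The same function accepts complex x with |x| < 1, for example x = 0.5i.
z = 0.5j
print(geometric_partial_sum(z, 200), 1 / (1 - z))
```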
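Theorem 9.3.3. (Root test for the convergence of power series) Let $\sum a_k x^k$ be a
power series, and let $\alpha = \limsup |a_n|^{1/n}$. Define $R$ by
$$R = \begin{cases} 1/\alpha, & \text{if } 0 < \alpha < \infty; \\ 0, & \text{if } \alpha = \infty; \\ \infty, & \text{if } \alpha = 0. \end{cases}$$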
Definition 9.3.4. The $R$ from Theorem 9.3.3 is called the radius of convergence of
the series $\sum a_k x^k$.
This is really a radius of convergence because inside the circle $B(0, R)$ the series
converges and outside of the circle the series diverges. Whether the power series converges
at points on the circle depends on the series; see Example 9.3.6.
Proof. By Example 8.2.9, $\lim \sqrt[n]{|n|} = 1$. For any integer $n \ge 2$, $1 \le \sqrt[n]{n-1} \le \sqrt[n]{n}$, so that
by the squeeze theorem, $\lim \sqrt[n]{n-1} = 1$. Thus by Theorem 8.9.6, $\limsup \sqrt[n]{|n(n-1)a_n|} = \lim \sqrt[n]{n(n-1)} \cdot \limsup \sqrt[n]{|a_n|} = \limsup \sqrt[n]{|a_n|}$. This proves that the $\alpha$ as in Theorem 9.3.3
for $\sum a_k x^k$ is the same as the $\alpha$ for $\sum k(k-1)a_k x^k$, which proves that these two power
series have the same radius of convergence. The proofs of the other parts are similar.
Example 9.3.6. We have seen that $\sum x^k$ has radius of convergence 1. By Theorem 9.1.10,
this series does not converge at any point on the unit circle. By the previous theorem,
the radius of convergence of $\sum_{k=1}^\infty \frac1k x^k$ and of $\sum_{k=1}^\infty \frac{1}{k^2} x^k$ is also 1. By the p-series test
(Theorem 9.2.10) or by the harmonic series fact, $\sum_{k=1}^\infty \frac1k x^k$ diverges at $x = 1$, and by the
alternating series test (Theorem 9.2.5), $\sum_{k=1}^\infty \frac1k x^k$ converges at $x = -1$. By the p-series
test (Theorem 9.2.10), $\sum_{k=1}^\infty \frac{1}{k^2}$ converges, so that by the comparison test (Theorem 9.2.2),
$\sum_{k=1}^\infty \frac{1}{k^2} x^k$ converges on the unit circle.
Example 9.3.8. Similarly to the last example and by Theorem 9.3.5, $\sum x^k$, $\sum k x^k$,
$\sum k^2 x^k$, $\sum k(k-1)x^k$, $\sum x^{2k+1}$ all have radius of convergence 1.
Theorem 9.3.9. (Ratio test for the convergence of power series) Suppose that all
$a_n$ are non-zero complex numbers.
(1) If $|x| < \liminf \left|\frac{a_n}{a_{n+1}}\right|$, then $\sum |a_k||x|^k$ and $\sum a_k x^k$ converge.
(2) If $|x| > \limsup \left|\frac{a_n}{a_{n+1}}\right|$, then $\sum |a_k||x|^k$ and $\sum a_k x^k$ diverge.
Thus if $\lim \left|\frac{a_n}{a_{n+1}}\right|$ exists, it equals the radius of convergence of $\sum a_k x^k$.
Warning: Compare with the Ratio test for convergence of series (Theorem 9.2.3)
where fractions are different. Explain to yourself why that is necessarily so, possibly after
going through the proof below.
Proof. The two series converge in case $x = 0$, so that we may assume that $x \ne 0$. We may
then apply the Ratio test for convergence of series (Theorem 9.2.3):
$$\begin{aligned}
\limsup \left|\frac{a_{n+1}x^{n+1}}{a_n x^n}\right| &= |x| \limsup \left|\frac{a_{n+1}}{a_n}\right| \\
&= |x| \inf\left\{\sup\left\{\left|\frac{a_{n+1}}{a_n}\right| : n \ge m\right\} : m \ge 1\right\} \\
&= |x| \inf\left\{\frac{1}{\inf\left\{\left|\frac{a_n}{a_{n+1}}\right| : n \ge m\right\}} : m \ge 1\right\} \\
&= \frac{|x|}{\sup\left\{\inf\left\{\left|\frac{a_n}{a_{n+1}}\right| : n \ge m\right\} : m \ge 1\right\}} \\
&= \frac{|x|}{\liminf \left|\frac{a_n}{a_{n+1}}\right|}.
\end{aligned}$$
If this is strictly smaller than 1, then the two series converge, which proves (1). Similarly,
$$\liminf \left|\frac{a_{n+1}x^{n+1}}{a_n x^n}\right| = \frac{|x|}{\limsup \left|\frac{a_n}{a_{n+1}}\right|},$$
and if this is strictly larger than 1, then the two series diverge. The last part is then
immediate by the definition of radius of convergence.
Examples 9.3.10.
(1) In Example 9.3.8 we established via the root test that $\sum x^{2k+1}$ has radius of
convergence 1. The ratio test is inapplicable for this power series. However, note
that $\sum x^{2k+1} = x \sum (x^2)^k$, and by the ratio test for series (not power series), this
series converges for non-zero $x$ if $\limsup |(x^2)^{k+1}/(x^2)^k| < 1$, i.e., if $|x^2| < 1$, i.e.,
if $|x| < 1$, and it diverges if $|x| > 1$.
(2) The radius of convergence of $\sum_{k=1}^\infty \frac{x^k}{k^k}$ is $\infty$. For this we apply the root test:
$\alpha = \limsup \left|\frac{1}{n^n}\right|^{1/n} = \limsup \frac1n = 0$.
(3) By the ratio test, the radius of convergence of $\sum_{k=1}^\infty \frac{x^k}{k!}$ is $\lim \frac{1/n!}{1/(n+1)!} = \lim \frac{(n+1)!}{n!} = \lim (n+1) = \infty$. The root test gives $\alpha = \limsup |1/n!|^{1/n} = \limsup (1/n!)^{1/n}$, and
by Example 8.3.6 this is 0. Thus the radius of convergence is $\infty$ also by the root
test.
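The quantities appearing in the two tests can be tabulated for these coefficient sequences. The sketch below is illustrative only: the function names are invented, coefficients are stored as exact fractions, and the values at a single finite $n$ merely suggest the behavior of the limit or limsup.

```python
from fractions import Fraction
from math import factorial


def inv_factorial(n):
    """Coefficients a_n = 1/n! of the series in (3)."""
    return Fraction(1, factorial(n))


def inv_n_to_n(n):
    """Coefficients a_n = 1/n^n of the series in (2)."""
    return Fraction(1, n ** n)


def inv_n(n):
    """Coefficients a_n = 1/n, whose series has radius of convergence 1."""
    return Fraction(1, n)


def ratio_estimate(a, n):
    """|a_n / a_{n+1}|, the quantity in the ratio test for power series."""
    return abs(a(n) / a(n + 1))


def root_estimate(a, n):
    """|a_n|^(1/n), whose limsup is the alpha of the root test."""
    return float(abs(a(n))) ** (1.0 / n)


print(ratio_estimate(inv_factorial, 50))        # equals n + 1, growing without bound
print(root_estimate(inv_factorial, 100))        # small, consistent with alpha = 0
print(root_estimate(inv_n_to_n, 100))           # equals 1/n up to rounding
print(float(ratio_estimate(inv_n, 1000)))       # close to 1
```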
gence at least 1.
ii) Give an example of $a_k \in \mathbb{C}$ for which the radius of convergence of $\sum_{k=0}^\infty a_k x^k$ is
strictly greater than 1. (Bonus points for easiest example.)
iii) Give an example of $a_k \in \mathbb{C}$ for which the radius of convergence of $\sum_{k=0}^\infty a_k x^k$ is
equal to 1.
9.3.3. Compare with the previous exercise: find the radii of convergence of the power series
$\sum_{k=0}^\infty x^k$, $\sum_{k=0}^\infty (-1)x^k$, $\sum_{k=0}^\infty (2x)^k$, $\sum_{k=0}^\infty x^k + \sum_{k=0}^\infty (-1)x^k$, $\sum_{k=0}^\infty x^k + \sum_{k=0}^\infty (2x)^k$.
9.3.4. Compute and justify the radius of convergence for the following series:
i) $\sum 3x^k$.
ii) $\sum (3x)^k$.
iii) $\sum 3k x^k$.
iv) $\sum k(3x)^k$.
v) $\sum \frac{x^k}{k^3}$.
vi) $\sum \frac{3x^k}{k^3}$.
vii) $\sum \frac{(3x)^k}{k^3}$.
9.3.5. Compute and justify the radius of convergence for the following series:
i) $\sum \frac{x^k}{k^k}$.
ii) $\sum \frac{x^k}{k^{2k}}$.
iii) $\sum \frac{x^k}{(2k)^k}$.
9.3.6. Let $\sum_{k=0}^\infty a_k x^k$ and $\sum_{k=0}^\infty b_k x^k$ be convergent power series with radii of convergence
$R_1$ and $R_2$, respectively. Let $R = \min\{R_1, R_2\}$. Prove that $\sum_{k=0}^\infty a_k x^k \pm \sum_{k=0}^\infty b_k x^k = \sum_{k=0}^\infty (a_k \pm b_k)x^k$ is convergent with radius of convergence at least $R$. (Hint: Theorem 9.1.4.)
9.3.7. Let $R$ be the radius of convergence of $\sum_{k=0}^\infty a_k x^k$. Let $p$ be a positive integer.
i) Determine the radius of convergence of $\sum_{k=p}^\infty a_k x^k$.
ii) Determine the radius of convergence of $\sum_{k=0}^\infty a_k x^{pk}$.
9.3.8. What would be a sensible definition for generalized power series $\sum_{k=0}^\infty a_k (x-a)^k$?
What would be a sensible definition of the radius of convergence of $\sum_{k=0}^\infty a_k (x-a)^k$? Draw
a relevant picture in $\mathbb{C}$.
Power series are functions. In this section we prove that they are differentiable at all
x inside the circle of convergence. Since a differentiable function is continuous, it follows
that a power series is continuous inside the circle of convergence.
Recall that for any differentiable function $f$,
$$f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h},$$
and any power series $\sum_{k=0}^\infty a_k x^k$ is the limit of a sequence: $\sum_{k=0}^\infty a_k x^k = \lim\{\sum_{k=0}^n a_k x^k\}_n$.
Thus
$$\left(\sum_{k=0}^\infty a_k x^k\right)' = \lim_{h \to 0} \frac{\lim\{\sum_{k=0}^n a_k (x+h)^k\}_n - \lim\{\sum_{k=0}^n a_k x^k\}_n}{h}.$$
Section 9.4: Differentiation of power series 301
Certainly by the sum rule for convergent series, $\lim\{\sum_{k=0}^n a_k (x+h)^k\}_n - \lim\{\sum_{k=0}^n a_k x^k\}_n = \lim\{\sum_{k=0}^n a_k ((x+h)^k - x^k)\}_n$, and by the constant rule we get that
$$\left(\sum_{k=0}^\infty a_k x^k\right)' = \lim_{h \to 0} \lim_{n \to \infty} \sum_{k=0}^n a_k \frac{(x+h)^k - x^k}{h}.$$
If we could change the order of limits, then we would get by the polynomial rule for
derivatives that
$$\left(\sum_{k=0}^\infty a_k x^k\right)' = \lim_{n \to \infty} \lim_{h \to 0} \sum_{k=0}^n a_k \frac{(x+h)^k - x^k}{h} = \lim_{n \to \infty} \sum_{k=0}^n k a_k x^{k-1}.$$
In fact, it turns out that this is the correct derivative, but our reasoning above was based
on an unproven (and generally false) switch of the two limits.
We give a correct proof of derivatives in the rest of the section. By Theorem 9.3.5
we already know that the series $\sum_{k=0}^\infty a_k x^k$ and $\sum_{k=0}^\infty k a_k x^{k-1} = \sum_{k=1}^\infty k a_k x^{k-1}$ have the
same radius of convergence.
The following theorem is not necessarily interesting in its own right, but it is a
stepping stone in the proof of derivatives of power series.
Theorem 9.4.1. Let $\sum_k a_k x^k$ have radius of convergence $R$. Let $c \in \mathbb{C}$ satisfy $|c| < R$. Then
the function $g(x) = \sum_{k=1}^\infty a_k (x^{k-1} + c x^{k-2} + c^2 x^{k-3} + \cdots + c^{k-1})$ is defined on $B(0, R)$
and is continuous at $c$.
Theorem 9.3.5 also the radius of convergence of $\sum_k k a_k x^{k-1}$ is $R$, so that $\sum_k k a_k d^{k-1}$
converges. Then from the Comparison theorem (Theorem 9.2.2) and from
$$\begin{aligned}
|a_k (x^{k-1} + c x^{k-2} + c^2 x^{k-3} + \cdots + c^{k-1})|
&\le |a_k| (|x|^{k-1} + |c||x|^{k-2} + |c|^2 |x|^{k-3} + \cdots + |c|^{k-1}) \quad \text{(by the triangle inequality)} \\
&\le |a_k| (d^{k-1} + d\,d^{k-2} + d^2 d^{k-3} + \cdots + d^{k-1}) \\
&= k |a_k| d^{k-1}
\end{aligned}$$
Theorem 9.4.2. Let $f(x) = \sum_{k=0}^\infty a_k x^k$ have radius of convergence $R$. Then $f$ is differentiable
on $B(0, R)$ and $f'(x) = \sum_{k=0}^\infty k a_k x^{k-1} = \sum_{k=1}^\infty k a_k x^{k-1}$. The radius of convergence
of $f'$ equals $R$.
which is the function $g$ from the previous theorem. In that theorem we proved that $g$ is
continuous at $c$, so that
$$f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} = \lim_{x \to c} g(x) = g(c) = \sum_{k=1}^\infty k a_k c^{k-1}.$$
so that $a_n = b_n$.
Differentiation of power series is a powerful tool. For all complex numbers $x \in B(0, 1)$
the geometric series $\sum_{k=0}^\infty x^k$ converges to $\frac{1}{1-x}$. Certainly it is easier to compute $\frac{1}{1-x}$ than
the infinite sum. We can exploit geometric series and derivatives of power series to compute
many other infinite sums. Below we provide a few illustrations of the method.
Example 9.5.1. $\sum_{k=1}^\infty \frac{k}{2^{k-1}} = 4$.
Proof. Let $f(x) = \sum_{k=0}^\infty x^k$. This is the geometric series with radius of convergence 1 that
converges to $\frac{1}{1-x}$ (Example 9.3.2). By Theorem 9.4.2, $f'(x) = \sum_{k=0}^\infty k x^{k-1} = \sum_{k=1}^\infty k x^{k-1}$,
and by Theorem 9.3.5, the radius of convergence of $f'$ is also 1. Thus $\frac12$ is in the domain
of $f'$. Since we have two ways of expressing $f$ (as power series and as a rational function),
Section 9.5: Numerical evaluations of some series 305
Example 9.5.2. $\sum_{k=0}^\infty \frac{k^2}{2^k} = 6$, and $\sum_{k=0}^\infty \frac{k^2}{2^{k-1}} = 12$.
Proof. As in the previous example we start with the geometric series $f(x) = \sum_{k=0}^\infty x^k$
that converges on $B(0, 1)$. Its derivative $f'(x) = \sum_{k=0}^\infty k x^{k-1} = \frac{1}{(1-x)^2}$ also converges
on $B(0, 1)$. Then $x f'(x) = \sum_{k=0}^\infty k x^k$ and its derivative $(x f'(x))' = \sum_{k=0}^\infty k^2 x^{k-1}$ also
converge on $B(0, 1)$. From
$$\sum_{k=0}^\infty k^2 x^{k-1} = (x f'(x))' = \left(\frac{x}{(1-x)^2}\right)' = \frac{1+x}{(1-x)^3}$$
we deduce that $\sum_{k=0}^\infty \frac{k^2}{2^{k-1}} = \frac{1 + \frac12}{(1 - 1/2)^3} = \frac{3/2}{1/8} = 12$, and $\sum_{k=0}^\infty \frac{k^2}{2^k} = 6$.
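Example 9.5.3. From the previous example we know that $\sum_{k=0}^\infty k^2 x^{k-1} = \frac{1+x}{(1-x)^3}$. Multiplying
both sides by $x$ gives $\sum_{k=0}^\infty k^2 x^k = \frac{x + x^2}{(1-x)^3}$, and differentiation gives
$$\sum_{k=0}^\infty k^3 x^{k-1} = \frac{d}{dx}\left(\frac{x + x^2}{(1-x)^3}\right) = \frac{1 + 4x + x^2}{(1-x)^4}.$$
It follows that $\sum_{k=0}^\infty \frac{k^3}{2^{k-1}} = \frac{1 + 4 \cdot 0.5 + 0.5^2}{(1 - 0.5)^4} = 52$ and $\sum_{k=0}^\infty \frac{k^3}{2^k} = 26$.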
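The closed forms obtained by differentiation can be compared with truncated sums. This is a numerical sketch only: the helper names, the cutoff of 400 terms, and the sample points are assumptions made for illustration, and floating-point agreement is evidence rather than proof.

```python
def power_sum(p, x, terms=400):
    """Truncation of sum_{k>=1} k^p x^(k-1); the k = 0 term vanishes for p >= 1."""
    return sum(k ** p * x ** (k - 1) for k in range(1, terms + 1))


def closed_form_squares(x):
    """(1 + x) / (1 - x)^3, the closed form of sum k^2 x^(k-1) for |x| < 1."""
    return (1 + x) / (1 - x) ** 3


def closed_form_cubes(x):
    """(1 + 4x + x^2) / (1 - x)^4, the closed form of sum k^3 x^(k-1) for |x| < 1."""
    return (1 + 4 * x + x * x) / (1 - x) ** 4


for x in (-0.5, 0.25, 0.5):
    print(x, power_sum(2, x), closed_form_squares(x))
    print(x, power_sum(3, x), closed_form_cubes(x))
```

At $x = 0.5$ the two closed forms evaluate to 12 and 52, and halving them gives the sums with denominator $2^k$.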
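Summary: $\sum_{k=1}^\infty \frac{1}{2^{k-1}} = 2$, $\sum_{k=0}^\infty \frac{k}{2^{k-1}} = 4$, $\sum_{k=0}^\infty \frac{k^2}{2^{k-1}} = 12$, $\sum_{k=0}^\infty \frac{k^3}{2^{k-1}} = 52$. Is it
possible to predict $\sum_{k=0}^\infty \frac{k^4}{2^{k-1}}$?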
Example 9.5.4. $\sum_{k=1}^\infty \frac{1}{k 2^k} = \ln 2$.
Proof. Let $f(x) = \sum_{k=1}^\infty \frac{x^k}{k}$. The radius of convergence of $f$ is 1, so $1/2$ is in its domain.
Also, $f'(x) = \sum_{k=1}^\infty x^{k-1} = \frac{1}{1-x}$, so that $f(x) = -\ln(1 - x) + C$ for some constant $C$. In
particular, $C = -0 + C = -\ln(1 - 0) + C = f(0) = 0$, so that $\sum_{k=1}^\infty \frac{1}{k 2^k} = \sum_{k=1}^\infty \frac1k \left(\frac12\right)^k = f(\frac12) = -\ln(1 - \frac12) = -\ln(\frac12) = \ln(2)$.
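The value $\ln 2$ can be compared with partial sums of the series. The sketch below is a floating-point illustration; the function name, the cutoff of 60 terms, and the extra sample points are arbitrary choices.

```python
from math import log


def f_partial(x, n):
    """Return the partial sum x/1 + x^2/2 + ... + x^n/n of f."""
    return sum(x ** k / k for k in range(1, n + 1))


approx = f_partial(0.5, 60)
print(approx, log(2))

# The identity f(x) = -ln(1 - x) can be sampled at other points with |x| < 1.
for x in (-0.5, 0.1, 0.3):
    print(x, f_partial(x, 200), -log(1 - x))
```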
P∞ k
9.5.2. Consider k=1 k+2·3 k . According to my computer the partial sum of the first 1000
terms is a rational number whose numerator and denominator take several screen pages,
so this sum as a rational number is hard to comprehend. So instead I computed curtailed
decimal expansions for this and for a few other sums: according to my computer, the
partial sum of the first 10 terms is about 0.719474635555091, the partial sum of the first
100 terms is about 0.719487336054311, the partial sum of the first 1000 terms is about
0.719487336054311. What can you suspect? How would you go about proving it?
The first theorem in this section is about products of power series and the second is
about the convergence of a power series at the points on the boundary of the circle of the
radius of convergence. This section is meant as a reference and should be skipped in a first
class on power series.
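Theorem 9.6.1. Let $\sum_{k=0}^\infty a_kx^k$ and $\sum_{k=0}^\infty b_kx^k$ be convergent power series with radii
of convergence $R_1$ and $R_2$, respectively. Let $R = \min\{R_1, R_2\}$. Then on $B(0, R)$
the product sequence $\left\{\left(\sum_{k=0}^n a_kx^k\right) \cdot \left(\sum_{k=0}^n b_kx^k\right)\right\}_n$ converges to the power series
$\sum_{k=0}^\infty \left(\sum_{j=0}^k a_jb_{k-j}\right)x^k$.
We write this as $\left(\sum_{k=0}^\infty a_kx^k\right) \cdot \left(\sum_{k=0}^\infty b_kx^k\right) = \sum_{k=0}^\infty \left(\sum_{j=0}^k a_jb_{k-j}\right)x^k$ on $B(0, R)$.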
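The coefficient formula in Theorem 9.6.1 can be sketched on truncated coefficient lists. In the Python sketch below, the function name `cauchy_product` and the representation of a power series by a finite list of its first coefficients are assumptions made for illustration.

```python
def cauchy_product(a, b):
    """Coefficients c_k = sum_{j=0}^k a_j * b_{k-j} for k < min(len(a), len(b))."""
    n = min(len(a), len(b))
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(n)]

# The geometric series for 1/(1-x) has all coefficients equal to 1.
geometric = [1] * 8
print(cauchy_product(geometric, geometric))
```

Only the first `min(len(a), len(b))` product coefficients are returned, because later ones would depend on coefficients that the truncated lists do not contain.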
Since $|x| < R \le R_1, R_2$, the series $\sum_{k=0}^\infty |a_kx^k|$ and $\sum_{k=0}^\infty |b_kx^k|$ converge, to some real
numbers $\tilde L$, $\tilde K$, respectively. Thus there exists $N_2 > 0$ such that for all integers $n > N_2$,
\[
\sum_{k=n+1}^\infty |a_kx^k| = \sum_{k=0}^\infty |a_kx^k| - \sum_{k=0}^n |a_kx^k| < \frac{\epsilon}{4\tilde K + 1},
\]
and similarly there exists $N_3 > 0$ such that for all integers $n > N_3$, $\sum_{k=n+1}^\infty |b_kx^k| < \frac{\epsilon}{4\tilde L + 1}$.
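Now let $n > \max\{2N_1, N_2, N_3\}$. Then
\begin{align*}
\left|\sum_{k=1}^n \sum_{m=k}^n a_mb_{n+k-m}x^{n+k}\right|
&= \left|\sum_{k=1}^n \sum_{m=k}^{\lfloor n/2\rfloor} a_mx^m\, b_{n+k-m}x^{n+k-m}
+ \sum_{k=1}^n \sum_{m=\max\{k,\lfloor n/2\rfloor+1\}}^n a_mx^m\, b_{n+k-m}x^{n+k-m}\right| \\
&\le \sum_{m=1}^{\lfloor n/2\rfloor} |a_mx^m| \sum_{k=\lfloor n/2\rfloor}^n |b_kx^k|
+ \sum_{m=\lfloor n/2\rfloor}^n |a_mx^m| \sum_{k=1}^{\lfloor n/2\rfloor} |b_kx^k|,
\end{align*}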
Theorem 9.6.2. Let $f(x) = \sum_{k=0}^\infty c_kx^k$ have radius of convergence a positive real number $R$.
Let $a \in \mathbb{C}$ with $|a| = R$ such that $\sum_{k=0}^\infty c_ka^k$ converges. Let $B$ be an open ball
centered at $a$, and let $g : B \to \mathbb{C}$ be continuous. If $f(x) = g(x)$ for all $x \in B(0, R) \cap B$,
then $f(a) = g(a)$.
Proof. Let $\epsilon > 0$. We want to show that $|f(a) - g(a)| < \epsilon$, which via Theorem 2.11.4 then
proves that $f(a) = g(a)$. It suffices to prove the inequality $|f(a) - g(a)| < \epsilon$ under the
additional assumption that $\epsilon < 1$.
Since $\sum_{k=0}^\infty c_ka^k$ converges, by Cauchy’s criterion Theorem 9.2.1, there exists a positive
integer $N$ such that for all integers $n \ge N$, $\left|\sum_{k=N}^n c_ka^k\right| < \epsilon/4$. Let $s_m = \sum_{k=N}^{N+m} c_ka^k$.
By assumption, for all $m \ge 1$, $|s_m| < \epsilon/4 < 1$. Furthermore, $c_N = s_0/a^N$, and for $n > N$,
$c_{N+n} = (s_n - s_{n-1})/a^{N+n}$.
Let $r$ be a real number in the interval $(0, 1)$. By rewriting and by the triangle
inequality then
\begin{align*}
\sum_{k=N}^{N+n} c_k(ra)^k &= r^Nc_Na^N + c_{N+1}a^{N+1}r^{N+1} + \cdots + c_{N+n}a^{N+n}r^{N+n} \\
&= r^Ns_0 + (s_1 - s_0)r^{N+1} + (s_2 - s_1)r^{N+2} + \cdots + (s_n - s_{n-1})r^{N+n} \\
&= s_0(r^N - r^{N+1}) + s_1(r^{N+1} - r^{N+2}) + \cdots \\
&\qquad + s_{n-1}(r^{N+n-1} - r^{N+n}) + s_nr^{N+n} \\
&= r^N\left((1 - r)\left(s_0 + s_1r + \cdots + s_{n-1}r^{n-1}\right) + s_nr^n\right)
\end{align*}
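\begin{align*}
&= \left| g(a) - g(ra) + g(ra) - f(ra) + f(ra) - \sum_{k=0}^{N-1} c_ka^k + \sum_{k=0}^{N-1} c_ka^k - f(a) \right| \\
&\le |g(a) - g(ra)| + |g(ra) - f(ra)| + \left| f(ra) - \sum_{k=0}^{N-1} c_ka^k \right| + \left| \sum_{k=0}^{N-1} c_ka^k - f(a) \right| \\
&= |g(a) - g(ra)| + 0 + \left| \sum_{k=N}^\infty c_k(ra)^k + \sum_{k=0}^{N-1} c_k(ra)^k - \sum_{k=0}^{N-1} c_ka^k \right| + \left| \sum_{k=N}^\infty c_ka^k \right| \\
&< \frac{\epsilon}{4} + \left| \sum_{k=N}^\infty c_k(ra)^k \right| + \left| \sum_{k=0}^{N-1} c_k(ra)^k - \sum_{k=0}^{N-1} c_ka^k \right| + \left| \sum_{k=N}^\infty c_ka^k \right| \\
&< \frac{\epsilon}{4} + \frac{\epsilon}{4} + \frac{\epsilon}{4} + \frac{\epsilon}{4} \\
&= \epsilon.
\end{align*}
Since $\epsilon$ is arbitrary, by Theorem 2.11.4, $f(a) = g(a)$.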
B(0, 1) ∪ {1}, that it is continuous on B(0, 1), and that when the domain is restricted to
(B(0, 1) ∪ {1}) ∩ R, then f is continuous also at 1.
i) Prove that the domain of f includes B(0, 1) ∪ {1}.
ii) Prove that f is continuous on B(0, 1). (Hint: Invoke a theorem.)
† “Lemma” means a “helpful theorem”, possibly not interesting in its own right, but useful later. There
are examples of so-called lemmas that have turned out to be very interesting in their own right.
Set $L = \sum_k a_k$, and for $n \ge 0$, set $s_n = a_0 + a_1 + \cdots + a_n - L$.
iii) Prove that $\sum_{k=0}^n s_k(1-x)x^k = s_n(1-x^{n+1}) + \sum_{k=0}^n a_k(x^k - 1)$. (Hint: Summation
by parts, see Exercise 9.2.11.)
iv) Prove that for every $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that for all integers $n \ge N$,
$|s_n| < \epsilon$.
v) Prove that the sequence $\left\{\sum_{k=0}^n s_k(1 - x)x^k\right\}_n$ converges for all $x \in B(0, 1) \cup \{1\}$.
vi) Prove that $f(x) - L = \sum_{k=0}^\infty s_k(1 - x)x^k$ for all $x \in B(0, 1) \cup \{1\}$.
vii) Let $\epsilon > 0$. Prove that there exists $N \in \mathbb{R}$ such that for all integers $m \ge N$ and for
all real numbers $x$ in the interval $[0, 1]$, $\left|\sum_{k=m}^\infty s_k(1 - x)x^k\right| < \epsilon$.
viii) Let $\epsilon > 0$. Prove that for every positive integer $n$ there exists $\delta > 0$ such that for
all complex numbers $x \in (B(0, 1) \cup \{1\}) \cap B(1, \delta)$, $\left|\sum_{k=0}^n s_k(1 - x)x^k\right| < \epsilon$.
ix) Let $\epsilon > 0$. Prove that there exists $\delta > 0$ such that for every real number $x$ in the
interval $[\max\{0, 1 - \delta\}, 1]$, $\left|\sum_{k=0}^\infty s_k(1 - x)x^k\right| < \epsilon$.
x) Let g : [0, 1] → C be defined by g(x) = f (x). Prove that g is continuous.
*9.6.4. Let $f(x) = \sum_{k=0}^\infty a_kx^k$ and $g(x) = \sum_{k=0}^\infty b_kx^k$.
i) Express (f ◦ g)(x) as a power series in terms of the ai and bj .
ii) What is special for the power series of the composition if a0 = 0?
iii) Assuming that a0 = 0, and given the radii of convergence for f and g, what would
it take to find the radius of convergence of the composition series?
Definition 9.7.1. Let $a$ be in the domain of a function $f$, and assume that $f$ has derivatives
of all orders at $a$. The Taylor series of $f$ (centered) at $a$ is the series $\sum_{k=0}^\infty \frac{f^{(k)}(a)}{k!}(x-a)^k$.
Remark 9.7.2. If a = 0, the Taylor series is a power series (as defined in this chapter),
and for other a this is also a power series but of a more general kind which can nevertheless
be easily transformed into a usual power series in the following sense. Let f : A → C. Set
$B = \{x - a : x \in A\}$, and $g : B \to \mathbb{C}$ as $g(x) = f(x + a)$. By straightforward calculus,
$\sum_k a_k(x - a)^k$ is a Taylor series for $f$ at $a$ if and only if $\sum_k a_kx^k$ is a Taylor series for $g$
at 0. Furthermore, the radius of convergence of the Taylor series for $g$ is $R$ if and only if for
all $x \in \mathbb{C}$, the Taylor series $\sum_k a_k(x - a)^k$ for $f$ at $a$ converges at $x$ whenever $|x - a| < R$
Section 6.5 contains many examples of Taylor polynomials and a few more are com-
puted in this section. For some functions this computation is easier than for others. The
following theorem covers a trivial computation, and it applies in particular to all polynomial
functions.
Theorem 9.7.3. Let $f(x) = \sum_{k=0}^\infty a_kx^k$ be a power series whose radius of convergence is
not zero. Then the $n$th Taylor polynomial of $f$ centered at 0 is
\[
\sum_{k=0}^n a_kx^k,
\]
Remark 9.7.4. Theorems 6.5.5 and 6.5.6 say that some Taylor series are convergent power
series at each x near 0 and that they converge to the value of the original function. This
is certainly true for functions given as power series (such as in the theorem above). We
examine a few further examples in this section with a few left for the exercises. Beware
that the Taylor series need not converge to its function at any point other than at a, see
Exercise 9.7.3.
Example 9.7.5. Let $f(x) = e^x$, where the domain of $f$ is $\mathbb{R}$. It is easy to compute the
Taylor series for $f$:
\[
\sum_{k=0}^\infty \frac{x^k}{k!}.
\]
By the ratio test or the root test, this series converges for all $x \in \mathbb{C}$, not only for $x \in \mathbb{R}$.
(More on this series is in Sections 10.1 and 10.2, where we learn more about exponentiation
by complex numbers.) Let $x \in \mathbb{R}$. By Theorem 6.5.5, for every positive integer $n$ there
exists $d_n$ between 0 and $x$ such that
\[
e^x - \sum_{k=0}^n \frac{x^k}{k!} = \frac{e^{d_n}}{(n + 1)!}x^{n+1}.
\]
Thus
\[
\left| e^x - \sum_{k=0}^n \frac{x^k}{k!} \right| = \left| \frac{e^{d_n}}{(n + 1)!}x^{n+1} \right| \le \left| \frac{e^{|x|}}{(n + 1)!}x^{n+1} \right|. \tag{9.7.6}
\]
The sequence $\left\{\frac{e^{|x|}}{(n+1)!}x^{n+1}\right\}$ converges to 0 by the ratio test for sequences. This proves
that for each $x \in \mathbb{R}$, the Taylor polynomials for $e^x$ approximate $e^x$ arbitrarily closely, and
that the Taylor series at each $x$ equals $e^x$:
\[
e^x = \sum_{k=0}^\infty \frac{x^k}{k!}.
\]
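The remainder estimate for $e^x$ can be observed numerically. The Python sketch below compares the actual error of the degree-$n$ Taylor polynomial with the bound $e^{|x|}|x|^{n+1}/(n+1)!$; the function names `exp_taylor` and `remainder_bound` and the sample point $x = -2$ are assumptions chosen for illustration.

```python
import math

def exp_taylor(x, n):
    """Taylor polynomial of degree n for e**x centered at 0."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k          # term is now x**k / k!
        total += term
    return total

def remainder_bound(x, n):
    """The bound e**|x| * |x|**(n+1) / (n+1)! on the Taylor error."""
    return math.exp(abs(x)) * abs(x) ** (n + 1) / math.factorial(n + 1)

x = -2.0
for n in (5, 10, 20):
    error = abs(math.exp(x) - exp_taylor(x, n))
    print(n, error, remainder_bound(x, n))
```

The reference value `math.exp(x)` is itself a floating-point approximation, so for large $n$ the printed error reflects rounding as much as truncation.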
Example 9.7.7. Let $f(x) = \ln(x + 1)$, where the domain of $f$ is $(-1, \infty)$. It is straightforward
to compute the Taylor series for $f$ centered at 0:
\[
\sum_{k=1}^\infty \frac{(-1)^{k-1}x^k}{k} = -\sum_{k=1}^\infty \frac{(-x)^k}{k}.
\]
By the Ratio test for series, Theorem 9.2.3, the radius of convergence for this series is 1.
It is worth noting that the domain of the function f is all real numbers strictly bigger
than −1, whereas the computed Taylor series converges at all complex numbers in B(0, 1)
and diverges at all complex (including real) numbers whose absolute value is strictly bigger
than 1. By Example 9.1.7, the series diverges at x = −1, and by Theorem 9.2.5, it converges
at x = 1. Furthermore, by Exercise 9.2.14, the series converges at all complex numbers x
with $|x| \le 1$ and $x \ne -1$. You should test and invoke Theorem 6.5.5 to show that for all
$x \in (-1, 1)$, $\ln(x + 1) = -\sum_{k=1}^\infty \frac{(-x)^k}{k}$.
Incidentally, since $\ln(x + 1)$ is continuous on its domain and since its Taylor series
converges at $x = 1$ by the Alternating series test (Theorem 9.2.5), it follows from Theorem 9.6.2
that
\[
\sum_{k=1}^\infty \frac{(-1)^{k-1}}{k} = \ln 2.
\]
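The alternating sum above converges, but slowly, because $x = 1$ lies on the boundary of the disc of convergence. A small Python sketch (the function name `alternating_harmonic` and the sample sizes are assumptions) makes the slow convergence visible:

```python
import math

def alternating_harmonic(n):
    """Partial sum of (-1)**(k-1) / k for k = 1, ..., n."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))

for n in (10, 1000, 100000):
    s = alternating_harmonic(n)
    print(n, s, abs(s - math.log(2)))
```

Each tenfold increase in the number of terms gains roughly one more correct digit, in contrast with series evaluated strictly inside the disc of convergence.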
With the help of Taylor series we can similarly get finite-term expressions for other
infinite sums. Below is a harder example (and the reader may wish to skip it).
Example 9.7.8. Let $f(x) = \sqrt{1 - x}$. The domain of $f$ is the interval $(-\infty, 1]$. On the
sub-interval $(-\infty, 1)$ the function has derivatives of all orders:
\begin{align*}
f'(x) &= -\frac12 (1 - x)^{-1/2}, \\
f''(x) &= -\frac12 \cdot \frac12 (1 - x)^{-3/2}, \\
f'''(x) &= -\frac12 \cdot \frac12 \cdot \frac32 (1 - x)^{-5/2}, \ldots, \\
f^{(n)}(x) &= -\frac12 \cdot \frac12 \cdot \frac32 \cdots \frac{2n - 3}{2} (1 - x)^{-(2n-1)/2}.
\end{align*}
Thus the Taylor series for $f$ centered at 0 is
\[
1 - \frac12 x - \sum_{k=2}^\infty \frac{1 \cdot 3 \cdot 5 \cdots (2k - 3)}{k!\,2^k} x^k.
\]
For large $n$, the quotient of the $(n + 1)$st coefficient divided by the $n$th coefficient equals
$\frac{2n-1}{2(n+1)}$, whose limsup equals 1. Thus by the Ratio test for power series (Theorem 9.3.9),
the Taylor series converges absolutely on $B(0, 1)$, and in particular it converges absolutely
on $(-1, 1)$. Furthermore, the quotient $\frac{2n-1}{2(n+1)} = \frac{2(n+1)-3}{2(n+1)}$ above is at most $1 - \frac{4/3}{n}$ for all
$n \ge 8$. Thus by Raabe’s test (Exercise 9.2.17), this Taylor series converges at $x = 1$, and
so it converges absolutely on $[-1, 1]$. But what does it converge to? Consider $x \in (-1, 1)$.
We use the integral form of the Taylor’s remainder theorem (Exercise 7.4.9):
\begin{align*}
T_{n,f,0}(x) - f(x) &= \int_0^x \frac{(x - t)^n}{n!} \cdot \frac12 \cdot \frac12 \cdot \frac32 \cdots \frac{2n - 1}{2} (1 - t)^{-(2n+1)/2}\, dt \\
&= \int_0^x \left(\frac{x - t}{1 - t}\right)^n \frac{1 \cdot 3 \cdot 5 \cdots (2n - 1)}{n!\,2^{n+1}\sqrt{1 - t}}\, dt.
\end{align*}
As the integrand goes to 0 with n, and as |x| < 1, the integral goes to 0 with n, so that
the Taylor series converges to f on (−1, 1). (Incidentally, an application of Exercise 9.6.3
shows that the Taylor series is continuous on [−1, 1], and as f is also continuous there,
necessarily the Taylor series converges to f on [−1, 1].)
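The convergence claims of Example 9.7.8 can be explored numerically. The Python sketch below builds the coefficients from the quotient $\frac{2k-1}{2(k+1)}$ of consecutive coefficients; the function name `sqrt_taylor` and the sample points are assumptions made for illustration.

```python
import math

def sqrt_taylor(x, n):
    """Partial sum up to degree n of the Taylor series of sqrt(1 - x) at 0."""
    if n == 0:
        return 1.0
    coeff = 0.5                  # magnitude of the degree-1 coefficient
    total = 1.0 - coeff * x
    for k in range(1, n):
        # Quotient of the (k+1)st coefficient by the kth coefficient.
        coeff *= (2 * k - 1) / (2 * (k + 1))
        total -= coeff * x ** (k + 1)
    return total

print(sqrt_taylor(0.5, 60), math.sqrt(0.5))
print(sqrt_taylor(-0.5, 60), math.sqrt(1.5))
print(sqrt_taylor(1.0, 10000))   # slowly approaches sqrt(1 - 1) = 0
```

Inside $(-1, 1)$ the agreement is fast, while at the endpoint $x = 1$ many thousands of terms still leave a visible gap.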
Chapter 10: Exponential and trigonometric functions
The culmination of the chapter and of the course are the exponential and trigonomet-
ric functions with their properties. Section 10.4 covers more varied examples of L’Hôpital’s
rule than what was possible in Section 6.4.
Remarks 10.1.1.
(1) $E'(x) = \sum_{k=1}^\infty \frac{kx^{k-1}}{k!} = \sum_{k=1}^\infty \frac{x^{k-1}}{(k-1)!} = \sum_{k=0}^\infty \frac{x^k}{k!} = E(x)$.
(2) $E(0) = 1$.
(3) Let $a \in \mathbb{C}$. Define $g : \mathbb{C} \to \mathbb{C}$ by $g(x) = E(x + a) \cdot E(-x)$. Then $g$ is a product of
two differentiable functions, hence differentiable, and
\begin{align*}
g'(x) &= E'(x + a) \cdot E(-x) + E(x + a) \cdot E'(-x) \cdot (-1) \\
&= E(x + a) \cdot E(-x) - E(x + a) \cdot E(-x) \\
&= 0,
\end{align*}
so that $g$ is a constant function. This constant has to equal
(7) By parts (5) and (6), $E(na) = (E(a))^n$ for all integers $n$ and all $a \in \mathbb{C}$.
(8) Let a, b ∈ R. By part (4), E(a + bi) = E(a) · E(bi). Thus to understand the
function E : C → C, it suffices to understand E restricted to real numbers and E
restricted to i times real numbers. We accomplish the former in this section and
the latter in the next section.
Theorem 10.1.2. For all $x \in \mathbb{R}$, $E(x) = \ln^{-1}(x) = e^x$, the exponential function from Section 7.6.
In particular, $E(1)$ equals Euler’s constant $e$.
Proof. Define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = \frac{E(x)}{e^x}$. Then $f$ is differentiable and
\[
f'(x) = \frac{E'(x)e^x - E(x)(e^x)'}{(e^x)^2} = \frac{E(x)e^x - E(x)e^x}{(e^x)^2} = 0.
\]
Thus $f$ is a constant function, so that for all $x \in \mathbb{R}$, $\frac{E(x)}{e^x} = f(x) = f(0) = \frac{E(0)}{e^0} = \frac11 = 1$.
Thus $E(x) = e^x$. By Theorem 7.6.7, this is the same as $\ln^{-1}(x)$. In particular, $E(1) =
e^1 = e$.
The following is now immediate:
10.1.4. Find $a_0, a_1, a_2, \ldots \in \mathbb{R}$ such that the power series $\sum_{k=0}^\infty a_kx^k$ converges for all
$x \in \mathbb{R}$ to $e^{x^2}$. (Hint: Use the function $E$ to do this easily.) With that, determine a power
series whose derivative is $e^{(x^2)}$. (This problem is related to Exercise 9.4.2; it turns out that
there is no simpler, finite-term antiderivative of $e^{x^2}$.)
10.1.5. Determine a power series whose derivative is $2^{(x^2)}$. (It turns out that there is no
simpler, finite-term antiderivative of this function; see the no-closed-form discussion on
page 225.)
10.1.6. Express $\sum_{k=0}^\infty \frac{(-1)^k}{4^k k!}$ in terms of $e$.
10.1.7. Express $\sum_{k=0}^\infty \frac{2^k}{3^k (k+1)!}$ in terms of $e$.
10.1.8. Compute a Taylor polynomial of degree n of the function E centered at 0. (Do as
little work as possible, but do explain your reasoning.)
10.1.9. Numerically evaluate $e^2$ from the power series for $e^x = E(x)$ to 5 significant digits.
Prove that you have achieved the desired precision. (Hint: Theorem 6.5.5.)
10.1.10. Compute $\lim_{x\to a} E(x)$. (Do as little hard work as possible, but do explain your
reasoning.)
10.1.11. With proof, for which real numbers α does the sequence {E(nα)}n converge?
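In this section we restrict $E$ from the previous section to the imaginary axis. Thus
we look at $E(ix)$ where $x$ varies over real numbers. Note that
\begin{align*}
E(ix) &= \sum_{k=0}^\infty \frac{(ix)^k}{k!} \\
&= 1 + \frac{ix}{1!} + \frac{(ix)^2}{2!} + \frac{(ix)^3}{3!} + \frac{(ix)^4}{4!} + \frac{(ix)^5}{5!} + \frac{(ix)^6}{6!} + \frac{(ix)^7}{7!} + \frac{(ix)^8}{8!} + \cdots \\
&= 1 + i\frac{x}{1!} - \frac{x^2}{2!} - i\frac{x^3}{3!} + \frac{x^4}{4!} + i\frac{x^5}{5!} - \frac{x^6}{6!} - i\frac{x^7}{7!} + \frac{x^8}{8!} + \cdots.
\end{align*}
We define two new functions (their names may be purely coincidental, but pronounce them
for the time being as “cause” and “sin”):
\begin{align*}
\operatorname{COS}(x) &= \operatorname{Re} E(ix) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} + \cdots, \\
\operatorname{SIN}(x) &= \operatorname{Im} E(ix) = \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} + \cdots.
\end{align*}
Thus
\[
E(ix) = \operatorname{COS}(x) + i \operatorname{SIN}(x).
\]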
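The definitions of COS and SIN as real and imaginary parts of $E(ix)$ translate directly into a computation with partial sums. The Python sketch below is an illustration only: the truncation at 40 terms and the comparison against the library function `cmath.exp` are assumptions, and the comparison is a numerical sanity check rather than part of the development in the text.

```python
import cmath

def E(z, terms=40):
    """Partial sum of the power series sum_{k>=0} z**k / k!."""
    term, total = 1 + 0j, 1 + 0j
    for k in range(1, terms):
        term *= z / k          # term is now z**k / k!
        total += term
    return total

def COS(x):
    return E(1j * x).real

def SIN(x):
    return E(1j * x).imag

x = 1.0
print(COS(x), SIN(x))
print(abs(E(1j * x) - cmath.exp(1j * x)))
```

For moderately sized real $x$ the factorials in the denominators make 40 terms more than enough; larger $|x|$ would need more terms.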
Since E(ix) converges for all x, so do its real and imaginary parts. This means that COS
and SIN are defined for all x (even for all complex x; but in this section all x are real).
Thus the radius of convergence for the power series COS and SIN is infinity.
We can alternatively practice the root test on these functions to determine their radii
of convergence: the $\alpha$ for the function COS is $\limsup\left\{1, 0, \sqrt[2]{\tfrac{1}{2!}}, 0, \sqrt[4]{\tfrac{1}{4!}}, 0, \sqrt[6]{\tfrac{1}{6!}}, \ldots\right\}$.
Since 0s do not contribute to limsup of a sequence of non-negative numbers, it follows that
$\alpha = \limsup\left\{1, \sqrt[2]{\tfrac{1}{2!}}, \sqrt[4]{\tfrac{1}{4!}}, \sqrt[6]{\tfrac{1}{6!}}, \ldots\right\}$. By Example 8.3.6 this is 0. Thus the radius
of convergence of COS is $\infty$. The proof for the infinite radius of convergence of SIN is
similar.
Remarks 10.2.1.
(1) E(ix) = COS (x) + i SIN (x). This writing is not a random rewriting of the sum-
mands in E(ix), but it is the sum of the real and the imaginary parts.
(2) COS (0) = Re E(i · 0) = Re 1 = 1, SIN (0) = Im E(i · 0) = Im 1 = 0.
(3) By the powers appearing in the power series for the two functions, for all x ∈ R,
and
\[
(\operatorname{SIN}(x))' = (\operatorname{Im}(E(ix)))' = \operatorname{Im}((E(ix))') = \operatorname{COS}(x).
\]
Theorem 10.2.2. There exists a unique real number $s \in (0, \sqrt3)$ such that $E(is) = i$.
In other words, there exists a unique real number $s \in (0, \sqrt3)$ such that $\operatorname{COS}(s) = 0$ and
$\operatorname{SIN}(s) = 1$.
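Proof. The function $t - \operatorname{SIN}(t)$ is differentiable and its derivative is $1 - \operatorname{COS}(t)$, which is
always non-negative. Thus $t - \operatorname{SIN}(t)$ is non-decreasing for all real $t$, so that for all $t \ge 0$,
$t - \operatorname{SIN}(t) \ge 0 - \operatorname{SIN}(0) = 0$. Hence the function $\frac{t^2}{2} + \operatorname{COS}(t)$ has a non-negative derivative
on $[0, \infty)$, so that $\frac{t^2}{2} + \operatorname{COS}(t)$ is non-decreasing on $[0, \infty)$. Thus for all $t \ge 0$, $\frac{t^2}{2} + \operatorname{COS}(t)
\ge \frac{0^2}{2} + \operatorname{COS}(0) = 1$. It follows that the function $\frac{t^3}{6} - t + \operatorname{SIN}(t)$ is non-decreasing on
$[0, \infty)$. [How long will we keep going like this?] Thus for all $t \ge 0$, $\frac{t^3}{6} - t + \operatorname{SIN}(t)
\ge \frac{0^3}{6} - 0 + \operatorname{SIN}(0) = 0$. Thus $\frac{t^4}{24} - \frac{t^2}{2} - \operatorname{COS}(t)$ is non-decreasing on $[0, \infty)$, so that for all
$t \ge 0$, $\frac{t^4}{24} - \frac{t^2}{2} - \operatorname{COS}(t) \ge \frac{0^4}{24} - \frac{0^2}{2} - \operatorname{COS}(0) = -1$. We conclude that for all $t \ge 0$,
\[
\operatorname{COS}(t) \le \frac{t^4}{24} - \frac{t^2}{2} + 1.
\]
In particular, $\operatorname{COS}(\sqrt3) \le \frac{\sqrt3^4}{24} - \frac{\sqrt3^2}{2} + 1 = -\frac18 < 0$. We also know that $\operatorname{COS}(0) =
1 > 0$. Since COS is differentiable, it is continuous, so by the Intermediate value theorem
(Theorem 5.3.1) there exists $s \in (0, \sqrt3)$ such that $\operatorname{COS}(s) = 0$.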
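Now suppose that there exists a different $u \in (0, \sqrt3)$ such that $\operatorname{COS}(u) = 0$. By
possibly switching the names of $s$ and $u$ we may assume that $s < u$. Note that since $s$ and
$u$ are both positive, $s^2 < u^2$. Since $\frac{t^4}{24} - \frac{t^2}{2} - \operatorname{COS}(t)$ is non-decreasing on $[0, \infty)$, it follows
that $\frac{u^4}{24} - \frac{u^2}{2} - \operatorname{COS}(u) \ge \frac{s^4}{24} - \frac{s^2}{2} - \operatorname{COS}(s)$. Since $\operatorname{COS}(u) = \operatorname{COS}(s) = 0$, this says that
$\frac{u^4}{24} - \frac{u^2}{2} \ge \frac{s^4}{24} - \frac{s^2}{2}$. In other words,
\[
\frac14 = \frac{1}{24}\left(\sqrt3^2 + \sqrt3^2\right) \ge \frac{1}{24}(u^2 + s^2) = \frac{1}{24}(u^2 + s^2)\frac{u^2 - s^2}{u^2 - s^2} = \frac{1}{24}\cdot\frac{u^4 - s^4}{u^2 - s^2} \ge \frac12,
\]
which is a contradiction. Thus $s$ is unique.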
Since $(\operatorname{COS}(s))^2 + (\operatorname{SIN}(s))^2 = 1$, we have that $\operatorname{SIN}(s) = \pm 1$. Since COS is positive
on $(0, s)$, this means that SIN is increasing on $(0, s)$, and so by continuity of SIN,
$\operatorname{SIN}(s)$ must be positive, and hence $\operatorname{SIN}(s) = 1$. This proves that $E(is) = i$.
Remark 10.2.3. The proof above establishes the following properties for all $t \ge 0$:
\begin{gather*}
\operatorname{COS}(t) \le 1, \quad \operatorname{COS}(t) \ge 1 - \frac{t^2}{2}, \quad \operatorname{COS}(t) \le 1 - \frac{t^2}{2} + \frac{t^4}{24}, \\
\operatorname{SIN}(t) \le t, \quad \operatorname{SIN}(t) \ge t - \frac{t^3}{6}.
\end{gather*}
Observe that the polynomials above alternately over- and under-estimating COS are the
Taylor polynomials of COS (by Theorem 9.7.3) and are converging to the Taylor series
for COS. Similarly, the polynomials above alternately over- and under-estimating SIN are
the Taylor polynomials of SIN converging to the Taylor series for SIN.
Remark 10.2.4. (About the numerical value of $s$.) What role does $\sqrt3$ play in Theorem 10.2.2?
We deduced that $s < \sqrt3$ from knowing that $\operatorname{COS}(t) \le 1 - t^2/2 + t^4/4!$ at all
$t > 0$, from knowing that the latter polynomial evaluates to a negative number at $\sqrt3$, and
from an application of the Intermediate value theorem. You probably suspect that $s = \pi/2$.
One can check that $\pi/2 < \sqrt3$, and even that $\pi/2 < 0.91 \cdot \sqrt3$. However, $1 - t^2/2 + t^4/4!$
at $0.91 \cdot \sqrt3$ is positive, so we cannot conclude that $s < 0.91 \cdot \sqrt3$ from this argument.
But if in the proof of Theorem 10.2.2 we compute a few more upper and lower bounding
polynomials of SIN and COS, we obtain that $\operatorname{COS}(t) \le 1 - t^2/2 + t^4/4! - t^6/6! + t^8/8!$
for all $t > 0$, and this latter polynomial is negative at $0.91 \cdot \sqrt3$, which would then by the
Intermediate value theorem guarantee that $s < 0.91 \cdot \sqrt3$. By taking higher and higher
degree polynomials as in the proof we could get tighter and tighter upper bounds on $s$.
We can also get lower bounds on $s$. From $\operatorname{COS}(t) \ge 1 - t^2/2$ (as in the proof) we deduce
that $\operatorname{COS}(t)$ is positive at $0.999999 \cdot \sqrt2$ (with an arbitrary finite number of digits 9). Thus
the unique $s$ must be in $[\sqrt2, 0.91 \cdot \sqrt3)$. More steps in the proof also show that
$\operatorname{COS}(t) \ge 1 - t^2/2 + t^4/4! - t^6/6!$ for all $t > 0$, and this polynomial is positive at $0.9 \cdot \sqrt3$.
Thus by the Intermediate value theorem,
\[
0.9 \cdot \sqrt3 < s < 0.91 \cdot \sqrt3.
\]
Incidentally, my computer gives me 1.55884572681199 for $0.9 \cdot \sqrt3$ and 1.57616623488768
for $0.91 \cdot \sqrt3$. Thus we know the in-between $s$ to a few digits of precision. (Incidentally, my
computer gives me 1.5707963267949 for $\pi/2$.) By taking higher-degree polynomials in the
more-step version of the proof of Theorem 10.2.2 we can get even more digits of precision
of $s$. However, this numerical approach does not prove that $s$ is equal to $\pi/2$; we need
different reasoning to prove that $s = \pi/2$ (see Theorem 10.2.5).
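The numerical bracketing of $s$ described in this remark can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function name `cos_poly` is hypothetical, and the bisection uses a degree-30 partial sum as a stand-in for COS, which is a numerical convenience and not the argument of the text.

```python
import math

def cos_poly(t, degree):
    """Partial sum 1 - t^2/2! + t^4/4! - ... up to the given even degree."""
    return sum((-1) ** (k // 2) * t ** k / math.factorial(k)
               for k in range(0, degree + 1, 2))

lo, hi = 0.9 * math.sqrt(3), 0.91 * math.sqrt(3)
print(cos_poly(lo, 6))   # degree-6 bounding polynomial at 0.9 * sqrt(3)
print(cos_poly(hi, 4))   # degree-4 bounding polynomial at 0.91 * sqrt(3)
print(cos_poly(hi, 8))   # degree-8 bounding polynomial at 0.91 * sqrt(3)

# Bisection on a high-degree partial sum, standing in for COS.
a, b = lo, hi
for _ in range(60):
    mid = (a + b) / 2
    if cos_poly(mid, 30) > 0:
        a = mid
    else:
        b = mid
print(a, math.pi / 2)
```

The signs of the three printed polynomial values are what the interval argument relies on; the bisection then narrows the bracket far beyond what the low-degree bounds give.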
is the unique complex number on the unit circle centered at the origin that is on the half-
ray from the origin at angle t radians measured counterclockwise from the positive x-axis.
In terms of ratio geometry (see page 20), this says that in a right triangle with one angle
t radians, cos(t) is the ratio of the length of the adjacent edge divided by the length of
the hypotenuse, and sin(t) is the ratio of the length of the opposite edge divided by the
length of the hypotenuse. Geometrically it is clear, and we do assume this fact, that cos
and sin are continuous functions. In general continuity does not imply differentiability, but
for these two functions we use their continuity in the proof of their differentiability.
Theorem 10.2.5. The trigonometric functions cos and sin are differentiable. Furthermore,
COS and SIN are the functions cos and sin, s = π/2, and for any real number x, E(ix)
is the point on the unit circle centered at 0 at angle x radians measured counterclockwise
from the positive real axis.
Proof. We know that E(ix) is a point on the unit circle with coordinates (COS (x), SIN (x)).
What we do not yet know is whether the angle of this point counterclockwise from the
positive real axis equals x radians.
Let s be as in Theorem 10.2.2. The angle of E(is) = 0 + i · 1 = i measured in radians
counterclockwise from the positive real axis is π/2. Uniqueness of s guarantees that COS
is positive on [0, s). Thus SIN is increasing on [0, s], and it increases from 0 to 1. Thus for
all t ∈ (0, s), E(it) is in the first quadrant.
Let $n$ be an integer strictly greater than 1. Let $\alpha$ be the angle of $E(is/n)$ measured
counterclockwise from the positive real axis. For every integer $j$, $E(ijs/n) = (E(is/n))^j$, so by
Theorem 3.4.2, the angle of $E(ijs/n)$ is $j\alpha$, and in particular the angle $n\alpha$ coincides with
the angle of $E(is) = E(ins/n)$, i.e., with $\frac{\pi}{2} + 2\pi k$ for all integers $k$. Thus $\alpha = (\frac{\pi}{2} + 2\pi k)/n$
for some integer $k$. For all $j = 1, \ldots, n - 1$, $j/n \in (0, 1)$, and so $E(ijs/n)$ is in the first
quadrant by the previous paragraph. Thus
\[
0 < j\alpha = j\frac{\pi}{2n}(1 + 4k) \le \frac{\pi}{2}.
\]
The first inequality says that $k$ is not negative, and the second inequality used with $j = n-1$
says that $k$ is not positive. Thus $k = 0$ and $\alpha = \frac{\pi}{2n}$.
Together with Theorem 3.4.2 we just established that for all positive integers $n$ and
all integers $j$, the angle of $E(isj/n) = (E(is/n))^j$ is $j\alpha = \frac{\pi}{2} \cdot \frac{j}{n}$. In other words, for all
rational numbers $r$,
\[
\operatorname{COS}(sr) + i \operatorname{SIN}(sr) = E(isr) = \cos\left(\frac{\pi}{2} r\right) + i \sin\left(\frac{\pi}{2} r\right).
\]
By continuity of the functions $E$, cos and sin we conclude that for any real number $x$,
\[
\operatorname{COS}(sx) + i \operatorname{SIN}(sx) = E(isx) = \cos\left(\frac{\pi}{2} x\right) + i \sin\left(\frac{\pi}{2} x\right).
\]
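By matching the real and the imaginary parts in this equation we get that for all $x \in \mathbb{R}$,
\[
\cos(x) = \operatorname{COS}\left(\frac{2s}{\pi} x\right), \quad \sin(x) = \operatorname{SIN}\left(\frac{2s}{\pi} x\right),
\]
and so cos and sin are also differentiable functions. By the chain rule,
\begin{align*}
\cos'(x) &= -\frac{2s}{\pi} \operatorname{SIN}\left(\frac{2s}{\pi} x\right) = -\frac{2s}{\pi} \sin(x), \\
\sin'(x) &= \frac{2s}{\pi} \operatorname{COS}\left(\frac{2s}{\pi} x\right) = \frac{2s}{\pi} \cos(x).
\end{align*}
By geometry (see Exercise 1.1.19), for small positive real $x$, $0 < \cos(x) < \frac{\sin x}{x} < \frac{1}{\cos x}$.
Since cos is differentiable, it is continuous. Since $\cos(0) = \operatorname{COS}(0) = 1$, by the Squeeze
theorem (Theorem 4.4.11), $\lim_{x\to0^+} \frac{\sin x}{x} = 1$. Hence
\[
\frac{2s}{\pi} = \frac{2s}{\pi} \cos(0) = \sin'(0) = \lim_{x\to0} \frac{\sin x - \sin 0}{x} = \lim_{x\to0^+} \frac{\sin x}{x} = 1,
\]
whence $s = \pi/2$. This proves that $\cos = \operatorname{COS}$ and $\sin = \operatorname{SIN}$.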
Theorem 10.2.6. Every complex number x can be written in the form rE(iθ), where
r = |x| is the length and θ the angle of x counterclockwise from the positive real axis.
Proof. Let r = |x|. Thus x lies on the circle centered at 0 and of radius r. If r = 0, then
x = 0, and the angle is irrelevant. If instead r is non-zero, it is necessarily positive, and
x/r is a complex number of length |x|/r = 1. By Theorem 3.4.1, x/r and x have the same
angle. Let θ be that angle. Then x/r = E(iθ), so that x = rE(iθ).
Notation 10.2.7. It is common to write $E(x) = e^x$ for any complex number $x$. We have
seen that equality does hold if $x$ is real, but we adopt this notation also for other numbers.
With this, if $x, y \in \mathbb{R}$, then
\[
e^{x+iy} = e^xe^{iy},
\]
and $e^x$ is the length and $y$ is the angle of $e^{x+iy}$ counterclockwise from the positive $x$ axis.
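The length-and-angle reading of $e^{x+iy}$ can be checked with Python's standard `cmath` module. The sample values $x = 0.5$ and $y = 2$ are assumptions chosen so that $y$ lies in the range $(-\pi, \pi]$ in which `cmath.polar` reports angles.

```python
import cmath
import math

x, y = 0.5, 2.0
z = cmath.exp(complex(x, y))      # the complex number e^{x+iy}
length, angle = cmath.polar(z)    # modulus and angle, angle in (-pi, pi]
print(length, math.exp(x))
print(angle, y)
```

For $y$ outside $(-\pi, \pi]$ the reported angle differs from $y$ by an integer multiple of $2\pi$, which reflects the periodicity of $E(iy)$ in $y$.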
10.2.3. Let x, y ∈ R. Expand both sides of E(i(x + y)) = E(ix)E(iy) in terms of sin and
cos to prove that cos(x + y) = cos(x) cos(y) − sin(x) sin(y), and sin(x + y) = sin(x) cos(y) +
cos(x) sin(y). (Cf. Exercise 1.1.16.)
10.2.4. Prove de Moivre’s formula: for all $n \in \mathbb{Z}$ and all real numbers $x$,
\[
(\cos(x) + i \sin(x))^n = \cos(nx) + i \sin(nx).
\]
10.2.5. Prove that for every real number $x$ that is not an integer multiple of $2\pi$, the series
\[
\sum_{k=1}^\infty \frac{\cos(kx)}{k} \quad\text{and}\quad \sum_{k=1}^\infty \frac{\sin(kx)}{k}
\]
converge. (Hint: The complex series $\sum_{k=1}^\infty \frac{\cos(kx) + i \sin(kx)}{k}$ converges if and only if its
real and imaginary parts converge. Use De Moivre’s formula (Exercise 10.2.4) and Exercise 9.2.12.)
10.2.6. Express the following complex numbers in the form $e^{x+iy}$: $i$, $-i$, $1$, $-1$, $e$, $2 + 2i$.
Can 0 be expressed in this way? Justify.
10.2.7. Express $(3 + 4i)^6$ in the form $e^{a+bi}$ for some $a, b \in \mathbb{R}$. (Do not give numeric
approximations of $a$ and $b$.)
10.2.8. Prove that for any integer $m$, the sequence $\{E(2\pi imn)\}_n$ converges. Prove that
the sequence $\{E(\sqrt2\pi in)\}_n$ does not converge. Determine all real numbers $\alpha$ for which the
sequence $\{E(in\alpha)\}_n$ converges. Prove your conclusion.
10.2.9. For which real numbers α and β does the sequence {E(n(α + iβ))}n converge?
Prove your conclusion.
10.2.10. Let $f(x) = \sin(x^2)$ or $f(x) = \cos(x^2)$.
i) Determine the Taylor series for f centered at 0. (This should not be hard; do not
compute derivatives of all orders.)
ii) Determine a power series whose derivative is f . (There is no simpler or finite-term
antiderivative of f .)
iii) Determine the radius of convergence of the antiderivative power series.
iv) Use the Taylor remainder theorem (Theorem 6.5.5) to compute $\int_0^1 f$ to within five
digits of precision.
10.2.11. (This is from the Reviews section edited by P. J. Campbell, page 159 in Mathematics
Magazine 91 (2018).) Let $P$ be a polynomial function and $m$ a non-zero constant.
i) Prove that $P + \frac{P'}{m} + \frac{P''}{m^2} + \frac{P'''}{m^3} + \cdots$ is a polynomial function.
ii) Compute the derivative with respect to $x$ of
\[
-\frac{e^{-mx}}{m}\left(P(x) + \frac{P'(x)}{m} + \frac{P''(x)}{m^2} + \frac{P'''(x)}{m^3} + \cdots\right).
\]
iii) Integrate $\int e^{-mx}P(x)\,dx$.
*10.2.12. The goal of this exercise is to give another proof that $s = \pi/2$. Let $f : [0, 1] \to \mathbb{R}$
be given by $f(x) = \sqrt{1 - x^2}$. Then the graph of $f$ is the part of the circle of radius 1 that
is in the first quadrant.
i) Use lengths of curves and that $2\pi$ is the perimeter of the circle of radius 1 to justify
that
\[
\frac{\pi}{2} = \int_0^1 \sqrt{1 + \left(\frac{-2x}{2\sqrt{1 - x^2}}\right)^2}\, dx.
\]
(Caution: This is an improper integral.)
ii) Compute the integral via substitution $x = \operatorname{COS}(u)$. The integral should evaluate
to $s$ from Theorem 10.2.2.
10.3 Trigonometry
Definition 10.3.1. (Trigonometric functions) For any real number $x$, $e^{ix}$ is the complex
number that is on the unit circle at angle $x$ radians counterclockwise from the positive
horizontal axis.
(1) $\sin(x)$ is the imaginary part of $e^{ix}$.
(2) $\cos(x)$ is the real part of $e^{ix}$.
(3) $\tan(x) = \frac{\sin(x)}{\cos(x)}$, $\cot(x) = \frac{\cos(x)}{\sin(x)}$, $\sec(x) = \frac{1}{\cos(x)}$, $\csc(x) = \frac{1}{\sin(x)}$.
Remarks 10.3.2. The following is straightforward from the work in the previous section:
(1) sin and cos are differentiable and hence continuous functions whose domain is R.
(2) The Taylor series for sin is
\[
\sum_{k=0}^\infty (-1)^k \frac{x^{2k+1}}{(2k + 1)!}
\]
and it converges to $\sin(x)$ for all $x \in \mathbb{R}$. Similarly, the Taylor series for cos is
\[
\cos(x) = \sum_{k=0}^\infty (-1)^k \frac{x^{2k}}{(2k)!}.
\]
(3) All trigonometric functions (as in the definition above) are continuous and differentiable
on their domains. By the quotient rule for differentiation,
\begin{align*}
\tan'(x) &= (\sec(x))^2, & \cot'(x) &= -(\csc(x))^2, \\
\sec'(x) &= \sec(x) \tan(x), & \csc'(x) &= -\csc(x) \cot(x).
\end{align*}
(4) For all $x \in \mathbb{R}$, $\sin(x + 2\pi) = \sin(x)$ and $\cos(x + 2\pi) = \cos(x)$. This is not obvious
from the power series definition of $E(ix)$, but it follows from Theorem 10.2.6.
(5) $\sin(x) = \frac{e^{ix} - e^{-ix}}{2i}$.
(6) $\cos(x) = \frac{e^{ix} + e^{-ix}}{2}$.
(7) $(\sin(x))^2 + (\cos(x))^2 = 1$. (Recall from Remark 2.4.8 that for trigonometric functions
we also write this as $\sin^2(x) + \cos^2(x) = 1$, but for an arbitrary function $f$,
$f^2(x)$ refers to $f(f(x))$ rather than to $(f(x))^2$.)
(8) Dividing the equality in the previous part by (cos(x))2 yields the equality
(tan(x))2 + 1 = (sec(x))2 .
(9) (cot(x))2 + 1 = (csc(x))2 .
(10) sin0 (x) = cos(x).
(11) cos0 (x) = − sin(x).
(12) For all x ∈ R, sin(x) = − sin(−x) and cos(x) = cos(−x).
(13) sin and cos take on non-negative values on [0, π/2]. By the previous part, cos takes
on non-negative values on [−π/2, π/2].
(14) sin is increasing on [−π/2, π/2] since its derivative cos is non-negative there and
positive on (−π/2, π/2).
(15) sin, when restricted to [−π/2, π/2], has an inverse by Theorem 2.9.4. The inverse
is called arcsin. The domain of arcsin is [−1, 1].
(16) Geometrically, sin takes on non-negative values on [0, π] and positive values on
(0, π), so that cos, when restricted to [0, π], has an inverse, called arccos. The
domain of arccos is [−1, 1].
(17) Verify the details in the following. The derivative of tan is always non-negative, so
that on (−π/2, π/2), tan is invertible. Its inverse is called arctan, and the domain
of arctan is (−∞, ∞).
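Parts (5) through (7) of these remarks lend themselves to a quick numerical check. The Python sketch below uses the standard `cmath` and `math` modules; the function names `sin_via_exp` and `cos_via_exp` and the sample point are assumptions made for illustration.

```python
import cmath
import math

def sin_via_exp(x):
    """(e^{ix} - e^{-ix}) / (2i); the imaginary part is zero up to rounding."""
    return ((cmath.exp(1j * x) - cmath.exp(-1j * x)) / 2j).real

def cos_via_exp(x):
    """(e^{ix} + e^{-ix}) / 2; the imaginary part is zero up to rounding."""
    return ((cmath.exp(1j * x) + cmath.exp(-1j * x)) / 2).real

x = 0.7
print(sin_via_exp(x), math.sin(x))
print(cos_via_exp(x), math.cos(x))
print(sin_via_exp(x) ** 2 + cos_via_exp(x) ** 2)
```

Taking `.real` discards only rounding noise here, since for real $x$ both quotients are real numbers.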
Proof. We prove the first and the third parts; the proof of the second part is similar to the
proof of the first.
\begin{align*}
\arcsin'(x) &= \frac{1}{\sin'(\arcsin(x))} \quad \text{(by Theorem 6.2.7)} \\
&= \frac{1}{\cos(\arcsin(x))} \\
&= \frac{1}{\sqrt{(\cos(\arcsin(x)))^2}}
\end{align*}
10.3.14.
(1) Compute $\lim_{N \to \infty} \int_{-N}^{N} \sin(x)\,dx$.
(2) Compute $\lim_{N \to \infty} \int_{-2N\pi}^{2N\pi + \pi/2} \sin(x)\,dx$.
(3) Compute $\lim_{N \to \infty} \int_{-2N\pi + \pi/2}^{2N\pi} \sin(x)\,dx$.
(4) Conclude that sin is not integrable over (−∞, ∞) in the sense of Exercise 7.4.10.
10.3.15. Let k, j ∈ Z. Prove the following equalities (possibly by multiple integration by
parts (Exercise 7.4.6)):
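i) $\int_{-\pi}^{\pi} \sin(kt)\cos(jt)\,dt = 0$.
ii) $\int_{-\pi}^{\pi} \sin(kt)\sin(jt)\,dt = \begin{cases} 0, & \text{if } j \ne k \text{ or } jk = 0, \\ \pi, & \text{otherwise.} \end{cases}$
iii) $\int_{-\pi}^{\pi} \cos(kt)\cos(jt)\,dt = \begin{cases} 0, & \text{if } j \ne k \text{ or } jk = 0, \\ \pi, & \text{otherwise.} \end{cases}$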
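Before attempting the proof, the three equalities can be checked numerically. The sketch below approximates each integral with a midpoint rule and compares it with the stated value. It samples only positive integers j and k (a restriction chosen for this sketch), and the function names, step count, and tolerance are illustrative assumptions. It is a plausibility check, not a substitute for the integration by parts.

```python
import math


def midpoint_integral(f, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def sin_cos(k, j):
    return midpoint_integral(lambda t: math.sin(k * t) * math.cos(j * t),
                             -math.pi, math.pi)


def sin_sin(k, j):
    return midpoint_integral(lambda t: math.sin(k * t) * math.sin(j * t),
                             -math.pi, math.pi)


def cos_cos(k, j):
    return midpoint_integral(lambda t: math.cos(k * t) * math.cos(j * t),
                             -math.pi, math.pi)


def check(max_index=3, tol=1e-9):
    """Compare i), ii), iii) with the stated values for 1 <= j, k <= max_index."""
    for k in range(1, max_index + 1):
        for j in range(1, max_index + 1):
            expected = math.pi if j == k else 0.0
            if abs(sin_cos(k, j)) > tol:
                return False
            if abs(sin_sin(k, j) - expected) > tol:
                return False
            if abs(cos_cos(k, j) - expected) > tol:
                return False
    return True


print(check())
```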
10.3.16. Prove that for any non-zero integer k,
\[ \int x \sin(kx)\,dx = -\frac{1}{k}\, x \cos(kx) + \frac{1}{k^2} \sin(kx) + C, \]
and that
\[ \int x \cos(kx)\,dx = \frac{1}{k}\, x \sin(kx) + \frac{1}{k^2} \cos(kx) + C. \]
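One way to gain confidence in the two formulas of Exercise 10.3.16 before proving them is to differentiate the claimed antiderivatives numerically and compare with the integrands. The sketch below does this with a central difference. The function names, the step size h, and the sample values of k and x are assumptions made for illustration.

```python
import math


def antiderivative_x_sin(x, k):
    """Claimed antiderivative of x*sin(kx), taking C = 0."""
    return -x * math.cos(k * x) / k + math.sin(k * x) / k**2


def antiderivative_x_cos(x, k):
    """Claimed antiderivative of x*cos(kx), taking C = 0."""
    return x * math.sin(k * x) / k + math.cos(k * x) / k**2


def central_difference(F, x, k, h=1e-5):
    """Numerical derivative of F(., k) at x."""
    return (F(x + h, k) - F(x - h, k)) / (2 * h)


def max_error(ks=(-3, -1, 1, 2, 5), xs=(-2.0, -0.5, 0.0, 0.7, 1.9)):
    """Largest gap between the numerical derivative and the integrand."""
    worst = 0.0
    for k in ks:
        for x in xs:
            worst = max(
                worst,
                abs(central_difference(antiderivative_x_sin, x, k)
                    - x * math.sin(k * x)),
                abs(central_difference(antiderivative_x_cos, x, k)
                    - x * math.cos(k * x)),
            )
    return worst


print(max_error())  # small, limited by the finite-difference step
```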
i) Compute the Fourier series for $f(x) = \sin(3x)$. (Hint: Exercise 10.3.15.)
ii) Compute the Fourier series for $f(x) = (\sin(x))^2$. (Hint: Exercises 10.3.9 and
10.3.15.)
iii) Compute $a_0, a_1, a_2, b_1, b_2$ for $f(x) = x$. (Hint: Exercise 10.3.16.)
Compute the Fourier series in this sense for the functions in the previous exercise.
10.3.20. Use Exercise 10.2.11 to elaborate on the two types of Fourier series from the
previous two exercises for any polynomial function. Note that Fourier series are also functions
in the form of infinite sums but they are not power series.
Section 10.4: Examples of L’Hôpital’s rule
Several versions of L’Hôpital’s rule were proved in Section 6.4. With an increased
repertoire of functions we can now show more interesting examples, including
counterexamples if some hypothesis is omitted. All work in this section is in the exercises.
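\[
\begin{aligned}
{} &= \lim_{x \to 0} \frac{(\sin(x))^2}{x(1 + \cos(x))} \\
&= \lim_{x \to 0} \frac{\sin(x)}{x} \cdot \frac{\sin(x)}{1 + \cos(x)} \\
&= 1 \cdot \frac{0}{1 + 1} \\
&= 0.
\end{aligned}
\]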
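The value 0 obtained in the last line can be sanity-checked by evaluating the quotient from the first displayed line at points approaching 0. This is only a numerical illustration; the name `quotient` and the sample points are choices made for this sketch.

```python
import math


def quotient(x: float) -> float:
    """The expression (sin(x))^2 / (x (1 + cos(x))) from the computation above."""
    return math.sin(x) ** 2 / (x * (1 + math.cos(x)))


# The values shrink roughly like x/2 as x approaches 0.
for x in (0.1, 0.01, 0.001):
    print(x, quotient(x))
```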
The next two exercises are taken from R. J. Bumcrot’s article “Some subtleties in
L’Hôpital’s rule” in Century of Calculus. Part II. 1969–1991, edited by T. M. Apostol,
Section 10.5: Trigonometry for the computation of some series
The goal of this section is to handle a few more infinite series and power series.
Namely, so far we have been able to add up few infinite series numerically, and here we
compute for example $\sum_{k=1}^{\infty} \frac{1}{k^2}$ in two different ways. (In Example 9.1.9 we proved that
the sum converges but we did not know what it converges to.) All work is in the exercises.
As the title suggests, these uses are not covered in a standard first analysis course and it
is fine to omit this section.
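vii) Use the inequality x ≤ (π/2) sin(x) for x ∈ [0, π/2] (see Exercise 10.3.8) to estimate
\[
\begin{aligned}
0 \le J_N &\le \frac{\pi^2}{4} \int_0^{\pi/2} (\sin(x))^2 (\cos(x))^{2N}\,dx \\
&= \frac{\pi^2}{4} (I_N - I_{N+1}) \\
&= \frac{\pi^2}{4} \cdot \frac{1}{2N + 2} \cdot I_N.
\end{aligned}
\]
viii) Prove that $\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}$.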
*10.5.2. (This exercise is used in Exercise 10.5.3.) Use de Moivre’s formula (see
Exercise 10.2.4) to prove that for all x ∈ R and all positive integers n,
\[
\cos(nx) + i \sin(nx) = (\sin(x))^n (\cot(x) + i)^n
= (\sin(x))^n \sum_{k=0}^{n} \binom{n}{k} i^k (\cot(x))^{n-k}.
\]
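The identity in Exercise 10.5.2 can be tested numerically before it is proved. The sketch below compares the left side with the binomial expansion on the right at a few sample points. It assumes sin(x) ≠ 0 so that cot(x) can be evaluated; the function names and sample values are illustrative.

```python
import math


def left_side(n: int, x: float) -> complex:
    """cos(nx) + i sin(nx)."""
    return complex(math.cos(n * x), math.sin(n * x))


def right_side(n: int, x: float) -> complex:
    """(sin x)^n times the binomial expansion of (cot x + i)^n.

    Assumes sin(x) != 0, since cot(x) = cos(x) / sin(x).
    """
    cot = math.cos(x) / math.sin(x)
    expansion = sum(
        math.comb(n, k) * (1j ** k) * cot ** (n - k) for k in range(n + 1)
    )
    return math.sin(x) ** n * expansion


def max_gap(ns=range(1, 7), xs=(0.3, 1.1, 2.5, -0.8)):
    """Largest difference between the two sides over the sample grid."""
    return max(abs(left_side(n, x) - right_side(n, x)) for n in ns for x in xs)


print(max_gap())  # should be at rounding-error scale
```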
iv) Use Exercise 10.5.2 to prove that
\[ \frac{m(2m-1)}{3} < \frac{(2m+1)^2}{\pi^2} \sum_{k=1}^{m} \frac{1}{k^2} < m + \frac{m(2m-1)}{3}. \]
v) Prove that $\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}$. (Hint: multiply the previous part by $\pi^2/4m^2$.)
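The inequality in part iv) and the squeeze suggested by the hint in part v) can be observed numerically. The sketch below evaluates the three quantities for a given m and then rescales them by π²/(4m²) as in the hint. The function names are hypothetical, and the computation illustrates the claim rather than proving it.

```python
import math


def partial_sum(m: int) -> float:
    """Sum of 1/k^2 for k = 1, ..., m."""
    return sum(1.0 / k**2 for k in range(1, m + 1))


def sandwich(m: int):
    """The lower bound, middle term, and upper bound from part iv)."""
    lower = m * (2 * m - 1) / 3
    middle = (2 * m + 1) ** 2 / math.pi**2 * partial_sum(m)
    upper = m + m * (2 * m - 1) / 3
    return lower, middle, upper


def scaled_sandwich(m: int):
    """Part iv) multiplied by pi^2 / (4 m^2), as in the hint of part v)."""
    factor = math.pi**2 / (4 * m**2)
    return tuple(factor * value for value in sandwich(m))


# All three scaled quantities approach pi^2 / 6 as m grows.
for m in (1, 10, 100, 1000):
    print(m, scaled_sandwich(m), math.pi**2 / 6)
```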
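10.5.4. Fill in the explanations and any missing steps in the two double improper
integrals. (While integrating with respect to x, think of y as a constant, and while integrating
with respect to y, think of x as a constant.)
\[
\begin{aligned}
\int_0^\infty \int_0^\infty \frac{1}{1+y} \cdot \frac{1}{1+x^2 y}\,dx\,dy
&= \int_0^\infty \frac{1}{1+y} \int_0^\infty \frac{1}{1+x^2 y}\,dx\,dy \\
&= \int_0^\infty \frac{1}{1+y} \left( \lim_{N \to \infty} \int_0^N \frac{1}{1+x^2 y}\,dx \right) dy \\
&= \int_0^\infty \frac{1}{1+y} \left( \lim_{N \to \infty} \left. \frac{\arctan(\sqrt{y}\,x)}{\sqrt{y}} \right|_{x=0}^{N} \right) dy \\
&= \frac{\pi}{2} \int_0^\infty \frac{1}{\sqrt{y}\,(1+y)}\,dy \\
&= \frac{\pi}{2} \int_0^\infty \frac{2u}{u(1+u^2)}\,du \quad \text{(by substitution } y = u^2\text{)} \\
&= \frac{\pi^2}{2},
\end{aligned}
\]
\[
\begin{aligned}
\int_0^\infty \int_0^\infty \frac{1}{1+y} \cdot \frac{1}{1+x^2 y}\,dy\,dx
&= \int_0^\infty \int_0^\infty \frac{1}{1-x^2} \left( \frac{1}{1+y} - \frac{x^2}{1+x^2 y} \right) dy\,dx \\
&= \int_0^\infty \frac{1}{1-x^2} \ln\frac{1}{x^2}\,dx \\
&= 2 \int_0^\infty \frac{\ln x}{x^2-1}\,dx \\
&= 2 \left( \int_0^1 \frac{\ln x}{x^2-1}\,dx + \int_1^\infty \frac{\ln x}{x^2-1}\,dx \right) \\
&= 2 \left( \int_0^1 \frac{\ln x}{x^2-1}\,dx + \int_0^1 \frac{\ln u}{u^2-1}\,du \right) \\
&= 4 \int_0^1 \frac{\ln x}{x^2-1}\,dx. \quad \text{(Stop here.)}
\end{aligned}
\]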
*10.5.5. (This is from the article D. Ritelli, Another proof of $\zeta(2) = \frac{\pi^2}{6}$, American
Mathematical Monthly 120 (2013), 642–645.) We proved in Section 9.4 that derivatives and
(definite) integrals commute with infinite sums for power series. There are other cases where
integrals commute with infinite sums, but the proofs in greater generality are harder. Accept that
\[
\int_0^1 -\frac{\ln x}{1-x^2}\,dx = \int_0^1 (-\ln x) \sum_{k=0}^{\infty} x^{2k}\,dx = \sum_{k=0}^{\infty} \int_0^1 (-\ln x)\, x^{2k}\,dx.
\]
Also accept that
the two integrals in Exercise 10.5.4 are the same (order of integration matters sometimes).
Use Exercises 7.6.7 and 9.2.1 to prove that $\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}$.
Appendices
Appendix A. Advice on writing mathematics
Process is important
Perhaps the final answer to the question is 42. It is not sufficient to simply write “42”,
“The answer is 42,” or similar, without the process that led to that answer. While it is
extremely beneficial to have the intuition, the smarts, the mental calculating and
reasoning capacity, the inspiration, or what-not, to conclude “42”, a huge part of learning and
understanding is to be able to explain clearly the reasoning that led to your answer.
I encourage you to discuss the homework with others before, during or after
completing it: the explanations back-and-forth make you a better thinker and expositor.
Write your solutions in your own words on your own, and for full disclosure write
the names of all of your collaborators on the work that you turn in for credit. I do not
take points off, but you should practice full honesty.
Sometimes you may want to consult a book or the internet. Again, on the work that
you turn in disclose the help that you got from outside sources.
Keep in mind that the more you have to consult outside sources, the more fragile
your stand-alone knowledge is, the less well you understand the material, and the less likely
you are to be able to do satisfactory work on closed-book or limited-time projects.
Do not divide by 0
Never write “1/0”, “0/0”, “$0^2/0$”, “∞/0”. (Erase from your mind that you ever saw
this in print! It cannot exist.)
Sometimes division by zero creeps in in subtler ways. For example, to find solutions
to $x^2 = 3x$, it is wrong to simply cancel the x on both sides to get only one solution x = 3.
Yes, x = 3 is one of the solutions, but x = 0 is another one. Cancellation of x in $x^2 = 3x$
amounts to dividing by 0 in case the solution is x = 0.
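A safe way to find all solutions, sketched here as one possible write-up, is to move everything to one side and factor instead of cancelling:

```latex
\begin{align*}
x^2 = 3x
  &\iff x^2 - 3x = 0 \\
  &\iff x(x - 3) = 0 \\
  &\iff x = 0 \text{ or } x = 3.
\end{align*}
```

No step divides by x, so the solution x = 0 is not lost.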
Never plug numbers into a function that are not in the domain of the function
By design, the only numbers you can plug into a function are those that are in the
domain of the function. What else is there to say?
I will say more, by way of examples. Never plug 0 into the function $f(x) = \frac{1}{x}$ (see
previous admonition). Never even plug 0 into the function $f(x) = \frac{x}{x}$: the latter function
is undefined at x = 0 and is constant 1 at all other x.
Never plug −1 into √ or into ln.
Do not plug x = 0 or x = 1.12 into f that is defined by
\[ f(x) = \begin{cases} 3x - 4, & \text{if } x > 3; \\ 2x + 1, & \text{if } x < -1. \end{cases} \]
Write parentheses
·− is not a recognized binary operator. Do not write “5·−2”; instead write “5·(−2)”.
$\lim_{x \to -1} 4 - 3x = 4 - 3x$, whereas $\lim_{x \to -1} (4 - 3x) = 7$.
“$\int 4 - 3x\,dx$” is terrible grammar; instead write “$\int (4 - 3x)\,dx$”.
DO NOT WRITE AS FOLLOWS!
\[ \sum_{k=1}^{1} k^2 = \frac{1(1 + 1)(2 \cdot 1 + 1)}{6} \]
\[ 1^2 = \frac{1 \cdot 2 \cdot 3}{6} \]
\[ 1 = 1 \]
The reasoning above is wrong-headed because in the first line you are asserting the
equality that you are expected to prove, and in subsequent lines you are simply repeating
your assumptions more succinctly. If you add question marks over the three equal signs
and a check mark on the last line, then you are at least acknowledging that you are not
yet sure of the equality. However, even writing with question marks over equal signs is
inelegant and long-winded. That kind of writing is what we do on scratch paper to get our
bearings on how to tackle the problem. But a cleaned-up version of the proof would be
better as follows:
\[ \sum_{k=1}^{1} k^2 = 1^2 = 1 = \frac{1 \cdot 2 \cdot 3}{6} = \frac{1(1 + 1)(2 \cdot 1 + 1)}{6}. \]
Do you see how this is shorter and proves succinctly the desired equality by transitivity of
equality, with each step on the way sure-footed?
Another reason why the three-line reasoning above is bad is that it can lead to
the following nonsense:
\[ 1 \overset{?}{=} 0 \]
add 1 to both sides:
\[ 2 \overset{?}{=} 1 \]
multiply both sides by 0:
\[ 0 \overset{?}{=} 0 \quad \checkmark \]
Appendix B. What one should never forget
Logic
We should remember the basic truth tables, correct usage of “or” and of implications,
how to justify/prove a statement, and how to negate a statement.
Truth table:
Negation chart:
Statement                                Negation
not P                                    P
P and Q                                  (not P) or (not Q)
P or Q                                   (not P) and (not Q)
P ⇒ Q                                    P and (not Q)
P ⇔ Q                                    P ⇔ (not Q) = (not P) ⇔ Q
For all x of the specified type,         There exists x of the specified
property P holds for x.                  type such that P is false for x.
There exists x of the specified type     For all x of the specified type,
such that property P holds for x.        P is false for x.
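The propositional rows of the negation chart can be verified mechanically by running through all truth values of P and Q. The sketch below does this and also spot-checks the two quantifier rows on a small finite set. The names `NEGATION_CHART`, `chart_is_correct`, and `quantifier_rows_hold`, and the sample property, are illustrative assumptions; a finite check of the quantifier rows is an illustration, not a proof.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """P => Q is false only when P is true and Q is false."""
    return (not p) or q


def iff(p: bool, q: bool) -> bool:
    """P <=> Q holds exactly when P and Q have the same truth value."""
    return p == q


# Each entry: (label, statement, claimed negation), as functions of (P, Q).
NEGATION_CHART = [
    ("not P", lambda p, q: not p, lambda p, q: p),
    ("P and Q", lambda p, q: p and q, lambda p, q: (not p) or (not q)),
    ("P or Q", lambda p, q: p or q, lambda p, q: (not p) and (not q)),
    ("P => Q", implies, lambda p, q: p and (not q)),
    ("P <=> Q, first form", iff, lambda p, q: iff(p, not q)),
    ("P <=> Q, second form", iff, lambda p, q: iff(not p, q)),
]


def chart_is_correct() -> bool:
    """True if every claimed negation is the opposite of its statement."""
    return all(
        negation(p, q) == (not statement(p, q))
        for _, statement, negation in NEGATION_CHART
        for p, q in product([False, True], repeat=2)
    )


def quantifier_rows_hold(domain, prop) -> bool:
    """Check both quantifier rows for one property on a finite domain."""
    not_for_all = (not all(prop(x) for x in domain)) == any(not prop(x) for x in domain)
    not_exists = (not any(prop(x) for x in domain)) == all(not prop(x) for x in domain)
    return not_for_all and not_exists


print(chart_is_correct())
print(quantifier_rows_hold(range(-3, 4), lambda x: x > 0))
```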
Mathematical induction
The goal is to prove a property for all integers $n \ge n_0$. First prove the base case,
namely that the property holds for $n_0$. For the inductive step, assume that for some
$n - 1 \ge n_0$, the property holds for $n - 1$ (alternatively, for $n_0, n_0 + 1, \ldots, n - 1$), and then
prove the property for n.
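As an illustration of this template, here is a sketch of one possible write-up with $n_0 = 1$, using the sum-of-squares formula whose case n = 1 is discussed in Appendix A:

```latex
\textbf{Claim.} For all integers $n \ge 1$,
$\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$.

\textbf{Base case} ($n = 1$):
$\sum_{k=1}^{1} k^2 = 1^2 = 1 = \frac{1 \cdot 2 \cdot 3}{6}$.

\textbf{Inductive step.} Assume that for some $n - 1 \ge 1$ the claim
holds for $n - 1$. Then
\begin{align*}
\sum_{k=1}^{n} k^2 &= \sum_{k=1}^{n-1} k^2 + n^2 \\
&= \frac{(n-1)n(2n-1)}{6} + n^2
   && \text{(by the induction assumption)} \\
&= \frac{n\bigl((n-1)(2n-1) + 6n\bigr)}{6} \\
&= \frac{n(2n^2 + 3n + 1)}{6} \\
&= \frac{n(n+1)(2n+1)}{6}.
\end{align*}
```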
We say that f is integrable over [a, b] when L(f ) = U (f ). We call this common value
the integral of f over [a, b], and we write it as
\[ \int_a^b f = \int_a^b f(x)\,dx = \int_a^b f(t)\,dt. \]
II: Let f : [a, b] → R be continuous. Then for all x ∈ [a, b], f is integrable over [a, x], and
the function g : [a, b] → R given by $g(x) = \int_a^x f$ is differentiable on (a, b) with
\[ \frac{d}{dx} \int_a^x f = f(x). \]
Geometric series
$\sum_{k=1}^{\infty} r^k$ diverges if |r| ≥ 1 and converges to $\frac{r}{1-r}$ if |r| < 1.
$\sum_{k=0}^{\infty} r^k$ diverges if |r| ≥ 1 and converges to $\frac{1}{1-r}$ if |r| < 1.
For all r ∈ C \ {1}, $\sum_{k=0}^{n} r^k = \frac{r^{n+1} - 1}{r - 1}$.
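These three facts are easy to check numerically. The sketch below compares the finite sum with its closed form, including for a complex r, and compares a long partial sum with the two limits for a sample r with |r| < 1. The function names and sample values are illustrative.

```python
def geometric_partial_sum(r, n, start=0):
    """Sum of r^k for k = start, ..., n."""
    return sum(r**k for k in range(start, n + 1))


def closed_form(r, n):
    """(r^(n+1) - 1) / (r - 1), valid for r != 1."""
    return (r ** (n + 1) - 1) / (r - 1)


# Finite formula, for a real and a complex ratio.
for r in (0.5, -2.0, 0.5 + 0.5j):
    print(r, geometric_partial_sum(r, 10), closed_form(r, 10))

# Infinite series with |r| < 1: long partial sums approach the stated limits.
r = 0.5
print(geometric_partial_sum(r, 200), 1 / (1 - r))           # sum from k = 0
print(geometric_partial_sum(r, 200, start=1), r / (1 - r))  # sum from k = 1
```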
Never divide by 0
It bears repeating. Similarly do not plug 0 or negative numbers into ln, do not plug
negative numbers into the square root function, do not ascribe a function (or a person) a
task that makes no sense.