Analysis Notes
Jim Portegies
January 8, 2024
Contents
1 Introduction 12
3 Proofs in analysis 25
3.1 What is a proof? . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.2 Expectations on proofs . . . . . . . . . . . . . . . . . . . . . . 26
3.3 Prove statements block by block . . . . . . . . . . . . . . . . 27
3.4 Directly proving “for all” statements . . . . . . . . . . . . . . 27
3.5 Directly proving “there exists” statements . . . . . . . . . . . 29
3.6 Trying to finish the proof . . . . . . . . . . . . . . . . . . . . . 29
4 Real numbers 36
4.1 What are the real numbers? . . . . . . . . . . . . . . . . . . . 36
4.2 The completeness axiom . . . . . . . . . . . . . . . . . . . . . 37
4.3 Alternative characterizations of suprema and infima . . . . . 41
4.4 Maxima and minima . . . . . . . . . . . . . . . . . . . . . . . 43
4.5 The Archimedean property . . . . . . . . . . . . . . . . . . . 44
4.6 Sets can be complicated . . . . . . . . . . . . . . . . . . . . . 49
4.7 Computation rules for suprema . . . . . . . . . . . . . . . . . 49
4.8 Bernoulli’s inequality . . . . . . . . . . . . . . . . . . . . . . . 51
4.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.9.1 Blue exercises . . . . . . . . . . . . . . . . . . . . . . . 52
4.9.2 Orange exercises . . . . . . . . . . . . . . . . . . . . . 53
5 Sequences 54
5.1 A sequence is a function from the natural numbers . . . . . . 54
5.2 Terminology around sequences . . . . . . . . . . . . . . . . . 55
5.3 Convergence of sequences . . . . . . . . . . . . . . . . . . . . 57
5.4 Examples and limits of simple sequences . . . . . . . . . . . 58
5.5 Uniqueness of limits . . . . . . . . . . . . . . . . . . . . . . . 59
5.6 More properties of convergent sequences . . . . . . . . . . . 60
6 Real-valued sequences 68
6.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
6.2 Monotone, bounded sequences are convergent . . . . . . . . 69
6.3 Limit theorems . . . . . . . . . . . . . . . . . . . . . . . . . . 70
6.4 The squeeze theorem . . . . . . . . . . . . . . . . . . . . . . . 75
6.5 Divergence to ∞ and −∞ . . . . . . . . . . . . . . . . . . . . . 77
6.6 Limit theorems for improper limits . . . . . . . . . . . . . . . 78
6.7 Standard sequences . . . . . . . . . . . . . . . . . . . . . . . . 79
6.7.1 Geometric sequence . . . . . . . . . . . . . . . . . . . 79
6.7.2 The nth root of n . . . . . . . . . . . . . . . . . . . . . 80
6.7.3 The number e . . . . . . . . . . . . . . . . . . . . . . . 81
6.7.4 Exponentials beat powers . . . . . . . . . . . . . . . . 83
6.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.8.1 Blue exercises . . . . . . . . . . . . . . . . . . . . . . . 86
6.8.2 Orange exercises . . . . . . . . . . . . . . . . . . . . . 86
7 Series 87
7.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.2 Geometric series . . . . . . . . . . . . . . . . . . . . . . . . . . 88
7.3 The harmonic series . . . . . . . . . . . . . . . . . . . . . . . . 89
7.4 The hyperharmonic series . . . . . . . . . . . . . . . . . . . . 90
12 Compactness 154
12.1 Definition of (sequential) compactness . . . . . . . . . . . . . 154
12.2 Boundedness and total boundedness . . . . . . . . . . . . . . 155
12.3 Alternative characterization of compactness . . . . . . . . . . 158
12.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
12.4.1 Blue exercises . . . . . . . . . . . . . . . . . . . . . . . 163
12.4.2 Orange exercises . . . . . . . . . . . . . . . . . . . . . 163
15 Differentiability 201
15.1 Definition of differentiability . . . . . . . . . . . . . . . . . . 202
15.2 The derivative as a function . . . . . . . . . . . . . . . . . . . 204
15.3 Constant and linear maps are differentiable . . . . . . . . . . 205
15.4 Bases and coordinates . . . . . . . . . . . . . . . . . . . . . . 205
15.5 The matrix representation . . . . . . . . . . . . . . . . . . . . 208
15.6 The chain rule . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
15.7 Sum, product and quotient rules . . . . . . . . . . . . . . . . 213
15.8 Differentiability of components . . . . . . . . . . . . . . . . . 214
15.9 Differentiability implies continuity . . . . . . . . . . . . . . . 215
15.10 Derivative vanishes in local maxima and minima . . . . . . 216
15.11 The mean-value theorem . . . . . . . . . . . . . . . . . . . . . 218
15.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
15.12.1 Blue exercises . . . . . . . . . . . . . . . . . . . . . . . 219
15.12.2 Orange exercises . . . . . . . . . . . . . . . . . . . . . 220
Chapter 1
Introduction
Chapter 2
Sets, spaces and functions
f(x) = x² + 2.
b(n) := 1/(1 + n²), for n ∈ N,
is an example of what we call a real-valued sequence, or a sequence of real
numbers. In general, a real-valued sequence is a function from the natural
numbers N to the real numbers R.
b(n) := n³, for n ∈ N,
is an example of a sequence of natural numbers, i.e. a function from the natural numbers N to the natural numbers N.
L(x) := Mx
is a function from R^m to R^n.
d( a, c) ≤ d( a, b) + d(b, c).
Usually conditions (ii) and (v) are combined into the single condition that for all a, b ∈ X, d(a, b) = 0 if and only if a = b.
It will be very useful to have a name for the set of all points within a certain distance of a given point in a metric space.
Definition 2.4.4 ((open) ball). Let (X, dist) be a metric space. The (open) ball around a point p ∈ X with radius r > 0 is denoted by B(p, r) and is defined as the set
B(p, r) := {q ∈ X | dist(q, p) < r}.
Example 2.4.5. In 2.4.3, the open ball B(yellow, 3/2) is the set {yellow, red}, because d(yellow, yellow) = 0, which is strictly less than 3/2, and d(yellow, red) = 1 < 3/2, but d(yellow, blue) = 3 > 3/2.
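For readers who like to experiment, here is a small Python sketch that mirrors this example numerically. The value d(red, blue) = 2 is an assumed placeholder (the example above does not list it); any value compatible with the triangle inequality would do.

```python
# A finite metric space on the three primary colors; distances taken from the
# example above, except d(red, blue), which is an assumed value.
points = ["yellow", "red", "blue"]
d = {
    ("yellow", "red"): 1.0,
    ("yellow", "blue"): 3.0,
    ("red", "blue"): 2.0,   # assumption: any value satisfying the triangle inequality
}

def dist(p, q):
    """Symmetric lookup with dist(p, p) = 0."""
    if p == q:
        return 0.0
    return d.get((p, q), d.get((q, p)))

def open_ball(p, r):
    """The (open) ball B(p, r) = {q | dist(q, p) < r}."""
    return {q for q in points if dist(q, p) < r}

print(open_ball("yellow", 1.5))   # {'yellow', 'red'}, as in Example 2.4.5
```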
Definition 2.5.1 (vector space). Let K be a field (or if you don’t know what a field is, let K be the real or complex numbers or the rational numbers). A vector space (V, ·, +, 0) over the field K is a set V, together with functions · : K × V → V and + : V × V → V called scalar multiplication and addition, and a particular element 0 ∈ V such that the following properties are satisfied.
v+w = w+v
u + (v + w) = (u + v) + w
v+0 = 0+v = v
v + (−v) = 0.
1 · v = v.
λ · (v + w) = λ · v + λ · w.
(λ + µ) · v = λv + µv.
One of the reasons that vector spaces are so important in analysis is that
for functions defined on vector spaces, we can introduce the concept of
differentiation, while this is in general problematic for functions defined on
metric spaces.
‖v + w‖ ≤ ‖v‖ + ‖w‖.
(x, y) := ∑_{i=1}^{d} x_i y_i
‖x‖₂ = √(x²) = x if x ≥ 0, and −x if x < 0.
d_{‖·‖}(x, y) := ‖x − y‖.
for all v, w ∈ V,
d_{‖·‖}(v, w) ≥ 0.
Since the statement we need to show starts with “for all v, w ∈ V”, we start by writing:
Let v, w ∈ V. Then
d_{‖·‖}(v, w) = ‖v − w‖ ≥ 0
for all v, w ∈ V,
if d_{‖·‖}(v, w) = 0 then v = w.
Let v, w ∈ V. Then
d_{‖·‖}(v, w) = ‖v − w‖ = 0.
for all v, w ∈ V,
d_{‖·‖}(v, w) = d_{‖·‖}(w, v).
d_{‖·‖}(v, w) = ‖v − w‖
= ‖(−1) · (w − v)‖
= |−1| ‖w − v‖
= ‖w − v‖ = d_{‖·‖}(w, v).
for all u, v, w ∈ V,
d_{‖·‖}(u, w) ≤ d_{‖·‖}(u, v) + d_{‖·‖}(v, w).
Let u, v, w ∈ V. Then
d_{‖·‖}(u, w) = ‖u − w‖
= ‖(u − v) + (v − w)‖
≤ ‖u − v‖ + ‖v − w‖
= d_{‖·‖}(u, v) + d_{‖·‖}(v, w).
for all u ∈ V,
d_{‖·‖}(u, u) = 0.
d_{‖·‖}(u, u) = ‖u − u‖
= ‖0 · u‖
= |0| · ‖u‖
= 0 · ‖u‖
= 0.
Although strictly speaking, (V, d_{‖·‖}) and (V, ‖·‖) are different objects (one is a metric space and the other is a normed vector space), we will usually be a bit sloppy about this difference.
dist_R(v, w) := |v − w|.
And if there is no room for confusion, we will just leave out the subscript altogether.
‖a + b‖ ≤ ‖a‖ + ‖b‖
so that
‖v‖ ≤ ‖v − w‖ + ‖w‖
and
‖v‖ − ‖w‖ ≤ ‖v − w‖.
Similarly,
‖w‖ − ‖v‖ ≤ ‖w − v‖ = ‖v − w‖.
We conclude that
|‖v‖ − ‖w‖| ≤ ‖v − w‖.
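As a quick numerical sanity check of the reverse triangle inequality just derived, here is a minimal Python sketch for the Euclidean norm on R^3; the choice of dimension and of random test vectors is ours, purely for illustration.

```python
import random

# Check |‖v‖ - ‖w‖| <= ‖v - w‖ for the Euclidean norm on random vectors in R^3.
def norm(v):
    return sum(x * x for x in v) ** 0.5

for _ in range(1000):
    v = [random.uniform(-10, 10) for _ in range(3)]
    w = [random.uniform(-10, 10) for _ in range(3)]
    diff = [vi - wi for vi, wi in zip(v, w)]
    assert abs(norm(v) - norm(w)) <= norm(diff) + 1e-12
print("no counterexample found")
```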
2.7 Exercises
Please read also the following chapter and the best practices in Appendix
A before attempting the next exercises. In particular, please follow the
best practices in writing down your solutions.
Then dist|_A is called the restriction of dist to A. Show that dist|_A is a distance on A.
Let z ∈ V and r > 0. Recall Definition 2.4.4 of the open ball B(z, r ). Show
that B(z, r ) is convex.
Chapter 3
Proofs in analysis
Best practices. Just like big software companies have best practices for
writing code, with the aim of having well-maintainable code with a min-
imal amount of bugs, in this course we have a set of best practices for
writing proofs. These best practices are recorded in Appendix A. If we all
try to adhere to the best practices, authors are helped in structuring their
proofs, and readers, reviewers and authors are all helped in understand-
ing the proofs.
Such a statement looks complicated, and you may not know how to start
a proof or how to continue.
But one of the main messages of this chapter is that the statement itself
tells you how to start, and how to continue. In fact, the statement gives
you a template, where there are only a few things left for you to fill in.
for all a ∈ A,
...
you need to do the following. You first need to introduce (i.e. define) the
variable a ∈ A by writing something like
Let a ∈ A.
or
Take a ∈ A arbitrary.
Next, you continue to prove the indented block on the next line.
In our example, we encounter this situation twice: in
and in
for all n ≥ n0 ,
...
Let e > 0.
you need to do the following. You need to make a choice for b, e.g. by
writing
Choose b := . . .
after which you continue to show the indented statement (. . . ) with that
choice of b.
In our example we encounter this situation with
Choose n0 := 10.
and then continue with the proof of the block {. . . }, with now n0 fixed as
10. Making choices is hard. With a bad choice, you won’t be able to prove
whatever is inside the block {. . . }. In many of the proofs that you will
write, this is probably the step that requires the most thinking, the most
creativity.
Let e > 0.
Choose n0 := 10.
for all n ≥ n0 ,
|1/n − 0| < e.
with the only knowledge about e that it is a (real) number larger than 0,
and that n0 = 10.
Let us stick to the recipe. We need to show a statement of the form
for all n ≥ n0
so we define n by writing
Let n ≥ n0
But we do not know whether e is larger than 1/10 or not! All we know is
that it is a positive real number.
We cannot prove that
|1/n − 0| < e
To know how to choose n0, you need to know your endgame: you already need to know how you will finish the proof. This means you need to do a lot of scratchwork.
Let us do some of that scratchwork. We can figure out what choice of n0
would lead to a proof. For instance if we choose instead
n0 := ⌈1/e⌉ + 1
then n0 is a natural number strictly larger than 1/e (here ⌈1/e⌉ is 1/e rounded up to a natural number). In that case
It works!
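A small Python experiment, purely as a sanity check of the scratchwork (it is of course not a proof), confirms that the choice n0 := ⌈1/e⌉ + 1 makes |1/n − 0| < e for the tested values of n ≥ n0.

```python
import math

# For a few values of e > 0, verify numerically that the choice
# n0 = ceil(1/e) + 1 gives |1/n - 0| < e for all tested n >= n0.
for e in [0.5, 0.1, 0.003]:
    n0 = math.ceil(1 / e) + 1
    assert all(abs(1 / n - 0) < e for n in range(n0, n0 + 1000))
    print(f"e = {e}: n0 = {n0} works")
```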
But you need to present the proof following the above steps. You just make
better choices.
so we write
Let e > 0.
Choose n0 := ⌈1/e⌉ + 1
for all n ≥ n0 ,
|1/n − 0| < e.
Next, we write
Let n ≥ n0 .
|1/n − 0| < e.
Let e > 0.
Choose n0 := . . .
Let n ≥ n0 .
Then show desired estimate.
We use induction on n ∈ N.
We first show the base case, i.e. that P(0) holds.
... insert here a proof of P(0) ...
We now show the induction step.
Let k ∈ N and assume that P(k) holds.
We need to show that P(k + 1) holds.
... insert here a proof of P(k + 1) ...
¬ (for all a ∈ A, (. . . ))
is equivalent to
¬ (there exists a ∈ A, (. . . ))
is equivalent to
3.11 Exercises
Chapter 4
Real numbers
The functions that we will study will most often have the real numbers as
a target space. And if the target space is not the space of real numbers,
then it will most often be a normed vector space or a metric space with
properties very similar to the real numbers.
In this chapter we will therefore take a careful look at the properties of the
real numbers. Although most of the properties may seem obvious, there
is one property that you usually don’t encounter so explicitly. The name
of this property is the completeness axiom.
The absolute main message of this chapter is the importance of the com-
pleteness axiom: at the innermost core of almost every proof in these notes
is the completeness axiom. It is the motor of analysis.
rational numbers with the standard addition, multiplication and order are
an example of a totally ordered field. We will come to the exact meaning
of ‘complete’ in the next section, but let’s already mention that a totally
ordered field is called complete if every non-empty, bounded-from-above
subset of the totally ordered field has a least upper bound.
At this stage we assume that complete, totally ordered fields exist. We
choose one such complete ordered field and call it the real numbers and
denote it by R. Are there other possibilities then? Yes and not really. Yes,
there are different complete totally ordered fields satisfying all these ax-
ioms, but they are all essentially the same in the sense that they are isomor-
phic as totally ordered fields. We will not show these statements in these
notes. For the purpose of the notes it’s enough to just make a choice of a
complete ordered field and call it R.
Between various books there may be some slight difference in choice of
axioms. Although the lists are different, they do specify (essentially) the
same complete totally ordered field. In Abbott’s book [A+ 15], you can find
the axioms of the real numbers in Section 8.6.
In proof assistants such as Waterproof, the axioms for the real numbers are
also introduced and they are used as the fundamental building blocks on
which the rest of the theory is formally built up. You can find a full list
of possible axioms for the real numbers as they are used in Waterproof on
the website:
coq.inria.fr/library/Coq.Reals.Raxioms.html.
for all a ∈ A,
a ≤ M.
for all a ∈ A
m ≤ a.
Given the definition of upper and lower bounds, we define what it means
for a set to be bounded from above, bounded from below and just bounded.
A least upper bound is an upper bound that is smaller than or equal to any
other upper bound.
i. M is an upper bound
Definition 4.2.6 (The supremum). A different name for the least upper
bound of a set A ⊂ R is the supremum of A, which we also write as
sup A.
Given all the terminology defined above, we can now state the complete-
ness axiom.
Lemma 4.2.8. Every non-empty subset of the real line that is bounded
from below has a largest lower bound.
i. m is a lower bound of A.
− A := {− a | a ∈ A}.
for all a ∈ A
m ≤ a.
ii.
for all e > 0,
there exists a ∈ A,
a > M − e.
ii.
for all e > 0,
there exists a ∈ A,
a < m + e.
ii.
for all e > 0
there exists a ∈ A
a < 1 + e.
1 + e/2 = 501 ∉ A.
Therefore in general
1 + e/2 ∉ A.
This happens a lot, that there is an initial idea which is almost ok, but
it doesn’t quite work. In that case we can adapt. One way is as follows:
Choose
a := min (1 + e/2, 2)
In that case, we know that a is always between 1 and 2, and therefore
a ∈ A. What we now need to do is show that a < 1 + e.
For this we write a small chain of inequalities. Indeed, it holds that
for all a ∈ A,
a ≤ y.
for all a ∈ A,
x ≤ a.
One of the very important aspects of the above definition is that the min-
which is a contradiction.
Given the previous proposition, we can define the ceiling function (that
we actually have already used in the running example in the previous
chapter).
The Archimedean property implies that between every two real numbers
you can find a rational number. That is the content of the following propo-
sition.
Then
Mb − Ma = M(b − a) ≥ (3/(b − a)) (b − a) = 3.
Therefore
Ma < ⌈Ma⌉ + 1 < Ma + 2 < Mb.   (4.5.1)
Choose q := (⌈Ma⌉ + 1)/M, which is indeed an element of Q. After dividing both sides of (4.5.1) by M, we conclude that
a < (⌈Ma⌉ + 1)/M < b.
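The construction in this proof is completely explicit, so it can be traced numerically. The sketch below assumes the particular (valid) choice M := ⌈3/(b − a)⌉; the sample endpoints a = √2 and b = 1.5 are our own.

```python
import math

# Illustration of the construction: given a < b, pick M with M >= 3/(b - a)
# and take q = (ceil(M*a) + 1)/M; then a < q < b and q is rational.
def rational_between(a, b):
    M = math.ceil(3 / (b - a))   # assumption: one admissible choice of M
    p = math.ceil(M * a) + 1
    return p, M                  # the rational number is q = p / M

a, b = math.sqrt(2), 1.5
p, M = rational_between(a, b)
print(p, "/", M, "=", p / M, "lies in", (a, b))
assert a < p / M < b
```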
The next proposition loosely speaking says that √2 is irrational. If we were to be very precise, though, at this stage we wouldn’t even know how to define √2. Sure, we could define it as a real number x ∈ R such that x² = 2, but who says such a real number exists?
The next proposition really defines √2.
sup A < q < sup A + (2 − (sup A)²)/(4 sup A).
0 < e < (2 − (sup A)²)/(4 sup A) < 1.
Therefore
q² = (sup A + e)²
= (sup A)² + 2e sup A + e²
< (sup A)² + 2e sup A + e
< (sup A)² + 2e sup A + 2e sup A
= (sup A)² + 4e sup A < 2,
δ := ((sup A)² − 2) / (2 sup A).
Corollary 4.5.6. For every two real numbers a, b ∈ R with a < b, there
exists an irrational number r ∈ R \ Q such that a < r < b.
N := ⌈1/(b − q)⌉ + 1.
Choose
r := q + √2/(2N).
Then r is irrational and
a < r = q + √2/(2N) ≤ q + 1/N < q + (b − q) = b.
A + B = { a + b | a ∈ A, b ∈ B}
and
λA = {λa | a ∈ A}
for subsets A, B ⊂ R and a scalar λ ∈ R.
v. sup(−C ) = − inf C
Proof. We first show (i) in the list above. We set M := sup A + sup B
and will show that M is indeed the supremum of the set A + B by
showing items (i) and (ii) of the alternative characterization of the
supremum in Proposition 4.3.1.
We first show item (i) of Proposition 4.3.1, namely that M is an upper
bound. We need to show that
for all c ∈ A + B,
c ≤ M.
c = a + b ≤ sup A + sup B = M
we know that
for all e1 > 0,
there exists a ∈ A, (4.7.1)
a > sup A − e1 .
we find that there exists an a ∈ A such that a > sup A − e/2. Similarly,
there exists a b ∈ B such that b > sup B − e/2.
Choose c := a + b. Then
A remark about the presentation of the proof above: The text in the lighter
gray inset around statement (4.7.1) is optional. Once we get more skilled
in proving, we will usually omit it. The omission makes the proof a bit
shorter and perhaps easier to read, while if you have seen this type of
argument a few times, you will know what lines to insert to make the
proof more detailed.
If a and b are positive, we can get some useful inequalities by just leaving
out some terms on the right hand side. We will use this technique repeat-
edly in the lecture notes.
We can even get some inequalities if b = 1 and a ≥ −1. One such inequal-
ity is Bernoulli’s inequality.
(1 + a)^0 = 1 ≥ 1 = 1 + 0 · a
(1 + a)^(k+1) = (1 + a)^k (1 + a)
≥ (1 + ka)(1 + a)
= 1 + (k + 1)a + ka²
≥ 1 + (k + 1)a.
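Bernoulli’s inequality is easy to test numerically; the following sketch checks (1 + a)^n ≥ 1 + na for a handful of values a ≥ −1 (chosen by us) and small n, with a tiny tolerance for floating-point error.

```python
# Numerical check of Bernoulli's inequality (1 + a)^n >= 1 + n*a for a >= -1.
for a in [-1.0, -0.5, 0.0, 0.1, 2.0]:
    for n in range(0, 30):
        assert (1 + a) ** n >= 1 + n * a - 1e-9
print("Bernoulli's inequality holds on all tested cases")
```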
4.9 Exercises
inf[ a, b) = a.
Chapter 5
Sequences
This chapter will introduce sequences. Sequences are extra important since
they can be used to determine whether metric spaces or functions between
metric spaces satisfy certain properties.
Example 5.1.2. Consider the set X := {red, yellow, blue} of primary col-
ors. The function a : N → X defined by
a(n) := blue if n is odd, and a(n) := red if n is even,
is an example of a sequence in X.
We really want to stress the point of view here that a sequence, for instance
a sequence a : N → R of real numbers, is really a function. In practice, we
often write ( an )n∈N , ( an ), ( a(n) ), or
a0 , a1 , a2 , a3 , . . .
there exists q ∈ X,
there exists M > 0,
for all n ∈ N,
dist( an , q) ≤ M.
Proof. We first show the “if” part of the statement. So we assume that
Now we will show the “only if” part of the statement. We assume that
there exists q ∈ V,
there exists M > 0,
for all n ∈ N,
‖a_n − q‖ ≤ M
‖a_n‖ = ‖a_n − q + q‖
≤ ‖a_n − q‖ + ‖q‖
≤ M + ‖q‖ = M_1.
a : N → X converges to a point p ∈ X if
We sometimes write
lim_{n→∞} a_n = p
Example 5.3.2. Let’s see what this definition looks like when the metric
space ( X, dist) is (R, distR ), where by distR : R × R → R we always
mean the standard distance on R given by
distR ( x, y) = | x − y|.
dist( an , p) < e.
which is a contradiction.
Again, the lighter inset around statement (5.5.1) above denotes optional
text. As we progress in these notes, we will more and more often omit it,
but in the early stages it shows the argumentation a bit more clearly.
n ↦ dist(a_n, p)
converges to 0 in R.
bn := dist( an , p).
0 ≤ dist( an , p) < e
so that indeed
We now show the “if” part of the statement. We assume that (bn ) con-
verges to 0 and we need to show that ( an ) converges to p.
Let e > 0.
Since (bn ) converges to zero, there exists an N1 such that for all n ≥ N1 ,
there exists q ∈ X,
there exists M > 0,
for all n ∈ N,
dist( an , q) ≤ M.
we know that
for all e1 > 0,
there exists N ∈ N,
(5.6.1)
for all n ≥ N,
dist( an , p) < e1 .
dist( an , p) < 1.
Choose
M := max(dist( a1 , p), . . . , dist( a N −1 , p), 1).
Let n ∈ N. We need to show that dist( an , p) ≤ M. We make a case
distinction.
In the case n ≤ N − 1, then
dist( an , p) < 1 ≤ M.
lim dist( an , bn ) = 0.
n→∞
lim dist( an , bn ) = 0.
n→∞
Let e > 0.
Because lim_{n→∞} a_n = p, there exists an N_0 ∈ N such that for all n ≥ N_0,
dist(a_n, p) < e/2.
Because lim_{n→∞} b_n = p, there exists an N_1 ∈ N such that for all n ≥ N_1,
dist(b_n, p) < e/2.
Choose N := max(N_0, N_1).
Let n ≥ N. Because then n ≥ N_0 and n ≥ N_1, we know
dist(a_n, b_n) ≤ dist(a_n, p) + dist(p, b_n) < e/2 + e/2 = e.
I’ve added the next corollary later in the year 2021-2022 just for your
convenience, just to highlight a consequence of the previous proposi-
tion.
a n = bn .
Proof. We leave the proof of (i) as an exercise and prove (ii), which is a
bit more difficult.
We need to show that limn→∞ (λn an ) = µp, i.e. we need to show that
Let e > 0.
|λn | ≤ M.
Choose N := max( N0 , N1 ).
Let n ≥ N. Then
conclude that the same sequence but with index shifted is also convergent.
Proposition 5.8.1 (Index shift). Let ( X, dist) be a metric space and let
a : N → X be a sequence in X. Let k ∈ N and p ∈ X. Then the
sequence ( an ) converges to p if and only if the sequence ( an+k )n (i.e.
the sequence n 7→ an+k ) converges to p.
5.9 Exercises
Chapter 6
Real-valued sequences
6.1 Terminology
Because the real numbers come with an order (≤), we can define increas-
ing, decreasing and monotone sequences.
The main result of this chapter is that monotone, bounded sequences are
convergent. In order to introduce what it means for a sequence to be
bounded, we first introduce upper and lower bounds.
for all n ∈ N,
m ≤ an .
In the previous chapter, we have already defined what it means for a se-
quence to be bounded. The next proposition relates the two definitions to
each other.
L := sup_{n∈N} a_n.
We need to show that for all e > 0, there exists a N ∈ N such that for
all n ≥ N,
| an − L| < e.
Let e > 0. Then, by the definition of the supremum, there exists a
k ∈ N such that
L − e < ak .
Choose N := k. Let n ≥ N. Because the sequence ( a` ) is increasing,
we find that
an ≥ a N = ak > L − e.
Because of the definition of L, we also know that an ≤ L < L + e.
Summarizing,
| an − L| < e.
lim_{n→∞} a_n = inf_{n∈N} a_n.
Then
iii. If d 6= 0, then the limit limn→∞ ( an /bn ) exists and is equal to c/d.
Proof. Let us aim to prove item (v). We first show the statement for c = 1: if (a_n) converges to 1, then for every k ∈ N⁺ the limit lim_{n→∞} a_n^(1/k) exists and is equal to 1.
Let e > 0. Define e0 := min(e, 1/2). (We will prefer to work with
e0 over e because we know that e0 ≤ 1/2, which will be convenient
below when we want to take the kth root of (1 − e0 ).) Since an → c by
assumption, there exists an n0 ∈ N such that for every n ≥ n0 ,
| a n − 1 | < e0 .
Let n ≥ n0 . Then
1 − e0 < a n < 1 + e0
and therefore
Hence,
( an )1/k − 1 < e0 ≤ e.
| an | < ek .
Let n ≥ N. Then,
The text here is optional, as on the one hand it is really required for
a rigorous proof but on the other hand the amount to write down
would be way too much for more involved limits.
By the limit theorem for powers, Theorem 6.3.1, item (iv), it follows
that the sequence n 7→ (1/n)2 also converges.
We also know that the sequence n 7→ 3 converges, as this is a con-
stant sequence, see Proposition 5.4.1.
By the limit theorem for the sum and the power, we conclude that the
sequence a : N → R also converges and
lim_{n→∞} a_n = lim_{n→∞} (3 + (1/n)²)
= lim_{n→∞} 3 + lim_{n→∞} (1/n)²
= 3 + (lim_{n→∞} (1/n))²
= 3 + 0² = 3.
(What one really needs to do, when leaving out the optional text, is to read the above chain of equalities from back to front, and to make sure that all steps are justified. In particular, it is extremely important to verify that all involved limits exist.)
a_n := (3n² + 5n + 9)/(2n² + 3n + 7)
We claim that the sequence a : N → R converges, and that the limit
equals 3/2.
When confronted with a sequence that is given as a fraction of two
terms, the first thing to do is to divide numerator and denominator by
the fastest growing term in n. In this case, we need to divide by n2 . We
get
a_n = (3 + 5(1/n) + 9(1/n)²) / (2 + 3(1/n) + 7(1/n)²).
We would like to use the limit theorem for quotients, namely Theorem
6.3.1, item (iii). However, to apply this limit theorem, we should really
make sure that the limit of numerator and denominator exist, and that
the limit of the denominator is not equal to 0. Whereas the previous
example had optional text to justify all the steps, here we leave it out.
We will use the strategy described at the end of the previous example,
to read chains of equalities backwards while making sure all involved
limits exist.
Note that the limit
lim_{n→∞} 1/n
exists, and equals 0, as this is the standard limit from Example 5.4.2.
By the limit theorems it follows that
lim_{n→∞} (2 + 3(1/n) + 7(1/n)²) = lim_{n→∞} 2 + 3 lim_{n→∞} (1/n) + 7 (lim_{n→∞} (1/n))²
= 2 + 0 + 0 = 2.
lim_{n→∞} a_n = lim_{n→∞} (3 + 5(1/n) + 9(1/n)²) / (2 + 3(1/n) + 7(1/n)²)
= (lim_{n→∞} (3 + 5(1/n) + 9(1/n)²)) / (lim_{n→∞} (2 + 3(1/n) + 7(1/n)²))
= (lim_{n→∞} 3 + 5 lim_{n→∞} (1/n) + 9 (lim_{n→∞} (1/n))²) / 2
= (3 + 0 + 0)/2 = 3/2.
(Again we read the chain of equalities backwards to make sure every
step is justified.)
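A quick numerical look at the sequence supports the computed limit 3/2; this is only an illustration of the limit theorems at work, not a substitute for them.

```python
# Numerical check that a_n = (3n^2 + 5n + 9)/(2n^2 + 3n + 7) approaches 3/2.
for n in [10, 100, 10_000, 1_000_000]:
    a_n = (3 * n**2 + 5 * n + 9) / (2 * n**2 + 3 * n + 7)
    print(n, a_n, abs(a_n - 1.5))
```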
a n ≤ bn ≤ c n
a n ≤ bn ≤ c n
show that
for all e > 0,
there exists N0 ∈ N,
for all n ≥ N0 ,
|bn − L| < e.
L − e < a n ≤ bn ≤ c n < L + e
The squeeze theorem is a great tool to show existence of limits and to compute limits for sequences that can easily be compared to other sequences, as in the next example.
b_n := sin(n)/(n + 1).
We can use the squeeze theorem to show that
lim bn = 0.
n→∞
−1 ≤ sin(n) ≤ 1,
we know that
−1/(n + 1) ≤ sin(n)/(n + 1) ≤ 1/(n + 1).
lim_{n→∞} (−1/(n + 1)) = 0.
It follows by the squeeze theorem that
lim bn = 0
n→∞
as well.
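The squeeze can also be watched numerically: the sketch below prints the lower bound, b_n itself and the upper bound for a few arbitrarily chosen values of n.

```python
import math

# The squeeze in action: -1/(n+1) <= sin(n)/(n+1) <= 1/(n+1), and all go to 0.
for n in [10, 1_000, 100_000]:
    b_n = math.sin(n) / (n + 1)
    print(n, -1 / (n + 1), b_n, 1 / (n + 1))
```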
lim an = ∞
n→∞
if
for all M ∈ R,
there exists N ∈ N,
for all n ≥ N,
an > M.
lim an = −∞
n→∞
if
for all M ∈ R,
there exists N ∈ N,
for all n ≥ N,
an < M.
lim an = ∞.
n→∞
lim bn = −∞.
n→∞
the sequence (bn ) is bounded from below and the sequence (dn ) is
bounded from above. Let λ : N → R be a sequence bounded below
by some µ > 0. Then
i. limn→∞ ( an + bn ) = ∞.
• converges to 0 if q ∈ (−1, 1)
• converges to 1 if q = 1
• diverges to ∞ if q > 1
so s = 0.
Therefore,
0 < d_n² ≤ 2/n
and
0 < d_n ≤ √(2/n).
The limit of the left-hand side is zero, and by the limit theorems, we
know that the limit of the right-hand-side is 0 as well. It follows by the
squeeze theorem that limn→∞ dn = 0.
Corollary 6.7.3. Let a > 0. Then the sequence (b_n) defined by b_n := ⁿ√a converges to 1.
Proof. We need to show that for all n ∈ N \ {0}, an < an+1 . Let n ∈ N
be larger than or equal to 1. We can just write out
a_n = ∑_{k=0}^{n} (n choose k) (1/n)^k = ∑_{k=0}^{n} (n!/(k!(n − k)!)) (1/n)^k
whereas
a_{n+1} = ∑_{k=0}^{n+1} ((n + 1)!/(k!(n + 1 − k)!)) (1/(n + 1))^k
How to compare these and show that a_n < a_{n+1}? First, because all terms in the sum are positive, we can estimate a_{n+1} from below by forgetting the last term:
a_{n+1} > ∑_{k=0}^{n} ((n + 1)!/(k!(n + 1 − k)!)) (1/(n + 1))^k
Next, we will show that each of the terms in this sum is larger than the corresponding term in the sum for a_n. We can see this better if we rewrite
a_{n+1} > ∑_{k=0}^{n} (1/k!) ((n + 1)/(n + 1)) ((n + 1 − 1)/(n + 1)) · · · ((n + 1 − (k − 1))/(n + 1))
> ∑_{k=0}^{n} (1/k!) (n/n) ((n − 1)/n) · · · ((n − (k − 1))/n)
= a_n
n ↦ (1 + 1/n)^n
converges. Let’s record in the next definition that we call the limit e.
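Numerically, the monotone convergence just proved is easy to observe; the values of (1 + 1/n)^n below creep up towards 2.718…, the number named e in the definition.

```python
# The sequence (1 + 1/n)^n is increasing and bounded, hence convergent;
# numerically its values approach e ≈ 2.71828...
for n in [1, 10, 1_000, 1_000_000]:
    print(n, (1 + 1 / n) ** n)
```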
lim_{n→∞} n^p / a^n = 0.
a_n ≥ (n^M / (2^M M!)) b^M.
Indeed, let n ≥ N. First note that because n ≥ 2M, we know
n − M ≥ n/2.   (6.7.1)
We now compute
a_n = (1 + b)^n = ∑_{k=0}^{n} (n choose k) b^k
≥ (n choose M) b^M
= (1/M!) n(n − 1) · · · (n − M + 1) b^M
≥ (n/2)^M (1/M!) b^M,
where for the last inequality we used (6.7.1). This proves our claim.
Since M > p + 1 we find that for all n ≥ N
0 < n^p / a^n < 2^M M! (1/b^M) (1/n).
We know that
lim_{n→∞} 2^M M! (1/b^M) (1/n) = 0
by limit theorems and the standard limit lim_{n→∞} 1/n = 0. Therefore, it holds by the squeeze theorem (Theorem 6.4.1) that
lim_{n→∞} n^p / a^n = 0.
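The slogan “exponentials beat powers” can be illustrated numerically. The parameters p = 5 and a = 1.1 below are an arbitrary choice; note that the ratio first grows before it eventually collapses towards 0, which is exactly why the proof works with a tail estimate for n ≥ N.

```python
# "Exponentials beat powers": n^p / a^n -> 0 for a > 1, shown for p = 5, a = 1.1.
p, a = 5, 1.1
for n in [10, 100, 500, 1_000]:
    print(n, n**p / a**n)
```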
x_1^(n) = 1/n
and the second component sequence (x_2^(n)) is given by
x_2^(n) = (1/2)^n
lim_{n→∞} x_1^(n) = lim_{n→∞} 1/n = 0
and
lim_{n→∞} x_2^(n) = lim_{n→∞} (1/2)^n = 0.
Note how in the last term we use the notation 0 for the 0-vector in the
vector space R2 .
6.8 Exercises
Exercise 6.8.4. Prove the statement about the sequence (bn ) in Proposition
6.5.2.
x_{n+1} := (2 + x_n²) / (2x_n)
for n ∈ N while x0 = 2. Prove that the sequence x : N → R converges
and determine its limit.
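The exercise asks for a proof, but a short numerical experiment (a sketch, not a solution) already suggests what the limit should be; the cutoff of six iterations is arbitrary.

```python
# Iterate x_{n+1} = (2 + x_n^2) / (2 x_n) with x_0 = 2; the printed values
# appear to approach sqrt(2) ≈ 1.41421... (numerical observation only).
x = 2.0
for n in range(6):
    print(n, x)
    x = (2 + x * x) / (2 * x)
```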
a_n := 1/n³ − 3,   b_n := (5n⁵ + 2n²)/(3n⁵ + 7n³ + 4),   c_n := √n − n,
d_n := 2^n / n^100,   e_n := √(n² + n) − n,   f_n := ⁿ√(3n²),
g_n := (2^n + 5n^200)/(3^n + n^10),   h_n := (−1)^n 3^n,   i_n := ⁿ√(5^n + n²)
Chapter 7
Series
7.1 Definitions
Definition 7.1.1. Let (V, ‖·‖) be a normed vector space and let a : N → V be a sequence in V. Let K ∈ N. We say that a series
∑_{n=K}^{∞} a_n
converges if the sequence of partial sums n ↦ S_K^n := ∑_{k=K}^{n} a_k converges, and in that case we write
∑_{k=K}^{∞} a_k := lim_{n→∞} S_K^n = lim_{n→∞} ∑_{k=K}^{n} a_k.
Proof. We consider
(1 − a) ∑_{k=0}^{n} a^k = ∑_{k=0}^{n} a^k − a ∑_{k=0}^{n} a^k
= ∑_{k=0}^{n} a^k − ∑_{k=0}^{n} a^(k+1)
= ∑_{k=0}^{n} a^k − ∑_{k=1}^{n+1} a^k
= 1 − a^(n+1).
Because
lim_{n→∞} a^(n+1) = 0
by index shift and Proposition 6.7.1, we find with the limit laws that lim_{n→∞} S_n exists as well and equals
∑_{k=0}^{∞} a^k := lim_{n→∞} S_n = lim_{n→∞} (1 − a^(n+1))/(1 − a) = 1/(1 − a).
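The closed formula for the partial sums makes the geometric series easy to inspect numerically; the sketch below compares partial sums for the arbitrarily chosen ratio a = 0.8 with the limit 1/(1 − a).

```python
# Partial sums of the geometric series sum_{k=0}^n a^k versus the limit 1/(1 - a).
a = 0.8
S, target = 0.0, 1 / (1 - a)
for n in range(0, 60):
    S += a**n
    if n % 15 == 0:
        print(n, S, target)
```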
The harmonic series ∑_{k=1}^{∞} 1/k diverges.
S_{2^ℓ} = ∑_{k=1}^{2^ℓ} 1/k.
1/k ≥ 1/2^(ℓ+1).
S_{2^ℓ} ≥ ℓ/2.
Note also that the sequence of partial sums (Sn ) is increasing. There-
fore, the sequence of partial sums (Sn ) diverges to infinity.
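The lower bound S_{2^ℓ} ≥ ℓ/2 derived above can be checked directly for small ℓ; the cutoff ℓ ≤ 14 below is only to keep the computation quick.

```python
# Check S_{2^l} >= l/2 for the partial sums of the harmonic series.
for l in range(1, 15):
    S = sum(1 / k for k in range(1, 2**l + 1))
    assert S >= l / 2
    print(l, S, l / 2)
```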
For p > 1, the hyperharmonic series ∑_{k=1}^{∞} 1/k^p converges.
S_{2^ℓ − 1} = ∑_{k=1}^{2^ℓ − 1} 1/k^p.
Therefore
S_{2^ℓ − 1} = ∑_{m=1}^{ℓ} ∑_{k=2^(m−1)}^{2^m − 1} 1/k^p
≤ ∑_{m=1}^{ℓ} (1/2^(p−1))^(m−1).
S_{2^ℓ − 1} ≤ 1/(1 − 1/2^(p−1)).
∑_{k=1}^{∞} (1/k²) (1/2)^k,
By using limit theorems and index shift we find that the right-hand
side converges to 0 as m → ∞.
Proof. Suppose ∑_{n=0}^{∞} a_n is convergent to L ∈ V. Then
a_n = S_n − S_{n−1},
where S_n denotes the partial sum ∑_{k=0}^{n} a_k. Because S_n and S_{n−1} are both convergent to L, the sequence (a_n) is convergent as well and converges to L − L = 0.
The following is a very simple, but often useful test for divergence.
is divergent.
i. The series
∑_{n=0}^{∞} (a_n + b_n)
is convergent and converges to
∑_{n=0}^{∞} a_n + ∑_{n=0}^{∞} b_n.
7.8 Exercises
b_n := a_{n+1} − a_n, for n ∈ N.
i. lim_{n→∞} a_n = 0,
ii. ∑_{n=1}^{∞} a_n diverges.
(a) ∑_{k=3}^{∞} 2/k³    (b) ∑_{k=1}^{∞} k    (c) ∑_{k=1}^{∞} (1 + 1/k)^k
(d) ∑_{k=1}^{∞} (−1)^k / 3^(2k)    (e) ∑_{k=0}^{∞} (2k + 3)/((k + 1)²(k + 2)²)    (f) ∑_{k=1}^{∞} ᵏ√(2^(−k) + 3)
Chapter 8
Series with positive terms
In this chapter, we will consider a very special, but very important type of
series. These are series with real, positive, terms.
The chapter gives tools for answering the question: Does a series of posi-
tive terms converge or not, i.e. does it converge or does it diverge? So far,
we only know this for very specific series: we have seen that the harmonic
series diverges, the hyperharmonic series converges and geometric series
∑∞ k
k=0 q converges if and only if q ∈ (−1, 1). With the tools in this chap-
ter, however, we can conclude for many more series that they converge or
diverge.
As an example, consider the series
∑_{k=2}^{∞} k/(k² − 1).
For large k, the terms in this series, namely k/(k² − 1), are very close to 1/k.
We may therefore expect that the series diverges, just like the harmonic
series does. In this chapter, we will see various theorems that allow you to
rigorously derive this conclusion.
Then also
∑_{k=N}^{n} b_k > M.
Therefore the series ∑_{k=N}^{∞} b_k diverges. Then also the series ∑_{k=0}^{∞} b_k diverges.
before you applied the Comparison Test, because before you concluded
the convergence of the left-hand side, the statement does not make
sense.
k/(k² + 1)
are very close to 1/k for large k, so we might still expect that the series diverges, because the standard harmonic series diverges as well. However, in contrast to the previous example, the terms
1/k
are larger than the terms k/(k² + 1). That means we cannot apply the Comparison Test directly. There is however a very convenient way around it: this way is industrialized by the following theorem.
Proof. We show item (i). Assume ∑ bk converges and that the limit
lim_{n→∞} a_n/b_n
we have that
for all e > 0,
there exists N ∈ N,
for all n ≥ N, (8.2.1)
|a_n/b_n − L| < e.
|a_n/b_n − L| < 1.
a_n ≤ b_n (L + 1).
Since the series ∑ bk converges, by the limit laws for series, Theorem
7.7.1, the series
∑_{k=N}^{∞} b_k (L + 1)
converges as well.
Therefore by the comparison test, Theorem 8.1.1, we find that the series
∑_{k=N}^{∞} a_k
converges as well.
Example 8.2.2. Let us see how we can use the limit comparison test to
conclude that the series
∑_{k=2}^{∞} k/(k² + 1)
diverges.
For this, we will apply part (ii) of the Limit Comparison Test, Theorem
8.2.1.
We use sequences a : N → (0, ∞) and b : N → (0, ∞) defined for
k ≥ 2 by
a_k := k/(k² + 1)
and
b_k := 1/k.
(In general, for the comparison sequence bk it is good to try a sequence
for which you understand well whether the corresponding series di-
verges or converges, while at the same time you believe, have the in-
tuition, the inkling or the guess that ak and bk are close for k large.)
Then
a_k / b_k = (k/(k² + 1)) / (1/k) = 1/(1 + 1/k²).
By limit laws, we find that the limit of the denominator is 1, i.e.
lim_{k→∞} (1 + 1/k²) = lim_{k→∞} 1 + lim_{k→∞} 1/k² = 1 + 0 = 1.
Therefore, we may apply the limit law for the quotient and conclude that
lim_{k→∞} a_k/b_k = 1 / (lim_{k→∞} (1 + 1/k²)) = 1/1 = 1.
The series ∑_{k=2}^{∞} 1/k diverges and therefore it follows from the Limit Comparison Test that the series
∑_{k=2}^{∞} a_k = ∑_{k=2}^{∞} k/(k² + 1)
diverges as well.
Let me also say a word about a crucial technique we used in Theorem 8.2.1:
we used that because the sequence an /bn converges to L, there exists an
N ∈ N such that for all n ≥ N,
L − 1 < a_n/b_n < L + 1.
This expresses that we have some pretty good control on the terms a_n/b_n
0 < a_{N+k} ≤ q^k a_N
is divergent.
Proof. Suppose there exists an N ∈ N and a q ∈ (0, 1) such that for all n ≥ N, it holds that
ⁿ√(a_n) ≤ q.
0 ≤ a_n ≤ q^n.
The series
∑_{n=N}^{∞} q^n
8.5 Exercises
Suppose this series of absolute values converges (we will say that the se-
ries ∑∞
k=0 ak converges absolutely). Then the theorem will allow us to con-
clude that the series
∞
∑ ak
k =0
converges as well.
i. ak ≥ 0 for every k ≥ K
iii. limk→∞ ak = 0
In words, the sequences (S2n+1 ) and (S2n ) converge to the same limit.
Let’s call this limit s.
Finally,
S2n > s > S2n+1 = S2n − a2n+1
so that |S2n − s| < a2n+1 .
Similarly,
S2n+1 < s < S2n+2 = S2n+1 + a2n+2
so that |S2n+1 − s| < a2n+2 .
In conclusion, in such an alternating series we have the estimate for all
n∈N
| S n − s | ≤ a n +1
Therefore the sequence (Sn ) converges to s.
converges.
We would like to apply the Alternating Series Test. To do so, we need
to check its conditions.
We define the sequence a : N → R by
a_k := 1/k
a_k = 1/k ≥ 1/(k + 1) = a_{k+1}.
converges.
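The error estimate |S_n − s| ≤ a_{n+1} from the Alternating Series Test can be observed numerically. In the sketch below the sign convention (−1)^(k+1)/k and the use of a far-out partial sum as a stand-in for the limit s are our own choices for illustration.

```python
# Partial sums S_n of sum_{k>=1} (-1)^{k+1}/k (sign convention is an assumption),
# compared against a far-out partial sum standing in for the limit s; the
# Alternating Series Test bound |S_n - s| <= a_{n+1} is then checked.
def partial_sum(n):
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

s_approx = partial_sum(10**6)      # numerical proxy for the limit s
for n in range(1, 21):
    assert abs(partial_sum(n) - s_approx) <= 1 / (n + 1) + 1e-6
print("error bound |S_n - s| <= a_{n+1} confirmed for n = 1, ..., 20")
```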
converges absolutely if
∞
∑ k ak k
k =0
converges.
Since this is not an alternating series, we cannot apply the Leibniz test.
However, for every k ∈ N \ {0}, we have
|sin(k)/k²| ≤ 1/k².
The series
∑_{k=1}^{∞} 1/k²
is a standard hyperharmonic series, of which we know that it converges. By the comparison test, the series of absolute values then converges as well.
Therefore, the series
∑_{k=1}^{∞} sin(k)/k²
converges.
∑_{k=0}^{∞} C_k = (∑_{k=0}^{∞} A_k) (∑_{k=0}^{∞} B_k).
∑_{k=0}^{n} C_k = ∑_{k=0}^{n} ∑_{ℓ=0}^{k} A_ℓ B_{k−ℓ}
= ∑_{ℓ=0}^{n} ∑_{k=ℓ}^{n} A_ℓ B_{k−ℓ}
= ∑_{ℓ=0}^{n} ∑_{m=0}^{n−ℓ} A_ℓ B_m.
Therefore
∑_{k=0}^{n} |C_k| ≤ ∑_{ℓ=0}^{n} |A_ℓ| ∑_{m=0}^{n−ℓ} |B_m|
≤ (∑_{ℓ=0}^{n} |A_ℓ|) (∑_{m=0}^{n} |B_m|)
≤ (∑_{ℓ=0}^{∞} |A_ℓ|) (∑_{m=0}^{∞} |B_m|)
It follows that
|∑_{k=0}^{2n} C_k − ∑_{ℓ=0}^{2n} A_ℓ ∑_{m=0}^{2n} B_m| ≤ ∑_{ℓ=0}^{n} |A_ℓ| ∑_{m=n}^{∞} |B_m| + ∑_{ℓ=0}^{n} |B_ℓ| ∑_{m=n}^{∞} |A_m|.   (9.3.1)
exists. Similarly,
lim_{n→∞} ∑_{ℓ=0}^{n} |B_ℓ|
exists.
Moreover, because the series
∑_{m=0}^{∞} |B_m|
Similarly,
lim_{n→∞} ∑_{m=n}^{∞} |A_m| = 0.
converges to
(∑_{ℓ=0}^{∞} A_ℓ) (∑_{m=0}^{∞} B_m).
9.4 Exercises
• limk→∞ ak = 0,
There are two important elements to this definition: first of all, index se-
quences are sequences taking values in the natural numbers (as opposed
to just an arbitrary space). Secondly, an index sequence is strictly increas-
ing, so for every k ∈ N, nk+1 > nk .
n_k := 2k

ℓ:       0    1    2    3    4    5    6    7    8    9    10    11    12    13   ...
a_ℓ:     a_0  a_1  a_2  a_3  a_4  a_5  a_6  a_7  a_8  a_9  a_10  a_11  a_12  a_13
       = 1/1  1/2  1/3  1/4  1/5  1/6  1/7  1/8  1/9  1/10 1/11  1/12  1/13  1/14

k:         0    1    2    3    4     5     6
n_k:       1    3    5    7    9     11    13
a_{n_k}:   a_1  a_3  a_5  a_7  a_9   a_11  a_13
         = 1/2  1/4  1/6  1/8  1/10  1/12  1/14
example index sequence, and create a similar table for your own ex-
ample.
n_k ≥ n_{k_0} ≥ k_0 = m_0
where in the last inequality we made use of the fact that the index
sequence n : N → N is strictly increasing. Because nk ≥ m0 , it follows
by (10.3.1) that
dist( ank , p) < e.
k ↦ sup_{n≥k} a_n.
Note that this sequence is decreasing, because for larger k the supremum
is taken over a smaller set. We will show in the lemma below that then the
sequence k 7→ supn≥k an is also bounded from below.
Therefore, the sequence k 7→ supn≥k an has a limit, and the limit is in fact
equal to the infimum of the sequence. This limit is called the lim sup
sup_{n≥m} a_n < M
a_n ≤ sup_{ℓ≥m} a_ℓ < M.
This finishes our proof that ( an ) diverges to −∞, and we have de-
rived a contradiction. Hence the sequence k 7→ supn≥k ( an ) is in fact
bounded from below.
i.
for all e > 0,
there exists N ∈ N,
for all ` ≥ N,
a` < M + e.
ii.
for all e > 0,
for all k ∈ N,
there exists m ≥ k,
am > M − e.
ℓ_0 ∈ N such that
sup_{k≥ℓ_0} a_k < M + e.
sup_{n≥k} a_n ≥ M.
a_m > sup_{n≥k} a_n − e ≥ M − e.
We will now show that if M ∈ R satisfies conditions (i) and (ii), that
then M equals lim sup`→∞ a` .
Assume M satisfies conditions (i) and (ii).
We first need to settle that a : N → R is bounded from above and does
not diverge to −∞.
We will show that a : N → R is bounded from above. By (i), we
may obtain an N ∈ N such that for all ` ≥ N, a` < M + 1. Choose
L := max( a0 , a1 , . . . , a N −1 , M + 1). Then for all ` ∈ N, it holds that
a` ≤ L. Hence L is an upper bound for a : N → R and a : N → R is
indeed bounded.
We will now show that a : N → R does not diverge to −∞. We argue
by contradiction. Suppose a : N → R diverges to −∞. Then there
exists an N0 ∈ N such that for all ` ≥ N0 , a` < M − 1. However, by (ii)
there exists an m ≥ N0 such that am > M − 1. This is a contradiction.
M ≤ sup_{n≥k} a_n.
M − e < sup_{n≥k} a_n
M − e < a_m.
Therefore also
M − e < sup_{n≥k} a_n
Finally, we show that for every e > 0, there exists a k ∈ N such that
sup_{n≥k} a_n < M + e
lim_{k→∞} a_{n_k} = M.
Suppose now that n`−1 is defined for some ` ∈ N \ {0}. We are going
to define n` . We know that there exists an m` ∈ N such that for every
m ≥ m` ,
a_m ≤ M + 1/(ℓ + 1).
Now, there exists an n_ℓ ≥ max(n_{ℓ−1}, m_ℓ) + 1 such that
M − 1/(ℓ + 1) < a_{n_ℓ}
and because n_ℓ > m_ℓ, we also know that
a_{n_ℓ} < M + 1/(ℓ + 1).
The previous theorem has the following consequence, which is a key fact
in analysis.
Corollary 10.4.4 (Bolzano-Weierstrass). Every bounded, real-valued se-
quence has a subsequence that converges in (R, distR ).
lim sup a`
`→∞
More precisely, the lim inf is a function that takes in a real-valued sequence
a : N → R and outputs
i.
for all e > 0,
there exists N ∈ N,
for all ` ≥ N,
a` > M − e.
ii.
for all e > 0,
for all K ∈ N,
there exists m ≥ K,
am < M + e.
So far we haven’t seen a statement that said that if two convergent se-
quences a : N → R and b : N → R are ordered as in a` ≤ b` for all `,
then lim`→∞ a` ≤ lim`→∞ b` . Part of the reason is that the assumption of
convergence of both sequences is a bit unsatisfactory. The next proposi-
tion is a generalization that can be much more useful, especially when it is
combined with the previous proposition.
10.7 Exercises
r j z b a g w q o r
x o l b d x s l e e
u h g c e c k v n i
v c n i l t j n h c
e e i u s u m e c t
c r y o b v n d f d
b d h z f a z s l i
h f k s o x c f n o
a x c n d i a d c l
e u y i j s c v i k
m s g n o c d n f g
for all k ∈ N,
there exists m ≥ k,
Pm = blue.
Pnk+1 = blue.
Pnk = blue.
Exercise 10.7.7. Let a : N → R be a sequence with (at least) two sequential
accumulation points p, q ∈ R (with p 6= q). Prove that the sequence a :
N → R does not converge.
Chapter 11
Point-set topology of metric spaces
The main purpose of the current section and the next is to introduce three
stronger and stronger properties for subsets of a metric space: closedness,
completeness and compactness. Here ‘stronger and stronger’ means that
every compact set is complete, and every complete set is closed. However,
not every closed set is complete, and not every complete set is compact.
If we know that a subset of K of a metric space is compact, we get a lot of
amazing properties for free.
B( p, r ) := { x ∈ X | dist( x, p) < r } .
The reason for the parentheses around ‘open’ is that yes, soon we will
prove that this set is indeed open, however so far we have not defined
what ‘open’ really is!
Open sets are subsets for which every point in the subset is an interior
point.
Having defined what it means for a set to be open, we can now prove that
the (open) ball is indeed open.
B( p, r ) := { x ∈ X | dist( x, p) < r }
is indeed open.
Before giving the proof of the proposition, I’d like to say the following.
If by this point, you have the blue exercises and the best practices down,
the proof of the proposition may come to you very easily. This is one of
the reasons that I stress the best practices so much: whereas without them
it may be difficult to even see where to start, with them the proof can be
written down almost mechanically.
If you still would have difficulties giving such a proof yourself, don’t
worry, it takes time to get used to proving mathematical statements. If
you’re still struggling with following the best practices, the proofs in this
chapter may help you get further in your understanding. It is especially
helpful to see if you can recognize the various components of the best
practices in the proofs that are given.
r − dist( x, p) > 0.
The second part of the proof gives another good example of a proof that
shows that a set is open.
Proof. Note that the interval ( a, b) is exactly equal to the (open) ball
y < x + r ≤ x + (b − x ) = b
Proposition 11.1.5. Let ( X, dist) be a metric space. Then both the empty
set ∅ and the set X itself (both of these are subsets of X) are open.
Proof. We will first show that the empty set is open. The argument is
a bit silly (yet logically correct). We argue by contradiction. Suppose
there exists a point x ∈ ∅ such that x is not an interior point of X. Then
we have a contradiction, because the empty set has no elements.
We will now show that X is open. Let x ∈ X. We will show that x is
an interior point, i.e. we will show that there exists an r > 0 such that
B( x, r ) ⊂ X.
Choose r := 1. Then B( x, r ) = B( x, 1) ⊂ X.
The set of all interior points of a subset A ⊂ X is called the interior of the
set A.
At the end of this section we provide a few ways to create new open sets
out of sets about which you already know that they are open.
Unions of open sets are always open. You may recall that if I is some set,
and if for every α ∈ I we have a subset Aα ⊂ X, then the union
⋃_{α∈I} A_α ⊂ X
is defined as
⋃_{α∈I} A_α := {x ∈ X | there exists α ∈ I such that x ∈ A_α}.
is open.
O1 ∩ · · · ∩ O N
is also open.
O1 × · · · × Od (= {(o1 , · · · , od ) | oi ∈ Oi })
Both the empty set and the full set are closed.
Proposition 11.2.2. Let ( X, dist) be a metric space. Then both the empty
set ∅ and the set X itself are closed.
Warning: The following two facts may conflict your expectations when
you use intuition for the meaning of ‘open’ and ‘closed’ from daily life:
ii. In Exercise 11.6.2 we will see that there is a set (and many more)
that are neither open nor closed.
What does it mean in practice? If you want to show that a set is closed,
it is not enough to show that the set is not open.
In other words
for all r > 0,
there exists a ∈ B(p, r),   (11.2.2)
a ∉ O.
lim_{n→∞} dist(y_n, p) = 0
lim_{n→∞} y_n = p.
Here is a typical example of how you can show that a set is closed.
and
lim_{n→∞} y_2^(n) = z_2.
By limit theorems, we know that the limit of the sequence n ↦ (y_2^(n))² also exists and
lim_{n→∞} (y_2^(n))² = (z_2)².
Since for all n ∈ N, y^(n) ∈ A, we also know that for all n ∈ N, y_1^(n) ≤ (y_2^(n))². Therefore,
z_1 = lim_{n→∞} y_1^(n) ≤ lim_{n→∞} (y_2^(n))² = (z_2)².
We now provide a few ways to create new closed sets out of sets about
which you already know that they are closed.
is defined as
⋂_{α∈I} A_α := {x ∈ X | for all α ∈ I, x ∈ A_α}.
is closed as well.
C1 ∪ · · · ∪ CN
is also closed.
C1 × · · · × Cd (= {(c1 , · · · , cd ) | ci ∈ Ci })
Although for some sets, the topological boundary may coincide with
what you intuitively think of as a ‘boundary’ of a set, for many sets
and
int(R \ [2, 5)) = (−∞, 2) ∪ (5, ∞).
Therefore
Let n ∈ N. Then
dist( an , p) ≤ M.
11.4 Completeness
We first now give a definition of completeness for a metric space.
Note that we have used the term ‘complete’ various times in the lecture
notes: completeness of a totally ordered field, the series characterization
of completeness in normed vector spaces and now completeness of a metric
space. In the next section we will see that a normed vector space satisfies
the series characterization of completeness if and only if the correspond-
ing metric space is complete. What we will do next is show that the metric
space (R, distR ) is complete (as a metric space). Under the hood, we re-
ally use the Completeness Axiom 4.2.7 for this: that axiom is really what
makes everything work.
lim sup ak .
k→∞
Proof. The “only if” side of this proposition follows from Proposition
11.4.4.
We will now show the “if” part of the proposition. Suppose A is closed.
Let ( xn ) be a Cauchy sequence in A. Then ( xn ) is also a sequence in C.
Proof. We first show the ‘only if’ direction. Suppose (V, ‖·‖) is complete. Let a : N → V be a sequence and suppose the series
∑_{k=0}^{∞} ‖a_k‖
converges.
We are going to show that (S_n) is a Cauchy sequence. Let e > 0. Since the series
∑_{k=0}^{∞} ‖a_k‖
‖S_n − S_m‖ = ‖∑_{k=m+1}^{n} a_k‖ ≤ ∑_{k=m+1}^{n} ‖a_k‖ ≤ ∑_{k=m+1}^{∞} ‖a_k‖ < e.
‖a_{n_{k+1}} − a_{n_k}‖ ≤ 2^(−k).
b_k := a_{n_{k+1}} − a_{n_k}.
converges.
11.6 Exercises
Exercise 11.6.1. Let (V, k · k) be a normed linear space and let A be the
closed ball of radius 1 around the origin, i.e.
A : = { v ∈ V | k v k ≤ 1}.
Exercise 11.6.2. Show that the interval [0, 1) is neither open nor closed
(seen as a subset of the normed linear space (R, | · |) ).
Note the moral of the previous exercise: there are sets that are neither open
nor closed.
L := {( x, y) ∈ R2 | x + 2y = 1}.
Exercise 11.6.5. Give an example of a metric space ( X, dist) that is not com-
plete (as always, actually prove that ( X, dist) is indeed not complete).
A := {( x1 , x2 ) ∈ R2 | 4( x1 )2 + ( x2 )2 ≤ 25}.
Chapter 12
Compactness
In this chapter, we are going to define what it means for a subset of a met-
ric space to be compact. Compactness is a strong property: Every compact
subset is complete, and every complete subset is closed. We will in this
chapter also give an alternative characterization of compactness: We will
define what it means for a subset to be totally bounded and will use this
concept to show that a subset is compact if and only if it is complete and
totally bounded. In (Rd , k · k2 ) however, we will see that a subset is com-
pact if and only if it is closed and bounded.
there exists q ∈ X,
there exists M > 0,
for all p ∈ A,
dist( p, q) ≤ M.
We will now define what it means for a subset to be totally bounded. In-
tuitively, it means that for every radius r > 0 (which could be extremely
small) the subset can be covered with only a finite number of balls with
radius r.
Choose q := p1 .
Choose M := max(dist( p2 , p1 ), . . . , dist( p N , p1 )) + 1.
Let p ∈ A. Then there exists an i ∈ {1, . . . , N } such that p ∈ B( pi , 1).
It follows that
dist( p, q) = dist( p, p1 )
≤ dist( p, pi ) + dist( pi , p1 )
≤ 1 + max(dist( p2 , p1 ), . . . , dist( p N , p1 ))
= M.
In the special case of the normed vector space (Rd , k · k2 ), however, a sub-
set is totally bounded if and only if it is bounded.
G := [−M, M]^d ∩ (δZ)^d.
These are all points in R^d of which the coordinates are an integer multiple of δ, and lie between −M and M. Note that G is a finite set, so we can write
G = {p_1, . . . , p_N}.
dist(p_i, p_j) ≥ r
for every i ≠ j. Note that we can also phrase this last property as that for all k ∈ N, and all i < k, dist(p_i, p_k) ≥ r.
We first just take some point p0 ∈ K. Now let k ∈ N and assume the
points p0 , . . . , pk have already been defined, and dist( pi , pk ) ≥ r for
i ∈ {0, . . . , k − 1}. Then we know that
k
[
K 6⊂ B ( p i , r ).
i =0
We will now tackle the base case of the inductive definition. For this,
we let the index sequence n^(0) : N → N be just the identity function, i.e.
n_ℓ^(0) := ℓ.
We now continue with the inductive step of the inductive definition.
Let k ∈ N and assume that the index sequence n^(k−1) : N → N is defined for some k ∈ N \ {0}. We are going to define the index sequence n^(k) : N → N as a subsequence of n^(k−1). Note that there exists an N_k ∈ N and points p_1^(k), . . . , p_{N_k}^(k) ∈ K such that
K ⊂ ⋃_{i=1}^{N_k} B(p_i^(k), 1/k).
Said differently, the set K is covered by only finitely many balls of radius 1/k. Hence, there exists a point p_{i_k}^(k) such that x_{n_ℓ^(k−1)} ∈ B(p_{i_k}^(k), 1/k)
where for the strict inequality we used that n^(ℓ) is an index sequence and therefore strictly increasing, while the second inequality follows because for every two index sequences a, b : N → N and every i ∈ N, a_{b_i} ≥ a_i (this can be applied to the case where a ∘ b equals n^(ℓ+1) and a equals n^(ℓ)). This finishes the proof of the claim that m : N → N is an index sequence.
We now claim that the sequence (x_{m_ℓ})_ℓ is a Cauchy sequence. Let e > 0. Choose M := ⌈2/e⌉ + 1. Let ℓ̃, j̃ ≥ M. Then, because n^(ℓ̃) is a subsequence of n^(M), we can find an ℓ ∈ N such that
m_{ℓ̃} = n_{ℓ̃}^(ℓ̃) = n_ℓ^(M).
For the same reason, we may find a j ∈ N such that m_{j̃} = n_j^(M). It follows by (12.3.1) that
dist(x_{m_ℓ̃}, x_{m_j̃}) = dist(x_{n_ℓ^(M)}, x_{n_j^(M)}) < 2/M < e.
12.4 Exercises
Exercise 12.4.1. Let a, b ∈ R be two real numbers such that a < b. Prove
that the interval [ a, b] is a compact subset of the normed vector space (R, | ·
|).
Exercise 12.4.2. Let ( X, dist) be a metric space and let a : N → X be a
sequence in X. Show that the sequence a : N → X is bounded (according
to Definition 5.2.1) if and only if the set
A : = { a n | n ∈ N}
Exercise 12.4.3. Consider the metric space ((0, 1), dist), where (0, 1) de-
notes the interval from 0 to 1 and dist( x, y) = | x − y|. Prove that (0, 1) is
a closed and bounded subset of the metric space ((0, 1), dist). Also prove
that (0, 1) is not a compact subset of the metric space ((0, 1), dist).
The moral of this exercise is that the Heine-Borel theorem (Theorem 12.3.2)
is an alternative characterization of compactness in (Rd , k · k2 ): for other
metric spaces or normed vector spaces it does not hold that subsets are
compact if and only if they are closed and bounded.
Exercise 12.4.4. Let ( X, dist) be a metric space and let K ⊂ X be a compact
subset. Let a : N → X be a sequence with values in X, such that
for all N ∈ N,
there exists ` ≥ N, (12.4.1)
a` ∈ K.
(this is formal equivalent way of saying that there are infinitely many ` ∈
N such that a` ∈ K).
The exercise consists of two parts:
ii. Use the fact that K is compact to show that there is a point p ∈ K and
a subsequence of a : N → X converging to p.
A := {( x1 , x2 ) ∈ R2 | x1 − x2 = 1}
and
B := {( x1 , x2 ) ∈ R2 | ( x1 )2 + ( x2 )2 ≤ 1}
Prove that the set A ∩ B is compact (as a subset of the normed vector space
(R2 , k · k2 )).
Chapter 13
Limits and continuity
We are finally ready to treat perhaps the most important target of Anal-
ysis 1: the concepts of limits and continuity. These concepts embody the
adagio of this course: to make rigorous statements about the approximate
behavior of functions.
The setting is as follows: We will consider functions f : D → Y map-
ping from a subset D ⊂ X of a metric space ( X, distX ) to a metric space
(Y, distY ). These are quite some actors: an input metric space ( X, distX ),
a subset D of the metric space, and an output metric space (Y, distY ), and
the concept of limits and continuity depend on all these actors. That makes
it a bit tricky.
On the coarsest level, if p ∈ X and q ∈ Y then the statement that
lim f ( x ) = q
x→ p
will mean that if the distance between x and p is small, but not zero, the
distance between f ( x ) and q will be small. Using a similar approach as
we took with sequences, we will make this vague statement completely
rigorous.
There is, however, one critical, tricky point. This concept of limits only
behaves nicely if p satisfies a special property with respect to the set D on
which f is defined. We will discuss this property in the next section.
Note that accumulation points of a set D do not have to lie in the set D
themselves. If a point does lie in D, but is not an accumulation point, then
we call it an isolated point.
if
for all e > 0,
there exists δ > 0,
for all x ∈ D,
if 0 < distX ( x, a) < δ, then distY ( f ( x ), q) < e.
Proposition 13.3.1. Let ( X, distX ) and (Y, distY ) be metric spaces and
let D ⊂ X be a subset of X. Let f : D → Y be a function on D. Let
a ∈ D 0 and assume
dist_Y(f(x), p) < e = dist_Y(p, q)/2
and there exists a δ_2 > 0 such that for all x ∈ D, if 0 < dist_X(x, a) < δ_2 then
dist_Y(f(x), q) < e = dist_Y(p, q)/2.
Choose such δ1 > 0 and δ2 > 0.
Now define δ := min(δ1 , δ2 ) > 0. Because a is an accumulation point
of D, there exists a point b ∈ B( a, δ). Then
dist_Y(f(b), p) < dist_Y(p, q)/2
and
dist_Y(f(b), q) < dist_Y(p, q)/2.
Therefore by the triangle inequality
lim f ( x ) = q
x→a
if and only if
that
for all e > 0,
there exists N ∈ N,
for all n ≥ N,
distY ( f ( x n ), q) < e.
Let e > 0. Because limx→ a f ( x ) = q, there exists a δ > 0 such that for
all x ∈ D, if 0 < distX ( x, a) < δ then
distY ( f ( x ), q) < e.
distY ( f ( x n ), q) < e.
13.6 Continuity
lim f ( x ) = f ( a).
x→a
Definition 13.6.2 (Continuity on the domain). Let ( X, distX ) and (Y, distY )
be two metric spaces and let D ⊂ X be a subset of X. We say a function
f : D → Y is continuous on D if f is continuous in a for every a ∈ D.
lim ( g ◦ f )( x n ) = ( g ◦ f )( a)
n→∞
lim g( f ( x n )) = g( f ( a)).
n→∞
lim g( f ( x n )) = g( f ( a))
n→∞
Proposition 13.9.1. Let ( X, distX ) and (Y, distY ) be two metric spaces
and let K ⊂ X be a compact subset of X. Let f : K → Y be continuous
on K. Then f (K ) is a compact subset of Y.
Definition 13.10.1. Let ( X, distX ) and (Y, distY ) be metric spaces and
let D ⊂ X be a non-empty subset. We say that f : D → Y is uniformly
continuous on D if
for all e > 0,
there exists δ > 0,
for all p, q ∈ D,
0 < distX ( p, q) < δ =⇒ distY ( f ( p), f (q)) < e.
Proposition 13.10.2. Let ( X, distX ) and (Y, distY ) be metric spaces and
let D ⊂ X be a non-empty subset. Let f : D → Y be uniformly contin-
uous on D. Then f is continuous on D.
Therefore, f is continuous in a.
Theorem 13.10.3. Let ( X, distX ) and (Y, distY ) be metric spaces, let K ⊂
X be compact and let f : K → Y be continuous on K. Then f is uni-
formly continuous on K.
which is a contradiction.
13.11 Exercises
Exercise 13.11.1. Let (X, dist_X) be (R², dist_{‖·‖₂}) (i.e. the metric space associated to the normed vector space (R², ‖·‖₂)) and let (Y, dist_Y) be (R, dist_R). Let D = B(0, 1) ⊂ R². Let f : D → R be defined as
f(x) := x₁² + x₂²   if x ≠ (0, 0),
f(x) := 185         if x = (0, 0).
Show that
lim_{x→(0,0)} f(x) = lim_{x→(0,0)} (x₁² + x₂²) = 0.
f (x) = x for x ∈ R
Exercise 13.11.4. Let ( X, distX ) and (Y, distY ) be two metric spaces, and let
D ⊂ X be a subset of X. Let f : D → Y be a bounded function (i.e. f ( D )
is a bounded subset of (Y, distY )) and let a ∈ D ∩ D 0 . Define ω : (0, ∞) →
[0, ∞) by
Suppose
inf{ω (r ) | r ∈ (0, ∞)} = 0.
Show that f : D → Y is continuous in a ∈ D ∩ D 0 .
Exercise 13.11.5. Let ( X, distX ) and (Y, distY ) be metric spaces, let D ⊂ X
and let f : D → Y. Assume that f : D → Y is Lipschitz continuous, meaning that there exists a constant M > 0 such that for all x, z ∈ D,
dist_Y(f(x), f(z)) ≤ M · dist_X(x, z).
Chapter 14

Real-valued functions
v. If f(x) ≥ 0 for all x ∈ D, then for every k ∈ N \ {0}, the function
x ↦ (f(x))^{1/k} (the k-th root of f)
is continuous in a.
In some sense, we’re not ready for the next proposition: not only do we not
yet have the tools to prove it, worse, we are not even ready to define the
functions involved. Most likely, however, you have seen these functions
in high school or in Calculus. I think it’s good to mention the proposition
now anyway, as it plays a central role when you want to show that some
more complicated functions are continuous.
exp : R → R ln : (0, ∞) → R
sin : R → R arcsin : [−1, 1] → R
cos : R → R arccos : [−1, 1] → R
tan : (−π/2, π/2) → R arctan : R → R
Definition 14.4.1 (Limit from the left). Let (Y, distY ) be a metric space,
and let D ⊂ R be a subset of R. Let f : D → Y be a function. Let a ∈ R
be such that a ∈ ((−∞, a) ∩ D )0 , i.e. such that a is an accumulation
point of the set (−∞, a) ∩ D in the metric space (R, distR ). Let q ∈ Y.
We say that f ( x ) converges to q as x approaches a from the left (or from
below), and write
lim_{x↑a} f(x) = q, or sometimes lim_{x→a⁻} f(x) = q,
if
for all e > 0,
there exists δ > 0,
for all x ∈ D ∩ (−∞, a),
0 < distR ( x, a) < δ =⇒ distY ( f ( x ), q) < e.
Definition 14.4.2 (Limit from the right). Let (Y, distY ) be a metric space,
and let D ⊂ R be a subset of R. Let f : D → Y be a function. Let a ∈ R
be such that a ∈ (( a, ∞) ∩ D )0 , i.e. such that a is an accumulation point
of the set ( a, ∞) ∩ D in the metric space (R, distR ). Let q ∈ Y. We say
that f ( x ) converges to q as x approaches a from the right (or from above),
and write
lim_{x↓a} f(x) = q, or sometimes lim_{x→a⁺} f(x) = q,
if
for all e > 0,
there exists δ > 0,
for all x ∈ D ∩ ( a, ∞),
0 < distR ( x, a) < δ =⇒ distY ( f ( x ), q) < e.
Definition 14.5.1 (The extended real line). The extended real line Rext
is the union of the set R and two symbols, ”∞” and ”−∞”. That is
Rext = R ∪ {∞} ∪ {−∞}.
We now want to turn the extended real line into a metric space. For that,
we need to define a distance on the extended real line. We do this by first
defining the map ι : Rext → [−1, 1] by
ι(x) := −1          if x = −∞,
ι(x) := x/(1 + x)   if x ∈ R and x ≥ 0,
ι(x) := x/(1 − x)   if x ∈ R and x < 0,
ι(x) := 1           if x = ∞.
Because this function is injective, we can now use Exercise 2.7.1 to build a
distance on Rext .
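To make this concrete, here is a small numerical sketch (not part of the formal development) of the distance obtained by pulling back the distance on [−1, 1] through ι, i.e. dist(x, y) := |ι(x) − ι(y)|. The helper names iota and dist_ext and the use of floating-point infinities to represent the symbols ∞ and −∞ are illustrative choices only.

import math

def iota(x):
    # The map ι : R_ext → [−1, 1]; ∞ and −∞ are represented by float infinities.
    if x == -math.inf:
        return -1.0
    if x == math.inf:
        return 1.0
    return x / (1 + x) if x >= 0 else x / (1 - x)

def dist_ext(x, y):
    # Distance on R_ext pulled back through the injective map ι.
    return abs(iota(x) - iota(y))

print(dist_ext(0, math.inf))       # 1.0
print(dist_ext(10**6, math.inf))   # very small: large numbers are close to ∞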
14.6 Limits to ∞ or −∞
Definition 14.6.1 (divergence to ∞). Let ( X, distX ) be a metric space,
let D be a subset of X and assume a ∈ D 0 . Let f : D → R. We say that
f diverges to ∞ in a if
for all M ∈ R,
there exists δ > 0,
for all x ∈ D,
0 < distX ( x, a) < δ =⇒ f ( x ) > M.
Similarly, we say that f diverges to −∞ in a if
for all M ∈ R,
there exists δ > 0,
for all x ∈ D,
0 < distX ( x, a) < δ =⇒ f ( x ) < M.
lim_{x→∞} f(x) = q
if
for all e > 0,
there exists z ∈ R,
for all x ∈ D,
x > z =⇒ distY ( f ( x ), q) < e.
Definition 14.7.2. Let (Y, distY ) be a metric space and let D be a subset
of R that is unbounded from below. Let q ∈ Y. Let f : D → Y be a
function. We say that f ( x ) converges to q as x → −∞, and write
lim_{x→−∞} f(x) = q
if
for all e > 0,
there exists z ∈ R,
for all x ∈ D,
x < z =⇒ distY ( f ( x ), q) < e.
lim_{x→∞} f(x) = ∞
if
for all M ∈ R,
there exists z ∈ R,
for all x ∈ D,
x > z =⇒ f ( x ) > M.
lim_{x→∞} f(x) = −∞
if
for all M ∈ R,
there exists z ∈ R,
for all x ∈ D,
x > z =⇒ f ( x ) < M.
Limit statement of the form | Lines in the formal definition | Needs X = R?

lim_{x→a} f(x) = ...    |  ..., there exists δ > 0, for all x ∈ D, if 0 < dist_X(x, a) < δ, ...            |  No
lim_{x↑a} f(x) = ...    |  ..., there exists δ > 0, for all x ∈ D ∩ (−∞, a), if 0 < dist_X(x, a) < δ, ...  |  Yes
lim_{x↓a} f(x) = ...    |  ..., there exists δ > 0, for all x ∈ D ∩ (a, ∞), if 0 < dist_X(x, a) < δ, ...   |  Yes
lim_{x→−∞} f(x) = ...   |  ..., there exists z ∈ R, for all x ∈ D, if x < z, ...                           |  Yes
lim_{x→∞} f(x) = ...    |  ..., there exists z ∈ R, for all x ∈ D, if x > z, ...                           |  Yes

Table 14.1: Possible patterns for limit statements regarding the domain, and the lines that correspond to them in the formal definition
Limit statement of the form | Lines in the formal definition | Needs Y = R?

lim f(x) = q    |  for all ε > 0, ..., ..., ..., dist_Y(f(x), q) < ε  |  No
lim f(x) = −∞   |  for all M ∈ R, ..., ..., ..., f(x) < M             |  Yes
lim f(x) = ∞    |  for all M ∈ R, ..., ..., ..., f(x) > M             |  Yes

Table 14.2: Possible patterns for limit statements regarding the target space, and the lines that correspond to them in the formal definition
For example, to write out the formal definition of
lim_{x↑a} f(x) = ∞,
take from Table 14.1 the lines belonging to a statement of the form lim_{x↑a} f(x) = ...:
..., there exists δ > 0, for all x ∈ D ∩ (−∞, a), if 0 < dist_X(x, a) < δ, ...
and from Table 14.2 the lines belonging to a statement of the form lim f(x) = ∞:
for all M ∈ R, ..., ..., ..., f(x) > M.
Combining the two patterns gives the formal definition:
for all M ∈ R,
there exists δ > 0,
for all x ∈ D ∩ (−∞, a),
if 0 < dist_X(x, a) < δ, then f(x) > M.
f(x) = lim_{n→∞} f(x_n) ≤ c
and
f(x) = lim_{n→∞} f(y_n) ≥ c.
In conclusion, f ( x ) = c.
lim_{n→∞} f(x_n) = M.
c₁‖x‖_A ≤ ‖x‖_B ≤ c₂‖x‖_A.
We will now show that any two norms on a finite-dimensional vector space
are equivalent.
L(x) := x₁v₁ + ⋯ + x_d v_d.
S = {(x₁, …, x_d)ᵀ ∈ R^d | x₁² + ⋯ + x_d² = 1}
f(x) = (‖·‖_A ∘ L)(x) = ‖L(x)‖_A
x₁v₁ + ⋯ + x_d v_d = 0
x₁ = ⋯ = x_d = 0.
We conclude that there exist constants c₁, c₂ > 0 such that for all v ∈ V,
c₁‖v‖_A ≤ ‖v‖_B ≤ c₂‖v‖_A.
From this theorem it also follows (by the series characterization of com-
pleteness in Theorem 11.5.1) that every absolutely converging series in a
finite-dimensional normed vector space is converging: a statement that we
announced as Proposition 9.2.3.
i. for all a, b ∈ V,
L( a + b) = L( a) + L(b)
k L(v)kW ≤ MkvkV .
(λL)(v) = λ( L(v)).
The zero-element in this vector space BLin(V, W ) is the map that maps
every vector to the zero-element of W.
We would like to be able to talk about the norm of such a linear map. We
now introduce one such norm, called the operator norm.
defined by
‖L‖_{V→W} := sup_{x ∈ B̄_V(0,1)} ‖L(x)‖_W
is a norm on BLin(V, W).
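As a sanity check (an illustration, not part of the notes’ development): when V = R^d and W = R^m both carry the Euclidean norm, the operator norm of the matrix representing L is its largest singular value, and it can also be approximated by brute force over the closed unit ball. A sketch in Python, assuming numpy is available:

import numpy as np

rng = np.random.default_rng(0)
L = rng.standard_normal((3, 2))   # a linear map L : R^2 → R^3 as a matrix

op_norm = np.linalg.norm(L, 2)    # largest singular value = ‖L‖ for Euclidean norms

# Brute-force approximation of sup_{‖x‖_2 = 1} ‖Lx‖_2 by sampling the unit sphere.
xs = rng.standard_normal((2, 100000))
xs /= np.linalg.norm(xs, axis=0)
approx = np.linalg.norm(L @ xs, axis=0).max()

print(op_norm, approx)            # the two values should nearly agree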
‖x‖_V = ‖ v/‖v‖_V ‖_V = (1/‖v‖_V) ‖v‖_V = 1.
It follows that
Therefore
ι ( x ) = x1 v1 + · · · + x d v d .
The map ι is a bijective linear map, and therefore its inverse ι−1 is a
bijective linear map as well: it is the map that assigns to a vector v its
components x1 , . . . , xd with respect to the basis v1 , . . . , vd . Because ι−1
is injective, the function
k · k 2 ◦ ι −1
is a norm on V. By the equivalence of norms, there exists a constant
C > 0 such that for every v ∈ V,
Then
‖L(v)‖_W = ‖L(x₁v₁ + ⋯ + x_d v_d)‖_W
  ≤ |x₁|‖L(v₁)‖_W + ⋯ + |x_d|‖L(v_d)‖_W
  ≤ ( ∑_{i=1}^d |x_i| ) max_{j=1,…,d} ‖L(v_j)‖_W
  ≤ max_{j=1,…,d} ‖L(v_j)‖_W √d ‖x‖₂
  ≤ max_{j=1,…,d} ‖L(v_j)‖_W √d C ‖v‖_V.
Proof. We first show the “only if” direction. Assume therefore that
L : V → W is continuous. Then L is in particular continuous in 0 ∈ V.
Therefore, there exists a δ > 0 such that for all v ∈ V, if 0 < kvkV < δ,
then k LvkW < 1. Choose such a δ > 0.
Choose M := 2/δ. Let v ∈ B̄(0, 1). If v = 0, then k LvkW = 0 < M.
Suppose now v 6= 0. We also know that kvkV < 2. It follows that
‖Lv‖_W = ‖L(v)‖_W = (2/δ)‖L(δv/2)‖_W < (2/δ) · 1 = 2/δ = M.
k L(v)kW ≤ MkvkV .
Then
0 ≤ k L(un ) − L(v)kW
= k L(un − v)kW
≤ M k u n − v kV
14.12 Exercises
and
lim_{x→∞} f(x) = −∞.
Show that there exists a c ∈ (0, ∞) such that f (c) = 0.
Exercise 14.12.6. Let f : R → R be a function and let a ∈ R. Let L ∈ R.
Show that
lim_{x→a} f(x) = L
if and only if
lim_{x↓a} f(x) = lim_{x↑a} f(x) = L.
and
lim_{x↑3} f(x) = ∞.
Show that f attains a minimum on the interval (−∞, 3). I.e., show that
there exists a c ∈ (−∞, 3) such that for all x ∈ (−∞, 3),
f ( c ) ≤ f ( x ).
f(x) := ((x₁)⁴ − 2(x₂)²) / ((x₁)⁴ + (x₂)⁴)   if (x₁, x₂) ≠ (0, 0),
f(x) := 0                                    if (x₁, x₂) = (0, 0).
Either prove that the function f is continuous or prove that it is not contin-
uous (where f is viewed as a function from the domain R2 in the normed
vector space (R2 , k · k2 ) to the normed vector space (R, | · |)).
Chapter 15
Differentiability
Err a ( x ) := f ( x ) − f ( a) − L a ( x − a)
it holds that
lim_{x→a} ‖Err_a(x)‖_W / ‖x − a‖_V = 0.
The following proposition relates the derivative to the derivative you are
used to from Calculus.
lim_{x→a} (f(x) − f(a)) / (x − a)
exists. Moreover, if this limit exists, we call it f 0 ( a), and then for all
h ∈ R,
f 0 ( a ) · h = ( D f ) a ( h ). (15.1.1)
f(x) = x.
lim_{x→a} (f(x) − f(a)) / (x − a) = lim_{x→a} (x − a)/(x − a) = 1.
The previous proposition can be generalized to the case in which the target
is an arbitrary normed vector space.
lim_{x→a} (f(x) − f(a)) / (x − a)   (15.1.2)
exists. Moreover, if this limit exists we denote it by f 0 ( a), and then for
all h ∈ R,
f 0 ( a ) · h = ( D f ) a ( h ).
instead of f 0 ( a).
D f : Ω → Lin(V, W )
Lin(V, W ).
a 7→ A.
Pi ( x ) = xi .
P i ( x + y ) = x i + y i = P i ( x ) + P i ( y ).
basis w1 , . . . , wm .
( DΨ) a = Ψ
for all a ∈ W.
The component functions Ψ1 , . . . Ψm of Ψ are together sometimes called
the dual basis of w1 , . . . , wm . Here, by component functions we mean the
functions Ψi : W → R that are defined by Ψi := Pi ◦ Ψ, i.e.
Ψ = ( Ψ1 , . . . , Ψ m ).
Proof. See e.g. Theorem 1.6.7 in the Linear Algebra 2 lecture notes
which in words means that ( M )ij is the ith coordinate of the vector L(v j )
expressed in the basis w1 , . . . , wm .
The matrix M is precisely that matrix such that for all x ∈ Rd , with y = Mx
it holds that
L ( x 1 v 1 + · · · + x d v d ) = y 1 w1 + · · · + y m w m .
Ψ(L(x₁v₁ + ⋯ + x_d v_d)) = Mx.
f¯ : Φ(Ω) → Rm
defined by
f¯( x ) = Ψ ◦ f ◦ Φ−1 ( x ).
([Df]_a)_{ij} = ∂f̄_i/∂x_j (Φ(a)).
then
[Df]_a = ( 2b₁      3(b₂)²
           4(b₁)³   5(b₂)⁴ ),
where b := Φ(a).
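The function behind this example is not written out above; one map whose Jacobian matches the displayed matrix is f̄(x) = (x₁² + x₂³, x₁⁴ + x₂⁵) in coordinates, and we use that here purely as an illustration of the formula ([Df]_a)_{ij} = ∂f̄_i/∂x_j(Φ(a)). The following sketch compares a finite-difference Jacobian with the analytic one; fbar and jacobian_fd are hypothetical helper names.

import numpy as np

def fbar(x):
    # Hypothetical coordinate representation consistent with the matrix above.
    return np.array([x[0]**2 + x[1]**3, x[0]**4 + x[1]**5])

def jacobian_fd(f, x, h=1e-6):
    # Forward-difference approximation of [Df]_x, built column by column.
    fx = f(x)
    J = np.zeros((len(fx), len(x)))
    for j in range(len(x)):
        e = np.zeros(len(x)); e[j] = h
        J[:, j] = (f(x + e) - fx) / h
    return J

b = np.array([1.0, 2.0])
print(jacobian_fd(fbar, b))
print(np.array([[2*b[0], 3*b[1]**2], [4*b[0]**3, 5*b[1]**4]]))  # analytic matrix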
But this was just a preview. Before we can show this, we need computation
rules such as the chain rule, the sum, product and quotient rules.
that then
lim_{x→a} ‖Err_a^{g∘f}(x)‖_W / ‖x − a‖_U = 0.
Let e > 0.
According to the template, the next step would be to find a δ > 0, but
for this step we need quite some preparation. First of all it is helpful to
define the error functions
Err_a^f(x) := f(x) − f(a) − (Df)_a(x − a)
and
Err_{f(a)}^g(y) := g(y) − g(f(a)) − (Dg)_{f(a)}(y − f(a))
Our strategy will be to find a δ > 0 such that for all x ∈ Ω, if 0 <
k x − akU < δ, the right-hand-side of (15.6.1), and therefore also the
left-hand-side, is less than e. Let’s see how we can find such a δ > 0.
Because g is differentiable in f(a) with derivative (Dg)_{f(a)}, it holds that
lim_{y→f(a)} ‖Err_{f(a)}^g(y)‖_W / ‖y − f(a)‖_V = 0.
Now define
ε₂ := min( 1, ε / (2‖(Dg)_{f(a)}‖_{V→W} + 1) ).
Define
δ₂ := ρ / (1 + ‖(Df)_a‖_{U→V}).
mapping to the normed vector space (R, | · |). Assume both f and g
are differentiable in the point a ∈ Ω, with derivative ( D f ) a : V → W
and ( Dg) a : V → R respectively. Then
for all h ∈ V.
(D(f/g))_a(h) = (1/(g(a))²) ( g(a)(Df)_a(h) − f(a)(Dg)_a(h) )
for all h ∈ V.
Ψ_i ∘ f
f′(a) = ( f₁′(a), ⋯, f_m′(a) ).
‖Err_a(x)‖_W / ‖x − a‖_V < 1.
Now let y : N → Ω be a sequence in Ω converging to a. Then, there
exists an N ∈ N such that for all n ≥ N,
It follows by Proposition 5.6.1 and the squeeze theorem that the se-
quence n 7→ k f (yn ) − f ( a)kW converges to zero, and we conclude by
Proposition 5.6.1 that the sequence ( f (yn )) converges to f ( a).
Then ( D f ) a = 0.
Proof. We will show the statement for the case in which f attains a local
maximum in a. In that case, there exists an r > 0 such that f ( x ) ≤ f ( a)
for all x ∈ B( a, r ). Because f is differentiable in a,
f(x) = f(a) + (Df)_a(x − a) + Err_a^f(x)
where
lim_{x→a} |Err_a^f(x)| / ‖x − a‖_V = 0.   (15.10.1)
( D f ) a (u) 6= 0.
( D f ) a (v) > 0.
‖v‖_V = ‖ u/‖u‖_V ‖_V = (1/‖u‖_V) ‖u‖_V = 1.
The intuition behind the rest of the proof is that f evaluated in points
in the direction of v, close enough to a, is larger than f ( a).
By (15.10.1) there exists a δ > 0 such that for all x ∈ Ω, if 0 < ‖x − a‖_V < δ then
|Err_a^f(x)| / ‖x − a‖_V < (1/2) |(Df)_a(v)|.   (15.10.2)
Now choose ρ := (1/2) min(r, δ) and define
y := a + ρv.
Then by positive homogeneity of the norm and the fact that ‖v‖_V = 1,
0 < ‖y − a‖_V = ‖ρv‖_V = ρ‖v‖_V = ρ = (1/2) min(r, δ).
Therefore on the one hand ky − akV < r and thus f (y) ≤ f ( a), but on
the other hand ky − akV < δ and thus
f(y) = f(a) + (Df)_a(y − a) + Err_a^f(y)
     = f(a) + (Df)_a(ρv) + Err_a^f(y)
     = f(a) + ρ(Df)_a(v) + Err_a^f(y)
     ≥ f(a) + ρ(Df)_a(v) − |Err_a^f(y)|.
c ∈ (a, b) such that
f′(c) = (f(b) − f(a)) / (b − a).
g(x) = f(x) − ((x − a)/(b − a)) f(b) − ((b − x)/(b − a)) f(a).
By the sum and product rules, the function g is differentiable on ( a, b).
By the rules for continuous functions, the function g is also continuous
on [ a, b]. Moreover,
g( a) = g(b) = 0.
It follows by Rolle’s theorem that there exists a c ∈ ( a, b) such that
g0 (c) = 0.
Then
0 = g′(c) = f′(c) − f(b)/(b − a) + f(a)/(b − a),
so that indeed
f′(c) = (f(b) − f(a)) / (b − a).
15.12 Exercises
Exercise 15.12.4. Prove Proposition 15.1.5. You may assume that W is finite-
dimensional.
( D f ) 0 ( v 1 + v 2 ) = w1
and
( D f )0 (v1 − 2v2 ) = w1 − w2 .
Give the matrix representation of the linear map ( D f )0 : V → W with
respect to the bases v1 , v2 and w1 , w2 .
Chapter 16
Differentiability of standard
functions
With these observations we can get pretty far, and conclude that polyno-
mials and rational functions are differentiable.
for every definition, lemma etc. I will usually not reintroduce these vari-
ables, but only indicate deviations from it.
We will consider two normed vector spaces (V, k · kV ) and (W, k · kW )
and a function f : Ω → W where Ω ⊂ V is an open subset of V. We
will from now assume that V and W are finite-dimensional, and we will
denote by v1 , . . . , vd a basis in V, with corresponding coordinate map Φ,
and by w1 , . . . , wm a basis in W with corresponding coordinate map Ψ.
f (x) = xn
is differentiable with
f 0 ( x ) = nx n−1 .
In other words, the derivative of f, i.e. (Df) : R → Lin(R, R), is given by
x ↦ ( h ↦ n x^{n−1} h ).
D : = { x ∈ Rd | q ( x ) 6 = 0 } .
f(x) = p(x) / q(x)
is differentiable.
In other words, every rational function is differentiable on its domain
of definition.
exp : R → R ln : (0, ∞) → R
sin : R → R cos : R → R
tan : (−π/2, π/2) → R arctan : R → R
f(t) = ( t², sin(t) )
with component functions
f₁(t) = t²
and
f₂(t) = sin(t).
Since these component functions are differentiable standard functions,
we find by Corollary 15.8.2 that f is differentiable as well and
16.4 Exercises
Exercise 16.4.1. Consider the polynomial f : R2 → R given by
f(x₁, x₂) = 3x₁^m x₂^n + x₁^k
for some nonnegative integers m, n and k. Since f is a polynomial, it is dif-
ferentiable on R2 . Give ( D f ) : R2 → Lin(R2 , R) and justify your answer.
Exercise 16.4.2. Prove Proposition 15.8.1.
Exercise 16.4.3. Consider the function f : R2 → R2 given by
f(x₁, x₂) = ( x₁x₂³ / (x₁² + x₂²), 5x₂ )   if (x₁, x₂) ≠ (0, 0),
f(x₁, x₂) = (0, 0)                          if (x₁, x₂) = (0, 0).
Note that ` a,v is an affine map, i.e. it is the sum of a constant and a linear
map. It is therefore differentiable on (−δ, δ), and for every t ∈ (−δ, δ), its
derivative ( D ` a,v )t in t is a linear map from R to V. In fact, for all h ∈ R,
To figure out what this means, we realize that ( D ( f ◦ ` a,v ))0 is a linear map
from R to W. So let’s see its output when the input is h ∈ R:
(D(f ∘ ℓ_{a,v}))_0(h) = ((Df)_a ∘ (Dℓ_{a,v})_0)(h)
                     = (Df)_a((Dℓ_{a,v})_0(h))
                     = (Df)_a(hv)
                     = h (Df)_a(v)
g := f ◦ ` a,v : (−δ, δ) → W
(D_v f)_a := g′(0) = lim_{h→0} (f(a + hv) − f(a)) / h.
How does the directional derivative relate to the derivative of a function? The
answer is subtle. If the derivative exists in a point a, then for all v ∈ V,
the directional derivative in the direction of v of f in the point a exists as
well and
( Dv f ) a = ( D f ) a ( v ) .
The last equality tells us that in this case the directional derivative in the
direction of v at a point a, namely ( Dv f ) a , is just the derivative of f in the
point a (which is a linear map!) applied to the vector v, namely ( D f ) a (v).
The precise statement is given by the next proposition. After the proposi-
tion, we will give a warning about the reverse direction: existence of direc-
tional derivatives does not say anything about existence of the derivative.
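A quick numerical illustration of the identity (D_v f)_a = (Df)_a(v) for a function that is differentiable; the particular f, point a and direction v below are illustrative choices only.

import numpy as np

def f(x):
    return x[0]**2 * x[1] + np.sin(x[1])    # an illustrative smooth function

a = np.array([1.0, 0.5])
v = np.array([2.0, -1.0])
grad = np.array([2*a[0]*a[1], a[0]**2 + np.cos(a[1])])   # (Df)_a as a row vector

h = 1e-6
directional = (f(a + h*v) - f(a)) / h    # difference quotient defining (D_v f)_a
print(directional, grad @ v)             # the two values should nearly agree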
( D f ) a ( v ).
Proof. This proposition follows from the Chain Rule, Theorem 15.6.1.
Indeed, let v ∈ V.
Because Ω is open, there exists a δ1 > 0 such that B( a, δ1 ) ⊂ Ω.
Consider now the function g := f ◦ ` a,v , which is a function from
(−δ, δ) → W where δ := δ1 /kvkV .
The function ` a,v is an affine function, and therefore it is differentiable
in 0 with derivative
( D ` a,v )0 := h 7→ hv .
(D_v f)_0 = lim_{t→0} (f(0 + tv) − f(0)) / t
          = lim_{t→0} (t v₁ − 0) / t
          = lim_{t→0} v₁
          = v₁,
while if v₂ = 0 then
(D_v f)_0 = lim_{t→0} (f(0 + tv) − f(0)) / t
          = lim_{t→0} (0 − 0) / t
          = 0.
∂f/∂x_i (a) := (D_{e_i} f)_a = d/dt|_{t=0} f(a + t e_i) = lim_{h→0} (f(a + h e_i) − f(a)) / h.
Here
e_i := (0, …, 0, 1, 0, …, 0),
with the 1 in the i-th position.
∂f/∂x₂.
( De2 f ) a
exists.
By Definition 17.2.1, we need to verify whether the derivative of the
function g : R → R defined by
∂f/∂x₂ (a) = (D_{e₂} f)_a = 2a₁ + 12a₂³.
In general, there are many different expressions for the partial derivative
of a function in some point a. Here are a few of them
∂f/∂x_i (a) := (D_{e_i} f)_a = d/dt|_{t=0} f(a + t e_i)
            = d/dt|_{t=0} f(a₁, …, a_{i−1}, a_i + t, a_{i+1}, …, a_d)
            = d/ds|_{s=a_i} f(a₁, …, a_{i−1}, s, a_{i+1}, …, a_d).
The moral of the last expression is very nice: to determine the partial
derivative of f in a point a, you keep all coordinates fixed except for the
ith coordinate, and you then view the function as a function of only that
ith coordinate. It then is a function of only one variable, and you can dif-
ferentiate according to the one-variable definition in calculus.
Let us record the statement in a proposition.
∂f/∂x_i (a) = d/dt|_{t=a_i} f(a₁, …, a_{i−1}, t, a_{i+1}, …, a_d)
d/dt|_{t=x₂} f(x₁, t) = d/dt|_{t=x₂} (x₁² + 2x₁t + 3t⁴) = 2x₁ + 12x₂³.
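This recipe is easy to check with a computer algebra system; assuming sympy is available, the partial derivative of the function x₁² + 2x₁x₂ + 3x₂⁴ used above can be verified as follows (a check, not a replacement for the definition).

import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**2 + 2*x1*x2 + 3*x2**4

# Partial derivative with respect to x2: keep x1 fixed and differentiate in x2.
print(sp.diff(f, x2))   # 2*x1 + 12*x2**3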
[ D f ] a = [ D f¯]Φ(a)
Ta := {(v, L a (v)) | v ∈ V }
f −1 ( c ) = { x ∈ V | f ( x ) = c }
at a is given by
{ x ∈ V | L a ( x ) = c }.
17.7 Exercises
Exercise 17.7.1. Consider the function f : R2 → R given by
f((x₁, x₂)) := (x₂)² / x₁   if x₁ ≠ 0,
f((x₁, x₂)) := 0            if x₁ = 0.
(a). Show that for all v ∈ R2 , the directional derivative ( Dv f )0 (i.e. the
directional derivative at 0 in the direction of v) exists and compute
its value.
∂f/∂x₁ (a),  ∂f/∂x₂ (a),  ∂f/∂x₃ (a)
exist and compute their values.
To get the most out of this inequality, you often don’t apply it to a function
directly, but rather to the difference of two functions. This difference could
for instance be a difference of the function you are interested in, and a
linear function. By picking a good difference of functions, the right-hand
side in the inequality actually becomes small, so that you can conclude
that the left-hand side becomes small too.
Proof. Denote
K := sup_{t∈(a,b)} ‖f′(t)‖_W.
In this first part of the proof we will use an “it suffices to show that” construction a few times, to reduce what we need to show to an easier statement.
We first claim that it suffices to show that for all ā ∈ ( a, b)
To see why this suffices, note that the left-hand side and right-hand
side can be viewed as continuous functions of ā. Therefore, if we know
that (18.1.1) holds, we can take the limit as ā → a on both sides, and
conclude that also
To prove the claim, we aim to show (18.1.1) from (18.1.2). First note
that if (18.1.2) holds for all e > 0 and all s ∈ [ ā, b], then it also holds for
all ε > 0 and s = b. We now argue by contradiction. Suppose that ‖f(b) − f(ā)‖_W > K(b − ā). Define
ε₁ := (1/2) ( ‖f(b) − f(ā)‖_W / (b − ā) − K ) > 0,
so that
‖f(b) − f(ā)‖_W > (K + ε₁)(b − ā).
Yet when we choose e = e1 in (18.1.2), we obtain
Our second claim is that whenever inequality (18.1.2) holds for all
s ∈ [ ā, c) for some c ∈ [ ā, b], it also holds for s = c. This follows
since the left-hand side and the right-hand of the inequality (18.1.2)
are continuous when interpreted as functions in s.
Our third claim is that whenever the inequality holds for all s ∈ [ ā, c]
for some c ∈ [ ā, b), there exists a δ > 0 such that the inequality holds
for all s ∈ [ ā, c + δ).
To prove this third claim, let c ∈ [ā, b) and assume the inequality holds for all s ∈ [ā, c]. Since f is differentiable in c, there exists a δ > 0 such that for all s ∈ [c, c + δ),
‖f(s) − f(c) − f′(c)(s − c)‖_W = ‖Err_c^f(s)‖_W ≤ ε|s − c| = ε(s − c).
As a consequence,
which shows that indeed inequality (18.1.2) is also satisfied for all s ∈
[ ā, c + δ). Hence we have proved the third claim.
We now define the set S as those s ∈ [ ā, b] such that for all σ ∈ [ ā, s],
inequality (18.1.2) is satisfied. In other words,
Note that just from its definition, it follows that S is either the empty
set, or just the point { ā} or it is an interval that is closed on the left with
ā as the left endpoint. From the first claim, we know that ā ∈ S, so S is
not empty. The second claim tells us that S is closed, so it is either { ā}
or it is a closed interval of the form [ ā, c] with c ∈ ( ā, b]. The third claim
gives a contradiction when S = { ā} or S = [ ā, c] with c < b. Therefore
S = [ ā, b] and inequality (18.1.2) is satisfied in s = b.
In other words, if for all v ∈ V,
‖L(v)‖_W ≤ K‖v‖_V,
then
‖L‖_{V→W} ≤ K.
The derivation of Corollary 18.2.1 from Lemma 18.1.1 is the topic of Exer-
cise 18.4.1.
∂f/∂x_i (x)
L_b : R^d → W by
L_b(v) = ∑_{i=1}^d ∂f/∂x_i (b) v_i.
Now define the error function Err_b^f : Ω → W by
Err_b^f(x) = f(x) − ( f(b) + L_b(x − b) ).
Let e > 0.
By assumption, for every i ∈ {1, …, d}, the partial derivative
∂f/∂x_i : Ω → W
is continuous, so there exists a δ_i > 0 such that for all z ∈ B(b, δ_i),
‖ ∂f/∂x_i (z) − ∂f/∂x_i (b) ‖_W < ε/d.
δ := min(δ1 , . . . , δd , ρ).
Now let x ∈ B(b, δ). To show that Err_b^f(x) is small, we are going to apply the Mean-Value Inequality (a few times), on paths that are parallel to the axes in R^d.
y₀ := (b₁, …, b_d)
y_j := (x₁, …, x_j, b_{j+1}, b_{j+2}, …, b_d)
y_d := (x₁, …, x_d)
and write
Err_b^f(x) = f(x) − f(b) − ∑_{i=1}^d ∂f/∂x_i (b)(x_i − b_i)
           = ∑_{i=1}^d ( f(y_i) − f(y_{i−1}) ) − ∑_{i=1}^d ∂f/∂x_i (b)(x_i − b_i).
g_i(t) = f(x₁, …, x_{i−1}, t, b_{i+1}, …, b_d) − ∂f/∂x_i (b)(t − b_i)
< (ε/d) ∑_{i=1}^d |x_i − b_i|
≤ (ε/d) · d · ‖x − b‖₂
= ε ‖x − b‖₂.
‖ ((Df)_c − (Df)_b)(x₁, …, x_d) ‖_W ≤ ∑_{i=1}^d ‖ ∂f/∂x_i (c) − ∂f/∂x_i (b) ‖_W |x_i|
  ≤ ( ∑_{i=1}^d ‖ ∂f/∂x_i (c) − ∂f/∂x_i (b) ‖_W² )^{1/2} ‖x‖₂.
It follows that
‖(Df)_c − (Df)_b‖_{V→W} ≤ ( ∑_{i=1}^d ‖ ∂f/∂x_i (c) − ∂f/∂x_i (b) ‖_W² )^{1/2}.
18.4 Exercises
Exercise 18.4.1. Prove Corollary 18.2.1.
k( D f ) a kR2 →W ≤ 5.
Prove that
k f ((2, 0)) − f ((−2, 0))kW ≤ 10π.
Exercise 18.4.4. Consider the function f : R2 → R given by
f((x₁, x₂)) := (x₁)²(x₂)⁷ / ((x₁)² + (x₂)²)   if (x₁, x₂) ≠ (0, 0),
f((x₁, x₂)) := 0                               if (x₁, x₂) = (0, 0).
The second order derivative is the derivative of the derivative, the third
order derivative is the derivative of the second order derivative, the fourth
order derivative is the derivative of the third order derivative, etc.. This
way, we create quite complicated objects, and I would therefore like to
encourage you to, at least at first, study what the statements in this chapter
are, and what the mathematical objects are, rather than the proofs of the
statements.
As a bit of help, here’s a list of most important messages for this chapter:
as a function
( D f ) : Ω → Lin(V, W )
i.e. it is a function from Ω to Lin(V, W ). Because Lin(V, W ) is a finite-
dimensional vector space again, we can use the definition of differentia-
bility to check whether the function ( D f ) : Ω → Lin(V, W ) is differen-
tiable in a point a. If so, we denote the derivative of ( D f ) in the point a by
( D ( D f )) a .
If ( D f ) is differentiable in every point a in V, then we say f is twice differ-
entiable, and the second derivative is a function
( D ( D f )) : Ω → Lin(V, Lin(V, W )).
Similarly, the third derivative is a function
( D ( D ( D f ))) : Ω → Lin(V, Lin(V, Lin(V, W ))).
The general definition can be given by induction. We first define the space
Linn (V, W ) inductively.
D n f : B( a, r ) → Linn (V, W )
V^{×n} = V × ⋯ × V   (n times)
to W.
The statement that we may equivalently interpret elements from Linn (V, W )
as multi-linear maps, precisely means that there is an invertible linear map
from Linn (V, W ) to MLin(V ×n , W ) that preserves norm (with the choice of
norm on MLin(V ×n , W ) that we will give later). Intuitively, this has as a
consequence that for all intents and purposes these two spaces are the same.
The linear map J_n that brings elements in Lin^n(V, W) to multilinear maps
in MLin(V ×n , W ) is given by
J1 A : = A
K1 B = B
and
(Kn+1 B)(v1 ) = Kn ( B(v1 , · · · )).
We will now show that Jn+1 ◦ Kn+1 is the identity. Let v1 ∈ V and let
B ∈ MLin(V ×(n+1) , W ). Then
lim_{t→0} (1/t²) ( f(a + tu + tv) − f(a + tu) − f(a + tv) + f(a) )   (19.5.1)
(D²f)_a(v, u).
we find that
and
(Df)_{a+τv} = (Df)_a + (D(Df))_a(τv) + Err_a^{Df}(a + τv).
Note that the left-hand side and right-hand side of these equations are linear maps. We apply these linear maps to the vector v ∈ V and find
and
(Df)_{a+τv}(v) = (Df)_a(v) + (D(Df))_a(τv)(v) + Err_a^{Df}(a + τv)(v)
              = (Df)_a(v) + τ(D²f)_a(v, v) + Err_a^{Df}(a + τv)(v).
‖ (1/t²)( f(a + su + tv) − f(a + su) − f(a + tv) + f(a) ) − (D²f)_a(u, v) ‖_W
  ≤ (1/|t|) sup_{τ∈(−|t|,|t|)} ( ‖Err_a^{Df}(a + tu + τv)‖_{V→W} + ‖Err_a^{Df}(a + τv)‖_{V→W} ) ‖v‖_V
( D n f ) a ( v 1 , v 2 , · · · , v n ) = ( D n f ) a ( v σ (1) , v σ (2) , · · · , v σ ( n ) )
With the following definition, we can just express this by saying that if f
is n times differentiable in a point a, that then D n f can be viewed as an
element in the space of n-linear, symmetric maps Symn (V, W ), where the
space Symn (V, W ) is defined as follows.
D n f : Ω → Symn (V, W )
19.7 Exercises
The following exercises are mostly meant as practice to get familiar with
the concepts and theorems in this chapter.
Exercise 19.7.1. Consider the function f : Ω → R where Ω = R2 given by
f (( x1 , x2 )) = exp(( x1 )2 − ( x2 )3 )
∂²f/∂x₁∂x₁ : Ω → R,   ∂²f/∂x₂∂x₁ : Ω → R,
∂²f/∂x₁∂x₂ : Ω → R,   ∂²f/∂x₂∂x₂ : Ω → R
exist and compute them.
c. Now show that the second order partial derivatives are continuous
and conclude, by quoting the right theorem, that f is twice differen-
tiable.
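For exercises like this one it can be reassuring to verify a hand computation symbolically. A sketch assuming sympy is available (the computation itself is of course what the exercise asks you to do by hand):

import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = sp.exp(x1**2 - x2**3)

fx1x1 = sp.diff(f, x1, x1)
fx1x2 = sp.diff(f, x1, x2)   # first x1, then x2
fx2x1 = sp.diff(f, x2, x1)   # first x2, then x1
fx2x2 = sp.diff(f, x2, x2)

print(sp.simplify(fx1x2 - fx2x1))  # 0: the mixed partial derivatives agree
print(fx1x1)
print(fx2x2)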
f (( x1 , x2 )) = ( x1 )5 ( x2 )8
Du f : R2 → R
( Dv ( Du f )) : R2 → R
( D2 f ) a (u, v)
( D4 f ) a (e2 , e3 , e3 , e5 ) = (5, 0)
( D4 f ) a (e2 , e3 , e5 , e5 ) = (2, 3)
∂⁴f/(∂x₅∂x₃∂x₃∂x₃) (a) = (0, 1)
∂⁴f/(∂x₅∂x₃∂x₅∂x₃) (a) = (1, 2)
Give
( D4 f ) a (e3 − 2e2 , 6e5 , e3 + e5 , e3 ).
Chapter 20

Polynomials and approximation by polynomials
α1 + · · · + α d = k
Note that
∂^{|α|}/∂x^α x^α = α!.
This is maybe easier to appreciate in an example:
∂^{14}/(∂x₁³ ∂x₂⁷ ∂x₃⁴) (x₁)³(x₂)⁷(x₃)⁴ = 3! · 7! · 4!.
∑_{|α|=n} (1/α!) s_α x^α
F(S)(x) = (1/n!) S(ι(x), ⋯, ι(x))
F(S)(x) = (1/n!) S(ι(x), ⋯, ι(x)) = ∑_{|α|=n} (1/α!) x^α S^{(α)}
where
S^{(α)} = S(v_{i₁}, v_{i₂}, …, v_{i_n})
where i₁, ⋯, i_n ∈ {1, …, d} are such that v₁ appears α₁ times, v₂ appears α₂ times, etc. In particular,
S^{(α)} = ∂^{|α|}F/∂x^α (0).
If we now compute
S(u, ⋯, u)
where u := ∑_{j=1}^d x_j v_j, we find
(1/n!) S(u, ⋯, u) = (1/n!) S( ∑_{j₁=1}^d x_{j₁} v_{j₁}, ⋯, ∑_{j_n=1}^d x_{j_n} v_{j_n} )
                  = (1/n!) ∑_{j₁=1}^d ⋯ ∑_{j_n=1}^d x_{j₁} ⋯ x_{j_n} S(v_{j₁}, ⋯, v_{j_n})
                  = (1/n!) ∑_{|α|=n} (n!/α!) x^α S^{(α)}
                  = ∑_{|α|=n} (1/α!) x^α S^{(α)}.
Therefore the coefficients S^{(α)} can be read off by inspecting the polynomial F : R^d → R given by
F(x) := (1/n!) S(ι(x), ⋯, ι(x)) = ∑_{|α|=n} (1/α!) x^α S^{(α)},
or alternatively,
S^{(α)} = ∂^{|α|}/∂x^α F(0).
by
T_{a,n}(x) := f(a) + ∑_{k=1}^n (1/k!) (D^k f)_a(x − a, ⋯, x − a)
is called the Taylor expansion of f around a.
Taylor’s theorem says that the Taylor expansion provides a good approxi-
mation.
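As a simple one-dimensional illustration (with f = exp and a = 0 chosen only as an example) of what “good approximation” means: the error f(x) − T_{a,n}(x) goes to zero faster than |x − a|^n as x → a.

import math

def taylor_exp(x, a, n):
    # Taylor expansion of exp around a: exp(a) * sum_{k=0}^{n} (x − a)^k / k!
    return sum(math.exp(a) * (x - a)**k / math.factorial(k) for k in range(n + 1))

a, n = 0.0, 3
for x in [0.5, 0.1, 0.01]:
    err = abs(math.exp(x) - taylor_exp(x, a, n))
    print(x, err, err / abs(x - a)**n)   # the last ratio tends to 0 as x → a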
f(x) := (1/k!) S(x − a, …, x − a).
Then
i. for all b ∈ V,
(D^k f)_b = S
(D^j f)_b = 0,
(D^j f)_b(u₁, ⋯, u_j) = (1/(k − j)!) S(u₁, ⋯, u_j, b − a, ⋯, b − a).
We can prove the above proposition for instance with help of the corre-
spondence between homogeneous polynomials and symmetric multilin-
ear forms of Lemma 20.1.3. The first approach is illustrated by Exercise
20.4.4, while the latter approach is illustrated by Exercise 20.4.2.
We will now give a sketch of the proof of Taylor’s theorem.
Proof of Theorem 20.2.2. We give a sketch of the proof. First of all, by us-
ing the previous proposition we may without loss of generality assume
that f ( a) = 0 and that all derivatives of f up to and including order n
vanish (because otherwise we just consider the function g := f − Ta,n ).
By repeatedly applying the Mean-Value Inequality we then find that
and therefore
‖f(v) − f(a)‖_W ≤ sup_{τ∈(0,1)} ‖Err_a^{D^{n−1}f}((1 − τ)a + τv)‖_{Sym^{n−1}(V,W)} ‖v − a‖_V^{n−1}.
( Dk f )a = ( Dk g)a .
(1/k!) (D^k f)_a(x, ⋯, x) = ∑_{|α|=k} (1/α!) ∂^{|α|}f/∂x^α (a) x^α.
q(x) = (1/k!) (D^k q)_0(x, ⋯, x).
q(u) = ∑_{|α|=3} (1/α!) s_α u^α
where
s_α = ∂³q/∂x^α (0).
We know by the previous proposition that if such a function f exists, then for all u ∈ R²,
∑_{|α|=3} (1/α!) ∂^{|α|}f/∂x^α (a) u^α = (1/3!) (D³f)_a(u, u, u) = (2/3!) (u₁)²(u₂).
If we compare the left-hand side and the right-hand side, this suggests looking for a function f such that
(1/3!) ∂³f/((∂x₁)²∂x₂) (a) = 2/3!.
f : x ↦ (1/3)(x₁ − a₁)²(x₂ − a₂)
is such a polynomial.
we have that
lim_{x→a} ‖Err_{a,n}(x)‖₂ / ‖x − a‖₂ⁿ = 0.
f(a) + ∑_{1≤|α|≤n} (1/α!) ∂^{|α|}f/∂x^α (a) (x − a)^α
f(x) = f(a) + ∑_{1≤|α|≤n} (1/α!) ∂^{|α|}f/∂x^α (a) (x − a)^α
       + (1/(n+1)!) (D^{n+1} f)_{a+θ(x−a)}(x − a, …, x − a).
The notation
f ( x ) = g( x ) + O(| x | N )
should be read as that there exists a C ≥ 0 and a δ > 0 such that for all
x ∈ (−δ, δ),
| f ( x ) − g( x )| ≤ C | x | N .
20.4 Exercises
Exercise 20.4.1. Consider the function f : R2 → R given by
f((x₁, x₂)) := sin( (π/2) ((x₁)² + (x₂)) ).
d. Show that
lim_{(x₁,x₂)→(1,2)} |T₂(x) − f(x)| / ‖(x₁, x₂) − (1, 2)‖₂² = 0.
Note: You can use that there exists a constant K ≥ 0 such that for all
u, v ∈ V it holds that
Exercise 20.4.3. Determine whether the following limit exists, and if so,
determine its value:
lim_{(x₁,x₂)→(0,0)} ( exp((x₁)² + (x₂)²) − 1 ) / sin((x₁)² + (x₂)²)
∂^{|α|}/∂x^α x^β = 0.
Chapter 21

Banach fixed point theorem
If f satisfies this inequality for all x, z, and a constant κ ∈ [0, 1), then
we will also sometimes say that f is a κ-contraction.
x^(0) := q,
x^(n+1) := f(x^(n))   for n ∈ N.   (21.1.1)
dist_X(x^(n), p) ≤ (κⁿ / (1 − κ)) dist_X(x^(1), x^(0)).   (21.1.2)
Before given the proof of the theorem, we first formulate a version in case
X is just Rd . Because every closed subset of Rd is complete, we get the
following theorem.
k f ( x ) − f (z)k2 ≤ κ k x − zk2 .
x^(0) := q,
x^(n+1) := f(x^(n))   for n ∈ N.   (21.1.3)
‖x^(n) − p‖₂ ≤ (κⁿ / (1 − κ)) ‖x^(1) − x^(0)‖₂.   (21.1.4)
Proof of the metric space version of the Banach Fixed Point theorem. We will
first show that f has at most one fixed point. To this end, let z ∈ X be
a fixed point of f , i.e. f (z) = z, and let p ∈ X be another fixed point of
so that
(1 − κ )distX ( p, z) = 0
and therefore distX ( p, z) = 0, from which it indeed follows that p = z.
We will now show that for all q ∈ D, the sequence ( x (n) )n defined
inductively by (21.1.1) converges to a fixed point p ∈ D. From this, of
course, it follows immediately that such a fixed point exists.
Let q ∈ D.
Now define the sequence ( x (n) )n inductively according to (21.1.1), i.e.
x (0) : = q
x ( n +1) : = f ( x ( n ) ) for n ∈ N.
dist_X(p, x^(n)) ≤ (κⁿ / (1 − κ)) dist_X(x^(1), x^(0)).
21.2 An example
F : [0, 1]² → R²
given by
F((x₁, x₂)) := ( (1/6)(x₂)² + (1/3)x₁,  (1/π) arctan(x₁) + 1/2 ).
We would like to show that there is a unique fixed point q of the func-
tion F in the set [0, 1]2 (i.e. F (q) = q). To do this, we would like to
apply Banach’s fixed point theorem, so we need to check the condi-
tions of the theorem.
First note that [0, 1]2 is closed.
We will now show that the range of F is contained in [0, 1]2 ⊂ R2 .
Let (x₁, x₂) ∈ [0, 1]². Then
0 ≤ F₁((x₁, x₂)) = (1/6)(x₂)² + (1/3)x₁ ≤ 1/6 + 1/3 < 1
and, because −π/2 < arctan(z) < π/2 for all z ∈ R,
F₂((x₁, x₂)) = (1/π) arctan(x₁) + 1/2 < (1/π)(π/2) + 1/2 = 1
and
F₂((x₁, x₂)) = (1/π) arctan(x₁) + 1/2 > −(1/π)(π/2) + 1/2 = 0.
So indeed F maps into [0, 1]2 .
We will now show that F is a contraction. In fact we will show that for
all x, y ∈ [0, 1]2 ,
‖F(x) − F(y)‖₂ ≤ κ‖x − y‖₂
with κ = (1/3)√3, so indeed κ is strictly smaller than 1.
In proving such an inequality, the Mean-Value Inequality will often
play an important role.
|F₁((x₁, x₂)) − F₁((y₁, y₂))| = | (1/6)(x₂)² + (1/3)x₁ − (1/6)(y₂)² − (1/3)y₁ |
  ≤ (1/6) |(x₂)² − (y₂)²| + (1/3) |x₁ − y₁|
  = (1/6) |x₂ − y₂| |x₂ + y₂| + (1/3) |x₁ − y₁|
  ≤ (1/6) |x₂ − y₂| (|x₂| + |y₂|) + (1/3) |x₁ − y₁|
  ≤ (1/3) |x₂ − y₂| + (1/3) |x₁ − y₁|.
We now use the inequality that for all a, b ∈ R,
|F₂((x₁, x₂)) − F₂((y₁, y₂))| = | (1/π) arctan(x₁) − (1/π) arctan(y₁) |.
To estimate this, we can use the mean-value inequality. Since for all
t ∈ R,
0 ≤ arctan′(t) = 1/(1 + t²) ≤ 1,
It follows that
|F₂((x₁, x₂)) − F₂((y₁, y₂))| = | (1/π) arctan(x₁) − (1/π) arctan(y₁) |
  = (1/π) |arctan(x₁) − arctan(y₁)|
  ≤ (1/π) |x₁ − y₁|.
Therefore, combining the estimates for F₁ and F₂, for all x, y ∈ [0, 1]² we obtain ‖F(x) − F(y)‖₂ ≤ κ‖x − y‖₂ with κ = (1/3)√3 < 1, so F is indeed a contraction and the Banach fixed point theorem applies.
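The iteration scheme (21.1.1) is easy to run on this example; the following sketch (an illustration, with 50 iterations chosen arbitrarily) locates the fixed point of F numerically.

import math

def F(x):
    x1, x2 = x
    return ((1/6)*x2**2 + (1/3)*x1,
            (1/math.pi)*math.atan(x1) + 1/2)

x = (0.0, 0.0)          # any starting point q in [0, 1]^2 will do
for n in range(50):
    x = F(x)            # the iteration x^(n+1) := F(x^(n)) of (21.1.1)

print(x, F(x))          # after 50 steps, x and F(x) agree to machine precision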
21.3 Exercises
Exercise 21.3.1. Consider the function F : [−1, 1]2 → R2 given by
F((x₁, x₂)) := ( (1/2) sin(x₂) + (1/3)x₁ + 1/6,  (1/4)(x₁)³ − 1/6 ).
f (( x, g( x ))) = x2 + ( g( x ))2 − 1 = 0.
22.2 Notation
Before we describe the theorem, we will need to introduce more notation.
Since we will assume the function f : Ω → Rm to be continuously dif-
ferentiable (with Ω an open subset of Rd+m ), we will have that in a point
( a, b) ∈ Ω, the derivative ( D f )(a,b) exists and is a linear map from Rd+m to
Rm . To get a feeling for this, let’s see what it looks like in an example.
F (( x1 , x2 , x3 ), (y1 , y2 )) = (( x1 )2 y2 + y1 − 2, sin( x2 y2 ) + ( x3 )4 − 3)
[DF]_{(a,b)} = ( ∂f₁/∂x₁((a,b))  ∂f₁/∂x₂((a,b))  ∂f₁/∂x₃((a,b))  ∂f₁/∂y₁((a,b))  ∂f₁/∂y₂((a,b))
                 ∂f₂/∂x₁((a,b))  ∂f₂/∂x₂((a,b))  ∂f₂/∂x₃((a,b))  ∂f₂/∂y₁((a,b))  ∂f₂/∂y₂((a,b)) )

             = ( 2a₁b₂   0             0        1   (a₁)²
                 0       cos(a₂b₂)b₂   4(a₃)³   0   cos(a₂b₂)a₂ )
We will denote by
( D1 f )(a,b) : Rd → Rm
the restriction of the derivative ( D f )(a,b) : Rd+m → Rm to the subspace
R^d ⊂ R^{d+m}. In other words, for all h ∈ R^d,
(D₁f)_{(a,b)}(h) = (Df)_{(a,b)}((h, 0)).
Similarly, we denote by
(D₂f)_{(a,b)} : R^m → R^m
the restriction of (Df)_{(a,b)} to the subspace R^m ⊂ R^{d+m}.
[D₂f]_{(a,b)} = ( ∂f₁/∂y₁((a,b))  ∂f₁/∂y₂((a,b))
                  ∂f₂/∂y₁((a,b))  ∂f₂/∂y₂((a,b)) )

              = ( 1   (a₁)²
                  0   cos(a₂b₂)a₂ )
f ( x, y) = 0 if and only if y = g ( x ).
(Dg)_x = −(D₂f)_{(x,g(x))}^{−1} ∘ (D₁f)_{(x,g(x))}.   (22.3.1)
The expression for the derivative of g is actually easy to derive when you
already know that g is differentiable. Because in that case, we can start
from the equality
f ( x, g( x )) = 0,
and use the chain rule to compute that
( D1 f )(x,g(x)) + ( D2 f )(x,g(x)) ◦ ( Dg) x = 0.
We then use that ( D2 f )( x,g( x)) is non-singular, multiply by its inverse, and
conclude the expression (22.3.1).
The proof of this theorem is very technical, but the underlying idea is sim-
ple and very beautiful. Given an x ∈ Rd close to a, how do we find y such
that f ( x, y) = 0? It won’t be possible to find such a y immediately, but
let’s first see what the guess y(0) := b will give us. We will then make an
error, because f ( x, y(0) ) will not be equal to 0 in general. Therefore we will
make a new guess, y(1) that aims to correct for the error. We will choose
the difference y(1) − y(0) in the way that would give us the exact solution
if f were in fact affine:
where you can see the important assumption that ( D2 f )(a,b) is non-singular
coming in.
Of course, in general y(1) will also not be the value we are looking for, i.e.
in general f ( x, y(1) ) 6= 0, but we can make a new guess y(2) , solving for
the update y(2) − y(1) again acting as if f were affine:
y^(n+1) := y^(n) − (D₂f)_{(a,b)}^{−1} f(x, y^(n)).
And that’s basically it! We just need to show that this works (that this recursive scheme converges, and that the resulting solutions depend nicely on x). But this is exactly the virtue of the Banach fixed point theorem. In
other words, we need to check that we can apply the Banach fixed point
theorem, but if we can apply it then we indeed find a y such that
y = y − (D₂f)_{(a,b)}^{−1} f(x, y),
i.e. such that f(x, y) = 0.
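As an illustration of this recursive scheme (not of the proof itself), take the circle example f(x, y) = x² + y² − 1 from the start of the chapter with (a, b) = (0, 1), so that (D₂f)_{(a,b)} = 2b = 2. For x close to 0 the iteration converges rapidly to y = g(x) = √(1 − x²):

import math

def f(x, y):
    return x**2 + y**2 - 1.0          # the circle example used as illustration

a, b = 0.0, 1.0
D2f_inv = 1.0 / (2.0 * b)             # (D2 f)_{(a,b)} = 2b, here equal to 2

x = 0.3                               # a point close to a
y = b                                 # initial guess y^(0) := b
for n in range(30):
    y = y - D2f_inv * f(x, y)         # y^(n+1) := y^(n) − (D2 f)^{-1}_{(a,b)} f(x, y^(n))

print(y, math.sqrt(1 - x**2))         # both ≈ 0.95394, i.e. y = g(x)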
Proof. Right at the beginning of this proof, we choose two radii r1 and
r2 according to the following criteria:
‖(D₂f)_{(a,b)}^{−1}‖_{R^m→R^m} ‖(D₂f)_{(x,z)} − (D₂f)_{(a,b)}‖_{R^m→R^m} < 1/2.
F^(x)(y) := y − (D₂f)_{(a,b)}^{−1} f(x, y).
We want to apply the Banach fixed point theorem to the function F^(x), and therefore we need to show that F^(x) maps every element of B(b, r₂) back into B(b, r₂), and we need to check that F^(x) is a contraction.
We first check that F ( x) maps B(b, r2 ) back to B(b, r2 ). We show a
slightly stronger property, namely that F ( x) maps B(b, r2 ) to B(b, 2r2 /3).
Indeed, by the criteria above, if x ∈ B( a, r1 ) and y ∈ B(b, r2 ) then
k F ( x ) ( y ) − b k 2 = k y − b − ( D2 f ) − 1
( a,b)
f ( x, y ) k2
= k y − b − ( D2 f ) − 1
( a,b)
f ( a, b) + ( D1 f )(a,b) ( x − a)
f
+ ( D2 f )(a,b) (y − b) + Err(a,b) (( x, y)) k2
f
= k − ( D2 f ) − 1
( a,b)
( D 1 f )( x − a ) + Err ( a,b)
(( x, y )) k2
2r2
< .
3
(22.3.2)
(DF^(x))_z = I − (D₂f)_{(a,b)}^{−1} ∘ (D₂f)_{(x,z)}.
We compute
f ( x, g( x )) = 0.
F ( x) ( g(u)) − g(u).
Therefore we compute
F^(x)(g(u)) − g(u) = −(D₂f)_{(a,b)}^{−1} ( f(x, g(u)) )
                   = −(D₂f)_{(a,b)}^{−1} ( f(x, g(u)) − f(u, g(u)) )
M := 2 ‖(D₂f)_{(a,b)}^{−1}‖_{R^m→R^m} sup_{(σ,τ)∈B(a,r₁)×B(b,r₂)} ‖(D₁f)_{(σ,τ)}‖_{R^d→R^m}
‖F^(x)(g(u)) − g(u)‖_{R^m} ≤ (1/2) M ‖x − u‖_{R^d}
‖g(x) − g(u)‖_{R^m} ≤ M ‖x − u‖_{R^d}.
(Dg)_u = −(D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}
(as this expression can be derived from the chain rule assuming that g
is indeed differentiable).
Therefore, we consider the error function
Err_u^g(x) := g(x) − g(u) + (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u)
g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u)
F^(x)( g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) ) − ( g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) )
g(x) − g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u)
as well.
Therefore, we compute
F^(x)( g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) ) − ( g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) )
  = −(D₂f)_{(a,b)}^{−1} f( x, g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) )
  = −(D₂f)_{(a,b)}^{−1} ( f( x, g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) ) − f(u, g(u)) )
where we used in the last line that f(u, g(u)) = 0. Because f is differentiable, we find
F^(x)( g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) ) − ( g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) )
  = −(D₂f)_{(a,b)}^{−1} ( (D₁f)_{(u,v)}(x − u)
      + (D₂f)_{(u,v)}( −(D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) )
      + Err_{(u,v)}^f(( x, g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) )) )
  = −(D₂f)_{(a,b)}^{−1} Err_{(u,v)}^f(( x, g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) ))
It follows that
‖ F^(x)( g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) ) − ( g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) ) ‖_{R^m}
  ≤ ‖(D₂f)_{(a,b)}^{−1}‖_{R^m→R^m} · ‖ Err_{(u,v)}^f(( x, g(u) − (D₂f)_{(u,v)}^{−1} ∘ (D₁f)_{(u,v)}(x − u) )) ‖_{R^m}
(Dg)_x = ( (Dh)_{g(x)} )^{−1}.
Proof. The proof of the theorem follows from applying the Implicit
Function Theorem to the function F : Rm × Ω → Rm given by
F ( x, y) := x − h(y)
The inverse function theorem is useful to conclude for instance that for
k ∈ N, the function x 7→ x1/k is differentiable on the domain (0, ∞).
The implicit function theorem would also allow us to conclude that the
function ln : (0, ∞) → R is differentiable on its domain, given that the
exponential function exp : R → R is differentiable. In truth, though, we still
haven’t given a proper definition of the exponential function. The next
chapters will allow us to provide such a definition.
22.5 Exercises
Exercise 22.5.1. Consider the function F : R2 → R given by
F ( x, y) = x2 − y3 + 3y
Chapter 23

Function sequences
Definition 23.1.1. Let ( X, distX ) and (Y, distY ) be two metric spaces
and let D ⊂ X. We say that a sequence of functions f : N → ( D → Y )
converges pointwise to a function f ∗ : D → Y if
for all x ∈ D,
lim_{n→∞} f_n(x) = f*(x).
f n (x) = xn .
To show this, let x ∈ [0, 1]. Then we consider two cases. In case x ∈
lim_{n→∞} f_n(x) = lim_{n→∞} 1ⁿ = 1 = f*(x).
Definition 23.2.1. Let ( X, distX ) and (Y, distY ) be two metric spaces
and let D ⊂ X. We say that a sequence of functions f : N → ( D → Y )
converges uniformly to a function f* : D → Y if
for all ε > 0,
there exists N ∈ N,
for all n ≥ N,
for all x ∈ D,
dist_Y(f_n(x), f*(x)) < ε.
Proposition 23.2.2. Let ( X, distX ) and (Y, distY ) be two metric spaces,
let D ⊂ X, and assume that a sequence of functions f : N → ( D → Y )
converges uniformly to a function f ∗ : D → Y. Then ( f n ) converges to
f ∗ pointwise on D.
distY ( f n ( x ), f ∗ ( x )) < e.
distY ( f n ( x ), f ∗ ( x )) < e.
Choose δ := δ0 .
Let x ∈ D. Assume that 0 < distX ( x, a) < δ. Then by the triangle
inequality
The previous theorem is sometimes very useful to rule out that a sequence
of functions converges uniformly. If the functions in the sequence are all
continuous, but the pointwise limit is not continuous, then the sequence
does not converge uniformly.
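For the example f_n(x) = xⁿ on [0, 1] this criterion is easy to see numerically: the supremum of |f_n(x) − f*(x)| does not tend to 0. A sketch (on a finite grid, so the supremum is only approximated):

import numpy as np

xs = np.linspace(0.0, 1.0, 10001)
f_star = np.where(xs < 1.0, 0.0, 1.0)       # the pointwise limit of f_n(x) = x^n

for n in [1, 10, 100, 1000]:
    sup_diff = np.max(np.abs(xs**n - f_star))
    print(n, sup_diff)                      # stays close to 1: no uniform convergence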
Let e > 0.
Because the functions D f n converge to ∆ uniformly, the function ∆ is
continuous by Theorem 23.3.1. Therefore, there exists a δ0 > 0, such
that for all z ∈ Ω, if 0 < kz − akV < δ0 then
‖Δ_z − Δ_a‖_{V→R} < ε/3.
Choose such a δ0 .
Choose δ := δ0 .
Let x ∈ Ω and assume that 0 < k x − akV < δ.
Moreover, since the functions D f n converge to ∆ uniformly on Ω, there
exists an N0 ∈ N such that for all z ∈ Ω and all n ≥ N0 ,
‖(Df_n)_z − Δ_z‖_{V→R} < ε/3.
Now because the function sequence ( f n ) converges to g pointwise,
there exists an N1 ∈ N such that for all n ≥ N1 ,
|f_n(x) − g(x)| < (ε/6) ‖x − a‖_V.
Similarly, there exists an N2 ∈ N such that for all n ≥ N2 ,
|f_n(a) − g(a)| < (ε/6) ‖x − a‖_V.
f N ( x ) − f N ( a ) = ( D f N ) y ( x − a ).
Therefore
|Err_a^g(x)| < |f_N(x) − f_N(a) − Δ_a(x − a)| + (ε/3)‖x − a‖_V
  = |(Df_N)_y(x − a) − Δ_a(x − a)| + (ε/3)‖x − a‖_V
  = |((Df_N)_y − Δ_a)(x − a)| + (ε/3)‖x − a‖_V
  ≤ ‖(Df_N)_y − Δ_a‖_{V→R} ‖x − a‖_V + (ε/3)‖x − a‖_V
  ≤ ( ‖(Df_N)_y − Δ_y‖_{V→R} + ‖Δ_y − Δ_a‖_{V→R} ) ‖x − a‖_V + (ε/3)‖x − a‖_V
  < ε ‖x − a‖_V
‖f‖_∞ = sup_{x∈D} |f(x)|.
23.6 Exercises
Exercise 23.6.1. Determine whether the following functions converge point-
wise and whether they converge uniformly on the indicated domain. If
they converge pointwise, give the pointwise limit.
Chapter 24

Function series
24.1 Definitions
Let ( X, distX ) be a metric space. Let ( f n ) be a sequence of functions from
Ω ⊂ X to R.
We say that the function series
∑_{k=0}^∞ f_k
| f n ( x )| ≤ Mn
converges absolutely.
In particular, the series
∑_{k=0}^∞ f_k
converges pointwise to a function s : Ω → R.
We will now show that the series
∑_{k=0}^∞ f_k
converges uniformly to s.
Let x ∈ Ω. Then
| ∑_{k=0}^ℓ f_k(x) − s(x) | = | ∑_{k=ℓ+1}^∞ f_k(x) |
  ≤ ∑_{k=ℓ+1}^∞ |f_k(x)|
  ≤ ∑_{k=ℓ+1}^∞ M_k
lim_{ℓ→∞} ‖ ∑_{k=0}^ℓ f_k − s ‖_∞ = 0.
We claim that for every r > 0, this series converges uniformly on the
interval [−r, r ].
Proof. Let r > 0. To verify the claim, we will apply the Weierstrass
M-test. The functions f k appearing in the theorem correspond to the
functions
f_k(x) = x^k / k!.
We need to check all conditions in the Weierstrass M-test, so we need
to define a sequence ( Mk ) and verify for that choice that for all x ∈
[−r, r ] and all k ∈ N,
|f_k(x)| ≤ M_k, and that the series ∑_{k=0}^∞ M_k converges.
First we note that for all k ∈ N and for all x ∈ [−r, r],
|f_k(x)| = |x|^k / k! ≤ r^k / k!.
Therefore, we choose for (M_k) the sequence
M_k := r^k / k!.
We then indeed find that for all k ∈ N and for all x ∈ [−r, r ],
| f k ( x )| ≤ Mk .
converges. It follows from the Weierstrass M-test that the function series
∑_{k=0}^∞ x^k / k!
converges uniformly on [−r, r].
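Numerically, the uniform convergence on [−r, r] is visible in the fact that the sup-norm error of the partial sums is dominated by the tail ∑_{k>ℓ} M_k of the majorant series. A sketch, with r = 2 and a finite grid as illustrative choices:

import math
import numpy as np

r = 2.0
xs = np.linspace(-r, r, 2001)

def partial_sum(x, ell):
    return sum(x**k / math.factorial(k) for k in range(ell + 1))

for ell in [2, 5, 10, 20]:
    sup_err = np.max(np.abs(partial_sum(xs, ell) - np.exp(xs)))
    tail = sum(r**k / math.factorial(k) for k in range(ell + 1, 60))
    print(ell, sup_err, tail)   # the sup-norm error is bounded by the tail sum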
24.4 Exercises
Exercise 24.4.1. Let ( f k ) be a sequence of continuously differentiable, bounded
functions R → R. Suppose that for all k ∈ N,
k f k k∞ ≤ 1
and
k f k0 k∞ ≤ 5
lim_{k→∞} ‖f_k‖_∞ = 0.
b² exp(−b) ≤ exp(−b/2)
Chapter 25

Power series
25.1 Definition
Definition 25.1.1. A power series at a point c ∈ R is a function series of
the form
∑_{k=0}^∞ a_k (x − c)^k
where a : N → R is a real-valued sequence.
Now note that since δ < |z − c|, it follows that δ/|z − c| < 1, so that the geometric series
∑_{k=0}^∞ ( δ / |z − c| )^k
converges.
ii. There exists an R > 0 such that for all x ∈ (c − R, c + R) the power
series converges and for all x ∈ R\[c − R, c + R] the series diverges.
In this case we say the radius of convergence equals R.
iii. The series converges for all x ∈ R. In this case we say the radius of
convergence is ∞.
Then
ii. if L ∈ (0, ∞), then the radius of convergence of the power series
is 1/L,
For the proof of the proposition, let us first give an alternative version of
the root test, Theorem 8.4.1.
Theorem 25.2.4 (Root test, lim sup version). Let (bk ) be a sequence of
nonnegative real numbers.
i. If
lim sup_{k→∞} (b_k)^{1/k} < 1,
then the series ∑_k b_k converges.
ii. If
lim sup_{k→∞} (b_k)^{1/k} > 1,
then the series ∑_k b_k diverges.
For such `, also b` > 1. It follows that the sequence (bk ) does not
converge to zero, and therefore the series ∑k bk diverges.
With this version of the root test, we can now prove Proposition 25.2.3.
Proof of Proposition 25.2.3. We would like to apply the root test. We
therefore consider
lim sup_{k→∞} ( |a_k (x − c)^k| )^{1/k} = lim sup_{k→∞} ( |a_k| )^{1/k} |x − c|
and the radius of convergence R for this new power series satisfies
R ≥ min( R1 , R2 ).
R ≥ min( R1 , R2 ).
Proof. It suffices to show that for all x ∈ R such that | x − z| < min( R1 , R2 ),
∑_{k=0}^∞ c_k (x − z)^k = ( ∑_{k=0}^∞ a_k (x − z)^k ) ( ∑_{k=0}^∞ b_k (x − z)^k ).
Theorem 25.5.4 (Identity theorem for power series). Let R > 0 and let
f , g : (c − R, c + R) → R be given by power series
f(x) = ∑_{k=0}^∞ a_k (x − c)^k   and   g(x) = ∑_{k=0}^∞ b_k (x − c)^k
f ( x ) = g ( x ).
25.7 Exercises
Exercise 25.7.1. Give the Taylor series of the function
f(x) := 1 / (3x²)
around the point c = 1 and give its radius of convergence.
Exercise 25.7.2. Prove Proposition 25.4.1.
Exercise 25.7.3. Now that the definitions of cos and sin are provided:
i. Prove using power series, (i.e. without using Proposition 16.3.1) that
for all x ∈ R,
cos0 ( x ) = − sin( x )
and
sin0 ( x ) = cos( x ).
sin2 ( x ) + cos2 ( x ) = 1.
x2 f 00 ( x ) − 2x f 0 ( x ) + (2 − x2 ) f ( x ) = 0
with
f (0) = 0, f 0 (0) = 1, f 00 (0) = 0.
and determine the radius of convergence of the power series. Hint: use the Ansatz
f(x) := ∑_{k=0}^∞ a_k x^k
and use the theorems about the sums of power series, products of power
series and the identity theorem to determine the coefficients ak .
Chapter 26

Riemann integration in one dimension
M_k := sup_{x∈[x_{k−1},x_k]} f(x)
where
m_k := inf_{x∈[x_{k−1},x_k]} f(x).
L(P, f) ≤ U(P, f),
because
m_k = inf_{x∈[x_{k−1},x_k]} f(x) ≤ sup_{x∈[x_{k−1},x_k]} f(x) = M_k.
We are most interested in those functions for which the upper Darboux in-
tegral agrees with the lower Darboux integral. We will call those functions
Riemann integrable.
In this case we say that the Riemann integral of f equals this common value, i.e.
∫_a^b f dx := \overline{∫}_a^b f dx = \underline{∫}_a^b f dx.
It follows that
U ( P, f ) − L( P, f ) < e.
U ( P, f ) − L( P, f ) < e.
But then
ε = \overline{∫}_a^b f(x) dx − \underline{∫}_a^b f(x) dx ≤ U(P, f) − L(P, f) < ε,
i. We have
∫_a^b 1 dx = b − a
Let e > 0.
Because the closed interval [ a, b] is compact, the function f is uni-
formly continuous. Therefore, there exists a δ > 0 such that for all
x, y ∈ [a, b], if 0 < |x − y| < δ then
|f(x) − f(y)| < ε / (2(b − a)).
P := (x₀, x₁, …, x_N)
by
x_i := a + i (b − a)/N.
Then
U(P, f) − L(P, f) = ∑_{k=1}^N M_k Δx_k − ∑_{k=1}^N m_k Δx_k = ∑_{k=1}^N (M_k − m_k) Δx_k
with
M_k = sup_{x∈[x_{k−1},x_k]} f(x)
and
m_k = inf_{x∈[x_{k−1},x_k]} f(x).
We find that
|U(P, f) − L(P, f)| ≤ ∑_{k=1}^N |M_k − m_k| Δx_k
  ≤ (ε / (2(b − a))) ∑_{k=1}^N Δx_k
  = (ε / (2(b − a))) (b − a)
  < ε.
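The shrinking of U(P, f) − L(P, f) under refinement can be seen numerically. In the sketch below (with f = sin on [0, π] as an illustrative choice) the suprema and infima on the subintervals are only approximated by sampling a finer grid, so the computed sums are themselves approximations.

import numpy as np

def darboux(f, a, b, N):
    xs = np.linspace(a, b, N + 1)
    # On each subinterval, approximate sup and inf by sampling a fine grid.
    U = L = 0.0
    for k in range(N):
        grid = np.linspace(xs[k], xs[k + 1], 50)
        vals = f(grid)
        U += vals.max() * (xs[k + 1] - xs[k])
        L += vals.min() * (xs[k + 1] - xs[k])
    return U, L

for N in [4, 16, 64, 256]:
    U, L = darboux(np.sin, 0.0, np.pi, N)
    print(N, U - L, U)    # U − L shrinks; both sums approach ∫_0^π sin = 2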
F : [a, b] → R
given by
F(x) := ∫_a^x f(s) ds
is differentiable on (a, b) and for all x ∈ (a, b),
F′(x) = f(x).
F0 (x) = f (x)
and then use the Mean-Value Theorem to conclude that there are points
ck ∈ ( xk−1 , xk ) such that
F(b) − F(a) = ∑_{k=1}^n ( F(x_k) − F(x_{k−1}) ) = ∑_{k=1}^n f(c_k)(x_k − x_{k−1})
M_k := sup_{x∈[x_{k−1},x_k]} f(x)
to estimate
F(b) − F(a) = ∑_{k=1}^n f(c_k)(x_k − x_{k−1}) ≤ ∑_{k=1}^n M_k Δx_k = U(P, f).
It follows that
F(b) − F(a) ≤ ∫_a^b f(x) dx.
Similarly, we can prove that
∫_a^b f(x) dx ≤ F(b) − F(a).
Therefore,
∫_a^b f(x) dx ≤ F(b) − F(a) ≤ ∫_a^b f(x) dx.
26.5 Exercises
Exercise 26.5.1. Let a, b and c be three real numbers, with a < b < c and let
g : [ a, b] → R and h : [b, c] → R be two bounded and Riemann-integrable
functions. Let f : [ a, c] → R be such that for all x ∈ ( a, b)
f ( x ) = g( x )
U ( P, f ) − L( P, f ) ≥ 5.
Exercise 26.5.3. Compute the following integral, carefully quoting the the-
orems that you use in your computation
∫_{−1}^{√3} 1/(1 + x²) dx.
Exercise 26.5.4. Let a, b and c be three real numbers, with a < b < c. As-
sume f : [ a, c] → R is bounded, and assume f is Riemann integrable on
[ a, b] and Riemann integrable on [b, c].
Prove that f is Riemann integrable on [ a, c].
f ( x ) = g ( x ).
1_y : [a, b] → R
defined as
1_y(x) = 1 if x = y,   1_y(x) = 0 if x ≠ y,
is Riemann integrable.
Chapter 27

Riemann integration in multiple dimensions
R = [ a1 , b1 ] × · · · × [ ad , bd ]
R = [ a1 , b1 ] × [ a2 , b2 ] × · · · × [ ad , bd ]
Q = P1 × · · · × P d
is a partition of [ a j , b j ].
Then the upper sum of f with respect to Q is defined as
U(Q, f) := ∑_{k₁=1}^{n₁} ⋯ ∑_{k_d=1}^{n_d} M_{k₁,⋯,k_d} Δx¹_{k₁} ⋯ Δx^d_{k_d}
where Δx^j_k := (x^j_k − x^j_{k−1}) and
where
m_{k₁,…,k_d} := inf_{x ∈ [x¹_{k₁−1}, x¹_{k₁}] × ⋯ × [x^d_{k_d−1}, x^d_{k_d}]} f(x).
In this case we say that the Riemann integral of f equals this common value, i.e.
∫_R f dx := \overline{∫}_R f dx = \underline{∫}_R f dx.
i. (volume)
∫_R 1 dx = (b₁ − a₁)(b₂ − a₂) ⋯ (b_d − a_d) =: Vol(R)
then
∫_R f(x) dx = ∑_{i=1}^N ∫_{Q_i} f(x) dx.
h x (y) := f ( x, y)
R = [ a1 , b1 ] × · · · [ ad , bd ] ⊂ Rd
R = [ a1 , b1 ] × · · · × [ ad , bd ]
bi − a i = b j − a j .
27.10 Exercises
Exercise 27.10.1. Let S1 , . . . , Sm ⊂ Rd have Jordan content zero. Show that
the union
⋃_{i=1}^m S_i
also has Jordan content zero.
Exercise 27.10.2. Show that the unit ball B(0, 1) in R2 is a Jordan set. Hint:
Make a parametrization of the boundary of the unit ball
Exercise 27.10.3. Let E ⊂ R2 be the subset of [0, 1]2 above the curve y = x4 ,
and to the right of the curve x = sin(yπ )3 . Show that E is a Jordan set.
Chapter 28

Change-of-variables Theorem
defined by
Φpol (r, φ) := (r cos φ, r sin φ)
Here,
det [ DΦpol ](r,φ) = r
In many situations in which one would like to change the polar coordi-
nates to compute an integral, the Change-of-variables Theorem does not
directly apply. The transformation from polar to Cartesian coordinates is
so nice, however, that one can obtain a statement that can be applied more
conveniently.
A subset E ⊂ (0, ∞) × (0, 2π ) is a Jordan set if and only if Φpol ( E) is a
Jordan set. Moreover, if one of these holds, a function f : R2 → R is
Riemann integrable on Φpol ( E) if and only if f ◦ Φpol is Riemann integrable
on E and
∫_{Φ_pol(E)} f(x) dx = ∫_E f(Φ_pol(r, φ)) r dr dφ.
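A quick numerical sanity check of this formula (an illustration only): integrating exp(−x₁² − x₂²) over the unit disc via polar coordinates and comparing with the exact value π(1 − e^{−1}).

import numpy as np

# ∫_{unit disc} exp(−x₁² − x₂²) dx, computed in polar coordinates as
# ∫_0^{2π} ∫_0^1 exp(−r²) · r dr dφ, versus the exact value π(1 − e^{−1}).
rs = np.linspace(0.0, 1.0, 2001)
phis = np.linspace(0.0, 2*np.pi, 2001)
R, PHI = np.meshgrid(rs, phis)
integrand = np.exp(-R**2) * R                         # note the Jacobian factor r
approx = np.trapz(np.trapz(integrand, rs, axis=1), phis)
print(approx, np.pi * (1 - np.exp(-1)))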
defined by
Φcyl (r, φ, z) := (r cos φ, r sin φ, z)
Here,
det [ DΦcyl ](r,φ,z) = r
Here,
det [ DΦsph ](ρ,φ,θ ) = ρ2 sin θ.
28.5 Exercises
Exercise 28.5.1. Determine
∫_0^{π/2} ∫_0^y sin(y) / √(4 − sin²(x)) dx dy
Exercise 28.5.3. Let K be the subset of those points in R3 that are inside
the ball around the origin of radius 4 but outside the cylinder around the
z-axis of radius 1, i.e.
K := B(0, 4) \ {( x, y, z) ∈ R3 | x2 + y2 < 1}
ι ( x ) = x1 v1 + · · · + x d v d .
Show that
Vol(ι(S)) = (1/d!) |det M|.
Appendix A
Best practices
i. Always start by writing down what is given and what you need to
show.
for all a ∈ A,
(. . . )
Let a ∈ A.
there exists a ∈ A,
(. . . )
Choose a := . . .
if A then B
Assume A.
It suffices to show A.
It holds that A.
If you want to use a theorem (or lemma, proposition etc.), first ex-
plicitly check all the assumptions, then afterwards you can use the
conclusion of the theorem. A template is:
for all a ∈ A,
(A.0.1)
(. . . )
there exists a ∈ A,
(A.0.2)
(. . . )
or as
or just
Obtain such an a ∈ A.
x. To prove a statement
(. . . )
xi. Sometimes you need to make a case distinction: you might for in-
stance want to argue differently if a real number is strictly negative
or positive. A template for a case distinction is as follows.
Case A.
. . . proof in Case A. . .
Case B.
. . . proof in Case B. . .
Case C.
. . . proof in Case C. . .
etc. . .
We use induction on n ∈ N.
We first show the base case, i.e. that P(0) holds.
... insert here a proof of P(0) ...
We now show the induction step.
Let k ∈ N and assume that P(k ) holds.
We need to show that P(k + 1) holds.
... insert here a proof of P(k + 1) ...
xiv. Make sure that every variable that you are using is defined. In par-
ticular:
the variable N is not defined, and you cannot refer to it. To use
it in the rest of a proof, you can follow up with
Choose such an N ∈ N.
if . . . then . . .
xvi. If the statement that you need to show is an “if and only if” state-
ment, show the “if” and “only if” statements separately.
xix. At several times, remind the reader (and yourself) of what you need
to show at that stage.
xx. If you hand-write your proof, make sure that you use your best
handwriting.