Intro To Analysis Notes
Vitali Liskevich
With minor adjustments by Vitaly Moroz
School of Mathematics
University of Bristol
Bristol BS8 1TW, UK
Contents
2 Numbers
2.1 Various Sorts of Numbers
2.1.1 Integers
2.1.2 Rational Numbers
2.1.3 Irrational Numbers
2.1.4 Cuts of the Rationals
2.2 The Field of Real Numbers
2.3 Bounded sets of numbers
2.4 Supremum and infimum
5 Differential Calculus
5.1 Definition of derivative. Elementary properties
5.2 Theorems on differentiable functions
5.3 Approximation by polynomials. Taylor's Theorem
6 Series
6.1 Series of positive and negative terms
6.1.1 Alternating series
6.1.2 Absolute convergence
6.1.3 Rearranging series
6.1.4 Multiplication of series
6.2 Power series
7 Elementary functions
7.1 Exponential function
7.2 Logarithmic function
7.3 Trigonometric functions
In mathematics we always assume that our propositions are definite and unambiguous, so that
such propositions are always true or false (there is no intermediate option). In this respect
they differ from propositions in ordinary life, which are often ambiguous or indeterminate. We
also assume that our mathematical propositions are objective, so that they are determinately
true or determinately false independently of our knowing which of these alternatives holds.
We will denote propositions by capital Roman letters.
Examples of propositions
1. A ≡ London is the capital of the UK.
2. B ≡ Paris is the capital of the UK.
3. C ≡ 3 > 5
4. D ≡ 3 < 5. etc.
(We will use the sign ≡ in order to define propositions.)
A and D are true, whereas B and C are false; nonetheless all four are propositions. Thus,
by our convention, a proposition takes exactly one of the two values "true" or "false" (never
both simultaneously). We use the symbol T for an arbitrary true proposition and
F for an arbitrary false one.
Not every sentence is a mathematical proposition. For instance, the following sentences
are not mathematical propositions.
For (i) it is impossible to judge whether it is true or false (it depends on who is asked). (ii) does
not have a precise sense. (iii) is a question, so it does not state anything. And (iv) contains a
letter x, and whether it is true or false depends on the value of x.
At the same time, there are propositions for which it is not immediately easy to establish
whether they are true or false: a proposition may be perfectly definite, and yet it may take
a great deal of computation actually to determine whether it is true or false.
There are some propositions in mathematics for which their truth value has not been
determined yet. For instance, “in the decimal representation of π there are infinitely many
digits 7”. It is not known whether it is true, but this is certainly a mathematical proposition.
The truth value of this proposition constitutes an open question (at least the answer is not
known to the author of these notes). There are plenty of open questions in mathematics!
1.1.1 Negation
One can build a new proposition from an old one by negating it. Take A above as an example.
The negation of A (not A) will mean
¬A ≡ London is not the capital of the UK.
We will use the symbol ¬A to denote not A. Another example: for the proposition E ≡
{8 is a prime number}, the negation is ¬E ≡ {8 is not a prime number}. Since we agreed
that a proposition has one of the two truth values, true or false, we can define the negation of a proposition
A by saying that if A is true then ¬A is false, and if A is false then ¬A is true. This definition
is reflected in the following table.
A ¬A
T F
F T
1.1.2 Conjunction
Conjunction is a binary operation on propositions which corresponds to the word “and” in
English. We stipulate by definition that “A and B” is true if A is true and B is true, and “A
and B” is false if A is false or B is false. This definition is expressed in the following truth
table. We use the notation A ∧ B for the conjunction “A and B”.
A B A∧B
T T T
T F F
F T F
F F F
The four rows of this table correspond to the four possible truth combinations of the propo-
sition A and the proposition B. The last entry in each row stipulates the truth or falsity of
the complex proposition in question.
Conjunction is sometimes called logical product.
1.1.3 Disjunction
Disjunction is a binary operation on propositions which corresponds to the word “or” in
English. We stipulate by definition that “A or B” is true if A is true or B is true, and “A
or B” is false if A is false and B is false. This definition is expressed in the following truth
table. We use the notation A ∨ B for the disjunction “A or B”.
A B A∨B
T T T
T F T
F T T
F F F
1.1.4 Implication
Implication is a binary operation on propositions which corresponds to the word “if... then...”
in English. We will denote this operation ⇒. So A ⇒ B can be read “if A then B”, or “A
implies B”. A is called the antecedent, and B is called the consequent. The truth table for
implication is the following.
A B A⇒B
T T T
T F F
F T T
F F T
So, as we see from the truth table which constitutes the definition of the operation “im-
plication”, the implication is false only in the case in which the antecedent is true and the
consequent is false. It is true in the remaining cases.
Notice that the "implies" introduced here differs from the way the word is used in ordinary speech. The
reason for such a definition will become clear later: it proves to be useful in mathematics.
For now we simply accept it. Thus, in the mathematical meaning of "implies", a false proposition
implies any proposition, and a true proposition is implied by any proposition.
1.1.5 Equivalence
The last binary operation on propositions we introduce is equivalence. Saying that A is
equivalent to B we mean that A is true whenever B is true, and vice versa. We denote
this by A ⇔ B. So we stipulate that A ⇔ B is true in the cases in which the truth values of
A and B are the same. In the remaining cases it is false. This is given by the following truth
table.
A B A⇔B
T T T
T F F
F T F
F F T
Observe that A ⇔ B has the same truth table as the proposition (A ⇒ B) ∧ (B ⇒ A).
Now let A be a proposition. Then ¬A is also a proposition, so that we can construct its
negation ¬¬A. It is easy to see that ¬¬A always has the same truth value as A, since we
have only two truth values T and F. Thus the proposition
¬¬A ⇔ A
is true for every possible truth value of A. Propositions of this kind, true for all truth values
of their constituents, are called logical laws (or tautologies). For example,
(A ∧ B) ⇒ (A ∨ B)
is a law.
The shortest way to justify this is to build the truth table for (A ∧ B) ⇒ (A ∨ B).
A B A∧B A∨B (A ∧ B) ⇒ (A ∨ B)
T T T T T
T F F T T
F T F T T
F F F F T
As we see the last column consists entirely of T ’s, which means that this is a law.
Below we list without proof some of the most important logical laws. We recommend that
you verify them by constructing the truth tables of these laws.
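Such laws can also be checked mechanically by enumerating all combinations of truth values. The following short Python sketch (the helper names are ad hoc, chosen only for this illustration) verifies that (A ∧ B) ⇒ (A ∨ B) and ¬¬A ⇔ A are laws, while A ⇒ B alone is not.

    from itertools import product

    def implies(p, q):
        # Truth table of "p => q": false only when p is true and q is false.
        return (not p) or q

    def is_law(formula, nvars):
        # A formula is a law (tautology) if it is true for every combination
        # of truth values of its constituent propositions.
        return all(formula(*values) for values in product([True, False], repeat=nvars))

    # (A and B) => (A or B)
    print(is_law(lambda a, b: implies(a and b, a or b), 2))   # True
    # not(not A) <=> A
    print(is_law(lambda a: (not (not a)) == a, 1))            # True
    # A => B is NOT a law (it fails when A is true and B is false)
    print(is_law(lambda a, b: implies(a, b), 2))              # False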
• Commutative laws
(1.2.1) (A ∨ B) ⇔ (B ∨ A)
(1.2.3) (A ∧ B) ⇔ (B ∧ A)
• Idempotent laws
(1.2.7) (A ∧ A) ⇔ A, (A ∨ A) ⇔ A
• Absorption laws
(A ∧ T) ⇔ A, (A ∧ F) ⇔ F,
(1.2.8) (A ∨ T) ⇔ T, (A ∨ F) ⇔ A
• Syllogistic law
[(A ⇒ B) ∧ (B ⇒ C)] ⇒ (A ⇒ C)
• Law of the excluded middle and law of contradiction
(1.2.10) (A ∨ ¬A) ⇔ T
(1.2.11) (A ∧ ¬A) ⇔ F
• De Morgan's laws
¬(A ∧ B) ⇔ (¬A ∨ ¬B), ¬(A ∨ B) ⇔ (¬A ∧ ¬B)
• Contrapositive law
(A ⇒ B) ⇔ (¬B ⇒ ¬A), which, by the truth table for implication, may also be written as
(1.2.15) (A ⇒ B) ⇔ (¬A ∨ B)
1.3 Sets
The notion of a set is one of the basic notions. We cannot, therefore, give it a precise
mathematical definition. Roughly speaking, a set is a collection of objects, of any nature and of any size, regarded as a single whole.
This is obviously not a definition (what is a "collection"? what is "size"?). A rigorous set theory
is constructed in the axiomatic way. We however confine ourselves to the naive set theory,
introducing some axioms only in order to clarify the notion of a set. We accept as the basic
notions “set”, and the relation “to be an element of a set”. If A is a set we write a ∈ A to
express that “a is an element of the set A”, or “a belongs to A”. If a is not an element of A
we write a ∉ A. So the following is true for any x and A:
(x ∉ A) ⇔ ¬(x ∈ A).
Example 1.3.1. (i) A = {0, 1}. This means that the set A consists of two elements, 0
and 1.
(ii) B = {0, {1}, {0, 1}}. The set B contains three elements: the number 0; the set {1}
containing one element: the number 1; and the set containing two elements: the numbers
0 and 1.
The order in which the elements are listed is irrelevant. Thus {0, 1} = {1, 0}.
Example 1.3.2. (i) C = {x ∈ N | x is a prime number}. The set C contains all primes.
We cannot list them for the reason that there are infinitely many primes. (Here N is
the set of natural numbers 1, 2, 3 . . . )
(ii) D = {x ∈ R | x^2 − x = 0}. The set D contains the roots of the equation x^2 − x = 0. But
these are 0 and 1, so the set D contains the same elements as the set A in the previous
example. In this case we say that the sets A and D coincide, and write A = D. (Here
R is the set of real numbers.)
If the sets A and B contain the same elements, then they coincide (or are equal).
We say that A is a subset of B, and write A ⊂ B, if every element of A is an element of B. Note that
(A = B) ⇔ [(A ⊂ B) ∧ (B ⊂ A)].
This is the main method we will use to prove that two sets are equal.
Since it happens that a set may contain no elements, the following definition is useful.
Definition 1.3.2. The empty set is the set which contains no elements.
This in particular means that the set of elephants taking this course of Analysis coincides
with the set of natural numbers solving the equation x^2 − 2 = 0.
Example. Let S = {0, 1, 2}. Then S has the following subsets:
∅, {0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}, {0, 1, 2}.
Subsets of a set A which do not coincide with A are called proper subsets. In this example
all subsets except for the last one are proper.
Definition 1.3.3. The set of all subsets of a set A is called the power set of A, and is denoted
by P (A) (or 2A in some books).
Note that in the last example S contains 3 elements, whereas P (S) has 8 elements (23 = 8).
In general, if a set A has n elements then P (A) has 2n elements (provided n is finite). Prove
it!
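As a small illustration of this count (not a proof), one can generate all subsets of a finite set by a program; the Python sketch below lists them and confirms 2^3 = 8 for S = {0, 1, 2}.

    from itertools import combinations

    def power_set(elements):
        # All subsets of the given elements, from the empty set up to the whole set.
        return [set(c) for r in range(len(elements) + 1)
                       for c in combinations(elements, r)]

    S = [0, 1, 2]
    subsets = power_set(S)
    print(len(subsets))          # 8, i.e. 2**3
    print(len(power_set([])))    # 1: the only subset of the empty set is the empty set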
Union of sets
Definition 1.3.4. The union of sets A and B is the set containing the elements of A and the
elements of B, and no other elements.
(x ∈ A ∪ B) ⇔ (x ∈ A) ∨ (x ∈ B).
Intersection of sets
Definition 1.3.5. The intersection of sets A and B is the set containing the elements which
are elements of both A and B, and no other elements.
(x ∈ A ∩ B) ⇔ (x ∈ A) ∧ (x ∈ B).
Difference of sets
Definition 1.3.6. The difference of sets A and B is the set containing the elements of A
which do not belong to B.
We use the notation A − B for the difference. The following is true for arbitrary x and
arbitrary A and B :
(x ∈ A − B) ⇔ [(x ∈ A) ∧ (x 6∈ B)].
Symmetric difference
The symmetric difference of the sets A and B is the set
A △ B = (A − B) ∪ (B − A).
A ∪ B = {0, 1, 2, 3, 4, 5, 7, 9}.
A ∩ B = {1, 3, 5}.
Note that
A ∪ B = (A ∩ B) ∪ (A △ B).
• Associative laws
(1.3.18) A ∪ (B ∪ C) = (A ∪ B) ∪ C,
(1.3.19) A ∩ (B ∩ C) = (A ∩ B) ∩ C.
• Distributive laws
(1.3.20) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C),
(1.3.21) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
• Idempotent laws
(1.3.22) A ∪ A = A, A ∩ A = A.
The proofs of the above laws are based on the corresponding laws for conjunction and
disjunction.
(1.3.23) A ∪ (B − A) = A ∪ B.
Proof.
[x ∈ A ∪ (B − A)] ⇔ {(x ∈ A) ∨ [(x ∈ B) ∧ ¬(x ∈ A)]}
⇔ [(x ∈ A) ∨ (x ∈ B)] ∧ [(x ∈ A) ∨ ¬(x ∈ A)]
⇔ [(x ∈ A) ∨ (x ∈ B)],
since [(x ∈ A) ∨ ¬(x ∈ A)] is true.
From the last formula it follows that the difference is not an inverse operation to the union
(which means that in general A ∪ (B − A) ≠ B).
(1.3.24) A − B = A − (A ∩ B).
Proof.
[x ∈ A − (A ∩ B)] ⇔ {(x ∈ A) ∧ ¬(x ∈ A ∩ B)}
⇔ {(x ∈ A) ∧ ¬[(x ∈ A) ∧ (x ∈ B)]}
⇔ {(x ∈ A) ∧ [¬(x ∈ A) ∨ ¬(x ∈ B)]}
⇔ {[(x ∈ A) ∧ ¬(x ∈ A)] ∨ [(x ∈ A) ∧ ¬(x ∈ B)]}
⇔ [(x ∈ A) ∧ ¬(x ∈ B)]
⇔ (x ∈ A − B).
De Morgan’s laws
(1.3.25) A − (B ∩ C) = (A − B) ∪ (A − C),
(1.3.26) A − (B ∪ C) = (A − B) ∩ (A − C).
If all the sets under consideration are subsets of a fixed set X and we write A^c = X − A for the complement of A in X, then
(1.3.27) (A^c)^c = A,
(1.3.28) (A ∩ B)^c = A^c ∪ B^c,
(1.3.29) (A ∪ B)^c = A^c ∩ B^c.
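Identities such as (1.3.23)–(1.3.29) are easy to test on small finite examples. A minimal Python sketch (the concrete sets A, B, C and the universal set X are invented for the illustration):

    A = {0, 1, 3, 5}
    B = {1, 2, 3}
    C = {3, 4, 5}
    X = set(range(10))                  # universal set used for complements

    def complement(S):
        return X - S

    print(A - (B | C) == (A - B) & (A - C))                      # De Morgan's law (1.3.26)
    print(A | (B - A) == A | B)                                  # identity (1.3.23)
    print(complement(A & B) == complement(A) | complement(B))    # (1.3.28)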
Example 1.4.1. Let A ≡ {x ∈ R | x^2 − x = 0}. (The notation R is used to denote the set of
real numbers, which we will discuss in detail later on.) Then
We often want to say that some property holds for every element from S. In this case we
use the universal quantifier ∀. So for “for all x ∈ S A(x)” we write (∀x ∈ S) A(x). After
applying the universal quantifier to a predicate we obtain a proposition (which may be true
or false, as usual). The universal quantifier ∀ substitutes for the words “every”, “for every”,
“any”, “for all”.
Example 1.4.2. (i) The proposition "Every real number has a non-negative square" can
be written as
(∀x ∈ R) [x^2 ≥ 0].
(ii) The (false) proposition "every real number is non-negative" can be written as
(∀x ∈ R) [x ≥ 0].
Note that the proposition [(∀x ∈ S) A(x)] means that the truth set of A(x) is the whole set
S. Thus if A(x) is false for some element x of S, the proposition [(∀x ∈ S) A(x)] is false; hence
in order to show that it is false it is enough to find one element of S for which A(x) is false.
To express that a property holds for some element of S, or in other words, “there exists
an element in S such that A(x) holds”, we use
the existential quantifier ∃ and write
(∃x ∈ S) A(x).
In propositions “(∀x) P (x)” and “(∃x) P (x)” P (x) may itself contain quantifiers.
Example 1.4.4. (i)
(∀x ∈ N)(∃y ∈ N) [y = x + x].
(ii)
(∀x ∈ R)(∃y ∈ R) [y < x].
This means that the negation of a quantified proposition is obtained by pushing the negation
to the inside and changing ∀ to ∃ and ∃ to ∀:
¬[(∀x ∈ S) A(x)] ⇔ (∃x ∈ S) ¬A(x),   ¬[(∃x ∈ S) A(x)] ⇔ (∀x ∈ S) ¬A(x).
Example 1.4.5. Consider the following proposition
Example 1.4.6.
Definition 1.5.1. The ordered pair (a, b) of elements a and b is defined to be the set
(a, b) = {{a}, {a, b}}.
2) (only if) [(a, b) = (c, d)] ⇒ [(a = c) ∧ (b = d)]. Suppose that (a, b) = (c, d), i.e.
{{a}, {a, b}} = {{c}, {c, d}}.
Case 1. a = b. In this case {{a}, {a, b}} = {{a}}. Hence
{{a}} = {{c}, {c, d}}, so {c} = {a} and {c, d} = {c},
i.e. a = b = c = d as required.
Case 2. a ≠ b. In this case we have either {c} = {a} or {c} = {a, b}.
But [{c} = {a, b}] is false since a ≠ b. Hence {c} = {a}, so a = c. Since a ≠ b, it follows
that {a} ≠ {a, b}. Therefore {c, d} = {a, b} and, as a = c, {a, d} = {a, b}. Hence b = d, as
required.
Thus the definition of ordered pair, however “artificial” it may appear, gives to ordered
pairs the crucial property that we require of them; and that is all we can reasonably require
from a mathematical definition.
Now we define Cartesian product which is an essential tool for further development.
Definition 1.5.2. Let A and B be sets. The Cartesian product of A and B, denoted by
A × B, is the set of all ordered pairs (a, b) in which a ∈ A and b ∈ B, i.e.
A × B = {(a, b) | (a ∈ A) ∧ (b ∈ B)}.
Thus
(p ∈ A × B) ⇔ {(∃a ∈ A)(∃b ∈ B) [p = (a, b)]}.
Example 1.5.1. 1. If A ={red, green} and B = {1, 2, 3} then
A × B = {(red, 1), (red, 2), (red, 3), (green, 1), (green, 2), (green, 3)}.
2. R × R = {(x, y) | x and y are real numbers}. These are coordinates of all points in the
plane. The notation R2 is usually used for this set.
X × X is called the Cartesian square of X.
The following theorem provides some basic properties of the Cartesian product.
Theorem 1.5.2. Let A, B, C, D be sets. Then
(1.5.31) A × (B ∩ C) = (A × B) ∩ (A × C),
(1.5.32) A × (B ∪ C) = (A × B) ∪ (A × C),
(1.5.33) A × ∅ = ∅ × A = ∅.
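For finite sets the Cartesian product can be listed explicitly; the following Python sketch reproduces Example 1.5.1 and checks two of the properties of Theorem 1.5.2 on small invented sets.

    from itertools import product

    A = {"red", "green"}
    B = {1, 2, 3}
    C = {2, 3}

    AxB = set(product(A, B))
    print(AxB)                                             # the six pairs of Example 1.5.1
    # Property (1.5.31): A x (B n C) = (A x B) n (A x C)
    print(set(product(A, B & C)) == set(product(A, B)) & set(product(A, C)))
    # Property (1.5.33): the product with the empty set is empty
    print(set(product(A, set())) == set())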
1.6 Relations
Definition 1.6.1. Let X, Y be sets. A set R ⊂ X × Y is called a relation from X to Y .
If (x, y) ∈ R, we say that x is in the relation R to y. We will also write in this case xRy.
Example 1.6.1. 1. Let A = {1, 2, 3}, B = {3, 4, 5}. The set R = {(1, 3), (1, 5), (3, 3)} is
a relation from A to B since R ⊂ A × B.
2. G = {(x, y) ∈ R × R | x > y} is a relation from R to R.
Definition 1.6.2. Let R be a relation from X to Y . The domain of R is the set
D(R) = {x ∈ X | ∃y ∈ Y [(x, y) ∈ R]}.
The range of R is the set
Ran(R) = {y ∈ Y | ∃x ∈ X [(x, y) ∈ R]}.
The inverse of R is the relation R−1 from Y to X defined as follows
R−1 = {(y, x) ∈ Y × X | (x, y) ∈ R}.
Definition 1.6.3. Let R be a relation from X to Y , S be a relation from Y to Z. The
composition of S and R is a relation from X to Z defined as follows
S ◦ R = {(x, z) ∈ X × Z | ∃y ∈ Y [(x, y) ∈ R] ∧ [(y, z) ∈ S]}.
Theorem 1.6.2. Let R be a relation from X to Y , S be a relation from Y to Z, T be a
relation from Z to V . Then
1. (R−1 )−1 = R.
2. D(R−1 ) = Ran(R).
3. Ran(R−1 ) = D(R).
4. T ◦ (S ◦ R) = (T ◦ S) ◦ R.
5. (S ◦ R)−1 = R−1 ◦ S −1 .
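Since a relation is simply a set of ordered pairs, the operations above can be coded directly and the identities of Theorem 1.6.2 checked on small examples. A minimal Python sketch (the relations R and S below are invented for the illustration):

    def inverse(R):
        # R^{-1} consists of the reversed pairs of R.
        return {(y, x) for (x, y) in R}

    def compose(S, R):
        # S o R = {(x, z) | there is y with (x, y) in R and (y, z) in S}.
        return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

    R = {(1, 3), (1, 5), (3, 3)}          # a relation from A = {1, 2, 3} to B = {3, 4, 5}
    S = {(3, "a"), (5, "b")}              # a relation from B to some set Z

    print(compose(S, R))                                                # {(1,'a'), (1,'b'), (3,'a')}
    print(inverse(compose(S, R)) == compose(inverse(R), inverse(S)))    # property 5
    print(inverse(inverse(R)) == R)                                     # property 1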
Next we take a look at some particular types of relations. Let us consider relations from
X to X, i.e. subsets of X × X. In this case we talk about relations on X.
A simple example of such a relation is the identity relation on X which is defined as follows
iX = {(x, y) ∈ X × X | x = y}.
Definition 1.6.4. 1. A relation R on X is said to be reflexive if
(∀x ∈ X) (x, x) ∈ R.
2. R is said to be symmetric if
(∀x ∈ X)(∀y ∈ X) {[(x, y) ∈ R] ⇒ [(y, x) ∈ R]}.
3. R is said to be transitive if
(∀x ∈ X)(∀y ∈ X)(∀z ∈ X){[((x, y) ∈ R) ∧ ((y, z) ∈ R)] ⇒ [(x, z) ∈ R]}.
Equivalence relations
Definition. A relation R on X is called an equivalence relation if it is reflexive, symmetric and transitive.
Consider, for example, the relation on R given by
R = {(x, y) | |x − y| ≤ a}   (a > 0 fixed),
and the relation on Z given by
x ≡ y if (∃k ∈ Z)(x − y = km),   where m ∈ N is fixed.
From the above proposition it follows that the equivalence classes are disjoint and every
element of the set X belongs to an equivalence class (the union of the classes equals the
set X).
Remark 1. See more on this in [1].
1.7 Functions
1.7.1 Function
The notion of a function is of fundamental importance in all branches of mathematics. You
met functions in your previous study of mathematics, but without precise definition. Here
we give a definition and make connections to examples you learned before.
Definition 1.7.1. Let X and Y be sets. Let F be a relation from X to Y . Then F is called
a function if the following properties are satisfied
(i) (∀x ∈ X)(∃y ∈ Y ) [(x, y) ∈ F ].
In this case it is customary to write y = F (x).
(ii) (∀x ∈ X)(∀y ∈ Y )(∀z ∈ Y ){([(x, y) ∈ F ] ∧ [(x, z) ∈ F ]) ⇒ (y = z)}.
(In other words, for every x ∈ X there is only one y ∈ Y such that (x, y) ∈ F ).
X is called the domain of F and Y is called codomain.
Theorem. Let F : X → Y and G : X → Y be functions. Then [(∀x ∈ X)(F(x) = G(x))] ⇔ (F = G).
Proof. 1. (⇒). Let x ∈ X. Then (∃y ∈ Y)(y = F(x)). But G(x) = F(x), so (x, y) ∈ G.
Therefore F ⊂ G. Analogously one sees that G ⊂ F.
2. (⇐). Obvious since the sets F and G are equal.
The above theorem says that in order to establish that two functions are equal one has to
establish that they have the same domain and codomain and that for every element of the domain
they have equal images. Note that equal functions may be defined by different "rules".
Example 1.7.3. Let f : R → R, g : R → R, h : R → R+ (By R+ we denote the set of
non-negative real numbers). Let ∀x ∈ R
Then f and g are equal, but f and h are not since they have different codomains.
Proof. We know that g ◦ f is a relation. So we must prove that for every x ∈ X there exists
a unique element z ∈ Z such that (x, z) ∈ g ◦ f .
Existence: Let x ∈ X be arbitrary. Then ∃y ∈ Y such that y = f(x), or in other words,
(x, y) ∈ f. Also ∃z ∈ Z such that z = g(y), or in other words, (y, z) ∈ g. By the definition this
means that (x, z) ∈ g ∘ f. Moreover, we see that z = g(f(x)), i.e. (g ∘ f)(x) = g(f(x)).
The following theorem (which we give without proof) establishes some properties of images
of sets.
Theorem 1.7.8. Let f : X → Y and A ⊂ X, B ⊂ X. Then
(i) f(A ∪ B) = f(A) ∪ f(B);
(ii) f(A ∩ B) ⊂ f(A) ∩ f(B).
Remark 2. Note that in (ii) there is no equality in general. Consider the following example.
Let f : R → R be defined by f(x) = x^2. Let A = [−1, 1/2] and B = [−1/2, 1]. Then f(A) = [0, 1],
f(B) = [0, 1], so that f(A) ∩ f(B) = [0, 1]. At the same time A ∩ B = [−1/2, 1/2], and hence
f(A ∩ B) = [0, 1/4].
Definition 1.7.4. Let f : X → Y and B ⊂ Y. The inverse image of B under f is the set
f^{−1}(B) = {x ∈ X | f(x) ∈ B}.
Example 1.7.9. Let f : R → R be defined by f(x) = x^2. Let B = [−1, 4]. Then f^{−1}(B) =
[−2, 2].
The following theorem (which we give without proof) establishes some properties of inverse
images of sets.
The above definition means that f is a one-to-one correspondence between X and Ran(f).
Using the contrapositive law one can rewrite the above definition as follows.
Then f is an injection.
In contrast define g : X → Y as
The above definition means that Ran(f ) = Y . For this reason surjections are sometimes
called onto.
Now we are ready to answer the question about the inverse to a function. First, recall
that if f : X → Y then f −1 is a relation from Y to X with the properties D(f −1 ) = Ran(f )
and Ran(f −1 ) = D(f ). So, if f −1 is a function from Y to X then D(f −1 ) = Y . Therefore we
conclude that the condition Ran(f ) = Y is a necessary condition for f −1 to be a function.
This means that f has to be surjective. Also, from the definition of the inverse relation f^{−1}
it is clear that injectivity of f is also a necessary condition for f^{−1} to be a function. Hence
bijectivity of f is a necessary condition for f^{−1} to be a function.
It turns out that this is also a sufficient condition.
Theorem 1.7.14. Let f : X → Y . Then
(f −1 : Y → X) ⇔ (f is a bijection).
f −1 ◦ f = iX and f ◦ f −1 = iY .
This is the same as to say that f −1 ◦ f = iX . The second statement is similar and is left
as an exercise.
(A ⇒ B) ⇔ (¬B ⇒ ¬A).
So to prove that A ⇒ B is true is the same as to prove that ¬A ∨ B is true or else that A ∧ ¬B
is false.
2. In the remaining part of this short section we will discuss an important property of
natural numbers. The set of natural numbers
N = {0, 1, 2, 3, 4, . . . }
The Principle of Mathematical Induction is often used when one needs to prove the state-
ment of the form
(∀n ∈ N+ ) P (n)
or similar types of statements.
Since there are infinitely many natural numbers we cannot check one by one that they all
have property P . The idea of mathematical induction is that to list all natural numbers one
has to start from 1 and then repeatedly add 1. Thus one can show that 1 has property P
and that whenever one adds 1 to a number that has property P , the resulting number also
has property P .
Principle of Mathematical Induction. If for a statement P(n)
(i) P(1) is true (the base case), and
(ii) for every k ∈ N+ the implication P(k) ⇒ P(k + 1) is true (the induction step),
then P(n) is true for all n ∈ N+.
Example. Prove that for every n ∈ N+
1 + 3 + 5 + · · · + (2n − 1) = n^2.
Proof. The base case has already been verified while working out the conjecture (for n = 1 we have 1 = 1^2).
Induction step:
Let
1 + 3 + 5 + · · · + (2k − 1) = k^2 (k ≥ 1).
Then
1 + 3 + 5 + · · · + (2k − 1) + (2k + 1) = k 2 + 2k + 1 = (k + 1)2 .
This completes the proof by induction.
The base step does not necessarily have to start from 1; it can start from any integer onwards.
Example. Prove that for every n ∈ N the number n^3 − n is divisible by 3.
Solution. The base case n = 0 is clear. For the induction step,
(k + 1)^3 − (k + 1) = k^3 + 3k^2 + 3k + 1 − k − 1 = (k^3 − k) + 3(k^2 + k),
where the first summand is divisible by 3 by the induction hypothesis and the second is obviously divisible by 3.
Example 1.8.6. Prove that (∀n ∈ N)[(n ≥ 5) ⇒ (2^n > n^2)].
Solution. Base case: n = 5. Then 2^5 = 32 > 25 = 5^2, so the statement is true.
Induction step:
Suppose that 2^k > k^2 (k ≥ 5). Then
2^{k+1} = 2 · 2^k > 2k^2.
It remains to check that 2k^2 ≥ (k + 1)^2. Indeed,
2k^2 − (k + 1)^2 = k^2 − 2k − 1 = (k − 1)^2 − 2.
Since k ≥ 5 we have k − 1 ≥ 4 and (k − 1)^2 − 2 ≥ 14 > 0, which proves the above
inequality.
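Induction proves such statements for all n at once; nevertheless it is reassuring to test the first cases numerically. A small Python check of the two examples above (finitely many cases only, so this is an illustration, not a proof):

    # Check the closed form for the sum of the first n odd numbers.
    for n in range(1, 20):
        assert sum(2 * k - 1 for k in range(1, n + 1)) == n ** 2

    # Check 2**n > n**2 for 5 <= n <= 60 (the induction proof covers all n >= 5).
    for n in range(5, 61):
        assert 2 ** n > n ** 2

    print("all checks passed")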
Numbers
N = {0, 1, 2, 3, 4 . . .}
(order)
The first difficulty occurs when we try solving the equation in N
a + x = b,
with b ≤ a. In order to make this equation soluble we have to widen the set N by introducing
0 and negative integers as solutions of the equations
a + x = a (existence of an additive identity)
and
a + x = 0 (existence of additive inverses)
respectively. Our extended system, which is denoted by Z, now contains all integers and can
be arranged in order
. . . , −3, −2, −1, 0, 1, 2, 3, . . .
The next difficulty occurs with the equation
(2.1.1) ax = b,
which need not have a solution x ∈ Z. In order to enable one to solve (2.1.1) (for a ≠ 0) we
have to widen our system of numbers again so that it includes fractions b/a (existence of
multiplicative inverses for the elements of Z − {0}). This motivates the following definition.
Definition 2.1.1. The rational numbers, or rationals, are the elements of the set
{ r = p/q | p ∈ Z, q ∈ N, q ≠ 0 }.
The set of rational numbers will be denoted by Q. When writing p/q for a rational we
always assume that the numbers p and q have no common factor greater than 1.
All the arithmetical operations in Q are straightforward. Let us introduce a relation of
order for rationals.
Definition 2.1.2. Let a, c ∈ Z and b, d ∈ N with b, d > 0. Then
(a/b > c/d) ⇐⇒ (ad > bc).
The following theorem provides a very important property of rationals.
Theorem 2.1.1. Between any two rational numbers there is another (and, hence, infinitely
many others).
Proof. Let a, c ∈ Z and b, d ∈ N, with b, d > 0, and
a/b > c/d.
Notice that
(∀m ∈ N, m > 0) [ a/b > (a + mc)/(b + md) > c/d ].
Indeed, since b, d and m are positive we have
[a(b + md) > b(a + mc)] ⇐⇒ [mad > mbc] ⇔ (ad > bc),
and
[d(a + mc) > c(b + md)] ⇔ (ad > bc).
Consider now the equation
(2.1.2) x^2 = a.
In general (2.1.2) does not have rational solutions. For example the following theorem holds.
Theorem 2.1.2. No rational number has square 2.
Proof. Suppose, for a contradiction, that x = p/q satisfies x^2 = 2, where p ∈ Z, q ∈ N and p, q have
no common factor greater than 1. Then p^2 = 2q^2, so p^2 is even and hence p is even; write p = 2k.
Substituting, we obtain
2k^2 = q^2,
and therefore q is also even. The last statement contradicts our assumption that p and q have
no common factor.
Theorem 2.1.2 provides an example of a number which is not rational, that is, an irrational number. Here
are some other examples of irrational numbers.
Theorem 2.1.3. No rational x satisfies the equation x3 = x + 7.
Proof. First we show that there are no integers satisfying the equation x3 = x + 7. For a
contradiction suppose that there is. Then x(x + 1)(x − 1) = 7 from which it follows that x
divides 7. Hence x can be only ±1, ±7. Direct verification shows that these numbers do not
satisfy the equation.
Second, we show that there is no non-integer fraction satisfying the equation x^3 = x + 7. For a
contradiction suppose that there is: let m/n, with m ∈ Z, n ∈ N, n > 1, and m, n having no
common factor greater than 1, be such that (m/n)^3 = m/n + 7. Multiplying this equality by n^2
we obtain m^3/n = mn + 7n^2, which is impossible since the right-hand side is an integer and
the left-hand side is not.
Example 2.1.4. No rational satisfies the equation x5 = x + 4.
Prove it!
Algebraic numbers A correspond to all real solutions of polynomial equations with integer
coefficients. So x is algebraic if there exist n ∈ N and a_0, a_1, . . . , a_n ∈ Z, not all zero, such that
a_0 + a_1 x + · · · + a_n x^n = 0.
The last three statements become more concrete if we use the arithmetical rule for square
roots to find, to as many decimal places as we please, a set of numbers l ∈ L
1, 1.4, 1.41, 1.414, . . . ,
each of which is greater than the preceding (or equal to it if the last digit is 0) and each
having its square less than 2. Moreover, the numbers got by adding 1 to the last digit of these
numbers l ∈ L form a set of numbers r ∈ R
2, 1.5, 1.42, 1.415, . . . ,
each having its square greater than 2 and each less than (or equal to) the preceding.
If we are now given a particular number a whose square is less than 2, we shall, by going
far enough along the set of numbers 1, 1.4, 1.41, 1.414, . . . , come to one which is greater
than a.
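The two lists of decimals can be generated mechanically by the "greedy digit" rule: at each stage take the largest decimal with one more digit whose square is still less than 2. A short Python sketch (using exact fractions to avoid rounding; the function name is ad hoc):

    from fractions import Fraction

    def cut_approximations(k_max):
        # For each k, l is the largest decimal with k digits after the point
        # such that l**2 < 2; then r = l + 10**(-k) satisfies r**2 > 2.
        pairs = []
        l = Fraction(1)
        for k in range(k_max + 1):
            step = Fraction(1, 10 ** k)
            while (l + step) ** 2 < 2:
                l += step
            pairs.append((l, l + step))
        return pairs

    for l, r in cut_approximations(4):
        print(float(l), float(r))   # 1.0 2.0, 1.4 1.5, 1.41 1.42, 1.414 1.415, 1.4142 1.4143
        assert l ** 2 < 2 < r ** 2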
If, then, we are building a number-system starting with the integers and then including the
rational numbers, we see that an irrational number (such as √2) corresponds to and can be
defined by a cutting of the rationals into two classes L, R of which L has no greatest member
and R no least member. This is Dedekind's definition of irrationals by the cut.
There are cuts which correspond to non-algebraic numbers: the transcendental numbers,
denoted by T. For example, the cut corresponding to the equation 2^x = 3 is transcendental. (Check that
no rational x satisfies 2^x = 3.) The proof that x ∉ A is more complicated. Summarising:
N ⊂ Z ⊂ Q ⊂ A ⊂ T ∪ A = R.
Notation: We use the symbol ∃! in order to say that there exists a unique . . . So (∃!x ∈ R)
has to be read: there exists a unique real number x.
The axioms A.6-A.10 that follow are analogues of A.1-A.5 for the operation of multipli-
cation.
From the axioms above the familiar rules of manipulation of real numbers can be deduced.
As an illustration we present
Example 2.2.1. (∀a ∈ R)[0 · a = 0]. Indeed, by A.11 and A.4 we have
1 · a + 0 · a = (1 + 0)a = 1 · a,
and adding the additive inverse of 1 · a to both sides gives 0 · a = 0.
O.4. (∀a ∈ R)(∀b ∈ R)(∀c ∈ R)[(a > b) ∧ (c > 0) ⇒ (ac > bc)].
Observe that
(∀a ∈ R)(∀b ∈ R){[a > b] ⇔ [a − b > 0]}.
This follows from (O.3).
The completeness axiom states that there are no gaps among the reals. We will give it in the
form of
Dedekind's axiom: Suppose that the system of all real numbers is divided into two classes
L and R, such that every number l ∈ L is less than every number r ∈ R (and R, L ≠ ∅).
Then there is a dividing number ξ with the property that every number less than ξ belongs to L
and every number greater than ξ belongs to R.
The number ξ is either in L or in R. If ξ ∈ R then ξ is the least number in R. If ξ ∈ L
then ξ is the greatest number in L.
Let us express Dedekind’s axiom in the form of a true logical proposition.
1) L ∩ R = ∅.
2) (ξ ∈ L) ∨ (ξ ∈ R) is true.
Inequalities
The order axioms express the properties of the order relation (inequalities) on the set of real
numbers. Inequalities play an extremely important role in analysis. Here we discuss several
ideas how to prove inequalities. But before we start doing so we give the definition of the
absolute value of a real number and derive a simple consequence from it.
Definition 2.2.1. Let a ∈ R. The absolute value |a| of a is defined by
|a| = a if a ≥ 0, and |a| = −a if a < 0.
Theorem 2.2.2.
(∀a ∈ R)(∀b ∈ R) [ |a + b| ≤ |a| + |b| ] .
Proof. Consider separately the four cases:
1) a ≥ 0, b ≥ 0;
2) a ≥ 0, b < 0;
3) a < 0, b ≥ 0;
4) a < 0, b < 0.
In each case the inequality follows directly from the definition of the absolute value.
1. The first idea for proving inequalities is very simple and is based on the equivalence
(a ≥ b) ⇔ (a − b ≥ 0): to prove an inequality, show that the difference of its two sides is non-negative. For example, for all a, b ∈ R we have a^2 + b^2 ≥ 2ab, since
a^2 + b^2 − 2ab = (a − b)^2 ≥ 0.
(Reminder: R+ = {x ∈ R | x ≥ 0}.) For all a, b ∈ R+,
(a + b)/2 ≥ √(ab).
Proof. As above, let us prove that the difference between the left-hand side (LHS) and the
right-hand side (RHS) is non-negative:
(a + b)/2 − √(ab) = (√a − √b)^2 / 2 ≥ 0.
The equality holds iff a = b.
2. The second idea is to use already proved inequalities and axioms to derive new ones.
Example 2.2.5. Prove that for all a, b, c ∈ R
a^2 + b^2 + c^2 ≥ ab + ac + bc.
Proof. Adding the three inequalities
a^2 + b^2 ≥ 2ab,
a^2 + c^2 ≥ 2ac,
b^2 + c^2 ≥ 2bc,
we obtain 2(a^2 + b^2 + c^2) ≥ 2ab + 2ac + 2bc, which proves the desired inequality. The equality holds iff
a = b = c.
In general this means that when proving that a ≥ c we have to find b such that a ≥ b and
b ≥ c.
Example 2.2.7. Let n ≥ 2 be a natural number. Prove that
1/(n + 1) + 1/(n + 2) + · · · + 1/(2n) > 1/2.
Proof.
1/(n + 1) + 1/(n + 2) + · · · + 1/(2n) > 1/(2n) + 1/(2n) + · · · + 1/(2n)  (n terms)  = n · 1/(2n) = 1/2.
Note that if S is bounded above with upper bound K then any real number greater than
K is also an upper bound of S. Similarly, if S is bounded below with lower bound k then any
real number smaller than k is also a lower bound of S.
Analogously one defines the greatest lower bound for a set bounded below.
Theorem 2.4.1. Let S ⊂ R be non-empty and bounded above. Then sup S exists.
Proof. Let R be the set of all upper bounds of S, and let L be the set of all real numbers which are not upper bounds of S (so l ∈ L if and only if there exists s ∈ S with s > l). Then
1. L ∩ R = ∅, and L 6= ∅, R 6= ∅.
Indeed, since S 6= ∅, it follows that ∃s ∈ S. Then x = s − 1 ∈ L. Let K be an upper
bound of S. Then K ∈ R.
2.
(∀l ∈ L)(∀r ∈ R) (l < r).
Indeed, let l ∈ L and r ∈ R. The by the definition of L there exists s ∈ S such that
s > l. By the definition of R we have that r ≥ s, hence l < r.
From this it follows that the classes L and R satisfy the conditions of Dedekind's axiom. Therefore
there exists a dividing number ξ: every number less than ξ belongs to L and every number
greater than ξ belongs to R. Then ξ is an upper bound of S (otherwise some s ∈ S would satisfy
s > ξ; but then any y with ξ < y < s would belong to R, i.e. be an upper bound of S, which is
impossible since y < s), and no number smaller than ξ is an upper bound of S, since it belongs to L.
By the definition of the supremum we conclude that ξ = sup S, which proves the theorem.
We use the following notation
−S = {x ∈ R | − x ∈ S}.
Theorem 2.4.2. Let S ⊂ R be non-empty and bounded below. Then inf S exists and is equal
to − sup(−S).
Proof. Since S is bounded below, it follows that ∃k ∈ R such that (∀x ∈ S)(x ≥ k). Hence
(∀x ∈ S)(−x ≤ −k) which means that −S is bounded above. By Theorem 2.4.1 there exists
ξ = sup(−S). We will show that −ξ = inf S. First, we have that (∀x ∈ S)(−x ≤ ξ), so
that x ≥ −ξ. Hence −ξ is a lower bound of S. Next, let η be another lower bound of S, i.e.
(∀x ∈ S)(x ≥ η). Then (∀x ∈ S)(−x ≤ −η). Hence −η is an upper bound of −S, and by the
definition of supremum −η ≥ sup(−S) = ξ. Hence η ≤ −ξ which proves the required.
Theorem 2.4.3. (The Archimedian Principle) Let x > 0 and y ∈ R. Then there exists
n ∈ Z such that y < nx.
Proof. Suppose, for a contradiction, that no such n exists, i.e. (∀n ∈ Z)(nx ≤ y).
Then the set A := {nx | n ∈ Z} is bounded above and y is an upper bound of it. Then by Theorem
2.4.1 there exists sup A. The number sup A − x < sup A is not an upper bound of A, so there
exists m ∈ Z such that mx > sup A − x, or otherwise
(2.4.3) (m + 1)x > sup A.
But m + 1 ∈ Z, hence (m + 1)x ∈ A, and (2.4.3) contradicts the definition of the supremum.
Corollary 2.4.1. N is unbounded.
Corollary 2.4.2.
(∀x > 0)(∀y > 0)(∃n ∈ N) (y/n < x).
Proof. By the Archimedian Principle (Theorem 2.4.3) there exists n ∈ N such that y < nx, and hence y/n < x.
Example. Let A = { (n − 1)/(2n) | n ∈ N, n ≠ 0 }. Then inf A = 0 and sup A = 1/2.
Proof. All the elements of A are positive, except for the first one, which is 0. Therefore
min A = 0, and hence inf A = 0.
Now notice that
(∀n ∈ N, n ≠ 0) [ (n − 1)/(2n) = 1/2 − 1/(2n) < 1/2 ].
Hence 1/2 is an upper bound of A.
We have to prove that 1/2 is the least upper bound. For this we have to prove that
(∀ε > 0)(∃n ∈ N, n ≠ 0) [ (n − 1)/(2n) > 1/2 − ε ].
Notice that
[ (n − 1)/(2n) > 1/2 − ε ] ⇔ [ 1/2 − 1/(2n) > 1/2 − ε ] ⇔ [ 1/(2n) < ε ] ⇔ [ n · (2ε) > 1 ].
By the Archimedian Principle there exists such an n ∈ N.
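A numerical illustration of this example (not a proof): the terms (n − 1)/(2n) stay below 1/2, and for a given ε one can search for an index n with (n − 1)/(2n) > 1/2 − ε, which the Archimedian Principle guarantees to exist. A small Python sketch:

    from fractions import Fraction

    def a(n):
        return Fraction(n - 1, 2 * n)

    print([float(a(n)) for n in (1, 2, 3, 10, 100)])              # 0.0, 0.25, 0.333..., 0.45, 0.495
    assert all(a(n) < Fraction(1, 2) for n in range(1, 10**4))    # 1/2 is an upper bound

    eps = Fraction(1, 1000)
    n = 1
    while a(n) <= Fraction(1, 2) - eps:    # terminates by the Archimedian Principle
        n += 1
    print(n, float(a(n)))                  # an index with a(n) > 1/2 - eps (here n = 501)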
Note that for any set A ⊂ R if max A exists then sup A = max A. Also if min A exists
then inf A = min A.
Theorem 2.4.6. For any interval (a, b) there exists a rational r ∈ (a, b). In other words,
(∀a ∈ R)(∀b ∈ R) [ (a < b) ⇒ (∃r ∈ Q)(a < r < b) ].
Proof. Let h = b − a > 0. By Corollary 2.4.2 there exists n ∈ N such that 1/n < h. By the
Archimedian Principle there exist m, m′ ∈ N such that m/n > a and m′/n > −a.
Now the set of integers between −m′ and m is finite. Let m″ be the least element of that
set such that
m″/n > a.
Set r = m″/n. Then we have
a < r = m″/n = (m″ − 1)/n + 1/n ≤ a + 1/n < a + h = b.
The next theorem is a multiplicative analogue of the Archimedian Principle.
Theorem 2.4.7. Let x > 1, y > 0. Then
(∃n ∈ N) (x^n > y).
Prove this theorem by repeating the argument from the proof of Theorem 2.4.3.
Theorem 2.4.8. (∃!x > 0) (x^2 = 2).
3.1 Sequences
In general any function f : N − {0} → X with domain N+ and arbitrary codomain X is
called a sequence. This means that a sequence (an )n∈N+ is a subset of X each element of
which has its number, i.e. an = f (n). (We do not call this a series - this term is used for
another concept.)
In this course we confine ourselves to sequences of numbers. In particular, in this chapter
we always assume that X = R. So we adopt the following definition.
We will give a precise mathematical definition of the notion of a null sequence.
Definition 3.2.1. (a_n)_{n∈N+} is a null sequence if
(∀ε > 0)(∃N ∈ N+)(∀n ∈ N+) [ (n > N) ⇒ (|a_n| < ε) ].
Now we are ready to verify this definition rigorously for some examples.
Example 3.2.1. (a_n)_{n∈N+} with a_n = 1/n is a null sequence.
The next theorem follows straight from the definition of a null sequence and the simple
fact that (∀a ∈ R) (|a| = ||a||).
Theorem 3.2.2. A sequence (an )n∈N+ is a null sequence if and only if the sequence (|an |)n∈N+
is a null sequence.
Example 3.2.3. (a_n)_{n∈N+} with a_n = (−1)^n / n is a null sequence.
B(x, ε) := (x − ε, x + ε) = {y ∈ R | |y − x| < ε}
Using this notion one can say that a sequence (a_n)_{n∈N+} is a null sequence if and only if
for any ε > 0, starting from some index, all the elements of the sequence belong to the
ε-neighbourhood of zero.
2. lim_{n→∞} a_n = a if and only if for any ε > 0, starting from some index, all the elements
of the sequence belong to the ε-neighbourhood of a.
Example. lim_{n→∞} n/(n + 1) = 1.
Proof. Let ε > 0 be given. We have to find N ∈ N+ such that |a_n − a| < ε for all n > N. For
our example the last inequality reads
| n/(n + 1) − 1 | < ε.
Choose N to be any integer greater than or equal to 1/ε − 1 (such an integer exists by the
Archimedian Principle). Then, if n > N, we have
n > N ≥ 1/ε − 1,
so that n + 1 > 1/ε. Therefore
1/(n + 1) < ε, i.e. | n/(n + 1) − 1 | < ε.
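The proof is constructive: given ε it produces an explicit N. The small Python sketch below computes such an N for a few values of ε (exact fractions are used so that the arithmetic is not affected by rounding) and checks the defining inequality for a few thousand subsequent indices.

    from fractions import Fraction
    import math

    def N_for(eps):
        # Any integer N >= 1/eps - 1 works, as in the proof above.
        return max(0, math.ceil(1 / eps - 1))

    for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
        N = N_for(eps)
        print(N)   # 9, 99, 999
        # check |n/(n+1) - 1| < eps for the next few thousand indices
        assert all(abs(Fraction(n, n + 1) - 1) < eps for n in range(N + 1, N + 3000))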
Theorem 3.3.3. A sequence (an )n∈N+ can have at most one limit.
Proof. We have to prove that the following implication is true for all a, b ∈ R
[ (lim_{n→∞} a_n = a) ∧ (lim_{n→∞} a_n = b) ] ⇒ (a = b).
Suppose for a contradiction that the statement is not true. Then its negation
[ (lim_{n→∞} a_n = a) ∧ (lim_{n→∞} a_n = b) ] ∧ (a ≠ b)
is true. By the definition of the limit, for every ε > 0 there exist N1, N2 ∈ N+ such that
|a_n − a| < ε for all n > N1 and |a_n − b| < ε for all n > N2.
Fix ε = (1/3)|a − b| > 0 and take N = max{N1, N2}. Then, if n > N, we have
|a − b| = |(a − a_n) + (a_n − b)| ≤ |a_n − a| + |a_n − b| < 2ε = (2/3)|a − b|,
which is a contradiction.
Theorem 3.3.4. Any convergent sequence is bounded.
Take ε = 1 in the definition of the limit. Then ∃N ∈ N+ such that for n > N we have
|a_n − a| < 1, so that |a_n| ≤ |a| + 1 for n > N. Set M = max{a_1, . . . , a_N, a + 1} and
m = min{a_1, . . . , a_N, a − 1}. Then
(∀n ∈ N+) (m ≤ a_n ≤ M).
Note that Theorem 3.3.4 can be expressed by the contrapositive law in the following way
• If a sequence is unbounded then it diverges.
Moreover, if a > 0 then for the above n, a_n > a/2, and if a < 0 then for the above n, a_n < a/2.
Proof. Fix ε = |a|/2. Then ∃N ∈ N+ such that for n > N
|a|/2 > |a − a_n| ≥ |a| − |a_n|,
which implies |a_n| > |a| − |a|/2 = |a|/2,
and the first assertion is proved. On the other hand,
( |a|/2 > |a − a_n| ) ⇔ ( a − |a|/2 < a_n < a + |a|/2 ),
from which the second assertion follows.
Proof. For a contradiction assume that b < a. Fix ε < (a − b)/2, so that b + ε < a − ε, and choose
N1 and N2 such that the following is true:
[(∀n > N1)(a_n > a − ε)] ∧ [(∀n > N2)(b_n < b + ε)].
If N ≥ max{N1, N2} then, for n > N,
b_n < b + ε < a − ε < a_n,
which contradicts the condition (∀n ∈ N+) (a_n ≤ b_n).
Theorem 3.3.7. (Sandwich rule) Let a ∈ R and (a_n)_n, (b_n)_n, (c_n)_n be real sequences such that
(∀n ∈ N+)(a_n ≤ c_n ≤ b_n) and lim_{n→∞} a_n = lim_{n→∞} b_n = a. Then lim_{n→∞} c_n = a.
Theorem (algebra of limits). Let lim_{n→∞} a_n = a and lim_{n→∞} b_n = b. Then
(i) lim_{n→∞} (a_n + b_n) = a + b;
(ii) lim_{n→∞} (a_n · b_n) = a · b;
(iii) if in addition b ≠ 0 and (∀n ∈ N+)(b_n ≠ 0), then
lim_{n→∞} (a_n / b_n) = a / b.
Proof. (i) Let ε > 0. Choose N1 such that |a_n − a| < ε/2 for all n > N1. Choose N2 such
that |b_n − b| < ε/2 for all n > N2. Then for all n > max{N1, N2}
|(a_n + b_n) − (a + b)| ≤ |a_n − a| + |b_n − b| < ε.
(ii) Since (b_n)_n is convergent, it is bounded by Theorem 3.3.4. Let K be such that
(∀n ∈ N+) (|b_n| ≤ K) and |a| ≤ K. We have
|a_n b_n − ab| = |(a_n − a)b_n + a(b_n − b)| ≤ K|a_n − a| + K|b_n − b|,
and the right-hand side can be made smaller than any given ε > 0 for all sufficiently large n.
(iii) We have
| a_n/b_n − a/b | = | (a_n b − a b_n)/(b_n b) | ≤ ( |a_n − a||b| + |a||b_n − b| ) / |b_n b|.
A sequence of real numbers (a_n)_{n∈N+} is called
(i) increasing if (∀n ∈ N+)(a_{n+1} ≥ a_n);
(ii) decreasing if (∀n ∈ N+)(a_{n+1} ≤ a_n);
(iii) monotone if it is either increasing or decreasing.
The next theorem is one of the main theorems in the theory of converging sequences.
Theorem 3.4.2. If a sequence of real numbers is bounded above and increasing then it is
convergent.
Proof. Let (a_n)_{n∈N+} be a bounded above increasing sequence of real numbers. Since the set
{a_n | n ∈ N+} is bounded above, there exists the supremum
a := sup_n {a_n}.
Let ε > 0. By the definition of the supremum there exists N ∈ N+ such that a_N > a − ε.
Fix this N. Since (a_n)_n is increasing, (∀n > N) (a_n ≥ a_N). Therefore, for all n > N,
a − ε < a_N ≤ a_n ≤ a < a + ε,
i.e. |a_n − a| < ε. Hence lim_{n→∞} a_n = a.
In complete analogy to the previous theorem one proves the following one.
Theorem 3.4.3. If a sequence of real numbers is bounded below and decreasing then it is
convergent.
In many cases before computing the limit of a sequence one has to prove that the limit
exists, i.e. the sequence converges. In particular, this is the case when a sequence is defined
by a recurrent formula, i.e. the formula for the nth member of the sequence via previous
members with numbers n − 1 and less.
Example. Consider the sequence defined by the recurrence
(3.4.1)  a_1 = √2,  a_{n+1} = √(a_n + 2)  (n ∈ N+).
1. (a_n)_n is bounded.
Indeed, we prove that
(∀n ∈ N+) (0 < a_n ≤ 2).
Proof. Positivity is obvious.
For n = 1, a_1 = √2 ≤ 2, and the statement is true.
Suppose that it is true for n = k (k ≥ 1), i.e. a_k ≤ 2.
For n = k + 1 we have
a_{k+1} = √(a_k + 2) ≤ √(2 + 2) = 2,
and by the principle of induction the statement is proved.
2. (a_n)_n is increasing.
We have to prove that
(∀n ∈ N+) (a_{n+1} ≥ a_n).
Proof. This is equivalent to proving
[(∀n ∈ N+) (√(a_n + 2) ≥ a_n)] ⇔ [(∀n ∈ N+) (a_n + 2 ≥ a_n^2)].
Indeed,
a_n + 2 − a_n^2 = (2 − a_n)(a_n + 1) ≥ 0.
By Theorem 3.4.2 we conclude that (a_n) is convergent. Let lim_n a_n = x. Then, of course,
lim_n a_{n+1} = x. Passing to the limit in (3.4.1) we obtain that
x = √(x + 2),
i.e. x^2 = x + 2, so x = 2 or x = −1; since all a_n > 0, we conclude that x = 2.
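The behaviour established above is easy to observe numerically. A short Python sketch iterating the recurrence (3.4.1):

    import math

    a = math.sqrt(2)                      # a_1 = sqrt(2)
    previous = 0.0
    for n in range(1, 12):
        assert previous <= a <= 2.0       # increasing and bounded by 2, as proved above
        previous = a
        a = math.sqrt(a + 2)              # a_{n+1} = sqrt(a_n + 2)
    print(a)                              # approximately 2.0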
Theorem (nested intervals). Suppose that [a_{n+1}, b_{n+1}] ⊂ [a_n, b_n] for all n ∈ N+ and that
b_n − a_n → 0 as n → ∞. Then there is exactly one point common to all the intervals [a_n, b_n].
Proof. The sequence (a_n)_n is increasing and bounded above, and the sequence (b_n)_n is decreasing and bounded below. Therefore (a_n)_n, (b_n)_n are convergent.
Let a := limn an ; b := limn bn .
Then
a≤b
by Theorem 3.3.6. Moreover,
b − a = lim(bn − an ) = 0.
n
Hence
a = b ∈ ⋂_{n∈N+} [a_n, b_n].
Definition 3.5.1. A sequence of real numbers (a_n)_{n∈N+} is called a Cauchy sequence if
(∀ε > 0)(∃N ∈ N+)(∀n, m ∈ N+) [ (n > N) ∧ (m > N) ⇒ (|a_n − a_m| < ε) ].
Proposition 3.5.1. Every convergent sequence of real numbers is a Cauchy sequence.
Proof. Let a := lim_n a_n. Let ε > 0. Then there exists N ∈ N such that for all n > N,
|a_n − a| < ε/2. Hence for all n, m > N
|a_n − a_m| ≤ |a_n − a| + |a − a_m| < ε.
It turns out that the inverse to Proposition 3.5.1 is also true. This is expressed in the
next theorem that delivers the Cauchy criterion for convergence of sequences.
Theorem 3.5.1. If (an )n∈N+ is a Cauchy sequence of real numbers then it is convergent.
Proof. Let (an )n∈N+ be a Cauchy sequence. Take ε = 1. There exists N ∈ N such that for all n, m > N
|an − am | < 1.
Then we have
(∀n ∈ N) ([αn+1 , βn+1 ] ⊂ [αn , βn ]).
(Justify the above)
The length of the interval [α_n, β_n] converges to zero as n → ∞, since from the Cauchy condition it
follows that for any ε > 0 there exists N ∈ N+ such that for all k > N
aN − ε < ak < aN + ε,
Therefore lim_{n→∞} a_n = a.
3.6 Series
In this section we discuss infinite series. Let us start from an example which is likely to be
familiar from high school. How can we make sense of the infinite sum
1/2 + 1/4 + 1/8 + 1/16 + . . . ?
This is the well-known sum of a geometric progression. One proceeds as follows. Define the sum
of the first n terms
s_n = 1/2 + 1/4 + 1/8 + · · · + 1/2^n = 1 − 1/2^n,
and then define the infinite sum s as lim_{n→∞} s_n, so that s = 1.
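A quick numerical illustration of this definition in Python (the closed form s_n = 1 − 1/2^n derived above is used as a check):

    partial_sum = 0.0
    for n in range(1, 21):
        partial_sum += 1 / 2 ** n
        # The closed form: s_n = 1 - 1/2**n.
        assert abs(partial_sum - (1 - 1 / 2 ** n)) < 1e-12
    print(partial_sum)    # 0.9999990463256836, close to the sum s = 1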
This idea is used to define formally an arbitrary series,
Definition 3.6.1. Let (a_n)_{n∈N+} be a sequence of real (complex) numbers. Let
s_n = a_1 + a_2 + · · · + a_n = ∑_{k=1}^{n} a_k.
We say that the series ∑_{k=1}^{∞} a_k is convergent if the sequence of the partial sums (s_n)_{n∈N+} is convergent. The limit of this
sequence is called the sum of the series.
Theorem. If the series ∑_{n=1}^{∞} a_n is convergent then lim_{n→∞} a_n = 0.
The above theorem expresses the simplest necessary condition for the convergence of a
series.
For example, each of the following series is divergent
1 + 1 + 1 + 1 + ...,
1 − 1 + 1 − 1 + ...
since the necessary condition is not satisfied.
Series of positive terms
The main feature of such a series is that the sequence of the partial sums s_n = ∑_{k=1}^{n} a_k is
increasing.
Indeed, s_n − s_{n−1} = a_n ≥ 0.
This allows us to formulate a simple criterion of convergence for such a series.
Theorem 3.6.6. A series of positive terms is convergent if and only if the sequence of partial
sums is bounded above.
Proof. Indeed, if a series is convergent then the sequence of partial sums is convergent by the
definition, and therefore it is bounded.
Conversely, if the sequence of partial sums is bounded above then it is convergent by
Theorem 3.4.2, and therefore the series is convergent.
Comparison tests
Here we establish tests which make it possible to infer the convergence or divergence of a series
by comparing it with another series whose convergence or divergence is known.
Theorem 3.6.7. Let ∑_{k=1}^{∞} a_k and ∑_{k=1}^{∞} b_k be two series of positive terms. Assume that
(∀k ∈ N+) (a_k ≤ b_k).
Then
(i) if ∑_{k=1}^{∞} b_k is convergent then ∑_{k=1}^{∞} a_k is convergent;
(ii) if ∑_{k=1}^{∞} a_k is divergent then ∑_{k=1}^{∞} b_k is divergent.
Proof. Let s_n = ∑_{k=1}^{n} a_k and s′_n = ∑_{k=1}^{n} b_k. Then (∀n ∈ N+)(s_n ≤ s′_n), both sequences of
partial sums are increasing, and the assertions follow from Theorem 3.6.6.
The following observation (which follows immediately from the definition of convergence
of a series) is important for the next theorem.
The convergence or divergence of a series is unaffected if a finite number of terms are
inserted, or suppressed, or altered.
Theorem 3.6.8. Let (a_n)_n, (b_n)_n be two sequences of positive numbers. Assume that
lim_{n→∞} a_n/b_n = L > 0 (the limit is positive and finite).
Then the series ∑_{k=1}^{∞} a_k is convergent if and only if the series ∑_{k=1}^{∞} b_k is convergent.
Proof. Since a_n/b_n → L, there exists N ∈ N+ such that (L/2) b_n < a_n < (3L/2) b_n for all n > N.
Now the assertion follows from the observation before the theorem and from Theorem 3.3.6.
Example 3.6.9. (i) The series ∑_{n=1}^{∞} 2n/(n^2 + 1) is divergent, since
lim_{n→∞} [ (2n/(n^2 + 1)) / (1/n) ] = 2 and the series ∑_{n=1}^{∞} 1/n is divergent.
(ii) The series ∑_{n=1}^{∞} n/(2n^3 + 2) is convergent, since lim_{n→∞} [ (n/(2n^3 + 2)) / (1/n^2) ] = 1/2 and the series ∑_{n=1}^{∞} 1/n^2 is
convergent.
Theorem 3.6.10. (Cauchy's Test or The Root Test) Let (a_n)_{n∈N+} be a sequence of positive
numbers. Suppose that
lim_{n→∞} (a_n)^{1/n} = l.
Then if l < 1, the series ∑_{n=1}^{∞} a_n converges; if l > 1, the series ∑_{n=1}^{∞} a_n diverges. If l = 1, no
conclusion can be drawn.
Proof. Suppose that l < 1. Choose r such that l < r < 1. Then
(∃N ∈ N)(∀n ∈ N+) [ (n > N) ⇒ ((a_n)^{1/n} < r) ].
Or otherwise a_n < r^n. The convergence follows now from the comparison with the convergent
series ∑_{n=1}^{∞} r^n.
Next suppose that l > 1. Then
(∃N ∈ N)(∀n ∈ N+) [ (n > N) ⇒ ((a_n)^{1/n} > 1) ],
i.e. a_n > 1 for n > N, so the terms do not tend to zero and the series diverges.
Theorem 3.6.11. (D'Alembert's Test or The Ratio Test) Let (a_n)_{n∈N+} be a sequence of
positive numbers. Suppose that
lim_{n→∞} a_{n+1}/a_n = l.
Then if l < 1, the series ∑_{n=1}^{∞} a_n converges; if l > 1, the series ∑_{n=1}^{∞} a_n diverges. If l = 1, no
conclusion can be drawn.
Proof. Suppose that l < 1. Choose r such that l < r < 1. Then
(∃N ∈ N)(∀n ∈ N+) [ (n > N) ⇒ (a_{n+1}/a_n < r) ].
Therefore, for n > N,
a_n = (a_n/a_{n−1}) · (a_{n−1}/a_{n−2}) · · · (a_{N+2}/a_{N+1}) · a_{N+1} < r^{n−N−1} a_{N+1} = (a_{N+1}/r^{N+1}) · r^n.
The convergence follows now from the comparison with the convergent series ∑_{n=1}^{∞} r^n.
Next suppose that l > 1. Then
(∃N ∈ N)(∀n ∈ N+) [ (n > N) ⇒ (a_{n+1}/a_n > 1) ].
Or otherwise a_{n+1} > a_n, so the terms do not tend to zero and the series diverges.
Example. For the series ∑_{n=1}^{∞} (2^n + n)/(3^n − n) we have
lim_{n→∞} a_{n+1}/a_n = lim_{n→∞} (2^{n+1} + n + 1)/(2^n + n) · (3^n − n)/(3^{n+1} − n − 1) = 2/3.
Therefore the series converges.
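The ratio in this example can also be observed numerically. A small Python sketch (the terms a_n = (2^n + n)/(3^n − n) are taken from the displayed ratio above):

    def a(n):
        return (2 ** n + n) / (3 ** n - n)

    for n in (5, 10, 20, 40):
        print(n, a(n + 1) / a(n))      # approaches 2/3 = 0.666...

    s30 = sum(a(n) for n in range(1, 31))
    s60 = sum(a(n) for n in range(1, 61))
    print(s30, s60)                    # the two partial sums agree to many decimal places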
We want to give a precise meaning to the statement "f(x) → b as x → a".
Definition 4.1.1. (Cauchy) Let f : D(f) → R and let c < a < d be such that (c, a) ∪ (a, d) ⊂
D(f). We say that f(x) → b as x → a, and write lim_{x→a} f(x) = b, if
(∀ε > 0)(∃δ > 0)(∀x ∈ D(f)) [ (0 < |x − a| < δ) ⇒ (|f(x) − b| < ε) ].
Note that
(|x − a| < δ) ⇔ (−δ < x − a < δ) ⇔ (a − δ < x < a + δ) ⇔ (x ∈ (a − δ, a + δ)).
lim_{x→a} x = a.
Proof. Let a ∈ R be arbitrary. Fix ε > 0. We need to find δ > 0 such that 0 < |x − a| < δ
implies |x − a| < ε. Clearly one can choose δ = ε: then
(∀x ∈ R) [ (0 < |x − a| < δ) ⇒ (|x − a| < ε) ].
lim_{x→a} x^2 = a^2.
There is another possibility to define a limit of a function. It is based upon the definition
of a limit of a sequence.
Definition 4.1.2. (Heine) Let f be such that (c, a) ∪ (a, d) ⊂ D(f). We say that
f(x) → b as x → a
if for any sequence (x_n)_{n∈N} such that
(i) (∀n ∈ N) [ (x_n ∈ (c, a) ∪ (a, d)) ∧ (x_n ≠ a) ],
(ii) x_n → a as n → ∞,
we have
lim_{n→∞} f(x_n) = b.
In order to use any of the two definitions of the limit of a function we have to make sure
that they are equivalent. This is the matter of the next theorem.
Theorem 4.1.3. Definitions 4.1.1 and 4.1.2 are equivalent.
(0 < |x_n − a| < 1/n) ∧ (|f(x_n) − b| ≥ ε).
lim_{x→−∞} f(x) = A.
Example 4.1.4.
lim_{x→∞} 1/x = 0.
It is sufficient to choose M such that M ε > 1 which exists by the Archimedian Principle.
Limits of functions exhibit many properties similar to those of limits of sequences. Let us
prove uniqueness of the limit of a function at a point using the Heine definition.
Theorem 4.1.5. If lim_{x→a} f(x) = A and lim_{x→a} f(x) = B then A = B.
The following two theorems can be proved using the Heine definition of the limit of a
function at a point and the corresponding property of a limit of a sequence. Proofs are left
as exercises.
Theorem 4.1.7. Let lim_{x→a} f(x) = A and lim_{x→a} g(x) = B. Then
(i) lim_{x→a} (f(x) + g(x)) = A + B;
(ii) lim_{x→a} (f(x) · g(x)) = A · B.
If in addition B ≠ 0 then
(iii) lim_{x→a} (f(x)/g(x)) = A/B.
Theorem 4.1.8. Let lim_{x→a} f(x) = A and lim_{x→a} g(x) = B. Suppose that
(∃δ > 0) [ {x ∈ R | 0 < |x − a| < δ} ⊂ D(f) ∩ D(g) ] and
(0 < |x − a| < δ) ⇒ (f(x) ≤ g(x)).
Then A ≤ B, i.e.
lim_{x→a} f(x) ≤ lim_{x→a} g(x).
Using Theorem 4.1.7 one can easily compute limits of some functions.
Example 4.1.9.
One-sided limits
Definition 4.1.4. (i) Let f be defined on an interval (a, d) ⊂ R. We say that f(x) →
b as x → a+ and write lim_{x→a+} f(x) = b if
(∀ε > 0)(∃δ > 0)(∀x ∈ (a, d)) [ (0 < x − a < δ) ⇒ (|f(x) − b| < ε) ].
(ii) Let f be defined on an interval (c, a) ⊂ R. We say that f(x) → b as x → a− and write
lim_{x→a−} f(x) = b if
(∀ε > 0)(∃δ > 0)(∀x ∈ (c, a)) [ (−δ < x − a < 0) ⇒ (|f(x) − b| < ε) ].
Using the definitions of the limit given in the previous section we can formulate the above
definition in the following way.
Definition 4.2.2. A function f is called continuous at a point a ∈ R if f is defined on an
interval (c, d) containing a and
(∀ε > 0)(∃δ > 0)(∀x ∈ (c, d)) [ (|x − a| < δ) ⇒ (|f(x) − f(a)| < ε) ].
Note the difference between this definition and the definition of the limit: the function f
has to be defined at a.
Using the above definition it is easy to formulate what it means that a function f is
discontinuous at a point a:
A function f is discontinuous at a point a ∈ R if
f is not defined on any neighbourhood (c, d) containing a, or if
(∃ε > 0)(∀δ > 0)(∃x ∈ D(f)) [ (|x − a| < δ) ∧ (|f(x) − f(a)| ≥ ε) ].
One more equivalent way to define continuity at a point is to use the Heine definition of
the limit.
Definition 4.2.3. A function f is called continuous at a point a ∈ R if f is defined on an
interval (c, d) containing a and for any sequence (xn )n∈N such that
(i) (∀n ∈ N) [ (x_n ∈ (c, d)) ∧ (x_n ≠ a) ],
(ii) xn → a as n → ∞
we have
lim f (xn ) = f (a).
n→∞
The following theorem easily follows from the definition of continuity and properties of
limits.
Theorem. Let f and g be continuous at a point a ∈ R. Then
(i) f + g is continuous at a;
(ii) f · g is continuous at a.
Moreover, if g(a) ≠ 0, then
(iii) f/g is continuous at a.
Based on this theorem and the two examples above we conclude that
f(x) = (a_0 x^n + a_1 x^{n−1} + · · · + a_n) / (b_0 x^m + b_1 x^{m−1} + · · · + b_m)
is continuous at every point of its domain of definition.
which means that the one-sided limits exist, are equal and equal the value of the function at
a.
Theorem 4.2.5. Let f be continuous at a ∈ R. Let f (a) < B. Then there exists a neigh-
bourhood of a such that f (x) < B for all the points x from the neighbourhood.
Boundedness of functions
Theorem 4.2.7. If f is continuous at a then there exists δ > 0 such that f is bounded on
the interval (a − δ, a + δ).
So on the interval (a − δ, a + δ)
Theorem (the Intermediate Value Theorem). Let f be continuous on [a, b] with f(a) < f(b), and let
f(a) ≤ η ≤ f(b). Then there exists ξ ∈ [a, b] such that f(ξ) = η.
Proof. If η = f(a) or η = f(b) then there is nothing to prove. Fix η ∈ (f(a), f(b)). Let us
introduce the set
A := {x ∈ [a, b] | f(x) < η}.
The set A is not empty since a ∈ A. The set A is bounded above (by b). Therefore there
exists ξ := sup A.
Our aim now is to prove that f (ξ) = η. We will do that by ruling out two other possibilities:
f (ξ) < η and f (ξ) > η.
First, let us assume that f(ξ) < η. Then by Theorem 4.2.5 we have that
(∃δ > 0)(∀x ∈ D(f)) [ (x ∈ (ξ − δ, ξ + δ)) ⇒ (f(x) < η) ].
Therefore
(∃x1 ∈ R) [ (x1 ∈ (ξ, ξ + δ)) ∧ (f(x1) < η) ].
In other words,
(∃x1 ∈ R) [ (x1 > ξ) ∧ (x1 ∈ A) ],
which contradicts the fact that ξ = sup A.
Next, let us assume that f(ξ) > η. Then by the remark after Theorem 4.2.5 we have that
(∃δ > 0) [ D(f) ∩ (ξ − δ, ξ + δ) ⊂ A^c ]   (A^c is the complement of A).
Note that
1. A 6= ∅ since a ∈ A.
2. A is bounded (by b).
Therefore f is bounded on [a, x2] for some x2 > ξ, which contradicts the fact that ξ is the
supremum of A. This proves that ξ = b.
(Note that this does not complete the proof, since a supremum need not belong to the set, i.e.
it is possible that ξ ∉ A.)
From the continuity of f at b it follows that
(∃δ1 > 0) [ f is bounded on (b − δ1, b] ].
The last theorem asserts that the range of a continuous function restricted to a closed
interval is a bounded subset of R, so that its supremum and infimum exist. The next
theorem asserts that these values are attained, which means that there are points in [a, b]
at which the values of the function equal the supremum (infimum).
We know that F 6= ∅ and by Theorem 4.3.2 F is bounded. Therefore its supremum exists.
Denote α := sup F .
Our aim is to prove that there exists y ∈ [a, b] such that f (y) = α. We prove this by
contradiction.
Suppose that
(∀x ∈ [a, b]) (f(x) < α).
Define the following function:
g(x) = 1/(α − f(x)), x ∈ [a, b].
Since the denominator is never zero on [a, b] we conclude that g is continuous and by Theorem
4.3.2 is bounded on [a, b].
At the same time, by the definition of the supremum,
(∀ε > 0)(∃x ∈ [a, b]) (f(x) > α − ε),
in other words α − f(x) < ε, and so g(x) > 1/ε. This proves that
(∀ε > 0)(∃x ∈ [a, b]) (g(x) > 1/ε).
Therefore g is unbounded on [a, b], which is a contradiction.
Theorem 4.4.1. Let f be defined and continuous on a closed interval [a, b]. Then f is
uniformly continuous on [a, b].
The next example shows that the property of uniform continuity is stronger than the
property of continuity.
Example 4.4.2. The function f(x) = 1/x is continuous on (0, 1) but not uniformly continuous.
Indeed, take x_n = 1/n, y_n = 1/(n + 1). Then |x_n − y_n| → 0 as n → ∞, while
|f(x_n) − f(y_n)| = (n + 1) − n = 1 for all n.
Example 4.4.3. The function f(x) = √x is uniformly continuous on [1, ∞).
Indeed,
|f(x1) − f(x2)| = |√x1 − √x2| = |x1 − x2| / (√x1 + √x2) ≤ |x1 − x2|.
It is not difficult to show (this point is delegated to Exercises) that a continuous bijection
defined on an interval [a, b] is monotone (either increasing or decreasing), the notion which
was discussed for sequences and the intuitive meaning of which is clear. The precise definition
reads as follows.
Theorem 4.5.1. Let f be continuous on [a, b] and increasing. Let f (a) = c, f (b) = d. Then
there exists a function g : [c, d] → [a, b] which is continuous, increasing and such that
(∀y ∈ [c, d]) [ f(g(y)) = y ].
Proof. Let η ∈ [c, d]. By the Intermediate Value Theorem there exists ξ ∈ [a, b] such that f(ξ) = η.
There is only one such value ξ since f is increasing. (Why?) The inverse function g is defined
by
ξ = g(η).
It is easy to see that g is increasing.
• Indeed, let y1 < y2 , y1 = f (x1 ), y2 = f (x2 ). Suppose that at the same time x2 ≤ x1 . Since
f is increasing it follows that y2 ≤ y1 . Contradiction.•
Differential Calculus
We say that f is differentiable at a if the limit lim_{h→0} (f(a + h) − f(a))/h exists. This limit, denoted
by f′(a), is called the derivative of f at a.
Example. If f(x) = c is a constant function, then f′(a) = 0 for every a ∈ R.
Proof.
lim_{h→0} (f(a + h) − f(a))/h = lim_{h→0} (c − c)/h = 0.
Example. If f(x) = x, then f′(a) = 1 for every a ∈ R.
Proof.
lim_{h→0} (f(a + h) − f(a))/h = lim_{h→0} (a + h − a)/h = 1.
Proof.
The proof which can be performed by induction, based on the product rule below, is left
as an exercise.
Since in the above example the function f is continuous at 0, it shows that continuity
does not imply differentiability.
Proof.
f′(a) = lim_{h→0} (f(a + h) − f(a))/h.
Hence we have
lim_{x→a} (f(x) − f(a)) = lim_{h→0} (f(a + h) − f(a))
= lim_{h→0} (f(a + h) − f(a))/h · h = f′(a) · lim_{h→0} h = 0.
Therefore
lim_{x→a} f(x) = f(a).
Indeed, define
α(x) := (f(x) − f(a))/(x − a) − f′(a).
Then α(x) → 0 as x → a and f(x) = f(a) + f′(a)(x − a) + α(x)(x − a).
Theorem 5.1.8. (The product rule) If f and g are differentiable at a then f · g is also
differentiable at a, and
(f · g)′(a) = f′(a) · g(a) + f(a) · g′(a).
Proof.
Theorem 5.1.9. If g is differentiable at a and g(a) ≠ 0 then φ = 1/g is also differentiable at
a, and
φ′(a) = (1/g)′(a) = − g′(a) / [g(a)]^2.
Theorem 5.1.10. (The quotient rule) If f and g are differentiable at a and g(a) ≠ 0
then φ = f/g is also differentiable at a, and
φ′(a) = (f/g)′(a) = ( f′(a) · g(a) − f(a) · g′(a) ) / [g(a)]^2.
In the next theorem we establish the relation between the derivative of an invertible function
and the derivative of the inverse function.
Theorem 5.1.13. Let f be continuous and increasing on (a, b) and given by y = f(x). Suppose
that for some x0 ∈ (a, b), f is differentiable at x0 and f′(x0) ≠ 0. Then the inverse function
g = f^{−1}, given by x = g(y), is differentiable at y0 = f(x0) and
g′(y0) = 1 / f′(x0).
Proof. Remark 8 implies that
y − y0 = f(g(y)) − f(g(y0))
= f′(g(y0))(g(y) − g(y0)) + α(g(y))(g(y) − g(y0)),
where α(g(y)) → 0 as g(y) → g(y0). Since g is continuous at y0 it follows that g(y) → g(y0)
as y → y0, and hence α(g(y)) → 0 as y → y0. Therefore we have
(g(y) − g(y0)) / (y − y0) = (g(y) − g(y0)) / [ f′(g(y0))(g(y) − g(y0)) + α(g(y))(g(y) − g(y0)) ]
= 1 / ( f′(g(y0)) + α(g(y)) ) → 1 / f′(g(y0))   as y → y0.
One-sided derivatives
f′_−(a) = lim_{h→0−} (f(a + h) − f(a))/h,    f′_+(a) = lim_{h→0+} (f(a + h) − f(a))/h.
Theorem 5.2.1. Let f be defined on (a, b) and attain a maximum (or minimum) value at a point
x0 ∈ (a, b). If f is differentiable at x0, then f′(x0) = 0.
Note that we do not assume differentiability, or even continuity, of f at any other point.
Proof. We prove the theorem for the case of a maximum; the case of a minimum is similar.
If h < 0 is such that x0 + h ∈ (a, b), then f(x0 + h) − f(x0) ≤ 0 and h < 0, so
(f(x0 + h) − f(x0))/h ≥ 0,
and therefore
f′_−(x0) = lim_{h→0−} (f(x0 + h) − f(x0))/h ≥ 0.
Similarly, if h > 0 is such that x0 + h ∈ (a, b), then (f(x0 + h) − f(x0))/h ≤ 0, so that f′_+(x0) ≤ 0.
By hypothesis f′(x0) exists, so that f′_−(x0) = f′_+(x0) = f′(x0). So from the above f′(x0) = 0.
The converse statement is not true. Consider f : R → R, f(x) = x^3.
We see that f′(0) = 0; however, 0 is not a point of maximum or minimum on any interval.
Theorem 5.2.2. (Rolle's Theorem) If f is continuous on [a, b], differentiable on (a, b), and
f(a) = f(b), then there exists x0 ∈ (a, b) such that f′(x0) = 0.
Proof. It follows from the continuity of f that f has a maximum and a minimum value on
[a, b].
Suppose that the maximum value occurs at x0 ∈ (a, b). Then by Theorem 5.2.1 f′(x0) = 0,
and we are done.
Suppose next that the minimum value occurs at x0 ∈ (a, b). Then again by Theorem 5.2.1
f′(x0) = 0.
Finally, suppose that the maximum value and the minimum value both occur at the end
points. Since f(a) = f(b), the maximum value and the minimum value are equal, so that f
is a constant. Hence f′(x) = 0 for all x ∈ (a, b).
Theorem 5.2.3. (The Mean Value Theorem) If f is continuous on [a, b] and differentiable
on (a, b), then
(∃x0 ∈ (a, b)) [ f′(x0) = (f(b) − f(a))/(b − a) ].
Proof. Let
g(x) = f(x) − [ (f(b) − f(a))/(b − a) ] (x − a).
Then g is continuous on [a, b] and differentiable on (a, b), and
g′(x) = f′(x) − (f(b) − f(a))/(b − a).
Moreover, g(a) = f(a) and
g(b) = f(b) − [ (f(b) − f(a))/(b − a) ] (b − a) = f(a).
By Rolle's theorem there exists x0 ∈ (a, b) such that g′(x0) = 0, i.e. f′(x0) = (f(b) − f(a))/(b − a).
Corollary 5.2.1. If f is defined on an interval and f′(x) = 0 for all x in the interval, then
f is a constant on the interval.
Proof. Let a and b be any two points in the interval with a ≠ b. Then by the Mean Value
Theorem there is a point x in (a, b) such that
f′(x) = (f(b) − f(a))/(b − a).
But f′(x) = 0 for all x in the interval, so that
0 = (f(b) − f(a))/(b − a),
and consequently f(b) = f(a). Thus the value of f at any two points is the same. Therefore
f is a constant on the interval.
Corollary 5.2.2. If f and g are defined on the same interval and f′(x) = g′(x) for all x in
the interval, then there is some number c ∈ R such that f = g + c on the interval.
The proof is left as an exercise.
Corollary 5.2.3. If f'(x) > 0 for all x in an interval, then f is increasing on the interval; if f'(x) < 0 for all x in an interval, then f is decreasing on the interval.
Proof. Consider the case f'(x) > 0. Let a and b be any two points in the interval with a < b. Then by the Mean Value Theorem there is a point x in (a, b) such that
\[ f'(x) = \frac{f(b) - f(a)}{b - a}. \]
But f'(x) > 0 for all x in the interval, so that
\[ \frac{f(b) - f(a)}{b - a} > 0. \]
Since b − a > 0, it follows that f(b) > f(a), which proves that f is increasing on the interval. The case f'(x) < 0 is left as an exercise.
The next theorem is a generalisation of the Mean Value Theorem. It is of interest because
of its applications.
Theorem 5.2.4. (The Cauchy Mean Value Theorem). If f and g are continuous on [a, b] and differentiable on (a, b), then
\[ (\exists x_0 \in (a, b)) \bigl[\, [f(b) - f(a)]\,g'(x_0) = [g(b) - g(a)]\,f'(x_0) \,\bigr]. \]
(If g(b) ≠ g(a) and g'(x0) ≠ 0, the above equality can be rewritten as
\[ \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(x_0)}{g'(x_0)}. \; ) \]
Proof. Define h(x) = [f(b) − f(a)]g(x) − [g(b) − g(a)]f(x). Then h(a) = f(b)g(a) − f(a)g(b) = h(b), so that h satisfies the hypotheses of Rolle's theorem. Therefore
\[ (\exists x_0 \in (a, b)) \bigl[\, 0 = h'(x_0) = [f(b) - f(a)]\,g'(x_0) - [g(b) - g(a)]\,f'(x_0) \,\bigr]. \]
Proof. Set
\[ R_n = f(a + h) - f(a) - h f'(a) - \frac{h^2}{2!} f''(a) - \dots - \frac{h^{n-1}}{(n-1)!} f^{(n-1)}(a). \]
Define g : [a, a + h] → R by
\[ g(x) = f(a + h) - f(x) - (a + h - x) f'(x) - \frac{(a + h - x)^2}{2!} f''(x) - \dots - \frac{(a + h - x)^{n-1}}{(n-1)!} f^{(n-1)}(x) - \frac{(a + h - x)^p R_n}{h^p}. \]
Then g(a) = g(a + h) = 0, so by Rolle's theorem there exists t ∈ (0, 1) such that
\[ g'(a + t h) = 0, \]
all other terms in the differentiation cancelling in pairs (it is advisable to check this). From this we find that
\[ R_n = \frac{h^n (1 - t)^{n-p}}{p\,(n-1)!}\, f^{(n)}(a + t h), \]
which proves the theorem.
\[ (5.3.2)\qquad f(a + h) = f(a) + h f'(a) + \frac{h^2}{2!} f''(a) + \dots + \frac{h^{n-1}}{(n-1)!} f^{(n-1)}(a) + R_n, \]
where
(i) (Lagrange) \( R_n = \dfrac{h^n}{n!} f^{(n)}(a + t h) \) for some t ∈ (0, 1);
(ii) (Cauchy) \( R_n = \dfrac{(1 - s)^{n-1} h^n}{(n-1)!} f^{(n)}(a + s h) \) for some s ∈ (0, 1).
\[ (1 + x)^{\alpha} = 1 + \frac{\alpha}{1!} x + \frac{\alpha(\alpha - 1)}{2!} x^2 + \dots + \frac{\alpha(\alpha - 1)\cdots(\alpha - n + 1)}{n!} x^n + R_{n+1}(x), \]
where the remainder term in the Lagrange form is
\[ R_{n+1}(x) = \frac{\alpha(\alpha - 1)\cdots(\alpha - n)}{(n + 1)!} (1 + t x)^{\alpha - n - 1} x^{n+1} \quad \text{for some } t \in (0, 1). \]
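An added numerical sketch (not from the notes): the partial sums of this binomial expansion can be compared directly with (1 + x)^α; the values of α, x and n below are illustrative.

# Added sketch: compare the degree-n binomial Taylor polynomial of (1 + x)**alpha
# about 0 with the exact value, for an illustrative choice of alpha, x, n.
alpha, x, n = 0.5, 0.3, 8
term, poly = 1.0, 1.0
for k in range(1, n + 1):
    term *= (alpha - k + 1) * x / k      # alpha(alpha-1)...(alpha-k+1) x^k / k!
    poly += term
print(poly, (1 + x)**alpha)              # agree to roughly 1e-7 here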
Series
In this chapter we continue to study series. We already studied series of positive terms. (It
is advisable to revise Section 3.6.) In this chapter we will discuss convergence and divergence
of series which have infinitely many positive and infinitely many negative terms.
(ii) \( \lim_{n\to\infty} a_n = 0 \).
converges.
u_n − t_n = a_{2n} ≥ 0, i.e. u_n ≥ t_n.
Thus the sequence (u_n)_n is decreasing and bounded below by t_1. Therefore it converges to a limit u = lim_{n→∞} u_n. Then the sequence (t_n)_n is increasing and bounded above by u_1. Therefore it converges to a limit t = lim_{n→∞} t_n. Moreover u − t = lim_{n→∞}(u_n − t_n) = lim_{n→∞} a_{2n} = 0. Denote t = u = s. So we have
Therefore s_n → s as n → ∞.
By the d’Alembert test if x < 1 then the series converges, if x > 1 then it diverges. If x = 1
we obtain the harmonic series which is divergent.
Now let x < 0. Denote y = −x > 0. Then we have an alternating series
\[ \sum_{n=1}^{\infty} \frac{(-1)^n y^n}{n}. \]
If y ≤ 1 then \( \frac{y^n}{n} > \frac{y^{n+1}}{n+1} \) and \( \lim_{n\to\infty} \frac{y^n}{n} = 0 \), so that the series converges by Theorem 6.1.1. If y > 1 then \( \lim_{n\to\infty} \frac{y^n}{n} \ne 0 \), and the series diverges.
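An added numerical sketch (not in the notes): partial sums of Σ xⁿ/n for a few sample values of x; for |x| < 1 the sum is −log(1 − x), and at x = −1 one gets the convergent alternating harmonic series.

import math
# Added sketch: partial sums of sum_{n>=1} x**n / n for sample values of x.
def partial_sum(x, N):
    return sum(x**n / n for n in range(1, N + 1))

for x in (0.5, -0.5, -1.0):
    print(x, partial_sum(x, 10_000), -math.log(1 - x))   # partial sum vs -log(1-x)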
Proof. Let \( \sum_{n=1}^{\infty} a_n \) be an absolutely convergent series. Define
\[ b_n = \begin{cases} a_n & \text{if } a_n \ge 0, \\ 0 & \text{if } a_n < 0, \end{cases} \qquad c_n = \begin{cases} 0 & \text{if } a_n \ge 0, \\ -a_n & \text{if } a_n < 0. \end{cases} \]
Then (∀n ∈ N)(b_n ≥ 0) ∧ (c_n ≥ 0). Note that
\[ a_n = b_n - c_n, \qquad |a_n| = b_n + c_n. \]
Since (∀n ∈ N)(b_n ≤ |a_n|) ∧ (c_n ≤ |a_n|), by the comparison test we conclude that the series
\[ \sum_{n=1}^{\infty} b_n \quad\text{and}\quad \sum_{n=1}^{\infty} c_n \]
converge, and therefore so does the series \( \sum_{n=1}^{\infty} (b_n - c_n) \).
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots \]
Let s_n be the nth partial sum, i.e.
\[ s_n = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots + \frac{(-1)^{n-1}}{n}. \]
Let us now rearrange the series in the following way:
\[ 1 - \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{6} - \frac{1}{8} + \frac{1}{5} - \frac{1}{10} - \frac{1}{12} + \dots \]
Let t_n be the nth partial sum of the new series. Then
\[ t_{3n} = \Bigl(1 - \frac{1}{2} - \frac{1}{4}\Bigr) + \Bigl(\frac{1}{3} - \frac{1}{6} - \frac{1}{8}\Bigr) + \dots + \Bigl(\frac{1}{2n-1} - \frac{1}{4n-2} - \frac{1}{4n}\Bigr) \]
\[ = \frac{1}{2}\Bigl(1 - \frac{1}{2}\Bigr) + \frac{1}{2}\Bigl(\frac{1}{3} - \frac{1}{4}\Bigr) + \dots + \frac{1}{2}\Bigl(\frac{1}{2n-1} - \frac{1}{2n}\Bigr) = \frac{1}{2}\, s_{2n}. \]
Hence
\[ t_n \to \frac{1}{2}\, s \quad \text{as } n \to \infty. \]
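An added numerical sketch (not in the notes): comparing partial sums of the two series above shows the two different limits, log 2 for the original series and (1/2) log 2 for the rearrangement.

import math
# Added sketch: s_{2N} for 1 - 1/2 + 1/3 - ...  versus  t_{3N} for the
# rearrangement 1 - 1/2 - 1/4 + 1/3 - 1/6 - 1/8 + ...
N = 2000
s = sum((-1)**(k - 1) / k for k in range(1, 2 * N + 1))                  # s_{2N}
t = sum(1/(2*k - 1) - 1/(4*k - 2) - 1/(4*k) for k in range(1, N + 1))    # t_{3N}
print(s, math.log(2))        # close to log 2
print(t, math.log(2) / 2)    # close to (1/2) log 2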
Remark 10. The series (6.1.7) is conditionally convergent. We see that by rearranging this
series we change its sum. In fact, one can prove that any conditionally convergent series can
be rearranged in such a way that the sum of the rearranged series is any real number chosen
in advance.
In the next theorem we prove that rearranging an absolutely convergent series does not
alter its sum.
(We say that the series \( \sum_{n=1}^{\infty} b_n \) is a rearrangement of the series \( \sum_{n=1}^{\infty} a_n \) if there is a bijection f on the set of natural numbers N such that b_n = a_{f(n)} for all n ∈ N.)
Theorem 6.1.8. Let \( \sum_{n=1}^{\infty} a_n \) be an absolutely convergent series with sum s. Then any rearrangement of \( \sum_{n=1}^{\infty} a_n \) is also convergent with sum s.
Proof. First let us consider the case in which a_n ≥ 0 for all n. Let b_1 + b_2 + b_3 + . . . be a rearrangement of a_1 + a_2 + a_3 + . . . . Denote by s_n and t_n the corresponding partial sums, i.e.
\[ s_n = a_1 + a_2 + a_3 + \dots + a_n, \qquad t_n = b_1 + b_2 + b_3 + \dots + b_n. \]
The sequences (s_n)_n and (t_n)_n are increasing and s_n → s as n → ∞. So (∀n ∈ N)(s_n ≤ s). Also
\[ (\forall n \in \mathbb{N})(\exists k \in \mathbb{N}) \bigl( b_1 + b_2 + b_3 + \dots + b_n \le s_k \bigr). \]
(Each of the numbers b_1, b_2, . . . , b_n is one of the a's.) Therefore (∀n ∈ N)(t_n ≤ s). Hence the
sequence (tn )n is bounded above by s. Thus it converges, and the limit t ≤ s. By the same
argument considering a1 + a2 + a3 + . . . as a rearrangement of b1 + b2 + b3 + . . . we obtain
that s ≤ t. So we conclude that t = s.
The general case is treated similarly to the proof of Theorem 6.1.4. Define
\[ d_n = \begin{cases} a_n & \text{if } a_n \ge 0, \\ 0 & \text{if } a_n < 0, \end{cases} \qquad c_n = \begin{cases} 0 & \text{if } a_n \ge 0, \\ -a_n & \text{if } a_n < 0. \end{cases} \]
We have
\[ (\forall n \in \mathbb{N}) \bigl[ 0 \le d_n \le |a_n| \bigr] \wedge \bigl[ 0 \le c_n \le |a_n| \bigr]. \]
x1 + x2 + · · · = d1 + d2 + . . .
Similarly
y1 + y2 + · · · = c1 + c2 + . . .
But bn = xn − yn for all n. Therefore
b_1 + b_2 + \dots = (x_1 + x_2 + \dots) - (y_1 + y_2 + \dots) = (d_1 + d_2 + \dots) - (c_1 + c_2 + \dots) = u - v = s.
On the right-hand side of (6.1.1) there is the product of the mth partial sums of the series \( \sum_{n=1}^{\infty} |a_n| \) and \( \sum_{n=1}^{\infty} |b_n| \). Therefore the sequence (S_n)_n is bounded above, which proves the absolute convergence of the series in question.
It remains to prove that the sum of the series is st. Let S denote the sum of the series.
By Theorem 6.1.8 it does not depend on the order of terms. So we can write S = limn→∞ Sn .
Now notice that
\[ S_{n^2} = (a_1 + a_2 + \dots + a_n)(b_1 + b_2 + \dots + b_n), \]
which implies that
\[ \lim_{n\to\infty} S_{n^2} = s\,t. \]
\[ e(x)\,e(y) = \Bigl(1 + \frac{x}{1!} + \frac{x^2}{2!} + \dots + \frac{x^n}{n!} + \dots\Bigr)\Bigl(1 + \frac{y}{1!} + \frac{y^2}{2!} + \dots + \frac{y^n}{n!} + \dots\Bigr) \]
\[ = 1 + x + y + \frac{x^2}{2} + x y + \frac{y^2}{2} + \dots = 1 + \frac{x + y}{1!} + \frac{(x + y)^2}{2!} + \dots + \frac{(x + y)^n}{n!} + \dots = e(x + y), \]
where we used the observation that the terms of degree n in x and y are
\[ \frac{x^n}{n!} + \dots + \frac{x^k y^{n-k}}{k!\,(n - k)!} + \dots + \frac{y^n}{n!} = \frac{(x + y)^n}{n!}. \]
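An added numerical check (not in the notes): multiplying truncated exponential series reproduces the identity e(x)e(y) = e(x + y); the values of x, y and the truncation degree are illustrative.

import math
# Added sketch: check e(x)e(y) = e(x+y) using series truncated at degree N.
def e_series(x, N=30):
    return sum(x**n / math.factorial(n) for n in range(N + 1))

x, y = 0.7, -1.3
print(e_series(x) * e_series(y))          # product of the truncated series
print(e_series(x + y), math.exp(x + y))   # both approximately e^(x+y)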
\[ 1 + \frac{x}{1!} + \frac{x^2}{2!} + \dots + \frac{x^n}{n!} + \dots \]
converges for all x ∈ R (see Example 6.1.10).
Example 6.2.2. The series
\[ \sum_{n=1}^{\infty} n^n x^n \]
converges only for x = 0. Indeed, if x ≠ 0 then \( \lim_{n\to\infty} n^n x^n \ne 0 \).
Example 6.2.3. The geometric series
\[ 1 + x + x^2 + \dots + x^n + \dots = \sum_{n=0}^{\infty} x^n \]
converges (absolutely) for |x| < 1 and diverges for |x| ≥ 1.
Theorem 6.2.5. Let r ∈ R be such that the series \( \sum_{n=0}^{\infty} a_n r^n \) converges. Then the power series \( \sum_{n=0}^{\infty} a_n x^n \) is absolutely convergent for all x ∈ (−|r|, |r|) (i.e. |x| < |r|).
Proof. Since \( \sum_{n=0}^{\infty} a_n r^n \) converges, its terms tend to zero; in particular the sequence (a_n r^n) is bounded, so there exists K such that |a_n r^n| ≤ K for all n.
Let x ∈ R be such that |x| < |r|, and set y = |x|/|r|. Note that 0 ≤ y < 1. We have
\[ |a_n x^n| = |a_n| \cdot |x|^n = |a_n| \cdot |r|^n y^n = |a_n r^n| \cdot y^n \le K y^n. \]
Therefore the series \( \sum_{n=0}^{\infty} |a_n x^n| \) converges by comparison with the convergent geometric series K(1 + y + y² + · · · + yⁿ + · · · ).
Theorem 6.2.6. Let \( \sum_{n=0}^{\infty} a_n x^n \) be a power series. Then one of the following possibilities occurs:
(i) The series converges only for x = 0.
(ii) The series is absolutely convergent for all x ∈ R.
(iii) There exists r > 0 such that the series is absolutely convergent for all x ∈ R such that |x| < r and is divergent for all x ∈ R such that |x| > r.
Proof. Let us denote by E the set of all non-negative numbers x such that the series \( \sum_{n=0}^{\infty} a_n x^n \) converges. Clearly 0 ∈ E. If E = {0} then (i) is true.
If (∃x ∈ R+)(x ∈ E) ∧ (x ≠ 0), then the series converges for all y ∈ R such that 0 ≤ y ≤ x (by the previous theorem).
Suppose that E is not bounded above. Let y be any non-negative real number. Then y is not an upper bound for E, so that (∃x ∈ E)(y < x). By Theorem 6.2.5, y ∈ E. Thus E is the set of all non-negative real numbers, and absolute convergence also follows from Theorem 6.2.5; this is case (ii).
Finally, let us suppose that E is bounded above and contains at least one positive number. Then sup E exists. Set r = sup E. We know that r > 0. Let x ∈ (−r, r) and pick any y ∈ (|x|, r). Since y < r = sup E, y is not an upper bound for E, hence (∃z ∈ E)(y < z), and by Theorem 6.2.5, y ∈ E. Therefore, by Theorem 6.2.5 again, the series is absolutely convergent at x.
Now suppose that |x| > r. Then there exists u ∈ R such that r < u < |x| and u ∉ E. The series is not convergent at u, hence by Theorem 6.2.5 it is not convergent at x. This is case (iii).
The radius of convergence R of the power series
\[ a_0 + a_1 x + a_2 x^2 + \dots \]
is defined as follows: R = ∞ if the series converges for all x ∈ R; R = 0 if the series converges for x = 0 only; and R = r if (iii) in Theorem 6.2.6 holds.
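An added illustrative sketch (not from the notes): when the limit of |a_n / a_{n+1}| exists, the d'Alembert test identifies it as the radius of convergence; the coefficients below, a_n = n/3ⁿ with R = 3, are an arbitrary example.

# Added sketch: estimate the radius of convergence as lim |a_n / a_{n+1}|
# (valid when this limit exists), for the example coefficients a_n = n / 3**n.
def a(n):
    return n / 3.0**n

for n in (10, 100, 1000):
    print(n, a(n) / a(n + 1))   # the ratios 3n/(n+1) approach R = 3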
\[ a_0 + a_1 y + a_2 y^2 + \dots \quad\text{where } a_n = \frac{(-1)^n}{(2n)!}. \]
\[ \lim_{n\to\infty} \frac{|a_{n+1} y^{n+1}|}{|a_n y^n|} = \lim_{n\to\infty} \frac{y}{(2n + 1)(2n + 2)} = 0. \]
\[ 1 - \frac{x}{3} + \frac{x^2}{5} - \frac{x^3}{7} + \dots = \sum_{n=0}^{\infty} \frac{(-1)^n x^n}{2n + 1} \]
is convergent.
For absolute convergence we use d’Alembert’s test.
Elementary functions
\[ (7.1.1)\qquad \exp x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \dots + \frac{x^n}{n!} + \dots \]
The series on the right is convergent for all x ∈ R (see Example 6.2.1).
Theorem 7.1.1. For all x, y ∈ R, exp(x + y) = exp x · exp y.
Theorem 7.1.2. The exponential function f(x) = exp x is everywhere differentiable and (exp x)′ = exp x.
We will also use the notation e^x = exp x, for a reason which will become apparent a little later.
\[ \frac{e^{a+h} - e^a}{h} = e^a\,\frac{e^h - 1}{h}. \]
Now
\[ \frac{e^h - 1}{h} = 1 + \frac{h}{2!} + \frac{h^2}{3!} + \dots = 1 + g(h). \]
e := exp 1 = 2.718281828459045 . . .
Note that from the definition and from Theorem 7.1.1 it follows that
(∀n ∈ N) (exp n = e^n).
e^x = exp x.
Remark 11. It is useful to redefine the function ex (after we have studied the properties
above) as a mapping from R to (0, ∞).
So, by the exponential function we mean the mapping f : R → (0, ∞) defined by f (x) = ex .
The above properties show in particular that this function is a bijection.
(ii) (log x)′ = 1/x;
(iii) \( \lim_{x\to\infty} \log x = \infty \);
Proof. The proofs are straightforward from the properties of exp. Let us show (i).
\[ f'(x) = \frac{1}{g'(y)} = \frac{1}{\exp y} = \frac{1}{x}. \]
\[ \log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots + (-1)^{n-1}\frac{x^n}{n} + \dots \]
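An added numerical sketch (not in the notes): for |x| < 1 the partial sums of this series can be compared with the library logarithm; the value of x and the number of terms below are illustrative.

import math
# Added sketch: partial sum of log(1 + x) = x - x^2/2 + x^3/3 - ...  for |x| < 1.
x, N = 0.4, 60
approx = sum((-1)**(n - 1) * x**n / n for n in range(1, N + 1))
print(approx, math.log(1 + x))   # the two values agree to high accuracy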
Now we are able to define arbitrary real powers of positive numbers (until now it was only clear how to define rational powers).
Definition 7.2.2. Let a > 0. Then for any x ∈ R we set
\[ a^x := e^{x \log a}. \]
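A one-line added check (illustrative only, not from the notes): for a positive base and a rational exponent the definition agrees with the usual power operation.

import math
# Added sketch: a**x computed as exp(x * log a) matches Python's power operator.
a, x = 5.0, 2.5
print(math.exp(x * math.log(a)), a**x)   # both approximately 55.9017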
\[ (7.3.2)\qquad \sin x := x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots \]
\[ (7.3.3)\qquad \cos x := 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots \]
The series are absolutely convergent for all x ∈ R (see Example 6.2.8 and Example 6.2.7).
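An added numerical sketch (not in the notes): truncations of (7.3.2) and (7.3.3) can be compared with the library functions; the truncation length and the point x are illustrative choices.

import math
# Added sketch: the series definitions of sin and cos, truncated after 20 terms.
def sin_series(x, N=20):
    return sum((-1)**k * x**(2*k + 1) / math.factorial(2*k + 1) for k in range(N))

def cos_series(x, N=20):
    return sum((-1)**k * x**(2*k) / math.factorial(2*k) for k in range(N))

x = 1.2
print(sin_series(x), math.sin(x))
print(cos_series(x), math.cos(x))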
The next theorem can be proved using the above definitions by multiplying and adding
the corresponding series. We skip the details of the proof. An easier proof would be to use
the exponential function with complex variables. This will be done in the course “Further
topics in Analysis”.
Theorem 7.3.1. For all x, y ∈ R,
sin(x + y) = sin x cos y + cos x sin y,  cos(x + y) = cos x cos y − sin x sin y.
Theorem 7.3.2. The functions sin : R → R and cos : R → R are everywhere differentiable
and
(sin x)′ = cos x,  (cos x)′ = − sin x.
The proof is similar to that of Theorem 7.1.2. We do not give the details.
cos($/2) = 0 and sin($/2) = 1.
Proof. If 0 < x < 2 then
\[ \sin x = \Bigl(x - \frac{x^3}{3!}\Bigr) + \Bigl(\frac{x^5}{5!} - \frac{x^7}{7!}\Bigr) + \dots > 0 \]
The Riemann Integral
Denote
\[ m := \inf_{x\in[a,b]} f(x), \qquad M := \sup_{x\in[a,b]} f(x). \]
The following inequality is obviously true:
\[ (\forall i \in \{1, 2, \dots, n\}) \bigl[ m \le m_i \le f(\xi_i) \le M_i \le M \bigr]. \]
Therefore we have
\[ m(b - a) = \sum_{i=1}^{n} m(x_i - x_{i-1}) \le \sum_{i=1}^{n} m_i(x_i - x_{i-1}) \le \sum_{i=1}^{n} f(\xi_i)(x_i - x_{i-1}) \le \sum_{i=1}^{n} M_i(x_i - x_{i-1}) \le \sum_{i=1}^{n} M(x_i - x_{i-1}) = M(b - a), \]
which implies that for every partition and for every choice {ξ_i ∈ [x_{i-1}, x_i] | (x_i)_{i=0}^{n} ∈ P} the following inequality holds
Proof. First let us consider a particular case. Let P′ be a partition formed from P by adding one extra point, say c ∈ [x_{k-1}, x_k]. Let
\[ m'_k = \inf_{x\in[x_{k-1},\,c]} f(x), \qquad m''_k = \inf_{x\in[c,\,x_k]} f(x). \]
Hence
J = j = C(b − a).
Example 8.1.3. The Dirichlet function D : [0, 1] → R is defined as D(x) = 1 if x is rational and D(x) = 0 if x is irrational. Then
\[ (\forall P) \bigl[ L(D, P) = 0 \wedge U(D, P) = 1 \bigr], \]
so that j = 0 < 1 = J and D is not integrable.
Proof. 1) Necessity. Let J = j, i.e. let us assume that f is integrable. Fix ε > 0. Then
\[ (\exists P_1) \bigl[ L(f, P_1) > j - \varepsilon/2 \bigr]. \]
Also
\[ (\exists P_2) \bigl[ U(f, P_2) < J + \varepsilon/2 \bigr]. \]
Let Q = P1 ∪ P2. Then L(f, P1) ≤ L(f, Q) ≤ U(f, Q) ≤ U(f, P2). Therefore (since J = j)
\[ U(f, Q) - L(f, Q) < \varepsilon. \]
2) Sufficiency. Note that
\[ J - j \le U(f, P) - L(f, P) < \varepsilon. \]
Therefore it follows that
\[ (\forall \varepsilon > 0) \bigl( J - j < \varepsilon \bigr), \]
and hence J = j, i.e. f is integrable.
Proof. Without loss of generality assume that f is increasing, so that f(a) < f(b). Fix ε > 0. Let us consider a partition P of [a, b] such that
\[ \|P\| < \delta = \frac{\varepsilon}{f(b) - f(a)}. \]
Proof. Fix ε > 0. Since f is continuous on a closed interval, it is uniformly continuous (see Section 4.4). Therefore for ε/(b − a) there exists δ > 0 such that
\[ (\forall x_1, x_2 \in [a, b]) \Bigl[ (|x_1 - x_2| < \delta) \Rightarrow \Bigl( |f(x_1) - f(x_2)| < \frac{\varepsilon}{b - a} \Bigr) \Bigr]. \]
Hence for every partition P with norm ‖P‖ < δ we have
\[ U(f, P) - L(f, P) = \sum_{i=1}^{n} (M_i - m_i)(x_i - x_{i-1}) < \frac{\varepsilon}{b - a} \sum_{i=1}^{n} (x_i - x_{i-1}) = \varepsilon. \]
Proof. Suppose that f is integrable on [a, b]. Fix ε > 0. Then there exists a partition
P = {x0 , . . . , xn } of [a, b] such that
Therefore we have
Since each of the terms on the left hand side is non-negative, each one is less than ε, which
proves that f is integrable on [a, c] and on [c, b]. Note also that
\[ L(f, P_1) \le \int_a^c f(x)\,dx \le U(f, P_1), \qquad L(f, P_2) \le \int_c^b f(x)\,dx \le U(f, P_2), \]
so that
\[ L(f, P) \le \int_a^c f(x)\,dx + \int_c^b f(x)\,dx \le U(f, P). \]
This is true for any partition of [a, b]. Therefore
\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]
Now suppose that f is integrable on [a, c] and on [c, b]. Fix ε > 0. Then there exists a
partition P1 of [a, c] such that
Let P = P1 ∪ P2 . Then
The integral \( \int_a^b f(x)\,dx \) was defined only for a < b. We add by definition that
\[ \int_a^a f(x)\,dx = 0 \quad\text{and}\quad \int_a^b f(x)\,dx = -\int_b^a f(x)\,dx \ \text{ if } a > b. \]
Theorem 8.4.2. Let f and g be integrable on [a, b]. Then f + g is also integrable on [a, b] and
\[ \int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx. \]
\[ m_i \ge m'_i + m''_i, \qquad M_i \le M'_i + M''_i. \]
Therefore we have
or otherwise
Fix ε > 0. Since f and g are integrable there are partitions P1 and P2 such that
U (f + g, P ) − L(f + g, P ) < ε.
Moreover,
\[ L(f, P) + L(g, P) \le L(f + g, P) \le \int_a^b [f(x) + g(x)]\,dx \le U(f + g, P) \le U(f, P) + U(g, P), \]
and
\[ L(f, P) + L(g, P) \le \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \le U(f, P) + U(g, P). \]
Theorem 8.4.3. Let f be integrable on [a, b]. Then, for any c ∈ R, cf is also integrable on [a, b] and
\[ \int_a^b c f(x)\,dx = c \int_a^b f(x)\,dx. \]
Proof. The proof is left as an exercise. Consider separately two cases: c ≥ 0 and c ≤ 0.
Then
\[ \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \]
Corollary 8.4.1. Let f be integrable on [a, b] and suppose there are M, m ∈ R such that m ≤ f(x) ≤ M for all x ∈ [a, b]. Then
\[ m(b - a) \le \int_a^b f(x)\,dx \le M(b - a). \]
Corollary 8.4.2. Let f be continuous on [a, b]. Then there exists θ ∈ [a, b] such that
\[ \int_a^b f(x)\,dx = f(\theta)(b - a). \]
Proof. By Corollary 8.4.1,
\[ m(b - a) \le \int_a^b f(x)\,dx \le M(b - a), \]
where m = min_{[a,b]} f(x), M = max_{[a,b]} f(x). Then by the Intermediate Value Theorem we conclude that there exists θ ∈ [a, b] such that
\[ f(\theta) = \frac{1}{b - a} \int_a^b f(x)\,dx. \]
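An added numerical sketch (not in the notes): for an illustrative continuous f the average value (1/(b − a)) ∫ₐᵇ f is attained at some θ, which can be exhibited explicitly.

import math
# Added sketch of Corollary 8.4.2 for f(x) = x**2 on [0, 1]: the average value
# of f is 1/3, and it is attained at theta = 1/sqrt(3).
f = lambda x: x**2
a, b, N = 0.0, 1.0, 100_000
dx = (b - a) / N
integral = sum(f(a + (i + 0.5) * dx) for i in range(N)) * dx   # midpoint rule
average = integral / (b - a)
print(average, f(1 / math.sqrt(3)))    # both approximately 1/3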
Theorem 8.4.5. Let f be integrable on [a, b]. Then |f| is integrable on [a, b] and
\[ \Bigl| \int_a^b f(x)\,dx \Bigr| \le \int_a^b |f(x)|\,dx. \]
Indeed,
\[ (\forall x, y \in [\alpha, \beta]) \Bigl( f(x) - f(y) \le \sup_{[\alpha,\beta]} f(x) - \inf_{[\alpha,\beta]} f(x) \Bigr), \quad\text{so that} \]
\[ (\forall x, y \in [\alpha, \beta]) \Bigl( |f(x)| - |f(y)| \le \sup_{[\alpha,\beta]} f(x) - \inf_{[\alpha,\beta]} f(x) \Bigr), \]
which proves the integrability of |f | by the criterion of integrability, Theorem 8.2.1. The last
assertion follows from Theorem 8.4.4.
Theorem 8.4.6.¹ Let f : [a, b] → R be integrable and (∀x ∈ [a, b])(m ≤ f(x) ≤ M). Let g : [m, M] → R be continuous. Then h : [a, b] → R defined by h(x) = g(f(x)) is integrable.
Proof. Fix ε > 0. Since g is uniformly continuous on [m, M], there exists δ > 0 such that δ < ε and
\[ (\forall t, s \in [m, M]) \bigl[ (|t - s| < \delta) \Rightarrow (|g(t) - g(s)| < \varepsilon) \bigr]. \]
Let m_i = \inf_{[x_{i-1},x_i]} f(x), M_i = \sup_{[x_{i-1},x_i]} f(x) and m*_i = \inf_{[x_{i-1},x_i]} h(x), M*_i = \sup_{[x_{i-1},x_i]} h(x).
Decompose the set {1, . . . , n} into two subsets: (i ∈ A) ⇔ (M_i − m_i < δ) and (i ∈ B) ⇔ (M_i − m_i ≥ δ).
For i ∈ A, by the choice of δ we have that M*_i − m*_i ≤ ε.
¹ This theorem is outside the syllabus and will not be included in the examination paper.
For i ∈ B we have that M*_i − m*_i ≤ 2K, where K = \sup_{t\in[m,M]} |g(t)|. By (8.4.3) we have
\[ \delta \sum_{i\in B} (x_i - x_{i-1}) \le \sum_{i\in B} (M_i - m_i)(x_i - x_{i-1}) < \delta^2, \]
so that \( \sum_{i\in B} (x_i - x_{i-1}) < \delta \). Therefore
\[ U(h, P) - L(h, P) = \sum_{i\in A} (M^*_i - m^*_i)(x_i - x_{i-1}) + \sum_{i\in B} (M^*_i - m^*_i)(x_i - x_{i-1}) < \varepsilon(b - a) + 2K\delta < \varepsilon[(b - a) + 2K], \]
which proves that h is integrable, by the criterion of integrability.
Proof. Since f + g and f − g are integrable on [a, b], (f + g)² and (f − g)² are integrable on [a, b] by the previous theorem. Therefore
\[ f g = \frac{1}{4}\bigl[ (f + g)^2 - (f - g)^2 \bigr] \]
is integrable on [a, b].
Proof. By the definition of integrability, f is bounded on [a, b]. Let M = \sup_{[a,b]} |f(x)|. Then for x, y ∈ [a, b] we have
\[ \bigl| F(x) - F(y) \bigr| = \Bigl| \int_y^x f(t)\,dt \Bigr| \le M |x - y|, \]
which proves that F is continuous on [a, b].
\[ F'(c) = f(c). \]
Hence we have
\[ \frac{F(c + h) - F(c)}{h} = f(\theta). \]
As h → 0, θ → c, and due to the continuity of f we conclude that lim_{h→0} f(θ) = f(c). The assertion follows. The case h < 0 is similar. The cases c = a and c = b are similar (in these cases one considers one-sided derivatives only).
Theorem 8.5.3. Let f be continuous on [a, b] and f = g′ for some function g defined on [a, b]. Then for x ∈ [a, b]
\[ \int_a^x f(t)\,dt = g(x) - g(a). \]
Proof. Let
\[ F(x) = \int_a^x f(t)\,dt. \]
By Theorem 8.5.2 the function F − g is differentiable on [a, b] and F′ − g′ = (F − g)′ = 0. Therefore by Corollary 5.2.2 there is a number c such that F = g + c. Since F(a) = 0, we obtain c = −g(a), and hence F(x) = g(x) − g(a).
Proof. Let P = {x_0, . . . , x_n} be a partition of [a, b]. By the Mean Value Theorem there exists a point t_i ∈ [x_{i-1}, x_i] such that
\[ g(x_i) - g(x_{i-1}) = g'(t_i)(x_i - x_{i-1}) = f(t_i)(x_i - x_{i-1}). \]
Let
\[ m_i = \inf_{[x_{i-1},x_i]} f(x), \qquad M_i = \sup_{[x_{i-1},x_i]} f(x). \]
Then
\[ m_i(x_i - x_{i-1}) \le f(t_i)(x_i - x_{i-1}) \le M_i(x_i - x_{i-1}), \quad\text{that is,}\quad m_i(x_i - x_{i-1}) \le g(x_i) - g(x_{i-1}) \le M_i(x_i - x_{i-1}). \]
Adding these inequalities for i = 1, . . . , n we obtain
\[ \sum_{i=1}^{n} m_i(x_i - x_{i-1}) \le g(b) - g(a) \le \sum_{i=1}^{n} M_i(x_i - x_{i-1}), \]
that is, L(f, P) ≤ g(b) − g(a) ≤ U(f, P) for every partition P.
The next theorem shows that the Riemann integral can be equivalently defined via the limit of the integral sums.
Theorem 8.6.1. Let f : [a, b] → R. Then f is Riemann integrable if and only if \( \lim_{\|P\|\to 0} \sigma(f, P, \xi) \) exists. In this case
\[ \lim_{\|P\|\to 0} \sigma(f, P, \xi) = \int_a^b f(x)\,dx. \]
Proof. First, assume that f is Riemann integrable. Then we know that f is bounded, so that there is a constant C such that |f(x)| ≤ C for all x ∈ [a, b]. Fix ε > 0. Then there exists a partition P_0 of [a, b] such that
\[ U(f, P_0) - L(f, P_0) < \varepsilon/2. \]
Let m be the number of points in the partition P_0. Choose \( \delta = \frac{\varepsilon}{8 m C} \). Then for any partition P_1 such that ‖P_1‖ < δ and P = P_0 ∪ P_1 we have
\[ U(f, P_1) = U(f, P) + \bigl( U(f, P_1) - U(f, P) \bigr) \le U(f, P_0) + \bigl( U(f, P_1) - U(f, P) \bigr) \le U(f, P_0) + 2 C \|P_1\| m < U(f, P_0) + \varepsilon/4. \]
Similarly,
L(f, P1 ) > L(f, P0 ) − ε/4.
Therefore we get
L(f, P0 ) − ε/4 < L(f, P1 ) ≤ U (f, P1 ) < U (f, P0 ) + ε/4.
Hence
U (f, P1 ) − L(f, P1 ) < ε,
which together with the inequalities
\[ L(f, P_1) \le \int_a^b f(x)\,dx \le U(f, P_1), \qquad L(f, P_1) \le \sigma(f, P_1, \xi) \le U(f, P_1) \]
leads to
\[ \Bigl| \int_a^b f(x)\,dx - \sigma(f, P_1, \xi) \Bigr| < \varepsilon. \]
Now suppose that limkP k→0 σ(f, P, ξ) = A. Fix ε > 0. Then there exists δ > 0 such that if kP k < δ
then
A − ε/2 < σ(f, P, ξ) < A + ε/2.
(The entire section is not included in the examination paper.)
Choose P as above. Varying (ξ_i), take the sup and the inf of σ(f, P, ξ) in the above inequality. We obtain
We use the same terminology for improper integrals as for series, that is, an improper
integral may converge or diverge.
Example 8.7.1.
\[ \int_1^{\infty} \frac{1}{x^2}\,dx = 1. \]
Indeed,
\[ \int_1^{A} \frac{1}{x^2}\,dx = 1 - \frac{1}{A} \to 1 \quad\text{as } A \to \infty. \]
Theorem 8.7.2. \( \int_1^{\infty} \frac{1}{x^k}\,dx \) converges if and only if k > 1.
Example 8.7.3.
\[ \int_0^1 \frac{dx}{\sqrt{x}} = \lim_{\delta\to 0+} \int_{\delta}^{1} \frac{dx}{\sqrt{x}} = \lim_{\delta\to 0+} \bigl(2 - 2\sqrt{\delta}\bigr) = 2. \]
The notion of the improper integral is useful for investigating the convergence of certain series. The following theorem is often called the integral test for convergence of series.
Theorem 8.7.4. Let f : [1, ∞) → R be positive and decreasing. Then the integral \( \int_1^{\infty} f(x)\,dx \) and the series \( \sum_{n=1}^{\infty} f(n) \) either both converge or both diverge.
Proof. Since f is monotone it is integrable on any finite interval (Theorem 8.3.1). For n − 1 ≤
x ≤ n we have
f (n) ≤ f (x) ≤ f (n − 1).
Integrating the above inequality from n − 1 to n we obtain
\[ f(n) \le \int_{n-1}^{n} f(x)\,dx \le f(n - 1). \]
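An added numerical illustration (not in the notes): summing these bounds for f(x) = 1/x² shows the partial sums of the series squeezed around the integral; the exponent 2 is an illustrative choice.

# Added sketch of the integral-test bounds for f(x) = 1/x**2:
#   sum_{n=2}^{N} f(n)  <=  integral_1^N f(x) dx  <=  sum_{n=1}^{N-1} f(n).
f = lambda x: 1.0 / x**2
N = 1000
s_lower = sum(f(n) for n in range(2, N + 1))   # sum_{n=2}^{N} f(n)
s_upper = sum(f(n) for n in range(1, N))       # sum_{n=1}^{N-1} f(n)
integral = 1.0 - 1.0 / N                       # exact value of the integral
print(s_lower, integral, s_upper)              # the middle number lies between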
converges if α > 1.
Indeed,
\[ \int_1^{\infty} \frac{dx}{x^{\alpha}} = \lim_{A\to\infty} \int_1^{A} \frac{dx}{x^{\alpha}} = \lim_{A\to\infty} \frac{1}{\alpha - 1}\Bigl(1 - \frac{1}{A^{\alpha - 1}}\Bigr) = \frac{1}{\alpha - 1} < \infty. \]
Example 8.7.6. The series
\[ \sum_{n=2}^{\infty} \frac{1}{n (\log n)^{\alpha}} \]
8.8 Constant π
In Chapter 7 we defined the trigonometric functions sin and cos by their series and proved that they are periodic with period 2$. Here we show that $ is the same constant as the π known from elementary geometry.
Consider a circle x2 + y 2 ≤ 1, which is centered at the origin with radius 1. It is known
that its area is π.
The area of the semi-circle can be obtained as
∫_{−1}^{1} √(1 − x²) dx = ∫_0^$ sin²θ dθ = (1/2) ∫_0^$ (1 − cos 2θ) dθ = $/2.
On the other hand this area equals π/2, and therefore $ = π.
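An added numerical sketch (not in the notes): approximating the area of the half-disc by a Riemann sum reproduces the value π/2 used in this comparison.

import math
# Added sketch: midpoint-rule approximation of ∫_{-1}^{1} sqrt(1 - x^2) dx,
# the area of the upper half of the unit disc; the exact value is pi/2.
N = 200_000
dx = 2.0 / N
area = sum(math.sqrt(1.0 - (-1.0 + (i + 0.5) * dx)**2) for i in range(N)) * dx
print(area, math.pi / 2)    # both approximately 1.5707963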