
Mathematics for Scientists and Engineers I

Calculus.

PH1110/EE1110

Dr Stephen West

Department of Physics,
Royal Holloway, University of London,
Egham, Surrey,
TW20 0EX.

Contents

1 Introduction.
2 Functions
  2.1 Definition of a Function
  2.2 Logarithms - a quick revision
  2.3 Trigonometric Functions - a quick revision and a little more
  2.4 Power Series
    2.4.1 Expanding Functions in Power Series - Taylor Series
    2.4.2 Approximation Errors in Taylor Series
3 Differential calculus
  3.1 Continuity of a function
  3.2 Differentiability and Differentiation
  3.3 Rules of Differentiation
    3.3.1 Product Function
    3.3.2 Composite Function or Function of a Function or Implicit Differentiation
    3.3.3 Quotient Rule
    3.3.4 Logarithmic Differentiation
    3.3.5 Leibniz's Theorem
  3.4 Special Points of a Function
    3.4.1 Singular Points
  3.5 Inverse Functions
    3.5.1 Inverse Trigonometric Functions
    3.5.2 Inverse Function Rule
  3.6 Partial Differentiation
4 A note on coordinate systems
  4.1 Circular Polar Coordinates
  4.2 Cylindrical Polars
  4.3 Spherical Polars
5 Integration
  5.1 Integration From First Principles
  5.2 Integration of an Arbitrary Continuous Function
    5.2.1 Properties of Definite Integrals
  5.3 Integration as the Inverse of Differentiation
  5.4 Integration by Inspection
  5.5 Integration by Substitution
    5.5.1 Indefinite Integrals
    5.5.2 Solving Differential Equations with Integration
    5.5.3 Definite Integrals
  5.6 Integration by Parts
  5.7 Differentiation of an Integral Containing a Parameter
  5.8 An Aside on Hyperbolic Functions
    5.8.1 Identities of Hyperbolic Functions
    5.8.2 Inverses of Hyperbolic Functions
    5.8.3 Calculus of Hyperbolic Functions
  5.9 Hyperbolic Functions in Integration by Substitution
6 Differential Equations
  6.1 Solving ODEs
  6.2 First order ODEs
    6.2.1 Separable first order ODEs
    6.2.2 Almost Separable first order ODEs
    6.2.3 Homogeneous first order ODEs
    6.2.4 Homogeneous apart from a constant
    6.2.5 Looks homogeneous but actually separable with a change of variable
    6.2.6 Exact equations
    6.2.7 Integrating Factor
    6.2.8 Bernoulli equation
7 Second Order Differential Equations
  7.1 Homogeneous second order ODEs with constant coefficients
  7.2 Finding Solutions to 2nd order linear ODEs
    7.2.1 Real Roots
    7.2.2 Complex Roots
    7.2.3 Equal roots and the method of reduction of order
  7.3 Non-homogeneous linear ODEs
    7.3.1 The method of undetermined coefficients
    7.3.2 Particular solution when the non-homogeneous term is a sum
    7.3.3 Use of complex exponentials in solutions
8 Applications of second order ODEs
  8.1 The driven simple harmonic oscillator
    8.1.1 Unforced Oscillations
9 Limits and the evaluation of Indeterminate Forms
  9.1 L'Hôpital's rule
10 Series
  10.1 Arithmetic Series
  10.2 Geometric Series
  10.3 Arithmetico-Geometric Series
  10.4 A last example of a series
  10.5 Convergent and Divergent Series
    10.5.1 Testing a Series for Convergence: The Preliminary Test
    10.5.2 Tests for Convergence of Series of Positive Terms: Absolute Convergence
    10.5.3 Alternating Series
11 Summary
A The Greek Alphabet
1 Introduction.

Welcome to PH1110, Mathematics for Scientists (calculus component). This course will provide you with some of the
building blocks in Mathematics that you will need throughout your Physics (or Electronic Engineering) degree. The notes
here and the notes that you will take as part of the lectures will be only a small part of your learning in Mathematics.
The real learning takes place on your own as you work through as many examples and problems as you can. The only way
to learn mathematics is to do mathematics.

This part of the course focuses on calculus, with the following topics:

• Review of basic functions (Logs, trigonometric functions etc.).

• Differentiation (including an introduction to partial differentiation) and integration (formal definitions and basic
techniques).

• Taylor series expansions and convergence.

• First and second order differential equations.

You are strongly encouraged to borrow the course book

• K F Riley, M P Hobson and S J Bence, Mathematical Methods for Physics and Engineering: A Comprehensive
Guide, 3rd Edition, CUP, 2006. ISBN: 9780521679718.

This book is available to read online via the Riley Textbook link.

Another useful book (that is also available in the library) that contains many good examples is

• M L Boas, Mathematical Methods in the Physical Sciences, J Wiley, 2006. (530.15.BOA).

The algebra content of this course is very high, and to accommodate this we will make use of the Greek alphabet. If you
are not familiar with this alphabet, there is a list in Appendix A.

This is the first version of these notes; if you spot any typos or more serious errors, please let me know by emailing
[email protected].

2 Functions

2.1 Definition of a Function

A function is a rule that relates an input to an output. The input to a function is called the argument and the output is
called the value.

Formally speaking, a function takes elements from a set (the domain) and relates them to elements in another set (the
codomain). All the outputs (the values related to) are together called the range of the function.

A function has special rules:

6
• It must work for every possible input value.

• And it has only one relationship for each input value.

When writing a function there are three important parts: the argument, the relationship and the output of the function.
Let’s look at an example.

f(x) = x^2, (2.1)

where f is the name of the function labelling the output, x is the argument and x^2 is the relationship, that is, what the
function does to the input. The action of the function is straightforward. For example, if we have an argument x = 2,
then the output is f(2) = 4.

Note: f is a common name for a function, but we could have called it anything at all; for example, we could have called
it g or h or helicopter. It does not matter, it is just a name. Equally, the argument does not have to be x; we could have
chosen b, q or telephone. The argument is just a placeholder. For example,

f(b) = b^2, (2.2)

is exactly the same function as the one in Equation 2.1. As above, if the argument b = 2, then the output is again
f (2) = 4. The argument just shows us where the input goes.

Sometimes functions have no name, for example

y = x^2, (2.3)

but the function still has an output, and in this case it is y.

There are a large number of special types of function possessing very particular properties. For example, the symmetry
of a function can play a crucial role in how it behaves: a function may be even, odd, or neither.

An even function is one like x^2 or cos x whose graph for negative x is just a reflection in the y-axis of its graph for positive
x. Mathematically we can write this as

A function f(x) is even if f(−x) = f(x). (2.4)

An odd function is one like x or sin x, where the values of f(x) and f(−x) are negatives of each other. By definition

A function f(x) is odd if f(−x) = −f(x). (2.5)

We will come back to even and odd functions later, in particular when we look at integrals over these functions.
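We can already check the evenness of cos and the oddness of sin numerically; the following is a minimal Python sketch (the sample points are arbitrary choices):

```python
import math

# Check evenness of cos and oddness of sin at a few sample points.
for x in [0.5, 1.7, 3.0]:
    assert math.isclose(math.cos(-x), math.cos(x))   # even: f(-x) = f(x)
    assert math.isclose(math.sin(-x), -math.sin(x))  # odd:  f(-x) = -f(x)

print("even/odd checks passed")
```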

7
2.2 Logarithms - a quick revision

You will have already come across Logarithms at school, but a brief review is given here to refresh your memory. Let’s
start with the following

x = b^p, (2.6)

where we say that p is the Logarithm of x to the base b.

We write this as

p = log_b x, (2.7)

where the base, b, can be any positive number (more formally, b ∈ ℝ with b > 0). Two common examples are b = 10
and b = e, where e is a special constant with approximate value e = 2.71828 to 5 decimal places. This constant plays a
crucial role in Physics, and so when it is used as the base of a Logarithm it gets its own special name.
The Log to the base e is referred to as the natural Logarithm and is often denoted by

p = ln x. (2.8)

It should be clear that log_b 0 = −∞, or more properly that log_b 0 is not defined. The reason is that if x = 0 then 0 = b^p
can only be satisfied in the limit p → −∞. The real Logarithm function is only defined for x > 0.

By definition

x = b^(log_b x) and conversely x = log_b(b^x), (2.9)

so in the case of the natural Logarithm we have

x = e^(ln x) and x = ln(e^x). (2.10)

Another common base is 10, and you will often see a Log to the base 10 written without the base explicitly stated:
log_10 x = log x. It is always important to make sure it is clear which base is being used.

There are a number of important rules associated with the manipulation of Logarithms.

The Product Rule.

log_b(xy) = log_b x + log_b y. (2.11)

Proof: Let x = b^n and y = b^m, so that n = log_b x and m = log_b y. Consider the product of x and y:

xy = b^n b^m = b^(n+m) = b^(log_b x + log_b y). (2.12)

But we also know that by definition

xy = b^(log_b(xy)). (2.13)

Comparing the exponents in Equations 2.12 and 2.13 we see that

log_b(xy) = log_b x + log_b y.

The Quotient Rule.

log_b(x/y) = log_b x − log_b y. (2.14)

Proof: Let x = b^n and y = b^m, so that n = log_b x and m = log_b y. Consider the quotient of x and y:

x/y = b^n / b^m = b^n b^(−m) = b^(n−m) = b^(log_b x − log_b y). (2.15)

In addition we have by definition

x/y = b^(log_b(x/y)). (2.16)

Comparing the exponents in Equations 2.15 and 2.16 we see that

log_b(x/y) = log_b x − log_b y.
The Power Rule.

log_b(x^m) = m log_b x. (2.17)

Proof: Let x = b^n, so that n = log_b x. Raise x to the power of m:

x^m = (b^n)^m = b^(mn). (2.18)

Now take Logs of both sides:

log_b(x^m) = log_b(b^(mn)) = mn. (2.19)

Now apply n = log_b x to find

log_b(x^m) = mn = m log_b x. (2.20)

Changing the base of a Logarithm.

log_a x = log_a b log_b x. (2.21)

Proof: By definition we can write

x = a^(log_a x) = b^(log_b x). (2.22)

But we can also write

b = a^(log_a b). (2.23)

Using this in Equation 2.22 we arrive at

a^(log_a x) = (a^(log_a b))^(log_b x) = a^(log_a b log_b x). (2.24)

Comparing exponents we find

log_a x = log_a b log_b x. (2.25)

Example: Show that

log_b x + log_a x = (1 + log_a b) log_b x. (2.26)

Solution: Use the change of base rule to convert log_a x = log_a b log_b x:

log_b x + log_a x = log_b x + log_a b log_b x (2.27)
                  = (1 + log_a b) log_b x.
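All four Logarithm rules can be verified numerically. The following Python sketch does this via the change-of-base relation log_b x = ln x / ln b; the base and argument values are arbitrary choices:

```python
import math

def log_base(b, x):
    """Logarithm of x to base b, via log_b x = ln x / ln b."""
    return math.log(x) / math.log(b)

b, x, y, m, a = 3.0, 5.0, 7.0, 4.0, 10.0

# Product rule: log_b(xy) = log_b x + log_b y
assert math.isclose(log_base(b, x * y), log_base(b, x) + log_base(b, y))

# Quotient rule: log_b(x/y) = log_b x - log_b y
assert math.isclose(log_base(b, x / y), log_base(b, x) - log_base(b, y))

# Power rule: log_b(x^m) = m log_b x
assert math.isclose(log_base(b, x ** m), m * log_base(b, x))

# Change of base: log_a x = log_a b * log_b x
assert math.isclose(log_base(a, x), log_base(a, b) * log_base(b, x))

print("all logarithm identities verified")
```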

2.3 Trigonometric Functions - a quick revision and a little more

A very common set of functions that we will see in this course are the trigonometric functions. For example you will be
familiar with sin, cos, tan. Most relationships between these functions can be derived from two identities (which we will
prove later in the course).

cos[x + y] = cos[x] cos[y] − sin[x] sin[y] (2.28)


sin[x + y] = cos[x] sin[y] + cos[y] sin[x]. (2.29)

For example, if we set y = −x in Equation 2.28, and use the fact that cos is even and sin is odd, we get

cos(0) = 1 = cos^2(x) + sin^2(x). (2.30)

Alternatively, if we set x = y in Equation 2.28 we find

cos(2x) = cos^2(x) − sin^2(x). (2.31)
We also have a set of functions that are the reciprocals of the familiar trigonometric functions. These are

cosec(x) = 1/sin(x); sec(x) = 1/cos(x); cot(x) = 1/tan(x). (2.32)

As an example of a further identity we can start with Equation 2.30 and divide both sides by cos^2(x):

1/cos^2(x) = sec^2(x) = cos^2(x)/cos^2(x) + sin^2(x)/cos^2(x),

sec^2(x) = 1 + tan^2(x). (2.33)

Many more identities can be derived in related ways, see the list in section 7.1 of the Mathematics Formula Booklet
(online version here).
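The two master identities and the derived identity sec^2(x) = 1 + tan^2(x) can be checked numerically; a short Python sketch with arbitrary sample angles:

```python
import math

# Check the two master identities (2.28)-(2.29) at a few sample angles.
for x, y in [(0.3, 1.1), (-0.7, 2.0), (1.5, -0.4)]:
    assert math.isclose(math.cos(x + y),
                        math.cos(x) * math.cos(y) - math.sin(x) * math.sin(y))
    assert math.isclose(math.sin(x + y),
                        math.cos(x) * math.sin(y) + math.cos(y) * math.sin(x))

# Check the derived identity sec^2(x) = 1 + tan^2(x).
for x in [0.2, 0.9, -1.2]:
    sec2 = 1.0 / math.cos(x) ** 2
    assert math.isclose(sec2, 1.0 + math.tan(x) ** 2)

print("trigonometric identities verified")
```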

A further important property of cos x and sin x is that they can be expanded in terms of a polynomial, literally meaning
"many terms". In our case these polynomials are a sum of terms depending on the variable x to increasingly higher powers.
Specifically,

sin x = x − x^3/3! + x^5/5! − x^7/7! + ... = Σ_{n=0}^∞ (−1)^n x^(2n+1)/(2n+1)! (2.34)

cos x = 1 − x^2/2! + x^4/4! − x^6/6! + ... = Σ_{n=0}^∞ (−1)^n x^(2n)/(2n)!, (2.35)

where the symbol ! means factorial, for example 4! = 4·3·2·1, and 0! = 1 by definition. These power series
expansions come from Taylor Series.

2.4 Power Series

By definition a power series has the form

Σ_{n=0}^∞ a_n x^n = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n + ... (2.36)

or

Σ_{n=0}^∞ a_n (x − a)^n = a_0 + a_1 (x − a) + a_2 (x − a)^2 + ... + a_n (x − a)^n + ... . (2.37)

The value of x will determine whether the series converges, and we are often tasked with finding the values of x for which
a series converges. For example, the power series

P(x) = 1 − x/2 + x^2/4 − x^3/8 + ... + (−x)^n/2^n + ...

can be tested for convergence by considering the absolute ratio of successive terms,

ρ_n = |(−x)^(n+1)/2^(n+1)| / |(−x)^n/2^n| = |x|/2,

and then

ρ = lim_{n→∞} ρ_n = |x|/2.

The series converges when ρ < 1, that is for |x| < 2, and diverges for |x| > 2. When x = 2 we get the series

S = 1 − 1 + 1 − 1 + ...,

which is not convergent, and for x = −2 we have

S = 1 + 1 + 1 + 1 + ...,

which again is not convergent. We can now say that the interval of convergence for the power series P(x) is −2 < x < 2.
When finding the interval of convergence one must always analyse whether the interval endpoint values lead to a convergent
series or not. We will come back to testing the convergence of series later in the term.
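The behaviour either side of the interval of convergence can be illustrated numerically. A Python sketch (an illustration, not a proof): inside |x| < 2 the partial sums of P(x) settle down to the geometric-series limit 1/(1 + x/2), while outside they keep growing:

```python
# Partial sums of P(x) = sum_{n>=0} (-x/2)^n.
def partial_sum(x, terms):
    return sum((-x / 2.0) ** n for n in range(terms))

# Inside the interval (x = 1) the partial sums approach 1/(1 + x/2) = 2/3.
x = 1.0
limit = 1.0 / (1.0 + x / 2.0)
assert abs(partial_sum(x, 50) - limit) < 1e-12

# Outside the interval (x = 3) successive partial sums keep growing in magnitude.
x = 3.0
assert abs(partial_sum(x, 60)) > abs(partial_sum(x, 30))

print("convergence behaviour of P(x) illustrated")
```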

2.4.1 Expanding Functions in Power Series - Taylor Series

It is very often more convenient to work with a power series representation of a function. For example, we will consider
the power series expansion of the function sin x, which we have already seen earlier in the course.

We first have to assume that such an expansion exists and then find the values of all the coefficients, a_n. Writing this
expansion we have

sin x = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n + ... . (2.38)

The interval of convergence of a power series of this form must contain x = 0. So we may set x = 0 in this expansion, and
we learn that a_0 = 0. Next we differentiate both sides of Equation 2.38 to find

cos x = a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + ... . (2.39)

Again putting x = 0 into this we find a_1 = 1. Repeating this procedure a further time we have

− sin x = 2a_2 + 3·2 a_3 x + 4·3 a_4 x^2 + ..., (2.40)

with x = 0 telling us that a_2 = 0. If we continue this we find that

a_3 = −1/3!, a_4 = 0, a_5 = 1/5!, etc.

Substituting these values back into Equation 2.38 we arrive at the power series

sin x = x − x^3/3! + x^5/5! − ... = Σ_{n=0}^∞ (−1)^n x^(2n+1)/(2n+1)!. (2.41)

A series obtained in this way is called a Maclaurin Series, or a Taylor Series expansion about the origin.

A Taylor series in general means a series of powers of (x − a), where a is some constant. It is found by writing (x − a)
instead of x on the right hand side of a power series expansion. To find the coefficients in this case we follow a similar
procedure to before, but this time we set x = a to evaluate the coefficients.

12
To see this we can expand a general function of x as a Taylor Series. Consider the function f(x) and expand it around
the point x = a. We have then

f(x) = a_0 + a_1 (x − a) + a_2 (x − a)^2 + a_3 (x − a)^3 + a_4 (x − a)^4 + ... + a_n (x − a)^n + ... (2.42)

f'(x) = a_1 + 2a_2 (x − a) + 3a_3 (x − a)^2 + 4a_4 (x − a)^3 + ... + n a_n (x − a)^(n−1) + ... (2.43)

f''(x) = 2a_2 + 3·2 a_3 (x − a) + 4·3 a_4 (x − a)^2 + ... + n(n−1) a_n (x − a)^(n−2) + ... (2.44)

f'''(x) = 3! a_3 + 4·3·2 a_4 (x − a) + ... + n(n−1)(n−2) a_n (x − a)^(n−3) + ... (2.45)

... (2.46)

f^(n)(x) = n(n−1)(n−2)...2·1 a_n + terms containing powers of (x − a). (2.47)

In each of the above equations we put x = a and find

f(a) = a_0, f'(a) = a_1, f''(a) = 2a_2, (2.48)

f'''(a) = 3! a_3, ..., f^(n)(a) = n! a_n. (2.49)

We can now write the full Taylor series for f(x) about x = a:

f(x) = f(a) + (x − a) f'(a) + (1/2!)(x − a)^2 f''(a) + ... + (1/n!)(x − a)^n f^(n)(a) + ... . (2.50)

The Maclaurin series for f(x) is the Taylor series about the origin. Putting a = 0 we obtain the Maclaurin series (or
Taylor Series expansion about x = 0) for f(x):

f(x) = f(0) + x f'(0) + (1/2!) x^2 f''(0) + ... + (1/n!) x^n f^(n)(0) + ... . (2.51)

Note: The functions are differentiated first and then the value of x around which they are being expanded is inserted.

We have already seen some important Maclaurin series, but they are collected here and should be memorised. You are
also expected to be able to derive them all as well.

sin x = x − x^3/3! + x^5/5! − x^7/7! + ..., convergent for all x; (2.53)

cos x = 1 − x^2/2! + x^4/4! − x^6/6! + ..., convergent for all x; (2.54)

e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + ..., convergent for all x; (2.55)

ln(1 + x) = x − x^2/2 + x^3/3 − x^4/4 + ..., convergent for −1 < x ≤ 1; (2.56)

(1 + x)^p = 1 + px + (p(p−1)/2!) x^2 + (p(p−1)(p−2)/3!) x^3 + ..., convergent for |x| < 1. (2.57)

The last of these series is called a "Binomial Series".
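Truncating these series after a handful of terms already gives good approximations. A Python sketch comparing 10-term partial sums against the library functions at the arbitrary sample point x = 0.5:

```python
import math

x = 0.5
N = 10  # number of series terms kept

# e^x = sum x^n / n!
exp_series = sum(x ** n / math.factorial(n) for n in range(N))
assert abs(exp_series - math.exp(x)) < 1e-7

# sin x = sum (-1)^n x^(2n+1) / (2n+1)!
sin_series = sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
                 for n in range(N))
assert abs(sin_series - math.sin(x)) < 1e-12

# ln(1+x) = sum (-1)^(n+1) x^n / n; converges more slowly, so looser tolerance.
log_series = sum((-1) ** (n + 1) * x ** n / n for n in range(1, N + 1))
assert abs(log_series - math.log(1 + x)) < 1e-4

print("Maclaurin series verified at x = 0.5")
```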

13
Let's look at an example. Expand f(x) = cos bx around x = 0, where b is a constant. Include only the first 3 terms.

We use the Taylor Series expansion around x = 0 in Equation 2.51, that is

f(x) ≈ f(0) + x f'(0) + (1/2!) x^2 f''(0).

Notice the approximation sign; this is because we are only including the first three terms of what is an infinite series, and
so our power series expansion is only an approximation of the full function.

We can calculate the individual terms. The first term is

f(0) = cos(bx)|_{x=0} = 1.

The coefficient of x is then

f'(0) = d cos(bx)/dx |_{x=0} = −b sin(bx)|_{x=0} = 0,

and finally the coefficient of x^2 is

(1/2) f''(0) = (1/2) d^2 cos(bx)/dx^2 |_{x=0} = −(b^2/2) cos(bx)|_{x=0} = −b^2/2.

Putting all this together we get

cos(bx) ≈ 1 − (bx)^2/2.

If we had wanted the first three non-zero terms we would have had to go to the x^4 term.
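We can confirm numerically that this quadratic approximation is good for small bx, with the error bounded by the first neglected term (bx)^4/4!. A Python sketch with b = 3 as an arbitrary choice:

```python
import math

b = 3.0
for x in [0.01, 0.05, 0.1]:
    approx = 1.0 - (b * x) ** 2 / 2.0
    # The first neglected term is (bx)^4/4!; for an alternating series the
    # truncation error is below the first omitted term.
    assert abs(math.cos(b * x) - approx) <= (b * x) ** 4 / 24.0

print("quadratic approximation of cos(bx) verified")
```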

2.4.2 Approximation Errors in Taylor Series

See Riley Sections 4.6.1 and 4.6.2.

We have seen that some functions (those that are differentiable) can be represented by an infinite series. It is often useful
to approximate this series and keep only the first n terms, with the other terms neglected.

If we do make this approximation we would like to know what size of error we are making. We can rewrite the full Taylor
series as

f(x) = f(a) + (x − a) f'(a) + (1/2!)(x − a)^2 f''(a) + ... + (1/(n−1)!)(x − a)^(n−1) f^(n−1)(a) + R_n(x), (2.58)

where

R_n(x) = ((x − a)^n / n!) f^(n)(η),

with η lying in the range [a, x]. R_n(x) is called the remainder term and represents the error in approximating f(x) by
the above (n − 1)th-order power series.

Although the value of η that satisfies the expression for R_n is not known, an upper limit on the error may be found by
differentiating R_n with respect to η, equating the result to zero and solving for η in the usual way for finding a maximum.

For example, if we calculate the Taylor series for cos x about x = 0 but only include the first two terms we get

cos x ≈ 1 − x^2/2.

The remainder function in this case is

R_4(x) = (x^4/4!) cos η,

with η confined to the interval [0, x]. It is clear that the maximum value cos η can take is 1, so the error is at most x^4/4!.

If we take x = 0.5, taking just the first two terms yields cos(0.5) ≈ 0.875, with an indicated maximum error of 0.00260.
In fact, using a calculator, cos(0.5) = 0.87758 to 5 decimal places. To this accuracy the true error is 0.00258.
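These numbers are easy to reproduce. A Python sketch checking the approximation, the error bound x^4/4!, and the true error for x = 0.5:

```python
import math

x = 0.5
approx = 1.0 - x ** 2 / 2.0          # two-term Taylor approximation
bound = x ** 4 / math.factorial(4)   # remainder bound x^4/4!

assert approx == 0.875
assert round(bound, 5) == 0.00260

true_error = math.cos(x) - approx
assert round(true_error, 5) == 0.00258
assert abs(true_error) <= bound      # the true error respects the bound

print("remainder bound verified for cos(0.5)")
```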


Figure 3.1: The Step(x) function is an example of a function that is not continuous.

3 Differential calculus

You will have already seen many examples of differentiation and know what the derivatives of certain functions are. In this
section we will look at the formal definition of a derivative, which will allow us to prove the results for the derivatives that
you have seen before.

3.1 Continuity of a function

First we must define a property of a function that will be useful in our description of differentiation. A function, f (x), is
continuous at x = a if f (x) → f (a) as x → a.

An example of where this condition breaks is the following:

The "Step" function, which is defined as

Step(x) = 1 if x > 0, and Step(x) = 0 if x < 0. (3.1)

A graphic visualisation of this function is displayed in Figure 3.1. It is clear that this function is not continuous at x = 0.
In particular, as we take the limit x → 0+, Step(x) → 1, in contrast to when we take the limit x → 0−, Step(x) → 0. Here
x → 0+ means that we take x to zero from positive numerical values of x, while x → 0− means we take x to zero from
negative numerical values of x.


Figure 3.2: The Barrow triangle determining the gradient of the tangent to a curve. The graph shows that the gradient
or slope of the function at P, given by tan θ, is approximately equal to Δf /Δx.

3.2 Differentiability and Differentiation.

Read section 2.1 in Riley for more on this.

We will now investigate what it means to take a derivative of a function. The derivative of a function is essentially the
gradient of the tangent to a curve (as described by the function) at an arbitrary input value (let’s call it x for now).

With this in mind consider Figure 3.2. The figure shows the plot of f , as a function of the input variable x. We wish to
calculate the gradient (that is, the slope) of the tangent to a curve at a particular value of x.

The tangent line (or simply tangent) to a plane curve at a given point is the straight line that “just touches” the curve
at that point (see green line in Figure 3.2). Leibniz defined it as the line through a pair of infinitely close points on the
curve.

Figure 3.2 shows that the gradient of the hypotenuse of the Barrow triangle is given by

[f(x + Δx) − f(x)] / Δx = Δf/Δx. (3.2)

This is not quite the tangent to the curve at the point P, given by tan θ. However, it is clear that as we decrease Δx
the hypotenuse and the tangent to the curve become closer and closer. By taking the limit Δx → 0 we can match on to the
gradient. Formally speaking, then, the derivative of f with respect to x is given by

df/dx ≡ lim_{Δx→0} [f(x + Δx) − f(x)]/Δx = lim_{Δx→0} Δf/Δx, (3.3)

where lim_{Δx→0} means take the limit Δx → 0 in the expression.

This definition is only valid if the limit has a unique value at the input value; that is, approaching the point from either
direction must give the same value. Furthermore, if a function is differentiable it is also continuous. The reverse is not
true: a function that is continuous is not necessarily differentiable.

Let’s look at how this works in two examples.

Example 1: Find the derivative from first principles of f(x) = x^2 with respect to x.

Applying Equation 3.3 to this function we have

df/dx ≡ lim_{Δx→0} [f(x + Δx) − f(x)]/Δx = lim_{Δx→0} [(x + Δx)^2 − x^2]/Δx = lim_{Δx→0} [2xΔx + (Δx)^2]/Δx = lim_{Δx→0} (2x + Δx) = 2x. (3.4)

So we conclude that if f(x) = x^2 then f′ = 2x, where f′ is a common shorthand for df/dx.
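The limit definition can be illustrated numerically: for f(x) = x^2 the difference quotient is exactly 2x + Δx, so the error shrinks as Δx → 0. A Python sketch at the arbitrary point x = 3:

```python
# Difference quotient (f(x + dx) - f(x)) / dx from the limit definition.
def difference_quotient(f, x, dx):
    return (f(x + dx) - f(x)) / dx

f = lambda x: x ** 2
x = 3.0
errors = [abs(difference_quotient(f, x, dx) - 2 * x)
          for dx in (1e-1, 1e-3, 1e-5)]

# For f = x^2 the quotient equals 2x + dx, so the error shrinks with dx.
assert errors[0] > errors[1] > errors[2]

print("difference quotients approach f'(x) = 2x")
```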

Example 2: Find the derivative from first principles of f (β) = cos β with respect to β.

Applying Equation 3.3 to this function we have

df/dβ ≡ lim_{Δβ→0} [cos(β + Δβ) − cos β]/Δβ. (3.5)

We can expand cos(β + Δβ) using the angle-addition formula in Equation 2.28 to rewrite this as

cos(β + Δβ) = cos β cos(Δβ) − sin β sin(Δβ). (3.6)

The next step is to use the polynomial expansions for both sin and cos expressed in Equations 2.34 and 2.35 respectively.
We keep only the first few terms, as higher order (higher power) terms become smaller and smaller as we take Δβ to zero:

sin(Δβ) = Δβ + ... (3.7)

cos(Δβ) = 1 − (Δβ)^2/2! + ..., (3.8)

where ... represents higher order powers of Δβ. Putting these into our derivative we find

df/dβ ≡ lim_{Δβ→0} [cos β cos(Δβ) − sin β sin(Δβ) − cos β]/Δβ
      = lim_{Δβ→0} [cos β (1 − (Δβ)^2/2! + ...) − (Δβ + ...) sin β − cos β]/Δβ
      = lim_{Δβ→0} [−cos β ((Δβ)^2/2! − ...) − (Δβ + ...) sin β]/Δβ = − sin β. (3.9)

18
Finally we arrive at the result that the derivative of f(β) = cos β is

df/dβ = d(cos β)/dβ = − sin β.
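A numerical check of this result, using a central difference quotient at a few arbitrary sample points:

```python
import math

# Central difference approximation to the derivative.
def central_diff(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

# d(cos x)/dx should equal -sin x at every sample point.
for beta in [0.0, 0.7, 2.0, -1.3]:
    assert abs(central_diff(math.cos, beta) - (-math.sin(beta))) < 1e-8

print("derivative of cos verified numerically")
```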

It is important that you learn the following results for use in calculations. These results all follow from the technique
outlined above and you should be able to derive all of them. However, unless you are asked explicitly to do so, it is
sufficient just to state these results:

d/dx (x^n) = n x^(n−1),              d/dx (sin ax) = a cos ax,
d/dx (e^(ax)) = a e^(ax),            d/dx (cos ax) = −a sin ax,
d/dx (ln ax) = 1/x,                  d/dx (tan ax) = a sec^2 ax,
d/dx (sec ax) = a sec ax tan ax,     d/dx (cosec ax) = −a cosec ax cot ax,

where a and n are constants.

A note on notation. Derivatives are often written as follows:

df/dx ≡ f′, d^2 f/dx^2 ≡ f″ ≡ f^(2), d^n f/dx^n ≡ f^(n). (3.10)

Make sure you include the brackets for the higher order derivatives to avoid confusing them with powers.

3.3 Rules of Differentiation

3.3.1 Product Function

For more reading and examples see Riley Section 2.1.3.

We can also prove the product rule in a similar way. Let’s say we have a function f that is a product of two other
functions u(x) and v(x). Applying Equation 3.3 we have that
df/dx = lim_{∆x→0} [u(x + ∆x)v(x + ∆x) − u(x)v(x)]/∆x
      = lim_{∆x→0} [u(x + ∆x)(v(x + ∆x) − v(x)) + (u(x + ∆x) − u(x))v(x)]/∆x
      = lim_{∆x→0} [ u(x + ∆x) (v(x + ∆x) − v(x))/∆x + v(x) (u(x + ∆x) − u(x))/∆x ]
      = u(x) dv(x)/dx + v(x) du(x)/dx.

Consequently we obtain the familiar


df/dx = d(u(x)v(x))/dx = u(x) dv(x)/dx + v(x) du(x)/dx.  (3.11)

This is the Product Rule.
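A quick numerical sanity check of the product rule (an illustrative addition, with u = sin x and v = eˣ chosen arbitrarily):

```python
import math

def deriv(f, x, h=1e-6):
    # Central finite difference approximation to df/dx.
    return (f(x + h) - f(x - h)) / (2 * h)

# u(x) = sin x and v(x) = e^x, so u' = cos x and v' = e^x.
x = 1.3
lhs = deriv(lambda t: math.sin(t) * math.exp(t), x)           # d(uv)/dx, numerically
rhs = math.sin(x) * math.exp(x) + math.cos(x) * math.exp(x)   # u v' + u' v
assert abs(lhs - rhs) < 1e-6
```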

3.3.2 Composite Function or Function of a Function or Implicit Differentiation

For more reading and examples see Riley Section 2.1.5.

Consider a function g(x) and another function f(g(x)), which is a function of our original function g. We would like to
know how to take the derivative of f (g(x)). We already know the answer from school but let’s prove it using Equation 3.3.

df(g(x))/dx = lim_{∆x→0} [f(g(x + ∆x)) − f(g(x))]/∆x  (3.12)
            = lim_{∆x→0} [f(g(x + ∆x)) − f(g(x))]/[g(x + ∆x) − g(x)] × [g(x + ∆x) − g(x)]/∆x.

But we have that ∆g = g(x + ∆x) − g(x), which itself tends to zero as ∆x → 0, so

df(g(x))/dx = lim_{∆g→0} [f(g + ∆g) − f(g)]/∆g × lim_{∆x→0} ∆g/∆x = (df/dg)(dg/dx).  (3.13)
This is the Chain Rule.
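Again as an illustrative check (not part of the notes), the chain rule can be verified numerically; here f(g) = sin g and g(x) = x² are arbitrary choices.

```python
import math

def deriv(f, x, h=1e-6):
    # Central finite difference approximation to df/dx.
    return (f(x + h) - f(x - h)) / (2 * h)

# f(g) = sin g with g(x) = x^2: the chain rule gives df/dx = cos(x^2) * 2x.
x = 0.9
lhs = deriv(lambda t: math.sin(t**2), x)
rhs = math.cos(x**2) * 2 * x
assert abs(lhs - rhs) < 1e-6
```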

3.3.3 Quotient Rule

For more reading and examples see Riley Section 2.1.4.

The Quotient rule does not really stand alone as a separate rule as it is a direct consequence of the product and chain
rule. Consider the product of two functions u(x) and 1/v(x). We can take the derivative of this product following the
product rule
   
d/dx (u · 1/v) = u d/dx (1/v) + (1/v) du/dx  (3.14)
               = −(u/v²) dv/dx + (1/v) du/dx,  (3.15)

so that

d/dx (u/v) = [v du/dx − u dv/dx]/v²,  (3.16)
which is the Quotient Rule.

3.3.4 Logarithmic Differentiation

For more reading and examples see Riley Section 2.1.6.

In circumstances in which the variable with respect to which we are differentiating is an exponent, taking logarithms and
then differentiating implicitly is the simplest way to find the derivative.

Example: Find the derivative with respect to x of g(x) = aˣ, where a is a constant.

Solution: First take natural logs of both sides and then differentiate:

ln g = ln(aˣ) = x ln a,  ⇒  (1/g) dg/dx = ln a.  (3.17)

Now simply rearrange and substitute in the original expression for g

dg/dx = g ln a = aˣ ln a.  (3.18)
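The result of the logarithmic differentiation can be checked numerically (an illustrative sketch; the values of a and x are arbitrary):

```python
import math

def deriv(f, x, h=1e-6):
    # Central finite difference approximation to df/dx.
    return (f(x + h) - f(x - h)) / (2 * h)

a, x = 3.0, 1.2
lhs = deriv(lambda t: a**t, x)   # derivative of a^x, numerically
rhs = a**x * math.log(a)         # the closed form from logarithmic differentiation
assert abs(lhs - rhs) < 1e-5
```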

3.3.5 Leibniz’s Theorem

For more reading and examples see Riley Section 2.1.7.

Leibniz’ theorem states the form of the nth order derivative of a product of functions. We know from above that the
derivative of a product of two functions, u ≡ u(x) and v ≡ v(x) is given by the product rule

d(uv)/dx = u dv/dx + v du/dx.  (3.19)

Apply the derivative again


d²(uv)/dx² = u d²v/dx² + 2 (du/dx)(dv/dx) + v d²u/dx²  (3.20)

and again

d³(uv)/dx³ = u d³v/dx³ + 3 (du/dx)(d²v/dx²) + 3 (d²u/dx²)(dv/dx) + v d³u/dx³,  (3.21)
etc.

This is similar to the binomial theorem of the form

(a + b)ⁿ = aⁿ + naⁿ⁻¹b + . . . + nabⁿ⁻¹ + bⁿ = Σ_{r=0}^{n} (n choose r) aⁿ⁻ʳbʳ,  (3.22)

where

(n choose r) = n!/(r!(n − r)!).  (3.23)

From this we can write Leibniz’s theorem as


dⁿ(uv)/dxⁿ = Σ_{r=0}^{n} (n choose r) u^(n−r) v^(r).  (3.24)
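As an illustrative aside, Leibniz's theorem can be checked numerically for a specific product; the choices u = x³ and v = sin x below are arbitrary, and their derivatives are entered analytically.

```python
import math

# Analytic derivatives of u(x) = x^3 (zero beyond the third) and v(x) = sin x (cyclic).
def u_deriv(r, x):
    return [x**3, 3 * x**2, 6 * x, 6.0][r] if r <= 3 else 0.0

def v_deriv(r, x):
    return [math.sin, math.cos,
            lambda t: -math.sin(t), lambda t: -math.cos(t)][r % 4](x)

def leibniz(n, x):
    # nth derivative of u*v from Leibniz's theorem: sum of C(n, r) u^(n-r) v^(r).
    return sum(math.comb(n, r) * u_deriv(n - r, x) * v_deriv(r, x)
               for r in range(n + 1))

# Compare the n = 2 case against a second central difference of u(x)v(x).
x, h = 0.8, 1e-4
f = lambda t: t**3 * math.sin(t)
second_diff = (f(x + h) - 2 * f(x) + f(x - h)) / h**2
assert abs(leibniz(2, x) - second_diff) < 1e-4
```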

3.4 Special Points of a Function

In the preceding sections we have interpreted the derivative of a function as the gradient of the function at a particular
value of the input variable. If the gradient is zero for some particular value of x then the function is said to have a
stationary point there.

Figure 3.3: Examples of the three different types of stationary point.

In Figure 3.3, the function has three stationary points in the domain displayed at points x = a, x = b and x = c. In each
case it is clear that the gradient of the function at the three points is zero, corresponding graphically to horizontal lines.

A stationary point can be classified in three ways. It can be a minimum, as at x = a, where the function rises on either side of that point; it can be a maximum, as at x = b, where the function falls on either side of that point; or it can be a point of inflection, as at x = c, where the gradient is zero but the function does not turn: here it falls on one side of the point and continues to fall on the other.

The first two types, the maximum and the minimum, are commonly referred to as turning points.

We can define these stationary points mathematically. All stationary points have zero gradient and therefore
df/dx = 0.  (3.25)

In the case of a minimum, the slope of the graph, that is df /dx, goes from negative for x < a to positive for a < x < b.
This means that d2 f /dx2 > 0 at x = a.

In the case of a maximum, the slope of the graph goes from positive for a < x < b to negative for b < x < c. This means
that d2 f /dx2 < 0 at x = b.

In the case of a point of inflection, to the left of x = c the slope becomes less negative as we increase x towards x = c, and hence d²f/dx² > 0 there. To the right of x = c, however, the slope becomes increasingly negative, so that d²f/dx² < 0. It is not obvious, but this means that at x = c, d²f/dx² = 0.

In summary at a stationary point:

• for a maximum: df/dx = 0 and d²f/dx² < 0,
• for a minimum: df/dx = 0 and d²f/dx² > 0,
• for a point of inflection: df/dx = 0, d²f/dx² = 0, and d²f/dx² changes sign through the point.
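These classification tests can be tried numerically; the function f(x) = x³ − 3x below is an illustrative choice with a maximum at x = −1 and a minimum at x = +1.

```python
def f(x):
    # Illustrative function: f(x) = x^3 - 3x, so f'(x) = 3x^2 - 3 and f''(x) = 6x.
    return x**3 - 3 * x

def d1(g, x, h=1e-5):
    # First derivative via central difference.
    return (g(x + h) - g(x - h)) / (2 * h)

def d2(g, x, h=1e-4):
    # Second derivative via the standard three-point formula.
    return (g(x + h) - 2 * g(x) + g(x - h)) / h**2

# Stationary points: f' = 0 at x = -1 (maximum, f'' < 0) and x = +1 (minimum, f'' > 0).
assert abs(d1(f, -1.0)) < 1e-6 and d2(f, -1.0) < 0
assert abs(d1(f, 1.0)) < 1e-6 and d2(f, 1.0) > 0
```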

3.4.1 Singular Points

In mathematics, a singularity is in general a point at which a given mathematical object is not defined, or a point of
an exceptional set where it fails to be well-behaved in some particular way, such as differentiability. In particular, the
derivative of a function at a singular point is not well behaved.

3.5 Inverse Functions

An inverse function is a function that “reverses” the action of another function. For example, if a function f applied to
an input variable x gives the output y, then applying the inverse function, g to y gives x.

Often the inverse of a function is written as f⁻¹(y). (Note that the argument of the inverse function does not have to be y; as we learnt earlier, the symbol used to represent the argument of a function is not important, and it could be x or t or anything else you can think of.)

An extremely simple example is to find the inverse function, g ≡ f −1 of

f (x) = 3x + 2.

To do this we follow some simple steps. First replace f(x) by some other variable; let's choose y. Now we solve for x in terms of y to get

x = (y − 2)/3.
We have now found the inverse function and can relabel as

g(y) = (y − 2)/3.

Or
f −1 (x) = (x − 2)/3.

Another useful example is converting between Fahrenheit and Celsius. To convert Fahrenheit to Celsius,

f(F) = (5/9)(F − 32),

and the inverse, that is converting Celsius to Fahrenheit, is

f⁻¹(C) = (9/5)C + 32.
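The defining property of an inverse, that it reverses the action of the original function, is easy to check for this temperature example (an illustrative sketch):

```python
def f_to_c(F):
    # f(F) = (5/9)(F - 32): Fahrenheit to Celsius.
    return 5.0 * (F - 32.0) / 9.0

def c_to_f(C):
    # The inverse function f^{-1}(C) = (9/5)C + 32: Celsius to Fahrenheit.
    return 9.0 * C / 5.0 + 32.0

# Applying a function followed by its inverse recovers the original input.
for temp in (-40.0, 0.0, 98.6, 212.0):
    assert abs(c_to_f(f_to_c(temp)) - temp) < 1e-9

assert f_to_c(212.0) == 100.0
```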

Recall that the definition of a function requires that, for any input value within a particular domain, the function returns a unique answer. This is also true for inverse functions, as they are themselves functions. One must therefore be careful with certain functions. For example, the function
f (x) = x2

only has an inverse function provided the domain of the function is carefully chosen. For example, we can find the inverse
of the function f (x) = x2 for the domain x > 0. The inverse function is straightforward to find and is given by

f −1 (y) = y 1/2 .

If however we choose to find the inverse function of g(x) = x2 for the domain x < 0 then the inverse function in this case
is
g −1 (y) = −y 1/2 ,

where in both cases we must also specify y > 0.

3.5.1 Inverse Trigonometric Functions

The inverse of trigonometric functions also come with a domain warning. Their inverses are usually written

sin−1 x = arcsin x, cos−1 x = arccos x, tan−1 x = arctan x.

Inverse trigonometric functions do not lead to unique outputs in general as

y = sin x = sin(x + 2nπ), where n = 0, ±1, ±2, . . . .

Hence, the inverse x = sin⁻¹y can take an infinite number of values for a given y. In order to overcome this, we can define a domain over which the inverse is single-valued. These domains are referred to as the principal branches

x = arcsin y,  −π/2 ≤ x ≤ π/2,
x = arccos y,  0 ≤ x ≤ π,
x = arctan y,  −π/2 < x < π/2.

3.5.2 Inverse Function Rule

Suppose f and g are inverse functions so that

y = f (x), and x = g(y),

that is, g = f⁻¹. If both the derivatives f′(x) and g′(y) exist, then

f′(x) = 1/g′(y),

or if we associate y = f (x) and x = g(y) we have


dy/dx = 1/(dx/dy).
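The inverse function rule can be checked numerically (an illustrative sketch using f(x) = x³ and its inverse, as in the example below): the two derivatives should be reciprocals of one another.

```python
import math

def deriv(f, x, h=1e-6):
    # Central finite difference approximation to df/dx.
    return (f(x + h) - f(x - h)) / (2 * h)

f = lambda x: x**3          # y = f(x)
g = lambda y: y**(1.0 / 3)  # x = g(y), the inverse for y > 0

x = 2.0
y = f(x)
# dy/dx = 1/(dx/dy): the product of the two numerical derivatives should be 1.
assert abs(deriv(f, x) * deriv(g, y) - 1.0) < 1e-4
```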

Example: f(x) = x³ and its inverse function g(y) = y^{1/3}. We have f′(x) = 3x² and g′(y) = (1/3)y^{−2/3} = 1/(3x²), where in the last step we have set y = x³ as per the definition of the original function. It is clear that f′(x) = 1/g′(y), as required.

Figure 3.4: Examples of partial derivatives of a function of x and y.

3.6 Partial Differentiation.

See Riley section 5.1.

We can generalise everything we have discussed so far to more than one variable. In PH1120 you will see a full introduction to partial differentiation, but we will have a short discussion here.

Partial derivatives arise when a function depends on more than one variable, for example f ≡ f(x, y). The function f(x, y) defines a surface in 3-D and we can calculate the slopes at points on the surface in the x and y directions.

The partial derivative of f (x, y) with respect to x at constant y is defined as

   
(∂f/∂x)_y = lim_{∆x→0} [f(x + ∆x, y) − f(x, y)]/∆x,  (3.26)

said another way, it is the rate of change of f due to an infinitesimal change in x whilst keeping y constant.

The partial derivative of f (x, y) with respect to y at constant x is defined as

   
(∂f/∂y)_x = lim_{∆y→0} [f(x, y + ∆y) − f(x, y)]/∆y,  (3.27)
said another way, it is the rate of change of f due to an infinitesimal change in y whilst keeping x constant.

In Figure 3.4 we have an example of a function f (x, y). This function forms a surface in 3-D. Now we have a surface we
need to specify along which directions we are calculating the gradient of the slope of the surface. For example, taking
point P ≡ (x0 , y0 ) on the surface we can calculate the gradient in the y-direction at a constant value of x = x0 . This
gradient line is labelled (∂f /∂y)x and is parallel to the y-axis and perpendicular to the x-axis.

Similarly we can calculate the gradient in the x-direction at a constant value of y = y0 . This gradient line is labelled
(∂f /∂x)y and is parallel to the x-axis and perpendicular to the y-axis.

Often the notation fx = (∂f /∂x)y and fy = (∂f /∂y)x is used.

Let’s look at a few examples.

Example 1: Calculate (∂f /∂x)y and (∂f /∂y)x for f = x2 + y 2 + 3xy.

Solution: Calculate fx first. We are taking the derivative of f with respect to x keeping y constant. We have then

fx = 2x + 3y.

Similarly, we calculate fy by taking the derivative of f with respect to y at constant x. We have then

fy = 2y + 3x.

Example 2: Calculate fV and fP for f = sin(V P ) + V 2 .

Solution: The two independent variables we have are V and P . To calculate fV we take the derivative of f with respect
to V while holding P constant. We have
fV = P cos(V P ) + 2V.

Similarly we can calculate fP by taking the derivative of f with respect to P while holding V constant

fP = V cos(V P ).
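Both worked examples can be checked with numerical partial derivatives (an illustrative addition; the sample points are arbitrary):

```python
import math

def partial_x(f, x, y, h=1e-6):
    # Rate of change in the first argument, holding the second constant.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    # Rate of change in the second argument, holding the first constant.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

f = lambda x, y: x**2 + y**2 + 3 * x * y   # Example 1
x, y = 1.5, -0.5
assert abs(partial_x(f, x, y) - (2 * x + 3 * y)) < 1e-6
assert abs(partial_y(f, x, y) - (2 * y + 3 * x)) < 1e-6

g = lambda V, P: math.sin(V * P) + V**2    # Example 2
V, P = 0.8, 1.1
assert abs(partial_x(g, V, P) - (P * math.cos(V * P) + 2 * V)) < 1e-6
assert abs(partial_y(g, V, P) - V * math.cos(V * P)) < 1e-6
```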

4 A note on coordinate systems.

Up to now you will be used to describing a position in 3-D space using Cartesian coordinates, (x, y, z). For certain
problems it is more convenient to use either circular, cylindrical or spherical coordinate systems to specify the point in
space.

4.1 Circular Polar Coordinates

The simplest example is where the system in which we are interested possesses a circular symmetry in 2-dimensions. In
this case we can exchange the usual Cartesian coordinates x and y for circular polars, which are defined by

x = r cos ϕ,  y = r sin ϕ,
r = √(x² + y²),  tan ϕ = y/x,

Figure 4.1: Diagrammatic illustration of circular polar coordinates.

where 0 ≤ r ≤ ∞ and 0 ≤ ϕ ≤ 2π. This relationship is demonstrated graphically in Figure 4.1, where the point P can
equally well be described in terms of the Cartesian coordinates x and y and the circular polar coordinates r and ϕ.

An example of the usefulness of this change of coordinate system is where we are considering a set of points, Pᵢ, that map out the edge of a circle. To write down the coordinates of these points in terms of x and y is extremely cumbersome, whereas using r and ϕ it is straightforward.

4.2 Cylindrical Polars

Moving to 3 dimensions, in Physics and Engineering we are often presented with a problem involving a system possessing
a cylindrical symmetry, such as a wire with a current passing through it. In this case it is convenient to use cylindrical
coordinates, which are an obvious extension of circular polars and are defined by

x = r cos ϕ,  y = r sin ϕ,  z = z,
r = √(x² + y²),  tan ϕ = y/x,

where 0 ≤ r ≤ ∞, 0 ≤ ϕ ≤ 2π and −∞ < z < ∞. This relationship is demonstrated graphically in Figure 4.2, where the point P can equally well be described in terms of the Cartesian coordinates x, y, z and the cylindrical polar coordinates r, ϕ, z.

Figure 4.2: Diagrammatic illustration of cylindrical polar coordinates.

Figure 4.3: Diagrammatic illustration of spherical polar coordinates.

4.3 Spherical Polars

Finally the last example is where the system in which we are interested has a spherical symmetry. For example, the
points on the surface of a ball are most easily described by spherical polars rather than Cartesian coordinates.

This time the components of the spherical geometry are r, θ, ϕ, and they are related to x, y, z via

x = r sin θ cos ϕ,  y = r sin θ sin ϕ,  z = r cos θ,
r = √(x² + y² + z²),  tan ϕ = y/x,  cos θ = z/√(x² + y² + z²),

where 0 ≤ r ≤ ∞, 0 ≤ ϕ ≤ 2π and 0 ≤ θ ≤ π.

This relationship is demonstrated graphically in Figure 4.3, where the point P can equally well be described in terms of the Cartesian coordinates x, y, z and the spherical polar coordinates r, ϕ, θ.
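As an illustrative sketch (not in the original notes), the spherical-polar relations can be coded as a round-trip conversion; the use of `atan2` and the reduction of ϕ to [0, 2π) are implementation choices, not part of the definitions above.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    # x = r sin(theta) cos(phi), y = r sin(theta) sin(phi), z = r cos(theta)
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def cartesian_to_spherical(x, y, z):
    r = math.sqrt(x**2 + y**2 + z**2)
    theta = math.acos(z / r)
    phi = math.atan2(y, x) % (2 * math.pi)  # atan2 picks the correct quadrant for phi
    return r, theta, phi

r, theta, phi = 2.0, 1.1, 4.0
x, y, z = spherical_to_cartesian(r, theta, phi)
r2, theta2, phi2 = cartesian_to_spherical(x, y, z)
assert abs(r - r2) < 1e-9 and abs(theta - theta2) < 1e-9 and abs(phi - phi2) < 1e-9
```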

5 Integration

Figure 5.1: Representation of the integral of the function f(x) with respect to x from x = a to x = b as the area under the curve (shaded region).

See Riley Section 2.2.

The idea of integration should already be familiar to you as the area under a curve. In Figure 5.1 the definite integral
over f (x) from x = a to x = b is represented as the area under the curve (of f (x)) between x = a and x = b. The value
of this integral is written as

I = ∫_a^b f(x) dx.  (5.1)

5.1 Integration From First Principles

We can derive the form of an integral by subdividing the finite interval a ≤ x ≤ b into a large number of small segments
of width ∆x as depicted in Figure 5.2. In Figure 5.2a these segments are constructed such that the area is overestimated
whereas in Figure 5.2b they are formed such that the area is underestimated.

We define the endpoint of the ith segment as xᵢ. In the case where we overestimate the area, the value of the function at xᵢ is labelled Mᵢ, and where we underestimate the area, the value of the function at xᵢ is labelled mᵢ.

Figure 5.2: Integration from first principles. The area under a curve can be estimated by adding the areas of a large number of rectangular segments between a and b. The rectangles can be constructed such that we always overestimate the area beneath the curve (a) or such that we underestimate it (b).

In each case we can calculate the area under the curve by adding the areas of all n segments together. For the overestimate we have

S = M₁∆x + M₂∆x + . . . + Mₙ∆x = Σ_{i=1}^{n} Mᵢ∆x

and for the underestimate we have

S̄ = m₁∆x + m₂∆x + . . . + mₙ∆x = Σ_{i=1}^{n} mᵢ∆x.

It is clear that

S ≥ S̄.

In both cases the number of segments between the limits a and b is given by

n = (b − a)/∆x.

If we were to increase the number of segments, that is increase n, then the value of the underestimate would increase whereas the value of the overestimate would decrease.

It is important to note that S is bounded from below and S̄ is bounded from above, such that

S ≥ I ≥ S̄,

where I is the common limit of S and S̄ as n → ∞†.


† There is a little more to do if we really want to prove that the limit of both areas is the same value I, but we will stick with intuitive reasoning.

This suggests the following definition of a definite integral.

Definite integral of a continuous and non-negative function: let f(x) be a continuous non-negative function on the closed interval a ≤ x ≤ b. If xᵢ is any point in the ith segment of length ∆x, the definite integral of f(x) over the interval [a, b], written symbolically as

I = ∫_a^b f(x) dx,

is defined to be

I = ∫_a^b f(x) dx = lim_{n→∞} Σ_{i=1}^{n} f(xᵢ)∆x.  (5.2)

Let’s look at an example.

Example 1: Evaluate the definite integral

I = ∫_a^b x² dx,   where a < b.

Solution: As the function is continuous and non-negative over the domain in which we are interested, we start by considering a convenient partition in which [a, b] is divided into n equal sub-intervals, each of length

∆x = (b − a)/n.

To do this we use the definition in Equation 5.2. We first identify xᵢ with the right-hand end point of the ith segment, so that we have

x₁ = a + ∆x,  x₂ = a + 2∆x,  x₃ = a + 3∆x,  . . . ,  xₙ = a + n∆x = b.

Hence, from Equation 5.2,

I = lim_{n→∞} Σ_{i=1}^{n} (a + i∆x)² ∆x.

Expanding and grouping the terms of the summation then gives

I = lim_{n→∞} [ na²∆x + 2a(∆x)²(1 + 2 + 3 + . . . + n) + (∆x)³(1² + 2² + 3² + . . . + n²) ].

Using the fact that ∆x = (b − a)/n and that

1 + 2 + 3 + . . . + n = n(n + 1)/2

and

1² + 2² + 3² + . . . + n² = n(n + 1)(2n + 1)/6,

it follows that

I = lim_{n→∞} [ a²(b − a) + a(b − a)² (n + 1)/n + (b − a)³ (n + 1)(2n + 1)/(6n²) ].

Thus taking the limit we find

I = (1/3)(b³ − a³),

and so

∫_a^b x² dx = (1/3)(b³ − a³).
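The limiting construction above can be mimicked on a computer (an illustrative sketch, not part of the notes): a right-endpoint sum with a large but finite n should approach (b³ − a³)/3.

```python
def riemann(f, a, b, n):
    # Right-endpoint sum: sum over i of f(a + i*dx) * dx, as in the construction above.
    dx = (b - a) / n
    return sum(f(a + i * dx) * dx for i in range(1, n + 1))

a, b = 1.0, 3.0
exact = (b**3 - a**3) / 3
approx = riemann(lambda x: x**2, a, b, 200000)
assert abs(approx - exact) < 1e-4
```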

You will see in PH1120 many uses for integrals beyond the calculation of the area under a curve: for example, the computation of the arc length of a curve (a line integral), and the computation of surface integrals and volume integrals when we go to more dimensions. All to come next term.

5.2 Integration of an Arbitrary Continuous Function

Most functions have both positive and negative values in their domain of definition, so the notion of a definite integral as formulated so far may seem limited by the restrictive assumption that the integrand must be an essentially positive quantity.

However, nothing in the argument used requires the upper or lower sums (S, S̄) or the terms comprising them to be
non-negative. Since a term in either of these sums will be negative when mi or Mi is negative, that is when f (xi ) is
negative, it follows that the interpretation of the definite integral as an area may be extended to continuous functions
which assume negative values provided that areas below the x-axis are regarded as negative.

Thus using this convention we may modify the definition of a definite integral as an area by removing the condition on
the integrand being non-negative. This modification simply amounts to the deletion of the word non-negative such that
we have the modified definition.

For completeness the definition is restated without the non-negative restriction.

Definite integral of a continuous function: let f(x) be a continuous function on the closed interval a ≤ x ≤ b. If xᵢ is any point in the ith segment of length ∆x, the definite integral of f(x) over the interval [a, b], written symbolically as

I = ∫_a^b f(x) dx,

is defined to be

I = ∫_a^b f(x) dx = lim_{n→∞} Σ_{i=1}^{n} f(xᵢ)∆x.  (5.3)

5.2.1 Properties of Definite Integrals

Let f (x) and g(x) be continuous functions defined on the closed interval a ≤ x ≤ b, and let c be a constant and k be such
that a ≤ k ≤ b. Then
a) ∫_a^b f(x) dx = ∫_a^k f(x) dx + ∫_k^b f(x) dx (additivity with respect to the interval of integration),

b) ∫_a^b c f(x) dx = c ∫_a^b f(x) dx (homogeneity),

c) ∫_a^b (f(x) + g(x)) dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx (linearity).

5.3 Integration as the inverse of Differentiation

Figure 5.3: Graphic demonstrating how integration is the inverse of differentiation.

The definite integral has been defined as the area under a curve between two fixed limits. Now consider the integral
F(x) = ∫_a^x f(u) du,  (5.4)
in which the lower limit a remains fixed but the upper limit x is now a variable. It should be clear that this form is
essentially a restatement of Equation 5.1 but where the variable x in the integrand has been replaced by a new variable
u. It is conventional to rename the dummy variable in the integrand in this way in order that the same variable does
not appear in both the integral and the integration limits.

To make the link between the form of the integral in Equation 5.4 and the assertion that integration is the inverse process
to differentiation we consider Equation 5.4 but shift x to x + ∆x such that

F(x + ∆x) = ∫_a^{x+∆x} f(u) du.  (5.5)

We can split an integral into different segments. For example, we can write

F(x + ∆x) = ∫_a^x f(u) du + ∫_x^{x+∆x} f(u) du = F(x) + ∫_x^{x+∆x} f(u) du,  (5.6)

where intuitively in terms of the area under a curve we are just splitting the area in two parts as shown in Figure 5.3.

Rearranging and dividing through by ∆x we find

[F(x + ∆x) − F(x)]/∆x = (1/∆x) ∫_x^{x+∆x} f(u) du.  (5.7)

Now taking the limit of both sides as ∆x → 0


lim_{∆x→0} [F(x + ∆x) − F(x)]/∆x = dF/dx = lim_{∆x→0} (1/∆x) ∫_x^{x+∆x} f(u) du = lim_{∆x→0} (1/∆x) f(x)∆x = f(x).

In the penultimate step we have appealed to our understanding of how a definite integral is constructed in terms of narrow
segments of width ∆x.

The result here is that


dF/dx = f(x),  (5.8)
and so the derivative of the integral is the original function f (x) itself. We can make this even more apparent by
substituting Equation 5.4 into Equation 5.8 to find

d/dx ∫_a^x f(u) du = f(x).  (5.9)

From these last two equations we can see that integration is the inverse of differentiation.

We see however, that the lower limit in the above, a, is arbitrary and so differentiation does not have a unique inverse.

Any function F(x) obeying Equation 5.4 is called an indefinite integral of f(x).

Consider the function F(x) = x³ + 8. Suppose we write down its derivative as f(x), that is f(x) = dF(x)/dx. For our example we can straightforwardly calculate the derivative as

f(x) = dF(x)/dx = 3x².
Suppose now that we want to work in the opposite direction, that is, we want to find the function that has a derivative
equal to f (x) = 3x2 . Clearly one answer is x3 + 8. We say that F (x) = x3 + 8 is an anti-derivative of f (x) = 3x2 .

There are however, many other functions that have the derivative 3x2 , all related by a constant. E.g. x3 + 2, x3 − 200, x3 ,
etc. The reason of course is that the constant term disappears during differentiation. This is exactly the arbitrariness of
the lower limit in Equation 5.4.

In general therefore, a function F (x) is an anti-derivative of f (x) if dF/dx = f (x). If F (x) is an anti-derivative of f (x)
then so too is F (x) + C, where C represents any constant.

As a consequence we must allow for this when we calculate the indefinite integral and include an additional constant

∫^x f(u) du = ∫ f(x) dx = F(x) + C,  (5.10)

where C is an arbitrary constant of integration.

It should also be noted that the definite integral between x = a and x = b can be written in terms of F(x):

∫_a^b f(x) dx = ∫_{x₀}^b f(x) dx − ∫_{x₀}^a f(x) dx = F(b) − F(a),  (5.11)

where x₀ is any third fixed point. Using the notation F′(x) = dF/dx,

∫_a^b F′(x) dx = F(b) − F(a) = [F]_a^b.  (5.12)

5.4 Integration By Inspection

There are a number of integration results that you should learn off by heart. For example the following are important
ones to remember, however you must still be able to derive them when asked.

f(x)        F(x)
ax^n        a x^{n+1}/(n + 1)
sin ax      −(1/a) cos ax
cos ax      (1/a) sin ax
e^{ax}      (1/a) e^{ax}
1/x         ln x

Table 1: Common integration results for ∫ f(x) dx = F(x) + C, where C is an arbitrary constant. For more see the Formula Booklet section 11.1. Note these are useful to remember but you must be able to derive them if asked.

5.5 Integration by Substitution

5.5.1 Indefinite Integrals

Sometimes it is possible to make a substitution of variables that turns a complicated integral into a simpler one, which
can be integrated by a standard method. There are many useful substitutions and knowing which to use is a matter of
experience. A list of these is given in the Formula Booklet 11.2. As always the list is for reference but you should learn
to use these substitutions and be able to choose which to use.

We will now go through a number of examples for indefinite integrals and later go through the definite integrals.

Example 1:

I = ∫ 1/√(a² − x²) dx.  (5.13)

Make the substitution x = a sin u. We note that

dx/du = a cos u,  →  dx = a cos u du.  (5.14)

Putting these into I we have

I = ∫ 1/(a√(1 − sin²u)) × a cos u du = ∫ cos u/√(cos²u) du = ∫ du = u + c,  (5.15)

where c is an arbitrary constant.

Now substituting back in


I = sin⁻¹(x/a) + c.  (5.16)
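The substitution result can be sanity-checked numerically (an illustrative sketch): a simple midpoint-rule estimate of the definite integral of 1/√(a² − x²) should match the difference of sin⁻¹(x/a) at the limits. Here a = 2 and the limits 0 to 1 are arbitrary choices.

```python
import math

def midpoint(f, a, b, n=100000):
    # Simple midpoint-rule estimate of a definite integral.
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

A = 2.0  # the constant a of the integrand, capitalised to avoid the limit names
numeric = midpoint(lambda x: 1.0 / math.sqrt(A**2 - x**2), 0.0, 1.0)
closed_form = math.asin(1.0 / A) - math.asin(0.0 / A)  # sin^{-1}(x/a) at the limits
assert abs(numeric - closed_form) < 1e-6
```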

Example 2: Integrals of the form

I = ∫ 1/(a + b cos x) dx   or   I = ∫ 1/(a + b sin x) dx.  (5.17)

In these cases, making the substitution t = tan x/2 yields integrals that can be solved more easily.

We first note that if t = tan(x/2) we can find expressions for sin(x/2) and cos(x/2). To show this, we start with the identity

1 = cos²(x/2) + sin²(x/2),

and divide both sides by cos²(x/2) to get

1/cos²(x/2) = 1 + tan²(x/2) = 1 + t².

Rearranging we find

cos(x/2) = 1/√(1 + t²)

and

sin(x/2) = cos(x/2) tan(x/2) = t/√(1 + t²).

Moreover, we can write

cos x = cos²(x/2) − sin²(x/2) = (1 − t²)/(1 + t²),  (5.18)
sin x = 2 sin(x/2) cos(x/2) = 2t/(1 + t²).  (5.19)

Before we apply the change of variable t = tan(x/2) to the integral I we also note that

dt/dx = (1/2) sec²(x/2) = (1/2)(1 + tan²(x/2)) = (1/2)(1 + t²),

leading to

dx = 2/(1 + t²) dt.

Now let’s look at a specific example (we can do the general case but we leave that to an optional problem set challenge).

Example 2a:

I = ∫ 1/(1 + 3 cos x) dx.  (5.20)

Now we apply the change of variable t = tan(x/2), with cos x = (1 − t²)/(1 + t²) and dx = 2/(1 + t²) dt:

I = ∫ 1/(1 + 3(1 − t²)(1 + t²)⁻¹) × 2/(1 + t²) dt = ∫ 2/(1 + t² + 3(1 − t²)) dt = ∫ 2/(4 − 2t²) dt
  = ∫ 2/[(2 − √2 t)(2 + √2 t)] dt = ∫ (1/2) [1/(2 − √2 t) + 1/(2 + √2 t)] dt.

We can now do each of the partial fractions in turn. Using the substitution u = 2 − √2 t for the first term, we have

Ia = ∫ 1/(2 − √2 t) dt = −∫ 1/(√2 u) du = −(1/√2) ln u = −(1/√2) ln(2 − √2 t).  (5.21)

Similarly for the second term, with u = 2 + √2 t,

Ib = ∫ 1/(2 + √2 t) dt = ∫ 1/(√2 u) du = (1/√2) ln u = (1/√2) ln(2 + √2 t),  (5.22)

combining the two (I = (1/2)(Ia + Ib)) to get

I = −(1/(2√2)) [ln(2 − √2 t) − ln(2 + √2 t)].  (5.23)

Recalling that t = tan x/2, we finally arrive at


" √ #
1 2 + 2 tan(x/2)
I = √ ln √ .
2 2 2 − 2 tan(x/2)

Example 3: Logarithmic Integration

Calculate the integral


I = ∫ (3x² − sin x)/(x³ + cos x) dx.

We can do this by substitution: let u = x³ + cos x, then du = (3x² − sin x) dx. Applying these to the integral we have

I = ∫ (1/u) du = ln u + c = ln[x³ + cos x] + c.

Through this example we can see a pattern: if the numerator contains the derivative of the denominator (potentially multiplied by a constant), the integral will be the ln of the denominator. That is,

∫ f′(x)/f(x) dx = ln f(x) + c.  (5.24)

This follows from the derivative of ln f(x) as a function of a function.

5.5.2 Solving Differential Equations with Integration

Later in the course we will see differential equations in full, but for now it is worth exploring solving first order (that is
one derivative) differential equations of the form
df(x)/dx = g(x),  (5.25)

where g(x) is some function of x. Usually we are also given some information about f (x) in the form of its value at a
particular point in x. For example, we may have that at x = a, f (a) = b.

We already know how to calculate f (x), as f (x) is just the anti-derivative of g(x) in the language we have used above.

To solve this equation, that is find the functional form of f (x) we can simply integrate both sides with respect to x.

∫ (df(x)/dx) dx = ∫ df(x) = f(x) = ∫ g(x) dx.  (5.26)

So for example if g(x) = x, we will have


f(x) = ∫ x dx = x²/2 + c,  (5.27)
where c is an arbitrary constant. We can find c by using the information given about the function f (x) evaluated at a
particular value of x. For example if f (a) = b then we have that
f(a) = b = a²/2 + c,  (5.28)

and therefore

c = b − a²/2,

with the final result for f

f(x) = x²/2 + b − a²/2.
We could have arrived at the same result in a slightly different way. We can integrate the differential equations with
respect to x but this time between two limits. As we want to find out the functional form of f as a function of x we
let the upper limit of the integral be x itself. This is similar to what we did when looking at the indefinite integrals in
Section 5.3. The lower limit in this case is very specific, we take it to be x = a, which corresponds to the condition we
have been given in the problem.

We have then that


∫_a^x (df/dx′) dx′ = ∫_a^x x′ dx′,  (5.29)

where we have distinguished the dummy variable in the integral and the upper limit x by putting a prime on the dummy
variable.

The left hand side of the integral can be rewritten as


∫_a^x (df/dx′) dx′ = ∫_{f(a)}^{f(x)} df = f(x) − f(a) = f(x) − b,

where we have effectively changed variables from x to f, with the limits changed accordingly. The right-hand side of our integral is then

f(x) − b = ∫_a^x x′ dx′ = [x′²/2]_a^x = x²/2 − a²/2.
Leading to

f(x) = x²/2 + b − a²/2,

as we had before.

5.5.3 Definite Integrals

When we change variables in a definite integral we have to take care of the limits in the integral. For definite integrals we
have to do three things when we change variables. The first is we must replace the variable in the integrand, the second
is we must change the infinitesimal width dx and lastly we must recalculate the limits.

Example: Calculate the definite integral


I = ∫_4^9 1/(√x (1 + √x)²) dx.

Make the substitution u = 1 + √x; then du = (1/2)x^{−1/2} dx, leading to dx = 2√x du. We must also calculate the limits in terms of the new variable u: for x = 4 we have u = 1 + √4 = 3, and for x = 9 we have u = 1 + √9 = 4. Changing the variable in the integral then leads to

I = ∫_4^9 1/(√x (1 + √x)²) dx = ∫_3^4 2/u² du = [−2/u]_3^4 = −2/4 + 2/3 = 1/6.

Note that you do not need to re-express the answer in terms of x; we can just evaluate the integral directly with u and the modified limits.
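As an illustrative check of the result 1/6 (not part of the notes), a direct numerical estimate of the original integral in x gives the same value without any substitution:

```python
import math

def midpoint(f, a, b, n=200000):
    # Simple midpoint-rule estimate of a definite integral.
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

numeric = midpoint(lambda x: 1.0 / (math.sqrt(x) * (1 + math.sqrt(x))**2), 4.0, 9.0)
assert abs(numeric - 1.0 / 6.0) < 1e-6
```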

5.6 Integration by Parts

Integration by parts is analogous to the product rule of differentiation, that is

d(uv)/dx = u dv/dx + v du/dx.  (5.30)

If we integrate this we have


∫ (d(uv)/dx) dx = uv = ∫ u (dv/dx) dx + ∫ v (du/dx) dx.  (5.31)

Rearranging we have the familiar formula

∫ u (dv/dx) dx = uv − ∫ v (du/dx) dx.  (5.32)

Example 1: Calculate
I = ∫_0^π x sin x dx.

Using Equation 5.32, we let u = x and dv/dx = sin x to find

I = [−x cos x]_0^π − ∫_0^π (−cos x) dx = [−x cos x]_0^π + [sin x]_0^π = π.
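The result π can be confirmed numerically (an illustrative sketch using a midpoint-rule sum):

```python
import math

def midpoint(f, a, b, n=200000):
    # Simple midpoint-rule estimate of a definite integral.
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

numeric = midpoint(lambda x: x * math.sin(x), 0.0, math.pi)
assert abs(numeric - math.pi) < 1e-6
```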

Example 2: Calculate

I = ∫ ln x dx.

In this case we know that the integral of ln x is not a standard result. We can use integration by parts to perform this
integral by letting u = ln x and dv/dx = 1 so that, using Equation 5.32 we have
I = x ln x − ∫ x (1/x) dx = x ln x − x + c,  (5.33)

where c is an arbitrary constant.

5.7 Differentiation of an Integral Containing a Parameter

In PH1120 you will see this topic again in a slightly more complicated set-up. Here we will consider the derivative of an
integral where the limits are constants (in PH1120 the limits will be functions of x.).

It can sometimes happen that an integrand, in addition to being a function of x, also depends on a parameter, say α. So, for example, if f(x, α) is both integrable with respect to x over the interval [a, b] and differentiable with respect to α, then

d/dα ∫_a^b f(x, α) dx = ∫_a^b (df/dα) dx.  (5.34)

Example: Find the derivative with respect to α of the integral I(α) = ∫_a^b e^{αx} dx, where a and b are constants.

dI/dα = d/dα ∫_a^b e^{αx} dx = ∫_a^b d(e^{αx})/dα dx = ∫_a^b x e^{αx} dx.
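Equation 5.34 can be checked numerically for this example (an illustrative sketch; the choices a = 0, b = 1 and α = 0.5 are arbitrary): differentiating the integral with a finite difference in α should agree with integrating x e^{αx}.

```python
import math

def midpoint(f, a, b, n=20000):
    # Simple midpoint-rule estimate of a definite integral.
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

def I(alpha, a=0.0, b=1.0):
    return midpoint(lambda x: math.exp(alpha * x), a, b)

alpha, h = 0.5, 1e-5
lhs = (I(alpha + h) - I(alpha - h)) / (2 * h)                # d/d(alpha) of the integral
rhs = midpoint(lambda x: x * math.exp(alpha * x), 0.0, 1.0)  # integral of the alpha-derivative
assert abs(lhs - rhs) < 1e-5
```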

5.8 An Aside on Hyperbolic Functions

We can define the following functions, which are called hyperbolic functions:

sinh x = (e^x − e^{−x})/2,
cosh x = (e^x + e^{−x})/2.

Note that cosh x is an even function and sinh x is an odd function, just like their trigonometric relations.

In analogy with trigonometric functions, the remaining hyperbolic functions are

tanh x = sinh x / cosh x = (e^x − e^{−x})/(e^x + e^{−x})
sech x = 1/cosh x = 2/(e^x + e^{−x})
cosech x = 1/sinh x = 2/(e^x − e^{−x})   (5.35)
coth x = 1/tanh x = (e^x + e^{−x})/(e^x − e^{−x}).

We can connect hyperbolic functions with their trigonometric counterpart. We can rewrite sin x and cos x in terms of
exponentials as

cos x = (e^{ix} + e^{−ix})/2,
sin x = (e^{ix} − e^{−ix})/(2i).

If now we replace x with ia we find

cos ia = (e^{−a} + e^{a})/2 = (e^{a} + e^{−a})/2,   (5.36)
sin ia = (e^{−a} − e^{a})/(2i) = (i/2)(e^{a} − e^{−a}).

From these equations we find that

cosh x = cos ix
i sinh x = sin ix
cos x = cosh ix (5.37)
i sin x = sinh ix.

These relationships lead to a number of similarities between hyperbolic and trigonometric functions in particular in their
calculus and identities.
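These relations can be checked directly with complex arithmetic. A small sketch (Python's `cmath`, not part of the notes; x = 0.8 is an arbitrary test point):

```python
import cmath
import math

x = 0.8
# cosh x = cos ix and i sinh x = sin ix (Equation 5.37)
assert abs(cmath.cos(1j * x) - math.cosh(x)) < 1e-12
assert abs(cmath.sin(1j * x) - 1j * math.sinh(x)) < 1e-12
# and the hyperbolic analogue of the Pythagorean identity:
assert abs(math.cosh(x) ** 2 - math.sinh(x) ** 2 - 1.0) < 1e-12
```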

5.8.1 Identities of Hyperbolic Functions

We can find a number of hyperbolic analogues of the trigonometric identities. For example, we can derive the hyperbolic
analogue of cos² x + sin² x = 1 by using the relationships in Equation 5.37. Replacing x with ix we get

sin² ix = −sinh² x

and
cos² ix = cosh² x.

The identity therefore becomes

cosh² x − sinh² x = 1.

Some other identities that can be derived in a similar manner are

sech² x = 1 − tanh² x,
cosech² x = coth² x − 1,
sinh 2x = 2 sinh x cosh x,   (5.38)
cosh 2x = cosh² x + sinh² x.

5.8.2 Inverses of Hyperbolic Functions

Just like trigonometric functions, hyperbolic functions have inverses. We can find closed-form expressions for their inverses.

Example: Find the inverse hyperbolic function y = sinh−1 x.

First write x as a function of y, that is

y = sinh⁻¹ x   ⇒   x = sinh y.

Now since cosh y = (e^y + e^{−y})/2 and sinh y = (e^y − e^{−y})/2, we can write

e^y = cosh y + sinh y   (5.39)
    = √(1 + sinh² y) + sinh y
e^y = √(1 + x²) + x,

and hence

y = sinh⁻¹ x = ln(√(1 + x²) + x).

In a similar manner we can show that

cosh⁻¹ x = ln(√(x² − 1) + x)

and that

tanh⁻¹ x = (1/2) ln[(1 + x)/(1 − x)].
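Each of these logarithmic forms can be compared against the library inverse hyperbolic functions. A sketch (Python, standard library; the sample points are arbitrary but kept inside each function's domain):

```python
import math

# sinh^{-1} x = ln(sqrt(1 + x^2) + x) for all real x
for x in (0.3, 1.7, 5.0):
    assert abs(math.asinh(x) - math.log(math.sqrt(1 + x * x) + x)) < 1e-12

# cosh^{-1} x = ln(sqrt(x^2 - 1) + x) for x >= 1
for x in (1.5, 3.0):
    assert abs(math.acosh(x) - math.log(math.sqrt(x * x - 1) + x)) < 1e-12

# tanh^{-1} x = (1/2) ln((1 + x)/(1 - x)) for |x| < 1
for x in (-0.6, 0.2):
    assert abs(math.atanh(x) - 0.5 * math.log((1 + x) / (1 - x))) < 1e-12
```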

5.8.3 Calculus of Hyperbolic Functions

We have just seen that the identities of hyperbolic functions are closely related to the identities of their trigonometric
counterparts. Their calculus is also closely related as we will see now.

The derivatives of the two basic hyperbolic functions are

d(sinh x)/dx = cosh x   (5.40)
d(cosh x)/dx = sinh x.   (5.41)

Notice the lack of a minus sign in the second expression. We can verify these derivatives by rewriting the hyperbolic
function in terms of the exponential forms. For example
 
d(sinh x)/dx = d/dx[(e^x − e^{−x})/2] = (e^x + e^{−x})/2 = cosh x.

Similarly we can show that

d(tanh x)/dx = sech² x,   (5.42)
d(sech x)/dx = −sech x tanh x,   (5.43)
d(cosech x)/dx = −cosech x coth x,   (5.44)
d(coth x)/dx = −cosech² x.   (5.45)
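The signs in these derivatives are easy to confirm with finite differences. A sketch (Python, standard library; `sech` is a helper defined here for convenience, and the step size and test points are arbitrary):

```python
import math

h = 1e-6
sech = lambda x: 1.0 / math.cosh(x)

for x in (-1.0, 0.5, 2.0):
    # d/dx tanh x = sech^2 x
    num = (math.tanh(x + h) - math.tanh(x - h)) / (2 * h)
    assert abs(num - sech(x) ** 2) < 1e-8
    # d/dx sech x = -sech x tanh x  (note the minus sign)
    num = (sech(x + h) - sech(x - h)) / (2 * h)
    assert abs(num + sech(x) * math.tanh(x)) < 1e-8
```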

5.9 Hyperbolic Functions In Integration by Substitution

Consider the integral of the form

I = ∫ 1/√(x² + a²) dx.   (5.47)

The most convenient substitution in this case is x = a sinh u. Then dx = a cosh u du and

I = ∫ a cosh u / √(a²(1 + sinh² u)) du.   (5.48)

Now using the identity cosh² u − sinh² u = 1, we find

I = ∫ du = u + c = sinh⁻¹(x/a) + c,   (5.49)

where c is an arbitrary constant.
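A numerical check of this result (Python, standard library; the `simpson` helper, the choice a = 2 and the interval [0, 1] are ours, and `math.asinh` plays the role of sinh⁻¹):

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

a = 2.0
# Definite version of (5.47): integral over [0, 1] equals asinh(x/a) at the endpoints.
num = simpson(lambda x: 1.0 / math.sqrt(x * x + a * a), 0.0, 1.0)
exact = math.asinh(1.0 / a) - math.asinh(0.0)
assert abs(num - exact) < 1e-10
```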

6 Differential Equations

A great many applied problems in Mathematics involve rates, that is derivatives and the way in which a quantity changes
as a function of an independent variable, such as time. Let us look at a few examples.

Newton’s second law is written as F = ma. If we write the acceleration as dv/dt, where v is the velocity, or as d²x/dt²,
where x is the displacement, we have a differential equation. Thus any mechanics problem in which we want to describe
the motion of a body under the action of a given force involves the solution of a differential equation.

A further example is to consider a simple electrical circuit containing a resistor, R, a capacitor, C, an inductor, L,
and a source of emf V . If the current flowing around the circuit at a time t is I(t) and the charge on the capacitor is q(t),
then I = dq/dt. The voltage across R is RI, the voltage across C is q/C, and the voltage across L is L(dI/dt). Then at
any time we have

L (dI/dt) + RI + q/C = V.

If we differentiate this equation with respect to t and use dq/dt = I, we have

L (d²I/dt²) + R (dI/dt) + I/C = dV/dt

as the differential equation satisfied by the current I in a simple series circuit with given L, R and C, and a given V (t).

An equation containing derivatives is called a differential equation. For us in this course we will only consider ordinary
differential equations, that is, a differential equation that contains derivatives with respect to a single variable. In PH1120
you will see partial differential equations which contain partial derivatives.

Ordinary differential equations (ODEs) may be categorised by their general characteristics. One particularly important
property is the order of an ODE, which is given by the order of the highest derivative in the differential
equation.

Some examples

L (d²I/dt²) + R (dI/dt) + I/C = dV/dt   2nd order   (6.1)
d³y/dx³ + x (d²y/dx²) + y = e^x   3rd order   (6.2)
dy/dx + xy = 0   1st order.   (6.3)

In these differential equations I and y are the dependent variables. Notice that these equations only contain one factor
of the dependent variables or their derivatives. These differential equations are said to be linear ODEs. Examples of
non-linear differential equations are

y (dy/dx) + xy² = 0,   (6.4)
dy/dx = cot y.   (6.5)

Both the terms in equation 6.4 are non-linear, whereas the cot y term means equation 6.5 is also non-linear (think of the
Maclaurin series of cot y). We will come back to what we mean by linear later (and you will have already seen properties
of linear equations in the other half of PH1110). To summarise, a linear ODE with x the independent variable and y the
dependent variable is

a0 y + a1 y′ + a2 y″ + a3 y‴ + · · · = b,

where the coefficients ai and b are either constants or functions of x and the primes denote differentiation with respect
to the independent variable, which in this case is x.

In the next few sections we will see examples of ODEs and how to solve them.

6.1 Solving ODEs

We have already seen an example of how to solve a first order ODE in section 5.5.2.

A solution to a differential equation (in the dependent and independent variables, e.g. y and x respectively)
is a relation between y and x which, if substituted into the differential equation, gives an identity.

It is useful to consider an example to get an idea of the components of the most general solution to an ODE.

Example 1. The relation

y = sin x + C, (6.6)

is a solution of the differential equation

y′ = cos x   (6.7)

because if we substitute equation 6.6 into the differential equation we get the identity cos x = cos x.

Example 2. The 2nd order ODE


y″ = y

has solutions y = e^x or y = e^{−x} or y = Ae^x + Be^{−x} as can be verified by substitution.

Notice, if we integrate y′ = f (x), the expression for y has the form

y = ∫ f (x) dx + C,

we obtain one arbitrary constant, C.

If we integrate y″ = g(x) twice we get

y = ∫ (∫ g(x) dx) dx + Cx + D

with two arbitrary constants C and D.

We might expect that this continues with higher and higher order derivatives, for example for the nth order version
y^(n) = h(x), y will contain n arbitrary constants.

Example 1 above showed that the solution to the first order ODE contained one arbitrary constant C and in Example
2 there was a solution that contained two arbitrary constants A and B.

Importantly, any linear differential equation of order n has a solution containing n arbitrary constants. This solution is
called the General Solution of the linear ODE.

An important check when you think you have found the solution to an ODE is to substitute the solution back into the
equation to confirm that it is indeed a solution.

Once we have the general solution to our ODE and we then want to use it to describe some phenomenon, then we will
need to establish what the values of the arbitrary constants are.

We can do this by using Initial Conditions or Boundary Conditions. These are constraints on the solution to allow us to
determine the arbitrary constants. An example may be that we find the solution to a first order ODE as

y(t) = t² + A

and we are given the initial condition y(0) = 0, that is, y = 0 at t = 0. If we apply that to the solution above we find that

A = 0.

Hence the unique solution is given by

y(t) = t².

In general, the general solution to an nth order ODE with n arbitrary constants requires n conditions to determine all the
constants.

These conditions may include constraints on the solution, the derivative of the solution or similar.

6.2 First order ODE

6.2.1 Separable first order ODE

Every time you have written an integral of the form

y = ∫ f (x) dx   (6.8)

you have been solving a differential equation, namely

y′ = dy/dx = f (x).

This is a simple example of an equation which can be written with only y terms on one side of the equation with the
other side being a function of x only:

dy = f (x) dx.

Whenever we can separate the variables in a differential equation in this way, we call the ODE separable, and we get the
solution by integrating each side of the equation.

A more general example is where we have an ODE of the form

dy/dx = f (x)g(y),

then the solution is found by

∫ dy/g(y) = ∫ f (x) dx.

Example 1: The rate at which a radio-active species decays is proportional to the remaining number of atoms. If there
are N0 atoms at t = 0, find the number at time t.

We need to write down the ODE that describes how the number of radioactive atoms decreases over time. We are
told that the rate of change is proportional to the number remaining, so we know

dN/dt ∝ −N,

where the negative sign tells us that N is decreasing with time. We can replace the proportionality sign with an equals
sign and a constant of proportionality, λ, as

dN/dt = −λN.

This ODE is separable as we can rearrange it to read

dN/N = −λ dt

and we can integrate both sides to find

∫ dN/N = −∫ λ dt   (6.9)
⇒ ln N = −λt + C,   (6.10)

where C is an arbitrary constant. We can exponentiate both sides

N(t) = e^{−λt+C} = e^{−λt} e^C = Ae^{−λt},

where we have defined A = exp[C]. We see that we have found the general solution as we have one arbitrary constant.

To finish the example we have an initial condition that allows us to determine the constant A. That is, N(0) = N0.
Applying this to our general solution we find N(t) = N0 e^{−λt}.
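We can confirm that N(t) = N0 e^{−λt} really does satisfy dN/dt = −λN with a finite-difference check (Python, standard library; λ, N0 and the test times are arbitrary illustrative values):

```python
import math

lam, N0 = 0.35, 1000.0
N = lambda t: N0 * math.exp(-lam * t)

h = 1e-6
for t in (0.0, 1.0, 5.0):
    # Central-difference estimate of dN/dt should equal -lam * N(t).
    dNdt = (N(t + h) - N(t - h)) / (2 * h)
    assert abs(dNdt + lam * N(t)) < 1e-4

# The initial condition N(0) = N0 is satisfied exactly.
assert N(0.0) == N0
```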

Example 2. Find the general solution of the ODE

xy′ − xy = y.

Rearranging we have

y′ = y(1 + x)/x   ⇒   y′/y = (1 + x)/x

and therefore

dy/y = [(1 + x)/x] dx.

Integrating both sides we have that

ln y = ln x + x + C.

Exponentiating both sides we arrive at the general solution

y(x) = Axe^x.

We can check this is a solution of the original ODE by substituting back in.

6.2.2 Almost Separable first order ODEs

Sometimes an ODE may not be immediately separable but with a change of variable we can find a version of the ODE
that is separable.

Consider the ODE

y′ = f (ax + by),

where f (ax + by) is some unspecified function of ax + by, and a and b are constants. If we change variables by setting

z = ax + by,

with

dz/dx = a + b (dy/dx),

we have then that

dy/dx = (1/b)(dz/dx − a) = f (z),

leading to

dz/dx = bf (z) + a.

This ODE is now separable, leading to the integrals

∫ dz/(bf (z) + a) = ∫ dx.

Once we have performed the integration we must remember to re-substitute z = ax + by back into the general solution.

Example: Solve the ODE


dy/dx = (x + y + 1)/(x + y + 3).

As it stands this ODE is not separable, but if we let z = x + y, then dz/dx = 1 + dy/dx, leading to

dz/dx − 1 = (z + 1)/(z + 3)   ⇒   dz/dx = (2z + 4)/(z + 3).

Separating variables we have

dx = (z + 3)/(2(z + 2)) dz = (1/2)[(z + 2 + 1)/(z + 2)] dz = (1/2)[1 + 1/(z + 2)] dz.

Integrating both sides we find

[z + ln(z + 2)]/2 = x + C.

Now we must replace z = x + y to give

x + y + ln(x + y + 2) = 2x + 2C,   ⇒   y − x + ln(x + y + 2) = B,

where B = 2C and is an arbitrary constant. This is our general solution.

6.2.3 Homogeneous first order ODEs

A function f (x, y) is called homogeneous of degree n if

f (tx, ty) = tⁿ f (x, y).

For example f (x, y) = x³ + y³ is homogeneous of degree 3 as

f (tx, ty) = (tx)³ + (ty)³ = t³(x³ + y³) = t³ f (x, y).

If the ODE of interest is homogeneous then it can be written as a function of the ratio y/x. We can then use a change
of variable where y = xv to produce a separable ODE.

Example: Solve the ODE


dy/dx = (y² + xy)/x² = (y/x)² + y/x.

Set y = vx, such that

dy/dx = v + x (dv/dx),

leading to

v + x (dv/dx) = v² + v   ⇒   ∫ dv/v² = ∫ dx/x.

Integrating we have

−1/v = ln x + C,   ⇒   −x/y = ln x + C

and so

y = −x/(ln x + C).
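It is worth substituting the answer back in. A finite-difference check that y = −x/(ln x + C) satisfies dy/dx = (y/x)² + y/x (Python, standard library; C = 2 and the sample points are arbitrary, chosen away from the singularity at ln x + C = 0):

```python
import math

C = 2.0
y = lambda x: -x / (math.log(x) + C)

h = 1e-6
for x in (0.5, 1.0, 3.0):
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    rhs = (y(x) / x) ** 2 + y(x) / x
    assert abs(dydx - rhs) < 1e-6
```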

6.2.4 Homogeneous apart from a constant

We may be presented with an ODE that is very close to being homogeneous but there are some constants floating around
that spoil this. To combat this we can define new variables that are linear transformations of the original x and y.

Example: Find the solution to the ODE


dy/dx = (y + x − 5)/(y − 3x − 1).

To do this define new variables

ȳ = y + a,   x̄ = x + b.

Then using the chain rule

dy/dx = (dy/dȳ)(dȳ/dx̄)(dx̄/dx) = dȳ/dx̄ = (ȳ − a + x̄ − b − 5)/(ȳ − a − 3x̄ + 3b − 1).

To get it into the homogeneous form we must have

−a − b − 5 = 0   (6.11)
−a + 3b − 1 = 0.   (6.12)

Solving these we have that a = −4, b = −1.

We are left with
dȳ/dx̄ = (ȳ + x̄)/(ȳ − 3x̄) = (ȳ/x̄ + 1)/(ȳ/x̄ − 3).

Now following the procedure in section 6.2.3 we let ȳ = vx̄

dȳ/dx̄ = v + x̄ (dv/dx̄) = (v + 1)/(v − 3).

Rearranging this we have

x̄ (dv/dx̄) = (v + 1)/(v − 3) − v = (−v² + 4v + 1)/(v − 3).
Separating variables

(1/x̄) dx̄ = (v − 3)/(−v² + 4v + 1) dv = (3 − v)/(v² − 4v − 1) dv = (3 − v)/[(v − 2)² − 5] dv = [1 − (v − 2)]/[(v − 2)² − 5] dv.

Now we integrate both sides

∫ (1/x̄) dx̄ = ∫ (1/[(v − 2)² − 5] − (v − 2)/[(v − 2)² − 5]) dv
            = ∫ ((1/(2√5))[1/((v − 2) − √5) − 1/((v − 2) + √5)] − (v − 2)/[(v − 2)² − 5]) dv,

where in the last step we have used partial fractions to re-express the first part of the integrand, that is

1/[(v − 2)² − 5] = (1/(2√5))[1/((v − 2) − √5) − 1/((v − 2) + √5)].

We can now perform the integrals using the methods outlined in section 5 to find

ln x̄ = (1/(2√5)) ln[((v − 2) − √5)/((v − 2) + √5)] − (1/2) ln[(v − 2)² − 5] + C.

We now need to re-substitute the definition of v in terms of x and y,

v = ȳ/x̄ = (y − 4)/(x − 1)

so that

ln(x − 1) = (1/(2√5)) ln[((y − 4)/(x − 1) − 2 − √5)/((y − 4)/(x − 1) − 2 + √5)] − (1/2) ln[((y − 4)/(x − 1) − 2)² − 5] + C

ln(x − 1) = (1/(2√5)) ln[(y − (2 + √5)x − 2 + √5)/(y − (2 − √5)x − 2 − √5)] − (1/2) ln[((y − 2x − 2)/(x − 1))² − 5] + C.

6.2.5 Looks homogeneous but actually separable with a change of variable.

Consider the ODE with the general form

dy/dx = (c1 y + d1 x + e1)/(c2 y + d2 x + e2),   (6.13)

where c1, c2, d1, d2, e1 and e2 are all constants. As in section 6.2.4, we may try to make linear transformations

y = ȳ − a,   x = x̄ − b.

Applying these to the ODE we find

dȳ/dx̄ = (c1 ȳ − c1 a + d1 x̄ − d1 b + e1)/(c2 ȳ − c2 a + d2 x̄ − d2 b + e2).   (6.14)

As in section 6.2.4 we need to leave the right hand side in homogeneous form (so that we can use v = ȳ/x̄ as a substitution
as before), that is

dȳ/dx̄ = (c1 ȳ + d1 x̄)/(c2 ȳ + d2 x̄),   (6.15)
meaning that we must have

−c1 a − d1 b + e1 = 0
−c2 a − d2 b + e2 = 0.

We can solve for the parameters a and b. Re-arranging the first equation for a we have

a = (e1 − d1 b)/c1

and substituting this into the second we find

−c2 (e1 − d1 b)/c1 − d2 b + e2 = 0,   ⇒   b = (e1 c2 − e2 c1)/(c2 d1 − c1 d2).

We can see from this result that we can find a value for b except when c2 d1 − c1 d2 = 0, or rearranging, when

c2 d1 = c1 d2   or   c1/c2 = d1/d2.

If the above is true we must consider a different method. If we apply this condition to the original ODE equation 6.13
we find that

dy/dx = ((d1/d2) c2 y + d1 x + e1)/(c2 y + d2 x + e2) = ((d1/d2)(c2 y + d2 x) + e1)/(c2 y + d2 x + e2).   (6.16)

Notice that the numerator contains a term that is some factor (d1/d2) times c2 y + d2 x and this same combination of x
and y appears in the denominator.

This allows us to make the substitution v = c2 y + d2 x where

dv/dx = c2 (dy/dx) + d2

and we can rewrite the ODE as

dv/dx = c2 [(d1/d2) v + e1]/(v + e2) + d2.

Now we have an ODE that is separable and we can solve in the usual way.

Example: Find the solution of the ODE

dy/dx = (y − 3x − 2)/(2y − 6x − 5).   (6.17)

We spot that the same combination of x and y appears in the numerator and denominator, allowing us to make the
substitution v = y − 3x, leaving

dv/dx = (v − 2)/(2v − 5) − 3,

which can then be separated and integrated as normal. This ODE is exactly of the form described by equation 6.16, with
c1 = 1, c2 = 2, d1 = −3 and d2 = −6, and indeed we see that c1/c2 = d1/d2.

6.2.6 Exact equations

Sometimes first order ODEs can be solved by writing the LHS as an exact differential. Consider the following ODE

f (x) (dy/dx) + (df (x)/dx) y = Q(x).

We can write the LHS as an exact differential, that is

f (x) (dy/dx) + (df (x)/dx) y = d(f (x)y)/dx,

leading to

d(f (x)y)/dx = Q(x).

This can then be solved by using separation of variables

∫ d(f (x)y) = f (x)y = ∫ Q(x) dx   ⇒   y = (1/f (x)) ∫ Q(x) dx.

Example: Solve the ODE

x (dy/dx) + y = 2x.

We can rewrite the LHS as

x (dy/dx) + y = d(yx)/dx

and solve

d(yx)/dx = 2x

by separation of variables

∫ d(yx) = yx = ∫ 2x dx = x² + c   ⇒   y = x + c/x.
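Substituting y = x + c/x back into x(dy/dx) + y = 2x confirms the result exactly. A sketch (Python; c = 3 and the test points are arbitrary, with the derivative written out by hand):

```python
c = 3.0
y = lambda x: x + c / x
dy = lambda x: 1 - c / (x * x)  # exact derivative of the trial solution

for x in (0.5, 1.0, 4.0):
    # x y' + y = x - c/x + x + c/x = 2x, so the residual should vanish.
    assert abs(x * dy(x) + y(x) - 2 * x) < 1e-12
```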
x

6.2.7 Integrating Factor

The general form for a first order linear ODE is

f (x) (dy/dx) + P (x)y = Q(x),   (6.18)

where f (x), P (x) and Q(x) are all unspecified functions of x. If we divide through by f (x) we have

dy/dx + [P (x)/f (x)] y = Q(x)/f (x).   (6.19)

The integrating factor, I, is then multiplied through on both sides of the equation such that

I (dy/dx) + I [P (x)/f (x)] y = I Q(x)/f (x).   (6.20)

Our task now is to find an integrating factor such that the left hand side can be written as an exact differential d(uv)/dx.
Using the product rule, we want to find u and v by directly comparing with the LHS of equation 6.20, that is

d(uv)/dx = u (dv/dx) + v (du/dx)   (6.21)
I Q(x)/f (x) = I (dy/dx) + y I [P (x)/f (x)],   (6.22)

we can say that

u = I,
dv/dx = dy/dx,
v = y,
du/dx = I P (x)/f (x).

From the first and fourth we can see that

du/dx = dI/dx = I P (x)/f (x).

We can thus find I from this last equality by solving the ODE

∫ dI/I = ∫ [P (x)/f (x)] dx   ⇒   I = exp(∫ [P (x)/f (x)] dx).

This means that we can now write equation 6.20 as an exact differential

I (dy/dx) + y I [P (x)/f (x)] = d(yI)/dx = I Q(x)/f (x).

We can now solve the ODE in the last equality by writing

yI = ∫ I [Q(x)/f (x)] dx,

with

I = exp(∫ [P (x)/f (x)] dx).

Example: Solve using an integrating factor the ODE

dy/dx + 3y/x = e^x/x³.

The integrating factor is given by

I = exp(∫ (3/x) dx) = exp(3 ln x) = x³.

Multiplying the ODE through by I we have

x³ (dy/dx) + 3yx² = e^x,

where we can write the LHS as

x³ (dy/dx) + 3yx² = d(x³y)/dx,

which is just d(Iy)/dx. We thus have

d(x³y)/dx = e^x,

which can be separated as

d(x³y) = e^x dx   ⇒   x³y = e^x + c   ⇒   y = (e^x + c)/x³.
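Again we can substitute the answer back into the original ODE dy/dx + 3y/x = e^x/x³. A sketch (Python, standard library; c = 1.5 and the sample points are arbitrary, with x ≠ 0 and the derivative written out by hand):

```python
import math

c = 1.5
y = lambda x: (math.exp(x) + c) / x ** 3
dy = lambda x: math.exp(x) / x ** 3 - 3 * (math.exp(x) + c) / x ** 4

for x in (0.5, 1.0, 2.0):
    # y' + 3y/x should reproduce e^x / x^3 identically.
    assert abs(dy(x) + 3 * y(x) / x - math.exp(x) / x ** 3) < 1e-9
```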
x3

6.2.8 Bernoulli equation

The Bernoulli equation is non-linear, but can be reduced to a linear ODE with a suitable change of variable. The general
form of the Bernoulli equation is

dy/dx + P (x)y = Q(x)yⁿ,   (6.23)

where P and Q are functions of x. If we make the substitution

z = y^{1−n},

then

dz/dx = (1 − n)y^{−n} (dy/dx).

If we multiply equation 6.23 by (1 − n)y^{−n} we have

(1 − n)y^{−n} (dy/dx) + P (x)(1 − n)y^{1−n} = (1 − n)Q(x),

and using the definition of z and its derivative we can write

dz/dx + P (x)(1 − n)z = (1 − n)Q(x),

which is now a first order linear ODE and can be solved using an integrating factor.
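As a concrete illustration (our own example, not from the notes), take P(x) = 1, Q(x) = 1 and n = 2, i.e. y′ + y = y². The substitution z = y⁻¹ gives z′ − z = −1, whose integrating-factor solution z = 1 + Ce^x yields y = 1/(1 + Ce^x); the sketch below verifies this numerically:

```python
import math

C = 0.5
y = lambda x: 1.0 / (1.0 + C * math.exp(x))  # candidate solution of y' + y = y^2

h = 1e-6
for x in (-1.0, 0.0, 1.0):
    # Central-difference y' should satisfy y' + y - y^2 = 0.
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    assert abs(dydx + y(x) - y(x) ** 2) < 1e-9
```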

7 Second Order Differential Equations

We will now take a look at second order ODEs. The general form of a second order ODE is

a (d²y/dx²) + b (dy/dx) + cy = f (x),   (7.1)

where the coefficients a, b and c may be functions of x.

7.1 Homogeneous second order ODE with constant coefficients

We will consider first the case where a, b, c in equation 7.1 are constants and the right hand side f (x) = 0. With the right
hand side set to zero, this type of second order ODE is called homogeneous, as each term contains a single factor
of y or its first or second derivative. The second order ODE we wish to solve is then

a (d²y/dx²) + b (dy/dx) + cy = 0.   (7.2)

We can construct the general solution of an ODE by exploiting the linear nature of the homogeneous ODE. We recall
that the general solution of a second order ODE must contain two arbitrary constants.

For a linear homogeneous ODE

• If y is a solution of a particular ODE then so is Ay, where A is an arbitrary constant.

• If y1 and y2 are linearly independent and are both solutions to a linear homogeneous ODE then the combination
y = y1 + y2 is also a solution of the ODE.

We can combine these: If y1 and y2 are both independent solutions to a linear homogeneous ODE then the combination
y = Ay1 + By2 is the general solution of the homogeneous ODE.

For example, we have the linear second order ODE

y″ − y = 0.

We know that the most general solution of this second order ODE should contain two arbitrary constants. We also know
that both y1 = e^x and y2 = e^{−x} are individual solutions of this ODE. Using the properties above we also know therefore
that

y = Ae^x + Be^{−x}

is also a solution of y″ = y. This has two arbitrary constants and is therefore the general solution of the second order
ODE.

Note that the two contributions to the general solution are linearly independent. That is, e^x cannot be written in terms
of some constant times e^{−x} and vice versa.

If we had not known about the e^{−x} solution, we may have thought that we could have constructed

y = Ae^x + Be^x

with two arbitrary constants A and B. However, we can combine both of these terms into a single term

y = Ce^x,

where C = A + B, leaving only one arbitrary constant in our solution, indicating that we have yet to find the full general
solution.

7.2 Finding Solutions to 2nd order linear ODEs

The standard way to find the solution to a second order homogeneous ODE, y, is by using a trial solution of the form

y = e^{mx}.

Substituting this into equation 7.2 we find

am²e^{mx} + bme^{mx} + ce^{mx} = (am² + bm + c)e^{mx} = 0.

Since the exponential term is never zero we can divide it out to obtain the characteristic (or auxiliary) equation,

am² + bm + c = 0.   (7.3)

In general for an nth order ODE one finds an nth order polynomial, which therefore has up to n roots, which may be
complex.

In our example with a second order equation the roots are:

m± = [−b ± √(b² − 4ac)]/(2a).   (7.4)

The general solution is then

y = Ae^{m+ x} + Be^{m− x}.

7.2.1 Real Roots

For the case where the two roots are unequal, i.e., b² ≠ 4ac and therefore m+ ≠ m−, we find two independent solutions
to our second order ODE. The most general solution is then the combination of these two solutions and is written

y(x) = Ae^{m+ x} + Be^{m− x},   (7.5)

where A and B are arbitrary constants. As we learnt in section 6.1, for an nth order ODE we expect n arbitrary constants
in the general solution. If we do not have n then we need to use a different method to find the remaining parts of the
general solution. We will come back to this point later.

If the coefficients of the equation are such that b² > 4ac, then the roots are real and the solution given by Eq. (7.5) is
also real.

7.2.2 Complex Roots

If b² < 4ac, then the two roots are complex, and can be written

m+ = −b/(2a) + i√(4ac − b²)/(2a) ≡ p + iq,   (7.6)
m− = −b/(2a) − i√(4ac − b²)/(2a) ≡ p − iq,   (7.7)

where

p = Re(m+) = Re(m−) = −b/(2a),   (7.8)
q = Im(m+) = −Im(m−) = √(4ac − b²)/(2a).   (7.9)

The general solution can then be written as

y(x) = c1 e^{(p+iq)x} + c2 e^{(p−iq)x} = e^{px}(c1 e^{iqx} + c2 e^{−iqx}),   (7.10)

where c1 and c2 are arbitrary constants and can in principle be complex. We can check that this is indeed a
solution by substituting it back into equation 7.2.

From complex numbers we know that using the Euler Formula we can write

e^{iθ} = cos θ + i sin θ,

so we can use this to rewrite the general solution in equation 7.10 as

y(x) = e^{px}[c1 (cos qx + i sin qx) + c2 (cos qx − i sin qx)] = Ae^{px} cos(qx) + Be^{px} sin(qx),   (7.11)

where A and B are functions of c1 and c2 but are still unknown arbitrary constants, and p and q are defined in Eqs. (7.8)
and (7.9).

7.2.3 Equal roots and the method of reduction of order

If b² = 4ac then the characteristic equation has only one distinct root, which is real:

m+ = m− = −b/(2a) ≡ m.   (7.12)
We may be tempted to take

y(x) = c1 e^{mx}   (7.13)

as the general solution. We can easily verify that this is indeed a solution, but it is clearly not the most general one, since
it contains only one constant, c1. Because we have a second order equation, however, two arbitrary constants are needed.

To find a second linearly independent solution, we can use the method of reduction of order. The idea is to take an
available solution, namely, Eq. (7.13), and to find from it a linearly independent solution by multiplying by a function
v(x), which we will need to find. That is, starting from the solution e^{mx} we seek a solution of the form

y(x) = v(x)e^{mx} = v(x)e^{−(b/2a)x}.   (7.14)

The first and second derivatives of y are found using the product rule to be

y′ = [v′ − (b/2a)v] e^{−(b/2a)x},   (7.15)
y″ = [v″ − (b/a)v′ + (b²/4a²)v] e^{−(b/2a)x}.   (7.16)

Substituting these ingredients into our differential equation (7.2) and cancelling the exponential factors gives

a[v″ − (b/a)v′ + (b²/4a²)v] + b[v′ − (b/2a)v] + cv = 0,   (7.17)

which can be simplified to

av″ − (b²/4a − c)v = 0.   (7.18)

But because we are considering the case b² = 4ac, the second term in Eq. (7.18) disappears and we are left with

v″ = 0.   (7.19)

Integrating this twice to find v gives

v(x) = A + Bx,   (7.20)

where A and B are arbitrary constants. Using this with Eq. (7.14), our general solution y for the case of equal
roots is

y(x) = (A + Bx)e^{mx},   (7.21)

where the single root of the characteristic equation is given by m = −b/(2a).

Let’s look at some examples.

Example 1. Find the solution to the following second order ODE.

d²y/dx² + 3(dy/dx) + 2y = 0.

The trial solution is

y = e^{mx}.

Substitute this into the ODE to find

m² + 3m + 2 = 0,   →   (m + 1)(m + 2) = 0,

which has solutions

m = −1,   m = −2.

The general solution to this ODE is then

y = Ae^{−x} + Be^{−2x}.
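We can verify the general solution by substituting it back in; with the exact derivatives of y = Ae^{−x} + Be^{−2x} the left hand side vanishes identically (Python sketch; A, B and the test points are arbitrary):

```python
import math

A, B = 2.0, -1.0
y = lambda x: A * math.exp(-x) + B * math.exp(-2 * x)
dy = lambda x: -A * math.exp(-x) - 2 * B * math.exp(-2 * x)
d2y = lambda x: A * math.exp(-x) + 4 * B * math.exp(-2 * x)

for x in (-1.0, 0.0, 2.0):
    # y'' + 3y' + 2y = A(1 - 3 + 2)e^{-x} + B(4 - 6 + 2)e^{-2x} = 0
    assert abs(d2y(x) + 3 * dy(x) + 2 * y(x)) < 1e-12
```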

Example 2. Find the solution to the following second order ODE.

d²ψ/dx² + vψ = 0,

where v is a constant.

Use a trial solution of the form

ψ = e^{mx}

and substitute into the ODE to find

m² + v = 0.

This gives us

m = ±i√v.

The general solution in this case is then

ψ = Ae^{i√v x} + Be^{−i√v x}.

We can rewrite this using Euler’s formula as

ψ = C cos(√v x) + D sin(√v x),

where C and D are arbitrary constants.

Example 3: Find the solution to the following second order ODE

d²x/dt² + 4(dx/dt) + 4x = 0.

Use trial solution

x = e^{mt}

and substitute into the ODE to find

m² + 4m + 4 = 0,   →   (m + 2)² = 0,

giving a repeated root of

m = −2.

From this method we have therefore only found one part of the general solution. To find the general solution we need
to use the method of reduction of order. We know that this method tells us that the general solution is of the form

x = (A + Bt)e^{−2t}.

This solution has two arbitrary constants and is therefore the general solution. We can always check that we have the
correct form of the solution by substituting back into the original ODE.
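Carrying out the suggested check for x(t) = (A + Bt)e^{−2t} (Python sketch; A, B and the test times are arbitrary, and the derivatives are written out by hand):

```python
import math

A, B = 1.3, -0.7
x = lambda t: (A + B * t) * math.exp(-2 * t)
dx = lambda t: (B - 2 * (A + B * t)) * math.exp(-2 * t)
d2x = lambda t: (-4 * B + 4 * (A + B * t)) * math.exp(-2 * t)

for t in (0.0, 0.5, 2.0):
    # x'' + 4x' + 4x should vanish identically for a repeated root m = -2.
    assert abs(d2x(t) + 4 * dx(t) + 4 * x(t)) < 1e-12
```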

7.3 Non-homogeneous linear ODEs

We now extend the homogeneous problem of the previous section to the non-homogeneous equation

a (d²y/dx²) + b (dy/dx) + cy = f (x),   (7.22)

where f (x) is a given function and here a, b and c may in general be functions of x.

To solve this problem, first recall that the general solution to the corresponding homogeneous equation

a (d²y/dx²) + b (dy/dx) + cy = 0

can be written

yc (x) = Ay1 (x) + By2 (x),   (7.23)

where y1 and y2 are linearly independent solutions to the homogeneous equation and A and B are arbitrary constants;
yc is called the complementary function.

Now note that we can add yc to any solution to the non-homogeneous equation, and the result will still be a solution.
Suppose we have a particular solution that satisfies (7.22), say yp, that is

a (d²yp/dx²) + b (dyp/dx) + cyp = f (x),

and we add to it yc, and then substitute this into equation 7.22. We find

a d²(yp + yc)/dx² + b d(yp + yc)/dx + c(yp + yc) = a (d²yp/dx²) + b (dyp/dx) + cyp + a (d²yc/dx²) + b (dyc/dx) + cyc
                                                = f + a (d²yc/dx²) + b (dyc/dx) + cyc
                                                = f,

as

a (d²yc/dx²) + b (dyc/dx) + cyc = 0.

So we see that

y(x) = yc (x) + yp (x) (7.24)

still solves the non-homogeneous equation.

The particular solution to the non-homogeneous equation, yp (x), is called the particular integral. Notice that, despite use of the
word “particular”, the solution yp does not need to satisfy any particular boundary or initial conditions; it can be any
solution of the full non-homogeneous equation, without any arbitrary constants.

To solve a non-homogeneous equation of the form in equation 7.22 subject to given initial conditions, we must first find
any solution, yp , and also the general solution yc to the corresponding homogeneous equation, which will contain two
arbitrary constants. Then these are added together and the initial (or boundary) conditions are imposed to determine
the values of the constants. We have already discussed the homogeneous problem above; we will next look at methods
for finding the particular solution yp .

7.3.1 The method of undetermined coefficients

Consider again the non-homogeneous linear ODE

ay″ + by′ + cy = f (x),   (7.25)

where we now restrict ourselves to the case where the coefficients a, b and c are constants. To find a particular solution
to the equation we can in many cases employ the method of undetermined coefficients. Basically this amounts to guessing
a solution that contains some undetermined parameters, substituting it back into the differential equation and seeing
whether there exist values for the parameters such that the function is indeed a solution.

It is difficult to formulate very general rules for how to guess a solution, but we can summarise certain guidelines that
cover a number of important cases, namely, when f (x) consists of an exponential, polynomial, sine or cosine terms. If
f (x) is of some different type or if the coefficients in the differential equation are not constant, then the method of
undetermined coefficients is not likely to be of use. The basic form of the guesses that turn out to work are summarised
in Table 2.

In problems of this type it is crucial that before seeking the particular solution yp to find first the complementary function
yc . It can happen that the guess for yp from Table 2 is proportional to yc , in which case the method will not work. This
is easy to see, since if yguess = Cyc for some constant C, then

C(ayc″ + byc′ + cyc) = 0.

But the guess for yp must give the non-homogeneous term f (x), not zero, so such a function cannot actually be a particular
solution. If the initial guess for yp is found to be proportional to yc , then the problem can be avoided by multiplying the
guess by x, or in general by as many powers of x as needed so that the trial function is no longer proportional to yc , as
listed for the example of the exponential form for f (x).

As an example, suppose we have the differential equation

Form of f (x)                                Guess for yp (x)

ce^{kx}                                      Ae^{kx}
                                             Axe^{kx} if e^{kx} already appears in the complementary function
                                             Ax²e^{kx} if xe^{kx} already appears in the complementary function

c0 + c1 x + c2 x² + · · · + cn xⁿ            A0 + A1 x + A2 x² + · · · + An xⁿ + An+1 x^{n+1} + An+2 x^{n+2}

c cos(kx) or c sin(kx)                       A cos(kx) + B sin(kx)

ce^{ax} cos(kx) or ce^{ax} sin(kx)           e^{ax}(A cos(kx) + B sin(kx))

Table 2: Trial solutions for yp given non-homogeneous terms of different forms in f (x). For the case of an nth order polynomial
(second entry in the table), the guess for yp must include all n + 3 of the coefficients A0, A1, . . . , An+2, even if some of the coefficients ci
in f (x) happen to be zero for i < n.

ay″ + by′ + cy = e^{kx}.   (7.26)

Referring to Table 2 we take as our trial solution a function of the form

yp (x) = Ae^{kx}.   (7.27)

Substituting this into our differential equation gives

a A k^2 e^{kx} + b A k e^{kx} + c A e^{kx} = e^{kx} . (7.28)

After cancelling the term e^{kx} and simplifying, we see that our guess for yp works if the coefficient A has the value

A = 1/(a k^2 + b k + c) . (7.29)

Having now found a particular solution we can add to this the general solution to the homogeneous equation, which
contains two arbitrary constants,
y = D e^{m+ x} + F e^{m− x} + e^{kx}/(a k^2 + b k + c) ,
and we then determine the constants by imposing the initial conditions.

Example 2. Solve
d^2 y/dx^2 + 3 dy/dx + 2y = e^{−x} .

To do this we must find the complementary function first. We know what this is, as we solved the homogeneous version of
this ODE before; the complementary function is

yc(x) = A e^{−x} + B e^{−2x} .

For the particular integral we follow the rules and note that our first guess would be of the form C e^{−x}, but this already
appears in the CF, so we try

yp = C x e^{−x} .

Putting this into the full non-homogeneous ODE we find

C e^{−x} = e^{−x} , → C = 1.

The full general solution to the non-homogeneous equation is

y = A e^{−x} + B e^{−2x} + x e^{−x} .
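As a quick sanity check (an illustration we add here, not part of the notes), the general solution can be verified numerically with finite differences; the values A = 1, B = −2 below are arbitrary choices, since the check must pass for any A and B.

```python
import math

def y(x, A=1.0, B=-2.0):
    # general solution y = A e^{-x} + B e^{-2x} + x e^{-x}; A, B are arbitrary
    return A * math.exp(-x) + B * math.exp(-2 * x) + x * math.exp(-x)

def residual(x, h=1e-4):
    # central-difference estimates of y' and y'', then the ODE residual
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / h**2
    return d2 + 3 * d1 + 2 * y(x) - math.exp(-x)

# the residual of y'' + 3y' + 2y = e^{-x} should vanish for all x
for x in (0.0, 0.5, 1.0, 2.0):
    assert abs(residual(x)) < 1e-5
```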

We will look at another example of this entire procedure in a later section.

7.3.2 Particular solution when non-homogeneous term is a sum

We may encounter a situation where the function f (x) on the right-hand side of a non-homogeneous differential equation
is not as simple as those shown in Table 2, but can nevertheless be expressed as a linear combination of such terms.
Suppose the non-homogeneous term f (x) can be expressed as a sum of n terms:

f(x) = \sum_{i=1}^{n} fi(x) . (7.30)

Instead of solving
a y'' + b y' + c y = f(x)
we can instead try to find solutions yi for i = 1, . . . , n to the equations

a yi'' + b yi' + c yi = fi(x) . (7.31)

The sum of the solutions to these n equations,

y(x) = \sum_{i=1}^{n} yi(x) , (7.32)

is therefore a solution to our original equation, as can be easily verified:

\sum_{i=1}^{n} [a yi'' + b yi' + c yi] = \sum_{i=1}^{n} fi = f . (7.33)

So if f (x) is a sum of terms, we can try to find the solution for each term and then sum these in the end.

7.3.3 Use of complex exponentials in solutions.

In applied problems, the function f (x) on the right hand side of the non-homogeneous ODE

a y'' + b y' + c y = f(x)

is very often a sine or cosine representing an alternating EMF or periodic force. We can find yp by guessing a solution of
the form A cos kx + B sin kx (as suggested in Table 2) and substituting this guess into the non-homogeneous ODE to find A
and B.

A more efficient way to find yp involves appealing to complex numbers. To explain the method consider the non-
homogeneous ODE

y'' + y' − 2y = 4 sin 2x. (7.34)

Instead of tackling this problem directly, we can consider the equation

Y'' + Y' − 2Y = 4 e^{2ix} . (7.35)

Since e^{2ix} = cos 2x + i sin 2x is complex, the solution Y may also be complex. Then if

Y = YR + iYI ,

where YR and YI are the real and imaginary parts of Y , then equation 7.35 is equivalent to two equations

YR'' + YR' − 2YR = Re(4 e^{2ix}) = 4 cos 2x , (7.36)

YI'' + YI' − 2YI = Im(4 e^{2ix}) = 4 sin 2x . (7.37)

We see that the second of these two is identical to equation 7.34, and so the solution to equation 7.34 is the same as the
imaginary part of Y . Thus, to find yp for equation 7.34, we can solve equation 7.35 for Yp and then take the imaginary part.

Let’s solve equation 7.35. We first note that e^{2ix} does not appear in the complementary function for the homogeneous
version of our non-homogeneous ODE. Following the method of undetermined coefficients we try a solution of the form

Yp = C e^{2ix}

and substitute it into equation 7.35 to get

(−4 + 2i − 2) C e^{2ix} = 4 e^{2ix} ,

rearranging we find
C = 4/(2i − 6) = −(1/5)(i + 3) ,
and therefore we find
Yp = −(1/5)(i + 3) e^{2ix} .
Taking the imaginary part of Yp , we find yp as
yp = −(1/5) cos 2x − (3/5) sin 2x .

We can as always check that this is a solution by substituting back into our original non-homogeneous ODE.
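Here is one way to do that check numerically (our own illustration, not from the notes): the complex coefficient and the real solution can both be confirmed directly.

```python
import math

# complex route: substituting Y = C e^{2ix} gives ((2i)^2 + 2i - 2) C = 4
C = 4 / ((2j)**2 + 2j - 2)
assert abs(C - (-(3 + 1j) / 5)) < 1e-12  # C = -(1/5)(3 + i)

def yp(x):
    # imaginary part of C e^{2ix}: yp = -(1/5) cos 2x - (3/5) sin 2x
    return -math.cos(2 * x) / 5 - 3 * math.sin(2 * x) / 5

def residual(x, h=1e-4):
    # central-difference check of y'' + y' - 2y = 4 sin 2x
    d1 = (yp(x + h) - yp(x - h)) / (2 * h)
    d2 = (yp(x + h) - 2 * yp(x) + yp(x - h)) / h**2
    return d2 + d1 - 2 * yp(x) - 4 * math.sin(2 * x)

for x in (0.0, 0.4, 1.1):
    assert abs(residual(x)) < 1e-5
```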

8 Applications of second order ODE

8.1 The driven simple harmonic oscillator

Second order ODEs appear frequently in Physics (and Engineering). A particular example is simple harmonic motion.
Consider the diagram in Figure 8.1, which depicts a mass m in motion subject to a number of forces.

Figure 8.1: The driven harmonic oscillator (see text).

The motion of the mass m is given by its position x relative to an equilibrium position of x = 0 as a function of time
t. The mass is attached to a spring, which exerts a force Fs = −kx, where k is the spring constant. The mass moves
through a viscous medium providing a resistive force of Fres = −β(dx/dt), and it is acted upon by an external force
Fext = F0 cos(ωt).

Equating the total force to the mass times acceleration gives

− kx − β (dx/dt) + F0 cos(ωt) = m (d^2 x/dt^2) . (8.1)

To simplify the notation we divide through by m and rewrite Eq. (8.1) as

x'' + γ x' + ω0^2 x = f0 cos(ωt) , (8.2)

where we have used primes for the derivatives and have defined

ω0 = \sqrt{k/m} , (8.3)

γ = β/m , (8.4)

f0 = F0/m . (8.5)

Suppose we are given the initial position and velocity for the mass: x(0) = x0, x'(0) = v0, and we want to find the
function x(t) that describes the subsequent motion of the mass as a function of time.

From Sec. 7.3 we know that the general solution x(t) can be expressed as the sum of two terms: xc , the general solution
to the homogeneous problem (the complementary function),

x'' + γ x' + ω0^2 x = 0 , (8.6)

plus a particular integral xp to the non-homogeneous problem. The general solution will contain two arbitrary constants
coming from the complementary function xc, whose values can be determined by imposing the initial conditions on the
sum xc + xp.

We can use the technology of Sec. 7.2 to write down the general solution to the homogeneous equation. With a trial
solution of x = e^{rt} the characteristic equation is

r^2 + γ r + ω0^2 = 0 . (8.7)

Over-damped case

If γ^2 > 4ω0^2 (the over-damped case), then the roots of the characteristic equation are real:

r± = −γ/2 ± \sqrt{γ^2 − 4ω0^2}/2 . (8.8)

The general solution to the homogeneous equation is then

xc(t) = A e^{r+ t} + B e^{r− t} . (8.9)

Under-damped case

If γ^2 < 4ω0^2 (the under-damped case) then the roots are complex:

r± = −γ/2 ± i \sqrt{4ω0^2 − γ^2}/2 . (8.10)

The general solution is therefore

xc = A e^{pt} cos(qt) + B e^{pt} sin(qt) , (8.11)

where

p = −γ/2 , (8.12)

q = \sqrt{4ω0^2 − γ^2}/2 , (8.13)

and where A and B are arbitrary constants. The solution is thus a combination of oscillating sine and cosine terms with
an amplitude that decreases exponentially in time.

Critically Damped Case

Finally, if γ^2 = 4ω0^2 (the critically-damped case) then the characteristic equation has only one distinct root,

r = −γ/2 , (8.14)
and the general solution to the homogeneous equation is

xc(t) = (A + Bt) e^{rt} . (8.15)

Depending on the values of γ and ω0 we will therefore use one of the three solutions above for xc (t).
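A small helper (our own sketch, not from the notes) makes the case analysis concrete: given γ and ω0 it classifies the damping regime and returns the roots of the characteristic equation.

```python
import cmath

def char_roots(gamma, omega0):
    # roots of r^2 + gamma*r + omega0^2 = 0; complex when under-damped
    disc = gamma**2 - 4 * omega0**2
    s = cmath.sqrt(disc)
    return (-gamma + s) / 2, (-gamma - s) / 2

def regime(gamma, omega0):
    disc = gamma**2 - 4 * omega0**2
    if disc > 0:
        return "over-damped"
    if disc == 0:
        return "critically damped"
    return "under-damped"

assert regime(3.0, 1.0) == "over-damped"
assert regime(2.0, 1.0) == "critically damped"
assert regime(0.5, 1.0) == "under-damped"

# under-damped roots are p +/- iq with p = -gamma/2, q = sqrt(4 w0^2 - gamma^2)/2
rp, _ = char_roots(0.5, 1.0)
assert abs(rp.real - (-0.25)) < 1e-12
assert abs(rp.imag - (4 * 1.0**2 - 0.5**2)**0.5 / 2) < 1e-12
```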

The next step is to determine a particular solution to the non-homogeneous equation, and for this we can use the method
of undetermined coefficients as described in Sec. 7.3.1. The non-homogeneous term is f0 cos(ωt), so referring to Table 2
we should use a linear combination of sin(ωt) and cos(ωt) in the trial solution. Equivalently (and this will turn out
to be somewhat easier) we can regard our desired solution as the real part of the solution of the complex differential
equation (see Section 7.3.3)

x'' + γ x' + ω0^2 x = f0 e^{iωt} . (8.16)

Now the non-homogeneous part is an exponential, so again referring to Table 2 we see the trial solution should be another
exponential of the same form, namely,

xp(t) = C e^{iωt} . (8.17)

An obvious advantage of using the complex exponential instead of the cosine form of the driving function is that taking
the derivative just gives a multiple of the same exponential. Substituting the solution (8.17) into the differential equation
(8.16) gives

− ω^2 C e^{iωt} + iωγ C e^{iωt} + ω0^2 C e^{iωt} = f0 e^{iωt} . (8.18)

We can cancel the factor of e^{iωt} and solve for the constant C,

C = − f0 / (ω^2 − ω0^2 − iωγ) . (8.19)

Multiplying numerator and denominator by ω^2 − ω0^2 + iωγ to separate out the real and imaginary parts gives

C = − f0 (ω^2 − ω0^2) / [(ω^2 − ω0^2)^2 + ω^2 γ^2] − i f0 ωγ / [(ω^2 − ω0^2)^2 + ω^2 γ^2] . (8.20)

In order to extract easily the real part of the solution it is convenient to write C in the form† |C| e^{iφ}, where

|C| = f0 / \sqrt{(ω^2 − ω0^2)^2 + ω^2 γ^2} , (8.21)

φ = tan^{−1} [ωγ / (ω^2 − ω0^2)] . (8.22)

Note that the quadrant of the angle is not uniquely determined by the tan^{−1} function, and we must require

cos φ = −(ω^2 − ω0^2) / \sqrt{(ω^2 − ω0^2)^2 + ω^2 γ^2} , (8.23)

sin φ = −ωγ / \sqrt{(ω^2 − ω0^2)^2 + ω^2 γ^2} . (8.24)

The final answer for xp (t) (still in complex form) is

xp(t) = |C| e^{i(ωt+φ)} = |C| [cos(ωt + φ) + i sin(ωt + φ)] . (8.25)

Taking the real part of this as the solution to our original problem (with xp redefined to refer now only to the real
part) gives

xp(t) = |C| cos(ωt + φ) , (8.26)

where the amplitude |C| and phase angle φ are given by Eqs. (8.21) and (8.22).

The complete solution to our problem is then found by adding together the particular solution xp and that of the
homogeneous equation xc , which, depending on the values of γ and ω0 , is given by Eqs. (8.9), (8.11) or (8.15). The
solution xc contains two arbitrary constants, A and B, and these are determined by the initial values of the position,
x(0) = x0, and speed, x'(0) = v0, for some specified x0 and v0. That is, we require
† Recall any complex number z = x + iy can be expressed as z = |z| e^{iθ} with |z| = \sqrt{x^2 + y^2} and θ = tan^{−1}(y/x). The quadrant of θ is
fixed by cos θ = x/|z| and sin θ = y/|z|. By representing complex numbers in this way one can also show that for any two complex numbers z1 and
z2, |z1/z2| = |z1|/|z2|.

x(0) = xc(0) + xp(0) = x0 , (8.27)

x'(0) = xc'(0) + xp'(0) = v0 . (8.28)

Suppose, for example, that we have γ < 2ω0 (the under-damped case), so that the solution to the homogeneous equation
is given by Eq. (8.11). The complete general solution including both xc and xp is therefore

x(t) = A e^{pt} cos(qt) + B e^{pt} sin(qt) + |C| cos(ωt + φ) , (8.29)

where p and q are given by Eqs. (8.12) and (8.13). For the initial condition on the speed we need to differentiate Eq. (8.29)
and then evaluate at t = 0. Imposing both initial conditions gives

A + |C| cos φ = x0 , (8.30)

Ap + Bq − ω|C| sin φ = v0 . (8.31)

Solving for A and B we find

A = x0 − |C| cos φ , (8.32)


B = (1/q) (ω|C| sin φ + p|C| cos φ − x0 p + v0) . (8.33)

Now A and B are completely determined by quantities that are specified at the outset, so by using these values with
Eq. (8.29) we have the fully specified function x(t) that satisfies our initial conditions and is valid for all times t > 0.

Notice that because p = −γ/2 is negative, the two exponential terms in the solution (8.29) mean that after a sufficiently
long time (t ≫ 2/γ) the homogeneous part of the solution goes to zero and one is left only with the particular solution.
That is, the homogeneous solution, which includes constants that depend on the initial conditions, represents transient
behaviour that eventually dies off. The particular solution represents the long-term motion, and this does not depend on
the constants A and B and is thus independent of the initial position and speed.
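The steady-state amplitude and phase can be checked with a short computation (a sketch we add for illustration; the parameter values are arbitrary): the amplitude and quadrant-correct phase follow Eqs. (8.21)-(8.24), and the resulting xp(t) satisfies the driven equation.

```python
import math

def steady_state(f0, gamma, omega0, omega):
    # amplitude |C| and phase phi of x_p = |C| cos(wt + phi),
    # following Eqs. (8.21)-(8.24); atan2 picks the correct quadrant
    denom = math.hypot(omega**2 - omega0**2, omega * gamma)
    amp = f0 / denom
    phi = math.atan2(-omega * gamma, -(omega**2 - omega0**2))
    return amp, phi

f0, gamma, omega0, omega = 2.0, 0.3, 1.5, 1.0
amp, phi = steady_state(f0, gamma, omega0, omega)

def xp(t):
    return amp * math.cos(omega * t + phi)

def residual(t, h=1e-4):
    # check x'' + gamma x' + w0^2 x = f0 cos(wt) for the steady state
    d1 = (xp(t + h) - xp(t - h)) / (2 * h)
    d2 = (xp(t + h) - 2 * xp(t) + xp(t - h)) / h**2
    return d2 + gamma * d1 + omega0**2 * xp(t) - f0 * math.cos(omega * t)

for t in (0.0, 0.7, 2.3):
    assert abs(residual(t)) < 1e-5
```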

8.1.1 Unforced Oscillations

We may of course have the situation where there is no driving force at all, that is, f0 = 0. In this case the solution is just
the complementary function. We can analyse each of the damping cases described above (over-damped, critically damped
and under-damped), plus a fourth possibility of no damping at all, which is simply γ = 0.

We saw that in the over-damped case the solution is given by

xc^{od}(t) = A e^{r+ t} + B e^{r− t} ,

where

r± = −γ/2 ± \sqrt{γ^2 − 4ω0^2}/2 .

It is clear that if γ > 2ω0, then the exponentials in this solution both have negative arguments, and the solution decays
to zero without oscillating.

Figure 8.2: The harmonic oscillator without a driving force. The motion of the mass on the spring is plotted for several
different cases: over-damping (red), critical damping (blue), under-damping (black, with grey dashed lines indicating the
e^{−γt/2} envelope) and no damping at all (orange).

For the critically damped case we again have a similar behaviour, as

xc^{cd}(t) = (A + Bt) e^{rt} ,

with r = −γ/2.

For the under-damped case we again have a decaying exponential envelope, as

xc^{ud}(t) = e^{pt} (A cos(qt) + B sin(qt)) ,

with p = −γ/2.

For the last case of no damping, that is γ = 0, we just get the simple form

xc^{nd}(t) = A cos(qt) + B sin(qt) , where now q = ω0 .

All four of these cases are plotted for some example values of the parameters in Figure 8.2.
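To connect the formulas to an actual trajectory, here is a small self-contained check (our own illustration, with arbitrary parameter values): a fourth-order Runge-Kutta integration of x'' + γx' + ω0^2 x = 0 reproduces the analytic under-damped solution.

```python
import math

# compare RK4 integration with the analytic solution x = e^{pt}(A cos qt + B sin qt)
gamma, w0 = 0.4, 2.0
p, q = -gamma / 2, math.sqrt(4 * w0**2 - gamma**2) / 2
x0, v0 = 1.0, 0.0
A = x0
B = (v0 - p * x0) / q          # from x(0) = x0 and x'(0) = v0

def analytic(t):
    return math.exp(p * t) * (A * math.cos(q * t) + B * math.sin(q * t))

def deriv(x, v):
    # first-order system: x' = v, v' = -gamma v - w0^2 x
    return v, -gamma * v - w0**2 * x

x, v, t, h = x0, v0, 0.0, 1e-3
while t < 5.0:
    k1x, k1v = deriv(x, v)
    k2x, k2v = deriv(x + h * k1x / 2, v + h * k1v / 2)
    k3x, k3v = deriv(x + h * k2x / 2, v + h * k2v / 2)
    k4x, k4v = deriv(x + h * k3x, v + h * k3v)
    x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    t += h

assert abs(x - analytic(t)) < 1e-6
```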

9 Limits and the evaluation of Indeterminate Forms

See Riley Section 4.7

The idea of the limit of a function f (x) as x approaches a value a is intuitive and we have been using this intuition to
evaluate limits already in this course. There is however a strict definition of what we mean by a limit. In many cases the
limit of the function f(x) as x approaches a is given by f(a), but sometimes this is not so.

For example, suppose we want to find

L = lim_{x→0} (1 − e^x)/x .

If we naively try to substitute x = 0, we get 0/0. Expressions that lead us to such meaningless results when we substitute
values for the independent variable are called indeterminate forms.

The limit of this function does actually exist and is well defined and we will see this in a little while. Another possibility
is that even if f(x) is defined at x = a, its value may not be equal to the limiting value lim_{x→a} f(x). This can occur if the
function is discontinuous at x = a.

The strict definition of a limit is that if lim_{x→a} f(x) = l, then for any number ε, however small, it must be possible to find a
number η such that |f(x) − l| < ε whenever |x − a| < η. In other words, as x becomes arbitrarily close to a, f(x) becomes
arbitrarily close to its limit l. In general the number η will depend on ε and on the form of f(x).

The following observations are often useful when finding the limit of a function:

• A limit may be ±∞. For example as x → 0, 1/x^2 → ∞. This limit is defined; it just happens to be infinite.

• A limit may be approached from below or above and the value may be different in each case. For example, consider
the function f(x) = tan x. As x tends to π/2 from below, f(x) → ∞, but if the limit is approached from above then
f(x) → −∞. We can write this as

lim_{x→π/2−} tan x = ∞ ,   lim_{x→π/2+} tan x = −∞ .

• It may ease the evaluation of limits if the function under consideration is split into a sum, product or quotient.
Provided that in each case a limit exists, the rules for evaluating such limits are as follows:

– a) lim_{x→a} [f(x) + g(x)] = lim_{x→a} f(x) + lim_{x→a} g(x).

– b) lim_{x→a} [f(x) g(x)] = (lim_{x→a} f(x)) (lim_{x→a} g(x)).

– c) lim_{x→a} [f(x)/g(x)] = (lim_{x→a} f(x)) / (lim_{x→a} g(x)), provided that the numerator and denominator are not both equal to zero or infinity.

Let’s look at some example limits.

Example 1: Evaluate lim_{x→1} (x^2 + 2x^3). For this we can use a) above:

lim_{x→1} (x^2 + 2x^3) = lim_{x→1} (x^2) + lim_{x→1} (2x^3) = 1 + 2 = 3.

Example 2: Evaluate lim_{x→0} (x cos x). For this we can use b) above:

lim_{x→0} (x cos x) = (lim_{x→0} x)(lim_{x→0} cos x) = 0 × 1 = 0.

Example 3: Evaluate lim_{x→π/2} (sin x / x). For this we can use c) above (having checked that the numerator and denominator are not both zero or infinity):

lim_{x→π/2} (sin x / x) = (lim_{x→π/2} sin x) / (lim_{x→π/2} x) = 1/(π/2) = 2/π.

• Limits of functions of x that contain exponents that are themselves functions of x can often be found by taking
logarithms.

Example 4: Evaluate the limit

L = lim_{x→∞} (1 − a^2/x^2)^{x^2} .

First define

y = (1 − a^2/x^2)^{x^2}

and take the natural log of this, then take the required limit:

lim_{x→∞} ln y = lim_{x→∞} [x^2 ln(1 − a^2/x^2)] .

Now we can use the Maclaurin series (or Taylor series around x = 0) of ln(1 + x) (see Equation 2.56) to write

ln(1 − a^2/x^2) = −a^2/x^2 − (1/2)(a^2/x^2)^2 − ...

so that

lim_{x→∞} ln y = lim_{x→∞} x^2 [−a^2/x^2 − (1/2)(a^2/x^2)^2 − ...] = lim_{x→∞} [−a^2 − (1/2)(a^4/x^2) − ...] = −a^2 .

Therefore, since lim_{x→∞} ln y = −a^2, it follows that lim_{x→∞} y = exp(−a^2).
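We can corroborate this limit numerically (a check added here for illustration, with a = 1.5 chosen arbitrarily): the value approaches e^{−a^2} as x grows.

```python
import math

a = 1.5
target = math.exp(-a**2)

def y(x):
    # the function (1 - a^2/x^2)^{x^2} from Example 4
    return (1 - a**2 / x**2) ** (x**2)

# the error relative to e^{-a^2} shrinks as x increases
errors = [abs(y(x) - target) for x in (10.0, 100.0, 1000.0)]
assert errors[0] > errors[1] > errors[2]
assert errors[2] < 1e-4
```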

9.1 L’Hôpital’s rule

L’Hôpital’s rule is an extension of c) above to cases where the numerator and denominator are both zero or both infinite.
Consider lim_{x→a} f(x)/g(x), where f(a) = 0 and g(a) = 0. Expanding the numerator and denominator as Taylor series we
obtain

f(x)/g(x) = [f(a) + (x − a) f'(a) + ((x − a)^2/2!) f''(a) + ...] / [g(a) + (x − a) g'(a) + ((x − a)^2/2!) g''(a) + ...] .

However, f(a) = g(a) = 0, so

f(x)/g(x) = [(x − a) f'(a) + ((x − a)^2/2!) f''(a) + ...] / [(x − a) g'(a) + ((x − a)^2/2!) g''(a) + ...] = [f'(a) + ((x − a)/2!) f''(a) + ...] / [g'(a) + ((x − a)/2!) g''(a) + ...] .

Therefore we find

lim_{x→a} f(x)/g(x) = lim_{x→a} [f'(a) + ((x − a)/2!) f''(a) + ...] / [g'(a) + ((x − a)/2!) g''(a) + ...] = f'(a)/g'(a) ,

provided f'(a) and g'(a) are not themselves both zero. If they are both zero then we can apply the same procedure again
to find

lim_{x→a} f(x)/g(x) = f''(a)/g''(a) .

We can repeat this procedure by going to higher and higher order terms until we find a defined limit (which may be
infinity, zero or some other number).
Example 5. Evaluate the limit lim_{x→0} (1 − e^x)/x. We note first of all that if we set x = 0 then the numerator and denominator
are both zero. We apply L’Hôpital’s rule:

lim_{x→0} (1 − e^x)/x = lim_{x→0} (−e^x)/1 = −1.

Checking this, if we just Taylor expand the numerator around x = 0 we have

(1 − e^x)/x = [1 − (1 + x + x^2/2! + x^3/3! + ...)]/x = (−x − x^2/2! − x^3/3! − ...)/x = −1 − x/2! − x^2/3! − ...

Again if we take the limit x → 0 the result is

lim_{x→0} (1 − e^x)/x = lim_{x→0} (−1 − x/2! − x^2/3! − ...) = −1.

We have so far concentrated on the case where f(a) and g(a) are both zero. For the case where f(a) = g(a) = ∞ we may still apply
L’Hôpital’s rule by writing

lim_{x→a} f(x)/g(x) = lim_{x→a} [1/g(x)] / [1/f(x)] ,

which is now of the form 0/0 at x = a. Note also that L’Hôpital’s rule is still valid for finding a limit as x → ∞, i.e. when
a = ∞. This is easily shown by letting y = 1/x as follows:

lim_{x→∞} f(x)/g(x) = lim_{y→0} f(1/y)/g(1/y) = lim_{y→0} f'(1/y)/g'(1/y) = lim_{x→∞} f'(x)/g'(x) .

10 Series

Riley section 4. Boas Chapter 1.

There are a large number of different series that are useful in Mathematics. Some you will already have seen before and
some you will be seeing for the first time. A series may have a finite or infinite number of terms.

The first N terms of a series can be represented as

SN = u1 + u2 + u3 + ... + uN ,

where the terms of the series un, for n = 1, 2, 3, ..., N, are numbers and may in general be complex.

An example series has terms un = 1/2^n for n = 1, 2, 3, ..., N; the sum of the first N terms is then

SN = \sum_{n=1}^{N} un = 1/2 + 1/4 + 1/8 + ... + 1/2^N .

It is often of practical interest to calculate the sum of an infinite series (one with an infinite number of terms). If the
sum “converges” then its value is finite. We can consider the following limit to determine whether a series
converges:

S = lim_{N→∞} SN .

Not all series converge; they may approach ∞ or −∞, or oscillate finitely or infinitely. Moreover, for a series where each term
depends on some variable, its convergence can depend on the value assumed by the variable. Whether a sum converges,
diverges or oscillates has important implications when describing physical systems. We now go through some examples
of different types of series and how to sum them.

10.1 Arithmetic Series

An arithmetic series is defined by the difference between successive terms being constant. The sum of a general arithmetic
series is written
SN = a + (a + d) + (a + 2d) + ... + [a + (N − 1)d] = \sum_{n=0}^{N−1} (a + nd) .

If we rewrite the sum backwards,

SN = [a + (N − 1)d] + [a + (N − 2)d] + ... + a .

Adding these two sums together term by term we get

2SN = [2a + (N − 1)d] + [2a + (N − 1)d] + ... + [2a + (N − 1)d] = N [2a + (N − 1)d] ,

so that we have

SN = (N/2) (first term + last term) .

10.2 Geometric Series

A geometric series is a series in which the ratio of successive terms is a constant. In general we may write a geometric series
and its sum as

SN = a + ar + ar^2 + ... + ar^{N−1} = \sum_{n=0}^{N−1} a r^n , (10.1)

where a is a constant and r is the ratio of successive terms, also referred to as the common ratio.

We may find a closed expression for this type of series for a given number of terms (which we can take to infinity if the
series converges). Consider the series SN and rSN , that is

SN = a + ar + ar^2 + ... + ar^{N−1} , (10.2)

rSN = ar + ar^2 + ar^3 + ... + ar^N . (10.3)

If we now subtract the second equation from the first we find

(1 − r) SN = a − a r^N

and hence

SN = a (1 − r^N) / (1 − r) . (10.4)

For a series with an infinite number of terms and |r| < 1, we have the limit

lim_{N→∞} SN = S = a/(1 − r) . (10.5)

This series is then called convergent.
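The closed form (10.4) and the limit (10.5) are easy to confirm with a direct sum (a small check of our own, with arbitrary values of a and r):

```python
def geometric_sum(a, r, N):
    # closed form S_N = a (1 - r^N) / (1 - r), Eq. (10.4)
    return a * (1 - r**N) / (1 - r)

a, r, N = 3.0, 0.5, 20
direct = sum(a * r**n for n in range(N))
assert abs(direct - geometric_sum(a, r, N)) < 1e-12

# for |r| < 1 the partial sums approach a / (1 - r), Eq. (10.5)
assert abs(geometric_sum(a, r, 200) - a / (1 - r)) < 1e-12
```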

10.3 Arithmetico-Geometric Series

As its name suggests, an arithmetico-geometric series is a mix of a geometric and an arithmetic series. It has the following
general form:

SN = a + (a + d)r + (a + 2d)r^2 + ... + [a + (N − 1)d] r^{N−1} = \sum_{n=0}^{N−1} (a + nd) r^n , (10.6)

and can be summed in a similar way to a geometric series by writing

SN = a + (a + d)r + (a + 2d)r^2 + ... + [a + (N − 1)d] r^{N−1} , (10.7)

rSN = ar + (a + d)r^2 + (a + 2d)r^3 + ... + [a + (N − 1)d] r^N . (10.8)

Subtracting the second from the first,

(1 − r)SN = a + rd + r^2 d + ... + r^{N−1} d − [a + (N − 1)d] r^N = a − [a + (N − 1)d] r^N + dr (1 + r + r^2 + ... + r^{N−2}) .

The last term contains a geometric series that can be summed according to Equation 10.4 to find

(1 − r)SN = a − [a + (N − 1)d] r^N + rd (1 − r^{N−1})/(1 − r) .

Rearranging, we have

SN = (a − [a + (N − 1)d] r^N)/(1 − r) + rd (1 − r^{N−1})/(1 − r)^2 . (10.9)

For an infinite series with |r| < 1 we have lim_{N→∞} r^N = 0 and hence

S = lim_{N→∞} SN = a/(1 − r) + rd/(1 − r)^2 . (10.10)
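The closed form and its infinite-series limit can again be confirmed against a direct sum (an illustrative check with arbitrary a, d, r):

```python
def ag_sum(a, d, r, N):
    # closed form (10.9): S_N = sum_{n=0}^{N-1} (a + n d) r^n
    return (a - (a + (N - 1) * d) * r**N) / (1 - r) \
        + r * d * (1 - r**(N - 1)) / (1 - r)**2

a, d, r, N = 2.0, 0.7, 0.6, 15
direct = sum((a + n * d) * r**n for n in range(N))
assert abs(direct - ag_sum(a, d, r, N)) < 1e-12

# infinite-series limit (10.10) for |r| < 1: a/(1-r) + r d/(1-r)^2
limit = a / (1 - r) + r * d / (1 - r)**2
assert abs(ag_sum(a, d, r, 400) - limit) < 1e-12
```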

10.4 A last example of a series

We saw in the integration section that we needed to sum the following series:

Sn = 1^2 + 2^2 + 3^2 + ... + n^2 = \sum_{i=1}^{n} i^2 .

We can calculate the result of this sum by considering the following two series containing sums of cubes:

S = 1^3 + 2^3 + 3^3 + ... + n^3 , (10.11)

S̄ = 0^3 + 1^3 + 2^3 + ... + (n − 1)^3 . (10.12)

Subtract the second from the first to get

S − S̄ = (1^3 − 0^3) + (2^3 − 1^3) + (3^3 − 2^3) + ... + (n^3 − (n − 1)^3) = \sum_{i=1}^{n} [i^3 − (i − 1)^3] .

But this difference is also equal to n^3, as all terms cancel apart from the final n^3 factor. We have therefore

\sum_{i=1}^{n} [i^3 − (i − 1)^3] = n^3 .

We can expand the left-hand side using i^3 − (i − 1)^3 = 3i^2 − 3i + 1, so that

n^3 = \sum_{i=1}^{n} (3i^2 − 3i + 1) = 3 \sum_{i=1}^{n} i^2 − 3 \sum_{i=1}^{n} i + \sum_{i=1}^{n} 1 . (10.13)

For the last two terms we have

\sum_{i=1}^{n} i = n(n + 1)/2

and

\sum_{i=1}^{n} 1 = n ,
where the first sum has been completed using the sum of an arithmetic series as calculated in Section 10.1. Putting this
all into equation 10.13 and rearranging we find

\sum_{i=1}^{n} i^2 = (n^3 − n)/3 + (n/2)(n + 1) = (n/6)(2n + 1)(n + 1) .

10.5 Convergent and Divergent Series

So far we have been talking about series which have a finite sum, and in the last subsection we saw a series whose sum
becomes infinite if we take the number of terms to infinity. If a series has a finite sum, it is called convergent, but if the
sum is not finite it is called divergent. It is important to know whether a series is convergent or not, as odd things can
happen if we try to manipulate a divergent series using ordinary algebra or calculus.

To see this we can look at an example of a clearly divergent series

S = 1 + 2 + 4 + 8 + 16 + . . . ,

where the . . . tell us that the series carries on for an infinite number of terms. It should be clear that this series is
divergent and if we added up all the terms we would get an infinite result.

Let’s try and do some simple algebra with this series. If we write down 2S, we have

2S = 2 + 4 + 8 + 16 + . . . ,

which is very close to our original series S, apart from the first term, so that

2S = S − 1 .

We can rearrange this to find

S = −1 ,

which is clearly nonsense.

The same sort of thing can happen in a less obvious case. For example the series

1 1 1 1
S =1+ + + + + ...
2 3 4 5
is actually a divergent series. As a result we need to be careful about the series we are dealing with and be able to identify
a convergent series.

Before we look at some tests for convergence, we should be clear about what we mean by the convergence of a series.

Consider the series


a1 + a2 + a3 + . . . + an + . . . .

Recall that the three dots tell us the series goes on without end.

Now consider the sums, Sn , created by including only the first n terms of the series, that is

S1 = a1
S2 = a1 + a2
...
Sn = a1 + a2 + a3 + . . . + an .

Each Sn is called a partial sum, it is the sum of the first n terms.

We are of course interested in the limit as we take n to infinity and whether we can write

lim Sn = S.
n→∞

It is understood that S is a finite number. If this happens, we make the following definitions:

• If the partial sum Sn of an infinite series tends to a limit S, the series is called convergent. Otherwise it is called
divergent.

• The limiting value S is called the sum of the series.

• The difference Rn = S − Sn is called the remainder (or the remainder after n terms). We see that

lim_{n→∞} Rn = lim_{n→∞} (S − Sn) = S − S = 0.

10.5.1 Testing a Series For Convergence: The Preliminary Test.

First we discuss what is usually called the preliminary test. In most cases you should apply this test first before you use
other tests as it will identify the badly divergent series.

Preliminary Test: If the terms of an infinite series do not tend to zero (that is, if lim_{n→∞} an ≠ 0), the series diverges.
If lim_{n→∞} an = 0, we must make further tests; this test alone cannot tell us whether the series is convergent.

10.5.2 Tests for Convergence of Series of Positive Terms: Absolute Convergence

There are several useful tests for series whose terms are all positive. Here we will look at 3 of them.

If we do have a series with negative terms, it may still be useful to make all the terms positive and do the test on this
modified series. If this new series converges we call the original series absolutely convergent. It can be proved that if a
series converges absolutely, the series is convergent when you reinstate the original minus signs (the sum is different of
course). The following 3 tests may all be used for testing a series of positive terms, or for testing any series for absolute
convergence.

1) The Comparison Test

This test has two parts. Let


m1 + m2 + m3 + . . .

be a series of positive terms which you know converges. Then the series we are testing has the form

a1 + a2 + a3 + . . .

and is absolutely convergent if |an| ≤ mn for each n.

Example: Test

\sum_{n=1}^{∞} 1/n! = 1 + 1/2 + 1/6 + 1/24 + ...

for convergence. As a comparison series, we can choose the geometric series

\sum_{n=1}^{∞} 1/2^n = 1/2 + 1/4 + 1/8 + 1/16 + ...

We do not care about the first few terms (or in fact any finite number of terms) in a series, because they can affect the
sum of the series but not whether it converges. When we ask whether a series converges or not, we are asking what happens
as we add more and more terms, for larger and larger n. Does the sum increase indefinitely, or does it approach a limit?

In our example, the terms of \sum_{n=1}^{∞} 1/n! are smaller than the corresponding terms of \sum_{n=1}^{∞} 1/2^n for all n > 3. We know
that the geometric series converges, and therefore \sum_{n=1}^{∞} 1/n! converges also.

2) Integral Test

We can use this test when the terms of the series are positive and non-increasing, that is, an+1 ≤ an. The test involves
writing an as a function of n, then allowing n to take all values, not just integer ones.

Then if 0 < an+1 ≤ an for n > N, \sum an converges if \int^{∞} an dn is finite.

Example: Test for convergence the harmonic series

1 + 1/2 + 1/3 + 1/4 + ... .

Using the integral test with an = 1/n we calculate

\int^{∞} (1/n) dn = ln n |^{∞} = ∞ .

Since the integral is infinite, this tells us that the series is divergent.
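The divergence is slow but visible numerically (our own illustration): the partial sums of the harmonic series track ln(n) plus Euler's constant 0.5772..., growing without bound just as the divergent integral predicts.

```python
import math

def harmonic(n):
    # partial sum H_n = 1 + 1/2 + ... + 1/n
    return sum(1.0 / k for k in range(1, n + 1))

# H_n grows like ln(n) + 0.5772... (Euler's constant), without bound
for n in (100, 10000):
    assert abs(harmonic(n) - (math.log(n) + 0.5772156649)) < 0.01
assert harmonic(10000) - harmonic(100) > 4.0
```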

3) Ratio Test

The integral test depends on being able to integrate an dn, and this is not always possible. We now consider another test
which can handle many series.

First recall that for a geometric series each term could be obtained by multiplying the one before it by the ratio r, that
is, an+1 = r an, or an+1/an = r. For other series this ratio is not a constant but may depend on n. Let us define the
absolute value of this ratio as ρn, and let us also find the limit (if there is one) of ρn as n → ∞ and call this limit ρ. Thus
we have

ρn = |an+1/an| ,

ρ = lim_{n→∞} ρn . (10.14)

If

ρ < 1, the series converges;
ρ = 1, use a different test;
ρ > 1, the series diverges.

Example: Test for convergence the series

S = 1 + 1/2! + 1/3! + ... + 1/n! + ... .

Using equation 10.14, we have

ρn = (1/(n + 1)!) / (1/n!) = n!/(n + 1)! = [n(n − 1) ... 3 · 2 · 1] / [(n + 1) n(n − 1) ... 3 · 2 · 1] = 1/(n + 1) .

Now taking the limit as n → ∞ we have

ρ = lim_{n→∞} ρn = lim_{n→∞} 1/(n + 1) = 0 .

Since ρ < 1, the series converges.
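The ratio-test calculation can be checked directly (our own illustration); as a bonus, the partial sums approach e − 1, a standard result not needed for the test itself.

```python
import math

def rho(n):
    # ratio a_{n+1}/a_n for a_n = 1/n!; big-integer division keeps it exact
    return math.factorial(n) / math.factorial(n + 1)

assert abs(rho(5) - 1.0 / 6.0) < 1e-12
assert rho(1000) < 1e-2   # the ratio tends to zero, so the series converges

# and indeed the partial sums approach e - 1
partial = sum(1.0 / math.factorial(n) for n in range(1, 20))
assert abs(partial - (math.e - 1)) < 1e-12
```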

10.5.3 Alternating Series

So far we have only looked in detail at series with positive terms. An important class of series is that in which the terms
alternate in sign. For example, consider the following series:

S = 1 − 1/2 + 1/3 − 1/4 + 1/5 − ... + (−1)^{n+1}/n + ... . (10.15)
We can ask two questions about an alternating series: first, whether the series as written converges, and second, whether
it converges absolutely. Let us consider the second question first. We can rewrite the series with all terms positive as

S' = 1 + 1/2 + 1/3 + 1/4 + 1/5 + ... + 1/n + ... ,

which of course is a different sum, and one that is bigger than our original sum. Therefore if this series converges,
our original alternating series will also converge. We recognise this as the harmonic series, which we found is actually
divergent. As a result our series is not absolutely convergent, so we must try a different test.

For an alternating series the test is very simple: an alternating series converges if the absolute value of the terms decreases
steadily to zero, that is, if |an+1| ≤ |an| and lim_{n→∞} an = 0.

In the example above we have 1/(n + 1) < 1/n and lim_{n→∞} 1/n = 0, and so the series in equation 10.15 is convergent.
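Numerically (an illustration we add, not part of the notes), the partial sums of the alternating series do settle down, to ln 2, a standard result, even though the all-positive version diverges.

```python
import math

def alt_partial(n):
    # partial sum of 1 - 1/2 + 1/3 - 1/4 + ...
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

# the partial sums converge (to ln 2), despite absolute convergence failing
s = alt_partial(10**5)
assert abs(s - math.log(2)) < 1e-4
```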

11 Summary

That is all for this year. The only way you will learn this material is by doing lots and lots of examples. Have fun doing
them!

A The Greek Alphabet

A α alpha
B β beta
Γ γ gamma
∆ δ delta
E ϵ (or ε) epsilon
Z ζ zeta
H η eta
Θ θ theta
I ι iota
K κ kappa
Λ λ lambda
M µ mu
N ν nu
Ξ ξ xi (‘ksee’)
O o omicron
Π π pi
P ρ rho
Σ σ sigma
T τ tau
Υ υ upsilon
Φ φ (or ϕ) phi
X χ chi (‘kai’)
Ψ ψ psi
Ω ω omega

Table 3: The Greek alphabet.

Notice that the letter nu (ν) is distinct from a Latin v, and χ is not the same as a Latin x. The Greek lower-case
letters upsilon (υ) and omicron (o) are essentially indistinguishable from Latin v and o and are therefore not used in
mathematics. Likewise many of the upper-case Greek and Latin letters are identical and so the Greek versions are rarely
used.
