Calculus.
PH1110/EE1110
Dr Stephen West
Department of Physics,
Royal Holloway, University of London,
Egham, Surrey,
TW20 0EX.
Contents
1 Introduction
2 Functions
3 Differential calculus
4.3 Spherical Polars
5 Integration
6 Differential Equations
6.2.8 Bernoulli equation
10 Series
11 Summary
A The Greek Alphabet
1 Introduction.
Welcome to PH1110, Mathematics for Scientists (calculus component). This course will provide you with some of the
building blocks in Mathematics that you will need throughout your Physics (or Electronic Engineering) degree. The notes
here and the notes that you will take as part of the lectures will only be a small part of your learning in Mathematics.
The real learning takes place on your own as you go through as many examples and problems as you can. The only way
to learn mathematics is to do mathematics.
This part of the course focuses on calculus, with the following topics:
• Differentiation (including an introduction to partial differentiation) and integration (formal definitions and basic techniques).
The recommended textbook is
• K F Riley, M P Hobson and S J Bence, Mathematical Methods for Physics and Engineering: A Comprehensive Guide, 3rd Edition, CUP, 2006, ISBN: 9780521679718.
This book is available to read online via the link Riley Textbook.
Another useful book that contains many good examples is also available in the library.
The algebra content of this course is very high and to accommodate this we will make use of the Greek alphabet. If you
are not familiar with this alphabet there is a list in Appendix A.
This is the first version of these notes; if you spot any typos or more serious errors, please let me know by emailing [email protected].
2 Functions
A function is a rule that relates an input to an output. The input to a function is called the argument and the output is
called the value.
Formally speaking, a function takes elements from a set (the domain) and relates them to elements in another set (the
codomain). All the outputs (the values related to) are together called the range of the function.
For a rule to define a function it must work for every possible input value in its domain, and it must return a single, unique output for each input.
When writing a function there are three important parts: the argument, the relationship and the output of the function.
Let’s look at an example.
f (x) = x2 , (2.1)
where f is the name of the function labelling the output, x is the argument and x^2 is the relationship, that is, what the
function does to the input. The action of the function is straightforward. For example, if we have an argument x = 2,
then the output is f (2) = 4.
Note: f is a common name for a function but we could have called it anything at all, for example we could have called
it g or h or helicopter. It does not matter, it is just a name. Equally, the argument does not have to be x, we could have
chosen b, q or telephone. The argument is just a placeholder. For example,
f (b) = b2 , (2.2)
is exactly the same function as the one in Equation 2.1. As above, if the argument b = 2, then the output is again
f (2) = 4. The argument just shows us where the input goes.
A function is also often written in the form
y = x^2, (2.3)
where the output is labelled y rather than f(x).
There are a large number of special types of function possessing very particular properties. For example the symmetry
of a function can play a crucial role in its operation. As an example a function may be even or odd or neither.
An even function is one like x^2 or cos x whose graph for negative x is just a reflection in the y-axis of its graph for positive x. Mathematically we can write this as
f(−x) = f(x).
An odd function is one like x or sin x, where the values of f(x) and f(−x) are negatives of each other. By definition
f(−x) = −f(x).
We will come back to even and odd functions later, in particular when we look at integrals over these functions.
2.2 Logarithms - a quick revision
You will have already come across Logarithms at school, but a brief review is given here to refresh your memory. Let’s
start with the following
x = b^p. (2.6)
We write this as
p = log_b x, (2.7)
where the base, b, can be any positive real number (more formally we say b ∈ ℝ with b > 0). Two common examples are
where b = 10 or where b = e, where e is a special constant with approximate value e = 2.71828 to 5 decimal places. This
constant plays a crucial role in Physics and so when it is used as the base in a Logarithm it gets its own special name.
The Log to the base e is referred to as the natural Logarithm and is often denoted by
p = ln x. (2.8)
It should be clear that log_b 0 = −∞, or more properly that log_b 0 is not defined. The reason is that if x = 0 then 0 = b^p
can only be satisfied if p = −∞. The real logarithm function is only defined for x > 0.
By definition we can then write x = b^(log_b x).
Another common base is 10 and you will often see a Log to the base 10 written without the base explicitly stated,
log10 x = log x. It is always important to make sure it is clear which base is being used.
There are a number of important rules associated with the manipulation of Logarithms.
Proof: Let x = b^n and y = b^m and so n = log_b x and m = log_b y. Consider the product of x and y,
xy = b^n b^m = b^(n+m) = b^(log_b x + log_b y). (2.12)
By definition we can also write
xy = b^(log_b xy). (2.13)
Comparing the exponents in Equations 2.12 and 2.13 we see that
log_b (xy) = log_b x + log_b y.
The second rule is
log_b (x/y) = log_b x − log_b y. (2.14)
Proof: Let x = b^n and y = b^m and so n = log_b x and m = log_b y. Consider the quotient of x and y,
x/y = b^n / b^m = b^n b^(−m) = b^(n−m) = b^(log_b x − log_b y). (2.15)
Proof: By definition we can write
b = a^(log_a b). (2.23)
A very common set of functions that we will see in this course are the trigonometric functions. For example you will be
familiar with sin, cos, tan. Most relationships between these functions can be derived from two identities (which we will
prove later in the course).
We also have a set of functions that are the reciprocal of the familiar trigonometric functions. These are
cosec(x) = 1/sin(x);   sec(x) = 1/cos(x);   cot(x) = 1/tan(x). (2.32)
As an example of a further identity we can start with Equation 2.30 and divide both sides by cos^2(A) to obtain
tan^2(A) + 1 = sec^2(A).
Many more identities can be derived in related ways; see the list in section 7.1 of the Mathematics Formula Booklet (online version here).
A further important property of cos x and sin x is that they can be expanded in terms of a polynomial, literally meaning
“many terms”. In our case these polynomials are a sum of terms depending on the variable x to increasingly higher powers.
Specifically
sin x = x − x^3/3! + x^5/5! − x^7/7! + ... = Σ_{n=0}^∞ (−1)^n x^(2n+1)/(2n+1)!, (2.34)
cos x = 1 − x^2/2! + x^4/4! − x^6/6! + ... = Σ_{n=0}^∞ (−1)^n x^(2n)/(2n)!, (2.35)
where the symbol ! means factorial, for example 4! = 4 × 3 × 2 × 1, and 0! = 1 by definition. These power series
expansions come from Taylor Series.
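As a quick added illustration (not part of the original notes), the truncated series can be compared numerically against the library value of sin x; a minimal Python sketch is:

import math

def sin_series(x, n_terms):
    # Partial sum of the Maclaurin series for sin x (Equation 2.34).
    return sum((-1)**n * x**(2*n + 1) / math.factorial(2*n + 1)
               for n in range(n_terms))

x = 0.5
for n_terms in (1, 2, 3, 4):
    approx = sin_series(x, n_terms)
    print(n_terms, approx, abs(approx - math.sin(x)))

Each extra term reduces the error rapidly, in line with the factorial growth of the denominators.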
In general, a power series is a sum of powers of the variable x, or more generally of (x − a) where a is a constant,
Σ_{n=0}^∞ a_n (x − a)^n = a_0 + a_1 (x − a) + a_2 (x − a)^2 + ... + a_n (x − a)^n + .... (2.37)
The value of x will determine whether the series converges and we often are tasked with finding the value of x for which
a series converges. For example, the power series
P(x) = 1 − x/2 + x^2/4 − x^3/8 + ... + (−x)^n/2^n + ...
can be tested for convergence by considering the absolute ratio of successive terms,
ρ_n = |(−x)^(n+1)/2^(n+1)| / |(−x)^n/2^n| = |x|/2,
and then
ρ = lim_{n→∞} ρ_n = |x|/2.
The series converges when ρ < 1, that is for |x| < 2, and diverges for |x| > 2. When x = 2 we get the series
S = 1 − 1 + 1 − 1 + ...,
which is not convergent, and when x = −2 we get
S = 1 + 1 + 1 + 1 + ...,
which again is not convergent. We can now say that the interval of convergence for the power series P(x) is −2 < x < 2.
When finding the interval of convergence one must always analyse whether the interval endpoint values lead to a convergent
series or not. We will come back to testing the convergence of series later in the term.
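As one further added example of the ratio test, consider the series Σ_{n=0}^∞ x^n/n!. Here

ρ_n = |x^(n+1)/(n+1)!| / |x^n/n!| = |x|/(n+1),   ρ = lim_{n→∞} ρ_n = 0 < 1 for every fixed x,

so the interval of convergence is the whole real line.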
It is very often more convenient to work with a power series representation of a function. For example, we will consider
the power series expansion of the function sin x, which we have already seen earlier in the course.
We first have to assume that such an expansion exists and then find the value of all the coefficients, an . Writing this
expansion we have
sin x = a0 + a1 x + a2 x2 + . . . + an xn + . . . . (2.38)
The interval of convergence of a power series of this form must contain x = 0. So we may set x = 0 in this expansion and
we learn that a_0 = 0. Next we differentiate both sides of equation 2.38 to find
cos x = a_1 + 2a_2 x + 3a_3 x^2 + ... + n a_n x^(n−1) + ....
Again putting x = 0 into this we find a_1 = 1. Repeating this procedure a further time we have
−sin x = 2a_2 + 6a_3 x + ...,
so that a_2 = 0, and so on for the higher coefficients.
A series obtained in this way is called a Maclaurin Series, or a Taylor Series expansion about the origin.
A Taylor series in general means a series of powers of (x − a), where a is some constant. It is found by writing (x − a)
instead of x on the right hand side of a power series expansion. To find the coefficients in this case we follow a similar
procedure as we did before but this time we set x = a to evaluate the coefficients.
To see this we can expand a general function of x as a Taylor Series. Consider the function f (x) and expand it around
the point x = a. We have then
We can now write the full Taylor series for f (x) about x = a:
f(x) = f(a) + (x − a) f′(a) + (1/2!)(x − a)^2 f″(a) + ... + (1/n!)(x − a)^n f^(n)(a) + .... (2.50)
The Maclaurin series for f(x) is the Taylor series about the origin. Putting a = 0 we obtain the Maclaurin series (or
Taylor Series expansion about x = 0) for f (x).
f(x) = f(0) + x f′(0) + (1/2!) x^2 f″(0) + ... + (1/n!) x^n f^(n)(0) + .... (2.51)
Note: The functions are differentiated first and then the value of x around which they are being expanded is inserted.
We have already seen some important Maclaurin series but they are collected here and should be memorised. You are
also expected to be able to derive them all.
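The collected list itself has not survived in this copy of the notes; for reference, a few standard Maclaurin series of the kind usually quoted (standard results, not necessarily the author's exact list) are

e^x = Σ_{n=0}^∞ x^n/n! = 1 + x + x^2/2! + x^3/3! + ...,
ln(1 + x) = x − x^2/2 + x^3/3 − ...   (|x| < 1),
(1 + x)^p = 1 + p x + p(p − 1) x^2/2! + ...   (|x| < 1),

together with the series for sin x and cos x in Equations 2.34 and 2.35.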
Let's look at an example. Expand f(x) = cos bx around x = 0, where b is a constant, including only the first 3 terms. Replacing x by bx in Equation 2.35 we have
cos bx ≈ 1 − (bx)^2/2! + (bx)^4/4!.
Notice the approximation sign; this is because we are only including the first three terms of what is an infinite series, and so our power series expansion is only an approximation of the full function.
We have seen that some functions (those that are differentiable) can be represented by an infinite series. It is often useful
to approximate this series and keep only the first n terms, with the other terms neglected.
If we do make this approximation we would like to know what size error we are making. We can rewrite the full Taylor series as
f(x) = f(a) + (x − a) f′(a) + (1/2!)(x − a)^2 f″(a) + ... + (1/(n−1)!)(x − a)^(n−1) f^(n−1)(a) + R_n(x), (2.58)
where
R_n(x) = ((x − a)^n / n!) f^(n)(η),
where η lies in the range [a, x]. Rn (x) is called the remainder term and represents the error in approximating f (x) by
the above (n − 1)th-order power series.
The value of η that satisfies the expression for R_n is not known; an upper limit on the error may be found by differentiating R_n with respect to η, equating the result to zero and solving for η in the usual way for finding a maximum.
Example: if we calculate the Taylor series for cos x about x = 0 but only include the first two terms we get
cos x ≈ 1 − x^2/2.
The remainder function in this case is
R_4(x) = (x^4/4!) cos η,
with η confined to the interval [0, x]. It is clear that the maximum value cos η can take is 1, so that the maximum error is x^4/4!.
If we take x = 0.5, taking just the first two terms yields cos(0.5) ≈ 0.875 with an indicated maximum error of 0.00260.
In fact using a calculator cos(0.5) = 0.87758 to 5 decimal places. To this accuracy the true error is 0.00258.
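The numbers quoted above are easy to reproduce; a short Python check (an addition to the notes) is:

import math

x = 0.5
approx = 1 - x**2 / 2                # two-term Taylor approximation of cos x
bound = x**4 / math.factorial(4)     # maximum possible size of the remainder R4
true_error = abs(math.cos(x) - approx)
print(approx, bound, true_error)     # 0.875, ~0.00260, ~0.00258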
Figure 3.1: The Step(x) function is an example of a function that is not continuous.
3 Differential calculus
You will have already seen many examples of differentiation and know what the derivatives of certain functions are. In this
section we will look at the formal definition of a derivative which will allow us to prove the results of the derivatives that
you have seen before.
First we must define a property of a function that will be useful in our description of differentiation. A function, f (x), is
continuous at x = a if f (x) → f (a) as x → a.
A simple example of a function that is not continuous is the step function, defined by
Step(x) = 1 if x > 0, and Step(x) = 0 if x < 0. (3.1)
A graphic visualisation of this function is displayed in Figure 3.1. It is clear that this function is not continuous at x = 0.
In particular, as we take the limit x → 0+, Step(x) → 1, in contrast to when we take the limit x → 0−, Step(x) → 0. Here
x → 0+ means that we take x to zero from positive numerical values of x with x → 0− meaning we take x to zero from
negative numerical values of x.
Figure 3.2: The Barrow triangle determining the gradient of the tangent to a curve. The graph shows that the gradient or slope of the function at P, given by tan θ, is approximately equal to ∆f/∆x.
We will now investigate what it means to take a derivative of a function. The derivative of a function is essentially the
gradient of the tangent to a curve (as described by the function) at an arbitrary input value (let’s call it x for now).
With this in mind consider Figure 3.2. The figure shows the plot of f , as a function of the input variable x. We wish to
calculate the gradient (that is, the slope) of the tangent to a curve at a particular value of x.
The tangent line (or simply tangent) to a plane curve at a given point is the straight line that “just touches” the curve
at that point (see green line in Figure 3.2). Leibniz defined it as the line through a pair of infinitely close points on the
curve.
Figure 3.2 shows that the gradient of the hypotenuse of the Barrow triangle is given by
[f(x + ∆x) − f(x)]/∆x = ∆f/∆x. (3.2)
This is not quite the gradient of the tangent to the curve at the point P, given by tan θ. However, it is clear that as we decrease ∆x the hypotenuse and the tangent to the curve become closer and closer. By taking the limit ∆x → 0 we recover the gradient of the tangent at P. Formally speaking, then, the derivative of f with respect to x is given by
df/dx ≡ lim_{∆x→0} [f(x + ∆x) − f(x)]/∆x = lim_{∆x→0} ∆f/∆x. (3.3)
This definition is only valid if the function has a unique value at the input value; that is, if you approach this point from any direction the value of the function is the same. Furthermore, if a function is differentiable it is also continuous. The reverse is not true: a function that is continuous is not necessarily differentiable.
Example 1: Find the derivative from first principles of f(x) = x^2 with respect to x.
Applying Equation 3.3 we have
df/dx = lim_{∆x→0} [(x + ∆x)^2 − x^2]/∆x = lim_{∆x→0} [2x∆x + (∆x)^2]/∆x = lim_{∆x→0} (2x + ∆x) = 2x.
So we conclude that if f(x) = x^2 then f′ = 2x, where f′ is a common shorthand for df/dx.
Example 2: Find the derivative from first principles of f (β) = cos β with respect to β.
We can expand cos(β + ∆β) using the compound angle formula in Equation 2.28 to rewrite this as
df/dβ = lim_{∆β→0} [cos β cos(∆β) − sin β sin(∆β) − cos β]/∆β.
The next step is to use the polynomial expansions for both sin and cos expressed in Equations 2.34 and 2.35 respectively. In particular, keeping only the first few terms, as higher order (higher power) terms will be smaller and smaller as we take ∆β to zero,
sin(∆β) = ∆β + ..., (3.7)
cos(∆β) = 1 − (∆β)^2/2! + ..., (3.8)
where ... represent higher order powers of ∆β. Putting these into our derivative we find
df/dβ = lim_{∆β→0} [cos β (1 − (∆β)^2/2!) − sin β ∆β − cos β]/∆β = lim_{∆β→0} [−cos β ∆β/2 − sin β] = −sin β.
Finally arriving at the result that the derivative of f(β) = cos β is
df/dβ = d(cos β)/dβ = −sin β.
It is important that you learn the following results for use in calculations. These results all follow from the technique outlined above and you should be able to derive all of them. However, unless you are asked explicitly to do so, it is sufficient just to state these results.
d/dx (x^n) = n x^(n−1),          d/dx (sin ax) = a cos ax,
d/dx (e^(ax)) = a e^(ax),        d/dx (cos ax) = −a sin ax,
d/dx (ln ax) = 1/x,              d/dx (tan ax) = a sec^2 ax,
d/dx (sec ax) = a sec ax tan ax, d/dx (cosec ax) = −a cosec ax cot ax,
where a and n are constants.
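These results can also be checked symbolically; the following short sketch (an addition, assuming the SymPy library is available) verifies a few of the table entries.

import sympy as sp

x, a, n = sp.symbols('x a n', positive=True)

checks = {
    x**n: n * x**(n - 1),
    sp.sin(a * x): a * sp.cos(a * x),
    sp.exp(a * x): a * sp.exp(a * x),
    sp.log(a * x): 1 / x,
}

for f, expected in checks.items():
    assert sp.simplify(sp.diff(f, x) - expected) == 0
print("derivative entries verified")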
We can also prove the product rule in a similar way. Let’s say we have a function f that is a product of two other
functions u(x) and v(x). Applying Equation 3.3 we have that
df/dx = lim_{∆x→0} [u(x + ∆x)v(x + ∆x) − u(x)v(x)]/∆x
      = lim_{∆x→0} [u(x + ∆x)(v(x + ∆x) − v(x)) + (u(x + ∆x) − u(x))v(x)]/∆x
      = lim_{∆x→0} [ u(x + ∆x) (v(x + ∆x) − v(x))/∆x + v(x) (u(x + ∆x) − u(x))/∆x ]
      = u(x) dv(x)/dx + v(x) du(x)/dx.
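A quick worked application of the product rule (added for illustration): taking u = x^2 and v = sin x,

d/dx (x^2 sin x) = x^2 d(sin x)/dx + sin x d(x^2)/dx = x^2 cos x + 2x sin x.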
3.3.2 Composite Function or Function of a Function or Implicit Differentiation
Consider a function g(x) and another function f(g(x)) which is a function of our original function g. We would like to
know how to take the derivative of f (g(x)). We already know the answer from school but let’s prove it using Equation 3.3.
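The working of this proof has not survived in this copy of the notes; the result is the familiar chain rule, quoted here together with a quick illustrative use:

d/dx f(g(x)) = (df/dg)(dg/dx),   e.g.   d/dx sin(x^2) = cos(x^2) · 2x.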
The Quotient rule does not really stand alone as a separate rule as it is a direct consequence of the product and chain
rule. Consider the product of two functions u(x) and 1/v(x). We can take the derivative of this product following the
product rule
d/dx (u · 1/v) = u d/dx(1/v) + (1/v) du/dx (3.14)
d/dx (u · 1/v) = −(u/v^2) dv/dx + (1/v) du/dx (3.15)
d/dx (u/v) = [v du/dx − u dv/dx] / v^2, (3.16)
which is the Quotient Rule.
In circumstances in which the variable with respect to which we are differentiating is an exponent, taking logarithms and
then differentiating implicitly is the simplest way to find the derivative.
Example: find the derivative of g(x) = a^x, where a is a constant.
Solution: First take natural logs of both sides and then differentiate implicitly:
ln g = ln a^x = x ln a,  ⇒  (1/g) dg/dx = ln a. (3.17)
Now simply rearrange and substitute in the original expression for g:
dg/dx = g ln a = a^x ln a. (3.18)
Leibniz’ theorem states the form of the nth order derivative of a product of functions. We know from above that the
derivative of a product of two functions, u ≡ u(x) and v ≡ v(x) is given by the product rule
d(uv)/dx = u dv/dx + v du/dx. (3.19)
Recall the binomial expansion,
(a + b)^n = a^n + n a^(n−1) b + ... + n a b^(n−1) + b^n = Σ_{r=0}^n (n r) a^(n−r) b^r, (3.22)
where the binomial coefficient is
(n r) = n! / [r! (n − r)!]. (3.23)
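The binomial coefficients above are exactly what appear in Leibniz's theorem; the general statement (a standard result, quoted here for completeness) and its n = 2 case are

d^n(uv)/dx^n = Σ_{r=0}^n (n r) (d^(n−r)u/dx^(n−r)) (d^r v/dx^r),   e.g. for n = 2:   (uv)″ = u″ v + 2 u′ v′ + u v″.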
In the preceding sections we have interpreted the derivative of a function as the gradient of the function at a particular
value of the input variable. If the gradient is zero for some particular value of x then the function is said to have a
stationary point there.
[Figure 3.3: a function f(x) with stationary points at x = a, x = b and x = c.]
In Figure 3.3, the function has three stationary points in the domain displayed at points x = a, x = b and x = c. In each
case it is clear that the gradient of the function at the three points is zero, corresponding graphically to horizontal lines.
A stationary point can be classified in three ways. It can be a minimum as at x = a where the function decreases either
side of that point, it can be classified as a maximum as at x = b where the function increases either side of that point or
it can be a point of inflection, as at x = c, where the gradient is zero but the function does not turn: here the function continues to decrease on both sides of the point.
The first two types, the maximum and the minimum, are commonly referred to as turning points.
We can define these stationary points mathematically. All stationary points have zero gradient and therefore
df/dx = 0. (3.25)
In the case of a minimum, the slope of the graph, that is df /dx, goes from negative for x < a to positive for a < x < b.
This means that d2 f /dx2 > 0 at x = a.
In the case of a maximum, the slope of the graph goes from positive for a < x < b to negative for b < x < c. This means
that d2 f /dx2 < 0 at x = b.
In the case of a point of inflection, from the left of x = c the slope becomes less negative as we increase x towards x = c, and hence d^2f/dx^2 > 0. To the right of x = c, however, the slope becomes increasingly negative again, so that d^2f/dx^2 < 0. It is not obvious, but this means that at x = c, d^2f/dx^2 = 0.
• for a maximum: df/dx = 0, d^2f/dx^2 < 0,
• for a minimum: df/dx = 0, d^2f/dx^2 > 0,
• for a point of inflection: df/dx = 0, d^2f/dx^2 = 0 and d^2f/dx^2 changes sign through the point.
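As a short added example of applying these criteria, consider f(x) = x^3 − 3x:

f′(x) = 3x^2 − 3 = 0  ⇒  x = ±1,   f″(x) = 6x,   f″(1) = 6 > 0 (a minimum),   f″(−1) = −6 < 0 (a maximum).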
In mathematics, a singularity is in general a point at which a given mathematical object is not defined, or a point of
an exceptional set where it fails to be well-behaved in some particular way, such as differentiability. In particular, the
derivative of a function at a singular point is not well behaved.
An inverse function is a function that “reverses” the action of another function. For example, if a function f applied to
an input variable x gives the output y, then applying the inverse function, g to y gives x.
Often the inverse of a function is written as f −1 (y). (Note the argument of the inverse function does not have to be y,
as we have learnt earlier the symbol used to represent the argument of a function is not important, it could be anything,
for example we could use x or t or anything you can think of).
Example: find the inverse of the function
f(x) = 3x + 2.
To do this we follow some simple steps. First replace f(x) by some other variable, let's choose y, so that y = 3x + 2. Now we solve for x in terms of y to get
x = (y − 2)/3.
We have now found the inverse function and can relabel as
g(y) = (y − 2)/3.
Or
f −1 (x) = (x − 2)/3.
Another useful example is converting between Fahrenheit and Celsius. To convert Fahrenheit to Celsius
f(F) = (5/9)(F − 32),
and the inverse, that is converting Celsius to Fahrenheit, is
f^(−1)(C) = (9/5) C + 32.
Recall that the definition of a function is that for any input value within a particular domain, the function must return a unique answer. This is also true for inverse functions, as they are themselves functions. One must therefore be careful
with certain functions. For example, the function
f (x) = x2
only has an inverse function provided the domain of the function is carefully chosen. For example, we can find the inverse
of the function f (x) = x2 for the domain x > 0. The inverse function is straightforward to find and is given by
f^(−1)(y) = y^(1/2).
If however we choose to find the inverse function of g(x) = x2 for the domain x < 0 then the inverse function in this case
is
g^(−1)(y) = −y^(1/2).
The inverses of trigonometric functions also come with a domain warning. They are usually written sin^(−1) y (or arcsin y), cos^(−1) y (or arccos y) and tan^(−1) y (or arctan y). Since the trigonometric functions are periodic, many values of x give the same value of y, and hence the inverse x = sin^(−1) y can take an infinite number of values for a given y. In order to overcome this, we can define a domain over which the inverse is single valued. These are referred to as the principal branches:
x = arcsin y,  −π/2 ≤ x ≤ π/2,
x = arccos y,  0 ≤ x ≤ π,
x = arctan y,  −π/2 < x < π/2.
Example: f(x) = x^3 and its inverse function g(y) = y^(1/3). We have f′(x) = 3x^2 and g′(y) = (1/3) y^(−2/3) = (1/3) x^(−2), where in the last step we have set y = x^3 as per the definition of the original function. It is clear that the product f′(x) g′(y) = 1; that is, the derivative of the inverse function is the reciprocal of the derivative of the original function.
[Figure 3.4: the surface f(x, y), with the gradients (∂f/∂x)_y and (∂f/∂y)_x indicated at a point on the surface.]
We can generalise everything we have discussed so far to more than one variable. In PH1120 you will see a full introduction
to partial differentiation but we will have a short discussion here.
Partial derivatives arise when a function depends on more than one variable, for example f ≡ f(x, y). The function f(x, y) defines a surface in 3-d and we can calculate the slopes at points on the surface in the x and y directions.
The partial derivative of f with respect to x, holding y constant, is defined as
(∂f/∂x)_y = lim_{∆x→0} [f(x + ∆x, y) − f(x, y)]/∆x; (3.26)
said another way, it is the rate of change of f due to an infinitesimal change in x whilst keeping y constant.
Similarly, the partial derivative of f with respect to y, holding x constant, is
(∂f/∂y)_x = lim_{∆y→0} [f(x, y + ∆y) − f(x, y)]/∆y; (3.27)
said another way, it is the rate of change of f due to an infinitesimal change in y whilst keeping x constant.
In Figure 3.4 we have an example of a function f (x, y). This function forms a surface in 3-D. Now we have a surface we
need to specify along which directions we are calculating the gradient of the slope of the surface. For example, taking
point P ≡ (x0 , y0 ) on the surface we can calculate the gradient in the y-direction at a constant value of x = x0 . This
gradient line is labelled (∂f /∂y)x and is parallel to the y-axis and perpendicular to the x-axis.
Similarly we can calculate the gradient in the x-direction at a constant value of y = y0 . This gradient line is labelled
(∂f /∂x)y and is parallel to the x-axis and perpendicular to the y-axis.
Example: find the partial derivatives f_x ≡ (∂f/∂x)_y and f_y ≡ (∂f/∂y)_x of f(x, y) = x^2 + 3xy + y^2.
Solution: Calculate f_x first. We are taking the derivative of f with respect to x keeping y constant. We have then
f_x = 2x + 3y.
Similarly, we calculate f_y by taking the derivative of f with respect to y at constant x. We have then
f_y = 2y + 3x.
Example: find f_V and f_P for f(V, P) = sin(V P) + V^2.
Solution: The two independent variables we have are V and P. To calculate f_V we take the derivative of f with respect to V while holding P constant. We have
f_V = P cos(V P) + 2V.
Similarly we can calculate f_P by taking the derivative of f with respect to P while holding V constant:
f_P = V cos(V P).
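Both sets of partial derivatives can be checked symbolically; a minimal sketch (an addition, assuming SymPy is available) is:

import sympy as sp

x, y, V, P = sp.symbols('x y V P')

f1 = x**2 + 3*x*y + y**2        # function of the first example
f2 = sp.sin(V * P) + V**2       # function of the second example

print(sp.diff(f1, x), sp.diff(f1, y))   # 2*x + 3*y and 3*x + 2*y
print(sp.diff(f2, V), sp.diff(f2, P))   # P*cos(P*V) + 2*V and V*cos(P*V)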
Up to now you will be used to describing a position in 3-D space using Cartesian coordinates, (x, y, z). For certain
problems it is more convenient to use either circular, cylindrical or spherical coordinate systems to specify the point in
space.
The simplest example is where the system in which we are interested possesses a circular symmetry in two dimensions. In this case we can exchange the usual Cartesian coordinates, x and y, for circular polars, which are defined by
x = r cos ϕ,   y = r sin ϕ,   r = √(x^2 + y^2),   tan ϕ = y/x,
where 0 ≤ r ≤ ∞ and 0 ≤ ϕ ≤ 2π. This relationship is demonstrated graphically in Figure 4.1, where the point P can
equally well be described in terms of the Cartesian coordinates x and y and the circular polar coordinates r and ϕ.
An example of the usefulness of this change of coordinate system is where we are considering a set of points, P_i, that map out the edge of a circle. To write down the coordinates of these points in terms of x and y is extremely cumbersome, whereas using r and ϕ it is straightforward.
Moving to 3 dimensions, in Physics and Engineering we are often presented with a problem involving a system possessing
a cylindrical symmetry, such as a wire with a current passing through it. In this case it is convenient to use cylindrical
coordinates, which are an obvious extension of circular polars and are defined by
x = r cos ϕ,   y = r sin ϕ,   z = z,   r = √(x^2 + y^2),   tan ϕ = y/x,
[Figures 4.2 and 4.3: cylindrical polar coordinates (r, ϕ, z) and spherical polar coordinates (r, θ, ϕ) of a point P.]
Finally the last example is where the system in which we are interested has a spherical symmetry. For example, the
points on the surface of a ball are most easily described by spherical polars rather than Cartesian coordinates.
This time the components of the spherical geometry are r, ϕ, θ and they are related to x, y, z via
x = r sin θ cos ϕ,   y = r sin θ sin ϕ,   z = r cos θ,
r = √(x^2 + y^2 + z^2),   tan ϕ = y/x,   cos θ = z/√(x^2 + y^2 + z^2),
where 0 ≤ r ≤ ∞, 0 ≤ ϕ ≤ 2π and 0 ≤ θ ≤ π.
This relationship is demonstrated graphically in Figure 4.3, where the point P can equally well be described in terms of the Cartesian coordinates x, y, z and the spherical polar coordinates r, θ, ϕ.
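A small Python sketch (an addition to the notes) converting a Cartesian point to spherical polars and back, using the relations above:

import math

def to_spherical(x, y, z):
    r = math.sqrt(x**2 + y**2 + z**2)
    theta = math.acos(z / r)       # polar angle, 0 <= theta <= pi
    phi = math.atan2(y, x)         # azimuthal angle, tan(phi) = y/x
    return r, theta, phi

def to_cartesian(r, theta, phi):
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

print(to_cartesian(*to_spherical(1.0, 2.0, 3.0)))   # recovers (1.0, 2.0, 3.0) up to rounding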
5 Integration
Figure 5.1: Representation of the integral of the function f (x) with respect to x from x = a to x = b as the area under
the curve (shaded region).
The idea of integration should already be familiar to you as the area under a curve. In Figure 5.1 the definite integral
over f (x) from x = a to x = b is represented as the area under the curve (of f (x)) between x = a and x = b. The value
of this integral is written as
I = ∫_a^b f(x) dx. (5.1)
We can derive the form of an integral by subdividing the finite interval a ≤ x ≤ b into a large number of small segments
of width ∆x as depicted in Figure 5.2. In Figure 5.2a these segments are constructed such that the area is overestimated
whereas in Figure 5.2b they are formed such that the area is underestimated.
Define the endpoint of the ith segment as x_i. In the case where we overestimate the area, the value of the function at x_i is labelled M_i, and where we underestimate the area, the value of the function at x_i is labelled m_i.
Figure 5.2: Integration from first principles. The area under a curve can be estimated by adding the areas of a large
number of rectangular segments between a and b. These rectangles can be constructed either so that we always overestimate the area beneath the curve (a) or so that we always underestimate it (b).
In each case we can calculate the area under the curve by adding the areas of all n segments together. For the overestimate we have
S = M_1 ∆x + M_2 ∆x + ... + M_n ∆x = Σ_{i=1}^n M_i ∆x,
and for the underestimate
S̄ = m_1 ∆x + m_2 ∆x + ... + m_n ∆x = Σ_{i=1}^n m_i ∆x.
It is clear that
S ≥ S̄.
In both cases the number of segments between the limits a and b is given by
n = (b − a)/∆x
If we were to increase the number of segments, that is increase the number n, then the value of the underestimate would
increase whereas the value of the overestimate would decrease.
It is important to note that S is bounded from below and S̄ is bounded from above, such that
S ≥ I ≥ S̄,
where I is the true area under the curve, and the two bounds approach each other as ∆x → 0.
This suggests the following definition of a definite integral.
Definite integral of a continuous and non-negative function: Let f(x) be a continuous non-negative function on the closed interval a ≤ x ≤ b. If x_i is any point in the ith segment of length ∆x, the definite integral of f(x) over the interval [a, b], written symbolically as
I = ∫_a^b f(x) dx,
is defined to be
I = ∫_a^b f(x) dx = lim_{n→∞} Σ_{i=1}^n f(x_i) ∆x. (5.2)
Example: evaluate I = ∫_a^b x^2 dx from first principles.
Solution: As the function is continuous and non-negative over the domain in which we are interested, we start by considering a convenient partition in which [a, b] is divided into n equal sub-intervals, each of length
∆x = (b − a)/n.
To do this we use the definition in Equation 5.2. We first identify x_i with the right hand end point of the ith segment so that we have
x_1 = a + ∆x, x_2 = a + 2∆x, x_3 = a + 3∆x, ..., x_n = a + n∆x = b.
Using the fact that ∆x = (b − a)/n and that
1 + 2 + 3 + ... + n = n(n + 1)/2
and
1^2 + 2^2 + 3^2 + ... + n^2 = n(n + 1)(2n + 1)/6,
it follows that
I = lim_{n→∞} [ a^2 (b − a) + a(b − a)^2 n(n + 1)/n^2 + (b − a)^3 (n + 1)(2n + 1)/(6n^2) ]
and so
∫_a^b x^2 dx = (1/3)(b^3 − a^3).
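The limiting process can be illustrated numerically; the sketch below (an addition) evaluates the right-endpoint sum of Equation 5.2 for f(x) = x^2 on [1, 2] and compares it with (b^3 − a^3)/3 = 7/3.

def riemann_sum(f, a, b, n):
    # Right-endpoint Riemann sum with n equal segments of width dx.
    dx = (b - a) / n
    return sum(f(a + i * dx) * dx for i in range(1, n + 1))

exact = (2**3 - 1**3) / 3
for n in (10, 100, 1000, 10000):
    print(n, riemann_sum(lambda x: x**2, 1, 2, n), exact)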
You will see in PH1120 lots of uses for integrals beyond the calculation of the area under a curve: for example, the computation of the arc length of a curve (a line integral), and the computation of surface and volume integrals when we go to more dimensions. All to come next term.
Most functions have positive and negative values in their domain of definition, so the notion of a definite integral as formulated so far may seem rather restrictive, due to the assumption that the integrand must be an essentially positive quantity.
However, nothing in the argument used requires the upper or lower sums (S, S̄) or the terms comprising them to be
non-negative. Since a term in either of these sums will be negative when mi or Mi is negative, that is when f (xi ) is
negative, it follows that the interpretation of the definite integral as an area may be extended to continuous functions
which assume negative values provided that areas below the x-axis are regarded as negative.
Thus using this convention we may modify the definition of a definite integral as an area by removing the condition on
the integrand being non-negative. This modification simply amounts to the deletion of the word non-negative such that
we have the modified definition.
Definite integral of a continuous function: Let f(x) be a continuous function on the closed interval a ≤ x ≤ b. If x_i is any point in the ith segment of length ∆x, the definite integral of f(x) over the interval [a, b], written symbolically as
I = ∫_a^b f(x) dx,
is defined to be
I = ∫_a^b f(x) dx = lim_{n→∞} Σ_{i=1}^n f(x_i) ∆x. (5.3)
Let f (x) and g(x) be continuous functions defined on the closed interval a ≤ x ≤ b, and let c be a constant and k be such
that a ≤ k ≤ b. Then
a) ∫_a^b f(x) dx = ∫_a^k f(x) dx + ∫_k^b f(x) dx (additivity with respect to the interval of integration),
b) ∫_a^b c f(x) dx = c ∫_a^b f(x) dx (homogeneity),
c) ∫_a^b (f(x) + g(x)) dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx (linearity).
5.3 Integration as the inverse of Differentiation
[Figure 5.3: the area under f(x) from a up to a variable upper limit, split into the area up to x and the strip between x and x + ∆x.]
The definite integral has been defined as the area under a curve between two fixed limits. Now consider the integral
F(x) = ∫_a^x f(u) du, (5.4)
in which the lower limit a remains fixed but the upper limit x is now a variable. It should be clear that this form is
essentially a restatement of Equation 5.1 but where the variable x in the integrand has been replaced by a new variable
u. It is conventional to rename the dummy variable in the integrand in this way in order that the same variable does
not appear in both the integral and the integration limits.
To make the link between the form of the integral in Equation 5.4 and the assertion that integration is the inverse process
to differentiation we consider Equation 5.4 but shift x to x + ∆x such that
F(x + ∆x) = ∫_a^(x+∆x) f(u) du. (5.5)
We can split an integral into different segments. For example, we can write
F(x + ∆x) = ∫_a^x f(u) du + ∫_x^(x+∆x) f(u) du = F(x) + ∫_x^(x+∆x) f(u) du, (5.6)
where intuitively in terms of the area under a curve we are just splitting the area in two parts as shown in Figure 5.3.
Rearranging Equation 5.6 and dividing through by ∆x gives
[F(x + ∆x) − F(x)]/∆x = (1/∆x) ∫_x^(x+∆x) f(u) du. (5.7)
Taking the limit ∆x → 0, the left hand side becomes dF/dx, while the integral on the right hand side is, to leading order, f(x)∆x, so that
dF(x)/dx = f(x). (5.8)
In the penultimate step we have appealed to our understanding of how a definite integral is constructed in terms of narrow segments of width ∆x. Combining Equations 5.4 and 5.8 we can also write
(d/dx) ∫_a^x f(u) du = f(x). (5.9)
From these last two equations we can see that integration is the inverse of differentiation.
We see however, that the lower limit in the above, a, is arbitrary and so differentiation does not have a unique inverse.
Any function F (x) obeying Equation 5.4 is called an indefinite integral of f (x).
Consider the function F(x) = x^3 + 8. Suppose we write down its derivative as f(x), that is f(x) = dF(x)/dx. For our example we can straightforwardly calculate the derivative as
f(x) = dF(x)/dx = 3x^2.
Suppose now that we want to work in the opposite direction, that is, we want to find the function that has a derivative
equal to f (x) = 3x2 . Clearly one answer is x3 + 8. We say that F (x) = x3 + 8 is an anti-derivative of f (x) = 3x2 .
There are however, many other functions that have the derivative 3x2 , all related by a constant. E.g. x3 + 2, x3 − 200, x3 ,
etc. The reason of course is that the constant term disappears during differentiation. This is exactly the arbitrariness of
the lower limit in Equation 5.4.
In general therefore, a function F (x) is an anti-derivative of f (x) if dF/dx = f (x). If F (x) is an anti-derivative of f (x)
then so too is F (x) + C, where C represents any constant.
As a consequence we must allow for this when we calculate the indefinite integral and include an additional constant
∫^x f(u) du = ∫ f(x) dx = F(x) + C, (5.10)
where C is an arbitrary constant of integration.
It should also be noted that the definite integral between x = a and x = b can be written in terms of F(x):
∫_a^b f(x) dx = ∫_{x0}^b f(x) dx − ∫_{x0}^a f(x) dx = F(b) − F(a), (5.11)
where x_0 is any third fixed point. Using the notation F′(x) = dF/dx = f(x),
∫_a^b F′(x) dx = F(b) − F(a) = [F]_a^b. (5.12)
There are a number of integration results that you should learn off by heart. For example the following are important
ones to remember, however you must still be able to derive them when asked.
f(x)      F(x)
a x^n     a x^(n+1)/(n+1)
sin ax    −(1/a) cos ax
cos ax    (1/a) sin ax
e^(ax)    (1/a) e^(ax)
1/x       ln x

Table 1: Common integration results for ∫ f(x) dx = F(x) + C, where C is an arbitrary constant. For more see the Formula Booklet section 11.1. Note these are useful to remember but you must be able to derive them if asked.
Sometimes it is possible to make a substitution of variables that turns a complicated integral into a simpler one, which
can be integrated by a standard method. There are many useful substitutions and knowing which to use is a matter of
experience. A list of these is given in the Formula Booklet 11.2. As always the list is for reference but you should learn
to use these substitutions and be able to choose which to use.
We will now go through a number of examples for indefinite integrals and later go through the definite integrals.
Example 1:
I = ∫ 1/√(a^2 − x^2) dx. (5.13)
Make the substitution x = a sin u, so that
dx/du = a cos u, → dx = a cos u du. (5.14)
Putting these into I we have
I = ∫ [1/(a√(1 − sin^2 u))] a cos u du = ∫ (1/√(cos^2 u)) cos u du = ∫ du = u + c, (5.15)
and substituting back u = sin^(−1)(x/a) gives I = sin^(−1)(x/a) + c.
Example 2: Consider integrals of the form
I = ∫ 1/(a + b cos x) dx   or   I = ∫ 1/(a + b sin x) dx. (5.17)
In these cases, making the substitution t = tan x/2 yields integrals that can be solved more easily.
We first note that if t = tan x/2 we can find expressions for sin x/2 and cos x/2. To show this, we start with the identity
1 = cos^2(x/2) + sin^2(x/2);
now divide both sides by cos^2(x/2) to get
1/cos^2(x/2) = 1 + tan^2(x/2) = 1 + t^2.
Rearranging we find
cos(x/2) = 1/√(1 + t^2)
and
sin(x/2) = cos(x/2) tan(x/2) = t/√(1 + t^2).
Hence
cos x = cos^2(x/2) − sin^2(x/2) = (1 − t^2)/(1 + t^2), (5.18)
sin x = 2 sin(x/2) cos(x/2) = 2t/(1 + t^2). (5.19)
Before we apply the change of variable t = tan(x/2) to the integral I we also note that
dt/dx = (1/2) sec^2(x/2) = (1/2)(1 + tan^2(x/2)) = (1/2)(1 + t^2),
leading to
dx = 2/(1 + t^2) dt.
Now let’s look at a specific example (we can do the general case but we leave that to an optional problem set challenge).
Example 2a:
I = ∫ 1/(1 + 3 cos x) dx. (5.20)
Now we apply the change of variable t = tan(x/2), cos x = (1 − t^2)/(1 + t^2), dx = 2/(1 + t^2) dt:
I = ∫ [1/(1 + 3(1 − t^2)(1 + t^2)^(−1))] · 2/(1 + t^2) dt = ∫ 2/(1 + t^2 + 3(1 − t^2)) dt = ∫ 2/(4 − 2t^2) dt
  = ∫ 2/[(2 − √2 t)(2 + √2 t)] dt = ∫ (1/2) [1/(2 − √2 t) + 1/(2 + √2 t)] dt.
Considering the first term,
I_a = ∫ 1/(2 − √2 t) dt = −∫ 1/(√2 u) du = −(1/√2) ln u = −(1/√2) ln(2 − √2 t), (5.21)
where we have utilised the substitution u = 2 − √2 t to do this integral. Similarly for the second term,
I_b = ∫ 1/(2 + √2 t) dt = ∫ 1/(√2 u) du = (1/√2) ln u = (1/√2) ln(2 + √2 t), (5.22)
so that
I = −(1/(2√2)) [ln(2 − √2 t) − ln(2 + √2 t)], (5.23)
with t = tan(x/2).
A further example: calculate
I = ∫ (3x^2 − sin x)/(x^3 + cos x) dx.
We can do this by substitution: let u = x^3 + cos x, then du = (3x^2 − sin x) dx. Applying these to the integral we have
I = ∫ (1/u) du = ln u + c = ln[x^3 + cos x] + c.
Through this example we can see a pattern: if the numerator contains the derivative of the denominator (potentially multiplied by a constant) we know that the integral will be the ln of the denominator. That is,
∫ f′(x)/f(x) dx = ln f(x) + c. (5.24)
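For instance (an added illustration of Equation 5.24), since −sin x is the derivative of cos x,

∫ tan x dx = ∫ sin x / cos x dx = −∫ (−sin x)/cos x dx = −ln(cos x) + c.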
5.5.2 Solving Differential Equations with Integration
Later in the course we will see differential equations in full, but for now it is worth exploring solving first order (that is
one derivative) differential equations of the form
df (x)
= g(x), (5.25)
dx
where g(x) is some function of x. Usually we are also given some information about f (x) in the form of its value at a
particular point in x. For example, we may have that at x = a, f (a) = b.
We already know how to calculate f (x), as f (x) is just the anti-derivative of g(x) in the language we have used above.
To solve this equation, that is find the functional form of f (x) we can simply integrate both sides with respect to x.
∫ (df(x)/dx) dx = ∫ df(x) = f(x) = ∫ g(x) dx. (5.26)
As a concrete example, consider df/dx = x with the condition f(a) = b. Integrating indefinitely gives f(x) = x^2/2 + C, and applying the condition gives C = b − a^2/2. Alternatively, we can integrate both sides between the fixed point x = a and the variable point x,
∫_a^x (df/dx′) dx′ = ∫_b^{f(x)} df = f(x) − b = ∫_a^x x′ dx′,
where we have distinguished the dummy variable in the integral and the upper limit x by putting a prime on the dummy variable, and where we have effectively changed variables from x to f so that the limits changed accordingly. The right hand side of our integral is then
f(x) − b = ∫_a^x x′ dx′ = [x′^2/2]_a^x = x^2/2 − a^2/2,
leading to
f(x) = x^2/2 + b − a^2/2,
as we had before.
When we change variables in a definite integral we have to take care of the limits. For definite integrals we have to do three things when we change variables: first, we must replace the variable in the integrand; second, we must change the infinitesimal width dx; and lastly we must recalculate the limits.
Note that you do not need to re-express the answer in terms of x; we can just evaluate the integral directly with u and the modified limits.
Integration by parts follows from the product rule,
d(uv)/dx = u dv/dx + v du/dx. (5.30)
Integrating both sides with respect to x and rearranging gives
∫ u (dv/dx) dx = uv − ∫ v (du/dx) dx. (5.32)
Example 1: Calculate
I = ∫_0^π x sin x dx.
Using Equation 5.32 with u = x and dv/dx = sin x, so that du/dx = 1 and v = −cos x, we have
I = [−x cos x]_0^π + ∫_0^π cos x dx = π + [sin x]_0^π = π.
Example 2: Calculate
I = ∫ ln x dx.
In this case we know that the integral of ln x is not a standard result. We can use integration by parts to perform this integral by letting u = ln x and dv/dx = 1 so that, using Equation 5.32, we have
I = x ln x − ∫ x (1/x) dx = x ln x − x + c. (5.33)
In PH1120 you will see this topic again in a slightly more complicated set-up. Here we will consider the derivative of an
integral where the limits are constants (in PH1120 the limits will be functions of x.).
It can sometimes happen that an integrand, in addition to being a function of x, also depends on a parameter, say α. So, for example, if f(x, α) is both integrable with respect to x over the interval [a, b] and differentiable with respect to α, then
(d/dα) ∫_a^b f(x, α) dx = ∫_a^b (∂f/∂α) dx. (5.34)
Example: Find the derivative with respect to α of the integral I(α) = ∫_a^b e^(αx) dx, where a and b are constants.
(d/dα) I(α) = (d/dα) ∫_a^b e^(αx) dx = ∫_a^b d(e^(αx))/dα dx = ∫_a^b x e^(αx) dx.
We can define the following functions, which are called hyperbolic functions:
sinh x = (e^x − e^(−x))/2,
cosh x = (e^x + e^(−x))/2.
Note that cosh x is an even function and sinh x is an odd function, just like their trigonometric relations.
tanh x = sinh x / cosh x = (e^x − e^(−x))/(e^x + e^(−x)),
sech x = 1/cosh x = 2/(e^x + e^(−x)),
cosech x = 1/sinh x = 2/(e^x − e^(−x)), (5.35)
coth x = 1/tanh x = (e^x + e^(−x))/(e^x − e^(−x)).
We can connect hyperbolic functions with their trigonometric counterparts. We can rewrite sin x and cos x in terms of exponentials as
cos x = (e^(ix) + e^(−ix))/2,
sin x = (e^(ix) − e^(−ix))/(2i).
Replacing x by ia, where a is real, we have
cos(ia) = (e^(−a) + e^(a))/2 = (e^a + e^(−a))/2 = cosh a, (5.36)
sin(ia) = (e^(−a) − e^(a))/(2i) = (i/2)(e^a − e^(−a)) = i sinh a.
More generally,
cosh x = cos(ix),
i sinh x = sin(ix),
cos x = cosh(ix), (5.37)
i sin x = sinh(ix).
These relationships lead to a number of similarities between hyperbolic and trigonometric functions in particular in their
calculus and identities.
We can find a number of hyperbolic analogues of the trigonometric identities. For example we can calculate the hyperbolic
analogue of cos^2 x + sin^2 x = 1 by using the relationships in Equation 5.37; that is, replacing x with ix we get
sin^2(ix) = −sinh^2 x
and
cos^2(ix) = cosh^2 x,
so that cos^2(ix) + sin^2(ix) = 1 becomes
cosh^2 x − sinh^2 x = 1.
Other identities that follow in the same way include
sech^2 x = 1 − tanh^2 x,
cosech^2 x = coth^2 x − 1,
sinh 2x = 2 sinh x cosh x, (5.38)
cosh 2x = cosh^2 x + sinh^2 x.
Just like trigonometric functions, hyperbolic functions have inverses, and we can find closed-form expressions for them. Writing x = sinh y and solving for y, we find
y = sinh^(−1) x = ln(√(1 + x^2) + x),
and similarly
tanh^(−1) x = (1/2) ln[(1 + x)/(1 − x)].
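The intermediate steps leading to the first of these results have not survived here; they run along the following standard lines, writing x = sinh y and solving a quadratic in e^y:

x = (e^y − e^(−y))/2  ⇒  e^(2y) − 2x e^y − 1 = 0  ⇒  e^y = x + √(x^2 + 1)  ⇒  y = sinh^(−1) x = ln(√(1 + x^2) + x),

where the positive root is taken since e^y > 0.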
We have just seen that the identities of hyperbolic functions are closely related to the identities of their trigonometric
counterparts. Their calculus is also closely related as we will see now.
The derivatives of the two basic hyperbolic functions are
d/dx (sinh x) = cosh x, (5.40)
d/dx (cosh x) = sinh x. (5.41)
Notice the lack of a minus sign in the second expression. We can verify these derivatives by rewriting the hyperbolic
function in terms of the exponential forms. For example
d/dx (sinh x) = d/dx [(e^x − e^(−x))/2] = (1/2)(e^x + e^(−x)) = cosh x.
Similarly one can show that
d/dx (tanh x) = sech^2 x, (5.42)
d/dx (sech x) = −sech x tanh x, (5.43)
d/dx (cosech x) = −cosech x coth x, (5.44)
d/dx (coth x) = −cosech^2 x. (5.45)
These definitions are also useful in integration. For example, consider
I = ∫ 1/√(x^2 + a^2) dx. (5.47)
The most convenient substitution in this case is x = a sinh u. Then dx = a cosh u du and
I = ∫ a cosh u / √(a^2(1 + sinh^2 u)) du = ∫ (a cosh u)/(a cosh u) du = ∫ du = u + c = sinh^(−1)(x/a) + c, (5.48)
where we have used cosh^2 u − sinh^2 u = 1.
6 Differential Equations
A great many applied problems in Mathematics involve rates, that is derivatives and the way in which a quantity changes
as a function of an independent variable, such as time. Let us look at a few examples.
Newton’s second law is written as F = ma. If we write the acceleration as dv/dt, where v is the velocity, or as d2 x/dt2 ,
where x is the displacement we have a differential equation. Thus any mechanics problem in which we want to describe
the motion of a body under the action of a given force, involves the solution to a differential equation.
A further example is to consider a simple electrical circuit containing a resistor, R, a capacitor, C, an inductance, L, and a source of emf V. If the current flowing around the circuit at a time t is I(t) and the charge on the capacitor is q(t), then I = dq/dt. The voltage across R is RI, the voltage across C is q/C, and the voltage across L is L(dI/dt). Then at any time we have
L dI/dt + RI + q/C = V.
Differentiating with respect to t, and using I = dq/dt, gives
L d^2I/dt^2 + R dI/dt + I/C = dV/dt
as the differential equation satisfied by the current I in a simple series circuit with given L, R and C, and a given V(t).
An equation containing derivatives is called a differential equation. For us in this course we will only consider ordinary
differential equations, that is, a differential equation that contains derivatives with respect to a single variable. In PH1120
you will see partial differential equations which contain partial derivatives.
Ordinary differential equations (ODEs) may be categorised by their general characteristics. One particularly important property of an ODE is its order (or degree), which is given by the order of the highest derivative in the differential equation.
Some examples
L d^2I/dt^2 + R dI/dt + I/C = dV/dt   (2nd order) (6.1)
d^3y/dx^3 + x d^2y/dx^2 + y = e^x   (3rd order) (6.2)
dy/dx + xy = 0   (1st order). (6.3)
In these differential equations I and y are the dependent variables. Notice that these equations only contain one factor
of the dependent variables or their derivatives. These differential equations are said to be linear ODEs. Examples of
non-linear differential equations are
y dy/dx + xy^2 = 0, (6.4)
dy/dx = cot y. (6.5)
Both the terms in equation 6.4 are non-linear, whereas the cot y term means equation 6.5 is also non-linear (think of the
Maclaurin series of cot y). We will come back to what we mean by linear later (and you will have already seen properties
of linear equations in the other half of PH1110). To summarise, a linear ODE with x the independent variable and y the
dependent variable is
a_0 y + a_1 y′ + a_2 y″ + a_3 y‴ + ... = b,
where the coefficients ai and b are either constants or functions of x and the primes denote differentiation with respect
to the independent variable, which in this case is x.
In the next few sections we will see examples of ODEs and how to solve them.
We have already seen an example of how to solve a first order ODE in section 5.5.2.
A solution to a differential equation (in the dependent and independent variables, e.g. y and x respectively)
is a relation between y and x which, if substituted into the differential equation, gives an identity.
It is useful to consider an example to get an idea of the components of the most general solution to an ODE. For example,
y = sin x + C, (6.6)
is a solution of the first order ODE
y′ = cos x, (6.7)
because if we substitute equation 6.6 into the differential equation we get the identity cos x = cos x.
We might expect that this pattern continues with higher and higher order derivatives; for example, for the nth order version y^(n) = h(x), the solution y will contain n arbitrary constants.
Example 1 above showed that the solution to the first order ODE contained one arbitrary constant, C, and in Example 2 there was a solution that contained two arbitrary constants, A and B.
Importantly, any linear differential equation of order n has a solution containing n arbitrary constants. This solution is
called the General Solution of the linear ODE.
An important check when you think you have found the solution to an ODE is to substitute the solution back into the
equation to confirm that it is indeed a solution.
Once we have the general solution to our ODE and we then want to use it to describe some phenomenon, then we will
need to establish what the values of the arbitrary constants are.
We can do this by using Initial Conditions or Boundary Conditions. These are constraints on the solution to allow us to
determine the arbitrary constant. An example may be that we find the solution to a first order ODE as
y(t) = t^2 + A,
and we are given the initial condition y(0) = 0, that is, at t = 0, y = 0. If we apply this to the solution above we find that A = 0.
In general, the general solution to an nth order ODE has n arbitrary constants and requires n conditions to determine them all. These conditions may be constraints on the solution itself, on its derivatives, or similar.
Whenever we can separate the variables in a differential equation in this way, we call the ODE separable, and we get the
solution by integrating each side of the equation.
In general, if the ODE can be written in the form
dy/dx = f(x) g(y),
then the solution is found from
∫ dy/g(y) = ∫ f(x) dx.
Example 1: The rate at which a radio-active species decays is proportional to the remaining number of atoms. If there
are N0 atoms at t = 0, find the number at time t.
We need to write down the ODE that describes how the number of radioactive species is decreasing over time. We are
told that the rate of change is proportional to the number remaining, so we know
dN/dt ∝ −N,
where the negative sign tells us that N is decreasing with time. We can replace the proportionality sign with an equals
sign and a constant of proportionality, λ, as
dN/dt = −λN.
This ODE is separable as we can rearrange it to read
dN/N = −λ dt,
and we can integrate both sides to find
∫ dN/N = −∫ λ dt (6.9)
⇒ ln N = −λt + C, (6.10)
Exponentiating both sides gives N(t) = e^(−λt + C) = A e^(−λt), where we have defined A = exp[C]. We see that we have found the general solution as we have one arbitrary constant.
To finish the example we have an initial condition that allows us to determine the constant A. That is, N (0) = N0 .
Applying this to our general solution we find N (t) = N0 e−λt .
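A small numerical sketch (an addition, not in the original notes) comparing a simple Euler integration of dN/dt = −λN with the analytic solution N_0 e^(−λt):

import math

lam, N0, dt, t_end = 0.5, 1000.0, 0.001, 4.0

N = N0
for _ in range(int(t_end / dt)):
    N += -lam * N * dt              # Euler step for dN/dt = -lambda * N

print(N, N0 * math.exp(-lam * t_end))   # the two values agree closely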
A further example: solve the ODE
x y′ − x y = y.
Rearranging we have
y′ = y(1 + x)/x  ⇒  y′/y = (1 + x)/x,
and therefore
dy/y = [(1 + x)/x] dx.
Integrating both sides gives ln y = ln x + x + C, so that
y(x) = A x e^x,
where A = e^C.
We can check this is a solution of the original ODE by substituting back in.
6.2.2 Almost Separable first order ODEs
Sometimes an ODE may not be immediately separable, but with a change of variable we can find a version of the ODE that is separable. Consider an ODE of the form
dy/dx = f(ax + by),
where f(ax + by) is some unspecified function of ax + by, and a and b are constants. If we change variables by setting
z = ax + by,
with
dz/dx = a + b dy/dx,
the ODE becomes a separable equation in z and x. Once we have performed the integration we must remember to re-substitute z = ax + by back into the general solution.
Example: solve
dy/dx = (x + y + 1)/(x + y + 3).
Setting z = x + y, so that dz/dx = 1 + dy/dx, we have
dz/dx − 1 = (z + 1)/(z + 3)  ⇒  dz/dx = (2z + 4)/(z + 3).
Separating variables we have
(1/2) (z + 3)/(z + 2) dz = (1/2) (z + 2 + 1)/(z + 2) dz = (1/2) [1 + 1/(z + 2)] dz = dx.
Integrating gives (1/2)[z + ln(z + 2)] = x + C, and re-substituting z = x + y gives the general solution
(1/2)[x + y + ln(x + y + 2)] = x + C.
6.2.3 Homogeneous first order ODEs
If the ODE of interest is homogeneous then it can be written as a function of the ratio y/x. We can then use a change
of variable where y = xv to produce a separable ODE.
We may be presented with an ODE that is very close to being homogeneous but there are some constants floating around that spoil this. To combat this we can define new variables that are linear transformations of the original x and y.
Example: consider
dy/dx = (y + x − 5)/(y − 3x − 1).
Writing y = ȳ − a and x = x̄ − b, the constant terms in the numerator and the denominator vanish provided
−a − b − 5 = 0 (6.11)
−a + 3b − 1 = 0, (6.12)
which are satisfied by a = −4 and b = −1, that is ȳ = y − 4 and x̄ = x − 1. We are left with
dȳ/dx̄ = (ȳ + x̄)/(ȳ − 3x̄) = (ȳ/x̄ + 1)/(ȳ/x̄ − 3).
This is now homogeneous, so we substitute ȳ = v x̄, giving
dȳ/dx̄ = v + x̄ dv/dx̄ = (v + 1)/(v − 3).
Rearranging this we have
x̄ dv/dx̄ = (v + 1)/(v − 3) − v = (−v^2 + 4v + 1)/(v − 3).
Separating variables we have
∫ dx̄/x̄ = ∫ (v − 3)/(−v^2 + 4v + 1) dv = ∫ [ 1/((v − 2)^2 − 5) − (v − 2)/((v − 2)^2 − 5) ] dv,
where we have used −v^2 + 4v + 1 = −[(v − 2)^2 − 5], and where the first part of the integrand can be re-expressed using partial fractions as
1/((v − 2)^2 − 5) = (1/(2√5)) [ 1/((v − 2) − √5) − 1/((v − 2) + √5) ].
We can now perform the integrals using the methods outlined in section 5 to find
ln x̄ = (1/(2√5)) ln[ ((v − 2) − √5)/((v − 2) + √5) ] − (1/2) ln[(v − 2)^2 − 5] + C.
Finally, substituting back x̄ = x − 1 and v = ȳ/x̄ = (y − 4)/(x − 1), we have
ln(x − 1) = (1/(2√5)) ln[ ((y − 4)/(x − 1) − 2 − √5) / ((y − 4)/(x − 1) − 2 + √5) ] − (1/2) ln[ ((y − 4)/(x − 1) − 2)^2 − 5 ] + C
          = (1/(2√5)) ln[ (y − (2 + √5)x − 2 + √5) / (y − (2 − √5)x − 2 − √5) ] − (1/2) ln[ ((y − 2x − 2)/(x − 1))^2 − 5 ] + C.
More generally, consider an ODE of the form
dy/dx = (c_1 y + d_1 x + e_1)/(c_2 y + d_2 x + e_2), (6.13)
where c1 , c2 , d1 , d2 , e1 and e2 are all constants. As in section 6.2.4, we may try to make linear transformations
y = ȳ − a, x = x̄ − b.
The constant terms in the numerator and denominator then vanish provided
−c_1 a − d_1 b + e_1 = 0,
−c_2 a − d_2 b + e_2 = 0.
We can solve for the parameters a and b: re-arranging the first equation for a we have
a = (e_1 − d_1 b)/c_1,
and substituting this into the second we find
−c_2 (e_1 − d_1 b)/c_1 − d_2 b + e_2 = 0,  ⇒  b = (e_1 c_2 − e_2 c_1)/(c_2 d_1 − c_1 d_2).
We can see from this result that we can find a value for b except when c_2 d_1 − c_1 d_2 = 0, or rearranging, when
c_2 d_1 = c_1 d_2   or   c_1/c_2 = d_1/d_2.
If the above is true we must consider a different method. Applying this condition to the original ODE, equation 6.13, we find that
dy/dx = [ (d_1/d_2)(c_2 y + d_2 x) + e_1 ] / (c_2 y + d_2 x + e_2). (6.16)
Notice that the numerator contains a term that is some factor (d_1/d_2) times c_2 y + d_2 x, and this same combination of x and y appears in the denominator.
Example: consider
dy/dx = (y − 3x − 2)/(2y − 6x − 5).
We spot that the same combination of x and y appears in the numerator and denominator, allowing us to make the substitution v = y − 3x, leaving
dv/dx = (v − 2)/(2v − 5) − 3,
which can then be separated and integrated as normal. This ODE is exactly of the form described by equation 6.16, with c_1 = 1, c_2 = 2, d_1 = −3 and d_2 = −6, and indeed we see that c_1/c_2 = d_1/d_2.
Sometimes first order ODEs can be solved by writing the LHS as an exact differential. Consider the following ODE:
f(x) dy/dx + (df(x)/dx) y = Q(x).
We can write the LHS as an exact differential, that is
f(x) dy/dx + (df(x)/dx) y = d(f(x) y)/dx,
leading to
d(f(x) y)/dx = Q(x).
This can then be solved by using separation of variables:
∫ d(f(x) y) = f(x) y = ∫ Q(x) dx  ⇒  y = (1/f(x)) ∫ Q(x) dx.
Example: solve
x dy/dx + y = 2x.
We can rewrite the LHS as
x dy/dx + y = d(yx)/dx
and solve
d(yx)/dx = 2x
by separation of variables:
∫ d(yx) = yx = ∫ 2x dx = x^2 + c  ⇒  y = x + c/x.
Consider now a general linear first order ODE, divided through by f(x) so that it reads
dy/dx + (P(x)/f(x)) y = Q(x)/f(x). (6.19)
The integrating factor, I, is then multiplied through on both sides of the equation such that
I dy/dx + I (P(x)/f(x)) y = I Q(x)/f(x). (6.20)
Our task now is to find an integrating factor such that the left hand side can be written as an exact differential d(uv)/dx.
Using the product rule, we want to find u and v by directly comparing with the LHS of equation 6.20; that is, comparing
d(uv)/dx = u dv/dx + v du/dx (6.21)
with
I Q(x)/f(x) = I dy/dx + y I P(x)/f(x), (6.22)
we identify
u = I,   dv/dx = dy/dx,   v = y,   du/dx = I P(x)/f(x).
This means that we can now write equation 6.20 as an exact differential,
d(I y)/dx = I Q(x)/f(x),
with the integrating factor obtained by integrating du/dx = I P(x)/f(x) (with u = I), that is
I = exp[ ∫ (P(x)/f(x)) dx ].
Example: solve
dy/dx + (3/x) y = e^x/x^3.
Here P(x)/f(x) = 3/x, so the integrating factor is I = exp(∫ 3/x dx) = exp(3 ln x) = x^3. Multiplying the ODE through by I we have
x^3 dy/dx + 3y x^2 = e^x,
where we can write the LHS as
x^3 dy/dx + 3y x^2 = d(x^3 y)/dx,
which is just d(Iy)/dx. We thus have
d(x^3 y)/dx = e^x,
which can be separated as
d(x^3 y) = e^x dx  ⇒  x^3 y = e^x + c  ⇒  y = (e^x + c)/x^3.
The Bernoulli equation is non-linear, but can be reduced to a linear ODE with a suitable change of variable. The general
form of the Bernoulli equation is
dy/dx + P(x) y = Q(x) y^n, (6.23)
where P and Q are functions of x. If we make the substitution
z = y^(1−n),
then
dz/dx = (1 − n) y^(−n) dy/dx.
If we multiply equation 6.23 by (1 − n) y^(−n) we have
(1 − n) y^(−n) dy/dx + P(x)(1 − n) y^(1−n) = (1 − n) Q(x),
and using the definition of z and its derivative we can write
dz/dx + (1 − n) P(x) z = (1 − n) Q(x),
which is now a first order linear ODE and can be solved using an integrating factor.
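As a short added example of this substitution: for dy/dx + y = y^2 we have P = Q = 1 and n = 2, so z = y^(−1) and

dz/dx − z = −1  ⇒  d(z e^(−x))/dx = −e^(−x)  ⇒  z = 1 + C e^x  ⇒  y = 1/(1 + C e^x).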
7 Second Order Differential Equations
We will now take a look at second order ODEs. The general form of a second order ODE is
a d^2y/dx^2 + b dy/dx + c y = f(x), (7.1)
where the coefficients a, b and c may be functions of x.
We will consider first the case where a, b, c in equation 7.1 are constants and the right hand side f(x) = 0. With the right hand side set to zero, this type of second order ODE is called homogeneous, as each term contains a single factor of y or its first or second derivative. The second order ODE we wish to solve is then
a d^2y/dx^2 + b dy/dx + c y = 0. (7.2)
We can construct the general solution of an ODE by exploiting the linear nature of the homogeneous ODE. We recall that the general solution of a second order ODE must contain two arbitrary constants.
• If y_1 is a solution of a linear homogeneous ODE then so is A y_1, where A is a constant.
• If y_1 and y_2 are linearly independent and are both solutions of a linear homogeneous ODE then the combination y = y_1 + y_2 is also a solution of the ODE.
We can combine these: if y_1 and y_2 are both linearly independent solutions of a linear homogeneous ODE then the combination y = A y_1 + B y_2 is the general solution of the homogeneous ODE.
y″ − y = 0.
We know that the most general solution of this second order ODE should contain two arbitrary constants. We also know
that both y1 = ex and y2 = e−x are individual solutions of this ODE. Using the properties above we also know therefore
that
y = A e^x + B e^(−x)
is also a solution of y″ − y = 0. This has two arbitrary constants and is therefore the general solution of the second order ODE.
Note that the two contributions to the general solution are linearly independent. That is ex cannot be written in terms
of some constant times e−x and vice versa.
If we had not known about the e−x solution, we may have thought that we could have constructed
y = A e^x + B e^x
with two arbitrary constants A and B. However, we can combine both of these terms into a single term
y = C e^x,
where C = A + B leaving only one arbitrary constant in our solution indicating that we have yet to find the full general
solution.
The standard way to find the solution to a second order homogeneous ODE is to use a trial solution of the form
y = e^(mx).
Substituting this into equation 7.2 gives (a m^2 + b m + c) e^(mx) = 0. Since the exponential term is never zero we can divide it out to obtain the characteristic (or auxiliary) equation,
a m^2 + b m + c = 0. (7.3)
In general for an nth order ODE one finds an nth order polynomial, which therefore has up to n roots, which may be
complex.
m_± = [ −b ± √(b^2 − 4ac) ] / (2a) . (7.4)
For the case where the two roots are unequal, i.e., b^2 ≠ 4ac and therefore m_+ ≠ m_−, we find two independent solutions
to our second order ODE. The most general solution is then the combination of these two solutions and is written

y = A e^{m_+ x} + B e^{m_− x} , (7.5)

where A and B are arbitrary constants. As we learnt in section 6.1, for an nth order ODE we expect n arbitrary constants
in the general solution. If we do not have n then we need to use a different method to find the remaining parts of the
general solution. We will come back to this point later.
If the coefficients of the equation are such that b^2 > 4ac, then the roots are real and the solution given by Eq. (7.5) is
also real.
7.2.2 Complex Roots
If b^2 < 4ac, then the two roots are complex, and can be written

m_+ = −b/(2a) + i √(4ac − b^2)/(2a) ≡ p + iq , (7.6)

m_− = −b/(2a) − i √(4ac − b^2)/(2a) ≡ p − iq , (7.7)

where

p = Re(m_+) = Re(m_−) = −b/(2a) , (7.8)

q = Im(m_+) = −Im(m_−) = √(4ac − b^2)/(2a) . (7.9)
The general solution can then be written

y(x) = c_1 e^{(p+iq)x} + c_2 e^{(p−iq)x} = e^{px} ( c_1 e^{iqx} + c_2 e^{−iqx} ) , (7.10)
where c_1 and c_2 are arbitrary constants and can in principle be complex. We can check that this is indeed a
solution by substituting it back into equation 7.2.
From complex numbers we know that using the Euler Formula we can write
y(x) = e^{px} [ c_1 (cos qx + i sin qx) + c_2 (cos qx − i sin qx) ] = A e^{px} cos(qx) + B e^{px} sin(qx) , (7.11)
where A and B are functions of c1 and c2 but are still unknown arbitrary constants, and p and q are defined in Eqs. (7.8)
and (7.9).
7.2.3 Repeated Roots

If b^2 = 4ac then the characteristic equation has only one distinct root, which is real:

m_+ = m_− = −b/(2a) ≡ m . (7.12)
We may be tempted to take

y = c_1 e^{mx} (7.13)

as the general solution. We can easily verify that this is indeed a solution, but it is clearly not the most general one, since
it contains only one constant, c1 . Because we have a second order equation, however, two arbitrary constants are needed.
To find a second linearly independent solution, we can use the method of reduction of order. The idea is to take an
available solution, namely, Eq. (7.13), and to find from it a linearly independent solution by multiplying by a function
v(x), which we will need to find. That is, starting from the solution e^{mx} we seek a solution of the form

y = v(x) e^{mx} = v(x) e^{−(b/2a)x} . (7.14)
The first and second derivatives of y are found using the product rule to be
y' = v' e^{−(b/2a)x} − (b/2a) v e^{−(b/2a)x} , (7.15)

y'' = [ v'' − (b/a) v' + (b^2/4a^2) v ] e^{−(b/2a)x} . (7.16)
Substituting these ingredients into our differential equation (7.2) and cancelling the exponential factors gives
a [ v'' − (b/a) v' + (b^2/4a^2) v ] + b [ v' − (b/2a) v ] + c v = 0 , (7.17)

a v'' − ( b^2/4a − c ) v = 0 . (7.18)
But because we are considering the case b^2 = 4ac, the second term in Eq. (7.18) disappears and we are left with
v'' = 0 . (7.19)

Integrating this twice gives

v(x) = A + Bx , (7.20)
where A and B are arbitrary constants. By using this with Eq. (7.14) to find our general solution y for the case of equal
roots we obtain

y = (A + Bx) e^{mx} = (A + Bx) e^{−(b/2a)x} . (7.21)
Example: Solve

d^2y/dx^2 + 3 dy/dx + 2y = 0 .

The trial solution is

y = e^{mx} .

Substituting this into the ODE gives the characteristic equation

m^2 + 3m + 2 = 0 ,   →   (m + 1)(m + 2) = 0 ,

so the roots are m = −1 and m = −2 and the general solution is

y = A e^{−x} + B e^{−2x} .
Example: Solve

d^2ψ/dx^2 + v ψ = 0 ,

where v is a constant. Using the trial solution ψ = e^{mx} gives the characteristic equation m^2 + v = 0. This gives us

m = ± i √v ,

so, from the complex-roots case above with p = 0 and q = √v, the general solution is

ψ = A cos(√v x) + B sin(√v x) .
Example: Solve

d^2x/dt^2 + 4 dx/dt + 4x = 0 .

We use the trial solution x = e^{mt} and substitute into the ODE to find

m^2 + 4m + 4 = 0 ,   →   (m + 2)^2 = 0 ,

giving a repeated root of

m = −2 .

From this method we have therefore only found one part of the general solution. To find the general solution we need
to use the method of reduction of order. We know that this method tells us that the general solution is of the form

x = (A + Bt) e^{−2t} .

This solution has two arbitrary constants and is therefore the general solution. We can always check that we have the
correct form of the solution by substituting back into the original ODE.
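We can carry out this check symbolically. The short Python (sympy) sketch below substitutes x = (A + Bt) e^{−2t} into the left hand side of the ODE and confirms that it simplifies to zero for any constants A and B.

import sympy as sp

t, A, B = sp.symbols('t A B')
x = (A + B*t)*sp.exp(-2*t)

# Substitute the general solution into x'' + 4x' + 4x; the result should simplify to 0.
residual = sp.diff(x, t, 2) + 4*sp.diff(x, t) + 4*x
print(sp.simplify(residual))   # 0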
7.3 Non-homogeneous second order ODEs

We now extend the homogeneous problem of the previous section to the non-homogeneous equation
a d^2y/dx^2 + b dy/dx + c y = f(x) , (7.22)
where f (x) is a given function and here a, b and c may in general be functions of x.
To solve this problem, first recall that the general solution to the corresponding homogeneous equation
a d^2y/dx^2 + b dy/dx + c y = 0

can be written

y_c = A y_1(x) + B y_2(x) , (7.23)
where y1 and y2 are linearly independent solutions to the homogeneous equation and A and B are arbitrary constants;
yc is called the complementary function.
Now note that we can add yc to any solution to the non-homogeneous equation, and the result will still be a solution.
Suppose we have a particular solution that satisfies (7.22), say, yp , that is
a d^2y_p/dx^2 + b dy_p/dx + c y_p = f(x) ,
and we add to it yc , and then substitute this into equation 7.22. We find
a d^2(y_p + y_c)/dx^2 + b d(y_p + y_c)/dx + c (y_p + y_c) = a d^2y_p/dx^2 + b dy_p/dx + c y_p + a d^2y_c/dx^2 + b dy_c/dx + c y_c

= f + a d^2y_c/dx^2 + b dy_c/dx + c y_c

= f ,

as

a d^2y_c/dx^2 + b dy_c/dx + c y_c = 0 .
So we see that

y(x) = y_p(x) + y_c(x) (7.24)

is the general solution of the full non-homogeneous equation. The solution to the non-homogeneous equation, y_p(x), is called the particular integral. Notice that, despite use of the
word “particular”, the solution y_p does not need to satisfy any particular boundary or initial conditions; it can be any
solution of the full non-homogeneous equation without any arbitrary constants.
To solve a non-homogeneous equation of the form in equation 7.22 subject to given initial conditions, we must first find
any solution, yp , and also the general solution yc to the corresponding homogeneous equation, which will contain two
arbitrary constants. Then these are added together and the initial (or boundary) conditions are imposed to determine
the values of the constants. We have already discussed the homogeneous problem above; we will next look at methods
for finding the particular solution yp .
7.3.1 The method of undetermined coefficients

Consider the non-homogeneous equation

a y'' + b y' + c y = f(x) , (7.25)
where we now restrict ourselves to the case where the coefficients a, b and c are constants. To find a particular solution
to the equation we can in many cases employ the method of undetermined coefficients. Basically this amounts to guessing
a solution that contains some undetermined parameters, substituting it back into the differential equation and seeing
whether there exist values for the parameters such that the function is indeed a solution.
It is difficult to formulate very general rules for how to guess a solution, but we can summarise certain guidelines that
cover a number of important cases, namely, when f (x) consists of an exponential, polynomial, sine or cosine terms. If
f (x) is of some different type or if the coefficients in the differential equation are not constant, then the method of
undetermined coefficients is not likely to be of use. The basic form of the guesses that turn out to work are summarised
in Table 2.
In problems of this type it is crucial to find the complementary function y_c first, before seeking the particular solution
y_p. It can happen that the guess for y_p from Table 2 is proportional to y_c, in which case the method will not work. This
is easy to see, since if y_guess = C y_c for some constant C, then

a y_guess'' + b y_guess' + c y_guess = C ( a y_c'' + b y_c' + c y_c ) = 0 .

But the guess for y_p must give the non-homogeneous term f(x), not zero, so such a function cannot actually be a particular
solution. If the initial guess for y_p is found to be proportional to y_c, then the problem can be avoided by multiplying the
guess by x, or in general by as many powers of x as needed so that the trial function is no longer proportional to y_c, as
listed for the example of the exponential form for f(x) in Table 2.
Table 2: Trial solutions for y_p given non-homogeneous terms of different forms in f(x). For the case of an nth order polynomial
(second entry below), the guess for y_p must include all n + 3 of the coefficients A_0, A_1, . . . , A_{n+2}, even if some of the coefficients c_i
in f(x) happen to be zero for i < n.

Form of f(x)                        Guess for y_p(x)

c e^{kx}                            A e^{kx}
                                    A x e^{kx} if e^{kx} already appears in the complementary function
                                    A x^2 e^{kx} if x e^{kx} already appears in the complementary function

c_0 + c_1 x + . . . + c_n x^n       A_0 + A_1 x + . . . + A_{n+2} x^{n+2}

c sin kx or c cos kx                A cos kx + B sin kx
Consider, for example, the case where f(x) is a single exponential,

a y'' + b y' + c y = e^{kx} . (7.26)

Following Table 2 we guess a particular solution of the form

y_p = A e^{kx} , (7.27)

and substituting it into the ODE gives

a A k^2 e^{kx} + b A k e^{kx} + c A e^{kx} = e^{kx} . (7.28)

After cancelling the term e^{kx} and simplifying, we see that our guess for y_p works if the coefficient A has the value
A = 1 / (a k^2 + b k + c) . (7.29)
Having now found a particular solution we can add to this the general solution to the homogeneous equation, which
contains two arbitrary constants,
y = D e^{m_+ x} + F e^{m_− x} + e^{kx} / (a k^2 + b k + c) ,
and we then determine the constants by imposing the initial conditions.
Example 2. Solve
d^2y/dx^2 + 3 dy/dx + 2y = e^{−x} .
To do this we must find the complementary function first. We know what this is as we solved the homogeneous version of
this ODE before; the complementary function is

y_c = A e^{−x} + B e^{−2x} .

For the particular integral we follow the rules and note that our first guess would be of the form C e^{−x}, but this already
appears in the CF so we try

y_p = C x e^{−x} .

Substituting this into the ODE, the terms proportional to x e^{−x} cancel and we are left with

C e^{−x} = e^{−x} ,   →   C = 1 ,

so y_p = x e^{−x} and the general solution is

y = A e^{−x} + B e^{−2x} + x e^{−x} .
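As a check of this result, the following sympy sketch substitutes the general solution into the left hand side of the ODE and confirms that it reproduces the non-homogeneous term e^{−x} for any A and B.

import sympy as sp

x, A, B = sp.symbols('x A B')
y = A*sp.exp(-x) + B*sp.exp(-2*x) + x*sp.exp(-x)   # complementary function plus particular integral

# y'' + 3y' + 2y should equal exp(-x); the residual should simplify to 0.
residual = sp.diff(y, x, 2) + 3*sp.diff(y, x) + 2*y - sp.exp(-x)
print(sp.simplify(residual))   # 0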
7.3.2 f(x) as a sum of terms

We may encounter a situation where the function f(x) on the right-hand side of a non-homogeneous differential equation
is not as simple as those shown in Table 2, but can nevertheless be expressed as a linear combination of such terms.
Suppose the non-homogeneous term f (x) can be expressed as a sum of n terms:
f(x) = Σ_{i=1}^{n} f_i(x) . (7.30)
Instead of solving

a y'' + b y' + c y = f(x)

we can instead try to find solutions y_i for i = 1, . . . , n to the equations

a y_i'' + b y_i' + c y_i = f_i(x) . (7.31)

If we then take as our solution the sum

y(x) = Σ_{i=1}^{n} y_i(x) , (7.32)

substituting it into the ODE gives

Σ_{i=1}^{n} [ a y_i'' + b y_i' + c y_i ] = Σ_{i=1}^{n} f_i = f . (7.33)
So if f (x) is a sum of terms, we can try to find the solution for each term and then sum these in the end.
7.3.3 Use of complex exponentials in solutions.
In applied problems, the function f (x) on the right hand side of the non-homogeneous ODE
ay 00 + by 0 + cy = f (x)
is very often a sine or cosine representing an alternating EMF or periodic force. We can find yp by guessing a solution of
the form A cos kx + B sin kx (as suggested in table 2) and plugging in this guess to the non-homogeneous ODE to find A
and B.
A more efficient way to find y_p involves appealing to complex numbers. To explain the method consider the non-
homogeneous ODE

y'' + y' − 2y = 4 sin 2x . (7.34)

Instead of working with this directly, consider the related complex ODE

Y'' + Y' − 2Y = 4 e^{2ix} . (7.35)
Since e^{2ix} = cos 2x + i sin 2x is complex, the solution Y may also be complex. Then if

Y = Y_R + i Y_I ,

where Y_R and Y_I are the real and imaginary parts of Y, then equation 7.35 is equivalent to two equations,

Y_R'' + Y_R' − 2Y_R = 4 cos 2x , (7.36)

Y_I'' + Y_I' − 2Y_I = 4 sin 2x . (7.37)

We see that the second of these two is identical to equation 7.34 and so the solution to equation 7.34 is the same as the
imaginary part of Y. Thus to find y_p for equation 7.34, we can find Y_p for equation 7.35 and then take the imaginary part.
Let's solve equation 7.35. We first note that e^{2ix} does not appear in the complementary function for the homogeneous
version of our non-homogeneous ODE. Following the method of undetermined coefficients we try a solution of the form

Y_p = C e^{2ix} .

Substituting this into equation 7.35 gives

(2i)^2 C e^{2ix} + 2i C e^{2ix} − 2 C e^{2ix} = 4 e^{2ix} ,

and rearranging we find

C = 4 / (2i − 6) = −(1/5)(i + 3) ,
and therefore we find

Y_p = −(1/5)(i + 3) e^{2ix} .
Taking the imaginary part of Y_p, we find y_p as

y_p = −(1/5) cos 2x − (3/5) sin 2x .
We can as always check that this is a solution by substituting back into our original non-homogeneous ODE.
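The substitution can be carried out symbolically; the sympy sketch below confirms that y_p = −(1/5) cos 2x − (3/5) sin 2x satisfies y'' + y' − 2y = 4 sin 2x.

import sympy as sp

x = sp.symbols('x')
yp = -sp.Rational(1, 5)*sp.cos(2*x) - sp.Rational(3, 5)*sp.sin(2*x)

# The residual y_p'' + y_p' - 2*y_p - 4*sin(2x) should simplify to zero.
residual = sp.diff(yp, x, 2) + sp.diff(yp, x) - 2*yp - 4*sp.sin(2*x)
print(sp.simplify(residual))   # 0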
8 Applications of second order ODE
Second order ODEs appear frequently in Physics (and Engineering). A particular example is simple harmonic motion.
Consider the diagram in Figure 8.1. The figure depicts a mass m in motion subject to a number of forces.
The motion of the mass m is given by its position x relative to an equilibrium position of x = 0 as a function of time
t. The mass is attached to a spring, which exerts a force Fs = −kx, where k is the spring constant. The mass moves
through a viscous medium providing a resistive force of Fres = −β(dx/dt), and it is acted upon by an external force
Fext = F0 cos(ωt).
Applying Newton's second law to the mass gives the equation of motion

−kx − β dx/dt + F_0 cos(ωt) = m d^2x/dt^2 . (8.1)

Dividing through by m and rearranging, this can be written as

x'' + γ x' + ω_0^2 x = f_0 cos(ωt) , (8.2)

where we have used primes for the derivatives and have defined
ω_0 = √(k/m) , (8.3)

γ = β/m , (8.4)

f_0 = F_0/m . (8.5)
Suppose we are given the initial position and velocity for the mass: x(0) = x_0, x'(0) = v_0, and we want to find the
function x(t) that describes the subsequent motion of the mass as a function of time.
From Sec. 7.3 we know that the general solution x(t) can be expressed as the sum of two terms: x_c, the general solution
(the complementary function) to the homogeneous problem

x'' + γ x' + ω_0^2 x = 0 , (8.6)

plus a particular integral x_p to the non-homogeneous problem. The general solution will contain two arbitrary constants
coming from the complementary function x_c, whose values can be determined by imposing the initial conditions on the
sum x_c + x_p.
We can use the technology of Sec. 7.2 to write down the general solution to the homogeneous equation. With a trial
solution of x = e^{rt} the characteristic equation is

r^2 + γ r + ω_0^2 = 0 . (8.7)
Over-damped case

If γ^2 > 4ω_0^2 (the over-damped case), then the roots of the characteristic equation are real:

r_± = −γ/2 ± √(γ^2 − 4ω_0^2)/2 , (8.8)

and the general solution to the homogeneous equation is

x_c = A e^{r_+ t} + B e^{r_− t} . (8.9)
Under-damped case

If γ^2 < 4ω_0^2 (the under-damped case) then the roots are complex:

r_± = −γ/2 ± i √(4ω_0^2 − γ^2)/2 , (8.10)

and in this case the general solution can be written

x_c = A e^{pt} cos(qt) + B e^{pt} sin(qt) , (8.11)
where

p = −γ/2 , (8.12)

q = √(4ω_0^2 − γ^2)/2 , (8.13)
and where A and B are arbitrary constants. The solution is thus a combination of oscillating sine and cosine terms with
an amplitude that decreases exponentially in time.
Finally, if γ^2 = 4ω_0^2 (the critically-damped case) then the characteristic equation has only one distinct root,

r = −γ/2 , (8.14)

and the general solution to the homogeneous equation is

x_c = (A + Bt) e^{−γt/2} . (8.15)
Depending on the values of γ and ω0 we will therefore use one of the three solutions above for xc (t).
The next step is to determine a particular solution to the non-homogeneous equation, and for this we can use the method
of undetermined coefficients as described in Sec. 7.3.1. The non-homogeneous term is f_0 cos(ωt), so referring to Table 2
we should use a linear combination of sin(ωt) and cos(ωt) in the trial solution. Equivalently (and this will turn out
to be somewhat easier) we can regard our desired solution as the real part of the solution to the complex differential
equation (see Section 7.3.3)

x'' + γ x' + ω_0^2 x = f_0 e^{iωt} . (8.16)

Now the non-homogeneous part is an exponential, so again referring to Table 2 we see the trial solution should be another
exponential of the same form, namely,

x_p = C e^{iωt} . (8.17)
An obvious advantage of using the complex exponential instead of the cosine form of the driving function is that taking
the derivative just gives a multiple of the same exponential. Substituting the solution (8.17) into the differential equation
(8.16) gives

( −ω^2 + iωγ + ω_0^2 ) C e^{iωt} = f_0 e^{iωt} . (8.18)

We can cancel the factor of e^{iωt} and solve for the constant C,
C = − f_0 / ( ω^2 − ω_0^2 − iωγ ) . (8.19)
Multiplying numerator and denominator by ω 2 − ω02 + iωγ to separate out the real and imaginary parts gives
C = − f_0 (ω^2 − ω_0^2) / [ (ω^2 − ω_0^2)^2 + ω^2 γ^2 ] − i f_0 ωγ / [ (ω^2 − ω_0^2)^2 + ω^2 γ^2 ] . (8.20)
In order to extract easily the real part of the solution it is convenient to write C in the form† |C| e^{iφ}, where
|C| = f_0 / √[ (ω^2 − ω_0^2)^2 + ω^2 γ^2 ] , (8.21)

φ = tan^{−1} [ ωγ / (ω^2 − ω_0^2) ] . (8.22)
Note that the quadrant of the angle is not uniquely determined by the tan^{−1} function, and we must require

cos φ = − (ω^2 − ω_0^2) / √[ (ω^2 − ω_0^2)^2 + ω^2 γ^2 ] , (8.23)

sin φ = − ωγ / √[ (ω^2 − ω_0^2)^2 + ω^2 γ^2 ] . (8.24)
With this, x_p = C e^{iωt} = |C| e^{i(ωt + φ)}. Taking the real part of this as the solution to our original problem (with a redefinition of x_p to refer now only to the real
part) gives

x_p(t) = |C| cos(ωt + φ) , (8.25)

where the amplitude |C| and phase angle φ are given by Eqs. (8.21) and (8.22).
The complete solution to our problem is then found by adding together the particular solution x_p and that of the
homogeneous equation x_c, which, depending on the values of γ and ω_0, is given by Eqs. (8.9), (8.11) or (8.15). The
solution x_c contains two arbitrary constants, A and B, and these are determined by the initial values of the position,
x(0) = x_0, and speed, x'(0) = v_0, for some specified x_0 and v_0. That is, we require
† Recall any complex number z = x + iy can be expressed as z = |z| e^{iθ} with |z| = √(x^2 + y^2) and θ = tan^{−1}(y/x). The quadrant of θ is
fixed by cos θ = x/|z| and sin θ = y/|z|. By representing complex numbers in this way one can also show for any two complex numbers z_1 and
z_2, |z_1/z_2| = |z_1|/|z_2|.
x(0) = x_c(0) + x_p(0) = x_0 , (8.27)

x'(0) = x_c'(0) + x_p'(0) = v_0 . (8.28)
Suppose, for example, that we have γ < 2ω_0 (the under-damped case), so that the solution to the homogeneous equation
is given by Eq. (8.11). The complete general solution including both x_c and x_p is therefore

x(t) = e^{pt} [ A cos(qt) + B sin(qt) ] + |C| cos(ωt + φ) , (8.29)
where p and q are given by Eqs. (8.12) and (8.13). For the initial condition on the speed we need to differentiate Eq. (8.29)
and then evaluate at t = 0. Imposing both initial conditions gives

A = x_0 − |C| cos φ ,

B = [ v_0 − p A + ω |C| sin φ ] / q .
Now A and B are completely determined by quantities that are specified at the outset, so by using these values with
Eq. (8.29) we have the fully specified function x(t) that satisfies our initial conditions and is valid for all times t > 0.
Notice that because p = −γ/2 is negative, the two exponential terms in the solution (8.29) mean that after a sufficiently
long time (t ≫ 2/γ) the homogeneous part of the solution goes to zero and one is left only with the particular solution.
That is, the homogeneous solution, which includes constants that depend on the initial conditions, represents transient
behaviour that eventually dies off. The particular solution represents the long-term motion, and this does not depend on
the constants A and B and is thus independent of the initial position and speed.
We may of course have the situation where there is no driving force at all, that is if f0 = 0. In this case we just have
the complementary function solution. We can analyse the situations in each of the cases of damping described above,
over-damped, critically damped and under-damped and a fourth possibility of no damping at all, which is simply where
γ = 0.
For the over-damped case, γ > 2ω_0, the solution is

x_c = A e^{r_+ t} + B e^{r_− t} ,

where

r_± = −γ/2 ± √(γ^2 − 4ω_0^2)/2 .

It is clear that if γ > 2ω_0, then the exponentials in this solution both have negative arguments and the solution slowly
falls to zero.
Figure 8.2: The harmonic oscillator without a driving force. The motion of the mass on the spring is plotted for several
different cases: over-damping (red), critical damping (blue), under-damping (black, with grey dashed lines indicating the
e−γt/2 envelope) and no damping at all (orange).
For the critically damped case, γ = 2ω_0, the solution is

x_c = (A + Bt) e^{rt} ,

with r = −γ/2.
For the under-damped case, γ < 2ω_0, the solution is

x_c = e^{pt} [ A cos(qt) + B sin(qt) ] ,

with p = −γ/2.
For the last case of no damping, that is γ = 0, we just get the simple form of

x_c^{nd} = A cos(qt) + B sin(qt) ,

where now q = ω_0.
All four of these cases are plotted for some example values of the parameters in Figure 8.2.
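The behaviour described in this section can also be seen numerically. The Python sketch below integrates the driven, under-damped oscillator with scipy and checks that once the transient has died away (t ≫ 2/γ) the motion matches the particular integral |C| cos(ωt + φ) of Eqs. (8.21) to (8.25). The parameter values and initial conditions are chosen only for illustration.

import numpy as np
from scipy.integrate import solve_ivp

# Illustrative parameter values (under-damped, driven case) and initial conditions
omega0, gamma, f0, omega = 2.0, 0.5, 1.0, 1.5
x0, v0 = 1.0, 0.0

def rhs(t, state):
    # state = [x, x']; the ODE is x'' + gamma*x' + omega0**2 * x = f0*cos(omega*t)
    x, v = state
    return [v, f0*np.cos(omega*t) - gamma*v - omega0**2*x]

t = np.linspace(0.0, 40.0, 4000)
sol = solve_ivp(rhs, (t[0], t[-1]), [x0, v0], t_eval=t, rtol=1e-8, atol=1e-10)

# Steady-state amplitude and phase from Eqs. (8.21)-(8.24)
C = f0/np.sqrt((omega**2 - omega0**2)**2 + (omega*gamma)**2)
phi = np.arctan2(-omega*gamma, -(omega**2 - omega0**2))
x_steady = C*np.cos(omega*t + phi)

# For t >> 2/gamma the transient (complementary function) has decayed and the
# numerical solution should agree with the particular integral alone.
late = t > 30.0
print(np.max(np.abs(sol.y[0][late] - x_steady[late])))   # a small number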
9 Limits

The idea of the limit of a function f(x) as x approaches a value a is intuitive and we have been using this intuition to
evaluate limits already in this course. There is however a strict definition of what we mean by a limit. In many cases the
limit of the function f(x) as x approaches a is given by f(a), but sometimes this is not so, for example when f(x) is not
defined at x = a.
The limit of such a function may still exist and be well defined, and we will see this in a little while. Another possibility
is that even if f(x) is defined at x = a its value may not be equal to the limiting value lim_{x→a} f(x). This can occur if the
function is discontinuous at x = a.
The strict definition of a limit is that if lim_{x→a} f(x) = l then for any number ε, however small, it must be possible to find a
number η such that |f(x) − l| < ε whenever |x − a| < η. In other words, as x becomes arbitrarily close to a, f(x) becomes
arbitrarily close to its limit l. To remove any ambiguity, the number η will in general depend on ε and on the form of f(x).
The following observations are often useful when finding the limit of a function:
• A limit may be ±∞. For example as x → 0, 1/x^2 → ∞. This limit is defined, it just happens to be infinite.
• A limit may be approached from below or above and the value may be different in each case. For example, consider
the function f(x) = tan x. As x tends to π/2 from below f(x) → ∞ but if the limit is approached from above then
f(x) → −∞. We can write this as

lim_{x→π/2^−} tan x = +∞ ,    lim_{x→π/2^+} tan x = −∞ .
• It may ease the evaluation of limits if the function under consideration is split into a sum, product or quotient.
Provided that in each case a limit exists, the rules for evaluating such limits are as follows:

a) lim_{x→a} [ f(x) + g(x) ] = lim_{x→a} f(x) + lim_{x→a} g(x) ,

b) lim_{x→a} [ f(x) g(x) ] = lim_{x→a} f(x) × lim_{x→a} g(x) ,

c) lim_{x→a} [ f(x)/g(x) ] = lim_{x→a} f(x) / lim_{x→a} g(x) , provided the numerator and denominator are not both zero or both infinite.
Let’s look at some example limits.
Example 1: Evaluate lim_{x→1} (x^2 + 2x^3). For this we can use a) above,

lim_{x→1} (x^2 + 2x^3) = lim_{x→1} x^2 + lim_{x→1} 2x^3 = 1 + 2 = 3 .
Example 2: Evaluate lim_{x→0} (x cos x). For this we can use b) above,

lim_{x→0} (x cos x) = lim_{x→0} x × lim_{x→0} cos x = 0 × 1 = 0 .
Example 3: Evaluate lim_{x→π/2} (sin x / x). For this we can use c) above (having checked that the numerator and denomi-
nator are not both zero or infinity),

lim_{x→π/2} (sin x / x) = ( lim_{x→π/2} sin x ) / ( lim_{x→π/2} x ) = 1 / (π/2) = 2/π .
• Limits of functions of x that contain exponents that are themselves functions of x can often be found by taking
logarithms.

Example 4: Evaluate the limit

L = lim_{x→∞} ( 1 − a^2/x^2 )^{x^2} .
First define

y = ( 1 − a^2/x^2 )^{x^2}

and take the natural log of this and then take the required limit as

lim_{x→∞} ln y = lim_{x→∞} x^2 ln( 1 − a^2/x^2 ) .
Now we can use the Maclaurin series (or Taylor Series around x = 0) of ln(1 + x) (see Equation 2.56) to write
ln( 1 − a^2/x^2 ) = − a^2/x^2 − (1/2)( a^2/x^2 )^2 − . . . ,

so that

lim_{x→∞} ln y = lim_{x→∞} x^2 ( − a^2/x^2 − (1/2) a^4/x^4 − . . . ) = − a^2 ,

and therefore

L = e^{−a^2} .
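This limit can also be checked with sympy, which should reproduce the result obtained above.

import sympy as sp

x, a = sp.symbols('x a', positive=True)
print(sp.limit((1 - a**2/x**2)**(x**2), x, sp.oo))   # expected: exp(-a**2)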
9.1 L'Hôpital's rule

L'Hôpital's rule is an extension of c) above to cases where both the numerator and denominator are zero or infinite.
Consider lim_{x→a} f(x)/g(x), where f(a) = 0 and g(a) = 0. Expanding the numerator and denominator as Taylor Series we
obtain

f(x)/g(x) = [ f(a) + (x − a) f'(a) + ((x − a)^2/2!) f''(a) + . . . ] / [ g(a) + (x − a) g'(a) + ((x − a)^2/2!) g''(a) + . . . ] .
However, f(a) = g(a) = 0 so

f(x)/g(x) = [ (x − a) f'(a) + ((x − a)^2/2!) f''(a) + . . . ] / [ (x − a) g'(a) + ((x − a)^2/2!) g''(a) + . . . ]
          = [ f'(a) + ((x − a)/2!) f''(a) + . . . ] / [ g'(a) + ((x − a)/2!) g''(a) + . . . ] .
Therefore we find

lim_{x→a} f(x)/g(x) = f'(a)/g'(a) ,

provided f'(a) and g'(a) are not themselves both zero. If they are both zero then we can apply the same procedure again
to find
lim_{x→a} f(x)/g(x) = f''(a)/g''(a) .
We can repeat this procedure by going to higher and higher order terms until we find a defined limit (which may be
infinity, zero or some other number).
Example 5. Evaluate the limit lim_{x→0} (1 − e^x)/x. We note first of all that if we set x = 0 then the numerator and denominator
are both zero. We apply L'Hôpital's rule,

lim_{x→0} (1 − e^x)/x = lim_{x→0} (−e^x)/1 = −1 .
Checking this, if we just Taylor expand both numerator and denominator around x = 0 we have

lim_{x→0} (1 − e^x)/x = lim_{x→0} ( −1 − x/2! − x^2/3! − . . . ) = −1 .
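Both the limit and the series check can be reproduced with sympy:

import sympy as sp

x = sp.symbols('x')
print(sp.limit((1 - sp.exp(x))/x, x, 0))        # -1
print(sp.series((1 - sp.exp(x))/x, x, 0, 3))    # -1 - x/2 - x**2/6 + O(x**3), matching the expansion above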
We have so far concentrated on limits where the numerator and denominator both approach zero. For the case where
f(a) = g(a) = ∞ we may still apply L'Hôpital's rule by writing

lim_{x→a} f(x)/g(x) = lim_{x→a} (1/g(x)) / (1/f(x)) ,

which is now of the form 0/0 at x = a. Note also that L'Hôpital's rule is still valid for finding limits as x → ∞, i.e. when
a = ∞. This is easily shown by letting y = 1/x, so that x → ∞ corresponds to y → 0 and

lim_{x→∞} f(x)/g(x) = lim_{y→0} f(1/y)/g(1/y) ,

to which the rule can be applied at y = 0.
10 Series
There are a large number of different series that are useful in Mathematics. Some you will already have seen before and
some you will be seeing for the first time. A series may have a finite or infinite number of terms. The sum of the first N
terms of a series is written

S_N = u_1 + u_2 + u_3 + . . . + u_N ,

where the terms of the series, u_n for n = 1, 2, 3, . . . , N, are numbers and may in general be complex.
An example series has terms u_n = 1/2^n for n = 1, 2, 3, . . . , N; then the sum of the first N terms will be

S_N = Σ_{n=1}^{N} u_n = 1/2 + 1/4 + 1/8 + . . . + 1/2^N .
It is often of practical interest to calculate the sum of an infinite series (one with an infinite number of terms). If the
value of the sum tends to a finite number we say the series “converges”. We can consider the following limit to determine
whether a series converges,

S = lim_{N→∞} S_N .
Not all series converge: they may approach ∞ or −∞, or oscillate finitely or infinitely. Moreover, for a series where each term
depends on some variable, its convergence can depend on the value assumed by the variable. Whether a sum converges,
diverges or oscillates has important implications when describing physical systems. We now go through some examples
of different types of series and how to sum them.
10.1 Arithmetic Series

An arithmetic series is defined by the difference between successive terms being constant. The sum of a general arithmetic
series is written

S_N = a + (a + d) + (a + 2d) + . . . + [ a + (N − 1)d ] = Σ_{n=0}^{N−1} (a + nd) .
The sum can be evaluated by writing the series in reverse order,

S_N = [ a + (N − 1)d ] + [ a + (N − 2)d ] + . . . + a .

Adding this to the original expression for S_N term by term gives 2 S_N = N [ 2a + (N − 1)d ], so that we have

S_N = (N/2) [ 2a + (N − 1)d ] = (N/2) (first term + last term) .
10.2 Geometric Series
A geometric series is a series where the ratio of successive terms is a constant. In general we may write a geometric series
and its sum as
S_N = a + ar + ar^2 + . . . + ar^{N−1} = Σ_{n=0}^{N−1} a r^n , (10.1)
where a is a constant and r is the ratio of successive terms, also referred to as the common ratio.
We may find a closed expression for this type of series for a given number of terms (which we can take to infinity if the
series converges). Consider the series S_N and r S_N, that is

r S_N = ar + ar^2 + . . . + ar^N .

Subtracting this from S_N, all terms cancel except the first term of S_N and the last term of r S_N, so that

(1 − r) S_N = a − a r^N ,

and hence

S_N = a (1 − r^N) / (1 − r) . (10.4)
For a series with an infinite number of terms and |r| < 1, we have the limit

lim_{N→∞} S_N = S = a / (1 − r) . (10.5)
This series is then called convergent.
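A quick numerical check of Equations 10.4 and 10.5, using the example values a = 3 and r = 1/2 (chosen only for illustration):

import sympy as sp

n = sp.symbols('n')
a_val, r_val, N = 3, sp.Rational(1, 2), 10

# Partial sum computed term by term versus the closed form of Equation 10.4
S_N = sum(a_val*r_val**k for k in range(N))
closed = a_val*(1 - r_val**N)/(1 - r_val)
print(S_N, closed, S_N == closed)

# Infinite sum for |r| < 1, Equation 10.5: a/(1 - r) = 6
print(sp.summation(a_val*r_val**n, (n, 0, sp.oo)))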
10.3 Arithmetico-Geometric Series

As its name suggests, an arithmetico-geometric series is a mix of a geometric and arithmetic series. It has the following
general form

S_N = a + (a + d)r + (a + 2d)r^2 + . . . + [ a + (N − 1)d ] r^{N−1} = Σ_{n=0}^{N−1} (a + nd) r^n . (10.6)
Following the same approach as for the geometric series, we consider S_N − r S_N and find

(1 − r) S_N = a + d ( r + r^2 + . . . + r^{N−1} ) − [ a + (N − 1)d ] r^N . (10.7)

The second term contains a geometric series that can be summed according to Equation 10.4 to find

(1 − r) S_N = a − [ a + (N − 1)d ] r^N + r d (1 − r^{N−1}) / (1 − r) . (10.8)
Rearranging we have

S_N = ( a − [ a + (N − 1)d ] r^N ) / (1 − r) + r d (1 − r^{N−1}) / (1 − r)^2 . (10.9)
For an infinite series with |r| < 1 we have that lim_{N→∞} r^N → 0 and hence

S = lim_{N→∞} S_N = a / (1 − r) + r d / (1 − r)^2 . (10.10)
10.4 The Sum of Squares of Natural Numbers

We saw in the integration section that we needed to sum the following series

S_n = 1^2 + 2^2 + 3^2 + . . . + n^2 = Σ_{i=1}^{n} i^2 .
We can calculate the result of this sum by considering the following two series containing the sum of cubes. That is,
consider

S = 1^3 + 2^3 + 3^3 + . . . + n^3 , (10.11)

S̄ = 0^3 + 1^3 + 2^3 + . . . + (n − 1)^3 . (10.12)

Taking the difference we can write

S − S̄ = Σ_{i=1}^{n} [ i^3 − (i − 1)^3 ] , (10.13)

but this difference is also equal to n^3 as all terms cancel apart from the final n^3 factor. We have therefore

Σ_{i=1}^{n} [ i^3 − (i − 1)^3 ] = n^3 .

Expanding (i − 1)^3 = i^3 − 3i^2 + 3i − 1 gives i^3 − (i − 1)^3 = 3i^2 − 3i + 1, so that

3 Σ_{i=1}^{n} i^2 − 3 Σ_{i=1}^{n} i + n = n^3 .

Using Σ_{i=1}^{n} i = n(n + 1)/2 and rearranging, we find

Σ_{i=1}^{n} i^2 = (1/3) [ n^3 + (3/2) n(n + 1) − n ] = n(n + 1)(2n + 1)/6 .
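The closed form obtained above can be confirmed with sympy's summation:

import sympy as sp

i, n = sp.symbols('i n', integer=True, positive=True)

S = sp.summation(i**2, (i, 1, n))
print(sp.factor(S))                               # n*(n + 1)*(2*n + 1)/6

# Spot check for n = 10: 1 + 4 + 9 + ... + 100 = 385
print(sum(k**2 for k in range(1, 11)), sp.factor(S).subs(n, 10))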
10.5 Convergent and Divergent Series
So far we have been talking about series which have a finite sum, and in the last subsection we saw a series whose sum
grows without bound if we take the number of terms in the series to infinity. If a series has a finite
sum, it is called convergent, but if the sum is not finite it is called divergent. It is important to know whether a series is
convergent or not as odd things can happen if we try to manipulate a divergent series using ordinary algebra or calculus.
Consider, for example, the series

S = 1 + 2 + 4 + 8 + 16 + . . . ,
where the . . . tell us that the series carries on for an infinite number of terms. It should be clear that this series is
divergent and if we added up all the terms we would get an infinite result.
Let's try and do some simple algebra with this series. If we write down 2S, we have

2S = 2 + 4 + 8 + 16 + . . . ,

which is very close to our original series S apart from the first term, so that

2S = S − 1 ,

which would give S = −1. This is clearly nonsense for a sum of positive, ever-growing terms; the error comes from
manipulating a divergent series as if it had a finite sum.
The same sort of thing can happen in a less obvious case. For example the series
S = 1 + 1/2 + 1/3 + 1/4 + 1/5 + . . .
is actually a divergent series. As a result we need to be careful about the series we are dealing with and be able to identify
a convergent series.
Before we look at some tests for convergence, we should be clear about what we mean by the convergence of a series.
Consider an infinite series

a_1 + a_2 + a_3 + . . . .

Recall that the three dots tell us the series goes on without end.
Now consider the sums, Sn , created by including only the first n terms of the series, that is
S1 = a1
S2 = a1 + a2
...
Sn = a1 + a2 + a3 + . . . + an .
We are of course interested in the limit as we take n to infinity and whether we can write
lim Sn = S.
n→∞
It is understood that S is a finite number. If this happens, we make the following definitions:
• If the partial sum Sn of an infinite series tends to a limit S, the series is called convergent. Otherwise it is called
divergent.
• The difference Rn = S − Sn is called the remainder (or the remainder after n terms). We see that
lim Rn = lim (S − Sn ) = S − S = 0.
n→∞ n→∞
First we discuss what is usually called the preliminary test. In most cases you should apply this test first before you use
other tests as it will identify the badly divergent series.
Preliminary Test: If the terms of an infinite series do not tend to zero (that is, if lim_{n→∞} a_n ≠ 0), the series diverges.
If lim_{n→∞} a_n = 0, we must make further tests. This test cannot tell you whether the series is convergent.
There are several useful tests for series whose terms are all positive. Here we will look at 3 of them.
If we do have a series with negative terms, it may still be useful to make all the terms positive and do the test on this
modified series. If this new series converges we call the original series absolutely convergent. It can be proved that if a
series converges absolutely, the series is convergent when you reinstate the original minus signs (the sum is different of
course). The following 3 tests may all be used for testing a series of positive terms, or for testing any series for absolute
convergence.
1) Comparison Test

Let

m_1 + m_2 + m_3 + . . .

be a series of positive terms which you know converges. Then the series we are testing has the form

a_1 + a_2 + a_3 + . . .

and it converges if its terms satisfy a_n ≤ m_n for all n beyond some point.
Example: Test
Σ_{n=1}^{∞} 1/n! = 1 + 1/2 + 1/6 + 1/24 + . . .
for convergence. As a comparison series, we can choose the geometric series
Σ_{n=1}^{∞} 1/2^n = 1/2 + 1/4 + 1/8 + 1/16 + . . . .
We do not care about the first few terms (or in fact any finite number of terms) in a series, because they can affect the
sum of the series but not whether it converges. When we ask whether a series converges or not, we are asking what happens
as we add more and more terms for larger and larger n. Does the sum increase indefinitely, or does it approach a limit?
In our example, the terms of Σ_{n=1}^{∞} 1/n! are smaller than the corresponding terms of Σ_{n=1}^{∞} 1/2^n for all n > 3. We know
that the geometric series converges and therefore Σ_{n=1}^{∞} 1/n! converges also.
2) Integral Test
We can use this test when the terms of the series are positive and non-increasing, that is a_{n+1} ≤ a_n. The test involves
writing a_n as a function of n, then allowing n to take all values, not just integer ones.

Then if 0 < a_{n+1} ≤ a_n for n > N, the series Σ a_n converges if ∫^{∞} a_n dn is finite, and diverges if the integral is infinite.
Example: Test for convergence the series

1 + 1/2 + 1/3 + 1/4 + . . . .
Using the integral test with a_n = 1/n we calculate

∫^{∞} (1/n) dn = ln n |^{∞} = ∞ .
Since the integral is infinite this tells us that the series is divergent.
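The divergence is very slow, as a short numerical experiment makes clear: the partial sums of the harmonic series grow roughly like ln N, so they increase without bound but only logarithmically.

import math

# Partial sums of the harmonic series compared with ln(N); both grow without bound.
for N in (10, 100, 10000, 1000000):
    S_N = sum(1.0/k for k in range(1, N + 1))
    print(N, round(S_N, 4), round(math.log(N), 4))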
3) Ratio Test
The integral test depends on being able to integrate an dn and this is not always possible. We now consider another test
which can handle many series.
First recall that for a geometric series each term could be obtained by multiplying the one before it by the ratio r, that
is an+1 = ran or an+1 /an = r. For other series this same ratio is not a constant but may depend on n. Let us define the
absolute value of this ratio as ρn . Let us also find the limit (if there is one) of ρn as n → ∞ and call this limit ρ. Thus
we have

ρ_n = | a_{n+1} / a_n | ,

ρ = lim_{n→∞} ρ_n . (10.14)
If

ρ < 1, the series converges;

ρ > 1, the series diverges;

ρ = 1, the test is inconclusive and we must use a different test.
Example: Test for convergence the series

S = 1 + 1/2! + 1/3! + . . . + 1/n! + . . . .

Here a_n = 1/n!, so

ρ_n = | a_{n+1} / a_n | = n! / (n + 1)! = 1 / (n + 1) ,

and therefore ρ = lim_{n→∞} ρ_n = 0. Since ρ < 1 the series converges.
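The same ratio can be computed symbolically:

import sympy as sp

n = sp.symbols('n', integer=True, positive=True)

rho_n = sp.factorial(n)/sp.factorial(n + 1)      # |a_{n+1}/a_n| for a_n = 1/n!
print(sp.simplify(rho_n))                        # should simplify to 1/(n + 1)
print(sp.limit(sp.simplify(rho_n), n, sp.oo))    # 0 < 1, so the series converges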
So far we have only looked in detail at series with positive terms. An important example of a series is where the terms
alternate in sign. For example, consider the following series
S = 1 − 1/2 + 1/3 − 1/4 + 1/5 − . . . + (−1)^{n+1}/n + . . . . (10.15)
We can ask two questions about alternating series, the first is whether the series as written converges and the second
is whether the series converges absolutely. Let us consider the second question first. We can rewrite the series with only
positive terms as

S = 1 + 1/2 + 1/3 + 1/4 + 1/5 + . . . + 1/n + . . . ,
where of course we have a different sum that will be bigger than our original sum. Therefore if this series converges,
our original alternating series will also converge. We of course recognise the harmonic series which we found is actually
divergent. As a result our series is not absolutely convergent so we must try a different test.
For an alternating series the test is very simple. An alternating series converges if the absolute value of the terms decreases
steadily to zero, that is if |an+1 | ≤ |an | and limn→∞ an = 0.
In the example above we have 1/(n + 1) < 1/n and lim_{n→∞} 1/n = 0, and so the series in equation 10.15 is convergent.
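A numerical look at the partial sums shows the slow convergence; the sum of this alternating series is in fact ln 2 ≈ 0.6931 (a standard result not derived here), which the partial sums approach from alternating sides.

import math

# Partial sums of 1 - 1/2 + 1/3 - 1/4 + ... approach ln(2) slowly, alternating about it.
for N in (10, 100, 10000):
    S_N = sum((-1)**(k + 1)/k for k in range(1, N + 1))
    print(N, round(S_N, 6))
print("ln 2 =", round(math.log(2), 6))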
11 Summary
That is all for this year. The only way you will learn this material is by doing lots and lots of examples. Have fun doing
them!
A The Greek Alphabet
A α alpha
B β beta
Γ γ gamma
∆ δ delta
E ϵ (or ε) epsilon
Z ζ zeta
H η eta
Θ θ theta
I ι iota
K κ kappa
Λ λ lambda
M µ mu
N ν nu
Ξ ξ xi (‘ksee’)
O o omicron
Π π pi
P ρ rho
Σ σ sigma
T τ tau
Υ υ upsilon
Φ φ (or ϕ) phi
X χ chi (‘kai’)
Ψ ψ psi
Ω ω omega
Notice that the letter nu (ν) is distinct from a Latin v, and χ is not the same as a Latin x. The Greek lower-case
letters upsilon (υ) and omicron (o) are essentially indistinguishable from Latin v and o and are therefore not used in
mathematics. Likewise many of the upper-case Greek and Latin letters are identical and so the Greek versions are rarely
used.