ELE00038C 2023 24 Part II
Part II
2023-2024
[ELE00038C] J. J. Bissell
Preface
The purpose of this course is to introduce those mathematical tools that are
most relevant to solving practical problems in engineering and the physical
sciences. Where feasible, core concepts will be motivated using ‘real-life’ con-
texts; however, to develop competence and fluency we will often need to ‘drill’
technique using examples that are in themselves somewhat abstract.
Course delivery
Mathematics is (and always has been) a subject that is best learned by doing,
and our emphasis will be on active participation through problem solving. This
approach will require more effort, but it should also be more effective, and
ultimately more rewarding.
Lectures
Your timetable should include four slots each week. Although these are listed
as ‘lectures’, they will—as far as possible—be run more like workshops, with
lots of examples and exercises, with a basic structure as follows:
3. Repeat.
In this way I hope to maximise our overall contact time, and help to set the
pace of learning. Remember to bring pencil and paper with you.
Workshops
In addition to the lecture sessions, I will be holding a weekly workshop (as per
your timetable). These workshops will mainly involve practising techniques by
solving problems on the course exercise sheets, but they can also be used as
an opportunity for you to ask me further questions about course material.
In previous years students have also set up and used their own online forum to
discuss course content (e.g., via Discord). I encourage this kind of collaborative
engagement; however, remember that you are all coming from different backgrounds,
with different life experiences, so please be respectful of one another.
Exercise sheets
Almost all of the techniques that we will be studying in this course can be mas-
tered by practice, and it is essential that you work on course exercise sheets
throughout term. Our ultimate goal is skilled creative work in applied con-
texts; however, to reach that point we must first hone our craft, and this will
often mean working on more abstract problems. As a rule-of-thumb, keep
mathematically fit by solving at least 10 exercise problems each week.
These notes have been written as an independent guide, and (in principle)
you should be able to learn the course content by studying each section, and
completing the worked examples, and course exercises. Reading mathematics
requires active participation, so keep pencil and paper to hand to check the
steps in an argument. For example, a mathematical statement like
\[ f(x) = \frac{x^2 + 3x + 2}{x + 2}, \quad (x \neq -2) \quad \Rightarrow \quad f(x) = x + 1 \]
might not seem obvious; however, if you have a pencil and paper to hand, then
you can confirm that it is correct by noting that
\[ f(x) = \frac{x^2 + 3x + 2}{x + 2} = \frac{(x + 1)(x + 2)}{(x + 2)} = x + 1. \tag{4.1} \]
Assessment
The course will be assessed by two components with the following weightings:
Further information about the format and timings of the multiple choice quizzes
is summarised on the Virtual Learning Environment.
Unless explicitly marked ‘(optional)’ any section in these notes could be ex-
amined. However, it would not be a very fair examination if I were to set very
difficult problems, or ask for long, and technical derivations. Clearly I cannot
tell you what will be on the exam, but I can tell you that I will try to be fair.
The best guide to the sorts of questions you might be asked in an examination
is to look at past papers (see the course wiki-pages). If you can solve most of
the course exercises, then you will do very well in the exam, so practise!
Further reading
The simplest of these is probably Gilbert and Jordan, but the other two texts
will also be useful for second year mathematics and beyond.
1. The first rule of Maths Club is: you do not talk about Maths Club.
2. The second rule of Maths Club is: you do not talk about Maths Club.
3. Third rule of Maths Club: If someone yells “stop!”, goes limp, taps out,
or fails to draw a diagram, then the maths is over.
8. The eighth and final rule: If this is your first time at Maths Club, then
you have to math.
It has always taken a certain amount of skill to navigate university life, but
things have arguably become more difficult over recent years. Let us try to
remember, then, that fundamentally we are here as a community of scholars to
engage joyfully with the process of learning and discovery. So let’s treat each
other with goodwill, and give it our best shot.
J. J. Bissell
(Summer 2023)
Learning outcomes:
Section 5 : Ordinary Differential Equations
\[ I = \frac{dQ}{dt} \tag{5.1} \]
begins to flow through the circuit. The subsequent decay in charge Q on the
capacitor may be modelled by a differential equation.
To see how this works, observe that the potential difference $v_R$ across the
resistor is
\[ v_R = IR = R\frac{dQ}{dt}, \tag{5.2} \]
whereas the potential difference $v_C$ across the capacitor is (by definition)
\[ v_C = \frac{Q}{C}. \tag{5.3} \]
When the circuit is complete ($t > 0$) these potential differences are related by
\[ v_R + v_C = 0. \tag{5.4} \]
Thus, substituting equations (5.2) and (5.3) into equation (5.4) we obtain
\[ R\frac{dQ}{dt} + \frac{Q}{C} = 0. \tag{5.5} \]
It is common to write this equation in the simplified form
\[ \frac{dQ}{dt} + \frac{1}{\tau}Q = 0, \quad \text{where } \tau = RC. \tag{5.6} \]
If some function Q(t) can be found such that the left-hand-side of equation (5.6) is
equal to the right-hand-side when Q = Q(t), then we say that Q(t) satisfies
the equation, or is a solution to the differential equation. We shall see in
§5.3.1 that equation (5.6) can be integrated to give the solution
where k is an arbitrary constant that arises from the integration. The arbi-
trary constant k means that equation (5.7) represents a family of solutions,
where each possible value for k corresponds to a different member of the family.
In this way the particular solution to the differential equation (5.6) that
satisfies the boundary condition Q(0) = Q0 is
(see figure 5.2). A boundary condition that depends on some kind of initial
value is typically referred to as an initial condition.
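The claim that a function satisfies a differential equation can also be checked numerically. The Python sketch below is illustrative only: it assumes the standard exponential-decay form Q(t) = Q0 exp(−t/τ) for the particular solution of equation (5.6), and the function names and the values of Q0 and τ are invented for the example. It steps dQ/dt = −Q/τ forward in time (forward Euler) and compares the result with the exponential form.

```python
import math


def discharge_exact(t, q0, tau):
    """Assumed particular solution Q(t) = Q0 * exp(-t / tau) of dQ/dt + Q/tau = 0."""
    return q0 * math.exp(-t / tau)


def discharge_euler(t_end, q0, tau, steps=10_000):
    """Integrate dQ/dt = -Q/tau from t = 0 to t_end using forward Euler steps."""
    dt = t_end / steps
    q = q0
    for _ in range(steps):
        q += dt * (-q / tau)  # slope given by the differential equation
    return q


# Illustrative values only (not taken from the notes).
q0, tau = 1.0, 0.5
exact = discharge_exact(1.0, q0, tau)
approx = discharge_euler(1.0, q0, tau)
print(exact, approx)
```

The two printed values should agree closely, and the agreement improves as `steps` is increased.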
[Figure 5.2: the particular solution Q(t) satisfying Q(0) = Q0.]
For t ≥ 0, therefore, the potential difference across the resistor and capacitor
is balanced by V , that is,
vR + vC = V. (5.13)
\[ R\frac{dQ}{dt} + \frac{Q}{C} = V. \tag{5.14} \]
It is convenient to write this equation in the simplified form
\[ \frac{dQ}{dt} + \frac{1}{\tau}Q = \frac{1}{\tau}Q_\infty, \tag{5.15} \]
Putting this value for k back into our general solution (5.17) yields
as the particular solution to equation (5.15) that satisfies the initial condition
Q(0) = 0. This particular solution is plotted in figure 5.4.
[Figure 5.4: the particular solution Q(t) satisfying Q(0) = 0.]
The solution (5.19) describes how the charge Q(t) on the capacitor increases
from Q(0) = 0 with increasing time t. For example, if we charge the capacitor
for a period of time greatly exceeding the time constant τ , that is, for a time
then we find that Q(t) approaches the value Q∞ (see figure 5.4). This idea
may be formalised mathematically using a limit, viz.
\[ \lim_{(t/\tau)\to\infty} Q(t) = \lim_{(t/\tau)\to\infty} Q_\infty\left(1 - e^{-t/\tau}\right) = Q_\infty. \tag{5.21} \]
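To get a feel for how quickly this limit is approached, the following Python sketch evaluates the charging solution Q(t) = Q∞(1 − exp(−t/τ)) after a few multiples of the time constant. The function name and the chosen multiples of τ are assumptions made for illustration.

```python
import math


def charge(t, q_inf, tau):
    """Charging solution Q(t) = Q_inf * (1 - exp(-t / tau)), with Q(0) = 0."""
    return q_inf * (1.0 - math.exp(-t / tau))


# Fraction of the limiting charge reached after n time constants (illustrative).
q_inf, tau = 1.0, 1.0
fractions = {n: charge(n * tau, q_inf, tau) / q_inf for n in (1, 3, 5, 10)}
for n, frac in fractions.items():
    print(f"t = {n} tau: Q/Q_inf = {frac:.5f}")
```

The fraction Q/Q∞ increases monotonically towards 1 but never exceeds it, which is what the limit expresses.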
In the previous sections we saw how RC-circuits can be modelled using equa-
tions involving first-order derivatives, i.e., by first-order differential equations.
We now consider a mechanical context which gives rise to an equation involving
a second-order derivative, that is, a second-order differential equation.
To this end consider a mass m attached to a spring with spring constant k, such
as that depicted schematically in figure 5.5. If the mass is displaced through a
distance x then the restoring force FH is given by Hooke’s law
FH = −kx. (5.22)
Figure 5.5: Mass m on a spring with constant k, damped by viscous drag (coefficient c).
If we further suppose that the mass is attached to a viscous dash-pot, then the
drag force $F_v$ on the mass is proportional to its velocity $dx/dt$, that is,
\[ F_v = -c\frac{dx}{dt}, \tag{5.23} \]
where c is a constant of proportionality. Now the acceleration of the mass is
$d^2x/dt^2$, so according to Newton’s second law we have
\[ F = m\frac{d^2x}{dt^2} = F_H + F_v, \tag{5.24} \]
with F as the net force. Thus, substituting for $F_H$ and $F_v$ by equations (5.22)
and (5.23), we obtain
\[ m\frac{d^2x}{dt^2} = -kx - c\frac{dx}{dt}. \tag{5.25} \]
It is common to rearrange this equation to the form
\[ \frac{d^2x}{dt^2} + 2\gamma_0\frac{dx}{dt} + \omega_0^2 x = 0, \quad \text{with } \omega_0 = \sqrt{\frac{k}{m}}, \text{ and } \gamma_0 = \frac{c}{2m}. \tag{5.26} \]
Equation (5.26) relates the displacement x of the mass to the first-order and
second-order derivatives, $dx/dt$ and $d^2x/dt^2$ respectively; for this reason it is
called a second-order differential equation for x.
The solutions to equation (5.26) describe the displacement x(t) of the mass
as a function of time t. For example, if ω0 > γ0 then it may be shown that
the general solution to equation (5.26) is
\[ x(t) = e^{-\gamma_0 t}\left[A\cos(\omega t) + B\sin(\omega t)\right], \quad \text{where } \omega = \sqrt{\omega_0^2 - \gamma_0^2}, \tag{5.27} \]
and A and B are arbitrary constants.[1] Observe that the trigonometric terms
are multiplied by an exponential factor, and so describe oscillations with a
decaying amplitude, or damped oscillations (see Exercise 5.24, and figure 5.7).
[1] The general solution to a second-order differential equation always involves two arbitrary
constants because ‘undoing’ a second-derivative effectively corresponds to two integrations.
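The statement ‘it may be shown’ can be made more concrete with a numerical check. The Python sketch below substitutes the damped solution (5.27) into the left-hand side of equation (5.26), estimating the derivatives by central finite differences. The function names, the step size h, and the sample parameter values are assumptions made for illustration; a residual close to zero is consistent with (5.27) being a solution, but it is not a proof.

```python
import math


def damped_x(t, A, B, gamma0, omega0):
    """Damped oscillation x(t) of equation (5.27); requires omega0 > gamma0."""
    omega = math.sqrt(omega0**2 - gamma0**2)
    return math.exp(-gamma0 * t) * (A * math.cos(omega * t) + B * math.sin(omega * t))


def residual(t, A, B, gamma0, omega0, h=1e-4):
    """Left-hand side of (5.26) evaluated with central finite differences."""
    x = lambda s: damped_x(s, A, B, gamma0, omega0)
    dx = (x(t + h) - x(t - h)) / (2 * h)
    d2x = (x(t + h) - 2 * x(t) + x(t - h)) / h**2
    return d2x + 2 * gamma0 * dx + omega0**2 * x(t)


# Illustrative parameters: gamma0 = 1 and omega0^2 = 17 give omega = 4.
for t in (0.0, 0.3, 1.2):
    print(t, residual(t, A=2.0, B=-1.0, gamma0=1.0, omega0=math.sqrt(17.0)))
```

Any choice of A and B should give a residual that is small compared with the individual terms, reflecting the fact that both constants are arbitrary.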
It will be clear from our discussion in §5.1 that the field of differential equations
is replete with a large and technical vocabulary; this vocabulary helps to simplify
the study of differential equations by emphasising common properties, and
methods of solution. Below we summarise the jargon encountered so far.
\[ \ddot{x}(t) + 2\gamma_0\dot{x}(t) + \omega_0^2 x(t) = 0, \quad \text{where } \dot{x} \equiv \frac{dx}{dt}, \ \ddot{x} \equiv \frac{d^2x}{dt^2}, \tag{5.30} \]
and ω0 and γ0 are constants.
(b) The highest order derivative in this equation is the third-order derivative
$y'''(x) \equiv d^3y/dx^3$, making this a third-order ordinary differential equation. The
dependent variable is y(x), and the independent variable is x.
\[ \frac{dy}{dx} = F(x, y), \tag{5.31} \]
where F (x, y) is a function of x and y. The differential form of a first-order
differential equation is
where A(x, y) and B(x, y) are functions of x and y. The relationship between
these two forms may be seen by rearranging equation (5.32) as
\[ \frac{dy}{dx} = -\frac{A(x, y)}{B(x, y)}, \tag{5.33} \]
which is in the form of equation (5.31) with F (x, y) = −A(x, y)/B(x, y).
\[ \frac{dy}{dx} = -\frac{4x + y^2}{2xy} \tag{5.34} \]
in differential form.
Because the first term in this equation involves x only, and the second term
involves y only, we say that the variables x and y have been separated.
\[ \frac{dy}{dx} = -\frac{f(x)}{g(y)}, \quad \text{i.e.,} \quad g(y)\,dy = -f(x)\,dx, \tag{5.40} \]
Example 5.4 Find the general solution to the first-order differential equation
\[ \frac{dy}{dx} = 3x^2 + \cos x. \tag{5.42} \]
What is the particular solution that satisfies the boundary condition y(π) = 0?
The particular solution is the solution with c fixed to match our boundary
condition y(π) = 0. Substituting y(π) = 0 into the general solution gives
where τ and Q∞ are constants. Determine the general solution to this equation.
What is the particular solution that satisfies the initial condition Q(0) = 0?
\[ \frac{dQ}{dt} = -\frac{(Q - Q_\infty)}{\tau}. \tag{5.48} \]
This is a separable differential equation; thus, separating the variables we have
\[ \int \frac{1}{(Q - Q_\infty)}\,dQ = -\int \frac{1}{\tau}\,dt. \tag{5.49} \]
is an arbitrary constant. For this solution (5.51) to satisfy the initial condition
Q(0) = 0 we require
that is,
k = −Q∞ . (5.53)
\[ Q(t) = Q_\infty\left(1 - e^{-t/\tau}\right). \tag{5.54} \]
Notice here that the general solution in equation (5.57) is given by an implicit
relationship between x and y, i.e., by an implicit function
The general solution contains all possible solutions to the differential equation,
with particular values for the constant a corresponding to particular solutions.
In this case each of the solutions to the differential equation is a circle of radius
a. Example solutions are plotted as a family of solution curves in figure 5.6,
where each member of the family corresponds to a different value of a. ◀
[Figure 5.6: family of solution curves (circles of radius a), one for each value of a.]
\[ a_1\frac{dy}{dx} + a_0 y = 0, \tag{5.59} \]
where a1 and a0 are constants. This is a very frequently occurring differential
equation in science and engineering. Indeed, the capacitor discharge equation
of §5.1.1, namely
\[ \frac{dQ}{dt} + \frac{1}{\tau}Q = 0, \tag{5.60} \]
is a homogeneous first-order linear differential equation with constant coeffi-
cients a1 = 1 and a0 = (1/τ ).
The solution to equation (5.59) may be found by separation of variables, i.e.,
\[ \frac{1}{y}\,dy = -\frac{a_0}{a_1}\,dx \quad \Rightarrow \quad \int \frac{1}{y}\,dy = -\int \frac{a_0}{a_1}\,dx. \tag{5.61} \]
Result 5.1 The general form of the homogeneous first-order linear differ-
ential equation with constant coefficients a0 and a1 is
\[ a_1\frac{dy}{dx} + a_0 y = 0. \tag{5.64} \]
This equation has general solution
Example 5.7 Determine the general solution to $2\,\dfrac{dy}{dx} + 4y = 0$.
▶ Solution: This is a homogeneous first-order linear equation with constant
coefficients a0 = 4 and a1 = 2. Hence, by Result 5.1, the general solution is
\[ \frac{dx}{dt} - 3x = 0. \tag{5.67} \]
This is a homogeneous first-order linear equation with constant coefficients
a0 = −3 and a1 = 1. Hence, by Result 5.1, the general solution is
What is the particular solution that satisfies the initial condition Q(0) = Q0?
\[ a_1\frac{dy}{dx} + a_0 y = \phi(x), \tag{5.73} \]
where φ(x) is a function of x. This equation is said to be homogeneous if
φ(x) = 0, in which case it may be solved by separating variables (see §5.3.2).
Otherwise, if φ(x) ≠ 0, then equation (5.73) is said to be inhomogeneous.
Inhomogeneous equations cannot usually be solved by separating the variables.
We shall study a powerful method of solving a generalised form of equation
(5.73) in §5.3.5. Here we shall simply assert that the general solution is
\[ a_1\frac{dy_c}{dx} + a_0 y_c = 0. \tag{5.75} \]
In this way the complementary function may be found using Result 5.1.
\[ a_1\frac{dy_p}{dx} + a_0 y_p = \phi(x), \tag{5.76} \]
that does not contain any arbitrary constants. We explore how to deter-
mine yp (x) using the method of undetermined coefficients in §5.3.4.
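\[ \begin{aligned} a_1\frac{dy}{dx} + a_0 y &= a_1\frac{d}{dx}(y_c + y_p) + a_0(y_c + y_p) \\ &= \underbrace{a_1\frac{dy_c}{dx} + a_0 y_c}_{=0} + \underbrace{a_1\frac{dy_p}{dx} + a_0 y_p}_{=\phi(x)} = \phi(x) \end{aligned} \tag{5.77} \]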
as required. That y = yc + yp is the general solution follows from the fact that
it is the solution to a first-order differential equation (an equation of order 1)
containing a single (i.e., 1) arbitrary constant (see Definition 5.1).
In summary, therefore, we have the following outline method for solving inho-
mogeneous first-order linear differential equations:
\[ a_1\frac{dy}{dx} + a_0 y = \phi(x). \tag{5.78} \]
One way of obtaining the general solution to this equation is as follows:
\[ a_1\frac{dy_c}{dx} + a_0 y_c = 0, \tag{5.79} \]
e.g., by separating the variables, or using Result 5.1.
\[ a_1\frac{dy_p}{dx} + a_0 y_p = \phi(x) \tag{5.80} \]
using the method of undetermined coefficients (see §5.3.4).
\[ a_1\frac{dy_p}{dx} + a_0 y_p = \phi(x). \tag{5.82} \]
This method is best illustrated by example (see, e.g., Examples 5.10 and 5.12).
φ(x)            Trial form for yp(x)
c0              k0
c0 e^(γx)       k0 e^(γx)
Table 5.1: Trial forms for the particular integral yp(x) given φ(x), where the coefficients
c0, c1, c2, A, B, ω, γ, and k0, k1, k2, a, b are constants. Note: If any term in the proposed
trial form for yp(x) is already present in the complementary function yc(x), then multiply
yp(x) by x^m, where m is the lowest integer power such that no coincidence occurs.
Example 5.10 Use Method 5.1 to determine the general solution to the in-
homogeneous equation
\[ 2\frac{dy}{dx} + 4y = 8x. \tag{5.83} \]
You may wish to refer to the trial functions listed in table 5.1.
yp (x) = k1 x + k0 , (5.87)
\[ 2\frac{d}{dx}\left(k_1 x + k_0\right) + 4\left(k_1 x + k_0\right) = 4k_1 x + (2k_1 + 4k_0) = 8x. \tag{5.88} \]
For 4k1 x + (2k1 + 4k0 ) = 8x to be true for all x it is necessary that
yp (x) = 2x − 1. (5.90)
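A result like (5.90) is easily checked by substitution. The Python sketch below does this for the whole family y(x) = C exp(−2x) + 2x − 1, where the complementary function C exp(−2x) is the one quoted for 2 dy/dx + 4y = 0 in equation (5.112). The function names and the sample values of x and C are assumptions made for illustration.

```python
import math


def y_general(x, C):
    """Candidate general solution: complementary function plus y_p = 2x - 1."""
    return C * math.exp(-2.0 * x) + 2.0 * x - 1.0


def dy_general(x, C):
    """Derivative of y_general with respect to x, computed by hand."""
    return -2.0 * C * math.exp(-2.0 * x) + 2.0


def residual(x, C):
    """2 dy/dx + 4y - 8x, which should vanish for every x and every C."""
    return 2.0 * dy_general(x, C) + 4.0 * y_general(x, C) - 8.0 * x


for C in (-3.0, 0.0, 1.5):
    print(C, [residual(x, C) for x in (0.0, 0.5, 2.0)])
```

Setting C = 0 isolates the particular integral itself, so the same function confirms that yp(x) = 2x − 1 satisfies equation (5.83) on its own.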
xp (t) = k0 , (5.96)
\[ \frac{dx}{dt} - 3x = \frac{13}{3}\cos(2t). \tag{5.101} \]
You may wish to refer to the trial functions listed in table 5.1.
\[ \frac{dx_c}{dt} - 3x_c = 0. \tag{5.102} \]
By Example 5.8 we know that the solution to this equation is
\[ \frac{dx_p}{dt} - 3x_p = \phi(t), \quad \text{where } \phi(t) = \frac{13}{3}\cos(2t). \tag{5.104} \]
We shall solve this equation using the method of undetermined coefficients.
Since the form of the inhomogeneity is a trigonometric function $\phi(t) = \frac{13}{3}\cos(2t)$,
we shall try a solution that is also based on trigonometric functions, i.e.,
where a and b are coefficients to be determined (see table 5.1). Putting this
trial solution into equation (5.104) we require
\[ \frac{d}{dt}\left[a\cos(2t) + b\sin(2t)\right] - 3\left[a\cos(2t) + b\sin(2t)\right] = \frac{13}{3}\cos(2t), \tag{5.106} \]
whereupon evaluating the derivative we find
\[ (2b - 3a)\cos(2t) - (2a + 3b)\sin(2t) = \frac{13}{3}\cos(2t). \tag{5.107} \]
\[ (2b - 3a) = \frac{13}{3} \quad \text{and} \quad (2a + 3b) = 0. \tag{5.108} \]
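Equations (5.108) are a pair of simultaneous linear equations for the undetermined coefficients a and b. As a sketch of how such a system can be solved exactly, the Python code below applies Cramer's rule using rational arithmetic; the helper name `solve_2x2` is hypothetical and is introduced only for this illustration.

```python
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the system has no unique solution")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    return x, y


# Equations (5.108) rewritten as -3a + 2b = 13/3 and 2a + 3b = 0.
a, b = solve_2x2(Fraction(-3), Fraction(2), Fraction(13, 3),
                 Fraction(2), Fraction(3), Fraction(0))
print(a, b)
```

Working with `Fraction` rather than floating-point numbers keeps the coefficients as exact rationals, which makes them easy to compare with a hand calculation.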
Example 5.13 Use Method 5.1 to determine the general solution to the in-
homogeneous equation
\[ 2\frac{dy}{dx} + 4y = e^{-2x}. \tag{5.111} \]
You may wish to refer to the trial functions listed in table 5.1.
\[ 2\frac{dy_c}{dx} + 4y_c = 0, \quad \text{that is,} \quad y_c(x) = Ce^{-2x}, \tag{5.112} \]
where we used our solution to Example 5.7, with C as an arbitrary constant.
\[ 2\frac{dy_p}{dx} + 4y_p = \phi(x), \quad \text{where } \phi(x) = e^{-2x}. \tag{5.113} \]
We shall solve this equation using the method of undetermined coefficients.
Since $\phi(x) = e^{-2x}$, a first glance at table 5.1 suggests using a trial solution
$y_p(x) = ke^{-2x}$, where k is a coefficient to be determined (see §5.3.4). However,
the complementary function $y_c(x)$ already includes a term in $e^{-2x}$; thus, to
avoid coincidental terms between $y_p$ and $y_c$, we multiply $y_p(x) = ke^{-2x}$ by x
(the lowest power of x such that no coincidence occurs), and instead try
\[ 2\frac{d}{dx}\left(kxe^{-2x}\right) + 4\left(kxe^{-2x}\right) = 2ke^{-2x} = e^{-2x}, \quad \text{i.e.,} \quad k = \tfrac{1}{2}. \tag{5.115} \]
Thus, the particular integral is
\[ a_1(x)\frac{dy}{dx} + a_0(x)y = \phi(x), \tag{5.118} \]
where the coefficients a1 (x) and a0 (x) are functions of x (which can be con-
stants). By dividing through by a1 (x), these equations are usually written
\[ \frac{dy}{dx} + P(x)y = Q(x), \tag{5.119} \]
where P (x) and Q(x) are functions of x (see, e.g., Examples 5.14 and 5.15).
To determine the general solution to equation (5.119) we introduce a function
µ(x) known as an integrating factor, which we define by
\[ \mu(x) = e^{z(x)}, \quad \text{where } z(x) = \int P(x)\,dx. \tag{5.120} \]
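\[ P(x) = \frac{1}{e^{z}}\frac{d\mu}{dx} = \frac{1}{\mu}\frac{d\mu}{dx}. \tag{5.122} \]
\[ \frac{dy}{dx} + \frac{y}{\mu}\frac{d\mu}{dx} = Q(x), \tag{5.123} \]
\[ \mu\frac{dy}{dx} + y\frac{d\mu}{dx} = \mu(x)Q(x). \tag{5.124} \]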
The left-hand-side of this equation can be written as the derivative of µy, i.e.,
\[ \frac{d}{dx}(\mu y) = \mu\frac{dy}{dx} + y\frac{d\mu}{dx} = \mu(x)Q(x), \tag{5.125} \]
where we used the product rule for differentiating.
\[ x\frac{dy}{dx} + y = x\sin(x). \tag{5.131} \]
Find the particular solution that satisfies the boundary condition y(π) = 2.
\[ \frac{dy}{dx} + P(x)y = Q(x), \quad \text{with } P(x) = \frac{1}{x} \text{ and } Q(x) = \sin(x), \tag{5.132} \]
which is the general form of a first-order linear differential equation.
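\[ u = x \text{ and } \frac{dv}{dx} = \sin x, \quad \text{with } \frac{du}{dx} = 1 \text{ and } v = -\cos x, \tag{5.137} \]
since this gives
\[ \begin{aligned} \int x\sin x\,dx = \int u\frac{dv}{dx}\,dx &= uv - \int v\frac{du}{dx}\,dx \\ &= -x\cos x + \int \cos x\,dx = -x\cos x + \sin x + c, \end{aligned} \tag{5.138} \]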
\[ y(x) = \frac{\sin x - x\cos x + c}{x}, \tag{5.139} \]
where c is an arbitrary constant.
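The general solution (5.139) can be checked by substituting it back into equation (5.131), and the constant c can then be fixed from a boundary condition. The Python sketch below does both; the function names are hypothetical, and the derivative is written out by hand using the quotient rule.

```python
import math


def y(x, c):
    """General solution (5.139): y = (sin x - x cos x + c) / x."""
    return (math.sin(x) - x * math.cos(x) + c) / x


def dy(x, c):
    """dy/dx by the quotient rule; the numerator differentiates to x sin x."""
    numerator = math.sin(x) - x * math.cos(x) + c
    return (x * x * math.sin(x) - numerator) / x**2


def residual(x, c):
    """x dy/dx + y - x sin x, which should vanish for any constant c."""
    return x * dy(x, c) + y(x, c) - x * math.sin(x)


def constant_for(x0, y0):
    """Value of c for which the solution passes through the point (x0, y0)."""
    return x0 * y0 - math.sin(x0) + x0 * math.cos(x0)


c = constant_for(math.pi, 2.0)  # boundary condition y(pi) = 2
print(c, y(math.pi, c), residual(1.3, c))
```

For the boundary condition y(π) = 2 the constant works out as c = π, since sin π = 0 and cos π = −1.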
For the solution to satisfy the boundary condition y(π) = 2 we require
\[ x\frac{dv}{dx} + 2v = x\cos(x^3). \tag{5.141} \]
Find the particular solution that satisfies the boundary condition $v(\pi^{1/3}) = 0$.
\[ \frac{dv}{dx} + P(x)v = Q(x), \quad \text{with } P(x) = \frac{2}{x} \text{ and } Q(x) = \cos(x^3), \tag{5.142} \]
which is the general form of a first-order linear differential equation.
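\[ u(x) = x^3, \quad \text{such that } \frac{du}{dx} = 3x^2, \quad \text{i.e.,} \quad \frac{1}{3}\,du = x^2\,dx, \tag{5.147} \]
then we have
\[ \int x^2\cos(x^3)\,dx = \frac{1}{3}\int (\cos u)\,du = \frac{1}{3}\sin(u) + c = \frac{1}{3}\sin(x^3) + c, \tag{5.148} \]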
where c is an arbitrary constant.
Putting equation (5.148) into equation (5.146), therefore, we obtain the gen-
eral solution to equation (5.141)
\[ v(x) = \frac{\sin(x^3) + 3c}{3x^2}, \tag{5.149} \]
where c is an arbitrary constant.
For the solution to satisfy the boundary condition $v(\pi^{1/3}) = 0$ we require
\[ v(\pi^{1/3}) = \frac{\sin(\pi) + 3c}{3\pi^{2/3}} = \frac{c}{\pi^{2/3}} = 0, \quad \text{i.e.,} \quad c = 0. \tag{5.150} \]
Thus, selecting c = 0 in equation (5.149) we have
\[ v(x) = \frac{\sin(x^3)}{3x^2} \tag{5.151} \]
as the particular solution to equation (5.141) that satisfies the boundary
condition $v(\pi^{1/3}) = 0$. ◀
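As a check on this example, the Python sketch below substitutes the general solution (5.149) into equation (5.141) and confirms that the choice c = 0 meets the boundary condition. The function names are hypothetical, and the derivative has been worked out by hand from (5.149).

```python
import math


def v(x, c):
    """General solution (5.149): v = (sin(x^3) + 3c) / (3 x^2)."""
    return (math.sin(x**3) + 3.0 * c) / (3.0 * x**2)


def dv(x, c):
    """dv/dx, obtained by differentiating (5.149) with the product rule."""
    return math.cos(x**3) - 2.0 * (math.sin(x**3) + 3.0 * c) / (3.0 * x**3)


def residual(x, c):
    """x dv/dx + 2v - x cos(x^3), which should vanish for any constant c."""
    return x * dv(x, c) + 2.0 * v(x, c) - x * math.cos(x**3)


x_boundary = math.pi ** (1.0 / 3.0)
print(v(x_boundary, 0.0), residual(1.1, 0.0), residual(1.1, 2.5))
```

The boundary value is zero only up to floating-point rounding, because the cube of the computed cube root of π is not exactly π.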
\[ \frac{dy}{dx} = h(y/x), \tag{5.152} \]
where h(y/x) is a function of (y/x), such as equation (5.160) below. In
particular, we shall show that equations of this kind can be transformed into
separable differential equations using the substitution
To see how this substitution works, observe by the product rule that
\[ \frac{dy}{dx} = \frac{d}{dx}(vx) = x\frac{dv}{dx} + v. \tag{5.154} \]
Engineering Mathematics : ELE00038C J.J.B. 2023-2024
Hence, putting equations (5.153) and (5.154) into equation (5.152) we have
\[ x\frac{dv}{dx} + v = h(v), \quad \text{where } v = \frac{y}{x}. \tag{5.155} \]
By separating the variables, and integrating we then obtain
\[ \int \frac{1}{h(v) - v}\,dv = \int \frac{1}{x}\,dx, \quad \text{with } y = vx. \tag{5.156} \]
These integrals define a relationship between x and v, that can then be used
to deduce the general solution to the differential equation y(x) = vx.
\[ \frac{dy}{dx} = h(y/x), \tag{5.157} \]
where h(y/x) is a function of (y/x), may be solved using the substitution
\[ v = \frac{y}{x} \quad \text{for which} \quad \frac{dy}{dx} = x\frac{dv}{dx} + v. \tag{5.158} \]
Making a change of variables in this way yields a separable equation in v
and x that may be solved to obtain the general solution y(x) = xv(x).
\[ x\frac{dy}{dx} = y - \frac{x}{\sin(y/x)}. \tag{5.159} \]
What is the particular solution that satisfies the boundary condition y(2) = π?
\[ \frac{dy}{dx} = h(y/x) \quad \text{where} \quad h(y/x) = \frac{y}{x} - \frac{1}{\sin(y/x)} \tag{5.160} \]
\[ y = vx, \quad \text{for which} \quad \frac{dy}{dx} = x\frac{dv}{dx} + v. \tag{5.161} \]
\[ x\frac{dv}{dx} + v = v - \frac{1}{\sin v}, \quad \text{i.e.,} \quad x\frac{dv}{dx} = -\frac{1}{\sin v}. \tag{5.162} \]
This equation is separable, as may be seen by writing
\[ -\sin v\,dv = \frac{1}{x}\,dx \tag{5.163} \]
Integrating this equation we find
\[ -\int \sin v\,dv = \int \frac{1}{x}\,dx \quad \Rightarrow \quad \cos(v) = \log_e x + c, \tag{5.164} \]
The particular solution to equation (5.159) that satisfies the boundary con-
dition y(2) = π is therefore
Example 5.17 Determine the general solution to $y'(x) = y/(x - y)$. What is
the particular solution that satisfies the boundary condition y(3) = 1?
\[ \frac{dy}{dx} = \frac{(y/x)}{1 - (y/x)}, \tag{5.168} \]
\[ y = vx, \quad \text{for which} \quad \frac{dy}{dx} = x\frac{dv}{dx} + v. \tag{5.169} \]
\[ x\frac{dv}{dx} + v = \frac{v}{1 - v}, \tag{5.170} \]
which may be rearranged to give
\[ x\frac{dv}{dx} = \frac{v^2}{1 - v}, \quad \text{that is,} \quad \left(\frac{1}{v^2} - \frac{1}{v}\right)dv = \frac{1}{x}\,dx. \tag{5.171} \]
3 + loge 1 = c ⇒ c = 3. (5.174)
\[ \frac{dy}{dx} + p(x)y = q(x)y^n, \tag{5.178} \]
where p(x) and q(x) are functions of x only, and n is a constant.
To see how this substitution works, observe by the chain rule that
\[ \frac{dv}{dx} = \frac{d}{dx}y^{(1-n)} = \frac{d}{dy}\left[y^{(1-n)}\right]\frac{dy}{dx} = (1 - n)y^{-n}\frac{dy}{dx}, \tag{5.180} \]
that is,
\[ \frac{dy}{dx} = \frac{y^n}{(1 - n)}\frac{dv}{dx}. \tag{5.181} \]
Putting this expression into equation (5.178) we have
\[ \frac{y^n}{(1 - n)}\frac{dv}{dx} + p(x)y = q(x)y^n, \tag{5.182} \]
\[ \frac{1}{(1 - n)}\frac{dv}{dx} + p(x)y^{(1-n)} = q(x), \quad \text{i.e.,} \quad \frac{dv}{dx} + P(x)v = Q(x), \tag{5.183} \]
\[ \frac{dy}{dx} + p(x)y = q(x)y^n, \tag{5.185} \]
where p(x) and q(x) are functions of x only, and n is a constant. If n ≠ 0, 1,
then this equation may be transformed into a first-order linear differential
equation in v(x) by making the substitution
\[ v = y^{(1-n)} \quad \text{for which} \quad \frac{dy}{dx} = \frac{y^n}{(1 - n)}\frac{dv}{dx}. \tag{5.186} \]
where v(x) may be found using the method described in Definition 5.4.
Example 5.19 Determine the general solution to the first-order Bernoulli equation
defined by
\[ \frac{dy}{dx} - \frac{1}{x}y = 3xy^2. \tag{5.188} \]
What is the particular solution that satisfies the boundary condition y(2) = 1?
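\[ v = y^{(1-2)} = y^{-1} \tag{5.189} \]
\[ \frac{dv}{dx} = \frac{d}{dx}y^{-1} = -\frac{1}{y^2}\frac{dy}{dx}, \quad \Rightarrow \quad \frac{dy}{dx} = -y^2\frac{dv}{dx} = -\frac{1}{v^2}\frac{dv}{dx}. \tag{5.190} \]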
Thus, substituting equations (5.189) and (5.190) into equation (5.188) we find
\[ -\frac{1}{v^2}\frac{dv}{dx} - \frac{1}{x}\frac{1}{v} = 3x\frac{1}{v^2}. \tag{5.191} \]
\[ \frac{dv}{dx} + P(x)v = Q(x), \quad \text{with } P(x) = \frac{1}{x} \text{ and } Q(x) = -3x. \tag{5.192} \]
According to Definition 5.4, the general solution to this equation is
\[ v(x) = \frac{1}{\mu(x)}\int \mu(x)Q(x)\,dx, \quad \text{where } \mu(x) = e^{\int P(x)\,dx} \tag{5.193} \]
\[ v(x) = \frac{1}{\mu(x)}\int \mu(x)Q(x)\,dx = \frac{1}{x}\int -3x^2\,dx = \frac{c - x^3}{x}, \tag{5.195} \]
\[ y(x) = \frac{1}{v(x)} = \frac{x}{c - x^3}. \tag{5.196} \]
For this solution to satisfy the boundary condition y(2) = 1 we therefore require
\[ y(2) = \frac{2}{c - 2^3} = \frac{2}{c - 8} = 1 \quad \Leftrightarrow \quad c = 10. \tag{5.197} \]
Hence, selecting c = 10, the particular solution to the Bernoulli equation
(5.188) that satisfies the boundary condition y(2) = 1 is
\[ y(x) = \frac{x}{10 - x^3}. \tag{5.198} \]
Note: We check this solution by differentiating, viz.
\[ \frac{dy}{dx} = \frac{d}{dx}\left[\frac{x}{10 - x^3}\right] = \frac{1}{10 - x^3} + \frac{3x^3}{(10 - x^3)^2}. \tag{5.199} \]
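The same check can be carried out numerically. The Python sketch below evaluates both sides of the Bernoulli equation (5.188) for the family y(x) = x/(c − x³), using the derivative from (5.199) with 10 replaced by a general constant c. The function names and sample points are assumptions made for illustration; points where c − x³ = 0 must be avoided.

```python
def y(x, c):
    """General solution (5.196): y = x / (c - x^3)."""
    return x / (c - x**3)


def dy(x, c):
    """dy/dx as in equation (5.199), written for a general constant c."""
    return 1.0 / (c - x**3) + 3.0 * x**3 / (c - x**3) ** 2


def residual(x, c):
    """dy/dx - y/x - 3 x y^2, which should vanish away from c = x^3."""
    return dy(x, c) - y(x, c) / x - 3.0 * x * y(x, c) ** 2


c = 10.0  # constant fixed by the boundary condition y(2) = 1
print(y(2.0, c), [residual(x, c) for x in (0.5, 1.0, 2.0)])
```

With c = 10 the solution is singular where x³ = 10, which is why the sample points stop at x = 2.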
\[ a_2\frac{d^2y}{dx^2} + a_1\frac{dy}{dx} + a_0 y = \phi(x), \tag{5.201} \]
where the coefficients $a_0$, $a_1$ and $a_2$ are constants, and φ(x) is a function of x
only (cf. first-order linear equations in §5.3.3). Such an equation is said to be
homogeneous if φ(x) = 0; otherwise it is said to be inhomogeneous.
\[ \frac{d^2x}{dt^2} + 2\gamma_0\frac{dx}{dt} + \omega_0^2 x = 0, \tag{5.202} \]
is a homogeneous second-order linear differential equation with constant coefficients
$a_0 = \omega_0^2$, $a_1 = 2\gamma_0$, and $a_2 = 1$.
Similarly, the equation for a damped oscillator driven by a force F (t), that is
\[ \frac{d^2x}{dt^2} + 2\gamma_0\frac{dx}{dt} + \omega_0^2 x = \frac{F(t)}{m}, \tag{5.203} \]
where m is a constant, is an inhomogeneous second-order equation with constant
coefficients $a_0 = \omega_0^2$, $a_1 = 2\gamma_0$, and $a_2 = 1$.
\[ a_2\frac{d^2y}{dx^2} + a_1\frac{dy}{dx} + a_0 y = \phi(x), \tag{5.204} \]
where the coefficients a0 , a1 and a2 are constants, and φ(x) is a function
of x only. Such an equation is said to be homogeneous if φ(x) = 0;
otherwise it is said to be inhomogeneous.
where λ and C are constants. We need to determine the values that these
constants must take for $y(x) = Ce^{\lambda x}$ to satisfy equation (5.205).
When C = 0 the trial solution $y = Ce^{\lambda x}$ clearly satisfies equation (5.205)
because the left-hand-side vanishes trivially. The solution y = 0 is called the
trivial solution.
For non-trivial solutions we need $y = Ce^{\lambda x}$ to satisfy equation (5.205) with
$C \neq 0$, that is, we require
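\[ a_2\frac{d^2}{dx^2}\underbrace{\left(Ce^{\lambda x}\right)}_{y} + a_1\frac{d}{dx}\underbrace{\left(Ce^{\lambda x}\right)}_{y} + a_0\underbrace{Ce^{\lambda x}}_{y} = 0. \tag{5.207} \]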
(a2 λ2 + a1 λ + a0 ) = 0. (5.209)
This equation is called the auxiliary quadratic, and its roots are the values
that λ must take for $y(x) = Ce^{\lambda x}$ to be a solution to equation (5.205). No
condition is placed on the value of C, meaning that C is an arbitrary constant.
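Finding the roots of the auxiliary quadratic is a routine calculation, sketched below in Python using the quadratic formula. The function name `auxiliary_roots` is hypothetical; complex arithmetic (the standard `cmath` module) is used so that real distinct, repeated, and complex-conjugate roots are all handled by the same code. The three sets of coefficients are those of equations (5.222), (5.229) and (5.247).

```python
import cmath


def auxiliary_roots(a2, a1, a0):
    """Roots of a2*lam^2 + a1*lam + a0 = 0, returned as complex numbers."""
    if a2 == 0:
        raise ValueError("a2 must be non-zero for a second-order equation")
    disc = cmath.sqrt(a1 * a1 - 4 * a2 * a0)  # square root of the discriminant
    return (-a1 + disc) / (2 * a2), (-a1 - disc) / (2 * a2)


print(auxiliary_roots(1, -2, -15))  # distinct real roots
print(auxiliary_roots(1, -4, 4))    # repeated real root
print(auxiliary_roots(1, 2, 17))    # complex-conjugate pair
```

The nature of the roots is decided by the sign of the discriminant a1² − 4·a2·a0: positive gives two distinct real roots, zero a repeated root, and negative a complex-conjugate pair.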
Let us label the roots of the auxiliary quadratic λ1 and λ2 . Then it follows
that there is a solution for each root, say
Now observe that because y1 and y2 are both solutions to the differential
equation (5.205) we have
\[ a_2\frac{d^2y_1}{dx^2} + a_1\frac{dy_1}{dx} + a_0 y_1 = 0 \tag{5.211a} \]
\[ \text{and} \quad a_2\frac{d^2y_2}{dx^2} + a_1\frac{dy_2}{dx} + a_0 y_2 = 0. \tag{5.211b} \]
Adding these equations together we obtain
\[ a_2\frac{d^2y}{dx^2} + a_1\frac{dy}{dx} + a_0 y = 0, \quad \text{where } y = y_1 + y_2. \tag{5.212} \]
Hence, the linear combination
is a single arbitrary constant. Since this solution contains only one ar-
bitrary constant, it cannot be the general solution (see Definition 5.1).
Indeed, in this case it may be shown that the general solution is actually
\[ a_2\frac{d^2y}{dx^2} + a_1\frac{dy}{dx} + a_0 y = 0, \tag{5.217} \]
where the coefficients a2 , a1 and a0 define an auxiliary quadratic
a2 λ2 + a1 λ + a0 = 0. (5.218)
After solving for the roots λ1 and λ2 of this quadratic, the general
solution y(x) to the second-order equation (5.217) is found as follows:
\[ \frac{d^2y}{dx^2} - 2\frac{dy}{dx} - 15y = 0. \tag{5.222} \]
Which solution satisfies the boundary conditions $y(0) = 3$ and $y'(0) = -1$?
the roots of the auxiliary quadratic are therefore λ1 = 5 and λ2 = −3. Because
these are distinct roots (λ1 ≠ λ2 ), we have by Method 5.2 that the general
solution to the homogeneous second order equation (5.222) is
y(0) = C1 + C2 = 3. (5.225)
\[ y'(x) = \frac{d}{dx}\left(C_1e^{5x} + C_2e^{-3x}\right) = 5C_1e^{5x} - 3C_2e^{-3x}, \tag{5.226} \]
such that our second boundary condition gives
\[ \ddot{x}(t) - 4\dot{x}(t) + 4x(t) = 0, \quad \text{where } \dot{x} \equiv \frac{dx}{dt} \text{ and } \ddot{x} \equiv \frac{d^2x}{dt^2} \tag{5.229} \]
is Newton’s notation for time-derivatives. Hence determine the particular so-
lution that satisfies the initial conditions x(0) = 0 and ẋ(0) = 2.
The constants C1 and C0 may be fixed to match our initial conditions x(0) = 0
and ẋ(0) = 2 as follows. Beginning with x(0) = 0 we obtain
x(0) = C0 = 0, (5.232)
\[ \dot{x}(t) = \frac{d}{dt}\left(C_1te^{2t}\right) = C_1e^{2t} + 2C_1te^{2t}, \tag{5.234} \]
such that our initial condition ẋ(0) = 2 gives
ẋ(0) = C1 = 2. (5.235)
i.e., A = 2 and B = −1. The particular solution that satisfies the boundary
conditions $y(0) = 2$ and $y(\pi/2) = 1$ is thus $y(x) = 2\cos(3x) - \sin(3x)$. ◀
λ2 + 2λ + 17 = 0. (5.247)
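\[ \begin{aligned} \dot{x}(t) = \frac{dx}{dt} &= \frac{d}{dt}\left\{e^{-t}\left[A\cos(4t) + B\sin(4t)\right]\right\} \\ &= e^{-t}\left[-4A\sin(4t) + 4B\cos(4t)\right] - e^{-t}\left[A\cos(4t) + B\sin(4t)\right] \end{aligned} \]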
Expressing the solution in this way highlights the exponential envelope $2e^{-t}$
governing the attenuation of the oscillation amplitude (see figure 5.7). ◀
[Figure 5.7: damped oscillation with its exponential envelope.]
\[ a_2\frac{d^2y}{dx^2} + a_1\frac{dy}{dx} + a_0 y = 0 \tag{5.257} \]
In Example 5.25 it is shown that the roots λ1 and λ2 of the auxiliary quadratic
a2 λ2 + a1 λ + a0 = 0 (5.258)
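\[ \frac{d^2y}{dx^2} - (\lambda_1 + \lambda_2)\frac{dy}{dx} + (\lambda_1\lambda_2)y = 0. \tag{5.260} \]
Here the left-hand-side may be rearranged into a ‘factorised’ form, viz.
\[ \frac{d}{dx}\left(\frac{dy}{dx} - \lambda_2 y\right) - \lambda_1\left(\frac{dy}{dx} - \lambda_2 y\right) = 0, \tag{5.261} \]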
or equivalently
\[ \frac{dQ}{dx} - \lambda_1 Q = 0, \quad \text{where } Q(x) \text{ is defined by } \frac{dy}{dx} - \lambda_2 y = Q(x). \tag{5.262} \]
Equations (5.262) express the second-order equation (5.257) in terms of two
first-order equations; in particular, if y(x) satisfies the second of these equa-
tions, then y(x) satisfies equation (5.257). We have therefore reduced the or-
der of the problem: rather than try to solve the second-order equation (5.257)
directly, we will solve the first-order equations (5.262).
\[ \frac{dQ}{dx} - \lambda_1 Q = 0. \tag{5.263} \]
This is a homogeneous first-order linear differential equation, so its general
solution is given by
\[ Q(x) = Ce^{\lambda_1 x} \tag{5.264} \]
where C is an arbitrary constant (see Result 5.1). Putting this solution (5.264)
into the second of equations (5.262) we have
\[ \frac{dy}{dx} + P(x)y = Q(x) \quad \text{with } P(x) = -\lambda_2 \text{ and } Q(x) = Ce^{\lambda_1 x}. \tag{5.265} \]
This is a first-order linear differential equation of the type considered in §5.3.5.
According to Definition 5.4, the general solution to this equation is
\[ y(x) = \frac{1}{\mu(x)}\int \mu(x)Q(x)\,dx, \quad \text{where } \mu(x) = e^{\int P(x)\,dx} \tag{5.266} \]
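hence, substituting for µ(x) and Q(x) in equation (5.266) we have the general
solution
\[ y(x) = e^{\lambda_2 x}\int Ce^{(\lambda_1 - \lambda_2)x}\,dx. \tag{5.268} \]
Now if $\lambda_1 \neq \lambda_2$, then the integral in this equation gives the general solution
\[ \begin{aligned} y(x) &= e^{\lambda_2 x}\int Ce^{(\lambda_1 - \lambda_2)x}\,dx \\ &= e^{\lambda_2 x}\left[\frac{C}{(\lambda_1 - \lambda_2)}e^{(\lambda_1 - \lambda_2)x} + C_2\right] \\ &= \frac{C}{(\lambda_1 - \lambda_2)}e^{\lambda_1 x} + C_2e^{\lambda_2 x} \\ &= C_1e^{\lambda_1 x} + C_2e^{\lambda_2 x}, \end{aligned} \tag{5.269} \]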
Hence, summarising our results from equations (5.269) and (5.270), the gen-
eral solution to the homogeneous second-order linear equation (5.257) is
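\[ y(x) = \begin{cases} C_1e^{\lambda_1 x} + C_2e^{\lambda_2 x} & \text{if } \lambda_1 \neq \lambda_2 \\ (C_1x + C_0)e^{\lambda x} & \text{if } \lambda = \lambda_1 = \lambda_2 \end{cases}, \tag{5.271} \]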
Result 5.3 The general form of the homogeneous second-order linear dif-
ferential equation with constant coefficients is
\[ a_2\frac{d^2y}{dx^2} + a_1\frac{dy}{dx} + a_0 y = 0, \tag{5.272} \]
where the coefficients a2 , a1 and a0 define an auxiliary quadratic
a2 λ2 + a1 λ + a0 = 0. (5.273)
Example 5.25 Show that if λ1 and λ2 are the roots of the auxiliary quadratic
a2 λ2 + a1 λ + a0 = 0, then (λ1 + λ2 ) = − (a1 /a2 ) and λ1 λ2 = (a0 /a2 ).
Thus, comparing each of the terms in equations (5.275) and (5.276), we have
that (λ1 + λ2 ) = − (a1 /a2 ) and λ1 λ2 = (a0 /a2 ) as required. ◀
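The relations in Example 5.25 can be spot-checked numerically. The Python sketch below computes the two roots from the quadratic formula and returns their sum and product for comparison with −a1/a2 and a0/a2. The function name and the sample coefficients are assumptions made for illustration; agreement for a few cases does not replace the algebraic argument.

```python
import cmath


def root_sum_and_product(a2, a1, a0):
    """Sum and product of the roots of a2*lam^2 + a1*lam + a0 = 0."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2 * a0)
    lam1 = (-a1 + disc) / (2 * a2)
    lam2 = (-a1 - disc) / (2 * a2)
    return lam1 + lam2, lam1 * lam2


# Compare with -a1/a2 and a0/a2 for a few illustrative quadratics.
for a2, a1, a0 in ((1, -2, -15), (1, 2, 17), (2, 3, 7)):
    total, product = root_sum_and_product(a2, a1, a0)
    print((a2, a1, a0), total, -a1 / a2, product, a0 / a2)
```

Any imaginary parts cancel in both the sum and the product, so the comparison is meaningful even when the roots themselves are complex.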
\[ a_2\frac{d^2y}{dx^2} + a_1\frac{dy}{dx} + a_0 y = \phi(x), \tag{5.277} \]
where φ(x) ≠ 0. We now explore how to solve equations of this kind using the
method employed for first-order inhomogeneous equations in §5.3.3.
As with first-order inhomogeneous equations, we begin by asserting that the
general solution to equation (5.277) is
where yc (x) is the complementary function, and yp (x) is the particular in-
tegral. For second-order equations, these two functions are defined as follows:
\[ a_2\frac{d^2y_p}{dx^2} + a_1\frac{dy_p}{dx} + a_0 y_p = \phi(x), \tag{5.280} \]
that does not contain any arbitrary constants. The particular integral
may be found using the method of undetermined coefficients.
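\[ \begin{aligned} a_2\frac{d^2y}{dx^2} + a_1\frac{dy}{dx} + a_0 y &= a_2\frac{d^2}{dx^2}(y_c + y_p) + a_1\frac{d}{dx}(y_c + y_p) + a_0(y_c + y_p) \\ &= \underbrace{a_2\frac{d^2y_c}{dx^2} + a_1\frac{dy_c}{dx} + a_0 y_c}_{=0} + \underbrace{a_2\frac{d^2y_p}{dx^2} + a_1\frac{dy_p}{dx} + a_0 y_p}_{=\phi(x)} = \phi(x) \end{aligned} \tag{5.281} \]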
as required. That y = yc + yp is the general solution follows from the fact that
it is the solution to a second-order differential equation (an equation of order
2) containing two (i.e., 2) arbitrary constants (see Definition 5.1).
\[ a_2\frac{d^2y}{dx^2} + a_1\frac{dy}{dx} + a_0 y = \phi(x) \tag{5.282} \]
may be determined by the following steps:
\[ a_2\frac{d^2y_c}{dx^2} + a_1\frac{dy_c}{dx} + a_0 y_c = 0, \tag{5.283} \]
i.e., by following Method 5.2.
\[ a_2\frac{d^2y_p}{dx^2} + a_1\frac{dy_p}{dx} + a_0 y_p = \phi(x), \tag{5.284} \]
e.g., by the method of undetermined coefficients, see §5.3.4.
\[ \frac{d^2y}{dx^2} - 2\frac{dy}{dx} - 15y = 30x^2 - 7x. \tag{5.286} \]
[Hint: You may wish to refer to Example 5.20.]
φ(x)            Trial form for yp(x)
c0              k0
c0 e^(γx)       k0 e^(γx)
Table 5.2: Restatement of the trial functions listed in table 5.1. Note: If any term in the
proposed trial form for yp(x) is already present in the complementary function yc(x), then
multiply yp(x) by x^m, where m is the lowest integer power such that no coincidence occurs.
\[ \frac{d^2y_c}{dx^2} - 2\frac{dy_c}{dx} - 15y_c = 0. \tag{5.287} \]
This equation is identical to the homogeneous equation of Example 5.20, so
we can use the solution (5.224) found therein, i.e.,
\[
\frac{d^2 y_p}{dx^2} - 2\frac{dy_p}{dx} - 15y_p = \phi(x), \quad\text{where}\quad \phi(x) = 30x^2 - 7x. \tag{5.289}
\]
We shall solve this equation using the method of undetermined coefficients.
Substituting this trial solution for yp (x) into equation (5.289) we require
\[
\frac{d^2}{dx^2}\left(k_2 x^2 + k_1 x + k_0\right) - 2\frac{d}{dx}\left(k_2 x^2 + k_1 x + k_0\right)
- 15\left(k_2 x^2 + k_1 x + k_0\right) = 30x^2 - 7x, \tag{5.291}
\]
that is,
−15k2 = 30, −(4k2 + 15k1) = −7, and (2k2 − 2k1 − 15k0) = 0. (5.293)
k0 = −2/5, k1 = 1, k2 = −2, (5.294)
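The coefficients k0 = −2/5, k1 = 1, and k2 = −2 can be confirmed by substituting the quadratic trial function back into equation (5.289). The sketch below uses exact rational arithmetic; the function names `lhs` and `rhs` are hypothetical helpers introduced only for this check.

```python
from fractions import Fraction

# Coefficients of the trial function y_p(x) = k2*x**2 + k1*x + k0
k2, k1, k0 = Fraction(-2), Fraction(1), Fraction(-2, 5)

def lhs(x):
    """Evaluate y_p'' - 2*y_p' - 15*y_p at x using the exact derivatives of the quadratic."""
    yp = k2 * x**2 + k1 * x + k0
    dyp = 2 * k2 * x + k1
    d2yp = 2 * k2
    return d2yp - 2 * dyp - 15 * yp

def rhs(x):
    """Right-hand side phi(x) = 30x^2 - 7x of equation (5.289)."""
    return 30 * x**2 - 7 * x

# Two quadratics that agree at more than two points are identical
checks = [lhs(Fraction(x)) == rhs(Fraction(x)) for x in (-2, 0, 1, 3)]
print(all(checks))
```

Because both sides are polynomials of degree two, agreement at four distinct points is enough to establish that they are equal for all x.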
◦ Step (3): With reference to equations (5.288) and (5.295), the general
solution to equation (5.286) is
yp''(x) + 9yp (x) = φ(x), where φ(x) = 16 cos x − 8 sin x. (5.300)
\[
\frac{d^2}{dx^2}\left[a\cos x + b\sin x\right] + 9\left[a\cos x + b\sin x\right] = 16\cos x - 8\sin x, \tag{5.302}
\]
that is,
Example 5.28 Find the general solution to the second order equation
\[
\frac{d^2 x}{dt^2} - 4\frac{dx}{dt} + 4x = 6e^{2t}. \tag{5.308}
\]
Hence determine the particular solution that satisfies the boundary conditions
x(0) = −4 and x(1) = e². [You may wish to refer to Example 5.21.]
I Solution: As before, we adopt the three step process outlined in Method 5.3.
\[
\frac{d^2 x_c}{dt^2} - 4\frac{dx_c}{dt} + 4x_c = 0. \tag{5.309}
\]
This equation is identical to the homogeneous differential equation of Example
5.21, so we can use the solution (5.231) found therein, i.e.,
\[
\frac{d^2 x_p}{dt^2} - 4\frac{dx_p}{dt} + 4x_p = \phi(t), \quad\text{where}\quad \phi(t) = 6e^{2t}. \tag{5.311}
\]
We shall solve this equation using the method of undetermined coefficients.
Because φ(t) = 6e^{2t} is exponential, first inspection of table 5.2 suggests that
we should use a trial function of the form xp (t) = ke^{2t}, where k is a coefficient
to be determined. Notice, however, that the complementary function xc (t)
already includes terms in e^{2t} and te^{2t}, so xp (t) = ke^{2t} is unsuitable.
To avoid coincidental terms between xp (t) and xc (t) we multiply xp (t) = ke^{2t}
by t² (the lowest power of t such that no coincidence occurs); we therefore try
\[
\frac{d^2}{dt^2}\left(kt^2 e^{2t}\right) - 4\frac{d}{dt}\left(kt^2 e^{2t}\right) + 4\left(kt^2 e^{2t}\right) = 6e^{2t}, \tag{5.313}
\]
Engineering Mathematics : ELE00038C J.J.B. 2023-2024
62 Section 5 : Ordinary Differential Equations
\[
(4kt^2 + 8kt + 2k)e^{2t} - 4(2kt^2 + 2kt)e^{2t} + 4kt^2 e^{2t} = 2ke^{2t} = 6e^{2t}, \tag{5.314}
\]
◦ Step (3): Combining equations (5.310) and (5.315), the general solution
to equation (5.308) is thus
For this solution to satisfy the boundary conditions x(0) = −4 and x(1) = e²
we require
this means that Qc (t) is the transient solution. The particular integral,
however, does not vanish; this means that Qp (t) is the steady state solution.
Figure 5.8: Transient and steady-state solutions to the capacitor charging problem.
66 Section 6 : Vectors and Vector Algebra
F = ma. (6.1)
Here the inertial mass m is a scalar quantity having magnitude only; however,
typically we are interested in both the magnitude and direction of the mass’s
acceleration, so we need to know both how large the force is, and the direc-
tion in which it acts. Since direction and magnitude are both essential for
determining resulting motion, force F and acceleration a are vector quantities.
In equation (6.1) force F and acceleration a have been denoted using ‘bold-
type’, and we adopt this convention for vectors throughout. In written work,
however, it is customary to denote vectors by either underlining, or overlining
with arrows; for example, given an arbitrary vector v, we can use the notations
\[
\mathbf{v} \equiv \underline{v} \equiv \vec{v}. \tag{6.2}
\]
Figure 6.2: Position vectors a and b, a displacement vector c, and an arbitrary path P.
Similarly, a vector which describes the relative displacement from one point to
another is known as a displacement vector. The two points A and B in figure
6.2 are related by the displacement c. A displacement vector from one point
to another is sometimes denoted using an ‘over-arrow’, e.g., the displacement
vectors from (i) A to B, and (ii) B to A can be denoted
\[
\text{(i)}\ \overrightarrow{AB} = \mathbf{c} \quad\text{and}\quad \text{(ii)}\ \overrightarrow{BA} = -\mathbf{c}. \tag{6.3}
\]
The algebra of vectors is similar to that of scalars, but distinct in certain impor-
tant ways. In the following subsections we discuss the addition and subtraction
of vectors (§6.2.1), and the meaning of multiplying a vector by a scalar (§6.2.2);
we also introduce the idea of the unit vector (§6.2.3).
a + b = b + a. (6.4)
a + (b + c) = (a + b) + c. (6.5)
a − b = a + (−b), (6.6)
where −b has the same magnitude as b, but points in the opposite direction.
These properties of vector addition and subtraction are depicted in figure 6.4.
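Example 6.1 Figure 6.2 depicts two points A and B with position vectors a
and b; express the vectors $\overrightarrow{AB}$ = c and $\overrightarrow{BA}$ = −c in terms of a and b.
I Solution: The vectors $\overrightarrow{AB}$ = c and $\overrightarrow{BA}$ = −c may be written as
\[
\overrightarrow{AB} = \mathbf{c} = \mathbf{b} - \mathbf{a} \quad\text{and}\quad \overrightarrow{BA} = -\mathbf{c} = \mathbf{a} - \mathbf{b};
\]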
this may be seen by ‘following’ the directions of the vectors in figure 6.2. J
Figure 6.4: Vector subtraction (a−b), and additive associativity (a+b)+c = a+(b+c).
Note that the subtraction of two equal vectors yields a vector of zero magnitude
called the zero vector, viz.
a − a = 0. (6.8)
The magnitude of a vector a is denoted |a| and refers to the vector’s absolute
size (or length), irrespective of direction; for example, given some force vector
F, the magnitude |F| is the size of the force. We also sometimes denote vector
magnitude using plain-type, e.g., we can denote the magnitude of a as
a ≡ |a|. (6.9)
Given two vectors a and b, and two scalars λ and µ, scalar multiplication
satisfies the following properties
Note that multiplication of a vector by zero yields the zero vector, i.e., 0a = 0.
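Definition 6.2 Let a be a non-zero vector (|a| ≠ 0). Then the unit
vector â that points in the same direction as a is defined by
\[
\hat{\mathbf{a}} \equiv \frac{\mathbf{a}}{|\mathbf{a}|}, \quad\text{where}\quad |\hat{\mathbf{a}}| = \frac{|\mathbf{a}|}{|\mathbf{a}|} = 1. \tag{6.11}
\]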
Any non-zero vector can therefore be written in the form a = |a|â, that is,
with its magnitude |a| and direction â separated explicitly (see figure 6.5).
Example 6.2 Let a and b be two vectors that point in opposite directions,
where a has twice the magnitude of b. Write down: (i) the vector a in terms
of b; and (ii) the unit vector of a in terms of the unit vector of b.
I Solution: (i) Since a is twice the magnitude of b, but points in the opposite
direction, we have a = −2b. (ii) By definition, the unit vectors â and b̂ have
magnitude |â| = |b̂| = 1, so they are simply antiparallel, i.e., â = −b̂. J
Example 6.3 Let the points A, B, and P have position vectors a, b, and p,
where P is a point that divides AB in the ratio λ : µ, as shown in figure 6.6.
(a) Let c = $\overrightarrow{AB}$ be the vector from A to B; express c in terms of a and b.
(b) Let d = $\overrightarrow{AP}$ be the vector from A to P ; show that d = (|d|/|c|)c.
(c) Determine the ratio |d|/|c|. Hence express d in terms of c, λ, and µ.
(d) Hence demonstrate that
\[
\mathbf{p} = \frac{\mu}{\lambda+\mu}\,\mathbf{a} + \frac{\lambda}{\lambda+\mu}\,\mathbf{b}. \tag{6.12}
\]
Figure 6.6: Point P dividing the line joining A and B in the ratio λ : µ.
(a) To get to B from A we travel from A to the origin O, and then from the
origin O to B, i.e.,
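\[
\mathbf{c} = \overrightarrow{AB} = -\mathbf{a} + \mathbf{b} = (\mathbf{b} - \mathbf{a}). \tag{6.13}
\]
(b) By the sketch we have that d = $\overrightarrow{AP}$ points in the same direction as
c = $\overrightarrow{AB}$; this means that the unit vectors d̂ and ĉ are identical, viz.
\[
\frac{\mathbf{d}}{|\mathbf{d}|} = \hat{\mathbf{d}} = \hat{\mathbf{c}} = \frac{\mathbf{c}}{|\mathbf{c}|}, \quad\text{that is,}\quad \mathbf{d} = \frac{|\mathbf{d}|}{|\mathbf{c}|}\,\mathbf{c}. \tag{6.14}
\]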
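(d) With reference to the sketch, it follows from equation (6.16) that
\begin{align}
\mathbf{p} &= \mathbf{a} + \overrightarrow{AP} = \mathbf{a} + \frac{\lambda}{\lambda+\mu}\,\mathbf{c} = \mathbf{a} + \frac{\lambda}{\lambda+\mu}(\mathbf{b} - \mathbf{a}) \notag \\
 &= \left(1 - \frac{\lambda}{\lambda+\mu}\right)\mathbf{a} + \frac{\lambda}{\lambda+\mu}\,\mathbf{b} \notag \\
 &= \frac{\mu}{\lambda+\mu}\,\mathbf{a} + \frac{\lambda}{\lambda+\mu}\,\mathbf{b} \tag{6.17}
\end{align}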
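The ratio formula derived in Example 6.3 translates directly into a component-wise calculation. The following is a minimal sketch; the function name `dividing_point` and the sample coordinates are hypothetical and chosen only for illustration.

```python
def dividing_point(a, b, lam, mu):
    """Position vector p of the point dividing AB in the ratio lam : mu,
    using p = (mu*a + lam*b) / (lam + mu)."""
    total = lam + mu
    return tuple((mu * ai + lam * bi) / total for ai, bi in zip(a, b))

# Hypothetical data: A = (0, 0), B = (6, 3), and P dividing AB in the ratio 1 : 2
p = dividing_point((0.0, 0.0), (6.0, 3.0), 1, 2)
print(p)  # (2.0, 1.0)
```

With λ : µ = 1 : 2 the point P lies one third of the way from A to B, which matches the printed result.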
Let i and j be unit vectors in the x-y plane which point in the x-direction and
y-direction respectively (see figure 6.7). Then any vector $\mathbf{a}_x$ that is parallel (or
anti-parallel) to the x-axis may be written as $\mathbf{a}_x = a_x\mathbf{i}$, where $a_x$ is a scalar;
$a_x$ is positive if $\mathbf{a}_x$ is parallel to i, and negative if $\mathbf{a}_x$ is anti-parallel to i.
Likewise, any vector $\mathbf{a}_y$ that is parallel (or anti-parallel) to the y-axis may be
written in the form $\mathbf{a}_y = a_y\mathbf{j}$, where $a_y$ is a scalar; $a_y$ is positive if $\mathbf{a}_y$ is parallel
to j, and negative if $\mathbf{a}_y$ is anti-parallel to j.
\[
\mathbf{a} = \mathbf{a}_x + \mathbf{a}_y = a_x\mathbf{i} + a_y\mathbf{j}, \tag{6.18}
\]
Note that an equivalent notation for a is as the ordered pair (ax , ay ), i.e.,
a = ax i + ay j ≡ (ax , ay ). (6.19)
The vectors i and j are perpendicular; this means that the magnitudes of the
components ax and ay are given by
is the magnitude of a, and θx and θy are the angles between a and the x and
y-directions respectively (see figure 6.7). For this reason, cos θx and cos θy are
referred to as the direction cosines of a.
Example 6.4 Let A be the point with position vector a = (5, 2), and let
c = −i − 3j be the displacement vector from A to another point B. If
b = (bx , by ) is the position vector of B, determine the components bx and by .
hence,
b = a + c = (5i + 2j) + (−i − 3j) = 4i − j. (6.22)
a = ax i + ay j + az k ≡ (ax , ay , az ), (6.23)
where az is the component of a in the z-direction (see figure 6.8). Notice that
az = a cos θz , (6.24)
with θz as the angle between a and the z-axis, so cos θz is a direction cosine like
cos θx and cos θy . The notation a ≡ (ax , ay , az ) is called an ordered triplet.
Definition 6.3 The vectors i, j, and k are unit vectors in the x, y, and z-
directions respectively, and are referred to as the Cartesian basis vectors.
A vector a may be written in the Cartesian form
a = ax i + ay j + az k ≡ (ax , ay , az ), (6.26)
where cos θx , cos θy , and cos θz are called the direction cosines of a.
One advantage of expressing vectors in Cartesian form is that addition and sub-
traction may be done by component; thus, given two vectors a = (ax , ay , az ),
and b = (bx , by , bz ) we have
a + b = (ax i + ay j + az k) + (bx i + by j + bz k)
= (ax + bx )i + (ay + by )j + (az + bz )k. (6.29)
Recall that the magnitude of a vector a is its absolute size a = |a| irrespective
of the direction in which it points. In two dimensions this means that the
magnitude of an arbitrary vector a = ax i + ay j is given by
\[
a = |\mathbf{a}| = \sqrt{a_x^2 + a_y^2}, \tag{6.31}
\]
since the length a is simply the hypotenuse of a right-angled triangle with sides
ax and ay (see figure 6.7). Similarly, in three dimensions the magnitude of a
vector a = ax i + ay j + az k is
\[
a = |\mathbf{a}| = \sqrt{a_x^2 + a_y^2 + a_z^2} \tag{6.32}
\]
(see figure 6.8). Thus, the components of a vector can be used to compute
the vector’s magnitude according to Pythagoras’s theorem. Notice that this
means that the direction cosines of a vector (see Definition 6.3) satisfy
For position and displacement vectors the magnitude (or length) corresponds
directly to distance. However, for other kinds of vectors we must interpret the
length differently; for example, when given a particle’s velocity vector v the
magnitude |v| is the particle’s speed v = |v|.
Example 6.6 Calculate the relative speed of the two particles in Example 6.5.
I Solution: By our solution to Example 6.5 we have that the relative velocity
is u = −i + j + 5k; the relative speed is thus
\[
u = |\mathbf{u}| = \sqrt{(-1)^2 + 1^2 + 5^2} = \sqrt{27}, \tag{6.34}
\]
\[
\hat{\mathbf{a}} \equiv \frac{\mathbf{a}}{|\mathbf{a}|} = \frac{1}{9}(4\mathbf{i} + 7\mathbf{j} - 4\mathbf{k}). \tag{6.36}
\]
Component form thus offers a simple method for determining unit vectors. J
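The magnitude and unit-vector calculations above follow the same pattern for any vector given in component form. A minimal sketch is given below; the helper names `magnitude` and `unit` are assumptions introduced for this illustration.

```python
import math

def magnitude(v):
    """Length of a vector from its Cartesian components (Pythagoras's theorem)."""
    return math.sqrt(sum(c * c for c in v))

def unit(v):
    """Unit vector in the direction of v; undefined for the zero vector."""
    m = magnitude(v)
    if m == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(c / m for c in v)

u = (-1, 1, 5)       # relative velocity used in Example 6.6
print(magnitude(u))  # sqrt(27), roughly 5.196

a = (4, 7, -4)       # vector whose unit vector is (1/9)(4i + 7j - 4k)
print(magnitude(a), unit(a))
```

The guard against a zero magnitude reflects the requirement in Definition 6.2 that the vector be non-zero.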
Definition 6.4 The scalar product (or dot product) of two vectors a
and b is denoted a · b, and yields a scalar quantity defined by
where θ is the angle between the two vectors when their ‘tails’ are placed
together (see figure 6.9). The scalar product satisfies
a · b = b · a, and a · (b + c) = a · b + a · c. (6.38)
Figure 6.9: Scalar projection b cos θ of b onto a when: (a) 0 ≤ θ < π/2; and (b) π/2 < θ ≤ π.
Observe that the scalar product a · b = |a||b| cos θ yields two special cases:
In this way any unit vector â satisfies â · â = |â|2 = 1. Hence, because the
Cartesian basis vectors are mutually perpendicular, we have
i · i = j · j = k · k = 1, and i · j = j · k = k · i = 0. (6.41)
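\begin{align*}
\mathbf{a}\cdot\mathbf{b} &= (a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k})\cdot(b_x\mathbf{i} + b_y\mathbf{j} + b_z\mathbf{k}) \\
 &= a_x b_x(\mathbf{i}\cdot\mathbf{i}) + a_x b_y(\mathbf{i}\cdot\mathbf{j}) + a_x b_z(\mathbf{i}\cdot\mathbf{k}) \\
 &\quad + a_y b_x(\mathbf{j}\cdot\mathbf{i}) + a_y b_y(\mathbf{j}\cdot\mathbf{j}) + a_y b_z(\mathbf{j}\cdot\mathbf{k}) \\
 &\quad + a_z b_x(\mathbf{k}\cdot\mathbf{i}) + a_z b_y(\mathbf{k}\cdot\mathbf{j}) + a_z b_z(\mathbf{k}\cdot\mathbf{k}) \\
 &= a_x b_x(\mathbf{i}\cdot\mathbf{i}) + a_y b_y(\mathbf{j}\cdot\mathbf{j}) + a_z b_z(\mathbf{k}\cdot\mathbf{k}) \\
 &= a_x b_x + a_y b_y + a_z b_z,
\end{align*}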
where we used the fact that i, j, and k are orthogonal unit vectors to write
i · j = j · k = k · i = 0, and i · i = j · j = k · k = 1 (6.44)
as in equation (6.41). J
Example 6.9 Show that the angle θ between the vectors a = i − 2j + 6k and
b = 8i + 7j + k is given by θ = π/2 (i.e., show that the vectors are orthogonal).
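The claim in Example 6.9 can be checked numerically using the component form of the scalar product. The sketch below is illustrative only; `dot` and `angle_between` are hypothetical helper names.

```python
import math

def dot(a, b):
    """Scalar product in component form: ax*bx + ay*by + az*bz."""
    return sum(x * y for x, y in zip(a, b))

def angle_between(a, b):
    """Angle between two non-zero vectors, from cos(theta) = a.b / (|a||b|)."""
    cos_theta = dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))
    # Clamp to [-1, 1] to guard against rounding just outside the domain of acos
    return math.acos(max(-1.0, min(1.0, cos_theta)))

a = (1, -2, 6)
b = (8, 7, 1)
print(dot(a, b))                         # 8 - 14 + 6 = 0
print(angle_between(a, b), math.pi / 2)  # the two values agree
```

A zero scalar product for non-zero vectors is exactly the orthogonality condition, so the angle evaluates to π/2.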
The scalar product has many useful geometric applications, including determin-
ing the angle between vectors, and the projection of one vector onto another:
Definition 6.5 Let θ be the angle between two non-zero vectors a and b:
These definitions are represented as vector diagrams in figures 6.9 and 6.10.
Figure 6.10: Vector projection b∥ = (|b| cos θ)â of b onto a when: (a) b∥ is parallel to
â, i.e., 0 ≤ θ < π/2; and (b) b∥ is antiparallel to â, i.e., π/2 < θ ≤ π (cf. figure 6.9).
The projected vector of b onto a describes the amount of b that acts in the
same direction as a. We can similarly define a vector b⊥ to describe the amount
of b that acts at right-angles to a; such a vector must satisfy (cf. figure 6.10)
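(b) Thus, by the definition of the scalar product a · b = |a||b| cos θ we obtain
\[
\cos\theta = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{a}||\mathbf{b}|} = \frac{8}{4\sqrt{2}\times 2\sqrt{2}} = \frac{1}{2}, \quad\text{that is,}\quad \theta = \frac{\pi}{3}. \tag{6.48}
\]
(c) By Definition 6.5 we have that: (i) the scalar projection of b onto a is
given by |b| cos θ = 2√2/2 = √2; and (ii) the vector projection is
\[
\mathbf{b}_{\parallel} = (|\mathbf{b}|\cos\theta)\,\hat{\mathbf{a}} = \sqrt{2}\,\frac{\mathbf{a}}{|\mathbf{a}|} = \frac{\sqrt{2}}{4\sqrt{2}}(4\mathbf{i} + 4\mathbf{k}) = \mathbf{i} + \mathbf{k}. \tag{6.49}
\]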
The vectors a, b, and n̂ form what is known as a right-handed set {a, b, n̂};
we use this set to define the vector product.
Now suppose we wish to find b × a; in this case when we curl our fingers in the
direction of the angle θ from b to a our thumb points in the direction opposite
to a × b. Since the magnitude |a||b| sin θ is unchanged, Definition 6.6 implies
Result 6.1 The following results are useful for treating vector products:
Observe that the magnitude of the vector product |a × b| = |a||b| sin θ yields
two special cases:
Since any vector a is parallel to itself, this means that a × a = 0. Hence, the
Cartesian basis vectors satisfy
i × i = j × j = k × k = 0. (6.55)
Now consider the right-handed set formed by the Cartesian basis vectors shown
in figure 6.12; the definition of the vector product means that
i × j = k, j × k = i, k × i = j, (6.56)
where the directions of the products follow from the right-hand-rule (see figure
6.12), and the magnitudes are all unity, e.g., |i × j| = |i||j| sin(π/2) = 1.
Figure 6.12: Cartesian basis vectors are a right-handed set {i, j, k} with i × j = k.
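\begin{align}
\mathbf{a}\times\mathbf{b} &= (a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k})\times(b_x\mathbf{i} + b_y\mathbf{j} + b_z\mathbf{k}) \notag \\
 &= a_x b_x\,\mathbf{i}\times\mathbf{i} + a_x b_y\,\mathbf{i}\times\mathbf{j} + a_x b_z\,\mathbf{i}\times\mathbf{k} \notag \\
 &\quad + a_y b_x\,\mathbf{j}\times\mathbf{i} + a_y b_y\,\mathbf{j}\times\mathbf{j} + a_y b_z\,\mathbf{j}\times\mathbf{k} \notag \\
 &\quad + a_z b_x\,\mathbf{k}\times\mathbf{i} + a_z b_y\,\mathbf{k}\times\mathbf{j} + a_z b_z\,\mathbf{k}\times\mathbf{k}. \tag{6.57}
\end{align}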
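\begin{align*}
\mathbf{a}\times\mathbf{b} &= a_x b_y\,\mathbf{i}\times\mathbf{j} - a_x b_z\,\mathbf{k}\times\mathbf{i} \\
 &\quad - a_y b_x\,\mathbf{i}\times\mathbf{j} + a_y b_z\,\mathbf{j}\times\mathbf{k} \\
 &\quad + a_z b_x\,\mathbf{k}\times\mathbf{i} - a_z b_y\,\mathbf{j}\times\mathbf{k} \\
 &= (a_y b_z - a_z b_y)\,\mathbf{j}\times\mathbf{k} + (a_z b_x - a_x b_z)\,\mathbf{k}\times\mathbf{i} + (a_x b_y - a_y b_x)\,\mathbf{i}\times\mathbf{j},
\end{align*}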
where we used the fact that i × k = −(k × i), and so forth. Hence, using
equation (6.56) to write j × k = i etc., we obtain
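\[
(a_y b_z - a_z b_y)\mathbf{i} + (a_z b_x - a_x b_z)\mathbf{j} + (a_x b_y - a_y b_x)\mathbf{k} \equiv
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix}. \tag{6.59}
\]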
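\[
\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix}. \tag{6.61}
\]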
I Solution: Using the determinant method for the component form we have
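\begin{align}
\mathbf{a}\times\mathbf{b} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 3 & 1 & -1 \\ 1 & 2 & 1 \end{vmatrix} \notag \\
 &= (1\times 1 - (-1)\times 2)\mathbf{i} + (-1\times 1 - 3\times 1)\mathbf{j} + (3\times 2 - 1\times 1)\mathbf{k} \notag \\
 &= 3\mathbf{i} - 4\mathbf{j} + 5\mathbf{k}. \tag{6.62}
\end{align}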
Likewise,
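\[
\mathbf{b}\times\mathbf{a} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 2 & 1 \\ 3 & 1 & -1 \end{vmatrix} = -3\mathbf{i} + 4\mathbf{j} - 5\mathbf{k}.
\]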
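The component form of the vector product is straightforward to evaluate programmatically. The following sketch reproduces the two products above; the function name `cross` is an assumption for this illustration.

```python
def cross(a, b):
    """Vector product in component form:
    (ay*bz - az*by, az*bx - ax*bz, ax*by - ay*bx)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

a = (3, 1, -1)
b = (1, 2, 1)
print(cross(a, b))  # (3, -4, 5)
print(cross(b, a))  # (-3, 4, -5)
```

Swapping the order of the arguments reverses the sign of every component, in line with the anti-commutative property b × a = −(a × b).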
\[
\hat{\mathbf{n}} = \frac{\mathbf{a}\times\mathbf{b}}{|\mathbf{a}\times\mathbf{b}|}, \tag{6.63}
\]
is a unit normal to both a and b, with {a, b, n̂} forming a right-handed set.
Result 6.3 Let a and b be non-parallel vectors, then the unit normal n̂
given by
\[
\hat{\mathbf{n}} = \frac{\mathbf{a}\times\mathbf{b}}{|\mathbf{a}\times\mathbf{b}|} \tag{6.64}
\]
is a unit vector that is perpendicular to both a and b.
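Example 6.12 Determine a unit normal to a = 4i + 4k, and b = i + √6 j + k.
\[
\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 4 & 0 & 4 \\ 1 & \sqrt{6} & 1 \end{vmatrix} = -4\sqrt{6}\,\mathbf{i} + 4\sqrt{6}\,\mathbf{k}, \tag{6.65}
\]
with magnitude |a × b| = √(4² × 6 + 4² × 6) = 8√3. Hence, the unit vector
given by n̂ = (a × b)/|a × b| = (1/√2)(−i + k) is a unit normal to a and b. J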
Figure 6.14 illustrates a parallelogram whose sides are two vectors a and b
separated by the angle θ. Because the height of the parallelogram is |b| sin θ,
while the base has length |a|, the area A of the parallelogram is
a × b = 3i − 4j + 5k. (6.67)
The vector product has several applications to physical problems involving ro-
tations. For example, consider a force F acting through a point R with position
vector r as in figure 6.15. The moment M of the force about the origin O is
defined by |F| multiplied by its perpendicular distance l⊥ to O; hence,
Notice also that the sense of the moment is that of the rotation from r to F; thus, we
can specify both the direction and magnitude of the moment as M, where
L=r×p (6.71)
a scalar parameter that is unique to each point on the line. In component form,
therefore, we have
\[
\frac{x - a_x}{b_x} = \frac{y - a_y}{b_y} = \frac{z - a_z}{b_z}, \tag{6.75}
\]
Figure 6.16: Line parallel to b, passing through the point A with position vector a.
c = (p − a), (6.76)
then the perpendicular (and thus shortest) distance l⊥ from the line to P is
where θ is the angle between c and b. Hence, with b̂ = b/|b| as a unit vector
in the direction of b, we have by the definition of the vector product that
\[
\hat{\mathbf{b}} = \frac{1}{3}\mathbf{b} = \frac{1}{3}(2\mathbf{i} + \mathbf{j} - 2\mathbf{k}). \tag{6.79}
\]
3 3
Thus, with
(p − a) = 3j, (6.80)
we have that
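\[
(\mathbf{p} - \mathbf{a})\times\hat{\mathbf{b}} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & 3 & 0 \\ \tfrac{2}{3} & \tfrac{1}{3} & -\tfrac{2}{3} \end{vmatrix} = -2\mathbf{i} - 2\mathbf{k}. \tag{6.81}
\]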
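Since the perpendicular distance from a line to a point is l⊥ = |c| sin θ with c = (p − a), it can be computed as the magnitude of c × b̂. The sketch below is a hedged illustration: the function names are hypothetical, and the individual values of a and p are assumptions chosen only so that (p − a) = 3j as in equation (6.80).

```python
import math

def cross(a, b):
    """Vector product in component form."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a parallel to b,
    computed as |(p - a) x b| / |b|."""
    c = tuple(pi - ai for pi, ai in zip(p, a))
    n = cross(c, b)
    return math.sqrt(sum(x * x for x in n)) / math.sqrt(sum(x * x for x in b))

# Assumed points with p - a = (0, 3, 0); direction b = 2i + j - 2k
a = (0, 0, 0)
p = (0, 3, 0)
b = (2, 1, -2)
print(point_line_distance(p, a, b))  # 2*sqrt(2), roughly 2.828
```

The printed value is the magnitude of the vector −2i − 2k found in equation (6.81).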
Let r1 be the line containing a point a1 , and running parallel to b1 , and let r2
be the line containing a point a2 , and running parallel to b2 , then
r1 = a1 + λ1 b1 and r2 = a2 + λ2 b2 , (6.83)
where λ1 and λ2 are scalar parameters (see §6.6.1). In this section we consider
the problem of determining the perpendicular distance d between these two
lines assuming that they are skew (non-parallel), as depicted in figure 6.17.
To do this, first observe that we may define a unit normal n̂ to both lines
\[
\hat{\mathbf{n}} = \frac{\mathbf{b}_1\times\mathbf{b}_2}{|\mathbf{b}_1\times\mathbf{b}_2|}. \tag{6.84}
\]
c = (a1 − a2 ), (6.85)
as shown in figure 6.17. The perpendicular distance d between the two lines is
then the magnitude of the scalar projection of c onto n̂, that is,
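\[
d = |\mathbf{c}\cdot\hat{\mathbf{n}}| = \frac{|(\mathbf{a}_1 - \mathbf{a}_2)\cdot(\mathbf{b}_1\times\mathbf{b}_2)|}{|\mathbf{b}_1\times\mathbf{b}_2|}. \tag{6.86}
\]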
Example 6.15 Determine the minimum distance d between the z-axis, and
the line r1 parallel to b1 = (4i + 3j − k) containing the point a1 = (2i − j).
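Equation (6.86) can be applied directly to the data of Example 6.15. In the sketch below the z-axis is modelled as the line through the origin parallel to k, which is an assumption of this illustration, as are the helper names.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def line_distance(a1, b1, a2, b2):
    """Perpendicular distance between two non-parallel lines r = a + lambda*b,
    using d = |(a1 - a2) . (b1 x b2)| / |b1 x b2|."""
    n = cross(b1, b2)
    norm = math.sqrt(dot(n, n))
    if norm == 0:
        raise ValueError("lines are parallel; this formula does not apply")
    c = tuple(x - y for x, y in zip(a1, a2))
    return abs(dot(c, n)) / norm

# Line r1 through (2, -1, 0) parallel to (4, 3, -1); z-axis through the origin parallel to k
d = line_distance((2, -1, 0), (4, 3, -1), (0, 0, 0), (0, 0, 1))
print(d)  # 2.0
```

Here b1 × b2 = 3i − 4j has magnitude 5 and (a1 − a2) · (b1 × b2) = 10, so the distance is 2.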
Figure 6.17: Skew lines separated by a perpendicular distance (i.e., minimum distance) d.
Figure 6.18 depicts a plane with normal n (unit normal n̂) containing a given
point A with position vector a = (ax , ay , az ), and an arbitrary point R with
position vector r = (x, y, z). Notice that the vector $\overrightarrow{AR}$ = (r − a) must lie
completely within the plane, and is therefore perpendicular to n, i.e.,
(r − a) · n = 0; (6.87)
where we used the fact that |n̂| = 1; in this way, equation (6.88) becomes
Hence, if the components of the unit normal are given by n̂ = (u, v, w), then
r · n̂ = ux + vy + wz, and we may express the plane in Cartesian form as
ux + vy + wz = d. (6.91)
Figure 6.18: Plane with normal n, containing the point A with position vector a.
Now suppose that b and c are also points in the plane, with a, b and c distinct,
then the vectors (b − a) and (c − a) are non-parallel vectors that lie within the
plane. Thus, any point r in the plane may be reached by starting at a, and
moving some distance in the direction (b − a), followed by some distance in
the direction (c − a). In this way we can write the equation for the plane as
where λ and µ are two scalar parameters. Since only λ and µ can be varied,
this equation highlights the fact that a plane has two degrees of freedom.
Now let P be a point adjacent to the plane described in the previous section,
as illustrated in figure 6.18. If we denote the position vector of P as p, then
the displacement from a to p is given by the vector
c = (p − a). (6.93)
Thus, if we denote the angle between c and the unit normal n̂ as θc (see figure
6.18), then the perpendicular distance l⊥ from P to the plane is
Note that if P is on the same side of the plane as the origin O, then θc > π/2,
and this expression for l⊥ will give a negative answer.
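\[
\mathbf{u}\times\mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{vmatrix} = \mathbf{i} - \mathbf{j} + \mathbf{k}, \tag{6.95}
\]
and |u × v| = √(1² + 1² + 1²) = √3. Thus, a unit normal to u and v is
\[
\hat{\mathbf{n}} = \frac{\mathbf{u}\times\mathbf{v}}{|\mathbf{u}\times\mathbf{v}|} = \frac{1}{\sqrt{3}}(\mathbf{i} - \mathbf{j} + \mathbf{k}). \tag{6.96}
\]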
Clearly any plane with unit normal n̂ will be parallel to both u and v; however,
if the plane must also contain the point a = i + k, then according to equation
(6.88) in §6.6.4 we require
r · n̂ = d, where d = a · n̂ (6.97)
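\[
d = \mathbf{a}\cdot\hat{\mathbf{n}} = (\mathbf{i} + \mathbf{k})\cdot\frac{1}{\sqrt{3}}(\mathbf{i} - \mathbf{j} + \mathbf{k}) = \frac{2}{\sqrt{3}}. \tag{6.98}
\]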
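\[
\mathbf{r}\cdot\hat{\mathbf{n}} = (x\mathbf{i} + y\mathbf{j} + z\mathbf{k})\cdot\frac{1}{\sqrt{3}}(\mathbf{i} - \mathbf{j} + \mathbf{k}) = \frac{1}{\sqrt{3}}(x - y + z). \tag{6.99}
\]
\[
x - y + z = 2 \tag{6.100}
\]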
I Solution: (a) Notice that the angle between the normals is the same as the
angle between the planes. Thus, by the scalar product we have
\[
\mathbf{n}_1\cdot\mathbf{n}_2 = 1 = |\mathbf{n}_1||\mathbf{n}_2|\cos\theta = 2\cos\theta, \quad\text{that is,}\quad \theta = \frac{\pi}{3}. \tag{6.101}
\]
(b) Let r = xi + yj + zk be an arbitrary position vector; then by equation
(6.88) of §6.6.4 the equations for the planes P1 and P2 are
r · n1 = a · n1 and r · n2 = a · n2 (6.102)
(c) The line of intersection of the two planes is the set of points that satisfy
both equations (6.103); since these are two equations in three unknowns, we
must form a parametric solution. Let x = λ, then equations (6.103) give
x = λ, y = 2 − λ, z = −λ, (6.104)
b = n2 × n1 = i − j + k. (6.106)
The vector product b × c of two vectors b and c is itself a vector, and can
be combined with a third vector a to form triple products. In this section we
consider the scalar triple product, and the vector triple product.
\[
\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix}, \tag{6.110}
\]
where the determinant can be evaluated using Sarrus’s rule (see figure 6.13).
In summary, therefore, we have the following definition:
The scalar triple product returns a scalar, and may be evaluated using the
determinant
\[
\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix}. \tag{6.112}
\]
The scalar triple product satisfies
a · (b × c) = b · (c × a) = c · (a × b), (6.113)
n = a × b. (6.114)
The base of the parallelepiped is a parallelogram whose sides are the vectors a
and b; thus, according to our discussion in §6.13, the base has area
A = |a × b| = |n|. (6.115)
Now observe that the height h of the parallelepiped is equal to the scalar
projection of c onto the vector n, that is,
(see §6.4.2). Thus, the volume V of the parallelepiped is given by the scalar
triple product of a, b, and c, viz.
Equation (6.117) tells us that a geometric interpretation for the scalar triple
product is the volume of a parallelepiped. Thus, if a · (b × c) = 0, then the
parallelepiped whose edges are a, b, and c has volume V = 0, that is, the
vectors a, b, and c must all lie in a plane. In this way the scalar triple product
may be used to test whether vectors are co-planar:
Result 6.4 If the scalar triple product of three vectors a, b, and c is zero,
that is, if
a · (b × c) = 0, (6.118)
then a, b, and c all lie in the same plane, and are said to be co-planar.
Conversely, if
a · (b × c) ≠ 0, (6.119)
If a set of three vectors are coplanar, then any one vector can be expressed
as a linear combination of the other two. For example, if a, b and c are
coplanar, then we may write a as the linear combination
a = λb + µc, (6.120)
where λ and µ are scalars. In such cases we say that a, b, and c are linearly
dependent. Conversely, if
a · (b × c) ≠ 0, (6.121)
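\[
\mathbf{b}\times\mathbf{c} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 2 & -1 \\ 2 & -1 & 1 \end{vmatrix} = \mathbf{i} - 3\mathbf{j} - 5\mathbf{k}. \tag{6.122}
\]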
a = λb + µc, (6.124)
5 = λ + 2µ, 0 = 2λ − µ, 1 = µ − λ. (6.126)
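The coplanarity test of Result 6.4 and the linear combination in equation (6.124) can both be checked numerically. In the sketch below b and c are read from the determinant in equation (6.122), while a = (5, 0, 1) is inferred from the left-hand sides of equations (6.126); that inference, and the helper names, are assumptions of this illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def scalar_triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    return dot(a, cross(b, c))

b = (1, 2, -1)
c = (2, -1, 1)
a = (5, 0, 1)  # inferred from equations (6.126), not stated explicitly here

print(cross(b, c))             # (1, -3, -5)
print(scalar_triple(a, b, c))  # 0, so a, b, and c are coplanar

# Solving 5 = lam + 2*mu and 0 = 2*lam - mu gives lam = 1, mu = 2
lam, mu = 1, 2
combo = tuple(lam * bi + mu * ci for bi, ci in zip(b, c))
print(combo == a)              # the third equation 1 = mu - lam is also satisfied
```

A zero triple product and a consistent pair (λ, µ) express the same fact: a lies in the plane spanned by b and c.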
For the purpose of completeness we now introduce the vector triple product:
a × (b × c), (6.127)
as required. J
There are many instances in engineering and the physical sciences when it is
necessary to handle large systems of simultaneous linear equations, either by
the nature of the systems themselves (e.g., circuits, and networks), or due to
methods of approximation (e.g., linearised models, and numerical analysis).
The branch of mathematics concerned with the theory of linear equations is
known as linear algebra, and the rectangular arrays of numbers used to rep-
resent such equations are known as matrices. Here we introduce the basic
algebra of matrices, and apply matrix methods to simple problems. Note that
our examples will necessarily involve small systems that can (in principle) be
solved easily using ‘non-matrix’ approaches; however, the point is to illustrate
systematic processes for application to much larger systems where ‘non-matrix’
approaches are impractical. By the end of the section you should be able to. . .
Learning outcomes:
◦ Explain what are meant by the rows and columns of an n × m matrix.
◦ Classify different kinds of square (n × n) matrices, and compute traces.
◦ Write down the transpose AT of a matrix A.
◦ Add and subtract matrices, and multiply a matrix by a scalar.
◦ Perform the matrix multiplication AB of two matrices A and B.
◦ Explain the meaning of the identity (or unit) matrix I.
◦ Compute the determinant of a 2 × 2 matrix.
◦ Use Laplace expansions to find determinants of 3 × 3 and 4 × 4 matrices.
◦ Find the inverse A−1 of a square matrix A by Gaussian elimination.
◦ Compute the inverse of a matrix using the determinant method.
◦ Use matrix methods to solve simple systems of linear equations.
These learning outcomes must be reinforced by completing course exercises.
98 Section 7 : Matrices and Linear Algebra
Figure 7.1 depicts a circuit comprising three loops, three voltage sources, and
a selection of resistors. One way to analyse this circuit is to associate currents
I1 , I2 , and I3 with each of the three loops. In this way it may be shown (see
Example 7.1) that the loop currents satisfy the following system of equations
Notice that these equations are formed from three types of quantity: (i) the
unknowns I1 , I2 , and I3 ; (ii) the coefficients which multiply the unknowns;
and (iii) the known constants 1, 2, and 4. We can separate these types of
quantity from each other using rectangular arrays of numbers by writing
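\[
\begin{bmatrix} 5 & 0 & -2 \\ 0 & 9 & -3 \\ -2 & -3 & 5 \end{bmatrix}
\begin{bmatrix} I_1 \\ I_2 \\ I_3 \end{bmatrix} =
\begin{bmatrix} 1 \\ 3 \\ 4 \end{bmatrix}. \tag{7.2}
\]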
Ax = b. (7.4)
The order of rows and columns in this augmented matrix may then be used to
infer properties of the system of equations (7.1). The first, second, and third
rows of the augmented matrix represent the first (7.1a), second (7.1b), and
third (7.1c) equations respectively. Likewise, the entries in the first, second,
and third columns represent the respective coefficients of the first (I1 ), second
(I2 ), and third (I3 ) unknowns. The entries in the final column are then the
known constants from the right-hand-sides of each equation.
Example 7.1 Show that the loop currents in figure 7.1 satisfy equations (7.1).
where aij is the element in the ith row and jth column. This notation may be
used to easily identify different elements; for example, given the 3 × 4 matrix
2 −1 π 0
A = 35 0 1 1 (7.11)
√ 4
7 2 − 7 10
√
we have a11 = 2, a32 = 2, a21 = 35 , a13 = π, a34 = 10, and so forth.
Example 7.2 Equation (7.9) defines two matrices B and C; (a) write down
the values of the elements b11 , b42 , and b32 ; and (b) c21 , c13 , c22 , and c23 .
I Solution: (a) b11 = π, b42 = −3; (b) c21 = 0, c13 = 32 , c23 = 12 , c23 = 3. J
where aij denotes a general element in the ith row and jth column. In this
course we denote matrices in bold type A, or by writing a general element
in brackets, i.e., A = [aij ]; other notations include double underlining A.
A matrix formed from only one column is called a column vector (or column
matrix). For example, the matrices
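\[
\mathbf{c} = \begin{bmatrix} 11 \\ -9 \end{bmatrix}, \quad
\mathbf{A} = \begin{bmatrix} 1 \\ -2 \\ 3 \\ \pi \\ \sqrt{2} \end{bmatrix}, \quad
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_m \end{bmatrix}, \tag{7.13}
\]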
This notation is useful when row vectors are written within the body of a text.
Note that the trace of a matrix is only defined for square matrices. The leading
diagonal may be used to define several special types of square matrix:
An upper triangular matrix U is one for which all the elements below the
leading diagonal are zero, i.e.,
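\[
\mathbf{U} = \begin{bmatrix}
u_{11} & u_{12} & u_{13} & \dots & u_{1n} \\
0 & u_{22} & u_{23} & \dots & u_{2n} \\
0 & 0 & u_{33} & \dots & u_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \dots & u_{nn}
\end{bmatrix}, \quad\text{with } u_{ij} = 0 \text{ when } i > j. \tag{7.18}
\]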
A lower triangular matrix L is one for which all the elements above the leading
diagonal are zero, that is,
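\[
\mathbf{L} = \begin{bmatrix}
l_{11} & 0 & 0 & \dots & 0 \\
l_{21} & l_{22} & 0 & \dots & 0 \\
l_{31} & l_{32} & l_{33} & \dots & 0 \\
\vdots & \vdots & \vdots & \ddots & 0 \\
l_{n1} & l_{n2} & l_{n3} & \dots & l_{nn}
\end{bmatrix}, \quad\text{with } l_{ij} = 0 \text{ when } i < j. \tag{7.19}
\]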
Symmetric matrix
A matrix S in which the elements are symmetric about the leading diagonal,
i.e., sij = sji , is known as a symmetric matrix. For example, the 3×3 matrix
\[
\mathbf{S} = \begin{bmatrix} 7 & -2 & \pi \\ -2 & 0 & \sqrt{3} \\ \pi & \sqrt{3} & 11 \end{bmatrix} \tag{7.20}
\]
Diagonal matrix
A matrix D is said to be diagonal if all the elements either side of the leading
diagonal are zero, that is,
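\[
\mathbf{D} = \begin{bmatrix}
d_{11} & 0 & 0 & \dots & 0 \\
0 & d_{22} & 0 & \dots & 0 \\
0 & 0 & d_{33} & \dots & 0 \\
\vdots & \vdots & \vdots & \ddots & 0 \\
0 & 0 & 0 & \dots & d_{nn}
\end{bmatrix}, \quad\text{with } d_{ij} = 0 \text{ when } i \neq j. \tag{7.21}
\]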
The identity matrix I (or unit matrix) is a special kind of diagonal matrix for
which all the elements along the leading diagonal are unity (1). For example,
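the 2 × 2 and 3 × 3 identity matrices are
\[
\mathbf{I} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad\text{and}\quad
\mathbf{I} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{7.22}
\]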
The elements of the unit matrix are given by the Kronecker delta δij , where
\[
\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases} \tag{7.23}
\]
Thus, the unit matrix may be written in terms of a general matrix element as
I = [δij ]. (7.24)
The diagonal containing the elements a11 , a22 , . . . , ann is called the leading
diagonal. The trace of a square matrix tr(A) is defined by
Example 7.3 Classify the following matrices, and (if possible) find their traces:
4 4 3 0 0 0 "
2
#
1 1 0 9 3
A = −1 2 , B = −3 0 0 , C= ,
2 1
0 2 3 4
5 −2 0 4 π 2
1
1 2 6 1
2 " √ #
0 0 2 0 1 − 2 2
D=
√ , E= √ , F=
.
0 0 −1 2 − 2 2 3
0 0 0 5 4
We now introduce the most basic algebraic operations that may be performed
on matrices; these operations are analogous to those in conventional arithmetic.
Two matrices A and B are said to be equal if they have the same number of
rows and columns, and if their corresponding elements are equal, that is,
In some cases this property can be used to deduce the elements of a matrix.
I Solution: The elements of B are b11 = 2, b12 = −4, b21 = −1, b22 = 0. J
Two matrices A = [aij ] and B = [bij ] can be added (or subtracted) to form
another matrix C = [cij ] if and only if they have the same number of rows
and columns. Indeed, addition (or subtraction) is then accomplished by simply
adding (or subtracting) the corresponding elements, i.e.,
The zero matrix 0 is a matrix for which all the elements are 0. For instance,
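the 1 × 4, 2 × 2 and 2 × 3 zero matrices are
\[
\begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad\text{and}\quad
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]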
The zero matrix 0 works in an analogous fashion to the way the number 0
works in conventional arithmetic. Hence, for any matrix A we have
These results follow from the fact that aij + 0 = aij and aij − aij = 0.
then we have
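\[
2\mathbf{A} = \begin{bmatrix} 4 & -6 \\ 18 & 0 \\ 24 & 10 \end{bmatrix}, \quad
\tfrac{1}{3}\mathbf{A} = \begin{bmatrix} \tfrac{2}{3} & -1 \\ 3 & 0 \\ 4 & \tfrac{5}{3} \end{bmatrix}, \quad
-\mathbf{A} = \begin{bmatrix} -2 & 3 \\ -9 & -0 \\ -12 & -5 \end{bmatrix}. \tag{7.40}
\]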
Result 7.1 If A, B, and C are matrices of the same size, then the rules
of commutativity and associativity are respectively
A + B = B + A, (7.45)
(A + B) + C = A + (B + C). (7.46)
k1 (A + B) = k1 A + k1 B, (7.47)
(k1 + k2 )A = k1 A + k2 A. (7.48)
(A + B)T = AT + BT . (7.49)
In this section we define the binary operation of multiplying two matrices to-
gether, beginning with row and column matrices.
Example 7.9 Find the inner product of A = [1, −2, 0] and B = [4, −5, −6]T .
Here the column vector B has been defined as the transpose of a row vector;
this is a common notation for writing column vectors in the body of text. J
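The inner product of Example 7.9 is a sum of element-wise products, which can be sketched in a few lines. The function name `inner` is a hypothetical helper introduced for illustration.

```python
def inner(row, col):
    """Inner product of a row vector with a column vector of the same length."""
    if len(row) != len(col):
        raise ValueError("row and column must have the same number of elements")
    return sum(r * c for r, c in zip(row, col))

A = [1, -2, 0]   # row vector
B = [4, -5, -6]  # entries of the column vector [4, -5, -6]^T
print(inner(A, B))  # 1*4 + (-2)*(-5) + 0*(-6) = 14
```

The length check mirrors the requirement that the row and the column contain the same number of elements.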
Example 7.10 Determine all the possible products of the following vectors:
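\[
\mathbf{r} = \begin{bmatrix} 1 & 4 \end{bmatrix}, \quad
\mathbf{s} = \begin{bmatrix} 6 & 7 \end{bmatrix}, \quad
\mathbf{a} = \begin{bmatrix} 8 \\ 3 \end{bmatrix}, \quad
\mathbf{b} = \begin{bmatrix} 0 \\ 9 \end{bmatrix}, \quad
\mathbf{c} = \begin{bmatrix} 5 \\ 2 \end{bmatrix}.
\]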
where we used the rule that the product is defined for ‘rows into columns’. J
Our method for multiplying row and column vectors (the inner product) is
used to define the matrix product as follows:
Thus, the element cij is the inner product of the ith row of A with the
jth column of B (see figure 7.2). The matrix product AB is only defined
if the number of columns of A is equal to the number of rows of B.
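The rule that each element of the product is the inner product of a row of A with a column of B can be sketched as follows. This is an illustrative implementation using nested lists, not a recommended numerical routine; the function name `matmul` and the sample matrices are assumptions.

```python
def matmul(A, B):
    """Matrix product C = AB, where c_ik is the inner product of row i of A
    with column k of B. Matrices are lists of rows."""
    n_cols_a = len(A[0])
    if n_cols_a != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [
        [sum(A[i][j] * B[j][k] for j in range(n_cols_a)) for k in range(len(B[0]))]
        for i in range(len(A))
    ]

# Hypothetical matrices: A is (2 x 2) and B is (2 x 3), so AB is (2 x 3)
A = [[1, 2], [3, 4]]
B = [[0, 1, 2], [1, 0, 3]]
print(matmul(A, B))  # [[2, 1, 8], [4, 3, 18]]
```

Calling `matmul(B, A)` with these matrices raises an error, because B has three columns while A has only two rows.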
where the final line follows from the results in equations (7.54). The product
BA is not defined because the number of columns of B is not equal to the
number of rows of A, i.e., B is a (2 × 3) matrix, and A is a (2 × 2) matrix. J
Non-commutativity
AB ≠ BA. (7.59)
This property is typically the case even when AB and BA both exist.
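Example 7.13 Let A and B be the matrices defined by
\[
A = \begin{bmatrix} 2 & 0 & 1 \\ -1 & 1 & 3 \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} 0 & 1 \\ -3 & 1 \\ 2 & 0 \end{bmatrix}. \tag{7.60}
\]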
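As a numerical illustration of non-commutativity, the sketch below multiplies the matrices of equation (7.60) in both orders. The helper `mat_mul` is a hypothetical name, and matrices are assumed to be stored as lists of rows.

```python
def mat_mul(A, B):
    """Matrix product AB ('rows into columns') for lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2, 0, 1], [-1, 1, 3]]        # the 2 x 3 matrix of equation (7.60)
B = [[0, 1], [-3, 1], [2, 0]]      # the 3 x 2 matrix of equation (7.60)

AB = mat_mul(A, B)                 # a 2 x 2 matrix
BA = mat_mul(B, A)                 # a 3 x 3 matrix
print(AB)                          # [[2, 2], [3, 0]]
print(BA)                          # [[-1, 1, 3], [-7, 1, 0], [4, 0, 2]]
```

Both products exist, but they do not even have the same size, so they cannot be equal.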
Distributivity
A(B + C) = AB + AC (7.63)
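Evaluating AB + AC we have
\[
\begin{aligned}
AB + AC &= \begin{bmatrix} 2 & 0 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 \\ 2 & 1 \end{bmatrix}
 + \begin{bmatrix} 2 & 0 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 2 & -2 \\ 3 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 2 \\ -1 & 2 \end{bmatrix}
 = \begin{bmatrix} 2 & 0 \\ 2 & 2 \end{bmatrix}.
\end{aligned} \tag{7.66}
\]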
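The arithmetic in equation (7.66) can be checked numerically. The following sketch uses the matrices A, B and C appearing in that equation; the helper names `mat_mul` and `mat_add` are assumptions made for illustration.

```python
def mat_mul(A, B):
    """Matrix product AB for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def mat_add(A, B):
    """Element-wise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[2, 0], [1, 1]]
B = [[1, -1], [2, 1]]
C = [[0, 1], [-1, 1]]

lhs = mat_mul(A, mat_add(B, C))                 # A(B + C)
rhs = mat_add(mat_mul(A, B), mat_mul(A, C))     # AB + AC
print(lhs, rhs)                                 # both [[2, 0], [2, 2]]
```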
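▶ Solution: Using the general element notation we have by Definition 7.5 that
\[
\begin{aligned}
A(B + C) &= \left[ \sum_{j=1}^{n} a_{ij}(b_{jk} + c_{jk}) \right]
 = \left[ \left( \sum_{j=1}^{n} a_{ij} b_{jk} \right) + \left( \sum_{j=1}^{n} a_{ij} c_{jk} \right) \right] \\
&= \left[ \sum_{j=1}^{n} a_{ij} b_{jk} \right] + \left[ \sum_{j=1}^{n} a_{ij} c_{jk} \right] = AB + AC
\end{aligned} \tag{7.67}
\]
as required. ◀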
Associativity
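▶ Solution: Using the general element notation we have by Definition 7.5 that
\[
\begin{aligned}
A(BC) &= \left[ \sum_{j=1}^{n} a_{ij} \left( \sum_{k=1}^{p} b_{jk} c_{kl} \right) \right]
 = \left[ \sum_{j=1}^{n} \sum_{k=1}^{p} a_{ij} b_{jk} c_{kl} \right] \\
&= \left[ \sum_{k=1}^{p} \sum_{j=1}^{n} a_{ij} b_{jk} c_{kl} \right]
 = \left[ \sum_{k=1}^{p} \left( \sum_{j=1}^{n} a_{ij} b_{jk} \right) c_{kl} \right] = (AB)C
\end{aligned} \tag{7.72}
\]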
Identity
IA = AI = A; (7.73)
as required. ◀
Zero matrix
It is clear from Definition 7.5 that the product of a matrix A with an
appropriately sized zero matrix 0 will yield another zero matrix, i.e.,
0A = 0, and A0 = 0. (7.77)
Transpose
Powers
A square matrix A may be multiplied by itself, and for this reason it is useful
to define the ‘square’ and ‘cube’ of a matrix, etc.; thus, if k is a positive
integer, then
\[
A^{k} = \underbrace{AA \cdots A}_{k \text{ factors}}. \tag{7.81}
\]
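Definition (7.81) translates directly into repeated multiplication. The sketch below is one possible implementation, assuming a positive integer k and matrices stored as lists of rows; `mat_pow` and the test matrix are hypothetical.

```python
def mat_mul(A, B):
    """Matrix product AB for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def mat_pow(A, k):
    """Return A multiplied by itself so that there are k factors (k >= 1)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    result = A
    for _ in range(k - 1):      # k factors need k - 1 multiplications
        result = mat_mul(result, A)
    return result

A = [[1, 1], [0, 1]]            # illustrative square matrix
print(mat_pow(A, 2))            # [[1, 2], [0, 1]]
print(mat_pow(A, 3))            # [[1, 3], [0, 1]]
```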
Summary of properties
the linear system (7.89) may be written using the matrix product as
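\[
\begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \dots & a_{nn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}, \tag{7.91}
\]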
that is,
Ax = b. (7.92)
This discussion verifies our earlier claim that systems of linear equations can
be represented by matrix equations (see §7.1).
7.5 Determinants
where the coefficients a11 , a12 , a21 , a22 , and b1 , b2 are known quantities.
Ax = b, (7.94)
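with
\[
A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \quad
x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad \text{and} \quad
b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}. \tag{7.95}
\]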
Thus, a unique solution for x1 and x2 exists only when (a11 a22 − a21 a12) ≠ 0.
Because it determines whether equations (7.93a) and (7.93b) have a unique
solution, the quantity (a11 a22 − a21 a12) is called the determinant of the matrix
A, and is denoted as either det(A), or ∆, or |A|, i.e.,
\[
\det(A) = \Delta = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = (a_{11} a_{22} - a_{21} a_{12}). \tag{7.97}
\]
\[
\det(A) \equiv \Delta \equiv \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = (a_{11} a_{22} - a_{21} a_{12}). \tag{7.99}
\]
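\[
\det(A) = \begin{vmatrix} -9 & 3 \\ 3 & -1 \end{vmatrix} = (-9 \times -1) - (3 \times 3) = 0. \tag{7.101}
\]
and
\[
\det(B) = \begin{vmatrix} 4 & 1 \\ 2 & 3 \end{vmatrix} = (4 \times 3) - (2 \times 1) = 10. \tag{7.102}
\]
respectively. ◀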
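The 2 × 2 determinant formula is simple enough to check by machine. The sketch below reproduces the values in equations (7.101) and (7.102); the function name `det2` is an assumption made for illustration.

```python
def det2(A):
    """Determinant a11*a22 - a21*a12 of a 2 x 2 matrix given as a list of rows."""
    (a11, a12), (a21, a22) = A
    return a11 * a22 - a21 * a12

det_A = det2([[-9, 3], [3, -1]])    # matrix of equation (7.101)
det_B = det2([[4, 1], [2, 3]])      # matrix of equation (7.102)
print(det_A, det_B)                 # 0 10
```

A zero determinant, as for the first matrix, signals that the corresponding pair of equations has no unique solution.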
To this end we must first define the minors and cofactors of a matrix.
Definition 7.7 The minor Mij associated with each element aij of an
n × n matrix A = [aij ] is the determinant of the (n − 1) × (n − 1) matrix
obtained by striking-out (removing) the ith row and jth column of A. The
matrix of minors of A is the matrix M defined by M = [Mij ].
Likewise, the minor M12 is found by striking-out the first row, and second
column, and evaluating the determinant of the resulting 2 × 2 matrix, viz.
The minor M13 is found by striking-out the first row, and third column, viz.
Proceeding in this fashion, the M21 , M22 , and M23 minors are
\[
M_{21} = \begin{vmatrix} 1 & 2 \\ -1 & -1 \end{vmatrix} = (1 \times -1) - (-1 \times 2) = 1, \tag{7.108}
\]
\[
M_{22} = \begin{vmatrix} 5 & 2 \\ -2 & -1 \end{vmatrix} = (5 \times -1) - (-2 \times 2) = -1, \tag{7.109}
\]
\[
M_{23} = \begin{vmatrix} 5 & 1 \\ -2 & -1 \end{vmatrix} = (5 \times -1) - (-2 \times 1) = -3. \tag{7.110}
\]
\[
M_{31} = \begin{vmatrix} 1 & 2 \\ 1 & 2 \end{vmatrix} = (1 \times 2) - (1 \times 2) = 0, \tag{7.111}
\]
\[
M_{32} = \begin{vmatrix} 5 & 2 \\ 4 & 2 \end{vmatrix} = (5 \times 2) - (4 \times 2) = 2, \tag{7.112}
\]
\[
M_{33} = \begin{vmatrix} 5 & 1 \\ 4 & 1 \end{vmatrix} = (5 \times 1) - (4 \times 1) = 1. \tag{7.113}
\]
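The 'striking-out' procedure of Definition 7.7 can be sketched as follows. The 3 × 3 matrix used here is the one inferred from the minors evaluated above, and the names `det2` and `minor` are hypothetical; indices are 1-based to match the notation of the notes.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return M[0][0] * M[1][1] - M[1][0] * M[0][1]

def minor(A, i, j):
    """Minor M_ij of a 3 x 3 matrix A (1-based i, j)."""
    # Strike out row i and column j, then take the 2 x 2 determinant
    sub = [[A[r][c] for c in range(3) if c != j - 1]
           for r in range(3) if r != i - 1]
    return det2(sub)

A = [[5, 1, 2], [4, 1, 2], [-2, -1, -1]]
M = [[minor(A, i, j) for j in (1, 2, 3)] for i in (1, 2, 3)]
print(M)    # [[1, 0, -2], [1, -1, -3], [0, 2, 1]]
```

The second and third rows of the printed matrix of minors agree with equations (7.108) to (7.113).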
Definition 7.8 The cofactor Cij associated with each element aij of an
n × n matrix A = [aij ] is a scalar quantity defined by
\[
C_{ij} = (-1)^{(i+j)} M_{ij},
\]
where Mij is the corresponding minor; thus, Cij is a minor with a sign
attached. Notice that the sign (−1)^(i+j) is positive if (i + j) is even, and
negative if (i + j) is odd, e.g., for a 3 × 3 matrix the signs follow the pattern
\[
\begin{matrix} + & - & + \\ - & + & - \\ + & - & + \end{matrix} \tag{7.116}
\]
Example 7.22 Write down expressions for the minors Mij and cofactors Cij
of the arbitrary 3 × 3 matrix
\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}. \tag{7.117}
\]
By Definition 7.8, the cofactors are Cij = (−1)^(i+j) Mij , that is,
Notice here that the signs attached to the minors Mij follow the checker-board
pattern depicted in expression (7.116). ◀
Example 7.23 Find all the cofactors Cij of the matrix A from Example 7.21.
\[
\det(A) = \Delta = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = (a_{11} a_{22} - a_{21} a_{12}). \tag{7.119}
\]
Note that this definition is unambiguous, i.e., it does not matter which
row (or column) is chosen for the expansion when evaluating det(A).
where C11 , C12 , and C13 are the cofactors associated with the elements along
the top row. Indeed, since
Note: One can, of course, expand across any row or column, and it is often
easier to use the row or column that contains the most zeros (cf. Example 7.26).
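The claim that any row may be used for the expansion can be tested numerically. The recursive routine below is a minimal sketch of a Laplace expansion across a chosen row; the names `strike_out` and `det` are assumptions, rows are indexed from 0 in the code, and the test matrix is the 3 × 3 matrix whose minors were evaluated earlier.

```python
def strike_out(A, i, j):
    """Remove row i and column j (0-based) from a square matrix A."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A, row=0):
    """Determinant by Laplace expansion across the given row (0-based)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # (-1)**(row + j) has the same parity as the 1-based sign (-1)**(i + j)
    return sum((-1) ** (row + j) * A[row][j] * det(strike_out(A, row, j))
               for j in range(n))

A = [[5, 1, 2], [4, 1, 2], [-2, -1, -1]]
values = [det(A, row=r) for r in range(3)]
print(values)    # [1, 1, 1]
```

Each choice of row gives the same determinant, in line with the remark that the definition is unambiguous.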
by: (a) expanding across row i = 1; and (b) expanding across column j = 3.
▶ Solution: Column j = 4 has only one non-zero element, making it the easiest
choice for the Laplace expansion. In particular, by equation (7.121) we have
where C14 , C24 , C34 , and C44 are the cofactors associated with the elements
of column j = 4.
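\[
C_{34} = (-1)^{(3+4)} M_{34} = -M_{34} = -\begin{vmatrix} 5 & 1 & 2 \\ 4 & 1 & 2 \\ -2 & -1 & -1 \end{vmatrix}, \tag{7.133}
\]
\[
\det(B) = 3 \times C_{34} = 3 \times -\begin{vmatrix} 5 & 1 & 2 \\ 4 & 1 & 2 \\ -2 & -1 & -1 \end{vmatrix} = -3, \tag{7.134}
\]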
1. Swapping any two rows (or columns) changes the sign of det(A).
2. If any two rows (or two columns) are equal, then det(A) = 0.
1 Swapping any two rows (or columns) changes the sign of the determinant
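Interchanging any two rows (or any two columns) results in the determinant
being multiplied by (−1). As an example consider
\[
\begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = (1 \times 4) - (3 \times 2) = -2. \tag{7.135}
\]
\[
\begin{vmatrix} 3 & 4 \\ 1 & 2 \end{vmatrix} = (3 \times 2) - (1 \times 4) = 2 = -\begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix}. \tag{7.136}
\]
\[
\begin{vmatrix} 2 & 1 \\ 4 & 3 \end{vmatrix} = (2 \times 3) - (4 \times 1) = 2 = -\begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix}. \tag{7.137}
\]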
2 If any two rows (or two columns) are equal, then the determinant is zero
This rule is a corollary to rule 1 above, since swapping two identical rows
or columns must: (i) leave the value of the determinant unchanged; and (ii)
multiply the determinant by (−1). These two qualities can only both be true if
the value of the determinant is zero. For example, with two identical columns
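\[
\begin{aligned}
\begin{vmatrix} 1 & 2 & 2 \\ 3 & 4 & 4 \\ 5 & 6 & 6 \end{vmatrix}
&= 1 \begin{vmatrix} 4 & 4 \\ 6 & 6 \end{vmatrix} - 2 \begin{vmatrix} 3 & 4 \\ 5 & 6 \end{vmatrix} + 2 \begin{vmatrix} 3 & 4 \\ 5 & 6 \end{vmatrix} \\
&= (1 \times 0) - (2 \times -2) + (2 \times -2) = 0.
\end{aligned} \tag{7.138}
\]
\[
\begin{aligned}
\begin{vmatrix} 9 & 8 & 7 \\ 9 & 8 & 7 \\ 6 & 5 & 4 \end{vmatrix}
&= 9 \begin{vmatrix} 8 & 7 \\ 5 & 4 \end{vmatrix} - 8 \begin{vmatrix} 9 & 7 \\ 6 & 4 \end{vmatrix} + 7 \begin{vmatrix} 9 & 8 \\ 6 & 5 \end{vmatrix} \\
&= (9 \times -3) - (8 \times -6) + (7 \times -3) = 0.
\end{aligned} \tag{7.139}
\]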
\[
\begin{vmatrix} 1 & 3 \\ 2 & 4 \end{vmatrix} = \begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = (1 \times 4) - (3 \times 2) = -2. \tag{7.141}
\]
It follows from the idea of a Laplace expansion (Definition 7.9) that if all the
elements of a row (or column) are multiplied by a scalar k, then the value of
the determinant will change by a factor of k. For example,
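\[
\begin{vmatrix} 1 & 2 & 3 \\ 6 & 12 & 18 \\ 9 & 10 & 11 \end{vmatrix}
= 3 \begin{vmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 9 & 10 & 11 \end{vmatrix}, \tag{7.142}
\]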
where we have taken a common factor of k = 3 out of the second row (to see
this, consider the result of performing the Laplace expansion across row i = 2).
This property may be combined with property 2 above to deduce that if any one
row (or column) is a multiple of another row (or column), then the determinant
is zero. For instance, in the following determinant we observe that
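\[
\underbrace{\begin{vmatrix} 1 & 5 & 1 \\ 3 & 10 & 2 \\ 5 & 15 & 3 \end{vmatrix}}_{\text{second column} \,=\, 5 \times \text{third column}}
= 5 \times \underbrace{\begin{vmatrix} 1 & 1 & 1 \\ 3 & 2 & 2 \\ 5 & 3 & 3 \end{vmatrix}}_{\text{second column} \,=\, \text{third column}} = 0, \tag{7.143}
\]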
Adding (or subtracting) a multiple of one row (or column) to another row
(or column) does not change the determinant. For example, let
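\[
\begin{aligned}
\Delta = \begin{vmatrix} 1 & 2 & -6 \\ 3 & 3 & 2 \\ 2 & -1 & 3 \end{vmatrix}
&= 1 \begin{vmatrix} 3 & 2 \\ -1 & 3 \end{vmatrix} - 2 \begin{vmatrix} 3 & 2 \\ 2 & 3 \end{vmatrix} - 6 \begin{vmatrix} 3 & 3 \\ 2 & -1 \end{vmatrix} \\
&= (1 \times 11) - (2 \times 5) + (-6 \times -9) = 55,
\end{aligned} \tag{7.144}
\]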
where we performed the Laplace expansion across the top row. Adding twice
the third row to the first row leaves ∆ unchanged, viz.
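\[
\begin{aligned}
\begin{vmatrix} 1 + (2 \times 2) & 2 + (2 \times -1) & -6 + (2 \times 3) \\ 3 & 3 & 2 \\ 2 & -1 & 3 \end{vmatrix}
&= \begin{vmatrix} 5 & 0 & 0 \\ 3 & 3 & 2 \\ 2 & -1 & 3 \end{vmatrix}
 = 5 \begin{vmatrix} 3 & 2 \\ -1 & 3 \end{vmatrix} \\
&= 5 \times 11 = \Delta.
\end{aligned} \tag{7.145}
\]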
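The row properties discussed above can be checked on the determinant ∆ of equation (7.144). The sketch below is an illustration only: the recursive `det` routine (a Laplace expansion across the top row) and the variable names are assumptions, not part of the notes.

```python
def det(A):
    """Determinant by Laplace expansion across the top row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

A = [[1, 2, -6], [3, 3, 2], [2, -1, 3]]           # matrix of equation (7.144)

swapped = [A[1], A[0], A[2]]                       # R1 <-> R2
scaled = [A[0], [3 * x for x in A[1]], A[2]]       # 3 R2 -> R2
added = [[a + 2 * c for a, c in zip(A[0], A[2])], A[1], A[2]]   # R1 + 2 R3 -> R1
repeated = [A[0], A[0], A[2]]                      # two equal rows

print(det(A), det(swapped), det(scaled), det(added), det(repeated))
# 55 -55 165 55 0
```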
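\[
\det(A) = \begin{vmatrix} 0 & 2 & 6 & 4 \\ 2 & 1 & 3 & 2 \\ 0 & 2 & 6 & 5 \\ 5 & 1 & 3 & 9 \end{vmatrix}
= 2 \times -\begin{vmatrix} 2 & 6 & 4 \\ 2 & 6 & 5 \\ 1 & 3 & 9 \end{vmatrix}
+ 5 \times -\begin{vmatrix} 2 & 6 & 4 \\ 1 & 3 & 2 \\ 2 & 6 & 5 \end{vmatrix}. \tag{7.147}
\]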
By properties 4 and 2 of Result 7.4 we have that
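\[
\underbrace{\begin{vmatrix} 2 & 6 & 4 \\ 2 & 6 & 5 \\ 1 & 3 & 9 \end{vmatrix}}_{\text{common factor of 3 in second column}}
= 3 \times \underbrace{\begin{vmatrix} 2 & 2 & 4 \\ 2 & 2 & 5 \\ 1 & 1 & 9 \end{vmatrix}}_{\text{first column} \,=\, \text{second column}} = 0. \tag{7.148}
\]
\[
\underbrace{\begin{vmatrix} 2 & 6 & 4 \\ 1 & 3 & 2 \\ 2 & 6 & 5 \end{vmatrix}}_{\text{common factor of 2 in first row}}
= 2 \times \underbrace{\begin{vmatrix} 1 & 3 & 2 \\ 1 & 3 & 2 \\ 2 & 6 & 5 \end{vmatrix}}_{\text{first row} \,=\, \text{second row}} = 0. \tag{7.149}
\]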
Hence, putting equations (7.148) and (7.149) into equation (7.147) we have
\[
\det(A) = \begin{vmatrix} 0 & 2 & 6 & 4 \\ 2 & 1 & 3 & 2 \\ 0 & 2 & 6 & 5 \\ 5 & 1 & 3 & 9 \end{vmatrix} = 0 \tag{7.150}
\]
as required. ◀
Division is not defined in matrix arithmetic; however, for some (not all) square
matrices A it is possible to find an inverse matrix denoted A⁻¹ such that
where I is the identity matrix. It turns out that a necessary and sufficient
condition for a square matrix A to have an inverse A⁻¹ is that det(A) ≠ 0.
Example 7.29 Prove that the inverse A⁻¹ of an invertible matrix A is unique.
AB = BA = I and AC = CA = I. (7.159)
Note: If det (A) = 0, then A is singular, and does not have an inverse.
\[
\det(A) = \begin{vmatrix} 4 & 6 \\ 1 & 2 \end{vmatrix} = (4 \times 2) - (1 \times 6) = 2. \tag{7.164}
\]
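▶ Solution: Evaluating A⁻¹A, we have with det(A) ≡ (a11 a22 − a21 a12) that
\[
\begin{aligned}
A^{-1}A &= \frac{1}{\det(A)} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \\
&= \frac{1}{\det(A)} \begin{bmatrix} (a_{11} a_{22} - a_{21} a_{12}) & (a_{22} a_{12} - a_{12} a_{22}) \\ (a_{11} a_{21} - a_{21} a_{11}) & (a_{11} a_{22} - a_{21} a_{12}) \end{bmatrix} \\
&= \frac{1}{\det(A)} \begin{bmatrix} \det(A) & 0 \\ 0 & \det(A) \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \\
&= I
\end{aligned} \tag{7.168}
\]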
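The 2 × 2 inverse formula used in this calculation can be sketched in Python with exact rational arithmetic. The function name `inverse_2x2` is hypothetical, and the test matrix is the one whose determinant is evaluated in equation (7.164).

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix via (1/det(A)) * [[a22, -a12], [-a21, a11]]."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a21 * a12
    if det == 0:
        raise ValueError("det(A) = 0: the matrix is singular")
    return [[Fraction(a22, det), Fraction(-a12, det)],
            [Fraction(-a21, det), Fraction(a11, det)]]

A = [[4, 6], [1, 2]]            # det(A) = 2
A_inv = inverse_2x2(A)          # entries 1, -3, -1/2, 2

# Check that the product of A_inv with A is the identity matrix
product = [[sum(A_inv[i][k] * A[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
print(product == [[1, 0], [0, 1]])    # True
```

A singular input such as [[3, 6], [2, 4]] raises an error instead of returning a matrix.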
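\[
\det(B) = \begin{vmatrix} 1 & -3 \\ -\tfrac{1}{2} & 2 \end{vmatrix} = \tfrac{1}{2}, \quad \text{and} \quad
\det(C) = \begin{vmatrix} 3 & 6 \\ 2 & 4 \end{vmatrix} = 0. \tag{7.170}
\]
This expression for B⁻¹ is consistent with our findings in Exercise 7.28. ◀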
In this section we illustrate the determinant method for finding the inverse
of an n × n matrix; this method is computationally intensive, but important for
theoretical reasons. We describe an alternative approach called the Gaussian
elimination method in §7.6.4; this method is more practical for large matrices
in particular. The determinant method is given without proof as follows:
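has an inverse A⁻¹. If the inverse A⁻¹ exists, then find it.
\[
\det(A) = \begin{vmatrix} 5 & 1 & 2 \\ 4 & 1 & 2 \\ -2 & -1 & -1 \end{vmatrix} = 1; \tag{7.173}
\]
has an inverse A⁻¹. If the inverse A⁻¹ exists, then find it.
\[
\det(A) = \begin{vmatrix} 0 & 2 & 4 \\ 3 & 0 & 1 \\ 0 & 0 & 1 \end{vmatrix} = -6; \tag{7.179}
\]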
It is apparent from the examples above that the process of finding adj(A)
makes the determinant method for inverting a matrix A computationally intensive,
especially for large matrices. In the next section we introduce a more efficient
approach called the Gaussian elimination method (§7.6.4); this method
relies on a set of manipulations known as elementary row operations:
Definition 7.11 Let Ri denote the ith row of a matrix, and let k be a
scalar, then the three kinds of elementary row operations are:
(where the notation R1 ↔ R2 means ‘insert R1 into the row previously occupied
(where the notation R3 − 2R1 → R3 means ‘insert R3 − 2R1 into the row
previously occupied by R3 ’). Finally, dividing row R3 by −4, we obtain
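\[
\begin{bmatrix} 1 & -2 & 3 \\ 0 & 1 & -2 \\ 0 & 0 & -4 \end{bmatrix}
\xrightarrow{\; -\frac{1}{4} R_3 \to R_3 \;}
\begin{bmatrix} 1 & -2 & 3 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix} \tag{7.188}
\]
The upper triangular matrix has now been converted to a 3 × 3 identity matrix
I; we have thus completed the process of transforming A into I as required. ◀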
The Gaussian elimination method for determining the inverse A⁻¹ of a matrix
A relies on the following theorem, which we state without proof.
This method for inverting matrices is called the Gaussian elimination method.
Verify that the inverse of A exists, then find it using Gaussian elimination.
To obtain the inverse matrix A⁻¹ using the Gaussian elimination method we
look for a sequence of elementary row operations that will transform the
left-hand-side (A) into the identity matrix I; this same sequence will transform
the right-hand-side (I) into the inverse matrix A⁻¹ (see Theorem 7.1).
We proceed in two stages using the sequence we found in Example 7.35: first
(Stage 1), we transform A into an upper triangular matrix; second (Stage 2),
we transform the upper triangular matrix into I.
The left-hand-side of the augmented matrix has now been transformed into
upper triangular form.
◦ Stage 2: Adding 2R3 to R2 , and subtracting 3R3 from R1 we have
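\[
\begin{matrix} R_1 - 3R_3 \to R_1 \\ R_2 + 2R_3 \to R_2 \\ \phantom{R_3} \end{matrix}
\quad
\left[ \begin{array}{ccc|ccc}
1 & -2 & 0 & 0 & -\tfrac{1}{2} & \tfrac{3}{4} \\
0 & 1 & 0 & 1 & 1 & -\tfrac{1}{2} \\
0 & 0 & 1 & 0 & \tfrac{1}{2} & -\tfrac{1}{4}
\end{array} \right]. \tag{7.197}
\]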
has an inverse A⁻¹. If A⁻¹ exists, then find it using Gaussian elimination.
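Multiplying row R1 by 1/3 and row R2 by 1/2 we have
\[
\begin{matrix} \tfrac{1}{3} R_1 \to R_1 \\ \tfrac{1}{2} R_2 \to R_2 \\ \phantom{R_3} \end{matrix}
\quad
\left[ \begin{array}{ccc|ccc}
1 & 0 & 0 & 0 & \tfrac{1}{3} & -\tfrac{1}{3} \\
0 & 1 & 0 & \tfrac{1}{2} & 0 & -2 \\
0 & 0 & 1 & 0 & 0 & 1
\end{array} \right]. \tag{7.205}
\]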
Examples 7.36 and 7.37 use the Gaussian elimination method in two stages,
where the first stage involves transformation to upper-triangular form. In prac-
tice any suitable sequence is acceptable, as the following example shows.
Example 7.38 Determine whether-or-not the matrix
\[
A = \begin{bmatrix} 5 & 1 & 2 \\ 4 & 1 & 2 \\ -2 & -1 & -1 \end{bmatrix} \tag{7.207}
\]
has an inverse A⁻¹. If A⁻¹ exists, then find it using Gaussian elimination.
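The idea of row-reducing an augmented matrix [A | I] until the left-hand side becomes I can be sketched programmatically. The routine below is an illustration under stated assumptions: the name `inverse_gauss_jordan` is hypothetical, exact fractions are used to avoid rounding, and the sequence of row operations it chooses need not match the one used in the worked solution.

```python
from fractions import Fraction

def inverse_gauss_jordan(A):
    """Invert a square matrix by row-reducing the augmented matrix [A | I]."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular: no inverse exists")
        aug[col], aug[pivot] = aug[pivot], aug[col]      # swap two rows
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]             # scale the pivot row
        for r in range(n):
            if r != col:                                 # clear the rest of the column
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]                      # right-hand block is the inverse

A = [[5, 1, 2], [4, 1, 2], [-2, -1, -1]]    # matrix of equation (7.207)
A_inv = inverse_gauss_jordan(A)
print([[int(x) for x in row] for row in A_inv])
# [[1, -1, 0], [0, -1, -2], [-2, 3, 1]]
```

Since det(A) = 1 for this matrix, every entry of the inverse is an integer.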
If A and B are both non-singular matrices of the same size, then it may be
shown that
A⁻¹B⁻¹ = (BA)⁻¹. (7.214)
Ax = b (7.215)
where the coefficient A and the constant b are known. If A ≠ 0, then the solution
x to this equation is given by multiplying through by the inverse A⁻¹, i.e.,
Ax = b, (7.217)
where I is the identity matrix. This procedure for finding the solution to a
system of linear equations is known as the inverse matrix method.
Ax = b, (7.219)
det(A) ≠ 0, (7.220)
x = A−1 b. (7.221)
If det(A) = 0, then the system (7.219) does not have a unique solution x.
In the following subsections we consider two methods for solving linear equa-
tions: the inverse matrix method described above, and the method of Gaussian
elimination. Since both these methods are based on expressing the equations
using the matrix of coefficients A, evaluating the determinant det(A)
provides a simple preliminary check for whether a unique solution exists.
4x + 6y = −2, (7.222a)
x + 2y = 3. (7.222b)
Without solving the equations explicitly, demonstrate that they have a unique
solution. Find this solution using the inverse matrix method.
Ax = b, (7.223)
x = A−1 b. (7.225)
i.e., the equations are only both satisfied when x = −11 and y = 7. ◀
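The inverse matrix method for equations (7.222a) and (7.222b) can be reproduced in a few lines. This is a sketch only: `solve_2x2` is a hypothetical name, and exact fractions are used so that the result can be compared directly with the values above.

```python
from fractions import Fraction

def solve_2x2(A, b):
    """Solve Ax = b for a 2 x 2 system using x = A^(-1) b."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a21 * a12
    if det == 0:
        raise ValueError("det(A) = 0: no unique solution")
    inv = [[Fraction(a22, det), Fraction(-a12, det)],
           [Fraction(-a21, det), Fraction(a11, det)]]
    return [inv[0][0] * b[0] + inv[0][1] * b[1],
            inv[1][0] * b[0] + inv[1][1] * b[1]]

# 4x + 6y = -2 and x + 2y = 3
x, y = solve_2x2([[4, 6], [1, 2]], [-2, 3])
print(x, y)    # -11 7
```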
A linear equation in two unknowns can be thought of as an equation for a line,
e.g., the equation 2x − y = 1 is the equation for the line y = 2x − 1. In this
way, two linear equations in two unknowns can be interpreted geometrically
as a problem involving two lines. Types of solution can be classified based on
whether the equations are consistent (i.e., have a solution), and whether the
equations are independent (i.e., the equations are not multiples of each other):
Figure 7.3: Geometric representations of linear equations in two unknowns: (a) independent
equations have a unique solution, e.g., two lines intersecting at a unique point; (b)
dependent equations have infinitely many solutions, e.g., two identical lines coinciding along
their length; (c) inconsistent equations have no solution, e.g., parallel lines never intersect.
5x + y + 2z = 2 (7.228a)
4x + y + 2z = −1 (7.228b)
−2x − y − z = 3 (7.228c)
Without solving the equations explicitly, demonstrate that they have a unique
solution. Find this solution using the inverse matrix method.
Ax = b (7.229)
Linear equations are said to be consistent if they have at least one solution;
otherwise they are said to be inconsistent. Furthermore, if any one equation is
a linear combination of the other two, then the equations are said to be linearly
dependent, otherwise they are said to be independent. These different
possibilities for types of solution lead to the following classifications:
These types of solution are represented geometrically in figures 7.4 and 7.5.
Example 7.41 Figure 7.6 depicts a circuit comprising three current loops I1 ,
I2 , and I3 , and input voltage sources v1 = 4, v2 = 3, and v3 = 4. It may be
shown (cf. Exercise 7.1) that the currents satisfy the equations
Figure 7.6: Three current loops I1 , I2 , and I3 , and three input voltages v1 , v2 , and v3 .
Ax = b (7.234)
The determinant of A is
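\[
\det(A) = \begin{vmatrix} 5 & 0 & -2 \\ 0 & 9 & -3 \\ -2 & -3 & 5 \end{vmatrix}
= 5 \begin{vmatrix} 9 & -3 \\ -3 & 5 \end{vmatrix} - 2 \begin{vmatrix} 0 & 9 \\ -2 & -3 \end{vmatrix} = 144. \tag{7.236}
\]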
that is, I1 = 7/8, I2 = 43/48, and I3 = 27/16.
that is, I1 = 9/8, I2 = 53/48, and I3 = 37/16. ◀
Note: If det(A) = 0, then the equations will not have a unique solution, and
Step 3 will fail. In these cases Step 2 can give rise to two possibilities:
2x + 4y + 2z = 10, (7.240a)
x + 5y + 3z = 4, (7.240b)
−2x − y − 2z = −7. (7.240c)
x + 2y + z = 5,
3y + 2z = −1, (7.243)
− 2z = 4.
Here the final equation yields z = −4/2 = −2, the penultimate equation yields
y = (−1 − 2z)/3 = 1, and the first equation yields x = (5 − 2y − z) = 5.
Hence, system (7.240) has the unique solution (x, y, z) = (5, 1, −2).
Note: Gaussian elimination works because the elementary row operations have
the effect of adding or subtracting equations from each other, just as one would
do when solving simultaneous equations by elimination. Gaussian elimination
makes this process systematic, reducing the risk of computational error. ◀
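The two stages of the method (reduction to upper triangular form, then back-substitution) can be sketched as follows for system (7.240). The function name `solve_gaussian` is an assumption, exact fractions are used, and the row operations chosen by the code need not coincide with those in the worked solution.

```python
from fractions import Fraction

def solve_gaussian(A, b):
    """Solve Ax = b by forward elimination followed by back-substitution."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    # Stage 1: reduce the augmented matrix [A | b] to upper triangular form
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("det(A) = 0: no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # Stage 2: back-substitution, starting from the final equation
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x

A = [[2, 4, 2], [1, 5, 3], [-2, -1, -2]]    # coefficients of system (7.240)
b = [10, 4, -7]
solution = solve_gaussian(A, b)
print([int(v) for v in solution])            # [5, 1, -2]
```

For a singular matrix of coefficients the routine stops with an error rather than attempting the back-substitution.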
2x + y − z = 3, (7.244a)
x − 3y + 2z = 1, (7.244b)
4x − 5y + 3z = 5. (7.244c)
Without solving the equations explicitly, demonstrate that they do not
have a unique solution. Show that the equations have a parametric solution.
\[
\det(A) = \begin{vmatrix} 2 & 1 & -1 \\ 1 & -3 & 2 \\ 4 & -5 & 3 \end{vmatrix} = 0. \tag{7.246}
\]
Since A is singular, system (7.244) does not have a unique solution. We shall
look for a set of solutions to system (7.244) using Gaussian elimination.
◦ Stage 1: (augmented matrix) The augmented matrix of A with b is
\[
[A|b] = \left[ \begin{array}{ccc|c} 2 & 1 & -1 & 3 \\ 1 & -3 & 2 & 1 \\ 4 & -5 & 3 & 5 \end{array} \right]. \tag{7.247}
\]
x − 3y + 2z = 1, (7.251a)
7y − 5z = 1, (7.251b)
0 = 0; (7.251c)
since the final equation is ‘0 = 0’, only two of these equations are meaningful,
which is insufficient information to fix the values of three unknowns.
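We can, however, form a parametric solution by setting z = λ, so that
equation (7.251b) gives
\[
7y - 5z = 1 \quad \Rightarrow \quad y = \frac{1 + 5\lambda}{7}, \tag{7.253}
\]
and equation (7.251a) gives
\[
x - 3y + 2z = 1 \quad \Rightarrow \quad x = \frac{10 + \lambda}{7}. \tag{7.254}
\]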
\[
x = \frac{10 + \lambda}{7}, \quad y = \frac{1 + 5\lambda}{7}, \quad z = \lambda, \tag{7.255}
\]
where λ ∈ R is a parameter that may be chosen freely.
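Note: In column matrix form the parametric solution (7.255) may be written
\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
= \begin{bmatrix} \tfrac{10}{7} \\ \tfrac{1}{7} \\ 0 \end{bmatrix}
+ \lambda \begin{bmatrix} \tfrac{1}{7} \\ \tfrac{5}{7} \\ 1 \end{bmatrix}, \tag{7.256}
\]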
i.e., as the vector equation for a line. Geometrically, therefore, system (7.244)
corresponds to three planes intersecting along a solution line, as in figure 7.4(b).
Selecting different values for λ selects different points on the line. ◀
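That every value of λ yields a solution of system (7.244) can be confirmed by direct substitution. The sketch below assumes the parametric form (7.255); the helper names `point_on_line` and `residuals` are hypothetical.

```python
from fractions import Fraction

def point_on_line(lam):
    """Return (x, y, z) from the parametric solution (7.255) for a given lambda."""
    lam = Fraction(lam)
    return ((10 + lam) / 7, (1 + 5 * lam) / 7, lam)

def residuals(x, y, z):
    """Left-hand side minus right-hand side for each equation of system (7.244)."""
    return (2 * x + y - z - 3,
            x - 3 * y + 2 * z - 1,
            4 * x - 5 * y + 3 * z - 5)

# Every choice of lambda should give zero residual in all three equations
checks = [residuals(*point_on_line(lam)) for lam in (-3, 0, 1, Fraction(5, 2))]
print(all(r == (0, 0, 0) for r in checks))    # True
```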
3x + 2y − z = 4, (7.257a)
5x − y − z = 3, (7.257b)
x + 5y − z = 1. (7.257c)