
A FIRST COURSE

IN LINEAR ALGEBRA
An Open Text

by Ken Kuttler
LYRYX SERVICE COURSE SOLUTION ADAPTATION FOR:
UNIVERSITY OF CALGARY
MATH 211 - LINEAR METHODS I
FALL 2013 - ALL SECTIONS
*Creative Commons Attribution License (CC BY)
This text, including the art and illustrations, is available under the Creative Commons Attribution license (CC BY), allowing anyone to reuse, revise, remix and redistribute the text.
CONTENTS
1 Systems Of Equations 7
1.1 Systems Of Equations, Geometry . . . 7
1.2 Systems Of Equations, Algebraic Procedures . . . 11
1.2.1 Elementary Operations . . . 13
1.2.2 Gaussian Elimination . . . 18
1.2.3 Uniqueness of the Reduced Row-Echelon Form . . . 30
1.2.4 Rank and Homogeneous Systems . . . 32
1.3 Exercises . . . 37
2 Matrices 47
2.1 Matrix Arithmetic . . . 47
2.1.1 Addition of Matrices . . . 49
2.1.2 Scalar Multiplication of Matrices . . . 51
2.1.3 Multiplication Of Matrices . . . 52
2.1.4 The ij-th Entry Of A Product . . . 58
2.1.5 Properties Of Matrix Multiplication . . . 61
2.1.6 The Transpose . . . 62
2.1.7 The Identity And Inverses . . . 65
2.1.8 Finding The Inverse Of A Matrix . . . 67
2.1.9 Elementary Matrices . . . 72
2.1.10 More on Matrix Inverses . . . 77
2.2 Linear Transformations . . . 80
2.2.1 Matrices Which Are One To One Or Onto . . . 82
2.3 The Matrix Of A Linear Transformation . . . 85
2.3.1 The General Solution Of A Linear System . . . 89
2.4 Exercises . . . 93
3 Determinants 107
3.1 Basic Techniques And Properties . . . 107
3.1.1 Cofactors And 2x2 Determinants . . . 107
3.1.2 The Determinant Of A Triangular Matrix . . . 112
3.1.3 Properties Of Determinants . . . 114
3.1.4 Finding Determinants Using Row Operations . . . 118
3.2 Applications of the Determinant . . . 120
3.2.1 A Formula For The Inverse . . . 120
3.2.2 Cramer's Rule . . . 125
3.3 Exercises . . . 128
4 Complex Numbers 137
4.1 Complex Numbers . . . 137
4.2 Polar Form . . . 141
4.3 Roots Of Complex Numbers . . . 143
4.4 Exercises . . . 145
5 Spectral Theory 149
5.1 Eigenvalues And Eigenvectors Of A Matrix . . . 149
5.1.1 Definition Of Eigenvectors And Eigenvalues . . . 149
5.1.2 Finding Eigenvectors And Eigenvalues . . . 152
5.1.3 Eigenvalues and Eigenvectors for Special Types of Matrices . . . 157
5.2 Diagonalization . . . 160
5.2.1 Complex Eigenvalues . . . 165
5.3 Applications of Spectral Theory . . . 167
5.3.1 Raising a Matrix to a High Power . . . 167
5.3.2 Markov Matrices . . . 169
5.3.3 Dynamical Systems . . . 174
5.4 Exercises . . . 181
6 R^n 191
6.1 R^n . . . 191
6.2 Geometric Meaning Of Vectors . . . 193
6.3 Algebra in R^n . . . 195
6.3.1 Addition of Vectors in R^n . . . 195
6.3.2 Scalar Multiplication of Vectors in R^n . . . 196
6.4 Geometric Meaning Of Vector Addition . . . 197
6.5 Length Of A Vector . . . 199
6.6 Geometric Meaning Of Scalar Multiplication . . . 204
6.7 Parametric Lines . . . 205
6.8 Vector Products . . . 209
6.8.1 The Dot Product . . . 209
6.8.2 The Geometric Significance Of The Dot Product . . . 213
6.8.3 Projections . . . 216
6.8.4 The Cross Product . . . 219
6.9 Applications . . . 226
6.9.1 Rotations . . . 226
6.9.2 Vectors And Physics . . . 229
6.9.3 Work . . . 233
6.10 Exercises . . . 234
A Some Prerequisite Topics 245
A.1 Sets And Set Notation . . . 245
A.2 Well Ordering And Induction . . . 247
Preface
Linear Algebra: A First Course presents an introduction to the fascinating subject of linear algebra. As the title suggests, this text is designed as a first course in linear algebra for students who have a reasonable understanding of basic algebra. Major topics of linear algebra are presented in detail, with proofs of important theorems provided. Connections to additional topics covered in advanced courses are introduced, in an effort to assist those students who are interested in continuing on in linear algebra.
Each chapter begins with a list of desired outcomes which a student should be able to achieve upon completing the chapter. Throughout the text, examples and diagrams are given to reinforce ideas and provide guidance on how to approach various problems. Suggested exercises are given at the end of each chapter, and students are encouraged to work through a selection of these exercises.
A brief review of complex numbers is given, which can serve as an introduction to anyone unfamiliar with the topic.
Linear algebra is a wonderful and interesting subject, which should not be limited to a challenge of correct arithmetic. The use of a computer algebra system can be a great help in long and difficult computations. Some of the standard computations of linear algebra are easily done by the computer, including finding the reduced row-echelon form. While the use of a computer system is encouraged, it is not meant to be done without the student having an understanding of the computations.
Elementary Linear Algebra (c) 2012 by Kenneth Kuttler is offered under a Creative Commons Attribution (CC BY) license. Full license terms may be viewed at:
http://creativecommons.org/licenses/by/3.0/.
1. Systems Of Equations
Outcomes
A. Relate the types of solution sets of a system of two (three) variables to the intersections
of lines in a plane (the intersection of planes in three space).
B. Determine whether a system of linear equations has no solution, a unique solution, or an infinite number of solutions from its echelon form.
C. Solve a system of equations using Gaussian Elimination and Gauss-Jordan Elimination.
D. Model a physical system with linear equations and then solve.
1.1 Systems Of Equations, Geometry
As you may remember, linear equations like 2x + 3y = 6 can be graphed as straight lines
in the coordinate plane. We say that this equation is in two variables, in this case x and
y. Suppose you have two such equations, each of which can be graphed as a straight line.
Consider the resulting graph of two lines. What would it mean if there exists a point of
intersection between the two lines? This point, which lies on both graphs, gives x and y
values for which both equations are true. In other words, this point gives the ordered pair (x, y) that satisfies both equations. If the point (x, y) is a point of intersection, we say that (x, y) is a solution to the two equations. In linear algebra, we often are concerned with finding the solution(s) to a system of equations, if such solutions exist. First, we consider graphical representations of solutions and later we will consider the algebraic methods for finding solutions.
When looking for the intersection of two lines in a graph, several situations may arise.
The following picture demonstrates the possible situations when considering two equations
(two lines in the graph) involving two variables.
[Figure: three xy-graphs of a pair of lines: intersecting lines (one solution), two parallel lines (no solutions), and coincident lines (infinitely many solutions)]
In the first diagram, there is a unique point of intersection, which means that there is only one (unique) solution to the two equations. In the second, there are no points of intersection and hence no solution. When no solution exists, this means that the two lines never intersect, and hence are parallel. The third situation which can occur, as demonstrated in diagram three, is that the two lines are really the same line. For example, x + y = 1 and 2x + 2y = 2 are equations which when graphed yield the same line. In this case there are infinitely many points which are solutions of these two equations, as every ordered pair which is on the graph of the line satisfies both equations. When considering linear systems of equations, there are always three types of solutions possible: exactly one (unique) solution, infinitely many solutions, or no solution.
Example 1.1: A Graphical Solution
Use a graph to find the solution to the following system of equations

x + y = 3
y - x = 5

Solution. Through graphing the above equations and identifying the point of intersection, we can find the solution(s). Remember that we must have either one solution, infinitely many, or no solutions at all. The following graph shows the two equations, as well as the intersection. Remember, the point of intersection represents the solution of the two equations, or the (x, y) which satisfies both equations. In this case, there is one point of intersection which means we have one unique solution. You can verify the solution is (x, y) = (-1, 4).
[Figure: graph of the two lines, intersecting at the point (x, y) = (-1, 4)]
In the above example, we investigated the intersection point of two equations in two
variables, x and y. Now we will consider the graphical solutions of three equations in two
variables.
Consider a system of three equations in two variables. Again, these equations can be
graphed as straight lines in the plane, so that the resulting graph contains three straight
lines. Recall the three possible types of solutions: no solution, one solution, and infinitely many solutions. There are now more complex ways of achieving these situations, due to the
presence of the third line. For example, you can imagine the case of three intersecting lines
having no common point of intersection. Perhaps you can also imagine three intersecting
lines which do intersect at a single point. These two situations are illustrated below.
[Figure: two graphs of three lines: three lines with no common point of intersection, and three lines meeting at a single point]
Consider the first picture above. While all three lines intersect with one another, there
is no common point of intersection where all three lines meet at one point. Hence, there is
no solution to the three equations. Remember, a solution is a point (x, y) which satises all
three equations. In the case of the second picture, the lines intersect at a common point.
This means that there is one solution to the three equations whose graphs are the given lines.
You should take a moment now to draw the graph of a system which results in three parallel
lines. Next, try the graph of three identical lines. Which type of solution is represented in
each of these graphs?
We have now considered the graphical solutions of systems of two equations in two
variables, as well as three equations in two variables. However, there is no reason to limit
our investigation to equations in two variables. We will now consider equations in three
variables.
You may recall that equations in three variables, such as 2x + 4y - 5z = 8, form a
plane. Above, we were looking for intersections of lines in order to identify any possible
solutions. When graphically solving systems of equations in three variables, we look for
intersections of planes. These points of intersection give the (x, y, z) that satisfy all the
equations in the system. What types of solutions are possible when working with three
variables? Consider the following picture involving two planes, which are given by two
equations in three variables.
[Figure: two planes intersecting in a line]
Notice how these two planes intersect in a line. This means that the points (x, y, z) on
this line satisfy both equations in the system. Since the line contains innitely many points,
this system has innitely many solutions.
It could also happen that the two planes fail to intersect. However, is it possible to have
two planes intersect at a single point? Take a moment to attempt drawing this situation, and
convince yourself that it is not possible! This means that when we have only two equations
in three variables, there is no way to have a unique solution! Hence, the types of solutions
possible for two equations in three variables are no solution or innitely many solutions.
Now imagine adding a third plane. In other words, consider three equations in three
variables. What types of solutions are now possible? Consider the following diagram.
[Figure: three planes with no common point of intersection; the third plane is labeled "New Plane"]
In this diagram, there is no point which lies in all three planes. There is no intersection
between all planes so there is no solution. The picture illustrates the situation in which the
line of intersection of the new plane with one of the original planes forms a line parallel to
the line of intersection of the first two planes. However, in three dimensions, it is possible
for two lines to fail to intersect even though they are not parallel. Such lines are called skew
lines.
Recall that when working with two equations in three variables, it was not possible to
have a unique solution. Is it possible when considering three equations in three variables?
In fact, it is possible, and we demonstrate this situation in the following picture.
[Figure: three planes meeting in a single point; the third plane is labeled "New Plane"]
In this case, the three planes have a single point of intersection. Can you think of other
types of solutions possible? Another is that the three planes could intersect in a line, resulting in infinitely many solutions, as in the following diagram.
[Figure: three planes intersecting in a common line]
We have now seen how three equations in three variables can have no solution, a unique solution, or intersect in a line resulting in infinitely many solutions. It is also possible that the three equations graph the same plane, which also leads to infinitely many solutions.
You can see that when working with equations in three variables, there are many more ways to achieve the different types of solutions than when working with two variables. It may prove enlightening to spend time imagining (and drawing) many possible scenarios, and you should take some time to try a few.
You should also take some time to imagine (and draw) graphs of systems in more than three variables. Equations like x + y - 2z + 4w = 8 with more than three variables are often called hyper-planes. You may soon realize that it is tricky to draw the graphs of hyper-planes! Through the tools of linear algebra, we can algebraically examine these types of systems which are difficult to graph. In the following section, we will consider these algebraic tools.
1.2 Systems Of Equations, Algebraic Procedures
We have now taken an in depth look at graphical representations of systems of equations, as
well as how to find possible solutions graphically. Our attention now turns to working with
systems algebraically.
Definition 1.2: System of Linear Equations
A system of linear equations is a list of equations,

a_{11}x_1 + a_{12}x_2 + ... + a_{1n}x_n = b_1
a_{21}x_1 + a_{22}x_2 + ... + a_{2n}x_n = b_2
...
a_{m1}x_1 + a_{m2}x_2 + ... + a_{mn}x_n = b_m

where the a_{ij} and b_j are real numbers. The above is a system of m equations in the n variables x_1, x_2, ..., x_n. Written more simply in terms of summation notation, the above can be written in the form

sum_{j=1}^{n} a_{ij} x_j = b_i,  i = 1, 2, 3, ..., m

The relative size of m and n is not important here. Notice that we have allowed a_{ij} and b_j to be any real number. We can also call these numbers scalars. We will use this term throughout the text, so keep in mind that the term scalar just means that we are working with real numbers.
Now, suppose we have a system where b_i = 0 for all i. Hence, every equation equals 0. This is a special type of system.
Definition 1.3: Homogeneous System of Equations
A system of equations is called homogeneous if each equation in the system is equal to 0. A homogeneous system has the form

a_{11}x_1 + a_{12}x_2 + ... + a_{1n}x_n = 0
a_{21}x_1 + a_{22}x_2 + ... + a_{2n}x_n = 0
...
a_{m1}x_1 + a_{m2}x_2 + ... + a_{mn}x_n = 0

where the a_{ij} are scalars and the x_i are variables.
Recall from the previous section that our goal when working with systems of linear equations was to find the point of intersection of the equations when graphed. In other words, we looked for the solutions to the system. We now wish to find these solutions algebraically. We want to find values for x_1, ..., x_n which solve all of the equations. If such a set of values exists, we call (x_1, ..., x_n) the solution set.
Recall the above discussions about the types of solutions possible. We will see that systems of linear equations will have one unique solution, infinitely many solutions, or no solution. Consider the following definition.
Denition 1.4: Consistent and Inconsistent Systems
A system of linear equations is called consistent if there exists at least one solution.
It is called inconsistent if there is no solution.
If you think of each equation as a condition which must be satised by the variables,
consistent would mean there is some choice of variables which can satisfy all the conditions.
Inconsistent would mean there is no choice of the variables which can satisfy all of the
conditions.
The following sections provide methods for determining if a system is consistent or inconsistent, and finding solutions if they exist.
1.2.1. Elementary Operations
We begin this section with an example.
Example 1.5: Verifying an Ordered Pair is a Solution
Find x and y such that

x + y = 7
2x - y = 8    (1.1)

Solution. By graphing these two equations and identifying the point of intersection, you can verify that (x, y) = (5, 2) is the only solution to the above system.
We can verify algebraically by substituting these values into the original equations, and ensuring that the equations hold. First, we substitute the values into the first equation and check that it equals 7.

x + y = (5) + (2) = 7    (1.2)

This equals 7 as needed, so we see that (5, 2) is a solution to the first equation. Substituting the values into the second equation yields

2x - y = 2(5) - (2) = 10 - 2 = 8    (1.3)

which is true. Hence, for (x, y) = (5, 2) each equation is true. Therefore, (5, 2) is a solution to the system.
Now, the interesting question is this: If you were not given these numbers to verify, how could you algebraically determine the solution? Linear algebra gives us the tools needed to answer this question. The following basic operations are important tools that we will utilize.
Denition 1.6: Elementary Operations
Elementary operations are those operations consisting of the following.
1. Interchange the order in which the equations are listed.
2. Multiply any equation by a nonzero number.
3. Replace any equation with itself added to a multiple of another equation.
It is important to note that none of these operations will change the set of solutions of the system of equations. In fact, elementary operations are the key tool we use in linear algebra to find solutions to systems of equations.
Consider the following example.
Example 1.7: Effects of an Elementary Operation
Show that the system

x + y = 7
2x - y = 8

has the same solution as the system

x + y = 7
-3y = -6

Solution. Notice that the second system has been obtained by taking the second equation of the first system and adding (-2) times the first equation, as follows:

2x - y + (-2)(x + y) = 8 + (-2)(7)

By simplifying, we obtain

-3y = -6

which is the second equation in the second system. Now, from here we can solve for y and see that y = 2. Next, we substitute this value into the first equation as follows

x + y = x + 2 = 7

Hence x = 5 and so (x, y) = (5, 2) is a solution to the second system. We want to check if (5, 2) is also a solution to the first system. We check this by substituting (x, y) = (5, 2) into the system and ensuring the equations are true.

x + y = (5) + (2) = 7
2x - y = 2(5) - (2) = 8

Hence, (5, 2) is also a solution to the first system.
This example illustrates how an elementary operation applied to a system of two equations in two variables does not affect the solution set. However, a linear system may involve many
equations and many variables and there is no reason to limit our study to small systems.
For any size of system in any number of variables, the solution set is still the collection of solutions to the equations. In every case, the above operations of Definition 1.6 do not change the set of solutions to the system of linear equations.
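Following the preface's encouragement to use the computer, here is a minimal Python sketch (an illustration only, not part of the formal development) that checks this claim for Example 1.7: the pair (x, y) = (5, 2) satisfies both the original system and the system produced by the elementary operation.

# Check that the elementary operation of Example 1.7 preserves the solution.
x, y = 5, 2

# Original system: x + y = 7, 2x - y = 8
original = [x + y == 7, 2*x - y == 8]

# After adding (-2) times the first equation to the second: x + y = 7, -3y = -6
transformed = [x + y == 7, -3*y == -6]

print(all(original), all(transformed))  # prints: True True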
In the following theorem, we use the notation E_i to represent an equation, while b_i denotes a constant.
Theorem 1.8: Elementary Operations and Solutions
Suppose you have a system of two linear equations

E_1 = b_1
E_2 = b_2    (1.4)

Then the following systems have the same solution set as 1.4:

1.
E_2 = b_2
E_1 = b_1    (1.5)

2.
E_1 = b_1
kE_2 = kb_2    (1.6)

for any scalar k, provided k != 0.

3.
E_1 = b_1
E_2 + kE_1 = b_2 + kb_1    (1.7)

for any scalar k (including k = 0).

Before we proceed with the proof of Theorem 1.8, let us consider this theorem in the context of Example 1.7. Then,

E_1 = x + y, b_1 = 7
E_2 = 2x - y, b_2 = 8

Recall the elementary operations that we used to modify the system in the solution to the example. First, we added (-2) times the first equation to the second equation. In terms of Theorem 1.8, this action is given by

E_2 + (-2)E_1 = b_2 + (-2)b_1

or

2x - y + (-2)(x + y) = 8 + (-2)7

This gave us the second system in Example 1.7, given by

E_1 = b_1
E_2 + (-2)E_1 = b_2 + (-2)b_1
From this point, we were able to find the solution to the system. Theorem 1.8 tells us that the solution we found is in fact a solution to the original system.
We will now prove Theorem 1.8.

Proof of Theorem 1.8:

1. The proof of equation 1.5 is as follows. Suppose that (x_1, ..., x_n) is a solution to E_1 = b_1, E_2 = b_2. We want to show that this is a solution to the system in 1.5 above. This is clear, because the system in 1.5 is the original system, but listed in a different order. Changing the order does not affect the solution set, so (x_1, ..., x_n) is a solution to 1.5.

2. Next we want to prove equation 1.6, which states that E_1 = b_1, E_2 = b_2 has the same solution set as the system E_1 = b_1, kE_2 = kb_2 provided k != 0. Let (x_1, ..., x_n) be a solution of E_1 = b_1, E_2 = b_2. We want to show that it is a solution to E_1 = b_1, kE_2 = kb_2. Notice that the only difference between these two systems is that the second involves multiplying the equation E_2 = b_2 by the scalar k. Recall that when you multiply both sides of an equation by the same number, the sides are still equal to each other. Hence if (x_1, ..., x_n) is a solution to E_2 = b_2, then it will also be a solution to kE_2 = kb_2. Hence, (x_1, ..., x_n) is also a solution to 1.6.
Similarly, let (x_1, ..., x_n) be a solution of E_1 = b_1, kE_2 = kb_2. Then we can multiply the equation kE_2 = kb_2 by the scalar 1/k, which is possible only because we have required that k != 0. Just as above, this action preserves equality and we obtain the equation E_2 = b_2. Hence (x_1, ..., x_n) is also a solution to E_1 = b_1, E_2 = b_2.

3. Finally, we will prove 1.7. We will show that any solution of E_1 = b_1, E_2 = b_2 is also a solution of 1.7. Then, we will show that any solution of 1.7 is also a solution of E_1 = b_1, E_2 = b_2. Let (x_1, ..., x_n) be a solution to E_1 = b_1, E_2 = b_2. Then in particular it solves E_1 = b_1. Hence, it solves the first equation in 1.7. Similarly, it also solves E_2 = b_2. By our proof of 1.6, it also solves kE_1 = kb_1. Notice that if we add E_2 and kE_1, this is equal to b_2 + kb_1. Therefore, if (x_1, ..., x_n) solves E_1 = b_1, E_2 = b_2 it must also solve E_2 + kE_1 = b_2 + kb_1.
Now suppose (x_1, ..., x_n) solves the system E_1 = b_1, E_2 + kE_1 = b_2 + kb_1. Then in particular it is a solution of E_1 = b_1. Again by our proof of 1.6, it is also a solution to kE_1 = kb_1. Now if we subtract these equal quantities from both sides of E_2 + kE_1 = b_2 + kb_1, we obtain E_2 = b_2, which shows that the solution also satisfies E_1 = b_1, E_2 = b_2.
Stated simply, the above theorem shows that the elementary operations do not change
the solution set of a system of equations.
We will now look at an example of a system of three equations and three variables.
Similarly to the previous examples, the goal is to find values for x, y, z such that each of the given equations is satisfied when these values are substituted in.
Example 1.9: Solving a System of Equations with Elementary Operations
Find the solutions to the system,

x + 3y + 6z = 25
2x + 7y + 14z = 58
2y + 5z = 19    (1.8)

Solution. We can relate this system to Theorem 1.8 above. In this case, we have

E_1 = x + 3y + 6z, b_1 = 25
E_2 = 2x + 7y + 14z, b_2 = 58
E_3 = 2y + 5z, b_3 = 19

Theorem 1.8 claims that if we do elementary operations on this system, we will not change the solution set. Therefore, we can solve this system using the elementary operations given in Definition 1.6. First, replace the second equation by (-2) times the first equation added to the second. This yields the system

x + 3y + 6z = 25
y + 2z = 8
2y + 5z = 19    (1.9)

Now, replace the third equation with (-2) times the second added to the third. This yields the system

x + 3y + 6z = 25
y + 2z = 8
z = 3    (1.10)

At this point, we can easily find the solution. Simply take z = 3 and substitute this back into the previous equation to solve for y, and similarly to solve for x.

x + 3y + 6(3) = x + 3y + 18 = 25
y + 2(3) = y + 6 = 8
z = 3

The second equation is now

y + 6 = 8

You can see from this equation that y = 2. Therefore, we can substitute this value into the first equation as follows:

x + 3(2) + 18 = 25

By simplifying this equation, we find that x = 1. Hence, the solution to this system is (x, y, z) = (1, 2, 3). This process is called back substitution.
Alternatively, in 1.10 you could have continued as follows. Add (-2) times the third equation to the second and then add (-6) times the third to the first. This yields

x + 3y = 7
y = 2
z = 3

Now add (-3) times the second to the first. This yields

x = 1
y = 2
z = 3

a system which has the same solution set as the original system. This avoided back substitution and led to the same solution set. It is your decision which you prefer to use, as both methods lead to the correct solution, (x, y, z) = (1, 2, 3).
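As a quick machine check of this example (a sketch using the numpy library; any computer algebra system would do), the system 1.8 can be handed to a linear solver:

import numpy as np

# Coefficient matrix and right-hand side of system 1.8
A = np.array([[1.0, 3.0, 6.0],
              [2.0, 7.0, 14.0],
              [0.0, 2.0, 5.0]])
b = np.array([25.0, 58.0, 19.0])

print(np.linalg.solve(A, b))  # prints [1. 2. 3.], i.e. (x, y, z) = (1, 2, 3)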
1.2.2. Gaussian Elimination
The work we did in the previous section will always find the solution to the system. In this section, we will explore a less cumbersome way to find the solutions. First, we will represent a linear system with an augmented matrix. A matrix is simply a rectangular array of numbers. The size or dimension of a matrix is defined as m x n, where m is the number of rows and n is the number of columns. In order to construct an augmented matrix from a linear system, we create a coefficient matrix from the coefficients of the variables in the system, as well as a constant matrix from the constants. The coefficients from one equation of the system create one row of the augmented matrix.
For example, consider the linear system in Example 1.9

x + 3y + 6z = 25
2x + 7y + 14z = 58
2y + 5z = 19

This system can be written as an augmented matrix, as follows

[ 1 3 6 | 25 ]
[ 2 7 14 | 58 ]
[ 0 2 5 | 19 ]

Notice that it has exactly the same information as the original system. Here it is understood that the first column contains the coefficients from x in each equation, in order,

[ 1 ]
[ 2 ]
[ 0 ]

Similarly, we create a column from the coefficients on y in each equation,

[ 3 ]
[ 7 ]
[ 2 ]

and a column from the coefficients on z in each equation,

[ 6 ]
[ 14 ]
[ 5 ]

For a system of more than three variables, we would continue in this way, constructing a column for each variable. Similarly, for a system of fewer than three variables, we simply construct a column for each variable.
Finally, we construct a column from the constants of the equations,

[ 25 ]
[ 58 ]
[ 19 ]

The rows of the augmented matrix correspond to the equations in the system. For example, the top row in the augmented matrix, [ 1 3 6 | 25 ], corresponds to the equation x + 3y + 6z = 25.
Consider the following definition.
Definition 1.10: Augmented Matrix of a Linear System
For a linear system of the form

a_{11}x_1 + ... + a_{1n}x_n = b_1
...
a_{m1}x_1 + ... + a_{mn}x_n = b_m

where the x_i are variables and the a_{ij} and b_i are constants, the augmented matrix of this system is given by

[ a_{11} ... a_{1n} | b_1 ]
[   ...       ...   | ... ]
[ a_{m1} ... a_{mn} | b_m ]
Now, consider elementary operations in the context of the augmented matrix. The elementary operations in Definition 1.6 can be used on the rows just as we used them on equations previously. Changes to a system of equations as a result of an elementary operation are equivalent to changes in the augmented matrix resulting from the corresponding row operation. Note that Theorem 1.8 implies that any elementary row operations used on an augmented matrix will not change the solution to the corresponding system of equations. We now formally define elementary row operations. These are the key tool we will use to find solutions to systems of equations.
Denition 1.11: Elementary Row Operations
The elementary row operations (also known as row operations) consist of the
following
1. Switch two rows.
2. Multiply a row by a nonzero number.
3. Replace a row by any multiple of another row added to it.
Recall how we solved Example 1.9. We can do the exact same steps as above, except now in the context of an augmented matrix and using row operations. The augmented matrix of this system is

[ 1 3 6 | 25 ]
[ 2 7 14 | 58 ]
[ 0 2 5 | 19 ]

Thus the first step in solving the system given by 1.8 would be to take (-2) times the first row of the augmented matrix and add it to the second row,

[ 1 3 6 | 25 ]
[ 0 1 2 | 8 ]
[ 0 2 5 | 19 ]

Note how this corresponds to 1.9. Next take (-2) times the second row and add to the third,

[ 1 3 6 | 25 ]
[ 0 1 2 | 8 ]
[ 0 0 1 | 3 ]

This augmented matrix corresponds to the system

x + 3y + 6z = 25
y + 2z = 8
z = 3

which is the same as 1.10. By back substitution you obtain the solution x = 1, y = 2, and z = 3.
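The same two row operations can be carried out directly on the augmented matrix in a few lines of Python (a numpy sketch mirroring the steps above):

import numpy as np

# Augmented matrix of system 1.8
M = np.array([[1.0, 3.0, 6.0, 25.0],
              [2.0, 7.0, 14.0, 58.0],
              [0.0, 2.0, 5.0, 19.0]])

M[1] = M[1] - 2 * M[0]  # add (-2) times row 1 to row 2
M[2] = M[2] - 2 * M[1]  # add (-2) times row 2 to row 3
print(M)  # the last row now reads z = 3; back substitution gives y = 2, x = 1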
Through a systematic procedure of row operations, we can simplify an augmented matrix and carry it to row-echelon form or reduced row-echelon form, which we define next. These forms are used to find the solutions of the system of equations corresponding to the augmented matrix.
In the following definitions, the term leading entry refers to the first nonzero entry of a row when scanning the row from left to right.
Denition 1.12: Row-Echelon Form
An augmented matrix is in row-echelon form if
1. All nonzero rows are above any rows of zeros.
2. Each leading entry of a row is in a column to the right of the leading entries of
any row above it.
3. Each leading entry of a row is equal to 1.
We also consider another reduced form of the augmented matrix which has one further
condition.
Denition 1.13: Reduced Row-Echelon Form
An augmented matrix is in reduced row-echelon form if
1. All nonzero rows are above any rows of zeros.
2. Each leading entry of a row is in a column to the right of the leading entries of
any rows above it.
3. Each leading entry of a row is equal to 1.
4. All entries in a column above and below a leading entry are zero.
Notice that the first three conditions on a reduced row-echelon form matrix are the same as those for row-echelon form.
Hence, every reduced row-echelon form matrix is also in row-echelon form. The converse is not necessarily true; we cannot assume that every matrix in row-echelon form is also in reduced row-echelon form. However, it often happens that the row-echelon form is sufficient to provide information about the solution of a system.
The following examples describe matrices in these various forms. As an exercise, take
the time to carefully verify that they are in the specified form.
Example 1.14: Not in Row-Echelon Form
The following augmented matrices are not in row-echelon form (and therefore also not in reduced row-echelon form).

[ 0 0 0 | 0 ]
[ 1 2 3 | 3 ]
[ 0 1 0 | 2 ]
[ 0 0 0 | 1 ]
[ 0 0 0 | 0 ]

[ 1 2 | 3 ]
[ 2 4 | 6 ]
[ 4 0 | 7 ]

[ 0 2 3 | 3 ]
[ 1 5 0 | 2 ]
[ 7 5 0 | 1 ]
[ 0 0 1 | 0 ]

Example 1.15: Matrices in Row-Echelon Form
The following augmented matrices are in row-echelon form, but not in reduced row-echelon form.

[ 1 0 6 5 8 | 2 ]
[ 0 0 1 2 7 | 3 ]
[ 0 0 0 0 1 | 1 ]
[ 0 0 0 0 0 | 0 ]

[ 1 3 5 | 4 ]
[ 0 1 0 | 7 ]
[ 0 0 1 | 0 ]
[ 0 0 0 | 1 ]
[ 0 0 0 | 0 ]

[ 1 0 6 | 0 ]
[ 0 1 4 | 0 ]
[ 0 0 1 | 0 ]
[ 0 0 0 | 0 ]
Notice that we could apply further row operations to these matrices to carry them to
reduced row-echelon form. Take the time to try that on your own. Consider the following
matrices, which are in reduced row-echelon form.
Example 1.16: Matrices in Reduced Row-Echelon Form
The following augmented matrices are in reduced row-echelon form.

[ 1 0 0 5 0 | 0 ]
[ 0 0 1 2 0 | 0 ]
[ 0 0 0 0 1 | 1 ]
[ 0 0 0 0 0 | 0 ]

[ 1 0 0 | 0 ]
[ 0 1 0 | 0 ]
[ 0 0 1 | 0 ]
[ 0 0 0 | 1 ]
[ 0 0 0 | 0 ]

[ 1 0 0 | 4 ]
[ 0 1 0 | 3 ]
[ 0 0 1 | 2 ]
One way in which the row-echelon form of a matrix is useful is in identifying the pivot
positions and pivot columns of the matrix.
Definition 1.17: Pivot Position and Pivot Column
A pivot position in a matrix is the location of a leading entry in the row-echelon form of a matrix. A pivot column is a column that contains a pivot position.
For example, consider the following.
Example 1.18: Pivot Position
Let

A = [ 1 2 3 | 4 ]
    [ 3 2 1 | 6 ]
    [ 4 4 4 | 10 ]

Where are the pivot positions and pivot columns of the augmented matrix A?

Solution. The row-echelon form of this matrix is

[ 1 2 3 | 4 ]
[ 0 1 2 | 3/2 ]
[ 0 0 0 | 0 ]

This is all we need in this example, but note that this matrix is not in reduced row-echelon form.
In order to identify the pivot positions in the original matrix, we look for the leading entries in the row-echelon form of the matrix. Here, the entry in the first row and first column, as well as the entry in the second row and second column, are the leading entries. Hence, these locations are the pivot positions. We identify the pivot positions in the original matrix (the 1 in the first row and the 2 in the second row):

[ 1 2 3 | 4 ]
[ 3 2 1 | 6 ]
[ 4 4 4 | 10 ]

Thus the pivot columns in the matrix are the first two columns.
The following is an algorithm for carrying a matrix to row-echelon form and reduced row-echelon form. You may wish to use this algorithm to carry the above matrix to row-echelon form or reduced row-echelon form yourself for practice.

Algorithm 1.19: Reduced Row-Echelon Form Algorithm
This algorithm provides a method for using row operations to take a matrix to its reduced row-echelon form. We begin with the matrix in its original form.
1. Starting from the left, find the first nonzero column. This is the first pivot column, and the position at the top of this column is the first pivot position. Switch rows if necessary to place a nonzero number in the first pivot position.
2. Use row operations to make the entries below the first pivot position (in the first pivot column) equal to zero.
3. Ignoring the row containing the first pivot position, repeat steps 1 and 2 with the remaining rows. Repeat the process until there are no more rows to modify.
4. Divide each nonzero row by the value of the leading entry, so that the leading entry becomes 1. The matrix will then be in row-echelon form.
The following step will carry the matrix from row-echelon form to reduced row-echelon form.
5. Moving from right to left, use row operations to create zeros in the entries of the pivot columns which are above the pivot positions. The result will be a matrix in reduced row-echelon form.

Most often we will apply this algorithm to an augmented matrix in order to find the solution to a system of linear equations. However, we can use this algorithm to compute the reduced row-echelon form of any matrix, which could be useful in other applications.
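A bare-bones Python implementation of Algorithm 1.19 might look as follows. This is an illustrative sketch only: it uses exact fractions so the leading entries come out clean, and it scales each pivot row as it goes (rather than in a separate step 4), which produces the same reduced row-echelon form. Library routines such as sympy's Matrix.rref() do the same job.

from fractions import Fraction

def rref(matrix):
    """Carry a matrix (a list of rows) to reduced row-echelon form."""
    M = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols):
        # Step 1: find a nonzero entry in this column, at or below pivot_row.
        pivot = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot position in this column
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]  # switch rows
        # Scale the pivot row so the leading entry becomes 1.
        M[pivot_row] = [x / M[pivot_row][col] for x in M[pivot_row]]
        # Steps 2 and 5: create zeros below and above the pivot position.
        for r in range(rows):
            if r != pivot_row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return M

# The augmented matrix of Example 1.18:
print(rref([[1, 2, 3, 4], [3, 2, 1, 6], [4, 4, 4, 10]]))
# rows: [1, 0, -1, 1], [0, 1, 2, 3/2], [0, 0, 0, 0]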
Consider the following example of Algorithm 1.19.
Example 1.20: Finding Row-Echelon Form and Reduced Row-Echelon Form of a Matrix
Let

A = [ 0 5 4 ]
    [ 1 4 3 ]
    [ 5 10 7 ]

Find the row-echelon form of A. Then complete the process until A is in reduced row-echelon form.

Solution. In working through this example, we will use the steps outlined in Algorithm 1.19.

1. The first pivot column is the first column of the matrix, as this is the first nonzero column from the left. Hence the first pivot position is the one in the first row and first column. Switch the first two rows to obtain a nonzero entry in the first pivot position:

[ 1 4 3 ]
[ 0 5 4 ]
[ 5 10 7 ]

2. Step two involves creating zeros in the entries below the first pivot position. The first entry of the second row is already a zero. All we need to do is subtract 5 times the first row from the third row. The resulting matrix is

[ 1 4 3 ]
[ 0 5 4 ]
[ 0 -10 -8 ]

3. Now ignore the top row. Apply steps 1 and 2 to the smaller matrix

[ 5 4 ]
[ -10 -8 ]

In this matrix, the first column is a pivot column, and 5 is in the first pivot position. Therefore, we need to create a zero below it. To do this, add 2 times the first row (of this matrix) to the second. The resulting matrix is

[ 5 4 ]
[ 0 0 ]

Our original matrix now looks like

[ 1 4 3 ]
[ 0 5 4 ]
[ 0 0 0 ]

We can see that there are no more rows to modify.

4. Now, we need to create leading 1s in each row. The first row already has a leading 1, so no work is needed here. Divide the second row by 5 to create a leading 1. The resulting matrix is

[ 1 4 3 ]
[ 0 1 4/5 ]
[ 0 0 0 ]

This matrix is now in row-echelon form.

5. Now create zeros in the entries above pivot positions in each column, in order to carry this matrix all the way to reduced row-echelon form. Notice that there is no pivot position in the third column, so we do not need to create any zeros in this column! The column in which we need to create zeros is the second. To do so, subtract 4 times the second row from the first row. The resulting matrix is

[ 1 0 -1/5 ]
[ 0 1 4/5 ]
[ 0 0 0 ]

This matrix is now in reduced row-echelon form.
The above algorithm gives you a simple way to obtain the row-echelon form and reduced
row-echelon form of a matrix. The main idea is to do row operations in such a way as
to end up with a matrix in row-echelon form or reduced row-echelon form. This process
is important because the resulting matrix will allow you to describe the solutions to the
corresponding linear system of equations in a meaningful way.
In the next example, we look at how to solve a system of equations using the corresponding
augmented matrix.
Example 1.21: Finding the Solution to a System
Give the complete solution to the following system of equations

2x + 4y - 3z = -1
5x + 10y - 7z = -2
3x + 6y + 5z = 9

Solution. The augmented matrix for this system is

[ 2 4 -3 | -1 ]
[ 5 10 -7 | -2 ]
[ 3 6 5 | 9 ]

In order to find the solution to this system, we wish to carry the augmented matrix to reduced row-echelon form. We will do so using Algorithm 1.19. Notice that the first column is nonzero, so this is our first pivot column. The first entry in the first row, 2, is the first leading entry and it is in the first pivot position. We will use row operations to create zeros in the entries below the 2. First, replace the second row with (-5) times the first row plus 2 times the second row. This yields

[ 2 4 -3 | -1 ]
[ 0 0 1 | 1 ]
[ 3 6 5 | 9 ]

Now, replace the third row with (-3) times the first row plus 2 times the third row. This yields

[ 2 4 -3 | -1 ]
[ 0 0 1 | 1 ]
[ 0 0 19 | 21 ]

Now the entries in the first column below the pivot position are zeros. We now look for the second pivot column, which in this case is column three. Here, the 1 in the second row and third column is in the pivot position. We need to do just one row operation to create a zero below the 1.
Taking (-19) times the second row and adding it to the third row yields

[ 2 4 -3 | -1 ]
[ 0 0 1 | 1 ]
[ 0 0 0 | 2 ]

We could proceed with the algorithm to carry this matrix to row-echelon form or reduced row-echelon form. However, remember that we are looking for the solutions to the system of equations. Take another look at the third row of the matrix. Notice that it corresponds to the equation

0x + 0y + 0z = 2

There is no solution to this equation because for all x, y, z, the left side will equal 0, and 0 != 2. This shows there is no solution to the given system of equations. In other words, this system is inconsistent.
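A computer algebra system reaches the same conclusion. The sketch below (using sympy's rref routine) reduces the augmented matrix all the way; the row 0 = 2 above appears, scaled, as a row of the form [0 0 0 | 1]:

from sympy import Matrix

# Augmented matrix of the system in Example 1.21
M = Matrix([[2, 4, -3, -1],
            [5, 10, -7, -2],
            [3, 6, 5, 9]])
R, pivots = M.rref()
print(R)       # the last row is [0, 0, 0, 1], i.e. 0 = 1: inconsistent
print(pivots)  # (0, 2, 3): the constant column is a pivot column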
The following is another example of how to find the solution to a system of equations by carrying the corresponding augmented matrix to reduced row-echelon form.
Example 1.22: An Infinite Set of Solutions
Give the complete solution to the system of equations

3x - y - 5z = 9
y - 10z = 0
-2x + y = -6    (1.11)

Solution. The augmented matrix of this system is

[ 3 -1 -5 | 9 ]
[ 0 1 -10 | 0 ]
[ -2 1 0 | -6 ]

In order to find the solution to this system, we will carry the augmented matrix to reduced row-echelon form, using Algorithm 1.19. The first column is the first pivot column. We want to use row operations to create zeros beneath the first entry in this column, which is in the first pivot position. Replace the third row with 2 times the first row added to 3 times the third row. This gives

[ 3 -1 -5 | 9 ]
[ 0 1 -10 | 0 ]
[ 0 1 -10 | 0 ]

Now, we have created zeros beneath the 3 in the first column, so we move on to the second pivot column (which is the second column) and repeat the procedure. Take (-1) times the second row and add to the third row.

[ 3 -1 -5 | 9 ]
[ 0 1 -10 | 0 ]
[ 0 0 0 | 0 ]

The entry below the pivot position in the second column is now a zero. Notice that we have no more pivot columns because we have only two leading entries.
At this stage, we also want the leading entries to be equal to one. To do so, divide the first row by 3.

[ 1 -1/3 -5/3 | 3 ]
[ 0 1 -10 | 0 ]
[ 0 0 0 | 0 ]

This matrix is now in row-echelon form.
Let's continue with row operations until the matrix is in reduced row-echelon form. This involves creating zeros above the pivot positions in each pivot column. This requires only one step, which is to add 1/3 times the second row to the first row.

[ 1 0 -5 | 3 ]
[ 0 1 -10 | 0 ]
[ 0 0 0 | 0 ]

This is in reduced row-echelon form, which you should verify using Definition 1.13. The equations corresponding to this reduced row-echelon form are

x - 5z = 3
y - 10z = 0

or

x = 3 + 5z
y = 10z

Observe that z is not restrained by any equation. In fact, z can equal any number. For example, we can let z = t, where we can choose t to be any number. In this context t is called a parameter. Therefore, the solution set of this system is

x = 3 + 5t
y = 10t
z = t
where t is arbitrary. The system has an infinite set of solutions which are given by these equations. For any value of t we select, x, y, and z will be given by the above equations. For example, if we choose t = 4 then the corresponding solution would be

x = 3 + 5(4) = 23
y = 10(4) = 40
z = 4
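Because the solution depends on a parameter, a quick sanity check by machine (a numpy sketch) is to substitute several values of t and confirm that each resulting triple satisfies the original system 1.11:

import numpy as np

A = np.array([[3.0, -1.0, -5.0],
              [0.0, 1.0, -10.0],
              [-2.0, 1.0, 0.0]])
b = np.array([9.0, 0.0, -6.0])

for t in (0.0, 4.0, -7.5):
    x = np.array([3 + 5 * t, 10 * t, t])  # the parametric solution above
    print(t, np.allclose(A @ x, b))       # prints True for every t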
In Example 1.22 the solution involved one parameter. It may happen that the solution
to a system involves more than one parameter, as shown in the following example.
Example 1.23: A Two-Parameter Set of Solutions
Find the solution to the system

x + 2y - z + w = 3
x + y - z + w = 1
x + 3y - z + w = 5

Solution. The augmented matrix is

[ 1 2 -1 1 | 3 ]
[ 1 1 -1 1 | 1 ]
[ 1 3 -1 1 | 5 ]

We wish to carry this matrix to row-echelon form. Here, we will outline the row operations used. However, make sure that you understand the steps in terms of Algorithm 1.19.
Take (-1) times the first row and add to the second. Then take (-1) times the first row and add to the third. This yields

[ 1 2 -1 1 | 3 ]
[ 0 -1 0 0 | -2 ]
[ 0 1 0 0 | 2 ]

Now add the second row to the third row and divide the second row by -1.

[ 1 2 -1 1 | 3 ]
[ 0 1 0 0 | 2 ]
[ 0 0 0 0 | 0 ]    (1.12)

This matrix is in row-echelon form and we can see that x and y correspond to pivot columns, while z and w do not. Therefore, we will assign parameters to the variables z and w. Assign the parameter s to z and the parameter t to w. Then the first row yields the equation x + 2y - s + t = 3, while the second row yields the equation y = 2. Since y = 2, the first equation becomes x + 4 - s + t = 3, showing that the solution is given by

x = -1 + s - t
y = 2
z = s
w = t
It is customary to write this solution in the form

[ x ]   [ -1 + s - t ]
[ y ] = [ 2 ]
[ z ]   [ s ]
[ w ]   [ t ]    (1.13)

This example shows a system of equations with an infinite solution set which depends on two parameters. It can be less confusing in the case of an infinite solution set to first place the augmented matrix in reduced row-echelon form rather than just row-echelon form before seeking to write down the description of the solution.
In the above steps, this means we don't stop with the row-echelon form in equation 1.12. Instead we first place it in reduced row-echelon form as follows.

[ 1 0 -1 1 | -1 ]
[ 0 1 0 0 | 2 ]
[ 0 0 0 0 | 0 ]

Then the solution is y = 2 from the second row and x = -1 + z - w from the first. Thus letting z = s and w = t, the solution is given by 1.13.
You can see here that there are two paths to the same correct answer, and either approach may be used. The process which we first used in the above solution is called Gaussian Elimination. This process involves carrying the matrix to row-echelon form, converting back to equations, and using back substitution to find the solution. When you do row operations until you obtain reduced row-echelon form, the process is called Gauss-Jordan Elimination.
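For Example 1.23, Gauss-Jordan elimination by machine (a sympy sketch) produces exactly the reduced row-echelon form shown above, from which the two-parameter solution can be read off:

from sympy import Matrix

# Augmented matrix of Example 1.23
M = Matrix([[1, 2, -1, 1, 3],
            [1, 1, -1, 1, 1],
            [1, 3, -1, 1, 5]])
R, pivots = M.rref()
print(R)       # rows: [1, 0, -1, 1, -1], [0, 1, 0, 0, 2], [0, 0, 0, 0, 0]
print(pivots)  # (0, 1): x and y are basic, z and w are free parameters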
We have now found solutions for systems of equations with no solution and with infinitely many solutions, involving one parameter as well as two parameters. Recall the three types of solution sets which we discussed in the previous section: no solution, one solution, and infinitely many solutions. Each of these types of solutions could be identified from the graph of the system. It turns out that we can also identify the type of solution from the reduced row-echelon form of the augmented matrix.
No Solution: In the case where the system of equations has no solution, the row-echelon form of the augmented matrix will have a row of the form

[ 0 0 0 | 1 ]

This row indicates that the system is inconsistent and has no solution.

One Solution: In the case where the system of equations has one solution, every column of the coefficient matrix is a pivot column. The following is an example of an augmented matrix in reduced row-echelon form for a system of equations with one solution.

[ 1 0 0 | 5 ]
[ 0 1 0 | 0 ]
[ 0 0 1 | 2 ]

Infinitely Many Solutions: In the case where the system of equations has infinitely many solutions, the solution contains parameters. There will be columns of the coefficient matrix which are not pivot columns. The following are examples of augmented matrices in reduced row-echelon form for systems of equations with infinitely many solutions.

[ 1 0 0 | 5 ]
[ 0 1 2 | 3 ]
[ 0 0 0 | 0 ]

or

[ 1 0 0 | 5 ]
[ 0 1 0 | 3 ]
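These three cases can also be detected mechanically from the reduced row-echelon form. The short Python sketch below (the function name and structure are illustrative only) classifies a system by inspecting the pivot columns:

from sympy import Matrix

def classify(augmented):
    """Classify a linear system given its augmented matrix (list of rows)."""
    M = Matrix(augmented)
    R, pivots = M.rref()
    n_vars = M.cols - 1            # the last column holds the constants
    if n_vars in pivots:           # a row of the form [0 ... 0 | 1]
        return "no solution"
    if len(pivots) == n_vars:      # every variable column is a pivot column
        return "one solution"
    return "infinitely many solutions"

print(classify([[1, 0, 0, 5], [0, 1, 0, 0], [0, 0, 1, 2]]))  # one solution
print(classify([[1, 0, 0, 5], [0, 1, 2, 3], [0, 0, 0, 0]]))  # infinitely many solutions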
1.2.3. Uniqueness of the Reduced Row-Echelon Form
As we have seen in earlier sections, we know that every matrix can be brought into reduced row-echelon form by a sequence of elementary row operations. Here we will prove that the resulting matrix is unique; in other words, the resulting matrix in reduced row-echelon form does not depend upon the particular sequence of elementary row operations or the order in which they were performed.
Let A be the augmented matrix of a homogeneous system of linear equations in the variables x_1, x_2, ..., x_n which is also in reduced row-echelon form. The matrix A divides the set of variables into two different types. We say that x_i is a basic variable whenever A has a leading 1 in column number i, in other words, when column i is a pivot column. Otherwise we say that x_i is a free variable. All solutions can be written in terms of the free variables. In such a description, the free variables can take any values (they become parameters), while the basic variables become simple linear functions of these parameters. Indeed, a basic variable x_i is a linear function of only those free variables x_j with j > i. This leads to the following observation.
Proposition 1.24: Basic and Free Variables
If x_i is a basic variable of a homogeneous system of linear equations, then any solution of the system with x_j = 0 for all those free variables x_j with j > i must also have x_i = 0.
Using this proposition, we prove a lemma which will be used in the proof of the main
result of this section below.
Lemma 1.25: Solutions and the Reduced Row-Echelon Form of a Matrix
Let A and B be two distinct augmented matrices for two homogeneous systems of m
equations in n variables, such that A and B are each in reduced row-echelon form.
Then, the two systems do not have exactly the same solutions.
Proof: With respect to the linear systems associated with the matrices A and B, there are two cases to consider:

Case 1: the two systems have the same basic variables
Case 2: the two systems do not have the same basic variables

In case 1, the two matrices will have exactly the same pivot positions. However, since A and B are not identical, there is some row of A which is different from the corresponding row of B, and yet the rows each have a pivot in the same column position. Let i be the index of this column position. Since the matrices are in reduced row-echelon form, the two rows must differ at some entry in a column j > i. Let these entries be a in A and b in B, where a != b. Since A is in reduced row-echelon form, if x_j were a basic variable for its linear system, we would have a = 0. Similarly, if x_j were a basic variable for the linear system of the matrix B, we would have b = 0. Since a and b are unequal, they cannot both be equal to 0, and hence x_j cannot be a basic variable for both linear systems. However, since the systems have the same basic variables, x_j must then be a free variable for each system. We now look at the solutions of the systems in which x_j is set equal to 1 and all other free variables are set equal to 0. For this choice of parameters, the solution of the system for matrix A has x_i = -a, while the solution of the system for matrix B has x_i = -b, so that the two systems have different solutions.
In case 2, there is a variable x_i which is a basic variable for one matrix, let's say A, and a free variable for the other matrix B. The system for matrix B has a solution in which x_i = 1 and x_j = 0 for all other free variables x_j. However, by Proposition 1.24 this cannot be a solution of the system for the matrix A. This completes the proof of case 2.
Now, we say that the matrix B is equivalent to the matrix A provided that B can be
obtained from A by performing a sequence of elementary row operations beginning with A.
The importance of this concept lies in the following result.
Theorem 1.26: Equivalent Matrices
The two linear systems of equations corresponding to two equivalent augmented ma-
trices have exactly the same solutions.
The proof of this theorem is left as an exercise.
Now, we can use Lemma 1.25 and Theorem 1.26 to prove the main result of this section.
Theorem 1.27: Uniqueness of the Reduced Row-Echelon Form
Every matrix A is equivalent to a unique matrix in reduced row-echelon form.
Proof: Let A be an m x n matrix and let B and C be matrices in reduced row-echelon form, each equivalent to A. It suffices to show that B = C.
Let A^+ be the matrix A augmented with a new rightmost column consisting entirely of zeros. Similarly, augment matrices B and C each with a rightmost column of zeros to obtain B^+ and C^+. Note that B^+ and C^+ are matrices in reduced row-echelon form which are obtained from A^+ by respectively applying the same sequence of elementary row operations which were used to obtain B and C from A.
Now, A^+, B^+, and C^+ can all be considered as augmented matrices of homogeneous linear systems in the variables x_1, x_2, ..., x_n. Because B^+ and C^+ are each equivalent to A^+, Theorem 1.26 ensures that all three homogeneous linear systems have exactly the same solutions. By Lemma 1.25 we conclude that B^+ = C^+. By construction, we must also have B = C.
According to this theorem we can say that each matrix A has a unique reduced row-echelon form.
1.2.4. Rank and Homogeneous Systems
There is a special type of system which requires additional study. This type of system is called a homogeneous system of equations, which we defined above in Definition 1.3. Our focus in this section is to consider what types of solutions are possible for a homogeneous system of equations.
Consider the following definition.
Definition 1.28: Trivial Solution
Consider the homogeneous system of equations given by

a_{11}x_1 + a_{12}x_2 + ... + a_{1n}x_n = 0
a_{21}x_1 + a_{22}x_2 + ... + a_{2n}x_n = 0
...
a_{m1}x_1 + a_{m2}x_2 + ... + a_{mn}x_n = 0

Then x_1 = 0, x_2 = 0, ..., x_n = 0 is always a solution to this system. We call this the trivial solution.
If the system has a solution in which not all of the x_1, ..., x_n are equal to zero, then we call this solution nontrivial. The trivial solution does not tell us much about the system, as it says that 0 = 0! Therefore, when working with homogeneous systems of equations, we want to know when the system has a nontrivial solution.
Suppose we have a homogeneous system of m equations in n variables, and suppose that n > m. In other words, there are more variables than equations. Then it turns out that this system always has a nontrivial solution. Not only will the system have a nontrivial solution, but it will also have infinitely many solutions. It is also possible, but not required, to have a nontrivial solution if n = m or n < m.
Consider the following example.
Example 1.29: Solutions to a Homogeneous System of Equations
Find the nontrivial solutions to the following homogeneous system of equations

2x + y - z = 0
x + 2y - 2z = 0

Solution. Notice that this system has m = 2 equations and n = 3 variables, so n > m. Therefore by our previous discussion, we expect this system to have infinitely many solutions.
The process we use to find the solutions for a homogeneous system of equations is the same process we used in the previous section. First, we construct the augmented matrix, given by

[ 2 1 -1 | 0 ]
[ 1 2 -2 | 0 ]

Then, we carry this matrix to its reduced row-echelon form, given below.

[ 1 0 0 | 0 ]
[ 0 1 -1 | 0 ]

The corresponding system of equations is

x = 0
y - z = 0

Since z is not restrained by any equation, we know that this variable will become our parameter. Let z = t where t is any number. Therefore, our solution has the form

x = 0
y = z = t
z = t

Hence this system has infinitely many solutions, with one parameter t.

Suppose we were to write the solution to the previous example in another form. Specifically,

x = 0
y = 0 + t
z = 0 + t

can be written as

[ x ]   [ 0 ]     [ 0 ]
[ y ] = [ 0 ] + t [ 1 ]
[ z ]   [ 0 ]     [ 1 ]

Notice that we have constructed a column from the constants in the solution (all equal to 0), as well as a column corresponding to the coefficients on t in each equation. While we will discuss this form of solution more in further chapters, for now consider the column of coefficients of the parameter t. In this case, this is the column

[ 0 ]
[ 1 ]
[ 1 ]
33
There is a special name for this column, which is basic solution. The basic solutions
of a system are columns constructed from the coecients on parameters in the solution.
We often denote basic solutions by X
1
, X
2
etc., depending on how many solutions occur.
Therefore, Example 1.29 has the basic solution X
1
=
_
_
0
1
1
_
_
.
We explore this further in the following example.
Example 1.30: Basic Solutions of a Homogeneous System

Consider the following homogeneous system of equations.
\[
\begin{array}{c}
x + 4y + 3z = 0 \\
3x + 12y + 9z = 0
\end{array}
\]
Find the basic solutions to this system.

Solution. The augmented matrix of this system is
\[ \left[ \begin{array}{ccc|c} 1 & 4 & 3 & 0 \\ 3 & 12 & 9 & 0 \end{array} \right] \]
The reduced row-echelon form of this matrix is
\[ \left[ \begin{array}{ccc|c} 1 & 4 & 3 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right] \]
When written in equations, this system is given by
\[ x + 4y + 3z = 0 \]
Notice that only x corresponds to a pivot column. In this case, we will have two parameters, one for y and one for z. Let y = s and z = t for any numbers s and t. Then, our solution becomes
\[
\begin{array}{c}
x = -4s - 3t \\
y = s \\
z = t
\end{array}
\]
which can be written as
\[
\left[ \begin{array}{c} x \\ y \\ z \end{array} \right] = \left[ \begin{array}{c} 0 \\ 0 \\ 0 \end{array} \right] + s \left[ \begin{array}{c} -4 \\ 1 \\ 0 \end{array} \right] + t \left[ \begin{array}{c} -3 \\ 0 \\ 1 \end{array} \right]
\]
You can see here that we have two columns of coefficients corresponding to parameters, specifically one for s and one for t. Therefore, this system has two basic solutions! These are
\[
X_1 = \left[ \begin{array}{c} -4 \\ 1 \\ 0 \end{array} \right], \quad X_2 = \left[ \begin{array}{c} -3 \\ 0 \\ 1 \end{array} \right]
\]
We now present a new definition.

Definition 1.31: Linear Combination

Let $X_1, \ldots, X_n, V$ be column matrices. Then V is said to be a linear combination of the columns $X_1, \ldots, X_n$ if there exist scalars $a_1, \ldots, a_n$ such that
\[ V = a_1 X_1 + \cdots + a_n X_n \]
A remarkable result of this section is that a linear combination of the basic solutions is again a solution to the system. Even more remarkable is that every solution can be written as a linear combination of these solutions. Therefore, if we take a linear combination of the two solutions to Example 1.30, this would also be a solution. For example, we could take the following linear combination
\[
3 \left[ \begin{array}{c} -4 \\ 1 \\ 0 \end{array} \right] + 2 \left[ \begin{array}{c} -3 \\ 0 \\ 1 \end{array} \right] = \left[ \begin{array}{c} -18 \\ 3 \\ 2 \end{array} \right]
\]
You should take a moment to verify that
\[
\left[ \begin{array}{c} x \\ y \\ z \end{array} \right] = \left[ \begin{array}{c} -18 \\ 3 \\ 2 \end{array} \right]
\]
is in fact a solution to the system in Example 1.30.
Another way in which we can find out more information about the solutions of a homogeneous system is to consider the rank of the associated coefficient matrix. We now define what is meant by the rank of a matrix.

Definition 1.32: Rank of a Matrix

Let A be a matrix and consider any row-echelon form of A. Then, the number r of leading entries of A does not depend on the row-echelon form you choose, and is called the rank of A.
Similarly, we could count the number of pivot positions (or pivot columns) to determine
the rank of A.
Example 1.33: Finding the Rank of a Matrix

Consider the matrix
\[ A = \left[ \begin{array}{ccc} 1 & 2 & 3 \\ 1 & 5 & 9 \\ 2 & 4 & 6 \end{array} \right] \]
What is its rank?

Solution. First, we need to find the reduced row-echelon form of A. Through the usual algorithm, we find that this is
\[ \left[ \begin{array}{ccc} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{array} \right] \]
Here we have two leading entries, or two pivot positions: the leading ones in the first and second rows. Hence, the rank of A is r = 2.
Notice that we would have achieved the same answer if we had found the row-echelon
form of A instead of the reduced row-echelon form.
Suppose we have a homogeneous system of m equations in n variables, and suppose that n > m. From our above discussion, we know that this system will have infinitely many solutions. If we consider the rank of the coefficient matrix of this system, we can find out even more about the solution. Note that we are looking at just the coefficient matrix, not the entire augmented matrix.

Theorem 1.34: Rank and Solutions to a Homogeneous System

Let A be the $m \times n$ coefficient matrix corresponding to a homogeneous system of equations, and suppose A has rank r. Then, the solution to the corresponding system has $n - r$ parameters.

Consider our above Example 1.30 in the context of this theorem. The system in this example has m = 2 equations in n = 3 variables. First, because n > m, we know that the system has a nontrivial solution, and therefore infinitely many solutions. This tells us that the solution will contain at least one parameter. The rank of the coefficient matrix can tell us even more about the solution! The rank of the coefficient matrix of the system is 1, as it has one leading entry in row-echelon form. Theorem 1.34 tells us that the solution will have $n - r = 3 - 1 = 2$ parameters. You can check that this is true in the solution to Example 1.30.

Notice that if n = m or n < m, it is possible to have either a unique solution (which will be the trivial solution) or infinitely many solutions.
We are not limited to homogeneous systems of equations here. The rank of a matrix
can be used to learn about the solutions of any system of linear equations. In the previous
section, we discussed that a system of equations can have no solution, a unique solution,
or infinitely many solutions. Suppose the system is consistent, whether it is homogeneous
or not. The following theorem tells us how we can use the rank to learn about the type of
solution we have.
Theorem 1.35: Rank and Solutions to a Consistent System of Equations

Let A be the $m \times (n + 1)$ augmented matrix corresponding to a consistent system of equations in n variables, and suppose A has rank r. Then

1. the system has a unique solution if r = n

2. the system has infinitely many solutions if r < n

We will not present a formal proof of this, but consider the following discussions.

1. No Solution. The above theorem assumes that the system is consistent, that is, that it has a solution. It turns out that it is possible for the augmented matrix of a system with no solution to have any rank r as long as r > 1. Therefore, we must know that the system is consistent in order to use this theorem!

2. Unique Solution. Suppose r = n. Then, there is a pivot position in every column of the coefficient matrix of A. Hence, there is a unique solution.

3. Infinitely Many Solutions. Suppose r < n. Then there are infinitely many solutions. There are fewer pivot positions (and hence fewer leading entries) than columns, meaning that not every column is a pivot column. The columns which are not pivot columns correspond to parameters. In fact, in this case we have $n - r$ parameters.
1.3 Exercises
1. Find the point $(x_1, y_1)$ which lies on both lines, $x + 3y = 1$ and $4x - y = 3$.

2. Solve Problem 1 graphically. That is, graph each line and see where they intersect.

3. Find the point of intersection of the two lines $3x + y = 3$ and $x + 2y = 1$.
2. Solve Problem 1 graphically. That is, graph each line and see where they intersect.
3. Find the point of intersection of the two lines 3x + y = 3 and x + 2y = 1.
4. Solve Problem 3 graphically. That is, graph each line and see where they intersect.
5. Do the three lines, $x + 2y = 1$, $2x - y = 1$, and $4x + 3y = 3$ have a common point of intersection? If so, find the point and if not, tell why they don't have such a common point of intersection.
6. Do the three planes, $x + y - 3z = 2$, $2x + y + z = 1$, and $3x + 2y - 2z = 0$ have a common point of intersection? If so, find one and if not, tell why there is no such point.
7. You have a system of k equations in two variables, $k \geq 2$. Explain the geometric significance of

(a) No solution.

(b) A unique solution.

(c) An infinite number of solutions.
8. Consider the following augmented matrix in which ∗ denotes an arbitrary number and ■ denotes a nonzero number. Determine whether the given augmented matrix is consistent. If consistent, is the solution unique?
\[ \left[ \begin{array}{cccc|c} ■ & ∗ & ∗ & ∗ & ∗ \\ 0 & ■ & ∗ & 0 & ∗ \\ 0 & 0 & ■ & ∗ & ∗ \\ 0 & 0 & 0 & 0 & ■ \end{array} \right] \]
9. Consider the following augmented matrix in which ∗ denotes an arbitrary number and ■ denotes a nonzero number. Determine whether the given augmented matrix is consistent. If consistent, is the solution unique?
\[ \left[ \begin{array}{ccc|c} ■ & ∗ & ∗ & ∗ \\ 0 & ■ & ∗ & ∗ \\ 0 & 0 & ■ & ∗ \end{array} \right] \]
10. Consider the following augmented matrix in which ∗ denotes an arbitrary number and ■ denotes a nonzero number. Determine whether the given augmented matrix is consistent. If consistent, is the solution unique?
\[ \left[ \begin{array}{cccc|c} ■ & ∗ & ∗ & ∗ & ∗ \\ 0 & ■ & 0 & 0 & ∗ \\ 0 & 0 & ■ & 0 & ∗ \\ 0 & 0 & 0 & 0 & ■ \end{array} \right] \]
11. Consider the following augmented matrix in which ∗ denotes an arbitrary number and ■ denotes a nonzero number. Determine whether the given augmented matrix is consistent. If consistent, is the solution unique?
\[ \left[ \begin{array}{cccc|c} ■ & ∗ & ∗ & ∗ & ∗ \\ 0 & ■ & ∗ & 0 & ∗ \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & ■ \end{array} \right] \]
12. Suppose a system of equations has fewer equations than variables. Must such a system
be consistent? If so, explain why and if not, give an example which is not consistent.
13. If a system of equations has more equations than variables, can it have a solution? If
so, give an example and if not, tell why not.
14. Find h such that
\[ \left[ \begin{array}{cc|c} 2 & h & 4 \\ 3 & 6 & 7 \end{array} \right] \]
is the augmented matrix of an inconsistent system.
15. Find h such that
\[ \left[ \begin{array}{cc|c} 1 & h & 3 \\ 2 & 4 & 6 \end{array} \right] \]
is the augmented matrix of a consistent system.
16. Find h such that
\[ \left[ \begin{array}{cc|c} 1 & 1 & 4 \\ 3 & h & 12 \end{array} \right] \]
is the augmented matrix of a consistent system.
17. Choose h and k such that the augmented matrix shown has each of the following:

(a) one solution

(b) no solution

(c) infinitely many solutions

\[ \left[ \begin{array}{cc|c} 1 & h & 2 \\ 2 & 4 & k \end{array} \right] \]
18. Choose h and k such that the augmented matrix shown has each of the following:

(a) one solution

(b) no solution

(c) infinitely many solutions

\[ \left[ \begin{array}{cc|c} 1 & 2 & 2 \\ 2 & h & k \end{array} \right] \]
19. Determine if the system is consistent. If so, is the solution unique?
\[
\begin{array}{c}
x + 2y + z - w = 2 \\
x - y + z + w = 1 \\
2x + y - z = 1 \\
4x + 2y + z = 5
\end{array}
\]
20. Determine if the system is consistent. If so, is the solution unique?
\[
\begin{array}{c}
x + 2y + z - w = 2 \\
x - y + z + w = 0 \\
2x + y - z = 1 \\
4x + 2y + z = 3
\end{array}
\]
21. Determine which matrices are in reduced row-echelon form.

(a) \( \left[ \begin{array}{ccc} 1 & 2 & 0 \\ 0 & 1 & 7 \end{array} \right] \)

(b) \( \left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{array} \right] \)

(c) \( \left[ \begin{array}{cccccc} 1 & 1 & 0 & 0 & 0 & 5 \\ 0 & 0 & 1 & 2 & 0 & 4 \\ 0 & 0 & 0 & 0 & 1 & 3 \end{array} \right] \)
22. Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain the reduced row-echelon form.
\[ \left[ \begin{array}{cccc} 2 & 1 & 3 & 1 \\ 1 & 0 & 2 & 1 \\ 1 & 1 & 1 & 2 \end{array} \right] \]

23. Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain the reduced row-echelon form.
\[ \left[ \begin{array}{cccc} 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{array} \right] \]

24. Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain the reduced row-echelon form.
\[ \left[ \begin{array}{cccc} 3 & 6 & 7 & 8 \\ 1 & 2 & 2 & 2 \\ 1 & 2 & 3 & 4 \end{array} \right] \]

25. Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain the reduced row-echelon form.
\[ \left[ \begin{array}{cccc} 2 & 4 & 5 & 15 \\ 1 & 2 & 3 & 9 \\ 1 & 2 & 2 & 6 \end{array} \right] \]

26. Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain the reduced row-echelon form.
\[ \left[ \begin{array}{cccc} 4 & 1 & 7 & 10 \\ 1 & 0 & 3 & 3 \\ 1 & 1 & 2 & 1 \end{array} \right] \]

27. Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain the reduced row-echelon form.
\[ \left[ \begin{array}{cccc} 3 & 5 & 4 & 2 \\ 1 & 2 & 1 & 1 \\ 1 & 1 & 2 & 0 \end{array} \right] \]

28. Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain the reduced row-echelon form.
\[ \left[ \begin{array}{cccc} 2 & 3 & 8 & 7 \\ 1 & 2 & 5 & 5 \\ 1 & 3 & 7 & 8 \end{array} \right] \]
29. Find the solution of the system whose augmented matrix is
\[ \left[ \begin{array}{ccc|c} 1 & 2 & 0 & 2 \\ 1 & 3 & 4 & 2 \\ 1 & 0 & 2 & 1 \end{array} \right] \]

30. Find the solution of the system whose augmented matrix is
\[ \left[ \begin{array}{ccc|c} 1 & 2 & 0 & 2 \\ 2 & 0 & 1 & 1 \\ 3 & 2 & 1 & 3 \end{array} \right] \]

31. Find the solution of the system whose augmented matrix is
\[ \left[ \begin{array}{ccc|c} 1 & 1 & 0 & 1 \\ 1 & 0 & 4 & 2 \end{array} \right] \]

32. Find the solution of the system whose augmented matrix is
\[ \left[ \begin{array}{ccccc|c} 1 & 0 & 2 & 1 & 1 & 2 \\ 0 & 1 & 0 & 1 & 2 & 1 \\ 1 & 2 & 0 & 0 & 1 & 3 \\ 1 & 0 & 1 & 0 & 2 & 2 \end{array} \right] \]

33. Find the solution of the system whose augmented matrix is
\[ \left[ \begin{array}{ccccc|c} 1 & 0 & 2 & 1 & 1 & 2 \\ 0 & 1 & 0 & 1 & 2 & 1 \\ 0 & 2 & 0 & 0 & 1 & 3 \\ 1 & 1 & 2 & 2 & 2 & 0 \end{array} \right] \]
34. Find the solution to the system of equations, 7x + 14y + 15z = 22, 2x + 4y + 3z = 5,
and 3x + 6y + 10z = 13.
35. Find the solution to the system of equations, $3x - y + 4z = 6$, $y + 8z = 0$, and $2x + y = 4$.

36. Find the solution to the system of equations, $9x - 2y + 4z = 17$, $13x - 3y + 6z = 25$, and $2x - z = 3$.
37. Find the solution to the system of equations, 65x+84y+16z = 546, 81x+105y+20z =
682, and 84x + 110y + 21z = 713.
38. Find the solution to the system of equations, 8x + 2y + 3z = 3, 8x + 3y + 3z = 1,
and 4x + y + 3z = 9.
39. Find the solution to the system of equations, 8x+2y +5z = 18, 8x+3y +5z = 13,
and 4x + y + 5z = 19.
40. Find the solution to the system of equations, $3x - y - 2z = 3$, $y - 4z = 0$, and $2x + y = 2$.
41. Find the solution to the system of equations, 9x + 15y = 66, 11x + 18y = 79,
x + y = 4, and z = 3.
42. Find the solution to the system of equations, 19x+8y = 108, 71x +30y = 404,
2x + y = 12, 4x + z = 14.
43. Find the rank of the following matrix.
\[ \left[ \begin{array}{cccc} 4 & 16 & 1 & 5 \\ 1 & 4 & 0 & 1 \\ 1 & 4 & 1 & 2 \end{array} \right] \]

44. Find the rank of the following matrix.
\[ \left[ \begin{array}{cccc} 3 & 6 & 5 & 12 \\ 1 & 2 & 2 & 5 \\ 1 & 2 & 1 & 2 \end{array} \right] \]

45. Find the rank of the following matrix.
\[ \left[ \begin{array}{ccccc} 0 & 0 & 1 & 0 & 3 \\ 1 & 4 & 1 & 0 & 8 \\ 1 & 4 & 0 & 1 & 2 \\ 1 & 4 & 0 & 1 & 2 \end{array} \right] \]

46. Find the rank of the following matrix.
\[ \left[ \begin{array}{cccc} 4 & 4 & 3 & 9 \\ 1 & 1 & 1 & 2 \\ 1 & 1 & 0 & 3 \end{array} \right] \]

47. Find the rank of the following matrix.
\[ \left[ \begin{array}{ccccc} 2 & 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 7 \\ 1 & 0 & 0 & 1 & 7 \end{array} \right] \]

48. Find the rank of the following matrix.
\[ \left[ \begin{array}{ccc} 4 & 15 & 29 \\ 1 & 4 & 8 \\ 1 & 3 & 5 \\ 3 & 9 & 15 \end{array} \right] \]

49. Find the rank of the following matrix.
\[ \left[ \begin{array}{ccccc} 0 & 0 & 1 & 0 & 1 \\ 1 & 2 & 3 & 2 & 18 \\ 1 & 2 & 2 & 1 & 11 \\ 1 & 2 & 2 & 1 & 11 \end{array} \right] \]

50. Find the rank of the following matrix.
\[ \left[ \begin{array}{ccccc} 1 & 2 & 0 & 3 & 11 \\ 1 & 2 & 0 & 4 & 15 \\ 1 & 2 & 0 & 3 & 11 \\ 0 & 0 & 0 & 0 & 0 \end{array} \right] \]

51. Find the rank of the following matrix.
\[ \left[ \begin{array}{ccc} 2 & 3 & 2 \\ 1 & 1 & 1 \\ 1 & 0 & 1 \\ 3 & 0 & 3 \end{array} \right] \]

52. Find the rank of the following matrix.
\[ \left[ \begin{array}{ccccc} 4 & 4 & 20 & 1 & 17 \\ 1 & 1 & 5 & 0 & 5 \\ 1 & 1 & 5 & 1 & 2 \\ 3 & 3 & 15 & 3 & 6 \end{array} \right] \]

53. Find the rank of the following matrix.
\[ \left[ \begin{array}{ccccc} 1 & 3 & 4 & 3 & 8 \\ 1 & 3 & 4 & 2 & 5 \\ 1 & 3 & 4 & 1 & 2 \\ 2 & 6 & 8 & 2 & 4 \end{array} \right] \]
54. Suppose A is an $m \times n$ matrix. Explain why the rank of A is always no larger than min(m, n).
55. Suppose a system of equations has fewer equations than variables and you have found
a solution to this system of equations. Is it possible that your solution is the only one?
Explain.
56. Suppose a system of linear equations has a $2 \times 4$ augmented matrix and the last column is a pivot column. Could the system of linear equations be consistent? Explain.
57. Suppose the coefficient matrix of a system of n equations with n variables has the
property that every column is a pivot column. Does it follow that the system of
equations must have a solution? If so, must the solution be unique? Explain.
58. Suppose there is a unique solution to a system of linear equations. What must be true
of the pivot columns in the augmented matrix?
59. State whether each of the following sets of data are possible for the matrix equation AX = B. If possible, describe the solution set. That is, tell whether there exists a unique solution, no solution, or infinitely many solutions. Here, [A|B] denotes the augmented matrix.

(a) A is a $5 \times 6$ matrix, rank(A) = 4 and rank[A|B] = 4.

(b) A is a $3 \times 4$ matrix, rank(A) = 3 and rank[A|B] = 2.

(c) A is a $4 \times 2$ matrix, rank(A) = 4 and rank[A|B] = 4.

(d) A is a $5 \times 5$ matrix, rank(A) = 4 and rank[A|B] = 5.

(e) A is a $4 \times 2$ matrix, rank(A) = 2 and rank[A|B] = 2.
60. Consider the system $5x + 2y - z = 0$ and $5x - 2y - z = 0$. Both equations equal zero and so $5x + 2y - z = 5x - 2y - z$ which is equivalent to y = 0. Thus x and z can equal anything. But when x = 1, z = 4, and y = 0 are plugged in to the equations, the equations do not equal 0. Why?
61. Four times the weight of Gaston is 150 pounds more than the weight of Ichabod.
Four times the weight of Ichabod is 660 pounds less than seventeen times the weight
of Gaston. Four times the weight of Gaston plus the weight of Siegfried equals 290
pounds. Brunhilde would balance all three of the others. Find the weights of the four
people.
62. The steady state temperature, u, of a plate solves Laplace's equation, $\Delta u = 0$. One way to approximate the solution is to divide the plate into a square mesh and require the temperature at each node to equal the average of the temperature at the four adjacent nodes. This procedure is justified by the mean value property of harmonic functions. In the following picture, the numbers represent the observed temperature at the indicated nodes. Find the temperature at the interior nodes, indicated by x, y, z, and w. One of the equations is $z = \frac{1}{4}(10 + 0 + w + x)$.

[Figure: a square mesh whose four interior nodes are labelled x, y, z, and w, with observed boundary temperatures of 10, 10, 20, 20, 30, 30, 0, and 0 at the surrounding nodes.]
63. Consider the following diagram of four circuits.

[Figure: four circuit loops carrying currents $I_1, I_2, I_3, I_4$, with voltage sources of 10 volts, 5 volts, and 20 volts, and resistors of 3 Ω, 2 Ω, 4 Ω, 1 Ω, 6 Ω, 2 Ω, 5 Ω, 1 Ω, 3 Ω, and 1 Ω around the loops.]

Those jagged places denote resistors and the numbers next to them give their resistance in ohms, written as Ω. The breaks in the lines having one short line and one long line denote a voltage source which causes the current to flow in the direction which goes from the longer of the two lines toward the shorter along the unbroken part of the circuit. The current in amps in the four circuits is denoted by $I_1, I_2, I_3, I_4$ and it is understood that the motion is in the counter clockwise direction. If $I_k$ ends up being negative, then it just means the current flows in the clockwise direction. Then Kirchhoff's law states that

The sum of the resistance times the amps in the counter clockwise direction around a loop equals the sum of the voltage sources in the same direction around the loop.

In the above diagram, the top left circuit should give the equation
\[ 2I_2 - 2I_1 + 5I_2 - 5I_3 + 3I_2 = 5 \]
For the circuit on the lower left, you should have
\[ 4I_1 + I_1 - I_4 + 2I_1 - 2I_2 = 10 \]
Write equations for each of the other two circuits and then give a solution to the resulting system of equations.
64. Consider the following diagram of three circuits.

[Figure: three circuit loops carrying currents $I_1, I_2, I_3$, with voltage sources of 10 volts and 12 volts, and resistors of 3 Ω, 2 Ω, 7 Ω, 1 Ω, 4 Ω, 5 Ω, 3 Ω, 4 Ω, and 2 Ω around the loops.]

Those jagged places denote resistors and the numbers next to them give their resistance in ohms, written as Ω. The breaks in the lines having one short line and one long line denote a voltage source which causes the current to flow in the direction which goes from the longer of the two lines toward the shorter along the unbroken part of the circuit. The current in amps in the three circuits is denoted by $I_1, I_2, I_3$ and it is understood that the motion is in the counter clockwise direction. If $I_k$ ends up being negative, then it just means the current flows in the clockwise direction. Then Kirchhoff's law states that

The sum of the resistance times the amps in the counter clockwise direction around a loop equals the sum of the voltage sources in the same direction around the loop.

Find $I_1, I_2, I_3$.
2. Matrices
Outcomes
A. Perform the matrix operations of matrix addition, scalar multiplication, transposition
and matrix multiplication. Identify when these operations are not defined. Represent
these operations in terms of the entries of a matrix.
B. Prove algebraic properties for matrix addition, scalar multiplication, transposition, and
matrix multiplication. Apply these properties to manipulate an algebraic expression
involving matrices.
C. Compute the inverse of a matrix using row operations, and prove identities involving
matrix inverses.
E. Solve a linear system using matrix algebra.
F. Find the matrix of a linear transformation, and determine if this matrix is one to one
or onto.
G. Use linear transformations to find the general solution to a linear system.
2.1 Matrix Arithmetic
You have now solved systems of equations by writing them in terms of an augmented matrix
and then doing row operations on this augmented matrix. It turns out that matrices are
important not only for systems of equations but also in many applications.
Recall that a matrix is a rectangular array of numbers. Several of them are referred to as matrices. For example, here is a matrix.
\[ \left[ \begin{array}{cccc} 1 & 2 & 3 & 4 \\ 5 & 2 & 8 & 7 \\ 6 & -9 & 1 & 2 \end{array} \right] \tag{2.1} \]
Recall that the size or dimension of a matrix is defined as $m \times n$ where m is the number of rows and n is the number of columns. The above matrix is a $3 \times 4$ matrix because there are three rows and four columns. You can remember the columns are like columns in a Greek temple. They stand upright while the rows just lay there like rows made by a tractor in a plowed field.
When specifying the size of a matrix, you always list the number of rows before the number of columns. You might remember that you always list the rows before the columns by using the phrase Rowman Catholic.
Consider the following definition.

Definition 2.1: Square Matrix

A matrix A which has size $n \times n$ is called a square matrix. In other words, A is a square matrix if it has the same number of rows and columns.
There is some notation specific to matrices which we now introduce. We denote the columns of a matrix A by $A_j$ as follows
\[ A = \left[ \begin{array}{cccc} A_1 & A_2 & \cdots & A_n \end{array} \right] \]
Therefore, $A_j$ is the $j^{th}$ column of A, when counted from left to right.

The individual elements of the matrix are called entries or components of A. Elements of the matrix are identified according to their position. The (i, j)-entry of a matrix is the entry in the $i^{th}$ row and $j^{th}$ column. For example, in the matrix 2.1 above, 8 is in position (2, 3) because it is in the second row and the third column.

In order to remember which matrix we are speaking of, we will denote the entry in the $i^{th}$ row and the $j^{th}$ column of matrix A by $A_{ij}$. Then, we can write A in terms of its entries, as $A = [A_{ij}]$. Using this notation on the matrix in 2.1, $A_{23} = 8$, $A_{32} = -9$, $A_{12} = 2$, etc.
There are various operations which are done on matrices of appropriate sizes. Matrices
can be added to and subtracted from other matrices, multiplied by a scalar, and multiplied
by other matrices. We will never divide a matrix by another matrix, but we will see later
how matrix inverses play a similar role.
In doing arithmetic with matrices, we often define the action by what happens in terms of the entries (or components) of the matrices. Before looking at these operations in depth, consider a few general definitions.
Definition 2.2: The Zero Matrix

The $m \times n$ zero matrix is the $m \times n$ matrix having every entry equal to zero. It is denoted by 0.

One possible zero matrix is shown in the following example.

Example 2.3: The Zero Matrix

The $2 \times 3$ zero matrix is $0 = \left[ \begin{array}{ccc} 0 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right]$.

Note there is a $2 \times 3$ zero matrix, a $3 \times 4$ zero matrix, etc. In fact there is a zero matrix for every size!
Definition 2.4: Equality of Matrices

Let A and B be two matrices. Then A = B means that the two matrices are of the same size and for $A = [A_{ij}]$ and $B = [B_{ij}]$, $A_{ij} = B_{ij}$ for all $1 \leq i \leq m$ and $1 \leq j \leq n$.

In other words, two matrices are equal exactly when they are the same size and the corresponding entries are identical. Thus
\[ \left[ \begin{array}{cc} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{array} \right] \neq \left[ \begin{array}{cc} 0 & 0 \\ 0 & 0 \end{array} \right] \]
because they are different sizes. Also,
\[ \left[ \begin{array}{cc} 0 & 1 \\ 3 & 2 \end{array} \right] \neq \left[ \begin{array}{cc} 1 & 0 \\ 2 & 3 \end{array} \right] \]
because, although they are the same size, their corresponding entries are not identical.
In the following section, we explore addition of matrices.
2.1.1. Addition of Matrices
When adding matrices, all matrices in the sum need to have the same size. For example,
\[ \left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \\ 5 & 2 \end{array} \right] \text{ and } \left[ \begin{array}{ccc} 1 & 4 & 8 \\ 2 & 8 & 5 \end{array} \right] \]
cannot be added, as one has size $3 \times 2$ while the other has size $2 \times 3$.

However, the addition
\[ \left[ \begin{array}{ccc} 4 & 6 & 3 \\ 5 & 0 & 4 \\ 11 & 2 & 3 \end{array} \right] + \left[ \begin{array}{ccc} 0 & 5 & 0 \\ 4 & 4 & 14 \\ 1 & 2 & 6 \end{array} \right] \]
is possible.

The formal definition is as follows.
Definition 2.5: Addition of Matrices

Let $A = [A_{ij}]$ and $B = [B_{ij}]$ be two $m \times n$ matrices. Then $A + B = C$ where C is the $m \times n$ matrix $C = [C_{ij}]$ defined by
\[ C_{ij} = A_{ij} + B_{ij} \]
This definition tells us that when adding matrices, we simply add corresponding entries
of the matrices. This is demonstrated in the next example.
Example 2.6: Addition of Matrices of Same Size

Add the following matrices, if possible.
\[ A = \left[ \begin{array}{ccc} 1 & 2 & 3 \\ 1 & 0 & 4 \end{array} \right], \quad B = \left[ \begin{array}{ccc} 5 & 2 & 3 \\ -6 & 2 & 1 \end{array} \right] \]

Solution. Notice that both A and B are of size $2 \times 3$. Since A and B are of the same size, the addition is possible. Using Definition 2.5, the addition is done as follows.
\[ A + B = \left[ \begin{array}{ccc} 1 & 2 & 3 \\ 1 & 0 & 4 \end{array} \right] + \left[ \begin{array}{ccc} 5 & 2 & 3 \\ -6 & 2 & 1 \end{array} \right] = \left[ \begin{array}{ccc} 1+5 & 2+2 & 3+3 \\ 1+(-6) & 0+2 & 4+1 \end{array} \right] = \left[ \begin{array}{ccc} 6 & 4 & 6 \\ -5 & 2 & 5 \end{array} \right] \]
Addition of matrices obeys very much the same properties as normal addition with num-
bers. Note that when we write for example A + B then we assume that both matrices are
of equal size so that the operation is indeed possible.
Proposition 2.7: Properties of Matrix Addition

Let A, B and C be matrices. Then, the following properties hold.

• Commutative Law of Addition
\[ A + B = B + A \tag{2.2} \]

• Associative Law of Addition
\[ (A + B) + C = A + (B + C) \tag{2.3} \]

• Existence of an Additive Identity: There exists a zero matrix 0 such that
\[ A + 0 = A \tag{2.4} \]

• Existence of an Additive Inverse: There exists a matrix $-A$ such that
\[ A + (-A) = 0 \tag{2.5} \]
Proof: Consider the Commutative Law of Addition given in 2.2. Let A, B, C, and D be matrices such that A + B = C and B + A = D. We want to show that D = C. To do so, we will use the definition of matrix addition given in Definition 2.5. Now,
\[ C_{ij} = A_{ij} + B_{ij} = B_{ij} + A_{ij} = D_{ij} \]
Therefore, C = D because the $ij^{th}$ entries are the same for all i and j. Note that the conclusion follows from the commutative law of addition of numbers, which says that if a and b are two numbers, then a + b = b + a. The proofs of the other results are similar, and are left as an exercise. ∎
We call the zero matrix in 2.4 the additive identity. Similarly, we call the matrix $-A$ in 2.5 the additive inverse. $-A$ is defined to equal $(-1)A = [-A_{ij}]$. In other words, every entry of A is multiplied by $-1$. In the next section we will study scalar multiplication in more depth to understand what is meant by $(-1)A$.
2.1.2. Scalar Multiplication of Matrices
Recall that we use the word scalar when referring to numbers. Therefore, scalar multiplication of a matrix is the multiplication of a matrix by a number. To illustrate this concept, consider the following example in which a matrix is multiplied by the scalar 3.
\[ 3 \left[ \begin{array}{cccc} 1 & 2 & 3 & 4 \\ 5 & 2 & 8 & 7 \\ 6 & -9 & 1 & 2 \end{array} \right] = \left[ \begin{array}{cccc} 3 & 6 & 9 & 12 \\ 15 & 6 & 24 & 21 \\ 18 & -27 & 3 & 6 \end{array} \right] \]
The new matrix is obtained by multiplying every entry of the original matrix by the given scalar.
The formal definition of scalar multiplication is as follows.
Definition 2.8: Scalar Multiplication of Matrices

If $A = [A_{ij}]$ and k is a scalar, then $kA = [kA_{ij}]$.
Consider the following example.
Example 2.9: Effect of Multiplication by a Scalar

Find the result of multiplying the following matrix A by 7.
\[ A = \left[ \begin{array}{cc} 2 & 0 \\ 1 & 4 \end{array} \right] \]

Solution. By Definition 2.8, we multiply each element of A by 7. Therefore,
\[ 7A = 7 \left[ \begin{array}{cc} 2 & 0 \\ 1 & 4 \end{array} \right] = \left[ \begin{array}{cc} 7(2) & 7(0) \\ 7(1) & 7(4) \end{array} \right] = \left[ \begin{array}{cc} 14 & 0 \\ 7 & 28 \end{array} \right] \]
Similarly to addition of matrices, there are several properties of scalar multiplication
which hold.
Proposition 2.10: Properties of Scalar Multiplication

Let A, B be matrices, and k, p be scalars. Then, the following properties hold.

• Distributive Law over Matrix Addition
\[ k(A + B) = kA + kB \]

• Distributive Law over Scalar Addition
\[ (k + p)A = kA + pA \]

• Associative Law for Scalar Multiplication
\[ k(pA) = (kp)A \]

• Rule for Multiplication by 1
\[ 1A = A \]
The proof of this proposition is left as an exercise to the reader.
2.1.3. Multiplication Of Matrices
The next important matrix operation we will explore is multiplication of matrices. The
operation of matrix multiplication is one of the most important and useful of the matrix
operations. Throughout this section, we will also demonstrate how matrix multiplication
relates to linear systems of equations.
First, we provide a formal definition of row and column vectors.

Definition 2.11: Row and Column Vectors

Matrices of size $n \times 1$ or $1 \times n$ are called vectors. If X is such a matrix, then we write $X_i$ to denote the entry of X in the $i^{th}$ row of a column matrix, or the $i^{th}$ column of a row matrix.

The $n \times 1$ matrix
\[ X = \left[ \begin{array}{c} X_1 \\ \vdots \\ X_n \end{array} \right] \]
is called a column vector. The $1 \times n$ matrix
\[ X = \left[ \begin{array}{ccc} X_1 & \cdots & X_n \end{array} \right] \]
is called a row vector.
We may simply use the term vector throughout this text to refer to either a column or
row vector. If we do so, the context will make it clear which we are referring to.
In this chapter, we will again use the notion of linear combination of vectors as in Definition 1.31. In this context, a linear combination is a sum consisting of vectors multiplied by scalars. For example,
\[ \left[ \begin{array}{c} 50 \\ 122 \end{array} \right] = 7 \left[ \begin{array}{c} 1 \\ 4 \end{array} \right] + 8 \left[ \begin{array}{c} 2 \\ 5 \end{array} \right] + 9 \left[ \begin{array}{c} 3 \\ 6 \end{array} \right] \]
is a linear combination of three vectors.
It turns out that we can express any system of linear equations as a linear combination
of vectors. In fact, the vectors that we will use are just the columns of the corresponding
augmented matrix!
Definition 2.12: The Vector Form of a System of Linear Equations

Suppose we have a system of equations given by
\[
\begin{array}{c}
a_{11}x_1 + \cdots + a_{1n}x_n = b_1 \\
\vdots \\
a_{m1}x_1 + \cdots + a_{mn}x_n = b_m
\end{array}
\]
We can express this system in vector form which is as follows:
\[
x_1 \left[ \begin{array}{c} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{array} \right] + x_2 \left[ \begin{array}{c} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{array} \right] + \cdots + x_n \left[ \begin{array}{c} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{array} \right] = \left[ \begin{array}{c} b_1 \\ b_2 \\ \vdots \\ b_m \end{array} \right]
\]
Notice that each vector used here is one column from the corresponding augmented
matrix. There is one vector for each variable in the system, along with the constant vector.
The first important form of matrix multiplication is multiplying a matrix by a vector. Consider the product given by
\[ \left[ \begin{array}{ccc} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array} \right] \left[ \begin{array}{c} 7 \\ 8 \\ 9 \end{array} \right] \]
We will soon see that this equals
\[ 7 \left[ \begin{array}{c} 1 \\ 4 \end{array} \right] + 8 \left[ \begin{array}{c} 2 \\ 5 \end{array} \right] + 9 \left[ \begin{array}{c} 3 \\ 6 \end{array} \right] = \left[ \begin{array}{c} 50 \\ 122 \end{array} \right] \]
In general terms,
\[
\left[ \begin{array}{ccc} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{array} \right] \left[ \begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array} \right] = x_1 \left[ \begin{array}{c} a_{11} \\ a_{21} \end{array} \right] + x_2 \left[ \begin{array}{c} a_{12} \\ a_{22} \end{array} \right] + x_3 \left[ \begin{array}{c} a_{13} \\ a_{23} \end{array} \right] = \left[ \begin{array}{c} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 \end{array} \right]
\]
Thus you take $x_1$ times the first column, add to $x_2$ times the second column, and finally $x_3$ times the third column. The above sum is a linear combination of the columns of the matrix. When you multiply a matrix on the left by a vector on the right, the numbers making up the vector are just the scalars to be used in the linear combination of the columns as illustrated above.

Here is the formal definition of how to multiply an $m \times n$ matrix by an $n \times 1$ column vector.
Definition 2.13: Multiplication of Vector by Matrix

Let $A = [A_{ij}]$ be an $m \times n$ matrix and let X be an $n \times 1$ matrix with
\[ A = \left[ \begin{array}{ccc} A_1 & \cdots & A_n \end{array} \right], \quad X = \left[ \begin{array}{c} X_1 \\ \vdots \\ X_n \end{array} \right] \]
Then AX is the $m \times 1$ column vector which equals the following linear combination of the columns.
\[ X_1 A_1 + X_2 A_2 + \cdots + X_n A_n = \sum_{j=1}^{n} X_j A_j \]

If we write the columns of A in terms of their entries, they are of the form
\[ A_j = \left[ \begin{array}{c} A_{1j} \\ A_{2j} \\ \vdots \\ A_{mj} \end{array} \right] \]
Then, we can write the product AX as
\[
AX = X_1 \left[ \begin{array}{c} A_{11} \\ A_{21} \\ \vdots \\ A_{m1} \end{array} \right] + X_2 \left[ \begin{array}{c} A_{12} \\ A_{22} \\ \vdots \\ A_{m2} \end{array} \right] + \cdots + X_n \left[ \begin{array}{c} A_{1n} \\ A_{2n} \\ \vdots \\ A_{mn} \end{array} \right]
\]
Note that multiplication of an $m \times n$ matrix and an $n \times 1$ vector produces an $m \times 1$ vector.
Here is an example.
Example 2.14: A Vector Multiplied by a Matrix

Compute the product AX for
\[ A = \left[ \begin{array}{cccc} 1 & 2 & 1 & 3 \\ 0 & 2 & 1 & -2 \\ 2 & 1 & 4 & 1 \end{array} \right], \quad X = \left[ \begin{array}{c} 1 \\ 2 \\ 0 \\ 1 \end{array} \right] \]

Solution. We will use Definition 2.13 to compute the product. Therefore, we compute the product AX as follows.
\[
1 \left[ \begin{array}{c} 1 \\ 0 \\ 2 \end{array} \right] + 2 \left[ \begin{array}{c} 2 \\ 2 \\ 1 \end{array} \right] + 0 \left[ \begin{array}{c} 1 \\ 1 \\ 4 \end{array} \right] + 1 \left[ \begin{array}{c} 3 \\ -2 \\ 1 \end{array} \right] = \left[ \begin{array}{c} 1 \\ 0 \\ 2 \end{array} \right] + \left[ \begin{array}{c} 4 \\ 4 \\ 2 \end{array} \right] + \left[ \begin{array}{c} 0 \\ 0 \\ 0 \end{array} \right] + \left[ \begin{array}{c} 3 \\ -2 \\ 1 \end{array} \right] = \left[ \begin{array}{c} 8 \\ 2 \\ 5 \end{array} \right]
\]
Using the above operation, we can also write a system of linear equations in matrix
form. In this form, we express the system as a matrix multiplied by a vector. Consider the
following definition.
Definition 2.15: The Matrix Form of a System of Linear Equations

Suppose we have a system of equations given by
\[
\begin{array}{c}
a_{11}x_1 + \cdots + a_{1n}x_n = b_1 \\
\vdots \\
a_{m1}x_1 + \cdots + a_{mn}x_n = b_m
\end{array}
\]
Then we can express this system in matrix form as follows.
\[
\left[ \begin{array}{cccc} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{array} \right] \left[ \begin{array}{c} x_1 \\ x_2 \\ \vdots \\ x_n \end{array} \right] = \left[ \begin{array}{c} b_1 \\ b_2 \\ \vdots \\ b_m \end{array} \right]
\]
This is also known as the form AX = B. The matrix A is simply the coefficient matrix of
the system. The vector X is the column vector constructed from the variables of the system,
and the vector B is the column vector constructed from the constants of the system. It is
important to note that any system of linear equations can be written in this form.
Notice that if we write a homogeneous system of equations in matrix form, it would have
the form AX = 0, for the zero vector 0.
You can see from this definition that a vector
\[ X = \left[ \begin{array}{c} X_1 \\ X_2 \\ \vdots \\ X_n \end{array} \right] \]
will satisfy the equation AX = B only when the components $X_1, X_2, \ldots, X_n$ of the vector X are solutions to the original system.
Now that we have examined how to multiply a matrix by a vector, we wish to consider the case where we multiply two matrices of more general sizes, although these sizes still need to be appropriate as we will see. For example, in Example 2.14, we multiplied a $3 \times 4$ matrix by a $4 \times 1$ vector. We want to investigate how to multiply other sizes of matrices.

We have not yet given any conditions on when matrix multiplication is possible! For matrices A and B, in order to form the product AB, the number of columns of A must equal the number of rows of B. Consider a product AB where A has size $m \times n$ and B has size $n \times p$. Then, the product in terms of sizes of matrices is given by
\[ (m \times n)\,(n \times p) = m \times p \qquad \text{(the two inside numbers must match!)} \]
Note the two outside numbers give the size of the product. One of the most important rules regarding matrix multiplication is the following. If the two middle numbers don't match, you can't multiply the matrices!
When the number of columns of A equals the number of rows of B the two matrices are
said to be conformable and the product AB is obtained as follows.
Definition 2.16: Multiplication of Two Matrices

Let A be an $m \times n$ matrix and let B be an $n \times p$ matrix of the form
\[ B = \left[ \begin{array}{ccc} B_1 & \cdots & B_p \end{array} \right] \]
where $B_1, \ldots, B_p$ are the $n \times 1$ columns of B. Then the $m \times p$ matrix AB is defined as follows:
\[ AB = A \left[ \begin{array}{ccc} B_1 & \cdots & B_p \end{array} \right] = \left[ \begin{array}{ccc} AB_1 & \cdots & AB_p \end{array} \right] \]
where $AB_k$ is an $m \times 1$ matrix or column vector which gives the $k^{th}$ column of AB.
Consider the following example.
Example 2.17: Multiplying Two Matrices

Find AB if possible.
\[ A = \left[ \begin{array}{ccc} 1 & 2 & 1 \\ 0 & 2 & 1 \end{array} \right], \quad B = \left[ \begin{array}{ccc} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{array} \right] \]

Solution. The first thing you need to verify when calculating a product is whether the multiplication is possible. The first matrix has size $2 \times 3$ and the second matrix has size $3 \times 3$. The inside numbers are equal, so A and B are conformable matrices. According to the above discussion, AB will be a $2 \times 3$ matrix. Definition 2.16 gives us a way to calculate each column of AB: the first, second, and third columns are, respectively,
\[
\left[ \begin{array}{ccc} 1 & 2 & 1 \\ 0 & 2 & 1 \end{array} \right] \left[ \begin{array}{c} 1 \\ 0 \\ -2 \end{array} \right], \quad
\left[ \begin{array}{ccc} 1 & 2 & 1 \\ 0 & 2 & 1 \end{array} \right] \left[ \begin{array}{c} 2 \\ 3 \\ 1 \end{array} \right], \quad
\left[ \begin{array}{ccc} 1 & 2 & 1 \\ 0 & 2 & 1 \end{array} \right] \left[ \begin{array}{c} 0 \\ 1 \\ 1 \end{array} \right]
\]
You know how to multiply a matrix times a vector, using Definition 2.13 for each of the three columns. Thus
\[ \left[ \begin{array}{ccc} 1 & 2 & 1 \\ 0 & 2 & 1 \end{array} \right] \left[ \begin{array}{ccc} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{array} \right] = \left[ \begin{array}{ccc} -1 & 9 & 3 \\ -2 & 7 & 3 \end{array} \right] \]
Since vectors are simply $n \times 1$ or $1 \times n$ matrices, we can also multiply a vector by another vector.
Example 2.18: Vector Times Vector Multiplication

Multiply if possible $\left[ \begin{array}{c} 1 \\ 2 \\ 1 \end{array} \right] \left[ \begin{array}{cccc} 1 & 2 & 1 & 0 \end{array} \right]$.

Solution. In this case we are multiplying a matrix of size $3 \times 1$ by a matrix of size $1 \times 4$. The inside numbers match so the product is defined. Note that the product will be a matrix of size $3 \times 4$. Using Definition 2.16, we can compute this product as follows: the first through fourth columns are, respectively,
\[
\left[ \begin{array}{c} 1 \\ 2 \\ 1 \end{array} \right] \left[ 1 \right], \quad
\left[ \begin{array}{c} 1 \\ 2 \\ 1 \end{array} \right] \left[ 2 \right], \quad
\left[ \begin{array}{c} 1 \\ 2 \\ 1 \end{array} \right] \left[ 1 \right], \quad
\left[ \begin{array}{c} 1 \\ 2 \\ 1 \end{array} \right] \left[ 0 \right]
\]
You can use Definition 2.13 to verify that this product is
\[ \left[ \begin{array}{cccc} 1 & 2 & 1 & 0 \\ 2 & 4 & 2 & 0 \\ 1 & 2 & 1 & 0 \end{array} \right] \]
Example 2.19: A Multiplication Which is Not Defined

Find BA if possible.
\[ B = \left[ \begin{array}{ccc} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{array} \right], \quad A = \left[ \begin{array}{ccc} 1 & 2 & 1 \\ 0 & 2 & 1 \end{array} \right] \]

Solution. First check if it is possible. This product is of the form $(3 \times 3)(2 \times 3)$. The inside numbers do not match and so you can't do this multiplication.
In this case, we say that the multiplication is not defined. Notice that these are the same matrices which we used in Example 2.17. In this example, we tried to calculate BA instead of AB. This demonstrates another property of matrix multiplication. While the product AB may be defined, we cannot assume that the product BA will be possible. Therefore, it is important to always check that the product is defined before carrying out any calculations.

Earlier, we defined the zero matrix 0 to be the matrix (of appropriate size) containing zeros in all entries. Consider the following example for multiplication by the zero matrix.
Example 2.20: Multiplication by the Zero Matrix

Compute the product A0 for the matrix
\[ A = \left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \end{array} \right] \]
and the $2 \times 2$ zero matrix given by
\[ 0 = \left[ \begin{array}{cc} 0 & 0 \\ 0 & 0 \end{array} \right] \]

Solution. In this product, we compute
\[ \left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \end{array} \right] \left[ \begin{array}{cc} 0 & 0 \\ 0 & 0 \end{array} \right] = \left[ \begin{array}{cc} 0 & 0 \\ 0 & 0 \end{array} \right] \]
Hence, A0 = 0.
Notice that we could also multiply A by the $2 \times 1$ zero vector given by $\left[ \begin{array}{c} 0 \\ 0 \end{array} \right]$. The result would be the $2 \times 1$ zero vector. Therefore, it is always the case that A0 = 0, for an appropriately sized zero matrix or vector.
2.1.4. The $ij^{th}$ Entry Of A Product

In previous sections, we used the entries of a matrix to describe the action of matrix addition and scalar multiplication. We can also study matrix multiplication using the entries of matrices.

What is the $ij^{th}$ entry of AB? It is the entry in the $i^{th}$ row and the $j^{th}$ column of the product AB.
Now if A is $m \times n$ and B is $n \times p$, then we know that the product AB has the form
\[
\left[ \begin{array}{cccc} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{array} \right]
\left[ \begin{array}{ccccc} B_{11} & B_{12} & \cdots & B_{1j} & \cdots & B_{1p} \\ B_{21} & B_{22} & \cdots & B_{2j} & \cdots & B_{2p} \\ \vdots & \vdots & & \vdots & & \vdots \\ B_{n1} & B_{n2} & \cdots & B_{nj} & \cdots & B_{np} \end{array} \right]
\]
The $j^{th}$ column of AB is of the form
\[
\left[ \begin{array}{cccc} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{array} \right]
\left[ \begin{array}{c} B_{1j} \\ B_{2j} \\ \vdots \\ B_{nj} \end{array} \right]
\]
which is an $m \times 1$ column vector. It is calculated by
\[
B_{1j} \left[ \begin{array}{c} A_{11} \\ A_{21} \\ \vdots \\ A_{m1} \end{array} \right] + B_{2j} \left[ \begin{array}{c} A_{12} \\ A_{22} \\ \vdots \\ A_{m2} \end{array} \right] + \cdots + B_{nj} \left[ \begin{array}{c} A_{1n} \\ A_{2n} \\ \vdots \\ A_{mn} \end{array} \right]
\]
Therefore, the $ij^{th}$ entry is the entry in row i of this vector. This is computed by
\[ A_{i1}B_{1j} + A_{i2}B_{2j} + \cdots + A_{in}B_{nj} = \sum_{k=1}^{n} A_{ik}B_{kj} \]
The following is the formal definition for the $ij^{th}$ entry of a product of matrices.

Definition 2.21: The $ij^{th}$ Entry of a Product

Let $A = [A_{ij}]$ be an $m \times n$ matrix and let $B = [B_{ij}]$ be an $n \times p$ matrix. Then AB is an $m \times p$ matrix and the (i, j)-entry of AB is defined as
\[ AB_{ij} = \sum_{k=1}^{n} A_{ik}B_{kj} \tag{2.6} \]
Another way to write this is
\[
AB_{ij} = \left[ \begin{array}{cccc} A_{i1} & A_{i2} & \cdots & A_{in} \end{array} \right] \left[ \begin{array}{c} B_{1j} \\ B_{2j} \\ \vdots \\ B_{nj} \end{array} \right] = A_{i1}B_{1j} + A_{i2}B_{2j} + \cdots + A_{in}B_{nj}
\]

In other words, to find the (i, j)-entry of the product AB, or $AB_{ij}$, you multiply the $i^{th}$ row of A, on the left, by the $j^{th}$ column of B. To express AB in terms of its entries, we write $AB = [AB_{ij}]$.
Consider the following example.
Example 2.22: The Entries of a Product

Compute AB if possible. If it is, find the (3, 2)-entry of AB using Definition 2.21.
\[ A = \left[ \begin{array}{cc} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{array} \right], \quad B = \left[ \begin{array}{ccc} 2 & 3 & 1 \\ 7 & 6 & 2 \end{array} \right] \]

Solution. First check if the product is possible. It is of the form $(3 \times 2)(2 \times 3)$ and since the inside numbers match, it is possible to do the multiplication. The result should be a $3 \times 3$ matrix. To find the answer, we compute AB:
\[
\left[
\left[ \begin{array}{cc} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{array} \right] \left[ \begin{array}{c} 2 \\ 7 \end{array} \right], \quad
\left[ \begin{array}{cc} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{array} \right] \left[ \begin{array}{c} 3 \\ 6 \end{array} \right], \quad
\left[ \begin{array}{cc} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{array} \right] \left[ \begin{array}{c} 1 \\ 2 \end{array} \right]
\right]
\]
where the commas separate the columns in the resulting product. Thus the above product equals
\[ \left[ \begin{array}{ccc} 16 & 15 & 5 \\ 13 & 15 & 5 \\ 46 & 42 & 14 \end{array} \right] \]
which is a $3 \times 3$ matrix as desired. Thus, the (3, 2)-entry equals 42.

We can also compute the entry directly using Definition 2.21. The (3, 2)-entry equals
\[ \sum_{k=1}^{2} A_{3k}B_{k2} = A_{31}B_{12} + A_{32}B_{22} = 2 \cdot 3 + 6 \cdot 6 = 42 \]
Consulting our result for AB above, this is correct!
You may wish to use this method to verify that the rest of the entries in AB are correct.
Here is another example.
Example 2.23: Finding the Entries of a Product

Determine if the product AB is defined. If it is, find the (2, 1)-entry of the product.
\[ A = \left[ \begin{array}{ccc} 2 & 3 & 1 \\ 7 & 6 & 2 \\ 0 & 0 & 0 \end{array} \right], \quad B = \left[ \begin{array}{cc} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{array} \right] \]

Solution. This product is of the form $(3 \times 3)(3 \times 2)$. The middle numbers match so the matrices are conformable and it is possible to compute the product.

We want to find the (2, 1)-entry of AB, that is, the entry in the second row and first column of the product. We will use Definition 2.21, which states
\[ AB_{ij} = \sum_{k=1}^{n} A_{ik}B_{kj} \]
In this case, n = 3, i = 2 and j = 1. Hence the (2, 1)-entry is found by computing
\[ AB_{21} = \sum_{k=1}^{3} A_{2k}B_{k1} = \left[ \begin{array}{ccc} A_{21} & A_{22} & A_{23} \end{array} \right] \left[ \begin{array}{c} B_{11} \\ B_{21} \\ B_{31} \end{array} \right] \]
Substituting in the appropriate values, this product becomes
\[ \left[ \begin{array}{ccc} 7 & 6 & 2 \end{array} \right] \left[ \begin{array}{c} 1 \\ 3 \\ 2 \end{array} \right] = 7 \cdot 1 + 6 \cdot 3 + 2 \cdot 2 = 29 \]
Hence, $AB_{21} = 29$.

You should take a moment to find a few other entries of AB. You can multiply the matrices to check that your answers are correct. The product AB is given by
\[ AB = \left[ \begin{array}{cc} 13 & 13 \\ 29 & 32 \\ 0 & 0 \end{array} \right] \]
2.1.5. Properties Of Matrix Multiplication
As pointed out above, it is sometimes possible to multiply matrices in one order but not in
the other order. However, even if both AB and BA are possible, they may not be equal.
Example 2.24: Matrix Multiplication is Not Commutative

Compare the products AB and BA, for matrices $A = \left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \end{array} \right]$, $B = \left[ \begin{array}{cc} 0 & 1 \\ 1 & 0 \end{array} \right]$

Solution. First, notice that A and B are both of size $2 \times 2$. Therefore, both products AB and BA are defined. The first product, AB, is
\[ AB = \left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \end{array} \right] \left[ \begin{array}{cc} 0 & 1 \\ 1 & 0 \end{array} \right] = \left[ \begin{array}{cc} 2 & 1 \\ 4 & 3 \end{array} \right] \]
The second product, BA, is
\[ \left[ \begin{array}{cc} 0 & 1 \\ 1 & 0 \end{array} \right] \left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \end{array} \right] = \left[ \begin{array}{cc} 3 & 4 \\ 1 & 2 \end{array} \right] \]
Therefore, $AB \neq BA$.
This example illustrates that you cannot assume AB = BA even when multiplication is defined in both orders. If for some matrices A and B it is true that AB = BA, then we say that A and B commute. This is one important property of matrix multiplication.
The following are other important properties of matrix multiplication. Notice that these properties hold only when the sizes of the matrices are such that the products are defined.
Proposition 2.25: Properties of Matrix Multiplication

The following hold for matrices A, B, and C and for scalars r and s:
\[ A(rB + sC) = r(AB) + s(AC) \tag{2.7} \]
\[ (B + C)A = BA + CA \tag{2.8} \]
\[ A(BC) = (AB)C \tag{2.9} \]
Proof: First we will prove 2.7. We will use Definition 2.21 and prove this statement using the $ij^{th}$ entries of a matrix. Therefore,
\[
(A(rB + sC))_{ij} = \sum_k A_{ik}(rB + sC)_{kj} = \sum_k A_{ik}(rB_{kj} + sC_{kj}) = r \sum_k A_{ik}B_{kj} + s \sum_k A_{ik}C_{kj} = r(AB)_{ij} + s(AC)_{ij} = (r(AB) + s(AC))_{ij}
\]
Thus $A(rB + sC) = r(AB) + s(AC)$ as claimed.

The proof of 2.8 follows the same pattern and is left as an exercise.

Statement 2.9 is the associative law of multiplication. Using Definition 2.21,
\[
(A(BC))_{ij} = \sum_k A_{ik}(BC)_{kj} = \sum_k A_{ik} \sum_l B_{kl}C_{lj} = \sum_l (AB)_{il}C_{lj} = ((AB)C)_{ij}
\]
This proves 2.9. ∎
2.1.6. The Transpose
Another important operation on matrices is that of taking the transpose. For a matrix A, we denote the transpose of A by $A^T$. Before formally defining the transpose, we explore this operation on the following matrix.
\[ \left[ \begin{array}{cc} 1 & 4 \\ 3 & 1 \\ 2 & 6 \end{array} \right]^T = \left[ \begin{array}{ccc} 1 & 3 & 2 \\ 4 & 1 & 6 \end{array} \right] \]
What happened? The first column became the first row and the second column became the second row. Thus the $3 \times 2$ matrix became a $2 \times 3$ matrix. The number 4 was in the first row and the second column and it ended up in the second row and first column.

The definition of the transpose is as follows.
Definition 2.26: The Transpose of a Matrix

Let A be an $m \times n$ matrix. Then $A^T$, the transpose of A, denotes the $n \times m$ matrix given by
\[ A^T = [A_{ij}]^T = [A_{ji}] \]
The (i, j)-entry of A becomes the (j, i)-entry of $A^T$.
Consider the following example.
Example 2.27: The Transpose of a Matrix

Calculate $A^T$ for the following matrix
\[ A = \left[ \begin{array}{ccc} 1 & 2 & 6 \\ 3 & 5 & 4 \end{array} \right] \]

Solution. By Definition 2.26, we know that for $A = [A_{ij}]$, $A^T = [A_{ji}]$. In other words, we switch the row and column location of each entry. The (1, 2)-entry becomes the (2, 1)-entry. Thus,
\[ A^T = \left[ \begin{array}{cc} 1 & 3 \\ 2 & 5 \\ 6 & 4 \end{array} \right] \]
Notice that A is a $2 \times 3$ matrix, while $A^T$ is a $3 \times 2$ matrix.
The transpose of a matrix has the following important properties.
Lemma 2.28: Properties of the Transpose of a Matrix

Let A be an $m \times n$ matrix and let B be an $n \times p$ matrix. Then
\[ (AB)^T = B^T A^T \tag{2.10} \]
If r and s are scalars,
\[ (rA + sB)^T = rA^T + sB^T \tag{2.11} \]
Proof: First we prove 2.10. From Definition 2.26,
\[
\left( (AB)^T \right)_{ij} = (AB)_{ji} = \sum_k A_{jk}B_{ki} = \sum_k B_{ki}A_{jk} = \sum_k \left( B^T \right)_{ik} \left( A^T \right)_{kj} = \left( B^T A^T \right)_{ij}
\]
Hence $(AB)^T = B^T A^T$. The proof of Formula 2.11 is left as an exercise. ∎
The transpose of a matrix is related to other important topics. Consider the following
definition.
Definition 2.29: Symmetric and Skew Symmetric Matrices

An $n \times n$ matrix A is said to be symmetric if $A = A^T$. It is said to be skew symmetric if $A = -A^T$.
We will explore these definitions in the following examples.
Example 2.30: Symmetric Matrices

Let
\[ A = \left[ \begin{array}{ccc} 2 & 1 & 3 \\ 1 & 5 & 3 \\ 3 & 3 & 7 \end{array} \right] \]
Use Definition 2.29 to show that A is symmetric.

Solution. By Definition 2.29, we need to show that $A = A^T$. Now, using Definition 2.26,
\[ A^T = \left[ \begin{array}{ccc} 2 & 1 & 3 \\ 1 & 5 & 3 \\ 3 & 3 & 7 \end{array} \right] \]
Hence, $A = A^T$, so A is symmetric.
Example 2.31: A Skew Symmetric Matrix

Let
\[ A = \left[ \begin{array}{ccc} 0 & 1 & 3 \\ -1 & 0 & 2 \\ -3 & -2 & 0 \end{array} \right] \]
Show that A is skew symmetric.

Solution. By Definition 2.29,
\[ A^T = \left[ \begin{array}{ccc} 0 & -1 & -3 \\ 1 & 0 & -2 \\ 3 & 2 & 0 \end{array} \right] \]
You can see that each entry of $A^T$ is equal to $-1$ times the same entry of A. Hence, $A^T = -A$ and so by Definition 2.29, A is skew symmetric.
2.1.7. The Identity And Inverses
There is a special matrix called I which is referred to as the identity matrix. The identity matrix is always a square matrix, and it has the property that there are ones down the main diagonal and zeroes elsewhere. Here are some identity matrices of various sizes.
\[
[1], \quad \left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right], \quad \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right], \quad \left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right]
\]
The first is the $1 \times 1$ identity matrix, the second is the $2 \times 2$ identity matrix, and so on. By extension, you can likely see what the $n \times n$ identity matrix would be. When it is necessary to distinguish which size of identity matrix is being discussed, we will use the notation $I_n$ for the $n \times n$ identity matrix.
The identity matrix is so important that there is a special symbol to denote the $ij^{th}$ entry of the identity matrix. This symbol is given by $I_{ij} = \delta_{ij}$ where $\delta_{ij}$ is the Kronecker symbol defined by
\[ \delta_{ij} = \left\{ \begin{array}{cl} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{array} \right. \]
$I_n$ is called the identity matrix because it is a multiplicative identity in the following sense. Note that $(AI_n)_{ij}$ denotes the $ij^{th}$ entry of the product of A and the $n \times n$ identity matrix $I_n$.
Lemma 2.32: Multiplication by the Identity Matrix

Suppose A is an $m \times n$ matrix and $I_n$ is the $n \times n$ identity matrix. Then $AI_n = A$. If $I_m$ is the $m \times m$ identity matrix, it also follows that $I_m A = A$.
Proof:
\[ (AI_n)_{ij} = \sum_k A_{ik}\delta_{kj} = A_{ij} \]
and so $AI_n = A$. The other case is left as an exercise for you. ∎
We now define the matrix operation which in some way plays the role of division.
Definition 2.33: The Inverse of a Matrix

A square $n \times n$ matrix A is said to have an inverse $A^{-1}$ if and only if
\[ AA^{-1} = A^{-1}A = I_n \]
In this case, the matrix A is called invertible.
Such a matrix $A^{-1}$ will have the same size as the matrix A. It is very important to observe that the inverse of a matrix, if it exists, is unique. Another way to think of this is that if it acts like the inverse, then it is the inverse.
Theorem 2.34: Uniqueness of Inverse

Suppose A is an $n \times n$ matrix such that an inverse $A^{-1}$ exists. Then there is only one such inverse matrix. That is, given any matrix B such that AB = BA = I, it follows that $B = A^{-1}$.
Proof: In this proof, it is assumed that I is the $n \times n$ identity matrix. Let A, B be $n \times n$ matrices such that $A^{-1}$ exists and AB = BA = I. We want to show that $A^{-1} = B$. Now using properties we have seen, we get:
\[ A^{-1} = A^{-1}I = A^{-1}(AB) = \left( A^{-1}A \right) B = IB = B \]
Hence, $A^{-1} = B$, which tells us that the inverse is unique. ∎
The next example demonstrates how to check the inverse of a matrix.
Example 2.35: Verifying the Inverse of a Matrix

Let $A = \left[ \begin{array}{cc} 1 & 1 \\ 1 & 2 \end{array} \right]$. Show $\left[ \begin{array}{cc} 2 & -1 \\ -1 & 1 \end{array} \right]$ is the inverse of A.

Solution. To check this, multiply
\[ \left[ \begin{array}{cc} 1 & 1 \\ 1 & 2 \end{array} \right] \left[ \begin{array}{cc} 2 & -1 \\ -1 & 1 \end{array} \right] = \left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right] = I \]
and
\[ \left[ \begin{array}{cc} 2 & -1 \\ -1 & 1 \end{array} \right] \left[ \begin{array}{cc} 1 & 1 \\ 1 & 2 \end{array} \right] = \left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right] = I \]
showing that this matrix is indeed the inverse of A.
Unlike ordinary multiplication of numbers, it can happen that $A \neq 0$ but A may fail to have an inverse. This is illustrated in the following example.
Example 2.36: A Nonzero Matrix With No Inverse

Let $A = \left[ \begin{array}{cc} 1 & 1 \\ 1 & 1 \end{array} \right]$. Show that A does not have an inverse.

Solution. One might think A would have an inverse because it does not equal zero. However, note that
\[ \left[ \begin{array}{cc} 1 & 1 \\ 1 & 1 \end{array} \right] \left[ \begin{array}{c} -1 \\ 1 \end{array} \right] = \left[ \begin{array}{c} 0 \\ 0 \end{array} \right] \]
If $A^{-1}$ existed, we would have the following
\[
\left[ \begin{array}{c} 0 \\ 0 \end{array} \right] = A^{-1}\left( \left[ \begin{array}{c} 0 \\ 0 \end{array} \right] \right) = A^{-1}\left( A \left[ \begin{array}{c} -1 \\ 1 \end{array} \right] \right) = \left( A^{-1}A \right) \left[ \begin{array}{c} -1 \\ 1 \end{array} \right] = I \left[ \begin{array}{c} -1 \\ 1 \end{array} \right] = \left[ \begin{array}{c} -1 \\ 1 \end{array} \right]
\]
This says that
\[ \left[ \begin{array}{c} 0 \\ 0 \end{array} \right] = \left[ \begin{array}{c} -1 \\ 1 \end{array} \right] \]
which is impossible! Therefore, A does not have an inverse.
In the next section, we will explore how to find the inverse of a matrix, if it exists.
2.1.8. Finding The Inverse Of A Matrix
In Example 2.35, we were given $A^{-1}$ and asked to verify that this matrix was in fact the inverse of A. In this section, we explore how to find $A^{-1}$.

Let
\[ A = \left[ \begin{array}{cc} 1 & 1 \\ 1 & 2 \end{array} \right] \]
as in Example 2.35. In order to find $A^{-1}$, we need to find a matrix $\left[ \begin{array}{cc} x & z \\ y & w \end{array} \right]$ such that
\[ \left[ \begin{array}{cc} 1 & 1 \\ 1 & 2 \end{array} \right] \left[ \begin{array}{cc} x & z \\ y & w \end{array} \right] = \left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right] \]
We can multiply these two matrices, and see that in order for this equation to be true, we must find the solution to the systems of equations
\[ \begin{array}{c} x + y = 1 \\ x + 2y = 0 \end{array} \qquad \text{and} \qquad \begin{array}{c} z + w = 0 \\ z + 2w = 1 \end{array} \]
Writing the augmented matrix for these two systems gives
\[ \left[ \begin{array}{cc|c} 1 & 1 & 1 \\ 1 & 2 & 0 \end{array} \right] \tag{2.12} \]
for the first system and
\[ \left[ \begin{array}{cc|c} 1 & 1 & 0 \\ 1 & 2 & 1 \end{array} \right] \tag{2.13} \]
for the second.

Let's solve the first system. Take $-1$ times the first row and add to the second to get
\[ \left[ \begin{array}{cc|c} 1 & 1 & 1 \\ 0 & 1 & -1 \end{array} \right] \]
Now take $-1$ times the second row and add to the first to get
\[ \left[ \begin{array}{cc|c} 1 & 0 & 2 \\ 0 & 1 & -1 \end{array} \right] \]
Writing in terms of variables, this says x = 2 and y = $-1$.

Now solve the second system, 2.13, to find z and w. You will find that z = $-1$ and w = 1.

If we take the values found for x, y, z, and w and put them into our inverse matrix, we see that the inverse is
\[ A^{-1} = \left[ \begin{array}{cc} x & z \\ y & w \end{array} \right] = \left[ \begin{array}{cc} 2 & -1 \\ -1 & 1 \end{array} \right] \]

After taking the time to solve the second system, you may have noticed that exactly the same row operations were used to solve both systems. In each case, the end result was something of the form [I|X] where I is the identity and X gave a column of the inverse. In the above, the first column of the inverse,
\[ \left[ \begin{array}{c} x \\ y \end{array} \right] \]
was obtained by solving the first system, and then the second column,
\[ \left[ \begin{array}{c} z \\ w \end{array} \right] \]
was obtained by solving the second.

To simplify this procedure, we could have solved both systems at once! To do so, we could have written
\[ \left[ \begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 1 & 2 & 0 & 1 \end{array} \right] \]
and row reduced until we obtained
\[ \left[ \begin{array}{cc|cc} 1 & 0 & 2 & -1 \\ 0 & 1 & -1 & 1 \end{array} \right] \]
and read off the inverse as the $2 \times 2$ matrix on the right side.

This exploration motivates the following important algorithm.
Algorithm 2.37: Matrix Inverse Algorithm

Suppose A is an $n \times n$ matrix. To find $A^{-1}$ if it exists, form the augmented $n \times 2n$ matrix
\[ [A|I] \]
If possible, do row operations until you obtain an $n \times 2n$ matrix of the form
\[ [I|B] \]
When this has been done, $B = A^{-1}$. In this case, we say that A is invertible. If it is impossible to row reduce to a matrix of the form [I|B], then A has no inverse.

This algorithm shows how to find the inverse if it exists. It will also tell you if A does not have an inverse.
Consider the following example.
Example 2.38: Finding the Inverse

Let $A = \left[ \begin{array}{ccc} 1 & 2 & 2 \\ 1 & 0 & 2 \\ 3 & 1 & -1 \end{array} \right]$. Find $A^{-1}$ if it exists.

Solution. Set up the augmented matrix
\[ [A|I] = \left[ \begin{array}{ccc|ccc} 1 & 2 & 2 & 1 & 0 & 0 \\ 1 & 0 & 2 & 0 & 1 & 0 \\ 3 & 1 & -1 & 0 & 0 & 1 \end{array} \right] \]
Now we row reduce, with the goal of obtaining the $3 \times 3$ identity matrix on the left hand side. First, take $-1$ times the first row and add to the second, followed by $-3$ times the first row added to the third row. This yields
\[ \left[ \begin{array}{ccc|ccc} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -2 & 0 & -1 & 1 & 0 \\ 0 & -5 & -7 & -3 & 0 & 1 \end{array} \right] \]
Then multiply the second row by 5, and replace the third row by 5 times the second row added to $-2$ times the third row.
\[ \left[ \begin{array}{ccc|ccc} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -10 & 0 & -5 & 5 & 0 \\ 0 & 0 & 14 & 1 & 5 & -2 \end{array} \right] \]
Next take the third row and add it to $-7$ times the first row. This yields
\[ \left[ \begin{array}{ccc|ccc} -7 & -14 & 0 & -6 & 5 & -2 \\ 0 & -10 & 0 & -5 & 5 & 0 \\ 0 & 0 & 14 & 1 & 5 & -2 \end{array} \right] \]
Now take $-\frac{7}{5}$ times the second row and add to the first row.
\[ \left[ \begin{array}{ccc|ccc} -7 & 0 & 0 & 1 & -2 & -2 \\ 0 & -10 & 0 & -5 & 5 & 0 \\ 0 & 0 & 14 & 1 & 5 & -2 \end{array} \right] \]
Finally divide the first row by $-7$, the second row by $-10$, and the third row by 14, which yields
\[ \left[ \begin{array}{ccc|ccc} 1 & 0 & 0 & -\frac{1}{7} & \frac{2}{7} & \frac{2}{7} \\ 0 & 1 & 0 & \frac{1}{2} & -\frac{1}{2} & 0 \\ 0 & 0 & 1 & \frac{1}{14} & \frac{5}{14} & -\frac{1}{7} \end{array} \right] \]
Notice that the left hand side of this matrix is now the $3 \times 3$ identity matrix $I_3$. Therefore, the inverse is the $3 \times 3$ matrix on the right hand side, given by
\[ \left[ \begin{array}{ccc} -\frac{1}{7} & \frac{2}{7} & \frac{2}{7} \\ \frac{1}{2} & -\frac{1}{2} & 0 \\ \frac{1}{14} & \frac{5}{14} & -\frac{1}{7} \end{array} \right] \]
It may happen that through this algorithm, you discover that the left hand side cannot
be row reduced to the identity matrix. Consider the following example of this situation.
Example 2.39: A Matrix Which Has No Inverse

Let $A = \left[ \begin{array}{ccc} 1 & 2 & 2 \\ 1 & 0 & 2 \\ 2 & 2 & 4 \end{array} \right]$. Find $A^{-1}$ if it exists.

Solution. Write the augmented matrix [A|I]
\[ \left[ \begin{array}{ccc|ccc} 1 & 2 & 2 & 1 & 0 & 0 \\ 1 & 0 & 2 & 0 & 1 & 0 \\ 2 & 2 & 4 & 0 & 0 & 1 \end{array} \right] \]
and proceed to do row operations attempting to obtain $[I|A^{-1}]$. Take $-1$ times the first row and add to the second. Then take $-2$ times the first row and add to the third row.
\[ \left[ \begin{array}{ccc|ccc} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -2 & 0 & -1 & 1 & 0 \\ 0 & -2 & 0 & -2 & 0 & 1 \end{array} \right] \]
Next add $-1$ times the second row to the third row.
\[ \left[ \begin{array}{ccc|ccc} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -2 & 0 & -1 & 1 & 0 \\ 0 & 0 & 0 & -1 & -1 & 1 \end{array} \right] \]
At this point, you can see there will be no way to obtain I on the left side of this augmented matrix. Hence, there is no way to complete this algorithm, and therefore the inverse of A does not exist. In this case, we say that A is not invertible.
If the algorithm provides an inverse for the original matrix, it is always possible to check your answer. To do so, use the method used in Example 2.35. Check that the products $AA^{-1}$ and $A^{-1}A$ both equal the identity matrix. Through this method, you can always be sure that you have calculated $A^{-1}$ properly!
One way in which the inverse of a matrix is useful is to find the solution of a system of linear equations. Recall from Definition 2.15 that we can write a system of equations in matrix form, which is of the form AX = B. Suppose you find the inverse of the matrix, $A^{-1}$. Then you could multiply both sides of this equation on the left by $A^{-1}$ and simplify to obtain
\[
\begin{array}{c}
(A^{-1})AX = A^{-1}B \\
(A^{-1}A)X = A^{-1}B \\
IX = A^{-1}B \\
X = A^{-1}B
\end{array}
\]
Therefore, we can find X, the solution to the system, by computing $X = A^{-1}B$. Note that once you have found $A^{-1}$, you can easily get the solution for different right hand sides (different B) without any effort. It is always just $A^{-1}B$.
We will explore this method of finding the solution to a system in the following example.
Example 2.40: Using the Inverse to Solve a System of Equations

Consider the following system of equations. Use the inverse of a suitable matrix to give the solutions to this system.
\[
\begin{array}{c}
x + z = 1 \\
x - y + z = 3 \\
x + y - z = 2
\end{array}
\]

Solution. First, we can write the system of equations in matrix form
\[ AX = \left[ \begin{array}{ccc} 1 & 0 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{array} \right] \left[ \begin{array}{c} x \\ y \\ z \end{array} \right] = \left[ \begin{array}{c} 1 \\ 3 \\ 2 \end{array} \right] = B \tag{2.14} \]
The inverse of the matrix
\[ A = \left[ \begin{array}{ccc} 1 & 0 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{array} \right] \]
is
\[ A^{-1} = \left[ \begin{array}{ccc} 0 & \frac{1}{2} & \frac{1}{2} \\ 1 & -1 & 0 \\ 1 & -\frac{1}{2} & -\frac{1}{2} \end{array} \right] \]
Verifying this inverse is left as an exercise.

From here, the solution to the given system 2.14 is found by
\[ \left[ \begin{array}{c} x \\ y \\ z \end{array} \right] = A^{-1}B = \left[ \begin{array}{ccc} 0 & \frac{1}{2} & \frac{1}{2} \\ 1 & -1 & 0 \\ 1 & -\frac{1}{2} & -\frac{1}{2} \end{array} \right] \left[ \begin{array}{c} 1 \\ 3 \\ 2 \end{array} \right] = \left[ \begin{array}{c} \frac{5}{2} \\ -2 \\ -\frac{3}{2} \end{array} \right] \]
What if the right side, B, of 2.14 had been $\left[ \begin{array}{c} 0 \\ 1 \\ 3 \end{array} \right]$? In other words, what would be the solution to
\[ \left[ \begin{array}{ccc} 1 & 0 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{array} \right] \left[ \begin{array}{c} x \\ y \\ z \end{array} \right] = \left[ \begin{array}{c} 0 \\ 1 \\ 3 \end{array} \right]? \]
By the above discussion, the solution is given by
\[ \left[ \begin{array}{c} x \\ y \\ z \end{array} \right] = A^{-1}B = \left[ \begin{array}{ccc} 0 & \frac{1}{2} & \frac{1}{2} \\ 1 & -1 & 0 \\ 1 & -\frac{1}{2} & -\frac{1}{2} \end{array} \right] \left[ \begin{array}{c} 0 \\ 1 \\ 3 \end{array} \right] = \left[ \begin{array}{c} 2 \\ -1 \\ -2 \end{array} \right] \]
This illustrates that for a system AX = B where $A^{-1}$ exists, it is easy to find the solution when the vector B is changed.
In the next section, we explore a special type of matrices.
2.1.9. Elementary Matrices
We now turn our attention to a special type of matrix called an elementary matrix. An
elementary matrix is always a square matrix. Recall the row operations given in Definition
1.11. Any elementary matrix, which we often denote by E, is obtained from applying one
row operation to the identity matrix of the same size.
For example, the matrix
\[ E = \left[ \begin{array}{cc} 0 & 1 \\ 1 & 0 \end{array} \right] \]
is the elementary matrix obtained from switching the two rows. The matrix
\[ E = \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{array} \right] \]
is the elementary matrix obtained from multiplying the second row of the $3 \times 3$ identity matrix by 3. The matrix
\[ E = \left[ \begin{array}{cc} 1 & 0 \\ 3 & 1 \end{array} \right] \]
is the elementary matrix obtained from adding 3 times the first row to the second row.
You may construct an elementary matrix from any row operation, but remember that
you can only apply one operation.
Consider the following definition.

Definition 2.41: Elementary Matrices and Row Operations

Let E be an $n \times n$ matrix. Then E is an elementary matrix if it is the result of applying one row operation to the $n \times n$ identity matrix $I_n$.

Those which involve switching rows of the identity matrix are called permutation matrices.

Therefore, E constructed above by switching the two rows of $I_2$ is called a permutation matrix.

Elementary matrices can be used in place of row operations and can thus be very useful. It turns out that left multiplying by an elementary matrix E will have the same effect as doing the row operation used to obtain E.
The following theorem is an important result which we will use throughout this text.
Theorem 2.42: Multiplication by an Elementary Matrix and Row Operations

To perform any of the three row operations on a matrix A it suffices to take the product EA, where E is the elementary matrix obtained by using the desired row operation on the identity matrix.
Therefore, instead of performing row operations on a matrix A, we can row reduce through matrix multiplication with the appropriate elementary matrix. We will examine this theorem in detail for each of the three row operations given in Definition 1.11.

First, consider the following lemma.
Lemma 2.43: Action of Permutation Matrix

Let $P_{ij}$ denote the elementary matrix which involves switching the $i^{th}$ and the $j^{th}$ rows. Hence, $P_{ij}$ is a permutation matrix. Then
\[ P_{ij}A = B \]
where B is obtained from A by switching the $i^{th}$ and the $j^{th}$ rows.
We will explore this idea more in the following example.
Example 2.44: Switching Rows with an Elementary Matrix

Let
\[ P_{12} = \left[ \begin{array}{ccc} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right], \quad A = \left[ \begin{array}{cc} a & b \\ g & d \\ e & f \end{array} \right] \]
Find B where $B = P_{12}A$.

Solution. You can see that the matrix $P_{12}$ is obtained by switching the first and second rows of the $3 \times 3$ identity matrix I.

Using our usual procedure, compute the product $P_{12}A = B$. The result is given by
\[ B = \left[ \begin{array}{cc} g & d \\ a & b \\ e & f \end{array} \right] \]
Notice that B is the matrix obtained by switching rows 1 and 2 of A. Therefore by multiplying A by $P_{12}$, the row operation which was applied to I to obtain $P_{12}$ is applied to A to obtain B.
Theorem 2.42 applies to all three row operations, and we now look at the row operation of multiplying a row by a scalar. Consider the following lemma.
Lemma 2.45: Multiplication by a Scalar and Elementary Matrices
Let E(k, i) denote the elementary matrix corresponding to the row operation in which the i-th row is multiplied by the nonzero scalar k. Thus E(k, i) involves multiplying the i-th row of the identity matrix by k. Then
\[ E(k, i) A = B \]
where B is obtained from A by multiplying the i-th row of A by k.
We will explore this lemma further in the following example.
Example 2.46: Multiplication of a Row by 5 Using Elementary Matrix
Let
\[ E(5, 2) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad A = \begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix} \]
Find the matrix B where B = E(5, 2) A.

Solution. You can see that E(5, 2) is obtained by multiplying the second row of the identity by 5.
Using our usual procedure for multiplication of matrices, we can compute the product E(5, 2) A. The resulting matrix is given by
\[ B = \begin{bmatrix} a & b \\ 5c & 5d \\ e & f \end{bmatrix} \]
Notice that B is obtained by multiplying the second row of A by the scalar 5.
There is one last row operation to consider. The following lemma discusses the final operation of adding a multiple of a row to another row.
Lemma 2.47: Adding Multiples of Rows and Elementary Matrices
Let E(k · i + j) denote the elementary matrix obtained from I by replacing the j-th row with k times the i-th row added to it. Then
\[ E(k \cdot i + j) A = B \]
where B is obtained from A by replacing the j-th row of A with itself added to k times the i-th row of A.
Consider the following example.
Example 2.48: Adding Two Times the First Row to the Last
Let
\[ E(2 \cdot 1 + 3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix}, \quad A = \begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix} \]
Find B where B = E(2 · 1 + 3) A.

Solution. You can see that the matrix E(2 · 1 + 3) was obtained by adding 2 times the first row of I to the third row of I.
Using our usual procedure, we can compute the product E(2 · 1 + 3) A. The resulting matrix B is given by
\[ B = \begin{bmatrix} a & b \\ c & d \\ 2a + e & 2b + f \end{bmatrix} \]
You can see that B is the matrix obtained by adding 2 times the first row of A to the third row.
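All three lemmas are easy to experiment with numerically. The sketch below is our own illustration (using NumPy, not part of the text); it builds one elementary matrix of each type and checks that left multiplication performs the corresponding row operation.

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])   # any 3 x 2 matrix will do

I = np.eye(3)

P12 = I[[1, 0, 2]]           # switch rows 1 and 2 of the identity
E52 = np.diag([1., 5., 1.])  # multiply row 2 of the identity by 5
E213 = I.copy()
E213[2, 0] = 2               # add 2 times row 1 to row 3

print(P12 @ A)    # rows 1 and 2 of A are switched
print(E52 @ A)    # row 2 of A is multiplied by 5
print(E213 @ A)   # row 3 of A becomes 2*(row 1) + (row 3)
```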
Suppose we have applied a row operation to a matrix A. Consider the row operation required to return A to its original form, to undo the row operation. It turns out that this action is how we find the inverse of an elementary matrix E.
Consider the following theorem.
Theorem 2.49: Elementary Matrices and Inverses
Every elementary matrix is invertible and its inverse is also an elementary matrix.

In fact, the inverse of an elementary matrix is constructed by doing the reverse row operation on I. E^{-1} will be obtained by performing the row operation which would carry E back to I.
If E is obtained by switching rows i and j, then E^{-1} is also obtained by switching rows i and j.
If E is obtained by multiplying row i by the scalar k, then E^{-1} is obtained by multiplying row i by the scalar 1/k.
If E is obtained by adding k times row i to row j, then E^{-1} is obtained by subtracting k times row i from row j.
Consider the following example.
Example 2.50: Inverse of an Elementary Matrix
Let
\[ E = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \]
Find E^{-1}.

Solution. Consider the elementary matrix E given by
\[ E = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \]
Here, E is obtained from the 2 × 2 identity matrix by multiplying the second row by 2. In order to carry E back to the identity, we need to multiply the second row of E by 1/2. Hence, E^{-1} is given by
\[ E^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{2} \end{bmatrix} \]
We can verify that EE^{-1} = I. Take the product EE^{-1}, given by
\[ EE^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{2} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]
This equals I, so we know that we have computed E^{-1} properly.
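The inverse found in Example 2.50 can be confirmed in one line. The snippet below is a small sketch of ours, not part of the text:

```python
import numpy as np

E = np.array([[1., 0.],
              [0., 2.]])       # multiply row 2 of I by 2
E_inv = np.array([[1., 0.],
                  [0., 0.5]])  # reverse operation: multiply row 2 by 1/2

print(E @ E_inv)                             # the 2 x 2 identity
print(np.allclose(E_inv, np.linalg.inv(E)))  # True
```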
2.1.10. More on Matrix Inverses
In this section, we will prove three theorems which will clarify the concept of matrix inverses. In order to do this, first recall some important properties of elementary matrices.
Recall that an elementary matrix is a square matrix obtained by performing an elementary operation on an identity matrix. Each elementary matrix is invertible, and its inverse is also an elementary matrix. If E is an m × m elementary matrix and A is an m × n matrix, then the product EA is the result of applying to A the same elementary row operation that was applied to the m × m identity matrix in order to obtain E.
Let R be the reduced row-echelon form of an m × n matrix A. R is obtained by iteratively applying a sequence of elementary row operations to A. Denote by E_1, E_2, ..., E_k the elementary matrices associated with the elementary row operations which were applied, in order, to the matrix A to obtain the resulting R. We then have that
\[ R = (E_k \cdots (E_2 (E_1 A)) \cdots) = E_k \cdots E_2 E_1 A \]
Let E denote the product matrix E_k ··· E_2 E_1 so that we can write R = EA where E is an invertible matrix whose inverse is the product (E_1)^{-1} (E_2)^{-1} ··· (E_k)^{-1}.
Now, we will consider some preliminary lemmas.
Lemma 2.51: Invertible Matrix and Zeros
Suppose that A and B are matrices such that the product AB is an identity matrix. Then the reduced row-echelon form of A does not have a row of zeros.

Proof: Let R be the reduced row-echelon form of A. Then R = EA for some invertible square matrix E. By hypothesis AB = I where I is an identity matrix, so we have a chain of equalities
\[ R(BE^{-1}) = (EA)(BE^{-1}) = E(AB)E^{-1} = EIE^{-1} = EE^{-1} = I \]
If R had a row of zeros, then so would the product R(BE^{-1}). But since the identity matrix I does not have a row of zeros, neither can R have one.
We now consider a second important lemma.
Lemma 2.52: Size of Invertible Matrix
Suppose that A and B are matrices such that the product AB is an identity matrix. Then A has at least as many columns as it has rows.

Proof: Let R be the reduced row-echelon form of A. By Lemma 2.51, we know that R does not have a row of zeros, and therefore each row of R has a leading 1. Since each column of R contains at most one of these leading 1s, R must have at least as many columns as it has rows.
An important theorem follows from this lemma.
Theorem 2.53: Invertible Matrices are Square
Only square matrices can be invertible.

Proof: Suppose that A and B are matrices such that both products AB and BA are identity matrices. We will show that A and B must be square matrices of the same size. Let the matrix A have m rows and n columns, so that A is an m × n matrix. Since the product AB exists, B must have n rows, and since the product BA exists, B must have m columns, so that B is an n × m matrix. To finish the proof, we need only verify that m = n.
We first apply Lemma 2.52 with A and B, to obtain the inequality m ≤ n. We then apply Lemma 2.52 again (switching the order of the matrices), to obtain the inequality n ≤ m. It follows that m = n, as we wanted.
Of course, not all square matrices are invertible. In particular, zero matrices are not
invertible, along with many other square matrices.
The following proposition will be useful in proving the next theorem.
Proposition 2.54: Reduced Row-Echelon Form of a Square Matrix
If R is the reduced row-echelon form of a square matrix, then either R has a row of
zeros or R is an identity matrix.
The proof of this proposition is left as an exercise to the reader. We now consider the
second important theorem of this section.
Theorem 2.55: Unique Inverse of a Matrix
Suppose A and B are square matrices such that AB = I where I is an identity matrix. Then it follows that BA = I. Further, both A and B are invertible, B = A^{-1}, and A = B^{-1}.

Proof: Let R be the reduced row-echelon form of a square matrix A. Then, R = EA where E is an invertible matrix. Since AB = I, Lemma 2.51 gives us that R does not have a row of zeros. By noting that R is a square matrix and applying Proposition 2.54, we see that R = I. Hence, EA = I.
Using both that EA = I and AB = I, we can finish the proof with a chain of equalities as given by
\[ BA = IBIA = (EA)B(E^{-1}E)A = E(AB)E^{-1}(EA) = EIE^{-1}I = EE^{-1} = I \]
It follows from the definition of the inverse of a matrix that B = A^{-1} and A = B^{-1}.
This theorem is very useful, since with it we need only test one of the products AB or
BA in order to check that B is the inverse of A. The hypothesis that A and B are square
matrices is very important, and without this the theorem does not hold.
We will now consider an example.
Example 2.56: Non Square Matrices
Let
\[ A = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \]
and let P be the augmented matrix [A | B], which is a 3 × 2 matrix. Consider the products P^T P and P P^T.

Solution. The matrix P is given by
\[ P = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \]
Then, consider the product P^T P given by
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]
Therefore, P^T P = I_2, where I_2 is the 2 × 2 identity matrix. However, the product P P^T is
\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]
Hence P P^T is not the 3 × 3 identity matrix. This shows that for Theorem 2.55, it is essential that both matrices be square and of the same size.
Is it possible to have matrices A and B such that AB = I, while BA = 0? This question
is left to the reader to answer, and you should take a moment to consider the answer.
We conclude this section with an important theorem.
Theorem 2.57: The Reduced Row-Echelon Form of an Invertible Matrix
For any matrix A the following conditions are equivalent:
A is invertible
The reduced row-echelon form of A is an identity matrix
Proof: In order to prove this, we show that for any given matrix A, each condition implies the other. We first show that if A is invertible, then its reduced row-echelon form is an identity matrix, then we show that if the reduced row-echelon form of A is an identity matrix, then A is invertible.
If A is invertible, there is some matrix B such that AB = I. By Lemma 2.51, we get that the reduced row-echelon form of A does not have a row of zeros. Then by Theorem 2.53, it follows that A and the reduced row-echelon form of A are square matrices. Finally, by Proposition 2.54, this reduced row-echelon form of A must be an identity matrix. This proves the first implication.
Now suppose the reduced row-echelon form of A is an identity matrix I. Then I = EA
for some product E of elementary matrices. By Theorem 2.55, we can conclude that A is
invertible.
Theorem 2.57 corresponds to Algorithm 2.37, which claims that A^{-1} is found by row reducing the augmented matrix [A | I] to the form [I | A^{-1}]. This will be a matrix product E[A | I] where E is a product of elementary matrices. By the rules of matrix multiplication, we have that E[A | I] = [EA | EI] = [EA | E].
It follows that the reduced row-echelon form of [A | I] is [EA | E], where EA gives the reduced row-echelon form of A. By Theorem 2.57, if EA ≠ I, then A is not invertible, and if EA = I, A is invertible. If EA = I, then by Theorem 2.55, E = A^{-1}. This proves that Algorithm 2.37 does in fact find A^{-1}.
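Algorithm 2.37 can be implemented directly: row reduce [A | I] and read A^{-1} off the right block. Here is a hedged sketch of ours in NumPy (the text itself does not prescribe an implementation); it adds partial pivoting for numerical stability, a detail beyond the algorithm as stated.

```python
import numpy as np

def inverse_by_row_reduction(A):
    """Row reduce [A | I]; if the left block becomes I, the right block is A^{-1}."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))  # choose a usable pivot row
        if np.isclose(M[pivot, col], 0.0):
            raise ValueError("A is not invertible")
        M[[col, pivot]] = M[[pivot, col]]   # row operation 1: switch rows
        M[col] /= M[col, col]               # row operation 2: scale to a leading 1
        for r in range(n):
            if r != col:
                M[r] -= M[r, col] * M[col]  # row operation 3: clear the column
    return M[:, n:]

A = np.array([[1, 0, 1], [1, -1, 1], [1, 1, -1]])
print(inverse_by_row_reduction(A))  # matches the A^{-1} used earlier in this section
```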
2.2 Linear Transformations
Recall that when we multiply an m × n matrix by an n × 1 column vector, the result is an m × 1 column vector. In this section we will discuss how, through matrix multiplication, an m × n matrix transforms an n × 1 column vector into an m × 1 column vector.
We will introduce some new notation at this time, which will be used frequently throughout the next section. You may be familiar with the notation R, which we call the set of real numbers. For example, 3, 1/2, 0.6, √2, and π are all real numbers.
Similarly, R^2 denotes the set of all 2 × 1 vectors of the form \( \begin{bmatrix} x \\ y \end{bmatrix} \), or all ordered pairs such as (x, y). R^2 represents the standard two-dimensional Cartesian plane which you are familiar with.
Consider now vectors of n elements. For instance, consider the n × 1 vector given by
\[ X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix} \]
We say that this vector belongs to R^n, which is the set of all n × 1 vectors.
Through matrix multiplication, matrices can transform vectors. We explore this idea in the next example.
Example 2.58: A Function Which Transforms Vectors
Consider the matrix \( A = \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \end{bmatrix} \).
Show that A transforms vectors in R^3 into vectors in R^2.

Solution. First, recall that vectors in R^3 are vectors of size 3 × 1, while vectors in R^2 are of size 2 × 1. If we multiply A, which is a 2 × 3 matrix, by a 3 × 1 vector, the result will be a 2 × 1 vector. This is what we mean when we say that A transforms vectors.
Now, for \( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \) in R^3, multiply on the left by the given matrix to obtain the new vector. This product looks like
\[ \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y \\ 2x + y \end{bmatrix} \]
The resulting product is a 2 × 1 vector which is determined by the choice of x and y. Here are some numerical examples.
\[ \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 5 \\ 4 \end{bmatrix} \]
Here, the vector \( \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \) in R^3 was transformed by the matrix into the vector \( \begin{bmatrix} 5 \\ 4 \end{bmatrix} \) in R^2.
Here is another example:
\[ \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \end{bmatrix} \begin{bmatrix} 10 \\ 5 \\ -3 \end{bmatrix} = \begin{bmatrix} 20 \\ 25 \end{bmatrix} \]
The idea is to define a function which takes vectors in R^3 and delivers new vectors in R^2.
In this case, that function is multiplication by the matrix A.
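Viewed as code, the matrix A literally is a function from R^3 to R^2. A minimal sketch of ours in NumPy, reproducing the two numerical examples above:

```python
import numpy as np

A = np.array([[1, 2, 0],
              [2, 1, 0]])

def T(X):
    """The transformation determined by A: a 3 x 1 vector in, a 2 x 1 vector out."""
    return A @ X

print(T(np.array([1, 2, 3])))    # [5 4]
print(T(np.array([10, 5, -3])))  # [20 25]
```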
Recall the properties of matrix multiplication. The pertinent property here is 2.8, which states that for k and p scalars,
\[ A(kB + pC) = kAB + pAC \]
In particular, for A an m × n matrix and B and C, n × 1 vectors in R^n, this formula holds.
In other words, this means that matrix multiplication gives an example of a linear transformation, which we will now define. The notation T : R^n → R^m means that the function T transforms vectors in R^n into vectors in R^m.
Definition 2.59: Linear Transformation
Let T : R^n → R^m be a function, where for each X ∈ R^n, T(X) ∈ R^m. Then T is a linear transformation if whenever k, p are scalars and X_1 and X_2 are vectors in R^n (n × 1 vectors),
\[ T(kX_1 + pX_2) = kT(X_1) + pT(X_2) \]
We began this section by discussing matrix transformations, where multiplication by a matrix transforms vectors. It turns out that every linear transformation can be expressed as a matrix transformation. That is, if T is a linear transformation, we can express T(X) as AX for some matrix A. The action of T can be written as matrix multiplication. In this case, we say that T is determined by the matrix A.
Even more remarkable is that any transformation which can be written as matrix multiplication is in fact a linear transformation. This is a very important fact which will be used many times, and we explore this further in the next section.
This observation leads to another type of notation used to express a linear transformation. If T is a linear transformation with T(X) = AX, we often write
\[ T_A(X) = AX \]
Therefore, T_A is the linear transformation determined by the matrix A.
Consider the set of vectors of R^m which are of the form T(X) for some X ∈ R^n. It is common to write TR^n, T(R^n), or Im(T) to denote these vectors. We call this set the range or image of T.
There are several important characterizations of linear transformations. Two of these are defined now.
Definition 2.60: One to One, Onto
Suppose X_1 and X_2 are vectors in R^n. A linear transformation is called one to one (often written as 1 − 1) if whenever X_1 ≠ X_2 it follows that
\[ T(X_1) \neq T(X_2) \]
Equivalently, if T(X_1) = T(X_2), then X_1 = X_2. Thus, T is 1 − 1 if it never takes two different vectors to the same vector.
A linear transformation mapping R^n to R^m is called onto if whenever X_2 ∈ R^m there exists X_1 ∈ R^n such that T(X_1) = X_2.
In the next section, we look at the concepts of one to one and onto in more detail.
2.2.1. Matrices Which Are One To One Or Onto
Since every linear transformation can be written as matrix multiplication, it is common to refer to the matrix as onto or one to one when the linear transformation it determines is onto or one to one.
If T is obtained from multiplication by an m × n matrix A, we can denote the range of T by A(R^n), AR^n, or Im(A). In this case, we are talking about the vectors in R^m which are obtained in the form AX for some X ∈ R^n. We may also call this the range or image of A.
Lemma 2.61: The Form of AX
Let A be an m × n matrix where A_1, ..., A_n denote the columns of A. Then, for a vector X = [X_1 ··· X_n]^T in R^n,
\[ AX = \sum_{k=1}^{n} X_k A_k \]
Therefore, A(R^n) is the collection of all linear combinations of the columns of A.

Proof: This follows from the definition of matrix multiplication in Definition 2.13.
The following proposition is an important result.
Proposition 2.62: One to One Matrices
Let A be an m × n matrix. Then A is one to one if and only if AX = 0 implies X = 0.

Proof: We need to prove two things here. First, we will prove that if A is one to one, then AX = 0 implies that X = 0. Second, we will show that if AX = 0 implies that X = 0, then it follows that A is one to one.
First note that A0 = A(0 + 0) = A0 + A0, and so A0 = 0. Now suppose A is one to one and AX = 0. We need to show that this implies X = 0. Since A0 = 0, and A is one to one, by Definition 2.60 A can only map one vector to the zero vector 0. Since AX = 0 and A0 = 0, it follows that X = 0. Thus if A is one to one and AX = 0, then X = 0.
Next assume that AX = 0 implies X = 0. We need to show that A is one to one. Suppose AX = AY. Then AX − AY = 0. Hence AX − AY = A(X − Y) = 0. However, we have assumed that AX = 0 implies X = 0. This means that whenever A times a vector equals 0, that vector is also equal to 0. Therefore, X − Y = 0 and so X = Y. Thus A is one to one by Definition 2.60.
Note that this proposition says that if A = [A_1 ··· A_n], then A is one to one if and only if whenever
\[ 0 = \sum_{k=1}^{n} c_k A_k \]
it follows that each scalar c_k = 0.
We will now take a look at an example of a one to one and onto linear transformation.
Example 2.63: A One to One and Onto Linear Transformation
Suppose
\[ T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]
Then, T : R^2 → R^2 is a linear transformation. Is T onto? Is it one to one?

Solution. Recall that because T can be expressed as matrix multiplication, we know that T is a linear transformation. We will start by looking at onto. So suppose \( \begin{bmatrix} a \\ b \end{bmatrix} \in R^2 \). Does there exist \( \begin{bmatrix} x \\ y \end{bmatrix} \in R^2 \) such that \( T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} \)? If so, then since \( \begin{bmatrix} a \\ b \end{bmatrix} \) is an arbitrary vector in R^2, it will follow that T is onto.
This question is familiar to you. It is asking whether there is a solution to the equation
\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} \]
This is the same thing as asking for a solution to the following system of equations.
x + y = a
x + 2y = b
Set up the augmented matrix and row reduce.
\[ \begin{bmatrix} 1 & 1 & | & a \\ 1 & 2 & | & b \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 0 & | & 2a - b \\ 0 & 1 & | & b - a \end{bmatrix} \quad (2.15) \]
You can see from this point that the system has a solution. Therefore, we have shown that for any a, b, there is a \( \begin{bmatrix} x \\ y \end{bmatrix} \) such that \( T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} \). Thus T is onto.
Now we want to know if T is one to one. By Proposition 2.62 it is enough to show that AX = 0 implies X = 0. Consider the system
\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
This is the same as the system given by
x + y = 0
x + 2y = 0
We need to show that the solution to this system is x = 0 and y = 0. By setting up the augmented matrix and row reducing, we end up with
\[ \begin{bmatrix} 1 & 0 & | & 0 \\ 0 & 1 & | & 0 \end{bmatrix} \]
This tells us that x = 0 and y = 0. Returning to the original system, this says that if
\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad \text{then} \quad \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
In other words, AX = 0 implies that X = 0. By Proposition 2.62, A is one to one, and so T is also one to one.
We also could have seen that T is one to one from our above solution for onto. By looking at the matrix given by 2.15, you can see that there is a unique solution given by x = 2a − b and y = b − a. Therefore, there is only one vector, specifically \( \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2a - b \\ b - a \end{bmatrix} \), such that \( T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} \). Hence by Definition 2.60, T is one to one.
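Both checks in this example reduce to a single rank computation, a point taken up again in Exercises 76-78 below. The snippet that follows is our own numerical sketch of that idea, not part of the text.

```python
import numpy as np

A = np.array([[1, 1],
              [1, 2]])
m, n = A.shape
rank = np.linalg.matrix_rank(A)

print(rank == n)  # True: AX = 0 forces X = 0, so T is one to one
print(rank == m)  # True: AX = B is solvable for every B, so T is onto
```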
We have now examined some specific linear transformations. In the next example, we take a more general look.
Example 2.64: The General Form of a Linear Transformation
Suppose
\[ T \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix} = \begin{bmatrix} a_{11} X_1 + \cdots + a_{1n} X_n \\ \vdots \\ a_{m1} X_1 + \cdots + a_{mn} X_n \end{bmatrix} \]
Is T a linear transformation?

Solution. Recall the discussion earlier in this section, when we mentioned that any transformation which can be written as matrix multiplication is in fact a linear transformation. Therefore, this question becomes: can we write this transformation as
\[ T(X) = AX \]
where A is an m × n matrix and X ∈ R^n?
In fact, the right side of the above equation can be written as
\[ \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix} = AX \]
Therefore, T is linear.
The next section is devoted to looking at the matrix A of such transformations in more detail.
2.3 The Matrix Of A Linear Transformation
In the above examples, the action of the linear transformations was to multiply by a matrix. We also discussed that this is always the case for linear transformations. If T is any linear transformation which maps R^n to R^m, there is always an m × n matrix A with the property that
\[ T(X) = AX \quad (2.16) \]
for all X ∈ R^n.
Here is why. Suppose T : R^n → R^m is a linear transformation and you want to find the matrix defined by this linear transformation as described in 2.16. Note that
\[ X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix} = X_1 \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + X_2 \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \cdots + X_n \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} = \sum_{i=1}^{n} X_i E_i \]
where E_i is the i-th column of I_n, that is, the n × 1 vector which has zeros in every slot but the i-th, and a 1 in this slot.
Then since T is linear,
\[ T(X) = \sum_{i=1}^{n} X_i T(E_i) = \begin{bmatrix} | & & | \\ T(E_1) & \cdots & T(E_n) \\ | & & | \end{bmatrix} \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix} = A \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix} \]
Therefore, the desired matrix is obtained by constructing the i-th column as T(E_i). We state this formally as the following theorem.
Theorem 2.65: Matrix of a Linear Transformation
Let T be a linear transformation from R^n to R^m. Then the matrix A satisfying T(X) = AX is given by
\[ A = \begin{bmatrix} | & & | \\ T(E_1) & \cdots & T(E_n) \\ | & & | \end{bmatrix} \]
where E_i is the i-th column of I_n, and then T(E_i) is the i-th column of A.

Consider the following example.
Example 2.66: The Matrix of a Linear Transformation
Suppose T is a linear transformation, T : R^3 → R^2 where
\[ T \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad T \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 9 \\ -3 \end{bmatrix}, \quad T \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
Find the matrix A of T such that T(X) = AX for all X.

Solution. Recall that we can construct A as follows:
\[ A = \begin{bmatrix} | & & | \\ T(E_1) & \cdots & T(E_n) \\ | & & | \end{bmatrix} \]
In this case, A will be a 2 × 3 matrix, so we need to find T(E_1), T(E_2), and T(E_3). Luckily, we have been given these values, so we can fill in A as needed, using these vectors as the columns of A. Hence,
\[ A = \begin{bmatrix} 1 & 9 & 1 \\ 2 & -3 & 1 \end{bmatrix} \]
In this example, we were given the resulting vectors of T(E_1), T(E_2), and T(E_3). Constructing the matrix A was simple, as we could simply use these vectors as the columns of A.
The next example shows how to find A when we are not given T(E_1) and T(E_2) so clearly.
Example 2.67: The Matrix of a Linear Transformation: Inconveniently Defined
Suppose T is known to be a linear transformation, T : R^2 → R^2 and
\[ T \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad T \begin{bmatrix} 0 \\ -1 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix} \]
Find the matrix A of the transformation T.

Solution. By Theorem 2.65, to find this matrix we need to determine the action of T on E_1 and E_2. In Example 2.66, we were given these resulting vectors. However, in this example, we have been given T of two different vectors. How can we find out the action of T on E_1 and E_2? In particular for E_1, suppose there exist x and y such that
\[ \begin{bmatrix} 1 \\ 0 \end{bmatrix} = x \begin{bmatrix} 1 \\ 1 \end{bmatrix} + y \begin{bmatrix} 0 \\ -1 \end{bmatrix} \quad (2.17) \]
Then, since T is linear,
\[ T \begin{bmatrix} 1 \\ 0 \end{bmatrix} = x T \begin{bmatrix} 1 \\ 1 \end{bmatrix} + y T \begin{bmatrix} 0 \\ -1 \end{bmatrix} \]
Substituting in values, this sum becomes
\[ T \begin{bmatrix} 1 \\ 0 \end{bmatrix} = x \begin{bmatrix} 1 \\ 2 \end{bmatrix} + y \begin{bmatrix} 3 \\ 2 \end{bmatrix} \quad (2.18) \]
Therefore, if we know the values of x and y, we can substitute these into equation 2.18. By doing so, we find T(E_1), which is the first column of the matrix A.
We proceed to find x and y. We do so by solving 2.17, which can be done by solving the system
x = 1
x − y = 0
We see that x = 1 and y = 1 is the solution to this system. Substituting these values into equation 2.18, we have
\[ T \begin{bmatrix} 1 \\ 0 \end{bmatrix} = 1 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + 1 \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} \]
Therefore \( \begin{bmatrix} 4 \\ 4 \end{bmatrix} \) is the first column of A.
Computing the second column is done in the same way, and is left as an exercise.
The resulting matrix A is given by
\[ A = \begin{bmatrix} 4 & -3 \\ 4 & -2 \end{bmatrix} \]
This example illustrates a rather long procedure for finding the matrix A. While this method is reliable and will always result in the correct matrix A, the following procedure provides an alternative method.
Recall that [A_1 ··· A_n] denotes a matrix which has for its i-th column the vector A_i.
Procedure 2.68: Finding the Matrix of an Inconveniently Defined Linear Transformation
Suppose T : R^n → R^m is a linear transformation. Suppose there exist vectors A_1, ..., A_n in R^n such that [A_1 ··· A_n]^{-1} exists, and
\[ T(A_i) = B_i \]
Then the matrix of T must be of the form
\[ [B_1 \cdots B_n][A_1 \cdots A_n]^{-1} \]

We will illustrate this procedure in the following example. You may also find it useful to work through Example 2.67 using this procedure.
Example 2.69: Matrix of a Linear Transformation Given Inconveniently
Suppose T : R^3 → R^3 is a linear transformation and
\[ T \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad T \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}, \quad T \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]
Find the matrix of this linear transformation.

Solution. By Procedure 2.68, \( A = \begin{bmatrix} 1 & 0 & 1 \\ 3 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix} \) and \( B = \begin{bmatrix} 0 & 2 & 0 \\ 1 & 1 & 0 \\ 1 & 3 & 1 \end{bmatrix} \).
Then, Procedure 2.68 claims that the matrix of T is
\[ C = BA^{-1} = \begin{bmatrix} 2 & -2 & 4 \\ 0 & 0 & 1 \\ 4 & -3 & 6 \end{bmatrix} \]
Indeed you can first verify that T(X) = CX for the 3 vectors above:
\[ \begin{bmatrix} 2 & -2 & 4 \\ 0 & 0 & 1 \\ 4 & -3 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 2 & -2 & 4 \\ 0 & 0 & 1 \\ 4 & -3 & 6 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}, \quad \begin{bmatrix} 2 & -2 & 4 \\ 0 & 0 & 1 \\ 4 & -3 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]
But more generally T(X) = CX for any X. To see this, let Y = A^{-1}X and then, using linearity of T:
\[ T(X) = T(AY) = T\Big(\sum_i Y_i A_i\Big) = \sum_i Y_i T(A_i) = \sum_i Y_i B_i = BY = BA^{-1}X = CX \]
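Procedure 2.68 is one matrix inversion and one multiplication, so it is easy to check numerically. A short sketch of ours for the data of Example 2.69:

```python
import numpy as np

A = np.array([[1, 0, 1],
              [3, 1, 1],
              [1, 1, 0]], dtype=float)  # the given input vectors, as columns
B = np.array([[0, 2, 0],
              [1, 1, 0],
              [1, 3, 1]], dtype=float)  # the corresponding outputs, as columns

C = B @ np.linalg.inv(A)        # the matrix of T
print(np.round(C))              # [[ 2. -2.  4.] [ 0.  0.  1.] [ 4. -3.  6.]]
print(C @ np.array([1, 3, 1]))  # [0. 1. 1.], as required
```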
2.3.1. The General Solution Of A Linear System
Recall the definition of a linear transformation discussed above. T is a linear transformation if whenever X, Y are vectors and k, p are scalars,
\[ T(kX + pY) = kT(X) + pT(Y) \]
Thus linear transformations distribute across addition and pass scalars to the outside.
It turns out that we can use linear transformations to solve linear systems of equations. Indeed, given a system of linear equations of the form AX = B, one may rephrase this as T(X) = B where T is the linear transformation T_A induced by the coefficient matrix A. With this in mind consider the following definition.
Definition 2.70: Particular Solution of a System of Equations
Suppose a linear system of equations can be written in the form
\[ T(X) = B \]
If T(X_p) = B, then X_p is called a particular solution of the linear system.

Recall that a system is called homogeneous if every equation in the system is equal to 0. Suppose we represent a homogeneous system of equations by T(X) = 0. It turns out that the X for which T(X) = 0 are part of a special set called the null space of T. We may also refer to the null space as the kernel of T, and we write ker(T).
Consider the following definition.
Definition 2.71: Null Space or Kernel of a Linear Transformation
Let T be a linear transformation. Define
\[ \ker(T) = \{ X : T(X) = 0 \} \]
The kernel, ker(T), consists of the set of all vectors which T sends to 0. This is also called the null space of T.

We may also refer to the kernel of T as the solution space of the equation T(X) = 0.
Consider the following familiar example.
Example 2.72: The Kernel of the Derivative
Let d/dx denote the linear transformation defined on f, the functions which are defined on R and have a continuous derivative. Find ker(d/dx).

Solution. The example asks for the functions f with the property that df/dx = 0. As you know from calculus, these functions are the constant functions. Thus ker(d/dx) is the set of constant functions.
Definition 2.71 states that ker(T) is the set of solutions to the equation
\[ T(X) = 0 \]
Since we can write T(X) as AX, you have been solving such equations for quite some time.
We have spent a lot of time finding solutions to systems of equations in general, as well as homogeneous systems. Suppose we look at a system given by AX = B, and consider the related homogeneous system. By this, we mean that we replace B by 0 and look at AX = 0. It turns out that there is a very important relationship between the solutions of the original system and the solutions of the associated homogeneous system. In the following theorem, we use linear transformations to denote a system of equations. Remember that T(X) = AX.
Theorem 2.73: Particular Solution and General Solution
Suppose X_p is a solution to the linear system given by
\[ T(X) = B \]
Then if Y is any other solution to T(X) = B, there exists X ∈ ker(T) such that
\[ Y = X_p + X \]
Hence, every solution to the linear system can be written as a sum of a particular solution, X_p, and a solution X to the associated homogeneous system given by T(X) = 0.

Proof: Consider Y − X_p = Y + (−1)X_p. Then T(Y − X_p) = T(Y) − T(X_p). Since Y and X_p are both solutions to the system, it follows that T(Y) = B and T(X_p) = B. Hence, T(Y) − T(X_p) = B − B = 0. Let X = Y − X_p. Then, T(X) = 0, so X is a solution to the associated homogeneous system and so is in ker(T).
Sometimes people remember the above theorem in the following form. The solutions to the system T(X) = B are given by X_p + ker(T) where X_p is a particular solution to T(X) = B.
For now, we have been speaking about the kernel or null space of a linear transformation T. However, we know that every linear transformation T is determined by some matrix A. Therefore, we can also speak about the null space of a matrix. Consider the following example.
Example 2.74: The Null Space of a Matrix
Let
\[ A = \begin{bmatrix} 1 & 2 & 3 & 0 \\ 2 & 1 & 1 & 2 \\ 4 & 5 & 7 & 2 \end{bmatrix} \]
Find ker(A). Equivalently, find the solutions to the system of equations AX = 0.

Solution. We are asked to find {X : AX = 0}. In other words, we want to solve the system AX = 0. Let X = [x y z w]^T. Then this amounts to solving
\[ \begin{bmatrix} 1 & 2 & 3 & 0 \\ 2 & 1 & 1 & 2 \\ 4 & 5 & 7 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]
This is the linear system
x + 2y + 3z = 0
2x + y + z + 2w = 0
4x + 5y + 7z + 2w = 0
Set up the augmented matrix
\[ \begin{bmatrix} 1 & 2 & 3 & 0 & | & 0 \\ 2 & 1 & 1 & 2 & | & 0 \\ 4 & 5 & 7 & 2 & | & 0 \end{bmatrix} \]
Then row reduce to obtain the reduced row-echelon form,
\[ \begin{bmatrix} 1 & 0 & -\frac{1}{3} & \frac{4}{3} & | & 0 \\ 0 & 1 & \frac{5}{3} & -\frac{2}{3} & | & 0 \\ 0 & 0 & 0 & 0 & | & 0 \end{bmatrix} \]
This yields x = (1/3)z − (4/3)w and y = (2/3)w − (5/3)z. Since ker(A) consists of the solutions to this system, it consists of vectors of the form
\[ \begin{bmatrix} \frac{1}{3}z - \frac{4}{3}w \\ \frac{2}{3}w - \frac{5}{3}z \\ z \\ w \end{bmatrix} = z \begin{bmatrix} \frac{1}{3} \\ -\frac{5}{3} \\ 1 \\ 0 \end{bmatrix} + w \begin{bmatrix} -\frac{4}{3} \\ \frac{2}{3} \\ 0 \\ 1 \end{bmatrix} \]
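Computer algebra systems return the same basis for the kernel. For instance, with SymPy (a library choice of ours, not the text's):

```python
from sympy import Matrix

A = Matrix([[1, 2, 3, 0],
            [2, 1, 1, 2],
            [4, 5, 7, 2]])

# A basis for ker(A); these are exactly the two vectors found above
for v in A.nullspace():
    print(v.T)   # Matrix([[1/3, -5/3, 1, 0]]) and Matrix([[-4/3, 2/3, 0, 1]])
```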
Consider the following example.
Example 2.75: A General Solution
The general solution of a linear system of equations is the set of all possible solutions. Find the general solution to the linear system
\[ \begin{bmatrix} 1 & 2 & 3 & 0 \\ 2 & 1 & 1 & 2 \\ 4 & 5 & 7 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 9 \\ 7 \\ 25 \end{bmatrix} \]
given that \( \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 2 \\ 1 \end{bmatrix} \) is one solution.

Solution. Note the matrix of this system is the same as the matrix in Example 2.74. Therefore, from Theorem 2.73, you will obtain all solutions to the above linear system by adding a particular solution X_p to the solutions of the associated homogeneous system, X. One particular solution is given in this example, by
\[ X_p = \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 2 \\ 1 \end{bmatrix} \quad (2.19) \]
Using this particular solution along with the solutions found in Example 2.74, we obtain the following solutions,
\[ z \begin{bmatrix} \frac{1}{3} \\ -\frac{5}{3} \\ 1 \\ 0 \end{bmatrix} + w \begin{bmatrix} -\frac{4}{3} \\ \frac{2}{3} \\ 0 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ 2 \\ 1 \end{bmatrix} \]
Hence, any solution to the above linear system is of this form.
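Theorem 2.73 can be spot-checked here: adding any kernel vector to the particular solution must leave AX = B satisfied. A brief sketch of ours, again in SymPy:

```python
from sympy import Matrix

A = Matrix([[1, 2, 3, 0],
            [2, 1, 1, 2],
            [4, 5, 7, 2]])
B = Matrix([9, 7, 25])
Xp = Matrix([1, 1, 2, 1])         # the particular solution given above

print(A * Xp == B)                # True
for v in A.nullspace():           # shifting by kernel vectors changes nothing
    print(A * (Xp + 3 * v) == B)  # True, True
```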
2.4 Exercises
1. Consider the matrices \( A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 7 \end{bmatrix} \), \( B = \begin{bmatrix} 3 & 1 & 2 \\ 3 & 2 & 1 \end{bmatrix} \), \( C = \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} \), \( D = \begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix} \), \( E = \begin{bmatrix} 2 \\ 3 \end{bmatrix} \).
Find, if possible, 3A, 3B − A, AC, CB, AE, EA. If it is not possible explain why.
2. Consider the matrices \( A = \begin{bmatrix} 1 & 2 \\ 3 & 2 \\ 1 & 1 \end{bmatrix} \), \( B = \begin{bmatrix} 2 & 5 & 2 \\ 3 & 2 & 1 \end{bmatrix} \), \( C = \begin{bmatrix} 1 & 2 \\ 5 & 0 \end{bmatrix} \), \( D = \begin{bmatrix} 1 & 1 \\ 4 & 3 \end{bmatrix} \), \( E = \begin{bmatrix} 1 \\ 3 \end{bmatrix} \).
Find, if possible, 3A, 3B − A, AC, CA, AE, EA, BE, DE. If it is not possible explain why.
3. Consider the matrices \( A = \begin{bmatrix} 1 & 2 \\ 3 & 2 \\ 1 & 1 \end{bmatrix} \), \( B = \begin{bmatrix} 2 & 5 & 2 \\ 3 & 2 & 1 \end{bmatrix} \), \( C = \begin{bmatrix} 1 & 2 \\ 5 & 0 \end{bmatrix} \), \( D = \begin{bmatrix} 1 & 1 \\ 4 & 3 \end{bmatrix} \), \( E = \begin{bmatrix} 1 \\ 3 \end{bmatrix} \).
Find, if possible, 3A^T, 3B − A^T, AC, CA, AE, E^T B, BE, DE, EE^T, E^T E. If it is not possible explain why.
4. Consider the matrices \( A = \begin{bmatrix} 1 & 2 \\ 3 & 2 \\ 1 & 1 \end{bmatrix} \), \( B = \begin{bmatrix} 2 & 5 & 2 \\ 3 & 2 & 1 \end{bmatrix} \), \( C = \begin{bmatrix} 1 & 2 \\ 5 & 0 \end{bmatrix} \), \( D = \begin{bmatrix} 1 \\ 4 \end{bmatrix} \), \( E = \begin{bmatrix} 1 \\ 3 \end{bmatrix} \).
Find, if possible, AD, DA, D^T B, D^T BE, E^T D, DE^T. If it is not possible, explain why.
5. Let \( A = \begin{bmatrix} 1 & 1 \\ 2 & 1 \\ 1 & 2 \end{bmatrix} \), \( B = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 1 & 2 \end{bmatrix} \), and \( C = \begin{bmatrix} 1 & 1 & 3 \\ 1 & 2 & 0 \\ 3 & 1 & 0 \end{bmatrix} \). Find the following if possible.
(a) AB
(b) BA
(c) AC
(d) CA
(e) CB
(f) BC
6. Suppose A and B are square matrices of the same size. Which of the following are necessarily true?
(a) (A − B)^2 = A^2 − 2AB + B^2
(b) (AB)^2 = A^2 B^2
(c) (A + B)^2 = A^2 + 2AB + B^2
(d) (A + B)^2 = A^2 + AB + BA + B^2
(e) A^2 B^2 = A(AB)B
(f) (A + B)^3 = A^3 + 3A^2 B + 3AB^2 + B^3
(g) (A + B)(A − B) = A^2 − B^2
7. Let \( A = \begin{bmatrix} 1 & 1 \\ 3 & 3 \end{bmatrix} \). Find all 2 × 2 matrices B such that AB = 0.
8. Let X = [1 1 1] and Y = [0 1 2]. Find X^T Y and XY^T if possible.
9. Let \( A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \), \( B = \begin{bmatrix} 1 & 2 \\ 3 & k \end{bmatrix} \). Is it possible to choose k such that AB = BA? If so, what should k equal?
10. Let \( A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \), \( B = \begin{bmatrix} 1 & 2 \\ 1 & k \end{bmatrix} \). Is it possible to choose k such that AB = BA? If so, what should k equal?
11. In the context of Proposition 2.7, describe −A and 0.
12. Let A be an n × n matrix. Show A equals the sum of a symmetric and a skew symmetric matrix. Hint: Show that (1/2)(A^T + A) is symmetric and then consider using this as one of the matrices.
13. Show that the main diagonal of every skew symmetric matrix consists of only zeros. Recall that the main diagonal consists of every entry of the matrix which is of the form a_{ii}.
14. Using only the properties given in Proposition 2.7 and Proposition 2.10, show −A is unique.
15. Using only the properties given in Proposition 2.7 and Proposition 2.10, show 0 is unique.
16. Using only the properties given in Proposition 2.7 and Proposition 2.10, show 0A = 0. Here the 0 on the left is the scalar 0 and the 0 on the right is the zero matrix of appropriate size.
17. Using only the properties given in Proposition 2.7 and Proposition 2.10, as well as previous problems, show (−1)A = −A.
18. Prove 2.11.
19. Prove that I_m A = A where A is an m × n matrix.
20. Give an example of matrices A, B, C such that B ≠ C, A ≠ 0, and yet AB = AC.
21. Suppose AB = AC and A is an invertible n × n matrix. Does it follow that B = C? Explain why or why not. What if A were a non invertible n × n matrix?
22. Find 2 × 2 matrices A and B such that A ≠ 0, B ≠ 0, with AB ≠ BA.
23. Find 2 × 2 matrices A and B such that A ≠ 0, B ≠ 0, but AB = 0.
24. Find 2 × 2 matrices A, D, and C such that A ≠ 0, C ≠ D, but AC = AD.
25. Explain why if AB = AC and A^{-1} exists, then B = C.
26. Give an example of a matrix A such that A^2 = I and yet A ≠ I and A ≠ −I.
27. Give an example of matrices A, B such that A ≠ 0 and B ≠ 0 but AB = 0.
28. Find square matrices A and B such that AB ≠ BA.
29. Let
\[ A = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix} \]
Find A^{-1} if possible. If A^{-1} does not exist, determine why.
30. Let
\[ A = \begin{bmatrix} 0 & 1 \\ 5 & 3 \end{bmatrix} \]
Find A^{-1} if possible. If A^{-1} does not exist, determine why.
31. Let
\[ A = \begin{bmatrix} 2 & 1 \\ 3 & 0 \end{bmatrix} \]
Find A^{-1} if possible. If A^{-1} does not exist, determine why.
32. Let
\[ A = \begin{bmatrix} 2 & 1 \\ 4 & 2 \end{bmatrix} \]
Find A^{-1} if possible. If A^{-1} does not exist, determine why.
33. Let A be a 2 × 2 invertible matrix, with \( A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \). Find a formula for A^{-1} in terms of a, b, c, d.
34. Let
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 4 \\ 1 & 0 & 2 \end{bmatrix} \]
Find A^{-1} if possible. If A^{-1} does not exist, determine why.
35. Let
\[ A = \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \]
Find A^{-1} if possible. If A^{-1} does not exist, determine why.
36. Let
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 4 \\ 4 & 5 & 10 \end{bmatrix} \]
Find A^{-1} if possible. If A^{-1} does not exist, determine why.
37. Let
\[ A = \begin{bmatrix} 1 & 2 & 0 & 2 \\ 1 & 1 & 2 & 0 \\ 2 & 1 & 3 & 2 \\ 1 & 2 & 1 & 2 \end{bmatrix} \]
Find A^{-1} if possible. If A^{-1} does not exist, determine why.
38. Write the system
x_1 − x_2 + 2x_3
2x_3 + x_1
3x_3
3x_4 + 3x_2 + x_1
in the form \( A \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \) where A is an appropriate matrix.
39. Write the system
x_1 + 3x_2 + 2x_3
2x_3 + x_1
6x_3
x_4 + 3x_2 + x_1
in the form \( A \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \) where A is an appropriate matrix.
40. Write the system
x_1 + x_2 + x_3
2x_3 + x_1 + x_2
x_3 − x_1
3x_4 + x_1
in the form \( A \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \) where A is an appropriate matrix.
41. Using the inverse of the matrix, find the solution to the systems
(a) \[ \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]
(b) \[ \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} \]
(c) \[ \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \]
(d) \[ \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} \]
Now give the solution in terms of a, b, and c to
\[ \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \]
42. Using the inverse of the matrix, find the solution to the systems
(a) \[ \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]
(b) \[ \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} \]
(c) \[ \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \]
(d) \[ \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} \]
Now give the solution in terms of a, b, and c to
\[ \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \]
43. Using the inverse of the matrix, find the solution to the system
\[ \begin{bmatrix} 1 & \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ 3 & \frac{1}{2} & -\frac{1}{2} & -\frac{5}{2} \\ 1 & 0 & 0 & 1 \\ 2 & \frac{3}{4} & \frac{1}{4} & \frac{9}{4} \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} \]
44. Show that if A is an n × n invertible matrix and X is an n × 1 matrix such that AX = B for B an n × 1 matrix, then X = A^{-1}B.
45. Prove that if A^{-1} exists and AX = 0 then X = 0.
46. Show that if A^{-1} exists for an n × n matrix, then it is unique. That is, if BA = I and AB = I, then B = A^{-1}.
47. Show that if A is an invertible n × n matrix, then so is A^T and (A^T)^{-1} = (A^{-1})^T.
48. Show (AB)^{-1} = B^{-1}A^{-1} by verifying that
\[ AB(B^{-1}A^{-1}) = I \quad \text{and} \quad B^{-1}A^{-1}(AB) = I \]
Hint: Use Problem 46.
49. Show that (ABC)^{-1} = C^{-1}B^{-1}A^{-1} by verifying that (ABC)(C^{-1}B^{-1}A^{-1}) = I and (C^{-1}B^{-1}A^{-1})(ABC) = I. Hint: Use Problem 46.
50. If A is invertible, show (A^2)^{-1} = (A^{-1})^2. Hint: Use Problem 46.
51. If A is invertible, show (A^{-1})^{-1} = A. Hint: Use Problem 46.
52. A matrix A is called idempotent if A^2 = A. Let
\[ A = \begin{bmatrix} 2 & 0 & 2 \\ 1 & 1 & 2 \\ -1 & 0 & -1 \end{bmatrix} \]
and show that A is idempotent.
Define the column space of A as all linear combinations of the columns. Thus the column space for this matrix A consists of all vectors which are of the form
\[ x \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix} + y \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + z \begin{bmatrix} 2 \\ 2 \\ -1 \end{bmatrix} \]
Show that a vector in the column space of an idempotent matrix is left unchanged by multiplication by A.
53. Show the map T : R^n → R^m defined by T(X) = AX, where A is an m × n matrix and X is an n × 1 column vector, is a linear transformation.
54. Consider the following functions which map R^n to R^n.
(a) T multiplies the j-th component of X by a nonzero number b.
(b) T replaces the i-th component of X with b times the j-th component added to the i-th component.
(c) T switches two components.
Show these functions are linear transformations and describe their matrices A such that T(X) = AX.
55. Write the solution set of the following system as a linear combination of vectors
\[ \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 1 \\ 3 & 4 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]
56. Using Problem 55 find the general solution to the following linear system.
\[ \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 1 \\ 3 & 4 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix} \]
57. Write the solution set of the following system as a linear combination of vectors
\[ \begin{bmatrix} 0 & 1 & 2 \\ 1 & 2 & 1 \\ 1 & 4 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]
58. Using Problem 57 find the general solution to the following linear system.
\[ \begin{bmatrix} 0 & 1 & 2 \\ 1 & 2 & 1 \\ 1 & 4 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \]
59. Write the solution set of the following system as a linear combination of vectors.
\[ \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 0 \\ 3 & 4 & 4 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]
60. Using Problem 59 find the general solution to the following linear system.
\[ \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 0 \\ 3 & 4 & 4 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix} \]
61. Write the solution set of the following system as a linear combination of vectors
\[ \begin{bmatrix} 0 & 1 & 2 \\ 1 & 0 & 1 \\ 1 & 2 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]
62. Using Problem 61 find the general solution to the following linear system.
\[ \begin{bmatrix} 0 & 1 & 2 \\ 1 & 0 & 1 \\ 1 & 2 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \]
63. Write the solution set of the following system as a linear combination of vectors
\[ \begin{bmatrix} 1 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 \\ 3 & 1 & 3 & 2 \\ 3 & 3 & 0 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]
64. Using Problem 63 find the general solution to the following linear system.
\[ \begin{bmatrix} 1 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 \\ 3 & 1 & 3 & 2 \\ 3 & 3 & 0 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \\ 3 \end{bmatrix} \]
65. Write the solution set of the following system as a linear combination of vectors
\[ \begin{bmatrix} 1 & 1 & 0 & 1 \\ 2 & 1 & 1 & 2 \\ 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]
66. Using Problem 65 find the general solution to the following linear system.
\[ \begin{bmatrix} 1 & 1 & 0 & 1 \\ 2 & 1 & 1 & 2 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 3 \\ 0 \end{bmatrix} \]
67. Write the solution set of the following system as a linear combination of vectors
\[ \begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 3 & 1 & 1 & 2 \\ 3 & 3 & 0 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]
68. Using Problem 67 find the general solution to the following linear system.
\[ \begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 3 & 1 & 1 & 2 \\ 3 & 3 & 0 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \\ 3 \end{bmatrix} \]
69. Write the solution set of the following system as a linear combination of vectors
\[ \begin{bmatrix} 1 & 1 & 0 & 1 \\ 2 & 1 & 1 & 2 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]
70. Using Problem 69 find the general solution to the following linear system.
\[ \begin{bmatrix} 1 & 1 & 0 & 1 \\ 2 & 1 & 1 & 2 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 3 \\ 1 \end{bmatrix} \]
71. Find ker(A) for
\[ A = \begin{bmatrix} 1 & 2 & 3 & 2 & 1 \\ 0 & 2 & 1 & 1 & 2 \\ 1 & 4 & 4 & 3 & 3 \\ 0 & 2 & 1 & 1 & 2 \end{bmatrix} \]
Recall ker(A) is the set of solutions to the system AX = 0.
72. Using Problem 71, find the general solution to the following linear system.
\[ \begin{bmatrix} 1 & 2 & 3 & 2 & 1 \\ 0 & 2 & 1 & 1 & 2 \\ 1 & 4 & 4 & 3 & 3 \\ 0 & 2 & 1 & 1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = \begin{bmatrix} 11 \\ 7 \\ 18 \\ 7 \end{bmatrix} \]
73. Using Problem 71, find the general solution to the following linear system.
\[ \begin{bmatrix} 1 & 2 & 3 & 2 & 1 \\ 0 & 2 & 1 & 1 & 2 \\ 1 & 4 & 4 & 3 & 3 \\ 0 & 2 & 1 & 1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = \begin{bmatrix} 6 \\ 7 \\ 13 \\ 7 \end{bmatrix} \]
74. Suppose AX = B has a solution. Explain why the solution is unique precisely when AX = 0 has only the trivial solution.
75. Give an example of a 3 × 2 matrix with the property that the linear transformation determined by this matrix is one to one but not onto.
76. Suppose A is an m × n matrix in which m ≤ n. Suppose also that the rank of A equals m. Show that the transformation T determined by A maps R^n onto R^m. Hint: The vectors E_1, ..., E_m occur as columns in the reduced row-echelon form for A.
77. Suppose A is an m × n matrix in which m ≥ n. Suppose also that the rank of A equals n. Show that A is one to one. Hint: If not, there exists a vector X such that AX = 0, and this implies at least one column of A is a linear combination of the others. Show this would require the rank to be less than n.
78. Explain why an n × n matrix A is both one to one and onto if and only if its rank is n.
79. You are given a linear transformation T : R^n → R^m and you know that
\[ T(A_i) = B_i \]
where [A_1 ··· A_n]^{-1} exists. Show that the matrix of T is of the form
\[ [B_1 \cdots B_n][A_1 \cdots A_n]^{-1} \]
80. Suppose T is a linear transformation such that
\[ T \begin{bmatrix} 1 \\ 2 \\ 6 \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \\ 3 \end{bmatrix}, \quad T \begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix}, \quad T \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 5 \\ 3 \\ 2 \end{bmatrix} \]
Find the matrix of T. That is, find A such that T(X) = AX.
81. Suppose T is a linear transformation such that
\[ T \begin{bmatrix} 1 \\ 1 \\ 8 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}, \quad T \begin{bmatrix} 1 \\ 0 \\ 6 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \\ 1 \end{bmatrix}, \quad T \begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 6 \\ 1 \\ 1 \end{bmatrix} \]
Find the matrix of T. That is, find A such that T(X) = AX.
82. Suppose T is a linear transformation such that
\[ T \begin{bmatrix} 1 \\ 3 \\ 7 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix}, \quad T \begin{bmatrix} 1 \\ 2 \\ 6 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 3 \end{bmatrix}, \quad T \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 5 \\ 3 \\ 3 \end{bmatrix} \]
Find the matrix of T. That is, find A such that T(X) = AX.
83. Suppose T is a linear transformation such that
\[ T \begin{bmatrix} 1 \\ 1 \\ 7 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix}, \quad T \begin{bmatrix} 1 \\ 0 \\ 6 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad T \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} \]
Find the matrix of T. That is, find A such that T(X) = AX.
84. Suppose T is a linear transformation such that
\[ T \begin{bmatrix} 1 \\ 2 \\ 18 \end{bmatrix} = \begin{bmatrix} 5 \\ 2 \\ 5 \end{bmatrix}, \quad T \begin{bmatrix} 1 \\ 1 \\ 15 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \\ 5 \end{bmatrix}, \quad T \begin{bmatrix} 0 \\ 1 \\ 4 \end{bmatrix} = \begin{bmatrix} 2 \\ 5 \\ 2 \end{bmatrix} \]
Find the matrix of T. That is, find A such that T(X) = AX.
85. Consider the following functions T : R^3 → R^2. Show that each is a linear transformation and determine for each the matrix A such that T(X) = AX.
(a) \( T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 2y - 3x + z \end{bmatrix} \)
(b) \( T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 7x + 2y + z \\ 3x - 11y + 2z \end{bmatrix} \)
(c) \( T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3x + 2y + z \\ x + 2y + 6z \end{bmatrix} \)
(d) \( T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2y - 5x + z \\ x + y + z \end{bmatrix} \)
86. Show that if a function T : R^n → R^m is linear, then it is always the case that T(0) = 0.
87. Consider the following functions T : R^3 → R^2. Explain why each of these functions T is not linear.
(a) \( T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z + 1 \\ 2y - 3x + z \end{bmatrix} \)
(b) \( T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y^2 + 3z \\ 2y + 3x + z \end{bmatrix} \)
(c) \( T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} \sin x + 2y + 3z \\ 2y + 3x + z \end{bmatrix} \)
(d) \( T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 2y + 3x - \ln z \end{bmatrix} \)
88. Suppose [A_1 ··· A_n]^{-1} exists where each A_j ∈ R^n, and let vectors B_1, ..., B_n in R^m be given. Show that there always exists a linear transformation T such that T(A_i) = B_i.
3. Determinants
Outcomes
A. Evaluate the determinant of a square matrix using either Laplace Expansion or row operations.
B. Demonstrate the effects that row operations have on determinants.
C. Verify the following:
(a) The determinant of a product of matrices is the product of the determinants.
(b) The determinant of a matrix is equal to the determinant of its transpose.
D. Use determinants to determine whether a matrix has an inverse, and evaluate the inverse using cofactors.
E. Apply Cramer's Rule to solve a 2 × 2 or a 3 × 3 linear system.
3.1 Basic Techniques And Properties
3.1.1. Cofactors And 2 × 2 Determinants
Let A be an n × n matrix. That is, let A be a square matrix. The determinant of A, denoted by det(A), is a very important number which we will explore throughout this section.
If A is a 2 × 2 matrix, the determinant is given by the following formula.
Definition 3.1: Determinant of a Two By Two Matrix
Let \( A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \). Then
\[ \det(A) = ad - cb \]

The determinant is also often denoted by enclosing the matrix with two vertical lines. Thus
\[ \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \]
The following is an example of finding the determinant of a 2 × 2 matrix.
Example 3.2: A Two by Two Determinant
Find det(A) for the matrix \( A = \begin{bmatrix} 2 & 4 \\ -1 & 6 \end{bmatrix} \).

Solution. From Definition 3.1,
\[ \det(A) = (2)(6) - (-1)(4) = 12 + 4 = 16 \]
The 2 × 2 determinant can be used to find the determinant of larger matrices. We will now explore how to find the determinant of a 3 × 3 matrix, using several tools including the 2 × 2 determinant.
We begin with the following definition.
Definition 3.3: The ij-th Minor of a Matrix
Let A be a 3 × 3 matrix. The ij-th minor of A, denoted as minor(A)_{ij}, is the determinant of the 2 × 2 matrix which results from deleting the i-th row and the j-th column of A.

Hence, there is a minor associated with each entry of A.
Consider the following example which demonstrates this definition.
Example 3.4: Finding Minors of a Matrix
Let
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 3 & 2 \\ 3 & 2 & 1 \end{bmatrix} \]
Find minor(A)_{12} and minor(A)_{23}.

Solution. First we will find minor(A)_{12}, the (1, 2) minor of A. By Definition 3.3, this is the determinant of the 2 × 2 matrix which results when you delete the first row and the second column. This minor is given by
\[ \text{minor}(A)_{12} = \det \begin{bmatrix} 4 & 2 \\ 3 & 1 \end{bmatrix} \]
Using Definition 3.1, we see that
\[ \det \begin{bmatrix} 4 & 2 \\ 3 & 1 \end{bmatrix} = (4)(1) - (3)(2) = 4 - 6 = -2 \]
Therefore minor(A)_{12} = −2.
Similarly, minor(A)_{23} is the determinant of the 2 × 2 matrix which results when you delete the second row and the third column. This minor is therefore
\[ \text{minor}(A)_{23} = \det \begin{bmatrix} 1 & 2 \\ 3 & 2 \end{bmatrix} = -4 \]
Finding the other minors of A is left as an exercise.
The ij-th minor of a matrix A is used in another important definition, given next.
Definition 3.5: The ij-th Cofactor of a Matrix
Suppose A is a 3 × 3 matrix. The ij-th cofactor, denoted by cof(A)_{ij}, is defined to be
\[ \text{cof}(A)_{ij} = (-1)^{i+j} \text{minor}(A)_{ij} \]

It is also convenient to refer to the cofactor of an entry of a matrix as follows. If A_{ij} is the ij-th entry of the matrix, then its cofactor is just cof(A)_{ij}.
Example 3.6: Finding Cofactors of a Matrix
Consider the matrix
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 3 & 2 \\ 3 & 2 & 1 \end{bmatrix} \]
Find cof(A)_{12} and cof(A)_{23}.

Solution. We will use Definition 3.5 to compute these cofactors.
First, we will compute cof(A)_{12}. Therefore, we need to find minor(A)_{12}. This is the determinant of the 2 × 2 matrix which results when you delete the first row and the second column. Thus minor(A)_{12} is given by
\[ \det \begin{bmatrix} 4 & 2 \\ 3 & 1 \end{bmatrix} = -2 \]
Then,
\[ \text{cof}(A)_{12} = (-1)^{1+2} \text{minor}(A)_{12} = (-1)^{1+2}(-2) = 2 \]
Hence, cof(A)_{12} = 2.
Similarly, we can find cof(A)_{23}. First, find minor(A)_{23}, which is the determinant of the 2 × 2 matrix which results when you delete the second row and the third column. This minor is therefore
\[ \det \begin{bmatrix} 1 & 2 \\ 3 & 2 \end{bmatrix} = -4 \]
Hence,
\[ \text{cof}(A)_{23} = (-1)^{2+3} \text{minor}(A)_{23} = (-1)^{2+3}(-4) = 4 \]
You may wish to find the remaining cofactors for the above matrix. Remember that there is a cofactor for every entry in the matrix.
We have now established the tools we need to find the determinant of a 3 × 3 matrix.
Definition 3.7: The Determinant of a Three By Three Matrix
Let A be a 3 × 3 matrix. Then, det(A) is calculated by picking a row (or column) and taking the product of each entry in that row (column) with its cofactor and adding these products together.
This process when applied to the i-th row (column) is known as expanding along the i-th row (column).

When calculating the determinant, you can choose to expand any row or any column. Regardless of your choice, you will always get the same number, which is the determinant of the matrix A. This method of evaluating a determinant by expanding along a row or a column is called Laplace Expansion or Cofactor Expansion.
Example 3.8: Finding the Determinant of a Three by Three Matrix
Let
A =
_
_
1 2 3
4 3 2
3 2 1
_
_
Find det (A) using the method of Laplace Expansion.
Solution. First, we will calculate det (A) by expanding along the rst column. Using Deni-
tion 3.7, we take the 1 in the rst column and multiply it by its cofactor,
1 (1)
1+1

3 2
2 1

Similarly, we take the 4 in the rst column and multiply it by its cofactor, as well as with
the 3 in the rst column. Finally, we add these numbers together, as given in the following
equation.
det (A) = 1
cof(A)
11
..
(1)
1+1

3 2
2 1

+ 4
cof(A)
21
..
(1)
2+1

2 3
2 1

+ 3
cof(A)
31
..
(1)
3+1

2 3
3 2

Calculating each of these, we obtain


det (A) = 1 (1) (1) + 4 (1) (4) + 3 (1) (5) = 1 + 16 +15 = 0
110
Hence, det (A) = 0.
As mentioned in Denition 3.7, we can choose to expand along any row or column. Lets
try now by expanding along the second row. Here, we take the 4 in the second row and
multiply it to its cofactor, then add this to the 3 in the second row multiplied by its cofactor,
and the 2 in the second row multiplied by its cofactor. The calculation is as follows.
det (A) = 4
cof(A)
21
..
(1)
2+1

2 3
2 1

+ 3
cof(A)
22
..
(1)
2+2

1 3
3 1

+ 2
cof(A)
23
..
(1)
2+3

1 2
3 2

Calculating each of these products, we obtain


det (A) = 4 (1) (2) + 3 (1) (8) + 2 (1) (4) = 0
You can see that for both methods, we obtained det (A) = 0.
As mentioned above, we will always come up with the same value for det (A) regardless
of the row or column we choose to expand along. You should try to compute the above
determinant by expanding along other rows and columns. This is a good way to check your
work, because you should come up with the same number each time!
We present this idea formally in the following theorem.
Theorem 3.9: The Determinant is Well Defined
Expanding the n × n matrix along any row or column always gives the same answer, which is the determinant.

We have now looked at the determinant of 2 × 2 and 3 × 3 matrices. It turns out that the method used to calculate the determinant of a 3 × 3 matrix can be used to calculate the determinant of any sized matrix. Notice that Definition 3.3, Definition 3.5 and Definition 3.7 can all be applied to a matrix of any size.
For example, the ij-th minor of a 4 × 4 matrix is the determinant of the 3 × 3 matrix you obtain when you delete the i-th row and the j-th column. Just as with the 3 × 3 determinant, we can compute the determinant of a 4 × 4 matrix by Laplace Expansion, along any row or column.
Consider the following example.
Example 3.10: Determinant of a Four by Four Matrix
Find det(A) where
\[ A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 4 & 2 & 3 \\ 1 & 3 & 4 & 5 \\ 3 & 4 & 3 & 2 \end{bmatrix} \]

Solution. As in the case of a 3 × 3 matrix, you can expand this along any row or column. Let's pick the third column. Then, using Laplace Expansion,
\[ \det(A) = 3(-1)^{1+3} \begin{vmatrix} 5 & 4 & 3 \\ 1 & 3 & 5 \\ 3 & 4 & 2 \end{vmatrix} + 2(-1)^{2+3} \begin{vmatrix} 1 & 2 & 4 \\ 1 & 3 & 5 \\ 3 & 4 & 2 \end{vmatrix} + 4(-1)^{3+3} \begin{vmatrix} 1 & 2 & 4 \\ 5 & 4 & 3 \\ 3 & 4 & 2 \end{vmatrix} + 3(-1)^{4+3} \begin{vmatrix} 1 & 2 & 4 \\ 5 & 4 & 3 \\ 1 & 3 & 5 \end{vmatrix} \]
Now, you can calculate each 3 × 3 determinant using Laplace Expansion, as we did above. You should complete these as an exercise and verify that det(A) = −12.
The following provides a formal definition for the determinant of an n × n matrix. You may wish to take a moment and consider the above definitions for 2 × 2 and 3 × 3 determinants in context of this definition.
Definition 3.11: The Determinant of an n × n Matrix
Let A be an n × n matrix where n ≥ 2 and suppose the determinant of an (n − 1) × (n − 1) matrix has been defined. Then
\[ \det(A) = \sum_{j=1}^{n} a_{ij}\, \text{cof}(A)_{ij} = \sum_{i=1}^{n} a_{ij}\, \text{cof}(A)_{ij} \]
The first formula consists of expanding the determinant along the i-th row and the second expands the determinant along the j-th column.
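Definition 3.11 is recursive, and it translates directly into code. The following is a sketch of ours (plain Python, expanding along the first row); it is exponentially slow and meant only to mirror the definition, not to be an efficient algorithm.

```python
def det(A):
    """Determinant by Laplace (cofactor) expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]  # delete row 1, column j+1
        total += (-1) ** j * A[0][j] * det(minor)         # (-1)^(1+(j+1)) = (-1)^j
    return total

print(det([[1, 2, 3], [4, 3, 2], [3, 2, 1]]))  # 0, as in Example 3.8
```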
In the following sections, we will explore some important properties and characteristics
of the determinant.
3.1.2. The Determinant Of A Triangular Matrix
There is a certain type of matrix for which nding the determinant is a very simple procedure.
Consider the following denition.
Definition 3.12: Triangular Matrices
A matrix A is upper triangular if A_{ij} = 0 whenever i > j. Thus the entries of such a matrix below the main diagonal equal 0, as shown. Here, ∗ refers to a nonzero number.
\[ \begin{bmatrix} * & * & \cdots & * \\ 0 & * & \cdots & \vdots \\ \vdots & \vdots & \ddots & * \\ 0 & \cdots & 0 & * \end{bmatrix} \]
A lower triangular matrix is defined similarly as a matrix for which all entries above the main diagonal are equal to zero.
The following theorem provides a useful way to calculate the determinant of a triangular
matrix.
Theorem 3.13: Determinant of a Triangular Matrix
Let A be an upper or lower triangular matrix. Then det (A) is obtained by taking the
product of the entries on the main diagonal.
The verification of this Theorem can be done by computing the determinant using Laplace Expansion along the first row or column.
Consider the following example.
Example 3.14: Determinant of a Triangular Matrix
Let
\[ A = \begin{bmatrix} 1 & 2 & 3 & 77 \\ 0 & 2 & 6 & 7 \\ 0 & 0 & 3 & 33.7 \\ 0 & 0 & 0 & -1 \end{bmatrix} \]
Find det(A).

Solution. From Theorem 3.13, it suffices to take the product of the elements on the main diagonal. Thus det(A) = 1 × 2 × 3 × (−1) = −6.
Without using Theorem 3.13, you could use Laplace Expansion. We will expand along the first column. This gives
\[ \det(A) = 1 \begin{vmatrix} 2 & 6 & 7 \\ 0 & 3 & 33.7 \\ 0 & 0 & -1 \end{vmatrix} + 0(-1)^{2+1} \begin{vmatrix} 2 & 3 & 77 \\ 0 & 3 & 33.7 \\ 0 & 0 & -1 \end{vmatrix} + 0(-1)^{3+1} \begin{vmatrix} 2 & 3 & 77 \\ 2 & 6 & 7 \\ 0 & 0 & -1 \end{vmatrix} + 0(-1)^{4+1} \begin{vmatrix} 2 & 3 & 77 \\ 2 & 6 & 7 \\ 0 & 3 & 33.7 \end{vmatrix} \]
and the only nonzero term in the expansion is
\[ 1 \begin{vmatrix} 2 & 6 & 7 \\ 0 & 3 & 33.7 \\ 0 & 0 & -1 \end{vmatrix} \]
Now find the determinant of this 3 × 3 matrix, by expanding along the first column to obtain
\[ \det(A) = 1 \left( 2 \begin{vmatrix} 3 & 33.7 \\ 0 & -1 \end{vmatrix} + 0(-1)^{2+1} \begin{vmatrix} 6 & 7 \\ 0 & -1 \end{vmatrix} + 0(-1)^{3+1} \begin{vmatrix} 6 & 7 \\ 3 & 33.7 \end{vmatrix} \right) = 1 \times 2 \begin{vmatrix} 3 & 33.7 \\ 0 & -1 \end{vmatrix} \]
Next use Definition 3.1 to find the determinant of this 2 × 2 matrix, which is just 3 × (−1) − 0 × 33.7 = −3. Putting all these steps together, we have
\[ \det(A) = 1 \times 2 \times 3 \times (-1) = -6 \]
which is just the product of the entries down the main diagonal of the original matrix!
You can see that while both methods result in the same answer, Theorem 3.13 provides a much quicker method.
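In code, Theorem 3.13 is a one-liner; the check below is our own sketch for the matrix of Example 3.14.

```python
import numpy as np

A = np.array([[1, 2, 3, 77],
              [0, 2, 6, 7],
              [0, 0, 3, 33.7],
              [0, 0, 0, -1]])

print(np.prod(np.diag(A)))         # -6.0, the product of the diagonal entries
print(round(np.linalg.det(A), 6))  # -6.0 as well
```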
In the next section, we explore some important properties of determinants.
3.1.3. Properties Of Determinants
There are many important properties of determinants. Since many of these properties involve the row operations discussed in Chapter 1, we recall that definition now.
Definition 3.15: Row Operations
The row operations consist of the following
1. Switch two rows.
2. Multiply a row by a nonzero number.
3. Replace a row by a multiple of another row added to itself.

We will now consider the effect of row operations on the determinant of a matrix. In future sections, we will see that using the following properties can greatly assist in finding determinants.
The first theorem explains the effect on the determinant of a matrix when two rows are switched.
Theorem 3.16: Switching Rows
Let A be an n × n matrix and let B be a matrix which results from switching two rows of A. Then det(B) = −det(A).

Hence, when we switch two rows of a matrix, the determinant is multiplied by −1.
Consider the following example.
Example 3.17: Switching Two Rows
Let \( A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \) and let \( B = \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix} \). Find det(B).

Solution. By Definition 3.1, det(A) = 1 × 4 − 3 × 2 = −2. Notice that the rows of B are the rows of A but switched. By Theorem 3.16, since two rows of A have been switched, det(B) = −det(A) = −(−2) = 2. You can verify this using Definition 3.1.
The next theorem demonstrates the effect on the determinant of a matrix when we multiply a row by a scalar.
Theorem 3.18: Multiplying a Row by a Scalar
Let A be an n × n matrix and let B be a matrix which results from multiplying some row of A by a scalar k. Then det(B) = k det(A).

Notice that this theorem is true when we multiply one row of the matrix by k. If we were to multiply two rows of A by k to obtain B, we would have det(B) = k^2 det(A). Suppose we were to multiply all n rows of A by k to obtain the matrix B, so that B = kA. Then, det(B) = k^n det(A).
Consider the following example.
Consider the following example.
Example 3.19: Multiplying a Row by 5
Let A =
_
1 2
3 4
_
, B =
_
5 10
3 4
_
. Find det (B).
Solution. By Denition 3.1, det (A) = 2. We can also compute det (B) using this denition,
and we see that det (B) = 10.
Now, lets compute det (B) using Theorem 3.18 and see if we obtain the same answer.
Notice that the rst row of B is 5 times the rst row of A, while the second row of B is
equal to the second row of A. By Theorem 3.18, det (B) = 5 det (A) = 5 2 = 10.
You can see that this matches our answer above.
Finally, consider the next theorem for the last row operation, that of adding a multiple
of a row to another row.
Theorem 3.20: Adding a Multiple of a Row to Another Row

Let A be an n × n matrix and let B be a matrix which results from adding a multiple
of a row to another row. Then det(A) = det(B).

Therefore, when we add a multiple of a row to another row, the determinant of the matrix
is unchanged. Note that if a matrix A contains a row which is a multiple of another row,
det(A) will equal 0. To see this, suppose the first row of A is equal to −1 times the second
row. By Theorem 3.20, we can add the first row to the second row, and the determinant
will be unchanged. However, this row operation will result in a row of zeros. Using Laplace
Expansion along the row of zeros, we find that the determinant is 0.
Consider the following example.
Example 3.21: Adding a Row to Another Row

Let A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} and let B = \begin{bmatrix} 1 & 2 \\ 5 & 8 \end{bmatrix}. Find det(B).

Solution. By Definition 3.1, det(A) = −2. Notice that the second row of B is two times
the first row of A added to the second row. By Theorem 3.20, det(B) = det(A) = −2. As
usual, you can verify this answer using Definition 3.1.
Until now, our focus has primarily been on row operations. However, we can carry out the
same operations with columns, rather than rows. The three operations outlined in Definition
3.15 can be done with columns instead of rows. In this case, in Theorems 3.16, 3.18, and
3.20 you can replace the word "row" with the word "column".
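Theorems 3.16, 3.18 and 3.20 are also easy to confirm numerically. The following is a minimal sketch in Python (assuming the NumPy library, which is not part of this text), applied to the matrix A of the examples above.

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(np.linalg.det(A))              # -2.0

B_switch = A[[1, 0], :]              # switch the two rows
print(np.linalg.det(B_switch))       # 2.0  = -det(A)

B_scale = A.copy()
B_scale[0, :] *= 5                   # multiply the first row by 5
print(np.linalg.det(B_scale))        # -10.0 = 5 * det(A)

B_add = A.copy()
B_add[1, :] += 2 * A[0, :]           # add 2 times row 1 to row 2
print(np.linalg.det(B_add))          # -2.0 = det(A), unchanged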
There are two other major properties of determinants which do not involve row (or
column) operations. The first is the determinant of a product of matrices.
Theorem 3.22: Determinant of a Product

Let A and B be two n × n matrices. Then

det(AB) = det(A) det(B)

In order to find the determinant of a product of matrices, we can simply take the product
of the determinants.
Consider the following example.
Example 3.23: The Determinant of a Product

Compare det(AB) and det(A) det(B) for

A = \begin{bmatrix} 1 & 2 \\ -3 & 2 \end{bmatrix}, B = \begin{bmatrix} 3 & 2 \\ 4 & 1 \end{bmatrix}

Solution. First compute AB, which is given by

AB = \begin{bmatrix} 1 & 2 \\ -3 & 2 \end{bmatrix} \begin{bmatrix} 3 & 2 \\ 4 & 1 \end{bmatrix} = \begin{bmatrix} 11 & 4 \\ -1 & -4 \end{bmatrix}

and so by Definition 3.1

det(AB) = det \begin{bmatrix} 11 & 4 \\ -1 & -4 \end{bmatrix} = -40

Now

det(A) = det \begin{bmatrix} 1 & 2 \\ -3 & 2 \end{bmatrix} = 8

and

det(B) = det \begin{bmatrix} 3 & 2 \\ 4 & 1 \end{bmatrix} = -5

Computing det(A) det(B) we have 8 · (−5) = −40. This is the same answer as above
and you can see that det(A) det(B) = 8 · (−5) = −40 = det(AB).
We conclude this section with one more property of the determinant.

Theorem 3.24: Determinant of the Transpose

Let A be a matrix where A^T is the transpose of A. Then,

det(A^T) = det(A)

This theorem is illustrated in the following example.

Example 3.25: Determinant of the Transpose

Let

A = \begin{bmatrix} 2 & 5 \\ 4 & 3 \end{bmatrix}

Find det(A^T).

Solution. First, note that

A^T = \begin{bmatrix} 2 & 4 \\ 5 & 3 \end{bmatrix}

Using Definition 3.1, we can compute det(A) and det(A^T). It follows that det(A) =
2 · 3 − 4 · 5 = −14 and det(A^T) = 2 · 3 − 5 · 4 = −14. Hence, det(A) = det(A^T).
3.1.4. Finding Determinants Using Row Operations

Theorems 3.16, 3.18 and 3.20 illustrate how row operations affect the determinant of a
matrix. In this section, we look at two examples where row operations are used to find
the determinant of a large matrix. Recall that when working with large matrices, Laplace
Expansion is effective but time-consuming, as there are many steps involved. This section
provides useful tools for an alternative method. By first applying row operations, we can
obtain a simpler matrix to which we apply Laplace Expansion.

While working through questions such as these, it is useful to record your row operations
as you go along. Keep this in mind as you read through the next example.
Example 3.26: Finding a Determinant

Find the determinant of the matrix

A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 1 & 2 & 3 \\ 4 & 5 & 4 & 3 \\ 2 & 2 & -4 & 5 \end{bmatrix}

Solution. We will use the properties of determinants outlined above to find det(A). Replace
the second row by −5 times the first row added to it. Then replace the third row by −4
times the first row added to it. Finally, replace the fourth row by −2 times the first row
added to it. This yields the matrix

B = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & -9 & -13 & -17 \\ 0 & -3 & -8 & -13 \\ 0 & -2 & -10 & -3 \end{bmatrix}

Notice that the only row operation we have done so far is adding a multiple of a row to
another row. Therefore, by Theorem 3.20, det(B) = det(A).

At this stage, you could use Laplace Expansion to find det(B). However, we will continue
with row operations to find an even simpler matrix to work with.

Next, replace the second row with −3 times the third row added to the second row. By
Theorem 3.20 this does not change the value of the determinant. Then, multiply the fourth
row by −3. This results in the matrix

C = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 0 & 11 & 22 \\ 0 & -3 & -8 & -13 \\ 0 & 6 & 30 & 9 \end{bmatrix}

Here, det(C) = −3 det(B), which means that det(B) = (−1/3) det(C).

Since det(A) = det(B), we now have that det(A) = (−1/3) det(C). Again, you could use
Laplace Expansion here to find det(C). However, we will continue with row operations.

Now replace the fourth row with 2 times the third row added to the fourth. This does not
change the value of the determinant by Theorem 3.20. Finally, switch the third and second
rows. This causes the determinant to be multiplied by −1. Thus det(C) = −det(D) where

D = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & -3 & -8 & -13 \\ 0 & 0 & 11 & 22 \\ 0 & 0 & 14 & -17 \end{bmatrix}

Hence, det(A) = (−1/3) det(C) = (1/3) det(D).

You could do more row operations or you could note that this can be easily expanded
along the first column. Then, expand the resulting 3 × 3 matrix also along the first column.
This results in

det(D) = 1 \cdot (-3) \begin{vmatrix} 11 & 22 \\ 14 & -17 \end{vmatrix} = 1485

and so det(A) = (1/3)(1485) = 495.
You can see that by using row operations, we can simplify a matrix to the point where
Laplace Expansion involves only a few steps. In Example 3.26, we also could have continued
until the matrix was in upper triangular form, and taken the product of the entries on the
main diagonal. Whenever computing the determinant, it is useful to consider all the possible
methods and tools.
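The bookkeeping in Example 3.26 can also be mechanized: apply row operations, record the factor by which each operation scales the determinant, and divide it out at the end. The following sketch assumes NumPy; the variable names are our own.

import numpy as np

A = np.array([[1.0, 2.0,  3.0, 4.0],
              [5.0, 1.0,  2.0, 3.0],
              [4.0, 5.0,  4.0, 3.0],
              [2.0, 2.0, -4.0, 5.0]])

M = A.copy()
factor = 1.0                      # invariant: det(A) = det(M) / factor

M[1] -= 5 * M[0]                  # add -5 * row 1 to row 2 (det unchanged)
M[2] -= 4 * M[0]                  # add -4 * row 1 to row 3 (det unchanged)
M[3] -= 2 * M[0]                  # add -2 * row 1 to row 4 (det unchanged)
M[1] -= 3 * M[2]                  # add -3 * row 3 to row 2 (det unchanged)
M[3] *= -3                        # multiply row 4 by -3: det scales by -3
factor *= -3
M[[1, 2]] = M[[2, 1]]             # switch rows 2 and 3: det scales by -1
factor *= -1

print(np.linalg.det(M) / factor)  # 495.0
print(np.linalg.det(A))           # 495.0, agreeing with Example 3.26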
Consider the next example.
Example 3.27: Find the Determinant

Find the determinant of the matrix

A = \begin{bmatrix} 1 & 2 & 3 & 2 \\ 1 & -3 & 2 & 1 \\ 2 & 1 & 2 & 5 \\ 3 & -4 & 1 & 2 \end{bmatrix}

Solution. Once again, we will simplify the matrix through row operations. Replace the
second row by −1 times the first row added to the second row. Next take −2 times the first
row and add to the third, and finally take −3 times the first row and add to the fourth row.
This yields

B = \begin{bmatrix} 1 & 2 & 3 & 2 \\ 0 & -5 & -1 & -1 \\ 0 & -3 & -4 & 1 \\ 0 & -10 & -8 & -4 \end{bmatrix}

By Theorem 3.20, det(A) = det(B).

Remember you can work with the columns also. Take −5 times the fourth column and
add to the second column. This yields

C = \begin{bmatrix} 1 & -8 & 3 & 2 \\ 0 & 0 & -1 & -1 \\ 0 & -8 & -4 & 1 \\ 0 & 10 & -8 & -4 \end{bmatrix}

By Theorem 3.20, det(A) = det(C).

Now take −1 times the third row and add to the top row. This gives

D = \begin{bmatrix} 1 & 0 & 7 & 1 \\ 0 & 0 & -1 & -1 \\ 0 & -8 & -4 & 1 \\ 0 & 10 & -8 & -4 \end{bmatrix}

which by Theorem 3.20 has the same determinant as A.

Now, we can find det(D) by expanding along the first column as follows. You can see
that there will be only one nonzero term.

det(D) = 1 det \begin{bmatrix} 0 & -1 & -1 \\ -8 & -4 & 1 \\ 10 & -8 & -4 \end{bmatrix} + 0 + 0 + 0

Expanding again along the first column, we have

det(D) = 1 \left( 0 + 8 det \begin{bmatrix} -1 & -1 \\ -8 & -4 \end{bmatrix} + 10 det \begin{bmatrix} -1 & -1 \\ -4 & 1 \end{bmatrix} \right) = -82

Now since det(A) = det(D), it follows that det(A) = −82.

Remember that you can verify these answers by using Laplace Expansion on A. Similarly,
if you first compute the determinant using Laplace Expansion, you can use the row operation
method to verify.
3.2 Applications of the Determinant

3.2.1. A Formula For The Inverse

The determinant of a matrix also provides a way to find the inverse of a matrix. Recall the
definition of the inverse of a matrix in Definition 2.33. We say that A^{-1} is the inverse of A
if AA^{-1} = I and A^{-1}A = I.

We now define a new matrix, called the cofactor matrix of A. The cofactor matrix of
A is the matrix whose ij^th entry is the ij^th cofactor of A. The formal definition is as follows.
Definition 3.28: The Cofactor Matrix

Let A = [A_{ij}] be an n × n matrix. Then the cofactor matrix of A, denoted cof(A),
is defined by cof(A) = [cof(A)_{ij}] where cof(A)_{ij} is the ij^th cofactor of A.

Hence, cof(A)_{ij} is the ij^th entry of the cofactor matrix.

We will use the cofactor matrix to create a formula for the inverse of A. First, we define
the adjugate of A to be the transpose of the cofactor matrix. We can also call this matrix
the classical adjoint of A, and we denote it by adj(A).

In the specific case where A is a 2 × 2 matrix given by

A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}

then adj(A) is given by

adj(A) = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}

In general however, adj(A) can always be found by taking the transpose of the cofactor
matrix of A.

The following theorem provides a formula for A^{-1} using the determinant and adjugate
of A.

Theorem 3.29: The Inverse and the Determinant

Let A be a matrix such that det(A) ≠ 0. Then,

A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A)

The proof of this theorem is below, after two examples demonstrating this concept.

Notice that this formula is only defined when det(A) ≠ 0. Therefore, A is invertible if
and only if det(A) ≠ 0.
Consider the following example.
Example 3.30: Find Inverse Using the Determinant

Find the inverse of the matrix

A = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 0 & 1 \\ 1 & 2 & 1 \end{bmatrix}

using the formula in Theorem 3.29.

Solution. According to Theorem 3.29,

A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A)

First we will find the determinant of this matrix. Using Theorems 3.16, 3.18, and 3.20,
we can first simplify the matrix through row operations. First, add −3 times the first row
to the second row. Then add −1 times the first row to the third row to obtain

B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & -6 & -8 \\ 0 & 0 & -2 \end{bmatrix}

By Theorem 3.20, det(A) = det(B). By Theorem 3.13, det(B) = 1 · (−6) · (−2) = 12. Hence,
det(A) = 12.

Now, we need to find adj(A). To do so, first we will find the cofactor matrix of A. This
is given by

cof(A) = \begin{bmatrix} -2 & -2 & 6 \\ 4 & -2 & 0 \\ 2 & 8 & -6 \end{bmatrix}

Here, the ij^th entry is the ij^th cofactor of the original matrix A, which you can verify.
Therefore, from Theorem 3.29, the inverse of A is given by

A^{-1} = \frac{1}{12} \begin{bmatrix} -2 & -2 & 6 \\ 4 & -2 & 0 \\ 2 & 8 & -6 \end{bmatrix}^T = \begin{bmatrix} -\frac{1}{6} & \frac{1}{3} & \frac{1}{6} \\ -\frac{1}{6} & -\frac{1}{6} & \frac{2}{3} \\ \frac{1}{2} & 0 & -\frac{1}{2} \end{bmatrix}

Remember that we can always verify our answer for A^{-1}. Compute the products AA^{-1}
and A^{-1}A and make sure each product is equal to I.

Compute A^{-1}A as follows

A^{-1}A = \begin{bmatrix} -\frac{1}{6} & \frac{1}{3} & \frac{1}{6} \\ -\frac{1}{6} & -\frac{1}{6} & \frac{2}{3} \\ \frac{1}{2} & 0 & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 3 & 0 & 1 \\ 1 & 2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I

Hence our answer is correct.
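Theorem 3.29 translates directly into a short program. The following sketch assumes NumPy; the helper names minor and adjugate_inverse are our own, not part of any standard library. It reproduces the inverse found in Example 3.30.

import numpy as np

def minor(A, i, j):
    """Delete row i and column j of A."""
    return np.delete(np.delete(A, i, axis=0), j, axis=1)

def adjugate_inverse(A):
    """Inverse via A^{-1} = adj(A) / det(A), as in Theorem 3.29."""
    n = A.shape[0]
    det_A = np.linalg.det(A)
    if np.isclose(det_A, 0.0):
        raise ValueError("det(A) = 0, so A is not invertible")
    cof = np.array([[(-1) ** (i + j) * np.linalg.det(minor(A, i, j))
                     for j in range(n)] for i in range(n)])
    return cof.T / det_A             # transpose of the cofactor matrix / det

A = np.array([[1.0, 2.0, 3.0],
              [3.0, 0.0, 1.0],
              [1.0, 2.0, 1.0]])
print(adjugate_inverse(A))           # matches the inverse found above
print(adjugate_inverse(A) @ A)       # approximately the identity matrix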
We will look at another example of how to use this formula to find A^{-1}.
Example 3.31: Find the Inverse From a Formula

Find the inverse of the matrix

A = \begin{bmatrix} \frac{1}{2} & 0 & \frac{1}{2} \\ -\frac{1}{6} & \frac{1}{3} & -\frac{1}{2} \\ -\frac{5}{6} & \frac{2}{3} & -\frac{1}{2} \end{bmatrix}

using the formula given in Theorem 3.29.

Solution. First we need to find det(A). This step is left as an exercise and you should verify
that det(A) = 1/6. The inverse is therefore equal to

A^{-1} = \frac{1}{(1/6)} \operatorname{adj}(A) = 6 \operatorname{adj}(A)

We continue to calculate as follows, displaying the 2 × 2 determinants which give the
cofactors of A.

A^{-1} = 6 \begin{bmatrix} \begin{vmatrix} \frac{1}{3} & -\frac{1}{2} \\ \frac{2}{3} & -\frac{1}{2} \end{vmatrix} & -\begin{vmatrix} -\frac{1}{6} & -\frac{1}{2} \\ -\frac{5}{6} & -\frac{1}{2} \end{vmatrix} & \begin{vmatrix} -\frac{1}{6} & \frac{1}{3} \\ -\frac{5}{6} & \frac{2}{3} \end{vmatrix} \\ -\begin{vmatrix} 0 & \frac{1}{2} \\ \frac{2}{3} & -\frac{1}{2} \end{vmatrix} & \begin{vmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{5}{6} & -\frac{1}{2} \end{vmatrix} & -\begin{vmatrix} \frac{1}{2} & 0 \\ -\frac{5}{6} & \frac{2}{3} \end{vmatrix} \\ \begin{vmatrix} 0 & \frac{1}{2} \\ \frac{1}{3} & -\frac{1}{2} \end{vmatrix} & -\begin{vmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{6} & -\frac{1}{2} \end{vmatrix} & \begin{vmatrix} \frac{1}{2} & 0 \\ -\frac{1}{6} & \frac{1}{3} \end{vmatrix} \end{bmatrix}^T

Expanding all the 2 × 2 determinants, this yields

A^{-1} = 6 \begin{bmatrix} \frac{1}{6} & \frac{1}{3} & \frac{1}{6} \\ \frac{1}{3} & \frac{1}{6} & -\frac{1}{3} \\ -\frac{1}{6} & \frac{1}{6} & \frac{1}{6} \end{bmatrix}^T = \begin{bmatrix} 1 & 2 & -1 \\ 2 & 1 & 1 \\ 1 & -2 & 1 \end{bmatrix}

Again, you can always check your work by multiplying A^{-1}A and ensuring this product
equals I.

A^{-1}A = \begin{bmatrix} 1 & 2 & -1 \\ 2 & 1 & 1 \\ 1 & -2 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{2} & 0 & \frac{1}{2} \\ -\frac{1}{6} & \frac{1}{3} & -\frac{1}{2} \\ -\frac{5}{6} & \frac{2}{3} & -\frac{1}{2} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

This tells us that our calculation for A^{-1} is correct.
The verification step is very important, as it is a simple way to check your work! If you
multiply A^{-1}A and AA^{-1} and these products are not both equal to I, be sure to go back and
double check each step. One common error is to forget to take the transpose of the cofactor
matrix, so be sure to complete this step.
We will now prove Theorem 3.29.

Proof of Theorem 3.29: From the definition of the determinant in terms of expansion
along a column, and letting A = [A_{ir}], if det(A) ≠ 0,

\sum_{i=1}^{n} A_{ir} \operatorname{cof}(A)_{ir} \det(A)^{-1} = \det(A) \det(A)^{-1} = 1

Now consider

\sum_{i=1}^{n} A_{ir} \operatorname{cof}(A)_{ik} \det(A)^{-1}

when k ≠ r. Replace the k^th column with the r^th column to obtain a matrix B_k whose
determinant equals zero by Theorem 3.16 (applied to columns): switching the two equal
columns changes the sign of the determinant but leaves the matrix unchanged, so the
determinant must be zero. However, expanding this matrix B_k along the k^th column yields

0 = \det(B_k) \det(A)^{-1} = \sum_{i=1}^{n} A_{ir} \operatorname{cof}(A)_{ik} \det(A)^{-1}

Summarizing,

\sum_{i=1}^{n} A_{ir} \operatorname{cof}(A)_{ik} \det(A)^{-1} = \delta_{rk} = \begin{cases} 1 & \text{if } r = k \\ 0 & \text{if } r \neq k \end{cases}

Now

\sum_{i=1}^{n} A_{ir} \operatorname{cof}(A)_{ik} = \sum_{i=1}^{n} A_{ir} \operatorname{cof}(A)^T_{ki}

which is the kr^th entry of cof(A)^T A. Therefore,

\frac{\operatorname{cof}(A)^T}{\det(A)} A = I    (3.1)

Using the other formula in Definition 3.11, and similar reasoning,

\sum_{j=1}^{n} A_{rj} \operatorname{cof}(A)_{kj} \det(A)^{-1} = \delta_{rk}

Now

\sum_{j=1}^{n} A_{rj} \operatorname{cof}(A)_{kj} = \sum_{j=1}^{n} A_{rj} \operatorname{cof}(A)^T_{jk}

which is the rk^th entry of A cof(A)^T. Therefore,

A \frac{\operatorname{cof}(A)^T}{\det(A)} = I    (3.2)

and it follows from 3.1 and 3.2 that A^{-1} = [A^{-1}_{ij}], where

A^{-1}_{ij} = \operatorname{cof}(A)_{ji} \det(A)^{-1}

In other words,

A^{-1} = \frac{\operatorname{cof}(A)^T}{\det(A)}

Now suppose A^{-1} exists. Then by Theorem 3.22,

1 = \det(I) = \det(AA^{-1}) = \det(A) \det(A^{-1})

so det(A) ≠ 0.
This method for finding the inverse of A is useful in many contexts. In particular, it is
useful with complicated matrices where the entries are functions, rather than numbers.
Consider the following example.
Example 3.32: Inverse for Non-Constant Matrix

Suppose

A(t) = \begin{bmatrix} e^t & 0 & 0 \\ 0 & \cos t & \sin t \\ 0 & -\sin t & \cos t \end{bmatrix}

Show that A(t)^{-1} exists and then find it.

Solution. First note det(A(t)) = e^t (\cos^2 t + \sin^2 t) = e^t ≠ 0, so A(t)^{-1} exists.

The cofactor matrix is

C(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & -e^t \sin t & e^t \cos t \end{bmatrix}

and so the inverse is

\frac{1}{e^t} \begin{bmatrix} 1 & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & -e^t \sin t & e^t \cos t \end{bmatrix}^T = \begin{bmatrix} e^{-t} & 0 & 0 \\ 0 & \cos t & -\sin t \\ 0 & \sin t & \cos t \end{bmatrix}
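When the entries are functions, a computer algebra system can carry out the same computation symbolically. A sketch assuming the SymPy library (not part of this text), applied to A(t) from Example 3.32:

import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[sp.exp(t), 0, 0],
               [0, sp.cos(t), sp.sin(t)],
               [0, -sp.sin(t), sp.cos(t)]])

print(sp.simplify(A.det()))                  # exp(t), which is never zero
print(sp.simplify(A.adjugate() / A.det()))   # the inverse found above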
3.2.2. Cramer's Rule

Another context in which the formula given in Theorem 3.29 is important is Cramer's
Rule. Recall that we can represent a system of linear equations in the form AX = B, where
the solutions to this system are given by X. Cramer's Rule gives a formula for the solutions
X in the special case that A is a square invertible matrix. Note this rule does not apply
if you have a system of equations in which there is a different number of equations than
variables (in other words, when A is not square), or when A is not invertible.
Suppose we have a system of equations given by AX = B, and we want to find solutions
X which satisfy this system. Then recall that if A^{-1} exists,

AX = B
A^{-1}(AX) = A^{-1}B
(A^{-1}A)X = A^{-1}B
IX = A^{-1}B
X = A^{-1}B

Hence, the solutions X to the system are given by X = A^{-1}B. Now in the case that A^{-1}
exists, there is a formula for A^{-1} given above. Substituting this formula into this equation
for X, we have

X = A^{-1}B = \frac{1}{\det(A)} \operatorname{adj}(A) B

Let X_i be the i^th entry of X and B_j be the j^th entry of B. Then this equation becomes

X_i = \sum_{j=1}^{n} (A^{-1})_{ij} B_j = \sum_{j=1}^{n} \frac{1}{\det(A)} \operatorname{adj}(A)_{ij} B_j

where adj(A)_{ij} is the ij^th entry of adj(A).

By the formula for the expansion of a determinant along a column,

X_i = \frac{1}{\det(A)} \det \begin{bmatrix} a_{11} & \cdots & B_1 & \cdots & a_{1n} \\ \vdots & & \vdots & & \vdots \\ a_{n1} & \cdots & B_n & \cdots & a_{nn} \end{bmatrix}

where here the i^th column of A is replaced with the column vector [B_1, \ldots, B_n]^T. The
determinant of this modified matrix is taken and divided by det(A). This formula is known
as Cramer's rule.
We formally define this method now.

Procedure 3.33: Using Cramer's Rule

Suppose A is an n × n invertible matrix and we wish to solve the system AX = B for
X = [X_1, \ldots, X_n]^T. Then Cramer's rule says

X_i = \frac{\det(A_i)}{\det(A)}

where A_i is the matrix obtained by replacing the i^th column of A with the column
matrix

B = \begin{bmatrix} B_1 \\ \vdots \\ B_n \end{bmatrix}
We illustrate this procedure in the following example.
Example 3.34: Using Cramer's Rule

Find x, y, z if

\begin{bmatrix} 1 & 2 & 1 \\ 3 & 2 & 1 \\ 2 & -3 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}

Solution. We will use the method outlined in Procedure 3.33 to find the values for x, y, z
which give the solution to this system. Let

B = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}

In order to find x, we calculate

x = \frac{\det(A_1)}{\det(A)}

where A_1 is the matrix obtained from replacing the first column of A with B.

Hence, A_1 is given by

A_1 = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 2 & 1 \\ 3 & -3 & 2 \end{bmatrix}

Therefore,

x = \frac{\det(A_1)}{\det(A)} = \frac{\begin{vmatrix} 1 & 2 & 1 \\ 2 & 2 & 1 \\ 3 & -3 & 2 \end{vmatrix}}{\begin{vmatrix} 1 & 2 & 1 \\ 3 & 2 & 1 \\ 2 & -3 & 2 \end{vmatrix}} = \frac{1}{2}

Similarly, to find y we construct A_2 by replacing the second column of A with B. Hence,
A_2 is given by

A_2 = \begin{bmatrix} 1 & 1 & 1 \\ 3 & 2 & 1 \\ 2 & 3 & 2 \end{bmatrix}

Therefore,

y = \frac{\det(A_2)}{\det(A)} = \frac{\begin{vmatrix} 1 & 1 & 1 \\ 3 & 2 & 1 \\ 2 & 3 & 2 \end{vmatrix}}{\begin{vmatrix} 1 & 2 & 1 \\ 3 & 2 & 1 \\ 2 & -3 & 2 \end{vmatrix}} = -\frac{1}{7}

Similarly, A_3 is constructed by replacing the third column of A with B. Then, A_3 is given
by

A_3 = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 2 & 2 \\ 2 & -3 & 3 \end{bmatrix}

Therefore, z is calculated as follows.

z = \frac{\det(A_3)}{\det(A)} = \frac{\begin{vmatrix} 1 & 2 & 1 \\ 3 & 2 & 2 \\ 2 & -3 & 3 \end{vmatrix}}{\begin{vmatrix} 1 & 2 & 1 \\ 3 & 2 & 1 \\ 2 & -3 & 2 \end{vmatrix}} = \frac{11}{14}

Cramer's Rule gives you another tool to consider when solving a system of linear equations.
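Procedure 3.33 is also easy to program. The following sketch assumes NumPy; cramer_solve is our own name. It reproduces the solution of Example 3.34.

import numpy as np

def cramer_solve(A, B):
    """Solve AX = B by Cramer's rule, one column replacement per unknown."""
    det_A = np.linalg.det(A)
    X = np.empty(A.shape[0])
    for i in range(A.shape[0]):
        A_i = A.copy()
        A_i[:, i] = B                # replace the i-th column of A with B
        X[i] = np.linalg.det(A_i) / det_A
    return X

A = np.array([[1.0, 2.0, 1.0],
              [3.0, 2.0, 1.0],
              [2.0, -3.0, 2.0]])
B = np.array([1.0, 2.0, 3.0])
print(cramer_solve(A, B))            # [0.5, -1/7, 11/14], as in Example 3.34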
We can also use Cramer's Rule when the coefficient matrix A has functions rather than
numbers for entries. Consider the following system.

Example 3.35: Use Cramer's Rule for Nonconstant Matrix

Solve for z if

\begin{bmatrix} 1 & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & -e^t \sin t & e^t \cos t \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ t \\ t^2 \end{bmatrix}

Solution. We are asked to find the value of z in the solution. We will solve using Cramer's
rule. Thus

z = \frac{\begin{vmatrix} 1 & 0 & 1 \\ 0 & e^t \cos t & t \\ 0 & -e^t \sin t & t^2 \end{vmatrix}}{\begin{vmatrix} 1 & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & -e^t \sin t & e^t \cos t \end{vmatrix}} = t \left( (\cos t) t + \sin t \right) e^{-t}
3.3 Exercises

1. Find the determinants of the following matrices.

(a) \begin{bmatrix} 1 & 2 & 3 \\ 3 & 2 & 2 \\ 0 & 9 & 8 \end{bmatrix}

(b) \begin{bmatrix} 4 & 3 & 2 \\ 1 & 7 & 8 \\ 3 & 9 & 3 \end{bmatrix}

(c) \begin{bmatrix} 1 & 2 & 3 & 2 \\ 1 & 3 & 2 & 3 \\ 4 & 1 & 5 & 0 \\ 1 & 2 & 1 & 2 \end{bmatrix}

2. Find the following determinant by expanding along the first row and second column.

\begin{vmatrix} 1 & 2 & 1 \\ 2 & 1 & 3 \\ 2 & 1 & 1 \end{vmatrix}

3. Find the following determinant by expanding along the first column and third row.

\begin{vmatrix} 1 & 2 & 1 \\ 1 & 0 & 1 \\ 2 & 1 & 1 \end{vmatrix}

4. Find the following determinant by expanding along the second row and first column.

\begin{vmatrix} 1 & 2 & 1 \\ 2 & 1 & 3 \\ 2 & 1 & 1 \end{vmatrix}

5. Compute the determinant by cofactor expansion. Pick the easiest row or column to
use.

\begin{vmatrix} 1 & 0 & 0 & 1 \\ 2 & 1 & 1 & 0 \\ 0 & 0 & 0 & 2 \\ 2 & 1 & 3 & 1 \end{vmatrix}

6. Find the determinant using row operations.

\begin{vmatrix} 1 & 2 & 1 \\ 2 & 3 & 2 \\ 4 & 1 & 2 \end{vmatrix}

7. Find the determinant using row operations.

\begin{vmatrix} 2 & 1 & 3 \\ 2 & 4 & 2 \\ 1 & 4 & 5 \end{vmatrix}

8. Find the determinant using row operations.

\begin{vmatrix} 1 & 2 & 1 & 2 \\ 3 & 1 & 2 & 3 \\ 1 & 0 & 3 & 1 \\ 2 & 3 & 2 & 2 \end{vmatrix}

9. Find the determinant using row operations.

\begin{vmatrix} 1 & 4 & 1 & 2 \\ 3 & 2 & 2 & 3 \\ 1 & 0 & 3 & 3 \\ 2 & 1 & 2 & 2 \end{vmatrix}

10. Verify an example of each property of determinants found in Theorems 3.16 - 3.20 for
2 × 2 matrices.

11. An operation is done to get from the first matrix to the second. Identify what was
done and tell how it will affect the value of the determinant.

\begin{bmatrix} a & b \\ c & d \end{bmatrix}, \begin{bmatrix} a & c \\ b & d \end{bmatrix}

12. An operation is done to get from the first matrix to the second. Identify what was
done and tell how it will affect the value of the determinant.

\begin{bmatrix} a & b \\ c & d \end{bmatrix}, \begin{bmatrix} c & d \\ a & b \end{bmatrix}

13. An operation is done to get from the first matrix to the second. Identify what was
done and tell how it will affect the value of the determinant.

\begin{bmatrix} a & b \\ c & d \end{bmatrix}, \begin{bmatrix} a & b \\ a + c & b + d \end{bmatrix}

14. An operation is done to get from the first matrix to the second. Identify what was
done and tell how it will affect the value of the determinant.

\begin{bmatrix} a & b \\ c & d \end{bmatrix}, \begin{bmatrix} a & b \\ 2c & 2d \end{bmatrix}

15. An operation is done to get from the first matrix to the second. Identify what was
done and tell how it will affect the value of the determinant.

\begin{bmatrix} a & b \\ c & d \end{bmatrix}, \begin{bmatrix} b & a \\ d & c \end{bmatrix}

16. Let A be an r × r matrix and suppose there are r − 1 rows (columns) such that all rows
(columns) are linear combinations of these r − 1 rows (columns). Show det(A) = 0.

17. Show det(aA) = a^n det(A) for an n × n matrix A and scalar a.

18. Construct 2 × 2 matrices A and B to show that det A det B = det(AB).

19. Is it true that det(A + B) = det(A) + det(B)? If this is so, explain why it is so and
if it is not so, give a counter example.
20. An n × n matrix is called nilpotent if for some positive integer k it follows that A^k = 0.
If A is a nilpotent matrix and k is the smallest possible integer such that A^k = 0, what
are the possible values of det(A)?

21. A matrix is said to be orthogonal if A^T A = I. Thus the inverse of an orthogonal
matrix is just its transpose. What are the possible values of det(A) if A is an orthogonal
matrix?
22. Fill in the missing entries (denoted ∗) to make the matrix orthogonal as in Problem
21.

\begin{bmatrix} \frac{1}{\sqrt{2}} & \ast & \ast \\ \ast & \frac{\sqrt{12}}{6} & \ast \\ \ast & \frac{\sqrt{6}}{3} & \ast \end{bmatrix}
23. Let A and B be two n × n matrices. A ∼ B (A is similar to B) means there exists
an invertible matrix S such that A = S^{-1}BS. Show that if A ∼ B, then B ∼ A. Show
also that A ∼ A and that if A ∼ B and B ∼ C, then A ∼ C.

24. In the context of Problem 23 show that if A ∼ B, then det(A) = det(B).

25. Show that if two matrices are similar, they have the same characteristic polynomials.

26. Tell whether each statement is true or false. If true, provide a proof. If false, provide
a counter example.

(a) If A is a 3 × 3 matrix with a zero determinant, then one column must be a multiple
of some other column.

(b) If any two columns of a square matrix are equal, then the determinant of the
matrix equals zero.

(c) For two n × n matrices A and B, det(A + B) = det(A) + det(B).

(d) For an n × n matrix A, det(3A) = 3 det(A).

(e) If A^{-1} exists then det(A^{-1}) = det(A)^{-1}.

(f) If B is obtained by multiplying a single row of A by 4 then det(B) = 4 det(A).

(g) For A an n × n matrix, det(−A) = (−1)^n det(A).

(h) If A is a real n × n matrix, then det(A^T A) ≥ 0.

(i) Cramer's rule is useful for finding solutions to systems of linear equations in which
there is an infinite set of solutions.

(j) If A^k = 0 for some positive integer k, then det(A) = 0.

(k) If AX = 0 for some X ≠ 0, then det(A) = 0.

27. Use Cramer's rule to find the solution to

x + 2y = 1
2x − y = 2
28. Use Cramer's rule to find the solution to

x + 2y + z = 1
2x − y − z = 2
x + z = 1

29. Let

A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 1 \\ 3 & 1 & 0 \end{bmatrix}

Determine whether the matrix A has an inverse by finding whether the determinant
is nonzero. If the determinant is nonzero, find the inverse using the formula for the
inverse which involves the cofactor matrix.

30. Let

A = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 2 & 1 \\ 3 & 1 & 1 \end{bmatrix}

Determine whether the matrix A has an inverse by finding whether the determinant
is nonzero. If the determinant is nonzero, find the inverse using the formula for the
inverse which involves the cofactor matrix.

31. Let

A = \begin{bmatrix} 1 & 3 & 3 \\ 2 & 4 & 1 \\ 0 & 1 & 1 \end{bmatrix}

Determine whether the matrix A has an inverse by finding whether the determinant
is nonzero. If the determinant is nonzero, find the inverse using the formula for the
inverse which involves the cofactor matrix.

32. Let

A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 1 \\ 2 & 6 & 7 \end{bmatrix}

Determine whether the matrix A has an inverse by finding whether the determinant
is nonzero. If the determinant is nonzero, find the inverse using the formula for the
inverse which involves the cofactor matrix.

33. Let

A = \begin{bmatrix} 1 & 0 & 3 \\ 1 & 0 & 1 \\ 3 & 1 & 0 \end{bmatrix}

Determine whether the matrix A has an inverse by finding whether the determinant
is nonzero. If the determinant is nonzero, find the inverse using the formula for the
inverse which involves the cofactor matrix.
34. For the following matrices, determine if they are invertible. If so, use the formula for
the inverse in terms of the cofactor matrix to find each inverse. If the inverse does not
exist, explain why.

\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}, \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 1 \\ 4 & 1 & 1 \end{bmatrix}, \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 0 \\ 0 & 1 & 2 \end{bmatrix}

35. Find the following determinants.

(a) det \begin{bmatrix} 2 & 2+2i & 3-3i \\ 2-2i & 5 & 1-7i \\ 3+3i & 1+7i & 16 \end{bmatrix}

(b) det \begin{bmatrix} 10 & 2+6i & 8-6i \\ 2-6i & 9 & 1-7i \\ 8+6i & 1+7i & 17 \end{bmatrix}
36. Consider the matrix

A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos t & \sin t \\ 0 & -\sin t & \cos t \end{bmatrix}

Does there exist a value of t for which this matrix fails to have an inverse? Explain.

37. Consider the matrix

A = \begin{bmatrix} 1 & t & t^2 \\ 0 & 1 & 2t \\ t & 0 & 2 \end{bmatrix}

Does there exist a value of t for which this matrix fails to have an inverse? Explain.

38. Consider the matrix

A = \begin{bmatrix} e^t & \cosh t & \sinh t \\ e^t & \sinh t & \cosh t \\ e^t & \cosh t & \sinh t \end{bmatrix}

Does there exist a value of t for which this matrix fails to have an inverse? Explain.

39. Consider the matrix

A = \begin{bmatrix} e^t & e^{-t}\cos t & e^{-t}\sin t \\ e^t & -e^{-t}\cos t - e^{-t}\sin t & -e^{-t}\sin t + e^{-t}\cos t \\ e^t & 2e^{-t}\sin t & -2e^{-t}\cos t \end{bmatrix}

Does there exist a value of t for which this matrix fails to have an inverse? Explain.

40. Show that if det(A) ≠ 0 for A an n × n matrix, it follows that if AX = 0, then X = 0.

41. Suppose A, B are n × n matrices and that AB = I. Show that then BA = I. Hint:
You might do something like this: First explain why det(A), det(B) are both nonzero.
Then (AB)A = A and then show BA(BA − I) = 0. From this use what is given to
conclude A(BA − I) = 0. Then use Problem 40.
42. Use the formula for the inverse in terms of the cofactor matrix to find the inverse of
the matrix

A = \begin{bmatrix} e^t & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & e^t \cos t - e^t \sin t & e^t \cos t + e^t \sin t \end{bmatrix}

43. Find the inverse, if it exists, of the matrix

A = \begin{bmatrix} e^t & \cos t & \sin t \\ e^t & -\sin t & \cos t \\ e^t & -\cos t & -\sin t \end{bmatrix}

44. Suppose A is an upper triangular matrix. Show that A^{-1} exists if and only if all
elements of the main diagonal are nonzero. Is it true that A^{-1} will also be upper
triangular? Explain. Could the same be concluded for lower triangular matrices?

45. If A, B, and C are each n × n matrices and ABC is invertible, show why each of A, B,
and C are invertible.

46. Let F(t) = det \begin{bmatrix} a(t) & b(t) \\ c(t) & d(t) \end{bmatrix}. Verify

F'(t) = det \begin{bmatrix} a'(t) & b'(t) \\ c(t) & d(t) \end{bmatrix} + det \begin{bmatrix} a(t) & b(t) \\ c'(t) & d'(t) \end{bmatrix}    (3.3)

Now suppose

F(t) = det \begin{bmatrix} a(t) & b(t) & c(t) \\ d(t) & e(t) & f(t) \\ g(t) & h(t) & i(t) \end{bmatrix}

Use Laplace expansion and equation 3.3 to verify

F'(t) = det \begin{bmatrix} a'(t) & b'(t) & c'(t) \\ d(t) & e(t) & f(t) \\ g(t) & h(t) & i(t) \end{bmatrix} + det \begin{bmatrix} a(t) & b(t) & c(t) \\ d'(t) & e'(t) & f'(t) \\ g(t) & h(t) & i(t) \end{bmatrix} + det \begin{bmatrix} a(t) & b(t) & c(t) \\ d(t) & e(t) & f(t) \\ g'(t) & h'(t) & i'(t) \end{bmatrix}

Conjecture a general result valid for n × n matrices and explain why it will be true.
Can a similar thing be done with the columns?

47. Let L(y) = y^{(n)} + a_{n-1}(x) y^{(n-1)} + \cdots + a_1(x) y' + a_0(x) y where the a_i are given
continuous functions defined on an interval (a, b) and y is some function which
has n derivatives so that we can write L(y). Suppose L(y_k) = 0 for k = 1, 2, \ldots, n.
The Wronskian of the functions y_i is defined as

W(y_1, \ldots, y_n)(x) = det \begin{bmatrix} y_1(x) & \cdots & y_n(x) \\ y_1'(x) & \cdots & y_n'(x) \\ \vdots & & \vdots \\ y_1^{(n-1)}(x) & \cdots & y_n^{(n-1)}(x) \end{bmatrix}

We can write W(x) = W(y_1, \ldots, y_n)(x). Then, show

W'(x) = det \begin{bmatrix} y_1(x) & \cdots & y_n(x) \\ y_1'(x) & \cdots & y_n'(x) \\ \vdots & & \vdots \\ y_1^{(n-2)}(x) & \cdots & y_n^{(n-2)}(x) \\ y_1^{(n)}(x) & \cdots & y_n^{(n)}(x) \end{bmatrix}

Now use the differential equation L(y) = 0, which is satisfied by each of these functions
y_i, and properties of determinants presented above to verify that W' + a_{n-1}(x) W = 0.
Give an explicit solution of this linear differential equation, Abel's formula, and use
your answer to verify that the Wronskian of these solutions to the equation L(y) = 0
either vanishes identically on (a, b) or never. Hint: To solve the differential equation,
let A'(x) = a_{n-1}(x) and multiply both sides of the differential equation by e^{A(x)} and
then argue the left side is the derivative of something.
4. Complex Numbers

Outcomes

A. Understand the geometric significance of a complex number as a point in the plane.

B. Prove algebraic properties of addition and multiplication of complex numbers, and
apply these properties. Understand the action of taking the conjugate of a complex
number.

C. Understand the absolute value of a complex number and how to find it as well as its
geometric significance.

D. Convert a complex number from standard form to polar form, and from polar form to
standard form.

E. Understand De Moivre's theorem and be able to use it to find the roots of a complex
number.

F. Use the Quadratic Formula to find the complex roots of a quadratic equation.
4.1 Complex Numbers

Although very powerful, the real numbers are inadequate to solve equations such as x^2 + 1 = 0,
and this is where complex numbers come in. We define the number i as the imaginary number
such that i^2 = −1, and define complex numbers as those of the form z = a + bi where a and
b are real numbers. We call this the standard form of the complex number z. Then, we refer
to a as the real part of z, and b as the imaginary part of z. It turns out that such numbers
not only solve the above equation, but in fact also solve any polynomial of degree at least
1 with complex coefficients. This property, called the Fundamental Theorem of Algebra, is
sometimes referred to by saying C is algebraically closed. Gauss is usually credited with
giving a proof of this theorem in 1797 but many others worked on it and the first completely
correct proof was due to Argand in 1806.

Just as a real number can be considered as a point on the line, a complex number
z = a + bi can be considered as a point (a, b) in the plane whose x coordinate is a and
whose y coordinate is b. For example, in the following picture, the point z = 3 + 2i can be
represented as the point in the plane with coordinates (3, 2).
[Figure: the complex number z = 3 + 2i plotted as the point (3, 2) in the plane]
Addition of complex numbers is defined as follows.

(a + bi) + (c + di) = (a + c) + (b + d)i

This addition obeys all the usual properties as the following theorem indicates.

Theorem 4.1: Properties of Addition of Complex Numbers

Let z, w, and v be complex numbers. Then the following properties hold.

Commutative Law for Addition

z + w = w + z

Additive Identity

z + 0 = z

Existence of Additive Inverse

For each z ∈ C, there exists −z ∈ C such that z + (−z) = 0.
In fact if z = a + bi, then −z = −a − bi.

Associative Law for Addition

(z + w) + v = z + (w + v)

The proof of this theorem is left as an exercise for the reader.

Now, multiplication of complex numbers is defined the way you would expect, recalling
that i^2 = −1.

(a + bi)(c + di) = ac + adi + bci + i^2 bd = (ac − bd) + (ad + bc)i
Interestingly every nonzero complex number a + bi has a unique multiplicative inverse.
In other words, for a nonzero complex number z, there exists a number z^{-1} (or 1/z) so that
zz^{-1} = 1. Note that z = a + bi is nonzero exactly when a^2 + b^2 ≠ 0, and its inverse can be
written in standard form as follows:

z^{-1} = \frac{1}{a + bi} = \frac{a - bi}{a^2 + b^2} = \frac{a}{a^2 + b^2} - i\frac{b}{a^2 + b^2}
The following are important properties of multiplication of complex numbers.
Theorem 4.2: Properties of Multiplication of Complex Numbers

Let z, w and v be complex numbers. Then, the following properties of multiplication
hold.

Commutative Law for Multiplication

zw = wz

Associative Law for Multiplication

(zw)v = z(wv)

Multiplicative Identity

1z = z

Existence of Multiplicative Inverse

For each z ≠ 0, there exists z^{-1} such that zz^{-1} = 1

Distributive Law

z(w + v) = zw + zv

You may wish to verify some of these statements. The real numbers also satisfy the
above axioms, and in general any mathematical structure which satisfies these axioms is
called a field. There are many other fields, in particular even finite ones particularly useful
for cryptography, and the reason for specifying these axioms is that linear algebra is all about
fields and we can do just about anything in this subject using any field. Although here, the
fields of most interest will be the familiar field of real numbers, denoted as R, and the field
of complex numbers, denoted as C.

An important construction regarding complex numbers is the complex conjugate, denoted
by a horizontal line above the number, \bar{z}. It is defined as follows.

\overline{a + bi} = a - bi

Geometrically, the action of the conjugate is to reflect a given complex number across the x
axis. Algebraically, it changes the sign on the imaginary part of the complex number.
Consider the following computation.

\overline{(a + bi)}(a + bi) = (a - bi)(a + bi) = a^2 + b^2 + (ab - ab)i = a^2 + b^2

Notice that there is no imaginary part in the product, thus multiplying a complex number
by its conjugate results in a real number.

Consider the following definition.
Definition 4.3: Absolute Value

The absolute value of a complex number, denoted |z|, is defined as follows.

|a + bi| = \sqrt{a^2 + b^2}

Thus, if z is the complex number z = a + bi, it follows that

|z| = (z\bar{z})^{1/2}

Also from the definition, if z = a + bi and w = c + di are two complex numbers, then
|zw| = |z||w|. Take a moment to verify this.

The triangle inequality is an important property of the absolute value of complex numbers.
There are two useful versions which we present here, although the first one is officially
called the triangle inequality.

Proposition 4.4: Triangle Inequality

Let z, w be complex numbers.
The following two inequalities hold for any complex numbers z, w:

|z + w| ≤ |z| + |w|
||z| − |w|| ≤ |z − w|

The first one is called the Triangle Inequality.
Proof: Let z = a + bi and w = c + di. First note that

z\bar{w} = (a + bi)(c - di) = ac + bd + (bc - ad)i

and so |ac + bd| ≤ |z\bar{w}| = |z||w|.

Then,

|z + w|^2 = (a + c + i(b + d))(a + c - i(b + d))
= (a + c)^2 + (b + d)^2 = a^2 + c^2 + 2ac + 2bd + b^2 + d^2
≤ |z|^2 + |w|^2 + 2|z||w| = (|z| + |w|)^2

Taking the square root, we have that

|z + w| ≤ |z| + |w|

so this verifies the triangle inequality.

To get the second inequality, write

z = z − w + w,  w = w − z + z

and so by the first form of the inequality we get both:

|z| ≤ |z − w| + |w|,  |w| ≤ |z − w| + |z|

Hence, both |z| − |w| and |w| − |z| are no larger than |z − w|. This proves the second
version because ||z| − |w|| is one of |z| − |w| or |w| − |z|.
With this definition, it is important to note the following. You may wish to take the time
to verify this remark.

Let z = a + bi and w = c + di. Then |z − w| = \sqrt{(a - c)^2 + (b - d)^2}. Thus the distance
between the point in the plane determined by the ordered pair (a, b) and the ordered pair
(c, d) equals |z − w| where z and w are as just described.

For example, consider the distance between (2, 5) and (1, 8). Letting z = 2 + 5i and
w = 1 + 8i, we have z − w = 1 − 3i, so (z − w)(\overline{z − w}) = (1 − 3i)(1 + 3i) = 10 and |z − w| = \sqrt{10}.

Recall that we refer to z = a + bi as the standard form of the complex number. In the
next section, we examine another form in which we can express the complex number.
4.2 Polar Form

In the previous section, we identified a complex number z = a + bi with a point (a, b) in the
coordinate plane. There is another form in which we can express the same number, called
the polar form. The polar form is the focus of this section. It will turn out to be very useful
if not crucial for certain calculations as we shall soon see.

Suppose z = a + bi is a complex number, and let r = \sqrt{a^2 + b^2} = |z|. We often call r the
modulus of z. Note first that

\left( \frac{a}{r} \right)^2 + \left( \frac{b}{r} \right)^2 = \frac{a^2 + b^2}{r^2} = 1

and so \left( \frac{a}{r}, \frac{b}{r} \right) is a point on the unit circle. Therefore, there exists a unique angle θ (in radians)
with θ ∈ [0, 2π) such that

\cos θ = \frac{a}{r},  \sin θ = \frac{b}{r}

In other words θ is the unique angle in [0, 2π) such that a = r cos θ and b = r sin θ, that is
θ = cos^{-1}(a/r) and θ = sin^{-1}(b/r). We call this angle θ the argument of z.

The polar form of the complex number z = a + bi = r(cos θ + i sin θ) is for convenience
written as:

z = re^{iθ}

where θ is this angle just described, the argument of z. Here we think of e^{iθ} as a shortcut for
cos θ + i sin θ. This is all we will need in this course, but in reality e^{iθ} can be considered as
the complex equivalent of the exponential function where this turns out to be a true equality.

[Figure: the point z = a + bi = re^{iθ} in the plane, at distance r = \sqrt{a^2 + b^2} from the origin
and angle θ measured from the positive x axis]
Thus we can convert any complex number in the standard form z = a + bi into its polar
form. Consider the following example.

Example 4.5: Standard to Polar Form

Let z = 2 + 2i be a complex number. Write z in the polar form

z = re^{iθ}

Solution. First, find r. By the above discussion, r = \sqrt{a^2 + b^2} = |z|. Therefore,

r = \sqrt{2^2 + 2^2} = \sqrt{8} = 2\sqrt{2}

Now, to find θ, we plot the point (2, 2) and find the angle from the positive x axis to
the line between this point and the origin. In this case, θ = 45° = \frac{π}{4}. That is, we found the
unique angle θ such that θ = cos^{-1}(1/\sqrt{2}) and θ = sin^{-1}(1/\sqrt{2}).

Note that in polar form, we always express angles in radians, not degrees.

Hence, we can write z as

z = 2\sqrt{2} e^{i\frac{π}{4}}
Notice that the standard and polar forms are completely equivalent. That is, not only can
we transform a complex number from standard form to its polar form, we can also take a
complex number in polar form and convert it back to standard form.

Example 4.6: Polar to Standard Form

Let z = 2e^{2πi/3}. Write z in the standard form

z = a + bi

Solution. Let z = 2e^{2πi/3} be the polar form of a complex number. Recall that e^{iθ} =
cos θ + i sin θ. Therefore using standard values of sin and cos we get:

z = 2e^{2πi/3} = 2(cos(2π/3) + i sin(2π/3)) = 2\left( -\frac{1}{2} + i\frac{\sqrt{3}}{2} \right) = -1 + \sqrt{3}i

which is the standard form of this complex number.

You can always verify your answer by converting it back to polar form and ensuring you
reach the original answer.
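For experimenting with these conversions, Python's standard cmath module provides polar and rect; a brief sketch reproducing Examples 4.5 and 4.6 numerically:

import cmath, math

z = 2 + 2j
r, theta = cmath.polar(z)            # r = |z|, theta = argument in (-pi, pi]
print(r, theta)                      # 2.828... (= 2*sqrt(2)) and 0.785... (= pi/4)

w = cmath.rect(2, 2 * math.pi / 3)   # r = 2, theta = 2*pi/3, back to standard form
print(w)                             # (-1 + 1.732...j), i.e. -1 + sqrt(3) i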
4.3 Roots Of Complex Numbers

A fundamental identity is the formula of De Moivre which follows.

Theorem 4.7: De Moivre's Theorem

For any positive integer n, we have

(e^{iθ})^n = e^{inθ}

Thus for any real number r > 0 and any positive integer n, we have:

(r(cos θ + i sin θ))^n = r^n(cos nθ + i sin nθ)

Proof: The proof is by induction on n. It is clear the formula holds if n = 1. Suppose
it is true for n. Then, consider n + 1.

(r(cos θ + i sin θ))^{n+1} = (r(cos θ + i sin θ))^n (r(cos θ + i sin θ))

which by induction equals

= r^{n+1}(cos nθ + i sin nθ)(cos θ + i sin θ)
= r^{n+1}((cos nθ cos θ − sin nθ sin θ) + i(sin nθ cos θ + cos nθ sin θ))
= r^{n+1}(cos(n + 1)θ + i sin(n + 1)θ)

by the formulas for the cosine and sine of the sum of two angles.

The process used in the previous proof, called mathematical induction, is very powerful
in Mathematics and Computer Science and explored in more detail in the Appendix.
Now, consider a corollary of Theorem 4.7.

Corollary 4.8: Roots of Complex Numbers

Let z be a nonzero complex number. Then there are always exactly k many k^th roots
of z in C.

Proof: Let z = a + bi and let z = |z|(cos θ + i sin θ) be the polar form of the complex
number. By De Moivre's theorem, a complex number

w = re^{iα} = r(cos α + i sin α)

is a k^th root of z if and only if

w^k = (re^{iα})^k = r^k e^{ikα} = r^k(cos kα + i sin kα) = |z|(cos θ + i sin θ)

This requires r^k = |z| and so r = |z|^{1/k}. Also, both cos(kα) = cos θ and sin(kα) = sin θ.
This can only happen if

kα = θ + 2ℓπ

for ℓ an integer. Thus

α = \frac{θ + 2ℓπ}{k},  ℓ ∈ Z

and so the k^th roots of z are of the form

|z|^{1/k} \left( \cos\left( \frac{θ + 2ℓπ}{k} \right) + i \sin\left( \frac{θ + 2ℓπ}{k} \right) \right),  ℓ ∈ Z

Since the cosine and sine are periodic of period 2π, there are exactly k distinct numbers
which result from this formula.
Let's consider an example of this concept. Note that according to Corollary 4.8, there
are exactly 3 cube roots of a complex number.

Example 4.9: Finding Cube Roots

Find the three cube roots of z = i.

Solution. First note that i = 1(cos(π/2) + i sin(π/2)), where r = 1 and θ = π/2.

Using the formula in the proof of Corollary 4.8, the cube roots of i are

1\left( \cos\left( \frac{(π/2) + 2ℓπ}{3} \right) + i \sin\left( \frac{(π/2) + 2ℓπ}{3} \right) \right)

where ℓ = 0, 1, 2. Therefore, the roots are

\cos\left(\frac{π}{6}\right) + i \sin\left(\frac{π}{6}\right),  \cos\left(\frac{5π}{6}\right) + i \sin\left(\frac{5π}{6}\right),  \cos\left(\frac{3π}{2}\right) + i \sin\left(\frac{3π}{2}\right)

Written in standard form, the cube roots of i are

\frac{\sqrt{3}}{2} + i\left(\frac{1}{2}\right),  -\frac{\sqrt{3}}{2} + i\left(\frac{1}{2}\right),  and −i.
The ability to find k^th roots can also be used to factor some polynomials.

Example 4.10: Solving a Polynomial Equation

Factor the polynomial x^3 − 27.

Solution. First find the cube roots of 27. By the above procedure using De Moivre's theorem,
these cube roots are 3, 3\left( -\frac{1}{2} + i\frac{\sqrt{3}}{2} \right), and 3\left( -\frac{1}{2} - i\frac{\sqrt{3}}{2} \right). You may wish to verify this
using the above steps.

Therefore,

x^3 − 27 = (x − 3)\left( x − 3\left( -\frac{1}{2} + i\frac{\sqrt{3}}{2} \right) \right)\left( x − 3\left( -\frac{1}{2} - i\frac{\sqrt{3}}{2} \right) \right)

Note also

\left( x − 3\left( -\frac{1}{2} + i\frac{\sqrt{3}}{2} \right) \right)\left( x − 3\left( -\frac{1}{2} - i\frac{\sqrt{3}}{2} \right) \right) = x^2 + 3x + 9

and so

x^3 − 27 = (x − 3)(x^2 + 3x + 9)

where the quadratic polynomial x^2 + 3x + 9 cannot be factored without using complex
numbers.

Note that even though the polynomial x^3 − 27 has all real coefficients, it has some complex
zeros, 3\left( -\frac{1}{2} + i\frac{\sqrt{3}}{2} \right) and 3\left( -\frac{1}{2} - i\frac{\sqrt{3}}{2} \right). These zeros are complex conjugates of each
other. It is always the case that if a polynomial has real coefficients and a complex root, it
will also have a root equal to the complex conjugate.
This discussion relates to the fundamental theorem of algebra as mentioned before. In
the next section, we look at how complex numbers relate to a concept which may be familiar
to you.
4.4 Exercises

1. Let z = 5 + 9i. Find z^{-1}.

2. Let z = 2 + 7i and let w = 3 − 8i. Find zw, z + w, z^2, and w/z.

3. Give the complete solution to x^4 + 16 = 0.

4. Graph the complex cube roots of 8 in the complex plane. Do the same for the four
fourth roots of 16.

5. If z is a complex number, show there exists a complex number ω with |ω| = 1 and
ωz = |z|.

6. De Moivre's theorem says [r(cos t + i sin t)]^n = r^n(cos nt + i sin nt) for n a positive
integer. Does this formula continue to hold for all integers n, even negative integers?
Explain.

7. You already know formulas for cos(x + y) and sin(x + y) and these were used to prove
De Moivre's theorem. Now using De Moivre's theorem, derive a formula for sin(5x)
and one for cos(5x).

8. If z and w are two complex numbers and the polar form of z involves the angle θ while
the polar form of w involves the angle φ, show that in the polar form for zw the angle
involved is θ + φ. Also, show that in the polar form of a complex number z, r = |z|.

9. Factor x^3 + 8 as a product of linear factors.

10. Write x^3 + 27 in the form (x + 3)(x^2 + ax + b) where x^2 + ax + b cannot be factored
any more using only real numbers.

11. Completely factor x^4 + 16 as a product of linear factors.

12. Factor x^4 + 16 as the product of two quadratic polynomials each of which cannot be
factored further without using complex numbers.

13. If z, w are complex numbers prove \overline{zw} = \bar{z}\,\bar{w} and then show by induction that
\overline{z_1 \cdots z_m} = \bar{z}_1 \cdots \bar{z}_m. Also verify that \overline{\sum_{k=1}^{m} z_k} = \sum_{k=1}^{m} \bar{z}_k. In words this says the conjugate of a
product equals the product of the conjugates and the conjugate of a sum equals the
sum of the conjugates.

14. Suppose p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 where all the a_k are real numbers.
Suppose also that p(z) = 0 for some z ∈ C. Show it follows that p(\bar{z}) = 0 also.

15. Show that 1 + i and 2 + i are the only two zeros of

p(x) = x^2 − (3 + 2i)x + (1 + 3i)

Hence complex zeros do not necessarily come in conjugate pairs if the coefficients of
the equation are not real.

16. I claim that 1 = −1. Here is why.

−1 = i^2 = \sqrt{-1}\,\sqrt{-1} = \sqrt{(-1)^2} = \sqrt{1} = 1

This is clearly a remarkable result but is there something wrong with it? If so, what
is wrong?

17. Consider the following equation, which utilizes De Moivre's theorem with rational
numbers, instead of integers.

1 = 1^{1/4} = (cos 2π + i sin 2π)^{1/4} = cos(π/2) + i sin(π/2) = i

Therefore, squaring both sides it follows 1 = −1 as in the previous problem. What
does this tell you about De Moivre's theorem? Is there a profound difference between
raising numbers to integer powers and raising numbers to non-integer powers?

18. In the context of Problem 6, consider the following question. If n is an integer, is it
always true that (cos θ − i sin θ)^n = cos(nθ) − i sin(nθ)? Explain.

19. Suppose you have any polynomial in cos θ and sin θ. This will be an expression of the
form \sum_{α=0}^{m} \sum_{β=0}^{n} a_{αβ} \cos^α θ \sin^β θ where a_{αβ} ∈ C. Can this always be written in the
form \sum_{γ=-(n+m)}^{m+n} b_γ \cos γθ + \sum_{γ=-(n+m)}^{n+m} c_γ \sin γθ? Explain.
20. Suppose p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 is a polynomial and it has n zeros,

z_1, z_2, \ldots, z_n

listed according to multiplicity. (z is a root of multiplicity m if the polynomial f(x) =
(x − z)^m divides p(x) but (x − z)f(x) does not.) Show that

p(x) = a_n(x − z_1)(x − z_2) \cdots (x − z_n)

21. Give the solutions to the following quadratic equations having real coefficients.

(a) x^2 − 2x + 2 = 0
(b) 3x^2 + x + 3 = 0
(c) x^2 − 6x + 13 = 0
(d) x^2 + 4x + 9 = 0
(e) 4x^2 + 4x + 5 = 0

22. Give the solutions to the following quadratic equations having complex coefficients.

(a) x^2 + 2x + 1 + i = 0
(b) 4x^2 + 4ix − 5 = 0
(c) 4x^2 + (4 + 4i)x + 1 + 2i = 0
(d) x^2 − 4ix − 5 = 0
(e) 3x^2 + (1 − i)x + 3i = 0

23. Prove the fundamental theorem of algebra for quadratic polynomials having coefficients
in C. That is, show that an equation of the form ax^2 + bx + c = 0, where a, b, c are
complex numbers and a ≠ 0, has a complex solution. Hint: Consider the fact, noted
earlier, that the expressions given from the quadratic formula do in fact serve as
solutions.
5. Spectral Theory

Outcomes

A. Describe eigenvalues geometrically and algebraically.

B. Find eigenvalues and eigenvectors for a square matrix.

C. When possible, diagonalize a matrix.

D. Use diagonalization in some applications.
5.1 Eigenvalues And Eigenvectors Of A Matrix

Spectral Theory refers to the study of eigenvalues and eigenvectors of a matrix. It is of
fundamental importance in many areas and is the subject of our study for this chapter.

5.1.1. Definition Of Eigenvectors And Eigenvalues

In this section, we will work with the entire set of complex numbers, denoted by C. Recall
that the real numbers, R, are contained in the complex numbers, so the discussions in this
section apply to both real and complex numbers.
To illustrate the idea behind what will be discussed, consider the following example.

Example 5.1: Eigenvectors and Eigenvalues

Let

A = \begin{bmatrix} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{bmatrix}

Compute the product AX for

X = \begin{bmatrix} -5 \\ -4 \\ 3 \end{bmatrix},  X = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}

What do you notice about AX in each of these products?
Solution. First, compute AX for

X = \begin{bmatrix} -5 \\ -4 \\ 3 \end{bmatrix}

This product is given by

AX = \begin{bmatrix} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{bmatrix} \begin{bmatrix} -5 \\ -4 \\ 3 \end{bmatrix} = \begin{bmatrix} -50 \\ -40 \\ 30 \end{bmatrix} = 10 \begin{bmatrix} -5 \\ -4 \\ 3 \end{bmatrix}

In this case, the product AX resulted in a vector which is equal to 10 times the vector
X. In other words, AX = 10X.

Let's see what happens in the next product. Compute AX for the vector

X = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}

This product is given by

AX = \begin{bmatrix} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = 0 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}

In this case, the product AX resulted in a vector equal to 0 times the vector X, AX = 0X.

Perhaps this matrix is such that AX results in kX, for every vector X. However, consider

\begin{bmatrix} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -5 \\ 38 \\ -11 \end{bmatrix}

In this case, AX did not result in a vector of the form kX for some scalar k.

There is something special about the first two products calculated in Example 5.1. Notice
that for each, AX = kX where k is some scalar. When this equation holds for some X and
k, we call the scalar k an eigenvalue of A. We often use the special symbol λ instead of k
when referring to eigenvalues. In Example 5.1, the values 10 and 0 are eigenvalues for the
matrix A and we can label these as λ_1 = 10 and λ_2 = 0.

When AX = λX for some X ≠ 0, we call such an X an eigenvector of the matrix A.
The eigenvectors of A are associated to an eigenvalue. Hence, if λ_1 is an eigenvalue of A
and AX = λ_1 X, we can label this eigenvector as X_1. Note again that in order to be an
eigenvector, X must be nonzero.

There is also a geometric significance to eigenvectors. When you have a nonzero vector
which, when multiplied by a matrix, results in another vector which is parallel to the first or
equal to 0, this vector is called an eigenvector of the matrix. This is the meaning when the
vectors are in R^n.

The formal definition of eigenvalues and eigenvectors is as follows.
Definition 5.2: Eigenvalues and Eigenvectors

Let A be an n × n matrix and let X ∈ C^n be a nonzero vector for which

AX = λX    (5.1)

for some scalar λ. Then X is called an eigenvector and λ is called an eigenvalue
of the matrix A.

The set of all eigenvalues of an n × n matrix A is denoted by σ(A) and is referred to
as the spectrum of A.

The eigenvectors of a matrix A are those vectors X for which multiplication by A results
in a vector in the same direction or opposite direction to X. Since the zero vector 0 has no
direction this would make no sense for the zero vector. As noted above, 0 is never allowed
to be an eigenvector.

Let's look at eigenvectors in more detail. Suppose X satisfies 5.1. Then

AX − λX = 0

or

(A − λI)X = 0

for some X ≠ 0. Equivalently, you could write (λI − A)X = 0. Hence, when we are
looking for eigenvectors, we are looking for nontrivial solutions to this homogeneous system
of equations!

Recall that the solutions to a homogeneous system of equations consist of basic solutions,
and the linear combinations of those basic solutions. In this context, we call the basic
solutions of the equation (A − λI)X = 0 basic eigenvectors. It follows that any (nonzero)
linear combination of basic eigenvectors is again an eigenvector.

Suppose the matrix (A − λI) is invertible, so that (A − λI)^{-1} exists. Then the following
equation would be true.

X = IX = ((A − λI)^{-1}(A − λI))X = (A − λI)^{-1}((A − λI)X) = (A − λI)^{-1} 0 = 0

This claims that X = 0. However, we have required that X ≠ 0. Therefore (A − λI) cannot
have an inverse!

Recall from Theorem 3.29 that if a matrix is not invertible, then its determinant is equal
to 0. Therefore we can conclude that

det(A − λI) = 0    (5.2)

This is equivalent to det(λI − A) = 0.

The expression det(xI − A) is a polynomial (in the variable x) called the characteristic
polynomial of A, and det(xI − A) = 0 is called the characteristic equation. We could thus
also refer to the eigenvalues of A as characteristic values, but the former is often used for
historical reasons.

The following theorem claims that the roots of the characteristic polynomial are the
eigenvalues of A. Thus when 5.2 holds, A has a nonzero eigenvector.

Theorem 5.3: The Existence of an Eigenvector

Let A be an n × n matrix and suppose det(A − λI) = 0. Then λ is an eigenvalue of
A and thus there exists a nonzero vector X ∈ C^n such that (A − λI)X = 0.

Proof: For A an n × n matrix, the method of Laplace Expansion demonstrates that
det(A − λI) is a polynomial of degree n. As such, the equation 5.2 has a solution λ ∈ C by
the fundamental theorem of algebra. The fact that λ is an eigenvalue follows from Proposition
2.62 along with Theorem 3.29. Since det(A − λI) = 0 the matrix (A − λI) cannot be one
to one and so there exists a nonzero vector X such that (A − λI)X = 0.
5.1.2. Finding Eigenvectors And Eigenvalues

Now that eigenvalues and eigenvectors have been defined, we will study how to find them
for a matrix A.

First, consider the following definition.

Definition 5.4: Multiplicity of an Eigenvalue

Let A be an n × n matrix with characteristic polynomial given by det(xI − A). Then,
the multiplicity of an eigenvalue λ of A is the number of times λ occurs as a root of
that characteristic polynomial.

For example, suppose the characteristic polynomial of A is given by (x − 2)^2. Solving for
the roots of this polynomial, we set (x − 2)^2 = 0 and solve for x. We find that λ = 2 is a
root that occurs twice. Hence, in this case, λ = 2 is an eigenvalue of A of multiplicity equal
to 2.

We will now look at how to find the eigenvalues and eigenvectors for a matrix A in detail.
The steps used are summarized in the following procedure.

Procedure 5.5: Finding Eigenvalues and Eigenvectors

Let A be an n × n matrix.

1. First, find the eigenvalues λ of A by solving the equation det(xI − A) = 0.

2. For each λ, find the basic eigenvectors X ≠ 0 by finding the basic solutions to
(A − λI)X = 0.

To verify your work, make sure that AX = λX for each λ and associated eigenvector
X.
We will explore these steps further in the following example.

Example 5.6: Find the Eigenvalues and Eigenvectors

Find the eigenvalues and eigenvectors for the matrix

A = \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix}

Solution. We will use Procedure 5.5. First we need to find the eigenvalues of A. Recall that
they are the solutions of the equation

det(xI − A) = 0

In this case the equation is

det\left( x \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} − \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix} \right) = 0

which becomes

det \begin{bmatrix} x - 5 & 10 & 5 \\ -2 & x - 14 & -2 \\ 4 & 8 & x - 6 \end{bmatrix} = 0

Using Laplace Expansion, compute this determinant and simplify. The result is the
following equation.

(x − 5)(x^2 − 20x + 100) = 0

Solving this equation, we find that the eigenvalues are λ_1 = 5, λ_2 = 10 and λ_3 = 10.
Notice that 10 is a root of multiplicity two due to

x^2 − 20x + 100 = (x − 10)^2

Therefore, λ_2 = 10 is an eigenvalue of multiplicity two.

Now that we have found the eigenvalues for A, we can compute the eigenvectors.

First we will find the basic eigenvectors for λ_1 = 5. In other words, we want to find all
nonzero vectors X so that AX = 5X. This requires that we solve the equation (A − 5I)X = 0
for X as follows.

\left( \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix} − 5 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right) \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}

That is, you need to find the solution to

\begin{bmatrix} 0 & -10 & -5 \\ 2 & 9 & 2 \\ -4 & -8 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}

By now this is a familiar problem. You set up the augmented matrix and row reduce to
get the solution. Thus the matrix you must row reduce is

\begin{bmatrix} 0 & -10 & -5 & | & 0 \\ 2 & 9 & 2 & | & 0 \\ -4 & -8 & 1 & | & 0 \end{bmatrix}

The reduced row-echelon form is

\begin{bmatrix} 1 & 0 & -\frac{5}{4} & | & 0 \\ 0 & 1 & \frac{1}{2} & | & 0 \\ 0 & 0 & 0 & | & 0 \end{bmatrix}

and so the solution is any vector of the form

\begin{bmatrix} \frac{5}{4}s \\ -\frac{1}{2}s \\ s \end{bmatrix} = s \begin{bmatrix} \frac{5}{4} \\ -\frac{1}{2} \\ 1 \end{bmatrix}

where s ∈ R. If we multiply this vector by 4, we obtain a simpler description for the solution
to this system, as given by

t \begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix}    (5.3)

where t ∈ R. Here, the basic eigenvector is given by

X_1 = \begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix}

Notice that we cannot let t = 0 here, because this would result in the zero vector and
eigenvectors are never equal to 0! Other than this value, every other choice of t in 5.3 results
in an eigenvector.

It is a good idea to check your work! To do so, we will take the original matrix and
multiply by the basic eigenvector X_1. We check to see if we get 5X_1.

\begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix} \begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix} = \begin{bmatrix} 25 \\ -10 \\ 20 \end{bmatrix} = 5 \begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix}

This is what we wanted, so we know that our calculations were correct.

Next we will find the basic eigenvectors for λ_2, λ_3 = 10. These vectors are the basic
solutions to the equation

\left( \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix} − 10 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right) \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}

That is, you must find the solutions to

\begin{bmatrix} -5 & -10 & -5 \\ 2 & 4 & 2 \\ -4 & -8 & -4 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}

Consider the augmented matrix

\begin{bmatrix} -5 & -10 & -5 & | & 0 \\ 2 & 4 & 2 & | & 0 \\ -4 & -8 & -4 & | & 0 \end{bmatrix}

The reduced row-echelon form for this matrix is

\begin{bmatrix} 1 & 2 & 1 & | & 0 \\ 0 & 0 & 0 & | & 0 \\ 0 & 0 & 0 & | & 0 \end{bmatrix}

and so the eigenvectors are of the form

\begin{bmatrix} -2s - t \\ s \\ t \end{bmatrix} = s \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} + t \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}

Note that you can't pick t and s both equal to zero because this would result in the zero
vector, and eigenvectors are never equal to zero.

Here, there are two basic eigenvectors, given by

X_2 = \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix},  X_3 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}

Taking any (nonzero) linear combination of X_2 and X_3 will also result in an eigenvector
for the eigenvalue λ = 10. As in the case for λ = 5, always check your work! For example,
we can check AX_3 = 10X_3 as follows.

\begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix} \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -10 \\ 0 \\ 10 \end{bmatrix} = 10 \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}

This is what we wanted. Checking the other basic eigenvector, X_2, is left as an exercise.
It is important to remember that for any eigenvector X, X ≠ 0. However, it is possible
to have eigenvalues equal to zero. This is illustrated in the following example.

Example 5.7: A Zero Eigenvalue

Let

A = \begin{bmatrix} 2 & 2 & -2 \\ 1 & 3 & -1 \\ -1 & 1 & 1 \end{bmatrix}

Find the eigenvalues and eigenvectors of A.

Solution. First we find the eigenvalues of A. We will do so using Definition 5.2.

In order to find the eigenvalues of A, we solve the following equation.

det(xI − A) = det \begin{bmatrix} x - 2 & -2 & 2 \\ -1 & x - 3 & 1 \\ 1 & -1 & x - 1 \end{bmatrix} = 0

This reduces to x^3 − 6x^2 + 8x = 0. You can verify that the solutions are λ_1 = 0, λ_2 = 2,
λ_3 = 4. Notice that while eigenvectors can never equal 0, it is possible to have an eigenvalue
equal to 0.

Now we will find the basic eigenvectors. For λ_1 = 0, we need to solve the equation
(A − 0I)X = 0. This equation becomes AX = 0, and so the augmented matrix for finding
the solutions is given by

\begin{bmatrix} 2 & 2 & -2 & | & 0 \\ 1 & 3 & -1 & | & 0 \\ -1 & 1 & 1 & | & 0 \end{bmatrix}

The reduced row-echelon form is

\begin{bmatrix} 1 & 0 & -1 & | & 0 \\ 0 & 1 & 0 & | & 0 \\ 0 & 0 & 0 & | & 0 \end{bmatrix}

Therefore, the eigenvectors are of the form t \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} where t ≠ 0 and the basic eigenvector is
given by

X_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}

We can verify that this eigenvector is correct by checking that the equation AX_1 = 0X_1
holds. The product AX_1 is given by

AX_1 = \begin{bmatrix} 2 & 2 & -2 \\ 1 & 3 & -1 \\ -1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}

This clearly equals 0X_1, so the equation holds. Hence, AX_1 = 0X_1 and so 0 is an
eigenvalue of A.

Computing the other basic eigenvectors is left as an exercise.
In the following sections, we examine ways to simplify this process of finding eigenvalues and eigenvectors by using properties of special types of matrices.

5.1.3. Eigenvalues and Eigenvectors for Special Types of Matrices

There are three special kinds of matrices which we can use to simplify the process of finding eigenvalues and eigenvectors. Throughout this section, we will discuss similar matrices, elementary matrices, as well as triangular matrices.
We begin with a definition.

Definition 5.8: Similar Matrices
Let $A$ and $B$ be $n \times n$ matrices. Suppose there exists an invertible matrix $P$ such that
\[ A = P^{-1}BP \]
Then $A$ and $B$ are called similar matrices.

It turns out that we can use the concept of similar matrices to help us find the eigenvalues of matrices. Consider the following lemma.

Lemma 5.9: Similar Matrices and Eigenvalues
Let $A$ and $B$ be similar matrices, so that $A = P^{-1}BP$ where $A, B$ are $n \times n$ matrices and $P$ is invertible. Then $A, B$ have the same eigenvalues.

Proof: We need to show that every eigenvalue of $A$ is an eigenvalue of $B$, and conversely.
Suppose $A = P^{-1}BP$ and $\lambda$ is an eigenvalue of $A$, that is $AX = \lambda X$ for some $X \neq 0$. Then
\[ P^{-1}BPX = \lambda X \]
and so
\[ BPX = \lambda PX \]
Since $P$ is one to one and $X \neq 0$, it follows that $PX \neq 0$. Here, $PX$ plays the role of the eigenvector in this equation. Thus $\lambda$ is also an eigenvalue of $B$. Since $B = PAP^{-1}$, the same argument with the roles of $A$ and $B$ exchanged shows that any eigenvalue of $B$ is also an eigenvalue of $A$, and thus both matrices have the same eigenvalues as desired. $\blacksquare$

Note that this proof also demonstrates that the eigenvectors of $A$ and $B$ will (generally) be different. We see in the proof that $AX = \lambda X$, while $B(PX) = \lambda(PX)$. Therefore, for an eigenvalue $\lambda$, $A$ will have the eigenvector $X$ while $B$ will have the eigenvector $PX$.
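The lemma is also easy to test numerically; the following sketch (not from the original text) builds a random similar pair and compares spectra.

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((4, 4))
    P = rng.standard_normal((4, 4))      # almost surely invertible
    A = np.linalg.inv(P) @ B @ P         # A = P^{-1} B P, similar to B
    eigA = np.sort_complex(np.linalg.eigvals(A))
    eigB = np.sort_complex(np.linalg.eigvals(B))
    print(np.allclose(eigA, eigB))       # True: similar matrices share eigenvalues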
The second special type of matrix we discuss in this section is the elementary matrix. Recall from Definition 2.41 that an elementary matrix $E$ is obtained by applying one row operation to the identity matrix.
It is possible to use elementary matrices to simplify a matrix before searching for its eigenvalues and eigenvectors. This is illustrated in the following example.

Example 5.10: Simplify Using Elementary Matrices
Find the eigenvalues for the matrix
\[ A = \begin{bmatrix} 33 & -105 & 105 \\ -10 & 28 & -30 \\ -20 & 60 & -62 \end{bmatrix} \]

Solution. This matrix has big numbers and therefore we would like to simplify as much as possible before computing the eigenvalues.
We will do so using row operations. First, add $-2$ times the second row to the third row. To do so, left multiply $A$ by the elementary matrix $E$ which carries out this row operation. Then right multiply $A$ by the inverse of $E$, as illustrated.
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix}
\begin{bmatrix} 33 & -105 & 105 \\ -10 & 28 & -30 \\ -20 & 60 & -62 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix}
=
\begin{bmatrix} 33 & 105 & 105 \\ -10 & -32 & -30 \\ 0 & 0 & -2 \end{bmatrix}
\]
By Lemma 5.9, the resulting matrix has the same eigenvalues as $A$, where here the elementary matrix $E$ plays the role of $P$.
We do this step again, as follows. In this step, we use the elementary matrix obtained by adding $3$ times the second row to the first row.
\[
\begin{bmatrix} 1 & 3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 33 & 105 & 105 \\ -10 & -32 & -30 \\ 0 & 0 & -2 \end{bmatrix}
\begin{bmatrix} 1 & -3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
=
\begin{bmatrix} 3 & 0 & 15 \\ -10 & -2 & -30 \\ 0 & 0 & -2 \end{bmatrix}
\tag{5.4}
\]
Again by Lemma 5.9, this resulting matrix has the same eigenvalues as $A$. At this point, we can easily find the eigenvalues. Let
\[ B = \begin{bmatrix} 3 & 0 & 15 \\ -10 & -2 & -30 \\ 0 & 0 & -2 \end{bmatrix} \]
Then, we find the eigenvalues of $B$ (and therefore of $A$) by solving the equation $\det(xI - B) = 0$. You should verify that this equation becomes
\[ (x+2)(x+2)(x-3) = 0 \]
Solving this equation results in eigenvalues of $\lambda_1 = -2$, $\lambda_2 = -2$, and $\lambda_3 = 3$. Therefore, these are also the eigenvalues of $A$.
Through using elementary matrices, we were able to create a matrix for which finding the eigenvalues was easier than for $A$. At this point, you could go back to the original matrix $A$ and solve $(A - \lambda I)X = 0$ to obtain the eigenvectors of $A$.
Notice that when you multiply on the right by an elementary matrix, you are doing the column operation defined by the elementary matrix. In 5.4, multiplication by the elementary matrix on the right merely involves taking $-3$ times the first column and adding it to the second. Thus, without referring to the elementary matrices, the transition to the new matrix in 5.4 can be illustrated by
\[
\begin{bmatrix} 33 & 105 & 105 \\ -10 & -32 & -30 \\ 0 & 0 & -2 \end{bmatrix}
\rightarrow
\begin{bmatrix} 3 & 9 & 15 \\ -10 & -32 & -30 \\ 0 & 0 & -2 \end{bmatrix}
\rightarrow
\begin{bmatrix} 3 & 0 & 15 \\ -10 & -2 & -30 \\ 0 & 0 & -2 \end{bmatrix}
\]
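As a cross-check (a sketch, not part of the original text), the similarity transformation used above can be verified to preserve the spectrum:

    import numpy as np

    A = np.array([[33.0, -105.0, 105.0],
                  [-10.0, 28.0, -30.0],
                  [-20.0, 60.0, -62.0]])
    E = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, -2.0, 1.0]])          # adds -2 times row 2 to row 3
    B = E @ A @ np.linalg.inv(E)              # similar to A
    print(np.sort(np.linalg.eigvals(A).real))   # approximately [-2, -2, 3]
    print(np.sort(np.linalg.eigvals(B).real))   # the same values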
The third special type of matrix we will consider in this section is the triangular matrix. Recall Definition 3.12, which states that an upper (lower) triangular matrix contains all zeros below (above) the main diagonal. Remember that finding the determinant of a triangular matrix is a simple procedure of taking the product of the entries on the main diagonal. It turns out that there is also a simple way to find the eigenvalues of a triangular matrix.
In the next example we will demonstrate that the eigenvalues of a triangular matrix are the entries on the main diagonal.
Example 5.11: Eigenvalues for a Triangular Matrix
Let $A = \begin{bmatrix} 1 & 2 & 4 \\ 0 & 4 & 7 \\ 0 & 0 & 6 \end{bmatrix}$. Find the eigenvalues of $A$.

Solution. We need to solve the equation $\det(xI - A) = 0$ as follows
\[
\det(xI - A) = \det \begin{bmatrix} x-1 & -2 & -4 \\ 0 & x-4 & -7 \\ 0 & 0 & x-6 \end{bmatrix} = (x-1)(x-4)(x-6) = 0
\]
Solving the equation $(x-1)(x-4)(x-6) = 0$ for $x$ results in the eigenvalues $\lambda_1 = 1$, $\lambda_2 = 4$ and $\lambda_3 = 6$. Thus the eigenvalues are the entries on the main diagonal of the original matrix.

The same result is true for lower triangular matrices. For any triangular matrix, the eigenvalues are equal to the entries on the main diagonal. To find the eigenvectors of a triangular matrix, we use the usual procedure.
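A quick numerical check (a sketch, not from the original text):

    import numpy as np

    A = np.array([[1.0, 2.0, 4.0],
                  [0.0, 4.0, 7.0],
                  [0.0, 0.0, 6.0]])
    print(np.sort(np.linalg.eigvals(A).real))   # [1. 4. 6.] -- the diagonal entries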
In the next section, we explore an important process involving the eigenvalues and eigenvectors of a matrix.
5.2 Diagonalization
We begin this section by recalling Definition 5.8 of similar matrices. Recall that if $A, B$ are two $n \times n$ matrices, then they are similar if and only if there exists an invertible matrix $P$ such that
\[ A = P^{-1}BP \]
The following are important properties of similar matrices.

Proposition 5.12: Properties of Similarity
Define $\sim$ for $n \times n$ matrices $A$, $B$ and $C$ by $A \sim B$ if $A$ is similar to $B$. Then
$A \sim A$
If $A \sim B$ then $B \sim A$
If $A \sim B$ and $B \sim C$ then $A \sim C$

Proof: It is clear that $A \sim A$, taking $P = I$.
Now, if $A \sim B$, then for some $P$ invertible,
\[ A = P^{-1}BP \]
and so
\[ PAP^{-1} = B \]
But then
\[ \left( P^{-1} \right)^{-1} A P^{-1} = B \]
which shows that $B \sim A$ by Definition 5.8.
Now suppose $A \sim B$ and $B \sim C$. Then there exist invertible matrices $P, Q$ such that
\[ A = P^{-1}BP, \quad B = Q^{-1}CQ \]
Then,
\[ A = P^{-1}\left( Q^{-1}CQ \right)P = (QP)^{-1}C(QP) \]
showing that $A$ is similar to $C$ by Definition 5.8. $\blacksquare$

When a matrix is similar to a diagonal matrix, the matrix is said to be diagonalizable. We define a diagonal matrix $D$ as a matrix containing a zero in every entry except those on the main diagonal. More precisely, if $D_{ij}$ is the $ij^{th}$ entry of a diagonal matrix $D$, then $D_{ij} = 0$ unless $i = j$. Such matrices look like the following.
\[
D = \begin{bmatrix} * & & 0 \\ & \ddots & \\ 0 & & * \end{bmatrix}
\]
where $*$ is a number which might not be zero.
The following is the formal definition of a diagonalizable matrix.

Definition 5.13: Diagonalizable
Let $A$ be an $n \times n$ matrix. Then $A$ is said to be diagonalizable if there exists an invertible matrix $P$ such that
\[ P^{-1}AP = D \]
where $D$ is a diagonal matrix.
The most important theorem about diagonalizability is the following major result.

Theorem 5.14: Eigenvectors and Diagonalizable Matrices
An $n \times n$ matrix $A$ is diagonalizable if and only if there is an invertible matrix $P$ given by
\[ P = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix} \]
where the $X_k$ are eigenvectors of $A$.
Moreover if $A$ is diagonalizable, the corresponding eigenvalues of $A$ are the diagonal entries of the diagonal matrix $D$.

Proof: Suppose $P$ is given as above as an invertible matrix whose columns are eigenvectors of $A$. Then $P^{-1}$ is of the form
\[
P^{-1} = \begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix}
\]
where $W_k^T X_j = \delta_{kj}$, which is the Kronecker symbol discussed earlier.
Then
\[
P^{-1}AP =
\begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix}
\begin{bmatrix} AX_1 & AX_2 & \cdots & AX_n \end{bmatrix}
=
\begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix}
\begin{bmatrix} \lambda_1 X_1 & \lambda_2 X_2 & \cdots & \lambda_n X_n \end{bmatrix}
=
\begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}
\]
Conversely, suppose $A$ is diagonalizable so that $P^{-1}AP = D$. Let
\[ P = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix} \]
where the columns are the $X_k$, and
\[ D = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix} \]
Then
\[
AP = PD = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix} \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}
\]
and so
\[
\begin{bmatrix} AX_1 & AX_2 & \cdots & AX_n \end{bmatrix} = \begin{bmatrix} \lambda_1 X_1 & \lambda_2 X_2 & \cdots & \lambda_n X_n \end{bmatrix}
\]
showing the $X_k$ are eigenvectors of $A$ and the $\lambda_k$ are eigenvalues. $\blacksquare$
We demonstrate this concept in the next example. Note that not only are the columns of the matrix $P$ formed by eigenvectors, but $P$ must be invertible, and so its columns must be linearly independent eigenvectors. We achieve this by using the basic eigenvectors for the columns of $P$.
Example 5.15: Diagonalize a Matrix
Let
\[ A = \begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix} \]
Find an invertible matrix $P$ and a diagonal matrix $D$ such that $P^{-1}AP = D$.

Solution. By Theorem 5.14 we use the eigenvectors of $A$ as the columns of $P$, and the corresponding eigenvalues of $A$ as the diagonal entries of $D$.
First, we will find the eigenvalues of $A$. To do so, we solve $\det(xI - A) = 0$ as follows.
\[
\det\left( x \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix} \right) = 0
\]
This computation is left as an exercise, and you should verify that the eigenvalues are $\lambda_1 = 2$, $\lambda_2 = 2$, and $\lambda_3 = 6$.
Next, we need to find the eigenvectors. We first find the eigenvectors for $\lambda_1, \lambda_2 = 2$. Solving $(A - 2I)X = 0$ to find the eigenvectors, we find that the eigenvectors are
\[
t \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} + s \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
\]
where $t, s$ are scalars. Hence there are two basic eigenvectors which are given by
\[ X_1 = \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \quad X_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \]
You can verify that the basic eigenvector for $\lambda_3 = 6$ is $X_3 = \begin{bmatrix} 0 \\ 1 \\ -2 \end{bmatrix}$.
Then, we construct the matrix $P$ as follows.
\[ P = \begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix} = \begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & -2 \end{bmatrix} \]
That is, the columns of $P$ are the basic eigenvectors of $A$. Then, you can verify that
\[ P^{-1} = \begin{bmatrix} -\frac{1}{4} & \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & 1 & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{2} & -\frac{1}{4} \end{bmatrix} \]
Thus,
\[
P^{-1}AP =
\begin{bmatrix} -\frac{1}{4} & \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & 1 & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{2} & -\frac{1}{4} \end{bmatrix}
\begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix}
\begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & -2 \end{bmatrix}
=
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 6 \end{bmatrix}
\]
You can see that the result here is a diagonal matrix where the entries on the main diagonal are the eigenvalues of $A$. We expected this based on Theorem 5.14. Notice that eigenvalues on the main diagonal must be in the same order as the corresponding eigenvectors in $P$.
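The diagonalization can be reproduced numerically with a short sketch (not from the original text); np.linalg.eig returns unit-length eigenvector columns, which serve equally well for $P$.

    import numpy as np

    A = np.array([[2.0, 0.0, 0.0],
                  [1.0, 4.0, -1.0],
                  [-2.0, -4.0, 4.0]])
    vals, P = np.linalg.eig(A)       # columns of P are eigenvectors
    D = np.linalg.inv(P) @ A @ P
    print(np.round(D, 8))            # diagonal matrix with entries 2, 2, 6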
It is possible that a matrix $A$ cannot be diagonalized. In other words, we cannot find an invertible matrix $P$ so that $P^{-1}AP = D$.
Consider the following example.

Example 5.16: Impossible to Diagonalize
Let
\[ A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \]
Find an invertible matrix $P$ and diagonal matrix $D$ so that $P^{-1}AP = D$.

Solution. Through the usual procedure, we find that the eigenvalues of $A$ are $\lambda_1 = 1$, $\lambda_2 = 1$.
To find the eigenvectors, we solve the equation $(A - \lambda I)X = 0$. The matrix $(A - \lambda I)$ is given by
\[ \begin{bmatrix} 1 - \lambda & 1 \\ 0 & 1 - \lambda \end{bmatrix} \]
Substituting in $\lambda = 1$, we have the matrix
\[ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \]
Then, solving the equation $(A - \lambda I)X = 0$ involves carrying the following augmented matrix to reduced row-echelon form.
\[ \left[ \begin{array}{rr|r} 0 & 1 & 0 \\ 0 & 0 & 0 \end{array} \right] \]
Since this matrix is already in reduced row-echelon form, no work needs to be done. The eigenvectors are of the form
\[ t \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]
and the basic eigenvector is
\[ X_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]
In this case, the matrix $A$ has one eigenvalue of multiplicity two, but only one basic eigenvector. In order to diagonalize $A$, we need to construct an invertible $2 \times 2$ matrix $P$. However, because $A$ only has one basic eigenvector, we cannot construct this $P$. Notice that if we were to use $X_1$ as both columns of $P$, $P$ would not be invertible. For this reason, we cannot repeat eigenvectors in $P$.
Hence this matrix cannot be diagonalized.
Recall Definition 5.4 of the multiplicity of an eigenvalue. It turns out that we can determine when a matrix is diagonalizable based on the multiplicity of its eigenvalues. In order for $A$ to be diagonalizable, the number of basic eigenvectors associated with an eigenvalue must be the same as the multiplicity of the eigenvalue. In Example 5.16, $A$ had one eigenvalue $\lambda = 1$ of multiplicity 2. However, there was only one basic eigenvector associated with this eigenvalue. Therefore, we can see that $A$ is not diagonalizable.
We summarize this in the following theorem.

Theorem 5.17: Diagonalizability Condition
An $n \times n$ matrix $A$ is diagonalizable exactly when, for each eigenvalue, the number of basic eigenvectors associated with that eigenvalue equals the multiplicity of that eigenvalue.
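Numerically (a sketch, not from the original text), the failure in Example 5.16 shows up as a geometric multiplicity smaller than the algebraic multiplicity:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    # number of basic eigenvectors for lambda = 1 is n - rank(A - I)
    geo = A.shape[0] - np.linalg.matrix_rank(A - np.eye(2))
    print(geo)   # 1, while the multiplicity of lambda = 1 is 2, so A is not diagonalizable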
You may wonder if there is a need to find $P^{-1}$, since we can use Theorem 5.14 to construct $P$ and $D$. We will see that $P^{-1}$ is needed to compute high powers of matrices, which is one of the major applications of diagonalizability.
Before we do so, we first discuss complex eigenvalues.
5.2.1. Complex Eigenvalues
In some applications, a matrix may have eigenvalues which are complex numbers. For example, this often occurs in differential equations. These questions are approached in the same way as above.
Consider the following example.

Example 5.18: A Real Matrix with Complex Eigenvalues
Let
\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix} \]
Find the eigenvalues and eigenvectors of $A$.

Solution. We will first find the eigenvalues as usual by solving the following equation.
\[
\det\left( x \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix} \right) = 0
\]
This reduces to $(x-1)\left( x^2 - 4x + 5 \right) = 0$. The solutions are $\lambda_1 = 1$, $\lambda_2 = 2+i$ and $\lambda_3 = 2-i$.
There is nothing new about finding the eigenvectors for $\lambda_1 = 1$ so this is left as an exercise.
Consider now the eigenvalue $\lambda_2 = 2+i$. As usual, we solve the equation $(A - \lambda I)X = 0$ as given by
\[
\left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix} - (2+i) \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right) X = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]
In other words, we need to solve the system represented by the augmented matrix
\[
\left[ \begin{array}{ccc|c} -1-i & 0 & 0 & 0 \\ 0 & -i & -1 & 0 \\ 0 & 1 & -i & 0 \end{array} \right]
\]
We now use our row operations to solve the system. Divide the first row by $(-1-i)$ and then take $-i$ times the second row and add it to the third row. This yields
\[
\left[ \begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & -i & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right]
\]
Now multiply the second row by $i$ to obtain the reduced row-echelon form, given by
\[
\left[ \begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & -i & 0 \\ 0 & 0 & 0 & 0 \end{array} \right]
\]
Therefore, the eigenvectors are of the form
\[ t \begin{bmatrix} 0 \\ i \\ 1 \end{bmatrix} \]
and the basic eigenvector is given by
\[ X_2 = \begin{bmatrix} 0 \\ i \\ 1 \end{bmatrix} \]
As an exercise, verify that the eigenvectors for $\lambda_3 = 2-i$ are of the form
\[ t \begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix} \]
Hence, the basic eigenvector is given by
\[ X_3 = \begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix} \]
As usual, be sure to check your answers! To verify, we check that $AX_3 = (2-i)X_3$ as follows.
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix}
\begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix}
=
\begin{bmatrix} 0 \\ -1-2i \\ 2-i \end{bmatrix}
= (2-i) \begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix}
\]
Therefore, we know that this eigenvector and eigenvalue are correct.
Notice that in Example 5.18, two of the eigenvalues were given by $\lambda_2 = 2+i$ and $\lambda_3 = 2-i$. You may recall that these two complex numbers are conjugates. It turns out that whenever a matrix containing real entries has a complex eigenvalue $\lambda$, it also has an eigenvalue equal to $\overline{\lambda}$, the conjugate of $\lambda$.
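A quick check (a sketch, not from the original text) shows the conjugate pair directly:

    import numpy as np

    A = np.array([[1.0, 0.0, 0.0],
                  [0.0, 2.0, -1.0],
                  [0.0, 1.0, 2.0]])
    print(np.sort_complex(np.linalg.eigvals(A)))   # approximately [1, 2-1j, 2+1j]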
In the next section, we look at some important applications of diagonalization.
5.3 Applications of Spectral Theory
5.3.1. Raising a Matrix to a High Power
Suppose we have a matrix $A$ and we want to find $A^{50}$. One could try to multiply $A$ with itself 50 times, but this is very tedious (try it!). However, diagonalization allows us to compute high powers of a matrix relatively easily. Suppose $A$ is diagonalizable, so that $P^{-1}AP = D$. We can rearrange this equation to write $A = PDP^{-1}$.
Now, consider $A^2$. Since $A = PDP^{-1}$, it follows that
\[ A^2 = \left( PDP^{-1} \right)^2 = PDP^{-1}PDP^{-1} = PD^2P^{-1} \]
Similarly,
\[ A^3 = \left( PDP^{-1} \right)^3 = PDP^{-1}PDP^{-1}PDP^{-1} = PD^3P^{-1} \]
In general,
\[ A^n = \left( PDP^{-1} \right)^n = PD^nP^{-1} \]
Therefore, we have reduced the problem to finding $D^n$. In order to compute $D^n$, because $D$ is diagonal we only need to raise every entry on the main diagonal of $D$ to the power of $n$.
Through this method, we can compute large powers of matrices. Consider the following
example.
Example 5.19: Raising a Matrix to a High Power
Let $A = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}$. Find $A^{50}$.
Solution. We will first diagonalize $A$. The steps are left as an exercise and you may wish to verify that the eigenvalues of $A$ are $\lambda_1 = 1$, $\lambda_2 = 1$, and $\lambda_3 = 2$.
The basic eigenvectors corresponding to $\lambda_1, \lambda_2 = 1$ are
\[ X_1 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad X_2 = \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} \]
The basic eigenvector corresponding to $\lambda_3 = 2$ is
\[ X_3 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \]
Now we construct $P$ by using the basic eigenvectors of $A$ as the columns of $P$. Thus
\[ P = \begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix} = \begin{bmatrix} 0 & -1 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & -1 \end{bmatrix} \]
Then also
\[ P^{-1} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{bmatrix} \]
which you may wish to verify.
Then,
\[
P^{-1}AP =
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 2 & 1 & 0 \\ 0 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 0 & -1 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & -1 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} = D
\]
Now it follows by rearranging the equation that
\[
A = PDP^{-1} =
\begin{bmatrix} 0 & -1 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & -1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{bmatrix}
\]
Therefore,
\[
A^{50} = PD^{50}P^{-1} =
\begin{bmatrix} 0 & -1 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & -1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}^{50}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{bmatrix}
\]
By our discussion above, $D^{50}$ is found as follows.
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}^{50}
=
\begin{bmatrix} 1^{50} & 0 & 0 \\ 0 & 1^{50} & 0 \\ 0 & 0 & 2^{50} \end{bmatrix}
\]
It follows that
\[
A^{50} =
\begin{bmatrix} 0 & -1 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & -1 \end{bmatrix}
\begin{bmatrix} 1^{50} & 0 & 0 \\ 0 & 1^{50} & 0 \\ 0 & 0 & 2^{50} \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{bmatrix}
=
\begin{bmatrix} 2^{50} & -1 + 2^{50} & 0 \\ 0 & 1 & 0 \\ 1 - 2^{50} & 1 - 2^{50} & 1 \end{bmatrix}
\]
Through diagonalization, we can efficiently compute a high power of $A$. Without this, we would be forced to carry out 49 matrix multiplications by hand!
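As a sanity check (a sketch, not from the original text), a direct integer matrix power gives the same entries; 64-bit integers are exact here since every entry stays well below $2^{63}$.

    import numpy as np

    A = np.array([[2, 1, 0],
                  [0, 1, 0],
                  [-1, -1, 1]], dtype=np.int64)
    A50 = np.linalg.matrix_power(A, 50)
    print(A50[0, 0] == 2**50, A50[0, 1] == 2**50 - 1)   # True True
    print(A50[2, 0] == 1 - 2**50)                        # True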
The next section explores another interesting application of diagonalization.
5.3.2. Markov Matrices
There are applications of great importance which feature a special type of matrix. Matrices whose columns consist of non-negative numbers which sum to one are called Markov matrices. An important application of Markov matrices is in population migration, as illustrated in the following definition.

Definition 5.20: Migration Matrices
Let $n$ locations be denoted by the numbers $1, 2, \ldots, n$. Suppose it is the case that each year the proportion of residents in location $j$ which move to location $i$ is $A_{ij}$. Also suppose no one escapes or emigrates from these $n$ locations. This last assumption requires $\sum_i A_{ij} = 1$, and means that the matrix $A$, such that $A = [A_{ij}]$, is a Markov matrix. In this context, $A$ is also called a migration matrix.
Consider the following example which demonstrates this situation.
Example 5.21: Migration Matrix
Let $A$ be a Markov matrix given by
\[ A = \begin{bmatrix} .4 & .2 \\ .6 & .8 \end{bmatrix} \]
Verify that $A$ is a Markov matrix and describe the entries of $A$ in terms of population migration.

Solution. The columns of $A$ are comprised of non-negative numbers which sum to 1. Hence, $A$ is a Markov matrix.
Now, consider the entries $A_{ij}$ of $A$ in terms of population. The entry $A_{11} = .4$ is the proportion of residents in location 1 which stay in location 1 in a given time period. Entry $A_{21} = .6$ is the proportion of residents in location 1 which move to location 2 in the same time period. Entry $A_{12} = .2$ is the proportion of residents in location 2 which move to location 1. Finally, entry $A_{22} = .8$ is the proportion of residents in location 2 which stay in location 2 in this time period.
Considered as a Markov matrix, these numbers are usually identified with probabilities. Hence, we can say that the probability that a resident of location 1 will stay in location 1 in the time period is .4.
169
Observe that in Example 5.21 if there was initially say 15 thousand people in location
1 and 10 thousands in location 2, then after one year there would be .4 15 + .2 10 = 8
thousands people in location 1 the following year, and similarly there would be .615+.8
10 = 17 thousands people in location 2 the following year.
More generally let X = [X
1
X
n
]
T
where X
i
is the population of location i at a given
instant. We can use the migration matrix A to nd the population in each location i one time
period later. To do this, we compute

j
A
ij
X
j
= (AX)
i
. In order to nd the population of
location i after k years, we compute
_
A
k
X
_
i
.
The following is an important proposition.

Proposition 5.22: Eigenvalues of a Migration Matrix
Let $A = [A_{ij}]$ be a migration matrix. Then 1 is always an eigenvalue for $A$.

Proof: Remember that the determinant of a matrix always equals that of its transpose. Therefore,
\[ \det(A - xI) = \det\left( (A - xI)^T \right) = \det\left( A^T - xI \right) \]
because $I^T = I$. Thus the characteristic equation for $A$ is the same as the characteristic equation for $A^T$. Consequently, $A$ and $A^T$ have the same eigenvalues. We will show that 1 is an eigenvalue for $A^T$ and then it will follow that 1 is an eigenvalue for $A$.
Remember that for a migration matrix, $\sum_i A_{ij} = 1$. Therefore, if $A^T = [B_{ij}]$ with $B_{ij} = A_{ji}$, it follows that
\[ \sum_j B_{ij} = \sum_j A_{ji} = 1 \]
Therefore, from matrix multiplication,
\[
A^T \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}
=
\begin{bmatrix} \sum_j B_{ij} \\ \vdots \\ \sum_j B_{ij} \end{bmatrix}
=
\begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}
\]
Notice that this shows that $\begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}$ is an eigenvector for $A^T$ corresponding to the eigenvalue $\lambda = 1$. As explained above, this shows that $\lambda = 1$ is an eigenvalue for $A$ because $A$ and $A^T$ have the same eigenvalues. $\blacksquare$
Consider the following example.
Example 5.23: Using a Migration Matrix
Consider the migration matrix
\[ A = \begin{bmatrix} .6 & 0 & .1 \\ .2 & .8 & 0 \\ .2 & .2 & .9 \end{bmatrix} \]
for locations 1, 2, and 3. Suppose initially there are 100 residents in location 1, 200 in location 2 and 400 in location 3. Find the population in the three locations after 10 units of time.

Solution. From our above discussion, we need to consider $\left( A^k X \right)_i$, where $k$ denotes the number of time periods which have passed. The vector $X_0$ is constructed as the initial populations of each location. In this case, it is given by
\[ X_0 = \begin{bmatrix} 100 \\ 200 \\ 400 \end{bmatrix} \]
Therefore, we compute the populations in each location after 10 units of time as follows.
\[
\begin{bmatrix} .6 & 0 & .1 \\ .2 & .8 & 0 \\ .2 & .2 & .9 \end{bmatrix}^{10}
\begin{bmatrix} 100 \\ 200 \\ 400 \end{bmatrix}
=
\begin{bmatrix} 115.085\,829\,22 \\ 120.130\,672\,44 \\ 464.783\,498\,34 \end{bmatrix}
\]
Since we are speaking about populations, we would need to round these numbers to provide a logical answer. Therefore, we can say that after 10 units of time, there will be 115 residents in location 1, 120 in location 2, and 465 in location 3.
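A direct computation (a sketch, not from the original text) reproduces these values:

    import numpy as np

    A = np.array([[0.6, 0.0, 0.1],
                  [0.2, 0.8, 0.0],
                  [0.2, 0.2, 0.9]])
    x0 = np.array([100.0, 200.0, 400.0])
    print(np.linalg.matrix_power(A, 10) @ x0)   # approximately [115.09, 120.13, 464.78]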
Suppose we wish to know how many residents will be in a certain location after a very long time. It turns out that if some power of the migration matrix has all positive entries, then there is a limiting vector $X = \lim_{k\to\infty} A^k X_0$. As above, $X_0$ is defined to be the initial vector describing the number of inhabitants in the various locations before any time has passed.
The vector $X$ will be an eigenvector for the eigenvalue 1 because
\[ 1X = \lim_{k\to\infty} A^k X_0 = \lim_{k\to\infty} A^{k+1} X_0 = A \lim_{k\to\infty} A^k X_0 = AX \]
Now, the sum of the entries of $X$ will equal the sum of the entries of the initial vector $X_0$, because this sum is preserved by every multiplication by $A$ since
\[ \sum_i \sum_j A_{ij} X_j = \sum_j X_j \left( \sum_i A_{ij} \right) = \sum_j X_j \]
Consider the following example. Notice that it is the same as Example 5.23, but here it involves a longer time frame.
Example 5.24: Populations over the Long Run
Consider the migration matrix
\[ A = \begin{bmatrix} .6 & 0 & .1 \\ .2 & .8 & 0 \\ .2 & .2 & .9 \end{bmatrix} \]
for locations 1, 2, and 3. Suppose initially there are 100 residents in location 1, 200 in location 2 and 400 in location 3. Find the population in the three locations after a long time.

Solution. By the above discussion, we need to find the eigenvector $X$ associated with the eigenvalue 1. Once we find $X$, we will find the vector in the same direction for which the sum of its entries equals the sum of the entries of the initial vector $X_0$.
Thus we need to find a solution to
\[
\left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} .6 & 0 & .1 \\ .2 & .8 & 0 \\ .2 & .2 & .9 \end{bmatrix} \right)
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]
The augmented matrix is
\[
\left[ \begin{array}{rrr|r} 0.4 & 0 & -0.1 & 0 \\ -0.2 & 0.2 & 0 & 0 \\ -0.2 & -0.2 & 0.1 & 0 \end{array} \right]
\]
and its reduced row-echelon form is
\[
\left[ \begin{array}{rrr|r} 1 & 0 & -0.25 & 0 \\ 0 & 1 & -0.25 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right]
\]
Therefore, the eigenvectors are
\[ s \begin{bmatrix} 0.25 \\ 0.25 \\ 1 \end{bmatrix} \]
The initial vector $X_0$ is given by
\[ \begin{bmatrix} 100 \\ 200 \\ 400 \end{bmatrix} \]
Now all that remains is to choose the value of $s$ such that
\[ 0.25s + 0.25s + s = 100 + 200 + 400 \]
Solving this equation for $s$ yields $s = \frac{1400}{3}$. Therefore the population in the long run is given by
\[
\frac{1400}{3} \begin{bmatrix} 0.25 \\ 0.25 \\ 1 \end{bmatrix} \approx \begin{bmatrix} 116.67 \\ 116.67 \\ 466.67 \end{bmatrix}
\]
Again, because we are working with populations, these values need to be rounded.
We can see that the numbers we calculated in Example 5.23 for the populations after the 10th unit of time are not far from these long term values.
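The long-run vector can also be extracted numerically (a sketch, not from the original text) from the eigenvector for the eigenvalue 1, rescaled so that its entries sum to the total population:

    import numpy as np

    A = np.array([[0.6, 0.0, 0.1],
                  [0.2, 0.8, 0.0],
                  [0.2, 0.2, 0.9]])
    vals, V = np.linalg.eig(A)
    k = np.argmin(np.abs(vals - 1))     # locate the eigenvalue 1
    v = V[:, k].real
    print(v * 700 / v.sum())            # approximately [116.67, 116.67, 466.67]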
Consider another example.
Example 5.25: Populations After a Long Time
Suppose a migration matrix is given by
\[
A = \begin{bmatrix} \frac{1}{5} & \frac{1}{2} & \frac{1}{5} \\ \frac{1}{4} & \frac{1}{4} & \frac{1}{2} \\ \frac{11}{20} & \frac{1}{4} & \frac{3}{10} \end{bmatrix}
\]
Find the comparison between the populations in the three locations after a long time.

Solution. Again, we first need to find the eigenvector for $\lambda = 1$. Solve
\[
\left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} \frac{1}{5} & \frac{1}{2} & \frac{1}{5} \\ \frac{1}{4} & \frac{1}{4} & \frac{1}{2} \\ \frac{11}{20} & \frac{1}{4} & \frac{3}{10} \end{bmatrix} \right)
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]
The augmented matrix is
\[
\left[ \begin{array}{ccc|c} \frac{4}{5} & -\frac{1}{2} & -\frac{1}{5} & 0 \\ -\frac{1}{4} & \frac{3}{4} & -\frac{1}{2} & 0 \\ -\frac{11}{20} & -\frac{1}{4} & \frac{7}{10} & 0 \end{array} \right]
\]
The reduced row-echelon form is
\[
\left[ \begin{array}{ccc|c} 1 & 0 & -\frac{16}{19} & 0 \\ 0 & 1 & -\frac{18}{19} & 0 \\ 0 & 0 & 0 & 0 \end{array} \right]
\]
and so an eigenvector is
\[ \begin{bmatrix} 16 \\ 18 \\ 19 \end{bmatrix} \]
Therefore, the proportion of the population in location 2 to that in location 1 is given by $\frac{18}{16}$. The proportion of the population in location 3 to that in location 2 is given by $\frac{19}{18}$.

There are many other applications which can be studied using migration matrices. They include things like the gambler's ruin problem, which asks for the probability that a compulsive gambler will eventually lose all his money.
It turns out that migration matrices are part of a broader group of applications called dynamical systems. We examine these in the next section.
5.3.3. Dynamical Systems
The migration matrices discussed above give an example of a discrete dynamical system. We call them discrete because they involve discrete values taken at a sequence of points rather than on a continuous interval of time.
An example of a situation which can be studied in this way is a predator–prey model. Consider the following model where $x$ is the number of prey and $y$ the number of predators in a certain area at a certain time. These are functions of $n \in \mathbb{N}$ where $n = 1, 2, \ldots$ are the ends of intervals of time which may be of interest in the problem. In other words, $x(n)$ is the number of prey at the end of the $n^{th}$ interval of time. This situation may be modeled by the following equation
\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]
for some $a, b, c, d \in \mathbb{R}$.
This says that from time period $n$ to $n+1$, $x$ increases if there are more $x$ and decreases as there are more $y$. In the context of this example, this means that as the number of predators increases, the number of prey decreases. As for $y$, it increases if there are more $y$ and also if there are more $x$.
We will now consider an example in more detail.
Example 5.26: Solutions of a Discrete Dynamical System
Suppose a dynamical system is of the form
\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix} = \begin{bmatrix} 1.5 & -0.5 \\ 1.0 & 0 \end{bmatrix} \begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]
Find solutions to the dynamical system for initial conditions $x_0 = 20$, $y_0 = 10$.
Solution. Let
\[ A = \begin{bmatrix} 1.5 & -0.5 \\ 1.0 & 0 \end{bmatrix} \]
Then you can verify that the eigenvalues of $A$ are 1 and .5. By diagonalizing, we can write $A$ in the form
\[
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & .5 \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\]
Now, given an initial condition
\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \]
the solution to the dynamical system is given by
\[
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
=
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & .5 \end{bmatrix}^n
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
=
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & (.5)^n \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
=
\begin{bmatrix} y_0\left( (.5)^n - 1 \right) - x_0\left( (.5)^n - 2 \right) \\ y_0\left( 2(.5)^n - 1 \right) - x_0\left( 2(.5)^n - 2 \right) \end{bmatrix}
\]
If we let $n \to \infty$, the limit is given by
\[ \begin{bmatrix} 2x_0 - y_0 \\ 2x_0 - y_0 \end{bmatrix} \]
Thus for large $n$,
\[ \begin{bmatrix} x(n) \\ y(n) \end{bmatrix} \approx \begin{bmatrix} 2x_0 - y_0 \\ 2x_0 - y_0 \end{bmatrix} \]
Now suppose the initial condition is given by
\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 20 \\ 10 \end{bmatrix} \]
Then, we can find solutions for various values of $n$. Here are the solutions for values of $n$ between 1 and 5.
\[
n = 1: \begin{bmatrix} 25.0 \\ 20.0 \end{bmatrix}, \quad
n = 2: \begin{bmatrix} 27.5 \\ 25.0 \end{bmatrix}, \quad
n = 3: \begin{bmatrix} 28.75 \\ 27.5 \end{bmatrix}
\]
\[
n = 4: \begin{bmatrix} 29.375 \\ 28.75 \end{bmatrix}, \quad
n = 5: \begin{bmatrix} 29.688 \\ 29.375 \end{bmatrix}
\]
Notice that as $n$ increases, we approach the vector given by
\[
\begin{bmatrix} 2x_0 - y_0 \\ 2x_0 - y_0 \end{bmatrix} = \begin{bmatrix} 2(20) - 10 \\ 2(20) - 10 \end{bmatrix} = \begin{bmatrix} 30 \\ 30 \end{bmatrix}
\]
These solutions can be pictured by plotting the point $(x(n), y(n))$ for each $n$.
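The iterates are easy to reproduce directly (a sketch, not from the original text):

    import numpy as np

    A = np.array([[1.5, -0.5],
                  [1.0, 0.0]])
    v = np.array([20.0, 10.0])
    for n in range(1, 6):
        v = A @ v
        print(n, v)    # approaches [30, 30]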
The following example demonstrates another system which exhibits some interesting
behaviour. When we graph the solutions, it is possible for the ordered pairs to spiral around
the origin.
Example 5.27: Finding Solutions to a Dynamical System
Suppose a dynamical system is of the form
\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix} = \begin{bmatrix} 0.7 & 0.7 \\ -0.7 & 0.7 \end{bmatrix} \begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]
Find solutions to the dynamical system for given initial conditions.
Solution. Let
\[ A = \begin{bmatrix} 0.7 & 0.7 \\ -0.7 & 0.7 \end{bmatrix} \]
Then, you can verify that the eigenvalues of $A$ are complex and are given by $\lambda_1 = .7 + .7i$ and $\lambda_2 = .7 - .7i$. Suppose the initial condition is
\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \]
What is a formula for the solutions to the dynamical system? To solve this, we must diagonalize $A$.
You can verify that the eigenvector for $\lambda_1 = .7 + .7i$ is
\[ \begin{bmatrix} 1 \\ i \end{bmatrix} \]
and that the eigenvector for $\lambda_2 = .7 - .7i$ is
\[ \begin{bmatrix} 1 \\ -i \end{bmatrix} \]
Thus the matrix $A$ can be written in the form
\[
\begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix}
\begin{bmatrix} .7 + .7i & 0 \\ 0 & .7 - .7i \end{bmatrix}
\begin{bmatrix} \frac{1}{2} & -\frac{1}{2}i \\ \frac{1}{2} & \frac{1}{2}i \end{bmatrix}
\]
and so,
\[
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
=
\begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix}
\begin{bmatrix} (.7+.7i)^n & 0 \\ 0 & (.7-.7i)^n \end{bmatrix}
\begin{bmatrix} \frac{1}{2} & -\frac{1}{2}i \\ \frac{1}{2} & \frac{1}{2}i \end{bmatrix}
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
\]
The explicit solution is given by
\[
\begin{bmatrix}
x_0\left( \frac{1}{2}(0.7-0.7i)^n + \frac{1}{2}(0.7+0.7i)^n \right) + y_0\left( \frac{1}{2}i(0.7-0.7i)^n - \frac{1}{2}i(0.7+0.7i)^n \right) \\
y_0\left( \frac{1}{2}(0.7-0.7i)^n + \frac{1}{2}(0.7+0.7i)^n \right) - x_0\left( \frac{1}{2}i(0.7-0.7i)^n - \frac{1}{2}i(0.7+0.7i)^n \right)
\end{bmatrix}
\]
Suppose the initial condition is
\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 10 \\ 10 \end{bmatrix} \]
Then one obtains a sequence of values for $n = 1, 2, \ldots, 20$ which, when plotted, show points getting gradually closer to the origin while circling the origin in the clockwise direction as they do so. Also, since both eigenvalues are slightly smaller than 1 in absolute value,
\[
\lim_{n\to\infty} \begin{bmatrix} x(n) \\ y(n) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]
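The spiral can be traced numerically (a sketch, not from the original text); the modulus $|0.7 + 0.7i| \approx 0.99$ shrinks each point slightly while it rotates 45° clockwise.

    import numpy as np

    A = np.array([[0.7, 0.7],
                  [-0.7, 0.7]])
    v = np.array([10.0, 10.0])
    for n in range(1, 6):
        v = A @ v
        print(n, np.round(v, 3), round(float(np.linalg.norm(v)), 3))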
This type of behavior, along with complex eigenvalues, is typical of the deviations from an equilibrium point in the Lotka–Volterra system of differential equations, which is a famous model for predator–prey interactions. These differential equations are given by
\[
x' = x(a - by), \quad y' = -y(c - dx)
\]
where $a, b, c, d$ are positive constants. For example, you might have $x$ be the population of moose and $y$ the population of wolves on an island.
Note that these equations make logical sense. The first says that the rate at which the moose population increases would be $ax$ if there were no predators $y$. However, this is modified by multiplying instead by $(a - by)$, because if there are predators, these will militate against the population of moose. The more predators there are, the more pronounced is this effect. As to the predator equation, you can see that the equations predict that if there are many prey around, then the rate of growth of the predators would seem to be high. However, this is modified by the term $-cy$, because if there are many predators, there would be competition for the available food supply and this would tend to decrease $y'$.
The behavior near an equilibrium point, which is a point where the right side of the differential equations equals zero, is of great interest. In this case, the equilibrium point is
\[ x = \frac{c}{d}, \quad y = \frac{a}{b} \]
Then one defines new variables measuring the deviation from equilibrium, so that the old $x$ and $y$ are replaced by $x + \frac{c}{d}$ and $y + \frac{a}{b}$. In terms of these new variables, the differential equations become
\[
x' = \left( x + \frac{c}{d} \right)\left( a - b\left( y + \frac{a}{b} \right)\right), \quad
y' = -\left( y + \frac{a}{b} \right)\left( c - d\left( x + \frac{c}{d} \right)\right)
\]
Multiplying out the right sides yields
\[
x' = -bxy - b\frac{c}{d}y, \quad y' = dxy + \frac{a}{b}dx
\]
The interest is in $x, y$ small, and so these equations are essentially equal to
\[
x' = -b\frac{c}{d}y, \quad y' = \frac{a}{b}dx
\]
Replace $x'$ with the difference quotient $\frac{x(t+h) - x(t)}{h}$ where $h$ is a small positive number, and $y'$ with a similar difference quotient. For example one could have $h$ correspond to one day or even one hour. Thus, for $h$ small enough, the following would seem to be a good approximation to the differential equations.
\[
x(t+h) = x(t) - hb\frac{c}{d}\,y(t), \quad y(t+h) = y(t) + h\frac{a}{b}d\,x(t)
\]
Let $1, 2, 3, \ldots$ denote the ends of discrete intervals of time having length $h$ chosen above. Then the above equations take the form
\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
=
\begin{bmatrix} 1 & -\frac{hbc}{d} \\ \frac{had}{b} & 1 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]
Note that the eigenvalues of this matrix are always complex.
We are not interested in time intervals of length $h$ for $h$ very small. Instead, we are interested in much longer lengths of time. Thus, replacing the time interval with $mh$,
\[
\begin{bmatrix} x(n+m) \\ y(n+m) \end{bmatrix}
=
\begin{bmatrix} 1 & -\frac{hbc}{d} \\ \frac{had}{b} & 1 \end{bmatrix}^m
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]
For example, if $m = 2$, you would have
\[
\begin{bmatrix} x(n+2) \\ y(n+2) \end{bmatrix}
=
\begin{bmatrix} 1 - ach^2 & -2b\frac{c}{d}h \\ 2\frac{a}{b}dh & 1 - ach^2 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]
Note that most of the time, the eigenvalues of the new matrix will be complex.
You can also notice, by considering higher powers of the matrix, that the upper right corner will remain negative. Thus letting $1, 2, 3, \ldots$ denote the ends of discrete intervals of time, the desired discrete dynamical system is of the form
\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
=
\begin{bmatrix} a & -b \\ c & d \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]
where $a, b, c, d$ are positive constants and the matrix will likely have complex eigenvalues because it is a power of a matrix which has complex eigenvalues.
You can see from the above discussion that if the eigenvalues of the matrix used to define the dynamical system are less than 1 in absolute value, then the origin is stable in the sense that as $n \to \infty$, the solution converges to the origin. If either eigenvalue is larger than 1 in absolute value, then the solutions to the dynamical system will usually be unbounded, unless the initial condition is chosen very carefully. The next example exhibits the case where one eigenvalue is larger than 1 and the other is smaller than 1.
The following example demonstrates a familiar concept as a dynamical system.
Example 5.28: The Fibonacci Sequence
The Fibonacci sequence is the sequence given by
\[ 1, 1, 2, 3, 5, \ldots \]
which is defined recursively in the form
\[ x(0) = 1 = x(1), \quad x(n+2) = x(n+1) + x(n) \]
Show how the Fibonacci sequence can be considered a dynamical system.

Solution. This sequence is extremely important in the study of reproducing rabbits. It can be considered as a dynamical system as follows. Let $y(n) = x(n+1)$. Then the above recurrence relation can be written as
\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x(n) \\ y(n) \end{bmatrix}, \quad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
Let
\[ A = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \]
The eigenvalues of the matrix $A$ are $\lambda_1 = \frac{1}{2} - \frac{1}{2}\sqrt{5}$ and $\lambda_2 = \frac{1}{2}\sqrt{5} + \frac{1}{2}$. The corresponding eigenvectors are, respectively,
\[
X_1 = \begin{bmatrix} -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 \end{bmatrix}, \quad
X_2 = \begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 \end{bmatrix}
\]
You can see from a short computation that one of the eigenvalues is smaller than 1 in absolute value while the other is larger than 1 in absolute value. Now, diagonalizing $A$ gives us
\[
\begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} & -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 & 1 \end{bmatrix}^{-1}
\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} & -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 & 1 \end{bmatrix}
=
\begin{bmatrix} \frac{1}{2}\sqrt{5} + \frac{1}{2} & 0 \\ 0 & \frac{1}{2} - \frac{1}{2}\sqrt{5} \end{bmatrix}
\]
Then it follows that for a given initial condition, the solution to this dynamical system is of the form
\[
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
=
\begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} & -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} \left( \frac{1}{2}\sqrt{5} + \frac{1}{2} \right)^n & 0 \\ 0 & \left( \frac{1}{2} - \frac{1}{2}\sqrt{5} \right)^n \end{bmatrix}
\begin{bmatrix} \frac{1}{5}\sqrt{5} & \frac{1}{10}\sqrt{5} + \frac{1}{2} \\ -\frac{1}{5}\sqrt{5} & \frac{1}{2} - \frac{1}{10}\sqrt{5} \end{bmatrix}
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
It follows that
\[
x(n) = \left( \frac{1}{2}\sqrt{5} + \frac{1}{2} \right)^n \left( \frac{1}{10}\sqrt{5} + \frac{1}{2} \right) + \left( \frac{1}{2} - \frac{1}{2}\sqrt{5} \right)^n \left( \frac{1}{2} - \frac{1}{10}\sqrt{5} \right)
\]
Plotting the ordered pairs $(x(n), y(n))$ for $n = 0, 1, 2, \ldots$ shows the points moving rapidly away from the origin, since the dominant eigenvalue is larger than 1 in absolute value.
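As a check (a sketch, not from the original text), the closed-form expression agrees with direct iteration of the recurrence:

    import math

    def x_closed(n):
        s5 = math.sqrt(5)
        return (((1 + s5) / 2) ** n * (s5 / 10 + 0.5)
                + ((1 - s5) / 2) ** n * (0.5 - s5 / 10))

    fib = [1, 1]
    for n in range(2, 10):
        fib.append(fib[-1] + fib[-2])
    print([round(x_closed(n)) for n in range(10)])   # [1, 1, 2, 3, 5, 8, ...]
    print(fib)                                       # the same sequence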
180
There is so much more that can be said about dynamical systems. It is a major topic of
study in dierential equations and what is given above is just an introduction.
5.4 Exercises
1. If $A$ is an invertible $n \times n$ matrix, compare the eigenvalues of $A$ and $A^{-1}$. More generally, for $m$ an arbitrary integer, compare the eigenvalues of $A$ and $A^m$.

2. If $A$ is an $n \times n$ matrix and $c$ is a nonzero constant, compare the eigenvalues of $A$ and $cA$.

3. If $A$ is the matrix of a linear transformation which rotates all vectors in $\mathbb{R}^2$ through $60^\circ$, explain why $A$ cannot have any real eigenvalues. Is there an angle such that rotation through this angle would have a real eigenvalue? What eigenvalues would be obtainable in this way?

4. Let $A$ be the $2 \times 2$ matrix of the linear transformation which rotates all vectors in $\mathbb{R}^2$ through an angle of $\theta$. For which values of $\theta$ does $A$ have a real eigenvalue?

5. State the eigenvalue problem from both an algebraic and a geometric perspective.

6. Let $A, B$ be invertible $n \times n$ matrices which commute. That is, $AB = BA$. Suppose $X$ is an eigenvector of $B$. Show that then $AX$ must also be an eigenvector for $B$.

7. Suppose $A$ is an $n \times n$ matrix and it satisfies $A^m = A$ for some $m$ a positive integer larger than 1. Show that if $\lambda$ is an eigenvalue of $A$ then $|\lambda|$ equals either 0 or 1.

8. Show that if $AX = \lambda X$ and $AY = \lambda Y$, then whenever $k, p$ are scalars,
\[ A(kX + pY) = \lambda(kX + pY) \]
Does this imply that $kX + pY$ is an eigenvector? Explain.

9. Is it possible for a nonzero matrix to have only 0 as an eigenvalue?

10. Let $A$ be the $2 \times 2$ matrix of the linear transformation which rotates all vectors in $\mathbb{R}^2$ through an angle of $\theta$. For which values of $\theta$ does $A$ have a real eigenvalue?

11. Let $T$ be the linear transformation which reflects vectors about the $x$ axis. Find a matrix for $T$ and then find its eigenvalues and eigenvectors.

12. Let $T$ be the linear transformation which rotates all vectors in $\mathbb{R}^2$ counterclockwise through an angle of $\pi/2$. Find a matrix of $T$ and then find eigenvalues and eigenvectors.
13. Let $T$ be the linear transformation which reflects all vectors in $\mathbb{R}^3$ through the $xy$ plane. Find a matrix for $T$ and then obtain its eigenvalues and eigenvectors.

14. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 6 & 92 & 12 \\ 0 & 0 & 0 \\ 2 & 31 & 4 \end{bmatrix} \]
One eigenvalue is 2. Diagonalize the matrix if possible.

15. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 2 & 17 & 6 \\ 0 & 0 & 0 \\ 1 & 9 & 3 \end{bmatrix} \]
One eigenvalue is 1. Diagonalize if possible.

16. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 9 & 2 & 8 \\ 2 & 6 & 2 \\ 8 & 2 & 5 \end{bmatrix} \]
One eigenvalue is 3. Diagonalize if possible.

17. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 6 & 76 & 16 \\ 2 & 21 & 4 \\ 2 & 64 & 17 \end{bmatrix} \]
One eigenvalue is 2. Diagonalize if possible.

18. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 3 & 5 & 2 \\ 8 & 11 & 4 \\ 10 & 11 & 3 \end{bmatrix} \]
One eigenvalue is 3. Diagonalize if possible.

19. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 5 & 18 & 32 \\ 0 & 5 & 4 \\ 2 & 5 & 11 \end{bmatrix} \]
One eigenvalue is 1. Diagonalize if possible.
20. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 13 & 28 & 28 \\ 4 & 9 & 8 \\ 4 & 8 & 9 \end{bmatrix} \]
One eigenvalue is 3. Diagonalize if possible.

21. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 89 & 38 & 268 \\ 14 & 2 & 40 \\ 30 & 12 & 90 \end{bmatrix} \]
One eigenvalue is 3. Diagonalize if possible.

22. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 1 & 90 & 0 \\ 0 & 2 & 0 \\ 3 & 89 & 2 \end{bmatrix} \]
One eigenvalue is 1. Diagonalize if possible.

23. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 11 & 45 & 30 \\ 10 & 26 & 20 \\ 20 & 60 & 44 \end{bmatrix} \]
One eigenvalue is 1. Diagonalize if possible.

24. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 95 & 25 & 24 \\ 196 & 53 & 48 \\ 164 & 42 & 43 \end{bmatrix} \]
One eigenvalue is 5. Diagonalize if possible.

25. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 15 & 24 & 7 \\ 6 & 5 & 1 \\ 58 & 76 & 20 \end{bmatrix} \]
One eigenvalue is 2. Diagonalize if possible. Hint: This one has some complex eigenvalues.

26. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 15 & 25 & 6 \\ 13 & 23 & 4 \\ 91 & 155 & 30 \end{bmatrix} \]
One eigenvalue is 2. Diagonalize if possible. Hint: This one has some complex eigenvalues.

27. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 11 & 12 & 4 \\ 8 & 17 & 4 \\ 4 & 28 & 3 \end{bmatrix} \]
One eigenvalue is 1. Diagonalize if possible. Hint: This one has some complex eigenvalues.

28. Find the eigenvalues and eigenvectors of the matrix
\[ \begin{bmatrix} 14 & 12 & 5 \\ 6 & 2 & 1 \\ 69 & 51 & 21 \end{bmatrix} \]
One eigenvalue is 3. Diagonalize if possible. Hint: This one has some complex eigenvalues.
29. Suppose $A$ is a $3 \times 3$ matrix and the following information is available.
\[
A \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = 0 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad
A \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad
A \begin{bmatrix} 2 \\ 3 \\ 2 \end{bmatrix} = 2 \begin{bmatrix} 2 \\ 3 \\ 2 \end{bmatrix}
\]
Find $A \begin{bmatrix} 1 \\ 4 \\ 3 \end{bmatrix}$.

30. Suppose $A$ is a $3 \times 3$ matrix and the following information is available.
\[
A \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} = 1 \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}, \quad
A \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = 0 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad
A \begin{bmatrix} 1 \\ 4 \\ 3 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 4 \\ 3 \end{bmatrix}
\]
Find $A \begin{bmatrix} 3 \\ 4 \\ 3 \end{bmatrix}$.

31. Suppose $A$ is a $3 \times 3$ matrix and the following information is available.
\[
A \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = 2 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad
A \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = 1 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad
A \begin{bmatrix} 3 \\ 5 \\ 4 \end{bmatrix} = 3 \begin{bmatrix} 3 \\ 5 \\ 4 \end{bmatrix}
\]
Find $A \begin{bmatrix} 2 \\ 3 \\ 3 \end{bmatrix}$.
32. Let $V$ be a unit vector, which means $V^T V = 1$, and let $A = I - 2VV^T$. Show that $A$ has an eigenvalue equal to $-1$.

33. Suppose $A$ is an $n \times n$ matrix which is diagonally dominant. This means
\[ |a_{ii}| > \sum_{j \neq i} |a_{ij}| \]
Show that $A^{-1}$ must exist. Hint: You might try to show that $\det(A) \neq 0$.

34. Let $A$ be an $n \times n$ matrix. Then $A$ is self adjoint if $A^* = A$, where $A^*$ is the adjoint of $A$. Show the eigenvalues of a self adjoint matrix are all real. If the self adjoint matrix has all real entries, it is symmetric. Hint: Let $AX = \lambda X$, $X \neq 0$. Then
\[ \overline{X}^T A X = \lambda \overline{X}^T X \]
Now take the conjugate of both sides and use $A^* = A$.

35. Suppose $A$ is an $n \times n$ matrix consisting entirely of real entries but $a + ib$ is a complex eigenvalue having the eigenvector $X + iY$. Here $X$ and $Y$ are real vectors. Show that then $a - ib$ is also an eigenvalue with the eigenvector $X - iY$. Hint: You should remember that the conjugate of a product of complex numbers equals the product of the conjugates. Here $a + ib$ is a complex number whose conjugate equals $a - ib$.

36. Recall an $n \times n$ matrix is said to be skew symmetric if it has all real entries and if $A = -A^T$. Show that any nonzero eigenvalues must be of the form $ib$ where $i^2 = -1$. In words, the eigenvalues are either 0 or pure imaginary. Hint: Use one of the above problems on self adjoint matrices. Is $iA$ self adjoint?

37. Suppose $A$ is an $n \times n$ matrix and let $V$ be an eigenvector such that $AV = \lambda V$. Also suppose the characteristic polynomial of $A$ is
\[ \det(xI - A) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 \]
Explain why
\[ \left( A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I \right) V = 0 \]
If $A$ is diagonalizable, give a proof of the Cayley–Hamilton theorem based on this. Recall this theorem says $A$ satisfies its characteristic equation,
\[ A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I = 0 \]

38. Suppose the characteristic polynomial of an $n \times n$ matrix $A$ is $1 - x^n$. Find $A^{mn}$ where $m$ is an integer.
39. The following is a Markov (migration) matrix for three locations
\[
\begin{bmatrix} \frac{7}{10} & \frac{1}{9} & \frac{1}{5} \\ \frac{1}{10} & \frac{7}{9} & \frac{2}{5} \\ \frac{1}{5} & \frac{1}{9} & \frac{2}{5} \end{bmatrix}
\]
The total number of individuals in the migration process is 256. After a long time, how many are in each location?

40. The following is a Markov (migration) matrix for three locations
\[
\begin{bmatrix} \frac{1}{5} & \frac{1}{5} & \frac{2}{5} \\ \frac{2}{5} & \frac{2}{5} & \frac{1}{5} \\ \frac{2}{5} & \frac{2}{5} & \frac{2}{5} \end{bmatrix}
\]
The total number of individuals in the migration process is 500. After a long time, how many are in each location?

41. The following is a Markov (migration) matrix for three locations
\[
\begin{bmatrix} \frac{3}{10} & \frac{3}{8} & \frac{1}{3} \\ \frac{1}{10} & \frac{3}{8} & \frac{1}{3} \\ \frac{3}{5} & \frac{1}{4} & \frac{1}{3} \end{bmatrix}
\]
The total number of individuals in the migration process is 480. After a long time, how many are in each location?

42. The following is a Markov (migration) matrix for three locations
\[
\begin{bmatrix} \frac{3}{10} & \frac{1}{3} & \frac{1}{5} \\ \frac{3}{10} & \frac{1}{3} & \frac{7}{10} \\ \frac{2}{5} & \frac{1}{3} & \frac{1}{10} \end{bmatrix}
\]
The total number of individuals in the migration process is 1155. After a long time, how many are in each location?

43. The following is a Markov (migration) matrix for three locations
\[
\begin{bmatrix} \frac{2}{5} & \frac{1}{10} & \frac{1}{8} \\ \frac{3}{10} & \frac{2}{5} & \frac{5}{8} \\ \frac{3}{10} & \frac{1}{2} & \frac{1}{4} \end{bmatrix}
\]
The total number of individuals in the migration process is 704. After a long time, how many are in each location?
44. You own a trailer rental company in a large city and you have four locations, one in the South East, one in the North East, one in the North West, and one in the South West. Denote these locations by SE, NE, NW, and SW respectively. Suppose that the following table is observed to take place.

            SE      NE      NW      SW
    SE     1/3     1/10    1/10    1/5
    NE     1/3     7/10    1/5     1/10
    NW     2/9     1/10    3/5     1/5
    SW     1/9     1/10    1/10    1/2

In this table, the probability that a trailer starting at NE ends in NW is 1/10, the probability that a trailer starting at SW ends in NW is 1/5, and so forth. Approximately how many will you have in each location after a long time if the total number of trailers is 413?

45. You own a trailer rental company in a large city and you have four locations, one in the South East, one in the North East, one in the North West, and one in the South West. Denote these locations by SE, NE, NW, and SW respectively. Suppose that the following table is observed to take place.

            SE      NE      NW      SW
    SE     1/7     1/4     1/10    1/5
    NE     2/7     1/4     1/5     1/10
    NW     1/7     1/4     3/5     1/5
    SW     3/7     1/4     1/10    1/2

In this table, the probability that a trailer starting at NE ends in NW is 1/4, the probability that a trailer starting at SW ends in NW is 1/5, and so forth. Approximately how many will you have in each location after a long time if the total number of trailers is 1469?
46. The following table describes the transition probabilities between the states rainy, partly cloudy and sunny. The symbol p.c. indicates partly cloudy. Thus if it starts off p.c. it ends up sunny the next day with probability 1/3. If it starts off sunny, it ends up sunny the next day with probability 2/5 and so forth.

              rains   sunny   p.c.
    rains      1/5     1/5    1/3
    sunny      1/5     2/5    1/3
    p.c.       3/5     2/5    1/3

Given this information, what are the probabilities that a given day is rainy, sunny, or partly cloudy?

47. The following table describes the transition probabilities between the states rainy, partly cloudy and sunny. The symbol p.c. indicates partly cloudy. Thus if it starts off p.c. it ends up sunny the next day with probability 4/9. If it starts off sunny, it ends up sunny the next day with probability 2/5 and so forth.

              rains   sunny   p.c.
    rains      1/5     1/5    1/3
    sunny      1/10    2/5    4/9
    p.c.       7/10    2/5    2/9

Given this information, what are the probabilities that a given day is rainy, sunny, or partly cloudy?
48. You own a trailer rental company in a large city and you have four locations, one in the South East, one in the North East, one in the North West, and one in the South West. Denote these locations by SE, NE, NW, and SW respectively. Suppose that the following table is observed to take place.

            SE      NE      NW      SW
    SE     5/11    1/10    1/10    1/5
    NE     1/11    7/10    1/5     1/10
    NW     2/11    1/10    3/5     1/5
    SW     3/11    1/10    1/10    1/2

In this table, the probability that a trailer starting at NE ends in NW is 1/10, the probability that a trailer starting at SW ends in NW is 1/5, and so forth. Approximately how many will you have in each location after a long time if the total number of trailers is 407?

49. The University of Poohbah offers three degree programs, scouting education (SE), dance appreciation (DA), and engineering (E). It has been determined that the probabilities of transferring from one program to another are as in the following table.

           SE    DA    E
    SE     .8    .1    .3
    DA     .1    .7    .5
    E      .1    .2    .2

where the number indicates the probability of transferring from the top program to the program on the left. Thus the probability of going from DA to E is .2. Find the probability that a student is enrolled in the various programs.
50. In the city of Nabal, there are three political persuasions, republicans (R), democrats (D), and neither one (N). The following table shows the transition probabilities between the political parties, the top row being the initial political party and the side row being the political affiliation the following year.

           R     D     N
    R     1/5   1/6   2/7
    D     1/5   1/3   4/7
    N     3/5   1/2   1/7

Find the probabilities that a person will be identified with the various political persuasions. Which party will end up being most important?

51. The following table describes the transition probabilities between the states rainy, partly cloudy and sunny. The symbol p.c. indicates partly cloudy. Thus if it starts off p.c. it ends up sunny the next day with probability 1/3. If it starts off sunny, it ends up sunny the next day with probability 2/7 and so forth.

              rains   sunny   p.c.
    rains      1/5     2/7    5/9
    sunny      1/5     2/7    1/3
    p.c.       3/5     3/7    1/9

Given this information, what are the probabilities that a given day is rainy, sunny, or partly cloudy?
6. $\mathbb{R}^n$

Outcomes

A. Find the position vector of a point in $\mathbb{R}^n$.
B. Understand vector addition and scalar multiplication, both algebraically and graphically.
C. Find the length of a vector and the distance between two points in $\mathbb{R}^n$.
D. Find the corresponding unit vector to a vector in $\mathbb{R}^n$.
E. Find the vector and parametric equations of a line.
F. Compute the dot product of vectors, and use this to compute vector projections.
G. Compute the cross product and box product of vectors in $\mathbb{R}^3$.

6.1 $\mathbb{R}^n$

The notation $\mathbb{R}^n$ refers to the collection of ordered lists of $n$ real numbers, that is
\[ \mathbb{R}^n = \{ (x_1, \ldots, x_n) : x_j \in \mathbb{R} \text{ for } j = 1, \ldots, n \} \]
In this chapter, we take a closer look at the geometric significance of vectors in $\mathbb{R}^n$. First, we will consider what $\mathbb{R}^n$ looks like in more detail.
You may be familiar with the geometric significance of $\mathbb{R}^n$ for $n \leq 3$. Recall that the set
\[ \{ (0, \ldots, 0, t, 0, \ldots, 0) : t \in \mathbb{R} \} \]
for $t$ in the $i^{th}$ slot is called the $i^{th}$ coordinate axis. The point given by $(0, \ldots, 0)$ is called the origin.
Now, consider the case when $n = 1$. Then from the definition we can identify $\mathbb{R}$ with points in $\mathbb{R}^1$ as follows:
\[ \mathbb{R} = \mathbb{R}^1 = \{ (x_1) : x_1 \in \mathbb{R} \} \]
Hence, $\mathbb{R}$ is defined as the set of all real numbers and geometrically, we can describe this as all the points on a line.
Now suppose $n = 2$. Then, from the definition,
\[ \mathbb{R}^2 = \{ (x_1, x_2) : x_j \in \mathbb{R} \text{ for } j = 1, 2 \} \]
Hence, every element in $\mathbb{R}^2$ is identified by two components, $x_1$ and $x_2$, in the usual manner. Consider the familiar coordinate plane, with an $x$ axis and a $y$ axis. Any point within this coordinate plane is identified by where it is located along the $x$ axis, and also where it is located along the $y$ axis. Consider as an example a diagram showing the two points $P = (2, 6)$ and $Q = (-8, 3)$.
Notice how you can identify the point $P$ shown in the plane with the ordered pair $(2, 6)$. Starting at the origin, you go to the right a distance of 2 and then up a distance of 6. Similarly, you can identify the point $Q$ in the plane with the ordered pair $(-8, 3)$. Go to the left from the origin a distance of 8 and then up a distance of 3. Notice that every ordered pair determines a unique point in the plane.
Pick any point $(x, y)$ in the plane. Then, draw a vertical line through the point, so that it crosses the $x$ axis at the point $x$. Similarly, draw a horizontal line through the point so that it crosses the $y$ axis at $y$. Notice that these lines will only intersect at one point, and that point is $(x, y)$. In other words we can say that the coordinates $x$ and $y$ determine a unique point in the plane. In short, points in the plane can be identified with ordered pairs similar to the way that points on the real line are identified with real numbers.
Now suppose $n = 3$. You may have previously encountered the 3-dimensional coordinate system, given by
\[ \mathbb{R}^3 = \{ (x_1, x_2, x_3) : x_j \in \mathbb{R} \text{ for } j = 1, 2, 3 \} \]
Points in $\mathbb{R}^3$ will be determined by three coordinates, $(x, y, z)$, which correspond to the $x$, $y$, and $z$ axes. We can think as above that the first two coordinates determine a point in a plane. The third component determines the height above or below the plane, depending on whether this number is positive or negative, and all together this determines a point in space. Thus, $(1, 4, -5)$ would mean to determine the point in the plane that goes with $(1, 4)$ and then to go below this plane a distance of 5 to obtain a unique point in space. You see that the ordered triples correspond to points in space just as the ordered pairs correspond to points in a plane and single real numbers correspond to points on a line.
The idea behind the more general $\mathbb{R}^n$ is that we can extend these ideas beyond $n = 3$. For example, suppose we were interested in the location of two objects in space. You would need three coordinates to describe where the first object is and another three coordinates to describe where the other object is located. Therefore, you would need to be considering a point in $\mathbb{R}^6$. If the two objects moved around, you would need a time coordinate as well, so possibly another component. As another example, consider a hot object which is cooling and suppose you want the temperature of this object. How many coordinates would be needed? You would need one for the temperature, three for the position of the point in the object and one more for the time. Thus you would need to be considering $\mathbb{R}^5$. Many other examples can be given, and in some applications $n$ is very large.
In the next section, we look at these general vectors in $\mathbb{R}^n$.
6.2 Geometric Meaning of Vectors

In this chapter, we focus on the geometric meaning of vectors. In previous sections, we have written vectors as columns, or $n \times 1$ matrices, but for convenience in this chapter we may often write vectors as row vectors, or $1 \times n$ matrices. These are of course equivalent and should be clear given the context. Consider for example the vector given by $\begin{bmatrix} 2 \\ 3 \end{bmatrix}$ in $\mathbb{R}^2$, which we can also write as $\begin{bmatrix} 2 & 3 \end{bmatrix}$. The important feature of either choice of notation is the order of the numbers in the list.
We will now define what we mean by vectors in $\mathbb{R}^n$. While we consider $\mathbb{R}^n$ for all $n$, we will largely focus on $n = 2, 3$ in this section.
Definition 6.1: Vectors in $\mathbb{R}^n$
Let $\mathbb{R}^n = \{ (x_1, \ldots, x_n) : x_j \in \mathbb{R} \text{ for } j = 1, \ldots, n \}$. Then,
\[ \vec{x} = [x_1 \cdots x_n] \]
is called a vector. The numbers $x_j$ are called the components of $\vec{x}$.

Notice that two vectors $\vec{u} = [u_1 \cdots u_n]$ and $\vec{v} = [v_1 \cdots v_n]$ are equal if and only if all corresponding components are equal. Precisely,
\[ \vec{u} = \vec{v} \text{ if and only if } u_j = v_j \text{ for all } j = 1, \ldots, n \]
Thus $\begin{bmatrix} 1 & 2 & 4 \end{bmatrix} \in \mathbb{R}^3$ and $\begin{bmatrix} 2 & 1 & 4 \end{bmatrix} \in \mathbb{R}^3$, but $\begin{bmatrix} 1 & 2 & 4 \end{bmatrix} \neq \begin{bmatrix} 2 & 1 & 4 \end{bmatrix}$ because, even though the same numbers are involved, the order of the numbers is different.
For the specific case of $\mathbb{R}^3$, there are three special vectors which we often use. They are given by
\[
\vec{i} = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}, \quad
\vec{j} = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}, \quad
\vec{k} = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}
\]
We can write any vector $\vec{u} = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}$ as a linear combination of these vectors, written as $\vec{u} = u_1\vec{i} + u_2\vec{j} + u_3\vec{k}$. This notation will be used throughout this chapter.
Now, there is a close connection between points in $\mathbb{R}^n$ and vectors in $\mathbb{R}^n$. Consider the following definition.
Definition 6.2: The Position Vector
Let $P = (p_1, \ldots, p_n)$ be the coordinates of a point in $\mathbb{R}^n$. Then the vector $\vec{p}$ with its tail at $0 = (0, \ldots, 0)$ and its tip at $P$ is called the position vector of the point $P$. We write
\[ \vec{p} = [p_1 \cdots p_n] \]
For this reason we may write both $P = (p_1, \ldots, p_n) \in \mathbb{R}^n$ and $\vec{p} = [p_1 \cdots p_n] \in \mathbb{R}^n$.

This definition is illustrated (for the special case of $\mathbb{R}^3$) by a picture of the arrow drawn from the origin to the point $P = (p_1, p_2, p_3)$, labeled $\vec{p} = [p_1 \ p_2 \ p_3]$.
Thus every point $P$ in $\mathbb{R}^n$ determines its position vector $\vec{p}$. Conversely, every such position vector $\vec{p}$ which has its tail at $0$ and point at $P$ determines the point $P$ of $\mathbb{R}^n$.
Now suppose we are given two points $P, Q$ whose coordinates are $(p_1, \ldots, p_n)$ and $(q_1, \ldots, q_n)$ respectively. We can also determine the position vector from $P$ to $Q$, defined as follows.
\[ \overrightarrow{PQ} = [q_1 - p_1 \ \cdots \ q_n - p_n] = \vec{q} - \vec{p} \]
Now, imagine taking a vector in $\mathbb{R}^n$ and moving it around, always keeping it pointing in the same direction. After moving it around, it is regarded as the same vector! In other words, each of the arrows obtained in this way is regarded as the same vector. This is because two vectors are equal if and only if all their corresponding components are equal. Each of these arrows is a position vector of the form $\overrightarrow{AB} = \vec{b} - \vec{a}$, and all of them are the same as $\vec{p}$.
You can think of the components of a vector as directions for obtaining the vector. Consider $n = 3$. Draw a vector with its tail at the point $(0, 0, 0)$ and its tip at the point $(a, b, c)$. This vector is obtained by starting at $(0, 0, 0)$, moving parallel to the $x$ axis to $(a, 0, 0)$, and then from here, moving parallel to the $y$ axis to $(a, b, 0)$, and finally parallel to the $z$ axis to $(a, b, c)$. Observe that the same vector would result if you began at the point $(d, e, f)$, moved parallel to the $x$ axis to $(d + a, e, f)$, then parallel to the $y$ axis to $(d + a, e + b, f)$, and finally parallel to the $z$ axis to $(d + a, e + b, f + c)$. Here, the vector would have its tail sitting at the point determined by $A = (d, e, f)$ and its point at $B = (d + a, e + b, f + c)$. It is the same vector because it will point in the same direction and have the same length. It is like you took an actual arrow and moved it from one location to another, keeping it pointing the same direction.
In the next section, we discuss two important operations on vectors in $\mathbb{R}^n$.

6.3 Algebra in $\mathbb{R}^n$

Addition and scalar multiplication are two important algebraic operations done with vectors. Notice that these operations apply to vectors in $\mathbb{R}^n$ for any value of $n$. We will explore these operations in more detail in the following sections.

6.3.1. Addition of Vectors in $\mathbb{R}^n$

Addition of vectors in $\mathbb{R}^n$ is defined as follows.
Definition 6.3: Addition of Vectors in $\mathbb{R}^n$
If $\vec{u} = [u_1 \cdots u_n]$, $\vec{v} = [v_1 \cdots v_n] \in \mathbb{R}^n$, then $\vec{u} + \vec{v} \in \mathbb{R}^n$ and is defined by
\[
\vec{u} + \vec{v} = [u_1 \cdots u_n] + [v_1 \cdots v_n] = [u_1 + v_1 \ \cdots \ u_n + v_n]
\]

Hence to add vectors, we simply add corresponding components exactly as we did for matrices. Therefore, in order to add vectors, they must be the same size.
Similarly to matrices, addition of vectors satisfies some important properties. These are outlined in the following theorem.
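For instance (a small sketch, not from the original text), componentwise addition is exactly what array addition performs:

    import numpy as np

    u = np.array([1.0, 2.0, 4.0])
    v = np.array([2.0, 1.0, 4.0])
    print(u + v)      # [3. 3. 8.] -- corresponding components added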
Theorem 6.4: Properties of Vector Addition
The following properties hold for vectors $\vec{u}, \vec{v}, \vec{w} \in \mathbb{R}^n$.
The Commutative Law of Addition
\[ \vec{u} + \vec{v} = \vec{v} + \vec{u} \]
The Associative Law of Addition
\[ (\vec{u} + \vec{v}) + \vec{w} = \vec{u} + (\vec{v} + \vec{w}) \]
The Existence of an Additive Identity
\[ \vec{u} + \vec{0} = \vec{u} \tag{6.1} \]
The Existence of an Additive Inverse
\[ \vec{u} + (-\vec{u}) = \vec{0} \]

The proof of this theorem follows from the similar theorem given for matrices in Proposition 2.7. The additive identity shown in equation 6.1 is also called the zero vector, the vector in which all components are equal to 0. Further, $-\vec{u}$ is simply the vector with all components having the same value as those of $\vec{u}$ but opposite sign; this is just $(-1)\vec{u}$. This will be made more explicit in the next section when we explore scalar multiplication of vectors.
Subtraction is defined as $\vec{u} - \vec{v} = \vec{u} + (-\vec{v})$.
6.3.2. Scalar Multiplication of Vectors in $\mathbb{R}^n$

Scalar multiplication of vectors in $\mathbb{R}^n$ is defined as follows. Notice that, just like addition, this definition is the same as the corresponding definition for matrices.

Definition 6.5: Scalar Multiplication of Vectors in $\mathbb{R}^n$
If $\vec{u} \in \mathbb{R}^n$ and $k \in \mathbb{R}$ is a scalar, then $k\vec{u} \in \mathbb{R}^n$ is defined by
\[ k\vec{u} = k[u_1 \cdots u_n] = [ku_1 \cdots ku_n] \]

Just as with addition, scalar multiplication of vectors satisfies several important properties. These are outlined in the following theorem.
Theorem 6.6: Properties of Scalar Multiplication
The following properties hold for vectors $\vec{u}, \vec{v} \in \mathbb{R}^n$ and $k, p$ scalars.
The Distributive Law over Vector Addition
\[ k(\vec{u} + \vec{v}) = k\vec{u} + k\vec{v} \]
The Distributive Law over Scalar Addition
\[ (k + p)\vec{u} = k\vec{u} + p\vec{u} \]
The Associative Law for Scalar Multiplication
\[ k(p\vec{u}) = (kp)\vec{u} \]
Rule for Multiplication by 1
\[ 1\vec{u} = \vec{u} \]

Proof: Again the verification of these properties follows from the corresponding properties for scalar multiplication of matrices, given in Proposition 2.10.
As a refresher we can show that
\[ k(\vec{u} + \vec{v}) = k\vec{u} + k\vec{v} \]
Note that:
\[
\begin{aligned}
k(\vec{u} + \vec{v}) &= k[u_1 + v_1 \ \cdots \ u_n + v_n] \\
&= [k(u_1 + v_1) \ \cdots \ k(u_n + v_n)] \\
&= [ku_1 + kv_1 \ \cdots \ ku_n + kv_n] \\
&= [ku_1 \cdots ku_n] + [kv_1 \cdots kv_n] \\
&= k\vec{u} + k\vec{v}
\end{aligned}
\]
$\blacksquare$

In the next section, we look at the geometric meaning of the operation of vector addition.
6.4 Geometric Meaning Of
Vector Addition
Recall that an element of R
n
is an ordered list of numbers. For the specic case of n = 2, 3
this can be used to determine a point in two or three dimensional space. This point is
specied relative to some coordinate axes.
197
Consider the case n = 3. Recall that taking a vector and moving it around without chang-
ing its length or direction does not change the vector. This is important in the geometric
representation of vector addition.
Suppose we have two vectors, u and v in R
3
. Each of these can be drawn geometrically
by placing the tail of each vector at 0 and its point at (u
1
, u
2
, u
3
) and (v
1
, v
2
, v
3
) respectively.
Suppose we slide the vector v so that its tail sits at the point of u. We know that this does
not change the vector v. Now, draw a new vector from the tail of u to the point of v. This
vector is u +v.
The geometric signicance of vector addition in R
n
for any n is given in the following
denition.
Denition 6.7: Geometry of Vector Addition
Let u and v be two vectors. Slide v so that the tail of v is on the point of u. Then
draw the arrow which goes from the tail of u to the point of v. This arrow represents
the vector u +v.

u
u +v
v
This denition is illustrated in the following picture in which u+v is shown for the special
case n = 3.

u
`
u +v

v
x
z
y
Notice the parallelogram created by u and v in the above diagram. Then u + v is the
directed diagonal of the parallelogram determined by the two vectors u and v.
When you have a vector v, its additive inverse v will be the vector which has the same
magnitude as v but the opposite direction. When one writes u v, the meaning is u +(v)
as with real numbers. The following example illustrates these denitions and conventions.
Example 6.8: Graphing Vector Addition

Consider the following picture of vectors $\vec{u}$ and $\vec{v}$.

[Figure: $\vec{u}$ and $\vec{v}$.]

Sketch a picture of $\vec{u} + \vec{v}$ and $\vec{u} - \vec{v}$.

Solution. We will first sketch $\vec{u} + \vec{v}$. Begin by drawing $\vec{u}$, and then at the point of $\vec{u}$, place the tail of $\vec{v}$ as shown. Then $\vec{u} + \vec{v}$ is the vector which results from drawing a vector from the tail of $\vec{u}$ to the tip of $\vec{v}$.

[Figure: $\vec{u}$, $\vec{v}$, and the resulting $\vec{u} + \vec{v}$.]

Next consider $\vec{u} - \vec{v}$. This means $\vec{u} + (-\vec{v})$. From the above geometric description of vector addition, $-\vec{v}$ is the vector which has the same length but points in the opposite direction to $\vec{v}$. Here is a picture.

[Figure: $\vec{u}$, $-\vec{v}$, and the resulting $\vec{u} + (-\vec{v})$.]
We have now considered vector addition both algebraically and geometrically. Before exploring the geometric meaning of scalar multiplication of vectors, we consider what is meant by the length of a vector. This is explored in the next section.

6.5 Length of a Vector

In this section, we explore what is meant by the length of a vector in $\mathbb{R}^n$. We develop this concept by first looking at the distance between two points in $\mathbb{R}^n$.

First, we will consider the concept of distance for $\mathbb{R}$, that is, for points in $\mathbb{R}^1$. Here, the distance between two points $P$ and $Q$ is given by the absolute value of their difference. Thus $|P - Q|$ is equal to the distance between these two points in $\mathbb{R}$, given by
$$|P - Q| = \sqrt{(P - Q)^2} \qquad (6.2)$$
Consider the following picture in the case that $n = 2$.

[Figure: points $P = (p_1, p_2)$ and $Q = (q_1, q_2)$, with the corner point $(q_1, p_2)$ of the right triangle.]

There are two points $P = (p_1, p_2)$ and $Q = (q_1, q_2)$ in the plane. The distance between these points is shown in the picture as a solid line. Notice that this line is the hypotenuse of a right triangle which is half of the rectangle shown in dotted lines. We want to find the length of this hypotenuse in order to find the distance between the two points. Note that the lengths of the sides of this triangle are $|q_1 - p_1|$ and $|q_2 - p_2|$ by our above discussion. Therefore, the Pythagorean Theorem implies the length of the hypotenuse (and thus the distance between $P$ and $Q$) equals
$$\left( |q_1 - p_1|^2 + |q_2 - p_2|^2 \right)^{1/2} = \left( (q_1 - p_1)^2 + (q_2 - p_2)^2 \right)^{1/2} \qquad (6.3)$$
Now suppose $n = 3$ and let $P = (p_1, p_2, p_3)$ and $Q = (q_1, q_2, q_3)$ be two points in $\mathbb{R}^3$. Consider the following picture, in which the solid line joins the two points and a dotted line joins the points $(p_1, p_2, p_3)$ and $(q_1, q_2, p_3)$.

[Figure: points $P = (p_1, p_2, p_3)$, $(q_1, p_2, p_3)$, $(q_1, q_2, p_3)$, and $Q = (q_1, q_2, q_3)$.]

Here, we need to use the Pythagorean Theorem twice in order to find the length of the solid line. First, by the Pythagorean Theorem, the length of the dotted line joining $(p_1, p_2, p_3)$ and $(q_1, q_2, p_3)$ equals
$$\left( (q_1 - p_1)^2 + (q_2 - p_2)^2 \right)^{1/2}$$
while the length of the line joining $(q_1, q_2, p_3)$ to $(q_1, q_2, q_3)$ is just $|q_3 - p_3|$. Therefore, by the Pythagorean Theorem again, the length of the line joining the points $P = (p_1, p_2, p_3)$ and $Q = (q_1, q_2, q_3)$ equals
$$\left( \left( \left( (q_1 - p_1)^2 + (q_2 - p_2)^2 \right)^{1/2} \right)^2 + (q_3 - p_3)^2 \right)^{1/2} = \left( (q_1 - p_1)^2 + (q_2 - p_2)^2 + (q_3 - p_3)^2 \right)^{1/2} \qquad (6.4)$$
We will now formally define the distance between two points.

Definition 6.9: Distance Between Points

Let $P = (p_1, \ldots, p_n)$ and $Q = (q_1, \ldots, q_n)$ be two points in $\mathbb{R}^n$. Then $|P - Q|$ indicates the distance between these points and is defined as
$$\text{distance between } P \text{ and } Q = |P - Q| = \left( \sum_{k=1}^{n} |p_k - q_k|^2 \right)^{1/2}$$
This is called the distance formula.

From the above discussion, you can see that Definition 6.9 holds for the special cases $n = 1, 2, 3$, as in Equations 6.2, 6.3, and 6.4. You could continue this discussion, including the use of the Pythagorean Theorem, to verify that it holds for greater $n$ as well.

In the following example, we use Definition 6.9 to find the distance between two points in $\mathbb{R}^4$.
Example 6.10: Distance Between Points

Find the distance between the points $P$ and $Q$ in $\mathbb{R}^4$, where $P$ and $Q$ are given by
$$P = (1, 2, -4, 6) \quad \text{and} \quad Q = (2, 3, -1, 0)$$

Solution. We will use the formula given in Definition 6.9 to find the distance between $P$ and $Q$. Use the distance formula and write
$$|P - Q|^2 = (1 - 2)^2 + (2 - 3)^2 + (-4 - (-1))^2 + (6 - 0)^2 = 47$$
Therefore, $|P - Q| = \sqrt{47}$.
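For a quick numerical check of this computation, here is a minimal Python sketch (our own illustration, using the points of Example 6.10):

```python
import numpy as np

P = np.array([1.0, 2.0, -4.0, 6.0])
Q = np.array([2.0, 3.0, -1.0, 0.0])

# Distance formula from Definition 6.9: square root of the
# sum of squared componentwise differences
dist = np.sqrt(np.sum((P - Q) ** 2))
print(dist, np.sqrt(47))        # both print 6.8556...

# np.linalg.norm computes the same quantity
print(np.linalg.norm(P - Q))
```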
There are certain properties of the distance between points which are important in our study. These are outlined in the following theorem.

Theorem 6.11: Properties of Distance

Let $P$ and $Q$ be points in $\mathbb{R}^n$, and let the distance between them, $|P - Q|$, be given as in Definition 6.9. Then the following properties hold.
$$|P - Q| = |Q - P|$$
$$|P - Q| \geq 0, \text{ and equals } 0 \text{ exactly when } P = Q.$$
There are many applications of the concept of distance. For instance, given two points, we can ask what collection of points are all the same distance from each of the given points. This is explored in the following example.

Example 6.12: The Plane Between Two Points

Describe the points in $\mathbb{R}^3$ which are at the same distance from $(1, 2, 3)$ and $(0, 1, 2)$.

Solution. Let $P = (p_1, p_2, p_3)$ be such a point. Therefore, $P$ is the same distance from $(1, 2, 3)$ and $(0, 1, 2)$. Then by Definition 6.9,
$$\sqrt{(p_1 - 1)^2 + (p_2 - 2)^2 + (p_3 - 3)^2} = \sqrt{(p_1 - 0)^2 + (p_2 - 1)^2 + (p_3 - 2)^2}$$
Squaring both sides, we obtain
$$(p_1 - 1)^2 + (p_2 - 2)^2 + (p_3 - 3)^2 = p_1^2 + (p_2 - 1)^2 + (p_3 - 2)^2$$
and so
$$p_1^2 - 2p_1 + 14 + p_2^2 - 4p_2 + p_3^2 - 6p_3 = p_1^2 + p_2^2 - 2p_2 + 5 + p_3^2 - 4p_3$$
Simplifying, this becomes
$$-2p_1 + 14 - 4p_2 - 6p_3 = -2p_2 + 5 - 4p_3$$
which can be written as
$$2p_1 + 2p_2 + 2p_3 = 9 \qquad (6.5)$$
Therefore, the points $P = (p_1, p_2, p_3)$ which are the same distance from each of the given points form a plane whose equation is given by 6.5.
We can now use our understanding of the distance between two points to define what is meant by the length of a vector. Consider the following definition.

Definition 6.13: Length of a Vector

Let $\vec{u} = \begin{bmatrix} u_1 & \cdots & u_n \end{bmatrix}$ be a vector in $\mathbb{R}^n$. Then the length of $\vec{u}$ is given by
$$|\vec{u}| = \sqrt{u_1^2 + \cdots + u_n^2}$$

This definition corresponds to Definition 6.9, if you consider the vector $\vec{u}$ to have its tail at the point $(0, \ldots, 0)$ and its tip at the point $(u_1, \ldots, u_n)$. Then the length of $\vec{u}$ is equal to the distance between these two points.

Consider Example 6.10. By Definition 6.13, we could also find the distance between $P$ and $Q$ as the length of the vector connecting them. Hence, if we were to draw a vector $\vec{PQ}$ with its tail at $P$ and its point at $Q$, this vector would have length equal to $\sqrt{47}$.

We conclude this section with a new definition for the special case of vectors of length 1.
Definition 6.14: Unit Vector

Let $\vec{u}$ be a vector in $\mathbb{R}^n$. Then we call $\vec{u}$ a unit vector if it has length 1, that is, if
$$|\vec{u}| = 1$$

Let $\vec{v}$ be a vector in $\mathbb{R}^n$. Then the vector $\vec{u}$ which has the same direction as $\vec{v}$ but length equal to 1 is the corresponding unit vector of $\vec{v}$. This vector is given by
$$\vec{u} = \frac{1}{|\vec{v}|} \vec{v}$$

Consider the following example.

Example 6.15: Finding a Unit Vector

Let $\vec{v}$ be given by
$$\vec{v} = \begin{bmatrix} 1 & -3 & 4 \end{bmatrix}$$
Find the unit vector $\vec{u}$ which has the same direction as $\vec{v}$.

Solution. We will use Definition 6.14 to solve this. Therefore, we need to find the length of $\vec{v}$, which, by Definition 6.13, is given by
$$|\vec{v}| = \sqrt{v_1^2 + v_2^2 + v_3^2}$$
Using the corresponding values, we find that
$$|\vec{v}| = \sqrt{1^2 + (-3)^2 + 4^2} = \sqrt{1 + 9 + 16} = \sqrt{26}$$
In order to find $\vec{u}$, we divide $\vec{v}$ by $\sqrt{26}$. The result is
$$\vec{u} = \frac{1}{|\vec{v}|} \vec{v} = \frac{1}{\sqrt{26}} \begin{bmatrix} 1 & -3 & 4 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{26}} & -\frac{3}{\sqrt{26}} & \frac{4}{\sqrt{26}} \end{bmatrix}$$
You can verify using Definition 6.13 that $|\vec{u}| = 1$.
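Normalizing a vector in this way is a one-liner numerically. A minimal Python sketch (our own illustration, using the vector from Example 6.15):

```python
import numpy as np

v = np.array([1.0, -3.0, 4.0])

# Length of v, as in Definition 6.13
length = np.linalg.norm(v)
print(length, np.sqrt(26))   # both 5.0990...

# The corresponding unit vector, as in Definition 6.14
u = v / length
print(u)                     # [ 0.1961 -0.5883  0.7845]
print(np.linalg.norm(u))     # 1.0
```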
In the next section, we turn our attention to the geometric meaning of scalar multiplication.

6.6 Geometric Meaning of Scalar Multiplication

Recall that the point $P = (p_1, p_2, p_3)$ determines a vector $\vec{p}$ from $\vec{0}$ to $P$. The length of $\vec{p}$, denoted $|\vec{p}|$, is equal to $\sqrt{p_1^2 + p_2^2 + p_3^2}$ by Definition 6.9.

Now suppose we have a vector $\vec{u} = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}$ and we multiply $\vec{u}$ by a scalar $k$. By Definition 6.5, $k\vec{u} = \begin{bmatrix} ku_1 & ku_2 & ku_3 \end{bmatrix}$. Then, by using Definition 6.9, the length of this vector is given by
$$\sqrt{(ku_1)^2 + (ku_2)^2 + (ku_3)^2} = |k| \sqrt{u_1^2 + u_2^2 + u_3^2}$$
Thus the following holds:
$$|k\vec{u}| = |k| |\vec{u}|$$
In other words, multiplication by a scalar magnifies or shrinks the length of the vector by a factor of $|k|$. If $|k| > 1$, the length of the resulting vector will be magnified. If $|k| < 1$, the length of the resulting vector will shrink. Remember that by the definition of the absolute value, $|k| \geq 0$.

What about the direction? Draw a picture of $\vec{u}$ and $k\vec{u}$ where $k$ is negative. Notice that the direction can either reverse, if $k < 0$, or remain preserved, if $k > 0$.

Consider the following example.
Example 6.16: Graphing Scalar Multiplication

Consider the vectors $\vec{u}$ and $\vec{v}$ drawn below.

[Figure: $\vec{u}$ and $\vec{v}$.]

Draw $-\vec{u}$, $2\vec{v}$, and $-\frac{1}{2}\vec{v}$.

Solution. In order to find $-\vec{u}$, we preserve the length of $\vec{u}$ and simply reverse the direction. For $2\vec{v}$, we double the length of $\vec{v}$ while preserving the direction. $-\frac{1}{2}\vec{v}$ is found by taking half the length of $\vec{v}$ and reversing the direction. These vectors are shown in the following diagram.

[Figure: $-\vec{u}$, $2\vec{v}$, and $-\frac{1}{2}\vec{v}$.]
Now that we have studied both vector addition and scalar multiplication, we can combine the two operations. Recall Definition 1.31 of linear combinations of column matrices. We can apply this definition to vectors in $\mathbb{R}^n$: a linear combination of vectors in $\mathbb{R}^n$ is a sum of vectors multiplied by scalars.

In the following example, we examine the geometric meaning of this concept.

Example 6.17: Graphing a Linear Combination of Vectors

Consider the following picture of the vectors $\vec{u}$ and $\vec{v}$.

[Figure: $\vec{u}$ and $\vec{v}$.]

Sketch a picture of $\vec{u} + 2\vec{v}$ and $\vec{u} - \frac{1}{2}\vec{v}$.

Solution. The two vectors are shown below.

[Figure: $\vec{u} + 2\vec{v}$ and $\vec{u} - \frac{1}{2}\vec{v}$.]

In the next section, we look at how we can use vectors to learn about lines in $\mathbb{R}^n$.
6.7 Parametric Lines

In $\mathbb{R}^n$, we can use the concepts of vectors and points to find equations for lines.

To begin, consider the case $n = 1$, so we have $\mathbb{R}^1 = \mathbb{R}$. The only line in $\mathbb{R}$ is the familiar number line, that is, $\mathbb{R}$ itself. Let $P$ and $Q$ be two different points in $\mathbb{R}$. Then, if $X$ is an arbitrary point in $\mathbb{R}$, consider
$$X = P + t(Q - P)$$
where $t \in \mathbb{R}$. You see that you can always solve the above equation for $t$, showing that every point on $\mathbb{R}$ is of this form.

Now consider the case $n = 2$, in other words $\mathbb{R}^2$. Does a similar formula hold here as in the case where $n = 1$? Let $P$ and $Q$ be two different points in $\mathbb{R}^2$ which are contained in a line $L$. Let $\vec{p}$ and $\vec{q}$ be the position vectors for the points $P$ and $Q$ respectively. Suppose that $X$ is an arbitrary point on $L$. Consider the following diagram.

[Figure: the line $L$ through $P$ and $Q$, with an arbitrary point $X$ on $L$.]

Our goal is to be able to define $X$ in terms of $P$ and $Q$.

Consider the vector $\vec{PQ} = \vec{q} - \vec{p}$, which has its tail at $P$ and point at $Q$. If we add $\vec{q} - \vec{p}$ to the position vector $\vec{p}$ for $P$, the sum would be a vector with its point at $Q$. In other words,
$$\vec{q} = \vec{p} + (\vec{q} - \vec{p})$$
Now suppose we were to add $t(\vec{q} - \vec{p})$ to $\vec{p}$, where $t$ is some scalar. You can see that by doing so, we could find a vector with its point at $X$. In other words, we can find $t$ such that
$$\vec{x} = \vec{p} + t(\vec{q} - \vec{p})$$
In fact, this equation determines the line $L$. Consider the following definition.

Definition 6.18: Vector Equation of a Line

Suppose a line $L$ in $\mathbb{R}^n$ contains the two different points $P$ and $Q$. Let $\vec{p}$ and $\vec{q}$ be the position vectors of these two points, respectively. Then $L$ is the collection of points $X$ which have the position vector $\vec{x}$ given by
$$\vec{x} = \vec{p} + t(\vec{q} - \vec{p})$$
where $t \in \mathbb{R}$. This is known as a vector equation for $L$.

Note that this definition agrees with the usual notion of a line in two dimensions, and so this is consistent with earlier concepts.
Proposition 6.19: Algebraic Description of a Straight Line

Let $\vec{a}, \vec{b} \in \mathbb{R}^n$ with $\vec{b} \neq \vec{0}$. Then $\vec{x} = \vec{a} + t\vec{b}$, $t \in \mathbb{R}$, is a line.

Proof: Let $\vec{x}_1, \vec{x}_2 \in \mathbb{R}^n$. Define $\vec{x}_1 = \vec{a}$ and let $\vec{x}_2 - \vec{x}_1 = \vec{b}$. Since $\vec{b} \neq \vec{0}$, it follows that $\vec{x}_2 \neq \vec{x}_1$. Then $\vec{a} + t\vec{b} = \vec{x}_1 + t(\vec{x}_2 - \vec{x}_1)$. It follows that $\vec{x} = \vec{a} + t\vec{b}$ is a line containing the two different points $X_1$ and $X_2$ whose position vectors are given by $\vec{x}_1$ and $\vec{x}_2$ respectively. $\blacksquare$

Definition 6.20: Direction Vector

The vector $\vec{b}$ in Proposition 6.19 is called a direction vector for the line.

We can use the above discussion to find the equation of a line when given two distinct points. Consider the following example.
Example 6.21: A Line From Two Points

Find a vector equation for the line through the points $P = (1, 2, 0)$ and $Q = (2, -4, 6)$.

Solution. We will use the definition of a line given above in Definition 6.18 to write this line in the form
$$\vec{x} = \vec{p} + t(\vec{q} - \vec{p})$$
Let $\vec{x} = \begin{bmatrix} x & y & z \end{bmatrix}$. Then we can find $\vec{p}$ and $\vec{q}$ by taking the position vectors of the points $P$ and $Q$ respectively. Then
$$\vec{x} = \vec{p} + t(\vec{q} - \vec{p})$$
can be written as
$$\begin{bmatrix} x & y & z \end{bmatrix} = \begin{bmatrix} 1 & 2 & 0 \end{bmatrix} + t \begin{bmatrix} 1 & -6 & 6 \end{bmatrix}, \quad t \in \mathbb{R}$$
Here, the direction vector $\begin{bmatrix} 1 & -6 & 6 \end{bmatrix}$ is obtained by $\vec{q} - \vec{p} = \begin{bmatrix} 2 & -4 & 6 \end{bmatrix} - \begin{bmatrix} 1 & 2 & 0 \end{bmatrix}$, as indicated above in Definition 6.18.
Notice that in the above example we said that we found "a" vector equation for the line, not "the" equation. The reason for this terminology is that there are infinitely many different vector equations for the same line. To see this, replace $t$ with another parameter, say $3s$. Then you obtain a different vector equation for the same line, because the same set of points is obtained.

In Example 6.21, the vector given by $\begin{bmatrix} 1 & -6 & 6 \end{bmatrix}$ is the direction vector defined in Definition 6.20. If we know the direction vector of a line, as well as a point on the line, we can find the vector equation.

Consider the following example.
Example 6.22: A Line From a Point and a Direction Vector

Find a vector equation for the line which contains the point $P = (1, 2, 0)$ and has direction vector $\vec{b} = \begin{bmatrix} 1 & 2 & 1 \end{bmatrix}$.

Solution. We will use Proposition 6.19 to write this line in the form $\vec{x} = \vec{a} + t\vec{b}$, $t \in \mathbb{R}$. We are given the direction vector $\vec{b}$. In order to find $\vec{a}$, we can use the position vector of the point $P$. This is given by $\begin{bmatrix} 1 & 2 & 0 \end{bmatrix}$. Letting $\vec{x} = \begin{bmatrix} x & y & z \end{bmatrix}$, the equation for the line is given by
$$\begin{bmatrix} x & y & z \end{bmatrix} = \begin{bmatrix} 1 & 2 & 0 \end{bmatrix} + t \begin{bmatrix} 1 & 2 & 1 \end{bmatrix}, \quad t \in \mathbb{R} \qquad (6.6)$$

We sometimes elect to write a line such as the one given in 6.6 in the form
$$\begin{aligned} x &= 1 + t \\ y &= 2 + 2t \\ z &= t \end{aligned} \qquad \text{where } t \in \mathbb{R} \qquad (6.7)$$
This set of equations gives the same information as 6.6, and is called the parametric equation of the line.

Consider the following definition.
Definition 6.23: Parametric Equation of a Line

Let $L$ be a line in $\mathbb{R}^3$ which has direction vector $\vec{b} = \begin{bmatrix} b_1 & b_2 & b_3 \end{bmatrix}$ and goes through the point $P = (p_1, p_2, p_3)$. Then, letting $t$ be a parameter, we can write $L$ as
$$\begin{aligned} x &= p_1 + t b_1 \\ y &= p_2 + t b_2 \\ z &= p_3 + t b_3 \end{aligned} \qquad \text{where } t \in \mathbb{R}$$
This is called a parametric equation of the line $L$.

You can verify that the form discussed following Example 6.22 in equation 6.7 is of the form given in Definition 6.23.
There is one other form for a line which is useful, called the symmetric form. Consider the line given by 6.7. You can solve for the parameter $t$ to write
$$t = x - 1, \quad t = \frac{y - 2}{2}, \quad t = z$$
Therefore,
$$x - 1 = \frac{y - 2}{2} = z$$
This is the symmetric form of the line.

In the following example, we look at how to take the equation of a line from symmetric form to parametric form.
Example 6.24: Change Symmetric Form to Parametric Form

Suppose the symmetric form of a line is
$$\frac{x - 2}{3} = \frac{y - 1}{2} = z + 3$$
Write the line in parametric form as well as in vector form.

Solution. We want to write this line in the form given by Definition 6.23. This is of the form
$$\begin{aligned} x &= p_1 + t b_1 \\ y &= p_2 + t b_2 \\ z &= p_3 + t b_3 \end{aligned} \qquad \text{where } t \in \mathbb{R}$$
Let $t = \frac{x-2}{3}$, $t = \frac{y-1}{2}$ and $t = z + 3$, as given in the symmetric form of the line. Then solving for $x, y, z$ yields
$$\begin{aligned} x &= 2 + 3t \\ y &= 1 + 2t \\ z &= -3 + t \end{aligned} \qquad \text{with } t \in \mathbb{R}$$
This is the parametric equation for this line.

Now we want to write this line in the form given by Definition 6.18. This is the form
$$\vec{x} = \vec{p} + t(\vec{q} - \vec{p})$$
where $t \in \mathbb{R}$. This equation becomes
$$\begin{bmatrix} x & y & z \end{bmatrix} = \begin{bmatrix} 2 & 1 & -3 \end{bmatrix} + t \begin{bmatrix} 3 & 2 & 1 \end{bmatrix}, \quad t \in \mathbb{R}$$
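A parametric equation translates directly into code: each value of the parameter $t$ yields one point on the line. A minimal Python sketch (our own illustration, using the line from Example 6.24):

```python
import numpy as np

p = np.array([2.0, 1.0, -3.0])   # a point on the line
b = np.array([3.0, 2.0, 1.0])    # a direction vector

def line_point(t):
    """Return the point p + t*b on the line, as in Definition 6.18."""
    return p + t * b

# A few points on the line; t = 0 recovers p itself
for t in [-1.0, 0.0, 1.0, 2.0]:
    print(t, line_point(t))
```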
In the following section, we look at some important products of vectors in $\mathbb{R}^n$.

6.8 Vector Products

6.8.1. The Dot Product

There are two ways of multiplying vectors which are of great importance in applications. The first of these is called the dot product. When we take the dot product of vectors, the result is a scalar. For this reason, the dot product is also called the scalar product and sometimes the inner product. The definition is as follows.

Definition 6.25: Dot Product

Let $\vec{u}, \vec{v}$ be two vectors in $\mathbb{R}^n$. Then we define the dot product $\vec{u} \bullet \vec{v}$ as
$$\vec{u} \bullet \vec{v} = \sum_{k=1}^{n} u_k v_k$$

The dot product $\vec{u} \bullet \vec{v}$ is sometimes denoted as $(\vec{u}, \vec{v})$, where a comma replaces $\bullet$. It can also be written as $\langle \vec{u}, \vec{v} \rangle$. If we write the vectors as row matrices, it amounts to the matrix product $\vec{u}\,\vec{v}^{\,T}$.

Consider the following example.
Example 6.26: Compute a Dot Product

Find $\vec{u} \bullet \vec{v}$ for
$$\vec{u} = \begin{bmatrix} 1 & 2 & 0 & -1 \end{bmatrix}, \quad \vec{v} = \begin{bmatrix} 0 & 1 & 2 & 3 \end{bmatrix}$$

Solution. By Definition 6.25, we must compute
$$\vec{u} \bullet \vec{v} = \sum_{k=1}^{4} u_k v_k$$
This is given by
$$\vec{u} \bullet \vec{v} = (1)(0) + (2)(1) + (0)(2) + (-1)(3) = 0 + 2 + 0 - 3 = -1$$
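A quick numerical check, as a minimal Python sketch (our own illustration, with the vectors of Example 6.26):

```python
import numpy as np

u = np.array([1.0, 2.0, 0.0, -1.0])
v = np.array([0.0, 1.0, 2.0, 3.0])

# The dot product is the sum of the componentwise products
print(np.sum(u * v))   # -1.0
print(np.dot(u, v))    # -1.0, the same computation
```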
With this definition, there are several important properties satisfied by the dot product.

Proposition 6.27: Properties of the Dot Product

Let $k$ and $p$ denote scalars and $\vec{u}, \vec{v}, \vec{w}$ denote vectors. Then the dot product $\vec{u} \bullet \vec{v}$ satisfies the following properties.
$$\vec{u} \bullet \vec{v} = \vec{v} \bullet \vec{u}$$
$$\vec{u} \bullet \vec{u} \geq 0 \text{, and equals zero if and only if } \vec{u} = \vec{0}$$
$$(k\vec{u} + p\vec{v}) \bullet \vec{w} = k(\vec{u} \bullet \vec{w}) + p(\vec{v} \bullet \vec{w})$$
$$\vec{u} \bullet (k\vec{v} + p\vec{w}) = k(\vec{u} \bullet \vec{v}) + p(\vec{u} \bullet \vec{w})$$
$$|\vec{u}|^2 = \vec{u} \bullet \vec{u}$$
The proof of this proposition is left as an exercise.

This proposition tells us that we can also use the dot product to find the length of a vector.

Example 6.28: Length of a Vector

Find the length of
$$\vec{u} = \begin{bmatrix} 2 & 1 & 4 & 2 \end{bmatrix}$$
That is, find $|\vec{u}|$.

Solution. By Proposition 6.27, $|\vec{u}|^2 = \vec{u} \bullet \vec{u}$. Therefore, $|\vec{u}| = \sqrt{\vec{u} \bullet \vec{u}}$. First, compute $\vec{u} \bullet \vec{u}$. This is given by
$$\vec{u} \bullet \vec{u} = (2)(2) + (1)(1) + (4)(4) + (2)(2) = 4 + 1 + 16 + 4 = 25$$
Then
$$|\vec{u}| = \sqrt{\vec{u} \bullet \vec{u}} = \sqrt{25} = 5$$
You may wish to compare this to our previous definition of length, given in Definition 6.13.
The Cauchy-Schwarz inequality is a fundamental inequality satisfied by the dot product. It is given in the following theorem.

Theorem 6.29: Cauchy-Schwarz Inequality

The dot product satisfies the inequality
$$|\vec{u} \bullet \vec{v}| \leq |\vec{u}| |\vec{v}| \qquad (6.8)$$
Furthermore, equality is obtained if and only if one of $\vec{u}$ or $\vec{v}$ is a scalar multiple of the other.

Proof: First note that if $\vec{v} = \vec{0}$, both sides of 6.8 equal zero and so the inequality holds in this case. Therefore, it will be assumed in what follows that $\vec{v} \neq \vec{0}$.

Define a function of $t \in \mathbb{R}$ by
$$f(t) = (\vec{u} + t\vec{v}) \bullet (\vec{u} + t\vec{v})$$
Then by Proposition 6.27, $f(t) \geq 0$ for all $t \in \mathbb{R}$. Also from Proposition 6.27,
$$\begin{aligned}
f(t) &= \vec{u} \bullet (\vec{u} + t\vec{v}) + t\vec{v} \bullet (\vec{u} + t\vec{v}) \\
&= \vec{u} \bullet \vec{u} + t(\vec{u} \bullet \vec{v}) + t\vec{v} \bullet \vec{u} + t^2 \vec{v} \bullet \vec{v} \\
&= |\vec{u}|^2 + 2t(\vec{u} \bullet \vec{v}) + |\vec{v}|^2 t^2
\end{aligned}$$
Now this means the graph of $y = f(t)$ is a parabola which opens up and either its vertex touches the $t$ axis or else the entire graph lies above the $t$ axis. In the first case, there exists some $t$ where $f(t) = 0$, and this requires $\vec{u} + t\vec{v} = \vec{0}$, so one vector is a multiple of the other. Then clearly equality holds in 6.8. In the case where $\vec{v}$ is not a multiple of $\vec{u}$, it follows $f(t) > 0$ for all $t$, which says $f(t)$ has no real zeros and so from the quadratic formula,
$$(2(\vec{u} \bullet \vec{v}))^2 - 4|\vec{u}|^2 |\vec{v}|^2 < 0$$
which is equivalent to $|\vec{u} \bullet \vec{v}| < |\vec{u}| |\vec{v}|$. $\blacksquare$

Notice that this proof was based only on the properties of the dot product listed in Proposition 6.27. This means that whenever an operation satisfies these properties, the Cauchy-Schwarz inequality holds. There are many other instances of these properties besides vectors in $\mathbb{R}^n$.
The Cauchy-Schwarz inequality provides another proof of the triangle inequality for distances in $\mathbb{R}^n$.

Theorem 6.30: Triangle Inequality

For $\vec{u}, \vec{v} \in \mathbb{R}^n$,
$$|\vec{u} + \vec{v}| \leq |\vec{u}| + |\vec{v}| \qquad (6.9)$$
and equality holds if and only if one of the vectors is a non-negative scalar multiple of the other.

Also,
$$\bigl| |\vec{u}| - |\vec{v}| \bigr| \leq |\vec{u} - \vec{v}| \qquad (6.10)$$

Proof: By properties of the dot product and the Cauchy-Schwarz inequality,
$$\begin{aligned}
|\vec{u} + \vec{v}|^2 &= (\vec{u} + \vec{v}) \bullet (\vec{u} + \vec{v}) \\
&= (\vec{u} \bullet \vec{u}) + (\vec{u} \bullet \vec{v}) + (\vec{v} \bullet \vec{u}) + (\vec{v} \bullet \vec{v}) \\
&= |\vec{u}|^2 + 2(\vec{u} \bullet \vec{v}) + |\vec{v}|^2 \\
&\leq |\vec{u}|^2 + 2|\vec{u} \bullet \vec{v}| + |\vec{v}|^2 \\
&\leq |\vec{u}|^2 + 2|\vec{u}||\vec{v}| + |\vec{v}|^2 = (|\vec{u}| + |\vec{v}|)^2
\end{aligned}$$
Hence,
$$|\vec{u} + \vec{v}|^2 \leq (|\vec{u}| + |\vec{v}|)^2$$
Taking square roots of both sides, you obtain 6.9.

It remains to consider when equality occurs. Suppose $\vec{u} = \vec{0}$. Then $\vec{u} = 0\vec{v}$ and the claim about when equality occurs is verified. The same argument holds if $\vec{v} = \vec{0}$. Therefore, it can be assumed both vectors are nonzero. To get equality in 6.9 above, Theorem 6.29 implies one of the vectors must be a multiple of the other. Say $\vec{v} = k\vec{u}$. If $k < 0$, then equality cannot occur in 6.9 because in this case
$$\vec{u} \bullet \vec{v} = k|\vec{u}|^2 < 0 < |k||\vec{u}|^2 = |\vec{u} \bullet \vec{v}|$$
Therefore, $k \geq 0$.

To get the other form of the triangle inequality, write
$$\vec{u} = \vec{u} - \vec{v} + \vec{v}$$
so
$$|\vec{u}| = |\vec{u} - \vec{v} + \vec{v}| \leq |\vec{u} - \vec{v}| + |\vec{v}|$$
Therefore,
$$|\vec{u}| - |\vec{v}| \leq |\vec{u} - \vec{v}| \qquad (6.11)$$
Similarly,
$$|\vec{v}| - |\vec{u}| \leq |\vec{v} - \vec{u}| = |\vec{u} - \vec{v}| \qquad (6.12)$$
It follows from 6.11 and 6.12 that 6.10 holds. This is because $\bigl| |\vec{u}| - |\vec{v}| \bigr|$ equals the left side of either 6.11 or 6.12, and either way, $\bigl| |\vec{u}| - |\vec{v}| \bigr| \leq |\vec{u} - \vec{v}|$. $\blacksquare$

In the next section, we look at the dot product geometrically.
6.8.2. The Geometric Significance of the Dot Product

Given two vectors $\vec{u}$ and $\vec{v}$, the included angle is the angle between these two vectors which is less than or equal to 180 degrees. The dot product can be used to determine the included angle between two vectors. Consider the following picture, where $\theta$ gives the included angle.

[Figure: $\vec{u}$ and $\vec{v}$ with included angle $\theta$ between them.]

Proposition 6.31: The Dot Product and the Included Angle

Let $\vec{u}$ and $\vec{v}$ be two vectors in $\mathbb{R}^n$, and let $\theta$ be the included angle. Then the following equation holds:
$$\vec{u} \bullet \vec{v} = |\vec{u}| |\vec{v}| \cos \theta$$

In words, the dot product of two vectors equals the product of the magnitudes (or lengths) of the two vectors multiplied by the cosine of the included angle. Note this gives a geometric description of the dot product which does not depend explicitly on the coordinates of the vectors.
Consider the following example.

Example 6.32: Find the Angle Between Two Vectors

Find the angle between the vectors given by
$$\vec{u} = \begin{bmatrix} 2 & 1 & -1 \end{bmatrix}, \quad \vec{v} = \begin{bmatrix} 3 & 4 & 1 \end{bmatrix}$$

Solution. By Proposition 6.31,
$$\vec{u} \bullet \vec{v} = |\vec{u}| |\vec{v}| \cos \theta$$
Hence,
$$\cos \theta = \frac{\vec{u} \bullet \vec{v}}{|\vec{u}| |\vec{v}|}$$
First, we can compute $\vec{u} \bullet \vec{v}$. By Definition 6.25, this equals
$$\vec{u} \bullet \vec{v} = (2)(3) + (1)(4) + (-1)(1) = 9$$
Then,
$$|\vec{u}| = \sqrt{(2)(2) + (1)(1) + (-1)(-1)} = \sqrt{6}$$
$$|\vec{v}| = \sqrt{(3)(3) + (4)(4) + (1)(1)} = \sqrt{26}$$
Therefore, the cosine of the included angle equals
$$\cos \theta = \frac{9}{\sqrt{26}\sqrt{6}} = 0.7205766\ldots$$
With the cosine known, the angle can be determined by computing the inverse cosine of that value, giving approximately $\theta = 0.76616$ radians.
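This computation is easy to reproduce numerically. A minimal Python sketch (our own illustration, with the vectors of Example 6.32):

```python
import numpy as np

u = np.array([2.0, 1.0, -1.0])
v = np.array([3.0, 4.0, 1.0])

# cos(theta) = (u . v) / (|u| |v|), from Proposition 6.31
cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(cos_theta)          # 0.7205766...

theta = np.arccos(cos_theta)
print(theta)              # 0.76616... radians
```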
Another application of the geometric description of the dot product is in finding the angle between two lines. Typically one would assume the lines intersect, but in some situations it may make sense to ask this question even when the lines do not intersect, for example for the angle between two object trajectories. In any case, we understand it to mean the smallest angle between (any of) their direction vectors. The only subtlety here is that if $\vec{u}$ is a direction vector for a line, then so is any multiple $k\vec{u}$, and thus we will find supplementary angles among all angles between direction vectors for two lines; we simply take the smaller of the two.
Example 6.33: Find the Angle Between Two Lines

Find the angle between the two lines
$$(L_1): \begin{bmatrix} x & y & z \end{bmatrix} = \begin{bmatrix} 1 & 2 & 0 \end{bmatrix} + t \begin{bmatrix} -1 & 1 & 2 \end{bmatrix}$$
and
$$(L_2): \begin{bmatrix} x & y & z \end{bmatrix} = \begin{bmatrix} 0 & 4 & -3 \end{bmatrix} + s \begin{bmatrix} 2 & 1 & -1 \end{bmatrix}$$

Solution. You can verify that these lines do not intersect, but as discussed above this does not matter; we simply find the smallest angle between any direction vectors for these lines.

To do so, we first find the angle between the direction vectors given above:
$$\vec{u} = \begin{bmatrix} -1 & 1 & 2 \end{bmatrix}, \quad \vec{v} = \begin{bmatrix} 2 & 1 & -1 \end{bmatrix}$$
In order to find the angle, we solve the following equation for $\theta$:
$$\vec{u} \bullet \vec{v} = |\vec{u}| |\vec{v}| \cos \theta$$
to obtain $\cos \theta = -\frac{1}{2}$, and since we choose included angles between $0$ and $\pi$, we obtain $\theta = \frac{2\pi}{3}$.

Now the angles between any two direction vectors for these lines will be either $\frac{2\pi}{3}$ or its supplement $\phi = \pi - \frac{2\pi}{3} = \frac{\pi}{3}$. Thus the angle between the two lines is $\frac{\pi}{3}$.
We can also use Proposition 6.31 to compute the dot product of two vectors.

Example 6.34: Using the Geometric Description to Find a Dot Product

Let $\vec{u}, \vec{v}$ be vectors with $|\vec{u}| = 3$ and $|\vec{v}| = 4$. Suppose the angle between $\vec{u}$ and $\vec{v}$ is $\pi/3$. Find $\vec{u} \bullet \vec{v}$.

Solution. From the geometric description of the dot product in Proposition 6.31,
$$\vec{u} \bullet \vec{v} = (3)(4) \cos(\pi/3) = 3 \cdot 4 \cdot 1/2 = 6$$

Two nonzero vectors are said to be perpendicular, sometimes also called orthogonal, if the included angle is $\pi/2$ radians ($90^\circ$).
Consider the following proposition.

Proposition 6.35: Perpendicular Vectors

Let $\vec{u}$ and $\vec{v}$ be nonzero vectors in $\mathbb{R}^n$. Then $\vec{u}$ and $\vec{v}$ are perpendicular exactly when
$$\vec{u} \bullet \vec{v} = 0$$

Proof: This follows directly from Proposition 6.31. First, if the dot product of two nonzero vectors is equal to 0, this tells us that $\cos \theta = 0$ (this is where we need nonzero vectors). Thus $\theta = \pi/2$ and the vectors are perpendicular.

If, on the other hand, $\vec{v}$ is perpendicular to $\vec{u}$, then the included angle is $\pi/2$ radians. Hence $\cos \theta = 0$ and $\vec{u} \bullet \vec{v} = 0$. $\blacksquare$

Consider the following example.

Example 6.36: Determine if Two Vectors are Perpendicular

Determine whether the two vectors
$$\vec{u} = \begin{bmatrix} 2 & 1 & -1 \end{bmatrix}, \quad \vec{v} = \begin{bmatrix} 1 & 3 & 5 \end{bmatrix}$$
are perpendicular.

Solution. In order to determine if these two vectors are perpendicular, we compute the dot product. This is given by
$$\vec{u} \bullet \vec{v} = (2)(1) + (1)(3) + (-1)(5) = 0$$
Therefore, by Proposition 6.35, these two vectors are perpendicular.
6.8.3. Projections

In some applications, we wish to write a vector as a sum of two related vectors. First, we explore an important theorem. The result of this theorem will provide our definition of a vector projection.

Theorem 6.37: Vector Projections

Let $\vec{v}$ and $\vec{u}$ be nonzero vectors. Then there exist unique vectors $\vec{v}_{||}$ and $\vec{v}_{\perp}$ such that
$$\vec{v} = \vec{v}_{||} + \vec{v}_{\perp} \qquad (6.13)$$
where $\vec{v}_{||}$ is a scalar multiple of $\vec{u}$, and $\vec{v}_{\perp}$ is perpendicular to $\vec{u}$.

Proof: Suppose 6.13 holds and $\vec{v}_{||} = k\vec{u}$. Taking the dot product of both sides with $\vec{u}$ and using $\vec{v}_{\perp} \bullet \vec{u} = 0$, this yields
$$\begin{aligned}
\vec{v} \bullet \vec{u} &= (\vec{v}_{||} + \vec{v}_{\perp}) \bullet \vec{u} \\
&= k\vec{u} \bullet \vec{u} + \vec{v}_{\perp} \bullet \vec{u} \\
&= k|\vec{u}|^2
\end{aligned}$$
which requires $k = \vec{v} \bullet \vec{u} / |\vec{u}|^2$. Thus there can be no more than one vector $\vec{v}_{||}$. It follows $\vec{v}_{\perp}$ must equal $\vec{v} - \vec{v}_{||}$. This verifies there can be no more than one choice for both $\vec{v}_{||}$ and $\vec{v}_{\perp}$.

Now let
$$\vec{v}_{||} = \frac{\vec{v} \bullet \vec{u}}{|\vec{u}|^2} \vec{u}$$
and let
$$\vec{v}_{\perp} = \vec{v} - \vec{v}_{||} = \vec{v} - \frac{\vec{v} \bullet \vec{u}}{|\vec{u}|^2} \vec{u}$$
Then $\vec{v}_{||} = k\vec{u}$ where $k = \frac{\vec{v} \bullet \vec{u}}{|\vec{u}|^2}$. It only remains to verify $\vec{v}_{\perp} \bullet \vec{u} = 0$. But
$$\vec{v}_{\perp} \bullet \vec{u} = \vec{v} \bullet \vec{u} - \frac{\vec{v} \bullet \vec{u}}{|\vec{u}|^2} \vec{u} \bullet \vec{u} = \vec{v} \bullet \vec{u} - \vec{v} \bullet \vec{u} = 0 \qquad \blacksquare$$

The vector $\vec{v}_{||}$ in Theorem 6.37 is called the projection of $\vec{v}$ onto $\vec{u}$ and is denoted by
$$\vec{v}_{||} = \mathrm{proj}_{\vec{u}}(\vec{v})$$
We now make a formal definition of the vector projection.

Definition 6.38: Vector Projection

Let $\vec{u}$ and $\vec{v}$ be vectors. Then the projection of $\vec{v}$ onto $\vec{u}$ is given by
$$\mathrm{proj}_{\vec{u}}(\vec{v}) = \left( \frac{\vec{v} \bullet \vec{u}}{\vec{u} \bullet \vec{u}} \right) \vec{u}$$
Consider the following example of a projection.

Example 6.39: Find the Projection of One Vector Onto Another

Find $\mathrm{proj}_{\vec{u}}(\vec{v})$ if
$$\vec{u} = \begin{bmatrix} 2 & 3 & 4 \end{bmatrix}, \quad \vec{v} = \begin{bmatrix} 1 & -2 & -1 \end{bmatrix}$$

Solution. We can use the formula provided in Definition 6.38 to find $\mathrm{proj}_{\vec{u}}(\vec{v})$. First, compute $\vec{v} \bullet \vec{u}$. This is given by
$$\begin{bmatrix} 1 & -2 & -1 \end{bmatrix} \bullet \begin{bmatrix} 2 & 3 & 4 \end{bmatrix} = (2)(1) + (3)(-2) + (4)(-1) = 2 - 6 - 4 = -8$$
Similarly, $\vec{u} \bullet \vec{u}$ is given by
$$\begin{bmatrix} 2 & 3 & 4 \end{bmatrix} \bullet \begin{bmatrix} 2 & 3 & 4 \end{bmatrix} = (2)(2) + (3)(3) + (4)(4) = 4 + 9 + 16 = 29$$
Therefore, the projection is equal to
$$\mathrm{proj}_{\vec{u}}(\vec{v}) = -\frac{8}{29} \begin{bmatrix} 2 & 3 & 4 \end{bmatrix} = \begin{bmatrix} -\frac{16}{29} & -\frac{24}{29} & -\frac{32}{29} \end{bmatrix}$$
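The projection formula packages naturally into a small helper function. A minimal Python sketch (our own illustration, checked against Example 6.39):

```python
import numpy as np

def proj(u, v):
    """Projection of v onto u, as in Definition 6.38."""
    return (np.dot(v, u) / np.dot(u, u)) * u

u = np.array([2.0, 3.0, 4.0])
v = np.array([1.0, -2.0, -1.0])

p = proj(u, v)
print(p)                  # [-0.5517 -0.8276 -1.1034], i.e. (-8/29) * u
print(p * 29)             # [-16. -24. -32.], matching the exact answer

# The remainder v - proj_u(v) is perpendicular to u (Theorem 6.37)
print(np.dot(v - p, u))   # 0.0 (up to rounding)
```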
Consider the map $\vec{v} \mapsto \mathrm{proj}_{\vec{u}}(\vec{v})$. It turns out that this map is linear, a result which follows from the properties of the dot product. This is shown as follows:
$$\begin{aligned}
\mathrm{proj}_{\vec{u}}(k\vec{v} + p\vec{w}) &= \left( \frac{(k\vec{v} + p\vec{w}) \bullet \vec{u}}{\vec{u} \bullet \vec{u}} \right) \vec{u} \\
&= k \left( \frac{\vec{v} \bullet \vec{u}}{\vec{u} \bullet \vec{u}} \right) \vec{u} + p \left( \frac{\vec{w} \bullet \vec{u}}{\vec{u} \bullet \vec{u}} \right) \vec{u} \\
&= k\,\mathrm{proj}_{\vec{u}}(\vec{v}) + p\,\mathrm{proj}_{\vec{u}}(\vec{w})
\end{aligned}$$
Consider the following example.
Example 6.40: Matrix of a Projection Map

Let the projection map be defined by $T(\vec{v}) = \mathrm{proj}_{\vec{u}}(\vec{v})$ and let $\vec{u} = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}^T$. Does this linear transformation come from multiplication by a matrix? If so, what is the matrix?

Solution. Recall that any linear transformation $T$ is determined by a matrix whose columns are given by $T(\vec{e}_i)$. Recall that $\vec{e}_i$ denotes the vector in $\mathbb{R}^n$ which has a 1 in the $i^{th}$ position and a zero everywhere else.

It follows that $T(\vec{e}_i) = \mathrm{proj}_{\vec{u}}(\vec{e}_i)$ gives the $i^{th}$ column of the desired matrix. Therefore, we need to find
$$\mathrm{proj}_{\vec{u}}(\vec{e}_i) = \left( \frac{\vec{e}_i \bullet \vec{u}}{\vec{u} \bullet \vec{u}} \right) \vec{u}$$
For the given vector $\vec{u}$, this implies the columns of the desired matrix are
$$\frac{1}{14} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad \frac{2}{14} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad \frac{3}{14} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$$
which you can verify using Definition 6.38. Hence the matrix of $T$ is
$$\frac{1}{14} \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 3 & 6 & 9 \end{bmatrix}$$
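In coordinates, this matrix is the outer product $\vec{u}\vec{u}^T$ scaled by $1/(\vec{u} \bullet \vec{u})$. A minimal Python sketch (our own illustration) building it and checking it against the projection formula:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])

# Matrix of T(v) = proj_u(v): the outer product u u^T divided by u . u
A = np.outer(u, u) / np.dot(u, u)
print(A * 14)   # [[1. 2. 3.] [2. 4. 6.] [3. 6. 9.]], as in Example 6.40

# Check: multiplying by A agrees with the projection formula
v = np.array([1.0, -2.0, -1.0])
print(A @ v)
print((np.dot(v, u) / np.dot(u, u)) * u)   # the same vector
```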
6.8.4. The Cross Product

The second type of product for vectors is called the cross product. It is important to note that the cross product is only defined in $\mathbb{R}^3$. First we discuss the geometric meaning, and then a description in terms of coordinates is given. Both descriptions of the cross product are important. The geometric description is essential in order to understand the applications to physics and geometry, while the coordinate description is necessary to compute the cross product.

Consider the following definition.

Definition 6.41: Right Hand System of Vectors

Three vectors $\vec{u}, \vec{v}, \vec{w}$ form a right hand system if, when you extend the fingers of your right hand along the direction of vector $\vec{u}$ and close them in the direction of $\vec{v}$, the thumb points roughly in the direction of $\vec{w}$.

For an example of a right handed system of vectors, see the following picture.

[Figure: a right hand system $\vec{u}$, $\vec{v}$, $\vec{w}$, with $\vec{w}$ pointing up out of the plane of $\vec{u}$ and $\vec{v}$.]

In this picture the vector $\vec{w}$ points upwards from the plane determined by the other two vectors. Point the fingers of your right hand along $\vec{u}$, and close them in the direction of $\vec{v}$. Notice that if you extend the thumb of your right hand, it points in the direction of $\vec{w}$.

You should consider how a right hand system would differ from a left hand system. Try using your left hand and you will see that the vector $\vec{w}$ would need to point in the opposite direction.

Notice that the special vectors $\vec{i}, \vec{j}, \vec{k}$ will always form a right handed system. If you extend the fingers of your right hand along $\vec{i}$ and close them in the direction of $\vec{j}$, the thumb points in the direction of $\vec{k}$.

[Figure: the standard vectors $\vec{i}$, $\vec{j}$, $\vec{k}$ forming a right hand system.]
The following is the geometric description of the cross product. Recall that the dot product of two vectors results in a scalar. In contrast, the cross product results in a vector, as the product gives a direction as well as magnitude.

Definition 6.42: Geometric Definition of Cross Product

Let $\vec{u}$ and $\vec{v}$ be two vectors in $\mathbb{R}^3$. Then $\vec{u} \times \vec{v}$ is defined by the following two rules.

1. Its length is $|\vec{u} \times \vec{v}| = |\vec{u}| |\vec{v}| \sin \theta$, where $\theta$ is the included angle between $\vec{u}$ and $\vec{v}$.

2. It is perpendicular to both $\vec{u}$ and $\vec{v}$, that is, $(\vec{u} \times \vec{v}) \bullet \vec{u} = 0$, $(\vec{u} \times \vec{v}) \bullet \vec{v} = 0$, and $\vec{u}, \vec{v}, \vec{u} \times \vec{v}$ form a right hand system.

The cross products of the special vectors $\vec{i}, \vec{j}, \vec{k}$ are as follows.
$$\begin{aligned}
\vec{i} \times \vec{j} &= \vec{k} & \vec{j} \times \vec{i} &= -\vec{k} \\
\vec{k} \times \vec{i} &= \vec{j} & \vec{i} \times \vec{k} &= -\vec{j} \\
\vec{j} \times \vec{k} &= \vec{i} & \vec{k} \times \vec{j} &= -\vec{i}
\end{aligned}$$
With this information, the following gives the coordinate description of the cross product. Recall that the vector $\vec{u} = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}$ can be written in terms of $\vec{i}, \vec{j}, \vec{k}$ as $\vec{u} = u_1 \vec{i} + u_2 \vec{j} + u_3 \vec{k}$.

Proposition 6.43: Coordinate Description of Cross Product

Let $\vec{u} = u_1 \vec{i} + u_2 \vec{j} + u_3 \vec{k}$ and $\vec{v} = v_1 \vec{i} + v_2 \vec{j} + v_3 \vec{k}$ be two vectors. Then
$$\vec{u} \times \vec{v} = (u_2 v_3 - u_3 v_2)\vec{i} + (u_3 v_1 - u_1 v_3)\vec{j} + (u_1 v_2 - u_2 v_1)\vec{k} \qquad (6.14)$$
Writing $\vec{u} \times \vec{v}$ in the usual way, it is given by
$$\vec{u} \times \vec{v} = \begin{bmatrix} u_2 v_3 - u_3 v_2 & u_3 v_1 - u_1 v_3 & u_1 v_2 - u_2 v_1 \end{bmatrix}$$

We now prove this proposition.
Proof: From the above table and the properties of the cross product listed,
$$\begin{aligned}
\vec{u} \times \vec{v} &= (u_1 \vec{i} + u_2 \vec{j} + u_3 \vec{k}) \times (v_1 \vec{i} + v_2 \vec{j} + v_3 \vec{k}) \\
&= u_1 v_2 \, \vec{i} \times \vec{j} + u_1 v_3 \, \vec{i} \times \vec{k} + u_2 v_1 \, \vec{j} \times \vec{i} + u_2 v_3 \, \vec{j} \times \vec{k} + u_3 v_1 \, \vec{k} \times \vec{i} + u_3 v_2 \, \vec{k} \times \vec{j} \\
&= u_1 v_2 \vec{k} - u_1 v_3 \vec{j} - u_2 v_1 \vec{k} + u_2 v_3 \vec{i} + u_3 v_1 \vec{j} - u_3 v_2 \vec{i} \\
&= (u_2 v_3 - u_3 v_2)\vec{i} + (u_3 v_1 - u_1 v_3)\vec{j} + (u_1 v_2 - u_2 v_1)\vec{k} \qquad (6.15)
\end{aligned}$$
$\blacksquare$
There is another version of 6.14 which may be easier to remember. We can express the cross product as the determinant of a matrix, as follows:
$$\vec{u} \times \vec{v} = \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} \qquad (6.16)$$
where you expand the determinant along the top row. This yields
$$\vec{i}(-1)^{1+1} \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix} + \vec{j}(-1)^{2+1} \begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix} + \vec{k}(-1)^{3+1} \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix}
= \vec{i} \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix} - \vec{j} \begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix} + \vec{k} \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix}$$
The above equals
$$(u_2 v_3 - u_3 v_2)\vec{i} - (u_1 v_3 - u_3 v_1)\vec{j} + (u_1 v_2 - u_2 v_1)\vec{k} \qquad (6.17)$$
which is the same as 6.15.
The cross product satisfies the following properties.

Proposition 6.44: Properties of the Cross Product

Let $\vec{u}, \vec{v}, \vec{w}$ be vectors in $\mathbb{R}^3$, and $k$ a scalar. Then the following properties of the cross product hold.

1. $\vec{u} \times \vec{v} = -(\vec{v} \times \vec{u})$, and $\vec{u} \times \vec{u} = \vec{0}$
2. $(k\vec{u}) \times \vec{v} = k(\vec{u} \times \vec{v}) = \vec{u} \times (k\vec{v})$
3. $\vec{u} \times (\vec{v} + \vec{w}) = \vec{u} \times \vec{v} + \vec{u} \times \vec{w}$
4. $(\vec{v} + \vec{w}) \times \vec{u} = \vec{v} \times \vec{u} + \vec{w} \times \vec{u}$

Proof: Formula 1 follows immediately from the definition. The vectors $\vec{u} \times \vec{v}$ and $\vec{v} \times \vec{u}$ have the same magnitude, $|\vec{u}| |\vec{v}| \sin \theta$, and an application of the right hand rule shows they have opposite direction.

Formula 2 is proven as follows. If $k$ is a non-negative scalar, the direction of $(k\vec{u}) \times \vec{v}$ is the same as the direction of $\vec{u} \times \vec{v}$, $k(\vec{u} \times \vec{v})$ and $\vec{u} \times (k\vec{v})$. The magnitude is $k$ times the magnitude of $\vec{u} \times \vec{v}$, which is the same as the magnitude of $k(\vec{u} \times \vec{v})$ and $\vec{u} \times (k\vec{v})$. Using this yields equality in 2. In the case where $k < 0$, everything works the same way except the vectors are all pointing in the opposite direction, and you must multiply by $|k|$ when comparing their magnitudes.

The distributive laws, 3 and 4, are much harder to establish. For now, it suffices to notice that if we know that 3 is true, 4 follows. Thus, assuming 3, and using 1,
$$\begin{aligned}
(\vec{v} + \vec{w}) \times \vec{u} &= -\vec{u} \times (\vec{v} + \vec{w}) \\
&= -(\vec{u} \times \vec{v} + \vec{u} \times \vec{w}) \\
&= \vec{v} \times \vec{u} + \vec{w} \times \vec{u} \qquad \blacksquare
\end{aligned}$$
We will now look at an example of how to compute a cross product.

Example 6.45: Find a Cross Product

Find $\vec{u} \times \vec{v}$ for the following vectors:
$$\vec{u} = \begin{bmatrix} 1 & -1 & 2 \end{bmatrix}, \quad \vec{v} = \begin{bmatrix} 3 & -2 & 1 \end{bmatrix}$$

Solution. Note that we can write $\vec{u}, \vec{v}$ in terms of the special vectors $\vec{i}, \vec{j}, \vec{k}$ as
$$\vec{u} = \vec{i} - \vec{j} + 2\vec{k}, \qquad \vec{v} = 3\vec{i} - 2\vec{j} + \vec{k}$$
We will use the equation given by 6.16 to compute the cross product:
$$\vec{u} \times \vec{v} = \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ 1 & -1 & 2 \\ 3 & -2 & 1 \end{vmatrix}
= \begin{vmatrix} -1 & 2 \\ -2 & 1 \end{vmatrix} \vec{i} - \begin{vmatrix} 1 & 2 \\ 3 & 1 \end{vmatrix} \vec{j} + \begin{vmatrix} 1 & -1 \\ 3 & -2 \end{vmatrix} \vec{k} = 3\vec{i} + 5\vec{j} + \vec{k}$$
We can write this result in the usual way, as
$$\vec{u} \times \vec{v} = \begin{bmatrix} 3 & 5 & 1 \end{bmatrix}$$
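NumPy provides the coordinate formula 6.14 directly. A minimal Python sketch (our own illustration, using the vectors of Example 6.45):

```python
import numpy as np

u = np.array([1.0, -1.0, 2.0])
v = np.array([3.0, -2.0, 1.0])

w = np.cross(u, v)
print(w)                            # [3. 5. 1.]

# w is perpendicular to both u and v (Definition 6.42)
print(np.dot(w, u), np.dot(w, v))   # 0.0 0.0

# Anticommutativity: v x u = -(u x v)
print(np.cross(v, u))               # [-3. -5. -1.]
```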
An important geometrical application of the cross product is as follows: $|\vec{u} \times \vec{v}|$ is the area of the parallelogram determined by $\vec{u}$ and $\vec{v}$, as shown in the following picture.

[Figure: the parallelogram determined by $\vec{u}$ and $\vec{v}$, with height $|\vec{u}| \sin(\theta)$.]

We examine this concept in the following example.

Example 6.46: Area of a Parallelogram

Find the area of the parallelogram determined by the vectors $\vec{u}$ and $\vec{v}$ given by
$$\vec{u} = \begin{bmatrix} 1 & -1 & 2 \end{bmatrix}, \quad \vec{v} = \begin{bmatrix} 3 & -2 & 1 \end{bmatrix}$$

Solution. Notice that these vectors are the same as the ones given in Example 6.45. Recall from the geometric description of the cross product that the area of the parallelogram is simply the magnitude of $\vec{u} \times \vec{v}$. From Example 6.45, $\vec{u} \times \vec{v} = 3\vec{i} + 5\vec{j} + \vec{k}$. We can also write this as
$$\vec{u} \times \vec{v} = \begin{bmatrix} 3 & 5 & 1 \end{bmatrix}$$
Thus the area of the parallelogram is
$$|\vec{u} \times \vec{v}| = \sqrt{(3)(3) + (5)(5) + (1)(1)} = \sqrt{9 + 25 + 1} = \sqrt{35}$$
We can also use this concept to find the area of a triangle. Consider the following example.

Example 6.47: Area of a Triangle

Find the area of the triangle determined by the points $(1, 2, 3)$, $(0, 2, 5)$, $(5, 1, 2)$.

Solution. This triangle is obtained by connecting the three points with lines. Picking $(1, 2, 3)$ as a starting point, there are two displacement vectors, $\begin{bmatrix} -1 & 0 & 2 \end{bmatrix}$ and $\begin{bmatrix} 4 & -1 & -1 \end{bmatrix}$. Notice that if we add either of these vectors to the position vector of the starting point, the result is the position vector of one of the other two points. Now, the area of the triangle is half the area of the parallelogram determined by $\begin{bmatrix} -1 & 0 & 2 \end{bmatrix}$ and $\begin{bmatrix} 4 & -1 & -1 \end{bmatrix}$. The required cross product is given by
$$\begin{bmatrix} -1 & 0 & 2 \end{bmatrix} \times \begin{bmatrix} 4 & -1 & -1 \end{bmatrix} = \begin{bmatrix} 2 & 7 & 1 \end{bmatrix}$$
Taking the size of this vector gives the area of the parallelogram:
$$\sqrt{(2)(2) + (7)(7) + (1)(1)} = \sqrt{4 + 49 + 1} = \sqrt{54}$$
Hence the area of the triangle is $\frac{1}{2}\sqrt{54} = \frac{3}{2}\sqrt{6}$.
In general, if you have three points $P, Q, R$ in $\mathbb{R}^3$, the area of the triangle is given by
$$\frac{1}{2} \left| \vec{PQ} \times \vec{PR} \right|$$
Recall that $\vec{PQ}$ is the vector running from point $P$ to point $Q$.

[Figure: triangle with vertices $P$, $Q$, $R$.]
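This formula packages neatly into a function. A minimal Python sketch (our own illustration, checked against Example 6.47):

```python
import numpy as np

def triangle_area(P, Q, R):
    """Area of the triangle PQR: half the norm of PQ x PR."""
    P, Q, R = map(np.asarray, (P, Q, R))
    return 0.5 * np.linalg.norm(np.cross(Q - P, R - P))

# The triangle of Example 6.47
print(triangle_area((1, 2, 3), (0, 2, 5), (5, 1, 2)))  # 3.6742... = (3/2)*sqrt(6)
print(1.5 * np.sqrt(6))
```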
In the next section, we explore another application of the cross product.
The Box Product

Recall that we can use the cross product to find the area of a parallelogram. It follows that we can use the cross product together with the dot product to find the volume of a parallelepiped.

We begin with a definition.

Definition 6.48: Parallelepiped

A parallelepiped determined by the three vectors $\vec{u}, \vec{v}$, and $\vec{w}$ consists of
$$\{ r\vec{u} + s\vec{v} + t\vec{w} : r, s, t \in [0, 1] \}$$
That is, if you pick three numbers $r, s$, and $t$, each in $[0, 1]$, and form $r\vec{u} + s\vec{v} + t\vec{w}$, then the collection of all such points makes up the parallelepiped determined by these three vectors.

The following is an example of a parallelepiped.

[Figure: a parallelepiped determined by $\vec{u}$, $\vec{v}$, $\vec{w}$, with $\vec{u} \times \vec{v}$ pointing up from the base and $\theta$ the angle between $\vec{w}$ and $\vec{u} \times \vec{v}$.]
Notice that the base of the parallelepiped is the parallelogram determined by the vectors $\vec{u}$ and $\vec{v}$. Therefore, its area is equal to $|\vec{u} \times \vec{v}|$. The height of the parallelepiped is $|\vec{w}| \cos \theta$, where $\theta$ is the angle shown in the picture between $\vec{w}$ and $\vec{u} \times \vec{v}$. The volume of this parallelepiped is the area of the base times the height, which is just
$$|\vec{u} \times \vec{v}| |\vec{w}| \cos \theta = (\vec{u} \times \vec{v}) \bullet \vec{w}$$
This expression is known as the box product and is sometimes written as $[\vec{u}, \vec{v}, \vec{w}]$. You should consider what happens if you interchange the $\vec{v}$ with the $\vec{w}$ or the $\vec{u}$ with the $\vec{w}$. You can see geometrically from drawing pictures that this merely introduces a minus sign. In any case, the box product of three vectors always equals either the volume of the parallelepiped determined by the three vectors or else $-1$ times this volume.

Consider an example of this concept.
Example 6.49: Volume of a Parallelepiped

Find the volume of the parallelepiped determined by the vectors
$$\vec{u} = \begin{bmatrix} 1 & 2 & -5 \end{bmatrix}, \quad \vec{v} = \begin{bmatrix} 1 & 3 & -6 \end{bmatrix}, \quad \vec{w} = \begin{bmatrix} 3 & 2 & 3 \end{bmatrix}$$

Solution. According to the above discussion, pick any two of these vectors, take the cross product, and then take the dot product of this with the third of these vectors. The result will be either the desired volume or $-1$ times the desired volume.

We will take the cross product of $\vec{u}$ and $\vec{v}$. This is given by
$$\vec{u} \times \vec{v} = \begin{bmatrix} 1 & 2 & -5 \end{bmatrix} \times \begin{bmatrix} 1 & 3 & -6 \end{bmatrix} = \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ 1 & 2 & -5 \\ 1 & 3 & -6 \end{vmatrix} = 3\vec{i} + \vec{j} + \vec{k} = \begin{bmatrix} 3 & 1 & 1 \end{bmatrix}$$
Now take the dot product of this vector with $\vec{w}$, which yields
$$(\vec{u} \times \vec{v}) \bullet \vec{w} = \begin{bmatrix} 3 & 1 & 1 \end{bmatrix} \bullet \begin{bmatrix} 3 & 2 & 3 \end{bmatrix} = (3\vec{i} + \vec{j} + \vec{k}) \bullet (3\vec{i} + 2\vec{j} + 3\vec{k}) = 9 + 2 + 3 = 14$$
This shows the volume of this parallelepiped is 14 cubic units.
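Numerically, the box product can be computed either step by step or, as derived below, as a 3 by 3 determinant. A minimal Python sketch (our own illustration, with the vectors of Example 6.49):

```python
import numpy as np

u = np.array([1.0, 2.0, -5.0])
v = np.array([1.0, 3.0, -6.0])
w = np.array([3.0, 2.0, 3.0])

# Box product computed step by step, as in Example 6.49
box = np.dot(np.cross(u, v), w)
print(box)   # 14.0

# Equivalently, the determinant of the matrix whose rows are u, v, w
print(np.linalg.det(np.array([u, v, w])))   # 14.0 (up to rounding)

# The volume is the absolute value of either quantity
print(abs(box))
```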
There is a fundamental observation which comes directly from the geometric definitions of the cross product and the dot product.

Proposition 6.50: Order of the Product

Let $\vec{u}, \vec{v}$, and $\vec{w}$ be vectors. Then $(\vec{u} \times \vec{v}) \bullet \vec{w} = \vec{u} \bullet (\vec{v} \times \vec{w})$.

Proof: This follows from observing that $(\vec{u} \times \vec{v}) \bullet \vec{w}$ and $\vec{u} \bullet (\vec{v} \times \vec{w})$ either both give the volume of the parallelepiped or both give $-1$ times the volume. $\blacksquare$
Recall that we can express the cross product as the determinant of a particular matrix. It turns out that the same can be done for the box product. Suppose you have three vectors, $\vec{u} = \begin{bmatrix} a & b & c \end{bmatrix}$, $\vec{v} = \begin{bmatrix} d & e & f \end{bmatrix}$, and $\vec{w} = \begin{bmatrix} g & h & i \end{bmatrix}$. Then the box product $\vec{u} \bullet (\vec{v} \times \vec{w})$ is given by the following:
$$\vec{u} \bullet (\vec{v} \times \vec{w}) = \begin{bmatrix} a & b & c \end{bmatrix} \bullet \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ d & e & f \\ g & h & i \end{vmatrix}
= a \begin{vmatrix} e & f \\ h & i \end{vmatrix} - b \begin{vmatrix} d & f \\ g & i \end{vmatrix} + c \begin{vmatrix} d & e \\ g & h \end{vmatrix}
= \det \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$$
To take the box product, you can simply take the determinant of the matrix which results by letting the rows be the components of the given vectors in the order in which they occur in the box product.

This follows directly from the definition of the cross product given above and the way we expand determinants. Thus the volume of the parallelepiped determined by the vectors $\vec{u}, \vec{v}, \vec{w}$ is just the absolute value of the above determinant.
6.9 Applications

6.9.1. Rotations

In this section, we will use the above geometric descriptions of vector addition and scalar multiplication to show that a rotation of vectors through an angle is an example of a linear transformation.

Such a rotation would achieve something like the following if applied to each vector from $(0, 0)$ to the point in the picture corresponding to the person shown standing upright.

[Figure: a figure and its image under a rotation about the origin.]

More generally, denote a rotation by $T$. Why is such a transformation linear? Consider the following picture which illustrates a rotation. Let $\vec{u}, \vec{v}$ denote vectors.

[Figure: $\vec{u}$, $\vec{v}$, $\vec{u} + \vec{v}$ and their images $T(\vec{u})$, $T(\vec{v})$, $T(\vec{u} + \vec{v})$ under the rotation.]

Let's consider how to obtain $T(\vec{u} + \vec{v})$. Simply, you add $T(\vec{u})$ and $T(\vec{v})$. Here is why. If you add $T(\vec{u})$ to $T(\vec{v})$, you get the diagonal of the parallelogram determined by $T(\vec{u})$ and $T(\vec{v})$, as this action is our usual vector addition. Now, suppose we first add $\vec{u}$ and $\vec{v}$, and then apply the transformation $T$ to $\vec{u} + \vec{v}$. Hence, we find $T(\vec{u} + \vec{v})$. As shown in the diagram, this will result in the same vector. In other words, $T(\vec{u} + \vec{v}) = T(\vec{u}) + T(\vec{v})$.

This is because the rotation preserves all angles between the vectors as well as their lengths. In particular, it preserves the shape of this parallelogram. Thus both $T(\vec{u}) + T(\vec{v})$ and $T(\vec{u} + \vec{v})$ give the same vector. It follows that $T$ distributes across addition of the vectors of $\mathbb{R}^2$.

Similarly, if $k$ is a scalar, it follows that $T(k\vec{u}) = kT(\vec{u})$. Thus rotations are an example of a linear transformation by Definition 2.59.
In the following example, we consider a linear transformation which rotates vectors in $\mathbb{R}^2$ through an angle of $\theta$.

Example 6.51: Find the Rotation Matrix in Two Dimensions

Determine the matrix which represents the linear transformation defined by rotating every vector in $\mathbb{R}^2$ through an angle of $\theta$.

Solution. Let $\vec{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\vec{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. These identify the geometric vectors which point along the positive $x$ axis and positive $y$ axis, as shown.

[Figure: $\vec{e}_1$, $\vec{e}_2$ and their images $T(\vec{e}_1)$ at $(\cos(\theta), \sin(\theta))$ and $T(\vec{e}_2)$ at $(-\sin(\theta), \cos(\theta))$.]

From Theorem 2.65, we need to find $T(\vec{e}_1)$ and $T(\vec{e}_2)$ and use these as the columns of the matrix $A$ of $T$. We can use $\cos, \sin$ of the angle $\theta$ to find the coordinates of $T(\vec{e}_1)$, as shown in the above picture. The coordinates of $T(\vec{e}_2)$ also follow from trigonometry. Thus
$$T(\vec{e}_1) = \begin{bmatrix} \cos \theta \\ \sin \theta \end{bmatrix}, \quad T(\vec{e}_2) = \begin{bmatrix} -\sin \theta \\ \cos \theta \end{bmatrix}$$
Therefore, from Theorem 2.65,
$$A = \begin{bmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{bmatrix}$$

We can also solve this example algebraically, without the use of the above picture. The definition of $(\cos(\theta), \sin(\theta))$ is as the coordinates of the point of $T(\vec{e}_1)$. Now the point of the vector $\vec{e}_2$ is exactly $\pi/2$ further along the unit circle from the point of $\vec{e}_1$, and therefore, after rotation through an angle of $\theta$, the coordinates $x$ and $y$ of the point of $T(\vec{e}_2)$ are given by
$$(x, y) = (\cos(\theta + \pi/2), \sin(\theta + \pi/2)) = (-\sin \theta, \cos \theta)$$
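A minimal Python sketch of this rotation matrix (our own illustration):

```python
import numpy as np

def rotation_matrix(theta):
    """Matrix of the rotation of R^2 through the angle theta (Example 6.51)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

A = rotation_matrix(np.pi / 2)               # rotate by 90 degrees
print(np.round(A @ np.array([1.0, 0.0])))    # [0. 1.]: e1 maps to e2
print(np.round(A @ np.array([0.0, 1.0])))    # [-1. 0.]: e2 maps to -e1

# Rotations preserve length: |Au| = |u|
u = np.array([3.0, 4.0])
print(np.linalg.norm(rotation_matrix(0.7) @ u), np.linalg.norm(u))
```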
We now look at an example of a linear transformation involving two angles.

Example 6.52: The Rotation Matrix of the Sum of Two Angles

Find the matrix of the linear transformation which is obtained by first rotating all vectors through an angle of $\phi$ and then through an angle $\theta$. Hence the linear transformation rotates all vectors through an angle of $\theta + \phi$.

Solution. Let $T_{\theta+\phi}$ denote the linear transformation which rotates every vector through an angle of $\theta + \phi$. Then to obtain $T_{\theta+\phi}$, we first apply $T_\phi$ and then $T_\theta$, where $T_\phi$ is the linear transformation which rotates through an angle of $\phi$ and $T_\theta$ is the linear transformation which rotates through an angle of $\theta$. Denoting the corresponding matrices by $A_{\theta+\phi}$, $A_\phi$, and $A_\theta$, it follows that for every $\vec{u}$,
$$T_{\theta+\phi}(\vec{u}) = A_{\theta+\phi}\vec{u} = A_\theta A_\phi \vec{u} = T_\theta(T_\phi(\vec{u}))$$
Notice the order of the matrices here!

Consequently, you must have
$$A_{\theta+\phi} = \begin{bmatrix} \cos(\theta+\phi) & -\sin(\theta+\phi) \\ \sin(\theta+\phi) & \cos(\theta+\phi) \end{bmatrix}
= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix} = A_\theta A_\phi$$
The usual matrix multiplication yields
$$A_{\theta+\phi} = \begin{bmatrix} \cos(\theta+\phi) & -\sin(\theta+\phi) \\ \sin(\theta+\phi) & \cos(\theta+\phi) \end{bmatrix}
= \begin{bmatrix} \cos\theta\cos\phi - \sin\theta\sin\phi & -\cos\theta\sin\phi - \sin\theta\cos\phi \\ \sin\theta\cos\phi + \cos\theta\sin\phi & \cos\theta\cos\phi - \sin\theta\sin\phi \end{bmatrix} = A_\theta A_\phi$$
Don't these look familiar? They are the usual trigonometric identities for the sum of two angles, derived here using linear algebra concepts.
Here we have focused on rotations in two dimensions. However, you can consider rotations and other geometric concepts in any number of dimensions. This is one of the major advantages of linear algebra. You can break down a difficult geometrical procedure into small steps, each corresponding to multiplication by an appropriate matrix. Then by multiplying the matrices, you can obtain a single matrix which can give you numerical information on the results of applying the given sequence of simple procedures.

Consider the following example, which incorporates a reflection as well as a rotation of vectors.
Example 6.53: Rotation Followed by a Reflection

Find the matrix of the linear transformation which is obtained by first rotating all vectors through an angle of $\pi/6$ and then reflecting through the $x$ axis.

Solution. As shown in Example 6.52, the matrix of the transformation which involves rotating through an angle of $\pi/6$ is
$$\begin{bmatrix} \cos(\pi/6) & -\sin(\pi/6) \\ \sin(\pi/6) & \cos(\pi/6) \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\sqrt{3} & -\frac{1}{2} \\ \frac{1}{2} & \frac{1}{2}\sqrt{3} \end{bmatrix}$$
The matrix for the transformation which reflects all vectors through the $x$ axis is
$$\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$$
Therefore, the matrix of the linear transformation which first rotates through $\pi/6$ and then reflects through the $x$ axis is given by
$$\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} \frac{1}{2}\sqrt{3} & -\frac{1}{2} \\ \frac{1}{2} & \frac{1}{2}\sqrt{3} \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\sqrt{3} & -\frac{1}{2} \\ -\frac{1}{2} & -\frac{1}{2}\sqrt{3} \end{bmatrix}$$
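Composition of transformations is just matrix multiplication, applied in right-to-left order. A minimal Python sketch (our own illustration, reproducing Example 6.53):

```python
import numpy as np

theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation by pi/6
F = np.array([[1.0,  0.0],
              [0.0, -1.0]])                       # reflection through the x axis

# First rotate, then reflect: the reflection matrix goes on the left
M = F @ R
print(M)
print(np.array([[np.sqrt(3)/2, -0.5],
                [-0.5, -np.sqrt(3)/2]]))          # the matrix found in Example 6.53
```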
In the following section, we look at vectors in the context of physics.

6.9.2. Vectors and Physics

Suppose you push on something. Then your push is made up of two components: how hard you push and the direction you push. This illustrates the concept of force.

Definition 6.54: Force

Force is a vector. The magnitude of this vector is a measure of how hard it is pushing. It is measured in units such as Newtons or pounds or tons. The direction of this vector is the direction in which the push is taking place.

Vectors are used to model force and other physical quantities like velocity. As with all vectors, a vector modeling force has two essential ingredients: its magnitude and its direction.

Recall the special vectors which point along the coordinate axes. These are given by
$$\vec{e}_i = \begin{bmatrix} 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \end{bmatrix}$$
where the 1 is in the $i^{th}$ slot and there are zeros in all the other positions. See the picture in the case of $\mathbb{R}^3$.
[Figure: the vectors $\vec{e}_1$, $\vec{e}_2$, $\vec{e}_3$ along the $x$, $y$, $z$ axes.]

Recall that in $\mathbb{R}^3$, we refer to these vectors as $\vec{i}$, $\vec{j}$, and $\vec{k}$.

The direction of $\vec{e}_i$ is referred to as the $i^{th}$ direction. Given a vector $\vec{u} = \begin{bmatrix} u_1 & \cdots & u_n \end{bmatrix}$, it follows that
$$\vec{u} = u_1 \vec{e}_1 + \cdots + u_n \vec{e}_n = \sum_{i=1}^{n} u_i \vec{e}_i$$

What does addition of vectors mean physically? Suppose two forces are applied to some object. Each of these would be represented by a force vector, and the two forces acting together would yield an overall force acting on the object which would also be a force vector, known as the resultant. Suppose the two vectors are $\vec{u} = \sum_{i=1}^{n} u_i \vec{e}_i$ and $\vec{v} = \sum_{i=1}^{n} v_i \vec{e}_i$. Then the vector $\vec{u}$ involves a component in the $i^{th}$ direction, $u_i \vec{e}_i$, while the component in the $i^{th}$ direction of $\vec{v}$ is $v_i \vec{e}_i$. Then the vector $\vec{u} + \vec{v}$ should have a component in the $i^{th}$ direction equal to $(u_i + v_i) \vec{e}_i$. This is exactly what is obtained when the vectors $\vec{u}$ and $\vec{v}$ are added:
$$\vec{u} + \vec{v} = \begin{bmatrix} u_1 + v_1 & \cdots & u_n + v_n \end{bmatrix} = \sum_{i=1}^{n} (u_i + v_i) \vec{e}_i$$
Thus the addition of vectors according to the rules of addition in $\mathbb{R}^n$ which were presented earlier yields the appropriate vector which duplicates the cumulative effect of all the vectors in the sum.

Now here are some applications of vector addition.
Example 6.55: The Resultant of Three Forces

There are three ropes attached to a car and three people pull on these ropes. The first exerts a force of $\vec{F}_1 = 2\vec{i} + 3\vec{j} - 2\vec{k}$ Newtons, the second exerts a force of $\vec{F}_2 = 3\vec{i} + 5\vec{j} + \vec{k}$ Newtons and the third exerts a force of $\vec{F}_3 = 5\vec{i} - \vec{j} + 2\vec{k}$ Newtons. Find the total force in the direction of $\vec{i}$.

Solution. To find the total force, we add the vectors as described above. This is given by
$$\begin{aligned}
(2\vec{i} + 3\vec{j} - 2\vec{k}) + (3\vec{i} + 5\vec{j} + \vec{k}) + (5\vec{i} - \vec{j} + 2\vec{k})
&= (2 + 3 + 5)\vec{i} + (3 + 5 - 1)\vec{j} + (-2 + 1 + 2)\vec{k} \\
&= 10\vec{i} + 7\vec{j} + \vec{k}
\end{aligned}$$
Hence, the total force is $10\vec{i} + 7\vec{j} + \vec{k}$ Newtons. Therefore, the force in the $\vec{i}$ direction is 10 Newtons.
Consider another example.

Example 6.56: Finding a Vector from a Geometric Description

An airplane flies North East at 100 miles per hour. Write this as a vector.

Solution. A picture of this situation follows.

[Figure: a vector at $45^\circ$ from the positive $x$ axis.]

Therefore, we need to find the vector $\vec{u}$ which has length 100 and direction as shown in this diagram. We can consider the vector $\vec{u}$ as the hypotenuse of a right triangle having equal sides, since the direction of $\vec{u}$ corresponds with the $45^\circ$ line. The sides, corresponding to the $\vec{i}$ and $\vec{j}$ directions, should each be of length $100/\sqrt{2}$. Therefore, the vector is
$$\vec{u} = \frac{100}{\sqrt{2}} \vec{i} + \frac{100}{\sqrt{2}} \vec{j} = \begin{bmatrix} \frac{100}{\sqrt{2}} & \frac{100}{\sqrt{2}} \end{bmatrix}$$
This example also motivates the concept of velocity.

Definition 6.57: Speed and Velocity

The speed of an object is a measure of how fast it is going. It is measured in units of length per unit time; for example, miles per hour, kilometers per minute, feet per second. The velocity is a vector having the speed as its magnitude but also specifying the direction.

Thus the velocity vector in the above example is $\frac{100}{\sqrt{2}} \vec{i} + \frac{100}{\sqrt{2}} \vec{j}$, while the speed is 100 miles per hour.
Consider the following example.

Example 6.58: Position From Velocity and Time

The velocity of an airplane is $100\vec{i} + \vec{j} + \vec{k}$, measured in kilometers per hour, and at a certain instant of time its position is $(1, 2, 1)$. Find the position of this airplane one minute later.

Solution. Here, imagine a Cartesian coordinate system in which the third component is altitude, and the first and second components are measured on a line from West to East and a line from South to North.

Consider the vector $\begin{bmatrix} 1 & 2 & 1 \end{bmatrix}$, which is the initial position vector of the airplane. As the plane moves, the position vector changes according to the velocity vector. After one minute (considered as $\frac{1}{60}$ of an hour), the airplane has moved in the $\vec{i}$ direction a distance of $100 \cdot \frac{1}{60} = \frac{5}{3}$ kilometers. In the $\vec{j}$ direction it has moved $\frac{1}{60}$ kilometer during this same time, while it moves $\frac{1}{60}$ kilometer in the $\vec{k}$ direction. Therefore, the new position vector for the airplane is
$$\begin{bmatrix} 1 & 2 & 1 \end{bmatrix} + \begin{bmatrix} \frac{5}{3} & \frac{1}{60} & \frac{1}{60} \end{bmatrix} = \begin{bmatrix} \frac{8}{3} & \frac{121}{60} & \frac{61}{60} \end{bmatrix}$$
Now consider an example which involves combining two velocities.

Example 6.59: Sum of Two Velocities

A certain river is one half kilometer wide, with a current flowing at 4 kilometers per hour from East to West. A man swims directly toward the opposite shore from the South bank of the river at a speed of 3 kilometers per hour. How far down the river does he find himself when he has swum across? How far does he end up swimming?

Solution. Consider the following picture which demonstrates the above scenario.

[Figure: the current of 4 km/h pointing West and the swimming velocity of 3 km/h pointing across the river.]

First we want to know the total time of the swim across the river. The velocity in the direction across the river is 3 kilometers per hour, and the river is $\frac{1}{2}$ kilometer wide. It follows the trip takes $1/6$ hour, or 10 minutes.

Now we can compute how far downstream he will end up. Since the river runs at a rate of 4 kilometers per hour, and the trip takes $1/6$ hour, the distance traveled downstream is given by $4 \left( \frac{1}{6} \right) = \frac{2}{3}$ kilometers.

The distance traveled by the swimmer is given by the hypotenuse of a right triangle. The two legs of the triangle are given by the distance across the river, $\frac{1}{2}$ km, and the distance traveled downstream, $\frac{2}{3}$ km. Then, using the Pythagorean Theorem, we can calculate the total distance $d$ traveled:
$$d = \sqrt{\left( \frac{2}{3} \right)^2 + \left( \frac{1}{2} \right)^2} = \frac{5}{6} \text{ km}$$
Therefore, the swimmer travels a total distance of $\frac{5}{6}$ kilometers.
6.9.3. Work

The final application we will explore is the concept of work. The physical concept of work differs from the notion of work employed in ordinary conversation. For example, suppose you were to slide a 150 pound weight off a table which is three feet high and shuffle along the floor for 50 yards, keeping the height always three feet, and then deposit this weight on another three foot high table. The physical concept of work would indicate that the force exerted by your arms did no work during this project. The reason for this definition is that even though your arms exerted considerable force on the weight, the direction of motion was at right angles to the force they exerted. The only part of a force which does work in the sense of physics is the component of the force in the direction of motion.

Work is defined to be the magnitude of the component of this force times the distance over which it acts, when the component of force points in the direction of motion. In the case where the force points in exactly the opposite direction of motion, work is given by $(-1)$ times the magnitude of this component times the distance. Thus the work done by a force on an object as the object moves from one point to another is a measure of the extent to which the force contributes to the motion. This is illustrated in the following picture, in the case where the given force contributes to the motion.

[Figure: a force $\vec{F}$ applied to an object moving from $P$ to $Q$, decomposed into $\vec{F}_{||}$ along the motion and $\vec{F}_{\perp}$ perpendicular to it, with included angle $\theta$.]

Recall that for any vector $\vec{u}$ in $\mathbb{R}^n$, we can write $\vec{u}$ as a sum of two vectors, as in
$$\vec{u} = \vec{u}_{||} + \vec{u}_{\perp}$$
For any force $\vec{F}$, we can write this force as the sum of a vector in the direction of the motion and a vector perpendicular to the motion. In other words,
$$\vec{F} = \vec{F}_{||} + \vec{F}_{\perp}$$
In the above picture, the force $\vec{F}$ is applied to an object which moves on the straight line from $P$ to $Q$. There are two vectors shown, $\vec{F}_{||}$ and $\vec{F}_{\perp}$, and the picture is intended to indicate that when you add these two vectors you get $\vec{F}$. In other words, $\vec{F} = \vec{F}_{||} + \vec{F}_{\perp}$. Notice that $\vec{F}_{||}$ acts in the direction of motion and $\vec{F}_{\perp}$ acts perpendicular to the direction of motion. Only $\vec{F}_{||}$ contributes to the work done by $\vec{F}$ on the object as it moves from $P$ to $Q$. $\vec{F}_{||}$ is called the component of the force in the direction of motion. From trigonometry, you see the magnitude of $\vec{F}_{||}$ should equal $|\vec{F}| |\cos \theta|$. Thus, since $\vec{F}_{||}$ points in the direction of the vector from $P$ to $Q$, the total work done should equal
$$|\vec{F}| \, |\vec{PQ}| \cos \theta = |\vec{F}| \, |\vec{q} - \vec{p}| \cos \theta$$
Now, suppose the included angle had been obtuse. Then the work done by the force $\vec{F}$ on the object would have been negative, because $\vec{F}_{||}$ would point in $-1$ times the direction of the motion. In this case, $\cos \theta$ would also be negative, and so it is still the case that the work done would be given by the above formula. Thus from the geometric description of the dot product given above, the work equals
$$|\vec{F}| \, |\vec{q} - \vec{p}| \cos \theta = \vec{F} \bullet (\vec{q} - \vec{p})$$
This explains the following definition.
Definition 6.60: Work Done on an Object by a Force

Let $\vec{F}$ be a force acting on an object which moves from the point $P$ to the point $Q$, which have position vectors given by $\vec{p}$ and $\vec{q}$ respectively. Then the work done on the object by the given force equals
$$\vec{F} \bullet (\vec{q} - \vec{p})$$
Consider the following example.

Example 6.61: Finding Work

Let $\vec{F} = \begin{bmatrix} 2 & 7 & -3 \end{bmatrix}$ Newtons. Find the work done by this force in moving from the point $(1, 2, 3)$ to the point $(-9, -3, 4)$ along the straight line segment joining these points, where distances are measured in meters.

Solution. First, compute the vector $\vec{q} - \vec{p}$, given by
$$\begin{bmatrix} -9 & -3 & 4 \end{bmatrix} - \begin{bmatrix} 1 & 2 & 3 \end{bmatrix} = \begin{bmatrix} -10 & -5 & 1 \end{bmatrix}$$
According to Definition 6.60, the work done is
$$\begin{bmatrix} 2 & 7 & -3 \end{bmatrix} \bullet \begin{bmatrix} -10 & -5 & 1 \end{bmatrix} = -20 + (-35) + (-3) = -58 \text{ Newton meters}$$

Note that if the force had been given in pounds and the distance had been given in feet, the units of the work would have been foot pounds. In general, work has units equal to units of force times units of length. Recall that 1 Newton meter is equal to 1 Joule. Also notice that the work done by the force can be negative, as in the above example.
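A minimal Python sketch of this computation (our own illustration, using the data of Example 6.61):

```python
import numpy as np

F = np.array([2.0, 7.0, -3.0])    # force, in Newtons
p = np.array([1.0, 2.0, 3.0])     # starting point
q = np.array([-9.0, -3.0, 4.0])   # ending point

# Work = F . (q - p), from Definition 6.60
work = np.dot(F, q - p)
print(work)   # -58.0 Newton meters
```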
6.10 Exercises
1. Find 3
_
5 1 2 3

+ 5
_
8 2 3 6

.
2. Find 7
_
6 0 4 1

+ 6
_
13 1 1 6

.
234
3. Find the vector equation for the line through (7, 6, 0) and (1, 1, 4) . Recall that to
nd the vector equation for a line through two points (a, b, c) and (d, e, f) , you nd a
direction vector of the form
_
d a e b f c

and then the equation of the line


is
_
x y z

=
_
a b c

+ t
_
d a e b f c

where t R. Then, nd the


parametric equations for this line.
4. Find parametric equations for the line through the point (7, 7, 1) with a direction vector
_
1 6 2

.
5. Parametric equations of the line are
x = t + 2
y = 6 3t
z = t 6
Find a direction vector for the line and a point on the line.
6. Find the vector equation for the line through the two points (5, 5, 1), (2, 2, 4) .
Then, nd the parametric equations. Recall that to nd the vector equation for a
line through two points (a, b, c) and (d, e, f) , you nd a direction vector of the form
_
d a e b f c

and then the equation of the line is


_
x y z

=
_
a b c

+
t
_
d a e b f c

where t R.
7. The equation of a line in two dimensions is written as y = x 5. Find parametric
equations for this line.
8. Find parametric equations for the line through (6, 5, 2) and (5, 1, 2) .
9. Find the vector equation and parametric equations for the line through the point
(7, 10, 6) with a direction vector
_
1 1 3

.
10. Parametric equations of the line are
x = 2t + 2
y = 5 4t
z = t 3
Find a direction vector for the line and a point on the line, and write the vector
equation of the line.
11. Find the vector equation and parametric equations for the line through the two points
(4, 10, 0), (1, 5, 6) .
12. Find the point on the line segment from $P = (4, 7, 5)$ to $Q = (2, 2, 3)$ which is $\frac{1}{7}$ of the way from $P$ to $Q$.
13. Suppose a triangle in $\mathbb{R}^n$ has vertices at $P_1$, $P_2$, and $P_3$. Consider the lines which are drawn from a vertex to the midpoint of the opposite side. Show these three lines intersect in a point and find the coordinates of this point.
14. Use the formula given in Proposition 6.31 to verify the Cauchy Schwarz inequality and
to show that equality occurs if and only if one of the vectors is a scalar multiple of the
other.
15. For $\vec{u}, \vec{v}$ vectors in $\mathbb{R}^3$, define the product $\vec{u} \ast \vec{v} = u_1 v_1 + 2 u_2 v_2 + 3 u_3 v_3$. Show the axioms for a dot product all hold for this product. Prove
$$\left| \vec{u} \ast \vec{v} \right| \leq \left( \vec{u} \ast \vec{u} \right)^{1/2} \left( \vec{v} \ast \vec{v} \right)^{1/2}$$
Hint: You do not need trigonometry to prove this.
16. Find the angle between the vectors $\vec{u} = \begin{bmatrix} 3 & 1 & 1 \end{bmatrix}$, $\vec{v} = \begin{bmatrix} 1 & 4 & 2 \end{bmatrix}$.

17. Find the angle between the vectors $\vec{u} = \begin{bmatrix} 1 & 2 & 1 \end{bmatrix}$, $\vec{v} = \begin{bmatrix} 1 & 2 & 7 \end{bmatrix}$.
18. Find $\mathrm{proj}_{\vec{v}}(\vec{w})$ where $\vec{w} = \begin{bmatrix} 1 & 0 & 2 \end{bmatrix}$ and $\vec{v} = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}$.

19. Find $\mathrm{proj}_{\vec{v}}(\vec{w})$ where $\vec{w} = \begin{bmatrix} 1 & 2 & 2 \end{bmatrix}$ and $\vec{v} = \begin{bmatrix} 1 & 0 & 3 \end{bmatrix}$.

20. Find $\mathrm{proj}_{\vec{v}}(\vec{w})$ where $\vec{w} = \begin{bmatrix} 1 & 2 & 2 & 1 \end{bmatrix}$ and $\vec{v} = \begin{bmatrix} 1 & 2 & 3 & 0 \end{bmatrix}$.

21. Does it make sense to speak of $\mathrm{proj}_{\vec{0}}(\vec{w})$?
22. Prove that $T(\vec{w}) = \mathrm{proj}_{\vec{v}}(\vec{w})$ is a linear transformation and find the matrix of $\mathrm{proj}_{\vec{v}}(\vec{w})$ where $\vec{v} = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}$.
23. Prove the Cauchy Schwarz inequality in $\mathbb{R}^n$ as follows. For $\vec{u}, \vec{v}$ vectors, consider
$$\left( \vec{w} - \mathrm{proj}_{\vec{v}} \vec{w} \right) \bullet \left( \vec{w} - \mathrm{proj}_{\vec{v}} \vec{w} \right) \geq 0$$
Simplify using the axioms of the dot product and then put in the formula for the projection. Notice that this expression equals 0 and you get equality in the Cauchy Schwarz inequality if and only if $\vec{w} = \mathrm{proj}_{\vec{v}} \vec{w}$. What is the geometric meaning of $\vec{w} = \mathrm{proj}_{\vec{v}} \vec{w}$?
24. Let $\vec{v}, \vec{w}, \vec{u}$ be vectors. Show that $(\vec{w} + \vec{u})_{\perp} = \vec{w}_{\perp} + \vec{u}_{\perp}$ where $\vec{w}_{\perp} = \vec{w} - \mathrm{proj}_{\vec{v}}(\vec{w})$.
25. Find $\begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix} \bullet \begin{bmatrix} 2 & 0 & 1 & 3 \end{bmatrix}$.
26. Let $\vec{a}, \vec{b}$ be vectors. Show that
$$\vec{a} \bullet \vec{b} = \frac{1}{4} \left[ \left\| \vec{a} + \vec{b} \right\|^2 - \left\| \vec{a} - \vec{b} \right\|^2 \right]$$
27. Prove from the axioms of the dot product the parallelogram identity,
$$\left\| \vec{a} + \vec{b} \right\|^2 + \left\| \vec{a} - \vec{b} \right\|^2 = 2 \left\| \vec{a} \right\|^2 + 2 \left\| \vec{b} \right\|^2$$
28. Find the matrix for $T(\vec{w}) = \mathrm{proj}_{\vec{v}}(\vec{w})$ where $\vec{v} = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}^T$.

29. Find the matrix for $T(\vec{w}) = \mathrm{proj}_{\vec{v}}(\vec{w})$ where $\vec{v} = \begin{bmatrix} 1 & 5 & 3 \end{bmatrix}^T$.

30. Find the matrix for $T(\vec{w}) = \mathrm{proj}_{\vec{v}}(\vec{w})$ where $\vec{v} = \begin{bmatrix} 1 & 0 & 3 \end{bmatrix}^T$.
31. Show that the function $T_{\vec{v}}$ defined by $T_{\vec{v}}(\vec{w}) = \vec{w} - \mathrm{proj}_{\vec{v}}(\vec{w})$ is also a linear transformation.
32. Show that
$$\left( \vec{v} - \mathrm{proj}_{\vec{u}}(\vec{v}), \vec{u} \right) = \left( \vec{v} - \mathrm{proj}_{\vec{u}}(\vec{v}) \right) \bullet \vec{u} = 0$$
and conclude every vector in $\mathbb{R}^n$ can be written as the sum of two vectors, one which is perpendicular and one which is parallel to the given vector.
33. Let $A$ be a real $m \times n$ matrix and let $\vec{u} \in \mathbb{R}^n$ and $\vec{v} \in \mathbb{R}^m$. Show $\left( A\vec{u}, \vec{v} \right)_{\mathbb{R}^m} = \left( \vec{u}, A^T \vec{v} \right)_{\mathbb{R}^n}$ where $(\cdot, \cdot)_{\mathbb{R}^k}$ denotes the dot product in $\mathbb{R}^k$. In the notation above, $A\vec{u} \bullet \vec{v} = \vec{u} \bullet A^T \vec{v}$. Use the definition of matrix multiplication to do this.
34. Use the result of Problem 33 to verify directly that $(AB)^T = B^T A^T$ without making any reference to subscripts.
35. Show that if $\vec{a} \bullet \vec{u} = 0$ for any unit vector $\vec{u}$, then $\vec{a} = \vec{0}$.
36. Find the area of the triangle determined by the three points, (1, 2, 3) , (4, 2, 0) and
(3, 2, 1) .
37. Find the area of the triangle determined by the three points, (1, 0, 3) , (4, 1, 0) and
(3, 1, 1) .
38. Find the area of the triangle determined by the three points, (1, 2, 3) , (2, 3, 4) and
(3, 4, 5) . Did something interesting happen here? What does it mean geometrically?
39. Find the area of the parallelogram determined by the vectors $\begin{bmatrix} 1 & 2 & 3 \end{bmatrix}$, $\begin{bmatrix} 3 & 2 & 1 \end{bmatrix}$.

40. Find the area of the parallelogram determined by the vectors $\begin{bmatrix} 1 & 0 & 3 \end{bmatrix}$, $\begin{bmatrix} 4 & 2 & 1 \end{bmatrix}$.
41. Find the volume of the parallelepiped determined by the vectors $\begin{bmatrix} 1 & 7 & 5 \end{bmatrix}$, $\begin{bmatrix} 1 & 2 & 6 \end{bmatrix}$, $\begin{bmatrix} 3 & 2 & \dots \end{bmatrix}$.
42. Suppose u, v, and w are three vectors whose components are all integers. Can you
conclude the volume of the parallelepiped determined from these three vectors will
always be an integer?
43. What does it mean geometrically if the box product of three vectors gives zero?
44. Using Problem 43, find an equation of a plane containing the two position vectors, $\vec{p}$ and $\vec{q}$ and the point 0. Hint: If $(x, y, z)$ is a point on this plane, the volume of the parallelepiped determined by $(x, y, z)$ and the vectors $\vec{p}, \vec{q}$ equals 0.
45. Using the notion of the box product yielding either plus or minus the volume of the parallelepiped determined by the given three vectors, show that
$$(\vec{u} \times \vec{v}) \bullet \vec{w} = \vec{u} \bullet (\vec{v} \times \vec{w})$$
In other words, the dot and the cross can be switched as long as the order of the vectors remains the same. Hint: There are two ways to do this, by the coordinate description of the dot and cross product and by geometric reasoning.
46. Is $\vec{u} \times (\vec{v} \times \vec{w}) = (\vec{u} \times \vec{v}) \times \vec{w}$? What is the meaning of $\vec{u} \times \vec{v} \times \vec{w}$? Explain. Hint: Try $\left( \vec{i} \times \vec{j} \right) \times \vec{j}$.
47. Simplify $(\vec{u} \times \vec{v}) \bullet \left[ (\vec{v} \times \vec{w}) \times (\vec{w} \times \vec{z}) \right]$.
48. Simplify $\left\| \vec{u} \times \vec{v} \right\|^2 + \left( \vec{u} \bullet \vec{v} \right)^2 - \left\| \vec{u} \right\|^2 \left\| \vec{v} \right\|^2$.
49. For $\vec{u}, \vec{v}, \vec{w}$ functions of $t$, show the product rules
$$(\vec{u} \times \vec{v})' = \vec{u}\,' \times \vec{v} + \vec{u} \times \vec{v}\,'$$
$$(\vec{u} \bullet \vec{v})' = \vec{u}\,' \bullet \vec{v} + \vec{u} \bullet \vec{v}\,'$$
50. If $\vec{u}$ is a function of $t$, and the magnitude $\left\| \vec{u}(t) \right\|$ is a constant, show from the above problem that the velocity $\vec{u}\,'$ is perpendicular to $\vec{u}$.
51. When you have a rotating rigid body with angular velocity vector $\vec{\Omega}$, then the velocity vector $\vec{v} = \vec{u}\,'$ is given by
$$\vec{v} = \vec{\Omega} \times \vec{u}$$
where $\vec{u}$ is a position vector. The acceleration is the derivative of the velocity. Show that if $\vec{\Omega}$ is a constant vector, then the acceleration vector $\vec{a} = \vec{v}\,'$ is given by the formula
$$\vec{a} = \vec{\Omega} \times \left( \vec{\Omega} \times \vec{u} \right).$$
Now simplify the expression. It turns out this is centripetal acceleration.
52. Verify directly that the coordinate description of the cross product, $\vec{u} \times \vec{v}$, has the property that it is perpendicular to both $\vec{u}$ and $\vec{v}$. Then show by direct computation that this coordinate description satisfies
$$\left\| \vec{u} \times \vec{v} \right\|^2 = \left\| \vec{u} \right\|^2 \left\| \vec{v} \right\|^2 - \left( \vec{u} \bullet \vec{v} \right)^2 = \left\| \vec{u} \right\|^2 \left\| \vec{v} \right\|^2 \left( 1 - \cos^2(\theta) \right)$$
where $\theta$ is the angle included between the two vectors. Explain why $\left\| \vec{u} \times \vec{v} \right\|$ has the correct magnitude. All that is missing is the material about the right hand rule. Verify directly from the coordinate description of the cross product that the right thing happens with regards to the vectors $\vec{i}, \vec{j}, \vec{k}$. Next verify that the distributive law holds for the coordinate description of the cross product. This gives another way to approach the cross product. First define it in terms of coordinates and then get the geometric properties from this. However, this approach does not yield the right hand rule property very easily.
53. Suppose $A$ is a $3 \times 3$ skew symmetric matrix such that $A^T = -A$. Show there exists a vector $\vec{\Omega}$ such that for all $\vec{u} \in \mathbb{R}^3$
$$A\vec{u} = \vec{\Omega} \times \vec{u}$$
Hint: Explain why, since $A$ is skew symmetric it is of the form
$$A = \begin{bmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{bmatrix}$$
where the $\omega_i$ are numbers. Then consider $\omega_1 \vec{i} + \omega_2 \vec{j} + \omega_3 \vec{k}$.
54. If $\vec{F}$ is a force and $\vec{D}$ is a vector, show $\mathrm{proj}_{\vec{D}}\left( \vec{F} \right) = \left( \|\vec{F}\| \cos\theta \right) \vec{u}$ where $\vec{u}$ is the unit vector in the direction of $\vec{D}$, where $\vec{u} = \vec{D}/\|\vec{D}\|$ and $\theta$ is the included angle between the two vectors, $\vec{F}$ and $\vec{D}$. $\|\vec{F}\| \cos\theta$ is sometimes called the component of the force, $\vec{F}$ in the direction, $\vec{D}$.
55. A boy drags a sled for 100 feet along the ground by pulling on a rope which is 20
degrees from the horizontal with a force of 40 pounds. How much work does this force
do?
56. A girl drags a sled for 200 feet along the ground by pulling on a rope which is 30
degrees from the horizontal with a force of 20 pounds. How much work does this force
do?
57. A large dog drags a sled for 300 feet along the ground by pulling on a rope which is 45
degrees from the horizontal with a force of 20 pounds. How much work does this force
do?
58. How much work does it take to slide a crate 20 meters along a loading dock by pulling on it with a 200 Newton force at an angle of $30^{\circ}$ from the horizontal? Express your answer in Newton meters.
59. An object moves 10 meters in the direction of $\vec{j}$. There are two forces acting on this object, $\vec{F}_1 = \vec{i} + \vec{j} + 2\vec{k}$, and $\vec{F}_2 = 5\vec{i} + 2\vec{j} - 6\vec{k}$. Find the total work done on the object by the two forces. Hint: You can take the work done by the resultant of the two forces or you can add the work done by each force. Why?
60. An object moves 10 meters in the direction of $\vec{j} + \vec{i}$. There are two forces acting on this object, $\vec{F}_1 = \vec{i} + 2\vec{j} + 2\vec{k}$, and $\vec{F}_2 = 5\vec{i} + 2\vec{j} - 6\vec{k}$. Find the total work done on the object by the two forces. Hint: You can take the work done by the resultant of the two forces or you can add the work done by each force. Why?
61. An object moves 20 meters in the direction of $\vec{k} + \vec{j}$. There are two forces acting on this object, $\vec{F}_1 = \vec{i} + \vec{j} + 2\vec{k}$, and $\vec{F}_2 = \vec{i} + 2\vec{j} - 6\vec{k}$. Find the total work done on the object by the two forces. Hint: You can take the work done by the resultant of the two forces or you can add the work done by each force.
62. Let $\vec{u} = \begin{bmatrix} a & b \end{bmatrix}$ be a unit vector in $\mathbb{R}^2$. Find the matrix which reflects all vectors across this vector.

[Figure: the line through the origin in the direction of $\vec{u}$.]

Hint: Notice that $\begin{bmatrix} a & b \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \end{bmatrix}$ for some $\theta$. First rotate through $-\theta$. Next reflect through the $x$ axis. Finally rotate through $\theta$.
63. Let $\vec{u}$ be a unit vector. Show the linear transformation determined by the matrix $I - 2\vec{u}\vec{u}^T$ preserves all distances and satisfies
$$\left( I - 2\vec{u}\vec{u}^T \right)^T \left( I - 2\vec{u}\vec{u}^T \right) = I$$
This matrix is called a Householder reflection. More generally, any matrix $A$ which satisfies $A^T A = A A^T = I$ is called an orthogonal matrix. Show the linear transformation determined by an orthogonal matrix always preserves the length of a vector in $\mathbb{R}^n$. Hint: First show that for any matrix $A$,
$$\left\langle A\vec{u}, \vec{v} \right\rangle = \left\langle \vec{u}, A^T \vec{v} \right\rangle$$
A quick numerical check of the Householder properties is sketched below.
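The following NumPy sketch (ours, with made-up numbers) illustrates the claims in Problem 63:

```python
import numpy as np

u = np.array([1.0, 2.0, 2.0])
u = u / np.linalg.norm(u)             # make u a unit vector

H = np.eye(3) - 2.0 * np.outer(u, u)  # the Householder matrix I - 2 u u^T

x = np.array([3.0, -1.0, 4.0])        # an arbitrary test vector
print(np.allclose(H.T @ H, np.eye(3)))                       # True: H^T H = I
print(np.isclose(np.linalg.norm(H @ x), np.linalg.norm(x)))  # True: length preserved
```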
64. Suppose $\|\vec{u}\| = \|\vec{v}\|$ for $\vec{u}, \vec{v} \in \mathbb{R}^n$. Find the matrix of an orthogonal transformation $A$, (see Problem 63) which has the property that $A\vec{u} = \vec{v}$ and $A\vec{v} = \vec{u}$. Show
$$A = I - 2 \frac{(\vec{u} - \vec{v})(\vec{u} - \vec{v})^T}{\|\vec{u} - \vec{v}\|^2}$$
is such a matrix.
65. Let $\vec{u}$ be a fixed vector. The function $T_{\vec{u}}$ defined by $T_{\vec{u}}\vec{v} = \vec{u} + \vec{v}$ has the effect of translating all vectors by adding $\vec{u} \neq \vec{0}$. Show this is not a linear transformation. Explain why it is not possible to represent $T_{\vec{u}}$ in $\mathbb{R}^3$ by multiplying by a $3 \times 3$ matrix.
66. In spite of Problem 65 we can represent both translations and rotations by matrix multiplication at the expense of using higher dimensions. This is done by the homogeneous coordinates. Consider the case for $\mathbb{R}^3$ where most interest in this is found. For each vector $\vec{v} = \begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix}^T$, consider the vector in $\mathbb{R}^4$ given by $\begin{bmatrix} v_1 & v_2 & v_3 & 1 \end{bmatrix}^T$. Explain the product given by
$$\begin{bmatrix} 1 & 0 & 0 & a_1 \\ 0 & 1 & 0 & a_2 \\ 0 & 0 & 1 & a_3 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ 1 \end{bmatrix}$$
Describe how to consider both rotations and translations all at once by forming appropriate $4 \times 4$ matrices. A small computational sketch of this idea follows below.
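Here is a small sketch (ours; the helper names are made up) of the homogeneous-coordinate idea in Problem 66:

```python
import numpy as np

def translation(a):
    """4x4 homogeneous matrix translating R^3 vectors by a."""
    T = np.eye(4)
    T[:3, 3] = a
    return T

def rotation_z(theta):
    """4x4 homogeneous matrix rotating about the z axis by theta."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.eye(4)
    R[:2, :2] = [[c, -s], [s, c]]
    return R

v = np.array([1.0, 2.0, 3.0, 1.0])   # the augmented vector [v1 v2 v3 1]^T
M = translation([5.0, 0.0, 0.0]) @ rotation_z(np.pi / 2)
print(M @ v)   # rotates first, then translates; the last entry stays 1
```

Multiplying the $4 \times 4$ translation matrix by $\begin{bmatrix} v_1 & v_2 & v_3 & 1 \end{bmatrix}^T$ produces $\begin{bmatrix} v_1 + a_1 & v_2 + a_2 & v_3 + a_3 & 1 \end{bmatrix}^T$, exactly the translation which Problem 65 showed cannot be achieved with a $3 \times 3$ matrix.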
67. The wind blows from the South at 20 kilometers per hour and an airplane which flies at 600 kilometers per hour in still air is heading East. Find the velocity of the airplane and its location after two hours.

68. The wind blows from the West at 30 kilometers per hour and an airplane which flies at 400 kilometers per hour in still air is heading North East. Find the velocity of the airplane and its position after two hours.

69. The wind blows from the North at 10 kilometers per hour. An airplane which flies at 300 kilometers per hour in still air is supposed to go to the point whose coordinates are at (100, 100). In what direction should the airplane fly?
70. Three forces act on an object. Two are $\begin{bmatrix} 3 & 1 & 1 \end{bmatrix}$ and $\begin{bmatrix} 1 & 3 & 4 \end{bmatrix}$ Newtons. Find the third force if the object is not to move.

71. Three forces act on an object. Two are $\begin{bmatrix} 6 & 3 & 3 \end{bmatrix}$ and $\begin{bmatrix} 2 & 1 & 3 \end{bmatrix}$ Newtons. Find the third force if the total force on the object is to be $\begin{bmatrix} 7 & 1 & 3 \end{bmatrix}$ Newtons.
72. A river flows West at the rate of $b$ miles per hour. A boat can move at the rate of 8 miles per hour. Find the smallest value of $b$ such that it is not possible for the boat to proceed directly across the river.
73. The wind blows from West to East at a speed of 50 miles per hour and an airplane
which travels at 400 miles per hour in still air is heading North West. What is the
velocity of the airplane relative to the ground? What is the component of this velocity
in the direction North?
74. The wind blows from West to East at a speed of 60 miles per hour and an airplane travels at 100 miles per hour in still air. How many degrees West of North should the airplane head in order to travel exactly North?
75. The wind blows from West to East at a speed of 50 miles per hour and an airplane which travels at 400 miles per hour in still air is heading somewhat West of North so that, with the wind, it is flying due North. It uses 30.0 gallons of gas every hour. If it has to travel 600.0 miles due North, how much gas will it use in flying to its destination?
76. An airplane is flying due north at 500.0 miles per hour but it is not actually going due North because there is a wind which is pushing the airplane due east at 40.0 miles per hour. After one hour, the plane starts flying $30^{\circ}$ East of North. Assuming the plane starts at $(0, 0)$, where is it after 2 hours? Let North be the direction of the positive $y$ axis and let East be the direction of the positive $x$ axis.
77. City A is located at the origin $(0, 0)$ while city B is located at $(300, 500)$ where distances are in miles. An airplane flies at 250 miles per hour in still air. This airplane wants to fly from city A to city B but the wind is blowing in the direction of the positive $y$ axis at a speed of 50 miles per hour. Find a unit vector such that if the plane heads in this direction, it will end up at city B having flown the shortest possible distance. How long will it take to get there?
78. A certain river is one half mile wide with a current flowing at 3.0 miles per hour from East to West. A man takes a boat directly toward the opposite shore from the South bank of the river at a speed of 5.0 miles per hour. How far down the river does he find himself when he has crossed? How far does he end up traveling?
79. A certain river is one half mile wide with a current flowing at 2 miles per hour from East to West. A man can swim at 3 miles per hour in still water. In what direction should he swim in order to travel directly across the river? What would the answer to this problem be if the river flowed at 3 miles per hour and the man could swim only at the rate of 2 miles per hour?
80. Three forces are applied to a point which does not move. Two of the forces are $2\vec{i} + 2\vec{j} - 6\vec{k}$ Newtons and $8\vec{i} + 8\vec{j} + 3\vec{k}$ Newtons. Find the third force.
81. The total force acting on an object is to be $4\vec{i} + 2\vec{j} - 3\vec{k}$ Newtons. A force of $3\vec{i} - \vec{j} + 8\vec{k}$ Newtons is being applied. What other force should be applied to achieve the desired total force?
82. A bird flies from its nest 8 km in the direction $\frac{5\pi}{6}$ north of east where it stops to rest on a tree. It then flies 1 km in the direction due southeast and lands atop a telephone pole. Place an $xy$ coordinate system so that the origin is the bird's nest, and the positive $x$ axis points east and the positive $y$ axis points north. Find the displacement vector from the nest to the telephone pole.
83. A car is stuck in the mud. There is a cable stretched tightly from this car to a tree
which is 20 feet long. A person grasps the cable in the middle and pulls with a force
of 150 pounds perpendicular to the stretched cable. The center of the cable moves 2
feet and remains still. What is the tension in the cable? The tension in the cable is
the force exerted on this point by the part of the cable nearer the car as well as the
force exerted on this point by the part of the cable nearer the tree. Assume the cable
cannot be lengthened so the car moves slightly in moving the center of the cable.
84. A car is stuck in the mud. There is a cable stretched tightly from this car to a tree
which is 20 feet long. A person grasps the cable in the middle and pulls with a force
of 150 pounds perpendicular to the stretched cable. The center of the cable moves 6
feet and remains still. What is the tension in the cable? The tension in the cable is
the force exerted on this point by the part of the cable nearer the car as well as the
force exerted on this point by the part of the cable nearer the tree. Assume the cable
does lengthen and the car does not move at all.
85. A box is placed on a strong plank of wood and then one end of the plank of wood is gradually raised. It is found that the box does not move till the angle of inclination of the wood equals 60 degrees at which angle the box begins to slide. What is the coefficient of static friction?
86. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/3$.

87. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/4$.

88. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $-\pi/3$.

89. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $2\pi/3$.

90. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/12$. Hint: Note that $\pi/12 = \pi/3 - \pi/4$.

91. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $2\pi/3$ and then reflects across the $x$ axis.

92. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/3$ and then reflects across the $x$ axis.

93. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/4$ and then reflects across the $x$ axis.

94. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/6$ and then reflects across the $x$ axis followed by a reflection across the $y$ axis.

95. Find the matrix for the linear transformation which reflects every vector in $\mathbb{R}^2$ across the $x$ axis and then rotates every vector through an angle of $\pi/4$.

96. Find the matrix for the linear transformation which reflects every vector in $\mathbb{R}^2$ across the $y$ axis and then rotates every vector through an angle of $\pi/4$.

97. Find the matrix for the linear transformation which reflects every vector in $\mathbb{R}^2$ across the $x$ axis and then rotates every vector through an angle of $\pi/6$.

98. Find the matrix for the linear transformation which reflects every vector in $\mathbb{R}^2$ across the $y$ axis and then rotates every vector through an angle of $\pi/6$.

99. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $5\pi/12$. Hint: Note that $5\pi/12 = 2\pi/3 - \pi/4$.

100. Find the matrix of the linear transformation which rotates every vector in $\mathbb{R}^3$ counter clockwise about the $z$ axis when viewed from the positive $z$ axis through an angle of $30^{\circ}$ and then reflects through the $xy$ plane.
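Answers to these rotation and reflection problems can be checked numerically. Here is a rough NumPy sketch (ours, not the text's); note the order of the factors, since the transformation applied first sits rightmost in the matrix product:

```python
import numpy as np

def rotation(theta):
    """Standard 2x2 matrix rotating the plane counterclockwise through theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

reflect_x = np.array([[1.0, 0.0], [0.0, -1.0]])  # reflection across the x axis

# Problem 91: rotate through 2*pi/3, then reflect across the x axis.
A = reflect_x @ rotation(2 * np.pi / 3)
print(np.round(A, 4))
```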
A. Some Prerequisite Topics
The topics presented in this section are important concepts in mathematics and therefore
should be examined.
A.1 Sets And Set Notation
A set is a collection of things called elements. For example $\{1, 2, 3, 8\}$ would be a set consisting of the elements 1, 2, 3, and 8. To indicate that 3 is an element of $\{1, 2, 3, 8\}$, it is customary to write $3 \in \{1, 2, 3, 8\}$. We can also indicate when an element is not in a set, by writing $9 \notin \{1, 2, 3, 8\}$ which says that 9 is not an element of $\{1, 2, 3, 8\}$. Sometimes a rule specifies a set. For example you could specify a set as all integers larger than 2. This would be written as $S = \{x \in \mathbb{Z} : x > 2\}$. This notation says: $S$ is the set of all integers, $x$, such that $x > 2$.

Suppose $A$ and $B$ are sets with the property that every element of $A$ is an element of $B$. Then we say that $A$ is a subset of $B$. For example, $\{1, 2, 3, 8\}$ is a subset of $\{1, 2, 3, 4, 5, 8\}$. In symbols, we write $\{1, 2, 3, 8\} \subseteq \{1, 2, 3, 4, 5, 8\}$. It is sometimes said that $A$ is contained in $B$ or even $B$ contains $A$. The same statement about the two sets may also be written as $\{1, 2, 3, 4, 5, 8\} \supseteq \{1, 2, 3, 8\}$.
We can also talk about the union of two sets, which we write as $A \cup B$. This is the set consisting of everything which is an element of at least one of the sets, $A$ or $B$. As an example of the union of two sets, consider $\{1, 2, 3, 8\} \cup \{3, 4, 7, 8\} = \{1, 2, 3, 4, 7, 8\}$. This set is made up of the numbers which are in at least one of the two sets.

In general
$$A \cup B = \{x : x \in A \text{ or } x \in B\}$$
Notice that an element which is in both $A$ and $B$ is also in the union, as well as elements which are in only one of $A$ or $B$.

Another important set is the intersection of two sets $A$ and $B$, written $A \cap B$. This set consists of everything which is in both of the sets. Thus $\{1, 2, 3, 8\} \cap \{3, 4, 7, 8\} = \{3, 8\}$ because 3 and 8 are those elements the two sets have in common. In general,
$$A \cap B = \{x : x \in A \text{ and } x \in B\}$$
If $A$ and $B$ are two sets, $A \setminus B$ denotes the set of things which are in $A$ but not in $B$. Thus
$$A \setminus B = \{x \in A : x \notin B\}$$
For example, if $A = \{1, 2, 3, 8\}$ and $B = \{3, 4, 7, 8\}$, then $A \setminus B = \{1, 2, 3, 8\} \setminus \{3, 4, 7, 8\} = \{1, 2\}$.
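These operations map directly onto Python's built-in sets, which gives a quick way to experiment (our own illustration):

```python
A = {1, 2, 3, 8}
B = {3, 4, 7, 8}

print(3 in A)       # True:  3 is an element of A
print(9 not in A)   # True:  9 is not an element of A
print(A | B)        # union:        {1, 2, 3, 4, 7, 8}
print(A & B)        # intersection: {3, 8}
print(A - B)        # difference:   {1, 2}
print({1, 2, 3, 8} <= {1, 2, 3, 4, 5, 8})  # subset test: True
```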
A special set which is very important in mathematics is the empty set denoted by $\emptyset$. The empty set, $\emptyset$, is defined as the set which has no elements in it. It follows that the empty set is a subset of every set. This is true because if it were not so, there would have to exist a set $A$, such that $\emptyset$ has something in it which is not in $A$. However, $\emptyset$ has nothing in it and so it must be that $\emptyset \subseteq A$.
We can also use brackets to denote sets which are intervals of numbers. Let $a$ and $b$ be real numbers. Then

$[a, b] = \{x \in \mathbb{R} : a \leq x \leq b\}$

$[a, b) = \{x \in \mathbb{R} : a \leq x < b\}$

$(a, b) = \{x \in \mathbb{R} : a < x < b\}$

$(a, b] = \{x \in \mathbb{R} : a < x \leq b\}$

$[a, \infty) = \{x \in \mathbb{R} : x \geq a\}$

$(-\infty, a] = \{x \in \mathbb{R} : x \leq a\}$

These sorts of sets of real numbers are called intervals. The two points $a$ and $b$ are called endpoints, or bounds, of the interval. In particular, $a$ is the lower bound while $b$ is the upper bound of the above intervals, where applicable. Other intervals such as $(-\infty, b)$ are defined by analogy to what was just explained. In general, the curved parenthesis, $($, indicates the end point is not included in the interval, while the square parenthesis, $[$, indicates this end point is included. The reason that there will always be a curved parenthesis next to $\infty$ or $-\infty$ is that these are not real numbers and cannot be included in the interval in the way a real number can.
To illustrate the use of this notation relative to intervals consider three examples of
inequalities. Their solutions will be written in the interval notation just described.
Example A.1: Solving an Inequality
Solve the inequality $2x + 4 \leq x - 8$.

Solution. We need to find $x$ such that $2x + 4 \leq x - 8$. Solving for $x$, we see that $x \leq -12$ is the answer. This is written in terms of an interval as $(-\infty, -12]$.
Consider the following example.
Example A.2: Solving an Inequality
Solve the inequality $(x + 1)(2x - 3) \geq 0$.

Solution. We need to find $x$ such that $(x + 1)(2x - 3) \geq 0$. The solution is given by $x \leq -1$ or $x \geq \frac{3}{2}$. Therefore, any $x$ which fits into either of these intervals gives a solution. In terms of set notation this is denoted by $(-\infty, -1] \cup [\frac{3}{2}, \infty)$.
Consider one last example.
Example A.3: Solving an Inequality
Solve the inequality $x(x + 2) \geq -4$.

Solution. This inequality is true for any value of $x$ where $x$ is a real number. We can write the solution as $\mathbb{R}$ or $(-\infty, \infty)$.
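A rough numeric spot-check (ours) of the solution set in Example A.2:

```python
import numpy as np

xs = np.linspace(-5.0, 5.0, 100001)
holds = (xs + 1) * (2 * xs - 3) >= 0

# The inequality should hold exactly on (-inf, -1] and [3/2, inf).
print(xs[holds & (xs < 0)].max())  # approximately -1.0
print(xs[holds & (xs > 0)].min())  # approximately  1.5
```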
In the next section, we examine another important mathematical concept.
A.2 Well Ordering And Induction
Mathematical induction and well ordering are two extremely important principles in math.
They are often used to prove significant things which would be hard to prove otherwise.
Definition A.4: Well Ordered
A set is well ordered if every nonempty subset $S$ contains a smallest element $z$ having the property that $z \leq x$ for all $x \in S$.

In particular, the set of natural numbers defined as
$$\mathbb{N} = \{1, 2, \cdots\}$$
is well ordered.
Consider the following proposition.
Proposition A.5: Well Ordered Sets
Any set of integers larger than a given number is well ordered.
This proposition claims that if a set has a lower bound which is a real number, then this
set is well ordered.
Further, this proposition implies the principle of mathematical induction. The symbol $\mathbb{Z}$ denotes the set of all integers. Note that if $a$ is an integer, then there are no integers between $a$ and $a + 1$.
Theorem A.6: Mathematical Induction
A set $S \subseteq \mathbb{Z}$, having the property that $a \in S$ and $n + 1 \in S$ whenever $n \in S$, contains all integers $x \in \mathbb{Z}$ such that $x \geq a$.

Proof: Let $T$ consist of all integers larger than or equal to $a$ which are not in $S$. The theorem will be proved if $T = \emptyset$. If $T \neq \emptyset$ then by the well ordering principle, there would have to exist a smallest element of $T$, denoted as $b$. It must be the case that $b > a$ since by definition, $a \notin T$. Thus $b \geq a + 1$, and so $b - 1 \geq a$ and $b - 1 \notin S$ because if $b - 1 \in S$, then $b - 1 + 1 = b \in S$ by the assumed property of $S$. Therefore, $b - 1 \in T$ which contradicts the choice of $b$ as the smallest element of $T$. ($b - 1$ is smaller.) Since a contradiction is obtained by assuming $T \neq \emptyset$, it must be the case that $T = \emptyset$ and this says that every integer at least as large as $a$ is also in $S$.
Mathematical induction is a very useful device for proving theorems about the integers.
The procedure is as follows.
Procedure A.7: Proof by Mathematical Induction
Suppose $S_n$ is a statement which is a function of the number $n$, for $n = 1, 2, \cdots$, and we wish to show that $S_n$ is true for all $n \geq 1$. To do so using mathematical induction, use the following steps.

1. Base Case: Show $S_1$ is true.

2. Assume $S_n$ is true for some $n$, which is the induction hypothesis. Then, using this assumption, show that $S_{n+1}$ is true.

Proving these two steps shows that $S_n$ is true for all $n = 1, 2, \cdots$.
We can use this procedure to solve the following examples.
Example A.8: Proving by Induction
Prove by induction that $\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$.

Solution. By Procedure A.7, we first need to show that this statement is true for $n = 1$. When $n = 1$, the statement says that
$$\sum_{k=1}^{1} k^2 = \frac{1(1+1)(2(1)+1)}{6} = \frac{6}{6} = 1$$
The sum on the left hand side also equals 1, so this equation is true for $n = 1$.

Now suppose this formula is valid for some $n \geq 1$ where $n$ is an integer. Hence, the following equation is true.
$$\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6} \qquad (1.1)$$
We want to show that this is true for $n + 1$.

Suppose we add $(n+1)^2$ to both sides of equation (1.1).
$$\sum_{k=1}^{n+1} k^2 = \sum_{k=1}^{n} k^2 + (n+1)^2 = \frac{n(n+1)(2n+1)}{6} + (n+1)^2$$
The step going from the first to the second line is based on the assumption that the formula is true for $n$. Now simplify the expression in the second line,
$$\frac{n(n+1)(2n+1)}{6} + (n+1)^2$$
This equals
$$(n+1)\left( \frac{n(2n+1)}{6} + (n+1) \right)$$
and
$$\frac{n(2n+1)}{6} + (n+1) = \frac{6(n+1) + 2n^2 + n}{6} = \frac{(n+2)(2n+3)}{6}$$
Therefore,
$$\sum_{k=1}^{n+1} k^2 = \frac{(n+1)(n+2)(2n+3)}{6} = \frac{(n+1)((n+1)+1)(2(n+1)+1)}{6}$$
showing the formula holds for $n + 1$ whenever it holds for $n$. This proves the formula by mathematical induction. In other words, this formula is true for all $n = 1, 2, \cdots$.
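A one-line numeric check (ours) of the formula in Example A.8:

```python
# Compare the direct sum of squares with the closed form for a few n.
def sum_of_squares(n):
    return sum(k * k for k in range(1, n + 1))

for n in (1, 2, 10, 100):
    assert sum_of_squares(n) == n * (n + 1) * (2 * n + 1) // 6
print("formula verified for the sampled values of n")
```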
Consider another example.
Example A.9: Proving an Inequality by Induction
Show that for all $n \in \mathbb{N}$, $\frac{1}{2} \cdot \frac{3}{4} \cdots \frac{2n-1}{2n} < \frac{1}{\sqrt{2n+1}}$.

Solution. Again we will use the procedure given in Procedure A.7 to prove that this statement is true for all $n$. Suppose $n = 1$. Then the statement says
$$\frac{1}{2} < \frac{1}{\sqrt{3}}$$
which is true.

Suppose then that the inequality holds for $n$. In other words,
$$\frac{1}{2} \cdot \frac{3}{4} \cdots \frac{2n-1}{2n} < \frac{1}{\sqrt{2n+1}}$$
is true.

Now multiply both sides of this inequality by $\frac{2n+1}{2n+2}$. This yields
$$\frac{1}{2} \cdot \frac{3}{4} \cdots \frac{2n-1}{2n} \cdot \frac{2n+1}{2n+2} < \frac{1}{\sqrt{2n+1}} \cdot \frac{2n+1}{2n+2} = \frac{\sqrt{2n+1}}{2n+2}$$
The theorem will be proved if this last expression is less than $\frac{1}{\sqrt{2n+3}}$. This happens if and only if
$$\left( \frac{1}{\sqrt{2n+3}} \right)^2 = \frac{1}{2n+3} > \frac{2n+1}{(2n+2)^2}$$
which occurs if and only if $(2n+2)^2 > (2n+3)(2n+1)$ and this is clearly true which may be seen from expanding both sides. This proves the inequality.
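A quick numeric sanity check (ours) of the inequality in Example A.9:

```python
import math

prod = 1.0
for n in range(1, 21):
    prod *= (2 * n - 1) / (2 * n)          # extend the product to (2n-1)/(2n)
    assert prod < 1 / math.sqrt(2 * n + 1)
print("inequality holds for n = 1, ..., 20")
```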
Let's review the process just used. If $S$ is the set of integers at least as large as 1 for which the formula holds, the first step was to show $1 \in S$ and then that whenever $n \in S$, it follows $n + 1 \in S$. Therefore, by the principle of mathematical induction, $S$ contains $[1, \infty) \cap \mathbb{Z}$, all positive integers. In doing an inductive proof of this sort, the set $S$ is normally not mentioned. One just verifies the steps above.
Index
∈, 245
∪, 245
∩, 245
row-echelon form, 21
reduced row-echelon form, 21
algorithm, 23
Abel's formula, 135
adjugate, 121
back substitution, 17
base case, 248
basic solution, 34
box product, 224
Cauchy Schwarz inequality, 211
characteristic equation, 152
characteristic value, 151
classical adjoint, 121
Cofactor Expansion, 110
cofactor matrix, 121
complex eigenvalues, 165
complex numbers
absolute value, 140
addition, 138
argument, 141
conjugate, 139
conjugate of a product, 146
modulus, 141
multiplication, 138
polar form, 141
roots, 143
standard form, 137
triangle inequality, 140
component of a force, 233, 239
consistent system, 13
Cramer's rule, 126
cross product, 219, 220
area of parallelogram, 222
coordinate description, 220
geometric description, 220
De Moivre's theorem, 143
determinant, 107
cofactor, 109
expanding along row or column, 110
matrix inverse formula, 121
minor, 108
product, 116
row operations, 114
diagonalizable, 160, 161
direction vector, 207
distance formula, 201
properties, 201
dot product, 209
properties, 210
eigenvalue, 151
eigenvalues
calculating, 153
eigenvector, 151
eigenvectors
calculating, 153
elementary matrix, 73
inverse, 76
elementary operations, 14
empty set, 246
field axioms, 138
force, 229
Fundamental Theorem of Algebra, 137
Gauss-Jordan Elimination, 29
Gaussian Elimination, 29
general solution, 92
solution space, 91
homogeneous coordinates, 240
Householder matrix, 240
hyper-planes, 11
idempotent, 99
included angle, 213
inconsistent system, 13
induction hypothesis, 248
inner product, 209
intersection, 245
intervals
notation, 246
inverses and determinants, 123
kernel, 90
Kirchhoff's law, 45, 46
Kronecker symbol, 65
Laplace expansion, 110
leading entry, 20
linear combination, 35
linear transformation, 81
matrix, 86
rotation, 226
lines
parametric equation, 208
symmetric form, 208
vector equation, 206
main diagonal, 113
Markov matrices, 169
mathematical induction, 247, 248
matrix, 18, 47
addition, 49
augmented matrix, 18, 19
coecient matrix, 18
commutative, 62
components of a matrix, 48
conformable, 56
diagonal matrix, 160
dimension, 18
entries of a matrix, 48
equality, 49
equivalent, 31
finding the inverse, 69
identity, 65
inverse, 65
invertible, 65
lower triangular, 113
main diagonal, 160
orthogonal, 131
properties of addition, 50
properties of scalar multiplication, 52
properties of transpose, 63
raising to a power, 167
rank, 35
rotation, 227
scalar multiplication, 51
self adjoint, 185
skew symmetric, 64
square, 48
symmetric, 64, 185
transpose, 63
upper triangular, 113
matrix multiplication, 56
ijth entry, 59
properties, 62
vectors, 54
matrix transformation, 82
migration matrix, 169
multiplicity, 152
Newton, 229
nilpotent, 131
nontrivial solution, 32
null space, 90
one to one, 82
onto, 82
orthogonal matrix, 240
parallelepiped, 224
volume, 224
parameter, 27
particular solution, 90
permutation matrices, 73
pivot column, 22
pivot position, 22
polynomials
factoring, 144
position vector, 194
reflection
across a given vector, 240
resultant, 230
right handed system, 219
row operations, 20, 114
scalar, 12
scalar product, 209
scalars, 196
set notation, 245
similar matrix, 157
skew lines, 10
solution space, 90
spectrum, 151
speed, 231
system of equations, 12
homogeneous, 12
matrix form, 55
solution set, 12
vector form, 53
The form AX, 83
triangle inequality, 212
trigonometry
sum of two angles, 228
trivial solution, 32
union, 245
unit vector, 203
vector, 193
addition, 195
addition, geometric meaning, 194
components, 193, 194
corresponding unit vector, 203
length, 202
orthogonal, 215
perpendicular, 215
points and vectors, 194
projection, 217
scalar multiplication, 196
subtraction, 196
vectors, 52
column, 52
row vector, 52
velocity, 231
well ordered, 247
work, 234
Wronskian, 134
zero matrix, 48
zero vector, 196