
LINEAR ALGEBRA

Paul Dawkins

Table of Contents

Preface
Outline

Systems of Equations and Matrices
  Introduction
  Systems of Equations
  Solving Systems of Equations
  Matrices
  Matrix Arithmetic & Operations
  Properties of Matrix Arithmetic and the Transpose
  Inverse Matrices and Elementary Matrices
  Finding Inverse Matrices
  Special Matrices
  LU-Decomposition
  Systems Revisited

Determinants
  Introduction
  The Determinant Function
  Properties of Determinants
  The Method of Cofactors
  Using Row Reduction to Compute Determinants
  Cramer's Rule

Euclidean n-Space
  Introduction
  Vectors
  Dot Product & Cross Product
  Euclidean n-Space
  Linear Transformations
  Examples of Linear Transformations

Vector Spaces
  Introduction
  Vector Spaces
  Subspaces
  Span
  Linear Independence
  Basis and Dimension
  Change of Basis
  Fundamental Subspaces
  Inner Product Spaces
  Orthonormal Basis
  Least Squares
  QR-Decomposition
  Orthogonal Matrices

Eigenvalues and Eigenvectors
  Introduction
  Review of Determinants
  Eigenvalues and Eigenvectors
  Diagonalization

Preface

Here are my online notes for my Linear Algebra course that I teach here at Lamar University.
Despite the fact that these are my “class notes”, they should be accessible to anyone wanting to
learn Linear Algebra or needing a refresher.

These notes do assume that the reader has a good working knowledge of basic Algebra. This set of notes is fairly self-contained, but enough algebra-type problems (arithmetic and occasionally solving equations) can show up that not having a good background in Algebra can cause the occasional problem.

Here are a couple of warnings to my students who may be here to get a copy of what happened on
a day that you missed.

1. Because I wanted to make this a fairly complete set of notes for anyone wanting to learn
Linear Algebra, I have included some material that I do not usually have time to cover in
class, and because this changes from semester to semester it is not noted here. You will
need to ask one of your fellow classmates to see if there is something in these notes that
wasn't covered in class.

2. In general I try to work problems in class that are different from my notes. However,
with a Linear Algebra course, while I can make up problems off the top of my head
there is no guarantee that they will work out nicely or the way I want them to. So,
because of that, my class work will tend to follow these notes fairly closely as far as worked
problems go. With that being said, I will, on occasion, work problems off the top of my
head when I can to provide more examples than just those in my notes. Also, I often
don't have time in class to work all of the problems in the notes and so you will find that
some sections contain problems that weren't worked in class due to time restrictions.

3. Sometimes questions in class will lead down paths that are not covered here. I try to
anticipate as many of the questions as possible in writing these notes up, but the reality is
that I can’t anticipate all the questions. Sometimes a very good question gets asked in
class that leads to insights that I’ve not included here. You should always talk to
someone who was in class on the day you missed and compare these notes to their notes
and see what the differences are.

4. This is somewhat related to the previous three items, but is important enough to merit its
own item. THESE NOTES ARE NOT A SUBSTITUTE FOR ATTENDING CLASS!!
Using these notes as a substitute for class is liable to get you in trouble. As already noted,
not everything in these notes is covered in class, and material or insights not in these
notes are often covered in class.

Outline

Here is a listing and brief description of the material in this set of notes.

Systems of Equations and Matrices


Systems of Equations – In this section we’ll introduce most of the basic topics
that we’ll need in order to solve systems of equations including augmented
matrices and row operations.
Solving Systems of Equations – Here we will look at the Gaussian Elimination
and Gauss-Jordan Method of solving systems of equations.
Matrices – We will introduce many of the basic ideas and properties involved in
the study of matrices.
Matrix Arithmetic & Operations – In this section we’ll take a look at matrix
addition, subtraction and multiplication. We’ll also take a quick look at the
transpose and trace of a matrix.
Properties of Matrix Arithmetic – We will take a more in-depth look at many
of the properties of matrix arithmetic and the transpose.
Inverse Matrices and Elementary Matrices – Here we’ll define the inverse and
take a look at some of its properties. We’ll also introduce the idea of Elementary
Matrices.
Finding Inverse Matrices – In this section we’ll develop a method for finding
inverse matrices.
Special Matrices – We will introduce Diagonal, Triangular and Symmetric
matrices in this section.
LU-Decompositions – In this section we'll introduce the LU-Decomposition, a
way of "factoring" certain kinds of matrices.
Systems Revisited – Here we will revisit solving systems of equations. We will
take a look at how inverse matrices and LU-Decompositions can help with the
solution process. We’ll also take a look at a couple of other ideas in the solution
of systems of equations.

Determinants
The Determinant Function – We will give the formal definition of the
determinant in this section. We’ll also give formulas for computing determinants
of $2 \times 2$ and $3 \times 3$ matrices.
Properties of Determinants – Here we will take a look at quite a few properties
of the determinant function. Included are formulas for determinants of triangular
matrices.
The Method of Cofactors – In this section we’ll take a look at the first of two
methods for computing determinants of general matrices.
Using Row Reduction to Find Determinants – Here we will take a look at the
second method for computing determinants in general.
Cramer’s Rule – We will take a look at yet another method for solving systems.
This method will involve the use of determinants.

Euclidean n-space
Vectors – In this section we’ll introduce vectors in 2-space and 3-space as well
as some of the important ideas about them.

Dot Product & Cross Product – Here we’ll look at the dot product and the
cross product, two important products for vectors. We’ll also take a look at an
application of the dot product.
Euclidean n-Space – We’ll introduce the idea of Euclidean n-space in this
section and extend many of the ideas of the previous two sections.
Linear Transformations – In this section we’ll introduce the topic of linear
transformations and look at many of their properties.
Examples of Linear Transformations – We’ll take a look at quite a few
examples of linear transformations in this section.

Vector Spaces
Vector Spaces – In this section we’ll formally define vectors and vector spaces.
Subspaces – Here we will be looking at vector spaces that live inside of other
vector spaces.
Span – The concept of the span of a set of vectors will be investigated in this
section.
Linear Independence – Here we will take a look at what it means for a set of
vectors to be linearly independent or linearly dependent.
Basis and Dimension – We’ll be looking at the idea of a set of basis vectors and
the dimension of a vector space.
Change of Basis – In this section we will see how to change the set of basis
vectors for a vector space.
Fundamental Subspaces – Here we will take a look at some of the fundamental
subspaces of a matrix, including the row space, column space and null space.
Inner Product Spaces – We will be looking at a special kind of vector space in
this section, as well as define the inner product.
Orthonormal Basis – In this section we will develop and use the Gram-Schmidt
process for constructing an orthogonal/orthonormal basis for an inner product
space.
Least Squares – In this section we’ll take a look at an application of some of the
ideas that we will be discussing in this chapter.
QR-Decomposition – Here we will take a look at the QR-Decomposition for a
matrix and how it can be used in the least squares process.
Orthogonal Matrices – We will take a look at a special kind of matrix, the
orthogonal matrix, in this section.

Eigenvalues and Eigenvectors


Review of Determinants – In this section we’ll do a quick review of
determinants.
Eigenvalues and Eigenvectors – Here we will take a look at the main section in
this chapter. We’ll be looking at the concept of Eigenvalues and Eigenvectors.
Diagonalization – We’ll be looking at diagonalizable matrices in this section.

Systems of Equations and Matrices

Introduction
We will start this chapter off by looking at the application of matrices that almost every book on
Linear Algebra starts off with, solving systems of linear equations. Looking at systems of
equations will allow us to start getting used to the notation and some of the basic manipulations
of matrices that we’ll be using often throughout these notes.

Once we’ve looked at solving systems of linear equations we’ll move into the basic arithmetic of
matrices and basic matrix properties. We’ll also take a look at a couple of other ideas about
matrices that have some nice applications to the solution to systems of equations.

One word of warning about this chapter, and in fact about this complete set of notes for that
matter: we'll start out in the first section or two doing a lot of the details in the problems, but
towards the end of this chapter and into the remaining chapters we will leave many of the details
to you to check. We start off by doing lots of details to make sure you are comfortable working
with matrices and the various operations involving them. However, we will eventually assume
that you’ve become comfortable with the details and can check them on your own. At that point
we will quit showing many of the details.

Here is a listing of the topics in this chapter.

Systems of Equations – In this section we’ll introduce most of the basic topics that we’ll need in
order to solve systems of equations including augmented matrices and row operations.

Solving Systems of Equations – Here we will look at the Gaussian Elimination and Gauss-
Jordan Method of solving systems of equations.

Matrices – We will introduce many of the basic ideas and properties involved in the study of
matrices.

Matrix Arithmetic & Operations – In this section we’ll take a look at matrix addition,
subtraction and multiplication. We’ll also take a quick look at the transpose and trace of a matrix.

Properties of Matrix Arithmetic – We will take a more in-depth look at many of the properties
of matrix arithmetic and the transpose.

Inverse Matrices and Elementary Matrices – Here we’ll define the inverse and take a look at
some of its properties. We’ll also introduce the idea of Elementary Matrices.

Finding Inverse Matrices – In this section we’ll develop a method for finding inverse matrices.

Special Matrices – We will introduce Diagonal, Triangular and Symmetric matrices in this
section.

LU-Decompositions – In this section we'll introduce the LU-Decomposition, a way of "factoring" certain kinds of matrices.

Systems Revisited – Here we will revisit solving systems of equations. We will take a look at
how inverse matrices and LU-Decompositions can help with the solution process. We’ll also take
a look at a couple of other ideas in the solution of systems of equations.

Systems of Equations
Let’s start off this section with the definition of a linear equation. Here are a couple of examples
of linear equations.
$$6x - 8y + 10z = 3 \qquad\qquad 7x_1 - \frac{5}{9}x_2 = -1$$
In the second equation note the use of the subscripts on the variables. This is a common
notational device that will be used fairly extensively here. It is especially useful when we get into
the general case(s) and we won’t know how many variables (often called unknowns) there are in
the equation.

So, just what makes these two equations linear? There are several main points to notice. First,
the unknowns only appear to the first power and there aren’t any unknowns in the denominator of
a fraction. Also notice that there are no products and/or quotients of unknowns. All of these
ideas are required in order for an equation to be a linear equation. Unknowns only occur in
numerators, they are only to the first power and there are no products or quotients of unknowns.

The most general linear equation is,


$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b \tag{1}$$
where there are n unknowns, $x_1, x_2, \ldots, x_n$, and $a_1, a_2, \ldots, a_n, b$ are all known numbers.

Next we need to take a look at the solution set of a single linear equation. A solution set (or
often just solution) for (1) is a set of numbers $t_1, t_2, \ldots, t_n$ so that if we set $x_1 = t_1$, $x_2 = t_2$, ...,
$x_n = t_n$ then (1) will be satisfied. By satisfied we mean that if we plug these numbers into the left
side of (1) and do the arithmetic we will get b as an answer.

The first thing to notice about the solution set to a single linear equation that contains at least two
variables with non-zero coefficients is that we will have an infinite number of solutions. We will
also see that while there are infinitely many possible solutions they are all related to each other in
some way.

Note that if there is one or fewer variables with non-zero coefficients then there will be a single
solution or no solutions, depending upon the value of b.

Let’s find the solution sets for the two linear equations given at the start of this section.

Example 1 Find the solution set for each of the following linear equations.
  (a) $7x_1 - \frac{5}{9}x_2 = -1$
  (b) $6x - 8y + 10z = 3$

Solution
(a) $7x_1 - \frac{5}{9}x_2 = -1$

The first thing that we'll do here is solve the equation for one of the two unknowns. It doesn't
matter which one we solve for, but we'll usually try to pick the one that will mean the least
amount of work (or at least the simpler work). In this case it will probably be slightly easier to
solve for $x_1$ so let's do that.
$$\begin{aligned} 7x_1 - \tfrac{5}{9}x_2 &= -1 \\ 7x_1 &= \tfrac{5}{9}x_2 - 1 \\ x_1 &= \tfrac{5}{63}x_2 - \tfrac{1}{7} \end{aligned}$$

Now, what this tells us is that if we have a value for $x_2$ then we can determine a corresponding
value for $x_1$. Since we have a single linear equation there is nothing to restrict our choice of $x_2$
and so we'll let $x_2$ be any number. We will usually write this as $x_2 = t$, where t is any
number. Note that there is nothing special about the t; this is just the letter that I usually use in
these cases. Others often use s for this letter and, of course, you could choose it to be just about
anything as long as it's not a letter representing one of the unknowns in the equation (x in this
case).

Once we’ve “chosen” x2 we’ll write the general solution set as follows,
5 1
x1 = t- x2 = t
63 7

So, just what does this tell us as far as actual number solutions go? We'll choose any value of t
and plug it in to get a pair of numbers $x_1$ and $x_2$ that will satisfy the equation. For instance,
picking a couple of values of t completely at random gives,

$$t = 0: \quad x_1 = -\frac{1}{7}, \quad x_2 = 0$$
$$t = 27: \quad x_1 = \frac{5}{63}(27) - \frac{1}{7} = 2, \quad x_2 = 27$$

We can easily check that these are in fact solutions by plugging them back into the equation.
$$t = 0: \quad 7\left(-\frac{1}{7}\right) - \frac{5}{9}(0) = -1$$
$$t = 27: \quad 7(2) - \frac{5}{9}(27) = -1$$

So, in each case, when we plugged in the values we got for $x_1$ and $x_2$ we got $-1$ out of the
equation, as we were supposed to.

Note that since there are an infinite number of choices for t there are in fact an infinite number of
possible solutions to this linear equation.

(b) $6x - 8y + 10z = 3$
We’ll do this one with a little less detail since it works in essentially the same manner. The fact
that we now have three unknowns will change things slightly but not overly much. We will first
solve the equation for one of the variables and again it won't matter which one we choose to solve
for.
$$\begin{aligned} 10z &= 3 - 6x + 8y \\ z &= \frac{3}{10} - \frac{3}{5}x + \frac{4}{5}y \end{aligned}$$

In this case we will need to know values for both x and y in order to get a value for z. As with the
first case, there is nothing in this problem to restrict our choices of x and y. We can therefore let
them be any numbers. In this case we'll choose $x = t$ and $y = s$. Note that we chose different
letters here since there is no reason to think that both x and y will have exactly the same value
(although it is possible for them to have the same value).

The solution set to this linear equation is then,
$$x = t \qquad\qquad y = s \qquad\qquad z = \frac{3}{10} - \frac{3}{5}t + \frac{4}{5}s$$

So, if we choose any values for t and s we can get a set of number solutions as follows.
$$x = 0, \quad y = -2: \qquad z = \frac{3}{10} - \frac{3}{5}(0) + \frac{4}{5}(-2) = -\frac{13}{10}$$
$$x = -\frac{3}{2}, \quad y = 5: \qquad z = \frac{3}{10} - \frac{3}{5}\left(-\frac{3}{2}\right) + \frac{4}{5}(5) = \frac{26}{5}$$

As with the first part, if we take either set of three numbers we can plug them into the equation to
verify that the equation will be satisfied. We'll do one of them and leave the other to you to
check.
$$6\left(-\frac{3}{2}\right) - 8(5) + 10\left(\frac{26}{5}\right) = -9 - 40 + 52 = 3$$

The variables that we got to choose values for ($x_2$ in the first example and x and y in the
second) are sometimes called free variables.
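Since free variables can be chosen at will, a quick numerical check makes this concrete. Here is a minimal sketch (in Python, used purely for illustration since these notes contain no code; the function name is ours) that builds solutions to Example 1(b) from random choices of the free variables t and s and verifies that each one satisfies $6x - 8y + 10z = 3$.

    import random

    def solution_from_free_vars(t, s):
        # x and y are the free variables; z comes from solving the equation for z:
        # z = 3/10 - (3/5)x + (4/5)y
        x, y = t, s
        z = 3/10 - (3/5) * x + (4/5) * y
        return x, y, z

    for _ in range(5):
        x, y, z = solution_from_free_vars(random.uniform(-10, 10),
                                          random.uniform(-10, 10))
        # Plug back into the left side; it should always come out to 3.
        assert abs(6*x - 8*y + 10*z - 3) < 1e-9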

We now need to start talking about the actual topic of this section, systems of linear equations. A
system of linear equations is nothing more than a collection of two or more linear equations.
Here are some examples of systems of linear equations.

$$\begin{aligned} 2x + 3y &= 9 \\ x - 2y &= -13 \end{aligned}
\qquad\qquad
\begin{aligned} 4x_1 - 5x_2 + x_3 &= 9 \\ -x_1 + 10x_3 &= -2 \\ 7x_1 - x_2 - 4x_3 &= 5 \end{aligned}
\qquad\qquad
\begin{aligned} 6x_1 + x_2 &= 9 \\ -5x_1 - 3x_2 &= 7 \\ 3x_1 - 10x_2 &= -4 \end{aligned}$$

$$\begin{aligned} x_1 - x_2 + x_3 - x_4 + x_5 &= 1 \\ 3x_1 + 2x_2 - x_4 + 9x_5 &= 0 \\ 7x_1 + 10x_2 + 3x_3 + 6x_4 - 9x_5 &= -7 \end{aligned}$$

As we can see from these examples, systems of equations can have any number of equations and/or
unknowns. The system may have the same number of equations as unknowns, more equations
than unknowns, or fewer equations than unknowns.

A solution set to a system with n unknowns, $x_1, x_2, \ldots, x_n$, is a set of numbers, $t_1, t_2, \ldots, t_n$, so
that if we set $x_1 = t_1$, $x_2 = t_2$, ..., $x_n = t_n$ then all of the equations in the system will be satisfied.
Or, in other words, the set of numbers $t_1, t_2, \ldots, t_n$ is a solution to each of the individual equations
in the system.

For example, $x = -3$, $y = 5$ is a solution to the first system listed above,
$$\begin{aligned} 2x + 3y &= 9 \\ x - 2y &= -13 \end{aligned} \tag{2}$$
because,
$$2(-3) + 3(5) = 9 \qquad \& \qquad (-3) - 2(5) = -13$$
However, $x = -15$, $y = -1$ is not a solution to the system because,
$$2(-15) + 3(-1) = -33 \neq 9 \qquad \& \qquad (-15) - 2(-1) = -13$$
We can see from these calculations that $x = -15$, $y = -1$ is NOT a solution to the first equation,
but it IS a solution to the second equation. Since this pair of numbers is not a solution to both of
the equations in (2) it is not a solution to the system. The fact that it's a solution to one of them
isn't material. In order to be a solution to the system the set of numbers must be a solution to
each and every equation in the system.
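To emphasize the point, here is a tiny hedged sketch (Python, illustrative only; the function name is ours) that tests a candidate pair against every equation of system (2). A pair counts as a solution of the system only if it satisfies all of the equations.

    def is_solution(x, y):
        # System (2): both equations must hold simultaneously.
        return 2*x + 3*y == 9 and x - 2*y == -13

    print(is_solution(-3, 5))    # True: satisfies both equations
    print(is_solution(-15, -1))  # False: satisfies only the second equation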

It is completely possible as well that a system will not have a solution at all. Consider the
following system.
$$\begin{aligned} x - 4y &= 10 \\ x - 4y &= -3 \end{aligned} \tag{3}$$

It is clear (hopefully) that this system of equations can't possibly have a solution. A solution to
this system would have to be a pair of numbers x and y that satisfies both equations. However,
since the left sides are identical, this would mean that we'd need an x and a y so that $x - 4y$ is
both 10 and $-3$ for the exact same pair of numbers. This clearly can't happen and so (3) does not
have a solution.

Likewise, it is possible for a system to have more than one solution, although we do need to be
careful here as we’ll see. Let’s take a look at the following system.
$$\begin{aligned} -2x + y &= 8 \\ 8x - 4y &= -32 \end{aligned} \tag{4}$$

We’ll leave it to you to verify that all of the following are four of the infinitely many solutions to
the first equation in this system.
$$x = 0,\ y = 8 \qquad x = -3,\ y = 2 \qquad x = -4,\ y = 0 \qquad x = 5,\ y = 18$$
Recall from our work above that there will be infinitely many solutions to a single linear
equation.

We’ll also leave it to you to verify that these four solutions are also four of the infinitely many
solutions to the second equation in (4).

Let’s investigate this a little more. Let’s just find the solution to the first equation (we’ll worry
about the second equation in a second). Following the work we did in Example 1 we can see that
the infinitely many solutions to the first equation in (4) are
x=t y = 2 t + 8, t is any number

Now, if we also find just the solutions to the second equation in (4) we get
x=t y = 2 t + 8, t is any number

These are exactly the same! So, this means that if we have an actual numeric solution (found by
choosing t above…) to the first equation it will be guaranteed to also be a solution to the second
equation and so will be a solution to the system (4). This means that we in fact have infinitely
many solutions to (4).

Let’s take a look at the three systems we’ve been working with above in a little more detail. This
will allow us to see a couple of nice facts about systems.

Since each of the equations in (2),(3), and (4) are linear in two unknowns (x and y) the graph of
each of these equations is that of a line. Let’s graph the pair of equations from each system on
the same graph and see what we get.

[Graphs of the equations in systems (2), (3), and (4) omitted.]

From the graph of the equations for system (2) we can see that the two lines intersect at the point
$(-3, 5)$ and notice that, as a point, this is the solution to the system as well. In other words, in
this case the solution to the system of two linear equations and two unknowns is simply the
intersection point of the two lines.

Note that this idea is validated in the solution to systems (3) and (4). System (3) has no solution
and we can see from the graph of these equations that the two lines are parallel and hence will
never intersect. In system (4) we had infinitely many solutions and the graph of these equations
shows us that they are in fact the same line, or in some ways they “intersect” at an infinite number
of points.

Now, to this point we’ve been looking at systems of two equations with two unknowns but some
of the ideas we saw above can be extended to general systems of n equations with m unknowns.

First, there is a nice geometric interpretation to the solution of systems with equations in two or
three unknowns. Note that the number of equations we've got won't matter; the interpretation
will be the same.


If we’ve got a system of linear equations in two unknowns then the solution to the system
represents the point(s) where all (not some but ALL) the lines will intersect. If there is no
solution then the lines given by the equations in the system will not intersect at a single point.
Note in the no-solution case that if there are more than two equations it may be that any two of
the equations will intersect, but there won't be a single point where all of the lines intersect.

If we’ve got a system of linear equations in three unknowns then the graphs of the equations will
be planes in 3D-space and the solution to the system will represent the point(s) where all the
planes will intersect. If there is no solution then there are no point(s) where all the planes given
by the equations of the system will intersect. As with lines, it may be in this case that any two of
the planes intersect, but there won't be any point where all of the planes intersect.

On a side note, we should point out that lines can intersect at a single point or, if the equations give
the same line, we can think of them as intersecting at infinitely many points. Planes can intersect
at a point or along a line (and so will have infinitely many intersection points), and if the equations
give the same plane we can think of the planes as intersecting at infinitely many places.

We need to be a little careful about the infinitely many intersection points case. When we’re
dealing with equations in two unknowns and there are infinitely many solutions it means that the
equations in the system all give the same line. However, when dealing with equations in three
unknowns and we’ve got infinitely many solutions we can have one of two cases. Either we’ve
got planes that intersect along a line, or the equations will give the same plane.

For systems of equations in more than three variables we can’t graph them so we can’t talk about
a “geometric” interpretation, but we can still say that a solution to such a system will represent
the point(s) where all the equations will “intersect” even if we can’t visualize such an intersection
point.

From the geometric interpretation of the solution to two equations in two unknowns we know that
we have one of three possible solutions. We will have either no solution (the lines are parallel),
one solution (the lines intersect at a single point) or infinitely many solutions (the equations are
the same line). There is simply no other possible number of solutions since two lines that
intersect will either intersect exactly once or will be the same line. It turns out that this is in fact
the case for a general system.

Theorem 1 Given a system of n equations and m unknowns there will be one of three
possibilities for solutions to the system.
1. There will be no solution.
2. There will be exactly one solution.
3. There will be infinitely many solutions.

If there is no solution to the system we call the system inconsistent and if there is at least one
solution to the system we call it consistent.

Now that we’ve got some of the basic ideas about systems taken care of we need to start thinking
about how to use linear algebra to solve them. Actually that’s not quite true. We’re not going to
do any solving until the next section. In this section we just want to get some of the basic
notation and ideas involved in the solving process out of the way before we actually start trying to
solve them.

We’re going to start off with a simplified way of writing the system of equations. For this we will
need the following general system of n equations and m unknowns.

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m &= b_2 \\ &\ \,\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nm}x_m &= b_n \end{aligned} \tag{5}$$

In this system the unknowns are $x_1, x_2, \ldots, x_m$ and the $a_{ij}$ and $b_i$ are known numbers. Note as
well how we've subscripted the coefficients of the unknowns (the $a_{ij}$). The first subscript, i,
denotes the equation that the coefficient is in and the second subscript, j, denotes the unknown that
it multiplies. For instance, $a_{36}$ would be the coefficient of $x_6$ in the third equation.

Any system of equations can be written as an augmented matrix. A matrix is just a rectangular
array of numbers and we’ll be looking at these in great detail in this course so don’t worry too
much at this point about what a matrix is. Here is the augmented matrix for the general system in
(5).
$$\left[\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1m} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2m} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} & b_n \end{array}\right]$$

Each row of the augmented matrix consists of the coefficients and the constant on the right of the
equal sign from a given equation in the system. The first row is for the first equation, the second
row is for the second equation, etc. Likewise, each of the first m columns of the matrix consists of
the coefficients of the unknowns. The first column contains the coefficients of $x_1$, the second
column contains the coefficients of $x_2$, etc. The final column (the (m+1)st column) contains all the
constants on the right of the equal sign. Note that the "augmented" part of the name arises because
we tack the $b_i$'s onto the matrix. If we don't tack those on and we just have
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}$$
then we call this the coefficient matrix for the system.
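To make the bookkeeping concrete, here is a minimal sketch (Python with NumPy, for illustration only) that stores a system as a coefficient matrix plus a vector of constants and tacks the constants on as an extra column to form the augmented matrix. The numbers are taken from Example 2 below.

    import numpy as np

    # Coefficients a_ij: row i is equation i, column j is the unknown x_j.
    A = np.array([[ 3, -10,  6, -1],
                  [ 1,   0,  9, -5],
                  [-4,   1, -9,  2]], dtype=float)

    # Constants b_i from the right-hand sides.
    b = np.array([3, -12, 7], dtype=float)

    # The augmented matrix: tack the b_i's on as a final column.
    augmented = np.hstack([A, b.reshape(-1, 1)])
    print(augmented)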


Example 2 Write down the augmented matrix for the following system.
$$\begin{aligned} 3x_1 - 10x_2 + 6x_3 - x_4 &= 3 \\ x_1 + 9x_3 - 5x_4 &= -12 \\ -4x_1 + x_2 - 9x_3 + 2x_4 &= 7 \end{aligned}$$

Solution
There really isn't too much to do here other than write down the augmented matrix.
$$\left[\begin{array}{cccc|c} 3 & -10 & 6 & -1 & 3 \\ 1 & 0 & 9 & -5 & -12 \\ -4 & 1 & -9 & 2 & 7 \end{array}\right]$$
Notice that the second equation did not contain an $x_2$ and so we consider its coefficient to be
zero.

Note as well that given an augmented matrix we can always go back to a system of equations.

Example 3 For the given augmented matrix write down the corresponding system of equations.
$$\left[\begin{array}{cc|c} 4 & -1 & 1 \\ -5 & -8 & 4 \\ 9 & 2 & -2 \end{array}\right]$$
Solution
Since we know each row corresponds to an equation, we have three equations in the system.
Also, the first two columns represent coefficients of unknowns and so we'll have two unknowns,
while the third column consists of the constants to the right of the equal sign. Here's the system
that corresponds to this augmented matrix.
$$\begin{aligned} 4x_1 - x_2 &= 1 \\ -5x_1 - 8x_2 &= 4 \\ 9x_1 + 2x_2 &= -2 \end{aligned}$$

There is one final topic that we need to discuss in this section before we move on to actually
solving systems of equations with linear algebra techniques. In the next section, where we will
actually be solving systems, our main tools will be the three elementary row operations. Each of
these operations will operate on a row (which shouldn't be too surprising given the name...) in
the augmented matrix, and since each row in the augmented matrix corresponds to an equation
these operations have equivalent operations on equations.

Here are the three row operations, their equivalent equation operations, as well as the notation that
we'll be using to denote each of them.

Row Operation                       Equation Operation                        Notation
Multiply row i by the constant c    Multiply equation i by the constant c    $cR_i$
Interchange rows i and j            Interchange equations i and j            $R_i \leftrightarrow R_j$
Add c times row i to row j          Add c times equation i to equation j     $R_j + cR_i$


The first two operations are fairly self-explanatory. The third is also a fairly simple operation,
however there are a couple of things that we need to make clear about it. First, in this
operation only row (equation) j actually changes. Even though we are multiplying row (equation)
i by c, that is done in our heads and the results of this multiplication are added to row (equation) j.
Also, when we say that we add c times a row to another row we really mean that we add
corresponding entries of each row.
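Here is a short hedged sketch (Python/NumPy, again purely illustrative; the function names are ours) of the three elementary row operations as in-place functions on an augmented matrix. Note that in the third operation only row j changes, exactly as described above.

    import numpy as np

    def scale_row(M, i, c):
        # cR_i : multiply row i by the constant c.
        M[i] = c * M[i]

    def swap_rows(M, i, j):
        # R_i <-> R_j : interchange rows i and j.
        M[[i, j]] = M[[j, i]]

    def add_multiple_of_row(M, j, c, i):
        # R_j + cR_i : add c times row i to row j (only row j changes).
        M[j] = M[j] + c * M[i]

For instance, applying scale_row(M, 0, -3) to the matrix of Example 4 below reproduces the result of part (a). (Rows are 0-indexed in the code but 1-indexed in the notation $R_1$.)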

Let’s take a look at some examples of these operations in action.

Example 4 Perform each of the indicated row operations on the given augmented matrix.
$$\left[\begin{array}{ccc|c} 2 & 4 & -1 & -3 \\ 6 & -1 & -4 & 10 \\ 7 & 1 & -1 & 5 \end{array}\right]$$
  (a) $-3R_1$
  (b) $\frac{1}{2}R_2$
  (c) $R_1 \leftrightarrow R_3$
  (d) $R_2 + 5R_3$
  (e) $R_1 - 3R_2$

Solution
In each of these we will actually perform both the row and equation operation to illustrate that
they are actually the same operation and that the new augmented matrix we get is in fact the
correct one. For reference purposes, the system corresponding to the augmented matrix given for
this problem is,
$$\begin{aligned} 2x_1 + 4x_2 - x_3 &= -3 \\ 6x_1 - x_2 - 4x_3 &= 10 \\ 7x_1 + x_2 - x_3 &= 5 \end{aligned}$$

Note that at each part we will go back to the original augmented matrix and/or system of
equations to perform the operation. In other words, we won’t be using the results of the previous
part as a starting point for the current operation.

(a) $-3R_1$
Okay, in this case we're going to multiply the first row (equation) by $-3$. This means that we will
multiply each element of the first row by $-3$, or each of the coefficients of the first equation by $-3$.
Here is the result of this operation.
$$\left[\begin{array}{ccc|c} -6 & -12 & 3 & 9 \\ 6 & -1 & -4 & 10 \\ 7 & 1 & -1 & 5 \end{array}\right] \quad\Leftrightarrow\quad \begin{aligned} -6x_1 - 12x_2 + 3x_3 &= 9 \\ 6x_1 - x_2 - 4x_3 &= 10 \\ 7x_1 + x_2 - x_3 &= 5 \end{aligned}$$


(b) $\frac{1}{2}R_2$
This is similar to the first one. We will multiply each element of the second row by one-half, or
each coefficient of the second equation by one-half. Here are the results of this operation.
$$\left[\begin{array}{ccc|c} 2 & 4 & -1 & -3 \\ 3 & -\frac{1}{2} & -2 & 5 \\ 7 & 1 & -1 & 5 \end{array}\right] \quad\Leftrightarrow\quad \begin{aligned} 2x_1 + 4x_2 - x_3 &= -3 \\ 3x_1 - \tfrac{1}{2}x_2 - 2x_3 &= 5 \\ 7x_1 + x_2 - x_3 &= 5 \end{aligned}$$
Do not get excited about the fraction showing up. Fractions are going to be a fact of life with
much of the work that we're going to be doing so get used to seeing them.

Note that often in cases like this we will say that we divided the second row by 2 instead of
multiplied by one-half.

(c) $R_1 \leftrightarrow R_3$
In this case we're just going to interchange the first and third row or equation.
$$\left[\begin{array}{ccc|c} 7 & 1 & -1 & 5 \\ 6 & -1 & -4 & 10 \\ 2 & 4 & -1 & -3 \end{array}\right] \quad\Leftrightarrow\quad \begin{aligned} 7x_1 + x_2 - x_3 &= 5 \\ 6x_1 - x_2 - 4x_3 &= 10 \\ 2x_1 + 4x_2 - x_3 &= -3 \end{aligned}$$

(d) $R_2 + 5R_3$
Okay, we now need to work an example of the third row operation. In this case we will add 5
times the third row (equation) to the second row (equation).

So, for the row operation, in our heads we will multiply the third row by 5 and then add each
entry of the result to the corresponding entry in the second row.

Here are the individual computations for this operation.
$$\begin{aligned} \text{1st entry:} &\quad 6 + (5)(7) = 41 \\ \text{2nd entry:} &\quad -1 + (5)(1) = 4 \\ \text{3rd entry:} &\quad -4 + (5)(-1) = -9 \\ \text{4th entry:} &\quad 10 + (5)(5) = 35 \end{aligned}$$

For the corresponding equation operation we will multiply the third equation by 5 to get,
$$35x_1 + 5x_2 - 5x_3 = 25$$
then add this to the second equation to get,
$$41x_1 + 4x_2 - 9x_3 = 35$$

Putting all this together, and remembering that it's the second row (equation) that we're actually
changing here, gives
$$\left[\begin{array}{ccc|c} 2 & 4 & -1 & -3 \\ 41 & 4 & -9 & 35 \\ 7 & 1 & -1 & 5 \end{array}\right] \quad\Leftrightarrow\quad \begin{aligned} 2x_1 + 4x_2 - x_3 &= -3 \\ 41x_1 + 4x_2 - 9x_3 &= 35 \\ 7x_1 + x_2 - x_3 &= 5 \end{aligned}$$

It is important to remember that when multiplying the third row (equation) by 5 we are doing it in
our head and don't actually change the third row (equation).

(e) $R_1 - 3R_2$
In this case we'll not go into the detail that we did in the previous part. Most of these types of
operations are done almost completely in our head and so we'll do that here as well so we can
start getting used to it.

In this part we are going to subtract 3 times the second row (equation) from the first row
(equation). Here are the results of this operation.
$$\left[\begin{array}{ccc|c} -16 & 7 & 11 & -33 \\ 6 & -1 & -4 & 10 \\ 7 & 1 & -1 & 5 \end{array}\right] \quad\Leftrightarrow\quad \begin{aligned} -16x_1 + 7x_2 + 11x_3 &= -33 \\ 6x_1 - x_2 - 4x_3 &= 10 \\ 7x_1 + x_2 - x_3 &= 5 \end{aligned}$$

It is important when doing this work in our heads to be careful of minus signs. In operations such
as this one there are often a lot of them and it is easy to lose track of one or more when you get in a
hurry.

Okay, we’ve now got most of the basics down that we’ll need to start solving systems of linear
equations using linear algebra techniques so it’s time to move onto the next section.


Solving Systems of Equations


In this section we are going to take a look at using linear algebra techniques to solve a system of
linear equations. Once we have a couple of definitions out of the way we’ll see that the process is
a fairly simple one. Well, it’s fairly simple to write down the process anyway. Applying the
process is fairly simple as well but for large systems can take quite a few steps. So, let’s get the
definitions out of the way.

A matrix (any matrix, not just an augmented matrix) is said to be in reduced row-echelon form
if it satisfies all four of the following conditions.

1. If there are any rows of all zeros then they are at the bottom of the matrix.
2. If a row does not consist of all zeros then its first non-zero entry (i.e. the leftmost non-
zero entry) is a 1. This 1 is called a leading 1.
3. In any two successive rows, neither of which consists of all zeroes, the leading 1 of the
lower row is to the right of the leading 1 of the higher row.
4. If a column contains a leading 1 then all the other entries of that column are zero.

A matrix (again any matrix) is said to be in row-echelon form if it satisfies items 1 – 3 of the
reduced row-echelon form definition.

Notice from these definitions that a matrix that is in reduced row-echelon form is also in row-
echelon form while a matrix in row-echelon form may or may not be in reduced row-echelon
form.
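The four conditions translate almost directly into a checker. Below is a hedged sketch (Python/NumPy, illustrative only; the function name is ours) that tests a matrix for row-echelon form and, optionally, for reduced row-echelon form by also enforcing condition 4.

    import numpy as np

    def is_row_echelon(M, reduced=False):
        last_lead = -1
        seen_zero_row = False
        for i in range(M.shape[0]):
            nz = np.nonzero(M[i])[0]
            if nz.size == 0:
                seen_zero_row = True       # a row of all zeros
                continue
            if seen_zero_row:
                return False               # condition 1: zero rows must be at the bottom
            j = nz[0]
            if M[i, j] != 1:
                return False               # condition 2: first non-zero entry is a leading 1
            if j <= last_lead:
                return False               # condition 3: leading 1s move right going down
            last_lead = j
            if reduced and np.count_nonzero(M[:, j]) != 1:
                return False               # condition 4: leading 1 alone in its column
        return True

For example, the first matrix of Example 1 below passes is_row_echelon but fails with reduced=True, since there are non-zero entries above some of its leading 1s.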

Example 1 The following matrices are all in row-echelon form.
$$\begin{bmatrix} 1 & -6 & 9 & 1 & 0 \\ 0 & 0 & 1 & -4 & -5 \\ 0 & 0 & 0 & 1 & 2 \end{bmatrix} \qquad\qquad \begin{bmatrix} 1 & 0 & 5 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}$$

$$\begin{bmatrix} 1 & -8 & 10 & 5 & -3 \\ 0 & 1 & 13 & 9 & 12 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

None of the matrices in the previous example are in reduced row-echelon form. The entries that
are preventing these matrices from being in reduced row-echelon form are the non-zero entries
lying above a leading 1 (the original notes highlight them in red). In order for these matrices to be
in reduced row-echelon form all of those entries would need to be zeroes.

Notice that we don't count the entries above the 1 in the fifth column of the third matrix.
Since that 1 is not a leading 1 (i.e. not the leftmost non-zero entry of its row) we don't need the
numbers above it to be zero in order for the matrix to be in reduced row-echelon form.


Example 2 The following matrices are all in reduced row-echelon form.
$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \qquad \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \qquad \begin{bmatrix} 0 & 1 & 0 & -8 \\ 0 & 0 & 1 & 5 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

$$\begin{bmatrix} 1 & -7 & 10 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \qquad \begin{bmatrix} 1 & 9 & 0 & 0 & -2 \\ 0 & 0 & 1 & 0 & 16 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

In the second matrix on the first row, all of the entries are zero. This is perfectly
acceptable, so don't worry about it. That matrix is in reduced row-echelon form; the fact that
it doesn't have any non-zero entries does not change that, since it satisfies the conditions.
Also, in the second matrix of the second row notice that the last column does not have zeroes
above the 1 in that column. That is perfectly acceptable since the 1 in that column is not a
leading 1 for its row.

Notice from Examples 1 and 2 that the only real difference between row-echelon form and
reduced row-echelon form is that a matrix in row-echelon form is only required to have zeroes
below a leading 1, while a matrix in reduced row-echelon form must have zeroes both below and
above a leading 1.

Okay, let’s now start thinking about how to use linear algebra techniques to solve systems of
linear equations. The process is actually quite simple. To solve a system of equations we will
first write down the augmented matrix for the system. We will then use elementary row
operations to reduce the augmented matrix to either row-echelon form or to reduced row-echelon
form. Any further work that we’ll need to do will depend upon where we stop.

If we go all the way to reduced row-echelon form then in many cases we will not need to do any
further work to get the solution and in those times where we do need to do more work we will
generally not need to do much more work. Reducing the augmented matrix to reduced row-
echelon form is called Gauss-Jordan Elimination.

If we stop at row-echelon form we will have a little more work to do in order to get the solution,
but it is generally fairly simple arithmetic. Reducing the augmented matrix to row-echelon form
and then stopping is called Gaussian Elimination.
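For checking hand computations, a computer algebra system can help. As one hedged illustration (SymPy is not used anywhere else in these notes), Matrix.rref() returns the reduced row-echelon form along with the pivot columns, i.e. the end point of Gauss-Jordan Elimination.

    from sympy import Matrix

    # Augmented matrix for the system in Example 3 below.
    M = Matrix([[-2, 1, -1,  4],
                [ 1, 2,  3, 13],
                [ 3, 0,  1, -1]])

    rref_form, pivot_cols = M.rref()
    print(rref_form)   # Matrix([[1, 0, 0, -1], [0, 1, 0, 4], [0, 0, 1, 2]])
    print(pivot_cols)  # (0, 1, 2)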

At this point we should work a couple of examples.

Example 3 Use Gaussian Elimination and Gauss-Jordan Elimination to solve the following
system of linear equations.
$$\begin{aligned} -2x_1 + x_2 - x_3 &= 4 \\ x_1 + 2x_2 + 3x_3 &= 13 \\ 3x_1 + x_3 &= -1 \end{aligned}$$
Solution
Since we're asked to use both solution methods on this system, and since a matrix in reduced
row-echelon form must also be in row-echelon form, we'll start off by putting the augmented
matrix in row-echelon form and then stop to find the solution. This will be Gaussian Elimination.
After doing that we'll go back, pick up from row-echelon form, and further reduce the matrix to
reduced row-echelon form; at that point we'll have performed Gauss-Jordan Elimination.

So, let’s start off by getting the augmented matrix for this system.
é -2 1 -1 4ù
ê 1 2 3 13úú
ê
êë 3 0 1 -1úû
As we go through the steps in this first example we'll call out the entry (or entries) that we're
working with at each step so that we don't lose track of what we're doing. We should also
point out that there are many different paths that we can take to get this matrix into row-echelon
form and each path may well produce a different row-echelon form of the matrix. Keep this in
mind as you work these problems. The path that you take to get this matrix into row-echelon
form should be the one that you find the easiest, and that may not be the one that the person next
to you finds the easiest. Regardless of which path you take, you are only allowed to use the three
elementary row operations that we looked at in the previous section.

So, with that out of the way, we need to make the leftmost non-zero entry in the top row a one. In
this case we could use any of the three possible row operations. We could divide the top row by
$-2$ and this would certainly change the "-2" into a one. However, this will also introduce
fractions into the matrix and while we often can't avoid them let's not put them in before we need
to.

Next, we could take row three and add it to row one, or we could take three times row two and add
it to row one. Either of these would also change the "-2" into a one. However, this row
operation is the one that is most prone to arithmetic errors, so while it would work let's not use it
unless we need to.

This leaves interchanging any two rows. This is an operation that won't always work here to get
a 1 into the spot we want, but when it does it will usually be the easiest operation to use. In this
case we've already got a one in the leftmost entry of the second row, so let's just interchange the
first and second rows and we'll get a one in the leftmost spot of the first row pretty much for free.
Here is this operation.
$$\left[\begin{array}{ccc|c} -2 & 1 & -1 & 4 \\ 1 & 2 & 3 & 13 \\ 3 & 0 & 1 & -1 \end{array}\right] \xrightarrow{\ R_1 \leftrightarrow R_2\ } \left[\begin{array}{ccc|c} 1 & 2 & 3 & 13 \\ -2 & 1 & -1 & 4 \\ 3 & 0 & 1 & -1 \end{array}\right]$$

Now, the next step we’ll need to take is changing the two numbers in the first column under the
leading 1 into zeroes. Recall that as we move down the rows the leading 1 MUST move off to
the right. This means that the two numbers under the leading 1 in the first column will need to
become zeroes. Again, there are often several row operations that can be done to do this.
However, in most cases adding multiples of the row containing the leading 1 (the first row in this
case) onto the rows we need to have zeroes is often the easiest. Here are the two row operations
that we’ll do in this step.


$$\left[\begin{array}{ccc|c} 1 & 2 & 3 & 13 \\ -2 & 1 & -1 & 4 \\ 3 & 0 & 1 & -1 \end{array}\right] \xrightarrow[\ R_3 - 3R_1\ ]{\ R_2 + 2R_1\ } \left[\begin{array}{ccc|c} 1 & 2 & 3 & 13 \\ 0 & 5 & 5 & 30 \\ 0 & -6 & -8 & -40 \end{array}\right]$$
Notice that since each operation changed a different row we went ahead and performed both of
them at the same time. We will often do this when multiple operations will all change different
rows.

We now need to change the "5" into a one. In this case we'll go ahead and divide the second
row by 5 since this won't introduce any fractions into the matrix and it will give us the number
we're looking for.
$$\left[\begin{array}{ccc|c} 1 & 2 & 3 & 13 \\ 0 & 5 & 5 & 30 \\ 0 & -6 & -8 & -40 \end{array}\right] \xrightarrow{\ \frac{1}{5}R_2\ } \left[\begin{array}{ccc|c} 1 & 2 & 3 & 13 \\ 0 & 1 & 1 & 6 \\ 0 & -6 & -8 & -40 \end{array}\right]$$

Next, we’ll use the third row operation to change the red “-6” into a zero so the leading 1 of the
third row will move to the right of the leading 1 in the second row. This time we’ll be using a
multiple of the second row to do this. Here is the work in this step.
é 1 2 3 13ù é 1 2 3 13ù
ê R3 + 6 R2
ê 0 1 1 6 úú ê 0
ê 1 1 6 úú
®
ëê 0 -6 -8 -40 ûú ëê 0 0 -2 -4 ûú
Notice that in both steps where we needed to get zeroes below a leading 1 we added multiples of
the row containing the leading 1 to the rows in which we wanted zeroes. This will always work
in this case. It may be possible to use other row operations, but the third one can always be used in
these cases.

The final step we need to get the matrix into row-echelon form is to change the "-2" into a
one. To do this we don't really have a choice. Since we need the leading one in the third
row to be in the third or fourth column (i.e. to the right of the leading one in the second column)
we MUST retain the zeroes in the first and second columns of the third row.

Interchanging the second and third row would definitely put a one in the third column of the third
row; however, it would also change the zero in the second column, which we can't allow.
Likewise, we could add the first row to the third row and again this would put a one in the third
column of the third row, but this operation would also change both of the zeroes in front of it,
which can't be allowed.

Therefore, our only real choice in this case is to divide the third row by $-2$. This will retain the
zeroes in the first and second columns and change the entry in the third column into a one. Note
that this step will often introduce fractions into the matrix, but at this point that can't be avoided.
Here is the work for this step.
$$\left[\begin{array}{ccc|c} 1 & 2 & 3 & 13 \\ 0 & 1 & 1 & 6 \\ 0 & 0 & -2 & -4 \end{array}\right] \xrightarrow{\ -\frac{1}{2}R_3\ } \left[\begin{array}{ccc|c} 1 & 2 & 3 & 13 \\ 0 & 1 & 1 & 6 \\ 0 & 0 & 1 & 2 \end{array}\right]$$

At this point the augmented matrix is in row-echelon form. So, if we're going to perform
Gaussian Elimination on this matrix we'll stop and go back to equations. Doing this gives,
$$\left[\begin{array}{ccc|c} 1 & 2 & 3 & 13 \\ 0 & 1 & 1 & 6 \\ 0 & 0 & 1 & 2 \end{array}\right] \quad\Rightarrow\quad \begin{aligned} x_1 + 2x_2 + 3x_3 &= 13 \\ x_2 + x_3 &= 6 \\ x_3 &= 2 \end{aligned}$$

At this point solving is quite simple. In fact we can see from this that $x_3 = 2$. Plugging this into
the second equation gives $x_2 = 4$. Finally, plugging both of these into the first equation gives
$x_1 = -1$. Summarizing, the solution to the system is,
$$x_1 = -1 \qquad x_2 = 4 \qquad x_3 = 2$$
This substitution process is called back substitution.
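Back substitution is mechanical enough to sketch in a few lines. The snippet below (Python/NumPy, illustrative only) assumes the coefficient part of the augmented matrix is square, in row-echelon form, and has a leading 1 in every diagonal position, as in the matrix above; it then solves from the bottom row up.

    import numpy as np

    def back_substitute(R):
        # R is an augmented matrix [U | c] with unit pivots on the diagonal.
        n = R.shape[0]
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):
            # x_i = c_i minus the contributions of the already-known unknowns.
            x[i] = R[i, -1] - R[i, i+1:n] @ x[i+1:]
        return x

    R = np.array([[1, 2, 3, 13],
                  [0, 1, 1,  6],
                  [0, 0, 1,  2]], dtype=float)
    print(back_substitute(R))  # [-1.  4.  2.]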

Now, let’s pick back up at the row-echelon form of the matrix and further reduce the matrix into
reduced row-echelon form. The first step in doing this will be to change the numbers above the
leading 1 in the third row into zeroes. Here are the operations that will do that for us.
é 1 2 3 13ù R1 - 3R3 é 1 2 0 7ù
ê0 1 1 6 úú R2 - R3 ê0 1 0 4ú
ê ê ú
êë 0 0 1 2 úû ® êë 0 0 1 2 úû

The final step is then to change the "2" above the leading one in the second row into a zero.
Here is this operation.
$$\left[\begin{array}{ccc|c} 1 & 2 & 0 & 7 \\ 0 & 1 & 0 & 4 \\ 0 & 0 & 1 & 2 \end{array}\right] \xrightarrow{\ R_1 - 2R_2\ } \left[\begin{array}{ccc|c} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 4 \\ 0 & 0 & 1 & 2 \end{array}\right]$$

We are now in reduced row-echelon form, so all we need to do to finish performing Gauss-Jordan
Elimination is to go back to equations.
$$\left[\begin{array}{ccc|c} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 4 \\ 0 & 0 & 1 & 2 \end{array}\right] \quad\Rightarrow\quad x_1 = -1 \qquad x_2 = 4 \qquad x_3 = 2$$

We can see from this that one of the nice consequences of Gauss-Jordan Elimination is that when
there is a single solution to the system there is no work to be done to find the solution. It is
generally given to us for free. Note as well that it is the same solution as the one that we got by
using Gaussian Elimination, as we should expect.

Before we proceed with another example we need to give a quick fact. As was pointed out in this
example, there are many paths we could take to do this problem. It was also noted that the path
we chose would affect the row-echelon form of the matrix. This will not be true for the reduced
row-echelon form, however. There is only one reduced row-echelon form of a given matrix, no
matter what path we choose to take to get to that point.

If we know ahead of time that we are going to go to reduced row-echelon form for a matrix we
will often take a different path than the one used in the previous example. In the previous

example we first got the matrix in row-echelon form by getting zeroes under the leading 1’s and
then went back and put the matrix in reduced row-echelon form by getting zeroes above the
leading 1’s. If we know ahead of time that we’re going to want reduced row-echelon form we
can just take care of the matrix on a column-by-column basis in the following manner. We first
get a leading 1 in the correct column; then, instead of using it to convert only the numbers below
it to zero, we use it to convert the numbers both above and below to zero. In this way, once we
reach the last column and take care of it, we will be in reduced row-echelon form.

We should also point out the differences between Gauss-Jordan Elimination and Gaussian
Elimination. With Gauss-Jordan Elimination there is more matrix work that needs to be
performed in order to get the augmented matrix into reduced row-echelon form, but there will be
less work required in order to get the solution. In fact, if there’s a single solution then the
solution will be given to us for free. We will see however, that if there are infinitely many
solutions we will still have a little work to do in order to arrive at the solution. With Gaussian
Elimination we have less matrix work to do since we are only reducing the augmented matrix to
row-echelon form. However, we will always need to perform back substitution in order to get the
solution. Which method you use will probably depend on which you find easier.

Okay let’s do some more examples. Since we’ve done one example in excruciating detail we
won’t be bothering to put as much detail into the remaining examples. All operations will be
shown, but the explanations of each operation will not be given.

Example 4 Solve the following system of linear equations.
$$\begin{aligned} x_1 - 2x_2 + 3x_3 &= -2 \\ -x_1 + x_2 - 2x_3 &= 3 \\ 2x_1 - x_2 + 3x_3 &= 1 \end{aligned}$$
Solution
First, the instructions to this problem did not specify which method to use so we’ll need to make a
decision. No matter which method we choose we will need to get the augmented matrix down to
row-echelon form so let’s get to that point and then see what we’ve got. If we’ve got something
easy to work with we’ll stop and do Gaussian Elimination and if not we’ll proceed to reduced
row-echelon form and do Gauss-Jordan Elimination.

So, let’s start with the augmented matrix and then proceed to put it into row-echelon form and
again we’re not going to put in quite the detail in this example as we did with the first one. So,
here is the augmented matrix for this system.
$$\begin{bmatrix} 1 & -2 & 3 & -2 \\ -1 & 1 & -2 & 3 \\ 2 & -1 & 3 & 1 \end{bmatrix}$$
and here is the work to put it into row-echelon form.
$$\begin{bmatrix} 1 & -2 & 3 & -2 \\ -1 & 1 & -2 & 3 \\ 2 & -1 & 3 & 1 \end{bmatrix} \xrightarrow[R_3 - 2R_1]{R_2 + R_1} \begin{bmatrix} 1 & -2 & 3 & -2 \\ 0 & -1 & 1 & 1 \\ 0 & 3 & -3 & 5 \end{bmatrix} \xrightarrow{-R_2} \begin{bmatrix} 1 & -2 & 3 & -2 \\ 0 & 1 & -1 & -1 \\ 0 & 3 & -3 & 5 \end{bmatrix}$$


$$\xrightarrow{R_3 - 3R_2} \begin{bmatrix} 1 & -2 & 3 & -2 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 8 \end{bmatrix} \xrightarrow{\frac{1}{8}R_3} \begin{bmatrix} 1 & -2 & 3 & -2 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Okay, we’re now in row-echelon form. Let’s go back to equation and see what we’ve got.
$$\begin{aligned} x_1 - 2x_2 + 3x_3 &= -2 \\ x_2 - x_3 &= -1 \\ 0 &= 1 \end{aligned}$$
Hmmmm. That last equation doesn't look correct. We've got a few possibilities here.
We've either just managed to prove that $0 = 1$ (and we know that's not true), we've made a
mistake (always possible, but we haven't in this case), or there's another possibility we haven't
thought of yet.

Recall from Theorem 1 in the previous section that a system has one of three possibilities for a
solution: no solution, one solution, or infinitely many solutions. In this case we've
got no solution. When we go back to equations and we get an equation that just clearly can't be
true, such as the third equation above, then we know that we've got no solution.

Note as well that we didn’t really need to do the last step above. We could have just as easily
arrived at this conclusion by looking at the second to last matrix since 0=8 is just as incorrect as
0=1.

So, to close out this problem, the official answer that there is no solution to this system.

In order to see how a simple change in a system can lead to a totally different type of solution
let’s take a look at the following example.

Example 5 Solve the following system of linear equations.
$$\begin{aligned} x_1 - 2x_2 + 3x_3 &= -2 \\ -x_1 + x_2 - 2x_3 &= 3 \\ 2x_1 - x_2 + 3x_3 &= -7 \end{aligned}$$
Solution
The only difference between this system and the previous one is the -7 in the third equation. In
the previous example this was a 1.

Here is the augmented matrix for this system.
$$\begin{bmatrix} 1 & -2 & 3 & -2 \\ -1 & 1 & -2 & 3 \\ 2 & -1 & 3 & -7 \end{bmatrix}$$
Now, since this is essentially the same augmented matrix as the previous example the first few
steps are identical and so there is no reason to show them here. After taking the same steps as
above (we won’t need the last step this time) here is what we arrive at.
$$\begin{bmatrix} 1 & -2 & 3 & -2 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$


For some good practice you should go through the steps above and make sure you arrive at this
matrix.

In this case the last line converts to the equation
$$0 = 0$$
and this is a perfectly acceptable equation because, after all, zero is in fact equal to zero! In other
words, we shouldn't get excited about it.

At this point we could stop, convert the first two lines of the matrix to equations, and find a
solution. However, in this case it will actually be easier to do the one final step to go to reduced
row-echelon form. Here is that step.
$$\begin{bmatrix} 1 & -2 & 3 & -2 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{R_1 + 2R_2} \begin{bmatrix} 1 & 0 & 1 & -4 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

We are now in reduced row-echelon form so let's convert to equations and see what we've got.
$$\begin{aligned} x_1 + x_3 &= -4 \\ x_2 - x_3 &= -1 \end{aligned}$$

Okay, we’ve got more unknowns than equations and in many cases this will mean that we have
infinitely many solutions. To see if this is the case for this example let’s notice that each of the
equations has an x3 in it and so we can solve each equation for the remaining variable in terms of
x3 as follows.
x1 = -4 - x3
x2 = -1 + x3

So, we can choose $x_3$ to be any value we want, and hence it is a free variable (recall we saw
these in the previous section), and each choice of $x_3$ will give us a different solution to the
system. So, just like in the previous section, we'll rename $x_3$ as $t$ and write the solution as
follows,
$$x_1 = -4 - t \qquad x_2 = -1 + t \qquad x_3 = t \qquad t \text{ is any number}$$

We therefore get infinitely many solutions, one for each possible value of t and since t can be any
real number there are infinitely many choices for t.
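As a quick sanity check, here is a short NumPy snippet (our own illustration, not part of the original notes) that plugs the parametric solution back into the system for a few values of t; every choice should reproduce the right-hand side.

```python
import numpy as np

A = np.array([[ 1, -2,  3],
              [-1,  1, -2],
              [ 2, -1,  3]])
b = np.array([-2, 3, -7])

for t in (-5.0, 0.0, 2.5):
    # Parametric solution from the example: x1 = -4 - t, x2 = -1 + t, x3 = t.
    x = np.array([-4 - t, -1 + t, t])
    assert np.allclose(A @ x, b)   # every choice of t solves the system
```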

Before moving on let's first address the issue of why we used Gauss-Jordan Elimination in the
previous example. If we'd used Gaussian Elimination (which we definitely could have used) the
system of equations would have been,
$$\begin{aligned} x_1 - 2x_2 + 3x_3 &= -2 \\ x_2 - x_3 &= -1 \end{aligned}$$
To arrive at the solution we'd have to solve the second equation for $x_2$ first and then substitute
this into the first equation before solving for $x_1$. In my mind this is more work, and work in
which I'm more likely to make an arithmetic mistake, than if we'd just gone to reduced
row-echelon form in the first place as we did in the solution.

There is nothing wrong with using Gaussian Elimination on a problem like this, but the back
substitution is definitely more work when we’ve got infinitely many solutions than when we’ve
got a single solution.

Okay, to this point we’ve worked nothing but systems with the same number of equations and
unknowns. We need to work a couple of other examples where this isn’t the case so we don’t get
too locked into this kind of system.

Example 6 Solve the following system of linear equations.
$$\begin{aligned} 3x_1 - 4x_2 &= 10 \\ -5x_1 + 8x_2 &= -17 \\ -3x_1 + 12x_2 &= -12 \end{aligned}$$
Solution
So, let’s start with the augmented matrix and reduce it to row-echelon form and see if what we’ve
got is nice enough to work with or if we should go the extra step(s) to get to reduced row-echelon
form. Let’s start with the augmented matrix.
$$\begin{bmatrix} 3 & -4 & 10 \\ -5 & 8 & -17 \\ -3 & 12 & -12 \end{bmatrix}$$
Notice that this time, in order to get the leading 1 in the upper left corner, we're probably going to
just have to divide the first row by 3 and deal with the fractions that will arise. Do not go to great
lengths to avoid fractions; they are a fact of life with these problems, so while it's okay to try
to avoid them, sometimes it's just going to be easier to work with them.

So, here’s the work for reducing the matrix to row-echelon form.
$$\begin{bmatrix} 3 & -4 & 10 \\ -5 & 8 & -17 \\ -3 & 12 & -12 \end{bmatrix} \xrightarrow{\frac{1}{3}R_1} \begin{bmatrix} 1 & -\frac{4}{3} & \frac{10}{3} \\ -5 & 8 & -17 \\ -3 & 12 & -12 \end{bmatrix} \xrightarrow[R_3 + 3R_1]{R_2 + 5R_1} \begin{bmatrix} 1 & -\frac{4}{3} & \frac{10}{3} \\ 0 & \frac{4}{3} & -\frac{1}{3} \\ 0 & 8 & -2 \end{bmatrix}$$
$$\xrightarrow{\frac{3}{4}R_2} \begin{bmatrix} 1 & -\frac{4}{3} & \frac{10}{3} \\ 0 & 1 & -\frac{1}{4} \\ 0 & 8 & -2 \end{bmatrix} \xrightarrow{R_3 - 8R_2} \begin{bmatrix} 1 & -\frac{4}{3} & \frac{10}{3} \\ 0 & 1 & -\frac{1}{4} \\ 0 & 0 & 0 \end{bmatrix}$$

Okay, we’re in row-echelon form and it looks like if we go back to equations at this point we’ll
need to do one quick back substitution involving numbers and so we’ll go ahead and stop here at
this point and do Gaussian Elimination.

Here are the equations we get from the row-echelon form of the matrix and the back substitution.
$$x_1 - \frac{4}{3}x_2 = \frac{10}{3} \quad \Rightarrow \quad x_1 = \frac{10}{3} + \frac{4}{3}\left(-\frac{1}{4}\right) = 3$$
$$x_2 = -\frac{1}{4}$$


So, the solution to this system is,
$$x_1 = 3 \qquad x_2 = -\frac{1}{4}$$

Example 7 Solve the following system of linear equations.
$$\begin{aligned} 7x_1 + 2x_2 - 2x_3 - 4x_4 + 3x_5 &= 8 \\ -3x_1 - 3x_2 + 2x_4 + x_5 &= -1 \\ 4x_1 - x_2 - 8x_3 + 20x_5 &= 1 \end{aligned}$$
Solution
First, let’s notice that we are guaranteed to have infinitely many solutions by the fact above since
we’ve got more unknowns than equations. Here’s the augmented matrix for this system.
$$\begin{bmatrix} 7 & 2 & -2 & -4 & 3 & 8 \\ -3 & -3 & 0 & 2 & 1 & -1 \\ 4 & -1 & -8 & 0 & 20 & 1 \end{bmatrix}$$
In this example we can avoid fractions in the first row simply by adding twice the second row to
the first to get our leading 1 in that row. So, with that as our initial step here’s the work that will
put this matrix into row-echelon form.
$$\begin{bmatrix} 7 & 2 & -2 & -4 & 3 & 8 \\ -3 & -3 & 0 & 2 & 1 & -1 \\ 4 & -1 & -8 & 0 & 20 & 1 \end{bmatrix} \xrightarrow{R_1 + 2R_2} \begin{bmatrix} 1 & -4 & -2 & 0 & 5 & 6 \\ -3 & -3 & 0 & 2 & 1 & -1 \\ 4 & -1 & -8 & 0 & 20 & 1 \end{bmatrix}$$
$$\xrightarrow[R_3 - 4R_1]{R_2 + 3R_1} \begin{bmatrix} 1 & -4 & -2 & 0 & 5 & 6 \\ 0 & -15 & -6 & 2 & 16 & 17 \\ 0 & 15 & 0 & 0 & 0 & -23 \end{bmatrix} \xrightarrow{R_2 \leftrightarrow R_3} \begin{bmatrix} 1 & -4 & -2 & 0 & 5 & 6 \\ 0 & 15 & 0 & 0 & 0 & -23 \\ 0 & -15 & -6 & 2 & 16 & 17 \end{bmatrix}$$
$$\xrightarrow{R_3 + R_2} \begin{bmatrix} 1 & -4 & -2 & 0 & 5 & 6 \\ 0 & 15 & 0 & 0 & 0 & -23 \\ 0 & 0 & -6 & 2 & 16 & -6 \end{bmatrix} \xrightarrow[-\frac{1}{6}R_3]{\frac{1}{15}R_2} \begin{bmatrix} 1 & -4 & -2 & 0 & 5 & 6 \\ 0 & 1 & 0 & 0 & 0 & -\frac{23}{15} \\ 0 & 0 & 1 & -\frac{1}{3} & -\frac{8}{3} & 1 \end{bmatrix}$$

We are now in row-echelon form. Notice as well that in several of the steps above we took
advantage of the form of several of the rows to simplify the work somewhat and in doing this we
did several of the steps in a different order than we’ve done to this point. Remember that there
are no set paths to take through these problems!

Because of the fractions that we’ve got here we’re going to have some work to do regardless of
whether we stop here and do Gaussian Elimination or go the couple of extra steps in order to do
Gauss-Jordan Elimination. So with that in mind let’s go all the way to reduced row-echelon form
so we can say that we’ve got another example of that in the notes. Here’s the remaining work.
$$\begin{bmatrix} 1 & -4 & -2 & 0 & 5 & 6 \\ 0 & 1 & 0 & 0 & 0 & -\frac{23}{15} \\ 0 & 0 & 1 & -\frac{1}{3} & -\frac{8}{3} & 1 \end{bmatrix} \xrightarrow{R_1 + 2R_3} \begin{bmatrix} 1 & -4 & 0 & -\frac{2}{3} & -\frac{1}{3} & 8 \\ 0 & 1 & 0 & 0 & 0 & -\frac{23}{15} \\ 0 & 0 & 1 & -\frac{1}{3} & -\frac{8}{3} & 1 \end{bmatrix}$$
$$\xrightarrow{R_1 + 4R_2} \begin{bmatrix} 1 & 0 & 0 & -\frac{2}{3} & -\frac{1}{3} & \frac{28}{15} \\ 0 & 1 & 0 & 0 & 0 & -\frac{23}{15} \\ 0 & 0 & 1 & -\frac{1}{3} & -\frac{8}{3} & 1 \end{bmatrix}$$

We’re now in reduced row-echelon form and so let’s go back to equations and see what we’ve
got.
$$x_1 - \frac{2}{3}x_4 - \frac{1}{3}x_5 = \frac{28}{15} \quad \Rightarrow \quad x_1 = \frac{28}{15} + \frac{2}{3}x_4 + \frac{1}{3}x_5$$
$$x_2 = -\frac{23}{15}$$
$$x_3 - \frac{1}{3}x_4 - \frac{8}{3}x_5 = 1 \quad \Rightarrow \quad x_3 = 1 + \frac{1}{3}x_4 + \frac{8}{3}x_5$$

So, we’ve got two free variables this time, x4 and x5 , and notice as well that unlike any of the
other infinite solution cases we actually have a value for one of the variables here. That will
happen on occasion so don’t worry about it when it does. Here is the solution for this system.

28 2 1 23 1 8
x1 = + t+ s x2 = - x3 = 1 + t + s
15 3 3 15 3 3
x4 = t x5 = s s and t are any numbers

Now, with all the examples that we’ve worked to this point hopefully you’ve gotten the idea that
there really isn’t any one set path that you always take through these types of problems. Each
system of equations is different and so may need a different solution path. Don’t get too locked
into any one solution path as that can often lead to problems.
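If you want to check this kind of row reduction by machine, SymPy's `Matrix.rref()` returns the reduced row-echelon form directly. A small sketch (our own check, not part of the notes) applied to the system from Example 7:

```python
from sympy import Matrix

# Augmented matrix for Example 7.
M = Matrix([[ 7,  2, -2, -4,  3,  8],
            [-3, -3,  0,  2,  1, -1],
            [ 4, -1, -8,  0, 20,  1]])

R, pivots = M.rref()   # reduced row-echelon form and pivot columns
print(R)        # matches the RREF computed by hand above
print(pivots)   # (0, 1, 2) -> columns 4 and 5 have no pivot: x4, x5 are free
```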

Homogeneous Systems of Linear Equations


We’ve got one more topic that we need to discuss briefly in this section. A system of n linear
equations in m unknowns in the form
a11 x1 + a12 x2 + L + a1m xm = 0
a21 x1 + a22 x2 + L + a2 m xm = 0
M
an1 x1 + an 2 x2 + L + an m xm = 0
is called a homogeneous system. The one characteristic that defines a homogeneous system is
the fact that all the equations are set equal to zero unlike a general system in which each equation
can be equal to a different (probably non-zero) number.

Hopefully, it is clear that if we take
$$x_1 = 0 \qquad x_2 = 0 \qquad x_3 = 0 \qquad \cdots \qquad x_m = 0$$
we will have a solution to the homogeneous system of equations. In other words, with a
homogeneous system we are guaranteed to have at least one solution. This means that Theorem 1
from the previous section can then be reduced to the following for homogeneous systems.


Theorem 1 Given a homogeneous system of n equations and m unknowns there will be one of
two possibilities for solutions to the system.
1. There will be exactly one solution, $x_1 = 0, x_2 = 0, x_3 = 0, \ldots, x_m = 0$. This solution is
called the trivial solution.
2. There will be infinitely many non-zero solutions in addition to the trivial solution.

Note that when we say non-zero solution in the above theorem we mean that at least one of the $x_i$'s in
the solution will not be zero. It is completely possible that some of them will still be zero, but at
least one will not be zero in a non-zero solution.

We can make a further reduction to Theorem 1 from the previous section if we assume that there
are more unknowns than equations in a homogeneous system, as the following theorem shows.

Theorem 2 Given a homogeneous system of n linear equations in m unknowns, if $m > n$ (i.e.
there are more unknowns than equations) there will be infinitely many solutions to the system.
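For instance, here is a quick SymPy check (an illustration of ours on a made-up system, not from the notes) of Theorem 2 on a homogeneous system of 2 equations in 3 unknowns; the nullspace is non-trivial, so there are infinitely many solutions.

```python
from sympy import Matrix

# A hypothetical homogeneous system: 2 equations, 3 unknowns (m > n).
A = Matrix([[1, 2, -1],
            [3, 0,  4]])

# Every vector in the nullspace solves A x = 0; a non-empty basis means
# infinitely many solutions beyond the trivial one.
print(A.nullspace())   # one basis vector -> a one-parameter family of solutions
```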


Matrices
In the previous section we used augmented matrices to denote a system of linear equations. In
this section we’re going to start looking at matrices in more generality. A matrix is nothing more
than a rectangular array of numbers and each of the numbers in the matrix is called an entry.
Here are some examples of matrices.

$$\begin{bmatrix} 4 & 3 & 0 & 6 & -1 & 0 \\ 0 & 2 & -4 & -7 & 1 & 3 \\ -6 & 1 & 15 & \frac{1}{2} & -1 & 0 \end{bmatrix} \qquad \begin{bmatrix} 7 & 10 & -1 \\ 8 & 0 & -2 \\ 9 & 3 & 0 \end{bmatrix} \qquad \begin{bmatrix} 12 \\ -4 \\ 2 \\ -17 \end{bmatrix}$$
$$\begin{bmatrix} 3 & -1 & 12 & 0 & -9 \end{bmatrix} \qquad \begin{bmatrix} -2 \end{bmatrix}$$
The size of a matrix with n rows and m columns is denoted by $n \times m$. In denoting the size of a
matrix we always list the number of rows first and the number of columns second.

Example 1 Give the size of each of the matrices above.

Solution
$$\begin{bmatrix} 4 & 3 & 0 & 6 & -1 & 0 \\ 0 & 2 & -4 & -7 & 1 & 3 \\ -6 & 1 & 15 & \frac{1}{2} & -1 & 0 \end{bmatrix} \quad \Rightarrow \quad \text{size: } 3 \times 6$$

$$\begin{bmatrix} 7 & 10 & -1 \\ 8 & 0 & -2 \\ 9 & 3 & 0 \end{bmatrix} \quad \Rightarrow \quad \text{size: } 3 \times 3$$
In this matrix the number of rows is equal to the number of columns. Matrices that have the same
number of rows as columns are called square matrices.

$$\begin{bmatrix} 12 \\ -4 \\ 2 \\ -17 \end{bmatrix} \quad \Rightarrow \quad \text{size: } 4 \times 1$$
This matrix has a single column and is often called a column matrix.

$$\begin{bmatrix} 3 & -1 & 12 & 0 & -9 \end{bmatrix} \quad \Rightarrow \quad \text{size: } 1 \times 5$$
This matrix has a single row and is often called a row matrix.

$$\begin{bmatrix} -2 \end{bmatrix} \quad \Rightarrow \quad \text{size: } 1 \times 1$$
Often when dealing with $1 \times 1$ matrices we will drop the surrounding brackets and just write $-2$.
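If you experiment with matrices in NumPy, the same rows-first convention shows up in the `.shape` attribute. A quick sketch (our own aside, not part of the notes):

```python
import numpy as np

A = np.array([[7, 10, -1],
              [8,  0, -2],
              [9,  3,  0]])
c = np.array([[12], [-4], [2], [-17]])   # a 4 x 1 column matrix
r = np.array([[3, -1, 12, 0, -9]])       # a 1 x 5 row matrix

print(A.shape, c.shape, r.shape)   # (3, 3) (4, 1) (1, 5) -- rows listed first
```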


Note that sometimes column matrices and row matrices are called column vectors and row
vectors respectively. We do need to be careful with the word vector, however, as in later chapters
the word vector will be used to denote something much more general than a column or row
matrix. Because of this we will, for the most part, use the terms column matrix and row
matrix instead of column vector and row vector.

There are a lot of notational issues that we’re going to have to get used to in this class. First,
upper case letters are generally used to refer to matrices while lower case letters generally are
used to refer to numbers. These are general rules, but as you’ll see shortly there are exceptions to
them, although it will usually be easy to identify those exceptions when they happen.

We will often need to refer to specific entries in a matrix and so we'll need a notation to take care
of that. The entry in the ith row and jth column of the matrix A is denoted by,
$$a_{ij} \qquad \text{OR} \qquad (A)_{ij}$$
In the first notation the lower case letter we use to denote the entries of a matrix will always
match the upper case letter we use to denote the matrix. So the entries of the matrix B will
be denoted by $b_{ij}$.

In both of these notations the first (left most) subscript will always give the row the entry is in
and the second (right most) subscript will always give the column the entry is in. So, $c_{49}$ will be
the entry in the 4th row and 9th column of C (which is assumed to be a matrix since it's an upper
case letter…).

Using the lower case notation we can denote a general $n \times m$ matrix, A, as follows,
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix} \qquad \text{OR} \qquad A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}_{n \times m}$$
We don't generally subscript the size of the matrix as we did in the second case, but on occasion
it may be useful to make the size clear, and in those cases we tend to subscript it as shown in the
second case.

The notation above for a general matrix is fairly cumbersome so we've also got some much more
compact notation that we'll use when we can. When possible we'll use the following to denote a
general matrix.
$$\left[a_{ij}\right] \qquad \left[a_{ij}\right]_{n \times m} \qquad A_{n \times m}$$
The first two we tend to use when we need to talk about the general entry of a matrix (such as in
certain formulas) but don't really care what that entry is. Also, we'll denote the size if it's
important or needed for whatever we're doing, but otherwise we'll not bother with the size. The
third notation is really nothing more than the standard notation with the size denoted. We'll use
this only when we need to talk about a matrix and the size is important but the entries aren't. We
won't run into this one too often, but we will on occasion.

We will be dealing extensively with column and row matrices in later chapters/sections so we
need to take care of some notation for those. These are the main exception to the upper
case/lower case convention we adopted earlier for matrices and their entries. Column and row
matrices tend to be denoted with a lower case letter that has either been bolded or has an arrow
written over it, as follows,
$$\mathbf{a} = \vec{a} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \qquad\qquad \mathbf{b} = \vec{b} = \begin{bmatrix} b_1 & b_2 & \cdots & b_m \end{bmatrix}$$
In written documents, such as this, column and row matrices tend to be in bold face while on the
chalkboard of a classroom they tend to get arrows written over them, since it's often difficult on a
chalkboard to differentiate a letter that's in bold from one that isn't.

Also, notice that with column and row matrices the entries are still denoted with lower case letters
that match the letter that represents the matrix, and in this case, since there is either a single
column or a single row, there was no reason to double subscript the entries.

Next we need to get a quick definition out of the way for square matrices. Recall that a square
matrix is a matrix whose size is $n \times n$ (i.e. it has the same number of rows as columns). In a
square matrix the entries $a_{11}, a_{22}, \ldots, a_{nn}$ are called the main diagonal.

The next topic that we need to discuss in this section is that of partitioned matrices and
submatrices. Any matrix can be partitioned into smaller submatrices simply by adding in
horizontal and/or vertical lines between selected rows and/or columns.

Example 2 Here are several partitions of a general $5 \times 3$ matrix.

(a)
$$A = \left[\begin{array}{c|cc} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ \hline a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \\ a_{51} & a_{52} & a_{53} \end{array}\right] = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$$
In this case we partitioned the matrix into four submatrices. Also notice that we simplified the
matrix into a more compact form and in this compact form we've mixed and matched some of
our notation. The partitioned matrix can be thought of as a smaller matrix with four entries,
except this time each of the entries is a matrix instead of a number, and so we used capital letters
to represent the entries and subscripted each one with its location in the partitioned matrix.

Be careful not to confuse the location subscripts on each of the submatrices with the size of each
submatrix. In this case $A_{11}$ is a $2 \times 1$ submatrix of A, $A_{12}$ is a $2 \times 2$ submatrix of A, $A_{21}$ is a
$3 \times 1$ submatrix of A, and $A_{22}$ is a $3 \times 2$ submatrix of A.

(b)
$$A = \left[\begin{array}{c|c|c} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \\ a_{51} & a_{52} & a_{53} \end{array}\right] = \begin{bmatrix} \mathbf{c}_1 & \mathbf{c}_2 & \mathbf{c}_3 \end{bmatrix}$$
In this case we partitioned A into three column matrices, each representing one column in the
original matrix. Again, note that we used the standard column matrix notation (the bold face
letters) and subscripted each one with its location in the partitioned matrix. The $\mathbf{c}_i$ in the
partitioned matrix are sometimes called the column matrices of A.

(c)
$$A = \left[\begin{array}{ccc} a_{11} & a_{12} & a_{13} \\ \hline a_{21} & a_{22} & a_{23} \\ \hline a_{31} & a_{32} & a_{33} \\ \hline a_{41} & a_{42} & a_{43} \\ \hline a_{51} & a_{52} & a_{53} \end{array}\right] = \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_3 \\ \mathbf{r}_4 \\ \mathbf{r}_5 \end{bmatrix}$$
Just as we can partition a matrix into each of its columns as we did in the previous part, we can
also partition a matrix into each of its rows. The $\mathbf{r}_i$ in the partitioned matrix are sometimes called
the row matrices of A.

The previous example showed three of the many possible ways to partition up the matrix. There
are, of course, many other ways to partition this matrix. We won’t be partitioning up too many
matrices here, but we will be doing it on occasion, so it’s a useful idea to remember. Also note
that when we do partition up a matrix into its column/row matrices we will generally put in the
bars separating the columns/rows as we’ve done here to indicate that we’ve got a partitioned
matrix.

To close out this section we’re going to introduce a couple of special matrices that we’ll see show
up on occasion.

The first matrix is the zero matrix. The zero matrix is pretty much what the name implies. It is
an $n \times m$ matrix whose entries are all zeroes. The notation we'll use for the zero matrix is $0_{n \times m}$
for a general zero matrix or $\mathbf{0}$ for a zero column or row matrix. Here are a couple of zero
matrices just so we can say we have some in the notes.
$$0_{2 \times 4} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \qquad \mathbf{0} = \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix} \qquad \mathbf{0} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$


If the size of a column or row zero matrix is important we will sometimes subscript the size on
those as well just to make it clear what the size is. Also, if the size of a full zero matrix is not
important or implied from the problem we will drop the size from $0_{n \times m}$ and just denote it by 0.

The second special matrix we'll look at in this section is the identity matrix. The identity matrix
is a square $n \times n$ matrix usually denoted by $I_n$, or just I if the size is unimportant or clear from
the context of the problem. The entries on the main diagonal of the identity matrix are all ones
and all the other entries in the identity matrix are zeroes. Here are a couple of identity matrices.
$$I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \qquad\qquad I_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
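Both special matrices have direct NumPy constructors, which is handy for experimenting; a tiny sketch of ours for reference:

```python
import numpy as np

Z = np.zeros((2, 4))   # the 2 x 4 zero matrix
I4 = np.eye(4)         # the 4 x 4 identity matrix

print(Z)
print(I4)
```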

As we’ll see identity matrices will arise fairly regularly. Here is a nice theorem about the reduced
row-echelon form of a square matrix and how it relates to the identity matrix.

Theorem 1 If A is an n ´ n matrix then the reduced row-echelon form of the matrix will either
contain at least one row of all zeroes or it will be I n , the n ´ n identity matrix.

Proof : This is a simple enough theorem to prove that we may as well. Let's suppose that B is the
reduced row-echelon form of the matrix. If B has at least one row of all zeroes we are done, so
let's suppose that B does not have a row of all zeroes. This means that every row has a leading 1
in it.

Now, we know that the leading 1 of a row must be to the right of the leading 1 of the row
immediately above it. Because we are assuming that B is square and doesn't have any rows of all
zeroes we can actually locate each of the leading 1's in B.

First, let's suppose that the leading 1 in the first row is NOT $b_{11}$ (i.e. $b_{11} = 0$). The next possible
location of the leading 1 in the first row would then be $b_{12}$. So, let's suppose that this is where
the leading 1 is. Upon assuming this we can say that B must have the following form.
$$B = \begin{bmatrix} 0 & 1 & b_{13} & \cdots & b_{1n} \\ 0 & 0 & b_{23} & \cdots & b_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & b_{n3} & \cdots & b_{nn} \end{bmatrix}$$

Now, let's assume the best possible scenario happens: the leading 1 of each of the lower
rows is exactly one column to the right of the leading 1 above it. This, however, leads us to
instant problems. Because our first leading 1 is in the second column, by the time we reach the
(n-1)st row our leading 1 will be in the nth column, and this will in turn force the nth row to be a row of
all zeroes, which contradicts our initial assumption. If you're not sure you believe this consider
the $4 \times 4$ case.
$$\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
Sure enough, a row of all zeroes in the 4th row.

Now, we assumed the best possible scenario for the leading 1's in the lower rows and ran into
problems. If the leading 1 jumps to the right, say 2 columns (or 3 or 4, etc.), we will run into the
same kind of problem, only we'll end up with more than one row of all zeroes.

Likewise, if the leading 1 in the first row is in any of $b_{13}, b_{14}, \ldots, b_{1n}$ we will have the same
problem. So, in order to meet the assumption that we don't have any rows of all zeroes, we know
that the leading 1 in the first row must be at $b_{11}$.

Using a similar argument to that above we can see that if the leading 1 in any of the lower rows
jumps to the right more than one column we will have a leading 1 in the nth column prior to
hitting the nth row. This will in turn force at least the nth row to be a row of all zeroes, which will
again contradict our initial assumption.

Therefore we know that the leading 1 in the first row is at $b_{11}$, and the only hope of not having a
row of all zeroes at the bottom is to have the leading 1 of each row be exactly one column to the
right of the leading 1 of the row above it. This means that the leading 1 in the second row must
be at $b_{22}$, the leading 1 in the third row must be at $b_{33}$, etc. Eventually we'll hit the nth row, and
in this row the leading 1 must be at $b_{nn}$.

Therefore the leading 1's of B must be on the main diagonal, and because B is the reduced row-echelon
form of A we also know that all the entries above and below the leading 1's must be zeroes. This,
however, is exactly $I_n$. Therefore, if B does not have a row of all zeroes in it then we must have
that $B = I_n$.


Matrix Arithmetic & Operations


One of the biggest impediments that some people have in learning about matrices for the first
time is trying to take everything that they know about arithmetic of real numbers and translate
that over to matrices. As you will eventually see, much of what you know about arithmetic of real
numbers will also be true here, but there are also a few ideas/facts that will no longer hold. To
make matters worse there are some rules of arithmetic of real numbers that will work
occasionally with matrices but won't work in general. So, keep this in mind as you go through
the next couple of sections and don't be too surprised when something doesn't quite work out as
you expect it to.

This section is devoted mostly to developing the arithmetic of matrices as well as introducing a
couple of operations on matrices that don’t really have an equivalent operation in real numbers.
We will see some of the differences between arithmetic of real numbers and matrices mentioned
above in this section. We will also see more of them in the next section when we delve into the
properties of matrix arithmetic in more detail.

Okay, let’s start off matrix arithmetic by defining just what we mean when we say that two
matrices are equal.

Definition 1 If A and B are both $n \times m$ matrices then we say that $A = B$ provided corresponding
entries from each matrix are equal. Or, in other words, $A = B$ provided $a_{ij} = b_{ij}$ for all i and j.

Matrices of different sizes cannot be equal.

Example 1 Consider the following matrices.
$$A = \begin{bmatrix} -9 & 123 \\ 3 & -7 \end{bmatrix} \qquad B = \begin{bmatrix} -9 & b \\ 3 & -7 \end{bmatrix} \qquad C = \begin{bmatrix} -9 \\ 3 \end{bmatrix}$$
For these matrices we have that $A \neq C$ and $B \neq C$ since they are different sizes and so can't be
equal. The fact that C is essentially the first column of both A and B is not important in
determining equality in this case. The size of the two matrices is the first thing we should look at
in determining equality.

Next, $A = B$ provided we have $b = 123$. If $b \neq 123$ then we will have $A \neq B$.

Next we need to move on to addition and subtraction of two matrices.

Definition 2 If A and B are both $n \times m$ matrices then $A \pm B$ is a new $n \times m$ matrix that is found
by adding/subtracting corresponding entries from each matrix. Or, in other words,
$$A \pm B = \left[a_{ij} \pm b_{ij}\right]$$

Matrices of different sizes cannot be added or subtracted.


Example 2 For the following matrices perform the indicated operation, if possible.
$$A = \begin{bmatrix} 2 & 0 & -3 & 2 \\ -1 & 8 & 10 & -5 \end{bmatrix} \qquad B = \begin{bmatrix} 0 & -4 & -7 & 2 \\ 12 & 3 & 7 & 9 \end{bmatrix} \qquad C = \begin{bmatrix} 2 & 0 & 2 \\ -4 & 9 & 5 \\ 6 & 0 & -6 \end{bmatrix}$$
(a) $A + B$
(b) $B - A$
(c) $A + C$

Solution
(a) Both A and B are the same size and so we know the addition can be done in this case. Once
we know the addition can be done there really isn't all that much to do here other than to just add
the corresponding entries to get the result.
$$A + B = \begin{bmatrix} 2 & -4 & -10 & 4 \\ 11 & 11 & 17 & 4 \end{bmatrix}$$
(b) Again, since A and B are the same size we can do the difference, and just like the previous part
there really isn't all that much to do. All that we need to be careful with is the order. Just like
with real number arithmetic, $B - A$ is different from $A - B$. So, in this case we'll subtract the
entries of A from the entries of B.
$$B - A = \begin{bmatrix} -2 & -4 & -4 & 0 \\ 13 & -5 & -3 & 14 \end{bmatrix}$$
(c) In this case, because A and C are different sizes, the addition can't be done. Likewise, $A - C$,
$C - A$, $B + C$, $C - B$, and $B - C$ can't be done for the same reason.

We now need to move into multiplication involving matrices. However, there are actually two
kinds of multiplication to look at: scalar multiplication and matrix multiplication. Let's start
with scalar multiplication.

Definition 3 If A is any matrix and c is any number then the product (or scalar multiple), cA, is
a new matrix of the same size as A and its entries are found by multiplying the original entries of
A by c. In other words, $cA = \left[c\,a_{ij}\right]$ for all i and j.

Note that in the field of Linear Algebra a number is often called a scalar and hence the name
scalar multiple since we are multiplying a matrix by a scalar (number). From this point on we
will generally call numbers scalars.

Before doing an example we need to get another quick definition out of the way. If
$A_1, A_2, \ldots, A_n$ are all matrices of the same size and $c_1, c_2, \ldots, c_n$ are scalars then the linear
combination of $A_1, A_2, \ldots, A_n$ with coefficients $c_1, c_2, \ldots, c_n$ is,
$$c_1A_1 + c_2A_2 + \cdots + c_nA_n$$

This may seem like a silly thing to define but we'll be using linear combinations in quite a few
places in this class and so we need to get used to seeing them.


Example 3 Given the matrices
$$A = \begin{bmatrix} 0 & 9 \\ 2 & -3 \\ -1 & 1 \end{bmatrix} \qquad B = \begin{bmatrix} 8 & 1 \\ -7 & 0 \\ 4 & -1 \end{bmatrix} \qquad C = \begin{bmatrix} 2 & 3 \\ -2 & 5 \\ 10 & -6 \end{bmatrix}$$
compute $3A + 2B - \frac{1}{2}C$.

Solution
So, we're really being asked to compute a linear combination here. We'll do that by first
computing the scalar multiples and then performing the addition and subtraction. Note as well
that in the case of the third scalar multiple we are going to consider the scalar to be a positive $\frac{1}{2}$
and leave the minus sign out in front of the matrix. Here is the work for this problem.
$$3A + 2B - \frac{1}{2}C = \begin{bmatrix} 0 & 27 \\ 6 & -9 \\ -3 & 3 \end{bmatrix} + \begin{bmatrix} 16 & 2 \\ -14 & 0 \\ 8 & -2 \end{bmatrix} - \begin{bmatrix} 1 & \frac{3}{2} \\ -1 & \frac{5}{2} \\ 5 & -3 \end{bmatrix} = \begin{bmatrix} 15 & \frac{55}{2} \\ -7 & -\frac{23}{2} \\ 0 & 4 \end{bmatrix}$$
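Scalar multiples and sums translate directly to NumPy's elementwise arithmetic, so the linear combination above is a one-liner. A quick sketch of ours (the fractions become floating point here):

```python
import numpy as np

A = np.array([[0, 9], [2, -3], [-1, 1]])
B = np.array([[8, 1], [-7, 0], [4, -1]])
C = np.array([[2, 3], [-2, 5], [10, -6]])

# The linear combination 3A + 2B - (1/2)C, computed entrywise.
print(3*A + 2*B - 0.5*C)   # [[15, 27.5], [-7, -11.5], [0, 4]]
```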

We now need to move into matrix multiplication. However, before we do the general case let's
look at a special case first, since this will help with the general case.

Suppose that we have the following two matrices,
$$\mathbf{a} = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \qquad\qquad \mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$
So, $\mathbf{a}$ is a row matrix and $\mathbf{b}$ is a column matrix and they have the same number of entries. Then
the product of $\mathbf{a}$ and $\mathbf{b}$ is defined to be,
$$\mathbf{a}\mathbf{b} = a_1b_1 + a_2b_2 + \cdots + a_nb_n$$

It is important to note that this product can only be done if $\mathbf{a}$ and $\mathbf{b}$ have the same number of
entries. If they have a different number of entries then this product is not defined.

Example 4 Compute $\mathbf{a}\mathbf{b}$ given that,
$$\mathbf{a} = \begin{bmatrix} 4 & -10 & 3 \end{bmatrix} \qquad\qquad \mathbf{b} = \begin{bmatrix} -4 \\ 3 \\ 8 \end{bmatrix}$$
Solution
There is not really a whole lot to do here other than use the definition given above.
$$\mathbf{a}\mathbf{b} = (4)(-4) + (-10)(3) + (3)(8) = -22$$

Now let’s move onto general matrix multiplication.


Definition 4 If A is an $n \times p$ matrix and B is a $p \times m$ matrix then the product (or matrix
multiplication) AB is a new matrix with size $n \times m$ whose ijth entry is found by multiplying row i of
A times column j of B.

So, just like with addition and subtraction, we need to be careful with the sizes of the two
matrices we're dealing with. However, with multiplication we need to be a little more careful.
This definition tells us that the product AB is only defined if A (i.e. the first matrix listed in the
product) has the same number of columns as B (i.e. the second matrix listed in the product) has
rows. If the number of columns of the first matrix listed is not the same as the number of rows of
the second matrix listed then the product is not defined.

An easy way to check that a product is defined is to write down the two matrices in the order that
we want to multiply them and underneath them write down the sizes as shown below.
$$\underset{n \times p}{A}\ \underset{p \times m}{B} = \underset{n \times m}{AB}$$
If the two inner numbers are equal then the product is defined and the size of the product will be
given by the outside numbers.

Example 5 Compute AC and CA for the following two matrices, if possible.
$$A = \begin{bmatrix} 1 & -3 & 0 & 4 \\ -2 & 5 & -8 & 9 \end{bmatrix} \qquad\qquad C = \begin{bmatrix} 8 & 5 & 3 \\ -3 & 10 & 2 \\ 2 & 0 & -4 \\ -1 & -7 & 5 \end{bmatrix}$$
Solution
Okay, let's first do AC. Here are the sizes for A and C.
$$\underset{2 \times 4}{A}\ \underset{4 \times 3}{C} = \underset{2 \times 3}{AC}$$
So, the two inner numbers (4 and 4) are the same and so the multiplication can be done and we
can see that the new size of the matrix is $2 \times 3$. Now, let's actually do the multiplication. We'll
go through the first couple of entries in the product in detail and then do the remaining entries a
little quicker.

To get the number in the first row and first column of AC we'll multiply the first row of A by the
first column of C as follows,
$$(1)(8) + (-3)(-3) + (0)(2) + (4)(-1) = 13$$
If we next want the entry in the first row and second column of AC we'll multiply the first row of
A by the second column of C as follows,
$$(1)(5) + (-3)(10) + (0)(0) + (4)(-7) = -53$$
Okay, at this point let's stop and insert these into the product so we can make sure that we've got
our bearings. Here's the product so far,


$$\begin{bmatrix} 1 & -3 & 0 & 4 \\ -2 & 5 & -8 & 9 \end{bmatrix}\begin{bmatrix} 8 & 5 & 3 \\ -3 & 10 & 2 \\ 2 & 0 & -4 \\ -1 & -7 & 5 \end{bmatrix} = \begin{bmatrix} 13 & -53 & {} \\ {} & {} & {} \end{bmatrix}$$

As we can see we've got four entries left to compute. For these we'll give the row and column
multiplications but leave it to you to make sure we used the correct row/column and put the result
in the correct place. Here's the remaining work.
$$(1)(3) + (-3)(2) + (0)(-4) + (4)(5) = 17$$
$$(-2)(8) + (5)(-3) + (-8)(2) + (9)(-1) = -56$$
$$(-2)(5) + (5)(10) + (-8)(0) + (9)(-7) = -23$$
$$(-2)(3) + (5)(2) + (-8)(-4) + (9)(5) = 81$$
Here is the completed product.
$$\begin{bmatrix} 1 & -3 & 0 & 4 \\ -2 & 5 & -8 & 9 \end{bmatrix}\begin{bmatrix} 8 & 5 & 3 \\ -3 & 10 & 2 \\ 2 & 0 & -4 \\ -1 & -7 & 5 \end{bmatrix} = \begin{bmatrix} 13 & -53 & 17 \\ -56 & -23 & 81 \end{bmatrix}$$

Now let’s do CA. Here are the sizes for this product.
C A = CA
4´3 2´ 4 N/A
Okay, in this case the two inner numbers (3 and 2) are NOT the same and so this product can’t be
done.

So, with this example we've now run across the first real difference between real number
arithmetic and matrix arithmetic. When dealing with real numbers the order in which we write a
product doesn't affect the actual result. For instance $(2)(3) = 6$ and $(3)(2) = 6$. We can flip the order
and we get the same answer. With matrices, however, we will have to be very careful and pay
attention to the order in which the product is written down. As this example has shown, the
product AC could be computed while the product CA is not defined.

Now, do not take the previous example and assume that all products will work that way. It is
possible for both AC and CA to be defined as we’ll see in the next example.

Example 6 Compute BD and DB for the given matrices, if possible.
$$B = \begin{bmatrix} 3 & -1 & 7 \\ 10 & 1 & -8 \\ -5 & 2 & 4 \end{bmatrix} \qquad\qquad D = \begin{bmatrix} -1 & 4 & 9 \\ 6 & 2 & -1 \\ 7 & 4 & 7 \end{bmatrix}$$
Solution
First, notice that both of these matrices are $3 \times 3$ matrices and so both BD and DB are defined.
Again, it's worth pointing out that this example differs from the previous example in that both of the
products are defined here rather than only one, as in the previous
example. Also note that in both cases the product will be a new $3 \times 3$ matrix.

In this example we're going to leave the work of verifying the products to you. It is good practice
so you should try and verify at least one of the following products.
$$BD = \begin{bmatrix} 3 & -1 & 7 \\ 10 & 1 & -8 \\ -5 & 2 & 4 \end{bmatrix}\begin{bmatrix} -1 & 4 & 9 \\ 6 & 2 & -1 \\ 7 & 4 & 7 \end{bmatrix} = \begin{bmatrix} 40 & 38 & 77 \\ -60 & 10 & 33 \\ 45 & 0 & -19 \end{bmatrix}$$
$$DB = \begin{bmatrix} -1 & 4 & 9 \\ 6 & 2 & -1 \\ 7 & 4 & 7 \end{bmatrix}\begin{bmatrix} 3 & -1 & 7 \\ 10 & 1 & -8 \\ -5 & 2 & 4 \end{bmatrix} = \begin{bmatrix} -8 & 23 & -3 \\ 43 & -6 & 22 \\ 26 & 11 & 45 \end{bmatrix}$$

This example leads us to yet another difference (although it’s related to the first) between real
number arithmetic and matrix arithmetic. In this example both BD and DB were defined. Notice
however that the products were definitely not the same. There is nothing wrong with this so don’t
get excited about it when it does happen. Note however that this doesn’t mean that the two
products will never be the same. It is possible for them to be the same and we’ll see at least one
case where the two products are the same in a couple of sections.

For the sake of completeness, if A is an $n \times p$ matrix and B is a $p \times m$ matrix then the entry in the
ith row and jth column of AB is given by the following formula,
$$(AB)_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + a_{i3}b_{3j} + \cdots + a_{ip}b_{pj}$$
This formula can be useful on occasion, but is really used mostly in proofs and computer
programs that compute the product of matrices.
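In fact, that formula is exactly what a naive matrix-multiplication routine computes. Here is a minimal Python sketch of ours that implements it directly with three nested loops (real libraries use much faster algorithms, but the arithmetic is the same):

```python
def mat_mul(A, B):
    """Product of A (n x p) and B (p x m), given as lists of lists."""
    n, p, m = len(A), len(B), len(B[0])
    assert len(A[0]) == p, "inner sizes must match"
    C = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # (AB)_ij = a_i1 b_1j + a_i2 b_2j + ... + a_ip b_pj
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(p))
    return C

A = [[1, -3, 0, 4], [-2, 5, -8, 9]]
C = [[8, 5, 3], [-3, 10, 2], [2, 0, -4], [-1, -7, 5]]
print(mat_mul(A, C))   # [[13, -53, 17], [-56, -23, 81]]
```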

On occasion it can be convenient to know a single row or a single column from a product and not
the whole product itself. The following theorem tells us how to get our hands on just that.

Theorem 1 Assuming that A and B are appropriately sized so that AB is defined then,
1. The ith row of AB is given by the matrix product : [ith row of A]B.
2. The jth column of AB is given by the matrix product : A[jth column of B].

Example 7 Compute the second row and third column of AC given the following matrices.
$$A = \begin{bmatrix} 1 & -3 & 0 & 4 \\ -2 & 5 & -8 & 9 \end{bmatrix} \qquad\qquad C = \begin{bmatrix} 8 & 5 & 3 \\ -3 & 10 & 2 \\ 2 & 0 & -4 \\ -1 & -7 & 5 \end{bmatrix}$$
Solution
These are the matrices from Example 5 and so we can verify the results of using this fact once
we're done.

Let's find the second row first. So, according to the fact, this means we need to multiply the
second row of A by C. Here is that work.
$$\begin{bmatrix} -2 & 5 & -8 & 9 \end{bmatrix}\begin{bmatrix} 8 & 5 & 3 \\ -3 & 10 & 2 \\ 2 & 0 & -4 \\ -1 & -7 & 5 \end{bmatrix} = \begin{bmatrix} -56 & -23 & 81 \end{bmatrix}$$
Sure enough, this is the correct second row of the product AC.

Next, let's use the fact to get the third column. This means that we'll need to multiply A by the
third column of C. Here is that work.
$$\begin{bmatrix} 1 & -3 & 0 & 4 \\ -2 & 5 & -8 & 9 \end{bmatrix}\begin{bmatrix} 3 \\ 2 \\ -4 \\ 5 \end{bmatrix} = \begin{bmatrix} 17 \\ 81 \end{bmatrix}$$
And sure enough, this also gives us the correct answer.
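In NumPy this theorem corresponds to slicing before multiplying, which avoids computing the rest of the product. A short sketch of ours:

```python
import numpy as np

A = np.array([[1, -3, 0, 4], [-2, 5, -8, 9]])
C = np.array([[8, 5, 3], [-3, 10, 2], [2, 0, -4], [-1, -7, 5]])

print(A[1:2, :] @ C)    # second row of AC: [[-56 -23  81]]
print(A @ C[:, 2:3])    # third column of AC: [[17], [81]]
```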

We can use this fact about how to get individual rows or columns of a product, as well as the idea
of a partitioned matrix that we saw in the previous section, to derive a couple of new ways to find
the product of two matrices.

Let's start by assuming we've got two matrices A (size $n \times p$) and B (size $p \times m$) so we know
the product AB is defined.

Now, for the first new way of finding the product, let's partition A into its row matrices as follows,
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ a_{21} & a_{22} & \cdots & a_{2p} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{np} \end{bmatrix} = \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \vdots \\ \mathbf{r}_n \end{bmatrix}$$
Now, from the fact we know that the ith row of AB is [ith row of A]B, or $\mathbf{r}_iB$. Using this idea the
product AB can then be written as a new partitioned matrix as follows.
$$AB = \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \vdots \\ \mathbf{r}_n \end{bmatrix} B = \begin{bmatrix} \mathbf{r}_1B \\ \mathbf{r}_2B \\ \vdots \\ \mathbf{r}_nB \end{bmatrix}$$

For the second new way of finding the product we'll partition B into its column matrices as,
$$B = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1m} \\ b_{21} & b_{22} & \cdots & b_{2m} \\ \vdots & \vdots & & \vdots \\ b_{p1} & b_{p2} & \cdots & b_{pm} \end{bmatrix} = \begin{bmatrix} \mathbf{c}_1 & \mathbf{c}_2 & \cdots & \mathbf{c}_m \end{bmatrix}$$
We can then use the fact that the jth column of AB is given by A[jth column of B], and so the
product AB can be written as a new partitioned matrix as follows.
$$AB = A\begin{bmatrix} \mathbf{c}_1 & \mathbf{c}_2 & \cdots & \mathbf{c}_m \end{bmatrix} = \begin{bmatrix} A\mathbf{c}_1 & A\mathbf{c}_2 & \cdots & A\mathbf{c}_m \end{bmatrix}$$

Example 8 Use both of the new methods for computing products to find AC for the following
matrices.
$$A = \begin{bmatrix} 1 & -3 & 0 & 4 \\ -2 & 5 & -8 & 9 \end{bmatrix} \qquad\qquad C = \begin{bmatrix} 8 & 5 & 3 \\ -3 & 10 & 2 \\ 2 & 0 & -4 \\ -1 & -7 & 5 \end{bmatrix}$$
Solution
So, once again we know the answer to this so we can use it to check our results against the
answer from Example 5.

First, let's use the row matrices of A. Here are the two row matrices of A.
$$\mathbf{r}_1 = \begin{bmatrix} 1 & -3 & 0 & 4 \end{bmatrix} \qquad \mathbf{r}_2 = \begin{bmatrix} -2 & 5 & -8 & 9 \end{bmatrix}$$
and here are the rows of the product.
$$\mathbf{r}_1C = \begin{bmatrix} 1 & -3 & 0 & 4 \end{bmatrix}\begin{bmatrix} 8 & 5 & 3 \\ -3 & 10 & 2 \\ 2 & 0 & -4 \\ -1 & -7 & 5 \end{bmatrix} = \begin{bmatrix} 13 & -53 & 17 \end{bmatrix}$$
$$\mathbf{r}_2C = \begin{bmatrix} -2 & 5 & -8 & 9 \end{bmatrix}\begin{bmatrix} 8 & 5 & 3 \\ -3 & 10 & 2 \\ 2 & 0 & -4 \\ -1 & -7 & 5 \end{bmatrix} = \begin{bmatrix} -56 & -23 & 81 \end{bmatrix}$$
Putting these together gives,
$$AC = \begin{bmatrix} \mathbf{r}_1C \\ \mathbf{r}_2C \end{bmatrix} = \begin{bmatrix} 13 & -53 & 17 \\ -56 & -23 & 81 \end{bmatrix}$$
and this is the correct answer.

Now let's compute the product using columns. Here are the three column matrices of C.
$$\mathbf{c}_1 = \begin{bmatrix} 8 \\ -3 \\ 2 \\ -1 \end{bmatrix} \qquad \mathbf{c}_2 = \begin{bmatrix} 5 \\ 10 \\ 0 \\ -7 \end{bmatrix} \qquad \mathbf{c}_3 = \begin{bmatrix} 3 \\ 2 \\ -4 \\ 5 \end{bmatrix}$$
Here are the columns of the product.
$$A\mathbf{c}_1 = \begin{bmatrix} 1 & -3 & 0 & 4 \\ -2 & 5 & -8 & 9 \end{bmatrix}\begin{bmatrix} 8 \\ -3 \\ 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 13 \\ -56 \end{bmatrix}$$
$$A\mathbf{c}_2 = \begin{bmatrix} 1 & -3 & 0 & 4 \\ -2 & 5 & -8 & 9 \end{bmatrix}\begin{bmatrix} 5 \\ 10 \\ 0 \\ -7 \end{bmatrix} = \begin{bmatrix} -53 \\ -23 \end{bmatrix}$$
$$A\mathbf{c}_3 = \begin{bmatrix} 1 & -3 & 0 & 4 \\ -2 & 5 & -8 & 9 \end{bmatrix}\begin{bmatrix} 3 \\ 2 \\ -4 \\ 5 \end{bmatrix} = \begin{bmatrix} 17 \\ 81 \end{bmatrix}$$
Putting all this together as follows gives the correct answer.
$$AC = \begin{bmatrix} A\mathbf{c}_1 & A\mathbf{c}_2 & A\mathbf{c}_3 \end{bmatrix} = \begin{bmatrix} 13 & -53 & 17 \\ -56 & -23 & 81 \end{bmatrix}$$

We can also write certain kinds of matrix products as a linear combination of column matrices.
Consider A, an $n \times p$ matrix, and $\mathbf{x}$, a $p \times 1$ column matrix. We can easily compute this product
directly as follows,
$$A\mathbf{x} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ a_{21} & a_{22} & \cdots & a_{2p} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{np} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \end{bmatrix} = \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1p}x_p \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2p}x_p \\ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{np}x_p \end{bmatrix}$$

Now, using matrix addition we can write the resultant $n \times 1$ matrix as follows,
$$\begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1p}x_p \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2p}x_p \\ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{np}x_p \end{bmatrix} = \begin{bmatrix} a_{11}x_1 \\ a_{21}x_1 \\ \vdots \\ a_{n1}x_1 \end{bmatrix} + \begin{bmatrix} a_{12}x_2 \\ a_{22}x_2 \\ \vdots \\ a_{n2}x_2 \end{bmatrix} + \cdots + \begin{bmatrix} a_{1p}x_p \\ a_{2p}x_p \\ \vdots \\ a_{np}x_p \end{bmatrix}$$
Now, each of the p column matrices on the right above can also be rewritten as a scalar multiple
as follows.
$$\begin{bmatrix} a_{11}x_1 \\ a_{21}x_1 \\ \vdots \\ a_{n1}x_1 \end{bmatrix} + \begin{bmatrix} a_{12}x_2 \\ a_{22}x_2 \\ \vdots \\ a_{n2}x_2 \end{bmatrix} + \cdots + \begin{bmatrix} a_{1p}x_p \\ a_{2p}x_p \\ \vdots \\ a_{np}x_p \end{bmatrix} = x_1\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{bmatrix} + x_2\begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{n2} \end{bmatrix} + \cdots + x_p\begin{bmatrix} a_{1p} \\ a_{2p} \\ \vdots \\ a_{np} \end{bmatrix}$$
Finally, the column matrices that are multiplied by the $x_i$'s are nothing more than the column
matrices of A. So, putting all this together gives us,
$$A\mathbf{x} = x_1\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{bmatrix} + x_2\begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{n2} \end{bmatrix} + \cdots + x_p\begin{bmatrix} a_{1p} \\ a_{2p} \\ \vdots \\ a_{np} \end{bmatrix} = x_1\mathbf{c}_1 + x_2\mathbf{c}_2 + \cdots + x_p\mathbf{c}_p$$


where $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_p$ are the column matrices of A. Written in this manner we can see that $A\mathbf{x}$ can
be written as the linear combination of the column matrices of A, $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_p$, with the entries of
$\mathbf{x}$, $x_1, x_2, \ldots, x_p$, as coefficients.

Example 9 Compute $A\mathbf{x}$ directly and as a linear combination for the following matrices.
$$A = \begin{bmatrix} 4 & 1 & 2 & -1 \\ -12 & 1 & 3 & 2 \\ 0 & -5 & -10 & 9 \end{bmatrix} \qquad\qquad \mathbf{x} = \begin{bmatrix} 2 \\ -1 \\ 6 \\ 8 \end{bmatrix}$$
Solution
We'll leave it to you to verify that the direct computation of the product gives,
$$A\mathbf{x} = \begin{bmatrix} 4 & 1 & 2 & -1 \\ -12 & 1 & 3 & 2 \\ 0 & -5 & -10 & 9 \end{bmatrix}\begin{bmatrix} 2 \\ -1 \\ 6 \\ 8 \end{bmatrix} = \begin{bmatrix} 11 \\ 9 \\ 17 \end{bmatrix}$$

Here is the linear combination method of computing the product.
$$A\mathbf{x} = 2\begin{bmatrix} 4 \\ -12 \\ 0 \end{bmatrix} - 1\begin{bmatrix} 1 \\ 1 \\ -5 \end{bmatrix} + 6\begin{bmatrix} 2 \\ 3 \\ -10 \end{bmatrix} + 8\begin{bmatrix} -1 \\ 2 \\ 9 \end{bmatrix} = \begin{bmatrix} 8 \\ -24 \\ 0 \end{bmatrix} - \begin{bmatrix} 1 \\ 1 \\ -5 \end{bmatrix} + \begin{bmatrix} 12 \\ 18 \\ -60 \end{bmatrix} + \begin{bmatrix} -8 \\ 16 \\ 72 \end{bmatrix} = \begin{bmatrix} 11 \\ 9 \\ 17 \end{bmatrix}$$

This is the same result that we got by the direct computation.
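The same two views are easy to compare in NumPy: `A @ x` is the direct product, and summing scaled columns reproduces the linear-combination view. A quick sketch of ours:

```python
import numpy as np

A = np.array([[4, 1, 2, -1], [-12, 1, 3, 2], [0, -5, -10, 9]])
x = np.array([2, -1, 6, 8])

direct = A @ x
# Ax as a linear combination: x1*c1 + x2*c2 + ... over the columns of A.
combo = sum(x[j] * A[:, j] for j in range(A.shape[1]))

print(direct, combo)                  # [11  9 17] twice
assert np.array_equal(direct, combo)
```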

Matrix multiplication also gives us a very nice and compact way of writing systems of equations.
In fact we even saw most of it as we introduced the above idea. Let's start out with a general
system of n equations and m unknowns.
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m &= b_2 \\ &\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nm}x_m &= b_n \end{aligned}$$
Now, instead of thinking of these as a set of equations, let's think of each side as a vector of size
$n \times 1$ as follows,
$$\begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m \\ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nm}x_m \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$
In the work above we saw that the left side of this can be written as the following matrix product,
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$
If we now denote the coefficient matrix by A, the column matrix containing the unknowns by $\mathbf{x}$,
and the column matrix containing the $b_i$'s by $\mathbf{b}$, we can write the system in the following matrix
form,
$$A\mathbf{x} = \mathbf{b}$$
In many of the sections to follow we'll write general systems of equations as $A\mathbf{x} = \mathbf{b}$ given its
compact nature in order to save space.

Now that we’ve gotten the basics of matrix arithmetic out of the way we need to introduce a
couple of matrix operations that don’t really have any equivalent operations with real numbers.

Definition 5 If A is an $n \times m$ matrix then the transpose of A, denoted by $A^T$, is an $m \times n$
matrix that is obtained by interchanging the rows and columns of A. So, the first row of $A^T$ is
the first column of A, the second row of $A^T$ is the second column of A, etc. Likewise, the first
column of $A^T$ is the first row of A, the second column of $A^T$ is the second row of A, etc.

On occasion you'll see the transpose defined as follows,
$$A = \left[a_{ij}\right]_{n \times m} \quad \Rightarrow \quad A^T = \left[a_{ji}\right]_{m \times n} \quad \text{for all } i \text{ and } j$$
Notice the difference in the subscripts. Under this definition, the entry in the ith row and jth
column of A will be in the jth row and ith column of $A^T$.

Notice that these two definitions are really the same definition; they just don't look like it
at first glance.

Definition 6 If A is a square matrix of size $n \times n$ then the trace of A, denoted by tr(A), is the
sum of the entries on the main diagonal. Or,
$$\operatorname{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}$$
If A is not square then the trace is not defined.

Example 10 Determine the transpose and trace (if it is defined) for each of the following
matrices.
$$A = \begin{bmatrix} 4 & 10 & -7 & 0 \\ 5 & -1 & 3 & -2 \end{bmatrix} \qquad B = \begin{bmatrix} 3 & 2 & -6 \\ -9 & 1 & -7 \\ 5 & 0 & 12 \end{bmatrix} \qquad C = \begin{bmatrix} 9 \\ -1 \\ 8 \end{bmatrix}$$
$$D = \begin{bmatrix} 15 \end{bmatrix} \qquad E = \begin{bmatrix} -12 & -7 \\ -7 & 10 \end{bmatrix}$$
Solution
There really isn't all that much to do here other than to go through the definitions. Note as well
that the trace will fail to be defined only for A and C since these matrices are not square.

$$A^T = \begin{bmatrix} 4 & 5 \\ 10 & -1 \\ -7 & 3 \\ 0 & -2 \end{bmatrix} \qquad \operatorname{tr}(A):\ \text{not defined since } A \text{ is not square}$$

$$B^T = \begin{bmatrix} 3 & -9 & 5 \\ 2 & 1 & 0 \\ -6 & -7 & 12 \end{bmatrix} \qquad \operatorname{tr}(B) = 3 + 1 + 12 = 16$$

$$C^T = \begin{bmatrix} 9 & -1 & 8 \end{bmatrix} \qquad \operatorname{tr}(C):\ \text{not defined since } C \text{ is not square}$$

$$D^T = \begin{bmatrix} 15 \end{bmatrix} \qquad \operatorname{tr}(D) = 15$$

$$E^T = \begin{bmatrix} -12 & -7 \\ -7 & 10 \end{bmatrix} \qquad \operatorname{tr}(E) = -12 + 10 = -2$$

In the previous example note that $D^T = D$ and that $E^T = E$. In these cases the matrix is called
symmetric. So, in the previous example D and E are symmetric while A, B, and C are not
symmetric.
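NumPy exposes both operations directly (`.T` for the transpose and `np.trace` for the trace), so results like these are easy to spot-check. A brief sketch of ours, including the symmetry check:

```python
import numpy as np

B = np.array([[3, 2, -6], [-9, 1, -7], [5, 0, 12]])
E = np.array([[-12, -7], [-7, 10]])

print(B.T)               # rows and columns interchanged
print(np.trace(B))       # 16
print(np.trace(E))       # -2
print(np.array_equal(E.T, E))   # True -> E is symmetric
```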


Properties of Matrix Arithmetic and the Transpose


In this section we’re going to take a quick look at some of the properties of matrix arithmetic and
of the transpose of a matrix. As mentioned in the previous section most of the basic rules of real
number arithmetic are still valid in matrix arithmetic. However, there are a few that are no longer
valid in matrix arithmetic as we’ll be seeing.

We’ve already seen one of the real number properties that doesn’t hold in matrix arithmetic. If a
and b are two real numbers then we know by the commutative law for multiplication of real
numbers that ab = ba (i.e. (2)(3)=(3)(2)=6 ). However, if A and B are two matrices such that AB
is defined we saw an example in the previous section in which BA was not defined as well as an
example in which BA was defined and yet AB ¹ BA . In other words, we don’t have a
commutative law for matrix multiplication. Note that doesn’t mean that we’ll never have
AB = BA for some matrices A and B, it is possible for this to happen (as we’ll see in the next
section) we just can’t guarantee that this will happen if both AB and BA are defined.

Now, let’s take a quick look at the properties of real number arithmetic that are valid in matrix
arithmetic.

Properties
In the following set of properties a and b are scalars and A, B, and C are matrices. We'll assume
that the sizes of the matrices in each property are such that the operation in that property is
defined.

1. $A + B = B + A$ (Commutative law for addition)
2. $A + (B + C) = (A + B) + C$ (Associative law for addition)
3. $A(BC) = (AB)C$ (Associative law for multiplication)
4. $A(B \pm C) = AB \pm AC$ (Left distributive law)
5. $(B \pm C)A = BA \pm CA$ (Right distributive law)
6. $a(B \pm C) = aB \pm aC$
7. $(a \pm b)C = aC \pm bC$
8. $(ab)C = a(bC)$
9. $a(BC) = (aB)C = B(aC)$

With real number arithmetic we didn’t need both 4. and 5. since we’ve also got the commutative
law for multiplication. However, since we don’t have the commutative law for matrix
multiplication we really do need both 4. and 5. Also, properties 6. – 9. are simply distributive or
associative laws for dealing with scalar multiplication.

Now, let’s take a look at couple of other idea from real number arithmetic and see if they have
equivalent ideas in matrix arithmetic.

We’ll start with the following idea. From real number arithmetic we know that 1 × a = a ×1 = a .
Or, in other words, if we multiply a number by 1 (one) doesn’t change the number. The identity
matrix will give the same result in matrix multiplication. If A is an n ´ m matrix then we have,

© 2007 Paul Dawkins 45 http://tutorial.math.lamar.edu/terms.aspx


Linear Algebra

I n A = AI m = A

Note that we really do need different identity matrices on each side of A; their sizes will depend
upon the size of A.

Example 1 Consider the following matrix.
$$A = \begin{bmatrix} 10 & 0 \\ -3 & 8 \\ -1 & 11 \\ 7 & -4 \end{bmatrix}$$
Then,
$$I_4A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 10 & 0 \\ -3 & 8 \\ -1 & 11 \\ 7 & -4 \end{bmatrix} = \begin{bmatrix} 10 & 0 \\ -3 & 8 \\ -1 & 11 \\ 7 & -4 \end{bmatrix}$$
$$AI_2 = \begin{bmatrix} 10 & 0 \\ -3 & 8 \\ -1 & 11 \\ 7 & -4 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 10 & 0 \\ -3 & 8 \\ -1 & 11 \\ 7 & -4 \end{bmatrix}$$

Now, just like the identity matrix takes the place of the number 1 (one) in matrix multiplication,
the zero matrix (denoted by $0$ for a general matrix and by $\mathbf{0}$ for a column/row matrix) will take the
place of the number 0 (zero) in most of the matrix arithmetic. Note that we said most of the
matrix arithmetic. There are a couple of properties involving 0 in real numbers that are not
necessarily valid in matrix arithmetic.

Let’s first start with the properties that are still valid.

Zero Matrix Properties
In the following properties A is a matrix and 0 is the zero matrix sized appropriately for the
indicated operation to be valid.

1. $A + 0 = 0 + A = A$
2. $A - A = 0$
3. $0 - A = -A$
4. $0A = 0$ and $A0 = 0$

Now, in real number arithmetic we know that if $ab = ac$ and $a \neq 0$ then we must have $b = c$
(sometimes called the cancellation law). We also know that if $ab = 0$ then we have $a = 0$
and/or $b = 0$ (sometimes called the zero factor property). Neither of these properties of real
number arithmetic is valid in general for matrix arithmetic.


Example 2 Consider the following three matrices.
$$A = \begin{bmatrix} -3 & 2 \\ -6 & 4 \end{bmatrix} \qquad B = \begin{bmatrix} -1 & 2 \\ 3 & -2 \end{bmatrix} \qquad C = \begin{bmatrix} 1 & 4 \\ 6 & 1 \end{bmatrix}$$
We'll leave it to you to verify that,
$$AB = \begin{bmatrix} 9 & -10 \\ 18 & -20 \end{bmatrix} = AC$$

Clearly $A \neq 0$ and just as clearly $B \neq C$, and yet we do have $AB = AC$. So, at least in this case,
the cancellation law does not hold.

We should be careful and not read too much into the results of the previous example. The
cancellation law will not be valid in general for matrix multiplication. However, there are times
when a variation of the cancellation law will be valid as we’ll see in the next section.

Example 3 Consider the following two matrices.
$$A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} \qquad\qquad B = \begin{bmatrix} -16 & 2 \\ 8 & -1 \end{bmatrix}$$
We'll leave it to you to verify that,
$$AB = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

So, we've got $AB = 0$ despite the fact that $A \neq 0$ and $B \neq 0$. So, in this case the zero factor
property does not hold.

Now, again, we need to be careful. There are times when we will have a variation of the zero
factor property; however, there will be no zero factor property for the multiplication of any two
random matrices.
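Both counterexamples take only a few lines to verify numerically; here is a quick NumPy sketch of ours.

```python
import numpy as np

# Example 2: AB = AC even though A != 0 and B != C.
A = np.array([[-3, 2], [-6, 4]])
B = np.array([[-1, 2], [3, -2]])
C = np.array([[1, 4], [6, 1]])
print(np.array_equal(A @ B, A @ C))   # True -> no cancellation law

# Example 3: AB = 0 even though neither factor is the zero matrix.
A2 = np.array([[1, 2], [2, 4]])
B2 = np.array([[-16, 2], [8, -1]])
print(A2 @ B2)   # [[0 0], [0 0]] -> no zero factor property
```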

The next topic that we need to take a look at is that of powers of matrices. At this point we'll just
work with positive exponents. We'll need the next section before we can deal with negative
exponents. Let's start off with the following definitions.

Definition 1 If A is a square matrix then,
$$A^0 = I \qquad\qquad A^n = \underbrace{AA \cdots A}_{n \text{ times}}, \quad n > 0$$

We’ve also got several of the standard integer exponent properties that we are used to working
with.

Properties of Matrix Exponents


If A is a square matrix and n and m are integers then,
An Am = An + m (A )n m
= An m

We can also talk about plugging matrices into polynomials using the following definition. If we
have the polynomial,
$$p(x) = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0$$
and A is a square matrix then,
$$p(A) = a_nA^n + a_{n-1}A^{n-1} + \cdots + a_1A + a_0I$$
where the identity matrix on the constant term $a_0$ has the same size as A.

Example 4 Evaluate each of the following for the given matrix.
$$A = \begin{bmatrix} -7 & 3 \\ 5 & 1 \end{bmatrix}$$
(a) $A^2$
(b) $A^3$
(c) $p(A)$ where $p(x) = -6x^3 + 10x - 9$

Solution
(a) There really isn't much to do with this problem. We'll leave it to you to verify the multiplication here.
$$A^2 = \begin{bmatrix} -7 & 3 \\ 5 & 1 \end{bmatrix}\begin{bmatrix} -7 & 3 \\ 5 & 1 \end{bmatrix} = \begin{bmatrix} 64 & -18 \\ -30 & 16 \end{bmatrix}$$

(b) In this case we may as well take advantage of the fact that we've got the result from the first part already. Again, we'll leave it to you to verify the multiplication.
$$A^3 = A^2A = \begin{bmatrix} 64 & -18 \\ -30 & 16 \end{bmatrix}\begin{bmatrix} -7 & 3 \\ 5 & 1 \end{bmatrix} = \begin{bmatrix} -538 & 174 \\ 290 & -74 \end{bmatrix}$$

(c) In this case we'll need the result from the second part. Outside of that there really isn't much to do here.
$$p(A) = -6A^3 + 10A - 9I = -6\begin{bmatrix} -538 & 174 \\ 290 & -74 \end{bmatrix} + 10\begin{bmatrix} -7 & 3 \\ 5 & 1 \end{bmatrix} - 9\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 3228 & -1044 \\ -1740 & 444 \end{bmatrix} + \begin{bmatrix} -70 & 30 \\ 50 & 10 \end{bmatrix} - \begin{bmatrix} 9 & 0 \\ 0 & 9 \end{bmatrix} = \begin{bmatrix} 3149 & -1014 \\ -1690 & 445 \end{bmatrix}$$
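For readers who like to double-check arithmetic by machine, here is a short NumPy sketch (our addition, not part of the original computation) that reproduces all three parts of Example 4:

```python
import numpy as np

A = np.array([[-7, 3], [5, 1]])

A2 = A @ A                         # [[ 64 -18] [-30  16]]
A3 = np.linalg.matrix_power(A, 3)  # [[-538  174] [ 290  -74]]

# p(A) = -6A^3 + 10A - 9I, with the identity sized to match A
pA = -6 * A3 + 10 * A - 9 * np.eye(2, dtype=int)
print(pA)                          # [[ 3149 -1014]
                                   #  [-1690   445]]
```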

The last topic in this section that we need to take care of is some quick properties of the transpose
of a matrix.


Properties of the Transpose

If A and B are matrices whose sizes are such that the given operations are defined and c is any scalar then,
1. $\left(A^T\right)^T = A$
2. $\left(A \pm B\right)^T = A^T \pm B^T$
3. $\left(cA\right)^T = cA^T$
4. $\left(AB\right)^T = B^TA^T$

The first three of these properties should be fairly obvious from the definition of the transpose. The fourth is a little trickier to see, but isn't that bad to verify.

Proof of #4 : We know that the entry in the ith row and jth column of AB is given by,
$$\left(AB\right)_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + a_{i3}b_{3j} + \cdots + a_{ip}b_{pj}$$
We also know that the entry in the ith row and jth column of $\left(AB\right)^T$ is found simply by interchanging the subscripts i and j and so it is,
$$\left(\left(AB\right)^T\right)_{ij} = \left(AB\right)_{ji} = a_{j1}b_{1i} + a_{j2}b_{2i} + a_{j3}b_{3i} + \cdots + a_{jp}b_{pi}$$

Now, let's denote the entries of $A^T$ and $B^T$ as $\bar{a}_{ij}$ and $\bar{b}_{ij}$ respectively. Again, based on the definition of the transpose we also know that,
$$A^T = \left[\bar{a}_{ij}\right] = \left[a_{ji}\right] \qquad\qquad B^T = \left[\bar{b}_{ij}\right] = \left[b_{ji}\right]$$
and so from this we see that $\bar{a}_{ij} = a_{ji}$ and $\bar{b}_{ij} = b_{ji}$. Finally, the entry in the ith row and jth column of $B^TA^T$ is given by,
$$\left(B^TA^T\right)_{ij} = \bar{b}_{i1}\bar{a}_{1j} + \bar{b}_{i2}\bar{a}_{2j} + \bar{b}_{i3}\bar{a}_{3j} + \cdots + \bar{b}_{ip}\bar{a}_{pj}$$

Now, plug in for $\bar{a}_{ij}$ and $\bar{b}_{ij}$ and we get that,
$$\left(B^TA^T\right)_{ij} = \bar{b}_{i1}\bar{a}_{1j} + \bar{b}_{i2}\bar{a}_{2j} + \cdots + \bar{b}_{ip}\bar{a}_{pj} = b_{1i}a_{j1} + b_{2i}a_{j2} + b_{3i}a_{j3} + \cdots + b_{pi}a_{jp} = a_{j1}b_{1i} + a_{j2}b_{2i} + a_{j3}b_{3i} + \cdots + a_{jp}b_{pi} = \left(\left(AB\right)^T\right)_{ij}$$

So, just what have we done here? We've managed to show that the entry in the ith row and jth column of $\left(AB\right)^T$ is equal to the entry in the ith row and jth column of $B^TA^T$. Therefore, since each of the entries are equal the matrices must also be equal.

Note that #4 can be naturally extended to more than two matrices. For example,
$$\left(ABC\right)^T = C^TB^TA^T$$
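Here is a quick numerical spot-check of property #4 and its three-matrix extension, sketched with NumPy on randomly generated matrices of compatible sizes (a verification aid of our own, not a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, (2, 3))
B = rng.integers(-5, 5, (3, 4))
C = rng.integers(-5, 5, (4, 2))

print(np.array_equal((A @ B).T, B.T @ A.T))            # True
print(np.array_equal((A @ B @ C).T, C.T @ B.T @ A.T))  # True
```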


Inverse Matrices and Elementary Matrices


Our main goal in this section is to define inverse matrices and to take a look at some nice
properties involving matrices. We won’t actually be finding any inverse matrices in this section.
That is the topic of the next section.

We’ll also take a quick look at elementary matrices which as we’ll see in the next section we can
use to help us find inverse matrices. Actually, that’s not totally true. We’ll use them to help us
devise a method for finding inverse matrices, but we won’t be explicitly using them to find the
inverse.

So, let’s start off with the definition of the inverse matrix.

Definition 1 If A is a square matrix and we can find another matrix of the same size, say B, such
that
AB = BA = I
then we call A invertible and we say that B is an inverse of the matrix A.

If we can’t find such a matrix B we call A a singular matrix.

Note that we only talk about inverse matrices for square matrices. Also note that if A is invertible
it will on occasion be called non-singular. We should also point out that we could also say that
B is invertible and that A is the inverse of B.

Before proceeding we need to show that the inverse of a matrix is unique, that is for a given
invertible matrix A there is exactly one inverse for the matrix.

Theorem 1 Suppose that A is invertible and that both B and C are inverses of A. Then $B = C$ and we will denote the inverse as $A^{-1}$.

Proof : Since B is an inverse of A we know that $AB = I$. Now multiply both sides of this (on the left) by C to get $C\left(AB\right) = CI = C$. However, by the associative law of matrix multiplication we can also write $C\left(AB\right)$ as $C\left(AB\right) = \left(CA\right)B = IB = B$. Therefore, putting these two pieces together we see that $C = C\left(AB\right) = B$, or $C = B$.

So, the inverse for a matrix is unique. To denote this fact we now will denote the inverse of the
matrix A as A-1 from this point on.

Example 1 Given the matrix A verify that the indicated matrix is in fact the inverse.
$$A = \begin{bmatrix} -4 & -2 \\ 5 & 5 \end{bmatrix} \qquad\qquad A^{-1} = \begin{bmatrix} -\tfrac{1}{2} & -\tfrac{1}{5} \\ \tfrac{1}{2} & \tfrac{2}{5} \end{bmatrix}$$
Solution
To verify that we do in fact have the inverse we'll need to check that
$$AA^{-1} = A^{-1}A = I$$
This is easy enough to do and so we'll leave it to you to verify the multiplication.
$$AA^{-1} = \begin{bmatrix} -4 & -2 \\ 5 & 5 \end{bmatrix}\begin{bmatrix} -\tfrac{1}{2} & -\tfrac{1}{5} \\ \tfrac{1}{2} & \tfrac{2}{5} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \qquad\qquad A^{-1}A = \begin{bmatrix} -\tfrac{1}{2} & -\tfrac{1}{5} \\ \tfrac{1}{2} & \tfrac{2}{5} \end{bmatrix}\begin{bmatrix} -4 & -2 \\ 5 & 5 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
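As a quick machine check of this verification, here is a NumPy sketch of our own; `np.linalg.inv` also computes the inverse directly:

```python
import numpy as np

A = np.array([[-4.0, -2.0], [5.0, 5.0]])
A_inv = np.array([[-0.5, -0.2], [0.5, 0.4]])  # the claimed inverse, as decimals

print(np.allclose(A @ A_inv, np.eye(2)))  # True
print(np.allclose(A_inv @ A, np.eye(2)))  # True
print(np.linalg.inv(A))                   # the same matrix, computed directly
```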

As the definition of an inverse matrix suggests, not every matrix will have an inverse. Here is an
example of a matrix without an inverse.

Example 2 The matrix below does not have an inverse.
$$B = \begin{bmatrix} 3 & 9 & 2 \\ 0 & 0 & 0 \\ -4 & -5 & 1 \end{bmatrix}$$

This is fairly simple to see. If B has an inverse then it must be a $3 \times 3$ matrix. So, let's just take any old $3 \times 3$ matrix,
$$C = \begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{bmatrix}$$
Now let's think about the product BC. We know that the 2nd row of BC can be found by looking at the following matrix multiplication,
$$\left[\text{2nd row of } B\right]C = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}$$

So, the second row of BC is $\begin{bmatrix} 0 & 0 & 0 \end{bmatrix}$, but if C is to be the inverse of B the product BC must be the identity matrix and this means that the second row must in fact be $\begin{bmatrix} 0 & 1 & 0 \end{bmatrix}$.

Now, C was a general $3 \times 3$ matrix and we've shown that the second row of BC is all zeroes, so the product will never be the identity matrix. Therefore B can't have an inverse and is a singular matrix.

In the previous section we introduced the idea of matrix exponentiation. However, we needed to
restrict ourselves to positive exponents. We can now take a look at negative exponents.

Definition 2 If A is a square matrix and $n > 0$ then,
$$A^{-n} = \left(A^{-1}\right)^n = \underbrace{A^{-1}A^{-1} \cdots A^{-1}}_{n \text{ times}}$$


Example 3 Compute $A^{-3}$ for the matrix,
$$A = \begin{bmatrix} -4 & -2 \\ 5 & 5 \end{bmatrix}$$
Solution
From Example 1 we know that the inverse of A is,
$$A^{-1} = \begin{bmatrix} -\tfrac{1}{2} & -\tfrac{1}{5} \\ \tfrac{1}{2} & \tfrac{2}{5} \end{bmatrix}$$
So, this is easy enough to compute.
$$A^{-3} = \left(A^{-1}\right)^3 = \begin{bmatrix} -\tfrac{1}{2} & -\tfrac{1}{5} \\ \tfrac{1}{2} & \tfrac{2}{5} \end{bmatrix}\begin{bmatrix} -\tfrac{1}{2} & -\tfrac{1}{5} \\ \tfrac{1}{2} & \tfrac{2}{5} \end{bmatrix}\begin{bmatrix} -\tfrac{1}{2} & -\tfrac{1}{5} \\ \tfrac{1}{2} & \tfrac{2}{5} \end{bmatrix} = \begin{bmatrix} \tfrac{3}{20} & \tfrac{1}{50} \\ -\tfrac{1}{20} & \tfrac{3}{50} \end{bmatrix}\begin{bmatrix} -\tfrac{1}{2} & -\tfrac{1}{5} \\ \tfrac{1}{2} & \tfrac{2}{5} \end{bmatrix} = \begin{bmatrix} -\tfrac{13}{200} & -\tfrac{11}{500} \\ \tfrac{11}{200} & \tfrac{17}{500} \end{bmatrix}$$
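For what it's worth, NumPy's `matrix_power` accepts negative exponents for invertible matrices, so the same computation can be sketched as follows (decimal output; the fractions above are the exact values):

```python
import numpy as np

A = np.array([[-4.0, -2.0], [5.0, 5.0]])

# matrix_power with a negative exponent computes (A^{-1})^n for us.
print(np.linalg.matrix_power(A, -3))
# [[-0.065 -0.022]
#  [ 0.055  0.034]]   i.e. [[-13/200, -11/500], [11/200, 17/500]]
```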

Next, let’s take a quick look at some nice facts about the inverse matrix.

Theorem 2 Suppose that A and B are invertible matrices of the same size. Then,
(a) AB is invertible and $\left(AB\right)^{-1} = B^{-1}A^{-1}$.
(b) $A^{-1}$ is invertible and $\left(A^{-1}\right)^{-1} = A$.
(c) For $n = 0, 1, 2, \ldots$, $A^n$ is invertible and $\left(A^n\right)^{-1} = A^{-n} = \left(A^{-1}\right)^n$.
(d) If c is any non-zero scalar then cA is invertible and $\left(cA\right)^{-1} = \frac{1}{c}A^{-1}$.
(e) $A^T$ is invertible and $\left(A^T\right)^{-1} = \left(A^{-1}\right)^T$.

Proof :
Note that in each case in order to prove that the given matrix is invertible all we need to do is show that the inverse is what we claim it to be. Also, don't get excited about showing that the inverse is what we claim it to be. In these cases all we need to do is show that the product (both left and right product) of the given matrix and what we claim is the inverse is the identity matrix. That's it.

Also, do not get excited about the inverse notation. For example, in the first one we state that $\left(AB\right)^{-1} = B^{-1}A^{-1}$. Remember that $\left(AB\right)^{-1}$ is just the notation that we use to denote the inverse of AB. This notation will not be used in the proof except in the final step to denote the inverse.

(a) Now, as suggested above showing this is not really all that difficult. All we need to do is show that $\left(AB\right)\left(B^{-1}A^{-1}\right) = I$ and $\left(B^{-1}A^{-1}\right)\left(AB\right) = I$. Here is that work.


$$\left(AB\right)\left(B^{-1}A^{-1}\right) = A\left(BB^{-1}\right)A^{-1} = AIA^{-1} = AA^{-1} = I$$
$$\left(B^{-1}A^{-1}\right)\left(AB\right) = B^{-1}\left(A^{-1}A\right)B = B^{-1}IB = B^{-1}B = I$$
So, we've shown both and so we now know that AB is in fact invertible (since we've found the inverse!) and that $\left(AB\right)^{-1} = B^{-1}A^{-1}$.

(b) Now, we know from the fact that A is invertible that
$$AA^{-1} = A^{-1}A = I$$
But this is telling us that if we multiply $A^{-1}$ by A on both sides then we'll get the identity matrix. But this is exactly what we need to show that $A^{-1}$ is invertible and that its inverse is A.

(c) The best way to prove this part is by a proof technique called induction. However, there's a chance that a good many of you don't know that and that isn't the point of this class. Luckily, for this part anyway, we can at least outline another way to prove this.

To officially prove this part we'll need to show that $\left(A^n\right)\left(A^{-n}\right) = \left(A^{-n}\right)\left(A^n\right) = I$. We'll show one of the equalities and leave the other to you to verify since the work is pretty much identical.
$$\left(A^n\right)\left(A^{-n}\right) = \Big(\underbrace{AA \cdots A}_{n \text{ times}}\Big)\Big(\underbrace{A^{-1}A^{-1} \cdots A^{-1}}_{n \text{ times}}\Big) = \Big(\underbrace{AA \cdots A}_{n-1 \text{ times}}\Big)\left(AA^{-1}\right)\Big(\underbrace{A^{-1} \cdots A^{-1}}_{n-1 \text{ times}}\Big)$$
but $AA^{-1} = I$ so,
$$= \Big(\underbrace{AA \cdots A}_{n-1 \text{ times}}\Big)\Big(\underbrace{A^{-1} \cdots A^{-1}}_{n-1 \text{ times}}\Big) = \text{etc.} = \left(AA\right)\left(A^{-1}A^{-1}\right) = A\left(AA^{-1}\right)A^{-1} = AA^{-1} = I$$
where we again used $AA^{-1} = I$ in the last steps.

Again, we'll leave the second product to you to verify, but the work is identical. After doing this product we can see that $A^n$ is invertible and $\left(A^n\right)^{-1} = A^{-n} = \left(A^{-1}\right)^n$.

(d) To prove this part we'll need to show that $\left(cA\right)\left(\frac{1}{c}A^{-1}\right) = \left(\frac{1}{c}A^{-1}\right)\left(cA\right) = I$. As with the last part we'll do half the work and leave the other half to you to verify.
$$\left(cA\right)\left(\frac{1}{c}A^{-1}\right) = \left(c \cdot \frac{1}{c}\right)\left(AA^{-1}\right) = \left(1\right)\left(I\right) = I$$
Upon doing the second product we can see that cA is invertible and $\left(cA\right)^{-1} = \frac{1}{c}A^{-1}$.
c

(e) This part will require us to show that $A^T\left(A^{-1}\right)^T = \left(A^{-1}\right)^TA^T = I$ and in keeping with tradition of the last couple parts we'll do the first one and leave the second one to you to verify.

This one is a little tricky at first, but once you realize the correct formula to use it's not too bad. Let's start with $A^T\left(A^{-1}\right)^T$ and then remember that $\left(CD\right)^T = D^TC^T$. Using this fact (backwards) on $A^T\left(A^{-1}\right)^T$ gives us,
$$A^T\left(A^{-1}\right)^T = \left(A^{-1}A\right)^T = I^T = I$$
Note that we used the fact that $I^T = I$ here which we'll leave to you to verify.

So, upon showing the second product we'll have that $A^T$ is invertible and $\left(A^T\right)^{-1} = \left(A^{-1}\right)^T$.

Note that the first part of this theorem can be easily extended to more than two matrices as follows,
$$\left(ABC\right)^{-1} = C^{-1}B^{-1}A^{-1}$$
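A quick numerical illustration of part (a), sketched with NumPy on a small invertible pair of our own choosing; note the reversed order on the right-hand side:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 1.0]])
B = np.array([[1.0, 3.0], [0.0, 1.0]])

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)  # B^{-1} A^{-1}, in that order
print(np.allclose(lhs, rhs))               # True
```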

Now, in the previous section we saw that in general we don’t have a cancellation law or a zero
factor property. However, if we restrict ourselves just a little we can get variations of both of
these.

Theorem 3 Suppose that A is an invertible matrix and that B, C, and D are matrices of the same size as A.
(a) If $AB = AC$ then $B = C$
(b) If $AD = 0$ then $D = 0$

Proof :
(a) Since we know that A is invertible we know that $A^{-1}$ exists so multiply on the left by $A^{-1}$ to get,
$$A^{-1}AB = A^{-1}AC \quad\Rightarrow\quad IB = IC \quad\Rightarrow\quad B = C$$
(b) Again we know that $A^{-1}$ exists so multiply on the left by $A^{-1}$ to get,
$$A^{-1}AD = A^{-1}0 \quad\Rightarrow\quad ID = 0 \quad\Rightarrow\quad D = 0$$

Note that this theorem only required that A be invertible; it is completely possible that the other matrices are singular.


Note as well with the first one that we've got to remember that matrix multiplication is not commutative and so if we have $AB = CA$ then there is no reason to think that $B = C$ even if A is invertible. Because we don't know that $CA = AC$ we've got to leave this as is. Also, when we multiply both sides of the equation by $A^{-1}$ we've got to multiply each side on the left or each side on the right, which is again because we don't have the commutative law with matrix multiplication. So, if we tried the above proof on $AB = CA$ we'd have,
$$A^{-1}AB = A^{-1}CA \qquad\text{OR}\qquad ABA^{-1} = CAA^{-1}$$
$$B = A^{-1}CA \qquad\qquad\qquad\quad ABA^{-1} = C$$

In either case we don't have $B = C$.

Okay, it is now time to take a quick look at elementary matrices.

Definition 3 A square matrix is called an elementary matrix if it can be obtained by applying a


single elementary row operation to the identity matrix of the same size.

Here are some examples of elementary matrices and the row operations that produced them.

Example 4 The following matrices are all elementary matrices. Also given is the row operation on the appropriately sized identity matrix.
$$\begin{bmatrix} 9 & 0 \\ 0 & 1 \end{bmatrix} \qquad 9R_1 \text{ on } I_2$$
$$\begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix} \qquad R_1 \leftrightarrow R_4 \text{ on } I_4$$
$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & -7 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad R_2 - 7R_3 \text{ on } I_4$$
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad 1 \cdot R_2 \text{ on } I_3$$

Note that the fourth example above shows that any identity matrix is also an elementary matrix since we can think of arriving at that matrix by taking one times any row (not just the second as we used) of the identity matrix.

Here’s a really nice theorem about elementary matrices that we’ll be using extensively to develop
a method for finding the inverse of a matrix.


Theorem 4 Suppose E is an elementary matrix that was found by applying an elementary row operation to $I_n$. Then if A is an $n \times m$ matrix, EA is the matrix that will result by applying the same row operation to A.

Example 5 For the following matrix perform the row operation $R_1 + 4R_2$ on it and then find the elementary matrix, E, for this operation and verify that EA will give the same result.
$$A = \begin{bmatrix} 4 & 5 & -6 & 1 & -1 \\ -1 & 2 & -1 & 10 & 3 \\ 3 & 0 & 4 & -4 & 7 \end{bmatrix}$$
Solution
Performing the row operation is easy enough.
$$\begin{bmatrix} 4 & 5 & -6 & 1 & -1 \\ -1 & 2 & -1 & 10 & 3 \\ 3 & 0 & 4 & -4 & 7 \end{bmatrix} \xrightarrow{R_1 + 4R_2} \begin{bmatrix} 0 & 13 & -10 & 41 & 11 \\ -1 & 2 & -1 & 10 & 3 \\ 3 & 0 & 4 & -4 & 7 \end{bmatrix}$$
Now, we can find E simply by applying the same operation to $I_3$ and so we have,
$$E = \begin{bmatrix} 1 & 4 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
We just need to verify that EA is then the same matrix that we got above.
$$EA = \begin{bmatrix} 1 & 4 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 4 & 5 & -6 & 1 & -1 \\ -1 & 2 & -1 & 10 & 3 \\ 3 & 0 & 4 & -4 & 7 \end{bmatrix} = \begin{bmatrix} 0 & 13 & -10 & 41 & 11 \\ -1 & 2 & -1 & 10 & 3 \\ 3 & 0 & 4 & -4 & 7 \end{bmatrix}$$

Sure enough the same matrix as the theorem predicted.
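The theorem is easy to experiment with in code. Here is a NumPy sketch of our own, using the matrices of Example 5, that builds E by applying the row operation to the identity and confirms that EA matches:

```python
import numpy as np

A = np.array([[4, 5, -6, 1, -1],
              [-1, 2, -1, 10, 3],
              [3, 0, 4, -4, 7]])

# Build E by applying R1 + 4*R2 to the 3x3 identity matrix ...
E = np.eye(3, dtype=int)
E[0] += 4 * E[1]

# ... and check that left-multiplying by E performs the same row operation on A.
A_op = A.copy()
A_op[0] += 4 * A_op[1]
print(np.array_equal(E @ A, A_op))  # True
```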

Now, let’s go back to Example 4 for a second and notice that we can apply a second row
operation to get the given elementary matrix back to the original identity matrix.

Example 6 Give the operation that will take the elementary matrices from Example 4 back to the original identity matrix.
$$\begin{bmatrix} 9 & 0 \\ 0 & 1 \end{bmatrix} \xrightarrow{\frac{1}{9}R_1} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
$$\begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{R_1 \leftrightarrow R_4} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & -7 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \xrightarrow{R_2 + 7R_3} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \xrightarrow{1 \cdot R_2} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

These kinds of operations are called inverse operations and each row operation will have an inverse operation associated with it. The following table gives the inverse operation for each row operation.

Row operation                      Inverse Operation
Multiply row i by $c \ne 0$        Multiply row i by $\frac{1}{c}$
Interchange rows i and j           Interchange rows j and i
Add c times row i to row j         Add $-c$ times row i to row j

Now that we’ve got inverse operations we can give the following theorem.

Theorem 5 Suppose that E is the elementary matrix associated with a particular row operation and that $E_0$ is the elementary matrix associated with the inverse operation. Then E is invertible and $E^{-1} = E_0$.

Proof : This is actually a really simple proof. Let's start with $E_0E$. We know from Theorem 4 that this is the same as if we'd applied the inverse operation to E, but we also know that inverse operations will take an elementary matrix back to the original identity matrix. Therefore we have,
$$E_0E = I$$
Likewise, if we look at $EE_0$ this will be the same as applying the original row operation to $E_0$. However, if you think about it this will only undo what the inverse operation did to the identity matrix and so we also have,
$$EE_0 = I$$
Therefore, we've proved that $EE_0 = E_0E = I$ and so E is invertible and $E^{-1} = E_0$.

Now, suppose that we’ve got two matrices of the same size A and B. If we can reach B by
applying a finite number of row operations to A then we call the two matrices row equivalent.
Note that this will also mean that we can reach A from B by applying the inverse operations in the
reverse order.


Example 7 Consider
$$A = \begin{bmatrix} 4 & 3 & -2 \\ -1 & 5 & 8 \end{bmatrix}$$
then
$$B = \begin{bmatrix} 4 & 3 & -2 \\ 14 & -1 & -22 \end{bmatrix}$$
is row equivalent to A because we reached B by first multiplying row 2 of A by $-2$ and then adding 3 times row 1 onto row 2.

For the practice let's do these operations using elementary matrices. Here are the elementary matrices (and their inverses) for the operations on A.
$$-2R_2:\qquad E_1 = \begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix} \qquad E_1^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & -\tfrac{1}{2} \end{bmatrix}$$
$$R_2 + 3R_1:\qquad E_2 = \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix} \qquad E_2^{-1} = \begin{bmatrix} 1 & 0 \\ -3 & 1 \end{bmatrix}$$

Now, to reach B Theorem 4 tells us that we need to multiply the left side of A by each of these in the same order as we applied the operations.
$$E_2E_1A = \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix}\begin{bmatrix} 4 & 3 & -2 \\ -1 & 5 & 8 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix}\begin{bmatrix} 4 & 3 & -2 \\ 2 & -10 & -16 \end{bmatrix} = \begin{bmatrix} 4 & 3 & -2 \\ 14 & -1 & -22 \end{bmatrix} = B$$

Sure enough we get B as we should.

Now, since A and B are row equivalent this means that we should be able to get to A from B by applying the inverse operations in the reverse order. Let's see if that does in fact work.
$$E_1^{-1}E_2^{-1}B = \begin{bmatrix} 1 & 0 \\ 0 & -\tfrac{1}{2} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ -3 & 1 \end{bmatrix}\begin{bmatrix} 4 & 3 & -2 \\ 14 & -1 & -22 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -\tfrac{1}{2} \end{bmatrix}\begin{bmatrix} 4 & 3 & -2 \\ 2 & -10 & -16 \end{bmatrix} = \begin{bmatrix} 4 & 3 & -2 \\ -1 & 5 & 8 \end{bmatrix} = A$$

So, we sure enough end up with the correct matrix and again remember that each time we multiplied the left side by an elementary matrix, Theorem 4 tells us that is the same thing as applying the associated row operation to the matrix.


Finding Inverse Matrices


In the previous section we introduced the idea of inverse matrices and elementary matrices. In
this section we need to devise a method for actually finding the inverse of a matrix and as we’ll
see this method will, in some way, involve elementary matrices, or at least the row operations that
they represent.

The first thing that we’ll need to do is take care of a couple of theorems.

Theorem 1 If A is an $n \times n$ matrix then the following statements are equivalent.
(a) A is invertible.
(b) The only solution to the system $A\mathbf{x} = \mathbf{0}$ is the trivial solution.
(c) A is row equivalent to $I_n$.
(d) A is expressible as a product of elementary matrices.

Before we get into the proof let’s say a couple of words about just what this theorem tells us and
how we go about proving something like this. First, when we have a set of statements and when
we say that they are equivalent then what we’re really saying is that either they are all true or they
are all false. In other words, if you know one of these statements is true about a matrix A then
they are all true for that matrix. Likewise, if one of these statements is false for a matrix A then
they are all false for that matrix.

To prove a set of equivalent statements we need to prove a string of implications. This string has
to be able to get from any one statement to any other through a finite number of steps. In this
case we’ll prove the following chain,

$$(a) \Rightarrow (b), \qquad (b) \Rightarrow (c), \qquad (c) \Rightarrow (d), \qquad (d) \Rightarrow (a)$$
By doing this if we know one of them to be true/false then we can follow this chain to get to any
of the others.

The actual proof will involve four parts, one for each implication. To prove a given implication
we’ll assume the statement on the left is true and show that this must in some way also force the
statement on the right to also be true. So, let’s get going.

Proof :
$(a) \Rightarrow (b)$ : So we'll assume that A is invertible and we need to show that this assumption also implies that $A\mathbf{x} = \mathbf{0}$ will have only the trivial solution. That's actually pretty easy to do. Since A is invertible we know that $A^{-1}$ exists. So, start by assuming that $\mathbf{x}_0$ is any solution to the system, plug this into the system and then multiply (on the left) both sides by $A^{-1}$ to get,
$$A^{-1}A\mathbf{x}_0 = A^{-1}\mathbf{0} \quad\Rightarrow\quad I\mathbf{x}_0 = \mathbf{0} \quad\Rightarrow\quad \mathbf{x}_0 = \mathbf{0}$$
So, $A\mathbf{x} = \mathbf{0}$ has only the trivial solution and we've managed to prove this implication.


$(b) \Rightarrow (c)$ : Here we're assuming that $A\mathbf{x} = \mathbf{0}$ will have only the trivial solution and we'll need to show that A is row equivalent to $I_n$. Recall that two matrices are row equivalent if we can get from one to the other by applying a finite set of elementary row operations.

Let's start off by writing down the augmented matrix for this system.
$$\left[\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & 0 \\ a_{21} & a_{22} & \cdots & a_{2n} & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} & 0 \end{array}\right]$$

Now, if we were going to solve this we would use elementary row operations to reduce it to reduced row-echelon form. We know that the solution to this system must be,
$$x_1 = 0, \quad x_2 = 0, \quad \ldots, \quad x_n = 0$$
by assumption. Therefore, we also know what the reduced row-echelon form of the augmented matrix must be since that must give the above solution. The reduced row-echelon form of this augmented matrix must be,
$$\left[\begin{array}{cccc|c} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{array}\right]$$
Now, the entries in the last column do not affect the values of the entries in the first n columns and so if we take the same set of elementary row operations and apply them to A we will get $I_n$ and so A is row equivalent to $I_n$ since we can get to $I_n$ by applying a finite set of row operations to A. Therefore this implication has been proven.

$(c) \Rightarrow (d)$ : In this case we're going to assume that A is row equivalent to $I_n$ and we'll need to show that A can be written as a product of elementary matrices.

So, since A is row equivalent to $I_n$ we know there is a finite set of elementary row operations that we can apply to A that will give us $I_n$. Let's suppose that these row operations are represented by the elementary matrices $E_1, E_2, \ldots, E_k$. Then by Theorem 4 of the previous section we know that applying each row operation to A is the same thing as multiplying the left side of A by each of the corresponding elementary matrices in the same order. So, we then know that we will have the following.
$$E_k \cdots E_2E_1A = I_n$$

Now, by Theorem 5 from the previous section we know that each of these elementary matrices is invertible and their inverses are also elementary matrices. So multiply the above equation (on the left) by $E_k^{-1}, \ldots, E_2^{-1}, E_1^{-1}$ (in that order) to get,
$$A = E_1^{-1}E_2^{-1} \cdots E_k^{-1}I_n = E_1^{-1}E_2^{-1} \cdots E_k^{-1}$$

So, we see that A is a product of elementary matrices and this implication is proven.

$(d) \Rightarrow (a)$ : Here we'll be assuming that A is a product of elementary matrices and we need to
show that A is invertible. This is probably the easiest implication to prove.

First, A is a product of elementary matrices. Now, by Theorem 5 from the previous section we
know each of these elementary matrices is invertible and by Theorem 2(a) also from the previous
section we know that a product of invertible matrices is also invertible. Therefore, A is invertible
since it can be written as a product of invertible matrices and we’ve proven this implication.

This theorem can actually be extended to include a couple more equivalent statements, but to do
that we need another theorem.

Theorem 2 Suppose that A is a square matrix then
(a) If B is a square matrix such that $BA = I$ then A is invertible and $A^{-1} = B$.
(b) If B is a square matrix such that $AB = I$ then A is invertible and $A^{-1} = B$.

Proof :
(a) This proof will need part (b) of Theorem 1. If we can show that $A\mathbf{x} = \mathbf{0}$ has only the trivial solution then by Theorem 1 we will know that A is invertible. So, let $\mathbf{x}_0$ be any solution to $A\mathbf{x} = \mathbf{0}$. Plug this into the equation and then multiply both sides (on the left) by B.
$$A\mathbf{x}_0 = \mathbf{0} \quad\Rightarrow\quad BA\mathbf{x}_0 = B\mathbf{0} \quad\Rightarrow\quad I\mathbf{x}_0 = \mathbf{0} \quad\Rightarrow\quad \mathbf{x}_0 = \mathbf{0}$$
So, this shows that any solution to $A\mathbf{x} = \mathbf{0}$ must be the trivial solution and so by Theorem 1, if one statement is true they all are, and so A is invertible. We know from the previous section that inverses are unique and because $BA = I$ we must then also have $A^{-1} = B$.

(b) In this case let's let $\mathbf{x}_0$ be any solution to $B\mathbf{x} = \mathbf{0}$. Then multiplying both sides (on the left) of this by A we can use a similar argument to that used in (a) to show that $\mathbf{x}_0$ must be the trivial solution and so B is an invertible matrix and that in fact $B^{-1} = A$. Now, this isn't quite what we were asked to prove, but it does in fact give us the proof. Because B is invertible and its inverse is A (by the above work) we know that,
$$AB = BA = I$$
but this is exactly what it means for A to be invertible and that $A^{-1} = B$. So, we are done.

So, what's the big deal with this theorem? Well, recall from the last section that in order to show that a matrix, B, was the inverse of A we needed to show that $AB = BA = I$. In other words, we needed to show that both of these products were the identity matrix. Theorem 2 tells us that all we really need to do is show one of them and we get the other one for free.

This theorem gives us the ability to add two equivalent statements to Theorem 1. Here is the improved Theorem 1.


Theorem 3 If A is an $n \times n$ matrix then the following statements are equivalent.
(a) A is invertible.
(b) The only solution to the system $A\mathbf{x} = \mathbf{0}$ is the trivial solution.
(c) A is row equivalent to $I_n$.
(d) A is expressible as a product of elementary matrices.
(e) $A\mathbf{x} = \mathbf{b}$ has exactly one solution for every $n \times 1$ matrix $\mathbf{b}$.
(f) $A\mathbf{x} = \mathbf{b}$ is consistent for every $n \times 1$ matrix $\mathbf{b}$.

Note that (e) and (f) appear to be the same on the surface, but recall that consistent only says that there is at least one solution. If a system is consistent there may be infinitely many solutions. What this part is telling us is that if the system is consistent for any choice of $\mathbf{b}$ that we choose to put into the system then we will in fact only get a single solution. If even one $\mathbf{b}$ gives infinitely many solutions then (f) is false, which in turn makes all the other statements false.

Okay so how do we go about proving this? We've already proved that the first four statements are equivalent above so there's no reason to redo that work. This means that all we need to do is prove that one of the original statements implies the two new statements and these in turn imply one of the four original statements. We'll do this by proving the following implications,
$$(a) \Rightarrow (e) \Rightarrow (f) \Rightarrow (a)$$
Proof :
$(a) \Rightarrow (e)$ : Okay with this implication we'll assume that A is invertible and we'll need to show that $A\mathbf{x} = \mathbf{b}$ has exactly one solution for every $n \times 1$ matrix $\mathbf{b}$. This is actually very simple to do. Since A is invertible we know that $A^{-1}$ exists so we'll do the following.
$$A^{-1}A\mathbf{x} = A^{-1}\mathbf{b} \quad\Rightarrow\quad I\mathbf{x} = A^{-1}\mathbf{b} \quad\Rightarrow\quad \mathbf{x} = A^{-1}\mathbf{b}$$

So, if A is invertible we've shown that the solution to the system will be $\mathbf{x} = A^{-1}\mathbf{b}$ and since matrix multiplication is unique (i.e. we aren't going to get two different answers from the multiplication) the solution must also be unique and so there is exactly one solution to the system.

$(e) \Rightarrow (f)$ : This implication is trivial. We'll start off by assuming that the system $A\mathbf{x} = \mathbf{b}$ has exactly one solution for every $n \times 1$ matrix $\mathbf{b}$, but that also means that the system is consistent for every $n \times 1$ matrix $\mathbf{b}$ and so we're done with the proof of this implication.

$(f) \Rightarrow (a)$ : Here we'll start off by assuming that $A\mathbf{x} = \mathbf{b}$ is consistent for every $n \times 1$ matrix $\mathbf{b}$ and we'll need to show that this implies A is invertible. So, if $A\mathbf{x} = \mathbf{b}$ is consistent for every $n \times 1$ matrix $\mathbf{b}$ it is consistent for the following n systems.


$$A\mathbf{x} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \qquad A\mathbf{x} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \qquad \cdots \qquad A\mathbf{x} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}$$
where each right-hand side is an $n \times 1$ matrix.

Since we know each of these systems has solutions let $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ be those solutions and form a new matrix, B, with these solutions as its columns. In other words,
$$B = \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_n \end{bmatrix}$$

Now let's take a look at the product AB. We know from the matrix arithmetic section that the ith column of AB will be given by $A\mathbf{x}_i$ and we know what each of these products will be since $\mathbf{x}_i$ is a solution to one of the systems above. So, let's use all this knowledge to see what the product AB is.
$$AB = \begin{bmatrix} A\mathbf{x}_1 & A\mathbf{x}_2 & \cdots & A\mathbf{x}_n \end{bmatrix} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = I$$

So, we've shown that $AB = I$, but by Theorem 2 this means that A must be invertible and so we're done with the proof.

Before proceeding let's notice that part (c) of this theorem is also telling us that if we reduce A down to reduced row-echelon form then we'd have $I_n$. This can also be seen in the proof in Theorem 1 of the implication $(b) \Rightarrow (c)$.

So, just how does this theorem help us to determine the inverse of a matrix? Well, first let's assume that A is in fact invertible and so all the statements in Theorem 3 are true. Now, go back to the proof of the implication $(c) \Rightarrow (d)$. In this proof we saw that there were elementary matrices, $E_1, E_2, \ldots, E_k$, so that we'd get the following,
$$E_k \cdots E_2E_1A = I_n$$

Since we know A is invertible we know that $A^{-1}$ exists and so multiply (on the right) each side of this to get,
$$E_k \cdots E_2E_1AA^{-1} = I_nA^{-1} \quad\Rightarrow\quad A^{-1} = E_k \cdots E_2E_1I_n$$

What this tells us is that we need to find a series of row operations that will reduce A to $I_n$ and then apply the same set of operations to $I_n$ and the result will be the inverse, $A^{-1}$.


Okay, all this is fine. We can write down a bunch of symbols to tell us how to find the inverse, but that doesn't always help to actually find the inverse. The work above tells us that we need to identify a series of elementary row operations that will reduce A to $I_n$ and then apply those operations to $I_n$. Well, it turns out that we can do both of these steps simultaneously and we don't need to mess around with the elementary matrices.

Let's start off by supposing that A is an invertible $n \times n$ matrix and then form the following new matrix.
$$\begin{bmatrix} A & I_n \end{bmatrix}$$
Note that all we did here was tack $I_n$ onto the original matrix A. Now, if we apply a row operation to this it will be equivalent to applying it simultaneously to both A and to $I_n$. So, all we need to do is find a series of row operations that will reduce the "A" portion of this to $I_n$, making sure to apply the operations to the whole matrix. Once we've done this we will have,
$$\begin{bmatrix} I_n & A^{-1} \end{bmatrix}$$
provided A is in fact invertible of course. We'll deal with singular matrices in a bit.

Let’s take a look at a couple of examples.

Example 1 Determine the inverse of the following matrix given that it is invertible.
$$A = \begin{bmatrix} -4 & -2 \\ 5 & 5 \end{bmatrix}$$
Solution
Note that this is the $2 \times 2$ matrix we looked at in Example 1 of the previous section. In that example we stated (and proved) that the inverse was,
$$A^{-1} = \begin{bmatrix} -\tfrac{1}{2} & -\tfrac{1}{5} \\ \tfrac{1}{2} & \tfrac{2}{5} \end{bmatrix}$$
We can now show how we arrived at this for the inverse. We'll first form the new matrix
$$\left[\begin{array}{rr|rr} -4 & -2 & 1 & 0 \\ 5 & 5 & 0 & 1 \end{array}\right]$$
Next we'll find row operations that will convert the first two columns into $I_2$ and the third and fourth columns should then contain $A^{-1}$. Here is that work,
$$\left[\begin{array}{rr|rr} -4 & -2 & 1 & 0 \\ 5 & 5 & 0 & 1 \end{array}\right] \xrightarrow{R_1 + R_2} \left[\begin{array}{rr|rr} 1 & 3 & 1 & 1 \\ 5 & 5 & 0 & 1 \end{array}\right] \xrightarrow{R_2 - 5R_1} \left[\begin{array}{rr|rr} 1 & 3 & 1 & 1 \\ 0 & -10 & -5 & -4 \end{array}\right] \xrightarrow{-\frac{1}{10}R_2} \left[\begin{array}{rr|rr} 1 & 3 & 1 & 1 \\ 0 & 1 & \tfrac{1}{2} & \tfrac{2}{5} \end{array}\right] \xrightarrow{R_1 - 3R_2} \left[\begin{array}{rr|rr} 1 & 0 & -\tfrac{1}{2} & -\tfrac{1}{5} \\ 0 & 1 & \tfrac{1}{2} & \tfrac{2}{5} \end{array}\right]$$
So, the first two columns are in fact $I_2$ and in the third and fourth columns we've got the inverse,
$$A^{-1} = \begin{bmatrix} -\tfrac{1}{2} & -\tfrac{1}{5} \\ \tfrac{1}{2} & \tfrac{2}{5} \end{bmatrix}$$
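The whole $\left[A \mid I_n\right]$ procedure is mechanical enough to code directly. Below is a minimal Gauss-Jordan sketch of our own in Python/NumPy; the function name and the row-swap safeguard are our additions (this example never needs a swap, but a general routine does):

```python
import numpy as np

def inverse_via_gauss_jordan(A):
    """Reduce [A | I] until the left block is I; the right block is then A^{-1}."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if M[r, col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[[col, pivot]] = M[[pivot, col]]
        M[col] /= M[col, col]               # scale the pivot row to a leading 1
        for r in range(n):
            if r != col:
                M[r] -= M[r, col] * M[col]  # zero out the rest of the column
    return M[:, n:]

A = np.array([[-4, -2], [5, 5]])
print(inverse_via_gauss_jordan(A))  # [[-0.5 -0.2]
                                    #  [ 0.5  0.4]]
```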


Example 2 Determine the inverse of the following matrix given that it is invertible.
$$C = \begin{bmatrix} 3 & 1 & 0 \\ -1 & 2 & 2 \\ 5 & 0 & -1 \end{bmatrix}$$
Solution
Okay we'll first form the new matrix,
$$\left[\begin{array}{rrr|rrr} 3 & 1 & 0 & 1 & 0 & 0 \\ -1 & 2 & 2 & 0 & 1 & 0 \\ 5 & 0 & -1 & 0 & 0 & 1 \end{array}\right]$$
and we'll use elementary row operations to reduce the first three columns to $I_3$ and then the last three columns will be the inverse of C. Here is that work.
$$\left[\begin{array}{rrr|rrr} 3 & 1 & 0 & 1 & 0 & 0 \\ -1 & 2 & 2 & 0 & 1 & 0 \\ 5 & 0 & -1 & 0 & 0 & 1 \end{array}\right] \xrightarrow{R_1 + 2R_2} \left[\begin{array}{rrr|rrr} 1 & 5 & 4 & 1 & 2 & 0 \\ -1 & 2 & 2 & 0 & 1 & 0 \\ 5 & 0 & -1 & 0 & 0 & 1 \end{array}\right] \xrightarrow[R_3 - 5R_1]{R_2 + R_1} \left[\begin{array}{rrr|rrr} 1 & 5 & 4 & 1 & 2 & 0 \\ 0 & 7 & 6 & 1 & 3 & 0 \\ 0 & -25 & -21 & -5 & -10 & 1 \end{array}\right] \xrightarrow{\frac{1}{7}R_2} \left[\begin{array}{rrr|rrr} 1 & 5 & 4 & 1 & 2 & 0 \\ 0 & 1 & \tfrac{6}{7} & \tfrac{1}{7} & \tfrac{3}{7} & 0 \\ 0 & -25 & -21 & -5 & -10 & 1 \end{array}\right]$$
$$\xrightarrow{R_3 + 25R_2} \left[\begin{array}{rrr|rrr} 1 & 5 & 4 & 1 & 2 & 0 \\ 0 & 1 & \tfrac{6}{7} & \tfrac{1}{7} & \tfrac{3}{7} & 0 \\ 0 & 0 & \tfrac{3}{7} & -\tfrac{10}{7} & \tfrac{5}{7} & 1 \end{array}\right] \xrightarrow{\frac{7}{3}R_3} \left[\begin{array}{rrr|rrr} 1 & 5 & 4 & 1 & 2 & 0 \\ 0 & 1 & \tfrac{6}{7} & \tfrac{1}{7} & \tfrac{3}{7} & 0 \\ 0 & 0 & 1 & -\tfrac{10}{3} & \tfrac{5}{3} & \tfrac{7}{3} \end{array}\right] \xrightarrow[R_1 - 4R_3]{R_2 - \frac{6}{7}R_3} \left[\begin{array}{rrr|rrr} 1 & 5 & 0 & \tfrac{43}{3} & -\tfrac{14}{3} & -\tfrac{28}{3} \\ 0 & 1 & 0 & 3 & -1 & -2 \\ 0 & 0 & 1 & -\tfrac{10}{3} & \tfrac{5}{3} & \tfrac{7}{3} \end{array}\right] \xrightarrow{R_1 - 5R_2} \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & -\tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} \\ 0 & 1 & 0 & 3 & -1 & -2 \\ 0 & 0 & 1 & -\tfrac{10}{3} & \tfrac{5}{3} & \tfrac{7}{3} \end{array}\right]$$

So, we've gotten the first three columns reduced to $I_3$ and that means the last three must be the inverse.
$$C^{-1} = \begin{bmatrix} -\tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} \\ 3 & -1 & -2 \\ -\tfrac{10}{3} & \tfrac{5}{3} & \tfrac{7}{3} \end{bmatrix}$$

We'll leave it to you to verify that $CC^{-1} = C^{-1}C = I_3$.

Okay, so far we've seen how to use the method above to determine an inverse, but what happens if a matrix doesn't have an inverse? Well, it turns out that we can also use this method to determine that as well, and it generally doesn't take quite as much work as it does to actually find the inverse (if it exists of course...).

Let’s take a look at an example of that.


Example 3 Show that the following matrix does not have an inverse, i.e. show the matrix is singular.
$$B = \begin{bmatrix} 3 & 3 & 6 \\ 0 & 1 & 2 \\ -2 & 0 & 0 \end{bmatrix}$$
Solution
Okay, the problem statement says that the matrix is singular, but let's pretend that we didn't know that and work the problem as we did in the previous two examples. That means we'll need the new matrix,
$$\left[\begin{array}{rrr|rrr} 3 & 3 & 6 & 1 & 0 & 0 \\ 0 & 1 & 2 & 0 & 1 & 0 \\ -2 & 0 & 0 & 0 & 0 & 1 \end{array}\right]$$
Now, let's get started on getting the first three columns reduced to $I_3$.
$$\left[\begin{array}{rrr|rrr} 3 & 3 & 6 & 1 & 0 & 0 \\ 0 & 1 & 2 & 0 & 1 & 0 \\ -2 & 0 & 0 & 0 & 0 & 1 \end{array}\right] \xrightarrow{R_1 + R_3} \left[\begin{array}{rrr|rrr} 1 & 3 & 6 & 1 & 0 & 1 \\ 0 & 1 & 2 & 0 & 1 & 0 \\ -2 & 0 & 0 & 0 & 0 & 1 \end{array}\right] \xrightarrow{R_3 + 2R_1} \left[\begin{array}{rrr|rrr} 1 & 3 & 6 & 1 & 0 & 1 \\ 0 & 1 & 2 & 0 & 1 & 0 \\ 0 & 6 & 12 & 2 & 0 & 3 \end{array}\right] \xrightarrow{R_3 - 6R_2} \left[\begin{array}{rrr|rrr} 1 & 3 & 6 & 1 & 0 & 1 \\ 0 & 1 & 2 & 0 & 1 & 0 \\ 0 & 0 & 0 & 2 & -6 & 3 \end{array}\right]$$

At this point let's stop and examine the third row in a little more detail. In order for the first three columns to be $I_3$ the first three entries of the last row MUST be $\begin{bmatrix} 0 & 0 & 1 \end{bmatrix}$ which we clearly don't have. We could use a multiple of row 1 or row 2 to get a 1 in the third spot, but that would in turn change at least one of the first two entries away from 0. That's a problem since they must remain zeroes.

In other words, there is no way to make the third entry in the third row a 1 without also changing one or both of the first two entries into something other than zero and so we will never be able to make the first three columns into $I_3$.

So, there are no sets of row operations that will reduce B to $I_3$ and hence B is NOT row equivalent to $I_3$. Now, go back to Theorem 3. This was a set of equivalent statements and if one is false they are all false. We've just managed to show that part (c) is false and that means that part (a) must also be false. Therefore, B must be a singular matrix.

The idea used in this last example to show that B was singular can be used in general. If, in the
course of reducing the new matrix, we ever end up with a row in which all the entries to the left
of the dashed line are zeroes we will know that the matrix must be singular.

We’ll leave this section off with a quick formula that can always be used to find the inverse of an
invertible 2 ´ 2 matrix as well as a way to quickly determine if the matrix is invertible. The


above method is nice in that it always works, but it can be cumbersome to use so the following formula can help to make things go quicker for $2 \times 2$ matrices.

Theorem 4 The matrix
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$
will be invertible if $ad - bc \ne 0$ and singular if $ad - bc = 0$. If the matrix is invertible its inverse will be,
$$A^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

Let’s do a quick example or two of this fact.

Example 4 Use the fact to show that
$$A = \begin{bmatrix} -4 & -2 \\ 5 & 5 \end{bmatrix}$$
is an invertible matrix and find its inverse.

Solution
We've already looked at this one above, but let's do it here so we can contrast the work between the two methods. First, we need,
$$ad - bc = \left(-4\right)\left(5\right) - \left(5\right)\left(-2\right) = -10 \ne 0$$

So, the matrix is in fact invertible by the fact and here is the inverse,
$$A^{-1} = \frac{1}{-10}\begin{bmatrix} 5 & 2 \\ -5 & -4 \end{bmatrix} = \begin{bmatrix} -\tfrac{1}{2} & -\tfrac{1}{5} \\ \tfrac{1}{2} & \tfrac{2}{5} \end{bmatrix}$$

Example 5 Determine if the following matrix is singular.
$$B = \begin{bmatrix} -4 & -2 \\ 6 & 3 \end{bmatrix}$$
Solution
Not much to do with this one.
$$\left(-4\right)\left(3\right) - \left(-2\right)\left(6\right) = 0$$
So, by the fact the matrix is singular.
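Theorem 4 translates into a few lines of code. Here is a sketch of our own (the helper name `inv_2x2` is hypothetical, not from the notes):

```python
import numpy as np

def inv_2x2(M):
    """2x2 inverse via the ad - bc formula from Theorem 4."""
    a, b = M[0]
    c, d = M[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc = 0, so the matrix is singular")
    return np.array([[d, -b], [-c, a]]) / det

print(inv_2x2(np.array([[-4, -2], [5, 5]])))  # [[-0.5 -0.2]
                                              #  [ 0.5  0.4]]
```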

If you'd like to see a couple more examples of finding inverses check out the section on Special Matrices; there are a couple more examples there.


Special Matrices
This section is devoted to a couple of special matrices that we could have talked about pretty
much anywhere, but due to the desire to keep most of these sections as small as possible they just
didn’t fit in anywhere. However, we’ll need a couple of these in the next section and so we now
need to get them out of the way.

Diagonal Matrix
This first one that we're going to take a look at is a diagonal matrix. A square matrix is called diagonal if it has the following form.
$$D = \begin{bmatrix} d_1 & 0 & 0 & \cdots & 0 \\ 0 & d_2 & 0 & \cdots & 0 \\ 0 & 0 & d_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & d_n \end{bmatrix}_{\,n \times n}$$
In other words, a diagonal matrix is any matrix in which the only potentially non-zero entries are on the main diagonal. Any entry off the main diagonal must be zero and note that it is possible to have one or more of the main diagonal entries be zero.

We've also been dealing with a diagonal matrix already to this point if you think about it a little. The identity matrix is a diagonal matrix.

Here is a nice theorem about diagonal matrices.

Theorem 1 Suppose D is a diagonal matrix and $d_1, d_2, \ldots, d_n$ are the entries on the main diagonal. If one or more of the $d_i$'s are zero then the matrix is singular. On the other hand if $d_i \ne 0$ for all i then the matrix is invertible and the inverse is,
$$D^{-1} = \begin{bmatrix} \tfrac{1}{d_1} & 0 & 0 & \cdots & 0 \\ 0 & \tfrac{1}{d_2} & 0 & \cdots & 0 \\ 0 & 0 & \tfrac{1}{d_3} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \tfrac{1}{d_n} \end{bmatrix}$$

Proof : First, recall Theorem 3 from the previous section. This theorem tells us that if D is row equivalent to the identity matrix then D is invertible and if D is not row equivalent to the identity then D is singular.


If none of the $d_i$'s are zero then we can reduce D to the identity simply by dividing each of the rows by its diagonal entry (which we can do since we've assumed none of them are zero) and so in this case D will be row equivalent to the identity. Therefore, in this case D is invertible. We'll leave it to you to verify that the inverse is what we claim it to be. You can either compute this directly using the method from the previous section or you can verify that $DD^{-1} = D^{-1}D = I$.

Now, suppose that at least one of the $d_i$ is equal to zero. In this case we will have a row of all zeroes, and because D is a diagonal matrix all the other entries in that row's column are also zero. So there is no way for us to use elementary row operations to put a 1 into the main diagonal entry, and in this case D will not be row equivalent to the identity and hence must be singular.

Powers of diagonal matrices are also easy to compute. If D is a diagonal matrix and k is any integer then
$$D^k = \begin{bmatrix} d_1^{\,k} & 0 & 0 & \cdots & 0 \\ 0 & d_2^{\,k} & 0 & \cdots & 0 \\ 0 & 0 & d_3^{\,k} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & d_n^{\,k} \end{bmatrix}$$
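Both facts are easy to confirm numerically. A quick NumPy sketch of our own (`np.diag` builds a diagonal matrix from a list of diagonal entries):

```python
import numpy as np

d = np.array([2.0, -3.0, 5.0])
D = np.diag(d)

# The inverse and powers act entry-by-entry on the main diagonal.
print(np.allclose(np.linalg.inv(D), np.diag(1 / d)))             # True
print(np.allclose(np.linalg.matrix_power(D, 4), np.diag(d**4)))  # True
```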

Triangular Matrix
The next kind of matrix we want to take a look at will be triangular matrices. In fact there are
actually two kinds of triangular matrix. For an upper triangular matrix the matrix must be
square and all the entries below the main diagonal are zero and the main diagonal entries and the
entries above it may or may not be zero. A lower triangular matrix is just the opposite. The
matrix is still a square matrix and all the entries of a lower triangular matrix above the main
diagonal are zero and the main diagonal entries and those below it may or may not be zero.

Here are the general forms of an upper and lower triangular matrix.
$$U = \begin{bmatrix} u_{11} & u_{12} & u_{13} & \cdots & u_{1n} \\ 0 & u_{22} & u_{23} & \cdots & u_{2n} \\ 0 & 0 & u_{33} & \cdots & u_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & u_{nn} \end{bmatrix}_{\,n \times n} \qquad \text{Upper Triangular}$$
$$L = \begin{bmatrix} l_{11} & 0 & 0 & \cdots & 0 \\ l_{21} & l_{22} & 0 & \cdots & 0 \\ l_{31} & l_{32} & l_{33} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & l_{n3} & \cdots & l_{nn} \end{bmatrix}_{\,n \times n} \qquad \text{Lower Triangular}$$
In these forms the $u_{ij}$ and $l_{ij}$ may or may not be zero.


If we do not care if the matrix is upper or lower triangular we will generally just call it
triangular.

Note as well that a diagonal matrix can be thought of as both an upper triangular matrix and a
lower triangular matrix.

Here’s a nice theorem about the invertibility of a triangular matrix.

Theorem 2 If A is a triangular matrix with main diagonal entries $a_{11}, a_{22}, \ldots, a_{nn}$ then if one or more of the $a_{ii}$'s are zero the matrix will be singular. On the other hand if $a_{ii} \ne 0$ for all i then the matrix is invertible.

Here is the outline of the proof.

Proof Outline : First assume that $a_{ii} \ne 0$ for all i. In this case we can divide each row by $a_{ii}$ (since it's not zero) and that will put a 1 in the main diagonal entry for each row. Now use the third row operation to eliminate all the non-zero entries above the main diagonal entry for an upper triangular matrix or below it for a lower triangular matrix. When done with these operations we will have reduced A to the identity matrix. Therefore, in this case A is row equivalent to the identity and so must be invertible.

Now assume that at least one of the $a_{ii}$ is zero. In this case we can't get a 1 in the main diagonal entry just by dividing by $a_{ii}$ as we did in the first case. Now, for a second let's suppose we have an upper triangular matrix. In this case we could use the third row operation using one of the rows above this one to get a 1 into the main diagonal entry, however, this will also put non-zero entries into the entries to the left of this as well. In other words, we're not going to be able to reduce A to the identity matrix. The same type of problem will arise if we've got a lower triangular matrix.

In this case, A will not be row equivalent to the identity and so will be singular.

Here is another set of theorems about triangular matrices that we aren’t going to prove.

Theorem 3
(a) The product of lower triangular matrices will be a lower triangular matrix.
(b) The product of upper triangular matrices will be an upper triangular matrix.
(c) The inverse of an invertible lower triangular matrix will be a lower triangular matrix.
(d) The inverse of an invertible upper triangular matrix will be an upper triangular
matrix.

The proof of these will pretty much follow from how products and inverses are found and so will
be left to you to verify.

The final kind of matrix that we want to look at in this section is that of a symmetric matrix. In
fact we’ve already seen these in a previous section we just didn’t have the space to investigate
them in more detail in that section so we’re going to do it here.


For completeness sake we'll give the definition here again. Suppose that A is an $n \times m$ matrix, then A will be called symmetric if $A = A^T$.

Note that the first requirement for a matrix to be symmetric is that the matrix must be square. Since the size of $A^T$ will be $m \times n$ there is no way A and $A^T$ can be equal if A is not square since they won't have the same size.

Example 1 The following matrices are all symmetric.
$$A = \begin{bmatrix} 4 & 6 \\ 6 & -7 \end{bmatrix} \qquad B = \begin{bmatrix} 6 & -10 & 3 & 0 \\ -10 & 0 & 1 & -4 \\ 3 & 1 & 12 & 8 \\ 0 & -4 & 8 & 5 \end{bmatrix} \qquad C = \begin{bmatrix} 10 \end{bmatrix}$$

We'll leave it to you to compute the transposes of each of these and verify that they are in fact symmetric. Notice with the second matrix (B) above that you can always quickly identify a symmetric matrix by looking at the diagonals off the main diagonal. The diagonals right above and below the main diagonal consist of the entries $-10$, 1, 8 and are identical. Likewise, the diagonals two above and below the main diagonal consist of the entries 3, $-4$ and again are identical. Finally, the "diagonals" that are three above and below the main diagonal are identical as well.

This idea we see in the second matrix above will be true in any symmetric matrix.

Here is a nice set of facts about arithmetic with symmetric matrices.

Theorem 4 If A and B are symmetric matrices of the same size and c is any scalar then,
(a) $A \pm B$ is symmetric.
(b) $cA$ is symmetric.
(c) $A^T$ is symmetric.

Note that the product of two symmetric matrices is probably not symmetric. To see why this is consider the following. Suppose both A and B are symmetric matrices of the same size then,
$$\left(AB\right)^T = B^TA^T = BA$$
Notice that we used one of the properties of transposes we found earlier in the first step and the fact that A and B are symmetric in the last step.

So what this tells us is that unless A and B commute we won't have $\left(AB\right)^T = AB$ and the product won't be symmetric. If A and B do commute then the product will be symmetric.

Now, if A is any $n \times m$ matrix then because $A^T$ will have size $m \times n$ both $AA^T$ and $A^TA$ will be defined and in fact will be square matrices where $AA^T$ has size $n \times n$ and $A^TA$ has size $m \times m$.

Here are a couple of quick facts about symmetric matrices.


Theorem 5
(a) For any matrix A both $AA^T$ and $A^TA$ are symmetric.
(b) If A is an invertible symmetric matrix then $A^{-1}$ is symmetric.
(c) If A is invertible then $AA^T$ and $A^TA$ are both invertible.

Proof :
(a) We'll show that $AA^T$ is symmetric and leave the other to you to verify. To show that $AA^T$ is symmetric we'll need to show that $\left(AA^T\right)^T = AA^T$. This is actually quite simple if we recall the various properties of transpose matrices that we've got.
$$\left(AA^T\right)^T = \left(A^T\right)^TA^T = \left(A\right)A^T = AA^T$$

(b) In this case all we need is a theorem from a previous section to show that $\left(A^{-1}\right)^T = A^{-1}$. Here is the work,
$$\left(A^{-1}\right)^T = \left(A^T\right)^{-1} = \left(A\right)^{-1} = A^{-1}$$

(c) If A is invertible then we also know that $A^T$ is invertible and since the product of invertible matrices is invertible both $AA^T$ and $A^TA$ are invertible.
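Part (a) is easy to see in action. Here is a small NumPy sketch of our own, starting from a matrix that is neither square nor symmetric:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(-5, 5, (3, 4))  # not square, certainly not symmetric

S1 = A @ A.T   # 3x3
S2 = A.T @ A   # 4x4
print(np.array_equal(S1, S1.T), np.array_equal(S2, S2.T))  # True True
```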

Let’s finish this section with an example or two illustrating the results of some of the theorems
above.

Example 2 Given the following matrices compute the indicated quantities.
$$A = \begin{bmatrix} 4 & -2 & 1 \\ 0 & 9 & -6 \\ 0 & 0 & -1 \end{bmatrix} \qquad B = \begin{bmatrix} -2 & 0 & 3 \\ 0 & 7 & -1 \\ 0 & 0 & 5 \end{bmatrix} \qquad C = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 9 & 5 & 4 \end{bmatrix}$$
$$D = \begin{bmatrix} -2 & 0 & -4 & 1 \\ 1 & 0 & -1 & 6 \\ 8 & 2 & 1 & -1 \end{bmatrix} \qquad E = \begin{bmatrix} 1 & -2 & 0 \\ -2 & 3 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$
(a) $AB$
(b) $C^{-1}$
(c) $D^TD$
(d) $E^{-1}$

Solution
(a) $AB$
There really isn't much to do here other than the multiplication and we'll leave it to you to verify the actual multiplication.
$$AB = \begin{bmatrix} -8 & -14 & 19 \\ 0 & 63 & -39 \\ 0 & 0 & -5 \end{bmatrix}$$
So, as suggested by Theorem 3 the product of upper triangular matrices is in fact an upper triangular matrix.



(b) $C^{-1}$
Here's the work for finding $C^{-1}$.
$$\left[\begin{array}{rrr|rrr} 3 & 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & 0 & 0 & 1 & 0 \\ 9 & 5 & 4 & 0 & 0 & 1 \end{array}\right] \xrightarrow[\frac{1}{2}R_2]{\frac{1}{3}R_1} \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & \tfrac{1}{3} & 0 & 0 \\ 0 & 1 & 0 & 0 & \tfrac{1}{2} & 0 \\ 9 & 5 & 4 & 0 & 0 & 1 \end{array}\right] \xrightarrow{R_3 - 9R_1} \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & \tfrac{1}{3} & 0 & 0 \\ 0 & 1 & 0 & 0 & \tfrac{1}{2} & 0 \\ 0 & 5 & 4 & -3 & 0 & 1 \end{array}\right] \xrightarrow{R_3 - 5R_2} \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & \tfrac{1}{3} & 0 & 0 \\ 0 & 1 & 0 & 0 & \tfrac{1}{2} & 0 \\ 0 & 0 & 4 & -3 & -\tfrac{5}{2} & 1 \end{array}\right] \xrightarrow{\frac{1}{4}R_3} \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & \tfrac{1}{3} & 0 & 0 \\ 0 & 1 & 0 & 0 & \tfrac{1}{2} & 0 \\ 0 & 0 & 1 & -\tfrac{3}{4} & -\tfrac{5}{8} & \tfrac{1}{4} \end{array}\right]$$
$$\Rightarrow\qquad C^{-1} = \begin{bmatrix} \tfrac{1}{3} & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ -\tfrac{3}{4} & -\tfrac{5}{8} & \tfrac{1}{4} \end{bmatrix}$$
So, again as suggested by Theorem 3 the inverse of a lower triangular matrix is also a lower triangular matrix.

(c) $D^TD$
Here's the transpose and the product.
$$D^T = \begin{bmatrix} -2 & 1 & 8 \\ 0 & 0 & 2 \\ -4 & -1 & 1 \\ 1 & 6 & -1 \end{bmatrix}$$
$$D^TD = \begin{bmatrix} -2 & 1 & 8 \\ 0 & 0 & 2 \\ -4 & -1 & 1 \\ 1 & 6 & -1 \end{bmatrix}\begin{bmatrix} -2 & 0 & -4 & 1 \\ 1 & 0 & -1 & 6 \\ 8 & 2 & 1 & -1 \end{bmatrix} = \begin{bmatrix} 69 & 16 & 15 & -4 \\ 16 & 4 & 2 & -2 \\ 15 & 2 & 18 & -11 \\ -4 & -2 & -11 & 38 \end{bmatrix}$$

So, as suggested by Theorem 5 this product is symmetric even though D was not symmetric (or square for that matter).

(d) $E^{-1}$
Here is the work for finding $E^{-1}$.
$$\left[\begin{array}{rrr|rrr} 1 & -2 & 0 & 1 & 0 & 0 \\ -2 & 3 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 \end{array}\right] \xrightarrow{R_2 + 2R_1} \left[\begin{array}{rrr|rrr} 1 & -2 & 0 & 1 & 0 & 0 \\ 0 & -1 & 1 & 2 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 \end{array}\right] \xrightarrow{R_2 \leftrightarrow R_3} \left[\begin{array}{rrr|rrr} 1 & -2 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & -1 & 1 & 2 & 1 & 0 \end{array}\right] \xrightarrow[R_3 + R_2]{R_1 + 2R_2} \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 2 \\ 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 2 & 1 & 1 \end{array}\right]$$
So, the inverse is
$$E^{-1} = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 0 & 1 \\ 2 & 1 & 1 \end{bmatrix}$$
and as suggested by Theorem 5 the inverse of this symmetric matrix is itself symmetric.

LU-Decomposition
In this section we’re going to discuss a method for factoring a square matrix A into a product of a
lower triangular matrix, L, and an upper triangular matrix, U. Such a factorization can be used to
solve systems of equations as we’ll see in the next section when we revisit that topic.

Let’s start the section out with a definition and a theorem.

Definition 1 If A is a square matrix and it can be factored as $A = LU$ where L is a lower triangular matrix and U is an upper triangular matrix, then we say that A has an LU-Decomposition of LU.

Theorem 1 If A is a square matrix and it can be reduced to a row-echelon form, U, without interchanging any rows then A can be factored as $A = LU$ where L is a lower triangular matrix.

We're not going to prove this theorem but let's examine it in some detail and we'll find a way of determining L. Let's start off by assuming that we've got a square matrix A and that we are able to reduce it to row-echelon form U without interchanging any rows. We know that each row operation that we used has a corresponding elementary matrix, so let's suppose that the elementary matrices corresponding to the row operations we used are $E_1, E_2, \ldots, E_k$.

We know from Theorem 4 in a previous section that multiplying these to the left side of A in the same order we applied the row operations will be the same as actually applying the operations. So, this means that we've got,
$$E_k \cdots E_2E_1A = U$$

We also know that elementary matrices are invertible so let's multiply each side by the inverses, $E_k^{-1}, \ldots, E_2^{-1}, E_1^{-1}$, in that order to get,
$$A = E_1^{-1}E_2^{-1} \cdots E_k^{-1}U$$

Now, it can be shown that provided we avoid interchanging rows the elementary row operations that we needed to reduce A to U will all have corresponding elementary matrices that are lower triangular matrices. We also know from the previous section that inverses of lower triangular matrices are lower triangular matrices and products of lower triangular matrices are lower triangular matrices. In other words, $L = E_1^{-1}E_2^{-1} \cdots E_k^{-1}$ is a lower triangular matrix and so using this we get the LU-Decomposition for A of $A = LU$.

Let’s take a look at an example of this.

Example 1 Determine an LU-Decomposition for the following matrix.
$$A = \begin{bmatrix} 3 & 6 & -9 \\ 2 & 5 & -3 \\ -4 & 1 & 10 \end{bmatrix}$$
Solution
So, first let's go through the row operations to get this into row-echelon form and remember that we aren't allowed to do any interchanging of rows. Also, we'll do this step by step so that we can keep track of the row operations that we used since we're going to need to write down the elementary matrices that are associated with them eventually.
$$\begin{bmatrix} 3 & 6 & -9 \\ 2 & 5 & -3 \\ -4 & 1 & 10 \end{bmatrix} \xrightarrow{\frac{1}{3}R_1} \begin{bmatrix} 1 & 2 & -3 \\ 2 & 5 & -3 \\ -4 & 1 & 10 \end{bmatrix} \xrightarrow{R_2 - 2R_1} \begin{bmatrix} 1 & 2 & -3 \\ 0 & 1 & 3 \\ -4 & 1 & 10 \end{bmatrix} \xrightarrow{R_3 + 4R_1} \begin{bmatrix} 1 & 2 & -3 \\ 0 & 1 & 3 \\ 0 & 9 & -2 \end{bmatrix} \xrightarrow{R_3 - 9R_2} \begin{bmatrix} 1 & 2 & -3 \\ 0 & 1 & 3 \\ 0 & 0 & -29 \end{bmatrix} \xrightarrow{-\frac{1}{29}R_3} \begin{bmatrix} 1 & 2 & -3 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}$$

Okay so, we've got our hands on U.
$$U = \begin{bmatrix} 1 & 2 & -3 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}$$
Now we need to get L. This is going to take a little more work. We'll need the elementary matrices for each of these, or more precisely their inverses. Recall that we can get the elementary matrix for a particular row operation by applying that operation to the appropriately sized identity matrix ($3 \times 3$ in this case). Also recall that the inverse matrix can be found by applying the inverse operation to the identity matrix.

Here are the elementary matrices and their inverses for each of the operations above.
$$\tfrac{1}{3}R_1:\qquad E_1 = \begin{bmatrix} \tfrac{1}{3} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad E_1^{-1} = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
$$R_2 - 2R_1:\qquad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad E_2^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
$$R_3 + 4R_1:\qquad E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 4 & 0 & 1 \end{bmatrix} \qquad E_3^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix}$$
$$R_3 - 9R_2:\qquad E_4 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -9 & 1 \end{bmatrix} \qquad E_4^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 9 & 1 \end{bmatrix}$$
$$-\tfrac{1}{29}R_3:\qquad E_5 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -\tfrac{1}{29} \end{bmatrix} \qquad E_5^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -29 \end{bmatrix}$$

Okay, we can now compute L.
$$L = E_1^{-1}E_2^{-1}E_3^{-1}E_4^{-1}E_5^{-1} = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 9 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -29 \end{bmatrix} = \begin{bmatrix} 3 & 0 & 0 \\ 2 & 1 & 0 \\ -4 & 9 & -29 \end{bmatrix}$$

Finally, we can verify that we've gotten an LU-Decomposition with a quick computation.
$$\begin{bmatrix} 3 & 0 & 0 \\ 2 & 1 & 0 \\ -4 & 9 & -29 \end{bmatrix}\begin{bmatrix} 1 & 2 & -3 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 6 & -9 \\ 2 & 5 & -3 \\ -4 & 1 & 10 \end{bmatrix} = A$$

So we did all the work correctly.

That was a lot of work to determine L. There is an easier way to do it however. Let's start off with a general L with "*" in place of the potentially non-zero terms.
$$L = \begin{bmatrix} * & 0 & 0 \\ * & * & 0 \\ * & * & * \end{bmatrix}$$

Let's start with the main diagonal and go back and look at the operations that were required to get 1's on the diagonal when we were computing U. To get a 1 in the first row we had to multiply that row by $\frac{1}{3}$. We didn't need to do anything to get a 1 in the second row, but for the sake of argument let's say that we actually multiplied that row by 1. Finally, we multiplied the third row by $-\frac{1}{29}$ to get a 1 in the main diagonal entry in that row.

Next go back and look at the L that we had for this matrix. The main diagonal entries are 3, 1, and $-29$. In other words, they are the reciprocals of the numbers we used in computing U. This will always be the case. The main diagonal of L, using this idea, is then,


$$L = \begin{bmatrix} 3 & 0 & 0 \\ * & 1 & 0 \\ * & * & -29 \end{bmatrix}$$

Now, let's take a look at the two entries under the 3 in the first column. Again go back to the operations used to find U and take a look at the operations we used to get zeroes in these two spots. To get a zero in the second row we added $-2R_1$ onto $R_2$ and to get a zero in the third row we added $4R_1$ onto $R_3$.

Again, go back to the L we found and notice that these two entries are 2 and $-4$. In other words, they are the negatives of the multiples of the first row that we added onto each of those rows to get those entries to be zero. Filling these in we now arrive at,
$$L = \begin{bmatrix} 3 & 0 & 0 \\ 2 & 1 & 0 \\ -4 & * & -29 \end{bmatrix}$$

Finally, in determining U we added $-9R_2$ onto $R_3$ to get the entry in the third row and second column to be zero and in the L we found this entry is 9. Again, it's the negative of the multiple of the second row we used to make this entry zero. This gives us the final entry in L.
$$L = \begin{bmatrix} 3 & 0 & 0 \\ 2 & 1 & 0 \\ -4 & 9 & -29 \end{bmatrix}$$

This process we just went through will always work in determining L for our LU-Decomposition
provided we follow the process above to find U. In fact that is the one drawback to this process.
We need to find U using exactly the same steps we used in this example. In other words,
multiply/divide the first row by an appropriate scalar to get a 1 in the first column then zero out
the entries below that one. Next, multiply/divide the second row by an appropriate scalar to get a
1 in the main diagonal entry of the second row and then zero out all the entries below this.
Continue in this fashion until you’ve dealt with all the columns. This will sometimes lead to
some messy fractions.
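If you'd like to see this bookkeeping automated, here is a minimal Python sketch of the process just described. It is our own helper, not a library routine; it assumes no row interchanges are needed and simply raises an error on a zero pivot.

```python
def lu_decompose(A):
    """Find an LU-Decomposition by the process above: reduce A to a
    row-echelon U with 1's on the diagonal, storing the pivots on the
    diagonal of L and the multipliers below it as we go."""
    n = len(A)
    U = [[float(entry) for entry in row] for row in A]   # working copy, becomes U
    L = [[0.0] * n for _ in range(n)]
    for k in range(n):
        pivot = U[k][k]
        if pivot == 0:
            raise ValueError("zero pivot: a row interchange would be needed")
        L[k][k] = pivot                       # reciprocal of the scalar 1/pivot used on row k
        U[k] = [entry / pivot for entry in U[k]]
        for i in range(k + 1, n):
            m = U[i][k]                       # we add -m*(row k) onto row i ...
            L[i][k] = m                       # ... so L gets the negative of that multiple
            U[i] = [U[i][j] - m * U[k][j] for j in range(n)]
    return L, U

# Example 1's matrix recovers L = [[3,0,0],[2,1,0],[-4,9,-29]]
L, U = lu_decompose([[3, 6, -9], [2, 5, -3], [-4, 1, 10]])
```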

Let’s take a look at another example and this time we’ll use the procedure outlined above to find
L instead of dealing with all the elementary matrices.

Example 2 Determine an LU-Decomposition for the following matrix.


$$B = \begin{bmatrix} 2 & 3 & -4 \\ 5 & 4 & 4 \\ -1 & 7 & 0 \end{bmatrix}$$
Solution
So, we first need to reduce B to row-echelon form without using row interchanges. Also, if we’re
going to use the process outlined above to find L we’ll need to do the reduction in the same
manner as the first example. Here is that work.


$$\begin{bmatrix} 2 & 3 & -4 \\ 5 & 4 & 4 \\ -1 & 7 & 0 \end{bmatrix} \xrightarrow{\tfrac{1}{2}R_1} \begin{bmatrix} 1 & \tfrac{3}{2} & -2 \\ 5 & 4 & 4 \\ -1 & 7 & 0 \end{bmatrix} \xrightarrow[R_3 + R_1]{R_2 - 5R_1} \begin{bmatrix} 1 & \tfrac{3}{2} & -2 \\ 0 & -\tfrac{7}{2} & 14 \\ 0 & \tfrac{17}{2} & -2 \end{bmatrix}$$

$$\xrightarrow{-\tfrac{2}{7}R_2} \begin{bmatrix} 1 & \tfrac{3}{2} & -2 \\ 0 & 1 & -4 \\ 0 & \tfrac{17}{2} & -2 \end{bmatrix} \xrightarrow{R_3 - \tfrac{17}{2}R_2} \begin{bmatrix} 1 & \tfrac{3}{2} & -2 \\ 0 & 1 & -4 \\ 0 & 0 & 32 \end{bmatrix} \xrightarrow{\tfrac{1}{32}R_3} \begin{bmatrix} 1 & \tfrac{3}{2} & -2 \\ 0 & 1 & -4 \\ 0 & 0 & 1 \end{bmatrix}$$

So, U is,

$$U = \begin{bmatrix} 1 & \tfrac{3}{2} & -2 \\ 0 & 1 & -4 \\ 0 & 0 & 1 \end{bmatrix}$$

Now, let’s get L. Again, we’ll start with a general L and the main diagonal entries will be the
reciprocal of the scalars we needed to multiply each row by to get a one in the main diagonal
entry. This gives,
$$L = \begin{bmatrix} 2 & 0 & 0 \\ * & -\tfrac{7}{2} & 0 \\ * & * & 32 \end{bmatrix}$$

Now, for the remaining entries, go back to the process and look for the multiple that was needed
to get a zero in that spot and this entry will be the negative of that multiple. This gives us our
final L.
$$L = \begin{bmatrix} 2 & 0 & 0 \\ 5 & -\tfrac{7}{2} & 0 \\ -1 & \tfrac{17}{2} & 32 \end{bmatrix}$$

As a final check we can always do a quick multiplication to verify that we do in fact get B from
this factorization.
$$\begin{bmatrix} 2 & 0 & 0 \\ 5 & -\tfrac{7}{2} & 0 \\ -1 & \tfrac{17}{2} & 32 \end{bmatrix}\begin{bmatrix} 1 & \tfrac{3}{2} & -2 \\ 0 & 1 & -4 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 3 & -4 \\ 5 & 4 & 4 \\ -1 & 7 & 0 \end{bmatrix} = B$$

So, it looks like we did all the work correctly.

We’ll leave this section by pointing out a couple of facts about LU-Decompositions.

First, given a random square matrix, A, the only way we can guarantee that A will have an LU-
Decomposition is if we can reduce it to row-echelon form without interchanging any rows. If we
do have to interchange rows then there is a good chance that the matrix will NOT have an LU-
Decomposition.


Second, notice that every time we’ve talked about an LU-Decomposition of a matrix we’ve used
the word “an” and not “the” LU-Decomposition. This choice of words is intentional. As the
choice suggests there is no single unique LU-Decomposition for A.

To see that LU-Decompositions are not unique go back to the first example. In that example we
computed the following LU-Decomposition.
$$\begin{bmatrix} 3 & 6 & -9 \\ 2 & 5 & -3 \\ -4 & 1 & 10 \end{bmatrix} = \begin{bmatrix} 3 & 0 & 0 \\ 2 & 1 & 0 \\ -4 & 9 & -29 \end{bmatrix}\begin{bmatrix} 1 & 2 & -3 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}$$
However, we’ve also got the following LU-Decomposition.
$$\begin{bmatrix} 3 & 6 & -9 \\ 2 & 5 & -3 \\ -4 & 1 & 10 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ \tfrac{2}{3} & 1 & 0 \\ -\tfrac{4}{3} & 9 & 1 \end{bmatrix}\begin{bmatrix} 3 & 6 & -9 \\ 0 & 1 & 3 \\ 0 & 0 & -29 \end{bmatrix}$$
This is clearly an LU-Decomposition since the first matrix is lower triangular and the second is
upper triangular and you should verify that upon multiplying they do in fact give the shown
matrix.

If you would like to see a further example of an LU-Decomposition worked out there is an
example in the next section.


Systems Revisited
We opened up this chapter talking about systems of equations and we spent a couple of sections
on them and then we moved away from them and haven’t really talked much about them since.
It’s time to come back to systems and see how some of the ideas we’ve been talking about since
then can be used to help us solve systems. We’ll also take a quick look at a couple of other ideas
about systems that we didn’t look at earlier.

First let’s recall that any system of n equations and m unknowns,


$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m &= b_2 \\ &\;\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nm}x_m &= b_n \end{aligned}$$

can be written in matrix form as follows.

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$

$$Ax = b$$

In the matrix form A is called the coefficient matrix and each row contains the coefficients of the corresponding equation, x is a column matrix that contains all the unknowns from the system of equations and finally b is a column matrix containing the constants on the right of the equal sign.

Now, let's see how inverses can be used to solve systems. First, we'll need to assume that the coefficient matrix is a square $n \times n$ matrix. In other words there are the same number of equations as unknowns in our system. Let's also assume that A is invertible. In this case we actually saw in the proof of Theorem 3 in the section on finding inverses that the solution to $Ax = b$ is unique (i.e. only a single solution exists) and that it's given by,

$$x = A^{-1}b$$

So, if we’ve got the inverse of the coefficient matrix in hand (not always an easy thing to find of
course…) we can get the solution based on a quick matrix multiplication. Let’s see an example
of this.

Example 1 Use the inverse of the coefficient matrix to solve the following system.
$$\begin{aligned} 3x_1 + x_2 &= 6 \\ -x_1 + 2x_2 + 2x_3 &= -7 \\ 5x_1 - x_3 &= 10 \end{aligned}$$
Solution
Okay, let’s first write down the matrix form of this system.


$$\begin{bmatrix} 3 & 1 & 0 \\ -1 & 2 & 2 \\ 5 & 0 & -1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 6 \\ -7 \\ 10 \end{bmatrix}$$
Now, we found the inverse of the coefficient matrix back in Example 2 of the Finding Inverses
section so here is the coefficient matrix and its inverse.
$$A = \begin{bmatrix} 3 & 1 & 0 \\ -1 & 2 & 2 \\ 5 & 0 & -1 \end{bmatrix} \qquad A^{-1} = \begin{bmatrix} -\tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} \\ 3 & -1 & -2 \\ -\tfrac{10}{3} & \tfrac{5}{3} & \tfrac{7}{3} \end{bmatrix}$$
The solution to the system in matrix form is then,

$$x = A^{-1}b = \begin{bmatrix} -\tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} \\ 3 & -1 & -2 \\ -\tfrac{10}{3} & \tfrac{5}{3} & \tfrac{7}{3} \end{bmatrix}\begin{bmatrix} 6 \\ -7 \\ 10 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{3} \\ 5 \\ -\tfrac{25}{3} \end{bmatrix}$$
Now since each of the entries of x is one of the unknowns in the original system above, the solution to the original system is then,

$$x_1 = \tfrac{1}{3} \qquad x_2 = 5 \qquad x_3 = -\tfrac{25}{3}$$

So, provided we have a square coefficient matrix that is invertible and we just happen to have our
hands on the inverse of the coefficient matrix we can find the solution to the system fairly easily.
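If you have software handy this whole computation is a couple of lines. Here is a quick NumPy sketch of Example 1, offered purely as an illustration; in practice np.linalg.solve(A, b) is preferred to forming the inverse explicitly.

```python
import numpy as np

A = np.array([[3.0, 1.0, 0.0],
              [-1.0, 2.0, 2.0],
              [5.0, 0.0, -1.0]])
b = np.array([6.0, -7.0, 10.0])

x = np.linalg.inv(A) @ b   # x = A^(-1) b
print(x)                   # approximately [1/3, 5, -25/3]
```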

Next, let's look at how the topic of the previous section (LU-Decompositions) can be used to solve systems of equations. First let's recall how LU-Decompositions work. If we have a square matrix, A, (so we'll again be working with the same number of equations as unknowns) then if we can reduce it to row-echelon form without using any row interchanges we can write it as $A = LU$ where L is a lower triangular matrix and U is an upper triangular matrix.

So, let's start with a system $Ax = b$ where the coefficient matrix, A, is an $n \times n$ square matrix that has an LU-Decomposition $A = LU$. Now, substitute this into the system for A to get,

$$LUx = b$$

Next, let's just take a look at Ux. This will be an $n \times 1$ column matrix and let's call it y. So, we've got $Ux = y$.

So, just what does this do for us? Well let's write the system in the following manner.

$$Ly = b \qquad \text{where} \qquad Ux = y$$

As we'll see it's very easy to solve $Ly = b$ for y and once we know y it will be very easy to solve $Ux = y$ for x which will be the solution to the original system.

It’s probably easiest to see how this method works with an example so let’s work one.


Example 2 Use the LU-Decomposition method to find the solution to the following system of
equations.
$$\begin{aligned} 3x_1 + 6x_2 - 9x_3 &= 0 \\ 2x_1 + 5x_2 - 3x_3 &= -4 \\ -4x_1 + x_2 + 10x_3 &= 3 \end{aligned}$$
Solution
First let’s write down the matrix form of the system.
$$\begin{bmatrix} 3 & 6 & -9 \\ 2 & 5 & -3 \\ -4 & 1 & 10 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ -4 \\ 3 \end{bmatrix}$$

Now, we found an LU-Decomposition for this coefficient matrix in Example 1 of the previous section. From that example we see that,

$$\begin{bmatrix} 3 & 6 & -9 \\ 2 & 5 & -3 \\ -4 & 1 & 10 \end{bmatrix} = \begin{bmatrix} 3 & 0 & 0 \\ 2 & 1 & 0 \\ -4 & 9 & -29 \end{bmatrix}\begin{bmatrix} 1 & 2 & -3 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}$$

According to the method outlined above this means that we actually need to solve the following
two systems.
$$\begin{bmatrix} 3 & 0 & 0 \\ 2 & 1 & 0 \\ -4 & 9 & -29 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 0 \\ -4 \\ 3 \end{bmatrix} \qquad\qquad \begin{bmatrix} 1 & 2 & -3 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$$

in order.

So, let's get started on the first one. Notice that we don't really need to do anything other than write down the equations that are associated with this system and solve using forward substitution. The first equation will give us $y_1$ for free and once we know that the second equation will give us $y_2$. Finally, with these two values in hand the third equation will give us $y_3$. Here is that work.

$$\begin{aligned} 3y_1 &= 0 &&\Rightarrow\quad y_1 = 0 \\ 2y_1 + y_2 &= -4 &&\Rightarrow\quad y_2 = -4 \\ -4y_1 + 9y_2 - 29y_3 &= 3 &&\Rightarrow\quad y_3 = -\tfrac{39}{29} \end{aligned}$$

The second system that we need to solve is then,

$$\begin{bmatrix} 1 & 2 & -3 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ -4 \\ -\tfrac{39}{29} \end{bmatrix}$$
Again, notice that to solve this all we need to do is write down the equations and do back
substitution. The third equation will give us x3 for free and plugging this into the second


equation will give us $x_2$, etc. Here's the work for this.

$$\begin{aligned} x_1 + 2x_2 - 3x_3 &= 0 &&\Rightarrow\quad x_1 = -\tfrac{119}{29} \\ x_2 + 3x_3 &= -4 &&\Rightarrow\quad x_2 = \tfrac{1}{29} \\ x_3 &= -\tfrac{39}{29} &&\Rightarrow\quad x_3 = -\tfrac{39}{29} \end{aligned}$$

The solution to the original system is then shown above. Notice that while the final answers were a little messy the work was nothing more than a little arithmetic and wasn't terribly difficult.

Let’s work one more of these since there’s a little more work involved in this than the inverse
matrix method of solving a system.

Example 3 Use the LU-Decomposition method to find a solution to the following system of
equations.
$$\begin{aligned} -2x_1 + 4x_2 - 3x_3 &= -1 \\ 3x_1 - 2x_2 + x_3 &= 17 \\ -4x_2 + 3x_3 &= -9 \end{aligned}$$
Solution
Once again, let’s first get the matrix form of the system.
$$\begin{bmatrix} -2 & 4 & -3 \\ 3 & -2 & 1 \\ 0 & -4 & 3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -1 \\ 17 \\ -9 \end{bmatrix}$$
Now let’s get an LU-Decomposition for the coefficient matrix. Here’s the work that will reduce
it to row-echelon form. Remember that the result of this will be U.
$$\begin{bmatrix} -2 & 4 & -3 \\ 3 & -2 & 1 \\ 0 & -4 & 3 \end{bmatrix} \xrightarrow{-\tfrac{1}{2}R_1} \begin{bmatrix} 1 & -2 & \tfrac{3}{2} \\ 3 & -2 & 1 \\ 0 & -4 & 3 \end{bmatrix} \xrightarrow{R_2 - 3R_1} \begin{bmatrix} 1 & -2 & \tfrac{3}{2} \\ 0 & 4 & -\tfrac{7}{2} \\ 0 & -4 & 3 \end{bmatrix}$$

$$\xrightarrow{\tfrac{1}{4}R_2} \begin{bmatrix} 1 & -2 & \tfrac{3}{2} \\ 0 & 1 & -\tfrac{7}{8} \\ 0 & -4 & 3 \end{bmatrix} \xrightarrow{R_3 + 4R_2} \begin{bmatrix} 1 & -2 & \tfrac{3}{2} \\ 0 & 1 & -\tfrac{7}{8} \\ 0 & 0 & -\tfrac{1}{2} \end{bmatrix} \xrightarrow{-2R_3} \begin{bmatrix} 1 & -2 & \tfrac{3}{2} \\ 0 & 1 & -\tfrac{7}{8} \\ 0 & 0 & 1 \end{bmatrix}$$

So, U is then,

$$U = \begin{bmatrix} 1 & -2 & \tfrac{3}{2} \\ 0 & 1 & -\tfrac{7}{8} \\ 0 & 0 & 1 \end{bmatrix}$$

Now, to get L remember that we start off with a general lower triangular matrix and on the main diagonal we put the reciprocal of the scalar used in the work above to get a one in that spot. Then, in the entries below the main diagonal we put the negative of the multiple used to get a zero
in that spot above. L is then,


$$L = \begin{bmatrix} -2 & 0 & 0 \\ 3 & 4 & 0 \\ 0 & -4 & -\tfrac{1}{2} \end{bmatrix}$$

We'll leave it to you to verify that $A = LU$. Now let's solve the system. This will mean we need to solve the following two systems.

$$\begin{bmatrix} -2 & 0 & 0 \\ 3 & 4 & 0 \\ 0 & -4 & -\tfrac{1}{2} \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} -1 \\ 17 \\ -9 \end{bmatrix} \qquad\qquad \begin{bmatrix} 1 & -2 & \tfrac{3}{2} \\ 0 & 1 & -\tfrac{7}{8} \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$$

Here's the work for the first system.

$$\begin{aligned} -2y_1 &= -1 &&\Rightarrow\quad y_1 = \tfrac{1}{2} \\ 3y_1 + 4y_2 &= 17 &&\Rightarrow\quad y_2 = \tfrac{31}{8} \\ -4y_2 - \tfrac{1}{2}y_3 &= -9 &&\Rightarrow\quad y_3 = -13 \end{aligned}$$
Now let's get the actual solution by solving the second system.

$$\begin{bmatrix} 1 & -2 & \tfrac{3}{2} \\ 0 & 1 & -\tfrac{7}{8} \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} \\ \tfrac{31}{8} \\ -13 \end{bmatrix}$$
Here is the substitution work for this system.

$$\begin{aligned} x_1 - 2x_2 + \tfrac{3}{2}x_3 &= \tfrac{1}{2} &&\Rightarrow\quad x_1 = 5 \\ x_2 - \tfrac{7}{8}x_3 &= \tfrac{31}{8} &&\Rightarrow\quad x_2 = -\tfrac{15}{2} \\ x_3 &= -13 &&\Rightarrow\quad x_3 = -13 \end{aligned}$$

So there’s the solution to this system.

Before moving onto the next topic of this section we should probably address why we even
bothered with this method. It seems like a lot of work to solve a system of equations and when
solving systems by hand it can be a lot of work. However, because the method for finding L and
U is a fairly straightforward process and once those are found the method for solving the system
is also very straightforward this is a perfect method for use in computer systems when
programming the solution to systems. So, while it seems like a lot of work, it is a method that is
very easy to program and so is a very useful method.
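To make that point concrete, here is a minimal Python sketch of the two substitution steps (our own helper, not a library routine), run on Example 2's factorization.

```python
def lu_solve(L, U, b):
    """Solve LUx = b: forward substitution for Ly = b, then
    back substitution for Ux = y."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):                        # forward: Ly = b
        y[i] = (b[i] - sum(L[i][j] * y[j] for j in range(i))) / L[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):            # backward: Ux = y
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

L = [[3, 0, 0], [2, 1, 0], [-4, 9, -29]]
U = [[1, 2, -3], [0, 1, 3], [0, 0, 1]]
print(lu_solve(L, U, [0, -4, 3]))   # approximately [-119/29, 1/29, -39/29]
```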

The remaining topics in this section don’t really rely on previous sections as the first part of this
section has. Instead we just need to look at a couple of ideas about solving systems that we didn’t
have room to put into the section on solving systems of equations.


First we want to take a look at the following scenario. Suppose that we need to solve a system of equations only there are two or more sets of the $b_i$'s that we need to look at. For instance suppose we wanted to solve the following systems of equations.

$$Ax = b_1 \qquad Ax = b_2 \qquad \cdots \qquad Ax = b_k$$

Again, the coefficient matrix is the same for all these systems and the only thing that is different is the $b_i$'s. We could use any of the methods looked at so far to solve these systems. However, each of the methods we've looked at so far would require us to do each system individually and that could potentially lead to a lot of work.

There is one method however that can be easily extended to solve multiple systems simultaneously provided they all have the same coefficient matrix. In fact the method is the very first one we looked at. In that method we solved systems by appending the column matrix b onto the coefficient matrix and then reducing it to row-echelon or reduced row-echelon form. For the systems above this would require working with the following augmented matrices.

$$\left[\,A \mid b_1\,\right] \qquad \left[\,A \mid b_2\,\right] \qquad \cdots \qquad \left[\,A \mid b_k\,\right]$$

However, if you think about it, almost the whole reduction process revolves around the columns in the augmented matrix that are associated with A and not the b column. So, instead of doing these individually let's append all of them onto the coefficient matrix as follows.

$$\left[\,A \mid b_1 \;\; b_2 \;\; \cdots \;\; b_k\,\right]$$

All we need to do now is reduce this to reduced row-echelon form and we'll have the answer to each of the systems. Let's take a look at an example of this.

Example 4 Find the solution to each of the following systems.


$$\begin{aligned} x_1 - 3x_2 + 4x_3 &= 12 \qquad\qquad& x_1 - 3x_2 + 4x_3 &= 0 \\ 2x_1 - x_2 - 2x_3 &= -1 & 2x_1 - x_2 - 2x_3 &= 5 \\ 5x_1 - 2x_2 - 3x_3 &= 3 & 5x_1 - 2x_2 - 3x_3 &= -8 \end{aligned}$$
Solution
So, we’ve got two systems with the same coefficient matrix so let’s form the following matrix.
Note that we’ll leave the vertical bars in to make sure we remember the last two columns are
really b’s for the systems we’re solving.
$$\left[\begin{array}{rrr|rr} 1 & -3 & 4 & 12 & 0 \\ 2 & -1 & -2 & -1 & 5 \\ 5 & -2 & -3 & 3 & -8 \end{array}\right]$$

Now, we just need to reduce this to reduced row-echelon form. Here is the work for that.

$$\left[\begin{array}{rrr|rr} 1 & -3 & 4 & 12 & 0 \\ 2 & -1 & -2 & -1 & 5 \\ 5 & -2 & -3 & 3 & -8 \end{array}\right] \xrightarrow[R_3 - 5R_1]{R_2 - 2R_1} \left[\begin{array}{rrr|rr} 1 & -3 & 4 & 12 & 0 \\ 0 & 5 & -10 & -25 & 5 \\ 0 & 13 & -23 & -57 & -8 \end{array}\right]$$

$$\xrightarrow{\tfrac{1}{5}R_2} \left[\begin{array}{rrr|rr} 1 & -3 & 4 & 12 & 0 \\ 0 & 1 & -2 & -5 & 1 \\ 0 & 13 & -23 & -57 & -8 \end{array}\right] \xrightarrow{R_3 - 13R_2} \left[\begin{array}{rrr|rr} 1 & -3 & 4 & 12 & 0 \\ 0 & 1 & -2 & -5 & 1 \\ 0 & 0 & 3 & 8 & -21 \end{array}\right]$$

$$\xrightarrow{\tfrac{1}{3}R_3} \left[\begin{array}{rrr|rr} 1 & -3 & 4 & 12 & 0 \\ 0 & 1 & -2 & -5 & 1 \\ 0 & 0 & 1 & \tfrac{8}{3} & -7 \end{array}\right] \xrightarrow[R_1 - 4R_3]{R_2 + 2R_3} \left[\begin{array}{rrr|rr} 1 & -3 & 0 & \tfrac{4}{3} & 28 \\ 0 & 1 & 0 & \tfrac{1}{3} & -13 \\ 0 & 0 & 1 & \tfrac{8}{3} & -7 \end{array}\right]$$

$$\xrightarrow{R_1 + 3R_2} \left[\begin{array}{rrr|rr} 1 & 0 & 0 & \tfrac{7}{3} & -11 \\ 0 & 1 & 0 & \tfrac{1}{3} & -13 \\ 0 & 0 & 1 & \tfrac{8}{3} & -7 \end{array}\right]$$

Okay, the solution to the first system is in the fourth column since that is the b for the first system and likewise the solution to the second system is in the fifth column. Therefore, the solution to the first system is,

$$x_1 = \tfrac{7}{3} \qquad x_2 = \tfrac{1}{3} \qquad x_3 = \tfrac{8}{3}$$

and the solution to the second system is,

$$x_1 = -11 \qquad x_2 = -13 \qquad x_3 = -7$$
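The same idea carries over directly to software: most linear algebra libraries accept several right-hand sides at once. Here is a NumPy sketch of Example 4 (illustration only), where each b is a column of the right-hand side matrix.

```python
import numpy as np

A = np.array([[1.0, -3.0, 4.0],
              [2.0, -1.0, -2.0],
              [5.0, -2.0, -3.0]])
B = np.column_stack([[12.0, -1.0, 3.0],    # b for the first system
                     [0.0, 5.0, -8.0]])    # b for the second system

X = np.linalg.solve(A, B)   # column j of X solves Ax = (column j of B)
print(X)                    # columns approx. [7/3, 1/3, 8/3] and [-11, -13, -7]
```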

The remaining topic to discuss in this section gives us a method for answering the following question.

Given an $n \times m$ matrix A determine all the $n \times 1$ matrices, b, for which $Ax = b$ is consistent, that is, for which $Ax = b$ has at least one solution. This is a question that can arise fairly often and so we should take a look at how to answer it.

Of course if A is invertible (and hence square) the answer is that $Ax = b$ is consistent for all b as we saw in an earlier section. However, what if A isn't square or isn't invertible? The method we're going to look at doesn't really care about whether or not A is invertible but it really should be pointed out that we do know the answer for invertible matrices.

It’s easiest to see how these work with an example so let’s jump into one.

Example 5 Determine the conditions (if any) on $b_1$, $b_2$, and $b_3$ in order for the following system to be consistent.

$$\begin{aligned} x_1 - 2x_2 + 6x_3 &= b_1 \\ -x_1 + x_2 - x_3 &= b_2 \\ -3x_1 + x_2 + 8x_3 &= b_3 \end{aligned}$$
Solution
Okay, we're going to use the augmented matrix method we first looked at here and reduce the matrix down to reduced row-echelon form. The final form will be a little messy because of the presence of the $b_i$'s but other than that the work is identical to what we've been doing to this point.


Here is the work.

$$\left[\begin{array}{rrr|c} 1 & -2 & 6 & b_1 \\ -1 & 1 & -1 & b_2 \\ -3 & 1 & 8 & b_3 \end{array}\right] \xrightarrow[R_3 + 3R_1]{R_2 + R_1} \left[\begin{array}{rrr|c} 1 & -2 & 6 & b_1 \\ 0 & -1 & 5 & b_2 + b_1 \\ 0 & -5 & 26 & b_3 + 3b_1 \end{array}\right]$$

$$\xrightarrow{-R_2} \left[\begin{array}{rrr|c} 1 & -2 & 6 & b_1 \\ 0 & 1 & -5 & -b_2 - b_1 \\ 0 & -5 & 26 & b_3 + 3b_1 \end{array}\right] \xrightarrow{R_3 + 5R_2} \left[\begin{array}{rrr|c} 1 & -2 & 6 & b_1 \\ 0 & 1 & -5 & -b_2 - b_1 \\ 0 & 0 & 1 & b_3 - 5b_2 - 2b_1 \end{array}\right]$$

$$\xrightarrow[R_1 - 6R_3]{R_2 + 5R_3} \left[\begin{array}{rrr|c} 1 & -2 & 0 & -6b_3 + 30b_2 + 13b_1 \\ 0 & 1 & 0 & 5b_3 - 26b_2 - 11b_1 \\ 0 & 0 & 1 & b_3 - 5b_2 - 2b_1 \end{array}\right] \xrightarrow{R_1 + 2R_2} \left[\begin{array}{rrr|c} 1 & 0 & 0 & 4b_3 - 22b_2 - 9b_1 \\ 0 & 1 & 0 & 5b_3 - 26b_2 - 11b_1 \\ 0 & 0 & 1 & b_3 - 5b_2 - 2b_1 \end{array}\right]$$

Okay, just what does this all mean? Well go back to equations and let's see what we've got.

$$\begin{aligned} x_1 &= 4b_3 - 22b_2 - 9b_1 \\ x_2 &= 5b_3 - 26b_2 - 11b_1 \\ x_3 &= b_3 - 5b_2 - 2b_1 \end{aligned}$$

So, what this says is that no matter what our choice of $b_1$, $b_2$, and $b_3$ we can find a solution using the general solution above and in fact there will always be exactly one solution to the system for a given choice of b.

Therefore, there are no conditions on $b_1$, $b_2$, and $b_3$ in order for the system to be consistent.

Note that the result of the previous example shouldn’t be too surprising given that the coefficient
matrix is invertible.

Now, we need to see what happens if the coefficient matrix is singular (i.e. not invertible).

Example 6 Determine the conditions (if any) on $b_1$, $b_2$, and $b_3$ in order for the following system to be consistent.

$$\begin{aligned} x_1 + 3x_2 - 2x_3 &= b_1 \\ -x_1 - 5x_2 + 3x_3 &= b_2 \\ 2x_1 - 8x_2 + 3x_3 &= b_3 \end{aligned}$$
Solution
We’ll do this one in the same manner as the previous one. So, convert to an augmented matrix
and start the reduction process. As we’ll see in this case we won’t need to go all the way to
reduced row-echelon form to get the answer however.
$$\left[\begin{array}{rrr|c} 1 & 3 & -2 & b_1 \\ -1 & -5 & 3 & b_2 \\ 2 & -8 & 3 & b_3 \end{array}\right] \xrightarrow[R_3 - 2R_1]{R_2 + R_1} \left[\begin{array}{rrr|c} 1 & 3 & -2 & b_1 \\ 0 & -2 & 1 & b_2 + b_1 \\ 0 & -14 & 7 & b_3 - 2b_1 \end{array}\right]$$


$$\xrightarrow{R_3 - 7R_2} \left[\begin{array}{rrr|c} 1 & 3 & -2 & b_1 \\ 0 & -2 & 1 & b_2 + b_1 \\ 0 & 0 & 0 & b_3 - 7b_2 - 9b_1 \end{array}\right]$$

Okay, let's stop here and see what we've got. The last row corresponds to the following equation.

$$0 = b_3 - 7b_2 - 9b_1$$

If the right side of this equation is NOT zero then this equation will not make any sense and so the system won't have a solution. If, however, it is zero then this equation will not be a problem and since we can take the first two rows and finish out the process to find a solution for any given values of $b_1$ and $b_2$ we'll have a solution.

This then gives us the condition that we're looking for. In order for the system to have a solution, and hence be consistent, we must have

$$b_3 = 7b_2 + 9b_1$$
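If you want to experiment with conditions like this, a computer algebra system can carry the $b_i$'s through the reduction symbolically. Here is a SymPy sketch of Example 6; the row operations are spelled out by hand so the symbolic last column stays easy to read.

```python
import sympy as sp

b1, b2, b3 = sp.symbols('b1 b2 b3')
M = sp.Matrix([[1, 3, -2, b1],
               [-1, -5, 3, b2],
               [2, -8, 3, b3]])
M[1, :] = M[1, :] + M[0, :]         # R2 + R1
M[2, :] = M[2, :] - 2 * M[0, :]     # R3 - 2R1
M[2, :] = M[2, :] - 7 * M[1, :]     # R3 - 7R2
print(M.row(2))   # last row: [0, 0, 0, b3 - 7*b2 - 9*b1]
```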


Determinants

Introduction
By this point in your mathematical career you should have run across functions. The functions that you've probably seen to this point have had the form $f(x)$ where x is a real number and the output of the function is also a real number. Some examples of functions are $f(x) = x^2$ and $f(x) = \cos(3x) - \sin(x)$.

Not all functions however need to take a real number as an argument. For instance we could have a function $f(X)$ that takes a matrix X and outputs a real number. In this chapter we are going to be looking at one such function, the determinant function. The determinant function is a function that will associate a real number with a square matrix.

The determinant function is a function that we won't be seeing all that often in the rest of this course, but it will show up on occasion.

Here is a listing of the topics in this chapter.

The Determinant Function – We will give the formal definition of the determinant in this section. We'll also give formulas for computing determinants of $2 \times 2$ and $3 \times 3$ matrices.

Properties of Determinants – Here we will take a look at quite a few properties of the
determinant function. Included are formulas for determinants of triangular matrices.

The Method of Cofactors – In this section we'll take a look at the first of two methods for computing determinants of general matrices.

Using Row Reduction to Find Determinants – Here we will take a look at the second method
for computing determinants in general.

Cramer’s Rule – We will take a look at yet another method for solving systems. This method
will involve the use of determinants.


The Determinant Function


We’ll start off the chapter by defining the determinant function. This is not such an easy thing
however as it involves some ideas and notation that you probably haven’t run across to this point.
So, before we actually define the determinant function we need to get some preliminaries out of
the way.

First, a permutation of the set of integers $\{1, 2, \ldots, n\}$ is an arrangement of all the integers in the list without omission or repetitions. A permutation of $\{1, 2, \ldots, n\}$ will typically be denoted by $(i_1, i_2, \ldots, i_n)$ where $i_1$ is the first number in the permutation, $i_2$ is the second number in the permutation, etc.

Example 1 List all permutations of $\{1, 2\}$.

Solution
This one isn't too bad because there are only two integers in the list. We need to come up with all the possible ways to arrange these two numbers. Here they are.

$$(1, 2) \qquad (2, 1)$$

Example 2 List all the permutations of $\{1, 2, 3\}$.

Solution
This one is a little harder to do, but still isn't too bad. We need all the arrangements of these three numbers in which no number is repeated or omitted. Here they are.

$$(1, 2, 3) \quad (1, 3, 2) \quad (2, 1, 3) \quad (2, 3, 1) \quad (3, 1, 2) \quad (3, 2, 1)$$
From this point on it can be somewhat difficult to find permutations for lists of numbers with more than 3 numbers in them. One way to make sure that you get all of them is to write down a permutation tree. Here is the permutation tree for $\{1, 2, 3\}$.

[Figure: the permutation tree for $\{1, 2, 3\}$; not reproduced here.]
At the top we list all the numbers in the list and from this top number we’ll branch out with each
of the remaining numbers in the list. At the second level we’ll again branch out with each of the
numbers from the list not yet written down along that branch. Then each branch will represent a
permutation of the given list of numbers.

As you can see the number of permutations for a list will quickly grow as we add numbers to the list. In fact it can be shown that there are $n!$ permutations of the list $\{1, 2, \ldots, n\}$, or any list containing n distinct numbers, but we're going to be working with $\{1, 2, \ldots, n\}$ so that's the one we'll reference. So, the list $\{1, 2, 3, 4\}$ will have $4! = (4)(3)(2)(1) = 24$ permutations, the list $\{1, 2, 3, 4, 5\}$ will have $5! = (5)(4)(3)(2)(1) = 120$ permutations, etc.
Next we need to discuss inversions in a permutation. An inversion will occur in the permutation $(i_1, i_2, \ldots, i_n)$ whenever a larger number precedes a smaller number. Note as well we don't mean that the smaller number is immediately to the right of the larger number, but anywhere to the right of the larger number.

Example 3 Determine the number of inversions in each of the following permutations.
(a) $(3, 1, 4, 2)$ [Solution]
(b) $(1, 2, 4, 3)$ [Solution]
(c) $(4, 3, 2, 1)$ [Solution]
(d) $(1, 2, 3, 4, 5)$ [Solution]
(e) $(2, 5, 4, 1, 3)$ [Solution]
Solution
(a) $(3, 1, 4, 2)$
Okay, to count the number of inversions we will start at the leftmost number and count the number of numbers to the right that are smaller. We then move to the second number and do the same thing. We continue in this fashion until we get to the end. The total number of inversions is then the sum of all of these.

We'll do this first one in detail and then do the remaining ones much quicker. We'll mark the number we're looking at in bold and to the side give the number of inversions for that particular number.

$(\mathbf{3}, 1, 4, 2)$   2 inversions
$(3, \mathbf{1}, 4, 2)$   0 inversions
$(3, 1, \mathbf{4}, 2)$   1 inversion

In the first case there are two numbers to the right of 3 that are smaller than 3 so there are two
inversions there. In the second case we’re looking at the smallest number in the list and so there
won’t be any inversions there. Then with 4 there is one number to the right that is smaller than 4
and so we pick up another inversion. There is no reason to look at the last number in the
permutation since there are no numbers to the right of it and so won’t introduce any inversions.

The permutation ( 3,1, 4, 2 ) has a total of 3 inversions.


[Return to Problems]

(b) $(1, 2, 4, 3)$

We'll do this one much quicker. There are $0 + 0 + 1 = 1$ inversions in $(1, 2, 4, 3)$. Note that each number in the sum above represents the number of inversions for the number in that position in the permutation.
[Return to Problems]


(c) $(4, 3, 2, 1)$

There are $3 + 2 + 1 = 6$ inversions in $(4, 3, 2, 1)$.
[Return to Problems]

(d) $(1, 2, 3, 4, 5)$

There are no inversions in $(1, 2, 3, 4, 5)$.
[Return to Problems]

(e) $(2, 5, 4, 1, 3)$

There are $1 + 3 + 2 + 0 = 6$ inversions in $(2, 5, 4, 1, 3)$.
[Return to Problems]
[Return to Problems]

Next, a permutation is called even if the number of inversions is even and odd if the number of
inversions is odd.

Example 4 Classify as even or odd all the permutations of the following lists.
(a) $\{1, 2\}$
(b) $\{1, 2, 3\}$

Solution
(a) Here's a table giving all the permutations, the number of inversions in each and the classification.

    Permutation    # Inversions    Classification
    (1, 2)              0              even
    (2, 1)              1              odd

(b) We'll do the same thing here.

    Permutation    # Inversions    Classification
    (1, 2, 3)           0              even
    (1, 3, 2)           1              odd
    (2, 1, 3)           1              odd
    (2, 3, 1)           2              even
    (3, 1, 2)           2              even
    (3, 2, 1)           3              odd

We’ll need these results later in the section.
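Counting inversions is easy to mechanize, which can be handy for checking work on longer permutations. Here is a small Python sketch (our own helpers, written for clarity rather than speed).

```python
def inversions(perm):
    """Count pairs where a larger number appears anywhere to the
    left of a smaller one."""
    n = len(perm)
    return sum(1 for i in range(n)
                 for j in range(i + 1, n)
                 if perm[i] > perm[j])

def classify(perm):
    """'even' or 'odd' according to the number of inversions."""
    return "even" if inversions(perm) % 2 == 0 else "odd"

print(inversions((3, 1, 4, 2)))   # 3, matching Example 3(a)
print(classify((3, 2, 1)))        # 'odd', matching the table above
```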

Alright, let’s move back into matrices. We still have some definitions to get out of the way
before we define the determinant function, but at least we’re back dealing with matrices.


Suppose that we have an $n \times n$ matrix, A, then an elementary product from this matrix will be a product of n entries from A where no two of the entries in the product come from the same row or column.

Example 5 Find all the elementary products for,
(a) a $2 \times 2$ matrix [Solution]
(b) a $3 \times 3$ matrix [Solution]

Solution
(a) a $2 \times 2$ matrix.

Okay let's first write down the general $2 \times 2$ matrix.

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$

Each elementary product will contain two terms and since each term must come from different rows we know that each elementary product must have the form,

$$a_{1\,\square}\,a_{2\,\square}$$

All we need to do is fill in the column subscripts (the blank squares above) and remember in doing so that they must come from different columns. There are really only two possible ways to fill in the blanks in the product above. The two ways of filling in the blanks are $(1, 2)$ and $(2, 1)$ and yes we did mean to use the permutation notation there since that is exactly what we need. We will fill in the blanks with all the possible permutations of the list of column numbers, $\{1, 2\}$ in this case.

So, the elementary products for a $2 \times 2$ matrix are

$$a_{11}a_{22} \qquad a_{12}a_{21}$$
[Return to Problems]

(b) a $3 \times 3$ matrix.

Again, let's start off with a general $3 \times 3$ matrix for reference purposes.

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

Each of the elementary products in this case will involve three terms and again since they must all come from different rows we can again write down the form they must take.

$$a_{1\,\square}\,a_{2\,\square}\,a_{3\,\square}$$

Again, each of the column subscripts will need to come from different columns and like the $2 \times 2$ case we can get all the possible choices for these by filling in the blanks with all the possible permutations of $\{1, 2, 3\}$.

So, the elementary products of the $3 \times 3$ matrix are,

$$\begin{aligned} &a_{11}a_{22}a_{33} \qquad a_{11}a_{23}a_{32} \\ &a_{12}a_{21}a_{33} \qquad a_{12}a_{23}a_{31} \\ &a_{13}a_{21}a_{32} \qquad a_{13}a_{22}a_{31} \end{aligned}$$
[Return to Problems]

A general $n \times n$ matrix A will have $n!$ elementary products of the form

$$a_{1 i_1} a_{2 i_2} \cdots a_{n i_n}$$

where $(i_1, i_2, \ldots, i_n)$ ranges over all the permutations of $\{1, 2, \ldots, n\}$.

We can now take care of the final preliminary definition that we need for the determinant function. A signed elementary product from A will be the elementary product $a_{1 i_1} a_{2 i_2} \cdots a_{n i_n}$ that is multiplied by "+1" if $(i_1, i_2, \ldots, i_n)$ is an even permutation or multiplied by "-1" if $(i_1, i_2, \ldots, i_n)$ is an odd permutation.

Example 6 Find all the signed elementary products for,
(a) a $2 \times 2$ matrix [Solution]
(b) a $3 \times 3$ matrix [Solution]

Solution
We listed out all the elementary products in Example 5 and we classified all the permutations used in them as even or odd in Example 4. So, all we need to do is put all this information together for each matrix.

(a) a $2 \times 2$ matrix.

Here are the signed elementary products for the $2 \times 2$ matrix.

    Elementary Product    Permutation       Signed Elementary Product
    $a_{11}a_{22}$        (1, 2) - even     $a_{11}a_{22}$
    $a_{12}a_{21}$        (2, 1) - odd      $-a_{12}a_{21}$

[Return to Problems]

(b) a $3 \times 3$ matrix.

Here are the signed elementary products for the $3 \times 3$ matrix.

    Elementary Product        Permutation          Signed Elementary Product
    $a_{11}a_{22}a_{33}$      (1, 2, 3) - even     $a_{11}a_{22}a_{33}$
    $a_{11}a_{23}a_{32}$      (1, 3, 2) - odd      $-a_{11}a_{23}a_{32}$
    $a_{12}a_{21}a_{33}$      (2, 1, 3) - odd      $-a_{12}a_{21}a_{33}$
    $a_{12}a_{23}a_{31}$      (2, 3, 1) - even     $a_{12}a_{23}a_{31}$
    $a_{13}a_{21}a_{32}$      (3, 1, 2) - even     $a_{13}a_{21}a_{32}$
    $a_{13}a_{22}a_{31}$      (3, 2, 1) - odd      $-a_{13}a_{22}a_{31}$

[Return to Problems]

Okay, we can now give the definition of the determinant function.

Definition 1 If A is a square matrix then the determinant function is denoted by det and det(A) is defined to be the sum of all the signed elementary products of A.

Note that often we will call the number det(A) the determinant of A. Also, there is some alternate notation that is sometimes used for determinants. We will sometimes denote determinants as $\det(A) = |A|$ and this is most often done with the actual matrix instead of the letter representing the matrix. For instance for a $2 \times 2$ matrix A we will use any of the following to denote the determinant,

$$\det(A) = |A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$$

So, now that we have the definition of the determinant function in hand we can actually start writing down some formulas. We'll give the formula for $2 \times 2$ and $3 \times 3$ matrices only because for any matrix larger than that the formula becomes very long and messy and at those sizes there are alternate methods for computing determinants that will be easier.

So, with that said, we've got all the signed elementary products for $2 \times 2$ and $3 \times 3$ matrices listed in Example 6 so let's write down the determinant function for these matrices.

First the determinant function for a $2 \times 2$ matrix.

$$\det(A) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}$$

Now the determinant function for a $3 \times 3$ matrix.

$$\det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32} - a_{13}a_{22}a_{31}$$

Okay, the formula for a $2 \times 2$ matrix isn't too bad, but the formula for a $3 \times 3$ is messy and would not be fun to memorize. Fortunately, there is an easy way to quickly "derive" both of these formulas.


Before we give this quick trick to "derive" the formulas we should point out that what we're going to do ONLY works for $2 \times 2$ and $3 \times 3$ matrices. There is no corresponding trick for larger matrices!

Okay, let's start with a $2 \times 2$ matrix. Let's examine the determinant below.

[Figure: the $2 \times 2$ determinant with its two diagonals sketched in; not reproduced here.]

Notice the two diagonals that we've put on this determinant. The diagonal that runs from left to right also covers the positive elementary product in the formula. Likewise, the diagonal that runs from right to left covers the negative elementary product.

So, for a $2 \times 2$ matrix all we need to do is write down the determinant, sketch in the diagonals, multiply along the diagonals, then add the product if the diagonal runs from left to right and subtract the product if the diagonal runs from right to left.

Now let's take a look at a $3 \times 3$ matrix. There is a similar trick that will work here, but in order to get it to work we'll first need to tack copies of the first 2 columns onto the right side of the determinant as shown below.

[Figure: the $3 \times 3$ determinant with copies of its first two columns appended on the right and the six diagonals sketched in; not reproduced here.]

With the addition of the two extra columns we can see that we've got three diagonals running in each direction and that each will cover one of the elementary products for this matrix. Also, the diagonals that run from left to right cover the positive elementary products and those that run from right to left cover the negative elementary products. So, as with the $2 \times 2$ matrix, we can quickly write down the determinant function formula here by simply multiplying along each diagonal and then adding it if the diagonal runs left to right or subtracting it if the diagonal runs right to left.

Let’s take a quick look at a couple of examples with numbers just to make sure we can do these.

Example 7 Compute the determinant of each of the following matrices.

(a) $A = \begin{bmatrix} 3 & 2 \\ -9 & 5 \end{bmatrix}$ [Solution]

(b) $B = \begin{bmatrix} 3 & 5 & 4 \\ -2 & -1 & 8 \\ -11 & 1 & 7 \end{bmatrix}$ [Solution]

(c) $C = \begin{bmatrix} 2 & -6 & 2 \\ 2 & -8 & 3 \\ -3 & 1 & 1 \end{bmatrix}$ [Solution]


Solution
(a) $A = \begin{bmatrix} 3 & 2 \\ -9 & 5 \end{bmatrix}$

We don't really need to sketch in the diagonals for $2 \times 2$ matrices. The determinant is simply the product of the diagonal running left to right minus the product of the diagonal running from right to left. So, here is the determinant for this matrix. The only thing we need to worry about is paying attention to minus signs. It is easy to make a mistake with minus signs in these computations if you aren't paying attention.

$$\det(A) = (3)(5) - (2)(-9) = 33$$
[Return to Problems]

(b) $B = \begin{bmatrix} 3 & 5 & 4 \\ -2 & -1 & 8 \\ -11 & 1 & 7 \end{bmatrix}$

Okay, with this one we'll copy the two columns over and sketch in the diagonals to make sure we've got the idea of these down.

[Figure: the determinant of B with copies of its first two columns appended and the six diagonals sketched in; not reproduced here.]

Now, just remember to add products along the left to right diagonals and subtract products along the right to left diagonals.

$$\det(B) = (3)(-1)(7) + (5)(8)(-11) + (4)(-2)(1) - (5)(-2)(7) - (3)(8)(1) - (4)(-1)(-11) = -467$$
[Return to Problems]

(c) $C = \begin{bmatrix} 2 & -6 & 2 \\ 2 & -8 & 3 \\ -3 & 1 & 1 \end{bmatrix}$

We'll do this one with a little less detail. We'll copy the columns but not bother to actually sketch in the diagonals this time.

$$\det(C) = \begin{vmatrix} 2 & -6 & 2 \\ 2 & -8 & 3 \\ -3 & 1 & 1 \end{vmatrix}\begin{matrix} 2 & -6 \\ 2 & -8 \\ -3 & 1 \end{matrix} = (2)(-8)(1) + (-6)(3)(-3) + (2)(2)(1) - (-6)(2)(1) - (2)(3)(1) - (2)(-8)(-3) = 0$$
[Return to Problems]


As this example has shown, determinants of matrices can be positive, negative or zero.

It is again worth noting that there are no such tricks for computing determinants for matrices larger than $3 \times 3$.
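The definition itself, though, can be turned into code that works at any size, which makes for a useful sanity check even though it is far too slow ($n!$ terms) to be a practical method. Here is a Python sketch.

```python
from itertools import permutations

def det(A):
    """Determinant straight from the definition: sum the signed
    elementary products over all permutations of the columns."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        # count inversions to decide the sign of this elementary product
        inv = sum(1 for i in range(n) for j in range(i + 1, n)
                  if perm[i] > perm[j])
        sign = -1 if inv % 2 else 1
        prod = 1
        for row, col in enumerate(perm):
            prod *= A[row][col]
        total += sign * prod
    return total

print(det([[3, 2], [-9, 5]]))                        # 33, Example 7(a)
print(det([[2, -6, 2], [2, -8, 3], [-3, 1, 1]]))     # 0, Example 7(c)
```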

In the remainder of this chapter we’ll take a look at some properties of determinants, two
alternate methods for computing them that are not restricted by the size of the matrix as the two
quick tricks we saw in this section were and an application of determinants.


Properties of Determinants
In this section we’ll be taking a look at some of the basic properties of determinants and towards
the end of this section we’ll have a nice test for the invertibility of a matrix. In this section we’ll
give a fair number of theorems (and prove a few of them) as well as examples illustrating the
theorems. Any proofs that are omitted are generally more involved than we want to get into in
this class.

Most of the theorems in this section will not help us to actually compute determinants in general.
Most of these theorems are really more about how the determinants of different matrices will
relate to each other. We will take a look at a couple of theorems that will help show us how to
find determinants for some special kinds of matrices, but we’ll have to wait until the next two
sections to start looking at how to compute determinants in general.

All of the determinants that we'll be computing in the examples in this section will be of a $2 \times 2$ or a $3 \times 3$ matrix. If you need a refresher on how to compute determinants of these kinds of matrices check out this example in the previous section. We won't actually be showing any of that work here in this section.

Let’s start with the following theorem.

Theorem 1 Let A be an $n \times n$ matrix and c be a scalar then,

$$\det(cA) = c^n \det(A)$$

Proof : This is a really simple proof. From the definition of the determinant function in the previous section we know that the determinant is the sum of all the signed elementary products for the matrix. So, for cA we will sum signed elementary products that are of the form,

$$(c\,a_{1 i_1})(c\,a_{2 i_2}) \cdots (c\,a_{n i_n}) = c^n \left( a_{1 i_1} a_{2 i_2} \cdots a_{n i_n} \right)$$

Recall that for scalar multiplication we multiply all the entries by c and so we'll have a c on each entry as shown above. Also, as shown, we can factor all n of the c's out and we'll get what we've shown above. Note that $a_{1 i_1} a_{2 i_2} \cdots a_{n i_n}$ is the signed elementary product for A.

Now, if we add all the signed elementary products for cA we can factor the $c^n$ that is on each term out of the sum and what we're left with is the sum of all the signed elementary products of A, or in other words, we're left with det(A). So, we're done.

Here’s a quick example to verify the results of this theorem.


Example 1 For the given matrix below compute both det(A) and det(2A).

$$A = \begin{bmatrix} 4 & -2 & 5 \\ -1 & -7 & 10 \\ 0 & 1 & -3 \end{bmatrix}$$

Solution
We'll leave it to you to verify all the details of this problem. First the scalar multiple.

$$2A = \begin{bmatrix} 8 & -4 & 10 \\ -2 & -14 & 20 \\ 0 & 2 & -6 \end{bmatrix}$$

The determinants.

$$\det(A) = 45 \qquad \det(2A) = 360 = (8)(45) = 2^3 \det(A)$$

Now, let’s investigate the relationship between det(A), det(B) and det(A+B). We’ll start with the
following example.

Example 2 Compute det(A), det(B) and det(A+B) for the following matrices.

$$A = \begin{bmatrix} 10 & -6 \\ -3 & -1 \end{bmatrix} \qquad B = \begin{bmatrix} 1 & 2 \\ 5 & -6 \end{bmatrix}$$

Solution
Here are all the determinants.

$$\det(A) = -28 \qquad \det(B) = -16 \qquad \det(A+B) = -69$$

Notice here that for this example we have $\det(A+B) \neq \det(A) + \det(B)$. In fact this will generally be the case.

There is a very special case where we will get equality for the sum of determinants, but it doesn’t
happen all that often. Here is the theorem detailing this special case.

Theorem 2 Suppose that A, B, and C are all $n \times n$ matrices and that they differ by only a row, say the kth row. Let's further suppose that the kth row of C can be found by adding the corresponding entries from the kth rows of A and B. Then in this case we will have that

$$\det(C) = \det(A) + \det(B)$$

The same result will hold if we replace the word row with column above.

Here is an example of this theorem.

Example 3 Consider the following three matrices.

$$A = \begin{bmatrix} 4 & 2 & -1 \\ 6 & 1 & 7 \\ -1 & -3 & 9 \end{bmatrix} \qquad B = \begin{bmatrix} 4 & 2 & -1 \\ -2 & -5 & 3 \\ -1 & -3 & 9 \end{bmatrix} \qquad C = \begin{bmatrix} 4 & 2 & -1 \\ 4 & -4 & 10 \\ -1 & -3 & 9 \end{bmatrix}$$

First, notice that we can write C as,

$$C = \begin{bmatrix} 4 & 2 & -1 \\ 4 & -4 & 10 \\ -1 & -3 & 9 \end{bmatrix} = \begin{bmatrix} 4 & 2 & -1 \\ 6 + (-2) & 1 + (-5) & 7 + 3 \\ -1 & -3 & 9 \end{bmatrix}$$

All three matrices differ only in the second row and the second row of C can be found by adding the corresponding entries from the second rows of A and B.

The determinants of these matrices are,

$$\det(A) = 15 \qquad \det(B) = -115 \qquad \det(C) = -100 = 15 + (-115)$$

Next let’s look at the relationship between the determinants of matrices and their products.

Theorem 3 If A and B are matrices of the same size then

$$\det(AB) = \det(A)\det(B)$$

This theorem can be extended out to as many matrices as we want. For instance,

$$\det(ABC) = \det(A)\det(B)\det(C)$$

Let’s check out an example of this.

Example 4 For the given matrices compute det(A), det(B), and det(AB).

$$A = \begin{bmatrix} 1 & -2 & 3 \\ 2 & 7 & 4 \\ 3 & 1 & 4 \end{bmatrix} \qquad B = \begin{bmatrix} 0 & 1 & 8 \\ 4 & -1 & 1 \\ 0 & 3 & 3 \end{bmatrix}$$

Solution
Here's the product of the two matrices.

$$AB = \begin{bmatrix} -8 & 12 & 15 \\ 28 & 7 & 35 \\ 4 & 14 & 37 \end{bmatrix}$$

Here are the determinants.

$$\det(A) = -41 \qquad \det(B) = 84 \qquad \det(AB) = -3444 = (-41)(84) = \det(A)\det(B)$$
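This is an easy property to spot-check numerically. Here is a quick NumPy sketch using the matrices from this example (illustration only; expect small floating point round-off in the printed values).

```python
import numpy as np

A = np.array([[1.0, -2.0, 3.0], [2.0, 7.0, 4.0], [3.0, 1.0, 4.0]])
B = np.array([[0.0, 1.0, 8.0], [4.0, -1.0, 1.0], [0.0, 3.0, 3.0]])

print(np.linalg.det(A))      # approximately -41
print(np.linalg.det(B))      # approximately 84
print(np.linalg.det(A @ B))  # approximately -3444 = (-41)(84)
```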

Here is a theorem relating determinants of matrices and their inverse (provided the matrix is
invertible of course…).

Theorem 4 Suppose that A is an invertible matrix then,

$$\det(A^{-1}) = \frac{1}{\det(A)}$$


Proof : The proof of this theorem is a direct result of the previous theorem. Since A is invertible we know that $AA^{-1} = I$. So take the determinant of both sides and then use the previous theorem on the left side.

$$\det(AA^{-1}) = \det(A)\det(A^{-1}) = \det(I)$$

Now, all that we need is to know that $\det(I) = 1$ which you can prove using Theorem 8 below.

$$\det(A)\det(A^{-1}) = 1 \qquad \Rightarrow \qquad \det(A^{-1}) = \frac{1}{\det(A)}$$

Here’s a quick example illustrating this.

Example 5 For the given matrix compute det(A) and $\det(A^{-1})$.

$$A = \begin{bmatrix} 8 & -9 \\ 2 & 5 \end{bmatrix}$$

Solution
We'll leave it to you to verify that A is invertible and that its inverse is,

$$A^{-1} = \begin{bmatrix} \tfrac{5}{58} & \tfrac{9}{58} \\ -\tfrac{1}{29} & \tfrac{4}{29} \end{bmatrix}$$

Here are the determinants for both of these matrices.

$$\det(A) = 58 \qquad \det(A^{-1}) = \frac{1}{58} = \frac{1}{\det(A)}$$

The next theorem that we want to take a look at is a nice test for the invertibility of matrices.

Theorem 5 A square matrix A is invertible if and only if $\det(A) \neq 0$. A matrix that is invertible is often called non-singular and a matrix that is not invertible is often called singular.

Before doing an example of this let's talk a little bit about the phrase "if and only if" that appears in this theorem. That phrase means that this is kind of like a two way street. This theorem, because of the "if and only if" phrase, says that if we know that A is invertible then we will have $\det(A) \neq 0$. If, on the other hand, we know that $\det(A) \neq 0$ then we will also know that A is invertible.

Most theorems presented in these notes are not "two way streets" so to speak; they only work one way. If, however, we do have a theorem that does work both ways you will always be able to identify it by the phrase "if and only if".

Now let’s work an example to verify this theorem.


Example 6 Compute the determinants of the following two matrices.

$$C = \begin{bmatrix} 3 & 1 & 0 \\ -1 & 2 & 2 \\ 5 & 0 & -1 \end{bmatrix} \qquad B = \begin{bmatrix} 3 & 3 & 6 \\ 0 & 1 & 2 \\ -2 & 0 & 0 \end{bmatrix}$$

Solution
We determined the invertibility of both of these matrices in the section on Finding Inverses so we already know what the answers should be (at some level) for the determinants. In that section we determined that C was invertible and so by Theorem 5 we know that det(C) should be non-zero. We also determined that B was singular (i.e. not invertible) and so we know by Theorem 5 that det(B) should be zero.

Here are the determinants of these two matrices.

$$\det(C) = 3 \qquad \det(B) = 0$$

Sure enough we got zero where we should have and a non-zero value where we should have.

Here is a theorem relating the determinants of a matrix and its transpose.

Theorem 6 If A is a square matrix then,

$$\det(A) = \det(A^T)$$

Here is an example that verifies the results of this theorem.

Example 7 Compute det(A) and $\det(A^T)$ for the following matrix.

$$A = \begin{bmatrix} 5 & 3 & 2 \\ -1 & -8 & -6 \\ 0 & 1 & 1 \end{bmatrix}$$

Solution
We'll leave it to you to verify that

$$\det(A) = \det(A^T) = -9$$

There are a couple special cases of matrices that we can quickly find the determinant for so let’s
take care of those at this point.

Theorem 7 If A is a square matrix with a row or column of all zeroes then

$$\det(A) = 0$$

and so A will be singular.

Proof : The proof here is fairly straightforward. The determinant is the sum of all the signed elementary products and each of these will have a factor from each row and a factor from each column. So, in particular each will have a factor from the row or column of all zeroes and hence will have a factor of zero, making the whole product zero.

All of the products are zero and upon summing them up we will also get zero for the determinant.


Note that in the following example we don't need to worry about the size of the matrix now since this theorem gives us a value for the determinant. You might want to check the $2 \times 2$ and $3 \times 3$ matrices to verify that the determinants are in fact zero. You also might want to come back and verify the other after the next section where we'll learn methods for computing determinants in general.

Example 8 Each of the following matrices is singular.

$$A = \begin{bmatrix} 3 & 9 \\ 0 & 0 \end{bmatrix} \qquad B = \begin{bmatrix} 5 & 0 & 1 \\ -9 & 0 & 2 \\ 4 & 0 & -3 \end{bmatrix} \qquad C = \begin{bmatrix} 4 & 12 & 8 & 0 \\ 5 & -3 & 1 & 2 \\ 0 & 0 & 0 & 0 \\ 5 & 1 & 3 & 6 \end{bmatrix}$$

It is actually very easy to compute the determinant of any triangular (and hence any diagonal) matrix. Here is the theorem that tells us how to do that.

Theorem 8 Suppose that A is an $n \times n$ triangular matrix then,

$$\det(A) = a_{11}a_{22} \cdots a_{nn}$$

So, what this theorem tells us is that the determinant of any triangular matrix (upper or lower) or any diagonal matrix is simply the product of the entries from the matrix's main diagonal.

We won't do a formal proof here. We'll just give a quick outline.

Proof Outline : Since the determinant is the sum of the signed elementary products, and each elementary product has a factor from each row and a factor from each column, the triangular nature of the matrix means that the only elementary product that won't have at least one zero in it is $a_{11}a_{22} \cdots a_{nn}$. All the others will have at least one zero in them. Hence the determinant of the matrix must be $\det(A) = a_{11}a_{22} \cdots a_{nn}$.

Let's take the determinant of a couple of triangular matrices. You should verify the $2 \times 2$ and $3 \times 3$ matrices and after the next section come back and verify the other.

Example 9 Compute the determinant of each of the following matrices.

$$A = \begin{bmatrix} 5 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 4 \end{bmatrix} \qquad B = \begin{bmatrix} 6 & 0 \\ 2 & -1 \end{bmatrix} \qquad C = \begin{bmatrix} 10 & 5 & 1 & 3 \\ 0 & 0 & -4 & 9 \\ 0 & 0 & 6 & 4 \\ 0 & 0 & 0 & 5 \end{bmatrix}$$

Solution
Here are these determinants.

$$\det(A) = (5)(-3)(4) = -60 \qquad \det(B) = (6)(-1) = -6 \qquad \det(C) = (10)(0)(6)(5) = 0$$
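As a quick numerical check of Theorem 8, here is a NumPy one-off on the matrix C above (illustration only).

```python
import numpy as np

C = np.array([[10.0, 5.0, 1.0, 3.0],
              [0.0, 0.0, -4.0, 9.0],
              [0.0, 0.0, 6.0, 4.0],
              [0.0, 0.0, 0.0, 5.0]])

print(np.prod(np.diag(C)))   # 0.0, the product of the main diagonal entries
print(np.linalg.det(C))      # approximately 0.0 as well
```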


We have one final theorem to give in this section. In the Finding Inverse section we gave a
theorem that listed several equivalent statements. Because of Theorem 5 above we can add a
statement to that theorem so let’s do that.

Here is the improved theorem.

Theorem 9 If A is an $n \times n$ matrix then the following statements are equivalent.
(a) A is invertible.
(b) The only solution to the system $Ax = 0$ is the trivial solution.
(c) A is row equivalent to $I_n$.
(d) A is expressible as a product of elementary matrices.
(e) $Ax = b$ has exactly one solution for every $n \times 1$ matrix b.
(f) $Ax = b$ is consistent for every $n \times 1$ matrix b.
(g) $\det(A) \neq 0$


The Method of Cofactors


In this section we’re going to examine one of the two methods that we’re going to be looking at
for computing the determinant of a general matrix. We’ll also see how some of the ideas we’re
going to look at in this section can be used to determine the inverse of an invertible matrix.

So, before we actually give the method of cofactors we need to get a couple of definitions taken
care of.

Definition 1 If A is a square matrix then the minor of $a_{ij}$, denoted by $M_{ij}$, is the determinant of the submatrix that results from removing the ith row and jth column of A.

Definition 2 If A is a square matrix then the cofactor of $a_{ij}$, denoted by $C_{ij}$, is the number $(-1)^{i+j} M_{ij}$.

Let’s take a look at computing some minors and cofactors.

Example 1 For the following matrix compute the cofactors $C_{12}$, $C_{24}$, and $C_{32}$.

$$A = \begin{bmatrix} 4 & 0 & 10 & 4 \\ -1 & 2 & 3 & 9 \\ 5 & -5 & -1 & 6 \\ 3 & 7 & 1 & -2 \end{bmatrix}$$
Solution
In order to compute the cofactors we'll first need the minor associated with each cofactor. Remember that in order to compute the minor we will remove the ith row and jth column of A.

So, to compute $M_{12}$ (which we'll need for $C_{12}$) we'll need to compute the determinant of the matrix we get by removing the 1st row and 2nd column of A. Here is that work.

[Figure: the matrix A with its 1st row and 2nd column marked out, leaving a $3 \times 3$ determinant whose value is $M_{12} = 160$; not reproduced here.]

We've marked out the row and column that we eliminated and we'll leave it to you to verify the determinant computation. Now we can get the cofactor.

$$C_{12} = (-1)^{1+2} M_{12} = (-1)^3 (160) = -160$$

Let's now move onto the second cofactor. Here is the work for the minor.

[Figure: the matrix A with its 2nd row and 4th column marked out, leaving a $3 \times 3$ determinant whose value is $M_{24} = 508$; not reproduced here.]

The cofactor in this case is,

$$C_{24} = (-1)^{2+4} M_{24} = (-1)^6 (508) = 508$$

Here is the work for the final cofactor.

[Figure: the matrix A with its 3rd row and 2nd column marked out, leaving a $3 \times 3$ determinant whose value is $M_{32} = 150$; not reproduced here.]

$$C_{32} = (-1)^{3+2} M_{32} = (-1)^5 (150) = -150$$

Notice that the cofactor is really just $\pm M_{ij}$ depending upon i and j. If the subscripts of the cofactor add to an even number then we leave the minor alone (i.e. no "-" sign) when writing down the cofactor. Likewise, if the subscripts on the cofactor sum to an odd number then we add a "-" to the minor when writing down the cofactor.

We can use this fact to derive a table that will allow us to quickly determine whether or not we
should add a “-” onto the minor or leave it alone when writing down the cofactor.

Let's start with $C_{11}$. In this case the subscripts sum to an even number and so we don't tack on a minus sign to the minor. Now, let's move along the first row. The next cofactor would then be $C_{12}$ and in this case the subscripts add to an odd number and so we tack on a minus sign to the minor. For the next cofactor, $C_{13}$, we would leave the minor alone and for the next, $C_{14}$, we'd tack a minus sign on, etc.

As you can see from this work, if we start at the leftmost entry of the first row we have a “+” in
front of the minor and then as we move across the row the signs alternate. If you think about it,
this will also happen as we move down the first column. In fact, this will happen as we move
across any row and down any column.

We can summarize this idea in the following “sign matrix” that will tell us if we should leave the
minor alone (i.e. tack on a “+”) or change its sign (i.e. tack on a “-”) when writing down the
cofactor.

$$\begin{bmatrix} + & - & + & - & \cdots \\ - & + & - & + & \cdots \\ + & - & + & - & \cdots \\ - & + & - & + & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}$$

Okay, we can now talk about how to use cofactors to compute the determinant of a general square matrix. In fact there are two ways we can use cofactors, as the following theorem shows.


Theorem 1 If A is an $n \times n$ matrix.
(a) Choose any row, say row i, then,

$$\det(A) = a_{i1}C_{i1} + a_{i2}C_{i2} + \cdots + a_{in}C_{in}$$

(b) Choose any column, say column j, then,

$$\det(A) = a_{1j}C_{1j} + a_{2j}C_{2j} + \cdots + a_{nj}C_{nj}$$

What this theorem tells us is that if we pick any row all we need to do is go across that row and
multiply each entry by its cofactor, add all these products up and we’ll have the determinant for
the matrix. It also says that we could do the same thing only instead of going across any row we
could move down any column.

The process of moving across a row or down a column is often called a cofactor expansion.

Let’s work some examples of this so we can see it in action.

Example 2 For the following matrix compute the determinant using the given cofactor expansions.

$$A = \begin{bmatrix} 4 & 2 & 1 \\ -2 & -6 & 3 \\ -7 & 5 & 0 \end{bmatrix}$$

(a) Expand along the first row. [Solution]
(b) Expand along the third row. [Solution]
(c) Expand along the second column. [Solution]

Solution
First, notice that according to the theorem we should get the same result in all three parts.

(a) Expand along the first row.

Here is the cofactor expansion in terms of symbols for this part.

$$\det(A) = a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13}$$

Now, let's plug in for all the quantities. We will just plug in for the entries. For the cofactors we'll write down the minor and a "+1" or a "-1" depending on which sign each minor needs. We'll determine these signs by going to our "sign matrix" above, starting at the first entry in the particular row/column we're expanding along, and then as we move along that row or column we'll write down the appropriate sign.

Here is the work for this expansion.

$$\det(A) = (4)(+1)\begin{vmatrix} -6 & 3 \\ 5 & 0 \end{vmatrix} + (2)(-1)\begin{vmatrix} -2 & 3 \\ -7 & 0 \end{vmatrix} + (1)(+1)\begin{vmatrix} -2 & -6 \\ -7 & 5 \end{vmatrix} = 4(-15) - 2(21) + (1)(-52) = -154$$


We'll leave it to you to verify the $2 \times 2$ determinant computations.
[Return to Problems]

(b) Expand along the third row.

We'll do this one without all the explanations.

$$\det(A) = a_{31}C_{31} + a_{32}C_{32} + a_{33}C_{33} = (-7)(+1)\begin{vmatrix} 2 & 1 \\ -6 & 3 \end{vmatrix} + (5)(-1)\begin{vmatrix} 4 & 1 \\ -2 & 3 \end{vmatrix} + (0)(+1)\begin{vmatrix} 4 & 2 \\ -2 & -6 \end{vmatrix} = -7(12) - 5(14) + (0)(-20) = -154$$

So, the same answer as the first part, which is good since that was supposed to happen.

Notice that the signs for the cofactors in this case were the same as the signs in the first case. This is because the first and third rows of our "sign matrix" are identical. Also, notice that we didn't really need to compute the third cofactor since the third entry was zero. We did it here just to get one more example of a cofactor into the notes.
[Return to Problems]

(c) Expand along the second column.

Let's take a look at the final expansion. In this one we're going down a column and notice from our "sign matrix" that this time we'll be starting the cofactor signs off with a "-1" unlike the first two expansions.

$$\det(A) = a_{12}C_{12} + a_{22}C_{22} + a_{32}C_{32} = (2)(-1)\begin{vmatrix} -2 & 3 \\ -7 & 0 \end{vmatrix} + (-6)(+1)\begin{vmatrix} 4 & 1 \\ -7 & 0 \end{vmatrix} + (5)(-1)\begin{vmatrix} 4 & 1 \\ -2 & 3 \end{vmatrix} = -2(21) - 6(7) - 5(14) = -154$$

Again, the same as the first two as we expected.
[Return to Problems]

There was another point to the previous problem apart from showing that the row or column we choose to expand along won't matter. Because we are allowed to expand along any row or column, unless the problem statement forces us to use a particular row or column we will get to choose the row/column to expand along.

When choosing we should choose a row/column that will reduce the amount of work we’ve got to
do if possible. Comparing the parts of the previous example should suggest to us something we
should be looking for in making this choice. In part (b) it was pointed out that we didn’t really
need to compute the third cofactor since the third entry in that row was zero.

Choosing to expand along a row/column with zeroes in it will instantly cut back on the number of
cofactors that we’ll need to compute. So, when allowed to choose which row/column to expand


along we should look for the one with the most zeroes. In the case of the previous example that
means that the quickest expansions would be either the 3rd row or the 3rd column since both of
those have a zero in them and none of the other rows/columns do.

So, let’s take a look at a couple more examples.

Example 3 Using a cofactor expansion compute the determinant of,


$$A = \begin{bmatrix} 5 & -2 & 2 & 7 \\ 1 & 0 & 0 & 3 \\ -3 & 1 & 5 & 0 \\ 3 & -1 & -9 & 4 \end{bmatrix}$$
Solution
Since the row or column to use for the cofactor expansion was not given in the problem statement
we get to choose which one we want to use. Recalling the brief discussion after the last example
we know that we want to choose the row/column with the most zeroes in it since that will mean
we won’t have to compute cofactors for each entry that is a zero.

So, it looks like the second row would be a good choice for the expansion since it has two zeroes
in it. Here is the expansion for this row. As with the previous expansions we’ll explicitly give
the “+1” or “-1” for the cofactors and the minors as well so you can see where everything in the
expansion is coming from.
$$\det(A) = (1)(-1)\begin{vmatrix} -2 & 2 & 7 \\ 1 & 5 & 0 \\ -1 & -9 & 4 \end{vmatrix} + (0)(+1)M_{22} + (0)(-1)M_{23} + (3)(+1)\begin{vmatrix} 5 & -2 & 2 \\ -3 & 1 & 5 \\ 3 & -1 & -9 \end{vmatrix}$$

We didn’t bother to write down the minors $M_{22}$ and $M_{23}$ because of the zero entry. How we
choose to compute the determinants for the first and last entry is up to us at this point. We could
use a cofactor expansion on each of them or we could use the technique we learned in the first
section of this chapter. Either way will get the same answer and we’ll leave it to you to verify
these determinants.

The determinant for this matrix is,


$$\det(A) = -(-76) + 3(4) = 88$$

Example 4 Using a cofactor expansion compute the determinant of,


$$B = \begin{bmatrix} 2 & -2 & 0 & 3 & 4 \\ 4 & -1 & 0 & 1 & -1 \\ 0 & 5 & 0 & 0 & -1 \\ 3 & 2 & -3 & 4 & 3 \\ 7 & -2 & 0 & 9 & -5 \end{bmatrix}$$
Solution
This is a large matrix, but if you check out the third column you’ll see that there is only one non-zero entry in that column and so that looks like a good column to do a cofactor expansion on.
Here’s the cofactor expansion for this matrix. Again, we explicitly added in the “+1” and “-1”


and won’t bother to write down the minors for the zero entries.

$$\det(B) = (0)(+1)M_{13} + (0)(-1)M_{23} + (0)(+1)M_{33} + (-3)(-1)\begin{vmatrix} 2 & -2 & 3 & 4 \\ 4 & -1 & 1 & -1 \\ 0 & 5 & 0 & -1 \\ 7 & -2 & 9 & -5 \end{vmatrix} + (0)(+1)M_{53}$$

Now, in order to complete this problem we’ll need to take the determinant of a 4 × 4 matrix and the only way that we’ve got to do that is to once again do a cofactor expansion on it. In this case it looks like the third row will be the best option since it’s got more zero entries than any other row or column.

This time we’ll just put in the terms that come from non-zero entries. Here is the remainder of
this problem. Also don’t forget that there is still a coefficient of 3 in front of this determinant!
$$\det(B) = 3\begin{vmatrix} 2 & -2 & 3 & 4 \\ 4 & -1 & 1 & -1 \\ 0 & 5 & 0 & -1 \\ 7 & -2 & 9 & -5 \end{vmatrix} = 3\left( (5)(-1)\begin{vmatrix} 2 & 3 & 4 \\ 4 & 1 & -1 \\ 7 & 9 & -5 \end{vmatrix} + (-1)(-1)\begin{vmatrix} 2 & -2 & 3 \\ 4 & -1 & 1 \\ 7 & -2 & 9 \end{vmatrix} \right)$$
$$= 3\bigl(-5(163) + (1)(41)\bigr) = -2322$$

This last example has shown one of the drawbacks to this method. Once the size of the matrix gets large there can be a lot of work involved in the method. Also, for anything larger than a 4 × 4 matrix you are almost assured of having to do cofactor expansions multiple times until the size of the matrix gets down to 3 × 3 and other methods can be used.

There is a way to simplify things down somewhat, but we’ll need the topic of the next section before we can show that.

Now let’s move onto the final topic of this section.

It turns out that we can also use cofactors to determine the inverse of an invertible matrix. To see
how this is done we’ll first need a quick definition.


Definition 3 Let A be an $n \times n$ matrix and $C_{ij}$ be the cofactor of $a_{ij}$. The matrix of cofactors from A is,

$$\begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & \cdots & C_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{bmatrix}$$

The adjoint of A is the transpose of the matrix of cofactors and is denoted by adj(A).

Example 5 Compute the adjoint of the following matrix.


$$A = \begin{bmatrix} 4 & 2 & 1 \\ -2 & -6 & 3 \\ -7 & 5 & 0 \end{bmatrix}$$
Solution
We need the cofactors for each of the entries from this matrix. This is the matrix from Example 2 and in that example we computed all the cofactors except for $C_{21}$ and $C_{23}$ so here are those computations.

$$C_{21} = (-1)\begin{vmatrix} 2 & 1 \\ 5 & 0 \end{vmatrix} = (-1)(-5) = 5 \qquad\qquad C_{23} = (-1)\begin{vmatrix} 4 & 2 \\ -7 & 5 \end{vmatrix} = (-1)(34) = -34$$

Here are the others from Example 2.


$$C_{11} = -15 \quad C_{12} = -21 \quad C_{13} = -52 \quad C_{22} = 7$$
$$C_{31} = 12 \quad C_{32} = -14 \quad C_{33} = -20$$

The matrix of cofactors is then,


$$\begin{bmatrix} -15 & -21 & -52 \\ 5 & 7 & -34 \\ 12 & -14 & -20 \end{bmatrix}$$
The adjoint is then,
$$\operatorname{adj}(A) = \begin{bmatrix} -15 & 5 & 12 \\ -21 & 7 & -14 \\ -52 & -34 & -20 \end{bmatrix}$$

We started this portion of this section off by saying that we were going to see how to use
cofactors to determine the inverse of a matrix. Here is the theorem that will tell us how to do that.


Theorem 2 If A is an invertible matrix then

$$A^{-1} = \frac{1}{\det(A)}\operatorname{adj}(A)$$

Example 6 Use the adjoint matrix to compute the inverse of the following matrix.
$$A = \begin{bmatrix} 4 & 2 & 1 \\ -2 & -6 & 3 \\ -7 & 5 & 0 \end{bmatrix}$$
Solution
We’ve done most of the work for this problem already. In Example 2 we determined that $\det(A) = -154$ and in Example 5 we found the adjoint to be

$$\operatorname{adj}(A) = \begin{bmatrix} -15 & 5 & 12 \\ -21 & 7 & -14 \\ -52 & -34 & -20 \end{bmatrix}$$

Therefore, the inverse of the matrix is,


$$A^{-1} = \frac{1}{-154}\begin{bmatrix} -15 & 5 & 12 \\ -21 & 7 & -14 \\ -52 & -34 & -20 \end{bmatrix} = \begin{bmatrix} \frac{15}{154} & -\frac{5}{154} & -\frac{6}{77} \\ \frac{3}{22} & -\frac{1}{22} & \frac{1}{11} \\ \frac{26}{77} & \frac{17}{77} & \frac{10}{77} \end{bmatrix}$$

You might want to verify this using the row reduction method we used in the previous chapter for
the practice.
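If you’d rather check this with a computer, here is a rough Python sketch (our own illustration, not code from these notes; the helper names are hypothetical) of the cofactor matrix, the adjoint, and the inverse formula from Theorem 2. The standard fractions module keeps the entries exact.

```python
# Hedged sketch of Definition 3 and Theorem 2 above.
from fractions import Fraction

def minor(A, i, j):
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    if len(A) == 1:
        return A[0][0]
    # cofactor expansion along the first row
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def adjoint(A):
    n = len(A)
    C = [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)] for i in range(n)]
    return [[C[j][i] for j in range(n)] for i in range(n)]   # transpose of the cofactors

A = [[4, 2, 1], [-2, -6, 3], [-7, 5, 0]]
d = det(A)                                                    # -154
A_inv = [[Fraction(entry, d) for entry in row] for row in adjoint(A)]
print(A_inv[0])   # [Fraction(15, 154), Fraction(-5, 154), Fraction(-6, 77)]
```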


Using Row Reduction To Compute Determinants


In this section we’ll take a look at the second method for computing determinants. The idea in
this section is to use row reduction on a matrix to get it down to a row-echelon form.

Since we’re computing determinants we know that the matrix, A, we’re working with will be
square and so the row-echelon form of the matrix will be an upper triangular matrix and we know
how to quickly compute the determinant of a triangular matrix. So, since we already know how
to do row reduction all we need to know before we can work some problems is how the row
operations used in the row reduction process will affect the determinant.

Before proceeding we should point out that there is a set of elementary column operations that mirror the elementary row operations. We can multiply a column by a scalar, c, we can interchange two columns and we can add a multiple of one column onto another column. These column operations can be used in place of the corresponding row operations and so all the theorems in this section will make note of that. We’ll just be using row operations in our examples however.

Here is the theorem that tells us how row or column operations will affect the value of the
determinant of a matrix.

Theorem 1 Let A be a square matrix.
(a) If B is the matrix that results from multiplying a row or column of A by a scalar, c, then $\det(B) = c\det(A)$.
(b) If B is the matrix that results from interchanging two rows or two columns of A then $\det(B) = -\det(A)$.
(c) If B is the matrix that results from adding a multiple of one row of A onto another row of A or adding a multiple of one column of A onto another column of A then $\det(B) = \det(A)$.

Notice that the row operation that we’ll be using the most in the row reduction process will not
change the determinant at all. The operations that we’re going to need to worry about are the first
two and the second is easy enough to take care of. If we interchange two rows the determinant
changes by a minus sign. We are going to have to be a little careful with the first one however.
Let’s check out an example of how this method works in order to see what’s going on.

Example 1 Use row reduction to compute the determinant of the following matrix.
$$A = \begin{bmatrix} 4 & 12 \\ -7 & 5 \end{bmatrix}$$
Solution
There is of course no real reason to do row reduction on this matrix in order to compute the determinant. We can find it easily enough at this point. In fact let’s do that so we can check the results of our work after we do row reduction on this.

$$\det(A) = (4)(5) - (-7)(12) = 104$$

Okay, now let’s do the row reduction and see what we’ve got. We need to reduce this down to row-echelon form and while we could easily use the third row operation to get a 1 in the first entry of the first row let’s just divide the first row by 4 since that’s the one operation we’re going


to need to be careful with. So, let’s do the first operation and see what we’ve got.

$$A = \begin{bmatrix} 4 & 12 \\ -7 & 5 \end{bmatrix} \xrightarrow{\ \frac{1}{4}R_1\ } \begin{bmatrix} 1 & 3 \\ -7 & 5 \end{bmatrix} = B$$

So, we called the result B and let’s see what the determinant of this matrix is.
$$\det(B) = (1)(5) - (-7)(3) = 26 = \frac{1}{4}\det(A)$$

So, the results of the theorem are verified for this step. The next step is then to convert the -7 into
a zero. Let’s do that and see what we get.

$$B = \begin{bmatrix} 1 & 3 \\ -7 & 5 \end{bmatrix} \xrightarrow{\ R_2 + 7R_1\ } \begin{bmatrix} 1 & 3 \\ 0 & 26 \end{bmatrix} = C$$

According to the theorem C should have the same determinant as B and it does (you should verify
this statement).

The final step is to convert the 26 into a 1.


$$C = \begin{bmatrix} 1 & 3 \\ 0 & 26 \end{bmatrix} \xrightarrow{\ \frac{1}{26}R_2\ } \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix} = D$$

Now, we’ve got the following,

$$\det(D) = 1 = \frac{1}{26}\det(C)$$
Once again the theorem is verified.

Now, just how does all of this help us to find the determinant of the original matrix? We could
work our way backwards from det(D) and figure out what det(A) is. However, there is a way to
modify our work above that will allow us to also get the answer once we reach row-echelon form.

To see how we do this let’s go back to the first operation that we did and we saw when we were
done we had,
$$\det(B) = \frac{1}{4}\det(A) \qquad \text{OR} \qquad \det(A) = 4\det(B)$$

Written in another way this is,


$$\det(A) = \begin{vmatrix} 4 & 12 \\ -7 & 5 \end{vmatrix} = (4)\begin{vmatrix} 1 & 3 \\ -7 & 5 \end{vmatrix} = 4\det(B)$$

Notice that the determinants, when written in the “matrix” form, are pretty much what we
originally wrote down when doing the row operation. Therefore, instead of writing down the row
operation as we did above let’s just use this “matrix” form of the determinant and write the row
operation as follows.
$$\det(A) = \begin{vmatrix} 4 & 12 \\ -7 & 5 \end{vmatrix} \overset{\frac{1}{4}R_1}{=} (4)\begin{vmatrix} 1 & 3 \\ -7 & 5 \end{vmatrix}$$


In going from the matrix on the left to the matrix on the right we performed the operation $\frac{1}{4}R_1$
and in the process we changed the value of the determinant. So, since we’ve got an equal sign
here we need to also modify the determinant of the matrix on the right so that it will remain equal
to the determinant of the matrix on the left. As shown above, we can do this by multiplying the
matrix on the right by the reciprocal of the scalar we used in the row operation.

Let’s complete this and notice that in the second step we aren’t going to change the value of the determinant since we’re adding a multiple of the first row onto the second row, so we’ll not need to change the determinant on the right. In the final operation we divided the second row by 26 and so we’ll need to multiply the determinant on the right by 26 to preserve the equality of the determinants.

Here is the complete work for this problem using these ideas.
$$\det(A) = \begin{vmatrix} 4 & 12 \\ -7 & 5 \end{vmatrix} \overset{\frac{1}{4}R_1}{=} (4)\begin{vmatrix} 1 & 3 \\ -7 & 5 \end{vmatrix} \overset{R_2+7R_1}{=} (4)\begin{vmatrix} 1 & 3 \\ 0 & 26 \end{vmatrix} \overset{\frac{1}{26}R_2}{=} (4)(26)\begin{vmatrix} 1 & 3 \\ 0 & 1 \end{vmatrix}$$

Okay, we’re down to row-echelon form so let’s strip out all the intermediate steps out and see
what we’ve got.
$$\det(A) = (4)(26)\begin{vmatrix} 1 & 3 \\ 0 & 1 \end{vmatrix}$$

The matrix on the right is triangular and we know that determinants of triangular matrices are just
the product of the main diagonal entries and so the determinant of A is,
$$\det(A) = (4)(26)(1)(1) = 104$$

Now, that was a lot of work to compute the determinant and in general we wouldn’t use this method on a 2 × 2 matrix, but by doing it on one here it allowed us to investigate the method in detail without having to deal with a lot of steps.

There are a couple of issues to point out before we move into another more complicated problem.
First, we didn’t do any row interchanges in the above example, but the theorem tells us that will
only change the sign on the determinant. So, if we do a row interchange in our work we’ll just
tack a minus sign onto the determinant.

Second, we took the matrix all the way down to row-echelon form, but if you stop to think about it there’s really nothing special about that in this case. All we need to do is reduce the matrix to a triangular matrix and then use the fact that we can quickly find the determinant of any triangular matrix.
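The bookkeeping described above is mechanical enough to code up. The following Python sketch (our own illustration, not anything from the notes; it uses plain floats, so expect a little round-off) does the reduction and compensates the running product every time it scales or swaps a row, exactly as the theorem dictates.

```python
# Hedged sketch: determinant by row reduction, tracking the compensating scalars.

def det_by_row_reduction(A):
    A = [row[:] for row in A]              # work on a copy
    n, scale = len(A), 1.0
    for c in range(n):
        # find a nonzero pivot, swapping rows if needed (each swap flips the sign)
        p = next((r for r in range(c, n) if A[r][c] != 0), None)
        if p is None:
            return 0.0                     # no pivot in this column, so the determinant is zero
        if p != c:
            A[c], A[p] = A[p], A[c]
            scale = -scale
        piv = A[c][c]
        scale *= piv                       # dividing a row by piv multiplies the determinant by 1/piv
        A[c] = [x / piv for x in A[c]]
        for r in range(c + 1, n):          # the third row operation leaves the determinant unchanged
            A[r] = [x - A[r][c] * y for x, y in zip(A[r], A[c])]
    return scale                           # the reduced matrix is triangular with 1's on the diagonal

A = [[4, 12], [-7, 5]]                     # the matrix from Example 1
print(det_by_row_reduction(A))             # 104.0
```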


From this point on we’ll not be going all the way to row-echelon form. We’ll just make sure that
we reduce the matrix down to a triangular matrix and then stop and compute the determinant.

Example 2 Use row reduction to compute the determinant of the following matrix.
$$A = \begin{bmatrix} -2 & 10 & 2 \\ 1 & 0 & 7 \\ 0 & -3 & 5 \end{bmatrix}$$
Solution
We’ll do this one with less explanation. Just remember that if we interchange rows we tack a minus sign onto the determinant and if we multiply a row by a scalar we’ll need to multiply the new determinant by the reciprocal of the scalar.
$$\det(A) = \begin{vmatrix} -2 & 10 & 2 \\ 1 & 0 & 7 \\ 0 & -3 & 5 \end{vmatrix} \overset{R_1 \leftrightarrow R_2}{=} -\begin{vmatrix} 1 & 0 & 7 \\ -2 & 10 & 2 \\ 0 & -3 & 5 \end{vmatrix} \overset{R_2+2R_1}{=} -\begin{vmatrix} 1 & 0 & 7 \\ 0 & 10 & 16 \\ 0 & -3 & 5 \end{vmatrix}$$
$$\overset{\frac{1}{10}R_2}{=} -(10)\begin{vmatrix} 1 & 0 & 7 \\ 0 & 1 & \frac{8}{5} \\ 0 & -3 & 5 \end{vmatrix} \overset{R_3+3R_2}{=} -(10)\begin{vmatrix} 1 & 0 & 7 \\ 0 & 1 & \frac{8}{5} \\ 0 & 0 & \frac{49}{5} \end{vmatrix}$$

Okay, we’ve gotten the matrix down to triangular form and so at this point we can stop and just
take the determinant of that and make sure to keep the scalars that are multiplying it. Here is the
final computation for this problem.

$$\det(A) = -10\,(1)(1)\left(\frac{49}{5}\right) = -98$$

Example 3 Use row reduction to compute the determinant of the following matrix.
$$A = \begin{bmatrix} 3 & 0 & 6 & -3 \\ 0 & 2 & 3 & 0 \\ -4 & -7 & 2 & 0 \\ 2 & 0 & 1 & 10 \end{bmatrix}$$
Solution
Okay, there’s going to be some work here so let’s get going on it.


$$\det(A) = \begin{vmatrix} 3 & 0 & 6 & -3 \\ 0 & 2 & 3 & 0 \\ -4 & -7 & 2 & 0 \\ 2 & 0 & 1 & 10 \end{vmatrix} \overset{\frac{1}{3}R_1}{=} (3)\begin{vmatrix} 1 & 0 & 2 & -1 \\ 0 & 2 & 3 & 0 \\ -4 & -7 & 2 & 0 \\ 2 & 0 & 1 & 10 \end{vmatrix} \overset{\substack{R_3+4R_1 \\ R_4-2R_1}}{=} (3)\begin{vmatrix} 1 & 0 & 2 & -1 \\ 0 & 2 & 3 & 0 \\ 0 & -7 & 10 & -4 \\ 0 & 0 & -3 & 12 \end{vmatrix}$$
$$\overset{\frac{1}{2}R_2}{=} (3)(2)\begin{vmatrix} 1 & 0 & 2 & -1 \\ 0 & 1 & \frac{3}{2} & 0 \\ 0 & -7 & 10 & -4 \\ 0 & 0 & -3 & 12 \end{vmatrix} \overset{R_3+7R_2}{=} (3)(2)\begin{vmatrix} 1 & 0 & 2 & -1 \\ 0 & 1 & \frac{3}{2} & 0 \\ 0 & 0 & \frac{41}{2} & -4 \\ 0 & 0 & -3 & 12 \end{vmatrix}$$
$$\overset{\frac{2}{41}R_3}{=} (3)(2)\left(\frac{41}{2}\right)\begin{vmatrix} 1 & 0 & 2 & -1 \\ 0 & 1 & \frac{3}{2} & 0 \\ 0 & 0 & 1 & -\frac{8}{41} \\ 0 & 0 & -3 & 12 \end{vmatrix} \overset{R_4+3R_3}{=} (3)(2)\left(\frac{41}{2}\right)\begin{vmatrix} 1 & 0 & 2 & -1 \\ 0 & 1 & \frac{3}{2} & 0 \\ 0 & 0 & 1 & -\frac{8}{41} \\ 0 & 0 & 0 & \frac{468}{41} \end{vmatrix}$$

Okay, that was a lot of work, but we’ve gotten it into a form we can deal with. Here’s the
determinant.

$$\det(A) = (3)(2)\left(\frac{41}{2}\right)\left(\frac{468}{41}\right) = 1404$$

Now, as the previous example has shown us, this method can be a lot of work and it’s work in which, if we aren’t paying attention, it is easy to make a mistake.

There is a method that we could have used here to significantly reduce our work and it’s not even
a new method. Notice that with this method at each step we have a new determinant that needs
computing. We continued down until we got a triangular matrix since that would be easy for us
to compute. However, there’s nothing keeping us from stopping at any step and using some other
method for computing the determinant. In fact, if you look at our work, after the second step
we’ve gotten a column with a 1 in the first entry and zeroes below it. If we were in the previous

section we’d just do a cofactor expansion along this column for this determinant. So, let’s do
that. No one ever said we couldn’t mix the methods from this and the previous section in a
problem.

Example 4 Use row reduction and a cofactor expansion to compute the determinant of the
matrix in Example 3.

Solution
Okay, this “new” method says to use row reduction until we get a matrix that would be easy to do
a cofactor expansion on. As noted earlier that means only doing the first two steps. So, for the
sake of completeness here are those two steps again.
$$\det(A) = \begin{vmatrix} 3 & 0 & 6 & -3 \\ 0 & 2 & 3 & 0 \\ -4 & -7 & 2 & 0 \\ 2 & 0 & 1 & 10 \end{vmatrix} \overset{\frac{1}{3}R_1}{=} (3)\begin{vmatrix} 1 & 0 & 2 & -1 \\ 0 & 2 & 3 & 0 \\ -4 & -7 & 2 & 0 \\ 2 & 0 & 1 & 10 \end{vmatrix} \overset{\substack{R_3+4R_1 \\ R_4-2R_1}}{=} (3)\begin{vmatrix} 1 & 0 & 2 & -1 \\ 0 & 2 & 3 & 0 \\ 0 & -7 & 10 & -4 \\ 0 & 0 & -3 & 12 \end{vmatrix}$$

At this point we’ll just do a cofactor expansion along the first column.
$$\det(A) = (3)\left( (1)(+1)\begin{vmatrix} 2 & 3 & 0 \\ -7 & 10 & -4 \\ 0 & -3 & 12 \end{vmatrix} + (0)C_{21} + (0)C_{31} + (0)C_{41} \right) = 3\begin{vmatrix} 2 & 3 & 0 \\ -7 & 10 & -4 \\ 0 & -3 & 12 \end{vmatrix}$$

At this point we can use any method to compute the determinant of the new 3 × 3 matrix so we’ll leave it to you to verify that

$$\det(A) = (3)(468) = 1404$$

There is one final idea that we need to discuss in this section before moving on.

Theorem 2 Suppose that A is a square matrix and that two of its rows are proportional or two of its columns are proportional. Then $\det(A) = 0$.

When we say that two rows or two columns are proportional that means that one of the
rows(columns) is a scalar times another row(column) of the matrix.

We’re not going to prove this theorem but if you think about it, it should make some sense. Let’s
suppose that two rows are proportional. So we know that one of the rows is a scalar multiple of
another row. This means we can use the third row operation to make one of the rows all zero.


From Theorem 1 above we know that both of these matrices must have the same determinant and
from Theorem 7 from the Determinant Properties section we know that if a matrix has a row or
column of all zeroes, then that matrix is singular, i.e. its determinant is zero. Therefore both
matrices must have a zero determinant.

Here is a quick example showing this.

Example 5 Show that the following matrix is singular.


$$A = \begin{bmatrix} 4 & -1 & 3 \\ 2 & 5 & -1 \\ -8 & 2 & -6 \end{bmatrix}$$
Solution
We can use Theorem 2 above upon noticing that the third row is -2 times the first row. That’s all
we need to use this theorem.

So, technically we’ve answered the question. However, let’s go through the steps outlined above
to also show that this matrix is singular. To do this we’d do one row reduction step to get the row
of all zeroes into the matrix as follows.
$$\det(A) = \begin{vmatrix} 4 & -1 & 3 \\ 2 & 5 & -1 \\ -8 & 2 & -6 \end{vmatrix} \overset{R_3+2R_1}{=} \begin{vmatrix} 4 & -1 & 3 \\ 2 & 5 & -1 \\ 0 & 0 & 0 \end{vmatrix}$$

We know by Theorem 1 above that these two matrices have the same determinant. Then because we see a row of all zeroes we can invoke Theorem 7 from the Determinant Properties section to say that the determinant on the right must be zero and so the matrix on the right must be singular.

Then, as we pointed out, these two matrices have the same determinant and so we’ve also got $\det(A) = 0$ and so A is singular.

You might want to verify that this matrix is singular by computing its determinant with one of the
other methods we’ve looked at for the practice.
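As a quick numerical sanity check (our own, not from the notes), NumPy agrees that this determinant vanishes:

```python
import numpy as np

# The third row is -2 times the first, so by Theorem 2 the determinant is zero.
A = np.array([[4.0, -1.0, 3.0],
              [2.0,  5.0, -1.0],
              [-8.0, 2.0, -6.0]])
print(np.linalg.det(A))   # 0.0, up to floating point round-off
```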

We’ve now looked at several methods for computing determinants and as we’ve seen each can be
long and prone to mistakes. On top of that for some matrices one method may work better than
the other. So, when faced with a determinant you’ll need to look at it and determine which
method to use and unless otherwise specified by the problem statement you should use the one
that you find the easiest to use. Note that this may not be the method that somebody else chooses
to use, but you shouldn’t worry about that. You should use the method you are the most
comfortable with.


Cramer’s Rule
In this section we’re going to come back and take one more look at solving systems of equations.
In this section we’re actually going to be able to get a general solution to certain systems of
equations. It won’t work on all systems of equations and as we’ll see if the system is too large it
will probably be quicker to use one of the other methods that we’ve got for solving systems of
equations.

So, let’s jump into the method.

Theorem 1 Suppose that A is an $n \times n$ invertible matrix. Then the solution to the system $A\mathbf{x} = \mathbf{b}$ is given by,

$$x_1 = \frac{\det(A_1)}{\det(A)}, \quad x_2 = \frac{\det(A_2)}{\det(A)}, \quad \ldots, \quad x_n = \frac{\det(A_n)}{\det(A)}$$

where $A_i$ is the matrix found by replacing the ith column of A with b.

Proof : The proof to this is actually pretty simple. First, because we know that A is invertible then we know that the inverse exists and that $\det(A) \neq 0$. We also know that the solution to the system can be given by,

$$\mathbf{x} = A^{-1}\mathbf{b}$$

From the section on cofactors we know how to define the inverse in terms of the adjoint of A. Using this gives us,

$$\mathbf{x} = \frac{1}{\det(A)}\operatorname{adj}(A)\,\mathbf{b} = \frac{1}{\det(A)}\begin{bmatrix} C_{11} & C_{21} & \cdots & C_{n1} \\ C_{12} & C_{22} & \cdots & C_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$

Recall that $C_{ij}$ is the cofactor of $a_{ij}$. Also note that the subscripts on the cofactors above appear to be backwards but they are correctly placed. Recall that we get the adjoint by first forming a matrix with $C_{ij}$ in the ith row and jth column and then taking the transpose to get the adjoint.

Now, multiply out the matrices to get,

$$\mathbf{x} = \frac{1}{\det(A)}\begin{bmatrix} b_1C_{11} + b_2C_{21} + \cdots + b_nC_{n1} \\ b_1C_{12} + b_2C_{22} + \cdots + b_nC_{n2} \\ \vdots \\ b_1C_{1n} + b_2C_{2n} + \cdots + b_nC_{nn} \end{bmatrix}$$

The entry in the ith row of x, which is $x_i$ in the solution, is

$$x_i = \frac{b_1C_{1i} + b_2C_{2i} + \cdots + b_nC_{ni}}{\det(A)}$$


Next let’s define,

$$A_i = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1\,i-1} & b_1 & a_{1\,i+1} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2\,i-1} & b_2 & a_{2\,i+1} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{n\,i-1} & b_n & a_{n\,i+1} & \cdots & a_{nn} \end{bmatrix}$$

So, $A_i$ is the matrix we get by replacing the ith column of A with b. Now, if we were to compute the determinant of $A_i$ by expanding along the ith column the products would be one of the $b_i$’s times the appropriate cofactor. Notice however that the only difference between $A_i$ and A is the ith column and so the cofactors we get by expanding $A_i$ along the ith column will be exactly the same as the cofactors we would get by expanding A along the ith column.

Therefore, the determinant of $A_i$ is given by,

$$\det(A_i) = b_1C_{1i} + b_2C_{2i} + \cdots + b_nC_{ni}$$

where $C_{ki}$ is the cofactor of $a_{ki}$ from the matrix A. Note however that this is exactly the numerator of $x_i$ and so we have,

$$x_i = \frac{\det(A_i)}{\det(A)}$$

as we wanted to prove.

Let’s work a quick example to illustrate the method.

Example 1 Use Cramer’s Rule to determine the solution to the following system of equations.
$$\begin{aligned} 3x_1 - x_2 + 5x_3 &= -2 \\ -4x_1 + x_2 + 7x_3 &= 10 \\ 2x_1 + 4x_2 - x_3 &= 3 \end{aligned}$$
Solution
First let’s put the system into matrix form and verify that the coefficient matrix is invertible.
$$\underbrace{\begin{bmatrix} 3 & -1 & 5 \\ -4 & 1 & 7 \\ 2 & 4 & -1 \end{bmatrix}}_{A}\underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{\mathbf{x}} = \underbrace{\begin{bmatrix} -2 \\ 10 \\ 3 \end{bmatrix}}_{\mathbf{b}}$$

$$\det(A) = -187 \neq 0$$

So, the coefficient matrix is invertible and Cramer’s Rule can be used on the system. We’ll also
need det(A) in a bit so it’s good that we now have it. Let’s now write down the formulas for the
solution to this system.


$$x_1 = \frac{\det(A_1)}{\det(A)} \qquad x_2 = \frac{\det(A_2)}{\det(A)} \qquad x_3 = \frac{\det(A_3)}{\det(A)}$$

where $A_1$ is the matrix formed by replacing the 1st column of A with b, $A_2$ is the matrix formed by replacing the 2nd column of A with b, and $A_3$ is the matrix formed by replacing the 3rd column of A with b.

We’ll leave it to you to verify the following determinants.


$$\det(A_1) = \begin{vmatrix} -2 & -1 & 5 \\ 10 & 1 & 7 \\ 3 & 4 & -1 \end{vmatrix} = 212 \qquad \det(A_2) = \begin{vmatrix} 3 & -2 & 5 \\ -4 & 10 & 7 \\ 2 & 3 & -1 \end{vmatrix} = -273 \qquad \det(A_3) = \begin{vmatrix} 3 & -1 & -2 \\ -4 & 1 & 10 \\ 2 & 4 & 3 \end{vmatrix} = -107$$

The solution to the system is then,


$$x_1 = -\frac{212}{187} \qquad x_2 = \frac{273}{187} \qquad x_3 = \frac{107}{187}$$

Now, this system had somewhat messy solutions and that would have made the row reduction method prone to mistakes. However, since this solution required us to compute four determinants, you can see that if your system gets too large this would be a very time consuming method to use. For example a system with 5 equations and 5 unknowns would require us to compute six 5 × 5 determinants. At that point, regardless of how messy the final answers are there is a good chance that the row reduction method would be easier.
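The column-replacement construction in Theorem 1 translates almost line for line into code. Here is a hedged Python sketch (ours; the function name cramer is our own choice) that reproduces the solution of Example 1 numerically.

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b via Cramer's Rule; assumes A is square and invertible."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.linalg.det(A)
    if np.isclose(d, 0.0):
        raise ValueError("coefficient matrix is not invertible")
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b                    # replace the ith column of A with b
        x[i] = np.linalg.det(Ai) / d
    return x

A = [[3, -1, 5], [-4, 1, 7], [2, 4, -1]]
b = [-2, 10, 3]
print(cramer(A, b))   # [-1.1337  1.4599  0.5722] = (-212/187, 273/187, 107/187)
```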


Euclidean n-Space

Introduction
In this chapter we are going to start looking at the idea of a vector and the ultimate goal of this
chapter will be to define something called Euclidean n-space. In this chapter we’ll be looking at
some very specific examples of vectors so we can build up some of the ideas that surround them.
We will reserve general vectors for the next chapter.

We will also be taking a quick look at the topic of linear transformations. Linear transformations
are a very important idea in the study of Linear Algebra.

Here is a listing of the topics in this chapter.

Vectors – In this section we’ll introduce vectors in 2-space and 3-space as well as some of the
important ideas about them.

Dot Product & Cross Product – Here we’ll look at the dot product and the cross product, two
important products for vectors. We’ll also take a look at an application of the dot product.

Euclidean n-Space – We’ll introduce the idea of Euclidean n-space in this section and extend
many of the ideas of the previous two sections.

Linear Transformations – In this section we’ll introduce the topic of linear transformations and
look at many of their properties.

Examples of Linear Transformations – We’ll take a look at quite a few examples of linear
transformations in this section.


Vectors
In this section we’re going to start taking a look at vectors in 2-space (normal two dimensional
space) and 3-space (normal three dimensional space). Later in this chapter we’ll be expanding
the ideas here to n-space and we’ll be looking at a much more general definition of a vector in the
next chapter. However, if we start in 2-space and 3-space we’ll be able to use a geometric
interpretation that may help understand some of the concepts we’re going to be looking at.

So, let’s start off with defining a vector in 2-space or 3-space. A vector can be represented
geometrically by a directed line segment that starts at a point A, called the initial point, and ends
at a point B, called the terminal point. Below is an example of a vector in 2-space.

Vectors are typically denoted with a boldface lower case letter. For instance we could represent
the vector above by v, w, a, or b, etc. Also when we’ve explicitly given the initial and terminal
points we will often represent the vector as,
$$\mathbf{v} = \overrightarrow{AB}$$
where the positioning of the upper case letters is important. The A is the initial point and so is
listed first while the terminal point, B, is listed second.

As we can see in the figure of the vector shown above a vector imparts two pieces of information.
A vector will have a direction and a magnitude (the length of the directed line segment). Two
vectors with the same magnitude but different directions are different vectors and likewise two
vectors with the same direction but different magnitude are different.

Vectors with the same direction and same magnitude are called equivalent and even though they
may have different initial and terminal points we think of them as equal and so if v and u are two
equivalent vectors we will write,
v=u

To illustrate this idea all of the vectors in the image below (all in 2-space) are equivalent since
they have the same direction and magnitude.


It is often difficult to really visualize a vector without a frame of reference and so we will often introduce a coordinate system to the picture. For example, in 2-space, suppose that v is any vector whose initial point is at the origin of the rectangular coordinate system and its terminal point is at the coordinates $(v_1, v_2)$ as shown below.

In these cases we call the coordinates of the terminal point the components of v and write,

$$\mathbf{v} = (v_1, v_2)$$

We can do a similar thing for vectors in 3-space. Before we get into that however, let’s make
sure that you’re familiar with all the concepts we might run across in dealing with 3-space.
Below is a point in 3-space.


Just as a point in 2-space is described by a pair $(x, y)$ we describe a point in 3-space by a triple $(x, y, z)$. Next if we take each pair of coordinate axes and look at the plane they form we call these the coordinate planes and denote them as xy-plane, yz-plane, and xz-plane respectively. Also note that if we take the general point and move it straight into one of the coordinate planes we get a new point where one of the coordinates is zero. For instance in the xy-plane we have the point $(x, y, 0)$, etc.

Just as in 2-space, suppose that we’ve got a vector v whose initial point is the origin of the coordinate system and whose terminal point is given by $(v_1, v_2, v_3)$ as shown below,

Just as in 2-space we call $(v_1, v_2, v_3)$ the components of v and write,

$$\mathbf{v} = (v_1, v_2, v_3)$$


Before proceeding any further we should briefly talk about the notation we’re using because it can be confusing sometimes. We are using the notation $(v_1, v_2, v_3)$ to represent both a point in 3-space and a vector in 3-space as shown in the figure above. This is something you’ll need to get used to. In this class $(v_1, v_2, v_3)$ can be either a point or a vector and we’ll need to be careful and pay attention to the context of the problem, although in many problems it won’t really matter. We’ll be able to use it as a point or a vector as we need to. The same comment could be made for points/vectors in 2-space.

Now, let’s get back to the discussion at hand and notice that the component form of the vector is really telling us how to get from the initial point of the vector to the terminal point of the vector. For example, let’s suppose that $\mathbf{v} = (v_1, v_2)$ is a vector in 2-space with initial point $A = (x_1, y_1)$. The first component of the vector, $v_1$, is the amount we have to move to the right (if $v_1$ is positive) or to the left (if $v_1$ is negative). The second component tells us how much to move up or down depending on the sign of $v_2$. The terminal point of v is then given by,

$$B = (x_1 + v_1, y_1 + v_2)$$

Likewise if $\mathbf{v} = (v_1, v_2, v_3)$ is a vector in 3-space with initial point $A = (x_1, y_1, z_1)$ the terminal point is given by,

$$B = (x_1 + v_1, y_1 + v_2, z_1 + v_3)$$

Notice as well that if the initial point is the origin then the terminal point will be $B = (v_1, v_2, v_3)$ and we once again see that $(v_1, v_2, v_3)$ can represent both a point and a vector.

This can all be turned around as well. Let’s suppose that we’ve got two points in 2-space, $A = (x_1, y_1)$ and $B = (x_2, y_2)$. Then the vector with initial point A and terminal point B is given by,

$$\overrightarrow{AB} = (x_2 - x_1, y_2 - y_1)$$

Note that the order of the points is important. The components are found by subtracting the coordinates of the initial point from the coordinates of the terminal point. If we turned this around and wanted the vector with initial point B and terminal point A we’d have,

$$\overrightarrow{BA} = (x_1 - x_2, y_1 - y_2)$$

Of course we can also do this in 3-space. Suppose that we want the vector that has an initial point of $A = (x_1, y_1, z_1)$ and a terminal point of $B = (x_2, y_2, z_2)$. This vector is given by,

$$\overrightarrow{AB} = (x_2 - x_1, y_2 - y_1, z_2 - z_1)$$

Let’s see an example of this.

Example 1 Find the vector that starts at $A = (4, -2, 9)$ and ends at $B = (-7, 0, 6)$.


Solution
There really isn’t much to do here other than use the formula above.

$$\mathbf{v} = \overrightarrow{AB} = (-7 - 4, 0 - (-2), 6 - 9) = (-11, 2, -3)$$

Here is a sketch showing the points and the vector.
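Since the components are just “terminal point minus initial point”, the computation is a one-liner in NumPy (our own quick check, not part of the notes):

```python
import numpy as np

A = np.array([4, -2, 9])    # initial point
B = np.array([-7, 0, 6])    # terminal point
v = B - A                   # componentwise subtraction gives the vector AB
print(v)                    # [-11   2  -3]
```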

Okay, it’s now time to move into arithmetic of vectors. For each operation we’ll look at both a
geometric and a component interpretation. The geometric interpretation will help with
understanding just what the operation is doing and the component interpretation will help us to
actually do the operation.

There are two quick topics that we first need to address in vector arithmetic. The first is the zero
vector. The zero vector, denoted by 0, is a vector with no length. Because the zero vector has no
length it is hard to talk about its direction so by convention we say that the zero vector can have
any direction that we need for it to have in a given problem.

The next quick topic to discuss is that of the negative of a vector. If v is a vector then the negative of the vector, denoted by –v, is defined to be the vector with the same length as v but with the opposite direction as v as shown below.

We’ll see how to compute the negative vector in a bit. Also note that sometimes the negative is
called the additive inverse of the vector v.

Okay let’s start off the arithmetic with addition.


Definition 1 Suppose that v and w are two vectors then to find the sum of the two vectors,
denoted v + w , we position w so that its initial point coincides with the terminal point of v. The
new vector whose initial point is the initial point of v and whose terminal point is the terminal
point of w will be the sum of the two vectors, or v + w .

Below are three sketches of what we’ve got here with addition of vectors in 2-space. In terms of components we have $\mathbf{v} = (v_1, v_2)$ and $\mathbf{w} = (w_1, w_2)$.

The sketch on the left matches the definition above. We first sketch in v and then sketch w starting where v left off. The resultant vector is then the sum. In the middle we have the sketch for $\mathbf{w} + \mathbf{v}$ and as we can see we get exactly the same resultant vector. From this we can see that we will have,

$$\mathbf{v} + \mathbf{w} = \mathbf{w} + \mathbf{v}$$

The sketch on the right merges the first two sketches into one and also adds in the components for each of the vectors. It’s a little “busy”, but you can see that the coordinates of the sum are $(v_1 + w_1, v_2 + w_2)$. Therefore, for vectors in 2-space we can compute the sum of two vectors using the following formula.

$$\mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2)$$

Likewise, if we have two vectors in 3-space, say $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$, then we’ll have,

$$\mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2, v_3 + w_3)$$

Now that we’ve got addition and the negative of a vector out of the way we can do subtraction.

Definition 2 Suppose that we have two vectors v and w then the difference of w from v, denoted by $\mathbf{v} - \mathbf{w}$, is defined to be,

$$\mathbf{v} - \mathbf{w} = \mathbf{v} + (-\mathbf{w})$$

If we make a sketch, in 2-space, for the summation form of the difference we get the following sketch.


Now, while this sketch shows us what the vector for the difference is as a summation we generally like to have a sketch that relates to the two original vectors and not one of the vectors and the negative of the other. We can do this by recalling that any two vectors are equal if they have the same magnitude and direction. Upon recalling this we can pick up the vector representing the difference and move it as shown below.

Finally, if we were to go back to the original sketch and add in components for the vectors we will see that in 2-space we can compute the difference as follows,

$$\mathbf{v} - \mathbf{w} = (v_1 - w_1, v_2 - w_2)$$

and if the vectors are in 3-space the difference is,

$$\mathbf{v} - \mathbf{w} = (v_1 - w_1, v_2 - w_2, v_3 - w_3)$$

Note that both addition and subtraction will extend naturally to more than two vectors.

The final arithmetic operation that we want to take a look at is scalar multiples.

Definition 3 Suppose that v is a vector and c is a non-zero scalar (i.e. c is a number) then the scalar multiple, cv, is the vector whose length is $|c|$ times the length of v and is in the direction of v if c is positive and in the opposite direction of v if c is negative.

Here is a sketch of some scalar multiples of a vector v.


Note that we can see from this that scalar multiples are parallel. In fact it can be shown that if v
and w are two parallel vectors then there is a non-zero scalar c such that v = cw , or in other
words the two vectors will be scalar multiples of each other.

It can also be shown that if v is a vector in either 2-space or 3-space then the scalar multiple can be computed as follows,

$$c\mathbf{v} = (cv_1, cv_2) \qquad \text{OR} \qquad c\mathbf{v} = (cv_1, cv_2, cv_3)$$

At this point we can give a formula for the negative of a vector. Let’s examine the scalar multiple, $(-1)\mathbf{v}$. This is a vector whose length is the same as v since $|-1| = 1$ and is in the opposite direction of v since the scalar is negative. Hence this vector represents the negative of v. In 3-space this gives,

$$-\mathbf{v} = (-1)\mathbf{v} = (-v_1, -v_2, -v_3)$$

and in 2-space we’ll have,

$$-\mathbf{v} = (-1)\mathbf{v} = (-v_1, -v_2)$$

Before we move on to an example let’s get some properties of vector arithmetic written down.

Theorem 1 If u, v, and w are vectors in 2-space or 3-space and c and k are scalars then,
(a) u + v = v + u
(b) u + ( v + w ) = ( u + v ) + w
(c) u + 0 = 0 + u = u
(d) u - u = u + ( -u ) = 0
(e) 1u = u
(f) ( ck ) u = c ( ku ) = k ( cu )
(g) ( c + k ) u = cu + ku
(h) c ( u + v ) = cu + cv


The proof of all of these comes directly from the component definition of the operations and so is left to you to verify.

At this point we should probably do a couple of examples of vector arithmetic to say that we’ve
done that.

Example 2 Given the following vectors compute the indicated quantity.

$$\mathbf{a} = (4, -6) \qquad \mathbf{b} = (-3, -7) \qquad \mathbf{c} = (-1, 5)$$
$$\mathbf{u} = (1, -2, 6) \qquad \mathbf{v} = (0, 4, -1) \qquad \mathbf{w} = (9, 2, -3)$$

(a) $-\mathbf{w}$ [Solution]
(b) $\mathbf{a} + \mathbf{b}$ [Solution]
(c) $\mathbf{a} - \mathbf{c}$ [Solution]
(d) $\mathbf{a} - 3\mathbf{b} + 10\mathbf{c}$ [Solution]
(e) $4\mathbf{u} + \mathbf{v} - 2\mathbf{w}$ [Solution]
Solution
There really isn’t too much to these other than to compute the scalar multiples and then do the addition and/or subtraction. For the first three we’ll include sketches so you can visualize what’s going on with each operation.

(a) $-\mathbf{w}$

$$-\mathbf{w} = (-9, -2, 3)$$

Here is a sketch of this vector as well as w.

[Return to Problems]

(b) $\mathbf{a} + \mathbf{b}$

$$\mathbf{a} + \mathbf{b} = (4 + (-3), -6 + (-7)) = (1, -13)$$

Here is a sketch of a and b as well as the sum.


[Return to Problems]

(c) $\mathbf{a} - \mathbf{c}$

$$\mathbf{a} - \mathbf{c} = (4 - (-1), -6 - 5) = (5, -11)$$

Here is a sketch of a and c as well as the difference.

[Return to Problems]

(d) $\mathbf{a} - 3\mathbf{b} + 10\mathbf{c}$

$$\mathbf{a} - 3\mathbf{b} + 10\mathbf{c} = (4, -6) - (-9, -21) + (-10, 50) = (3, 65)$$
[Return to Problems]

(e) $4\mathbf{u} + \mathbf{v} - 2\mathbf{w}$

$$4\mathbf{u} + \mathbf{v} - 2\mathbf{w} = (4, -8, 24) + (0, 4, -1) - (18, 4, -6) = (-14, -8, 29)$$
[Return to Problems]
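All five parts are componentwise arithmetic, so they can be checked in a few lines of NumPy (a sketch of ours, not from the notes); arrays add, subtract, and scale exactly as the formulas above say.

```python
import numpy as np

a, b, c = np.array([4, -6]), np.array([-3, -7]), np.array([-1, 5])
u, v, w = np.array([1, -2, 6]), np.array([0, 4, -1]), np.array([9, 2, -3])

print(-w)                  # [-9 -2  3]
print(a + b)               # [  1 -13]
print(a - c)               # [  5 -11]
print(a - 3 * b + 10 * c)  # [ 3 65]
print(4 * u + v - 2 * w)   # [-14  -8  29]
```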

There is one final topic that we need to discuss in this section. We are often interested in the length or magnitude of a vector so we’ve got a name and notation to use when we’re talking about the magnitude of a vector.


Definition 4 If v is a vector then the magnitude of the vector is called the norm of the vector and denoted by $\|\mathbf{v}\|$. Furthermore, if v is a vector in 2-space then,

$$\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2}$$

and if v is in 3-space we have,

$$\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + v_3^2}$$

In the 2-space case the formula is fairly easy to see from a geometric perspective. Let’s suppose that we have $\mathbf{v} = (v_1, v_2)$ and we want to find the magnitude (or length) of this vector. Let’s consider the following sketch of the vector.

Since we know that the components of v are also the coordinates of the terminal point of the
vector when its initial point is the origin (as it is here) we know then the lengths of the sides of a
right triangle as shown. Then using the Pythagorean Theorem we can find the length of the
hypotenuse, but that is also the length of the vector. A similar argument can be done on the 3-
space version.

From above we know that cv is a scalar multiple of v and that its length is |c| times the length of v and so we have,

$$\|c\mathbf{v}\| = |c|\,\|\mathbf{v}\|$$

We can also get this from the definition of the norm. Here is the 3-space case, the 2-space argument is identical.

$$\|c\mathbf{v}\| = \sqrt{(cv_1)^2 + (cv_2)^2 + (cv_3)^2} = \sqrt{c^2\left(v_1^2 + v_2^2 + v_3^2\right)} = |c|\sqrt{v_1^2 + v_2^2 + v_3^2} = |c|\,\|\mathbf{v}\|$$

There is one norm that we’ll be particularly interested in on occasion. Suppose v is a vector in 2-space or 3-space. We call v a unit vector if $\|\mathbf{v}\| = 1$.

Let’s compute a couple of norms.


Example 3 Compute the norms of the given vectors.
(a) $\mathbf{v} = (-5, 3, 9)$
(b) $\mathbf{j} = (0, 1, 0)$
(c) $\mathbf{w} = (3, -4)$ and $\frac{1}{5}\mathbf{w}$

Solution
Not much to do with these other than to use the formula.

(a) $\|\mathbf{v}\| = \sqrt{(-5)^2 + 3^2 + 9^2} = \sqrt{115}$

(b) $\|\mathbf{j}\| = \sqrt{0^2 + 1^2 + 0^2} = \sqrt{1} = 1$, so j is a unit vector!

(c) Okay with this one we’ve got two norms to compute. Here is the first one.

$$\|\mathbf{w}\| = \sqrt{3^2 + (-4)^2} = \sqrt{25} = 5$$

To get the second we’ll first need,

$$\frac{1}{5}\mathbf{w} = \left(\frac{3}{5}, -\frac{4}{5}\right)$$

and here is the norm using the fact that $\|c\mathbf{v}\| = |c|\,\|\mathbf{v}\|$.

$$\left\|\frac{1}{5}\mathbf{w}\right\| = \frac{1}{5}\|\mathbf{w}\| = \left(\frac{1}{5}\right)(5) = 1$$

As a check let’s also compute this using the formula for the norm.

$$\left\|\frac{1}{5}\mathbf{w}\right\| = \sqrt{\left(\frac{3}{5}\right)^2 + \left(-\frac{4}{5}\right)^2} = \sqrt{\frac{9}{25} + \frac{16}{25}} = \sqrt{\frac{25}{25}} = 1$$

Both methods get the same answer as they should. Notice as well that w is not a unit vector but $\frac{1}{5}\mathbf{w}$ is a unit vector.
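For a quick check of these computations, NumPy’s norm function computes the same square root of a sum of squares (our own sketch, not from the notes), and dividing by the norm is exactly the normalization used in part (c) above.

```python
import numpy as np

v = np.array([-5.0, 3.0, 9.0])
print(np.linalg.norm(v))        # 10.7238... = sqrt(115)

w = np.array([3.0, -4.0])
print(np.linalg.norm(w))        # 5.0
u = w / np.linalg.norm(w)       # the unit vector (1/5)w from part (c)
print(u, np.linalg.norm(u))     # [ 0.6 -0.8] 1.0
```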

We now need to take a look at a couple of facts about the norm of a vector.

Theorem 2 Given a vector v in 2-space or 3-space then $\|\mathbf{v}\| \ge 0$. Also, $\|\mathbf{v}\| = 0$ if and only if $\mathbf{v} = \mathbf{0}$.

Proof : The proof of the first part of this comes directly from the definition of the norm. The norm is defined to be a square root and by convention the value of a square root is always greater than or equal to zero and so a norm will always be greater than or equal to zero.

Now, for the second part, recall that when we say “if and only if” in a theorem statement we’re saying that this is kind of a two way street. This statement is saying that if $\|\mathbf{v}\| = 0$ then we must also have $\mathbf{v} = \mathbf{0}$ and in the reverse it’s also saying that if $\mathbf{v} = \mathbf{0}$ then we must also have $\|\mathbf{v}\| = 0$.

To prove this we need to make each assumption and then prove that this will imply the other portion of the statement.

We’re only going to show the proof for the case where v is in 2-space. The proof in 3-space is identical. So, assume that $\mathbf{v} = (v_1, v_2)$ and let’s start the proof by assuming that $\|\mathbf{v}\| = 0$. Plugging into the formula for the norm gives,

$$0 = \sqrt{v_1^2 + v_2^2} \qquad \Rightarrow \qquad v_1^2 + v_2^2 = 0$$

As shown, the only way we’ll get zero out of a square root is if the quantity under the radical is zero. Now at this point we’ve got a sum of squares equaling zero. The only way this will happen is if the individual terms are zero. So, this means that,

$$v_1 = 0 \quad \& \quad v_2 = 0 \qquad \Rightarrow \qquad \mathbf{v} = (0, 0) = \mathbf{0}$$

So, if $\|\mathbf{v}\| = 0$ we must have $\mathbf{v} = \mathbf{0}$.

Next, let’s assume that $\mathbf{v} = \mathbf{0}$. In this case simply plug the components into the formula for the norm and a quick computation will show that $\|\mathbf{v}\| = 0$ and so we’re done.

Theorem 3 Given a non-zero vector v in 2-space or 3-space define a new vector $\mathbf{u} = \frac{1}{\|\mathbf{v}\|}\mathbf{v}$, then u is a unit vector.

Proof : This is a really simple proof, just notice that u is a scalar multiple of v and take the norm of u.

$$\|\mathbf{u}\| = \left\|\frac{1}{\|\mathbf{v}\|}\mathbf{v}\right\|$$

Now we know that $\|\mathbf{v}\| > 0$ because norms are always greater than or equal to zero, but will only be zero if we have the zero vector. In this case we’ve explicitly assumed that we don’t have the zero vector and so we know the norm will be strictly positive and this will allow us to drop the absolute value bars on the norm when we do the computation.

We can now do the following,

$$\|\mathbf{u}\| = \left\|\frac{1}{\|\mathbf{v}\|}\mathbf{v}\right\| = \left|\frac{1}{\|\mathbf{v}\|}\right|\|\mathbf{v}\| = \frac{1}{\|\mathbf{v}\|}\|\mathbf{v}\| = 1$$

So, u is a unit vector.

This theorem tells us that we can always turn a non-zero vector into a unit vector simply by
dividing by the norm. Note as well that because all we’re doing to compute this new unit vector
is scalar multiplication by a positive number this new unit vector will point in the same direction
as the original vector.


Example 4 Given $\mathbf{v} = (3, -1, -2)$ find a unit vector that,
(a) points in the same direction as v
(b) points in the opposite direction as v
Solution
(a) Now, as pointed out after the proof of the previous theorem, the unit vector computed in the theorem will point in the same direction as v so all we need to do is compute the norm of v and then use the theorem to find a unit vector that will point in the same direction as v.

$$\|\mathbf{v}\| = \sqrt{3^2 + (-1)^2 + (-2)^2} = \sqrt{14}$$

$$\mathbf{u} = \frac{1}{\sqrt{14}}(3, -1, -2) = \left(\frac{3}{\sqrt{14}}, -\frac{1}{\sqrt{14}}, -\frac{2}{\sqrt{14}}\right)$$

(b) We’ve done most of the work for this one. Since u is a unit vector that points in the same direction as v then its negative will be a unit vector that points in the opposite direction as v. So, here is the negative of u.

$$-\mathbf{u} = \left(-\frac{3}{\sqrt{14}}, \frac{1}{\sqrt{14}}, \frac{2}{\sqrt{14}}\right)$$

Finally, here is a sketch of all three of these vectors.


Dot Product & Cross Product


In this section we’re going to be taking a look at two special products of vectors, the dot product and the cross product. However, before we look at either one of them we need to get a quick definition out of the way.

Suppose that u and v are two vectors in 2-space or 3-space that are placed so that their initial points are the same. Then the angle between u and v is the angle $\theta$ that is formed by u and v such that $0 \le \theta \le \pi$. Below are some examples of angles between vectors.

Notice that there are always two angles that are formed by the two vectors and the one that we will always choose is the one that satisfies $0 \le \theta \le \pi$. We’ll be using this angle with both products.

So, let’s get started by taking a look at the dot product. Of the two products we’ll be looking at in
this section this is the one we’re going to run across most often in later sections. We’ll start with
the definition.

Definition 1 If u and v are two vectors in 2-space or 3-space and $\theta$ is the angle between them then the dot product, denoted by $\mathbf{u}\cdot\mathbf{v}$, is defined as,

$$\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta$$

Note that the dot product is sometimes called the scalar product or the Euclidean inner
product. Let’s see a quick example or two of the dot product.

Example 1 Compute the dot product for the following pairs of vectors.
(a) $\mathbf{u} = (0, 0, 3)$ and $\mathbf{v} = (2, 0, 2)$ which makes the angle between them $45°$.
(b) $\mathbf{u} = (0, 2, -1)$ and $\mathbf{v} = (-1, 1, 2)$ which makes the angle between them $90°$.
Solution
For reference purposes here is a sketch of the two sets of vectors.


(a) There really isn’t too much to do here with this problem.

$$\|\mathbf{u}\| = \sqrt{0 + 0 + 9} = 3 \qquad\qquad \|\mathbf{v}\| = \sqrt{4 + 0 + 4} = \sqrt{8} = 2\sqrt{2}$$
$$\mathbf{u}\cdot\mathbf{v} = (3)\left(2\sqrt{2}\right)\cos(45°) = 6\sqrt{2}\left(\frac{\sqrt{2}}{2}\right) = 6$$

(b) Nor is there a lot of work to do here.

$$\|\mathbf{u}\| = \sqrt{0 + 4 + 1} = \sqrt{5} \qquad\qquad \|\mathbf{v}\| = \sqrt{1 + 1 + 4} = \sqrt{6}$$
$$\mathbf{u}\cdot\mathbf{v} = \left(\sqrt{5}\right)\left(\sqrt{6}\right)\cos(90°) = \sqrt{30}\,(0) = 0$$

Now, there should be a question in everyone’s mind at this point. Just how did we arrive at those
angles above? They are the correct angles, but just how did we get them? That is the problem
with this definition of the dot product. If you don’t have the angles between two vectors you
can’t easily compute the dot product and sometimes finding the correct angles is not the easiest
thing to do.

Fortunately, there is another formula that we can use to compute the dot product that relies only on the components of the vectors and not the angle between them.

Theorem 1 Suppose that $\mathbf{u} = (u_1, u_2, u_3)$ and $\mathbf{v} = (v_1, v_2, v_3)$ are two vectors in 3-space then,

$$\mathbf{u}\cdot\mathbf{v} = u_1v_1 + u_2v_2 + u_3v_3$$

Likewise, if $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$ are two vectors in 2-space then,

$$\mathbf{u}\cdot\mathbf{v} = u_1v_1 + u_2v_2$$

Proof : We’ll just prove the 3-space version of this theorem. The 2-space version has a similar
proof. Let’s start out with the following figure.


So, these three vectors form a triangle and the lengths of the sides are $\|\mathbf{u}\|$, $\|\mathbf{v}\|$, and $\|\mathbf{v} - \mathbf{u}\|$. Now, from the Law of Cosines we know that,

$$\|\mathbf{v} - \mathbf{u}\|^2 = \|\mathbf{v}\|^2 + \|\mathbf{u}\|^2 - 2\|\mathbf{v}\|\,\|\mathbf{u}\|\cos\theta$$

Now, plug in the definition of the dot product and solve for $\mathbf{u}\cdot\mathbf{v}$.

$$\|\mathbf{v} - \mathbf{u}\|^2 = \|\mathbf{v}\|^2 + \|\mathbf{u}\|^2 - 2(\mathbf{u}\cdot\mathbf{v})$$
$$\mathbf{u}\cdot\mathbf{v} = \frac{1}{2}\left(\|\mathbf{v}\|^2 + \|\mathbf{u}\|^2 - \|\mathbf{v} - \mathbf{u}\|^2\right) \qquad (1)$$

Next, we know that $\mathbf{v} - \mathbf{u} = (v_1 - u_1, v_2 - u_2, v_3 - u_3)$ and so we can compute $\|\mathbf{v} - \mathbf{u}\|^2$. Note as well that because of the square on the norm we won’t have a square root. We’ll also do all of the multiplications.

$$\|\mathbf{v} - \mathbf{u}\|^2 = (v_1 - u_1)^2 + (v_2 - u_2)^2 + (v_3 - u_3)^2$$
$$= v_1^2 - 2v_1u_1 + u_1^2 + v_2^2 - 2v_2u_2 + u_2^2 + v_3^2 - 2v_3u_3 + u_3^2$$
$$= v_1^2 + v_2^2 + v_3^2 + u_1^2 + u_2^2 + u_3^2 - 2(v_1u_1 + v_2u_2 + v_3u_3)$$

The first three terms of this are nothing more than the formula for $\|\mathbf{v}\|^2$ and the next three terms are the formula for $\|\mathbf{u}\|^2$. So, let’s plug this into (1).

$$\mathbf{u}\cdot\mathbf{v} = \frac{1}{2}\Bigl(\|\mathbf{v}\|^2 + \|\mathbf{u}\|^2 - \bigl(\|\mathbf{v}\|^2 + \|\mathbf{u}\|^2 - 2(v_1u_1 + v_2u_2 + v_3u_3)\bigr)\Bigr) = \frac{1}{2}\bigl(2(v_1u_1 + v_2u_2 + v_3u_3)\bigr) = v_1u_1 + v_2u_2 + v_3u_3$$

And we’re done with the proof.

Before we work an example using this new (easier to use) formula let’s notice that if we rewrite the definition of the dot product as follows,

$$\cos\theta = \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}, \qquad 0 \le \theta \le \pi$$

we now have a very easy way to determine the angle between any two vectors. In fact this is how we got the angles between the vectors in the first example!

Example 2 Determine the angle between the following pairs of vectors.
(a) $\mathbf{a} = (9, -2) \qquad \mathbf{b} = (4, 18)$
(b) $\mathbf{u} = (3, -1, 6) \qquad \mathbf{v} = (4, 2, 0)$

Solution
(a) Here are all the important quantities for this problem.

$$\|\mathbf{a}\| = \sqrt{85} \qquad \|\mathbf{b}\| = \sqrt{340} \qquad \mathbf{a}\cdot\mathbf{b} = (9)(4) + (-2)(18) = 0$$

The angle is then,

$$\cos\theta = \frac{0}{\sqrt{85}\sqrt{340}} = 0 \qquad \Rightarrow \qquad \theta = 90°$$

(b) The important quantities for this part are,

$$\|\mathbf{u}\| = \sqrt{46} \qquad \|\mathbf{v}\| = \sqrt{20} \qquad \mathbf{u}\cdot\mathbf{v} = (3)(4) + (-1)(2) + (6)(0) = 10$$

The angle is then,

$$\cos\theta = \frac{10}{\sqrt{46}\sqrt{20}} = 0.3296902 \qquad \Rightarrow \qquad \theta = 70.75°$$

Note that we did need to use a calculator to get this result.
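The component formula and the rearranged definition make the angle computation mechanical; here is a hedged NumPy sketch (ours, not from the notes) reproducing part (b).

```python
import numpy as np

u = np.array([3.0, -1.0, 6.0])
v = np.array([4.0, 2.0, 0.0])

cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(cos_theta)                         # 0.32969...
print(np.degrees(np.arccos(cos_theta)))  # 70.75... degrees
```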

Twice now we’ve seen two vectors whose dot product is zero and in both cases we’ve seen that the angle between them was 90° and so the two vectors in question each time were perpendicular. Perpendicular vectors are called orthogonal and as we’ll see on occasion we often want to know if two vectors are orthogonal. The following theorem will give us a nice check for this.

Theorem 2 Two non-zero vectors, u and v, are orthogonal if and only if $\mathbf{u}\cdot\mathbf{v} = 0$.

Proof :
First suppose that u and v are orthogonal. This means that the angle between them is 90° and so from the definition of the dot product we have,

$$\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos(90°) = \|\mathbf{u}\|\,\|\mathbf{v}\|\,(0) = 0$$

and so we have $\mathbf{u}\cdot\mathbf{v} = 0$.

Next suppose that $\mathbf{u}\cdot\mathbf{v} = 0$, then from the definition of the dot product we have,

$$0 = \mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta \qquad \Rightarrow \qquad \cos\theta = 0 \qquad \Rightarrow \qquad \theta = 90°$$

and so the two vectors are orthogonal.

Note that we used the fact that the two vectors are non-zero, and hence would have non-zero magnitudes, in determining that we must have $\cos\theta = 0$.


If we take the convention that the zero vector is orthogonal to any other vector we can say that for any two vectors u and v they will be orthogonal provided $\mathbf{u}\cdot\mathbf{v} = 0$. Using this convention means we don’t need to worry about whether or not we have zero vectors.

Here are some nice properties about the dot product.

Theorem 3 Suppose that u, v, and w are three vectors that are all in 2-space or all in 3-space and that c is a scalar. Then,
(a) $\mathbf{v}\cdot\mathbf{v} = \|\mathbf{v}\|^2$ (this implies that $\|\mathbf{v}\| = (\mathbf{v}\cdot\mathbf{v})^{\frac{1}{2}}$)
(b) $\mathbf{u}\cdot\mathbf{v} = \mathbf{v}\cdot\mathbf{u}$
(c) $\mathbf{u}\cdot(\mathbf{v} + \mathbf{w}) = \mathbf{u}\cdot\mathbf{v} + \mathbf{u}\cdot\mathbf{w}$
(d) $c(\mathbf{u}\cdot\mathbf{v}) = (c\mathbf{u})\cdot\mathbf{v} = \mathbf{u}\cdot(c\mathbf{v})$
(e) $\mathbf{v}\cdot\mathbf{v} > 0$ if $\mathbf{v} \neq \mathbf{0}$
(f) $\mathbf{v}\cdot\mathbf{v} = 0$ if and only if $\mathbf{v} = \mathbf{0}$

We’ll prove the first couple and leave the rest to you to prove since they follow pretty much from either the definition of the dot product or the formula from Theorem 1. The proof of the last one is nearly identical to the proof of Theorem 2 in the previous section.

Proof :
(a) The angle between v and v is 0 since they are the same vector and so by the definition of the dot product we’ve got,

$$\mathbf{v}\cdot\mathbf{v} = \|\mathbf{v}\|\,\|\mathbf{v}\|\cos(0) = \|\mathbf{v}\|^2$$

To get the second part just take the square root of both sides.

(b) This proof is going to seem tricky but it’s really not that bad. Let’s just look at the 3-space case. So, $\mathbf{u} = (u_1, u_2, u_3)$ and $\mathbf{v} = (v_1, v_2, v_3)$ and the dot product $\mathbf{u}\cdot\mathbf{v}$ is

$$\mathbf{u}\cdot\mathbf{v} = u_1v_1 + u_2v_2 + u_3v_3$$

We can also compute $\mathbf{v}\cdot\mathbf{u}$ as follows,

$$\mathbf{v}\cdot\mathbf{u} = v_1u_1 + v_2u_2 + v_3u_3$$

However, since $u_1v_1 = v_1u_1$, etc. (they are just real numbers after all) these are identical and so we’ve got $\mathbf{u}\cdot\mathbf{v} = \mathbf{v}\cdot\mathbf{u}$.

Example 3 Given $\mathbf{u} = (5, -2)$, $\mathbf{v} = (0, 7)$ and $\mathbf{w} = (4, 10)$ compute the following.
(a) $\mathbf{u}\cdot\mathbf{u}$ and $\|\mathbf{u}\|^2$
(b) $\mathbf{u}\cdot\mathbf{w}$
(c) $(-2\mathbf{u})\cdot\mathbf{v}$ and $\mathbf{u}\cdot(-2\mathbf{v})$
Solution
(a) Okay, in this one we’ll be verifying part (a) of the previous theorem. Note as well that
because the norm is squared we’ll not need to have the square root in the computation. Here are
the computations for this part.


$$\mathbf{u}\cdot\mathbf{u} = (5)(5) + (-2)(-2) = 25 + 4 = 29$$
$$\|\mathbf{u}\|^2 = 5^2 + (-2)^2 = 29$$

So, as the theorem suggested we do have $\mathbf{u}\cdot\mathbf{u} = \|\mathbf{u}\|^2$.

(b) Here’s the dot product for this part.

$$\mathbf{u}\cdot\mathbf{w} = (5)(4) + (-2)(10) = 0$$

So, it looks like u and w are orthogonal.

(c) In this part we’ll be verifying part (d) of the previous theorem. Here are the computations for this part.

$$-2\mathbf{u} = (-10, 4) \qquad\qquad -2\mathbf{v} = (0, -14)$$
$$(-2\mathbf{u})\cdot\mathbf{v} = (-10)(0) + (4)(7) = 28$$
$$\mathbf{u}\cdot(-2\mathbf{v}) = (5)(0) + (-2)(-14) = 28$$

Again, we got the result that we should expect.

We now need to take a look at a very important application of the dot product. Let’s suppose that u and a are two vectors in 2-space or 3-space and let’s suppose that they are positioned so that their initial points are the same. What we want to do is “decompose” the vector u into two components. One, which we’ll denote $\mathbf{v}_1$ for now, will be parallel to the vector a and the other, denoted $\mathbf{v}_2$ for now, will be orthogonal to a. See the image below to see some examples of this kind of decomposition.

From these figures we can see how to actually construct the two pieces of our decomposition. Starting at u we drop a line straight down until it intersects a (or the line defined by a as in the second case). The parallel vector $\mathbf{v}_1$ is then the vector that starts at the initial point of u and ends where the perpendicular line intersects a. Finding $\mathbf{v}_2$ is actually really simple provided we first have $\mathbf{v}_1$. From the image we can see that we have,

$$\mathbf{v}_1 + \mathbf{v}_2 = \mathbf{u} \qquad \Rightarrow \qquad \mathbf{v}_2 = \mathbf{u} - \mathbf{v}_1$$

We now need to get some terminology and notation out of the way. The parallel vector, $\mathbf{v}_1$, is called the orthogonal projection of u on a and is denoted by $\text{proj}_{\mathbf{a}}\mathbf{u}$. Note that sometimes $\text{proj}_{\mathbf{a}}\mathbf{u}$ is called the vector component of u along a. The orthogonal vector, $\mathbf{v}_2$, is called the vector component of u orthogonal to a.


The following theorem gives us formulas for computing both of these vectors.

Theorem 4 Suppose that u and $\mathbf{a} \neq \mathbf{0}$ are both vectors in 2-space or 3-space then,

$$\text{proj}_{\mathbf{a}}\mathbf{u} = \frac{\mathbf{u}\cdot\mathbf{a}}{\|\mathbf{a}\|^2}\,\mathbf{a}$$

and the vector component of u orthogonal to a is given by,

$$\mathbf{u} - \text{proj}_{\mathbf{a}}\mathbf{u} = \mathbf{u} - \frac{\mathbf{u}\cdot\mathbf{a}}{\|\mathbf{a}\|^2}\,\mathbf{a}$$

Proof : First let $\mathbf{v}_1 = \text{proj}_{\mathbf{a}}\mathbf{u}$ then $\mathbf{u} - \text{proj}_{\mathbf{a}}\mathbf{u}$ will be the vector component of u orthogonal to a and so all we need to do is show the formula for $\mathbf{v}_1$ is what we claimed it to be.

To do this let’s first note that since $\mathbf{v}_1$ is parallel to a then it must be a scalar multiple of a since we know from the last section that parallel vectors are scalar multiples of each other. Therefore there is a scalar c such that $\mathbf{v}_1 = c\mathbf{a}$. Now, let’s start with the following,

$$\mathbf{u} = \mathbf{v}_1 + \mathbf{v}_2 = c\mathbf{a} + \mathbf{v}_2$$

Next take the dot product of both sides with a and distribute the dot product through the parenthesis.

$$\mathbf{u}\cdot\mathbf{a} = (c\mathbf{a} + \mathbf{v}_2)\cdot\mathbf{a} = c\,\mathbf{a}\cdot\mathbf{a} + \mathbf{v}_2\cdot\mathbf{a}$$

Now, $\mathbf{a}\cdot\mathbf{a} = \|\mathbf{a}\|^2$ and $\mathbf{v}_2\cdot\mathbf{a} = 0$ because $\mathbf{v}_2$ is orthogonal to a. Therefore this reduces to,

$$\mathbf{u}\cdot\mathbf{a} = c\,\|\mathbf{a}\|^2 \qquad \Rightarrow \qquad c = \frac{\mathbf{u}\cdot\mathbf{a}}{\|\mathbf{a}\|^2}$$

and so we get,

$$\mathbf{v}_1 = \text{proj}_{\mathbf{a}}\mathbf{u} = \frac{\mathbf{u}\cdot\mathbf{a}}{\|\mathbf{a}\|^2}\,\mathbf{a}$$

We can also take a quick norm of $\text{proj}_{\mathbf{a}}\mathbf{u}$ to get a nice formula for the magnitude of the orthogonal projection of u on a.

$$\|\text{proj}_{\mathbf{a}}\mathbf{u}\| = \left\|\frac{\mathbf{u}\cdot\mathbf{a}}{\|\mathbf{a}\|^2}\,\mathbf{a}\right\| = \frac{|\mathbf{u}\cdot\mathbf{a}|}{\|\mathbf{a}\|^2}\,\|\mathbf{a}\| = \frac{|\mathbf{u}\cdot\mathbf{a}|}{\|\mathbf{a}\|}$$

Let’s work a quick example or two of orthogonal projections.


Example 4 Compute the orthogonal projection of u on a and the vector component of u
orthogonal to a for each of the following.
(a) $\mathbf{u} = \left( -3, 1 \right)$, $\mathbf{a} = \left( 7, 2 \right)$
(b) $\mathbf{u} = \left( 4, 0, -1 \right)$, $\mathbf{a} = \left( 3, 1, -5 \right)$

Solution
There really isn't much to do here other than to plug into the formulas so we'll leave it to you to
verify the details.

(a) First,
$$\mathbf{u} \cdot \mathbf{a} = -19 \qquad \qquad \left\| \mathbf{a} \right\|^2 = 53$$
Now the orthogonal projection of u on a is,
$$\text{proj}_{\mathbf{a}} \mathbf{u} = \frac{-19}{53} \left( 7, 2 \right) = \left( -\frac{133}{53}, -\frac{38}{53} \right)$$
and the vector component of u orthogonal to a is,
$$\mathbf{u} - \text{proj}_{\mathbf{a}} \mathbf{u} = \left( -3, 1 \right) - \left( -\frac{133}{53}, -\frac{38}{53} \right) = \left( -\frac{26}{53}, \frac{91}{53} \right)$$

(b) First,
$$\mathbf{u} \cdot \mathbf{a} = 17 \qquad \qquad \left\| \mathbf{a} \right\|^2 = 35$$
Now the orthogonal projection of u on a is,
$$\text{proj}_{\mathbf{a}} \mathbf{u} = \frac{17}{35} \left( 3, 1, -5 \right) = \left( \frac{51}{35}, \frac{17}{35}, -\frac{17}{7} \right)$$
and the vector component of u orthogonal to a is,
$$\mathbf{u} - \text{proj}_{\mathbf{a}} \mathbf{u} = \left( 4, 0, -1 \right) - \left( \frac{51}{35}, \frac{17}{35}, -\frac{17}{7} \right) = \left( \frac{89}{35}, -\frac{17}{35}, \frac{10}{7} \right)$$
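If you want to check projections like these on a machine, here is a small NumPy sketch (our own addition; the function name `proj` is ours) that reproduces part (a):

```python
import numpy as np

def proj(a, u):
    """Orthogonal projection of u on a, i.e. (u.a / ||a||^2) a."""
    return (np.dot(u, a) / np.dot(a, a)) * a

u = np.array([-3.0, 1.0])
a = np.array([7.0, 2.0])

p = proj(a, u)        # (-133/53, -38/53) ~ [-2.5094, -0.7170]
q = u - p             # (-26/53, 91/53)   ~ [-0.4906,  1.7170]
print(p, q)
print(np.dot(q, a))   # ~0, so the component q really is orthogonal to a
```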

We need to be very careful with the notation $\text{proj}_{\mathbf{a}} \mathbf{u}$. In this notation we are looking for the
orthogonal projection of u (the second vector listed) on a (the vector that is subscripted). Let's do
a quick example illustrating this.

Example 5 Given $\mathbf{u} = \left( 4, -5 \right)$ and $\mathbf{a} = \left( 1, -1 \right)$ compute,
(a) $\text{proj}_{\mathbf{a}} \mathbf{u}$
(b) $\text{proj}_{\mathbf{u}} \mathbf{a}$

Solution
(a) In this case we are looking for the component of u that is parallel to a and so the orthogonal
projection is given by,
$$\text{proj}_{\mathbf{a}} \mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{a}}{\left\| \mathbf{a} \right\|^2}\, \mathbf{a}$$
so let's get all the quantities that we need.
$$\mathbf{u} \cdot \mathbf{a} = \left( 4 \right)\left( 1 \right) + \left( -5 \right)\left( -1 \right) = 9 \qquad \qquad \left\| \mathbf{a} \right\|^2 = \left( 1 \right)^2 + \left( -1 \right)^2 = 2$$

The projection is then,
$$\text{proj}_{\mathbf{a}} \mathbf{u} = \frac{9}{2} \left( 1, -1 \right) = \left( \frac{9}{2}, -\frac{9}{2} \right)$$

(b) Here we are looking for the component of a that is parallel to u and so the orthogonal
projection is given by,
$$\text{proj}_{\mathbf{u}} \mathbf{a} = \frac{\mathbf{a} \cdot \mathbf{u}}{\left\| \mathbf{u} \right\|^2}\, \mathbf{u}$$
so let's get the quantities that we need for this part.
$$\mathbf{a} \cdot \mathbf{u} = \mathbf{u} \cdot \mathbf{a} = 9 \qquad \qquad \left\| \mathbf{u} \right\|^2 = \left( 4 \right)^2 + \left( -5 \right)^2 = 41$$

The projection is then,
$$\text{proj}_{\mathbf{u}} \mathbf{a} = \frac{9}{41} \left( 4, -5 \right) = \left( \frac{36}{41}, -\frac{45}{41} \right)$$

As this example has shown we need to pay attention to the placement of the two vectors in the
projection notation. Each part above was asking for something different and as shown we did in
fact get different answers so be careful.

It's now time to move into the second vector product that we're going to look at in this section.
However before we do that we need to introduce the idea of the standard unit vectors or
standard basis vectors for 3-space. These vectors are defined as follows,

$$\mathbf{i} = \left( 1, 0, 0 \right) \qquad \qquad \mathbf{j} = \left( 0, 1, 0 \right) \qquad \qquad \mathbf{k} = \left( 0, 0, 1 \right)$$

Each of these has a magnitude of 1 and so they are unit vectors. Also note that each one lies along
one of the coordinate axes of 3-space and points in the positive direction as shown below.

Notice that any vector in 3-space, say $\mathbf{u} = \left( u_1, u_2, u_3 \right)$, can be written in terms of these three
vectors as follows,

$$\begin{aligned} \mathbf{u} &= \left( u_1, u_2, u_3 \right) \\ &= \left( u_1, 0, 0 \right) + \left( 0, u_2, 0 \right) + \left( 0, 0, u_3 \right) \\ &= u_1 \left( 1, 0, 0 \right) + u_2 \left( 0, 1, 0 \right) + u_3 \left( 0, 0, 1 \right) \\ &= u_1 \mathbf{i} + u_2 \mathbf{j} + u_3 \mathbf{k} \end{aligned}$$

So, for example we can do the following,

$$\left( -10, 4, 3 \right) = -10\,\mathbf{i} + 4\,\mathbf{j} + 3\,\mathbf{k} \qquad \qquad \left( -1, 0, 2 \right) = -\mathbf{i} + 2\,\mathbf{k}$$

Also note that if we define $\mathbf{i} = \left( 1, 0 \right)$ and $\mathbf{j} = \left( 0, 1 \right)$ these two vectors are the standard basis
vectors for 2-space and any vector in 2-space, say $\mathbf{u} = \left( u_1, u_2 \right)$, can be written as,

$$\mathbf{u} = \left( u_1, u_2 \right) = u_1 \mathbf{i} + u_2 \mathbf{j}$$

We're not going to need the 2-space version of things here, but it was worth pointing out that
there is a 2-space version since we'll need it down the road.

Okay we are now ready to look at the cross product. The first thing that we need to point out here
is that, unlike the dot product, this is only valid in 3-space. There are three different ways of
defining it depending on how you want to do it. The following definition gives all three.

Definition 2 If u and v are two vectors in 3-space then the cross product, denoted by $\mathbf{u} \times \mathbf{v}$,
is defined in one of three ways.
(a) $\mathbf{u} \times \mathbf{v} = \left( u_2 v_3 - u_3 v_2, u_3 v_1 - u_1 v_3, u_1 v_2 - u_2 v_1 \right)$ - Vector Notation.
(b) $\mathbf{u} \times \mathbf{v} = \left( \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix}, -\begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix}, \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix} \right)$ - Using $2 \times 2$ determinants
(c) $\mathbf{u} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}$ - Using $3 \times 3$ determinants

Note that all three of these definitions are equivalent as you can check by computing the
determinants in the second and third definition and verifying that you get the same formula as in
the first definition.

Notice that the cross product of two vectors is a new vector unlike the dot product which gives a
scalar. Make sure to keep these two products straight.

Let’s take a quick look at an example of a cross product.

Example 6 Compute $\mathbf{u} \times \mathbf{v}$ for $\mathbf{u} = \left( 4, -9, 1 \right)$ and $\mathbf{v} = \left( 3, -2, 7 \right)$.

Solution
You can use any of the three definitions above to compute this cross product. We'll use the
third one. If you don't remember how to compute determinants you might want to go back and
check out the first section of the Determinants chapter. In that section you'll find the formulas for
computing determinants of both $2 \times 2$ and $3 \times 3$ matrices.

$$\mathbf{u} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 4 & -9 & 1 \\ 3 & -2 & 7 \end{vmatrix} = -63\,\mathbf{i} + 3\,\mathbf{j} - 8\,\mathbf{k} - 28\,\mathbf{j} + 2\,\mathbf{i} + 27\,\mathbf{k} = -61\,\mathbf{i} - 25\,\mathbf{j} + 19\,\mathbf{k}$$

When we're using this definition of the cross product we'll always get the answer in terms of the
standard basis vectors. However, we can always go back to the form we're used to. Doing this
gives,

$$\mathbf{u} \times \mathbf{v} = \left( -61, -25, 19 \right)$$
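NumPy happens to have a built-in cross product, so results like this are easy to double-check. A quick sketch (our own addition) that compares the first definition, written out by hand, against the built-in:

```python
import numpy as np

u = np.array([4.0, -9.0, 1.0])
v = np.array([3.0, -2.0, 7.0])

# First definition, written out component by component...
by_hand = np.array([u[1]*v[2] - u[2]*v[1],
                    u[2]*v[0] - u[0]*v[2],
                    u[0]*v[1] - u[1]*v[0]])

# ...and NumPy's built-in version. Both give (-61, -25, 19).
print(by_hand, np.cross(u, v))
```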

Here is a theorem listing the main properties of the cross product.

Theorem 5 Suppose u, v, and w are vectors in 3-space and c is any scalar then
(a) $\mathbf{u} \times \mathbf{v} = -\left( \mathbf{v} \times \mathbf{u} \right)$
(b) $\mathbf{u} \times \left( \mathbf{v} + \mathbf{w} \right) = \left( \mathbf{u} \times \mathbf{v} \right) + \left( \mathbf{u} \times \mathbf{w} \right)$
(c) $\left( \mathbf{u} + \mathbf{v} \right) \times \mathbf{w} = \left( \mathbf{u} \times \mathbf{w} \right) + \left( \mathbf{v} \times \mathbf{w} \right)$
(d) $c \left( \mathbf{u} \times \mathbf{v} \right) = \left( c\,\mathbf{u} \right) \times \mathbf{v} = \mathbf{u} \times \left( c\,\mathbf{v} \right)$
(e) $\mathbf{u} \times \mathbf{0} = \mathbf{0} \times \mathbf{u} = \mathbf{0}$
(f) $\mathbf{u} \times \mathbf{u} = \mathbf{0}$

The proofs of all these properties come directly from the definition of the cross product and so are
left to you to verify.

There are also quite a few properties that relate the dot product and the cross product. Here is a
theorem giving those properties.

Theorem 6 Suppose u, v, and w are vectors in 3-space then,
(a) $\mathbf{u} \cdot \left( \mathbf{u} \times \mathbf{v} \right) = 0$
(b) $\mathbf{v} \cdot \left( \mathbf{u} \times \mathbf{v} \right) = 0$
(c) $\left\| \mathbf{u} \times \mathbf{v} \right\|^2 = \left\| \mathbf{u} \right\|^2 \left\| \mathbf{v} \right\|^2 - \left( \mathbf{u} \cdot \mathbf{v} \right)^2$ - This is called Lagrange's Identity
(d) $\mathbf{u} \times \left( \mathbf{v} \times \mathbf{w} \right) = \left( \mathbf{u} \cdot \mathbf{w} \right) \mathbf{v} - \left( \mathbf{u} \cdot \mathbf{v} \right) \mathbf{w}$
(e) $\left( \mathbf{u} \times \mathbf{v} \right) \times \mathbf{w} = \left( \mathbf{u} \cdot \mathbf{w} \right) \mathbf{v} - \left( \mathbf{v} \cdot \mathbf{w} \right) \mathbf{u}$

The proofs of all these properties come directly from the definitions of the cross product and the dot
product and so are left to you to verify.

The first two properties deserve some closer inspection. What they are saying is that given two
vectors u and v in 3-space the cross product $\mathbf{u} \times \mathbf{v}$ is orthogonal to both u and v. The image
below shows this idea.


As this figure shows there are two directions in which the cross product could point and still be
orthogonal to u and v, and there is a nice way to determine which it will be. Take your right hand and cup your
fingers so that they point in the direction of rotation that is shown in the figures (i.e. rotate u until
it lies on top of v) and hold your thumb out. Your thumb will point in the direction of the cross
product. This is often called the right-hand rule.

Notice that part (a) of Theorem 5 above also gives this same result. If we flip the order in which
we take the cross product (which is really what we did in the figure above when we interchanged
the letters) we get $\mathbf{u} \times \mathbf{v} = -\left( \mathbf{v} \times \mathbf{u} \right)$. In other words, in one order we get a cross product that
points in one direction and if we flip the order we get a new cross product that points in the
opposite direction as the first one.

Let’s work a couple more cross products to verify some of the properties listed above and so we
can say we’ve got a couple more examples in the notes.

Example 7 Given $\mathbf{u} = \left( 3, -1, 4 \right)$ and $\mathbf{v} = \left( 2, 0, 1 \right)$ compute each of the following.
(a) $\mathbf{u} \times \mathbf{v}$ and $\mathbf{v} \times \mathbf{u}$ [Solution]
(b) $\mathbf{u} \times \mathbf{u}$ [Solution]
(c) $\mathbf{u} \cdot \left( \mathbf{u} \times \mathbf{v} \right)$ and $\mathbf{v} \cdot \left( \mathbf{u} \times \mathbf{v} \right)$ [Solution]

Solution
In the solutions to these problems we will be using the third definition above and we'll be setting
up the determinant. We will not be showing the determinant computation however, so if you need a
reminder on how to take determinants go back to the first section in the Determinant chapter for a
refresher.

(a) $\mathbf{u} \times \mathbf{v}$ and $\mathbf{v} \times \mathbf{u}$

Let's compute $\mathbf{u} \times \mathbf{v}$ first.

$$\mathbf{u} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 3 & -1 & 4 \\ 2 & 0 & 1 \end{vmatrix} = -\mathbf{i} + 5\,\mathbf{j} + 2\,\mathbf{k} = \left( -1, 5, 2 \right)$$

Remember that we'll get the answers here in terms of the standard basis vectors and these can
always be put back into the standard vector notation that we've been using to this point as we did
above.

Now let's compute $\mathbf{v} \times \mathbf{u}$.

$$\mathbf{v} \times \mathbf{u} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 2 & 0 & 1 \\ 3 & -1 & 4 \end{vmatrix} = \mathbf{i} - 5\,\mathbf{j} - 2\,\mathbf{k} = \left( 1, -5, -2 \right)$$

So, as part (a) of Theorem 5 suggested we got $\mathbf{u} \times \mathbf{v} = -\left( \mathbf{v} \times \mathbf{u} \right)$.

(b) $\mathbf{u} \times \mathbf{u}$
Not much to do here other than do the cross product and note that part (f) of Theorem 5 implies
that we should get $\mathbf{u} \times \mathbf{u} = \mathbf{0}$.

$$\mathbf{u} \times \mathbf{u} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 3 & -1 & 4 \\ 3 & -1 & 4 \end{vmatrix} = \left( 0, 0, 0 \right)$$

So, sure enough we got 0.

(c) $\mathbf{u} \cdot \left( \mathbf{u} \times \mathbf{v} \right)$ and $\mathbf{v} \cdot \left( \mathbf{u} \times \mathbf{v} \right)$
We've already got $\mathbf{u} \times \mathbf{v}$ computed so we just need to do a couple of dot products and according
to Theorem 6 both u and v are orthogonal to $\mathbf{u} \times \mathbf{v}$ and so we should get zero out of both of these.

$$\mathbf{u} \cdot \left( \mathbf{u} \times \mathbf{v} \right) = \left( 3 \right)\left( -1 \right) + \left( -1 \right)\left( 5 \right) + \left( 4 \right)\left( 2 \right) = 0$$
$$\mathbf{v} \cdot \left( \mathbf{u} \times \mathbf{v} \right) = \left( 2 \right)\left( -1 \right) + \left( 0 \right)\left( 5 \right) + \left( 1 \right)\left( 2 \right) = 0$$

And we did get zero as expected.

We'll give one theorem on cross products relating the magnitude of the cross product to the
magnitudes of the two vectors we're taking the cross product of.

Theorem 7 Suppose that u and v are vectors in 3-space and let $\theta$ be the angle between them
then,

$$\left\| \mathbf{u} \times \mathbf{v} \right\| = \left\| \mathbf{u} \right\| \left\| \mathbf{v} \right\| \sin \theta$$

Let’s take a look at one final example here.

Example 8 Given $\mathbf{u} = \left( 1, -1, 0 \right)$ and $\mathbf{v} = \left( 0, -2, 0 \right)$ verify the results of Theorem 7.

Solution
Let's get the cross product and the norms taken care of first.

$$\mathbf{u} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & -1 & 0 \\ 0 & -2 & 0 \end{vmatrix} = \left( 0, 0, -2 \right) \qquad \qquad \left\| \mathbf{u} \times \mathbf{v} \right\| = \sqrt{0 + 0 + 4} = 2$$

$$\left\| \mathbf{u} \right\| = \sqrt{1 + 1 + 0} = \sqrt{2} \qquad \qquad \left\| \mathbf{v} \right\| = \sqrt{0 + 4 + 0} = 2$$

Now, in order to verify Theorem 7 we'll need the angle between the two vectors and we can use
the definition of the dot product above to find this. We'll first need the dot product.

$$\mathbf{u} \cdot \mathbf{v} = 2 \qquad \Rightarrow \qquad \cos \theta = \frac{2}{\left( \sqrt{2} \right)\left( 2 \right)} = \frac{1}{\sqrt{2}} \qquad \Rightarrow \qquad \theta = 45^\circ$$

All that's left is to check the formula.

$$\left\| \mathbf{u} \right\| \left\| \mathbf{v} \right\| \sin \theta = \left( \sqrt{2} \right)\left( 2 \right) \sin \left( 45^\circ \right) = \left( \sqrt{2} \right)\left( 2 \right) \left( \frac{\sqrt{2}}{2} \right) = 2 = \left\| \mathbf{u} \times \mathbf{v} \right\|$$

So, the theorem is verified.
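The same verification in NumPy (our own addition); `np.arccos` recovers the angle from the dot product formula:

```python
import numpy as np

u = np.array([1.0, -1.0, 0.0])
v = np.array([0.0, -2.0, 0.0])

# Angle between u and v from the dot product formula.
theta = np.arccos(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
print(np.degrees(theta))                                     # 45.0

lhs = np.linalg.norm(np.cross(u, v))                         # ||u x v|| = 2
rhs = np.linalg.norm(u) * np.linalg.norm(v) * np.sin(theta)  # ~2
print(lhs, rhs)
```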


Euclidean n-Space
In the first two sections of this chapter we looked at vectors in 2-space and 3-space. You
probably noticed that with the exception of the cross product (which is only defined in 3-space)
all of the formulas that we had for vectors in 3-space were natural extensions of the 2-space
formulas. In this section we’re going to extend things out to a much more general setting. We
won’t be able to visualize things in a geometric setting as we did in the previous two sections but
things will extend out nicely. In fact, that was why we started in 2-space and 3-space. We
wanted to start out in a setting where we could visualize some of what was going on before we
generalized things into a setting where visualization was a very difficult thing to do.

So, let’s get things started off with the following definition.

Definition 1 Given a positive integer n an ordered n-tuple is a sequence of n real numbers
denoted by $\left( a_1, a_2, \ldots, a_n \right)$. The complete set of all ordered n-tuples is called n-space and is
denoted by $\mathbb{R}^n$.

In the previous sections we were looking at $\mathbb{R}^2$ (what we were calling 2-space) and $\mathbb{R}^3$ (what we
were calling 3-space). Also, the more standard terms for 2-tuples and 3-tuples are ordered pair
and ordered triplet and those are the terms we'll be using from this point on.

Also, as we pointed out in the previous sections an ordered pair, $\left( a_1, a_2 \right)$, or an ordered triplet,
$\left( a_1, a_2, a_3 \right)$, can be thought of as either a point or a vector in $\mathbb{R}^2$ or $\mathbb{R}^3$ respectively. In general
an ordered n-tuple, $\left( a_1, a_2, \ldots, a_n \right)$, can also be thought of as a "point" or a vector in $\mathbb{R}^n$. Again,
we can't really visualize a point or a vector in $\mathbb{R}^n$, but we will think of them as points or vectors
in $\mathbb{R}^n$ anyway and try not to worry too much about the fact that we can't really visualize them.

Next, we need to get the standard arithmetic definitions out of the way and all of these are going
to be natural extensions of the arithmetic we saw in $\mathbb{R}^2$ and $\mathbb{R}^3$.

Definition 2 Suppose $\mathbf{u} = \left( u_1, u_2, \ldots, u_n \right)$ and $\mathbf{v} = \left( v_1, v_2, \ldots, v_n \right)$ are two vectors in $\mathbb{R}^n$.
(a) We say that u and v are equal if,
$$u_1 = v_1 \qquad u_2 = v_2 \qquad \cdots \qquad u_n = v_n$$
(b) The sum of u and v is defined to be,
$$\mathbf{u} + \mathbf{v} = \left( u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n \right)$$
(c) The negative (or additive inverse) of u is defined to be,
$$-\mathbf{u} = \left( -u_1, -u_2, \ldots, -u_n \right)$$
(d) The difference of two vectors is defined to be,
$$\mathbf{u} - \mathbf{v} = \mathbf{u} + \left( -\mathbf{v} \right) = \left( u_1 - v_1, u_2 - v_2, \ldots, u_n - v_n \right)$$
(e) If c is any scalar then the scalar multiple of u is defined to be,
$$c\,\mathbf{u} = \left( c u_1, c u_2, \ldots, c u_n \right)$$
(f) The zero vector in $\mathbb{R}^n$ is denoted by 0 and is defined to be,
$$\mathbf{0} = \left( 0, 0, \ldots, 0 \right)$$


The basic properties of arithmetic are still valid in $\mathbb{R}^n$ so let's also give those so that we can say
that we've done that.

Theorem 1 Suppose $\mathbf{u} = \left( u_1, u_2, \ldots, u_n \right)$, $\mathbf{v} = \left( v_1, v_2, \ldots, v_n \right)$ and $\mathbf{w} = \left( w_1, w_2, \ldots, w_n \right)$ are
vectors in $\mathbb{R}^n$ and c and k are scalars then,
(a) $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$
(b) $\mathbf{u} + \left( \mathbf{v} + \mathbf{w} \right) = \left( \mathbf{u} + \mathbf{v} \right) + \mathbf{w}$
(c) $\mathbf{u} + \mathbf{0} = \mathbf{0} + \mathbf{u} = \mathbf{u}$
(d) $\mathbf{u} - \mathbf{u} = \mathbf{u} + \left( -\mathbf{u} \right) = \mathbf{0}$
(e) $1\mathbf{u} = \mathbf{u}$
(f) $\left( ck \right) \mathbf{u} = c \left( k\mathbf{u} \right) = k \left( c\mathbf{u} \right)$
(g) $\left( c + k \right) \mathbf{u} = c\mathbf{u} + k\mathbf{u}$
(h) $c \left( \mathbf{u} + \mathbf{v} \right) = c\mathbf{u} + c\mathbf{v}$

The proof of all of these come directly from the definitions above and so won’t be given here.

We now need to extend the dot product we saw in the previous section to $\mathbb{R}^n$ and we'll be giving
it a new name as well.

Definition 3 Suppose $\mathbf{u} = \left( u_1, u_2, \ldots, u_n \right)$ and $\mathbf{v} = \left( v_1, v_2, \ldots, v_n \right)$ are two vectors in $\mathbb{R}^n$ then
the Euclidean inner product, denoted by $\mathbf{u} \cdot \mathbf{v}$, is defined to be

$$\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n$$

So, we can see that it's the same notation and is a natural extension of the dot product that we
looked at in the previous section, we're just going to call it something different now. In fact, this
is probably the more correct name for it and it would be more accurate to say that we renamed it
to the dot product when we were working exclusively in $\mathbb{R}^2$ and $\mathbb{R}^3$.

Note that when we add in addition, scalar multiplication and the Euclidean inner product to $\mathbb{R}^n$
we will often call this Euclidean n-space.

We also have natural extensions of the properties of the dot product that we saw in the previous
section.

Theorem 2 Suppose $\mathbf{u} = \left( u_1, u_2, \ldots, u_n \right)$, $\mathbf{v} = \left( v_1, v_2, \ldots, v_n \right)$, and $\mathbf{w} = \left( w_1, w_2, \ldots, w_n \right)$ are
vectors in $\mathbb{R}^n$ and let c be a scalar then,
(a) $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$
(b) $\left( \mathbf{u} + \mathbf{v} \right) \cdot \mathbf{w} = \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}$
(c) $c \left( \mathbf{u} \cdot \mathbf{v} \right) = \left( c\,\mathbf{u} \right) \cdot \mathbf{v} = \mathbf{u} \cdot \left( c\,\mathbf{v} \right)$
(d) $\mathbf{u} \cdot \mathbf{u} \ge 0$
(e) $\mathbf{u} \cdot \mathbf{u} = 0$ if and only if $\mathbf{u} = \mathbf{0}$.


The proofs of the parts of this theorem fall directly from the definition of the Euclidean inner product and are
extensions of proofs given in the previous section and so aren't given here.

The final extension to the work of the previous sections that we need to do is to give the
definition of the norm for vectors in $\mathbb{R}^n$ and we'll use this to define distance in $\mathbb{R}^n$.

Definition 4 Suppose $\mathbf{u} = \left( u_1, u_2, \ldots, u_n \right)$ is a vector in $\mathbb{R}^n$ then the Euclidean norm is,

$$\left\| \mathbf{u} \right\| = \left( \mathbf{u} \cdot \mathbf{u} \right)^{\frac{1}{2}} = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$$

Definition 5 Suppose $\mathbf{u} = \left( u_1, u_2, \ldots, u_n \right)$ and $\mathbf{v} = \left( v_1, v_2, \ldots, v_n \right)$ are two points in $\mathbb{R}^n$ then the
Euclidean distance between them is defined to be,

$$d \left( \mathbf{u}, \mathbf{v} \right) = \left\| \mathbf{u} - \mathbf{v} \right\| = \sqrt{\left( u_1 - v_1 \right)^2 + \left( u_2 - v_2 \right)^2 + \cdots + \left( u_n - v_n \right)^2}$$

Notice in this definition that we called u and v points and then used them as vectors in the norm.
This comes back to the idea that an n-tuple can be thought of as both a point and a vector and so
will often be used interchangeably where needed.

Let’s take a quick look at a couple of examples.

Example 1 Given $\mathbf{u} = \left( 9, 3, -4, 0, 1 \right)$ and $\mathbf{v} = \left( 0, -3, 2, -1, 7 \right)$ compute
(a) $\mathbf{u} - 4\mathbf{v}$
(b) $\mathbf{v} \cdot \mathbf{u}$
(c) $\mathbf{u} \cdot \mathbf{u}$
(d) $\left\| \mathbf{u} \right\|$
(e) $d \left( \mathbf{u}, \mathbf{v} \right)$

Solution
There really isn't much to do here other than use the appropriate definition.
(a)
$$\mathbf{u} - 4\mathbf{v} = \left( 9, 3, -4, 0, 1 \right) - 4 \left( 0, -3, 2, -1, 7 \right) = \left( 9, 3, -4, 0, 1 \right) - \left( 0, -12, 8, -4, 28 \right) = \left( 9, 15, -12, 4, -27 \right)$$
(b)
$$\mathbf{v} \cdot \mathbf{u} = \left( 0 \right)\left( 9 \right) + \left( -3 \right)\left( 3 \right) + \left( 2 \right)\left( -4 \right) + \left( -1 \right)\left( 0 \right) + \left( 7 \right)\left( 1 \right) = -10$$
(c)
$$\mathbf{u} \cdot \mathbf{u} = 9^2 + 3^2 + \left( -4 \right)^2 + 0^2 + 1^2 = 107$$
(d)
$$\left\| \mathbf{u} \right\| = \sqrt{9^2 + 3^2 + \left( -4 \right)^2 + 0^2 + 1^2} = \sqrt{107}$$
(e)
$$d \left( \mathbf{u}, \mathbf{v} \right) = \sqrt{\left( 9 - 0 \right)^2 + \left( 3 - \left( -3 \right) \right)^2 + \left( -4 - 2 \right)^2 + \left( 0 - \left( -1 \right) \right)^2 + \left( 1 - 7 \right)^2} = \sqrt{190}$$
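All of these operations carry over directly to NumPy arrays of any length, so here is a quick sketch (our own addition) reproducing the computations above:

```python
import numpy as np

u = np.array([9.0, 3.0, -4.0, 0.0, 1.0])
v = np.array([0.0, -3.0, 2.0, -1.0, 7.0])

print(u - 4 * v)              # [  9.  15. -12.   4. -27.]
print(np.dot(v, u))           # -10.0
print(np.dot(u, u))           # 107.0
print(np.linalg.norm(u))      # sqrt(107) ~ 10.3441
print(np.linalg.norm(u - v))  # sqrt(190) ~ 13.7840  (the Euclidean distance)
```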


Just as we saw in the section on vectors, if we have $\left\| \mathbf{u} \right\| = 1$ then we will call u a unit vector. So
the vector u from the previous set of examples is not a unit vector.

Now that we’ve gotten both the inner product and the norm taken care of we can give the
following theorem.

Theorem 3 Suppose u and v are two vectors in $\mathbb{R}^n$ and $\theta$ is the angle between them. Then,

$$\mathbf{u} \cdot \mathbf{v} = \left\| \mathbf{u} \right\| \left\| \mathbf{v} \right\| \cos \theta$$

Of course since we are in $\mathbb{R}^n$ it is hard to visualize just what the angle between the two vectors
is, but provided we can find it we can use this theorem. Also note that this was the definition of
the dot product that we gave in the previous section and like that section this theorem is most
useful for actually determining the angle between two vectors.

The proof of this theorem is identical to the proof of Theorem 1 in the previous section and so
isn’t given here.

The next theorem is very important and has many uses in the study of vectors. In fact we’ll need
it in the proof of at least one theorem in these notes. The following theorem is called the
Cauchy-Schwarz Inequality.

Theorem 4 Suppose u and v are two vectors in $\mathbb{R}^n$ then

$$\left| \mathbf{u} \cdot \mathbf{v} \right| \le \left\| \mathbf{u} \right\| \left\| \mathbf{v} \right\|$$

Proof : This proof is surprisingly simple. We'll start with the result of the previous theorem and
take the absolute value of both sides.
$$\left| \mathbf{u} \cdot \mathbf{v} \right| = \left\| \mathbf{u} \right\| \left\| \mathbf{v} \right\| \left| \cos \theta \right|$$
However, we know that $\left| \cos \theta \right| \le 1$ and so we get our result by using this fact.
$$\left| \mathbf{u} \cdot \mathbf{v} \right| = \left\| \mathbf{u} \right\| \left\| \mathbf{v} \right\| \left| \cos \theta \right| \le \left\| \mathbf{u} \right\| \left\| \mathbf{v} \right\| \left( 1 \right) = \left\| \mathbf{u} \right\| \left\| \mathbf{v} \right\|$$

Here are some nice properties of the Euclidean norm.

Theorem 5 Suppose u and v are two vectors in $\mathbb{R}^n$ and that c is a scalar then,
(a) $\left\| \mathbf{u} \right\| \ge 0$
(b) $\left\| \mathbf{u} \right\| = 0$ if and only if $\mathbf{u} = \mathbf{0}$.
(c) $\left\| c\,\mathbf{u} \right\| = \left| c \right| \left\| \mathbf{u} \right\|$
(d) $\left\| \mathbf{u} + \mathbf{v} \right\| \le \left\| \mathbf{u} \right\| + \left\| \mathbf{v} \right\|$ - Usually called the Triangle Inequality

The proofs of the first two parts are a direct consequence of the definition of the Euclidean norm and
so won't be given here.


Proof :
(c) We'll just run through the definition of the norm on this one.

$$\left\| c\,\mathbf{u} \right\| = \sqrt{\left( c u_1 \right)^2 + \left( c u_2 \right)^2 + \cdots + \left( c u_n \right)^2} = \sqrt{c^2 \left( u_1^2 + u_2^2 + \cdots + u_n^2 \right)} = \left| c \right| \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2} = \left| c \right| \left\| \mathbf{u} \right\|$$

(d) The proof of this one isn't too bad once you see the steps you need to take. We'll start with
the following.

$$\left\| \mathbf{u} + \mathbf{v} \right\|^2 = \left( \mathbf{u} + \mathbf{v} \right) \cdot \left( \mathbf{u} + \mathbf{v} \right)$$

So, we're starting with the definition of the norm and squaring both sides to get rid of the square
root on the right side. Next, we'll use the properties of the Euclidean inner product to simplify
this.

$$\left\| \mathbf{u} + \mathbf{v} \right\|^2 = \mathbf{u} \cdot \left( \mathbf{u} + \mathbf{v} \right) + \mathbf{v} \cdot \left( \mathbf{u} + \mathbf{v} \right) = \mathbf{u} \cdot \mathbf{u} + \mathbf{u} \cdot \mathbf{v} + \mathbf{v} \cdot \mathbf{u} + \mathbf{v} \cdot \mathbf{v} = \mathbf{u} \cdot \mathbf{u} + 2 \left( \mathbf{u} \cdot \mathbf{v} \right) + \mathbf{v} \cdot \mathbf{v}$$

Now, notice that we can convert the first and third terms into norms so we'll do that. Also, $\mathbf{u} \cdot \mathbf{v}$
is a number and so we know that if we take the absolute value of this we'll have $\mathbf{u} \cdot \mathbf{v} \le \left| \mathbf{u} \cdot \mathbf{v} \right|$.
Using this and converting the first and third terms to norms gives,

$$\left\| \mathbf{u} + \mathbf{v} \right\|^2 = \left\| \mathbf{u} \right\|^2 + 2 \left( \mathbf{u} \cdot \mathbf{v} \right) + \left\| \mathbf{v} \right\|^2 \le \left\| \mathbf{u} \right\|^2 + 2 \left| \mathbf{u} \cdot \mathbf{v} \right| + \left\| \mathbf{v} \right\|^2$$

We can now use the Cauchy-Schwarz inequality on the second term to get,

$$\left\| \mathbf{u} + \mathbf{v} \right\|^2 \le \left\| \mathbf{u} \right\|^2 + 2 \left\| \mathbf{u} \right\| \left\| \mathbf{v} \right\| + \left\| \mathbf{v} \right\|^2$$

We're almost done. Let's notice that the right side can now be rewritten, giving,

$$\left\| \mathbf{u} + \mathbf{v} \right\|^2 \le \left( \left\| \mathbf{u} \right\| + \left\| \mathbf{v} \right\| \right)^2$$

Finally, take the square root of both sides.

$$\left\| \mathbf{u} + \mathbf{v} \right\| \le \left\| \mathbf{u} \right\| + \left\| \mathbf{v} \right\|$$

Example 2 Given $\mathbf{u} = \left( -2, 3, 1, -1 \right)$ and $\mathbf{v} = \left( 7, 1, -4, -2 \right)$ verify the Cauchy-Schwarz
inequality and the Triangle Inequality.

Solution
Let's first verify the Cauchy-Schwarz inequality. To do this we need the following quantities.

$$\mathbf{u} \cdot \mathbf{v} = -14 + 3 - 4 + 2 = -13$$
$$\left\| \mathbf{u} \right\| = \sqrt{4 + 9 + 1 + 1} = \sqrt{15} \qquad \qquad \left\| \mathbf{v} \right\| = \sqrt{49 + 1 + 16 + 4} = \sqrt{70}$$

Now, verify the Cauchy-Schwarz inequality.

$$\left| \mathbf{u} \cdot \mathbf{v} \right| = \left| -13 \right| = 13 \le 32.4037 = \sqrt{15} \sqrt{70} = \left\| \mathbf{u} \right\| \left\| \mathbf{v} \right\|$$

Sure enough the Cauchy-Schwarz inequality holds true.

To verify the Triangle Inequality all we need is,

$$\mathbf{u} + \mathbf{v} = \left( 5, 4, -3, -3 \right) \qquad \qquad \left\| \mathbf{u} + \mathbf{v} \right\| = \sqrt{25 + 16 + 9 + 9} = \sqrt{59}$$

Now verify the Triangle Inequality.

$$\left\| \mathbf{u} + \mathbf{v} \right\| = \sqrt{59} = 7.6811 \le 12.2396 = \sqrt{15} + \sqrt{70} = \left\| \mathbf{u} \right\| + \left\| \mathbf{v} \right\|$$

So, the Triangle Inequality is also verified for this problem.
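A numerical check of both inequalities, as a NumPy sketch (our own addition):

```python
import numpy as np

u = np.array([-2.0, 3.0, 1.0, -1.0])
v = np.array([7.0, 1.0, -4.0, -2.0])

# Cauchy-Schwarz: |u.v| <= ||u|| ||v||
print(abs(np.dot(u, v)), np.linalg.norm(u) * np.linalg.norm(v))       # 13.0 32.4037...

# Triangle Inequality: ||u + v|| <= ||u|| + ||v||
print(np.linalg.norm(u + v), np.linalg.norm(u) + np.linalg.norm(v))   # 7.6811... 12.2396...
```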

Here are some nice properties pertaining to the Euclidean distance.

Theorem 6 Suppose u, v, and w are vectors in $\mathbb{R}^n$ then,
(a) $d \left( \mathbf{u}, \mathbf{v} \right) \ge 0$
(b) $d \left( \mathbf{u}, \mathbf{v} \right) = 0$ if and only if $\mathbf{u} = \mathbf{v}$.
(c) $d \left( \mathbf{u}, \mathbf{v} \right) = d \left( \mathbf{v}, \mathbf{u} \right)$
(d) $d \left( \mathbf{u}, \mathbf{v} \right) \le d \left( \mathbf{u}, \mathbf{w} \right) + d \left( \mathbf{w}, \mathbf{v} \right)$ - Usually called the Triangle Inequality

The proof of the first two parts is a direct consequence of the previous theorem and the proof of
the third part is a direct consequence of the definition of distance and won't be proven here.

Proof (d) : Let's start off with the definition of distance.
$$d \left( \mathbf{u}, \mathbf{v} \right) = \left\| \mathbf{u} - \mathbf{v} \right\|$$
Now, add in and subtract out w as follows,
$$d \left( \mathbf{u}, \mathbf{v} \right) = \left\| \mathbf{u} - \mathbf{w} + \mathbf{w} - \mathbf{v} \right\| = \left\| \left( \mathbf{u} - \mathbf{w} \right) + \left( \mathbf{w} - \mathbf{v} \right) \right\|$$
Next use the Triangle Inequality for norms on this.
$$d \left( \mathbf{u}, \mathbf{v} \right) \le \left\| \mathbf{u} - \mathbf{w} \right\| + \left\| \mathbf{w} - \mathbf{v} \right\|$$
Finally, just reuse the definition of distance again.
$$d \left( \mathbf{u}, \mathbf{v} \right) \le d \left( \mathbf{u}, \mathbf{w} \right) + d \left( \mathbf{w}, \mathbf{v} \right)$$

We have one final topic that needs to be generalized into Euclidean n-space.

Definition 6 Suppose u and v are two vectors in $\mathbb{R}^n$. We say that u and v are orthogonal if
$\mathbf{u} \cdot \mathbf{v} = 0$.

So, this definition of orthogonality is identical to the definition that we saw when we were
dealing with $\mathbb{R}^2$ and $\mathbb{R}^3$.


Here is the Pythagorean Theorem in $\mathbb{R}^n$.

Theorem 7 Suppose u and v are two orthogonal vectors in $\mathbb{R}^n$ then,

$$\left\| \mathbf{u} + \mathbf{v} \right\|^2 = \left\| \mathbf{u} \right\|^2 + \left\| \mathbf{v} \right\|^2$$

Proof : The proof of this theorem is fairly simple. From the proof of the triangle inequality for
norms we have the following statement.

$$\left\| \mathbf{u} + \mathbf{v} \right\|^2 = \left\| \mathbf{u} \right\|^2 + 2 \left( \mathbf{u} \cdot \mathbf{v} \right) + \left\| \mathbf{v} \right\|^2$$

However, because u and v are orthogonal we have $\mathbf{u} \cdot \mathbf{v} = 0$ and so we get,

$$\left\| \mathbf{u} + \mathbf{v} \right\|^2 = \left\| \mathbf{u} \right\|^2 + \left\| \mathbf{v} \right\|^2$$

Example 3 Show that $\mathbf{u} = \left( 3, 0, 1, 0, 4, -1 \right)$ and $\mathbf{v} = \left( -2, 5, 0, 2, -3, -18 \right)$ are orthogonal and
verify that the Pythagorean Theorem holds.

Solution
Showing that these two vectors are orthogonal is easy enough.

$$\mathbf{u} \cdot \mathbf{v} = \left( 3 \right)\left( -2 \right) + \left( 0 \right)\left( 5 \right) + \left( 1 \right)\left( 0 \right) + \left( 0 \right)\left( 2 \right) + \left( 4 \right)\left( -3 \right) + \left( -1 \right)\left( -18 \right) = 0$$

So, the Pythagorean Theorem should hold, but let's verify that. Here's the sum

$$\mathbf{u} + \mathbf{v} = \left( 1, 5, 1, 2, 1, -19 \right)$$

and here are the squares of the norms.

$$\left\| \mathbf{u} + \mathbf{v} \right\|^2 = 1^2 + 5^2 + 1^2 + 2^2 + 1^2 + \left( -19 \right)^2 = 393$$
$$\left\| \mathbf{u} \right\|^2 = 3^2 + 0^2 + 1^2 + 0^2 + 4^2 + \left( -1 \right)^2 = 27$$
$$\left\| \mathbf{v} \right\|^2 = \left( -2 \right)^2 + 5^2 + 0^2 + 2^2 + \left( -3 \right)^2 + \left( -18 \right)^2 = 366$$

A quick computation then confirms that $\left\| \mathbf{u} + \mathbf{v} \right\|^2 = \left\| \mathbf{u} \right\|^2 + \left\| \mathbf{v} \right\|^2$.

We’ve got one more theorem that gives a relationship between the Euclidean inner product and
the norm. This may seem like a silly theorem, but we’ll actually need this theorem towards the
end of the next chapter.

Theorem 8 If u and v are two vectors in $\mathbb{R}^n$ then,

$$\mathbf{u} \cdot \mathbf{v} = \frac{1}{4} \left\| \mathbf{u} + \mathbf{v} \right\|^2 - \frac{1}{4} \left\| \mathbf{u} - \mathbf{v} \right\|^2$$

Proof : The proof here is surprisingly simple. First, start with,

$$\left\| \mathbf{u} + \mathbf{v} \right\|^2 = \left\| \mathbf{u} \right\|^2 + 2 \left( \mathbf{u} \cdot \mathbf{v} \right) + \left\| \mathbf{v} \right\|^2$$
$$\left\| \mathbf{u} - \mathbf{v} \right\|^2 = \left\| \mathbf{u} \right\|^2 - 2 \left( \mathbf{u} \cdot \mathbf{v} \right) + \left\| \mathbf{v} \right\|^2$$

The first of these we've seen a couple of times already and the second is derived in the same
manner that the first was and so you should verify that formula.

Now subtract the second from the first to get,

$$4 \left( \mathbf{u} \cdot \mathbf{v} \right) = \left\| \mathbf{u} + \mathbf{v} \right\|^2 - \left\| \mathbf{u} - \mathbf{v} \right\|^2$$

Finally, divide by 4 and we get the result we were after.

$$\mathbf{u} \cdot \mathbf{v} = \frac{1}{4} \left\| \mathbf{u} + \mathbf{v} \right\|^2 - \frac{1}{4} \left\| \mathbf{u} - \mathbf{v} \right\|^2$$
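Since this identity will matter later, here is a one-off numerical check with NumPy (our own addition); any two vectors of the same length will do:

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(6)
v = rng.standard_normal(6)

# The identity u.v = (1/4)||u+v||^2 - (1/4)||u-v||^2 from Theorem 8.
lhs = np.dot(u, v)
rhs = 0.25 * np.linalg.norm(u + v)**2 - 0.25 * np.linalg.norm(u - v)**2
print(np.isclose(lhs, rhs))   # True
```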

In the previous section we saw the three standard basis vectors for $\mathbb{R}^3$: i, j, and k. This idea can
also be extended out to $\mathbb{R}^n$. In $\mathbb{R}^n$ we will define the standard basis vectors or standard unit
vectors to be,

$$\mathbf{e}_1 = \left( 1, 0, 0, \ldots, 0 \right) \qquad \mathbf{e}_2 = \left( 0, 1, 0, \ldots, 0 \right) \qquad \cdots \qquad \mathbf{e}_n = \left( 0, 0, 0, \ldots, 1 \right)$$

and just as we saw in that section we can write any vector $\mathbf{u} = \left( u_1, u_2, \ldots, u_n \right)$ in terms of these
standard basis vectors as follows,

$$\mathbf{u} = u_1 \left( 1, 0, 0, \ldots, 0 \right) + u_2 \left( 0, 1, 0, \ldots, 0 \right) + \cdots + u_n \left( 0, 0, 0, \ldots, 1 \right) = u_1 \mathbf{e}_1 + u_2 \mathbf{e}_2 + \cdots + u_n \mathbf{e}_n$$

Note that in $\mathbb{R}^3$ we have $\mathbf{e}_1 = \mathbf{i}$, $\mathbf{e}_2 = \mathbf{j}$ and $\mathbf{e}_3 = \mathbf{k}$.

Now that we've gotten the general vector in Euclidean n-space taken care of we need to go back
and remember some of the work that we did in the first chapter. It is often convenient to write the
vector $\mathbf{u} = \left( u_1, u_2, \ldots, u_n \right)$ as either a row matrix or a column matrix as follows,

$$\mathbf{u} = \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix} \qquad \qquad \mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$$

In this notation we can use matrix addition and scalar multiplication for matrices to show that
we'll get the same results as if we'd done vector addition and scalar multiplication for vectors on
the original vectors.

So, why do we do this? Well, let's use the column matrix notation for the two vectors u and v.

$$\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} \qquad \qquad \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$$

Now compute the following matrix product.


$$\mathbf{v}^T \mathbf{u} = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = \left[ u_1 v_1 + u_2 v_2 + \cdots + u_n v_n \right] = \left[ \mathbf{u} \cdot \mathbf{v} \right] = \mathbf{u} \cdot \mathbf{v}$$

So, the Euclidean inner product can be thought of as a matrix multiplication using,

$$\mathbf{u} \cdot \mathbf{v} = \mathbf{v}^T \mathbf{u}$$

provided we consider u and v as column vectors.

The natural question then is just why this is important. Well let's consider the following scenario.
Suppose that u and v are two vectors in $\mathbb{R}^n$ and that A is an $n \times n$ matrix. Now consider the
following inner product and write it as a matrix multiplication.

$$\left( A\mathbf{u} \right) \cdot \mathbf{v} = \mathbf{v}^T \left( A\mathbf{u} \right)$$

Now, rearrange the order of the multiplication and recall one of the properties of transposes.

$$\left( A\mathbf{u} \right) \cdot \mathbf{v} = \left( \mathbf{v}^T A \right) \mathbf{u} = \left( A^T \mathbf{v} \right)^T \mathbf{u}$$

Don't forget that we switch the order on the matrices when we move the transpose out of the
parenthesis. Finally, this last matrix product can be rewritten as an inner product.

$$\left( A\mathbf{u} \right) \cdot \mathbf{v} = \mathbf{u} \cdot \left( A^T \mathbf{v} \right)$$

This tells us that if we've got an inner product and the first vector (or column matrix) is
multiplied by a matrix then we can move that matrix to the second vector (or column matrix) if
we simply take its transpose.

A similar argument can also show that,

$$\mathbf{u} \cdot \left( A\mathbf{v} \right) = \left( A^T \mathbf{u} \right) \cdot \mathbf{v}$$
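Both identities are easy to confirm numerically. Here is a NumPy sketch (our own addition) using a random matrix and random vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
u = rng.standard_normal(n)
v = rng.standard_normal(n)

# (Au).v = u.(A^T v) and u.(Av) = (A^T u).v
print(np.isclose(np.dot(A @ u, v), np.dot(u, A.T @ v)))  # True
print(np.isclose(np.dot(u, A @ v), np.dot(A.T @ u, v)))  # True
```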


Linear Transformations
In this section we’re going to take a look at a special kind of function that arises very naturally in
the study of Linear Algebra and has many applications in fields outside of mathematics such as
physics and engineering. This section is devoted mostly to the basic definitions and facts
associated with this special kind of function. We will be looking at a couple of examples, but
we’ll reserve most of the examples for the next section.

Now, the first thing that we need to do is take a step back and make sure that we’re all familiar
with some of the basics of functions in general. A function, f, is a rule (usually defined by an
equation) that takes each element of the set A (called the domain) and associates it with exactly
one element of a set B (called the codomain). The notation that we’ll be using to denote our
function is

$$f : A \to B$$

When we see this notation we know that we’re going to be dealing with a function that takes
elements from the set A and associates them with elements from the set B. Note as well that it is
completely possible that not every element of the set B will be associated with an element from A.
The subset of all elements from B that are associated with elements from A is called the range.

In this section we're going to be looking at functions of the form,

$$f : \mathbb{R}^n \to \mathbb{R}^m$$

In other words, we're going to be looking at functions that take elements/points/vectors from $\mathbb{R}^n$
and associate them with elements/points/vectors from $\mathbb{R}^m$. These kinds of functions are called
transformations and we say that f maps $\mathbb{R}^n$ into $\mathbb{R}^m$. On an element basis we will also say that
f maps the element u from $\mathbb{R}^n$ to the element v from $\mathbb{R}^m$.

So, just what do transformations look like? Consider the following scenario. Suppose that we
have m functions of the following form,

$$\begin{aligned} w_1 &= f_1 \left( x_1, x_2, \ldots, x_n \right) \\ w_2 &= f_2 \left( x_1, x_2, \ldots, x_n \right) \\ &\ \ \vdots \\ w_m &= f_m \left( x_1, x_2, \ldots, x_n \right) \end{aligned}$$

Each of these functions takes a point in $\mathbb{R}^n$, namely $\left( x_1, x_2, \ldots, x_n \right)$, and maps it to the number
$w_i$. We can now define a transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ as follows,

$$T \left( x_1, x_2, \ldots, x_n \right) = \left( w_1, w_2, \ldots, w_m \right)$$

In this way we associate with each point $\left( x_1, x_2, \ldots, x_n \right)$ from $\mathbb{R}^n$ a point $\left( w_1, w_2, \ldots, w_m \right)$ from
$\mathbb{R}^m$ and we have a transformation.

Let’s take a look at a couple of transformations.


Example 1 Given

$$w_1 = 3x_1 - 4x_2 \qquad w_2 = x_1 + 2x_2 \qquad w_3 = 6x_1 - x_2 \qquad w_4 = 10x_2$$

define $T : \mathbb{R}^2 \to \mathbb{R}^4$ as,

$$T \left( x_1, x_2 \right) = \left( w_1, w_2, w_3, w_4 \right) \qquad \text{OR} \qquad T \left( x_1, x_2 \right) = \left( 3x_1 - 4x_2, x_1 + 2x_2, 6x_1 - x_2, 10x_2 \right)$$

Note that the second form is more convenient since we don't actually have to define any of the
w's in that way and is how we will define most of our transformations.

We evaluate this just as we evaluate the functions that we're used to working with. Namely, pick
a point from $\mathbb{R}^2$ and plug into the transformation and we'll get a point out of the function that is
in $\mathbb{R}^4$. For example,

$$T \left( -5, 2 \right) = \left( -23, -1, -32, 20 \right)$$

Example 2 Define $T : \mathbb{R}^3 \to \mathbb{R}^2$ as $T \left( x_1, x_2, x_3 \right) = \left( 4x_2^2 + x_3^2, x_1^2 - x_2 x_3 \right)$. A sample
evaluation of this transformation is,

$$T \left( 3, -1, 6 \right) = \left( 40, 15 \right)$$

Now, in this section we’re going to be looking at a special kind of transformation called a linear
transformation. Here is the definition of a linear transformation.

Definition 1 A function $T : \mathbb{R}^n \to \mathbb{R}^m$ is called a linear transformation if for all u and v in
$\mathbb{R}^n$ and all scalars c we have,

$$T \left( \mathbf{u} + \mathbf{v} \right) = T \left( \mathbf{u} \right) + T \left( \mathbf{v} \right) \qquad \qquad T \left( c\,\mathbf{u} \right) = c\,T \left( \mathbf{u} \right)$$

We looked at two transformations above and only one of them is linear. Let’s take a look at each
one and see what we’ve got.

Example 3 Determine if the transformation from Example 2 is linear or not.

Solution
Okay, if this is going to be linear then it must satisfy both of the conditions from the definition.
In other words, both of the following will need to be true.

$$T \left( \mathbf{u} + \mathbf{v} \right) = T \left( u_1 + v_1, u_2 + v_2, u_3 + v_3 \right) = T \left( u_1, u_2, u_3 \right) + T \left( v_1, v_2, v_3 \right) = T \left( \mathbf{u} \right) + T \left( \mathbf{v} \right)$$
$$T \left( c\,\mathbf{u} \right) = T \left( c u_1, c u_2, c u_3 \right) = c\,T \left( u_1, u_2, u_3 \right) = c\,T \left( \mathbf{u} \right)$$

In this case let's take a look at the second condition.

$$T \left( c\,\mathbf{u} \right) = T \left( c u_1, c u_2, c u_3 \right) = \left( 4 c^2 u_2^2 + c^2 u_3^2, c^2 u_1^2 - c^2 u_2 u_3 \right) = c^2 \left( 4 u_2^2 + u_3^2, u_1^2 - u_2 u_3 \right) = c^2\, T \left( \mathbf{u} \right) \ne c\,T \left( \mathbf{u} \right)$$

The second condition is not satisfied and so this is not a linear transformation. You might want to
verify that in this case the first is also not satisfied. It's not too bad, but the work does get a little
messy.

Example 4 Determine if the transformation in Example 1 is linear or not.

Solution
To do this one we're going to need to rewrite things just a little. The transformation is defined as
$T \left( x_1, x_2 \right) = \left( w_1, w_2, w_3, w_4 \right)$ where,

$$\begin{aligned} w_1 &= 3x_1 - 4x_2 \\ w_2 &= x_1 + 2x_2 \\ w_3 &= 6x_1 - x_2 \\ w_4 &= 10x_2 \end{aligned}$$

Now, each of the components is given by a system of linear (hhmm, makes one instantly wonder
if the transformation is also linear...) equations and we saw in the first chapter that we can always
write a system of linear equations in matrix form. Let's do that for this system.

$$\begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{bmatrix} = \begin{bmatrix} 3 & -4 \\ 1 & 2 \\ 6 & -1 \\ 0 & 10 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \qquad \Rightarrow \qquad \mathbf{w} = A\mathbf{x}$$

Now, notice that if we plug in any column matrix x and do the matrix multiplication we'll get a
new column matrix out, w. Let's pick a column matrix x totally at random and see what we get.

$$\begin{bmatrix} -23 \\ -1 \\ -32 \\ 20 \end{bmatrix} = \begin{bmatrix} 3 & -4 \\ 1 & 2 \\ 6 & -1 \\ 0 & 10 \end{bmatrix} \begin{bmatrix} -5 \\ 2 \end{bmatrix}$$

Of course, we didn't pick x completely at random. Notice that the x we chose was the column
matrix representation of the point from $\mathbb{R}^2$ that we used in Example 1 to show a sample
evaluation of the transformation. Just as importantly notice that the result, w, is the matrix
representation of the point from $\mathbb{R}^4$ that we got out of the evaluation.

In fact, this will always be the case for this transformation. So, in some way the evaluation
$T \left( \mathbf{x} \right)$ is the same as the matrix multiplication $A\mathbf{x}$ and so we can write the transformation as

$$T \left( \mathbf{x} \right) = A\mathbf{x}$$

Notice that we're kind of mixing and matching notation here. On the left x represents a point in
$\mathbb{R}^2$ and on the right it is a $2 \times 1$ matrix. However, this really isn't a problem since they both can
be used to represent a point in $\mathbb{R}^2$. We will have to get used to this notation however as we'll be
using it quite regularly.

Okay, just what were we after here? We wanted to determine if this transformation is linear or
not. With this new way of writing the transformation this is actually really simple. We'll just
make use of some very nice facts that we know about matrix multiplication. Here is the work for
this problem.

$$T \left( \mathbf{u} + \mathbf{v} \right) = A \left( \mathbf{u} + \mathbf{v} \right) = A\mathbf{u} + A\mathbf{v} = T \left( \mathbf{u} \right) + T \left( \mathbf{v} \right)$$
$$T \left( c\,\mathbf{u} \right) = A \left( c\,\mathbf{u} \right) = c\,A\mathbf{u} = c\,T \left( \mathbf{u} \right)$$

So, both conditions of the definition are met and so this transformation is a linear transformation.
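To see this concretely, here is a NumPy sketch (our own addition) of the transformation from Example 1 written as a matrix multiplication:

```python
import numpy as np

# Induced matrix for T(x1, x2) = (3x1 - 4x2, x1 + 2x2, 6x1 - x2, 10x2).
A = np.array([[3.0, -4.0],
              [1.0,  2.0],
              [6.0, -1.0],
              [0.0, 10.0]])

def T(x):
    return A @ x

print(T(np.array([-5.0, 2.0])))   # [-23.  -1. -32.  20.]

# Linearity falls straight out of the matrix arithmetic:
u, v, c = np.array([1.0, 2.0]), np.array([-3.0, 0.5]), 7.0
print(np.allclose(T(u + v), T(u) + T(v)), np.allclose(T(c * u), c * T(u)))  # True True
```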

There are a couple of things to note here. First, we couldn't write the transformation from
Example 2 as a matrix multiplication because at least one of the equations (okay, both in this case)
for the components of the result was not linear.

Second, when all the equations that give the components of the result are linear then the
transformation will be linear. If at least one of the equations is not linear then the
transformation will not be linear either.

Now, we need to investigate the idea that we used in the previous example in more detail. There
are two issues that we want to take a look at.

First, we saw that, at least in some cases, matrix multiplication can be thought of as a linear
transformation. As the following theorem shows, this is in fact always the case.

Theorem 1 If A is an $m \times n$ matrix then its induced transformation, $T_A : \mathbb{R}^n \to \mathbb{R}^m$, defined
as,

$$T_A \left( \mathbf{x} \right) = A\mathbf{x}$$

is a linear transformation.

Proof : The proof here is really simple and in fact we pretty much saw it in the last example.

$$T_A \left( \mathbf{u} + \mathbf{v} \right) = A \left( \mathbf{u} + \mathbf{v} \right) = A\mathbf{u} + A\mathbf{v} = T_A \left( \mathbf{u} \right) + T_A \left( \mathbf{v} \right)$$
$$T_A \left( c\,\mathbf{u} \right) = A \left( c\,\mathbf{u} \right) = c\,A\mathbf{u} = c\,T_A \left( \mathbf{u} \right)$$

So, the induced function, $T_A$, satisfies both the conditions in the definition of a linear
transformation and so it is a linear transformation.

So, any time we do matrix multiplication we can also think of the operation as evaluating a linear
transformation.


The other thing that we saw in Example 4 is that we were able, in that case, to write a linear
transformation as a matrix multiplication. Again, it turns out that every linear transformation can
be written as a matrix multiplication.

Theorem 2 Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation, then there is an $m \times n$ matrix A such
that $T = T_A$ (recall that $T_A$ is the transformation induced by A).

The matrix A is called the matrix induced by T and is sometimes denoted as $A = \left[ T \right]$.

Proof : First let,

$$\mathbf{e}_1 = \left( 1, 0, 0, \ldots, 0 \right) \qquad \mathbf{e}_2 = \left( 0, 1, 0, \ldots, 0 \right) \qquad \cdots \qquad \mathbf{e}_n = \left( 0, 0, 0, \ldots, 1 \right)$$

be the standard basis vectors for $\mathbb{R}^n$ and define A to be the $m \times n$ matrix whose ith column is
$T \left( \mathbf{e}_i \right)$. In other words, A is given by,

$$A = \begin{bmatrix} T \left( \mathbf{e}_1 \right) & T \left( \mathbf{e}_2 \right) & \cdots & T \left( \mathbf{e}_n \right) \end{bmatrix}$$

Next let x be any vector from $\mathbb{R}^n$. We know that we can write x in terms of the standard basis
vectors as follows,

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = x_1 \mathbf{e}_1 + x_2 \mathbf{e}_2 + \cdots + x_n \mathbf{e}_n$$

In order to prove this theorem we're going to need to show that for any x (which we've got a nice
general one above) we will have $T \left( \mathbf{x} \right) = T_A \left( \mathbf{x} \right)$. So, let's start off and plug x into T using the
general form as written out above.

$$T \left( \mathbf{x} \right) = T \left( x_1 \mathbf{e}_1 + x_2 \mathbf{e}_2 + \cdots + x_n \mathbf{e}_n \right)$$

Now, we know that T is a linear transformation and so we can break this up at each of the "+"'s
as follows,

$$T \left( \mathbf{x} \right) = T \left( x_1 \mathbf{e}_1 \right) + T \left( x_2 \mathbf{e}_2 \right) + \cdots + T \left( x_n \mathbf{e}_n \right)$$

Next, each of the $x_i$'s are scalars and again because T is a linear transformation we can write this
as,

$$T \left( \mathbf{x} \right) = x_1 T \left( \mathbf{e}_1 \right) + x_2 T \left( \mathbf{e}_2 \right) + \cdots + x_n T \left( \mathbf{e}_n \right)$$

Next, let's notice that this is nothing more than the following matrix multiplication.

$$T \left( \mathbf{x} \right) = \begin{bmatrix} T \left( \mathbf{e}_1 \right) & T \left( \mathbf{e}_2 \right) & \cdots & T \left( \mathbf{e}_n \right) \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$

But the first matrix is nothing more than A and the second is just x, so when we define A as we
did above we will get,

$$T \left( \mathbf{x} \right) = A\mathbf{x} = T_A \left( \mathbf{x} \right)$$

and so we've proven what we needed to.

In this proof we used the standard basis vectors to define the matrix A. As we will see in a later
chapter there are other choices of vectors that we could use here and these will produce a
different induced matrix, A, and we do need to remember that. However, when we use the
standard basis vectors to define A, as we’re going to in this chapter, then we don’t actually need
to evaluate T at each of the basis vectors as we did in the proof. All we need to do is what we did
in Example 4, write down the coefficient matrix for the system of equations that we get by
writing out each of the components as individual equations.
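The construction in the proof is also easy to carry out numerically. Here is a NumPy sketch (our own addition; the helper names are ours) that builds the induced matrix of a linear T column by column from the standard basis vectors and checks it against a direct evaluation:

```python
import numpy as np

def T(x):
    # The linear transformation from Examples 1 and 4.
    x1, x2 = x
    return np.array([3*x1 - 4*x2, x1 + 2*x2, 6*x1 - x2, 10*x2])

# A's ith column is T(e_i), exactly as in the proof of Theorem 2.
# The rows of np.eye(n) are the standard basis vectors e_1, ..., e_n.
n = 2
A = np.column_stack([T(e) for e in np.eye(n)])
print(A)                          # [[ 3. -4.] [ 1.  2.] [ 6. -1.] [ 0. 10.]]

x = np.array([-5.0, 2.0])
print(np.allclose(T(x), A @ x))   # True
```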

Okay, we’ve done a lot of work in this section and we haven’t really done any examples so we
should probably do a couple of them. Note that we are saving most of the examples for the next
section, so don’t expect a lot here. We’re just going to do a couple so we can say we’ve done a
couple.

Example 5 The zero transformation is the transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ that maps every
vector x in $\mathbb{R}^n$ to the zero vector in $\mathbb{R}^m$, i.e. $T \left( \mathbf{x} \right) = \mathbf{0}$. The matrix induced by this
transformation is the $m \times n$ zero matrix, 0, since,

$$T \left( \mathbf{x} \right) = T_0 \left( \mathbf{x} \right) = 0\mathbf{x} = \mathbf{0}$$

To make it clear we're using the zero transformation we usually denote it by $T_0 \left( \mathbf{x} \right)$.

Example 6 The identity transformation is the transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ (yes, they are both
$\mathbb{R}^n$) that maps every x to itself, i.e. $T \left( \mathbf{x} \right) = \mathbf{x}$. The matrix induced by this transformation is the
$n \times n$ identity matrix, $I_n$, since,

$$T \left( \mathbf{x} \right) = T_I \left( \mathbf{x} \right) = I_n \mathbf{x} = \mathbf{x}$$

We'll usually denote the identity transformation as $T_I \left( \mathbf{x} \right)$ to make it clear we're working with it.

So, the two examples above are standard examples and we did need them taken care of.
However, they aren't really very illustrative for seeing how to construct the matrix induced by the
transformation. To see how this is done, let's take a look at some reflections in $\mathbb{R}^2$. We'll look
at reflections in $\mathbb{R}^3$ in the next section.


Example 7 Determine the matrix induced by the following reflections.
(a) Reflection about the x-axis. [Solution]
(b) Reflection about the y-axis. [Solution]
(c) Reflection about the line $y = x$. [Solution]

Solution
Note that all of these will be linear transformations of the form $T : \mathbb{R}^2 \to \mathbb{R}^2$.

(a) Reflection about the x-axis.

Let's start off with a sketch of what we're looking for here.

So, from this sketch we can see that the equations for the transformation (i.e. the equations
that will map x into w) are,

$$w_1 = x \qquad \qquad w_2 = -y$$

Remember that $w_1$ will be the first component of the transformed point and $w_2$ will be the
second component of the transformed point.

Now, just as we did in Example 4 we can write down the matrix form of this system.

$$\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

So, it looks like the matrix induced by this reflection is,

$$\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$$

(b) Reflection about the y-axis.

We'll do this one a little quicker. Here's a sketch and the equations for this reflection.

$$w_1 = -x \qquad \qquad w_2 = y$$

The matrix induced by this reflection is,

$$\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$$

(c) Reflection about the line $y = x$.

Here's the sketch and equations for this reflection.

$$w_1 = y \qquad \qquad w_2 = x$$

The matrix induced by this reflection is,

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

Hopefully, from these examples you’re starting to get a feel for how we arrive at the induced
matrix for a linear transformation. We’ll be seeing more of these in the next section, but for now
we need to move on to some more ideas about linear transformations.

Let's suppose that we have two linear transformations induced by the matrices A and B,
$T_A : \mathbb{R}^n \to \mathbb{R}^k$ and $T_B : \mathbb{R}^k \to \mathbb{R}^m$. If we take any x out of $\mathbb{R}^n$, $T_A$ will map x into $\mathbb{R}^k$. In other
words, $T_A \left( \mathbf{x} \right)$ will be in $\mathbb{R}^k$ and notice that we can then apply $T_B$ to this and its image will be in
$\mathbb{R}^m$. In summary, if we take x out of $\mathbb{R}^n$ and first apply $T_A$ to x and then apply $T_B$ to the result
we will have a transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$.

This process is called composition of transformations and is denoted as

$$\left( T_B \circ T_A \right) \left( \mathbf{x} \right) = T_B \left( T_A \left( \mathbf{x} \right) \right)$$

Note that the order here is important. The first transformation to be applied is on the right and the
second is on the left.

Now, because both of our original transformations were linear we can do the following,

$$\left( T_B \circ T_A \right) \left( \mathbf{x} \right) = T_B \left( T_A \left( \mathbf{x} \right) \right) = T_B \left( A\mathbf{x} \right) = \left( BA \right) \mathbf{x}$$

and so the composition $T_B \circ T_A$ is the same as multiplication by BA. This means that the
composition will be a linear transformation provided the two original transformations were also
linear.

Note as well that we can do composition with as many transformations as we want provided all
the spaces correctly match up. For instance with three transformations we require the following
three transformations,

$$T_A : \mathbb{R}^n \to \mathbb{R}^k \qquad \qquad T_B : \mathbb{R}^k \to \mathbb{R}^p \qquad \qquad T_C : \mathbb{R}^p \to \mathbb{R}^m$$

and in this case the composition would be,

$$\left( T_C \circ T_B \circ T_A \right) \left( \mathbf{x} \right) = T_C \left( T_B \left( T_A \left( \mathbf{x} \right) \right) \right) = \left( CBA \right) \mathbf{x}$$

Let's take a look at a couple of examples.
Let’s take a look at a couple of examples.

Example 8 Determine the matrix induced by the composition of reflection about the y-axis
followed by reflection about the x-axis.

Solution
First, notice that reflection about the y-axis should change the sign on the x coordinate and
following this by a reflection about the x-axis should change the sign on the y coordinate.

The two transformations here are,

$$T_A : \mathbb{R}^2 \to \mathbb{R}^2 \qquad A = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \qquad \text{reflection about } y\text{-axis}$$
$$T_B : \mathbb{R}^2 \to \mathbb{R}^2 \qquad B = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \qquad \text{reflection about } x\text{-axis}$$

The matrix induced by the composition is then,

$$T_B \circ T_A : \qquad BA = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$$

Let's take a quick look at what this does to a point. Given x in $\mathbb{R}^2$ we have,

$$\left( T_B \circ T_A \right) \left( \mathbf{x} \right) = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -x \\ -y \end{bmatrix}$$

This is what we expected to get. This is often called reflection about the origin.

Example 9 Determine the matrix induced by the composition of reflection about the y-axis
followed by another reflection about the y-axis.

Solution
In this case if we reflect about the y-axis twice we should end up right back where we started.

The two transformations in this case are,

$$T_A : \mathbb{R}^2 \to \mathbb{R}^2 \qquad A = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \qquad \text{reflection about } y\text{-axis}$$
$$T_B : \mathbb{R}^2 \to \mathbb{R}^2 \qquad B = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \qquad \text{reflection about } y\text{-axis}$$

The induced matrix is,

$$T_B \circ T_A : \qquad BA = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2$$

So, the composition of these two transformations yields the identity transformation. So,

$$\left( T_B \circ T_A \right) \left( \mathbf{x} \right) = T_I \left( \mathbf{x} \right) = \mathbf{x}$$

and the composition will not change the original x as we guessed.
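Both compositions are a one-line check with NumPy (our own addition):

```python
import numpy as np

A = np.array([[-1.0, 0.0], [0.0, 1.0]])   # reflection about the y-axis
B = np.array([[1.0, 0.0], [0.0, -1.0]])   # reflection about the x-axis

print(B @ A)   # [[-1. 0.] [ 0. -1.]] -- reflection about the origin (Example 8)
print(A @ A)   # [[ 1. 0.] [ 0.  1.]] -- the identity matrix (Example 9)
```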


Examples of Linear Transformations


This section is going to be mostly devoted to giving the induced matrices for a variety of standard
linear transformations. We will be working exclusively with linear transformations of the form
$T : \mathbb{R}^2 \to \mathbb{R}^2$ and $T : \mathbb{R}^3 \to \mathbb{R}^3$ and for the most part we'll be providing equations and sketches
of the transformations in $\mathbb{R}^2$ but we'll just be providing equations for the $\mathbb{R}^3$ cases.

Let’s start this section out with two of the transformations we looked at in the previous section
just so we can say we’ve got all the main examples here in one section.

Zero Transformation
In this case every vector x is mapped to the zero vector and so the transformation is,
$$T \left( \mathbf{x} \right) = T_0 \left( \mathbf{x} \right)$$
and the induced matrix is the zero matrix, 0.

Identity Transformation
The identity transformation will map every vector x to itself. The transformation is,
$$T \left( \mathbf{x} \right) = T_I \left( \mathbf{x} \right)$$
and so the induced matrix is the identity matrix.
Reflections
We saw a variety of reflections in $\mathbb{R}^2$ in the previous section so we'll give those again here
along with some reflections in $\mathbb{R}^3$ so we can say that we've got them all in one place.

Reflection about the x-axis in $\mathbb{R}^2$ : $w_1 = x$, $w_2 = -y$ ; induced matrix $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$

Reflection about the y-axis in $\mathbb{R}^2$ : $w_1 = -x$, $w_2 = y$ ; induced matrix $\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$

Reflection about the line $y = x$ in $\mathbb{R}^2$ : $w_1 = y$, $w_2 = x$ ; induced matrix $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$

Reflection about the origin in $\mathbb{R}^2$ : $w_1 = -x$, $w_2 = -y$ ; induced matrix $\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$

Reflection about the xy-plane in $\mathbb{R}^3$ : $w_1 = x$, $w_2 = y$, $w_3 = -z$ ; induced matrix $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}$

Reflection about the yz-plane in $\mathbb{R}^3$ : $w_1 = -x$, $w_2 = y$, $w_3 = z$ ; induced matrix $\begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Reflection about the xz-plane in $\mathbb{R}^3$ : $w_1 = x$, $w_2 = -y$, $w_3 = z$ ; induced matrix $\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$


Note that in $\mathbb{R}^3$ when we say we're reflecting about a given plane, say the xy-plane, all we're
doing is moving from above the plane to below the plane (or vice versa of course) and this means
simply changing the sign of the remaining variable, z in the case of the xy-plane.

Orthogonal Projections
We first saw orthogonal projections in the section on the dot product. In that section we looked at
projections only in $\mathbb{R}^2$, but as we'll see eventually they can be done in any setting. Here we
are going to look at some special orthogonal projections.

Let's start with the orthogonal projections in $\mathbb{R}^2$. There are two of them that we want to look at.
Here is a quick sketch of both of these.

So, we project x onto the x-axis or y-axis depending upon which we're after. Of course we also
have a variety of projections in $\mathbb{R}^3$ as well. We could project onto one of the three axes or we
could project onto one of the three coordinate planes.

Here are the orthogonal projections we're going to look at in this section, their equations and their
induced matrices.

Projection on the x-axis in $\mathbb{R}^2$ : $w_1 = x$, $w_2 = 0$ ; induced matrix $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$

Projection on the y-axis in $\mathbb{R}^2$ : $w_1 = 0$, $w_2 = y$ ; induced matrix $\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$

Projection on the x-axis in $\mathbb{R}^3$ : $w_1 = x$, $w_2 = 0$, $w_3 = 0$ ; induced matrix $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$

Projection on the y-axis in $\mathbb{R}^3$ : $w_1 = 0$, $w_2 = y$, $w_3 = 0$ ; induced matrix $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$

Projection on the z-axis in $\mathbb{R}^3$ : $w_1 = 0$, $w_2 = 0$, $w_3 = z$ ; induced matrix $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Projection on the xy-plane in $\mathbb{R}^3$ : $w_1 = x$, $w_2 = y$, $w_3 = 0$ ; induced matrix $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$

Projection on the yz-plane in $\mathbb{R}^3$ : $w_1 = 0$, $w_2 = y$, $w_3 = z$ ; induced matrix $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Projection on the xz-plane in $\mathbb{R}^3$ : $w_1 = x$, $w_2 = 0$, $w_3 = z$ ; induced matrix $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Contractions & Dilations
These transformations are really just fancy names for scalar multiplication, $\mathbf{w} = c\,\mathbf{x}$, where c is a
nonnegative scalar. The transformation is called a contraction if $0 \le c \le 1$ and a dilation if
$c \ge 1$. The induced matrix is identical for both a contraction and a dilation and so we'll not give
separate equations or induced matrices for both.

Contraction/Dilation in $\mathbb{R}^2$ : $w_1 = cx$, $w_2 = cy$ ; induced matrix $\begin{bmatrix} c & 0 \\ 0 & c \end{bmatrix}$

Contraction/Dilation in $\mathbb{R}^3$ : $w_1 = cx$, $w_2 = cy$, $w_3 = cz$ ; induced matrix $\begin{bmatrix} c & 0 & 0 \\ 0 & c & 0 \\ 0 & 0 & c \end{bmatrix}$

Rotations
We'll start this discussion in $\mathbb{R}^2$. We're going to start with a vector x and we want to rotate that
vector through an angle $\theta$ in the counter-clockwise manner as shown below.

Unlike the previous transformations where we could just write down the equations, we'll need to
do a little derivation work here. First, from our basic knowledge of trigonometry we know that

$$x = r \cos \alpha \qquad \qquad y = r \sin \alpha$$

and we also know that

$$w_1 = r \cos \left( \alpha + \theta \right) \qquad \qquad w_2 = r \sin \left( \alpha + \theta \right)$$

Now, through a trig formula we can write the equations for w as follows,

$$w_1 = \left( r \cos \alpha \right) \cos \theta - \left( r \sin \alpha \right) \sin \theta$$
$$w_2 = \left( r \cos \alpha \right) \sin \theta + \left( r \sin \alpha \right) \cos \theta$$

Notice that the formulas for x and y both show up in these formulas so substituting in for those
gives,

$$w_1 = x \cos \theta - y \sin \theta \qquad \qquad w_2 = x \sin \theta + y \cos \theta$$

Finally, since $\theta$ is a fixed angle $\cos \theta$ and $\sin \theta$ are fixed constants and so there are our
equations and the induced matrix is,

$$\begin{bmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{bmatrix}$$

In $\mathbb{R}^3$ we also have rotations but the derivations are a little trickier. The three that we'll be
giving here are counter-clockwise rotations about the three positive coordinate axes.

Here is a table giving all the rotation equations and induced matrices.

Counter-clockwise rotation through an angle $\theta$ in $\mathbb{R}^2$ :
$$w_1 = x \cos \theta - y \sin \theta \qquad w_2 = x \sin \theta + y \cos \theta \qquad \qquad \begin{bmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{bmatrix}$$

Counter-clockwise rotation through an angle of $\theta$ about the positive x-axis in $\mathbb{R}^3$ :
$$w_1 = x \qquad w_2 = y \cos \theta - z \sin \theta \qquad w_3 = y \sin \theta + z \cos \theta \qquad \qquad \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos \theta & -\sin \theta \\ 0 & \sin \theta & \cos \theta \end{bmatrix}$$

Counter-clockwise rotation through an angle of $\theta$ about the positive y-axis in $\mathbb{R}^3$ :
$$w_1 = x \cos \theta + z \sin \theta \qquad w_2 = y \qquad w_3 = z \cos \theta - x \sin \theta \qquad \qquad \begin{bmatrix} \cos \theta & 0 & \sin \theta \\ 0 & 1 & 0 \\ -\sin \theta & 0 & \cos \theta \end{bmatrix}$$

Counter-clockwise rotation through an angle of $\theta$ about the positive z-axis in $\mathbb{R}^3$ :
$$w_1 = x \cos \theta - y \sin \theta \qquad w_2 = x \sin \theta + y \cos \theta \qquad w_3 = z \qquad \qquad \begin{bmatrix} \cos \theta & -\sin \theta & 0 \\ \sin \theta & \cos \theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Okay, we’ve given quite a few general formulas here, but we haven’t worked any examples with
numbers in them so let’s do that.


Example 1 Determine the new point after applying the transformation to the given point. Use
the induced matrix associated with each transformation to find the new point.
(a) $\mathbf{x} = \left( 2, -4, 1 \right)$ reflected about the xz-plane.
(b) $\mathbf{x} = \left( 10, 7, -9 \right)$ projected on the x-axis.
(c) $\mathbf{x} = \left( 10, 7, -9 \right)$ projected on the yz-plane.

Solution
So, it would be easier to just do all of these directly rather than using the induced matrix, but at
least this way we can verify that the induced matrix gives the correct value.

(a) Here's the multiplication for this one.

$$\mathbf{w} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ -4 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \\ 1 \end{bmatrix}$$

So, the point $\mathbf{x} = \left( 2, -4, 1 \right)$ maps to $\mathbf{w} = \left( 2, 4, 1 \right)$ under this transformation.

(b) The multiplication for this problem is,

$$\mathbf{w} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 10 \\ 7 \\ -9 \end{bmatrix} = \begin{bmatrix} 10 \\ 0 \\ 0 \end{bmatrix}$$

The projection here is $\mathbf{w} = \left( 10, 0, 0 \right)$.

(c) The multiplication for the final transformation in this set is,

$$\mathbf{w} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 10 \\ 7 \\ -9 \end{bmatrix} = \begin{bmatrix} 0 \\ 7 \\ -9 \end{bmatrix}$$

The projection here is $\mathbf{w} = \left( 0, 7, -9 \right)$.

Let’s take a look at a couple of rotations.

Example 2 Determine the new point after applying the transformation to the given point. Use
the induced matrix associated with each transformation to find the new point.
(a) $\mathbf{x} = \left( 2, -6 \right)$ rotated $30^\circ$ in the counter-clockwise direction.
(b) $\mathbf{x} = \left( 0, 5, 1 \right)$ rotated $90^\circ$ in the counter-clockwise direction about the y-axis.
(c) $\mathbf{x} = \left( -3, 4, -2 \right)$ rotated $25^\circ$ in the counter-clockwise direction about the z-axis.

Solution
There isn't much to these other than plugging into the appropriate induced matrix and then doing
the multiplication.

(a) Here is the work for this rotation.

$$\mathbf{w} = \begin{bmatrix} \cos 30^\circ & -\sin 30^\circ \\ \sin 30^\circ & \cos 30^\circ \end{bmatrix} \begin{bmatrix} 2 \\ -6 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix} \begin{bmatrix} 2 \\ -6 \end{bmatrix} = \begin{bmatrix} \sqrt{3} + 3 \\ 1 - 3\sqrt{3} \end{bmatrix}$$

The new point after this rotation is then, $\mathbf{w} = \left( \sqrt{3} + 3, 1 - 3\sqrt{3} \right)$.

(b) The matrix multiplication for this rotation is,

$$\mathbf{w} = \begin{bmatrix} \cos 90^\circ & 0 & \sin 90^\circ \\ 0 & 1 & 0 \\ -\sin 90^\circ & 0 & \cos 90^\circ \end{bmatrix} \begin{bmatrix} 0 \\ 5 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0 \\ 5 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 5 \\ 0 \end{bmatrix}$$

The point after this rotation becomes $\mathbf{w} = \left( 1, 5, 0 \right)$. Note that we could have predicted this one.
The original point was in the yz-plane (because the x component is zero) and a $90^\circ$ counter-
clockwise rotation about the y-axis would put the new point in the xy-plane with the z component
becoming the x component and that is exactly what we got.

(c) Here's the work for this part and notice that the angle is not one of the "standard" trig angles
and so the answers will be in decimals.

$$\mathbf{w} = \begin{bmatrix} \cos 25^\circ & -\sin 25^\circ & 0 \\ \sin 25^\circ & \cos 25^\circ & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} -3 \\ 4 \\ -2 \end{bmatrix} = \begin{bmatrix} 0.9063 & -0.4226 & 0 \\ 0.4226 & 0.9063 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} -3 \\ 4 \\ -2 \end{bmatrix} = \begin{bmatrix} -4.4093 \\ 2.3574 \\ -2 \end{bmatrix}$$

The new point under this rotation is then $\mathbf{w} = \left( -4.4093, 2.3574, -2 \right)$.
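Rotations are a natural place to let the computer handle the trig. Here is a NumPy sketch (our own addition; the helper name `rot_z` is ours) reproducing part (c):

```python
import numpy as np

def rot_z(theta):
    """Counter-clockwise rotation about the positive z-axis (angle in radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c,  -s,  0.0],
                     [s,   c,  0.0],
                     [0.0, 0.0, 1.0]])

x = np.array([-3.0, 4.0, -2.0])
print(rot_z(np.radians(25)) @ x)   # [-4.4093  2.3574 -2.    ]
```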

Finally, let’s take a look at some compositions of transformations.

Example 3 Determine the new point after applying the transformation to the given point. Use
the induced matrix associated with each transformation to find the new point.
(a) Dilate $\mathbf{x} = \left( 4, -1, -3 \right)$ by 2 (i.e. $2\mathbf{x}$) and then project on the y-axis.
(b) Project $\mathbf{x} = \left( 4, -1, -3 \right)$ on the y-axis and then dilate by 2.
(c) Project $\mathbf{x} = \left( 4, 2 \right)$ on the x-axis and then rotate by $45^\circ$ counter-clockwise.
(d) Rotate $\mathbf{x} = \left( 4, 2 \right)$ $45^\circ$ counter-clockwise and then project on the x-axis.

Solution
Notice that the first two are the same transformations just done in the opposite order and the same is
true for the last two. Do you expect to get the same result from each composition regardless of
the order the transformations are done in?


Recall as well that in compositions we can get the induced matrix by multiplying the induced
matrices from each transformation from right to left in the order they are applied. For
instance the induced matrix for the composition $T_B \circ T_A$ is $BA$ where $T_A$ is the first
transformation applied to the point.

(a) Dilate $\mathbf{x} = \left( 4, -1, -3 \right)$ by 2 (i.e. $2\mathbf{x}$) and then project on the y-axis.

The induced matrix for this composition is,

$$\underbrace{\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}}_{\text{Project on } y\text{-axis}} \underbrace{\begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}}_{\text{Dilate by 2}} = \underbrace{\begin{bmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix}}_{\text{Composition}}$$

The matrix multiplication for the new point is then,

$$\begin{bmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 4 \\ -1 \\ -3 \end{bmatrix} = \begin{bmatrix} 0 \\ -2 \\ 0 \end{bmatrix}$$

The new point is then $\mathbf{w} = \left( 0, -2, 0 \right)$.

(b) Project $\mathbf{x} = \left( 4, -1, -3 \right)$ on the y-axis and then dilate by 2.

In this case the induced matrix is,

$$\underbrace{\begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}}_{\text{Dilate by 2}} \underbrace{\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}}_{\text{Project on } y\text{-axis}} = \underbrace{\begin{bmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix}}_{\text{Composition}}$$

So, in this case the induced matrix for the composition is the same as in the previous part.
Therefore, the new point is also the same, $\mathbf{w} = \left( 0, -2, 0 \right)$.

(c) Project $\mathbf{x} = \left( 4, 2 \right)$ on the x-axis and then rotate by $45^\circ$ counter-clockwise.

Here is the induced matrix for this composition.

$$\underbrace{\begin{bmatrix} \cos 45^\circ & -\sin 45^\circ \\ \sin 45^\circ & \cos 45^\circ \end{bmatrix}}_{\text{Rotate by } 45^\circ} \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}}_{\text{Project on } x\text{-axis}} = \begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2} & 0 \\ \frac{\sqrt{2}}{2} & 0 \end{bmatrix}$$

The matrix multiplication for the new point after applying this composition is,

$$\mathbf{w} = \begin{bmatrix} \frac{\sqrt{2}}{2} & 0 \\ \frac{\sqrt{2}}{2} & 0 \end{bmatrix} \begin{bmatrix} 4 \\ 2 \end{bmatrix} = \begin{bmatrix} 2\sqrt{2} \\ 2\sqrt{2} \end{bmatrix}$$

The new point is then, $\mathbf{w} = \left( 2\sqrt{2}, 2\sqrt{2} \right)$.
(d) Rotate $\mathbf{x} = \left( 4, 2 \right)$ $45^\circ$ counter-clockwise and then project on the x-axis.

The induced matrix for the final composition is,

$$\underbrace{\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}}_{\text{Project on } x\text{-axis}} \underbrace{\begin{bmatrix} \cos 45^\circ & -\sin 45^\circ \\ \sin 45^\circ & \cos 45^\circ \end{bmatrix}}_{\text{Rotate by } 45^\circ} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ 0 & 0 \end{bmatrix}$$

Note that this is different from the induced matrix from (c) and so we should expect the new point
to also be different. The fact that the induced matrix is different shouldn't be too surprising given
that matrix multiplication is not a commutative operation.

The matrix multiplication for the new point is,

$$\mathbf{w} = \begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 4 \\ 2 \end{bmatrix} = \begin{bmatrix} \sqrt{2} \\ 0 \end{bmatrix}$$

The new point is then, $\mathbf{w} = \left( \sqrt{2}, 0 \right)$ and as we expected it was not the same as that from part
(c).

So, as this example has shown us, transformation composition is not necessarily commutative, and
so in most cases we shouldn't expect to get the same result when the order of the transformations is
reversed.
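Here is the same non-commutativity check in NumPy (our own addition), using the matrices from parts (c) and (d):

```python
import numpy as np

P = np.array([[1.0, 0.0], [0.0, 0.0]])   # projection on the x-axis
c = np.sqrt(2) / 2
R = np.array([[c, -c], [c, c]])          # 45 degree counter-clockwise rotation

x = np.array([4.0, 2.0])
print(R @ P @ x)   # project, then rotate: [2.8284 2.8284] = (2*sqrt(2), 2*sqrt(2))
print(P @ R @ x)   # rotate, then project: [1.4142 0.    ] = (sqrt(2), 0)
```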
