
Linear Algebra

2009
Contents

Chapter 1. Systems of Linear Equations and Matrices


§1.1 Introduction to Systems of Linear Equations

§1.2 Gaussian Elimination

§1.3 Matrices and Matrix Operations

§1.4 Inverses; Rules of Matrix Arithmetic

§1.5 Elementary Matrices and a Method for Finding A−1

§1.6 Further Results on Systems of Equations and Invertibility

Chapter 2. Determinants
§2.1 Combinatorial Approach to Determinants

§2.2 Evaluating Determinants by Row Reduction

§2.3 Properties of the Determinant Function

§2.4 Determinants by Cofactor Expansion

Chapter 3. General Vector Spaces


§3.1 Euclidean n-Space

§3.2 General Vector Spaces

§3.3 Subspaces

§3.4 Linear Independence

§3.5 Basis and Dimension

§3.6 Row Space and Column Space and Rank

Chapter 4. Inner Product Spaces
§4.1 Inner Products

§4.2 Length and Angle in Inner Product Spaces

§4.3 Orthonormal Bases; Gram-Schmidt Process

§4.4 Coordinates; Change of Basis

Chapter 5. Linear Transformations


§5.1 Introduction to Linear Transformations

§5.2 Properties of Linear Transformations; Kernel and Range

§5.3 Linear Transformations from Rn to Rm

§5.4 Matrices of Linear Transformations

Chapter 6. Eigenvalues and Eigenvectors


§6.1 Eigenvalues and Eigenvectors

§6.2 Diagonalization

§6.3 Orthogonal Diagonalization; Symmetric Matrices

Chapter One
Systems of Linear Equations and Matrices

§1.1 Introduction to Systems of Linear Equations

In this section we introduce basic terminology and discuss a method for

solving systems of linear equations.

Definition 1.1. If a1 , a2 , . . . , an and b are real constants, then an equation of

the form

a1 x1 + a2 x2 + · · · + an xn = b

is called a linear equation in the n unknown variables x1 , x2 , . . . , xn .

A solution of a linear equation a1 x1 + a2 x2 + · · · + an xn = b is a sequence

of n numbers s1 , s2 , . . . , sn such that the equation is satisfied when we substitute

x1 = s1 , x2 = s2 , . . . , xn = sn , that is,

a1 s1 + a2 s2 + · · · + an sn = b.

The set of all solutions of the equation is called its solution set or the general

solution of the equation.

A finite set of linear equations in the variables x1 , x2 , . . . , xn ,

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
&\ \,\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}$$

is called a system of linear equations or a linear system. A sequence

of numbers s1 , s2 , . . . , sn is called a solution of the system if x1 = s1 ,

x2 = s2 , . . . , xn = sn is a solution of every equation in the system.

A system of linear equations is said to be inconsistent if it has no solu-

tions. If there is at least one solution, it is called consistent.

For a linear system

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
&\ \,\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m,
\end{aligned}$$

the rectangular array

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} & b_1\\ a_{21} & a_{22} & \cdots & a_{2n} & b_2\\ \vdots & \vdots & & \vdots & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{pmatrix}$$

is called the augmented matrix for the system.

The elementary row operations for the augmented matrix of a linear

system are

1. multiplying a row (horizontal line) through by a nonzero constant,

2. interchanging two rows,

3. adding a multiple of one row to another row.
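
These three operations are simple to mechanize. The following is a minimal Python sketch (an illustration added here, not part of the notes; the function names are our own), storing the matrix as a list of rows of exact Fractions:

```python
from fractions import Fraction

def scale_row(M, i, c):
    """Multiply row i of M through by the nonzero constant c."""
    assert c != 0
    M[i] = [c * x for x in M[i]]

def swap_rows(M, i, j):
    """Interchange rows i and j of M."""
    M[i], M[j] = M[j], M[i]

def add_multiple_of_row(M, src, dst, c):
    """Add c times row src to row dst."""
    M[dst] = [y + c * x for x, y in zip(M[src], M[dst])]

# The first two operations applied to the augmented matrix of Example 1.1 below:
M = [[Fraction(v) for v in row]
     for row in ([1, 1, 2, 9], [2, 4, -3, 1], [3, 6, -5, 0])]
add_multiple_of_row(M, 0, 1, -2)   # -2 x (1st row) + (2nd row)
add_multiple_of_row(M, 0, 2, -3)   # -3 x (1st row) + (3rd row)
```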

Remark 1.1. The following example illustrates how the elementary opera-

tions can be used to solve systems of linear equations.

Example 1.1. Solve the system of linear equations

x + y + 2z = 9

2x + 4y − 3z = 1

3x + 6y − 5z = 0.

Solution. Starting with its augmented matrix


 
 1 1 2 9 
 
 2 4 −3 1  (−2× 1st row + 2nd row) ⇒
 
 
3 6 −5 0

 
 1 1 2 9 
 
 0 2 −7 −17  (−3× 1st row + 3rd row) ⇒
 
 
3 6 −5 0

6
 
 1 1 2 9 
 
 0 2 −7 −17  ( 1 × 2rd row) ⇒
  2
 
0 3 −11 −27

 
 1 1 2 9 
 
 0 1 − 7 − 17  (−3× 2rd row+3rd row) ⇒
 2 2 
 
0 3 −11 −27

   
 1 1 2 9   1 1 2 9 
   
 0 1 − 7 − 17  (−2· 3rd) ⇒  0 1 − 7 − 17  (−1· 2rd+1st row) ⇒
 2 2   2 2 
   
0 0 − 12 − 32 0 0 1 3

 
11 35
 1 0 2 2 
 
 0 1 − 7 − 17  (− 11 × 3rd row+1st row and 7 × 3rd row +2nd row) ⇒
 2 2  2 2
 
0 0 1 3

 
 1 0 0 1 
 
 0 1 0 2 .
 
 
0 0 1 3

The solution is

x = 1, y = 2, z = 3.

♣Exercises 1.1

1. Which of the following are linear equations in x1 , x2 and x3 ?

(a) $x_1 + 2x_1x_2 + x_3 = 2$, (b) $x_1 + x_2 + x_3 = \sin k$ (k is a constant),

(c) $x_1 - 3x_2 + 2x_3^{1/2} = 4$, (d) $x_1 = \sqrt{2}\,x_3 - x_2 + 7$,

(e) $x_1 + x_2^{-1} - 3x_3 = 5$, (f) $x_1 = x_3$.

2. Find the solution set of each of the following:

(a) 6x − 7y = 3, (b) 2x1 + 4x2 − 7x3 = 8,

(c) − 3x1 + 4x2 − 7x3 + 8x4 = 5, (d) 2v − w + 3x + y − 4z = 0.

3. Find the augmented matrix for each of the following systems of linear

equations:

(a)
x1 − 2x2 = 0

3x1 + 4x2 = −1 ,

2x1 − x2 = 3
(b)
x1 + x3 = 0
,
−x1 + 2x2 − x3 = 3
(c)
x1 + x3 = 1

2x2 − x3 + x5 = 2 ,

2x3 + x4 = 3

(d)
x1 = 1

x2 = 2.

4. Find a system of linear equations corresponding to each of the following

augmented matrices:
   
 1 0 −1 2   1 0 0 
   
(a)  2 1 1 3  
 , (b)  0 1 0 
,
   
0 −1 2 4 1 −1 1
 
1 0 0 0 1
   
 
 0 1 0 0 2 
 1 2 3 4 5   
(c)   , (d)  .
 
5 4 3 2 1  0 0 1 0 3 
 
0 0 0 1 4

5. For which value(s) of the constant k does the following system of linear

equations have no solution? Exactly one solution? Infinitely many solutions?

x − y = 3

2x − 2y = k.

6. Consider the system of equations

ax + by = k

cx + dy = l

ex + f y = m.

Discuss the relative positions of the lines ax+by = k, cx+dy = l, ex+f y = m

when:

(a) the system has no solution;

(b) the system has exactly one solution;

(c) the system has infinitely many solutions.

§1.2 Gaussian Elimination

In this section we give a systematic procedure for solving systems of linear

equations; it is based on the idea of reducing the augmented matrix.

Definition 1.2. A matrix is said to be in reduced row-echelon form if

it has the following properties:

1. If a row does not consist entirely of zeros, then the first nonzero number

is a 1, called a leading 1 .

2. If there are any rows that consist entirely of zeros, then they are grouped

together at the bottom of the matrix.

3. In two successive rows that do not consist entirely of zeros, the leading

1 in the lower row occurs farther to the right than the leading 1 in the higher

row.

4. Each column that contains a leading 1 has zeros everywhere else.

A matrix having properties 1, 2 and 3 is said to be in row-echelon form.

Example 1.2. The following matrices are in reduced row-echelon form:
 
    0 1 −2 0 1
1 0 0 4 1 0 0    
      
     0 0 0 1 3   0 0 
 0 1 0  
7 , 0 1 0 
 , , .
      0 0 0 0 0 

0 0
0 0 1 −1 0 0 1  
0 0 0 0 0

While the following matrices are in row-echelon form:


     
 1 4 3 7   1 1 0   0 1 2 0 0 
     
 0 1 6 2     .
  ,  0 1 0  ,  0 0 1 −1 0 
     
0 0 1 5 0 0 0 0 0 0 0 1

Remark 1.2. We illustrate the idea which reduces a matrix into a reduced

row-echelon form by reducing the following matrix to a reduced row-echelon

form.

$$\begin{pmatrix} 0 & 0 & -2 & 0 & 7 & 12\\ 2 & 4 & -10 & 6 & 12 & 28\\ 2 & 4 & -5 & 6 & -5 & -1 \end{pmatrix}.$$

Step 1. Locate the leftmost column that does not consist entirely of zeros; here it is the first column.

Step 2. Interchange the top row with another, if necessary, to bring a nonzero

entry to the top of the column found in Step 1:

$$\begin{pmatrix} 2 & 4 & -10 & 6 & 12 & 28\\ 0 & 0 & -2 & 0 & 7 & 12\\ 2 & 4 & -5 & 6 & -5 & -1 \end{pmatrix}.$$

Step 3. If the entry that is now at the top of the column found in Step 1 is a,

multiply the first row by 1/a in order to introduce a leading 1:

$$\begin{pmatrix} 1 & 2 & -5 & 3 & 6 & 14\\ 0 & 0 & -2 & 0 & 7 & 12\\ 2 & 4 & -5 & 6 & -5 & -1 \end{pmatrix}.$$

Step 4. Add suitable multiples of the top row to the rows below so that all

entries below the leading 1 become zeros:

$$\begin{pmatrix} 1 & 2 & -5 & 3 & 6 & 14\\ 0 & 0 & -2 & 0 & 7 & 12\\ 0 & 0 & 5 & 0 & -17 & -29 \end{pmatrix}.$$

Step 5. Now cover the top row in the matrix and begin again with Step 1

applied to the submatrix that remains. Continue in this way until the entire

matrix is in row-echelon form. Multiplying the second row by −1/2, then adding

−5 times the second row to the third row, and then multiplying the third row by 2,

$$\begin{pmatrix} 1 & 2 & -5 & 3 & 6 & 14\\ 0 & 0 & 1 & 0 & -\tfrac72 & -6\\ 0 & 0 & 5 & 0 & -17 & -29 \end{pmatrix},\quad \begin{pmatrix} 1 & 2 & -5 & 3 & 6 & 14\\ 0 & 0 & 1 & 0 & -\tfrac72 & -6\\ 0 & 0 & 0 & 0 & \tfrac12 & 1 \end{pmatrix},\quad \begin{pmatrix} 1 & 2 & -5 & 3 & 6 & 14\\ 0 & 0 & 1 & 0 & -\tfrac72 & -6\\ 0 & 0 & 0 & 0 & 1 & 2 \end{pmatrix}$$
which is now in row-echelon form. To find the reduced row-echelon form we

need the following additional step.

Step 6. Beginning with the last nonzero row and working upward, add suit-

able multiples of each row to the rows above to introduce zeros above the

leading 1’s. Adding 7/2 times the third row to the second row, then −6 times the third row to the first row, and finally 5 times the second row to the first row,

$$\begin{pmatrix} 1 & 2 & -5 & 3 & 6 & 14\\ 0 & 0 & 1 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 1 & 2 \end{pmatrix},\quad \begin{pmatrix} 1 & 2 & -5 & 3 & 0 & 2\\ 0 & 0 & 1 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 1 & 2 \end{pmatrix},\quad \begin{pmatrix} 1 & 2 & 0 & 3 & 0 & 7\\ 0 & 0 & 1 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 1 & 2 \end{pmatrix}$$
which is now in reduced row-echelon form.

The above procedure for reducing a matrix into a reduced row-echelon form

is called Gauss-Jordan elimination. If we use only the first five steps, the

procedure produces a row-echelon form and is called Gaussian elimination.
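
As an illustration (a sketch of ours, not part of the formal development), the six steps can be folded into a single routine that clears each pivot column both below and above as it goes; this produces the same reduced row-echelon form as performing Steps 1-5 and then Step 6:

```python
from fractions import Fraction

def gauss_jordan(M):
    """Reduce M (a list of rows of Fractions) to reduced row-echelon form
    in place, clearing each pivot column above and below its leading 1."""
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        # Steps 1-2: find a nonzero entry in column c and move it to row r.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        # Step 3: introduce a leading 1.
        M[r] = [x / M[r][c] for x in M[r]]
        # Steps 4 and 6: make every other entry in column c zero.
        for i in range(rows):
            if i != r and M[i][c] != 0:
                M[i] = [y - M[i][c] * x for x, y in zip(M[r], M[i])]
        r += 1
        if r == rows:
            break
    return M

M = [[Fraction(v) for v in row] for row in
     ([0, 0, -2, 0, 7, 12], [2, 4, -10, 6, 12, 28], [2, 4, -5, 6, -5, -1])]
gauss_jordan(M)   # the reduced row-echelon form obtained in Remark 1.2
```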

Example 1.3. Solve the following linear system by using Gauss-Jordan elim-

ination.

x1 + 3x2 − 2x3 + 2x5 = 0

2x1 + 6x2 − 5x3 − 2x4 + 4x5 − 3x6 = −1

5x3 + 10x4 + 15x6 = 5

2x1 + 6x2 + 8x4 + 4x5 + 18x6 = 6.

Solution. Start with the augmented matrix for the system


 
$$\begin{pmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0\\ 2 & 6 & -5 & -2 & 4 & -3 & -1\\ 0 & 0 & 5 & 10 & 0 & 15 & 5\\ 2 & 6 & 0 & 8 & 4 & 18 & 6 \end{pmatrix}.$$

Adding −2 times the first row to the second row,

$$\begin{pmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0\\ 0 & 0 & -1 & -2 & 0 & -3 & -1\\ 0 & 0 & 5 & 10 & 0 & 15 & 5\\ 2 & 6 & 0 & 8 & 4 & 18 & 6 \end{pmatrix}.$$

Multiplying the second row by −1, adding −2 times the first row to the fourth row, and then adding −5 times the second row to the third row and −4 times the second row to the fourth row,

$$\begin{pmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0\\ 0 & 0 & 1 & 2 & 0 & 3 & 1\\ 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 6 & 2 \end{pmatrix}.$$

Interchanging the third and fourth rows and then multiplying the third row of the resulting matrix by 1/6, we obtain the row echelon form

$$\begin{pmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0\\ 0 & 0 & 1 & 2 & 0 & 3 & 1\\ 0 & 0 & 0 & 0 & 0 & 1 & \tfrac13\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

Adding −3 times the third row to the second row and then adding 2 times the second row of the resulting matrix to the first row,

$$\begin{pmatrix} 1 & 3 & 0 & 4 & 2 & 0 & 0\\ 0 & 0 & 1 & 2 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & \tfrac13\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$

which is in reduced row echelon form. The corresponding system of equations is

x1 + 3x2 + 4x4 + 2x5 = 0

x3 + 2x4 = 0

x6 = 1/3.

Solving for the leading variables,

x1 = −3x2 − 4x4 − 2x5

x3 = −2x4

x6 = 1/3.

If we set

x2 = r, x4 = s, x5 = t

for arbitrary values r, s, t, such arbitrary values are called parameters, then

the solution set is given by

x1 = −3r − 4s − 2t

x2 = r

x3 = −2s

x4 = s

x5 = t

x6 = 1/3.

Definition 1.3. The technique of solving a system of linear equations by using

Gaussian elimination to bring the augmented matrix into row echelon form and

then solving the resulting system from the bottom equation upward is called back-substitution.

Example 1.4. Solve the linear system in Example 1.3 by back-substitution.

Solution. From the solution of Example 1.3, we have the row echelon form
 
$$\begin{pmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0\\ 0 & 0 & 1 & 2 & 0 & 3 & 1\\ 0 & 0 & 0 & 0 & 0 & 1 & \tfrac13\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

The corresponding system of equations is

x1 + 3x2 − 2x3 + 2x5 = 0

x3 + 2x4 + 3x6 = 1

x6 = 1/3.

We proceed as follows:

Step 1. Solve each equation for its leading variable.

x1 = −3x2 + 2x3 − 2x5

x3 = 1 − 2x4 − 3x6

x6 = 1/3.

Step 2. Beginning with the bottom equation and working upward, successively substitute each equation into all the equations above it. Substituting x6 = 1/3 into the second equation,

x1 = −3x2 + 2x3 − 2x5

x3 = −2x4

x6 = 1/3.

Substituting x3 = −2x4 into the first equation,

x1 = −3x2 − 4x4 − 2x5

x3 = −2x4

x6 = 1/3.

Step 3. Assign arbitrary values to the nonleading variables.

If we assign

x2 = r, x4 = s, x5 = t

for arbitrary values r, s, t, then the solution set is given by

x1 = −3r − 4s − 2t

x2 = r

x3 = −2s

x4 = s

x5 = t

x6 = 1/3.
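
For the common special case of a square system whose row echelon form has a nonzero leading entry in every row, back-substitution is just a loop from the bottom equation upward. A minimal Python sketch (ours), applied to a triangular system equivalent to Example 1.1:

```python
def back_substitute(U, b):
    """Solve Ux = b for an upper-triangular U with nonzero diagonal,
    working from the bottom equation upward as in Step 2 above."""
    n = len(b)
    x = [0] * n
    for i in range(n - 1, -1, -1):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - known) / U[i][i]
    return x

# x + y + 2z = 9,  2y - 7z = -17,  z = 3:
print(back_substitute([[1, 1, 2], [0, 2, -7], [0, 0, 1]], [9, -17, 3]))
# [1.0, 2.0, 3.0], i.e. x = 1, y = 2, z = 3
```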

Every system of linear equations has either one solution, infinitely many

solutions, or no solutions. If a system has solutions, how many does it have?

We consider several cases in which it is possible to determine the number of

solutions by inspection.

Definition 1.4. A system of linear equations is said to be homogeneous

if the constants b1 , b2 , . . . , bm are all zero, that is, the system has the form

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0\\
&\ \,\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= 0.
\end{aligned}$$

The solution x1 = 0, x2 = 0, . . . , xn = 0 of any homogeneous system of linear

equations is called the trivial solution; if there are other solutions, they are

called nontrivial solutions.

Remark 1.3. For a homogeneous system of linear equations, exactly one of

the following is true:

1. The system has only the trivial solution.

2. The system has infinitely many nontrivial solutions.

Example 1.5. Solve the homogeneous system of linear equations by Gauss-

Jordan elimination

2x1 + 2x2 − x3 + x5 = 0

−x1 − x2 + 2x3 − 3x4 + x5 = 0

x1 + x2 − 2x3 − x5 = 0

x3 + x4 + x5 = 0.

Solution. The augmented matrix for the system is

 
$$\begin{pmatrix} 2 & 2 & -1 & 0 & 1 & 0\\ -1 & -1 & 2 & -3 & 1 & 0\\ 1 & 1 & -2 & 0 & -1 & 0\\ 0 & 0 & 1 & 1 & 1 & 0 \end{pmatrix}.$$

Reducing this matrix to reduced row echelon form,


 
$$\begin{pmatrix} 1 & 1 & 0 & 0 & 1 & 0\\ 0 & 0 & 1 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

The corresponding system of equations is

x1 + x2 + x5 = 0

x3 + x5 = 0 .

x4 = 0

Solving for the leading variables,

x1 = −x2 − x5

x3 = −x5

x4 = 0

The solution set is given by

x1 = −s − t, x2 = s, x3 = −t, x4 = 0, x5 = t

where s and t are arbitrary values.

Theorem 1.1. A homogeneous system of linear equations with more un-

knowns than equations has infinitely many solutions.

Proof. Omitted!

Remark 1.4. A nonhomogeneous system of linear equations with more un-

knowns than equations need not be consistent; however, if the system is con-

sistent, it will have infinitely many solutions.

♣Exercises 1.2

1. Which of the following are in reduced row-echelon form?


     
 1 0 0   0 1 0   1 1 0 
     
(a) 
 0 0 0  , (b)  1 0
 0 
, (c)  
 0 1 0 ,
     
0 0 1 0 0 0 0 0 0
 
1 2 0 3 0  
   

 0
  1 0 0 5 
 0 1 1 0     1 0 3 1 
(d)   , (e)  0 0 1 3  , (f ) 
 .
   
 0 0 0 0 1    0 1 2 4
  0 1 0 4
0 0 0 0 0

2. Which of the following are in row-echelon form?

(a) $\begin{pmatrix} 1 & 2 & 3\\ 0 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix}$, (b) $\begin{pmatrix} 1 & -7 & 5 & 5\\ 0 & 1 & 3 & 2 \end{pmatrix}$, (c) $\begin{pmatrix} 1 & 1 & 0\\ 0 & 1 & 0\\ 0 & 0 & 0 \end{pmatrix}$,

(d) $\begin{pmatrix} 1 & 3 & 0 & 2 & 0\\ 1 & 0 & 2 & 2 & 0\\ 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$, (e) $\begin{pmatrix} 2 & 3 & 4\\ 0 & 1 & 2\\ 0 & 0 & 3 \end{pmatrix}$, (f) $\begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix}$.

3. In each part suppose that the augmented matrix for a system of linear

equations has been reduced by row operations to the given reduced row-echelon

form. Solve the system.
   
 1 0 0 4   1 0 0 3 2 
   
(a)  0 1 0 3 
, (b) 
 0 1 0 −1 4
,

   
0 0 1 2 0 0 1 1 2
 
1 5 0 0 5 −1  
 

 0 0 1
  1 2 0 0 
 0 3 1 
  
(c)   , (d) 
 0 0 1 0
.

   
 0 0 0 1 4 2 
  0 0 0 1
0 0 0 0 0 0

4. In each part suppose that the augmented matrix for a system of linear

equations has been reduced by row operations to the given row-echelon form.

Solve the system.


   
 1 2 −4 2   1 0 4 7 10 
   
(a)  
 0 1 −2 −1  , (b) 
 0 1 −3 −4 −2  ,

   
0 0 1 2 0 0 1 1 2
 
1 5 −4 0 −7 −5  
 

 0 0

  1 2 2 2 
 1 1 7 3   
(c)   , (d) 
 0 1 3 3 .

   
 0 0 0 1 4 2 
  0 0 0 1
0 0 0 0 0 0

5. Solve the linear system by Gauss-Jordan elimination and by back-substitution:

3x1 + 2x2 − x3 = −15

5x1 + 3x2 + 2x3 = 0

3x1 + x2 + 3x3 = 11

11x1 + 7x2 = −30

6. Show that if ad ≠ bc, then

(a) the reduced row-echelon form of $\begin{pmatrix} a & b\\ c & d \end{pmatrix}$ is $\begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}$;

(b) the system

ax + by = k

cx + dy = l

has exactly one solution.

7. Without using pencil and paper, determine which of the following homo-

geneous systems have nontrivial solutions:

(a) x1 + 3x2 + 5x3 + x4 = 0 (b) x1 + 2x2 + 3x3 = 0

4x1 − 7x2 − 3x3 − x4 = 0 , x2 + 4x3 = 0 ,

3x1 + 2x2 + 7x3 + 8x4 = 0 5x3 = 0

(c) a11 x1 + a12 x2 + a13 x3 = 0
    a21 x1 + a22 x2 + a23 x3 = 0,

(d) x1 + x2 = 0
    2x1 + 2x2 = 0.

♣. Solve the given homogeneous systems of linear equations:

2. 2x1 + x2 + 3x3 = 0
   x1 + 2x2 = 0
   x2 + x3 = 0

3. 3x1 + x2 + x3 + x4 = 0
   5x1 − x2 + x3 − x4 = 0

4. 2x1 − 4x2 + x3 + x4 = 0
   x1 − 5x2 + 2x3 = 0
   − 2x2 − 2x3 − x4 = 0
   x1 + 3x2 + x4 = 0
   x1 − 2x2 − x3 + x4 = 0

5. x + 6y − 2z = 0
   2x − 4y + z = 0.

8. For which value(s) of λ does the following system of equations have non-

trivial solutions?
(λ − 3)x +y = 0

x +(λ − 3)y = 0.

9. Consider the system of equations

ax + by = k

cx + dy = l

ex + f y = m.

Discuss the relative positions of the lines ax+by = k, cx+dy = l, ex+f y = m

when:

(a) the system has only the trivial solution;

(b) the system has nontrivial solutions.

10. Consider the system of equations

ax + by = 0

cx + dy = 0.

(a) Show that if x = x0 , y = y0 is any solution and k is any constant, then

x = kx0 , y = ky0 is also a solution.

(b) Show that if x = x1 , y = y1 and x = x2 , y = y2 are any two solutions,

then x = x1 + x2 , y = y1 + y2 is also a solution.

§1.3 Matrices and Matrix Operations

Rectangular arrays of real numbers arise in many contexts other than as

augmented matrices for systems of linear equations. In this section we consider

such arrays, called matrices, as objects in their own right and develop some

of their properties.

Definition 1.5. A matrix is a rectangular array of numbers. The numbers

in a matrix are called the entries of the matrix. The size of a matrix is

described by specifying the number of rows (horizontal lines) and columns

(vertical lines) that occur in the matrix. If a matrix has m rows and n columns,

its size is m by n (written m × n).

Remark 1.5. If A is an m × n matrix with aij as its entry in row i and column

j, then it is denoted by

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \quad\text{or}\quad A = [a_{ij}].$$

If m = n, then A is called a square matrix of order n and the entries

a11 , a22 , . . . , ann are said to be on the main diagonal of A.

Remark 1.6. When we discuss matrices, it is common to refer to numerical

quantities (real numbers) as scalars.

Definition 1.6. Let A = [aij ] and B = [bij ] be two matrices with the same

size.

(1). A and B are said to be equal if aij = bij for all i, j.

(2). The sum of A and B is the matrix A + B = [aij + bij ]; the difference

of A from B is the matrix A − B = [aij − bij ].

(3). If c is any scalar, the scalar multiplication cA is the matrix cA =

[caij ].

Definition 1.7. Let A = [aij ] be an m × r matrix and B = [bij ] be an r × n

matrix. Then the matrix product AB is the m × n matrix

$$AB = \left[\,\sum_{k=1}^{r} a_{ik}b_{kj}\right].$$

Example 1.6. If

$$A = \begin{pmatrix} 1 & 2 & 4\\ 2 & 6 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 4 & 1 & 4 & 3\\ 0 & -1 & 3 & 1\\ 2 & 7 & 5 & 2 \end{pmatrix},$$

then

$$AB = \begin{pmatrix} 1\cdot4+2\cdot0+4\cdot2 & 1\cdot1-2\cdot1+4\cdot7 & 1\cdot4+2\cdot3+4\cdot5 & 1\cdot3+2\cdot1+4\cdot2\\ 2\cdot4+6\cdot0+0\cdot2 & 2\cdot1-6\cdot1+0\cdot7 & 2\cdot4+6\cdot3+0\cdot5 & 2\cdot3+6\cdot1+0\cdot2 \end{pmatrix} = \begin{pmatrix} 12 & 27 & 30 & 13\\ 8 & -4 & 26 & 12 \end{pmatrix}.$$
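
The formula of Definition 1.7 translates directly into code. A minimal Python sketch (ours), checked against this example:

```python
def mat_mul(A, B):
    """Matrix product: entry (i, j) of AB is sum over k of A[i][k] * B[k][j]."""
    m, r, n = len(A), len(B), len(B[0])
    assert all(len(row) == r for row in A), "A must be m x r and B r x n"
    return [[sum(A[i][k] * B[k][j] for k in range(r)) for j in range(n)]
            for i in range(m)]

A = [[1, 2, 4], [2, 6, 0]]
B = [[4, 1, 4, 3], [0, -1, 3, 1], [2, 7, 5, 2]]
print(mat_mul(A, B))   # [[12, 27, 30, 13], [8, -4, 26, 12]]
```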

Remark 1.7. Matrix multiplication has an important application to systems

of linear equations. Consider a system of m equations in n unknowns:

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
&\ \,\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m.
\end{aligned}$$

Then the system is represented by

$$\begin{pmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n\\ \vdots\\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{pmatrix} = \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_m \end{pmatrix}$$

which becomes

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix} = \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_m \end{pmatrix}.$$

The matrix

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$

is called the coefficient matrix for the linear system. We denote the aug-

mented matrix by

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} & | & b_1\\ a_{21} & a_{22} & \cdots & a_{2n} & | & b_2\\ \vdots & \vdots & & \vdots & | & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} & | & b_m \end{pmatrix}.$$

Definition 1.8. If A = [aij ] is an m × n matrix, then the n × m matrix

At = [aji ] that results from interchanging the rows and columns of A is called

the transpose of A.

If A = [aij ] is a square matrix of order n, then tr(A) = a11 + a22 + · · · + ann

is called the trace of A.

♣Exercises 1.3

1. Let A and B be 4×5 matrices and let C, D, and E be 5×2, 4×2, and 5×4

matrices, respectively. Determine which of the following matrix expressions are

defined. For those that are defined, give the size of the resulting matrix.

(a) BA, (b) AC + D, (c) AE + B, (d) AB + B,

(e) E(A + B), (f ) E(AC), (g) E t A, (h) (At + E)D.

2.

(a) Show that if AB and BA are defined, then AB and BA are square

matrices.

(b) Show that if A is an m × n matrix and A(BA) is defined, then B is an

n × m matrix.

3. Solve the following matrix equation for a, b, c and d:


   
$$\begin{pmatrix} a-b & b+c\\ 3d+c & 2a-4d \end{pmatrix} = \begin{pmatrix} 8 & 1\\ 7 & 6 \end{pmatrix}.$$

4. Consider the matrices
 
   
 3 0 
   4 −1   1 4 2 
A=  −1 2 ,
 B=  C= ,
  0 2 3 1 5
1 1

   
 1 5 2   6 1 3 
   
D=
 −1 0 1  , E =  −1 1 2  .
  
   
3 2 4 4 1 3

Compute
(a) AB, (b) D + E, (c) D − E,

(d) DE, (e) ED, (f ) − 7B.

5. A square matrix is called a diagonal matrix if all entries off the main

diagonal are zero. Show that the product of diagonal matrices is again a

diagonal matrix.

§1.4 Inverses; Rules of Matrix Arithmetic

Many of the rules of arithmetic for real numbers also hold for matrices, but

there are some exceptions. For example, in general, AB ≠ BA; take, for instance,

$$A = \begin{pmatrix} -1 & 0\\ 2 & 3 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 1 & 2\\ 3 & 0 \end{pmatrix}.$$

Theorem 1.2. Assuming that the sizes of matrices are such that the indicated

operations can be performed,

(a) A + B = B + A,

(b) A + (B + C) = (A + B) + C,

(c) A(BC) = (AB)C,

(d) A(B + C) = AB + AC,

(e) (B + C)A = BA + CA,

(f) A(B − C) = AB − AC,

(g) (B − C)A = BA − CA,

(h) a(B + C) = aB + aC,

(i) a(B − C) = aB − aC,

(j) (a + b)C = aC + bC,

(k) (a − b)C = aC − bC,

(l) (ab)C = a(bC),

(m) a(BC) = (aB)C = B(aC).

Proof. Left to the reader as exercises.

Definition 1.9. A matrix whose entries are zeros is called a zero matrix

and is denoted by O.

Theorem 1.3. Assuming that the sizes of matrices are such that the indicated

operations can be performed,

(a) A + O = O + A = A,

(b) A − A = O,

(c) O − A = −A,

(d) AO = O; OA = O.

Proof. Left to the reader as exercises.

Theorem 1.4. Every system of linear equations has either no solutions, ex-

actly one solution or infinitely many solutions.

Proof. Let AX = B be a system of linear equations. Suppose that the system

has more than one solution. It is enough to show that it has infinitely many

solutions. Let X1 and X2 be two solutions of AX = B. Then

AX1 = B and AX2 = B.

So

AX1 = AX2 ⇒ A(X1 − X2 ) = O.

If we set X0 = X1 − X2 and k is any scalar, then

A(X1 + kX0 ) = AX1 + A(kX0 )

= AX1 + k(AX0 )

= B + kO

= B+O

= B.

Thus X1 + kX0 is also a solution for arbitrary k, and hence AX = B has

infinitely many solutions.

Definition 1.10. A square matrix A = [aij ] such that

$$a_{ij} = \begin{cases} 1 & \text{if } i = j,\\ 0 & \text{if } i \neq j, \end{cases}$$

is called an identity matrix and is denoted by I. If we emphasize the size,

we write In for the n × n identity matrix.

Remark 1.8. For any matrix A, AI = IA = A.

Definition 1.11. A square matrix A is said to be invertible if there exists

a matrix B, called an inverse of A, such that AB = BA = I.

Theorem 1.5. If B and C are both inverses of a matrix A, then B = C.

Proof. Since B and C are inverses of A,

B = IB = (CA)B = C(AB) = CI = C.

Remark 1.9. If A is an invertible matrix, its inverse will be denoted by A−1 .

Thus

AA−1 = I = A−1 A.

Theorem 1.6. If A and B are invertible matrices of the same size, then

(a) AB is invertible,

(b) (AB)−1 = B −1 A−1 .

Proof. Since

(AB)(B −1 A−1 ) = A(BB −1 )A−1 = AIA−1 = AA−1 = I,

(B −1 A−1 )(AB) = B −1 (A−1 A)B = B −1 IB = B −1 B = I.

Thus B −1 A−1 is the inverse of AB; so AB is invertible.

Definition 1.12. If A is a square matrix, then we define the nonnegative

integer powers of A to be

$$A^0 = I, \qquad A^n = \underbrace{AA\cdots A}_{n\ \text{factors}} \quad (n > 0).$$

In addition, if A is invertible, we define

$$A^{-n} = \underbrace{A^{-1}A^{-1}\cdots A^{-1}}_{n\ \text{factors}} \quad (n > 0).$$

Theorem 1.7. If A is a square matrix and r and s are integers, then

Ar As = Ar+s , (Ar )s = Ars .

Proof. Left to the reader as exercises.

Theorem 1.8. If A is an invertible matrix, then

(a) A−1 is invertible and (A−1 )−1 = A,

(b) An is invertible and (An )−1 = (A−1 )n , n = 0, 1, 2, . . . ,


(c) For any k ≠ 0, kA is invertible and $(kA)^{-1} = \frac{1}{k}A^{-1}$.

Proof. (a) Since AA−1 = I = A−1 A, A−1 is invertible and (A−1 )−1 = A.

(b) Since

An (A−1 )n = An A−n = An−n = A0 = I,

(A−1 )n An = A−n An = A−n+n = A0 = I,

An is invertible and (An )−1 = (A−1 )n .

(c) Since

$$(kA)\left(\frac{1}{k}A^{-1}\right) = \left(k\cdot\frac{1}{k}\right)AA^{-1} = I, \qquad \left(\frac{1}{k}A^{-1}\right)(kA) = \left(\frac{1}{k}\cdot k\right)(A^{-1}A) = I,$$

kA is invertible and $(kA)^{-1} = \frac{1}{k}A^{-1}$.

Theorem 1.9. If the sizes of the matrices are such that the given operations

can be performed, then

(a) (At )t = A,

(b) (A + B)t = At + B t ,

(c) (kA)t = kAt for any scalar k,

(d) (AB)t = B t At ,

(e) If A is invertible, then At is also invertible and (At )−1 = (A−1 )t .

Proof. (e)
At (A−1 )t = (A−1 A)t = I t = I,

(A−1 )t At = (AA−1 )t = I t = I.
Others are left to the reader as exercises.

♣Exercises 1.4

1. Let α = −3, β = 2, and let

$$A = \begin{pmatrix} 3 & 2\\ -1 & 3 \end{pmatrix}, \quad B = \begin{pmatrix} 4 & 0\\ 1 & 5 \end{pmatrix}, \quad C = \begin{pmatrix} 0 & -1\\ 4 & 6 \end{pmatrix}.$$

Show that

(a) A + (B + C) = (A + B) + C, (b) (AB)C = A(BC),

(c) (α + β)C = αC + βC, (d) α(B − C) = αB − αC,

(e) α(BC) = (αB)C = B(αC), (f ) A(B − C) = AB − AC,

(g) (At )t = A, (h) (A + B)t = At + B t ,

(i) (αC)t = αC t , (j) (AB)t = B t At .

2. Compute the inverses of the following matrices:

$$A = \begin{pmatrix} 3 & 1\\ 5 & 2 \end{pmatrix}, \quad B = \begin{pmatrix} 2 & -3\\ 4 & 4 \end{pmatrix}, \quad C = \begin{pmatrix} 2 & 0\\ 0 & 3 \end{pmatrix}.$$

3. Let A and B be square matrices of the same size. Is (AB)2 = A2 B 2 a valid

matrix identity? Justify your answer.

4. If R is a square matrix in reduced row echelon form and has no zero rows,

show that R = I.

§1.5 Elementary Matrices and a Method for Finding A−1

In this section we will develop a simple scheme or algorithm for finding the

inverse of an invertible matrix.

Definition 1.13. An n × n matrix E is called an elementary matrix if

it is obtained from In by performing a single elementary row operation, that

is, exactly one of

1. multiplying a row through a nonzero constant,

2. interchanging two rows,

3. adding a multiple of one row to another row.

Example 1.7. The following matrices are elementary matrices:

$$(i)\ \begin{pmatrix} 1 & 0\\ 0 & -3 \end{pmatrix}, \quad (ii)\ \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad (iii)\ \begin{pmatrix} 1 & 0 & 3\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}, \quad (iv)\ \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix},$$

since (i) multiplies the second row of I2 by −3, (ii) interchanges the second and

fourth rows of I4 , (iii) adds 3 times the third row of I3 to the first row, and

(iv) multiplies the first row of I3 by 1.

Theorem 1.10. If E is the elementary matrix obtained from Im by perform-

ing a row operation and A is an m × n matrix, then EA is the matrix that

results when the same row operation is performed on A.

Proof. Omitted!

Example 1.8. Let

$$A = \begin{pmatrix} 1 & 0 & 2 & 3\\ 2 & -1 & 3 & 6\\ 1 & 4 & 4 & 0 \end{pmatrix}, \qquad E = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 3 & 0 & 1 \end{pmatrix}.$$

Then E is the elementary matrix obtained from I3 by adding 3 times the first

row to the third row. We see that

$$EA = \begin{pmatrix} 1 & 0 & 2 & 3\\ 2 & -1 & 3 & 6\\ 4 & 4 & 10 & 9 \end{pmatrix},$$

which is the matrix that results when we add 3 times the first row of A to the

third row.

Definition 1.14. Let E and I be an elementary matrix and the identity

matrix of the same size.

Row operation on I that produces E        Row operation on E that reproduces I
Multiply row i by c ≠ 0                   Multiply row i by 1/c
Interchange rows i and j                  Interchange rows i and j
Add c times row i to row j                Add −c times row i to row j

The operations on the right side of the table are called the inverse opera-

tions of the corresponding operations on the left.

Theorem 1.11. Every elementary matrix is invertible, and the inverse is also

an elementary matrix.

Proof. If E is an elementary matrix, then E is the result from performing a

row operation on I. Let E0 be the matrix that results when the inverse of this

operation is performed on I. Then E0 is an elementary matrix. By Theorem

1. 10,

E0 E = I and EE0 = I.

Thus E0 is the inverse of E; so E is invertible.

Definition 1.15. A matrix A is said to be row equivalent to a matrix B,

written A ∼ B, if B is obtained from A by performing a finite number of

elementary row operations.

Theorem 1.12. If A is an n × n matrix, then the following statements are

equivalent:

(a) A is invertible,

(b) AX = 0 has only the trivial solution,

(c) A ∼ In .

Proof. (a) ⇒ (b). Assume A is invertible and let X0 be any solution of

AX = 0. Then AX0 = 0 and A−1 (AX0 ) = A−1 0 = 0; so 0 = IX0 = X0 .

Thus AX = 0 has only the trivial solution.

(b) ⇒ (c). Let AX = 0 be the matrix form of the system

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0\\
&\ \,\vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= 0
\end{aligned}$$


and the system has only the trivial solution. If we solve by Gauss-Jordan

elimination, then the system of equations corresponding to the reduced row-

echelon form of the augmented matrix will be

x1 = 0
x2 = 0
⋮
xn = 0
Thus the augmented matrix for the system AX = 0,

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} & 0\\ a_{21} & a_{22} & \cdots & a_{2n} & 0\\ \vdots & \vdots & & \vdots & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} & 0 \end{pmatrix},$$

can be reduced to the augmented matrix

$$\begin{pmatrix} 1 & 0 & \cdots & 0 & 0\\ 0 & 1 & \cdots & 0 & 0\\ \vdots & \vdots & & \vdots & \vdots\\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}$$

for
x1 = 0
x2 = 0
⋮
xn = 0
by a finite number of elementary row operations. Therefore, A is reduced to

the identity matrix In by a finite number of elementary row operations.

Thus A is row equivalent to In .

(c) ⇒ (a). Assume that A is row equivalent to In . Then, by Theorem 1.10,

we can find elementary matrices E1 , E2 , . . . , Ek such that

Ek · · · E2 E1 A = In .

By Theorem 1.11, E1 , E2 , . . . , Ek are invertible; so we have

A = E1−1 E2−1 · · · Ek−1 In = (Ek · · · E2 E1 )−1 .

Thus A is invertible.

Remark 1.10. From A = (Ek · · · E2 E1 )−1 ,

A−1 = Ek · · · E2 E1 .

A simple method for finding the inverse Ek · · · E2 E1 of A is given in the fol-

lowing example.

Example 1.9. Find the inverse of
 
 1 2 3 
 
 2 5 3 .
 
 
1 0 8

Solution. The procedure is as follows: we reduce [A|I] to [I|A−1 ].


 
 1 2 3 | 1 0 0 
 
 2 5 3 | 0 1 0 
 
 
1 0 8 | 0 0 1

Adding −2 times the first row to the second and −1 times the first row to the

third,  
 1 2 3 | 1 0 0 
 
 0 1 −3 | −2 1 0 
 
 
0 −2 5 | −1 0 1
Adding 2 times the second row to the third,
 
 1 2 3 | 1 0 0 
 
 0 1 −3 | −2 1 0 
 
 
0 0 −1 | −5 2 1

Multiplying the third row by −1,


 
 1 2 3 | 1 0 0 
 
 0 1 −3 | −2 1 0 
 
 
0 0 1 | 5 −2 −1

43
Adding 3 times the third row to the second and −3 times the third row to the

first,  
 1 2 0 | −14 6 3 
 
 0 1 0 | 13 −5 −3 
 
 
0 0 1 | 5 −2 −1
Adding −2 times the second row to the first,
 
 1 0 0 | −40 16 9 
 
 0 1 0 13 −5 −3 
 | .
 
0 0 1 | 5 −2 −1

Thus  
 −40 16 9 
 
A−1 =
 13 −5 −3 .

 
5 −2 −1
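
The reduction of [A|I] to [I|A−1 ] is easy to automate. A minimal Python sketch (ours, using exact Fractions), verified on the matrix of this example:

```python
from fractions import Fraction

def inverse(A):
    """Invert A by reducing [A | I] to [I | A^(-1)] with Gauss-Jordan
    elimination; raises ValueError if A is not invertible."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                M[i] = [y - M[i][c] * x for x, y in zip(M[c], M[i])]
    return [row[n:] for row in M]

print(inverse([[1, 2, 3], [2, 5, 3], [1, 0, 8]]))
# rows (as Fractions): [-40, 16, 9], [13, -5, -3], [5, -2, -1]
```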

♣Exercises 1.5

1. Which of the following are elementary matrices?


     
 2 0   1 0   2 0 
(a)  , (b)  , (c)  ,
0 1 3 1 0 2

   
 0 1 0   0 1 0 
   
(d) 
 1 0 0 ,
 (e) 
 0 0 1 ,

   
0 0 1 0 0 1

 
  1 0 0 0
1 0 0  
   
   0 1 0 0 
   
(f )  0 1 −3  , (g)  .
   
 0 1 1 0 
0 0 1  
0 0 0 1

2. Consider the matrices


     
 1 2 3   7 8 9   1 2 3 
     
A=  4 5 6 , B= 
 4 5 6 , C=
 4 5 6
.

     
7 8 9 1 2 3 9 12 15

Find elementary matrices E1 , E2 , E3 and E4 such that

(a) E1 A = B, (b) E2 B = A, (c) E3 A = C, (d) E4 C = A.

3. Express the matrix
 
 1 3 3 8 
 
A= 
 −2 −5 1 −8 
 
0 1 7 8

in the form EF A = R where E and F are elementary matrices and R is in

row-echelon form.

4. Show that if

$$A = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ a & b & c \end{pmatrix}$$

is an elementary matrix, then at least one of a, b, c must be zero.

5. Find the inverse of each of the following matrices, where k1 , k2 , k3 , k4 and

k are all nonzero.


     
(a) $\begin{pmatrix} k_1 & 0 & 0 & 0\\ 0 & k_2 & 0 & 0\\ 0 & 0 & k_3 & 0\\ 0 & 0 & 0 & k_4 \end{pmatrix}$, (b) $\begin{pmatrix} 0 & 0 & 0 & k_1\\ 0 & 0 & k_2 & 0\\ 0 & k_3 & 0 & 0\\ k_4 & 0 & 0 & 0 \end{pmatrix}$, (c) $\begin{pmatrix} k & 0 & 0 & 0\\ 1 & k & 0 & 0\\ 0 & 1 & k & 0\\ 0 & 0 & 1 & k \end{pmatrix}$.

§1.6 Further Results on Systems of Equations and Invertibility

In this section we will establish more results about systems of linear equa-

tions and invertibility of matrices.

Theorem 1.13. If A is an invertible n × n matrix, then for each n × 1 matrix

B, the system of equations AX = B has exactly one solution X = A−1 B.

Proof. Since A(A−1 B) = B, A−1 B is a solution of AX = B.

If X0 is any solution of AX = B, then AX0 = B and hence X0 = A−1 B.

Remark 1.11. To solve systems of equations

AX = B1 , AX = B2 , . . . , AX = Bk ,

reduce [A | B1 | B2 | · · · | Bk ] to [I | B1′ | B2′ | · · · | Bk′ ].

Example 1.10. Solve the systems

x1 + 2x2 + 3x3 = 4 x1 + 2x2 + 3x3 = 1


(a) 2x1 + 5x2 + 3x3 = 5 (b) 2x1 + 5x2 + 3x3 = 6 .

x1 + 8x3 = 9 x1 + 8x3 = −6

Solution. Reducing

$$\begin{pmatrix} 1 & 2 & 3 & | & 4 & | & 1\\ 2 & 5 & 3 & | & 5 & | & 6\\ 1 & 0 & 8 & | & 9 & | & -6 \end{pmatrix},$$

we will have

$$\begin{pmatrix} 1 & 0 & 0 & | & 1 & | & 2\\ 0 & 1 & 0 & | & 0 & | & 1\\ 0 & 0 & 1 & | & 1 & | & -1 \end{pmatrix}.$$
Thus the solution of (a) is x1 = 1, x2 = 0, x3 = 1 and of (b) is x1 = 2, x2 =

1, x3 = −1.
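
Numerically, the same idea is available in NumPy, whose solver accepts one right-hand-side column per system (an added illustration, assuming NumPy is available):

```python
import numpy as np

A = np.array([[1., 2., 3.], [2., 5., 3.], [1., 0., 8.]])
B = np.array([[4., 1.], [5., 6.], [9., -6.]])  # right-hand sides of (a), (b)
X = np.linalg.solve(A, B)  # both systems solved with one factorization
print(X)  # columns: [1, 0, 1] and [2, 1, -1]
```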

Theorem 1.14. Let A be a square matrix.

(a) If B is a square matrix such that BA = I, then B = A−1 ,

(b) If B is a square matrix such that AB = I, then B = A−1 .

Proof. (a) It is enough to show that A is invertible and then BAA−1 = IA−1

implies B = A−1 . By Theorem 1.12, it suffices to prove that the system

AX = 0 has only the trivial solution. Let AX = 0. Then

B(AX) = B0 ⇒ (BA)X = 0 ⇒ IX = 0 ⇒ X = 0.

Thus AX = 0 has only the trivial solution.

(b) AB = I ⇒ B t At = I t = I; so, by (a), At is invertible and so is A by

Theorem 1.9(e).

Theorem 1.15. If A is an n × n matrix, then the following statements are

equivalent.

(a) A is invertible,

(b) AX = 0 has only the trivial solution,

(c) A is row equivalent to In ,

(d) AX = B is consistent for every n × 1 matrix B.

Proof. By Theorem 1.12, (a) ⇒ (b) ⇒ (c) ⇒ (a). It remains to show that

(a) ⇔ (d).

If A is invertible and B is any n × 1 matrix, then X = A−1 B is a solution

of AX = B; so AX = B is consistent.

Assume the system AX = B is consistent for any n × 1 matrix B. In

particular, the systems


     
 1   0   0 
     
 0   1   0 
     
     
     
AX =  0  , AX =  0  , . . . , AX =  0 
     
 ..   .   . 
 
.   .
.   .. 
    
     
0 0 1

will be consistent. Let X1 be a solution of the first system, X2 be a solution

of the second system, . . . , Xn be a solution of the last system. Let C be

the matrix with X1 , X2 , . . . , Xn as the first, the second, . . . , last columns,

respectively, that is,

C = [X1 X2 . . . Xn ];

so we see that

AC = [AX1 AX2 . . . AXn ] = I.

By Theorem 1.14, A is invertible.

A Fundamental Problem. Let A be an m × n matrix. Find all m × 1 matrices

B such that the system AX = B is consistent.

The following example illustrates how Gaussian elimination can be used to

determine such conditions.

Example 1.11. What conditions must b1 , b2 , b3 satisfy for the system

x1 + x2 + 2x3 = b1

x1 + x3 = b 2

2x1 + x2 + 3x3 = b3
to be consistent?

Solution. Start with the augmented matrix

$$\begin{pmatrix} 1 & 1 & 2 & b_1\\ 1 & 0 & 1 & b_2\\ 2 & 1 & 3 & b_3 \end{pmatrix}.$$
Adding −1 times the first row to the second and −2 times the first row to the third,

$$\begin{pmatrix} 1 & 1 & 2 & b_1\\ 0 & -1 & -1 & b_2 - b_1\\ 0 & -1 & -1 & b_3 - 2b_1 \end{pmatrix}.$$

Multiplying the second row by −1,

$$\begin{pmatrix} 1 & 1 & 2 & b_1\\ 0 & 1 & 1 & b_1 - b_2\\ 0 & -1 & -1 & b_3 - 2b_1 \end{pmatrix}.$$

Adding the second row to the third,

$$\begin{pmatrix} 1 & 1 & 2 & b_1\\ 0 & 1 & 1 & b_1 - b_2\\ 0 & 0 & 0 & b_3 - b_2 - b_1 \end{pmatrix}.$$

From the third row, it is evident that the system has a solution if and only if

b3 − b2 − b1 = 0, or b3 = b1 + b2 . Thus the system AX = B is consistent if and only if

$$B = \begin{pmatrix} b_1\\ b_2\\ b_1 + b_2 \end{pmatrix}.$$

♣Exercises 1.6

1. Find the conditions that the b’s must satisfy for the systems to be consis-

tent.

(a) 4x1 − 2x2 = b1
    2x1 − x2 = b2 ,

(b) x1 − x2 + 3x3 = b1
    3x1 − 3x2 + 9x3 = b2
    −2x1 + x2 − 6x3 = b3 .

2. Consider the matrices
   
 2 2 3   x1 
   
A=  1 2 1 ,
 and X = 
 x 2
.

   
2 −2 1 x3

Show that the equation AX = X can be written as (A − I)X = 0 and use this

result to solve AX = X for X.

3. Let AX = 0 be a homogeneous system of n linear equations in n unknowns

that has only the trivial solution. Show that if k is any positive integer, then

the system Ak X = 0 also has only the trivial solution.

4. Let AX = 0 be a homogeneous system of n linear equations in n unknowns,

and let Q be an invertible matrix. Show that AX = 0 has just the trivial

solution if and only if (QA)X = 0 has just the trivial solution.

5. Show that an n × n matrix A is invertible if and only if it can be written

as a product of elementary matrices.

Chapter Two
Determinants

§2.1 Combinatorial Approach To Determinants

A “determinant” is a certain kind of function that associates a real number

with a square matrix. In this section, we will define this function. Our work

on the determinant function will have important applications to the theory of

systems of linear equations and will also lead us to an explicit formula for the

inverse of an invertible matrix.

Definition 2.1. A permutation of n integers is an ordered arrangement of

the n integers 1, 2, . . . , n.

Remark 2.1. The number of permutations of n integers equals n!.

Definition 2.2. An inversion of a permutation α = (i1 , i2 , . . . , in ) of n

integers is an ordered pair (ij , ik ) of integers of {1, 2, . . . , n} such that ij > ik

and ij precedes ik in α.

Remark 2.2. For a permutation α = (i1 , i2 , . . . , in ) of n integers, the number

of inversions of α equals j1 + j2 + · · · + jn−1 where

j1 is the number of inversions of α whose first coordinate is i1 ,

j2 is the number of inversions of α whose first coordinate is i2 , and so on.

jn−1 is the number of inversions of α whose first coordinate is in−1 .

Example 2.1. The number of inversions of (6, 1, 3, 4, 5, 2) equals 5 + 0 + 1 +

1 + 1 = 8.
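
Counting inversions is mechanical: check every pair of positions. A short Python sketch (ours), reproducing this count together with the sign function defined next:

```python
def inversions(p):
    """Number of pairs (p[i], p[j]) with i < j and p[i] > p[j]."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))

def sgn(p):
    """+1 if the permutation p is even, -1 if it is odd (Definition 2.3)."""
    return 1 if inversions(p) % 2 == 0 else -1

print(inversions((6, 1, 3, 4, 5, 2)))  # 8, as computed above
print(sgn((6, 1, 3, 4, 5, 2)))         # 1: eight inversions, so even
```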

Definition 2.3. A permutation α of n integers is said to be even if its total

number of inversions is even, and odd if its total number of inversions is odd.

If α is a permutation of n integers, we define the sign of α by

$$\operatorname{sgn}(\alpha) = \begin{cases} +1 & \text{if } \alpha \text{ is even},\\ -1 & \text{if } \alpha \text{ is odd}. \end{cases}$$

Definition 2.4. For an n × n matrix

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix},$$

an elementary product from A is any product of n entries of A, no

two of which come from the same row or the same column.

A signed elementary product from A is

$$\operatorname{sgn}(\alpha)\, a_{1i_1} a_{2i_2} \cdots a_{ni_n}$$

where α = (i1 , i2 , . . . , in ) is a permutation of n integers.

Remark 2.3. If A is an n × n matrix, then there are n! elementary products

from A.

Remark 2.4. For any permutation α of n integers, α is considered as a

bijective function from the set {1, 2, . . . , n} onto itself. So if α = (i1 , i2 , . . . , in )

is a permutation, it means that

α(1) = i1 , α(2) = i2 , . . . , α(n) = in .

Definition 2.5. Let A = [aij ] be an n × n matrix and let Sn be the set of

all permutations of n integers. Then the determinant function of square

matrices is denoted by det and we define the value of det(A) by

$$\det(A) = \sum_{\alpha \in S_n} \operatorname{sgn}(\alpha)\, a_{1\alpha(1)} a_{2\alpha(2)} \cdots a_{n\alpha(n)}.$$

Example 2.2.

$$(i)\ \det\begin{pmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{pmatrix} = a_{11}a_{22} - a_{12}a_{21}.$$

$$(ii)\ \det\begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{pmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}.$$
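
Definition 2.5 can be executed literally: one signed elementary product per permutation. A Python sketch of ours (it examines all n! permutations, so it is for illustration only):

```python
from itertools import permutations

def det_by_permutations(A):
    """det(A) as the sum of signed elementary products (Definition 2.5)."""
    n = len(A)
    total = 0
    for alpha in permutations(range(n)):
        inv = sum(alpha[i] > alpha[j]
                  for i in range(n) for j in range(i + 1, n))
        prod = 1
        for i in range(n):
            prod *= A[i][alpha[i]]   # a_{1,alpha(1)} ... a_{n,alpha(n)}
        total += (-1) ** inv * prod
    return total

print(det_by_permutations([[1, 2], [3, 4]]))  # 1*4 - 2*3 = -2, as in (i)
```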

♣Exercises 2.1

1. Find the number of inversions in each of the following permutations of

{1, 2, 3, 4, 5}

(a) (3, 4, 1, 5, 2), (b) (4, 2, 5, 3, 1), (c) (5, 4, 3, 2, 1),

(d) (1, 2, 3, 4, 5), (e) (1, 3, 5, 4, 2), (f ) (2, 3, 5, 4, 1).

2. Evaluate the determinants:

(a) $\begin{vmatrix} 1 & 2\\ -1 & 3 \end{vmatrix}$, (b) $\begin{vmatrix} 6 & 4\\ 3 & 2 \end{vmatrix}$, (c) $\begin{vmatrix} -1 & 7\\ -8 & -3 \end{vmatrix}$,

(d) $\begin{vmatrix} 1 & -2 & 7\\ 3 & 5 & 1\\ 4 & 3 & 8 \end{vmatrix}$, (e) $\begin{vmatrix} 8 & 2 & -1\\ -3 & 4 & -6\\ 1 & 7 & 2 \end{vmatrix}$, (f) $\begin{vmatrix} 1 & 0 & 3\\ 4 & 0 & -1\\ 2 & 8 & 6 \end{vmatrix}$.

3. Find all values of λ for which det(A) = 0.

(a) $A = \begin{pmatrix} \lambda - 1 & -2\\ 1 & \lambda - 4 \end{pmatrix}$, (b) $A = \begin{pmatrix} \lambda - 6 & 0 & 0\\ 0 & \lambda & -1\\ 0 & 4 & \lambda - 4 \end{pmatrix}$.

§2.2 Evaluating Determinants by Row Reduction

In this section we show that the determinant of a matrix can be evaluated

by reducing the matrix to row-echelon form.

Theorem 2.1. If A is a square matrix that contains a row of zeros, then

det(A) = 0.

Proof. Let A = [aij ] be an n × n matrix whose i-th row consists of zeros,

so that aij = 0 for each j = 1, 2, . . . , n. Then

$$\det(A) = \sum_{\alpha \in S_n} \operatorname{sgn}(\alpha)\, a_{1\alpha(1)} a_{2\alpha(2)} \cdots a_{i\alpha(i)} \cdots a_{n\alpha(n)} = 0$$

since the factor aiα(i) = 0 appears in every term.

Definition 2.6. A square matrix A is said to be

(a) upper triangular if all the entries below the main diagonal are zeros,

(b) lower triangular if all the entries above the main diagonal are zeros,

(c) triangular if it is either upper or lower triangular.

Example 2.3. A 4 × 4 upper and lower triangular matrix, respectively, are

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14}\\ 0 & a_{22} & a_{23} & a_{24}\\ 0 & 0 & a_{33} & a_{34}\\ 0 & 0 & 0 & a_{44} \end{pmatrix}, \qquad \begin{pmatrix} a_{11} & 0 & 0 & 0\\ a_{21} & a_{22} & 0 & 0\\ a_{31} & a_{32} & a_{33} & 0\\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix}.$$

We see that in either case, det(A) = a11 a22 a33 a44 is the product of entries on

the main diagonal. In general, we have the following theorem.

Theorem 2.2. If A is an n × n triangular matrix, then det(A) is the

product of the entries on the main diagonal.

Theorem 2.3. Let A be a n × n matrix.

(a) If A′ is the matrix obtained from A by multiplying one row by a scalar k,

then det(A′ ) = k det(A); so det(kA) = k^n det(A).

(b) If A′ is the matrix obtained from A by interchanging two rows, then

det(A′ ) = −det(A).

(c) If A′ is the matrix obtained from A by adding a multiple of one row to

another row, then det(A′ ) = det(A).

Proof. Left to reader as exercises.

Example 2.4. Evaluate det(A) where

$$A = \begin{pmatrix} 0 & 1 & 5\\ 3 & -6 & 9\\ 2 & 6 & 1 \end{pmatrix}.$$

Solution.

$$\det(A) = \begin{vmatrix} 0 & 1 & 5\\ 3 & -6 & 9\\ 2 & 6 & 1 \end{vmatrix} = -\begin{vmatrix} 3 & -6 & 9\\ 0 & 1 & 5\\ 2 & 6 & 1 \end{vmatrix} = -3\begin{vmatrix} 1 & -2 & 3\\ 0 & 1 & 5\\ 2 & 6 & 1 \end{vmatrix} = -3\begin{vmatrix} 1 & -2 & 3\\ 0 & 1 & 5\\ 0 & 10 & -5 \end{vmatrix} = -3\begin{vmatrix} 1 & -2 & 3\\ 0 & 1 & 5\\ 0 & 0 & -55 \end{vmatrix} = (-3)(1)(1)(-55) = 165.$$
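
A quick numerical check of this value (an added illustration, assuming NumPy):

```python
import numpy as np

A = np.array([[0., 1., 5.], [3., -6., 9.], [2., 6., 1.]])
print(np.linalg.det(A))  # 165.0, up to floating-point round-off
```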

Remark 2.5. By Theorem 2.1 and (c) of Theorem 2.3, if A is a square matrix

which has two proportional rows, then det(A) = 0.

Example 2.5. The determinant of

$$\begin{pmatrix} 2 & 7 & 8\\ 3 & 2 & 4\\ 4 & 14 & 16 \end{pmatrix}$$

is zero, since the first and third rows are proportional.

♣Exercises 2.2

1. Evaluate the following by inspection:

(a) $\begin{vmatrix} 2 & -40 & 17\\ 0 & 1 & 11\\ 0 & 0 & 3 \end{vmatrix}$, (b) $\begin{vmatrix} 1 & 0 & 0 & 0\\ -9 & -1 & 0 & 0\\ 12 & 7 & 8 & 0\\ 4 & 5 & 7 & 2 \end{vmatrix}$,

(c) $\begin{vmatrix} 1 & 2 & 3\\ 3 & 7 & 6\\ 1 & 2 & 3 \end{vmatrix}$, (d) $\begin{vmatrix} 3 & -1 & 2\\ 6 & -2 & 4\\ 1 & 7 & 3 \end{vmatrix}$.

♣. Evaluate the determinants of the given matrices by reducing the matrix

to row-echelon form:

2. $\begin{pmatrix} 2 & 3 & 7\\ 0 & 0 & -3\\ 1 & -2 & 7 \end{pmatrix}$, 3. $\begin{pmatrix} 2 & 1 & 1\\ 4 & 2 & 3\\ 1 & 3 & 0 \end{pmatrix}$, 4. $\begin{pmatrix} 1 & -2 & 0\\ -3 & 5 & 1\\ 4 & -3 & 2 \end{pmatrix}$, 5. $\begin{pmatrix} 2 & -4 & 8\\ -2 & 7 & -2\\ 0 & 1 & 5 \end{pmatrix}$,

6. $\begin{pmatrix} 3 & 6 & 9 & 3\\ -1 & 0 & 1 & 0\\ 1 & 3 & 2 & -1\\ -1 & -2 & -2 & 1 \end{pmatrix}$, 7. $\begin{pmatrix} 2 & 1 & 3 & 1\\ 1 & 0 & 1 & 1\\ 0 & 2 & 1 & 0\\ 0 & 1 & 2 & 3 \end{pmatrix}$,
 
8. $\begin{pmatrix} \frac12 & 1 & 1 & \frac12\\ -\frac12 & \frac12 & 0 & \frac12\\ \frac23 & \frac13 & \frac13 & 0\\ \frac13 & 1 & \frac13 & 0 \end{pmatrix}$, 9. $\begin{pmatrix} 1 & 3 & 1 & 5 & 3\\ -2 & -7 & 0 & -4 & 2\\ 0 & 0 & 1 & 0 & 1\\ 0 & 0 & 2 & 1 & 1\\ 0 & 0 & 0 & 1 & 1 \end{pmatrix}$.

 
10. Assume $\det\begin{pmatrix} a & b & c\\ d & e & f\\ g & h & i \end{pmatrix} = 5$. Find

(a) $\det\begin{pmatrix} d & e & f\\ g & h & i\\ a & b & c \end{pmatrix}$, (b) $\det\begin{pmatrix} -a & -b & -c\\ 2d & 2e & 2f\\ -g & -h & -i \end{pmatrix}$,

(c) $\det\begin{pmatrix} a+d & b+e & c+f\\ d & e & f\\ g & h & i \end{pmatrix}$, (d) $\det\begin{pmatrix} a & b & c\\ d-3a & e-3b & f-3c\\ 2g & 2h & 2i \end{pmatrix}$.

11. Use row reduction to show that

$$\begin{vmatrix} 1 & 1 & 1\\ a & b & c\\ a^2 & b^2 & c^2 \end{vmatrix} = (b-a)(c-a)(c-b).$$

§2.3 Properties of the Determinant Function

In this section we develop some of the fundamental properties of the

determinant function.

Theorem 2.4. If A is a square matrix, then det(At ) = det(A).

Proof. It follows from the fact that A and At actually have the same signed

elementary products.

Remark 2.6. By Theorem 2.4, Theorem 2.3 can be restated as follows: if A is an n × n

matrix, then

(a) If A′ is the matrix obtained from A by multiplying one column by a scalar

k, then det(A′ ) = k det(A).

(b) If A′ is the matrix obtained from A by interchanging two columns, then

det(A′ ) = −det(A).

(c) If A′ is the matrix obtained from A by adding a multiple of one column

to another column, then det(A′ ) = det(A).

Example 2.6. Compute the determinant of

$$A = \begin{pmatrix} 1 & 0 & 0 & 3\\ 2 & 7 & 0 & 6\\ 0 & 6 & 3 & 0\\ 7 & 3 & 1 & -5 \end{pmatrix}.$$

Solution. Adding −3 times the first column to the fourth column,

$$\det(A) = \begin{vmatrix} 1 & 0 & 0 & 3\\ 2 & 7 & 0 & 6\\ 0 & 6 & 3 & 0\\ 7 & 3 & 1 & -5 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 & 0\\ 2 & 7 & 0 & 0\\ 0 & 6 & 3 & 0\\ 7 & 3 & 1 & -26 \end{vmatrix} = (1)(7)(3)(-26) = -546.$$

Theorem 2.5. Let A = [aij ], A′ = [a′ij ] and A′′ = [a′′ij ] be n × n matrices such

that a′′rj = arj + a′rj for j = 1, 2, . . . , n and aij = a′ij = a′′ij for all i ≠ r. Then

det(A′′ ) = det(A) + det(A′ ).

Proof. Left to the reader as an exercise.

Example 2.7.

$$\det\begin{pmatrix} 1 & 7 & 5\\ 2 & 0 & 3\\ 1+0 & 4+1 & 7+(-1) \end{pmatrix} = \det\begin{pmatrix} 1 & 7 & 5\\ 2 & 0 & 3\\ 1 & 4 & 7 \end{pmatrix} + \det\begin{pmatrix} 1 & 7 & 5\\ 2 & 0 & 3\\ 0 & 1 & -1 \end{pmatrix}.$$

Theorem 2.6. If A and B are square matrices of the same size, then

det(AB) = det(A)det(B).

Proof. Omitted!

Remark 2.7. In general, det(A + B) ≠ det(A) + det(B). For example,

$$\det\begin{pmatrix} 3 & 1\\ 2 & 1 \end{pmatrix} + \det\begin{pmatrix} -1 & 3\\ 5 & 8 \end{pmatrix} \neq \det\begin{pmatrix} 2 & 4\\ 7 & 9 \end{pmatrix}.$$

Theorem 2.7. A square matrix A is invertible if and only if det(A) ≠ 0.

Proof. If A is invertible, then I = AA−1 and hence

1 = det(I) = det(AA−1 ) = det(A)det(A−1 );

so det(A) ≠ 0.

Assume that det(A) ≠ 0. We will show that A is row equivalent to I and

then, by Theorem 1.12, A is invertible. Let R be the reduced row-echelon form of A.

Then R is obtained by a finite number of elementary row operations on A;

so we can find elementary matrices E1 , E2 , . . . , Ek such that

$$E_k \cdots E_2E_1A = R \;\Rightarrow\; A = E_1^{-1}E_2^{-1}\cdots E_k^{-1}R.$$

Thus

$$\det(A) = \det(E_1^{-1})\det(E_2^{-1})\cdots\det(E_k^{-1})\det(R).$$

Since det(A) ≠ 0, det(R) ≠ 0; so no row of R consists entirely of

zeros. Hence R must be I. Therefore, A ∼ I.

Corollary. If A is invertible, then

$$\det(A^{-1}) = \frac{1}{\det(A)}.$$

Proof. Since AA−1 = I,

1 = det(I) = det(AA−1 ) = det(A)det(A−1 );

so $\det(A^{-1}) = \frac{1}{\det(A)}$.

♣Exercises 2.3

1. Verify that det(A) = det(At ) for

$$A = \begin{pmatrix} 1 & -3\\ 2 & 5 \end{pmatrix}, \qquad A = \begin{pmatrix} 1 & 2 & 7\\ -1 & 0 & 6\\ 3 & 2 & 8 \end{pmatrix}.$$

2. Verify that det(AB) = det(A)det(B) when

$$A = \begin{pmatrix} 2 & 1 & 0\\ 3 & 4 & 0\\ 0 & 0 & 2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & -1 & 3\\ 7 & 1 & 2\\ 5 & 0 & 1 \end{pmatrix}.$$

3. By inspection, explain why det(A) = 0 where

$$A = \begin{pmatrix} -3 & 4 & 7 & -2\\ 2 & 6 & 1 & -3\\ 1 & 0 & 0 & 0\\ 2 & -8 & 3 & 4 \end{pmatrix}.$$

4. Assume that det(A) = 5, where

$$A = \begin{pmatrix} a & b & c\\ d & e & f\\ g & h & i \end{pmatrix}.$$

Find

(a) det(3A), (b) det(2A−1 ), (c) det((2A)−1 ).

5. For which value(s) of k does A fail to be invertible?

(a) $A = \begin{pmatrix} k-3 & -2\\ -2 & k-2 \end{pmatrix}$, (b) $A = \begin{pmatrix} 1 & 2 & 4\\ 3 & 1 & 6\\ k & 3 & 2 \end{pmatrix}$.

§2.4 Determinants by Cofactor Expansion

In this section we consider a method for evaluating determinants that is

useful for hand computations and important theoretically.

Definition 2.7. If A = [aij ] is an n × n matrix, then the minor

of entry aij is denoted by Mij and is defined to be the determinant of the

(n − 1) × (n − 1) submatrix which is obtained from A by deleting the i-th

row and the j-th column. The number $(-1)^{i+j}M_{ij}$ is denoted by Cij and is

called the cofactor of aij .

Example 2.8. Let

$$A = \begin{pmatrix} 3 & 1 & -4\\ 2 & 5 & 6\\ 1 & 4 & 8 \end{pmatrix}.$$

Then

$$M_{11} = \begin{vmatrix} 5 & 6\\ 4 & 8 \end{vmatrix} = 40 - 24 = 16; \quad\text{so}\quad C_{11} = (-1)^{1+1}M_{11} = 16.$$

Theorem 2.8. If A = [aij ] is a n × n matrix, then

(a) det(A) = a1j C1j + a2j C2j + · · · + anj Cnj for each j, which is called the

cofactor expansion along the j-th column.

(b) det(A) = ai1 Ci1 + ai2 Ci2 + · · · + ain Cin for each i, which is called the

cofactor expansion along the i-th row .

Proof. Omitted!
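
Cofactor expansion along the first row gives a natural recursive algorithm. A Python sketch of ours (fine for small matrices; row reduction is far cheaper for large ones), checked on the matrix of Example 2.8:

```python
def det_cofactor(A):
    """det(A) by cofactor expansion along the first row
    (Theorem 2.8(b) with i = 1)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]  # drop row 1, col j
        total += (-1) ** j * A[0][j] * det_cofactor(minor)
    return total

print(det_cofactor([[3, 1, -4], [2, 5, 6], [1, 4, 8]]))  # 48 - 10 - 12 = 26
```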

Example 2.9. Evaluate det(A) where

$$A = \begin{pmatrix} 3 & 5 & -2 & 6\\ 1 & 2 & -1 & 1\\ 2 & 4 & 1 & 5\\ 3 & 7 & 5 & 3 \end{pmatrix}.$$

Solution. Adding suitable multiples of the second row to the other rows and

expanding along the first column,

$$\det(A) = \begin{vmatrix} 3 & 5 & -2 & 6\\ 1 & 2 & -1 & 1\\ 2 & 4 & 1 & 5\\ 3 & 7 & 5 & 3 \end{vmatrix} = \begin{vmatrix} 0 & -1 & 1 & 3\\ 1 & 2 & -1 & 1\\ 0 & 0 & 3 & 3\\ 0 & 1 & 8 & 0 \end{vmatrix} = -\begin{vmatrix} -1 & 1 & 3\\ 0 & 3 & 3\\ 1 & 8 & 0 \end{vmatrix};$$

then, adding the first row to the third row and expanding along the first column,

$$= -\begin{vmatrix} -1 & 1 & 3\\ 0 & 3 & 3\\ 0 & 9 & 3 \end{vmatrix} = -(-1)\begin{vmatrix} 3 & 3\\ 9 & 3 \end{vmatrix} = 9 - 27 = -18.$$

Remark 2.8. From Theorem 2.8, if A = [aij ] is an n × n matrix, then

(a) a1i C1j + a2i C2j + · · · + ani Cnj = 0, provided i ≠ j, since it is the

determinant of the matrix obtained from A by replacing the j-th column by

the i-th column, which therefore has two identical columns.

(b) aj1 Ci1 + aj2 Ci2 + · · · + ajn Cin = 0, provided i ≠ j, since it is the

determinant of the matrix obtained from A by replacing the i-th row by the

j-th row, which therefore has two identical rows.

Definition 2.8. If A = [aij ] is an n × n matrix and Cij is the cofactor of aij ,

then the matrix

$$\begin{pmatrix} C_{11} & C_{12} & \cdots & C_{1n}\\ C_{21} & C_{22} & \cdots & C_{2n}\\ \vdots & \vdots & & \vdots\\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{pmatrix}$$

is called the matrix of cofactors from A. The transpose of this matrix,

$$\begin{pmatrix} C_{11} & C_{21} & \cdots & C_{n1}\\ C_{12} & C_{22} & \cdots & C_{n2}\\ \vdots & \vdots & & \vdots\\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{pmatrix},$$

is called the adjoint of A and is denoted by adj(A).

Example 2.10. If

$$A = \begin{pmatrix} 3 & 2 & -1\\ 1 & 6 & 3\\ 2 & -4 & 0 \end{pmatrix},$$

then the matrix of cofactors from A is

$$\begin{pmatrix} C_{11} & C_{12} & C_{13}\\ C_{21} & C_{22} & C_{23}\\ C_{31} & C_{32} & C_{33} \end{pmatrix} = \begin{pmatrix} 12 & 6 & -16\\ 4 & 2 & 16\\ 12 & -10 & 16 \end{pmatrix}$$

and the adjoint of A is

$$\operatorname{adj}(A) = \begin{pmatrix} 12 & 4 & 12\\ 6 & 2 & -10\\ -16 & 16 & 16 \end{pmatrix}.$$

Theorem 2.9. If A is an invertible matrix, then

$$A^{-1} = \frac{1}{\det(A)}\operatorname{adj}(A).$$

Proof. Let A = [aij ] be an n × n invertible matrix. By Theorem 2.8 and

Remark 2.8, the (i, j) entry of A adj(A) is ai1 Cj1 + ai2 Cj2 + · · · + ain Cjn ,

which equals det(A) if i = j and 0 if i ≠ j. Then

$$A\,\operatorname{adj}(A) = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} C_{11} & C_{21} & \cdots & C_{n1}\\ C_{12} & C_{22} & \cdots & C_{n2}\\ \vdots & \vdots & & \vdots\\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{pmatrix} = \begin{pmatrix} \det(A) & 0 & \cdots & 0\\ 0 & \det(A) & \cdots & 0\\ \vdots & \vdots & & \vdots\\ 0 & 0 & \cdots & \det(A) \end{pmatrix} = \det(A)\,I.$$

Thus A adj(A) = det(A)I; so

$$A\left[\frac{1}{\det(A)}\operatorname{adj}(A)\right] = I \quad\text{and hence}\quad A^{-1} = \frac{1}{\det(A)}\operatorname{adj}(A).$$
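
Theorem 2.9 also gives a direct (if inefficient) way to compute inverses. A self-contained Python sketch of ours, checked against Example 2.10:

```python
from fractions import Fraction

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([r[:j] + r[j + 1:] for r in A[1:]])
               for j in range(len(A)))

def adjoint(A):
    """adj(A): the transpose of the matrix of cofactors (Definition 2.8)."""
    n = len(A)
    C = [[(-1) ** (i + j) * det([r[:j] + r[j + 1:]
                                 for k, r in enumerate(A) if k != i])
          for j in range(n)] for i in range(n)]
    return [list(col) for col in zip(*C)]   # transpose

A = [[3, 2, -1], [1, 6, 3], [2, -4, 0]]
d = det(A)  # 64
A_inv = [[Fraction(x, d) for x in row] for row in adjoint(A)]
# A_inv = (1/64) adj(A), with adj(A) as computed in Example 2.10.
```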

Theorem 2.10 (Cramer’s Rule). If AX = B is a system of n linear equations

in n unknowns such that det(A) ≠ 0, then the system has a unique solution,

namely

$$x_j = \frac{\det(A_j)}{\det(A)} \quad\text{for } j = 1, 2, \ldots, n,$$

where

$$A_j = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1,j-1} & b_1 & a_{1,j+1} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2,j-1} & b_2 & a_{2,j+1} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{n,j-1} & b_n & a_{n,j+1} & \cdots & a_{nn} \end{pmatrix}$$

for each j = 1, 2, . . . , n. Note that det(Aj ) = b1 C1j + b2 C2j + · · · + bn Cnj by

Theorem 2.8.

Proof. Since det(A) ≠ 0, A−1 exists; so AX = B implies X = A−1 B, which

is the unique solution of AX = B.

By Theorem 2.9,

$$X = A^{-1}B = \frac{1}{\det(A)}\operatorname{adj}(A)B = \frac{1}{\det(A)} \begin{pmatrix} C_{11} & C_{21} & \cdots & C_{n1}\\ C_{12} & C_{22} & \cdots & C_{n2}\\ \vdots & \vdots & & \vdots\\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{pmatrix} \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{pmatrix} = \frac{1}{\det(A)} \begin{pmatrix} b_1C_{11} + b_2C_{21} + \cdots + b_nC_{n1}\\ b_1C_{12} + b_2C_{22} + \cdots + b_nC_{n2}\\ \vdots\\ b_1C_{1n} + b_2C_{2n} + \cdots + b_nC_{nn} \end{pmatrix}.$$

So

$$x_j = \frac{b_1C_{1j} + b_2C_{2j} + \cdots + b_nC_{nj}}{\det(A)} = \frac{\det(A_j)}{\det(A)}$$

for each j = 1, 2, . . . , n.
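
Cramer's Rule is likewise only a few lines of code. A self-contained Python sketch of ours, applied to Exercise 2 below:

```python
def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([r[:j] + r[j + 1:] for r in A[1:]])
               for j in range(len(A)))

def cramer(A, b):
    """Solve AX = b via x_j = det(A_j)/det(A), where A_j is A with
    column j replaced by b; assumes det(A) != 0."""
    d = det(A)
    return [det([row[:j] + [b[i]] + row[j + 1:]
                 for i, row in enumerate(A)]) / d
            for j in range(len(A))]

# 4x + 5y = 2, 11x + y + 2z = 3, x + 5y + 2z = 1 (Exercise 2 below):
print(cramer([[4, 5, 0], [11, 1, 2], [1, 5, 2]], [2, 3, 1]))
# [0.2727..., 0.1818..., -0.0909...], i.e. x = 3/11, y = 2/11, z = -1/11
```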

♣Exercises 2.4

1. Let

$$A = \begin{pmatrix} 1 & 6 & -3\\ -2 & 7 & 1\\ 3 & -1 & 4 \end{pmatrix}.$$
(a) Find all the minors.

(b) Find all the cofactors.

(c) Evaluate the determinant of A by a cofactor expansion along

(i) the first row, (ii) the first column,

(iii) the second row, (iv) the second column,

(v) the third row, (vi) the third column.

(d) Find (i) adj(A) and (ii) A−1 .

2. Use Cramer’s Rule to solve

4x + 5y = 2

11x + y + 2z = 3

x + 5y + 2z = 1.

3. Prove that the equation of the line through two distinct points (a1 , b1 ) and

(a2 , b2 ) can be written

$$\begin{vmatrix} x & y & 1\\ a_1 & b_1 & 1\\ a_2 & b_2 & 1 \end{vmatrix} = 0.$$

4. Prove that three points (x1 , y1 ), (x2 , y2 ) and (x3 , y3 ) are collinear if and

only if

$$\begin{vmatrix} x_1 & y_1 & 1\\ x_2 & y_2 & 1\\ x_3 & y_3 & 1 \end{vmatrix} = 0.$$

5. For the system

4x + y + z + w = 6

3x + 7y − z + w = 1
,
7x + 3y − 5z + 8w = −3

x + y + z + 2w = 3

(a) solve by Cramer’s Rule;

(b) solve by Gauss-Jordan elimination.

6. Prove that if det(A) = 1 and all the entries in A are integers, then all the

entries in A−1 are integers.

7. Prove that if A is an invertible upper triangular matrix, then A−1 is upper

triangular.

Chapter Three
General Vector Spaces

§3.1 Euclidean n-Space

In this section we extend the idea of using pairs of numbers to locate

points in the plane and triples of numbers to locate points in 3-space to

spaces of dimension higher than three.

Definition 3.1. If n is a positive integer, then an ordered n-tuple is a

sequence of n real numbers (a1 , a2 , . . . , an ). The set of all ordered n-tuples is

called n-space and is denoted by Rn . When n = 2 or 3, it is usual to use the

terms ordered pair and ordered triple. When n = 1, it is usual to write R

rather than R1 , which is the set of all real numbers. The elements of Rn are

called vectors and the real numbers are called scalars.

Definition 3.2. Let u=(u1 , u2 , . . . , un ) and v=(v1 , v2 , . . . , vn ) be two vectors

in Rn . Then

(a) u and v are called equal if ui = vi for each i = 1, 2, . . . , n.

(b) The sum u+v is defined by

u+v = (u1 + v1 , u2 + v2 , . . . , un + vn ).

(c) If α is a scalar, then scalar multiple αu is defined by

αu = (αu1 , αu2 , . . . , αun ).

(d) The zero vector in Rn is the vector

0 = (0, 0, . . . , 0).

(e) The negative (or additive inverse) of u is defined by

−u = (−u1 , −u2 , . . . , −un ).

Theorem 3.1. Let u=(u1 , u2 , . . . , un ), v=(v1 , v2 , . . . , vn ) and w=(w1 , w2 , . . . , wn )

be vectors in Rn and α and β be scalars. Then

(a) u+v=v+u.

(b) u+(v+w)= (u+v)+w.

(c) u+0=u=0+u.

(d) u + (−u) = 0, that is, u − u = 0.

(e) α(βu) = (αβ)u.

(f) α(u+v) = αu + αv.

(g) (α + β)u = αu + βu.

(h) 1u=u.

Proof. Left to reader as exercises.

Definition 3.3. If u=(u1 , u2 , . . . , un ) and v=(v1 , v2 , . . . , vn ) are two vectors

in Rn , then the Euclidean inner product u·v is defined by

u · v = u1 v 1 + u2 v 2 + · · · + u n v n .

Theorem 3.2. Let u=(u1 , u2 , . . . , un ), v=(v1 , v2 , . . . , vn ) and w=(w1 , w2 , . . . , wn )

be vectors in Rn and α be a scalar. Then

(a) u· v=v· u.

(b) (u+v)· w=u· w+v· w.

(c) (αu) · v = α(u · v).

(d) v·v≥0. v·v=0 if and only if v=0.

Proof. Left to the reader as exercises.

Definition 3.4. The Euclidean norm (or Euclidean length) of a vector

u=(u1 , u2 , . . . , un ) in Rn is defined by
kuk = (u · u)^(1/2) = √(u1² + u2² + · · · + un²).

The Euclidean distance between u=(u1 , u2 , . . . , un ) and v=(v1 , v2 , . . . , vn )

is defined by
d(u, v) = ku − vk = √((u1 − v1 )² + (u2 − v2 )² + · · · + (un − vn )²).

Remark 3.1. It is common to refer to the n-space Rn with the operations of

addition, scalar multiplication and inner product as Euclidean n-space.

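These operations map directly onto array operations. The following sketch (NumPy assumed; not part of the text) computes the inner product, norm, and distance of Definitions 3.3 and 3.4:

    import numpy as np

    u = np.array([2.0, 0.0, -1.0, 3.0])
    v = np.array([5.0, 4.0, 7.0, -1.0])

    dot = np.dot(u, v)                    # u . v = u1 v1 + ... + un vn
    norm_u = np.sqrt(np.dot(u, u))        # ||u|| = (u . u)^(1/2)
    dist = np.sqrt(np.dot(u - v, u - v))  # d(u, v) = ||u - v||

    print(dot, norm_u, dist)
    print(np.isclose(norm_u, np.linalg.norm(u)))  # matches the built-in norm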
♣Exercises 3.1

1. Let u = (2, 0, −1, 3), v = (5, 4, 7, −1) and w = (6, 2, 0, 9). Find

(a) u − v, (b) 7v + 3w, (c) − w + v,

(d) 3(u − 7v), (e) − 3v − 8w, (f ) 2v − (u + w).

(g) find the vector x that satisfies u − v + x = 7x + w.

2. Let u1 = (−1, 3, 2, 0), u2 = (2, 0, 4, −1), u3 = (7, 1, 1, 4) and u4 =

(6, 3, 1, 2). Find scalars α1 , α2 , α3 and α4 such that

α1 u1 + α2 u2 + α3 u3 + α4 u4 = (0, 5, 6, −3).

3. Compute the Euclidean norm of v when

(a) v = (4, 3), (b) v = (1, −1, 3), (c) v = (2, 0, 3, −1).

4. Find the Euclidean inner product u · v when

(a) u = (−1, 3), v = (7, 2), (b) u = (3, 7, 1), v = (−1, 0, 2).

5. For vectors in Rn , establish the identities:

(a) ku + vk² + ku − vk² = 2kuk² + 2kvk²,

(b) u · v = (1/4)ku + vk² − (1/4)ku − vk².

§3.2 General Vector Spaces

In this section we generalize the concept of a vector still further. We will

say that a set of real (or complex) numbers K is a field if for any x, y ∈ K,

(a) x ± y, xy ∈ K,

(b) x−1 = 1/x ∈ K, provided x ≠ 0,

(c) 0, 1 ∈ K.

Definition 3.5. A vector space over the field K is a set V whose elements

are called vectors, together with two operations, + : V × V → V and

· : K × V → V , called addition and scalar multiplication, respectively,

which satisfy the following axioms: for all u, v, w ∈ V and α, β ∈ K,

(1) u + v ∈ V .

(2) u+v=v+u.

(3) u+(v+w)= (u+v)+w.

(4) There exists 0 ∈ V , called the zero vector , such that u+0=u=0+u.

(5) There exists −u such that u+(−u)=0=(−u) + u.

(6) αu ∈ V .

(7) α(u+v) = αu + αv.

(8) (α + β)u = αu + βu.

(9) α(βu) = (αβ)u.

(10) 1u=u.

If K = R is the set of all real numbers, V is called a real vector space,

and if K = C is the set of all complex numbers, V is called a complex vector

space. The elements of K are called scalars.

Theorem 3.3. If V is a vector space and u ∈ V and α is a scalar, then

(a) 0u = 0.

(b) α0 = 0.

(c) (−1)u = −u.

(d) If αu = 0, then α = 0 or u = 0.

Proof. (a)

0u + 0u = (0 + 0)u = 0u ⇒ 0u + 0u = 0u

0u + 0u + (−0u) = 0u + (−0u) = 0 ⇒ 0u + 0 = 0

⇒ 0u = 0.

(b)

α0 + α0 = α(0 + 0) = α0 ⇒ α0 + α0 = α0
α0 + α0 + (−α0) = α0 + (−α0) = 0 ⇒ α0 + 0 = 0
⇒ α0 = 0.
(c)
u + (−1)u = 1u + (−1)u

= (1 + (−1))u = 0u

= 0.
So (−1)u must be −u.

(d) Let αu = 0 and suppose α ≠ 0. Then

α−1 (αu) = α−1 0 = 0 ⇒ u = 1u = (α−1 α)u = α−1 (αu) = 0.

Thus if αu = 0, then either α = 0 or u = 0.

♣Exercises 3.2

♣. A set of objects is given together with operations of addition and scalar

multiplication. Determine which sets are vector spaces under the given oper-

ations. For those that are not, list all axioms that fail to hold.

1. The set of all triples of real numbers (x, y, z) with the operations
(x, y, z) + (x′ , y ′ , z ′ ) = (x + x′ , y + y ′ , z + z ′ ) and α(x, y, z) = (αx, y, z).

2. The set of all triples of real numbers (x, y, z) with the operations
(x, y, z) + (x′ , y ′ , z ′ ) = (x + x′ , y + y ′ , z + z ′ ) and α(x, y, z) = (0, 0, 0).

3. The set of all 2 × 2 matrices of the form

    [ a  1 ]
    [ 1  b ]

with matrix addition and scalar multiplication.

4. The set of all 2 × 2 matrices of the form

    [ a  0 ]
    [ 0  b ]

with matrix addition and scalar multiplication.

5. The set of all 2 × 2 matrices of the form

    [  a    a+b ]
    [ a+b    b  ]

with matrix addition and scalar multiplication.

6. The set whose only element is the moon. The operations are moon+moon=moon

and k(moon)=moon, where k is a real number.

§3.3 Subspaces

It is possible for one vector space to be contained within a larger vector

space.

Definition 3.6. A subset W of a vector space V is called a subspace of V

if W itself is a vector space under the addition and scalar multiplication defined on

V.

Theorem 3.4. A nonempty subset W of a vector space V is a subspace of V

if and only if

(a) u + v ∈ W for all u, v ∈ W ,

(b) αu ∈ W for all u ∈ W and any scalar α.

Proof. Left to the reader as an exercise.

Example 3.1. Consider the real vector space Rn . Each vector v = (v1 , v2 , . . . , vn ) ∈
Rn is regarded as an n × 1 matrix

        [ v1 ]
    v = [ v2 ].
        [ .. ]
        [ vn ]

Let A be an m × n matrix. If W = {s ∈ Rn | As = 0} is the set of all solutions,
called solution vectors, of the homogeneous linear system Ax = 0, then W
is a subspace of Rn , called the solution space of the system Ax = 0, since if
s, s′ ∈ W and α is a scalar, then

A(s + s′ ) = As + As′ = 0 + 0 = 0,
A(αs) = α(As) = α0 = 0.

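The solution space of Example 3.1 can also be computed numerically. A minimal sketch (NumPy assumed; null_space_basis is a hypothetical helper) extracts a basis of {s | As = 0} from the singular value decomposition: the right-singular vectors beyond the rank span the null space.

    import numpy as np

    def null_space_basis(A, tol=1e-10):
        """Return a matrix whose columns form an orthonormal basis
        of the solution space of Ax = 0."""
        U, s, Vt = np.linalg.svd(A)
        rank = np.sum(s > tol)
        return Vt[rank:].T           # rows of Vt past the rank span the null space

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0]])  # rank 1, so the null space is 2-dimensional
    N = null_space_basis(A)
    print(N.shape[1])                # dimension of the solution space: 2
    print(np.allclose(A @ N, 0))     # every basis vector solves Ax = 0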
Definition 3.7. A vector w is called a linear combination of the vectors

v1 , v2 , . . . , vr if there exist scalars α1 , α2 , . . . , αr such that

w = α1 v1 + α2 v2 + · · · + αr vr .

Definition 3.8. If v1 , v2 , . . . , vr are vectors in a vector space V and every

vector in V can be written as a linear combination of v1 , v2 , . . . , vr , then we

say that V is spanned by {v1 , v2 , . . . , vr }.

Theorem 3.5. Let S = {v1 , v2 , . . . , vr } be a subset of a vector space V .

Then the set of all linear combinations of v1 , v2 , . . . , vr , denoted by Span S (or

Span {v1 , v2 , . . . , vr }), is a subspace of V , called the linear space spanned

by S or the space spanned by S.

In addition, Span S is the smallest subspace of V which contains v1 , v2 , . . . , vr .

Proof. To show that Span S is a subspace, let u, v ∈ Span S and let α be

any scalar. Then there exist scalars α1 , α2 , . . . , αr and β1 , β2 , . . . , βr such that

u = α1 v1 + α2 v2 + · · · + αr vr

v = β1 v1 + β2 v2 + · · · + βr vr ; so

u + v = (α1 + β1 )v1 + (α2 + β2 )v2 + · · · + (αr + βr )vr

αu = (αα1 )v1 + (αα2 )v2 + · · · + (ααr )vr .

Thus u + v, αu ∈ Span S and hence Span S is a subspace.

To show that Span S is the smallest such subspace, first note that since

vi = 0v1 + 0v2 + · · · + 1vi + · · · + 0vr

for each i = 1, 2, . . . , r, each vi belongs to Span S. Now let W be any subspace
of V which contains v1 , v2 , . . . , vr . Since W is a subspace, W must contain all
linear combinations α1 v1 + α2 v2 + · · · + αr vr of v1 , v2 , . . . , vr .
Thus Span S ⊆ W .

♣Exercises 3.3

1. Determine which of the following are subspaces of R3 :

(a) all vectors of the form (a, 0, 0);

(b) all vectors of the form (a, 1, 1);

(c) all vectors of the form (a, b, c), where b = a + c;

(d) all vectors of the form (a, b, c), where b = a + c + 1.

2. Determine which of the following are subspaces of M22 which is the vector

space of all 2 × 2 matrices with matrix addition and scalar multiplication:

(a) all matrices of the form

    [ a  b ]
    [ c  d ]

where a, b, c, and d are integers.

(b) all matrices of the form

    [ a  b ]
    [ c  d ]

where a + d = 0.

(c) all 2 × 2 matrices A such that At = A.

(d) all 2 × 2 matrices A such that det(A) = 0.

3. Express the following as linear combinations of u = (2, 1, 4), v = (1, −1, 3)

and w = (3, 2, 5):

(a) (5, 9, 5), (b) (2, 0, 6), (c) (0, 0, 0), (d) (2, 2, 3).

4. In each part determine whether the given vectors span R3 :

(a) v1 = (1, 1, 1), v2 = (2, 2, 0), v3 = (3, 0, 0).

(b) v1 = (2, −1, 3), v2 = (4, 1, 2), v3 = (8, −1, 8).

(c) v1 = (3, 1, 4), v2 = (2, −3, 5), v3 = (5, −2, 9), v4 = (1, 4, −1).

(d) v1 = (1, 3, 3), v2 = (1, 3, 4), v3 = (1, 4, 3), v4 = (6, 2, 1).

§3.4 Linear Independence

Definition 3.9. A set of vectors S = {v1 , v2 , . . . , vr } is called a linearly

independent set if the vector equation

α1 v1 + α2 v2 + · · · + αr vr = 0

has only the trivial solution,

α1 = 0, α2 = 0, . . . , αr = 0.

Otherwise, that is, if the equation has solutions other than the trivial one,
S is called a linearly dependent set.

Example 3.2. Determine whether the vectors

v1 = (1, −2, 3), v2 = (5, 6, −1), v3 = (3, 2, 1)

form a linearly dependent or linearly independent set.

Solution. Suppose

α1 v1 + α2 v2 + α3 v3 = (0, 0, 0).

Then

(0, 0, 0) = α1 (1, −2, 3) + α2 (5, 6, −1) + α3 (3, 2, 1)
          = (α1 + 5α2 + 3α3 , −2α1 + 6α2 + 2α3 , 3α1 − α2 + α3 ),

so we have a system of linear equations

    α1 + 5α2 + 3α3 = 0
    −2α1 + 6α2 + 2α3 = 0
    3α1 − α2 + α3 = 0.

Solving this system,

    [  1   5  3 | 0 ]      [ 1  0  1/2 | 0 ]
    [ −2   6  2 | 0 ]  →   [ 0  1  1/2 | 0 ].
    [  3  −1  1 | 0 ]      [ 0  0   0  | 0 ]

Thus

    α1 + (1/2)α3 = 0
    α2 + (1/2)α3 = 0
    0 · α3 = 0.

Therefore,

    α1 = −t/2,  α2 = −t/2,  α3 = t

for t ∈ R. Since there are nontrivial solutions (for instance, t = 2 gives
α1 = α2 = −1, α3 = 2, so that v3 = (1/2)v1 + (1/2)v2 ), the vectors form a
linearly dependent set.

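Linear independence can be tested mechanically: stack the vectors as the columns of a matrix and compare its rank with the number of vectors. A short sketch (NumPy assumed; is_independent is a hypothetical helper):

    import numpy as np

    def is_independent(vectors):
        """Vectors are linearly independent iff the matrix having them
        as columns has rank equal to the number of vectors."""
        A = np.column_stack(vectors)
        return np.linalg.matrix_rank(A) == len(vectors)

    v1, v2, v3 = [1, -2, 3], [5, 6, -1], [3, 2, 1]
    print(is_independent([v1, v2, v3]))   # False: v3 = (1/2)v1 + (1/2)v2
    print(is_independent([v1, v2]))       # True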
Theorem 3.6. A nonempty set of vectors S is

(a) linearly dependent iff at least one of the vectors in S can be expressed as a
linear combination of the other vectors in S.

(b) linearly independent iff no vector in S can be expressed as a linear combi-
nation of the other vectors in S.

Proof. (a) Let S = {v1 , v2 , . . . , vr }.

(⇒) Suppose S is linearly dependent. Then there exist scalars α1 , α2 , . . . , αr ,

not all zero, such that

α1 v1 + α2 v2 + · · · + αr vr = 0.

If αi ≠ 0, then

α1 α2 αi−1 αi+1 αr
vi = − v1 − v2 − · · · − vi−1 − vi+1 − · · · − vr .
αi αi αi αi αi

So vi is a linear combination of S \ {vi }.

(⇐) Suppose vi is a linear combination of S \{vi }. Then there exist scalars

α1 , α2 , . . . , αi−1 , αi+1 , . . . , αr such that

vi = α1 v1 + α2 v2 + · · · + αi−1 vi−1 + αi+1 vi+1 + · · · + αr vr ;

so

α1 v1 + α2 v2 + · · · + αi−1 vi−1 − vi + αi+1 vi+1 + · · · + αr vr = 0.

Thus S is linearly dependent.

(b) is left to the reader as an exercise.

Theorem 3.7.

(a) If a set contains the zero vector, then it is linearly dependent.

(b) A set with exactly two vectors is linearly dependent if and only if one

is a scalar multiple of the other.

Proof. Left to the reader as exercises.

Theorem 3.8. Let S = {v1 , v2 , . . . , vr } be a set of vectors in Rn . If r > n,

then S is linearly dependent.

Proof. Let
v1 = (v11 , v12 , . . . , v1n )

v2 = (v21 , v22 , . . . , v2n )


..
.

vr = (vr1 , vr2 , . . . , vrn ).


Suppose

α1 v1 + α2 v2 + · · · + αr vr = 0.

Then we have the linear system

v11 α1 + v21 α2 + · · · + vr1 αr = 0

v12 α1 + v22 α2 + · · · + vr2 αr = 0


.. .. .. ..
. . . .

v1n α1 + v2n α2 + · · · + vrn αr = 0

If r > n, by Theorem 1.1, this system has a nontrivial solution α1 , α2 , . . . , αr .

Thus S is linearly dependent.

♣Exercises 3.4

1. Which of the following sets of vectors are linearly dependent?

(a) (2, −1, 4), (3, 6, 2), (2, 10, −4).

(b) (3, 1, 1), (2, −1, 5), (4, 0, −3).

(c) (6, 0, −1), (1, 1, 4).

(d) (1, 3, 3), (0, 1, 4), (5, 6, 3), (7, 2, −1).

2. For which real values of λ do the following vectors form a linearly dependent

set in R3 ?
v1 = (λ, −1/2, −1/2), v2 = (−1/2, λ, −1/2), v3 = (−1/2, −1/2, λ).

3. If S = {v1 , v2 , . . . , vn } is a linearly independent set of vectors, show that

every subset of S with one or more vectors is also a linearly independent set.

4. If {v1 , v2 , v3 } is a linearly dependent set of vectors, show that {v1 , v2 , v3 , v4 }

is also a linearly dependent set for any other vector v4 .

5. For any vectors u, v and w, show that {u − v, v − w, w − u} is linearly

dependent.

§3.5 Basis and Dimension

Definition 3.10. Let S = {v1 , v2 , . . . , vr } be a finite set of vectors in a vector

space V . Then S is called a basis for V if

(i) S is linearly independent,

(ii) S spans V .

Example 3.3. Let


e1 = (1, 0, 0, . . . , 0)
e2 = (0, 1, 0, . . . , 0)
    ..
en = (0, 0, 0, . . . , 1)
be vectors in Rn . It is easy to see that S = {e1 , e2 , . . . , en } is linearly indepen-

dent and spans Rn ; so S is a basis for Rn . This basis is called the standard

basis for Rn .

Definition 3.11. A vector space V is said to be finite-dimensional if

either V = {0} or V has a basis with a finite number of vectors. Otherwise,
that is, if V ≠ {0} and V has no finite basis, V is called infinite-dimensional .

Theorem 3.9. If S = {v1 , v2 , . . . , vn } is a basis for a vector space V , then

every set with more than n vectors is linearly dependent.

Proof. Let S 0 = {w1 , w2 , . . . , wm } be a set of m vectors in V and m > n.

Since S spans V , each vector of S ′ is a linear combination of v1 , v2 , . . . , vn ; so we have

w1 = α11 v1 + α21 v2 + ··· + αn1 vn

w2 = α12 v1 + α22 v2 + ··· + αn2 vn


.. .. .. ..
. . . .

wm = α1m v1 + α2m v2 + · · · + αnm vn

Suppose that

α1 w1 + α2 w2 + · · · + αm wm = 0.

Then we have
(α1 α11 + α2 α12 + · · · + αm α1m )v1 +

(α1 α21 + α2 α22 + · · · + αm α2m )v2 +


..
.

+(α1 αn1 + α2 αn2 + · · · + αm αnm )vn

= 0.
Thus
α1 α11 + α2 α12 + · · · + αm α1m = 0

α1 α21 + α2 α22 + · · · + αm α2m = 0


..
.

α1 αn1 + α2 αn2 + · · · + αm αnm = 0


Since this linear system has more unknowns than equations (∵ m > n), by

Theorem 1.1, this system has a nontrivial solution. So S 0 = {w1 , w2 , . . . , wm }

is linearly dependent.

Theorem 3.10. Any two bases for a finite-dimensional vector space have the

same number of vectors.

Proof. Let S = {v1 , v2 , . . . , vn } and S 0 = {w1 , w2 , . . . , wm } be bases for a

finite-dimensional vector space V .

Since S is a basis and S 0 is linearly independent, by Theorem 3.9, m ≤

n. On the other hand, since S 0 is a basis and S is linearly independent, by

Theorem 3.9, n ≤ m. Thus n = m.

Definition 3.12. The dimension of a finite-dimensional vector space V is

the number of vectors in a basis for V and is denoted by dim(V ). We define

the zero vector space to have dimension zero, that is, if V = {0}, define

dim(V ) = 0.

Example 3.4. Determine a basis for and the dimension of the solution space

of the homogeneous system

2x1 + 2x2 − x3 + x5 = 0

−x1 − x2 + 2x3 − 3x4 + x5 = 0

x1 + x2 − 2x3 − x5 = 0

x3 + x4 + x5 = 0.

Solution. The augmented matrix for the system is

    [  2   2  −1   0   1 | 0 ]
    [ −1  −1   2  −3   1 | 0 ]
    [  1   1  −2   0  −1 | 0 ]
    [  0   0   1   1   1 | 0 ].

Reducing this matrix to reduced row echelon form,

    [ 1  1  0  0  1 | 0 ]
    [ 0  0  1  0  1 | 0 ]
    [ 0  0  0  1  0 | 0 ]
    [ 0  0  0  0  0 | 0 ].

The corresponding system of equations is

x1 + x2 + x5 = 0

x3 + x5 = 0 .

x4 = 0

Solving for the leading variables,

x1 = −x2 − x5

x3 = −x5

x4 = 0

The solution set is given by

x1 = −s − t, x2 = s, x3 = −t, x4 = 0, x5 = t

where s and t are arbitrary values. Thus the solution space is

S = {x = (x1 , x2 , x3 , x4 , x5 ) | x1 = −s − t, x2 = s, x3 = −t, x4 = 0, x5 = t}.

Since

    [ x1 ]   [ −s − t ]   [ −s ]   [ −t ]     [ −1 ]     [ −1 ]
    [ x2 ]   [    s   ]   [  s ]   [  0 ]     [  1 ]     [  0 ]
    [ x3 ] = [   −t   ] = [  0 ] + [ −t ] = s [  0 ] + t [ −1 ],
    [ x4 ]   [    0   ]   [  0 ]   [  0 ]     [  0 ]     [  0 ]
    [ x5 ]   [    t   ]   [  0 ]   [  t ]     [  0 ]     [  1 ]

the vectors

         [ −1 ]         [ −1 ]
         [  1 ]         [  0 ]
    v1 = [  0 ],   v2 = [ −1 ]
         [  0 ]         [  0 ]
         [  0 ]         [  1 ]
span the solution space and we see that they are linearly independent. There-

fore, {v1 , v2 } is a basis for S and dim(S) = 2.

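The same basis and dimension can be recovered numerically. The sketch below (NumPy assumed, reusing the SVD-based null-space idea sketched in §3.3) confirms that the solution space of this system is 2-dimensional:

    import numpy as np

    A = np.array([[ 2,  2, -1,  0,  1],
                  [-1, -1,  2, -3,  1],
                  [ 1,  1, -2,  0, -1],
                  [ 0,  0,  1,  1,  1]], dtype=float)

    U, s, Vt = np.linalg.svd(A)
    rank = np.sum(s > 1e-10)
    basis = Vt[rank:].T                # columns span the solution space

    print(basis.shape[1])              # dim(S) = 5 - rank(A) = 2
    print(np.allclose(A @ basis, 0))   # True: each basis vector solves Ax = 0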
Theorem 3.11.

(a) If S = {v1 , v2 , . . . , vn } is linearly independent in an n-dimensional vector

space V , then S is a basis for V .

(b) If S = {v1 , v2 , . . . , vn } spans an n-dimensional vector space V , then S

is a basis for V .

(c) If S = {v1 , v2 , . . . , vr } is linearly independent in an n-dimensional vector

space V and r < n, then there exist vectors

vr+1 , vr+2 , . . . , vn ∈ V

such that S ∪ {vr+1 , vr+2 , . . . , vn } is a basis for V .

Proof. Left to the reader as exercises.

Example 3.5. Show that {(−3, 7), (5, 5)} is a basis for R2 .

Solution. By (a) of Theorem 3.11, it is enough to show that {(−3, 7), (5, 5)}

is linearly independent.

Suppose that

α(−3, 7) + β(5, 5) = (0, 0).

Then
−3α + 5β = 0

7α + 5β = 0
Subtracting the first equation from the second, 10α = 0; so α = 0 and then
β = 0. Thus {(−3, 7), (5, 5)} is linearly

independent.

♣Exercises 3.5

1. Which of the following sets of vectors are bases for R2 ?

(a) (2, 1), (3, 0), (b) (4, 1), (−7, −8), (c) (0, 0), (1, 3).

2. Which of the following sets of vectors are bases for R3 ?

(a) (1, 0, 0), (2, 2, 0), (3, 3, 3), (b) (3, 1, −4), (2, 5, 6), (1, 4, 8),

(c) (2, −3, 1), (4, 1, 1), (0, −7, 1), (d) (1, 6, 4), (2, 4, −1), (−1, 2, 5).

3. Determine the dimension of and a basis for the solution space of the system:

(a) x1 + x2 − x3 = 0          (b) 3x1 + x2 + x3 + x4 = 0
    −2x1 − x2 + 2x3 = 0           5x1 − x2 + x3 − x4 = 0
    −x1 + x3 = 0

§3.6 Row Space and Column Space and Rank

Definition 3.13. For an m × n matrix

        [ a11  a12  · · ·  a1n ]
    A = [ a21  a22  · · ·  a2n ],
        [  ..    ..          .. ]
        [ am1  am2  · · ·  amn ]

the vectors
r1 = (a11 , a12 , . . . , a1n ),

r2 = (a21 , a22 , . . . , a2n ),


..
.

rm = (am1 , am2 , . . . , amn )


are called the row vectors of A and the subspace of Rn spanned by the row

vectors is called the row space of A.

The vectors

         [ a11 ]          [ a12 ]                    [ a1n ]
    c1 = [ a21 ],    c2 = [ a22 ],    . . . ,   cn = [ a2n ],
         [  ..  ]          [  ..  ]                    [  ..  ]
         [ am1 ]          [ am2 ]                    [ amn ]

are called the column vectors of A and the subspace of Rm spanned by the

column vectors is called the column space of A.

Theorem 3.12. Elementary row operations do not change the row space of

a matrix.

Theorem 3.13. The nonzero row vectors in a row-echelon form of a matrix

A form a basis for the row space of A.

Example 3.6. Find a basis for the space spanned by the vectors

v1 = (1, −2, 0, 0, 3),

v2 = (2, −5, −3, −2, 6),

v3 = (0, 5, 15, 10, 0),

v4 = (2, 6, 18, 8, 6).

Solution. The space spanned by v1 , v2 , v3 , v4 is the row space of the matrix

    [ 1  −2   0   0  3 ]
    [ 2  −5  −3  −2  6 ]
    [ 0   5  15  10  0 ]
    [ 2   6  18   8  6 ].

This matrix is reduced to the row echelon form

    [ 1  −2  0  0  3 ]
    [ 0   1  3  2  0 ]
    [ 0   0  1  1  0 ]
    [ 0   0  0  0  0 ].

Thus the nonzero row vectors

r1 = (1, −2, 0, 0, 3),
r2 = (0, 1, 3, 2, 0),
r3 = (0, 0, 1, 1, 0)

form a basis for the space spanned by the vectors v1 , v2 , v3 , v4 .

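With a symbolic row reduction this computation is a single call. The sketch below uses SymPy (an assumption; any exact-arithmetic row reducer would do):

    import sympy as sp

    M = sp.Matrix([[1, -2,  0,  0, 3],
                   [2, -5, -3, -2, 6],
                   [0,  5, 15, 10, 0],
                   [2,  6, 18,  8, 6]])

    R, pivots = M.rref()   # reduced row echelon form and pivot columns
    print(R)               # nonzero rows form a basis for the row space
    print(M.rank())        # rank = 3 = number of basis vectors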
Example 3.7. Find a basis for the column space of the matrix

        [ 1  0  1   1 ]
    A = [ 3  2  5   1 ].
        [ 0  4  4  −4 ]

Solution. Transposing A,

         [ 1  3   0 ]
    At = [ 0  2   4 ]
         [ 1  5   4 ]
         [ 1  1  −4 ]

and reducing to the row echelon form

    [ 1  3  0 ]
    [ 0  1  2 ]
    [ 0  0  0 ]
    [ 0  0  0 ].

Thus the nonzero vectors

         [ 1 ]        [ 0 ]
    c1 = [ 3 ],  c2 = [ 1 ]
         [ 0 ]        [ 2 ]

form a basis for the column space of A.

Theorem 3.14. If A is any matrix, then the row space and column space of

A have the same dimension.

Definition 3.14. The dimension of the row (or column) space of a matrix A

is called the rank of A and is denoted by rank(A).

Theorem 3.15. Let A be an n × n matrix. Then the following statements are

equivalent.

(a) A is invertible.

(b) Ax = 0 has only the trivial solution.

(c) A is row equivalent to In .

(d) Ax = b is consistent for every n × 1 matrix b.

(e) det(A) ≠ 0.

(f) rank(A) = n.

(g) The row vectors of A are linearly independent.

(h) The column vectors of A are linearly independent.

Proof. In Theorem 1.15, we show that (a) ∼ (d) are equivalent, and (a) and

(e) are equivalent in Theorem 2.7. Now, we will show (c) ⇒ (f ) ⇒ (g) ⇒

(h) ⇒ (c).

(c) ⇒ (f ). Since A is row equivalent to In which has n nonzero rows, the

row space of A has dimension n by Theorem 3.13. Hence rank(A) = n.

(f ) ⇒ (g). Since rank(A) = n, the row space of A has dimension n. Since

the n row vectors of A span the row space of A, by Theorem 3.11, the row

vectors of A are linearly independent.

(g) ⇒ (h). Assume that the row vectors of A are linearly independent.

Then the row space of A is n-dimensional. By Theorem 3.14, the column

space of A is n-dimensional. Since the column vectors of A span the column

space of A, the column vectors of A are linearly independent by Theorem 3.11.

(h) ⇒ (c). Assume that the column vectors of A are linearly independent.

Then the column space of A is n-dimensional. By Theorem 3.14, the row

space of A is n-dimensional. This means that the reduced row-echelon form of

A has n nonzero rows, that is, all rows are nonzero; so it should be the identity

matrix In . Hence A is row equivalent to In .

Theorem 3.16. A system of linear equations Ax = b is consistent if and

only if b is in the column space of A.

Proof. Let A be an m × n matrix and let

        [ x1 ]
    x = [ x2 ].
        [ .. ]
        [ xn ]

Then Ax = b becomes

x1 c1 + x2 c2 + · · · + xn cn = b

where c1 , c2 , . . . , cn are the column vectors of A. Thus Ax = b has a solution
if and only if b can be written as a linear combination of the column vectors
of A, that is, if and only if b is in the column space of A.

Theorem 3.17. A system of linear equations Ax = b is consistent if and

only if the rank of the coefficient matrix A is the same as the rank of the

augmented matrix [A|b].

Proof. By Theorem 3.16, Ax = b is consistent if and only if b is in the

column space of A if and only if b is a linear combination of the column

vectors of A if and only if the rank of the column space of A is the same as the

rank of the matrix [A|b]. Note that the rank of A is the rank of the column

space of A.

Theorem 3.18. If Ax = b is a consistent linear system of m equations in n

unknowns, and if rank(A) = r, then the solution of the system contains n − r

parameters. (That is, the solution is of the form (x1 , x2 , . . . , xn ) such that r of

x1 , x2 , . . . , xn are functions of the remaining n − r terms, which take arbitrary values.)

Proof. By Theorem 3.17, since rank[A|b] = rank(A) = r, the reduced row

echelon form of the augmented matrix [A|b] has r nonzero rows. Since each of

these r rows contains a leading 1, the corresponding leading variables can be

expressed in terms of the remaining n − r unknowns.

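Theorems 3.17 and 3.18 translate directly into a consistency test: compare rank(A) with rank([A|b]). A small sketch (NumPy assumed; analyze_system is a hypothetical helper):

    import numpy as np

    def analyze_system(A, b):
        """Consistent iff rank(A) == rank([A|b]); when consistent, the
        solution has n - rank(A) free parameters (Theorems 3.17, 3.18)."""
        rA = np.linalg.matrix_rank(A)
        rAb = np.linalg.matrix_rank(np.column_stack([A, b]))
        if rA != rAb:
            return "inconsistent"
        return f"consistent with {A.shape[1] - rA} free parameter(s)"

    A = np.array([[1.0, 2.0], [2.0, 4.0]])
    print(analyze_system(A, np.array([3.0, 6.0])))   # consistent, 1 parameter
    print(analyze_system(A, np.array([3.0, 7.0])))   # inconsistent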
♣Exercises 3.6

1. List the row vectors and column vectors of the following matrices, and find

a basis for the row space; a basis for the column space; and the rank of the

matrix:
        [ 1  −3 ]        [ 1  2  −1 ]        [ 1  1  2  1 ]
    (a) [ 2  −6 ],   (b) [ 2  4   6 ],   (c) [ 1  0  1  2 ].
                         [ 0  0  −8 ]        [ 2  1  3  4 ]

2. Find a basis for the space of R4 spanned by the given vectors:

(a) (1, 1, −4, −3), (2, 0, 2, −2), (2, −1, 3, 2);

(b) (−1, 1, −2, 0), (3, 3, 6, 0), (9, 0, 0, 3);

(c) (1, 1, 0, 0), (0, 0, 1, 1), (−2, 0, 2, 2), (0, −3, 0, 3).

3. Verify that the row space and column space have the same dimension:

        [  2   0   2   2 ]
        [  3  −4  −1  −9 ]        [  2  3  5  7   4 ]
    (a) [  1   2   3   7 ],   (b) [ −1  2  1  0  −2 ].
        [ −3   1  −2   0 ]        [  4  1  5  9   8 ]

Chapter Four
Inner Product Spaces

§4.1 Inner Products

Recall that a real vector space is a vector space over the field R of all real

numbers.

Definition 4.1. An inner product on a real vector space V is a real-valued

function <, > : V × V → R such that for all u, v, w ∈ V and all α ∈ R,

(1) < u, v >=< v, u > (symmetry axiom),

(2) < u + v, w >=< u, w > + < v, w > (additivity axiom),

(3) < αu, v >= α < u, v > (homogeneity axiom),

(4) < v, v >≥ 0 (positivity axiom)

and < v, v >= 0 if and only if v = 0.

A real vector space with an inner product is called an inner product

space.

Example 4.1. Let

        [ u1 ]         [ v1 ]
    u = [ u2 ],    v = [ v2 ]
        [ .. ]         [ .. ]
        [ un ]         [ vn ]

be vectors in the Euclidean n-space Rn (expressed as n × 1 matrices), and let

A be an invertible n × n matrix. If u · v = u1 v1 + u2 v2 + · · · + un vn is the

Euclidean inner product on Rn , define <, > on Rn by

< u, v >= Au · Av.

Then <, > is an inner product on Rn (verify!), which is called the inner

product on Rn generated by A.

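A quick numerical verification of Example 4.1: the sketch below (NumPy assumed) builds <u, v> = Au · Av for a particular invertible A and spot-checks the axioms of Definition 4.1.

    import numpy as np

    A = np.array([[ 1.0, 2.0],
                  [-1.0, 3.0]])       # invertible: det = 5

    def inner(u, v):
        """Inner product on R^2 generated by A: <u, v> = Au . Av."""
        return np.dot(A @ u, A @ v)

    u, v, w = np.array([2.0, -1.0]), np.array([-1.0, 3.0]), np.array([0.0, -5.0])
    alpha = -3.0

    print(np.isclose(inner(u, v), inner(v, u)))                    # symmetry
    print(np.isclose(inner(u + v, w), inner(u, w) + inner(v, w)))  # additivity
    print(np.isclose(inner(alpha * u, v), alpha * inner(u, v)))    # homogeneity
    print(inner(u, u) > 0)                                         # positivity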
Theorem 4.1. If u, v and w are vectors in an inner product space and α is

a scalar, then

(a) < 0, v >=< v, 0 >= 0,

(b) < u, v + w >=< u, v > + < u, w >,

(c) < u, αv >= α < u, v >.

Proof. Left to the reader as exercises.

♣Exercises 4.1

1. Let < u, v > be the Euclidean inner product on R2 , and let u = (2, −1), v =

(−1, 3), w = (0, −5) and α = −3. Verify that

(a) < u, v >=< v, u >;

(b) < u + v, w >=< u, w > + < v, w >;

(c) < αu, v >= α < u, v >;

(d) < 0, v >=< v, 0 >= 0;

(e) < u, v + w >=< u, v > + < u, w >;

(f) < u, αv >= α < u, v >.

2. If < u, v > is the Euclidean inner product on Rn , and if A is an n × n

matrix, show that

< u, Av >=< At u, v > .

3. Let w1 , w2 , . . . , wn be positive real numbers and let u = (u1 , u2 , . . . , un )

and v = (v1 , v2 , . . . , vn ). Show that

< u, v >= w1 u1 v1 + w2 u2 v2 + · · · + wn un vn

is an inner product on Rn .

§4.2 Length and Angle in Inner Product Spaces

Definition 4.2. If V is an inner product space, then the norm (or length)

of a vector u ∈ V is denoted by kuk and is defined by

kuk = < u, u >^(1/2) .

The distance between two points (vectors) u and v in V is denoted by

d(u, v) = ku − vk.

Theorem 4.2 (Cauchy-Schwarz Inequality). If u and v are vectors in an

inner product space, then

< u, v >2 ≤< u, u >< v, v >= kuk2 kvk2 .

Proof. If u = 0, then

< u, v >= 0 =< u, u >;

so the equality holds.

Let u ≠ 0 and set

a = < u, u >, b = 2 < u, v >, c = < v, v >, and let t ∈ R be arbitrary.

By the positivity of the inner product <, >,

0 ≤ < tu + v, tu + v > = < u, u > t² + 2 < u, v > t + < v, v > = at² + bt + c.

Thus at² + bt + c ≥ 0 for all t, and hence the quadratic has either no real roots
or a repeated real root. Therefore its discriminant must satisfy

b² − 4ac ≤ 0 ⇒ < u, v >² ≤ < u, u >< v, v > .

Example 4.2. Let u = (u1 , u2 , . . . , un ) and v = (v1 , v2 , . . . , vn ) be vectors in

Rn . Then the inner product on Rn is

< u, v >= u · v = (u1 , u2 , . . . , un ) · (v1 , v2 , . . . , vn ) = u1 v1 + u2 v2 + · · · + un vn

which is its Euclidean inner product and Cauchy-Schwarz Inequality becomes

|u1 v1 + u2 v2 + · · · + un vn | ≤ (u1² + u2² + · · · + un²)^(1/2) (v1² + v2² + · · · + vn²)^(1/2)

which is called Cauchy inequality .

Remark 4.1. Since kuk2 =< u, u > and kvk2 =< v, v >, the Cauchy-

Schwarz inequality, < u, v >2 ≤< u, u >< v, v >, can be written as

< u, v >2 ≤ kuk2 kvk2 or | < u, v > | ≤ kuk kvk.

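A brief numeric check of the Cauchy-Schwarz inequality with the Euclidean inner product (NumPy assumed):

    import numpy as np

    rng = np.random.default_rng(0)
    for _ in range(1000):
        u, v = rng.normal(size=5), rng.normal(size=5)
        # |<u, v>| <= ||u|| ||v|| must hold for every pair
        assert abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12
    print("Cauchy-Schwarz verified on 1000 random pairs")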
Theorem 4.3. If V is an inner product space, then the norm kuk = < u, u >^(1/2)
and the distance d(u, v) = ku − vk satisfy all the properties listed in the table:

Basic Properties of Length                 Basic Properties of Distance

L1. kuk ≥ 0                                D1. d(u, v) ≥ 0
L2. kuk = 0 if and only if u = 0           D2. d(u, v) = 0 if and only if u = v
L3. kαuk = |α| kuk                         D3. d(u, v) = d(v, u)
L4. ku + vk ≤ kuk + kvk                    D4. d(u, v) ≤ d(u, w) + d(w, v)
    (triangle inequality)                      (triangle inequality)

Proof. We will prove L4. The others are left to the reader as exercises.

ku + vk² = < u + v, u + v >
         = < u, u > + 2 < u, v > + < v, v >
         ≤ < u, u > + 2| < u, v > | + < v, v >
         ≤ < u, u > + 2kuk kvk + < v, v >
         = kuk² + 2kuk kvk + kvk²
         = (kuk + kvk)².

Taking square roots,

ku + vk ≤ kuk + kvk.

Remark 4.2. From Remark 4.1,

| < u, v > | ≤ kuk kvk  ⇒  −1 ≤ < u, v > / (kuk kvk) ≤ 1

for any two nonzero vectors u and v in an inner product space.

Definition 4.3. The angle between u and v is denoted by θ and is defined by

cos θ = < u, v > / (kuk kvk),   0 ≤ θ ≤ π.

Definition 4.4. In an inner product space, two vectors u and v are said to

be orthogonal if < u, v >= 0. If u is orthogonal to each vector in a set W ,

we say that u is orthogonal to W .

Theorem 4.4 (Generalized Theorem of Pythagoras). If u and v are

orthogonal vectors in an inner product space, then

ku + vk2 = kuk2 + kvk2 .

Proof.

ku + vk² = < u + v, u + v > = kuk² + 2 < u, v > + kvk² = kuk² + kvk²

since < u, v > = 0.

♣Exercises 4.2

1. In each part use the given inner product on R2 to find kwk, where w =

(−1, 3).

(a) the Euclidean inner product;

(b) the weighted Euclidean product < u, v >= 3u1 v1 + 2u2 v2 , where u =

(u1 , u2 ) and v = (v1 , v2 );

(c) the inner product generated by the matrix

        [  1  2 ]
    A = [ −1  3 ].
2. In each part determine whether the given vectors are orthogonal with re-

spect to the Euclidean inner product:

(a) u = (−1, 2, 4), v = (2, 3, −1);

(b) u = (1, 1, 1), v = (−1, −1, −1);

(c) u = (a, b, c), v = (0, 0, 0);

(d) u = (−2, 3, −5, 1), v = (2, 1, −2, −9);

(e) u = (0, −1, 2, 5), v = (1, −2, 3, 0);

(f) u = (a, b), v = (−b, a).

3. In each part verify that the Cauchy-Schwarz inequality holds for the given

vectors using the Euclidean inner product:

(a) u = (2, 1), v = (1, −3);

(b) u = (3, −1, 2), v = (0, 1, −3);

(c) u = (1, 2, −4), v = (−2, −4, 8);

(d) u = (1, 1, −1, −1), v = (1, 2, −2, 0);

4. Let V be an inner product space. Show that if u and v are orthogonal

vectors in V such that kuk = kvk = 1, then ku − vk = √2.

5. Let V be an inner product space. Show that

ku + vk2 + ku − vk2 = 2kuk2 + 2kvk2

for vectors in V .

6. Let V be an inner product space. Show that

< u, v > = (1/4)ku + vk² − (1/4)ku − vk²

for vectors in V .

§4.3 Orthonormal Bases; Gram-Schmidt Process

Definition 4.5. A set of vectors W in an inner product space is called an

orthogonal set if each pair of distinct vectors in W is orthogonal. In addition,

if W is a basis, W is called an orthogonal basis. An orthogonal set in which

each vector has norm 1 is called an orthonormal set. Further, if it is a basis,

it is called an orthonormal basis.

Remark 4.3. If v is a nonzero vector in an inner product space, then v/kvk
has norm 1; replacing v by v/kvk is called normalizing v .

Theorem 4.5. If S = {v1 , v2 , . . . , vn } is an orthonormal basis for an inner

product space V , and u ∈ V is any vector, then

u =< u, v1 > v1 + < u, v2 > v2 + · · · + < u, vn > vn .

Proof. Since S = {v1 , v2 , . . . , vn } is a basis,

u = α1 v1 + α2 v2 + · · · + αn vn

for some scalars α1 , α2 , . . . , αn . For each i = 1, 2, . . . , n,

< u, vi > = < α1 v1 + α2 v2 + · · · + αn vn , vi >
          = α1 < v1 , vi > + α2 < v2 , vi > + · · · + αn < vn , vi >
          = αi < vi , vi > = αi

since

< vi , vj > = kvi k² = 1 if i = j, and < vi , vj > = 0 if i ≠ j.

Theorem 4.6. If S = {v1 , v2 , . . . , vn } is an orthogonal set of nonzero vectors

in an inner product space V , then S is linearly independent.

Proof. Assume that

α1 v1 + α2 v2 + · · · + αn vn = 0

for some scalars α1 , α2 , . . . , αn . Then, for each i = 1, 2, . . . , n,

0 = < 0, vi >=< α1 v1 + α2 v2 + · · · + αn vn , vi >

= αi < vi , vi > .

Since each vector in S is nonzero, < vi , vi > ≠ 0; so αi < vi , vi > = 0 implies
αi = 0 for each i.

Theorem 4.7. Let V be an inner product space and {v1 , v2 , . . . , vr } be an

orthonormal set of vectors in V . If W is the space spanned by v1 , v2 , . . . , vr ,

then every vector u in V can be expressed in the form

u = w1 + w2

where w1 is in W and w2 is orthogonal to W by letting

w1 = < u, v1 > v1 + < u, v2 > v2 + · · · + < u, vr > vr ,

w2 = u− < u, v1 > v1 − < u, v2 > v2 − · · · − < u, vr > vr .

The vector w1 is called the orthogonal projection of u on W and is

denoted by

projW u =< u, v1 > v1 + < u, v2 > v2 + · · · + < u, vr > vr .

The vector w2 = u − projW u is called the component of u orthogonal to

W.

Proof. Left to the reader as an exercise.

Theorem 4.8. Every nonzero finite-dimensional inner product space has an

orthonormal basis.

Proof. Let V be a nonzero n-dimensional inner product space, and let S =

{u1 , u2 , . . . , un } be a basis for V . We will construct an orthonormal basis

from S by the following step-by-step construction which is called the Gram-

Schmidt process:
u1
v1 = ,
ku1 k
u2 − < u2 , v1 > v1
v2 = ,
ku2 − < u2 , v1 > v1 k
u3 − < u3 , v1 > v1 − < u3 , v2 > v2
v3 = ,
ku3 − < u3 , v1 > v1 − < u3 , v2 > v2 k
u4 − < u4 , v1 > v1 − < u4 , v2 > v 2 − < u4 , v 3 > v 3
v4 = ,
ku4 − < u4 , v1 > v1 − < u4 , v2 > v 2 − < u4 , v 3 > v 3 k
..
.
un − < un , v1 > v1 − < un , v2 > v2 − · · · − < un , vn−1 > vn−1
vn = .
kun − < un , v1 > v1 − < un , v2 > v2 − · · · − < un , vn−1 > vn−1 k
Then {v1 , v2 , . . . , vn } is an orthonormal basis for V .

Example 4.3. In the vector space R3 with the Euclidean inner product,

construct an orthonormal basis from the basis {(1, 1, 1), (0, 1, 1), (0, 0, 1)}.

Solution. Let

u1 = (1, 1, 1), u2 = (0, 1, 1), u3 = (0, 0, 1).

First,

v1 = u1 /ku1 k = (1, 1, 1)/√3 = (1/√3, 1/√3, 1/√3).

Since < u2 , v1 > = 2/√3,

u2 − < u2 , v1 > v1 = (0, 1, 1) − (2/3)(1, 1, 1) = (−2/3, 1/3, 1/3),

so

v2 = (−2/3, 1/3, 1/3)/k(−2/3, 1/3, 1/3)k = (3/√6)(−2/3, 1/3, 1/3)
   = (−2/√6, 1/√6, 1/√6).

Similarly, < u3 , v1 > = 1/√3 and < u3 , v2 > = 1/√6, so

u3 − < u3 , v1 > v1 − < u3 , v2 > v2 = (0, 0, 1) − (1/3)(1, 1, 1) − (1/6)(−2, 1, 1)
                                    = (0, −1/2, 1/2)

and

v3 = (0, −1/2, 1/2)/k(0, −1/2, 1/2)k = √2 (0, −1/2, 1/2) = (0, −1/√2, 1/√2).

Then

(1/√3, 1/√3, 1/√3), (−2/√6, 1/√6, 1/√6), (0, −1/√2, 1/√2)

form an orthonormal basis for R3 .

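The Gram-Schmidt process of Theorem 4.8 is easy to code. The sketch below (NumPy assumed; gram_schmidt is a hypothetical helper) orthonormalizes the basis of Example 4.3 and reproduces v1 , v2 , v3 . It subtracts projections in the numerically stabler "modified" order, which agrees with Theorem 4.8 in exact arithmetic.

    import numpy as np

    def gram_schmidt(basis):
        """Turn a list of linearly independent vectors into an orthonormal list."""
        ortho = []
        for u in basis:
            w = np.array(u, dtype=float)
            for v in ortho:
                w = w - np.dot(w, v) * v   # subtract the projection onto v
            ortho.append(w / np.linalg.norm(w))
        return ortho

    vs = gram_schmidt([(1, 1, 1), (0, 1, 1), (0, 0, 1)])
    for v in vs:
        print(np.round(v, 4))
    V = np.array(vs)
    print(np.allclose(V @ V.T, np.eye(3)))   # rows are orthonormal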
Theorem 4.9 (Projection Theorem). If W is a finite-dimensional sub-

space of an inner product space V , then every vector u ∈ V can be expressed

in exactly one way as

u = w1 + w2

where w1 ∈ W and w2 is orthogonal to W .

Theorem 4.10 (Best Approximation Theorem). If W is a finite-dimensional

subspace of an inner product space V , and if u ∈ V , then projW u is the best

approximation to u from W in the sense that

ku − projW uk < ku − wk

for every vector w ∈ W different from projW u.

♣Exercises 4.3

1. Let R3 have the Euclidean inner product. Which of the following form

orthonormal sets?
(a) (1/√2, 0, 1/√2), (1/√3, 1/√3, −1/√3), (−1/√2, 0, 1/√2),

(b) (2/3, −2/3, 1/3), (2/3, 1/3, −2/3), (1/3, 2/3, 2/3),

(c) (1, 0, 0), (0, 1/√2, 1/√2), (0, 0, 1),

(d) (1/√6, 1/√6, −2/√6), (1/√2, −1/√2, 0).

2. Let R3 have the Euclidean inner product. Use the Gram-Schmidt process

to transform the basis {u1 , u2 , u3 } into an orthonormal basis:

(a) u1 = (1, 1, 1), u2 = (−1, 1, 0), u3 = (1, 2, 1);

(b) u1 = (1, 0, 0), u2 = (3, 7, −2), u3 = (0, 4, 1).

3. Let {v1 , v2 , . . . , vn } be an orthonormal basis for an inner product space V .

Show that if w is a vector in V , then

kwk2 =< w, v1 >2 + < w, v2 >2 + · · · + < w, vn >2 .

§4.4 Coordinates; Change of Basis

There is a close relationship between the notion of a basis and the notion

of a coordinate system. In this section we develop this idea and also discuss

results about changing bases for vector spaces.

Theorem 4.11. If S = {v1 , v2 , . . . , vn } is a basis for a vector space V , then

every vector v ∈ V is uniquely expressed in the form

v = α1 v1 + α2 v2 + · · · + αn vn

for some scalars α1 , α2 , . . . , αn . The scalars α1 , α2 , . . . , αn are called the coor-

dinates of v relative to the basis S. The coordinate vector of v relative

to S is denoted by (v)S and is defined by

(v)S = (α1 , α2 , . . . , αn ).

The coordinate matrix of v relative to S is denoted by [v]S and is defined

by

           [ α1 ]
    [v]S = [ α2 ].
           [ .. ]
           [ αn ]

Proof. Suppose that

v = α1 v1 + α2 v2 + · · · + αn vn ,

v = β1 v1 + β2 v2 + · · · + βn vn

for some scalars α1 , α2 , . . . , αn , β1 , β2 , . . . , βn . Then

(α1 − β1 )v1 + (α2 − β2 )v2 + · · · + (αn − βn )vn = 0

Since S is linearly independent, αi − βi = 0 for each i, that is, αi = βi , ∀i.

Theorem 4.12. If S is an orthonormal basis for an n-dimensional inner prod-

uct space V and if

(u)S = (α1 , α2 , . . . , αn ) and (v)S = (β1 , β2 , . . . , βn ),

then

(a) kuk = √(α1² + α2² + · · · + αn²),

(b) d(u, v) = √((α1 − β1 )² + (α2 − β2 )² + · · · + (αn − βn )²),

(c) < u, v > = α1 β1 + α2 β2 + · · · + αn βn .

Proof. Let S = {v1 , v2 , . . . , vn }. Then

u = α1 v1 + α2 v2 + · · · + αn vn ,

v = β1 v1 + β2 v2 + · · · + βn vn .

(a)

kuk = < α1 v1 + α2 v2 + · · · + αn vn , α1 v1 + α2 v2 + · · · + αn vn >^(1/2)
    = ( Σ1≤i,j≤n αi αj < vi , vj > )^(1/2)
    = √(α1² + α2² + · · · + αn²)

since < vi , vj > = 1 if i = j and < vi , vj > = 0 if i ≠ j.

(b) and (c) are left to the reader.

♣Change of Basis Problem. If we change the basis for a vector space V

from an old basis B = {u1 , u2 , . . . , un } to a new basis B 0 = {u01 , u02 , . . . , u0n },

how is the old coordinate matrix [v]B of a vector v related to the new coordi-

nate matrix [v]B 0 ?

Solution. Let

           [ α1 ]              [ β1 ]
    [v]B = [ α2 ],   [v]B′ =   [ β2 ],
           [ .. ]              [ .. ]
           [ αn ]              [ βn ]
that is,

v = α1 u1 + α2 u2 + · · · + αn un ,
v = β1 u′1 + β2 u′2 + · · · + βn u′n .

Since B is a basis for V and B′ ⊆ V , we may write

u′1 = γ11 u1 + γ12 u2 + · · · + γ1n un ,
u′2 = γ21 u1 + γ22 u2 + · · · + γ2n un ,
    ..
u′n = γn1 u1 + γn2 u2 + · · · + γnn un .

Then v = β1 u′1 + β2 u′2 + · · · + βn u′n becomes

v = β1 (γ11 u1 + γ12 u2 + · · · + γ1n un )
  + β2 (γ21 u1 + γ22 u2 + · · · + γ2n un )
    ..
  + βn (γn1 u1 + γn2 u2 + · · · + γnn un )

  = (β1 γ11 + β2 γ21 + · · · + βn γn1 )u1
  + (β1 γ12 + β2 γ22 + · · · + βn γn2 )u2
    ..
  + (β1 γ1n + β2 γ2n + · · · + βn γnn )un .

Thus

    [ α1 ]   [ γ11  γ21  · · ·  γn1 ] [ β1 ]
    [ α2 ] = [ γ12  γ22  · · ·  γn2 ] [ β2 ].
    [ .. ]   [  ..    ..          .. ] [ .. ]
    [ αn ]   [ γ1n  γ2n  · · ·  γnn ] [ βn ]

Set

        [ γ11  γ21  · · ·  γn1 ]
    P = [ γ12  γ22  · · ·  γn2 ].
        [  ..    ..          .. ]
        [ γ1n  γ2n  · · ·  γnn ]
Then the j-th column of P equals the coordinate matrix of u′j relative to B,
[u′j ]B , for each j = 1, 2, . . . , n, and P is denoted by

P = [[u′1 ]B , [u′2 ]B , . . . , [u′n ]B ]

and then

[v]B = P [v]B′ = [[u′1 ]B , [u′2 ]B , . . . , [u′n ]B ][v]B′ .

The matrix P is called the transition matrix from B′ to B.

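Concretely, the transition matrix can be computed by solving for the coordinates of each new basis vector with respect to the old basis. A sketch (NumPy assumed) for a pair of bases of R2 , written as columns in standard coordinates:

    import numpy as np

    B      = np.column_stack([(1.0, 0.0), (0.0, 1.0)])   # old basis
    Bprime = np.column_stack([(2.0, 1.0), (-3.0, 4.0)])  # new basis

    # j-th column of P is [u'_j]_B, i.e. the solution of B x = u'_j
    P = np.linalg.solve(B, Bprime)     # transition matrix from B' to B

    v_Bprime = np.array([1.0, 1.0])    # coordinates of some v relative to B'
    v_B = P @ v_Bprime                 # coordinates of the same v relative to B
    print(P)
    print(v_B)                         # (-1, 5): indeed 1*(2,1) + 1*(-3,4)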
Theorem 4.13. If P is the transition matrix from a basis B′ to a basis B,
then

(a) P is invertible,

(b) P −1 is the transition matrix from the basis B to the basis B′ .

Theorem 4.14. If P is the transition matrix from an orthonormal basis to

another orthonormal basis for an inner product space, then

P −1 = P t .

Definition 4.6. A square matrix A such that

A−1 = At

is called an orthogonal matrix .

Theorem 4.15. If A is an n × n matrix, then the following are equivalent.

(a) A is orthogonal.

(b) The row vectors of A form an orthonormal set in Rn with the Euclidean

inner product.

(c) The column vectors of A form an orthonormal set in Rn with the Eu-

clidean inner product.

Proof. Omitted!

♣Exercises 4.4

1. Find the coordinate matrix and coordinate vector for w relative to the

basis S = {u1 , u2 }.

(a) u1 = (1, 0), u2 = (0, 1); w = (3, 7),

(b) u1 = (2, −4), u2 = (3, 8); w = (1, 1),

(c) u1 = (1, 1), u2 = (0, 2); w = (a, b).

2. Consider the bases B = {u1 , u2 } and B′ = {v1 , v2 }, where

         [ 1 ]        [ 0 ]        [ 2 ]        [ −3 ]
    u1 = [ 0 ],  u2 = [ 1 ],  v1 = [ 1 ],  v2 = [  4 ].

(a) Find the transition matrix from B′ to B,

(b) Find the transition matrix from B to B′ ,

(c) Compute the coordinate matrix [w]B , where

        [  3 ]
    w = [ −5 ].

Chapter Five
Linear Transformations

§5.1 Introduction to Linear Transformations

Definition 5.1. A function from a vector space V to a vector space W ,

T : V → W , is called a linear transformation if for all vectors u, v ∈ V

and scalar α,

(i) T (u + v) = T (u) + T (v) and

(ii) T (αu) = αT (u).

Example 5.1. Let A be an m × n matrix. Then the function

T : Rn → Rm defined by T (x) = Ax

is a linear transformation which is called a matrix transformation or mul-

tiplication by A.

Example 5.2. A function from a vector space V to a vector space W , T :

V → W , defined by T (v) = 0, is a linear transformation which is called a zero

transformation.

Example 5.3. The function from a vector space V to itself, T : V → V ,

defined by T (v) = v, is a linear transformation which is called the identity

transformation on V .

A linear transformation from a vector space V to itself is called a linear

operator on V .

Example 5.4. Let V be a vector space and let α be a fixed scalar. Then the

function T : V → V defined by T (v) = αv is a linear operator. If α > 1, then

T is called a dilation of V , and if 0 < α < 1, then T is called a contraction

of V .

Example 5.5. Let V be an inner product space and let W be a finite-

dimensional subspace of V having

S = {w1 , w2 , . . . , wr }

as an orthonormal basis. Then the function T : V → W defined by

T (v) =< v, w1 > w1 + < v, w2 > w2 + · · · + < v, wr > wr

is a linear transformation which is called the orthogonal projection of V

onto W .

♣Exercises 5.1

♣. A formula is given for a function F : R2 → R2 . Determine whether F is

linear:
1. F (x, y) = (2x, y), 2. F (x, y) = (x2 , y),

3. F (x, y) = (y, x), 4. F (x, y) = (0, y),

5. F (x, y) = (x, y + 1), 6. F (x, y) = (2x + y, x − y).

2. Let T : R3 → R2 be a matrix transformation, and suppose

      [ 1 ]   [ 1 ]      [ 0 ]   [ 3 ]      [ 0 ]   [  4 ]
    T [ 0 ] = [ 1 ],   T [ 1 ] = [ 0 ],   T [ 0 ] = [ −7 ].
      [ 0 ]              [ 0 ]              [ 1 ]

(a) Find the matrix of T .

            [ 1 ]
(b) Find T  [ 3 ].
            [ 8 ]

            [ x ]
(c) Find T  [ y ].
            [ z ]

§5.2 Properties of Linear Transformations; Kernel and
Range

In this section we develop some basic properties of linear transformations.

We assume that V and W are vector spaces.

Theorem 5.1. If T : V → W is a linear transformation, then

(a) T (0) = 0.

(b) T (−v) = −T (v) for all v ∈ V .

(c) T (v − u) = T (v) − T (u) for all v, u ∈ V .

Proof. (a) Since 0 = 0v, T (0) = T (0v) = 0T (v) = 0.

(b) T (−v) = T ((−1)v) = (−1)T (v) = −T (v).

(c) Since v − u = v + (−1)u,

T (v − u) = T (v + (−1)u)

= T (v) + (−1)T (u)

= T (v) − T (u).

Definition 5.2. If T : V → W is a linear transformation, then

(a) ker(T ) = {v ∈ V |T (v) = 0} is called the kernel (or nullspace) of T ,

(b) R(T ) = {T (v) ∈ W |v ∈ V } is called the range of T .

Theorem 5.2. If T : V → W is a linear transformation, then

(a) ker(T ) is a subspace of V ,

(b) R(T ) is a subspace of W .

Proof. (a) Let v1 , v2 ∈ ker(T ) and α be any scalar. Then

T (v1 + v2 ) = T (v1 ) + T (v2 ) = 0 + 0 = 0,

T (αv1 ) = αT (v1 ) = α0 = 0;

so

v1 + v2 , αv1 ∈ ker(T ).

Thus ker(T ) is a subspace.

(b) Let w1 , w2 ∈ R(T ) and α be any scalar. Then there exist v1 , v2 ∈ V

such that

T (v1 ) = w1 and T (v2 ) = w2 .

Then
w1 + w2 = T (v1 ) + T (v2 ) = T (v1 + v2 ) ∈ R(T ),

αw1 = αT (v1 ) = T (αv1 ) ∈ R(T );


so

w1 + w2 , αw1 ∈ R(T ).

Thus R(T ) is a subspace.

Definition 5.3. If T : V → W is a linear transformation, then the dimen-

sion of the range R(T ) is called the rank of T and the dimension of the kernel

ker(T ) is called the nullity of T .

Theorem 5.3 (Dimension Theorem). If T : V → W is a linear transfor-

mation and V has dimension n, then

(rank of T ) + (nullity of T ) = n.

Proof. Omitted.

Theorem 5.4. If A is an m × n matrix, then the dimension of the solution

space of Ax = 0 is

n − rank(A).

Proof. Let T : Rn → Rm be multiplication by A, that is, T (x) = Ax for all

x ∈ Rn . By Theorem 5.3,

(rank of T ) + (nullity of T ) = n;

so

(nullity of T ) = n − (rank of T )

But ker(T ) = {x ∈ Rn |T (x) = 0, i.e., Ax = 0} is the set of all solutions of

Ax = 0; thus, the solution space of Ax = 0. Therefore,

(nullity of T ) = dimension of ker(T )

= dimension of the solution space of Ax = 0.


Since
R(T ) = {T (x)|x ∈ Rn } = {Ax|x ∈ Rn }

= {b ∈ Rm |b = Ax, x ∈ Rn }

= the set of all b such that Ax = b is consistent,

by Theorem 3.16, R(T ) is the column space of A. Therefore,

(rank of T ) = dimension of R(T )

= dimension of the column space of A

= rank(A).

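Theorems 5.3 and 5.4 can be checked numerically for a matrix transformation. A sketch (NumPy assumed):

    import numpy as np

    A = np.array([[1.0, -1.0,  3.0],
                  [5.0,  6.0, -4.0],
                  [7.0,  4.0,  2.0]])

    rank = np.linalg.matrix_rank(A)      # dim of the range of T (column space)
    nullity = A.shape[1] - rank          # dim of ker(T), by Theorem 5.4

    print(rank, nullity)                 # their sum is n = 3
    print(rank + nullity == A.shape[1])  # True: the Dimension Theorem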
♣Exercises 5.2

1. Let T : R2 → R2 be multiplication by

    [  2  −1 ]
    [ −8   4 ].

(1) Which of the following are in R(T )?

        [  1 ]       [ 5 ]       [ −3 ]
    (a) [ −4 ],  (b) [ 0 ],  (c) [ 12 ].

(2) Which of the following are in ker(T )?

        [  5 ]       [ 3 ]       [ 1 ]
    (a) [ 10 ],  (b) [ 2 ],  (c) [ 1 ].
10 2 1

2. In each part let T be multiplication by the given matrix:

        [ 1  −1   3 ]        [ 2  0  −1 ]
    (a) [ 5   6  −4 ],   (b) [ 4  0  −2 ].
        [ 7   4   2 ]        [ 0  0   0 ]

Find

(a) a basis for the range of T ;

(b) a basis for the kernel of T ;

(c) the rank and nullity of T .

§5.3 Linear Transformations from Rn to Rm

Theorem 5.5. If T : Rn → Rm is a linear transformation and if {e1 , e2 , . . . , en }
is the standard basis for Rn , then T is multiplication by A, where A is the
matrix whose j-th column is T (ej ) for each j = 1, 2, . . . , n, that is, T (x) = Ax
for each x ∈ Rn . The matrix A is called the standard matrix for T .

Proof. Let

              [ a11 ]             [ a12 ]                        [ a1n ]
    T (e1 ) = [ a21 ],  T (e2 ) = [ a22 ],   . . . ,   T (en ) = [ a2n ]
              [  ..  ]             [  ..  ]                        [  ..  ]
              [ am1 ]             [ am2 ]                        [ amn ]

and set

        [ a11  a12  · · ·  a1n ]
    A = [ a21  a22  · · ·  a2n ].
        [  ..    ..          .. ]
        [ am1  am2  · · ·  amn ]

       [ x1 ]
If x = [ x2 ], then x = x1 e1 + x2 e2 + · · · + xn en ; so
       [ .. ]
       [ xn ]

T (x) = x1 T (e1 ) + x2 T (e2 ) + · · · + xn T (en )

        [ a11 x1 ]   [ a12 x2 ]           [ a1n xn ]
      = [ a21 x1 ] + [ a22 x2 ] + · · · + [ a2n xn ]
        [   ..   ]   [   ..   ]           [   ..   ]
        [ am1 x1 ]   [ am2 x2 ]           [ amn xn ]

        [ a11 x1 + a12 x2 + · · · + a1n xn ]
      = [ a21 x1 + a22 x2 + · · · + a2n xn ] = Ax.
        [                ..                ]
        [ am1 x1 + am2 x2 + · · · + amn xn ]

Example 5.6. Find the standard matrix for the transformation T : R3 → R4
defined by

      [ x1 ]   [ x1 + x2 ]
    T [ x2 ] = [ x1 − x2 ].
      [ x3 ]   [    x3   ]
               [    x1   ]

Solution. Since

T (e1 ) = T (1, 0, 0) = (1 + 0, 1 − 0, 0, 1) = (1, 1, 0, 1),
T (e2 ) = T (0, 1, 0) = (0 + 1, 0 − 1, 0, 0) = (1, −1, 0, 0),
T (e3 ) = T (0, 0, 1) = (0 + 0, 0 − 0, 1, 0) = (0, 0, 1, 0),

written as column vectors, the standard matrix for T is

        [ 1   1  0 ]
    A = [ 1  −1  0 ].
        [ 0   0  1 ]
        [ 1   0  0 ]

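The recipe of Theorem 5.5 (apply T to each standard basis vector and use the images as columns) is mechanical. A sketch (NumPy assumed) for the T of Example 5.6:

    import numpy as np

    def T(x):
        x1, x2, x3 = x
        return np.array([x1 + x2, x1 - x2, x3, x1])

    # columns of the standard matrix are T(e1), T(e2), T(e3)
    A = np.column_stack([T(e) for e in np.eye(3)])
    print(A)

    x = np.array([2.0, -1.0, 5.0])
    print(np.allclose(A @ x, T(x)))   # multiplication by A agrees with T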
♣Exercises 5.3

1. Find the standard matrix of each of the following linear transformations:

(a) T (x1 , x2 ) = (2x1 − x2 , x1 + x2 ),

(b) T (x1 , x2 ) = (x1 , x2 ),

(c) T (x1 , x2 , x3 ) = (x1 + 2x2 + x3 , x1 + 5x2 , x3 ),

(d) T (x1 , x2 , x3 ) = (4x1 , 7x2 , −8x3 ),

(e) T (x1 , x2 , x3 ) = (0, 0, 0, 0, 0),

(f ) T (x1 , x2 , x3 , x4 ) = (x4 , x1 , x3 , x2 , x1 − x3 ).

§5.4 Matrices of Linear Transformations

In this section we show that if V and W are finite-dimensional vector spaces

(not necessarily Rn and Rm ), then any linear transformation T : V → W can

be regarded as a matrix transformation as follows:

Suppose that V is an n-dimensional vector space and W an m-dimensional

vector space. Let B and B 0 be bases for V and W , respectively, and for

each x ∈ V , let [x]B be the coordinate matrix of x with respect to B. Then

[x]B ∈ Rn and the coordinate matrix [T (x)]B 0 is in Rm . Thus the linear

transformation T which maps x to T (x) defines a linear transformation T 0

from Rn to Rm by sending [x]B to [T (x)]B 0 . Using the standard matrix A for

T 0 , we have

T 0 ([x]B ) = A[x]B = [T (x)]B 0 .

To find the matrix A, let B = {v1 , v2 , . . . , vn }. Then

A[v1 ]B = [T (v1 )]B 0 , A[v2 ]B = [T (v2 )]B 0 , . . . , A[vn ]B = [T (vn )]B 0 .

But

              [ 1 ]              [ 0 ]                         [ 0 ]
    [v1 ]B =  [ 0 ],   [v2 ]B =  [ 1 ],   . . . ,   [vn ]B =   [ 0 ].
              [ .. ]             [ .. ]                        [ .. ]
              [ 0 ]              [ 0 ]                         [ 1 ]

Thus {[v1 ]B , [v2 ]B , . . . , [vn ]B } is the standard basis for Rn , and T ′ ([vj ]B ) =
A[vj ]B = [T (vj )]B′ for each j = 1, 2, . . . , n. Thus

A = [[T (v1 )]B′ , [T (v2 )]B′ , . . . , [T (vn )]B′ ]

which is called the matrix for T with respect to the bases B and B′ . A
is commonly denoted by

[T ]B,B′ .

If V = W and B = B′ , then [T ]B,B is simply denoted by [T ]B and is called the
matrix for T with respect to the basis B.

Example 5.7. Let T : R2 → R2 be the linear operator defined by

      [ x1 ]   [   x1 + x2  ]
    T [ x2 ] = [ −2x1 + 4x2 ].

Find the matrix for T with respect to the basis B = {u1 , u2 } where

         [ 1 ]        [ 1 ]
    u1 = [ 1 ],  u2 = [ 2 ].

Solution. Since

    T (u1 ) = T (1, 1) = (1 + 1, −2 + 4) = (2, 2) = 2u1 + 0u2 ,
    T (u2 ) = T (1, 2) = (1 + 2, −2 + 8) = (3, 6) = 0u1 + 3u2 ,

we have

    [T ]B = [[T (u1 )]B , [T (u2 )]B ] = [ 2  0 ]
                                         [ 0  3 ].

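The computation in Example 5.7 (express each T(uj) in coordinates relative to B) amounts to a pair of linear solves. A sketch (NumPy assumed):

    import numpy as np

    def T(x):
        x1, x2 = x
        return np.array([x1 + x2, -2*x1 + 4*x2])

    B = np.column_stack([(1.0, 1.0), (1.0, 2.0)])   # basis vectors u1, u2

    # j-th column of [T]_B solves B c = T(u_j), i.e. c = [T(u_j)]_B
    T_B = np.column_stack([np.linalg.solve(B, T(B[:, j])) for j in range(2)])
    print(T_B)      # [[2, 0], [0, 3]], as in Example 5.7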
We end this chapter by defining two more terms.

Definition 5.4. A square matrix A = [aij ] is called a diagonal matrix if
aij = 0 whenever i ≠ j.

If A and B are square matrices, then B is said to be similar to A if there

exists an invertible matrix P such that B = P −1 AP .

♣Exercises 5.4

1. Let T : R2 → R3 be defined by

      [ x1 ]   [ x1 + 2x2 ]
    T [ x2 ] = [   −x1    ].
               [     0    ]
(a) Find the matrix of T with respect to the bases B = {u1 , u2 } and
B′ = {v1 , v2 , v3 }, where

         [ 1 ]        [ −2 ]
    u1 = [ 3 ],  u2 = [  4 ],

         [ 1 ]        [ 2 ]        [ 3 ]
    v1 = [ 1 ],  v2 = [ 2 ],  v3 = [ 0 ].
         [ 1 ]        [ 0 ]        [ 0 ]

(b) Compute the value

      [ 8 ]
    T [ 3 ].

           [ 1 ]            [ −1 ]
2. Let v1 = [ 3 ] and v2 =  [  4 ], and let

        [  1  3 ]
    A = [ −2  5 ]

be the matrix for T : R2 → R2 with respect to the basis B = {v1 , v2 }.

(a) Find [T (v1 )]B and [T (v2 )]B .

(b) Find T (v1 ) and T (v2 ).

                          [ x1 ]
(c) Find a formula for T  [ x2 ].

                                                  [ 1 ]
(d) Use the formula obtained in (c) to compute T  [ 1 ].

3. Let

        [  3  −2  1  0 ]
    A = [  1   6  2  1 ]
        [ −3   0  7  1 ]

be the matrix for T : R4 → R3 with respect to the bases

B = {v1 , v2 , v3 , v4 }, B 0 = {w1 , w2 , w3 }

where

         [ 0 ]        [  2 ]        [  1 ]        [ 6 ]
    v1 = [ 1 ],  v2 = [  1 ],  v3 = [  4 ],  v4 = [ 9 ]
         [ 1 ]        [ −1 ]        [ −1 ]        [ 4 ]
         [ 1 ]        [ −1 ]        [  2 ]        [ 2 ]

         [ 0 ]        [ −7 ]        [ −6 ]
    w1 = [ 8 ],  w2 = [  8 ],  w3 = [  9 ].
         [ 8 ]        [  1 ]        [  1 ]

(a) Find [T (v1 )]B′ , [T (v2 )]B′ , [T (v3 )]B′ , [T (v4 )]B′ .

(b) Find T (v1 ), T (v2 ), T (v3 ), T (v4 ).

(c) Find a formula for

      [ x1 ]
    T [ x2 ].
      [ x3 ]
      [ x4 ]

(d) Compute

      [ 2 ]
    T [ 2 ].
      [ 0 ]
      [ 0 ]

Chapter Six
Eigenvalues and Eigenvectors

§6.1 Eigenvalues and Eigenvectors

Definition 6.1. If A is an n × n matrix, then a nonzero vector x ∈ Rn is

called an eigenvector of A if there exists a scalar λ such that

Ax = λx.

The scalar λ is called an eigenvalue (or proper value or characteristic

value) of A and x is said to be an eigenvector corresponding to λ.

Remark 6.1. To find an eigenvalue of an n × n matrix A, we rewrite

Ax = λx as

Ax = λIx or (λI − A)x = 0.

For λ to be an eigenvalue, (λI − A)x = 0 must have a nonzero solution for x;
by Theorem 3.15, it has a nonzero solution if and only if

det(λI − A) = 0.

This equation is called the characteristic equation of A. The polynomial

det(λI −A) in λ is called the characteristic polynomial of A. Summarizing,

we have the following theorem.

Theorem 6.1. If A is an n × n matrix, then the following are equivalent:

(a) λ is an eigenvalue of A.

(b) The system of equation (λI − A)x = 0 has nontrivial solutions.

(c) There is a nonzero vector x such that Ax = λx.

(d) λ is a real solution of the characteristic equation det(λI − A) = 0.

The eigenvectors of A corresponding to an eigenvalue λ are the nonzero

vectors x such that Ax = λx, that is, are the nonzero vectors in the solution

space of (λI − A)x = 0 which is called the eigenspace of A corresponding to

λ.

Example 6.1. Find the eigenvalues of the matrix

        [  3  2 ]
    A = [ −1  0 ]

and the eigenvectors corresponding to the eigenvalues.

Solution. The characteristic equation of A is

                      | λ − 3  −2 |
0 = det(λI − A) =     |   1     λ |

  = (λ − 3)λ + 2 = λ² − 3λ + 2 = (λ − 1)(λ − 2).

So λ = 1 and λ = 2 are the eigenvalues of A.

        [ x1 ]
Let x = [ x2 ] be an eigenvector corresponding to the eigenvalue λ.
Then Ax = λx. If λ = 1, then

               [  3  2 ] [ x1 ]   [ x1 ]
    Ax = λx ⇒ [ −1  0 ] [ x2 ] = [ x2 ].

So

    3x1 + 2x2 = x1 ,
    −x1 = x2 .
If we set x2 = t for any real t ≠ 0, then x1 = −t; so

        [ −t ]
    x = [  t ]

are the eigenvectors corresponding to λ = 1.

If λ = 2, then

               [  3  2 ] [ x1 ]     [ x1 ]
    Ax = λx ⇒ [ −1  0 ] [ x2 ] = 2 [ x2 ].

So

    3x1 + 2x2 = 2x1 ,
    −x1 = 2x2 .

If we set x2 = t for any real t ≠ 0, then x1 = −2t; so

        [ −2t ]
    x = [  t  ]

are the eigenvectors corresponding to λ = 2.

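Numerically this is a single call. The sketch below (NumPy assumed) recovers the eigenvalues 1 and 2 of Example 6.1 and checks Ax = λx:

    import numpy as np

    A = np.array([[ 3.0, 2.0],
                  [-1.0, 0.0]])

    eigenvalues, eigenvectors = np.linalg.eig(A)
    print(eigenvalues)                    # [2., 1.] (order may vary)

    for lam, x in zip(eigenvalues, eigenvectors.T):
        # each column x satisfies A x = lambda x (up to scaling)
        print(lam, np.allclose(A @ x, lam * x))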
♣Exercises 6.1

1. For the following matrices,

    (a) [ 3   0 ]    (b) [ 10  −9 ]    (c) [ 0  3 ]
        [ 8  −1 ],       [  4  −2 ],       [ 4  0 ],

    (d) [ −2  −7 ]   (e) [ 0  0 ]      (f ) [ 1  0 ]
        [  1   2 ],      [ 0  0 ],          [ 0  1 ],

        [  4  0  1 ]        [  3   0  −5 ]
    (g) [ −2  1  0 ],   (h) [ 1/5  −1   0 ],
        [ −2  0  1 ]        [  1   1  −2 ]

        [ −1   0   1 ]        [  5  0  1 ]
    (i) [ −1   3   0 ],   (j) [  1  1  0 ],
        [ −4  13  −1 ]        [ −7  1  0 ]

(a) Find the characteristic equations;

(b) Find the eigenvalues;

(c) Find the eigenspaces.

2. Prove that λ = 0 is an eigenvalue of a matrix A if and only if A is not

invertible.

3. Prove that the constant term in the characteristic polynomial of an n ×

n matrix A is (−1)n det(A). (Hint: The constant term is the value of the

characteristic polynomial when λ = 0.)

4. Let A be an n × n matrix.

(a) Prove that the characteristic polynomial of A has degree n.

(b) Prove that the coefficient of λn in the characteristic polynomial is 1.

5. The trace of a square matrix A, denoted by tr(A), is the sum of the ele-

ments on the main diagonal. Show that the characteristic equation of a 2 × 2

matrix A is λ2 − tr(A)λ + det(A) = 0.

6. Prove that the eigenvalues of a triangular matrix are the entries on the

main diagonal.

7. Show that if λ is an eigenvalue of A, then λ2 is an eigenvalue of A2 ; more

generally, show that λn is an eigenvalue of An for each positive integer n.

8. Find the eigenvalues of A⁹ where

        [ 1   3   7  11 ]
    A = [ 0  −1   3   8 ]
        [ 0   0  −2   4 ]
        [ 0   0   0   2 ].

§6.2 Diagonalization

♣The Diagonalization Problem. Given a linear operator T : V → V

on a finite-dimensional vector space V , does there exist a basis B for V such

that the matrix for T with respect to B, [T ]B , is diagonal?

♣Matrix Form of the Diagonalization Problem. Given a square matrix

A, does there exist an invertible matrix P such that P −1 AP is diagonal?

Definition 6.2. A square matrix A is said to be diagonalizable if there

exists an invertible matrix P such that P −1 AP is diagonal; the matrix P is

said to diagonalize A.

Theorem 6.2. If A is an n × n matrix, then the following are equivalent:

(a) A is diagonalizable.

(b) A has n linearly independent eigenvectors.

Proof. (a) ⇒ (b) Suppose A is diagonalizable. Then there exists an invertible

matrix

        [ p11  p12  · · ·  p1n ]
    P = [ p21  p22  · · ·  p2n ]
        [  ..    ..          .. ]
        [ pn1  pn2  · · ·  pnn ]

such that P −1 AP = D where

        [ λ1  0   · · ·  0  ]
    D = [ 0   λ2  · · ·  0  ].
        [ ..  ..         .. ]
        [ 0   0   · · ·  λn ]

Therefore, AP = P D, that is,

         [ p11  p12  · · ·  p1n ] [ λ1  0   · · ·  0  ]
    AP = [ p21  p22  · · ·  p2n ] [ 0   λ2  · · ·  0  ]
         [  ..    ..          .. ] [ ..  ..         .. ]
         [ pn1  pn2  · · ·  pnn ] [ 0   0   · · ·  λn ]

         [ λ1 p11  λ2 p12  · · ·  λn p1n ]
       = [ λ1 p21  λ2 p22  · · ·  λn p2n ].
         [   ..      ..             ..   ]
         [ λ1 pn1  λ2 pn2  · · ·  λn pnn ]
Let

         [ p1j ]
    pj = [ p2j ],   j = 1, 2, . . . , n,
         [  ..  ]
         [ pnj ]
denote the j-th column vector of P . Then λj pj is the j-th column vector

of P D, and Apj is the j-th column vector of AP for j = 1, 2, . . . , n. Since

AP = P D,

Apj = λj pj , j = 1, 2, . . . , n.

Since P is invertible, pj is a nonzero vector for each j = 1, 2, . . . , n. Thus

p1 , p2 , . . . , pn are eigenvectors of A. Since P is invertible, by Theorem 3.15,

p1 , p2 , . . . , pn are linearly independent.

(b) ⇒ (a) Suppose A has n linearly independent eigenvectors, p1 , p2 , . . . , pn

with corresponding eigenvalues λ1 , λ2 , . . . , λn , respectively, and let

        [ p11  p12  · · ·  p1n ]
    P = [ p21  p22  · · ·  p2n ]
        [  ..    ..          .. ]
        [ pn1  pn2  · · ·  pnn ]

be the matrix whose columns are p1 , p2 , . . . , pn . Then Apj is the j-th column

vector of AP for j = 1, 2, . . . , n. But

Apj = λj pj , j = 1, 2, . . . , n

so that
 
λ1 p11 λ2 p12 · · · λn p1n
 
 
 λ1 p21 λ2 p22 · · · λn p2n 
 
AP =  . . . 
 . . . 
 . . . 
 
λ1 pn1 λ2 pn2 · · · λn pnn
  
p11 p12 · · · p1n λ 0 ··· 0
  1 
  
 p21 p22 · · · p2n   0 λ2 · · · 0 
  
=  . . .  . .. ..  = PD
 . . .  . 
 . . .  . . . 
  
pn1 pn2 · · · pnn 0 0 ··· λn

151
where D is the diagonal matrix whose diagonal entries are the eigenvalues

λ1 , λ2 , . . . , λn . Since the column vectors of P are linearly independent, P is

invertible; so AP = P D implies P −1 AP = D which is diagonal.

Remark 6.2. The proof of Theorem 6.2 yields the following procedure for finding a matrix $P$ that diagonalizes a diagonalizable $n \times n$ matrix $A$:

Step 1. Find $n$ linearly independent eigenvectors $p_1, p_2, \ldots, p_n$.

Step 2. Form the matrix $P$ whose columns are $p_1, p_2, \ldots, p_n$.

Step 3. $P^{-1}AP$ is then diagonal with $\lambda_1, \lambda_2, \ldots, \lambda_n$ as its diagonal entries, where $\lambda_i$ is the eigenvalue corresponding to the eigenvector $p_i$ for $i = 1, 2, \ldots, n$.
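A minimal NumPy sketch of this three-step procedure follows. It assumes floating-point arithmetic, so the independence test in Step 1 becomes a tolerance-based rank check; np.linalg.eig returns the eigenvalues together with eigenvectors as the columns of a matrix.

```python
import numpy as np

def diagonalize(A, tol=1e-10):
    """Sketch of Remark 6.2: return (P, D) with P^{-1} A P = D,
    or raise ValueError if A has no n independent eigenvectors."""
    n = A.shape[0]
    eigenvalues, P = np.linalg.eig(A)     # Step 1: eigenvectors arrive as columns
    if np.linalg.matrix_rank(P, tol=tol) < n:
        raise ValueError("A is not diagonalizable")
    D = np.linalg.inv(P) @ A @ P          # Step 3: diagonal, eigenvalues on the diagonal
    return P, D                           # Step 2 was forming P itself
```

For the matrix of Example 6.2 below, diagonalize(A) returns, up to the ordering of the eigenvalues, the diagonal matrix computed there by hand.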

Example 6.2. Find a matrix $P$ which diagonalizes
$$A = \begin{pmatrix} 3 & -2 & 0 \\ -2 & 3 & 0 \\ 0 & 0 & 5 \end{pmatrix}.$$

Solution.
$$0 = \det(\lambda I - A) = \begin{vmatrix} \lambda - 3 & 2 & 0 \\ 2 & \lambda - 3 & 0 \\ 0 & 0 & \lambda - 5 \end{vmatrix} = (\lambda - 5)\big((\lambda - 3)^2 - 4\big) = (\lambda - 5)^2(\lambda - 1).$$

Hence $\lambda = 1$ and $\lambda = 5$ are the eigenvalues of $A$.

 
Let $x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$ be an eigenvector corresponding to the eigenvalue $\lambda$. Then $Ax = \lambda x$, or equivalently, $(\lambda I - A)x = 0$.

If $\lambda = 5$, then $(\lambda I - A)x = 0$ becomes
$$\begin{pmatrix} 2 & 2 & 0 \\ 2 & 2 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$

Solving this system gives
$$x_1 = -s, \quad x_2 = s, \quad x_3 = t, \qquad s, t \in \mathbb{R}.$$

Thus the eigenvectors of $A$ corresponding to $\lambda = 5$ are the nonzero vectors
$$x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -s \\ s \\ t \end{pmatrix} = \begin{pmatrix} -s \\ s \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ t \end{pmatrix} = s \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + t \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$

Note that the eigenvectors corresponding to $\lambda = 5$
$$p_1 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} \quad \text{and} \quad p_2 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$
are linearly independent; so they form a basis for the eigenspace corresponding to $\lambda = 5$.

If $\lambda = 1$, then $(\lambda I - A)x = 0$ becomes
$$\begin{pmatrix} -2 & 2 & 0 \\ 2 & -2 & 0 \\ 0 & 0 & -4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$

Solving this system gives
$$x_1 = t, \quad x_2 = t, \quad x_3 = 0, \qquad t \in \mathbb{R}.$$

Thus the eigenvectors of $A$ corresponding to $\lambda = 1$ are the nonzero vectors
$$x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} t \\ t \\ 0 \end{pmatrix} = t \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix},$$
so
$$p_3 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}$$
forms a basis for the eigenspace corresponding to $\lambda = 1$. We see that $\{p_1, p_2, p_3\}$ is linearly independent. Thus the matrix
$$P = \begin{pmatrix} -1 & 0 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$
diagonalizes $A$, and
$$P^{-1}AP = \begin{pmatrix} 5 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
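A quick numerical check of this hand computation (a NumPy sketch):

```python
import numpy as np

A = np.array([[ 3.0, -2.0, 0.0],
              [-2.0,  3.0, 0.0],
              [ 0.0,  0.0, 5.0]])
P = np.array([[-1.0, 0.0, 1.0],    # columns are p1, p2, p3 found above
              [ 1.0, 0.0, 1.0],
              [ 0.0, 1.0, 0.0]])

D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))             # diag(5, 5, 1), matching the result above
```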

Theorem 6.3. If $v_1, v_2, \ldots, v_k$ are eigenvectors of a matrix $A$ corresponding to distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_k$, then $\{v_1, v_2, \ldots, v_k\}$ is a linearly independent set.

Proof. Omitted!

Theorem 6.4. If an n × n matrix A has n distinct eigenvalues, then A is

diagonalizable.

Proof. Suppose $A$ has $n$ distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, and let $v_1, v_2, \ldots, v_n$ be corresponding eigenvectors. By Theorem 6.3, $\{v_1, v_2, \ldots, v_n\}$ is linearly independent. Thus $A$ has $n$ linearly independent eigenvectors, and by Theorem 6.2, $A$ is diagonalizable.
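The following spot-check illustrates Theorems 6.3 and 6.4 on one arbitrary matrix with distinct eigenvalues (an illustration, not a proof):

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [4.0, 2.0]])                   # eigenvalues 5 and -2 are distinct
eigenvalues, P = np.linalg.eig(A)

assert len(np.unique(np.round(eigenvalues, 8))) == len(eigenvalues)  # distinct
assert abs(np.linalg.det(P)) > 1e-10   # eigenvector columns are independent,
                                       # so P is invertible and A is diagonalizable
```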

♣Exercises 6.2

♣. Show that the following matrices are not diagonalizable:


 
   
 3 0 0 
 2 0   2 −3   
1.   , 2.   , 3.  0 2 0
.

1 2 1 −1  
0 1 2

♣. Find a matrix $P$ that diagonalizes $A$ and determine $P^{-1}AP$:

4. $A = \begin{pmatrix} -14 & 12 \\ -20 & 17 \end{pmatrix}$,  5. $A = \begin{pmatrix} 1 & 0 \\ 6 & -1 \end{pmatrix}$,  6. $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}$.

7. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear operator given by
$$T\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 3x_1 + 4x_2 \\ 2x_1 + x_2 \end{pmatrix}.$$
Find a basis for $\mathbb{R}^2$ relative to which the matrix of $T$ is diagonal.

8. Let
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$
Show that

(a) $A$ is diagonalizable if $(a - d)^2 + 4bc > 0$;

(b) $A$ is not diagonalizable if $(a - d)^2 + 4bc < 0$.

9. Let $A$ be an $n \times n$ matrix and $P$ an invertible $n \times n$ matrix. Show that

(a) $(P^{-1}AP)^2 = P^{-1}A^2P$;

(b) $(P^{-1}AP)^k = P^{-1}A^kP$ for each positive integer $k$.

10. Compute $A^{10}$ where
$$A = \begin{pmatrix} 1 & 0 \\ -1 & 2 \end{pmatrix}.$$
(Hint: Find a matrix $P$ that diagonalizes $A$ and compute $(P^{-1}AP)^{10}$; a sketch of this method follows.)
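A sketch of the hinted method, run on a stand-in $2 \times 2$ matrix so as not to give away the exercise: by Exercise 9(b), $A^{10} = P D^{10} P^{-1}$, so only the diagonal entries need to be raised to the 10th power.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])                 # a stand-in matrix, not the one above
eigenvalues, P = np.linalg.eig(A)          # A is diagonalizable: eigenvalues 2, 3

D10 = np.diag(eigenvalues ** 10)           # (P^{-1} A P)^{10} = D^{10}
A10 = P @ D10 @ np.linalg.inv(P)           # A^{10} = P D^{10} P^{-1}
assert np.allclose(A10, np.linalg.matrix_power(A, 10))
```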

§6.3 Orthogonal Diagonalization; Symmetric Matrices

♣The Orthogonal Diagonalization Problem. Given a linear operator $T : V \to V$ on a finite-dimensional vector space $V$, does there exist an orthonormal basis $B$ for $V$ such that the matrix for $T$ with respect to $B$, $[T]_B$, is diagonal?

♣Matrix Form of the Orthogonal Diagonalization Problem. Given a square matrix $A$, does there exist an orthogonal matrix $P$ such that $P^{-1}AP\ (= P^tAP)$ is diagonal?

Definition 6.3. A square matrix $A$ is said to be orthogonally diagonalizable if there exists an orthogonal matrix $P$ such that $P^{-1}AP\ (= P^tAP)$ is diagonal; the matrix $P$ is said to orthogonally diagonalize $A$.

Definition 6.4. A square matrix $A$ is said to be symmetric if $A = A^t$.

Theorem 6.5. If A is an n × n matrix, then the following are equivalent:

(a) A is orthogonally diagonalizable.

(b) A has an orthonormal set of n eigenvectors.

(c) A is symmetric.

Proof. (a) ⇒ (b). Suppose $A$ is orthogonally diagonalizable. Then there exists an orthogonal matrix $P$ such that $P^{-1}AP$ is diagonal. As shown in the proof of Theorem 6.2, the $n$ column vectors of $P$ are eigenvectors of $A$. Since $P$ is orthogonal, by Theorem 4.15, these column vectors are orthonormal, so $A$ has $n$ orthonormal eigenvectors.

(b) ⇒ (a). Suppose that $A$ has an orthonormal set of $n$ eigenvectors $\{p_1, p_2, \ldots, p_n\}$. As shown in the proof of Theorem 6.2, the matrix $P$ with these eigenvectors as columns diagonalizes $A$. Since these eigenvectors are orthonormal, $P$ is orthogonal by Theorem 4.15, and hence orthogonally diagonalizes $A$.

(a) ⇒ (c). Suppose $A$ is orthogonally diagonalizable. Then there exists an orthogonal matrix $P$ such that $P^{-1}AP = D$ where $D$ is diagonal. Thus
$$D = P^{-1}AP \;\Rightarrow\; A = PDP^{-1} = PDP^t$$
since $P$ is orthogonal. Therefore,
$$A^t = (PDP^t)^t = PD^tP^t = PDP^t = A;$$
so $A$ is symmetric.

(c) ⇒ (a). Omitted!

(beyond the scope of this elementary course)

Theorem 6.6. If A is a symmetric matrix, then eigenvectors from different

eigenspaces are orthogonal.

Proof. Omitted!

Remark 6.3. From Theorems 6.5 and 6.6, for a symmetric matrix A, we

obtain a procedure for finding an orthogonal matrix P which orthogonally

diagonalizes A:

Step 1. Find a basis for each eigenspace of A.

Step 2. Using the Gram-Schmidt process, convert each basis found in Step 1 into an orthonormal basis for the corresponding eigenspace.

Step 3. Form the matrix $P$ whose columns are the basis vectors constructed in Step 2.
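A NumPy sketch of this procedure: np.linalg.eigh, which is designed for symmetric input, already returns an orthonormal set of eigenvectors as the columns of $P$, so Steps 1 and 2 are handled by a single call (the matrix here is the one of Example 6.3 below).

```python
import numpy as np

A = np.array([[4.0, 2.0, 2.0],
              [2.0, 4.0, 2.0],
              [2.0, 2.0, 4.0]])            # the symmetric matrix of Example 6.3

eigenvalues, P = np.linalg.eigh(A)         # Steps 1-2: orthonormal eigenvector columns
assert np.allclose(P.T @ P, np.eye(3))     # P is orthogonal

D = P.T @ A @ P                            # Step 3: P^t A P = P^{-1} A P
assert np.allclose(D, np.diag(eigenvalues))
```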

Example 6.3. Find an orthogonal matrix P that diagonalizes the symmetric

matrix
$$A = \begin{pmatrix} 4 & 2 & 2 \\ 2 & 4 & 2 \\ 2 & 2 & 4 \end{pmatrix}.$$

Solution. The characteristic equation of A is


$$0 = \det(\lambda I - A) = \begin{vmatrix} \lambda - 4 & -2 & -2 \\ -2 & \lambda - 4 & -2 \\ -2 & -2 & \lambda - 4 \end{vmatrix} = (\lambda - 2)^2(\lambda - 8).$$

Hence the eigenvalues of $A$ are $\lambda = 2$ and $\lambda = 8$.

We see that
$$u_1 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} \quad \text{and} \quad u_2 = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$$
form a basis for the eigenspace corresponding to $\lambda = 2$, and the Gram-Schmidt process yields the orthonormal basis eigenvectors
$$v_1 = \begin{pmatrix} -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \end{pmatrix} \quad \text{and} \quad v_2 = \begin{pmatrix} -\frac{1}{\sqrt{6}} \\ -\frac{1}{\sqrt{6}} \\ \frac{2}{\sqrt{6}} \end{pmatrix}.$$

The eigenspace corresponding to $\lambda = 8$ has
$$u_3 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$$
as a basis, and the Gram-Schmidt process yields
$$v_3 = \begin{pmatrix} \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} \end{pmatrix}.$$

Finally, the matrix whose columns are $v_1, v_2, v_3$,
$$P = \begin{pmatrix} -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ 0 & \frac{2}{\sqrt{6}} & \frac{1}{\sqrt{3}} \end{pmatrix},$$
orthogonally diagonalizes $A$.
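A numerical check of the matrix $P$ constructed above (a NumPy sketch):

```python
import numpy as np

A = np.array([[4.0, 2.0, 2.0],
              [2.0, 4.0, 2.0],
              [2.0, 2.0, 4.0]])
P = np.column_stack([
    [-1/np.sqrt(2),  1/np.sqrt(2),  0.0         ],   # v1
    [-1/np.sqrt(6), -1/np.sqrt(6),  2/np.sqrt(6)],   # v2
    [ 1/np.sqrt(3),  1/np.sqrt(3),  1/np.sqrt(3)],   # v3
])

assert np.allclose(P.T @ P, np.eye(3))     # columns are orthonormal
print(np.round(P.T @ A @ P, 10))           # diag(2, 2, 8)
```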

Theorem 6.7.

(a) The characteristic equation of a symmetric matrix $A$ has only real roots.

(b) If an eigenvalue $\lambda$ of a symmetric matrix $A$ is a root of multiplicity $k$ of its characteristic equation, then the eigenspace corresponding to $\lambda$ is $k$-dimensional.

Proof. Omitted!

♣Exercises 6.3

♣. Find the dimensions of the eigenspaces of the following symmetric matrices (a rank-based numerical check appears after this exercise set):


   
 
 1 −4 2   1 1 1 
 1 1     
1.  , 2.  −4 1 −2  
 , 3.  1 1 1  ,

1 1    
2 −2 −2 1 1 1

   
  4 4 0 0 10
− 43
0 − 43
6 0 0    3 
     
   4 4 0 0   −4 −5 0 1 
   3 
4.  0 3 3 
 3 3
 , 5.  , 6.  .
     
 0 0 0 0   0 0 −2 0 
0 3 3    
0 0 0 0 − 43 1
3
0 − 53

7. Find a matrix that orthogonally diagonalizes
$$\begin{pmatrix} a & b \\ b & a \end{pmatrix}$$
where $b \neq 0$.

8. Two $n \times n$ matrices $A$ and $B$ are called orthogonally similar if there is an orthogonal matrix $P$ such that $B = P^tAP$. Show that if $A$ is symmetric and $A$ and $B$ are orthogonally similar, then $B$ is symmetric.
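For checking answers to these exercises numerically, one workable sketch uses the fact that the eigenspace of $\lambda$ is the null space of $\lambda I - A$, so its dimension is $n - \operatorname{rank}(\lambda I - A)$; the tolerance below is an artifact of floating-point arithmetic.

```python
import numpy as np

def eigenspace_dimension(A, lam, tol=1e-8):
    """Dimension of the eigenspace of lam: the nullity of lam*I - A."""
    n = A.shape[0]
    return n - np.linalg.matrix_rank(lam * np.eye(n) - A, tol=tol)

A = np.ones((3, 3))                    # matrix 3 of this exercise set
print(eigenspace_dimension(A, 3.0))    # 1  (eigenvalue 3 is simple)
print(eigenspace_dimension(A, 0.0))    # 2  (eigenvalue 0 has multiplicity 2)
```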

