
MATH 247 Introductory Linear Algebra

Textbook: “Linear Algebra and Its Applications” by David Lay, Pearson Education, Inc., 2006 (3rd edition update).

1.1. Systems of Linear Equations.

Definition. A linear equation in the variables x1, . . . , xn is an equation of the form

  a1x1 + a2x2 + · · · + anxn = b,

where a1, . . . , an, and b are real or complex numbers.

Example. x1 − 2x2 = 3 and ix2 − 2x1 = π are linear equations in x1 and x2. The equations x1² = 5 and x1 ln(x2) = −7 are not linear.

Definition. A system of linear equations is a collection of linear equations in the same variables x1, . . . , xn.

Example. The system

  x1 − x2 = 1
  x1 + x2 = 5

is a system of two linear equations in two unknowns x1 and x2. The system

  x1 + x3 = 8
  x3 − x2 = 4

is a system of two linear equations in three unknowns x1, x2, and x3.

Definition. A solution of a system of linear equations is a set of numbers s1, . . . , sn that make the equations true statements if x1 = s1, . . . , xn = sn.

Example. The system

  x1 − x2 = 1
  x1 + x2 = 5

has solution x1 = 3 and x2 = 2.

Definition. A system of linear equations has either
(i) no solution (inconsistent system),
(ii) exactly one solution (consistent system), or
(iii) infinitely many solutions (consistent system).

Example. The graph of a linear equation is a straight line. Therefore, finding a solution of a system of two linear equations is equivalent to finding an intersection of two lines. Lines can either (i) be parallel (no solution), (ii) intersect at a point (exactly one solution), or (iii) coincide (infinitely many solutions). For example, the system

  x1 + x2 = 3
  2x1 + 2x2 = 4

has no solution, the system

  x1 + x2 = 4
  x1 − x2 = 2

has exactly one solution, and the system

  x1 + x2 = 4
  2x1 + 2x2 = 8

has infinitely many solutions.

Definition. Two systems of equations are called equivalent if they have the same solution.

Example. The systems

  x1 + 3x2 = 7          x1 + 3x2 = 7
  2x1 − x2 = 0   and    3x1 + 2x2 = 7

are equivalent since they have the same solution x1 = 1 and x2 = 2.

Definition. Given the system

  a11x1 + a12x2 + · · · + a1nxn = b1
  a21x1 + a22x2 + · · · + a2nxn = b2
  ...
  an1x1 + an2x2 + · · · + annxn = bn,

the matrix

  [ a11 a12 . . . a1n ]
  [ a21 a22 . . . a2n ]
  [ ...               ]
  [ an1 an2 . . . ann ]

is called the coefficient matrix or matrix of coefficients. The matrix

  [ a11 a12 . . . a1n b1 ]
  [ a21 a22 . . . a2n b2 ]
  [ ...                  ]
  [ an1 an2 . . . ann bn ]

is called the augmented matrix.

Example. Consider the system

  −x1 + x3 = −3
  2x1 + x2 − x3 = 4
  x1 + 2x2 = 1.

Its augmented matrix is

  [ −1 0 1  −3 ]
  [ 2  1 −1 4  ].
  [ 1  2 0  1  ]

Definition. A matrix of size m × n has m rows and n columns.

Example. In the above example, we have a 3 × 4 matrix.

Solving a Linear System.

To solve a system of linear equations, we replace it with an equivalent one that is easier to solve. Three basic operations are used to simplify a linear system: (i) replace an equation by the sum of itself and a multiple of another equation, (ii) interchange two equations, and (iii) multiply an equation by a nonzero constant.

Example. Start with

  x1 − x2 = 1
  x1 + x2 = 5.

e1 → e1 + e2:

  2x1 = 6
  x1 + x2 = 5.

e1 → e1/2:

  x1 = 3
  x1 + x2 = 5.

e2 → e2 − e1:

  x1 = 3
  x2 = 2.

This solution can be written in terms of augmented matrices as follows:

  [ 1 −1 1 ]   r1 → r1 + r2   [ 2 0 6 ]   r1 → r1/2   [ 1 0 3 ]   r2 → r2 − r1   [ 1 0 3 ]
  [ 1 1  5 ]       =⇒         [ 1 1 5 ]      =⇒       [ 1 1 5 ]       =⇒         [ 0 1 2 ].
To solve a system of three or more equations, it is advisable to work with augmented matrices. Three elementary row operations lead to a row-equivalent augmented matrix:
(i) replace one row by the sum of itself and a multiple of another row,
(ii) interchange two rows,
and (iii) multiply a row by a nonzero constant.
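These row operations are easy to replay in code. Below is a minimal Python sketch (using numpy, with the values from the example above); the comments name each operation.

    import numpy as np

    # Augmented matrix of the system x1 - x2 = 1, x1 + x2 = 5.
    M = np.array([[1., -1., 1.],
                  [1.,  1., 5.]])

    M[0] = M[0] + M[1]   # r1 -> r1 + r2 gives [2, 0, 6]
    M[0] = M[0] / 2      # r1 -> r1/2   gives [1, 0, 3]
    M[1] = M[1] - M[0]   # r2 -> r2 - r1 gives [0, 1, 2]

    print(M)             # rows encode x1 = 3, x2 = 2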

1.2. Row Reduction and Echelon Forms.

Example. Consider two matrices

  [ 2 −2 0 1 ]         [ 1 0 0 −1 ]
  [ 0 1  2 3 ]   and   [ 0 1 0 3  ]
  [ 0 0  0 5 ]         [ 0 0 1 2  ].
                       [ 0 0 0 0  ]

Definition. In a matrix, a leading entry of a nonzero row is the leftmost nonzero entry of the row.

In our example, the leading entries are 2, 1 and 5 in the first matrix, and 1,
1, and 1 in the second matrix.

Definition. A matrix is in row echelon form if:


(i) all nonzero rows are above any rows of all zeros,
(ii) each leading entry of a row is to the right of the leading entry of the row
above it,
and (iii) all entries in a column below a leading entry are zeros.

In our example, both matrices are in row echelon form.

Definition. A matrix in row echelon form is in reduced row echelon form if:
(i) the leading entry in each nonzero row is 1,
and (ii) each leading 1 is the only nonzero entry in its column.

In our example, the second matrix is in reduced row echelon form.

Definition. A pivot position in a matrix is a location in this matrix that corresponds to a leading 1 in its reduced row echelon form. A pivot column is a column that contains a pivot position.

Algorithm for Solving a Linear System.

To solve a system of equations, we reduce the augmented matrix to its row echelon form and then to its reduced row echelon form. This is done in several steps.

Step 1. Take the leftmost nonzero column. This will be your pivot column.
Make sure the top entry (pivot) is not zero. Interchange rows if necessary.

Example.

  x1 + x2 + 2x3 = 0         [ 1 1 2 0  ]
  2x2 + x3 = 4         ↔    [ 0 2 1 4  ].
  x1 + 2x3 = −3             [ 1 0 2 −3 ]

The first column is a pivot column.

Step 2. Use the elementary row operations to create zeros in all positions
below the pivot.

In our example,

  r3 → r3 − r1    [ 1 1  2 0  ]
      =⇒          [ 0 2  1 4  ].
                  [ 0 −1 0 −3 ]
Step 3. Select a pivot column in the matrix with the first row ignored. Repeat the previous steps.

In our example,

  r3 → 2r3 + r2    [ 1 1 2 0  ]
       =⇒          [ 0 2 1 4  ].
                   [ 0 0 1 −2 ]
This matrix is in row echelon form.

Step 4. Starting with the rightmost pivot, create zeros above each pivot.
Make pivots equal 1 by rescaling rows if necessary.

In our example,

  r2 → r2 − r3    [ 1 1 2 0  ]   r1 → r1 − 2r3   [ 1 1 0 4  ]   r2 → r2/2   [ 1 1 0 4  ]
      =⇒          [ 0 2 0 6  ]        =⇒         [ 0 2 0 6  ]      =⇒       [ 0 1 0 3  ]
                  [ 0 0 1 −2 ]                   [ 0 0 1 −2 ]               [ 0 0 1 −2 ]

  r1 → r1 − r2    [ 1 0 0 1  ]        x1 = 1
      =⇒          [ 0 1 0 3  ]   ↔    x2 = 3
                  [ 0 0 1 −2 ]        x3 = −2.
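Steps 1 – 4 can be collected into one small routine. The following Python sketch (a helper of our own, with no special pivoting strategy) reproduces the reduced row echelon form of the example above.

    import numpy as np

    def rref(M, tol=1e-12):
        """Reduce M to reduced row echelon form (Steps 1-4)."""
        A = M.astype(float).copy()
        m, n = A.shape
        row = 0
        for col in range(n):
            # Step 1: find a nonzero pivot in this column, interchanging rows.
            pivot = next((r for r in range(row, m) if abs(A[r, col]) > tol), None)
            if pivot is None:
                continue
            A[[row, pivot]] = A[[pivot, row]]
            A[row] = A[row] / A[row, col]      # rescale the pivot to 1
            for r in range(m):                 # Steps 2 and 4: zeros below and above
                if r != row:
                    A[r] = A[r] - A[r, col] * A[row]
            row += 1
        return A

    M = np.array([[1, 1, 2, 0], [0, 2, 1, 4], [1, 0, 2, -3]])
    print(rref(M))   # last column gives x1 = 1, x2 = 3, x3 = -2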

1.6. Applications of Linear Systems.

Example. (Applications in Economics: Example 1, page 58). Consider the coal, electric, and steel sectors of an economy. Suppose coal sells 60% of its output to electric and 40% to steel; electric sells 40% to coal, 10% to electric, and 50% to steel; steel sells 60% to coal, 20% to electric, and 20% to steel. This can be summarized by a table in which each column shows how that sector's output is distributed:

  Coal   Electric   Steel   Purchased by
   0       .4        .6     coal
  .6       .1        .2     electric
  .4       .5        .2     steel

Let pc, pe, and ps denote the prices of the total outputs of coal, electric, and steel, respectively. Find prices that balance each sector's income and expenditures.

Solution: The prices must satisfy the system

  pc = .4pe + .6ps
  pe = .6pc + .1pe + .2ps
  ps = .4pc + .5pe + .2ps.

The reduced row echelon form of the augmented matrix of this system is

  [ 1 0 −.94 0 ]
  [ 0 1 −.85 0 ].
  [ 0 0 0    0 ]

Thus, the solution is pc = .94ps, pe = .85ps, and ps is free.
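The same reduction can be checked with sympy's Matrix.rref. A sketch in exact rationals (the exact pivots are 31/33 and 28/33, which round to the .94 and .85 above):

    from sympy import Matrix, Rational as R

    # Augmented matrix of the balance system, rewritten with zero right-hand sides:
    # pc - .4 pe - .6 ps = 0, etc.
    M = Matrix([[1,         -R(4, 10), -R(6, 10), 0],
                [-R(6, 10),  R(9, 10), -R(2, 10), 0],
                [-R(4, 10), -R(5, 10),  R(8, 10), 0]])

    print(M.rref()[0])   # pc = (31/33) ps, pe = (28/33) ps, ps free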

Example. (Applications in Chemistry: Example on page 59). When propane burns, the propane C3H8 combines with oxygen O2 and forms carbon dioxide CO2 and water H2O. Write a balanced chemical equation for this reaction.

Solution: The number of atoms of each element is preserved in a chemical reaction. Hence, we have to find integers x1, x2, x3, and x4 such that x1 C3H8 + x2 O2 → x3 CO2 + x4 H2O. That is, we have to solve the system

  3x1 = x3          (carbon)
  8x1 = 2x4         (hydrogen)
  2x2 = 2x3 + x4    (oxygen).

The solution is

  x1 = .25x4
  x2 = 1.25x4
  x3 = .75x4.

Since all x's must be integers, take x4 = 4; then x1 = 1, x2 = 5, and x3 = 3.

Example. (Network Flow: Example 2 on page 61). In a network, the in-flow and out-flow of every node are equal. Consider intersections A, B, C, and D. Picture. Determine the flows at these intersections.

Solution: The unknown flows satisfy the linear system

  x1 + x2 = 800
  x2 + x4 = x3 + 300
  x4 + x5 = 500
  x1 + x5 = 600.

The solution is

  x1 = 600 − x5
  x2 = 200 + x5
  x3 = 400
  x4 = 500 − x5.

The flow x5 is free, though it must satisfy x5 ≤ 500 since x4 ≥ 0.

1.3. Vector Equations.

Definition. A column vector or a vector is a list of numbers arranged in
a column.
 
Example.

  [ 1  ]
  [ −3 ].
  [ 7  ]
Notation. Vectors are usually denoted by u, v, w, x, y.

Notation. The set of real numbers is denoted by R.

Notation. The set of all vectors with n entries is denoted by Rn. We write u ∈ Rn.

Definition. Two vectors are equal iff their corresponding entries are equal. That is, equality of vectors is defined entry-wise.

Example. Let

  u = [ 1  ],   v = [ 1  ],   w = [ e^(πi) ].
      [ −1 ]        [ i² ]        [ −1     ]

Entry-wise, u = v ̸= w.

Definition. Vector addition and scalar multiplication are defined entry-wise.

Example. In our example,

  u + w = [ 1 + (−1)  ] = [ 0  ],   and   2u = [ 2(1)  ] = [ 2  ].
          [ −1 + (−1) ]   [ −2 ]              [ 2(−1) ]   [ −2 ]

Geometric Presentation of a Vector.

A vector

  v = [ v1 ]
      [ v2 ]
      [ ...]
      [ vn ]

is plotted in an n-dimensional coordinate system as a directed segment (arrow) from the origin 0 to the point (v1, v2, . . . , vn). Picture in R2. Example:

  u = [ 2 ],   v = [ −2 ].
      [ 3 ]        [ 1  ]

How to picture the sum of two vectors and a vector multiplied by a scalar?

Parallelogram Rule for Addition.
The vector u+v points from the origin and lies on the diagonal of the parallel-
ogram formed by vectors u and v. Picture. Example. Where is u−v? v−u?

Scalar Multiplication.
Vector cv lies on the same line as v, its length is |c| times the length of v,
and it points in the direction of v if c > 0 and in the opposite direction if
c < 0. Picture. Example.

Definition. For any vectors v1, . . . , vp in Rn and any c1, . . . , cp ∈ R, the vector c1v1 + · · · + cpvp is called a linear combination of v1, . . . , vp with weights c1, . . . , cp.

Give an example.
       
Example. Let

  u = [ 1  ],   v = [ 2 ],   w = [ 3 ],   y = [ 0  ].
      [ 0  ]        [ 1 ]        [ 0 ]        [ 2  ]
      [ −1 ]        [ 0 ]        [ 1 ]        [ −1 ]

Is y a linear combination of u, v, and w?

Solution: The goal is to find, if possible, weights cu, cv, and cw such that y = cu u + cv v + cw w, that is,

  [ 0  ]      [ 1  ]      [ 2 ]      [ 3 ]   [ cu + 2cv + 3cw ]
  [ 2  ] = cu [ 0  ] + cv [ 1 ] + cw [ 0 ] = [ cv              ].
  [ −1 ]      [ −1 ]      [ 0 ]      [ 1 ]   [ −cu + cw       ]

The answer is y = −(1/4)u + 2v − (5/4)w.
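Finding the weights amounts to solving the linear system [u v w]c = y, which is easy to check numerically. A numpy sketch:

    import numpy as np

    # Columns are u, v, w.
    A = np.array([[1., 2., 3.],
                  [0., 1., 0.],
                  [-1., 0., 1.]])
    y = np.array([0., 2., -1.])

    c = np.linalg.solve(A, y)
    print(c)   # [-0.25, 2.0, -1.25]: y = -(1/4)u + 2v - (5/4)w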

Definition. The set of all linear combinations of vectors v1, v2, . . . , vp ∈ Rn is called the subset of Rn spanned (or generated) by v1, v2, . . . , vp and is denoted by Span{v1, v2, . . . , vp}.

Example. Span{u} where u ∈ R3 . Picture.

Example. Span{u, v} where the vectors are in R3 . Picture.

Definition. A vector equation is a linear equation of the form x1v1 + x2v2 + · · · + xpvp = b. It is equivalent to a linear system with the augmented matrix [v1 v2 . . . vp b].

1.4. The Matrix Equation Ax = b.


Definition. A system of linear equations with augmented matrix [A|b] can be written as a matrix equation Ax = b.

1.5. Solution Sets of Linear Systems.

Homogeneous Linear Systems.

Definition. A system Ax = 0 is called homogeneous. It has the trivial solution x = 0. A non-trivial solution exists iff the system has at least one free variable.

Example (Example 1 on page 50). Describe the solution set of Ax = 0 where

  A = [ 3  5  −4 ]
      [ −3 −2 4  ].
      [ 6  1  −8 ]

Solution: An echelon form of [A|0] is

  [ 3 5 −4 0 ]
  [ 0 3 0  0 ],
  [ 0 0 0  0 ]

meaning x1 = (4/3)x3, x2 = 0, and x3 is free. Thus, any solution

  x = x3 [ 4/3 ]
         [ 0   ] = x3 v,
         [ 1   ]

that is, a non-trivial solution of this homogeneous system is in Span{v}.

Example (Example 2 on page 51). Describe the solution set of 10x1 − 3x2 − 2x3 = 0.
Solution: From this equation, x1 = .3x2 + .2x3 where x2 and x3 are free. Any solution

  x = [ x1 ]   [ .3x2 + .2x3 ]      [ .3 ]      [ .2 ]
      [ x2 ] = [ x2          ] = x2 [ 1  ] + x3 [ 0  ] = x2 u + x3 v,
      [ x3 ]   [ x3          ]      [ 0  ]      [ 1  ]

that is, a non-trivial solution of this equation is in Span{u, v}.

Generally speaking, for a homogeneous system with k free variables, a non-trivial solution is in the span of k vectors.

Nonhomogeneous Linear Systems.

Definition. A system Ax = b with nontrivial right-hand side (b ̸= 0) is called a nonhomogeneous linear system.

Proposition. The solution set of a nonhomogeneous system Ax = b is the set of all vectors of the form w = p + vh, where p is some particular solution of the nonhomogeneous system and vh is any solution of the homogeneous system Ax = 0.
In other words, in order to solve a nonhomogeneous system, one has to find
all solutions of the homogeneous system and add a particular solution of the
nonhomogeneous system.

Example (Example 3 on page 52). Let

  A = [ 3  5  −4 ]         b = [ 7  ]
      [ −3 −2 4  ]   and       [ −1 ].
      [ 6  1  −8 ]             [ −4 ]

The reduced row echelon form of [A|b] is

  [ 1 0 −4/3 −1 ]
  [ 0 1 0    2  ],
  [ 0 0 0    0  ]

meaning

  x1 = −1 + (4/3)x3
  x2 = 2
  x3 free.

Thus,

  x = [ x1 ]   [ −1 + (4/3)x3 ]   [ −1 ]      [ 4/3 ]
      [ x2 ] = [ 2            ] = [ 2  ] + x3 [ 0   ] = p + x3 v.
      [ x3 ]   [ x3           ]   [ 0  ]      [ 1   ]

Picture of Span{v}, translated by a vector.

1.7. Linear Independence.

Definition. Vectors v1, . . . , vp ∈ Rn are linearly independent if the vector equation x1v1 + · · · + xpvp = 0 has only the trivial solution (there are no free variables). Otherwise, the vectors are linearly dependent, and a vector equation with a non-trivial solution is called a linear dependence relation.
 
Example (Example 1 on page 65). Decide whether the vectors

  v1 = [ 1 ],   v2 = [ 4 ],   v3 = [ 2 ]
       [ 2 ]         [ 5 ]         [ 1 ]
       [ 3 ]         [ 6 ]         [ 0 ]

are linearly independent.
Solution: A row echelon form of the augmented matrix [v1 v2 v3 | 0] is

  [ 1 4  2  0 ]
  [ 0 −3 −3 0 ].
  [ 0 0  0  0 ]

Thus, a non-trivial solution is possible. The vectors are not linearly independent.

Find a linear dependence relation.

Solution: The reduced row echelon form is

  [ 1 0 −2 0 ]
  [ 0 1 1  0 ].
  [ 0 0 0  0 ]

Thus, for example, 10v1 − 5v2 + 5v3 = 0.
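A quick numerical check of this dependence relation (the rank computation confirms that the three vectors span only a plane):

    import numpy as np

    v1, v2, v3 = np.array([1., 2., 3.]), np.array([4., 5., 6.]), np.array([2., 1., 0.])
    print(10*v1 - 5*v2 + 5*v3)                                    # [0. 0. 0.]
    print(np.linalg.matrix_rank(np.column_stack([v1, v2, v3])))   # 2 < 3: dependent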

Proposition. The columns of a matrix A are linearly independent iff the equation Ax = 0 has only the trivial solution.

Example (Example 2 on pages 66 – 67). Determine if the columns of the matrix

  A = [ 0 1 4  ]
      [ 1 2 −1 ]
      [ 5 8 0  ]

are independent.
Solution: A row echelon form of the augmented matrix is

  [ 1 2 −1 0 ]
  [ 0 1 4  0 ].
  [ 0 0 13 0 ]

There are no free variables; therefore, only the trivial solution exists and the columns are linearly independent.

Proposition. (i) One vector is linearly independent iff it is a non-zero vector. (ii) Two vectors are linearly dependent iff one is a multiple of the other (that is, if they lie on the same line through the origin).
Proof: (i) The equation x1v1 = 0 has a non-trivial solution iff v1 = 0.
(ii) The equation x1v1 + x2v2 = 0 has a non-trivial solution iff v1 = −(x2/x1)v2 (assuming x1 ̸= 0). □
Example. The vector

  [ 1 ]
  [ 3 ]

is linearly independent. The vectors

  [ 1  ]   and   [ 2 ]
  [ −3 ]         [ 6 ]

are linearly independent, while

  [ 1  ]   and   [ −3 ]
  [ −3 ]         [ 9  ]

are linearly dependent.

Theorem 7 on page 68. Vectors v1, . . . , vp, p ≥ 2, are linearly dependent iff at least one of the vectors is a linear combination of the others (that is, at least one vector is in the set spanned by the others).
Proof: The p vectors are linearly dependent iff the equation x1v1 + · · · + xpvp = 0 has a non-trivial solution iff (say, x1 ̸= 0) v1 = −(x2/x1)v2 − · · · − (xp/x1)vp. □
 
Example. (Example 4 on page 68). Consider the vectors

  v1 = [ 3 ]   and   v2 = [ 1 ].
       [ 0 ]             [ 6 ]
       [ 1 ]             [ 0 ]

They are not multiples of each other; therefore, Span{v1, v2} is a plane through the origin. A vector in this plane is linearly dependent with v1 and v2. A vector not in this plane is linearly independent. Picture.

Theorem 8 on page 69. Vectors v1, . . . , vp ∈ Rn are linearly dependent if p > n. In other words, if a set contains more vectors than there are entries in each vector, then the set is linearly dependent.
Proof: Let A = [v1 . . . vp]. Then the equation Ax = 0 is a system of n equations in p unknowns. If p > n, there must be free variables, and, thus, a non-trivial solution. □
Example 5 on page 69. The three vectors

  [ 2 ],   [ 4  ],   and   [ −2 ]
  [ 1 ]    [ −1 ]          [ 2  ]

are linearly dependent, since 3 = p > n = 2. A dependence relation is

  − [ 2 ] + [ 4  ] + [ −2 ] = [ 0 ].
    [ 1 ]   [ −1 ]   [ 2  ]   [ 0 ]

Theorem 9 on page 69. If a set of vectors contains the zero vector, the set is linearly dependent.
Proof: Suppose v1 = 0. Then 1 · v1 + 0 · v2 + · · · + 0 · vp = 0 solves the equation nontrivially. □
     
Example 6(b) on page 69. The set of vectors

  [ 2 ],   [ 0 ],   and   [ 1 ]
  [ 3 ]    [ 0 ]          [ 1 ]
  [ 5 ]    [ 0 ]          [ 8 ]

is linearly dependent since it contains the zero vector.

1.8. Linear Transformations.

Definition. A transformation (or function or mapping) T from Rn to Rm is a rule that assigns to each vector x in Rn a vector T(x) in Rm. Picture.

Definition. The set Rn is called the domain of T , and Rm is called the
codomain of T . The notation T : Rn → Rm means that T maps Rn into Rm .
The vector T (x) is called the image of x. The set of all images is called the
range of T . Picture.

A matrix equation Ax = b can be considered as a transformation.

Definition. Let x ∈ Rn. The transformation T(x) = Ax, where A is an m × n matrix, is called a matrix transformation and is denoted by x 7→ Ax.

Example 1 on page 74. Define a matrix transformation T : R2 → R3 by T(x) = Ax where

  A = [ 1  −3 ]
      [ 3  5  ].
      [ −1 7  ]

(i) Find T(u) where

  u = [ 2  ].
      [ −1 ]

Solution:

  T(u) = Au = [ 1  −3 ] [ 2  ]   [ 5  ]
              [ 3  5  ] [ −1 ] = [ 1  ].
              [ −1 7  ]          [ −9 ]

(ii) Find x whose image is

  [ 3  ].
  [ 2  ]
  [ −5 ]

Solution: x = (x1, x2) solves

  [ 1  −3 ] [ x1 ]   [ 3  ]
  [ 3  5  ] [ x2 ] = [ 2  ].
  [ −1 7  ]          [ −5 ]

The solution is x1 = 1.5, x2 = −.5, and it is unique.

(iii) Determine if

  [ 3 ]
  [ 2 ]
  [ 5 ]

is in the range. Answer: no.
Example 2 on page 76. The matrix transformation with

  A = [ 1 0 0 ]
      [ 0 1 0 ]
      [ 0 0 0 ]

is a projection onto the x1x2-plane since

  [ x1 ]      [ 1 0 0 ] [ x1 ]   [ x1 ]
  [ x2 ]  7→  [ 0 1 0 ] [ x2 ] = [ x2 ].
  [ x3 ]      [ 0 0 0 ] [ x3 ]   [ 0  ]

Define projections onto the x1x3- and x2x3-planes.

Definition. A transformation T is called linear if

(i) T (u + v) = T (u) + T (v) for any vectors u and v in the domain of T ,
(ii) T (cu) = cT (u) for any vector u and any scalar c.

Example. A matrix transformation is linear.

Exercise. Show the superposition principle: T(c1v1 + · · · + cpvp) = c1T(v1) + · · · + cpT(vp).

Exercise. Define T : R2 → R2 by T(u) = ru. If 0 ≤ r ≤ 1, T is called a contraction; if r > 1, T is called a dilation. Show that T is a linear transformation.

Exercise. Show that if T is a linear transformation, then T (0) = 0.

1.9. The Matrix of a Linear Transformation.

It turns out that any linear transformation is in fact a matrix transformation.

Proposition. Let T : Rn → Rm be a linear transformation. Then there exists a unique matrix A such that T(u) = Au for any vector u ∈ Rn. In fact, A is the m × n matrix

  A = [ T(e1) . . . T(en) ],   where   e1 = [ 1  ],  etc.
                                            [ 0  ]
                                            [ ...]
                                            [ 0  ]

Proof: u = (u1, . . . , un) = u1e1 + · · · + unen. Thus,

  T(u) = u1T(e1) + · · · + unT(en) = [ T(e1) . . . T(en) ] [ u1 ].
                                                           [ ...]
                                                           [ un ]

□
Example 1 on page 82. Let T : R2 → R3 be a linear transformation such that

  T(e1) = [ 5  ]   and   T(e2) = [ −3 ].
          [ −7 ]                 [ 8  ]
          [ 2  ]                 [ 0  ]

Find A, the standard matrix for T. By the proposition, A = [T(e1) T(e2)], the 3 × 2 matrix with these vectors as its columns.
2.1. Matrix Operations.

Definition. An m × n matrix A has the form

  A = [ a11 . . . a1j . . . a1n ]
      [ ...                     ]
      [ ai1 . . . aij . . . ain ]
      [ ...                     ]
      [ am1 . . . amj . . . amn ]

where aij is called the (i, j)-entry.

Definition. The diagonal entries of A are a11 , a22 , . . . , and they form the
main diagonal of A.

Definition. A diagonal matrix is a square matrix whose entries off the main diagonal are zero.

Definition. The identity matrix In is a diagonal matrix with ones on the main diagonal.

Definition. Two matrices are equal if they have the same dimensions and
their corresponding entries are equal.

Definition. The addition of matrices of equal dimensions and the scalar multiplication of a matrix are defined entrywise.

Definition. If A is an m × n matrix and B is an n × p matrix, then the product AB is the m × p matrix with (i, j) entry

  (AB)ij = ai1b1j + ai2b2j + · · · + ainbnj.
Example.

  [ 1 0  −4 ] [ 1  3 ]   [ 5 −1 ]
  [ 2 −3 2  ] [ 0  2 ] = [ 0 2  ].
  [ 6 1  0  ] [ −1 1 ]   [ 6 20 ]
Remark. (i) The product of matrices is not commutative, that is, in gen-
eral, AB ̸= BA. Example.
(ii) The cancellation law doesn’t hold for matrix multiplication, that is, if
AB = AC, then, in general, B ̸= C. Give example.
(iii) If AB = 0, then, in general, it is not true that A = 0 or B = 0. Give
example.

Definition. The kth power of a matrix A is defined as the product of k copies of A, that is, Ak = A · · · A (k times).

Definition. The transpose of an n × m matrix A is the m × n matrix AT with aji as its (i, j) entry. Example.

Theorem 3 on page 115. (i) (AT)T = A, (ii) (A + B)T = AT + BT, (iii) (rA)T = rAT for any scalar r, and (iv) (AB)T = BTAT.

Examples.

2.2. The Inverse of a Matrix.

Definition. The inverse of a square matrix A is the (unique) matrix A−1 such that AA−1 = A−1A = I.

Theorem 4 on page 119. The inverse of a 2 × 2 matrix

  A = [ a b ]
      [ c d ]

is

  A−1 = (1/(ad − bc)) [ d  −b ].
                      [ −c a  ]

If ad − bc = 0, the inverse does not exist, and A is called singular or not invertible.
Proof:

Theorem 5 on page 120. If A is invertible, then the matrix equation Ax = b has the unique solution x = A−1b.

Example. Solve

  3x1 + 4x2 = 3
  5x1 + 6x2 = 7.
Theorem 6 on page 121. (i) (A−1 )−1 = A, (ii) (AB)−1 = B−1 A−1 , and
(iii) (AT )−1 = (A−1 )T .
Proof:

We know how to find the inverse of a 2 × 2 matrix. For larger matrices there is no explicit formula. However, one can find A−1 by applying the following algorithm, based on the fact that AA−1 = I.

Algorithm for Finding A−1: Take the augmented matrix [A I]. Applying elementary row operations, reduce it to [I A−1]. If this is impossible, then A is not invertible.

Example 7 on page 124. Find the inverse of the matrix

  A = [ 0 1  2 ]
      [ 1 0  3 ].
      [ 4 −3 8 ]

Solution:

  [A I] = [ 0 1  2 1 0 0 ]        [ 1 0 0 −9/2 7  −3/2 ]
          [ 1 0  3 0 1 0 ]  =⇒    [ 0 1 0 −2   4  −1   ].
          [ 4 −3 8 0 0 1 ]        [ 0 0 1 3/2  −2 1/2  ]

It is useful to check the answer.
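One way to check is with numpy, which computes the inverse by a similar elimination internally. A sketch:

    import numpy as np

    A = np.array([[0., 1., 2.],
                  [1., 0., 3.],
                  [4., -3., 8.]])

    A_inv = np.linalg.inv(A)
    print(A_inv)       # [[-4.5, 7, -1.5], [-2, 4, -1], [1.5, -2, 0.5]]
    print(A @ A_inv)   # the identity, up to floating-point round-off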

3.1. Determinants.

Notation. Denote by Aij the matrix obtained from matrix A by removing the ith row and the jth column.

Example. Find A11 for

  A = [ 1 5  0  ]
      [ 2 4  −1 ].
      [ 0 −2 0  ]
Definition. The determinant of an n × n matrix A = [aij] is the number given recursively by

  detA = Σ_{j=1}^{n} (−1)^(1+j) a1j detA1j.

Definition. The quantity (−1)^(i+j) detAij is called the (i, j)-cofactor of A.

Definition. The determinant is defined as a cofactor expansion across the first row.

Example. Find

  det [ 1 5  0  ]
      [ 2 4  −1 ].
      [ 0 −2 0  ]
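The recursive definition translates directly into code. A minimal Python sketch (pure lists, no libraries); the printed value also answers the example above:

    def det(A):
        """Cofactor expansion across the first row (the recursive definition)."""
        n = len(A)
        if n == 1:
            return A[0][0]
        total = 0
        for j in range(n):
            # A1j: delete row 1 (index 0) and column j+1 (index j).
            minor = [row[:j] + row[j+1:] for row in A[1:]]
            total += (-1) ** j * A[0][j] * det(minor)
        return total

    print(det([[1, 5, 0], [2, 4, -1], [0, -2, 0]]))   # -2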
Theorem 1 on page 188. The determinant of a matrix can be computed
by a cofactor expansion across any row or any column.

Example. In the above example, it is simpler to expand the determinant along the last row or the last column.

Exercise. Show that if one row or one column of a square matrix is zero,
its determinant is zero (singular matrix).

Definition. A matrix is called triangular if all entries below or above the main diagonal are zeros. Example. Special case: diagonal matrix.

Theorem 2 on page 189. The determinant of a triangular matrix is the product of the entries on the main diagonal.

3.2. Properties of Determinants.

Theorem 3 on page 192. (i) If a multiple of one row of a matrix A is added to another row to produce a matrix B, then detB = detA.
(ii) If two rows of A are interchanged to produce B, then detB = −detA.
(iii) If one row of A is multiplied by c to produce B, then detB = c detA.

Example 1 on page 193.

  det [ 1  −4 2  ]       [ 1  −4 2  ]       [ 1 −4 2  ]
      [ −2 8  −9 ] = det [ 0  0  −5 ] = det [ 0 0  −5 ]
      [ −1 7  0  ]       [ −1 7  0  ]       [ 0 3  2  ]

           [ 1 −4 2  ]
    = −det [ 0 3  2  ] = −(1)(3)(−5) = 15.
           [ 0 0  −5 ]
 
Exercise. Compute

  det [ 0 2 2 ]
      [ 1 0 3 ].
      [ 2 1 1 ]

Answer: 12.
Exercise. Show that if two rows of a square matrix are equal, its deter-
minant is zero.

Exercise. Show that if one column of a square matrix is a linear combination of the others, then its determinant is zero. The converse is also true.

Theorem 4 on page 194. A square matrix is invertible iff its determinant is nonzero.

Remark. It follows from the above Exercise and Theorem 4 that a ma-
trix is invertible iff its determinant is nonzero iff its columns are linearly
independent.

Theorem 5 on page 196. detAT = detA.

Theorem 6 on page 196. detAB = (detA)(detB).

Exercise. detA = 5. Find detA−1 .

Exercise. A matrix A is such that A = A−1 . Find detA.

3.3. Cramer’s Rule.

Notation. Let Ai(b) denote the matrix obtained from an n × n matrix A by replacing its ith column by the vector b ∈ Rn.

Theorem 7 on page 201 (Cramer's Rule). The unique solution of a matrix equation Ax = b is

  xi = detAi(b)/detA,   i = 1, . . . , n.

Proof: Let A = [a1 . . . an] and I = [e1 . . . en]. If Ax = b, then

  A Ii(x) = [Ae1 . . . Ax . . . Aen] = [a1 . . . b . . . an] = Ai(b).

Thus, det(A Ii(x)) = detA detIi(x) = detAi(b), but detIi(x) = xi. □

Example 1 on page 202. Use Cramer's rule to solve

  3x1 − 2x2 = 6
  −5x1 + 4x2 = 8.

Answer: x1 = 20, x2 = 27.
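Cramer's rule is short to implement. A numpy sketch (the helper name cramer is ours), applied to this example:

    import numpy as np

    def cramer(A, b):
        """Solve Ax = b via x_i = det A_i(b) / det A."""
        d = np.linalg.det(A)
        x = np.empty(len(b))
        for i in range(len(b)):
            Ai = A.copy()
            Ai[:, i] = b           # replace the ith column by b
            x[i] = np.linalg.det(Ai) / d
        return x

    A = np.array([[3., -2.], [-5., 4.]])
    b = np.array([6., 8.])
    print(cramer(A, b))   # [20. 27.]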

Exercise 6 on page 209. Use Cramer's rule to solve

  2x1 + x2 + x3 = 4
  −x1 + 2x3 = 2
  3x1 + x2 + 3x3 = −2.

A Formula for A−1 .

Notation. Denote by Cij = (−1)^(i+j) detAij the (i, j)-cofactor of A.

Theorem 8 on page 203. The inverse is

  A−1 = (1/detA) [ C11 C21 . . . Cn1 ]
                 [ C12 C22 . . . Cn2 ]
                 [ ...               ].
                 [ C1n C2n . . . Cnn ]

The matrix above, the transpose of the matrix of cofactors, is called the adjugate or adjoint of A and is denoted by adjA. Thus, A−1 = (1/detA) adjA.

Example 3 on pages 203 – 204. Find the inverse of

  [ 2 1  3  ]
  [ 1 −1 1  ].
  [ 1 4  −2 ]

Answer:

  A−1 = (1/14) [ −2 14 4  ]
               [ 3  −7 1  ].
               [ 5  −7 −3 ]
Exercise 12 on page 210. Find the inverse of

  [ 1 1  3 ]
  [ 2 −2 1 ].
  [ 0 1  0 ]

Determinants as Area or Volume.

Theorem 9 on page 205 (Geometric Interpretation of Determinants). The absolute value of the determinant of a 2 × 2 matrix is the area of the parallelogram determined by the columns of the matrix. The absolute value of the determinant of a 3 × 3 matrix is the volume of the parallelepiped determined by the columns of the matrix.

Example. Draw the picture for

  det [ a 0 ]   and   det [ a 0 0 ]
      [ 0 d ]             [ 0 b 0 ].
                          [ 0 0 c ]

Example 4 on page 206. Find the area of the parallelogram with vertices (−2, −2), (0, 3), (4, −1), and (6, 4).
Solution: First move the parallelogram to the origin, for example by subtracting the vertex (−2, −2) from each vertex. Then the area of the parallelogram with vertices (0, 0), (2, 5), (6, 1), and (8, 6) is

  |det [ 2 6 ]| = |−28| = 28.
       [ 5 1 ]
4.1. Vector Spaces and Subspaces.

Definition. A vector space V is a collection of objects, called vectors, for which two operations are defined: addition and multiplication by a scalar. For any u, v, w ∈ V and any c, d ∈ R, the following ten axioms hold:
1. u + v ∈ V .
2. u + v = v + u (commutative law for addition).
3. (u + v) + w = u + (v + w) (associative law for addition).
4. There exists a zero vector 0 ∈ V such that u + 0 = u.
5. For every u, there exists a vector −u ∈ V , called the negative of u such
that u + (−u) = 0.
6. Vector cu ∈ V .
7. c(u + v) = cu + cv (distributive law).
8. (c + d)u = cu + du (distributive law).
9. c(du) = (cd)u.
10. 1 · u = u.

Example 1 on page 217. The space Rn is a vector space. Each element is represented by a column vector.

Example 2 on pages 217 – 218. The space of all arrows is a vector space. Here two vectors are equal if they have the same length and point in the same direction. Addition is defined by the parallelogram rule, and multiplication by a scalar as contracting or dilating a vector by |c| and reversing the direction if c < 0.

Example 3 on page 218. The space of all doubly infinite sequences of numbers (. . . , x−2, x−1, x0, x1, x2, . . . ) is a vector space.

Example 4 on pages 218 – 219. The space of all polynomials of degree at most n is a vector space.

Example 5 on page 219. The space of all real-valued functions is a vector space.

Example. The space of n × n matrices is a vector space.

Example. The space of natural numbers N = {1, 2, 3, . . . } is not a vec-
tor space. No negative vector, no zero.

Example. The space of integer numbers Z = {0, 1, −1, 2, −2, . . . } is not a vector space. Multiplication by a real scalar doesn't work.

Proposition 1. The negative of u is unique.

Proof: Let a and b be two negatives of u. Then, by axiom 5, u + a = 0. Since vectors commute by axiom 2, we can write a + u = 0, which, by axiom 5 again, means that u = −a. Similarly, u = −b. So, −a = −b. Therefore, by axiom 5 once again, b + (−a) = 0. Adding a to both sides of this identity, we get that the left-hand side equals (b + (−a)) + a = {by axiom 3} = b + ((−a) + a) = {by axiom 2} = b + (a + (−a)) = {by axiom 5} = b + 0 = {by axiom 4} = b, while the right-hand side equals 0 + a = {by axiom 2} = a + 0 = {by axiom 4} = a. So, a = b. □

Proposition 2. 0 · u = 0.

Proof: By axiom 4, it suffices to show that u + 0 · u = u. Indeed, u + 0 · u = {by axiom 10} = 1 · u + 0 · u = {by axiom 8} = (1 + 0)u = 1 · u = {by axiom 10 again} = u. □

Proposition 3. −u = (−1)u.

Proof: By axiom 5, it suffices to show that u + (−1)u = 0. Indeed, u + (−1)u = {by axiom 10} = 1 · u + (−1)u = {by axiom 8} = (1 − 1)u = 0 · u = {by proposition 2} = 0. □

Proposition 4. c 0 = 0.

Proof: By axiom 5, c 0 = c(u + (−u)) = {by axiom 7} = cu + c(−u) = {by proposition 3} = cu + c((−1)u) = {by axiom 9} = cu + (c(−1))u = cu + ((−1)c)u = {by axiom 9 again} = cu + (−1)(cu) = {by proposition 3} = cu + (−(cu)) = {by axiom 5} = 0. □

Subspaces.

Definition. A subspace of a vector space V is a subset H with three properties: (i) it contains the zero vector, (ii) it is closed under vector addition, and (iii) it is closed under scalar multiplication. These properties guarantee that a subspace is itself a vector space.

Example 6 on page 220. The set consisting only of the zero vector is
a subspace. It is denoted by {0}.

Example 7 on page 220. The set of all polynomials with real coefficients, P, is a subspace of the space of real-valued functions. The set of all polynomials of degree at most n is a subspace of P.

Example 8 on page 220. The set of all vectors of the form

  [ x1 ]
  [ x2 ]
  [ 0  ]

is a subspace of R3. It looks and acts like R2 but is not R2.

Example 10 on page 220. If v1, . . . , vp ∈ V, then Span{v1, . . . , vp} is a subspace of V.

4.2. Null Spaces, Column Spaces, and Linear Transformations.

Definition. The null space of an m × n matrix A is the set of all solutions of the homogeneous equation Ax = 0. In set notation,

  NulA = {x : x ∈ Rn, Ax = 0}.

Geometrically, the null space of A is the set that is mapped into zero by the
linear transformation x 7→ Ax. Picture.

Theorem 2 on page 227. The null space of an m × n matrix is a subspace of Rn.
Proof: (i) 0 ∈ Rn is in the null space of A since A0 = 0.
(ii) If Au = 0 and Av = 0, then A(u + v) = 0.
(iii) If Au = 0, then A(cu) = c(Au) = c(0) = 0. □
Example. Find the null space of

  A = [ 1  −3 −2 ].
      [ −5 9  1  ]

Solution: A vector x with entries x1, x2, x3 is in NulA if it solves

  x1 − 3x2 − 2x3 = 0
  −5x1 + 9x2 + x3 = 0.

Therefore, NulA = {x ∈ R3 : x1 = −(5/2)x3, x2 = −(3/2)x3}. It is a subspace of R3.

An Explicit Description of NulA.

Example 3 on page 228. Find a spanning set for the null space of the matrix

  A = [ −3 6  −1 1 −7 ]
      [ 1  −2 2  3 −1 ].
      [ 2  −4 5  8 −4 ]

Solution: The general solution of the equation Ax = 0 is x1 = 2x2 + x4 − 3x5, x3 = −2x4 + 2x5, where x2, x4, and x5 are free variables. We can decompose any solution into a linear combination of vectors whose weights are the free variables. That is,

  [ x1 ]   [ 2x2 + x4 − 3x5 ]      [ 2 ]      [ 1  ]      [ −3 ]
  [ x2 ]   [ x2             ]      [ 1 ]      [ 0  ]      [ 0  ]
  [ x3 ] = [ −2x4 + 2x5     ] = x2 [ 0 ] + x4 [ −2 ] + x5 [ 2  ] = x2u + x4v + x5w.
  [ x4 ]   [ x4             ]      [ 0 ]      [ 1  ]      [ 0  ]
  [ x5 ]   [ x5             ]      [ 0 ]      [ 0  ]      [ 1  ]

Notice that u, v, and w are linearly independent since the weights are free variables. Thus, NulA = Span{u, v, w}.
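sympy produces the same spanning set; its Matrix.nullspace returns a basis of NulA parametrized by the free variables. A sketch:

    from sympy import Matrix

    A = Matrix([[-3, 6, -1, 1, -7],
                [1, -2, 2, 3, -1],
                [2, -4, 5, 8, -4]])

    for vec in A.nullspace():   # the vectors u, v, w found above
        print(vec.T)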

The Column Space of a Matrix.

Definition. The column space of an m × n matrix A = [a1, a2, . . . , an] is the set of all linear combinations of its columns, that is,

  ColA = Span{a1, a2, . . . , an} = {b : b = Ax for some x ∈ Rn}.

Theorem 3 on page 229. ColA is a subspace of Rm.
Proof: exercise.

Example 4 on pages 229 – 230. Find a matrix A such that W = ColA
{ 6a − b }
where W =  
a + b : a, b ∈ R .
−7a
Solution:
       
{ 6 −1 } { 6 −1 }
   
W = a 1 + b 1 : a, b ∈ R = Span  1 , 1 .
 
−7 0 −7 0
Thus, the matrix is  
6 −1
A= 1 1 .
−7 0
Kernel and Range of a Linear Transformation.

Recall that a linear transformation preserves vector addition and multiplication by a scalar, that is, T(u + v) = T(u) + T(v) and T(cu) = cT(u).

Definition. The kernel of a linear transformation is the set of vectors that are mapped into zero. That is, for T : V → W the kernel is the set of all v ∈ V such that T(v) = 0.

Definition. The range of a linear transformation is the image under the mapping, that is, the range of T is the set of all vectors w ∈ W such that w = T(v) for some v ∈ V.

Picture.

Remark. Recall that any linear transformation is a matrix transformation, that is, T(x) = Ax for some A. Therefore, by definition, the kernel of T is the null space of A, and the range of T is the column space of A.

4.3. Bases.

Definition. A set of vectors B = {b1, . . . , bp} is a basis for a vector space V if (i) B is a linearly independent set, and (ii) the vector space is spanned by B, that is, V = Span{b1, . . . , bp}.

Example 3 on page 238. Let A be an invertible n × n matrix. Then the columns of A form a basis for Rn because they are linearly independent and they span Rn. Explain more.

Example 4 on page 238. The set

  e1 = [ 1  ],   e2 = [ 0  ],   . . . ,   en = [ 0  ]
       [ 0  ]         [ 1  ]                   [ ...]
       [ ...]         [ ...]                   [ 0  ]
       [ 0  ]         [ 0  ]                   [ 1  ]

is called the standard basis for Rn. Picture.


   
Example 5 on page 238. Determine whether

  u = [ 3  ],   v = [ −4 ],   w = [ −2 ]
      [ 0  ]        [ 1  ]        [ 1  ]
      [ −6 ]        [ 7  ]        [ 5  ]

form a basis for R3.

Solution: These vectors are linearly independent iff

  det [ 3  −4 −2 ]
      [ 0  1  1  ] ̸= 0,
      [ −6 7  5  ]

which is true. Thus, by Example 3, this set is a basis.

Example 6 on pages 238 – 239. Show that {1, t, t2, . . . , tn} forms a basis, called the standard basis, for Pn, the set of all polynomials of degree at most n.

Theorem 5 on page 239 (The Spanning Set Theorem).

Suppose a vector space V = Span{v1, . . . , vp}. If one of the vectors, say v1, is a linear combination of the others, then the set with this vector removed still spans V, that is, V = Span{v2, . . . , vp}. The set with all such vectors removed is a basis for V, since it contains only linearly independent vectors spanning V.
     
Example 7 on page 239. Let

  v1 = [ 0  ],   v2 = [ 2 ],   v3 = [ 6  ],
       [ 2  ]         [ 2 ]         [ 16 ]
       [ −1 ]         [ 0 ]         [ −5 ]

and suppose V = Span{v1, v2, v3}. Find a basis for V.
Solution: Note that v3 = 5v1 + 3v2. By the theorem, {v1, v2} is a basis for V.

Bases for NulA.

Recall Example 3 on page 228, discussed in the last lecture. We learned how to produce a linearly independent set of vectors that spans the null space of a matrix. That is, we learned how to find a basis.

Bases for ColA.

Example 8 on page 240. Find a basis for ColB where

  B = [ 1 4 0 2  0 ]
      [ 0 0 1 −1 0 ]
      [ 0 0 0 0  1 ] = [b1 . . . b5].
      [ 0 0 0 0  0 ]

Solution: Each nonpivot column of B is a linear combination of the pivot columns. In fact, the pivot columns are b1, b3, and b5, and the nonpivot columns are b2 = 4b1 and b4 = 2b1 − b3. By the Spanning Set Theorem, we may discard the nonpivot columns. The pivot columns are linearly independent and form a basis for ColB.

Notice that the matrix B is in the reduced row echelon form. Suppose a
matrix is not in this form. How to find a basis for its column space?

Theorem 6 on page 241. (i) Elementary row operations on a matrix do not affect the linear dependence relations among the columns of the matrix.
(ii) The pivot columns of a matrix form a basis for its column space.

Example 9 on page 241. Find a basis for ColA where

  A = [ 1 4  0 2 −1 ]
      [ 3 12 1 5 5  ]
      [ 2 8  1 3 2  ].
      [ 5 20 2 8 8  ]

Solution: It can be shown that A is row equivalent to the matrix B in Example 8. Thus, the columns a1, a3, and a5 are the pivot columns and form a basis.

4.4. Coordinate Systems.

Theorem 7 on page 246 (The Unique Representation Theorem). Let B = {b1, . . . , bn} be a basis for a vector space V. Then for any x ∈ V, there exists a unique set of scalars c1, . . . , cn such that x = c1b1 + · · · + cnbn.

Proof: by contradiction, use linear independence.

Definition. The coordinates of x relative to a basis B = {b1, . . . , bn} (or B-coordinates of x) are the weights c1, . . . , cn such that x = c1b1 + · · · + cnbn.

Definition. The vector in Rn

  [x]B = [ c1 ]
         [ ...]
         [ cn ]

is the coordinate vector of x (relative to B) or the B-coordinate vector of x. The mapping x 7→ [x]B is the coordinate mapping determined by B.
Examples 1 and 2 on page 247. Consider B = {b1, b2}, where

  b1 = [ 1 ],   b2 = [ 1 ].
       [ 0 ]         [ 2 ]

It is a basis for R2. Suppose a vector x ∈ R2 has the coordinate vector

  [x]B = [ −2 ].
         [ 3  ]

Find x.

Solution:

  x = c1b1 + c2b2 = −2b1 + 3b2 = [ 1 ].
                                 [ 6 ]

Note that the entries of this vector are the coordinates of x relative to the standard basis {e1, e2}. Indeed,

  x = [ 1 ] = 1 · [ 1 ] + 6 · [ 0 ] = 1 · e1 + 6 · e2.
      [ 6 ]       [ 0 ]       [ 1 ]

Graphically, the coordinates can be interpreted in the following way. Plot the standard basis and x, then plot B. On the B-graph paper, x has coordinates (−2, 3).
Example 4 on page 249. Let

  b1 = [ 2 ],   b2 = [ −1 ],   x = [ 4 ].
       [ 1 ]         [ 1  ]        [ 5 ]

Find the B-coordinates of x.
Solution: [x]B = (c1, c2), where

  x = [ 4 ] = c1b1 + c2b2 = c1 [ 2 ] + c2 [ −1 ].
      [ 5 ]                    [ 1 ]      [ 1  ]

Solving the system, we obtain c1 = 3, c2 = 2.
Note that the augmented matrix is [b1 b2 x]. This can be generalized to Rn.
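Numerically, finding the B-coordinates of x is just solving a linear system whose coefficient matrix has b1 and b2 as its columns. A numpy sketch for this example:

    import numpy as np

    PB = np.array([[2., -1.],
                   [1., 1.]])    # columns are b1 and b2
    x = np.array([4., 5.])

    c = np.linalg.solve(PB, x)   # solve PB [x]_B = x
    print(c)                     # [3. 2.]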
Definition. Consider a basis B = {b1, . . . , bn} in Rn. Let PB = [b1 . . . bn]. The change-of-coordinates equation x = c1b1 + · · · + cnbn is equivalent to x = PB[x]B. The matrix PB is called the change-of-coordinates matrix from B to the standard basis in Rn. The coordinate mapping from the standard
basis to B is given by [x]B = PB−1 x. Note that PB is invertible since its
columns are linearly independent.

The Coordinate Mapping.

Definition (on page 87). A mapping T : U → V is said to be onto V if each vector in V is the image of at least one vector in U.
Picture.

Definition (on page 87). A mapping T : U → V is said to be one-to-one if each vector in V is the image of at most one vector in U.
Picture.

Theorem 8 on page 250. Let B = {b1, . . . , bn} be a basis for a vector space V. Then the coordinate mapping x 7→ [x]B is a one-to-one linear transformation from V onto Rn.
Proof: The coordinate mapping preserves vector addition and scalar multiplication (show!). Hence it is a linear transformation. Show the rest.

Definition. A one-to-one linear transformation from a vector space V onto a vector space W is called an isomorphism between V and W. The vector spaces are called isomorphic. The word “isomorphism” is from the Greek “iso” = “the same” and “morph” = “structure”. The name reflects the fact that two isomorphic spaces are indistinguishable as vector spaces. Every vector space calculation in V is accurately reproduced in W, and vice versa.

The above theorem states that the possibly unfamiliar space V is isomor-
phic to the familiar Rn .

Example 5 on page 251. Let B = {1, t, t2, t3} be the standard basis for P3. A typical vector in P3 has the form p(t) = a0 + a1t + a2t2 + a3t3, which corresponds to the coordinate vector

  [p]B = [ a0 ]
         [ a1 ]
         [ a2 ]
         [ a3 ]

in R4. By the theorem, the coordinate mapping p 7→ [p]B is an isomorphism between P3 and R4.

4.5. The Dimension of a Vector Space.

Theorem 9 on page 256. If a vector space V has a basis B = {b1, . . . , bn}, then any set in V containing more than n vectors must be linearly dependent. This implies that any linearly independent set in V has no more than n vectors.
Proof: Suppose {u1 , . . . , up } is a set in V and p > n. Then the set
{[u1 ]B , . . . , [up ]B } is a linearly dependent set in Rn since there are more

vectors (p) than entries (n) in each vector. Therefore, there exist scalars
c1 , . . . , cp , not all zeros, such that
 
  [ 0  ]
  [ ...] = c1[u1]B + · · · + cp[up]B
  [ 0  ]

= {the coordinate mapping is a linear transformation} = [c1 u1 +· · ·+cp up ]B .


Hence, c1u1 + · · · + cpup = 0. The scalars are not all zero; therefore, the set {u1, . . . , up} is linearly dependent. □

Theorem 10 on page 257. If a vector space V has a basis consisting of n vectors, then every basis in V must have exactly n vectors.
Proof: Let B1 be a basis consisting of n vectors, and B2 be any other basis. Since B1 is a basis and B2 is linearly independent, B2 has no more than n vectors by Theorem 9. Also, since B2 is a basis and B1 is linearly independent, B2 has at least n vectors. Thus, B2 consists of exactly n vectors. □

Definition. The dimension of a vector space V, dimV, is the number of vectors in a basis of V. The dimension of the zero vector space {0} is defined to be zero. If V is not spanned by a finite number of vectors, then dimV = ∞.

Remark. This is a consistent definition since, by Theorem 10, all bases in V have the same number of vectors.

Example 1 on page 257. dimRn = n since the standard basis consists of n vectors. dimPn = n + 1 since the set {1, t, t2, . . . , tn} is a basis. dimP = ∞.

Example 3 on page 258. Find the dimension of the subspace of R4 consisting of all vectors of the form

  [ a − 3b + 6c ]
  [ 5a + 4d     ],   a, b, c, d ∈ R.
  [ b − 2c − d  ]
  [ 5d          ]

Solution: Note that H = Span{v1, v2, v3, v4}, where

  v1 = [ 1 ],   v2 = [ −3 ],   v3 = [ 6  ],   v4 = [ 0  ].
       [ 5 ]         [ 0  ]         [ 0  ]         [ 4  ]
       [ 0 ]         [ 1  ]         [ −2 ]         [ −1 ]
       [ 0 ]         [ 0  ]         [ 0  ]         [ 5  ]

Since v3 = −2v2 and the other vectors are linearly independent, H = Span{v1, v2, v4}. Thus, dimH = 3.

Theorem 12 on page 259 (The Basis Theorem). Let V be a p-dimensional vector space, p ≥ 1. Any linearly independent set of exactly p vectors in V is a basis in V. Any set of exactly p vectors that spans V is a basis in V.

Proposition. The dimension of NulA is the number of free variables in the equation Ax = 0. The dimension of ColA is the number of pivot columns in A.

Example 5 on page 260. Find the dimensions of the null space and the column space of

  A = [ −3 6  −1 1 −7 ]
      [ 1  −2 2  3 −1 ].
      [ 2  −4 5  8 −4 ]

Solution: Row reduce the augmented matrix [A 0] to echelon form:

  [ 1 −2 2 3 −1 0 ]
  [ 0 0  1 2 −2 0 ].
  [ 0 0  0 0 0  0 ]

There are three free variables: x2, x4, and x5. Hence dimNulA = 3. Also, dimColA = 2 because A has two pivot columns.

4.6. Rank.

Definition. The rank of a matrix A is the dimension of the column space of A. That is, rankA = dimColA.

Theorem 14 on page 265 (The Rank Theorem). For an m × n matrix A,

  rankA + dimNulA = n.

“Proof:” The rank of A is the number of pivot columns. They correspond to the non-free variables. The other variables are free. □

Note that in Example 5, rankA + dimNulA = 2 + 3 = 5.
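The Rank Theorem is easy to verify in sympy on the matrix of Example 5. A sketch:

    from sympy import Matrix

    A = Matrix([[-3, 6, -1, 1, -7],
                [1, -2, 2, 3, -1],
                [2, -4, 5, 8, -4]])

    rank = A.rank()
    dim_nul = len(A.nullspace())
    print(rank, dim_nul, rank + dim_nul)   # 2 3 5, and n = 5 columns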

5.1. Eigenvectors and Eigenvalues.

Definition. An eigenvector of an n × n matrix A is a nonzero vector x such that Ax = λx, where the scalar λ is called the eigenvalue of A corresponding to the eigenvector x.
In other words, the matrix transformation x 7→ Ax stretches or shrinks x.

Definition. The eigenspace of a matrix corresponding to eigenvalue λ is the space consisting of the zero vector and all the eigenvectors corresponding to λ.
Example 2 on page 303. Let

  A = [ 1 6 ],   u = [ 6  ],   and   v = [ 3  ].
      [ 5 2 ]        [ −5 ]             [ −2 ]

Are u and v eigenvectors of A?
Solution: Au = −4u, and Av ̸= λv for any λ.
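A numpy check of the same computation (eigvals returns the eigenvalues, here −4 and 7 in some order):

    import numpy as np

    A = np.array([[1., 6.],
                  [5., 2.]])
    u = np.array([6., -5.])
    v = np.array([3., -2.])

    print(A @ u, -4 * u)          # equal: u is an eigenvector for lambda = -4
    print(A @ v)                  # [-9. 11.] is not a multiple of v
    print(np.linalg.eigvals(A))   # -4 and 7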

5.2. The Characteristic Equation.

To find eigenvalues of a matrix A, one has to find all λ’s such that the
equation Ax = λx or, equivalently, (A − λI)x = 0 has a nontrivial solution.
This problem is equivalent to finding all λ’s such that the matrix A − λI is
not invertible, which is equivalent to solving the characteristic equation of
A, det(A − λI) = 0.

Definition. If A is an n × n matrix, then det(A − λI) is a polynomial of degree n called the characteristic polynomial of A.

Example 3 on pages 313 – 314. Find the characteristic equation of

  A = [ 5 −2 6  −1 ]
      [ 0 3  −8 0  ]
      [ 0 0  5  4  ].
      [ 0 0  0  1  ]

Answer: The characteristic equation is (λ − 1)(λ − 3)(λ − 5)² = 0.

Definition. The multiplicity of an eigenvalue λ1 is its multiplicity as a root of the characteristic equation, that is, if the characteristic polynomial of A has the factor (λ − λ1)^k, then the multiplicity of λ1 is k.

In our example, λ1 = 1 and λ2 = 3 have multiplicity 1, while λ3 = 5 has multiplicity 2.
 
Example. Find the eigenvalues and eigenvectors of

  A = [ 1 5 0  ]
      [ 0 4 −1 ].
      [ 0 0 −2 ]

Solution: Solve

  0 = det [ 1−λ 5   0    ]
          [ 0   4−λ −1   ] = (1 − λ)(4 − λ)(−2 − λ).
          [ 0   0   −2−λ ]

The solutions are λ1 = −2, λ2 = 1, and λ3 = 4. Note that the eigenvalues of a triangular matrix are the diagonal entries.
The eigenvector x1 corresponding to λ1 = −2 solves (A + 2I)x = 0. The augmented matrix of this matrix equation reduces as

  [ 3 5 0  0 ]        [ 1 0 5/18 0 ]
  [ 0 6 −1 0 ]  =⇒    [ 0 1 −1/6 0 ].
  [ 0 0 0  0 ]        [ 0 0 0    0 ]

The solution is

  x1 = [ −(5/18)a ]
       [ (1/6)a   ]
       [ a        ]

where a is free.
Eigenvector x2 solves (A − I)x = 0. The augmented matrix reduces as

  [ 0 5 0  0 ]        [ 0 1 0 0 ]
  [ 0 3 −1 0 ]  =⇒    [ 0 0 1 0 ].
  [ 0 0 −3 0 ]        [ 0 0 0 0 ]

The solution is

  x2 = [ b ]
       [ 0 ]
       [ 0 ]

where b is free.
Eigenvector x3 solves (A − 4I)x = 0. The augmented matrix reduces as

  [ −3 5 0  0 ]        [ 1 −5/3 0 0 ]
  [ 0  0 −1 0 ]  =⇒    [ 0 0    1 0 ].
  [ 0  0 −6 0 ]        [ 0 0    0 0 ]

The solution is

  x3 = [ (5/3)c ]
       [ c      ]
       [ 0      ]

where c is free. For example, we may take

  x1 = [ −5 ],   x2 = [ 1 ],   and   x3 = [ 5 ].
       [ 3  ]         [ 0 ]               [ 3 ]
       [ 18 ]         [ 0 ]               [ 0 ]
Theorem 2 on page 307. If v1 , . . . , vr are eigenvectors corresponding to
distinct eigenvalues λ1 , . . . , λr of a matrix A, then the vectors are linearly
independent.

Indeed, in our example, consider the linear combination

  αx1 + βx2 + γx3 = α [ −5 ] + β [ 1 ] + γ [ 5 ] = 0   ⇐⇒   α = β = γ = 0.
                      [ 3  ]     [ 0 ]     [ 3 ]
                      [ 18 ]     [ 0 ]     [ 0 ]
Exercise 14 on page 317. Find the eigenvalues and eigenvectors of

  A = [ 5 −2 3  ]
      [ 0 1  0  ].
      [ 6 7  −2 ]

Solution: det(A − λI) = −(λ + 4)(λ − 1)(λ − 7) = 0; thus, the eigenvalues are λ1 = −4, λ2 = 1, and λ3 = 7. The eigenvector x1 solves

  [ 9 −2 3 0 ]              [ −(1/3)a ]
  [ 0 5  0 0 ]  =⇒   x1 =   [ 0       ].
  [ 6 7  2 0 ]              [ a       ]

The eigenvector x2 solves

  [ 4 −2 3  0 ]              [ −(3/8)b ]
  [ 0 0  0  0 ]  =⇒   x2 =   [ (3/4)b  ].
  [ 6 7  −3 0 ]              [ b       ]

The eigenvector x3 solves

  [ −2 −2 3  0 ]              [ (3/2)c ]
  [ 0  −6 0  0 ]  =⇒   x3 =   [ 0      ].
  [ 6  7  −9 0 ]              [ c      ]

For example, we can take

  x1 = [ −1 ],   x2 = [ −3 ],   and   x3 = [ 3 ].
       [ 0  ]         [ 6  ]               [ 0 ]
       [ 3  ]         [ 8  ]               [ 2 ]

Again, they are linearly independent.

5.3. Diagonalization.

Definition. A square matrix A is diagonalizable if there exist an invertible matrix P and a diagonal matrix D such that A = PDP−1. The matrix A is said to be similar to D.
Example 2 on page 320. Let

  A = [ 7  2 ],   P = [ 1  1  ],   and   D = [ 5 0 ].
      [ −4 1 ]        [ −1 −2 ]             [ 0 3 ]

Check that A = PDP−1 and find A3.
Solution: Check that AP = PD. Then

  A3 = PD3P−1 = [ 1  1  ] [ 125 0  ] [ 2  1  ] = [ 223  98  ].
                [ −1 −2 ] [ 0   27 ] [ −1 −1 ]   [ −196 −71 ]
Theorem 5 on page 320 (The Diagonalization Theorem). An n × n matrix A is diagonalizable iff it has n linearly independent eigenvectors. In fact, A = PDP−1 iff the columns of P are n linearly independent eigenvectors of A, and the diagonal entries of D are the corresponding eigenvalues of A.

In our example, det(A − λI) = (7 − λ)(1 − λ) + 8 = (λ − 5)(λ − 3) = 0, etc.

Example 3 on pages 321 – 322. If possible, diagonalize the matrix

  A = [ 1  3  3  ]
      [ −3 −5 −3 ].
      [ 3  3  1  ]

Solution: The characteristic equation is −(λ − 1)(λ + 2)² = 0. Hence, the eigenvalues are 1 and −2. Three linearly independent eigenvectors are

  v1 = [ 1  ],   v2 = [ −1 ],   and   v3 = [ −1 ].
       [ −1 ]         [ 1  ]               [ 0  ]
       [ 1  ]         [ 0  ]               [ 1  ]
 
Example 4 on page 323. Diagonalize

  A = [ 2  4  3  ]
      [ −4 −6 −3 ].
      [ 3  3  1  ]

Solution: The characteristic equation is −(λ − 1)(λ + 2)² = 0. Hence, the eigenvalues are 1 and −2. We can find only two linearly independent eigenvectors,

  v1 = [ 1  ]   and   v2 = [ −1 ],
       [ −1 ]             [ 1  ]
       [ 1  ]             [ 0  ]

so A is not diagonalizable.
Theorem 6 on page 323. An n × n matrix with n distinct eigenvalues
is diagonalizable.

5.4. Eigenvectors and Linear Transformations.

Suppose T : V → W where dimV = n and dimW = m. Let B = {b1, . . . , bn} and C be bases in V and W, respectively. Given x ∈ V, the coordinate vector [x]B is in Rn, and the coordinate vector of T(x) ∈ W, [T(x)]C, is in Rm. The connection between [x]B and [T(x)]C is as follows. If x = r1b1 + · · · + rnbn, then [x]B = (r1, . . . , rn) and T(x) = r1T(b1) + · · · + rnT(bn) because T is linear. In terms of the C-coordinate vectors, [T(x)]C = r1[T(b1)]C + · · · + rn[T(bn)]C. Hence,

  [T(x)]C = M[x]B,

where M = [[T(b1)]C . . . [T(bn)]C] is the matrix representation of T, called the matrix for T relative to the bases B and C.

Example 1 on page 329. Let B = {b1, b2} and C = {c1, c2, c3}, and suppose T(b1) = 3c1 − 2c2 + 5c3 and T(b2) = 4c1 + 7c2 − c3. Then the matrix for T relative to B and C is

  M = [ [T(b1)]C [T(b2)]C ] = [ 3  4  ]
                              [ −2 7  ].
                              [ 5  −1 ]
Definition. If T : V → V and the basis C is the same as B, then the
matrix M is called the matrix for T relative to B (or the B-matrix for T ),
and is denoted by [T ]B .

Example 2 on page 329. Consider differentiation in P2, defined as T : P2 → P2 such that T(a0 + a1t + a2t2) = a1 + 2a2t. Suppose B = {1, t, t2}. Since T(1) = 0, T(t) = 1, and T(t2) = 2t, the B-matrix for T is

  [T]B = [ 0 1 0 ]
         [ 0 0 2 ].
         [ 0 0 0 ]

For a general p(t) = a0 + a1t + a2t2, the coordinate vector is [p]B = (a0, a1, a2), and the image is

  [T(p)]B = [a1 + 2a2t]B = [ a1  ]   [ 0 1 0 ] [ a0 ]
                           [ 2a2 ] = [ 0 0 2 ] [ a1 ] = [T]B [p]B.
                           [ 0   ]   [ 0 0 0 ] [ a2 ]

Theorem 8 on page 331. Suppose A = PDP−1 where D is a diagonal n × n matrix. If B is the basis for Rn formed from the columns of P, then D is the B-matrix for the transformation T(x) = Ax.
Example 3 on page 331. Let

  A = [ 7  2 ].   Then   P = [ 1  1  ]   and   D = [ 5 0 ].
      [ −4 1 ]              [ −1 −2 ]            [ 0 3 ]

D is the B-matrix for T when B consists of the columns of P, that is, B = {(1, −1), (1, −2)}.
6.1. Inner Product, Length, and Orthogonality.
   
Definition. If

  u = [ u1 ]        v = [ v1 ]
      [ ...]  and       [ ...]
      [ un ]            [ vn ]

are two vectors in Rn, then the inner product or dot product is the number

  u · v = uTv = u1v1 + · · · + unvn.
   
Example. Compute the dot product of

  u = [ 1  ]   and   v = [ 5  ].
      [ −3 ]             [ 0  ]
      [ 2  ]             [ −2 ]
Theorem 1 on page 376. The dot product has the following properties:
(a) u · v = v · u (commutative law)
(b) (u + v) · w = u · w + v · w (distributive law)
(c) (cu) · v = c(u · v) = u · (cv) (associative law)
(d) u · u ≥ 0, ∀u, and u · u = 0 iff u = 0
Definition. The length (or norm) of a vector v is ∥v∥ = √(v · v) = √(v1² + · · · + vn²).
Example.

Definition. The distance between u and v is dist(u, v) = ∥u − v∥.


Example. Picture.

Definition. Vectors u and v are orthogonal iff u · v = 0.

Theorem 2 on page 380 (The Pythagorean Theorem). Vectors u and v are orthogonal iff ∥u + v∥² = ∥u∥² + ∥v∥².
Proof:
Proof:

Definition. Let W be a vector space. The orthogonal complement of W, W⊥, is the set of all vectors that are orthogonal to every vector in W.

Example. Let W be the x1x2-plane in R3. Then W⊥ is the x3-axis, a line through the origin.

6.2. Orthogonal Sets.

Definition. A set of vectors {u1, . . . , up} is an orthogonal set if ui · uj = 0 for any i ̸= j.

Example 1 on page 384. The set of vectors

  [ 3 ],   [ −1 ],   [ −1/2 ]
  [ 1 ]    [ 2  ]    [ −2   ]
  [ 1 ]    [ 1  ]    [ 7/2  ]

is an orthogonal set.

Theorem 4 on page 384. If S = {u1, . . . , up} is an orthogonal set of nonzero vectors, then S is linearly independent and therefore is a basis for the subspace spanned by S.
Proof: The vector equation 0 = c1u1 + · · · + cpup has only the trivial solution. Indeed, 0 = 0 · u1 = (c1u1 + · · · + cpup) · u1 = c1(u1 · u1), hence c1 = 0, etc.

Theorem 5 on page 385. Let {u1, . . . , up} be an orthogonal basis for a vector space W. Then for any y ∈ W,

  y = Σ_{i=1}^{p} ((y · ui)/(ui · ui)) ui.

Proof: y · u1 = c1(u1 · u1), etc.
 
Example 2 on pages 385 – 386. Let

  y = [ 6  ].
      [ 1  ]
      [ −8 ]

Write y as a linear combination of the vectors in S in Example 1. Answer: y = u1 − 2u2 − 2u3.

Definition. Let L = Span{u} for some vector u ∈ Rn. Take any y ̸∈ L. The orthogonal projection of y onto L is

  ŷ = projL y = ((y · u)/(u · u)) u.

Example. u = [ 1 ].
             [ 0 ]
Definition. A set of vectors is orthonormal if it is an orthogonal set of
unit vectors.
Example.

Theorem 6 on page 390. A matrix U has orthonormal columns iff UTU = I.

Theorem 7 on page 390. If U has orthonormal columns, then (Ux) · (Uy) = x · y. In particular, ∥Ux∥ = ∥x∥.

6.3. Orthogonal Projections.

Theorem 8 on page 395 (The Orthogonal Decomposition Theorem). Let W be a subspace of Rn. Then any y ∈ Rn can be written uniquely as

  y = ŷ + z

where ŷ ∈ W and z ∈ W⊥. In fact, if {u1, . . . , up} is an orthogonal basis for W, then

  ŷ = Σ_{i=1}^{p} ((y · ui)/(ui · ui)) ui

and z = y − ŷ.

Picture.
 
 
Example 2 on pages 396 – 397. Let

  u1 = [ 2  ],   u2 = [ −2 ],   and   y = [ 1 ].
       [ 5  ]         [ 1  ]             [ 2 ]
       [ −1 ]         [ 1  ]             [ 3 ]

Then the decomposition of y is

  [ 1 ]   [ −2/5 ]   [ 7/5  ]
  [ 2 ] = [ 2    ] + [ 0    ].
  [ 3 ]   [ 1/5  ]   [ 14/5 ]
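Over an orthogonal basis, the projection formula is a one-line sum. A Python sketch (the helper proj_W is ours) reproducing this decomposition:

    import numpy as np

    def proj_W(y, basis):
        """Orthogonal projection of y onto W, given an orthogonal basis of W."""
        return sum((y @ u) / (u @ u) * u for u in basis)

    u1 = np.array([2., 5., -1.])
    u2 = np.array([-2., 1., 1.])
    y = np.array([1., 2., 3.])

    y_hat = proj_W(y, [u1, u2])
    print(y_hat)       # [-0.4, 2.0, 0.2]
    print(y - y_hat)   # [1.4, 0.0, 2.8], orthogonal to u1 and u2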

Theorem 9 on page 398. The orthogonal projection ŷ of a vector y onto W is the closest vector in W to y, in the sense that dist(y, ŷ) = ∥y − ŷ∥ = min_{v∈W} ∥y − v∥.

Picture.

Definition. The distance between y and W is the distance between y and the closest vector in W, that is, dist(y, W) = ∥y − projW y∥.

Example 4 on page 399. Find the distance between

  y = [ −1 ]        u1 = [ 5  ]        u2 = [ 1  ]
      [ −5 ]  and        [ −2 ],            [ 2  ],
      [ 10 ]             [ 1  ]             [ −1 ]

where W = Span{u1, u2}.
Solution:

  ŷ = (1/2)u1 − (7/2)u2 = [ −1 ],   y − ŷ = [ 0 ],   ∥y − ŷ∥ = 3√5.
                          [ −8 ]            [ 3 ]
                          [ 4  ]            [ 6 ]

Theorem 10 on page 399. If {u1, . . . , up} is an orthonormal basis in W, then projW y = Σ_{i=1}^{p} (y · ui)ui. If U = [u1 · · · up], then projW y = UUTy.

6.4. The Gram – Schmidt Orthogonalization Process.

Theorem 11 on page 404 (The Gram – Schmidt Process). Suppose {x1, . . . , xp} is a basis for W. The following algorithm produces an orthogonal basis for W. Take

  v1 = x1,
  v2 = x2 − ((x2 · v1)/(v1 · v1)) v1,
  . . . ,
  vp = xp − ((xp · v1)/(v1 · v1)) v1 − ((xp · v2)/(v2 · v2)) v2 − · · · − ((xp · vp−1)/(vp−1 · vp−1)) vp−1.

The set {v1, . . . , vp} is an orthogonal basis for W.

Example 2 on pages 402 – 403. Suppose

  x1 = [ 1 ],   x2 = [ 0 ],   x3 = [ 0 ].
       [ 1 ]         [ 1 ]         [ 0 ]
       [ 1 ]         [ 1 ]         [ 1 ]
       [ 1 ]         [ 1 ]         [ 1 ]

Then

  v1 = [ 1 ],   v2 = [ −3/4 ],   v3 = [ 0    ].
       [ 1 ]         [ 1/4  ]         [ −2/3 ]
       [ 1 ]         [ 1/4  ]         [ 1/3  ]
       [ 1 ]         [ 1/4  ]         [ 1/3  ]
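The process translates directly into code. A minimal numpy sketch of the Gram – Schmidt process (no normalization), reproducing the example:

    import numpy as np

    def gram_schmidt(xs):
        """Produce an orthogonal basis from the basis xs (Theorem 11)."""
        vs = []
        for x in xs:
            v = x - sum((x @ w) / (w @ w) * w for w in vs)
            vs.append(v)
        return vs

    xs = [np.array([1., 1., 1., 1.]),
          np.array([0., 1., 1., 1.]),
          np.array([0., 0., 1., 1.])]

    for v in gram_schmidt(xs):
        print(v)   # v1; then [-0.75, 0.25, 0.25, 0.25]; then [0, -2/3, 1/3, 1/3]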

6.7. Inner Product Spaces.

Definition. An inner product on a vector space V is a function that, for any u and v ∈ V, assigns a real number ⟨u, v⟩ satisfying the following properties:
1. ⟨u, v⟩ = ⟨v, u⟩ (commutative law)
2. ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩ (distributive law)
3. ⟨cu, v⟩ = c⟨u, v⟩
4. ⟨u, u⟩ ≥ 0 and ⟨u, u⟩ = 0 iff u = 0.
A vector space with an inner product is called an inner product space.
Example 1 on page 428. Let u = (u1, u2) and v = (v1, v2). Show that ⟨u, v⟩ = 4u1v1 + 5u2v2 is an inner product.

Generally, ⟨u, v⟩ = a1u1v1 + · · · + anunvn (with all ai > 0) defines an inner product in Rn.

Example 2 on page 429. Fix distinct t0, . . . , tn ∈ R. For any p and q ∈ Pn define ⟨p, q⟩ = Σ_{i=0}^{n} p(ti)q(ti). Show that it is an inner product.
Solution: ⟨p, p⟩ = 0 iff the polynomial vanishes at n + 1 points iff it is the zero polynomial, since its degree is at most n.

Example 3 on page 429. Let t0 = 0, t1 = 1/2, t2 = 1, and let p(t) = 12t2 and q(t) = 2t − 1. Then ⟨p, q⟩ = p(0)q(0) + p(1/2)q(1/2) + p(1)q(1) = 12.

Definition. The length (or norm) of a vector v in an inner product space V is ∥v∥ = √⟨v, v⟩.

Example 4 on page 429. In Example 3, ∥p∥ = √153 and ∥q∥ = √2.

The Gram – Schmidt Orthogonalization process is applicable to an inner


product space.

Example 5 on pages 430 – 431. Consider P2 with the inner product


⟨p, q⟩ = p(−1)q(−1) + p(0)q(0) + p(1)q(1). Take {1, t, t2 }, the standard
basis in P2. Orthogonalize this basis.
Solution: For any polynomial p ∈ P2, consider the vector of values (p(−1), p(0), p(1))ᵀ ∈ R³. The standard basis corresponds to the set {v1 = (1, 1, 1)ᵀ, v2 = (−1, 0, 1)ᵀ, v3 = (1, 0, 1)ᵀ}. Notice that x1 = v1 and x2 = v2 are orthogonal. Apply the Gram – Schmidt process to find
x3 = v3 − [⟨v3, v1⟩/⟨v1, v1⟩] v1 − [⟨v3, v2⟩/⟨v2, v2⟩] v2 = (1, 0, 1)ᵀ − (2/3)(1, 1, 1)ᵀ − 0 · v2 = (1/3, −2/3, 1/3)ᵀ.
This vector corresponds to the polynomial a + bt + ct² ∈ P2 whose coefficients satisfy
a − b + c = 1/3, a = −2/3, a + b + c = 1/3,
which gives a = −2/3, b = 0, c = 1. Thus, {1, t, t² − 2/3} is an orthogonal basis for P2. Dividing each vector by its norm (∥1∥ = √3, ∥t∥ = √2, ∥t² − 2/3∥ = √(2/3)) yields the orthonormal basis {1/√3, t/√2, √(3/2)(t² − 2/3)}.
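The same computation can be done numerically by representing each polynomial by its vector of values at t = −1, 0, 1, so that ⟨p, q⟩ becomes an ordinary dot product (a sketch):

```python
# Sketch: Gram-Schmidt in the inner product space of Example 5, with
# polynomials stored as their values at the evaluation points.
import numpy as np

ts = np.array([-1.0, 0.0, 1.0])
basis = [np.ones(3), ts, ts**2]   # value vectors of 1, t, t^2

vs = []
for x in basis:
    v = x - sum((x @ w) / (w @ w) * w for w in vs)
    vs.append(v)
print(vs[2])   # [ 0.3333 -0.6667  0.3333] -- the values of t^2 - 2/3
```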

Theorem 16 on page 432 (The Cauchy – Schwarz Inequality). For


any u and v,
|⟨u, v⟩| ≤ ∥u∥∥v∥.

Three applications of the Cauchy – Schwarz Inequality.

1. Theorem 17 on page 433 (The Triangle Inequality). For any u
and v,
∥u + v∥ ≤ ∥u∥ + ∥v∥.
Picture.
Proof: ∥u + v∥² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + 2⟨u, v⟩ + ⟨v, v⟩ ≤ ∥u∥² + 2|⟨u, v⟩| + ∥v∥² ≤ ∥u∥² + 2∥u∥∥v∥ + ∥v∥² = (∥u∥ + ∥v∥)², where the second inequality is Cauchy – Schwarz. Taking square roots completes the proof. □

2. Exercise 19 on page 436. Show that √(ab) ≤ (a + b)/2 for a, b ≥ 0, that is, the geometric mean is less than or equal to the arithmetic mean.
Solution: Consider u = (√a, √b)ᵀ and v = (√b, √a)ᵀ. The inner product is ⟨u, v⟩ = 2√(ab). The norms are ∥u∥ = ∥v∥ = √(a + b). By the Cauchy – Schwarz inequality, 2√(ab) ≤ a + b.

3. Exercise 20 on page 436. Show that (a + b)² ≤ 2a² + 2b².
Solution: Consider u = (a, b)ᵀ and v = (1, 1)ᵀ. The inner product is ⟨u, v⟩ = a + b. The norms are ∥u∥ = √(a² + b²) and ∥v∥ = √2. By the Cauchy – Schwarz inequality, |a + b| ≤ √2 · √(a² + b²); squaring both sides gives (a + b)² ≤ 2a² + 2b².
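Both exercises can be spot-checked numerically (a sketch; the tolerances only absorb floating-point rounding):

```python
# Sketch: sample random nonnegative a, b and confirm both inequalities.
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    a, b = rng.uniform(0, 100, size=2)
    assert np.sqrt(a * b) <= (a + b) / 2 + 1e-12   # Exercise 19
    assert (a + b)**2 <= 2*a**2 + 2*b**2 + 1e-9    # Exercise 20
print("both inequalities hold on all samples")
```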

7.1. Diagonalization of Symmetric Matrices.

Definition. A square matrix A is symmetric iff A = Aᵀ.

Example.

Theorem 1 on page 450. If A is symmetric, then any two eigenvectors


from different eigenspaces are orthogonal.
Proof: Let v1 and v2 be two eigenvectors corresponding to two distinct
eigenvalues λ1 and λ2 , respectively. We want to show that v1 · v2 = 0.
We have λ1(v1 · v2) = (λ1v1) · v2 = (λ1v1)ᵀv2 = (Av1)ᵀv2 = v1ᵀAᵀv2 = v1ᵀ(Av2) = v1ᵀ(λ2v2) = λ2(v1 · v2). Since λ1 ≠ λ2, v1 · v2 = 0. □
 
Example 2 on pages 449 – 450. Consider A = [6 −2 −1; −2 6 −1; −1 −1 5]. The characteristic equation of A is −(λ − 3)(λ − 6)(λ − 8) = 0. The eigenvectors are
λ1 = 3 : v1 = (1, 1, 1)ᵀ, λ2 = 6 : v2 = (−1, −1, 2)ᵀ, λ3 = 8 : v3 = (−1, 1, 0)ᵀ.
They are orthogonal.

Since a nonzero multiple of an eigenvector is still an eigenvector, we can
normalize the orthogonal eigenvectors {v1 , v2 , v3 } to produce the unit eigen-
vectors (orthonormal eigenvectors) {u1 , u2 , u3 }. In our example,
 √   √   √ 
1/√3 −1/√6 −1/√ 2
u1 = 1/√3 , u2 = −1/√ 6 , u3 =  1/ 2  .
1/ 3 2/ 6 0

Definition. Let P = [u1 · · · un], where {u1, . . . , un} are orthonormal eigenvectors of an n × n matrix A corresponding to the (not necessarily distinct) eigenvalues λ1, . . . , λn. Since the columns of P are orthonormal, P⁻¹ = Pᵀ; a square matrix with this property is called an orthogonal matrix. Let D = [λ1 0 . . . 0; 0 λ2 . . . 0; . . . ; 0 0 . . . λn] be the diagonal matrix with the eigenvalues on the main diagonal. Then A = PDP⁻¹ = PDPᵀ. The matrix A is said to be orthogonally diagonalizable.

In our example, P = [u1 u2 u3] and D = [3 0 0; 0 6 0; 0 0 8].

Theorem 2 on page 451. An n × n matrix A is orthogonally diagonaliz-


able iff A is symmetric.

Example 3 on pages 451 – 452. Orthogonally diagonalize the matrix A = [3 −2 4; −2 6 2; 4 2 3].
Solution: The characteristic polynomial is −(λ + 2)(λ − 7)². The eigenvectors are
λ1 = −2 : v1 = (−1, −1/2, 1)ᵀ, λ2 = 7 : v2 = (1, 0, 1)ᵀ, v3 = (−1/2, 1, 0)ᵀ.
The basis {v2, v3} can be orthogonalized by the Gram – Schmidt process; normalizing v1 and the resulting orthogonal vectors produces
u1 = (−2/3, −1/3, 2/3)ᵀ, u2 = (1/√2, 0, 1/√2)ᵀ, u3 = (−1/√18, 4/√18, 1/√18)ᵀ.
Then A = PDPᵀ.
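In practice one would not orthogonalize eigenvectors by hand; numpy.linalg.eigh, which is designed for symmetric matrices, returns orthonormal eigenvectors directly. A sketch on the matrix of this example:

```python
# Sketch: orthogonal diagonalization A = P D P^T via eigh.
import numpy as np

A = np.array([[ 3.0, -2.0, 4.0],
              [-2.0,  6.0, 2.0],
              [ 4.0,  2.0, 3.0]])
lam, P = np.linalg.eigh(A)   # eigenvalues in ascending order
D = np.diag(lam)
print(lam)                              # [-2.  7.  7.]
print(np.allclose(P.T @ P, np.eye(3)))  # True: P is orthogonal
print(np.allclose(P @ D @ P.T, A))      # True: A = P D P^T
```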
Definition. Write A = PDPᵀ = [u1 · · · un] [λ1 0 . . . 0; . . . ; 0 0 . . . λn] [u1ᵀ; . . . ; unᵀ] = [λ1u1 · · · λnun] [u1ᵀ; . . . ; unᵀ]. Thus, we can write A = λ1u1u1ᵀ + · · · + λnununᵀ. This representation of A is called a spectral decomposition of A. The set of eigenvalues of A is called the spectrum.

Example 4 on page 453. Construct a spectral decomposition of A = [7 2; 2 4] = PDPᵀ, where P = [2/√5 −1/√5; 1/√5 2/√5] and D = [8 0; 0 3].
Solution: A = 8u1u1ᵀ + 3u2u2ᵀ = [32/5 16/5; 16/5 8/5] + [3/5 −6/5; −6/5 12/5].
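A sketch reconstructing A from its rank-one pieces λi ui uiᵀ (np.outer forms each piece):

```python
# Sketch: spectral decomposition as a sum of rank-one outer products.
import numpy as np

A = np.array([[7.0, 2.0], [2.0, 4.0]])
lam, P = np.linalg.eigh(A)
parts = [lam[i] * np.outer(P[:, i], P[:, i]) for i in range(2)]
print(parts[0] + parts[1])         # reproduces A
print(np.allclose(sum(parts), A))  # True
```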
7.2. Quadratic Forms.

Definition. A quadratic form on Rn is Q(x) = xᵀAx where A is an n × n symmetric matrix called the matrix of the quadratic form.
Example 1 on page 456. Let A = [3 −2; −2 7]. The quadratic form is Q(x) = xᵀAx = 3x1² − 4x1x2 + 7x2².

Example 2 on page 456. Let Q(x) = 5x1² + 3x2² + 2x3² − x1x2 + 8x2x3. The matrix of the quadratic form is A = [5 −1/2 0; −1/2 3 4; 0 4 2].
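Building the matrix of a quadratic form is mechanical: squared-term coefficients go on the diagonal, and each cross-term coefficient is split evenly between the two symmetric off-diagonal entries. A sketch (the helper quad_form_matrix is ad hoc, not from the text):

```python
# Sketch: assemble the symmetric matrix of a quadratic form and evaluate it.
import numpy as np

def quad_form_matrix(n, sq, cross):
    """sq: {i: coeff of x_i^2}; cross: {(i, j): coeff of x_i x_j, i < j}."""
    A = np.zeros((n, n))
    for i, c in sq.items():
        A[i, i] = c
    for (i, j), c in cross.items():
        A[i, j] = A[j, i] = c / 2   # split each cross term in half
    return A

# Q(x) = 5x1^2 + 3x2^2 + 2x3^2 - x1 x2 + 8 x2 x3  (0-based indices)
A = quad_form_matrix(3, {0: 5, 1: 3, 2: 2}, {(0, 1): -1, (1, 2): 8})
x = np.array([1.0, 2.0, -1.0])
print(x @ A @ x)   # 5 + 12 + 2 - 2 - 16 = 1
```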
Definition. If x represents a variable vector in Rn, then a change of variable is an equation of the form x = Py, or equivalently y = P⁻¹x, where P is an invertible matrix.

Definition. If the change of variable is made in the quadratic form, then xᵀAx = (Py)ᵀA(Py) = yᵀ(PᵀAP)y, and the new matrix of the quadratic form is PᵀAP. If P orthogonally diagonalizes A, then PᵀAP = D. Thus, the matrix of the new quadratic form is diagonal.
Example 4 on pages 457 – 458. Let A = [1 −4; −4 −5]. Then P = [2/√5 1/√5; −1/√5 2/√5] and D = [3 0; 0 −7]. The orthogonal change of variable is x = Py where x = (x1, x2)ᵀ and y = (y1, y2)ᵀ. The quadratic form is xᵀAx = x1² − 8x1x2 − 5x2² = yᵀDy = 3y1² − 7y2².
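A numerical check of this example (a sketch; eigh orders the eigenvalues ascending, so here the y-coordinates correspond to λ = −7 and λ = 3):

```python
# Sketch: in the y-coordinates the cross-product term vanishes.
import numpy as np

A = np.array([[1.0, -4.0], [-4.0, -5.0]])
lam, P = np.linalg.eigh(A)   # lam = [-7.  3.]
rng = np.random.default_rng(1)
x = rng.standard_normal(2)
y = P.T @ x                  # change of variable y = P^{-1} x = P^T x
print(np.isclose(x @ A @ x, lam[0]*y[0]**2 + lam[1]*y[1]**2))  # True
```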

Theorem 4 on page 458 (The Principal Axes Theorem). If A is an n × n symmetric matrix, then there exists an orthogonal change of variable x = Py that transforms the quadratic form xᵀAx into a quadratic form yᵀDy with no cross-product terms.

Definition. The columns of P are called the principal axes of the quadratic form xᵀAx. The vector y is the coordinate vector of x relative to the orthonormal basis of Rn given by the principal axes.

Geometric Interpretation of Quadratic Forms in R2 .

Consider Q(y) = yᵀDy, and let c be a constant. The equation yᵀDy = c can be written, in suitable coordinates x1, x2, in one of the following six forms:
(i) x1²/a² + x2²/b² = 1, a ≥ b > 0 (an ellipse if a > b, a circle if a = b),
(ii) x1²/a² + x2²/b² = 0 (a single point (0, 0)),
(iii) x1²/a² + x2²/b² = −1 (empty set of points),
(iv) x1²/a² − x2²/b² = 1, a ≥ b > 0 (a hyperbola),
(v) x1²/a² − x2²/b² = 0 (two intersecting lines x2 = ±(b/a)x1),
and
(vi) x1²/a² − x2²/b² = −1 (a hyperbola x2²/b² − x1²/a² = 1).

Consider Q(x) = xT Ax where A is a 2 × 2 symmetric matrix. If A is


diagonal, the graph is in standard position. If A is not diagonal, the graph
is rotated out of standard position. Finding the principal axes amounts to
finding the new coordinate system with respect to which that graph is in
standard position.

Pictures of an ellipse, hyperbola, rotation of axes, etc.

Definition. A quadratic form Q is:


a. positive definite if Q(x) > 0 for all x ̸= 0,
b. negative definite if Q(x) < 0 for all x ̸= 0,
and
c. indefinite if Q(x) assumes both positive and negative values.

Example. Q(x) = x1² + 2x2² (positive definite), Q(x) = −x1² − 2x2² (negative definite), Q(x) = x1² − 2x2² (indefinite).

Theorem 5 on page 461. Let A be an n × n symmetric matrix. Then the quadratic form xᵀAx is
a. positive definite iff the eigenvalues of A are all positive,
b. negative definite iff the eigenvalues of A are all negative,
and
c. indefinite iff some eigenvalues of A are positive and some are negative.
Proof: xᵀAx = yᵀDy = λ1y1² + · · · + λnyn². □

Definition. An n × n symmetric matrix A is called a positive definite matrix (negative definite or indefinite) if the corresponding quadratic form xᵀAx is positive definite (negative definite or indefinite).

Example 5 on page 461. Is Q(x) = 3x1² + 2x2² + x3² + 4x1x2 + 4x2x3 positive definite?
Solution: A = [3 2 0; 2 2 2; 0 2 1]. The eigenvalues are −1, 2, and 5. Hence, Q is indefinite.
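The eigenvalue test of Theorem 5 in code (a sketch; semidefinite cases, where some eigenvalue is zero, are lumped into the last branch):

```python
# Sketch: classify a quadratic form by the signs of the eigenvalues of
# its symmetric matrix.
import numpy as np

def classify(A, tol=1e-10):
    lam = np.linalg.eigvalsh(A)   # real eigenvalues of a symmetric matrix
    if np.all(lam > tol):
        return "positive definite"
    if np.all(lam < -tol):
        return "negative definite"
    return "indefinite (or semidefinite)"

A = np.array([[3.0, 2.0, 0.0],
              [2.0, 2.0, 2.0],
              [0.0, 2.0, 1.0]])
print(np.linalg.eigvalsh(A))  # [-1.  2.  5.]
print(classify(A))            # indefinite (or semidefinite)
```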
