AMA2111 Mathematics I

2 Linear Algebra

Lecturer: Dr. Xiao Li


[email protected]

Outline

§1 Matrix (AMA1120)
§2 Determinant (AMA1120)
§3 System of Linear Equations (AMA1120)
§4 Vector Space
§5 Eigenvalue and Eigenvector
§6 Inner Product and Orthogonality

§1 Matrix
Definition
A (real) matrix is a rectangular array of real numbers:

    [ a11  a12  ...  a1j  ...  a1n ]
    [ a21  a22  ...  a2j  ...  a2n ]
    [ ...                          ]
    [ ai1  ai2  ...  aij  ...  ain ]
    [ ...                          ]
    [ am1  am2  ...  amj  ...  amn ]

A matrix A having m rows and n columns is called an m × n matrix or m-by-n matrix, denoted by Am×n = [aij]m×n, where aij is called the (i, j)-entry of A.
A 1 × n matrix is called a row vector with n entries. The set of all such row vectors is denoted by R_n.
An n × 1 matrix is called a column vector with n entries. The set of all such column vectors is denoted by R^n.
For example,

        [ 20.5  22  22    23   ]
    A = [ 20    21  21.5  22   ]   is a 3 × 4 matrix,
        [ 19.5  20  21    21.5 ]

    b = [ 36  8 ]   is a row vector with 2 entries,

        [ 1 ]
    c = [ 2 ]   is a column vector with 3 entries,
        [ 3 ]

    d = [3] = 3 is a scalar.
Special Matrix

1. Zero Matrix: A matrix with all its entries equal to zero, denoted by the symbol 0.

        [ 0  0  0  0 ]
    0 = [ 0  0  0  0 ]

2. Square Matrix: A matrix with n rows and n columns is called an n × n matrix or a square matrix of order n.

        [ 22  22  23   ]
    A = [ 20  21  22   ]
        [ 20  21  21.5 ]

    The diagonal a11, a22, ..., ann is called the main diagonal.
3. Upper Triangular Matrix: A square matrix in which all the entries below the main diagonal are zero.
4. Lower Triangular Matrix: A square matrix in which all the entries above the main diagonal are zero.

        [ 22  22  23   ]        [ 22  0   0    ]
    U = [ 0   21  22   ] ,  L = [ 20  22  0    ]
        [ 0   0   21.5 ]        [ 20  21  21.5 ]

5. Diagonal Matrix: A square matrix with aij = 0 whenever i ≠ j.

        [ 22  0   0    ]
    D = [ 0   21  0    ]
        [ 0   0   21.5 ]

6. Identity Matrix: A diagonal matrix with aii = 1.

        [ 1  0  0 ]
    I = [ 0  1  0 ]
        [ 0  0  1 ]
Matrix Operation
1. Comparison: Two m × n matrices A and B are said to be equal if and only if aij = bij for 1 ≤ i ≤ m and 1 ≤ j ≤ n.

    [ 20.5  22  22    23 ]  ≠  [ 20.5  22  22   ]
    [ 20    21  21.5  22 ]     [ 20    21  21.5 ]

2. Addition and Subtraction: If A and B are m × n matrices, we define A ± B = [aij ± bij]m×n.

    [ 4   0  5 ]   [ 1  1  1 ]   [ 4+1   0+1  5+1 ]   [ 5  1  6 ]
    [ -1  3  2 ] + [ 3  5  7 ] = [ -1+3  3+5  2+7 ] = [ 2  8  9 ]

3. Scalar Multiplication: If A is an m × n matrix and t is any scalar, we define tA = [t·aij]m×n.

      [ 1  1  1 ]   [ 2×1  2×1  2×1 ]   [ 2  2   2  ]
    2 [ 3  5  7 ] = [ 2×3  2×5  2×7 ] = [ 6  10  14 ]
4. Matrix Multiplication: If A is an m × n matrix and B is an n × k matrix, then we define their product AB to be the m × k matrix C such that

    cij = Σ_{s=1}^{n} ais·bsj   for 1 ≤ i ≤ m and 1 ≤ j ≤ k.

    Cm×k = Am×n Bn×k

Example

    A2×2 = [ 2  3 ] ,  B2×3 = [ 4  3  6 ] ,  C3×2 = [ 4   6 ]
           [ 1 -5 ]           [ 1 -2  3 ]           [ 7  -1 ]
                                                    [ 3  -2 ]

The products AB, BC, CA, ABC, BCA are defined, but AC, BA are undefined.
Example 1
   
If A = [ 2  3 ]  and  B = [ 4  3  6 ] , compute C = AB.
       [ 1 -5 ]           [ 1 -2  3 ]

Solution: C is a 2 × 3 matrix, C = [cij]2×3. Each entry is a row of A times a column of B:

    c11 = 2×4 + 3×1 = 11,
    c12 = 2×3 + 3×(−2) = 0,
    c13 = 2×6 + 3×3 = 21,
    c21 = 1×4 + (−5)×1 = −1,
    c22 = 1×3 + (−5)×(−2) = 13,
    c23 = 1×6 + (−5)×3 = −9.

Therefore, C = [ 11  0   21 ]
               [ -1  13  -9 ]
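Example 1 can be spot-checked numerically; the following is a sketch assuming NumPy is available:

```python
import numpy as np

# Matrices from Example 1
A = np.array([[2, 3], [1, -5]])
B = np.array([[4, 3, 6], [1, -2, 3]])

# (2x2)(2x3) -> 2x3; entry c_ij is row i of A dot column j of B
C = A @ B
```

The result agrees with the hand computation above.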
Rules of Matrix Algebra

1 A + 0 = A, A − A = 0
2 A + B = B + A (commutative law of addition)
3 A + (B + C) = (A + B) + C (associative law of addition)
4 t(A + B) = tA + tB, t(AB) = A(tB)
5 (t + s)A = tA + sA
6 (AB)C = A(BC) (associative law of multiplication)
7 (A + B)C = AC + BC, A(B + C) = AB + AC
8 A0 = 0A = 0
9 AI = IA = A

Example 2
If

    A = [ 5  1 ] ,   B = [ 2  0 ] ,
        [ 3 -2 ]         [ 4  3 ]

compute AB − BA.

Solution:

    AB = [ 14  3 ] ,   BA = [ 10  2 ] .
         [ -2 -6 ]          [ 29 -2 ]

    Therefore, AB − BA = [  4   1 ] .
                         [ -31  -4 ]

There is no commutative law of matrix multiplication: in general, AB ≠ BA.
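The failure of commutativity in Example 2 can be verified numerically; a sketch assuming NumPy:

```python
import numpy as np

# Matrices from Example 2
A = np.array([[5, 1], [3, -2]])
B = np.array([[2, 0], [4, 3]])

AB = A @ B
BA = B @ A
diff = AB - BA  # nonzero, so AB != BA for this pair
```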

Definition
Let A be a square matrix. We define A^0 = I and A^k = A·A^(k−1) for every positive integer k.

Thus, A^1 = A, A^2 = AA, A^3 = AAA, ...


Example 3
 
Let A = [ 1  2 ] . Find A^n.
        [ 0  1 ]

Solution:

    A^2 = [ 1  2 ][ 1  2 ] = [ 1  4 ]
          [ 0  1 ][ 0  1 ]   [ 0  1 ]

    A^3 = [ 1  4 ][ 1  2 ] = [ 1  6 ]
          [ 0  1 ][ 0  1 ]   [ 0  1 ]

    A^4 = [ 1  6 ][ 1  2 ] = [ 1  8 ]
          [ 0  1 ][ 0  1 ]   [ 0  1 ]

    ⇒  A^n = [ 1  2n ]
             [ 0  1  ]
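The pattern found in Example 3 can be spot-checked for a particular n (a sketch assuming NumPy):

```python
import numpy as np

A = np.array([[1, 2], [0, 1]])

# A^n = [[1, 2n], [0, 1]]; check n = 5, where 2n = 10
A5 = np.linalg.matrix_power(A, 5)
```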
Transpose

Definition
The transpose of an m × n matrix A is defined to be the n × m matrix B = A^T such that bij = aji for 1 ≤ i ≤ n and 1 ≤ j ≤ m.

For example,

    A = [ 1  2  0 ] ,   A^T = [ 1  3 ]
        [ 3 -1  4 ]           [ 2 -1 ]
                              [ 0  4 ]

Rules of transpose:
1. (A^T)^T = A
2. (A + B)^T = A^T + B^T
3. (kA)^T = kA^T
4. (AB)^T = B^T A^T
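Rule 4 (the reversal of order) is the one most often misremembered; here is a quick numerical spot-check, assuming NumPy. The matrix B below is not from the slides — it is chosen here only so that AB is defined:

```python
import numpy as np

A = np.array([[1, 2, 0], [3, -1, 4]])    # the 2x3 matrix from the example
B = np.array([[1, 1], [0, 2], [5, -1]])  # an illustrative 3x2 matrix (assumed)

lhs = (A @ B).T       # (AB)^T
rhs = B.T @ A.T       # B^T A^T  -- note the reversed order
```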
Inverse
Definition
A square matrix A is said to be nonsingular (or invertible) if there
is a square matrix B such that AB = BA = I. The matrix B is
called an inverse of A.
If A is nonsingular, its inverse is unique, denoted by A^(−1).

For example,

    [ 3  2 ][ 5 -2 ] = I   and   [ 5 -2 ][ 3  2 ] = I.
    [ 7  5 ][-7  3 ]             [-7  3 ][ 7  5 ]

Therefore, the matrix [ 3  2 ] is nonsingular, and
                      [ 7  5 ]

    [ 3  2 ]^(−1) = [ 5 -2 ] .
    [ 7  5 ]        [-7  3 ]
If A and B are nonsingular matrices of the same order, then
1. A^(−1) is nonsingular and (A^(−1))^(−1) = A;
2. tA is nonsingular and (tA)^(−1) = (1/t)A^(−1) for any t ≠ 0;
3. AB is nonsingular and (AB)^(−1) = B^(−1)A^(−1);
4. A^k is nonsingular and (A^k)^(−1) = (A^(−1))^k for any positive integer k.

Example

    A = [ 1  2 ]          B = [ 3  2 ]
        [ 1  3 ]              [ 2  2 ]

    A^(−1) = [ 3 -2 ]     B^(−1) = [ 1   -1  ]
             [-1  1 ]              [ -1  3/2 ]

    AB = [ 7  6 ]         (AB)^(−1) = [ 4    -3  ] = B^(−1) A^(−1)
         [ 9  8 ]                     [ -9/2  7/2 ]
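The example above, including rule 3 ((AB)^(−1) = B^(−1)A^(−1)), can be verified numerically; a sketch assuming NumPy:

```python
import numpy as np

A = np.array([[1.0, 2.0], [1.0, 3.0]])
B = np.array([[3.0, 2.0], [2.0, 2.0]])

AB_inv = np.linalg.inv(A @ B)
rule = np.linalg.inv(B) @ np.linalg.inv(A)  # (AB)^-1 = B^-1 A^-1
```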
§2 Determinant
Definition
For any square matrix An×n = [aik ], we define the determinant of
A, denoted by |A| or det(A), as follows:
If n = 1, i.e., A = [a11 ], we define |A| = a11 .
Assume that n > 1 and that the determinant is defined for all
square matrices of order < n. We define the following terms:
1. Mik is the determinant of the (n − 1) × (n − 1) matrix obtained from A by deleting its ith row and kth column. Mik is called the minor of the entry aik.
2. Cik = (−1)^(i+k) Mik is called the cofactor of the entry aik of A.
Then, we define

    |A| = Σ_{k=1}^{n} a1k·C1k = Σ_{k=1}^{n} (−1)^(1+k) a1k·M1k.

That is, |A| is obtained by taking "cofactor expansion" along the first row of the matrix A.
 
If A = [ a11  a12 ] , then C11 = (−1)^(1+1) a22 and C12 = (−1)^(1+2) a21. So
       [ a21  a22 ]

    |A| = a11 C11 + a12 C12 = a11 a22 − a12 a21.
 
       [ a11  a12  a13 ]
If A = [ a21  a22  a23 ] , then
       [ a31  a32  a33 ]

    |A| = a11 | a22  a23 | − a12 | a21  a23 | + a13 | a21  a22 |
              | a32  a33 |       | a31  a33 |       | a31  a32 |
        = a11 a22 a33 + a12 a23 a31 + a13 a21 a32
          − a31 a22 a13 − a32 a23 a11 − a33 a21 a12.
Example 4
 
                        [ 2  1  3 ]
Evaluate |A|, where A = [ 1 -1  1 ] .
                        [ 1  4 -2 ]

Solution:

    |A| = 2×(−1)^2 | -1  1 | + 1×(−1)^3 | 1  1 | + 3×(−1)^4 | 1 -1 |
                   |  4 -2 |            | 1 -2 |            | 1  4 |
        = 2×(−2) + 1×3 + 3×5 = 14.

Alternatively,

    |A| = 2×(−1)×(−2) + 1×1×1 + 3×1×4
          − 1×(−1)×3 − 4×1×2 − (−2)×1×1
        = 4 + 1 + 12 − (−3) − 8 − (−2) = 14.
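Determinants are easy to check by machine; a sketch assuming NumPy (`np.linalg.det` returns a floating-point value, hence the rounding):

```python
import numpy as np

# Matrix from Example 4
A = np.array([[2, 1, 3], [1, -1, 1], [1, 4, -2]])
det_A = round(np.linalg.det(A))  # floating-point result, rounded to the nearest integer
```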
Properties
1. |A^T| = |A|.
2. If A′ is obtained from A by interchanging any two rows of A, then |A′| = −|A|.
3. If A′ is obtained by multiplying the ith row of A by a scalar t while other rows remain unchanged, then |A′| = t|A|.
4. If A′ is obtained from A by adding a nonzero scalar multiple of one row to another row, then |A′| = |A|.
5. If two rows of A are identical, then |A| = 0.
6. For a diagonal matrix D and an upper triangular matrix U,

       D = [ d11  0   ...  0   ]     U = [ a11  a12  ...  a1n ]
           [ 0    d22 ...  0   ]         [ 0    a22  ...  a2n ]
           [ ...               ]         [ ...                ]
           [ 0    0   ...  dnn ]         [ 0    0    ...  ann ]

       |D| = d11 d22 ... dnn,   |U| = a11 a22 ... ann.

7. A square matrix A is nonsingular if and only if |A| ≠ 0.
8. If A and B are n × n matrices, then |AB| = |A||B|.
Theorem
1 The determinant of a matrix can be evaluated by taking

cofactor expansion along any row.


2 The determinant of a matrix can be evaluated by taking
cofactor expansion along any column.

The sign pattern of the cofactors is

    [ + - + - ... ]
    [ - + - + ... ]
    [ + - + - ... ]
    [ - + - + ... ]
    [ ...         ]

Calculate the determinant of the matrix in Example 4 by expanding along the third row:

    |A| = 1 · |  1  3 | + 4 · (−1) · | 2  3 | + (−2) · | 2  1 | = 14
              | -1  1 |              | 1  1 |          | 1 -1 |
Example 5
Evaluate

    |  0  -1   2   1 |
    | -4   3  -3   5 |
    |  1   0   0  -1 |
    | -1   1   0   1 |

Solution: Adding column 1 to column 4 (c4 + c1 → c4),

    |  0  -1   2   1 |     |  0  -1   2   1 |
    | -4   3  -3   5 |  =  | -4   3  -3   1 |
    |  1   0   0  -1 |     |  1   0   0   0 |
    | -1   1   0   1 |     | -1   1   0   0 |

Expanding along r3, then along r3 again:

    =  | -1   2   1 |  =  |  2  1 |  =  5
       |  3  -3   1 |     | -3  1 |
       |  1   0   0 |
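The row/column-operation computation in Example 5 can be cross-checked directly (a sketch assuming NumPy):

```python
import numpy as np

# The 4x4 matrix from Example 5
M = np.array([[ 0, -1,  2,  1],
              [-4,  3, -3,  5],
              [ 1,  0,  0, -1],
              [-1,  1,  0,  1]])
det_M = round(np.linalg.det(M))
```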
§3 System of Linear Equations

A system of m linear equations with n unknowns x1 , x2 , . . . , xn is


given by

    a11 x1 + a12 x2 + ... + a1n xn = b1
    a21 x1 + a22 x2 + ... + a2n xn = b2
    ...
    am1 x1 + am2 x2 + ... + amn xn = bm

where the aij and bk are given scalars. The system can be conveniently written in matrix form as Ax = b, shown below. The matrix A is commonly known as the coefficient matrix of the linear system.

    [ a11  a12  ...  a1n ] [ x1 ]   [ b1 ]
    [ ...                ] [ .. ] = [ .. ]
    [ am1  am2  ...  amn ] [ xn ]   [ bm ]
Elementary Row Operation
Definition
Given a system of linear equations Ax = b, we define the
augmented matrix for this system to be the m × (n + 1) matrix
obtained by joining the column vector b to the right of A.
 
    [A|b] = [ a11  a12  ...  a1n | b1 ]
            [ ...                     ]
            [ am1  am2  ...  amn | bm ]

There are three basic row operations:

Row Operation | Symbol
interchange any two rows of [A|b] (interchange any two equations of the system Ax = b) | ri ↔ rj
multiply any row of [A|b] by a nonzero scalar (multiply both sides of any equation in Ax = b by a nonzero scalar) | t·ri
add a scalar multiple of one row of [A|b] to another row (add a scalar multiple of one equation in Ax = b to another equation) | ri + t·rj → ri
Row-echelon Form
Definition
A matrix is said to be in row-echelon form if it satisfies:
1 All the rows consisting entirely of zeros are grouped together
at the bottom of the matrix;
2 If a row does not consist entirely of zeros, then the 1st
non-zero entry of this row is equal to 1 (known as the leading
1 of the row);
3 If the leading 1 of the ith row occurs at the pth column and
the leading 1 of the (i + 1)th row occurs at the qth column,
then p < q.

Roughly, the entries to the left of and below each leading 1 are all zeros.

    [ 1  4  3  7 ]      [ 0  1  2   6  0  ]
    [ 0  1  6  2 ]      [ 0  0  1  -1  18 ]
    [ 0  0  1  5 ]      [ 0  0  0   0  1  ]
                        [ 0  0  0   0  0  ]
A system whose augmented matrix is in row-echelon form can be solved easily by backward substitution.

    x1 + 4x2 + 3x3 = 7         x1 = 7 − 4×(−28) − 3×5 = 104
         x2 + 6x3 = 2     ⇒    x2 = 2 − 6×5 = −28
               x3 = 5          x3 = 5

Definition
If each column of a row-echelon form that contains a leading 1 has zeros elsewhere, the matrix is said to be in reduced row-echelon form. The entries to the left of, above, and below each leading 1 are all zeros.

    [ 1  0  0  4  ]      [ 0  1  2  0  15 ]
    [ 0  1  0  -2 ]      [ 0  0  0  1  -3 ]
    [ 0  0  1  -9 ]      [ 0  0  0  0  0  ]

Theorem
Every matrix A can be reduced to a matrix in reduced row-echelon form by applying a sequence of elementary row operations to A.
Gaussian Elimination

1 Use a sequence of elementary row operations to reduce the


augmented matrix [A|b] to a matrix [R|c] , where R is the
row-echelon form of A.
2 The system of linear equations Rx = c, which is equivalent to
Ax = b, may now be solved by backward substitutions.

Example 6

Use Gaussian elimination to solve



    x1 + x2 + 2x3 = 9
    2x1 + 4x2 − 3x3 = 1
    3x1 + 6x2 − 5x3 = 0
Solution:

    [ 1  1  2 |  9 ]  r2−2r1→r2   [ 1  1   2 |  9  ]
    [ 2  4 -3 |  1 ]  r3−3r1→r3   [ 0  2  -7 | -17 ]
    [ 3  6 -5 |  0 ]  ────────→   [ 0  3 -11 | -27 ]

    (1/2)r2   [ 1  1   2   |  9    ]  r3−3r2→r3   [ 1  1   2   |  9    ]
    ──────→   [ 0  1  -7/2 | -17/2 ]  ────────→   [ 0  1  -7/2 | -17/2 ]
              [ 0  3  -11  | -27   ]              [ 0  0  -1/2 | -3/2  ]

    −2r3      [ 1  1   2   |  9    ]       x1 + x2 + 2x3 = 9
    ──────→   [ 0  1  -7/2 | -17/2 ]  ⇔    x2 − (7/2)x3 = −17/2
              [ 0  0   1   |  3    ]       x3 = 3

    ⇒  x1 = −2 − 2×3 + 9 = 1                 [ 1 ]
       x2 = (7/2)×3 − 17/2 = 2       ⇒  x =  [ 2 ]
       x3 = 3                                [ 3 ]
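When a system has a unique solution, Gaussian elimination is what a library solver performs internally; a sketch of checking Example 6 with NumPy:

```python
import numpy as np

# Coefficient matrix and right-hand side from Example 6
A = np.array([[1.0, 1, 2], [2, 4, -3], [3, 6, -5]])
b = np.array([9.0, 1, 0])

x = np.linalg.solve(A, b)  # valid because A is nonsingular
```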
Example 7

Use Gaussian elimination to solve

    x1 + x2 − 4x3 = 5
    2x1 + 3x2 − 7x3 = 14
    −x2 − x3 = −4

Solution:

    [ 1  1 -4 |  5 ]  r2−2r1→r2   [ 1  1 -4 |  5 ]  r3+r2→r3   [ 1  1 -4 | 5 ]
    [ 2  3 -7 | 14 ]  ────────→   [ 0  1  1 |  4 ]  ───────→   [ 0  1  1 | 4 ]
    [ 0 -1 -1 | -4 ]              [ 0 -1 -1 | -4 ]             [ 0  0  0 | 0 ]

    ⇔  x1 + x2 − 4x3 = 5
       x2 + x3 = 4

    ⇒  x1 = 5 + 4t − (4 − t) = 1 + 5t          [ 1 + 5t ]
       x2 = 4 − t                      ⇒  x =  [ 4 − t  ] ,  t ∈ R
       x3 = t                                  [ t      ]
Example 8

Use Gaussian elimination to solve

    x2 − 4x3 = 8
    2x1 − 3x2 + 2x3 = 1
    5x1 − 8x2 + 7x3 = 1

Solution:

    [ 0  1 -4 | 8 ]  r1↔r2   [ 2 -3  2 | 1 ]  (1/2)r1   [ 1 -3/2  1 | 1/2 ]
    [ 2 -3  2 | 1 ]  ─────→  [ 0  1 -4 | 8 ]  ──────→   [ 0  1   -4 | 8   ]
    [ 5 -8  7 | 1 ]          [ 5 -8  7 | 1 ]            [ 5 -8    7 | 1   ]

    r3−5r1→r3   [ 1 -3/2  1 | 1/2  ]  r3+(1/2)r2→r3   [ 1 -3/2  1 | 1/2 ]
    ────────→   [ 0  1   -4 | 8    ]  ────────────→   [ 0  1   -4 | 8   ]
                [ 0 -1/2  2 | -3/2 ]                  [ 0  0    0 | 5/2 ]

    ⇒  0x1 + 0x2 + 0x3 = 5/2

The system is inconsistent since the third equation has no solution.
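Inconsistency can also be detected by comparing ranks: the system Ax = b has no solution exactly when rank(A) < rank([A|b]). A sketch of this check for Example 8, assuming NumPy:

```python
import numpy as np

# System from Example 8
A = np.array([[0, 1, -4], [2, -3, 2], [5, -8, 7]])
b = np.array([[8], [1], [1]])

rank_A = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.hstack([A, b]))  # augmented matrix [A|b]
inconsistent = rank_A < rank_Ab
```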
Solution Set
The collection of all the solutions is called the solution set of the
system. A system of linear equations is either inconsistent (no
solution) or consistent (has at least one solution).

Theorem
Suppose the augmented matrix [A|b] of the linear system Ax = b
is reduced to [R|c] by elementary row operations, where R is an
m × n matrix in the row-echelon form with r non-zero rows and
c ∈ Rm .
(1) The system Ax = b is inconsistent if and only if r < m and
there is at least one cj ≠ 0 for r < j ≤ m.

Example 8 has no solution.

    [A|b] = [ 0  1 -4 | 8 ]  →  [ 1 -3/2  1 | 1/2 ]  = [R|c]
            [ 2 -3  2 | 1 ]     [ 0  1   -4 | 8   ]
            [ 5 -8  7 | 1 ]     [ 0  0    0 | 5/2 ]

Here, R has 2 non-zero rows and c3 ≠ 0, while A is a 3 × 3 matrix. One row corresponds to an equation having no solution.
Theorem
Suppose the augmented matrix [A|b] of the linear system Ax = b
is reduced to [R|c] by elementary row operations, where R is an
m × n matrix in the row-echelon form with r non-zero rows and
c ∈ Rm .
(1) The system Ax = b is inconsistent if and only if r < m and
there is at least one cj ≠ 0 for r < j ≤ m.
(2) When the system Ax = b is consistent,
(a) if r = n, the system has a unique solution;

Example 6 has a unique solution.

    [A|b] = [ 1  1  2 | 9 ]  →  [ 1  1   2   |  9    ]  = [R|c]
            [ 2  4 -3 | 1 ]     [ 0  1  -7/2 | -17/2 ]
            [ 3  6 -5 | 0 ]     [ 0  0   1   |  3    ]

Here, R has 3 non-zero rows and A is a 3 × 3 matrix. The number of equations of [R|c] equals the number of unknowns (both 3).
Theorem
Suppose the augmented matrix [A|b] of the linear system Ax = b
is reduced to [R|c] by elementary row operations, where R is an
m × n matrix in the row-echelon form with r non-zero rows and
c ∈ Rm .
(1) The system Ax = b is inconsistent if and only if r < m and
there is at least one cj ≠ 0 for r < j ≤ m.
(2) When the system Ax = b is consistent,
(a) if r = n, the system has a unique solution;
(b) if r < n, the system has infinitely many solutions.

Example 7 has infinitely many solutions.

    [A|b] = [ 1  1 -4 |  5 ]  →  [ 1  1 -4 | 5 ]  = [R|c]
            [ 2  3 -7 | 14 ]     [ 0  1  1 | 4 ]
            [ 0 -1 -1 | -4 ]     [ 0  0  0 | 0 ]

Here, R has 2 non-zero rows, c3 = 0, and A is a 3 × 3 matrix. The number of equations of [R|c] is less than the number of unknowns (2 < 3).
Brief Summary on Solution Set of Ax = b

[Am×n |b] → [Rm×n |c], row-echelon form R with r non-zero rows

The system is inconsistent (has no solution)
⇔ r < m and at least one cj ≠ 0 for r < j ≤ m
⇔ at least one equation in [R|c] has no solution

When the system is consistent,


the system has a unique solution
⇔r=n
⇔ equation # of [R|c] = unknown #

the system has infinitely many solutions


⇔r<n
⇔ equation # of [R|c] < unknown #

Example 9
Find the conditions which p and q must satisfy for the system

    [ 1  2  p ] [ x1 ]   [ 1 ]
    [ 2  3  0 ] [ x2 ] = [ 2 ]
    [ 6  8  3 ] [ x3 ]   [ q ]

to have (a) a unique solution, (b) infinitely many solutions, and (c) no solution.

Solution:

    [ 1  2  p | 1 ]             [ 1  2  p    | 1     ]
    [ 2  3  0 | 2 ]  → ... →    [ 0  1  2p   | 0     ]
    [ 6  8  3 | q ]             [ 0  0  2p+3 | q − 6 ]

(a) The system has a unique solution when 2p + 3 ≠ 0.
(b) The system has infinitely many solutions when 2p + 3 = 0 and q − 6 = 0.
(c) The system has no solution when 2p + 3 = 0 and q − 6 ≠ 0.
Exercise 1
Consider the system of linear equations

    x1 + x2 + px3 = 1
    x1 + px2 + x3 = 1
    px1 + x2 + x3 = 1

Determine the condition on p such that the system
(a) has a unique solution;
(b) has infinitely many solutions;
(c) has no solution.

Solution: Reduce the augmented matrix to row-echelon form:

    [ 1  1  p | 1 ]  r2−r1→r2    [ 1  1    p     | 1     ]
    [ 1  p  1 | 1 ]  r3−pr1→r3   [ 0  p−1  1−p   | 0     ]
    [ p  1  1 | 1 ]  ────────→   [ 0  1−p  1−p^2 | 1 − p ]

    r3+r2→r3   [ 1  1    p       | 1     ]
    ───────→   [ 0  p−1  1−p     | 0     ]
               [ 0  0    2−p−p^2 | 1 − p ]

Note that 2 − p − p^2 = (2 + p)(1 − p).

(a) If 2 − p − p^2 ≠ 0, namely p ≠ −2 and p ≠ 1, the system has a unique solution.
(b) If p = 1, the 2nd and 3rd rows are both zero rows, so the system has infinitely many solutions.
(c) If p = −2, then 2 − p − p^2 = 0 while c3 = 1 − p ≠ 0, so the system has no solution.
Example 10
Use Gaussian elimination to solve

    x1 + 3x2 − 2x3 + 2x5 = 0
    2x1 + 6x2 − 5x3 − 2x4 + 4x5 − 3x6 = −1
    5x3 + 10x4 + 15x6 = 5
    2x1 + 6x2 + 8x4 + 4x5 + 18x6 = 6

Solution: The augmented matrix can be reduced to

    [ 1  3  0  4  2  0 | 0   ]
    [ 0  0  1  2  0  0 | 0   ]
    [ 0  0  0  0  0  1 | 1/3 ]
    [ 0  0  0  0  0  0 | 0   ]

Unknown # = 6, equation # = 3, so the system has infinitely many solutions.
 
    [ 1  3  0  4  2  0 | 0   ]
    [ 0  0  1  2  0  0 | 0   ]
    [ 0  0  0  0  0  1 | 1/3 ]
    [ 0  0  0  0  0  0 | 0   ]

Since 6 − 3 = 3, there are 3 free variables. Setting x2 = r, x4 = s, x5 = t,

    x1 + 3x2 + 4x4 + 2x5 = 0          x1 = −3r − 4s − 2t
    x3 + 2x4 = 0                ⇒     x2 = r
    x6 = 1/3                          x3 = −2s
                                      x4 = s
                                      x5 = t
                                      x6 = 1/3

The number of free variables is called the degree of freedom (dof):

    dof = unknown # − equation #.
Gauss-Jordan Method

1 Use a sequence of elementary row operations to reduce the


augmented matrix [A|b] to a matrix [R|c] , where R is the
reduced row-echelon form of A.
2 The solution of the new equivalent system Rx = c can be
obtained simply by inspection.

Example 11
Use Gauss-Jordan Method to solve


    x1 − 2x2 + x3 = 0
    2x2 − 8x3 = 6
    −4x1 + 5x2 + 9x3 = −9
Solution:

    [ 1 -2  1 |  0 ]  r3+4r1→r3   [ 1 -2   1 |  0 ]  (1/2)r2   [ 1 -2   1 |  0 ]
    [ 0  2 -8 |  6 ]  ────────→   [ 0  2  -8 |  6 ]  ──────→   [ 0  1  -4 |  3 ]
    [-4  5  9 | -9 ]              [ 0 -3  13 | -9 ]            [ 0 -3  13 | -9 ]

    r3+3r2→r3   [ 1 -2  1 | 0 ]  r1−r3→r1    [ 1 -2  0 | 0 ]  r1+2r2→r1   [ 1  0  0 | 6 ]
    ────────→   [ 0  1 -4 | 3 ]  r2+4r3→r2   [ 0  1  0 | 3 ]  ────────→   [ 0  1  0 | 3 ]
                [ 0  0  1 | 0 ]  ────────→   [ 0  0  1 | 0 ]              [ 0  0  1 | 0 ]

                   [ 6 ]
Therefore, x =     [ 3 ] .
                   [ 0 ]

Ax = b has a unique solution if and only if A can be reduced to the identity I by a sequence of elementary row operations.
Use the Gauss-Jordan Method to Find the Inverse

Given a square matrix An×n, let bi (i = 1, 2, ..., n) be the ith column of the identity matrix In×n. According to the Gauss-Jordan method, if Ax1 = b1 has a unique solution, so does Axi = bi for i = 2, ..., n.

Consider the n systems together:

    A[x1, x2, ..., xn] = [b1, b2, ..., bn] = I.

The solution is [x1, x2, ..., xn] = A^(−1). We can therefore use the Gauss-Jordan method to find the inverse, i.e., solve AX = I to get A^(−1) = X:

    [A|I] → [I|A^(−1)]
Example 12
Find the inverse of

    A = [ 3  4 -1 ]
        [ 1  0  3 ]
        [ 2  5 -4 ]

Solution:

    [ 3  4 -1 | 1 0 0 ]  r1↔r2   [ 1  0  3 | 0 1 0 ]
    [ 1  0  3 | 0 1 0 ]  ─────→  [ 3  4 -1 | 1 0 0 ]
    [ 2  5 -4 | 0 0 1 ]          [ 2  5 -4 | 0 0 1 ]

    r2−3r1→r2   [ 1  0   3  | 0   1  0 ]  (1/4)r2   [ 1  0   3   | 0     1    0 ]
    r3−2r1→r3   [ 0  4  -10 | 1  -3  0 ]  ──────→   [ 0  1  -5/2 | 1/4  -3/4  0 ]
    ────────→   [ 0  5  -10 | 0  -2  1 ]            [ 0  5  -10  | 0    -2    1 ]

    r3−5r2→r3   [ 1  0   3   | 0     1    0 ]  r2+r3→r2   [ 1  0   3   | 0     1    0 ]
    ────────→   [ 0  1  -5/2 | 1/4  -3/4  0 ]  ───────→   [ 0  1   0   | -1    1    1 ]
                [ 0  0   5/2 | -5/4  7/4  1 ]             [ 0  0   5/2 | -5/4  7/4  1 ]

    (2/5)r3   [ 1  0  3 | 0     1     0   ]  r1−3r3→r1   [ 1  0  0 | 3/2  -11/10  -6/5 ]
    ──────→   [ 0  1  0 | -1    1     1   ]  ────────→   [ 0  1  0 | -1    1       1   ]
              [ 0  0  1 | -1/2  7/10  2/5 ]              [ 0  0  1 | -1/2  7/10    2/5 ]

Therefore,

    A^(−1) = [ 3/2   -11/10  -6/5 ]
             [ -1     1       1   ]
             [ -1/2   7/10    2/5 ]
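The inverse found in Example 12 can be verified numerically; a sketch assuming NumPy:

```python
import numpy as np

# Matrix from Example 12
A = np.array([[3.0, 4, -1], [1, 0, 3], [2, 5, -4]])
A_inv = np.linalg.inv(A)
```

A good habit is to confirm A·A^(−1) = I as well.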
System of Homogeneous Equations

Definition
1 A system of m homogeneous linear equations in n unknowns

is given by Ax = 0, where A is a given m × n matrix and 0 is


the m × 1 zero column vector.
2 A homogeneous system has the n × 1 zero vector 0 as an
obvious solution (called the trivial solution). Any other
solutions are known as non-trivial solutions.

If v and w are two solutions of Ax = 0, then tv + sw is also a


solution for any t, s ∈ R.
Theorem
The system Ax = 0 either has only the trivial solution, or has
infinitely many solutions.

Theorem
For any square matrix A, the following statements are equivalent.
1. A is nonsingular;
2. |A| ≠ 0;
3. A can be reduced to I by a sequence of elementary row operations;
4. The system of non-homogeneous equations Ax = b has a unique solution (x = A^(−1)b);
5. The system of homogeneous equations Ax = 0 has only the trivial solution (x = A^(−1)·0 = 0).

Compare with the scalar case. For any scalar a, the following statements are equivalent.
1. The reciprocal of a exists (a^(−1));
2. a ≠ 0;
3. a × a^(−1) = 1;
4. The equation ax = b has a unique solution (x = a^(−1)b);
5. The equation ax = 0 has only the trivial solution (x = a^(−1)·0 = 0).
§4 Vector Space

Definition
A vector space is a set V of column vectors (or row vectors)
satisfying the following properties:
1 the zero vector 0 ∈ V ;
2 if v, w ∈ V , then the sum v + w ∈ V ;
3 if v ∈ V and t is any scalar, then tv ∈ V .

A vector space contains the zero vector, and is closed under


addition and scalar multiplication.

Example
Both R_n and R^n are vector spaces.
R^2 is the usual Cartesian plane and R^3 is the Cartesian space.

Definition
Let vi ∈ R^n and ti ∈ R, i = 1, 2, . . . , k.
1 The vector t1 v1 + t2 v2 + · · · + tk vk is called a linear
combination of v1 , v2 , . . . , vk .
2 The set of all linear combinations of v1 , v2 , . . . , vk is called a
vector space spanned by the vectors v1 , v2 , . . . , vk , denoted
by span{v1 , v2 , . . . , vk }.
3 Vectors v1 , v2 , . . . , vk are said to be linearly independent if
t1 v1 + t2 v2 + · · · + tk vk = 0 (∗)
implies ti = 0 for all 1 ≤ i ≤ k.
4 Vectors v1 , v2 , . . . , vk are said to be linearly dependent if
there are scalars ti , not all 0, such that (∗) holds.

0 and v are linearly dependent for any vector v.


If v1 , v2 are linearly dependent, then so are v1 , v2 , v3 .
v1 , v2 , . . . , vk must be either linearly dependent or linearly
independent.
Example
In the Cartesian plane R^2, let v1, v2 be non-zero vectors.
v1, v2 are linearly dependent
⇔ there exist non-zero t1, t2 ∈ R such that t1 v1 + t2 v2 = 0
⇔ there exists λ ∈ R such that v2 = λv1
⇔ v2 ∈ span{v1}
⇔ v1 and v2 lie on the same straight line
⇔ v1 ∥ v2
The vectors [1 0]^T and [0 1]^T are linearly independent.
Proposition
Vectors v1, v2, ..., vk are linearly dependent if and only if one of these vectors can be expressed as a linear combination of the remaining ones.

    t1 v1 + t2 v2 + ... + tj vj + ... + tk vk = 0
    ⇔  vj = −(t1/tj)v1 − ... − (t_{j−1}/tj)v_{j−1} − (t_{j+1}/tj)v_{j+1} − ... − (tk/tj)vk   (tj ≠ 0)
Theorem
Given vj ∈ R^n, j = 1, 2, ..., k, let A be the n × k matrix whose jth column equals the vector vj, that is, A = [v1 v2 ... vk]. Clearly,

    t1 v1 + t2 v2 + ... + tk vk = 0  ⇔  At = 0   (∗)

where t = [t1, t2, ..., tk]^T. Therefore,
v1, v2, ..., vk are linearly independent
⇔ the system At = 0 has only the trivial solution;
v1, v2, ..., vk are linearly dependent
⇔ the system At = 0 has infinitely many solutions.

Corollary
If k > n, any k vectors in R^n are linearly dependent.
Example 13
     
Determine whether [1, 2, 0]^T, [1, −1, 1]^T and [0, 0, 1]^T are linearly independent or dependent.

Solution: Consider

    [ 1  1  0 ]
    [ 2 -1  0 ] t = 0.
    [ 0  1  1 ]

Use row operations to reach the row-echelon form:

    [ 1  1  0 ]
    [ 0  1  1 ]
    [ 0  0  1 ]

The system has only the trivial solution, so the vectors are linearly independent.
Example 14
     
Determine whether v1 = [1, 2, 3]^T, v2 = [4, 5, 6]^T and v3 = [7, 8, 9]^T are linearly independent or dependent. If they are dependent, express v3 as a linear combination of v1 and v2.

Solution:

    [ 1  4  7 ]      [ 1  4  7 ]             [ s   ]
    [ 2  5  8 ]  →   [ 0  1  2 ]    ⇒  t =   [ -2s ]
    [ 3  6  9 ]      [ 0  0  0 ]             [ s   ]

The system has infinitely many solutions, so the vectors are linearly dependent, and

    s·v1 − 2s·v2 + s·v3 = 0.

Therefore, v3 = −v1 + 2v2.
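The rank of the matrix [v1 v2 v3] gives the same conclusion as the row reduction in Example 14: rank below the number of vectors means dependence. A sketch assuming NumPy:

```python
import numpy as np

v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
v3 = np.array([7, 8, 9])

A = np.column_stack([v1, v2, v3])
rank = np.linalg.matrix_rank(A)  # rank < 3 means the columns are linearly dependent
```

The dependence relation v3 = −v1 + 2v2 is also easy to confirm directly.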
Exercise 2

    
Let v1 = [1, 2, 0]^T, v2 = [1, 3, 1]^T, v3 = [1, 4, 2]^T.
(a) Determine whether v1, v2, v3 are linearly independent or dependent.
(b) If they are dependent, express v1 as a linear combination of v2 and v3.

Solution: Consider

    [ 1  1  1 ]
    [ 2  3  4 ] t = 0   (∗)
    [ 0  1  2 ]

(a)
    [ 1  1  1 ]  r2−2r1→r2   [ 1  1  1 ]  r3−r2→r3   [ 1  1  1 ]
    [ 2  3  4 ]  ────────→   [ 0  1  2 ]  ───────→   [ 0  1  2 ]
    [ 0  1  2 ]              [ 0  1  2 ]             [ 0  0  0 ]

The system (∗) has infinitely many solutions. Therefore, v1, v2, v3 are linearly dependent.

(b) The solutions of (∗) are t = [s, −2s, s]^T. So

    s·v1 − 2s·v2 + s·v3 = 0.

Therefore, v1 = 2v2 − v3.
Proposition
Let v1, v2, ..., vm ∈ R^n. Then span{v1, v2, ..., vm} = R^n if and only if there are n linearly independent vectors in {v1, v2, ..., vm}.

Example 15
Determine whether v1 = [1, 2, 3]^T, v2 = [4, 5, 6]^T and v3 = [7, 8, 9]^T can span R^3 or not.

Solution:

    [ 1  4  7 ]      [ 1  4  7 ]
    [ 2  5  8 ]  →   [ 0  1  2 ]
    [ 3  6  9 ]      [ 0  0  0 ]

The three vectors are linearly dependent. The row-echelon form tells us there are only 2 linearly independent vectors among them. Therefore, they cannot span R^3.
Theorem
For any square matrix A, the following statements are equivalent.
1. A is nonsingular;
2. |A| ≠ 0;
3. A can be reduced to I by a sequence of elementary row operations;
4. The system of non-homogeneous equations Ax = b has a unique solution (x = A^(−1)b);
5. The system of homogeneous equations Ax = 0 has only the trivial solution (x = A^(−1)·0 = 0);
6. v1, v2, ..., vn are linearly independent, where vj ∈ R^n is equal to the jth column of An×n.
Example
Given [1, 2, 0]^T, [1, −1, 1]^T, [0, 0, 1]^T (Example 13, linearly independent):

    | 1  1  0 |
    | 2 -1  0 |  =  1 × (−1)^(3+3) × | 1  1 |  =  1 × (−1) − 2 × 1  =  −3 ≠ 0
    | 0  1  1 |                      | 2 -1 |

Example
Given [1, 2, 3]^T, [4, 5, 6]^T, [7, 8, 9]^T (Example 14, linearly dependent):

    | 1  4  7 |  r2−2r1→r2   | 1   4    7 |  r3−2r2→r3   | 1   4   7 |
    | 2  5  8 |  r3−3r1→r3   | 0  -3   -6 |  ════════    | 0  -3  -6 |  =  0
    | 3  6  9 |  ════════    | 0  -6  -12 |              | 0   0   0 |
§5 Eigenvalue and Eigenvector

Definition
Let A be an n × n matrix. If there is a scalar λ ∈ C and a nonzero vector v ∈ R^n such that Av = λv, then λ is called an eigenvalue of A and v is called an eigenvector of A corresponding to the eigenvalue λ.

Remark
An eigenvector of A is a vector v such that Av is a scalar multiple of v.
If Av = λv, then A(tv) = λ(tv) for any t ≠ 0.
If Av = λv and Aw = λw, then A(v + w) = λ(v + w).
The set of eigenvectors corresponding to the eigenvalue λ is a vector space.
Rewrite Av = λv as
(A − λI)v = 0. (∗)

An eigenvalue of A is a scalar λ such that (∗) has non-trivial


solutions. An eigenvector of A is a non-trivial solution of (∗).

The system (∗) has non-trivial solutions ⇐⇒ |A − λI| = 0.


Theorem
Let A be an n × n matrix with real entries. A scalar λ is an
eigenvalue of A if and only if |A − λI| = 0.

Once we obtain an eigenvalue λ of A, we shall be able to use


Gaussian elimination to find non-trivial solutions of (∗), and thus,
obtain the eigenvectors of A corresponding to the eigenvalue λ.

Definition
Let

    f(λ) = |A − λI| = | a11−λ  a12    ...  a1n   |
                      | a21    a22−λ  ...  a2n   |
                      | ...                      |
                      | an1    an2    ...  ann−λ |

f(λ) is a polynomial of degree n with leading coefficient (−1)^n. f(λ) is called the characteristic polynomial of A.

Eigenvalues of the matrix A are the roots of the equation f(λ) = 0. As a result, an n × n matrix has at most n eigenvalues.
As the homogeneous system (A − λI)v = 0 has infinitely many
non-trivial solutions, there are infinitely many eigenvectors of A
corresponding to λ.

Since the set of eigenvectors corresponding to λ is a vector space,


we only need to find eigenvectors that are linearly independent.
All other eigenvectors corresponding to λ may be expressed as
linear combinations of these linearly independent eigenvectors.

Proposition
Eigenvectors corresponding to different eigenvalues are linearly
independent.

For example, if λ and µ are different eigenvalues of A, u1 , u2 are


the eigenvectors of A corresponding to λ, and v1 , v2 , v3 are the
eigenvectors of A corresponding to µ, then ui and vj are linearly
independent for any i = 1, 2 and j = 1, 2, 3.

Example 16

Find the eigenvalues and eigenvectors of

    A = [ 0    1   0 ]
        [ 0    0   1 ]
        [ 6  -11   6 ]

Solution:

    |A − λI| = | -λ   1    0   |  c1+c2+c3→c1   | 1−λ   1     0   |
               |  0  -λ    1   |  ══════════    | 1−λ  -λ     1   |
               |  6  -11  6−λ  |                | 1−λ  -11   6−λ  |

    r2−r1→r2   | 1−λ    1     0   |
    r3−r1→r3   |  0   −λ−1    1   |  =  (1−λ) | −λ−1   1   |
    ════════   |  0   −12    6−λ  |           | −12   6−λ  |

    = (1−λ)[(−λ−1)(6−λ) − (−12)] = (1−λ)(λ−2)(λ−3) = 0

So the eigenvalues are λ1 = 1, λ2 = 2 and λ3 = 3 (single roots).
1) λ1 = 1:

    (A − λ1 I)v = 0  ⇔  [ -1   1   0 ] [ v1 ]
                        [  0  -1   1 ] [ v2 ] = 0
                        [  6  -11  5 ] [ v3 ]

    →  [ -1   1   0 ]       −v1 + v2 = 0          [ t ]       [ 1 ]
       [  0  -1   1 ]  ⇔    −v2 + v3 = 0   ⇒ v =  [ t ]  =  t [ 1 ]
       [  0   0   0 ]                             [ t ]       [ 1 ]

So the eigenvector is v1 = [1, 1, 1]^T.
 
2) λ2 = 2:

    (A − λ2 I)v = 0  ⇔  [ -2   1   0 ] [ v1 ]
                        [  0  -2   1 ] [ v2 ] = 0
                        [  6  -11  4 ] [ v3 ]

    →  [ -2   1   0 ]       −2v1 + v2 = 0          [ t/4 ]          [ 1 ]
       [  0  -2   1 ]  ⇔    −2v2 + v3 = 0   ⇒ v =  [ t/2 ]  = (t/4) [ 2 ]
       [  0   0   0 ]                              [ t   ]          [ 4 ]

So the eigenvector is v2 = [1, 2, 4]^T.
 
3) λ3 = 3:

    (A − λ3 I)v = 0  ⇔  [ -3   1   0 ] [ v1 ]
                        [  0  -3   1 ] [ v2 ] = 0
                        [  6  -11  3 ] [ v3 ]

    →  [ -3   1   0 ]       −3v1 + v2 = 0          [ t/9 ]          [ 1 ]
       [  0  -3   1 ]  ⇔    −3v2 + v3 = 0   ⇒ v =  [ t/3 ]  = (t/9) [ 3 ]
       [  0   0   0 ]                              [ t   ]          [ 9 ]

So the eigenvector is v3 = [1, 3, 9]^T.
The vectors v1, v2, v3 are linearly independent.
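The results of Example 16 can be cross-checked with a numerical eigensolver; a sketch assuming NumPy (eigenvectors come back normalized, so we rescale to match the hand computation):

```python
import numpy as np

# Matrix from Example 16
A = np.array([[0.0, 1, 0], [0, 0, 1], [6, -11, 6]])
eigvals, eigvecs = np.linalg.eig(A)

order = np.argsort(eigvals)           # sort eigenvalues into 1, 2, 3
eigvals = eigvals[order].real
v2 = eigvecs[:, order[1]].real        # eigenvector for lambda = 2
v2 = v2 / v2[0]                       # rescale so the first entry is 1
```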
Example 17

Find the eigenvalues and eigenvectors of

    A = [  3  -2  0 ]
        [ -2   3  0 ]
        [  0   0  5 ]

Solution:

    |A − λI| = | 3−λ  -2    0  |  =  (5−λ) | 3−λ  -2  |
               | -2   3−λ   0  |           | -2   3−λ |
               |  0    0   5−λ |

    = (5−λ)[(3−λ)^2 − 4] = (5−λ)^2 (1−λ)

So the eigenvalues are λ1 = 1 and λ2 = λ3 = 5 (a double root).

1) λ1 = 1: the eigenvector is v1 = [1, 1, 0]^T.
2) λ2 = λ3 = 5:

    (A − 5I)v = 0  ⇔  [ -2  -2  0 ] [ v1 ]
                      [ -2  -2  0 ] [ v2 ] = 0
                      [  0   0  0 ] [ v3 ]

    →  [ 1  1  0 ] [ v1 ]                             [ -s ]      [ -1 ]     [ 0 ]
       [ 0  0  0 ] [ v2 ] = 0  ⇔  v1 + v2 = 0  ⇒ v =  [  s ]  = s [  1 ] + t [ 0 ]
       [ 0  0  0 ] [ v3 ]                             [  t ]      [  0 ]     [ 1 ]

So the eigenvectors are v2 = [−1, 1, 0]^T and v3 = [0, 0, 1]^T.
The vectors v1, v2, v3 are linearly independent.
Example 18

Find the eigenvalues and eigenvectors of

    A = [  1  1  -1 ]
        [ -1  3  -1 ]
        [ -1  2   0 ]

Solution:

    |A − λI| = | 1−λ   1   -1 |  r3−r2→r3   | 1−λ   1     -1  |
               | -1   3−λ  -1 |  ════════   | -1   3−λ    -1  |
               | -1    2   -λ |             |  0   λ−1   1−λ  |

    c3+c2→c3   | 1−λ   1     0  |
    ════════   | -1   3−λ   2−λ |  =  −(2−λ) | 1−λ   1  |  =  (2−λ)(1−λ)^2
               |  0   λ−1    0  |            |  0   λ−1 |

So the eigenvalues are λ1 = 2 and λ2 = λ3 = 1 (a double root).
 
1) λ1 = 2: the eigenvector is v1 = [0, 1, 1]^T.
2) λ2 = λ3 = 1:

    A − I = [  0  1  -1 ]      [ 1  -2   1 ]        [ t ]
            [ -1  2  -1 ]  →   [ 0   1  -1 ]  ⇒ v = [ t ]
            [ -1  2  -1 ]      [ 0   0   0 ]        [ t ]

So the eigenvector is v2 = [1, 1, 1]^T.
The vectors v1 and v2 are linearly independent.

An n × n matrix has at most n linearly independent eigenvectors, but it does not always have exactly n linearly independent eigenvectors.
Diagonalization
Definition
A square matrix A is said to be diagonalizable if there is a
nonsingular matrix P such that P −1 AP is a diagonal matrix. We
also say that the matrix P diagonalizes A.

P is not unique.
Theorem
Let A be an n × n matrix. Then A is diagonalizable if and only if
A has n linearly independent eigenvectors.

If v1, v2, ..., vn are linearly independent eigenvectors of A corresponding to λ1, λ2, ..., λn (not necessarily distinct), then by taking

    P = [v1 v2 ... vn]   and   D = [ λ1  0   ...  0  ]
                                   [ 0   λ2  ...  0  ]
                                   [ ...             ]
                                   [ 0   0   ...  λn ]

we obtain AP = PD, since Avj = λj vj.
Both Examples 16 and 17 have 3 linearly independent eigenvectors, so those matrices are diagonalizable.

For Example 16,

    P = [ 1  1  1 ] ,   D = [ 1  0  0 ] .
        [ 1  2  3 ]         [ 0  2  0 ]
        [ 1  4  9 ]         [ 0  0  3 ]

For Example 17,

    P = [ 1   1  0 ] ,   D = [ 1  0  0 ] .
        [ 1  -1  0 ]         [ 0  5  0 ]
        [ 0   0  1 ]         [ 0  0  5 ]

But Example 18 only has 2 linearly independent eigenvectors, so its matrix is not diagonalizable.

Corollary
An n × n matrix with n distinct eigenvalues is diagonalizable.
Exercise 3
 
Let

  A = [ 5 4 ] .
      [ 1 2 ]
(a) Compute all the eigenvalues of A.
(b) Find a nonsingular matrix P such that P −1 AP is diagonal.

Solution:
(a)
  |A − λI| = | 5−λ   4  |
             |  1   2−λ |
           = (5 − λ)(2 − λ) − 4
           = λ² − 7λ + 6
           = (λ − 1)(λ − 6)

Therefore, the eigenvalues of A are λ1 = 1 and λ2 = 6.


  
(b) For λ1 = 1:

  (A − λ1 I)v = 0  ⇔  [ 4 4 ] [ v1 ] = 0
                      [ 1 1 ] [ v2 ]

  →  [ 1 1 ] [ v1 ] = 0  ⇔  v1 + v2 = 0  ⇒  v = [−s, s]^T = s [−1, 1]^T
     [ 0 0 ] [ v2 ]

The eigenvector is v1 = [−1, 1]^T.

For λ2 = 6:

  (A − λ2 I)v = 0  ⇔  [ −1  4 ] [ v1 ] = 0
                      [  1 −4 ] [ v2 ]

  →  [ 1 −4 ] [ v1 ] = 0  ⇔  v1 − 4v2 = 0  ⇒  v = [4s, s]^T = s [4, 1]^T
     [ 0  0 ] [ v2 ]

The eigenvector is v2 = [4, 1]^T.

Let

  P = [v1 v2] = [ −1 4 ] ,   then   P⁻¹AP = [ 1 0 ] .
                [  1 1 ]                    [ 0 6 ]
Application

Proposition
If A is diagonalizable and A = P DP⁻¹, then A^m = P D^m P⁻¹, where

  D^m = [ λ1^m   0   · · ·   0   ]
        [  0   λ2^m  · · ·   0   ]
        [  ⋮     ⋮     ⋱     ⋮   ]
        [  0     0   · · ·  λn^m ]

Proof:

  A^m = (P DP⁻¹)(P DP⁻¹) · · · (P DP⁻¹) = P D(P⁻¹P )D(P⁻¹P ) · · · DP⁻¹
      = P DD · · · DP⁻¹ = P D^m P⁻¹
This method is important in control theory.

More applications of diagonalization will be discussed in Chapter 3.


69 / 84
Example 19
 
Compute A^290, where

  A = [  3 −2 0 ]
      [ −2  3 0 ] .
      [  0  0 5 ]

Solution: Refer to Example 17,

  P = [ 1  1 0 ]        [ 1 0 0 ]
      [ 1 −1 0 ] ,  D = [ 0 5 0 ] .
      [ 0  0 1 ]        [ 0 0 5 ]

Using the Gauss-Jordan method to find P⁻¹:

  [ 1  1 0 | 1 0 0 ]
  [ 1 −1 0 | 0 1 0 ]
  [ 0  0 1 | 0 0 1 ]

  (r2 − r1 → r2)

  [ 1  1 0 |  1 0 0 ]
  [ 0 −2 0 | −1 1 0 ]
  [ 0  0 1 |  0 0 1 ]

  ((−1/2) r2 → r2)

  [ 1 1 0 |  1    0   0 ]
  [ 0 1 0 | 1/2 −1/2  0 ]
  [ 0 0 1 |  0    0   1 ]

  (r1 − r2 → r1)

  [ 1 0 0 | 1/2  1/2  0 ]
  [ 0 1 0 | 1/2 −1/2  0 ]
  [ 0 0 1 |  0    0   1 ]
70 / 84
     1 1

1 1 0 1 0 0 2 2 0
P =  1 −1 0  , D =  0 5 0  , P −1 =  12 − 21 0 
0 0 1 0 0 5 0 0 1

Therefore, A = P DP −1 ,
1 1
   
1 1 0 1 0 0 2 20
290 290 −1 1
A = P D P = 1 −1 0    0 5290 0  2 − 120 
0 0 1 0 0 5290 0 0 1
5290
  
1 0 1 1 0
= 1 −5
 290 0   1 −1 0 1
2
0 0 5290 0 0 2
290 290
 
1+5 1−5 0
1 290 290
= 1−5 1+5 0 
2
0 0 2 × 5290

71 / 84
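The same computation can be cross-checked numerically for a small exponent (a sketch assuming numpy; m = 6 is chosen, instead of 290, so the result can be compared against direct repeated multiplication without overflow concerns):

```python
import numpy as np

# A^m = P D^m P^{-1}: D^m is just the m-th powers on the diagonal.
# Same matrix as Example 19, but with a small exponent m.
A = np.array([[3.0, -2.0, 0.0],
              [-2.0, 3.0, 0.0],
              [0.0, 0.0, 5.0]])
m = 6

w, P = np.linalg.eig(A)
A_m = P @ np.diag(w ** m) @ np.linalg.inv(P)

direct = np.linalg.matrix_power(A, m)
print(np.allclose(A_m, direct))          # the two computations agree

# The closed form above gives (1 + 5^m)/2 for the (1,1)-entry:
print(A_m[0, 0], (1 + 5 ** m) / 2)
```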
§6 Inner Product and Orthogonality
Definition
If v = [v1 · · · vn]^T and w = [w1 · · · wn]^T are vectors in Rn, the
inner product of v and w is defined by

  ⟨v, w⟩ = v^T w = v1 w1 + · · · + vn wn .

Example
The inner product in R2 or R3 is the dot product, ⟨v, w⟩ = v · w.

Proposition
1. ⟨v, w⟩ = ⟨w, v⟩ for any v and w in Rn.
2. ⟨v + u, w⟩ = ⟨v, w⟩ + ⟨u, w⟩ for any v, u and w in Rn.
3. ⟨tv, w⟩ = t⟨v, w⟩ = ⟨v, tw⟩ for any v, w in Rn and for any
   scalar t.
4. ⟨v, v⟩ ≥ 0 for any v in Rn, and ⟨v, v⟩ = 0 ⇔ v = 0.
72 / 84
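In code, this inner product is simply the dot product; a small numpy illustration (not part of the original notes):

```python
import numpy as np

# <v, w> = v^T w is a plain dot product in numpy.
v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, -1.0, 2.0])

ip = v @ w
print(ip)  # 1*4 + 2*(-1) + 3*2 = 8.0

# Two of the properties from the proposition:
print(v @ w == w @ v)  # symmetry
print(v @ v >= 0)      # positivity
```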
Definition
1. The Euclidean norm of a vector v is defined by

     ‖v‖ = √⟨v, v⟩ = √(v1² + v2² + · · · + vn²) .

2. If ‖u‖ = 1, then u is said to be a unit vector.

Normalization
If v is a non-zero vector, then v/‖v‖ is a unit vector.

Definition
The angle between u and v is the unique number θ lying between
0 and π such that cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖).

Example
For any vectors u and v in R2, we have ⟨u, v⟩ = ‖u‖ ‖v‖ cos θ.
73 / 84
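The norm and angle formulas can be illustrated numerically; the sketch below (assuming numpy) reproduces the angle computed in Exercise 4(a) for the vectors [1, 1, 1]^T and [1, 1, 0]^T:

```python
import numpy as np

# Norm, normalization, and the angle formula
# cos(theta) = <u, v> / (||u|| ||v||).
u = np.array([1.0, 1.0, 1.0])
v = np.array([1.0, 1.0, 0.0])

cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
theta = np.arccos(cos_theta)
print(theta)  # equals arccos(sqrt(6)/3)

# Normalizing a non-zero vector yields a unit vector:
unit = v / np.linalg.norm(v)
print(np.linalg.norm(unit))  # 1.0
```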
Orthogonality

Definition
1. Two vectors v and w are said to be orthogonal if ⟨v, w⟩ = 0.
2. A set of vectors v1, . . . , vk is said to be an orthogonal set of
   vectors if the vectors are mutually orthogonal,
   i.e., ⟨vi, vj⟩ = 0 whenever i ≠ j.
3. An orthogonal set of vectors v1, . . . , vk is said to be an
   orthonormal set of vectors if all these vectors are unit vectors,
   i.e., ‖vi‖ = 1 for all i.

Example 20
Is { [0, 1, 0]^T, [1, 0, 1]^T, [1, 0, −1]^T } an orthogonal set of
vectors? Can it be normalized to an orthonormal set of vectors?
74 / 84
* 0   1 +
Solution:  1  ,  0  = 0 × 1 + 1 × 0 + 0 × 1 = 0,
0 1
* 0   1 + * 1   1 +
 1  ,  0  = 0,  0  ,  0  = 0.
0 −1 1 −1

So it is an orthogonal set of vectors.


     
0 p 1 √ 1 √
 1  = 02 + 12 + 02 = 1,  0  = 2,  0  = 2
0 1 −1
     
 0 1
1
1
1 
So  1  , √  0  , √  0  is an orthonormal set of

0 2 1 2 −1 
vectors.

75 / 84
Gram-Schmidt Process
Definition
Let v and w be vectors with w ≠ 0. We define the projection of v
onto w to be the vector

  proj_w v = (⟨v, w⟩ / ‖w‖²) w .

That is, proj_w v is obtained by projecting v onto span{w}.

• v − proj_w v is orthogonal to w.
• ‖proj_w v‖ = ‖v‖ |cos θ|, where θ is the angle between v and w.

This can be applied to a construction of orthonormal vectors known as
the Gram-Schmidt process.
76 / 84
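The projection formula and the orthogonality of v − proj_w v translate directly into code (a sketch assuming numpy; the vectors are arbitrary illustrative choices):

```python
import numpy as np

# proj_w v = (<v, w> / ||w||^2) w, and v - proj_w v is orthogonal to w.
def proj(v, w):
    return (v @ w) / (w @ w) * w

v = np.array([1.0, 2.0, 1.0])
w = np.array([1.0, 1.0, 1.0])

p = proj(v, w)          # here <v, w> = 4, ||w||^2 = 3, so p = (4/3) w
residual = v - p
print(residual @ w)     # zero up to floating-point rounding
```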
Gram-Schmidt process
Transform the given vectors v1, v2, . . . , vk into an orthonormal set.

Step 1. Orthogonalization (Projection)

  u1 = v1 ,
  u2 = v2 − (⟨v2, u1⟩ / ‖u1‖²) u1 ,        (project v2 onto span{u1})
  u3 = v3 − (⟨v3, u1⟩ / ‖u1‖²) u1 − (⟨v3, u2⟩ / ‖u2‖²) u2 ,
                                           (project v3 onto span{u1, u2})
  ...
  uk = vk − (⟨vk, u1⟩ / ‖u1‖²) u1 − (⟨vk, u2⟩ / ‖u2‖²) u2 − · · ·
          − (⟨vk, uk−1⟩ / ‖uk−1‖²) uk−1 .

Then, {u1, . . . , uk} is an orthogonal set of vectors.

Step 2. Normalization

  w1 = u1/‖u1‖ ,   w2 = u2/‖u2‖ ,   . . . ,   wk = uk/‖uk‖ .

Then, {w1, . . . , wk} is an orthonormal set of vectors.
77 / 84
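The two steps above translate almost line-for-line into code. This is a sketch assuming numpy; note that "classical" Gram-Schmidt as written here can lose orthogonality on ill-conditioned inputs, so numerical libraries usually rely on a QR factorization instead:

```python
import numpy as np

# Direct transcription of Step 1 (orthogonalization by subtracting
# projections) and Step 2 (normalization).
def gram_schmidt(vectors):
    us = []
    for v in vectors:
        u = v.astype(float)               # work on a copy of v
        for prev in us:                   # Step 1: subtract projections
            u = u - (v @ prev) / (prev @ prev) * prev
        us.append(u)
    # Step 2: normalize each orthogonal vector
    return [u / np.linalg.norm(u) for u in us]

# The vectors of Example 21:
v1 = np.array([1.0, 1.0, 1.0])
v2 = np.array([-1.0, 1.0, 0.0])
v3 = np.array([1.0, 2.0, 1.0])

w1, w2, w3 = gram_schmidt([v1, v2, v3])
W = np.column_stack([w1, w2, w3])
print(np.round(W.T @ W, 10))  # identity matrix: the w's are orthonormal
```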
Example 21
Use the Gram-Schmidt process to transform the given vectors into
an orthonormal set of vectors.
     
  v1 = [1, 1, 1]^T ,   v2 = [−1, 1, 0]^T ,   v3 = [1, 2, 1]^T

Solution: Let u1 = v1.

  ⟨v2, u1⟩ = −1 × 1 + 1 × 1 + 0 × 1 = 0

So v2 is orthogonal to u1, and we set u2 = v2.

  ⟨v3, u1⟩ = 1 + 2 + 1 = 4 ,   ⟨v3, u2⟩ = −1 + 2 + 0 = 1

We use projections to transform v3:

  u3 = v3 − (⟨v3, u1⟩ / ‖u1‖²) u1 − (⟨v3, u2⟩ / ‖u2‖²) u2
     = [1, 2, 1]^T − (4/3) [1, 1, 1]^T − (1/2) [−1, 1, 0]^T
     = (1/6) [1, 1, −2]^T

78 / 84
Now {u1, u2, u3} is an orthogonal set of vectors.

We normalize u1:

  w1 = u1/‖u1‖ = (1/√(1² + 1² + 1²)) [1, 1, 1]^T = (1/√3) [1, 1, 1]^T

Normalize u2:

  w2 = u2/‖u2‖ = (1/√((−1)² + 1² + 0²)) [−1, 1, 0]^T = (1/√2) [−1, 1, 0]^T

Normalize u3:

  w3 = u3/‖u3‖ = (1/√(1² + 1² + (−2)²)) [1, 1, −2]^T = (1/√6) [1, 1, −2]^T

So {w1, w2, w3} is an orthonormal set of vectors.
79 / 84
Exercise 4
Let v1 = [1, 1, 1]^T, v2 = [1, 1, 0]^T, v3 = [1, 0, 0]^T.
(a) Find the angle between the vectors v1 and v2.
(b) Use the Gram-Schmidt process to transform {v1, v2, v3} into
an orthonormal set.

Solution:
(a) We have

  ‖v1‖ = √(1² + 1² + 1²) = √3 ,
  ‖v2‖ = √(1² + 1² + 0²) = √2 ,
  ⟨v1, v2⟩ = 1 × 1 + 1 × 1 + 1 × 0 = 2 ,

so

  cos θ = ⟨v1, v2⟩ / (‖v1‖ ‖v2‖) = 2/√6 = √6/3 .

Therefore, θ = arccos(√6/3).

(b) Let u1 = v1, then ‖u1‖ = √3.

Since ⟨v2, u1⟩ = 2, let

  u2 = v2 − (⟨v2, u1⟩ / ‖u1‖²) u1 = [1, 1, 0]^T − (2/3) [1, 1, 1]^T
     = [1/3, 1/3, −2/3]^T = (1/3) [1, 1, −2]^T ,

then ‖u2‖ = √6/3.

Since ⟨v3, u1⟩ = 1 and ⟨v3, u2⟩ = 1/3, let

  u3 = v3 − (⟨v3, u1⟩ / ‖u1‖²) u1 − (⟨v3, u2⟩ / ‖u2‖²) u2
     = [1/2, −1/2, 0]^T = (1/2) [1, −1, 0]^T ,

then ‖u3‖ = √2/2.

Let

  w1 = u1/‖u1‖ = (1/√3) [1, 1, 1]^T ,
  w2 = u2/‖u2‖ = (1/√6) [1, 1, −2]^T ,
  w3 = u3/‖u3‖ = (1/√2) [1, −1, 0]^T .

Then, {w1, w2, w3} is an orthonormal set of vectors.


Orthogonal Matrix
Definition
A square matrix Q is called an orthogonal matrix if QQ^T = I.

• Any orthogonal matrix Q is nonsingular, and Q⁻¹ = Q^T.
• If Q = [q1 . . . qn] is an orthogonal matrix, then {q1, . . . , qn} is
  an orthonormal set of vectors, i.e., ⟨qi, qj⟩ = 0 whenever i ≠ j,
  and ‖qi‖ = 1.
• A square matrix A is said to be symmetric if A^T = A.

Theorem
Let A be a symmetric matrix of order n. Then
1. A has eigenvectors q1, . . . , qn forming an orthonormal set;
2. there is an orthogonal matrix Q such that A = QDQ^T, where
   D is a diagonal matrix. That is, any symmetric matrix can be
   diagonalized by an orthogonal matrix.
80 / 84
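This theorem corresponds to numpy's `eigh` routine for symmetric matrices, which returns the eigenvalues and an orthogonal matrix of eigenvector columns. A sketch (not part of the original notes) using the matrix of Example 22:

```python
import numpy as np

# For a symmetric matrix, np.linalg.eigh returns the eigenvalues
# (in ascending order) and an orthogonal matrix Q of eigenvector
# columns, so that A = Q D Q^T.
A = np.array([[3.0, -2.0, 4.0],
              [-2.0, 6.0, 2.0],
              [4.0, 2.0, 3.0]])

w, Q = np.linalg.eigh(A)
print(np.round(w, 10))  # eigenvalues -2, 7, 7

print(np.allclose(Q @ Q.T, np.eye(3)))       # Q is orthogonal
print(np.allclose(Q.T @ A @ Q, np.diag(w)))  # Q^T A Q is diagonal
```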
Example 22
 
Let

  A = [  3 −2 4 ]
      [ −2  6 2 ] .
      [  4  2 3 ]

Find an orthogonal matrix Q such that Q^T AQ is a diagonal matrix.

Solution: A^T = A. It is not difficult to find the eigenvalues and
eigenvectors of A:

  λ1 = −2:       v1 = [2, 1, −2]^T
  λ2 = λ3 = 7:   v2 = [1, −2, 0]^T ,   v3 = [0, 2, 1]^T

Use the Gram-Schmidt process to transform {v1, v2, v3} into an
orthonormal set {q1, q2, q3}.
81 / 84
Let u1 = v1, then ‖u1‖ = 3.

Since ⟨v2, u1⟩ = 2 × 1 + 1 × (−2) + (−2) × 0 = 0, we set

  u2 = v2 − (⟨v2, u1⟩ / ‖u1‖²) u1 = v2 ,

then ‖u2‖ = √5.

Since ⟨v3, u1⟩ = 0 and ⟨v3, u2⟩ = −4, we set

  u3 = v3 − (⟨v3, u1⟩ / ‖u1‖²) u1 − (⟨v3, u2⟩ / ‖u2‖²) u2
     = [0, 2, 1]^T + (4/5) [1, −2, 0]^T = (1/5) [4, 2, 5]^T ,

then ‖u3‖ = 3√5/5.
82 / 84
Let

  q1 = u1/‖u1‖ = (1/3) [2, 1, −2]^T ,
  q2 = u2/‖u2‖ = (1/√5) [1, −2, 0]^T ,
  q3 = u3/‖u3‖ = (1/(3√5)) [4, 2, 5]^T .

So

  Q = [q1 q2 q3] = [  2/3    1/√5   4/(3√5) ]
                   [  1/3   −2/√5   2/(3√5) ]
                   [ −2/3     0     5/(3√5) ]

is an orthogonal matrix and

  Q^T AQ = [ −2 0 0 ]
           [  0 7 0 ] .
           [  0 0 7 ]
83 / 84
Proposition
For a symmetric matrix, the eigenvectors corresponding to different
eigenvalues are orthogonal.

Recall the solution of Example 22.

For λ1 = −2, the eigenvector is v1 = [2, 1, −2]^T.

For λ2 = λ3 = 7, the eigenvectors are v2 = [1, −2, 0]^T and
v3 = [0, 2, 1]^T.

Let

  u3 = v3 − (⟨v3, v2⟩ / ‖v2‖²) v2 = [0, 2, 1]^T + (4/5) [1, −2, 0]^T
     = (1/5) [4, 2, 5]^T .

Let q1 = v1/‖v1‖, q2 = v2/‖v2‖, q3 = u3/‖u3‖. So Q = [q1 q2 q3] is an
orthogonal matrix.
84 / 84
Exercise 5
 
Let

  A = [ 5 2 ] .
      [ 2 2 ]
(a) Compute all the eigenvalues of A.
(b) Find an orthogonal matrix Q such that QT AQ is diagonal.

Solution:
(a)
  |A − λI| = | 5−λ   2  |
             |  2   2−λ |
           = (5 − λ)(2 − λ) − 4
           = λ² − 7λ + 6
           = (λ − 1)(λ − 6)

Therefore, the eigenvalues of A are λ1 = 1 and λ2 = 6.


  
(b) For λ1 = 1:

  (A − λ1 I)v = 0  ⇔  [ 4 2 ] [ v1 ] = 0
                      [ 2 1 ] [ v2 ]

  →  [ 2 1 ] [ v1 ] = 0  ⇔  2v1 + v2 = 0  ⇒  v = [−s/2, s]^T
     [ 0 0 ] [ v2 ]

The eigenvector is v1 = [−1, 2]^T.

For λ2 = 6:

  (A − λ2 I)v = 0  ⇔  [ −1  2 ] [ v1 ] = 0
                      [  2 −4 ] [ v2 ]

  →  [ 1 −2 ] [ v1 ] = 0  ⇔  v1 − 2v2 = 0  ⇒  v = [2s, s]^T
     [ 0  0 ] [ v2 ]

The eigenvector is v2 = [2, 1]^T.

We have ⟨v1, v2⟩ = 0. Let

  q1 = v1/‖v1‖ = (1/√5) [−1, 2]^T ,   q2 = v2/‖v2‖ = (1/√5) [2, 1]^T ,

and Q = [q1 q2].
