Matrices and Linear Algebra

Akhilesh Chandra Yadav


Department of Mathematics
M G Kashi Vidyapith, Varanasi
Preface

The purpose of writing this book is to provide a text for the undergraduate Linear Algebra course at Indian universities. It is therefore written according to the unified syllabus of Indian universities, to meet the requirements of undergraduate students.
This book is divided into nine chapters. The first four chapters cover the course on matrices and the rest cover the course on Linear Algebra. Chapter 1 introduces the notion of matrices and their operations, and discusses particular types of matrices such as symmetric, skew-symmetric, Hermitian, skew-Hermitian, orthogonal and unitary matrices. Chapter 2 introduces the rank of a matrix and the reduction of matrices to echelon and normal forms, together with the method of finding the inverse of a non-singular square matrix using elementary operations. Chapter 3 deals with the consistency and inconsistency of systems of linear equations, and with the application of ranks and echelon forms to finding their solutions. Chapter 4 introduces eigenvalues and eigenvectors, the nature of eigenvectors and their role in the diagonalization of square matrices, and an application of the Cayley-Hamilton theorem. Chapter 5 deals with the abstract notions of vector spaces, subspaces and quotient spaces. Chapter 6 studies linear transformations and linear functionals. Chapter 7 describes the matrix of a linear transformation and the effect of a change of basis on it. Chapter 8 deals with inner product spaces, and the last chapter with bilinear and quadratic forms.
I have tried my best to keep the book free from misprints. I will be thankful to readers who point out errors and give suggestions to improve the book.
I am thankful to Dr. Prithvish Nag, Vice Chancellor, M G Kashi Vidyapith University, for providing facilities and a healthy environment.
Students, colleagues, brothers and friends motivated me to write the book. I am thankful to each of them, including Professors R P Shukla, B K Sharma, Swapnil Srivastav, Vipul Kakkar, R P Singh, B D Pandey, Rajesh Kumar Yadav, J P Yadav, Raman Pant, Anil Kumar and Amalesh Yadav.
I am thankful to the publisher for presenting the book in its present form.
Lastly, I am grateful to my wife Manju for her constant support during the writing and typing of the book. I dedicate this book to my parents and Professor Ramji Lal.

Akhilesh Chandra Yadav


August 05, 2017
Dedicated to

My Parents

and

Professor Ramji Lal


Contents

1 Matrix Algebra 1
1.1 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Matrix operations . . . . . . . . . . . . . . . . . . . 5
1.3 Transpose of a matrix . . . . . . . . . . . . . . . . . 13
1.4 Symmetric and Skew-symmetric matrices . . . . . . 13
1.5 Hermitian and skew-Hermitian matrices . . . . . . . 17
1.6 Orthogonal and Unitary matrices . . . . . . . . . . . 23
1.7 Adjoint matrix . . . . . . . . . . . . . . . . . . . . . 25

2 Rank of a matrix 31
2.1 Elementary operations . . . . . . . . . . . . . . . . . 31
2.2 Elementary matrices . . . . . . . . . . . . . . . . . . 37
2.3 Echelon form . . . . . . . . . . . . . . . . . . . . . . 38
2.4 Linear dependence and independence . . . . . . . . . 43
2.5 Normal form . . . . . . . . . . . . . . . . . . . . . . 54
2.6 Inverse of a matrix . . . . . . . . . . . . . . . . . . . 61

3 System of linear equations 67


3.1 System of Homogeneous linear equations . . . . . . . 69
3.2 System of Non-homo. linear equations . . . . . . . . 79

4 Eigen values and Eigen vectors 91


4.1 Properties . . . . . . . . . . . . . . . . . . . . . . . . 98
4.2 Diagonalizable Matrix . . . . . . . . . . . . . . . . . 103
4.3 Cayley-Hamilton Theorem . . . . . . . . . . . . . . . 111

5 Vector Spaces 121


5.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . 121
5.2 Linear Dependence and Linear Independence . . . . 129
5.3 Bases and Dimensions . . . . . . . . . . . . . . . . . 136


5.4 Dimension of subspaces . . . . . . . . . . . . . . . . 144


5.5 Quotient Space . . . . . . . . . . . . . . . . . . . . . 150
5.6 Coordinates . . . . . . . . . . . . . . . . . . . . . . . 153

6 Linear Transformations 163


6.1 Rank-Nullity Theorem . . . . . . . . . . . . . . . . . 173
6.2 Algebra of Linear transformations . . . . . . . . . . 180
6.3 Linear functionals . . . . . . . . . . . . . . . . . . . 184
6.4 Annihilators . . . . . . . . . . . . . . . . . . . . . . . 191

7 Matrix Representations 201

8 Inner Product Spaces 215


8.1 Inner Products . . . . . . . . . . . . . . . . . . . . . 215
8.2 Notion of angle and orthogonality . . . . . . . . . . . 221
8.3 Orthonormal Sets and Bessel’s inequality . . . . . . 223
8.4 Gram-Schmidt Process . . . . . . . . . . . . . . . . . 226

9 Bilinear and Quadratic forms 233


9.1 Bilinear Forms . . . . . . . . . . . . . . . . . . . . . 233
9.2 Quadratic forms . . . . . . . . . . . . . . . . . . . . 241
Chapter 1

Matrix Algebra

In this chapter we shall introduce the notion of matrices together with their basic operations. Matrices are frequently used in several areas of mathematics, engineering, commerce and the social sciences, typically when raw data is arranged in a rectangular array or tabular form. In later chapters we shall see their application in the theory of linear equations.

1.1 Matrices
Definition 1.1.1 An m × n matrix A is a rectangular arrangement of mn numbers (real or complex) in m rows and n columns. Let $a_{ij}$ be the number that appears in the i-th row and j-th column of A, for all i = 1, 2, . . . , m and j = 1, 2, . . . , n. Then the matrix is denoted by $(a_{ij})_{m\times n}$ and $a_{ij}$ is called the (i, j)-th entry of A. Thus $A = (a_{ij})_{m\times n}$, and we read "A is an m by n matrix". Sometimes we also say that A is a matrix of order m × n, or that A is a matrix of size m × n.
In extended form it can be written as
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1j} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ij} & \cdots & a_{in} \\ \vdots & \vdots & & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mj} & \cdots & a_{mn} \end{pmatrix}$$


Some authors also use square brackets to denote matrices, that is,
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1j} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ij} & \cdots & a_{in} \\ \vdots & \vdots & & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mj} & \cdots & a_{mn} \end{bmatrix}$$
Here we prefer the first notation, $A = (a_{ij})_{m\times n}$.

Example 1.1.2 $A = \begin{pmatrix} 1 & 2 \\ 0 & 5 \end{pmatrix}$ is a 2 by 2 matrix and $B = \begin{pmatrix} 1 & 2 & 9 \\ 3 & 0 & 2 \end{pmatrix}$ is a 2 by 3 matrix.

Example 1.1.3 Write down the extended form of the matrix $A = (i + j)_{2\times 2}$.
Solution: The (i, j)-th entry $a_{ij}$ of the matrix A is given by $a_{ij} = i + j$. Hence $a_{11} = 1 + 1 = 2$, $a_{12} = 1 + 2 = 3$, $a_{21} = 2 + 1 = 3$ and $a_{22} = 2 + 2 = 4$. Thus,
$$A = \begin{pmatrix} 2 & 3 \\ 3 & 4 \end{pmatrix}$$

Example 1.1.4 The 3 × 3 matrix whose (i, j)-th entry is $(-1)^{i-j}$ for all i, j = 1, 2, 3, is given by
$$\begin{pmatrix} 1 & -1 & 1 \\ -1 & 1 & -1 \\ 1 & -1 & 1 \end{pmatrix}$$

Exercise 1.1.5 Write down the extended form of the following matrices:

1. $A = (ij)_{3\times 3}$.

2. $A = (a_{ij})_{2\times 2}$, where $a_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$.

3. $A = (a_{ij})_{3\times 3}$, where $a_{ij} = \begin{cases} 1 & \text{if } i + j \text{ is even} \\ 0 & \text{otherwise} \end{cases}$.

4. $A = (a_{ij})_{2\times 2}$, where $a_{ij} = \begin{cases} e^{i+j} & \text{if } i + j \text{ is even} \\ 0 & \text{otherwise} \end{cases}$.

Exercise 1.1.6 Write A whose (i, j)-th entries are given by
$$a_{ij} = \begin{cases} -1 & \text{if } i > j \\ 0 & \text{if } i = j \\ -1 & \text{if } i < j \end{cases}$$

Exercise 1.1.7 Write down 3 × 3 matrix A = (aij )3×3 , where aij


is given by
a). the least common multiple of i and j.
b). the greatest common divisor of i and j.

Definition 1.1.8 An m by n matrix A is called a square matrix if m = n. Let $A = (a_{ij})$ be a square matrix. Its entries $a_{11}, a_{22}, a_{33}, \ldots, a_{nn}$ are called the diagonal (main diagonal or principal diagonal) entries. All the entries $a_{ij}$ with i < j are called upper diagonal entries, and all the entries $a_{ij}$ with i > j are called lower diagonal entries. This is illustrated below for a 3 × 3 matrix:
$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$
Here $a_{11}, a_{22}, a_{33}$ are the main diagonal entries; the entries $a_{12}, a_{13}, a_{23}$ above the diagonal are the upper diagonal entries, and the entries $a_{21}, a_{31}, a_{32}$ below it are the lower diagonal entries.



If m = 1 then A is called a row vector. If n = 1 then A is called a column vector. For example,
$$\begin{pmatrix} 1 & 2 & 0 \\ 0 & 2 & 6 \\ 3 & 9 & 2 \end{pmatrix}$$
is a square matrix of order 3 × 3. Its first, second and third rows $(1\ 2\ 0)$, $(0\ 2\ 6)$ and $(3\ 9\ 2)$ are row vectors. Similarly, the columns $\begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix}$, $\begin{pmatrix} 2 \\ 2 \\ 9 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 6 \\ 2 \end{pmatrix}$ are column vectors.

Definition 1.1.9 A matrix $A = (a_{ij})_{m\times n}$ is called a zero matrix if $a_{ij} = 0$ for all i = 1, 2, . . . , m and j = 1, 2, . . . , n. Thus a zero matrix is a matrix whose entries are all 0. It is denoted by $0_{m\times n}$. For example, $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ are zero matrices of order 2 × 3 and 3 × 3 respectively.

Definition 1.1.10 A square matrix $A = (a_{ij})_{n\times n}$ is called an upper (lower) triangular matrix if $a_{ij} = 0$ for all i > j ($a_{ij} = 0$ for all i < j).
For example, $\begin{pmatrix} 1 & 0 & 5 \\ 0 & 4 & 2 \\ 0 & 0 & 5 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ -1 & 0 & 5 \end{pmatrix}$ are 3 × 3 upper and lower triangular matrices respectively. Note that a zero square matrix is both upper and lower triangular. A square matrix which is either upper triangular or lower triangular is called a triangular matrix.

Definition 1.1.11 An n × n square matrix $A = (a_{ij})_{n\times n}$ is called a diagonal matrix if $a_{ij} = 0$ for all $i \neq j$. Thus a diagonal matrix is a square matrix in which all the non-diagonal entries are 0. A diagonal matrix of size n × n is called a scalar matrix if $a_{11} = a_{22} = \ldots = a_{nn}$. In other words, an n × n matrix is called a scalar matrix if all the diagonal entries are the same and all non-diagonal entries are 0.
For example,
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 2 \end{pmatrix}$$
are diagonal matrices. The second matrix is a scalar matrix.

Definition 1.1.12 A square matrix $A = (a_{ij})_{n\times n}$ is called an identity matrix if
$$a_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$
It is denoted by $I_n$. Thus $I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ are examples of identity matrices.

Definition 1.1.13 (Equality) Two matrices $A = (a_{ij})_{m\times n}$ and $B = (b_{ij})_{p\times q}$ are said to be equal if m = p, n = q and $a_{ij} = b_{ij}$ for all i and j. Thus the matrices $I_2$ and $I_3$ are unequal because they have different numbers of rows (columns), i.e., they have different sizes. Next, the matrices
$$\begin{pmatrix} 1 & 2 & 2 \\ 0 & 1 & 5 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 & 2 & 2 \\ 9 & 1 & 5 \end{pmatrix}$$
have the same order but they are not equal, because their (2, 1)-th entries are not the same.

1.2 Matrix operations


Definition 1.2.1 (Addition and subtraction of matrices) Let $A = (a_{ij})_{m\times n}$ and $B = (b_{ij})_{m\times n}$ be two m × n matrices of the same order. Then the sum A + B is defined as the m × n matrix whose (i, j)-th entry is $a_{ij} + b_{ij}$ for all i and j. Thus $A + B = (a_{ij} + b_{ij})_{m\times n}$, i.e., the sum is obtained by adding corresponding entries of A and B. Similarly, the difference A − B is defined as the m × n matrix whose (i, j)-th entry is $a_{ij} - b_{ij}$ for all i and j. Thus $A - B = (a_{ij} - b_{ij})_{m\times n}$. Note that A + B and A − B are not defined if A and B do not have the same number of rows and columns.
6 CHAPTER 1. MATRIX ALGEBRA

Example 1.2.2
$$\begin{pmatrix} 1 & 0 & 5 \\ 2 & 3 & 9 \end{pmatrix} + \begin{pmatrix} 4 & 11 & 7 \\ 6 & 10 & 13 \end{pmatrix} = \begin{pmatrix} 1+4 & 0+11 & 5+7 \\ 2+6 & 3+10 & 9+13 \end{pmatrix} = \begin{pmatrix} 5 & 11 & 12 \\ 8 & 13 & 22 \end{pmatrix}$$
Similarly,
$$\begin{pmatrix} 1 & 0 & 5 \\ 2 & 3 & 9 \end{pmatrix} - \begin{pmatrix} 4 & 11 & 7 \\ 6 & 10 & 13 \end{pmatrix} = \begin{pmatrix} 1-4 & 0-11 & 5-7 \\ 2-6 & 3-10 & 9-13 \end{pmatrix} = \begin{pmatrix} -3 & -11 & -2 \\ -4 & -7 & -4 \end{pmatrix}$$
Proposition 1.2.3 Let A, B, C be any three m × n matrices. Then A + B = B + A and A + (B + C) = (A + B) + C.
Proof: Since the matrices A, B, C are of the same size, A + B, B + A, A + (B + C) and (A + B) + C are all defined. Let $A = (a_{ij})_{m\times n}$, $B = (b_{ij})_{m\times n}$ and $C = (c_{ij})_{m\times n}$. Then
$$A + B = (a_{ij} + b_{ij}) = (b_{ij} + a_{ij}) = B + A,$$
since $a_{ij} + b_{ij} = b_{ij} + a_{ij}$ for all i, j. Similarly, A + (B + C) = (A + B) + C follows from the fact that $(a_{ij} + b_{ij}) + c_{ij} = a_{ij} + (b_{ij} + c_{ij})$ for all i, j. □

Proposition 1.2.4 (i). For every m × n matrix A, we have a


unique matrix 0m×n such that A+0m×n = A = 0m×n +A (additive
identity).
(ii). For every m × n matrix A = (aij )m×n , we have a unique
matrix B = (−aij )m×n such that A + B = 0m×n = B + A. It is
denoted by −A (additive inverse of A).
Let Mm×n be the set of all m × n matrices. Then Mm×n forms
a group under addition of matrices.
Definition 1.2.5 (Scalar multiplication) Let c be any scalar
(real or complex) and A = (aij )m×n be any matrix. Then scalar
multiplication c.A is a matrix whose (i, j)-th entry is c.aij for
all i and j. Thus, c.A = (c.aij )m×n . Observe that additive inverse
−A = (−1).A. Hence A − B = A + (−1).B. We simply write cA
instead of c.A.
Example 1.2.6 If $A = \begin{pmatrix} 1 & 6 & 8 \\ 2 & 0 & 5 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 2 & 5 \\ 1 & 3 & 9 \end{pmatrix}$ then $3A = \begin{pmatrix} 3 & 18 & 24 \\ 6 & 0 & 15 \end{pmatrix}$ and $2B = \begin{pmatrix} 0 & 4 & 10 \\ 2 & 6 & 18 \end{pmatrix}$.

   
Exercise 1.2.7 If $A = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 6 & 7 \\ 3 & 1 & 4 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 0 & 1 \\ 3 & 5 & 2 \\ 1 & 4 & 4 \end{pmatrix}$ and $C = \begin{pmatrix} 1 & 1 & 1 \\ 2 & 2 & 5 \\ 3 & 1 & 44 \end{pmatrix}$, then evaluate 3A, (−7)B, A + 5B and 12A − 17B + 9C.

Proposition 1.2.8 If A and B are any two m × n matrices, then for any two scalars k and l we have:
(i). (k + l)A = kA + lA.
(ii). k(A + B) = kA + kB.
(iii). k(lA) = l(kA) = (kl)A.
(iv). $0A = 0_{m\times n}$.

Proof: Let $A = (a_{ij})_{m\times n}$ and $B = (b_{ij})_{m\times n}$. Then
$$(k + l)A = ((k + l)a_{ij}) = (ka_{ij} + la_{ij}) = (ka_{ij}) + (la_{ij}) = k(a_{ij}) + l(a_{ij}) = kA + lA.$$
This proves (i). Next,
$$k(A + B) = (k(a_{ij} + b_{ij})) = (ka_{ij} + kb_{ij}) = (ka_{ij}) + (kb_{ij}) = kA + kB.$$
This proves (ii). Similarly, we have k(lA) = l(kA) = (kl)A and $0A = 0_{m\times n}$. □

Since $M_{m\times n}$ is a group under addition of matrices, for every positive integer n the integral multiple nA is defined. Thus
$$nA = A + A + \ldots + A = (na_{ij})_{m\times n},$$
where $a_{ij}$ denotes the (i, j)-th entry of A.
Exercise 1.2.9 Let A and B be any two m × n matrices. Solve the following equations:
(i). $2(X + \tfrac{3}{2}A) = 5A - 2X$.
(ii). X + A = 2B − 3A.
(iii). If $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$, then evaluate X for which $2(X - \tfrac{1}{2}A) = \tfrac{3}{4}B - A$.
Definition 1.2.10 Let $A = (a_{ij})_{m\times n}$ and $B = (b_{kl})_{p\times q}$. Then the matrix multiplication AB of A by B is defined only if the number of columns in A equals the number of rows in B, i.e., if n = p. Suppose that $A = (a_{ij})_{m\times n}$ and $B = (b_{jk})_{n\times q}$. Then the matrix product AB of A by B is the m × q matrix whose (i, k)-th entry is given by $\sum_{j=1}^{n} a_{ij}b_{jk}$ (the product of the i-th row of A and the k-th column of B), i.e., given by
$$\begin{pmatrix} a_{i1} & a_{i2} & \cdots & a_{in} \end{pmatrix}\begin{pmatrix} b_{1k} \\ b_{2k} \\ \vdots \\ b_{nk} \end{pmatrix} = a_{i1}b_{1k} + a_{i2}b_{2k} + \ldots + a_{in}b_{nk}.$$
In other words, the entries of the i-th row of A are multiplied against the corresponding entries of the k-th column of B and the products are summed:
$$\sum_{j=1}^{n} a_{ij}b_{jk} = a_{i1}b_{1k} + a_{i2}b_{2k} + \ldots + a_{in}b_{nk}.$$

Example 1.2.11
$$\begin{pmatrix} 1 & 2 & 0 \\ 3 & 4 & 9 \end{pmatrix}\begin{pmatrix} 6 \\ 7 \\ 8 \end{pmatrix} = \begin{pmatrix} 1\times 6 + 2\times 7 + 0\times 8 \\ 3\times 6 + 4\times 7 + 9\times 8 \end{pmatrix} = \begin{pmatrix} 20 \\ 118 \end{pmatrix}$$
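The defining sum $\sum_{j=1}^{n} a_{ij}b_{jk}$ translates directly into code. The following sketch, assuming Python with NumPy, implements the product entry by entry and compares it with NumPy's built-in @ operator on the example above:

import numpy as np

def matmul(A, B):
    # (AB)_ik = sum over j of a_ij * b_jk, defined only when
    # the number of columns of A equals the number of rows of B.
    m, n = A.shape
    n2, q = B.shape
    assert n == n2, "columns of A must equal rows of B"
    C = np.zeros((m, q), dtype=A.dtype)
    for i in range(m):
        for k in range(q):
            C[i, k] = sum(A[i, j] * B[j, k] for j in range(n))
    return C

A = np.array([[1, 2, 0], [3, 4, 9]])
B = np.array([[6], [7], [8]])
print(matmul(A, B))   # [[ 20] [118]]
print(A @ B)          # same result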
 
Example 1.2.12 If $A = \begin{pmatrix} -1 & 2 \\ 1 & 3 \end{pmatrix}$ then
$$A^2 = AA = \begin{pmatrix} -1 & 2 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} -1 & 2 \\ 1 & 3 \end{pmatrix} = \begin{pmatrix} (-1)^2 + 2\cdot 1 & (-1)\cdot 2 + 2\cdot 3 \\ 1\cdot(-1) + 3\cdot 1 & 1\cdot 2 + 3\cdot 3 \end{pmatrix} = \begin{pmatrix} 3 & 4 \\ 2 & 11 \end{pmatrix}$$
 
Exercise 1.2.13 Let $A = \begin{pmatrix} 3 & 4 & -1 \\ -2 & 11 & 7 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 1 & 0 \\ -1 & 0 & 1 \\ 0 & -1 & -1 \end{pmatrix}$. Evaluate AB. What about BA? (Hint: BA is not defined; why?)
   
Example 1.2.14 If $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$ then
$$AB = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = 0_{2\times 2} \quad\text{but}\quad BA = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \neq 0_{2\times 2}.$$
Thus $AB \neq BA$ (i.e., in general the matrix product is non-commutative). Here A and B are both non-zero matrices but $AB = 0_{2\times 2}$. Thus, the product of two non-zero matrices may be zero.

Remark 1.2.15 Let A and B be any two matrices. Then we have the following possibilities:
(1). Both AB and BA are not defined.
(2). Exactly one of AB and BA is defined.
(3). Both AB and BA are defined but may have different sizes. If A and B are square matrices of the same size (order) then both AB and BA are defined and have the same size (order). However, they may or may not be equal (see Example 1.2.14). Thus, in general $AB \neq BA$. From Example 1.2.14 it also follows that the product of two non-zero matrices may be zero.

 
Example 1.2.16 Consider the matrices $A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}$. Then AB and BA are both defined, of orders (sizes) 2 × 2 and 3 × 3 respectively. Verify that $AB = \begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix}$ and $BA = \begin{pmatrix} 1 & 1 & 2 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}$.

Exercise 1.2.17 Compute the following matrix products:

1. $\begin{pmatrix} 1 & -1 & 3 \end{pmatrix}\begin{pmatrix} 3 \\ 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 3 \\ 1 \\ 0 \end{pmatrix}\begin{pmatrix} 1 & -1 & 3 \end{pmatrix}$.

2. $\begin{pmatrix} 1 & 2 & 4 \\ 0 & 3 & 1 \\ 0 & 0 & 2 \end{pmatrix}\begin{pmatrix} 2 & 9 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 1 \end{pmatrix}$.

3. $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 2 \end{pmatrix}\begin{pmatrix} 1 & 2 & 4 \\ 6 & 3 & 5 \\ 7 & 8 & 9 \end{pmatrix}$ and $\begin{pmatrix} 1 & 2 & 4 \\ 6 & 3 & 5 \\ 7 & 8 & 9 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 2 \end{pmatrix}$. Also check whether both products are the same. Indeed, if A is any square matrix of size n × n and $D = cI_n$ is any scalar matrix of size n × n then AD = DA.

Proposition 1.2.18 Matrix multiplication is associative, i.e., (AB)C = A(BC) for any three matrices A, B, C for which the products are defined.
Proof: Let $A = (a_{ij})_{m\times n}$, $B = (b_{jk})_{n\times p}$ and $C = (c_{kl})_{p\times r}$. Then the (i, l)-th entry of (AB)C is
$$\sum_{k=1}^{p}\left(\sum_{j=1}^{n} a_{ij}b_{jk}\right)c_{kl} = \sum_{j=1}^{n}\sum_{k=1}^{p} a_{ij}(b_{jk}c_{kl}) = \sum_{j=1}^{n} a_{ij}\left(\sum_{k=1}^{p} b_{jk}c_{kl}\right),$$
which is the (i, l)-th entry of A(BC). Hence (AB)C = A(BC). □

Exercise 1.2.19 Show that
$$ax^2 + by^2 + 2hxy + 2gx + 2fy + c = \begin{pmatrix} x & y & 1 \end{pmatrix}\begin{pmatrix} a & h & g \\ h & b & f \\ g & f & c \end{pmatrix}\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$$
Using this expression, express the following equations in matrix notation:
(1). $y^2 = 4ax$,
(2). $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$,
(3). $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$,
(4). $xy = 4$,
(5). $x^2 - y^2 + 2xy + x - 23y + 1 = 0$.

Exercise 1.2.20 Put the system of linear equations
$$2x + 3y = 1$$
$$x - y = 2$$
into the matrix form AX = B, where $X = \begin{pmatrix} x \\ y \end{pmatrix}$.
 
Exercise 1.2.21 If $A = \begin{pmatrix} 1 & 2 & 2 \\ 2 & 1 & 2 \\ 2 & 2 & 1 \end{pmatrix}$ then show that $A^2 - 4A - 5I = 0_{3\times 3}$.

Exercise 1.2.22 If $A = \begin{pmatrix} 3 & -4 \\ 1 & -1 \end{pmatrix}$ then prove that
$$A^k = \begin{pmatrix} 1 + 2k & -4k \\ k & 1 - 2k \end{pmatrix},$$
where k is any natural number.
 
Exercise 1.2.23 If $A_\theta = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$ and $A_\phi = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix}$, then prove that
$$A_\theta A_\phi = \begin{pmatrix} \cos(\theta + \phi) & \sin(\theta + \phi) \\ -\sin(\theta + \phi) & \cos(\theta + \phi) \end{pmatrix}.$$
In particular $(A_\theta)^n = A_{n\theta}$, where n is an integer. Also prove that $A_\theta A_\phi = I_2$ if and only if $\theta + \phi = 2\pi n$.
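The angle-addition identity of this exercise can be confirmed numerically; a small sketch assuming NumPy (the angles 0.7 and 0.5 are arbitrary test values):

import numpy as np

def A(theta):
    # the matrix A_theta of Exercise 1.2.23
    return np.array([[np.cos(theta),  np.sin(theta)],
                     [-np.sin(theta), np.cos(theta)]])

t, p = 0.7, 0.5
print(np.allclose(A(t) @ A(p), A(t + p)))    # True: A_theta A_phi = A_(theta+phi)
print(np.allclose(A(t) @ A(-t), np.eye(2)))  # True: theta + phi = 0 gives I_2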

Proposition 1.2.24 If A is any m×n matrix then we have unique


matrices Im and In such that Im A = A = AIn . In particular, if A
is a square matrix of size n × n then there exists a unique matrix
In such that AIn = A = In A.

Proof is left as an exercise for readers.


Proposition 1.2.25 If the relevant sums and products are defined then A(B + C) = AB + AC and (A + B)C = AC + BC (distributive laws).
The proof follows from the fact that
$$\sum_{j=1}^{n} a_{ij}(b_{jk} + c_{jk}) = \sum_{j=1}^{n} a_{ij}b_{jk} + \sum_{j=1}^{n} a_{ij}c_{jk}$$
for all i = 1, 2, . . . , m and k = 1, 2, . . . , p.


Let $M_n$ be the set of all n × n square matrices. Then $M_n$ forms a ring under matrix addition and multiplication. It is also noted that if AB is defined then c(AB) = (cA)B = A(cB) for every scalar c. Two matrices A, B are said to commute if AB = BA. Thus, for A and B to commute it is necessary that they are square matrices of the same size (type). Let $A, B \in M_n$ with AB = BA. Using the induction principle, one may easily prove that the binomial expansion
$$(A + B)^n = {}^nC_0 A^n + {}^nC_1 A^{n-1}B + \ldots + {}^nC_r A^{n-r}B^r + \ldots + {}^nC_n B^n$$
is valid.
Remark 1.2.26 For any $a \neq 0$ the matrix $A = \begin{pmatrix} 0 & a \\ a^{-1} & 0 \end{pmatrix}$ satisfies the equation $X^2 = I_2$. This shows that the identity matrix $I_2$ has infinitely many square roots.

Definition 1.2.27 An n × n square matrix A is said to be invertible if there exists a matrix P such that $PA = AP = I_n$. From the definition it follows that P is also invertible. The matrix P is called the inverse of A and is denoted by $A^{-1}$. One may easily prove that the inverse, if it exists, is unique.

1.3 Transpose of a matrix


Definition 1.3.1 (Transpose of a matrix) If A is an m × n matrix then the transpose of A is the n × m matrix whose (i, j)-th entry is the (j, i)-th entry of A. It is denoted by A′ or $A^T$ or $A^t$. Thus the columns of A become the rows of A′. One may easily prove that (A′)′ = A.

Example 1.3.2 The transpose of the matrix
$$A = \begin{pmatrix} -1 & 2 & 3 & 1 \\ 0 & 1 & -1 & -2 \\ 4 & -3 & 9 & 6 \end{pmatrix}$$
is given by
$$A' = \begin{pmatrix} -1 & 0 & 4 \\ 2 & 1 & -3 \\ 3 & -1 & 9 \\ 1 & -2 & 6 \end{pmatrix}.$$

Proposition 1.3.3 Let A, B, C be any three matrices such that AB and A + C are defined, and let c be any scalar. Then (AB)′ = B′A′, (cA)′ = cA′, (A + C)′ = A′ + C′ and [(A + C)B]′ = B′A′ + B′C′.
Proof is left as an exercise for readers.

1.4 Symmetric and Skew-symmetric matrices
Definition 1.4.1 A square matrix A is said to be symmetric if A′ = A, i.e., $a_{ji} = a_{ij}$ for all i, j = 1, 2, . . . , n. A square matrix A is said to be skew-symmetric if A′ = −A, i.e., $a_{ji} = -a_{ij}$ for all i, j = 1, 2, . . . , n. Note that if A is a skew-symmetric matrix then $a_{ji} = -a_{ij}$ for all i, j, and so $a_{ii} = -a_{ii}$, i.e., $a_{ii} = 0$, for all i = 1, 2, . . . , n. This shows that all diagonal entries of a skew-symmetric matrix are 0.

Example 1.4.2 The matrix
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 8 & 0 \\ 3 & 0 & 0 \end{pmatrix}$$
is symmetric because A′ = A, and the matrix
$$A = \begin{pmatrix} 0 & 2 & -3 \\ -2 & 0 & 0 \\ 3 & 0 & 0 \end{pmatrix}$$
is skew-symmetric because A′ = −A.

Example 1.4.3 The matrix
$$A = \begin{pmatrix} 0 & 2 & -3 \\ -2 & 0 & 0 \\ 3 & 0 & 1 \end{pmatrix}$$
is not skew-symmetric because its (3, 3)-th diagonal entry is 1, which is non-zero.

Exercise 1.4.4 For every square matrix A, A + A0 is symmetric


and A − A0 is skew-symmetric.

Exercise 1.4.5 Let A and B be any two square matrices such that
AB = BA. Then, we have the following:
(i). AB and BA both are symmetric if either both are symmetric
or both are skew-symmetric.
(ii). AB and BA both are skew-symmetric if one of them is skew-
symmetric.

Exercise 1.4.6 If A is an n × n symmetric matrix and B is an n × n skew-symmetric matrix then AB + BA is skew-symmetric and AB − BA is symmetric.

Exercise 1.4.7 If A and B are n × n matrices with A symmetric


and B skew- symmetric then test the nature of Ap B q Ap , where p, q
are positive integers.

Exercise 1.4.8 Let A, B be any two row vectors of size 1 × n. Prove that
(i). AB′ = BA′,
(ii). A′B − B′A is a skew-symmetric matrix of size n × n.

Exercise 1.4.9 (i) Prove that kA is symmetric if A is symmetric, where $k \neq 0$ is a scalar.
(ii) Prove that kA is skew-symmetric if A is skew-symmetric, where $k \neq 0$ is a scalar.

Proposition 1.4.10 Every square matrix can be uniquely expressed


as the sum of a symmetric matrix and a skew-symmetric matrix.

Proof: Let A be any square matrix. Clearly
$$A = \frac{A + A'}{2} + \frac{A - A'}{2} \qquad (1.4.1)$$
Let $P = \frac{A+A'}{2}$ and $Q = \frac{A-A'}{2}$. Then
$$A = P + Q \qquad (1.4.2)$$
Now $P' = [\frac{1}{2}(A + A')]' = \frac{1}{2}(A' + (A')') = \frac{1}{2}(A' + A) = P$, as (A′)′ = A. Thus P is a symmetric matrix. Next, $Q' = [\frac{1}{2}(A - A')]' = \frac{1}{2}(A' - (A')') = -Q$, as (A′)′ = A, i.e., Q is a skew-symmetric matrix. Hence, by equation 1.4.2, A can be expressed as the sum of a symmetric and a skew-symmetric matrix.
For uniqueness, let
$$A = P + Q \qquad (1.4.3)$$
$$A = R + S \qquad (1.4.4)$$
where P, R are symmetric and Q, S are skew-symmetric. Then P′ = P, R′ = R and Q′ = −Q, S′ = −S, and so, by equations 1.4.3 and 1.4.4, we have
$$A' = P' + Q' = P - Q \qquad (1.4.5)$$
$$A' = R' + S' = R - S \qquad (1.4.6)$$
Solving equations 1.4.3, 1.4.5 and 1.4.4, 1.4.6, we have $P = R = \frac{A+A'}{2}$ and $Q = S = \frac{A-A'}{2}$. Thus the expression is unique. □
 
Verification Let $A = \begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & 3 \\ 3 & -1 & 0 \end{pmatrix}$. Then
$$A' = \begin{pmatrix} 1 & -1 & 3 \\ -1 & 2 & -1 \\ 0 & 3 & 0 \end{pmatrix}.$$
Consider $P = \frac{A+A'}{2}$. Then
$$P = \begin{pmatrix} 1 & -1 & 3/2 \\ -1 & 2 & 1 \\ 3/2 & 1 & 0 \end{pmatrix}$$
Since P′ = P, P is symmetric. Now consider $Q = \frac{A-A'}{2}$. Then
$$Q = \begin{pmatrix} 0 & 0 & -3/2 \\ 0 & 0 & 2 \\ 3/2 & -2 & 0 \end{pmatrix}$$
Since the (i, j)-th entry of Q is the negative of its (j, i)-th entry for all i and j, i.e., Q′ = −Q, Q is skew-symmetric. Now
$$P + Q = \begin{pmatrix} 1 & -1 & 3/2 \\ -1 & 2 & 1 \\ 3/2 & 1 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 & -3/2 \\ 0 & 0 & 2 \\ 3/2 & -2 & 0 \end{pmatrix} = \begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & 3 \\ 3 & -1 & 0 \end{pmatrix} = A$$
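The same verification can be done numerically for any square matrix; a sketch assuming NumPy, with A.T denoting the transpose A′:

import numpy as np

A = np.array([[1, -1, 0], [-1, 2, 3], [3, -1, 0]], dtype=float)
P = (A + A.T) / 2   # symmetric part
Q = (A - A.T) / 2   # skew-symmetric part
print(np.allclose(P, P.T))    # True: P is symmetric
print(np.allclose(Q, -Q.T))   # True: Q is skew-symmetric
print(np.allclose(P + Q, A))  # True: A = P + Q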

Exercise 1.4.11 Express each of the following square matrices as the sum of a symmetric and a skew-symmetric matrix:
$$\begin{pmatrix} 1 & -2 & 5 \\ -7 & 2 & 3 \\ 3 & -1 & 8 \end{pmatrix}, \quad \begin{pmatrix} 1 & -2 & 5 \\ -7 & 2 & 3 \\ 0 & 0 & 8 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 & 5 & 0 \\ -7 & 1 & 1 & 3 \\ 3 & -1 & 0 & 8 \\ 1 & 1 & 1 & 1 \end{pmatrix}$$

1.5 Hermitian and skew-Hermitian matrices
Definition 1.5.1 Recall that the conjugate of a complex number z = a + ib, where a, b are real numbers, is given by $\bar{z} = a - ib$. Observe that $\bar{z} = z$ if and only if z is purely real, and $\bar{z} = -z$ if and only if z is purely imaginary, i.e., z = ia for some real a.

Definition 1.5.2 Let A be any m × n matrix. Then the conjugate of A is the m × n matrix whose (i, j)-th entry is the conjugate of the (i, j)-th entry of A for all i, j. It is denoted by $\bar{A}$. Thus, if $A = (a_{ij})_{m\times n}$ then $\bar{A} = (\overline{a_{ij}})_{m\times n}$.
For example, the conjugate of the matrix
$$A = \begin{pmatrix} 1 & 1+i & -5+i & 0 \\ -7 & 1 & 1 & 3-i \end{pmatrix}$$
is given by
$$\bar{A} = \begin{pmatrix} 1 & 1-i & -5-i & 0 \\ -7 & 1 & 1 & 3+i \end{pmatrix}.$$
Observe that $\bar{A} = A$ if and only if all the entries of A are purely real.

Exercise 1.5.3 Find the conjugate matrix of the following:
$$\begin{pmatrix} 1 & 1-i & -5 & 0 \\ -7 & 1 & 1 & i \end{pmatrix}, \quad \begin{pmatrix} 0 & 1-i & -5-i \\ -3 & -1 & 0 \\ 1 & 1 & 2 \end{pmatrix}.$$

Proposition 1.5.4 Let A, B be any two matrices of the same size and k be any scalar. Then we have:
(i). $\bar{\bar{A}} = A$,
(ii). $\overline{kA} = \bar{k}\,\bar{A}$,
(iii). $\overline{A + B} = \bar{A} + \bar{B}$,
(iv). If AB is defined then $\overline{AB} = \bar{A}\,\bar{B}$,
(v). $\overline{A'} = (\bar{A})'$.
Proof is left as an exercise for readers. □

Definition 1.5.5 The conjugate transpose (or tranjugate) of an m × n matrix A is the n × m matrix whose (i, j)-th entry is the conjugate of the (j, i)-th entry of A. It is denoted by $A^\theta$. Thus $A^\theta = \overline{(A')}$. Since $\overline{(A')} = (\bar{A})'$, we also have $A^\theta = (\bar{A})'$. If $A = (a_{ij})_{m\times n}$ then $A^\theta = (\overline{a_{ji}})_{n\times m}$.
For example, if
$$A = \begin{pmatrix} 0 & 1-i & -5-i \\ -3 & -1 & 0 \\ 1 & 1 & 2 \end{pmatrix} \quad\text{then}\quad A^\theta = \begin{pmatrix} 0 & -3 & 1 \\ 1+i & -1 & 1 \\ -5+i & 0 & 2 \end{pmatrix}$$

Proposition 1.5.6 (i). $(A^\theta)^\theta = A$,
(ii). $(kA)^\theta = \bar{k}A^\theta$, where k is any scalar (real or complex),
(iii). $(A + B)^\theta = A^\theta + B^\theta$,
(iv). $(AB)^\theta = B^\theta A^\theta$.
Proof is left as an exercise for readers. □

Definition 1.5.7 A square matrix $A = (a_{ij})_{n\times n}$ is called a Hermitian matrix if $A^\theta = A$, i.e., if $\overline{a_{ji}} = a_{ij}$ for all i, j = 1, 2, . . . , n. A square matrix A is called a skew-Hermitian matrix if $A^\theta = -A$, i.e., if $\overline{a_{ji}} = -a_{ij}$ for all i, j = 1, 2, . . . , n.

Proposition 1.5.8 All the diagonal entries of a Hermitian matrix are real.
Proof: Let $A = (a_{ij})_{n\times n}$ be Hermitian. Then $\overline{a_{ji}} = a_{ij}$ for all i, j = 1, 2, . . . , n. In particular, $\overline{a_{ii}} = a_{ii}$ for all i, i.e., $a_{ii}$ is real for all i = 1, 2, . . . , n. □

Proposition 1.5.9 All the diagonal entries of a skew-Hermitian matrix are either 0 or purely imaginary.
Proof: Let $A = (a_{ij})_{n\times n}$ be skew-Hermitian. Then $\overline{a_{ji}} = -a_{ij}$ for all i, j = 1, 2, . . . , n. In particular, $\overline{a_{ii}} = -a_{ii}$ for all i, i.e., $a_{ii}$ is either 0 or purely imaginary for each i = 1, 2, . . . , n. □
 
Example 1.5.10 The matrix $A = \begin{pmatrix} 2 & 1+i \\ 1-i & 1 \end{pmatrix}$ is Hermitian. For,
$$\bar{A} = \begin{pmatrix} 2 & 1-i \\ 1+i & 1 \end{pmatrix} \quad\text{and so}\quad A^\theta = (\bar{A})' = \begin{pmatrix} 2 & 1+i \\ 1-i & 1 \end{pmatrix} = A.$$

Example 1.5.11 The matrix
$$B = \begin{pmatrix} 0 & 1+i & 2-i \\ -1+i & i & 3+7i \\ -2-i & -3+7i & -3i \end{pmatrix}$$
is skew-Hermitian. For, given B as above,
$$\bar{B} = \begin{pmatrix} 0 & 1-i & 2+i \\ -1-i & -i & 3-7i \\ -2+i & -3-7i & 3i \end{pmatrix}.$$
Thus,
$$B^\theta = (\bar{B})' = \begin{pmatrix} 0 & -1-i & -2+i \\ 1-i & -i & -3-7i \\ 2+i & 3-7i & 3i \end{pmatrix} = -\begin{pmatrix} 0 & 1+i & 2-i \\ -1+i & i & 3+7i \\ -2-i & -3+7i & -3i \end{pmatrix} = -B.$$

Exercise 1.5.12 Identify the Hermitian and the skew-Hermitian matrix among the following:
$$\begin{pmatrix} i & 3+2i & -2-i \\ -3+2i & 0 & 3-4i \\ 2-i & -3-4i & -2i \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 3 & 2-3i & 3+5i \\ 2+3i & 5 & i \\ 3-5i & -i & 7 \end{pmatrix}$$

Exercise 1.5.13 Prove the following:

(i). If A is Hermitian then kA is Hermitian when k is real, and kA is skew-Hermitian when k is purely imaginary (i.e., k = ia, $a \neq 0$ and a real).

(ii). If A is skew-Hermitian then kA is skew-Hermitian when k is real, and kA is Hermitian when k is purely imaginary.

(iii). If A is Hermitian (or skew-Hermitian) then (a + ib)A is neither Hermitian nor skew-Hermitian, where a and b are both non-zero real numbers.
Example 1.5.14 For a given square matrix A, $\frac{A+A^\theta}{2}$ is Hermitian and $\frac{A-A^\theta}{2}$ is skew-Hermitian. For, let $P = \frac{A+A^\theta}{2}$ and $Q = \frac{A-A^\theta}{2}$. Then
$$P^\theta = \left[\frac{1}{2}(A + A^\theta)\right]^\theta = \frac{1}{2}\left[A^\theta + (A^\theta)^\theta\right] = P$$
$$Q^\theta = \left[\frac{1}{2}(A - A^\theta)\right]^\theta = \frac{1}{2}\left[A^\theta - (A^\theta)^\theta\right] = -Q$$
as $(A^\theta)^\theta = A$. Thus $P = \frac{A+A^\theta}{2}$ is Hermitian and $Q = \frac{A-A^\theta}{2}$ is skew-Hermitian.

Proposition 1.5.15 Every square matrix can be uniquely expressed as the sum of a Hermitian matrix and a skew-Hermitian matrix.
Proof: Let A be any square matrix. Clearly
$$A = \frac{A + A^\theta}{2} + \frac{A - A^\theta}{2} \qquad (1.5.1)$$
Then
$$A = P + Q \qquad (1.5.2)$$
where $P = \frac{A+A^\theta}{2}$ and $Q = \frac{A-A^\theta}{2}$. Now,
$$P^\theta = \left[\frac{1}{2}(A + A^\theta)\right]^\theta = \frac{1}{2}\left[A^\theta + (A^\theta)^\theta\right] = P$$
$$Q^\theta = \left[\frac{1}{2}(A - A^\theta)\right]^\theta = \frac{1}{2}\left[A^\theta - (A^\theta)^\theta\right] = -Q$$
as $(A^\theta)^\theta = A$. Thus $P = \frac{A+A^\theta}{2}$ is Hermitian and $Q = \frac{A-A^\theta}{2}$ is skew-Hermitian, and so by equation 1.5.2 A can be expressed as the sum of a Hermitian matrix and a skew-Hermitian matrix. For uniqueness, let
$$A = P + Q \qquad (1.5.3)$$
$$A = R + S \qquad (1.5.4)$$
where P, R are Hermitian and Q, S are skew-Hermitian. Then $P^\theta = P$, $R^\theta = R$ and $Q^\theta = -Q$, $S^\theta = -S$, and so, by equations 1.5.3 and 1.5.4, we have
$$A^\theta = P^\theta + Q^\theta = P - Q \qquad (1.5.5)$$
$$A^\theta = R^\theta + S^\theta = R - S \qquad (1.5.6)$$
Solving equations 1.5.3, 1.5.5 and 1.5.4, 1.5.6, we have $P = R = \frac{A+A^\theta}{2}$ and $Q = S = \frac{A-A^\theta}{2}$. Thus the expression is unique. □

Example 1.5.16 Every square matrix can be uniquely expressed as the sum P + iQ, where P and Q are Hermitian matrices. For, let A be any square matrix. Clearly
$$A = \frac{A + A^\theta}{2} + i\,\frac{A - A^\theta}{2i} \qquad (1.5.7)$$
Then
$$A = P + iQ \qquad (1.5.8)$$
where $P = \frac{A+A^\theta}{2}$ and $Q = \frac{A-A^\theta}{2i}$. Now,
$$P^\theta = \left[\frac{1}{2}(A + A^\theta)\right]^\theta = \frac{1}{2}\left[A^\theta + (A^\theta)^\theta\right] = P$$
$$Q^\theta = \left[\frac{1}{2i}(A - A^\theta)\right]^\theta = \frac{1}{-2i}\left[A^\theta - (A^\theta)^\theta\right] = Q$$
as $(A^\theta)^\theta = A$. Thus $P = \frac{A+A^\theta}{2}$ and $Q = \frac{A-A^\theta}{2i}$ are Hermitian, and so by equation 1.5.8 A can be expressed as the sum of Hermitian matrices in the stated form. For uniqueness, let
$$A = P + iQ \qquad (1.5.9)$$
$$A = R + iS \qquad (1.5.10)$$
where P, Q, R and S are Hermitian. Then $P^\theta = P$, $R^\theta = R$ and $Q^\theta = Q$, $S^\theta = S$, and so, by equations 1.5.9 and 1.5.10, we have
$$A^\theta = P^\theta - iQ^\theta = P - iQ \qquad (1.5.11)$$
$$A^\theta = R^\theta - iS^\theta = R - iS \qquad (1.5.12)$$
Solving equations 1.5.9, 1.5.11 and 1.5.10, 1.5.12, we have $P = R = \frac{A+A^\theta}{2}$ and $Q = S = \frac{A-A^\theta}{2i}$. Thus the expression is unique.
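Both decompositions of this section are easy to check numerically. A sketch assuming NumPy, where A.conj().T computes the conjugate transpose $A^\theta$:

import numpy as np

A = np.array([[1 + 1j, 2], [-3, 1 - 5j]])
Ah = A.conj().T          # A^theta
P = (A + Ah) / 2         # Hermitian
Q = (A - Ah) / (2j)      # also Hermitian
print(np.allclose(P, P.conj().T))  # True
print(np.allclose(Q, Q.conj().T))  # True
print(np.allclose(P + 1j * Q, A))  # True: A = P + iQ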
 
Exercise 1.5.17 Let $A = \begin{pmatrix} 1+i & 2 \\ -3 & 1-5i \end{pmatrix}$. Then
(i). find a Hermitian matrix P and a skew-Hermitian matrix Q such that A = P + Q, i.e., express A as the sum of a Hermitian matrix and a skew-Hermitian matrix;
(ii). express A as the sum P + iQ, where P, Q are Hermitian matrices.

Solution: (i). Given that
$$A = \begin{pmatrix} 1+i & 2 \\ -3 & 1-5i \end{pmatrix},$$
we have
$$A^\theta = \begin{pmatrix} 1-i & -3 \\ 2 & 1+5i \end{pmatrix}.$$
Hence
$$\frac{A + A^\theta}{2} = \frac{1}{2}\left[\begin{pmatrix} 1+i & 2 \\ -3 & 1-5i \end{pmatrix} + \begin{pmatrix} 1-i & -3 \\ 2 & 1+5i \end{pmatrix}\right] = \begin{pmatrix} 1 & -1/2 \\ -1/2 & 1 \end{pmatrix}$$
Next,
$$\frac{A - A^\theta}{2} = \frac{1}{2}\left[\begin{pmatrix} 1+i & 2 \\ -3 & 1-5i \end{pmatrix} - \begin{pmatrix} 1-i & -3 \\ 2 & 1+5i \end{pmatrix}\right] = \begin{pmatrix} i & 5/2 \\ -5/2 & -5i \end{pmatrix}$$
Take $P = \begin{pmatrix} 1 & -1/2 \\ -1/2 & 1 \end{pmatrix}$ and $Q = \begin{pmatrix} i & 5/2 \\ -5/2 & -5i \end{pmatrix}$. Clearly P is Hermitian and Q is skew-Hermitian, and A = P + Q.
(ii). Consider $P = \frac{A+A^\theta}{2}$ and $Q = \frac{A-A^\theta}{2i}$. Then we have
$$P = \begin{pmatrix} 1 & -1/2 \\ -1/2 & 1 \end{pmatrix} \quad\text{and}\quad Q = \begin{pmatrix} 1 & -5i/2 \\ 5i/2 & -5 \end{pmatrix}$$
Clearly these are Hermitian matrices such that
$$P + iQ = \begin{pmatrix} 1 & -1/2 \\ -1/2 & 1 \end{pmatrix} + i\begin{pmatrix} 1 & -5i/2 \\ 5i/2 & -5 \end{pmatrix} = \begin{pmatrix} 1+i & 2 \\ -3 & 1-5i \end{pmatrix} = A$$

Exercise 1.5.18 Find a Hermitian matrix P and a skew-Hermitian matrix Q such that P + Q = A, where A is given by
$$(i).\ \begin{pmatrix} 1 & -1 & i \\ -1 & i & 1 \\ 1 & -1 & 0 \end{pmatrix} \qquad (ii).\ \begin{pmatrix} 3 & -1 & i \\ -1 & 3-i & 1+2i \\ 1+i & -1+i & 3-2i \end{pmatrix}$$

Exercise 1.5.19 Find Hermitian matrices P and Q such that P + iQ = A, where A is given by
$$(i).\ \begin{pmatrix} 1 & -1 & 2 \\ 8 & i & 1 \\ 1 & -1 & 0 \end{pmatrix} \qquad (ii).\ \begin{pmatrix} 3 & 9 & 0 \\ -1 & 3-i & 0 \\ 1+i & 0 & 3-2i \end{pmatrix}$$

1.6 Orthogonal and Unitary matrices

Definition 1.6.1 A square matrix A of size n × n is said to be an orthogonal matrix if $A'A = I_n = AA'$. It is said to be a unitary matrix if $A^\theta A = I_n = AA^\theta$. If $A^2 = A$ then it is called an idempotent matrix. If $A^2 = I_n$ then A is called an involutory matrix. If $A^n = 0$ for some $n \in \mathbb{N}$ then A is called a nilpotent matrix. If n is the smallest natural number such that $A^n = 0$, then A is called nilpotent of class (or index) n.

 
Example 1.6.2 Let $A = \begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}$. Then $A' = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}$. Now,
$$A'A = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} \cos^2\phi + \sin^2\phi & \cos\phi\sin\phi - \sin\phi\cos\phi & 0 \\ \cos\phi\sin\phi - \sin\phi\cos\phi & \sin^2\phi + \cos^2\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} = I_3$$
Similarly $AA' = I_3$. Thus A is orthogonal. Since $A^\theta = A'$ here, A is a unitary matrix too.
 
Example 1.6.3 Let $A = \frac{1}{\sqrt{3}}\begin{pmatrix} 1 & 1+i \\ 1-i & -1 \end{pmatrix}$. Then
$$\bar{A} = \frac{1}{\sqrt{3}}\begin{pmatrix} 1 & 1-i \\ 1+i & -1 \end{pmatrix}.$$
Hence
$$A^\theta = (\bar{A})' = \frac{1}{\sqrt{3}}\begin{pmatrix} 1 & 1+i \\ 1-i & -1 \end{pmatrix}$$
and so
$$AA^\theta = \frac{1}{\sqrt{3}}\begin{pmatrix} 1 & 1+i \\ 1-i & -1 \end{pmatrix}\cdot\frac{1}{\sqrt{3}}\begin{pmatrix} 1 & 1+i \\ 1-i & -1 \end{pmatrix} = \frac{1}{3}\begin{pmatrix} 1 + (1+i)(1-i) & 1+i - (1+i) \\ 1-i - (1-i) & (1+i)(1-i) + 1 \end{pmatrix} = \frac{1}{3}\begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} = I_2$$
Similarly, $A^\theta A = I_2$ (verify). Hence A is a unitary matrix.
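Unitarity, too, reduces to a one-line numerical check; a sketch assuming NumPy:

import numpy as np

A = (1 / np.sqrt(3)) * np.array([[1, 1 + 1j], [1 - 1j, -1]])
print(np.allclose(A.conj().T @ A, np.eye(2)))  # True: A^theta A = I_2
print(np.allclose(A @ A.conj().T, np.eye(2)))  # True: A A^theta = I_2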
 
Exercise 1.6.4 Prove that $\begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix}$ is orthogonal (unitary).


Exercise 1.6.5 Prove that $A = \begin{pmatrix} -5 & -8 & 0 \\ 3 & 5 & 0 \\ 1 & 2 & -1 \end{pmatrix}$ is involutory, i.e., $A^2 = I_3$.

Exercise 1.6.6 Prove that $A = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \\ -1 & -2 & -3 \end{pmatrix}$ is nilpotent of class (index) 2, i.e., $A^2 = 0_3$.

Exercise 1.6.7 Prove that the matrix $A = \begin{pmatrix} 2 & -2 & -4 \\ -1 & 3 & 4 \\ 1 & -2 & -3 \end{pmatrix}$ is idempotent, i.e., $A^2 = A$.
 
Exercise 1.6.8 Prove that the matrix $A = \begin{pmatrix} \alpha + i\gamma & -\beta + i\delta \\ \beta + i\delta & \alpha - i\gamma \end{pmatrix}$ is a unitary matrix if $\alpha^2 + \beta^2 + \gamma^2 + \delta^2 = 1$.
Hint: If $\alpha^2 + \beta^2 + \gamma^2 + \delta^2 = 1$, then $A^\theta A = I_2 = AA^\theta$.
 
Exercise 1.6.9 Find the values of a, b, c for which $\begin{pmatrix} 0 & 2b & c \\ a & b & -c \\ a & -b & c \end{pmatrix}$ is orthogonal.

1.7 Adjoint matrix


Let $A = (a_{ij})_{n\times n}$ be a square matrix. Let $A_{ij}$ denote the (n − 1) × (n − 1) square matrix obtained from A by deleting the i-th row and j-th column of A, i.e., the row and column containing $a_{ij}$. For example, if $A = \begin{pmatrix} 1 & 2 & 0 \\ -1 & 3 & 4 \\ 5 & 6 & 7 \end{pmatrix}$, then $A_{11} = \begin{pmatrix} 3 & 4 \\ 6 & 7 \end{pmatrix}$ and $A_{23} = \begin{pmatrix} 1 & 2 \\ 5 & 6 \end{pmatrix}$. Suppose that det A or |A| denotes the determinant of the square matrix A. Then the quantity $(-1)^{i+j}\det A_{ij}$ is called the cofactor of $a_{ij}$, for all i and j. It is denoted by $A^{ij}$. Thus,
$$A^{ij} = (-1)^{i+j}\det A_{ij}$$
for all i, j = 1, 2, . . . , n.
For example, the cofactor $A^{21}$ of $a_{21}$ in $A = \begin{pmatrix} 1 & 2 & 0 \\ -1 & 3 & 4 \\ 5 & 6 & 7 \end{pmatrix}$ is
$$A^{21} = (-1)^{2+1}\begin{vmatrix} 2 & 0 \\ 6 & 7 \end{vmatrix} = -14.$$
The matrix $(A^{ij})_{n\times n}$ is called the cofactor matrix of A.


Definition 1.7.1 The matrix $(A^{ji})_{n\times n}$, where $A^{ji}$ denotes the cofactor of $a_{ji}$ in the square matrix $A = (a_{ij})_{n\times n}$, is called the adjoint matrix or adjugate of A. It is denoted by Adj A. Thus,
$$\text{Adj } A = (A^{ji}) = (A^{ij})',$$
the transpose of the cofactor matrix $(A^{ij})$.

Example 1.7.2 The cofactor matrix of the matrix $A = \begin{pmatrix} 1 & 2 \\ 3 & 9 \end{pmatrix}$ is $\begin{pmatrix} 9 & -3 \\ -2 & 1 \end{pmatrix}$. Thus, $\text{Adj } A = \begin{pmatrix} 9 & -2 \\ -3 & 1 \end{pmatrix}$.
Observe that $a_{11}A^{11} + a_{12}A^{12} = |A|$. In general,
$$\sum_{j=1}^{n} a_{ij}A^{ij} = |A|.$$

Proposition 1.7.3 For every square matrix A,
$$A\cdot\text{Adj } A = |A|\, I_n$$
Proof: Let A be a square matrix. Then
$$[A\cdot\text{Adj } A]_{ij} = \sum_{k=1}^{n} a_{ik}(\text{Adj } A)_{kj} = \sum_{k=1}^{n} a_{ik}A^{jk} = \begin{cases} |A| & \text{if } j = i \\ 0 & \text{if } j \neq i \end{cases}$$
because $a_{i1}A^{i1} + a_{i2}A^{i2} + \ldots + a_{in}A^{in} = |A|$ and $\sum_{k=1}^{n} a_{ik}A^{jk} = 0$ for $j \neq i$. Thus,
$$A\cdot\text{Adj } A = \begin{pmatrix} |A| & 0 & \cdots & 0 \\ 0 & |A| & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & |A| \end{pmatrix} = |A|\, I_n$$
□
Thus, if $|A| \neq 0$, then $A\cdot\frac{\text{Adj } A}{|A|} = I_n$, i.e., A is invertible and
$$A^{-1} = \frac{\text{Adj } A}{|A|}$$
Indeed, A is invertible if and only if $|A| \neq 0$.
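For small matrices the cofactor definition can be implemented directly. A sketch assuming NumPy (np.linalg.det computes determinants; for large matrices this cofactor route is far slower than standard inversion):

import numpy as np

def adjugate(A):
    # (Adj A)_ij is the cofactor A^{ji} = (-1)^(j+i) det A_{ji},
    # where A_{ji} deletes row j and column i.
    n = A.shape[0]
    adj = np.zeros_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, j, axis=0), i, axis=1)
            adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return adj

A = np.array([[1.0, 2.0], [3.0, 9.0]])
print(adjugate(A))                     # [[ 9. -2.] [-3.  1.]]
print(adjugate(A) / np.linalg.det(A))  # A^{-1}, since det A = 3 != 0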

Exercises
Exercise 1.7.4 Write down the extended form of the matrix $A = (a_{ij})_{2\times 2}$, where $a_{ij} = e^{i-j}$ for all i and j.

Exercise 1.7.5 Write down the extended form of the matrix $A = (a_{ij})_{3\times 4}$, where
$$a_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$
 
Exercise 1.7.6 Evaluate A − 3B, where $A = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 8 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 5 \end{pmatrix}$.

Exercise 1.7.7 Let $A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 5 \\ 2 & 5 & i \end{pmatrix}$, then evaluate $A^t$, $\bar{A}$ and $A^\theta$.

Exercise 1.7.8 Find the matrix form of the system of equations:

2x − 3y = 0, y = 5
     
Exercise 1.7.9 Let $A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 3 & 1 \\ 1 & 0 \end{pmatrix}$, $C = \begin{pmatrix} 1 & -1 \\ 8 & 1 \end{pmatrix}$ and $D = \begin{pmatrix} d & 0 \\ 0 & d \end{pmatrix}$. Verify the following:

1. A + B = B + A,

2. $A - B \neq B - A$,

3. (A + B) + C = A + (B + C),

4. (AB)C = A(BC),

5. (A + B)C = AC + BC,

6. A(B + C) = AB + AC,

7. $AB \neq BA$,

8. DA = AD,

9. Prove that A+At is symmetric and A−At is skew-symmetric,

10. Prove that D is orthogonal if and only if $d^2 = 1$.

Exercise 1.7.10 Let $X = (a\ b\ c)$ and
$$A = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}$$
where $a^2 + b^2 + c^2 = 1$. Prove that
(1) $A^2 = X^tX - I_3$.
(2) $A^3 = -A$.

Exercise 1.7.11 Prove that the zero square matrix is the only ma-
trix which is symmetric (Hermitian) as well as skew-symmetric
(skew-hermitian).

Exercise 1.7.12 Let x, y be any two n × 1 matrices. Prove that the matrix xy′ − yx′ is skew-symmetric but xy′ + yx′ is symmetric. Prove also that x′y = y′x. If x′x = (1) = y′y and x′y = (k) = y′x, then prove that $A^3 = (k^2 - 1)A$, where A = xy′ − yx′.

Exercise 1.7.13 Prove that the equation X 2 = −I2 has infinitely


many solutions in the set M2×2 (R) of all 2 × 2 real matrices.

Exercise 1.7.14 Let A be the matrix
$$\begin{pmatrix} 0 & a & a^2 & a^3 \\ 0 & 0 & a & a^2 \\ 0 & 0 & 0 & a \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
Prove that $A^4 = 0$. Evaluate the matrix B given by
$$B = A - \frac{A^2}{2} + \frac{A^3}{3} - \frac{A^4}{4} + \ldots$$
Show that only finitely many terms of the series
$$B + \frac{B^2}{2!} + \frac{B^3}{3!} + \frac{B^4}{4!} + \ldots$$
are non-zero and that its sum is A.

Exercise 1.7.15 Let M be the matrix
$$M = \begin{pmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{pmatrix},$$
where $(a\ b\ c) \neq (0\ 0\ 0)$. Prove that $e^M$ is an orthogonal matrix and is given by
$$\begin{pmatrix} \dfrac{c^2+(\lambda^2-c^2)\cos\lambda}{\lambda^2} & \dfrac{bc(\cos\lambda-1)+a\lambda\sin\lambda}{\lambda^2} & \dfrac{ac(1-\cos\lambda)+b\lambda\sin\lambda}{\lambda^2} \\[6pt] \dfrac{bc(\cos\lambda-1)-a\lambda\sin\lambda}{\lambda^2} & \dfrac{b^2+(\lambda^2-b^2)\cos\lambda}{\lambda^2} & \dfrac{ab(\cos\lambda-1)+c\lambda\sin\lambda}{\lambda^2} \\[6pt] \dfrac{ac(1-\cos\lambda)-b\lambda\sin\lambda}{\lambda^2} & \dfrac{ab(\cos\lambda-1)-c\lambda\sin\lambda}{\lambda^2} & \dfrac{a^2+(\lambda^2-a^2)\cos\lambda}{\lambda^2} \end{pmatrix}$$
where $\lambda = +\sqrt{a^2 + b^2 + c^2}$.

Exercise 1.7.16 Let N be a nilpotent matrix (i.e., $N^p = 0$ for some p) and U a unipotent matrix (i.e., $I_n - U$ is nilpotent). Define
$$\exp N = I_n + N + \frac{N^2}{2!} + \frac{N^3}{3!} + \ldots + \frac{N^k}{k!} + \ldots$$
and
$$\log U = -(I_n - U) - \frac{(I_n - U)^2}{2} - \frac{(I_n - U)^3}{3} - \ldots - \frac{(I_n - U)^k}{k} - \ldots$$
For the matrices $N = \begin{pmatrix} 0 & a & b \\ 0 & 0 & c \\ 0 & 0 & 0 \end{pmatrix}$ and $U = \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix}$, verify that $\exp\log U = U$ and $\log\exp N = N$.
Chapter 2

Rank of a matrix

In this chapter we introduce elementary operations, elementary matrices and echelon forms. We give the notion of equivalent matrices and of the rank of a matrix, and observe that equivalent matrices have the same rank. We give the notion of the normal form of a matrix and observe that every matrix can be reduced to a normal form. Finally, we discuss the method of finding the inverse of a non-singular matrix using elementary operations.

2.1 Elementary operations


Let A be any m × n matrix. Consider the following row operations
on the matrix A:
(i). Interchange of any two rows,
(ii). Multiply a row by a non-zero scalar,
(iii). Add a non-zero multiple of one row to another row.
These operations are called elementary row operations. If rows
are replaced by columns in above operations (i), (ii) and (iii), then
these operations are called elementary column operations. El-
ementary row and column operations are called elementary op-
erations. The row(column) operations together with symbolic
representations are given by the following
(i). interchange rows(columns) i and j is represented by Ri ↔ Rj
(Ci ↔ Cj ),


(ii). multiply a row(column) i by a non-zero scalar a is repre-


sented by Ri → aRi (Ci → aCi ), and

(iii). add a times row Rj (column Cj ) to row Ri ( column Ci ),


where a is a non-zero scalar, is represented by Ri → Ri + aRj
(Ci → Ci + aCj ).
 
Example 2.1.1 Let $A = \begin{pmatrix} 1 & 0 & 1 \\ 2 & 8 & 9 \\ 1 & 0 & 0 \end{pmatrix}$. Applying $R_1 \leftrightarrow R_2$ on A (interchanging the first and second rows of A), we get
$$\begin{pmatrix} 2 & 8 & 9 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$$
Similarly, applying $C_3 \to C_3 - \frac{9}{8}C_2$ (subtracting $\frac{9}{8}$ times the second column from the third column), we get
$$\begin{pmatrix} 2 & 8 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$$

Exercise 2.1.2 Find the matrices corresponding to the elementary operations $C_1 \to C_1 - 2C_2$ and $R_3 \to R_3 + 2R_4$ on
$$\begin{pmatrix} 2 & 1 & 2 & 1 \\ 1 & -1 & 0 & 0 \\ 2 & 1 & 5 & 0 \\ -1 & 5 & 7 & 1 \end{pmatrix}$$

Proposition 2.1.3 Let P be an m × m matrix obtained by inter-


changing rows i and j of identity matrix Im . For any m × n matrix
A, P A is the matrix obtained by interchanging rows i and j of A.

Proof: Let $P = (p_{ij})_{m\times m}$ be the m × m matrix obtained by interchanging rows i and j of the identity matrix $I_m = (\delta_{ij})_{m\times m}$, where
$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$
Thus the i-th row of P is the j-th row of $I_m$, and so
$$p_{ik} = \delta_{jk} \qquad (2.1.1)$$
for all k = 1, 2, . . . , m. Now, for any k, the (i, k)-th entry of PA is given by
$$\sum_{l=1}^{m} p_{il}a_{lk} = \sum_{l=1}^{m} \delta_{jl}a_{lk} = \delta_{j1}a_{1k} + \ldots + \delta_{j(j-1)}a_{(j-1)k} + \delta_{jj}a_{jk} + \delta_{j(j+1)}a_{(j+1)k} + \ldots + \delta_{jm}a_{mk} = a_{jk}$$
by the definition of $\delta_{ij}$. Hence the i-th row of PA is the j-th row of A, and symmetrically the j-th row of PA is the i-th row of A. Thus proved. □

Proposition 2.1.4 Let P be an n × n matrix obtained by inter-


changing columns i and j of identity matrix In . For any m × n
matrix A, AP is the matrix obtained by interchanging columns i
and j of A.

Proof is left as an exercise for readers (Hint: Replace rows by


columns in the above proof).
It is noted that in the product AB we say B is pre multiplied
by A or pre multiplication of B by A. We also say that A is
post multiplied by B or post multiplication of A by B.

Example 2.1.5 We consider the matrix
$$P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.$$
This matrix is obtained from $I_3$ by interchanging the second and third rows ($R_2 \leftrightarrow R_3$). Consider any 3 × 2 (or, in general, 3 × n) matrix
$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix}$$
Then
$$PA = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \\ a_{21} & a_{22} \end{pmatrix}$$
which is obtained from A by interchanging the second and third rows.

Example 2.1.6 Consider the matrix
$$P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
This matrix is obtained from $I_4$ by interchanging the second and third columns ($C_2 \leftrightarrow C_3$). Consider any 2 × 4 (or, in general, n × 4) matrix
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \end{pmatrix}$$
Then
$$AP = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{13} & a_{12} & a_{14} \\ a_{21} & a_{23} & a_{22} & a_{24} \end{pmatrix}$$
which is obtained from A by interchanging the second and third columns.

Exercise 2.1.7 Explain the effect of pre-multiplication and post-multiplication by the matrices:
$$\begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix} \qquad \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}$$

Proposition 2.1.8 Let A be any m×n matrix and D = (ci δij )m×m
be any diagonal matrix. Then DA (AD) is obtained from A by
multiplying the i-th row (column) of A by ci .

Proof: Keep i fixed. Then the (i, k)-th entry of DA is given by
$$\sum_{j=1}^{m} c_i\delta_{ij}a_{jk} = c_ia_{ik}$$
because $\delta_{ii} = 1$ and $\delta_{ij} = 0$ for $j \neq i$, for all k = 1, 2, . . . , n. This proves the result. □

Exercise 2.1.9 Explain the effect of pre-multiplication and post-multiplication by the matrices:
$$\begin{pmatrix} 0 & 0 & a & 0 \\ 0 & 0 & 0 & b \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & a & 0 \\ 0 & 0 & 0 & b \\ 0 & c & 0 & 0 \end{pmatrix} \qquad \begin{pmatrix} 0 & 0 & a \\ 0 & b & 0 \\ 1 & 0 & 0 \end{pmatrix}$$

Proposition 2.1.10 Let P be any m × m matrix obtained by ap-


plying Rr → Rr + λRs on Im , where r, s are fixed and r 6= s. Then,
for any m × n matrix A, the matrix P A is obtained by applying the
same row operation on A.

Proof: Let $E_{rs}^\lambda$ be the m × m square matrix whose entries are all zero except the (r, s)-th entry, which is λ; i.e., its (i, j)-th entry is given by
$$(E_{rs}^\lambda)_{ij} = \begin{cases} \lambda & \text{if } i = r,\ j = s \\ 0 & \text{otherwise} \end{cases}$$
Let $P = (p_{ij})_{m\times m}$ and $A = (a_{jk})_{m\times n}$. By the definition of P, we have $P = I_m + E_{rs}^\lambda$. Thus,
$$p_{ij} = \delta_{ij} + (E_{rs}^\lambda)_{ij}$$
Hence the (i, k)-th entry of PA is given by
$$\sum_{j=1}^{m} p_{ij}a_{jk} = \sum_{j=1}^{m}\left[\delta_{ij} + (E_{rs}^\lambda)_{ij}\right]a_{jk} = \left(\sum_{j=1}^{m}\delta_{ij}a_{jk}\right) + \left(\sum_{j=1}^{m}(E_{rs}^\lambda)_{ij}a_{jk}\right) = a_{ik} + (E_{rs}^\lambda)_{is}a_{sk} = \begin{cases} a_{ik} & \text{if } i \neq r \\ a_{rk} + \lambda a_{sk} & \text{if } i = r \end{cases}$$
This shows that the r-th row of PA is obtained by applying $R_r \to R_r + \lambda R_s$ on A (i.e., adding λ times the s-th row to the r-th row), and every other row is unchanged. □

Exercise 2.1.11 Let P be the n × n matrix obtained by applying the column operation $C_r \to C_r + \lambda C_s$ on $I_n$, where r, s are fixed and $r \neq s$. Then, for any m × n matrix A, the matrix AP is obtained by applying the same column operation on A. (Hint: Replace rows by columns in the above proof.)

Exercise 2.1.12 Let P be any m × m matrix obtained by applying


row operation Rr → cRr on Im , where r is fixed. Then, for any
m × n matrix A, the matrix P A is obtained by applying the same
row operation on A.

Exercise 2.1.13 Let P be any m × m matrix obtained by applying


column operation Cr → aCr on Im , where r is fixed. Then, for any
m × n matrix A, the matrix AP is obtained by applying the same
column operation on A.

Example 2.1.14 Consider the matrix
$$P = \begin{pmatrix} 1 & 0 & k \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Clearly this matrix is obtained from the identity matrix $I_3$ by adding k times the third row to the first row. Let $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix}$ be any matrix. Then
$$PA = \begin{pmatrix} 1 & 0 & k \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix} = \begin{pmatrix} a_{11} + ka_{31} & a_{12} + ka_{32} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix}$$
This shows that the effect of pre-multiplication (left multiplication) of A by P is to add k times the third row to the first row of A. Now let $A = \begin{pmatrix} a_{11} & a_{12} & a_{31} \\ a_{21} & a_{22} & a_{32} \end{pmatrix}$. Then
$$AP = \begin{pmatrix} a_{11} & a_{12} & a_{31} \\ a_{21} & a_{22} & a_{32} \end{pmatrix}\begin{pmatrix} 1 & 0 & k \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{31} + ka_{11} \\ a_{21} & a_{22} & a_{32} + ka_{21} \end{pmatrix}$$
Thus we see that the effect of post-multiplication (right multiplication) of A by P is to add k times the first column to the third column of A, where P is regarded as obtained from $I_3$ by adding k times the first column of $I_3$ to the third column.

Exercise 2.1.15 Explain the effect of pre-multiplication (left multiplication) and post-multiplication (right multiplication) by the following matrices:
$$\begin{pmatrix} 1 & k & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad \begin{pmatrix} 1 & a & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix}$$

2.2 Elementary matrices


Definition 2.2.1 A square matrix of size n × n which is obtained from $I_n$ by applying a single elementary operation (row or column) is called an elementary matrix of size n × n.
 
Example 2.2.2 The matrix $\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ is obtained from $I_3$ by applying $R_1 \to R_1 + R_2$ (or $C_2 \to C_2 + C_1$). Thus it is an elementary matrix. Similarly, $\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ are elementary matrices.

Using Propositions 2.1.3, 2.1.8 and 2.1.10, we have:

Theorem 2.2.3 An elementary row (column) operation on an m × n matrix A is given by pre-multiplication (post-multiplication) by the elementary matrix obtained by applying the same row (column) operation on $I_m$ ($I_n$).
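This theorem is easy to demonstrate numerically: build the elementary matrix by applying the operation to the identity, then multiply. A sketch assuming NumPy, using the matrix of Example 2.1.1:

import numpy as np

A = np.array([[1.0, 0.0, 1.0], [2.0, 8.0, 9.0], [1.0, 0.0, 0.0]])

# R1 <-> R2: apply the row operation to I_3, then pre-multiply.
P = np.eye(3)
P[[0, 1]] = P[[1, 0]]
print(P @ A)        # rows 1 and 2 of A are interchanged

# C3 -> C3 - (9/8) C2: apply the column operation to I_3, then post-multiply.
Q = np.eye(3)
Q[1, 2] = -9.0 / 8.0
print(A @ Q)        # (9/8) times column 2 is subtracted from column 3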

2.3 Echelon form


Definition 2.3.1 An m × n matrix A is said to be in row-echelon form (or to be a row-echelon matrix) if it satisfies the following:

1. each non-zero row having more zeros before its first non-zero entry lies below each non-zero row having fewer zeros before its first non-zero entry,

2. all zero rows lie below the non-zero rows.

This is also called the echelon form of a matrix. Thus $I_n$ and $0_{m\times n}$ are in row-echelon form.

Example 2.3.2 The 4 × 5 matrix
$$\begin{pmatrix} 0 & 1 & 0 & 8 & 9 \\ 0 & 0 & 2 & 1 & 7 \\ 0 & 0 & 0 & 0 & 6 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$
is in row-echelon form because the zero row lies below each non-zero row, and the numbers of zeros before the first non-zero entry, from the first row to the fourth row, are in ascending order.

Example 2.3.3 The 3 × 5 matrix
$$\begin{pmatrix} 0 & 1 & 0 & 0 & 9 \\ 1 & 0 & 2 & 1 & 7 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$
is not a row-echelon matrix because its first row contains one 0 (zero) before its first non-zero entry while the second row contains no 0's before its first non-zero entry.

Exercise 2.3.4 Prove that the following matrices are in row-echelon form:
$$\begin{pmatrix} 1 & 1 & 0 & 4 & 3 \\ 0 & 0 & 2 & 1 & 5 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 6 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 2 & 4 & 1 \\ 0 & 2 & 9 \\ 0 & 0 & 1 \end{pmatrix}, \quad I_n$$
Example 2.3.5 Prove that every diagonal matrix whose diagonal entries are all non-zero is in row-echelon form.

Theorem 2.3.6 (Gauss elimination method) Every non-zero matrix can be transformed into a row-echelon form by applying a finite number of elementary row operations.

Proof: Let $A = (a_{ij})$ be a non-zero m × n matrix. Then the first non-zero column from the left must contain at least one non-zero element. Let it be the s-th column, and let $a_{rs} \neq 0$ in the r-th row. Apply $R_1 \leftrightarrow R_r$ on A. Then A takes the form
$$B = \begin{pmatrix} 0 & \cdots & 0 & a_{rs} & a_{r,s+1} & \cdots & a_{rn} \\ 0 & \cdots & 0 & b_{2s} & b_{2,s+1} & \cdots & b_{2n} \\ 0 & \cdots & 0 & b_{3s} & b_{3,s+1} & \cdots & b_{3n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & b_{ms} & b_{m,s+1} & \cdots & b_{mn} \end{pmatrix}$$
Applying $R_i \to R_i - \frac{b_{is}}{a_{rs}}R_1$ to the matrix B for i = 2, . . . , m, we have
$$C = \begin{pmatrix} 0 & \cdots & 0 & a_{rs} & a_{r,s+1} & \cdots & a_{rn} \\ 0 & \cdots & 0 & 0 & c_{1,s+1} & \cdots & c_{1n} \\ 0 & \cdots & 0 & 0 & c_{2,s+1} & \cdots & c_{2n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & 0 & c_{m-1,s+1} & \cdots & c_{m-1,n} \end{pmatrix}$$
We repeat the above argument on the submatrix of C below the first row. After at most m applications of this process, we arrive at a matrix which is in row-echelon form. □
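The elimination scheme of this proof is easy to express as a program. A sketch assuming NumPy (floating-point arithmetic; tol guards against round-off when testing for a non-zero pivot):

import numpy as np

def row_echelon(A, tol=1e-12):
    # Reduce A to a row-echelon form using only elementary row operations.
    A = A.astype(float).copy()
    m, n = A.shape
    row = 0
    for col in range(n):
        # find a non-zero pivot at or below position `row` in this column
        piv = next((r for r in range(row, m) if abs(A[r, col]) > tol), None)
        if piv is None:
            continue                       # column is zero below `row`
        A[[row, piv]] = A[[piv, row]]      # R_row <-> R_piv
        for r in range(row + 1, m):        # clear the entries below the pivot
            A[r] -= (A[r, col] / A[row, col]) * A[row]
        row += 1
        if row == m:
            break
    return A

A = np.array([[1, 2, 3], [3, 1, 2], [5, 5, 8]])
print(row_echelon(A))   # the third row reduces to zero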

Definition 2.3.7 Two matrices of size m × n are said to be (row) equivalent if one is obtained from the other by applying a finite number of elementary row operations. If A is equivalent to B then we write A ∼ B. In this case, we have elementary matrices $E_1, E_2, \ldots, E_k$ such that
$$E_1E_2\ldots E_kA = B$$
Then we have
$$E_k^{-1}\ldots E_2^{-1}E_1^{-1}B = A$$
We note that the inverses of elementary matrices are elementary matrices. Thus B ∼ A. Indeed, this relation is an equivalence relation and so it partitions $M_{m\times n}$ into disjoint equivalence classes. These classes are determined by their corresponding row-echelon forms.

Definition 2.3.8 (Echelon form of a matrix) If A is equivalent to a row-echelon form E, i.e., if A ∼ E where E is a row-echelon matrix, then E is called an echelon form equivalent to A, or an echelon form of the matrix A.

Example 2.3.9 The echelon form of the matrix
$$\begin{pmatrix} 0 & 1 & 0 & 0 & 9 \\ 1 & 0 & 2 & 1 & 7 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$
is given by
$$\begin{pmatrix} 0 & 1 & 0 & 0 & 9 \\ 1 & 0 & 2 & 1 & 7 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 2 & 1 & 7 \\ 0 & 1 & 0 & 0 & 9 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \quad\text{by applying } R_1 \leftrightarrow R_2$$

Example 2.3.10 The echelon form of the matrix
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \\ 5 & 5 & 8 \end{pmatrix}$$
is given by
$$A \sim \begin{pmatrix} 1 & 2 & 3 \\ 0 & -5 & -7 \\ 0 & -5 & -7 \end{pmatrix} \quad \begin{matrix} R_2 \to R_2 - 3R_1 \\ R_3 \to R_3 - 5R_1 \end{matrix}$$
$$\sim \begin{pmatrix} 1 & 2 & 3 \\ 0 & -5 & -7 \\ 0 & 0 & 0 \end{pmatrix} \quad R_3 \to R_3 - R_2$$
 
Example 2.3.11 The echelon form of the matrix $A = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \\ 2 & 3 & 1 \end{pmatrix}$ is given by
$$A \sim \begin{pmatrix} 1 & 2 & 3 \\ 0 & -5 & -7 \\ 0 & -1 & -5 \end{pmatrix} \quad \begin{matrix} R_2 \to R_2 - 3R_1 \\ R_3 \to R_3 - 2R_1 \end{matrix}$$
$$\sim \begin{pmatrix} 1 & 2 & 3 \\ 0 & -5 & -7 \\ 0 & 0 & -\frac{18}{5} \end{pmatrix} \quad R_3 \to R_3 - \tfrac{1}{5}R_2$$
$$\sim \begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & \frac{7}{5} \\ 0 & 0 & 1 \end{pmatrix} \quad \begin{matrix} R_2 \to -\tfrac{1}{5}R_2 \\ R_3 \to -\tfrac{5}{18}R_3 \end{matrix}$$
$$\sim \begin{pmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \begin{matrix} R_1 \to R_1 - 3R_3 \\ R_2 \to R_2 - \tfrac{7}{5}R_3 \end{matrix}$$
$$\sim \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad R_1 \to R_1 - 2R_2$$
Observe that this echelon form is special in the sense that the first non-zero entry of each row is 1, and all the entries above and below each such leading 1 in its column are zero. Such an echelon form of a matrix is called a reduced row-echelon form.

Exercise 2.3.12 Reduce the following matrices to echelon form:
$$(i).\ \begin{pmatrix} 1 & 1 & 1 & -1 \\ 1 & 2 & 3 & 4 \\ 3 & 4 & 5 & 2 \end{pmatrix} \quad (ii).\ \begin{pmatrix} 1 & 2 & 0 & -1 \\ 2 & 6 & -3 & -3 \\ 3 & 10 & -6 & -5 \end{pmatrix} \quad (iii).\ \begin{pmatrix} 1 & 2 & 3 \\ 1 & 4 & 2 \\ 2 & 6 & 5 \end{pmatrix}$$

Exercise 2.3.13 Reduce the following matrices to echelon form:
$$(i).\ \begin{pmatrix} 1 & 4 & 9 & 16 \\ 4 & 9 & 16 & 25 \\ 9 & 16 & 25 & 36 \\ 16 & 25 & 36 & 49 \end{pmatrix} \quad (ii).\ \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 5 & 7 \end{pmatrix} \quad (iii).\ \begin{pmatrix} 3 & 4 & 5 & 6 \\ 4 & 5 & 6 & 7 \\ 5 & 6 & 7 & 8 \\ 10 & 11 & 12 & 13 \\ 15 & 16 & 17 & 18 \end{pmatrix}$$
Exercise 2.3.14 Reduce the following matrices to row-echelon form:
$$(i).\ \begin{pmatrix} 1 & 2 & 3 & 14 \\ 3 & 1 & 2 & 11 \\ 2 & 3 & 1 & 11 \end{pmatrix} \quad (ii).\ \begin{pmatrix} 1 & 1 & 1 & 6 \\ 1 & -1 & 1 & 2 \\ 2 & 1 & -1 & 1 \end{pmatrix} \quad (iii).\ \begin{pmatrix} 1 & 2 & 1 & 2 \\ 3 & 6 & 5 & 4 \\ 2 & 4 & 3 & 3 \end{pmatrix}$$
Exercise 2.3.15 Reduce the following matrices to echelon form:
$$(i).\ \begin{pmatrix} 5 & 3 & 7 & 4 \\ 3 & 26 & 2 & 9 \\ 7 & 2 & 10 & 5 \end{pmatrix} \quad (ii).\ \begin{pmatrix} 1 & 1 & 1 & 6 \\ 1 & 2 & 3 & 10 \\ 1 & 2 & \lambda & \mu \end{pmatrix} \quad (iii).\ \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & \lambda \\ 1 & 4 & 10 & \lambda^2 \end{pmatrix}$$
Exercise 2.3.16 For the given matrix
$$A = \begin{pmatrix} 1 & 1 & 0 & 1 & 4 \\ 2 & 0 & 0 & 4 & 7 \\ 1 & 1 & 1 & 0 & 5 \\ 1 & -3 & -1 & -10 & \alpha \end{pmatrix}$$
determine the value of α such that all the rows of its corresponding echelon form are non-zero.

Definition 2.3.17 An echelon form of a matrix is called a reduced row-echelon form if the first non-zero entry in each row is 1 and every entry lying above such a first non-zero entry is 0. Such matrices are also called Hermite matrices or reduced row-echelon matrices.
For example,
$$\begin{pmatrix} 1 & 8 & 0 & 1 & 0 \\ 0 & 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 & 0 & 3 & 0 \\ 0 & 0 & 1 & 5 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 & 0 & 5 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 2 & -7 \end{pmatrix}$$
and $I_n$ are reduced row-echelon forms.

Theorem 2.3.18 Every non-zero matrix can be transformed to reduced row-echelon form by applying a finite number of elementary row operations.
The proof is easy and is left for readers. We illustrate it with an example.
Example 2.3.19 Consider the matrix
$$A = \begin{pmatrix} 1 & -1 & 0 & -1 & -5 & -1 \\ 2 & 1 & -1 & -4 & 1 & -1 \\ 1 & 1 & 1 & -4 & -6 & 3 \\ 1 & 4 & 2 & -8 & -5 & 8 \end{pmatrix}.$$
Then
$$A \sim \begin{pmatrix} 1 & -1 & 0 & -1 & -5 & -1 \\ 0 & 3 & -1 & -2 & 11 & 1 \\ 0 & 2 & 1 & -3 & -1 & 4 \\ 0 & 5 & 2 & -7 & 0 & 9 \end{pmatrix} \quad \begin{matrix} R_2 \to R_2 - 2R_1 \\ R_3 \to R_3 - R_1 \\ R_4 \to R_4 - R_1 \end{matrix}$$
$$\sim \begin{pmatrix} 1 & -1 & 0 & -1 & -5 & -1 \\ 0 & 1 & -2 & 1 & 12 & -3 \\ 0 & 2 & 1 & -3 & -1 & 4 \\ 0 & 5 & 2 & -7 & 0 & 9 \end{pmatrix} \quad R_2 \to R_2 - R_3$$
$$\sim \begin{pmatrix} 1 & 0 & -2 & 0 & 7 & -4 \\ 0 & 1 & -2 & 1 & 12 & -3 \\ 0 & 0 & 5 & -5 & -25 & 10 \\ 0 & 0 & 12 & -12 & -60 & 24 \end{pmatrix} \quad \begin{matrix} R_1 \to R_1 + R_2 \\ R_3 \to R_3 - 2R_2 \\ R_4 \to R_4 - 5R_2 \end{matrix}$$
$$\sim \begin{pmatrix} 1 & 0 & -2 & 0 & 7 & -4 \\ 0 & 1 & -2 & 1 & 12 & -3 \\ 0 & 0 & 1 & -1 & -5 & 2 \\ 0 & 0 & 1 & -1 & -5 & 2 \end{pmatrix} \quad \begin{matrix} R_3 \to R_3/5 \\ R_4 \to R_4/12 \end{matrix}$$
$$\sim \begin{pmatrix} 1 & 0 & -2 & 0 & 7 & -4 \\ 0 & 1 & -2 & 1 & 12 & -3 \\ 0 & 0 & 1 & -1 & -5 & 2 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \quad R_4 \to R_4 - R_3$$
$$\sim \begin{pmatrix} 1 & 0 & 0 & -2 & -3 & 0 \\ 0 & 1 & 0 & -1 & 2 & 1 \\ 0 & 0 & 1 & -1 & -5 & 2 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \quad \begin{matrix} R_1 \to R_1 + 2R_3 \\ R_2 \to R_2 + 2R_3 \end{matrix}$$
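Long reductions like this one are convenient to verify with a computer algebra system. A sketch assuming the SymPy library, whose Matrix.rref method returns the reduced row-echelon form together with the pivot columns:

from sympy import Matrix

A = Matrix([[1, -1, 0, -1, -5, -1],
            [2,  1, -1, -4,  1, -1],
            [1,  1,  1, -4, -6,  3],
            [1,  4,  2, -8, -5,  8]])
R, pivots = A.rref()
print(R)        # matches the reduced row-echelon form obtained above
print(pivots)   # (0, 1, 2): the pivot columns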
Exercise 2.3.20 Reduce the following matrices to reduced row-echelon form:
$$\begin{pmatrix} 1 & 0 & 1 & -1 & -3 \\ 0 & 1 & 0 & -1 & 2 \\ 0 & 0 & 1 & -1 & -5 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 & -1 & -2 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & -1 & -1 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 1 & 1 & 6 \\ 1 & -1 & 1 & 2 \\ 2 & 1 & -1 & 1 \end{pmatrix}$$

2.4 Linear dependence and independence


Let $A = (a_{ij})_{m\times n}$ be an m × n matrix. Let $R_i$ denote the i-th row and $C_j$ the j-th column of A, for i = 1, 2, . . . , m and j = 1, 2, . . . , n. Then
$$R_i = \begin{pmatrix} a_{i1} & a_{i2} & \cdots & a_{in} \end{pmatrix} \quad\text{and}\quad C_j = \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix}$$
They are frequently called row and column vectors (matrices).

Definition 2.4.1 Any expression of the form k1 x1 + k2 x2 + . . . +


kp xp , where k1 , k2 , . . . kp are scalars and x1 , x2 , . . . , xp are rows(columns)
of A, is called a linear combination of rows(columns) of A.

Example 2.4.2 Consider the matrix
$$A = \begin{pmatrix} 1 & 1 & 3 \\ 0 & 2 & 1 \\ 1 & 3 & 4 \end{pmatrix}$$
Then
$$R_1 + R_2 = (1\ 1\ 3) + (0\ 2\ 1) = (1+0\ \ 1+2\ \ 3+1) = (1\ 3\ 4) = R_3.$$
This shows that the third row $R_3$ is a linear combination of the first row $R_1$ and the second row $R_2$ of A. Similarly, $C_3 = \frac{5}{2}C_1 + \frac{1}{2}C_2$, and so the third column $C_3$ is a linear combination of the first column $C_1$ and the second column $C_2$ of A.

Example 2.4.3 Any row matrix $R = (a_1\ a_2\ a_3)$ is a linear combination of the rows of $I_3$. For,
$$(a_1\ a_2\ a_3) = a_1e_1 + a_2e_2 + a_3e_3,$$
where $e_1 = (1\ 0\ 0)$, $e_2 = (0\ 1\ 0)$ and $e_3 = (0\ 0\ 1)$ denote the first, second and third rows of $I_3$ respectively.

Example 2.4.4 Any row matrix $R = (a_1\ a_2\ \ldots\ a_n)$ is a linear combination of the rows of $I_n$. For,
$$(a_1\ a_2\ \ldots\ a_n) = a_1e_1 + a_2e_2 + \ldots + a_ne_n,$$
where $e_i$ denotes the i-th row of $I_n$.


Example 2.4.5 The column vector $\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}$ is not a linear combination of the columns of the matrix $\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ (verify).

Definition 2.4.6 Rows (columns) x1 , x2 , . . . , xp of a matrix A are
said to be linearly dependent if there exist scalars k1 , k2 , . . . , kp ,
not all zero, such that

  k1 x1 + k2 x2 + . . . + kp xp = 0

If such scalars do not exist, i.e; if

  k1 x1 + k2 x2 + . . . + kp xp = 0 ⇒ k1 = k2 = . . . = kp = 0

then x1 , x2 , . . . , xp are called linearly independent.

Example 2.4.7 Rows (1 1 3), (0 2 1) and (1 3 4) of given


matrix  
1 1 3
A = 0 2 1
1 3 4
are linearly dependent because

1.(1 1 3) + 1.(0 2 1) − 1.(1 3 4) = (0 0 0).


     
Similarly, the columns (1 0 1)′ , (1 2 3)′ and (3 1 4)′ are linearly dependent
because

  (5/2)(1 0 1)′ + (1/2)(1 2 3)′ − (3 1 4)′ = 0

Example 2.4.8 All the rows(columns) of In are linearly indepen-


dent.

Example 2.4.9 The rows of matrix


 
1 2 0
1 0 1
0 1 1

are linearly independent. For,

Suppose that
a(1 2 0) + b(1 0 1) + c(0 1 1) = (0 0 0)
Then, we have
(a + b 2a + c b + c) = (0 0 0)
Thus,
a+b = 0 . . . (1)
2a + c = 0 . . . (2)
b+c = 0 . . . (3)
Subtracting equation (3) from equation (2), we have
2a − b = 0 . . . (4)
Solving equations (1) and (4), we have a = b = 0. Putting the
value of a in equation (2), we have c = 0. Thus, we have
a(1 2 0) + b(1 0 1) + c(0 1 1) = (0 0 0) ⇒ a = b = c = 0
By definition of linearly independent rows, all the rows are lin-
early independent. Similarly, all the columns of A are linearly
independent.
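Linear independence can also be tested mechanically: the rows of a matrix are linearly independent exactly when the rank equals the number of rows. A minimal sketch, assuming Python with the numpy library:

import numpy as np

M = np.array([[1, 2, 0],
              [1, 0, 1],
              [0, 1, 1]])

# rank 3 = number of rows, so the rows (and likewise the columns)
# are linearly independent
print(np.linalg.matrix_rank(M))  # 3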
Exercise 2.4.10 Prove that rows of matrix A are linearly inde-
pendent while the columns of A are linearly dependent, where
 
1 2 0 2
0 1 1 1 .
1 0 1 0
Also determine the maximum number of linearly independent rows
and the maximum number of linearly independent columns. Is the
maximum number of linearly independent rows the same as the maximum
number of linearly independent columns?
Exercise 2.4.11 Determine the maximum number of linearly in-
dependent rows and maximum number of linearly independent columns
of 
following matrices:
    
5 3 7 4 1 1 1 6 1 1 1 1
(i)3 26 2 9 (ii) 1 2 3 10 (iii) 1 2 4 λ 
7 2 10 5 1 2 λ µ 1 4 10 λ2

Proposition 2.4.12 If x1 , x2 , . . . , xp are linearly independent rows


(columns) of A then none of them can be zero vector. In other
words, if any one row (column) of rows (columns) x1 , x2 , . . . , xp
of a matrix is zero then x1 , x2 , . . . , xp are linearly dependent.

Proof: Let the row (column) xi0 among the rows (columns) x1 , x2 , . . . , xp
be zero. Then, we have:

0.x1 + 0.x2 + . . . + 0.xi0 −1 + 1.xi0 + 0.xi0 +1 + 0.xi0 +2 + . . . + 0.xp = 0

which shows that x1 , x2 , . . ., xp are linearly dependent rows (columns).


2
Indeed, we have

Theorem 2.4.13 The following are equivalent:

1. x1 , x2 , . . . , xp (p ≥ 2) are linearly dependent.

2. One of the xi is a linear combination of others.

The proof is left as an exercise for readers. Indeed, it follows from


the fact that

k1 x1 + k2 x2 + . . . + kp xp = 0 ⇔
xi0 = −(k1 /ki0 )x1 − (k2 /ki0 )x2 − . . . − (ki0 −1 /ki0 )xi0 −1 − (ki0 +1 /ki0 )xi0 +1 − . . . − (kp /ki0 )xp ,

where i0 is the first index such that ki0 ≠ 0.


This fact immediately determines the following:

Corollary 2.4.14 The rows (columns) of a matrix are linearly dependent
if and only if one of them can be obtained from the others by means
of a finite number of applications of elementary row (column) operations.
From Example 2.4.7, we see that the rows R1 , R2 and R3 of the matrix

      1 1 3
  A = 0 2 1
      1 3 4

are linearly dependent and R3 = R1 + R2 .
Similarly, the columns C1 , C2 and C3 of the matrix A are linearly
dependent and C3 = (5/2)C1 + (1/2)C2 .

Definition 2.4.15 The maximum number of linearly independent


rows of a matrix is called the row rank of the matrix. The maxi-
mum number of linearly independent columns of a matrix is called
the column rank of the matrix. Since all the rows as well as
columns of In are linearly independent, therefore
row rank of In = column rank of In = n.
Indeed, we shall see later that the row rank of a given matrix is
the same as the column rank of that matrix. The row rank of a
given matrix A is denoted by ρ(A).
Theorem 2.4.16 Row (column) rank is invariant under elementary
row (column) operations, i.e; elementary row (column) operations do
not affect the row (column) rank.
Proof: It is clear that the maximum number of linearly independent
rows (columns) is unaffected by the interchange of any two rows. It
is also noted that if a row Ri is a linear combination
Ri = c1 Rk1 + c2 Rk2 + . . . + cq Rkq of rows other than Ri , then
λRi = (λc1 )Rk1 + (λc2 )Rk2 + . . . + (λcq )Rkq , where λ ≠ 0. Thus, by
this fact and Theorem 2.4.13, it follows that an elementary operation
of the form Ri → λRi , where λ ≠ 0, does not affect the row rank
(column rank).
Next, let R1 , R2 , . . . , Rp be linearly independent rows
(columns) of A. Without loss of generality, let
R1∗ = R1 + R2 (1)
Suppose that
c1 R1∗ + c2 R2 + . . . + cp Rp = 0 (2)
Using equation (1), we have
c1 R1 + (c1 + c2 )R2 + . . . + cp Rp = 0
But R1 , R2 , . . . , Rp are linearly independent, so by definition of
linear independence, we have
c1 = 0, (c1 + c2 ) = 0, c3 = 0, . . . , cp = 0
i.e; c1 = 0, c2 = 0, c3 = 0, . . . , cp = 0. This shows that R1∗ , R2 ,
. . . , Rp are linearly independent rows(columns). Hence, addition
of one row (column) in to other row(column) does not affect the
row(column) rank. 2
Thus, we have

Corollary 2.4.17 Row-equivalent (column equivalent) matrices have


same row(column) rank.

By the Gauss elimination theorem (Theorem 2.3.6), every non-zero matrix
can be transformed to row-echelon form by means of a finite
number of row operations. By means of a finite number of further suitable
row operations, it can be transformed to reduced row-echelon form.
Since the relation ∼ of being row equivalent is an equivalence
relation, if A ∼ E1 and A ∼ E2 then E1 ∼ E2 , where
E1 , E2 are reduced row-echelon forms of A. This shows that up to
the equivalence relation, E1 = E2 . Thus, we have

Theorem 2.4.18 Every non-zero matrix can be transformed to


unique reduced row-echelon form by means of finite number of ap-
plication of elementary row operations.

The detail of the proof of above theorem is given in [4].

Corollary 2.4.19 The row rank of a matrix is equal to the number


of non-zero rows in any row-echelon form of the matrix.

A zero row of a matrix can never become non-zero by means of an
application of column operations. This suggests that column operations
have no effect on the linear independence of rows, and therefore
row rank is invariant under column operations. Similarly, row
operations have no effect on the linear independence of columns, and
therefore column rank is invariant under row operations. Thus, we
have the following:

Theorem 2.4.20 Row rank and column rank both are invariant
under elementary row and column operations (elementary opera-
tions).

Using this theorem, we have the following

Theorem 2.4.21 Row rank and column rank are same for a given
matrix.

Proof: Let A be a given matrix and let E be the reduced row-echelon
form of A. Then ρ(A) = ρ(E) = r, say. Since row operations
do not affect the column rank, we have
column rank (A) = column rank (E).
By a suitable number of column operations on E, it can be transformed
to a matrix of the form

        Ir          0r×(n−r)
  N =                              ...... (∗)
        0(m−r)×r    0(m−r)×(n−r)

because column operations do not affect the row rank. Since column
operations do not affect the column rank either, we have
column rank (A) = column rank (E)
= column rank (N )
= r = ρ(A)
Thus proved. 2
Since the row rank and the column rank are the same, we simply
speak of the rank of a matrix.
Definition 2.4.22 The rank of a matrix is defined by its row rank
(or column rank). It is denoted by ρ(A).
Corollary 2.4.23 rank(A) = rank (A0 ).
Proof: Since row rank of A is same as the column rank, therefore

rank(A) = row rank(A)


= column rank(A0 )
= row rank(A0 )
= rank(A0 )
2

Corollary 2.4.24 Every non-zero m × n matrix can be transformed
to a unique matrix of the form

        Ir          0r×(n−r)
  N =                              ...... (∗)
        0(m−r)×r    0(m−r)×(n−r)

by means of an application of elementary operations. In other words,
for every non-zero m × n matrix A there exist two invertible (non-singular)
matrices P of order m × m and Q of order n × n such that

          Ir          0r×(n−r)
  PAQ =                            ...... (∗)
          0(m−r)×r    0(m−r)×(n−r)

Example 2.4.25 The rank of matrix


 
1 2 3 14
A = 3 1 2 11 .
2 3 1 11

is 3. For
   
1 2 3 14 1 2 3 14
R → R2 − 3R1
3 1 2 11 ∼ 0 −5 −7 −31 2
R3 → R3 − 2R1
2 3 1 11 0 −1 −5 −17
 
1 2 3 14
∼ 0 −1 −5 −17 R2 ↔ R3
0 −5 −7 −31
 
1 2 3 14
∼ 0 −1 −5 −17 R3 → R3 − 5R2
0 0 18 54

which is in echelon form. Thus,

ρ(A) = the number of non-zero rows in its echelon form = 3.
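As a quick cross-check (a sketch assuming Python with the numpy library), the rank can be computed directly:

import numpy as np

A = np.array([[1, 2, 3, 14],
              [3, 1, 2, 11],
              [2, 3, 1, 11]])

# equals the number of non-zero rows in any echelon form of A
print(np.linalg.matrix_rank(A))  # 3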

Example 2.4.26 If A is a zero matrix, then ρ(A) = 0. For every
non-zero matrix A, ρ(A) ≥ 1. From the definition of rank, it also
follows that ρ(A) ≤ min{m, n} for every non-zero m × n matrix
A. Thus, 0 ≤ ρ(A) ≤ min{m, n} for every m × n matrix A.

 
Exercise 2.4.27 Find the rank of the matrix

  2  3 −1 −1
  1 −1 −2 −4
  3  1  3 −2
  6  3  0  7

Solution: Given that

      2  3 −1 −1
  A = 1 −1 −2 −4 .
      3  1  3 −2
      6  3  0  7

Then

      2  3 −1 −1        1 −1 −2 −4
  A = 1 −1 −2 −4    ∼   2  3 −1 −1     R1 ↔ R2
      3  1  3 −2        3  1  3 −2
      6  3  0  7        6  3  0  7

                        1 −1 −2 −4
                    ∼   0  5  3  7     R2 → R2 − 2R1
                        0  4  9 10     R3 → R3 − 3R1
                        0  9 12 31     R4 → R4 − 6R1

                        1 −1 −2 −4
                    ∼   0  1 −6 −3     R2 → R2 − R3
                        0  4  9 10
                        0  9 12 31

                        1 −1 −2 −4
                    ∼   0  1 −6 −3     R3 → R3 − 4R2
                        0  0 33 22     R4 → R4 − 9R2
                        0  0 66 58

                        1 −1 −2 −4
                    ∼   0  1 −6 −3     R4 → R4 − 2R3
                        0  0 33 22
                        0  0  0 14

which is an echelon form. Since the number of non-zero rows in the
echelon form is 4, the rank of A is 4.

Exercise 2.4.28 Find the rank of matrix

 
1 2 0 −1
2 6 −3 −3
3 10 −6 −5

Solution:

  1  2  0 −1        1 2  0 −1
  2  6 −3 −3    ∼   0 2 −3 −1     R2 → R2 − 2R1
  3 10 −6 −5        0 4 −6 −2     R3 → R3 − 3R1

                    1 2  0 −1
                ∼   0 2 −3 −1     R3 → R3 − 2R2
                    0 0  0  0

which is an echelon form. Since the number of non-zero rows in the
echelon form is 2, the rank of the matrix is 2.

Exercise 2.4.29 Find the rank of matrix


 
1 2 −1 −2
−1 −1 1 1
0 1 2 1

Solution:

   1  2 −1 −2        1 2 −1 −2
  −1 −1  1  1    ∼   0 1  0 −1     R2 → R2 + R1
   0  1  2  1        0 1  2  1

                     1 2 −1 −2
                 ∼   0 1  0 −1     R3 → R3 − R2
                     0 0  2  2

which is an echelon form. Since the number of non-zero rows in the
echelon form is 3, the rank of the matrix is 3.

Exercise 2.4.30 Find the rank of the following matrices:

        1 1 1 −1           1  2  0 −1            1 2 3
  (i).  1 2 3  4 ,  (ii).  2  6 −3 −3 ,  (iii).  1 4 2
        3 4 5  2           3 10 −6 −5            2 6 5

Exercise 2.4.31 Find the rank of the following matrices:

   1  4  9 16                     3  4  5  6  7
   4  9 16 25       1 2 3         4  5  6  7  8
   9 16 25 36   ,   2 3 4   ,     5  6  7  8  9
  16 25 36 49       3 5 7        10 11 12 13 14
                                 15 16 17 18 19

Exercise 2.4.32 Find the rank of following matrices:


     
1 2 3 14 1 1 1 6 1 2 1 2
(i). 3 1 2 11 , (ii). 1 −1 1 2 (iii). 3 6 5 4 .
2 3 1 11 2 1 −1 1 2 4 3 3

Exercise 2.4.33 Discuss the rank of following matrices:


   
1 1 1 6 1 1 1 1
(i). 1 2 3 10 , (ii). 1 2 4 λ 
1 2 λ µ 1 4 10 λ2

Exercise 2.4.34 Find the rank of the following matrices:

        1 2 0 −1          3  3 −1 −1           2 2 2
  (i)   3 4 1  2    (ii)  1 −1 −2 −4    (iii)  1 2 1
       −2 3 2  5          3  1  3 −2           3 4 3
                          6  3  0 −7

Exercise 2.4.35 Find the rank of the following matrices:

                    1  1  1  6
  2 −1 3  9         1 −1 −2 −4          1 2 3 14
  1  1 1  6    ,    1 −1  1  2    and   3 1 2 11
  1  4 9 36         2  1 −1  1          2 3 1 11

Exercise 2.4.36 Investigate for what values of λ and µ the matrices

  1  2  1  8
  1 −1 −2 −4        1 1 1  6        1 1  1  1
  2  1  3 13    ,   1 2 3 10    ,   1 2  4  λ
  3  4 −λ  µ        1 2 λ  µ        1 4 10 λ²

have (a) rank less than 3 and (b) rank equal to 3.

2.5 Normal form


Definition 2.5.1 (Normal form) The matrix of the form

  Ir          0r×(n−r)
  0(m−r)×r    0(m−r)×(n−r)

which is obtained by means of an application of elementary operations
on A is called the normal form of A. The index r gives the rank
of A.

Example 2.5.2 Consider the matrix

 
1 3 4 5
A = 1 2 6 7 .
1 5 0 1

Then
 
1 3 4 5
A = 1 2 6 7
1 5 0 1
 
1 3 4 5
R2 → R2 − R1
∼ 0 −1 2 2
R3 → R3 − R1
0 2 −4 −4
 
1 0 10 11
R1 → R1 + 3R2
∼ 0 −1 2 2 
R3 → R3 + 2R2
0 0 0 0
 
1 0 0 0 C2 → (−1).C2
∼ 0 1 2 2 C3 → C3 − 10C1
0 0 0 0 C4 → C4 − 11C1
 
1 0 0 0
C3 → C3 − 2C2
∼ 0 1 0 0
C4 → C4 − 2C2
0 0 0 0

Thus, the normal form of given matrix A is

 
1 0 0 0
0 1 0 0
0 0 0 0

Exercise 2.5.3 Reduce the matrix to its normal form and find its
rank, where
 
      0 1 −3 −1
      1 0  1  1
  A = 3 1  0  2 .
      1 1 −2  0

Solution:
 
0 1 −3 −1
1 0 1 1
A = 
3

1 0 2
1 1 −2 0
 
1 0 1 1
0 1 −3 −1
∼ 
3
 R1 ↔ R2
1 0 2
1 1 −2 0
 
1 0 1 1
0 1 −3 −1 R3 → R3 − 3R1
∼  
0 1 −3 −1 R4 → R4 − R1
0 1 −3 −1
 
1 0 1 1
0 1 −3 −1 R3 → R3 − R2
∼  
0 0 0 0 R4 → R4 − R2
0 0 0 0
 
1 0 0 0
0 1 −3 −1 C3 → C3 − C1
∼  
0 0 0 0 C4 → C4 − C1
0 0 0 0
 
  1 0 0 0
∼ 0 1 0 0     C3 → C3 + 3C2
  0 0 0 0     C4 → C4 + C2
  0 0 0 0

which is a normal form. Since the number of non-zero rows is 2,


hence its rank is 2.

Exercise 2.5.4 Reduce the matrix to its normal form and hence
find its rank, where

 
2 −2 0 0
4 2 0 2
A= 
1 −1 0 3
1 −2 1 2

Solution:
 
2 −2 0 0
4 2 0 2
A =  
1 −1 0 3
1 −2 1 2
 
1 −1 0 3
4 2 0 2
∼   (R1 ←→ R3 )
2 −2 0 0
1 −2 1 2
 
1 −1 0 3
R2 → R2 − 4R1
0 6 0 −10
∼   R3 → R3 − 2R1
0 0 0 −6 
R4 → R4 − R1
0 −1 1 −1
 
1 −1 0 3
0 0 6 −16
∼   R2 → R2 + 6R4
0 0 0 −6 
0 −1 1 −1
 
1 −1 0 3
0 −1 1 −1 
∼   R2 ↔ R4
0 0 0 −6 
0 0 6 −16
 
1 −1 0 3
0 −1 1 −1 
∼   R3 ↔ R4
0 0 6 −16
0 0 0 −6
 
  1 −1 0 0     R4 → (−1/6)R4
∼ 0 −1 1 0     R1 → R1 − 3R4
  0  0 6 0     R2 → R2 + R4
  0  0 0 1     R3 → R3 + 16R4

  1 −1  0 0
∼ 0  1 −1 0     R2 → (−1)R2
  0  0  1 0     R3 → (1/6)R3
  0  0  0 1

  1 −1 0 0
∼ 0  1 0 0     R2 → R2 + R3
  0  0 1 0
  0  0 0 1

  1 0 0 0
∼ 0 1 0 0     R1 → R1 + R2
  0 0 1 0
  0 0 0 1

∼ I4

which is the normal form. Clearly the rank of the matrix is 4.

Exercise 2.5.5 Find non-singular matrices P and Q such that


P AQ is in the normal form, where
 
1 −1 2 −1
A = 4 2 −1 2 
2 2 −2 0

Solution: The matrix A is of size 3 × 4. Thus, it can be expressed as

                        1 0 0         1 0 0 0
  1 −1  2 −1                          0 1 0 0
  4  2 −1  2     =      0 1 0    A    0 0 1 0
  2  2 −2  0            0 0 1         0 0 0 1

i.e; A = I3 A I4 . As we know, each row operation on A is the same as
pre-multiplication of A by the elementary matrix obtained by applying
the same row operation to I3 , and each column operation on A is the
same as post-multiplication of A by the elementary matrix obtained by
applying the same column operation to I4 . Thus, each row operation
on the left side of the equation is recorded by applying the same row
operation to I3 (the pre-multiplying matrix on the right side of the
equation), and each column operation on the left side is recorded by
applying the same column operation to I4 (the post-multiplying matrix
on the right side of the equation).
Now, applying the row operations R2 → R2 − 4R1 and R3 → R3 − 2R1 , we have

  1 −1  2 −1         1 0 0         1 0 0 0
  0  6 −9  6    =   −4 1 0    A    0 1 0 0
  0  4 −6  2        −2 0 1         0 0 1 0
                                   0 0 0 1

Now, applying the column operations C2 → C2 + C1 ,
C3 → C3 − 2C1 and C4 → C4 + C1 , we have

  1 0  0 0           1 0 0         1 1 −2 1
  0 6 −9 6      =   −4 1 0    A    0 1  0 0
  0 4 −6 2          −2 0 1         0 0  1 0
                                   0 0  0 1

Now, applying the row operation R2 → (1/6)R2 , we have

  1 0    0 0         1    0    0         1 1 −2 1
  0 1 −3/2 1    =   −4/6 1/6   0    A    0 1  0 0
  0 4   −6 2        −2    0    1         0 0  1 0
                                         0 0  0 1

Now, applying the row operation R3 → R3 − 4R2 , we have

  1 0    0  0        1     0    0        1 1 −2 1
  0 1 −3/2  1   =   −4/6  1/6   0   A    0 1  0 0
  0 0    0 −2        4/6 −4/6   1        0 0  1 0
                                         0 0  0 1

Now, applying the row operation R3 → (−1/2)R3 , we have

  1 0    0 0         1     0     0        1 1 −2 1
  0 1 −3/2 1    =   −4/6  1/6    0   A    0 1  0 0
  0 0    0 1        −2/6  2/6 −1/2        0 0  1 0
                                          0 0  0 1

Now, applying the column operations C3 → C3 + (3/2)C2 and
C4 → C4 − C2 , we have

  1 0 0 0            1     0     0        1 1 −1/2  0
  0 1 0 0       =   −4/6  1/6    0   A    0 1  3/2 −1
  0 0 0 1           −2/6  2/6 −1/2        0 0  1    0
                                          0 0  0    1

Now, applying the column operation C3 ↔ C4 , we have

  1 0 0 0            1     0     0        1 1  0 −1/2
  0 1 0 0       =   −4/6  1/6    0   A    0 1 −1  3/2
  0 0 1 0           −2/6  2/6 −1/2        0 0  0  1
                                          0 0  1  0
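The factorization can be verified numerically. A minimal sketch (assuming Python with the numpy library) multiplies out P AQ and compares it with the normal form:

import numpy as np

A = np.array([[1, -1,  2, -1],
              [4,  2, -1,  2],
              [2,  2, -2,  0]], dtype=float)
P = np.array([[   1,   0,    0],
              [-4/6, 1/6,    0],
              [-2/6, 2/6, -1/2]])
Q = np.array([[1, 1,  0, -0.5],
              [0, 1, -1,  1.5],
              [0, 0,  0,  1.0],
              [0, 0,  1,  0.0]])

print(P @ A @ Q)  # [[1 0 0 0], [0 1 0 0], [0 0 1 0]], the normal form of A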

Remark 2.5.6 Note that the non-singular matrices P and Q satisfying
P AQ = N , where N is the normal form of A, need not be unique.
Verify the following:

  1 0 0 0            1     0     0        1 1  0 −1/2
  0 1 0 0       =   −4/6  1/6    0   A    0 1 −1  3/2
  0 0 1 0           −2/6  2/6 −1/2        0 0  0  1
                                          0 0  1  0

  1 0 0 0            1     0     0        1 1/2   0  −1/6
  0 1 0 0       =   −4/3  1/3    0   A    0 1/2 −1/2  1/2
  0 0 1 0           −2/3  2/3   −1        0 0     0   1/3
                                          0 0    1/2  0

Exercise 2.5.7 Find non-singular matrices P and Q such that


P AQ is in normal form, where
 
1 1 2
A = 1 2 3
0 −1 −1

Exercise 2.5.8 Find non-singular matrices P and Q such that


P AQ is in normal form, where
 
1 2 −1 −2
A = −1 −1 1 1
0 1 2 1

Exercise 2.5.9 Show that the matrix


 
1 2 −3
1 −2 1 
5 −2 −3

is of rank 2. Also, find matrices P and Q such that


 
1 0 0
P AQ = 0 1 0
0 0 0

2.6 Inverse of a matrix


Definition 2.6.1 A square matrix A is said to be invertible (non-singular)
if there exists a matrix P such that P A = In = AP . The
matrix P is called the inverse of A. It is denoted by A−1 . It is noted
that if P A = In and AQ = In then

  P = P In = P (AQ) = (P A)Q = In Q = Q,

so the inverse, when it exists, is unique.
Example 2.6.2 Elementary matrices are non-singular.

Example 2.6.3 Diagonal matrices, all of whose diagonal entries are
non-zero, are non-singular.

It is also noted that an n × n square matrix A is invertible if and only
if rank A = n. We can find the inverse of an invertible matrix by
means of an application of elementary row (column) operations. The
procedure of finding the inverse in this way is illustrated by the
following examples.

Example 2.6.4 (Inverse of matrix by using elementary row


operations) Consider the matrix
 
1 2 3
A = 1 3 4 
1 4 4

It can be written as
   
1 2 3 1 0 0
1 3 4 = 0 1 0 A
1 4 4 0 0 1

Applying row operations R2 → R2 − R1 and R3 → R3 − R1 , we have


   
1 2 3 1 0 0
0 1 1 = −1 1 0 A
0 2 1 −1 0 1
Applying R3 → R3 − 2R2
   
1 2 3 1 0 0
0 1 1  = −1 1 0 A
0 0 −1 1 −2 1

Applying R3 → (−1)R3 , we have


   
1 2 3 1 0 0
0 1 1 = −1 1 0  A
0 0 1 −1 2 −1
Applying R2 → R2 − R3 and R1 → R1 − 3R3
   
1 2 0 4 −6 3
0 1 0 =  0 −1 1  A
0 0 1 −1 2 −1
Applying R1 → R1 − 2R2 , we have
   
1 0 0 4 −4 1
0 1 0 =  0 −1 1  A
0 0 1 −1 2 −1
Thus,  
4 −4 1
A−1 =  0 −1 1 
−1 2 −1
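A short sketch (assuming Python with the sympy library) confirms this inverse exactly:

from sympy import Matrix

A = Matrix([[1, 2, 3],
            [1, 3, 4],
            [1, 4, 4]])

print(A.inv())      # Matrix([[4, -4, 1], [0, -1, 1], [-1, 2, -1]])
print(A * A.inv())  # the identity matrix I3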
Example 2.6.5 (Inverse of matrix by using elementary col-
umn operations) Consider the matrix
 
1 1 1
A = 1 2 3
0 1 1
Clearly it can be written as
   
1 1 1 1 0 0
1 2 3 = A 0 1 0
0 1 1 0 0 1
Applying C2 → C2 − C1 , C3 → C3 − C1 , we have
   
1 0 0 1 −1 −1
1 1 2 = A 0 1 0
0 1 1 0 0 1
Applying C3 → C3 − C2 , we have
   
1 0 0 1 −1 0
1 1 1 = A 0 1 −1
0 1 0 0 0 1

Applying C1 → C1 − C3 and C2 → C2 − C3 , we have


   
1 0 0 1 −1 0
0 0 1 = A  1 2 −1
0 1 0 −1 −1 1
Applying C2 ↔ C3 , we have
   
1 0 0 1 0 −1
0 1 0 = A  1 −1 2 
0 0 1 −1 1 −1
Thus, inverse of A is
 
1 0 −1
A−1 =  1 −1 2 
−1 1 −1

Exercise 2.6.6 Determine which of the following matrices are invertible
and find the inverses:

  1 2 1      1 2 2      1  1  1 1      1 1  1 1
  1 3 2  ,   1 3 1  ,   1  2 −1 2  ,   1 2 −1 2
  1 0 1      1 1 3      1 −1  2 1      1 2 −1 1
                        1  3  3 2      5 9  1 6

Exercise 2.6.7 If A is invertible then show that the transpose A0


is invertible.

Exercise 2.6.8 If A is orthogonal then show that A−1 = A′ , the
transpose of A.

Exercise 2.6.9 Find the inverse of matrix


 
1 1 2 1
0 −2 0 0
 
1 2 1 −2
0 3 2 1

Exercise 2.6.10 Let A and B be row equivalent square matrices,


then prove that A is invertible if and only if B is invertible.

Exercises

Exercise 2.6.11 Determine for which value of a the matrix


 
1 1 0
A = 1 0 0 
1 2 a

is invertible. Describe A−1 .

Exercise 2.6.12 Reduce the following matrices to echelon form:

  1 0 1 −1 −3 0      1 0 1  2
  0 1 0 −1  2 1      0 1 0 −1      1 1  1 3
  0 0 1  1 −5 2  ,   0 0 1  1  ,   1 1  1 2
  0 0 0  0  0 0      0 0 0  1      2 1 −1 1

Exercise 2.6.13 Find a non-singular matrix P such that P A is


in row-echelon form, where
 
1 1 1 3
A = 1 1 1 2 .
2 2 2 1

Also find its rank.

Exercise 2.6.14 Find two non-singular matrices P and Q such
that P AQ is in normal form, where A is given by

  1 1 1 3      5  3 14 4       1 2  1
  1 1 1 2  ,   0  1  2 1  ,   −1 0  2
  2 2 2 1      1 −1  2 0       2 1 −3

  1 −1 2 −3      0 1  2 −1
  4  1 0  2      1 0  1  1      1  1  1
  0  3 0  4  ,   3 1  0  2  ,   1 −1 −1
  0  1 0  2      1 1 −2  0      3  1  1

Also find the rank of each.

Exercise 2.6.15 Under what condition



2 4 2
(1) the rank of matrix 2 1 2  is 3.
1 0 x
 
1 5 4
(2) the rank of matrix  0 3 2  is 2.
x 13 10

Exercise 2.6.16 Find the rank of matrix


 
2 −2 0 6
4 2 0 2
A= 1 −1
,
0 3
1 −2 1 2

by reducing it to normal form.

Exercise 2.6.17 Find the inverse of the following matrices

                          0 1 2 2
  1 3 3      1 2 1        0 1 2 3
  1 4 3  ,   3 2 3  ,     2 2 2 3  ,
  1 3 4      1 1 2        2 3 3 3

by using elementary row operations.

Exercise 2.6.18 Find the inverse of matrix


 
0 1 2
A = 1 2 3
3 1 1

by using elementary column operations.

Exercise 2.6.19 Determine the rank of the following matrices:

   8  1  3 6      2 −1 3 4      −2 −1 −3 −1
   0  3  2 2      4  1 0 2       1  2  3 −1
  −8 −1 −3 4  ,   0  3 0 4  ,    1  0  1  1
                  0  1 0 2       0  1  1 −1

Exercise 2.6.20 Determine the rank of the following matrices:

   6 1  3  8       2  3 −1 −1      1 a b 0
   4 2  6 −1       1 −1 −2 −4      0 c d 1
  10 3  9  7   ,   3  1  3 −2  ,   1 a b 0
  16 4 12 15       6  3  0 −7      0 c d 1

Chapter 3

System of linear equations

Consider a system of m linear equations in n unknowns x1 , x2 , . . .,


xn :

a11 x1 + a12 x2 + . . . + a1n xn = b1


a21 x1 + a22 x2 + . . . + a2n xn = b2
. . . . . . . . . . . . . . . . . . . . .        .......(1)
am1 x1 + am2 x2 + . . . + amn xn = bm

It can be written as
    
a11 a12 ... a1n x1 b1
 a21 a22 ... a2n   x2   b2 
..   ..  =  .. 
    
 .. .. ..
 . . . .  .   . 
am1 am2 . . . amn xn bm

i.e; AX = b, where
     
a11 a12 . . . a1n x1 b1
 a21 a22 . . . a2n   x2   b2 
A= . ..  , X =  ..  and b =  ..  .
     
. .. ..
 . . . .   .   . 
am1 am2 . . . amn xn bm

The equation AX = b is called the matrix form of the system


(1) of linear equations. The matrix A is called the coefficient ma-
trix of the system (1) of linear equations, X is called the matrix


of unknowns. If b = 0, then the system (1) is called the system


of homogeneous linear equations. If b 6= 0, then the system
(1) is called system of non-homogeneous linear equations.
Definition 3.0.1 A column vector
 
x1
 x2 
X= . 
 
 .. 
xn
satisfying AX = b is called a solution to the equation AX = b.
If equation AX = b has a solution, then it is called consistent
otherwise called inconsistent.
 
Example 3.0.2 The column vector (1 −1)′ is a solution to the equation

  1 3     x        −2
  2 0     y    =    2      .. (2)

Thus, it is consistent. Indeed, it has a unique solution.
Example 3.0.3 The equation

  1  1              1
  1 −1     x    =   1
  1  2     y        3

is inconsistent, i.e; it has no solutions (why?).
Example 3.0.4 The equation −x + y = 1, i.e;

  (−1 1) (x y)′ = 1,

has infinitely many solutions x = a, y = a + 1, where a is any
scalar. Indeed, this equation represents a straight line passing
through the infinitely many points (a, a + 1), where a is any scalar.
Example 3.0.5 Zero column vector 0n×1 is a solution to system
AX = 0n×1 , where A = (aij )m×n and
 
x1
 x2 
X =  . .
 
 .. 
xn

Thus, a system of homogeneous linear equations is consistent.

From above examples, it is clear that a system of linear equa-


tions may or may not be consistent. If consistent then it may have
a unique solution or may have infinitely many solutions. Here we
employ a systematic approach to solve systems of linear equations
by using elementary operations.

Definition 3.0.6 Two linear systems which have same sets of so-
lutions are called equivalent.

Definition 3.0.7 Let A and B be any two matrices of sizes m × n


and m × r. Then, matrix of size m × (n + r) whose first n columns
are columns of A arranged in the order of columns of A and last
r columns are columns of B arranged in the order of columns of
B, is called an augmented matrix. It is denoted by (A| B). For
example, if
   
5 3 7 4 5 0
A = 3 26 2 9 and B = 1 0
7 2 10 5 6 1
then  
5 3 7 4 5 0
(A|B) = 3 26 2 9 1 0 .
7 2 10 5 6 1

3.1 System of Homogeneous linear equations


As we have discussed earlier that a system of m equations in n
unknowns x1 , x2 , . . . , xn :

a11 x1 + a12 x2 + . . . + a1n xn = 0


a21 x1 + a22 x2 + . . . + a2n xn = 0
. . . . . . . . . . . . . . . . . . . . .        .......(1)
am1 x1 + am2 x2 + . . . + amn xn = 0

is called a system of Homogeneous linear equations. It can be writ-


ten in matrix form as

    
a11 a12 ... a1n x1 0
 a21 a22 ... a2n   x2  0
..   ..  =  .. 
    
 .. .. ..
 . . . .   .  .
am1 am2 . . . amn xn 0
i.e; AX = 0, where
   
a11 a12 ... a1n x1
 a21 a22 ... a2n   x2 
A= . ..  , X =  .. 
   
.. ..
 .. . . .   . 
am1 am2 . . . amn xn

One may easily verify that X = 0 is a solution to the system


of homogeneous linear equations AX = 0. Thus, a system of
homogeneous linear equations AX = 0 is always consistent
(i.e; has solutions). Next, if X is a non-zero solution then kX is
also a solution for every scalar k 6= 0. It also follows that if X1 , X2
are solutions to AX = 0, then k1 X1 + k2 X2 is also a solution for
every scalars k1 , k2 . This asserts that if AX = 0 has one non-
zero solution, then it has infinitely many solutions.

Proposition 3.1.1 Let E be the (row, or row reduced) echelon


form of A. Then, Y is a solution to the equation AX = 0 if and
only if Y is a solution to the equation EX = 0.

Proof: Let E be the (row, or row reduced) echelon form of A.


Then there exists a non-singular matrix P such that P A = E.
Now,

AY = 0 ⇔ (P A)Y = 0
⇔ EY = 0.

Thus proved. 2
Thus, to find solution of AX = 0, we first reduce A to
its Echelon form E and then solve the equation EX = 0.

Theorem 3.1.2 Consider the system AX = 0 of m equations in n


unknowns, where A = (aij )m×n . Suppose that ρ(A) = rank A = r.
Then, we have the following:
(i). If r = n then X = 0 is the only solution (i.e; it has a unique
solution, which is 0).
(ii). If r < n, then it has (n − r) linearly independent solutions.
Indeed, in this case, n − r unknowns are free (independent) to take
any values and so it has infinitely many solutions.

Proof: Since r = row rank (A) = column rank (A), by definition
r ≤ min{m, n}. Hence, either r < n or r = n. Case I:
If r = n, then its reduced Echelon form E will be
 
        In
  E =
        0

and so EX = 0 will be In X = 0, which gives X = 0. Thus, X = 0
is the only solution, i.e; AX = 0 has zero solution only.

Case II: Suppose that r < n. Let Aj , j = 1, 2, . . . , n denotes


the column of A. Without loss of generality assume that its first
r columns are linearly independent. Now, AX = 0 can be written
as
A1 x1 + A2 x2 + . . . + An xn = 0 . . . (1)
Since n − r columns Ar+1 , Ar+2 , . . . , An are linear combinations of
A1 , A2 , . . . , Ar , therefore we have
Ar+p = kp1 A1 + kp2 A2 + . . . + kpr Ar
for all p = 1, 2, . . . , (n − r), where kp1 , kp2 , . . . , kpr are scalars, not
all 0 for each p. It can be re-written as
kp1 A1 + kp2 A2 + . . . + kpr Ar + 0.Ar+1 + 0.Ar+2 + . . .
+0.Ar+p−1 − 1.Ar+p + 0.Ar+p+1 + 0.Ar+p+2 + . . . + 0.An = 0
for each p = 1, 2, . . . (n − r). This shows that
     
  X1 = (k11 , k12 , . . . , k1r , −1, 0, 0, . . . , 0)′ ,
  X2 = (k21 , k22 , . . . , k2r , 0, −1, 0, . . . , 0)′ ,
  X3 = (k31 , k32 , . . . , k3r , 0, 0, −1, 0, . . . , 0)′ ,
  . . . ,
  Xn−r = (k(n−r)1 , k(n−r)2 , . . . , k(n−r)r , 0, 0, . . . , 0, −1)′
are (n − r) solutions of (1). It is also noted that a1 X1 + a2 X2 +
. . . + an−r Xn−r = 0 gives a1 = a2 = . . . = an−r = 0(follows from
last n − r entries of these columns). Thus, X1 , X2 , . . . , Xn−r are
(n − r) linearly independent solutions of Equation (1). 2

Corollary 3.1.3 A system of homogeneous linear equations AX = 0
possesses a non-zero solution if rank (A) < n. Thus, if m < n, i.e;
the number of rows (equations) is less than the number of columns
(unknowns), then the equation AX = 0 has at least n − m linearly
independent solutions and hence it possesses infinitely many non-zero
solutions.
Example 3.1.4 Consider the system of equations:
x + 2y + 3z = 0
3x + 4y + 4z = 0
x + 10y + 12z = 0
It can be written in matrix form as
    
1 2 3 x 0
3 4 4  y  = 0
1 10 12 z 0
Now,
   
  1  2  3        1  2  3
  3  4  4    ∼   0 −2 −5     R2 → R2 − 3R1
  1 10 12        0  8  9     R3 → R3 − R1

                 1  2  3
             ∼   0 −2 −5     R3 → R3 + 4R2
                 0  0 −11

which is an echelon form. Since the number of non-zero rows in
the echelon form is 3, we have

rank(A) = 3 = number of unknowns.

Hence, it has zero solution only. Indeed, the solution to given


system is given by
    
1 2 3 x 0
0 −2 −5  y  = 0
0 0 −11 z 0

i.e; −11z = 0, −2y − 5z = 0 and x + 2y + 3z = 0. Solving this, we


get
z = 0, y = 0, x = 0.
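A rank computation (a sketch assuming Python with the numpy library) confirms that the trivial solution is the only one:

import numpy as np

A = np.array([[1, 2, 3],
              [3, 4, 4],
              [1, 10, 12]])

# rank 3 equals the number of unknowns, so X = 0 is the only solution
print(np.linalg.matrix_rank(A))  # 3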

Example 3.1.5 Consider the system of equations

x + 3y − 2z = 0
2x − y + 4z = 0
x − 11y + 14z = 0

It can be written in matrix form as


    
1 3 −2 x 0
2 −1 4   y  = 0
1 −11 14 z 0

Now,
   
1 3 −2 1 3 −2  
2 −1 R2 → R2 − 2R1
4  ∼ 0 −7 8
R3 → R3 − R1
1 −11 14 0 −14 16
 
1 3 −2 
∼  0 −7 8 R3 → R3 − 2R2
0 0 0

which is an echelon form. Since the number of non-zero rows in
the echelon form is 2, the rank of the coefficient matrix A is 2.
Now, the number of unknowns is 3. Thus, the number of independent
solutions is 1, i.e; one variable is free to take any value.
Now, the number of unknowns is 3. Thus, number of independent
solutions is 1, i.e; one variable is free to take any value.

Now, the solution to given system of linear equations is


    
1 3 −2 x 0
0 −7 8  y  = 0
0 0 0 z 0

This gives

x + 3y − 2z = 0, −7y + 8z = 0 .........(∗)

Take z = c, where c is arbitrary. Then y = 8c/7 and so

  x = −3y + 2z = −3 × (8c/7) + 2c = −10c/7,

i.e; x = −10c/7, y = 8c/7, z = c is a solution to the given system,
where c is arbitrary. If c = 0 then x = y = z = 0 is a
solution. If c = 7 then x = −10, y = 8, z = 7 is a solution. If c = 1
then x = −10/7, y = 8/7, z = 1 is a solution. Indeed, it has infinitely
many solutions, given by

  X = (x, y, z)′ = (−10c/7, 8c/7, c)′ ,

where c is an arbitrary scalar.
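The one-parameter family of solutions found above is exactly the null space of the coefficient matrix. A small sketch, assuming Python with the sympy library:

from sympy import Matrix

A = Matrix([[1, 3, -2],
            [2, -1, 4],
            [1, -11, 14]])

print(A.rank())       # 2, one less than the number of unknowns
print(A.nullspace())  # [Matrix([[-10/7], [8/7], [1]])]; scaling by c gives every solution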

Remark 3.1.6 If we take x = k, where k is any scalar, then equation (∗) becomes
3y − 2z = −k and −7y + 8z = 0. Solving these, we get

  y = −4k/5,  z = −7k/10.

Hence

  Y = (k, −4k/5, −7k/10)′

is also a solution of the given system, where k is any scalar.
It is noted that the two solutions

  X = (−10c/7, 8c/7, c)′   and   Y = (k, −4k/5, −7k/10)′

are the same. Indeed, if we put k = −10c/7 then Y = X.

Exercise 3.1.7 Prove that the system of equations

x+y+z = 0
2x − y − 3z = 0
3x − 5y + 4z = 0
x + 17y + 4z = 0

has unique solution (i.e; has zero solution only).

Exercise 3.1.8 For what values of λ, the system of linear equa-


tions

λx + y + z = 0
x + λy + z = 0
x + y + λz = 0

has zero solution only.

Solution: The coefficient matrix A of system of equations


AX = 0 is
 
λ 1 1
A = 1 λ 1 ,
1 1 λ
 
x
where X = y 
z
For the zero solution only, ρ(A) = number of unknowns = 3 and so A
is invertible. Then |A| ≠ 0, where |A| denotes the determinant of
the square matrix A.

Now,

  | λ 1 1 |
  | 1 λ 1 |  = λ(λ² − 1) − (λ − 1) + 1 − λ
  | 1 1 λ |
             = (λ − 1) [λ(λ + 1) − 1 − 1]
             = (λ − 1)(λ² + λ − 2)
             = (λ − 1)(λ − 1)(λ + 2) = (λ − 1)²(λ + 2)

Thus, if λ ≠ 1, −2 then |A| ≠ 0 and so A is invertible. In this
case, the system has the zero solution only. This shows that the
system has the zero solution only if λ ≠ 1, −2.
Example 3.1.9 We now consider the system of equations
λx + y + z = 0
x + λy + z = 0
x + y + λz = 0
which involves a parameter λ. For zero solution, we have discussed
the case in above example. Here our aim is to discuss all the cases
in which it has non-zero solutions. Also our goal is to find non-zero
solutions in each cases.
The system of equation in matrix form is AX = 0, where the
coefficient matrix A is
 
λ 1 1
A = 1 λ 1
1 1 λ
and  
x
X = y 
z
For non-zero solution, ρ(A) should be less than the number of un-
knowns, i.e; ρ(A) < 3 and so |A| = 0. Now

        | λ 1 1 |
  |A| = | 1 λ 1 |  = λ(λ² − 1) − (λ − 1) + 1 − λ
        | 1 1 λ |
                   = (λ − 1) [λ(λ + 1) − 1 − 1]
                   = (λ − 1)(λ² + λ − 2)
                   = (λ − 1)(λ − 1)(λ + 2) = (λ − 1)²(λ + 2)   ..(1)



Thus, |A| = 0 ⇔ λ = 1, −2.


Case I: For λ = 1

 
1 1 1
A = 1 1 1 
1 1 1
Since
   
1 1 1 1 1 1
A = 1 1 1 ∼ 0 0 0 R2 → R2 − R1 , R3 → R3 − R1
1 1 1 0 0 0

which is an echelon form. Clearly ρ(A) = 1. Since the number of
unknowns is 3, it has 3 − 1 = 2 independent solutions. Now,
solution is given by
    
1 1 1 x 0
0 0 0 y  = 0
0 0 0 z 0

i.e; x + y + z = 0. Take x = a, y = b; then z = −a − b, where a
and b are arbitrary scalars. If a = 1, b = 0 then x = 1, y = 0,
z = −1 is a solution. In matrix form, X1 = (1, 0, −1)′ is a solution.
Next, if a = 0, b = 1 then x = 0, y = 1, z = −1, i.e;

  X2 = (0, 1, −1)′

is a solution. Observe that these two solutions X1 , X2 are indepen-


dent.
Case II: For λ = −2

The coefficient matrix A will be


 
−2 1 1
A =  1 −2 1 
1 1 −2

Since

      −2  1  1        1  1 −2
  A =  1 −2  1    ∼   1 −2  1     R1 ↔ R3
       1  1 −2       −2  1  1

                      1  1 −2
                  ∼   0 −3  3     R2 → R2 − R1
                      0  3 −3     R3 → R3 + 2R1

                      1  1 −2
                  ∼   0 −3  3     R3 → R3 + R2
                      0  0  0

which is an echelon form. Clearly ρ(A) = 2. Since the number of
unknowns is 3, it has 3 − 2 = 1 independent solution.
The solution is given by
    
1 1 −2 x 0
0 −3 3  y  = 0
0 0 0 z 0

i.e; x+y −2z = 0, −3y +3z = 0. This gives y = z and x = 2z −y =


2z − z = z.

Take z = c, where c is arbitrary scalar, then x = c, y = c.


Thus,
 
c
X = c
c

is a solution to the system, where c is arbitrary scalar.

Remark 3.1.10 For every coefficient matrix A of the equation


AX = 0, the number of columns is same as the number of un-
knowns. Suppose that the coefficient matrix A is a square matrix
of size n × n. If ρ(A) = r < n. Then there exists a invertible
matrix P (i.e, |P | =
6 0) such that
 
B
PA =
0(n−r)×n
3.2. SYSTEM OF NON-HOMO. LINEAR EQUATIONS 79

Then,
|P A| = 0 (as last row of P A is a zero − row)
|P ||A| = 0 (as |P A| = |P ||A|)
|A| = 0 (as |P | =
6 0)
Thus, if ρ(A) < n then |A| = 0. One may also prove that if
|A| = 0 then ρ(A) < n. We shall use this fact frequently in the
section of eigen values and eigen vectors. Indeed, this section is
applicable for the section eigen values and eigen vectors
Exercise 3.1.11 Discuss for all values of k, the system of equa-
tions
2x + 3ky + (3k + 4)z = 0
x + (k + 4)y + (4k + 2)z = 0
x + 2(k + 1)y + (3k + 4)z = 0
Exercise 3.1.12 Prove that the only value of λ for which the fol-
lowing system of equations has non-zero solution is 6:
x + 2y + 3z = λx
3x + y + 2z = λy
2x + 3y + z = λz
Exercise 3.1.13 Solve the following system of homogeneous linear
equations:
x1 − x2 + x3 = 0, x1 + 2x2 − x3 = 0 2x1 + x2 + 3x3 = 0
Exercise 3.1.14 Solve:
x − y + z = 0,  x + y − z = 0,  −x + y + z = 0.

3.2 System of Non-homo. linear equations


Definition 3.2.1 An equation of the form AX = b, where
   
x1 b1
 x2   b2 
A = (aij )m×n , X =  .  and b =  . 
   
 ..   .. 
xn bm
with b 6= 0, is called a system of non-homogeneous linear equations.

Theorem 3.2.2 AX = b is consistent if and only if rank (A|b) =


rank A, where (A| b) denotes the augmented matrix A by b.

Proof: Let Cj denotes the j-th column of A, then equation AX =


b can be written as

C1 x 1 + C2 x 2 + . . . + Cn x n = b ......(1)

Let rank A = r, then only r columns of A are linearly inde-


pendent. Without loss of generality assume that its first columns
C1 , C2 , . . . , Cr are linearly independent. Thus, remaining n − r
columns Cr+k , k = 1, 2, . . . , n − r are linear combinations of C1 ,
C2 , . . . , Cr . Suppose that

  Cr+k = ck1 C1 + ck2 C2 + . . . + ckr Cr ..........(2)

for each k = 1, 2, . . . , n − r.
Suppose that the system is consistent; then there exist scalars pj ,
j = 1, 2, . . . , n, not all zero (since b ≠ 0), such that

p1 C1 + p2 C2 + . . . + pn Cn = b ......(3)

By (2) and (3), we have b is a linear combination of C1 , C2 ,


. . . , Cr . This shows that maximal number of linearly independent
columns of (A|b) will be r. Hence,

rank (A|b) = r = rank A.


Conversely suppose that rank (A|b) = rank A. Then maximal
number of linearly independent columns of (A|b) will be r. But C1 ,
C2 , . . . , Cr are linearly independent and so it will be linearly inde-
pendent columns of (A|b). Hence, there exists scalars c1 , c2 , . . . , cr ,
not all zero such that

b = c1 C1 + c2 C2 + . . . + cr Cr

i.e;

c1 C1 + c2 C2 + . . . + cr Cr + 0.Cr+1 + 0.Cr+2 + . . . + 0.Cn = b

This shows that x1 = c1 , x2 = c2 , . . ., xr = cr , xr+1 = 0, xr+2 = 0,


. . . , xn = 0 is a solution. Hence the system is consistent. 2

Corollary 3.2.3 Let rank A = rank (A | b) = r. Then we have


the following:
(i). If r = n then AX = b has a unique solution.
(ii). If r < n then AX = b has n − r + 1 independent solutions
and so infinitely many solutions.

Proof: Assume the hypothesis. If r = n then all the columns


C1 , C2 , . . . , Cn are linearly independent columns of (A | b) as
rank A = rank (A | b). Hence b is linear combination of C1 , C2 ,
. . . , Cn and so there exist unique scalars c1 , c2 , . . . , cn , not all zero,
such that
b = c1 C1 + c2 C2 + . . . + cn Cn

This shows that x1 = c1 , x2 = c2 , . . ., xn = cn is a unique solution.


Suppose that r < n. Then the remaining n − r columns Cr+k ,
k = 1, 2, . . . , n − r, together with b, are linear combinations of C1 ,
C2 , . . . , Cr . Suppose that

  Cr+k = c1k C1 + c2k C2 + . . . + crk Cr .........(1)

for each k = 1, 2, . . . , n − r, and

  b = c1 C1 + c2 C2 + . . . + cr Cr ....(2)

where c1k , c2k , . . . , crk are not all zero for each k, and c1 , c2 ,
. . . , cr are not all zero. By (1), we have n − r independent solutions

  X1 = (c11 , c21 , . . . , cr1 , −1, 0, 0, . . . , 0)′ ,
  X2 = (c12 , c22 , . . . , cr2 , 0, −1, 0, . . . , 0)′ ,
  X3 = (c13 , c23 , . . . , cr3 , 0, 0, −1, 0, . . . , 0)′ ,
  . . . ,
  Xn−r = (c1(n−r) , c2(n−r) , . . . , cr(n−r) , 0, 0, . . . , 0, −1)′

of the equation C1 x1 + C2 x2 + . . . + Cn xn = 0, i.e;

  AXi = 0 . . . (3)

for all i = 1, 2, . . . , n − r. (It is noted that rank A = r < n, so by
Theorem 3.1.2 the equation AX = 0 indeed has n − r linearly independent
solutions.)
By (2),

  X0 = (c1 , c2 , . . . , cr , 0, 0, . . . , 0)′ . . . (4)

is a solution of AX = b. Using Equations (3) and (4), we have

  A (X0 + k1 X1 + k2 X2 + . . . + kn−r Xn−r ) = b

for all scalars k1 , k2 , . . . , kn−r .
Thus, AX = b has infinitely many solutions. One may easily
verify that the n − r + 1 solutions X0 , X0 + X1 , X0 + X2 , . . ., X0 + Xn−r
are independent. 2
Thus, we see that
(i). If rankA 6= rank (A | b), system AX = b has no solutions
(Inconsistency)

(ii). If rankA = rank (A | b) = r, system AX = b has solution.


If r = n then it has a unique solution. If r < n then it has n − r + 1
independent solutions and so we have infinitely many solutions.
Note that if m < n, then rank A < n and so it has infinitely many
solutions.
Example 3.2.4 Consider the following system of equations:
x+y+z = 6
x−y+z = 2
2x + y − z = 1
It can be written as AX = b, where
     
1 1 1 x 6
A = 1 −1 1  , X = y  , b = 2
2 1 −1 z 1
Now,
   
1 1 1 : 6 1 1 1 : 6
1 −1 1 : 2 ∼ 0 −2 0 R2 → R2 − R1
: −4 
R3 → R3 − 2R1
2 1 −1 : 1 0 −1 −3 : −11
 
1 1 1 : 6
1
∼  0 1 0 : 2  R2 → −2 R2
0 −1 −3 : −11
 
1 1 1 : 6
∼ 0 1 0 : 2  R3 → R3 + R1
0 0 −3 : −9

which is an Echelon form. Clearly rank (A | b) = 3. Observe that


reduced part of A in Echelon form is
 
1 1 1
0 1 0 
0 0 −3
Hence rank A = 3. Since rank A = rank (A | b) = 3 = number
of unknowns, therefore the system is consistent and has a unique
solution. The solution is given by
    
1 1 1 x 6
0 1 0   y  =  2 
0 0 −3 z −9

This gives −3z = −9, y = 2 and x + y + z = 6. Solving these, we


get
z = 3, y = 2 and x = 6 − y − z = 6 − 2 − 3 = 1. Thus, solution is
x = 1, y = 2, z = 3.
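Since rank A = 3, the coefficient matrix is invertible and the solution can also be obtained directly (a sketch assuming Python with the numpy library):

import numpy as np

A = np.array([[1,  1,  1],
              [1, -1,  1],
              [2,  1, -1]], dtype=float)
b = np.array([6, 2, 1], dtype=float)

print(np.linalg.solve(A, b))  # [1. 2. 3.], i.e; x = 1, y = 2, z = 3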

Example 3.2.5 Consider the following system of equations

x + 2y + 3z = 14
3x + y + 2z = 11
2x + 3y + z = 11
     
1 2 3 x 14
Here A = 3 1 2 , X = y and b = 11.
    
2 3 1 z 11
Hence, augmented matrix is
 
1 2 3 : 14
(A|b) = 3 1 2 : 11
2 3 1 : 11

Now,
 
1 2 3 : 14
(A|b) = 3 1 2 : 11
2 3 1 : 11
 
1 2 3 : 14
R2 → R2 − 3R1
∼ 0 −5 −7 : −31
R3 → R3 − 2R1
0 −1 −5 : −17
 
1 2 3 : 14
∼  0 0 18 : 54  R2 → R2 − 5R3
0 −1 −5 : −17

 
  1  2  3 :  14
∼ 0  0  1 :   3      R2 → (1/18)R2
  0 −1 −5 : −17
 
1 2 3 : 14
∼ 0 −1 −5 : −17 R2 ↔ R3
0 0 1 : 3

which is an echelon form. From this, it follows that

rank (A | b) = rank A = 3 = number of unknowns.


Thus, system is consistent and has a unique solution. The solution
is given by
    
1 2 3 x 14
0 −1 −5 y  = −17
0 0 1 z 3
This gives z = 3, −y − 5z = −17, x + 2y + 3z = 14. Solving,
we get z = 3, y = 17 − 5 × 3 = 2 and x = 14 − 2 × 2 − 3 × 3 = 1.
Thus, x = 1, y = 2, z = 3, i.e; X = (1, 2, 3)′ is a solution to the
equation.

Example 3.2.6 Consider that the system of equations is

x + y + z = −3
3x + y − 2z = −2
2x + 4y + 7z = 7

It canbe writtenas AX =  b,where  


1 1 1 x −3
A = 3 1 −2, X = y  and b = −2
2 4 7 z 7
The augmented matrix (A | b) is
 
1 1 1 : −3
(A | b) = 3 1 −2 : −2
2 4 7 : 7
Now
 
1 1 1 : −3
(A | b) = 3 1 −2 : −2
2 4 7 : 7
 
1 1 1 : −3
R2 → R2 − 3R1
∼ 0 −2 −5 : 7 
R3 → R3 − 2R1
0 2 5 : 13
 
1 1 1 : −3
∼ 0 −2 −5 : 7  R3 → R3 + 2R2
0 0 0 : 20

which is an echelon form. Clearly rank A = 2 and rank (A | b) = 3.
Since rank A ≠ rank (A | b), the system is inconsistent.
Exercise 3.2.7 Solve the following system of equations:
x+y+z = 9
2x + 5y + 7z = 52
2x + y − z = 0
Exercise 3.2.8 Solve the following system of equations:
2x − y + 3z = 8
−x + 2y + z = 4
3x + y − 4z = 0
Example 3.2.9 Consider the system of equations:
x+y+z = 6 (3.2.1)
x + 2y + 3z = 14 (3.2.2)
x + 4y + 7z = 30 (3.2.3)
It can be written as AX = b, where

     
1 1 1 x 6
A = 1 2 3  , X =  y  and b =  14  .
1 4 7 z 30
Then  
1 1 1 : 6
(A|b) = 1 2 3 : 14 .
1 4 7 : 30
Now,
 
1 1 1 : 6
(A|b) = 1 2 3 : 14
1 4 7 : 30
 
1 1 1 : 6
R2 → R2 − R1
∼ 0 1 2 : 8
R3 → R3 − R1
0 3 6 : 24
 
1 1 1 : 6
∼ 0 1 2 : 8 R3 → R3 − 3R2
0 0 0 : 0

This is an echelon form. Clearly rank A = rank (A|b) = 2. Hence,
it has 3 − 2 + 1 = 2 independent solutions. The solution is given by
    
1 1 1 x 6
0 1 2   y  =  8 
0 0 0 z 0
i.e; x + y + z = 6 and y + 2z = 8. This gives, y = 8 − 2z,
x = 6 − y − z = −2 + z. Take z = c then y = 8 − 2c and x = −2 + c,
c is any scalar. Thus,
 
−2 + c
X=  8 − 2c 
c
is a solution, where c is an arbitrary scalar. For c = 0, we get
X = (−2, 8, 0)′ and for c = 2, X = (0, 4, 2)′ . The solutions
(−2, 8, 0)′ and (0, 4, 2)′ are its independent solutions.
Exercise 3.2.10 Investigate for what values of λ, µ, the system of
equations:
x+y+z = 6
x + 2y + 3z = 10
x + 2y + λz = µ
has (i) no solution, (ii) a unique solution, (iii) an infinite number
of solutions.

Solution: The system of equation is AX = b, where

     
1 1 1 x 6
A =  1 2 3 , X =  y  and b =  10 .
1 2 λ z µ
Then augmented matrix is
 
1 1 1 : 6
(A |b) =  1 2 3 : 10 
1 2 λ : µ

Now,
 
1 1 1 : 6
(A |b) = 1 2 3 : 10
1 2 λ : µ
 
1 1 1 : 6
R2 → R2 − R1
∼ 0 1 2 : 4 
R3 → R3 − R1
0 1 λ−1 : µ−6
 
1 1 1 : 6
∼  0 1 2 : 4  R3 → R3 − R2
0 0 λ − 3 : µ − 10

This is an echelon form.


Case I: If λ = 3 but µ 6= 10, then rank A = 2 and rank(A | b) =
3. Thus, rank A 6= rank(A | b), i.e; the system has no solutions
(inconsistent).
Case II: If λ 6= 3 then rank A = 3 = rank(A | b) = number
of unknowns. Thus, the system has unique solution.
Case III: If λ = 3 and µ = 10, then rank A = 2 = rank(A | b) <
3, the number of unknowns. Thus, the system has 3 − 2 + 1 = 2
independent solutions and so it has infinitely many solutions.
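The three cases can be reproduced computationally by comparing rank A with rank (A | b). A sketch assuming Python with the sympy library; classify is a hypothetical helper name introduced here for illustration:

from sympy import Matrix

def classify(lam, mu):
    A = Matrix([[1, 1, 1], [1, 2, 3], [1, 2, lam]])
    Ab = A.row_join(Matrix([6, 10, mu]))  # the augmented matrix (A | b)
    rA, rAb = A.rank(), Ab.rank()
    if rA < rAb:
        return "no solution"
    return "a unique solution" if rA == 3 else "infinitely many solutions"

print(classify(3, 7))   # no solution               (lambda = 3, mu != 10)
print(classify(4, 7))   # a unique solution         (lambda != 3)
print(classify(3, 10))  # infinitely many solutions (lambda = 3, mu = 10)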

Exercise 3.2.11 For what values of a, the system of equations:

x+y+z = 1
x + 2y + 4z = a
x + 4y + 10z = a2

has a solution and solve them completely in each case.

Solution: The system of equation is AX = b, where

     
1 1 1 x 1
A =  1 2 4 , X =  y  and b =  a .
1 4 10 z a2
Then augmented matrix is
 
1 1 1 : 1
(A |b) =  1 2 4 : a 
1 4 10 : a2

Now,
 
1 1 1 : 1
(A | b) = 1 2 4 : a
1 4 10 : a2
 
1 1 1 : 1
R2 → R2 − R1
∼  0 1 3 : a−1
R3 → R3 − R1
0 3 9 : a2 − 1
 
1 1 1 : 1
∼  0 1 3 : a−1  R3 → R3 − 3R2
0 2
0 0 : a − 3a + 2

which is an echelon form. From the echelon form, it follows that
the system is consistent if a² − 3a + 2 = 0, i.e; if a = 1, 2.
Case I: For a = 1, the echelon form of augmented matrix
(A | b) is  
1 1 1 : 1
0 1 3 : 0
0 0 0 : 0
Thus, solution is given by y + 3z = 0 and x + y + z = 1. Solving
these, we have y = −3z and x = 1 − y − z = 1 + 3z − z = 1 + 2z.
Thus, solution is given by z = c, y = −3c, x = 2c + 1, where c is
any scalar.
Case II: For a = 2, the Echelon form of augmented matrix
(A | b) is  
1 1 1 : 1
0 1 3 : 1
0 0 0 : 0
Thus, solution is given by y + 3z = 1 and x + y + z = 1. Solving
these, we have y = 1 − 3z and x = 1 − y − z = 1 − 1 + 3z − z = 2z.
Thus, solution is given by z = c, y = 1 − 3c, x = 2c, where c is any
scalar.

Exercises
Exercise 3.2.12 Solve the following systems of equations:

1. 2x − y + 3z = 8, −x + 2y + z = 4, 3x + y − 4z = 0.

2. x − 2y + 3z = 2, 2x − 3z = 0, x + y + z = 0.

3. 2x1 + 3x2 + x3 = 9, x1 + 2x2 + 3x3 = 6, 3x1 + x2 + 2x3 = 8.

4. x + y + z = 7, x + 2y + 3z = 16, x + 3y + 4z = 22.

5. x − y + 2z = 4, 3x + y + 4z = 6, x + y + z = 1.

6. x1 + 2x2 − x3 = 6, 3x1 − x2 − 2x3 = 3, 4x1 + 3x2 + x3 = 9.

Exercise 3.2.13 Investigate for what values of λ, µ, the system of


equations:
x + y + z = 6, x − 2y + 3z = 10, x + 2y + λz = µ
has (i) no solution, (ii) a unique solution, (iii) an infinite number
of solutions. Find solutions in the case of consistent.

Exercise 3.2.14 If the system of linear equations:


x + 3z = kx, −5x + 4y = ky, −3x + 2y − 5z = kz
has solutions other than x = y = z = 0, then prove that
k³ − 12k + 14 = 0.

Exercise 3.2.15 Prove that the system of linear equations:

−2x + y + z = a
x − 2y + z = b
x + y − 2z = c

has no solution unless a + b + c = 0.

Chapter 4

Eigen values and Eigen vectors

Definition 4.0.1 Let A be a n × n square matrix. Then a scalar λ


is called an eigen value or characteristic roots (latent roots) of A
if there exists a non-zero column vector X such that AX = λX. A
non-zero column vector X such that AX = λX is called an eigen
vector associated with eigen value λ.

For example, each non-zero column vector of size n × 1 is an
eigen vector of In associated with the eigen value 1. It is noted that if
X is an eigen vector associated to the eigen value λ, then kX is also an
eigen vector associated to the eigen value λ, for each non-zero scalar k.
Next, if X1 , X2 are any two eigen vectors associated to eigen value
λ. Then, AX1 = λX1 and AX2 = λX2 . By matrix algebra, one
may easily observe the following:

A(aX1 + bX2 ) = aAX1 + bAX2 = aλX1 + bλX2 = λ(aX1 + bX2 )


Define Eλ = {X ∈ Mn×1 | (A − λIn )X = 0}. Clearly it is non-
empty as 0 ∈ Eλ . If λ is an eigen value of A, then Eλ contains
non-zero column vectors. Indeed, the elements of Eλ satisfies the
following conditions:
(i). for each scalar k and column vector X ∈ Eλ , kX ∈ Eλ .
(ii). X + Y ∈ Eλ for X, Y ∈ Eλ .
(iii). X + (Y + Z) = (X + Y ) + Z for X, Y, Z ∈ Eλ
(iv). 0 ∈ Eλ such that 0 + X = X = X + 0,
(v). k.(X + Y ) = k.X + k.Y and (k + l).X = k.X + l.X,

91

(vi). (kl).X = k.(l.X), and


(vii). 1.X = X.
Thus, (Eλ , +, .) is a vector space, called as eigen space asso-
ciated with eigen value λ. We shall discuss the details of vector
spaces in later chapter. The maximum number of linearly indepen-
dent column vectors of Eλ is called the geometric multiplicity
of eigen value λ.
Let λ be an eigen value of a square matrix A. Then, by def-
inition of eigen value, there exists a non-zero column vector X
such that AX = λX and so X is a non-zero solution to the sys-
tem of homogeneous linear equations (A − λIn )X = 0. Thus,
rank (A − λIn ) < n, the number of unknowns in the equation,
i.e; det (A − λIn ) = 0. Conversely, if det (A − λIn ) = 0 then
rank (A − λIn ) < n and so the system of homogeneous linear
equations (A − λIn )X = 0 has a non-zero solution, i.e; there exists
a non-zero column vector X such that AX = λX. Thus, we have
the following:

Proposition 4.0.2 A scalar λ is an eigen value of A if and only


if
det (A − λIn ) = 0.

Thus, all the eigen values of a square matrix A are roots of the
equation det (A − λIn ) = 0, where λ is a parameter.

Definition 4.0.3 Let A and B be two square matrices of same


sizes. Then A is said to be a similar matrix of B if there exists
a non-singular(invertible) matrix P such that B = P −1 AP . It is
denoted by A ∼ B. It is noted that if A ∼ B then B ∼ A. Indeed,
the relation ∼ of being similar is an equivalence relation. Thus, we
frequently call A and B are similar matrices.

Proposition 4.0.4 Similar matrices have the same eigen values.

Proof: Let A and B be any two similar matrices. Then there



exists an invertible matrix P such that B = P −1 AP . Now,

det (B − cIn ) = det (P −1 AP − cIn )
             = det (P −1 AP − P −1 (cIn )P )
             = det (P −1 (A − cIn )P )
             = det (P −1 ) · det (A − cIn ) · det (P )    (as det (AB) = det A · det B)
             = det (A − cIn )    (because det (P −1 ) = 1/det P )
Thus, the result follows by Proposition 4.0.2. 2
Let A = (aij )n×n be a square matrix. Then,

                   | a11 − x   a12       a13       . . .  a1n     |
                   | a21       a22 − x   a23       . . .  a2n     |
  det (A − xIn ) = | a31       a32       a33 − x   . . .  a3n     |   . . . (1)
                   | . . .     . . .     . . .     . . .  . . .   |
                   | an1       an2       an3       . . .  ann − x |

One may easily observe that it is a polynomial in x of degree n.


This polynomial is called as characteristic polynomial of given
matrix A. The equation

det (A − xIn ) = 0

is called a characteristic equation of A.


By Proposition 4.0.2, it follows that c is a eigen value of A if
and only if c is a root of the characteristic equation of A.
Since the characteristic equation of A is of n-th degree, therefore
it has at most n roots. It is noted that it may or may not have all
the roots in R, but it has exactly n roots in C, the field of complex
numbers (Fundamental theorem of algebra).

For example, the characteristic equation of matrix A, where


 
0 −1
A=
1 0

is x² + 1 = 0. Clearly it has no real roots in R, i.e; it has no eigen


values in R. However if we consider A over the field C, it has

exactly two eigen values i, −i. It is also noted that eigen values of
a square matrix may or may not be equal.
If c1 , c2 , . . . , ck are distinct roots of det (A − xIn ) such that

det (A − xIn ) = (−1)^n (x − c1 )^{r1} (x − c2 )^{r2} . . . (x − ck )^{rk} ,

together with r1 + r2 + . . . + rk = n, then r1 , r2 , . . . , rk are called the
algebraic multiplicities of c1 , c2 , . . . , ck respectively.

Example 4.0.5 Consider the matrix


 
5 4
A=
1 2
The characteristic polynomial of A is given by

               | 5 − λ   4     |
  |A − λI2 | = | 1       2 − λ |  = (5 − λ)(2 − λ) − 4
                                  = λ² − 7λ + 6
                                  = (λ − 1)(λ − 6)

Thus, the eigen values are given by (λ − 1)(λ − 6) = 0, i.e; λ = 1, 6.

Eigen vector associated to λ = 1:

               4 4        4 4
  A − 1.I2 =   1 1    ∼   0 0     R2 → R2 − (1/4)R1

Hence, an eigen vector X = (x, y)′ is given by 4x + 4y = 0, i.e;
y = −x. Take x = a, where a ≠ 0 is an arbitrary scalar; then y = −a.
Hence, an eigen vector associated with λ = 1 is

  X1 = (a, −a)′ ,   a ≠ 0,

where a is an arbitrary scalar. If we take a = 1, then X1 = (1, −1)′ .

Eigen vector associated to λ = 6:

               −1  4        −1 4
  A − 6.I2 =    1 −4    ∼    0 0     R2 → R2 + R1

Hence, an eigen vector X = (x, y)′ is given by −x + 4y = 0, i.e;
x = 4y. Taking y = 1, we have x = 4. Hence, an eigen vector
associated to λ = 6 is X2 = (4, 1)′ .
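The computation can be cross-checked numerically (a sketch assuming Python with the numpy library; numpy scales each eigen vector to unit length, so the columns below are only proportional to X2 and X1):

import numpy as np

A = np.array([[5., 4.],
              [1., 2.]])

vals, vecs = np.linalg.eig(A)
print(vals)  # eigen values 6 and 1 (possibly in another order)
print(vecs)  # columns proportional to (4, 1)' and (1, -1)'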

Example 4.0.6 Consider the matrix


 
−3 1 −1
A =  −7 5 −1  .
−6 6 −2

Then

               | −3 − λ   1       −1     |
  |A − λI3 | = | −7       5 − λ   −1     |
               | −6       6       −2 − λ |

               | −2 − λ   1       −1     |
             = | −2 − λ   5 − λ   −1     |     C1 → C1 + C2
               |  0       6       −2 − λ |

                          | 1   1       −1     |
             = −(2 + λ)   | 1   5 − λ   −1     |
                          | 0   6       −2 − λ |

                          | 1   1       −1     |
             = −(2 + λ)   | 0   4 − λ    0     |     R2 → R2 − R1
                          | 0   6       −2 − λ |

             = −(2 + λ)²(λ − 4)

Thus, eigen values are λ = −2, −2, 4.


Observe that algebraic multiplicity of −2 is 2 and algebraic mul-
tiplicity of 4 is 1.

Eigen vector associated to λ = −2:

               −1 1 −1
  A + 2.I3 =   −7 7 −1
               −6 6  0

               −1  1 −1
           ∼   −7  7 −1      R3 → (−1/6)R3
                1 −1  0

               −1  1 −1
           ∼    0  0 −1      R2 → R2 + 7R3
                1 −1  0

               −1  1  0
           ∼    0  0 −1      R1 → R1 − R2
                1 −1  0

               −1  1  0
           ∼    0  0 −1      R3 → R3 + R1
                0  0  0
Clearly it has only one independent solution, and so the geometric
multiplicity of λ = −2 is 1. Now, an eigen vector (x, y, z)′ is given by

  −1 1  0      x      0
   0 0 −1      y   =  0
   0 0  0      z      0

i.e; −x + y = 0, −z = 0. Solving this, we get y = x, z = 0, with x
free. Taking x = 1, the eigen vector is

  X1 = (1, 1, 0)′
Eigen vector associated to λ = 4
 
               −7 1 −1
  A − 4.I3 =   −7 1 −1
               −6 6 −6

               −7  1 −1
           ∼    0  0  0      R2 → R2 − R1
                1 −1  1      R3 → (−1/6)R3

                0 −6  6
           ∼    0  0  0      R1 → R1 + 7R3
                1 −1  1

                0  1 −1      R1 → (−1/6)R1
           ∼    1 −1  1      R2 ↔ R3
                0  0  0

                1 −1  1
           ∼    0  1 −1      R1 ↔ R2
                0  0  0

                1  0  0
           ∼    0  1 −1      R1 → R1 + R2
                0  0  0
Thus, an eigen vector is given by

  1 0  0      x      0
  0 1 −1      y   =  0
  0 0  0      z      0

i.e; x = 0, y − z = 0, and so y = z, x = 0.
Take z = 1. Then, the eigen vector associated to λ = 4 is given by

  X2 = (0, 1, 1)′ .
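Both eigen values and their multiplicities can be confirmed symbolically. A sketch assuming Python with the sympy library, where eigenvects() returns triples (eigen value, algebraic multiplicity, basis of the eigen space):

from sympy import Matrix

A = Matrix([[-3, 1, -1],
            [-7, 5, -1],
            [-6, 6, -2]])

for val, alg_mult, basis in A.eigenvects():
    print(val, alg_mult, basis)
# -2 has algebraic multiplicity 2 but only one basis vector, (1, 1, 0)',
# so its geometric multiplicity is 1; 4 has multiplicity 1 with (0, 1, 1)'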

Exercise 4.0.7 Find the characteristic polynomial, eigen values
and eigen vectors of the following matrices:
       
3 1 1 3 2 4 1 0 0 2 2 1
2 4 2 , 2 0 2 , 0 2 1 , 1 3 1
1 1 3 4 2 3 2 0 3 1 2 2
 
Exercise 4.0.8 Find the eigen values of the matrix

  0  1  2
  1  0 −1
  2 −1  0

Exercise 4.0.9 Prove that the eigen values of the matrix

  a h g
  0 b 0
  0 c c

are a, b and c.

Exercise 4.0.10 Prove that 0 is the eigen value of a matrix if and


only if the matrix is singular. Also prove that if λ is an eigen value
of invertible matrix A, then λ 6= 0 and λ−1 is an eigen value of A−1 .
Hint: 0 is the root of |A − λIn | = 0 if and only if |A − 0In | = 0,
i.e; if and only if |A| = 0, i.e; A is singular.

Exercise 4.0.11 The eigen values of idempotent matrix are either


0 or unity. (Hint: AX = λX, A2 = A ⇒ λ2 X = λX).

Exercise 4.0.12 If λ is an eigen value of an n × n matrix A then


λk is an eigen value of Ak . Let p(x) = a0 + a1 x + a2 x2 + . . . + ak xk
be a polynomial in x. Define p(A) = a0 In +a1 A+a2 A2 +. . .+ak Ak .
Prove that p(λ) is an eigen value of p(A).

Exercise 4.0.13 Prove that the real eigen values of skew-symmetric
matrices over R are 0, provided they exist.

4.1 Properties
Proposition 4.1.1 Corresponding to each eigen vector X of a
square matrix, there exists a unique eigen value of A whereas cor-
responding to each eigen value there exists infinitely many eigen
vectors.

Proof: Let X be an eigen vector of A corresponding to eigen


values λ1 and λ2 . Then AX = λ1 X and AX = λ2 X. Thus,
λ1 X = λ2 X and so (λ1 − λ2 )X = 0. But X 6= 0, so λ1 − λ2 = 0,
i.e; λ1 = λ2 . Thus, eigen value corresponding to each eigen vector
X is unique.
Next, let X be an eigen vector associated to eigen value λ of
A, then AX = λX. Let k be any non-zero scalar. Then A(kX) =
k(AX) = k(λX) = λ(kX). Thus, kX is an eigen vector associated
to eigen value λ of A, for each non-zero scalar. 2
From this proposition, it follows that if λ1 and λ2 are any two
distinct eigen values of a square matrix A, then Eλ1 ∩ Eλ2 = {0}.

Proposition 4.1.2 The product of eigen values (characteristic roots)


of a square matrix is equal to the determinant of A.

Proof: Let λ1 , λ2 , . . . , λn be n eigen values of an n × n matrix A.


Then det (A − xIn ) = (λ1 − x)(λ2 − x) . . . (λn − x). Putting x = 0,
we get det A = λ1 λ2 . . . λn . 2
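This fact is easy to illustrate numerically; the following is a short sketch in which the 3 × 3 matrix is an arbitrary illustrative choice (any square matrix works).

import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

vals = np.linalg.eigvals(A)
# The product of the eigen values equals det A (up to round-off).
print(np.isclose(np.prod(vals).real, np.linalg.det(A)))   # True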

Proposition 4.1.3 A non-zero linear combination of any two eigen
vectors associated to an eigen value λ of a square matrix A is again
an eigen vector corresponding to that eigen value.

Proof is left as an exercise for readers.

Proposition 4.1.4 The eigen values of a Hermitian matrix are


all real.

Proof: Let A be a Hermitian matrix. Then A^θ = A. Let λ be
any eigen value of A. Then there exists a non-zero vector X such
that

AX = λX                                  (4.1.1)

Taking the conjugate transpose of equation 4.1.1, we have

X^θ A^θ = λ̄ X^θ                          (4.1.2)

and hence
X^θ A = λ̄ X^θ    as A^θ = A.
Thus
(X^θ A)X = (λ̄ X^θ)X
i.e;
X^θ (AX) = λ̄ X^θ X
Using equation 4.1.1, we have λ X^θ X = λ̄ X^θ X. But X^θ X ≠ 0 as
X ≠ 0. Hence λ = λ̄, i.e; λ is purely real. 2
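A quick numerical check of this proposition (a sketch; the Hermitian matrix below is an arbitrary illustrative choice):

import numpy as np

# A Hermitian matrix: it equals its own conjugate transpose.
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
assert np.allclose(A, A.conj().T)

vals = np.linalg.eigvals(A)
# The imaginary parts vanish (up to round-off): the eigen values are real.
print(np.allclose(vals.imag, 0.0))   # True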

Corollary 4.1.5 The eigen values of a real symmetric matrix are


all real.

Proposition 4.1.6 The eigen value of a skew- Hermitian matrix


is either zero or purely imaginary.

Proof: Let A be a skew-Hermitian matrix. Then A^θ = −A.
Let λ be any eigen value of A. Then there exists a non-zero vector
X such that
AX = λX                                  (4.1.3)
Taking the conjugate transpose of equation 4.1.3, we have
X^θ A^θ = λ̄ X^θ                          (4.1.4)
and hence
−X^θ A = λ̄ X^θ    as A^θ = −A.
Thus
−(X^θ A)X = (λ̄ X^θ)X
i.e;
−X^θ (AX) = λ̄ X^θ X
Using equation 4.1.3, we have −λ X^θ X = λ̄ X^θ X. But X^θ X ≠ 0 as
X ≠ 0. Hence −λ = λ̄, i.e; λ + λ̄ = 0 and so λ is either zero or purely
imaginary. 2

Corollary 4.1.7 The eigen value of a real skew-symmetric matrix


is either 0 or purely imaginary.
Proposition 4.1.8 The eigen values of unitary matrices are of
unit modulus, i.e; |λ| = 1, or λ = e^{iθ} for some real θ.

Proof: Let A be a unitary matrix. Then A^θ A = In. Let λ be any
eigen value of A. Then there exists a non-zero vector X such that
AX = λX                                  (4.1.5)
Taking the conjugate transpose of equation 4.1.5, we have
X^θ A^θ = λ̄ X^θ                          (4.1.6)
By equations 4.1.5 and 4.1.6, we have
X^θ (A^θ A)X = λ̄ X^θ (AX).
This gives
X^θ In X = λ̄ X^θ λ X
i.e;
X^θ X = (λ λ̄) X^θ X
But X^θ X ≠ 0 as X ≠ 0. Hence, λ λ̄ = 1, i.e; |λ| = 1. 2
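For instance, a plane rotation matrix is orthogonal (hence unitary over C) and its eigen values e^{±iθ} indeed have unit modulus; a short sketch:

import numpy as np

theta = 0.7
# A rotation matrix is orthogonal: A^T A = I.
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

vals = np.linalg.eigvals(A)              # e^{i*theta}, e^{-i*theta}
print(np.allclose(np.abs(vals), 1.0))    # True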

Corollary 4.1.9 The eigen values of orthogonal matrices are of
unit modulus. Thus, any real eigen value of an orthogonal matrix
over R is either 1 or −1.

Proposition 4.1.10 Any two eigen vectors corresponding to dis-
tinct eigen values of a Hermitian matrix are orthogonal.

Proof: Let A be a Hermitian matrix and λ1 , λ2 be distinct eigen
values of A. Then λ1 , λ2 are real, and so λ̄1 = λ1 and λ̄2 = λ2 .
Let X1 , X2 be eigen vectors associated to the distinct eigen values
λ1 , λ2 respectively. Then, we have

AX1 = λ1 X1        . . . (1)
AX2 = λ2 X2        . . . (2)
A^θ = A            . . . (3)

Pre-multiplying (1) by X2^θ and (2) by X1^θ, we have

X2^θ A X1 = λ1 X2^θ X1        . . . (4)

X1^θ A X2 = λ2 X1^θ X2        . . . (5)

Taking the conjugate transpose of (4) and using A^θ = A and λ̄1 = λ1 ,
we have

X1^θ A X2 = λ1 X1^θ X2

Putting this value in (5), we have λ2 X1^θ X2 = λ1 X1^θ X2 , i.e; (λ2 −
λ1 ) X1^θ X2 = 0. But λ2 ≠ λ1 , so X1^θ X2 = 0. Thus, the eigen
vectors X1 , X2 are orthogonal. 2
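Continuing the numerical sketch from Proposition 4.1.4 (the same illustrative Hermitian matrix, which has the distinct eigen values 1 and 4):

import numpy as np

A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])        # Hermitian, eigen values 1 and 4

vals, vecs = np.linalg.eig(A)
X1, X2 = vecs[:, 0], vecs[:, 1]      # eigen vectors for distinct eigen values

# np.vdot conjugates its first argument, i.e; it computes X1^theta X2.
print(np.isclose(np.vdot(X1, X2), 0.0))   # True: X1 and X2 are orthogonal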

Proposition 4.1.11 Let A be an n × n matrix. Let λ1 , λ2 , . . . , λr
be r distinct eigen values of A with associated eigen vectors
X1 , X2 , . . . , Xr . Then the set {X1 , X2 , . . . , Xr } is linearly indepen-
dent.

Proof: Assume that the result is not true. Then there exists
a natural number i such that {X1 , X2 , . . . , Xi } is linearly inde-
pendent but the set {X1 , X2 , . . . , Xi , Xi+1 } is linearly dependent.
Then there exist scalars c1 , c2 , . . . , ci+1 , not all zero, such that

c1 X1 + c2 X2 + . . . + ci Xi + ci+1 Xi+1 = 0        . . . (1)

Pre-multiplying both sides by A and using AXj = λj Xj for each
j = 1, 2, . . . , i + 1, we have

c1 λ1 X1 + c2 λ2 X2 + . . . + ci λi Xi + ci+1 λi+1 Xi+1 = 0        . . . (2)

Multiplying (1) by λi+1 and then subtracting from (2), we have

c1 (λ1 − λi+1 )X1 + c2 (λ2 − λi+1 )X2 + . . . + ci (λi − λi+1 )Xi = 0

But {X1 , X2 , . . . , Xi } is linearly independent, hence

c1 (λ1 − λi+1 ) = 0, c2 (λ2 − λi+1 ) = 0, . . . , ci (λi − λi+1 ) = 0

Since λj ≠ λi+1 for all j ≤ i, we have

c1 = c2 = . . . = ci = 0

Putting these values in (1), we have ci+1 Xi+1 = 0, i.e; ci+1 = 0, and so
{X1 , X2 , . . . , Xi , Xi+1 } is linearly independent, which contradicts
our assumption. Thus, the result is true. 2

Proposition 4.1.12 If λ1 and λ2 are distinct eigen values of a
unitary matrix A, then any two eigen vectors corresponding to λ1
and λ2 respectively are orthogonal.
Proof: Let A be a unitary matrix and λ1 ≠ λ2 be distinct eigen
values of A. Then λ1 λ̄1 = 1 = λ2 λ̄2 . Let X1 , X2 be eigen vectors
associated to the distinct eigen values λ1 , λ2 respectively. Then, we
have

AX1 = λ1 X1        . . . (1)
AX2 = λ2 X2        . . . (2)
A^θ A = In         . . . (3)

Taking the conjugate transpose of (2), we have
X2^θ A^θ = λ̄2 X2^θ        . . . (4)
Post-multiplying (4) by AX1 , we have
X2^θ A^θ A X1 = λ̄2 X2^θ (AX1) = λ̄2 X2^θ λ1 X1 = λ̄2 λ1 X2^θ X1
Using (3), we have
X2^θ X1 = λ̄2 λ1 X2^θ X1
and so (λ̄2 λ1 − 1) X2^θ X1 = 0, i.e; λ̄2 (λ1 − λ2 ) X2^θ X1 = 0 as 1 = λ̄2 λ2 .
But λ1 ≠ λ2 and λ̄2 ≠ 0, therefore X2^θ X1 = 0. Thus, X1 , X2 are
orthogonal. 2

4.2 Diagonalizable Matrix


Definition 4.2.1 A square matrix A is said to be diagonalizable
if it is similar to a diagonal matrix. In other words, if there exists
a non-singular matrix P such that P −1 AP is a diagonal matrix.
In the example 4.2.2, the matrix A is diagonalizable because there
exists a non-singular matrix P such that P −1 AP = D is diagonal.
The diagonal matrix D will be called diagonal form of A. The
matrix P is called the transition matrix for A.

Example 4.2.2 Consider the matrix


 
8 −6 2
−6 7 −4
2 −4 3

Then the characteristic polynomial is

            | 8 − λ   −6      2   |
|A − λI3| = |  −6    7 − λ   −4   |
            |   2     −4    3 − λ |
          = (8 − λ)[(7 − λ)(3 − λ) − 16] + 6[−18 + 6λ + 8]
            + 2(24 − 14 + 2λ)
          = (8 − λ)(λ² − 10λ + 5) + 6(6λ − 10) + 2(2λ + 10)
          = −λ³ + 18λ² − 45λ
          = −λ(λ² − 18λ + 45)
          = −λ(λ − 3)(λ − 15)

Thus, the eigen values are given by λ(λ − 3)(λ − 15) = 0, i.e; the eigen
values are λ = 0, 3, 15.
Eigen vector associated with λ = 0:

              [  8  −6   2 ]
A − 0I3 = A = [ −6   7  −4 ]
              [  2  −4   3 ]

   [ 0  10  −10 ]    R1 → R1 − 4R3
 ∼ [ 0  −5    5 ]    R2 → R2 + 3R3
   [ 2  −4    3 ]

   [ 0   0   0 ]
 ∼ [ 0  −5   5 ]    R1 → R1 + 2R2
   [ 2  −4   3 ]

   [ 2  −4   3 ]
 ∼ [ 0  −5   5 ]    R1 ↔ R3
   [ 0   0   0 ]

Thus, the eigen vector is given by 2x − 4y + 3z = 0, −5y + 5z = 0.
Solving this, we get

y = z,   x = (4y − 3z)/2 = z/2,

where z is free to take any value.

Take z = 2; then y = 2 and x = 1. Hence, the associated eigen
vector is
      [ 1 ]
X1 =  [ 2 ] .
      [ 2 ]

Eigen vector associated to λ = 3:

          [  5  −6   2 ]
A − 3I3 = [ −6   4  −4 ]
          [  2  −4   0 ]

   [  5  −6   2 ]
 ∼ [ −6   4  −4 ]    R3 → (1/2)R3
   [  1  −2   0 ]

   [ 0   4   2 ]    R1 → R1 − 5R3
 ∼ [ 0  −8  −4 ]    R2 → R2 + 6R3
   [ 1  −2   0 ]

   [ 0   4   2 ]
 ∼ [ 0   0   0 ]    R2 → R2 + 2R1
   [ 1  −2   0 ]

   [ 0   4   2 ]
 ∼ [ 1  −2   0 ]    R2 ↔ R3
   [ 0   0   0 ]

   [ 1  −2   0 ]
 ∼ [ 0   4   2 ]    R1 ↔ R2
   [ 0   0   0 ]

Thus, the eigen vector is given by x − 2y = 0 and 4y + 2z = 0. Solving
this, we have
x = 2y,   z = −2y.
Take y = 1; the eigen vector is given by

      [  2 ]
X2 =  [  1 ] .
      [ −2 ]
Eigen vector associated to λ = 15:

           [ −7  −6    2 ]
A − 15I3 = [ −6  −8   −4 ]
           [  2  −4  −12 ]

   [ −7  −6   2 ]
 ∼ [ −6  −8  −4 ]    R3 → (1/2)R3
   [  1  −2  −6 ]

   [ 0  −20  −40 ]    R1 → R1 + 7R3
 ∼ [ 0  −20  −40 ]    R2 → R2 + 6R3
   [ 1   −2   −6 ]

   [ 0    0    0 ]
 ∼ [ 0  −20  −40 ]    R1 → R1 − R2
   [ 1   −2   −6 ]

   [ 0   0   0 ]
 ∼ [ 0   1   2 ]    R2 → (−1/20)R2
   [ 1  −2  −6 ]

   [ 1  −2  −6 ]
 ∼ [ 0   1   2 ]    R1 ↔ R3
   [ 0   0   0 ]

Thus, the eigen vector is given by x − 2y − 6z = 0, y + 2z = 0. Solving
this, we have
y = −2z,   x = 2y + 6z = −4z + 6z = 2z.

Take z = 1; then x = 2, y = −2. Hence the associated eigen vector is
given by
      [  2 ]
X3 =  [ −2 ] .
      [  1 ]

Thus, if we take

                   [ 1   2   2 ]
P = [X1 X2 X3 ] =  [ 2   1  −2 ]
                   [ 2  −2   1 ]

observe that its columns are orthogonal and P² = 9I3 . Hence,
P⁻¹ = P/9. Now,

               [  0    0   0 ] [ 1   2   2 ]
P⁻¹AP = (1/9)  [  6    3  −6 ] [ 2   1  −2 ]
               [ 30  −30  15 ] [ 2  −2   1 ]

        [ 0  0   0 ]
     =  [ 0  3   0 ]    (verify)
        [ 0  0  15 ]

which is a diagonal matrix.
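The whole computation of this example can be replayed numerically; a minimal sketch in Python/NumPy:

import numpy as np

A = np.array([[ 8.0, -6.0,  2.0],
              [-6.0,  7.0, -4.0],
              [ 2.0, -4.0,  3.0]])

# Columns of P are the eigen vectors X1, X2, X3 found above.
P = np.column_stack([[1.0,  2.0,  2.0],
                     [2.0,  1.0, -2.0],
                     [2.0, -2.0,  1.0]])

print(np.allclose(P @ P, 9 * np.eye(3)))   # True: P^2 = 9 I3
D = np.linalg.inv(P) @ A @ P               # equivalently (P / 9) @ A @ P here
print(np.round(D, 6))                      # diag(0, 3, 15)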

It is a proved fact that if all the eigen values λ1 , λ2 , . . . , λn of an
n × n matrix A are distinct, then there exists a non-singular matrix
P such that P⁻¹AP is the diagonal matrix diag[λ1 , λ2 , . . . , λn ], where
the j-th column of P is an eigen vector associated to the eigen value
λj of A. In the above example, all the eigen values are distinct and
hence the matrix is diagonalizable. This fact will be discussed later.

Example 4.2.3 Consider the matrix


 
6 −2 2
A = −2 3 −1
2 −1 3

Then

            | 6 − λ   −2      2   |
|A − λI3| = |  −2    3 − λ   −1   |
            |   2     −1    3 − λ |

            | 6 − λ   −2      2   |
          = |  −2    3 − λ   −1   |    R3 → R3 + R2
            |   0    2 − λ  2 − λ |

                    | 6 − λ   −2     2 |
          = (2 − λ) |  −2    3 − λ  −1 |
                    |   0      1     1 |

          = (2 − λ)[(6 − λ)(3 − λ + 1) + 2(−2 − 2)]
          = (2 − λ)(16 − 10λ + λ²)
          = −(2 − λ)²(λ − 8)

Thus, the eigen values are λ = 2, 2, 8.

Case I: When λ = 2.

          [ 6 − 2   −2      2   ]   [  4  −2   2 ]
A − 2I3 = [  −2    3 − 2   −1   ] = [ −2   1  −1 ]
          [   2     −1    3 − 2 ]   [  2  −1   1 ]

   [ 4  −2   2 ]    R2 → R2 + (1/2)R1
 ∼ [ 0   0   0 ]    R3 → R3 − (1/2)R1
   [ 0   0   0 ]

Thus, the eigen vector is given by 4x − 2y + 2z = 0, i.e; x = (y − z)/2.
Take y = 0, z = −2; then x = 1. Similarly, if y = 2, z = 0 then x = 1.
Hence independent eigen vectors are given by

      [  1 ]          [ 1 ]
X1 =  [  0 ] ,  X2 =  [ 2 ] .
      [ −2 ]          [ 0 ]
Case II: When λ = 8.

          [ 6 − 8   −2      2   ]   [ −2  −2   2 ]
A − 8I3 = [  −2    3 − 8   −1   ] = [ −2  −5  −1 ]
          [   2     −1    3 − 8 ]   [  2  −1  −5 ]

   [ −2  −2   2 ]    R2 → R2 − R1
 ∼ [  0  −3  −3 ]    R3 → R3 + R1
   [  0  −3  −3 ]

   [ −2  −2   2 ]
 ∼ [  0  −3  −3 ]    R3 → R3 − R2
   [  0   0   0 ]

Thus, the eigen vector is given by −3y − 3z = 0 and −2x − 2y + 2z =
0. Solving these, we get
y = −z,   x = 2z
Take z = 1; then y = −1 and x = 2. Hence, the eigen vector is

      [  2 ]
X3 =  [ −1 ]
      [  1 ]
Here the eigen vectors X1 , X2 , X3 are linearly independent.
Take
                  [  1  1   2 ]
P = [X1 X2 X3 ] = [  0  2  −1 ]
                  [ −2  0   1 ]
then
              [ 2  −1  −5 ]
P⁻¹ = (1/12)  [ 2   5   1 ]
              [ 4  −2   2 ]

Now,
               [  4   −2  −10 ]
P⁻¹A = (1/12)  [  4   10    2 ]
               [ 32  −16   16 ]

and hence
         [ 2  0  0 ]
P⁻¹AP =  [ 0  2  0 ]
         [ 0  0  8 ]
Remark 4.2.4 Here X1 , X2 , X3 are 3 linearly independent eigen
vectors associated to the eigen values 2, 2, 8 of a 3 × 3 matrix A, and
we have a non-singular matrix P = [X1 X2 X3 ] such that

         [ 2  0  0 ]
P⁻¹AP =  [ 0  2  0 ]
         [ 0  0  8 ]

This prompts that if X1 , X2 , . . . , Xn are n linearly independent
eigen vectors associated to eigen values λ1 , λ2 , . . . , λn of a given
n × n matrix A, then we have a non-singular matrix P = [X1 X2 . . . Xn ]
such that

         [ λ1  0   0  . . .  0  ]
         [ 0   λ2  0  . . .  0  ]
P⁻¹AP =  [ ..  ..  ..  ..   ..  ]
         [ 0   0   0  . . .  λn ]

where some of the eigen values may be repeated.

Theorem 4.2.5 An n×n square matrix A is diagonalizable if and


only if it possesses n linearly independent eigen vectors.

Proof: Let A be diagonalizable. Then there exists a non-singular
matrix P such that P⁻¹AP is a diagonal matrix

D = diag[λ1 , λ2 , . . . , λn ],

for some scalars λ1 , λ2 , . . . , λn . Since

P⁻¹AP = diag[λ1 , λ2 , . . . , λn ],

therefore
AP = P · diag[λ1 , λ2 , . . . , λn ]

Let Xj denote the j-th column of P . Then, we have


 
λ1 0 0 . . . 0
 0 λ2 0 . . . 0 
A[X1 X2 . . . Xn ] = [X1 X2 . . . Xn ]  .
 
.. .. .. .. 
 .. . . . . 
0 0 0 . . . λn
i.e;
[AX1 AX2 . . . AXn ] = [λ1 X1 λ2 X2 . . . λn Xn ]
Thus, AXj = λj Xj for all j = 1, 2, . . . , n. This shows that Xj is an
eigen vector of A associated to the eigen value λj for each j. Clearly
X1 , X2 , . . ., Xn are linearly independent as P is non-singular.
Conversely suppose that X1 , X2 , . . ., Xn are linearly inde-
pendent eigen vectors associated to eigen values λ1 , λ2 , . . ., λn

respectively. Then AXj = λj Xj for all j = 1, 2, . . . , n. Let


P = [X1 , X2 , . . . , Xn ] and
D = diag [λ1 , λ2 , . . . , λn ].
Then

AP = A[X1 , X2 , . . . , Xn ]
= [AX1 , AX2 , . . . , AXn ]
= [λ1 X1 , λ2 X2 , . . . , λn Xn ]
 
                          [ λ1  0   0  . . .  0  ]
                          [ 0   λ2  0  . . .  0  ]
= [X1 , X2 , . . . , Xn ] [ ..  ..  ..  ..   ..  ]
                          [ 0   0   0  . . .  λn ]
= PD
This gives, P −1 AP = D which is diagonal, where P is non-singular
because its all columns are linearly independent. Thus, A is diag-
onalizable. 2
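This criterion suggests a simple numerical test: count the independent eigen vectors by taking the rank of the eigen vector matrix returned by NumPy. The sketch below is heuristic (floating-point rank decisions depend on the tolerance), not a definitive implementation; the helper is_diagonalizable is introduced here for illustration only.

import numpy as np

def is_diagonalizable(A, tol=1e-8):
    # A is diagonalizable iff it has n linearly independent eigen
    # vectors, i.e; iff the eigen vector matrix has full rank.
    n = A.shape[0]
    _, vecs = np.linalg.eig(A)
    return np.linalg.matrix_rank(vecs, tol=tol) == n

A = np.array([[ 6.0, -2.0,  2.0],
              [-2.0,  3.0, -1.0],
              [ 2.0, -1.0,  3.0]])    # Example 4.2.3: diagonalizable
J = np.array([[1.0, 1.0],
              [0.0, 1.0]])            # a Jordan block: not diagonalizable

print(is_diagonalizable(A))   # True
print(is_diagonalizable(J))   # False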

Exercise 4.2.6 Find a non-singular matrix P such that P −1 AP


is diagonal, where
 
1 2 −2
A= 1 2 1 .
−1 −1 0
Exercise 4.2.7 Prove that the matrix
 
8 −8 −2
A = 4 −3 −2
3 −4 1
is diagonalizable. Also find the matrix P for which P −1 AP is di-
agonal.

Exercise 4.2.8 Reduce the matrix


 
1 −1 2
A = 0 2 −1
0 0 3
to its diagonal form.

Exercise 4.2.9 For each of the matrices A given by


       
1 0 1 −2 5 7 −3 −7 19 1 −3 3
0 1 0 , 1
  0 −1 , −2 −1 8  , 3 −5 3 ,
1 0 1 −1 1 2 −2 −3 10 6 −6 4

find a non-singular matrix P such that P −1 AP is diagonal.

4.3 Cayley-Hamilton Theorem


Theorem 4.3.1 Every square matrix satisfies its characteristic
equation.

Proof: Let A be an n × n square matrix. Suppose that the


characteristic polynomial p(λ) of A is

det (A − λIn ) = a0 + a1 λ + a2 λ² + . . . + an λⁿ                  (4.3.1)

where an ≠ 0. Indeed, an = (−1)ⁿ. It is also noted that a0 =
det A = |A| (it is obtained by putting λ = 0 in the equation 4.3.1).
Since each entry of A − λIn is a polynomial in λ of degree at
most 1, each entry of the adjoint matrix adj (A − λIn ) is a
polynomial in λ of degree at most n − 1. Suppose that

adj (A − λIn ) = B0 + B1 λ + B2 λ² + . . . + Bn−1 λⁿ⁻¹                  (4.3.2)

where B0 , B1 , . . . , Bn−1 are matrices of size n × n. As we know


that
(A − λIn ) · adj (A − λIn ) = det (A − λIn ) · In                  (4.3.3)
Using equations 4.3.1 and 4.3.2, we have

(A − λIn )(B0 + B1 λ + B2 λ² + . . . + Bn−1 λⁿ⁻¹)
    = (a0 + a1 λ + a2 λ² + . . . + an λⁿ) In

i.e;

AB0 + (AB1 − B0 )λ + (AB2 − B1 )λ² + . . . + (ABn−1 − Bn−2 )λⁿ⁻¹ − Bn−1 λⁿ
= a0 In + a1 In λ + a2 In λ² + . . . + an−1 In λⁿ⁻¹ + an In λⁿ

Equating like powers of λ, we have

AB0            = a0 In
AB1 − B0       = a1 In
AB2 − B1       = a2 In
  ..   ..         ..
ABn−1 − Bn−2   = an−1 In
−Bn−1          = an In

Pre-multiplying these successively by In , A, A², . . ., Aⁿ, we have

AB0                  = a0 In
A²B1 − AB0           = a1 A
A³B2 − A²B1          = a2 A²
  ..   ..               ..
AⁿBn−1 − Aⁿ⁻¹Bn−2    = an−1 Aⁿ⁻¹
−AⁿBn−1              = an Aⁿ

Adding them, we have

0 = a0 In + a1 A + a2 A² + . . . + an−1 Aⁿ⁻¹ + an Aⁿ

Thus, A satisfies the equation 4.3.1, i.e; A satisfies its characteristic
equation. 2

Remark 4.3.2 From the characteristic equation

a0 In + a1 A + a2 A² + . . . + an−1 Aⁿ⁻¹ + an Aⁿ = 0

it follows that Aⁿ is a linear combination of In , A, A², . . ., Aⁿ⁻¹.
By repeated use of this equation we find that Aᵐ is a linear com-
bination of In , A, A², . . ., Aⁿ⁻¹ for each m ∈ N. If A is invertible
then a0 = det A ≠ 0. Multiplying the characteristic equation by
A⁻¹, we have

A⁻¹ = −(1/a0)(a1 In + a2 A + . . . + an Aⁿ⁻¹)

It can be illustrated by the following example.

Example 4.3.3 Consider the matrix


 
1 2
A=
0 2

The characteristic polynomial is

            | 1 − λ    2   |
|A − λI2| = |   0    2 − λ |  = 2 − 3λ + λ²

Thus, the characteristic equation of A is

λ² − 3λ + 2 = 0                                  (4.3.4)

By the Cayley-Hamilton theorem, we have

A² − 3A + 2I2 = 0

Verification of Cayley-Hamilton theorem:

Given that  
1 2
A= ,
0 2
     
2 1 6 3 6 2 0
then A = , 3A = and 2I2 = and so
0 4 0 6 0 2
     
2 1 6 3 6 2 0
A − 3A + 2I2 = − +
0 4 0 6 0 2
 
1−3+2 6−6+0
=
0−0+0 4−6+2
 
0 0
=
0 0

Thus,
A² − 3A + 2I2 = 0                                  (4.3.5)
This shows that A satisfies its characteristic equation λ² − 3λ + 2 = 0.
Observe that |A| = 2 ≠ 0. Hence, A is invertible. To obtain
A⁻¹, we multiply both sides of the equation 4.3.5 by A⁻¹, and get

A − 3I2 + 2A⁻¹ = 0

Thus,
                  [ 1 − 3   2 − 0 ]   [ −2   2 ]
−2A⁻¹ = A − 3I2 = [ 0 − 0   2 − 3 ] = [  0  −1 ]

             [ 1  −1  ]
Hence, A⁻¹ = [ 0  1/2 ]
Next, by characteristic equation 4.3.5, we have
A2 = 3A − 2I2
Using this recursively, we have
A3 = A(3A − 2I2 )
= 3A2 − 2A
= 3(3A − 2I2 ) − 2A
= 7A − 6I2

A4 = 7A2 − 6A
= 7(3A − 2I2 ) − 6A
= 15A − 14I2
Hence
2A4 − 3A3 + 9A2 − 7A + 5I2 = 2(15A − 14I2 ) − 3(7A − 6I2 )
+9(3A − 2I2 ) − 7A + 5I2
= (30 − 21 + 27 − 7)A
+(−28 + 18 − 18 + 5)I2
= 29A − 23I2
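All of these manipulations can be cross-checked numerically; a short sketch for the matrix of this example (np.poly returns the coefficients of the characteristic polynomial):

import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 2.0]])

print(np.poly(A))                          # [ 1. -3.  2.], i.e; l^2 - 3l + 2

# Cayley-Hamilton: A^2 - 3A + 2 I2 = 0.
print(np.allclose(A @ A - 3 * A + 2 * np.eye(2), 0))    # True

# A^{-1} from the characteristic equation: A^{-1} = (3 I2 - A) / 2.
A_inv = (3 * np.eye(2) - A) / 2
print(np.allclose(A @ A_inv, np.eye(2)))   # True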

Exercise 4.3.4 Use Cayley-Hamilton theorem to express 2A5 −


3A4 + A2 − 4I2 as a linear polynomial in A, where
 
3 1
A= .
−1 2
Exercise 4.3.5 Find the characteristic equation of the matrix
 
2 −1 1
A =  −1 2 −1  .
1 −1 2

Verify Cayley -Hamilton theorem and then find A−1 .


 
Solution: Given that

     [  2  −1   1 ]
A =  [ −1   2  −1 ] .
     [  1  −1   2 ]

Then the characteristic polynomial is given by

            | 2 − λ   −1      1   |
|A − λI3| = |  −1    2 − λ   −1   |
            |   1     −1    2 − λ |
          = (2 − λ)[(2 − λ)² − 1] − (−1)[−(2 − λ) + 1] + 1[1 − (2 − λ)]
          = (2 − λ)(3 − 4λ + λ²) + (λ − 1) + (λ − 1)
          = −λ³ + 6λ² − 9λ + 4

Thus, the characteristic equation is

λ³ − 6λ² + 9λ − 4 = 0

By the Cayley-Hamilton theorem, we have

A³ − 6A² + 9A − 4I3 = 0.

 
Now,
           [  2  −1   1 ] [  2  −1   1 ]   [  6  −5   5 ]
A² = A·A = [ −1   2  −1 ] [ −1   2  −1 ] = [ −5   6  −5 ]
           [  1  −1   2 ] [  1  −1   2 ]   [  5  −5   6 ]

            [  6  −5   5 ] [  2  −1   1 ]   [  22  −21   21 ]
A³ = A²·A = [ −5   6  −5 ] [ −1   2  −1 ] = [ −21   22  −21 ]
            [  5  −5   6 ] [  1  −1   2 ]   [  21  −21   22 ]

Then,

A³ − 6A² + 9A − 4I3
     [  22  −21   21 ]     [  6  −5   5 ]     [  2  −1   1 ]     [ 1  0  0 ]
  =  [ −21   22  −21 ] − 6 [ −5   6  −5 ] + 9 [ −1   2  −1 ] − 4 [ 0  1  0 ]
     [  21  −21   22 ]     [  5  −5   6 ]     [  1  −1   2 ]     [ 0  0  1 ]

     [ 0  0  0 ]
  =  [ 0  0  0 ]
     [ 0  0  0 ]
Thus, A satisfies its characteristic equation.
Since |A| = 6 − 1 − 1 = 4 ≠ 0, A⁻¹ exists. Multiplying
the characteristic equation by A⁻¹, we have

A⁻¹ = (1/4)(A² − 6A + 9I3)

             [  6  −5   5 ]   [ 12  −6   6 ]   [ 9  0  0 ]
   = (1/4) ( [ −5   6  −5 ] − [ −6  12  −6 ] + [ 0  9  0 ] )
             [  5  −5   6 ]   [  6  −6  12 ]   [ 0  0  9 ]

             [  3  1  −1 ]
   = (1/4)   [  1  3   1 ]
             [ −1  1   3 ]
Exercise 4.3.6 Find the characteristic equation of the matrix
 
1 0 2
 0 2 1 
2 0 3

Verify that it is satisfied by A and hence find A−1 .

Solution: Given that


 
1 0 2
A= 0 2 1 
2 0 3
Then its characteristic polynomial is

            | 1 − λ    0      2   |
|A − λI3| = |   0    2 − λ    1   | = −λ³ + 6λ² − 7λ − 2
            |   2      0    3 − λ |

Hence the characteristic equation is

λ³ − 6λ² + 7λ + 2 = 0

and so by the Cayley-Hamilton theorem, we have

A³ − 6A² + 7A + 2I3 = 0    . . . (1)
Now
    
1 0 2 1 0 2 5 0 8
A2 =  0 2 1   0 2 1  =  2 4 5 
2 0 3 2 0 3 8 0 13
  
5 0 8 1 0 2
A3 = A2 A =  2 4 5   0 2 1 
8 0 13 2 0 3
 
21 0 34
=  12 8 23 
34 0 55
Then,
   
21 0 34 5 0 8
A3 − 6A2 + 7A + 2I3 =  12 8 23  − 6  2 4 5 
34 0 55 8 0 13
   
1 0 2 2 0 0
+7  0 2 1  +  0 2 0 
2 0 3 0 0 2
 
0 0 0
=  0 0 0 
0 0 0
Next, multiplying both sides of (1) by A⁻¹ (since |A| = −2 ≠ 0),
we have
2A−1 = −A2 + 6A − 7I3
     
−5 0 −8 6 0 12 7 0 0
=  −2 −4 −5  +  0 12 6  −  0 7 0 
−8 0 −13 12 0 18 0 0 7
 
−6 0 4
=  −2 1 1 
4 0 −2

Thus,  
−3 0 2
A−1 =  −1 1/2 1/2 
2 0 −1

Exercise 4.3.7 Find the characteristic equation of the matrix


 
1 2 0
 2 −1 0 
0 0 −1

and hence find A−1 .

Exercise 4.3.8 Find the characteristic equation of the matrix


 
0 1 2
 0 −3 0 
1 1 −1

Also verify the Cayley -Hamilton theorem and hence find A−1 .

Exercise 4.3.9 Prove that the following matrices satisfies Cayley-


Hamilton theorem and hence deduce the inverse:
     
1 2 2 1 0 2 1 2 −1
 2 1 2 , 0 2 1 , 0 1 −1  ,
2 2 1 2 0 3 3 −1 1

Exercise 4.3.10 Verify Cayley-Hamilton theorem for the follow-


ing matrices:
   
0 c −b 1 0 2
 −c 0 a  and  0 2 1 .
b −a 0 2 0 3

Exercises
Exercise 4.3.11 Find the eigen values and eigen vectors of the
matrix  
1 2 0
 2 −1 0 
0 0 −1
Is it diagonalizable?

Exercise 4.3.12 Verify the Cayley -Hamilton theorem for the ma-
trix  
0 1 2
 0 −3 0 .
1 1 −1
Also find A−1 .

Exercise 4.3.13 Test whether the following matrices are diago-


nalizable:
     
1 2 2 1 0 2 1 2 −1
 2 1 2 , 0 2 1 , 0 1 −1  ,
2 2 1 2 0 3 3 −1 1
   
2 4 3 2 2 1
 0 −1 1 , 1 3 1 
2 2 −1 1 2 2

Exercise 4.3.14 Verify Cayley-Hamilton theorem for the follow-


ing matrices:
   
0 c −b 1 0 2
 −c 0 a  and  0 2 1 .
b −a 0 2 0 3

Exercise 4.3.15 Verify Cayley-Hamilton theorem for the follow-


ing matrices:
   
2 4 −3 2 0 1
 0 −1 1 , 1 3 1 
2 −2 −1 0 −2 2

Chapter 5

Vector Spaces

In this chapter we give the abstract notion of vector spaces.

5.1 Vector Spaces


Definition 5.1.1 Let F be a field. Then a non-empty set V to-
gether with operations + : V × V → V and · : F × V → V is called
a vector space over the field F if it satisfies the following conditions:
(1). (u + v) + w = u + (v + w) for all u, v, w ∈ V .
(2). there exists 0 ∈ V such that u + 0 = u = 0 + u for all u ∈ V .
(3). for each v ∈ V there exists u ∈ V such that u + v = 0 = v + u.
the element u is called the additive inverse of v denoted by −v.
(4). u + v = v + u for all u, v ∈ V .
(5). a · (u + v) = a · u + a · v for all u, v ∈ V and a ∈ F ,
(6). (a + b) · u = a · u + b · u for all u ∈ V and a, b ∈ F ,
(7). (ab) · u = a · (b · u) for all u ∈ V and a, b ∈ F
(8). 1 · u = u for all u ∈ V , where 1 is the multiplicative identity
of F .
The elements of V are called vectors and elements of F are
called scalars. The operation + is called vector addition and the
operation · is called scalar multiplication. It is also noted that
a·u denotes the image of (a, u) under the map · and u+v denotes the
image of (u, v) under the map +. Throughout the portion of linear
algebra, we shall write au in place of a · u. The vector space V
over the field F is denoted by V (F ). The additive identity 0 ∈ V
is called the zero vector.
Thus, in brief, (V, +, ·), where + : V × V → V and · : F × V → V


are maps; is called a vector space over field F if it satisfies the


following conditions:
(1). (V, +) is an abelian group with additive identity 0,
(2). a · (u + v) = a · u + a · v,
(3). (a + b) · u = a · u + b · u,
(4). (ab) · u = a · (b · u), and
(5). 1 · u = u, where 1 is the multiplicative identity of F ,
where a, b ∈ F and u, v ∈ V .

Example 5.1.2 Consider the set R3 = {(x, y, z) | x, y, z ∈ R},


where R is the field of real numbers. Let (x1 , y1 , z1 ), (x2 , y2 , z2 ) be
any two elements of R3 . Define + and · on R3 as follows

(x1 , y1 , z1 ) + (x2 , y2 , z2 ) = (x1 + x2 , y1 + y2 , z1 + z2 ) (5.1.1)

One may easily verify that this operation is well defined. This sum
is called as coordinate-wise addition. Next, let (x, y, z) ∈ R3 and
a ∈ R. Define · on R3 as follows

a(x, y, z) = (ax, ay, az)                                  (5.1.2)

One may easily verify that this operation is well defined. This mul-
tiplication is called as coordinate-wise multiplication. Let (x1 , y1 , z1 ),
(x2 , y2 , z2 ) and (x3 , y3 , z3 ) be any three elements of R3 . Then,

[(x1 , y1 , z1 ) + (x2 , y2 , z2 )] + (x3 , y3 , z3 )


= (x1 + x2 , y1 + y2 , z1 + z2 ) + (x3 , y3 , z3 )
= ((x1 + x2 ) + x3 , (y1 + y2 ) + y3 , (z1 + z2 ) + z3 )
= (x1 + (x2 + x3 ), y1 + (y2 + y3 ), z1 + (z2 + z3 )) by associativity of
real numbers
= (x1 , y1 , z1 ) + (x2 + x3 , y2 + y3 , z2 + z3 )
= (x1 , y1 , z1 ) + [(x2 , y2 , z2 ) + (x3 , y3 , z3 )]

Thus, associativity follows by associativity of addition of real


numbers. Similarly, by the properties of addition and multiplication
of real numbers, we have
(1). 0 = (0, 0, 0) in R3 such that

(x, y, z) + (0, 0, 0) = (x, y, z) = (0, 0, 0) + (x, y, z)

for all (x, y, z) in R3 , i.e; 0 is the additive identity.


(2). for each (x, y, z) in R3 , (−x, −y, −z) is the additive inverse

of (x, y, z).
(3). addition is commutative, i.e;

(x1 , y1 , z1 ) + (x2 , y2 , z2 ) = (x2 , y2 , z2 ) + (x1 , y1 , z1 ),

for all (x1 , y1 , z1 ), (x2 , y2 , z2 ) in R3 .


(4). a [(x1 , y1 , z1 ) + (x2 , y2 , z2 )] = a(x1 , y1 , z1 ) + a(x2 , y2 , z2 ) for all
a ∈ R and (x1 , y1 , z1 ), (x2 , y2 , z2 ) in R3 .
(5). (a + b)(x, y, z) = a(x, y, z) + b(x, y, z).
(6). (ab)(x, y, z) = a[b(x, y, z)].
(7). 1(x, y, z) = (x, y, z).
Thus, R3 (R) is a vector space.

Example 5.1.3 The set Rn of all n−tuples of real numbers forms


a vector space over the field R with respect to coordinate-wise ad-
dition + and multiplication · given by

(x1 , x2 , . . . , xn ) + (y1 , y2 , . . . , yn ) = (x1 + y1 , x2 + y2 , . . . , xn + yn )


(5.1.3)
and
a(x1 , x2 , . . . , xn ) = (ax1 , ax2 , . . . , axn )
respectively, for all (x1 , x2 , . . . , xn ), (y1 , y2 , . . . , yn ) ∈ Rn and
a ∈ R.

Example 5.1.4 Let F be a field and n ∈ N. Consider the set F n


of all n − tuples with entries in F . Then F n (F ) is a vector space
with respect to vector addition + and scalar multiplication . given
by

(x1 , x2 , . . . , xn ) + (y1 , y2 , . . . , yn ) = (x1 + y1 , x2 + y2 , . . . , xn + yn )

and
a(x1 , x2 , . . . , xn ) = (ax1 , ax2 , . . . , axn )
for all (x1 , x2 , . . . , xn ), (y1 , y2 , . . . , yn ) ∈ F n and a ∈ F .

Example 5.1.5 Let F be a field and Mm×n (F ) be the set of all


matrices with entries in F . Then, Mm×n (F ) is a vector space over
F with respect to addition of matrices and scalar multiplication of
matrices by a scalar.

Example 5.1.6 Let F be a field. An expression of the form

a0 + a1 x + a2 x² + . . . + an xⁿ + . . . ,

where an ∈ F and an = 0 except for finitely many n, is called a poly-
nomial in x. Let F [x] be the set of all polynomials in x over F .
Then, F [x] is a vector space over F with respect to + and · given by

(a0 + a1 x + a2 x² + . . . + an xⁿ + . . .) + (b0 + b1 x + b2 x² + . . . + bn xⁿ + . . .)
    = c0 + c1 x + c2 x² + . . . + cn xⁿ + . . . ,

where cn = an + bn for all n = 0, 1, 2, . . ., and

a(a0 + a1 x + a2 x² + . . . + an xⁿ + . . .) = aa0 + aa1 x + aa2 x² + . . . + aan xⁿ + . . .

The expression a0 + a1 x + a2 x² + . . . + an xⁿ + . . . is
called a polynomial in x of degree n if an ≠ 0 and am = 0 for
all m = n + 1, n + 2, . . .. It is denoted by

a0 + a1 x + a2 x² + . . . + an xⁿ

Exercise 5.1.7 Prove that the set Pn (x) of all polynomials over
R (or C) in x of degree at most n, forms a vector space over R (or
C) with respect to addition + and · given by

(a0 + a1 x + a2 x2 + . . . + an xn ) + (b0 + b1 x + b2 x2 + . . . + bn xn )
= (a0 + b0 ) + (a1 + b1 )x + (a2 + b2 )x2 + . . . + (an + bn )xn

c(a0 + a1 x + . . . + an xn ) = ca0 + ca1 x + . . . + can xn

Exercise 5.1.8 Prove that the set of all real valued functions de-
fined on R forms a vector space over R w.r.t vector addition and
scalar multiplication given by (f +g)(x) = f (x)+g(x) and (af )(x) =
af (x) respectively, where a ∈ R. Also, prove the same for the set
C[a, b] of all real valued continuous functions defined on closed in-
terval [a, b], where a < b and a, b are real numbers.

Exercise 5.1.9 Let F be a subfield of field K. Prove that K is a


vector space over F with respect to its field operations.

Theorem 5.1.10 Let V (F ) be a vector space. Then, we have the
following:
(1). a0 = 0 for all a ∈ F ,
(2). 0v = 0 for all v ∈ V ,
(3). av = 0 implies either a = 0 or v = 0,
(4). (−a)v = −(av) = a(−v) for all a ∈ F and v ∈ V ; in particular,
(−1)v = −v for all v ∈ V .

Proof: (1). Let a ∈ F . Then, a0 = a(0 + 0) = a0 + a0 gives


a0 = 0.

(2). Let v ∈ V . Then 0v = (0 + 0)v = 0v + 0v gives 0v = 0.

(3). Let av = 0. If a = 0 then the result follows by (2). Sup-
pose a ≠ 0. Then a⁻¹ exists and so a⁻¹(av) = a⁻¹0 gives v = 0.

(4). 0 = a0 = a[v + (−v)] = av + a(−v). Thus, a(−v) = −av.


Similarly, [a + (−a)]v = 0v = 0 gives (−a)v = −av. 2

Definition 5.1.11 A non-empty subset W of a vector space V


over field F is called a subspace of V if W is itself a vector space
over F with respect to operations of V . Non-empty subsets {0} and
V of V are subspaces of V . Note that V is the largest subspace of
V and {0} is the smallest subspace of V .

Proposition 5.1.12 A non-empty subset W of a vector space V


over F is a subspace if and only if w1 + w2 ∈ W and aw ∈ W for
all a ∈ F , w, w1 , w2 ∈ W .

Proof: If W is a subspace then it is itself a vector space there-


fore for all a ∈ F , w, w1 , w2 ∈ W we have w1 + w2 ∈ W and
aw ∈ W . Conversely suppose that w1 + w2 , aw ∈ W for all a ∈ F ,
w, w1 , w2 ∈ W . Since W ⊆ V , hence it immediately follows that
(w1 + w2 ) + w3 = w1 + (w2 + w3 )
a(w1 + w2 ) = aw1 + aw2 ,
(a + b)w = aw + bw,
w1 + w2 = w2 + w1 ,
1w = w and (ab)w = a(bw),
where 1, a, b ∈ F and w, w1 , w2 , w3 ∈ W . Since −w = (−1)w ∈ W
for each w ∈ W hence 0 = w − w ∈ W . Thus, W is itself a vector
space with respect to operations of V . 2

Proposition 5.1.13 A non-empty subset W is a subspace of vec-


tor space V over field F if and only if aw1 + bw2 ∈ W for all
a, b ∈ F and w1 , w2 ∈ W .

Proof: If W is a subspace of V then it is a vector space and so


aw1 + bw2 ∈ W for all a, b ∈ F and w1 , w2 ∈ W .
Conversely suppose that aw1 + bw2 ∈ W for all a, b ∈ F and
w1 , w2 ∈ W . Take a = b = 1, we have w1 +w2 ∈ W . Let a ∈ F and
w ∈ W , then aw+0w ∈ W . But w ∈ W ⊂ V , hence aw+0w = aw.
Thus, aw ∈ W . Using Proposition 5.1.12, W is a subspace of V .
2

Example 5.1.14 Let A be an n × n square matrix over R and λ an
eigen value of A. Consider the set Eλ = {X ∈ Rn | AX = λX},
where Rn here denotes the set of all column vectors with entries in R.
Clearly 0 ∈ Eλ . Hence Eλ is a non-empty subset of Rn . Let
X, Y ∈ Eλ . Then
AX = λX and AY = λY
and so
A(aX + bY ) = a(AX) + b(AY ) = λ(aX + bY ),
for any two real numbers a, b.
This shows that aX + bY ∈ Eλ . Thus, Eλ is a subspace of Rn ,
the vector space of column vectors over R. This subspace is known
as the eigen space associated to the eigen value λ.

Example 5.1.15 Consider the vector space R3 over R. Let W =


{(x, y, z) ∈ R3 | ax + by + cz = 0} be the plane passing through the
origin. Then, it is a subspace of R3 . For
Let (x1 , y1 , z1 ), (x2 , y2 , z2 ) ∈ W be any two points. Then

ax1 + by1 + cz1 = 0 (5.1.4)

ax2 + by2 + cz2 = 0 (5.1.5)


Then, for any two real numbers k, l, we have

k(ax1 + by1 + cz1 ) + l(ax2 + by2 + cz2 ) = 0


i.e; a(kx1 + lx2 ) + b(ky1 + ly2 ) + c(kz1 + lz2 ) = 0

Thus, (kx1 + lx2 , ky1 + ly2 , kz1 + lz2 ) ∈ W . But

k(x1 , y1 , z1 ) + l(x2 , y2 , z2 ) = (kx1 + lx2 , ky1 + ly2 , kz1 + lz2 )

Hence k(x1 , y1 , z1 ) + l(x2 , y2 , z2 ) ∈ W .

Exercise 5.1.16 Prove that the set of all n × n upper(lower) tri-


angular matrices is a subspace of Mn×n (F ).

Exercise 5.1.17 In the vector space R3 , determine which of the


following subsets are subspaces:
(1). {(x, y, z) ∈ R3 | x = 0}.
(2). {(x, y, z) ∈ R3 | x = 1}.
(3). {(x, y, z) ∈ R3 | x = 0, z = 2y}.
(4). {(x, y, z) ∈ R3 | y = 2x2 }.
(5). {(x, y, z) ∈ R3 | x = y}.
(6). {(x, y, z) ∈ R3 | 2x − y + z = 1}.

Exercise 5.1.18 Let W be a non-empty subset of a vector space


V (F ). Prove that W is a subspace of V if and only if u − w ∈ W
and aw ∈ W for all u, w ∈ W and a ∈ F .

Exercise 5.1.19 Let W be a non-empty subset of a vector space


V (F ). Prove that W is a subspace of V if and only if au + w ∈ W
for all u, w ∈ W and a ∈ F .

Exercise 5.1.20 Which of the following subsets of Rn are sub-


spaces:
(1). {(a1 , a2 , . . . , an ) | an ≥ 0}
(2). {(a1 , a2 , . . . , an ) | a1 is a rational number}
(3). {(a1 , a2 , . . . , an ) | a1 + 5a3 = 0}
(4). {(a1 , a2 , . . . , an ) | a1 + an = k}, where k is a non-zero number.
(5). {(a1 , a2 , . . . , an ) | a1 = a2 = . . . = an }.

Exercise 5.1.21 Prove that the set of all real symmetric (skew-
symmetric) matrices form a vector space over the field of real num-
bers.

Exercise 5.1.22 Prove that C is a vector space over R.



Exercise 5.1.23 Determine whether or not the following sub-


sets of R4 are subspaces:
(1). {(a, b, c, d) ∈ R4 | a + b = c + d},
(2). {(a, b, c, d) ∈ R4 | a2 + b2 = 0},
(3). {(a, b, c, d) ∈ R4 | a2 + d2 = 5},
(4). {(a, a + b, a − b, b) | a, b ∈ R}.

Proposition 5.1.24 Intersection of any two subspaces of a vector


space is a subspace.

Proof: Let W1 , W2 be any two subspaces of a vector space V (F ).


Then, 0 ∈ W1 , W2 and hence 0 ∈ W1 ∩ W2 . Thus, W1 ∩ W2 is
a non-empty subset of V . Let a, b ∈ F and u, w ∈ W1 ∩ W2 .
Then, u, w ∈ W1 , W2 . But W1 , W2 are subspaces of V , therefore
au + bw ∈ W1 , W2 and so au + bw ∈ W1 ∩ W2 . This shows that
W1 ∩ W2 is a subspace. 2

Exercise 5.1.25 Prove that the intersection of any collection of


subspaces of a given vector space is a subspace.

Remark 5.1.26 Union of two subspaces need not be a subspace.
For example, consider V = R³ and W1 = {(x, 0, 0) | x ∈ R} (the x-
axis) and W2 = {(0, x, 0) | x ∈ R} (the y-axis). Then, (1, 0, 0) ∈ W1
and (0, 1, 0) ∈ W2 . But

(1, 0, 0) + (0, 1, 0) = (1, 1, 0) ∉ W1 ∪ W2 .

Take W3 = {(x, y, 0) | x, y ∈ R}. Then W1 ⊂ W3 and so W1 ∪ W3 =
W3 , which is a subspace.

Proposition 5.1.27 Union of two subspaces is a subspace if and


only if one is contained in other.

Proof: Let W1 , W2 be any two subspaces of a vector space V (F ).


Suppose that one is contained in other. If W1 ⊆ W2 , then W1 ∪
W2 = W2 and so it is a subspace. If W2 ⊆ W1 , then W1 ∪ W2 = W1
and so it is a subspace. Hence, W1 ∪ W2 is a subspace.
Conversely suppose that W1 ∪ W2 is a subspace. Suppose that
neither W1 ⊂ W2 nor W2 ⊂ W1 . Then there exist w1 ∈ W1 and
w2 ∈ W2 such that w1 ∉ W2 and w2 ∉ W1 . Since W1 , W2 ⊂
W1 ∪ W2 , therefore w1 , w2 ∈ W1 ∪ W2 . But W1 ∪ W2 is a subspace
and hence w1 + w2 ∈ W1 ∪ W2 . Now w1 + w2 ∈ W1 ∪ W2 implies
that either w1 + w2 ∈ W1 or w1 + w2 ∈ W2 . If w1 + w2 ∈ W1 then
w2 = (w1 + w2 ) − w1 ∈ W1 , a contradiction because w2 ∉ W1 .
Next, if w1 + w2 ∈ W2 then w1 = (w1 + w2 ) − w2 ∈ W2 , a
contradiction because w1 ∉ W2 . Thus, we arrive at a contradiction
and so the result is true. 2

Definition 5.1.28 Let S be any subset of a vector space V , so that
S ⊆ V . Consider the family {U | U is a subspace of V and S ⊆ U}.
The intersection of the members of this family is the smallest subspace
containing S. This subspace is denoted by ⟨S⟩ and is called the
subspace of V generated by S. If S = ∅, then ⟨S⟩ = {0}, because
{0} is the smallest subspace containing ∅. If S = V then ⟨S⟩ = V
(why?).

5.2 Linear Dependence and Linear Indepen-


dence
Definition 5.2.1 Let S be a non-empty subset of a vector space
V (F ). A vector v ∈ V is called a linear combination of elements
of S, if there exists finitely many elements s1 , s2 , . . . , sn of S and
scalars a1 , a2 , . . . , an ∈ F such that

v = a1 s1 + a2 s2 + . . . + an sn

From the expression, we also say that v is a linear combination of


vectors s1 , s2 , . . . , sn . The set of all linear combinations of elements
of S is denoted by L(S) and is called the linear span of S.

Example 5.2.2 The element (1, 2, 0) ∈ R³ is a linear combination
of elements of S = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} because (1, 2, 0) =
1(1, 0, 0) + 2(0, 1, 0). The linear span of T = {(1, 0, 0), (0, 1, 0)} is

L(T ) = {a(1, 0, 0) + b(0, 1, 0) | a, b ∈ R} = {(a, b, 0) | a, b ∈ R}

Example 5.2.3 To check that the vector (3, −1, 0, −1) ∈ R4 is a


linear combination of vectors (2, −1, 3, 2), (−1, 1, 1, −3) and (1, 1, 9, −5),
we suppose that

x(2, −1, 3, 2) + y(−1, 1, 1, −3) + z(1, 1, 9, −5) = (3, −1, 0, −1)



This determines a system of equations given by


2x − y + z = 3
−x + y + z = −1
3x + y + 9z = 0
2x − 3y − 5z = −1
It canbe re written 
as AX = b, where  
2 −1 1   3
−1 1 x
1 y  and b = −1.
  
A= 3 , X =
1 9 0
z
2 −3 −5 −1
Consider the augmented matrix
 
2 −1 1 : 3
−1 1 1 : −1
(A | b) = 
3

1 9 : 0
2 −3 −5 : −1

 
2 −1 1 : 3
−1 1 1 : −1
(A | b) = 
3

1 9 : 0
2 −3 −5 : −1
 
   [ −1   1   1  :  −1 ]
 ∼ [  2  −1   1  :   3 ]    R1 ↔ R2
   [  3   1   9  :   0 ]
   [  2  −3  −5  :  −1 ]

   [ −1   1   1  :  −1 ]    R2 → R2 + 2R1
 ∼ [  0   1   3  :   1 ]    R3 → R3 + 3R1
   [  0   4  12  :  −3 ]    R4 → R4 + 2R1
   [  0  −1  −3  :  −3 ]

   [ −1   1   1  :  −1 ]
 ∼ [  0   1   3  :   1 ]    R3 → R3 − 4R2
   [  0   0   0  :  −7 ]    R4 → R4 + R2
   [  0   0   0  :  −2 ]
Since rank A = 2 ≠ rank (A | b) = 3, the system of equa-
tions AX = b has no solution. Thus, the given vector (3, −1, 0, −1) ∈
R⁴ is not a linear combination of the vectors (2, −1, 3, 2), (−1, 1, 1, −3)
and (1, 1, 9, −5).
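The rank test used above is mechanical and easy to run numerically; a minimal sketch in Python/NumPy:

import numpy as np

# Columns of A are the candidate spanning vectors; b is the target.
A = np.array([[ 2.0, -1.0,  1.0],
              [-1.0,  1.0,  1.0],
              [ 3.0,  1.0,  9.0],
              [ 2.0, -3.0, -5.0]])
b = np.array([3.0, -1.0, 0.0, -1.0])

rank_A  = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
# b is a linear combination of the columns iff the two ranks agree.
print(rank_A, rank_Ab)    # 2 3  -> (3, -1, 0, -1) is not a linear combination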

Exercise 5.2.4 Is the vector (2, −5, 3) in the subspace of R3 spanned


by the vectors (1, −3, 2), (2, −4, −1), (1, −5, 7)?

Exercise 5.2.5 For which value of c, the vector (1, −2, c) ∈ R3


will be a linear combination of the vectors (1, 0, −2) and (1, −1, 0)?

Proposition 5.2.6 If S is a non-empty subset of a vector space V
over F , then we have the following:
(i). L(S) is a subspace of V ,
(ii). L(S) = ⟨S⟩.

Proof: Let u, w ∈ L(S). Then u = a1 s1 + a2 s2 + . . . + an sn and
w = b1 t1 + b2 t2 + . . . + bm tm for some scalars ai , bj and vectors
si , tj ∈ S. Let a, b ∈ F . Then

au + bw = (aa1 )s1 + . . . + (aan )sn + (bb1 )t1 + . . . + (bbm )tm ∈ L(S).

Thus, L(S) is a subspace of V . This proves (i).
Since S ⊆ L(S) and ⟨S⟩ is the smallest subspace containing S,
therefore
⟨S⟩ ⊆ L(S)    . . . (1)
Next, let u ∈ L(S); then u = a1 s1 + . . . + an sn , where ai ∈ F and
si ∈ S. Since ⟨S⟩ is a subspace containing S, therefore u = a1 s1 +
. . . + an sn ∈ ⟨S⟩. Hence
L(S) ⊆ ⟨S⟩    . . . (2)

Combining (1) and (2), we get L(S) = ⟨S⟩. 2

Corollary 5.2.7 If S is a non-empty subset of a vector space V ,
then L(L(S)) = L(S). In particular, W is a subspace of a vector
space V (F ) if and only if L(W ) = W .

Proof: L(S) is itself a subspace, and the smallest subspace con-
taining a subspace is that subspace itself. Hence ⟨L(S)⟩ = L(S)
and so, by Proposition 5.2.6, L(L(S)) = ⟨L(S)⟩ = L(S). 2

Exercise 5.2.8 If S, T are two non-empty subsets of a vector space


V (F ) such that S ⊂ T then L(S) ⊆ L(T ).

Exercise 5.2.9 Let S = {e1 , e2 , . . . , en } ⊂ Rn , where


e1 = (1, 0, 0, . . . , 0), e2 = (0, 1, 0, . . . , 0), . . ., en = (0, 0, 0, . . . , 0, 1).
Prove that L(S) = Rn . Also find L({e1 }).

Definition 5.2.10 Let V (F ) be a vector space. Then a finite set


{v1 , v2 , . . .,vn } of vectors in V is said to be linearly dependent
set if there exists scalars a1 , a2 , . . ., an , not all zero, such that

a1 v1 + a2 v2 + . . . + an vn = 0 . . . (1)

The elements v1 , v2 , . . ., vn are called linearly dependent vectors.


An infinite set S is said to be linearly dependent if it has a finite
subset of linearly dependent vectors.
Since some of the coefficients are non-zero, without loss assume
that a1 6= 0. Then Equation (1) can be written as
−a2 −an
v1 = v2 + . . . + vn
a1 a1
which shows that v1 is a linear combination of remaining vectors.

Example 5.2.11 The vectors (1, 1, 0) and (2, 2, 0) in R³ are linearly
dependent because

2(1, 1, 0) + (−1)(2, 2, 0) = (0, 0, 0)

Exercise 5.2.12 If v is a linear combination of vectors v1 , v2 , . . .,


vn in V (F ), then prove that the set {v, v1 , v2 , . . . , vn } is linearly
dependent.

Proposition 5.2.13 Every set containing a zero vector is linearly


dependent.

The proof is left as an exercise for readers.

Exercise 5.2.14 Prove that the polynomials 2 + x + x², x + 2x²,
2 + 2x + 3x² in the vector space P2 (x) of polynomials of degree at
most 2 are linearly dependent.
Solution: Suppose that

a(2 + x + x²) + b(x + 2x²) + c(2 + 2x + 3x²) = 0 + 0x + 0x²        (∗)

By equality of polynomials, we have

2a + 0b + 2c = 0 (1)
a + b + 2c = 0 (2)
a + 2b + 3c = 0 (3)

By (1), we have c = −a and hence (2) gives b = −2c−a = a. Thus,


we have b = a, c = −a. Clearly it satisfies equation (3). Taking
a = 1, then b = 1 and c = −1. Thus, a = 1, b = 1, c = −1 is a
common solution of equations (1), (2) and (3), and so we have

1·(2 + x + x²) + 1·(x + 2x²) + (−1)·(2 + 2x + 3x²) = 0

This shows that the polynomials 2 + x + x², x + 2x², 2 + 2x + 3x² are
linearly dependent.
We can also prove it by the matrix method. For, consider the
coefficient matrix A of the system of equations (1), (2) and (3):

     [ 2  0  2 ]
A =  [ 1  1  2 ]
     [ 1  2  3 ]

   [ 1  1  2 ]
 ∼ [ 2  0  2 ]    R1 ↔ R2
   [ 1  2  3 ]

   [ 1   1   2 ]    R2 → R2 − 2R1
 ∼ [ 0  −2  −2 ]    R3 → R3 − R1
   [ 0   1   1 ]

   [ 1  1  2 ]
 ∼ [ 0  0  0 ]    R2 → R2 + 2R3
   [ 0  1  1 ]

   [ 1  1  2 ]
 ∼ [ 0  1  1 ]    R2 ↔ R3
   [ 0  0  0 ]
Since rank A = 2 < 3, the system of equations (1), (2) and (3) has
a non-zero solution, i.e; there exist scalars a, b, c, not all zero, such
that equation (∗) holds. The solution is given by a + b + 2c = 0,
b + c = 0; one of the solutions is a = 1, b = 1, c = −1.
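Numerically, the dependence coefficients (a, b, c) form a null vector of the coefficient matrix; one way to obtain it is from the singular value decomposition, as in the sketch below (the SVD route is an illustrative choice, not the only one).

import numpy as np

# Columns hold the coefficients of 2 + x + x^2, x + 2x^2, 2 + 2x + 3x^2.
M = np.array([[2.0, 0.0, 2.0],
              [1.0, 1.0, 2.0],
              [1.0, 2.0, 3.0]])

print(np.linalg.matrix_rank(M))    # 2 < 3: the polynomials are dependent

# The right singular vector for the zero singular value spans the null space.
_, _, Vt = np.linalg.svd(M)
null = Vt[-1]
print(np.round(null / null[0], 6))   # (1, 1, -1): the coefficients found above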
Exercise 5.2.15 Prove that the set
S = {(1, 3, 2), (1, −7, −8), (2, 1, −1)}
is a linearly dependent subset of R3 (R).

Definition 5.2.16 A finite set {v1 , v2 , . . . , vn } of vectors in V is


said to be linearly independent set if the equation

a1 v1 + a2 v2 + . . . + an vn = 0

implies a1 = 0, a2 = 0, . . . , an = 0,
i.e; a1 v1 +a2 v2 +. . .+an vn = 0 only if all the coefficients a1 , a2 , . . .,
an are 0.

In this case vectors v1 , v2 , . . ., vn are called linearly indepen-


dent vectors. An infinite set S is called linearly independent if
its every finite subset is linearly independent.
From the definition it follows that a finite set is linearly inde-
pendent if it is not a linearly dependent set.

Example 5.2.17 Let v ≠ 0. Then the set {v} is linearly indepen-
dent, because av = 0 implies that a = 0.

Example 5.2.18 The set {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is linearly in-
dependent subset of vector space R3 over R. For
Suppose that

x(1, 0, 0) + y(0, 1, 0) + z(0, 0, 1) = (0, 0, 0).

Then, we have (x, y, z) = (0, 0, 0), i.e; x = y = z = 0. Thus,

x(1, 0, 0) + y(0, 1, 0) + z(0, 0, 1) = (0, 0, 0) ⇒ x = y = z = 0.

Exercise 5.2.19 Prove that the set {1, x, x2 , . . . , xn } is a linearly


independent subset of vector space Pn (x) of polynomials of degree
at most n.

Exercise 5.2.20 Prove that the set {1, x, 1 + x + x2 } is a linearly


independent subset of vector space P2 (x) of polynomials of degree
at most 2.

Exercise 5.2.21 Prove that the set

{(1, 1, 0, 0), (0, 1, −1, 0), (0, 0, 0, 3)}

in R4 is a linearly independent subset.



Exercise 5.2.22 Prove that every subset of a linearly independent


set is linearly independent and every superset of a linearly depen-
dent set is linearly dependent.

Proposition 5.2.23 Let V (F ) be a vector space and let v1 , v2 , . . .,
vn be non-zero vectors of V . Then either they are linearly inde-
pendent or there exists k with 2 ≤ k ≤ n such that vk is a linear
combination of the preceding ones (i.e; of v1 , v2 , . . ., vk−1 ).

Proof: If they are linearly independent then we are done. Sup-
pose that the vectors v1 , v2 , . . ., vn are linearly dependent. Then
there exist scalars a1 , a2 , . . ., an , not all 0, such that

a1 v1 + a2 v2 + . . . + an vn = 0    . . . (1)

Choose the largest k such that ak ≠ 0 but ak+1 = 0, ak+2 = 0,
. . . , an = 0. Then equation (1) becomes

a1 v1 + a2 v2 + . . . + ak vk = 0    . . . (2)

If k = 1 then a1 v1 = 0, which is a contradiction because a1 ≠ 0
and v1 ≠ 0. Thus, 2 ≤ k ≤ n. Now, ak ≠ 0, so equation (2) can be
written as

vk = (−a1/ak) v1 + (−a2/ak) v2 + . . . + (−ak−1/ak) vk−1

Thus proved. 2

Exercise 5.2.24 Prove that non-zero vectors v1 , v2 , . . ., vn of a
vector space V (F ) are linearly dependent if and only if some vk ,
2 ≤ k ≤ n, is a linear combination of the preceding ones (i.e; a
linear combination of v1 , v2 , . . ., vk−1 ).

Exercise 5.2.25 Prove that the following are equivalent:


(i). S is linearly dependent.
(ii). at least one element of S is a linear combination of others.

Proposition 5.2.26 In a vector space V (F ), if v is a linear com-


bination of vectors v1 , v2 , . . ., vn then the set {v, v1 , v2 , . . . , vn } is
linearly dependent.

The proof follows immediately from the definition of linear
combination. It is left as an exercise for readers.

combination. It is left as an exercise for readers.

Proposition 5.2.27 If S is a linearly independent subset of a vec-
tor space V (F ) and v ∈ V is a vector such that v ∉ L(S), then the
set S ∪ {v} is linearly independent.

Proof: Suppose that the result is not true. Then there
exist scalars a, a1 , a2 , . . ., an , not all zero, such that

av + a1 v1 + a2 v2 + . . . + an vn = 0    (1)

where v1 , v2 , . . . , vn ∈ S. If a = 0 then expression (1) gives a1 =
a2 = . . . = an = 0, as S is a linearly independent set. This is
a contradiction because some of the scalars are non-zero. Thus,
a ≠ 0. Multiplying (1) both sides by a⁻¹, we have

v = −(a1/a) v1 − (a2/a) v2 − . . . − (an/a) vn ∈ L(S)

which again contradicts the assumption v ∉ L(S). Thus, the result
is true, i.e; S ∪ {v} is linearly independent. 2

5.3 Bases and Dimensions


Definition 5.3.1 A subset B of a vector space V (F ) is called a
basis of V (F ) if B is linearly independent and L(B) = V .

Example 5.3.2 Consider the set B = {(1, 0), (0, 1)}. Suppose that

a(1, 0) + b(0, 1) = (0, 0)

Then, we have
(a, b) = (0, 0).
By equality of ordered pairs, we have a = 0, b = 0.
This shows that a(1, 0) + b(0, 1) = (0, 0) implies a = 0, b = 0.
Thus, B is linearly independent.
Since every (x, y) in R2 can be expressed as

(x, y) = x(1, 0) + y(0, 1)

therefore L(B) = R2 . Thus, B is a basis of R2 .



Example 5.3.3 The set {e1 , e2 , . . . , en } is a basis of F n , where ei


is the n − tuple whose all entries are 0 except the i-th place which
is 1. This basis is called the standard basis of F n .

Example 5.3.4 The vector space C over R has a basis {1, i}.

Example 5.3.5 Consider the set S = {(1, 0, 1), (0, 1, 0), (0, 1, 1)}
of vectors in R3 . We claim that L(S) = R3 . For,
let (x, y, z) ∈ R3 and

a(1, 0, 1) + b(0, 1, 0) + c(0, 1, 1) = (x, y, z) . . . (1)

This gives (a, b + c, a + c) = (x, y, z). By equality of 3 − tuples,


we have
a = x, b + c = y, a + c = z.
Solving this, we have

a = x, c=z−x and b=x+y−z . . . (2)

Thus, L(S) = R3 .
Next, by (2), it follows that

a(1, 0, 1) + b(0, 1, 0) + c(0, 1, 1) = (0, 0, 0)

implies a = b = c = 0.
Thus, S is linearly independent. Since it spans R3 , therefore S
is a basis of R3 .
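Finding the coordinates a, b, c amounts to solving a linear system whose coefficient columns are the basis vectors; a minimal Python/NumPy sketch for an arbitrarily chosen vector (2, 5, 3):

import numpy as np

# Columns are the basis vectors (1, 0, 1), (0, 1, 0), (0, 1, 1).
B = np.column_stack([[1.0, 0.0, 1.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 1.0, 1.0]])

v = np.array([2.0, 5.0, 3.0])     # (x, y, z) = (2, 5, 3)
a, b, c = np.linalg.solve(B, v)   # a = x, b = x + y - z, c = z - x, as in (2)
print(a, b, c)                    # 2.0 4.0 1.0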

Remark 5.3.6 To check whether S = {v1 , v2 , . . . , vn } is a basis of
V or not, the reader should solve the equation

a1 v1 + a2 v2 + . . . + an vn = v    . . . (∗)

for a general element v of V .

If a solution does not exist, then L(S) ≠ V and so S is not a basis
of V .

If a solution exists, then L(S) = V . To check the linear inde-
pendence of S, we put v = 0 in equation (∗) and write the solution
for v = 0. If all the ai 's are zero then S is independent, otherwise
dependent.

Example 5.3.7 The set {1, x, x², . . . , xⁿ} is a basis of the vector
space Pn (x) of polynomials over a field F of degree at most n.
     
Exercise 5.3.8 Prove that the matrices

[ 1 0 0 ]   [ 0 1 0 ]   [ 0 0 1 ]   [ 0 0 0 ]   [ 0 0 0 ]   [ 0 0 0 ]
[ 0 0 0 ] , [ 0 0 0 ] , [ 0 0 0 ] , [ 1 0 0 ] , [ 0 1 0 ] , [ 0 0 1 ]

form a basis of the vector space M2×3 (R), the set of all 2 × 3 matrices
over R.

Exercise 5.3.9 Let Epq denote the m × n matrix over F (= R or
C) whose (p, q)-th entry is 1 and whose other entries are 0, for 1 ≤
p ≤ m, 1 ≤ q ≤ n. Prove that the set {Epq | 1 ≤ p ≤ m, 1 ≤ q ≤ n}
forms a basis of the vector space Mm×n (F ) of all m × n matrices
over F .

Exercise 5.3.10 Determine whether or not the set

S = {(1, 2, 1), (2, 1, 0), (1, −1, 2)}

is a basis of R3 .

Definition 5.3.11 A vector space V (F ) is said to be a finite-
dimensional (or finitely generated) vector space if it has a
finite set S which spans V , i.e; if there exists a finite set S such
that L(S) = V . If V (F ) does not have a finite subset S spanning
V , then it is called an infinite-dimensional vector space.
Here we shall restrict ourselves to finite-dimensional vector spaces.

Example 5.3.12 Since {(1, 0, 0), (0, 1, 0), (0, 0, 1)} spans R3 (R),
therefore R3 (R) is a finite dimensional vector space.

Example 5.3.13 The vector space Mm×n (R) is finite-dimensional.

Example 5.3.14 The vector spaces C(Q) and R(Q) are infinite
dimensional vector spaces (why?).

Proposition 5.3.15 (Existence of a basis of finite dimen-


sional space) Every finite dimensional vector space V (F ) has a
basis.

Proof: Let S = {v1 , v2 , . . . , vn } be a finite set such that L(S) =


V . Assume that all elements of S are non-zero. If S is linearly
independent then it will be a basis of V . Suppose that S is linearly
dependent, there exists vk with 2 ≤ k ≤ n such that vk is a linear
combination of v1 , v2 , . . ., vk−1 . Suppose that
vk = a1 v1 + a2 v2 + . . . + ak−1 vk−1
Then,
b1 v1 + b2 v2 + . . . + bn vn = (b1 v1 + b2 v2 + . . . + bk−1 vk−1 )
+bk vk + (bk+1 vk+1 + . . . + bn vn )
= (b1 v1 + b2 v2 + . . . + bk−1 vk−1 )
+bk [a1 v1 + a2 v2 + . . . + ak−1 vk−1 ]
+ (bk+1 vk+1 + . . . + bn vn )
= (b1 + bk a1 )v1 + (b2 + bk a2 )v2 + . . .
+(bk−1 + bk ak−1 )vk−1 + (bk+1 vk+1
+ . . . + bn vn )
Thus, L(S1 ) = L(S) = V , where S1 = S \ {vk } ⊂ S contains n − 1
vectors. If S1 is linearly independent then it will be a basis. If not,
then proceeding as above we shall get a new set S2 ⊂ S1 ⊂ S of
n − 2 vectors which spans V . Continuing the above process, after
finite numbers of steps (at most n − 1 steps), we shall get a set
S ∗ ⊂ S which will be linearly independent and L(S ∗ ) = V and
hence it will be a basis. It will happen because every singleton set
containing an element of S is linearly independent. 2

Proposition 5.3.16 (Invariance of basis) Any two bases of a


finite dimensional vector space V (F ) have the same number of el-
ements.

Proof: Let S1 = {v1 , v2 , . . . , vn } and S2 = {w1 , w2 , . . . , wm }


be any two bases of a vector space V (F ). Then L(S1 ) = V and
so w1 is a linear combination of v1 , v2 , . . . , vn . Then the set S3 =
{w1 , v1 , v2 , . . . , vn } will be linearly dependent such that L(S3 ) = V
and so there exists some vk such that it will be a linear combination
of preceding vectors w1 , v1 , v2 , . . ., vk−1 . Then L(S3∗ ) = L(S3 ) =
V , where
S3∗ = S3 \ {vk } = {w1 ; v1 , v2 , . . . , vk−1 , vk+1 , . . . , vn }

Since w2 ∈ V and L(S3∗ ) = V , therefore

S4 = {w1 , w2 ; v1 , v2 , . . . , vk−1 , vk+1 , . . . , vn }

will be linearly dependent. Then there exists vj ∈ S4 which is a


linear combination of preceding vectors w1 , w2 , v1 , . . . vj−1 and so
L(S4∗ ) = L(S4 ) = V , where

S4∗ = {w1 , w2 ; v1 , v2 , . . . , vk−1 , vk+1 , . . . , vj−1 , vj+1 , . . . , vn }

Clearly vj will be different from w1 and w2 because {w1 , w2 } is a


linearly independent set.
Continuing this process of insertion and deletion. If n < m
then we have a proper subset {w1 , w2 , . . . , wn } of S2 which spans
V and so S2 will be linearly dependent, a contradiction. Thus,

n≥m (1)

Interchanging the roles of S1 and S2 , we have

m≥n (2)

Combining equations (1) and (2), we have m = n. 2


This proposition asserts that the number of elements
in a basis is invariant. Thus, we have the following:
Definition 5.3.17 The number of elements in a basis of a finite
dimensional vector space V (F ) is called the dimension of the
vector space. It is denoted by dimF V or simply dim V and is read
as the dimension of V over F .

Example 5.3.18 The basis {(1, 0, 0), (0, 1, 0), (0, 0, 1)} of R³ con-
tains 3 elements; thus dimR R³ = 3.

Example 5.3.19 The vector space of all n × n real symmetric
matrices over R has dimension n(n + 1)/2 (why?).

Example 5.3.20 Let SS(n) be the set of all n × n real skew-symmetric
matrices over R. Let Eij be the n × n matrix whose (i, j)-th entry is
1, whose (j, i)-th entry is −1, and whose all other entries are 0,
where 1 ≤ i, j ≤ n. Then {Eij | i < j, 1 ≤ i, j ≤ n} is a basis of
SS(n) containing n(n − 1)/2 elements. Thus, the dimension of this
space is n(n − 1)/2.

Example 5.3.21 The set {1, x, x2 , . . . , xn } is a basis of Pn (x) con-


taining n + 1 elements. Hence, dimension of Pn (x) over R is n + 1.

Theorem 5.3.22 (Extension Theorem) Every linearly indepen-


dent subset of a finite dimensional vector space can be extended to
a basis.

Proof: Let V (F ) be a finite dimensional vector space with
dimF V = n, and let B = {v1 , v2 , . . . , vn } be a basis. Let
S = {w1 , w2 , . . . , wm } be a linearly independent set. Then no ele-
ment of S is zero. Consider the set

S1 = {w1 , w2 , . . . , wm ; v1 , v2 , . . . , vn }.

Since each wi is a linear combination of elements of B, therefore S1
is linearly dependent and L(S1 ) = V . Hence there exists some
vector of S1 which is a linear combination of the preceding ones.
This vector cannot be any wi , because S is linearly independent.
Hence the vector is some vj . Let

S2 = S1 \ {vj } = {w1 , w2 , . . . , wm ; v1 , v2 , . . . , vj−1 , vj+1 , . . . , vn }

Clearly L(S2 ) = L(S1 ) = V . If it is linearly independent, then it


will be a required basis which contains S as a subset.
If S2 is not linearly independent then repeating the above pro-
cess, after finite number of steps finally we get a linearly indepen-
dent set S3 containing S as a subset such that L(S3 ) = V and so
it will be a basis. 2

Corollary 5.3.23 If S is a linearly independent set containing m


elements in a finite dimensional vector space V (F ) of dimension
n. Then m ≤ n.

Corollary 5.3.24 Every set of n + 1 or more vectors of a finite


dimensional vector space of dimension n is linearly dependent.

Proof: Let dimF V = n and S be a set containing n + 1 or more


vectors in V (F ). If S is linearly independent, then by extension
theorem it can be extended to a basis containing S and so the basis
contains n + 1 or more vectors. This is a contradiction because
dimF V = n, i.e; basis has exactly n elements. Thus, S is linearly
dependent. 2

Corollary 5.3.25 If V (F ) is a vector space spanned by a finite


number of non-zero elements v1 , v2 , . . . , vn . Then dimF V ≤ n
and so every linearly independent set is finite and contains at most
n vectors.

Proof: Let S = {v1 , v2 , . . . , vn } and L(S) = V . If it is linearly


independent then it will be a basis containing n elements and so
dimF V = n. If it is not linearly independent then there exist some
vector vj , 2 ≤ j ≤ n, which is a linear combination of preceding
vectors. Consider the set

S1 = S \ {vj } = {v1 , v2 , . . . , vj−1 , vj+1 , . . . , vn }

Then L(S1 ) = V . If it is linearly independent then it will be a basis


containing n − 1 vectors. If not then repeating the above process,
after finite number of steps, we have a basis S2 containing less than
n elements. Clearly the set S2 will not be empty because every
singleton is linearly independent. Thus, V (F ) is finite dimensional
with dimF V ≤ n.
Next, let I be any linearly independent set containing m ele-
ments. By extension theorem, it can be extended to a basis con-
taining dimF V elements and so m ≤ dimF V . But dimF V ≤ n.
Thus, m ≤ n. 2

Proposition 5.3.26 If V (F ) is finite dimensional vector space of


dimension n, then any linearly independent subset of V containing
n vectors is a basis of V (F ).

Proof: Let dimF V = n and S be a linearly independent


set containing n vectors. Suppose that S is not a basis, then by
extension theorem, it can be extended to a basis S 0 of V . Since
S ⊂ S 0 , therefore S 0 contains more than n vectors and so it will be
linearly dependent. But S 0 is a basis so it is linearly independent.
This is a contradiction. hence our assumption is wrong. Thus, S
is a basis. 2

Proposition 5.3.27 Let V (F ) be a vector space of dimension n.


If S is set containing n vectors such that L(S) = V , then it will be
a basis.

Proof: Let S = {v1 , v2 , . . . , vn }. Suppose that L(S) = V and


dimF V = n. We claim that S is linearly independent. Suppose
that S is not linearly independent, then there exists a vector vj ,
2 ≤ j ≤ n, which is a linear combination of preceding ones and
so L(S1 ) = L(S) = V , where S1 = S \ {vj }. If it is linearly
independent then it will form a basis containing n−1 vectors which
is a contradiction. If S1 is linearly dependent then repeat the above
process. By a finite number of steps, we get a basis containing less
than n elements, which is again a contradiction. Thus, we arrive at
contradiction in each case. Hence our assumption is wrong. Thus,
S is a basis. 2

Proposition 5.3.28 If dimF V = n, then any set containing less


than n elements will never span V .

The proof is left as an exercise for readers.

Exercise 5.3.29 Is the linearly independent subset

{(1, 0, 0), (1, 1, −1)}

of R3 a basis? Give reasons.

Theorem 5.3.30 A non-empty subset B = {v1 , v2 , . . . , vn } of a


vector space V (F ) is a basis of V (F ) if and only if every element
V can be uniquely expressed as a linear combination of elements of
B.

Proof: Let B = {v1 , v2 , . . . , vn } be a basis of V (F ). Let v be


any element of V . Since L(B) = V , then there exists scalars a1 ,
a2 , . . ., an such that

v = a1 v1 + a2 v2 + . . . + an vn . . . (1)

To prove uniqueness, suppose that

v = b1 v1 + b2 v2 + . . . + bn vn . . . (2)

By (1) and (2), we have

a1 v1 + a2 v2 + . . . + an vn = b1 v1 + b2 v2 + . . . + bn vn

This gives

(a1 − b1 )v1 + (a2 − b2 )v2 + . . . + (an − bn )vn = 0

But v1 , v2 , ..., vn are linearly independent. Therefore this ex-


pression holds only if each coefficient are zero, i.e; a1 − b1 = 0,
a2 − b2 = 0, . . ., an − bn = 0. Thus, we have

a1 = b1 , a2 = b2 , . . . , an = bn .

Thus, the expression is unique. Conversely, suppose that every element of V can be uniquely expressed as a linear combination of elements of B. Then L(B) = V . Since 0 = 0v1 + 0v2 + . . . + 0vn and the
expression is unique, Σ_{i=1}^{n} ai vi = 0 implies ai = 0 for all i = 1, 2, . . . , n. Hence B is linearly independent, and so B is a basis. 2

5.4 Dimension of subspaces


Theorem 5.4.1 Let V (F ) be a finite dimensional vector space and
W be its subspace. Then W is finite dimensional and dimF W ≤
dimF V . Also V = W if and only if dimF W = dimF V .

Proof: Let dimF V = n and W be a subspace of V . Since W ⊆ V , any subset of W containing more than n elements will be a linearly dependent set. Thus every linearly independent subset of
W will contain at most n elements. Let S = {v1 , v2 , . . . , vm } be a
linearly independent set containing maximum number of elements.
We claim that L(S) = W . Let w ∈ W . Since S is a linearly
independent subset of W containing maximum number of elements,
therefore the set S1 = {v1 , v2 , . . . , vm ; w} will be linearly dependent
and so there exists a vector v ∈ S1 which is a linear combination of the preceding vectors. Clearly v ≠ vi for all i, otherwise we get
a contradiction. Thus, v = w will be a linear combination of
v1 , v2 , . . . , vm and hence L(S) = W . But S is linearly independent,
hence S is a basis of W containing m elements. Thus, W is finite
dimensional and dimF W = m ≤ n.
If W = V then every basis of V is a basis of W . Hence
dimF W = dimF V . Conversely suppose that dimF W = dimF V =
n. Let S be a basis of W then L(S) = W and S is a linearly inde-
pendent subset of W containing n vectors. But W ⊂ V , S will be

a linearly independent subset of V containing n elements. Since


dimF V = n, hence S will be a basis of V and so L(S) = V . Thus,
W = L(S) = V . 2

Example 5.4.2 Consider the vector space R2 . Then dimR R2 =


2. Let W be a subspace of R2 . Then, dimR W ≤ 2. Hence its
dimension will be either 0 or 1 or 2.
If dimR W = 0, then it will be zero subspace, i.e; W = {(0, 0)}.
If dimR W = 1 then it will be generated by a single non-zero
element, say (x, y). Thus,

W = ⟨(x, y)⟩ = {k(x, y) | k ∈ R} = {(kx, ky) | k ∈ R}

If x = 0 then
W = {(0, ky) | k ∈ R},
which is the y-axis given by x = 0.
Next, if x ≠ 0, then y/x will be constant, say m. Then y = mx
and so
W = {(kx, k(mx)) | k ∈ R},
which represents a line passing through origin having slope m. In-
deed, every line passing through the origin determines a one dimensional subspace; therefore there are infinitely many one dimensional vec-
tor subspaces of R2 .
If dimR W = 2, then W = R2 .

Exercise 5.4.3 Find all possible dimensions of subspaces of R3 (R).

Exercise 5.4.4 Find all possible dimensions of subspaces of R4 (R).

Corollary 5.4.5 If W is a subspace of a finite dimensional vector


space V , then every linearly independent subset of W is finite and
a part of basis of W .

Proof: Let W be a subspace of V (F ), where dimF V = n. Let


I be a linearly independent subset of W . Let I0 be a linearly inde-
pendent subset of W containing I and contains maximum number
of elements. Then I0 will contain at most n elements as I0 is lin-
early independent subset of V . Thus, I0 is finite and so I is finite.
Next, let w ∈ W . Then I0 ∪ {w} will be linearly dependent and
hence there linear combination will be zero.

Note that coefficient of w in that expression will not be zero


otherwise by linear independence of I0 , the set I0 ∪ {w} will be
linearly dependent, which is a contradiction.
Thus, w will be linear combination of elements of I0 . Hence
L(I0 ) = W and so it is a basis of W . Since I ⊆ I0 , therefore I is a
part of basis. 2

Proposition 5.4.6 Let W1 and W2 be any two subspaces of a vec-


tor space V (F ). Consider the set

W1 + W2 = {w1 + w2 | w1 ∈ W1 , w2 ∈ W2 }

Then W1 + W2 is a subspace of V (F ) generated by W1 ∪ W2 .

Proof: Since 0 ∈ W1 , W2 and 0 = 0 + 0, we have W1 + W2 ≠ ∅. Let w1 + w2 , w1′ + w2′ ∈ W1 + W2 ; then w1 , w1′ ∈ W1 and w2 , w2′ ∈ W2 . Let a, b be any two scalars. Since W1 , W2 are subspaces, aw1 + bw1′ ∈ W1 and aw2 + bw2′ ∈ W2 . Since

a(w1 + w2 ) + b(w1′ + w2′ ) = (aw1 + bw1′ ) + (aw2 + bw2′ ) ∈ W1 + W2

therefore W1 +W2 is a subspace of V . Clearly W1 ∪W2 ⊆ W1 +W2 .


Let U be any subspace containing W1 ∪ W2 . Then every w1 + w2 ∈
W1 + W2 will belong to U as w1 , w2 ∈ W1 ∪ W2 ⊂ U and U is
a subspace. Hence W1 + W2 ⊆ U . Thus, W1 + W2 is the smallest subspace containing W1 ∪ W2 . Thus, W1 + W2 = ⟨W1 ∪ W2 ⟩. 2

Definition 5.4.7 Let W1 , W2 be any two subspaces of a vector


space V (F ). Then the subspace W1 + W2 is called a linear sum
of subspaces W1 and W2 .

Example 5.4.8 The linear sum of subspaces


W1 = {(x, 0, 0) | x ∈ R} ( x-axis) and W2 = {(0, 0, z) | z ∈ R} (z-
axis) is the xz-plane given by

W1 + W2 = {(x, 0, z) | x, z ∈ R}.

The linear sum of xy-plane and yz-plane in R3 is R3 .

Theorem 5.4.9 Let W1 and W2 be two subspaces of a finite di-


mensional vector space V (F ). Then

dimF (W1 + W2 ) = dimF W1 + dimF W2 − dimF (W1 ∩ W2 ).



Proof: Suppose that dimF V = n. Then W1 , W2 , W1 + W2 and


W1 ∩ W2 will be finite dimensional. Let dimF W1 = l, dimF W2 =
m and dimF (W1 ∩ W2 ) = k. Let S = {v1 , v2 , . . . , vk } be a basis of W1 ∩ W2 . By the extension theorem, it can be extended to a basis S1 = {v1 , v2 , . . . , vk ; u1 , u2 , . . . , ul−k } of W1 and S2 =
{v1 , v2 , . . . , vk ; t1 , t2 , . . . , tm−k } of W2 respectively. Consider the
set

B = {v1 , v2 , . . . , vk ; u1 , u2 , . . . , ul−k ; t1 , t2 , . . . , tm−k }

First we claim that L(B) = W1 + W2 .


Since each element of W1 is a linear combination of v1 , v2 , . . .,
vk ; u1 , u2 , . . . , ul−k and each element of W2 is a linear combination
of v1 , v2 , . . . , vk ; t1 , t2 , . . . , tm−k , and so their sum will be a linear
combination of vectors v1 , v2 , . . ., vk ; u1 , u2 , . . ., ul−k ; t1 , t2 , . . .,
tm−k . Thus, L(B) = W1 + W2 .
Now we claim that B is linearly independent. For, let

a1 v1 + a2 v2 + . . . + ak vk + b1 u1 + b2 u2 + . . .
+ . . . + bl−k ul−k + c1 t1 + c2 t2 + . . . + cm−k tm−k = 0 (1)

Then

a1 v1 + a2 v2 + . . . + ak vk + b1 u1 + b2 u2 + . . . + bl−k ul−k
= −c1 t1 − c2 t2 − . . . − cm−k tm−k (2)

Since a1 v1 + a2 v2 + . . . + ak vk + b1 u1 + b2 u2 + . . . + bl−k ul−k ∈ W1 and −c1 t1 − c2 t2 − . . . − cm−k tm−k ∈ W2 , therefore by (2), we have −c1 t1 − c2 t2 − . . . − cm−k tm−k ∈ W1 ∩ W2 . Hence, there exist scalars d1 , d2 , . . . , dk such that

−c1 t1 − c2 t2 − . . . − cm−k tm−k = d1 v1 + d2 v2 + . . . + dk vk

i.e;

c1 t1 + c2 t2 + . . . + cm−k tm−k + d1 v1 + d2 v2 + . . . + dk vk = 0

But S2 is linearly independent so that


c1 = 0, c2 = 0, . . . , cm−k = 0, d1 = 0, d2 = 0, . . ., dk = 0.
Putting the values of the c's in (2), we have

a1 v1 + a2 v2 + . . . + ak vk + b1 u1 + b2 u2 + . . . + bl−k ul−k = 0.

But S1 is linearly independent so that a1 = 0, a2 = 0, . . . , ak = 0,


b1 = 0, b2 = 0, . . ., bl−k = 0. Thus, (1) holds only if a1 = 0, a2 =
0, . . . , ak = 0, b1 = 0, b2 = 0, . . ., bl−k = 0, c1 = 0, c2 = 0,
. . . , cm−k = 0. This shows that the set B is linearly independent
and hence will be a basis of W1 + W2 . Then

dimF (W1 + W2 ) = k + (l − k) + (m − k) = l + m − k

i.e;

dimF (W1 + W2 ) = dimF W1 + dimF W2 − dimF (W1 ∩ W2 ).

Example 5.4.10 The intersection of two distinct planes passing through the origin in R3 is a line.

Example 5.4.11 Let W1 , W2 be two distinct subspaces of a vec-


tor space V (F ) with dim W1 = 4 = dim W2 and dim V = 6. Since W1 ≠ W2 , both W1 and W2 are proper subspaces of W1 + W2 , so dim (W1 + W2 ) ≥ 5. Thus the dimension of W1 + W2 may be either 5 or 6.

Case I: If dim (W1 + W2 ) = 5, then

dim (W1 ∩ W2 ) = dim W1 + dim W2 − dim (W1 + W2 )


= 4+4−5=3

Case II: If dim (W1 + W2 ) = 6, then

dim (W1 ∩ W2 ) = dim W1 + dim W2 − dim (W1 + W2 )


= 4+4−6=2

Hence smallest possible dimension of W1 ∩ W2 is 2.
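
The dimension formula is easy to check numerically. The following is a minimal sketch in Python (assuming NumPy is available; the subspaces below are made-up illustrative data): the dimension of a span is the rank of the matrix whose columns are the spanning vectors, and the intersection dimension then follows from Theorem 5.4.9.

    import numpy as np

    # Columns of B1 span W1 and columns of B2 span W2, two subspaces of R^4.
    B1 = np.array([[1., 0.],
                   [0., 1.],
                   [1., 1.],
                   [0., 0.]])
    B2 = np.array([[1., 0.],
                   [0., 0.],
                   [1., 0.],
                   [0., 1.]])

    dim_W1 = np.linalg.matrix_rank(B1)                    # 2
    dim_W2 = np.linalg.matrix_rank(B2)                    # 2
    dim_sum = np.linalg.matrix_rank(np.hstack([B1, B2]))  # dim(W1 + W2) = 3
    dim_cap = dim_W1 + dim_W2 - dim_sum                   # dim(W1 ∩ W2) = 1
    print(dim_W1, dim_W2, dim_sum, dim_cap)

Here W1 ∩ W2 is spanned by (1, 0, 1, 0), which agrees with the computed value 1.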

Exercise 5.4.12 Find the smallest possible dimension of


W1 ∩ W2 in R10 , where W1 and W2 are distinct subspaces with
dimensions 8 and 9 respectively.

Definition 5.4.13 A vector space V is said to be direct sum of its


subspaces W1 and W2 if it satisfies the following:
(1). V = W1 + W2 ,
(2). W1 ∩ W2 = {0}.
If V is a direct sum of its subspaces W1 , W2 , then we write V = W1 ⊕ W2 .

Example 5.4.14 The vector space R3 is a direct sum of xy-plane

W1 = {(x, y, 0); x, y ∈ R}
and z-axis
W2 = {(0, 0, z); z ∈ R},
i.e; R3 = W1 ⊕ W2 .

Theorem 5.4.15 V = W1 ⊕W2 if and only if every element v ∈ V


can be uniquely expressed as v = w1 +w2 , where w1 ∈ W1 , w2 ∈ W2 .

Proof: Suppose that V = W1 ⊕ W2 . Then V = W1 + W2 .


Hence each element v ∈ V can be expressed as v = w1 + w2 , where
w1 ∈ W1 , w2 ∈ W2 . Suppose that v = w1 + w2 and v = u1 + u2 ,
where w1 , u1 ∈ W1 , w2 , u2 ∈ W2 . Then, we have

w1 + w2 = u1 + u2

which gives w1 − u1 = u2 − w2 ∈ W1 ∩ W2 , as w1 − u1 ∈ W1 and u2 − w2 ∈ W2 . But


W1 ∩ W2 = {0}, hence w1 − u1 = u2 − w2 = 0. This gives w1 = u1
and w2 = u2 . Thus, the expression is unique.
Conversely suppose that every element v ∈ V can be uniquely
expressed as v = w1 + w2 , where w1 ∈ W1 , w2 ∈ W2 . Then
V = W1 + W2 . Let v ∈ W1 ∩ W2 , then v = 0 + v = v + 0. But the
expression is unique, therefore v = 0. Hence W1 ∩ W2 = {0}. 2

Proposition 5.4.16 If V is finite dimensional and V = W1 ⊕ W2


then dimV = dimW1 + dimW2 .

Proof: Let V be finite dimensional. Since V = W1 ⊕W2 , therefore


V = W1 + W2 and W1 ∩ W2 = {0}. Hence, dimV = dim(W1 + W2 )
and dim W1 ∩ W2 = 0. Therefore

dim(W1 + W2 ) = dimW1 + dimW2 − dim W1 ∩ W2

gives dimV = dimW1 + dimW2 . 2

Proposition 5.4.17 Let V (F ) be finite dimensional vector space


and let W1 , W2 be any two subspaces of V such that V = W1 + W2
and dim V = dim W1 + dim W2 . Then V = W1 ⊕ W2 .

Proof: Since V = W1 +W2 therefore dim V = dim (W1 +W2 ).


But dim(W1 + W2 ) = dimW1 + dimW2 − dim W1 ∩ W2 . Hence

dim V = dimW1 + dimW2 − dim W1 ∩ W2 .

Using dim V = dim W1 + dim W2 , we have

dim W1 + dim W2 = dimW1 + dimW2 − dim W1 ∩ W2 .

Hence dim W1 ∩ W2 = 0 and so W1 ∩ W2 = {0}. 2

Proposition 5.4.18 Corresponding to each subspace W of a finite


dimensional space V (F ), we have a subspace W 0 such that V =
W ⊕ W 0.

Proof: Let S = {w1 , w2 , . . . , wm } be a basis of W . Then it is a linearly independent subset of V . By the extension theorem, it can be extended to a basis

B = {w1 , w2 , . . . , wm ; v1 , v2 , . . . , vn−m }

of V , where n = dimF V . Take S′ = {v1 , v2 , . . . , vn−m } ⊂ B and W′ = L(S′). Then S′ is a basis of W′.
Clearly W ∩ W′ = {0} and V = W + W′ (prove it). 2

Definition 5.4.19 The subspace W′ corresponding to a given subspace W of a finite dimensional vector space V (F ) is called a complementary subspace if V = W ⊕ W′.

Example 5.4.20 The complementary subspace W′ corresponding to the subspace W = {(x, y, 0) | x, y ∈ R} (the xy-plane) is given by W′ = {(0, 0, z) | z ∈ R} (the z-axis).
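
The proof of Proposition 5.4.18 is constructive, and the construction can be imitated by machine. A short sketch using SymPy (hedged: the subspace W below is made-up data): list a basis of W followed by the standard basis and keep the pivot columns; the added pivot vectors span a complement W′.

    import sympy as sp

    w = sp.Matrix([1, 1, 0, 0])              # W = span{(1,1,0,0)} in R^4
    M = sp.Matrix.hstack(w, sp.eye(4))       # candidate vectors: w, e1, e2, e3, e4
    pivots = M.rref()[1]                     # pivot columns give an extended basis
    basis = [M[:, j] for j in pivots]
    print(pivots)                            # (0, 1, 3, 4): W' = span{e1, e3, e4}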

5.5 Quotient Space


Let W be a subspace of V (F ). Define a relation ∼ on V as follows

u ∼ v ⇔ u − v ∈ W

Then ∼ is an equivalence relation. The equivalence class of v ∈ V


is given by

{u ∈ V ; u ∼ v} = {u ∈ V ; u − v ∈ W }
= {u ∈ V ; u = w + v for some w ∈ W }
= {w + v; w ∈ W }

This set is denoted by W + v and is called the right coset of v


modulo W . Similarly, we define v + W as

v + W = {v + w; w ∈ W }

and is called the left coset of v modulo W . Observe that W + v =


v + W for each v ∈ V . Since the equivalence relation forms a
partition, therefore any two right(left) cosets are either disjoint or
identical. Clearly W + 0 = W . From the property of equivalence
relation, it follows that

W + u = W + v ⇔ u ∼ v ⇔ u − v ∈ W

The set of all right (left) cosets modulo W is denoted by V /W and


is called the quotient set of V modulo W . Thus,

V /W = {W + v | v ∈ V }

Proposition 5.5.1 Let W be a subspace of vector space V (F ).


Then the set V /W forms a vector space over F with respect to
vector addition + and scalar multiplication · given by

(W + u) + (W + v) = W + (u + v) and a(W + u) = W + au

for all u, v ∈ V and a ∈ F .

Proof: The operations + and · are well defined. For, let W + u = W + u′ and W + v = W + v′. Then u − u′, v − v′ ∈ W and so (u − u′) + (v − v′) ∈ W , i.e; (u + v) − (u′ + v′) ∈ W . Thus, W + (u + v) = W + (u′ + v′), i.e;

(W + u) + (W + v) = (W + u′) + (W + v′)

Similarly, if W + u = W + u′ then u − u′ ∈ W and hence

au − au′ = a(u − u′) ∈ W.

Then, W + au = W + au′ and so a(W + u) = a(W + u′).


Next, by the properties of + and · in V (F ), we have the following:
(1). [(W + u) + (W + v)] + (W + z) = (W + u) + [(W + v) + (W + z)],
(2). (W + u) + (W + v) = (W + v) + (W + u),
(3). (W + u) + W = W + u = W + (W + u), i.e; W = W + 0 is the additive identity,
(4). (W + u) + (W + (−u)) = W = (W + (−u)) + (W + u),
(5). a[(W + u) + (W + v)] = a(W + u) + a(W + v),
(6). (a + b)(W + u) = a(W + u) + b(W + u),
(7). (ab)(W + u) = a[b(W + u)],
(8). 1(W + u) = (W + u),
for all u, v, z ∈ V and a, b, 1 ∈ F . Thus, V /W is a vector space. 2

Definition 5.5.2 Let W be a subspace of vector space V (F ). Then


the set V /W forms a vector space over F with respect to vector
addition + and scalar multiplication · given by

(W + u) + (W + v) = W + (u + v) and a(W + u) = W + au

for all u, v ∈ V and a ∈ F . This vector space is called a quotient


space of V modulo W .

Proposition 5.5.3 Let W be a subspace of vector space V (F ). If


V is finite dimensional then so is V /W and dim V /W = dim V −
dim W .

Proof: Let dim V = n. Then W is finite dimensional and


dim W ≤ dim V = n. Let dim W = m and S = {w1 , w2 , . . . , wm }
be a basis of W . Then it will be linearly independent subset of V
and so can be extended to a basis B = {w1 , w2 , . . . , wm ; v1 , v2 , . . . , vn−m }
of V . Consider the set Q = {W + v1 , W + v2 , . . . , W + vn−m }. Let

a1 (W + v1 ) + a2 (W + v2 ) + . . . + an−m (W + vn−m ) = W

Then a1 v1 + a2 v2 + . . . + an−m vn−m ∈ W and so there exist scalars


b1 , b2 , . . . , bm such that

a1 v1 + a2 v2 + . . . + an−m vn−m = b1 w1 + b2 w2 + . . . + bm wm

Then

a1 v1 + a2 v2 + . . . + an−m vn−m − b1 w1 − b2 w2 − . . . − bm wm = 0

But B is linearly independent, so all the a's and b's are zero. Thus, Q is a linearly independent subset of V /W .
Next, let W + v ∈ V /W , then v ∈ V and so

v = a1 v1 + a2 v2 + . . . + an−m vn−m + b1 w1 + b2 w2 + . . . + bm wm

for some scalars a's and b's. Then

W + v = a1 (W + v1 ) + a2 (W + v2 ) + . . . + an−m (W + vn−m ), since each wi ∈ W .

Thus, Q is a basis containing n − m elements and so V /W is finite


dimensional. Now

dim (V /W ) = n − m = dim V − dim W.

5.6 Coordinates
Definition 5.6.1 A finite sequence v1 , v2 , . . . , vn of vectors in an
n-dimensional vector space V (F ) is called an ordered basis of V .
It is denoted by n-tuple (v1 , v2 , . . . , vn ).

Let V (F ) be a finite dimensional vector space and

B = (v1 , v2 , . . . , vn )

be an ordered basis of V . Then, for each v ∈ V , we have a unique n-tuple (x1 , x2 , . . . , xn ) of scalars such that

v = x1 v1 + x2 v2 + . . . + xn vn = Σ_{i=1}^{n} xi vi

The scalar xi is called the i-th coordinate of v relative to B. Note that if v = Σ_{i=1}^{n} xi vi and w = Σ_{i=1}^{n} yi vi then

av = Σ_{i=1}^{n} (axi )vi ,   v + w = Σ_{i=1}^{n} (xi + yi )vi .

Definition 5.6.2 Let B = (v1 , v2 , . . . , vn ) be an ordered basis of


a finite dimensional vector space V (F ). Let v ∈ V . Then v is
uniquely expressed as
v = x1 v1 + x2 v2 + . . . + xn vn = Σ_{i=1}^{n} xi vi

The column vector X = (x1 , x2 , . . . , xn )t of the n unique scalars x1 , x2 , . . . , xn is called the coordinate matrix of v relative to the ordered basis B. It is denoted by [v]B . Hence

[v]B = (x1 , x2 , . . . , xn )t

Example 5.6.3 Consider V = R3 and B = (v1 , v2 , v3 ), where


v1 = (1, 1, 0), v2 = (0, 1, 0) and v3 = (0, 0, 1). Take v = (1, 2, 1) ∈
R3 . Then

(1, 2, 1) = 1.(1, 1, 0) + 1.(0, 1, 0) + 1.(0, 0, 1) = 1v1 + 1v2 + 1v3

Hence, the coordinate matrix of (1, 2, 1) relative to the ordered basis B is given by

[(1, 2, 1)]B = (1, 1, 1)t
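
Computing [v]B amounts to solving one linear system: if the basis vectors are the columns of a matrix, the coordinate matrix is the solution of that system. A minimal Python check of this example (assuming NumPy):

    import numpy as np

    # Columns are v1 = (1,1,0), v2 = (0,1,0), v3 = (0,0,1).
    B = np.array([[1., 0., 0.],
                  [1., 1., 0.],
                  [0., 0., 1.]])
    v = np.array([1., 2., 1.])
    coords = np.linalg.solve(B, v)   # solves B x = v, so x = [v]_B
    print(coords)                    # [1. 1. 1.]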

Theorem 5.6.4 Let V be an n-dimensional space over field F .


Suppose that B = (v1 , v2 , . . . , vn ) and B′ = (w1 , w2 , . . . , wn ) are any two ordered bases of V . Then there exists an invertible (non-singular) n × n matrix P with entries in F such that, for every v ∈ V ,

(1). [v]B = P [v]B′ ,

(2). [v]B′ = P −1 [v]B ,



where the columns Pj of P are given by Pj = [wj ]B , for all 1 ≤ j ≤


n.

Proof: Since B and B′ are bases, for each wj there exist n unique scalars P1j , P2j , . . . , Pnj such that

wj = Σ_{i=1}^{n} Pij vi = P1j v1 + P2j v2 + . . . + Pnj vn . . . (1)

Then,

[wj ]B = (P1j , P2j , . . . , Pnj )t for each 1 ≤ j ≤ n

Let y1 , y2 , . . . , yn be the coordinates of v relative to B′, i.e;

[v]B′ = (y1 , y2 , . . . , yn )t . . . (2)

Then

v = y1 w1 + y2 w2 + . . . + yn wn
  = Σ_{j=1}^{n} yj wj
  = Σ_{j=1}^{n} yj ( Σ_{i=1}^{n} Pij vi )
  = Σ_{i=1}^{n} ( Σ_{j=1}^{n} Pij yj ) vi . . . (3)

Thus, the i-th coordinate xi is given by

xi = Σ_{j=1}^{n} Pij yj = Pi1 y1 + Pi2 y2 + . . . + Pin yn . . . (4)

for all 1 ≤ i ≤ n. Thus,

[v]B = P [v]B′ , where P = (Pij ) . . . (5)

Now

[v]B = 0 ⇔ xi = 0 for all 1 ≤ i ≤ n
⇔ Σ_{j=1}^{n} Pij yj = 0 for all 1 ≤ i ≤ n (by Eq. (4))
⇔ y1 w1 + y2 w2 + . . . + yn wn = Σ_{i=1}^{n} ( Σ_{j=1}^{n} Pij yj ) vi = 0 (by Eq. (3))
⇔ yj = 0 for all 1 ≤ j ≤ n (as B′ is linearly independent)
⇔ [v]B′ = 0

Hence, P is invertible. By Eq. (5), we have

[v]B′ = P −1 [v]B

where the j-th column of P is [wj ]B . 2

Definition 5.6.5 Let B = (v1 , v2 , . . . , vn ) and


B′ = (w1 , w2 , . . . , wn ) be any two ordered bases of an n-dimensional vector space over F . Then, the matrix P whose j-th column is the coordinate matrix [wj ]B of wj for each j = 1, 2, . . . , n, is called the transition matrix from B to B′. Thus,

B′ = BP

Theorem 5.6.6 Suppose that P is an n × n invertible square ma-


trix over F . Let V (F ) be an n-dimensional vector space over F
and B an ordered basis of V . Then there exists a unique ordered basis B′ of V such that
(i) [v]B = P [v]B′
(ii) [v]B′ = P −1 [v]B



for every vector v ∈ V .

Proof: Let B = (v1 , v2 , . . . , vn ) be an ordered basis of V .


Suppose that P = (Pij ) is an invertible n × n matrix. Take B′ = (w1 , w2 , . . . , wn ), where

[wj ]B = (P1j , P2j , . . . , Pnj )t

is the j-th column of P for each 1 ≤ j ≤ n, i.e;

wj = Σ_{i=1}^{n} Pij vi = P1j v1 + P2j v2 + . . . + Pnj vn

Let v ∈ V and let y1 , y2 , . . . , yn be the coordinates of v relative to B′. Then

[v]B′ = (y1 , y2 , . . . , yn )t

and

v = y1 w1 + y2 w2 + . . . + yn wn
  = Σ_{j=1}^{n} yj wj
  = Σ_{j=1}^{n} yj ( Σ_{i=1}^{n} Pij vi )
  = Σ_{i=1}^{n} ( Σ_{j=1}^{n} Pij yj ) vi . . . (3)

Thus, the i-th coordinate xi is given by

xi = Σ_{j=1}^{n} Pij yj = Pi1 y1 + Pi2 y2 + . . . + Pin yn

for all 1 ≤ i ≤ n. Hence

[v]B = P [v]B′

Now, we wish to prove that B′ spans V . For, let Q = (Qij ) = P −1 . Then

Σ_{j=1}^{n} Qjk wj = Σ_{j=1}^{n} Qjk ( Σ_{i=1}^{n} Pij vi )
                = Σ_{i=1}^{n} ( Σ_{j=1}^{n} Pij Qjk ) vi
                = Σ_{i=1}^{n} δik vi   (as Σ_{j=1}^{n} Pij Qjk = δik )
                = vk

Thus, each vk belongs to L(B′) and so L(B′) = L(B) = V . Hence B′ is an ordered set of n elements spanning V , and so it is a basis of V . Thus, there exists a basis B′ such that (i) holds, and hence (ii) holds as well. Finally, suppose B′′ is any other ordered basis for which (i) and (ii) hold. Then [v]B = P [v]B′ and [v]B = P [v]B′′ . Let [v]B′ = S[v]B′′ . Using these, we have

P [v]B′′ = [v]B = P [v]B′ = P S[v]B′′

for all v ∈ V . Thus, P = P S. But P is invertible, therefore S = In . Hence B′ = B′′. Thus, proved. 2

Example 5.6.7 Consider the matrix

P = ( cos θ   − sin θ
      sin θ     cos θ )

where θ is a real number. Consider the space R2 (R). Clearly P is an orthogonal matrix and so

P −1 = ( cos θ     sin θ
        − sin θ    cos θ )

Consider the ordered basis (w1 , w2 ), where w1 = (cos θ, sin θ) and


w2 = (− sin θ, cos θ). Observe that the coordinate matrices of w1 , w2 relative to the standard basis are the columns of P . Clearly P rotates the axes about the origin through an angle θ. Then, by the above theorem, the coordinates x′1 , x′2 of any point (x1 , x2 ) relative to the new basis are given by

(x′1 , x′2 )t = P −1 (x1 , x2 )t

Thus, x′1 = x1 cos θ + x2 sin θ and x′2 = −x1 sin θ + x2 cos θ.
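
These rotation formulas are easily verified numerically. A small sketch (assuming NumPy; θ = π/6 is an arbitrary sample value):

    import numpy as np

    theta = np.pi / 6
    P = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    x = np.array([1.0, 2.0])   # coordinates relative to the standard basis
    x_new = P.T @ x            # P is orthogonal, so P^{-1} = P^t
    # Compare with the closed form derived above.
    expected = np.array([ x[0]*np.cos(theta) + x[1]*np.sin(theta),
                         -x[0]*np.sin(theta) + x[1]*np.cos(theta)])
    print(np.allclose(x_new, expected))   # True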

Example 5.6.8 Consider the ordered basis B = {w1 , w2 , w3 }, where


w1 = (−1, 0, 0), w2 = (4, 2, 0), w3 = (5, −3, 8). Form the matrix

P = ( −1   4    5
       0   2   −3
       0   0    8 )

whose columns are the basis vectors. Observe that

P −1 = ( −1   2     11/8
          0   1/2   3/16
          0   0     1/8 )

Then the coordinate matrix of any vector (a, b, c) relative to this basis is

P −1 (a, b, c)t = ( −a + 2b + 11c/8 , b/2 + 3c/16 , c/8 )t

This asserts that

(a, b, c) = ( −a + 2b + 11c/8 ) w1 + ( b/2 + 3c/16 ) w2 + ( c/8 ) w3 .

In particular, (1, 0, 8) = 10w1 + (3/2)w2 + w3 .
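
The arithmetic of this example can be checked with a few lines of Python (assuming NumPy); solving P x = v gives the coordinates of v relative to (w1 , w2 , w3 ) without forming P −1 explicitly:

    import numpy as np

    P = np.array([[-1., 4.,  5.],
                  [ 0., 2., -3.],
                  [ 0., 0.,  8.]])
    v = np.array([1., 0., 8.])
    coords = np.linalg.solve(P, v)   # coordinates of (1, 0, 8) in the basis
    print(coords)                    # [10.  1.5  1.], i.e. 10*w1 + (3/2)*w2 + w3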

Exercises

Exercise 5.6.9 Find the coordinates of each vector of the standard basis of R4 (R) relative to the ordered basis (w1 , w2 , w3 , w4 ), where w1 = (1, 1, 0, 0), w2 = (0, 0, 1, 1), w3 = (1, 0, 0, 4) and w4 = (0, 0, 0, 2). Hint: Here the transition matrix P is

P = ( 1   0   1   0
      1   0   0   0
      0   1   0   0
      0   1   4   2 )

Then find P −1 . Now, the coordinates of each standard basis vector are


given by columns of P −1 .

Exercise 5.6.10 Consider the subset of R3 given by

S = {(1, 1, 0), (0, 1, −1)}.

Prove that it is a linearly independent subset of R3 . Extend this to a


basis of R3 .

Exercise 5.6.11 Consider V = R3 and B = (v1 , v2 , v3 ), where


v1 = (1, 1, 0), v2 = (0, 1, 1) and v3 = (1, 0, 1). Find the coordinates
of v = (2, 2, 2) ∈ R3 in the basis B.

Exercise 5.6.12 Consider the subset of R4 given by

S = {(2, 2, 1, 3), (7, 5, 5, 5), (3, 2, 2, 1), (2, 1, 2, 1)}.

Find a basis of L(S) and extend this to a basis of R4 .

Exercise 5.6.13 Find a basis of the polynomial space P2 (x) containing 1 and 1 − x2 .

Exercise 5.6.14 Find the smallest possible dimension of


W1 ∩ W2 in R5 , where W1 and W2 are distinct subspaces with
dimensions 2 and 3 respectively.

Exercise 5.6.15 Is the linearly independent subset

{(1, 0, 2), (1, 1, −1)}

of R3 a basis? Give reasons.



Exercise 5.6.16 Prove that the rows of the matrix

A = ( 1   1   0    0
      0   1   −1   0
      0   0   0    3 )

are linearly independent.

Chapter 6

Linear Transformations

Definition 6.0.1 Let U and V be any two vector spaces over same
field F . Then a map T : U → V is called a linear transforma-
tion if it satisfies the following condition:

T (ax + by) = aT (x) + bT (y)


for all a, b ∈ F and x, y ∈ U . It is called injective linear trans-
formation if it is one-one and is called surjective if it is onto.
A one-one onto linear transformation is called an isomorphism
of vector spaces. A linear transformation T : V → V is called a
linear operator on V .

Example 6.0.2 Let A be an m × n matrix whose entries are real


numbers. Let Rm denote the vector space of row vectors or row
matrices over R. Let T : Rm → Rn be a map given by

T (X) = XA.

Let a, b ∈ R and X, Y ∈ Rm . Then

T (aX + bY ) = (aX + bY )A
= a(XA) + b(Y A) (by right distributive law)
= aT (X) + bT (Y )

Thus, T is a linear transformation.

Example 6.0.3 The map T : R3 → R2 given by

T (x, y, z) = (2x, 3y)


is a linear transformation. For, let a, b ∈ R and (x1 , y1 , z1 ), (x2 , y2 , z2 ) ∈ R3 . Now,

T [a(x1 , y1 , z1 ) + b(x2 , y2 , z2 )]
= T [(ax1 , ay1 , az1 ) + (bx2 , by2 , bz2 )]
= T [(ax1 + bx2 , ay1 + by2 , az1 + bz2 )]
= (2[ax1 + bx2 ], 3[ay1 + by2 ])
= (2ax1 + 2bx2 , 3ay1 + 3by2 )
= (2ax1 , 3ay1 ) + (2bx2 , 3by2 )
= a(2x1 , 3y1 ) + b(2x2 , 3y2 )
= aT (x1 , y1 , z1 ) + bT (x2 , y2 , z2 )

Thus, T is a linear transformation.

Example 6.0.4 The map T : R2 → R3 defined by

T (x, y) = (x − y, x + y, 2y)

is a linear transformation. For,


let a, b ∈ R and (x1 , y1 ), (x2 , y2 ) in R2 . Then

T [a(x1 , y1 ) + b(x2 , y2 )]
= T [(ax1 , ay1 ) + (bx2 , by2 )]
= T [(ax1 + bx2 , ay1 + by2 )]
= (ax1 + bx2 − ay1 − by2 , ax1 + bx2 + ay1 + by2 , 2[ay1 + by2 ])
= (a(x1 − y1 ) + b(x2 − y2 ), a(x1 + y1 ) + b(x2 + y2 ), 2ay1 + 2by2 )
= (a(x1 − y1 ), a(x1 + y1 ), 2ay1 ) + (b(x2 − y2 ), b(x2 + y2 ), 2by2 )
= a (x1 − y1 , x1 + y1 , 2y1 ) + b (x2 − y2 , x2 + y2 , 2y2 )
= aT (x1 , y1 ) + bT (x2 , y2 )

Example 6.0.5 Let V (F ), W (F ) be any two vector spaces. Then


a map 0̂ : V → W defined by 0̂(v) = 0, is a linear transformation.
This linear transformation is called zero linear transformation.

Example 6.0.6 The i-th projection map pi : F n → F defined by

pi (x1 , x2 , . . . , xn ) = xi

is a linear transformation, where F is a field.



Example 6.0.7 The differentiation map T : Pn (x) → Pn (x) de-


fined by

T (a0 + a1 x + . . . + an xn ) = a1 + 2a2 x + . . . + nan xn−1

is a linear transformation.

Exercise 6.0.8 Determine which of the following maps T : R3 →


R3 are linear:

1. T (x, y, z) = (y, z, 2) (Hint: No, why?).

2. T (x, y, z) = (x − y, z, x) (Hint: Yes, why?).

3. T (x, y, z) = (x − y, x + y + z, 3z) (Hint: Yes, why?).

4. T (x, y, z) = (x − y, z, |x|) (Hint: No, why?).

Exercise 6.0.9 Let Mn×n (R) denote the vector space of all n × n matrices over R. Let B ∈ Mn×n (R) be a non-zero matrix. Prove the
following :

1. the map T : Mn×n (R) → Mn×n (R) defined by T (X) = XB−


BX is a linear transformation,

2. the map T : Mn×n (R) → Mn×n (R) defined by T (X) =


XB 2 + BX is linear transformation,

3. the map T : Mn×n (R) → Mn×n (R) defined by T (X) =


XB 2 − BX 2 is not a linear transformation.

Exercise 6.0.10 Prove that the map T : Pn (x) → Pn+1 (x) defined
by T (p(x)) = p(0) + xp(x) is a linear transformation but the map
T1 : Pn (x) → Pn+1 (x) defined by T1 (p(x)) = 1 + xp(x) is not a
linear transformation.

Example 6.0.11 Let T : V → V be a linear transformation,


whose images are known on a basis B = {v1 , v2 , . . . , vn } of V (F ),
i.e, T (vi ) = wi for all 1 ≤ i ≤ n where w1 , w2 , . . . , wn are given
vectors. Then, we can determine T explicitly. For, let v be
any element. Then v can be uniquely expressed as

v = a1 v1 + a2 v2 + . . . + an vn

Then T (v) = a1 T (v1 ) + a2 T (v2 ) + . . . + an T (vn ) = a1 w1 + a2 w2 +


. . . + an wn .
It is illustrated by the following example:

Let V = R3 and T : R3 → R3 be a linear transformation given


by

T (1, 0, 1) = (0, 0, 0)
T (0, 1, 0) = (1, −1, 0)   . . . (1)
T (0, 0, 1) = (0, 1, −1)

where B = {(1, 0, 1), (0, 1, 0), (0, 0, 1)} is a basis of R3 .


Let (x, y, z) ∈ R3 . Since B is a basis of R3 , therefore (x, y, z)
can be uniquely expressed as a linear combination of elements of B.
Suppose that

(x, y, z) = a(1, 0, 1) + b(0, 1, 0) + c(0, 0, 1) . . . (2)

Then we have, a = x, b = y, a + c = z, hence

a = x, b = y, c = z − x . . . (3)

Since T is linear, therefore by Eq. (1) and Eq. (2), we have

T (x, y, z) = aT (1, 0, 1) + bT (0, 1, 0) + cT (0, 0, 1)


= a(0, 0, 0) + b(1, −1, 0) + c(0, 1, −1)
= (b, c − b, −c)
= (y, z − x − y, x − z) (by Eq.(3))
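
The derived formula can be sanity-checked by evaluating it at the three given basis vectors; a minimal Python sketch:

    def T(x, y, z):
        # Formula obtained in the example above.
        return (y, z - x - y, x - z)

    print(T(1, 0, 1))   # (0, 0, 0)
    print(T(0, 1, 0))   # (1, -1, 0)
    print(T(0, 0, 1))   # (0, 1, -1)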

Exercise 6.0.12 Determine the explicit formula for a linear trans-


formation T : R3 → R4 for which T (1, 1, 0) = (0, 0, 1, 0), T (0, 1, 0) =
(1, 0, 0, 0) and T (0, 1, −1) = (1, 0, −1, −1).

Proposition 6.0.13 Let T : U (F ) → V (F ) be a linear transfor-


mation. Then, we have the following
(1). T (0U ) = 0V , where 0U , 0V denotes the additive identity of
spaces U (F ) and V (F ) respectively.
(2). T (−u) = −T (u), for all u ∈ U .

Proof: Since T is a linear transformation, therefore

T (0U ) = T (0 · 0U ) = 0 · T (0U ) = 0V .



This proves (1).


Next, let u ∈ U . Then, by (1), 0 = T (u + (−u)) = T (u) +
T (−u). Thus, T (u) is the additive inverse of T (−u). Hence T (−u) =
−T (u), for all u ∈ U . This proves (2). 2
For convenience, we always use 0 for both 0U and 0V without
referring to the spaces U and V respectively.

Definition 6.0.14 Let T : U (F ) → V (F ) be a linear transforma-


tion. Then range of T is denoted and defined by

T (U ) = {T (u) | u ∈ U }

We also denote Im T for T (U ).

Let W be any subspace of V . Then, the inverse image of W


is denoted and defined by

T −1 (W ) = {u ∈ U | T (u) ∈ W }

Proposition 6.0.15 Let T : U (F ) → V (F ) be a linear transfor-


mation. Then, we have the following:
(1). If X is a subspace of U , then T (X) is a subspace of V . In
particular, T (U ) is a subspace of V .

(2). If W is a subspace of V , then T −1 (W ) is a subspace of U .

Proof: Let a, b ∈ F and v1 , v2 ∈ T (X); then there exist u1 , u2 ∈


X such that v1 = T (u1 ) and v2 = T (u2 ). Now,

av1 + bv2 = aT (u1 ) + bT (u2 )


= T (au1 + bu2 ) ∈ T (X)
as T is a linear and au1 + bu2 ∈ X

Hence T (X) is a subspace. In particular, T (U ) is a subspace. This


proves (1).
Next, let a, b ∈ F and u1 , u2 ∈ T −1 (W ); then T (u1 ), T (u2 ) ∈
W . Now,

T (au1 + bu2 ) = aT (u1 ) + bT (u2 ) ∈ W, as W is a subspace

Hence au1 + bu2 ∈ T −1 (W ). Thus, T −1 (W ) is a subspace of U . 2



Definition 6.0.16 Let T : U (F ) → V (F ) be a linear transforma-


tion. Then kernel of a linear transformation T is denoted and
defined by
ker T = {u ∈ U |T (u) = 0V }
It is also called a null space of T .

Proposition 6.0.17 The kernel of a linear transformation T :


U (F ) → V (F ) is a subspace of U .

Proof: Let a, b ∈ F and u1 , u2 ∈ ker T . Then T (u1 ) = 0V =


T (u2 ). Since T (au1 + bu2 ) = aT (u1 ) + bT (u2 ) = 0V , therefore
au1 + bu2 ∈ ker T . Thus, ker T is a subspace of U . 2

Definition 6.0.18 Let U (F ) and V (F ) be any two finite dimen-


sional vector spaces and T : U (F ) → V (F ) be a linear transforma-
tion. Then dim ker T is called the nullity of T , denoted by ν(T ). The dimension of the range space T (U ) is called the rank of T . It is denoted by ρ(T ).

Proposition 6.0.19 A linear transformation T : U (F ) → V (F )


is injective (one-one) if and only if ker T = {0}, i.e; ν(T ) = 0.

Proof: Suppose that ker T = {0}. Then

T (u1 ) = T (u2 ) ⇒ T (u1 ) − T (u2 ) = 0


⇒ T (u1 − u2 ) = 0
⇒ u1 − u2 ∈ ker T
⇒ u1 = u2 as ker T = {0}

Conversely suppose that T is one-one. Since T (0) = 0 and T is


one-one, therefore image of only zero vector will be zero. Thus,
ker T = {0}. 2

Example 6.0.20 Consider the linear transformation T : R3 → R2


given by
T (x, y, z) = (x, y)
Let (x, y, z) ∈ ker T . Then T (x, y, z) = 0R2 = (0, 0), i.e; (x, y) =
(0, 0). By equality of ordered pairs, we have x = 0, y = 0. Thus,

ker T = {(0, 0, z)| z ∈ R}.



Clearly {(0, 0, 1)} is a linearly independent set such that (0, 0, z) =


z(0, 0, 1) for all z ∈ R. Hence, {(0, 0, 1)} is a basis of ker T con-
taining one element. Thus, ν(T ) = 1. Since T (R3 ) = R2 , which
is two dimensional vector space with basis {(1, 0), (0, 1)}. Hence,
ρ(T ) = 2. Observe that ρ(T ) + ν(T ) = 3 = dim R3 .

Example 6.0.21 Consider the i-th projection map Ti : Rn → R


given by
Ti (x1 , x2 , . . . , xn ) = xi ,

where 1 ≤ i ≤ n. Then

ker Ti = {(x1 , x2 , . . . , xn ) ∈ Rn | xi = 0},   Ti (Rn ) = R

Hence ν(Ti ) = n − 1 and ρ(Ti ) = 1. Clearly ρ(Ti ) + ν(Ti ) = n =


dim Rn .

Example 6.0.22 Let A be an n×n matrix with real entries. Con-


sider the linear transformation TA : Mn×1 (R) → Mn×1 (R) given
by

TA (X) = AX, where X = (x1 , x2 , . . . , xn )t .

Let b = (b1 , b2 , . . . , bn )t ∈ TA (Mn×1 ). Then there exists a vector X = (x1 , x2 , . . . , xn )t ∈ Mn×1
such that TA (X) = b, i.e; AX = b, i.e;

C1 x1 + C2 x2 + . . . + Cn xn = b

where Cj denotes the j-th column of A, i.e; b is a linear combination of the columns of A. Hence, the range space TA (Mn×1 ) is

the space spanned by columns of A and so its dimension will


be the maximum number of linearly independent columns of A, i.e;
the column rank of A. But the row rank of A is the same as the column
rank of A. Hence, ρ(TA ) = rank A.
Next,
X ∈ ker TA ⇔ TA (X) = 0 ⇔ AX = 0.
Thus, ker TA is the solution space of the system of homogeneous
equations AX = 0. Hence, ν(TA ) will be the number of linearly
independent solutions of AX = 0. Using Theorem 3.1.2, we have

ν(TA ) = n − rank A.
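
Both ρ(TA ) and ν(TA ) are therefore obtained from a single rank computation. A minimal Python sketch (assuming NumPy; the matrix A is made-up data):

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [2., 4., 6.],
                  [1., 0., 1.]])
    n = A.shape[1]
    rank = np.linalg.matrix_rank(A)   # rho(T_A) = rank A = 2
    print(rank, n - rank)             # nu(T_A) = n - rank A = 1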

Example 6.0.23 Consider the linear transformation T : R3 → R4


defined by
T (x, y, z) = (x − y, y − z, z − x, 0)
Since

(x − y, y − z, z − x, 0) = (x − z)(1, 0, −1, 0) + (y − z)(−1, 1, 0, 0)

therefore T (R3 ) = ⟨{(1, 0, −1, 0), (−1, 1, 0, 0)}⟩. Now


Consider the matrix

A = ( 1    0   −1   0
      −1   1   0    0 )

Since

A ∼ ( 1   0   −1   0
      0   1   −1   0 )   R2 → R2 + R1

which is an Echelon form. Then

dim T (R3 ) = rank A = 2

But ⟨{(1, 0, −1, 0), (−1, 1, 0, 0)}⟩ = T (R3 ), hence

{(1, 0, −1, 0), (−1, 1, 0, 0)}

will be a basis of T (R3 ).



Next, let (x, y, z) ∈ ker T . Then T (x, y, z) = (0, 0, 0, 0), i.e;


(x − y, y − z, z − x, 0) = (0, 0, 0, 0). By equality of tuples, we have
x = y = z. Thus,

ker T = {(x, x, x) ∈ R3 | x ∈ R} = ⟨(1, 1, 1)⟩.

Hence, {(1, 1, 1)} will be a basis of ker T and ν(T ) = 1.
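
A computer algebra system yields the same bases directly from the matrix of T . A sketch using SymPy (assuming it is available); the columns of M below are the images of the standard basis vectors under T :

    import sympy as sp

    # T(x, y, z) = (x - y, y - z, z - x, 0) as a 4x3 matrix acting on columns.
    M = sp.Matrix([[ 1, -1,  0],
                   [ 0,  1, -1],
                   [-1,  0,  1],
                   [ 0,  0,  0]])
    print(M.columnspace())   # two vectors: a basis of range T, so rho(T) = 2
    print(M.nullspace())     # one vector (1, 1, 1)^t: a basis of ker T, nu(T) = 1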

Exercise 6.0.24 Find the basis of range space and null space of
the linear transformation T : R4 → R3 given by

T (x, y, z, u) = (x + y, y − z, x + u).

Exercise 6.0.25 Find the basis of range space and null space of
the linear transformation T : R3 → R3 given by

T (x, y, z) = (x + y, y + z, x + z).

Exercise 6.0.26 If T : R2 → R2 is a map given by T (x, y) = (0, x)


then prove that ker T = range (T ).

Exercise 6.0.27 Give an example of a linear transformation T


for which ker T ⊂ range T . Also give an example of a linear
transformation T for which range T ⊂ ker T .

Example 6.0.28 Consider the linear transformation T on R3 given


by
T (x, y, z) = (x − y, y + z, x + z)
Let (a, b, c) ∈ R3 . Then T is surjective if and only if there exists
(x, y, z) ∈ R3 such that

T (x, y, z) = (x − y, y + z, x + z) = (a, b, c)

i.e; if and only if there exists (x, y, z) ∈ R3 such that x − y = a, y + z = b and x + z = c.
Thus, T is surjective ⇔ the system AX = B is consistent, where

A = ( 1   −1   0
      0    1   1
      1    0   1 ) ,   X = (x, y, z)t and B = (a, b, c)t .

Consider the augmented matrix (A|B). Since

(A|B) = ( 1   −1   0  :  a
          0    1   1  :  b
          1    0   1  :  c )

       ∼ ( 1   −1   0  :  a
           0    1   1  :  b
           0    1   1  :  c − a )   R3 → R3 − R1

       ∼ ( 1   −1   0  :  a
           0    1   1  :  b
           0    0   0  :  c − a − b )   R3 → R3 − R2

Hence, the system is consistent if and only if c − a − b = 0, i.e, iff


c = a + b. This shows that T is not surjective. Clearly

range T = {(a, b, a + b) | a, b ∈ R}

and rank (T ) = 2 with basis {(1, 0, 1), (0, 1, 1)}.


Next, (x, y, z) ∈ ker T if and only if T (x, y, z) = (0, 0, 0), i.e; if and only if

x − y = 0,   y + z = 0,   x + z = 0,

i.e; if and only if X = (x, y, z)t is a solution of the homogeneous system AX = 0, with A as above. From the above reduction, we see that rank A = 2; hence AX = 0 has 3 − 2 = 1 linearly independent solution and so ker T ≠ {0}. Thus, T is not injective.

Exercise 6.0.29 Prove that i-th projection map T : Rn → R is


surjective but not injective.

Exercise 6.0.30 Give an example of a linear transformation T


which is injective but not surjective.

Exercise 6.0.31 Give an example of a linear transformation T


which is neither injective nor surjective.

6.1 Rank-Nullity Theorem


Theorem 6.1.1 (Rank-nullity theorem): Let V (F ) and W (F )
be any two finite dimensional vector spaces. If T : V → W is a
linear transformation then

dim V = rank (T ) + nullity (T )

i.e;
dim V = ρ(T ) + ν(T )

Proof: Suppose that dim V = n. Since ker T is a subspace


of V , therefore dim ker T ≤ n. Let dim ker T = m and S =
{v1 , v2 , . . . , vm } be a basis of ker T . Then S will be a linearly independent subset of V and so, by the extension theorem, S can be extended to a basis B = {v1 , v2 , . . . , vm ; vm+1 , . . . , vn } of V . Let S1 = {T (vm+1 ), . . . , T (vn )}. Let u ∈ T (V ); then there exists v ∈ V such
that u = T (v). Since v ∈ V and B is a basis of V therefore v can
be expressed as

v = a1 v1 + a2 v2 + . . . + am vm + am+1 vm+1 + . . . + an vn

Hence u = T (v) = am+1 T (vm+1 ) + . . . + an T (vn ) as T is linear and


T (vi ) = 0 for all 1 ≤ i ≤ m. This shows that T (V ) = L(S1 ). Now,
suppose that

am+1 T (vm+1 ) + . . . + an T (vn ) = 0 . . . (1)

Then
T (am+1 vm+1 + . . . + an vn ) = 0
Hence am+1 vm+1 + . . . + an vn ∈ ker T . But S is a basis of ker T .
Hence there exist scalars a1 , a2 , . . . , am ∈ F such that

am+1 vm+1 + . . . + an vn = a1 v1 + a2 v2 + . . . + am vm

i.e;

am+1 vm+1 + . . . + an vn − a1 v1 − a2 v2 − . . . − am vm = 0

But B is linearly independent therefore a1 = 0, a2 = 0, . . . , am = 0;


am+1 = 0, am+2 = 0, . . . , an = 0. Thus (1) holds only if am+1 = 0,
am+2 = 0, . . . , an = 0. Hence S1 is linearly independent. Since

L(S1 ) = T (V ), therefore S1 is a basis of T (V ) containing n − m


elements. Thus,

dim T (V ) = n − m = n − dim ker T

i.e;
dim V = n = dim T (V ) + dim ker T
But ρ(T ) = dim T (V ) and ν(T ) = dim ker T , therefore we have

dim V = ρ(T ) + ν(T )
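
For linear maps given by matrices, the theorem reduces to the identity rank + nullity = number of columns, which can be spot-checked on random data. A minimal sketch (assuming NumPy):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.integers(-3, 4, size=(4, 6)).astype(float)  # a map T : R^6 -> R^4
    n = A.shape[1]                                      # n = dim V
    rank = np.linalg.matrix_rank(A)                     # rho(T) = dim T(V)
    nullity = n - rank                                  # nu(T) = dim ker T
    assert rank + nullity == n                          # dim V = rho(T) + nu(T)
    print(rank, nullity)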

Example 6.1.2 Let F be a field and T : V → F be a linear


transformation (linear functional) given by T (x1 v1 + x2 v2 + . . . +
xn vn ) = a1 x1 + a2 x2 + . . . + an xn , where {v1 , v2 , . . . , vn } is a basis of V and a1 , a2 , . . . , an are given scalars, not all zero. Then range T = F , hence T is surjective and rank T = 1. Since dim V = n, by the rank-nullity theorem the dimension of ker T = {x1 v1 + x2 v2 + . . . + xn vn | a1 x1 + a2 x2 + . . . + an xn = 0} is n − 1. Such subspaces are called hyperspaces in V . In particular, if V = R3 and F = R, then ker T is the plane given by a1 x1 + a2 x2 + a3 x3 = 0, which is of dimension 2. If F = R and V = Rn , then ker T is given by a1 x1 + a2 x2 + . . . + an xn = 0, which is of dimension n − 1. Such a subspace of Rn is called a hypersurface.

Exercise 6.1.3 Let V (F ) be a vector space of dimension n ≥ 1.


Let T : V → V be a linear transformation. Then prove that the
following statements are equivalent:
1. T (V ) = ker T ,

2. T ≠ 0̂ and T 2 = 0̂, n is even and the rank of T is n/2.

Theorem 6.1.4 Let V (F ) and W (F ) be any vector spaces of di-


mension n and T : V → W be a linear transformation. Then prove
that the following statements are equivalent:
(1). T is injective (one-one),

(2). T is surjective (onto),

(3). T is bijective (one-one onto),



(4). T carries bases of V to bases of W , i.e; if {v1 , v2 , . . . , vn } is


a basis of V then {T (v1 ), T (v2 ), . . . , T (vn )} is a basis of W .
Proof: Given that dim V = dim W = n. By rank-nullity
theorem (Theorem 6.1.1), we have
dim V = dim T (V ) + dim ker T (1)
(1) ⇔ (2).
ker T = {0} ⇔ dim ker T = 0
⇔ dim V = dim T (V )
⇔ dim T (V ) = n = dim W
⇔ T (V ) = W
Hence, T is injective if and only if T is surjective.
(1) ⇔ (3). As we have proved that T is injective if and only if
T is surjective. Thus, we immediately follow that T is injective if
and only if T is bijective.
(1) ⇔ (4). Let T be injective and let {v1 , v2 , . . . , vn } be a basis of
V . Since T is injective, therefore T (v1 ), T (v2 ), . . . , T (vn ) all are
distinct. Suppose that
a1 T (v1 ) + a2 T (v2 ) + . . . + an T (vn ) = 0 . . . (2)
Then T [a1 v1 + a2 v2 + . . . + an vn ] = 0. But ker T = {0}. Hence
a1 v1 + a2 v2 + . . . + an vn = 0
By linear independence of {v1 , v2 , . . . , vn }, we have
a1 = 0, a2 = 0, . . . , an = 0
Thus, equation (2) holds only if a1 = 0, a2 = 0, . . . , an = 0. Hence
{T (v1 ), T (v2 ), . . . , T (vn )}
is linearly independent set containing n elements of W , where
dim W = n. Thus, {T (v1 ), T (v2 ), . . . , T (vn )} is a basis of W .
Conversely suppose that T carries bases of V to bases of W .
Let {v1 , v2 , . . . , vn } be a basis of V . Then {T (v1 ), T (v2 ), . . .,
T (vn )} is a basis of W . Let w ∈ W . Then
w = a1 T (v1 )+a2 T (v2 )+. . .+an T (vn ) = T [a1 v1 +a2 v2 +. . .+an vn ]
Hence, T is surjective and so T is injective (by (1) ⇔ (2)). 2

Exercise 6.1.5 A linear transformation T : V (F ) → W (F ) is


injective ⇔ T carries linearly independent subset of V to a linearly
independent subset of W , where V and W are finite dimensional
vector spaces over field F of dimension n.

Definition 6.1.6 A linear transformation T : V (F ) → W (F ) is


called an isomorphism if it is one-one and onto. In this case we
write V ≅ W and say "V is isomorphic to W ".

Theorem 6.1.7 Every finite dimensional vector space V (F ) is


isomorphic to F n for some n ∈ N, i.e; V ≅ F n for some n ∈ N.

Proof: Let V (F ) be a finite dimensional vector space. Suppose


that dim V = n and B = {v1 , v2 , . . . , vn } be a basis of V . Hence
every v ∈ V is uniquely expressible as

v = a1 v1 + a2 v2 + . . . + an vn

where a1 , a2 , . . . , an are scalars. Define a map T : V → F n as

T (a1 v1 + a2 v2 + . . . + an vn ) = (a1 , a2 , . . . , an )

Clearly it is one-one and onto. Let a, b ∈ F . Then

T [a( Σ_{i=1}^{n} ai vi ) + b( Σ_{i=1}^{n} bi vi )]
= T [ Σ_{i=1}^{n} (aai + bbi )vi ]
= (aa1 + bb1 , aa2 + bb2 , . . . , aan + bbn )
= a(a1 , a2 , . . . , an ) + b(b1 , b2 , . . . , bn )
= aT ( Σ_{i=1}^{n} ai vi ) + bT ( Σ_{i=1}^{n} bi vi )

Thus, T is linear.
Hence T is an isomorphism, i.e; V ≅ F n . 2

Proposition 6.1.8 Let T be a bijective linear transformation from


V (F ) in to W (F ). Then the inverse map T −1 is also linear.

Proof: Let a, b ∈ F and w1 , w2 ∈ W . Since T is one-one and


onto, there exist unique v1 , v2 ∈ V such that T (v1 ) = w1 and T (v2 ) = w2 . Hence T −1 (w1 ) = v1 and T −1 (w2 ) = v2 . Now

T (av1 + bv2 ) = aT (v1 ) + bT (v2 ) = aw1 + bw2

Hence

T −1 (aw1 + bw2 ) = av1 + bv2 = aT −1 (w1 ) + bT −1 (w2 )

Thus, T −1 is a linear transformation. 2

Proposition 6.1.9 Any two vector spaces of same dimensions are


isomorphic.

Proof: Let V (F ) and W (F ) be any two vector spaces of dimen-


sion n. Let B = {v1 , v2 , . . . , vn } and B′ = {w1 , w2 , . . . , wn } be bases of V and W respectively. Define a map T : V → W by

T ( Σ_{i=1}^{n} ai vi ) = Σ_{i=1}^{n} ai wi

Then T is one-one and onto. Now


T [ a( Σ_{i=1}^{n} ai vi ) + b( Σ_{i=1}^{n} bi vi ) ] = T [ Σ_{i=1}^{n} (aai + bbi )vi ]
= Σ_{i=1}^{n} (aai + bbi )wi
= a Σ_{i=1}^{n} ai wi + b Σ_{i=1}^{n} bi wi
= aT [ Σ_{i=1}^{n} ai vi ] + bT [ Σ_{i=1}^{n} bi vi ]

Thus, T is linear. Since it is bijective too, therefore T is an iso-


morphism. 2

Definition 6.1.10 A linear transformation T : V → W is said


to be singular if there exists a non-zero vector v ∈ V such that
T (v) = 0. If T is not singular then T is called non-singular, i.e;
T (v) = 0 ⇒ v = 0. If V and W are finite dimensional then T is
non-singular if and only if T is invertible.

Theorem 6.1.11 Let V (F ) and W (F ) be any two vector spaces.


If B = {v1 , v2 , . . . , vn } is a basis of V and w1 , w2 , . . . , wn be any
n vectors in W (not necessarily distinct). Then there is a unique
linear transformation T : V → W such that T (vi ) = wi for all
i = 1, 2, . . . , n.

Proof: Since B = {v1 , v2 , . . . , vn } is a basis of V therefore every


vector v ∈ V can be uniquely expressed as a linear combination of elements of B, i.e; every element of V is of the form Σ_{i=1}^{n} ai vi . Define a map T : V → W by

T ( Σ_{i=1}^{n} ai vi ) = Σ_{i=1}^{n} ai wi . . . (1)

Let a, b ∈ F . Then
" n n
# " n #
X X X
T a. ai vi + b. bi vi = T (aai + bbi )vi
i=1 i=1 i=1
n
X
= (aai + bbi )wi
i=1

n
X n
X
= a. ai wi + b. bi wi
i=1 i=1
" n # " n
#
X X
= aT ai vi + bT bi vi
i=1 i=1

Thus, T is a linear transformation. Clearly


 

T (vi ) = T 0 + 0 + . . . + vi +0 + . . . + 0
|{z}
i−th place
= 0 + 0 + ... + wi +0 + . . . + 0
|{z}
i−th place
= wi

for each i = 1, 2, . . . , n.
Let U be a linear transformation from V into W such that
U (vi ) = wi for each i = 1, 2, .., n. Then,
U ( Σ_{i=1}^{n} ai vi ) = Σ_{i=1}^{n} ai U (vi ) = Σ_{i=1}^{n} ai wi = T ( Σ_{i=1}^{n} ai vi )

Thus, U = T . 2
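
In matrix form, the unique T of Theorem 6.1.11 is recovered as W P −1 , where the columns of P are the basis vectors vi and the columns of W are the prescribed images wi . A minimal Python sketch (assuming NumPy; the vectors below are made-up data):

    import numpy as np

    V = np.array([[1.,  1.],
                  [1., -1.]])        # columns: v1 = (1,1), v2 = (1,-1)
    W = np.array([[2., 0.],
                  [0., 3.]])         # columns: prescribed images w1, w2
    A = W @ np.linalg.inv(V)         # the unique matrix with A v_i = w_i
    print(A @ V)                     # reproduces W column by column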

Example 6.1.12 Let T : F 2 → F 2 be a linear transformation


such that T (1, 0) = (a, b) and T (0, 1) = (c, d). Then T (x, y) =
T [x(1, 0) + y(0, 1)] = xT (1, 0) + yT (0, 1), and hence

T (x, y) = x(a, b) + y(c, d) = (ax + cy, bx + dy) = (x y) ( a   b
                                                            c   d )

Exercise 6.1.13 Is there a linear transformation T from R3 into


R2 such that T (1, −1, 1) = (1, 0) and T (1, 1, 1) = (0, 1)?

Solution: Yes, there exist infinitely many such linear transformations.


Here we give one of them. Let v1 = (1, −1, 1) and v2 = (1, 1, 1).
Select a third basis vector v3 from R3 \ L({(1, −1, 1), (1, 1, 1)}). Now

L({(1, −1, 1), (1, 1, 1)}) = {a(1, −1, 1) + b(1, 1, 1)| a, b ∈ R}


= {(a + b, −a + b, a + b) |a, b ∈ R}

Take v3 = (0, 0, 1) ∉ L({(1, −1, 1), (1, 1, 1)}). Suppose that T (v3 ) =
(0, 0). Then T is a linear transformation which is described as be-
low:
Let

(x, y, z) = a(1, −1, 1) + b(1, 1, 1) + c(0, 0, 1) . . . (1)

Then

a + b = x
−a + b = y
a + b + c = z

Solving these, we get

a = (x − y)/2,   b = (x + y)/2,   c = z − x

Using (1), T is given by

T (x, y, z) = aT (1, −1, 1) + bT (1, 1, 1) + cT (0, 0, 1)


= ((x − y)/2)(1, 0) + ((x + y)/2)(0, 1) + (z − x)(0, 0)
= ( (x − y)/2 , (x + y)/2 )

Exercise 6.1.14 Let v1 = (1, −1), v2 = (2, −1), v3 = (−3, 2) and


w1 = (1, 0), w2 = (0, 1), w3 = (1, 1). Is there a linear transforma-
tion T from R2 into R2 such that T (vi ) = wi for all i = 1, 2, 3?

Solution: Let T be a transformation such that T (vi ) = wi for


all i = 1, 2, 3. Since v3 = −v1 − v2 and T is a linear map, therefore

T (v3 ) = −T (v1 ) − T (v2 ),

i.e;
(1, 1) = −(1, 0) − (0, 1) = (−1, −1)
and so 1 = −1. This is a contradiction. Thus, there exists no linear
transformation such that T (vi ) = wi for all i = 1, 2, 3.

6.2 Algebra of Linear transformations


Theorem 6.2.1 Let V and W be any two vector spaces over field
F . Let T and U be any two linear transformations from V into W .
Then the function T + U defined by

(T + U )(v) = T (v) + U (v)

is a linear transformation from V into W . If c is any scalar, then


the map (cT ) defined by

(cT )(v) = cT (v)

is a linear transformation from V into W .

Proof: Let a, b ∈ F and v1 , v2 ∈ V . Then,

(T + U )[av1 + bv2 ] = T (av1 + bv2 ) + U (av1 + bv2 )


= [aT (v1 ) + bT (v2 )] + [aU (v1 ) + bU (v2 )]
= a(T + U )(v1 ) + b(T + U )(v2 )

(cT )(av1 + bv2 ) = c.T (av1 + bv2 )


= c[aT (v1 ) + bT (v2 )]
= a(cT )(v1 ) + b(cT )(v2 )

Hence (T + U ) and (cT ) are linear maps. 2


Let L(V, W ) be the set of all linear maps from a vector space
V (F ) into vector space W (F ). It is also denoted by HomF (V, W ).

Proposition 6.2.2 The set L(V, W ) together with addition and


scalar multiplication defined by

(T + U )(v) = T (v) + U (v)

and
(cT )(v) = c.T (v)
is a vector space over field F .

Proof: Let T1 , T2 , T3 ∈ L(V, W ). Then, we have

[(T1 + T2 ) + T3 ](v) = (T1 + T2 )(v) + T3 (v)


= [T1 (v) + T2 (v)] + T3 (v)
= T1 (v) + [T2 (v) + T3 (v)]
= T1 (v) + [(T2 + T3 )](v)
= [T1 + (T2 + T3 )](v)

for all v ∈ V . Hence, by equality of maps, we have

(T1 + T2 ) + T3 = T1 + (T2 + T3 )
Similarly we have the following:

1. (T1 + T2 ) + T3 = T1 + (T2 + T3 ), for all T1 , T2 , T3 in L(V, W ),

2. T + 0̂ = T = 0̂ + T , where 0̂(v) = 0 for all v ∈ V ,

3. for each T , there exists −T ∈ L(V, W ) such that (−T ) + T =


0̂ = T + (−T ), where −T is given by (−T )(v) = −T (v),

4. T + U = U + T for all T, U in L(V, W ),



5. (a + b)T = aT + bT and a(T + U ) = aT + aU for all a, b ∈ F


and T, U in L(V, W ),

6. (ab)T = a(bT ), for all a, b in F and T in L(V, W ),

7. 1T = T for all T in L(V, W ), where 1 ∈ F

This shows that L(V, W ) is a vector space. 2

Theorem 6.2.3 Let V and W be finite dimensional vector spaces over the same field F . If dim V = n and dim W = m, then L(V, W )
is finite dimensional and has dimension mn.

Proof: Let B = {v1 , v2 , . . . , vn } and B′ = {w1 , w2 , . . . , wm } be bases of V and W respectively.


By Theorem 6.1.11, for each pair of positive integers (i, j) with
1 ≤ i ≤ m and 1 ≤ j ≤ n, we get a unique linear transformation
T ij from V into W such that

T ij (vk ) = { wi if k = j ; 0 if k ≠ j } = δkj wi . . . (1)

Consider the set S = {T ij | 1 ≤ i ≤ m, 1 ≤ j ≤ n}. Clearly it


contains mn elements. Let T be any linear transformation from V into W . Suppose that

[T (vk )]B′ = (A1k , A2k , . . . , Amk )t
for each k = 1, 2, . . . , n. Then
T (vk ) = Σ_{i=1}^{m} Aik wi (6.2.1)
We claim that T = Σ_{i=1}^{m} Σ_{k=1}^{n} Aik T ik . For,

[ Σ_{i=1}^{m} Σ_{k=1}^{n} Aik T ik ](vj ) = Σ_{i=1}^{m} Σ_{k=1}^{n} Aik [T ik (vj )]
= Σ_{i=1}^{m} [ Σ_{k=1}^{n} Aik δjk ] wi (by (1))

i.e;
[ Σ_{i=1}^{m} Σ_{k=1}^{n} Aik T ik ](vj ) = Σ_{i=1}^{m} Aij wi . . . (2)
= T (vj ) (by Eq. 6.2.1)

This shows that


Σ_{i=1}^{m} Σ_{k=1}^{n} Aik T ik = T

Hence S spans L(V, W ). Thus, it is finite dimensional. Next, if


Σ_{i=1}^{m} Σ_{k=1}^{n} Aik T ik = 0̂

Then, by (2), we have


Σ_{i=1}^{m} Aij wi = [ Σ_{i=1}^{m} Σ_{k=1}^{n} Aik T ik ](vj ) = 0̂(vj ) = 0

for all j. By the linear independence of B′, Aij = 0 for all i and j. Thus, S is a linearly independent subset of L(V, W ). Hence S is a basis of L(V, W ) and dim L(V, W ) = mn. 2
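
Once bases are fixed, the T ij of the proof correspond to the mn matrix units Eij (the m × n matrices with a single entry 1). A short sketch enumerating them (assuming NumPy):

    import numpy as np

    m, n = 2, 3                  # dim W = m, dim V = n
    units = []
    for i in range(m):
        for j in range(n):
            E = np.zeros((m, n))
            E[i, j] = 1.0        # matrix of T^{ij}: sends v_j to w_i, other v_k to 0
            units.append(E)
    print(len(units))            # mn = 6, the dimension of L(V, W)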

Corollary 6.2.4 If V (F ) is finite dimensional and dim V = n,


then L(V, V ) is finite dimensional and dim L(V, V ) = n2 .

Note that elements of L(V, V ) are called linear operators.

Definition 6.2.5 Let V , W , and Z be vector spaces over same


field F . Let T : V → W and U : W → Z be linear transformations.
Then the composite function (U ∘ T ) : V → Z of T and U , defined by

(U ∘ T )(v) = U (T (v))

is a linear transformation. We frequently write U T instead of U ∘ T


and it is called the composite function of U and T . If Z = W = V ,
then U T is called the multiplication of linear operators U and
T . A linear operator I on V defined by I(v) = v for all v ∈ V , is
called an identity linear operator.

Proposition 6.2.6 Let V be a vector space over field F . Let


U, T1 , T2 be linear operators on V and c be any scalar. Then

1. IU = U = U I;

2. (U T1 )T2 = U (T1 T2 );

3. U (T1 + T2 ) = U T1 + U T2 ; (T1 + T2 )U = T1 U + T2 U ;

4. c(U T1 ) = (cU )T1 = U (cT1 ).

Proof: The proof is left for readers. 2


Thus, (L(V, V ), +, ·) is a vector space and (L(V, V ), +, ∘) is a ring with unity, with c(U T1 ) = (cU )T1 = U (cT1 ). Such structures are called algebras over F .

6.3 Linear functionals



Definition 6.3.1 Let V (F ) be a vector space. Then a map f :


V → F is called a linear functional if f (av1 +bv2 ) = af (v1 )+bf (v2 )
for all a, b ∈ F and v1 , v2 ∈ V .

Example 6.3.2 The function 0̂ : V → F defined by 0̂(v) = 0 for all v ∈ V is a linear functional. It is called the zero functional.

Example 6.3.3 A function f : F n → F defined by

f (x1 , x2 , . . . , xn ) = a1 x1 + a2 x2 + . . . + an xn

is a linear functional, where a1 , a2 , . . . , an are given scalars. Later, we shall observe that every linear functional on F n is of this form for
some scalars a1 , a2 , . . . , an ∈ F .
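
Concretely, such a functional is a dot product with the fixed vector of scalars. A one-line Python illustration (assuming NumPy; the scalars are made-up data):

    import numpy as np

    a = np.array([2., -1., 3.])       # fixed scalars a1, a2, a3
    f = lambda x: float(a @ x)        # f(x1, x2, x3) = 2x1 - x2 + 3x3
    print(f(np.array([1., 1., 1.])))  # 4.0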

Since F (F ) is a vector space, L(V, F ) = L(V, W ) with W = F is a vector space. This vector space is denoted by V ∗ ; hence V ∗ = L(V, F ).

Theorem 6.3.4 Let V (F ) be a finite dimensional vector space. If


dim V = n, then dim V ∗ = n.

Proof: Let B = (v1 , v2 , . . . , vn ) be a basis of V . By Theo-


rem 6.1.11, for each j = 1, 2, . . . , n there exists a unique linear map fj : V → F given by

fj (vi ) = { 1 if i = j ; 0 if i ≠ j } = δij

Consider B ∗ = {fj | 1 ≤ j ≤ n}. Then it has n elements. Let f be any linear functional. Consider Σ_{j=1}^{n} f (vj )fj . Then, for each i = 1, 2, . . . , n, we have

( Σ_{j=1}^{n} f (vj )fj )(vi ) = Σ_{j=1}^{n} f (vj )fj (vi )
= f (v1 )f1 (vi ) + f (v2 )f2 (vi ) + . . . + f (vi−1 )fi−1 (vi ) + f (vi )fi (vi ) + f (vi+1 )fi+1 (vi ) + . . . + f (vn )fn (vi )
= f (v1 )0 + f (v2 )0 + . . . + f (vi−1 )0 + f (vi )1 + f (vi+1 )0 + . . . + f (vn )0
= f (vi )
Since the linear maps Σ_{j=1}^{n} f (vj )fj and f agree on the basis, therefore

Σ_{j=1}^{n} f (vj )fj = f

Hence V ∗ is spanned by B ∗ , so V ∗ is a finite dimensional space. We now claim that B ∗ is linearly independent. For, suppose that

Σ_{j=1}^{n} aj fj = 0̂ . . . (1)

Then,

Σ_{j=1}^{n} aj fj (vi ) = 0̂(vi ) = 0 for all i = 1, 2, . . . , n

i.e;

Σ_{j=1}^{n} aj δji = 0 for all i = 1, 2, . . . , n

i.e;
ai = 0 for all i = 1, 2, . . . , n
Thus, eq. (1) holds only if all the coefficients are zero. Hence B ∗
is linearly independent and so B ∗ is a basis of V ∗ . Thus, dim V ∗ =
n = dim V . 2

Corollary 6.3.5 Let V be a finite dimensional vector space over


field F , then V ≅ V ∗ .

Proof: Any two finite dimensional vector spaces over the same field are isomorphic if and only if they have the same dimension. Thus, to prove the result it is sufficient to prove dim V = dim V ∗ , and this follows from Theorem 6.3.4. 2

Definition 6.3.6 Let V (F ) be a finite dimensional vector space


and B = {v1 , v2 , . . . , vn } be a basis of V . Then the basis B ∗ =
{f1 , f2 , . . . , fn } of V ∗ given by fi (vj ) = δij is called the dual basis of the given basis B.

Proposition 6.3.7 Let V (F ) be a finite dimensional vector space


and B = {v1 , v2 , . . . , vn } be a basis of V . Let B ∗ = {f1 , f2 , . . . , fn }
be the dual basis of the basis B. Then for each linear functional f on V we have

f = Σ_{i=1}^{n} f (vi )fi

and for each vector v ∈ V we have

v = Σ_{i=1}^{n} fi (v)vi (6.3.1)

Proof: Let f be a linear functional on V . Since B ∗ = {f1 , f2 , . . . , fn } is a basis of V ∗ , f is a linear combination of elements of B ∗ , say f = Σ_{i=1}^{n} ai fi . Then

f (vj ) = Σ_{i=1}^{n} ai fi (vj ) = Σ_{i=1}^{n} ai δij = aj

for each j. Hence


f = Σ_{i=1}^{n} f (vi )fi

Next, let v ∈ V . Then v = Σ_{i=1}^{n} ai vi and so

fj (v) = Σ_{i=1}^{n} ai fj (vi ) = Σ_{i=1}^{n} ai δij = aj

for each j. Hence


v = Σ_{i=1}^{n} fi (v)vi

2

Corollary 6.3.8 Let V (F ) be a finite dimensional vector space


and B = {v1 , v2 , . . . , vn } be a basis of V . Let B ∗ = {f1 , f2 , . . . , fn }
be the dual basis of the basis B. Then fi is precisely the function which
assigns to each vector v ∈ V , the i-th coordinate of v relative to
given basis B.
Proof: The proof follows immediately from equation (6.3.1). 2
Thus, the coordinates of v relative to the ordered basis B give the dual basis, i.e; if v = Σ_{i=1}^{n} ai vi , where {v1 , v2 , . . . , vn } is an ordered basis, then the coefficient ai of vi gives the element fi of the dual basis corresponding to the element vi for each i = 1, 2, . . . , n. Thus, we may call the fi the coordinate functions for the basis B.
Next, let v = x1 v1 + x2 v2 + . . . + xn vn . If f ∈ V ∗ , then f (vj ) is a scalar for each j. Let aj = f (vj ). Then

f (v) = x1 f (v1 ) + x2 f (v2 ) + . . . + xn f (vn ) = a1 x1 + a2 x2 + . . . + an xn

Thus, each linear functional f on F n is given by an expression of the form a1 x1 + a2 x2 + . . . + an xn , where ai = f (ei ) and ei = (0, 0, . . . , 1, . . . , 0) with 1 in the i-th place.

Example 6.3.9 Let B = {v1 , v2 , v3 } be the basis of R3 given by


v1 = (1, 0, −1), v2 = (1, 1, 1), v3 = (2, 2, 0). Let (x, y, z) ∈ R3 . Now, as we know, the coefficient ai of vi in the linear combination v = Σ_{i=1}^{n} ai vi gives fi in the dual basis for each i. Thus, to obtain
the dual basis it is sufficient to express v = (x, y, z) as a linear
combination of elements of B. To do this, we suppose that
(x, y, z) = av1 + bv2 + cv3 = a(1, 0, −1) + b(1, 1, 1) + c(2, 2, 0) . . . (1)

This gives

a + b + 2c = x
b + 2c = y
−a + b = z

Solving these, we have


a = x − y,   b = x − y + z,   c = (−x + 2y − z)/2

Hence, the dual basis B ∗ = {f1 , f2 , f3 } of the basis B is given by

f1 (x, y, z) = a = x − y
f2 (x, y, z) = b = x − y + z
f3 (x, y, z) = c = (−x + 2y − z)/2
Alternate method: We consider the augmented matrix

(A|B) = ( 1    1   2  :  x
          0    1   2  :  y
          −1   1   0  :  z )

       ∼ ( 1    0   0  :  x − y
           0    1   2  :  y
           −1   1   0  :  z )   R1 → R1 − R2

       ∼ ( 1   0   0  :  x − y
           0   1   2  :  y
           0   1   0  :  x − y + z )   R3 → R3 + R1

       ∼ ( 1   0   0  :  x − y
           0   0   2  :  −x + 2y − z
           0   1   0  :  x − y + z )   R2 → R2 − R3

Hence, the solution to equation (1) is a = x − y, b = x − y + z, c = (−x + 2y − z)/2.
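
Both methods amount to inverting the matrix P whose columns are the basis vectors: the rows of P −1 are exactly the dual-basis functionals. A SymPy sketch of this example (assuming SymPy is available):

    import sympy as sp

    P = sp.Matrix([[ 1, 1, 2],
                   [ 0, 1, 2],
                   [-1, 1, 0]])      # columns: v1, v2, v3
    x, y, z = sp.symbols('x y z')
    coords = P.inv() * sp.Matrix([x, y, z])
    print(sp.simplify(coords))       # (x - y, x - y + z, (-x + 2y - z)/2)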

Example 6.3.10 Consider the vector space R3 (R) with basis {v1 , v2 , v3 },
where v1 = (1, 0, 1), v2 = (0, 1, −2) and v3 = (−1, −1, 0). Suppose
that f is a linear functional such that f (v1 ) = 1, f (v2 ) = −1 and
f (v3 ) = 3 and we wish to determine f (a, b, c). Suppose that

(a, b, c) = x(1, 0, 1) + y(0, 1, −2) + z(−1, −1, 0) . . . (1)

Then x − z = a, y − z = b, x − 2y = c. Solving these, we have

x = 2a − 2b − c, y = a − b − c, z = a − 2b − c . . . (2)

Since f is a linear functional, hence (1) gives

f (a, b, c) = x.1 + y.(−1) + z.3 = x − y + 3z


6.3. LINEAR FUNCTIONALS 189

Putting the values of x, y, z from (2), we have


f (a, b, c) = 2a − 2b − c − (a − b − c) + 3(a − 2b − c) Hence

f (a, b, c) = 4a − 7b − 3c

In particular f (1, 2, 4) = 4 − 7 × 2 − 3 × 4 = −22.

Exercise 6.3.11 Find the dual basis of the basis {v1 , v2 , v3 } of R3 ,


where v1 = (1, 0, 1), v2 = (0, 1, −2) and (−1, −1, 0)

Exercise 6.3.12 Find the dual basis of basis

{(1, −2, 3), (1, −1, 1), (2, −4, 7)}

of R3 (R).
Answer: {f1 , f2 , f3 }, where f1 (x, y, z) = −3x−5y−2z, f2 (x, y, z) =
2x + y and f3 (x, y, z) = x + 2y + z.

Exercise 6.3.13 Find the dual basis of basis

{(−1, 1, 1), (1, −1, 1), (1, 1, −1)}

of R3 (R).
Answer: {f1 , f2 , f3 }, where f1 (x, y, z) = (y + z)/2, f2 (x, y, z) =
(x + z)/2 and f3 (x, y, z) = (x + y)/2.

Exercise 6.3.14 Find the dual basis of basis

{(1, −1, 3), (0, 1, −1), (0, 3, −2)}

of R3 (R).

Solution:
 
1 0 0 : x
(A|B) = −1 1 3 : y
3 −1 −2 : z

 
1 0 0 : x
∼ 0 1 3 : x+y  R2 → R2 + R1 , R3 → R3 − 3R1
0 −1 −2 : −3x + z
190 CHAPTER 6. LINEAR TRANSFORMATIONS
 
1 0 0 : x
∼ 0 1 3 : x+y  R3 → R3 + R1
0 0 1 : −2x + y + z
 
1 0 0 : x
∼ 0 1 0 : 7x − 2y − 3z  R2 → R2 − 3R3
0 0 1 : −2x + y + z

Thus, dual basis {f1 , f2 , f3 } is given by


f1 (x, y, z) = x
f2 (x, y, z) = 7x − 2y − 3z
f3 (x, y, z) = −2x + y + z
2

Exercise 6.3.15 Let {v1 v2 , v3 } be a basis of R3 , where v1 = (1, 0, 1),


v2 = (0, 1, −2) and v3 = (−1, −1, 0).
1. If f is a linear functional on R3 such that f (v1 ) = 1, f (v2 ) =
−1 and f (v3 ) = 2 and if v = (a, b, c) then find f (v).
2. Describe explicitly a linear functional f on R3 such that f (v1 ) =
f (v2 ) = 0 but f (v3 ) 6= 0.
3. Let f be any linear functional such that f (v1 ) = f (v2 ) = 0
but f (v3 ) 6= 0. If v = (2, 3, −1), show that f (v) 6= 0.
Exercise 6.3.16 Let V be the vector space of polynomial functions
p(x) over R of degree at most 2. Define three linear functionals on
V by
Z 1 Z 2 Z −1
f1 (p) = p(x)dx, f2 (p) = p(x)dx, f3 (p) = p(x)dx
0 0 0
Show that {f1 , f2 , f3 } is a basis for V ∗ by exhibiting the basis for
V of which it is the dual.
Exercise 6.3.17 Find the dual basis of basis
{(1, 0, 1), (1, −1, 2), (0, 0, 1)
of R3 .
Exercise 6.3.18 Find the dual basis of basis
{(1, 0, 1, 1), (1, −1, 0, 0), (0, 0, 1, 0), (0, 1, 0, −1)
of R4 .
6.4. ANNIHILATORS 191

6.4 Annihilators
Definition 6.4.1 If V is a vector space ove field F and S is a
subset of V . Then the annihilator of S is the set of all linear
functionals f on V such that f (s) = 0 for all s ∈ S. It is denoted
by S 0 . Thus,

S 0 = {f ∈ V ∗ | f (s) = 0∀s ∈ S}

Example 6.4.2 Since f (0) = 0 for every f ∈ V ∗ . Therefore

{0}0 = {f ∈ V ∗ |f (0) = 0} = V ∗

Example 6.4.3 V 0 = {f ∈ V ∗ | f (v) = 0∀v ∈ V } = {0̂}.

Proposition 6.4.4 If S is a subset of a vector space V (F ). Then,


S 0 is a subspace of V ∗ .

Proof: Since 0̂(s) = 0 for every s ∈ S, therefore 0̂ ∈ S 0 . Hence


S 0 =6= ∅. Let f, g ∈ S 0 and a, b ∈ F . Then f (s) = 0 = g(s) for all
s ∈ S and so

(af + bg)(s) = af (s) + bg(s) = a0 + b0 = 0

for all s ∈ S. Therefore, af + bg ∈ S 0 . Hence S 0 is a subspace of


V ∗. 2

Exercise 6.4.5 Let V (F ) be a vector space. Then, we have the


following:

1. A ⊂ B ⇒ B 0 ⊂ A0 .

2. If S is a subset of V then S 0 = (L(S)0 ).

Theorem 6.4.6 Let V be a finite dimensional vector space over


field F , and let W be a subspace of V . Then

dim W + dim W 0 = dim V

Proof: Let W be a subspace of a finite dimensional vector


space V over field F with dim V = n. Then W is finite dimensional.
Suppose that {w1 , w2 , . . . , wm } be a basis of W , where dim W =
m. Then it will be linearly independent subset of V and so it
192 CHAPTER 6. LINEAR TRANSFORMATIONS

can be extended to a basis B = {w1 , w2 , . . . , wm ; vm+1 , . . . , vn } of


V . Let B ∗ = {f1 , f2 , . . . , fn }. Then fj (wk ) = δjk = 0 for all
j = m + 1, m + 2, . . . , n and k = 1, 2, . . . , m. Then
m
X
fj (a1 w1 + a2 w2 + . . . + am wm ) = ak fj (wk ) = 0
k=1

for all j = m + 1, m + 2, . . . , n. Hence fj ∈ W 0 for all j = m +


1, m+2, . . . , n. Consider the set {fm+1 , fm+2 , . . . , fn }. It is linearly
independent because it is a subset of B ∗ . We claim that it spans
W 0 . Let f ∈ W 0 . Then f ∈ V ∗ and so
m
X n
X
f= f (wk )fk + f (vj )fj
k=1 j=m+1

But f ∈ W 0 so f (wk ) = 0 for all k = 1, 2, . . . m, hence the above


equation becomes
X n
f= f (vj )fj
j=m+1

Thus, {fm+1 , fm+2 , . . . , fn } is a basis of W 0 containing n − m ele-


ments. Then

dim W 0 = n − m = dim V − dim W

Hence proved. 2
As we have observed in the Example 6.1.2 (section of Rank-
Nullity theorem) that every non -zero functional f determines a
hyperspace of dimension n − 1. Thus, hyperspace corresponding to
each fj , m + 1 ≤ j ≤ n is of dimension n − 1.

Corollary 6.4.7 If W is a m-dimensional subspace of an n-dimensional


vector space V , then W is the intersection of n − m hyperspaces.

Corollary 6.4.8 If W is a m-dimensional subspace of an n-dimensional


vector space V , then W ∗ ' V ∗ /W 0 .

Proof: Since dim W ∗ = dim W = m and dim V ∗ /W 0 = dim V ∗ −


dim W 0 = n − (n − m) = m and any two vector spaces of same
dimension are isomorphic, therefore W ∗ ' V ∗ /W 0 . 2
6.4. ANNIHILATORS 193

Proposition 6.4.9 If W1 and W2 are subspaces of a finite dimen-


sional vector space then W1 = W2 ⇔ W10 = W20 .
Proof: If W1 = W2 then W1 ⊆ W2 and W2 ⊆ W1 . Hence, W20 ⊆
W10 and W10 ⊆ W20 and so W10 = W20 . Conversely suppose that
W10 = W20 . We claim that W1 = W2 . Suppose that W1 6= W2 .
Then there w1 ∈ W1 but w1 ∈ / W2 . There exists a linear functional
f such that f (w2 ) = 0 for all w ∈ W2 but f (v) 6= 0. Then f ∈ W20
/ W10 and so W10 6= W20 . Thus, W10 = W20 implies W1 = W2 .
but f ∈
2

Proposition 6.4.10 If W1 and W2 are subspaces of a finite di-


mensional vector space then (W1 + W2 )0 = W10 ∩ W20 .
Proof: Since W1 , W2 ⊆ (W1 + W2 ), therefore (W1 + W2 )0 ⊆
W10 , W20 and so

(W1 + W2 )0 ⊆ W10 ∩ W20 . . . (1)

Next, let f ∈ W10 ∩ W20 then f ∈ W10 and f ∈ W20 . Let


v ∈ W1 + W2 , then v = w1 + w2 for some w1 ∈ W1 and w2 ∈ W2 ,
then f (w1 ) = 0 and f (w2 ) = 0. Hence f (v) = f (w1 ) + f (w2 ) = 0.
Then f ∈ (W1 + W2 )0 . Thus

(W10 ∩ W20 ) ⊆ (W10 ∩ W20 ) . . . (2)

Combining (1) and (2), we have

(W1 + W2 )0 = W10 ∩ W20

Exercise 6.4.11 Find the subspace of R4 annihilated by linear


functionals

f1 (x1 , x2 , x3 , x4 ) = x1 + 2x2 + 2x3 + x4


f2 (x1 , x2 , x3 , x4 ) = 2x2 + x4
f3 (x1 , x2 , x3 , x4 ) = −2x1 − 4x3 + 3x4

Solution: Let W be the space annihilated by these functionals.


Then (x1 , x2 , x3 , x4 ) ∈ W if and only if f1 (x1 , x2 , x3 , x4 ) = 0,
f2 (x1 , x2 , x3 , x4 ) = 0 and f3 (x1 , x2 , x3 , x4 ) = 0,
194 CHAPTER 6. LINEAR TRANSFORMATIONS

i.e; (x1 , x2 , x3 , x4 ) ∈ W if and only if (x1 , x2 , x3 , x4 ) is the common


solution of the system of homogeneous equations given by

x1 + 2x2 + 2x3 + x4 = 0
2x2 + x4 = 0
−2x1 − 4x3 + 3x4 = 0
 
1 2 2 4
Consider the coefficient matrix A =  0 2 0 1. Now
−2 0 −4 3
 
1 2 2 4
A =  0 2 0 1
−2 0 −4 3
 
1 2 2 4
∼ 0 2 0 1  R3 → R3 + 2R1
0 4 0 11
 
1 2 2 4
∼ 0 2 0 1 R3 → R3 − 2R2
0 0 0 9

 
1 0 2 3
R1 → R1 − R2
∼ 0 2 0 1
R3 → 19 R3
0 0 0 1
 
1 0 2 0
R1 → R1 − 3R3
∼ 0 2 0 0
R2 → R2 − R3
0 0 0 1
From this it follows that the common solution is x4 = 0, x2 = 0 and
x1 + 2x3 = 0, i.e; x1 = −2x3 . Thus, W = {(−2x3 , 0, x3 , 0) | x3 ∈
R}. 2
From the above example it also follows that

h{f1 , f2 , f3 }i = h{g1 , g2 , g3 }i

where

g1 (x1 , x2 , x3 , x4 ) = x1 + 2x3
g2 (x1 , x2 , x3 , x4 ) = 2x2
g3 (x1 , x2 , x3 , x4 ) = x4
6.4. ANNIHILATORS 195

Exercise 6.4.12 Find the annihilators of the subspace W of R5


spanned by vectors w1 = (2, −2, 3, 4, −1), w2 = (−1, 1, 2, 5, 2),
w3 = (0, 0, −1, −2, 3) and w4 = (1, −1, 2, 3, 0).


Solution: Let f ∈ W 0 then f ∈ R5 and so it will be of the
form

f (x1 , x2 , x3 , x4 , x5 ) = c1 x1 + c2 x2 + . . . + c5 x5 (1)

where c1 , c2 , .., c5 are scalars. Since f ∈ W 0 hence f (wi ) = 0 for


all i = 1, 2, 3, 4 and so

   
 c 0
4 −1  1   

2 −2 3
−1 1 2 5 2
c
  0
2
  c3  = 0
0 0 −1 −2 3     
c4  0
1 −1 2 3 0
c5 0

But

 
2 −2 3 4 −1
−1 1 2 5 2
 
0 0 −1 −2 3 
1 −1 2 3 0
 
1 −1 2 3 0
−1 1 2 5 2
∼   R1 ↔ R4
0 0 −1 −2 3 
2 −2 3 4 −1
 
1 −1 2 3 0
0 0 4 8 2 R2 → R2 + R1
∼ 
0 0 −1 −2 3 

R4 → R4 − 2R1
0 0 −1 −2 −1
196 CHAPTER 6. LINEAR TRANSFORMATIONS
 
1 −1 2 3 0
0 0 1 2 21  R2 → 14 R2
∼  
0 0 −1 −2 3  R4 → R4 − R3
0 0 0 0 4
 
1 −1 2 3 0
0 0 1 2 21 
∼   R3 → R3 + R2
0 0 0 0 72 
0 0 0 0 1
 
1 −1 2 3 0
0 0 1 2 21 
∼ 
0
 R3 → 27 R3
0 0 0 1
0 0 0 0 1
 
1 −1 2 3 0
0 0 1 2 0 R2 → R2 − 12 R3
∼  
0 0 0 0 1 R4 → R4 − R3
0 0 0 0 0
 
1 −1 0 −1 0
0 0 1 2 0
∼ 
0
 R1 → R1 − 2R2
0 0 0 1
0 0 0 0 0

Thus, common solution is c5 = 0, c3 + 2c4 = 0 and c1 − c2 − c4 = 0.


Take c2 = a and c4 = b then c3 = −2b and c1 = a + b. Thus,

f (x1 , x2 , x3 , x4 , x5 ) = (a + b)x1 + ax2 − 2bx3 + bx4

where a and b are arbitrary scalars. Hence

W 0 = {f ∈ V ∗ | f (x1 , x2 , x3 , x4 , x5 ) = (a + b)x1 + ax2 − 2bx3 + bx4 }

Clearly dim W = 3 and hence dim W 0 = 5 − 3 = 2. The basis of


W 0 is given by

f1 (x1 , x2 , x3 , x4 , x5 ) = x1 + x2 (by taking a = 1, b = 0)

and

f2 (x1 , x2 , x3 , x4 , x5 ) = x1 − 2x3 + x4 (by taking a = 0, b = 1)

2
6.4. ANNIHILATORS 197

Theorem 6.4.13 Let V (F ) be a finite dimensional vector space.


For each v ∈ V , define Lv : V ∗ → F by

Lv (f ) = f (v)

Then Lv ∈ V ∗∗ and the map L : V → V ∗∗ defined by L(v) = Lv is


an isomorphism.

Proof: Let a, b ∈ F and f, g ∈ V ∗ . Then

[Lv (af + bg)] = (af + bg)(v)


= af (v) + bg(v)
= aLv (f ) + bLv (g)

Hence Lv ∈ V ∗∗ . Define a map L : V → V ∗∗ by

L(v) = Lv

Then

[L(av + bw)](f ) = Lav+bw (f )


= f (av + bw)
= af (v) + bf (w)
= aLv (f ) + bLw (f )
= [aLv + bLw ](f )

Hence L(av + bw) = aLv + bLw = aL(v) + bL(w), i.e; L is linear.


Now

L(v) = 0̂ ⇔ L(v)(f ) = 0∀f ∈ V ∗ ⇔ f (v) = 0∀f ∈ V ∗ ⇔ v = 0

Hence L is one-one. Since dim V ∗∗ = dim V ∗ = dim V and V is


finite dimensional, therefore L is an isomorphism. 2

Corollary 6.4.14 Let V (F ) be finite dimensional vector space. If


Q is a linear functional on V ∗ then there is a unique vector v ∈ V
such that
Q(f ) = f (v)
for every f ∈ V ∗ .
198 CHAPTER 6. LINEAR TRANSFORMATIONS

Proof: Since L : V → V ∗∗ is a natural isomorphism given by


L(v) = Lv . Thus, for given Q ∈ V ∗∗ there exists v ∈ V such
that L(v) = Q. Hence, Q(f ) = L(v)(f ) = Lv (f ) = f (v) for each
f ∈ V ∗. 2

Corollary 6.4.15 Let V be a finite dimensional vector space over


field F . Then each basis for V ∗ is the dual of some basis for V .

Proof: Let B ∗ = {f1 , f2 , ..., fn } be a basis for V ∗ . Let B ∗∗ =


{Q1 , Q2 , . . . , Qn } be the dual basis of B ∗ . Then
Qi (fj ) = δij
Using corollary above, for each Qi there exists a vector vi such that
Qi (f ) = f (vi ) for all f ∈ V ∗ . In particular, we have
fj (vi ) = Qi (fj ) = δij
This shows that B ∗ is the dual of B = {v1 , v2 , . . . , vn }. 2
Thus, V and V ∗ are dual of each other. Thus, we can identify
V and V ∗∗ up to an isomorphism L.

Definition 6.4.16 Let S be a subset of finite dimensional vector


space V over a field F . Then, annihilator of S 0 , i.e; S 00 is defined
as
S 00 = {v ∈ V | f (v) = 0∀ f ∈ S 0 }

Proposition 6.4.17 Let W be a subspace of a finite dimensional


vector space V over field F . Then W = W 00 .

Proof: By definition of W 00 , we have

W 00 = {v ∈ V | f (v) = 0∀ f ∈ W 0 }
Let w ∈ W then f (w) = 0 for all f ∈ W 0 and so w ∈ W 00 . Thus,
W ⊆ W 00
But
dim W + dim W 0 = dim V
dim W 0 + dim W 00 = dim V ∗
and dim V = dim V ∗ . Therefore dim W = dim W 00 . Since W ⊆
W 00 and hence W = W 00 . 2
6.4. ANNIHILATORS 199

Proposition 6.4.18 Let W1 , W2 be any two subspaces of a finite


dimensional vector space V over F . Then

(W1 ∩ W2 )0 = W10 + W20

Proof: Since (W1 + W2 )0 = W10 ∩ W20 . Replacing W1 , W2 by


their annihilators, we have

(W10 + W20 )0 = W100 ∩ W200

Since W 00 = W , therefore we have

(W10 + W20 )0 = W1 ∩ W2

Taking annihilator of both sides, we have

(W10 + W20 )00 = (W1 ∩ W2 )0

i.e;
W10 + W20 = (W1 ∩ W2 )0
2

Exercises
Exercise 6.4.19 Determine which of the following maps T : R3 →
R3 are linear:
1. T (x, y, z) = (y, z, 0).

2. T (x, y, z) = (x − y, y + z, x).

3. T (x, y, z) = (x − y − z, x + y + z, z).

4. T (x, y, z) = (x − y, z, |x|).

Exercise 6.4.20 Let Mn×n (R) denotes the vector space of all n ×
n over R. Let B ∈ Mn×n (R) be a non-zero matrix. Prove that the
map T : Mn×n (R) → Mn×n (R) defined by T (X) = XB + BX is
a linear transformation.

Exercise 6.4.21 Prove that the map T : P2 (x) → P3 (x) defined


by T (p(x)) = p(0) + xp(x) is a linear transformation but the map
T1 : P2 (x) → P3 (x) defined by T1 (p(x)) = 1 + xp(x) is not a linear
transformation.
200 CHAPTER 6. LINEAR TRANSFORMATIONS

Exercise 6.4.22 Determine the explicit formula for a linear trans-


formation T : R3 → R3 for which T (1, 1, 0) = (0, 0, 1), T (0, 1, 0) =
(1, 0, 0, 0) and T (0, 1, −1) = (0, 1, −1).

Exercise 6.4.23 Find the basis of range space and null space of
the linear transformation T : R4 → R3 given by

T (x, y, z, u) = (x + y, y − z, x).

Exercise 6.4.24 Find the basis of range space and null space of
the linear transformation T : R3 → R3 given by

T (x, y, z) = (x − y, y − z, x − z).

Exercise 6.4.25 Is there a linear transformation T from R3 into


R2 such that T (1, −1, 1) = (1, 0) and T (1, 1, 1) = (0, 1)?

Exercise 6.4.26 Let v1 = (1, 2), v2 = (2, 1), v3 = (3, 3) and w1 =


(1, 0), w2 = (0, 1), w3 = (1, −1). Is there a linear transformation
T from R2 into R2 such that T (vi ) = wi for all i = 1, 2, 3?

Exercise 6.4.27 Let T be a linear operator on R3 given by

T (a, b, c) = (3a, a − b, 2a + b + c), ∀(a, b, c) ∈ R3

Is T invertible? If so, find a rule for T −1 like the one which defines
T.

Exercise 6.4.28 Find the dual basis of basis containing vectors


(1, −2, 3), (1, −1, 1) and (2, −4, 5) of R3 (R).

Exercise 6.4.29 Find the dual basis of basis containing vectors


(1, 2, 1), (1, 1, 2) and (2, 1, 1) of R3 (R).

Exercise 6.4.30 Find the dual basis of basis containing vectors


(1, −1, 0), (0, 1, −1) and (1, 0, −1)} of R3 (R).

Exercise 6.4.31 Find the annihilators of the subspace W of R4


spanned by vectors w1 = (2, −2, 3, −1), w2 = (−1, 1, 2, 1) and w3 =
(0, 0, −1, −2).

2
Chapter 7

Matrix Representations

This chapter deals the relationship between matrices and Linear


transformations.
Let V be an n-dimensional vector space over F and W be m-
dimensional vector space over F . Let B = {v1 , v2 , . . . , vn } and
B 0 = {w1 , w2 , . . . , wn } be ordered bases for V and W respectively.
Let T be a linear transformation from V into W . Then, for each
j, 1 ≤ j ≤ n; the vector T (vj ) is uniquely expressed as a linear
combination
m
X
T (vj ) = A1j w1 + A2j w2 + . . . + Amj wm = Aij wi (7.0.1)
i=1

Hence coordinates of T (vj ) relative to B 0 are A1j , A2j , . . . , Amj


and then coordinate matrix is
 
A1j
 A2j 
[T (vj )]B0 = .  (7.0.2)
 
 .. 
Amj

for each 1, 2, . . . , n. Clearly these mn scalars Aij , 1 ≤ i ≤ m,


1 ≤ j ≤ n determine T completely.
The m × n matrix A = (Aij ) whose j-th column is the coor-
dinate matrix of T (vj ) relative to ordered basis B 0 is called the
matrix of T relative to the pair of ordered bases B and B 0 .

201
202 CHAPTER 7. MATRIX REPRESENTATIONS

It is denoted by [T ]B,B0 . Thus,


 
A11 A12 ... A1n
 A21 A22 ... A2n 
[T ]B,B0 =  . (7.0.3)
 
.. .. .. 
 .. . . . 
Am1 Am2 . . . Amn

Let v = x1 v1 + x2 v2 + . . . + xn vn in V . Then
 
Xn
T (v) = T  xj vj 
j=1
n
X
= xj T (vj )
j=1
Xn m
X
= xj Aij wi
j=1 i=1

i.e;
 
m
X Xn
T (v) =  Aij xj  wi (7.0.4)
i=1 j=1

Thus,
 Pn
A1j xj

Pj=1
 nj=1 A2j xj 
 Pn 
[T (v)]B0 =  j=1 A3j xj  (7.0.5)
 
 .. 

Pn . 
j=1 Amj xj

and so
  
A11 A12 ... A1n x1
 A21 A22 ... A2n   x2 
 
[T (v)]B0 =  .
 
.. .. ..   .. 
 .. . . .  . 
Am1 Am2 . . . Amn xn

i.e;

[T (v)]B0 = [T ]B,B0 [v]B (7.0.6)


203

Thus, if X = [v]B then [T ]B,B0 X is the coordinate matrix of


T (v) relative to ordered basis B 0 .
Conversely suppose that A = (Aij ) be any given m × n matrix.
Define T : V → W by
   
Xn X n Xn
T xj vj  =  Aij xj  wi
j=1 i=1 j=1

Then, T is a linear transformation such that equation 7.0.6


holds. Next, let a ∈ F . By equation (i), we have
  
Xn Xn
(aT )(v) = a  Aij xj  wi
i=1 j=1

Hence,

[aT ]B0 = a.[T ]B,B0 [v]B i.e; [aT ]B,B0 = a.[T ]B,B0

Similarly, if U be any other linear transformation then



[(T + U )(v)]B0 = [T ]B,B0 + [U ]B,B0 [v]B

i.e;
(T + U )B,B0 = [T ]B,B0 + [U ]B,B0
Thus,

[(aT + bU )]B,B0 = a[T ]B,B0 + b[U ]B,B0 (7.0.7)

for all a, b ∈ F .
Hence, we have the following:

Theorem 7.0.1 (Matrix Representation Theorem of Lin-


ear Transformations) Let V be an n-dimensional vector space
over field F and W be an m-dimensional vector space over same
field F . Let B, B 0 be ordered bases V and W respectively. Then,
there exists a natural isomorphism I : L(V, W ) → MM ×n (F ) given
by I(T ) = [T ]B,B0

If T is a linear operator on V and B = B 0 then we shall denote


[T ]B,B by [T ]B and call matrix of T relative to B.
204 CHAPTER 7. MATRIX REPRESENTATIONS

Example 7.0.2 Consider the linear transformation T : R3 → R2


given by
T (x, y, z) = (x + y + z, x − y)
Consider B = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} and B 0 = {(1, 0), (0, 1)}
be standard bases of R3 and R2 respectively. Then

T (1, 0, 0) = (1, 0) = 1(1, 0) + 0(0, 1)


T (0, 1, 0) = (1, −1) = 1.(1, 0) − 1(0, 1)
T (0, 0, 1) = (1, 0) = 1.(1, 0) + 0(0, 1)

Hence    
1 1
[T (1, 0, 0)]B = , [T (0, 1, 0)]B =
0 −1
and  
1
[T (0, 0, 1)]B =
0
Thus, matrix of T relative to B and B 0 is
 
1 1 1
[T ]B,B0 =
0 −1 0

Working rule for finding matrix of T

To find the matrix of T relative given ordered bases, first we de-


termine the coefficients x1 , x2 , . . . , xn in the linear combination

T (v) = x1 w1 + x2 w2 + . . . + xm wm

for general vector v, i.e; first we determine the general formula for
T (v) in terms of basis B 0 .
After determining these coefficients, we evaluate it for basis
vectors v = vj , 1 ≤ j ≤ n and then determine [T (vj )]B0 . Thus,

[T ]B, B0 == [[T (v1 )]B0 [T (v2 )]B0 , . . . [T (vn )]B0 ]

We illustrate it by taking an example:

Example 7.0.3 Consider the linear operator T : R3 → R3 defined


by
T (x, y, z) = (3x + z, −2x + y, −x + 2y + 4z)
205

Let the ordered basis be B = {v1 , v2 , v3 }, where v1 = (1, 0, 1), v2 =


(−1, 2, 1) and v3 = (2, 1, 1). To find [T ]B , we proceed it as below:
Suppose that
T (x, y, z) = (a, b, c) = x(1, 0, 1) + y(−1, 2, 1) + z(2, 1, 1) . . . (1)
Then
x − y + 2z = a
2y + z = b
x+y+z = c
Solving these, we have
−a−3b+5c

x = 4 , 
y = (b − z)/2 = −a+b+c
4 . . . (2)
a+b−c
z =

2

By (1) and (2), we have

−a − 3b + 5c −a + b + c a+b−c
T (x, y, z) = (a, b, c) = v1 + v2 + v3
4 4 2
Hence, we have
17 −3 −1
T (v1 ) = T (1, 0, 1) = (4, −2, 3) = v1 + v2 + v3
4 4 2
35 15 −7
T (v2 ) = T (−1, 2, 1) = (−2, 4, 9) = v1 + v2 + v3
4 4 2
22 −6
T (v3 ) = T (2, 1, 1) = (7, −3, 4) = v1 + v2 + 0.v3
4 4
and so  17   35 
4 4
   
−3 15
   
[T (v1 )]B = 
 4
 , [T (v2 )]B = 
  4
,

   
−1 −7
2 2
and  22 
4
 
−6
 
[T (v3 )]B = 
 4


 
0
206 CHAPTER 7. MATRIX REPRESENTATIONS

Thus,
 17 35 22 
4 4 4
 
−3 15 −6
 
[T ]B = [[T (v1 )]B [T (v2 )]B [T (v3 )]B ] = 
 4 4 4


 
−1 −7
2 2 0

Remark 7.0.4 We may also solve the above system of equations,


by using matrix operations. For, Consider the augmented matrix
 
1 −1 2 : a
(A|B) = 0 2 1 : b 
1 1 1 : c

Then
 
1 −1 2 : a
(A|B) = 0 2 1 : b 
1 1 1 : c
 
1 −1 2 : a
∼ 0 2 1 : b  R3 → R3 − R1
0 2 −1 : c − a
 
1 −1 2 : a
∼ 0 2 1 : b  R3 → R3 − R2
0 0 −2 : c − a − b

This gives −2z = c − a − b, 2y + z = b and x − y + 2z = a, i.e;


a+b−c −a + b + c −a − 3b + 5c
z= , y = (b − z)/2 = ,x= .
2 4 4
Exercise 7.0.5 Find the matrix of linear operator T on R2 rela-
tive to its standard basis, where T (x, y) = (0, x)

Exercise 7.0.6 Find the matrix of differential operator D on P2 x,


the space of polynomials over R of degree at most 2, relative to its
d
ordered basis {2, 1 + x, x2 }, where D(f (x)) = f (x).
dx
Exercise 7.0.7 Let T be a linear transformation from R3 into R2
defined by
T (x1 , x2 , x3 ) = (x1 + x2 , 2x3 − x1 ).
207

(a) If B is the standard ordered basis of R3 and B 0 is the standard


ordered basis of R2 , then find the matrix of T relative to ordered
bases B, B 0 .
(b) If B = {v1 , v2 , v3 } and B 0 = {w1 , w2 }, where

v1 = (1, 0, −1), v2 = (1, 1, 1), v3 = (1, 0, 0), w1 = (0, 1), w2 = (1, 0)

what is the matrix of T relative to ordered bases B, B 0 ?

Exercise 7.0.8 Let T be a linear operator on R2 defined by

T (x, y) = (−y, x)

(a) what is the matrix of T relative to standard ordered basis of


R2 ?
(b) what is the matrix of T relative to ordered basis
B = {(1, 2), (1, −1)}?.
(c) Prove that for every a ∈ R, (T − aI)is invertible.

Exercise 7.0.9 Let T be a linear operator on R3 defined by

T (x1 , x2 , x3 ) = (x1 + x2 , x2 + x3 , x1 + x3 )

Prove that T is invertible. Also find explicit formula for T −1 like


T.

Exercise 7.0.10 Let T be a linear operator on R3 defined by

T (x, y, z) = (x + z, −2x + y, −x + 2y + z)

Determine the matrix of T relative to ordered basis


{(1, 0, 1), (−1, 1, 1), (0, 1, 1)}.

Exercise 7.0.11 Let T be a linear operator on R3 defined by

T (x, y, z) = (z, y + z, x + y + z).

Find the matrix of T relative to ordered basis


{(1, 0, 1), (−1, 2, 1), (2, 1, 1)}.

Exercise 7.0.12 Let T be a linear operator on R2 defined by

T (x, y) = (2y, 3x − y).

Find the matrix of T relative to ordered basis


{(1, 3), (2, 5)}.
208 CHAPTER 7. MATRIX REPRESENTATIONS

Exercise 7.0.13 Let T be a linear operator on R2 defined by


T (x, y) = (4x − 2y, 2x + y).
Find the matrix of T relative to ordered basis B = {(1, 1), (−1, 0)}.
Also verify that [T (v)]B = [T ]B [v]B .
Proposition 7.0.14 Let V, W, Z be finite dimensional vector spaces
over same field F of dimensions n, m, p respectively. Let T : V →
W and U : W → Z be any linear transformations. Let B, B 0 , B 00 be
ordered bases for V, W and Z respectively. Then
[U T ]B,B00 = BA = [U ]B0 ,B00 [T ]B,B0 .
Proof: Let B = {v1 , v2 , . . . , vn }, B 0 = {w1 , w2 , . . . , wm } and
B 00 = {z1 , z2 , . . . , zp }.
Suppose that A = [T ]B,B0 , B = [U ]B0 ,B00 and v ∈ V . Then
[T (v)]B0 = A[v]B
and
[U (T (v))]B00 = B[T (v)]B0
and so
[(U T )(v)]B00 = BA[v]B
Hence, by uniqueness of representing matrix, we have
[U T ]B,B00 = BA = [U ]B0 ,B00 [T ]B,B0
2
From this it follows that if V = W = Z, B = B 0 = B 00 then
U = T −1 if and only if [U ]B = [T ]−1
B .

Theorem 7.0.15 (Change of basis) Let V be an n-dime-


nsional vector space over the field V , and B = {v1 , v2 , . . . , vn },
B 0 = {w1 , w2 , . . . , wn } be any two ordered bases for V . If T is a
linear operator on V . If P is the transition matrix from B to B 0
whose j-th column is the coordinate matrix [wj ]B , then
[T ]B0 = P −1 [T ]B P.
Alternatively, if U is the invertible operator on V defined by U (vj ) =
wj for all j = 1, 2, . . . , n, then [U ]B = P , i.e;
[T ]B0 = [U ]−1
B [T ]B [U ]B .
209

Proof: Let B = {v1 , v2 , . . . , vn }, B 0 = {w1 , w2 , . . . , wn } be


any two ordered bases for V . Suppose that v ∈ V . Then, we have

[v]B = P [v]B0 , (7.0.8)

where P is the transition matrix whose j-th column is [wj ]B . Let


T be a linear operator on V . Then

[T (v)]B = [T ]B [v]B , (7.0.9)

[T (v)]B0 = [T ]B0 [v]B0 , (7.0.10)


and
[T (v)]B = P [T (v)]B0 (7.0.11)
Using equations 7.0.9, 7.0.10 in equation 7.0.11, we have

[T ]B [v]B = P [T ]B0 [v]B0

Using equation 7.0.8, we have

[T ]B P [v]B0 = P [T ]B0 [v]B0

This gives
[T ]B P = P [T ]B0
and so
[T ]B0 = P −1 [T ]B P.
Next, let U be a linear operator on V such that U (vjP
) = wj for
all j = 1, 2, .., n, thenPit will be invertible. But wj = ni=1 Pij vi
and so U (vj ) = wj = ni=1 Pij vi . Thus,

[U ]B = P

This proves the result. 2

Example 7.0.16 Let T be a linear operator on R2 such that


 
1 −1
[T ]B =
0 −1

where B = {(1, 0), (2, 1)} is an ordered basis of R2 . Take v1 = (1, 0)


and v2 = (2, 1) Let v = (x, y) ∈ R2 . Here we wish to find T
explicitly. As we know that

[T (v)]B = [T ]B [v]B . . . (1)


210 CHAPTER 7. MATRIX REPRESENTATIONS

Thus, to find T , it is sufficient to find the coordinates of T (v)


relative to B. From equation (1), it is clear that [T (v)]B will be
completely determined if we can determine [v]B . Now, suppose that

(x, y) = av1 + bv2 . . . (2)

Then, (x, y) = a(1, 0) + b(2, 1) = (a + 2b, b) and so a + 2b = x


and b = y. Solving these, we have b = y, a = x − 2y. Hence, (2)
becomes
(x, y) = (x − 2y)v1 + yv2 . . . (3)
   
a x − 2y
and so [v]B = = Then, by (1), we have
b y
    
1 −1 x − 2y x − 3y
[T (v)]B = [T ]B [v]B = =
0 −1 y −y

Then

T (v) = T (x, y) = (x − 3y)v1 − yv2


= (x − 3y)(1, 0) − y(2, 1)
= (x − 5y, −y)

Verification: Suppose that

T (x, y) = (x − 5y, −y).

Then, using (3), we have:

T (1, 0) = (1, 0) = 1.v1 + 0v2


T (2, 1) = (−3, −1) = (−1)v1 − v2
 
1 −1
Hence [T ]B = which is the given matrix.
0 −1
Next, consider the standard basis B 0 = {e1 , e2 }, where e1 =
(1, 0) and e2 = (0, 1). Using equation (3), we have

e1 = (1, 0) = 1v1 + 0v2


e2 = (0, 1) = −2v1 + v2

Thus, the transition matrix P from basis B to B 0 is


   
1 −2 −1 1 2
P = , then P =
0 1 0 1
211

Hence  
1 −5
[T ]B0 = P −1 [T ]B P =
0 −1

Exercise 7.0.17 Let T be a linear operator on R2 defined by T (x, y) =


(x, 0). Show that the matrix of T in the standard ordered basis of
R2 is  
1 0
[T ]B = .
0 0
If B 0 = {w1 , w2 }, where w1 = (1, 1), w2 = (2, 1); is another ordered
basis of R2 then find the transition matrix from B to B 0 . Also find
[T ]B0 .

Exercise 7.0.18 Let T be a linear operator on R3 such that the


matrix of T in the standard ordered basis of R3 is
 
1 −3 1
[T ]B =  3 −2 0 .
−4 1 2

If B 0 = {w1 , w2 , w3 }, where w1 = (1, −1, 1), w2 = (1, −2, 2) and


w3 = (1, −2, 1); is another ordered basis of R2 then find the tran-
sition matrix from B to B 0 . Also find [T ]B0 .

Definition 7.0.19 Let A and B be any two n × nsquare matrices


over same field F . Matrix B is said to be similar to A if there
exists an invertible matrices P over F such that B = P −1 AP .
Observe that the relation being similar (i.e; similarity) is an equiv-
alence relation.
212 CHAPTER 7. MATRIX REPRESENTATIONS

Exercises
Exercise 7.0.20 Find the matrix of linear operator T on R3 de-
fined by

T (x, y, z) = (3x + z, −2x + y, −x + 2y + 4z)

with respect to the ordered basis B and also with respect to B 0 , where
(1) B = {(1, 0, 0), (0, 1, 0), (0, 0, 1)},
(2) B 0 = {(1, 1, 1), (1, 1, 0), (1, 0, 0)}.

Exercise 7.0.21 Find the matrix of linear operator T on R3 de-


fined by
T (x, y, z) = (2y + z, x − 4y, 3x)
with respect to the ordered basis B = {v1 , v2 , v3 }, where v1 =
(1, 0, 1), v2 = (−1, 2, 1) and v3 = (2, 1, 1). Also prove that T is
invertible and find a explicit formula for T −1 . If B 0 is a standard
basis of R3 , then find [T ]B0 and an invertible matrix P such that
[T ]B = P −1 [T ]B P .

Exercise 2
 7.0.22  If the matrix of T on C relative to its standard
1 1
basis is , what is the matrix of T relative to ordered basis
1 1
B 0 = {(1, 1), (1, −1)}?

Exercise 7.0.23 Let T be a linear operator on R3 such that the


matrix of T in the standard ordered basis of R3 is
 
0 1 1
[T ]B =  1 0 −1 .
−1 −1 0

What is the matrix of T with respect to the ordered basis B 0 =


{w1 , w2 , w3 }, where w1 = (1, 1, −1), w2 = (−1, 0, 1) and w3 =
(1, 2, 1)?

Exercise 7.0.24 Find the linear operator T on R3 whose matrix


in standard ordered basis is
 
1 −1 1
−1 −2 0
−4 0 2
213

Exercise 7.0.25 Let V be the space of all polynomials of degree


at most 3 over R. Let D be the differentiation operator on V
and B = {1, x, x2 , x3 } be an ordered basis of V Find [D]B . Let
B 0 = {1, x + a, (x + a)2 , (x + a)3 } be another ordered basis for V .
Find the transition matrix from B to B 0 and [D]B .

2
214 CHAPTER 7. MATRIX REPRESENTATIONS
Chapter 8

Inner Product Spaces

In this chapter, we wish to study the notion of distance between


two points, length of a vector and angle between two vectors (or
lines). To fill up our purpose, we enrich the vector spaces (over
the field of real numbers or complex numbers) with a structure
(numerical function assigned for each pair of vectors) so that these
concepts can also be defined suitably. Throughout the chapter,
we consider vector spaces over the field R of real numbers or the
field C of complex numbers. Readers may consult the book [2] for
further readings.

8.1 Inner Products


Definition 8.1.1 Let V be a vector space over F , where F is ei-
ther R or C. An inner product on V is a function h , i from V × V
to F such that
(i). (Linearity Property).

hau + bv, wi = ahu, wi + bhv, wi (8.1.1)

.
(ii). (Conjugate - Symmetry).

hu, vi = hv, ui (8.1.2)

.
(iii). (Non- negativity).

hu, ui ≥ 0 and hu, ui = 0 ⇔ u = 0. (8.1.3)

215
216 CHAPTER 8. INNER PRODUCT SPACES

for all a, b ∈ F and u, v, w ∈ V , where hu, vi denotes the image of


(u, v) under the map h, i.

From the conjugate symmetry property of inner product on vector


space V , it follows that hu, ui = hu, ui and so hu, ui is a real
number for all u ∈ V .
A vector space V (F ) equipped with an inner product h , i is
called an inner product space. If F = R, then finite dimensional
vector space V (F ) is called an Euclidean space. If F = C then
finite dimensional vector space V (F ) is called an Unitary space.
Let (V (F ), h , i) be an inner product. Putting a = 1 = b in
8.1.1, we have

hu + v, wi = hu, wi + hv, wi (8.1.4)

Putting b = 0 in 8.1.1, we have

hau, wi = ahu, wi (8.1.5)

Next, using Conjugate symmetry and Linearity Property, we have

hu, av + bwi = h av + bw , ui
= ah v, ui + bhw , ui
= ah v, ui + bhw , ui
= ahu, vi + bhu, wi

Thus, in particular hu, avi = ahu, vi.

Proposition 8.1.2 In an inner product space (V (F ), h , i), h u, 0i =


0, for all u ∈ V .

Proof: Let u ∈ V . Then h u, 0i = h u, 0.0i = 0h u, 0i = 0.

Example 8.1.3 Let V = Rn the Euclidean vector space of di-


mension n over R. Define

hx, yi = x1 y1 + x2 y2 + · · · + xn yn
n
X
= x i yi
i=1

where x = (x1 , x2 , · · · , xn ) and y = (y1 , y2 , · · · , yn ).


8.1. INNER PRODUCTS 217

Since hx, xi = ni=1 xi 2 and P xi 2 ≥ 0, therefore hx, xi ≥ 0.


P
Clearly hx, xi = 0 if and only if ni=1 xi 2 = 0; i.e, if and only if
xi = 0 for all i = 1, 2 . . . , n. The conjugate symmetry is followed
immediately. Next,

hax + by, zi
= h(ax1 + by1 , ax2 + by2 , . . . , axn + byn ) , (z1 , z2 , . . . , zn )i
Xn
= (axi + byi )zi
i=1
Xn n
X
= a xi z i + b yi zi
i=1 i=1
= ahx, zi + bhy, zi

Thus h , i is an inner product on Rn and so (Rn , h , i) is an


inner product space. The inner product is called usual Euclidean
inner product on Rn .

Example 8.1.4 Define h , i on R3 by

hx, yi = x1 y1 + x2 y2 + 2x3 y3 + x2 y3 + x3 y2

where x = (x1 , x2 , x3 ) and y = (y1 , y2 , y3 ). Then this is an


inner product on R3 .

Example 8.1.5 Let V = Cn the complex vector space of dimen-


sion n. Define h , i by
n
X
hx, yi = xi yi . (8.1.6)
i=1

Then it is a complex inner product(verify). This inner product


space is called Standard Unitary space.

Example 8.1.6 Let V = R2 . Define h , i by

hx, yi = x1 y1 − x1 y2 − x2 y1 + 4 x2 y2 (8.1.7)

Then it is an inner product.

Example 8.1.7 Let V be F n×n , the Pnspace of all n × n matri-


ces over field F . Then hA, Bi = i, j=1 Aij Bji and hA, Bi =
trace(AB ? ) define inner products on V .
218 CHAPTER 8. INNER PRODUCT SPACES

Example 8.1.8 Let V be F n×1 , the space of all n × 1 matrices


over field F . Let Q be an n × n invertible matrices over F . Then
hX, Y i = (QY )? (QX) defines an inner product on V .

Example 8.1.9 Let V denote the complex vector space of all com-
plex valued continuous functions on [0, 1]. Define h , i by
Z 1
h f, gi = f (t)g(t)dt. (8.1.8)
0

Then h , i is a complex inner product(verify).

Example 8.1.10 Let l2 denote the set of all sequences an such


that
X∞
| an |2 < ∞
n=1
Then it is a vector space over R with respect to the usual addition of
sequences and multiplication by scalars. We have an inner product
on l2 given by

X
h{an }, {bn }i = an bn . (8.1.9)
n=1

As we have observed earlier that hx, xi is a non-negative real num-


ber in a given inner product space. Thus, we define the following:
Definition 8.1.11 Let (V, h , i) be an inner product space. Let
x ∈ V . Then positive square root of non-negative real number
hx, xi is called p
the length of the vector x and is denoted by kxk.
Thus, kxk = + hx, xi.

Proposition 8.1.12 In an inner product space (V, h , i), we have


the following:
(i). kxk ≥ 0 and kxk = 0 if and only if x = 0.
(ii) kaxk = |a| kxk.

Proof: Let x ∈ V . By non-negativity property, the result (i)


follows. Next, let a ∈ F . Then h ax, ax i = aah x, x i =
|a|2 h x, x i. This proves (ii). 2
Taking x = u − v and a = 1 ∈ F , then we have

ku − vk ≥ 0; ku − vk = 0 ⇔ u = v (8.1.10)
8.1. INNER PRODUCTS 219

and

ku − vk = kv − uk (8.1.11)

Theorem 8.1.13 (Cauchy-Schwarz inequality) Let (V, h , i)


be an inner product space. Then |hx, yi| ≤ kxk kyk for all x, y ∈
V . The equality holds if and only if {x, y} is linearly dependent.

Proof: If y = 0, then

|hx, yi| = |hx, 0i| = 0

and
kxk kyk = kxk k0k = 0
and so equality holds . Assume that y 6= 0 and so kyk 6= 0.
hy, xi
Consider the vector x − αy, where α = hy, yi . By non-negativity
property,
h x − αy, x − αyi ≥ 0
Using linearity property, we have

hx, xi − αh x, yi − αhy, xi + ααhy, yi ≥ 0

hy, xi
Putting α = hy, yi in the above equation and using that hy, xi =
hx, yi, we have

h x, yih x, yi
hx, xi − ≥ 0
hy, yi

that is;

h x, yih x, yi
≤ hx, xi
hy, yi

Thus
|hx, yi| ≤ kxk kyk .
If {x, y} is linearly independent, then x − αy 6= 0 for all
α ∈ F and so kx − αyk > 0. Hence

|h x, yi| < kxk kyk


220 CHAPTER 8. INNER PRODUCT SPACES

Thus, if equality holds then {x, y} is linearly dependent. Con-


versely, if {x, y} is linearly dependent, then x = αy for some
α ∈ F . Then

|h x, yi| = |hαy, yi| = | α | h y, yi = kxk kyk

2
Using Cauchy-Schwarz inequality in above examples, we have
several important inequalities (try).

Proposition 8.1.14 (Triangle inequality). Let (V, < >) be an in-


ner product space. Then

|| x + y || ≤ || x || + || y ||

for all x, y ∈ V . If equality holds then the set {x, y} is linearly


dependent.

Proof: Let x, y ∈ V . Then

(kx + yk)2 = hx + y, x + yi
= hx, xi + hx, yi + hy, xi + hy, yi
≤ kxk2 + hx, yi + hx, yi + kyk2
≤ kxk2 + 2 kxk kyk + kyk2
(by Cauchy Schwarz inequality)
= (kxk + kyk)2

Taking square root, we have

|| x + y || ≤ || x || + || y ||

Further, it is clear from the above that if equality holds then

hx, yi + hy, xi = 2 kxk · kyk

i.e;
Re hx, yi = kxk · kyk
But Rehx, yi ≤ |hx, yi| and so kxk · kyk ≤ |hx, yi|. Using
Cauchy- Schwarz inequality, we have kxk · kyk = |hx, yi|. Thus,
by Theorem 8.1.13, the set {x, y} is linearly dependent. 2
8.2. NOTION OF ANGLE AND ORTHOGONALITY 221

Remark 8.1.15 Converse of the above result is not true; that is,
if {x, y} is linearly dependent then it is not necessary that

|| x + y || = || x || + || y ||

Take x ∈ V \ {0}. Observe that {x, −x} is linearly dependent but


the equality does not hold.

Taking x = u − v and y = v − w in triangle inequality, we have

|| u − w || ≤ || u − v || + || v − w || (8.1.12)

Corollary 8.1.16 In inner product space (Cn (C), h, i), we have


v v v
u n u n u n
uX 2
uX uX
t |xi + yi | ≤ t |xi |2 + t |yi |2 .
i=1 i=1 i=1

Corollary 8.1.17 If f and g are two complex valued continuous


functions on [0, 1], then
qR qR qR
1 2 dt ≤ 1 2 dt + 1 2
0 | f (t) + g(t) | 0 | f (t) | 0 | g(t) | dt.

Corollary 8.1.18 If {xn } and {yn } are sequences in l2 , then


v v v
u∞ u∞ u∞
uX 2
X 2
uX
|yi |2 .
u
t |xi + yi | ≤ t |xi | + t
i=1 i=1 i=1

8.2 Notion of angle and orthogonality


Let (V, h , i) be an inner product space. By Cauchy-Schwarz in-
equality |hx, yi| ≤ kxk kyk for all x, y ∈ V . If x, y ∈ V \ {0},
then

hx, yi
|| x || || y || ≤ 1 (8.2.1)

If it is a real inner product space, then


<x, y>
−1 ≤ ||x|| ||y|| ≤ 1.

Thus there is unique θ, 0 ≤ θ ≤ π such that


222 CHAPTER 8. INNER PRODUCT SPACES

<x, y>
cos θ = ||x|| ||y|| .

This θ is called the angle between x and y.


If it is a complex inner product space, by 8.2.1, there is a unique
θ, −π < θ ≤ π such that
<x, y>
cos θ + i sin θ = ||x|| ||y|| .

This θ may be termed as angle between vectors x and y in V .

Definition 8.2.1 Any two vector x and y in an inner product


space is said to be orthogonal if hx, yi = 0. Observe that the
null vector 0 is orthogonal to each vector.
A vector x is called a unit vector if kxk = 1.

Proposition 8.2.2 (Pythagoras Theorem). Let (V, h, i) be a


real inner product space and x, y ∈ V . Then x is perpendicular to
y if and only if

kx ± yk2 = kxk2 + kyk2 (8.2.2)

Proof: Let x, y ∈ V . Then

kx − yk2 = hx − y, x − yi
= hx, xi − hx, yi − hy, xi + hy, yi
= kxk2 + kyk2 − 2 hx, yi

Similarly, kx + yk2 = kxk2 + kyk2 + 2 hx, yi. Since x ⊥ y if and


only if hx, yi = 0 and so kx ± yk2 = kxk2 + kyk2 . 2

Remark 8.2.3 In complex inner product space, if x is perpendic-


ular to y, then kx − yk2 = kxk2 + kyk2 . But the converse is not
true. For example, we consider the unitary space C 2 . The vec-
tors x = (i, i) and (−i, 1) are not orthogonal but kx − yk2 =
kxk2 + kyk2 = 4.

Proposition 8.2.4 In a real inner product space (V, h, i), we have

1. (Parallelogram Law)

kx − yk2 + kx + yk2 = 2(|| x ||2 + || y ||2 ).

for all x, y ∈ V .
8.3. ORTHONORMAL SETS AND BESSEL’S INEQUALITY223
 
2. (Polarization identity). hx, yi = 1
4 kx + yk2 − kx − yk2

Proof: Let x, y ∈ V . Then


|| x − y ||2 = || x ||2 + || y ||2 −2 hx, yi
and
|| x + y ||2 = || x ||2 + || y ||2 +2 hx, yi.
Adding these two equations, we have
|| x − y ||2 + || x + y ||2 = 2 || x ||2 + 2 || y ||2
Subtracting them, we have the polarization identity. 2

Exercise 8.2.5 In a complex inner product space (V, h, i), we have


1  i 
hx, yi = kx + yk2 − kx − yk2 + kx + iyk2 − kx − iyk2
4 4

8.3 Orthonormal Sets and Bessel’s inequal-


ity
Definition 8.3.1 Let (V, h, i) be an inner product space. A subset
S of V is called an orthonormal set if
(i). kxk = 1 ∀x ∈ S and
(ii). hx, yi = 0 ∀x, y ∈ S, x 6= y.

Thus, in brief we say that a finite set S = {x1 , x2 , · · · , xn } is an


orthonormal set if hxi , xj i = δij , where δij = 1 for j = i and 0 for
i 6= j. n o
x
By definition, it is clear that kxk is an orthonormal set for
each x ∈ V \ {0}.

Proposition 8.3.2 In an inner product space, every orthonormal


set is linearly independent.

Proof: Let S be an orthonormal set and α1 x1 + α2 x2 + · · · +


αn xn = 0, where x1 , x2 , · · · , xn are distinct elements of S. Then

hα1 x1 + α2 x2 + · · · + αn xn , xm i = h0, xm i = 0, ∀m = 1, 2, . . . n.

which gives αm = 0, ∀m. 2


224 CHAPTER 8. INNER PRODUCT SPACES

Proposition 8.3.3 If S = {x1 , x2 , · · · , xn } is an orthonormal set


in an inner product space and x = α1 x1 + α2 x2 + · · · + αn xn then
αi = hx, xi i for all i.

Proof: Let x = α1 x1 + α2 x2 + · · · + αn xn . Then for each i, we


have

hx, xi i = hα1 x1 + α2 x2 + · · · + αn xn , xi i
= αi , (since hxj , xi i = 0 f or all j 6= i)

Proposition 8.3.4 If S = {x1 , x2 , · · · , xn } is an orthonormal


Pn set
in an inner product space V and x ∈ / hSi, then x − P i=1 hx, xi ixi
is orthogonal to each vector xi ∈ S. Indeed, if x − ni=1 ai xi is
orthogonal to xj then aj = hx, xj i.

Proof: Suppose that x ∈ / S. Then, for each j = 1, 2, . . . , n, we


have
n n
* +
X X
x− hx, xi ixi , xj = hx, xj i − hx, xi ihxi , xj i
i=1 i=1
Xn
= hx, xj i − δij hx, xi i
i=1
= hx, xj i − hx, xj i
= 0
Pn
Next, if x − i=1 ai xi is orthogonal to xj then

n
* +
X
x− ai xi , xj = 0
i=1
n
X
⇒ hx, xj i − ai hxi , xj i = 0
i=1
⇒ hx, xj i − aj = 0

Thus aj = hx, xj i. 2
8.3. ORTHONORMAL SETS AND BESSEL’S INEQUALITY225

Proposition 8.3.5 (Bessel’s inequality). Let (V, h, i) be an in-


ner product space and {x1 , x2 , · · · , xr } be an orthonormal set, where
xi 6= xj for i 6= j. Then
r
X
|hx, xi i|2 ≤ kxk2 (8.3.1)
i=1

for all x ∈ V .

Proof: Let x ∈ V . By non-negativity property, we have


r r
* +
X X
x − hx, xi i xi , x − hx, xi i xi ≥ 0.
i=1 i=1

Using linearity property of the inner product and the property of


orthonormal set {x1 , x2 , · · · , xr }, we have
r
X
hx, xi − hx, xi ihx, xi i ≥ 0
i=1

Thus
r
X
2
kxk ≥ |hx, xi i|2
i=1

Definition 8.3.6 An orthonormal set which is also a basis is called


an Orthonormal Basis.

Corollary 8.3.7 Let (V, h, i) be an inner product space. An or-


thonormal set {x1 , x2 , · · · , xn } is an orthonormal basis if and only
if
n
X
2
kxk = |hx, xi i|2 (8.3.2)
i=1

Proof: Suppose that {x1 , x2 , · · · , xn } is an orthonormal basis.


Let x ∈ V . Then

x = a1 x1 + a2 x2 + · · · + an xn
226 CHAPTER 8. INNER PRODUCT SPACES

forPsome a1 , a2 , · · · , an in F and so hx, xi i = ai ∀i. Then


n
x = i=1 hx, xi ixi and so

n n
* +
X X
x − hx, xi i xi , x − hx, xi i xi = 0
i=1 i=1

Expanding this, we have


n
X
2
kxk = |hx, xi i|2 .
i=1

Conversely, suppose that


n
X
kxk2 = |hx, xi i|2 .
i=1

Then
n

X
x − hx, x i x i = 0

i

i=1

and so
n
X
x− hx, xi i xi = 0
i=1

This shows that {x1 , x2 , · · · , xn } is a set of generators. Since an


orthonormal set is linearly independent also, therefore it is a basis.
2

8.4 Gram-Schmidt Process


The Gram-Schmidt orthogonalization process is the process of deter-
mining an orthonormal basis of a given inner product space.

Theorem 8.4.1 (Gram-Schmidt). Let (V, h, i) be an inner product


space. Let {x1 , x2 , · · · , xn } be a finite subset of V . Then there
exists an orthonormal set {y1 , y2 , · · · , ys }, s ≤ r such that the
subspace generated by {x1 , x2 , · · · , xr } is the same as that generated
by {y1 , y2 , · · · , ys }.
8.4. GRAM-SCHMIDT PROCESS 227

Proof: The proof is given by the induction on r. If r = 0,


then the subset is empty set and so the result is vacuously true.
Suppose that r = 1. If x1 = 0, then L({x1 }) = {0} and empty
set is again an orthonormal set which generates {0}. If x1 6= 0,
take y1 = kxx11 k . Then {y1 } is an orthonormal set which generates
L({x1 }). Assume that the result is true for r. Consider a subset
{x1 , x2 , · · · xr , xr+1 } of V . By our induction assumption there is
an orthonormal subset {y1 , y2 , · · · , ys }, s ≤ r of V such that
L ({y1 , y2 , · · · , ys }) = L ({x1 , x2 · · · , xr }) .
If xr+1 belongs to this subspace, then there is nothing to do.
Suppose that xr+1 6∈ L ({x1 , x2 , · · · , xr }) = L ({y1 , y2 , · · · , ys }).
Then
Xs
xr+1 − hxr+1 , yi i yi 6= 0
i=1
Take
s
X
z = xr+1 − hxr+1 , yi i yi . . . (1)
i=1
and
z
ys+1 = . . . (2)
kzk
By Proposition 8.3.4, the set {y1 , y2 , · · · , ys+1 } is an orthonormal
set. Since ys+1 is linear combination of {xr+1 , y1 , y2 ,· · · , ys } and
L ({y1 , y2 , · · · , ys }) = L ({x1 , x2 · · · , xr }) ,
therefore
L ({y1 , y2 , · · · , ys+1 }) ⊆ L ({x1 , x2 , · · · , xr+1 }) . . . (3)
By Equations (1) and (2), xr+1 is a linear combination of {y1 , y2 , · · · , ys+1 },
hence
L ({x1 , x2 , · · · , xr+1 }) ⊆ L ({y1 , y2 , · · · , ys+1 }) . . . (4)
Combining Equations (3) and (4), we have
L ({x1 , x2 , · · · , xr+1 }) = L ({y1 , y2 , · · · , ys+1 })
where s + 1 ≤ r + 1 This shows that the result is true for r + 1 and
so by induction principle, the result is true. 2
228 CHAPTER 8. INNER PRODUCT SPACES

Corollary 8.4.2 Every finite dimensional inner product space has


an orthonormal basis.
Proof: Let (V, < >) be a finite dimensional inner product
space. Let {x1 , x2 , · · · , xn } be a basis of V . By Gram-Schmidt
theorem there exists an orthonormal set {y1 , y2 , · · · , ym }, m ≤ n
which generates V . Since an orthonormal set is linearly indepen-
dent, therefore it is a basis of V containing m elements. By in-
variance of number of elements in a basis, we have m = n. 2

Proposition 8.4.3 Every orthonormal set of a finite dimensional


inner product space can be embedded in to an orthonormal basis.
Proof: Let {x1 , x2 , · · · , xm } be an orthonormal set of an inner
product space (V, < >) of dimension n. Since an orthonormal
set is linearly independent, m ≤ n. If m = n, it is already
a basis and so an orthonormal basis. Suppose that m < n.
Then h{x1 , x2 , · · · , xm }i =6 V . LetPym+1 be a member of V −
m
h{x1 , x2 , · · · , xm }i. Then ym+1 − i=1 hym+1 , xi ixi 6= 0. Take
ym+1 − P m
P
i=1 <ym+1 ,xi >xi
xm+1 = ||ym+1 − m .
i=1 <ym+1 , xi >xi ||

Then {x1 , x2 , · · · , xm+1 } is an orthonormal set. If m + 1 = n,


then this is an orthonormal basis. If not, proceed as above. At
(n − m)th step we shall arrive at an orthonormal basis containing
{x1 , x2 , · · · xm }. 2

Example 8.4.4 Find the orthonormal basis of the subspace gener-


ated by the subset {(1, 1, 1), (0, 1, 1), (2, 1, 1)} in the usual Euclidean
inner product space R3 . Take x1 = (2, 1, 1), then
 
(1, 1, 1) 1 1 1
y1 = = √ ,√ ,√
k(1, 1, 1)k 3 3 3
and
x2
y2 = ,
|| x2 ||
where
 
1 1
x2 = (0, 1, 1) − (0, 1, 1), √ (1, 1, 1) √ (1, 1, 1)
3 3
1
= (−2, 1, 1).
3
8.4. GRAM-SCHMIDT PROCESS 229

(−2,1,1)
Thus, y2 = √
6
. Now,

(2, 1, 1) − < (2, 1, 1), y1 > y1 − < (2, 1, 1), y2 > y2


4 4
= (2, 1, 1) − (1, 1, 1) + (−2, 1, 1)
3 6
= 0

i.e; (2, 1, 1) = < (2, 1, 1), y1 > y1 + < (2, 1, 1), y2 > y2

Therefore (2, 1, 1) is a linear combination of y1 and y2 , i.e; (2, 1, 1) ∈


L(S) and so

L({(1, 1, 1), (0, 1, 1), (2, 1, 1)}) = L({y1 , y2 }).

Thus, the orthonormal set {y1 , y2 } is the basis of space generated


by {(1, 1, 1), (0, 1, 1), (2, 1, 1)}.

Example 8.4.5 Apply the Gram-Schmidt process to the vectors


v1 = (1, 0, 1), v2 = (1, 0, −1), v3 = (0, 3, 4), to obtain an orthonor-
mal basis for R3 with standard inner product.
Solution: Given that v1 = (1, 0, 1), v2 = (1, 0, −1), v3 =
(0, 3, 4). √
Now, kv1 k2 = hv1 , v1 i = 1.1 + 0.0 + 1.1 = 2, hence kv1 k = 2.
Take w1 = kvv11 k then
 
1 1
w1 = √ , 0, √ .
2 2
 
Now, hv2 , w1 i = h(1, 0, −1), √12 , 0, √12 i = 0 and so
hv2 , w1 iw1 = (0, 0, 0).
Then u2 = v2 − hv2 , w1 iw1 = (1, 0, −1) − (0, 0, 0) = (1, 0, −1) and
so ku2 k2 = hu2√ , u2 i = h(1, 0, −1), (1, 0, −1)i = 2.
Thus, ku2 k = 2. Then
 
u2 1 1
w2 = = √ , 0, − √
ku2 k 2 2
  √
Now, hv3 , w1 i = h(0, 3, 4), √12 , 0, √12 i = 0 + 0 + 4. √12 = 2 2 hence
hv3 , w1 iw1 = (2, 0, 2).
Next,
230 CHAPTER 8. INNER PRODUCT SPACES
  √
hv3 , w2 i = h(0, 3, 4), √12 , 0, − √12 i = 0 + 0 − 4. √12 = −2 2, hence
hv3 , w2 iw2 = (−2, 0, 2).
Then
u3 = v3 − hv3 , w1 iw1 − hv3 , w2 iw2 = (0, 3, 0) and so ku3 k = 3. Take
w3 = kuu33 k . Then
w3 = (0, 1, 0)
   
The vectors w1 = √12 , 0, √12 , w1 = √12 , 0, − √12 and w3 =
(0, 1, 0) constitute a required orthonormal basis.

Exercise 8.4.6 Consider the usual inner product space C3 . Find


an orthonormal basis for the subspace spanned by vectors (1, 0, i),
(2, 1, 1 + i).

Solution: Let v1 = (1, 0, i) and v2 = (2, 1, 1 + i). Since

kv1 k2 = h(1, 0, i), (1, 0, i)i = 1.1 + 0.0 + i.ī = 2



therefore kv1 k = 2.  
Take w1 = kvv11 k , then w1 = √12 , 0, √i2 .
Clearly L({v1 }) = L({w1 }).
Now, the projection hv2 , w1 iw1 of v2 along the direction of vector
w1 is given by
    
1 i 1 i
hv2 , w1 iw1 = (2, 1, 1 + i), √ , 0, √ √ , 0, √
2 2 2 2
  
1 i 1 i
= 2. √ + 1.0 + (1 + i) √ √ , 0, √
2 2 2 2
 
3−i 1 i
= √ √ , 0, √
2 2 2
 
3−i 1 + 3i
= , 0,
2 2

Hence u2 = v2 − hv2 , w1 iw1


 
3−i 1 + 3i
= (2, 1, 1 + i) − , 0,
2 2
 
1+i 1−i
= , 1,
2 2
8.4. GRAM-SCHMIDT PROCESS 231

and so q √
1+i 1+i 5 u2
+ 1.1̄ + 1−i
   1−i 
ku2 k = 2 2 2 2 = 2 . Take w2 = ku2 k .
Then  
1+i 1−i
w2 = √ , 1, √ .
5 5
Clearly L({v1 , v2 }) = L({w1 , w2 }). Thus the required  orthonormal  
basis of subspaces spanned by (1, 0, i), (2, 1, 1+i) is { √12 , 0, √i2 , 1+i
√ , 1, 1−i
5

5
}.2

Exercise 8.4.7 Let V be the subspace of R[x] of polynomials of de-


gree at most 3. Find an orthonormal basis of inner product space
2 3
R 1 <, >) corresponding to the basis {1, t, t , t }, where hf, gi =
(V,
0 f (t)g(t)dt.

Definition 8.4.8 Let W be a subspace of a given inner product


space V . Then orthogonal complement of W is denoted and defined
by
W ⊥ = {x ∈ V | hx, wi = 0 ∀w ∈ W }.

From the definition, it follows that 0 ∈ W ⊥ .

Proposition 8.4.9 In an inner product space V , W ⊥ is a sub-


space of V , where W is a subspace of V .

Proof: Let W be a subspace of V . Clearly 0 ∈ W ⊥ . Thus


W ⊥ is non- empty. Let u, v ∈ W ⊥ . Then hu, wi = 0 = hv, wi
for all w ∈ W . Let a, b ∈ F . Using linearity property, we have
hau + bv, wi = 0. Thus W ⊥ is a subspace of V . 2

Proposition 8.4.10 If W is a subspace of a finite dimensional


inner product space V , then V = W ⊕ W ⊥ .

Proof: Let x ∈ V and {x1 , x2 , . . . , xr } be a basis of W . Then


x − ri=1 hx, xiP
P
ixi is orthogonal to each xi and so orthogonal to
W . Thus x − ri=1 hx, xi ixi ∈ W ⊥ and so x ∈ W + W ⊥ . Hence
V = W + W ⊥ . One can easily observed that W ∩ W ⊥ = {0}.
Hence proved. 2

Example 8.4.11 The orthogonal complement of the vector (1, 1, −1)


is the plane x + y − z = 0, which is a two dimensional subspace of
R3 .
232 CHAPTER 8. INNER PRODUCT SPACES

Exercises
1. Show that we can always define an inner product on a finite
dimensional vector space F n (F ), where F = R or C.

2. Prove that hx, yi = ni=1 xi yi defines an inner product on


P
Rn , where x = (x1 , x2 , . . . , xn ) and y = (y1 , y2 , . . . , yn ) in
Rn .

3. Normalize the vectors (2, −11, 1) and 15 , 23 , 1 in R3 and




(1, 0, i) in C3 with respect to their standard inner products


respectively.

Exercise 8.4.12 Consider the usual inner product space R3 . Find


an orthonormal basis for the subspace spanned by vectors (1, 0, 1),
(1, 0, −1) and (0, 1, 1).

Exercise 8.4.13 Consider the usual inner product space C3 . Find


an orthonormal basis for the subspace spanned by vectors (1, 0, i),
(0, 1, i).

Exercise 8.4.14 Let V be the subspace of R[x] of polynomials


of degree at most 3. Find an orthonormal basis of inner prod-
uct spaceR (V, <, >) corresponding to the basis {1, t, t2 , t3 }, where
1
hf, gi = 0 f (t)g(t)dt.

Exercise 8.4.15 Apply the Gram-Schmidt process to the vectors


v1 = (1, 0, 0) , v2 = (1, 1, 0) and (1, 1, 1) to find the corresponding
orthonormal basis of R3 relative to standard inner product.

Exercise 8.4.16 Given the basis (2, 0, 1), (3, −1, 5) and
(0, 4, 2) for R3 , construct from it by the Gram-Schmidt process an
orthonormal basis of R3 relative to standard inner product.
Chapter 9

Bilinear and Quadratic


forms

This chapter deals only the introductions to bilinear forms and


quadratic forms. The applications of quadratic forms in Analytical
Geometry and Physics has been given in [1, 3, 5].

9.1 Bilinear Forms

Definition 9.1.1 Let V, W and Z be any three vector spaces over


same field F . Then a map f : V × W → Z is called bilinear if it
satisfies following conditions:
(i) f (av1 + bv2 , w) = af (v1 , w) + bf (v2 , w) for all a, b ∈ F and
v1 , v2 ∈ V ,
(ii) f (v, aw1 + bw2 ) = af (v, w1 ) + bf (v, w2 ) for all a, b ∈ F and
w1 , w2 ∈ W . If Z = F , then f is called a bilinear form. If
W = V , then the bilinear form f is called a bilinear form on V .

Example 9.1.2 Let A = (aij ) be any n × n matrix over field F .


Let F n denotes the set of all column matrices over F . Define a
map f : F m × F n → F by

f (X, Y ) = X t AY

233
234 CHAPTER 9. BILINEAR AND QUADRATIC FORMS

Let a, b ∈ F and X, Y, Z ∈ F m . Then

f (aX + bY, Z) = (aX + bY )t AZ


= (aX t + bY t )AZ
= a(X t AZ) + b(Y t AZ)
= af (X, Z) + bf (Y, Z)

Similarly f (X, aY + bZ) = af (X, Y ) + bf (X, Z). Thus, f is a


bilinear form. Let X = (x1 , x2 , . . . , xm )t and Y = (y1 , y2 , . . . , yn )t .
Observe that
m m m
!
X X X
t
X A= xi Ai1 , xi Ai2 , . . . , xi Ain
i=1 i=1 i=1

and so

f (X, Y ) = X t AY
m m
! !
X X
= xi Ai1 y1 + xi Ai2 y2 + . . .
i=1 i=1
m
!
X
+ xi Ain yn
i=1
n m
!
X X
= xi Aij yj
j=1 i=1
n
XX m
= Aij xi yj
j=1 i=1
Xm X n
= Aij xi yj
i=1 j=1

Exercise 9.1.3 Let V be a vector space over F . Prove that the


map f : V × V → F defined by f (v, w) = L1 (v)L2 (w) is a bilinear
form on V , where L1 , L2 are given linear functionals on V .

Exercise 9.1.4 Let T be a linear operator on V . Then prove that


the map f : V × V → F defined by f (v, w) = hT (v), T (w)i is a
bilinear form, where h , i be an inner product on V .
9.1. BILINEAR FORMS 235

Theorem 9.1.5 Let V and W finite dimensional vector spaces


over field F let B = (v1 v2 . . . vm ) and B 0 = (w1 w2 . . . wn ) be
ordered bases of V and W respectively. Let f : V × W → F be a
bilinear form. Then there exists a unique matrix A such that
f (v, w) = [v]tB A[w]B0 .
If A = (aij ) is any m × n matrix then it determines a unique
bilinear form f given by
f (v, w) = [v]tB A[w]B0 ,
such that f (vi , wj ) = aij for all i = 1, 2, . . . , m and j = 1, 2, . . . , n.

Proof:P Let [v]B = (x1 x2 P . . . xm )t and [w]B0 = (y1 y2 . . . ym )t .


Then v = m i=1 xi vi and w =
n
j=1 yj wj . Let aij = f (vi , wj ) for
all i = 1, 2, . . . , m and j = 1, 2, . . . , n. Then,
 
Xm n
X
f (v, w) = f  xi vi , y j wj 
i=1 j=1
 
m
X n
X
= xi f  vi , y j wj 
i=1 j=1

  P 
f v1 , nj=1 yj wj
  P 
 f v2 , nj=1 yj wj 
 
= (x1 , x2 , . . . , xm ) 

..



  . 


Pn
f vm , j=1 yj wj

 Pn 
f (v1 , wj )yj
Pj=1
 n f (v2 , wj )yj 
 j=1
= (x1 , x2 , . . . , xm ) 

.. 

Pn . 
j=1 f (vm , wj )yj

 Pn 
j=1 a1j yj
 n a2j yj 
P
 j=1
= (x1 , x2 , . . . , xm ) 

.. 
Pn .
 
j=1 amj yj
236 CHAPTER 9. BILINEAR AND QUADRATIC FORMS
  
a11 a12 ... a1n y1
 a21 a22 ... a2n   y2 
= (x1 , x2 , . . . , xm )  .
  
.. .. ..   .. 
 .. . . .  . 
am1 am2 . . . amn yn
= [v]tB A[w]B0

where (i, j)-th entry of A is aij = f (vi , wj ). Let B = (bij ) be any


other m × n matrix such that f (v, w) = [v]B B[w]B0 . Then

aij
= f (vi , wj )
 
0
 0 
..
 
 

b11 b12

. . . b1n  .

 b21 b22  0 
. . . b2n   
= (0, 0, .., 0, |{z}
1 , 0, .., 0)  . 1
 
.. .. ..   
 .. . . .  j−|{z}
 
i− place  place

bm1 bm2 . . . bmn  0 
 
 . 
 . .
0
 
b1j
 b2j 
 
 .. 
 . 
= (0, 0, . . . , 0, |{z}
1 , 0, . . . , 0)  
 bij 
i−th place
 
 .. 
 . 
bmj
= bij

Next, define a map f : V × W → F by

f (v, w) = [v]tB A[w]B0

One may easily verify that f is a bilinear form such that f (vi , wj ) =
aij . Next, if g is any other bilinear form such that g(vi , wj ) = aij .
9.1. BILINEAR FORMS 237

Pm Pn
Let v = i=1 xi vi and w = j=1 yj wj . Then
 
Xm n
X m X
X n
g(v, w) = g  xi vi , y j wj  = xi g(vi , wj )yj
i=1 j=1 i=1 j=1
m X
X n
= xi aij yj
i=1 j=1

Hence
  
a11 a12 ... a1n y1
 a21 a22 ... a2n   y2 
g(v, w) = (x1 , x2 , . . . , xm )  .
  
.. .. ..   .. 
 .. . . .  . 
am1 am2 . . . amn yn
= [v]tB A[w]B0

Definition 9.1.6 Let V and W be any two finite dimensional vec-


tor spaces over same field F with dim V = m and dim W = n
respectively. Let f : V × W → F be a bilinear form. Then the
m × n matrix A = (aij ) is called the matrix of bilinear form f
relative to ordered basis B of V and B 0 of W , where (i, j)-th
entry of A is aij = f (vi , wj ) for all i = 1, 2, .., m and j = 1, 2, ..., n.
It is denoted by [f ]B,B0 . If V = W and B 0 = B then matrix of f
relative to ordered basis B is denoted by [f ]B . Let f be a bilinear
form on V then f is called symmetric if f (v1 , v2 ) = f (v2 , v1 ) for
all v1 , v2 ∈ V . The bilinear form f on V is called anti-symmetric
if f (v1 , v2 ) = −f (v2 , v1 ) for all v1 , v2 ∈ V .

Corollary 9.1.7 There is a bijective correspondence


(f ↔ A = (f (vi , wj ))) between the set of all bilinear forms f : V ×
V → F into the set Mm×n (F ) of all m × n matrices over F .

Corollary 9.1.8 Every bilinear form f on a finite dimensional


vector space V (F ) is of the form

f (v, w) = [v]B A[w]B ,

where B is the ordered basis of V . In particular,

f (X, Y ) = X t AY
238 CHAPTER 9. BILINEAR AND QUADRATIC FORMS

where X, Y ∈ F n , the space of column matrices and A is the matrix


of bilinear form f on F n .

Corollary 9.1.9 A bilinear form f on an n-dimensional space


V (F ) is symmetric (skew-symmetric) if and only if the matrix A
of f is symmetric (skew-symmetric) i.e; if and only if f (vi , vj ) =
f (vj , vi ) (f (vi , vj ) = −f (vj , vi )) for all i and j, where B = (v1 , v2 . . . , vn )
is an ordered basis of V .

Theorem 9.1.10 (Effect of Change of basis) Let B = (v1 , v2 , . . . , vn )


and B 0 = (w1 , w2 , . . . , wn ) be any two ordered bases of V . Let f be
a bilinear form on V . Then

[f ]B0 = P t [f ]B P,

where P is the transition matrix from B to B 0 , i.e; B 0 = BP .


Proof: Let $w_j = \sum_{i=1}^{n} p_{ij} v_i$ for each $j$. Then
\[
B' = (B[w_1]_B \;\; B[w_2]_B \;\; \ldots \;\; B[w_n]_B) = BP,
\]
where $P$ is the transition matrix whose $j$-th column is $[w_j]_B$. Let $v, w \in V$. Then
\[
[v]_B = P[v]_{B'} \qquad \ldots (1)
\]
and
\[
[w]_B = P[w]_{B'} \qquad \ldots (2)
\]
Hence
\begin{align*}
f(v, w) &= [v]_B^t [f]_B [w]_B \\
&= (P[v]_{B'})^t [f]_B (P[w]_{B'}) \\
&= [v]_{B'}^t P^t [f]_B P [w]_{B'}
\end{align*}
By the uniqueness of the matrix representation, it follows that
\[
[f]_{B'} = P^t [f]_B P.
\]

Definition 9.1.11 An $n \times n$ matrix $B$ over a field $F$ is said to be congruent to an $n \times n$ matrix $A$ over the same field $F$ if there exists an invertible $n \times n$ matrix $P$ such that $B = P^t A P$.

Definition 9.1.12 An $m \times n$ matrix $B$ over a field $F$ is said to be equivalent to an $m \times n$ matrix $A$ over the same field $F$ if there exist an invertible $m \times m$ matrix $P$ and an invertible $n \times n$ matrix $Q$ such that $B = P^t A Q$.

Example 9.1.13 Let $f : \mathbb{R}^2 \times \mathbb{R}^3 \to \mathbb{R}$ be the bilinear form given by
\[
f(X, Y) = 2x_1 y_2 - x_2 y_1 + 9x_2 y_3,
\]
where $X = (x_1\; x_2)^t$ and $Y = (y_1\; y_2\; y_3)^t$. Take $B = \{e_1, e_2\}$ and $B' = \{e_1', e_2', e_3'\}$, where
\[
e_1 = (1\; 0)^t, \quad e_2 = (0\; 1)^t, \quad e_1' = (1\; 0\; 0)^t, \quad e_2' = (0\; 1\; 0)^t, \quad e_3' = (0\; 0\; 1)^t.
\]
Then $a_{11} = f(e_1, e_1') = f\big((1\; 0)^t, (1\; 0\; 0)^t\big) = 2(1 \cdot 0) - 0 \cdot 1 + 9(0 \cdot 0) = 0$.
Indeed, $a_{ij} = f(e_i, e_j')$ is the coefficient of $x_i y_j$. Thus $a_{12} = 2$, $a_{13} = 0$, $a_{21} = -1$, $a_{22} = 0$ and $a_{23} = 9$.
Thus, the matrix of $f$ relative to the standard ordered bases $B$ of $\mathbb{R}^2$ and $B'$ of $\mathbb{R}^3$ is
\[
[f]_{B,B'} = \begin{pmatrix} 0 & 2 & 0 \\ -1 & 0 & 9 \end{pmatrix}
\]

It should be noted that if $f$ is a bilinear form on $\mathbb{R}^n$, then $a_{ij} = f(e_i, e_j)$ is the coefficient of $x_i y_j$ relative to the standard ordered basis $B = \{e_1, e_2, \ldots, e_n\}$.
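The recipe $a_{ij} = f(e_i, e_j')$ is easy to mechanize. The following Python sketch (assuming numpy; the function and variable names are illustrative, not the book's) rebuilds the matrix of Example 9.1.13 by evaluating $f$ on pairs of standard basis vectors:

    import numpy as np

    # The bilinear form of Example 9.1.13: f(X, Y) = 2 x1 y2 - x2 y1 + 9 x2 y3.
    def f(X, Y):
        return 2*X[0]*Y[1] - X[1]*Y[0] + 9*X[1]*Y[2]

    E2 = np.eye(2)   # columns are the standard basis e1, e2 of R^2
    E3 = np.eye(3)   # columns are the standard basis e1', e2', e3' of R^3

    # a_ij = f(e_i, e_j') gives the matrix of f relative to B and B'.
    A = np.array([[f(E2[:, i], E3[:, j]) for j in range(3)] for i in range(2)])
    print(A)         # [[ 0.  2.  0.]
                     #  [-1.  0.  9.]]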

Example 9.1.14 Let $f : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ be the bilinear form given by
\[
f(X, Y) = 2x_1 y_2 + x_2 y_1 + x_2 y_2,
\]
where $X = (x_1\; x_2)^t$ and $Y = (y_1\; y_2)^t$. Take $B = \{e_1, e_2\}$ and $B' = \{e_1', e_2'\}$, where
\[
e_1 = (1\; 0)^t, \quad e_2 = (0\; 1)^t, \quad e_1' = (1\; 1)^t, \quad \text{and} \quad e_2' = (1\; -1)^t.
\]

Then
\begin{align*}
f(e_1, e_1) &= f\big((1\; 0)^t, (1\; 0)^t\big) = 2 \cdot 1 \cdot 0 + 0 \cdot 1 + 0 \cdot 0 = 0 \\
f(e_1, e_2) &= f\big((1\; 0)^t, (0\; 1)^t\big) = 2 \cdot 1 \cdot 1 + 0 \cdot 0 + 0 \cdot 1 = 2 \\
f(e_2, e_1) &= f\big((0\; 1)^t, (1\; 0)^t\big) = 2 \cdot 0 \cdot 0 + 1 \cdot 1 + 1 \cdot 0 = 1 \\
f(e_2, e_2) &= f\big((0\; 1)^t, (0\; 1)^t\big) = 2 \cdot 0 \cdot 1 + 1 \cdot 0 + 1 \cdot 1 = 1
\end{align*}
Hence
\[
[f]_B = \begin{pmatrix} 0 & 2 \\ 1 & 1 \end{pmatrix}
\]
Next,
\begin{align*}
f(e_1', e_1') &= f\big((1\; 1)^t, (1\; 1)^t\big) = 2 \cdot 1 \cdot 1 + 1 \cdot 1 + 1 \cdot 1 = 4 \\
f(e_1', e_2') &= f\big((1\; 1)^t, (1\; -1)^t\big) = 2 \cdot 1 \cdot (-1) + 1 \cdot 1 + 1 \cdot (-1) = -2 \\
f(e_2', e_1') &= f\big((1\; -1)^t, (1\; 1)^t\big) = 2 \cdot 1 \cdot 1 + (-1) \cdot 1 + (-1) \cdot 1 = 0 \\
f(e_2', e_2') &= f\big((1\; -1)^t, (1\; -1)^t\big) = 2 \cdot 1 \cdot (-1) + (-1) \cdot 1 + (-1) \cdot (-1) = -2
\end{align*}
Hence
\[
[f]_{B'} = \begin{pmatrix} 4 & -2 \\ 0 & -2 \end{pmatrix}
\]
Observe that $e_1' = e_1 + e_2$ and $e_2' = e_1 - e_2$, hence the transition matrix $P$ is
\[
P = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
\]
Clearly $P^t = P$ and
\[
P^t [f]_B P
= \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
\begin{pmatrix} 0 & 2 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
= \begin{pmatrix} 1 & 3 \\ -1 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
= \begin{pmatrix} 4 & -2 \\ 0 & -2 \end{pmatrix}
= [f]_{B'}.
\]
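The hand computation above can be confirmed in a few lines of Python (assuming numpy; this is only a verification sketch with names of our choosing):

    import numpy as np

    # Verify [f]_{B'} = P^t [f]_B P for Example 9.1.14.
    fB = np.array([[0, 2],
                   [1, 1]])    # [f]_B relative to the standard basis
    P  = np.array([[1,  1],
                   [1, -1]])   # transition matrix: columns are e1', e2' in B-coordinates

    print(P.T @ fB @ P)        # [[ 4 -2]
                               #  [ 0 -2]], which agrees with [f]_{B'}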
Exercise 9.1.15 Find the matrices of the following bilinear forms on $\mathbb{R}^3$:
(i) $f(X, Y) = x_1 y_3 + y_1 x_3$,
(ii) $f(X, Y) = x_1 y_3 + x_2 y_1 - 9x_3 y_2$
relative to the bases $B = \{(1\; 0\; 0)^t, (0\; 1\; 0)^t, (0\; 0\; 1)^t\}$ and $B' = \{(1\; 0\; 1)^t, (0\; 1\; -1)^t, (1\; 1\; 1)^t\}$. Also find $P$ such that $[f]_{B'} = P^t [f]_B P$ in each case.

9.2 Quadratic forms

Definition 9.2.1 An expression of the form $\sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$, where $x_1, x_2, \ldots, x_n$ are $n$ real variables and the $a_{ij}$'s are real numbers, is called a quadratic form in $n$ real variables over $\mathbb{R}$ (the set of real numbers). It is denoted by $q(X)$, where $X = (x_1\; x_2\; \ldots\; x_n)^t \in \mathbb{R}^n$. Thus,
\[
q(X) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j.
\]

For example, the expression $ax^2 + by^2 + 2hxy$ is a quadratic form in the two variables $x, y$. Similarly, the expression $ax^2 + by^2 + cz^2 + 2fyz + 2gzx + 2hxy$ is a quadratic form in the three variables $x, y$ and $z$.
Observe that each term of the quadratic form $q(X)$ has degree two in the variables. The form can also be written as $q(X) = X^t A X$, where $A = (a_{ij})$. Since $q(X)$ is a scalar, $q(X) = q(X)^t$, and hence
\[
q(X) + q(X) = q(X)^t + q(X).
\]
This gives $2q(X) = X^t (A^t + A) X$, as $q(X)^t = (X^t A X)^t = X^t A^t X$. Thus, we have
\[
q(X) = X^t \left( \tfrac{1}{2}(A + A^t) \right) X \tag{9.2.1}
\]
where $\frac{1}{2}(A + A^t)$ is symmetric. Clearly, if $B$ is a real symmetric matrix such that $q(X) = X^t B X$, then $B = \frac{1}{2}(A + A^t)$, and it is unique.

Definition 9.2.2 The real symmetric matrix $\frac{1}{2}(A + A^t)$ of the quadratic form $q(X) = X^t A X$ is called the matrix associated with the quadratic form, or simply the matrix of the quadratic form. In other words, the real symmetric matrix $B$ such that $X^t B X = q(X)$ is called the matrix of the quadratic form. Thus, if $q(X) = X^t A X$, then its matrix is $B = \frac{1}{2}(A + A^t)$.
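The passage from an arbitrary $A$ to the symmetric matrix $\frac{1}{2}(A + A^t)$ is easily illustrated numerically; in the Python sketch below (assuming numpy; the data are our own illustrative choices), both matrices return the same value of $q$ at a test vector:

    import numpy as np

    # q(X) = x^2 + xy written with a non-symmetric A; the matrix of the
    # quadratic form is B = (A + A^t)/2, which represents the same q.
    A = np.array([[1.0, 1.0],
                  [0.0, 0.0]])
    B = (A + A.T) / 2            # [[1.0, 0.5], [0.5, 0.0]]

    X = np.array([3.0, -2.0])    # an arbitrary test vector
    print(X @ A @ X, X @ B @ X)  # 3.0 3.0 (both equal 9 - 6)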

Example 9.2.3 The matrix of the quadratic form $q(X) = x^2 + xy$ is
\[
\begin{pmatrix} 1 & 1/2 \\ 1/2 & 0 \end{pmatrix}.
\]

Example 9.2.4 Thequadratic formin three variables x, y, z whose


1 1/2 3
associated matrix is 1/2 5 0 , is x2 + xy + 6xz + 5y 2 − z 2 .
3 0 −1
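Reading off the form from its matrix is just the symbolic expansion of $X^t A X$; a short sketch using sympy (assuming sympy is available; the variable names are ours) reproduces Example 9.2.4:

    import sympy as sp

    # Expand q(X) = X^t A X symbolically for the matrix of Example 9.2.4.
    x, y, z = sp.symbols('x y z')
    X = sp.Matrix([x, y, z])
    A = sp.Matrix([[1, sp.Rational(1, 2), 3],
                   [sp.Rational(1, 2), 5, 0],
                   [3, 0, -1]])
    q = sp.expand((X.T * A * X)[0])
    print(q)   # x**2 + x*y + 6*x*z + 5*y**2 - z**2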

Exercise 9.2.5 Find the real symmetric matrix $A$ for the quadratic form $q(X) = x_1^2 - x_3 x_4$ in the variables $x_1, x_2, x_3$ and $x_4$.

Exercise 9.2.6 Find the quadratic forms in three variables $x, y, z$ whose associated matrices are
\[
\text{(i)}\;\begin{pmatrix} 1 & 1/2 & 3/2 \\ 1/2 & 2 & 9 \\ 3/2 & 9 & -1/4 \end{pmatrix}
\qquad
\text{(ii)}\;\begin{pmatrix} 1 & 0 & -1 \\ 0 & 2 & 0 \\ -1 & 0 & -1 \end{pmatrix}
\]

Exercises

1. Find the matrix of the quadratic form
\[
ax^2 + by^2 + cz^2 + 2fyz + 2gzx + 2hxy.
\]

2. Find the matrix of the quadratic form $ax^2 + by^2 + 2hxy$.

3. Find the matrix of the bilinear form $x_1 y_2 + y_1 x_2$ on $\mathbb{R}^2$ relative to the ordered basis $B = \{(1, -1), (0, 1)\}$.

4. Find the matrix of the bilinear form
\[
x_1 y_1 + x_2 y_2 + 3x_1 y_2 + 5x_2 y_1
\]
on $\mathbb{R}^2$ relative to the ordered basis $B = \{(1, -1), (1, 1)\}$.

5. Find the matrix of the bilinear form
\[
x_1 y_1 + 2x_2 y_2 + x_2 y_1
\]
on $\mathbb{R}^2$ relative to the ordered basis $B = \{(1, 0), (0, 1)\}$.


9.2. QUADRATIC FORMS 243

6. Let $f$ be the bilinear form on $\mathbb{R}^3$ given by
\[
f(X, Y) = X^t A Y,
\]
where $X = (x_1\; x_2\; x_3)^t$, $Y = (y_1\; y_2\; y_3)^t \in \mathbb{R}^3$. Write $f(X, Y)$ explicitly when the matrix $A$ is given by
\[
\text{(i)}\;\begin{pmatrix} 1 & 1 & 1/2 \\ 1 & 2 & 9 \\ 1/2 & 9 & -1/4 \end{pmatrix}
\qquad
\text{(ii)}\;\begin{pmatrix} 1 & 1 & -1 \\ 1 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}
\]

7. Diagonalize the matrix of the quadratic form
\[
ax^2 + by^2 + cz^2 + 2fyz + 2gzx + 2hxy.
\]

8. Diagonalize the matrix of the quadratic form $ax^2 + by^2 + 2hxy$.

9. Find the matrix of the quadratic form $xy$ on $\mathbb{R}^2$. Test whether the matrix is diagonalizable.

10. Find the quadratic forms in three variables $x, y, z$ corresponding to the following matrices:
\[
\text{(i)}\;\begin{pmatrix} 1 & 1 & 1/2 \\ 1 & 2 & 0 \\ 1/2 & 0 & 4 \end{pmatrix}
\qquad
\text{(ii)}\;\begin{pmatrix} 1 & 1 & -1 \\ 1 & -1 & 0 \\ -1 & 0 & -1 \end{pmatrix}
\]

Index

addition of matrices, 5
adjoint matrix, 26
algebraic multiplicities, 94
annihilator, 191
anti-symmetric, 237
basis, 136
bilinear, 233
bilinear form, 233
Change of basis, 208
characteristic equation, 93
characteristic polynomial, 93
co-factor, 25
cofactor matrix, 26
column rank, 48
column vector, 4
congruent, 238
coordinate matrix, 154
diagonal matrix, 4
diagonalizable, 103
dimension, 140
echelon form of matrix, 40
eigen space, 92, 126
eigen value, 91
eigen vector, 91
elementary column operations, 31
elementary row operations, 31
equivalent, 239
finite-dimensional vector space, 138
geometric multiplicity, 92
hermitian matrix, 18
homogeneous linear equations, 68
hyperspaces, 174
idempotent matrix, 23
identity matrix, 5
infinite dimensional vector space, 138
injective, 163
inverse, 13
involutory matrix, 23
isomorphism, 163
kernel, 168
linear combination, 44, 129
linear operator, 163
linear span, 129
linear sum, 146
linear transformation, 163
linearly dependent, 45, 132
linearly independent, 45, 134
matrix multiplication, 8
matrix of bilinear form, 237
multiplication, 183
nilpotent matrix, 23
non-homogeneous linear equations, 68
non-singular, 178
Normal form, 54
null space, 168
nullity, 168
ordered basis, 153
orthogonal, 222
orthogonal matrix, 23
quadratic form, 241
quotient space, 152
rank, 50, 168
Rank-nullity theorem, 173
reduced row-echelon form, 42
row rank, 48
row vector, 4
row-echelon form, 38
scalar matrix, 4
scalar multiplication, 6
similar, 211
similar matrix, 92
singular, 178
skew-Hermitian matrix, 18
skew-symmetric, 14
subspace, 125
surjective, 163
symmetric, 13, 237
transition matrix, 103, 156
transpose, 13
triangular matrix, 4
unitary matrix, 23
upper (lower) triangular, 4
vector space, 121
zero matrix, 4
