Linear Algebra
SMTA022
Dr Amin Saeidi
Department of Mathematics and Applied Mathematics
School of Mathematical and Computer Sciences
University of Limpopo
July 2023
Preface
This guide has been prepared specifically for third-year students and aims to
cover the essential aspects of the theory. It assumes that you have already
completed the course on Set Theory, Linear and Abstract Algebra and have
a basic understanding of concepts like matrices and their properties.
The study guide is divided into six chapters. Chapter Zero provides ele-
mentary facts on matrix theory that will be useful throughout the semester.
Chapter 1 introduces fundamental concepts such as vectors, vector spaces,
linear independence, span, and linear combinations. Subsequent chapters
delve into finite-dimensional vector spaces, linear maps and matrices, eigen-
vectors and eigenvalues, and scalar products and orthogonality. Each chapter
explores key properties, applications, and relationships in a concise format.
You will also find numerous solved examples that demonstrate the practical
application of the theorems, together with unsolved exercises. At the end of
each chapter, there are some unsolved problems of a more abstract nature
that help students understand the theory more deeply.
While I have aimed to create a study guide that caters to the needs of the
average student, I encourage those of you who wish to delve deeper into the
subject to not limit yourselves solely to this guide. The provided references
are excellent sources for further study and exploration.
Your feedback and suggestions are highly valued. If you come across any
errors, mistakes, or have any recommendations to improve this study guide,
please don’t hesitate to contact me at [email protected]. Your input will
greatly contribute to making this study guide more effective and beneficial
for all students.
Dr. Amin Saeidi,
University of Limpopo,
July 2023
Contents

0 Matrices: Elementary Facts
0.1 Introduction to Matrices
0.2 Matrix Operations
0.3 Matrix Properties

1 Vector Spaces
1.1 Vector space
1.2 Some examples of Vector spaces
1.3 Subspaces
1.4 Direct sum of vector spaces
1.5 Bases and Dimension

2 Linear maps
2.1 Basic notions
2.2 Linear maps and Matrices
2.3 Coordinates
2.4 Nullity and Rank of a linear map
2.4.1 Nullity and Rank of matrices
2.5 Linear Forms and Dual Spaces

3 Eigenvectors and eigenvalues
3.1 Eigenvectors and eigenvalues
3.2 Characteristic polynomial
3.3 Diagonalization

4 Orthogonality
4.1 Inner product
4.2 Norm of a vector
4.3 Orthogonal bases
List of Symbols and Notations
F                  Field (e.g., R for real numbers, C for complex numbers)
F^n                Column vector space of dimension n over F
F^{m×n}            Space of m × n matrices over F
⃗0                  Zero vector or zero matrix
I                  Identity matrix (when dimensions are clear from context)
⃗v                  Column vector
⃗v^T                Row vector (transpose of ⃗v)
A, B, C            Matrices
A^T                Transpose of matrix A
A^{−1}             Inverse of matrix A
rank(A)            Rank of matrix A
ker(A)             Null space or kernel of matrix A
span(⃗v1, ⃗v2, . . . , ⃗vn)   Span of the vectors ⃗v1, ⃗v2, . . . , ⃗vn
B                  Basis for a vector space
[A]ij              Element in the i-th row and j-th column of matrix A
AB                 Matrix multiplication of A and B
A + B              Matrix addition of A and B
kA                 Scalar multiplication of matrix A by k
det(A)             Determinant of matrix A
Tr(A)              Trace of matrix A (sum of diagonal elements)
Chapter 0
Matrices: Elementary Facts
In this chapter, we will review some elementary facts about matrices without
going into rigorous proofs. These concepts serve as the foundation for further
study in this module.
0.1 Introduction to Matrices
Definition. A matrix is a rectangular array of numbers or elements, ar-
ranged in rows and columns. It is often denoted by a capital letter.
Notation. An m × n matrix A can be represented as:
A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}
where aij represents the element in the ith row and jth column.
Definition. A square matrix is a matrix in which the number of rows is
equal to the number of columns (i.e., m = n).
Definition. A zero matrix is a matrix in which all elements are zero.
Definition. An identity matrix is a square matrix in which all elements on
the main diagonal (from top-left to bottom-right) are 1, and all other elements are 0. It is often denoted by I or I_n to indicate its size.
Definition. Two matrices are equal if their sizes and corresponding entries
are equal.
Definition. A diagonal matrix is a square matrix in which all non-diagonal
elements are zero. The diagonal elements may be nonzero or zero.
Definition. An upper triangular matrix is a square matrix in which all ele-
ments below the main diagonal are zero.
Definition. A lower triangular matrix is a square matrix in which all ele-
ments above the main diagonal are zero.
Definition. A square matrix A is invertible if there exists another matrix
B such that A × B = B × A = I. Here B is called the inverse of A and is
denoted by A^{−1}.
0.2 Matrix Operations
Definition. The sum of two matrices A and B of the same size is a matrix
obtained by adding corresponding elements. Matrix addition is only defined
for matrices of the same size.
Definition. Multiplying a matrix A by a scalar k involves multiplying each
element of A by k.
Definition. The product of two matrices A and B is a matrix obtained by
multiplying the elements of each row of A by the corresponding elements of
each column of B and summing the results.
Definition. The transpose of a matrix A, denoted by AT , is a matrix ob-
tained by interchanging its rows with columns. The element in the ith row
and jth column of AT is equal to the element in the jth row and ith column
of A.
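As a quick illustration of these operations, here is a minimal sketch in Python with NumPy (NumPy is an assumption here, not part of the module; any linear algebra package would serve):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

print(A + B)   # entrywise sum of two matrices of the same size
print(3 * A)   # scalar multiplication: each entry multiplied by 3
print(A @ B)   # matrix product: rows of A against columns of B
print(A.T)     # transpose: rows and columns interchanged
```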
0.3 Matrix Properties
Definition. Two n × n matrices A and B are said to be similar if there
exists an invertible matrix P such that B = P −1 AP .
The determinant of similar matrices is the same.
The trace of similar matrices is the same.
Definition. A matrix is in row echelon form if it satisfies the following
conditions:
– All rows consisting entirely of zeros are at the bottom.
– For each row that contains nonzero elements, the leftmost nonzero element is called a pivot, and it is strictly to the right of the pivot of the row above it.

A matrix can be transformed into row echelon form using elementary row operations (swapping rows, multiplying a row by a nonzero scalar, adding a multiple of one row to another). A square matrix A is invertible if and only if its row echelon form has no zero rows.
Definition. The determinant of a square matrix A, denoted by det(A) or
|A|, is a scalar value that provides information about the matrix.
The determinant of a matrix is only defined for square matrices.
If A is an upper triangular matrix, the determinant is the product of
the diagonal entries of A.
The determinant of a matrix is nonzero if and only if the matrix is
invertible.
The determinant satisfies the following properties:
– det(AT ) = det(A) (transpose property)
– det(AB) = det(A) det(B) (multiplicative property)
– det(kA) = k^n det(A) for a scalar k and an n × n matrix A
– A and its row echelon form need not have equal determinants.
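These determinant properties are easy to test numerically. The following sketch (again assuming NumPy) checks them on a random pair of 3 × 3 matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.random((3, 3)), rng.random((3, 3))
k, n = 2.5, 3

# transpose property: det(A^T) = det(A)
assert np.isclose(np.linalg.det(A.T), np.linalg.det(A))
# multiplicative property: det(AB) = det(A) det(B)
assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))
# scaling property: det(kA) = k^n det(A) for an n x n matrix
assert np.isclose(np.linalg.det(k * A), k**n * np.linalg.det(A))
```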
Chapter 1
Vector Spaces
1.1 Vector space
The set of real numbers R with the binary operation + has the following
properties:
1. ” + ” is closed; that is, if a, b ∈ R, then so is a + b.
2. ” + ” is associative; that is, if a, b, c ∈ R, then (a + b) + c = a + (b + c).
3. There exists an identity element, namely 0 and we have a + 0 = 0 + a
for all a ∈ R.
4. Every element a in R has an inverse, namely −a and we have a+(−a) =
(−a) + a = 0.
5. ” + ”is commutative, that is if a, b ∈ R then a + b = b + a.
Every set with a binary operation that has the above five properties is
called an abelian group. If the binary operation is +, then we call it an
additive group. Sometimes the binary operation is not +. For example,
R−{0} makes an abelian group with multiplication. Note that in this group,
the identity element is 1, and the inverse of an element a is a−1 . So based on
the above discussion, the set R has two operations, ” + ” and ”.”.
1. R with ” + ” is an abelian group.
2. R − {0} with ”.” is an abelian group.
3. ”.” distributes over ” + ”. That is, for a, b, c ∈ R, we have a.(b + c) =
a.b + a.c.
Every structure with the above three properties is called a field. It is easy
to see that the C and Q are also fields. We can give examples of finite fields.
However, in linear algebra, we mainly consider the field of real numbers R or
complex numbers C.
Lemma 1.1.1. Let A be an additive abelian group containing the elements
a, b and c. Then the following statements hold:
1. The zero element is unique and we have 0 + 0 = 0.
2. The cancellation laws hold. That is, a + b = a + c implies that b = c.
Similarly, if b + a = c + a then b = c.
3. If a + b = 0, then a = −b.
Proof. Exercise.
Definition 1.1.2. A non-empty set V is called a vector space over a field F (or an F-vector space) if V is an additive abelian group and there is a function · : F × V → V (scalar multiplication, written α.⃗v) such that for all α, β ∈ F and all ⃗v, w⃗ ∈ V the following properties hold:

1. α.(⃗v + w⃗) = α.⃗v + α.w⃗;

2. (α + β).⃗v = α.⃗v + β.⃗v;

3. (αβ).⃗v = α.(β.⃗v);

4. 1.⃗v = ⃗v.
The elements of the field are called scalars while the elements of V are
called vectors. If F is the field of real numbers, then we call V a real vector
space or an R-vector space. If F = C then V is called a complex vector
space. We denote the scalars by Greek letters α, β, etc. We also denote the
vectors by ⃗v, w⃗, and so on, to help students understand the difference. However, students are expected to discern between scalars and vectors without relying on specific notation.
Lemma 1.1.3. For every ⃗v ∈ V and α ∈ F ,
1. 0.⃗v = ⃗0,
2. α.⃗0 = ⃗0,
3. −(α.⃗v ) = α.(−⃗v ) = (−α).⃗v ,
4. α.⃗v = ⃗0 implies either α = 0 or ⃗v = ⃗0.
Proof. 1. We have (0 + 0).⃗v = 0.⃗v + 0.⃗v . Since 0 + 0 = 0, we have
0.⃗v = 0.⃗v + 0.⃗v . Now by the cancellation law, 0.⃗v = ⃗0.
2. It is similar to the first part:
α.(⃗0 + ⃗0) = α.⃗0 + α.⃗0.
Since ⃗0 + ⃗0 = ⃗0, we have α.⃗0 = α.⃗0 + α.⃗0. Now by the cancellation law,
α.⃗0 = ⃗0.
3. We have
α.⃗v + (−α).⃗v = (α + (−α)).⃗v = 0.⃗v = ⃗0.
Therefore, (−α).⃗v = −(α.⃗v ). Similarly,
α.⃗v + α.(−⃗v ) = α.(⃗v + (−⃗v )) = α.⃗0 = ⃗0.
Therefore, α.(−⃗v ) = −(α.⃗v ).
4. Suppose that α.⃗v = ⃗0 and assume that α ̸= 0. Then we multiply both sides by α^{−1}. Therefore,
α^{−1}.(α.⃗v) = α^{−1}.⃗0.
Hence, 1.⃗v = ⃗0. That is, ⃗v = ⃗0.
Remark 1.1.4. In a vector space, we encounter two entities that need to
be distinguished: the scalar zero, denoted as 0 ∈ F , and the vector zero,
denoted as ⃗0 in the vector space V . However, we sometimes show the zero
vector simply by 0 which should not cause ambiguity.
Definition 1.1.5. A vector ⃗v in V is said to be a linear combination of the
vectors ⃗v1 , . . . , ⃗vn in V provided there exist scalars α1 , . . . , αn in F such that
⃗v = α1⃗v1 + . . . + αn⃗vn = \sum_{i=1}^{n} αi⃗vi.
The associative property of vector addition and the distributive properties
of scalar multiplication apply to linear combinations as well. For example,
for linear combinations ⃗v = \sum_{i=1}^{n} αi⃗vi and w⃗ = \sum_{i=1}^{n} βi⃗vi, we have:

\sum_{i=1}^{n} αi⃗vi + \sum_{i=1}^{n} βi⃗vi = \sum_{i=1}^{n} (αi + βi)⃗vi, and

β · \sum_{i=1}^{n} αi⃗vi = \sum_{i=1}^{n} (β · αi)⃗vi,

where β is a scalar.
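These identities can also be checked numerically. A small sketch (assuming NumPy; the vectors and scalars below are arbitrary choices for illustration):

```python
import numpy as np

v = [np.array([1.0, 2.0]), np.array([0.0, 1.0])]   # the vectors v_1, v_2
alphas = [2.0, -1.0]                               # coefficients of the first combination
betas = [0.5, 3.0]                                 # coefficients of the second combination
beta = 4.0                                         # an extra scalar

comb = lambda cs: sum(c * vi for c, vi in zip(cs, v))

# the sum of two linear combinations is the combination of summed coefficients
assert np.allclose(comb(alphas) + comb(betas),
                   comb([a + b for a, b in zip(alphas, betas)]))
# a scalar multiple of a combination rescales all the coefficients
assert np.allclose(beta * comb(alphas), comb([beta * a for a in alphas]))
```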
1.2 Some examples of Vector spaces
Example 1. The n-tuple space, F n . Let F be any field, and let V be the set
of all n-tuples ⃗a = (x1 , x2 , . . . , xn ) of scalars xi in F . If ⃗b = (y1 , y2 , . . . , yn )
with yi in F , the sum of ⃗a and ⃗b is defined by
⃗a + ⃗b = (x1 + y1 , x2 + y2 , . . . , xn + yn ) (1.2.1)
The product of a scalar α and vector ⃗a is defined by
α⃗a = (αx1 , αx2 , . . . , αxn ) (1.2.2)
As an exercise, verify that F n is a vector space.
Example 2. The space of m × n matrices, F m×n (note that in this example,
the vectors are matrices). Let F be any field, and let m and n be positive
integers. Let F m×n be the set of all m × n matrices over the field F . The
sum of two matrices A and B in F m×n is defined by
(A + B)ij = Aij + Bij (1.2.3)
The product of a scalar α and the matrix A is defined by
(αA)ij = α · Aij (1.2.4)
Note that F 1×n = F n .
Example 3. The space of functions from a set to a field (here the vectors
are functions). Let F be any field, and let S be any non-empty set. Let V
be the set of all functions from the set S into F . The sum of two vectors f
and g in V is the vector f + g, i.e., the function from S into F defined by
(f + g)(s) = f (s) + g(s). (1.2.5)
The product of the scalar c and the function f is the function cf defined by
(cf )(s) = cf (s). (1.2.6)
The preceding examples are special cases of this one. An n-tuple of elements
of F may be regarded as a function from the set S of integers 1, 2, . . . , n into
F . Similarly, an m × n matrix over the field F is a function from the set S
of pairs of integers (i, j), 1 ≤ i ≤ m, 1 ≤ j ≤ n, into the field F .
For this third example, we shall indicate how one verifies that the oper-
ations we have defined satisfy conditions of a vector space.
For vector addition:
(a) Since addition in F is commutative, f (s) + g(s) = g(s) + f (s) for each
s in S, so the functions f + g and g + f are identical.
(b) Since addition in F is associative, f (s) + [g(s) + h(s)] = [f (s) + g(s)] +
h(s) for each s, so f + (g + h) is the same function as (f + g) + h.
(c) The unique zero vector is the zero function which assigns to each ele-
ment of S the scalar 0 in F .
(d) For each f in V , (−f ) is the function which is given by (−f )(s) =
−f (s).
The reader should find it easy to verify that scalar multiplication satisfies
the remaining conditions of a vector space by arguing as we did with the
vector addition.
Example 4. The space of polynomial functions over a field F . Let F be a
field, and let V = F [X] be the set of all functions f from F into F which
have a rule of the form
f (x) = c0 + c1 x + . . . + cn xn (1.2.7)
where c0 , c1 , . . . , cn are fixed scalars in F (independent of x). A function
of this type is called a polynomial function on F . Let addition and scalar
multiplication be defined as in Example 3. It is important to note that if f
and g are polynomial functions, and c is in F , then f + g and c · f are again
polynomial functions.
1.3 Subspaces
Definition 1.3.1. Let V be a vector space over the field F . A subspace of V
is a subset W of V which is itself a vector space over F with the operations
of vector addition and scalar multiplication on V .
Theorem 1.3.2. A non-empty subset W of V is a subspace of V if and only if for each pair of vectors ⃗v, w⃗ in W and each scalar α in F, the vector α⃗v + w⃗ is again in W.
Proof. Suppose that W is a non-empty subset of V such that α⃗v + w⃗ belongs to W for all vectors ⃗v, w⃗ in W and all scalars α in F. Since W is non-empty, there is a vector p⃗ in W, and hence (−1)p⃗ + p⃗ = ⃗0 is in W. Then, if ⃗v is any vector in W and α any scalar, the vector α⃗v = α⃗v + ⃗0 is in W. In particular, (−1)⃗v = −⃗v is in W. Finally, if ⃗v and w⃗ are in W, then ⃗v + w⃗ = 1⃗v + w⃗ is in W. Thus W contains ⃗0 and is closed under addition, additive inverses, and scalar multiplication; the remaining vector space axioms are inherited from V, so W is a subspace of V.
Conversely, if W is a subspace of V, ⃗v and w⃗ are in W, and α is a scalar, certainly α⃗v + w⃗ is in W.
Example 5. (a) If V is any vector space, V is a subspace of V ; the subset
consisting of the zero vector alone is a subspace of V , called the zero subspace
of V .
(b) In F n , the set of n-tuples (x1 , . . . , xn ) with x1 = 0 is a subspace;
however, the set of n-tuples with x1 = 1 + x2 is not a subspace (n ≥ 2).
(c) The space of polynomial functions over the field F is a subspace of
the space of all functions from F into F .
(d) An n × n (square) matrix A over the field F is symmetric if Aij = Aji
for each i and j. The symmetric matrices form a subspace of the space of all
n × n matrices over F .
Theorem 1.3.3. Let V be a vector space over the field F . The intersection
of any collection of subspaces of V is a subspace of V .
Proof. Let {Wi} be a collection of subspaces of V, and let W = \bigcap_i Wi be
their intersection. Since each Wi is a subspace, each contains the zero vector.
Thus, the zero vector is in the intersection W , and W is non-empty.
Let ⃗v and w⃗ be vectors in W and let α be a scalar. By definition of W ,
both ⃗v and w ⃗ belong to each Wi , and because each Wi is a subspace, the
vector α⃗v + w
⃗ is in every Wi . Thus, α⃗v + w
⃗ is again in W . By Theorem 1.3.2,
W is a subspace of V .
Definition 1.3.4. Let S be a set of vectors in a vector space V . The subspace
spanned by S (denoted by span(S)) is defined to be the intersection W
of all subspaces of V which contain S. When S is a finite set of vectors,
S = {⃗a1 , ⃗a2 , . . . , ⃗an }, we shall simply call W the subspace spanned by the
vectors ⃗a1 , ⃗a2 , . . . , ⃗an . Note that S ⊆ W .
Exercise 1. Let S be a subset of a vector space V and W = span(S).
Prove that W is the smallest subspace of V that contains S. In particular,
S = span(S) if and only if S is a subspace of V.
Theorem 1.3.5. The subspace spanned by a non-empty subset S of a vector
space V is the set of all linear combinations of vectors in S.
Proof. Let W = span(S). Since W is a subspace of V containing S, every linear combination of vectors in S lies in W. So if L is the set of all linear combinations of vectors in S, we have L ⊆ W. On the other hand, the set L contains S and is non-empty. If ⃗u, w⃗ belong to L, then ⃗u is a linear combination
⃗u = a1⃗v1 + a2⃗v2 + . . . + am⃗vm
of vectors ⃗vi in S, and w⃗ is a linear combination
w⃗ = b1⃗u1 + b2⃗u2 + . . . + bn⃗un
of vectors ⃗uj in S. For each scalar α,
α⃗u + w⃗ = \sum_{i=1}^{m} (α ai)⃗vi + \sum_{j=1}^{n} bj ⃗uj.
Hence, α⃗u + w⃗ belongs to L. Thus, L is a subspace of V which contains S. Since W is the smallest subspace of V containing S (Exercise 1), we conclude that L = W.
Example 6. Let F = R. Suppose
⃗v1 = (1, 2, 0, 3, 0), ⃗v2 = (0, 0, 1, 4, 0), ⃗v3 = (0, 0, 0, 0, 1).
By Theorem 1.3.5, a vector ⃗a is in the subspace W of F 5 spanned by ⃗v1 , ⃗v2 , ⃗v3
if and only if there exist scalars α1 , α2 , α3 in F such that
⃗a = α1⃗v1 + α2⃗v2 + α3⃗v3 .
Thus, W consists of all vectors of the form
⃗a = (α1, 2α1, α2, 3α1 + 4α2, α3),
where α1, α2, α3 are arbitrary scalars in F. Alternatively, W can be described
as the set of all 5-tuples
(x1, x2, x3, x4, x5)
with xi in F such that
x2 = 2x1, x4 = 3x1 + 4x3.
So every vector in W has the form (x, 2x, y, 3x + 4y, z).
Thus, (−3, −6, 1, −5, 2) is in W , whereas (2, 4, 6, 7, 8) is not.
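Membership in a span can also be tested numerically: ⃗a lies in span(⃗v1, ⃗v2, ⃗v3) exactly when appending ⃗a to the spanning set does not increase the rank. A sketch assuming NumPy:

```python
import numpy as np

S = np.array([[1, 2, 0, 3, 0],    # v1
              [0, 0, 1, 4, 0],    # v2
              [0, 0, 0, 0, 1]])   # v3

def in_span(S, a):
    # a is in span(S) iff stacking it as an extra row keeps the rank unchanged
    return np.linalg.matrix_rank(np.vstack([S, a])) == np.linalg.matrix_rank(S)

print(in_span(S, np.array([-3, -6, 1, -5, 2])))   # True
print(in_span(S, np.array([2, 4, 6, 7, 8])))      # False
```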
Example 7. Let V be the space of all polynomial functions over R. Let S be
the subset of V consisting of the polynomial functions f0 , f1 , f2 , . . ., defined
by
fn (x) = xn ,
where n = 0, 1, 2, . . .. Then V is the subspace spanned by the set S.
Exercise 2. Which of the following sets of vectors ⃗a = (a1, . . . , an) in Rn
are subspaces of Rn (where n ≥ 3)?
1. The set of vectors ⃗a = (a1 , . . . , an ) in Rn such that a1 ̸= 0.
2. The set of vectors ⃗a = (a1 , a2 , a3 ) in Rn such that a1 + 3a2 = a3 .
3. The set of vectors ⃗a = (a1, a2, . . . , an) in Rn such that a2 = a1^2.
4. The set of vectors ⃗a = (a1 , a2 , . . . , an ) in Rn such that a1 a2 = 0.
5. The set of vectors ⃗a = (a1 , a2 , . . . , an ) in Rn such that a2 is rational.
Exercise 3. Let V be the (real) vector space of all functions from R into R.
Determine which of the following sets of functions are subspaces of V :
1. The set of all functions f such that f(x^2) = f(x)^2.
2. The set of all functions f such that f (0) = f (1).
3. The set of all functions f such that f (3) = f (1) + f (−5).
4. The set of all functions f such that f (−1) = 0.
Exercise 4. Which of the following subsets of R2 are not subspaces?
1. The line x = y.
2. The unit circle x^2 + y^2 = 1.
3. The line 2x + y = 1.
4. The set of (x, y) ∈ R2 such that x ≥ 0 and y ≥ 0.
1.4 Direct sum of vector spaces
Definition 1.4.1. If W1 , W2 , . . . , Wk are subspaces of a vector space V , the
set of all sums
⃗a1 + ⃗a2 + . . . + ⃗ak
of vectors ⃗aj in Wj is called the sum of the subspaces W1 , W2 , . . . , Wk and is
denoted by
\sum_{j=1}^{k} Wj
or by
W1 + W2 + . . . + Wk.
Exercise 5. Prove that the sum
W1 + W2 + . . . + Wk
is a subspace of V which contains each of the subspaces Wi.
Exercise 6. Prove that W1 + W2 + . . . + Wk is the subspace spanned by the
union of W1 , W2 , . . . , Wk .
Lemma 1.4.2. If W1 and W2 are two subspaces of a vector space V , then
the following properties are equivalent:
(a) W1 ∩ W2 = {⃗0}.
(b) Every vector in W1 + W2 can be written uniquely in the form w⃗1 + w⃗2, where w⃗1 ∈ W1 and w⃗2 ∈ W2.

Proof. To show that (b) implies (a), note that if ⃗v ∈ W1 ∩ W2, then ⃗v + ⃗0 = ⃗0 + ⃗v gives two expressions of the same vector as a sum of an element of W1 and an element of W2, so the uniqueness in (b) forces ⃗v = ⃗0. Conversely, suppose that (a) holds and that ⃗s + ⃗t = ⃗s0 + ⃗t0 with ⃗s, ⃗s0 ∈ W1 and ⃗t, ⃗t0 ∈ W2. Then ⃗s − ⃗s0 = ⃗t0 − ⃗t. Since ⃗s − ⃗s0 ∈ W1 and ⃗t0 − ⃗t ∈ W2, this common vector lies in W1 ∩ W2 = {⃗0}. Hence ⃗s = ⃗s0 and ⃗t = ⃗t0, which proves (b).
Definition 1.4.3. When conditions (a) and (b) of the Lemma 1.4.2 are
satisfied, we say W1 + W2 is a direct sum of W1 and W2 . Also, if W is a
direct sum of W1 and W2 , we write W = W1 ⊕ W2 .
Remark 1.4.4. We can generalize the notion of direct sum to more than two
subspaces. In fact, W = W1 ⊕ . . . ⊕ Wn if every element of W can be written
uniquely in the form w1 + w2 + . . . + wn, where wi ∈ Wi for 1 ≤ i ≤ n. A
generalization of Lemma 1.4.2 shows that this is equivalent to requiring that
each Wi intersect the sum of the other n − 1 subspaces in {0}.
Example 8. Let V be the vector space of all 2 × 2 matrices over R. Let W1 be the subset of V consisting of all matrices of the form
\begin{pmatrix} x & 0 \\ y & 0 \end{pmatrix},
where x and y are arbitrary scalars in R. Similarly, let W2 be the subset of V consisting of all matrices of the form
\begin{pmatrix} 0 & w \\ 0 & z \end{pmatrix},
where w and z are arbitrary scalars in R. Then W1 and W2 are subspaces of V. Furthermore, we can see that
W1 ∩ W2 = \left\{ \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \right\},
because a matrix in W1 has zero second column while a matrix in W2 has zero first column. Since every 2 × 2 matrix is the sum of a matrix in W1 and a matrix in W2, we conclude that V = W1 ⊕ W2.
1.5 Bases and Dimension
Definition 1.5.1. Let V be a vector space over F . A subset S of V is said
to be linearly dependent (or simply, dependent) if there exist distinct vectors
⃗a1 , ⃗a2 , . . . , ⃗an in S and scalars c1 , c2 , . . . , cn in F , not all of which are 0, such
that
c1⃗a1 + c2⃗a2 + . . . + cn⃗an = ⃗0.
A set which is not linearly dependent is called linearly independent. If
the set S contains only finitely many vectors ⃗a1 , ⃗a2 , . . . , ⃗an , we sometimes
say that ⃗a1 , ⃗a2 , . . . , ⃗an are dependent (or independent) instead of saying S is
dependent (or independent).
The following are easy consequences of the definition.
1. Any set which contains a linearly dependent set is linearly dependent.
2. Any subset of a linearly independent set is linearly independent.
3. Any set which contains the ⃗0 vector is linearly dependent, since 1·⃗0 = ⃗0.
4. A set S of vectors is linearly independent if and only if each finite subset
of S is linearly independent, i.e., if and only if for any distinct vectors
⃗a1 , . . . , ⃗an of S, c1⃗a1 + . . . + cn⃗an = ⃗0 implies each ci = 0.
Example 9. The following vectors in F 3 are linearly dependent.
⃗a1 = (3, 0, −3), ⃗a2 = (−1, 1, 2), ⃗a3 = (4, 2, −2)
Reason. We can easily check that
2⃗a1 + 2⃗a2 − ⃗a3 = ⃗0.
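Numerically, the dependence shows up as a rank deficiency: stacking the three vectors as rows gives a matrix of rank 2 rather than 3. A quick check (assuming NumPy):

```python
import numpy as np

a1, a2, a3 = np.array([3, 0, -3]), np.array([-1, 1, 2]), np.array([4, 2, -2])

print(2 * a1 + 2 * a2 - a3)                            # [0 0 0], the dependence relation
print(np.linalg.matrix_rank(np.array([a1, a2, a3])))   # 2, so the set is dependent
```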
Definition 1.5.2. Let V be a vector space. A basis for V is a linearly
independent set of vectors in V which spans the space V . In other words,
every element of V can be written as a linear combination of the elements of
the basis.
Remark 1.5.3. Let F be a field, and in F n , let S be the subset consisting
of the vectors ⃗e1 , ⃗e2 , . . . , ⃗en defined by
⃗e1 = (1, 0, 0, . . . , 0), ⃗e2 = (0, 1, 0, . . . , 0), ..., ⃗en = (0, 0, 0, . . . , 1).
Let x1 , x2 , . . . , xn be scalars in F and put ⃗a = x1⃗e1 + x2⃗e2 + . . . + xn⃗en . Then
⃗a = (x1 , x2 , . . . , xn ).
This shows that ⃗e1, ⃗e2, . . . , ⃗en span F n. Since ⃗a = ⃗0 if and only if x1 = x2 = . . . = xn = 0, the vectors ⃗e1, ⃗e2, . . . , ⃗en are linearly independent. The set S = {⃗e1, ⃗e2, . . . , ⃗en}
is accordingly a basis for F n . We shall call this particular basis the standard
basis of F n .
Example 10. Prove that B = \left\{ \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 2 \\ 4 \end{pmatrix} \right\} is not a basis for R2.

Proof. Note that the second vector is a multiple of the first one, so the two vectors are not linearly independent.
Example 11. Prove that B = \left\{ \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}, \begin{pmatrix} 0 \\ −2 \\ −1 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix} \right\} is not a basis for R3.

Proof. Note that the third vector is obtained by adding the first two vectors, so the vectors are not linearly independent.
Theorem 1.5.4. Let V be a vector space which is spanned by a finite set of
vectors {⃗v1 , ⃗v2 , . . . , ⃗vm }. Then any independent set of vectors in V is finite
and contains no more than m elements.
Proof. To prove the theorem, it suffices to show that every subset S of V
which contains more than m vectors is linearly dependent. Let S be such
a set. In S, there are distinct vectors ⃗a1 , ⃗a2 , . . . , ⃗an where n > m. Since
⃗v1 , ⃗v2 , . . . , ⃗vm span V , there exist scalars yij in F such that
⃗aj = \sum_{i=1}^{m} yij ⃗vi for 1 ≤ j ≤ n.

For any n scalars x1, x2, . . . , xn, we have

\sum_{j=1}^{n} xj⃗aj = \sum_{j=1}^{n} \sum_{i=1}^{m} yij xj ⃗vi = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} yij xj \right) ⃗vi.

Since n > m, the homogeneous system \sum_{j=1}^{n} yij xj = 0 (1 ≤ i ≤ m) has more unknowns than equations, so there exist scalars x1, x2, . . . , xn, not all 0, such that

\sum_{j=1}^{n} yij xj = 0 for 1 ≤ i ≤ m.

Hence, \sum_{j=1}^{n} xj⃗aj = ⃗0. This shows that S is a linearly dependent set.
Corollary 1.5.5. If V has a finite basis, then any two bases of V have the
same (finite) number of elements.
Proof. V has a finite basis {⃗v1, ⃗v2, . . . , ⃗vm}. By Theorem 1.5.4, every basis of
V is finite and contains no more than m elements. Thus, if {⃗a1, ⃗a2, . . . , ⃗an}
is a basis, n ≤ m. By the same argument, m ≤ n. Hence, m = n.
Definition 1.5.6. If V has a finite basis of size n, then V is called a finite-
dimensional vector space. We call n the dimension of V and denote it by
dim V .
Corollary 1.5.7. Let V be a finite-dimensional vector space and let n =
dim V . Then
(a) Any subset of V which contains more than n vectors is linearly depen-
dent.
(b) No subset of V which contains fewer than n vectors can span V .
Example 12. If F is a field, the dimension of F n is n, because the standard
basis for F n contains n vectors. The matrix space F m×n has dimension mn.
Remark 1.5.8. If V is any vector space over F , the zero subspace of V is
spanned by the vector ⃗0, but {⃗0} is a linearly dependent set and not a basis.
For this reason, we shall agree that the zero subspace has dimension 0.
Lemma 1.5.9. Let S be a linearly independent subset of a vector space V. Suppose w⃗ is a vector in V which is not in the subspace spanned by S. Then the set obtained by adjoining w⃗ to S is linearly independent.

Proof. Suppose ⃗a1, ⃗a2, . . . , ⃗am are distinct vectors in S and that
c1⃗a1 + c2⃗a2 + . . . + cm⃗am + bw⃗ = ⃗0.
Then b = 0; for otherwise,
w⃗ = −(c1/b)⃗a1 − . . . − (cm/b)⃗am
would be in the subspace spanned by S. Thus c1⃗a1 + c2⃗a2 + . . . + cm⃗am = ⃗0, and since S is a linearly independent set, each ci = 0.
Theorem 1.5.10. If W is a subspace of a finite-dimensional vector space
V , every linearly independent subset of W is finite and is part of a (finite)
basis for W .
Proof. Suppose S0 is a linearly independent subset of W . If S is a linearly
independent subset of W containing S0 , then S is also a linearly independent
subset of V ; since V is finite-dimensional, S contains no more than dim V
elements.
We extend S0 to a basis for W , as follows. If S0 spans W , then S0 is a basis
for W and we are done. If S0 does not span W , we use the preceding lemma
to find a vector ⃗v1 in W such that the set S1 = S0 ∪ {⃗v1 } is independent. If
S1 spans W , we stop. If not, we apply the lemma again to obtain a vector ⃗v2
in W such that S2 = S1 ∪ {⃗v2 } is independent. We continue this process until
(in not more than dim V steps) we reach a set Sm = S0 ∪ {⃗v1 , ⃗v2 , . . . , ⃗vm }
which is a basis for W .
Corollary 1.5.11. If W is a proper subspace of a finite-dimensional vector
space V , then W is finite-dimensional and dim W < dim V .
Proof. We may suppose W contains a vector ⃗a ̸= ⃗0. By Theorem 1.5.10 and its
proof, there is a basis of W containing ⃗a which contains no more than dim V
elements. Hence W is finite-dimensional, and dim W ≤ dim V . Since W is
a proper subspace, there is a vector ⃗b in V which is not in W . Adjoining
⃗b to any basis of W , we obtain a linearly independent subset of V . Thus
dim W < dim V .
Corollary 1.5.12. In a finite-dimensional vector space V , every non-empty
linearly independent set of vectors is part of a basis.
Theorem 1.5.13. If W1 and W2 are finite-dimensional subspaces of a vector
space V , then W1 + W2 is finite-dimensional and
dim W1 + dim W2 = dim(W1 ∩ W2 ) + dim(W1 + W2 ).
Proof. Let dim(W1 ∩ W2) = k, and let {⃗a1, . . . , ⃗ak} be a basis for it. Since W1 ∩ W2 is a subspace of W1, we can extend this basis to a basis of W1 by adding vectors ⃗b1, . . . , ⃗bm, so that dim W1 = k + m. Similarly, we can extend it to a basis of W2 by adding vectors ⃗c1, . . . , ⃗cn, so that dim W2 = k + n. Our aim is to prove that the set
B = {⃗a1, . . . , ⃗ak, ⃗b1, . . . , ⃗bm, ⃗c1, . . . , ⃗cn}
is a basis for W1 + W2. The size of B is k + m + n, which is the required number, since dim W1 + dim W2 = (k + m) + (k + n) = dim(W1 ∩ W2) + (k + m + n). To prove B is a basis, we should prove that it is linearly independent and spans W1 + W2. The details of the proof are left to the students as an exercise.
Problems
1. Show that any part of a linearly independent list is linearly indepen-
dent.
2. If w is a list of vectors in V and if some part of w spans V, show that
w spans V.
3. If the vector v is not in the subspace S, but is in the subspace spanned
by S and the vector w; show that w is in the subspace spanned by S
and v.
4. Prove that all lines through the origin and planes through the origin in
R3 are subspaces.
5. Let U, V, W be three subspaces of a vector space E. Show that if
U ⊆ V , then U + (V ∩ W ) = (U + V ) ∩ (U + W ). Is this true if U is
not a subspace of V ?
6. Let U, V, W be three subspaces of a vector space E. Show that if
V ⊆ U , then U ∩ (V + W ) = (U ∩ V ) + (U ∩ W ). Is this true if V is
not a subspace of U ?
7. If S and T are distinct two-dimensional subspaces of a three-dimensional
space, show that their intersection S ∩ T has dimension 1. What does
this mean geometrically?
8. Suppose that V is finite-dimensional and U is a subspace of V such
that dim(U ) = dim(V ). Prove that U = V .
9. Suppose U and W are subspaces of R8 such that dim(U ) = 3, dim(W ) =
5, and U + W = R8 . Prove that U ∩ W = {0}.
10. Suppose that U and W are both five-dimensional subspaces of R9.
Prove that U ∩ W ̸= {0}.
Chapter 2
Linear maps
2.1 Basic notions
Definition 2.1.1. Let V and W be vector spaces over the field F. A linear map (or a linear transformation) from V into W is a function T from V into W such that for all ⃗v, w⃗ ∈ V and c ∈ F we have

1. T(⃗v) + T(w⃗) = T(⃗v + w⃗);

2. T(c⃗v) = cT(⃗v).
Lemma 2.1.2. Let T be a function from V into W . Then T is a linear map
if and only if:
T(c⃗v + w⃗) = cT(⃗v) + T(w⃗)
for all ⃗v and w⃗ in V and all scalars c in F.
Proof. Exercise.
Remark 2.1.3. A linear map f : V → V (an endomorphism) is also often
called an operator, whereas linear maps f : V → F with codomain F (a
vector space over itself) are called linear functionals or linear forms.
Example 13. If V is any vector space, the identity map I, defined by I(⃗a) =
⃗a, is a linear map from V into V . The zero map 0, defined by 0(⃗a) = ⃗0, is a
linear map from V into V .
Example 14. Let F be a field and let V be the space of polynomial functions
f from F into F , given by
f (x) = c0 + c1 x + . . . + ck xk .
Let
(Df)(x) = c1 + 2c2x + . . . + kck x^{k−1}.
Then D is a linear map from V into V, the differentiation map.
Exercise 7. Determine which of the following maps f are linear:
1. f : R3 → R2 defined by f (x, y, z) = (x, z).
2. f : R2 → R2 defined by f (x, y) = (2x + y, y).
3. f : R2 → R2 defined by f (x, y) = (2, y − x).
4. f : R2 → R2 defined by f (x, y) = (y, x).
5. f : R2 → R defined by f (x, y) = xy.
Exercise 8. Let f : V → V ′ be a linear map. Show that:
1. f (0) = 0;
2. f (−v) = −f (v) for all v ∈ V .
2.2 Linear maps and Matrices
Let A be a fixed m × n matrix with entries in the field F . The function TA
defined by TA (⃗x) = A⃗x is a linear map from F n into F m . So every m × n
matrix induces a linear map from F n into F m . Conversely, every linear map
from F n into F m induces an m × n matrix. To see this, let T be a linear map
from F n into F m . Hence, for any vectors ⃗x1 , ⃗x2 ∈ F n and any scalar c ∈ F ,
we have T (⃗x1 + ⃗x2 ) = T (⃗x1 ) + T (⃗x2 ) and T (c⃗x) = cT (⃗x). Now, consider any
basis vectors ⃗b1 , ⃗b2 , . . . , ⃗bn of F n . Using the linearity of T , we can express the
image of each basis vector as T (⃗bi ) = ⃗vi , where ⃗vi is a vector in F m . We can
then construct an m × n matrix A such that the ith column of A is given by
the vector ⃗vi . In other words,
A = [T (⃗b1 ), T (⃗bi ), ..., T (⃗bn )] (2.2.1)
It can be shown that this matrix A represents the linear map T with
respect to the bases of F n and F m . In other words, for any vector ⃗x ∈ F n ,
the matrix-vector multiplication A⃗x gives the same result as applying the
linear map T to ⃗x. Therefore, every linear map from F n into F m can be
represented by an m × n matrix.
Theorem 2.2.1. Every m × n matrix A whose entries are in F , induces a
linear map TA (⃗v ) = A⃗v from F n into F m . Conversely, every linear map from
F n into F m can be represented by a unique m × n matrix A whose entries
are in F .
Proof. The proof follows from the discussion above. Complete the details as an
exercise.
Remark 2.2.2. Given an m × n matrix A, the linear map corresponding
to A is simply defined by TA(⃗v) = A⃗v. Also, if we have a linear map, then
by (2.2.1) we have a matrix. Therefore in linear algebra, linear maps and
matrices are considered the same object and can be used interchangeably.
Note, however, that the matrix associated to a linear map depends on the
chosen basis.
Example 15. Let f : R4 → R2 be the linear map given by f (x1 , x2 , x3 , x4 ) =
(2x2 , x1 ), where the input vectors are in R4 and the output vectors are in R2 .
Find the matrix associated with this linear map with respect to the standard
basis.
Solution. To find the matrix associated with this linear map relative to the
bases of unit vectors, we need to determine the images of the standard basis
vectors in R4 under f .
Consider the standard basis vectors ⃗e1 , ⃗e2 , ⃗e3 , and ⃗e4 in R4 . We will find
the images of these vectors under f .
C1 = f (⃗e1 ) = f (1, 0, 0, 0) = (0, 1).
C2 = f (⃗e2 ) = f (0, 1, 0, 0) = (2, 0).
C3 = f (⃗e3 ) = f (0, 0, 1, 0) = (0, 0).
C4 = f (⃗e4 ) = f (0, 0, 0, 1) = (0, 0).
So we have:
[T] = [C1 C2 C3 C4] = \begin{pmatrix} 0 & 2 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}.
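The construction in (2.2.1), building the matrix column by column from the images of the standard basis vectors, is easy to carry out in code. A sketch assuming NumPy:

```python
import numpy as np

def f(x):
    # the map f(x1, x2, x3, x4) = (2*x2, x1)
    x1, x2, x3, x4 = x
    return np.array([2 * x2, x1])

# the columns of [T] are the images of the standard basis vectors of R^4
T = np.column_stack([f(e) for e in np.eye(4)])
print(T)
# [[0. 2. 0. 0.]
#  [1. 0. 0. 0.]]
```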
Example 16. Let f : R2 → R2 be the linear map given by f (x, y) = (2x +
3y, 4x − 5y).
1. Find the matrix associated to this linear map with respect to the stan-
dard basis.
2. Find the matrix associated to this linear map with respect to the basis
{(1, 1), (0, 2)}.
Solution. Let E be the standard basis:
f (⃗e1 ) = (2(1) + 3(0), 4(1) − 5(0)) = (2, 4)
f (⃗e2 ) = (2(0) + 3(1), 4(0) − 5(1)) = (3, −5)
Now, we can construct the matrix associated with f using these image
vectors:
[f]E = \begin{pmatrix} 2 & 3 \\ 4 & −5 \end{pmatrix}
Now let B be the basis {(1, 1), (0, 2)}.
f (⃗b1 ) = (2(1) + 3(1), 4(1) − 5(1)) = (5, −1)
f (⃗b2 ) = (2(0) + 3(2), 4(0) − 5(2)) = (6, −10)
Now, we can construct the matrix associated with f using these image
vectors:
[f]B = \begin{pmatrix} 5 & 6 \\ −1 & −10 \end{pmatrix}
Example 17. Let A = \begin{pmatrix} 2 & 3 \\ −1 & 0 \end{pmatrix}. Determine the linear map associated to A.
Solution. We have TA(⃗v) = A⃗v. So if ⃗v = (x, y), we can write:
TA(⃗v) = \begin{pmatrix} 2 & 3 \\ −1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2x + 3y \\ −x \end{pmatrix}
Therefore, TA(⃗v) = (2x + 3y, −x).
Remark 2.2.3. If in the last example we reverse the process and find the matrix
associated to TA with respect to the standard basis, then we recover A.
Exercise 9. For each case, find the vector TA(⃗v):

(a) A = \begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix}, ⃗v = \begin{pmatrix} 3 \\ −1 \end{pmatrix}

(b) A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, ⃗v = \begin{pmatrix} 5 \\ 1 \end{pmatrix}

(c) A = \begin{pmatrix} 1 & 1 & −1 \\ 0 & 1 & −1 \\ 2 & 2 & 0 \end{pmatrix}, ⃗v = \begin{pmatrix} 2 \\ 1 \\ −1 \end{pmatrix}
2.3 Coordinates
One of the useful features of a basis B in an n-dimensional space V is that it
essentially enables one to introduce coordinates in V analogous to the natural
coordinates xi of a vector ⃗x = (x1 , ..., xn ) in the space Rn . In this scheme,
the coordinates of a vector ⃗x in V relative to the basis B will be the scalars
which serve to express ⃗x as a linear combination of the vectors in the basis.
Definition 2.3.1. If V is a finite-dimensional vector space, an ordered basis
for V is a finite sequence of vectors which is linearly independent and spans
V . We denote it by B = {⃗v1 , ⃗v2 , ..., ⃗vn }.
Definition 2.3.2. Let B = {⃗v1, ⃗v2, . . . , ⃗vn} be an ordered basis for V. Given ⃗a in V, there is a unique n-tuple (α1, . . . , αn) of scalars such that
⃗a = \sum_{i=1}^{n} αi⃗vi.
We shall call αi the ith coordinate of ⃗a relative to the ordered basis B.
Remark 2.3.3. The coordinates of a vector ⃗a in vector space V are entirely
dependent on the choice of basis B. Therefore, when referring to the coordi-
nates of a vector, it is essential to specify the basis being used.
To indicate the dependence of this coordinate matrix on the basis, we
shall use the symbol [⃗v ]B for the coordinate matrix of the vector ⃗v relative
to the ordered basis B.
Example 18. Let F be a field and let ⃗a = (a1 , a2 , . . . , an ) be a vector in F n .
If B is the standard ordered basis of F n , given by B = {⃗e1 , . . . , ⃗en }, then the
coordinate matrix of the vector ⃗a in the basis B is given by
[⃗a]B = \begin{pmatrix} a1 \\ a2 \\ \vdots \\ an \end{pmatrix}.
Example 19. Let ⃗v = \begin{pmatrix} 4 \\ 2 \end{pmatrix}. Find the coordinates of ⃗v in

1. the standard basis E = \left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\};

2. the basis B = \left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ −1 \end{pmatrix} \right\}.

Solution. It is clear that ⃗v = 4\begin{pmatrix} 1 \\ 0 \end{pmatrix} + 2\begin{pmatrix} 0 \\ 1 \end{pmatrix}. So the coordinates of ⃗v in the standard basis are [⃗v]E = (4, 2). To find the coordinates of ⃗v in B, we write:
\begin{pmatrix} 4 \\ 2 \end{pmatrix} = a\begin{pmatrix} 1 \\ 1 \end{pmatrix} + b\begin{pmatrix} 2 \\ −1 \end{pmatrix}.
Hence,
4 = a + 2b,
2 = a − b.
Therefore, a = 8/3 and b = 2/3, and we can write [⃗v]B = (8/3, 2/3).
Theorem 2.3.4. Let V be a finite-dimensional vector space over the field F, and let B = {⃗b1, . . . , ⃗bn} be an ordered basis and E = {⃗e1, . . . , ⃗en} be the standard basis for V. If P = [P1 . . . Pn] is the n × n matrix whose jth column is Pj = [⃗bj]E, then
P[⃗v]B = [⃗v]E.
Proof. Exercise.
Remark 2.3.5. The matrix P in the preceding theorem is called the change
of basis matrix.
Example 20. Give an alternative solution for Example 19 (part 2), using the change of basis matrix.

Solution. According to the theorem, the columns of the change of basis matrix are \begin{pmatrix} 1 \\ 1 \end{pmatrix} and \begin{pmatrix} 2 \\ −1 \end{pmatrix}, and we have P[⃗v]B = [⃗v]E. Hence,
\begin{pmatrix} 1 & 2 \\ 1 & −1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 4 \\ 2 \end{pmatrix}.
The result follows by solving this equation.
Example 21. Let B = {⃗b1, ⃗b2, ⃗b3} be the ordered basis for R3 consisting of
⃗b1 = \begin{pmatrix} 1 \\ 0 \\ −1 \end{pmatrix}, ⃗b2 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, and ⃗b3 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}. What are the coordinates of ⃗v = \begin{pmatrix} 2 \\ −1 \\ 0 \end{pmatrix} in the ordered basis B?

Solution. The matrix P is formed by taking the basis vectors of B as its columns:
P = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ −1 & 1 & 0 \end{pmatrix}.
Now we have P[⃗v]B = [⃗v]E. Substituting the given values, we have:
\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ −1 & 1 & 0 \end{pmatrix} \begin{pmatrix} x1 \\ x2 \\ x3 \end{pmatrix} = \begin{pmatrix} 2 \\ −1 \\ 0 \end{pmatrix}.
Solving this system of equations, we find x1 = −1, x2 = −1, and x3 = 4.
Therefore, the coordinates of ⃗v in the ordered basis B are (−1, −1, 4).
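Since finding coordinates amounts to solving P[⃗v]B = [⃗v]E, a linear solver reproduces this result directly (a sketch assuming NumPy):

```python
import numpy as np

P = np.array([[1, 1, 1],
              [0, 1, 0],
              [-1, 1, 0]], dtype=float)   # columns are b1, b2, b3
v = np.array([2, -1, 0], dtype=float)     # [v]_E

print(np.linalg.solve(P, v))              # [-1. -1.  4.] = [v]_B
```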
2.4 Nullity and Rank of a linear map
Definition 2.4.1. Let V and W be vector spaces over the field F and let
T : V → W be a linear map. The kernel of T is the set of all vectors ⃗v in V
such that T⃗v = ⃗0. That is:
ker T = {⃗v ∈ V : T⃗v = ⃗0}.
Also, the range of T is defined as follows:
T(V) = {T⃗v : ⃗v ∈ V}.
Definition 2.4.2. If V is finite-dimensional, the rank of a linear map T :
V → W is the dimension of the range of T . That is, rank(T ) = dim T (V ).
Lemma 2.4.3. Let V and W be vector spaces over the field F and let T :
V → W be a linear map. Then ker T and T (V ) are subspaces of V and W ,
respectively.
Proof. Let ⃗v1 and ⃗v2 be vectors in ker T , so that T⃗v1 = T⃗v2 = ⃗0. For every
scalar α we have:
T (α⃗v1 + ⃗v2 ) = αT⃗v1 + T (⃗v2 ) = ⃗0.
Therefore, α⃗v1 + ⃗v2 ∈ ker T . This implies that ker T is a subspace of V .
Next assume that w⃗1 and w⃗2 are vectors in T(V), so there exist elements ⃗a and ⃗b in V such that T⃗a = w⃗1 and T⃗b = w⃗2. Now we can write:
αw⃗1 + w⃗2 = αT⃗a + T⃗b = T(α⃗a + ⃗b) ∈ T(V).
Therefore, T(V) is a subspace of W.
Definition 2.4.4. The dimension of ker T is called the nullity of T and
denoted by nullity(T ). Also, the dimension of T (V ) is called the rank of T
and denoted by rank(T ). The kernel of T is often called the null space of T .
Theorem 2.4.5. Let V and W be vector spaces over the field F and let T
be a linear map from V into W . Suppose that V is finite-dimensional. Then
rank(T ) + nullity(T ) = dim V.
Proof. Let {⃗a1, . . . , ⃗ak} be a basis for ker T. We can extend this basis to a basis for V; that is, we can find vectors ⃗ak+1, . . . , ⃗an in V such that {⃗a1, . . . , ⃗an} is a basis for V.
We shall now prove that {T⃗ak+1, . . . , T⃗an} is a basis for the range of T. To this end, we need to show that it spans T(V) and is linearly independent.

{T⃗ak+1, . . . , T⃗an} spans T(V):
Let w⃗ ∈ T(V). Then w⃗ = T⃗v for some ⃗v ∈ V. We can write ⃗v as a linear combination of the basis {⃗a1, . . . , ⃗an}:
⃗v = r1⃗a1 + . . . + rn⃗an.
w⃗ = T(⃗v) = T(r1⃗a1 + . . . + rn⃗an) = r1T(⃗a1) + . . . + rnT(⃗an).
Notice that ⃗a1, . . . , ⃗ak belong to ker T, so that T⃗ai = ⃗0 for 1 ≤ i ≤ k. Hence,
w⃗ = rk+1T(⃗ak+1) + . . . + rnT(⃗an).
This implies that {T⃗ak+1, . . . , T⃗an} spans T(V).

{T⃗ak+1, . . . , T⃗an} is linearly independent:
Let sk+1T⃗ak+1 + . . . + snT⃗an = ⃗0 for scalars sk+1, . . . , sn. By linearity,
T(sk+1⃗ak+1 + . . . + sn⃗an) = ⃗0.
This implies that sk+1⃗ak+1 + . . . + sn⃗an lies in the null space of T. We know that {⃗a1, . . . , ⃗ak} is a basis for the null space of T. Therefore, sk+1⃗ak+1 + . . . + sn⃗an can be written as a linear combination of the ⃗ai (i = 1, . . . , k):
sk+1⃗ak+1 + . . . + sn⃗an = r1⃗a1 + . . . + rk⃗ak.
Hence
r1⃗a1 + . . . + rk⃗ak + (−sk+1)⃗ak+1 + . . . + (−sn)⃗an = ⃗0.
Since the ⃗ai are linearly independent, all coefficients must be zero. In particular, si = 0 for i = k + 1, . . . , n.

Thus {T⃗ak+1, . . . , T⃗an} is a basis for T(V), so rank(T) = n − k = dim V − nullity(T), which proves the theorem.
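For matrices, the theorem can be checked numerically: the nullity of an m × n matrix is n minus its rank. A sketch assuming NumPy, on a 3 × 3 matrix chosen to have a one-dimensional kernel:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [2, 4, 6],    # twice the first row, so A is rank-deficient
              [1, 0, 1]], dtype=float)

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank   # rank-nullity: nullity = dim V - rank
print(rank, nullity)          # 2 1, and indeed 2 + 1 = 3 = dim R^3
```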
Definition 2.4.6. The function T from V into W is called invertible if there
exists a function U from W into V such that U T is the identity function on
V and T U is the identity function on W . If T is invertible, the function U
is unique and is denoted by T −1 .
Theorem 2.4.7. Let V and W be finite-dimensional vector spaces over the
field F such that dim V = dim W . If T is a linear map from V into W , the
following are equivalent:
(i) T is invertible.
(ii) T is 1-1.
(iii) T is onto, i.e., the range of T is W .
Proof. Let n = dim V = dim W. From Theorem 2.4.5 we know that rank(T) + nullity(T) = n. Now T is 1-1 if and only if nullity(T) = 0, and (since n = dim W) the range of T is W if and only if rank(T) = n. Since the rank plus the nullity is n, the nullity is 0 precisely when the rank is n. Therefore, T is 1-1 if and only if T is onto, so if either condition (ii) or (iii) holds, the other is satisfied as well, and T (being a bijection) is invertible. Conversely, if T is invertible, it is clearly both 1-1 and onto.
Definition 2.4.8. If V and W are vector spaces over the field F , any one-
to-one linear map T of V onto W is called an isomorphism of V onto W . If
there exists an isomorphism of V onto W , we say that V is isomorphic to
W.
Theorem 2.4.9. Every n-dimensional vector space over the field F is iso-
morphic to the space F n .
Proof. Let V be an n-dimensional space over the field F and let B = {⃗a1, . . . , ⃗an} be an ordered basis for V. We define a function T from V into F n as follows: if ⃗a is in V, let T⃗a be the n-tuple (x1, . . . , xn) of coordinates of ⃗a relative to the ordered basis B, i.e., the n-tuple such that
⃗a = x1⃗a1 + . . . + xn⃗an.
One checks directly that T is linear, one-to-one, and maps V onto F n.
Remark 2.4.10. The importance of this theorem is that every n-dimensional
vector space over F can be viewed as F n. So the elements of V can be written
as n-tuples, whose coordinates can be obtained by using an ordered basis.
Proposition 2.4.11. Let T : V → W be a linear map between vector spaces
V and W . Then T is one-to-one if and only if the null space of T is {⃗0}.
Proof. Suppose first that T is one-to-one, and let ⃗v ∈ ker T, so that T⃗v = ⃗0 = T⃗0. Since T is one-to-one, we must have ⃗v = ⃗0. Conversely, assume that ker T = {⃗0}, and let ⃗v1 and ⃗v2 be vectors in V such that T(⃗v1) = T(⃗v2). This implies that T(⃗v1 − ⃗v2) = ⃗0. Hence, ⃗v1 − ⃗v2 ∈ ker T = {⃗0}. That is, ⃗v1 = ⃗v2 and T is one-to-one.
Definition 2.4.12. A linear map T is called singular if it is not 1-1; equivalently, if ker T ̸= {⃗0}. Otherwise, we call T non-singular.
2.4.1 Nullity and Rank of matrices
Definition 2.4.13. If A is an m × n matrix over the field F , the row space
of A is the subspace of F n spanned by the row vectors of A. The row rank
of A is the dimension of the row space of A.
Remark 2.4.14. If A is a matrix corresponding to a linear operator T , then
the invertibility of T is equivalent to the invertibility of A. In other words,
T is invertible if and only if det A ̸= 0.
Moreover, we can easily determine the rank and nullity of T using the
matrix A. By transforming A into its row echelon form, we can count the
number of zero rows, which corresponds to the nullity of T , and the number
of non-zero rows, which corresponds to the rank of T . The sum of the rank
and nullity will give us the total number of rows in A.
Example 22. Consider the following linear operator T on R3:
T(x, y, z) = (2x + 3y − z, x + 2y + z, 3x + 4y).
To find the rank and nullity of T, we can represent T using its corresponding matrix A. Let's write down the matrix A and then determine its row echelon form to analyze its rank and nullity.
A = \begin{pmatrix} 2 & 3 & −1 \\ 1 & 2 & 1 \\ 3 & 4 & 0 \end{pmatrix}
By performing row reduction (swap the first two rows, clear the first column, then clear the second), we obtain the row echelon form of A as follows:
\begin{pmatrix} 1 & 2 & 1 \\ 0 & −1 & −3 \\ 0 & 0 & 3 \end{pmatrix}
From the row echelon form, we can see that there are three non-zero rows, so the rank of A (and thus the rank of T) is 3. There are no zero rows, so the nullity of A (and thus the nullity of T) is 0.
Hence, the rank-nullity theorem checks out: 3 + 0 = 3, which is the number of rows of A.
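The same conclusion follows from a one-line rank computation (a sketch assuming NumPy):

```python
import numpy as np

A = np.array([[2, 3, -1],
              [1, 2, 1],
              [3, 4, 0]], dtype=float)

print(np.linalg.matrix_rank(A))   # 3, so the nullity is 3 - 3 = 0
print(np.linalg.det(A))           # 3.0 (non-zero), so T is invertible
```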
Example 23. Let F be a field and let T be the linear operator on F 2 defined
by
T (x1 , x2 ) = (x2 , x1 − x2 ).
Find T −1 .
Solution. We can see that T^{−1}(x2, x1 − x2) = (x1, x2). Let y = x2 and z = x1 − x2. Then x1 = y + z and x2 = y. So the explicit formula for T^{−1} is:
T^{−1}(y, z) = (y + z, y).
Alternative solution. We have [T] = \begin{pmatrix} 0 & 1 \\ 1 & −1 \end{pmatrix}.
The inverse of this matrix is [T]^{−1} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}. Hence,
T^{−1} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + y \\ x \end{pmatrix}.
Remark 2.4.15. To verify our solution, we can check that (T ◦ T^{−1})\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix}.
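This verification is immediate numerically (a sketch assuming NumPy):

```python
import numpy as np

T = np.array([[0, 1],
              [1, -1]], dtype=float)
T_inv = np.linalg.inv(T)

print(T_inv)                               # [[1. 1.], [1. 0.]]
assert np.allclose(T @ T_inv, np.eye(2))   # T composed with T^{-1} is the identity
```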
Exercise 10. Let F be a field and let T be the linear operator on F 2 defined
by
T (x1 , x2 ) = (x1 + x2 , x1 − x2 )
Find the inverse of T .
Exercise 11. Let
A = \begin{pmatrix} 1 & 1 & −1 \\ 1 & −2 & 1 \\ 2 & −1 & 0 \end{pmatrix}.
Find the nullity and rank of A.
2.5 Linear Forms and Dual Spaces
Definition 2.5.1. Let V be a vector space over a field F . A linear map
f : V → F is called a linear form (or a functional).
Lemma 2.5.2. Let V be a vector space over a field F . The set of all linear
forms is a vector space over F .
Definition 2.5.3. The vector space of all linear forms on V is called the dual
of V and denoted by V ∗. In fact,
V ∗ = HomF(V, F) = {f : V → F | f is linear}.
Theorem 2.5.4. Any finite-dimensional vector space V has the same di-
mension as its dual space V ∗ .
Proof. Let V be a vector space with basis B = {v1, v2, . . . , vn}. We construct the set B′ = {f1, f2, . . . , fn} as follows: let fi be the unique linear form with
fi(vj) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i ≠ j. \end{cases}
To complete the proof, we need to show that the fi are linearly independent and that B′ spans V ∗.
If a1f1 + . . . + anfn = 0, then for every vi we have a1f1(vi) + . . . + anfn(vi) = 0. So for every i, we get 0 + 0 + . . . + ai + . . . + 0 = 0. Therefore, ai = 0, and the fi are linearly independent.
Next we show that the fi span V ∗. Let g ∈ V ∗ and set ci = g(vi) for each i. Now if v ∈ V, then we have
v = d1v1 + d2v2 + . . . + dnvn.
Notice that fi(v) = d1fi(v1) + . . . + dnfi(vn) = di. Hence,
g(v) = d1g(v1) + . . . + dng(vn) = d1c1 + . . . + dncn = c1f1(v) + . . . + cnfn(v).
Since v was chosen arbitrarily, we conclude that g = c1f1 + . . . + cnfn, as required.
The basis B′ of V ∗ constructed above is called the dual basis of B. We can compute it explicitly; here is an example:
Example 24. Consider the basis ⃗v1 = (2, 1) and ⃗v2 = (3, 1) of R2. We want to find the dual basis.

Solution. Let f⃗1 = (a, b) and f⃗2 = (c, d) be the dual basis vectors, written as row vectors. We want f⃗1⃗v1 = 1, f⃗1⃗v2 = 0, f⃗2⃗v1 = 0, and f⃗2⃗v2 = 1.
f⃗1⃗v1 = (a, b) · (2, 1) = 2a + b
f⃗1⃗v2 = (a, b) · (3, 1) = 3a + b
So we get the following system of equations:
2a + b = 1
3a + b = 0
Solving this system of equations, we find a = −1 and b = 3. Thus, f⃗1 = (−1, 3).
Using the conditions f⃗2(⃗v1) = 0 and f⃗2(⃗v2) = 1, we get the following system of equations:
2c + d = 0
3c + d = 1
Solving this system of equations, we find c = 1 and d = −2. Thus, f⃗2 = (1, −2).
Therefore, the dual basis of the basis {(2, 1), (3, 1)} is {(−1, 3), (1, −2)}.
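In matrix terms, if P is the matrix whose columns are the basis vectors, the dual basis forms are exactly the rows of P^{−1}: the condition fi(vj) = δij says FP = I, where F is the matrix whose rows are the fi. A sketch assuming NumPy:

```python
import numpy as np

P = np.array([[2, 3],
              [1, 1]], dtype=float)    # columns are v1 = (2,1), v2 = (3,1)

F = np.linalg.inv(P)                   # rows of F are the dual basis forms
print(F)                               # [[-1.  3.], [ 1. -2.]]
assert np.allclose(F @ P, np.eye(2))   # f_i(v_j) = delta_ij
```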
Problems
1. Let c be a real number, and let f : R3 → R3 be the linear map such
that f (⃗v ) = c⃗v , where ⃗v ∈ R3 . Find the matrix associated with this
linear map.
2. Let f : V → V0 be a linear map, and S, T be subspaces of V . Show
that:
(a) f (S ∩ T ) ⊆ f (S) ∩ f (T ).
(b) f (S + T ) = f (S) + f (T ).
(c) f −1 (S ∩ T ) = f −1 (S) ∩ f −1 (T ).
(d) f −1 (S) + f −1 (T ) ⊆ f −1 (S + T ).
Recall that f −1 (X) denotes the inverse image of X.
3. In any vector space, prove:
(a) A list of just one vector is linearly independent if and only if the
vector is non-zero.
(b) A list of two vectors is linearly dependent if and only if each is a
scalar multiple of the other.
4. Show that any two vectors (x1 , x2 ) and (h1 , h2 ) are linearly independent
in F 2 if and only if x1 h2 − x2 h1 ̸= 0.
5. Let v, w be vectors of a vector space V over F , and assume that v ̸= 0.
If v, w are linearly dependent, show that there is a scalar a ∈ F such
that w = av.
6. If ku + lv + mw = 0, where ku ̸= 0, show that the vectors u and v span
the same subspace as do v and w.
7. If w is a linearly independent list of n vectors of V , show that a vector
v is a linear combination of the list w if and only if the list v; w1 ; . . . ; wn
is linearly dependent.
8. If u, v, and w are three linearly independent vectors in a vector space
over Q, prove that v +w, w +u, and u+v are also linearly independent.
Is this true over every field?
9. If f : V → W is a linear map and v is a list of n vectors in V , prove
the following properties of the composite list f ◦ v = (f (v1 ), . . . , f (vn )):
(a) If v spans V , then f ◦ v spans Im(f ).
(b) If f ◦ v is linearly independent in W , then v is linearly independent
in V .
(c) If v is linearly independent in V and f is an injection, then f ◦ v is
independent in W .
Chapter 3
Eigenvectors and eigenvalues
3.1 Eigenvectors and eigenvalues
Definition 3.1.1. Let A be an n × n matrix with entries from the field
F . A non-zero vector ⃗v in F n is called an eigenvector of A if there exists a
scalar λ in F such that A⃗v = λ⃗v . The scalar λ is known as the eigenvalue
corresponding to the eigenvector ⃗v .
Remark 3.1.2. Every eigenvector has a unique eigenvalue. In fact if λ1 and
λ2 are eigenvalues of ⃗v, then we have λ1⃗v = λ2⃗v. Hence, (λ1 − λ2)⃗v = ⃗0.
Since ⃗v ̸= ⃗0, we conclude that λ1 = λ2.
Remark 3.1.3. If T is an operator on a vector space V (a linear map of
V into itself), then the eigenvectors and eigenvalues of T are defined to be
those of its corresponding matrix.
Definition 3.1.4. The set Eλ of all eigenvectors corresponding to λ, together with the zero vector, is called the eigenspace of A belonging to λ. In other words:
Eλ = {v ∈ V : Av = λv}.
Theorem 3.1.5. Let V be a vector space and let A : V → V be a linear map
and λ ∈ F be an eigenvalue. Then the set Eλ of all eigenvectors corresponding
to λ forms a vector space.
Proof. Let v1 , v2 ∈ Eλ . So we have Av1 = λv1 and Av2 = λv2 . Then,
A(v1 + v2 ) = Av1 + Av2 = λv1 + λv2 = λ(v1 + v2 ).
Also if c ∈ F, then A(cv1 ) = cAv1 = cλv1 = λ(cv1 ). This proves our
theorem.
Theorem 3.1.6. Let V be a vector space and let A : V → V be a linear
map. Let v1 , v2 , . . . , vm be eigenvectors of A with eigenvalues λ1 , λ2 , . . . , λm
respectively. Assume that these eigenvalues are distinct, i.e., λi ̸= λj if i ̸= j.
Then v1 , v2 , . . . , vm are linearly independent.
Proof. By induction on m. For m = 1, an element v1 ∈ V, v1 ̸= 0 is linearly independent. Assume m > 1. Suppose that we have a relation
c1v1 + c2v2 + . . . + cmvm = 0    (3.1.1)
with scalars ci. We must prove that all ci = 0. We multiply our relation by λ1 to obtain
c1λ1v1 + c2λ1v2 + . . . + cmλ1vm = 0.    (3.1.2)
We also apply A to (3.1.1). Using the linearity of A and the fact that Avi = λivi for each i, we obtain
c1λ1v1 + c2λ2v2 + . . . + cmλmvm = 0.    (3.1.3)
We now subtract (3.1.2) from (3.1.3) and obtain
c2(λ2 − λ1)v2 + . . . + cm(λm − λ1)vm = 0.
Since λj − λ1 ̸= 0 for j = 2, . . . , m, we conclude by induction that c2 = c3 = . . . = cm = 0. Going back to our original relation, we see that c1v1 = 0, whence c1 = 0, and our theorem is proved.
Corollary 3.1.7. Suppose V is a vector space of dimension n and A :
V → V is a linear map having n eigenvectors v1 , v2 , . . . , vn whose eigen-
values λ1 , λ2 , . . . , λn are distinct. Then {v1 , v2 , . . . , vn } is a basis of V .
3.2 Characteristic polynomial
We shall now see how we can use determinants to find the eigenvalues of a
matrix.
Theorem 3.2.1. Let V be a finite-dimensional vector space, and let λ be a
scalar. Let A : V → V be a linear map. Then λ is an eigenvalue of A if and
only if A − λI is not invertible.
Proof. Assume that λ is an eigenvalue of A. Then there exists a non-zero element v ∈ V such that Av = λv. Hence, (A − λI)v = 0, so v lies in the null space of A − λI. Therefore, A − λI cannot be invertible.
Conversely, assume that (A − λI) is not invertible, which implies that
(A − λI) must have a non-zero null space. This means there exists a non-
zero element v ∈ V such that (A − λI)v = 0. Hence, Av = λv, and λ is an
eigenvalue of A.
This proves the theorem.
Definition 3.2.2. Let A be an n × n matrix, where A = (aij). We define the characteristic polynomial PA(t) to be the determinant of A − tI, or written in full:
PA(t) = det(A − tI) = \begin{vmatrix} a_{11} − t & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} − t & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} − t \end{vmatrix}
We can also view A as a linear map from Fn to Fn , where F is the
underlying field, and we also say that PA (t) is the characteristic polynomial
of this linear map.
Theorem 3.2.3. Let A be an n × n matrix. A scalar λ is an eigenvalue of
A if and only if λ is a root of the characteristic polynomial of A.
Proof. Assume that λ is an eigenvalue of A. Then A − λI is not invertible
by Theorem 3.2.1, hence det(A − λI) = 0. Consequently, λ is a root of the
characteristic polynomial.
Conversely, if λ is a root of the characteristic polynomial, then det(A −
λI) = 0, and hence we conclude that A − λI is not invertible. Therefore, λ
is an eigenvalue of A by Theorem 3.2.1. This proves the theorem.
Remark 3.2.4. The preceding theorem gives us an explicit way of determin-
ing the eigenvalues of a matrix, provided that we can determine explicitly
the roots of its characteristic polynomial. This is sometimes easy, especially
for small values of n.
Corollary 3.2.5. Let A be a square triangular matrix. Then the set of
eigenvalues of A is just the set of the entries on the main diagonal of A.
Proof. We compute the determinant det(A − λI) to get a polynomial expression in λ. Since A is a triangular matrix, subtracting λI from A only affects the diagonal entries, and A − λI is again triangular. The determinant of a triangular matrix is the product of its diagonal entries, so det(A − λI) simplifies to the polynomial

(a11 − λ)(a22 − λ) . . . (ann − λ),

whose roots are exactly the diagonal entries a11, . . . , ann.
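This corollary is easy to sanity-check numerically. The following sketch, written in Python with the numpy library (the triangular matrix here is an arbitrary choice of ours, purely for illustration), confirms that the computed eigenvalues agree with the diagonal entries:

    import numpy as np

    # An upper triangular matrix; by Corollary 3.2.5 its eigenvalues
    # should be exactly its diagonal entries.
    A = np.array([[2.0, 7.0, 1.0],
                  [0.0, 5.0, 3.0],
                  [0.0, 0.0, 4.0]])

    print(sorted(np.linalg.eigvals(A).real))  # [2.0, 4.0, 5.0]
    print(sorted(np.diag(A)))                 # [2.0, 4.0, 5.0]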
Theorem 3.2.6. Let A and B be two n × n matrices, and assume that B is
invertible. Then the characteristic polynomial of A is equal to the character-
istic polynomial of B −1 AB.
Proof. By the multiplicativity of the determinant and det(B⁻¹) det(B) = 1,

det(A − tI) = det(B⁻¹(A − tI)B) = det(B⁻¹AB − tB⁻¹B) = det(B⁻¹AB − tI).

This proves the theorem.
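This invariance can also be tested numerically. In the numpy sketch below, B is an invertible matrix of our own choosing; numpy's poly function returns the coefficients of det(tI − A), which differs from det(A − tI) only by the sign factor (−1)ⁿ, so comparing coefficient lists is an equivalent test:

    import numpy as np

    A = np.array([[3.0, 2.0],
                  [1.0, 4.0]])
    B = np.array([[1.0, 1.0],
                  [0.0, 1.0]])   # invertible, since det(B) = 1

    # Coefficients of det(tI - A), leading term first.
    p_A = np.poly(A)
    p_similar = np.poly(np.linalg.inv(B) @ A @ B)
    print(np.allclose(p_A, p_similar))   # True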
Remark 3.2.7. Let T : V → V be a linear map of a finite-dimensional vector
space into itself, so T is an operator. Select a basis for V and let A = [T ]B
be the matrix associated with T with respect to this basis. We then define
the characteristic polynomial of T to be the characteristic polynomial of A.
If we change the basis, then A changes to B⁻¹AB, where B is invertible. By Theorem 3.2.6, this implies that the characteristic polynomial of T does not depend on the choice of basis.
Example 25. Consider the 2 × 2 matrix

A = | 3  2 |
    | 1  4 |
1. Calculate the characteristic polynomial PA (t) of matrix A.
2. Find the eigenvalues of matrix A.
Solution. 1. To calculate the characteristic polynomial PA(t), we evaluate the determinant of the matrix A − tI, where I is the identity matrix:

PA(t) = det(A − tI) = | 3 − t    2    |
                      |   1    4 − t  |

Expanding the determinant, we have:

PA(t) = (3 − t)(4 − t) − (2)(1) = t² − 7t + 10

2. To find the eigenvalues of matrix A, we set the characteristic polynomial PA(t) equal to zero and solve for t:

t² − 7t + 10 = 0

Factoring the quadratic equation, we have:

(t − 5)(t − 2) = 0

Therefore, the eigenvalues of matrix A are 5 and 2.
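As a quick cross-check of this hand computation, a short numpy sketch (illustrative only) recovers the same polynomial and eigenvalues:

    import numpy as np

    A = np.array([[3.0, 2.0],
                  [1.0, 4.0]])

    # Coefficients of det(tI - A): for n = 2 this equals det(A - tI),
    # namely t^2 - 7t + 10.
    print(np.poly(A))                     # approx. [ 1. -7. 10.]
    print(sorted(np.linalg.eigvals(A)))   # approx. [2.0, 5.0]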
Example 26. Consider the 3 × 3 matrix

A = | 1  2  3 |
    | 4  5  6 |
    | 7  8  9 |

Find the characteristic equation of A.
Solution. We have:

det(A − tI) = | 1 − t    2      3    |
              |   4    5 − t    6    |
              |   7      8    9 − t  |

Expanding along the first row, we get:

det(A − tI) = (1 − t)[(5 − t)(9 − t) − 48] − 2[4(9 − t) − 42] + 3[32 − 7(5 − t)]

Simplifying each bracket, this becomes:

(1 − t)(t² − 14t − 3) + 2(4t + 6) + 3(7t − 3)

Expanding and collecting terms, we obtain det(A − tI) = −t³ + 15t² + 18t, so the characteristic equation of A is:

t³ − 15t² − 18t = 0
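Because the arithmetic above is easy to get wrong, a numerical check is worthwhile. The numpy sketch below prints the coefficients of det(tI − A), leading term first; they should come out as approximately 1, −15, −18, 0, matching the characteristic equation just found:

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 9.0]])

    # Coefficients of det(tI - A): t^3 - 15t^2 - 18t + 0.
    # The constant term is (numerically close to) 0 because A is singular.
    print(np.poly(A))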
Now we give examples of eigenspaces. To find the eigenspaces of a matrix, we first find the eigenvalues, then we find the corresponding eigenvectors. Note that the dimension of the eigenspace corresponding to a specific eigenvalue is at most the multiplicity of that eigenvalue as a root of the characteristic polynomial; Example 30 below shows that the two numbers can differ.
Example 27. Find the eigenspaces of the following matrix.

A = | 1  0  2 |
    | 0  3  0 |
    | 2  0  1 |
Solution.

A − λI = | 1 − λ    0      2    |
         |   0    3 − λ    0    |
         |   2      0    1 − λ  |

Expanding along the first row:

det(A − λI) = (1 − λ)(3 − λ)(1 − λ) + 2(0 − 2(3 − λ)) = (3 − λ)[(1 − λ)² − 4] = −(3 − λ)²(λ + 1)

So the eigenvalues are

λ1 = −1, λ2 = λ3 = 3
To find the eigenvectors corresponding to each eigenvalue, we solve the
equation (A − λI)⃗v = 0 for each eigenvalue.
For λ1 = −1 we solve (A + I)⃗v = ⃗0, where

A + I = | 2  0  2 |
        | 0  4  0 |
        | 2  0  2 |

Writing ⃗v = (x, y, z)ᵀ, this gives the system:

2x + 2z = 0
4y = 0
2x + 2z = 0

The solution of this system is y = 0 and z = −x. Hence an eigenvector for λ1 = −1 is ⃗v1 = (1, 0, −1)ᵀ, and the eigenspace E−1 is the space spanned by (1, 0, −1)ᵀ.
For λ2 = λ3 = 3 we solve (A − 3I)⃗v = ⃗0, where

A − 3I = | −2  0   2 |
         |  0  0   0 |
         |  2  0  −2 |

This gives the system:

−2x + 2z = 0
2x − 2z = 0

This implies that x = z, while y is free. The general solution is (x, y, z)ᵀ = x(1, 0, 1)ᵀ + y(0, 1, 0)ᵀ, so we obtain the two linearly independent eigenvectors ⃗v2 = (0, 1, 0)ᵀ and ⃗v3 = (1, 0, 1)ᵀ. Hence the eigenspace E3 for this eigenvalue is the space spanned by (0, 1, 0)ᵀ and (1, 0, 1)ᵀ, and it is of dimension 2.
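These eigenspaces can be cross-checked with numpy (an illustrative sketch): the dimension of the eigenspace Eλ is the nullity of A − λI, computed here as 3 minus the rank:

    import numpy as np

    A = np.array([[1.0, 0.0, 2.0],
                  [0.0, 3.0, 0.0],
                  [2.0, 0.0, 1.0]])

    print(sorted(np.linalg.eigvals(A).real))             # [-1.0, 3.0, 3.0]

    # dim E_3 = 3 - rank(A - 3I) and dim E_{-1} = 3 - rank(A + I).
    print(3 - np.linalg.matrix_rank(A - 3 * np.eye(3)))  # 2
    print(3 - np.linalg.matrix_rank(A + np.eye(3)))      # 1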
3.3 Diagonalization
Definition 3.3.1. A square matrix A is said to be diagonalizable if and only if A is similar to a diagonal matrix, that is, if there exists an invertible matrix P such that P⁻¹AP is diagonal.
Theorem 3.3.2. Let V be a finite-dimensional vector space, and let T : V →
V be an operator. Then V has a basis that consists of eigenvectors of T if
and only if A = [T ] is diagonalizable. We say that the linear map T can be
diagonalized if there exists a basis of V consisting of eigenvectors.
Proof. Suppose {⃗v1, ⃗v2, . . . , ⃗vn} is a basis of V consisting of eigenvectors of A with corresponding eigenvalues λ1, λ2, . . . , λn, so that A⃗vi = λi⃗vi for each i. Now consider the matrix P whose columns are the eigenvectors ⃗v1, ⃗v2, . . . , ⃗vn. Since the eigenvectors form a basis (and hence are linearly independent), we conclude that P is invertible. Now we can write:
AP = A[⃗v1 ⃗v2 . . . ⃗vn] = [A⃗v1 A⃗v2 . . . A⃗vn] = [λ1⃗v1 λ2⃗v2 . . . λn⃗vn] = [⃗v1 ⃗v2 . . . ⃗vn]D,

where

D = | λ1   0   · · ·   0  |
    |  0   λ2  · · ·   0  |
    |  ..  ..  . . .   .. |
    |  0   0   · · ·   λn |

Hence, AP = PD or, equivalently, A = PDP⁻¹.
Conversely, if A = P DP −1 , then AP = P D. A reverse argument shows
that the columns of P are eigenvectors of A.
Remark 3.3.3. Theorem 3.3.2 is called the Diagonalization Theorem. Based on the proof of the theorem, if A is diagonalizable, then A is similar to a diagonal matrix D whose main diagonal contains the eigenvalues of A. To find out whether or not an n × n matrix is diagonalizable, we need to find n linearly independent eigenvectors.
Example 28. Prove that the following matrix is not diagonalizable.

A = | 2  1 |
    | 0  2 |
Proof. To determine whether A is diagonalizable, we find its eigenvalues and check whether it has two linearly independent eigenvectors. The characteristic polynomial of A is given by:

det(A − λI) = | 2 − λ    1    |  = (2 − λ)²
              |   0    2 − λ  |

Setting the characteristic polynomial equal to zero, we find that the only eigenvalue is λ = 2, with multiplicity 2. To find the eigenvectors corresponding to λ = 2, we solve the system of equations (A − 2I)X = 0:

| 0  1 | | x |   | 0 |
| 0  0 | | y | = | 0 |

From this, we see that x can be any value and y = 0. Therefore, every eigenvector corresponding to λ = 2 is a multiple of (1, 0)ᵀ, and the eigenspace E2 is one-dimensional. So we do not have enough linearly independent eigenvectors to form a basis of eigenvectors, and A is not diagonalizable.
Example 29. Prove that the following matrix is diagonalizable. Find D and P.

A = | 4  2 |
    | 1  3 |
Proof. First we find the characteristic polynomial of A:

det(A − λI) = | 4 − λ    2    |  = (4 − λ)(3 − λ) − 2 = (2 − λ)(5 − λ).
              |   1    3 − λ  |

So the eigenvalues are 2 and 5. For λ = 2 we have (A − 2I)X = 0:

| 2  2 | | x |   | 0 |
| 1  1 | | y | = | 0 |

So 2x + 2y = 0 and x + y = 0. Hence an eigenvector is (1, −1)ᵀ. For λ = 5 we have (A − 5I)X = 0:

| −1   2 | | x |   | 0 |
|  1  −2 | | y | = | 0 |

So −x + 2y = 0 and x − 2y = 0. Hence an eigenvector is (2, 1)ᵀ.

We have two eigenvectors belonging to distinct eigenvalues, so they are linearly independent by Theorem 3.1.6. Therefore A is diagonalizable. The matrix D is clearly

D = | 2  0 |
    | 0  5 |

To find P, we just put the eigenvectors in the columns of P. Hence,

P = |  1   2 |
    | −1   1 |
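The factorization can be verified directly in numpy (a sketch of the check A = PDP⁻¹, with the D and P found above):

    import numpy as np

    A = np.array([[4.0, 2.0],
                  [1.0, 3.0]])
    P = np.array([[1.0, 2.0],
                  [-1.0, 1.0]])   # eigenvectors as columns
    D = np.diag([2.0, 5.0])       # eigenvalues in the matching order

    print(np.allclose(A, P @ D @ np.linalg.inv(P)))   # True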
Lemma 3.3.4. Suppose that T is a linear map and v is an eigenvector of
T corresponding to the eigenvalue λ. If f is any polynomial, then f (T )v =
f (λ)v.
Proof. Let
f (x) = an xn + an−1 xn−1 + . . . + a1 x + a0
Consider the application of f (T ) to v:
f(T)v = (anTⁿ + an−1Tⁿ⁻¹ + . . . + a1T + a0I)v

Since v is an eigenvector of T corresponding to the eigenvalue λ, we have Tv = λv, and hence, by induction, Tᵏv = λᵏv for every k ≥ 1. Applying this to each term, we obtain:

f(T)v = (anTⁿ + an−1Tⁿ⁻¹ + . . . + a1T + a0I)v = (anλⁿ + an−1λⁿ⁻¹ + . . . + a1λ + a0)v = f(λ)v
Therefore, for any polynomial f , f (T )v = f (λ)v, as desired.
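For a concrete instance of the lemma, here is a small numpy sketch in which f(x) = x² + 3x + 1 is an arbitrary polynomial of our own choosing, and v is the eigenvector of Example 29 with eigenvalue 5:

    import numpy as np

    A = np.array([[4.0, 2.0],
                  [1.0, 3.0]])
    v = np.array([2.0, 1.0])   # eigenvector of A with eigenvalue 5
    lam = 5.0

    # f(A) = A^2 + 3A + I, applied to v, should equal f(5) * v = 41 * v.
    f_A = A @ A + 3 * A + np.eye(2)
    print(np.allclose(f_A @ v, (lam**2 + 3 * lam + 1) * v))   # True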
Theorem 3.3.5. Let T be a linear operator on the finite-dimensional space
V . Suppose λ1 , λ2 , . . . , λk are the distinct eigenvalues of T , and Eλi is the
eigenspace of the eigenvalue λi. If W = Eλ1 + Eλ2 + . . . + Eλk, then this sum is direct. That is, W = Eλ1 ⊕ Eλ2 ⊕ . . . ⊕ Eλk.
Proof. Suppose that for each i, we have a vector ⃗vi in Eλi , and assume that
⃗v1 + ⃗v2 + . . . + ⃗vk = ⃗0. We shall show that ⃗vi = 0 for each i.
Let f be any polynomial. Using the previous lemma, we have:
⃗0 = f (T )⃗0 = f (T )(⃗v1 + ⃗v2 + . . . + ⃗vk ) = f (λ1 )⃗v1 + f (λ2 )⃗v2 + . . . + f (λk )⃗vk .
Now, choose polynomials f1, f2, . . . , fk such that fi(λj) = δij (the Kronecker delta); such polynomials exist because the λj are distinct, for instance the Lagrange interpolation polynomials. Then we have:
⃗0 = fi (T )⃗0 = fi (T )(⃗v1 +⃗v2 +. . .+⃗vk ) = fi (λ1 )⃗v1 +fi (λ2 )⃗v2 +. . .+fi (λk )⃗vk = ⃗vi .
This shows that ⃗vi = ⃗0 for each i, proving that the eigenspaces associated
with different eigenvalues are independent of one another.
Corollary 3.3.6. Let T be a linear operator on the finite-dimensional space V, and suppose that W is the subspace spanned by all of the eigenvectors of T. If Eλ1, . . . , Eλk are the eigenspaces of T, then dim(W) = dim(Eλ1) + dim(Eλ2) + . . . + dim(Eλk).
Theorem 3.3.7. Let T be a linear operator on a finite-dimensional space V .
Let λ1 , . . . , λk be the distinct eigenvalues of T , and let Eλi be the eigenspaces
of T corresponding to λi . The following are equivalent.
(i) T is diagonalizable.
(ii) The characteristic polynomial for T is f = (x − λ1 )d1 . . . (x − λk )dk , and
dim Eλi = di for i = 1, . . . , k.
(iii) dim Eλ1 + . . . + dim Eλk = dim V .
Proof. We have observed that (i) implies (ii). If the characteristic polynomial f is the product of linear factors, as in (ii), then d1 + . . . + dk = dim V, for the sum of the di's is the degree of the characteristic polynomial, and that degree is dim V. Therefore, (ii) implies (iii).

Suppose (iii) holds. By Theorem 3.3.5 and Corollary 3.3.6, we must have V = Eλ1 ⊕ . . . ⊕ Eλk, i.e., the eigenvectors of T span V. Hence V has a basis consisting of eigenvectors of T, and T is diagonalizable by Theorem 3.3.2. So (iii) implies (i).
Example 30. Consider the matrix
A = | 1  1 |
    | 0  1 |
We want to determine if matrix A is diagonalizable.
Solution. First, we find the characteristic polynomial of matrix A:

det(A − xI) = | 1 − x    1    |  = (1 − x)(1 − x) = (1 − x)²
              |   0    1 − x  |

The characteristic polynomial is (1 − x)² and the eigenvalue is λ1 = 1 with multiplicity 2. Now, we need to determine the dimension of E1. To do this, we find the corresponding eigenvectors:

(A − I)⃗v = | 0  1 | | x |   | 0 |
            | 0  0 | | y | = | 0 |

The solutions are exactly the vectors with y = 0, so every eigenvector is a multiple of (1, 0)ᵀ. This implies that the dimension of E1 (the eigenspace corresponding to λ1 = 1) is 1. Since the algebraic multiplicity and the dimension of the eigenspace are not equal, matrix A is not diagonalizable.
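The dimension count in this example can be confirmed with a rank computation (a numpy sketch): dim E1 is the nullity of A − I, that is, 2 minus its rank.

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])

    # Geometric multiplicity of lambda = 1: the nullity of A - I.
    # Prints 1, which is less than the algebraic multiplicity 2.
    print(2 - np.linalg.matrix_rank(A - np.eye(2)))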
Remark 3.3.8. If A is an n × n matrix and A has n distinct eigenvalues, then A is diagonalizable. In fact, in this case the characteristic polynomial of A will be

(x − λ1) . . . (x − λn),

so each eigenspace Eλi has dimension 1 and the dimensions add up to n. The following example shows that the converse is not true.
Example 31. The matrix in Example 27 is diagonalizable, since we found three linearly independent eigenvectors, despite having only two distinct eigenvalues.
Problems
1. Let V be a finite-dimensional vector space. Let A and B be operators.
Assume that AB = BA. Show that if v is an eigenvector of A with
eigenvalue λ, then Bv is an eigenvector of A with eigenvalue λ as well,
provided that Bv ̸= 0.
2. What is the characteristic polynomial of a triangular matrix?
3. Let A be an invertible matrix. If λ is an eigenvalue of A, show that
λ ̸= 0 and that λ−1 is an eigenvalue of A−1 .
4. Let V be an n-dimensional vector space, and assume that the charac-
teristic polynomial of a linear map A : V → V has n distinct roots.
Show that V has a basis consisting of eigenvectors of A.
5. Let A, B be square matrices of the same size. Show that the eigenvalues
of AB are the same as the eigenvalues of BA.
6. Let A be a diagonal matrix with diagonal elements a11 , . . . , ann . What
is the dimension of the space generated by the eigenvectors of A? Ex-
hibit a basis for this space, and give the eigenvalues.
Chapter 4
Orthogonality
4.1 Inner product
Definition 4.1.1. Let V be a vector space over a field F. An inner product (or scalar product, or dot product) on V is a function ⟨·, ·⟩ : V × V → F which to any pair of elements ⃗v, ⃗w of V associates a scalar, denoted by ⟨⃗v, ⃗w⟩ or also ⃗v · ⃗w, satisfying the following properties:

1. We have ⃗v · ⃗w = ⃗w · ⃗v for all ⃗v, ⃗w ∈ V.

2. If ⃗u, ⃗v, ⃗w are elements of V, then ⃗u · (⃗v + ⃗w) = ⃗u · ⃗v + ⃗u · ⃗w.

3. If α ∈ F, then α(⃗u · ⃗v) = (α⃗u) · ⃗v = ⃗u · (α⃗v).

The inner product is said to be non-degenerate if, in addition, it satisfies the condition:

4. If ⃗v is an element of V and ⃗v · ⃗w = 0 for all ⃗w ∈ V, then ⃗v = ⃗0.
Definition 4.1.2. Let V be a vector space with an inner product. We define elements ⃗v, ⃗w of V to be orthogonal (or perpendicular), and write ⃗v ⊥ ⃗w, if ⃗v · ⃗w = 0. If S is a subset of V, we denote by S⊥ the set of all elements ⃗w ∈ V which are orthogonal to all elements of S, i.e., ⃗v · ⃗w = 0 for all ⃗v ∈ S.
Lemma 4.1.3. The set S ⊥ is a subspace of V (we call it the orthogonal
complement of S).
Proof. This follows from the properties of the inner product: if ⃗w1, ⃗w2 ∈ S⊥ and α ∈ F, then for every ⃗v ∈ S we have ⃗v · (⃗w1 + ⃗w2) = ⃗v · ⃗w1 + ⃗v · ⃗w2 = 0 and ⃗v · (α⃗w1) = α(⃗v · ⃗w1) = 0, and clearly ⃗0 ∈ S⊥.
Definition 4.1.4. Let V be a vector space over the field F with an inner product. Let {⃗v1, . . . , ⃗vn} be a basis of V. We say that it is an orthogonal basis if ⃗vi · ⃗vj = 0 for all i ≠ j.
Remark 4.1.5. We shall show later that any finite-dimensional vector space with an inner product possesses an orthogonal basis.
Example 32. The following basis of R³ is orthogonal.

⃗v1 = (2, 1, 0)
⃗v2 = (−1, 2, 0)
⃗v3 = (0, 0, 4)

Solution. To show that the vectors ⃗v1, ⃗v2, and ⃗v3 form an orthogonal basis of R³, we verify that their pairwise inner products are zero.

⃗v1 · ⃗v2 = (2)(−1) + (1)(2) + (0)(0) = 0
⃗v1 · ⃗v3 = (2)(0) + (1)(0) + (0)(4) = 0
⃗v2 · ⃗v3 = (−1)(0) + (2)(0) + (0)(4) = 0

Since all the pairwise inner products are zero, the vectors ⃗v1, ⃗v2, and ⃗v3 are orthogonal to each other, which means they form an orthogonal basis of R³.
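The three dot products can also be checked mechanically (a minimal numpy sketch):

    import numpy as np

    v1 = np.array([2.0, 1.0, 0.0])
    v2 = np.array([-1.0, 2.0, 0.0])
    v3 = np.array([0.0, 0.0, 4.0])

    # All pairwise dot products vanish, so the basis is orthogonal.
    print(np.dot(v1, v2), np.dot(v1, v3), np.dot(v2, v3))   # 0.0 0.0 0.0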
4.2 Norm of a vector
Definition 4.2.1. The norm of a vector v ∈ V is defined by ||v|| = √(v · v).
Remark 4.2.2. If c is any scalar, then we immediately get ||cv|| = |c| · ||v||, because ||cv|| = √((cv) · (cv)) = √(c²(v · v)) = |c| · ||v||.
The distance between two elements v, w ∈ V is defined to be d(v, w) =
||v − w||.
Lemma 4.2.3. Two vectors v and w are orthogonal if and only if ||v − w|| = ||v + w||.
Proof. The result follows by the intuition of plane geometry and the following
figure:
[Figure: the point w is joined to v and −v through the origin O; the two segments have lengths ||v − w|| and ||v + w||, and they are equal exactly when v ⊥ w.]
We can also give an algebraic proof as follows, noting that each step below is reversible:

||v − w||² = ||v + w||²
(v − w) · (v − w) = (v + w) · (v + w)
v · v − 2v · w + w · w = v · v + 2v · w + w · w
4v · w = 0
v · w = 0
Remark 4.2.4. By the Pythagoras theorem, if v and w are orthogonal, then ||v + w||² = ||v||² + ||w||². In general, for any two vectors, the triangle inequality holds: ||v + w|| ≤ ||v|| + ||w||.
Theorem 4.2.5. (Schwarz inequality) If v and w are vectors, then |v · w| ≤ ||v|| · ||w||.

Proof. Omitted.
4.3 Orthogonal bases
Throughout this section, let V be a vector space with a positive definite scalar product. Recall that a basis {v1, . . . , vn} of V is said to be orthogonal if its elements are mutually perpendicular, i.e., vi · vj = 0 whenever i ≠ j.
Definition 4.3.1. We say that an element v ∈ V is a unit vector if ∥v∥ = 1. If v ∈ V and v ≠ 0, then v/∥v∥ is a unit vector.
Definition 4.3.2. An orthogonal basis is called orthonormal if each element
of the basis is a unit vector.
Example 33. In Example 32, we proved that the following is an orthogonal basis for R³. However, it is not an orthonormal basis, since the vectors are not unit vectors. If we replace each vector v by v/∥v∥, we obtain an orthonormal basis for R³.

⃗v1 = (2, 1, 0)
⃗v2 = (−1, 2, 0)
⃗v3 = (0, 0, 4)

To obtain the orthonormal basis, we normalize each vector by dividing it by its norm:

⃗u1 = ⃗v1/∥⃗v1∥,  ⃗u2 = ⃗v2/∥⃗v2∥,  ⃗u3 = ⃗v3/∥⃗v3∥

Since ∥⃗v1∥ = ∥⃗v2∥ = √5 and ∥⃗v3∥ = 4, the orthonormal basis for R³ is:

⃗u1 = (2/√5, 1/√5, 0),  ⃗u2 = (−1/√5, 2/√5, 0),  ⃗u3 = (0, 0, 1)

Now, the vectors ⃗u1, ⃗u2, and ⃗u3 form an orthonormal basis for R³ since they are mutually perpendicular and have unit length.
The preceding example demonstrates a straightforward technique for trans-
forming an orthogonal basis into an orthonormal basis. However, what if we
start with a basis that is not orthogonal? Can we still obtain an orthogonal
basis from it? To address this question, we utilize a technique known as the
Gram-Schmidt orthogonalization process. We use this process to prove the
following theorem.
Theorem 4.3.3. (Gram-Schmidt process) Let {v1, . . . , vn} be any basis for V. Then there exists an orthogonal basis {u1, . . . , un} for V.
Proof. Step 1. Let u1 = v1.

Step 2. Let u2 = v2 − (v2 · u1 / ∥u1∥²) u1.

Step 3. Let u3 = v3 − (v3 · u1 / ∥u1∥²) u1 − (v3 · u2 / ∥u2∥²) u2.

In general, for i = 2, . . . , n,

ui = vi − Σ_{j=1}^{i−1} (vi · uj / ∥uj∥²) uj.

As an exercise, check that this basis is orthogonal.
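Since the proof is constructive, it translates directly into code. Below is a minimal Python sketch of the process using numpy (the function name gram_schmidt is our own choice); in line with Remark 4.3.4 below, it returns an orthogonal basis, not an orthonormal one:

    import numpy as np

    def gram_schmidt(vectors):
        # Implements u_i = v_i - sum_{j<i} ((v_i . u_j) / ||u_j||^2) u_j
        # for a list of linearly independent numpy vectors.
        basis = []
        for v in vectors:
            u = v.astype(float)
            for b in basis:
                # Subtract the component of v along the already-built u_j.
                u = u - (np.dot(v, b) / np.dot(b, b)) * b
            basis.append(u)
        return basis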
Remark 4.3.4. Notice that the Gram-Schmidt process gives an orthogonal basis. To obtain an orthonormal basis, we first use the Gram-Schmidt process to find an orthogonal basis, and then divide each basis vector by its norm.
Example 34. Let {v1, v2, v3} be a basis for V = R³, where

v1 = (1, −1, 1)ᵀ,  v2 = (1, 0, 1)ᵀ,  v3 = (1, 1, 2)ᵀ.

Use the Gram-Schmidt process to find an orthogonal basis for V.
Solution. Step 1: Take u1 = v1 = (1, −1, 1)ᵀ.

Step 2: Compute u2 using u1. Since v2 · u1 = 2 and ∥u1∥² = 3,

u2 = v2 − (v2 · u1 / ∥u1∥²) u1 = (1, 0, 1)ᵀ − (2/3)(1, −1, 1)ᵀ = (1/3, 2/3, 1/3)ᵀ.

Step 3: Compute u3 using u1 and u2. Since v3 · u1 = 2, v3 · u2 = 5/3 and ∥u2∥² = 2/3,

u3 = v3 − (v3 · u1 / ∥u1∥²) u1 − (v3 · u2 / ∥u2∥²) u2
   = (1, 1, 2)ᵀ − (2/3)(1, −1, 1)ᵀ − (5/2)(1/3, 2/3, 1/3)ᵀ
   = (−1/2, 0, 1/2)ᵀ.

Therefore, an orthogonal basis for V is {u1, u2, u3} given by:

u1 = (1, −1, 1)ᵀ,  u2 = (1/3, 2/3, 1/3)ᵀ,  u3 = (−1/2, 0, 1/2)ᵀ.
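Running the gram_schmidt sketch given after Theorem 4.3.3 on these vectors reproduces the basis found by hand, up to floating-point rounding:

    import numpy as np

    # Assumes the gram_schmidt function from the earlier sketch.
    v1 = np.array([1.0, -1.0, 1.0])
    v2 = np.array([1.0, 0.0, 1.0])
    v3 = np.array([1.0, 1.0, 2.0])

    u1, u2, u3 = gram_schmidt([v1, v2, v3])
    print(u1)   # [ 1. -1.  1.]
    print(u2)   # approx. [0.333 0.667 0.333], i.e. (1/3, 2/3, 1/3)
    print(u3)   # [-0.5  0.   0.5]
    print(np.dot(u1, u2), np.dot(u1, u3), np.dot(u2, u3))   # all (numerically) 0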
Problems
1. Let V be a finite-dimensional vector space with a positive definite scalar product, and assume that V ≠ {0}. Show that V has an orthogonal basis. (An inner product is positive definite if v · v ≥ 0 for all v ∈ V, and v · v > 0 if v ≠ 0.)
2. Let V be the subspace of maps generated by the two maps f and g
such that f (t) = t and g(t) = t2 . Find an orthonormal basis for V .
3. Let V be the subspace generated by the three maps 1, t, and t2 (where
1 is the constant map). Find an orthonormal basis for V .
4. Let V be a finite-dimensional vector space over R with a positive definite scalar product. Prove the parallelogram law for any elements v, w ∈ V:

∥v + w∥² + ∥v − w∥² = 2(∥v∥² + ∥w∥²)
Bibliography
[1] Kenneth Hoffman and Ray Kunze, Linear Algebra, Prentice Hall, En-
glewood Cliffs, NJ, 2nd edition, 1991.
[2] A. Onan, Linear Algebra: An Introductory Approach, CRC Press, Boca
Raton, FL, 2018.
[3] Sergei Treil, Linear Algebra Done Wrong, 3rd edition, Treil Books, Hous-
ton, TX, 2017.