Unit 1
Contents
1 Matrices and system of linear equations
1.1 Matrix operations
1.1.1 Addition of matrices and scalar multiplication
1.1.2 Matrix multiplication
1.1.3 Matrix transpose
1.1.4 Determinant of a matrix
1.1.5 Inverse of a matrix
1.1.6 Elementary row operations for matrices
1.2 Row-echelon form and reduced row-echelon form
1.2.1 Row-echelon form
1.2.2 Reduced row-echelon form
1.3 System of linear equations
1.4 Gaussian elimination method for solving system of linear equations
1.4.1 The augmented matrix in a row-echelon form
1.4.2 The Gaussian elimination process
1.5 Gauss–Jordan method for computing the inverse of a matrix
2 Vector Spaces
2.1 Subspaces of vector spaces
2.2 Linear independence
2.3 Basis of a vector space
2.4 Dimension of a vector space
Linear algebra plays a central role in the modern development of science and technology. It finds extensive applications in engineering, physics, computer science, economics, finance, cryptography, and more. It is arguably one of the most widely applied fields of mathematics. Because of the importance of linear algebra in these fields, in particular engineering, its basic theory has become indispensable.
This document presents lecture notes for Unit 1 of the course NMCI102 covering the following
topics: matrices, vector spaces, linear transformation, eigenvalues and eigenvectors, diagonaliza-
tion, and quadratic forms.
Matrices originated as a mathematical tool to represent and solve systems of linear equations efficiently, with their roots tracing back to ancient methods of organizing coefficients. An m × n matrix A over a set of scalars F is a rectangular array of elements of F arranged in m rows and n columns, written A = [aij],
where aij ∈ F represents the element of A in the ith row and jth column.
Depending on the structure of a matrix, some of the types of matrices are as follows:
• Square matrix: A matrix with the same number of rows and columns.
• Symmetric matrix: A square matrix A such that aij = aji for all i, j.
• Skew-symmetric matrix: A square matrix A such that aij = −aji for all i, j.
Equality of matrices: Two matrices A = [aij ] and B = [bij ] are said to be equal if and only if they
have the same size and the corresponding elements are equal, i.e., aij = bij for all i, j.
Matrix addition: Two matrices can be added if they have the same dimensions. The sum of two
matrices A = [aij ] and B = [bij ] of the same order m × n is another matrix C = [cij ] whose
elements are given by cij = aij + bij . We write C as A + B.
Scalar multiplication: Given a matrix A = [aij ] and a scalar α, the scalar multiplication of A with
the scalar α is defined to be the matrix αA = [αaij ]. In particular, we observe that A + ((−1)A) is
the zero matrix. So, we write −A = (−1)A.
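These entrywise operations are easy to experiment with numerically. Below is a small illustrative sketch in Python using numpy (our choice of tool, not part of the notes); the particular matrices are arbitrary examples.

```python
import numpy as np

# Entrywise addition and scalar multiplication of matrices.
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

C = A + B   # (A + B)_ij = a_ij + b_ij
D = 3 * A   # (3A)_ij = 3 * a_ij

# A + (-1)A is the zero matrix, so -A = (-1)A.
assert np.array_equal(A + (-1) * A, np.zeros((2, 2), dtype=int))
print(C)
print(D)
```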
The multiplication (or product) of two matrices A and B is defined when their sizes are compat-
ible; i.e., the number of columns in A must be equal to the number of rows in B. Under this
compatibility condition, the multiplication of two matrices is defined as follows.
• This product is sometimes called the row-column product to emphasize the fact that it is a
product involving the rows of A with the columns of B.
• The product of two matrices is not defined if their sizes are not compatible. For example, if
A has size 2 × 5 and B has size 3 × 2 then their product AB is not defined. However, the
product BA is defined. Why? What is the size of BA?
• Matrix multiplication distributes over addition, on both sides (provided, of course, that the sizes of the matrices involved are compatible for the given operations): A(B + C) = AB + AC and (A + B)C = AC + BC. Matrix multiplication is also associative:
(AB)C = A(BC).
• In particular, taking the nth power of a square matrix is well-defined for every positive integer
n.
• The transpose of the product of two matrices is the product of their transposes in reverse
order:
(AB)T = B T AT .
Example. Consider the product C = AB, where
A = [ 1 2 3 ]      B = [ 9 8 7 ]
    [ 4 5 6 ],         [ 6 5 4 ],
    [ 7 8 9 ]          [ 3 2 1 ]
so that
cij = Σ_{k=1}^{3} aik bkj,   1 ≤ i, j ≤ 3.
This gives
C = [  30  24 18 ]
    [  84  69 54 ].
    [ 138 114 90 ]
Example. Consider
A = [ 1 6 −2 ]      B = [ 2 −9 ]
    [ 3 4  5 ],         [ 6  1 ].
    [ 7 0  8 ]          [ 1 −3 ]
Let C = AB, a 3 × 2 matrix. Its entries are
c11 = (1 · 2) + (6 · 6) + (−2 · 1) = 2 + 36 − 2 = 36,
c12 = (1 · −9) + (6 · 1) + (−2 · −3) = −9 + 6 + 6 = 3,
c21 = (3 · 2) + (4 · 6) + (5 · 1) = 6 + 24 + 5 = 35,
c22 = (3 · −9) + (4 · 1) + (5 · −3) = −27 + 4 − 15 = −38,
c31 = (7 · 2) + (0 · 6) + (8 · 1) = 14 + 0 + 8 = 22,
c32 = (7 · −9) + (0 · 1) + (8 · −3) = −63 + 0 − 24 = −87.
Therefore,
C = [ 36   3 ]
    [ 35 −38 ].
    [ 22 −87 ]
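As a quick cross-check, the same product can be computed with numpy; this sketch merely reproduces the entries computed above.

```python
import numpy as np

# Checking the 3x2 product entry by entry.
A = np.array([[1, 6, -2], [3, 4, 5], [7, 0, 8]])
B = np.array([[2, -9], [6, 1], [1, -3]])

C = A @ B  # c_ij = sum over k of a_ik * b_kj
print(C)   # [[ 36   3] [ 35 -38] [ 22 -87]]
```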
If A is an m × n matrix then its transpose is the n × m matrix whose (i, j)-entry is equal to the (j, i)-entry of A. We denote the transpose of A by A^T. For example, the transpose of
A = [ 1 2 3 ]
    [ 4 5 6 ]
is
A^T = [ 1 4 ]
      [ 2 5 ].
      [ 3 6 ]
The determinant of a 3 × 3 matrix can be computed by cofactor expansion along the first row:
det [ a1 a2 a3 ]  =  a1 det [ b2 b3 ] − a2 det [ b1 b3 ] + a3 det [ b1 b2 ].
    [ b1 b2 b3 ]            [ c2 c3 ]          [ c1 c3 ]          [ c1 c2 ]
    [ c1 c2 c3 ]
For example,
det [  1 2 4 ]  =  1 det [ 1 0 ] − 2 det [ −1 0 ] + 4 det [ −1 1 ]
    [ −1 1 0 ]           [ 1 3 ]         [ −2 3 ]          [ −2 1 ]
    [ −2 1 3 ]
= 1(3 − 0) − 2(−3 − 0) + 4(−1 + 2) = 3 + 6 + 4 = 13.
An n × n matrix A is said to be invertible if there exists an n × n matrix B such that
AB = BA = In,
in which case B is called the inverse of A and is denoted by A−1.
For example, the matrix A = [ 2 1 ; 1 3 ]... rather, A = [ 2 5 ; 1 3 ] is invertible and its inverse is given by A−1 = [ 3 −5 ; −1 2 ]. One can easily check that
[ 2 5 ] [  3 −5 ]  =  [  3 −5 ] [ 2 5 ]  =  [ 1 0 ]
[ 1 3 ] [ −1  2 ]     [ −1  2 ] [ 1 3 ]     [ 0 1 ].
Not every matrix is invertible. An obvious example is the n × n zero matrix. Another example of
non-invertible matrix is the n × n matrix with each element equal to 1 for n ≥ 2. Indeed, we can
see that
[ 1 1 ] [ a b ]  =  [ a + c  b + d ]
[ 1 1 ] [ c d ]     [ a + c  b + d ],
which is never equal to the identity matrix for any choice of a, b, c, d.
Proposition 11. If A and B are invertible n × n matrices with inverses A−1 and B −1 , then
AB is also an invertible matrix with the inverse B −1 A−1 .
Proposition 12. Any square matrix with a row or column of all zeroes cannot be invertible.
Proof. Suppose the n × n matrix A has all entries in its i-th row equal to zero. Then for any n × n
matrix B, the product AB will have all entries in its i-th row equal to zero, so it cannot be the
identity matrix.
Similarly, if the n × n matrix A has all entries in its i-th column equal to zero, then for any n × n matrix B, the product BA will have all entries in its i-th column equal to zero, so again it cannot be the identity matrix.
Proposition 13. The 2 × 2 matrix A = [ a b ; c d ] is invertible if and only if ad − bc ≠ 0, in which case its inverse is
A−1 = 1/(ad − bc) [  d −b ].
                  [ −c  a ]
Proof. This follows from solving the system of equations for e, f, g, h in terms of a, b, c, d that arises from comparing entries in the product
[ a b ] [ e f ]  =  [ 1 0 ].
[ c d ] [ g h ]     [ 0 1 ]
One obtains precisely the solution given above. If ad = bc, the system is inconsistent and there is no solution; otherwise, there is exactly one solution, as given.
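The formula lends itself to a one-line implementation. Below is a minimal sketch based on the proposition as stated above; the function name inverse_2x2 is ours.

```python
import numpy as np

def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via the 2x2 inverse formula.

    Raises ValueError when ad - bc = 0, in which case no inverse exists.
    """
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc = 0: the matrix is not invertible")
    return np.array([[d, -b], [-c, a]]) / det

# The example from the text: [[2, 5], [1, 3]] has inverse [[3, -5], [-1, 2]].
print(inverse_2x2(2, 5, 1, 3))
```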
Elementary row operations. There are three elementary row operations that can be performed on the rows of a matrix:
(R1) interchange two rows,
(R2) multiply a row by a nonzero scalar,
(R3) add a scalar multiple of one row to another row.
The above row operations can be applied to any rectangular matrix to obtain a so-called row-echelon form of the matrix that we will see in the next section.
Here we shall briefly discuss the concepts of row-echelon and reduced row-echelon forms of rect-
angular matrices. These concepts will play important roles in solving systems of linear equations
discussed in the next section.
A rectangular matrix is said to be in row-echelon form if it satisfies the following two conditions:
(i) every row with a nonzero element is always above all the zero rows,
(ii) the first nonzero element in every row is always to the right of the first nonzero term in the row above it.
The first nonzero element of a row is called the pivot element of that row.
Equivalently, in terms of pivots:
(i′) all rows without a pivot (the zero rows) are at the bottom,
(ii′) any row’s pivot, if it has one, lies to the right of the pivot of the row directly above it.
We will see in Section 1.4 that any rectangular matrix can be brought to a row-echelon form by performing elementary row operations on it. Below are some examples of matrices in row-echelon form:
[ 1 2 3 4 5 ]    [ 1 2 3 0 5 ]
[ 0 1 2 3 4 ],   [ 0 0 0 1 0 ].
[ 0 0 1 0 1 ]    [ 0 0 0 0 0 ]
Also, here are examples of matrices not in row-echelon form. The matrix
[ 1 2 3 4 5 ]
[ 1 1 2 3 4 ]
[ 0 0 1 0 1 ]
is not in row-echelon form because the pivot in the second row is not strictly to the right of the pivot element in the row above it. The matrix
[ 1 2 3 4 5 ]
[ 0 0 1 3 4 ]
[ 0 0 1 0 1 ]
is also not in row-echelon form. It is because the pivot in the third row is not strictly to the right of the pivot in the second row.
Exercise: Do you think the matrix
[ 1 2 3 4 5 ]
[ 0 1 2 3 4 ]
[ 0 0 0 0 1 ]
is in the row-echelon form? Justify.
Remark: We will see in the next section that if the coefficient matrix of a system of linear equations is in a row-echelon form then it is easy to solve the system. It just requires backward substitution to obtain the solution.
A matrix in row-echelon form can be further simplified into a form known as reduced row-echelon
form through elementary row operations.
Definition 15. A rectangular matrix is said to be in reduced row-echelon form if it satisfies the following three conditions:
(i) it is in row-echelon form,
(ii) every pivot element is equal to 1,
(iii) the entries above every pivot element are all zero.
Below are some examples of matrices in reduced row-echelon form (the pivots are the leading 1 of each nonzero row):
[ 1 0 0 4 5 ]    [ 1 2 3 0 5 ]    [ 1 2 0 4 0 ]
[ 0 1 0 3 4 ],   [ 0 0 0 1 0 ],   [ 0 0 1 3 0 ].
[ 0 0 1 0 1 ]    [ 0 0 0 0 0 ]    [ 0 0 0 0 1 ]
Also, here are some examples of matrices not in reduced row-echelon form. The matrix
[ 1 2 0 4 5 ]
[ 0 1 0 3 4 ]
[ 0 0 1 0 1 ]
is not in reduced row-echelon form because the entry above the second pivot element is not zero.
The following matrix is also not in reduced row-echelon form.
[ 0 0 3 4 5 ]
[ 0 0 0 0 1 ].
[ 0 0 0 0 0 ]
Why?
As stated earlier, through elementary row operations, every matrix can be brought to a reduced row-echelon form. It turns out that the reduced row-echelon form of a matrix is unique, as stated in the following theorem. We omit the proof.
Theorem 16. The reduced row-echelon form of a matrix is unique.
Remark: We will later define the concept of rank of a matrix. For a matrix in row-echelon form, it is equivalently defined as the number of pivots in the matrix. For example, the rank of
[ 1 2 3 4 5 ]
[ 0 1 2 3 4 ]
[ 0 0 1 0 1 ]
is 3, while the rank of
[ 1 2 3 0 5 ]
[ 0 0 0 1 0 ]
[ 0 0 0 0 0 ]
is 2. More generally, the rank of any matrix A is equal to the rank of a matrix in row-echelon form obtained from A by applying elementary row operations.
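Row-echelon computations of this kind can be automated. A small sketch using sympy (our choice of tool), applied to the second matrix of the remark above:

```python
from sympy import Matrix

# rref() returns the reduced row-echelon form and the pivot columns.
A = Matrix([[1, 2, 3, 0, 5],
            [0, 0, 0, 1, 0],
            [0, 0, 0, 0, 0]])

R, pivot_cols = A.rref()
print(R)                # already in reduced row-echelon form
print(len(pivot_cols))  # 2 pivots, so rank(A) = 2
print(A.rank())         # agrees: 2
```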
A system of linear equations consists of two or more linear equations involving the same set of variables. Such systems arise frequently in engineering, physics, computer science, and many other fields. The general form of a system of m linear equations in n variables is
a11 x1 + a12 x2 + · · · + a1n xn = b1,
a21 x1 + a22 x2 + · · · + a2n xn = b2,
...
am1 x1 + am2 x2 + · · · + amn xn = bm,
where x1, x2, . . . , xn are the variables, aij are the coefficients, and b1, b2, . . . , bm are the constants.
The above system of linear equations can be represented in the matrix form as follows:
Ax = B,
where A is an m × n matrix called the coefficient matrix, x is the column vector of variables, and
B is the column vector of the constants given by:
A = [ a11 a12 · · · a1n ]
    [ a21 a22 · · · a2n ]
    [  ·   ·  · · ·  ·  ]
    [ am1 am2 · · · amn ],
x = [ x1, x2, . . . , xn ]^T,   B = [ b1, b2, . . . , bm ]^T.
The matrix [A|B] obtained by appending the constant vector B to the right of the coefficient matrix
A is called the augmented matrix of the system. The vertical line in the augmented matrix is meant
to separate the coefficients from the constants in the system. We will see later that working with
the augmented matrix is a lot more convenient while solving a system of linear equations.
Example 17. The following are systems of linear equations in two and three variables, respectively:
2x + 3y = 5,
4x − y = 11;        (1)

x + 2y + z = 6,
3x − y + 4z = 8,    (2)
5x + y − 2z = 4.
The traditional method for solving a system of linear equations (probably familiar from basic
algebra) is by elimination: we solve the first equation for one variable, say x1 , in terms of the
others. We then substitute the result into all other equations to obtain a reduced system involving
one fewer variables. Eventually, the system is simplified to a scenario that implies a contradiction
(e.g., 1 = 0), a unique solution, or an infinite family of solutions.
Example 18. Let us solve the following system of equations by the elimination method described
above.
x + y = 7,
2x − 2y = −2.
Solving the first equation for x gives x = 7 − y. Substituting this into the second equation:
2(7 − y) − 2y = −2  →  14 − 4y = −2  →  y = 4.
Then x = 7 − y = 3. Thus, we get a unique solution to the system given by (x, y) = (3, 4).
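The same system can be solved numerically; a one-line check with numpy:

```python
import numpy as np

# The system x + y = 7, 2x - 2y = -2 from Example 18.
A = np.array([[1.0, 1.0], [2.0, -2.0]])
b = np.array([7.0, -2.0])

print(np.linalg.solve(A, b))  # [3. 4.], i.e., (x, y) = (3, 4)
```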
Example 19. Let us solve the following system of three equations by the same elimination method:
x + y + 3z = 4,      (i)
2x + 3y − z = 1,     (ii)
−x + 2y + 2z = 1.    (iii)
We first eliminate x by performing the following operations on the equations (ii) → (ii) − 2(i)
and (iii) → (iii) + (i):
x + y + 3z = 4,
y − 7z = −7,
3y + 5z = 5.
Next, we eliminate y from the third equation by performing (iii) → (iii) − 3(ii):
x + y + 3z = 4,     (i′)
y − 7z = −7,        (ii′)
26z = 26.           (iii′)
Solve equation (iii′ ) to get z = 1. Substituting z = 1 into (ii′ ) gives y = 0. Lastly, substitute
y = 0, z = 1 into (i′ ) to get x = 1. So, we get a unique solution (x, y, z) = (1, 0, 1).
Observe that the above procedure of solving systems of linear equations requires dealing with
only the coefficients of the variables and the constants. So, we only need to keep track of the
coefficients, which we can do by putting them into an array. For example, the above system of
linear equations given by (i), (ii), (iii) can be written in simplified form using the array as
1 1 3 4
2 3 −1 1 .
−1 2 2 1
Recall that the above matrix is called the augmented matrix for the given system of linear equations.
We can then do operations on the entries in the array that correspond to manipulations of the
associated system of equations. The elimination process consists of a combination of the three
elementary row operations, which we recall below:
(R1) interchange two rows,
(R2) multiply a row by a nonzero scalar,
(R3) add a scalar multiple of one row to another row.
Each of these elementary row operations leaves unchanged the solutions to the associated system
of linear equations. The idea of elimination is to apply these elementary row operations to the
coefficient matrix until it is in a simple enough form that we can simply read off the solutions
to the original system of equations. The above ideas lead to the Gaussian elimination process of
solving systems of linear equations.
If the augmented matrix of a system is in a row-echelon form, it is easy to read off the solutions to
the corresponding system of linear equations by working from the bottom up. This is illustrated in
the following example.
Example 20. Consider the augmented matrix in row-echelon form
[ 1 1  3 | 4 ]
[ 0 1 −1 | 1 ].
[ 0 0  2 | 4 ]
Working from the bottom up: the last row gives 2z = 4, so z = 2; the second row gives y − z = 1, so y = 3; and the first row gives x + y + 3z = 4, so x = −5.
By using row operations on the augmented matrix, we can solve the associated system of equations,
since, as we noted before, each of the elementary row operations does not change the solutions of
the system.
Case 1: No solution. This happens precisely when two conditions are satisfied:
(i) each row of the augmented matrix has a pivot element,
(ii) the pivot element in the last row appears in the last column.
Here is an example illustrating this case. Consider the following augmented matrix in row-echelon form:
[ 1 1  3 | 4 ]
[ 0 1 −1 | 1 ].
[ 0 0  0 | 4 ]
Note that the pivot element in the last row (the 4) appears in the last column. Its corresponding system is
x + y + 3z = 4,
y − z = 1,
0 = 4.
The above system does not have any solution because the last equation is impossible to satisfy.
Case 2: Unique solution. This happens precisely when two conditions are satisfied:
(i) each row of the augmented matrix has a pivot element,
(ii) the pivot element in the last row appears in the second-to-last column.
For example, consider again the augmented matrix in row-echelon form
[ 1 1  3 | 4 ]
[ 0 1 −1 | 1 ].
[ 0 0  2 | 4 ]
Observe that it satisfies the two conditions of the present case; i.e., each row has a pivot element and the pivot element of the last row (the 2) appears in the second-to-last column. Its corresponding
system is
x + y + 3z = 4,
y − z = 1,
2z = 4.
The unique solution is z = 2, y = 3, and x = −5, which we obtain by backward substitution.
Case 3: Infinitely many solutions. This happens precisely when no pivot appears in the last column (so the system is consistent) and some row of the augmented matrix (in a row-echelon form) is zero, leaving fewer pivots than variables. For example, consider the following augmented matrix in row-echelon form whose last row is zero:
[ 1 1  3 | 4 ]
[ 0 1 −1 | 1 ].
[ 0 0  0 | 0 ]
Here z is a free variable: setting z = t, backward substitution gives y = 1 + t and x = 3 − 4t, so the system has infinitely many solutions, one for each value of t.
The main idea behind the Gaussian elimination process for solving a system of linear equations is to bring the augmented matrix into a row-echelon form. Then, depending on the nature of the solutions as discussed above, we either conclude that there is no solution or solve the system through backward substitution.
Step 1. Convert the augmented matrix into a row-echelon form. The augmented matrix can be reduced to a row-echelon form by performing elementary row operations on it. This is called forward elimination. If you go further with the process and convert the augmented matrix into its reduced row-echelon form, then the following steps become significantly easier to execute; this variant is called the Gauss–Jordan elimination process for solving systems of linear equations.
Step 2. Interpreting the nature of solutions: If each row contains a pivot element and the pivot in
the last row appears in the last column, then conclude that no solution exists. Else, proceed
to the next step.
Step 3. Calculating solutions explicitly: If you are at this step, it means a solution exists. In this
case, the variables corresponding to the pivot elements are given in terms of the remaining
variables (also called free variables) through backward substitution. In particular, if there are
n pivot elements for the system in n variables then the system has a unique solution. Else,
the pivot variables are determined in terms of free variables, which implies infinitely many
solutions.
Example 21. Let us solve the following system using Gaussian elimination:
x + y + 3z = 4,
2x + 3y − z = 1,
−x + 2y + 2z = 1.
Its augmented matrix is
[  1 1  3 | 4 ]
[  2 3 −1 | 1 ].
[ −1 2  2 | 1 ]
We now apply a series of elementary row operations to bring the augmented matrix into a row-echelon form as follows:
Apply R2 → R2 − 2R1 and R3 → R3 + R1:
[ 1 1  3 |  4 ]
[ 0 1 −7 | −7 ]
[ 0 3  5 |  5 ]
Apply R3 → R3 − 3R2 :
[ 1 1  3 |  4 ]
[ 0 1 −7 | −7 ]
[ 0 0 26 | 26 ]
We have now obtained a row-echelon form of the original augmented matrix. We can also see
that the system has a unique solution. Why? The solution can be easily obtained through back
substitution, which is given by x = 1, y = 0, z = 1.
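The forward-elimination and back-substitution procedure can be packaged into a short routine. Below is a teaching sketch in numpy, with partial pivoting added for numerical robustness (a choice of ours, not part of the notes), applied to the system of Example 21; it is not a replacement for numpy.linalg.solve.

```python
import numpy as np

def gaussian_solve(A, b):
    """Solve Ax = b for an n x n system with a unique solution.

    Forward elimination with partial pivoting brings [A|b] to
    row-echelon form; back substitution then recovers x.
    """
    M = np.hstack([A.astype(float), b.reshape(-1, 1).astype(float)])
    n = len(b)
    for k in range(n):                        # forward elimination
        p = k + np.argmax(np.abs(M[k:, k]))   # pick the pivot row
        M[[k, p]] = M[[p, k]]                 # swap it into place
        for i in range(k + 1, n):
            M[i] -= (M[i, k] / M[k, k]) * M[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):            # back substitution
        x[i] = (M[i, -1] - M[i, i + 1:n] @ x[i + 1:]) / M[i, i]
    return x

A = np.array([[1, 1, 3], [2, 3, -1], [-1, 2, 2]])
b = np.array([4, 1, 1])
print(gaussian_solve(A, b))  # [ 1.  0.  1.]
```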
This section concerns the computation of the inverse of invertible matrices. The method described in this section is the Gauss–Jordan method for computing the inverse of a matrix. The main idea of this method is to convert the problem of finding the inverse of a matrix into solving several systems of linear equations simultaneously using the Gaussian elimination method.
AA−1 = In , (1)
where recall that In is the identity matrix of size n × n. Let us denote by x1 , . . . , xn the (unknown)
columns of A−1 and denote by e1, . . . , en the columns of In. We can thus write (1) as
A [x1 x2 · · · xn] = [e1 e2 · · · en].
The above matrix equation is equivalent to the following systems of linear equations:
Axi = ei , 1 ≤ i ≤ n. (2)
Therefore, finding the inverse of A boils down to solving the systems of linear equations (2) simultaneously:
Gauss–Jordan method for finding the inverse ≡ solving several systems of linear equations simultaneously.
The following are the steps to perform the Gauss–Jordan method to compute the inverse of a matrix:
Step 1. Form the augmented matrix [A|In].
Step 2. Reduce [A|In] to its reduced row-echelon form:
(i) apply elementary row operations on [A|In] to reduce it to a row-echelon form, say [U |⋆],
(ii) apply another set of elementary row operations on [U |⋆] to get the reduced row-echelon form, which is given by [In |A−1].
Step 3. Read off A−1 from the reduced row-echelon form [In |A−1 ].
Example 22. Let us find the inverse of A = [ 2 1 ; 1 1 ] using the Gauss–Jordan method. Consider the augmented matrix [A|I2]:
[ 2 1 | 1 0 ]
[ 1 1 | 0 1 ]
Apply R2 → R2 − (1/2)R1:
[ 2  1  |   1  0 ]
[ 0 1/2 | −1/2 1 ]
Apply R1 → (1/2)R1, R2 → 2R2:
[ 1 1/2 | 1/2 0 ]
[ 0  1  | −1  2 ]
Apply R1 → R1 − (1/2)R2:
[ 1 0 |  1 −1 ]
[ 0 1 | −1  2 ]
The inverse of [ 2 1 ; 1 1 ] is thus given by [ 1 −1 ; −1 2 ].
Example 23. Consider the matrix
A = [ 2 1 0 ]
    [ 1 2 1 ].
    [ 0 1 2 ]
Let us find its inverse using the Gauss–Jordan method. Consider the augmented matrix [A|I3]:
[ 2 1 0 | 1 0 0 ]
[ 1 2 1 | 0 1 0 ]
[ 0 1 2 | 0 0 1 ]
Apply R2 → R2 − (1/2)R1:
[ 2  1  0 |   1  0 0 ]
[ 0 3/2 1 | −1/2 1 0 ]
[ 0  1  2 |   0  0 1 ]
Apply R3 → R3 − (2/3)R2:
[ 2  1   0  |   1    0   0 ]
[ 0 3/2  1  | −1/2   1   0 ]
[ 0  0  4/3 |  1/3 −2/3  1 ]
Apply R1 → (1/2)R1, R2 → (2/3)R2, R3 → (3/4)R3:
[ 1 1/2  0  |  1/2   0    0  ]
[ 0  1  2/3 | −1/3  2/3   0  ]
[ 0  0   1  |  1/4 −1/2  3/4 ]
Apply R1 → R1 − (1/2)R2:
[ 1 0 −1/3 |  2/3 −1/3   0  ]
[ 0 1  2/3 | −1/3  2/3   0  ]
[ 0 0   1  |  1/4 −1/2  3/4 ]
Apply R1 → R1 + (1/3)R3, R2 → R2 − (2/3)R3:
[ 1 0 0 |  3/4 −1/2  1/4 ]
[ 0 1 0 | −1/2   1  −1/2 ]
[ 0 0 1 |  1/4 −1/2  3/4 ]
Therefore,
A−1 = [  3/4 −1/2  1/4 ]
      [ −1/2   1  −1/2 ].
      [  1/4 −1/2  3/4 ]
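A quick numerical cross-check of Example 23 with numpy, which applies essentially the same elimination internally:

```python
import numpy as np

A = np.array([[2, 1, 0], [1, 2, 1], [0, 1, 2]], dtype=float)
A_inv = np.linalg.inv(A)
print(A_inv)                              # [[ 0.75 -0.5  0.25] ...]
print(np.allclose(A @ A_inv, np.eye(3)))  # True
```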
2 Vector Spaces
When we read or hear the word vector, we may immediately think of two points in R2 (or R3 )
connected by an arrow. Mathematically speaking, a vector is just an element of a vector space.
This then begs the question: What is a vector space?
Roughly speaking, a vector space is a set of objects that can be added and multiplied by scalars.
We may have already worked with several types of vector spaces. Examples of vector spaces that
we might have already encountered are:
1. the set Rn,
2. the set Rn[t] of polynomials of degree at most n,
3. the set Mm×n(R) of m × n matrices.
In all of these sets, there is an operation of addition and an operation of multiplication by scalars. Let us formalize exactly what we mean by a vector space.
Definition 24. A nonempty set V (called the set of vectors) with operations “vector addition”, denoted by “+”, and “scalar multiplication”, denoted by “·”, is called a vector space over a set of scalars F if the following properties are satisfied for all u, v, w ∈ V and all α, β ∈ F:
(1) u + v ∈ V (closure under addition),
(2) u + v = v + u,
(3) (u + v) + w = u + (v + w),
(4) there is a zero vector 0 ∈ V such that u + 0 = u,
(5) for each u ∈ V there is −u ∈ V such that u + (−u) = 0,
(6) α · u ∈ V (closure under scalar multiplication),
(7) α · (u + v) = α · u + α · v,
(8) (α + β) · u = α · u + β · u,
(9) α · (β · u) = (αβ) · u,
(10) 1 · u = u.
Example 25. Is the unit disc in R2, under the usual operations of addition and scalar multiplication, a vector space? Here
V = {(x, y) ∈ R2 | x2 + y2 ≤ 1}.
(Figure: the unit disc in the plane, with the boundary points (1, 0) and (0, 1) marked.)
Solution. The disc is not closed under scalar multiplication. For example, take u = (1, 0) ∈ V and multiply by, say, α = 2. Then αu = (2, 0) is not in V. Therefore, property (6) of the definition of a vector space fails, and consequently the unit disc is not a vector space.
Example 26. Is the graph of the parabola f(x) = x2, under the usual operations, a vector space? Here
V = {(x, y) ∈ R2 | y = x2}.
(Figure: the parabola y = x2.)
Solution. The set V is not closed under scalar multiplication. For example, u = (1, 1) is a point
in V but 2u = (2, 2) is not. You may also notice that V is not closed under addition either. For
example, both u = (1, 1) and v = (2, 4) are in V but u + v = (3, 5) and (3, 5) is not a point on
the parabola V . Therefore, the graph of f (x) = x2 is not a vector space.
Example 27. Is the graph of the line f(x) = 2x, under the usual operations, a vector space? Here
V = {(x, y) ∈ R2 | y = 2x}.
Solution. We will show that V is a vector space. First, we verify that V is closed under addition.
We first note that an arbitrary point in V can be written as u = (x, 2x). Let then u = (a, 2a) and
v = (b, 2b) be points in V. Then
u + v = (a + b, 2a + 2b) = (a + b, 2(a + b)),
which is again a point on the line y = 2x. Therefore, V is closed under addition. Similarly, V is closed under scalar multiplication: for any scalar α,
αu = (αa, 2(αa)) ∈ V.
All the other properties of a vector space can be verified to hold; for example, addition is commu-
tative and associative in V because addition in R2 is commutative/associative, etc. Therefore, the
graph of the function f (x) = 2x is a vector space.
Example 28. Let V = Rn [t] be the set of all polynomials in the variable t of degree at most n,
given by
Rn[t] = { a0 + a1t + a2t2 + · · · + antn | a0, a1, . . . , an ∈ R }.
Given u = a0 + a1t + · · · + antn and v = b0 + b1t + · · · + bntn in Rn[t], the sum
u + v = (a0 + b0) + (a1 + b1)t + · · · + (an + bn)tn
is again a polynomial of degree at most n, so Rn[t] is closed under addition. Similarly, for a scalar α,
αu = (αa0) + (αa1)t + · · · + (αan)tn.
Then αu is a polynomial of degree at most n, and thus αu ∈ Rn[t]. Hence, Rn[t] is closed under
scalar multiplication. The zero vector in Rn [t] is the zero polynomial 0(t) = 0.
One can verify that all other properties of the definition of a vector space also hold; for example,
addition is commutative and associative, etc. Thus, Rn [t] is a vector space over R.
Example 29. Let V = Mm×n(R) be the set of all m × n matrices whose entries are from R. Under the usual operations of addition of matrices and scalar multiplication, is Mm×n(R) a vector space over R?
Solution. Given matrices A, B ∈ Mm×n (R) and a scalar α, we define the sum A + B by adding
entry-by-entry, and αA by multiplying each entry of A by α. It is clear that the space Mm×n (R) is
closed under these two operations. The zero vector in Mm×n (R) is the matrix of size m × n having
all entries equal to zero. It can be verified that all other properties of the definition of a vector
space also hold. Thus, the set Mm×n (R) is a vector space over R.
Example 30. The n-dimensional Euclidean space V = Rn under the usual operations of addition
and scalar multiplication is a vector space over R.
Example 31. Let V = C[a, b] denote the set of functions with domain [a, b] and codomain R that are continuous. Is V a vector space over R? Yes: the sum of two continuous functions is continuous, a scalar multiple of a continuous function is continuous, and the zero function serves as the zero vector; the remaining properties are inherited from the pointwise operations on R.
Frequently, one encounters a vector space W that is a subset of a larger vector space V . In this
case, we would say that W is a subspace of V . Below is the formal definition.
Definition 32. Let V be a vector space over F with operations “+” and “·” and let W ⊆ V .
Then W is called a subspace of V if W is a vector space under same operations + and ·
over F.
Since W is also a vector space, all 10 properties of a vector space hold in W; in particular, W is closed under vector addition and scalar multiplication. The following theorem says that to show a subset W of V is a subspace of V, we need not check all 10 properties: it suffices to check the following 3.
Theorem 33. Let V be a vector space over F under operations “+” and “·”. A subset W of V is a subspace of V if and only if it satisfies the following properties:
1. the zero vector of V is in W,
2. u + v ∈ W for all u, v ∈ W,
3. α · u ∈ W for all α ∈ F and all u ∈ W.
Example 34. Let
W = {(x, y) ∈ R2 | y = 2x}.
Is W a subspace of V = R2?
Solution. If x = 0, then y = 2 · 0 = 0, and therefore (0, 0) is in W. Let u = (a, 2a) and v = (b, 2b) be elements of W. Then
u + v = (a + b, 2(a + b)),
whose components satisfy y = 2x, so u + v ∈ W and W is closed under addition. Now let α be any scalar; then
αu = (αa, 2(αa)).
Because the x- and y-components of αu satisfy y = 2x, αu is an element of W, and thus W is closed under scalar multiplication.
All three conditions of a subspace are satisfied for W , and therefore W is a subspace of V .
Example 35. Let W be the first quadrant of the plane,
W = {(x, y) ∈ R2 | x ≥ 0, y ≥ 0}.
Is W a subspace of R2 ?
Solution. The set W contains the zero vector, and the sum of two vectors in W is again in W; you may want to verify this explicitly as follows: If u1 = (x1, y1) and u2 = (x2, y2) are in W, then x1, x2 ≥ 0 and y1, y2 ≥ 0, and hence
u1 + u2 = (x1 + x2, y1 + y2)
has nonnegative components, so u1 + u2 ∈ W. However, W is not closed under scalar multiplication: take u = (1, 1) ∈ W and α = −1. Then
αu = (−1, −1),
which is not in W. Therefore, W is not a subspace of R2.
Example 36. Let V = Mn×n(R) and let W be the set of matrices in V whose trace is zero,
W = { A ∈ Mn×n(R) | tr(A) = 0 },
where tr(A) = a11 + a22 + · · · + ann denotes the trace of A. Is W a subspace of V?
Solution. If 0 is the n × n zero matrix, then clearly tr(0) = 0, and thus 0 ∈ W. Suppose that A and B are in W. Then necessarily tr(A) = 0 and tr(B) = 0. Consider the matrix C = A + B. Then
tr(C) = tr(A + B) = (a11 + b11) + (a22 + b22) + · · · + (ann + bnn)
= (a11 + · · · + ann) + (b11 + · · · + bnn) = tr(A) + tr(B) = 0,
so C = A + B ∈ W and W is closed under addition. Now let α be any scalar and consider C = αA. Then
tr(C) = tr(αA) = αa11 + αa22 + · · · + αann = α tr(A) = 0.
Thus, tr(C) = 0, that is, C = αA ∈ W, and consequently W is closed under scalar multiplication. Therefore, the set W is a subspace of V.
Example 37. Let W = { p(t) ∈ Rn[t] | p(2) = −1 }. Is W a subspace of Rn[t]?
Solution. The zero polynomial 0(t) = 0 clearly does not equal −1 at t = 2. Therefore, W does not contain the zero polynomial and, because all three conditions of a subspace must be satisfied for W to be a subspace, we conclude that W is not a subspace of Rn[t]. As an exercise, you may want to investigate whether or not W is closed under addition and scalar multiplication.
Example 38. A square matrix A is said to be symmetric if AT = A. For example, the following is
a 3 × 3 symmetric matrix:
A = [ 2  4  5 ]
    [ 4  2 −3 ]
    [ 5 −3  7 ]
Verify for yourself that we do indeed have A^T = A. Let W be the set of all symmetric n × n matrices. Is W a subspace of V = Mn×n? Yes: the zero matrix is symmetric, (A + B)^T = A^T + B^T = A + B for symmetric A and B, and (αA)^T = αA^T = αA.
Example 39. For any vector space V , there are two trivial subspaces in V , namely, V itself is a
subspace of V , and the set consisting of the zero vector, W = {0}, is a subspace of V .
There is a particular way to generate a subspace of any given vector space V using the span of a
set of vectors.
Definition 40. Let V be a vector space over a field F and let S be any non-empty subset of V. The span of S is defined to be the set of all finite linear combinations of vectors in S, i.e.,
span S = { c1v1 + c2v2 + · · · + ckvk | k ∈ N, v1, . . . , vk ∈ S, c1, . . . , ck ∈ F }.
We define the span of the empty set to be the zero space {0}.
Theorem 41. Let V be a vector space over F and let S be a non-empty subset of V. Then span S is a subspace of V.
Proof. We present a proof for the case of a finite set S = {v1, . . . , vp}. The arguments in the general case are similar. Let u = a1v1 + · · · + apvp and w = b1v1 + · · · + bpvp be elements of span S. Then
u + w = (a1 + b1)v1 + · · · + (ap + bp)vp,
which is again a linear combination of v1, . . . , vp, so u + w is in span S. Similarly, for any scalar α,
αu = (αa1)v1 + · · · + (αap)vp.
Therefore, αu is in span S. Lastly, since 0v1 + 0v2 + · · · + 0vp = 0, the zero vector 0 is in the span of v1, v2, . . . , vp. Therefore, span S is a subspace of V.
If W is a subspace of V and
span{w1, w2, . . . , wp} = W,
then we say that {w1 , w2 , . . . , wp } is a spanning set of W . Hence, every vector in W can be written
as a linear combination of the vectors w1 , w2 , . . . , wp .
Roughly speaking, the concept of linear independence revolves around the idea of working with “efficient” spanning sets for a subspace. For instance, the set of directions {EAST, NORTH, NORTH-EAST} is redundant, since a total displacement in the NORTH-EAST direction can be obtained by combining individual NORTH and EAST displacements. With these vague statements out of the way, we introduce the formal definition of what it means for a set of vectors to be “efficient”.
Definition 42. Let V be a vector space over F and let {v1, v2, . . . , vp} be a non-empty set of vectors in V. Then {v1, v2, . . . , vp} is linearly independent if the only scalars c1, c2, . . . , cp ∈ F that satisfy the equation
c1v1 + c2v2 + · · · + cpvp = 0
are the trivial scalars c1 = c2 = · · · = cp = 0. Otherwise, the set is said to be linearly dependent.
We now describe the redundancy in a set of linearly dependent vectors. If {v1 , . . . , vp } are linearly
dependent, it follows that there are scalars c1 , c2 , . . . , cp , at least one of which is nonzero, such that
c1 v1 + c2 v2 + · · · + cp vp = 0. (⋆)
For example, suppose that {v1 , v2 , v3 , v4 } are linearly dependent. Then there are scalars c1 , c2 , c3 , c4 ,
not all of them zero, such that equation (⋆) holds. Suppose, for the sake of argument, that c3 ̸= 0.
Then,
c1 c2 c4
v3 = − v1 − v2 − v4 .
c3 c3 c3
Therefore, when a set of vectors is linearly dependent, it is possible to write one of the vectors
as a linear combination of the others. It is in this sense that a set of linearly dependent vectors is
redundant. In fact, if a set of vectors is linearly dependent, we can say even more, as the following
theorem states.
Theorem 43. A set of vectors {v1 , v2 , . . . , vp }, with v1 ̸= 0, is linearly dependent if and only
if some vj is a linear combination of the preceding vectors v1 , . . . , vj−1 .
Example 44. Show that the following set of 2 × 2 matrices is linearly dependent:
" # " # " #
1 2 −1 3 5 0
A1 = , A2 = , A3 = .
0 −1 1 0 −2 −3
Solution: It is clear that A1 and A2 are linearly independent, i.e., A1 cannot be written as a scalar multiple of A2, and vice versa. Since the (2, 1) entry of A1 is zero, the only way to get the −2 in
the (2, 1) entry of A3 is to multiply A2 by −2. Similarly, since the (2, 2) entry of A2 is zero, the
only way to get the −3 in the (2, 2) entry of A3 is to multiply A1 by 3. Hence, we suspect that
3A1 − 2A2 = A3 . Verify:
"
# " # " #
3 6 −2 6 5 0
3A1 − 2A2 = − = = A3 .
0 −3 2 0 −2 −3
Therefore, 3A1 − 2A2 − A3 = 0, and thus we have found scalars c1 , c2 , c3 , not all zero, such that
c1 A1 + c2 A2 + c3 A3 = 0.
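The dependence relation found above is easy to verify mechanically:

```python
import numpy as np

# The relation 3*A1 - 2*A2 = A3 from Example 44.
A1 = np.array([[1, 2], [0, -1]])
A2 = np.array([[-1, 3], [1, 0]])
A3 = np.array([[5, 0], [-2, -3]])

print(np.array_equal(3 * A1 - 2 * A2, A3))  # True
```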
We now introduce the important concept of a basis. Given a set of vectors {v1, . . . , vp−1, vp} in V, we showed that W = span{v1, v2, . . . , vp} is a subspace of V. If, say, vp is linearly dependent on v1, v2, . . . , vp−1, then we can remove vp and the smaller set {v1, . . . , vp−1} still spans all of W:
W = span{v1, v2, . . . , vp} = span{v1, . . . , vp−1}.
Intuitively, vp does not provide an independent “direction” in generating W. If some other vector vj is linearly dependent on the remaining ones, then we can remove vj and the resulting smaller set of vectors still spans W. We can continue removing vectors until we obtain a minimal set of vectors that are linearly independent and still span W. These remarks motivate the following important definition.
Definition 45. Let W be a subspace of a vector space V. A set of vectors B = {v1, . . . , vp} in W is called a basis for W if B is linearly independent and span(B) = W.
A basis is therefore a minimal spanning set for a subspace. Indeed, if B = {v1, . . . , vp} is a basis for W, then the smaller set
B̃ = {v1, . . . , vp−1}
cannot be a basis for W (Why?). If B = {v1, . . . , vp} is a basis, then it is linearly independent, and therefore vp cannot be written as a linear combination of the others. In other words, vp ∈ W is not in the span of
B̃ = {v1, . . . , vp−1},
and therefore B̃ is not a basis for W, because a basis must be a spanning set.
If, on the other hand, we start with a basis B = {v1, . . . , vp} for W and we add a new vector u from W, then
B̃ = {v1, . . . , vp, u}
is not a basis for W (Why?). We still have that span(B̃) = W, but now B̃ is not linearly independent. Indeed, because B = {v1, . . . , vp} is a basis for W, the vector u can be written as a linear combination of {v1, . . . , vp}, and thus B̃ is not linearly independent.
Example 46. Show that the standard unit vectors form a basis for V = R3:
e1 = (1, 0, 0)^T,   e2 = (0, 1, 0)^T,   e3 = (0, 0, 1)^T.
Solution: Any x = (x1, x2, x3) ∈ R3 can be written as x = x1e1 + x2e2 + x3e3, so the vectors span R3; and x1e1 + x2e2 + x3e3 = 0 forces x1 = x2 = x3 = 0, so they are linearly independent.
More generally, to decide whether a set B = {v1, v2, v3} of three vectors is a basis for R3, form the matrix A = [v1 v2 v3] and row reduce. If every column of the reduced form has a pivot, then the only solution to Ax = 0 is the trivial solution, so B is linearly independent. Moreover, for any b ∈ R3 the augmented matrix [A b] is then consistent, and therefore the columns of A span all of R3:
Col(A) = span{v1, v2, v3} = R3.
Example 50. Recall that an n × n matrix A is skew-symmetric if A⊤ = −A. We proved that the set
of n × n skew-symmetric matrices is a subspace. Find a basis for the set of 3 × 3 skew-symmetric
matrices.
The following theorem will lead to the definition of the dimension of a vector space.
Theorem 51. Any two bases of a vector space have equal cardinality.
Proof. We will prove the theorem for the case that V = Rn. We already know that the standard unit vectors {e1, e2, . . . , en} form a basis of Rn. Let {u1, u2, . . . , up} be nonzero vectors in Rn, and suppose first that p > n. In a previous theorem, we proved that any set of vectors in Rn containing more than n vectors is automatically linearly dependent: the RREF of A = [u1 u2 · · · up] contains at most r = n leading ones, so there are d = p − n > 0 free variables, and the solution set of Ax = 0 contains non-trivial solutions. On the other hand, suppose instead that p < n. A set of vectors {u1, . . . , up} in Rn spans Rn if and only if the RREF of A has exactly r = n leading ones; but the largest possible value of r is r = p < n. Therefore, if p < n, then {u1, u2, . . . , up} cannot span Rn. Thus, in either case (p > n or p < n), the set {u1, u2, . . . , up} cannot be a basis for Rn. Hence, any basis of Rn must contain n vectors.
The previous theorem does not say that every set {v1 , v2 , . . . , vn } of nonzero vectors in Rn con-
taining n vectors is automatically a basis for Rn . For example,
the vectors
v1 = (1, 0, 1, 0)^T,   v2 = (0, 3, 0, 0)^T,   v3 = (2, 0, 0, 0)^T
cannot be part of a basis-producing set for R4 obtained by adjoining a vector whose last component is zero: the vector
e4 = (0, 0, 0, 1)^T
is not in the span of {v1, v2, v3}. All that we can say is that a set of vectors in Rn containing fewer or more than n vectors is automatically not a basis for Rn. From Theorem 51 above, any basis in
Rn must have exactly n vectors. In fact, on a general abstract vector space V , if {v1 , v2 , . . . , vn } is
a basis for V , then any other basis for V must have exactly n vectors also. Because of this result,
we can make the following definition.
Definition 52. Let V be a vector space. The dimension of V , denoted as dim V , is the
number of vectors in any basis of V . The dimension of the trivial vector space V = {0} is
defined to be zero.
Moving on, suppose that we have a set B = {v1 , v2 , . . . , vn } in Rn containing exactly n vec-
tors. For B = {v1 , v2 , . . . , vn } to be a basis of Rn , the set B must be linearly independent and
span(B) = Rn . In fact, it can be shown that if B is linearly independent, then the spanning
condition span(B) = Rn is automatically satisfied, and vice-versa.
For example, say the vectors {v1 , v2 , . . . , vn } in Rn are linearly independent, and put A = [v1 v2 · · · vn ].
Then A−1 exists, and therefore Ax = b is always solvable. Hence,
Col(A) = span{v1 , v2 , . . . , vn } = Rn .
Example 53. Do the columns of the matrix
A = [  2  3  3 −2 ]
    [  4  7  8 −6 ]
    [  0  0  1  0 ]
    [ −4 −6 −6  3 ]
form a basis for R4?
Solution: One computes
det(A) = −2 ≠ 0.
Hence, rank(A) = 4, and thus the columns of A are linearly independent. Therefore, the vectors
v1 , v2 , v3 , v4 form a basis for R4 .
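The determinant test of Example 53 can be reproduced numerically:

```python
import numpy as np

# A nonzero determinant certifies that the 4 columns form a basis of R^4.
A = np.array([[ 2,  3,  3, -2],
              [ 4,  7,  8, -6],
              [ 0,  0,  1,  0],
              [-4, -6, -6,  3]])
print(round(np.linalg.det(A)))  # -2, nonzero
```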
A subspace W of a vector space V is a vector space in its own right, and therefore also has a dimen-
sion. By definition, if B = {v1 , . . . , vk } is a linearly independent set in W and span{v1 , . . . , vk } =
W , then B is a basis for W , and in this case, the dimension of W is k. Since an n-dimensional
vector space V requires exactly n vectors in any basis, if W is a strict subspace of V, then dim(W) < dim(V).
Example 54. The subspaces of R3 can be organized by dimension:
1. The only zero-dimensional subspace of R3 is the zero space {0}.
2. The one-dimensional subspaces in R3 are lines through the origin. These are spanned by a
single nonzero vector.
3. The two-dimensional subspaces in R3 are planes through the origin. These are spanned by two linearly independent vectors.
4. The only three-dimensional subspace of R3 is R3 itself.
Example 55. Find a basis for the null space Null(A) of the matrix
A = [ −2  4 −2 −4 ]
    [  2 −6 −3  1 ].
    [ −3  8  2 −3 ]
Solution: By definition, Null(A) is the solution set of the homogeneous system Ax = 0. Row
reducing, we obtain
A ∼ [ 1 0  6   5  ]
    [ 0 1 5/2 3/2 ].
    [ 0 0  0   0  ]
The free variables are x3 and x4. Letting x3 = t and x4 = s, the solution set is
x = t (−6, −5/2, 1, 0)^T + s (−5, −3/2, 0, 1)^T = tv1 + sv2.
The vectors
v1 = (−6, −5/2, 1, 0)^T,   v2 = (−5, −3/2, 0, 1)^T
span Null(A) and are linearly independent. Therefore, B = {v1 , v2 } is a basis for Null(A), and
thus
dim(Null(A)) = 2.
In general, the dimension of Null(A) is the number of free parameters in the solution set of the
system Ax = 0, that is,
dim(Null(A)) = d = n − rank(A).
3 Linear Transformations
Consider a function
T : Rn → Rm.
The set Rn on which T is defined is called the domain of T, and the set Rm in which T takes its values is called the codomain, or the output space. Using this terminology, the points x in the domain are called the inputs, and the points T(x) produced by the mapping are called the outputs.
Definition 56. The vector b ∈ Rm is in the range of T , or in the image of T , if there exists
some x ∈ Rn such that T (x) = b.
In other words, b is in the range of T if there is an input x in the domain of T that outputs
b = T (x). In general, not every point in the co-domain of T is in the range of T . For example,
consider the vector mapping T : R2 → R2 defined as
" #
x21 sin(x2 ) − cos(x21 − 1)
T (x) = .
x21 + x22 + 1
The vector b = (3, −1) is not in the range of T because the second component of T (x) is positive.
On the other hand, b = (−1, 2) is in the range of T because
" #! " # " #
1 12 sin(0) − cos(12 − 1) −1
T = 2 2
= = b.
0 1 +0 +1 2
Definition 57. Let V and U be vector spaces over the scalar field F. Then T : V → U is called a linear transformation of V into U if the following hold for any u, v ∈ V and any scalar α ∈ F:
(i) T(u + v) = T(u) + T(v),
(ii) T(αu) = αT(u).
Example 58. Consider the function T : V → U defined by T(v) = 0 for all v ∈ V. Verify that this is a linear transformation. This is called the zero transformation.
Example 59. Consider the function T : V → V defined by T(v) = v for all v ∈ V. Verify that this is a linear transformation. This is called the identity transformation.
Example 60. Let A ∈ Mm×n(F) be given. Consider the function LA : Fn → Fm defined by LA(v) = Av for each column vector v ∈ Fn. Verify that this is a linear transformation. This is called a left multiplication transformation.
Example 61. Is the mapping T : R2 → R3 defined by
T( (x1, x2) ) = ( 2x1 − x2, x1 + x2, −x1 − 3x2 )
a linear transformation?
Yes! We must verify that the two conditions in Definition 57 hold. For the first condition, take arbitrary vectors u = (u1, u2) and v = (v1, v2). We compute:
T(u + v) = T( (u1 + v1, u2 + v2) ) = ( 2(u1 + v1) − (u2 + v2), (u1 + v1) + (u2 + v2), −(u1 + v1) − 3(u2 + v2) ).
Regrouping, the right-hand side equals ( 2u1 − u2, u1 + u2, −u1 − 3u2 ) + ( 2v1 − v2, v1 + v2, −v1 − 3v2 ), and therefore
T(u + v) = T(u) + T(v).
A similar computation shows that T(αu) = αT(u) for any scalar α, so T is a linear transformation.
Example 62. Let V = Mn×n (R) be the vector space of n × n matrices with entries from R and let
T : V → V be the function
T(A) = A + A^T.
Is T a linear transformation?
Yes! Let A and B be matrices in V. Then using the properties of the transpose and regrouping, we obtain:
T(A + B) = (A + B) + (A + B)^T = A + B + A^T + B^T = (A + A^T) + (B + B^T) = T(A) + T(B),
and for any scalar α,
T(αA) = αA + (αA)^T = α(A + A^T) = αT(A).
This proves that T satisfies both conditions and thus T is a linear transformation of V into V.
" #
x21 sin(x2 ) − cos(x21 − 1)
T (x) =
x21 + x22 + 1
No! Recall that " #! " #
1 −1
T = .
0 2
If T were linear, then by Definition 57, the following must hold:
T( (3, 0)^T ) = T( 3 · (1, 0)^T ) = 3 T( (1, 0)^T ) = 3 (−1, 2)^T = (−3, 6)^T.
However, computing directly from the formula gives
T( (3, 0)^T ) = ( 9 sin(0) − cos(9 − 1), 9 + 0 + 1 )^T = ( −cos(8), 10 )^T ≠ (−3, 6)^T,
so T is not a linear transformation.
Example 64. Let V = Mn×n(R) be the vector space of n × n matrices with entries from R, where n ≥ 2, and let T : V → R be the function
T(A) = det(A).
Is T a linear transformation? No!
For example, we know from the properties of the determinant that det(αA) = αn det(A), and
therefore it does not hold that T (αA) = αT (A) unless α = 1. Therefore, T is not a linear
transformation.
Also, it does not hold in general that det(A + B) = det(A) + det(B); in fact, it rarely holds. For example, if
A = [ 2 0 ],   B = [ −1 1 ],
    [ 0 1 ]        [  0 3 ]
then det(A) = 2, det(B) = −3, and therefore det(A) + det(B) = −1. On the other hand,
A + B = [ 1 1 ]
        [ 0 4 ]
has det(A + B) = 4 ≠ −1 = det(A) + det(B).
Example 65. Let V = Rn [t] be the vector space of polynomials in the variable t of degree no more
than n ≥ 1. Consider the function T : V → V defined as
T(f(t)) = 2f(t) + f′(t).
For example, if f(t) = 3t^6 − t^2 + 5 (so here n ≥ 6), then
T(f(t)) = 2(3t^6 − t^2 + 5) + (18t^5 − 2t) = 6t^6 + 18t^5 − 2t^2 − 2t + 10.
Is T a linear transformation?
Yes! Let f(t) and g(t) be polynomials of degree no more than n ≥ 1. Then
T (f (t) + g(t)) = 2(f (t) + g(t)) + (f (t) + g(t))′ = 2f (t) + 2g(t) + f ′ (t) + g ′ (t)
= (2f (t) + f ′ (t)) + (2g(t) + g ′ (t))
= T (f (t)) + T (g(t)).
Therefore, T (f (t) + g(t)) = T (f (t)) + T (g(t)). Now let α be any scalar. Then
T (αf (t)) = 2(αf (t)) + (αf (t))′ = 2αf (t) + αf ′ (t) = α(2f (t) + f ′ (t)) = αT (f (t)).
Definition 66. Let V and U be vector spaces over the scalar field F. Let T : V → U be a
linear transformation of V into U .
(i) The kernel of T is the set of vectors v in the domain V that get mapped to the zero vector, that is, T(v) = 0. We denote the kernel of T by ker(T):
ker(T) = { v ∈ V | T(v) = 0 }.
(ii) The range of T is the set of vectors b in the codomain U for which there exists at least one v in V such that T(v) = b. We denote the range of T by Range(T):
Range(T) = { b ∈ U | b = T(v) for some v ∈ V }.
You may have noticed that the definition of the range of a linear transformation is the usual defini-
tion of the range of a function. Not surprisingly, the kernel and range are subspaces of the domain
and codomain, respectively.
Theorem 67. Let V and U be vector spaces over the scalar field F. Let T : V → U be
a linear transformation of V into U . Then ker(T ) is a subspace of V and Range(T ) is a
subspace of U .
Proof. Suppose that v and u are in ker(T ). Then T (v) = 0 and T (u) = 0. By the linearity of T ,
it holds that
T (v + u) = T (v) + T (u) = 0 + 0 = 0.
Therefore, since T (u+v) = 0, u+v is in ker(T ). This shows that ker(T ) is closed under addition.
Now suppose that α is any scalar and v is in ker(T ). Then T (v) = 0, and thus by linearity of T ,
it holds that
T (αv) = αT (v) = α · 0 = 0.
Therefore, since T (αv) = 0, αv is in ker(T ), which proves that ker(T ) is closed under scalar
multiplication.
Finally, by linearity, T(0) = T(0 · 0) = 0 · T(0) = 0, that is, T(0) = 0. Therefore, the zero vector 0 is in ker(T). This proves that ker(T) is a subspace of V. The proof that Range(T) is a subspace of U is left as an exercise.
Example 68. Let V = Mn×n (R) be the vector space of n × n matrices with entries in R, and let
T : V → V be the mapping
T (A) = A + AT .
Describe ker(T). A matrix A is in ker(T) if and only if T(A) = A + A^T = 0, that is, A^T = −A. What type of matrix A satisfies A^T = −A? For example, consider the case where A is the 2 × 2 matrix
A = [ a11 a12 ].
    [ a21 a22 ]
Comparing entries in A^T = −A gives a11 = −a11, a22 = −a22, and a21 = −a12, so a11 = a22 = 0. Hence every matrix of the form
A = [  0 b ]
    [ −b 0 ]
satisfies A^T = −A. Such matrices are called skew-symmetric, and ker(T) is precisely the set of skew-symmetric matrices in V.
Example 69. Let V be the vector space of differentiable functions on the interval [a, b]. That is,
f is an element of V if f : [a, b] → R is differentiable. Describe the kernel of the linear mapping
T : V → V defined as
T(f(x)) = f(x) + f′(x).
Solution: A function f is in ker(T) if and only if
f(x) + f′(x) = 0,
equivalently, if f′(x) = −f(x). What functions f do you know that satisfy f′(x) = −f(x)? How
about f (x) = e−x ? It is clear that f ′ (x) = −e−x = −f (x) and thus f (x) = e−x is in ker(T ). How
about g(x) = 2e−x ? We compute that g ′ (x) = −2e−x = −g(x) and thus g is also in ker(T ). It
turns out that the elements of ker(T ) are of the form f (x) = Ce−x for a constant C.
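The differential equation f′(x) = −f(x) can also be solved symbolically, confirming this description of ker(T); a small sketch with sympy:

```python
from sympy import Function, dsolve, symbols

# Solve f(x) + f'(x) = 0; the general solution is C1*exp(-x).
x = symbols('x')
f = Function('f')
print(dsolve(f(x).diff(x) + f(x), f(x)))  # Eq(f(x), C1*exp(-x))
```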
Example 70. Consider the linear transformation T : R3 [t] → M2×2 (R) defined by:
T (p(t)) = p(A),
" #
1 1
where A = . For a polynomial p(t) = at3 + bt2 + ct + d, the transformation is computed
−1 1
as:
T (p(t)) = aA3 + bA2 + cA + dI,
Kernel of T
Range of T
x + y = 0, w = z.
Definition 71. Let V and U be vector spaces over the scalar field F. Let T : V → U be a linear transformation. The dimension of the kernel of T is called the nullity of T and is denoted by nullity(T). The dimension of the range of T is called the rank of T and is denoted by rank(T).
Theorem 72 (Rank–nullity theorem). Let V be a finite-dimensional vector space and U a vector space over the scalar field F. Let T : V → U be a linear transformation. Then
nullity(T) + rank(T) = dim(V).
Proof. As with almost every proof that involves dimensions, we make use of bases for various
vector spaces involved. What we must do is relate the sizes of bases for ker(T ), Range(T ), and V .
Since ker(T ) is a subspace of V , it makes sense to start with these two vector spaces (ker(T ) and
V ). Let {u1 , u2 , . . . , uk } be a basis for ker(T ) (the smaller of the two spaces). This is a linearly
independent set of vectors in V , so we can extend it to a basis for all of V . Let this extended basis
be {u1 , u2 , . . . , uk , v1 , v2 , . . . , vm }. In forming these two bases, we have labeled the dimensions
of ker(T ) and V . That is, we have said that dim(ker(T )) = k, and dim(V ) = k + m. This means
that we must show that the dimension of the range of T is m.
Claim: A basis for the range of T is {T (v1 ), T (v2 ), . . . , T (vm )}. If we can verify this claim,
we will have finished the proof. To show this set is a basis, we must establish that this is a spanning set for the range of T and that this set is linearly independent. We tackle these properties in
the order listed.
Spanning set of Range(T ): Let w be in the range of T . This means that w = T (v) for some v in
V . This v can be written as a combination of basis vectors so
v = c1 u1 + · · · + ck uk + d1 v1 + · · · + dm vm .
w = T(v) = T(c1u1 + · · · + ckuk + d1v1 + · · · + dmvm)
  = c1T(u1) + · · · + ckT(uk) + d1T(v1) + · · · + dmT(vm).
Since the u’s are all in the kernel of T , we have T (uj ) = 0 for each j. Consequently,
w = d1T(v1) + · · · + dmT(vm),
so w is in the span of {T(v1), . . . , T(vm)}, which is therefore a spanning set for Range(T).
Linear independence: Suppose that d1T(v1) + · · · + dmT(vm) = 0. By linearity, T(d1v1 + · · · + dmvm) = 0, so the vector d1v1 + · · · + dmvm lies in ker(T) and can hence be written in terms of the basis of ker(T):
d1v1 + · · · + dmvm = c1u1 + · · · + ckuk.
This looks like a linear dependence relation among the u’s and v’s, but the u’s and v’s are linearly
independent. The only possibility, then, is that all the coefficients, all the c’s and all the d’s, are
zero. In particular, all the d’s must be zero. This shows that {T (v1 ), T (v2 ), . . . , T (vm )} is a
linearly independent set, completing the proof that it is a basis.
Given a matrix of size m×n, we have some naturally associated vector spaces, which are discussed
below.
If A is an m × n matrix with entries in F, each row of A has n entries and thus can be identified
with a vector in Fn . Similarly, each column of A has m entries, and can be identified with a vector
in Fm .
Definition 73.
• The set of all linear combinations of the row vectors of A is a subspace of Fn, called the row space of A. The dimension of the row space is said to be the row rank of A.
• The set of all linear combinations of the column vectors of A is a subspace of Fm, called the column space of A. The dimension of the column space is said to be the column rank of A.
Example 74. Let
A = [  2  4 −2 1 ],   b = [  3 ].
    [ −2 −5  7 3 ]        [ −1 ]
    [  3  7 −8 6 ]        [  3 ]
Is b in the column space of A? The vector b is in the column space of A if there exists x ∈ R4 such that Ax = b. Hence, we must determine if Ax = b has a solution. Performing elementary row operations on the augmented matrix [A | b], we obtain:
[A | b] ∼ [ 2 4 −2  1 |  3 ]
          [ 0 1 −5 −4 | −2 ].
          [ 0 0  0 17 |  1 ]
The system is consistent, and therefore Ax = b will have a solution. Thus, b is in the column
space of A.
Recall that two matrices are row-equivalent when one can be obtained from the other by performing
finitely many elementary row operations. One can see that row-equivalent matrices have the same
row space.
We want to find a basis for the row space of A. If A is in row echelon form, then its non-zero row
vectors are linearly independent, and hence form a basis for the row space of A. In general, we
have the following useful result.
Theorem 75. If a matrix A is row-equivalent to a matrix B in row echelon form, then the
non-zero row vectors of B form a basis for the row space of A.
Observe that the column vectors of A are the row vectors of A^T. Hence, to find a basis for the column space of A, we find a basis for the row space of A^T as above.
Theorem 76. The row rank and the column rank of any matrix are equal.
Definition 77. The dimension of the row (or column) space of a matrix is called the rank of
A and is denoted by rank(A).
Let A be a matrix in Mm×n(F). Consider the homogeneous system of equations Ax = 0, where x = [x1, x2, . . . , xn]^T is the column vector of unknowns. The set of all solutions of this homogeneous system forms a subspace of Fn, called the null space of A, which we define below.
Definition 78. The null space of a matrix A ∈ Mm×n (F), denoted by Null(A), is the subset
of Fn consisting of vectors v ∈ Fn such that Av = 0. Using set notation:
Null(A) = {v ∈ Fn | Av = 0}.
This is a subspace of Fn . The dimension of the null space of A is called the nullity of A.
Exercise: Let W be the set of solutions in R4 of a homogeneous system Ax = 0, where A ∈ Mm×4(F). Is W a subspace of R4?
Recall the linear transformation LA : Fn → Fm of Example 60. We see that v is in the kernel of LA if and only if LA(v) = Av = 0. In other words, ker(LA) = Null(A). Also, note that the range of LA is the column space of A. Hence, by the rank–nullity theorem applied to the linear transformation LA, we have the following:
rank(A) + dim(Null(A)) = n.
By the above result, if the reduced row-echelon form of A has r leading 1’s, that is, the rank of A is r, then the dimension of the null space is d = n − r. Therefore, we will obtain vectors v1, . . . , vd
such that
Null(A) = span{v1 , v2 , . . . , vd }.
Example 81. Find a spanning set for the null space of the matrix
A = [ −3  6 −1 1 −7 ]
    [  1 −2  2 3 −1 ]  ∈ M3×5(R).
    [  2 −4  5 8 −4 ]
The null space of A is the solution set of the homogeneous system Ax = 0. Performing elementary
row operations one obtains
A ∼ [ 1 −2 0 −1  3 ]
    [ 0  0 1  2 −2 ].
    [ 0  0 0  0  0 ]
Clearly r = 2 and since n = 5, we will have d = 3 vectors in a basis for Null(A). Letting x5 = t1 ,
and x4 = t2, then from the 2nd row we obtain
x3 = −2t2 + 2t1.
Letting x2 = t3, the 1st row gives
x1 = 2t3 + t2 − 3t1.
Therefore,
Null(A) = span{ v1 = (−3, 0, 2, 0, 1)^T, v2 = (1, 0, −2, 1, 0)^T, v3 = (2, 1, 0, 0, 0)^T }.
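Example 81 can be cross-checked exactly with sympy; up to ordering and scaling, the computed basis vectors agree with v1, v2, v3 above:

```python
from sympy import Matrix

A = Matrix([[-3, 6, -1, 1, -7],
            [ 1, -2, 2, 3, -1],
            [ 2, -4, 5, 8, -4]])

for v in A.nullspace():   # three basis vectors, so nullity d = 3
    print(v.T)
print(A.rank())           # 2, and indeed 2 + 3 = 5 = n
```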
Until now, we have studied linear transformations by examining their ranges and kernels. Now,
we embark on one of the most useful approaches to the analysis of a linear transformation on a
finite-dimensional vector space: the representation of a linear transformation by a matrix.
3.6.1 Coordinates
Let V be a finite-dimensional vector space over F. Recall that a basis of V is a set of vectors
B = {v1 , v2 , . . . , vn } in V such that
(i) B is linearly independent, and
(ii) span(B) = V.
Hence, if B is a basis for V , each vector x ∈ V can be written as a linear combination of vectors
in B:
x = c1 v 1 + c2 v 2 + · · · + cn v n .
Moreover, from the definition of linear independence, any vector x ∈ span(B) can be written in
only one way as a linear combination of v1 , . . . , vn . In other words, for the x above, there does
not exist other scalars t1 , . . . , tn such that also
x = t1 v1 + t2 v2 + · · · + tn vn .
To see this, suppose that we can write x in two different ways using B:
x = c1 v 1 + c2 v 2 + · · · + cn v n ,
x = t1 v1 + t2 v2 + · · · + tn vn .
Then
0 = x − x = (c1 − t1)v1 + (c2 − t2)v2 + · · · + (cn − tn)vn.
By linear independence of v1, . . . , vn, each coefficient ci − ti must be zero, i.e., ci = ti for all i.
Definition 82. Let V be a finite-dimensional vector space over F. An ordered basis for V
is a basis for V endowed with a specific order; that is, an ordered basis for V is a finite
sequence of linearly independent vectors in V that generates V .
Our preceding discussion on the unique representation property of vectors in a given basis leads to
the following definition.
Definition 83. Let B = {v1, v2, . . . , vn} be an ordered basis for V and let x ∈ V, so that
x = c1v1 + c2v2 + · · · + cnvn.
We define the coordinates of x relative to the ordered basis B to be the unique scalars c1, c2, . . . , cn. The coordinate vector of x relative to the ordered basis B, denoted by [x]B, is defined to be
[x]B = (c1, c2, . . . , cn)^T ∈ Fn.
If it is clear what ordered basis we are working with, we will omit the subscript B and simply
write [x] for the coordinates of x relative to B.
" #
c1
If we put P = [v1 v2 ], and let [v]B = , then we need to solve the linear system
c2
v = P [v]B .
" #
2
Solving the linear system, one finds that the solution is [v]B = , and therefore this is the
−1
coordinate vector of v relative to the ordered basis B.
In general, with P = [v1 v2 · · · vn], the coordinate vector of v relative to B is the unique solution of
P x = v;
that is, x = [v]B is the unique solution to P x = v. Because v1, v2, . . . , vn are linearly independent, P is invertible, and the solution to P x = v is
[v]B = P −1 v.
We remark that if an inconsistent row arises when you row reduce [P v], then you have made an
error in your row reduction algorithm. In summary, to find coordinates with respect to an ordered
basis B in Rn , we need to solve a square linear system.
Example 86. Let v1, v2 be vectors in R3
and let B = {v1, v2}. One can show that B is linearly independent and therefore an ordered basis for W = span{v1, v2}. Determine if x is in W, and if so, find the coordinate vector of x relative to B.
Solution: We seek scalars c1, c2 such that
x = c1v1 + c2v2,
which we find (if they exist) by row reducing the augmented matrix [v1 v2 | x]. If no inconsistent row arises, then x ∈ W and the solution (c1, c2) is the coordinate vector [x]B.
Example 87. Consider the vector v = (3, 11, −7)^T in R3 and the standard ordered basis E = {e1, e2, e3}. Clearly,
v = (3, 11, −7)^T = 3 e1 + 11 e2 − 7 e3.
Therefore, the coordinate vector of v relative to {e1, e2, e3} is
[v]E = (3, 11, −7)^T.
Example 88. Let R3 [t] be the vector space of polynomials of degree at most 3.
The set B = {1, t, t2 , t3 } is a spanning set for R3 [t]. In fact, any polynomial u(t) = c0 + c1 t +
c2 t2 + c3 t3 is clearly a linear combination of 1, t, t2 , t3 . Is B linearly independent? Suppose that
c0 + c1 t + c2 t2 + c3 t3 = 0.
Since the above equality must hold for all values of t, we conclude that c0 = c1 = c2 = c3 = 0.
Therefore, B is linearly independent, and consequently an ordered basis for R3 [t]. In the ordered
basis B, the coordinates of v(t) = 3 − t2 − 7t3 are
[v(t)]B = (3, 0, −1, −7)^T.
The ordered basis B = {1, t, t2, t3} is called the standard ordered basis of R3[t].
Example 89. Let B = {E11, E12, E21, E22} be the standard ordered basis for M2×2(R), and consider
A = [  3  0 ].
    [ −4 −1 ]
The coordinate vector of A relative to the ordered basis B is given by
[A]B = (3, 0, −4, −1)^T.
Now let B = {v1, v2, . . . , vn} be an ordered basis of Rn and put P = [v1 v2 · · · vn]. Then for every x ∈ Rn,
x = P [x]B.    (4)
Multiplying both sides of (4) by P −1 gives
P −1 x = [x]B.
Therefore, P −1 maps coordinate vectors in the standard ordered basis to coordinate vectors relative to B.
Example 90. Consider the ordered basis B = {v1, v2, v3} of R3 whose vectors are the columns of
P = [  1  3  3 ]
    [ −1 −4 −2 ].
    [  0  0 −1 ]
The matrix P maps B-coordinates to standard coordinates in R3. On the other hand, the inverse matrix P −1 maps standard coordinates in R3 to B-coordinates. One can verify that
P −1 = [  4  3  6 ]
       [ −1 −1 −1 ].
       [  0  0 −1 ]
Therefore, the B-coordinate vector of v = (2, −1, 0)^T is
[v]B = P −1 v = [  4  3  6 ] [  2 ]   [  5 ]
                [ −1 −1 −1 ] [ −1 ] = [ −1 ].
                [  0  0 −1 ] [  0 ]   [  0 ]
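Coordinates relative to B are found by solving P x = v, as in the discussion above. A numerical sketch (using the matrix P as reconstructed in this example):

```python
import numpy as np

# B-coordinates of v are the solution of P x = v.
P = np.array([[1, 3, 3], [-1, -4, -2], [0, 0, -1]], dtype=float)
v = np.array([2, -1, 0], dtype=float)

print(np.linalg.solve(P, v))  # [ 5. -1.  0.] = [v]_B
```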
More generally, given an ordered basis B of an n-dimensional vector space V over F, the coordinate mapping P : V → Fn is defined by
P(v) = [v]B.
Example 91. Let V = M2×2(R) and let B = {A1, A2, A3, A4} be the standard ordered basis for M2×2(R). What is P : V → R4?
Let V and W be finite-dimensional vector spaces over F and let T : V → W be a linear transfor-
mation. Then by definition, T (v + u) = T (v) + T (u) and T (αv) = αT (v) for every v, u ∈ V
and α ∈ F. Let β = {v1 , v2 , . . . , vn } be an ordered basis of V and let γ = {w1 , w2 , . . . , wm } be
an ordered basis of W . Then for any v ∈ V , there exist scalars c1 , c2 , . . . , cn such that
v = c1v1 + c2v2 + · · · + cnvn,
and thus
[v]β = (c1, c2, . . . , cn)^T
is the coordinate vector of v relative to the ordered basis β. By linearity of the mapping T, we have
T(v) = c1T(v1) + c2T(v2) + · · · + cnT(vn).
Now, each vector T(vj) is in W, and therefore, because γ is a basis of W, there are scalars a1,j, a2,j, . . . , am,j such that
T(vj) = a1,j w1 + a2,j w2 + · · · + am,j wm.
In other words,
[T(vj)]γ = (a1,j, a2,j, . . . , am,j)^T.
Substituting T(vj) = a1,j w1 + a2,j w2 + · · · + am,j wm for each j = 1, 2, . . . , n into the expression for T(v) above and collecting the coefficients of w1, . . . , wm, we obtain
[T(v)]γ = A [v]β,
where A = [ai,j] is the m × n matrix whose j-th column is [T(vj)]γ. The matrix A is the matrix representation of the linear transformation T in the ordered bases β and γ.
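As an illustration of this construction, the following sketch computes the matrix of the transformation T(p) = 4p′ − 2p of the upcoming Example 92, but relative to the standard ordered basis {1, t, t2} of R2[t] (not the basis used in that example); the assembly of columns follows the recipe just described.

```python
from sympy import Matrix, Poly, symbols, diff

t = symbols('t')

# Column j of the matrix holds the coordinates of T applied to the
# j-th standard basis vector of R_2[t].
basis = [t**0, t, t**2]
cols = []
for p in basis:
    Tp = 4 * diff(p, t) - 2 * p
    cols.append(Poly(Tp, t).all_coeffs()[::-1])  # coeffs of 1, t, t^2

# Pad short columns with zeros and assemble them into the matrix.
A = Matrix(3, 3, lambda i, j: cols[j][i] if i < len(cols[j]) else 0)
print(A)  # Matrix([[-2, 4, 0], [0, -2, 8], [0, 0, -2]])
```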
Example 92. Consider the vector space V = R2 [t] of polynomials of degree no more than two and
let T : V → V be defined by
T(v(t)) = 4v′(t) − 2v(t).
Let B = {v1, v2, v3} be an ordered set of three polynomials in V.
(i) Suppose that
c1v1 + c2v2 + c3v3 = 0.
One checks that the only solution is c1 = c2 = c3 = 0, so it follows by definition that B is linearly independent. Since we already know that dim(V) = 3 and B contains 3 vectors, B is a basis for V.
(ii) The coordinates of v(t) = −t2 + 3t + 1 relative to B are the unique scalars (c1, c2, c3) such that
c1v1 + c2v2 + c3v3 = v,
and solving yields c1 = 1, c2 = 1, and c3 = −1. Hence, the coordinate vector of v relative to the ordered basis B is given by
[v]B = (1, 1, −1)^T.
(iii) Computing [T(vj)]B for j = 1, 2, 3 and assembling these coordinate vectors as columns, one obtains the matrix representation of T relative to B:
A = [ −18/5 −6/5 24/5 ]
    [  4/5  −2/5  8/5 ].
    [   0     0   −2  ]