Lecture Notes On Linear Algebra
S. K. Panda
IIT Kharagpur
October 25, 2019
Contents
1 Vector Space 3
1.1 Definition and Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Subspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Linear Combination and Linear Span . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Linear Dependency and Independency . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 Basis and Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3 Linear Transformation 16
3.1 Definition, Examples and Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 Matrix Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3 Matrix Representation Under Change of Basis . . . . . . . . . . . . . . . . . . . . . . 19
1 Vector Space
1.1 Definition and Examples
Definition 1.1. A set V over a field F is a vector space if x + y is defined in V for all x, y ∈ V
and αx is defined in V for all x ∈ V, α ∈ F, such that
1) + is commutative and associative.
2) Each x ∈ V has an inverse w.r.t. +.
3) An identity element 0 exists in V for +.
4) 1x = x holds for all x ∈ V, where 1 is the multiplicative identity of F.
5) (αβ)x = α(βx) and (α + β)x = αx + βx hold for all α, β ∈ F, x ∈ V.
6) α(x + y) = αx + αy holds for all α ∈ F, x, y ∈ V.
8. The set of all eventually 0 sequences is a vector space over R. We call (xn ) eventually 0 if
∃k s.t. xn = 0 for all n ≥ k.
14. R[x; 5]= {p(x) ∈ R[x] | degree of p(x) ≤ 5} is a vector space over R.
15. The set {0} is the smallest vector space over any field.
18. {A ∈ Mn×n(R) | A is symmetric} is a vector space over R. What about the skew-symmetric
ones? How about those with A^t = 2A?
Remark 1.1. Any field F is a vector space over the same field F.
Remark 1.2. The field F matters when deciding whether a non-empty set is a vector space under
a given addition and scalar multiplication. The following is an example.
Example 1.2. We have already seen that R is a vector space over R. But R is not a vector space
over C, since the product of a complex scalar with a real number need not be real (for instance, i · 1 ∉ R).
1.2 Subspace
Definition 1.2. Let (V, +, ·, F) be a vector space and let W ⊆ V. Then W is called a subspace of
V if W is a vector space over the same field F under the same operations: the restriction of + to
W × W and the restriction of · to F × W.
Theorem 1.1. Let V be a vector space over F and let W ⊆ V be non-empty. Then W is a subspace of V
if and only if αx + βy ∈ W for all x, y ∈ W and all α, β ∈ F.
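As a quick numerical illustration of this criterion (a sketch only, assuming numpy is available; the set W and the helper names are ours), one can sample random vectors from the plane W = {x ∈ R³ : x1 + x2 + x3 = 0} and check that αx + βy stays in W:

import numpy as np

rng = np.random.default_rng(0)

def sample_W():
    # a random vector in W = {x in R^3 : x1 + x2 + x3 = 0}
    a, b = rng.standard_normal(2)
    return np.array([a, b, -a - b])

def in_W(v, tol=1e-9):
    return abs(v.sum()) < tol

# check closure under alpha*x + beta*y for many random choices
for _ in range(1000):
    x, y = sample_W(), sample_W()
    alpha, beta = rng.standard_normal(2)
    assert in_W(alpha * x + beta * y)
print("closure check passed on random samples")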
1. {0} and V are subspaces of any vector space V, called the trivial subspaces.
3. The vector space R over Q has many nontrivial subspaces. Two of them are
Q[√2] = {a + b√2 : a, b ∈ Q} and Q[√3] = {a + b√3 : a, b ∈ Q}.
Remark 1.3. The union of two subspaces of a vector space need not be a subspace of that vector
space. For example, in R² the union of the subspaces U = {(a, 0) : a ∈ R} and W = {(0, b) : b ∈ R} is not a subspace, since (1, 0) + (0, 1) = (1, 1) ∉ U ∪ W.
The following theorem supplies a necessary and sufficient condition for the union of two subspaces
to again be a subspace.
Theorem 1.3. Let U, W be two subspaces of V. Then U ∪ W is a subspace of V if and only if
either U ⊆ W or W ⊆ U.
Proof. First we assume that U ∪ W is a subspace of V; we show that either U ⊆ W or W ⊆ U. Suppose
that U ⊄ W and W ⊄ U. Then there exist two vectors x and y such that x ∈ U but x ∉ W,
and y ∈ W but y ∉ U. Therefore x, y ∈ U ∪ W, and since U ∪ W is a subspace, x + y ∈ U ∪ W. This implies that either
x + y ∈ U or x + y ∈ W. If x + y ∈ U, then (x + y) − x = y ∈ U, which is not possible. If x + y ∈ W, then
(x + y) − y = x ∈ W, which is not possible. Therefore either U ⊆ W or W ⊆ U.
Conversely, suppose U ⊆ W or W ⊆ U. Then either U ∪ W = W or U ∪ W = U. Hence U ∪ W is a subspace
of V.
Remark 1.4. If U is a subspace of W and W is a subspace of V, then U is a subspace of V.
Definition 1.3. If U, W are subspaces of V, then U + W := {u + w | u ∈ U, w ∈ W} is
called the sum of U and W.
Theorem 1.4. If U, W are subspaces of V, then U + W is the smallest subspace of V which
contains both U and W.
Proof. First we show that U + W is a subspace of V. Let x, y ∈ U + W and α, β ∈ F. Then x = x1 + x2
for some x1 ∈ U and x2 ∈ W, and y = y1 + y2 for some y1 ∈ U and y2 ∈ W. Then αx + βy =
α(x1 + x2) + β(y1 + y2) = (αx1 + βy1) + (αx2 + βy2). Since U and W are subspaces of V,
αx1 + βy1 ∈ U and αx2 + βy2 ∈ W. Therefore αx + βy ∈ U + W. Hence U + W is a subspace of V.
Now we show that U, W ⊆ U + W. Let x ∈ U. Then x = x + 0 with x ∈ U and 0 ∈ W, so x ∈ U + W. Hence
U ⊆ U + W. Similarly we can show that W ⊆ U + W.
Now we show that U + W is the smallest subspace of V which contains both U and W. Suppose
that Z is a subspace of V containing both U and W, and let x ∈ U + W. Then x = x1 + x2 where
x1 ∈ U ⊆ Z and x2 ∈ W ⊆ Z. Since Z is a subspace, x = x1 + x2 ∈ Z. Hence U + W ⊆ Z, so U + W is the smallest
subspace of V which contains both U and W.
Definition 1.4. If U ∩ W = {0}, the sum U + W is called the internal direct sum of U and W and is
denoted by U ⊕ W.
Remark 1.5. If V = U ⊕ W, then W is called a complement of U.
Theorem 1.5. Let U, W be two subspaces of V. Then V = U ⊕ W iff for each v ∈ V there exist a
unique u ∈ U and a unique w ∈ W such that v = u + w.
Proof. First we assume that V = U ⊕ W, that is, V = U + W and U ∩ W = {0}. Let x ∈ V. Then there
exist x1 ∈ U and x2 ∈ W such that x = x1 + x2. Now we show that x1 and x2 are unique. Suppose
there are vectors y1 ∈ U and y2 ∈ W with x = y1 + y2. Then x1 + x2 = y1 + y2,
which implies x1 − y1 = y2 − x2. Therefore x1 − y1 = y2 − x2 ∈ U ∩ W. This is only possible when
x1 − y1 = y2 − x2 = 0. Then x1 = y1 and x2 = y2.
Conversely, suppose that for each v ∈ V there exist a unique u ∈ U and a unique w ∈ W such that v = u + w.
In particular V = U + W, so if we show that U ∩ W = {0}, then we are done. Let x ∈ U ∩ W. Then x = x + 0 where x ∈ U
and 0 ∈ W, and x = 0 + x where 0 ∈ U and x ∈ W. By the uniqueness of the decomposition, x = 0.
Then U ∩ W = {0}. Hence V = U ⊕ W.
Definition 1.5. Let V, W be vector spaces over F. V × W is the Cartesian product of V and W.
Addition and scalar multiplication on V × W are defined by (v1, w1) + (v2, w2) = (v1 + v2, w1 + w2) and α(v, w) = (αv, αw) for v, v1, v2 ∈ V, w, w1, w2 ∈ W, α ∈ F. With these operations V × W is a vector space over F.
1.3 Linear Combination and Linear Span

Theorem 1.6. Let x1, . . . , xk ∈ V. Then {Σ_{i=1}^k αi xi | αi ∈ F} is a subspace of V.

Proof. Let x, y ∈ {Σ_{i=1}^k αi xi | αi ∈ F} and α, β ∈ F. Therefore x = Σ_{i=1}^k αi xi with αi ∈ F for i = 1, . . . , k
and y = Σ_{i=1}^k βi xi with βi ∈ F for i = 1, . . . , k. Then
αx + βy = α Σ_{i=1}^k αi xi + β Σ_{i=1}^k βi xi = Σ_{i=1}^k (ααi + ββi) xi.
This implies αx + βy ∈ {Σ_{i=1}^k αi xi | αi ∈ F}. Hence {Σ_{i=1}^k αi xi | αi ∈ F} is a subspace of V.
Definition 1.7. (Linear Span) Let S ⊆ V. By the span of S we mean {Σ_{i=1}^m αi xi | m ∈ N, αi ∈ F, xi ∈ S},
the set of all finite linear combinations of vectors of S. The span of S is denoted by ls(S). The span ls(S) is a subspace of V.
Proof. The proof is similar to the proof of Theorem 1.6. Note that ls(∅) = {0}.
1.4 Linear Dependency and Independency
Definition 1.8. Let x1, . . . , xk ∈ V. If some nontrivial linear combination of the xi's (one in which
not all coefficients are zero) equals 0, then we say x1, . . . , xk are linearly dependent. We say
x1, . . . , xk are linearly independent if they are not linearly dependent.
Example 1.5. The following examples must be verified by the reader.
1. In R², the vectors x = (1, 2)^T, e1, e2 are linearly dependent, as x − e1 − 2e2 = 0.
2. v1 = (3, 0, −3)^T, v2 = (−1, 1, 2)^T, v3 = (4, 2, −2)^T, v4 = (2, 1, 1)^T are linearly dependent in R³, as
2v1 + 2v2 − v3 + 0v4 = 0.
3. v1 = (1, 1, 1)^T, v2 = (−1, 1, 1)^T are linearly independent in R³. How?
Remark 1.6. Thus ∅ is linearly independent and any set containing 0 is linearly dependent.
Theorem 1.9. Let S = {x1, · · · , xn}. Then S is linearly dependent iff
there exists k s.t. xk is a linear combination of x1, · · · , xk−1, xk+1, . . . , xn (that is, there exists xk such
that xk ∈ ls(S − {xk})).
Proof. First suppose x1, · · · , xn are linearly dependent. Then there exist scalars α1, . . . , αn ∈ F, not all zero, such
that Σ_{i=1}^n αi xi = 0. Without loss of generality we assume that αk ≠ 0. Then xk = −(1/αk) Σ_{i=1, i≠k}^n αi xi.
So xk is a linear combination of x1, · · · , xk−1, xk+1, . . . , xn.
Conversely, suppose ∃k s.t. xk is a linear combination of x1, · · · , xk−1, xk+1, . . . , xn. That means xk =
c1x1 + c2x2 + · · · + ck−1xk−1 + ck+1xk+1 + · · · + cnxn for some scalars ci ∈ F for i = 1, . . . , k − 1, k +
1, . . . , n. Then c1x1 + c2x2 + · · · + ck−1xk−1 − xk + ck+1xk+1 + · · · + cnxn = 0, with the coefficient of xk equal to −1 ≠ 0. This implies S is
linearly dependent.
Remark 1.7. The above theorem gives a technique to check whether a set S is linearly dependent:
check whether some vector of S is a linear combination of the remaining vectors of S.
The following theorem says something more.
Theorem 1.10. Let x1, · · · , xn be linearly dependent with x1 ≠ 0. Then ∃k > 1 s.t. xk is a linear
combination of x1, · · · , xk−1.
Proof. Consider {x1}, {x1, x2}, · · · , {x1, x2, · · · , xn} one by one. Take the smallest k > 1 s.t.
Sk = {x1, · · · , xk} is linearly dependent. So Sk−1 is linearly independent (since k is the smallest).
In that case, if Σ_{i=1}^k αi xi = 0 with α1, . . . , αk ∈ F not all zero, then αk ≠ 0; otherwise Sk−1 would be linearly
dependent, contradicting the choice of k. Therefore xk is a linear combination of x1, · · · , xk−1.
Theorem 1.11. Every subset of a finite linearly independent set is linearly independent.
Proof. Let S = {x1, . . . , xk} be a linearly independent set and let S1 ⊆ S. We have to show
that S1 is linearly independent. Suppose S1 is linearly dependent. Then there exists a vector, say
xm, in S1 such that xm ∈ ls(S1 − {xm}). Since S1 − {xm} ⊆ S − {xm}, we get xm ∈ ls(S − {xm}).
Hence S is linearly dependent, a contradiction. Therefore S1 is linearly independent.
Remark 1.8. The above theorem says that if you can find a subset of a set S which
is linearly dependent, then S is linearly dependent. But if you find a subset of S which is
linearly independent, then you cannot conclude anything about S.
Remark 1.9. Every subset of a finite linearly dependent set need not be linearly dependent. In
R², the vectors x = (1, 2)^T, e1, e2 are linearly dependent, as x − e1 − 2e2 = 0, but the subset {e1, e2}
is linearly independent.
Till now we have seen linear dependency and independency for finite sets. Next we define linear
dependency and independency for infinite sets, and we use the finite-set concepts to do so.
Definition 1.9. Let V be a vector space and let S ⊆ V be infinite. We say S is linearly dependent
if it contains a finite linearly dependent set, otherwise it is linearly independent.
Remark 1.10. Theorem 1.11 is also true if S is infinite.
Theorem 1.12. Let S = {v1 , . . . , vn } ⊆ V and T ⊆ ls(S) such that m = |T | > |S|. Then T is
linearly dependent.
The above theorem is quite important. It says that if S is a finite subset of a
vector space containing n elements, then any finite subset of ls(S) containing more than n elements
is linearly dependent. For example, consider the vector space R³. Take S = {(1, 1, 0), (1, 0, 0)} and
T = {(1, 1, 0), (1, 0, 0), (3, 1, 0)}. You can easily check that T ⊆ ls(S), and T contains three
elements, so T is linearly dependent.
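Such claims are easy to check numerically: a finite set of vectors is linearly independent exactly when the matrix having them as columns has rank equal to the number of vectors. A short sketch for the example above (assuming numpy; variable names are ours):

import numpy as np

S = np.array([[1, 1, 0], [1, 0, 0]]).T             # columns are the vectors of S
T = np.array([[1, 1, 0], [1, 0, 0], [3, 1, 0]]).T  # columns are the vectors of T

print(np.linalg.matrix_rank(S))   # 2: S is linearly independent
print(np.linalg.matrix_rank(T))   # 2 < 3: T is linearly dependent
# T is contained in ls(S): appending the columns of T to S does not increase the rank
print(np.linalg.matrix_rank(np.hstack([S, T])))    # still 2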
Corollary 1. Any n + 1 vectors in Rⁿ are linearly dependent.
Proof. Follows from Theorem 1.12, since Rⁿ = ls({e1, . . . , en}).
The following corollary is quite important. It gives a technique to extend a
linearly independent set to a larger linearly independent set. This technique will be used frequently
throughout this course, so keep it in mind.
Corollary 2. Let S ⊆ V be linearly independent and x ∈ V \ S. Then S ∪ {x} is linearly independent
iff x ∉ ls(S).
Corollary 3. Let S ⊆ V be linearly independent. Then ls(S) = V iff each proper superset of S is
linearly dependent.
1.5 Basis and Dimension
Definition 1.10. Let V be a vector space and B ⊆ V. Then B is called a basis for V if the
following hold.
i) B is linearly independent,
ii) ls(B) = V.
Theorem 1.13. Every vector space has a basis.
Proof. To prove this theorem we need Zorn's lemma, so we skip the proof at this point.
Remark 1.11. Every non-trivial vector space over an infinite field has infinitely many bases. In particular, a basis is not
unique.
Let S ⊆ T . We say S is a maximal subset of T having a property (P) if
i) S has (P)
ii) no proper superset of S in T has (P).
Example 1.6. Let T = {2, 3, 4, 7, 8, 10, 12, 13, 14, 15}. Then a maximal subset of T of consecutive
integers is S = {2, 3, 4}. Other maximal subsets are {7, 8}, {10}, {12, 13, 14, 15}. The subset
{12, 13} is not maximal.
Definition 1.11. A subset S ⊆ V is called maximal linearly independent if
i) S is linearly independent
ii) no proper superset of S is linearly independent.
Example 1.7. The following examples must be verified by the reader.
1. In R3 , the set {e1 , e2 } is linearly independent but not maximal linearly independent.
2. In R3 , the set {e1 , e2 , e3 } is maximal linearly independent.
3. Let S ⊆ Rn be linearly independent and |S| = n. Then S is maximal linearly independent.
Theorem 1.14. If a set S ⊆ V is maximal linearly independent, then ls(S) = V.
Proof. First we assume that S is a maximal linearly independent set. To show ls(S) = V.
Suppose ls(S) ≠ V. Then there exists α ∈ V with α ∉ ls(S). Take S1 = S ∪ {α}.
Claim: The set S1 is linearly independent. Suppose it is dependent. Then S1 has a finite
subset, say R, which is linearly dependent. There are two cases.
CASE I: α ∉ R. Then R is a finite subset of S, so R is linearly independent (every subset of a
linearly independent set is linearly independent and S is linearly independent), contradicting that R is linearly dependent.
CASE II: α ∈ R. Let R = {α1, . . . , αk, α} with α1, . . . , αk ∈ S. Since R is linearly dependent, there exist
c1, c2, . . . , ck, c, not all zero in F, such that
c1α1 + c2α2 + · · · + ckαk + cα = 0.
If c = 0, then ci = 0 for i = 1, . . . , k (since {α1, . . . , αk} is a subset of S, which is linearly
independent), contradicting that not all of them are zero. So c ≠ 0. Then α = −(1/c)(c1α1 + c2α2 + · · · + ckαk). This implies α ∈ ls(S), a
contradiction.
Hence we have proved our claim that S1 is linearly independent.
We notice that S ⊊ S1, contradicting the maximality of S. Hence ls(S) = V.
Theorem 1.15. If a subset S ⊆ V is a basis of V, then S is a maximal linearly independent set.
Proof. Suppose that S is not maximal. Then there exists a linearly independent set S1 such
that S ⊊ S1. Since S1 ⊆ V = ls(S) and |S1| > |S|, Theorem 1.12 implies that S1 is linearly dependent, a contradiction. Hence S
is a maximal linearly independent set.
Remark 1.12. It is clear that every basis is a maximal linearly independent set.
Remark 1.13. Let V be a vector space and let B be a basis of V. Then each x ∈ V can be written as
x = c1α1 + c2α2 + · · · + ckαk for some α1, . . . , αk ∈ B and c1, . . . , ck ∈ F, and for nonzero x the vectors αi and the nonzero coefficients ci are unique.
Definition 1.12. Let V be a vector space. Then V is called finite dimensional if it has a basis
B which is finite. The dimension of V is the cardinality of B and it is denoted by dim(V).
Theorem 1.16. Let S, T be two bases of a finite dimensional vector space V. Then |S| = |T|.
Proof. Since T ⊆ ls(S) and T is linearly independent, Theorem 1.12 gives |T| ≤ |S|.
Similarly |S| ≤ |T|. Hence |S| = |T|.
Theorem 1.18 (Basis Deletion Theorem). If V = ls({α1, . . . , αk}), then some of the αi can be
removed from {α1, . . . , αk} to obtain a basis for V.
Theorem 1.19. Let V be a finite dimensional vector space with dim V = n. Let S = {α1, . . . , αn} ⊆
V be such that ls(S) = V. Then S is a basis for V.
Proof. If we show that S is linearly independent, then we are done. If S is linearly dependent, then by the Deletion
Theorem we can remove some vectors from S to obtain a basis for V containing fewer than n elements, a
contradiction since dim(V) = n. Hence S is a basis of V.
Theorem 1.20 (Basis Extension Theorem). Every linearly independent set of vectors in a
finite-dimensional vector space V can be extended to a basis of V .
Proof. Let dim(V) = n. Let S = {α1, . . . , αk} be a linearly independent set. If ls(S) = V, then
S is a basis of V. If ls(S) ≠ V, then there exists a vector β1 ∈ V with β1 ∉ ls(S). Take S1 = S ∪ {β1}.
By Corollary 2, the set S1 is linearly independent. If ls(S1) = V, then S1 is a basis for V. If
ls(S1) ≠ V, then there exists a vector β2 ∈ V with β2 ∉ ls(S1). Take S2 = S1 ∪ {β2}. By Corollary 2,
the set S2 is linearly independent. If ls(S2) = V, then S2 is a basis for V. If ls(S2) ≠ V, then
there exists a vector β3 ∈ V with β3 ∉ ls(S2). Take S3 = S2 ∪ {β3}. Continuing this process, the set stays
linearly independent at every stage, and by Theorem 1.12 a linearly independent subset of V has at most n elements;
hence the process stops, that is, ls(Sp) = V for some p with k + p ≤ n. Then Sp = {α1, . . . , αk, β1, . . . , βp} is linearly independent and spans V,
so Sp is a basis of V containing S.
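The proof is constructive, and the same idea can be carried out numerically: keep adjoining candidate vectors that are not in the span of the current set (detected here by a rank increase) until the whole space is spanned. A sketch for Rⁿ, using the standard basis vectors as candidates (assuming numpy; the function name is ours):

import numpy as np

def extend_to_basis(vectors, n):
    """Extend a linearly independent list of vectors in R^n to a basis of R^n."""
    basis = [np.asarray(v, dtype=float) for v in vectors]
    for e in np.eye(n):                      # candidate vectors e_1, ..., e_n
        candidate = basis + [e]
        if np.linalg.matrix_rank(np.column_stack(candidate)) == len(candidate):
            basis = candidate                # e is not in the span of the current set
        if len(basis) == n:
            break
    return basis

print(extend_to_basis([[1, 1, 0], [1, 0, 0]], 3))   # adjoins one more vector, here e_3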
The following is an application of Basis extension theorem.
Theorem 1.21. Let V be a finite dimensional vector space with dim V = n. Let S = {α1, . . . , αn} ⊆
V be such that S is linearly independent. Then S is a basis of V.
Proof. If we show that ls(S) = V, then we are done. If ls(S) ≠ V, then by the Extension Theorem
we can extend S to a basis for V which contains at least n + 1 elements, a contradiction since
dim(V) = n. Hence S is a basis of V.
Theorem 1.22. Let U and W be subspaces of a finite dimensional vector space V. Then
dim(U + W) = dim(U) + dim(W) − dim(U ∩ W).
Proof. Since V is finite dimensional, U and W are both finite dimensional. Let B = {v1, . . . , vk}
be a basis of U ∩ W. Using the Extension Theorem we extend B to a basis {v1, . . . , vk, u1, . . . , um} of U
and to a basis {v1, . . . , vk, w1, . . . , wp} of W.
Let x ∈ U + W. Then x = x1 + x2 for some x1 ∈ U and x2 ∈ W. Therefore
x1 = Σ_{i=1}^k ai vi + Σ_{j=1}^m bj uj and x2 = Σ_{i=1}^k ci vi + Σ_{j=1}^p dj wj.
Then x = Σ_{i=1}^k ai vi + Σ_{j=1}^m bj uj + Σ_{i=1}^k ci vi + Σ_{j=1}^p dj wj.
This implies x ∈ ls({v1, . . . , vk, u1, . . . , um, w1, . . . , wp}).
We now show that {v1, . . . , vk, u1, . . . , um, w1, . . . , wp} is linearly independent.
To show that, we take
Σ_{i=1}^k fi vi + Σ_{j=1}^m gj uj + Σ_{l=1}^p hl wl = 0.   (1)
Therefore,
Σ_{i=1}^k fi vi + Σ_{j=1}^m gj uj = − Σ_{l=1}^p hl wl.
The left-hand side lies in U and the right-hand side lies in W, so Σ_{i=1}^k fi vi + Σ_{j=1}^m gj uj ∈ U ∩ W. Then
Σ_{i=1}^k fi vi + Σ_{j=1}^m gj uj = Σ_{i=1}^k αi vi for some αi ∈ F. Therefore
Σ_{i=1}^k (fi − αi) vi + Σ_{j=1}^m gj uj = 0. Then fi − αi = 0 for i = 1, . . . , k and gj = 0 for j = 1, . . . , m, as
{v1, . . . , vk, u1, . . . , um} is linearly independent. Putting the values of the gj in Equation (1), we get
Σ_{i=1}^k fi vi + Σ_{l=1}^p hl wl = 0.
Therefore fi = 0 for i = 1, . . . , k and hl = 0 for l = 1, . . . , p, as {v1, . . . , vk, w1, . . . , wp} is linearly independent. Hence
{v1, . . . , vk, u1, . . . , um, w1, . . . , wp} is linearly independent.
Then {v1, . . . , vk, u1, . . . , um, w1, . . . , wp} is a basis of U + W. Therefore dim(U + W) =
k + m + p = (k + m) + (k + p) − k = dim(U) + dim(W) − dim(U ∩ W).
Definition 1.13. Let V be a finite dimensional vector space and let B = {x1, . . . , xn} be an ordered basis.
Let x ∈ V. Then there exist unique α1, . . . , αn ∈ F such that x = α1x1 + · · · + αnxn. The tuple
(α1, . . . , αn) is called the coordinate vector of x with respect to B.
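Computing the coordinate vector of x amounts to solving a linear system: if the basis vectors are the columns of a matrix P, the coordinates are the solution of Pc = x. A small sketch (assuming numpy; the basis and the vector chosen here are ours, for illustration):

import numpy as np

# basis B = {(1, 1, 0), (1, 0, 0), (0, 0, 2)} of R^3, written as the columns of P
P = np.column_stack([[1, 1, 0], [1, 0, 0], [0, 0, 2]]).astype(float)
x = np.array([3.0, 1.0, 4.0])

coords = np.linalg.solve(P, x)        # coordinate vector of x with respect to B
print(coords)                         # [1. 2. 2.]
assert np.allclose(P @ coords, x)     # rebuilding x from its coordinates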
Remark 1.15. A vector space is infinite dimensional if it is not finite dimensional.
2.2 Existence of Inner Product
Theorem 2.1. Every non-trivial finite dimensional vector space over R or C can be made into an inner product space.
Definition 2.2. Let (V, ⟨·, ·⟩) be an inner product space. A subset S of V is said to be orthogonal
if ⟨u, v⟩ = 0 for all distinct u, v ∈ S.
Theorem 2.2. Let (V, ⟨·, ·⟩) be an inner product space. Let S = {α1, . . . , αk} be an orthogonal
subset of V and let αi ≠ 0 for i = 1, . . . , k. Then S is linearly independent.
Remark 2.1. The converse of the above theorem is not true. The following is an example.
Example 2.2. Consider the inner product space (Rⁿ, ⟨·, ·⟩) where ⟨x, y⟩ = Σ_{i=1}^n xi yi. In R², take u = (1, 0)
and v = (1, 1). These two vectors are linearly independent but not orthogonal.
Definition 2.3. Let (V, ⟨·, ·⟩) be an inner product space and let {α1, α2, . . . , αn} ⊆ V. Then
{α1, α2, . . . , αn} is called orthonormal if ⟨αi, αj⟩ = 0 for i ≠ j and ⟨αi, αi⟩ = 1 for each i.
The next immediate question is whether we can construct an orthogonal set from a finite linearly
independent set. The answer is yes. Gram and Schmidt supplied a process to construct an orthogonal
set from a linearly independent finite set.
Theorem 2.3. (Gram–Schmidt Orthogonalization Theorem) Let (V, ⟨·, ·⟩) be an inner product
space. Let {α1, α2, . . . , αn} be a linearly independent set. Then there exists an orthogonal set
{β1, β2, . . . , βn} such that ls({β1, β2, . . . , βn}) = ls({α1, α2, . . . , αn}).
Step 1.
Put
β1 = α1,
β2 = α2 − (⟨α2, β1⟩/⟨β1, β1⟩) β1,
...
βn = αn − (⟨αn, β1⟩/⟨β1, β1⟩) β1 − · · · − (⟨αn, βn−1⟩/⟨βn−1, βn−1⟩) βn−1.
Step 2.
Check ⟨βi, βj⟩ = 0 for i ≠ j, i, j ∈ {1, . . . , n}.
Step 3.
Check ls({β1, . . . , βn}) = ls({α1, . . . , αn}).
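The steps above translate directly into code. A minimal sketch of the process for the standard inner product on Rⁿ (assuming numpy; no normalization is performed, and in practice a numerically more stable variant such as modified Gram–Schmidt or a QR factorization would be preferred):

import numpy as np

def gram_schmidt(alphas):
    """Orthogonalize a linearly independent list of vectors in R^n."""
    betas = []
    for a in alphas:
        b = np.array(a, dtype=float)
        for beta in betas:
            # subtract the projection of a on the already-computed beta
            b -= (np.dot(a, beta) / np.dot(beta, beta)) * beta
        betas.append(b)
    return betas

betas = gram_schmidt([[1, 1, 1], [1, 0, 1], [0, 1, 2]])
for i in range(len(betas)):
    for j in range(i + 1, len(betas)):
        assert abs(np.dot(betas[i], betas[j])) < 1e-10   # pairwise orthogonal
print(betas)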
Theorem 2.4. Every non-trivial finite dimensional inner product space has an orthonormal basis.
Theorem 2.5. Let (V, ⟨·, ·⟩) be an inner product space. Let {α1, α2, . . . , αn} be an orthonormal
basis. Then for each x ∈ V we have x = Σ_{i=1}^n ⟨x, αi⟩ αi.
Theorem 2.6. Let (V, ⟨·, ·⟩) be a finite dimensional inner product space and let S be a subspace of V.
Then (S⊥)⊥ = S.
Remark 2.2. Check whether the above theorem is true or not for infinite dimensional inner product
space.
Theorem 2.7. Let (V, ⟨·, ·⟩) be a finite dimensional inner product space. Let W be a subspace of
V. Then V = W ⊕ W⊥.
Proof. Step 1.
To show W ∩ W⊥ = {0}. Let x ∈ W ∩ W⊥. Then ⟨x, x⟩ = 0, so x = 0.
Step 2.
To show V = W + W⊥.
Take a basis {u1, . . . , uk} of W. We extend it to a basis {u1, . . . , uk, v1, . . . , vm} of V.
We transform {u1, . . . , uk, v1, . . . , vm} to an orthogonal set {α1, . . . , αk, β1, . . . , βm} by using the Gram–Schmidt process;
note that ls({α1, . . . , αk}) = W.
Let x ∈ V. Then x = Σ_{i=1}^k ai αi + Σ_{j=1}^m bj βj where ai, bj ∈ F for i = 1, . . . , k and j = 1, . . . , m.
Let x1 = Σ_{i=1}^k ai αi and x2 = Σ_{j=1}^m bj βj. It is clear that x1 ∈ W. We can easily check that
⟨x2, αi⟩ = 0 for i = 1, . . . , k, so x2 ∈ W⊥. Then x ∈ W + W⊥. This implies V ⊆ W + W⊥. Hence
V = W ⊕ W⊥.
Remark 2.3. Let (V, ⟨·, ·⟩) be a finite dimensional inner product space. Let W be a subspace of
V. For each x ∈ V there exist unique x1 ∈ W and x2 ∈ W⊥ such that x = x1 + x2. The vector x1
is called the orthogonal projection of x on W.
Theorem 2.8. Let (V, ⟨·, ·⟩) be a finite dimensional inner product space. Let W be a subspace of V.
Let {α1, . . . , αk} be an orthogonal basis of W. Then the orthogonal projection of any vector x ∈ V
on W is Σ_{i=1}^k (⟨x, αi⟩/⟨αi, αi⟩) αi.
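A small sketch of this projection formula for the standard inner product on R³, with W spanned by an orthogonal basis of our own choosing (assuming numpy):

import numpy as np

def project(x, orthogonal_basis):
    """Orthogonal projection of x onto W = span(orthogonal_basis)."""
    x = np.asarray(x, dtype=float)
    return sum((np.dot(x, a) / np.dot(a, a)) * a for a in orthogonal_basis)

# W = span{(1, 1, 0), (1, -1, 0)}, an orthogonal basis of the xy-plane
W_basis = [np.array([1.0, 1.0, 0.0]), np.array([1.0, -1.0, 0.0])]
x = np.array([2.0, 3.0, 5.0])
p = project(x, W_basis)
print(p)                                                       # [2. 3. 0.]
assert all(abs(np.dot(x - p, a)) < 1e-10 for a in W_basis)     # x - p lies in W-perp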
Definition 2.5. Let (V, ⟨·, ·⟩) be an inner product space and let x ∈ V. The norm of x is
denoted by ||x|| and is equal to ⟨x, x⟩^{1/2}.
Theorem 2.9. (Cauchy–Schwarz Inequality) Let (V, ⟨·, ·⟩) be an inner product space. Let x, y ∈ V.
Then |⟨x, y⟩| ≤ ||x|| ||y||. Equality holds if and only if x and y are linearly dependent.
Proof. If y = 0, the inequality is trivial, so assume y ≠ 0 and put t = ⟨x, y⟩/⟨y, y⟩. Then
0 ≤ ⟨x − ty, x − ty⟩ = ⟨x, x⟩ − |⟨x, y⟩|²/⟨y, y⟩,
so
⟨x, x⟩ − |⟨x, y⟩|²/⟨y, y⟩ ≥ 0,
which gives |⟨x, y⟩|² ≤ ⟨x, x⟩⟨y, y⟩, that is, |⟨x, y⟩| ≤ ||x|| ||y||.
The first inequality used in this proof is 0 ≤ ⟨x − ty, x − ty⟩. If |⟨x, y⟩| = ||x|| ||y||
holds, then ⟨x − ty, x − ty⟩ = 0. This says that x = ty. Then x and y are linearly dependent.
Exercise: Let V be an inner product space.
1. ⟨0, v⟩ = 0 for all v ∈ V.
3 Linear Transformation
3.1 Definition, Examples and Properties
Definition 3.1. Let V and W be two vector spaces over the same field F. A mapping T : V → W
is said to be a linear transformation if T(αx + βy) = αT(x) + βT(y) for all x, y ∈ V and for all
α, β ∈ F.
4. Let V and W be vector spaces over the same field F. Let T : V → W be defined by T(x) = 0,
x ∈ V. This transformation is called the zero transformation.
5. Let V be a vector space over the field F. Let T : V → V be defined by T(x) = x, x ∈ V. This
transformation is called the identity transformation.
6. Let V be a vector space over the field F and let λ ∈ F. Let T : V → V be defined by
T(x) = λx, x ∈ V. This transformation is called a scalar transformation.
Theorem 3.1. Let T : Rⁿ → R be a linear transformation. Then T has the following form:
T(x1, . . . , xn) = Σ_{i=1}^n αi xi for some αi ∈ R, i = 1, . . . , n, and for all (x1, . . . , xn) ∈ Rⁿ.
Theorem 3.2. Let T : Rn → Rm be a linear transformation. Then there exist linear transforma-
tions Ti : Rn → R for i = 1, . . . , m such that T (x) = (T1 (x), . . . , Tm (x)) for all x ∈ Rn .
1. N(T) := {x ∈ V : T(x) = 0}. N(T) is a subspace of V. The subspace N(T) is called the null space of T.
2. R(T) := {T(x) : x ∈ V}. R(T) is a subspace of W. The subspace R(T) is called the range
space of T.
Definition 3.3. The dim(R(T )) is called the rank of T and dim(N (T )) is called the nullity of T .
Theorem 3.3. Let T : V → V be a linear transformation. Then T is one-one if and only if
N (T ) = {0}.
Theorem 3.4. Let T : V → W be a linear transformation. Then the following are true.
Theorem 3.5. Let T : V → W be a linear transformation. Then the following are true.
2. dim(R(T )) ≤ dim(V).
4. If V and W are finite dimensional such that dim(V) = dim(W), then T is one-one if and only
if T is onto.
Theorem 3.6. (Rank Nullity Theorem) Let V be a finite dimensional vector space over the field
F. Let W be a vector space over the field F. Let T : V → W be a linear transformation. Then
Nullity(T) + Rank(T) = dim(V).
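For the transformation x ↦ Ax given by a matrix A ∈ Mm×n(R), the theorem says rank(A) + nullity(A) = n, the dimension of the domain. A quick numerical check (assuming numpy; the matrix is ours, and a null space basis is read off from the SVD):

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])          # a 2x3 matrix of rank 1

rank = np.linalg.matrix_rank(A)
# the rows of Vh beyond the rank span the null space of A (from the SVD A = U S Vh)
_, _, Vh = np.linalg.svd(A)
null_basis = Vh[rank:]
nullity = null_basis.shape[0]

print(rank, nullity)                     # 1 2
assert rank + nullity == A.shape[1]      # Rank-Nullity: rank + nullity = dim of the domain
assert np.allclose(A @ null_basis.T, 0)  # the null space basis vectors are mapped to 0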
Definition 3.5. Let V and W be two vector spaces over the same field F. Then V and W are said
to be isomorphic if there is an isomorphism from V to W.
Theorem 3.7. Let V and W be two finite dimensional vector spaces over the same field F. Let V
and W be isomorphic. Then dim(V) = dim(W).
Theorem 3.8. The set L(V, W) of all linear transformations from V to W is a vector space over the field F.
Definition 3.6. The space L(V, F) is called the dual space of V and it is denoted by V∗. Elements
of V∗ are usually denoted by lower case letters f, g, etc.
Theorem 3.9. Let V be a finite dimensional space and let B = {v1, . . . , vn} be an ordered basis of V.
For each j ∈ {1, . . . , n}, let fj : V → F be defined by fj(x) = αj for x = Σ_{i=1}^n αi vi. Then the
following are true.
2. {f1, . . . , fn} is a basis of V∗.
Definition 3.7. Let V be a finite dimensional space and let B = {v1, . . . , vn} be an ordered basis of V.
The basis {f1, . . . , fn} of V∗ such that fi(vj) = δij for i, j ∈ {1, . . . , n} is called the
dual basis of B.
Definition 3.9. Let V and W be two vector spaces over the same field F. Let T : V → W be a
linear transformation. Then T is invertible if T is bijective (one-one+onto).
Definition 3.10. The matrix A = (aij) in the above discussion is called the matrix representation
of T with respect to the ordered bases B1, B2 of V and W, respectively. This matrix is usually
denoted by [T]B1B2, that is, [T]B1B2 = (aij).
Remark 3.3. The word 'ordered basis' is significant here: if the order of the basis is changed, the
entries aij change their positions and the corresponding matrix will be different.
Example 3.3. Let T be the linear transformation from R² to R² defined by T(x1, x2) = (−x2, x1).
Let B1 = {α1 = (1, 0), α2 = (0, 1)} and B2 = {β1 = (1, 2), β2 = (1, −1)} be ordered bases for R².
Then find out [T]B1B2 = (aij).
Sol: T(α1) = T(1, 0) = (0, 1) = (1/3)(1, 2) − (1/3)(1, −1) = (1/3)β1 − (1/3)β2, and T(α2) = T(0, 1) = (−1, 0) = −(1/3)(1, 2) − (2/3)(1, −1) = −(1/3)β1 − (2/3)β2. Hence [T]B1B2 = [1/3 −1/3; −1/3 −2/3], where the j-th column consists of the B2-coordinates of T(αj).
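The B2-coordinates of each T(αj) can be obtained by solving a linear system, exactly as in the computation above. A sketch for this example (assuming numpy):

import numpy as np

def T(x):
    # the transformation from Example 3.3: T(x1, x2) = (-x2, x1)
    return np.array([-x[1], x[0]])

B1 = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]     # alpha_1, alpha_2
P2 = np.column_stack([[1.0, 2.0], [1.0, -1.0]])       # columns are beta_1, beta_2

# column j of [T]_{B1 B2} holds the B2-coordinates of T(alpha_j)
M = np.column_stack([np.linalg.solve(P2, T(a)) for a in B1])
print(M)    # approximately [[ 0.333 -0.333], [-0.333 -0.667]]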
1. We have seen that for each linear transformation T : V → W, we have a matrix A ∈ Mm×n(F)
such that [T]B,B′ = A.
2. Let A = (aij) ∈ Mm×n(F). Then there exists a linear transformation T : V → W such that A =
[T]B,B′, namely the linear transformation determined by T(uj) = Σ_{i=1}^m aij vi for j = 1, . . . , n.
Theorem 3.11. Let V and W be finite dimensional vector spaces over the same field F with
dim(V) = n and dim(W) = m. Then L(V, W) is isomorphic to Mm×n(F).
Proof. Hint: Define ζ : L(V, W) → Mm×n(F) by ζ(T) = [T]B,B′, where B = {u1, . . . , un}
and B′ = {v1, . . . , vm} are ordered bases of V and W respectively.
Show that ζ is an isomorphism from L(V, W) to Mm×n(F).
Corollary 4. Let V and W be finite dimensional vector spaces over the same field F. Let dim(V) =
n and dim(W) = m. Then dimension of L(V, W) = mn.
Theorem 3.12. Let V be a finite dimensional vector space over the field F. Let S and T be
two linear transformations from V to V. Let B be an ordered basis of V. Then [S ◦ T]B,B =
[S]B,B [T]B,B.
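A quick numerical check of this fact for two operators on R², represented in the standard ordered basis so that the matrix of each operator is just its usual matrix (assuming numpy; the matrices are ours):

import numpy as np

S = np.array([[1.0, 2.0],
              [0.0, 1.0]])      # [S]_{B,B} in the standard basis B of R^2
T = np.array([[0.0, -1.0],
              [1.0,  0.0]])     # [T]_{B,B}

x = np.array([3.0, 4.0])
# applying S after T agrees with multiplying by the product matrix [S][T]
assert np.allclose(S @ (T @ x), (S @ T) @ x)
print(S @ T)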
Theorem 3.13. Let V be a finite dimensional vector space over the field F. Let T be an
invertible linear transformation from V to V. Let B be an ordered basis of V. Then [T⁻¹]B,B =
([T]B,B)⁻¹.
Theorem 3.14. Let T : V → V be a linear operator on a finite dimensional vector space V and let
B = {u1, . . . , un} and B′ = {v1, . . . , vn} be two ordered bases of V. Then [T]B,B and [T]B′,B′ are similar matrices.
Proof. Define S : V → V by S(ui) = vi for i = 1, . . . , n. First we show that S is invertible. Let
x = α1u1 + · · · + αnun ∈ Ker(S). Then
S(α1u1 + · · · + αnun) = 0
=⇒ α1S(u1) + · · · + αnS(un) = 0
=⇒ α1v1 + · · · + αnvn = 0
=⇒ αi = 0 for all i
=⇒ x = 0
=⇒ Ker(S) = {0}.
Then S is one-one. Since V is finite dimensional, S is also onto. Let [T]B,B = A = (aij), so that
T(uj) = Σ_{i=1}^n aij ui.
Therefore S ◦ T ◦ S⁻¹(vj) = S ◦ T(uj) = S(T(uj)) = S(Σ_{i=1}^n aij ui) = Σ_{i=1}^n aij vi.
Hence [S T S⁻¹]B′,B′ = (aij) = [T]B,B.
=⇒ [S]B′,B′ [T]B′,B′ [S⁻¹]B′,B′ = [T]B,B
=⇒ [T]B,B = P [T]B′,B′ P⁻¹, where P = [S]B′,B′.
Hence [T]B,B and [T]B′,B′ are similar.
Remark 3.5. In the above theorem we have seen that if T : V → V is an operator and B and B′
are two bases of V, then the matrix [T]B,B is similar to the matrix [T]B′,B′.
Let V and W be finite dimensional vector spaces over the same field F and let T : V → W
be a linear transformation. Let B1 = {u1, . . . , un} and B1′ = {u1′, . . . , un′} be two bases of V, and let
B2 = {v1, . . . , vm} and B2′ = {v1′, . . . , vm′} be two bases of W. One may want to know the relation
between [T]B1B2 and [T]B1′B2′.
For this purpose we consider the linear transformations T1 : V → V and T2 : W → W such that
T1(ui) = ui′ and T2(vj) = vj′ for i = 1, . . . , n and j = 1, . . . , m.
Theorem 3.15. [T]B1′B2′ = ([T2]B2B2)⁻¹ [T]B1B2 [T1]B1B1.
Remark 4.1. Let A ∈ Mn×n(F). Then det(xI − A) is a polynomial in x whose coefficients come
from F. For example, consider the matrix A = [1 1 1; 0 2 2; 0 0 3]. Then det(xI − A) =
x³ − 6x² + 11x − 6.
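These coefficients can be checked numerically: numpy's np.poly returns the coefficients of the characteristic polynomial of a square matrix, and np.linalg.eigvals returns its roots (a sketch, assuming numpy):

import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 3.0]])

print(np.poly(A))             # approximately [ 1. -6. 11. -6.], i.e. x^3 - 6x^2 + 11x - 6
print(np.linalg.eigvals(A))   # the roots 1, 2, 3 (possibly in a different order)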
Theorem 4.1. Let A ∈ Mn×n (F) and λ ∈ F. Then λ is an eigenvalue of A if and only if λ is a
root of the polynomial det(xI − A).
Definition 4.2. Let A ∈ Mn×n (F). Then the polynomial det(xI − A) is called characteristic
polynomial and the equation det(xI − A) = 0 is called characteristic equation.
Theorem 4.2. Let A ∈ Mn×n(F). Then the constant term and the coefficient of x^{n−1} in det(xI − A)
are (−1)ⁿ det(A) and −trace(A), respectively.
Remark 4.2. For example, consider the matrix A = [1 1 1; 0 2 2; 0 0 3]. Then det(xI − A) =
x³ − 6x² + 11x − 6. Here trace(A) = 6 and det(A) = 6. It is clear that the coefficient of x² is −trace(A) and
the constant term is (−1)³ det(A).
Remark 4.3. The following question is quite natural. Does every matrix have an eigenvalue? This
question is similar to the following question. Let P(x) ∈ P(x, F), where P(x, F) is the set of all polynomials
with coefficients coming from F. Does P(x) have a root in F? The answer, in general, is no. For example,
consider the polynomial x² + 1 in P(x, R); this polynomial does not have any root in R. Now we are
able to answer our first question. If x² + 1 is the characteristic polynomial of a matrix A ∈ M2(R),
then A does not have any eigenvalue in R. One such matrix is A = [0 −1; 1 0].
Theorem 4.3. The following are true.
1. Let A ∈ Mn×n (C). Then A has at least one eigenvalue.
2. Two similar matrices have the same characteristic polynomial.
Proof. Let A and B be two similar matrices. Then there exists a nonsingular matrix P
such that P⁻¹AP = B. You can easily check that det(B − xI) = det(P⁻¹AP − xI) =
det(P⁻¹AP − xP⁻¹P) = det(P⁻¹(A − xI)P) = det(P⁻¹) det(A − xI) det(P) = det(A − xI).
The converse is not true. Consider the two matrices A = [0 1; 0 0] and B = [0 0; 0 0].
These two matrices have the same characteristic polynomial but they are not similar.
3. Let A, B ∈ Mn(F). Then AB and BA have the same characteristic polynomial.
Proof. More generally, let A ∈ Mm×n(F) and B ∈ Mn×m(F), where m ≤ n. Then det(xI − BA) =
x^{n−m} det(xI − AB).
4. Let f (x) be a polynomial and λ be an eigenvalue of A corresponding to the eigenvector x. Then
f (λ) is an eigenvalue of f (A) corresponding to the eigenvector x. But the converse is not true.
Definition 4.3. 1. Let A ∈ Mn×n(F) and let λ be an eigenvalue of A. The number of times λ
appears as a root of the characteristic equation of A is called the algebraic multiplicity of λ.
2. Let A ∈ Mn×n(F) and let λ be an eigenvalue of A. Let S be the set of all eigenvectors of A
corresponding to λ. Then S ∪ {0} is a subspace of Fⁿ. This subspace is called the eigenspace
of A corresponding to λ and it is denoted by Eλ(A). Note that Eλ(A) = Null(A − λI).
3. Let A ∈ Mn×n(F) and let λ be an eigenvalue of A. Then dim(Eλ(A)) is called the geometric
multiplicity of λ with respect to A.
Theorem 4.4. Let A ∈ Mn×n (F) and let λ1 and λ2 be two distinct eigenvalues. Then Eλ1 (A) ∩
Eλ2 (A) = {0}.
Theorem 4.5. Let A ∈ Mn×n(F) and let λ be an eigenvalue of A. Then the algebraic multiplicity
of λ is greater than or equal to the geometric multiplicity of λ.
Definition 4.4. A matrix A is said to be diagonalizable if there exists a nonsingular matrix P such
that P⁻¹AP is a diagonal matrix, that is, if A is similar to a diagonal matrix.
Remark 4.4. If A is diagonalizable, then the eigenvalues of A are the diagonal entries of that
diagonal matrix. Not every matrix is diagonalizable.
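One can probe diagonalizability numerically by asking whether n linearly independent eigenvectors are found; the matrix [0 1; 0 0] below is the standard example of a non-diagonalizable matrix. This test is only a floating-point heuristic, not a proof (assuming numpy; the helper name is ours):

import numpy as np

def looks_diagonalizable(A, tol=1e-8):
    # heuristic: A is diagonalizable iff it has n linearly independent eigenvectors
    _, V = np.linalg.eig(A)
    return np.linalg.matrix_rank(V, tol=tol) == A.shape[0]

print(looks_diagonalizable(np.array([[1.0, 1.0], [0.0, 2.0]])))   # True: two distinct eigenvalues
print(looks_diagonalizable(np.array([[0.0, 1.0], [0.0, 0.0]])))   # False: only one eigenvector direction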
Theorem 4.6. Let A ∈ Mn (F) and let λ1 , . . . λk be the distinct eigenvalues of A. Then the following
are equivalent.
1. A is diagonalizable.
Notation: Let A ∈ Mn(C). Then A∗ = (Ā)^T. In the real case, A∗ = A^T.
Definition 4.5. Let A ∈ Mn(C). Then A is called a normal matrix if A∗A = AA∗. In the real case the
condition reads A^T A = AA^T.
Definition 4.6. Let A ∈ Mn(C). Then A is called Hermitian if A∗ = A. In the real case such a matrix is called
real symmetric, and the condition is A^T = A.
Theorem 4.7. Let A be Hermitian (or real symmetric). Then the following are true.
Then S2∗S1∗AS1S2 is upper triangular. Consider S = S1S2. The product of two unitary matrices
is unitary. Hence S∗AS is upper triangular.
Corollary 6. Let T be a linear operator on the finite-dimensional space V over the field F. The
eigenvalues of T are the zeros of its characteristic polynomial.
Definition 4.12. Let T be a linear operator on the finite-dimensional space V over the field F
and let λ be an eigenvalue of T. Let S be the set of all eigenvectors of T corresponding to λ. Then
S ∪ {0} is a subspace of V. This subspace is called the eigenspace of T corresponding to λ and it is
denoted by Eλ(T). Note that Eλ(T) = Null(T − λI).
The dimension dim(Eλ(T)) is called the geometric multiplicity of λ with respect to T.
Definition 4.13. Let T be a linear operator on the finite-dimensional space V. We say that T is
diagonalizable if there is a basis for V each vector of which is a characteristic vector of T.
Remark 4.7. If there is an ordered basis B = {u1, . . . , un} for V in which each ui is a characteristic
vector of T, then the matrix of T in the ordered basis B is diagonal: if T(ui) = ci ui for i = 1, . . . , n,
then [T]B = diag(c1, c2, . . . , cn).
We certainly do not require that the scalars c1, . . . , cn be distinct.
Theorem 4.13. Let T be a linear operator on the finite-dimensional space V. Let λ1 , . . . λk be the
distinct eigenvalues of T . Then the following are equivalent.
1. T is diagonalizable.
Definition 4.14. Let T be a linear operator on the finite-dimensional space V over the field F.
Let P be in P(x, F). Then P is called an annihilating polynomial of T if P(T) = 0.
The question is whether every linear operator T on the finite-dimensional space V over the field F has
such a (nonzero) polynomial. The answer is affirmative.
Theorem 4.14. Let T be a linear operator on the finite-dimensional space V over the field F. Then
T always has a nonzero annihilating polynomial.
Proof. Let dim(V) = n. Then dim(L(V, V)) = n². That is, any subset of L(V, V) which contains
n² + 1 elements is linearly dependent.
We consider the set {I, T, T², T³, . . . , T^{n²}}, which is a subset of L(V, V) containing
n² + 1 elements. Hence {I, T, T², T³, . . . , T^{n²}} is linearly dependent. Therefore there
exist scalars c1, . . . , c_{n²+1}, not all zero, such that c1 I + c2 T + c3 T² + · · · + c_{n²+1} T^{n²} = 0. Then
P(x) = c1 + c2 x + c3 x² + · · · + c_{n²+1} x^{n²} is a nonzero annihilating polynomial of T.
Remark 4.8. Let T be a linear operator on the finite-dimensional space V over the field F. Let S
be the set of all annihilating polynomials of T, that is, S = {p(x) ∈ P(x, F) | p(T) = 0}. Then S is a
subspace of P(x, F). Furthermore, it is an infinite dimensional subspace.
Definition 4.15. Let T be a linear operator on the finite-dimensional space V over the field F.
The minimal polynomial for T is the (unique) monic generator of the ideal of polynomials over F
which annihilate T .
Remark 4.9. The minimal polynomial p for the linear operator T is uniquely determined by these
three properties:
(1) p is a monic polynomial.
(2) p(T) = 0.
(3) No polynomial over F which annihilates T has smaller degree than p has.
Remark 4.10. If the operator T is represented in some ordered basis by the matrix A, then T
and A have the same minimal polynomial.
Theorem 4.15. Let T be a linear operator on an n-dimensional vector space V [or, let A be an
n × n matrix]. The characteristic and minimal polynomials for T [for A] have the same roots,
except for multiplicities.
Proof. Let mT(x) be the minimal polynomial for T. Let λ be a scalar. We want to show that
mT(λ) = 0 if and only if λ is a characteristic value of T.
First, suppose mT(λ) = 0. Then mT(x) = (x − λ)q(x), where q is a polynomial. Since deg q <
deg mT, the definition of the minimal polynomial mT tells us that q(T) ≠ 0. Then there exists a
vector v ∈ V such that q(T)(v) ≠ 0. Let u = q(T)(v). Since mT(T) = 0, we have mT(T)(v) = 0,
that is, (T − λI)q(T)(v) = 0. This implies λ is an eigenvalue of T with corresponding eigenvector u = q(T)(v).
Now, suppose that λ is a characteristic value of T, say T(v) = λv with v (≠ 0) ∈ V. Then
0 = mT(T)(v) = mT(λ)v, and since v ≠ 0 this implies mT(λ) = 0.
Corollary 7. Let T be a linear operator on an n-dimensional vector space V. The minimal poly-
nomial of T divides the characteristic polynomial of T .
Theorem 4.16. Let T be a linear operator on an n-dimensional vector space V. Let P(x) be an
annihilating polynomial of T over F. Then the minimal polynomial of T divides P(x).
Proof. You can easily prove it by using division algorithm.
We will prove the following two theorems later.
Theorem 4.18 (Cayley Hamilton Theorem). Let T be a linear operator on an n-dimensional vector
space V. If PT (x) is the characteristic polynomial for T , then PT (T ) = 0.
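A numerical illustration of the Cayley–Hamilton theorem on the matrix used earlier: substituting A into its own characteristic polynomial x³ − 6x² + 11x − 6 gives the zero matrix (a sketch, assuming numpy):

import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 3.0]])
I = np.eye(3)

# the characteristic polynomial of A is x^3 - 6x^2 + 11x - 6
p_of_A = A @ A @ A - 6 * (A @ A) + 11 * A - 6 * I
print(np.allclose(p_of_A, 0))   # True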
Remark 4.11. When V is finite-dimensional, the invariance of W under T has a simple matrix
interpretation, and perhaps we should mention it at this point. Let B′ = {v1, . . . , vr} be a basis
of W. We extend B′ to a basis B = {v1, . . . , vr, vr+1, . . . , vn} for V. Let A = [T]B, so that
T(vj) = Σ_{i=1}^n Aij vi. Since W is invariant under T, the vector T(vj) ∈ W for j ≤ r. This means
Aij = 0 if j ≤ r and i > r. Hence A has the block form A = [B C; 0 D], where B is an r × r matrix,
C is an r × (n − r) matrix, and D is an (n − r) × (n − r) matrix. The matrix B is precisely the
matrix of the induced operator TW in the ordered basis B′.
Lemma 4.1. Let A = [B C; 0 D] and let P(x) be a polynomial. Then P(A) = [P(B) E; 0 P(D)], where
E is some new matrix of size r × (n − r).
Theorem 4.19. Let W be an invariant subspace for T . The characteristic polynomial for the
restriction operator TW divides the characteristic polynomial for T . The minimal polynomial for
TW divides the minimal polynomial for T .
Proof. Using previous lemma and corollary, one can easily prove it.