LINEAR ALGEBRA

Mohan C Joshi
IIT Gandhinagar
[email protected], [email protected]
1 Vector Spaces
1.1 Vector Spaces and Subspaces, Linear Span and Direct Sum
Definition 1.1. A vector space V is a set which is closed with respect to a binary operation
(u ∈ V, v ∈ V ⇒ u + v ∈ V) and scalar multiplication (u ∈ V, α ∈ K ⇒ αu ∈ V, where K is either the set
R of real numbers or the set C of complex numbers) and is such that the following holds.
V is a commutative group with respect to the binary operation
α(u + v) = αu + αv, (α + β)u = αu + βu ∀ u, v ∈ V; α, β ∈ K
α(βu) = (αβ)u = β(αu) ∀ u ∈ V; α, β ∈ K
1u = u ∀ u ∈ V
Elements of the space V are called vectors. V is said to be a real vector space if K = R and is called a
complex vector space if K = C. The set K will be referred to as the field of scalars and elements of K will
be referred to as scalars.
Example 1.1. Let R^n denote the set of all n-tuples of real numbers. We shall denote the elements
of R^n as column vectors u = (u_1, u_2, ..., u_n)^T. That is, the elements of R^n are n × 1 matrices.
Define the binary operation u + v and scalar multiplication αu componentwise:
u + v = (u_1 + v_1, u_2 + v_2, ..., u_n + v_n)^T and αu = (αu_1, αu_2, ..., αu_n)^T, α ∈ R.
R^n is a real vector space. Similarly, one can show that the set C^n of all n-tuples of complex
numbers is a complex vector space over the field of complex numbers as scalars.
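The vector space axioms for R^n can be checked numerically. The following minimal sketch (using NumPy; the vectors and scalars are illustrative, not taken from the text) verifies closure and the distributive laws of Definition 1.1 on sample data.

```python
import numpy as np

rng = np.random.default_rng(0)
u, v = rng.standard_normal(4), rng.standard_normal(4)   # two vectors in R^4
alpha, beta = 2.5, -1.3                                  # scalars in R

# closure: u + v and alpha*u are again 4-tuples of reals
assert (u + v).shape == (4,) and (alpha * u).shape == (4,)

# distributive laws from Definition 1.1
assert np.allclose(alpha * (u + v), alpha * u + alpha * v)
assert np.allclose((alpha + beta) * u, alpha * u + beta * u)

# associativity of scalar multiplication and the unit law
assert np.allclose(alpha * (beta * u), (alpha * beta) * u)
assert np.allclose(1.0 * u, u)
print("vector space axioms verified on sample vectors")
```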
Example 1.2. Let C[a, b] denote the set of all continuous real valued functions on [a, b]. Define the
binary operation f + g and scalar multiplication αf as
[f + g](x) = f(x) + g(x), [αf](x) = αf(x), x ∈ [a, b], α ∈ R.
C[a, b] is a real vector space. The elements of the real vector space C[a, b] are continuous functions
on [a, b].
Example 1.3. Proceeding as above, we shall denote by P the vector space of polynomials on
R, by P[a, b] the vector space of polynomials on [a, b] and by F[a, b] the vector space of all functions
on [a, b].
Example 1.4. In a similar way, C^(n)[a, b] shall denote the real vector space of n-times continuously
differentiable real valued functions on [a, b]. C^∞[a, b] will denote the space of real valued functions
on [a, b] which are differentiable infinitely many times.
Theorem 1.2. In any vector space V, the following holds.
1. α0 = 0 ∀ α ∈ K.
2. 0u = 0 ∀ u ∈ V.
3. (−1)u = −u ∀ u ∈ V.
Proof. 1. We have
α0 = α(0 + 0) = α0 + α0
This gives
−(α0) + α0 = −(α0) + α0 + α0 = (−(α0) + α0) + α0
and hence
0 = 0 + α0 = α0
which proves the result.
2. We have
0u = (0 + 0)u = 0u + 0u
which gives
−(0u) + 0u = −(0u) + 0u + 0u = (−(0u) + 0u) + 0u = 0 + 0u = 0u
and hence
0 = 0u
3.
(−1)u + u = (−1)u + (1)u = (−1 + 1)u = 0u = 0
Hence
(−1)u = −u
Definition 1.3. (Subspace) A nonempty subset U of V is a subspace if U is a vector space by
itself with respect to the binary operation and scalar multiplication defined on V.
Theorem 1.4. U is a subspace of V iff
u ∈ U, v ∈ U ⇒ u + v ∈ U and u ∈ U, α ∈ K ⇒ αu ∈ U.
Remark 1.1. The space {0}, consisting of just the zero element, and the entire space V are trivial
subspaces of V.
Theorem 1.5. We have the following inclusions.
C^(1)[a, b] is a subspace of C[a, b]
C^(n)[a, b] is a subspace of C^(n−1)[a, b], n ≥ 1
P[a, b] is a subspace of C^(n)[a, b]
P[a, b] is a subspace of F[a, b]
Example 1.5. The following subsets of C^(1)[a, b] are its subspaces.
{f ∈ C^(1)[a, b] : f(c) = 0, c ∈ (a, b)}, {f ∈ C^(1)[a, b] : f′(a) = f′(b)},
{f ∈ C^(1)[a, b] : ∫_a^b f(x) dx = 0}.
Example 1.6. The following subsets of C^(1)[a, b] are not its subspaces.
{f ∈ C^(1)[a, b] : f(c) = 2, c ∈ (a, b)}, {f ∈ C^(1)[a, b] : ∫_a^b f(x) dx = 1}.
Definition 1.6. (Linear Span) Let S be a subset of V. Then the set of all finite linear
combinations of elements of S is called the linear span of S and is denoted by L[S]. That is
L[S] = {α_1 u_1 + α_2 u_2 + ... + α_n u_n : α_i ∈ K, u_i ∈ S, 1 ≤ i ≤ n}.
Theorem 1.7. Let S ≠ ∅ be a subset of V. Then L[S] is the smallest subspace of V containing S.
Proof. Let u = α_1 u_1 + α_2 u_2 + ... + α_n u_n and v = β_1 v_1 + β_2 v_2 + ... + β_n v_n,
with α_i, β_i ∈ K, u_i, v_i ∈ S, 1 ≤ i ≤ n, be two elements of L[S]. Then
u + v = α_1 u_1 + β_1 v_1 + ... + α_n u_n + β_n v_n is again a finite linear combination of elements of S and
hence it is in L[S]. Similarly, αu ∈ L[S] whenever α ∈ K. This proves that L[S] is a subspace of V.
We now show that it is the smallest subspace containing S. It is obvious that it contains S. Let W
be any other subspace of V containing S. We need to show that L[S] ⊆ W. Let
u = α_1 u_1 + α_2 u_2 + ... + α_n u_n ∈ L[S], with u_i ∈ S, α_i ∈ K, 1 ≤ i ≤ n. Since S ⊆ W, we have
u_i ∈ W, 1 ≤ i ≤ n, and hence u ∈ W. Thus L[S] ⊆ W. So, L[S] is the smallest subspace of V
containing S.
Using the above theorem, we immediately get the following theorem.
Theorem 1.8. Let S ≠ ∅ be a subset of V.
Then (i) L[S] = S iff S is a subspace of V and (ii) L[L[S]] = L[S].
Let U and W be subspaces of V. Then, it is clear that U ∩ W is also a subspace of V. Now,
consider the subspaces U_1, U_2, ..., U_n of V. By induction, it follows that U_1 ∩ U_2 ∩ ... ∩ U_n is a
subspace of V.
However, the union of two subspaces need not be a subspace. For example, consider U = x-axis and
W = y-axis. Then U ∩ W = {0}. But U ∪ W is NOT a subspace of R^2, because (1, 0) and (0, 1) are
two elements in U ∪ W but (1, 0) + (0, 1) = (1, 1) ∉ U ∪ W.
We know that L[U ∪ W] is the smallest subspace containing U ∪ W. We can identify L[U ∪ W]
completely if we define the sum U + W of two subspaces U and W of V.
Definition 1.9. Let U and W be two subspaces of V. We define their sum U + W as follows.
U + W = {v ∈ V : v = u + w, u ∈ U, w ∈ W}
If U ∩ W = {0}, then the sum of U and W is called a direct sum and is denoted as U ⊕ W.
It is clear that U + W is a subspace of V containing U ∪ W. Further, one can show that U + W is
the smallest subspace containing U ∪ W.
Theorem 1.10. Let U and W be subspaces of V. Then U + W = L[U ∪ W].
Proof. It is clear that U + W ⊆ L[U ∪ W], as the space U + W contains elements of the form
v = u + w, u ∈ U, w ∈ W, which is a finite linear combination of elements of U ∪ W. Conversely,
assume that v ∈ L[U ∪ W]. Then, we have
v = α_1 u_1 + α_2 u_2 + ... + α_n u_n + β_1 w_1 + β_2 w_2 + ... + β_m w_m,
with α_i, β_j ∈ K, u_i ∈ U, w_j ∈ W, 1 ≤ i ≤ n, 1 ≤ j ≤ m. Thus v = u + w with u ∈ U, w ∈ W, and hence
v ∈ U + W. This proves the theorem.
Theorem 1.11. Let U and W be two subspaces of V and let V = U + W. Then V = U ⊕ W iff every
v ∈ V has a unique representation of the form v = u + w, u ∈ U, w ∈ W.
Proof. Let V = U ⊕ W. Then, we can write v ∈ V as v = u + w, u ∈ U, w ∈ W. We claim that this
representation is unique. If possible let v = u_1 + w_1, u_1 ∈ U, w_1 ∈ W, be another representation of
v. Then, we get u − u_1 = w_1 − w, where u − u_1 ∈ U and w_1 − w ∈ W. This implies that
u − u_1 = w_1 − w ∈ U ∩ W. As U ∩ W = {0}, we have u = u_1 and w = w_1. Hence, the
representation of elements of V is unique.
Conversely, let the representation of elements of V be unique. Let v ∈ U ∩ W. Then, we have two
representations v = v + 0 with v ∈ U, 0 ∈ W, and also v = 0 + v with 0 ∈ U, v ∈ W. As the representation of
elements of V is unique, we have v = 0. Hence U ∩ W = {0}. This implies that the sum U + W is
a direct sum.
Remark 1.2. The sum V = U ⊕ W is also referred to as the internal direct sum of the two subspaces U
and W of V.
Example 1.7. Let V = F[−a, a] be the vector space of all real valued functions on [−a, a]. Let us
denote by U = F_e[−a, a] and W = F_o[−a, a] the vector subspaces of V consisting of even and odd
functions, respectively. Then, it follows that V = U ⊕ W. This follows from the fact that each
f ∈ V = F[−a, a] has a representation of the form
f(x) = [f(x) + f(−x)]/2 + [f(x) − f(−x)]/2,
where f_e(x) = [f(x) + f(−x)]/2 is an even function and f_o(x) = [f(x) − f(−x)]/2 is odd.
We claim that this representation is unique. For, if f = g_e + g_o is another representation, with
g_e ∈ U and g_o ∈ W, then we have h = f_e − g_e = g_o − f_o. As h is both even and odd, it is the zero
function. Hence, f_e = g_e and f_o = g_o.
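A minimal numerical sketch of this decomposition (the test function and grid are illustrative, not from the text): split a function on [−a, a] into its even and odd parts and check that the parts are indeed even/odd and sum back to f.

```python
import numpy as np

a = 1.0
x = np.linspace(-a, a, 201)          # symmetric grid on [-a, a]
f = lambda t: np.exp(t) + t**3       # an arbitrary test function

fe = (f(x) + f(-x)) / 2              # even part
fo = (f(x) - f(-x)) / 2              # odd part

assert np.allclose(fe, fe[::-1])     # fe(x) = fe(-x) on the symmetric grid
assert np.allclose(fo, -fo[::-1])    # fo(x) = -fo(-x)
assert np.allclose(fe + fo, f(x))    # f = fe + fo
print("even/odd decomposition verified")
```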
Let U and W be two vector spaces. We shall define the external direct sum of these two spaces. First,
we define the sum U ⊕ W as follows.
U ⊕ W = {v : v = (u, w), u ∈ U, w ∈ W}
Theorem 1.12. The sum U ⊕ W is a vector space with respect to the following addition and
scalar multiplication operations.
(u_1, w_1) + (u_2, w_2) = (u_1 + u_2, w_1 + w_2)
α(u, w) = (αu, αw)
The space U ⊕ W is called the external direct sum of the vector spaces U and W.
Definition 1.13. Let W be a subspace of the vector space V and let v ∈ V. The set v + W is
called a linear variety or translate of W or parallel of W or coset of W.
Remark 1.3. The linear variety v + W is a subspace iff v ∈ W.
Let W be a subspace of the vector space V. Then, we denote by V/W the set of all cosets of W,
defined as follows.
V/W = {v + W : v ∈ V}
Theorem 1.14. The space V/W is a vector space with respect to the following addition and scalar
multiplication operations.
(v_1 + W) + (v_2 + W) = (v_1 + v_2) + W
α(v + W) = αv + W
The space V/W is called the quotient space.
1.2 Linear Independence and Dependence, Dimension and Basis
Definition 1.15. A set S = {u_1, u_2, ..., u_n} of vectors is called linearly dependent (l.d.) if there
exists a non-trivial linear combination of u_1, u_2, ..., u_n that equals the zero vector. That is,
∃ α_i ∈ K, 1 ≤ i ≤ n, not all zero, such that
α_1 u_1 + α_2 u_2 + ... + α_n u_n = 0
Otherwise, S is said to be linearly independent (l.i.).
Example 1.8. S = {e^x, e^{2x}, ..., e^{nx}} is a l.i. set in C^∞(−∞, ∞).
To prove the l.i. of S, we proceed as follows. Assume that
α_1 e^x + α_2 e^{2x} + ... + α_n e^{nx} = 0 ∀ x ∈ R
Differentiating the above equation (n − 1) times, we get
α_1 e^x + 2α_2 e^{2x} + ... + nα_n e^{nx} = 0
α_1 e^x + 2^2 α_2 e^{2x} + ... + n^2 α_n e^{nx} = 0
...
α_1 e^x + 2^{n−1} α_2 e^{2x} + ... + n^{n−1} α_n e^{nx} = 0
Evaluating the above expressions at x = 0, we get
α_1 + α_2 + ... + α_n = 0
α_1 + 2α_2 + ... + nα_n = 0
α_1 + 2^2 α_2 + ... + n^2 α_n = 0
...
α_1 + 2^{n−1} α_2 + ... + n^{n−1} α_n = 0
As the determinant of the coefficient matrix of the above system (a Vandermonde determinant in the
distinct nodes 1, 2, ..., n) is nonzero, it follows that α_i = 0, 1 ≤ i ≤ n. Hence
S = {e^x, e^{2x}, ..., e^{nx}} is a l.i. set in C^∞(−∞, ∞).
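The coefficient matrix appearing above has entries j^i (i = 0, ..., n−1; j = 1, ..., n). A short numerical check of its nonsingularity for a small n (the value of n is chosen for illustration):

```python
import numpy as np

n = 5
# V[i, j] = (j+1)**i, the coefficient matrix obtained by differentiating
# alpha_1 e^x + ... + alpha_n e^{nx} = 0 repeatedly and setting x = 0
V = np.array([[(j + 1) ** i for j in range(n)] for i in range(n)], dtype=float)

print(np.linalg.det(V))          # nonzero, so the only solution is alpha = 0
print(np.linalg.matrix_rank(V))  # n
```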
Example 1.9. Let S = {v_1, v_2, ..., v_n} ⊆ V. If v_1 = 0, then S is l.d.
Example 1.10. S_1 = {1, sin^2 x, cos^2 x} is l.d. whereas S_2 = {1, sin x, cos x} is l.i. in C[−1, 1].
Theorem 1.16. In a vector space V:
Any subset of a l.i. set is l.i. and any superset of a l.d. set is l.d.
If the set S = {v_1, v_2, ..., v_n} is an ordered set with v_1 ≠ 0, then S is l.d. iff one of the
vectors v_2, v_3, ..., v_n, say v_k, belongs to the linear span of the preceding vectors
v_1, v_2, ..., v_{k−1}.
Corollary 1.1. A finite set of vectors S = {v_1, v_2, ..., v_n} in V containing a non-zero
vector has a linearly independent subset A such that L[A] = L[S].
Definition 1.17. An infinite set S is l.i. if every finite subset of S is l.i.
Example 1.11. Let S = {1, x, x^2, ...} be an infinite subset of P. To show that it is l.i., we need to
show that every finite subset of it is l.i. Let {x^{k_1}, x^{k_2}, ..., x^{k_n}} (where k_1 < k_2 < ... < k_n are
nonnegative integers) be any finite subset of S. Assume that
α_{k_1} x^{k_1} + α_{k_2} x^{k_2} + ... + α_{k_n} x^{k_n} = 0, α_{k_i} ∈ R, 1 ≤ i ≤ n
Differentiating the above equation k_1 times, we get
α_{k_1} k_1! + α_{k_2} [k_2(k_2 − 1) ... (k_2 − k_1 + 1)] x^{k_2 − k_1} + ... +
α_{k_n} [k_n(k_n − 1) ... (k_n − k_1 + 1)] x^{k_n − k_1} = 0
Evaluating the above equation at x = 0, we get α_{k_1} k_1! = 0 and hence α_{k_1} = 0. Similarly,
differentiating the original equation k_2, k_3, ..., k_n times and evaluating at x = 0, we get
0 = α_{k_1} = α_{k_2} = ... = α_{k_n}
This proves that {x^{k_1}, x^{k_2}, ..., x^{k_n}} is l.i. and hence S is l.i.
Example 1.12. 1 and i are l.i. in the real vector space C of complex numbers over the field of
real numbers as scalars, whereas this set is l.d. in the complex vector space C of complex numbers
over the field of complex numbers as scalars.
Example 1.13. S = {sin x, sin 2x, ..., sin nx} is a l.i. subset of C[−π, π].
Let
α_1 sin x + α_2 sin 2x + ... + α_n sin nx = 0, α_i ∈ R, 1 ≤ i ≤ n
Multiplying the above equation by sin kx and integrating from −π to π, we get
α_1 ∫_{−π}^{π} sin kx sin x dx + α_2 ∫_{−π}^{π} sin kx sin 2x dx + ... + α_n ∫_{−π}^{π} sin kx sin nx dx = 0
Using the fact that
∫_{−π}^{π} sin kx sin jx dx = 0 if k ≠ j, and = π if j = k,
we get that
α_j = 0, 1 ≤ j ≤ n
This proves the linear independence of the set S. Similarly one can show that
{1, cos x, cos 2x, ..., cos nx} is a l.i. subset of C[−π, π]. One can now extend the argument to claim
the linear independence of the infinite set {sin x, sin 2x, ..., sin nx, ...} and the infinite set
{1, cos x, cos 2x, ..., cos nx, ...}.
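The orthogonality relations used above are easy to check numerically. A small sketch (the quadrature grid is chosen for illustration) approximates ∫_{−π}^{π} sin kx sin jx dx for a few k, j:

```python
import numpy as np

N = 200000
x = np.linspace(-np.pi, np.pi, N, endpoint=False)  # uniform grid for a Riemann sum
dx = 2 * np.pi / N
for k in range(1, 4):
    for j in range(1, 4):
        integral = np.sum(np.sin(k * x) * np.sin(j * x)) * dx
        expected = np.pi if k == j else 0.0
        print(f"k={k}, j={j}: integral ~ {integral:.4f}  (expected {expected:.4f})")
```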
Definition 1.18. A subset B of V is a basis if (i) B is l.i. and (ii) L[B] = V.
If a set B of n elements generates V, then no l.i. set can have more than n vectors. More precisely,
we have the following theorem.
Theorem 1.19. In a vector space V, if S = {v_1, v_2, ..., v_n} generates V and if
B = {w_1, w_2, ..., w_m} is l.i., then m ≤ n.
Proof. As B is l.i., w_m ≠ 0. Denote by S_1 the ordered set {w_m} ∪ S = {w_m, v_1, v_2, ..., v_n}. As L[S] = V,
L[S_1] = V. Further, w_m ∈ V = L[S] and hence S_1 is l.d., with its first element w_m ≠ 0. By Theorem 1.16,
some element of S_1 is a linear combination of the preceding elements; this element must be one of the v's
(it cannot be a w, since the elements preceding any w are again elements of B, and B is l.i.). Discard this
element, say v_{i_1}, and denote the resulting set by S_1′; clearly L[S_1′] = V.
Now consider the set S_2 = {w_{m−1}} ∪ S_1′. L[S_2] = V because L[S_1′] = V, and S_2 is l.d. since
0 ≠ w_{m−1} ∈ V = L[S_1′]. Again, by Theorem 1.16, we can discard an element v_{i_2} of S_2 which is a
linear combination of the preceding elements, obtaining S_2′. Construct S_3 = {w_{m−2}} ∪ S_2′, and so on.
Note that at every stage the discarded element is from S.
Proceeding inductively, we continue till all the elements of the set B are used up. Then necessarily m ≤ n.
If not (m > n), after n steps all elements of S would have been discarded and the construction would
produce a linearly dependent set S_{n+1} = {w_{m−n}} ∪ S_n′ = {w_{m−n}, w_{m−n+1}, ..., w_m}. But S_{n+1},
being a subset of B, is linearly independent, a contradiction. Hence m > n is not possible.
Corollary 1.2. If V has a basis of n elements, then every set of p vectors with p > n is l.d.
Proof. Let S = {v_1, v_2, ..., v_n} be a basis and let B = {w_1, w_2, ..., w_p} be a set of p vectors. If
this set B were l.i., it would follow by the above theorem that p ≤ n. Hence, if p > n, B is l.d.
Corollary 1.3. If V has a basis of n elements, then every other basis for V also has n elements.
Proof. Let S_1 = {v_1, v_2, ..., v_n} and S_2 = {w_1, w_2, ..., w_m} be two bases for V. Then both S_1
and S_2 are l.i. and further L[S_1] = V = L[S_2]. Hence, by the above theorem, m ≤ n and also n ≤ m.
This proves the corollary.
Definition 1.20. (Finite Dimensional Space)
A vector space V is said to be finite dimensional if V has a basis consisting of a finite number of
elements. It is clear that the number of elements in a basis is unique; it is called the dimension of V.
Example 1.14. The space P_n of all polynomials of degree ≤ n is of dimension (n + 1), as
S = {1, x, x^2, ..., x^n} is a basis for P_n. However, the space P of all polynomials is not finite
dimensional.
Theorem 1.21. In an n-dimensional space V, any set of n linearly independent vectors forms a
basis for V.
Proof. Let B = {v_1, v_2, ..., v_n} be a l.i. set. We need to show that L[B] = V. Let v ∈ V be
arbitrary. Denote this element by v_{n+1}. Then the set B_1 = {v_1, v_2, ..., v_n, v_{n+1}} is a set of (n + 1)
vectors in an n-dimensional space and hence is l.d. By Theorem 1.16, there exists a vector v_i in
B_1 which is a linear combination of the preceding (i − 1) vectors. This element v_i cannot belong to B; if it
did, it would contradict the assumption that B is l.i. Thus v_i = v_{n+1}, that is, v_{n+1} ∈ L[B]. This proves
that L[B] = V.
Theorem 1.22. Let B = {v_1, v_2, ..., v_n} be a basis for a vector space V. Then every element
v ∈ V has a unique representation
v = α_1 v_1 + α_2 v_2 + ... + α_n v_n, α_i ∈ K, 1 ≤ i ≤ n
Definition 1.23. Let B = {v_1, v_2, ..., v_n} be a basis for a finite dimensional space V. Then, by
the previous theorem, every element v ∈ V has a unique representation
v = α_1 v_1 + α_2 v_2 + ... + α_n v_n, α_i ∈ K, 1 ≤ i ≤ n
(α_1, α_2, ..., α_n) ∈ K^n is said to be the coordinate vector of v with respect to the basis B.
Remark 1.4. In case we have fewer than n l.i. vectors in a vector space of dimension n, we can extend
the set to get a basis.
Theorem 1.24. In an n-dimensional vector space V, any set B = {v_1, v_2, ..., v_k} of l.i. vectors can
be extended to a basis {v_1, v_2, ..., v_k, v_{k+1}, ..., v_n} of V.
Proof. If k = n, we are done. Let k < n; then V ≠ L[v_1, v_2, ..., v_k]. Hence, there exists
v_{k+1} ∉ L[v_1, v_2, ..., v_k], and thus {v_1, v_2, ..., v_{k+1}} is l.i. If k + 1 = n, we are done. Otherwise,
repeat the process to get a set {v_1, v_2, ..., v_k, v_{k+1}, ..., v_n} of n l.i. vectors in V. This is a basis,
as it is a collection of n l.i. elements in a finite dimensional space of dimension n.
We now prove two theorems regarding the dimensions of subspaces of a vector space.
Theorem 1.25. Let W be a subspace of a finite dimensional space V. Then
(a) dim(W) ≤ dim(V), with equality iff W = V.
(b) dim(V/W) = dim(V) − dim(W)
Proof. (a) Let dim W = m and dim V = n. Let S = {w_1, w_2, ..., w_m} be a basis for W. Since S is also l.i.
in V, m ≤ n. This proves that dim W ≤ dim V. Assume that dim W = dim V = n. As S is a basis for W
and W ⊆ V, S is l.i. in V. As dim V = n, it follows by Theorem 1.21 that S is also a basis for V. Thus
W = L[S] = V, which implies that W = V.
(b) Extend the basis S = {w_1, w_2, ..., w_m} of W to a basis B = {w_1, w_2, ..., w_m, v_1, v_2, ..., v_r} of V,
where m + r = n. Let v̄_1 = v_1 + W, v̄_2 = v_2 + W, ..., v̄_r = v_r + W be elements of the quotient space
V/W. We need to show that {v̄_1, v̄_2, ..., v̄_r} is a basis for V/W. Since L[B] = V, it follows that for
any v ∈ V we have
v = α_1 w_1 + α_2 w_2 + ... + α_m w_m + β_1 v_1 + β_2 v_2 + ... + β_r v_r, α_i, β_j ∈ K
This gives
v + W = β_1 v̄_1 + β_2 v̄_2 + ... + β_r v̄_r, β_j ∈ K
This proves that L[{v̄_1, v̄_2, ..., v̄_r}] = V/W.
We claim that v̄_1, v̄_2, ..., v̄_r are l.i. Let β_1 v̄_1 + β_2 v̄_2 + ... + β_r v̄_r = 0̄ (the zero coset W). This implies
that w = β_1 v_1 + β_2 v_2 + ... + β_r v_r ∈ W and hence w = α_1 w_1 + α_2 w_2 + ... + α_m w_m for some scalars
α_i ∈ K, 1 ≤ i ≤ m. This gives us
0 = α_1 w_1 + α_2 w_2 + ... + α_m w_m − β_1 v_1 − β_2 v_2 − ... − β_r v_r
The linear independence of the basis B implies that 0 = α_i = β_j, 1 ≤ i ≤ m, 1 ≤ j ≤ r, thereby
proving the linear independence of v̄_1, v̄_2, ..., v̄_r. Hence dim(V/W) = r = n − m. This proves the theorem.
Theorem 1.26. If U and W are two subspaces of a finite dimensional space V, then
dim(U + W) = dim(U) + dim(W) − dim(U ∩ W)
Proof. STEP 1
Let n = dim(V), m = dim(U), p = dim(W), r = dim(U ∩ W). Assume that B = {v_1, v_2, ..., v_r} is a
basis for U ∩ W. Extend B to a basis B_1 = {v_1, v_2, ..., v_r, u_{r+1}, u_{r+2}, ..., u_m} of U and to a basis
B_2 = {v_1, v_2, ..., v_r, w_{r+1}, w_{r+2}, ..., w_p} of W.
STEP 2
We need to show that S = {v_1, v_2, ..., v_r, u_{r+1}, u_{r+2}, ..., u_m, w_{r+1}, w_{r+2}, ..., w_p} is a basis for
U + W.
S is l.i.: Assume that there exist scalars α_i, 1 ≤ i ≤ r, β_i, r + 1 ≤ i ≤ m, γ_i, r + 1 ≤ i ≤ p, such that
Σ_{i=1}^{r} α_i v_i + Σ_{i=r+1}^{m} β_i u_i + Σ_{i=r+1}^{p} γ_i w_i = 0    (1.1)
This implies that
Σ_{i=1}^{r} α_i v_i + Σ_{i=r+1}^{m} β_i u_i = − Σ_{i=r+1}^{p} γ_i w_i    (1.2)
Let v = − Σ_{i=r+1}^{p} γ_i w_i = Σ_{i=1}^{r} α_i v_i + Σ_{i=r+1}^{m} β_i u_i. Since v is a linear combination of the
elements v_1, v_2, ..., v_r, u_{r+1}, u_{r+2}, ..., u_m of U and also of the elements w_{r+1}, w_{r+2}, ..., w_p of W,
it follows that v ∈ U ∩ W. Therefore there exist scalars δ_i, 1 ≤ i ≤ r, such that
v = Σ_{i=1}^{r} δ_i v_i    (1.3)
and hence
Σ_{i=1}^{r} δ_i v_i + Σ_{i=r+1}^{p} γ_i w_i = 0    (1.4)
Since {v_1, v_2, ..., v_r, w_{r+1}, w_{r+2}, ..., w_p} is l.i., it follows that δ_i = 0 = γ_j, 1 ≤ i ≤ r, r + 1 ≤ j ≤ p.
Using γ_{r+1} = γ_{r+2} = ... = γ_p = 0 in (1.2), we get that
Σ_{i=1}^{r} α_i v_i + Σ_{i=r+1}^{m} β_i u_i = 0    (1.5)
Using the fact that the elements v_1, v_2, ..., v_r, u_{r+1}, u_{r+2}, ..., u_m of U are l.i., we get that
α_i = 0, 1 ≤ i ≤ r, and β_j = 0, r + 1 ≤ j ≤ m. This proves the linear independence of S.
STEP 3
L[S] = U + W: Let z ∈ U + W. Then z = u + w, u ∈ U, w ∈ W. This implies that there exist
scalars α_i, 1 ≤ i ≤ r, β_j, r + 1 ≤ j ≤ m, and scalars α′_i, 1 ≤ i ≤ r, γ_j, r + 1 ≤ j ≤ p, such that
z = Σ_{i=1}^{r} α_i v_i + Σ_{i=r+1}^{m} β_i u_i + Σ_{i=1}^{r} α′_i v_i + Σ_{i=r+1}^{p} γ_i w_i    (1.6)
The RHS of the above equality shows that z ∈ L[S]. This implies that U + W ⊆ L[S]. The reverse
containment L[S] ⊆ U + W is obvious. Hence U + W = L[S].
Thus S is a basis for U + W and dim(U + W) = r + (m − r) + (p − r) = m + p − r
= dim(U) + dim(W) − dim(U ∩ W). This proves the theorem.
Corollary 1.4. If V = U ⊕ W, then
dim(V) = dim(U) + dim(W).
Example 1.15. Let V = P_3, and let U = {p ∈ P_3 : p(1) = 0}, W = {p ∈ P_3 : p′(1) = 0} be two subspaces
of V. Then dim V = 4, dim U = 3, dim W = 3. We wish to get explicit representations of U, W, U ∩ W
and U + W.
B = {1, x, x^2, x^3} is a basis for V and hence
U = {p ∈ V : p(1) = 0}
  = {α + βx + γx^2 + δx^3 : α + β + γ + δ = 0, α, β, γ, δ ∈ R}
  = {(−β − γ − δ) + βx + γx^2 + δx^3 : β, γ, δ ∈ R}
  = {β(x − 1) + γ(x^2 − 1) + δ(x^3 − 1) : β, γ, δ ∈ R}
This shows that B_1 = {(x − 1), (x^2 − 1), (x^3 − 1)} is a basis for U and dim(U) = 3.
W = {p ∈ V : p′(1) = 0}
  = {α + βx + γx^2 + δx^3 : β + 2γ + 3δ = 0, α, β, γ, δ ∈ R}
  = {α + γ(x^2 − 2x) + δ(x^3 − 3x) : α, γ, δ ∈ R}
This shows that B_2 = {1, (x^2 − 2x), (x^3 − 3x)} is a basis for W and dim(W) = 3.
U ∩ W = {p ∈ V : p(1) = p′(1) = 0}
  = {α + βx + γx^2 + δx^3 : α + β + γ + δ = 0 and β + 2γ + 3δ = 0, α, β, γ, δ ∈ R}
  = {γ(1 − 2x + x^2) + δ(2 − 3x + x^3) : γ, δ ∈ R}
This shows that B_3 = {1 − 2x + x^2, 2 − 3x + x^3} is a basis for U ∩ W and dim(U ∩ W) = 2.
Hence, by the above theorem, it follows that dim(U + W) = 3 + 3 − 2 = 4 = dim(V). Hence, it follows that
U + W = V.
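The dimension counts in this example can be checked numerically by identifying p = α + βx + γx^2 + δx^3 with its coefficient vector (α, β, γ, δ) ∈ R^4 and describing U and W as null spaces of the constraint rows. A sketch, assuming nothing beyond the constraints stated above:

```python
import numpy as np

# p(x) = a + b x + c x^2 + d x^3  <->  (a, b, c, d) in R^4
row_U = np.array([[1.0, 1.0, 1.0, 1.0]])   # p(1)  = a + b + c + d = 0
row_W = np.array([[0.0, 1.0, 2.0, 3.0]])   # p'(1) = b + 2c + 3d   = 0

dim_V = 4
dim_U = dim_V - np.linalg.matrix_rank(row_U)                       # 3
dim_W = dim_V - np.linalg.matrix_rank(row_W)                       # 3
dim_UW = dim_V - np.linalg.matrix_rank(np.vstack([row_U, row_W]))  # dim(U ∩ W) = 2

dim_sum = dim_U + dim_W - dim_UW                                   # dim(U + W) by Theorem 1.26
print(dim_U, dim_W, dim_UW, dim_sum)   # 3 3 2 4, so U + W = V
```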
2 Linear Transformations
Definition 2.1. Let U and V be two vector spaces over the same field of scalars. A function
T : U → V is said to be a linear transformation or linear map or a linear operator if
(a) T(u + v) = Tu + Tv, ∀ u, v ∈ U  (b) T(αu) = αTu, ∀ u ∈ U, α ∈ K
Remark 2.1. The function f : R → R defined by f(x) = x + a, with a ∈ R fixed, is customarily called a
linear function, because its graph is a line. However, for a ≠ 0 it is NOT a linear transformation as
defined by us.
Theorem 2.2. Let T : U → V be linear. Then
T(0) = 0
T(−u) = −T(u)
T(α_1 u_1 + α_2 u_2 + ... + α_n u_n) = α_1 Tu_1 + α_2 Tu_2 + ... + α_n Tu_n for every finite linear
combination α_1 u_1 + α_2 u_2 + ... + α_n u_n ∈ U
Theorem 2.3. Let U and V be vector spaces with U finite dimensional. Let T : U → V be a linear
transformation. Then T is completely determined by its action on the basis elements of U.
Proof. Let {u_1, u_2, ..., u_n} be a basis for U and let Tu_1 = v_1, Tu_2 = v_2, ..., Tu_n = v_n be the action
on the basis elements. Let u ∈ U be arbitrary. We have u = α_1 u_1 + α_2 u_2 + ... + α_n u_n. Then, we
define Tu = α_1 v_1 + α_2 v_2 + ... + α_n v_n. This transformation T is the required transformation.
(i) We first show that T is unique. If there is any other linear map S : U → V with
Su_i = v_i, 1 ≤ i ≤ n, we need to show that S = T. By linearity of S, we have
S(α_1 u_1 + α_2 u_2 + ... + α_n u_n) = α_1 Su_1 + α_2 Su_2 + ... + α_n Su_n
 = α_1 v_1 + α_2 v_2 + ... + α_n v_n = Tu
This proves the uniqueness of T.
(ii) Let u = α_1 u_1 + α_2 u_2 + ... + α_n u_n and v = β_1 u_1 + β_2 u_2 + ... + β_n u_n be elements of U. Then
u + v = (α_1 + β_1)u_1 + (α_2 + β_2)u_2 + ... + (α_n + β_n)u_n
Hence
T(u + v) = (α_1 + β_1)v_1 + (α_2 + β_2)v_2 + ... + (α_n + β_n)v_n = Tu + Tv
and
T(αu) = αα_1 v_1 + αα_2 v_2 + ... + αα_n v_n = αTu
This proves the linearity of T and hence the theorem.
2.1 Range Space, Null Space and Rank-Nullity Theorem
Definition 2.4. Let T : U → V be a linear operator. The null space N(T) and range space R(T)
are defined as
(a) N(T) = {u ∈ U : T(u) = 0}  (b) R(T) = {v ∈ V : v = T(u), u ∈ U}
If R(T) is finite dimensional, then dim(R(T)) is called the rank of T and is denoted by r(T).
Similarly, dim(N(T)) is called the nullity of T and is denoted by n(T).
Theorem 2.5. Let T : U → V be linear. Then
1. R(T) is a subspace of V.
2. N(T) is a subspace of U.
3. T is 1-1 iff N(T) = {0}.
4. If L[u_1, u_2, ..., u_n] = U, then R(T) = L[T(u_1), T(u_2), ..., T(u_n)].
5. If U is finite dimensional, then dim(R(T)) ≤ dim(U).
Proof. 1 - 3 are easy to prove.
4. Let L[u_1, u_2, ..., u_n] = U and let v ∈ R(T). Then, we have
v = Tu, u ∈ U
 = T(α_1 u_1 + α_2 u_2 + ... + α_n u_n)
 = α_1 Tu_1 + α_2 Tu_2 + ... + α_n Tu_n ∈ L[T(u_1), T(u_2), ..., T(u_n)]
Hence
R(T) = L[T(u_1), T(u_2), ..., T(u_n)]
5. Let dim(U) < ∞ with basis B = {u_1, u_2, ..., u_n}. Then U = L[u_1, u_2, ..., u_n]. Hence by (4),
R(T) = L[T(u_1), T(u_2), ..., T(u_n)], so R(T) is generated by n elements. Let dim R(T) = m. Then by
Theorem 1.19 it follows that m ≤ n.
Theorem 2.6. Let T : U → V be linear.
If T is 1-1 and if {u_1, u_2, ..., u_n} is l.i., then {T(u_1), T(u_2), ..., T(u_n)} is l.i.
If {v_1, v_2, ..., v_n} is a l.i. subset of R(T) such that v_i = T(u_i), 1 ≤ i ≤ n, then {u_1, u_2, ..., u_n}
is l.i.
Theorem 2.7. (Rank-Nullity Theorem) Let T : U → V be linear with dim(U) = n. Then
dim(R(T)) + dim(N(T)) = dim(U)
or equivalently
r(T) + n(T) = n
Proof. STEP 1
N(T) is a finite dimensional subspace of U; let B = {u_1, u_2, ..., u_p} be a basis for it. Extend B to a
basis B_1 = {u_1, u_2, ..., u_p, u_{p+1}, ..., u_n} of U.
STEP 2
Consider the set S = {Tu_{p+1}, Tu_{p+2}, ..., Tu_n} ⊆ R(T). We show that (i) L[S] = R(T) and (ii) S
is l.i.
(i) We have L[B_1] = U, and Tu_1 = ... = Tu_p = 0, hence
R(T) = L[Tu_1, Tu_2, ..., Tu_p, Tu_{p+1}, ..., Tu_n] = L[Tu_{p+1}, Tu_{p+2}, ..., Tu_n] = L[S]
(ii) Let α_{p+1} Tu_{p+1} + α_{p+2} Tu_{p+2} + ... + α_n Tu_n = 0. By linearity of T it follows that
T(α_{p+1} u_{p+1} + α_{p+2} u_{p+2} + ... + α_n u_n) = 0 and hence α_{p+1} u_{p+1} + α_{p+2} u_{p+2} + ... + α_n u_n ∈ N(T).
Hence there exist scalars α_1, α_2, ..., α_p such that
α_{p+1} u_{p+1} + α_{p+2} u_{p+2} + ... + α_n u_n = α_1 u_1 + α_2 u_2 + ... + α_p u_p
This gives
α_1 u_1 + α_2 u_2 + ... + α_p u_p − α_{p+1} u_{p+1} − α_{p+2} u_{p+2} − ... − α_n u_n = 0
Linear independence of {u_i}_{i=1}^{n} implies that α_1 = α_2 = ... = α_p = α_{p+1} = ... = α_n = 0. This
proves that S is l.i. Thus S is a basis for R(T) and hence r(T) = n − p, that is, r(T) + n(T) = n.
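For a matrix A ∈ R^{m×n}, viewed as a map R^n → R^m, the theorem says rank(A) + dim N(A) = n. A quick numerical check on an illustrative matrix (the construction below is just one way to get a rank-deficient example, not from the text):

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(1)
# a 4 x 6 matrix of rank at most 3 by construction
A = rng.standard_normal((4, 6)) @ np.diag([1, 1, 1, 0, 0, 0]) @ rng.standard_normal((6, 6))

rank = np.linalg.matrix_rank(A)
nullity = null_space(A).shape[1]       # number of basis vectors of N(A)
print(rank, nullity, rank + nullity)   # rank + nullity = 6 = number of columns
```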
Corollary 2.1. Let T : R^n → R^n be linear. Then T is 1-1 iff T is onto.
Corollary 2.2. Let T : U → V with U and V finite dimensional. If T is 1-1, then
dim(U) ≤ dim(V).
Corollary 2.3. Let T : U → V with U and V finite dimensional. If T is onto, then
dim(U) ≥ dim(V).
2.2 Invertible Linear Transformation
Definition 2.8. (Invertible Linear Transformation) A linear map or linear operator or linear
transformation T : U → V is invertible or nonsingular if T is 1-1 and onto. Such a
transformation is also called an isomorphism.
Let T : U → V be nonsingular. Let v ∈ V be arbitrary. As T is 1-1 and onto, it follows that there
exists a unique u ∈ U such that Tu = v. This helps us in defining a mapping T^{-1} : V → U, called
the inverse of T : U → V. This is done as under:
T^{-1} v = u ⇔ Tu = v    (2.1)
We claim that T^{-1} : V → U is linear. Let
T^{-1}(v_1) = u_1 and T^{-1}(v_2) = u_2
This implies that
T(u_1) = v_1 and T(u_2) = v_2
Linearity of T gives
T(α_1 u_1 + α_2 u_2) = α_1 T(u_1) + α_2 T(u_2) = α_1 v_1 + α_2 v_2
and hence
T^{-1}(α_1 v_1 + α_2 v_2) = α_1 u_1 + α_2 u_2    (2.2)
(2.1) and (2.2) together imply that
T^{-1}(α_1 v_1 + α_2 v_2) = α_1 T^{-1} v_1 + α_2 T^{-1} v_2
We now use the Rank-Nullity Theorem to get the following.
Theorem 2.9. Let T : U → V be a linear map and dim(U) = dim(V) = n. Then the following are
equivalent.
T is an isomorphism
T is 1-1
T maps l.i. sets into l.i. sets
T transforms a basis to a basis
T is onto
r(T) = n
n(T) = 0
T^{-1} exists.
Definition 2.10. A vector space U is said to be isomorphic to a vector space V if there exists an
isomorphism T : U → V. We denote two isomorphic spaces by U ≅ V.
Theorem 2.11. Every real (complex) vector space V of dimension n is isomorphic to R^n (C^n).
We now examine the dimension of the quotient space.
Theorem 2.12. Let U and W be subspaces of a vector space V such that V = U ⊕ W. Then
1. W ≅ (V/U)
2. If V is finite dimensional with dim(V) = n and dim(U) = m, then
dim(V/U) = n − m
Proof. 1. We define an isomorphism from W to (V/U) as follows:
Tw = w + U, w ∈ W
It is clear that T is linear and onto. It is also 1-1: if w ∈ N(T), then T(w) = w + U = U, so w ∈ U;
since also w ∈ W and U ∩ W = {0}, we get w = 0. Thus T is an isomorphism from W to (V/U).
Hence W ≅ (V/U).
2. As W ≅ (V/U), it follows that dim(V/U) = dim(W) = n − m.
2.3 The Vector Space L(U, V) and Composition of Linear Maps
We now examine the set of all linear transformations from U to V. Denote this set by L(U, V) and
define a binary operation T + S and scalar multiplication αT on L(U, V) as follows.
[T + S]u = Tu + Su, u ∈ U.
[αT]u = α[Tu], u ∈ U, α ∈ K.
We immediately get the following theorem concerning L(U, V).
Theorem 2.13. The set L(U, V) of all linear transformations from U to V is a vector space with
respect to the above defined binary operation and scalar multiplication.
Definition 2.14. Let U, V, W be vector spaces and let T ∈ L(U, V) and S ∈ L(V, W). Then the
composition operation (S ∘ T) is defined as follows.
(S ∘ T)(u) = S(T(u)), u ∈ U
Alternatively, we shall also be writing the composition as ST instead of (S ∘ T). The composition
operation is linear, as we see below.
(S ∘ T)(u_1 + u_2) = S(T(u_1 + u_2))
 = S(Tu_1 + Tu_2)
 = S(Tu_1) + S(Tu_2)
 = ST(u_1) + ST(u_2)
 = (S ∘ T)(u_1) + (S ∘ T)(u_2)
Similarly, we have
(S ∘ T)(αu) = S(T(αu)) = S(αTu) = αS(Tu) = α(S ∘ T)(u)
Thus (S ∘ T) ∈ L(U, W).
We have the following result concerning the binary operation and composition operation on the
vector space L(V) of linear operators from V to itself.
Theorem 2.15. For all T, S, R ∈ L(V), we have
R(T + S) = RT + RS
(T + S)R = TR + SR
R(ST) = (RS)T
(αS)T = α(ST) = S(αT)
IT = TI = T
This immediately gives us the following theorem.
Theorem 2.16. The vector space L(V) of all linear transformations from V to V is an algebra
with identity.
We have the following two interesting theorems concerning invertible operators.
Theorem 2.17. Let T : U → V and S : V → W be two linear maps.
If S and T are nonsingular, then ST is nonsingular and (ST)^{-1} = T^{-1} S^{-1}.
If ST is 1-1, then T is 1-1.
If ST is onto, then S is onto.
If ST is nonsingular, then T is 1-1 and S is onto.
If dim(U) = dim(V) = dim(W) and ST is nonsingular, then both S and T are nonsingular.
Theorem 2.18. T : U → V is nonsingular iff there exists S : V → U such that TS = I_V and
ST = I_U. In such a case
S = T^{-1} and T = S^{-1}    (2.3)
Proof. Let us assume that T : U → V is nonsingular. Denote by S the inverse operator
T^{-1} : V → U. In view of (2.1), it follows that
S(v) = u ⇔ T(u) = v
and hence
(S ∘ T)(u) = S(Tu) = S(v) = u and (T ∘ S)(v) = T(Sv) = T(u) = v
which implies that
ST = I_U and TS = I_V
Let us now assume that there exists S : V → U such that TS = I_V and ST = I_U. As ST = I_U, it follows
by Theorem 2.17 that T is 1-1, and TS = I_V implies that T is onto. Hence T is invertible with
T^{-1} = S. Similarly, it follows that S is invertible with S^{-1} = T, which proves the theorem.
3 Geometry of Vector Spaces
3.1 Inner Product Spaces and Orthogonality
Definition 3.1. An inner product ⟨u, v⟩ in a vector space U is a function on U × U with values in
K such that the following holds.
1. ⟨u, u⟩ ≥ 0 for all u ∈ U and equality holds iff u = 0
2. ⟨u, v⟩ is the complex conjugate of ⟨v, u⟩ ∀ u, v ∈ U (so ⟨u, v⟩ = ⟨v, u⟩ in a real space)
3. ⟨αu + βv, w⟩ = α⟨u, w⟩ + β⟨v, w⟩ ∀ α, β ∈ K and u, v, w ∈ U
Definition 3.2. The vector space U with an inner product defined on it is called an inner product space.
In an inner product space U, u is said to be orthogonal to v if ⟨u, v⟩ = 0. This is denoted by u ⊥ v.
If M is any subset of U, then the set M^⊥ denotes the set of all vectors perpendicular to M. That
is
M^⊥ = {w ∈ U : ⟨u, w⟩ = 0 ∀ u ∈ M}
Definition 3.3. A vector space U is said to be a normed space if there exists a function ‖u‖ from
U to R such that the following properties hold.
1. ‖u‖ ≥ 0 for all u ∈ U and ‖u‖ = 0 iff u = 0
2. ‖αu‖ = |α| ‖u‖ for all u ∈ U and α ∈ K
3. ‖u + v‖ ≤ ‖u‖ + ‖v‖ for all u, v ∈ U
In an inner product space U, the induced norm is defined by
‖u‖^2 = ⟨u, u⟩, u ∈ U
Definition 3.4. In a normed space U, the induced metric d(u, v) (distance between two vectors) is
defined as
d(u, v) = ‖u − v‖, u, v ∈ U
In view of Definition 3.3, this metric d(u, v) satisfies the following properties.
1. d(u, v) ≥ 0 for all u, v ∈ U and d(u, v) = 0 iff u = v
2. d(u, v) = d(v, u) for all u, v ∈ U
3. d(u, w) ≤ d(u, v) + d(v, w) for all u, v, w ∈ U
Definition 3.5. It is possible to define a metric d(u, v) satisfying properties 1 - 3 in any set,
without having a vector space structure. Such a set is called a metric space.
In a metric space, without the vector space structure, we can easily define the concepts of
convergence, Cauchy convergence, completeness etc., as we see below.
Definition 3.6. A sequence {x_k} in a metric space (U, d) is said to be convergent to x ∈ U if
d(x_k, x) → 0 as k → ∞. {x_k} is said to be Cauchy if d(x_k, x_l) → 0 as k, l → ∞.
Definition 3.7. A metric space (U, d) is said to be complete if every Cauchy sequence in (U, d)
converges. A complete normed space is called a Banach space, whereas a complete inner product
space is called a Hilbert space. In a normed space U, it is possible to define the infinite sum Σ_{i=1}^{∞} u_i.
We say that S = Σ_{i=1}^{∞} u_i if the sequence of partial sums S_n = Σ_{i=1}^{n} u_i ∈ U converges to S ∈ U.
Example 3.1. In the space R^n of n-tuples, the Euclidean norm and inner product are defined as
‖u‖^2 = u_1^2 + u_2^2 + ... + u_n^2, u = (u_1, u_2, ..., u_n)    (3.1)
⟨u, v⟩ = u_1 v_1 + u_2 v_2 + ... + u_n v_n, u = (u_1, u_2, ..., u_n), v = (v_1, v_2, ..., v_n)    (3.2)
In terms of matrix notation we write ⟨u, v⟩ = v^T u = u^T v, if we treat u, v as column vectors
and u^T, v^T as row vectors.
One can show that R^n is complete with respect to the norm induced by the inner product defined
by Eq. (3.2) and hence it is a Hilbert space.
Remark 3.1. It is possible to define other norms in the space R^n, as we see below:
‖u‖_1 = |u_1| + |u_2| + ... + |u_n|
‖u‖_∞ = max_{1 ≤ i ≤ n} |u_i|
Definition 3.8. Two different norms ‖u‖_a and ‖u‖_b in a normed space U are said to be equivalent
if there exist positive constants α, β such that
α‖u‖_a ≤ ‖u‖_b ≤ β‖u‖_a
One can show that all norms in R^n are equivalent and hence R^n is also complete with respect to
the norms ‖u‖_1 and ‖u‖_∞ defined earlier.
Example 3.2. In the space C[0, 1] of all real valued continuous functions, one can define a norm and
an inner product as under:
‖f‖_2^2 = ∫_0^1 f^2(t) dt
⟨f, g⟩ = ∫_0^1 f(t) g(t) dt
Example 3.3. Let L^2[a, b] denote the space of all square integrable functions,
L^2[a, b] = {f : [a, b] → R : ∫_a^b f^2(t) dt < ∞}
Let the norm and inner product be defined by the integral formulas of Example 3.2 (with [0, 1] replaced
by [a, b]). Then L^2[a, b] is a Hilbert space.
Remark 3.2. R^n is finite dimensional, whereas C[0, 1] and L^2[0, 1] are infinite dimensional spaces.
3.2 Pythagoras Identity, Parallelogram Law and Schwarz Inequality
In an inner product space U, we have the following identities.
1. Pythagoras identity:
if u ⊥ v then ‖u + v‖^2 = ‖u‖^2 + ‖v‖^2 (and in a real inner product space the converse also holds)
2. Parallelogram law:
‖u + v‖^2 + ‖u − v‖^2 = 2[‖u‖^2 + ‖v‖^2]
One can show that in any inner product space U we have the identity
⟨u, v⟩ + ⟨v, u⟩ = (1/2)[‖u + v‖^2 − ‖u − v‖^2]
and in a complex inner product space we have
⟨u, v⟩ − ⟨v, u⟩ = (i/2)[‖u + iv‖^2 − ‖u − iv‖^2]
Combining the above two relations we get
⟨u, v⟩ = (1/4)[‖u + v‖^2 − ‖u − v‖^2 + i‖u + iv‖^2 − i‖u − iv‖^2]    (3.3)
for a complex inner product space and
⟨u, v⟩ = (1/4)[‖u + v‖^2 − ‖u − v‖^2]
for a real inner product space.
From the above, it follows that in an inner product space the knowledge of the norm unambiguously
determines the inner product. In fact, one can show that if the parallelogram law holds in a normed
space, then one can define an inner product through (3.3).
Theorem 3.9. A normed space U is an inner product space iff the norm in U satisfies the
parallelogram law. The inner product is defined through (3.3).
Example 3.4. Let X = C[0, 1] with
‖f‖ = sup_{t ∈ [0,1]} |f(t)|, f ∈ X
Take f = t, g = 1. One can check that the parallelogram law is not satisfied, and hence the Banach space
C[0, 1] with respect to the above norm can never be made a Hilbert space.
Theorem 3.10. (Schwarz Inequality)
Let U be an inner product space. Then for all x, y ∈ U, we have the inequality
|⟨x, y⟩| ≤ ‖x‖ ‖y‖    (3.4)
Equality holds iff x and y are l.d.
Proof. If either of the elements x, y is zero, we are through. So, assume that x, y ≠ 0. Normalizing
y as e = y/‖y‖, the above inequality reduces to
|⟨x, e⟩| ≤ ‖x‖ for all x ∈ U    (3.5)
So, it suffices to show Eq. (3.5). We have
0 ≤ ⟨x − ⟨x, e⟩e, x − ⟨x, e⟩e⟩ = ⟨x, x⟩ − |⟨x, e⟩|^2
This gives Eq. (3.5). Also, if equality holds in Eq. (3.5), we get
⟨x − ⟨x, e⟩e, x − ⟨x, e⟩e⟩ = 0
This implies that
x − ⟨x, e⟩e = 0
and hence x, y are l.d. On the other hand, if x, y are l.d., then equality holds in Eq. (3.5). This proves the
theorem.
Let M be a closed subspace of a Hilbert space U. Given an element x ∈ U, we wish to obtain
the element u ∈ M which is closest to x. We have the following theorem in this direction.
Theorem 3.11. Suppose x ∈ U and M is a closed subspace of U. Then there exists a unique element
u ∈ M such that
‖x − u‖ = inf_{y ∈ M} ‖x − y‖
Proof. Let d = inf_{y ∈ M} ‖x − y‖. Then there exists a sequence {u_n} ⊆ M such that
‖x − u_n‖ → d
By the parallelogram law, we have
‖u_n − u_m‖^2 = ‖(u_n − x) − (u_m − x)‖^2
 = 2‖u_n − x‖^2 + 2‖u_m − x‖^2 − ‖u_n + u_m − 2x‖^2
 ≤ 2‖u_n − x‖^2 + 2‖u_m − x‖^2 − 4d^2
 → 2d^2 + 2d^2 − 4d^2 = 0
Here we used ‖u_n + u_m − 2x‖^2 = 4‖(u_n + u_m)/2 − x‖^2 ≥ 4d^2, since (u_n + u_m)/2 ∈ M.
That is, {u_n} is Cauchy, and since M is a closed subspace of the complete space U, it is complete; hence
u_n → u ∈ M. It is clear that
‖x − u‖ = lim_{n→∞} ‖x − u_n‖ = d
To prove the uniqueness of u, assume that there exists a v ∈ M which is also closest to x. As
before, we have
‖u − v‖^2 ≤ 2‖u − x‖^2 + 2‖v − x‖^2 − 4d^2 = 2d^2 + 2d^2 − 4d^2 = 0
This gives u = v.
In the following theorem M^⊥ denotes the space of all elements orthogonal to M.
Theorem 3.12. Let M be a closed subspace of a Hilbert space U. Then U = M ⊕ M^⊥.
Proof. Suppose x ∈ U and M is a closed subspace of U. Then by the previous theorem there exists a
unique element u ∈ M which is closest to x. Define v = x − u. Then, clearly x = u + v. Let w ∈ M and
t ∈ R be arbitrary, and let d = ‖x − u‖ = ‖v‖. Then
d^2 ≤ ‖x − (u + tw)‖^2 = ‖v − tw‖^2 = d^2 − 2t Re⟨v, w⟩ + t^2 ‖w‖^2
Thus −2t Re⟨v, w⟩ + t^2 ‖w‖^2 ≥ 0 for all t, thereby implying that Re⟨v, w⟩ = 0. Repeating the same
argument with it instead of t, we get Im⟨v, w⟩ = 0 and hence ⟨v, w⟩ = 0. This implies that
v ∈ M^⊥. Thus we have x = u + v, u ∈ M, v ∈ M^⊥. We claim that this representation x = u + v is
unique. If not, let
x = u_1 + v_1 = u_2 + v_2, u_1, u_2 ∈ M and v_1, v_2 ∈ M^⊥
This implies that
u_1 − u_2 = v_2 − v_1 ∈ M ∩ M^⊥
and hence
u_1 − u_2 = v_2 − v_1 = 0
This proves that U is the direct sum of M and M^⊥.
Definition 3.13. A set S ⊆ U is called an orthonormal set if e_α, e_β ∈ S, α ≠ β ⇒ ⟨e_α, e_β⟩ = 0,
and ‖e_α‖ = 1 for every e_α ∈ S.
We have the following property for orthonormal sets in U.
Theorem 3.14. Let {e_α}_{α ∈ A} be a collection of orthonormal elements in U. Then, for all x ∈ U,
we have
Σ_{α ∈ A} |⟨x, e_α⟩|^2 ≤ ‖x‖^2    (3.6)
and further
(x − Σ_{α ∈ A} ⟨x, e_α⟩ e_α) ⊥ e_β ∀ β ∈ A.
The inequality Eq. (3.6) is referred to as Bessel's inequality.
Definition 3.15. Let S be an orthonormal set in a Hilbert space U. Then S is called a basis for U
(or a complete orthonormal system) if no other orthonormal set contains S as a proper subset.
The following theorem presents the most important property of a complete orthonormal set.
Theorem 3.16. Let U be a Hilbert space and let S = {e_α}_{α ∈ A} be an orthonormal set in U. Then
the following are equivalent.
(1) S = {e_α}_{α ∈ A} is complete.
(2) 0 is the only vector which is orthogonal to every e_α ∈ S. That is, x ⊥ e_α for every α ⇒ x = 0.
(3) Every vector x ∈ U has the Fourier series expansion x = Σ_{α ∈ A} ⟨x, e_α⟩ e_α.
(4) Every vector x ∈ U satisfies Parseval's equality
‖x‖^2 = Σ_{α ∈ A} |⟨x, e_α⟩|^2    (3.7)
Proof. (1) ⇒ (2)
Suppose (2) is not true; then there exists x ≠ 0 such that x ⊥ e_α ∀ α ∈ A.
Defining e = x/‖x‖, we get an orthonormal set S ∪ {e} which properly contains S. This contradicts
the completeness of S.
(2) ⇒ (3)
By Theorem 3.14, x − Σ_{α ∈ A} ⟨x, e_α⟩ e_α is orthogonal to e_β for every β. But by (2), it must then be the
zero vector, and hence
x = Σ_{α ∈ A} ⟨x, e_α⟩ e_α
(3) ⇒ (4)
We have
‖x‖^2 = ⟨x, x⟩ = ⟨Σ_{α ∈ A} ⟨x, e_α⟩ e_α, Σ_{β ∈ A} ⟨x, e_β⟩ e_β⟩ = Σ_{α ∈ A} |⟨x, e_α⟩|^2
(4) ⇒ (1)
If S = {e_α}_{α ∈ A} is not complete, it is properly contained in an orthonormal set S̃; let e ∈ S̃ \ S.
Then Parseval's equality Eq. (3.7) gives
‖e‖^2 = Σ_{α ∈ A} |⟨e, e_α⟩|^2 = 0
contradicting the fact that e is a unit vector.
Example 3.5. In U = L^2[−π, π], the collection of functions
S = { 1/√(2π), cos t/√π, cos 2t/√π, ..., sin t/√π, sin 2t/√π, ... }
is a complete orthonormal set and every f ∈ L^2[−π, π] has a Fourier series expansion
f(t) = a_0/2 + Σ_{k=1}^{∞} [a_k cos kt + b_k sin kt]    (3.8)
where
a_k = (1/π) ∫_{−π}^{π} f(t) cos kt dt and b_k = (1/π) ∫_{−π}^{π} f(t) sin kt dt, k = 0, 1, 2, ...
The fact that the above set S forms an orthonormal basis in L^2[−π, π] will be proved elsewhere.
We also note that for any f ∈ L^2[−π, π] we have the Parseval relation
∫_{−π}^{π} f^2(t) dt = (f, 1)^2/(2π) + (1/π) Σ_{n=1}^{∞} [(f, cos nt)^2 + (f, sin nt)^2]
The Fourier series representation given by Eq. (3.8) will be used while discussing the solution of
boundary value problems.
For the piecewise continuous function f(t) defined on [−π, π] by
f(t) = −1 for −π ≤ t < 0 and f(t) = 1 for 0 ≤ t ≤ π,
we have the following Fourier series representation:
f(t) = (4/π) Σ_{n=0}^{∞} sin((2n + 1)t)/(2n + 1)
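A short numerical sketch of this expansion (number of terms and grid chosen for illustration): compute partial sums of the series above and compare them with the square wave.

```python
import numpy as np

t = np.linspace(-np.pi, np.pi, 2001)
f = np.where(t < 0, -1.0, 1.0)                  # the square wave of the example

def partial_sum(t, terms):
    """Partial sum (4/pi) * sum_{n=0}^{terms-1} sin((2n+1)t)/(2n+1)."""
    s = np.zeros_like(t)
    for n in range(terms):
        s += np.sin((2 * n + 1) * t) / (2 * n + 1)
    return 4.0 / np.pi * s

for terms in (1, 5, 50):
    # root-mean-square error over [-pi, pi], scaled to approximate the L2 norm
    err = np.sqrt(np.mean((f - partial_sum(t, terms)) ** 2) * 2 * np.pi)
    print(f"{terms:3d} terms: L2 error ~ {err:.3f}")
```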
3.3 Gram-Schmidt Orthonormalisation Procedure
We now give the Gram-Schmidt procedure, which produces an orthonormal collection
e_1, e_2, ..., e_n, ... out of a l.i. collection x_1, x_2, ..., x_n, .... We first obtain
orthogonal vectors y_1, y_2, ..., y_n, ... as follows.
y_1 = x_1
...
y_i = x_i − α_{i,1} y_1 − α_{i,2} y_2 − ... − α_{i,i−1} y_{i−1}
where
α_{i,j} = ⟨x_i, y_j⟩ / ⟨y_j, y_j⟩, 1 ≤ j ≤ i − 1
This is continued inductively for i + 1, i + 2, ..., n. It is clear that
L[x_1, x_2, ..., x_n] = L[y_1, y_2, ..., y_n]
for each n. Normalizing the y_i, we get the orthonormal collection e_1, e_2, ..., e_n, ..., with e_i = y_i/‖y_i‖.
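A minimal implementation of this procedure for vectors in R^n (classical Gram-Schmidt, exactly as in the formulas above; in floating point a modified variant is usually preferred, but this sketch mirrors the text):

```python
import numpy as np

def gram_schmidt(X):
    """Orthonormalise the linearly independent columns of X (classical Gram-Schmidt).

    Returns E whose orthonormal columns span the same space as the columns of X.
    """
    X = np.asarray(X, dtype=float)
    E = []
    for i in range(X.shape[1]):
        y = X[:, i].copy()
        for e in E:
            # subtracting <x_i, e> e is the same as alpha_{i,j} y_j after normalisation
            y -= np.dot(X[:, i], e) * e
        E.append(y / np.linalg.norm(y))
    return np.column_stack(E)

X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
E = gram_schmidt(X)
print(np.allclose(E.T @ E, np.eye(3)))   # True: the columns are orthonormal
```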
Example 3.6. In the space L^2[−1, 1], the set of polynomials P_0(t), P_1(t), ..., called the Legendre
polynomials, is obtained by the Gram-Schmidt orthogonalisation procedure. We first compute the
polynomials p_0, p_1, p_2, p_3 and p_4.
p_0(t) = 1, p_1(t) = t
p_2(t) = t^2 − α_0 p_0(t) − α_1 p_1(t), with α_0 = (t^2, 1)/(1, 1) = 1/3 and α_1 = (t^2, t)/(t, t) = 0,
and hence
p_2(t) = t^2 − 1/3
p_3(t) = t^3 − α_0 p_0(t) − α_1 p_1(t) − α_2 p_2(t), with α_0 = (t^3, 1)/(1, 1) = 0, α_1 = (t^3, t)/(t, t) = 3/5,
α_2 = 0, and hence
p_3(t) = t^3 − (3/5)t
p_4(t) = t^4 − α_0 p_0(t) − α_1 p_1(t) − α_2 p_2(t) − α_3 p_3(t), with
α_0 = (t^4, 1)/(1, 1) = 1/5, α_1 = 0, α_2 = (t^4, p_2)/(p_2, p_2) = 6/7, α_3 = 0,
and hence
p_4(t) = t^4 − (6/7)(t^2 − 1/3) − 1/5 = t^4 − (6/7)t^2 + 3/35
Normalising these polynomials (so that P_i(1) = 1), we get the Legendre polynomials P_i(t), 0 ≤ i ≤ 4,
given by P_0(t) = 1, P_1(t) = t, P_2(t) = (1/2)(3t^2 − 1), P_3(t) = (1/2)(5t^3 − 3t),
P_4(t) = (1/8)(35t^4 − 30t^2 + 3).
The higher degree Legendre polynomials are obtained in a similar way.
Since (P_i, P_i) = 2/(2i + 1), for f ∈ L^2[−1, 1] we have the Legendre series representation
f(t) = Σ_{i=0}^{∞} a_i P_i(t), a_i = (f, P_i)/(P_i, P_i) = ((2i + 1)/2) ∫_{−1}^{1} f(t) P_i(t) dt, i ≥ 0
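A short numerical check of the orthogonality of P_0, ..., P_4 on [−1, 1] and of the norms (P_i, P_i) = 2/(2i + 1) (the grid and maximum degree are chosen for illustration):

```python
import numpy as np
from numpy.polynomial import legendre

t = np.linspace(-1, 1, 200001)
P = [legendre.Legendre.basis(i)(t) for i in range(5)]   # values of P_0, ..., P_4

dt = t[1] - t[0]
# Gram matrix of inner products (P_i, P_j) approximated by a Riemann sum
G = np.array([[np.sum(P[i] * P[j]) * dt for j in range(5)] for i in range(5)])
print(np.round(G, 3))   # ~ diag(2, 2/3, 2/5, 2/7, 2/9); off-diagonal entries vanish
```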
4 Linear Transformations on Finite Dimensional Spaces
4.1 Linear Transformations and Matrices
Definition 4.1. Let T : U → V be a linear transformation with dim(U) = n and dim(V) = m.
Let B_1 = {u_1, u_2, ..., u_n} be a basis for U and B_2 = {v_1, v_2, ..., v_m} be a basis for V. Then there
exist scalars α_{ij} ∈ K, 1 ≤ i ≤ m, 1 ≤ j ≤ n, such that
Tu_j = Σ_{i=1}^{m} α_{ij} v_i, 1 ≤ j ≤ n    (4.1)
Then (α_{ij}) is called the matrix of T with respect to the basis B_1 of U and the basis B_2 of V and is
denoted by m(T). That is
m(T) =
[ α_11  α_12  ...  α_1n ]
[ α_21  α_22  ...  α_2n ]
[  ...   ...  ...   ... ]
[ α_m1  α_m2  ...  α_mn ]
Conversely, let A = (α_{ij}) be an m × n matrix. Let T : U → V be defined through Eq. (4.1) by its
action on the elements of the basis B_1 = {u_1, u_2, ..., u_n} for U. Since a linear operator is
uniquely determined by its action on the basis elements, T defined by (4.1) satisfies T ∈ L(U, V).
Thus, there is a 1-1 correspondence between T ∈ L(U, V) and m × n matrices A.
Also, from the definition of the matrix associated with a linear operator it follows that if
m(T) = (α_{ij}) and m(S) = (β_{ij}), then
m(αT + βS) = α(α_{ij}) + β(β_{ij}), α, β ∈ K
Thus, we immediately get the following theorem.
Theorem 4.2. The vector space L(U, V) is isomorphic to the space M_{m×n} of all m × n matrices.
Corollary 4.1.
dim L(U, V) = mn
We now focus on L(V) and examine the composition operation on it. Let T, S, R ∈ L(V) with
m(T) = (α_{ij}), m(S) = (β_{ij}) and m(R) = (γ_{ij}). Let R = TS. We will show that γ_{ij} = Σ_k α_{ik} β_{kj}.
We have
Rv_j = T(Sv_j)
 = T(Σ_i β_{ij} v_i)
 = Σ_i β_{ij} T(v_i)
 = Σ_i β_{ij} (Σ_k α_{ki} v_k)
Interchanging the indices i, k we get
Rv_j = Σ_k β_{kj} (Σ_i α_{ik} v_i)
 = Σ_i (Σ_k α_{ik} β_{kj}) v_i
 = Σ_i γ_{ij} v_i
This shows that m(R) = (γ_{ij}) with γ_{ij} = Σ_k α_{ik} β_{kj}, that is, m(TS) = m(T) m(S). Further, if T is
invertible, then TT^{-1} = T^{-1}T = I. This gives us m(TT^{-1}) = m(T) m(T^{-1}) = I and hence
m(T^{-1}) = [m(T)]^{-1}.
This can be stated as a theorem.
Theorem 4.3. The composition operation in L(V) corresponds to the multiplication operation in
M_{n×n}. Further, if T is invertible, then m(T^{-1}) = [m(T)]^{-1}.
Let T ∈ L(V), and let m_1(T) be the matrix of T with respect to a basis B_1 and m_2(T) be the
matrix of T with respect to a basis B_2. We wish to find the relation between m_1(T) and m_2(T).
Theorem 4.4. Let T ∈ L(V), let A = m_1(T) be the matrix of T with respect to the basis
B_1 = {v_1, v_2, ..., v_n} and let B = m_2(T) be the matrix of T with respect to the basis
B_2 = {w_1, w_2, ..., w_n}. Then there exists a nonsingular matrix C such that
B = C^{-1} A C
Proof. Let S be the linear operator which maps the basis elements of B_1 to the basis elements of B_2,
that is
Sv_i = w_i, 1 ≤ i ≤ n
As S maps a basis to a basis, by Theorem 2.9 we get that S is invertible. Let C = m(S) and
C^{-1} = m(S^{-1}). Let A = (α_{ij}), B = (β_{ij}). We have Tw_i = Σ_j β_{ji} w_j and w_i = Sv_i. This gives us
TSv_i = Σ_j β_{ji} (Sv_j) = S(Σ_j β_{ji} v_j)
Since S is invertible, we get [S^{-1}TS]v_i = S^{-1}S(Σ_j β_{ji} v_j). So, we get
[S^{-1}TS]v_i = Σ_j β_{ji} v_j ⇒ m(S^{-1}TS) = (β_{ij}) = B
By Theorem 4.3, m(S^{-1}TS) = m(S^{-1}) m(T) m(S) = [m(S)]^{-1} m(T) m(S) and hence
B = C^{-1} A C.
Definition 4.5. The matrices A and B are said to be similar if there exists an invertible matrix C
such that B = C^{-1} A C.
Remark 4.1. The above theorem states that the matrices of T corresponding to different bases are
similar.
Example 4.1. Let D : P_n → P_n be the differential operator D(p(x)) = p′(x). Then
m_1(D) =
[ 0  1  0  ...  0 ]
[ 0  0  2  ...  0 ]
[ .  .  .  ...  . ]
[ 0  0  0  ...  n ]
[ 0  0  0  ...  0 ]
with respect to the basis B_1 = {1, x, x^2, ..., x^n} for P_n.
If we use B_2 = {1, 1 + x, 1 + x^2, ..., 1 + x^n} as a basis for P_n, then, since D(1 + x) = 1 and
D(1 + x^j) = j x^{j−1} = j(1 + x^{j−1}) − j for j ≥ 2,
m_2(D) =
[ 0  1  −2  −3  ...  −n ]
[ 0  0   2   0  ...   0 ]
[ 0  0   0   3  ...   0 ]
[ .  .   .   .  ...   . ]
[ 0  0   0   0  ...   n ]
[ 0  0   0   0  ...   0 ]
is the matrix of D with respect to the basis B_2. The linear transformation which maps B_1 to B_2
has the matrix representation
C =
[ 1  1  1  ...  1 ]
[ 0  1  0  ...  0 ]
[ 0  0  1  ...  0 ]
[ .  .  .  ...  . ]
[ 0  0  0  ...  1 ]
One can check that m_1(D) and m_2(D) are similar through the invertible matrix C. That is,
m_2(D) = C^{-1} m_1(D) C.
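A concrete check for n = 3 (the matrices below are just the n = 3 instances of the pattern above):

```python
import numpy as np

# D on P_3 with respect to the basis B1 = {1, x, x^2, x^3}
m1 = np.array([[0, 1, 0, 0],
               [0, 0, 2, 0],
               [0, 0, 0, 3],
               [0, 0, 0, 0]], dtype=float)

# change of basis B1 -> B2 = {1, 1+x, 1+x^2, 1+x^3}
C = np.array([[1, 1, 1, 1],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)

m2 = np.linalg.inv(C) @ m1 @ C
print(m2)
# [[ 0.  1. -2. -3.]
#  [ 0.  0.  2.  0.]
#  [ 0.  0.  0.  3.]
#  [ 0.  0.  0.  0.]]
```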
4.2 Eigenvalues and Eigenvectors
Definition 4.6. λ ∈ K is called a characteristic root or eigenvalue of T iff there exists v ≠ 0 in V
such that Tv = λv. This v ≠ 0 in V is called a characteristic vector or eigenvector of T
corresponding to the characteristic root or eigenvalue λ.
Remark 4.2. If λ ∈ K is a characteristic root of T, then λ^k is a characteristic root of T^k for all
positive integers k. Further, if p(x) = a_0 x^n + a_1 x^{n−1} + ... + a_n (a_i ∈ K, 0 ≤ i ≤ n) is any
polynomial, then [a_0 T^n + a_1 T^{n−1} + ... + a_n I]v = [a_0 λ^n + a_1 λ^{n−1} + ... + a_n]v and hence p(λ) is a
characteristic root of p(T).
We first observe that if T ∈ L(V) with dim(V) = n, then T satisfies a nontrivial polynomial (with
coefficients in K) of degree at most n^2. This follows from the fact that I, T, T^2, ..., T^{n^2} are (n^2 + 1)
elements in the vector space L(V) of dimension n^2.
Theorem 4.7. The eigenvectors corresponding to distinct eigenvalues of T ∈ L(V) are linearly
independent.
Proof. Let v_1, v_2, ..., v_n be eigenvectors of T ∈ L(V) with λ_1, λ_2, ..., λ_n as the corresponding
distinct eigenvalues. This means that
Tv_i = λ_i v_i, v_i ≠ 0, for 1 ≤ i ≤ n
We proceed by induction on n. Obviously, the result is true for n = 1, as v_1 ≠ 0. Assume that the
result is true for all i < k. We wish to show that it is also true for i = k. So, let
α_1 v_1 + α_2 v_2 + ... + α_k v_k = 0    (4.2)
Operating on the above equation by T, we get
α_1 Tv_1 + α_2 Tv_2 + ... + α_k Tv_k = 0    (4.3)
As v_i is an eigenvector of T corresponding to the eigenvalue λ_i, 1 ≤ i ≤ k, it follows that
α_1 λ_1 v_1 + α_2 λ_2 v_2 + ... + α_k λ_k v_k = 0    (4.4)
Multiplying Eq. (4.2) by λ_k and subtracting Eq. (4.4) from it, we get
(λ_k − λ_1)α_1 v_1 + (λ_k − λ_2)α_2 v_2 + ... + (λ_k − λ_{k−1})α_{k−1} v_{k−1} = 0    (4.5)
As v_1, v_2, ..., v_{k−1} are assumed to be linearly independent, we get that
(λ_k − λ_i)α_i = 0, 1 ≤ i ≤ k − 1
Since the eigenvalues of T are distinct, it follows that α_i = 0, 1 ≤ i ≤ k − 1. This in turn implies that
α_k = 0, in view of Eq. (4.2) and v_k ≠ 0. This proves the linear independence of the eigenvectors of T.
This gives us the following corollaries.
Corollary 4.2. T ∈ L(V) can have at most n distinct characteristic roots.
Corollary 4.3. If T ∈ L(V) has n distinct characteristic roots, then V has a basis consisting of
characteristic vectors of T.
We now show how to diagonalize a matrix when its eigenvectors are l.i. Let v^(1), v^(2), ..., v^(n) be
the linearly independent eigenvectors of a matrix A with λ_1, λ_2, ..., λ_n as the corresponding eigenvalues.
Define the matrix C as
C = [ v^(1)  v^(2)  ...  v^(n) ]
As the v^(i) (1 ≤ i ≤ n) are linearly independent, C is nonsingular, and further we have
AC = [ Av^(1)  Av^(2)  ...  Av^(n) ] = [ λ_1 v^(1)  λ_2 v^(2)  ...  λ_n v^(n) ] = CD
where D = diag(λ_1, λ_2, ..., λ_n). This implies that
C^{-1} A C = D = diag(λ_1, λ_2, ..., λ_n)
Thus A is similar to a diagonal matrix.
Example 4.2. Let
A =
[ −2   2  −3 ]
[  2   1  −6 ]
[ −1  −2   0 ]
A has eigenvalues −3, −3, 5 with linearly independent eigenvectors
(2, −1, 0)^T, (3, 0, 1)^T, (1, 2, −1)^T
Therefore,
C =
[  2  3   1 ]
[ −1  0   2 ]
[  0  1  −1 ]
with
C^{-1} =
[ 1/4  −1/2  −3/4 ]
[ 1/8   1/4   5/8 ]
[ 1/8   1/4  −3/8 ]
It can be seen that A can be diagonalized as
C^{-1} A C =
[ −3   0  0 ]
[  0  −3  0 ]
[  0   0  5 ]
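A quick verification of this diagonalization with NumPy:

```python
import numpy as np

A = np.array([[-2.0,  2.0, -3.0],
              [ 2.0,  1.0, -6.0],
              [-1.0, -2.0,  0.0]])
C = np.array([[ 2.0, 3.0,  1.0],
              [-1.0, 0.0,  2.0],
              [ 0.0, 1.0, -1.0]])      # columns are the eigenvectors above

D = np.linalg.inv(C) @ A @ C
print(np.round(D, 10))                  # diag(-3, -3, 5)
print(np.sort(np.linalg.eigvals(A)))    # [-3, -3, 5]
```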
Symmetric Matrices
Theorem 4.8. Let A be a (real) symmetric matrix. Then
(i) the eigenvalues of A are real, and
(ii) the eigenvectors corresponding to distinct eigenvalues are orthogonal.
Proof. Let λ be an eigenvalue of A with eigenvector v. This implies that
Av = λv, v ≠ 0
Taking the inner product of the above equation with v, we get
⟨Av, v⟩ = λ⟨v, v⟩ and ⟨v, Av⟩ = λ̄⟨v, v⟩
Subtracting, we get
⟨Av, v⟩ − ⟨v, Av⟩ = (λ − λ̄)⟨v, v⟩
Since A^T = A, we have ⟨v, Av⟩ = ⟨A^T v, v⟩ = ⟨Av, v⟩, so the left hand side is zero and hence
(λ − λ̄)⟨v, v⟩ = 0. That is, λ is real.
Let μ (≠ λ) ∈ R be another eigenvalue of A, with eigenvector w ≠ 0. This implies that
Av = λv, Aw = μw
Taking inner products of the above equations with w and v, respectively, we get
⟨Av, w⟩ = λ⟨v, w⟩ and ⟨v, Aw⟩ = μ⟨v, w⟩
Subtracting, we get
⟨Av, w⟩ − ⟨v, Aw⟩ = (λ − μ)⟨v, w⟩
Since A^T = A, we have ⟨v, Aw⟩ = ⟨A^T v, w⟩ = ⟨Av, w⟩, and as (λ − μ) ≠ 0 we get that ⟨v, w⟩ = 0.
This proves that v and w are orthogonal.
Definition 4.9. A matrix U is said to be orthogonal if
U^T U = U U^T = I
Theorem 4.10. A matrix U is an orthogonal matrix iff its column vectors form an orthonormal
collection.
Proof. Let U be a matrix with column vectors u^(1), u^(2), ..., u^(n). Then U^T has rows
u^(1)T, u^(2)T, ..., u^(n)T, and
U^T U =
[ u^(1)T u^(1)   u^(1)T u^(2)  ...  u^(1)T u^(n) ]
[ u^(2)T u^(1)   u^(2)T u^(2)  ...  u^(2)T u^(n) ]
[      ...            ...      ...       ...     ]
[ u^(n)T u^(1)   u^(n)T u^(2)  ...  u^(n)T u^(n) ]
This matrix equals I iff u^(i)T u^(j) = 0 for i ≠ j and u^(i)T u^(i) = 1 for every i, that is, iff {u^(i)} is an
orthonormal collection.
Theorem 4.11. (Spectral Theorem) For a symmetric matrix A, we have the eigenvalue
decomposition
A = Σ_{i=1}^{n} λ_i u_i u_i^T = U D U^T, D = diag(λ_1, λ_2, ..., λ_n)
where the matrix U = [u_1, u_2, ..., u_n] is orthogonal and contains the eigenvectors of A, while the
diagonal matrix D contains the eigenvalues of A.
Proof. We prove the result by induction on the size n of the matrix A. By the fundamental theorem of
algebra, the characteristic polynomial of the symmetric matrix A, whose roots are the eigenvalues of A,
has at least one root λ_1. Since A is symmetric, this root is real. Correspondingly, there exists u_1 ≠ 0
such that Au_1 = λ_1 u_1; we normalize this u_1. So, the result is true for n = 1. Now, assume that the
result is true for matrices of size n − 1. Begin with the eigenvalue λ_1 and associated eigenvector u_1.
Using the Gram-Schmidt orthogonalisation procedure, we get an n × (n − 1) matrix V_1 such that
[u_1, V_1] is orthogonal. We define the (n − 1) × (n − 1) symmetric matrix B = V_1^T A V_1. By induction,
B = V_1^T A V_1 = Q_1 D_1 Q_1^T, D_1 = diag(λ_2, λ_3, ..., λ_n)
where the matrix Q_1 is orthogonal (its columns are the eigenvectors of B), while the diagonal matrix D_1
contains the eigenvalues of B and hence the remaining eigenvalues of A. Define the n × (n − 1) matrix
U_1 = V_1 Q_1 and the n × n matrix U = [u_1, U_1]. By construction U is orthogonal and
U^T A U = [ u_1^T ; U_1^T ] A [u_1, U_1]
 = [ u_1^T A u_1   u_1^T A U_1 ]
   [ U_1^T A u_1   U_1^T A U_1 ]
 = [ λ_1   0  ]
   [  0   D_1 ]
using the facts that U_1^T A u_1 = λ_1 U_1^T u_1 = 0 and U_1^T A U_1 = Q_1^T (V_1^T A V_1) Q_1 = Q_1^T B Q_1 = D_1.
This completes the induction process.
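A quick numerical illustration of the spectral decomposition (the symmetric matrix below is a random example, not from the text):

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                        # a random symmetric matrix

lam, U = np.linalg.eigh(A)               # eigendecomposition of a symmetric matrix
D = np.diag(lam)

print(np.allclose(U @ D @ U.T, A))       # A = U D U^T
print(np.allclose(U.T @ U, np.eye(4)))   # U is orthogonal
# the rank-one expansion A = sum_i lambda_i u_i u_i^T
print(np.allclose(sum(lam[i] * np.outer(U[:, i], U[:, i]) for i in range(4)), A))
```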
Definition 4.12. A symmetric matrix A is said to be positive semi-definite if
⟨Au, u⟩ ≥ 0 ∀ u ∈ R^n
It is said to be positive definite if the above inequality is strict for u ≠ 0.
f(u) = ⟨Au, u⟩ is called the quadratic form in u. Accordingly, the quadratic form ⟨Au, u⟩ is called
positive semi-definite (definite) if A is positive semi-definite (definite).
Theorem 4.13. A symmetric matrix A is positive semi-definite (definite) iff all eigenvalues of A
are non-negative (positive).
Proof. Let the eigenvalues of A be non-negative. As the eigenvectors of a symmetric matrix form an
orthonormal basis, every u ∈ R^n can be written as
u = a_1 u^(1) + ... + a_n u^(n), a_i ∈ R
where ⟨u^(i), u^(j)⟩ = δ_{ij} and Au^(i) = λ_i u^(i). This gives us
⟨Au, u⟩ = ⟨Σ_{i=1}^{n} a_i Au^(i), Σ_{j=1}^{n} a_j u^(j)⟩
 = Σ_{i=1}^{n} a_i^2 ⟨Au^(i), u^(i)⟩
 = Σ_{i=1}^{n} λ_i a_i^2 ⟨u^(i), u^(i)⟩
 = Σ_{i=1}^{n} λ_i a_i^2 ≥ 0, since λ_i ≥ 0,
and hence A is positive semi-definite.
Assume now that A is positive semi-definite and let λ be an eigenvalue of A with eigenvector
u ≠ 0. That is, Au = λu, u ≠ 0. This gives us
λ = ⟨Au, u⟩ / ⟨u, u⟩ ≥ 0
The corresponding statement for positive definiteness follows in the same way. Here the λ_i are the
eigenvalues of A, not necessarily distinct.
This gives us the following corollary.
Corollary 4.4. If a symmetric matrix A is positive definite, then we have the following Rayleigh
inequality:
λ_min ‖u‖^2 ≤ ⟨Au, u⟩ ≤ λ_max ‖u‖^2, u ∈ R^n    (4.6)
Here λ_min, λ_max are the minimum and maximum of the eigenvalues of A, respectively.
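These bounds are easy to check numerically. The sketch below builds a random symmetric positive definite matrix (an illustrative construction, not from the text) and verifies the Rayleigh inequality on random vectors.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)          # symmetric positive definite by construction

lam = np.linalg.eigvalsh(A)          # eigenvalues of a symmetric matrix, ascending
lam_min, lam_max = lam[0], lam[-1]

for _ in range(1000):
    u = rng.standard_normal(5)
    q = u @ A @ u                    # <Au, u>
    assert lam_min * (u @ u) - 1e-9 <= q <= lam_max * (u @ u) + 1e-9
print("Rayleigh inequality verified:", lam_min, "<= <Au,u>/<u,u> <=", lam_max)
```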
4.3 Linear Equations - Solvability Analysis
Basic Solvability Result. Let V = L[v_1, v_2, ..., v_n] and W = L[w_1, w_2, ..., w_m] be real vector
spaces. Let T ∈ L(V, W), let A = (α_{ij})_{m×n} be the matrix of T with respect to the given bases, and let
b = (b_1, b_2, ..., b_m) ∈ R^m. We wish to determine x = (x_1, x_2, ..., x_n) ∈ R^n which solves the
nonhomogeneous system
α_11 x_1 + α_12 x_2 + ... + α_1n x_n = b_1
α_21 x_1 + α_22 x_2 + ... + α_2n x_n = b_2
 ...
α_i1 x_1 + α_i2 x_2 + ... + α_in x_n = b_i
 ...
α_m1 x_1 + α_m2 x_2 + ... + α_mn x_n = b_m    (4.7)
that is,
Ax = b    (4.8)
The solvability of this nonhomogeneous system is linked with the solvability of the homogeneous
system
α_11 x_1 + α_12 x_2 + ... + α_1n x_n = 0
α_21 x_1 + α_22 x_2 + ... + α_2n x_n = 0
 ...
α_i1 x_1 + α_i2 x_2 + ... + α_in x_n = 0
 ...
α_m1 x_1 + α_m2 x_2 + ... + α_mn x_n = 0    (4.9)
that is,
Ax = 0    (4.10)
The solvability of the nonhomogeneous system (4.8) is equivalent to the solvability of the following
operator equation for v ∈ V:
Tv = b    (4.11)
where v ∈ V has coordinates (x_1, x_2, ..., x_n) ∈ R^n and b ∈ W has coordinates
(b_1, b_2, ..., b_m) ∈ R^m. The following theorem is immediate.
Theorem 4.14. (a) If b ∉ R(T), then Eq. (4.11) has no solution.
(b) If b ∈ R(T), then Eq. (4.11) has a solution. It is unique if N(T) = {0}.
(c) If b ∈ R(T) and N(T) ≠ {0}, then Eq. (4.11) has infinitely many solutions and these solutions
are given by the set S = v_0 + N(T), where v_0 is a particular solution of Eq. (4.11).
We first state a theorem giving a relationship between the rank of T and that of its adjoint T*.
Theorem 4.15. Let V and W be as defined above and let T ∈ L(V, W). Then r(T) = r(T*).
Proof. We have [N(T)]^⊥ = R(T*). Hence r(T*) = dim [N(T)]^⊥ = n − n(T) = r(T).
We now derive a relationship between the rank of T and the linearly independent rows and
columns of the matrix A = m(T). We have R(T) = L[Tv_1, Tv_2, ..., Tv_n], and r(T) is the maximal
number of linearly independent elements among Tv_1, Tv_2, ..., Tv_n. Observe that Tv_j = Σ_{i=1}^{m} α_{ij} w_i.
Hence, the coordinates of Tv_j are (α_{1j}, α_{2j}, ..., α_{mj}). Thus Tv_1, Tv_2, ..., Tv_n correspond to the
column vectors (α_11, α_21, ..., α_m1)^T, (α_12, α_22, ..., α_m2)^T, ..., (α_1n, α_2n, ..., α_mn)^T of the matrix
A =
[ α_11  α_12  ...  α_1n ]
[ α_21  α_22  ...  α_2n ]
[  ...   ...  ...   ... ]
[ α_m1  α_m2  ...  α_mn ]
Hence r(T) equals the maximal number of linearly independent columns of A = m(T). Similarly,
r(T*) equals the maximal number of linearly independent rows of A = m(T). By the previous theorem,
r(T) = r(T*), and hence the maximal number of linearly independent columns of A equals the maximal
number of linearly independent rows of A. This motivates us to define the rank of a matrix as follows.
Definition 4.16. Let A be any m × n matrix. Its rank r(A) is defined as the maximal number of
linearly independent columns of A, or equivalently the maximal number of linearly independent rows of A.
Denote r(A) by r. We shall denote by (A, b) the matrix A augmented by b. As the solvability of
the matrix equation Eq. (4.8) is equivalent to the solvability of the operator equation Eq. (4.11), it
follows that (4.8) is solvable iff b ∈ R(T). Note that b ∈ R(T) iff r(A) = r(A, b). This gives us the
following solvability result for Eqs. (4.8)-(4.10).
Theorem 4.17. (a) The nonhomogeneous system (4.8) has a solution iff r = r(A, b).
(b) The nonhomogeneous system (4.8) has a solution for all b ∈ R^m iff r = m.
(c) The nonhomogeneous system (4.8) has a unique solution for all b ∈ R^m iff r = m = n. In this
case, the homogeneous system has only the zero solution.
(d) Let r = m and m < n. Then the nonhomogeneous system (4.8) as well as the homogeneous
system (4.10) have infinitely many solutions for all b ∈ R^m.
(e) Let r < m and let b ∈ R(A). Then in all three cases (Case 1: r < m = n, Case 2: r < m < n,
Case 3: r < n < m), both the nonhomogeneous system (4.8) and the homogeneous system (4.10) have
infinitely many solutions.
(f) Let r = n < m. Then the homogeneous system (4.10) has only the trivial solution. Further, if
b ∈ R(A), then the nonhomogeneous system (4.8) has a unique solution.
(g) If m = n, then the homogeneous system (4.10) has only the trivial solution iff A is nonsingular.
Proof. (a) Obvious.
(b) r = m implies that dim(R(A)) = m = dim(W). Hence R(A) = W and so A is onto.
For (c)-(f), we use the Rank-Nullity theorem: n(A) + r = n.
(c) r = m = n ⇒ n(A) = 0. This gives N(A) = {0} and hence the uniqueness of solutions of both the
homogeneous and nonhomogeneous equations (4.10) and (4.8), respectively.
(d) r = m ⇒ n(A) = n − m > 0. This gives N(A) ≠ {0} and hence both the homogeneous and
nonhomogeneous equations (4.10) and (4.8) have infinitely many solutions.
(e) b ∈ R(A) implies that the nonhomogeneous system (4.8) has a solution. Further, all three cases
imply that n(A) = n − r > 0. Hence N(A) ≠ {0} and so the solution set of both the homogeneous and
nonhomogeneous equations is infinite.
(f) r = n < m ⇒ n(A) = 0. Hence the homogeneous system (4.10) has only the trivial solution and, if
b ∈ R(A), then the nonhomogeneous system (4.8) has a solution and this solution is unique.
(g) m = n ⇒ A is a square matrix, and hence the homogeneous system (4.10) has only the trivial
solution iff A is nonsingular.
It is known that a square matrix A is nonsingular iff det A ≠ 0. Hence, we immediately get the
following corollary from part (g) of the above theorem.
Corollary 4.5. Let A be a square matrix. The homogeneous system (4.10) has only the trivial solution
iff det A ≠ 0.
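In practice these conditions are checked by comparing r(A) with r(A, b). A small helper sketch (the example system is illustrative, not from the text) classifying a system according to Theorem 4.17:

```python
import numpy as np

def classify(A, b):
    """Report solvability of Ax = b from rank comparisons (Theorem 4.17)."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    m, n = A.shape
    r = np.linalg.matrix_rank(A)
    r_aug = np.linalg.matrix_rank(np.column_stack([A, b]))
    if r != r_aug:
        return "no solution (b not in R(A))"
    return "unique solution" if r == n else "infinitely many solutions"

A = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 6.0]]          # rank 1, m = 2 < n = 3
print(classify(A, [1.0, 2.0])) # consistent: infinitely many solutions
print(classify(A, [1.0, 3.0])) # inconsistent: no solution
```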
5 Some Sample Texts and Notes
Published Texts
[1] Introduction to Differential Equations and Linear Algebra, Stephen W Goode, Prentice Hall, 1991
[2] Finite Dimensional Vector Spaces, Paul R Halmos, Affiliated East-West Press Pvt. Ltd., 1965
[3] Linear Algebra, Jim Hefferon, Virginia Commonwealth University Mathematics, 2009
[4] Introduction to Linear Algebra, Lee W Johnson, Jimmy Arnold and R Dean Riess, Pearson, 2001
[5] Introduction to Linear Algebra: An Applied First Course, B Kolman and D Hill, Pearson, 2001
[6] Linear Algebra: A Geometric Approach, S Kumaresan, Prentice Hall of India, 1999
[7] Introduction to Linear Algebra, Serge Lang, Springer, 1985
[8] Linear Algebra and Its Applications, David Lay, Addison-Wesley, 2011
[9] Applied Linear Algebra, Ben Noble, Prentice Hall, 1969
[10] Linear Algebra and Its Applications, Gilbert Strang, Thomson Brooks/Cole, 2005
Lecture Notes
[1] Notes on Linear Algebra (2008), Peter J Cameron, www.maths.qmul.ac.uk
[2] Linear Algebra, Jim Hefferon (2009), www.mathematik.uni-muenchen.de
[3] Linear Algebra, Theory and Applications (2012), Kenneth Kuttler, www.math.byu.edu
[4] Linear Algebra in Twenty Five Lectures, Tom Denton and Andrew Waldron (2012),
www.math.ucdavis.edu