
Chapter 2

Inner Product Space

Syllabus Inner product space: Definition and properties of inner product space, orthogonality, Cauchy–Schwarz inequality, Norm and Orthogonal Basis, and Gram–Schmidt orthonormalisation. Schur's Theorem, Linear functional, Riesz representation theorem, orthogonal complement or dual subspace, Singular value and singular vectors, singular value decomposition.

Definition 2.1 Dot product For x, y ∈ Rn , the dot product of x and y, denoted x · y,
is defined by x · y = x1 y1 + x2 y2 + · · · + xn yn , where x = (x1 , x2 , · · · , xn ) and y =
(y1 , y2 , · · · , yn ).

Note The dot product of two vectors in Rⁿ is a number, not a vector. Clearly
x · x = ∥x∥² for all x ∈ Rⁿ, where ∥x∥ denotes the Euclidean norm of x. The dot product on Rⁿ has the following properties:

(a) x · x ≥ 0 for all x ∈ Rn ,

(b) x · x = 0 if and only if x = 0,

(c) For y ∈ Rn fixed, the map from Rn → R that sends x ∈ Rn to x · y is linear, and

(d) x · y = y · x for all x, y ∈ Rn .
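
These properties are easy to check numerically. Here is a small NumPy sketch (the vectors x and y are arbitrary illustrations, not taken from the text):

import numpy as np

x = np.array([1.0, -2.0, 3.0])
y = np.array([4.0, 0.5, -1.0])

print(np.dot(x, y))                                      # the dot product x . y, a number
print(np.isclose(np.dot(x, y), np.dot(y, x)))            # (d) symmetry
print(np.dot(x, x) >= 0)                                 # (a) positivity
print(np.isclose(np.dot(x, x), np.linalg.norm(x) ** 2))  # x . x = ||x||^2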

An inner product is a generalization of the dot product. At this point you may be
tempted to guess that an inner product is defined by abstracting the properties of the dot
product discussed above. For real vector spaces, that guess is correct. However, we need
to examine complex vector spaces before making the general definition, so that it will
be useful for both real and complex vector spaces.
Recall that if z = a + ib, where a, b ∈ R, then
(i) the absolute value of z, denoted |z|, is defined by |z| = √(a² + b²),


(ii) the complex conjugate of z, denoted z̄, is defined by z̄ = a − ib,

(iii) |z|2 = z z̄.

Definition 2.2 Inner product An inner product on a vector space V over F is a
function that takes each ordered pair (u, v) of elements of V to a number ⟨u, v⟩ ∈ F and
has the following properties:

(a) Positivity: ⟨v, v⟩ ≥ 0 for all v ∈ V ,

(b) Definiteness: ⟨v, v⟩ = 0 if and only if v = 0,

(c) Additivity in first slot: ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩ for all u, v, w ∈ V ,

(d) Homogeneity in first slot: ⟨au, v⟩ = a⟨u, v⟩ for all a ∈ F and all u, v ∈ V ,

(e) Conjugate symmetry: ⟨u, v⟩ = \overline{⟨v, u⟩} for all u, v ∈ V .

Every real number equals its complex conjugate. Thus, if we are dealing with a real
vector space, the last condition simply states that ⟨u, v⟩ = ⟨v, u⟩ for all u, v ∈ V .

Illus. 2.1 (a) The Euclidean inner product on Fⁿ is defined by
⟨(w1 , w2 , · · · , wn ), (z1 , z2 , · · · , zn )⟩ = w1 z̄1 + w2 z̄2 + · · · + wn z̄n .
(b) If c1 , c2 , · · · , cn are positive numbers, then an inner product can be defined on Fⁿ by
⟨(w1 , w2 , · · · , wn ), (z1 , z2 , · · · , zn )⟩ = c1 w1 z̄1 + c2 w2 z̄2 + · · · + cn wn z̄n .
(c) An inner product can be defined on the vector space of continuous real-valued functions
on the interval [−1, 1] by ⟨f, g⟩ = ∫_{−1}^{1} f (x)g(x) dx.
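
For readers who wish to experiment, here is a NumPy sketch of the Euclidean inner product of part (a) with complex entries; the helper name inner and the sample vectors are illustrative only. Note that the conjugate is taken on the second argument, matching the definition above:

import numpy as np

def inner(w, z):
    # <w, z> = w1*conj(z1) + ... + wn*conj(zn)
    return np.sum(np.asarray(w) * np.conj(z))

w = np.array([1 + 2j, 3j])
z = np.array([2 - 1j, 1 + 1j])

print(inner(w, z))                                             # a complex number
print(np.isclose(inner(w, z), np.conj(inner(z, w))))           # conjugate symmetry
print(inner(w, w).real >= 0, np.isclose(inner(w, w).imag, 0))  # positivity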

Definition 2.3 Inner product space An inner product space is a vector space V along
with an inner product defined on V .

The most important example of an inner product space is Fⁿ with the Euclidean inner
product defined in (a) of Illus. 2.1.
When Fⁿ is referred to as an inner product space, you should assume that the inner
product is the Euclidean inner product unless explicitly told otherwise. So that we do
not have to keep repeating the hypothesis that V is an inner product space, for the rest
of this chapter we make the following assumption:
Notation For the rest of this chapter, V denotes an inner product space over F ; when
the vector space is Fⁿ, the inner product on V is the Euclidean inner product unless
stated otherwise.

2.0.1 Basic Properties of Inner Product Spaces


(a) For each fixed u ∈ V , the function that takes v to ⟨v, u⟩ is a linear map from V to
F.

(b) ⟨0, u⟩ = 0 for every u ∈ V .

(c) ⟨u, 0⟩ = 0 for every u ∈ V .

(d) ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩ for all u, v, w ∈ V

(e) ⟨u, av⟩ = ā⟨u, v⟩ for all u, v ∈ V and all a ∈ F .

Proof: (a) follows from the conditions of additivity in the first slot and homogeneity in
the first slot in the definition of an inner product.
(b) follows from part (a) and the result that every linear map takes 0 to 0.
Part (c) follows from part (b) and the conjugate symmetry property in the definition of
an inner product.
(d) Suppose u, v, w ∈ V . Then
⟨u, v + w⟩ = \overline{⟨v + w, u⟩}
= \overline{⟨v, u⟩ + ⟨w, u⟩}
= \overline{⟨v, u⟩} + \overline{⟨w, u⟩}
= ⟨u, v⟩ + ⟨u, w⟩.
(e) Suppose a ∈ F and u, v ∈ V . Then
⟨u, av⟩ = \overline{⟨av, u⟩} = \overline{a ⟨v, u⟩}
= ā \overline{⟨v, u⟩} = ā ⟨u, v⟩.
Norms Our motivation for defining inner products came initially from the norms of
vectors in R² and R³. Now we see that each inner product determines a norm.

Definition 2.4 Norm For v ∈ V , the norm of v, denoted ∥v∥, is defined by
∥v∥ = √⟨v, v⟩.

Illus. 2.2 (a) If (z1 , z2 , · · · , zn ) ∈ Fⁿ (with the Euclidean inner product), then
∥(z1 , z2 , · · · , zn )∥ = √(|z1 |² + |z2 |² + · · · + |zn |²).
(b) In the vector space of continuous real-valued functions on [−1, 1] with inner product
defined as ⟨f, g⟩ = ∫_{−1}^{1} f (x)g(x) dx, we have
∥f ∥ = √( ∫_{−1}^{1} |f (x)|² dx ).
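
Both norms can be approximated numerically. A small Python sketch (the choice f(x) = x and the crude Riemann-sum quadrature are illustrative assumptions):

import numpy as np

# Norm induced by the Euclidean inner product on R^2
z = np.array([3.0, 4.0])
print(np.sqrt(np.dot(z, z)))            # 5.0, the same value as np.linalg.norm(z)

# Approximate the norm of f(x) = x on [-1, 1]; exactly, ||f||^2 = 2/3
x = np.linspace(-1.0, 1.0, 200001)
f = x
dx = x[1] - x[0]
print(np.sqrt(np.sum(f * f) * dx))      # approximately sqrt(2/3) = 0.8165...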

Basic properties of the norm


Suppose v ∈ V . Then

(a) ∥v∥ = 0 if and only if v = 0.


(b) ∥kv∥ = |k| ∥v∥ for all k ∈ F .
Proof: (b) ∥kv∥² = ⟨kv, kv⟩ = k⟨v, kv⟩ = k k̄ ⟨v, v⟩ = |k|² ∥v∥² = (|k| ∥v∥)².
Hence the result follows.

Definition 2.5 Orthogonality Two vectors u, v ∈ V are called orthogonal if ⟨u, v⟩ = 0.

Illus. 2.3 If u, v are nonzero vectors in R2 , then ⟨u, v⟩ = ∥u∥ ∥v∥ cos θ
where θ is the angle between u and v. Thus two vectors in R2 are orthogonal (with
respect to the usual Euclidean inner product) if and only if the cosine of the angle between
them is 0, which happens if and only if the vectors are perpendicular in the usual sense of
plane geometry.

Orthogonality and 0 vector


(a) In a vector space V , 0 is orthogonal to every vector in V .
(b) 0 is the only vector in V that is orthogonal to itself.

2.0.2 Pythagorean Theorem


Suppose u and v are orthogonal vectors in V . Then ∥u + v∥2 = ∥u∥2 + ∥v∥2
Proof We have
∥u + v∥2 = ⟨u + v, u + v⟩ = ⟨u, u + v⟩ + ⟨v, u + v⟩
= ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩
= ⟨u, u⟩ + ⟨v, v⟩ [since u, v are orthogonal, ⟨u, v⟩ = 0 and ⟨v, u⟩ = 0.]
= ∥u∥2 + ∥v∥2 (proved)

2.1 Orthogonal Decomposition


Suppose u, v ∈ V , with v ̸= 0. We would like to write u as a scalar multiple of v plus a
vector w orthogonal to v, as suggested in the attached figure.
To discover how to write u as a scalar multiple of v plus a vector orthogonal to v, let
c ∈ F denote a scalar. Then
u = cv + (u − cv)
Thus we need to choose c so that v is orthogonal to u − cv. In other words, we want
⟨u − cv, v⟩ = 0, i.e., ⟨u, v⟩ − c⟨v, v⟩ = 0, i.e., c ∥v∥² = ⟨u, v⟩.
The equation above shows that we should choose c = ⟨u, v⟩/∥v∥².

[Figure: the vector u written as cv plus the vector u − cv, which is orthogonal to v.]

Making this choice of c, we can write
u = (⟨u, v⟩/∥v∥²) v + w, where w = u − (⟨u, v⟩/∥v∥²) v is orthogonal to v.
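
Numerically, the decomposition looks as follows (a NumPy sketch with arbitrary u and v):

import numpy as np

u = np.array([2.0, 3.0])
v = np.array([1.0, 1.0])

c = np.dot(u, v) / np.dot(v, v)       # c = <u, v> / ||v||^2
w = u - c * v                         # the part of u orthogonal to v

print(c * v, w)                       # [2.5 2.5] [-0.5  0.5]
print(np.isclose(np.dot(w, v), 0.0))  # w is orthogonal to v
print(np.allclose(c * v + w, u))      # u = c v + w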

2.1.1 Cauchy–Schwarz Inequality


Suppose u, v ∈ V . Then |⟨u, v⟩| ≤ ∥u∥ ∥v∥
The equality holds if and only if one of u, v is a scalar multiple of the other.
Proof If v = 0, then both sides of the desired inequality equal 0. Thus we can assume
that v ̸= 0.
Consider the orthogonal decomposition u = (⟨u, v⟩/∥v∥²) v + w,
where w = u − (⟨u, v⟩/∥v∥²) v is orthogonal to v.
By the Pythagorean Theorem,
∥u∥² = ∥(⟨u, v⟩/∥v∥²) v∥² + ∥w∥²
= (|⟨u, v⟩|²/∥v∥⁴) ∥v∥² + ∥w∥² = |⟨u, v⟩|²/∥v∥² + ∥w∥²
≥ |⟨u, v⟩|²/∥v∥² since ∥w∥² ≥ 0.
This gives ∥u∥2 ∥v∥2 ≥ |⟨u, v⟩|2 .
Taking square roots of both sides, we get the desired result.
We observe that equality holds only when w = 0, i.e., when u − (⟨u, v⟩/∥v∥²) v = 0,
i.e., when u is a scalar multiple of v (or vice versa).
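
A quick numerical check of the inequality and of the equality case (a NumPy sketch with randomly chosen vectors):

import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(5)
v = rng.standard_normal(5)

lhs = abs(np.dot(u, v))
rhs = np.linalg.norm(u) * np.linalg.norm(v)
print(lhs <= rhs)                     # |<u, v>| <= ||u|| ||v||

# Equality when one vector is a scalar multiple of the other
print(np.isclose(abs(np.dot(u, 3 * u)),
                 np.linalg.norm(u) * np.linalg.norm(3 * u)))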

Illus. 2.4 (Cauchy–Schwarz Inequality) (a) If (x1 , x2 , · · · , xn ), (y1 , y2 , · · · , yn ) ∈ Rⁿ,
then |x1 y1 + x2 y2 + · · · + xn yn |² ≤ (x1² + x2² + · · · + xn²)(y1² + y2² + · · · + yn²).
(b) If f, g are continuous real-valued functions on [−1, 1], then
( ∫_{−1}^{1} f (x)g(x) dx )² ≤ ( ∫_{−1}^{1} (f (x))² dx ) ( ∫_{−1}^{1} (g(x))² dx ).

2.1.2 Triangle Inequality


Suppose u, v ∈ V . Then ∥u + v∥ ≤ ∥u∥ + ∥v∥
Equality holds if and only if one of u, v is a nonnegative multiple of the other.
Proof We have
∥u + v∥2 = ⟨u + v, u + v⟩
= ⟨u, u + v⟩ + ⟨v, u + v⟩
= ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩
= ⟨u, u⟩ + ⟨v, v⟩ + ⟨u, v⟩ + \overline{⟨u, v⟩}
= ∥u∥2 + ∥v∥2 + 2Re(⟨u, v⟩)
≤ ∥u∥2 + ∥v∥2 + 2 |(⟨u, v⟩)|
≤ ∥u∥2 + ∥v∥2 + 2 ∥u∥ ∥v∥ (using Cauchy–Schwarz Inequality).
= (∥u∥ + ∥v∥)2
Taking square roots of both sides of the inequality above gives the desired inequality.
The proof above shows that equality holds if and only if ⟨u, v⟩ = ∥u∥ ∥v∥, that is,
if and only if one of u, v is a nonnegative multiple of the other.

2.1.3 Parallelogram Equality


For all u, v ∈ V we have ∥u + v∥² + ∥u − v∥² = 2(∥u∥² + ∥v∥²). This identity is recorded here for later use; it follows by expanding both norms using the inner product.

2.2 Orthonormal Bases


Definition 2.6 Orthonormal A list of vectors is said to be orthonormal if each vector
in the list has norm ‘1’ (unit) and is orthogonal to all the other vectors in the list.
In other words, a list e1 , e2 , · · · , em of vectors in V is orthonormal if
⟨ei , ej ⟩ = 1 if i = j, and ⟨ei , ej ⟩ = 0 if i ≠ j, for i, j = 1, 2, · · · , m.
   
Illus. 2.5 (i) (1/√3, 1/√3, 1/√3), (−1/√2, 1/√2, 0) is an orthonormal list in F³.
(ii) (1/√3, 1/√3, 1/√3), (−1/√2, 1/√2, 0), (1/√6, 1/√6, −2/√6) is an orthonormal list in F³.

Properties of Orthonormal Vectors


(a) The norm of an orthonormal linear combination
If e1 , e2 , · · · , em is an orthonormal list of vectors in V , then for all a1 , a2 , · · · , am ∈ F ,
we have
∥a1 e1 + a2 e2 + · · · + am em ∥² = |a1 |² + |a2 |² + · · · + |am |²
Proof Because each ej has norm 1, this follows easily from repeated applications of the
Pythagorean Theorem.
(b) Every orthonormal list of vectors is linearly independent.
Proof Suppose e1 , e2 , · · · , em is an orthonormal list of vectors in V and a1 , a2 , · · · , am ∈ F
are such that a1 e1 + a2 e2 + · · · + am em = 0. Then taking norms of both sides, we get
|a1 |2 +|a2 |2 +· · ·+|am |2 = 0, which means that all the aj ’s are 0. Thus e1 , e2 , · · · , em
are linearly independent.

Definition 2.7 Orthonormal basis An orthonormal list of vectors in V that is also a
basis of V is called an orthonormal basis of V .

For example, the standard basis {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is an orthonormal basis of R3 .

Theorem 2.1 An orthonormal list of vectors in V of the right length (equal to dim(V ))
is an orthonormal basis of V .

Proof Since it is an orthonormal list, the list must be linearly independent. Since the
number of vectors in the orthonormal list is equal to the dimension of V , it must be a
basis of V .

Illus. 2.6 Consider the following list of four vectors in R⁴:
(1/2, 1/2, 1/2, 1/2), (1/2, 1/2, −1/2, −1/2), (1/2, −1/2, −1/2, 1/2), (−1/2, 1/2, −1/2, 1/2).
Let us verify that the list forms an orthonormal basis of R⁴.
We have ∥(1/2, 1/2, 1/2, 1/2)∥ = √((1/2)² + (1/2)² + (1/2)² + (1/2)²) = 1.
Similarly, the other three vectors in the list above also have norm 1.
We have
⟨(1/2, 1/2, 1/2, 1/2), (1/2, 1/2, −1/2, −1/2)⟩ = 1/2 · 1/2 + 1/2 · 1/2 − 1/2 · 1/2 − 1/2 · 1/2 = 0.
Similarly, the inner product of any two distinct vectors in the list above also equals 0.
Thus the list above is orthonormal. Because we have an orthonormal list of length four
in the four-dimensional vector space R⁴, this list is an orthonormal basis of R⁴.
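
The same verification can be done in one line of NumPy: if the four vectors are placed as the rows of a matrix B, then B Bᵀ collects all the inner products ⟨vi , vj ⟩ and should equal the identity matrix (a small sketch):

import numpy as np

B = np.array([
    [ 0.5,  0.5,  0.5,  0.5],
    [ 0.5,  0.5, -0.5, -0.5],
    [ 0.5, -0.5, -0.5,  0.5],
    [-0.5,  0.5, -0.5,  0.5],
])  # rows are the four vectors of Illus. 2.6

print(np.allclose(B @ B.T, np.eye(4)))   # True: the list is orthonormal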

Theorem 2.2 If S = {u1 , u2 , · · · , um } is an orthogonal set of nonzero vectors in V , then


S is linearly independent and hence is a basis for the subspace spanned by S.

Theorem 2.3 Let {u1 , u2 , · · · , um } be an orthogonal basis for a vector space V . For each
y ∈ V , the weights in the linear combination y = c1 u1 + c2 u2 + · · · + cm um are given by
cj = ⟨y, uj ⟩/⟨uj , uj ⟩, j = 1, 2, · · · , m.

Proof y = c1 u1 + c2 u2 + · · · + cm um .
The orthogonality of {u1 , u2 , · · · , um } shows that
⟨y, uj ⟩ = ⟨(c1 u1 + c2 u2 + · · · + cm um ), uj ⟩ = cj ⟨uj , uj ⟩.
Since ⟨uj , uj ⟩ ≠ 0, the equation above has the solution cj = ⟨y, uj ⟩/⟨uj , uj ⟩
for each j = 1, 2, · · · , m.

Writing a vector as linear combination of orthonormal basis


Suppose {e1 , e2 , · · · , en } is an orthonormal basis of V and v ∈ V . Then
v = ⟨v, e1 ⟩e1 + ⟨v, e2 ⟩e2 + · · · + ⟨v, en ⟩en
and ∥v∥² = |⟨v, e1 ⟩|² + |⟨v, e2 ⟩|² + · · · + |⟨v, en ⟩|²
Since {e1 , e2 , · · · , en } is a basis of V , there exist scalars a1 , a2 , · · · , an such that
v = a1 e1 + a2 e2 + · · · + an en
Since {e1 , e2 , · · · , en } is an orthonormal list, taking the inner product of both sides of this
equation with ej gives ⟨v, ej ⟩ = aj . Thus, the first equation holds.
The second equation follows immediately from the first equation.
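
These two formulas are easy to verify numerically. A NumPy sketch, assuming the orthonormal basis of R² obtained by rotating the standard basis through an arbitrary angle:

import numpy as np

t = 0.3                                        # an arbitrary angle
e1 = np.array([np.cos(t), np.sin(t)])
e2 = np.array([-np.sin(t), np.cos(t)])         # e1, e2 form an orthonormal basis of R^2

v = np.array([2.0, -1.0])
a1, a2 = np.dot(v, e1), np.dot(v, e2)          # a_j = <v, e_j>

print(np.allclose(a1 * e1 + a2 * e2, v))                    # v = <v,e1> e1 + <v,e2> e2
print(np.isclose(np.linalg.norm(v) ** 2, a1**2 + a2**2))    # ||v||^2 = sum |<v,e_j>|^2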
We now understand the usefulness of orthonormal bases. The question is: how do we go
about finding them?
The algorithm used in the next proof is called the Gram–Schmidt Procedure. It gives
a method for turning a linearly independent list into an orthonormal list with the same
span as the original list. Danish mathematician Jorgen Gram (1850–1916) and German
mathematician Erhard Schmidt (1876–1959) popularized this algorithm for constructing
orthonormal lists.

2.2.1 Gram–Schmidt Procedure


Suppose {v1 , v2 , · · · , vn } is a linearly independent list of vectors in V . Let e1 = v1 /∥v1 ∥.
For j = 2, 3, · · · , n, define ej inductively by
ej = (vj − ⟨vj , e1 ⟩e1 − · · · − ⟨vj , ej−1 ⟩ej−1 ) / ∥vj − ⟨vj , e1 ⟩e1 − · · · − ⟨vj , ej−1 ⟩ej−1 ∥   ...(1)
Then {e1 , e2 , · · · , en } is an orthonormal list of vectors in V such that
span (v1 , v2 , · · · , vj ) = span (e1 , e2 , · · · , ej ) for j = 1, 2, · · · , n.
Proof We will show by induction on j that the desired conclusion holds.
For j = 1, note that span(v1 ) = span(e1 ) because v1 is a positive multiple of e1 .

Hence the result is true.


Suppose 1 < j ≤ n and we have verified that
span (v1 , · · · , vj−1 ) = span (e1 , · · · , ej−1 ) ... (2)
and that {e1 , · · · , ej−1 } is an orthonormal list.
That is, the result is true for up to j − 1 vectors.
Since v1 , v2 , · · · , vn are linearly independent, vj ̸∈ span(v1 , · · · , vj−1 ). Thus, vj ̸∈ span
(e1 , · · · , ej−1 ).
Hence we are not dividing by 0 in the definition of ej [Eqn (1)].
Dividing a vector by its norm produces a new vector with norm 1, thus ∥ej∥ = 1.
Then for 1 ≤ k < j, we have
⟨ej , ek ⟩ = ⟨ (vj − ⟨vj , e1 ⟩e1 − · · · − ⟨vj , ej−1 ⟩ej−1 ) / ∥vj − ⟨vj , e1 ⟩e1 − · · · − ⟨vj , ej−1 ⟩ej−1 ∥ , ek ⟩
= (⟨vj , ek ⟩ − ⟨vj , ek ⟩) / ∥vj − ⟨vj , e1 ⟩e1 − · · · − ⟨vj , ej−1 ⟩ej−1 ∥
= 0.
Thus we see that {e1 , e2 , · · · , ej } is an orthonormal list.
From the definition of ej in (1), we see that vj ∈ span (e1 , e2 , · · · , ej ).
Combining this information with (2) shows that
span (v1 , v2 , · · · , vj ) ⊂ span (e1 , e2 , · · · , ej )
Both the lists {v1 , v2 , · · · , vj } and {e1 , e2 , · · · , ej } are linearly independent and both
subspaces have dimension j.
Hence, they are equal.
This shows that the result is true for j also.
By mathematical induction, the result is true for all j (= 1, 2, · · · , n).
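
The procedure translates directly into code. Below is a small NumPy sketch (the function name gram_schmidt and the test list are illustrative); it subtracts the components along the earlier e_j one at a time, the so-called modified form, which agrees with Eqn (1) in exact arithmetic:

import numpy as np

def gram_schmidt(vectors):
    # Turn a linearly independent list into an orthonormal list with the same span.
    es = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for e in es:
            w = w - np.dot(w, e) * e          # remove the component along each earlier e_j
        es.append(w / np.linalg.norm(w))      # the norm is nonzero by linear independence
    return es

vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
E = np.column_stack(gram_schmidt(vs))
print(np.allclose(E.T @ E, np.eye(3)))        # the resulting vectors are orthonormal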

Theorem 2.4 Existence of orthonormal basis Every finite-dimensional inner prod-


uct space has an orthonormal basis.

Proof Suppose V is finite-dimensional, with basis v1 , · · · , vn . Applying the Gram–Schmidt
Procedure to v1 , · · · , vn produces an orthonormal list e1 , · · · , en of length n = dim V ,
which is therefore an orthonormal basis of V .
More generally, any orthonormal list e1 , · · · , em in V is linearly independent, hence can
be extended to a basis e1 , · · · , em , vm+1 , · · · , vn of V . Applying the Gram–Schmidt
Procedure to this basis leaves the first m vectors unchanged, because they are already
orthonormal, and produces an orthonormal basis e1 , · · · , em , fm+1 , · · · , fn of V . Thus
every orthonormal list can be extended to an orthonormal basis of V .

Theorem 2.5 Upper-triangular matrix with respect to orthonormal basis


Suppose T ∈ L(V ). If T has an upper-triangular matrix with respect to some basis of V ,
then T has an upper-triangular matrix with respect to some orthonormal basis of V .

Theorem 2.6 Schur’s Theorem Suppose T is an operator on a finite-dimensional


complex inner product space V . Then T has an upper-triangular matrix with respect to
some orthonormal basis of V .

Proof Recall that T has an upper-triangular matrix with respect to some basis of V .
Now apply the previous theorem.
Note: We shall use the notion of a Hilbert space in our discussion, so let us have some
idea about it.
Hilbert space: First, a metric space is a set X together with a notion of distance
d between its elements, usually called points. The distance is measured by a function
called a metric or distance function. For x, y ∈ X, the distance between them is written
as d(x, y), and the metric space is written as (X, d).
A Cauchy sequence is a sequence {xn } in a metric space (X, d) such that for every
ϵ > 0 there exists an integer N such that d(xn , xm ) < ϵ for all n, m ≥ N . If every Cauchy
sequence in the space converges to a limit that also lies in the space, the space is called
complete.
Further, an inner product space whose norm is defined by the inner product, i.e.,
∥x∥ = √⟨x, x⟩, is called a Hilbert space if it is complete (with respect to the metric
d(x, y) = ∥x − y∥).

2.2.2 Linear Functionals on Inner Product Spaces


Definition 2.8 Linear functional A linear functional on V is a linear map from V to
F . In other words, a linear functional is an element of L(V, F ).

Illus. 2.7 The function ϕ : R3 → R defined by ϕ(x, y, z) = 2x − 5y + z is a linear


functional on R3 .
We could write this linear functional in the form: ϕ(v) = ⟨v, u⟩ for every v ∈ R3 ,
where u = (2, −5, 1).

Theorem 2.7 Riesz Representation Theorem


Suppose V is finite-dimensional and ϕ is a linear functional on V . Then there is a unique
vector u ∈ V such that ϕ(v) = ⟨v, u⟩ for every v ∈ V .

Proof First we show there exists a vector u ∈ V such that ϕ(v) = ⟨v, u⟩ for every v ∈ V .
Let e1 , · · · , en be an orthonormal basis of V . Then
ϕ(v) = ϕ(⟨v, e1 ⟩ e1 + · · · + ⟨v, en ⟩ en )
= ⟨v, e1 ⟩ ϕ(e1 ) + · · · + ⟨v, en ⟩ ϕ(en )
= ⟨v, \overline{ϕ(e1 )} e1 + · · · + \overline{ϕ(en )} en ⟩ for every v ∈ V .
Thus setting u = \overline{ϕ(e1 )} e1 + · · · + \overline{ϕ(en )} en , we have
ϕ(v) = ⟨v, u⟩ for every v ∈ V , as desired.
Now we prove that only one vector u ∈ V has the desired behavior.

Suppose u1 , u2 ∈ V are such that


ϕ(v) = ⟨v, u1 ⟩ = ⟨v, u2 ⟩
for every v ∈ V .
Then 0 = ⟨v, u1 ⟩ − ⟨v, u2 ⟩ = ⟨v, u1 − u2 ⟩ for every v ∈ V .
Taking v = u1 − u2 shows that u1 − u2 = 0. In other words, u1 = u2 , proving the
uniqueness of u.
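
As a small numerical illustration of the construction of u in the proof (over R, so no conjugates are needed), take the functional of Illus. 2.7 and the standard orthonormal basis of R³:

import numpy as np

def phi(v):
    # the linear functional of Illus. 2.7: phi(x, y, z) = 2x - 5y + z
    return 2 * v[0] - 5 * v[1] + v[2]

E = np.eye(3)                                  # rows are the standard basis e1, e2, e3
u = sum(phi(E[j]) * E[j] for j in range(3))    # u = phi(e1) e1 + phi(e2) e2 + phi(e3) e3
print(u)                                       # [ 2. -5.  1.]

v = np.array([1.0, 2.0, 3.0])
print(np.isclose(phi(v), np.dot(v, u)))        # phi(v) = <v, u>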

2.3 Orthogonal Complements


Definition 2.9 Orthogonal complement, U ⊥
If U is a subset of V , then the orthogonal complement of U , denoted U ⊥ , is the set of all
vectors in V that are orthogonal to every vector in U .
Thus, U ⊥ = {v ∈ V : ⟨v, u⟩ = 0 for every u ∈ U }.

Illus. 2.8 If U is a line in R3 containing the origin, then U ⊥ is the plane containing the
origin that is perpendicular to U . If U is a plane in R3 containing the origin, then U ⊥ is
the line containing the origin that is perpendicular to U .

Basic properties of orthogonal complement

(a) If U is a subset of V , then U ⊥ is a subspace of V .


(b) {0}⊥ = V .
(c) V ⊥ = {0}.
(d) If U is a subset of V , then U ∩ U ⊥ ⊂ {0}.
(e) If U and W are subsets of V and U ⊂ W , then W ⊥ ⊂ U ⊥ .

Proof (a) Suppose U is a subset of V . Then ⟨0, u⟩ = 0 for every u ∈ U . Thus 0 ∈ U ⊥ .


Suppose u1 , u2 ∈ U ⊥ . If u ∈ U , then
⟨u1 + u2 , u⟩ = ⟨u1 , u⟩ + ⟨u2 , u⟩ = 0 + 0 = 0
Thus u1 + u2 ∈ U ⊥ . In other words, U ⊥ is closed under addition.
Similarly, suppose λ ∈ F and v ∈ U ⊥ .
If u ∈ U , then ⟨λv, u⟩ = λ ⟨v, u⟩ = λ · 0 = 0.
Thus λv ∈ U ⊥ . In other words, U ⊥ is closed under scalar multiplication.
Thus, U ⊥ is a subspace of V .
(b) Suppose v ∈ V . Then ⟨v, 0⟩ = 0, which implies that v ∈ {0}⊥ . Thus {0}⊥ = V .
(c) Suppose v ∈ V ⊥ . Then ⟨v, v⟩ = 0, which implies that v = 0. Thus V ⊥ = {0}.
(d) Suppose U is a subset of V and u ∈ U ∩ U ⊥ .
Since u ∈ U and also u ∈ U ⊥ , we have ⟨u, u⟩ = 0, which implies that u = 0.

Thus U ∩ U ⊥ ⊂ {0}.
(e) Suppose U and W are subsets of V and U ⊂ W . Suppose v ∈ W ⊥ .
Then ⟨v, u⟩ = 0 for every u ∈ W , which implies that ⟨v, u⟩ = 0 for every u ∈ U .
Hence v ∈ U ⊥ . Thus W ⊥ ⊂ U ⊥ .
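
Computationally, a basis of U ⊥ can be obtained from the null space of a matrix whose rows span U . A NumPy sketch (the matrix M below is an arbitrary example whose two rows span a subspace U of R⁴):

import numpy as np

M = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])           # rows span U

# Right singular vectors belonging to zero singular values span the null space of M,
# which is exactly U-perp.
_, s, Vt = np.linalg.svd(M)
rank = int(np.sum(s > 1e-10))
W = Vt[rank:]                                  # rows of W form a basis of U-perp

print(np.allclose(M @ W.T, 0))                 # each basis vector of U-perp is orthogonal to U
print(rank + W.shape[0] == M.shape[1])         # dim U + dim U-perp = n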

Definition 2.10 (Convex sets and Distance to a set) A convex set is a subset U of
a vector space V such that for all u, v ∈ U, tu + (1 − t)v ∈ U for all t ∈ [0, 1].
When V is a normed vector space, we say that the distance from a vector p to a subset U
is defined by dist(p, U ) = inf{∥p − q∥ : q ∈ U }.

Theorem 2.8 (The Hilbert projection theorem). For a Hilbert space V , a closed
convex subset U , and a vector p ∈ V , the distance dist(p, U ) described above is attained by a unique element of U .

Proof is omitted.

Definition 2.11 (Orthogonal projections) For a vector v and a closed convex subset
U (most often a closed subspace) of a Hilbert space, we use vU to denote this distance-
minimizing element of U , called the orthogonal projection of v into U .

To justify the name ‘orthogonal projection’, we show that v − vU is orthogonal to U (that


is, orthogonal to each of its elements u ∈ U ) for all v ∈ V and all closed subspaces U .
Proof For all scalars λ we have ∥v − vU ∥² ≤ ∥v − (vU + λu)∥², since vU + λu is a member
of U whose distance to v is at least that of vU .
For any real t > 0, choose λ = t⟨v − vU , u⟩. Then
∥v − (vU + λu)∥² = ∥v − vU ∥² − 2Re⟨v − vU , λu⟩ + |λ|² ∥u∥²
= ∥v − vU ∥² − 2t |⟨v − vU , u⟩|² + t² |⟨v − vU , u⟩|² ∥u∥².
Combining this with the inequality above and cancelling ∥v − vU ∥², we get
2 |⟨v − vU , u⟩|² ≤ t |⟨v − vU , u⟩|² ∥u∥² for all real t > 0.
Letting t → 0⁺ forces ⟨v − vU , u⟩ = 0. This completes the proof.

2.3.1 Orthogonal Complements


It will be important to compute the set of all vectors that are orthogonal to a given set
of vectors. It turns out that a vector is orthogonal to a set of vectors if and only if it is
orthogonal to the span of those vectors, which is a subspace, so we restrict ourselves to
the case of subspaces.

Definition 2.12 (Orthogonal Complement) Let V be a subspace of Rⁿ. Its orthogonal
complement is the subspace
W = {w ∈ Rⁿ : w · v = 0 for all v ∈ V }.
The orthogonal complement W of V is sometimes written as V ⊥ .

This is the set of all vectors w ∈ Rn that are orthogonal to all of the vectors in V .

It can be shown that V ⊥ is indeed a subspace of Rn .

Properties of Orthogonal Subspaces


Let V be a subspace of Rn . Then:
(1) V ⊥ is also a subspace of Rn ,
(2) (V ⊥ )⊥ = V , and
(3) dim(V )+ dim (V ⊥ ) = n.
Proof: (1) First, we verify the three defining properties of subspaces:
(i) The zero vector is in V ⊥ , because the zero vector is orthogonal to every vector in
Rn .
(ii) Let u, v be in V ⊥ ; then
u · x = 0 and v · x = 0 for every vector x in V .
Then we have (u + v) · x = u · x + v · x = 0 + 0 = 0 for every x in V , so u + v ∈ V ⊥ .
(iii) Let u ∈ V ⊥ , so that u · x = 0 for every x ∈ V , and let c be a scalar. Then
(cu) · x = c(u · x) = c · 0 = 0 for every x ∈ V , so cu ∈ V ⊥ .
(3) Next we prove the third assertion.
Let v1 , v2 , · · · , vm be a basis for V , so m = dim (V ),
and let vm+1 , vm+2 , · · · , vk be a basis for V ⊥ , so k − m = dim (V ⊥ ).
We need to show that k = n.
First we claim that {v1 , v2 , · · · , vm , vm+1 , vm+2 , · · · , vk }
is linearly independent.
Suppose that c1 v1 + c2 v2 + · · · + cm vm + cm+1 vm+1 + cm+2 vm+2 + · · · + ck vk = 0.
Let w = c1 v1 + c2 v2 + · · · + cm vm and wA = cm+1 vm+1 + cm+2 vm+2 + · · · + ck vk ,
so w is in V , and wA is in V ⊥
and w + wA = 0.
Then w = −wA is in both V and V ⊥ , which implies w is perpendicular to itself.
In particular, w · w = 0, so w = 0, and hence wA = 0.
Therefore, all coefficients ci are equal to zero,
because {v1 , v2 , · · · , vm , } and {vm+1 , vm+2 , · · · , vk } are linearly independent.
It follows from the previous paragraph that k ≤ n.
Suppose that k < n.

Then the k × n matrix A whose rows are v1ᵀ, v2ᵀ, · · · , vmᵀ, vm+1ᵀ, · · · , vkᵀ
has more columns than rows, so its null space is nonzero.
Let x be a nonzero vector in Nul (A).
Then 0 = Ax = (v1ᵀx, v2ᵀx, · · · , vkᵀx)ᵀ = (v1 · x, v2 · x, · · · , vk · x)ᵀ.
Since v1 · x = v2 · x = · · · = vm · x = vm+1 · x = · · · = vk · x = 0,
it follows that x is in V ⊥ and similarly, x is in (V ⊥ )⊥ .
This again implies x is orthogonal to itself, which contradicts our assumption that x
is nonzero. Therefore, k = n, as desired.
(2) Finally, we prove the second assertion.
Clearly V is contained in (V ⊥ )⊥ .
Let m = dim (V ). By the Property (3), proved above, we have dim (V ⊥ ) = n − m,
so dim ((V ⊥ )⊥ ) = n − (n − m) = m.
The only m-dimensional subspace of (V ⊥ )⊥ is (V ⊥ )⊥ itself, so (V ⊥ )⊥ = V .

2.3.2 Direct sum of a subspace and its orthogonal complement


Recall that if U, W are subspaces of V , then V is the direct sum of U and W (written
V = U ⊕ W ) if each element of V can be written in exactly one way as a vector in U plus
a vector in W .
We shall now show that every finite-dimensional subspace of V leads to a natural
direct sum decomposition of V .
Suppose U is a finite-dimensional subspace of V . Then V = U ⊕ U ⊥
Proof First we will show that
V = U + U⊥ ... (1)
To do this, suppose v ∈ V . Let e1 , · · · , em be an orthonormal basis of U .
Obviously, v = (⟨v, e1 ⟩e1 + · · · + ⟨v, em ⟩em ) + (v − ⟨v, e1 ⟩e1 − · · · − ⟨v, em ⟩em )
Let us write

u = ⟨v, e1 ⟩e1 + · · · + ⟨v, em ⟩em


and w = v − ⟨v, e1 ⟩e1 − · · · − ⟨v, em ⟩em .
Clearly u ∈ U .
Since, {e1 , · · · , em } is an orthonormal list, for each j = 1, · · · , m we have
⟨w, ej ⟩ = ⟨v, ej ⟩ − ⟨v, ej ⟩ = 0
Thus w is orthogonal to every vector in span (e1 , · · · , em ). In other words, w ∈ U ⊥ .
Thus, we have written v = u + w, where u ∈ U and w ∈ U ⊥ , which proves (1).
From the property of orthogonal complement, we know that U ∩ U ⊥ = {0}.
This, along with (1), shows that V = U ⊕ U ⊥ .
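
A small NumPy sketch of this decomposition, assuming U is the xy-plane in R³ with orthonormal basis e1 , e2 (the vector v is arbitrary):

import numpy as np

e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])                 # orthonormal basis of U

v = np.array([3.0, -2.0, 5.0])
u = np.dot(v, e1) * e1 + np.dot(v, e2) * e2    # u = <v,e1> e1 + <v,e2> e2, in U
w = v - u                                      # w in U-perp

print(u, w)                                    # [ 3. -2.  0.] [0. 0. 5.]
print(np.isclose(np.dot(w, e1), 0), np.isclose(np.dot(w, e2), 0))  # w orthogonal to U
print(np.allclose(u + w, v))                   # v = u + w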

We now explain in more detail the orthogonal complement (dual subspace), singular values and singular vectors, and the singular value decomposition.
Orthogonal Complement or Dual Subspace:
Given a vector space V over a field (usually the real numbers or complex numbers),
the orthogonal complement of a subspace U of V , denoted as U ⊥ , is the set of all vectors
in V that are orthogonal to every vector in U . In other words, for any vector u ∈ U and
any vector v ∈ U ⊥ , their inner product is zero: ⟨u, v⟩ = 0.
Mathematically, the orthogonal complement can be defined as:
U ⊥ = {v ∈ V : ⟨u, v⟩ = 0 for all u ∈ U }.
The orthogonal complement is itself a subspace of V , and it is always true that U ∩
U ⊥ = {0}, meaning the only vector common to both U and its orthogonal complement is
the zero vector.
Singular Values and Singular Vectors: Consider a matrix A with dimensions m × n (m
rows and n columns). The singular values of A are the square roots of the eigenvalues of
AᵀA; they are non-negative real numbers. Unit eigenvectors of AᵀA are called right
singular vectors of A, and unit eigenvectors of AAᵀ are called left singular vectors of A.

2.3.3 Singular Value Decomposition (SVD)


The Singular Value Decomposition is an important topic of linear algebra.
Suppose A is any m × n matrix, square or rectangular and its rank is r. We will
diagonalize this A, but not by X −1 AX.
The eigenvectors in X can have three problems: (i) they are usually not orthogonal,
(ii) there are not always enough eigenvectors, and (iii) Ax = λx requires A to be a square
matrix. The singular vectors of A solve all those problems in a perfect way.
The singular value decomposition (SVD) of a matrix A is a factorization of A into
three matrices: U, Σ, and V T (the transpose of V ), where U is an m × m orthogonal
matrix, Σ is an m × n diagonal matrix with non-negative real numbers on the diagonal
(singular values), and V T is an n × n orthogonal matrix.
Mathematically, the SVD of a matrix A can be represented as: A = U ΣV T
where,
U : The columns of U are the left singular vectors of A. These vectors are orthonormal,
and the first r of them form a basis for the column space of A.
Σ : The diagonal entries of Σ are the singular values of A. They are non-negative and
represent the "scaling" of the corresponding singular vectors under the matrix transformation.
Vᵀ : The rows of Vᵀ are the right singular vectors of A. Like the left singular vectors,
they are orthonormal, and the first r of them form a basis for the row space of A.

The SVD can be expressed using the following steps:

1. Compute the eigenvalues and eigenvectors of AT A to obtain the right singular vec-
tors (V T ).

2. Normalize the right singular vectors to obtain the unit right singular vectors.

3. Compute the square root of the eigenvalues of AT A to obtain the singular values
(the diagonal elements of diagonal matrix Σ).

4. Compute the left singular vectors ui = Avi /σi for each nonzero singular value σi
(these become the first columns of U ; extend to a full orthonormal basis of Rᵐ if necessary).

5. Form the matrices U, Σ, and V T to obtain the SVD: A = U Σ V T .

SVD is widely used in various fields including linear algebra, signal processing, data anal-
ysis, image compression, and machine learning. It’s a fundamental tool for understanding
and manipulating matrices.
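
The steps listed above can be sketched in NumPy as follows. This is an illustrative helper (the name svd_via_eig is not a standard routine); it returns a reduced factorization with r columns in U, and in practice one would simply call numpy.linalg.svd:

import numpy as np

def svd_via_eig(A):
    # Steps 1-5: eigen-decompose A^T A, take square roots of the eigenvalues,
    # and form u_i = A v_i / sigma_i for each nonzero singular value.
    eigvals, V = np.linalg.eigh(A.T @ A)       # symmetric eigenproblem, ascending order
    order = np.argsort(eigvals)[::-1]          # reorder so sigma_1 >= sigma_2 >= ...
    eigvals, V = eigvals[order], V[:, order]
    sigma = np.sqrt(np.clip(eigvals, 0.0, None))
    r = int(np.sum(sigma > 1e-12))             # rank of A
    U = (A @ V[:, :r]) / sigma[:r]             # left singular vectors as columns
    Sigma = np.zeros((r, A.shape[1]))
    Sigma[:, :r] = np.diag(sigma[:r])
    return U, Sigma, V.T

A = np.array([[3.0, 0.0], [4.0, 5.0]])
U, S, Vt = svd_via_eig(A)
print(np.allclose(U @ S @ Vt, A))              # A = U Sigma V^T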

Example 2.1 For an integer n > 0, let Pn denote the vector space of polynomials with
real coefficients of degree n or less.
Define the map T : P2 → P4 by T (f )(x) = f (x2 ).
Determine if T is a linear transformation.
If it is, find the matrix representation for T relative to the basis B = {1, x, x2 } of P2
and C = {1, x, x2 , x3 , x4 } of P4 .

Soln. To prove that T is a linear transformation, we must show that it satisfies both
axioms for linear transformations.
For f, g ∈ P2 , we have
T (f + g)(x) = (f + g)(x2 ) = f (x2 ) + g(x2 ) = T (f )(x) + T (g)(x)
Also, for a scalar c ∈ R, we have
T (cf )(x) = (cf )(x2 ) = cf (x2 ) = cT (f )(x).
We see that T is a linear transformation.
To find its matrix representation, we must calculate T (f ) for each f ∈ B and find its
coordinate vector relative to the basis C.
We calculate T (1) = 1, T (x) = x2 , T (x2 ) = x4 .
Each of these is an element of C. Their coordinate vectors relative to C are thus
[T (1)]C = (1, 0, 0, 0, 0)ᵀ, [T (x)]C = (0, 0, 1, 0, 0)ᵀ, and [T (x²)]C = (0, 0, 0, 0, 1)ᵀ.
The matrix representation of T is found by combining these columns, in order, into
one matrix.
Thus, the matrix representation of T relative to the bases B and C is the 5 × 3 matrix
(written here with rows separated by semicolons)
[T ] = [1 0 0; 0 0 0; 0 1 0; 0 0 0; 0 0 1].
 
Example 2.2 Find the matrices U, Σ, V for A = [3 0; 4 5].

Soln. The rank is r = 2. With rank 2, this A has positive singular values σ1 and σ2 .
We will see that σ1 is larger than λmax = 5, and σ2 is smaller than λmin = 3.
Begin with AᵀA and AAᵀ:
AᵀA = [3 4; 0 5] [3 0; 4 5] = [25 20; 20 25]
and AAᵀ = [3 0; 4 5] [3 4; 0 5] = [9 12; 12 41].
These have the same trace (50) and the same eigenvalues σ1² = 45 and σ2² = 5.
The square roots are σ1 = √45 and σ2 = √5.
Thus, we get Σ = [√45 0; 0 √5] = √5 [3 0; 0 1].
A key step is to find the eigenvectors of AᵀA (with eigenvalues 45 and 5):
[25 20; 20 25] (1, 1)ᵀ = 45 (1, 1)ᵀ and [25 20; 20 25] (1, −1)ᵀ = 5 (1, −1)ᵀ.
Then v1 and v2 are these orthogonal eigenvectors of AᵀA rescaled to length 1.
So, the right singular vectors are:
v1 = (1/√2) (1, 1)ᵀ and v2 = (1/√2) (1, −1)ᵀ.
Next, we compute the left singular vectors ui .
Compute Av1 and Av2 , which will be σ1 u1 = √45 u1 and σ2 u2 = √5 u2 respectively.
Av1 = [3 0; 4 5] (1/√2) (1, 1)ᵀ = (1/√2) (3, 9)ᵀ,
so u1 = (1/√45) (1/√2) (3, 9)ᵀ = (1/√10) (1, 3)ᵀ.
Similarly, Av2 = [3 0; 4 5] (1/√2) (1, −1)ᵀ = (1/√2) (3, −1)ᵀ,
so u2 = (1/√5) (1/√2) (3, −1)ᵀ = (1/√10) (3, −1)ᵀ.
Thus, V = (1/√2) [1 1; 1 −1], U = (1/√10) [1 3; 3 −1] and Σ = √5 [3 0; 0 1].
Thus, the singular value decomposition of A is U ΣV T .
Note that U and V contain orthonormal bases for the column space and the row space
respectively.
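
The factorization found above can be checked numerically (a small NumPy verification of Example 2.2):

import numpy as np

A = np.array([[3.0, 0.0], [4.0, 5.0]])
U = np.array([[1.0, 3.0], [3.0, -1.0]]) / np.sqrt(10)
V = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
Sigma = np.diag([np.sqrt(45.0), np.sqrt(5.0)])

print(np.allclose(U @ Sigma @ V.T, A))         # A = U Sigma V^T
print(np.allclose(U.T @ U, np.eye(2)))         # U has orthonormal columns
print(np.allclose(V.T @ V, np.eye(2)))         # V has orthonormal columns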
 
Example 2.3 Find the matrices U, Σ, V for A = [1 1; 1 0; 0 1].

Soln. The rank is r = 2.


Begin with AᵀA:
AᵀA = [1 1 0; 1 0 1] [1 1; 1 0; 0 1] = [2 1; 1 2].
AᵀA has eigenvalues σ1² = 3 and σ2² = 1 [since (2 − λ)² − 1 = 0 gives λ = 3, 1].
The square roots are σ1 = √3 and σ2 = 1.
A key step is to find the eigenvectors of AᵀA (with eigenvalues 3 and 1):
[2 1; 1 2] (1, 1)ᵀ = 3 (1, 1)ᵀ and [2 1; 1 2] (1, −1)ᵀ = 1 (1, −1)ᵀ.
Then v1 and v2 are these orthogonal eigenvectors of AᵀA rescaled to length 1.
So, the right singular vectors are:
v1 = (1/√2) (1, 1)ᵀ and v2 = (1/√2) (1, −1)ᵀ.
So, we have found Σ = [√3 0; 0 1; 0 0].
[Note that Σ has the same size as A; its last row consists of zeros.]
Next, we compute the left singular vectors ui .
Compute Av1 and Av2 , which will be σ1 u1 = √3 u1 and σ2 u2 = 1 · u2 = u2 respectively.
Av1 = [1 1; 1 0; 0 1] (1/√2) (1, 1)ᵀ = (1/√2) (2, 1, 1)ᵀ,
so u1 = (1/√3) (1/√2) (2, 1, 1)ᵀ = (1/√6) (2, 1, 1)ᵀ.
Similarly, Av2 = [1 1; 1 0; 0 1] (1/√2) (1, −1)ᵀ = (1/√2) (0, 1, −1)ᵀ,
so u2 = (1/√2) (0, 1, −1)ᵀ.
The third orthonormal vector u3 = (x, y, z)ᵀ may be computed as follows:
u3 is normal to both u1 and u2 , so we have
2x + y + z = 0 and 0x + y − z = 0.
These give x = −y = −z. Take x = −k, y = k, z = k.
On normalizing, we get k = 1/√3.
So, u3 = (1/√3) (−1, 1, 1)ᵀ.
Thus, V = (1/√2) [1 1; 1 −1], U = (1/√6) [2 0 −√2; 1 √3 √2; 1 −√3 √2] and Σ = [√3 0; 0 1; 0 0].
[Students are advised to verify that UΣVᵀ = A.]
 
Example 2.4 Find the matrices U, Σ, V for A = [3 1 1; −1 3 1].

Soln. The rank is r = 2.


Begin with AᵀA:
AᵀA = [3 −1; 1 3; 1 1] [3 1 1; −1 3 1] = [10 0 2; 0 10 4; 2 4 2].
Let λ be an eigenvalue of AᵀA; then
(10 − λ)[(10 − λ)(2 − λ) − 4 × 4] + 2[0 × 4 − 2(10 − λ)] = 0
or (10 − λ)[λ² − 12λ + 20 − 16] + 4λ − 40 = 0
or λ³ − 22λ² + 120λ = 0
or λ(λ − 12)(λ − 10) = 0.
Thus, the eigenvalues of AᵀA are σ1² = 12, σ2² = 10 and σ3² = 0.
So, the singular values of A are σ1 = √12 and σ2 = √10.
The matrix Σ will be of the same size as A.
So we take Σ = [√12 0 0; 0 √10 0] (it has only two rows).
A key step is to find the eigenvectors of AᵀA (with eigenvalues 12, 10 and 0):
For λ = 12: [−2 0 2; 0 −2 4; 2 4 −10] (x, y, z)ᵀ = (0, 0, 0)ᵀ
or −2x + 2z = 0, −2y + 4z = 0 and 2x + 4y − 10z = 0,
or x = z and y = 2z. So take v1 = (1/√(1² + 2² + 1²)) (1, 2, 1)ᵀ = (1/√6) (1, 2, 1)ᵀ.
For λ = 10: [0 0 2; 0 0 4; 2 4 −8] (x, y, z)ᵀ = (0, 0, 0)ᵀ
or 2z = 0, 4z = 0 and 2x + 4y − 8z = 0,
or z = 0 and x = −2y. So take v2 = (1/√((−2)² + 1²)) (−2, 1, 0)ᵀ = (1/√5) (−2, 1, 0)ᵀ.
For λ = 0: [10 0 2; 0 10 4; 2 4 2] (x, y, z)ᵀ = (0, 0, 0)ᵀ
or 10x + 2z = 0, 10y + 4z = 0 and 2x + 4y + 2z = 0,
or y = 2x and z = −5x. So take v3 = (1/√(1² + 2² + (−5)²)) (1, 2, −5)ᵀ = (1/√30) (1, 2, −5)ᵀ.
So, the right singular vectors are:
v1 = (1/√6) (1, 2, 1)ᵀ, v2 = (1/√5) (−2, 1, 0)ᵀ and v3 = (1/√30) (1, 2, −5)ᵀ.
Next, we compute the left singular vectors ui .
Compute Av1 and Av2 , which will be σ1 u1 = √12 u1 and σ2 u2 = √10 u2 respectively.
Av1 = [3 1 1; −1 3 1] (1/√6) (1, 2, 1)ᵀ = (1/√6) (6, 6)ᵀ,
so u1 = (1/√12) (1/√6) (6, 6)ᵀ = (1/√2) (1, 1)ᵀ.
Similarly, Av2 = [3 1 1; −1 3 1] (1/√5) (−2, 1, 0)ᵀ = (1/√5) (−5, 5)ᵀ,
so u2 = (1/√10) (1/√5) (−5, 5)ᵀ = (1/√2) (−1, 1)ᵀ.
Thus, U = (1/√2) [1 −1; 1 1] (columns u1 , u2 ),
V = [1/√6 −2/√5 1/√30; 2/√6 1/√5 2/√30; 1/√6 0 −5/√30] (columns v1 , v2 , v3 )
and Σ = [√12 0 0; 0 √10 0], so that A = UΣVᵀ.
