Linear Spaces Teaching Slides Handout
Sudhir A. Shah
Definition
Given a vector space V, n ∈ N, {x1, . . . , xn} ⊂ V, and (α1, . . . , αn) ∈ F^n, the vector ∑_{i=1}^n αi xi is called a linear combination of x1, . . . , xn.
Definition
Consider a vector space V and X ⊂ V .
1. For n ∈ N , let Xn be the collection of subsets of X with n
elements.
2. The span of Y := {y1, . . . , yn} ∈ Xn is [Y] := { ∑_{i=1}^n αi yi | (α1, . . . , αn) ∈ F^n }.
3. The span of X is [X ] := ∪n∈N ∪Y ∈Xn [Y ].
4. [∅] := {0}.
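As a concrete illustration (a minimal numerical sketch, not part of the original slides): with V = ℝ^3 and F = ℝ, membership of a vector in the span of a finite set can be tested by solving a least-squares problem and checking that the residual vanishes. The vectors x1 and x2 below are assumed examples.

```python
import numpy as np

# Illustrative vectors in R^3; columns of X span [{x1, x2}].
x1 = np.array([1.0, 0.0, 1.0])
x2 = np.array([0.0, 1.0, 1.0])
X = np.column_stack([x1, x2])

def in_span(X, y, tol=1e-10):
    """Check whether y is a linear combination of the columns of X."""
    alpha, *_ = np.linalg.lstsq(X, y, rcond=None)   # best-fitting coefficients
    return np.linalg.norm(X @ alpha - y) < tol, alpha

print(in_span(X, 2 * x1 - 3 * x2))            # (True, [2., -3.])
print(in_span(X, np.array([1.0, 1.0, 0.0])))  # (False, ...): not in the span
```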
Linearly (in)dependent sets of vectors
Definition
Consider a vector space V and X ⊂ V .
1. y ∈ V is said to be linearly dependent on X if y ∈ [X ].
2. X is said to be a linearly dependent set if there exists
y ∈ X such that y ∈ [X \ {y }].
3. A set X ⊂ V is said to be a linearly independent set if it is
not linearly dependent.
Properties of (in)dependent sets
Theorem
Consider a vector space V and sets X ⊂ V and Y ⊂ V .
Theorem
1. Every vector space has a basis.
Proof uses Zorn’s lemma. For infinite dimensional spaces,
this basis is called a Hamel basis in order to distinguish it
from other notions of bases such as Schauder bases.
2. A finite dimensional vector space has a finite basis. Proof
Working assumptions
Theorem
Every basis of V has the same number of vectors. Proof
Definition
The number of vectors in a basis of V is called the dimension of
V , denoted by dim V .
Theorem
Let dim V = n.
1. If X ⊂ V is independent, then X can be extended to a
basis for V . Proof
2. {x1 , . . . , xn } ⊂ V spans V iff. {x1 , . . . , xn } is a basis for V .
Proof
Theorem
If {x1, . . . , xn} ⊂ V is a basis for the vector space V, then for every x ∈ V there exists a unique n-tuple of scalars (α1, . . . , αn) ∈ F^n such that x = ∑_{i=1}^n αi xi.
The scalars {α1 , . . . , αn } are called the coordinates of x with
respect to the basis {x1 , . . . , xn }.
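For V = ℝ^n, the coordinates are obtained by solving a linear system whose coefficient matrix has the basis vectors as columns; the solution is unique precisely because the basis is independent and spanning. A minimal sketch with an assumed basis of ℝ^2:

```python
import numpy as np

# Columns of B form an assumed basis {x1, x2} of R^2.
B = np.array([[1.0, 1.0],
              [1.0, -1.0]])
x = np.array([3.0, 1.0])

# Coordinates (alpha_1, ..., alpha_n) solve B @ alpha = x; B is non-singular
# because its columns are independent, so the solution is unique.
alpha = np.linalg.solve(B, x)
print(alpha)                      # [2. 1.]  since x = 2*(1,1) + 1*(1,-1)
print(np.allclose(B @ alpha, x))  # True
```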
Subspaces of a vector space V
Definition
Let {V, F; +, ·, ⊕, ⊙} be a vector space and let S ⊂ V. If {S, F; +, ·, ⊕, ⊙} is a vector space, then {S, F; +, ·, ⊕, ⊙} is called a subspace of {V, F; +, ·, ⊕, ⊙}.
Theorem
If S ⊂ V , then S is a subspace of V iff.
1. x, y ∈ S implies x + y ∈ S, and
2. x ∈ S and α ∈ F implies αx ∈ S. Proof left as exercise.
Theorem
If S ⊂ V, then {[S], F; +, ·, ⊕, ⊙} is a subspace of V.
Proof left as exercise.
Sums of subspaces
Definition
Let S and T be subspaces of a vector space V. The sum of S and T is S + T := {x + y | x ∈ S, y ∈ T}. If, in addition, S ∩ T = {0}, then S + T is called the direct sum of S and T, written S ⊕ T. Example
Theorem
Let S and T be subspaces of a vector space V .
1. V = S ⊕ T iff. every v ∈ V has a unique representation
v = x + y for some x ∈ S and some y ∈ T . Proof
2. dim(S + T ) + dim(S ∩ T ) = dim S + dim T . Proof
Linear transformations
Definition
Given vector spaces V and W , a function A : V → W is called
a linear transformation if A(αx + βy ) = αA(x) + βA(y ) for all
x, y ∈ V and α, β ∈ F . The space of all linear transformations
from V to W is denoted by L(V , W ).
Terminology. Linear transformations are also called linear mappings or homomorphisms; in certain contexts, they are called linear operators.
Definition
The range space of A ∈ L(V , W ) is R(A) = {Ax ∈ W | x ∈ V }.
The null space of A is N (A) = {x ∈ V | Ax = 0}.
It is easy to check that R(A) and N (A) are subspaces of W
and V respectively.
Canonical n-dimensional vector space
Definition
A bijection A ∈ L(V , W ) is called an isomorphism.
Definition
Given the field of scalars F, we can define the vector space {F^n, F; +, ·, ⊕, ⊙}, where F^n is the set of n-tuples of scalars and ⊕, ⊙ are defined componentwise.
Theorem
If V is a vector space with dim V = n, then it is isomorphic to
F n . Proof
Rank and nullity
Definition
Let V and W be vector spaces and A ∈ L(V , W ). The
dimension of R(A) is called the rank of A, denoted by ρ(A). The
dimension of N (A) is called the nullity of A, denoted by ν(A).
Theorem
Let V and W be vector spaces, and A ∈ L(V , W ).
1. ρ(A) + ν(A) = dim V . Proof
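A quick numerical check of the rank–nullity identity for a matrix map A : ℝ^3 → ℝ^2 (the matrix below is an arbitrary illustrative choice; ν(A) is read off from the number of numerically zero singular values):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])        # rank-1 map from R^3 to R^2

rank = np.linalg.matrix_rank(A)        # rho(A)
s = np.linalg.svd(A, compute_uv=False) # singular values of A
nullity = A.shape[1] - np.sum(s > 1e-10)

print(rank, nullity, rank + nullity == A.shape[1])   # 1 2 True
```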
Theorem
Let U, V and W be vector spaces, A ∈ L(V , W ) and
B ∈ L(U, V ).
1. AB ∈ L(U, W ). Proof
Theorem
Consider a vector space V with basis {v1 , . . . , vn }, a vector
space W with basis {w1 , . . . , wm }, and a linear transformation
F ∈ L(V , W ). Then, there exists a unique m × n matrix of
scalars A such that, if v ∈ V has coordinates b = (b1 , . . . , bn )
with respect to basis {v1 , . . . , vn }, then F (v ) has coordinates
(c1 , . . . , cm ) = Ab with respect to basis {w1 , . . . , wm }. Proof
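The matrix A of this theorem is built column by column: column j holds the coordinates of F(vj) with respect to {w1, . . . , wm}. A minimal sketch for an assumed example, the differentiation map on polynomials of degree ≤ 2 with the monomial bases {1, t, t^2} and {1, t}:

```python
import numpy as np

# Linear map F: p(t) -> p'(t) on polynomials of degree <= 2.
# Basis of V: {1, t, t^2}  (coordinates = coefficient vectors of length 3).
# Basis of W: {1, t}       (coordinates = coefficient vectors of length 2).
def deriv_coords(b):
    """Coordinates of F(v) in {1, t}, given coordinates b of v in {1, t, t^2}."""
    b0, b1, b2 = b
    return np.array([b1, 2.0 * b2])

# Column j of A = coordinates of F(v_j); here v_1 = 1, v_2 = t, v_3 = t^2.
basis_coords = np.eye(3)
A = np.column_stack([deriv_coords(e) for e in basis_coords])
print(A)                              # [[0. 1. 0.], [0. 0. 2.]]

b = np.array([5.0, 3.0, 4.0])         # v = 5 + 3t + 4t^2
print(A @ b, deriv_coords(b))         # both give [3. 8.]  (v' = 3 + 8t)
```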
Projection mappings
Definition
Let U and W be subspaces of V such that V = U ⊕ W .
P : V → V is said to project V on U along W if P(u + w) = u
for all u ∈ U and w ∈ W ; then, P is called a projector. Example
Theorem
Let V be a vector space and P : V → V .
1. If P is a projector, then P ∈ L(V , V ), V = R(P) ⊕ N (P),
and P projects V on R(P) along N (P). Proof
2. If P is a projector, then P 2 = P and R(I − P) = N (P). Proof
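A small numerical sketch of a projector, using the example S = [{(1, 1)}] and T = [{(1, −1)}] that appears at the end of the handout: P projects ℝ^2 on S along T, and one can check idempotence and that T ⊂ N(P).

```python
import numpy as np

# Project R^2 on S = [{(1,1)}] along T = [{(1,-1)}]:
# write v = alpha*(1,1) + beta*(1,-1) and keep only the S-component.
B = np.array([[1.0, 1.0],
              [1.0, -1.0]])            # columns: basis vector of S, then of T

def P(v):
    alpha, beta = np.linalg.solve(B, v)
    return alpha * B[:, 0]

v = np.array([3.0, 1.0])               # = 2*(1,1) + 1*(1,-1)
print(P(v))                            # [2. 2.]
print(np.allclose(P(P(v)), P(v)))      # idempotence: P^2 = P
print(P(np.array([1.0, -1.0])))        # [0. 0.]  -> (1,-1) lies in N(P)
```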
Definition
Let A ∈ L(V , W ).
1. AL ∈ L(W , V ) is a left inverse of A if AL A = I ∈ L(V , V ).
Example
Theorem
Let A ∈ L(V , W ).
1. A left inverse of A exists iff. ν(A) = 0. Proof
Definition
Let A ∈ L(V , W ). A− ∈ L(W , V ) is said to be a
generalised-inverse (or g-inverse) of A if AA− A = A.
Theorem
Every A ∈ L(V , W ) has a g-inverse.
Proof omitted. Many proofs are available in the literature.
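One g-inverse that is always available numerically is the Moore–Penrose pseudoinverse (it satisfies AA⁻A = A, among other properties). A minimal sketch with an arbitrary rank-deficient matrix, to emphasise that a g-inverse exists even when no ordinary inverse does:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])             # rank 1, so no ordinary inverse exists

A_g = np.linalg.pinv(A)                # Moore-Penrose pseudoinverse, one g-inverse
print(np.allclose(A @ A_g @ A, A))     # True: A A^- A = A
# A g-inverse need not satisfy A^- A = I when nu(A) > 0:
print(np.allclose(A_g @ A, np.eye(2))) # False here
```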
Properties of g-inverses
Theorem
Let A ∈ L(V , W ) and B ∈ L(W , U).
1. ρ(A− ) ≥ ρ(A). Proof
Definition
1. Given A ∈ L(V , W ) and y ∈ W , Ax = y is called a
non-homogeneous equation.
2. The equation Ax = 0 is called the homogeneous part of
Ax = y .
3. A solution of the equation Ax = y is an x0 ∈ V such that
Ax0 = y .
4. If the equation Ax = y has a solution, then it is called
consistent.
Theorem
If A ∈ L(V , W ), then the solution set of the equation Ax = 0 is
N (A) = R(I − A− A) = (I − A− A)V . Proof
Linear equations
Theorem
Let A ∈ L(V , W ).
1. Given y ∈ W , the equation Ax = y is consistent iff.
AA− y = y . Proof
2. Suppose the equation Ax = y is consistent for every
y ∈ W . Then, it has a unique solution x = AL y for every
y ∈ W iff. ν(A) = 0. Proof
3. If ν(A) = 0 and ρ(A) = dim W , then the equation Ax = y is
consistent and has a unique solution x = A−1 y for every
y ∈ W . Proof
4. The solution set of the consistent equation Ax = y is
A− y + (I − A− A)V . Proof
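Items 1 and 4 can be checked numerically. Taking A⁻ to be the Moore–Penrose pseudoinverse (one particular g-inverse), consistency is the condition AA⁻y = y, and every x = A⁻y + (I − A⁻A)z solves Ax = y. The matrix and vectors below are illustrative choices.

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])        # rank-2 map from R^3 to R^2
A_g = np.linalg.pinv(A)                # one choice of g-inverse

y = np.array([2.0, 3.0])
print(np.allclose(A @ A_g @ y, y))     # consistency test: A A^- y = y

# General solution: x = A^- y + (I - A^- A) z for arbitrary z in R^3.
I = np.eye(3)
for z in [np.zeros(3), np.array([1.0, -2.0, 0.5])]:
    x = A_g @ y + (I - A_g @ A) @ z
    print(np.allclose(A @ x, y))       # True for every z
```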
Inner product
Definition
Given the vector space V, a function ⟨., .⟩ : V × V → ℝ is called an inner product on V if for all x, y, z ∈ V and λ, µ ∈ F,
1. ⟨x, y⟩ = ⟨y, x⟩,
2. ⟨λx + µy, z⟩ = λ⟨x, z⟩ + µ⟨y, z⟩,
3. ⟨x, x⟩ ≥ 0, and
4. ⟨x, x⟩ = 0 iff. x = 0. Example
(V, ⟨., .⟩) is called an inner product space. Let ‖x‖ := ⟨x, x⟩^{1/2}.
Theorem
(Cauchy-Schwarz) Consider an inner product space (V, ⟨., .⟩).
If x, y ∈ V, then
1. ⟨x, y⟩ ≤ ⟨x, x⟩^{1/2} ⟨y, y⟩^{1/2}. Proof
Definition
Consider an inner product space (V, ⟨., .⟩).
1. x, y ∈ V are said to be orthogonal if ⟨x, y⟩ = 0.
2. x ∈ V is said to be orthogonal to a subspace W of V if ⟨x, y⟩ = 0 for every y ∈ W.
3. Subspaces U and W of V are said to be orthogonal subspaces if every x ∈ U is orthogonal to W.
4. Given a subspace W of V, W⊥ := {x ∈ V | ⟨x, y⟩ = 0 for every y ∈ W} is called the orthogonal complement of W.
Definition
Consider an inner product space (V, ⟨., .⟩).
1. A set {c1, . . . , cr} ⊂ V is called an orthonormal set if ⟨ci, cj⟩ = δij, where δij = 1 if i = j and δij = 0 if i ≠ j.
2. If an orthonormal set {c1, . . . , cr} is a basis for V, then it is called an orthonormal basis of V.
Properties of orthonormal sets
Consider an inner product space (V , h., .i).
Theorem
1. An orthonormal subset of V is linearly independent. Proof
2. If {c1, . . . , cr} is an orthonormal basis for V, then y = ∑_{i=1}^r ⟨y, ci⟩ci for every y ∈ V. Proof
Lemma
If {x1 , . . . , xk } ⊂ V is independent and {y1 , . . . , yk −1 } ⊂ V is an
orthonormal set with [{y1 , . . . , yk −1 }] = [{x1 , . . . , xk −1 }], then
there exists yk ∈ V such that {y1 , . . . , yk } is an orthonormal set
with [{y1 , . . . , yk }] = [{x1 , . . . , xk }]. Proof
Theorem
(Gram-Schmidt) If (V , h., .i) is finite dimensional, then it has an
orthonormal basis. Proof
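The lemma's construction is precisely the Gram–Schmidt recursion: subtract from xk its components along y1, . . . , y_{k−1} and normalise the remainder. A minimal sketch for the standard inner product on ℝ^n, assuming the input vectors are independent:

```python
import numpy as np

def gram_schmidt(xs):
    """Orthonormalise an independent list of vectors (standard inner product)."""
    ys = []
    for x in xs:
        z = x - sum(np.dot(y, x) * y for y in ys)   # z_k in the lemma
        ys.append(z / np.linalg.norm(z))            # y_k := z_k / ||z_k||
    return np.array(ys)

xs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
Y = gram_schmidt(xs)
print(np.allclose(Y @ Y.T, np.eye(3)))   # rows form an orthonormal basis of R^3
```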
Orthogonal decomposition
Theorem
Consider an inner product space (V , h., .i). If W is a subspace
of V , then W ⊕ W ⊥ = V . Proof
Adjoint transformations I
Definition
Consider inner product spaces (V , h., .iV ) and (W , h., .iW ).
1. If A ∈ L(V , W ), then A∗ ∈ L(W , V ) is called the adjoint of A
if hx, A∗ y iV = hAx, y iW for every x ∈ V and y ∈ W . Example
2. If A ∈ L(V , V ) and A = A∗ , then A is called self-adjoint.
Adjoint transformations II
Theorem
Consider inner product spaces (V , h., .iV ) and (W , h., .iW ), and
A ∈ L(V , W ).
1. A∗ exists and is unique. Proof
2. (A∗ )∗ = A. Proof
3. (A + B)∗ = A∗ + B∗ for every B ∈ L(V, W). Proof
Definition
Given a vector space V , a projector P ∈ L(V , V ) is called an
orthogonal projector of V if R(P) = N (P)⊥ ; in this case, P is
said to orthogonally project V on R(P) along N (P).
Theorem
1. P ∈ L(V , V ) is an orthogonal projector of V iff. it is
self-adjoint and idempotent, i.e., P = P ∗ = P 2 . Proof
2. P is an orthogonal projector of V iff. P ∗ (I − P) = 0. Proof
Distance minimisation
Theorem
Let W be a subspace of an inner product space (V , h., .i) and
let P : V → W . Then, kPx − xk ≤ kw − xk for all x ∈ V and
w ∈ W iff. P is an orthogonal projector of V on W . Proof
Theorem
Consider vector spaces V and U, and transformations
P ∈ L(V , V ) and X ∈ L(U, V ). If P is an orthogonal projector
and R(P) = R(X ), then P = X (X ∗ X )− X ∗ . Proof
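This is the formula behind least-squares estimation: with X a design matrix, P = X(X^T X)⁻X^T (the "hat matrix") orthogonally projects on R(X). A minimal numpy sketch with an arbitrary X, using the pseudoinverse as the g-inverse of X^T X:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 2))            # columns span a 2-dim subspace of R^6
P = X @ np.linalg.pinv(X.T @ X) @ X.T      # P = X (X^T X)^- X^T

print(np.allclose(P, P.T))                 # self-adjoint
print(np.allclose(P @ P, P))               # idempotent
y = rng.standard_normal(6)
print(np.allclose(X.T @ (y - P @ y), 0))   # residual y - Py is orthogonal to R(X)
```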
Application to linear estimation
Theorem
If A is a real symmetric n × n matrix and V ⊂ <n is an invariant
subspace with respect to A, then:
1. There exists λ ≥ 0 and x ∈ V \ {0} such that A2 x = λx.
Proof
Theorem
Suppose A is a real symmetric n × n matrix. Then:
1. The characteristic subspaces of <n corresponding to
distinct roots of A are orthogonal. Proof
2. If λ1 , . . . , λr are the distinct roots of A with corresponding
characteristic subspaces L1 , . . . , Lr , then
<n = L1 ⊕ . . . ⊕ Lr is an orthogonal decomposition. Proof
3. <n has an orthonormal basis consisting of characteristic
vectors of A. Proof
(Semi)definite matrices and roots
Theorem
Suppose A is a real symmetric n × n matrix. Then:
1. hAx, xi ≥ 0 for every x ∈ <n iff. all the roots of A are
non-negative.
2. hAx, xi > 0 for every x ∈ <n \ {0} iff. all the roots of A are
positive.
3. hAx, xi ≤ 0 for every x ∈ <n iff. all the roots of A are
non-positive.
4. hAx, xi < 0 for every x ∈ <n \ {0} iff. all the roots of A are
negative. Proof
Diagonalisation
Theorem
If A is a real symmetric n × n matrix, then there exists an n × n
non-singular matrix C such that A = CΛC T and Λ = C T AC,
where Λ is a diagonal matrix with the characteristic roots of A
on the diagonal. Proof
Corollary
det A = det Λ.
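Numerically, C and Λ are produced by the symmetric eigendecomposition. A minimal sketch for an assumed symmetric matrix, also checking the corollary det A = det Λ:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                         # real symmetric

lam, C = np.linalg.eigh(A)                         # roots and orthonormal eigenvectors
Lam = np.diag(lam)

print(np.allclose(C @ Lam @ C.T, A))               # A = C Lambda C^T
print(np.allclose(C.T @ A @ C, Lam))               # Lambda = C^T A C
print(np.isclose(np.linalg.det(A), np.prod(lam)))  # det A = det Lambda
```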
Diagonalisation of semidefinite matrices
Corollary
If A is positive semidefinite, then there exists a matrix E such that
E^T A E = J = [ I_r 0 ; 0 0 ],
a block matrix with I_r in the top-left block and zeros elsewhere, where r is the number of non-zero roots of A. Proof
Corollary
If A is positive definite, then there exists a non-singular matrix E
such that E T AE = I. Furthermore, A = (E T )−1 E −1 and
A−1 = EE T . Proof
Semidefinite matrices
Theorem
A positive semidefinite matrix A is positive definite if and only if
A is nonsingular. Proof
Theorem
A is positive definite if and only if A−1 is symmetric and positive
definite. Proof
Definitions
1. F = ℝ. Back
2. S + T = ℝ^2, as every (x1, x2) ∈ ℝ^2 can be written as (x1, x2) = (x1, 0) + (0, x2) ∈ S + T.
3. S ∩ T = {(0, 0)} = [{(0, 0)}].
4. So, ℝ^2 = S ⊕ T.
Example of projector
1. S = [{(1, 1)}] and T = [{(1, −1)}] are subspaces of <2 .
Back
1. Consider inner product spaces (ℝ^n, ⟨., .⟩_n) and (ℝ^m, ⟨., .⟩_m), where ⟨., .⟩_n and ⟨., .⟩_m are the unitary inner products on ℝ^n and ℝ^m respectively. Back
2. Let the m × n matrix A generate the linear mapping x ↦ Ax from ℝ^n to ℝ^m. Let the n × m matrix A^T generate the linear mapping y ↦ A^T y from ℝ^m to ℝ^n.
3. For every x ∈ ℝ^n and y ∈ ℝ^m, ⟨x, A^T y⟩_n = x^T A^T y = (Ax)^T y = ⟨Ax, y⟩_m.
4. Thus, y ↦ A^T y represents the adjoint transformation of x ↦ Ax.
5. This characterisation depends on the unitary inner products.
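A quick numerical check of item 3 of this example, with arbitrary illustrative A, x and y and the standard (unitary) inner products:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 2))        # m x n matrix mapping R^2 -> R^3
x = rng.standard_normal(2)
y = rng.standard_normal(3)

# <x, A^T y>_n equals <A x, y>_m, so y |-> A^T y is the adjoint of x |-> A x.
print(np.isclose(np.dot(x, A.T @ y), np.dot(A @ x, y)))   # True
```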
(In)dependent proofs I
Proof.
1. Suppose there exist sets {x1, . . . , xn} ⊂ X and {α1, . . . , αn} ⊂ F such that ∑_{i=1}^n αi xi = 0 and α1 ≠ 0. Back
1.1 As x1 = −∑_{i=2}^n (αi/α1) xi, we have x1 ∈ [{x2, . . . , xn}] ⊂ [X \ {x1}]. So, X is dependent.
1.2 Conversely, suppose X is dependent. Then, there exists y ∈ X such that y ∈ [X \ {y}]. So, there exist {x1, . . . , xn} ⊂ X \ {y} and {α1, . . . , αn} ⊂ F such that y = ∑_{i=1}^n αi xi. Setting xn+1 = y, we have ∑_{i=1}^{n+1} αi xi = 0 with αn+1 = −1.
2. Trivial. Back
3. Trivial. Back
4. Trivial. Back
5. Trivial. Back
(In)dependent proofs II
Proof.
6. If X ∪ {y} is independent, then y ∉ [(X ∪ {y}) \ {y}] = [X \ {y}], as required. Back
Proof.
7. If {x1, . . . , xn} is dependent, then there exist α1, . . . , αn ∈ F, not all 0, such that ∑_{i=1}^n αi xi = 0. Let m = max{i | αi ≠ 0}. Then, xm = −∑_{i=1}^{m−1} (αi/αm) xi ∈ [{x1, . . . , xm−1}], as required. The converse is trivial. Back
Lemma
If B = {y1 , . . . , yn } is a basis for V and x ∈ V \ {0}, then there
is a set B 0 ⊂ B, B 0 6= B, such that {x} ∪ B 0 is a basis for V .
Proof.
1. As B is a basis for V , {x} ∪ B is dependent and spans V .
2. Then, there exists yj such that yj ∈ [{x, y1 , . . . , yj−1 }]. Ref
Proof.
1. If [X ] = V , the result holds. Suppose [X ] 6= V . Back
Proof.
2. If {x1 , . . . , xn } is a basis for V , then it spans V by definition.
Back
Proof.
3. If {x1 , . . . , xn } is a basis for V , then it is independent by
definition. Back
3.1 Conversely, suppose X := {x1 , . . . , xn } is independent.
3.2 If [X] ≠ V, then there exists y ∈ V \ [X]. Since X ∪ {y} is independent, it can be extended to a basis for V, so a basis for V contains at least n + 1 vectors, which contradicts dim V = n.
Direct sum proofs I
Proof.
1. Let V = S ⊕ T and x ∈ V . Suppose x = y1 + z1 = y2 + z2 ,
where y1 , y2 ∈ S and z1 , z2 ∈ T . Then,
y1 − y2 = z2 − z1 ∈ S ∩ T = [0]. Consequently, y1 = y2 and
z1 = z2 .
The converse is trivial. Back
2. Let X := {x1 , . . . , xk } be a basis for S ∩ T . Since X ⊂ S is
independent, it can be extended to a basis
Y := X ∪ {y1 , . . . , yl } for S. Similarly, there is a basis
Z := X ∪ {z1 , . . . , zm } for T .
So, dim S = k + l, dim T = k + m, and dim(S ∩ T ) = k .
We show that Y ∪ {z1 , . . . , zm } is a basis for S + T , i.e.,
dim(S + T ) = k + l + m, as required. Back
Direct sum proofs II
Proof.
3. Let X := {x1 , . . . , xk } be a basis for S ∩ T . As X is
independent, it is extendable to a basis
Y := X ∪ {y1 , . . . , yl } for S. Similarly, there is a basis
Z := X ∪ {z1 , . . . , zm } for T . It suffices to show that
Y ∪ {z1 , . . . , zm } is a basis for S + T . This set clearly spans
S + T . So, it suffices to show that it is independent. Back
3.1 Consider the equation ∑_{i=1}^k αi xi + ∑_{j=1}^l βj yj + ∑_{h=1}^m γh zh = 0. It follows that ∑_{h=1}^m γh zh ∈ S ∩ T (it lies in T by definition, and in S because it equals −∑_{i=1}^k αi xi − ∑_{j=1}^l βj yj).
3.2 Thus, ∑_{h=1}^m γh zh = ∑_{i=1}^k δi xi for some δ1, . . . , δk.
3.3 As Z is independent, γ1 = . . . = γm = 0 = δ1 = . . . = δk. Thus, ∑_{i=1}^k αi xi + ∑_{j=1}^l βj yj = 0.
3.4 As Y is independent, α1 = . . . = αk = 0 = β1 = . . . = βl.
Isomorphism proofs
Proof.
1. Given a basis {x1, . . . , xn} for V and y ∈ V, there is a unique n-tuple (α1(y), . . . , αn(y)) ∈ F^n such that y = ∑_{i=1}^n αi(y) xi. Back
2. It is easy to check that y 7→ (α1 (y ), . . . , αn (y )) is an
isomorphism (i.e., a linear bijection) from V to F n .
Rank and nullity proofs I
Proof.
1. Let ν(A) = k and let {x1 , . . . , xk } be a basis for N (A). Back
Proof.
2. If A is not injective, then there exist x, y ∈ V , x 6= y , such
that A(x − y ) = Ax − Ay = 0. Thus, x − y 6= 0,
x − y ∈ N (A), and ν(A) > 0. Back
3. As R(A) ⊂ W and dim R(A) = dim W , we have R(A) = W .
Back
Proof.
1. Clearly, ABx = A(Bx) ∈ W . Consider x, y ∈ U and
α, β ∈ F . Then,
AB(αx + βy ) = A(αBx + βBy ) = αABx + βABy . Back
2. Clearly, R(AB) = AR(B) ⊂ AV = R(A). As A0 = 0, we
have N (B) ⊂ N (AB). Back
3. As R(AB) ⊂ R(A), we have ρ(AB) ≤ ρ(A). Also, ρ(AB) = ρ(B) − ν(A|R(B)) ≤ ρ(B). As N(B) ⊂ N(AB), we have ν(B) ≤ ν(AB). Back
Matrix representation of mapping
Proof.
1. Let v = ∑_{j=1}^n bj vj be the representation of v in terms of the basis {v1, . . . , vn}.
Proof.
3. Suppose P ∈ L(V , V ) is idempotent. Back
Proof.
2. Suppose B is a right inverse of A. Back
Proof.
1. Using Theorem Ref , ρ(A) = ρ(AA− A) ≤
min{ρ(AA− ), ρ(A)} ≤ min{min{ρ(A), ρ(A− )}, ρ(A)} ≤ ρ(A− )
Back
Proof.
6. We need to show that A(BA)− BA = A. Back
Proof.
9. Using Theorem Ref , AA− projects W on R(A). Back
Proof.
1. We have to show that R(I − A− A) = N (A). Back
Proof.
1. If Ax = y is consistent, then y ∈ R(A). We know that AA−
projects W on R(A). Therefore, AA− y = y .
The converse is trivial. Back
2. Suppose Ax = y has a unique solution for every y ∈ W. If ν(A) > 0, then there exists x ∈ N(A) such that x ≠ 0, and adding x to any solution yields another solution, which violates the uniqueness assumption. Conversely, if ν(A) = 0, then AL exists. If Ax = y is consistent, then y ∈ R(A). Thus, there exists x ∈ V such that Ax = y. This means AL y = AL Ax = Ix = x, i.e., the unique solution is x = AL y, and A AL y = Ax = y. Back
3. It follows from the assumptions that A−1 exists. Therefore,
the equation Ax = y has a unique solution A−1 y for every
y ∈ W . Back
Non-homogeneous equation proofs II
Proof.
4. Let x and x 0 be solutions of Ax = y . Back
Proof.
1. If y = 0, then hy , y i = 0 and
hx, y i = hx, 0i = hx, c − ci = hx, ci − hx, ci = 0, where
c ∈ V . So, the inequality holds. Back
1.1 Suppose y 6= 0. Then, hy , y i > 0.
1.2 For every λ ∈ <, we have
0 ≤ hx − λy , x − λy i = hx, xi + λ2 hy , y i − 2λhx, y i.
1.3 Setting λ = hx, y i/hy , y i, we have
0 ≤ hx, xi + hx, y i2 /hy , y i − 2hx, y i2 /hy , y i, which yields the
result.
2. If y = 0 or x = λy for some λ ∈ <, then the equality is
satisfied. Back
2.1 Conversely, suppose y 6= 0 and x 6= λy for every λ ∈ <.
Then, the above argument holds with a strict inequality.
Orthonormal sets proofs I
Proof.
1. Consider an orthonormal set {c1, . . . , cr} ⊂ V and the equation ∑_{i=1}^r αi ci = 0. For every j ∈ {1, . . . , r},
αj = ∑_{i=1}^r αi δij = ∑_{i=1}^r αi ⟨ci, cj⟩ = ⟨∑_{i=1}^r αi ci, cj⟩ = ⟨0, cj⟩ = 0. Back
2. Let y ∈ V. As {c1, . . . , cr} is a basis for V, y has a representation y = ∑_{i=1}^r αi ci. Then, as {c1, . . . , cr} is an orthonormal set,
⟨y, cj⟩ = ⟨∑_{i=1}^r αi ci, cj⟩ = ∑_{i=1}^r αi ⟨ci, cj⟩ = ∑_{i=1}^r αi δij = αj. Back
Gram-Schmidt Lemma proof
Proof.
1. Let zk := xk − ∑_{j=1}^{k−1} ⟨yj, xk⟩yj. For every i ∈ {1, . . . , k − 1}, ⟨zk, yi⟩ = ⟨xk, yi⟩ − ∑_{j=1}^{k−1} ⟨yj, xk⟩⟨yj, yi⟩ = ⟨xk, yi⟩ − ⟨yi, xk⟩ = 0. Back
2. If zk = 0, then xk = ∑_{j=1}^{k−1} ⟨yj, xk⟩yj, i.e., xk ∈ [{y1, . . . , yk−1}] = [{x1, . . . , xk−1}], a contradiction.
3. As zk ≠ 0, let yk := zk/‖zk‖. Then, ‖yk‖ = 1, and by Step 1, ‖zk‖yk = zk ∈ [{y1, . . . , yk−1}]⊥. Thus, {y1, . . . , yk} is an orthonormal set.
4. Note that xk = ∑_{j=1}^{k−1} ⟨yj, xk⟩yj + ‖zk‖yk.
5. As [{x1, . . . , xk−1}] = [{y1, . . . , yk−1}], there exists (α1, . . . , αk−1) ∈ F^{k−1} such that ∑_{j=1}^{k−1} ⟨yj, xk⟩yj = ∑_{j=1}^{k−1} αj xj. So, ‖zk‖yk = xk − ∑_{j=1}^{k−1} αj xj.
6. Thus, [{x1, . . . , xk}] ⊂ [{y1, . . . , yk}] ⊂ [{x1, . . . , xk}].
Gram-Schmidt proof
Proof.
1. Let {x1 , . . . , xn } be a basis for V . Then, every xi 6= 0. Back
Proof.
1. Let y ∈ W. Using Ref , V has an orthonormal basis {e1, . . . , en}. Back
1.1 If B ∈ L(W, V) is an adjoint of A, then ⟨ei, By⟩ = ⟨Aei, y⟩ for i = 1, . . . , n. Consequently,
By = ∑_{i=1}^n ⟨ei, By⟩ei = ∑_{i=1}^n ⟨Aei, y⟩ei (1)
Since every adjoint of A satisfies (1), it is unique if it exists.
1.2 We now verify that B given by (1) is an adjoint of A.
1.3 Using (1), ⟨ej, By⟩ = ⟨ej, ∑_{i=1}^n ⟨Aei, y⟩ei⟩ = ∑_{i=1}^n ⟨Aei, y⟩⟨ej, ei⟩ = ⟨Aej, y⟩ for every ej and y ∈ W.
1.4 Consider x ∈ V. As x = ∑_{i=1}^n ⟨x, ei⟩ei, ⟨Ax, y⟩ = ⟨A ∑_{i=1}^n ⟨x, ei⟩ei, y⟩ = ∑_{i=1}^n ⟨x, ei⟩⟨Aei, y⟩ = ∑_{i=1}^n ⟨x, ei⟩⟨ei, By⟩ = ⟨∑_{i=1}^n ⟨x, ei⟩ei, By⟩ = ⟨x, By⟩. So, A∗ = B.
Adjoint proofs II
Proof.
2. hy , (A∗ )∗ xiW = hA∗ y , xiV = hy , AxiW for all x ∈ V and
y ∈ W . Thus, hy , (A∗ )∗ x − AxiW = 0 for all x ∈ V and
y ∈ W . So, (A∗ )∗ x = Ax for every x ∈ V , i.e., (A∗ )∗ = A.
Back
Proof.
5. Using Theorem Ref , V = N (A) ⊕ N (A)⊥ . Using Theorem
Ref , dim V = ν(A) + dim N (A)⊥ . Back
Proof.
The following proofs use Theorems Ref and Ref .
5. I = I ∗ = (AA−1 )∗ = (A−1 )∗ A∗ and
I = I ∗ = (A−1 A)∗ = A∗ (A−1 )∗ . Back
6. If AL exists, then I = I ∗ = (AL A)∗ = A∗ (AL )∗ . So,
(A∗ )R = (AL )∗ . Back
7. If AR exists, then I = I ∗ = (AAR )∗ = (AR )∗ A∗ . So,
(A∗ )L = (AR )∗ . Back
8. Using the definition of A− ,
A∗ = (AA− A)∗ = A∗ (AA− )∗ = A∗ (A− )∗ A∗ . By the definition
of (A∗ )− , we have (A∗ )− = (A− )∗ . Back
Orthogonal projector proof I
Proof.
1. Suppose P is an orthogonal projector. Back
Proof.
2. Suppose P is an orthogonal projector. Back
Proof.
1. Suppose P is an orthogonal projector of V on W . Let
x ∈ V and w ∈ W . Back
1.1 Then, w − Px ∈ W = R(P) and
x − Px ∈ R(I − P) = N (P) = W ⊥ .
1.2 Consequently, hx − Px, w − Pxi = 0.
1.3 As ‖x − w‖^2 = ‖(x − Px) + (Px − w)‖^2 = ‖x − Px‖^2 + ‖Px − w‖^2 + 2⟨x − Px, Px − w⟩ = ‖x − Px‖^2 + ‖Px − w‖^2, we have ‖Px − x‖ ≤ ‖w − x‖, as required.
Proof.
2. Conversely, suppose Px ∈ W and kPx − xk ≤ kw − xk for
all x ∈ V and w ∈ W .
3. We first show that x − Px ∈ W ⊥ . Moreover, if x ∈ W ⊥ ,
then Px = 0.
3.1 If w, w 0 ∈ W and t ∈ (0, 1), then
0 ≥ kx − Pxk2 − kx − (tw 0 + (1 − t)Px)k2 = kx − Pxk2 − k(x −
Px) − t(w 0 − Px)k2 = 2thx − Px, w 0 − Pxi − t 2 kw 0 − Pxk2 .
Therefore, 0 ≥ 2hx − Px, w 0 − Pxi − tkw 0 − Pxk2 .
Letting t ↓ 0, we have 0 ≥ hx − Px, w 0 − Pxi.
3.2 If w 0 = Px + w, then 0 ≥ hx − Px, wi. If w 0 = Px − w, then
0 ≤ hx − Px, wi. Thus, hx − Px, wi = 0, i.e., x − Px ∈ W ⊥ .
3.3 If x ∈ W ⊥ , then kx − wk2 = kxk2 + kwk2 ≥ kx − 0k2 , for
every w ∈ W . Thus, Px = 0.
Linearity of distance minimiser I
Proof.
4. We now show that P ∈ L(V , W ), i.e., P is linear.
4.1 kw 0 − (x + w)k = k(w 0 − w) − xk for w, w 0 ∈ W and x ∈ V .
Therefore, kw 00 − (x + w)k ≥ kw 0 − (x + w)k iff.
k(w 00 − w) − xk ≥ k(w 0 − w) − xk.
4.2 So, w 0 = P(x + w) iff. w 0 − w = P(x). Thus,
P(x + w) = P(x) + w.
4.3 For x ∈ V , let x ⊥ := x − Px. For x, y ∈ V ,
P(x + y ) = P(x ⊥ + Px + y ⊥ + Py ). As Px + Py ∈ W , we
have P(x + y ) = P(x ⊥ + y ⊥ ) + Px + Py .
4.4 By Lemma Ref , x ⊥ , y ⊥ ∈ W ⊥ . So, x ⊥ + y ⊥ ∈ W ⊥ . By
Lemma Ref , P(x ⊥ + y ⊥ ) = 0. Thus, P(x + y ) = Px + Py .
4.5 Let α 6= 0 and x ∈ V . If w ∈ W , then
kw − αxk = |α|kw/α − xk. So, if w, w 0 ∈ W , then
kw − αxk ≤ kw 0 − αxk iff. kw/α − xk ≤ kw 0 /α − xk.
Therefore, if w = P(αx), then w/α = Px. So,
P(αx) = αPx.
Linearity of distance minimiser II
Proof.
5. Using Steps Ref and Ref , R(P) = W and W ⊥ ⊂ N (P).
We finally show that W ⊥ ⊃ N (P), and consequently, P is
an orthogonal projector.
5.1 Let x ∈ N (P).
5.2 As V = W ⊕ W ⊥ , we have x = w + w ⊥ with w ∈ W and
w ⊥ ∈ W ⊥.
5.3 Then, 0 = Px = Pw + Pw ⊥ = Pw. So, w = 0 and
x = w ⊥ ∈ W ⊥ . Thus, W ⊥ ⊃ N (P).
Characterisation of orthogonal projector
Proof.
1. Let y ∈ V . As P is an orthogonal projector, P(I − P)y = 0
and (I − P)y ∈ N (P) = R(P)⊥ . Back
2. As R(X ) = R(P), Py = Xb for some b ∈ U. Therefore,
y − Xb = y − Py ∈ R(P)⊥ = R(X )⊥ .
3. So, 0 = hy − Xb, Xzi = hX ∗ y − X ∗ Xb, zi for every z ∈ U.
Thus, X ∗ y = X ∗ Xb. Using Ref ,
b = (X ∗ X )− X ∗ y + [I − (X ∗ X )− X ∗ X ]u for some u ∈ U.
4. Using Ref and Ref , ρ(X ∗ X ) = ρ(X ∗ ) = ρ(X ). Using Ref , X (X ∗ X )− X ∗ projects V on R(X ).
5. Thus, X [I − (X ∗ X )− X ∗ X ]u = [X − X (X ∗ X )− X ∗ X ]u =
[I − X (X ∗ X )− X ∗ ]Xu = 0.
6. So, Py = Xb = X (X ∗ X )− X ∗ y + X [I − (X ∗ X )− X ∗ X ]u =
X (X ∗ X )− X ∗ y for every y ∈ V . Thus, P = X (X ∗ X )− X ∗ .
Existence of roots and invariant subspaces I
Proof.
1. By Weierstrass’ theorem, there exists x ∈ V such that
kxk = 1 and kAxk ≥ kAy k for all y ∈ V with ky k = 1. Back
1.1 So, for every y ∈ V \ {0}, as kA(y /ky k)k ≤ kAxk, we have
kAy k ≤ kAxkky k.
1.2 Using the symmetry of A, the Cauchy-Schwarz inequality Ref , and Step (1.1),
Proof.
2. By Theorem Ref , there exist λ ≥ 0 and x ∈ V \ {0} such that 0 = A^2 x − λx = (A^2 − λI)x = (A − √λ I)(A + √λ I)x. Back
2.1 If z = (A + √λ I)x ≠ 0, then (A − √λ I)z = 0. Set y = z and µ = √λ.
2.2 If z = (A + √λ I)x = 0, then Ax = −√λ x. Set y = x and µ = −√λ.
Roots and orthogonal characteristic subspaces I
Proof.
1. Let λ and µ be distinct roots of A, and let Ax = λx and
Ay = µy .
1.1 Then,
λhx, y i = hλx, y i = hAx, y i = hx, Ay i = hx, µy i = µhx, y i.
Back
Proof.
2. Suppose x ∈ Li ∩ Lj for some i 6= j. Back
Proof.
3. By Theorem Ref , <n = L1 ⊕ . . . ⊕ Lr . Back
Proof.
1. By Theorem Ref , ℝ^n has an orthonormal basis {c1, . . . , cn} where each ci is a characteristic vector of A. Then, x ∈ ℝ^n has a representation x = ∑_{i=1}^n αi ci. Back
2. As Aci = λi ci for λi ∈ ℝ and i = 1, . . . , n, Ax = A ∑_{i=1}^n αi ci = ∑_{i=1}^n αi Aci = ∑_{i=1}^n αi λi ci.
3. Then, ⟨Ax, x⟩ = ⟨∑_{i=1}^n αi λi ci, ∑_{j=1}^n αj cj⟩ = ∑_{i=1}^n ∑_{j=1}^n αi αj λi ⟨ci, cj⟩ = ∑_{i=1}^n αi^2 λi.
4. The results follow from this equation.
Diagonalisation Proof
Proof.
1. By Theorem Ref , <n has an orthonormal basis {c1 , . . . , cn }
where each ci is a characteristic vector of A. Let C be the
matrix with ci as the i-th column. Back
2. By definition, AC = CΛ. By construction, C T C = I. So,
C T AC = C T CΛ = Λ.
3. As {c1 , . . . , cn } is orthonormal, it is independent.
Therefore, C is non-singular.
4. Consequently, C T = C −1 . Therefore, CΛC T = ACC T = A.
Diagonalisation of semidefinite matrices Proofs
Proof.
1. By Theorem Ref , Λ = C T AC. Back
Proof.
1. By Theorem Ref , there exists a non-singular matrix C such
that C T AC = Λ. Back
2. As all the roots of A are positive by Theorem Ref , Λ = D^2, where D is the diagonal matrix whose diagonal entries are the positive square roots of the roots of A.
3. So, D is non-singular.
4. C is non-singular by construction, as its columns are orthonormal.
5. Setting E := CD^{−1}, E is non-singular and E^T AE = D^{−1} C^T AC D^{−1} = D^{−1} Λ D^{−1} = I. The other claims follow from trivial manipulations.
Semidefinite matrices Proofs
Proof.
1. Consider a positive semidefinite matrix A. Back
Proof.
1. Suppose A is positive definite. Back