
Lecture 3 Linear Algebra

1 Vector Spaces

We begin by introducing vector spaces.

Definition 1.1. A vector space V over R, also called a linear space, is a set V
with operations vector addition (∀u, v ∈ V , u + v defines a vector in V ) and scalar
multiplication (∀α ∈ R, ∀u ∈ V , α · u defines a vector in V ) such that ∀α, β ∈
R, ∀u, v, w ∈ V , the following axioms hold:
A1 (u + v) + w = u + (v + w),
A2 u + v = v + u,
A3 ∃o ∈ V s.t. u + o = u ∀u ∈ V (a zero vector o exists),
A4 ∀u ∈ V, ∃v ∈ V s.t. u + v = o (use −u to denote v),
A5 α · (u + v) = α · u + α · v,
A6 (α + β) · u = α · u + β · u,
A7 α · (β · u) = (α · β) · u,
A8 1 · u = u.
An element of a vector space is called a vector or a point.

Theorem 1.1. If V is a vector space, then ∀α ∈ R, u ∈ V,
(i) o is unique,
(ii) 0 · u = o,
(iii) α · o = o,
(iv) −u = (−1) · u.

Proof. (i): We prove by contradiction. Suppose there are two zero vectors o ̸= ψ, so that ∀u ∈ V, u + o = u and u + ψ = u. Then ψ + o = ψ and o + ψ = o. By A2, ψ + o = o + ψ, which implies ψ = o, a contradiction.
(ii): Since 0 = 0 + 0, 0 · u = (0 + 0) · u. By A6, 0 · u = 0 · u + 0 · u. By A4, there exists −(0 · u). Adding it to both sides gives o = −(0 · u) + 0 · u = −(0 · u) + (0 · u + 0 · u), and by A1 the latter equals (−(0 · u) + 0 · u) + 0 · u = o + 0 · u. By A2 and A3, o + 0 · u = 0 · u. So o = 0 · u.
(iii): α · o = α · (0 · u) for any u ∈ V. By A7 and (ii), α · (0 · u) = (α · 0) · u = 0 · u = o.
(iv): By A1, (−u + u) + (−1) · u = −u + (u + (−1) · u). By A8 and A6, u + (−1) · u = 1 · u + (−1) · u = (1 + (−1)) · u = 0 · u = o. Thus, o + (−1) · u = −u + o, and by A2 and A3, (−1) · u = −u.

Example 1.1. (i) Rn. Vector addition: ∀⃗x, ⃗y ∈ Rn with ⃗x = (x1, x2, . . . , xn), ⃗y = (y1, y2, . . . , yn), ⃗x + ⃗y = (x1 + y1, x2 + y2, . . . , xn + yn) ∈ Rn. Scalar multiplication: ∀α ∈ R, ⃗x ∈ Rn, α · ⃗x = (αx1, αx2, . . . , αxn) ∈ Rn.
(ii) The space of all continuous functions defined on [a, b], denoted by C([a, b]). Vector addition: ∀f, g ∈ C([a, b]), (f + g)(x) = f (x) + g(x) defines f + g ∈ C([a, b]). Scalar multiplication: ∀α ∈ R, f ∈ C([a, b]), (α · f )(x) = αf (x) defines αf ∈ C([a, b]).
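
As a quick numerical illustration of the operations in (i) (a minimal sketch, assuming Python with numpy is available; the vectors are our own examples):

import numpy as np

# Vector addition and scalar multiplication in R^3.
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
alpha = 2.5

print(x + y)      # componentwise addition: [5. 7. 9.]
print(alpha * x)  # scalar multiplication: [2.5 5.  7.5]

# Spot-check two axioms: A3 (zero vector) and A8 (1 * x = x).
zero = np.zeros(3)
assert np.array_equal(x + zero, x)
assert np.array_equal(1.0 * x, x)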

Definition 1.2. Suppose V is a linear space and U is a nonempty subset of V with the same vector addition and scalar multiplication operations. If U is a linear space, we call U a linear subspace of V.

Remark 1.1. To prove that V is a linear space, we need to check (i) V is closed
under + and ·, (ii) A1-A8.

Example 1.2. (i) V = R2 , linear subspaces include R2 itself and {(x, y) : y = αx}
for some fixed α ∈ R.
(ii) V = C([a, b]). U = {α · f : α ∈ R} for some fixed f ∈ C([a, b]) is a linear
subspace.

By the uniqueness of the zero vector, clearly, if U is a linear subspace of V, then U's zero vector is the same as V's.

Definition 1.3. Let v1, v2, . . . , vn be vectors of a linear space V. A linear combination of {v1, v2, . . . , vn} is a vector v = ∑ni=1 αi vi in which αi ∈ R, i = 1, . . . , n.

What does ∑ni=1 αi vi mean? To simplify the exposition, we omit the “·” when there is no risk of confusion.

Remark 1.2. Let A be a set of vectors in V. A linear combination of A is a vector v = ∑ni=1 αi vi in which n ∈ N+ and αi ∈ R, vi ∈ A for i = 1, . . . , n. (∑∞i=1 αi vi may not be well-defined.)

Definition 1.4. Suppose A is a set of vectors of a linear space V. The span of A, denoted by Span(A), is the set of all linear combinations of A. We set Span(Ø) = {o}.

Theorem 1.2. If U is a subset of a linear space V, then Span(U) is a linear subspace of V. Moreover, Span(U) is the smallest linear subspace that contains U.

Definition 1.5. A set of vectors A of a linear space V is linearly independent if ∄v ∈ A s.t. v ∈ Span(A \ {v}). Equivalently, A is linearly independent if for any linear combination v = ∑ni=1 αi vi of distinct vectors v1, . . . , vn ∈ A, we have v = o ⇒ αi = 0, i = 1, . . . , n.
Exercise 1.1. Suppose v1, . . . , vn are distinct vectors. Show that [∑ni=1 αi vi = o ⇒ αi = 0, i = 1, . . . , n] ⇔ [∄vi ∈ {v1, . . . , vn} s.t. vi ∈ Span({v1, . . . , vi−1, vi+1, . . . , vn})].

Theorem 1.3. Let A = {v1, v2, . . . , vn} ⊆ V be linearly independent. Then for any v ∈ Span(A), ∃! α1, . . . , αn ∈ R s.t. v = ∑ni=1 αi vi.
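
For V = Rn the unique coefficients in Theorem 1.3 can be computed by solving a linear system. A minimal sketch (assuming numpy; the basis vectors below are our own example):

import numpy as np

# Columns of M form a linearly independent set {v1, v2, v3} in R^3.
M = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
v = np.array([2.0, 3.0, 4.0])

# Solve M @ alpha = v; the solution is unique by linear independence.
alpha = np.linalg.solve(M, v)
print(alpha)                     # the unique coordinates of v
assert np.allclose(M @ alpha, v)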

Exercise 1.2. Prove the theorem by contradiction.

Definition 1.6. Suppose A is a linearly independent set of vectors in V. A is called a (Hamel) basis of V if V = Span(A). If A contains only a finite number of elements, then we say that the dimension of V is |A|, denoted by dim V = |A|. If A contains an infinite number of elements, we say that V is infinite dimensional.

How about V = {o}? Note that {o} is not a linearly independent set. Since
Span(Ø) = V , we conclude that dim V = |Ø| = 0.

Example 1.3. (i) Finite dimensional: Rn. {e1, e2, . . . , en} is a basis of Rn, where e1 = (1, 0, . . . , 0), e2 = (0, 1, . . . , 0), . . . , en = (0, 0, . . . , 1).
(ii) Infinite dimensional: let V be the set of all polynomials, i.e., V = {f : f (x) = α0 + α1 x + · · · + αn xn for some n ∈ N+ and α0, . . . , αn ∈ R}. {1, x, x2, . . . } is a basis of V.
(iii) R∞ = {(xn)∞n=1 : xn ∈ R ∀n ∈ N+}. Let v1 = (1, 0, 0, . . . ), v2 = (0, 1, 0, . . . ), v3 = (0, 0, 1, . . . ), and so on. {vn : n ∈ N+} is not a Hamel basis of R∞ (it is a Schauder basis). Why? Consider x = (1, 1, 1, . . . ): although x = ∑∞n=1 vn, x is not a finite linear combination of {vn : n ∈ N+}.

Theorem 1.4. Every linear space has a Hamel basis.

The proof of the theorem uses Zorn's Lemma and is rather involved; it only shows existence and does not construct the basis explicitly.

Theorem 1.5. Any linearly independent subset of a linear space can be extended to
a Hamel basis.

Theorem 1.6. If A and B are two finite bases of a linear space V , then |A| = |B|.

Corollary 1.1. Suppose V has dimension n ∈ N+ and {v1, . . . , vk} is a linearly independent subset of V. Then ∃{vk+1, . . . , vn} ⊆ V s.t. {v1, . . . , vn} is a basis of V.

Definition 1.7. Suppose U, V are linear spaces. A function T : U → V is a linear function/mapping/transformation if T (u1 + u2) = T (u1) + T (u2) ∀u1, u2 ∈ U, and T (α · u) = α · T (u) ∀α ∈ R, ∀u ∈ U.

Example 1.4. (i) T (x) = 3x. T (x1 + x2 ) = 3(x1 + x2 ) = 3x1 + 3x2 = T (x1 ) +
T (x2 ), and we can verify T (αx) = αT (x). T is linear.
(ii) T (x) = 3x + 1. T (0 + 0) ̸= T (0) + T (0), so T is not linear.
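
A small numerical check of Example 1.4 (a sketch; check_linearity is our own helper and only tests linearity at the sampled points):

import numpy as np

def check_linearity(T, x1, x2, alpha):
    """Test additivity and homogeneity of T at the given points."""
    additive = np.isclose(T(x1 + x2), T(x1) + T(x2))
    homogeneous = np.isclose(T(alpha * x1), alpha * T(x1))
    return bool(additive and homogeneous)

print(check_linearity(lambda x: 3 * x, 1.0, 2.0, 5.0))      # True
print(check_linearity(lambda x: 3 * x + 1, 1.0, 2.0, 5.0))  # False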

Definition 1.8. A linear transformation T : U → V is invertible if ∃S : V → U s.t. ∀u ∈ U, v ∈ V, S ◦ T (u) = u and T ◦ S(v) = v. We use T −1 to denote such S.

Theorem 1.7. Suppose T : U → V is a linear transformation and T is invertible. Then T −1 : V → U is also linear.

Exercise 1.3. Prove the theorem.

Theorem 1.8. Suppose T : U → V is a linear transformation. Then T is invertible if and only if T is bijective.

2 Matrices

We will not spend too much time on the algebra of matrices.

Definition 2.1. A (real) matrix is a rectangular array of real numbers, denoted by

         ⎡ a11 · · · a1n ⎤
Am×n =   ⎢  ⋮    ⋱    ⋮  ⎥ = (aij) ∈ Rm×n.
         ⎣ am1 · · · amn ⎦

• (aij)m×n + (bij)m×n = (aij + bij)m×n.
• Am×n Bn×l = Cm×l with cij = ∑nk=1 aik bkj (check the definition if you have forgotten it).
• α(aij)m×n = (α · aij)m×n.
• The transpose of Am×n = (aij)m×n is (aji)n×m, denoted by AT.

Remark 2.1. (i) AB ̸= BA in general, but (AB)C = A(BC) holds.
(ii) (A1 A2 · · · An)T = ATn ATn−1 · · · AT1.

Theorem 2.1. Suppose T : Rn → Rm is a linear transformation. Then ∃! AT ∈ Rm×n s.t. ∀u ∈ Rn, T (u) = AT u. In addition, the i-th column of AT is T (ei), where ei = (0, . . . , 0, 1, 0, . . . , 0) (its i-th component is 1).

Remark 2.2. (i) Am×n = (aij ) = [b1 , b2 , . . . , bn ] with bj = [a1j , a2j , . . . , amj ]T
called the j-th column vector. A = [cT1 , cT2 , . . . , cTm ]T with ci = [ai1 , ai2 , . . . , ain ]
called the i-th row vector.

(ii) Intuition: ∀u ∈ Rn , u = [u1 , . . . , un ]T , u = u1 · e1 + u2 · e2 + · · · + un · en .

T (u) = T (u1 · e1 + u2 · e2 + · · · + un · en )

= u1 · T (e1 ) + u2 · T (e2 ) + · · · + un · T (en )

= [T (e1 ), T (e2 ), . . . , T (en )][u1 , u2 , . . . , un ]T = AT u.

(iii) There is a bijection between {T : Rn → Rm : T is linear} and Rm×n because for each matrix A ∈ Rm×n, we can also find a unique TA : Rn → Rm s.t. TA(u) = Au. We only need to verify that TA is linear, which is true.
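
Theorem 2.1 and Remark 2.2 can be verified numerically: build AT column by column from T(ei) and compare with T(u). A sketch (assuming numpy; the map T below is our own example):

import numpy as np

# An example linear map T : R^3 -> R^2.
def T(u):
    return np.array([u[0] + 2 * u[1], 3 * u[2]])

# The i-th column of A_T is T(e_i); the rows of np.eye(3) are e_1, e_2, e_3.
A = np.column_stack([T(e) for e in np.eye(3)])
print(A)   # [[1. 2. 0.], [0. 0. 3.]]

u = np.array([1.0, -2.0, 0.5])
assert np.allclose(A @ u, T(u))   # A_T u reproduces T(u)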

Definition 2.2. A matrix An×n is invertible if ∃Bn×n s.t. AB = BA = In, where

     ⎡ 1       0 ⎤
In = ⎢    ⋱      ⎥ .
     ⎣ 0       1 ⎦

We denote such B as A−1.

Definition 2.3. A square matrix An×n is orthogonal if AT A = AAT = In .

Remark 2.3. (i) A−1 = AT when A is orthogonal.
(ii) Write A = [b1, . . . , bn]. Then AT A = In implies, ∀j, k,

bTj bk = 1 if j = k,   and   bTj bk = 0 if j ̸= k.

Note that bTj bk = bj · bk. Thus we must have ∥bj∥ = √(bj · bj) = 1 ∀j, and all column vectors of A are pairwise orthogonal (i.e. bj · bk = 0 for any j ̸= k). The same applies to A's row vectors.
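
A quick numerical check of Remark 2.3 (a sketch; the rotation matrix is our own example of an orthogonal matrix):

import numpy as np

theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation: orthogonal

assert np.allclose(Q.T @ Q, np.eye(2))   # Q^T Q = I_2
assert np.allclose(Q @ Q.T, np.eye(2))   # Q Q^T = I_2

# Columns have unit norm and are pairwise orthogonal.
print(np.linalg.norm(Q[:, 0]))   # 1.0
print(Q[:, 0] @ Q[:, 1])         # 0.0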

Definition 2.4. A matrix A = (aij )m×n is a diagonal matrix if aij = 0 ∀i ̸= j.

Theorem 2.2 (Singular Value Decomposition). Any matrix Am×n can be written as Am×n = Um×m Σm×n (Vn×n)T for some orthogonal U, V and some diagonal Σ.
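
numpy computes the decomposition directly; a minimal sketch (note np.linalg.svd returns the diagonal of Σ as a vector of singular values):

import numpy as np

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0]])   # a 2 x 3 example of our own

U, s, Vt = np.linalg.svd(A)            # A = U @ Sigma @ V^T
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)   # embed the diagonal in a 2 x 3 matrix

assert np.allclose(U @ Sigma @ Vt, A)
assert np.allclose(U.T @ U, np.eye(2))    # U orthogonal
assert np.allclose(Vt @ Vt.T, np.eye(3))  # V orthogonal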

Definition 2.5. The determinant of a square matrix An×n is denoted by det(A) or |A|. (see wiki)

Theorem 2.3. A square matrix A is invertible if and only if det(A) ̸= 0. Moreover, A−1 = (1/det(A)) adj(A). (see wiki)

Theorem 2.4. Let A, B be n × n square matrices. Then
(i) det(AB) = det(A) · det(B);
(ii) det(αA) = αn det(A);
(iii) det(A) = det(AT);
(iv) det(A−1) = 1/det(A), if det(A) ̸= 0.
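
These identities are easy to spot-check numerically (a sketch with random matrices of our own; a random Gaussian matrix is invertible with probability one, so (iv) applies):

import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
alpha = 2.0
det = np.linalg.det

assert np.isclose(det(A @ B), det(A) * det(B))         # (i)
assert np.isclose(det(alpha * A), alpha**n * det(A))   # (ii)
assert np.isclose(det(A), det(A.T))                    # (iii)
assert np.isclose(det(np.linalg.inv(A)), 1 / det(A))   # (iv)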

Theorem 2.5. Let A, A1, A2, . . . , An be invertible matrices. Then
(i) (A−1)−1 = A;
(ii) (αA)−1 = (1/α) · A−1, if α ̸= 0;
(iii) (A1 A2 · · · An)−1 = A−1n A−1n−1 · · · A−11.

Remark 2.4. For a square An×n with A = U ΣV T, we have A−1 = V Σ−1 U T, where

    ⎡ λ1         0 ⎤          ⎡ 1/λ1            0 ⎤
Σ = ⎢     ⋱        ⎥ ,  Σ−1 = ⎢       ⋱           ⎥ .
    ⎣ 0        λn  ⎦          ⎣ 0          1/λn   ⎦

A is invertible if and only if Σ is invertible, if and only if λi ̸= 0 ∀i = 1, . . . , n.
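
A numerical check of Remark 2.4 (a sketch; the invertible matrix is our own example):

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
U, s, Vt = np.linalg.svd(A)

# A^{-1} = V Sigma^{-1} U^T with Sigma^{-1} = diag(1/lambda_1, ..., 1/lambda_n).
A_inv = Vt.T @ np.diag(1 / s) @ U.T

assert np.allclose(A_inv, np.linalg.inv(A))
assert np.allclose(A @ A_inv, np.eye(2))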

Definition 2.6. The rank of a matrix Am×n = [b1, . . . , bn] is dim(Span({b1, . . . , bn})), denoted by rank(A). Span({b1, . . . , bn}) is called the column space of A.

Theorem 2.6. rank(A) is equal to the maximum number of linearly independent column (row) vectors.

Remark 2.5. Why also the maximum number of row vectors? Let {c1, c2, . . . , cr} be a basis of the column space of A. Then there exists Vr×n such that [c1, c2, . . . , cr]V = A. Thus, each row of A is a linear combination of the rows of V, which implies that the maximum number of linearly independent row vectors is at most r. By doing the same for AT, one can show that the maximum number of linearly independent column vectors of A is at most the maximum number of linearly independent row vectors.

Theorem 2.7. Let A, B, C be matrices. Then
(i) rank(A) = rank(AT);
(ii) rank(Am×n) ≤ min{m, n};
(iii) rank(AB) ≤ min{rank(A), rank(B)};
(iv) if Am×n is full rank, i.e., rank(A) = min{m, n}, then
(a) if m ≥ n, then rank(Am×n Bn×l) = rank(B),
(b) if m ≤ n, then rank(Cl×m Am×n) = rank(C);
(v) An×n is invertible if and only if A has full rank.

Example 2.1. Suppose A, B, B′ are the diagonal matrices

    ⎡ 2 0 0 ⎤       ⎡ 1 0 0 ⎤        ⎡ 0 0 0 ⎤
A = ⎢ 0 1 0 ⎥ , B = ⎢ 0 2 0 ⎥ , B′ = ⎢ 0 2 0 ⎥ .
    ⎣ 0 0 0 ⎦       ⎣ 0 0 1 ⎦        ⎣ 0 0 1 ⎦

Then

     ⎡ 2 0 0 ⎤         ⎡ 0 0 0 ⎤
AB = ⎢ 0 2 0 ⎥ , AB′ = ⎢ 0 2 0 ⎥ .
     ⎣ 0 0 0 ⎦         ⎣ 0 0 0 ⎦

We have rank(AB) = 2 = min{rank(A), rank(B)}, while rank(AB′) = 1 < min{rank(A), rank(B′)} = 2.
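
A numerical check of Example 2.1 (a sketch, under the diagonal matrices above):

import numpy as np

A = np.diag([2.0, 1.0, 0.0])    # rank 2
B = np.diag([1.0, 2.0, 1.0])    # rank 3
Bp = np.diag([0.0, 2.0, 1.0])   # rank 2

rank = np.linalg.matrix_rank
print(rank(A @ B), min(rank(A), rank(B)))     # 2 2: equality holds
print(rank(A @ Bp), min(rank(A), rank(Bp)))   # 1 2: strict inequality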

Definition 2.7. A matrix An×n is
(i) positive (negative) definite if ∀u ∈ Rn s.t. u ̸= ⃗0, we have uT Au > (<) 0;
(ii) positive (negative) semidefinite if ∀u ∈ Rn s.t. u ̸= ⃗0, we have uT Au ≥ (≤) 0.

Remark 2.6. u ↦ Au is a linear function; u ↦ uT Au is called a quadratic form.

Definition 2.8. Let An×n be a matrix. If v ∈ Rn with v ̸= ⃗0 and λ ∈ R satisfy Av = λv, we say that λ is an eigenvalue of A and v is an eigenvector of A (associated with λ).

Remark 2.7. How to find eigenvalues and eigenvectors?
(i) Av = λv = λIn v ⇒ Av − λIn v = ⃗0 ⇒ (A − λIn)v = ⃗0. We want to find λ's satisfying the equation for some nonzero v.
(ii) It follows that we want to find λ's s.t. A − λIn is not full rank, since otherwise v = ⃗0 is the only solution. This is equivalent to solving det(A − λIn) = 0 for λ.
(iii) To find the eigenvectors, for each solution λ to det(A − λIn) = 0, solve (A − λIn)v = ⃗0 to find nonzero solutions for v.
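
In practice one rarely expands det(A − λIn) by hand; numpy returns the eigenvalues and eigenvectors together. A sketch, using the matrix of Example 2.2 below:

import numpy as np

A = np.array([[2.0, 7.0],
              [-1.0, -6.0]])
eigvals, eigvecs = np.linalg.eig(A)   # columns of eigvecs are eigenvectors
print(eigvals)                        # 1 and -5, in some order

# Check Av = lambda v for each eigenpair.
for lam, v in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ v, lam * v)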

Example 2.2. Find the eigenvalues and the eigenvectors of

A = ⎡  2   7 ⎤
    ⎣ −1  −6 ⎦ .

We have

|A − λI2| = | 2 − λ     7     |
            |  −1     −6 − λ  |
          = (2 − λ)(−6 − λ) − (−1) × 7 = λ2 + 4λ − 5 = (λ + 5)(λ − 1).

So λ = −5 and λ = 1 are the eigenvalues of A. To find the eigenvectors associated with λ = −5, let (A + 5I2)v = ⃗0, which reads

⎡  7   7 ⎤ v = ⃗0.
⎣ −1  −1 ⎦

∀α ∈ R \ {0}, [α, −α]T is an eigenvector associated with λ = −5.

Finding the SVD of A amounts to finding the eigenvalues/eigenvectors of AAT and AT A. (Why?)
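
The connection can be seen numerically: the eigenvalues of AT A (and of AAT) are the squared singular values of A. A sketch (np.linalg.eigh handles symmetric matrices and returns eigenvalues in ascending order):

import numpy as np

A = np.array([[2.0, 7.0],
              [-1.0, -6.0]])
U, s, Vt = np.linalg.svd(A)      # s: singular values, descending

w, V = np.linalg.eigh(A.T @ A)   # eigenvalues of A^T A, ascending
assert np.allclose(np.sort(s**2), w)

w2, _ = np.linalg.eigh(A @ A.T)  # A A^T has the same eigenvalues
assert np.allclose(np.sort(s**2), w2)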
