Algebraic Methods in Data Science: Lesson 3
Dan Garber
https://dangar.net.technion.ac.il/
The Euclidean norm for $\mathbb{R}^{m \times n}$ is defined similarly to the vector case, and as in the vector case it is induced by the standard inner product. It is called the Frobenius norm and it is given by
$$\|A\|_F = \sqrt{\sum_{i,j} A_{i,j}^2} = \sqrt{\mathrm{Tr}(A^\top A)}.$$
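As a quick numerical sanity check (a minimal NumPy sketch; the example matrix is arbitrary), both expressions agree with NumPy's built-in Frobenius norm:

```python
import numpy as np

# Arbitrary example matrix (any shape works).
A = np.array([[1.0, -2.0, 3.0],
              [0.5,  4.0, -1.0]])

# Entrywise definition: square root of the sum of squared entries.
fro_entrywise = np.sqrt((A ** 2).sum())

# Trace definition: sqrt(Tr(A^T A)).
fro_trace = np.sqrt(np.trace(A.T @ A))

print(fro_entrywise, fro_trace, np.linalg.norm(A, 'fro'))  # all three agree
```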
Theorem (HW)
For every $(p, q) \in [1, \infty] \times [1, \infty]$, the following function (the entrywise $L_{p,q}$ norm) is a norm on $\mathbb{R}^{m \times n}$:
$$\|A\|_{p,q} = \Bigg(\sum_{j=1}^{n} \bigg(\sum_{i=1}^{m} |A_{i,j}|^p \bigg)^{q/p} \Bigg)^{1/q}.$$
(Note that $\|A\|_{2,2} = \|A\|_F$.)
Definition
Given a real square matrix $A \in \mathbb{R}^{n \times n}$ we say $\lambda$ is an eigenvalue of $A$ if there exists a vector $u \neq 0$ such that $Au = \lambda u$.
We say $u$ is an eigenvector corresponding to eigenvalue $\lambda$.
Recall that even when A is real, both eigenvalues and eigenvectors need
not be real (i.e., can be complex).
Equivalently, $\lambda$ is an eigenvalue of $A$ if and only if the system
$$(\lambda I - A)v = 0$$
has a nonzero solution $v$, i.e., if and only if $\det(\lambda I - A) = 0$.
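For instance, a real rotation matrix has no real eigenvalues; a minimal NumPy sketch illustrating this:

```python
import numpy as np

# A real 90-degree rotation matrix: no real vector keeps its direction,
# so its eigenvalues cannot be real.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

eigvals, eigvecs = np.linalg.eig(R)
print(eigvals)  # [0.+1.j  0.-1.j] -- complex, although R is real
```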
Theorem
Let $\lambda_i$, $i = 1, \dots, k$, be the distinct eigenvalues of a matrix $A \in \mathbb{R}^{n \times n}$, and let $\phi_i = \mathrm{Ker}(\lambda_i I - A)$, $i = 1, \dots, k$, be the corresponding eigenspaces.
Then, any $k$ nonzero vectors $u_i \in \phi_i$, $i = 1, \dots, k$, are linearly independent.
Proof: $U$ is invertible because its columns are eigenvectors corresponding to distinct eigenvalues. By the previous theorem such eigenvectors are linearly independent, so the columns of $U$ are linearly independent; hence $U$ has full rank and is therefore invertible.
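A small NumPy sketch of this fact (the example matrix is an arbitrary choice with distinct eigenvalues): stacking eigenvectors of distinct eigenvalues as columns yields a full-rank, hence invertible, matrix.

```python
import numpy as np

# Upper-triangular matrix: eigenvalues 2, 3, 5 (distinct by construction).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

eigvals, U = np.linalg.eig(A)      # columns of U are eigenvectors
print(eigvals)                     # [2. 3. 5.]
print(np.linalg.matrix_rank(U))    # 3: columns independent, U invertible
```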
Let $A \in S^n$ and let $\lambda$ be an eigenvalue of $A$ with (possibly complex) eigenvector $u \neq 0$. Then
$$u^* A u = \lambda u^* u, \qquad u^* A^* u = \lambda^* u^* u. \tag{1}$$
Subtracting the two equations in (1) gives
$$u^* A u - u^* A^* u = u^*(A - A^\top)u = (\lambda - \lambda^*)\|u\|_2^2.$$
Since $A = A^\top$, the left-hand side is zero, and hence $\lambda = \lambda^*$, i.e., $\lambda$ is real.
Let $v_i \in \phi_i$, $v_j \in \phi_j$, $i \neq j$.
Since $Av_i = \lambda_i v_i$ we have $v_j^\top A v_i = \lambda_i v_j^\top v_i$; on the other hand, since $A$ is symmetric and $Av_j = \lambda_j v_j$, also $v_j^\top A v_i = (Av_j)^\top v_i = \lambda_j v_j^\top v_i$. Subtracting,
$$0 = (\lambda_i - \lambda_j) v_j^\top v_i,$$
and since $\lambda_i \neq \lambda_j$, it follows that $v_j^\top v_i = 0$: eigenvectors of distinct eigenvalues of a symmetric matrix are orthogonal.
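Both facts, real eigenvalues and orthogonal eigenvectors, can be checked numerically; a minimal sketch using numpy.linalg.eigh, which is specialized for symmetric matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
S = (B + B.T) / 2                # symmetrize: S is in S^4

eigvals, U = np.linalg.eigh(S)   # eigh assumes a symmetric input
print(eigvals.dtype)             # float64: the eigenvalues are real
print(np.round(U.T @ U, 10))     # identity: the eigenvectors are orthonormal
```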
Thus, the matrices $A$ and $U_1^\top A U_1$ are similar! In particular they have the same eigenvalues (including algebraic multiplicities).
Because of the block diagonal structure of $U_1^\top A U_1$, $\lambda$ is an eigenvalue of $A_1$ of multiplicity $\mu - 1$ (in particular, note that the characteristic polynomial of $A$ is given by $\rho_A(\sigma) = (\sigma - \lambda) \cdot \rho_{A_1}(\sigma)$).
We can continue this process until we reach the actual value of $\mu$ (notice that by the above, for each $A_i$, $\lambda$ is an eigenvalue of algebraic multiplicity $\mu - i$), and at this point we exit the procedure with an orthonormal basis of $\phi$ composed of exactly $\mu$ vectors.
Recall that in order to prove that $\mu_i = \dim \phi_i$ we have used the following lemma:
Lemma
Let $B \in S^m$ and let $\lambda$ be an eigenvalue of $B$. Then, there exists an orthogonal matrix $U = [u \; Q] \in \mathbb{R}^{m \times m}$, $Q \in \mathbb{R}^{m \times (m-1)}$, such that
$$Bu = \lambda u, \qquad U^\top B U = \begin{pmatrix} \lambda & 0 \\ 0 & Q^\top B Q \end{pmatrix}, \qquad Q^\top B Q \in S^{m-1}.$$
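The following is a minimal sketch of one step of this construction; the helper name deflate_once and the QR-based completion of u to an orthogonal U are my choices here, one of several valid ways to build Q:

```python
import numpy as np

def deflate_once(B, u):
    """Given symmetric B and a unit-norm eigenvector u of B, complete u to an
    orthogonal U = [u Q] and return U together with Q^T B Q (in S^{m-1})."""
    m = B.shape[0]
    # Complete u to an orthonormal basis via QR of [u | random columns];
    # the first column of the Q-factor then spans span(u).
    M = np.column_stack([u, np.random.default_rng(1).standard_normal((m, m - 1))])
    U, _ = np.linalg.qr(M)
    Q = U[:, 1:]
    return U, Q.T @ B @ Q

# Example: a symmetric matrix and its top eigenpair.
S = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
vals, vecs = np.linalg.eigh(S)
lam, u = vals[-1], vecs[:, -1]

U, B1 = deflate_once(S, u)
print(np.round(U.T @ S @ U, 10))  # block diagonal: [[lam, 0], [0, Q^T S Q]]
```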
Observation
Let $A \in S^n$. Let $\lambda_1, \dots, \lambda_n$ denote its eigenvalues and let $u_1, \dots, u_n$ denote the corresponding eigenvectors. Then,
$$A = \sum_{i \in [n]} \lambda_i u_i u_i^\top = \sum_{i \in [n]: \lambda_i \neq 0} \lambda_i u_i u_i^\top.$$
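A minimal NumPy sketch of the observation on an arbitrary rank-one example: the two sums coincide, since zero eigenvalues contribute nothing.

```python
import numpy as np

# Rank-one symmetric example: exactly one non-zero eigenvalue.
v = np.array([1.0, 2.0, -1.0])
A = np.outer(v, v)

lam, U = np.linalg.eigh(A)

# Sum over all n rank-one terms...
A_all = sum(lam[i] * np.outer(U[:, i], U[:, i]) for i in range(len(lam)))
# ...equals the sum restricted to the non-zero eigenvalues.
A_nz = sum(lam[i] * np.outer(U[:, i], U[:, i])
           for i in range(len(lam)) if abs(lam[i]) > 1e-12)

print(np.allclose(A, A_all), np.allclose(A, A_nz))  # True True
```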
Observation
Let $A \in S^n$. $\mathrm{rank}(A)$ is equal to the number of non-zero eigenvalues of $A$.
For the following, think of the case where $\mathrm{rank}(A) = k \ll n$ (this indeed holds for many matrices representing real-life data).
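A quick numerical check of the rank observation (the 1e-12 threshold is a hedge against floating-point round-off):

```python
import numpy as np

# Rank-2 symmetric matrix built from two rank-one terms.
u1 = np.array([1.0, 0.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0, 0.0])
A = 3.0 * np.outer(u1, u1) + 2.0 * np.outer(u2, u2)

lam = np.linalg.eigvalsh(A)
print(np.sum(np.abs(lam) > 1e-12))  # 2 non-zero eigenvalues
print(np.linalg.matrix_rank(A))     # rank is 2 as well
```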
Let $A \in S^n$ and suppose that the eigen-decomposition
$$A = U \Lambda U^\top = \sum_{i=1}^{k} \lambda_i u_i u_i^\top$$
is given, where $\mathrm{rank}(A) = k$. Then,
Informally speaking, $A$ can be stored in a computer's memory using only $k(1 + n)$ memory units: storing the $k$ non-zero eigenvalues and the $k$ eigenvectors (assuming each memory unit can store a scalar).
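A minimal NumPy sketch of this storage scheme (the construction of the rank-k matrix below is an arbitrary example): only the k eigenvalues and k eigenvectors are kept, and matrix-vector products are computed directly from the factors, never forming the dense matrix.

```python
import numpy as np

n, k = 1000, 5
rng = np.random.default_rng(0)

# A random rank-k symmetric matrix A = sum_i lam_i u_i u_i^T.
U, _ = np.linalg.qr(rng.standard_normal((n, k)))  # k orthonormal eigenvectors
lam = rng.uniform(1.0, 2.0, size=k)               # k non-zero eigenvalues

# Compact storage: k scalars + k length-n vectors = k(1 + n) memory units
# (5,005 here), versus n^2 = 1,000,000 for the dense matrix.

x = rng.standard_normal(n)
Ax = U @ (lam * (U.T @ x))     # matvec from the factors: O(kn) work

A_dense = (U * lam) @ U.T      # dense form, for verification only
print(np.allclose(A_dense @ x, Ax))  # True
```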