Lecture 14
$(A^T A)\,W = A^T Y$
• The optimal $W$ satisfies this system of linear equations, called the normal equations (see the sketch below).
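A minimal numerical sketch of solving the normal equations with NumPy; the data here is invented for illustration, and only $A$, $Y$, and $W$ correspond to symbols on the slides:

```python
import numpy as np

# Toy data, invented for illustration: 5 samples, 2 features plus a bias column.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
A = np.hstack([np.ones((5, 1)), X])  # design matrix A (bias column + features)
Y = rng.normal(size=5)

# Solve the normal equations (A^T A) W = A^T Y as a linear system.
W = np.linalg.solve(A.T @ A, A.T @ Y)
print(W)
```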
$W^* = (A^T A)^{-1} A^T Y = A^\dagger Y$
where $A^\dagger = (A^T A)^{-1} A^T$ is called the generalized inverse of $A$.
• The above $W^*$ is the linear least-squares solution for our regression (or classification) problem.
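The closed form can be computed directly with NumPy's pseudo-inverse; a minimal sketch on invented data (`np.linalg.pinv` returns the Moore-Penrose pseudo-inverse, which coincides with $(A^T A)^{-1} A^T$ whenever $A^T A$ is invertible):

```python
import numpy as np

# Invented data for illustration.
rng = np.random.default_rng(0)
A = np.hstack([np.ones((5, 1)), rng.normal(size=(5, 2))])
Y = rng.normal(size=5)

# W* = A† Y via the Moore-Penrose pseudo-inverse.
W_star = np.linalg.pinv(A) @ Y

# lstsq solves the same problem and is the numerically preferred route.
W_lstsq, *_ = np.linalg.lstsq(A, Y, rcond=None)
assert np.allclose(W_star, W_lstsq)
```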
$W^* = (A^T A)^{-1} A^T Y$
• The only difference is that now the $i$-th row of the matrix $A$ would be
$[\phi_0(X_i)\;\; \phi_1(X_i)\; \cdots\; \phi_{d'}(X_i)]$
(see the sketch below).
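For concreteness, a sketch with an assumed polynomial basis $\phi_j(x) = x^j$ (the slides leave the basis functions $\phi_j$ generic); only the construction of $A$ changes, while the least-squares formula stays the same:

```python
import numpy as np

def design_matrix(x, d_prime):
    # Row i is [phi_0(x_i), phi_1(x_i), ..., phi_{d'}(x_i)] with phi_j(x) = x**j.
    return np.vander(x, N=d_prime + 1, increasing=True)

# Invented 1-D data for illustration.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=20)
y = np.sin(np.pi * x) + 0.1 * rng.normal(size=20)

A = design_matrix(x, d_prime=3)   # 20 x 4 matrix of basis-function values
W_star = np.linalg.pinv(A) @ y    # same solution formula as before
print(W_star)
```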