
Linear Least Squares Problem

Consider an equation for a stretched beam:

Y = x1 + x2·T

where x1 is the original length, T is the applied force, and x2 is the inverse coefficient of stiffness. Suppose that the following measurements were taken:

T    10     15     20
Y    11.60  11.85  12.25

corresponding to the overdetermined system:

11.60 = x1 + x2·10
11.85 = x1 + x2·15   - cannot be satisfied exactly
12.25 = x1 + x2·20
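For concreteness, here is a minimal Matlab sketch (the variable names are mine) that sets up these measurements and computes the fit; the backslash operator performs exactly the least squares minimization defined on the next slide:

T = [10; 15; 20];            % applied forces
Y = [11.60; 11.85; 12.25];   % measured lengths
A = [ones(3,1), T];          % columns multiply x1 and x2 in Y = x1 + x2*T
x = A \ Y                    % least squares fit: x is approximately [10.925; 0.065]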
Linear Least Squares Problem

Problem:

Given A (m x n) with m ≥ n and b (m x 1), find x (n x 1) to minimize ||Ax - b||2.

• If m > n, we have more equations than unknowns, and there is generally no x satisfying Ax = b exactly.
• Such a system is called overdetermined.



Linear Least Squares

There are three different algorithms for computing the least squares solution:

1. Normal equations (cheap, less accurate).
2. QR decomposition.
3. SVD (expensive, more reliable).

The first algorithm is the fastest and the least accurate of the three; the SVD is the slowest and the most accurate.



Normal Equations 1
Minimize the squared Euclidean norm of the residual vector:

\[ \min_x \|r\|_2^2 = \min_x \|b - Ax\|_2^2 \]

To minimize, we take the derivative with respect to x and set it to zero:

\[ \frac{\partial}{\partial x}\|b - Ax\|_2^2 = 2A^T A x - 2A^T b = 0 \]

which reduces to an (n x n) linear system, commonly known as the NORMAL EQUATIONS:

\[ A^T A\, x = A^T b \]


Normal Equations 2

min ||Ax - b||2 for the system

11.60 = x1 + x2·10
11.85 = x1 + x2·15
12.25 = x1 + x2·20

\[
\underbrace{\begin{bmatrix} 1 & 10 \\ 1 & 15 \\ 1 & 20 \end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}}_{x}
=
\underbrace{\begin{bmatrix} 11.60 \\ 11.85 \\ 12.25 \end{bmatrix}}_{b}
\]


Normal Equations 3
We must solve the system A^T·A·x = A^T·b for the following values:

\[
A = \begin{bmatrix} 1 & 10 \\ 1 & 15 \\ 1 & 20 \end{bmatrix}, \qquad
A^T A = \begin{bmatrix} 3 & 45 \\ 45 & 725 \end{bmatrix}, \qquad
b = \begin{bmatrix} 11.60 \\ 11.85 \\ 12.25 \end{bmatrix}
\]

\[
x = (A^T A)^{-1} A^T b = \begin{bmatrix} 10.925 \\ 0.065 \end{bmatrix}
\]

(A^T A)^{-1} A^T is called the pseudo-inverse of A.
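A sketch reproducing these numbers in Matlab; the explicit inverse is shown only to mirror the pseudo-inverse formula, since backslash is preferable numerically:

A = [1 10; 1 15; 1 20];
b = [11.60; 11.85; 12.25];
AtA = A' * A;                % = [3 45; 45 725]
Atb = A' * b;                % = [35.70; 538.75]
x = AtA \ Atb                % x = [10.925; 0.065]
x2 = inv(AtA) * A' * b;      % same result via the explicit pseudo-inverse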
QR factorization 1
• A matrix Q is said to be orthogonal if its columns are orthonormal, i.e. Q^T·Q = I.

• Orthogonal transformations preserve the Euclidean norm, since

\[ \|Qv\|_2^2 = (Qv)^T(Qv) = v^T Q^T Q\, v = v^T v = \|v\|_2^2 \]

• Orthogonal matrices can transform vectors in various ways, such as rotations or reflections, but they do not change the Euclidean length of the vector. Hence, they preserve the solution to a linear least squares problem.
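A quick numerical check of this property (a sketch; Q is just a random orthogonal matrix obtained from a QR factorization):

[Q, ~] = qr(randn(4));       % random 4x4 orthogonal matrix
v = randn(4, 1);
norm(Q * v) - norm(v)        % approximately 0, up to rounding error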
QR factorization 2
Any matrix A (m x n) can be represented as

A = Q·R,

where Q (m x n) has orthonormal columns and R (n x n) is upper triangular:

\[
\big[\, a_1 \,|\, \cdots \,|\, a_n \,\big]
=
\big[\, q_1 \,|\, \cdots \,|\, q_n \,\big]
\begin{bmatrix}
r_{11} & r_{12} & \cdots & r_{1n} \\
0      & r_{22} & \cdots & \vdots \\
\vdots &        & \ddots & \vdots \\
0      & 0      & \cdots & r_{nn}
\end{bmatrix}
\]



QR factorization 3
• Given A, let its QR decomposition be A = Q·R, where Q is an (m x n) orthonormal matrix and R is (n x n) upper triangular.
• The QR factorization transforms the linear least squares problem into a triangular one:

Q·R·x = b
R·x = Q^T·b
x = R^{-1}·Q^T·b
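Matlab code (a minimal sketch using the built-in economy-size qr; the backslash performs the triangular solve without forming R^{-1} explicitly):

[Q, R] = qr(A, 0);           % economy size: Q is m x n, R is n x n
x = R \ (Q' * b);            % back-substitution on the triangular system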



Singular Value Decomposition

• Normal equations and QR decomposition only work for full-rank matrices (i.e. rank(A) = n). If A is rank-deficient, there are infinitely many solutions to the least squares problem, and we can use algorithms based on the SVD.
• Given the SVD

A = U·Σ·V^T,

where U (m x m) and V (n x n) are orthogonal and Σ is an (m x n) diagonal matrix (of the singular values of A), the minimal-norm solution corresponds to

x = V·Σ^+·U^T·b,

where Σ^+ inverts the nonzero singular values and leaves the zero ones in place.


Singular Value Decomposition
Matlab Code:
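The snippet below is a minimal sketch; the first form assumes A has full column rank, while pinv also covers the rank-deficient case discussed on the previous slide:

[U, S, V] = svd(A, 0);       % economy-size SVD
x = V * (S \ (U' * b));      % scale by 1/sigma_i in the singular basis
x2 = pinv(A) * b;            % equivalent, and robust to rank deficiency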



Linear algebra review - SVD
Fact: for every A (m x n) with rank(A) = p, there exist U (m x p), V (n x p), and Σ (p x p) such that

U^T·U = I,   V^T·V = I,   Σ = diag(σ_1, σ_2, ..., σ_p),  σ_i > 0,

\[ A = U \Sigma V^T = \sum_{i=1}^{p} \sigma_i\, u_i v_i^T \]

It follows that

A^T·A = V·Σ^2·V^T
A·A^T = U·Σ^2·U^T
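Both identities are easy to confirm numerically; a sketch with a random matrix:

A = randn(5, 3);
[U, S, V] = svd(A, 0);       % economy SVD: U is 5x3, S and V are 3x3
norm(A' * A - V * S^2 * V')  % approximately 0
norm(A * A' - U * S^2 * U')  % approximately 0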
Approximation by a low-rank matrix
Fact II: let A (m x n) have rank p, and let

\[ A = U \Sigma V^T = \sum_{i=1}^{p} \sigma_i\, u_i v_i^T, \qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_p \]

be the SVD of A. Then

\[ \tilde{A} = \sum_{i=1}^{r} \sigma_i\, u_i v_i^T = U \tilde{\Sigma} V^T, \qquad \tilde{\Sigma} = \mathrm{diag}(\sigma_1, \sigma_2, ..., \sigma_r), \quad r < p, \]

is the best rank-r approximation to A in the 2-norm:

\[ \min_{X:\, \mathrm{rank}(X) \le r} \|A - X\|_2 = \|A - \tilde{A}\|_2 = \|U(\Sigma - \tilde{\Sigma})V^T\|_2 = \sigma_{r+1} \]

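A sketch verifying Fact II on a random matrix: truncating the SVD after r terms leaves a 2-norm error of exactly σ_{r+1}:

A = randn(6, 4);
[U, S, V] = svd(A);
r = 2;
Ar = U(:, 1:r) * S(1:r, 1:r) * V(:, 1:r)';   % rank-r truncation
norm(A - Ar) - S(r+1, r+1)                   % approximately 0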


Geometric Interpretation of the SVD

b = A_{(m x n)}·x

The image of the unit sphere under any m x n matrix is a hyperellipse.

[Figure: the unit circle, with radius vectors v1 and v2, is mapped to an ellipse with principal semiaxes σ·v1 and σ·v2.]



Left and Right singular vectors
We can define the properties of A in terms of the shape of AS, the image of the unit sphere S:

AS = A_{(m x n)}·S

[Figure: the unit sphere S, with radius vectors v1 and v2, is mapped to the hyperellipse AS with principal semiaxes σ·u1 and σ·u2.]

The singular values of A are the lengths of the principal semiaxes of AS, usually written in non-increasing order σ1 ≥ σ2 ≥ ... ≥ σn.

The n left singular vectors of A are the unit vectors {u1, ..., un} oriented in the directions of the principal semiaxes of AS, numbered in correspondence with {σi}.

The n right singular vectors of A are the unit vectors {v1, ..., vn} of S, which are the preimages of the principal semiaxes of AS: A·vi = σi·ui.
Singular Value Decomposition
A·vi = σi·ui,  1 ≤ i ≤ n. In matrix form:

\[
A\, \big[\, v_1 \,|\, v_2 \,|\, \cdots \,|\, v_n \,\big]_{(n,n)}
=
\big[\, u_1 \,|\, u_2 \,|\, \cdots \,|\, u_n \,\big]_{(m,n)}
\begin{bmatrix}
\sigma_1 & & \\
& \ddots & \\
& & \sigma_n
\end{bmatrix}_{(n,n)}
\]

A·V = U·Σ, hence

A = U·Σ·V*  - the singular value decomposition.

The matrices U and V are orthogonal and Σ is diagonal.
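A numerical check of A·V = U·Σ (a sketch using Matlab's economy-size svd):

A = randn(5, 3);
[U, S, V] = svd(A, 0);
norm(A * V - U * S)          % approximately 0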



Matrices in the Diagonal Form
Every matrix is diagonal in an appropriate basis:

b = A_{(m x n)}·x

Any vector b (m x 1) can be expanded in the basis of left singular vectors of A, {ui};
any vector x (n x 1) can be expanded in the basis of right singular vectors of A, {vi}.
Their coordinates in these new expansions are:

b' = U*·b,   x' = V*·x

Then the relation b = A·x can be expressed in terms of b' and x':

b = A·x  ⟹  U*·b = U*·A·x = U*·U·Σ·V*·x  ⟹  b' = Σ·x'
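A sketch confirming that the relation becomes diagonal in the singular bases:

A = randn(4); x = randn(4, 1);
b = A * x;
[U, S, V] = svd(A);
bp = U' * b;                 % b': coordinates in the left singular basis
xp = V' * x;                 % x': coordinates in the right singular basis
norm(bp - S * xp)            % approximately 0, i.e. b' = Sigma * x'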
Rank of A
Let p = min{m, n}, and let r ≤ p denote the number of nonzero singular values of A. Then the rank of A equals r, the number of nonzero singular values.
Proof:
The rank of a diagonal matrix equals the number of its nonzero entries, and in the decomposition A = U·Σ·V*, U and V are of full rank.
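A sketch of this rank computation; the tolerance mirrors what Matlab's rank uses internally:

A = randn(5, 2) * randn(2, 4);         % a 5x4 matrix of rank 2
s = svd(A);
tol = max(size(A)) * eps(s(1));        % threshold for "nonzero"
r = nnz(s > tol)                       % returns 2, matching rank(A)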
Determinant of A
For A (m x m):

\[ |\det(A)| = \prod_{i=1}^{m} \sigma_i \]

Proof:
The determinant of a product of square matrices is the product of their determinants. The determinant of a unitary matrix has absolute value 1, since U*·U = I. Therefore,

\[ |\det(A)| = |\det(U \Sigma V^*)| = |\det(U)|\,|\det(\Sigma)|\,|\det(V^*)| = |\det(\Sigma)| = \prod_{i=1}^{m} \sigma_i \]

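A one-line check of this identity on a random square matrix:

A = randn(4);
abs(det(A)) - prod(svd(A))   % approximately 0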


A in terms of singular vectors

Any A (m x n) of rank r can be written as a sum of r rank-one matrices:

\[ A = \sum_{j=1}^{r} \sigma_j\, u_j v_j^* \qquad (1) \]

Proof:
If we write Σ as a sum of Σj, where Σj = diag(0, ..., σj, ..., 0), then (1) follows from

\[ A = U \Sigma V^* \qquad (2) \]

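A sketch rebuilding A from its rank-one terms, term by term:

A = randn(4, 3);
[U, S, V] = svd(A, 0);
B = zeros(size(A));
for j = 1:size(S, 1)
    B = B + S(j, j) * U(:, j) * V(:, j)';   % add sigma_j * u_j * v_j'
end
norm(A - B)                  % approximately 0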


Norm of the matrix
The L2 norm of a vector is defined as:

\[ \|x\|_2 = \sqrt{x^T x} = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2} \qquad (1) \]

The L2 norm of a matrix is defined as:

\[ \|A\|_2 = \sup_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2} \]

Therefore

\[ \|A\|_2 = \max_i(\sigma_i) \]

since A^T·A·vi = σi^2·vi, i.e. the squared singular values σi^2 are the eigenvalues λi of A^T·A.
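A quick check that the matrix 2-norm equals the largest singular value:

A = randn(5, 3);
norm(A) - max(svd(A))        % approximately 0 (norm(A) computes the 2-norm)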
Matrix Approximation in SVD basis

For any ν with 0 ≤ ν ≤ r, define

\[ A_\nu = \sum_{j=1}^{\nu} \sigma_j\, u_j v_j^* \qquad (1) \]

If ν = p = min{m, n}, define σ_{ν+1} = 0. Then

\[ \|A - A_\nu\|_2 = \inf_{B \in \mathbb{C}^{m \times n},\ \mathrm{rank}(B) \le \nu} \|A - B\|_2 = \sigma_{\nu+1} \]
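A sketch of the "inf over all rank-ν B" claim: an arbitrary rank-ν matrix does at least as badly as σ_{ν+1}:

A = randn(6, 4);
s = svd(A);
nu = 2;
B = randn(6, nu) * randn(nu, 4);   % an arbitrary rank-nu matrix
norm(A - B) >= s(nu + 1)           % always returns logical 1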
