MATH 685/ CSI 700/ OR 682 Lecture Notes: Least Squares
Data fitting
Example
Existence/Uniqueness
Normal Equations
Orthogonality
Orthogonal Projector
Pseudoinverse
Example
Shortcomings
Orthogonal Transformations
QR Factorization
Orthogonal Bases
Computing QR factorization
To compute the QR factorization of an m × n matrix A, with m > n, we annihilate the subdiagonal entries of successive columns of A, eventually reaching upper triangular form. This is similar to LU factorization by Gaussian elimination, but uses orthogonal transformations instead of elementary elimination matrices.
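A minimal numerical sketch of the end result, assuming NumPy is available: the computed Q has orthonormal columns, R is upper triangular, and A = QR.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 3))          # m x n with m > n

    Q, R = np.linalg.qr(A, mode="complete")  # Q is 5 x 5 orthogonal, R is 5 x 3 upper triangular

    print(np.allclose(Q @ R, A))             # True: A = Q R
    print(np.allclose(Q.T @ Q, np.eye(5)))   # True: Q is orthogonal
    print(np.allclose(np.tril(R, -1), 0))    # True: subdiagonal entries of R are zero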
Householder Transformation
Example
Householder QR factorization
For solving the linear least squares problem, the product Q of the Householder transformations need not be formed explicitly.
The Householder vectors v can be stored in the (now zero) lower triangular portion of A (almost).
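A minimal sketch of these two points, assuming NumPy; the function name householder_ls is illustrative. Each Householder reflection is applied to b as soon as it is generated, so Q is never formed, and each reflection is defined by its vector v alone (which could equally well be stored in the zeroed-out lower triangle of A).

    import numpy as np

    def householder_ls(A, b):
        """Solve min ||Ax - b|| via Householder QR without forming Q explicitly."""
        A = A.astype(float)
        b = b.astype(float)
        n = A.shape[1]
        for k in range(n):
            x = A[k:, k]
            v = x.copy()
            v[0] += np.copysign(np.linalg.norm(x), x[0])    # Householder vector for column k
            beta = v @ v
            if beta == 0:
                continue                                    # column already zero below diagonal
            # Apply H = I - 2 v v^T / (v^T v) to the remaining columns of A and to b
            A[k:, k:] -= np.outer(v, 2 * (v @ A[k:, k:]) / beta)
            b[k:] -= (2 * (v @ b[k:]) / beta) * v
        # Back-substitution with the n x n upper triangular factor R
        return np.linalg.solve(np.triu(A[:n, :n]), b[:n])

On a full-rank problem the result should agree with np.linalg.lstsq(A, b, rcond=None)[0].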
Example
Givens Rotations
Example
Givens QR factorization
A straightforward implementation of the Givens method requires about 50% more work than the Householder method, and also more storage, since each rotation requires two numbers, c and s, to define it.
Givens rotations can be advantageous for computing the QR factorization when many entries of the matrix are already zero, since those annihilations can then be skipped.
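A minimal sketch, assuming NumPy, of how a single Givens rotation is built from the two numbers c and s and used to annihilate one entry; the helper name givens is illustrative.

    import numpy as np

    def givens(a, b):
        """Return c, s with [[c, s], [-s, c]] @ [a, b] = [r, 0]."""
        if b == 0:
            return 1.0, 0.0
        r = np.hypot(a, b)            # sqrt(a^2 + b^2), computed without overflow
        return a / r, b / r

    A = np.array([[4.0, 1.0],
                  [3.0, 2.0],
                  [2.0, 5.0]])
    c, s = givens(A[1, 0], A[2, 0])   # rotation acting on rows 1 and 2
    G = np.array([[c, s], [-s, c]])
    A[[1, 2], :] = G @ A[[1, 2], :]   # annihilates the (2, 0) entry
    print(A)

Each rotation touches only two rows at a time, which is why a full QR factorization built this way does more arithmetic than the Householder approach but can exploit existing zeros.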
Gram-Schmidt orthogonalization
Gram-Schmidt algorithm
Modified Gram-Schmidt
Rank Deficiency
If rank(A) < n, then the QR factorization still exists, but it yields a singular upper triangular factor R, and multiple vectors x give the minimum residual norm.
A least squares solution in this case can be computed by QR factorization with column pivoting or by the singular value decomposition (SVD).
The rank of a matrix is often not clear cut in practice, so a relative tolerance is used to determine the numerical rank.
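A minimal sketch, assuming NumPy, of rank determination with a relative tolerance on the singular values, for a matrix whose third column is the sum of the first two; the tolerance shown is one common convention, not the only choice.

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])   # third column = first + second, so rank(A) = 2
    b = np.array([1.0, 2.0, 0.0, 1.0])

    s = np.linalg.svd(A, compute_uv=False)
    tol = max(A.shape) * np.finfo(float).eps * s[0]   # relative tolerance
    rank = int(np.sum(s > tol))
    print(rank)                                       # 2, not 3

    # Minimum-norm least squares solution: small singular values are treated as zero
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(x)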
Example: SVD
Applications of SVD
Pseudoinverse
Orthogonal Bases
Ordinary least squares is applicable when the right-hand side b is subject to random error but the matrix A is known accurately. When all data, including A, are subject to error, then total least squares is more appropriate. Total least squares minimizes orthogonal distances, rather than vertical distances, between the model and the data.
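A minimal sketch of the standard SVD construction for total least squares, assuming NumPy; it relies on the smallest singular value of the augmented matrix [A b] being simple and on the last component of the corresponding right singular vector being nonzero.

    import numpy as np

    def total_least_squares(A, b):
        """TLS solution from the SVD of the augmented matrix [A  b]."""
        n = A.shape[1]
        C = np.column_stack([A, b])
        _, _, Vt = np.linalg.svd(C)
        v = Vt[-1]                  # right singular vector for the smallest singular value
        return -v[:n] / v[n]        # assumes v[n] != 0

    rng = np.random.default_rng(1)
    A_exact = rng.standard_normal((20, 2))
    x_true = np.array([1.0, -2.0])
    A = A_exact + 0.05 * rng.standard_normal(A_exact.shape)   # error in A as well as in b
    b = A_exact @ x_true + 0.05 * rng.standard_normal(20)

    print(total_least_squares(A, b))                # close to x_true
    print(np.linalg.lstsq(A, b, rcond=None)[0])     # ordinary least squares, for comparison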
Comparison of Methods