U5 MatrixInverseGaussElim Notes F21
Equations
Gaussian Elimination and the LU Decomposition
Dr. M. Bikdash
We cannot divide two matrices as B/A when A is a matrix; instead we must use
BA^{-1}, where A^{-1} is the inverse of A (if it exists).
The matrix inverse is a complicated concept because A can behave like a zero
even when it is not.
Understanding the existence and uniqueness of the inverse is crucial, and it
can be tested using many major concepts in matrix theory,
such as the determinant, rank, linear independence, eigenvalues, singular
values, etc.
The matrix inverse is a useful mathematical notation but it is not useful
from a computational point of view.
Instead, use Gaussian elimination to solve the system Ax = b directly.
The inverse of a 2x2 matrix can be computed easily with this trick:
https://www.khanacademy.org/math/precalculus/x9e81a4f98389efdf:matrices/x9e81a4f98389efdf:finding-inverse-matrix-with-determinant/v/inverse-of-a-2x2-matrix
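The 2x2 shortcut (swap the diagonal entries, negate the off-diagonal ones, and divide by the determinant ad - bc) is easy to check numerically. A minimal pure-Python sketch (the notes' own code is MATLAB; the helper name inv2x2 is ours, not from the notes):

```python
# Hypothetical helper illustrating the 2x2 shortcut: swap the diagonal,
# negate the off-diagonal, and divide by the determinant ad - bc.
def inv2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]], assuming det = a*d - b*c is nonzero."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det],
            [-c / det, a / det]]

# Example: [[2, 1], [1, 1]] has det 1, so its inverse is [[1, -1], [-1, 2]].
print(inv2x2(2, 1, 1, 1))
```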
Wikipedia has an extensive article on the matrix inverse
https://en.wikipedia.org/wiki/Invertible_matrix#Blockwise_inversion
In particular, read this mega-theorem which relates the inverse to almost
every other concept in matrix theory and linear algebra:
https://en.wikipedia.org/wiki/Invertible_matrix#The_invertible_matrix_theorem
Very quick and useful properties are here:
https://en.wikipedia.org/wiki/Invertible_matrix#Other_properties
1 The plural of FLOP must be FLOPs, which is often confused with FLOPS (Floating-Point Operations Per Second).
Definition
A square n x n matrix A has an n x n inverse B if
AB = BA = I.
\begin{bmatrix} C & 0 \end{bmatrix} \begin{bmatrix} D \\ 0 \end{bmatrix} = CD + 0 \cdot 0 = I   Identity matrix. Good!

\begin{bmatrix} D \\ 0 \end{bmatrix} \begin{bmatrix} C & 0 \end{bmatrix} = \begin{bmatrix} DC & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}   not an identity matrix!
(c) The proof is messy but simple. It requires mastering the Einstein Index
Notation (EIN).
Let B = A^T, meaning that b_{jk} = a_{kj}.
Define Y = B^{-1}. We need to show that Y = (A^{-1})^T, or equivalently that
Z := Y^T satisfies AZ = ZA = I. Checking this in EIN,
therefore A^{-1} = Z, which is Y^T.
(d) follows from (c) immediately.
(e) can be shown using a development like that of (c). Left as an exercise!
\underbrace{\begin{bmatrix} 1 & 1 \end{bmatrix}}_{\text{nonzero } E} \underbrace{\begin{bmatrix} 1 \\ -1 \end{bmatrix}}_{\text{nonzero } F} = \underbrace{0}_{\text{zero matrix}} \qquad (1)
Fx = \begin{bmatrix} 1 \\ 1 \end{bmatrix} x = \begin{bmatrix} 2 \\ 2.1 \end{bmatrix} \quad \text{with } x \in \mathbb{R} \qquad (6)
Rule of Thumb
It is safe to premultiply (or postmultiply) by an invertible matrix.
Rule of Thumb
Premultiplying or postmultiplying by a singular matrix is dangerous!!
is true, but

\underbrace{\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}}_{\text{singular}} X = \underbrace{\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}}_{\text{singular}} Y \quad \text{(the two products are the same)} \quad \text{does not imply that} \quad X = Y
Diagonal Matrices Just invert every diagonal element (assuming no zeros on the
diagonal)
Triangular Matrices
\begin{bmatrix} 2 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 3 & 0.5 \end{bmatrix}^{-1} = \begin{bmatrix} 1/2 & 0 & 0 \\ -1/2 & 1 & 0 \\ 2 & -6 & 2 \end{bmatrix}

Symmetric Matrices
\begin{bmatrix} 2 & 0 & 1 \\ 0 & 2 & 3 \\ 1 & 3 & 0.5 \end{bmatrix}^{-1} = \begin{bmatrix} 4/9 & -1/6 & 1/9 \\ -1/6 & 0 & 1/3 \\ 1/9 & 1/3 & -2/9 \end{bmatrix}.

Block Diagonal Matrices
\begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}^{-1} = \begin{bmatrix} A_{11}^{-1} & 0 \\ 0 & A_{22}^{-1} \end{bmatrix}.

Block Triangular Matrices
\begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix}^{-1} = \begin{bmatrix} A_{11}^{-1} & -A_{11}^{-1} A_{12} A_{22}^{-1} \\ 0 & A_{22}^{-1} \end{bmatrix}

Orthogonal Matrices Q^{-1} = Q^T by definition
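These special-case inverses are easy to verify by multiplying back. A pure-Python check of the triangular example above (the notes use MATLAB elsewhere; plain Python is used here only to keep the snippet self-contained):

```python
def matmul(A, B):
    """Naive matrix product of two lists-of-rows."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

L = [[2, 0, 0], [1, 1, 0], [1, 3, 0.5]]
Linv = [[0.5, 0, 0], [-0.5, 1, 0], [2, -6, 2]]
print(matmul(L, Linv))  # the 3x3 identity
```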
Add a multiple of a row to another To add α times the j-th row to the k-th row
with k > j, premultiply by E(j, k; α). Its inverse is E(j, k; α)^{-1} = E(j, k; -α):

E(2, 3; α) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & α & 1 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -α & 1 \end{bmatrix} = E(2, 3; α)^{-1} = E(2, 3; -α)
Annihilate elements below a diagonal element To zero all elements below the
(j, j) diagonal element, use E(j, :; v) where v_k = 0 for k ≤ j:

E(2, :; v) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & α & 1 & 0 \\ 0 & β & 0 & 1 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & -α & 1 & 0 \\ 0 & -β & 0 & 1 \end{bmatrix} = E(2, :; v)^{-1} = E(2, :; -v)
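The claim that E(j,:;v)^{-1} = E(j,:;-v) can be verified numerically. A pure-Python sketch with hypothetical multipliers α and β (0-based indices here, so E(2,:;v) in the notes' 1-based notation uses column 1):

```python
def identity(n):
    return [[float(i == j) for j in range(n)] for i in range(n)]

def elem(n, j, mults):
    """E(j,:;v): identity plus multipliers mults[k] in column j, rows k > j
    (0-based). Premultiplying by it adds mults[k]*row_j to row k."""
    E = identity(n)
    for k, m in mults.items():
        E[k][j] = m
    return E

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

alpha, beta = 0.25, -3.0                   # hypothetical multipliers
E = elem(4, 1, {2: alpha, 3: beta})        # E(2,:;v) in the notes' notation
Einv = elem(4, 1, {2: -alpha, 3: -beta})   # E(2,:;-v)
print(matmul(E, Einv))  # the 4x4 identity
```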
If A has an inverse A^{-1}, then
A^{-1}A is the identity I, and
AA^{-1} is the identity I, and
x = A^{-1}b is the exact solution of Ax = b.
It is defined as the matrix satisfying the first 4 properties and can be denoted
A^{\{1,2,3,4\}}.
It is what MATLAB™ returns using the pinv command. Typically computed
using the SVD.
It exists and is unique for any matrix A, and is in general the most reliably
computable pseudo-inverse.
Drazin pseudo-inverse Denoted A^# or A^D,
A^\dagger = A^T (A A^T)^{-1}
The existence of either inverse is not guaranteed, and if either exists, it is not
unique.
However, a matrix A that has both a left inverse and a right inverse is invertible.
Note that right and left inverses are de…ned for general, not necessarily square,
matrices.
In general, the right or left inverses, when they exist, have the dimensions of
the transpose.
Ax = b \implies A^L A x = A^L b \implies x = A^L b \qquad (7)
Example. Two people measured the length of a chair x in feet, and found
two contradictory measurements, 2 and 2.1.
Find an approximate solution to the resulting overdetermined system

x = 2 and x = 2.1, or Ax = b with A = \begin{bmatrix} 1 \\ 1 \end{bmatrix} and b = \begin{bmatrix} 2.0 \\ 2.1 \end{bmatrix} \qquad (8)

With A^L = \begin{bmatrix} 1/2 & 1/2 \end{bmatrix}: \quad A^L A x = x \approx A^L b = \begin{bmatrix} 1/2 & 1/2 \end{bmatrix} \begin{bmatrix} 2.0 \\ 2.1 \end{bmatrix} = \frac{2 + 2.1}{2} = 2.05 \qquad (9)
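The averaging effect of the left inverse can be confirmed directly. A small pure-Python check (numbers taken from the chair example; the notes' own code is MATLAB):

```python
# Chair example: A = [1; 1], b = [2.0; 2.1].  A^L = [1/2 1/2] is a left
# inverse (A^L A = 1), and A^L b averages the two measurements.
A = [[1.0], [1.0]]
AL = [[0.5, 0.5]]
b = [2.0, 2.1]

ALA = sum(AL[0][i] * A[i][0] for i in range(2))  # should be 1
x = sum(AL[0][i] * b[i] for i in range(2))       # the averaged estimate
print(ALA, x)
```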
A^R = \begin{bmatrix} \frac{1}{3 \cdot 1.05^2} & \frac{1}{3 \cdot 1.05} & \frac{1}{3} \end{bmatrix}^T \quad \text{or} \quad A^R = \begin{bmatrix} \frac{1}{4 \cdot 1.05^2} & \frac{1}{4 \cdot 1.05} & \frac{1}{2} \end{bmatrix}^T

because in both cases AA^R = 1. If x = A^R ($3K), then the two possible solutions are
x ≈ [$907.03, $952.38, $1000]^T and x ≈ [$680.27, $714.29, $1500]^T.
But the two solutions require different total deposits ($2859 and $2894 respectively).
The solution that minimizes \|x\|_2 is x = A^T (A A^T)^{-1} ($3K).
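Both right inverses, and the two different totals, can be checked numerically. A pure-Python sketch; the row A = [1.05^2, 1.05, 1] is an assumption inferred from the interest-rate context, but it is consistent with AA^R = 1 and with the stated totals:

```python
# Two of the many right inverses of A = [1.05^2, 1.05, 1] (deposits growing
# at an assumed 5% per year), and the total deposit each x = A^R * 3000 needs.
r = 1.05
A = [r**2, r, 1.0]
AR1 = [1 / (3 * r**2), 1 / (3 * r), 1 / 3]
AR2 = [1 / (4 * r**2), 1 / (4 * r), 1 / 2]

totals = []
for AR in (AR1, AR2):
    AAR = sum(a * g for a, g in zip(A, AR))  # should be 1 in both cases
    x = [g * 3000 for g in AR]               # one particular solution
    totals.append(round(sum(x), 2))          # total deposited
print(totals)
```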
Nonzero divisors of a zero matrix ⟹ there are also nonzero roots of the zero
matrix.
Definition. An n x n matrix M that satisfies M^h = 0 and
M^{h-1} ≠ 0, h ≤ n, is called a nilpotent matrix of index h.
If M^3 = 0 but M^2 ≠ 0, then M is a cubic root of the zero matrix.
These matrices are crucial to understanding the eigenvalue problem and
singular systems of differential equations.
Theorem. A k-upper triangular matrix A is nilpotent of index n + 1 - k.
Proof (sketch). Every time we premultiply by A we lose one superdiagonal:

A = \begin{bmatrix} 0 & a & b \\ 0 & 0 & c \\ 0 & 0 & 0 \end{bmatrix} \implies A^2 = \begin{bmatrix} 0 & 0 & ac \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \implies A^3 = 0. \qquad (13)
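The proof sketch can be confirmed numerically; a pure-Python check of (13) with arbitrary values for a, b, c:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# The strictly upper-triangular matrix from (13), with arbitrary a, b, c:
a, b, c = 2, 5, 7
A = [[0, a, b], [0, 0, c], [0, 0, 0]]
A2 = matmul(A, A)
A3 = matmul(A2, A)
print(A2)  # only the (1,3) entry a*c survives
print(A3)  # the zero matrix: A is nilpotent of index 3
```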
\begin{bmatrix} 2 & 0 & 0 \\ 4 & 3 & 0 \\ 2 & 2 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1/2 & 0 & 0 \\ -2/3 & 1/3 & 0 \\ 1/3 & -2/3 & 1 \end{bmatrix}

The inverse of a lower triangular matrix is (quasi) lower triangular. Why??
The operations have the form:

R1 ↔ R_{i1} (if needed)
R2 ← R2 - A(2,1)/A(1,1) * R1
R3 ← R3 - A(3,1)/A(1,1) * R1
...
R2 ↔ R_{i2} (if needed)
R3 ← R3 - A(3,2)/A(2,2) * R2
R4 ← R4 - A(4,2)/A(2,2) * R2
...
R3 ↔ R_{i3} (if needed)
R4 ← R4 - A(4,3)/A(3,3) * R3
...

When working with pivot A(k,k) and rows j > k:
A(k:end, 1:k-1) are already zero.
Rk really denotes A(k, k+1:end) and Rj really denotes A(j, k+1:end).
The multiplier -A(j,k)/A(k,k) can be stored in A(j,k), which is
supposedly being zeroed.
Need to store the swaps: start with p = [1, 2, ..., n] and swap as
needed: p(1) ↔ p(i1), ...
Algorithm
Naive pseudo-Code for Gauss Elimination
(n-1)^2 + (n-2)^2 + \cdots + 1^2 = \frac{(2n-1)\,n\,(n-1)}{6} = \frac{n^3}{3} - \frac{n^2}{2} + \frac{n}{6} \ \text{MAs}

Reduction to Upper-Triangular Form requires O(2n^3/3) FLOPs.
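The naive elimination pass itself can be sketched as follows (a pure-Python sketch rather than the notes' MATLAB, with no pivoting, assuming all pivots are nonzero):

```python
# A minimal sketch of naive Gaussian elimination (no pivoting), reducing A to
# upper-triangular form while applying the same row operations to b.
def gauss_eliminate(A, b):
    """In-place forward elimination; assumes all pivots A[k][k] are nonzero."""
    n = len(A)
    for k in range(n - 1):              # current pivot A[k][k]
        for j in range(k + 1, n):       # rows below the pivot
            m = A[j][k] / A[k][k]       # the multiplier A(j,k)/A(k,k)
            A[j][k] = 0.0               # annihilated by construction
            for c in range(k + 1, n):
                A[j][c] -= m * A[k][c]
            b[j] -= m * b[k]
    return A, b

A = [[2.0, 1.0], [4.0, 3.0]]
b = [3.0, 7.0]
print(gauss_eliminate(A, b))  # an upper-triangular system
```

Back substitution (next) then recovers x from the triangular system.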
function x = backsubstitution(A, b)
[n,m] = size(A) ; % n rows, m columns
% ... diagnostics skipped ...
x = zeros(n,1) ; % column vector, so the inner product below works
x(n) = b(n) / A(n,n) ; % start at the bottom
for k = ( n-1 : -1 : 1) % loop over the rows, bottom up
c = 1 / A(k,k) ;
x(k) = ( b(k) - A(k,k+1:n)*x(k+1:n) ) * c ;
% multiply by c because multiplication is more efficient than division
end
Note that the MATLAB command \ (backslash) works when the matrix A is not square.
For instance, if A = [1 1 1 1; 2 1 3 2] ;
representing 2 equations in 4 unknowns with b = [-1; 2] ;
then A\b sometimes returns the "least squares solution"
(which will be explained later) [0; -2.5; 1.5; 0].
Beware of Many Pitfalls!!
We scale the rows of A so that the elements in the first column are all ones:

\underbrace{\begin{bmatrix} 10 & 0 & 0 \\ 0 & 2/10 & 0 \\ 0 & 0 & 1/10 \end{bmatrix}}_{S_1(10)\,S_2(0.2)\,S_3(0.1)} \underbrace{\begin{bmatrix} 0.1 & 2 & 200 \\ 5 & 9 & 12 \\ 10 & 90 & 3000 \end{bmatrix}}_{A} = \begin{bmatrix} 1 & 20 & 2000 \\ 1 & 1.8 & 2.4 \\ 1 & 9 & 300 \end{bmatrix} = \bar{A}
Recall that the pivot's row will have to be subtracted from the other
rows, thus "disturbing" them.
It is obvious that the first row is a poor choice, because the element
ā13 = 2000 will overpower ā23 = 2.4, thus endangering the accuracy of
the solution.
The best choice for a pivot row is the second row, as it will disturb the
other rows the least.
The third row is a better choice than the first but not the best, because
ā33 = 300 still overpowers ā23 = 2.4.
Partial Pivoting Row exchanges are allowed but column exchanges are not allowed
Full or Complete Pivoting Both row and column exchanges are allowed
No Pivoting Exchanges are not allowed (typically to preserve sparsity patterns)
Partial Pivoting: Typically one uses as a pivot the entry with the largest
absolute value relative to its row.
For the k-th pivot, compare the ratios |Ā(j,k)| / ‖Ā(j, k:n)‖ and find the
maximizing j.
The simplest norm to compute in this context, and one that is quite satisfactory,
is the 1-norm ‖x‖₁ = ∑|xᵢ|, which requires O(n) additions.
To find the best pivot at the k-th iteration one must therefore compute
O(n-k) ratios.
The overall cost of partial pivoting is O(n²/2) divisions, O(n²/2)
comparisons, and O(n³/3) additions.
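The scaled pivot choice described above can be sketched in pure Python (the function name best_pivot_row is ours, not from the notes). Applied to the matrix Ā from the scaling example, it picks the second row, as argued earlier:

```python
# Scaled partial pivoting: for pivot column k, pick the row j >= k that
# maximizes |A[j][k]| / ||A[j][k:]||_1  (0-based indices).
def best_pivot_row(A, k):
    n = len(A)
    best_j, best_ratio = k, -1.0
    for j in range(k, n):
        norm1 = sum(abs(x) for x in A[j][k:])  # O(n - k) additions
        if norm1 == 0:
            continue                            # row of zeros: no pivot here
        ratio = abs(A[j][k]) / norm1
        if ratio > best_ratio:
            best_j, best_ratio = j, ratio
    return best_j

Abar = [[1, 20, 2000], [1, 1.8, 2.4], [1, 9, 300]]
print(best_pivot_row(Abar, 0))  # row index 1 (the second row)
```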
For best accuracy, full pivoting can be used, in which the element with the
largest absolute value in the submatrix Acurrent(k:end,k:end) is chosen
as the new k th pivot.
If the largest element in each row (w.r.t. its row) is located first, these
row maxima can then be compared; the largest among them is the largest
element in the submatrix.
Mathematically, the k-th full pivoting operation can be written as Pₖ A Qₖ,
where Pₖ and Qₖ are permutation matrices.
Full pivoting is slightly more difficult to program.
This results in the full-pivoting LU decomposition.
To avoid the n² multiplications above, simply recall that all elements are
stored in binary format: sign × binary_mantissa × 2^exponent.
Find the largest exponent in the row, and subtract it from the exponent of
every element.
If the largest exponent is 5, the row is efficiently and effectively divided by
2⁵ = 32. No multiplications are necessary.
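The exponent trick can be sketched with Python's math.frexp/math.ldexp, which expose exactly the sign-mantissa-exponent representation described above:

```python
import math

# Scale a row by a power of two: find the largest binary exponent in the row
# and divide every element by 2**e. Only exponents change, so the operation
# is exact in floating point and needs no multiplications conceptually.
def scale_row_pow2(row):
    e = max(math.frexp(x)[1] for x in row if x != 0)  # largest exponent
    return [math.ldexp(x, -e) for x in row]           # multiply by 2**-e

row = [0.1, 2.0, 200.0]
scaled = scale_row_pow2(row)
print(scaled)  # every element now has magnitude <= 1
```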
If the desired row and/or column permutations are known in advance, they can
all be collected in two permutation matrices P and Q and applied to A at the outset.
There are