
Matrix Inversion and Solution of General Linear Equations
Gaussian Elimination and the LU Decomposition

Dr. M. Bikdash

Department of Computational Science and Engineering


North Carolina A&T State University

September 21, 2021



By the end of this Unit, students will be able to:
Part 1: Theory
Define the matrix inverse and state its basic properties.
Understand the limitations of the matrix inverse (in existence and uniqueness).
Understand and avoid pitfalls: cancellations, divisors of zero, multiplying by a singular matrix, and computing the inverse explicitly.
Inverses of matrices with special shapes.
Pseudo-inverses.
Part 2: Computation using Gaussian Elimination
Become familiar with Gaussian Elimination.
Perform it manually, in MATLAB, and code it.
Keep track of operations and count their numbers.
Pivoting and scaling.
Apply it to several right-hand sides (RHS).
Place the parentheses correctly!! Which ordering of the operations is best?
Part 3: Introduction to the PLU
Code and uses of the PLU decomposition.



Introduction

We cannot divide two matrices as B/A when A is a matrix; we must instead use BA^{-1}, where A^{-1} is the inverse of A (if it exists).
The matrix inverse is a complicated concept because A can behave like a zero even when it is not zero.
Understanding the existence and uniqueness of the inverse is crucial, and it can be tested using many major concepts in matrix theory,
like: determinant, rank, linear independence, eigenvalues, singular values, etc.
The matrix inverse is a useful mathematical notation, but it is not useful from a computational point of view.
Instead, use Gaussian elimination to solve the system Ax = b directly.

Basic Idea: Manipulate the matrix zero pattern until
you reach an upper triangular matrix or better.



Additional Resources

The inverse of a 2x2 matrix can be computed easily with this trick:
https://www.khanacademy.org/math/precalculus/x9e81a4f98389efdf:matrices/x9e81a4f98389efdf:finding-inverse-matrix-with-determinant/v/inverse-of-a-2x2-matrix
Wikipedia has an extensive article on the matrix inverse:
https://en.wikipedia.org/wiki/Invertible_matrix#Blockwise_inversion
In particular, read this mega-theorem, which relates the inverse to almost every other concept in matrix theory and linear algebra:
https://en.wikipedia.org/wiki/Invertible_matrix#The_invertible_matrix_theorem
Very quick and useful properties are here:
https://en.wikipedia.org/wiki/Invertible_matrix#Other_properties



Overview of Solving Equations and Matrix Inversion

In the scalar case: the solution of ax = b is x = b/a, or even better x = a^{-1}b. The inverse a^{-1} exists if a ≠ 0.
In the square matrix case: the solution of Ax = b can be written x = A^{-1}b if the inverse A^{-1} exists. The inverse of A need not exist even if A ≠ 0.

This leads to some puzzling facts of matrix algebra.

Typically, computing the inverse is wasteful and should be avoided.
Essentially, we solve Ax = b directly using Gaussian Elimination.
We will consider tricks for "using" the inverse matrix without computing it.
We will recall the definition of a FLoating-point OPeration (FLOP)¹.
We count FLOPs for many algorithms.

¹ The plural of FLOP must be FLOPs, which is often confused with FLOPS (FLoating-point OPerations per Second), a FLOP rate.


Identity Matrices
The n×n identity matrix I_n, or simply I, is a square matrix with ones on the main diagonal and zeros everywhere else:

I_1 = [1],  I_2 = [1 0; 0 1],  I_3 = [1 0 0; 0 1 0; 0 0 1],  I_4 = [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 1]

"Identity" is a collective name for matrices of different dimensions; it is sometimes denoted as E, or E, or 1, etc.
Its n columns will be denoted e_1, ..., e_n.
The second column of the identity matri(ces) is denoted e_2:

e_2 of I_2 = [0; 1],  e_2 of I_3 = [0; 1; 0],  e_2 of I_4 = [0; 1; 0; 0]



Matrix Inverse–Definition

Definition
A square n×n matrix A has an n×n inverse B if

AB = BA = I.

In this case, A is said to be nonsingular or invertible.

A square matrix that does not possess an inverse is called singular.

A matrix A that does not have an inverse is noninvertible (whether square or not).
B must satisfy both conditions, AB = I and BA = I.
Counter-example: assume that C^{-1} = D, and hence CD = DC = I. Then

[C 0] [D; 0] = CD + 0·0 = I    (an identity matrix: good!)
[D; 0] [C 0] = [DC 0; 0 0] = [I 0; 0 0]    (not an identity matrix!)
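As a quick numerical sanity check, the counter-example above can be reproduced in MATLAB (a minimal sketch; the 2x2 block C is my own illustrative choice):

C = [2 1; 1 1];  D = inv(C);     % CD = DC = I for this invertible block
Cbig = [C, zeros(2,1)];          % 2x3: C padded with a zero column
Dbig = [D; zeros(1,2)];          % 3x2: D padded with a zero row
Cbig*Dbig                        % the 2x2 identity: good!
Dbig*Cbig                        % 3x3 matrix [I 0; 0 0]: NOT an identity matrix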


Theorem of the Inverse of a Transpose and of a Product
Theorem
Let A, B be n×n invertible matrices. Then
(a) (A^{-1})^{-1} = A.
(b) The inverse matrix, if it exists, is unique.
(c) (A^T)^{-1} = (A^{-1})^T, or: the inverse of the transpose is the transpose of the inverse.
(d) If A is symmetric (or Hermitian), then A^{-1} is symmetric (or Hermitian).
(e) (AB)^{-1} = B^{-1}A^{-1}, or: the inverse of the product is the product of the inverses, read backwards.

Proof of (a). If B is the inverse of A, then AB = BA = I, which, by definition, means that A is the inverse of B.
Proof of (b). If A has two distinct inverses X and Y, then AX = XA = I and AY = YA = I.
Subtracting the two sets of equations yields
A(Y − X) = (Y − X)A = 0,
and since A is invertible, X = Y.
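A numerical spot check of parts (c) and (e) in MATLAB (a sketch; the random test matrices are my own choice, and are invertible with probability one):

rng(1); A = randn(4); B = randn(4);
norm(inv(A') - inv(A)')          % ~0: inverse of the transpose = transpose of the inverse
norm(inv(A*B) - inv(B)*inv(A))   % ~0: inverse of the product, read backwards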
Proof of (c), Inverse of Transpose, etc.

(c) The proof is messy but simple. It requires mastering the Einstein Index Notation (EIN), in which a repeated index is summed over.
Let B = A^T, meaning that b_jk = a_kj.
Define Y = B^{-1}. We need to show that Y = (A^{-1})^T, or that Y^T = A^{-1}.

y_ij·b_jk = δ_ik and b_ij·y_jk = δ_ik    (EIN form of YB = BY = I)
y_ij·a_kj = δ_ik and a_ji·y_jk = δ_ik    (after substituting b_jk = a_kj)

Now define y_ij = z_ji, i.e., Y^T = Z, and substitute for y above to get

a_kj·z_ji = δ_ik and a_ji·z_kj = δ_ik    (meaning that AZ = ZA = I)

Therefore A^{-1} = Z, which is Y^T.
(d) follows from (c) immediately.
(e) can be shown using a development like that of (c). Left as an exercise!


Divisors of zero
Also note that when it comes to matrices, one can have divisors of zero; in other words, matrices that are not zero, but whose product is a zero matrix. For example:

[1 1] [1; −1] = [0]    (1)
(nonzero E) (nonzero F) (zero matrix)

and E and F are divisors of zero.

A matrix such as E can also be called a pseudo-zero matrix, or simply a pseudozero,
because it is not zero but it can sometimes behave like a zero.
Pseudozeros are apparently paradoxical because we are used to scalar algebra:

scalars: ab = 0 implies that either a or b is zero    (2)
matrices: AB = 0 does not imply that either A or B is zero    (3)

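The divisors of zero in (1) can be checked directly in MATLAB (a sketch, using E and F as reconstructed above):

E = [1 1];     % nonzero 1x2
F = [1; -1];   % nonzero 2x1
E*F            % returns 0: the product of two nonzero matrices is the zero matrix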


Other (apparently paradoxical) inverse matrix relations

For instance, cancellation of matrix factors on both sides of an equation:

scalars: ab = ac AND a ≠ 0 implies that b = c    (4)
matrices: AB = AC AND A ≠ 0 does not imply that B = C    (5)

Another apparent paradox starts with a system of two equations in one unknown, which does not have a solution. For instance

Fx = [1; 1] x = [2; 2.1]  with x ∈ R    (6)

Premultiplying the equation by the pseudo-zero G = [1/2 1/2] leads to the scalar equation x = 2.05.
This x satisfies neither original equation exactly: a system with no solution has acquired a "solution". This is because G (and also F) is a pseudo-zero.
Recall that multiplying both sides of a false equation (such as 2 = 3) by zero turns it into a true equation (0 = 0).



It is safe to multiply with a nonsingular matrix
It is NOT safe to multiply with a singular or noninvertible matrix

Rule of Thumb
It is safe to premultiply (or postmultiply) by an invertible matrix.

If A^{-1} exists, it is safe to premultiply AB = ACD by A^{-1} to get
A^{-1}AB = A^{-1}ACD, or B = CD.
Cancelling an invertible matrix is OK.

Rule of Thumb
Premultiplying or postmultiplying by a singular matrix is dangerous!!

[1 1; 1 1][1 2; −1 −2] = [1 1; 1 1][−1 −2; 1 2]  (both products are zero)
is true, but since the left factor is singular, it does not imply that the two right factors are the same.



Pitfalls to Remember

In general, only invertible matrices can be used reliably to conduct typical algebraic operations.
Noninvertible matrices should be used with care.
Non-square matrices and some square (singular) matrices can behave like a zero matrix, even though these matrices are not zero.
A pseudozero matrix should not be canceled from an equation.
Premultiplying both sides of an equation by such a matrix can create solutions when no solution exists.
Pseudozeros can have pseudoinverses. These are matrices that behave like inverses, but only sometimes.

Pseudo-inverses can have some of the properties of the inverse,
but not all.



Mega-Theorem on Existence of the Inverse
Theorem
Given an n×n matrix A, the following statements are equivalent:
(a) A is invertible;
(b) det A ≠ 0;
(c) A has n linearly independent rows and n linearly independent columns;
(d) all the rows of the matrix U in the LU decomposition of A are nonzero;
(e) all the rows of the matrix R in the QR decomposition of A are nonzero;
(f) A does not have a zero eigenvalue;
(g) A does not have a zero singular value;
(h) the null-space of A is trivial;
(i) the system of equations Ax = b has a unique solution for any b.

Some of these concepts have not been covered yet.
The proofs are intricate and deep.
The inverse touches all aspects of matrix theory.

Self-study: You need to fully understand this theorem, eventually.



Matrices whose Inverses have the Same Shape (1)
Assuming the inverse exists. Proof?

Diagonal matrices: just invert every diagonal element (assuming no zeros on the diagonal).
Triangular matrices: [2 0 0; 1 1 0; 1 3 0.5]^{-1} = [1/2 0 0; −1/2 1 0; 2 −6 2]
Symmetric matrices: [2 0 1; 0 2 3; 1 3 0.5]^{-1} = [4/9 −1/6 1/9; −1/6 0 1/3; 1/9 1/3 −2/9]
Block diagonal matrices: [A11 0; 0 A22]^{-1} = [A11^{-1} 0; 0 A22^{-1}]
Block triangular matrices: [A11 A12; 0 A22]^{-1} = [A11^{-1}, −A11^{-1}·A12·A22^{-1}; 0, A22^{-1}]
Orthogonal matrices: Q^{-1} = Q^T by definition



Matrices whose Inverses have the Same Shape (2)
Assuming the inverse exists. Proof?

Elementary Matrices Used in Gaussian Elimination

Permutation: Π^{-1} = Π^T. Example: [0 0 1; 1 0 0; 0 1 0]^{-1} = [0 1 0; 0 0 1; 1 0 0]
Swap 2 rows: to swap the jth and kth rows, premultiply by E(j,k), whose inverse is itself. Example:

E(2,3) = [1 0 0; 0 0 1; 0 1 0] = ([1 0 0; 0 0 1; 0 1 0])^{-1}

Row scaling: to scale the jth row by α, premultiply by E(j;α), whose inverse is E(j;1/α):

E(2;α) = [1 0 0; 0 α 0; 0 0 1] = ([1 0 0; 0 1/α 0; 0 0 1])^{-1} = E(2;1/α)^{-1}



Matrices whose Inverses have the Same Shape (3)
Assuming the inverse exists. Proof?

Add a multiple of a row to another: to add α times the jth row to the kth row, with k > j, premultiply by E(j,k;α) = E(j,k;−α)^{-1}:

E(2,3;α) = [1 0 0; 0 1 0; 0 α 1] = ([1 0 0; 0 1 0; 0 −α 1])^{-1} = E(2,3;−α)^{-1}

Annihilate elements below a diagonal element: to zero all elements below the (j,j) diagonal element, use E(j,:;v), where v_k = 0 for k ≤ j:

E(2,:;v) = [1 0 0 0; 0 1 0 0; 0 α 1 0; 0 β 0 1] = ([1 0 0 0; 0 1 0 0; 0 −α 1 0; 0 −β 0 1])^{-1} = E(2,:;−v)^{-1}

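The elementary matrices of the last two slides are easy to build and test in MATLAB (a sketch; the size and the value of alpha are illustrative choices):

n = 3; alpha = 5;
Eswap  = eye(n); Eswap([2 3],:) = Eswap([3 2],:);   % E(2,3): swap rows 2 and 3
Escale = eye(n); Escale(2,2) = alpha;               % E(2; alpha): scale row 2 by alpha
Eadd   = eye(n); Eadd(3,2) = alpha;                 % E(2,3; alpha): add alpha*row 2 to row 3
inv(Eswap)    % equals Eswap: a swap is its own inverse
inv(Escale)   % has 1/alpha in position (2,2): this is E(2; 1/alpha)
inv(Eadd)     % has -alpha in position (3,2): this is E(2,3; -alpha)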


Special Singular Square Matrices

Skew-symmetric matrices with odd dimension n. Reason: the eigenvalues are purely imaginary and occur in conjugate pairs, so for odd n one of them must be zero!
Matrices with two identical rows or two identical columns. Reason: in that case the determinant is zero, and the rows (or columns) are NOT linearly independent!
Matrices with a zero row or a zero column. Reason: the determinant is obviously zero!
Triangular matrices with a zero on the diagonal. Reason: the determinant is the product of the diagonal elements in this case, and so is zero!
Matrices with null vectors (a nonzero x with Ax = 0). Reason: the columns are linearly dependent!
Projection matrices. Reason: if P is a projection matrix then P² = P. If it had an inverse P^{-1}, then P^{-1}P² = P^{-1}P, or P = I. So unless P is the identity, P must be noninvertible.
Nilpotent matrices. Reason: by definition N^h = 0 for some h > 1. If N^{-1} existed, one could premultiply by N^{-1} exactly h − 1 times to get N = 0.



What to do when there is no inverse–Pseudoinverses

Solving systems of linear equations Ax = b is a fundamental problem in every discipline of science, engineering, technology, economics, etc.
The matrix A is often not square: it has dimensions m×n with m not necessarily equal to n.
In all cases, we would like to be able to express the solution (theoretically) as x = A†b, where A† is some pseudo-inverse.
If A is invertible with inverse X, then AX = XA = I by definition. The identity matrix has many obvious and useful properties that one expects the products AX and XA to possess.
One way to generate pseudoinverses X of A is to enforce a subset of these properties on AX or XA. For each different application, one particular subset of properties may be more useful.



Pseudo-Inverses in General

If A has an inverse A^{-1}:
A^{-1}A is the identity I, and
AA^{-1} is the identity I, and
x = A^{-1}b is the exact solution of Ax = b.

As shown by the mega-theorem above, square matrices may fail to have an inverse for many reasons.
If the matrix A has more rows than columns, there is no inverse.
If the matrix A has more columns than rows, there is no inverse.
What to do with equations Ax = b then?
We define a pseudo-inverse A# (or false inverse). Meaning?
A#A looks like I as much as possible, or
AA# looks like I as much as possible, or
x = A#b is a "meaningful or reasonable" solution of Ax = b (even though it is typically not exact).



Desired Properties for Pseudoinverses.

Following is a sampler of such properties:

1 XAX = X
2 AXA = A
3 AX is symmetric (or Hermitian)
4 XA is symmetric (or Hermitian)
5 A^{k+1}X = A^k
6 XA^kX = A^{k−1}X
7 x = Xb is the smallest x (i.e., it minimizes x^T x) subject to the constraint Ax = b, when m ≤ n.
8 x = Xb minimizes the "error squared" (Ax − b)^T(Ax − b), when m ≥ n.

And there are more. Each of the above properties holds exactly when A is invertible and X = A^{-1}. For instance, XAX = A^{-1}AA^{-1} = A^{-1} = X. See the Exercises. A pseudo-inverse that satisfies properties i1, i2, ... is denoted A^{i1,i2,...}.



Some Useful Pseudo-Inverses (1)
Moore-Penrose (MP) pseudo-inverse, denoted A+
It is defined as the matrix satisfying the first 4 properties, and can be denoted A^{1,2,3,4}.
It is what MATLAB returns with the pinv command. It is typically computed using the SVD.
It exists and is unique for any matrix A, and is in general the most reliably computable pseudo-inverse.
Drazin pseudo-inverse, denoted A# or AD
It is the same as A^{1,2,5}.
It is useful in solving algebraic differential equations.
Left inverse, denoted AL and defined by AL·A = I
It satisfies properties 1, 2, 4, and 6.
Useful for solving minimum-error problems Ax = b with m > n.
Right inverse, denoted AR and defined by A·AR = I
It satisfies properties 1, 2, 3, and 5.
Useful for solving minimum-norm problems Ax = b with m < n.
Some Useful Pseudo-Inverses (2)
Minimum-norm pseudo-inverse, denoted A†, or A^{7}, and expressed as

A† = A^T (A A^T)^{-1}

It exists if A·A^T is invertible. It also satisfies properties 1, 2, 3, and 5.
It is useful in solving underdetermined systems, where m < n, which arise in design and decision-making problems.
This is a least-squares problem.
Minimum-error pseudo-inverse, denoted A†, is A^{8}, and can be expressed as

A† = (A^T A)^{-1} A^T

It exists if A^T·A is invertible. It also satisfies properties 1, 2, and 4.
It is useful in solving overdetermined systems, where m > n, which arise in curve fitting and measurement problems.
This is a least-squares problem.

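Both formulas can be checked against pinv in MATLAB (a sketch; the test matrices are my own full-rank examples, and the formulas agree with pinv only in the full-rank case):

A = [1 0; 0 1; 1 1];          % 3x2 (m > n), full column rank
norm( (A'*A)\A' - pinv(A) )   % ~0: minimum-error pseudo-inverse
B = [1 2 3; 0 1 1];           % 2x3 (m < n), full row rank
norm( B'/(B*B') - pinv(B) )   % ~0: minimum-norm pseudo-inverse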


The Left and Right Inverses *
Definition
A matrix A has a left inverse, denoted AL, if AL·A = I, and a right inverse, denoted AR, if A·AR = I.

The existence of either inverse is not guaranteed, and if either exists, it is not unique.
However, a matrix A that has both a left inverse and a right inverse is invertible.
Note that right and left inverses are defined for general, not necessarily square, matrices.
In general, the right or left inverses, when they exist, have the dimensions of the transpose.



Uses of the Left Inverse

The left inverse is useful in finding approximate solutions of overdetermined systems of equations (which have more equations than unknowns).
If Ax = b, where A is m×n with m > n, and AL is a left inverse of A, then

Ax = b ⟹ AL·A·x ≈ AL·b ⟹ x ≈ AL·b    (7)

Premultiplying both sides of a matrix equation by a matrix P does not preserve the equality unless P is invertible.
This curious fact is related to the existence of divisors of zero in matrix algebra.



Left Inverse for Least-Squares Approximation

Example. Two people measured the length of a chair x in feet, and found two (contradictory) measurements: 2 and 2.1.
Find an approximate solution to the resulting overdetermined system

x = 2 and x = 2.1, or Ax = b with A = [1; 1] and b = [2.0; 2.1]    (8)

Solution. Note that A has a left inverse AL = [1/2 1/2], because AL·A = 1.
The approximate solution is

x ≈ AL·b = [1/2 1/2][2.0; 2.1] = (2 + 2.1)/2 = 2.05    (9)

which can be interpreted as the average of the solutions of the two equations.
Note that x = 2.05 does NOT satisfy either equation in the system.

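The same numbers fall out of MATLAB (a sketch; note that backslash on a tall system returns the least-squares solution, which here coincides with the left-inverse average):

A = [1; 1];  b = [2.0; 2.1];
AL = [1/2 1/2];    % a left inverse: AL*A = 1
x  = AL*b          % 2.05, the average of the two measurements
A\b                % also 2.05: backslash solves the least-squares problem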


Right Inverses for underdetermined systems

The right inverse is useful in finding minimum-norm solutions of underdetermined systems of equations (which have fewer equations than unknowns).
If Ax = b, where A is m×n with m < n, then one expects, in general, infinitely many solutions.
Of interest is the solution that is "smallest", i.e., that has minimum norm (magnitude).
If AR is a right inverse of A, one can choose to represent one possible solution as x = AR·z, and hence:

Ax = b ⟹ A·AR·z = b ⟹ z = b ⟹ x = AR·b is a possible solution    (10)



Example of Using Right Inverse
Problem. To make sure that a bank account has $3K after 3 years, one can make 3 payments x1, x2, and x3 at the end of every year till then. At the end of the third year, and assuming an annual interest rate of 5%, the account must have

1.05²·x1 + 1.05·x2 + x3 = $3K.    (11)

Find at least one solution.

Solution. Here A = [1.05², 1.05, 1], which has many right inverses, such as

AR = [1/(3·1.05²); 1/(3·1.05); 1/3] or AR = [1/(4·1.05²); 1/(4·1.05); 1/2]

because in both cases A·AR = 1. If x = AR·($3K), then two possible solutions are

x^T = [$907, $952, $1000] or x^T = [$680, $714, $1500]    (12)

But the two solutions require different total deposits ($2859 and $2894, respectively).
The solution that minimizes ||x||₂ is x = A^T(A·A^T)^{-1}·($3K), or

x^T = [$997, $949, $904], with a total deposit of $2850.

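The minimum-norm computation can be reproduced in MATLAB (a sketch of the formula x = A^T(AA^T)^{-1}b from the slide):

A = [1.05^2, 1.05, 1];  b = 3000;
x = A' * ((A*A') \ b)   % approx [997; 949; 904]
A*x                     % recovers 3000
sum(x)                  % approx 2850, the total deposit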


Nilpotent matrices *

Nonzero divisors of a zero matrix ⟹ there are also nonzero roots of the zero matrix.
Definition. An n×n matrix M that satisfies the conditions M^h = 0 and M^{h−1} ≠ 0, with h ≤ n, is called a nilpotent matrix of index h.
If M³ = 0 but M² ≠ 0, then M is a cubic root of the zero matrix.
These matrices are crucial to understanding the eigenvalue problem and singular systems of differential equations.
Theorem. A k-upper triangular matrix A is nilpotent of index n + 1 − k.
Proof (sketch). Every time we premultiply by A we lose one superdiagonal:

A = [0 a b; 0 0 c; 0 0 0] ⟹ A² = [0 0 ac; 0 0 0; 0 0 0] ⟹ A³ = 0.    (13)

If a = c = 1 above, then A² ≠ 0 and A³ = 0; therefore this matrix has index 3. Now if a = 0 and b = 1, then A² = 0 and hence A has index 2.



Nilpotent Matrices can look full –Example

Exercise. Show that M is a nilpotent matrix.


M = [9 2 1 3; 100 2 48 66; 102 4 2 34; 27 6 3 9]

What is the order of nilpotency?

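A small MATLAB sketch can search for the index of nilpotency of any square matrix (the function name is my own; by the Cayley-Hamilton theorem, a nilpotent n×n matrix has index at most n):

function h = nilindex(M)
% Return the smallest h with M^h = 0, or Inf if M is not nilpotent.
n = size(M,1);  P = M;
for h = 1:n
    if ~any(P(:)), return, end   % M^h is exactly zero (exact for integer entries)
    P = P*M;                     % next power of M
end
h = Inf;
end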


Part 2- Inverse Computations and Gaussian Elimination!!

We do not compute the matrix inverse explicitly,
unless we absolutely have to,
or the computation is trivial.
—————————————
We use Gaussian Elimination
to solve the equations!!
—————————————
Important: Do Not Compute the Inverse Explicitly
(unless it is clearly advantageous or unavoidable)

Before computing the inverse or pseudo-inverse of a matrix, one should always ask whether such a computation is really necessary.
Typically it is not necessary, and the extra effort needed to compute and store the inverse is not justified.
The solution of the matrix equation Ax = b, if it exists and is unique, can be formally written as x = A^{-1}b,
but computing the solution directly, without computing the inverse, is much preferred.

Do NOT compute the matrix inverse explicitly,
unless it is unavoidable or clearly advantageous.
When in doubt, use Gaussian Elimination.



Gaussian Elimination–Lower Triangular Case
Custom tailored solution (like it should be)

[2 0 0; 4 3 0; 2 2 1]^{-1} = [1/2 0 0; −2/3 1/3 0; 1/3 −2/3 1]
The inverse of a lower triangular matrix is (quasi) lower triangular. Why??

Put A and I (or B) side by side:
[ 2 0 0 | 1 0 0 ]
[ 4 3 0 | 0 1 0 ]
[ 2 2 1 | 0 0 1 ]
Scale every pivot (i.e., ones on the diagonal), keeping track of the operations: R1 ← (1/2)·R1, R2 ← (1/3)·R2:
[ 1   0 0 | 1/2  0  0 ]
[ 4/3 1 0 |  0  1/3 0 ]
[ 2   2 1 |  0   0  1 ]
Zero the elements below a11: R2 ← R2 − (4/3)·R1, R3 ← R3 − 2·R1:
[ 1 0 0 | 1/2   0  0 ]
[ 0 1 0 | −2/3 1/3 0 ]
[ 0 2 1 | −1    0  1 ]
Zero the elements below a22: R3 ← R3 − 2·R2:
[ 1 0 0 | 1/2    0   0 ]
[ 0 1 0 | −2/3  1/3  0 ]
[ 0 0 1 | 1/3  −2/3  1 ]
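The whole procedure is a few lines of MATLAB (a sketch of exactly the row operations above):

A  = [2 0 0; 4 3 0; 2 2 1];
AI = [A, eye(3)];                     % put A and I side by side
AI(1,:) = AI(1,:)/2;                  % scale pivot 1
AI(2,:) = AI(2,:)/3;                  % scale pivot 2 (pivot 3 is already 1)
AI(2,:) = AI(2,:) - (4/3)*AI(1,:);    % zero below a11
AI(3,:) = AI(3,:) - 2*AI(1,:);
AI(3,:) = AI(3,:) - 2*AI(2,:);        % zero below a22
Ainv = AI(:,4:6)                      % [1/2 0 0; -2/3 1/3 0; 1/3 -2/3 1]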
FLOPs for inverse of lower triangular matrix
A=Add, M=Multiply, D=Divide, MA=Multiply-Add

Scale every pivot:
A side: 1 D in the first row, 2 D in the second row, etc.
I side: computing 1/pivot costs n divisions in total.
Zero the elements under the kth scaled pivot (a_kk = 1), for k = 1, ..., n − 1:
A side: overwrite; no FLOPs.
I side: replace 0 = b_ik − a_ik·b_kk for i = k+1, ..., n, which needs n − k Ms for every k,
and (n−1) + (n−2) + ... + 1 = n(n−1)/2.
Computing B = A^{-1} therefore requires n Ds and n(n−1)/2 Ms.

To solve Ax = y, one must then compute x = A^{-1}y = By (typically n² MAs).
Since B is lower triangular, computing x_i = Σ_{k=1..i} b_ik·y_k requires i MAs;
the total is 1 + ... + n = n(n+1)/2 MAs.

Grand total for A^{-1}y: n D + n(n−1)/2 M + n(n+1)/2 M + n(n+1)/2 A,
or n D + n² M + n(n+1)/2 A,
or 3n²/2 + 3n/2 FLOPs.
Forward Substitution=Lower Triangular Case
Custom tailored (like it should be)
Solve the system [2 0 0; 4 3 0; 2 2 1][x1; x2; x3] = [1; 2; 3] directly. No need to compute the inverse!!

Write the equations:
2x1 = 1    (Eq. 1)
4x1 + 3x2 = 2    (Eq. 2)
2x1 + 2x2 + x3 = 3    (Eq. 3)

The first equation has 1 unknown. Solve: x1 = 1/2.
With x1 now known, solve Eq. 2 for x2: x2 = (2 − 4x1)/3 = (2 − 4·(1/2))/3 = 0.
Given x1, x2, solve Eq. 3 for x3: x3 = 3 − 2x1 − 2x2 = 3 − 2·(1/2) − 0 = 2.

In general, for k = 1, ..., n:

x_k = ( y_k − Σ_{j=1}^{k−1} a_kj·x_j ) / a_kk

This needs n Ds, and 1 + ... + (n−1) = n(n−1)/2 MAs.
FLOP count: n²/2 + n/2 FLOPs.

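A MATLAB version of forward substitution, written to mirror the backsubstitution code later in these notes (a sketch):

function x = forwardsubstitution(A, y)
% Solve Ax = y for a lower-triangular A, top row first.
n = size(A,1);
x = zeros(n,1);
x(1) = y(1) / A(1,1);                             % first equation: one unknown
for k = 2:n                                       % loop over the rows, going down
    x(k) = ( y(k) - A(k,1:k-1)*x(1:k-1) ) / A(k,k);
end
end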


Was computing the inverse necessary?

For solving Ax = b where A is lower triangular:

via GE: computing the solution using forward substitution requires n²/2 + n/2 FLOPs.
via A^{-1}: computing the inverse of the lower triangular matrix, then the product, requires 3n²/2 + 3n/2 FLOPs.

Computing the inverse and then the solution
was totally unnecessary!
It costs n² more FLOPs.



Gaussian Elimination–General Case
The key is to keep track of operations

Consider the system:
0·x1 + 2·x2 + 3·x3 = 5
4·x1 − x2 + 2·x3 = 5
2·x1 + 0·x2 + 2·x3 = 4

Write the matrix equivalent:
[ 0  2 3 | 5 ]
[ 4 −1 2 | 5 ]
[ 2  0 2 | 4 ]
The (1,1) pivot is zero. Need pivoting: swap rows 1 and 2, R1 ↔ R2 (keeping track):
[ 4 −1 2 | 5 ]
[ 0  2 3 | 5 ]
[ 2  0 2 | 4 ]
Zero the element below A(1,1): R3 ← R3 + (−A(3,1)/A(1,1))·R1, with multiplier −1/2:
[ 4 −1  2 | 5   ]
[ 0  2  3 | 5   ]
[ 0 1/2 1 | 3/2 ]
Zero the element below A(2,2): R3 ← R3 − (A(3,2)/A(2,2))·R2, with multiplier −1/4:
[ 4 −1  2  | 5   ]
[ 0  2  3  | 5   ]
[ 0  0 1/4 | 1/4 ]

Multiplier = −(entry to make zero) / (pivot above it)

This yields an upper-triangular system: we need backward substitution to get x.



Gaussian Elimination – The Main Algorithm
The matrix A undergoes a series of transformations A = A^(0) → A^(1) → A^(2) → ... and ends up as U, an upper triangular matrix.
A denotes the latest value of A; after transformation i = 4, A denotes A^(4).

The operations have the form:
R1 ↔ R_i1 (if needed)
R2 ← R2 − A(2,1)/A(1,1)·R1
R3 ← R3 − A(3,1)/A(1,1)·R1
...
R2 ↔ R_i2 (if needed)
R3 ← R3 − A(3,2)/A(2,2)·R2
R4 ← R4 − A(4,2)/A(2,2)·R2
...
R3 ↔ R_i3 (if needed)
R4 ← R4 − A(4,3)/A(3,3)·R3
...
When working with pivot A(k,k) and a row j > k:
A(k:end, 1:k−1) is already zero.
Rk really denotes A(k, k+1:end), and Rj really denotes A(j, k+1:end).
The multiplier −A(j,k)/A(k,k) can be stored in A(j,k), which is supposedly being zeroed.
Need to store the swaps: start with p = [1, 2, ..., n] and swap as needed: p(1) ↔ p(i1), ...



Pseudo-Code for Naive Gaussian Elimination

Algorithm
Naive Gauss Elimination, written as runnable MATLAB (a direct transcription of the pseudo-code):

function [U, bn] = gausselim(A, b)
% Triangularize A (and update b) by GE with partial pivoting.
n = size(A,1); U = A; bn = b;         % initialize U = A and bn = b
for k = 1 : n-1                       % main loop
    [~, kp] = max(abs(U(k:n,k)));     % select kp to maximize |U(kp,k)|
    kp = kp + k - 1;
    U([k kp],:) = U([kp k],:);        % swap row k with row kp
    bn([k kp]) = bn([kp k]);          % and the same for bn
    for i = k+1 : n                   % for the rows below the pivot
        m = -U(i,k)/U(k,k);           % compute the multiplier
        U(i,:) = U(i,:) + m*U(k,:);   % update U(i,:)
        bn(i) = bn(i) + m*bn(k);      % update bn(i)
    end
end
end



Gaussian Elimination – FLOP count

Finding the largest element in A(k:n,k) requires scanning n − k numbers, but no FLOPs.
Swapping 2 rows of length n − k + 1: no FLOPs.
Zeroing the n − k elements under the kth pivot A(k,k):
to zero one element, we need 1 division to compute the multiplier, and (n − k) MAs to add the row A(k,k+1:n) to a row below it;
to zero all the elements below the kth pivot, we need (n − k) divisions and (n − k)² MAs.
To complete the triangularization, sum the above for k = 1 : n − 1:

(n−1)² + ... + 1² = (2n−1)·n·(n−1)/6 = n³/3 − n²/2 + n/6 MAs

Reduction to upper-triangular form requires O(2n³/3) FLOPs.



Back Substitution
Same as Forward Substitution but upside down

Very similar to forward substitution:
Given an upper triangular matrix, solve the last equation for x_n (1 equation in 1 unknown).
With x_n known, solve the equation above it for x_{n−1}, etc.

function x = backsubstitution(A, b)
n = size(A,1) ;           % A is n-by-n
% ... diagnostics skipped ...
x = zeros(n,1) ;
x(n) = b(n) / A(n,n) ;    % start at the bottom
for k = n-1 : -1 : 1      % loop over the rows, going up
    c = 1 / A(k,k) ;
    x(k) = ( b(k) - A(k,k+1:n)*x(k+1:n) ) * c ;
    % use c because multiplication is more efficient than division
end
end

This algorithm needs only n divisions, plus
1 + 2 + ... + (n−1) = (n−1)n/2 = O(n²) multiply-adds.



Matlab Commands for Gaussian Elimination (\ and lu)
mldivide, or backslash \, solves the system using GE with partial pivoting and returns the answer:

A = [0 2 3; 4 -1 2; 2 0 2]   % specify A by rows
y = [5, 5, 4]'               % make into a column
x = A \ y                    % returns the column [1; 1; 1]

The lu command provides more information, namely PA = LU, where
P is a permutation matrix, satisfying P^{-1} = P^T;
L is unit lower triangular (ones on the main diagonal);
U is upper triangular and is the result of the GE.
For this A, [L,U,P] = lu(A) gives

L = [1 0 0; 0 1 0; 1/2 1/4 1],  U = [4 −1 2; 0 2 3; 0 0 1/4],  P = [0 1 0; 1 0 0; 0 0 1]

The PLU decomposition means that the GE of

PA = [0 1 0; 1 0 0; 0 0 1]·[0 2 3; 4 −1 2; 2 0 2] = [4 −1 2; 0 2 3; 2 0 2]

proceeds without pivoting (row swapping) so as to lead to U.
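One payoff of the factorization: once PA = LU is computed (an O(n³) cost), each additional right-hand side costs only two triangular solves, O(n²) each. A sketch, reusing the A and y above:

[L,U,P] = lu(A);        % factor once
x = U \ (L \ (P*y));    % forward solve, then back solve: returns [1; 1; 1]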
The \ Matlab Command

Note that the MATLAB command \ works even when the matrix A is not square.
For instance, if A = [1 1 1 1; 2 1 3 2],
representing 2 equations in 4 unknowns, with b = [-1; 2],
then A\b sometimes returns the "least squares solution"
(which will be explained later): [0; -2.5; 1.5; 0].
Beware of many pitfalls!!



The inverse as notation
Computational vs. mathematical equivalence

The above examples showed the essence of Gaussian Elimination in primitive form.

For lower triangular matrices there is no need to pivot.
Pivoting is swapping rows to ensure that the pivot/diagonal elements are nonzero, or large.
If a pivot were zero, the lower-triangular A would be singular (det A = 0. Why?).
The trend is the same: because matrix multiplication is very expensive, there is no need to compute the inverse.

Computational POV: x = A^{-1}y means "Solve Ax = y for x"    (14)

Keep this in mind when you see:

A^{-1}(I + c·v·v^T)^{-1}(ABCy), or B(A^{-1} + B^{-1})(C + A^{-1})DAy,
or C^{-1}(A^{-1} + B)(A + B^{-1})Cy, or A^{-5}y

Conclusion: keep manipulating the expression until you get the most computationally efficient form.
Placing the parentheses and Gaussian Elimination

A Gaussian Elimination solve is O(n³/3 + n²/2); matrix inversion followed by a matrix-vector multiplication is O(2n³/3).
Three orderings for evaluating A·B^{-1}·C·x (MI = matrix inversion, MM = matrix-matrix multiply, MV = matrix-vector multiply, GE = Gaussian Elimination solve):

[A(B^{-1})](Cx): (1 MI + 1 MM) + 2 MV = O(2n³/3 + n³ + 2n²) = O(5n³/3 + 2n²)
[A·B^{-1}·C]x: (1 MI + 2 MM) + 1 MV = O(2n³/3 + 2n³ + n²) = O(8n³/3 + n²)
A[B^{-1}(Cx)]: 1 MV + 1 GE + 1 MV = O(2n³/3 + 2n²)  (the best ordering)

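A quick MATLAB timing experiment makes the comparison concrete (a sketch; the size and the random matrices are illustrative choices, and randn matrices are invertible with probability one):

n = 2000; A = randn(n); B = randn(n); C = randn(n); x = randn(n,1);
tic; y1 = (A*inv(B)*C)*x;   toc   % worst: inversion plus two matrix-matrix products
tic; y2 = A*(inv(B)*(C*x)); toc   % better: inversion, then matrix-vector products
tic; y3 = A*(B\(C*x));      toc   % best: one GE solve, no explicit inverse
norm(y1-y3)/norm(y3)              % same answer up to roundoff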


Pivoting is tricky *

The choice of an alternative pivot is tricky.
Note, for instance, that what may at first appear to be an excellent pivot may actually be poor:

A = [0.1 2 200; 5 9 12; 10 90 3000]

The element a11 = 0.1 is obviously a poor choice.
The largest element in the first column is a31 = 10.
The best choice, however, is actually a21 = 5.
To see this, recall that the rows of A correspond to equations, and scaling rows corresponds to scaling equations.



Scaling the rows *

We scale the rows of A so that the elements in the first column are all ones:

S·A = [10 0 0; 0 1/5 0; 0 0 1/10]·[0.1 2 200; 5 9 12; 10 90 3000] = [1 20 2000; 1 1.8 2.4; 1 9 300] = Ā

Recall that the pivot's row will have to be subtracted from the other rows, thus "disturbing" them.
It is obvious that the first row is a poor choice, because the element ā13 = 2000 would overpower ā23 = 2.4, thus endangering the accuracy of the solution.
The best choice of pivot row is the second row, as it will disturb the other rows the least.
The third row is a better choice than the first, but not the best, because ā33 = 300 still overpowers ā23 = 2.4.



Pivoting Strategies computational cost

Partial Pivoting: row exchanges are allowed but column exchanges are not allowed.
Full or Complete Pivoting: both row and column exchanges are allowed.
No Pivoting: exchanges are not allowed (typically to preserve sparsity patterns).

Partial pivoting: typically one uses as a pivot the entry with the largest absolute value relative to its row.
For the kth pivot, compare the ratios |Ā(j,k)| / ||Ā(j, k:n)|| and find the maximizing j.
The simplest norm to compute in this context, and one that is quite satisfactory, is the 1-norm, ||x||₁ = Σ|x_i|, which requires O(n) additions.
To find the best pivot at the kth iteration, one must therefore compute O(n − k) ratios.
The overall cost of partial pivoting is O(n²/2) divisions, O(n²/2) comparisons, and O(n³/3) additions.



Row vs. Column Pivoting

Permuting rows of A is equivalent to permuting equations.
Permuting columns is equivalent to reordering and renaming the unknown variables.
Column pivoting in Gaussian Elimination:
Scan Â(k, k:n) to the right of the desired pivot location and find a suitable pivot, say Â(k,j).
Typically Â(k,j) is the largest element of Â(k, k:n) in absolute value.
Swap the kth and jth columns; i.e., form Â·Q_kj, where Q_kj is a permutation matrix obtained by exchanging the kth and jth columns of the identity matrix.
Proceed with the annihilation of all elements below the kth pivot.
Column pivoting leads to the LU decomposition

LU = AQ, with Q = Q_{1,j1} ··· Q_{n−1, j(n−1)}



Full Pivoting

For best accuracy, full pivoting can be used, in which the element with the largest absolute value in the submatrix Acurrent(k:end, k:end) is chosen as the new kth pivot.
If the largest element of each row (relative to its row) is located, these elements can then be compared, and the largest among them is the largest element in the submatrix.
Mathematically, the kth full pivoting operation can be written as P_k·Â·Q_k, where P_k and Q_k are permutation matrices.
Full pivoting is slightly more difficult to program.
This results in the full-pivoting LU decomposition

PAQ = LU, where P and Q are permutation matrices.    (15)



Preconditioning

In general, preconditioning of A can be thought of as finding a matrix S whose inverse S^{-1} can be computed fast (say in O(n) operations), such that S^{-1} approximates A^{-1}.
In other words, assume that A is dominated by S, meaning that one can write A as A = S + T, where ||S|| ≫ ||T||. Then S^{-1}A = I + S^{-1}T, with ||S^{-1}T|| ≪ 1 = ||I||. For sparse matrix applications, it is also required that S^{-1}T remain sparse.
Hence Ax = b is now equivalent to S^{-1}Ax = Ix + S^{-1}Tx = S^{-1}b. See iterative methods in U14.



E¢ cient Scaling and preconditioning

The success of a linear algebra or machine learning algorithm often hinges on preconditioning of the problem data.

Centering the columns or rows: if rows represent instances and columns represent variables, one can subtract the mean of every column from it.
Normalizing rows and columns: every row (or column) can be centered and scaled to have a unit standard deviation. This process can also be called standardization.
Scaling to have the same row maxima: every row is "divided" by its largest element in absolute value, so the largest element in every row is 1 (see the sketch after this list).
Computationally:
n comparisons to find the maximum,
1 division to find the inverse of this element,
and n multiplications.

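A vectorized MATLAB sketch of the row-maxima scaling (implicit expansion divides every row of A by its own maximum):

m  = max(abs(A), [], 2);   % row maxima: n comparisons per row
As = A ./ m;               % largest element of every row of As is now +/-1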


Approximate maximum binary scaling

To avoid the n² multiplications above, simply recall that all elements are stored in binary format: sign × binary_mantissa × 2^exponent.
Find the largest exponent in the row, and subtract it from the exponent of every element.
If the largest exponent is 5, the row is efficiently and effectively scaled by 2⁵ = 32. No multiplications are necessary.

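A MATLAB sketch of exponent-only scaling, using log2 to read off the binary exponents (assumes nonzero entries; in low-level code the subtraction would act directly on the exponent bits, with no multiplications at all):

[~, E] = log2(A);          % binary exponent of every entry
emax   = max(E, [], 2);    % largest exponent in each row
As     = A .* 2.^(-emax);  % scale each row by a power of two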


No Pivoting Strategies

If the desired row and/or column permutations are known in advance, they can all be collected in two permutation matrices P and Q and applied to A at the onset. There are several such strategies.



One can in principle go through the full pivoting argument without any annihilation/elimination:
1 for k = 1 : min(m, n)
  1 find the largest element a_ij in A(k:m, k:n)
  2 swap the kth row with the ith, and the kth column with the jth
Alternatively, and to reduce fill-ins in a sparse matrix, penalize rows that have too many nonzero elements, so as to push them down.
Hence, instead of finding (i,j) to maximize |a_ij|:
maximize |a_ij|/ν_i, where ν_i is the number of nonzero elements in the ith row, or
maximize |a_ij| for i ∈ {rows having a number of nonzero elements less than ν_k}. Here ν_k starts as small as possible (say 1) and is increased until an acceptable nonzero a_ij is located in the submatrix A'(k:m, k:n).



Reverse Cuthill-McKee Ordering

B = bucky; p = symrcm(B); R = B(p,p); spy(B); spy(R); % beautification removed

An alternative is the column approximate minimum degree permutation, with the Matlab command colamd.



load west0479; A = west0479; p = colamd(A);

The column approximate minimum degree algorithm shuffles the columns so as to minimize fill-in during GE.
Notice the significant decrease of fill after applying the colamd algorithm.

