
DIRECT METHODS

YUWEN LI

The textbook of this course is “Applied Numerical Linear Algebra” by James Demmel [2]. Interested readers may also consult [3, 4], which are well-known references on numerical linear algebra and matrix analysis.
The central topic is the numerical solution of linear systems of equations
by direct methods (Gaussian elimination) and iterative methods (Gauss-
Seidel methods, Jacobi methods, Krylov space methods). Related topics
include least-squares problems, matrix factorization (LU, Cholesky, QR fac-
torization), numerical eigenvalue problems, fast solvers (Fast Fourier Trans-
form and multigrid) etc.

1. Vector and matrix norms


To quantify the error in numerical linear algebra, we need to introduce
norms for vectors and matrices.

1.1. Vector Norms.


Definition 1.1. We say ∥ • ∥ : Rn → R is a norm on Rn if it satisfies the
following three conditions:
(1) ∥u∥ ≥ 0 and ∥u∥ = 0 =⇒ u = 0;
(2) ∥αu∥ = |α|∥u∥, ∀α ∈ R;
(3) ∥u + v∥ ≤ ∥u∥ + ∥v∥.
For example, the Euclidean norm is defined to be
  ∥u∥ = (u · u)^{1/2} = (u_1^2 + ⋯ + u_n^2)^{1/2},   u = (u_1, . . . , u_n) ∈ R^n.
In analysis, the following ℓ^p norm is also very useful:
  ∥u∥_p = (∑_i |u_i|^p)^{1/p} = (|u_1|^p + ⋯ + |u_n|^p)^{1/p},   u ∈ R^n,
which is a discretization of the function norm (∫ |•|^p dx)^{1/p}.
Theorem 1.1. For 1 ≤ p ≤ ∞, ∥ • ∥_p is a norm on R^n.
Proof. It suffices to prove
  ∥u + v∥_p ≤ ∥u∥_p + ∥v∥_p,

Address: [email protected]; School of Mathematical Sciences, Zhejiang University.



the so-called Minkowski inequality.



  ∥u + v∥_p^p = ∑_i |u_i + v_i| |u_i + v_i|^{p−1}
              ≤ ∑_i |u_i| |u_i + v_i|^{p−1} + ∑_i |v_i| |u_i + v_i|^{p−1}
              = |u| · |u + v|^{p−1} + |v| · |u + v|^{p−1},
where |u| denotes the vector (|u_1|, . . . , |u_n|) and |u + v|^{p−1} the vector with entries |u_i + v_i|^{p−1}. To proceed, we need to recall the Hölder inequality
  |u · v| ≤ ∥u∥_p ∥v∥_q,   1/p + 1/q = 1.
It then follows that
  |u| · |u + v|^{p−1} ≤ ∥u∥_p ∥ |u + v|^{p−1} ∥_q = ∥u∥_p (∑_i |u_i + v_i|^p)^{1/q} = ∥u∥_p ∥u + v∥_p^{p/q},
  |v| · |u + v|^{p−1} ≤ ∥v∥_p ∥ |u + v|^{p−1} ∥_q = ∥v∥_p (∑_i |u_i + v_i|^p)^{1/q} = ∥v∥_p ∥u + v∥_p^{p/q}.
Combining the previous results and dividing both sides by ∥u + v∥_p^{p/q} = ∥u + v∥_p^{p−1} (the cases ∥u + v∥_p = 0 and p = 1, ∞ being trivial) completes the proof. □
The normed space (R^n, ∥ • ∥_p) is denoted by ℓ^p. A special example of the ℓ^p norm is the ℓ^∞ norm
  ∥u∥_∞ = max_{1≤i≤n} |u_i|.
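In MATLAB, these vector ℓ^p norms can be evaluated with the built-in norm function; the following lines are a quick sanity check (an illustration, not part of the theory):

u = [3; -4; 12];
norm(u,1)      % |3| + |4| + |12| = 19
norm(u,2)      % sqrt(9 + 16 + 144) = 13
norm(u,4)      % (3^4 + 4^4 + 12^4)^(1/4)
norm(u,Inf)    % max |u_i| = 12, the limit of norm(u,p) as p grows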

Exercise: Prove the Jensen inequality: for a convex function f (namely f″ ≥ 0) and 0 ≤ λ_i ≤ 1 with ∑_i λ_i = 1, it holds that
  f(∑_i λ_i x_i) ≤ ∑_i λ_i f(x_i).

Exercise: Use the Jensen inequality to prove the Hölder inequality.


Exercise: Prove that ∥ • ∥_p is not a norm when 0 < p < 1.
Exercise: Prove that lim_{p→∞} ∥u∥_p = ∥u∥_∞ and that ∥u∥_p is a non-increasing function of p.
On Rn , we say two different norms ∥ • ∥X and ∥ • ∥Y are equivalent if
there exist positive constants C1 and C2 such that
C1 ∥u∥Y ≤ ∥u∥X ≤ C2 ∥u∥Y , ∀u ∈ Rn ,
and in this case we use the notation
∥u∥X ≂ ∥u∥Y .
For example, we have the following comparison:
  ∥u∥_2 ≤ ∥u∥_1 ≤ √n ∥u∥_2.

In fact, all norms on finite-dimensional linear spaces are equivalent.


Theorem 1.2 (Equivalence of vector norms). Any two norms ∥ • ∥_X and ∥ • ∥_Y on R^n are equivalent: there exist positive constants C_1 and C_2 such that
  C_1 ∥u∥_Y ≤ ∥u∥_X ≤ C_2 ∥u∥_Y,   ∀u ∈ R^n.
Proof. It suffices to prove that for any norm ∥ • ∥_X there exist C_1, C_2 > 0 such that
  C_1 ∥u∥_2 ≤ ∥u∥_X ≤ C_2 ∥u∥_2,   ∀u ∈ R^n.
This inequality is equivalent to
  C_1 ≤ ∥u∥_X ≤ C_2,   ∀u ∈ R^n with ∥u∥_2 = 1,
which is guaranteed by the continuity of ∥ • ∥_X : R^n → R as a function and the compactness of the unit sphere {u ∈ R^n : ∥u∥_2 = 1}.
We show the continuity in 2d (n = 2). Let e_1 = (1, 0), e_2 = (0, 1). Then
  |∥u∥_X − ∥v∥_X| ≤ ∥u − v∥_X
                  = ∥(u_1 − v_1)e_1 + (u_2 − v_2)e_2∥_X
                  ≤ |u_1 − v_1| ∥e_1∥_X + |u_2 − v_2| ∥e_2∥_X
                  ≤ (|u_1 − v_1|^2 + |u_2 − v_2|^2)^{1/2} (∥e_1∥_X + ∥e_2∥_X).
The (uniform) continuity of ∥ • ∥_X then follows from the ϵ-δ definition. □
1.2. Matrix norms. Similarly, we can define the norm of m × n matrices.
Definition 1.2. We say ∥ • ∥ : R^{m×n} → R is a norm on R^{m×n} if it satisfies the following three conditions:
(1) ∥A∥ ≥ 0 and ∥A∥ = 0 =⇒ A = O;
(2) ∥αA∥ = |α| ∥A∥, ∀α ∈ R;
(3) ∥A + B∥ ≤ ∥A∥ + ∥B∥.
A common matrix norm is the Frobenius norm
  ∥A∥_F = (∑_{i,j} a_{ij}^2)^{1/2}.

Exercise: Prove that ∥A∥_F^2 = tr(A^⊤A).
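A quick numerical check of this identity in MATLAB (a sanity check, not a proof):

A = randn(4,3);                                 % any rectangular matrix
[norm(A,'fro')^2, trace(A'*A), sum(A(:).^2)]    % the three values agree up to round-off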


Given that R^n and R^m are equipped with norms ∥ • ∥_{R^n} and ∥ • ∥_{R^m}, this pair of norms naturally induces a matrix norm on R^{m×n} by
  ∥A∥_{R^n→R^m} := sup_{0≠u∈R^n} ∥Au∥_{R^m}/∥u∥_{R^n} = sup_{u∈R^n, ∥u∥_{R^n}=1} ∥Au∥_{R^m},
which is also called the operator norm or the induced norm. In particular,
  ∥A∥_p := sup_{0≠u∈R^n} ∥Au∥_p/∥u∥_p = sup_{u∈R^n, ∥u∥_p=1} ∥Au∥_p
is the operator norm induced by the vector ℓ^p norm.



Question: There is a hidden issue in the definition of operator norms: it is not clear a priori whether the supremum used for defining ∥A∥_{R^n→R^m} is finite. The next theorem shows that
  sup_{u∈R^n, ∥u∥_{R^n}=1} ∥Au∥_{R^m}
is indeed a finite number.
Theorem 1.3. The operator norm is well-defined, namely,
  sup_{u∈R^n, ∥u∥_{R^n}=1} ∥Au∥_{R^m} ≤ C
for some constant C.

Proof. This theorem follows from the fact that ∥A • ∥_{R^m} : R^n → R is a continuous function and that the unit sphere {u ∈ R^n : ∥u∥_{R^n} = 1} is compact. □

Examples: For a matrix A, ∥A∥_2 is called the ℓ^2 operator norm or the spectral norm of A. For p = 1, ∞, 2 we have
  ∥A∥_1 = max_{1≤j≤n} ∑_{i=1}^{m} |a_{ij}|,
  ∥A∥_∞ = max_{1≤i≤m} ∑_{j=1}^{n} |a_{ij}|,
  ∥A∥_2 = √(λ_max(A^⊤A)).

The last equality can be proved using the following important theorem.
Theorem 1.4. For a symmetric and positive semi-definite matrix A, we have
  λ_max(A) = sup_{x≠0} (x^⊤Ax)/(x^⊤x),
  λ_min(A) = inf_{x≠0} (x^⊤Ax)/(x^⊤x).

Proposition 1.1. An operator norm ∥ • ∥ of matrices satisfies
(1) ∥I∥ = 1;
(2) ∥Au∥ ≤ ∥A∥∥u∥;
(3) ∥AB∥ ≤ ∥A∥∥B∥ (sub-multiplicativity).
As a consequence, ∥ • ∥_F is not an operator norm (for n ≥ 2, since ∥I∥_F = √n ≠ 1).
Exercise: Prove that ∥A∥2 = 1 if A is an orthogonal matrix.
Exercise: For 1 ≤ p ≤ ∞ and 1/p + 1/q = 1, prove that ∥A∥_p = ∥A^⊤∥_q.
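The formulas above (and the last exercise) can be checked numerically in MATLAB, where norm(A,p) returns the induced ℓ^p operator norm for p = 1, 2, ∞; a small illustration:

A = [1 -2 3; -4 5 -6];                  % any 2-by-3 example
[norm(A,1),   max(sum(abs(A),1))]       % largest absolute column sum
[norm(A,Inf), max(sum(abs(A),2))]       % largest absolute row sum
[norm(A,2),   sqrt(max(eig(A'*A)))]     % spectral norm
[norm(A,1),   norm(A',Inf)]             % consistent with ||A||_p = ||A^T||_q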

2. Condition number
The main topic of this chapter is to solve the linear system of equations Ax = b by numerical methods on a computer. However, this process can never be exact due to machine error. The IEEE 754 format of floating point numbers is shown in Figure 1; each digit in Figure 1 is either 0 or 1. Figure 1a uses 32 bits (single precision) to represent the real number
  (−1)^s (1.b_1 b_2 ⋯ b_23)_2 × 2^{(a_1 a_2 ⋯ a_8)_2 − 127}.
Figure 1b uses 64 bits (double precision) to represent the real number
  (−1)^s (1.b_1 b_2 ⋯ b_52)_2 × 2^{(a_1 a_2 ⋯ a_11)_2 − 1023}.
In addition, when a_1 = a_2 = ⋯ = a_8 = 1 (a_1 = a_2 = ⋯ = a_11 = 1 in double precision), the floating point number is interpreted as ∞ or NaN (Not a Number); when a_1 = a_2 = ⋯ = a_8 = 0 (a_1 = a_2 = ⋯ = a_11 = 0 in double precision), it is interpreted as 0 or a subnormal (denormalized) number. Therefore a normalized/standard nonzero floating point number cannot have exponent 11111111_2 (11111111111_2 in double precision) or 00000000_2 (00000000000_2 in double precision).

[Figure 1. IEEE 754 floating point numbers. (a) Single precision: sign bit s, exponent bits a_1 … a_8, fraction bits b_1 … b_23. (b) Double precision: sign bit s, exponent bits a_1 … a_11, fraction bits b_1 … b_52.]

Exercise: Rewrite the decimal number (10.375)_10 as a binary number. Then transform it into an IEEE 754 single-precision floating point number.
For example, 0.4 cannot be exactly stored in a computer. Mathematically speaking, we may actually solve some perturbed equation
  Âx̂ = b̂,   i.e.,   (A + δA)(x + δx) = b + δb,
where δA, δb are very small perturbations. We want to bound the difference δx in terms of δA and δb.
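A minimal MATLAB illustration of this round-off effect (IEEE 754 double precision is assumed here, so the relevant relative machine error is 2^{−53} rather than the single-precision value):

fprintf('%.20f\n', 0.4)   % prints 0.40000000000000002220...: only the nearest double is stored
eps                       % 2^-52, the spacing between 1 and the next larger double;
                          % the relative machine error (unit roundoff) is eps/2 = 2^-53
0.1 + 0.2 == 0.3          % returns false, a familiar consequence of round-off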

2.1. Perturbation Analysis I. We want to bound the relative error ∥δx∥/∥x̂∥ or ∥δx∥/∥x∥ in terms of the relative perturbation ∥δA∥/∥A∥ (and ∥δb∥/∥b∥). Subtracting Ax = b from the perturbed equation gives
  A δx + (δA)x + (δA)(δx) = δb,   i.e.,   δx = A^{−1}(−(δA)x̂ + δb).
As a result,
  ∥δx∥ ≤ ∥A^{−1}∥ (∥δA∥ ∥x̂∥ + ∥δb∥),
  ∥δx∥/∥x̂∥ ≤ ∥A∥ ∥A^{−1}∥ (∥δA∥/∥A∥ + ∥δb∥/(∥A∥ ∥x̂∥))
            ≤ ∥A∥ ∥A^{−1}∥ (∥δA∥/∥A∥ + ∥δb∥/∥b̂∥).
The factor ∥A∥ ∥A^{−1}∥ measures the sensitivity of the solution with respect to small relative perturbations of the coefficient matrix A. It is called the condition number
  κ(A) := ∥A∥ ∥A^{−1}∥.

2.2. Perturbation Analysis II. The original and perturbed equations can be manipulated in a different way:
  (A + δA)δx = −(δA)x + δb,
  δx = (A + δA)^{−1}(−(δA)x + δb) = (I + A^{−1}δA)^{−1}A^{−1}(−(δA)x + δb).
Then the relation between the relative error and the relative perturbations of A, b is as follows:
  ∥δx∥/∥x∥ ≤ ∥(I + A^{−1}δA)^{−1}∥ ∥A^{−1}∥ (∥δA∥ + ∥δb∥/∥x∥)
           ≤ (1/(1 − ∥A^{−1}∥∥δA∥)) ∥A∥ ∥A^{−1}∥ (∥δA∥/∥A∥ + ∥δb∥/(∥x∥∥A∥))
           ≤ (κ(A)/(1 − κ(A)∥δA∥/∥A∥)) (∥δA∥/∥A∥ + ∥δb∥/∥b∥).
Here we used the inequality (∥ • ∥ should be an operator norm)
  ∥(I − B)^{−1}∥ ≤ 1/(1 − ∥B∥),   ∥B∥ < 1.
When κ(A)∥δA∥/∥A∥ ≪ 1 (in particular when ∥δA∥/∥A∥ ≪ 1/κ(A)), we have
  ∥δx∥/∥x∥ ≲ κ(A) (∥δA∥/∥A∥ + ∥δb∥/∥b∥).
Here B_1 ≲ B_2 means B_1 ≤ CB_2 with a positive constant C = O(1) of mild size.
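The bound can be observed in a small experiment. The following MATLAB sketch (an illustration under the stated assumption κ(A)∥δA∥/∥A∥ ≪ 1, using a moderately ill-conditioned test matrix) compares the actual relative error with κ(A)(∥δA∥/∥A∥ + ∥δb∥/∥b∥) in the 2-norm:

n  = 6;
A  = hilb(n);                      % moderately ill-conditioned test matrix
x  = ones(n,1);
b  = A*x;
dA = 1e-10*randn(n);               % small perturbations of A and b
db = 1e-10*randn(n,1);
xh = (A + dA)\(b + db);            % solution of the perturbed system
relerr = norm(x - xh)/norm(x);
bound  = cond(A)*(norm(dA)/norm(A) + norm(db)/norm(b));
fprintf('relative error %.2e <= bound %.2e\n', relerr, bound)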

Exercise: Prove the above inequality
  ∥(I − B)^{−1}∥ ≤ 1/(1 − ∥B∥),   ∥B∥ < 1.

2.3. Perturbation analysis III. The following result says that 1/κ(A) is the smallest relative perturbation of A (under the norm ∥ • ∥_2) that makes the perturbed matrix singular.
Theorem 2.1. We have
  min_{δA} { ∥δA∥_2/∥A∥_2 : A + δA is singular } = 1/(∥A∥_2 ∥A^{−1}∥_2) = 1/κ(A).
Proof. Throughout this proof, ∥ • ∥ = ∥ • ∥_2. We need to show that
  min_{δA} { ∥δA∥ : A + δA is singular } = ∥A^{−1}∥^{−1}.
If ∥δA∥ < ∥A^{−1}∥^{−1}, then ∥A^{−1}δA∥ ≤ ∥A^{−1}∥∥δA∥ < 1, which implies that I + A^{−1}δA is invertible and hence A + δA = A(I + A^{−1}δA) is invertible. Therefore, we have
  min_{δA} { ∥δA∥ : A + δA is singular } ≥ ∥A^{−1}∥^{−1}.
On the other hand, there exists a vector u with ∥u∥ = 1 such that ∥A^{−1}u∥ = ∥A^{−1}∥. Let
  v = A^{−1}u/∥A^{−1}u∥ = A^{−1}u/∥A^{−1}∥
and δA = −uv^⊤/∥A^{−1}∥. Direct calculation shows that
  (A + δA)v = u/∥A^{−1}∥ − u∥v∥^2/∥A^{−1}∥ = 0,
so A + δA is singular, and
  ∥δA∥ = √(λ_max(δA^⊤δA)) = ∥A^{−1}∥^{−1} √(λ_max(vu^⊤uv^⊤)) = ∥A^{−1}∥^{−1}∥u∥∥v∥ = ∥A^{−1}∥^{−1}.
The second part of the proof confirms
  min_{δA} { ∥δA∥ : A + δA is singular } ≤ ∥A^{−1}∥^{−1},
which together with the first part completes the proof. □

2.4. Examples of ill-conditioned matrices. The condition number under the ℓ^2 norm (called the spectral condition number) of a symmetric and positive-definite matrix A can be computed by the formula
  κ(A) = λ_max(A)/λ_min(A).
A simple ill-conditioned matrix is
  A = [ 1 0; 0 ϵ ],   ϵ ≪ 1.
Its condition number is κ(A) = 1/ϵ ≫ 1.

Exercise: Compute the ℓ^1, ℓ^2, ℓ^∞ condition numbers of the following matrix
  A = [ 1 ϵ; 0 1 ].

Another famous ill-conditioned matrix is the Hilbert matrix H = (h_{ij}) ∈ R^{n×n} with
  h_{ij} = 1/(i + j − 1).
The Hilbert matrix is of the following form:
  H = [ 1    1/2  1/3  1/4  ⋯
        1/2  1/3  1/4  1/5  ⋯
        1/3  1/4  1/5  1/6  ⋯
        1/4  1/5  1/6  1/7  ⋯
        ⋮    ⋮    ⋮    ⋮    ⋱ ].
We note that h_{ij} is constant whenever i + j is fixed, e.g.,
  h_12 = h_21,   h_13 = h_22 = h_31,   h_14 = h_23 = h_32 = h_41,   . . .
Therefore, H is a special Hankel matrix.
Exercise: Prove that H is a positive-definite matrix.
It can be proved that the spectral condition number of H satisfies
  κ(H) = O((1 + √2)^{4n}/√n).
Due to this ill-conditioning, the linear system
  Hx = b
is extremely hard to solve accurately!
Theorem 2.2. The relative machine error ϵ is 2^{−24} ≈ 6 × 10^{−8} under single precision and 2^{−53} ≈ 1.1 × 10^{−16} under double precision.
Proof. Assume a single-precision number x = (−1)^s(1 + f) × 2^{e−127} with f = (0.b_1 b_2 ⋯ b_23)_2 ∈ [0, 1). Rounding a real number to the nearest value of this form changes the significand 1 + f by at most half of the last bit, i.e., by at most 2^{−24}, while 1 + f ≥ 1. Hence the relative rounding error is at most 2^{−24}. The double-precision case is analogous with 52 fraction bits, which gives 2^{−53}. □
Therefore, when inputting H and the right-hand side b in double precision, the relative errors ∥δH∥/∥H∥ and ∥δb∥/∥b∥ caused by the inexact representation of real numbers are around 10^{−16}. In this case
  ∥δx∥/∥x∥ ≲ κ(H) (∥δH∥/∥H∥ + ∥δb∥/∥b∥) ≲ κ(H) × 10^{−16}.
(1) When n = 6, κ(H) ≈ 1.5 × 10^7 and ∥δx∥/∥x∥ ≲ 1.5 × 10^{−9}, within acceptable accuracy;
(2) When n = 11, κ(H) ≈ 5.2 × 10^{14} and ∥δx∥/∥x∥ ≲ 5.2 × 10^{−2}, not so good;
(3) When n = 12, κ(H) ≈ 1.6 × 10^{16} and ∥δx∥/∥x∥ ≲ 1.6, which is a pretty pessimistic and unacceptable upper bound!

In MATLAB, hilb is a built-in function to generate Hilbert matrices. Try the following code in the command window of MATLAB.

H = hilb(12);      % generate a Hilbert matrix
x = rand(12,1);    % generate the exact solution
b = H*x;           % generate the right hand side
xhat = H \ b;      % calculate the numerical solution
x - xhat           % compare the exact and numerical solutions

Listing 1. An example of a Hilbert matrix

2.5. Condition numbers of general problems. In numerical analysis, the condition number of a function measures how much the output value of the function can change for a small change in the input argument. This is used to measure how sensitive a function is to changes or errors in the input, and how much error in the output results from an error in the input.
Suppose we need to evaluate
  y = f(x),
where f = (f_1, . . . , f_n) : R^n → R^n is a differentiable function. Recall that
  δf(x) = f(x̂) − f(x) ≈ J(x)(x̂ − x),
where J(x) = (∂f_i/∂x_j)_{1≤i,j≤n} is the Jacobian matrix of f at x. As a result, ∥f(x̂) − f(x)∥ ≈ ∥J(x)∥ ∥x̂ − x∥ in the worst case, and the absolute condition number at x is
  lim_{ϵ→0} sup_{∥δx∥≤ϵ} ∥δf(x)∥/∥δx∥ = ∥J(x)∥.

The relative condition number at x is
  κ = κ(x) = lim_{ϵ→0} sup_{∥δx∥≤ϵ} (∥δf(x)∥/∥f(x)∥)/(∥δx∥/∥x∥) = ∥J(x)∥ ∥x∥/∥f(x)∥.

For example, when y = A^{−1}x we have J(x) = A^{−1} and
  κ = ∥A^{−1}∥ ∥x∥/∥A^{−1}x∥ ≤ ∥A^{−1}∥ ∥A∥ = κ(A).
Therefore, the matrix condition number is an upper bound of the general condition number.
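A quick numerical check of this comparison (2-norm; inv(A) is formed explicitly only because the matrix is tiny):

A = hilb(5);
x = randn(5,1);
kappa_x = norm(inv(A))*norm(x)/norm(A\x);   % ||A^{-1}|| ||x|| / ||A^{-1}x||
kappa_A = cond(A);                          % ||A|| ||A^{-1}||
fprintf('kappa(x) = %.3e <= kappa(A) = %.3e\n', kappa_x, kappa_A)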

3. Direct methods for solving linear systems


In linear algebra, we have already learned Gaussian elimination for solving the linear system of equations Ax = b. This process is essentially equivalent to an LU decomposition of A combined with a reordering of the rows of A.

3.1. LU decomposition. We say P is a permutation matrix if it is obtained by reordering the rows of the identity matrix. Each entry of P is either 0 or 1, and each row and each column of P contains exactly one 1.
Theorem 3.1 (Gauss Elimination with Partial Pivoting, GEPP). Let A be a
nonsingular matrix. There exist a permutation matrix P , a lower triangular
matrix L with unit diagonal entries, and an upper triangular matrix U such
that P A = LU .
Proof. From basic linear algebra, we know that there exist elementary ma-
trices P1 , . . . , PJ (for row swapping or just identity) and lower triangular
L1 , . . . , LJ for adding some multiple of one row with smaller index to an-
other row with bigger index, such that
LJ PJ · · · L2 P2 L1 P1 A = U
where U is a nonsingular upper triangular matrix. Let
L̃j = PJ · · · Pj+1 Lj Pj+1 · · · PJ .
We can rewrite the equation as
L̃J · · · L̃1 PJ · · · P1 A = U.
Setting P = P_J ⋯ P_1 and
  L = (L̃_J ⋯ L̃_1)^{−1} = L̃_1^{−1} ⋯ L̃_J^{−1}
completes the proof. □
The following theorem is a column analogue of Theorem 3.1; its proof is similar and omitted.
Theorem 3.2. Let A be a nonsingular matrix. There exist a permutation
matrix Q, a lower triangular matrix L with unit diagonal entries, and an
upper triangular matrix U such that AQ = LU .
Once the LU decomposition P A = LU is available, one can solve Ax = b
in the following way:
(1) Solve Ly = P b for y;
(2) Solve U x = y for x.
Here the cost of solving the n × n triangular linear system is n2 .
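In MATLAB, this two-step solve can be written as follows; here [L,U,P] = lu(A) returns the GEPP factorization with PA = LU, the matrix A is the one from the example below, and b is an arbitrary right-hand side chosen for illustration:

A = [0 5 22/3; 4 2 1; 2 7 9];
b = [1; 2; 3];
[L, U, P] = lu(A);
y = L\(P*b);        % step (1): forward substitution for L y = P b
x = U\y;            % step (2): backward substitution for U x = y
norm(A*x - b)       % residual at the level of machine precision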
Example: Consider the following matrix
  A = [ 0 5 22/3; 4 2 1; 2 7 9 ].
Swapping the 1st and 2nd rows and then adding −1/2 of the new 1st row to the new 3rd row yields (4 is the pivot, the entry with largest absolute value in the 1st column)
  A^(0) = [ 4 2 1; 0 5 22/3; 2 7 9 ],   A^(1) = [ 4 2 1; 0 5 22/3; 0 6 8.5 ].

These two operations correspond to
  P_1 = [ 0 1 0; 1 0 0; 0 0 1 ],   L_1 = [ 1 0 0; 0 1 0; −1/2 0 1 ].

Swapping the 2nd and 3rd rows of A^(1) and then adding −5/6 of the new 2nd row to the new 3rd row yields (6 is the pivot in A^(1), the entry with largest absolute value in the 1st column of the lower right 2 × 2 block)
  A^(1) = [ 4 2 1; 0 6 8.5; 0 5 22/3 ],   A^(2) = U = [ 4 2 1; 0 6 8.5; 0 0 0.25 ].

These two operations correspond to
  P_2 = [ 1 0 0; 0 0 1; 0 1 0 ],   L_2 = [ 1 0 0; 0 1 0; 0 −5/6 1 ].
Therefore, L̃_2 = L_2,
  P = P_2P_1 = [ 0 1 0; 0 0 1; 1 0 0 ],   L̃_1 = P_2L_1P_2 = [ 1 0 0; −1/2 1 0; 0 0 1 ].

Easy calculation shows that
  L = L̃_1^{−1} L̃_2^{−1} = [ 1 0 0; 1/2 1 0; 0 0 1 ] [ 1 0 0; 0 1 0; 0 5/6 1 ] = [ 1 0 0; 1/2 1 0; 0 5/6 1 ].

An algorithmic description of the above procedure is as follows:
  P = [ 1 0 0; 0 1 0; 0 0 1 ] →(R1↔R2) [ 0 1 0; 1 0 0; 0 0 1 ] →(R2↔R3) [ 0 1 0; 0 0 1; 1 0 0 ],
  L = [ 1 0 0; 0 1 0; 0 0 1 ] → [ 1 0 0; 0 1 0; 1/2 0 1 ] → [ 1 0 0; 1/2 1 0; 0 0 1 ] → [ 1 0 0; 1/2 1 0; 0 5/6 1 ].

The pseudo-code of GEPP is given in Algorithm 1 below. The total number of floating point operations of GEPP is
  ∑_{i=1}^{n−1} ( ∑_{j=i+1}^{n} 1 + ∑_{j=i+1}^{n} ∑_{k=i+1}^{n} 2 ) = (2/3) n^3 + O(n^2).

Algorithm 1 GEPP
for i = 1 : n − 1 do
permute rows so that |aii | is the largest in |A(i : n, i)|;
permute L(i : n, 1 : i − 1) accordingly;
for j = i + 1 : n do
lji = aji /aii ;
end for
for j = i : n do
uij = aij ;
end for
for j = i + 1 : n do
for k = i + 1 : n do
ajk = ajk − lji uik ;
end for
end for
end for
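The following is a straightforward (unoptimized) MATLAB transcription of Algorithm 1, returning a permutation vector p with A(p, :) = LU; it is only an illustrative sketch, not production code:

function [L,U,p] = gepp(A)
% Gaussian elimination with partial pivoting: A(p,:) = L*U.
n = size(A,1);
p = (1:n)';
L = eye(n);
for i = 1:n-1
    [~,m] = max(abs(A(i:n,i)));  m = m + i - 1;           % pivot row
    A([i m],:)     = A([m i],:);                          % swap rows of A
    L([i m],1:i-1) = L([m i],1:i-1);                      % swap computed part of L
    p([i m])       = p([m i]);                            % record the permutation
    L(i+1:n,i)     = A(i+1:n,i)/A(i,i);                   % multipliers
    A(i+1:n,i:n)   = A(i+1:n,i:n) - L(i+1:n,i)*A(i,i:n);  % update the active block
end
U = triu(A);
end

Saving this as gepp.m and calling [L,U,p] = gepp([0 5 22/3; 4 2 1; 2 7 9]) reproduces the factors L and U of the worked example above, with A(p,:) − L*U at the level of round-off.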

3.1.1. Effect of pivoting. To show the importance of pivoting, we consider the linear system Ax = b, where
  A = [ −ϵ 1; 1 1 ],   b = [ 1; 1 ],   0 < ϵ ≪ 1.
The exact solution is x = (0, 1)^⊤. Without machine round-off error, a standard Gaussian elimination (without pivoting) implies
  A = LU = [ 1 0; −1/ϵ 1 ] [ −ϵ 1; 0 1 + 1/ϵ ].
Let fl be the round-off process in a computer under single-precision floating point arithmetic, and let ϵ = 2^{−24} ≈ 6 × 10^{−8} be the relative machine precision. Then
  L̂ = [ 1 0; fl(−1/ϵ) 1 ] = [ 1 0; −1/ϵ 1 ],
  Û = [ fl(−ϵ) 1; 0 fl(1 + 1/ϵ) ] = [ −ϵ 1; 0 1/ϵ ],
where fl(1 + 1/ϵ) = 1/ϵ. As a result,
  L̂Û = Â = [ −ϵ 1; 1 0 ] ≠ A = [ −ϵ 1; 1 1 ].
Using this decomposition, we obtain the numerical solution x̂ = (1, 1 + ϵ)^⊤. The relative error is unacceptable:
  ∥x − x̂∥_2/∥x∥_2 = √(1 + ϵ^2) > 100%.
In contrast, GEPP yields
  PA = LU,   P = [ 0 1; 1 0 ],   L = [ 1 0; −ϵ 1 ],   U = [ 1 1; 0 1 + ϵ ],
and the rounding process leads to
  L̂ = [ 1 0; fl(−ϵ) 1 ] = [ 1 0; −ϵ 1 ],   Û = [ 1 1; 0 fl(1 + ϵ) ] = [ 1 1; 0 1 ],
  Â = L̂Û = [ 1 1; −ϵ 1 − ϵ ] ≈ PA = [ 1 1; −ϵ 1 ].
The numerical solution is x̂ = (−ϵ, 1 + ϵ)^⊤, which is very accurate:
  ∥x − x̂∥_2/∥x∥_2 = √2 ϵ ≪ 1.
3.1.2. Characterization of LU factorization. In the end, we give a theorem
characterizing matrices that can be factorized into LU without permutation.
Theorem 3.3. Let A be a square matrix. Then A = LU for some unit
lower triangular matrix L and nonsingular upper triangular matrix U if and
only if all leading principal minors of A are nonzero.
Proof. We only prove the direction ⇐= by induction; the converse direction is easier. Assume the statement is true for all (n − 1) × (n − 1) matrices. When A is n × n and all leading principal minors of A are nonzero, we partition it as
  A = [ A_11 α; β^⊤ a_nn ].
By the induction hypothesis, we have A_11 = L_11 U_11 for some unit lower triangular matrix L_11 and nonsingular upper triangular U_11. We seek a factorization of the form
  A = [ A_11 α; β^⊤ a_nn ] = [ L_11 0; l^⊤ 1 ] [ U_11 u; 0 u_nn ].
The equation is true whenever
  u = L_11^{−1} α,   l = U_11^{−⊤} β,   u_nn = a_nn − l^⊤ u.
Here u_nn = a_nn − β^⊤ A_11^{−1} α ≠ 0 because det(A) = det(A_11) u_nn ≠ 0. □
3.1.3. Matrix inverse and determinant. Computing A^{−1} is equivalent to solving the following n linear systems
  Ax_1 = e_1, . . . , Ax_n = e_n.
Given the factorization PA = LU, the cost of solving each system, e.g., Ax_1 = e_1, is about 2n^2 (two triangular solves). Therefore, the total computational cost of forming A^{−1} is about 2n^3. The determinant also follows from the factorization: since det(P) = ±1 and det(L) = 1, we have det(A) = ±u_11 u_22 ⋯ u_nn, with the sign determined by the parity of the permutation P.
3.1.4. Gaussian Elimination with Complete Pivoting. In a few cases, even GEPP is not numerically stable, and it is better to permute both rows and columns of the active submatrix to select a pivot. This strategy is called Gaussian Elimination with Complete Pivoting (GECP), with pseudo-code given in Algorithm 2.
GECP permutes both rows and columns of a matrix A so that the absolute value of the pivot used in Gaussian elimination is the largest among the entries of the active block. The output is PAQ = LU, where P and Q are permutation matrices.

Algorithm 2 GECP
for i = 1 : n − 1 do
permute rows and cols so that |aii | is the largest in |A(i : n, i : n)|;
permute L(i : n, 1 : i − 1) accordingly;
for j = i + 1 : n do
lji = aji /aii ;
end for
for j = i : n do
uij = aij ;
end for
for j = i + 1 : n do
for k = i + 1 : n do
ajk = ajk − lji uik ;
end for
end for
end for

The drawback of GECP is that each step of the for-loop requires an O(n^2) search for the pivot, so the total cost of complete pivoting is O(n^3), already of the same order as the number of floating point operations in Gaussian elimination.

3.2. Positive-definite systems. For symmetric and positive-definite (SPD) matrices, there is a special decomposition method that is cheaper than the LU decomposition.
Theorem 3.4 (Cholesky decomposition). Let A be a symmetric and positive-
definite matrix. There exists a lower triangular matrix L with positive diag-
onal entries such that A = LL⊤ .
Proof. We prove it by induction. Assume the statement is true for (n − 1) × (n − 1) SPD matrices. When A is n × n, we partition it as
  A = [ A_11 α; α^⊤ a_nn ].
By the induction hypothesis, we have A_11 = L_11 L_11^⊤ for some lower triangular matrix L_11 with positive diagonal entries. We seek a factorization of the form
  A = [ A_11 α; α^⊤ a_nn ] = [ L_11 0; β^⊤ l_nn ] [ L_11^⊤ β; 0 l_nn ].
The equation is true whenever β = L_11^{−1} α and l_nn = (a_nn − β^⊤ β)^{1/2}. Here a_nn − β^⊤ β = a_nn − α^⊤ A_11^{−1} α > 0 because det(A_11) > 0 and det(A) = det(A_11)(a_nn − α^⊤ A_11^{−1} α) > 0. □

Algorithm 3 Cholesky decomposition
for j = 1 : n do
    l_jj = (a_jj − ∑_{k=1}^{j−1} l_jk^2)^{1/2};
    for i = j + 1 : n do
        l_ij = (a_ij − ∑_{k=1}^{j−1} l_ik l_jk)/l_jj;
    end for
end for
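The following is a direct MATLAB transcription of Algorithm 3 (an illustrative sketch assuming A is SPD; in practice one calls the built-in chol):

function L = cholesky_lower(A)
% Cholesky factor of an SPD matrix A, computed column by column: A = L*L'.
n = size(A,1);
L = zeros(n);
for j = 1:n
    L(j,j) = sqrt(A(j,j) - L(j,1:j-1)*L(j,1:j-1)');
    for i = j+1:n
        L(i,j) = (A(i,j) - L(i,1:j-1)*L(j,1:j-1)')/L(j,j);
    end
end
end

For example, with A = hilb(5) the output L agrees with chol(A,'lower') and A − L*L' is at the level of round-off.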

The algorithmic procedure is illustrated in Figure 2. The count of floating point operations of the Cholesky decomposition is
  ∑_{j=1}^{n} ( 2j + ∑_{i=j+1}^{n} 2j ) = (1/3) n^3 + O(n^2).

[Figure 2. Cholesky decomposition algorithm: the factor L is filled in column by column (Step 1, Step 2, Step 3, . . .).]

Corollary 3.1 (LDL decomposition). Let A be a symmetric and positive-definite matrix. There exist a unit lower triangular matrix L and a positive diagonal matrix D such that A = LDL^⊤.
To analyze the numerical error of the Cholesky decomposition, we need to take a close look at the round-off error of floating point operations. It is true that
  fl(a ⊚ b) = (a ⊚ b)(1 + δ),   |δ| ≤ ϵ,
where ⊚ is one of +, −, ∗, /, and ϵ is the relative machine precision. It is also easy to prove that
  fl(∑_{i=1}^{d} a_i b_i) = ∑_{i=1}^{d} a_i b_i (1 + δ_i),   |δ_i| ≤ dϵ + O(ϵ^2).

We say a numerical method for solving Ax = b is forward stable if
  ∥x − x̂∥/∥x∥ ≲ ϵ.

We say a numerical method is backward stable if the numerical solution x̂ is the exact solution of the modified problem (A + δA)x̂ = b + δb, where
  ∥δA∥/∥A∥ + ∥δb∥/∥b∥ ≲ ϵ.
It turns out that Cholesky decomposition without pivoting is already nu-
merically stable.
Theorem 3.5. Solving a SPD system Ax = b by Cholesky decomposition
is backward stable. In particular, the Cholesky numerical solution x̂ is the
exact solution of the perturbed problem
(A + δA)x̂ = b,
where the relative perturbation is small: ∥δA∥∞ /∥A∥∞ ≲ ϵ.
Proof. For i > j we have
  l̂_ij = (1 + δ′)(1 + δ″) ( a_ij − ∑_{k=1}^{j−1} (1 + δ_k) l̂_ik l̂_jk ) / l̂_jj.
Reformulating the above equation leads to
  a_ij = (1/((1 + δ′)(1 + δ″))) l̂_ij l̂_jj + ∑_{k=1}^{j−1} (1 + δ_k) l̂_ik l̂_jk
       = ∑_{k=1}^{j} l̂_ik l̂_jk + ∑_{k=1}^{j} l̂_ik l̂_jk δ_k,   δ_j := 1/((1 + δ′)(1 + δ″)) − 1.
Here |δ_1|, . . . , |δ_{j−1}| ≤ (j − 1)ϵ + O(ϵ^2) and |δ_j| ≤ 2ϵ + O(ϵ^2). Meanwhile, for each j recall that
  l̂_jj = (1 + δ′)(1 + δ)^{1/2} ( a_jj − ∑_{k=1}^{j−1} l̂_jk^2 (1 + δ_k) )^{1/2},   |δ_k| ≤ jϵ + O(ϵ^2),  |δ|, |δ′| ≤ ϵ.
Rewriting it leads to
  a_jj = (1/((1 + δ′)^2(1 + δ))) l̂_jj^2 + ∑_{k=1}^{j−1} l̂_jk^2 (1 + δ_k).
Because the δ_k are very small, we further have
  ∑_{k=1}^{j} l̂_jk^2 ≲ a_jj.
In matrix notation, we have
  A = L̂L̂^⊤ + E,
where the perturbation matrix is bounded as
  |E_ij| ≤ nϵ ∑_k |l̂_ik| |l̂_jk| ≤ nϵ ( ∑_k l̂_ik^2 )^{1/2} ( ∑_k l̂_jk^2 )^{1/2} ≲ nϵ √(a_ii) √(a_jj).
As a result, ∥E∥_∞ ≲ n^2 ϵ ∥A∥_∞, and the numerical solution of Ax = b is the exact solution of the perturbed system
  (A − E)x̂ = b.
The proof is complete. □
Combining the backward stability with the perturbation property of Ax = b, we deduce that the relative error is about
  ∥x − x̂∥_∞/∥x∥_∞ ≲ κ(A) ∥δA∥_∞/∥A∥_∞ ≲ n^2 ϵ κ(A).
3.3. A posteriori error estimation. The aforementioned backward stability analysis and perturbation analysis are a priori error analyses. In practice, these error bounds are often pessimistic (the multiplicative constant is not sharp at all). On the other hand, the residual r(x̂) = b − Ax̂ is a very useful quantity for measuring the numerical error. For example,
  ∥x − x̂∥ = ∥A^{−1}r(x̂)∥ ≤ ∥A^{−1}∥ ∥r(x̂)∥.
Once an efficient upper bound ∥A^{−1}∥ ≤ C(A) is available, we obtain the a posteriori error estimate
  ∥x − x̂∥ ≤ C(A) ∥r(x̂)∥.
The term “a posteriori” means that this bound depends on the computed numerical solution. When ∥ • ∥ = ∥ • ∥_1, ∥A^{−1}∥_1 is obtained by solving the ℓ^1-constrained convex maximization problem
  ∥A^{−1}∥_1 = max_{∥y∥_1=1} ∥A^{−1}y∥_1,
which can be solved approximately by gradient ascent.
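MATLAB's condest estimates the ℓ^1 condition number ∥A∥_1∥A^{−1}∥_1 by (essentially) this kind of maximization without forming A^{−1}. A sketch of the resulting a posteriori check follows; note that it uses an estimate of ∥A^{−1}∥_1, so the reported bound is itself an estimate rather than a guaranteed bound:

A  = hilb(8);  xs = ones(8,1);  b = A*xs;
xh = A\b;                          % computed solution
r  = b - A*xh;                     % residual
C  = condest(A)/norm(A,1);         % estimate of ||A^{-1}||_1
fprintf('a posteriori estimate %.2e, true error %.2e\n', C*norm(r,1), norm(xs - xh,1))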


Exercise: Given a matrix B, compute the gradient of f (x) = ∥Bx∥1 .

3.4. Sparse linear systems. When A ∈ R^{n×n} is a sparse matrix, it is desirable to factorize it as PAQ = LU (or A = LL^⊤), where L and U are also as sparse as possible. We say a matrix is sparse if most entries of A are zeros and the number of nonzeros in A is O(n). We often pursue algorithms for sparse linear systems Ax = b with computational cost O(n) or O(n log n).
For sparse matrices, Gaussian elimination (as well as Cholesky decomposition) must be designed very carefully to preserve the sparsity structure. In fact, a careless elimination process would create more and more nonzeros (known as fill-in) and eventually lead to dense factors L and U. In this case, solving Ax = b costs at least O(n^2) operations; see Figures 3-5.

[Figure 3. Sparse pattern of a 49 × 49 SPD matrix A; nz = 217.]

[Figure 4. Sparse pattern of the standard Cholesky factor L of A (without permutation, A = LL^⊤); nz = 397.]

It turns out that the number of fill-ins created during the Gaussian elim-
ination (as well as Cholesky decomposition) heavily depends on the order of
unknowns, i.e., appropriate permutation of rows and columns of the sparse
matrix, see Figures 4 and 5 for example. A popular technique for determin-
ing the order (also used in MATLAB) is called the Approximate Minimum
Degree (AMD) algorithm [1]. Once such an order represented by the per-
mutation matrix P is available, one can solve P AP ⊤ y = P b (with x = P ⊤ y)
by Cholesky decomposition. Sophisticated high-performance algorithms for
solving sparse problems are coded in the package UMFPACK (Unsymmetric
Multi-frontal Package).
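The effect of reordering can be reproduced in MATLAB. The sketch below uses the standard 5-point Laplacian test matrix delsq(numgrid('S',9)), a 49 × 49 sparse SPD matrix of the same kind as the one in Figure 3 (this particular choice is our assumption, not taken from the text), and compares the fill-in of the Cholesky factor with and without an AMD reordering:

A  = delsq(numgrid('S',9));     % 49-by-49 sparse SPD matrix
L1 = chol(A,'lower');           % Cholesky factor without reordering
p  = amd(A);                    % approximate minimum degree ordering
L2 = chol(A(p,p),'lower');      % Cholesky factor after permutation
fprintf('nnz(A) = %d, nnz(L) = %d, nnz(L permuted) = %d\n', nnz(A), nnz(L1), nnz(L2))
spy(L2)                         % visualize the sparsity pattern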

[Figure 5. Sparse pattern of the Cholesky factor L of A (with permutation P such that P^⊤AP = LL^⊤); nz = 255.]

As an example, consider the 9 × 9 sparse and symmetric matrix
  A_9 = [  4   0   0  −1   0  −1   0  −1  −1
           0   4   0  −1   0  −1   0   0   0
           0   0   4   0   0   0   0  −1  −1
          −1  −1   0   4  −1   0   0   0   0
           0   0   0  −1   4   0   0  −1   0
          −1  −1   0   0   0   4  −1   0   0
           0   0   0   0   0  −1   4   0  −1
          −1   0  −1   0  −1   0   0   4   0
          −1   0  −1   0   0   0  −1   0   4 ].
Its adjacency graph is given in Figure 6.

[Figure 6. Adjacency graph G(A_9) of A_9, with vertices 1, . . . , 9.]

Each vertex of G(A_9) corresponds to one row of A_9. Vertices v_i and v_j in G(A_9) are connected by an edge if and only if the entry a_ij of A_9 is nonzero.
Eliminating one row and column of A_9 (say the i-th) corresponds to eliminating one vertex (v_i) of G(A_9) and yields a smaller graph G_8. In G_8, the previous neighbors of v_i in G(A_9) are pairwise connected by edges. The Minimum Degree algorithm chooses to eliminate a vertex of G(A_9) having the smallest degree, obtains the smaller graph G_8, and recursively applies this minimum-degree procedure to G_8, and so on. Such an order of vertices corresponds to a permutation matrix P. In practice, it is often the case that the Cholesky factor of P^⊤AP is quite sparse.
3.4.1. MATLAB backslash. In MATLAB, the ‘\’ command for solving x = A\b invokes an algorithm which depends upon the structure of the matrix A and includes (low-overhead) checks on properties of A. In particular, the MATLAB backslash \ works as follows.
(1) If A is an upper or lower triangular matrix, employ a backward or forward substitution algorithm.
(2) If A is symmetric and has real positive diagonal elements, attempt a Cholesky factorization. If A is sparse, employ a reordering first to minimize fill-in.
(3) If A is banded, employ a banded solver.
(4) If none of the criteria above is fulfilled, do a general triangular factorization using Gaussian elimination with partial pivoting.
(5) If A is sparse, then employ the UMFPACK library.
(6) If A is not square, employ algorithms based on QR factorization for under- or overdetermined systems.
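As a rule of thumb, one should call the backslash solver rather than forming inv(A) explicitly; a small timing and accuracy comparison (illustrative only, the timings depend on the machine):

n = 2000;
A = randn(n) + n*eye(n);            % a well-conditioned dense test matrix
b = randn(n,1);
tic; x1 = A\b;      t1 = toc;       % factorization-based solve
tic; x2 = inv(A)*b; t2 = toc;       % explicit inverse: slower and less accurate
fprintf('backslash %.3fs, inv %.3fs, relative difference %.1e\n', t1, t2, norm(x1-x2)/norm(x1))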

References
[1] Patrick R. Amestoy, Timothy A. Davis, and Iain S. Duff. An approximate minimum
degree ordering algorithm. SIAM J. Matrix Anal. Appl., 17(4):886–905, 1996.
[2] James W. Demmel. Applied numerical linear algebra. Society for Industrial and Applied
Mathematics (SIAM), Philadelphia, PA, 1997.
[3] Gene H. Golub and Charles F. Van Loan. Matrix computations. Johns Hopkins Studies
in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, fourth
edition, 2013.
[4] Roger A. Horn and Charles R. Johnson. Matrix analysis. Cambridge University Press,
Cambridge, second edition, 2013.
