Direct Methods
YUWEN LI
$$= \|v\|_p \Big( \sum_i |u_i + v_i|^p \Big)^{1/q} = \|v\|_p \, \|u + v\|_p^{p/q}.$$
Combining previous results completes the proof. □
The normed space $(\mathbb{R}^n, \|\cdot\|_p)$ is denoted by $\ell^p$. A special example of the $\ell^p$ norm is the $\ell^\infty$ norm
$$\|u\|_\infty = \max_{1 \le i \le n} |u_i|.$$
Given vector norms, each matrix $A \in \mathbb{R}^{m \times n}$ has an associated matrix norm, which is also called the operator norm or the induced norm. In particular,
$$\|A\|_p := \sup_{0 \ne u \in \mathbb{R}^n} \frac{\|Au\|_p}{\|u\|_p} = \sup_{u \in \mathbb{R}^n,\, \|u\|_p = 1} \|Au\|_p$$
is a bounded number.
Theorem 1.3. The operator norm is well-defined, namely, there exists a constant $C$ depending on $A$ such that
$$\sup_{u \in \mathbb{R}^n,\, \|u\|_{\mathbb{R}^n} = 1} \|Au\|_{\mathbb{R}^m} \le C.$$
Examples: For a matrix $A$, $\|A\|_2$ is called the $\ell^2$ operator norm or the spectral norm of $A$. For $p = 1, \infty, 2$,
$$\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^m |a_{ij}|, \qquad
\|A\|_\infty = \max_{1 \le i \le m} \sum_{j=1}^n |a_{ij}|, \qquad
\|A\|_2 = \sqrt{\lambda_{\max}(A^\top A)}.$$
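These identities are easy to check numerically. The snippet below (an added sketch, assuming NumPy; not part of the original notes) compares each formula against numpy.linalg.norm:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))    # a random m x n matrix, m = 4, n = 3

norm1 = np.abs(A).sum(axis=0).max()                  # max absolute column sum
norm_inf = np.abs(A).sum(axis=1).max()               # max absolute row sum
norm2 = np.sqrt(np.linalg.eigvalsh(A.T @ A).max())   # sqrt(lambda_max(A^T A))

assert np.isclose(norm1, np.linalg.norm(A, 1))
assert np.isclose(norm_inf, np.linalg.norm(A, np.inf))
assert np.isclose(norm2, np.linalg.norm(A, 2))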
The last equality can be proved using the following important theorem.
Theorem 1.4. For a symmetric and positive semi-definite matrix $A$, we have
$$\lambda_{\max}(A) = \sup_{x \ne 0} \frac{x^\top A x}{x^\top x}, \qquad
\lambda_{\min}(A) = \inf_{x \ne 0} \frac{x^\top A x}{x^\top x}.$$
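As an added illustration of Theorem 1.4 (my sketch, not from the notes), sampling random vectors confirms that the Rayleigh quotient always lies between $\lambda_{\min}(A)$ and $\lambda_{\max}(A)$:

import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
A = B.T @ B                        # symmetric positive semi-definite
lam = np.linalg.eigvalsh(A)        # eigenvalues in ascending order

for _ in range(1000):
    x = rng.standard_normal(5)
    r = x @ A @ x / (x @ x)        # Rayleigh quotient
    assert lam[0] - 1e-12 <= r <= lam[-1] + 1e-12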
2. Condition number
The main topic of this chapter is to solve the linear system of equations $Ax = b$ by numerical methods on a computer. However, this process can never be exact due to machine error. The IEEE 754 format of floating-point numbers is given in Figure 1. Each digit in Figure 1 is either 0 or 1. Figure 1a uses 32-bit single precision to represent a real number of the form $(-1)^s (1.f)_2 \times 2^{e - 127}$, where $s$ is the sign bit, $e$ the 8-bit exponent field, and $f$ the 23-bit fraction. As a consequence, the computed solution $\hat{x}$ of $Ax = b$ actually solves a perturbed system
$$\hat{A}\hat{x} = \hat{b}, \quad \text{i.e.,} \quad (A + \delta A)(x + \delta x) = b + \delta b,$$
where $\delta A$, $\delta b$ are very small perturbations. We want to bound the difference $\delta x$ in terms of $\delta A$ and $\delta b$.
2.2. Perturbation analysis II. The original and perturbed equations are manipulated in a different way. Subtracting $Ax = b$ from the perturbed equation yields
$$(A + \delta A)\,\delta x = -(\delta A)x + \delta b,$$
and hence
$$\delta x = (A + \delta A)^{-1}\big({-(\delta A)x} + \delta b\big) = (I + A^{-1}\delta A)^{-1} A^{-1}\big({-(\delta A)x} + \delta b\big).$$
Then the relation between the relative error and the relative perturbations of $A$, $b$ is as follows:
$$\begin{aligned}
\frac{\|\delta x\|}{\|x\|} &\le \|(I + A^{-1}\delta A)^{-1}\| \, \|A^{-1}\| \Big( \|\delta A\| + \frac{\|\delta b\|}{\|x\|} \Big) \\
&\le \frac{1}{1 - \|A^{-1}\|\|\delta A\|} \, \|A\| \, \|A^{-1}\| \Big( \frac{\|\delta A\|}{\|A\|} + \frac{\|\delta b\|}{\|x\|\,\|A\|} \Big) \\
&\le \frac{\kappa(A)}{1 - \kappa(A)\|\delta A\|/\|A\|} \Big( \frac{\|\delta A\|}{\|A\|} + \frac{\|\delta b\|}{\|b\|} \Big).
\end{aligned}$$
Here we used the inequality ($\|\cdot\|$ should be an operator norm)
$$\|(I - B)^{-1}\| \le \frac{1}{1 - \|B\|}, \qquad \|B\| < 1.$$
When $\kappa(A)\|\delta A\|/\|A\| \ll 1$ (in particular $\|\delta A\|/\|A\| \ll \frac{1}{\kappa(A)}$), we have
$$\frac{\|\delta x\|}{\|x\|} \lesssim \kappa(A) \Big( \frac{\|\delta A\|}{\|A\|} + \frac{\|\delta b\|}{\|b\|} \Big).$$
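This estimate can be checked in a small experiment. The sketch below (an addition, using NumPy, the 2-norm, and the Hilbert matrix as a standard ill-conditioned example) perturbs a system and compares the actual relative error with the first-order bound:

import numpy as np

rng = np.random.default_rng(2)
n = 8
# Hilbert matrix: a classical ill-conditioned test matrix
A = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1)
x = rng.standard_normal(n)
b = A @ x

dA = 1e-12 * rng.standard_normal((n, n))   # tiny perturbations
db = 1e-12 * rng.standard_normal(n)
x_pert = np.linalg.solve(A + dA, b + db)

rel_err = np.linalg.norm(x_pert - x) / np.linalg.norm(x)
kappa = np.linalg.cond(A)                  # 2-norm condition number
bound = kappa * (np.linalg.norm(dA, 2) / np.linalg.norm(A, 2)
                 + np.linalg.norm(db) / np.linalg.norm(b))
print(f"relative error {rel_err:.2e}, first-order bound {bound:.2e}")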
2.3. Perturbation analysis III. The following result says that $1/\kappa(A)$ is the smallest relative perturbation of $A$ (under the norm $\|\cdot\|_2$) such that the perturbed matrix is singular.
Theorem 2.1. We have
$$\min_{\delta A} \Big\{ \frac{\|\delta A\|_2}{\|A\|_2} : A + \delta A \text{ is singular} \Big\} = \frac{1}{\|A\|_2 \|A^{-1}\|_2} = \frac{1}{\kappa(A)}.$$
Proof. Throughout this proof, $\|\cdot\| = \|\cdot\|_2$. We need to show that
$$\min_{\delta A} \big\{ \|\delta A\| : A + \delta A \text{ is singular} \big\} = \|A^{-1}\|^{-1}.$$
If $\|\delta A\| < \|A^{-1}\|^{-1}$, then $\|A^{-1}\delta A\| \le \|A^{-1}\|\|\delta A\| < 1$, which implies that $I + A^{-1}\delta A$ is invertible and thus $A + \delta A = A(I + A^{-1}\delta A)$ is invertible. Therefore, we have
$$\min_{\delta A} \big\{ \|\delta A\| : A + \delta A \text{ is singular} \big\} \ge \|A^{-1}\|^{-1}.$$
On the other hand, there exists a vector $u$ with $\|u\| = 1$ such that $\|A^{-1}u\| = \|A^{-1}\|$. Let
$$v = A^{-1}u / \|A^{-1}u\| = A^{-1}u / \|A^{-1}\|$$
and $\delta A = -uv^\top / \|A^{-1}\|$. Direct calculation shows that
$$(A + \delta A)v = u/\|A^{-1}\| - u\|v\|^2/\|A^{-1}\| = 0,$$
so $A + \delta A$ is singular, and
$$\|\delta A\| = \sqrt{\lambda_{\max}(\delta A^\top \delta A)} = \|A^{-1}\|^{-1} \sqrt{\lambda_{\max}(v u^\top u v^\top)} = \|A^{-1}\|^{-1} \|u\| \|v\| = \|A^{-1}\|^{-1}.$$
The second part of the proof shows that the minimum is at most $\|A^{-1}\|^{-1}$; combined with the first part, this confirms
$$\min_{\delta A} \big\{ \|\delta A\| : A + \delta A \text{ is singular} \big\} = \|A^{-1}\|^{-1}. \qquad \Box$$
2.4. Examples of ill-conditioned matrices. The condition number under the $\ell^2$-norm (called the spectral condition number) of a symmetric and positive-definite matrix $A$ can be computed by the formula
$$\kappa(A) = \frac{\lambda_{\max}(A)}{\lambda_{\min}(A)}.$$
A simple ill-conditioned matrix is
$$A = \begin{pmatrix} 1 & 0 \\ 0 & \epsilon \end{pmatrix}, \qquad \epsilon \ll 1.$$
Its condition number is $\kappa(A) = 1/\epsilon \gg 1$.
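For instance, with $\epsilon = 10^{-10}$ the formula can be confirmed numerically (an added sketch, assuming NumPy):

import numpy as np

eps = 1e-10
A = np.diag([1.0, eps])            # the SPD example above
lam = np.linalg.eigvalsh(A)        # eigenvalues in ascending order
print(lam[-1] / lam[0])            # lambda_max / lambda_min = 1e10
print(np.linalg.cond(A, 2))        # numpy's spectral condition number agrees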
Swapping the 2nd and 3rd rows of $A^{(1)}$ and then adding $-\frac{5}{6}$ of the new 2nd row to the new 3rd row yields ($6$ is the pivot in $A^{(1)}$, the entry with largest absolute value in the 1st column of the lower right $2 \times 2$ block)
$$A^{(1)} = \begin{pmatrix} 4 & 2 & 1 \\ 0 & 6 & 8.5 \\ 0 & 5 & 22/3 \end{pmatrix}, \qquad
A^{(2)} = U = \begin{pmatrix} 4 & 2 & 1 \\ 0 & 6 & 8.5 \\ 0 & 0 & 0.25 \end{pmatrix}.$$
Therefore, $\tilde{L}_2 = L_2$,
$$P = P_2 P_1 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \qquad
\tilde{L}_1 = P_2 L_1 P_2 = \begin{pmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Counting operations, the total number of floating point operations in Gaussian elimination is
$$\sum_{i=1}^{n-1} \Big( \sum_{j=i+1}^{n} 1 + \sum_{j=i+1}^{n} \sum_{k=i+1}^{n} 2 \Big) = \frac{2}{3} n^3 + O(n^2).$$
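The closed form can be double-checked by brute-force summation (an added Python sketch, not part of the notes):

# Count the flops of Gaussian elimination step by step and compare
# with the leading term (2/3) n^3; the gap is O(n^2).
n = 200
flops = sum((n - i) + 2 * (n - i) ** 2 for i in range(1, n))
print(flops, 2 * n**3 / 3)   # 5313300 vs 5333333.3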
Algorithm 1 GEPP
for i = 1 : n − 1 do
  permute rows so that |a_ii| is the largest in |A(i:n, i)|;
  permute L(i:n, 1:i−1) accordingly;
  for j = i + 1 : n do
    l_ji = a_ji / a_ii;
  end for
  for j = i : n do
    u_ij = a_ij;
  end for
  for j = i + 1 : n do
    for k = i + 1 : n do
      a_jk = a_jk − l_ji u_ik;
    end for
  end for
end for
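For concreteness, here is a runnable counterpart of Algorithm 1 in Python/NumPy (my own sketch, not code from the notes; Algorithm 2 would differ only in searching the whole trailing block for the pivot and permuting columns as well):

import numpy as np

def gepp(A):
    """LU factorization with partial pivoting: returns P, L, U with PA = LU."""
    A = A.astype(float).copy()
    n = A.shape[0]
    L = np.eye(n)
    perm = np.arange(n)
    for i in range(n - 1):
        # pivot: the row with largest |a_ji| among rows i..n-1 of column i
        p = i + np.argmax(np.abs(A[i:, i]))
        if p != i:
            A[[i, p]] = A[[p, i]]
            perm[[i, p]] = perm[[p, i]]
            L[[i, p], :i] = L[[p, i], :i]    # permute stored multipliers too
        for j in range(i + 1, n):
            L[j, i] = A[j, i] / A[i, i]      # multiplier l_ji
            A[j, i:] -= L[j, i] * A[i, i:]   # eliminate entry a_ji
    P = np.eye(n)[perm]
    return P, L, np.triu(A)

A = np.array([[2., 1., 1.], [4., 3., 3.], [8., 7., 9.]])
P, L, U = gepp(A)
assert np.allclose(P @ A, L @ U)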
Algorithm 2 GECP
for i = 1 : n − 1 do
  permute rows and cols so that |a_ii| is the largest in |A(i:n, i:n)|;
  permute L(i:n, 1:i−1) accordingly;
  for j = i + 1 : n do
    l_ji = a_ji / a_ii;
  end for
  for j = i : n do
    u_ij = a_ij;
  end for
  for j = i + 1 : n do
    for k = i + 1 : n do
      a_jk = a_jk − l_ji u_ik;
    end for
  end for
end for
The drawback of GECP is that each step of the for-loop requires an $O(n^2)$ search for the pivot over the trailing block, so the total cost of complete pivoting alone would be $O(n^3)$, already of the same order as the number of floating point operations in Gaussian elimination.
[Figure: schematic of elimination steps 1–3 and the resulting factor L (figure not shown).]
In the rounding error analysis of Cholesky decomposition, the computed entries $\hat{l}_{ij}$ satisfy, for $i > j$,
$$a_{ij} = \frac{1}{(1+\delta')(1+\delta'')} \hat{l}_{ij}\hat{l}_{jj} + \sum_{k=1}^{j-1} (1+\delta_k)\, \hat{l}_{ik}\hat{l}_{jk}
= \sum_{k=1}^{j} \hat{l}_{ik}\hat{l}_{jk} + \sum_{k=1}^{j} \hat{l}_{ik}\hat{l}_{jk}\delta_k, \qquad \delta_j = \frac{1}{(1+\delta')(1+\delta'')} - 1.$$
Here $|\delta_1|, \ldots, |\delta_{j-1}| \le (j-1)\epsilon + O(\epsilon^2)$ and $|\delta_j| \le 2\epsilon + O(\epsilon^2)$. Meanwhile, for each $j$ recall that
$$\hat{l}_{jj} = (1+\delta')(1+\delta)^{\frac{1}{2}} \Big( a_{jj} - \sum_{k=1}^{j-1} \hat{l}_{jk}^2 (1+\delta_k) \Big)^{\frac{1}{2}}, \qquad |\delta_k| \le j\epsilon + O(\epsilon^2), \quad |\delta|, |\delta'| \le \epsilon.$$
Rewriting it leads to
$$a_{jj} = \frac{1}{(1+\delta')^2(1+\delta)} \hat{l}_{jj}^2 + \sum_{k=1}^{j-1} \hat{l}_{jk}^2 (1+\delta_k).$$
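The size of these perturbations can be probed experimentally. Below is a plain Cholesky implementation together with a backward-error check (an added sketch, assuming NumPy; the SPD test matrix is my own choice):

import numpy as np

def cholesky(A):
    """Lower triangular L with A = L L^T, for SPD input A."""
    n = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for j in range(n):
        # diagonal entry: l_jj = sqrt(a_jj - sum_k l_jk^2)
        L[j, j] = np.sqrt(A[j, j] - L[j, :j] @ L[j, :j])
        for i in range(j + 1, n):
            # off-diagonal entry: l_ij = (a_ij - sum_k l_ik l_jk) / l_jj
            L[i, j] = (A[i, j] - L[i, :j] @ L[j, :j]) / L[j, j]
    return L

rng = np.random.default_rng(3)
B = rng.standard_normal((50, 50))
A = B @ B.T + 50 * np.eye(50)      # a well-conditioned SPD matrix
L = cholesky(A)
res = np.linalg.norm(A - L @ L.T, 2) / np.linalg.norm(A, 2)
print(f"relative backward error: {res:.2e}")   # O(machine epsilon)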
[Figure: spy plots of 50 × 50 sparsity patterns, with nz = 217 and nz = 397.]
It turns out that the number of fill-ins created during Gaussian elimination (as well as Cholesky decomposition) heavily depends on the order of the unknowns, i.e., on an appropriate permutation of the rows and columns of the sparse matrix; see Figures 4 and 5 for example. A popular technique for determining the order (also used in MATLAB) is called the Approximate Minimum Degree (AMD) algorithm [1]. Once such an order, represented by the permutation matrix $P$, is available, one can solve $PAP^\top y = Pb$ (with $x = P^\top y$) by Cholesky decomposition. Sophisticated high-performance algorithms for solving sparse problems are implemented in the package UMFPACK (Unsymmetric MultiFrontal PACKage).
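The effect of the ordering on fill-in is easy to observe with SciPy's sparse LU (an added sketch; SciPy's splu uses COLAMD rather than AMD, but the phenomenon is the same):

import scipy.sparse as sp
from scipy.sparse.linalg import splu

# 2D Laplacian on a 20 x 20 grid, a standard sparse SPD test matrix
n = 20
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.eye(n), T) + sp.kron(T, sp.eye(n))).tocsc()

for spec in ["NATURAL", "COLAMD"]:
    lu = splu(A, permc_spec=spec)
    print(spec, "nnz(L) + nnz(U) =", lu.L.nnz + lu.U.nnz)
# The reordered factorization produces substantially fewer nonzeros.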
[Figure: spy plot of a reordered 50 × 50 sparsity pattern, with nz = 255.]
References
[1] Patrick R. Amestoy, Timothy A. Davis, and Iain S. Duff. An approximate minimum
degree ordering algorithm. SIAM J. Matrix Anal. Appl., 17(4):886–905, 1996.
[2] James W. Demmel. Applied numerical linear algebra. Society for Industrial and Applied
Mathematics (SIAM), Philadelphia, PA, 1997.
[3] Gene H. Golub and Charles F. Van Loan. Matrix computations. Johns Hopkins Studies
in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, fourth
edition, 2013.
[4] Roger A. Horn and Charles R. Johnson. Matrix analysis. Cambridge University Press,
Cambridge, second edition, 2013.