
Chapter 3

Systems of Linear Equations - Iterative Approach

Many linear systems arising in real-world applications are large and sparse. Because of the large number of equations and unknowns, storage becomes a serious concern. When it is possible, Gaussian elimination remains a very economical, accurate, and useful algorithm. Elimination is possible as long as there is space to store all the nonzero elements of the triangular matrices associated with the elimination and when the coding necessary to locate these elements can be programmed. Techniques along this line can be found, for example, in SPARSPAK by George, A., and Liu, J.W.H., "Computer Solution of Large Sparse Positive Definite Systems".
There are cases where the order n is so large that it is impossible to store the fill-in resulting from Gaussian elimination. It is, therefore, desirable to solve such linear systems Ax = b by methods that never alter the matrix A and never require storing more than a few vectors of length n. Iterative methods are especially suitable for this purpose.
In an iterative method, beginning with an initial vector x(0), we generate a sequence of vectors x(1) → x(2) → · · · according to the iteration scheme. We hope that as k → ∞, x(k) will converge to the exact solution. The computational effort in each individual step x(i) → x(i+1) is, generally, comparable to one multiplication of A with a vector. This is a very modest amount when A is sparse.
An iterative method may be motivated by the following consideration.
Given a linear system
Ax = b (3.1)
and an approximate solution x̃, the residual corresponding to x̃ is defined by

r := b − Ax̃. (3.2)


It follows that the error e := x − x̃ satisfies the equation


Ae = r. (3.3)
If we could solve (3.3) exactly, then x := x̃ + e would be the solution to (3.1).
In iterative methods, instead of solving (3.3) for the correction e, we solve
Se = r (3.4)
where S is an approximation to A chosen so that (3.4) is much easier to solve than (3.3). Adding the approximate correction to the current approximation x̃ then gives what we hope is a better approximation to the true solution x. This procedure can be summarized as follows:
(a) xold := the current approximation to x;

(b) Compute the residual r := b − Axold ;


(c) Solve Se = r for the unknown e;
(d) Set
xnew := xold + e; (3.5)
(e) Go back to (a).
Multiplying (3.5) by S yields
Sxnew = Sxold + Se
= Sxold + b − Axold
= (S − A)xold + b := T xold + b (3.6)
where
T := S − A (3.7)
so that A = S − T is a splitting of the matrix A. Note that if the iterates converge to a limit x, then x = xold = xnew and, by (3.6), we see that Ax = b. In other words, a limit point of the iteration scheme (3.6) is a solution of the system (3.1).
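To make the procedure concrete, here is a minimal Python/NumPy sketch of the generic residual-correction iteration (a)-(e). The helper solve_S, the stopping tolerance, and the sample 3×3 system are illustrative assumptions, not part of the text; for the Jacobi method below, solve_S simply divides by the diagonal of A.

```python
import numpy as np

def residual_correction(A, b, solve_S, x0=None, tol=1e-10, maxit=500):
    """Generic iteration: x_new = x_old + e, where S e = r = b - A x_old."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    for k in range(maxit):
        r = b - A @ x                                    # step (b): residual
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k                                  # converged
        e = solve_S(r)                                   # step (c): solve S e = r
        x = x + e                                        # step (d): update
    return x, maxit

# Jacobi corresponds to S = D = diag(A): solving S e = r is a componentwise division.
A = np.array([[4., 1., 0.], [1., 4., 1.], [0., 1., 4.]])
b = np.array([1., 2., 3.])
x, its = residual_correction(A, b, solve_S=lambda r: r / np.diag(A))
```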
The choice of S gives rise to different iterative schemes. For instance, split the matrix A as
A = D − L − U (3.8)
where D, −L and −U are, respectively, the diagonal, the strictly lower triangular, and the strictly upper triangular parts of A. Then we may describe three classical iterative schemes as follows:
(1) The Jacobi Method.

S = D;  T = L + U; (3.9)
Dxnew = (L + U)xold + b; (3.10)
x_i^new = (1/a_ii) (b_i − Σ_{j=1}^{i−1} a_ij x_j^old − Σ_{j=i+1}^{n} a_ij x_j^old). (3.11)

(2) The Gauss-Seidel method.

S = D − L;  T = U; (3.12)
(D − L)xnew = U xold + b; (3.13)
x_i^new = (1/a_ii) (b_i − Σ_{j=1}^{i−1} a_ij x_j^new − Σ_{j=i+1}^{n} a_ij x_j^old). (3.14)

(3) The SOR method.

S = σD − L;  T = (σ − 1)D + U,  where ω := 1/σ; (3.15)
(D − ωL)xnew = (1 − ω)Dxold + ωU xold + ωb; (3.16)
x̂_i^new = (1/a_ii) (b_i − Σ_{j=1}^{i−1} a_ij x_j^new − Σ_{j=i+1}^{n} a_ij x_j^old); (3.17)
x_i^new = (1 − ω) x_i^old + ω x̂_i^new. (3.18)
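For concreteness, here is a minimal Python/NumPy sketch of one sweep of each update (3.11), (3.14), and (3.17)-(3.18). A dense matrix is assumed purely for readability; a practical sparse code would store and traverse only the nonzero entries.

```python
import numpy as np

def jacobi_sweep(A, b, x_old):
    """One Jacobi sweep (3.11): every component uses only old values."""
    n = len(b)
    x_new = np.empty(n)
    for i in range(n):
        s = A[i, :i] @ x_old[:i] + A[i, i + 1:] @ x_old[i + 1:]
        x_new[i] = (b[i] - s) / A[i, i]
    return x_new

def gauss_seidel_sweep(A, b, x_old):
    """One Gauss-Seidel sweep (3.14): new values are used as soon as available."""
    x = x_old.copy()
    for i in range(len(b)):
        s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
        x[i] = (b[i] - s) / A[i, i]
    return x

def sor_sweep(A, b, x_old, omega):
    """One SOR sweep (3.17)-(3.18): the Gauss-Seidel value is relaxed by omega."""
    x = x_old.copy()
    for i in range(len(b)):
        s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
        x_hat = (b[i] - s) / A[i, i]
        x[i] = (1 - omega) * x[i] + omega * x_hat
    return x
```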

The main concerns raised in any iterative method are the following:

1. Will the sequence {x(k)} ever converge? Does the limit point depend upon the starting point x(0)?

2. If the sequence {x(k)} does converge, how fast? Is there a way to accelerate the convergence?

3.1 General Consideration


We realize that most of the iterative schemes are of the form

xnew = Hxold + d (3.19)


for some constant square matrix H and constant vector d. We first prove an important lemma concerning estimates of the spectral radius of H.
Lemma 3.1.1. For any square matrix H and any ε > 0, there exists an induced matrix norm ‖·‖ such that

ρ(H) ≤ ‖H‖ ≤ ρ(H) + ε. (3.20)

(pf): The first inequality is true for every induced norm. We prove the second inequality by construction. Given H, there exists a nonsingular matrix P such that

P⁻¹HP = Λ + U

where Λ is a diagonal matrix with the eigenvalues λ_i(H) as its elements, and U is an upper triangular matrix with zero diagonal (see Linear Algebra by Lane, p. 184). With δ > 0, define D := diag{1, δ, δ², . . . , δ^{n−1}}. Then D⁻¹ = diag{1, δ⁻¹, . . . , δ^{1−n}}. Now

D⁻¹P⁻¹HPD = D⁻¹(Λ + U)D = Λ + D⁻¹UD := C.

Note that D⁻¹UD is again strictly upper triangular, and each of its entries is the corresponding entry of U multiplied by a factor δ^{j−i}, a positive power of δ ranging from δ on the first superdiagonal up to δ^{n−1} in the top-right corner.
Now we define a vector norm ‖·‖ by

‖x‖ := ‖D⁻¹P⁻¹x‖₂. (Show that this is a norm!)

Then the induced matrix norm of H is

‖H‖ = sup_{‖x‖=1} ‖Hx‖ = sup_{‖x‖=1} ‖D⁻¹P⁻¹Hx‖₂ = sup_{‖x‖=1} ‖C D⁻¹P⁻¹x‖₂
    = sup_{‖z‖₂=1} ‖Cz‖₂,  where z := D⁻¹P⁻¹x (so ‖x‖ = 1 ⇔ ‖z‖₂ = 1)
    = ‖C‖₂.

Observe that ‖C‖₂ depends continuously on δ. When δ = 0 we have C = Λ, so ‖C‖₂ = ρ(Λ) = ρ(H). Therefore, as δ → 0, ‖C‖₂ → ρ(H). Now given ε, we may choose δ small enough that ‖C‖₂ ≤ ρ(H) + ε. ⊕
Lemma 3.1.2. The following three statements are equivalent:

(1) limn→∞ H n = 0,
(2) limn→∞ ‖H^n‖ = 0 for some norm,
(3) ρ(H) < 1.

(pf): This is a homework problem.


We now apply these results to our iterative method.

Theorem 3.1.1 Suppose x = Hx + d has a unique solution x∗. Then the sequence {x(k)} computed from (3.19) with any starting point x(0) converges to x∗ if and only if ρ(H) < 1.

(pf): Observe first that x(k+1) − x∗ = H(x(k) − x∗) = · · · = H^{k+1}(x(0) − x∗).
(Sufficiency) If ρ(H) < 1, then ‖x(k+1) − x∗‖ ≤ ‖H^{k+1}‖ ‖x(0) − x∗‖ → 0 by Lemma 3.1.2. Thus x(k) → x∗.
(Necessity) Suppose ‖x(k) − x∗‖ → 0 for every x(0). Take x(0) = x∗ + e_i, where e_i is the i-th standard basis vector. Then x(k) − x∗ = H^k e_i = the i-th column of H^k. Using the ‖·‖₁ norm, we have ‖the i-th column of H^k‖₁ → 0 as k → ∞. Since i is arbitrary, it follows that ‖H^k‖₁ → 0. By Lemma 3.1.2 again, ρ(H) < 1. ⊕
Remark. Suppose ρ(H) < 1 and consider a sequence {x(k)} generated from the scheme (3.19). The quantity ‖x(k) − x∗‖^{1/k} represents the geometric average error improvement over the first k iterations. The sequence {‖x(k) − x∗‖^{1/k}} generally may not converge. Instead, we consider the number β := inf_k sup_{n≥k} ‖x(n) − x∗‖^{1/n} to be the rate of convergence of the given sequence; that is, for every ε > 0, there is a K such that for every n ≥ K, ‖x(n) − x∗‖ ≤ (β + ε)^n. Note that β depends on the starting vector x(0).
Definition 3.1.1 The asymptotic convergence factor α of the iterative scheme (3.19) is defined to be

α := sup_{x(0)} inf_k sup_{n≥k} ‖x(n) − x∗‖^{1/n},

where the supremum is taken over all starting vectors x(0).

Remark. By the norm equivalence theorem and the fact that lim_{n→∞} c^{1/n} = 1 for any nonzero constant c, it follows that β (and hence α) is norm independent.
Theorem 3.1.2 Suppose ρ(H) < 1. Then the iterative scheme has asymptotic
convergence factor α = ρ(H).
(pf): Since ρ(H) < 1, we may choose a norm ‖·‖ such that ‖H‖ ≤ ρ(H) + ε < 1. We have already seen that ‖x(k) − x∗‖ ≤ ‖H‖^k ‖x(0) − x∗‖. It follows that β ≤ ρ(H) + ε. Since ε is arbitrary, it follows that α ≤ ρ(H). To show equality, we construct a sequence {x(k)} for which equality holds. Toward this, we consider two cases:
(i) Suppose λ is a real eigenvalue of H such that |λ| = ρ(H). Let u be an associated real unit eigenvector. We choose x(0) := x∗ + u. Then x(k) − x∗ = H^k u = λ^k u. For this sequence β = |λ| = ρ(H).
(ii) Suppose ρ(H) corresponds to a pair of complex conjugate eigenvalues λ and λ̄. Let u and ū be the corresponding eigenvectors. We may select a basis {u_i} for Cⁿ such that u₁ = u and u₂ = ū. Any vector y ∈ Cⁿ may be expressed as y = Σ_{i=1}^{n} c_i u_i, and we may take ‖y‖ := Σ|c_i| as a norm for y. Now we choose x(0) := x∗ + ½(u + ū). Then x(k) − x∗ = H^k ½(u + ū) = ½(λ^k u + λ̄^k ū). Using the norm just defined, we have ‖x(k) − x∗‖ = ½(|λ|^k + |λ̄|^k) = ρ(H)^k. It follows that β = ρ(H). ⊕
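As a quick numerical illustration of Theorem 3.1.2 (a sketch with a randomly generated H, not an example from the text), the quantities ‖x(k) − x∗‖^{1/k} settle near ρ(H):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
H = rng.standard_normal((n, n))
H *= 0.9 / max(abs(np.linalg.eigvals(H)))   # rescale so that rho(H) = 0.9 < 1
d = rng.standard_normal(n)
x_star = np.linalg.solve(np.eye(n) - H, d)  # unique fixed point of x = Hx + d

x = np.zeros(n)
for k in range(1, 201):
    x = H @ x + d
    if k % 50 == 0:
        print(k, np.linalg.norm(x - x_star) ** (1.0 / k))  # approaches rho(H) = 0.9
```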
Recall that the splitting of the matrix A
A=S−T
induces the iterative scheme
Sxnew = T xold + b
for the system Ax = b. Thus it is imperative to find conditions under which ρ(S⁻¹T) < 1. We discuss below several sufficient conditions that have been established in the literature (cf. R. S. Varga, Matrix Iterative Analysis).

Definition 3.1.2 Let A ∈ Rn×n . Then A = S − T is said to be a regular


splitting if S −1 ≥ 0 and T ≥ 0.

Theorem 3.1.3 If A ∈ Rn×n , A−1 ≥ 0 and A = S − T is a regular splitting,


then ρ(S −1 T ) < 1.

(pf): Let H := S⁻¹T. Then H ≥ 0 and S⁻¹A = I − H. Note that (I + H + · · · + H^m)(I − H) = I − H^{m+1}. It follows that 0 ≤ (I + H + · · · + H^m)S⁻¹ = (I − H^{m+1})A⁻¹ ≤ A⁻¹. Let D^(m) := I + H + · · · + H^m and F := D^(m) S⁻¹, and denote A⁻¹ = (α_ij), S⁻¹ = (β_ij), D^(m) = (d_ij^(m)). Then f_ik = Σ_{j=1}^{n} d_ij^(m) β_jk is a linear combination of the d_ij^(m) with nonnegative coefficients. Since S⁻¹ ≥ 0 is nonsingular, no row of S⁻¹ can be identically zero, so for each j there exists a column index k(j) such that β_{jk(j)} > 0. Then d_ij^(m) β_{jk(j)} ≤ f_{ik(j)} ≤ α_{ik(j)} shows that each d_ij^(m) is bounded above independently of m. Since {d_ij^(m)} is monotone increasing in m, it converges; that is, the series Σ_{m=0}^{∞} H^m converges. Therefore lim_{m→∞} H^m = 0, and ρ(H) < 1 by Lemma 3.1.2. ⊕

Definition 3.1.3 A nonsingular matrix A ∈ Rn×n is said to be an M-matrix if a_ij ≤ 0 for all i ≠ j and A⁻¹ ≥ 0.

Remark. Suppose A is an M-matrix. Then a_ii > 0 for all i.

Theorem 3.1.4 If A ∈ Rn×n is an M-matrix, then both the Jacobi splitting (3.9) and the Gauss-Seidel splitting (3.12) are regular. In this case, both the Jacobi method and the Gauss-Seidel method are convergent.

(pf): In the Jacobi method, S = D and T = L + U. Since the diagonal of D is positive, S⁻¹ = D⁻¹ ≥ 0. Also T = L + U ≥ 0 by the definition of an M-matrix. This shows the Jacobi splitting is regular. Convergence then follows from Theorems 3.1.3 and 3.1.1.
In the Gauss-Seidel method, S = D − L and T = U. Obviously S is nonsingular and T ≥ 0. To show regular splitting, we need to show S⁻¹ ≥ 0. Now S⁻¹ = (D − L)⁻¹ = (I − D⁻¹L)⁻¹D⁻¹. Note that D⁻¹L ≥ 0. Since D⁻¹L is strictly lower triangular, (D⁻¹L)^{m+1} = 0 whenever m + 1 ≥ n. It follows that (I − D⁻¹L)(I + D⁻¹L + · · · + (D⁻¹L)^m) = I − (D⁻¹L)^{m+1} = I, and hence (I − D⁻¹L)⁻¹ = I + D⁻¹L + · · · + (D⁻¹L)^m ≥ 0. This shows that the Gauss-Seidel splitting is regular. Convergence again follows from Theorems 3.1.3 and 3.1.1. ⊕

Definition 3.1.4 A matrix A ∈ Rn×n is said to be row-wise diagonally dominant (strictly, if the inequality is strict) if |a_ii| ≥ Σ_{j=1, j≠i}^{n} |a_ij| for all i.

Theorem 3.1.5 Both the Jacobi method and the Gauss-Seidel method converge
if A is strictly diagonally dominant.

(pf): In the Jacobi method, J := H = D⁻¹(L + U). Taking the ∞-norm, we have ‖J‖∞ = max_i Σ_{k≠i} |a_ik|/|a_ii| < 1 by strict diagonal dominance. The convergence follows from Lemma 3.1.1 and Theorem 3.1.1. In the Gauss-Seidel method, G := H = (D − L)⁻¹U = (I − D⁻¹L)⁻¹D⁻¹U. Let e := (1, . . . , 1)ᵀ. Note that |J|e ≤ ‖J‖∞ e, so |D⁻¹U|e ≤ (‖J‖∞ I − |D⁻¹L|)e. Note also that 0 ≤ |(I − D⁻¹L)⁻¹| = |I + D⁻¹L + · · · + (D⁻¹L)^{n−1}| ≤ (I − |D⁻¹L|)⁻¹. Thus
|G|e ≤ (I − |D⁻¹L|)⁻¹(‖J‖∞ I − |D⁻¹L|)e = (I + (‖J‖∞ − 1)(I − |D⁻¹L|)⁻¹)e ≤ (I + (‖J‖∞ − 1)I)e = ‖J‖∞ e.
It follows that ‖G‖∞ ≤ ‖J‖∞ < 1. ⊕
Remark. In Theorem 3.1.5 we have actually proved a stronger result, ‖G‖∞ ≤ ‖J‖∞. But it is not necessarily true that ρ(G) ≤ ρ(J). That is, it is not true, in general, that the Gauss-Seidel method converges at least as fast as the Jacobi method, although intuitively it seems this should be so (cf. the Stein-Rosenberg theorem in Varga).
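The following small NumPy check (an illustrative sketch; the 3×3 matrix is an arbitrary strictly diagonally dominant example, not from the text) forms the iteration matrices J and G and confirms that both their ∞-norms and spectral radii are below 1:

```python
import numpy as np

A = np.array([[10., 2., 1.],
              [ 3., 12., 2.],
              [ 1., 3., 9.]])          # strictly row-wise diagonally dominant
D = np.diag(np.diag(A))
L = -np.tril(A, -1)                    # A = D - L - U as in (3.8)
U = -np.triu(A, 1)

J = np.linalg.solve(D, L + U)          # Jacobi iteration matrix D^{-1}(L + U)
G = np.linalg.solve(D - L, U)          # Gauss-Seidel iteration matrix (D - L)^{-1} U

for name, H in (("Jacobi", J), ("Gauss-Seidel", G)):
    print(name,
          "norm_inf =", np.linalg.norm(H, np.inf),
          "rho =", max(abs(np.linalg.eigvals(H))))   # all values are < 1
```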
Theorem 3.1.6 If A is symmetric and positive definite, then the Gauss-Seidel
method converges.
(pf): In the Gauss-Seidel method, G := (D − L)⁻¹Lᵀ, since U = Lᵀ. Consider G₁ := D^{1/2} G D^{−1/2} = (I − L₁)⁻¹L₁ᵀ with L₁ := D^{−1/2} L D^{−1/2}.
Since G and G₁ are similar, they have the same eigenvalues. Suppose G₁x = λx with x∗x = 1 (note that x may be in Cⁿ). Then L₁ᵀx = λ(I − L₁)x, and hence x∗L₁ᵀx = λ(1 − x∗L₁x). Let x∗L₁x = a + ib, so that x∗L₁ᵀx = a − ib. Then
|λ|² = |a − ib|² / |1 − a − ib|² = (a² + b²) / (1 − 2a + a² + b²).
Note that D^{−1/2}AD^{−1/2} = I − L₁ − L₁ᵀ is still positive definite. Thus 1 − x∗L₁x − x∗L₁ᵀx = 1 − 2a > 0. It follows that |λ| < 1. ⊕

3.2 Relaxation Method


In the relaxation methods, we consider classes of matrices H that depend on
certain parameters. The main idea is to vary these parameters so that the
corresponding asymptotic convergence factor ρ(H) becomes as small as possible.
One of the most popular relaxation methods is the SOR method (3.16), where

H(ω) = (D − ωL)⁻¹[(1 − ω)D + ωU].
Theorem 3.2.1 Suppose A ∈ Rn×n has nonzero diagonal elements. Then
ρ(H(ω)) ≥ |1−ω|. So for convergence of SOR, it is necessary to have 0 < ω < 2.
(pf): We first observe that det(H(ω)) = det((D − ωL)⁻¹) det[(1 − ω)D + ωU] = det(D⁻¹) det((1 − ω)D) = (1 − ω)^n, since D − ωL is lower triangular with the same diagonal as D and (1 − ω)D + ωU is upper triangular with diagonal (1 − ω)D. On the other hand, det(H(ω)) = Π_{i=1}^{n} λ_i, where the λ_i are the eigenvalues of H(ω). It follows that |1 − ω|^n = Π_{i=1}^{n} |λ_i| ≤ (ρ(H(ω)))^n. The assertion follows. ⊕
Theorem 3.2.2 (Ostrowski and Reich Theorem) Let A be real, symmetric and
positive definite. Then the SOR method converges if and only if 0 < ω < 2.

(pf): The SOR method comes from the splitting A = S − T where

S := ω −1 D − L and T := (ω −1 − 1)D + U.

Obviously S is nonsingular. Let

Q := A − (S⁻¹T)ᵀA(S⁻¹T).

We claim that both S + T and Q are positive definite. Suppose these claims are true. Let λ be any eigenvalue of H = S⁻¹T, and y a corresponding eigenvector. Then 0 < y∗Qy = y∗Ay − (λy)∗A(λy) = (1 − |λ|²)y∗Ay. Since y∗Ay > 0, it follows that |λ| < 1, and hence ρ(H) < 1.
Now we prove the claims. Recall that any matrix M can be written as M = ½(M + Mᵀ) + ½(M − Mᵀ) := M_s + M_k, where M_s is symmetric and M_k is skew-symmetric, and that xᵀMx = xᵀM_s x. So it suffices to check the symmetric part of S + T for positive definiteness. Now (S + T)_s = ½{S + Sᵀ + T + Tᵀ} = ½{(ω⁻¹D − L) + (ω⁻¹D − U) + ((ω⁻¹ − 1)D + U) + ((ω⁻¹ − 1)D + L)} = ω⁻¹(2 − ω)D, which is positive definite precisely because 0 < ω < 2. To check the matrix Q for positive definiteness, we first observe that
Q = A − HᵀAH = A − (I − S⁻¹A)ᵀA(I − S⁻¹A) = (S⁻¹A)ᵀA + A(S⁻¹A) − (S⁻¹A)ᵀA(S⁻¹A) = (S⁻¹A)ᵀ{S + Sᵀ − A}(S⁻¹A) = (S⁻¹A)ᵀ{Sᵀ + T}(S⁻¹A).
Now xᵀQx = xᵀ(S⁻¹A)ᵀ{Sᵀ + T}(S⁻¹A)x = yᵀ(Sᵀ + T)y = yᵀ(S + T)y > 0 for all x ≠ 0, where y := S⁻¹Ax ≠ 0. So Q is positive definite. ⊕
Remark. It is often possible to choose the parameter ω so that the SOR method converges rapidly, much more rapidly than the Jacobi method or the Gauss-Seidel method. Normally, such an optimum value for ω can be prescribed if the coefficient matrix A, relative to the partitioning imposed, has the so-called Property A and is consistently ordered (cf. Hageman and Young, Applied Iterative Methods, Chapter 9). In practice, an estimate of the SOR parameter ω is obtained by an adaptive procedure.
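As a rough numerical illustration (a sketch; the 1-D Laplacian model problem and the sampled ω values are assumptions, not from the text), one can form H(ω) explicitly for a small problem and watch how ρ(H(ω)) varies with ω:

```python
import numpy as np

def sor_iteration_matrix(A, omega):
    """H(omega) = (D - omega*L)^{-1} [(1 - omega)*D + omega*U], with A = D - L - U."""
    D = np.diag(np.diag(A))
    L = -np.tril(A, -1)
    U = -np.triu(A, 1)
    return np.linalg.solve(D - omega * L, (1 - omega) * D + omega * U)

# 1-D Laplacian: tridiagonal, symmetric positive definite.
n = 20
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

for omega in (0.5, 1.0, 1.5, 1.9):
    rho = max(abs(np.linalg.eigvals(sor_iteration_matrix(A, omega))))
    print(f"omega = {omega:.1f}   rho(H(omega)) = {rho:.4f}")
```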

3.3 Acceleration Methods


In this section we discuss a general procedure for accelerating the rate of convergence of basic iterative methods. The procedure involves the formation of a new vector sequence from linear combinations of the iterates obtained from the basic method.
Let {x(k)} be the sequence of iterates generated by the basic method (3.19). That is, {x(k)} is formed by

x(k) = Hx(k−1) + d. (3.21)

Then the error vector e(k) := x(k) − x∗ satisfies

e(k) = H k e(0) . (3.22)



We consider a new vector sequence {u(k)} determined by the linear combination

u(k) := Σ_{i=0}^{k} α_{k,i} x(i),  k = 0, 1, . . . (3.23)

where the real numbers α_{k,i} are required to satisfy the consistency condition

Σ_{i=0}^{k} α_{k,i} = 1,  k = 0, 1, . . . (3.24)

Let ε(k) := u(k) − x∗. Then we have

ε(k) = Σ_{i=0}^{k} α_{k,i} x(i) − x∗ = Σ_{i=0}^{k} α_{k,i} (x(i) − x∗) = Σ_{i=0}^{k} α_{k,i} e(i) = (Σ_{i=0}^{k} α_{k,i} H^i) e(0) := Q_k(H) ε(0)

(note that ε(0) = e(0)), where

Q_k(H) := α_{k,0} I + α_{k,1} H + · · · + α_{k,k} H^k

is a matrix polynomial. The idea is to choose the polynomials {Q_k} so that {u(k)} converges to x∗ faster than {x(k)}. Generally speaking, using (3.23) directly to obtain u(k) requires a high arithmetic cost and a large amount of storage. Alternatively, we usually consider only the important family of polynomials satisfying the recurrence relation:

Q0 (x) = 1
Q1 (x) = γ1 x − γ1 + 1 (3.25)
Qk+1 (x) = ρk+1 (γk+1 x + 1 − γk+1 )Qk (x) + (1 − ρk+1 )Qk−1 (x), for k ≥ 1

where γ1, ρ2, γ2, . . . are real numbers to be determined. We note that the consistency condition (3.24) is then satisfied automatically for all k ≥ 0.

Theorem 3.3.1 If the polynomial sequence {Qk } in (3.25) is used, then the
iterates {u(k) } of (3.23) may be obtained using the three-term relation:

u(1) = γ1(Hu(0) + d) + (1 − γ1)u(0),
u(k+1) = ρk+1{γk+1(Hu(k) + d) + (1 − γk+1)u(k)} + (1 − ρk+1)u(k−1),  k ≥ 1. (3.26)

(pf): This is a homework problem. ⊕

The polynomials {Q_k} may be chosen to fulfill one of two purposes:
(1) (Conjugate Gradient Acceleration) From the relation ε(k) = Q_k(H)ε(0), we have for any norm that

‖ε(k)‖ ≤ ‖Q_k(H)ε(0)‖. (3.27)

So the polynomial sequence {Q_k} may be chosen to minimize ‖Q_k(H)ε(0)‖ (cf. Hageman and Young, Chapter 7).
(2) (Chebyshev Acceleration) This is motivated by the fact that

‖ε(k)‖ ≤ ‖Q_k(H)‖ ‖ε(0)‖ ≤ (ρ(Q_k(H)) + ε)‖ε(0)‖ (3.28)

for a certain norm. We note that

ρ(Q_k(H)) = max_{1≤i≤n} |Q_k(λ_i)|. (3.29)

Let M(H) and m(H) denote, respectively, the algebraically largest and smallest eigenvalues of H. The polynomials {Q_k} are chosen so that the virtual spectral radius of Q_k(H), defined by

ρ̄(Q_k(H)) := max_{m(H)≤x≤M(H)} |Q_k(x)|, (3.30)

is minimized (cf. Hageman and Young, Chapters 4-6).
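A minimal sketch of the three-term acceleration (3.26) follows. The parameter sequences gammas and rhos are assumed to be supplied by the user (for instance from the Chebyshev formulas in Hageman and Young); the code only demonstrates the recurrence itself.

```python
import numpy as np

def accelerate(H, d, u0, gammas, rhos):
    """Three-term acceleration (3.26) on top of the basic scheme x <- H x + d.

    gammas = [gamma_1, gamma_2, ...] and rhos = [rho_2, rho_3, ...] are
    user-supplied; choosing them well is the subject of Chebyshev acceleration.
    """
    u_prev = np.asarray(u0, dtype=float).copy()
    u = gammas[0] * (H @ u_prev + d) + (1 - gammas[0]) * u_prev       # u^(1)
    for k in range(1, len(gammas)):
        g, r = gammas[k], rhos[k - 1]                                 # gamma_{k+1}, rho_{k+1}
        u_next = r * (g * (H @ u + d) + (1 - g) * u) + (1 - r) * u_prev
        u_prev, u = u, u_next
    return u
```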

3.4 Conjugate Gradient Method


The conjugate gradient method is a very useful technique in many areas of
numerical computation. In this section, we shall study how it can be applied to
solve the linear system (3.1) when A is symmetric and positive definite.
Consider the quadratic functional F : Rn → R given by

F(x) = ½ xᵀAx − xᵀb. (3.31)

Suppose x̄ is the solution to (3.1). Then, for every x,

F(x) = F(x̄) + ½(x − x̄)ᵀA(x − x̄). (3.32)

It follows that the problem of solving Ax = b is equivalent to the problem of minimizing F(x). Moreover, the gradient of F(x) is given by

∇F(x) = Ax − b. (3.33)

The direction of the vector ∇F(x) is the direction in which the functional F increases most rapidly at the point x. Suppose x(k) is an approximation to x̄. Then, moving in the direction of steepest descent rk := −∇F(x(k)) = b − Ax(k), we should obtain an improved approximation

x(k+1) := x(k) + αk rk (3.34)

if αk is chosen to minimize F(x(k) + αrk). Using (3.31), we can easily calculate the number αk. Thus we have derived
Algorithm 3.4.1. (The Steepest Descent Method)

Given x(0) arbitrary
For k = 0, 1, . . .
    rk := b − Ax(k)
    If rk = 0, then stop
    Else
        αk := rkᵀrk / (rkᵀArk)   (Why?)
        x(k+1) := x(k) + αk rk.
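A direct NumPy transcription of Algorithm 3.4.1 (a sketch; the exact test rk = 0 is replaced by a residual tolerance, as one would do in floating-point arithmetic):

```python
import numpy as np

def steepest_descent(A, b, x0=None, tol=1e-10, maxit=10000):
    """Algorithm 3.4.1 for symmetric positive definite A."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    for k in range(maxit):
        r = b - A @ x                       # r_k = -grad F(x^(k))
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break                           # residual small enough: stop
        alpha = (r @ r) / (r @ (A @ r))     # exact line search along r_k
        x = x + alpha * r
    return x
```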

When the condition number κ₂(A) = λmax(A)/λmin(A) is large, the level curves of F are very elongated hyperellipsoids, and minimization corresponds to finding the lowest point of a relatively flat, steep-sided valley. In steepest descent we are forced to traverse back and forth across the valley rather than down the valley. This is a slow process. We would therefore like to choose certain descent directions {dk} other than {rk}.

Definition 3.4.1 Given a symmetric and positive definite matrix A, two vectors d1, d2 are said to be A-conjugate if and only if d1ᵀAd2 = 0. A finite set of vectors d0, . . . , dk is called an A-conjugate set if diᵀAdj = 0 for all i ≠ j.

Lemma 3.4.1. If {d0, . . . , dn−1} is an A-conjugate set of nonzero vectors, then d0, . . . , dn−1 are linearly independent.

Lemma 3.4.2. If d0, . . . , dn−1 are nonzero A-conjugate vectors, then the solution x∗ to (3.1) may be written as

x∗ = Σ_{i=0}^{n−1} (diᵀb / diᵀAdi) di. (3.35)

(pf): By Lemma 3.4.1, d0, . . . , dn−1 form a basis of Rn. The solution x∗ to (3.1) has a unique representation x∗ = Σ_{j=0}^{n−1} γj dj. Also, we have b = Ax∗ = Σ_{j=0}^{n−1} γj Adj. Taking the inner product of b with di gives γi = diᵀb / (diᵀAdi). ⊕
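As a quick check of formula (3.35) (an illustrative sketch, not from the text): the orthonormal eigenvectors of a symmetric positive definite matrix A are mutually A-conjugate, so they can serve as the directions di.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)        # symmetric positive definite
b = rng.standard_normal(5)

# Orthonormal eigenvectors satisfy v_i^T A v_j = lambda_j v_i^T v_j = 0 for i != j.
_, V = np.linalg.eigh(A)
d = [V[:, i] for i in range(5)]

x = sum(((di @ b) / (di @ (A @ di))) * di for di in d)   # formula (3.35)
print(np.allclose(x, np.linalg.solve(A, b)))              # True
```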

Theorem 3.4.1 (Conjugate Direction Theorem) Let {d0, . . . , dn−1} be a set of nonzero A-conjugate vectors. For any x(0) ∈ Rn, the sequence {x(k)} generated by

x(k+1) = x(k) + αk dk,  k ≥ 0, (3.36)

with

αk = rkᵀdk / (dkᵀAdk), (3.37)
rk = b − Ax(k), (3.38)

converges to the unique solution x∗ of Ax = b after n steps, i.e., x∗ = x(0) + Σ_{j=0}^{n−1} αj dj.

(pf): Suppose x∗ − x(0) = Σ_{j=0}^{n−1} δj dj. Then

δi = diᵀA(x∗ − x(0)) / (diᵀAdi) = diᵀ(b − Ax(0)) / (diᵀAdi) = diᵀ(rk + Ax(k) − Ax(0)) / (diᵀAdi)

for all k ≥ 0. Observe from (3.36) that x(k) − x(0) = Σ_{j=0}^{k−1} αj dj, so diᵀ(Ax(k) − Ax(0)) = 0 for all i ≥ k. Taking k = i, we have shown that δi = diᵀri / (diᵀAdi) = αi. ⊕

Theorem 3.4.2 (Expanding Space Theorem) Let {d0, . . . , dn−1} be a set of A-conjugate vectors. Let Sk := span{d0, . . . , dk−1} denote the k-dimensional subspace spanned by the vectors d0, . . . , dk−1. For any x(0) ∈ Rn, the sequence {x(k)} generated from the scheme (3.36) and (3.37) has the property that

F(x(k)) = min_α F(x(k−1) + αdk−1). (3.39)

In fact,

F(x(k)) = min_{x ∈ x(0)+Sk} F(x). (3.40)

(pf): Define g(α) := F(x(k−1) + αdk−1). Then g′(α) = dk−1ᵀ∇F(x(k−1) + αdk−1) = dk−1ᵀ(A(x(k−1) + αdk−1) − b) = dk−1ᵀ(αAdk−1 − rk−1). It follows that the optimum value of α is given by (3.37).
To show that x(k) is a minimizer over the linear variety x(0) + Sk, it suffices to show that ∇F(x(k)) = −rk is perpendicular to Sk. Now for k = 1, we have d0ᵀr1 = d0ᵀ(b − A(x(0) + α0 d0)) = 0 by the definition of α0. For k = 2, we have d1ᵀr2 = d1ᵀ(b − A(x(1) + α1 d1)) = 0 by the definition of α1, and d0ᵀr2 = d0ᵀ(r1 − α1 Ad1) = 0. The assertion follows by induction. ⊕
Remark. In the above theorem, we have actually proved that di ⊥ rk for all i < k.
Algorithm 3.4.2. (The Conjugate Gradient Method)

Given x(0) ∈ Rn arbitrary
d0 := r0 := b − Ax(0)
For k = 0, 1, . . . , n − 1
    If rk = 0, stop
    Else
        αk := rkᵀdk / (dkᵀAdk)   (can be replaced by (3.46)) (3.41)
        x(k+1) := x(k) + αk dk (3.42)
        rk+1 := b − Ax(k+1) (3.43)
        βk := − rk+1ᵀAdk / (dkᵀAdk)   (can be replaced by (3.47)) (3.44)
        dk+1 := rk+1 + βk dk. (3.45)
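A NumPy transcription of Algorithm 3.4.2 (a sketch; it uses the cheaper formulas (3.46) and (3.47) for αk and βk, updates the residual recursively instead of recomputing (3.43), and stops on a small residual rather than an exact zero):

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, maxit=None):
    """Algorithm 3.4.2 for symmetric positive definite A."""
    n = len(b)
    maxit = n if maxit is None else maxit
    x = np.zeros(n) if x0 is None else x0.copy()
    r = b - A @ x                       # r_0
    d = r.copy()                        # d_0 := r_0
    rs_old = r @ r
    for k in range(maxit):
        if np.sqrt(rs_old) <= tol * np.linalg.norm(b):
            break                       # residual small enough
        Ad = A @ d
        alpha = rs_old / (d @ Ad)       # (3.46): alpha_k = r_k^T r_k / d_k^T A d_k
        x = x + alpha * d               # (3.42)
        r = r - alpha * Ad              # equivalent to (3.43)
        rs_new = r @ r
        beta = rs_new / rs_old          # (3.47): beta_k = r_{k+1}^T r_{k+1} / r_k^T r_k
        d = r + beta * d                # (3.45)
        rs_old = rs_new
    return x
```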

Theorem 3.4.3 (Conjugate Gradient Theorem) The conjugate gradient algorithm is a conjugate direction method. If it does not terminate at x(k), then

1. span{r0 , r1 , . . . , rk } = span{r0 , Ar0 , . . . , Ak r0 };

2. span{d0 , d1 , . . . , dk } = span{r0 , Ar0 , . . . , Ak r0 };

3. dTk Adi = 0 for i < k;


4. αk = rkᵀrk / (dkᵀAdk); (3.46)

5. βk = rk+1ᵀrk+1 / (rkᵀrk). (3.47)

(pf): All statements are proved by induction.

(a) When k = 0, the case is trivial. Suppose statement (a) is true for k. We want to show span{r0, r1, . . . , rk+1} = span{r0, Ar0, . . . , A^{k+1}r0}. Note that

rk+1 = b − Ax(k+1) = rk − αk Adk.

Note also that

rk ∈ span{r0, r1, . . . , rk} = span{r0, Ar0, . . . , A^k r0} ⊂ span{r0, Ar0, . . . , A^{k+1}r0}.

By construction, dk ∈ span{r0, r1, . . . , rk} = span{r0, Ar0, . . . , A^k r0}, so Adk ∈ span{Ar0, . . . , A^{k+1}r0}. Therefore, rk+1 ∈ span{r0, Ar0, . . . , A^{k+1}r0}. This shows that

span{r0, r1, . . . , rk+1} ⊂ span{r0, Ar0, . . . , A^{k+1}r0}.

Now we need to show that

A^{k+1}r0 ∈ span{r0, r1, . . . , rk+1}.

Write rk+1 = Σ_{i=0}^{k+1} γi A^i r0. Since rk+1 ≠ 0 and, by the remark following Theorem 3.4.2, rk+1 ⊥ span{d0, . . . , dk} = span{r0, Ar0, . . . , A^k r0}, we have rk+1 ∉ span{r0, Ar0, . . . , A^k r0}; hence γk+1 ≠ 0. So A^{k+1}r0 can be written as a linear combination of r0, . . . , rk+1.


(b) The proof is similar to (a).
(c) Assume dkᵀAdi = 0 for i < k. Want to show dk+1ᵀAdi = 0 for i < k + 1. By construction, dk+1ᵀAdi = rk+1ᵀAdi + βk dkᵀAdi. If i = k, then dk+1ᵀAdi = 0 by the definition of βk. If i < k, then dk+1ᵀAdi = rk+1ᵀAdi = 0 by (a) and (b), since Adi ∈ span{d0, . . . , di+1} and rk+1 ⊥ span{d0, . . . , di+1}.
(d) By definition, αk = rkᵀdk / (dkᵀAdk) and dk = rk + βk−1 dk−1. So αk = (rkᵀrk + βk−1 rkᵀdk−1) / (dkᵀAdk) = rkᵀrk / (dkᵀAdk), since rkᵀdk−1 = 0.

(e) By definition and by (d),

βk := − rk+1ᵀAdk / (dkᵀAdk) = − rk+1ᵀ((rk − rk+1)/αk) / (rkᵀrk/αk) = (rk+1ᵀrk+1 − rk+1ᵀrk) / (rkᵀrk) = rk+1ᵀrk+1 / (rkᵀrk),

because αk Adk = rk − rk+1 and rk+1ᵀrk = 0; the latter holds since rk = dk − βk−1 dk−1 by (3.45) and rk+1 ⊥ dk, dk−1. ⊕

Remark. In exact arithmetic, the conjugate gradient method reaches the solution in at most n iterations. Because of the effect of floating-point arithmetic, however, the computed rn is generally different from zero. In practice, therefore, the method is simply continued until rk is found to be sufficiently small.
