Chapter 3

Systems of Linear Equations - Iterative Approach
Many linear systems arising in real-world applications are large and sparse.
Because of the large number of equations and unknowns, storage becomes a
serious concern. When it is possible, Gaussian elimination remains a very
economical, accurate, and useful algorithm. Elimination is possible as long as
there is space to store all the nonzero elements of the triangular matrices
associated with the elimination and as long as the coding necessary to locate
these elements can be programmed. Techniques along this line can be found, for
example, in SPARSPAK by George, A., and Liu, J.W.H., "Computer Solution of
Large Sparse Positive Definite Systems".
There are cases where the order n is so large that it is impossible to store
the fill-in resulting from the Gaussian elimination. It is, therefore, desirable
to solve such linear systems Ax = b by methods that never alter the matrix A and
never require storing more than a few vectors of length n. Iterative methods
are especially suitable for this purpose.
In an iterative method, beginning with an initial vector x^{(0)}, we generate a
sequence of vectors x^{(1)} → x^{(2)} → · · · according to the iteration scheme.
We hope that as k → ∞, x^{(k)} will converge to the exact solution. The
computational effort in each individual step x^{(i)} → x^{(i+1)}, generally, is
comparable to the multiplication of A with a vector. This is a very modest
amount when A is sparse.
An iterative method may be motivated by the following consideration.
Given a linear system
Ax = b (3.1)
and an approximate solution x̃, the residual corresponding to x̃ is defined by
r := b − Ax̃. (3.2)
For the Jacobi method, the splitting is

S = D;   T = L + U;   (3.9)

D x^{new} = (L + U) x^{old} + b;   (3.10)

x_i^{new} = (1/a_{ii}) ( b_i − Σ_{j=1}^{i−1} a_{ij} x_j^{old} − Σ_{j=i+1}^{n} a_{ij} x_j^{old} ).   (3.11)
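For concreteness, the following is a minimal sketch of the Jacobi sweep (3.10)-(3.11) in Python with NumPy; the function name, tolerance, and iteration cap are illustrative choices, not part of the text.

    import numpy as np

    def jacobi(A, b, x0, tol=1e-10, max_iter=500):
        """Jacobi iteration D x_new = (L + U) x_old + b, cf. (3.10)-(3.11)."""
        D = np.diag(A)                 # diagonal entries a_ii
        R = A - np.diagflat(D)         # off-diagonal part of A, i.e. -(L + U)
        x = np.array(x0, dtype=float)
        for _ in range(max_iter):
            x_new = (b - R @ x) / D    # componentwise form (3.11)
            if np.linalg.norm(x_new - x, np.inf) < tol:
                return x_new
            x = x_new
        return x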
For the Gauss-Seidel method,

S = D − L;   T = U;   (3.12)

(D − L) x^{new} = U x^{old} + b;   (3.13)

x_i^{new} = (1/a_{ii}) ( b_i − Σ_{j=1}^{i−1} a_{ij} x_j^{new} − Σ_{j=i+1}^{n} a_{ij} x_j^{old} ).   (3.14)
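A corresponding sketch of the Gauss-Seidel sweep (3.14); as (3.14) indicates, each new component is used as soon as it has been computed (names and stopping rule are again illustrative).

    import numpy as np

    def gauss_seidel(A, b, x0, tol=1e-10, max_iter=500):
        """Gauss-Seidel sweep implementing (3.14)."""
        n = len(b)
        x = np.array(x0, dtype=float)
        for _ in range(max_iter):
            x_old = x.copy()
            for i in range(n):
                s1 = A[i, :i] @ x[:i]          # already-updated components
                s2 = A[i, i+1:] @ x_old[i+1:]  # old components
                x[i] = (b[i] - s1 - s2) / A[i, i]
            if np.linalg.norm(x - x_old, np.inf) < tol:
                break
        return x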
For the SOR method, with ω = 1/σ,

S = σD − L;   T = (σ − 1)D + U;   (3.15)

(D − ωL) x^{new} = (1 − ω) D x^{old} + ω U x^{old} + ω b;   (3.16)

x̂_i^{new} = (1/a_{ii}) ( b_i − Σ_{j=1}^{i−1} a_{ij} x_j^{new} − Σ_{j=i+1}^{n} a_{ij} x_j^{old} );   (3.17)

x_i^{new} = (1 − ω) x_i^{old} + ω x̂_i^{new}.   (3.18)
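A sketch of the SOR sweep (3.17)-(3.18), which blends the Gauss-Seidel value x̂_i^{new} with the old value using the parameter ω; the interface mirrors the previous sketches and is illustrative.

    import numpy as np

    def sor(A, b, x0, omega, tol=1e-10, max_iter=500):
        """SOR sweep implementing (3.17)-(3.18)."""
        n = len(b)
        x = np.array(x0, dtype=float)
        for _ in range(max_iter):
            x_old = x.copy()
            for i in range(n):
                s1 = A[i, :i] @ x[:i]
                s2 = A[i, i+1:] @ x_old[i+1:]
                x_hat = (b[i] - s1 - s2) / A[i, i]             # (3.17)
                x[i] = (1 - omega) * x_old[i] + omega * x_hat  # (3.18)
            if np.linalg.norm(x - x_old, np.inf) < tol:
                break
        return x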
(pf): The first inequality is true for all induced norms. We prove the second
inequality by construction. Given H, there exists a nonsingular matrix P such
that
P^{−1} H P = Λ + U
The following statements are equivalent:
(1) lim_{n→∞} H^n = 0;
(2) lim_{n→∞} ||H^n|| = 0 for some norm;
(3) ρ(H) < 1.
Remark. By the norm equivalence theorem and the fact that lim_{n→∞} c^{1/n} = 1
for any nonzero constant c, it follows that β (and hence, α) is norm independent.
Theorem 3.1.2 Suppose ρ(H) < 1. Then the iterative scheme has asymptotic
convergence factor α = ρ(H).
(pf): Since ρ(H) < 1, we may choose a norm ||·|| such that ||H|| ≤ ρ(H) + ε < 1.
We have already seen that ||x^{(k)} − x*|| ≤ ||H||^k ||x^{(0)} − x*||. It follows
that β ≤ ρ(H) + ε. Since ε is arbitrary, it follows that α ≤ ρ(H). To show
equality, we construct a sequence {x^{(k)}} such that the equality holds.
Toward this, we consider two cases:

(i) Suppose λ is a real eigenvalue of H such that |λ| = ρ(H). Let u be
the associated real unit eigenvector of λ. We choose x^{(0)} := x* + u. Then

x^{(k)} − x* = H^k u = λ^k u.

For this sequence β = |λ| = ρ(H).

(ii) Suppose ρ(H) corresponds to a pair of complex conjugate eigenvalues λ
and λ̄. Let u and ū be the corresponding eigenvectors. We may select a basis
{u_i} for C^n such that u_1 = u and u_2 = ū. Any vector y ∈ C^n may be expressed
as y = Σ_{i=1}^{n} c_i u_i. We may take ||y|| := Σ |c_i| as a norm for y. Now we
choose x^{(0)} := x* + ½(u + ū). Then x^{(k)} − x* = H^k ½(u + ū) = ½(λ^k u + λ̄^k ū).
Using the norm just defined, we have ||x^{(k)} − x*|| = ½(|λ|^k + |λ̄|^k) = ρ(H)^k.
It follows that β = ρ(H). ⊕
Recall that the splitting of the matrix A

A = S − T

induces the iterative scheme

S x^{new} = T x^{old} + b

for the system Ax = b. Thus it is imperative to find conditions such that
ρ(S^{−1}T) < 1. We discuss below several sufficient conditions that have
been established in the literature (cf. R. S. Varga, Matrix Iterative Analysis).
Theorem 3.1.5 Both the Jacobi method and the Gauss-Seidel method converge
if A is strictly diagonally dominant.
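To see Theorem 3.1.5 and the condition ρ(S^{−1}T) < 1 at work numerically, one can compute the spectral radii of the Jacobi and Gauss-Seidel iteration matrices for a small strictly diagonally dominant example; the test matrix below is an arbitrary illustration.

    import numpy as np

    A = np.array([[4.0, 1.0, 1.0],
                  [1.0, 5.0, 2.0],
                  [0.0, 1.0, 3.0]])        # strictly diagonally dominant

    D = np.diagflat(np.diag(A))
    L = -np.tril(A, -1)                    # A = D - L - U
    U = -np.triu(A, 1)

    spectral_radius = lambda M: max(abs(np.linalg.eigvals(M)))

    H_jacobi = np.linalg.solve(D, L + U)   # S^{-1} T with S = D
    H_gs     = np.linalg.solve(D - L, U)   # S^{-1} T with S = D - L

    print(spectral_radius(H_jacobi))       # < 1, so Jacobi converges
    print(spectral_radius(H_gs))           # < 1, so Gauss-Seidel converges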
Since G and G1 are similar, G and G1 have the same eigenvalues. Suppose
G1 x = λx with x*x = 1 (note that x may be in C^n). Then L1^T x = λ(I − L1)x.
It follows that x* L1^T x = λ(1 − x* L1 x). Let x* L1 x = a + ib, so that
x* L1^T x = a − ib and a − ib = λ(1 − a − ib). Then

|λ|^2 = |a − ib|^2 / |1 − a − ib|^2 = (a^2 + b^2) / (1 − 2a + a^2 + b^2).

Note that D^{−1/2} A D^{−1/2} = I − L1 − L1^T is still positive definite, so
1 − 2a = x*(I − L1 − L1^T)x > 0; hence |λ|^2 < 1.
Let

S := ω^{−1}D − L and T := (ω^{−1} − 1)D + U,

and define

Q := A − (S^{−1}T)^T A (S^{−1}T).

We claim that both S + T and Q are positive definite. Suppose these claims
are true. Let λ be any eigenvalue of H = S^{−1}T, and y the corresponding
eigenvector. Then 0 < y*Qy = y*Ay − (λy)*A(λy) = (1 − |λ|^2) y*Ay. It follows
that |λ| < 1, and hence ρ(H) < 1.

Now we prove the claims. Recall that any given matrix M can be written
as M = ½(M + M^T) + ½(M − M^T) := M_s + M_k, where M_s is symmetric and M_k is
skew-symmetric. Note also that x^T M x = x^T M_s x. So it suffices to check the
symmetric part of S + T for positive definiteness. Now

(S + T)_s = ½{S + S^T + T + T^T} = ½{(ω^{−1}D − L) + (ω^{−1}D − U) + ((ω^{−1} − 1)D + U) + ((ω^{−1} − 1)D + L)} = ω^{−1}(2 − ω) D,

which obviously is positive definite (since 0 < ω < 2 and the diagonal of A is
positive). To check the matrix Q for positive definiteness, we first observe that

Q = A − H^T A H = A − (I − S^{−1}A)^T A (I − S^{−1}A)
  = A − {A − (S^{−1}A)^T A − A(S^{−1}A) + (S^{−1}A)^T A (S^{−1}A)}
  = (S^{−1}A)^T {S + S^T − A} (S^{−1}A) = (S^{−1}A)^T {S^T + T} (S^{−1}A).

Now x^T Q x = x^T (S^{−1}A)^T {S^T + T} (S^{−1}A) x = y^T (S^T + T) y = y^T (S + T) y > 0
for all x ≠ 0, where y := (S^{−1}A)x. So Q is positive definite. ⊕
Remark. It is often possible to choose the parameter ω so that the SOR
method converges rapidly; much more rapidly than the Jacobi method or the
Gauss-Seidel method. Normally, such an optimum value of ω can be prescribed
if the coefficient matrix A, relative to the partitioning imposed, has the
so-called property A and is consistently ordered (cf. Hageman and Young,
Applied Iterative Methods, Chapter 9). In practice, the estimate of the SOR
parameter ω is obtained by an adaptive procedure.
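For consistently ordered matrices with property A, the classical relation (due to Young; cf. Hageman and Young) between the optimal ω and the spectral radius μ of the Jacobi iteration matrix is ω_opt = 2/(1 + √(1 − μ^2)). The sketch below assumes μ < 1 and that the eigenvalues of the Jacobi matrix are affordable to compute; for large problems μ itself must be estimated adaptively, as remarked above.

    import numpy as np

    def sor_optimal_omega(A):
        """Estimate omega_opt = 2 / (1 + sqrt(1 - mu^2)), where mu is the
        spectral radius of the Jacobi iteration matrix; valid for consistently
        ordered matrices with property A, and assumes mu < 1."""
        D = np.diagflat(np.diag(A))
        H_jacobi = np.linalg.solve(D, D - A)        # I - D^{-1} A
        mu = max(abs(np.linalg.eigvals(H_jacobi)))
        return 2.0 / (1.0 + np.sqrt(1.0 - mu**2))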
In a polynomial acceleration method, the iterates x^{(i)} of the basic scheme
are combined into new approximations

u^{(k)} := Σ_{i=0}^{k} α_{k,i} x^{(i)},   k = 0, 1, . . .   (3.23)

where the real numbers α_{k,i} are required to satisfy the consistency condition

Σ_{i=0}^{k} α_{k,i} = 1,   k = 0, 1, . . .   (3.24)
Since x^{(i)} − x* = H^i (x^{(0)} − x*), it follows from (3.24) that
u^{(k)} − x* = Q_k(H)(x^{(0)} − x*), where

Q_k(H) := α_{k,0} I + α_{k,1} H + · · · + α_{k,k} H^k
is a matrix polynomial. The idea is to choose the polynomials {Q_k} so that
{u^{(k)}} converges to x* faster than {x^{(k)}}. Generally speaking, using (3.23)
directly to obtain u^{(k)} requires a high arithmetic cost and a large amount of
storage. Alternatively, we usually consider only the important family of
polynomials satisfying the recurrence relation:
Q_0(x) = 1,
Q_1(x) = γ_1 x − γ_1 + 1,   (3.25)
Q_{k+1}(x) = ρ_{k+1}(γ_{k+1} x + 1 − γ_{k+1}) Q_k(x) + (1 − ρ_{k+1}) Q_{k−1}(x),   for k ≥ 1.
Theorem 3.3.1 If the polynomial sequence {Q_k} in (3.25) is used, then the
iterates {u^{(k)}} of (3.23) may be obtained using the three-term relation

u^{(k+1)} = ρ_{k+1} { γ_{k+1} (H u^{(k)} + c) + (1 − γ_{k+1}) u^{(k)} } + (1 − ρ_{k+1}) u^{(k−1)},   k ≥ 1,

with u^{(0)} = x^{(0)} and u^{(1)} = γ_1 (H u^{(0)} + c) + (1 − γ_1) u^{(0)}, where
x^{new} = H x^{old} + c (H = S^{−1}T, c = S^{−1}b) denotes the basic scheme.
So the polynomial sequence {Q_k} may be chosen to minimize ||Q_k(H) ε^{(0)}||,
where ε^{(0)} := x^{(0)} − x* (cf. Hageman and Young, Chapter 7).
(2) (Chebyshev Acceleration) This is motivated by the fact that
u^{(k)} − x* = Q_k(H)(x^{(0)} − x*), so that ||u^{(k)} − x*|| ≤ ||Q_k(H)|| ||x^{(0)} − x*||.
Let M(H) and m(H) denote, respectively, the algebraically largest and smallest
eigenvalues of H. So the polynomial {Q_k} is chosen such that the virtual spectral
radius of Q_k(H), defined by

S̄(Q_k(H)) := max_{m(H) ≤ x ≤ M(H)} |Q_k(x)|,

is as small as possible; the minimizing polynomials are suitably normalized
Chebyshev polynomials (cf. Hageman and Young).
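Under the assumption that the basic scheme has been written as x^{new} = H x^{old} + c with H = S^{−1}T and c = S^{−1}b, the three-term relation of Theorem 3.3.1 can be carried out while storing only two previous acceleration vectors. The sketch below is generic in the parameter sequences ρ_k, γ_k (how they are chosen, e.g. by Chebyshev acceleration, is a separate matter); apply_basic, the argument names, and the indexing convention are illustrative.

    import numpy as np

    def accelerate(apply_basic, u0, rho, gamma, num_steps):
        """Polynomial acceleration via the three-term relation of Theorem 3.3.1.
        apply_basic(u) returns H u + c, one step of the basic scheme.
        gamma[k-1] plays the role of gamma_k and rho[k-1] of rho_k (rho[0] unused)."""
        u_prev = np.array(u0, dtype=float)
        u = gamma[0] * apply_basic(u_prev) + (1.0 - gamma[0]) * u_prev   # u^(1)
        for k in range(1, num_steps):
            u_next = (rho[k] * (gamma[k] * apply_basic(u) + (1.0 - gamma[k]) * u)
                      + (1.0 - rho[k]) * u_prev)
            u_prev, u = u, u_next
        return u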
∇F(u) = Au − b.   (3.33)

The direction of the vector ∇F(x) is the direction in which the functional F(x)
at the point x changes most rapidly. Suppose x^{(k)} is an approximation to x;
then in the direction of steepest descent r_k := −∇F(x^{(k)}) = b − Ax^{(k)} we
should obtain an improved approximation

x^{(k+1)} := x^{(k)} + α_k r_k

if α_k is chosen to minimize F(x^{(k)} + α r_k). Using (3.31), we can easily
calculate the number α_k. Thus we have derived
Algorithm 3.4.1. (The Steepest Descent Method)
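A minimal sketch of the steepest-descent iteration just described: the residual r_k = b − Ax^{(k)} serves as the search direction, and the exact line-search value α_k = r_k^T r_k / (r_k^T A r_k) comes from minimizing F(x^{(k)} + α r_k) for the quadratic functional. The tolerance and iteration cap are illustrative.

    import numpy as np

    def steepest_descent(A, b, x0, tol=1e-10, max_iter=1000):
        """Steepest descent for Ax = b with A symmetric positive definite."""
        x = np.array(x0, dtype=float)
        r = b - A @ x
        for _ in range(max_iter):
            Ar = A @ r
            alpha = (r @ r) / (r @ Ar)   # minimizes F(x + alpha r)
            x = x + alpha * r
            r = b - A @ x                # new residual = steepest-descent direction
            if np.linalg.norm(r) < tol:
                break
        return x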
Definition 3.4.1 Given a symmetric and positive definite matrix A, two vectors
d_1, d_2 are said to be A-conjugate if and only if d_1^T A d_2 = 0. A finite set
of vectors d_0, . . . , d_k is called an A-conjugate set if d_i^T A d_j = 0 for
all i ≠ j.
5. β_k = + (r_{k+1}^T r_{k+1}) / (r_k^T r_k).   (3.47)
Note also that r_{k+1} = r_k − α_k A d_k, so that A d_k = (r_k − r_{k+1})/α_k,
and that d_k^T A d_k = r_k^T r_k / α_k.

(e) By definition,

β_k := − (r_{k+1}^T A d_k) / (d_k^T A d_k)
     = − (r_{k+1}^T A d_k) / (r_k^T r_k / α_k)
     = − (r_{k+1}^T (r_k − r_{k+1}) / α_k) / (r_k^T r_k / α_k)
     = + (r_{k+1}^T r_{k+1}) / (r_k^T r_k),

because, by (3.45), r_k = d_k − β_{k−1} d_{k−1}, so that
r_{k+1}^T r_k = r_{k+1}^T d_k − β_{k−1} r_{k+1}^T d_{k−1} = 0.
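Assembling the recurrences (the step length α_k = r_k^T r_k / (d_k^T A d_k), the residual update r_{k+1} = r_k − α_k A d_k, the β_k of (3.47), and the new direction d_{k+1} = r_{k+1} + β_k d_k from (3.45)) gives the conjugate gradient loop. A minimal sketch for symmetric positive definite A; the stopping rule is illustrative.

    import numpy as np

    def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=None):
        """Conjugate gradient sketch using the formulas summarized above."""
        n = len(b)
        if max_iter is None:
            max_iter = n
        x = np.array(x0, dtype=float)
        r = b - A @ x
        d = r.copy()
        for _ in range(max_iter):
            Ad = A @ d
            alpha = (r @ r) / (d @ Ad)        # step length alpha_k
            x = x + alpha * d
            r_new = r - alpha * Ad            # r_{k+1} = r_k - alpha_k A d_k
            if np.linalg.norm(r_new) < tol:
                break
            beta = (r_new @ r_new) / (r @ r)  # beta_k from (3.47)
            d = r_new + beta * d              # d_{k+1} = r_{k+1} + beta_k d_k
            r = r_new
        return x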