
H. H. Bau, MEAM 427 Class Notes, Chapter 4: Algebraic equations (10-08-07)

δ_ij = { 1  if i = j,
       { 0  if i ≠ j.
The above allows us to define functions of a matrix f(A). Formally, the function can be
expanded into a Taylor series in terms of powers of A.
4.6 ITERATIVE SOLVERS

In this section, we return to solving the system Ax=b and discuss iterative solvers. In the
early days of scientific computing, iterative, relaxation techniques were frequently used because
they do not tax heavily computer resources. In fact, these techniques were developed in the precomputer days and were executed by hand. In these techniques, one starts with an initial guess
of the solution, and then the solution vector is modified until certain convergence criteria are
met. Relaxation techniques are still in use, but usually they are combined with accelerators
designed to improve both speed and robustness. Here, we will present iterative schemes using
matrix notation. The matrix formulation will allow us to gain insights into the various schemes.
In practice, in the interest of saving computer memory, one would work with individual
equations rather than in matrix form.
To construct an iterative solver, we modify the equation Ax=b to read
Mx = (M - A)x + b.    (8)

Clearly, the above equation is identical to Ax=b. The matrix M is chosen so that it is easy to
invert. The iterative procedure starts with an initial guess x_0, such as x_0 = b, and proceeds
according to the iterative scheme:
Mx_{k+1} = (M - A)x_k + b,    (9)

or

x_{k+1} = M^{-1}((M - A)x_k + b) = (I - M^{-1}A)x_k + M^{-1}b = Bx_k + M^{-1}b,    (10)

where B = I - M^{-1}A. Not all choices of M are going to work.
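To make the recipe concrete, here is a minimal MATLAB sketch of the general scheme (10). The function name iter_solve, the iteration cap nmax, and the tolerance tol are our own illustrative choices, not part of the notes.

function x = iter_solve(A, b, M, nmax, tol)
% Iterate x_{k+1} = B*x_k + M^{-1}*b, with B = I - M^{-1}*A,
% until successive iterates agree to within tol.
B  = eye(size(A)) - M\A;    % iteration matrix B
bm = M\b;                   % the vector M^{-1}*b
x  = b;                     % initial guess x_0 = b
for k = 1:nmax
    xnew = B*x + bm;
    if norm(xnew - x) < tol, x = xnew; return; end
    x = xnew;
end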
To take a closer look at the process, let's see what happens to the error, defined as the
difference between the iterate x_k and the exact solution x, e_k = x_k - x. The error satisfies the
iterative equation

e_{k+1} = Be_k,    (11)

or

e_k = B^k e_0.    (12)


Clearly, we would like e_k → 0 as k → ∞. This will happen only when B^k → 0 as k increases.
Let's suppose that B has n distinct eigenvectors. We construct the matrix Q with the
eigenvectors of B as its columns. That is, B = QΛQ^{-1}, where Λ is a diagonal matrix whose
entries are the eigenvalues of B (see equation 4). Accordingly,

e_k = B^k e_0 = (QΛQ^{-1})^k e_0 = (QΛ^k Q^{-1}) e_0.    (13)

The matrix Λ^k is a diagonal matrix whose diagonal terms are λ_i^k, where the λ_i are the eigenvalues of
B. For B^k to approach zero as k increases, we need |λ_i| < 1 for all i. The smaller |λ_i| is, the faster the
convergence. The rate of convergence is governed by the largest value of |λ_i|, which is dubbed
the spectral radius of B. For convergence, the eigenvalues of B must lie within the unit circle in
the complex plane. When λ_i is real and -1 < λ_i < 0, the convergence is oscillatory. The closer |λ_i| is to
1, the more iterations are needed to converge. We see that the eigenvalues λ(B) control the
convergence of the iterative process. When B is small enough, we can calculate its eigenvalues
and tell right away whether the iterative process is going to work or not. Unfortunately, in most
applications, B is very large, and we cannot afford to calculate its eigenvalues.
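When B is small enough to form explicitly, the test takes a single line in MATLAB (a sketch; rho is simply our name for the spectral radius):
>> rho = max(abs(eig(B)))   % the iterations converge when rho < 1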
Before we proceed with specific choices of M, let's take a closer look at the error. We
can project the error onto the space spanned by the eigenvectors of B. We denote these eigenvectors
as q_i:

e_0 = c_1 q_1 + c_2 q_2 + ... + c_n q_n,
Be_0 = c_1 Bq_1 + c_2 Bq_2 + ... + c_n Bq_n = c_1 λ_1 q_1 + c_2 λ_2 q_2 + ... + c_n λ_n q_n,

and

B^k e_0 = c_1 λ_1^k q_1 + c_2 λ_2^k q_2 + ... + c_n λ_n^k q_n.

Witness that at every iteration step, each eigenvector component is multiplied by the corresponding λ_i. The
coefficients c_i come from the initial error. We see again that the initial error decays when all |λ_i| < 1.
When one of the eigenvalues, say |λ_1|, is greater than 1, the error will grow in the direction of this
particular eigenvector, and the component along the eigenvector with the largest eigenvalue will dominate.
You may naively think that if we could set the coefficients c_i associated with the eigenvalues |λ_i| > 1
to zero, we would be OK. This is, however, impractical, since we do not know the eigenvectors in advance.
Moreover, even if we knew the eigenvectors in advance, we could not avoid errors with components along the
forbidden eigenvectors cropping up during the calculations, and these errors would amplify as we proceed
through the iterations.
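The decay of the individual error components can be watched directly in MATLAB. The sketch below assumes that the iteration matrix B and an initial error e0 are available; eig returns the eigenvector matrix Q and the diagonal matrix of eigenvalues.
>> [Q,L] = eig(B);             % columns of Q are the eigenvectors q_i
>> c = Q\e0;                   % coefficients c_i of e0 in the eigenvector basis
>> k = 5;                      % any iteration count
>> ek = Q*(diag(L).^k .* c);   % e_k = c_1 lambda_1^k q_1 + ... + c_n lambda_n^k q_n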


4.6.1 The Jacobi Method


One choice for M is the diagonal of A. This is known as the Jacobi method. We will
demonstrate with an example how the process works.
>> A=[5 1 -3; 0 2 1; 1 1 10]
A =
     5     1    -3
     0     2     1
     1     1    10
>> b=[1; 2; 3]
b =
     1
     2
     3
We will solve the equation exactly:
>> A\b   % exact solution
ans =
    0.1373
    0.9020
    0.1961

Next, we construct the matrix M that consists of the diagonal terms of A. diag(A) creates a vector
consisting of the diagonal elements of A. diag(diag(A)) forms a diagonal matrix with the entries
of diag(A) as its diagonal.
>> m=diag(diag(A))
m =
     5     0     0
     0     2     0
     0     0    10

Next, we form the iteration matrix B = I - M^{-1}A:
>> B=eye(3,3)-inv(m)*A;
Let's take a look at the eigenvalues of B:
>> eig(B)
ans =
  -0.2000
   0.1000 + 0.2000i
   0.1000 - 0.2000i
All the eigenvalues are within the unit circle; hence, we expect the iterative process to converge.
The iterations are x_{k+1} = Bx_k + M^{-1}b. It is convenient to construct the vector M^{-1}b, which I call
bm.
>> bm=inv(m)*b;
The initial guess is arbitrary. I choose to start with x_1 = b.
>> x(:,1)=b;
We write a for loop to carry out the iterations.
>> for i=2:10,
x(:,i)=B*x(:,i-1)+bm;
end;
>> x'
ans =
    1.0000    2.0000    3.0000
    1.6000   -0.5000         0
    0.3000    1.0000    0.1900
    0.1140    0.9050    0.1700
    0.1210    0.9150    0.1981
    0.1359    0.9010    0.1964
    0.1377    0.9018    0.1963
    0.1374    0.9018    0.1961
    0.1373    0.9020    0.1961
    0.1372    0.9020    0.1961

The exact solution, for comparison, is 0.1373  0.9020  0.1961.

As you can see, we readily converge to the correct solution. Within about 8 iterations, we have
the solution correct to three significant digits. Of course, as the number of equations increases,
so does the number of iterations. Additionally, the rate of convergence depends on the
magnitude of the eigenvalues of B: the smaller the absolute values of the eigenvalues, the
faster the convergence.
We can look at the process in graphical format:
>> plot(x')


Witness that the computations can be stopped at any intermediate stage, and they still render
useful information. For example, had we stopped after four iterations, we would have the
solution within one significant digit. When using direct solvers, if we quit before the conclusion
of the calculations, we do not have anything useful.
Now, suppose that you replace matrix A with a different one:
>> A=[1 5 -3; 10 2 1; 1 1 2];
Will the Jacobi iterations still work?
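One way to answer this question without running the iterations is to inspect the spectral radius of the new iteration matrix, as in this sketch:
>> A=[1 5 -3; 10 2 1; 1 1 2];
>> B=eye(3,3)-inv(diag(diag(A)))*A;
>> max(abs(eig(B)))   % convergence requires this value to be smaller than 1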

One disadvantage of the Jacobi method is that it requires us to store all the components of
x^{(k)} until the calculation of x^{(k+1)} is complete. On the other hand, the scheme can be readily
implemented on parallel computers. An alternative is to replace x^{(k)} with x^{(k+1)} as soon as the
latter is computed. This leads to the Gauss-Seidel method.

4.6.2 The Gauss-Seidel Method


The first equation is the same as in the Jacobi method:

a11 x1^{(k+1)} = -(a12 x2^{(k)} + a13 x3^{(k)} + ... + a1n xn^{(k)}) + b1.

The second equation is, however, different. We replace x_1^{(k)} with the newly calculated x_1^{(k+1)}:

a22 x2^{(k+1)} = -(a21 x1^{(k+1)} + a23 x3^{(k)} + ... + a2n xn^{(k)}) + b2.

In the Gauss-Seidel method, M is the lower triangular part of A (including the diagonal), and
M - A is the negative of the strictly upper triangular part of A. For example, in the case of n=3, we have the system

[ a11   0     0   ]               [ 0  -a12  -a13 ]           [ b1 ]
[ a21  a22    0   ] x^{(k+1)}  =  [ 0    0   -a23 ] x^{(k)} + [ b2 ]     (14)
[ a31  a32   a33  ]               [ 0    0     0  ]           [ b3 ]
        M                               M - A

For example, consider again the matrix
>> A=[5 1 -3; 0 2 1; 1 1 10];
Next, we obtain the matrix M by extracting the lower triangular part of A:


>> M=tril(A)
M =
     5     0     0
     0     2     0
     1     1    10

The remainder of the procedure is exactly the same as in the previous example. We follow the
iterative scheme x_{k+1} = Bx_k + M^{-1}b, where, as before, B = I - M^{-1}A.
>> B=eye(3,3)-inv(M)*A;
>> eig(B)
ans =
0.0000
-0.0050 + 0.0999i
-0.0050 - 0.0999i
Compare the eigenvalues of B in the Gauss-Seidel method with the eigenvalues of B in the
Jacobi method. For the same matrix A, the eigenvalues associated with the Gauss-Seidel method
are much smaller in magnitude than the ones associated with the Jacobi method. Thus, we expect the
Gauss-Seidel method to converge significantly faster than the Jacobi method. This is, indeed, the
case. See below.
>> bm=inv(M)*b;
>> x=b;
>> for i=2:10,
x(:,i)=B*x(:,i-1)+bm;
end;
>> x'
ans =
    1.0000    2.0000    3.0000
    1.6000   -0.5000    0.1900
    0.4140    0.9050    0.1681
    0.1199    0.9159    0.1964
    0.1347    0.9018    0.1964
    0.1375    0.9018    0.1961
    0.1373    0.9020    0.1961
    0.1373    0.9020    0.1961
    0.1373    0.9020    0.1961
    0.1373    0.9020    0.1961

We have the first three significant digits within 6 iterations.
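As noted at the beginning of the section, in practice one works with the individual equations rather than with matrices. A single component-wise Gauss-Seidel sweep might look like the sketch below (our own illustration, not code from the notes); witness that x(i) is overwritten as soon as it is computed, so only one copy of x is stored.
% one Gauss-Seidel sweep over the n equations of A*x=b (x a column vector)
n = length(b);
for i = 1:n
    s = A(i,1:i-1)*x(1:i-1) + A(i,i+1:n)*x(i+1:n);   % uses the updated x(1:i-1)
    x(i) = (b(i) - s)/A(i,i);
end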


In some cases, it is possible to accelerate the rate of convergence of the Gauss-Seidel
technique by introducing an acceleration factor known as overrelaxation.
4.6.3 The Method of Successive Overrelaxation (SOR)

In the Gauss-Seidel method, we corrected the previous guess x^{(k)} by adding the correction
x̂^{(k+1)} - x^{(k)}. The idea of the SOR technique is to overcorrect. The scheme is:

x^{(k+1)} = x^{(k)} + ω(x̂^{(k+1)} - x^{(k)}),    (15)

where x̂^{(k+1)} is the Gauss-Seidel update and ω is the SOR factor. When ω = 1, we recover the
Gauss-Seidel method. Typically, one chooses 1 < ω < 2. There is actually an optimal value of ω for
which the convergence is the fastest. In certain cases of hard-to-converge nonlinear equations, one
may select ω < 1 (underrelaxation). Reducing the value of ω below one leads to very slow convergence.
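In component form, the SOR update simply blends the old value with the Gauss-Seidel value. Below is a sketch of one sweep, assuming A, b, and the current iterate x are defined and the relaxation factor is stored in w:
% one SOR sweep; w = 1 reduces to a Gauss-Seidel sweep
n = length(b);
for i = 1:n
    s    = A(i,1:i-1)*x(1:i-1) + A(i,i+1:n)*x(i+1:n);
    x_gs = (b(i) - s)/A(i,i);        % the Gauss-Seidel value of component i
    x(i) = x(i) + w*(x_gs - x(i));   % overcorrect by the factor w
end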
We will construct the SOR equations explicitly. First, consider a 2×2 case. Let
Δx1 = x̂1^{(k+1)} - x1^{(k)} denote the Gauss-Seidel correction of the first unknown. The overrelaxed
update is

x1^{(k+1)} = x1^{(k)} + ω Δx1 = x1^{(k)} - (ω/a11)(a11 x1^{(k)} + a12 x2^{(k)}) + ω b1/a11.

Multiplying through by a11/ω, we obtain

(a11/ω) x1^{(k+1)} = a11(1/ω - 1) x1^{(k)} - a12 x2^{(k)} + b1.

In a 3×3 case, the other equations can be manipulated in a similar way to yield the
iterative scheme:

[ a11/ω    0       0     ]               [ a11(1/ω-1)   -a12          -a13        ]           [ b1 ]
[ a21    a22/ω     0     ] x^{(k+1)}  =  [ 0            a22(1/ω-1)    -a23        ] x^{(k)} + [ b2 ]     (16)
[ a31    a32     a33/ω   ]               [ 0            0             a33(1/ω-1)  ]           [ b3 ]
           M                                              M - A

The optimal choice of ω makes the largest eigenvalue of B = I - M^{-1}A as small as possible.
The optimal conditions are met when all the eigenvalues of B are equal in magnitude. For a matrix of
reasonable size or with a pattern, we can compute the optimal ω analytically. In other cases, we
may have to resort to numerical experiments.
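Such a numerical experiment is easy to set up in MATLAB: scan ω, build M = D/ω + L as in (16), and record the spectral radius of B(ω). The grid and the variable names below are our own choices:
A = [2 -1; -1 2];
omega = 1:0.005:1.1;
rho = zeros(size(omega));
for j = 1:length(omega)
    w = omega(j);
    M = diag(diag(A))/w + tril(A,-1);   % M = D/w + L, as in (16)
    B = eye(size(A)) - inv(M)*A;
    rho(j) = max(abs(eig(B)));          % spectral radius for this w
end
plot(omega, rho)                        % the minimum marks the optimal w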
Below, we illustrate the basic ideas with a Maple session:
> with(LinearAlgebra):
> assume(w,real);
> a:=Matrix([[2, -1],[-1, 2]]);
a :=
  [  2  -1 ]
  [ -1   2 ]

> m:=Matrix([[2/w, 0],[-1, 2/w]]);
m :=
  [ 2/w~    0   ]
  [  -1    2/w~ ]

> i:=Matrix([[1,0],[0,1]]);
i :=
  [ 1  0 ]
  [ 0  1 ]

> inv_m:=MatrixInverse(m);
inv_m :=
  [ (1/2) w~        0       ]
  [ (1/4) w~^2   (1/2) w~   ]

> b:=i-MatrixInverse(m).a;
b :=
  [ 1 - w~                   (1/2) w~             ]
  [ (1/2) w~ - (1/2) w~^2    1 - w~ + (1/4) w~^2  ]


> factor(simplify(Determinant(b)));
(w~ - 1)^2

> eig:=simplify(Eigenvalues(b));
eig :=
  [ 1 - w~ + (1/8) w~^2 + (1/8) w~ sqrt(w~^2 - 16 w~ + 16) ]
  [ 1 - w~ + (1/8) w~^2 - (1/8) w~ sqrt(w~^2 - 16 w~ + 16) ]

> plot({abs(eig[1]), abs(eig[2])}, w=1..1.1);

Witness that the two eigenvalues of B become equal in magnitude when ω_opt ≈ 1.07. In the
Gauss-Seidel iteration (ω = 1), the largest eigenvalue is 0.25. By selecting ω_opt ≈ 1.07, we reduce the
largest eigenvalue to approximately 0.07. This means that the SOR iterations will converge more than three
times faster than the Gauss-Seidel iterations. Go ahead and give it a try.

NOTES

1. Determinant(B) = (1 - ω)^n. This is always true. The determinant of B is also equal to the
product of the eigenvalues. If we want to make all the eigenvalues equal in magnitude, we must have
|λ| = |1 - ω|, i.e., λ = ω - 1 for 1 < ω < 2.
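This note is easy to check numerically. A quick sketch using the 2×2 matrix from the Maple session (here n = 2):
>> A=[2 -1; -1 2]; w=1.07;
>> M=diag(diag(A))/w + tril(A,-1);
>> B=eye(size(A))-inv(M)*A;
>> [det(B) (1-w)^2]   % both entries should equal (1-w)^2 = 0.0049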

4.7 TRI-DIAGONAL DECOMPOSITION (The method of Lanczos)

Thus far, we have been exposed to direct and iterative techniques. We will conclude this
chapter with a semi-direct method. We start with the introduction of tri-diagonal decomposition:
given a matrix A, we wish to construct a related matrix T that is populated by
zeros everywhere except along the main diagonal and the two adjacent diagonals.
