The Conjugate Gradient Method

Tom Lyche
University of Oslo
Norway
Plan for the day
The method
Algorithm
Implementation of test problems
Complexity
Derivation of the method
Convergence
The Conjugate gradient method

Restricted to positive definite systems: $Ax = b$, $A \in \mathbb{R}^{n \times n}$ positive definite.
Generate $x_k$ by $x_{k+1} = x_k + \alpha_k p_k$,
$p_k$ is a vector, the search direction,
$\alpha_k$ is a scalar determining the step length.
In general we find the exact solution in at most $n$ iterations.
For many problems the error becomes small after a few iterations.
Both a direct method and an iterative method.
Rate of convergence depends on the square root of the condition number.
The name of the game

Conjugate means orthogonal; orthogonal gradients.
But why gradients?
Consider minimizing the quadratic function $Q : \mathbb{R}^n \to \mathbb{R}$ given by
$$Q(x) := \tfrac{1}{2} x^T A x - x^T b.$$
The minimum is obtained by setting the gradient equal to zero.
$$\nabla Q(x) = Ax - b = 0 \quad \Longleftrightarrow \quad \text{the linear system } Ax = b$$
Find the solution by solving $r = b - Ax = 0$.
The sequence $x_k$ is such that the residuals $r_k := b - A x_k$ are orthogonal with respect to the usual inner product in $\mathbb{R}^n$.
The search directions are also orthogonal, but with respect to a different inner product.
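As a quick sanity check (a standard computation, not spelled out on the slide), differentiating $Q$ componentwise and using the symmetry $A^T = A$ gives the stated gradient:
$$\frac{\partial Q}{\partial x_i} = \frac{\partial}{\partial x_i}\Bigl(\tfrac{1}{2}\sum_{j,l} x_j A_{jl} x_l - \sum_j x_j b_j\Bigr) = \sum_l A_{il} x_l - b_i = (Ax - b)_i,$$
so $\nabla Q(x) = Ax - b$ as claimed.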
The algorithm

Start with some $x_0$. Set $p_0 = r_0 = b - A x_0$.
For $k = 0, 1, 2, \ldots$
$$x_{k+1} = x_k + \alpha_k p_k, \qquad \alpha_k = \frac{r_k^T r_k}{p_k^T A p_k}$$
$$r_{k+1} = b - A x_{k+1} = r_k - \alpha_k A p_k$$
$$p_{k+1} = r_{k+1} + \beta_k p_k, \qquad \beta_k = \frac{r_{k+1}^T r_{k+1}}{r_k^T r_k}$$
Example

$$\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$

Start with $x_0 = 0$. Then $p_0 = r_0 = b = [1, 0]^T$.
$$\alpha_0 = \frac{r_0^T r_0}{p_0^T A p_0} = \frac{1}{2}, \qquad x_1 = x_0 + \alpha_0 p_0 = \begin{pmatrix} 0 \\ 0 \end{pmatrix} + \frac{1}{2}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1/2 \\ 0 \end{pmatrix}$$
$$r_1 = r_0 - \alpha_0 A p_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} - \frac{1}{2}\begin{pmatrix} 2 \\ -1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1/2 \end{pmatrix}, \qquad r_1^T r_0 = 0$$
$$\beta_0 = \frac{r_1^T r_1}{r_0^T r_0} = \frac{1}{4}, \qquad p_1 = r_1 + \beta_0 p_0 = \begin{pmatrix} 0 \\ 1/2 \end{pmatrix} + \frac{1}{4}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1/4 \\ 1/2 \end{pmatrix}$$
$$\alpha_1 = \frac{r_1^T r_1}{p_1^T A p_1} = \frac{2}{3}, \qquad x_2 = x_1 + \alpha_1 p_1 = \begin{pmatrix} 1/2 \\ 0 \end{pmatrix} + \frac{2}{3}\begin{pmatrix} 1/4 \\ 1/2 \end{pmatrix} = \begin{pmatrix} 2/3 \\ 1/3 \end{pmatrix}$$
$r_2 = 0$, so $x_2$ is the exact solution.
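The two steps above can be checked with a few lines of MATLAB/Octave. This is an illustration only (not part of the slides); it runs the recurrences from the algorithm slide directly:

  % Verify the worked 2x2 example by running two CG steps.
  A = [2 -1; -1 2]; b = [1; 0];
  x = [0; 0]; r = b - A*x; p = r;
  for k = 1:2
    t = A*p; alpha = (r'*r)/(p'*t);   % step length alpha_k
    x = x + alpha*p;                  % new iterate
    rnew = r - alpha*t;               % new residual
    beta = (rnew'*rnew)/(r'*r);       % beta_k
    p = rnew + beta*p; r = rnew;      % new search direction
  end
  disp(x)   % prints [2/3; 1/3]; the residual r is now zero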
Exact method and iterative method

Orthogonality of the residuals implies that $x_m$ is equal to the solution $x$ of $Ax = b$ for some $m \le n$.
For if $x_k \ne x$ for all $k = 0, 1, \ldots, n-1$, then $r_k \ne 0$ for $k = 0, 1, \ldots, n-1$, so $\{r_0, \ldots, r_{n-1}\}$ is an orthogonal basis for $\mathbb{R}^n$. But then $r_n \in \mathbb{R}^n$ is orthogonal to all vectors in $\mathbb{R}^n$, so $r_n = 0$ and hence $x_n = x$.
So the conjugate gradient method finds the exact solution in at most $n$ iterations.
The convergence analysis shows that $\|x - x_k\|_A$ typically becomes small quite rapidly and we can stop the iteration with $k$ much smaller than $n$.
It is this rapid convergence which makes the method interesting and in practice an iterative method.
Conjugate Gradient Algorithm

Algorithm (Conjugate Gradient Iteration). The positive definite linear system $Ax = b$ is solved by the conjugate gradient method. x is a starting vector for the iteration. The iteration is stopped when $\|r_k\|_2 / \|r_0\|_2 \le \mathrm{tol}$ or $k > \mathrm{itmax}$. itm is the number of iterations used.

  function [x,itm] = cg(A,b,x,tol,itmax)
    r = b - A*x; p = r; rho = r'*r;   % initial residual and search direction
    rho0 = rho;
    for k = 0:itmax
      if sqrt(rho/rho0) <= tol        % relative residual small enough?
        itm = k; return
      end
      t = A*p; a = rho/(p'*t);        % step length alpha_k
      x = x + a*p; r = r - a*t;       % update iterate and residual
      rhos = rho; rho = r'*r;
      p = r + (rho/rhos)*p;           % new search direction, beta_k = rho/rhos
    end
    itm = itmax + 1;
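As a usage illustration (my own example, not from the slides), calling cg on the 2 x 2 system from the example slide reproduces the hand computation:

  A = [2 -1; -1 2]; b = [1; 0];
  [x, itm] = cg(A, b, zeros(2,1), 1e-12, 20);
  % returns x = [2/3; 1/3] and itm = 2, as in the worked example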
A family of test problems

We can test the methods on the Kronecker sum matrix
$$A = C_1 \otimes I + I \otimes C_2 =
\begin{pmatrix}
C_1 & & & \\
& C_1 & & \\
& & \ddots & \\
& & & C_1
\end{pmatrix}
+
\begin{pmatrix}
cI & bI & & & \\
bI & cI & bI & & \\
& \ddots & \ddots & \ddots & \\
& & bI & cI & bI \\
& & & bI & cI
\end{pmatrix},$$
where $C_1 = \mathrm{tridiag}_m(a, c, a)$ and $C_2 = \mathrm{tridiag}_m(b, c, b)$.
Positive definite if $c > 0$ and $c \ge |a| + |b|$.
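For experiments it can be convenient to assemble A explicitly with sparse tools. This is an illustration only (not from the slides); the ordering of the two kron terms below is chosen so that, with MATLAB's kron convention, the result matches the 9 x 9 matrix on the next slide (diagonal 2c, a within blocks, b between blocks):

  m = 3; a = -1; b = -1; c = 2;      % Poisson values, as an example
  e  = ones(m,1);
  C1 = spdiags([a*e, c*e, a*e], -1:1, m, m);
  C2 = spdiags([b*e, c*e, b*e], -1:1, m, m);
  A  = kron(speye(m), C1) + kron(C2, speye(m));   % n = m^2 unknowns
  full(A)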
m = 3, n = 9

$$A = \begin{pmatrix}
2c & a & 0 & b & 0 & 0 & 0 & 0 & 0 \\
a & 2c & a & 0 & b & 0 & 0 & 0 & 0 \\
0 & a & 2c & 0 & 0 & b & 0 & 0 & 0 \\
b & 0 & 0 & 2c & a & 0 & b & 0 & 0 \\
0 & b & 0 & a & 2c & a & 0 & b & 0 \\
0 & 0 & b & 0 & a & 2c & 0 & 0 & b \\
0 & 0 & 0 & b & 0 & 0 & 2c & a & 0 \\
0 & 0 & 0 & 0 & b & 0 & a & 2c & a \\
0 & 0 & 0 & 0 & 0 & b & 0 & a & 2c
\end{pmatrix}$$

$b = a = -1$, $c = 2$: Poisson matrix
$b = a = 1/9$, $c = 5/18$: Averaging matrix
Averaging problem

$$\lambda_{j,k} = 2c + 2a\cos(j\pi h) + 2b\cos(k\pi h), \qquad j, k = 1, 2, \ldots, m.$$
$a = b = 1/9$, $c = 5/18$
$$\lambda_{\max} = \frac{5}{9} + \frac{4}{9}\cos(\pi h), \qquad \lambda_{\min} = \frac{5}{9} - \frac{4}{9}\cos(\pi h)$$
$$\mathrm{cond}_2(A) = \frac{\lambda_{\max}}{\lambda_{\min}} = \frac{5 + 4\cos(\pi h)}{5 - 4\cos(\pi h)} \le 9.$$
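A quick numerical check of this bound (an illustration only, reusing the sparse assembly sketch above with the averaging values):

  m = 20; h = 1/(m+1); a = 1/9; b = 1/9; c = 5/18;
  e  = ones(m,1);
  C1 = spdiags([a*e, c*e, a*e], -1:1, m, m);
  C2 = spdiags([b*e, c*e, b*e], -1:1, m, m);
  A  = kron(speye(m), C1) + kron(C2, speye(m));
  lam = eig(full(A));
  [max(lam)/min(lam), (5 + 4*cos(pi*h))/(5 - 4*cos(pi*h))]
  % the two numbers agree and are below 9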
2D formulation for test problems

$V = \mathrm{vec}(x)$, $R = \mathrm{vec}(r)$, $P = \mathrm{vec}(p)$ (the vectors reshaped as $m \times m$ matrices)
$$Ax = b \iff DV + VE = h^2 F,$$
$$D = \mathrm{tridiag}(a, c, a) \in \mathbb{R}^{m \times m}, \qquad E = \mathrm{tridiag}(b, c, b) \in \mathbb{R}^{m \times m}$$
$$\mathrm{vec}(Ap) = DP + PE$$
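The matrix form of the product can be checked against the assembled A from the earlier sketch (again an illustration only; reshape plays the role of vec):

  m = 4; e = ones(m,1); a = -1; b = -1; c = 2;
  D = spdiags([a*e, c*e, a*e], -1:1, m, m);
  E = spdiags([b*e, c*e, b*e], -1:1, m, m);
  A = kron(speye(m), D) + kron(E, speye(m));
  p = rand(m*m, 1); P = reshape(p, m, m);
  norm(A*p - reshape(D*P + P*E, m*m, 1))   % zero up to rounding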
Testing

Algorithm (Testing Conjugate Gradient). $A = \mathrm{trid}(a,c,a,m) \otimes I_m + I_m \otimes \mathrm{trid}(b,c,b,m) \in \mathbb{R}^{m^2 \times m^2}$

  function [V,it] = cgtest(m,a,b,c,tol,itmax)
    h = 1/(m+1); R = h*h*ones(m);       % right-hand side h^2*F with F = ones(m)
    D = sparse(tridiagonal(a,c,a,m)); E = sparse(tridiagonal(b,c,b,m));
    V = zeros(m,m); P = R; rho = sum(sum(R.*R)); rho0 = rho;
    for k = 1:itmax
      if sqrt(rho/rho0) <= tol
        it = k; return
      end
      T = D*P + P*E;                    % matrix form of A*p
      a = rho/sum(sum(P.*T)); V = V + a*P; R = R - a*T;
      rhos = rho; rho = sum(sum(R.*R)); P = R + (rho/rhos)*P;
    end
    it = itmax + 1;
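The helper tridiagonal is not shown on these slides. A minimal stand-in (my own sketch, an assumption about its interface inferred from the calls above) that makes cgtest runnable is:

  function T = tridiagonal(a, c, b, m)
    % m x m tridiagonal matrix: a on the subdiagonal, c on the diagonal,
    % b on the superdiagonal.
    e = ones(m,1);
    T = spdiags([a*e, c*e, b*e], -1:1, m, m);

With this in place, a call such as cgtest(50, 1/9, 1/9, 5/18, 1e-8, 100) runs the averaging problem on a 50 x 50 grid and should give iteration counts like those in the table on the next slide.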
The Averaging Problem

n   2 500   10 000   40 000   1 000 000   4 000 000
K   22      22       21       21          20

Table 1: The number of iterations K for the averaging problem on a $\sqrt{n} \times \sqrt{n}$ grid, with $x_0 = 0$ and $\mathrm{tol} = 10^{-8}$.

Both the condition number and the required number of iterations are independent of the size of the problem.
The convergence is quite rapid.
Poisson Problem

$$\lambda_{j,k} = 2c + 2a\cos(j\pi h) + 2b\cos(k\pi h), \qquad j, k = 1, 2, \ldots, m.$$
$a = b = -1$, $c = 2$
$$\lambda_{\max} = 4 + 4\cos(\pi h), \qquad \lambda_{\min} = 4 - 4\cos(\pi h)$$
$$\mathrm{cond}_2(A) = \frac{\lambda_{\max}}{\lambda_{\min}} = \frac{1 + \cos(\pi h)}{1 - \cos(\pi h)} = \cot\!\Bigl(\frac{\pi h}{2}\Bigr)^{2} \approx \frac{4}{\pi^2 h^2}.$$
Since $h = 1/(m+1)$ and $n = m^2$, this gives
$$\mathrm{cond}_2(A) = O(n).$$
The Poisson problem

n              2 500   10 000   40 000   160 000
K              140     294      587      1168
$K/\sqrt{n}$   1.86    1.87     1.86     1.85

Using CG in the form of Algorithm 8 with $\mathrm{tol} = 10^{-8}$ and $x_0 = 0$ we list K, the required number of iterations, and $K/\sqrt{n}$.
The results show that K is much smaller than n and appears to be proportional to $\sqrt{n}$.
This is the same speed as for SOR, and we don't have to estimate any acceleration parameter!
$\sqrt{n}$ is essentially the square root of the condition number of A.
Complexity

The work involved in each iteration is
1. one matrix times vector (t = Ap),
2. two inner products (p^T t and r^T r),
3. three vector-plus-scalar-times-vector updates (x = x + ap, r = r - at and p = r + (rho/rhos)p).
The dominating part of the computation is statement 1.
Note that for our test problems A has only O(5n) nonzero elements. Therefore, taking advantage of the sparseness of A we can compute t in O(n) flops. With such an implementation the total number of flops in one iteration is O(n).
More Complexity

How many flops do we need to solve the test problems by the conjugate gradient method to within a given tolerance?
Averaging problem: O(n) flops. Optimal for a problem with n unknowns.
Same as SOR and better than the fast method based on the FFT.
Discrete Poisson problem: O(n^{3/2}) flops,
the same as SOR and the fast method.
Cholesky Algorithm: O(n^2) flops both for the averaging and the Poisson problem.
Analysis and Derivation of the Method

Theorem 3 (Orthogonal Projection). Let $\mathcal{S}$ be a subspace of a finite dimensional real or complex inner product space $(\mathcal{V}, \mathbb{F}, \langle\cdot,\cdot\rangle)$. To each $x \in \mathcal{V}$ there is a unique vector $p \in \mathcal{S}$ such that
$$\langle x - p, s\rangle = 0, \quad \text{for all } s \in \mathcal{S}. \tag{1}$$

[Figure: x, its orthogonal projection $p = P_{\mathcal{S}} x$ onto $\mathcal{S}$, and the error $x - p$ orthogonal to $\mathcal{S}$.]
Best Approximation

Theorem 4 (Best Approximation). Let $\mathcal{S}$ be a subspace of a finite dimensional real or complex inner product space $(\mathcal{V}, \mathbb{F}, \langle\cdot,\cdot\rangle)$. Let $x \in \mathcal{V}$ and $p \in \mathcal{S}$. The following statements are equivalent:
1. $\langle x - p, s\rangle = 0$, for all $s \in \mathcal{S}$.
2. $\|x - s\| > \|x - p\|$ for all $s \in \mathcal{S}$ with $s \ne p$.
If $(v_1, \ldots, v_k)$ is an orthogonal basis for $\mathcal{S}$, then
$$p = \sum_{i=1}^{k} \frac{\langle x, v_i\rangle}{\langle v_i, v_i\rangle}\, v_i. \tag{2}$$
Derivation of CG

$Ax = b$, $A \in \mathbb{R}^{n \times n}$ is positive definite, $x, b \in \mathbb{R}^n$
$(x, y) := x^T y$ for $x, y \in \mathbb{R}^n$ (the usual inner product)
$\langle x, y\rangle := x^T A y = (x, Ay) = (Ax, y)$ (the A-inner product)
$\|x\|_A = \sqrt{x^T A x}$
$W_0 = \{0\}$, $W_1 = \mathrm{span}\{b\}$, $W_2 = \mathrm{span}\{b, Ab\}$, $\ldots$,
$W_k = \mathrm{span}\{b, Ab, A^2 b, \ldots, A^{k-1} b\}$
$W_0 \subset W_1 \subset W_2 \subset \cdots \subset W_k \subset \cdots$
$\dim(W_k) \le k$, and $w \in W_k \Rightarrow Aw \in W_{k+1}$
$x_k \in W_k$, $\langle x_k - x, w\rangle = 0$ for all $w \in W_k$
$$p_0 = r_0 := b, \qquad p_j = r_j - \sum_{i=0}^{j-1} \frac{\langle r_j, p_i\rangle}{\langle p_i, p_i\rangle}\, p_i, \quad j = 1, \ldots, k.$$
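A small step that is easy to miss here (my own remark, using only the definitions above): A-orthogonality of the error to $W_k$ is the same as ordinary orthogonality of the residual to $W_k$, since for $w \in W_k$
$$\langle x_k - x, w\rangle = (A(x_k - x), w) = (Ax_k - b, w) = -(r_k, w).$$
So the defining condition $\langle x_k - x, w\rangle = 0$ for all $w \in W_k$ holds exactly when $(r_k, w) = 0$ for all $w \in W_k$.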
Convergence

Theorem 5. Suppose we apply the conjugate gradient method to a positive definite system $Ax = b$. Then the A-norms of the errors satisfy
$$\frac{\|x - x_k\|_A}{\|x - x_0\|_A} \le 2\left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^{k}, \quad \text{for } k \ge 0,$$
where $\kappa = \mathrm{cond}_2(A) = \lambda_{\max}/\lambda_{\min}$ is the 2-norm condition number of A.

This theorem explains what we observed in the previous section, namely that the number of iterations is linked to $\sqrt{\kappa}$, the square root of the condition number of A. Indeed, the following corollary gives an upper bound for the number of iterations in terms of $\sqrt{\kappa}$.
Corollary 6. If for some $\epsilon > 0$ we have $k \ge \tfrac{1}{2}\sqrt{\kappa}\,\ln(2/\epsilon)$, then
$$\|x - x_k\|_A / \|x - x_0\|_A \le \epsilon.$$
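As a rough illustration (my own numbers, using the estimate $\kappa \approx 4/(\pi^2 h^2)$ for the Poisson problem, so $\sqrt{\kappa} \approx 2\sqrt{n}/\pi$): with $\epsilon = 10^{-8}$ the corollary guarantees that
$$k \approx \tfrac{1}{2}\sqrt{\kappa}\,\ln(2/\epsilon) \approx \tfrac{1}{2}\cdot\frac{2\sqrt{n}}{\pi}\cdot\ln(2\cdot 10^{8}) \approx 6.1\,\sqrt{n}$$
iterations suffice. This has the same $\sqrt{n}$ form as the iteration counts observed for the Poisson problem, with a larger constant, as one expects from an upper bound.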