Turkish Journal of Computer and Mathematics Education Vol. 12 No. 14 (2021), 5854–5865
Research Article
Abstract
The Conjugate Gradient Method is an iterative technique for solving large sparse systems
of linear equations. As a linear algebra and matrix manipulation technique, it is a useful
tool in approximating solutions to linearized partial differential equations. The
fundamental concepts are introduced and utilized as the foundation for the derivation of
the Conjugate Gradient Method.
1. Introduction
The Conjugate Gradient Method is an iterative technique for solving large sparse systems of linear equations [1–3]. As a linear algebra and matrix manipulation technique, it is a useful tool in approximating solutions to linearized partial differential equations. The fundamental concepts are introduced and utilized as the foundation for the derivation of the Conjugate Gradient Method [4–6]. Alongside the fundamental concepts, an intuitive geometric understanding of the Conjugate Gradient Method's details is included to add clarity [7], [8]. A detailed and rigorous analysis of the theorems which prove the Conjugate Gradient algorithm is presented. Extensions of the Conjugate Gradient Method through preconditioning the system in order to improve its efficiency are discussed [2], [5], [9–11].
Consider the problem of finding the vector $x$ that minimizes the scalar function $F(x) = \frac{1}{2}x^T A x - b^T x$, where the matrix $A$ is symmetric and positive definite. Because $F(x)$ is minimized when its gradient $\nabla F = Ax - b$ is zero, minimization is equivalent to solving $Ax = b$.
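As a quick numerical check of this equivalence, the following sketch compares the unconstrained minimizer of $F$ with the direct solution of $Ax = b$, using the 2×2 system that appears later in Example 3.2; the SciPy call and tolerance are illustrative choices, not part of the method:

```python
import numpy as np
from scipy.optimize import minimize

# The 2x2 symmetric positive definite system from Example 3.2 below.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])

F = lambda x: 0.5 * x @ A @ x - b @ x      # the quadratic form F(x)
x_min = minimize(F, np.zeros(2)).x         # numerical minimizer of F
x_lin = np.linalg.solve(A, b)              # direct solution of Ax = b

print(np.allclose(x_min, x_lin, atol=1e-5))  # True: the two agree
```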
Each iterative cycle $k$ computes a refined solution $x_{k+1} = x_k + \alpha_k d_k$. The step length $\alpha_k$ is chosen so that $x_{k+1}$ minimizes $f(x_{k+1})$ in the search direction $d_k$; that is, successive directions must satisfy

$$d_{k+1}^T A d_k = 0.$$

Substituting $x_{k+1}$ into $Ax = b$:

$$A(x_k + \alpha_k d_k) = b \quad (1)$$
$$Ax_k + \alpha_k A d_k = b$$
$$\alpha_k A d_k = b - Ax_k = r_k$$

Premultiplying by $d_k^T$:

$$\alpha_k d_k^T A d_k = d_k^T r_k$$
$$\alpha_k = \frac{d_k^T r_k}{d_k^T A d_k}$$

Intuition tells us to choose $d_k = -\nabla f = r_k$, since this is the direction of the largest negative change in $f(x)$ [1–2], [4–5], [7].
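A minimal numerical sketch of this step-length formula, scanning $\alpha$ along the steepest-descent direction to confirm that $\alpha_k = d_k^T r_k / (d_k^T A d_k)$ is where $f(x_k + \alpha d_k)$ is smallest; the matrix is the one from Example 2.2 below, and the scan grid is an arbitrary illustrative choice:

```python
import numpy as np

A = np.array([[5.0, -2.0, 0.0], [-2.0, 5.0, 1.0], [0.0, 1.0, 5.0]])
b = np.array([20.0, 10.0, -10.0])
f = lambda x: 0.5 * x @ A @ x - b @ x

x = np.zeros(3)
r = b - A @ x                     # residual r_0
d = r                             # steepest-descent choice d_0 = r_0
alpha = (d @ r) / (d @ (A @ d))   # closed-form step length

# Brute-force scan of f along the ray x + alpha*d for comparison.
grid = np.linspace(alpha - 0.2, alpha + 0.2, 4001)
alpha_scan = grid[np.argmin([f(x + a * d) for a in grid])]
print(alpha, abs(alpha_scan - alpha) < 1e-3)  # 0.3, True
```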
To find a new conjugate gradient coefficient (1) for systems of linear equations, we use the quadratic form:
Let $F(x) = \frac{1}{2}(x - x^*)^T A (x - x^*)$ be a quadratic function, and

$$F(x_{k+1}) = \frac{1}{2}(x_{k+1} - x^*)^T A (x_{k+1} - x^*),$$

where $x_{k+1} = x_k + \alpha_k d_k$. Then

$$F(x_{k+1}) = \frac{1}{2}(x_k + \alpha_k d_k - x^*)^T A (x_k + \alpha_k d_k - x^*)$$
$$= \frac{1}{2}(x_k^T A + \alpha_k d_k^T A - x^{*T} A)(x_k + \alpha_k d_k - x^*)$$
$$= \frac{1}{2}\left(x_k^T A x_k + \alpha_k x_k^T A d_k - x_k^T A x^* + \alpha_k d_k^T A x_k + \alpha_k^2 d_k^T A d_k - \alpha_k d_k^T A x^* - x^{*T} A x_k - \alpha_k x^{*T} A d_k + x^{*T} A x^*\right)$$

Differentiating with respect to $\alpha_k$ and setting the derivative to zero:

$$\frac{\partial F}{\partial \alpha_k} = \frac{1}{2}\left(x_k^T A d_k + d_k^T A x_k + 2\alpha_k d_k^T A d_k - d_k^T A x^* - x^{*T} A d_k\right) = 0$$
$$= \frac{1}{2}\left(2x_k^T A d_k + 2\alpha_k d_k^T A d_k - 2x^{*T} A d_k\right) = 0,$$

using the symmetry of $A$. Solving for $\alpha_k$:

$$\alpha_k^{new1} = \frac{x^{*T} A d_k - x_k^T A d_k}{d_k^T A d_k}$$

Then, since $Ax^* = b$ (so $x^{*T} A = b^T$),

$$\alpha_k^{new1} = \frac{b^T d_k - x_k^T A d_k}{d_k^T A d_k} = \frac{d_k^T (b - A x_k)}{d_k^T A d_k}.$$
The resulting algorithm is:

Step(1):- Choose an initial point $x_0$.
Step(2):- Compute $r_0 = b - Ax_0$.
Step(3):- $d_0 = r_0$
Step(4):- For $k = 0, 1, 2, \dots$ until convergence, compute
$$\alpha_k^{new1} = \frac{d_k^T (b - A x_k)}{d_k^T A d_k}$$
$$x_{k+1} = x_k + \alpha_k d_k$$
$$r_{k+1} = b - A x_{k+1}$$
$$\beta_k = \frac{-r_{k+1}^T A d_k}{d_k^T A d_k}$$
$$d_{k+1} = r_{k+1} + \beta_k d_k$$
Step(5):- End do
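A minimal NumPy sketch of Steps (1)–(5), assuming a symmetric positive definite $A$; the function name, tolerance, and iteration cap are illustrative choices, not part of the algorithm:

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=None):
    """Steps (1)-(5) with the step length alpha_k^{new1}."""
    x = x0.astype(float)                     # Step (1): initial point
    r = b - A @ x                            # Step (2): initial residual
    d = r.copy()                             # Step (3): first direction
    for _ in range(max_iter or len(b)):      # Step (4): iterate
        Ad = A @ d
        alpha = d @ (b - A @ x) / (d @ Ad)   # alpha_k^{new1}
        x = x + alpha * d
        r = b - A @ x
        if np.linalg.norm(r) < tol:          # converged
            break
        beta = -(r @ Ad) / (d @ Ad)          # beta_k
        d = r + beta * d                     # next search direction
    return x                                 # Step (5): end
```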
Example 2.2. Consider the system
$$\begin{pmatrix} 5 & -2 & 0 \\ -2 & 5 & 1 \\ 0 & 1 & 5 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 20 \\ 10 \\ -10 \end{pmatrix},$$
where $A = \begin{pmatrix} 5 & -2 & 0 \\ -2 & 5 & 1 \\ 0 & 1 & 5 \end{pmatrix}$ is positive definite and symmetric. For the conjugate gradient method, first set the starting point $x_0 = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$, so
$$r_0 = b - Ax_0 = \begin{pmatrix} 20 \\ 10 \\ -10 \end{pmatrix} - \begin{pmatrix} 5 & -2 & 0 \\ -2 & 5 & 1 \\ 0 & 1 & 5 \end{pmatrix}\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 20 \\ 10 \\ -10 \end{pmatrix}$$
$$r_0 = d_0 = \begin{pmatrix} 20 \\ 10 \\ -10 \end{pmatrix}$$
$$Ad_0 = \begin{pmatrix} 5 & -2 & 0 \\ -2 & 5 & 1 \\ 0 & 1 & 5 \end{pmatrix}\begin{pmatrix} 20 \\ 10 \\ -10 \end{pmatrix} = \begin{pmatrix} 80 \\ 0 \\ -40 \end{pmatrix}$$
$$\alpha_0^{new1} = \frac{d_0^T (b - A x_0)}{d_0^T A d_0} = \frac{400 + 100 + 100}{1600 + 400}$$
$$\alpha_0^{new1} = 0.3$$
$$x_1 = x_0 + \alpha_0 d_0 = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + 0.3\begin{pmatrix} 20 \\ 10 \\ -10 \end{pmatrix}$$
$$x_1 = \begin{pmatrix} 6 \\ 3 \\ -3 \end{pmatrix}$$
$$r_1 = b - Ax_1 = \begin{pmatrix} 20 \\ 10 \\ -10 \end{pmatrix} - \begin{pmatrix} 5 & -2 & 0 \\ -2 & 5 & 1 \\ 0 & 1 & 5 \end{pmatrix}\begin{pmatrix} 6 \\ 3 \\ -3 \end{pmatrix}$$
$$r_1 = \begin{pmatrix} -4 \\ 10 \\ 2 \end{pmatrix}$$
$$\beta_0 = \frac{-r_1^T A d_0}{d_0^T A d_0} = \frac{320 + 80}{1600 + 400}$$
$$\beta_0 = 0.2$$
$$d_1 = r_1 + \beta_0 d_0 = \begin{pmatrix} -4 \\ 10 \\ 2 \end{pmatrix} + 0.2\begin{pmatrix} 20 \\ 10 \\ -10 \end{pmatrix}$$
$$d_1 = \begin{pmatrix} 0 \\ 12 \\ 0 \end{pmatrix}$$
$$Ad_1 = \begin{pmatrix} 5 & -2 & 0 \\ -2 & 5 & 1 \\ 0 & 1 & 5 \end{pmatrix}\begin{pmatrix} 0 \\ 12 \\ 0 \end{pmatrix} = \begin{pmatrix} -24 \\ 60 \\ 12 \end{pmatrix}$$
$$\alpha_1^{new1} = \frac{d_1^T (b - A x_1)}{d_1^T A d_1} = \frac{120}{720}$$
$$\alpha_1 = 0.1666667$$
$$x_2 = x_1 + \alpha_1 d_1 = \begin{pmatrix} 6 \\ 3 \\ -3 \end{pmatrix} + 0.1666667\begin{pmatrix} 0 \\ 12 \\ 0 \end{pmatrix} = \begin{pmatrix} 6 \\ 3 \\ -3 \end{pmatrix} + \begin{pmatrix} 0 \\ 2.0000004 \\ 0 \end{pmatrix}$$
$$x_2 = \begin{pmatrix} 6 \\ 5.0000004 \\ -3 \end{pmatrix}$$
The exact solution is $x_1 = 6$, $x_2 = 5$ and $x_3 = -3$.
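Assuming the conjugate_gradient sketch given after the algorithm above (and its NumPy import), this result can be reproduced numerically:

```python
A = np.array([[5.0, -2.0, 0.0], [-2.0, 5.0, 1.0], [0.0, 1.0, 5.0]])
b = np.array([20.0, 10.0, -10.0])
print(conjugate_gradient(A, b, np.zeros(3)))  # approximately [ 6.  5. -3.]
```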
Solve the system
$$\begin{pmatrix} 10 & -2 & -1 & -1 \\ -2 & 10 & -1 & -1 \\ -1 & -1 & 10 & -2 \\ -1 & -1 & -2 & 10 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 3 \\ 15 \\ 27 \\ -9 \end{pmatrix}$$
by the Conjugate Gradient method with the initial point $x_0 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$.
First iteration:
Now we compute
$$r_0 = b - Ax_0 = \begin{pmatrix} 3 \\ 15 \\ 27 \\ -9 \end{pmatrix} - \begin{pmatrix} 10 & -2 & -1 & -1 \\ -2 & 10 & -1 & -1 \\ -1 & -1 & 10 & -2 \\ -1 & -1 & -2 & 10 \end{pmatrix}\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$
$$r_0 = d_0 = \begin{pmatrix} 3 \\ 15 \\ 27 \\ -9 \end{pmatrix}$$
$$Ad_0 = \begin{pmatrix} 10 & -2 & -1 & -1 \\ -2 & 10 & -1 & -1 \\ -1 & -1 & 10 & -2 \\ -1 & -1 & -2 & 10 \end{pmatrix}\begin{pmatrix} 3 \\ 15 \\ 27 \\ -9 \end{pmatrix} = \begin{pmatrix} -18 \\ 126 \\ 270 \\ -162 \end{pmatrix}$$
$$b - Ax_0 = \begin{pmatrix} 3 \\ 15 \\ 27 \\ -9 \end{pmatrix}$$
$$\alpha_0^{new1} = \frac{d_0^T (b - A x_0)}{d_0^T A d_0} = \frac{1044}{10584}$$
$$\alpha_0^{new1} = 0.09864$$
$$x_1 = x_0 + \alpha_0^{new1} d_0 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} + 0.09864\begin{pmatrix} 3 \\ 15 \\ 27 \\ -9 \end{pmatrix}$$
$$x_1 = \begin{pmatrix} 0.29592 \\ 1.4796 \\ 2.66328 \\ -0.88776 \end{pmatrix}$$
Second iteration:
$$r_1 = b - Ax_1 = \begin{pmatrix} 3 \\ 15 \\ 27 \\ -9 \end{pmatrix} - \begin{pmatrix} 10 & -2 & -1 & -1 \\ -2 & 10 & -1 & -1 \\ -1 & -1 & 10 & -2 \\ -1 & -1 & -2 & 10 \end{pmatrix}\begin{pmatrix} 0.29592 \\ 1.4796 \\ 2.66328 \\ -0.88776 \end{pmatrix}$$
$$r_1 = \begin{pmatrix} 4.77552 \\ 2.57136 \\ 0.3672 \\ 6.97968 \end{pmatrix}$$
$$\beta_0 = \frac{-r_1^T A d_0}{d_0^T A d_0} = \frac{793.53216}{10584}$$
$$\beta_0 = 0.07497$$
$$d_1 = r_1 + \beta_0 d_0 = \begin{pmatrix} 4.77552 \\ 2.57136 \\ 0.3672 \\ 6.97968 \end{pmatrix} + 0.07497\begin{pmatrix} 3 \\ 15 \\ 27 \\ -9 \end{pmatrix}$$
$$d_1 = \begin{pmatrix} 5.00043 \\ 3.69591 \\ 2.39139 \\ 6.30495 \end{pmatrix}$$
$$d_1^T A d_1 = \begin{pmatrix} 5.00043 & 3.69591 & 2.39139 & 6.30495 \end{pmatrix}\begin{pmatrix} 10 & -2 & -1 & -1 \\ -2 & 10 & -1 & -1 \\ -1 & -1 & 10 & -2 \\ -1 & -1 & -2 & 10 \end{pmatrix}\begin{pmatrix} 5.00043 \\ 3.69591 \\ 2.39139 \\ 6.30495 \end{pmatrix} = 555.86432$$
$$d_1^T (b - A x_1) = 78.26782$$
$$\alpha_1^{new1} = \frac{d_1^T (b - A x_1)}{d_1^T A d_1} = \frac{78.26782}{555.86432}$$
$$\alpha_1^{new1} = 0.14080$$
$$x_2 = x_1 + \alpha_1^{new1} d_1 = \begin{pmatrix} 0.29592 \\ 1.4796 \\ 2.66328 \\ -0.88776 \end{pmatrix} + 0.14080\begin{pmatrix} 5.00043 \\ 3.69591 \\ 2.39139 \\ 6.30495 \end{pmatrix}$$
$$x_2 = \begin{pmatrix} 0.99998 \\ 1.99998 \\ 2.99999 \\ -0.00002 \end{pmatrix}$$
Then the exact solution is
$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 0 \end{pmatrix}$$
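Under the same assumptions, the earlier conjugate_gradient sketch reproduces this 4×4 result; in exact arithmetic the method reaches the solution of an $n \times n$ system in at most $n$ iterations, and two sufficed here:

```python
A = np.array([[10.0, -2.0, -1.0, -1.0],
              [-2.0, 10.0, -1.0, -1.0],
              [-1.0, -1.0, 10.0, -2.0],
              [-1.0, -1.0, -2.0, 10.0]])
b = np.array([3.0, 15.0, 27.0, -9.0])
print(conjugate_gradient(A, b, np.zeros(4)))  # approximately [1. 2. 3. 0.]
```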
To find a new conjugate gradient coefficient (2) for systems of linear equations, we use the quadratic form:
Let $F(x) = \frac{1}{2}(x - x^*)^T A (x - x^*) + (x - x^*)^T b$ be a quadratic function, and

$$F(x_{k+1}) = \frac{1}{2}(x_{k+1} - x^*)^T A (x_{k+1} - x^*) + (x_{k+1} - x^*)^T b$$

With $x_{k+1} = x_k + \alpha_k d_k$, expanding as before gives

$$= \frac{1}{2}\left(x_k^T A x_k + \alpha_k x_k^T A d_k - x_k^T A x^* + \alpha_k d_k^T A x_k + \alpha_k^2 d_k^T A d_k - \alpha_k d_k^T A x^* - x^{*T} A x_k - \alpha_k x^{*T} A d_k + x^{*T} A x^*\right) + x_k^T b + \alpha_k d_k^T b - x^{*T} b$$

Differentiating with respect to $\alpha_k$:

$$\frac{\partial F}{\partial \alpha_k} = \frac{1}{2}\left(x_k^T A d_k + d_k^T A x_k + 2\alpha_k d_k^T A d_k - d_k^T A x^* - x^{*T} A d_k\right) + d_k^T b$$
$$= \frac{1}{2}\left(2x_k^T A d_k + 2\alpha_k d_k^T A d_k - 2x^{*T} A d_k\right) + d_k^T b = 0$$

Setting this to zero and using $x^{*T} A = b^T$, the terms $-x^{*T} A d_k$ and $d_k^T b$ cancel, leaving

$$\alpha_k^{new2} = \frac{-x_k^T A d_k}{d_k^T A d_k}$$
The resulting algorithm is:

Step(1):- Choose an initial point $x_0$.
Step(2):- Compute $r_0 = b - Ax_0$.
Step(3):- $d_0 = r_0$
Step(4):- For $k = 0, 1, 2, \dots$ until convergence, compute
$$\alpha_k^{new2} = \frac{-x_k^T A d_k}{d_k^T A d_k}$$
$$x_{k+1} = x_k + \alpha_k d_k$$
$$r_{k+1} = b - Ax_{k+1}$$
$$\beta_k = \frac{-r_{k+1}^T A d_k}{d_k^T A d_k}$$
$$d_{k+1} = r_{k+1} + \beta_k d_k$$
Step(5):- End do
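The loop is the same as in the earlier sketch; only the step-length line changes. A minimal variant under the same assumptions (illustrative names and tolerances, not a definitive implementation):

```python
import numpy as np

def conjugate_gradient_new2(A, b, x0, tol=1e-10, max_iter=None):
    """Steps (1)-(5) with the step length alpha_k^{new2}."""
    x = x0.astype(float)
    r = b - A @ x
    d = r.copy()
    for _ in range(max_iter or len(b)):
        Ad = A @ d
        alpha = -(x @ Ad) / (d @ Ad)   # alpha_k^{new2} = -x_k^T A d_k / d_k^T A d_k
        x = x + alpha * d
        r = b - A @ x
        if np.linalg.norm(r) < tol:
            break
        beta = -(r @ Ad) / (d @ Ad)
        d = r + beta * d
    return x
```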
Example 3.2. Consider the linear system $Ax = b$ given by
$$Ax = \begin{pmatrix} 4 & 1 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$$
with the initial point $x_0 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$; solve by the conjugate gradient method.
Our first step is to compute $r_0$ by the formula $r_0 = b - Ax_0$:
$$r_0 = b - Ax_0 = \begin{pmatrix} 1 \\ 2 \end{pmatrix} - \begin{pmatrix} 4 & 1 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix} - \begin{pmatrix} 9 \\ 5 \end{pmatrix} = \begin{pmatrix} -8 \\ -3 \end{pmatrix}$$
$$r_0 = d_0$$
$$Ad_0 = \begin{pmatrix} 4 & 1 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} -8 \\ -3 \end{pmatrix} = \begin{pmatrix} -35 \\ -17 \end{pmatrix}$$
$$\alpha_0^{new2} = \frac{-x_0^T A d_0}{d_0^T A d_0} = \frac{87}{331}$$
$$\alpha_0^{new2} = 0.2628$$
$$x_1 = x_0 + \alpha_0^{new2} d_0 = \begin{pmatrix} 2 \\ 1 \end{pmatrix} + 0.2628\begin{pmatrix} -8 \\ -3 \end{pmatrix}$$
$$x_1 = \begin{pmatrix} -0.1024 \\ 0.2116 \end{pmatrix}$$
$$r_1 = b - Ax_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix} - \begin{pmatrix} 4 & 1 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} -0.1024 \\ 0.2116 \end{pmatrix}$$
$$r_1 = \begin{pmatrix} 1.198 \\ 1.4676 \end{pmatrix}$$
$$\beta_0 = \frac{-r_1^T A d_0}{d_0^T A d_0} = \frac{66.8792}{331}$$
$$\beta_0 = 0.20205$$
$$d_1 = r_1 + \beta_0 d_0 = \begin{pmatrix} 1.198 \\ 1.4676 \end{pmatrix} + 0.20205\begin{pmatrix} -8 \\ -3 \end{pmatrix}$$
$$d_1 = \begin{pmatrix} -0.4184 \\ 0.86145 \end{pmatrix}$$
$$Ad_1 = \begin{pmatrix} 4 & 1 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} -0.4184 \\ 0.86145 \end{pmatrix} = \begin{pmatrix} -0.81215 \\ 2.16595 \end{pmatrix}$$
$$\alpha_1^{new2} = \frac{-x_1^T A d_1}{d_1^T A d_1} = \frac{-0.54148}{2.20566}$$
$$\alpha_1^{new2} = -0.2455$$
$$x_2 = x_1 + \alpha_1^{new2} d_1 = \begin{pmatrix} -0.1024 \\ 0.2116 \end{pmatrix} + (-0.2455)\begin{pmatrix} -0.4184 \\ 0.86145 \end{pmatrix}$$
$$x_2 = \begin{pmatrix} -0.39651 \\ -0.1487 \end{pmatrix}$$
$$r_2 = b - Ax_2 = \begin{pmatrix} 1 \\ 2 \end{pmatrix} - \begin{pmatrix} 4 & 1 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} -0.39651 \\ -0.1487 \end{pmatrix}$$
$$r_2 = \begin{pmatrix} 2.73474 \\ 2.84261 \end{pmatrix}$$
$$\beta_1 = \frac{-r_2^T A d_1}{d_1^T A d_1}$$
$$\beta_1 = -1.78447$$
$$d_2 = r_2 + \beta_1 d_1 = \begin{pmatrix} 2.73474 \\ 2.84261 \end{pmatrix} + (-1.78447)\begin{pmatrix} -0.4184 \\ 0.86145 \end{pmatrix}$$
$$d_2 = \begin{pmatrix} 3.48136 \\ 1.30538 \end{pmatrix}$$
$$Ad_2 = \begin{pmatrix} 4 & 1 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} 3.48136 \\ 1.30538 \end{pmatrix} = \begin{pmatrix} 15.23082 \\ 7.39752 \end{pmatrix}$$
$$\alpha_2^{new2} = \frac{-x_2^T A d_2}{d_2^T A d_2} = \frac{7.13918}{62.68052}$$
$$\alpha_2^{new2} = 0.1139$$
$$x_3 = x_2 + \alpha_2^{new2} d_2 = \begin{pmatrix} -0.39651 \\ -0.1487 \end{pmatrix} + 0.1139\begin{pmatrix} 3.48136 \\ 1.30538 \end{pmatrix}$$
$$x_3 = \begin{pmatrix} 0.00002 \\ -0.00002 \end{pmatrix}$$
4. References
[1] Y.-H. Dai and Y. Yuan, “A nonlinear conjugate gradient method with a strong
global convergence property,” SIAM J. Optim., vol. 10, no. 1, pp. 177–182, 1999.
[2] R. Fletcher and C. M. Reeves, “Function minimization by conjugate gradients,”
Comput. J., vol. 7, no. 2, pp. 149–154, 1964.
[3] J. C. Gilbert and J. Nocedal, “Global convergence properties of conjugate gradient
methods for optimization,” SIAM J. Optim., vol. 2, no. 1, pp. 21–42, 1992.
[4] M. R. Hestenes and E. Stiefel, Methods of conjugate gradients for solving linear
systems, vol. 49, no. 1. NBS Washington, DC, 1952.
[5] E. Polak and G. Ribiere, “Note sur la convergence de méthodes de directions
conjuguées,” ESAIM Math. Model. Numer. Anal. Mathématique Anal. Numérique,
vol. 3, no. R1, pp. 35–43, 1969.
[6] G. Yu and L. Guan, “Modified PRP methods with sufficient descent property and their convergence properties,” Acta Sci. Nat. Univ. Sunyatseni, vol. 4, 2006.
[7] R. Fletcher, Practical Methods of Optimization. New York: John Wiley & Sons, 1987.
[8] P. Armand, “Modification of the Wolfe line search rules to satisfy the descent
condition in the Polak-Ribière-Polyak conjugate gradient method,” J. Optim.
Theory Appl., vol. 132, no. 2, pp. 287–305, 2007.
[9] “mendeley-reference-manager-2.” .
[10] P. Concus, G. H. Golub, and D. P. O’Leary, “A generalized conjugate gradient
method for the numerical solution of elliptic partial differential equations,” in
Sparse matrix computations, Elsevier, 1976, pp. 309–332.
[11] J. W. Daniel, “The conjugate gradient method for linear and nonlinear operator
equations,” SIAM J. Numer. Anal., vol. 4, no. 1, pp. 10–26, 1967.