An Introduction of Multigrid Methods For Large-Scale Computation
Chin-Tien Wu
National Center for Theoretical Sciences
National Tsing-Hua University
01/24/2005
How Large Are Real Simulations?
Large-scale simulation of polymer electrolyte fuel cells by parallel computing (Hua Meng and Chao-Yang Wang, 2004): FEM model with O(10^6) nodes.
Three-dimensional finite element modeling of the human ear for sound transmission (R. Z. Gan, B. Feng and Q. Sun, 2004): FEM model with 10^5 ~ 10^6 nodes.
Car engine: O(10^5) nodes. Commercial aircraft: 10^7 nodes. FinFET transistor: 10^5 nodes.
What do we need in order to simulate?
• Deep understanding of the physical problems
• Good mathematical models.
• Good computable mathematical models.
• Computation grids (not necessary, but...)
• Discretizations
• Solve linear systems
• Solve linear systems fast!
Our goal is to introduce multigrid methods for solving sparse linear systems.
Why multigrid?
1. How do we iterate?
2. For what category of matrices A does the iteration converge?
3. What is the convergence rate?
Some definitions:
A is irreducible if there is no permutation P such that $P^T A P = \begin{bmatrix} A_{1,1} & A_{1,2} \\ 0 & A_{2,2} \end{bmatrix}$.
A is non-negative (denoted A ≥ 0) if $a_{i,j} \ge 0$ for all $1 \le i,j \le n$.
A is an M-matrix if A is nonsingular, $a_{i,j} \le 0$ for $i \ne j$, and $A^{-1} \ge 0$.
A is irreducibly diagonally dominant if A is irreducible and diagonally dominant, with $|a_{i,i}| > \sum_{j=1,\,j\ne i}^{n} |a_{i,j}|$ for some i.
Perron-Frobenius Theorem
Theorem: Let A ≥ 0 be an irreducible matrix. Then
1) A has a positive real eigenvalue equal to its spectral radius ρ(A);
2) to ρ(A) there corresponds a positive eigenvector;
3) ρ(A) is a simple eigenvalue of A;
4) ρ(A) increases when any entry of A increases.
Stein-Rosenberg Theorem
Theorem: Let $B_J = L + U$ be the Jacobi matrix and $B_{GS} = (I - L)^{-1} U$ be the Gauss-Seidel matrix. Then one and only one of the following relations is valid:
1) $\rho(B_J) = \rho(B_{GS}) = 0$.
2) $0 < \rho(B_{GS}) < \rho(B_J) < 1$.
3) $\rho(B_J) = \rho(B_{GS}) = 1$.
4) $1 < \rho(B_J) < \rho(B_{GS})$.
Convergence of Jacobi, Gauss-Seidel and
SOR Iterative Methods
Lemma 1. If $A = (a_{i,j}) \ge 0$ is irreducible, then either $\sum_{j=1}^{n} a_{i,j} = \rho(A)$ for every i, or
$$\min_{1\le i\le n} \sum_{j=1}^{n} a_{i,j} \;<\; \rho(A) \;<\; \max_{1\le i\le n} \sum_{j=1}^{n} a_{i,j} \qquad -------- (1).$$
Proof: Case (1): All row sums of A are equal (= σ): let $\zeta = [1,1,\ldots,1]^T$. Clearly $A\zeta = \sigma\zeta$ and $\sigma \le \rho(A)$. However, Gershgorin's theorem implies $\rho(A) \le \sigma$. Hence $\rho(A) = \sigma$.
Case (2): Not all row sums of A are equal: construct $B = (b_{i,j}) \ge 0$ and $C = (c_{i,j}) \ge 0$ by decreasing and increasing some entries of A, respectively, such that
$$\sum_{j=1}^{n} b_{\ell,j} = \alpha = \min_{1\le i\le n}\sum_{j=1}^{n} a_{i,j} \quad\text{and}\quad \sum_{j=1}^{n} c_{\ell,j} = \beta = \max_{1\le i\le n}\sum_{j=1}^{n} a_{i,j}, \quad\text{for all } 1 \le \ell \le n.$$
By Case (1) and the monotonicity of the spectral radius for non-negative irreducible matrices (Perron-Frobenius), $\alpha = \rho(B) < \rho(A) < \rho(C) = \beta$, which is (1).
Proof: Recall that $E_J = I - D^{-1}A = D^{-1}(L + U) = (b_{i,j})$, where
$$b_{i,j} = \begin{cases} 0 & i = j, \\ -\dfrac{a_{i,j}}{a_{i,i}} & i \ne j. \end{cases}$$
From Lemma 2, it is clear that $\rho(E_J) \le \rho(|E_J|)$. Since A is strictly diagonally dominant, every row sum of $|E_J|$ satisfies $\sum_{j=1}^{n} |b_{i,j}| < 1$, so $\rho(|E_J|) < 1$ by Lemma 1 and the Jacobi iteration converges.
For the Gauss-Seidel matrix, since L is strictly lower triangular, $(I - |L|)^{-1} = I + |L| + |L|^2 + \cdots + |L|^{n-1}$, and therefore
$$\left|(I - L)^{-1} U\right| \;\le\; \left(I + |L| + |L|^2 + \cdots + |L|^{n-1}\right)|U| \;=\; (I - |L|)^{-1}|U|.$$
Now consider $B_J = |L| + |U|$ and $B_{GS} = (I - |L|)^{-1}|U|$. Since we have already shown $\rho(B_J) < 1$, the Stein-Rosenberg theorem implies $\rho(B_{GS}) < 1$.
Therefore, we conclude that the Gauss-Seidel iterative method converges.
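As a small numerical illustration of these statements (my own addition, not part of the lecture), the sketch below builds the iteration matrices for the 1-D Laplacian, which is an irreducibly diagonally dominant M-matrix, and compares their spectral radii; the matrix size is an arbitrary choice.

```python
import numpy as np

def iteration_matrices(A):
    """Return the Jacobi and Gauss-Seidel iteration matrices of A."""
    D = np.diag(np.diag(A))
    L = -np.tril(A, -1)                      # strictly lower part, sign flipped
    U = -np.triu(A, 1)                       # strictly upper part, sign flipped
    B_J = np.linalg.solve(D, L + U)          # D^{-1}(L + U)
    B_GS = np.linalg.solve(D - L, U)         # (D - L)^{-1} U
    return B_J, B_GS

def spectral_radius(M):
    return max(abs(np.linalg.eigvals(M)))

# 1-D Laplacian (scaled): irreducibly diagonally dominant M-matrix
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

B_J, B_GS = iteration_matrices(A)
rho_J, rho_GS = spectral_radius(B_J), spectral_radius(B_GS)
# Stein-Rosenberg case 2): 0 < rho(B_GS) < rho(B_J) < 1; here rho_GS is close to rho_J**2
print(rho_J, rho_GS)
```

For this consistently ordered model problem one observes $\rho(B_{GS}) \approx \rho(B_J)^2$, consistent with case 2) of the Stein-Rosenberg theorem.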
Theorem 2. Let $A = D - E - E^*$ and D be Hermitian matrices, where D is positive definite and $D - \omega E$ is non-singular for $0 \le \omega \le 2$. Let $E_{SOR} = I - \omega\left(D - \omega E\right)^{-1} A$. Then $\rho(E_{SOR}) < 1$ if and only if A is positive definite and $0 < \omega < 2$.
Using similar arguments, one can show that the converse is also true.
• For SOR, it is possible to determine the optimal value of ω for a special type of matrices (p-cyclic). The optimal value $\omega_b$ is precisely specified as the unique positive root $\omega_b < p/(p-1)$ of the equation
$$\left[\rho(E_J)\,\omega_b\right]^{p} = p^{p}\,(p-1)^{1-p}\,(\omega_b - 1) \qquad (\text{Varga } 1959).$$
• For p = 2,
$$\omega_b = 1 + \left(\frac{\rho(E_J)}{1 + \sqrt{1 - \rho^2(E_J)}}\right)^{2} \qquad (\text{Young } 1950).$$
Convergence:
$$\|e_m\| \le \left(\frac{2\,(\omega_b - 1)^{m/2}}{1 + (\omega_b - 1)^{m}}\right)\|e_0\|, \qquad\text{here } \omega_b = \frac{2}{1 + \sqrt{1 - \rho^2}}.$$
The convergence rate is accelerated as ρ→0.
Remark: There are cases in which polynomial acceleration does not improve the asymptotic rate of convergence.
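To make the p = 2 formulas concrete, here is a small illustrative check (my own addition): it computes ρ(E_J) for the 1-D Laplacian, evaluates Young's ω_b, and verifies that the SOR spectral radius at ω_b is close to ω_b − 1; the grid size is arbitrary.

```python
import numpy as np

n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
D = np.diag(np.diag(A))
L = -np.tril(A, -1)
U = -np.triu(A, 1)

rho_J = max(abs(np.linalg.eigvals(np.linalg.solve(D, L + U))))
omega_b = 2.0 / (1.0 + np.sqrt(1.0 - rho_J**2))   # Young's optimal SOR parameter (p = 2)

def rho_sor(omega):
    # SOR iteration matrix: (D - omega*L)^{-1} ((1 - omega)*D + omega*U)
    M = np.linalg.solve(D - omega * L, (1.0 - omega) * D + omega * U)
    return max(abs(np.linalg.eigvals(M)))

# At omega_b the spectral radius should drop to about omega_b - 1
print(omega_b, rho_sor(omega_b), omega_b - 1.0)
```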
Some PDE and Finite Element Analysis
Finite Element Solutions:
Let $\Im_h$ be a given triangulation, $V_h = \left\{\, v \in H_0^1 : v|_T \in P_1(T),\ T \in \Im_h \,\right\}$, and let $\pi_h : H^1 \to V_h$ be the interpolation defined by $\pi_h(u)(N_i) = u(N_i)$.
Consider the weak solution $u \in H^1$ satisfying $a(u, v) = (f, v)$ for all $v \in H_0^1$; the finite element solution $u_h \in V_h$ satisfies $a(u_h, v) = (f, v)$ for all $v \in V_h$.
Interpolation Errors:
$\|u\|_2 \le C\,\|f\|_0$
Finite Element Solution is Quasi-Optimal
Céa Theorem: $\|u - u_h\|_{H^1} \le \dfrac{C}{\alpha}\,\min_{v\in V_h}\|u - v\|_{H^1}$
Proof:
Step 1 (Galerkin orthogonality): $a(u - u_h, v) = 0$ for all $v \in V_h$.
Step 2:
$$\alpha\,\|u - u_h\|_{H^1}^2 \le a(u - u_h, u - u_h) = a(u - u_h, u - v) + a(u - u_h, v - u_h) = a(u - u_h, u - v) \le C\,\|u - u_h\|_{H^1}\,\|u - v\|_{H^1},$$
where $\|v\|_A = \sqrt{a(v, v)}$ is the energy norm.
Proof: From the interpolation estimate and the Céa theorem, (i) is trivial.
To prove (ii), we use the duality argument. Let w be the solution to the adjoint problem $a(v, w) = (u - u_h, v)$ for all $v \in H^1$. Choosing $v = u - u_h$, we have
$$(u - u_h,\, u - u_h) = a(u - u_h, w) = a(u - u_h, w - w_h) \quad\text{for any } w_h \in V_h$$
$$\le C\,\|u - u_h\|_{H^1}\,\|w - w_h\|_{H^1} \le Ch\,\|u - u_h\|_{H^1}\,\|w\|_2 \le Ch\,\|u - u_h\|_{H^1}\,\|u - u_h\|_{L^2}.$$
Therefore, $\|u - u_h\|_{L^2} \le Ch^{r+1}\,\|u\|_{r+1}$.
Definition: the mesh-dependent norms are $\|v\|_{k,h} = \left(A_h^{k} v,\, v\right)_h^{1/2}$ for $v \in V_h$, $k = 0, 1$, where $(v, w)_h = \sum_i h^2\, v(N_i)\, w(N_i)$. Clearly, $\|v\|_{0,h} \equiv \|v\|_0$ and $\|v\|_{1,h} \equiv \|v\|_A$.
Lemma 3: $\Lambda(A_h) \le Ch^{-2}$.
Proof: Let λ be an eigenvalue of $A_h$ with eigenvector φ. Then
$$a(\varphi, \varphi) = \left(A_h\varphi, \varphi\right)_h = \lambda\left(\varphi, \varphi\right)_h = \lambda\,\|\varphi\|_{0,h}^2,$$
so
$$\lambda \le \frac{C\,\|\varphi\|_A^2}{\|\varphi\|_{0,h}^2} \le \frac{Ch^{-2}\,\|\varphi\|_0^2}{\|\varphi\|_{0,h}^2} \le Ch^{-2}.$$
[Figure: mesh refinement vs. MG coarsening; restriction $I_k^{k-1}$ and prolongation $I_{k-1}^{k}$ transfer data between levels in the MG V-cycle.]
Why Does Multigrid Work?
1. Relaxation methods converge slowly but smooth the error quickly.
Ex1: consider $Lu = -u'' = \lambda u$; the finite difference discretization gives
$$\frac{-u_{j-1} + 2u_j - u_{j+1}}{h^2} = \lambda u_j.$$
Eigenvalues $\lambda_k = \dfrac{4}{h^2}\sin^2\!\left(\dfrac{k\pi}{2(N+1)}\right)$ and eigenvectors $\varphi_j^k = \sin\!\left(\dfrac{kj\pi}{N+1}\right)$;
here $k = 1, \ldots, N$ is the wave number and j is the node number.
Richardson relaxation: $E_R = \left(I - \sigma^{-1}A\right)$, where $A_h = \dfrac{1}{h^2}\,\mathrm{tridiag}[-1\ 2\ -1]$.
Fourier analysis: choosing $\sigma = \dfrac{4}{h^2}$ (the largest eigenvalue),
$$e_m = \sum_{k=1}^{N}\left(1 - \frac{\lambda_k}{\sigma}\right)^{m}\varphi^{k} = \sum_{k=1}^{N}\left(1 - \sin^2\!\left(\frac{k\pi}{2(N+1)}\right)\right)^{m}\varphi^{k} = \sum_{k=1}^{N}\alpha_k^{m}\,\varphi^{k}.$$
Example initial error:
$$e^0_j = \sin\!\left(\frac{2j\pi}{N+1}\right) + \frac{1}{2}\sin\!\left(\frac{16j\pi}{N+1}\right) + \frac{1}{2}\sin\!\left(\frac{32j\pi}{N+1}\right), \qquad e^{m} = \left(I - \omega D^{-1}A\right)^{m} e^{0}.$$
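The smoothing effect in Ex1 can be checked numerically. The sketch below is my own illustration (ω = 2/3 and the grid size are arbitrary choices): it applies a few weighted-Jacobi sweeps to the error above and reports how much each Fourier mode is damped.

```python
import numpy as np

N = 63                                   # interior nodes (example size)
h = 1.0 / (N + 1)
j = np.arange(1, N + 1)
A = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2

def mode(k):
    return np.sin(k * j * np.pi / (N + 1))

# initial error: one smooth mode plus two oscillatory modes (as on the slide)
e = mode(2) + 0.5 * mode(16) + 0.5 * mode(32)

omega = 2.0 / 3.0                        # damped-Jacobi parameter
Dinv = h**2 / 2.0                        # inverse of the diagonal of A
for _ in range(5):                       # a few relaxation sweeps on A e = 0
    e = e - omega * Dinv * (A @ e)

def coeff(v, k):
    """Discrete sine coefficient of mode k."""
    return 2.0 / (N + 1) * np.dot(v, mode(k))

# the smooth mode k=2 barely changes; the oscillatory modes k=16, 32 shrink much faster
print([round(coeff(e, k), 4) for k in (2, 16, 32)])
```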
2. Smooth error modes are more oscillatory on coarse grids. Smooth errors can
be better corrected by relaxation on coarser grids.
$$E_c = I - I_H^h A_H^{-1} I_h^H A_h = \left(A_h^{-1} - I_H^h A_H^{-1} I_h^H\right) A_h,$$
• $A_H = I_h^H A_h I_H^h$ and $I_h^H = c\,\left(I_H^h\right)^T$ (Galerkin formulation) $\;\Rightarrow\; I_h^H A_h E_c = 0$ and $E_c I_H^h = 0$
$$\Rightarrow\ \begin{cases} E_c \text{ is an } A\text{-orthogonal projection: } \left(A_h E_c e,\ I_H^h e_H\right) = 0 \text{ for all } e_H, \\[4pt] N(E_c) = R\!\left(I_H^h\right), \\[4pt] R(E_c) = N\!\left(I_h^H A_h\right) \text{ and } E_c \text{ is the identity on } N\!\left(I_h^H A_h\right). \end{cases}$$
A Picture That Shows How Multigrid Works!
[Figure: the error is decomposed along $R\!\left(I_H^h\right)$ and $N\!\left(I_h^H A_h\right)$; relaxation reduces the component in $N\!\left(I_h^H A_h\right)$, while coarse-grid correction removes the component in $R\!\left(I_H^h\right)$.]
Consider $I_H^h = \left[\tfrac{1}{2},\, 1,\, \tfrac{1}{2}\right]^T$ (linear interpolation) and $I_h^H = \left[\tfrac{1}{2},\, 1,\, \tfrac{1}{2}\right]$.
It is easy to check that $A_H = I_h^H A_h I_H^h$ is the discretization of L on $\Im_H$.
Now, for any $v \in V_h$, let $f_v = A_h v$. One can consider v and $v_H = I_H^h A_H^{-1} I_h^H f_v$ as finite element approximations of $\hat v$, the solution of $a(\hat v, w) = (f_v, w)$. Then, from the FEM error estimate and $H^2$-regularity, we have
$$\left\|E_c(v)\right\|_k = \left\|\left(A_h^{-1} - I_H^h A_H^{-1} I_h^H\right)(A_h v)\right\|_k = \left\|\hat v - v_H - (\hat v - v)\right\|_k \qquad ----- (\ast)$$
$$\le Ch^{2-k}\,\|\hat v\|_2 \le Ch^{2-k}\,\|f_v\|_0 = Ch^{2-k}\,\|A_h v\|_0.$$
Consider the eigenfunction $\varphi_j^k$ with $k \ll N$ (a smooth mode); $\varphi_j^k$ is also an eigenfunction of $A_H$.
We have $\left\|E_c\!\left(\varphi_j^k\right)\right\|_1 \le Ch\,\lambda_k = O(h)$. This concludes the coarse-grid correction estimate.
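The claim that $A_H = I_h^H A_h I_H^h$ reproduces the coarse discretization can be verified directly. The sketch below is illustrative only; the weighting constant 1/2 in the restriction (i.e., c = 1/2, full weighting) is my assumption so that the Galerkin product matches the coarse finite-difference Laplacian exactly.

```python
import numpy as np

def laplacian_1d(n, h):
    """Finite-difference 1-D Laplacian (1/h^2) tridiag[-1 2 -1] on n interior nodes."""
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

n_c = 7                          # coarse interior nodes
n_f = 2 * n_c + 1                # fine interior nodes
h = 1.0 / (n_f + 1)
H = 2.0 * h

# prolongation by linear interpolation: stencil [1/2, 1, 1/2]
P = np.zeros((n_f, n_c))
for i in range(n_c):
    P[2 * i,     i] = 0.5
    P[2 * i + 1, i] = 1.0
    P[2 * i + 2, i] = 0.5

R = 0.5 * P.T                    # full-weighting restriction: I_h^H = c (I_H^h)^T with c = 1/2

A_h = laplacian_1d(n_f, h)
A_H = R @ A_h @ P                # Galerkin coarse operator

# A_H coincides with the coarse-grid discretization (1/H^2) tridiag[-1 2 -1]
print(np.allclose(A_H, laplacian_1d(n_c, H)))
```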
Multigrid iteration $MG_k(w_k, g_k)$ (approximately solve $A_k x_k = g_k$ starting from $w_k$):
1. If k is the coarsest level, solve $A_k x_k = g_k$ exactly and set $MG_k(w_k, g_k) = x_k$; otherwise:
2. (Pre-smoothing) $x_k = w_k + M_k^{-1}\left(g_k - A_k w_k\right)$.
3. Restrict the residual: $g_{k-1} = I_k^{k-1}\left(g_k - A_k x_k\right)$.
4. Set $q_0 = 0$ and $q_i = MG_{k-1}(q_{i-1}, g_{k-1})$ for $i = 1, \ldots, m$ (m = 1 gives the V-cycle, m = 2 the W-cycle).
5. Prolongate: $q_m \leftarrow I_{k-1}^{k} q_m$.
6. Set $x_k = x_k + q_m$.
7. (Post-smoothing) $x_k = x_k + M_k^{-1}\left(g_k - A_k x_k\right)$.
8. Set $MG_k(w_k, g_k) = x_k$.
$$E_{mg} = \left(A_h^{-1} - I_H^h A_H^{-1} I_h^H\right) A_h\, E_s \quad\text{(pre-smoothing only)}, \qquad E_{mg} = E_s\left(I - I_H^h A_H^{-1} I_h^H A_h\right) \quad\text{(post-smoothing only)}.$$
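The recursion above can be written out compactly. The following V-cycle sketch for the 1-D model problem is my own illustration, assuming weighted-Jacobi smoothing, full-weighting restriction, and linear interpolation (arbitrary but standard choices); it is not the lecture's exact pseudocode.

```python
import numpy as np

def laplacian(n, h):
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def prolongation(n_c):
    n_f = 2 * n_c + 1
    P = np.zeros((n_f, n_c))
    for i in range(n_c):
        P[2 * i:2 * i + 3, i] = [0.5, 1.0, 0.5]
    return P

def smooth(A, x, b, sweeps=2, omega=2.0 / 3.0):
    Dinv = 1.0 / np.diag(A)
    for _ in range(sweeps):
        x = x + omega * Dinv * (b - A @ x)
    return x

def v_cycle(A, b, x, level):
    if level == 0 or A.shape[0] <= 3:
        return np.linalg.solve(A, b)           # coarsest-grid solve
    x = smooth(A, x, b)                        # pre-smoothing
    n_c = (A.shape[0] - 1) // 2
    P = prolongation(n_c)
    R = 0.5 * P.T
    A_c = R @ A @ P                            # Galerkin coarse operator
    r_c = R @ (b - A @ x)                      # restricted residual
    e_c = v_cycle(A_c, r_c, np.zeros(n_c), level - 1)
    x = x + P @ e_c                            # coarse-grid correction
    return smooth(A, x, b)                     # post-smoothing

n = 2**7 - 1
h = 1.0 / (n + 1)
A = laplacian(n, h)
b = np.ones(n)
x = np.zeros(n)
for it in range(10):
    x = v_cycle(A, b, x, level=5)
    print(it, np.linalg.norm(b - A @ x))       # residual drops by a roughly h-independent factor
```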
Multigrid Cycles
Multigrid V-cycle convergence for the 1-D Laplacian.
Results provided by 曾昱豪 at NCTU.
MG Convergence
Smoothing property: $\left\|A_l E_s^{m}\right\| \le \eta(m)\,\|A_l\|$, for all $0 \le m < \infty$ and $l > 0$.
Approximation property: $\left\|A_l^{-1} - I_H^h A_{l-1}^{-1} I_h^H\right\| \le C\,\|A_l\|^{-1}$, for all $l > 0$.
For Richardson relaxation, $E_s = I - \Lambda^{-1}A_h$ with $\Lambda \ge \rho(A_h)$, and
$$\left(A_h E_s^{m} v_s,\ v_s\right) \le \Lambda\,\sup_{0\le x\le 1}\left\{(1 - x)^{m} x\right\}\left(v_s,\ v_s\right) \le Ch^{-2}\,\frac{1}{m}\,\|v_s\|_0^2,$$
so the smoothing property holds with $\eta(m) = C/m$.
$$A = \begin{bmatrix} T & -I & & \\ -I & T & -I & \\ & -I & T & -I \\ & & -I & T \end{bmatrix}, \qquad T = \mathrm{tridiag}[-1\ 4\ -1]:$$
$$\hat\varepsilon_\theta = \frac{1}{(n+1)^2}\sum_{1\le k,l\le n}\varepsilon_{k,l}\,\varphi_{k,l}(-\theta), \quad\text{so that}\quad \varepsilon_{i,j} = \sum_{\theta\in\Theta_n}\hat\varepsilon_\theta\,\varphi_{i,j}(\theta) \;\;-----\;(\mathrm{i}),$$
and
$$\Theta_n = \left\{\frac{2\pi}{n+1}(k, l)\ :\ -\frac{n+1}{2} \le k, l \le \frac{n+3}{2}\right\},\qquad n \text{ odd}.$$
Similarly, the error $\bar\varepsilon$ obtained after relaxation can be written as $\bar\varepsilon_{i,j} = \sum_{\theta\in\Theta_n}\hat{\bar\varepsilon}_\theta\,\varphi_{i,j}(\theta)$ ----- (ii). Let $\lambda(\theta) \equiv \dfrac{\hat{\bar\varepsilon}_\theta}{\hat\varepsilon_\theta}$. Brandt's smoothing factor is defined as
$$\rho = \sup\left\{\,|\lambda(\theta)|\ :\ \frac{\pi}{2} \le \max_{k=1,2}|\theta_k| \le \pi\right\}.$$
Smoothing Factor of Damped Jacobi Iteration
Recall that
$$\bar\varepsilon_{i,j} = \varepsilon_{i,j} - \frac{\omega}{4}\left(4\varepsilon_{i,j} - \left(\varepsilon_{i+1,j} + \varepsilon_{i-1,j} + \varepsilon_{i,j+1} + \varepsilon_{i,j-1}\right)\right).$$
Plugging (i) and (ii) into this, we have
$$\sum_{\theta\in\Theta_n}\hat{\bar\varepsilon}_\theta\,\varphi_{i,j}(\theta) = \sum_{\theta\in\Theta_n}\left\{\hat\varepsilon_\theta\varphi_{i,j}(\theta) - \frac{\omega}{4}\left[4\hat\varepsilon_\theta\varphi_{i,j}(\theta) - \hat\varepsilon_\theta\left(\varphi_{i+1,j}(\theta) + \varphi_{i-1,j}(\theta) + \varphi_{i,j+1}(\theta) + \varphi_{i,j-1}(\theta)\right)\right]\right\}$$
$$= \sum_{\theta\in\Theta_n}\hat\varepsilon_\theta\left\{\varphi_{i,j}(\theta) - \frac{\omega}{4}\left[4\varphi_{i,j}(\theta) - \varphi_{i,j}(\theta)e^{i\theta_1} - \varphi_{i,j}(\theta)e^{-i\theta_1} - \varphi_{i,j}(\theta)e^{i\theta_2} - \varphi_{i,j}(\theta)e^{-i\theta_2}\right]\right\} = \sum_{\theta\in\Theta_n}\hat\varepsilon_\theta\left\{1 - \omega\left(1 - \frac{\cos\theta_1 + \cos\theta_2}{2}\right)\right\}\varphi_{i,j}(\theta).$$
Therefore,
$$\lambda(\theta) = 1 - \omega\left(1 - \frac{\cos\theta_1 + \cos\theta_2}{2}\right).$$
It is easy to see that
$$\rho(\omega) = \max\left\{\left|1 - \frac{\omega}{2}\right|,\ \left|1 - \omega\right|,\ \left|1 - \frac{3\omega}{2}\right|,\ \left|1 - 2\omega\right|\right\} = \max\left\{\left|1 - \frac{\omega}{2}\right|,\ \left|1 - 2\omega\right|\right\}.$$
The optimal ω that minimizes ρ is $\frac{4}{5}$, and the smoothing factor for this ω is $\rho = 0.6$.
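The values ω = 4/5 and ρ = 0.6 can be checked by sampling λ(θ) over the high-frequency range. The small check below is my own addition; the sampling resolution and the trial values of ω are arbitrary.

```python
import numpy as np

def smoothing_factor(omega, m=200):
    """Max of |1 - omega*(1 - (cos t1 + cos t2)/2)| over high frequencies
    (at least one |theta_k| in [pi/2, pi])."""
    t = np.linspace(-np.pi, np.pi, m)
    T1, T2 = np.meshgrid(t, t)
    lam = np.abs(1.0 - omega * (1.0 - (np.cos(T1) + np.cos(T2)) / 2.0))
    high = np.maximum(np.abs(T1), np.abs(T2)) >= np.pi / 2.0
    return lam[high].max()

for omega in (1.0, 0.8, 2.0 / 3.0):
    print(omega, round(smoothing_factor(omega), 3))
# omega = 0.8 gives the smallest value of the three, about 0.6
```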
HW7: Show that the smoothing factor of the Gauss-Seidel iteration is 0.5
How Much Does Multigrid Cost?
Convergence factors:
• Stationary method $\approx 1 - O(\kappa^{-1}) \approx 1 - h^2$
• Conjugate gradient $\approx 1 - O(\kappa^{-1/2}) \approx 1 - h$
• Multigrid $\approx O(1)$, independent of h
Ex2:
$$-\frac{d}{dx}\left(c(x)\frac{du}{dx}\right) = f(x), \qquad u(0) = u(1) = 0, \qquad\text{here } c(x) = \begin{cases}\varepsilon, & 0 \le x \le i_0 h, \\ 1, & i_0 h < x \le i_1 h, \\ \varepsilon, & i_1 h < x \le 1, \end{cases} \qquad ---- (+)$$
with discretization matrix
$$A = \begin{bmatrix} 2\varepsilon & -\varepsilon & & & \\ -\varepsilon & 2\varepsilon & -\varepsilon & & \\ & \ddots & \ddots & \ddots & \\ & & -\varepsilon & 1+\varepsilon & -1 \\ & & & \ddots & \ddots \end{bmatrix}$$
Ideas:
• Fix the smoothing operator.
• Carefully select coarse grids and define interpolation weights.
AMG Convergence
Smoothing assumption: $\exists\,\alpha > 0$ such that $\|E_s e\|_1^2 \le \|e\|_1^2 - \alpha\,\|e\|_2^2$ for all $e \in V_h$.
Approximation assumption: $\min_{e_H}\left\|e - I_H^h e_H\right\|_0^2 \le \beta\,\|e\|_1^2$, where β is independent of e.
Then
$$\left\|E_c e\right\|_1^2 = \left(A E_c e,\ E_c e - I_H^h e_H\right) \le \left\|E_c e\right\|_2\,\left\|E_c e - I_H^h e_H\right\|_0 \le \sqrt{\beta}\,\left\|E_c e\right\|_2\,\left\|E_c e\right\|_1,$$
$$\left\|E_s E_c e\right\|_1^2 \le \left\|E_c e\right\|_1^2 - \alpha\left\|E_c e\right\|_2^2 \le \left(1 - \frac{\alpha}{\beta}\right)\left\|E_c e\right\|_1^2 \le \left(1 - \frac{\alpha}{\beta}\right)\|e\|_1^2.$$
Characterization of algebraically smooth error:
$$(Ae, e) = \frac{1}{2}\sum_{i,j}\left(-a_{i,j}\right)\left(e_i - e_j\right)^2 + \sum_i\left(\sum_j a_{i,j}\right)e_i^2 \;\ll\; \sum_i a_{i,i}\,e_i^2$$
$$\frac{1}{2}\sum_j\left(-a_{i,j}\right)\left(e_i - e_j\right)^2 \;\ll\; a_{i,i}\,e_i^2 \qquad\Longrightarrow\qquad \sum_{j\ne i}\frac{-a_{i,j}}{a_{i,i}}\cdot\frac{\left(e_i - e_j\right)^2}{e_i^2} \;\ll\; 2,$$
i.e., a smooth error varies slowly along the large (strong) couplings $-a_{i,j}$.
Since
$$\sum_{i\in F} a_{ii}\left(e_i - \sum_{k\in C} w_{ik}\, e_k\right)^{2} = \sum_{i\in F} a_{ii}\left(\sum_{k\in C} w_{ik}\left(e_i - e_k\right) + \left(1 - s_i\right)e_i\right)^{2} \le \sum_{i\in F} a_{ii}\left(\sum_{k\in C} w_{ik}\left(e_i - e_k\right)^{2} + \left(1 - s_i\right)e_i^{2}\right),$$
here $w_{ik} > 0$ is the interpolation weight from node k to node i and $s_i = \sum_{k\in C} w_{ik} < 1$, clearly, if
$$(\Theta)\quad \begin{cases} \displaystyle\sum_{i\in F} a_{ii}\sum_{k\in C} w_{ik}\left(e_i - e_k\right)^{2} \le \frac{\beta}{2}\sum_{i,j}\left(-a_{ij}\right)\left(e_i - e_j\right)^{2}, \\[8pt] \displaystyle\sum_{i\in F} a_{ii}\left(1 - s_i\right)e_i^{2} \le \beta\sum_i\left(\sum_j a_{ij}\right)e_i^{2}, \end{cases}$$
the approximation assumption holds. For (Θ) to hold, we can simply require
$$(\Xi)\quad 0 \le a_{ii}\,w_{ik} \le \beta\left(-a_{ik}\right) \quad\text{and}\quad 0 \le a_{ii}\left(1 - s_i\right) \le \beta\sum_k a_{ik}.$$
Lemma 5: Given a β ≥ 1, suppose the coarse grid C is selected such that
$$a_{i,i} + \sum_{j\notin C_i,\, j\ne i} a_{i,j} = \sum_{j\notin C_i} a_{i,j} \ge \frac{1}{\beta}\,a_{i,i}.$$
Then, with the weights $\omega_{i,k} = \dfrac{-a_{i,k}}{\sum_{j\notin C_i} a_{i,j}}$,
$$a_{i,i}\,\omega_{i,k} = \frac{a_{i,i}}{\sum_{j\notin C_i} a_{i,j}}\left(-a_{i,k}\right) \le \beta\left(-a_{i,k}\right),$$
and
$$a_{i,i}\left(1 - s_i\right) = a_{i,i}\left(1 - \sum_{k\in C_i}\omega_{i,k}\right) = a_{i,i}\left(1 + \frac{\sum_{k\in C_i} a_{i,k}}{\sum_{j\notin C_i} a_{i,j}}\right) = a_{i,i}\,\frac{\sum_j a_{i,j}}{\sum_{j\notin C_i} a_{i,j}} \le \beta\sum_j a_{i,j},$$
so (Ξ) holds.
Recall $E_S = I - B^{-1}A$. We have
$$\left\|E_{GS}\,e\right\|_1^2 = \left(A\left(I - B^{-1}A\right)e,\ \left(I - B^{-1}A\right)e\right) = (Ae, e) - \left(AB^{-1}Ae, e\right) - \left(Ae, B^{-1}Ae\right) + \left(AB^{-1}Ae, B^{-1}Ae\right)$$
$$= \|e\|_1^2 - \left(B^{-1}Ae,\ BB^{-1}Ae\right) - \left(BB^{-1}Ae,\ B^{-1}Ae\right) + \left(AB^{-1}Ae,\ B^{-1}Ae\right) = \|e\|_1^2 - \left(\left(B^T + B - A\right)B^{-1}Ae,\ B^{-1}Ae\right).$$
The smoothing assumption is therefore equivalent to
$$\alpha\,\|e\|_2^2 \le \left(\left(B^T + B - A\right)B^{-1}Ae,\ B^{-1}Ae\right) \qquad ---- (\Theta).$$
Let $\tilde e = B^{-1}Ae$. Since $\|e\|_2^2 = \left(D^{-1}Ae,\ Ae\right) = \left(D^{-1}B\tilde e,\ B\tilde e\right)$, clearly
$$(\Theta) \equiv \alpha\left(D^{-1}B\tilde e,\ B\tilde e\right) \le \left(\left(B^T + B - A\right)\tilde e,\ \tilde e\right) \qquad ---- (\Theta\Theta).$$
Now consider B = D − L. Since $B^T + B - A = D$, we have, with $\hat e = D^{1/2}\tilde e$,
$$(\Theta\Theta) \equiv \alpha\left(B^T D^{-1}B\tilde e,\ \tilde e\right) \le \left(D\tilde e,\ \tilde e\right) \equiv \alpha\left(D^{-1/2}B^T D^{-1}B D^{-1/2}\hat e,\ \hat e\right) \le \left(\hat e,\ \hat e\right) \equiv \alpha\,\rho\left(D^{-1}B^T D^{-1}B\right) \le 1 \equiv \alpha \le \frac{1}{\rho\left(D^{-1}B^T D^{-1}B\right)} \qquad ---- (\Theta\Theta\Theta).$$
Therefore, the smoothing assumption holds for the Gauss-Seidel iteration.
If A is a diagonally dominant M-matrix, we can estimate α as follows: since
$$\rho\left(D^{-1}B^T D^{-1}B\right) \le \rho\left(I - D^{-1}L^T\right)\rho\left(I - D^{-1}L\right) \le \left(1 + \rho\left(D^{-1}L^T\right)\right)\left(1 + \rho\left(D^{-1}L\right)\right)$$
and
$$\rho\left(D^{-1}L\right) \le \max_{1\le i\le n}\left\{\sum_{j=1,\,j\ne i}^{n}\frac{|a_{i,j}|}{a_{i,i}}\right\} \le 1,$$
clearly we have
$$\frac{1}{\rho\left(D^{-1}B^T D^{-1}B\right)} \ge \frac{1}{4}.$$
Therefore, the Gauss-Seidel iteration satisfies the smoothing property with $\alpha = \frac{1}{4}$.
In fact, for symmetric M-matrices, the smoothing assumption holds for both the Gauss-Seidel and Jacobi iterations.
Furthermore, one can also show that the coarse-grid matrix $A_H = \left(I_H^h\right)^T A_h I_H^h$ is also a diagonally dominant M-matrix when $A_h$ is a diagonally dominant M-matrix and the interpolation weights satisfy (Ξ) and (Φ).
AMG Coarsening Criteria
First, let’s define the following sets:
$$N_i^S = \left\{\, j : -a_{i,j} \ge \gamma\,\max_{m\ne i}\left(-a_{i,m}\right)\right\},\qquad 0 < \gamma < 1,$$
$$\left(N_i^S\right)^T = \left\{\, j : i \in N_j^S \,\right\}.$$
Here, $N_i^S$ is the set of nodes that node i strongly connects to, and $\left(N_i^S\right)^T$ is the set of nodes that strongly connect to node i.
Criterion 1. The C-nodes used to interpolate node i (the set $C_i$) should be chosen from $N_i^S$.
Criterion 2. C should be a maximal subset of all nodes with the property that
no two C-nodes are strongly connected to each other.
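The two criteria can be realized by a simple greedy pass. The sketch below is my own illustration (it is not the full Ruge-Stüben coarsening from the lecture); γ = 0.25 and the first-come node ordering are arbitrary choices.

```python
import numpy as np

def strong_neighbors(A, gamma=0.25):
    """N_i^S = { j != i : -a_ij >= gamma * max_m(-a_im) } for each row i."""
    n = A.shape[0]
    S = []
    for i in range(n):
        off = -A[i, :].copy()
        off[i] = -np.inf                      # exclude the diagonal
        thresh = gamma * off.max()
        S.append({j for j in range(n) if j != i and -A[i, j] > 0 and -A[i, j] >= thresh})
    return S

def greedy_coarsening(A, gamma=0.25):
    """Greedy pass toward Criterion 2: chosen C-points are not strongly connected."""
    S = strong_neighbors(A, gamma)
    C, F = set(), set()
    for i in range(A.shape[0]):               # simple first-come ordering
        if i in F:
            continue
        C.add(i)
        F.update(S[i])                        # strong neighbors become F-points
    return C, F

# 1-D Laplacian: every other node becomes a C-point
n = 9
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
C, F = greedy_coarsening(A)
print(sorted(C), sorted(F))
```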
AMG Coarsening (I)
[Figure: example of the sets $N_i^S$ and $\left(N_i^S\right)^T$ on a stencil with entries +1 and −1.]
A very good MG and AMG tutorial resource (by Van Emden Henson) :
http://www.llnl.gov/CASC/people/henson
AMG Coarsening: Example 2
Convection-diffusion with characteristic and downstream layers:
$$-\varepsilon\Delta u + \frac{\partial u}{\partial y} = 0, \qquad u|_{\partial\Omega} = \begin{cases} 1 & \text{if } (y = 0,\ x > 0) \text{ or } x = 1, \\ 0 & \text{otherwise,} \end{cases}$$
where $\Omega = [-1, 1]\times[-1, 1]$.
[Figures: solution from Galerkin discretization on a 32x32 grid; solution from SDFEM discretization on a 32x32 grid.]
SDFEM discretization with $\delta_T = \dfrac{h}{2}$ yields the matrix stencil
$$\begin{bmatrix} -\dfrac{\varepsilon}{3} & -\dfrac{\varepsilon}{3} & -\dfrac{\varepsilon}{3} \\[6pt] \dfrac{h}{6} - \dfrac{\varepsilon}{3} & \dfrac{2h}{3} + \dfrac{8\varepsilon}{3} & \dfrac{h}{6} - \dfrac{\varepsilon}{3} \\[6pt] -\dfrac{h}{6} - \dfrac{\varepsilon}{3} & -\dfrac{2h}{3} - \dfrac{\varepsilon}{3} & -\dfrac{h}{6} - \dfrac{\varepsilon}{3} \end{bmatrix}.$$
[Figures: coarse grids from GMG coarsening and coarse grids from AMG coarsening, with C-points marked.]
Example 2: GMG vs. AMG
One needs to solve
$$A_h(u_h) = f_h \equiv \begin{pmatrix} a_1(u_1, u_2, \ldots, u_n) \\ a_2(u_1, u_2, \ldots, u_n) \\ \vdots \\ a_n(u_1, u_2, \ldots, u_n) \end{pmatrix} = \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{pmatrix}.$$
Method 1: Linearize $A_h$ using Newton's method and solve the linear system by multigrid:
$$u_j \leftarrow u_j - \left[\frac{D A_h}{D u}\left(u_j\right)\right]^{-1}\left(A_h\left(u_j\right) - f_h\right).$$
Method 2: Nonlinear multigrid, the so-called full approximation storage scheme (FAS):
• Nonlinear relaxation
• Nonlinear defect correction
Nonlinear Relaxation:
Jacobi: solve $a_i\left(u_1^{old}, \ldots, u_{i-1}^{old},\ u_i^{new},\ u_{i+1}^{old}, \ldots, u_n^{old}\right) = f_i$ for $u_i^{new}$, for all $i = 1, 2, \ldots, n$.
Gauss-Seidel: solve $a_i\left(u_1^{new}, \ldots, u_{i-1}^{new},\ u_i^{new},\ u_{i+1}^{old}, \ldots, u_n^{old}\right) = f_i$ for $u_i^{new}$, for all $i = 1, 2, \ldots, n$.
Solve the local nonlinear problems iteratively.
Example ( Nonlinear Gauss-Seidel ) :
Discretization:
In the linear case: $r_h^{(n)} = A_h\left(u_h\right) - A_h\left(u_h^{(n)}\right) = A_h\left(u_h - u_h^{(n)}\right)$.
In the nonlinear case: $r_h^{(n)} = A_h\left(u_h\right) - A_h\left(u_h^{(n)}\right) \ne A_h\left(u_h - u_h^{(n)}\right)$.
FAS:
1. Nonlinear relaxation.
2. Restrict $u_h^n$ and $r_h^n$: $r_H = I_h^H r_h^n$ and $v = I_h^H u_h^n$.
3. Solve $A_H\left(u_H\right) = r_H + A_H(v)$.
4. Compute $e_H = u_H - v$.
5. Update $u_h^n \leftarrow u_h^n + I_H^h e_H$.
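A minimal two-grid FAS sketch is given below for a 1-D analogue of the example that follows, $-u'' + \gamma u e^u = f$ with zero boundary values. It is my own illustration: the nonlinear Gauss-Seidel smoother (one scalar Newton step per node), the approximate coarse solve by repeated relaxation, the transfer operators, and all sizes are assumed choices, not the lecture's implementation.

```python
import numpy as np

gamma = 10.0

def nonlinear_op(u, h):
    """A_h(u): discrete  -u'' + gamma * u * exp(u)  with zero boundary values."""
    up = np.concatenate(([0.0], u, [0.0]))
    return (2 * up[1:-1] - up[:-2] - up[2:]) / h**2 + gamma * up[1:-1] * np.exp(up[1:-1])

def nonlinear_gs(u, f, h, sweeps=2):
    """Pointwise nonlinear Gauss-Seidel: one scalar Newton step per node."""
    n = u.size
    for _ in range(sweeps):
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            res = (2 * u[i] - left - right) / h**2 + gamma * u[i] * np.exp(u[i]) - f[i]
            dres = 2.0 / h**2 + gamma * (1.0 + u[i]) * np.exp(u[i])
            u[i] -= res / dres
    return u

def restrict(v):                      # full weighting, fine (2m+1) -> coarse (m)
    return 0.25 * v[:-2:2] + 0.5 * v[1:-1:2] + 0.25 * v[2::2]

def prolong(v, n_f):                  # linear interpolation, coarse (m) -> fine (2m+1)
    w = np.zeros(n_f)
    w[1:-1:2] = v
    w[0:-2:2] += 0.5 * v
    w[2::2] += 0.5 * v
    return w

def fas_two_grid(u, f, h):
    u = nonlinear_gs(u, f, h)                         # 1. nonlinear relaxation
    r = f - nonlinear_op(u, h)
    rH, v = restrict(r), restrict(u)                  # 2. restrict residual and solution
    fH = rH + nonlinear_op(v, 2 * h)                  # 3. coarse right-hand side
    uH = v.copy()
    for _ in range(100):                              #    approximately solve A_H(u_H) = f_H
        uH = nonlinear_gs(uH, fH, 2 * h)
    eH = uH - v                                       # 4. coarse-grid correction
    u = u + prolong(eH, u.size)                       # 5. update the fine solution
    return nonlinear_gs(u, f, h)

n = 2**6 - 1
h = 1.0 / (n + 1)
f = np.ones(n)
u = np.zeros(n)
for it in range(10):
    u = fas_two_grid(u, f, h)
    print(it, np.linalg.norm(f - nonlinear_op(u, h)))   # residual shrinks each cycle
```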
Example: $-\Delta u(x, y) + \gamma\,u(x, y)\,e^{u(x,y)} = f(x, y)$ in $[0,1]\times[0,1]$, with $u(x, y) = \left(x - x^2\right)\sin(3\pi y)$.
Nonlinear Gauss-Seidel update (one scalar Newton step per node):
$$u_{i,j} \leftarrow u_{i,j} - \frac{h^{-2}\left(4u_{i,j} - u_{i-1,j} - u_{i+1,j} - u_{i,j-1} - u_{i,j+1}\right) + \gamma\,u_{i,j}\,e^{u_{i,j}} - f_{i,j}}{4h^{-2} + \gamma\left(1 + u_{i,j}\right)e^{u_{i,j}}}.$$
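A direct transcription of this update could look like the sketch below. It is illustrative only: the grid size and γ are arbitrary, and the right-hand side is manufactured by applying the discrete operator to the given u, so the discrete solution equals u at the grid points.

```python
import numpy as np

gamma, n = 10.0, 31                      # illustrative values
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)
X, Y = np.meshgrid(x, x, indexing="ij")
u_exact = (X - X**2) * np.sin(3 * np.pi * Y)   # zero on the boundary of the unit square

def apply_op(u):
    """Discrete  -Laplace(u) + gamma*u*exp(u)  with zero Dirichlet boundary."""
    up = np.pad(u, 1)
    lap = (4 * up[1:-1, 1:-1] - up[:-2, 1:-1] - up[2:, 1:-1]
           - up[1:-1, :-2] - up[1:-1, 2:]) / h**2
    return lap + gamma * u * np.exp(u)

f = apply_op(u_exact)                    # manufactured right-hand side

u = np.zeros((n, n))
for sweep in range(200):                 # nonlinear Gauss-Seidel sweeps
    up = np.pad(u, 1)
    for i in range(n):
        for j in range(n):
            nb = up[i, j + 1] + up[i + 2, j + 1] + up[i + 1, j] + up[i + 1, j + 2]
            res = (4 * up[i + 1, j + 1] - nb) / h**2 \
                  + gamma * up[i + 1, j + 1] * np.exp(up[i + 1, j + 1]) - f[i, j]
            dres = 4 / h**2 + gamma * (1 + up[i + 1, j + 1]) * np.exp(up[i + 1, j + 1])
            up[i + 1, j + 1] -= res / dres
    u = up[1:-1, 1:-1]

# the error shrinks toward 0, but plain relaxation is slow; this is what FAS/MG accelerates
print(np.max(np.abs(u - u_exact)))
```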
[Figure: distributed-memory parallel machine with processors P, each attached to a local memory M.]
What needs to be done?
1. The numerical algorithm needs to be capable of parallel execution.
2. The program has to distribute work to processors properly and dynamically (load balancing).
3. Computers have to communicate with each other (message passing interface, MPI).
4. Many others (grid topology, scheduling, ...).
Time for relaxation on grid level k:
$$T_k = T_{comm}\!\left(\frac{N}{2^k}\right) + 5\left(\frac{N}{2^k}\right)^{2} f.$$
Time for a V-cycle:
$$T_v \approx \sum_k 2\,T_k \approx 8\alpha L + 16N\beta + \frac{40}{3}N^2 f,$$
where
α = startup time,
β = time to transfer a single double,
$T_{comm}(n) = \alpha + \beta n$ = communication time for transmitting n doubles to one processor,
f = time for one floating-point operation.
Since MG has an O(1) convergence rate, we can analyze the scaled efficiency as follows:
$$T_v\!\left(N^2, 1\right) \approx \frac{40}{3}N^2 f, \qquad T_v\!\left((pN)^2, p\right) \approx 8\alpha\log_2(pN) + 16\beta N + \frac{40}{3}N^2 f,$$
$$E(N, p) \approx O\!\left(\frac{1}{\log_2 p}\right) \text{ as } p \to \infty, \qquad E(N, p) \approx O(1) \text{ as } N \to \infty.$$
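The two limits can be seen by plugging illustrative numbers into the model. The values of α, β, and f below are arbitrary assumptions (not measurements), chosen only to show the trends.

```python
import math

alpha, beta, f = 1e-4, 1e-8, 1e-9        # assumed startup, per-double, per-flop times

def Tv(N, p):
    """V-cycle time model: per-processor work (40/3)N^2 f plus communication for p > 1."""
    comm = 8 * alpha * math.log2(p * N) + 16 * beta * N if p > 1 else 0.0
    return comm + (40.0 / 3.0) * N**2 * f

def scaled_efficiency(N, p):
    # N is the per-processor grid dimension, so this is Tv(N^2,1) / Tv((pN)^2, p)
    return Tv(N, 1) / Tv(N, p)

for p in (4, 64, 1024):
    print(p, round(scaled_efficiency(1000, p), 3))    # decays like 1/log2(p) for fixed N
for N in (100, 1000, 10000):
    print(N, round(scaled_efficiency(N, 64), 3))      # approaches 1 as N grows
```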