SciComp2 ClassNotes
Overview
Michael Bader
Technical University of Munich
Summer 2023
Remember: The Simulation Pipeline
phenomenon, process etc.
  → modelling → mathematical model
  → numerical treatment → numerical algorithm
  → implementation → simulation code
  → visualization → results to interpret
  → embedding → statement tool
(validation alongside the stages)
Focussing on
• large systems: 10^6 to 10^9 unknowns
• sparse systems: typically only O(N) non-zeros in the system matrix
(N unknowns)
• systems resulting from the discretization of PDEs
Topics
• relaxation methods (as smoothers)
• multigrid methods
• Conjugate Gradient methods
• preconditioning
[Figure: plate with grid of mesh sizes hx and hy]
• compute the temperature distribution on this plate!
Michael Bader | Scientific Computing II | Overview | Summer 2023 4
A Finite Volume Model (2)
Focussing on
• large systems: 10^6 to 10^9 particles
• short-range vs. long-range forces
• N-body methods, parallelization
Lecturers:
• Michael Bader
• recorded lectures by Anne Reinarz, from summer 2020
Tutors:
• David Schneller (multigrid and CG)
• Sam Newcome (molecular dynamics)
“Style”:
• worksheets with applications & examples
• no compulsory part
ECTS, Modules
• 5 ECTS (2+2 lectures/tutorials per week)
• CSE: compulsory course
• Biomed. Computing/Computer Science:
elective/Master catalogue
• others?
Exam:
• written exam at end of semester
• based on exercises presented in the tutorials
• one implementation-oriented exercise (Python)
Black Headers:
• for all slides with regular topics
Green Headers:
• summarized details: will be explained in the lecture, but usually not as an
explicit slide; “green” slides will only appear in the handout versions
Red Headers:
• advanced topics or outlook: will not be part of the exam topics
Blue Headers:
• background information or fundamental concepts that are probably
already known, but are important throughout the lecture
Michael Bader
Technical University of Munich
Summer 2023
Part I
Residual-Based Correction
The Residual Equation
Relaxation
Jacobi Relaxation
Gauss-Seidel Relaxation
Successive-Over-Relaxation (SOR)
Michael Bader | Scientific Computing II | Relaxation Methods and the Smoothing Property | Summer 2023 2
The Residual Equation
• we consider a system of linear equations: Ax = b
(as stemming from the FD/FV/FEM discretisation of a PDE)
• for which we compute a sequence of approximate solutions $x^{(k)}$
• the residual $r^{(k)}$ shall then be defined as:
$r^{(k)} = b - A x^{(k)}$
• short computation, with the error $e^{(k)} := x - x^{(k)}$:
$A e^{(k)} = A x - A x^{(k)} = b - A x^{(k)} = r^{(k)}$
Residual Based Correction
Relaxation
Examples:
• $B = \mathrm{diag}(A) = D_A$ (diagonal part of A)
⇒ Jacobi method (“Jacobi relaxation”)
• $B = L_A$ (lower triangular part of A)
⇒ Gauss-Seidel method (“Gauss-Seidel relaxation”)
Jacobi Relaxation
2. for implementation:
$x^{(k+1)} = D_A^{-1}\left(b - (A - D_A)\,x^{(k)}\right)$
3. for analysis:
$x^{(k+1)} = \left(I - D_A^{-1}A\right)x^{(k)} + D_A^{-1}b =: Mx^{(k)} + Nb$
Jacobi Relaxation – Algorithm
for i from 1 to n do
xnew[i] := ( b[i]
- sum( A[i,j]*x[j], j=1..i-1)
- sum( A[i,j]*x[j], j=i+1..n)
) / A[i,i];
end do;
for i from 1 to n do
x[i] := xnew[i];
end do;
• properties:
• additional storage required (xnew)
• the entries of xnew can be computed in any order
• the entries of xnew can be computed in parallel
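The sweep above can be sketched in Python with NumPy. This is a minimal illustration under assumptions: the name `jacobi_sweep` and the small 3×3 test system are made up for this example and do not come from the lecture.

```python
import numpy as np

def jacobi_sweep(A, b, x):
    """One Jacobi sweep; returns a new vector (the role of xnew above)."""
    D = np.diag(A)                 # diagonal entries A[i,i]
    off_diag = A @ x - D * x       # sum over j != i of A[i,j] * x[j]
    return (b - off_diag) / D

# hypothetical test system with exact solution (1, 1, 1)
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([3.0, 2.0, 3.0])

x = np.zeros(3)
for _ in range(50):
    x = jacobi_sweep(A, b, x)
print(x)
```

Because every entry of the new vector depends only on the old vector, the whole sweep is a single vectorised expression, which mirrors the "any order" and "in parallel" properties.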
Gauss-Seidel Relaxation
2. for implementation:
$x^{(k+1)} = L_A^{-1}\left(b - (A - L_A)\,x^{(k)}\right)$
3. for analysis:
$x^{(k+1)} = \left(I - L_A^{-1}A\right)x^{(k)} + L_A^{-1}b =: Mx^{(k)} + Nb$
Gauss-Seidel Relaxation – Algorithm
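A Gauss-Seidel sweep can be sketched in the same way as the Jacobi algorithm, except that each new value overwrites the old one immediately. The following is a minimal sketch, assuming a dense NumPy matrix; the function name and the test system are hypothetical.

```python
import numpy as np

def gauss_seidel_sweep(A, b, x):
    """One Gauss-Seidel sweep, updating x in place from the first to the last unknown."""
    for i in range(len(b)):
        # x[:i] already holds new values, x[i+1:] still holds old values
        s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
        x[i] = (b[i] - s) / A[i, i]
    return x

# hypothetical test system with exact solution (1, 1, 1)
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([3.0, 2.0, 3.0])

x = np.zeros(3)
for _ in range(30):
    gauss_seidel_sweep(A, b, x)
print(x)
```

No second vector is needed here, but the loop order now matters and the updates can no longer be computed independently of each other.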
Successive-Over-Relaxation (SOR)
Does It Always Work?
Part II
The Model Problem – 1D Poisson
1D Poisson equation:
• $-u''(x) = 0$ on $\Omega = (0, 1)$, $u(0) = u(1) = 0$
• thus: $u(x) = 0$; boring, but easy to examine the error
• discretised on a uniform grid of mesh size $h = \frac{1}{n}$
• compute approximate values $u_j \approx u(x_j)$
at grid points $x_j := jh$, with $j = 1, \dots, (n-1)$
• tridiagonal system matrix $A_h$ (size $(n-1) \times (n-1)$) built from 3-point
stencil: $\frac{1}{h^2}\,[-1 \;\; 2 \;\; -1]$
• assume zero right-hand side to obtain the following system:
$\frac{1}{h^2}\left(-u_{j-1} + 2u_j - u_{j+1}\right) = 0 \;\Leftrightarrow\; u_j = \frac{1}{2}\left(u_{j-1} + u_{j+1}\right)$
1D Poisson: Jacobi Relaxation
Relaxation Methods – Jacobi
Iterative scheme for Jacobi relaxation:
• leads to relaxation scheme $u_j^{(k+1)} = \frac{1}{2}\left(u_{j+1}^{(k)} + u_{j-1}^{(k)}\right)$
• start with initial guess $u_j^{(0)} \neq 0$
• in this case: $e_j^{(k)} = u_j - u_j^{(k)} = -u_j^{(k)}$
• picture: place peas on the line between two neighbours, in parallel
Visualisation of relaxation process:
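The relaxation process can also be reproduced numerically. The sketch below applies the Jacobi scheme to the model problem and tracks how strongly three sine modes are still present afterwards. The grid size `n = 32`, the chosen modes and the helper names are assumptions made for this illustration.

```python
import numpy as np

n = 32                                   # number of grid intervals (assumed value)
j = np.arange(1, n)                      # interior grid points

def mode(k):
    return np.sin(k * np.pi * j / n)     # k-th sine mode on the grid

def jacobi_step(u):
    padded = np.concatenate(([0.0], u, [0.0]))   # boundary values u_0 = u_n = 0
    return 0.5 * (padded[:-2] + padded[2:])

def amplitude(u, k):
    # coefficient of u with respect to the k-th sine mode (discrete orthogonality)
    return 2.0 / n * (u @ mode(k))

# initial guess (equal to minus the error): low + medium + high frequency
u = mode(1) + mode(n // 2) + mode(n - 1)

sweeps = 10
for _ in range(sweeps):
    u = jacobi_step(u)

for k in (1, n // 2, n - 1):
    print(f"mode {k:2d}: remaining amplitude {amplitude(u, k): .4f}")
```

In this setting each sine mode is simply scaled by cos(kπ/n) per sweep, so the medium frequency disappears almost at once while the lowest and the highest frequency are hardly reduced.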
• observation: we get a high plus a low frequency oscillation
1D Poisson: Gauss-Seidel Relaxation
Relaxation Methods – Gauss-Seidel
Iterative scheme for Gauss-Seidel relaxation:
• leads to relaxation scheme $u_j^{(k+1)} = \frac{1}{2}\left(u_{j+1}^{(k)} + u_{j-1}^{(k+1)}\right)$
• start with initial guess $u_j^{(0)} \neq 0$
• in this case: $e_j^{(k)} = u_j - u_j^{(k)} = -u_j^{(k)}$
• picture: sequentially place peas on the line between two neighbours
Visualisation of relaxation process:
Convergence of Relaxation Methods
Convergence Analysis
$e^{(i+1)} = Mx + Nb - Mx^{(i)} - Nb$
$= Mx - Mx^{(i)} = Me^{(i)}$
$\Rightarrow\; e^{(i)} = M^i e^{(0)}$.
Convergence Analysis (2)
The Smoothing Property
Eigenvalues and -vectors of Ah : (compare with tutorials!)
• eigenvalues: $\lambda_k = \frac{4}{h^2}\sin^2\left(\frac{k\pi}{2n}\right) = \frac{4}{h^2}\sin^2\left(\frac{k\pi h}{2}\right)$
• eigenvectors: $v^{(k)} = \left(\sin(k\pi j/n)\right)_{j=1,\dots,n-1}$
• both for $k = 1, \dots, (n-1)$
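These eigenpairs can be checked numerically. The sketch below builds $A_h$ for an assumed grid size `n = 16`, verifies $A_h v^{(k)} = \lambda_k v^{(k)}$, and then evaluates the eigenvalues of the Jacobi iteration matrix $M = I - D_A^{-1}A$, which for this matrix are $1 - \frac{h^2}{2}\lambda_k$. Variable names are chosen for this example only.

```python
import numpy as np

n = 16                                   # assumed grid size
h = 1.0 / n
A = (2 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)) / h**2

j = np.arange(1, n)                      # grid point index
k = np.arange(1, n)                      # mode index

lam = 4 / h**2 * np.sin(k * np.pi / (2 * n)) ** 2    # eigenvalue formula
V = np.sin(np.outer(j, k) * np.pi / n)               # column k-1 holds v^(k)

# largest entry of A v - lambda v over all modes
max_defect = np.abs(A @ V - V * lam).max()

# Jacobi iteration matrix M = I - (h^2 / 2) A has eigenvalues 1 - (h^2 / 2) lambda_k
jacobi_factors = 1 - h**2 / 2 * lam
print(max_defect)
print(jacobi_factors)
```

The Jacobi factors equal cos(kπ/n): close to 1 for small k, close to 0 in the middle of the spectrum, and close to -1 for the largest k.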
The Smoothing Property (2)
Another Observation:
• the smoothest (slowest converging) component corresponds to the
smallest eigenvalue of A (k = 1)
• remember residual equation: $Ae = r$:
if $e = v^{(1)}$, then $r = \lambda_1 v^{(1)}$
⇒ “small residual, but large error”
⇒ in such a situation, any residual-based correction will normally fail
Scientific Computing II
Towards Multigrid Methods
Michael Bader
Technical University of Munich
Summer 2022
Part I
Multigrid Methods
“Multigrid” idea:
• use multiple grids to solve the system of equations
• hope that on each grid, a certain range of error frequencies will be
reduced efficiently
Algorithm:
1. Start on a very coarse grid with mesh size $h = h_0$;
guess an initial solution $x_h$
2. Iterate over $A_h x_h = b_h$ using relaxation method
⇒ approximate solution $x_h$
3. interpolate the solution $x_h$ to a finer grid $\Omega_{h/2}$
4. proceed with step 2 (now with mesh size $h := h/2$) using interpolated
$x_{h/2}$ as initial solution
Algorithm:
1. relaxation/smoothing on the fine level system
⇒ solution $x_h$
2. compute the residual $r_h = b_h - A_h x_h$
3. restriction of $r_h$ to the coarse grid $\Omega_H$: $r_h \to r_H$
4. compute a solution to $A_H e_H = r_H$ on $\Omega_H$
5. interpolate the coarse grid solution $e_H$ to the fine grid $\Omega_h$
6. add the resulting correction to $x_h$
7. again, relaxation/smoothing on the fine grid
• smoother:
reduce the high-frequency error components, and get a smooth error
• restriction:
transfer residual from fine grid to coarse grid
• coarse grid equation:
(acts as) discretisation of the PDE on the coarse grid;
requires a coarse-grid solver
• interpolation:
transfer coarse grid solution/correction from coarse grid to fine grid
(and update solution accordingly)
Algorithm:
1. pre-smoothing on the fine level system $A_l x_l = b_l$
⇒ approximate solution $x_l$
2. compute the residual $r_l = b_l - A_l x_l$
3. restriction of $r_l$ to the coarse grid $\Omega_{l-1}$: $r_l \to \hat r_{l-1}$
4. solve coarse grid system $A_{l-1} e_{l-1} = \hat r_{l-1} =: b_{l-1}$
by a recursive call to the V-cycle algorithm
5. interpolate the coarse grid solution $e_{l-1}$ to the fine grid $\Omega_l$: $e_{l-1} \to \hat e_l$
6. add the interpolated coarse-grid correction: $x_l = x_l + \hat e_l$
7. post-smoothing on the fine grid
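The seven steps can be sketched for the 1D Poisson model problem as follows. All concrete choices are assumptions made for this illustration: weighted Jacobi with ω = 2/3 as smoother, two pre- and two post-smoothing sweeps, full weighting, linear interpolation, re-discretisation on the coarse grids, and a grid with n = 64 intervals (n must be a power of two).

```python
import numpy as np

def apply_A(u, h):
    """Matrix-free product with (1/h^2) tridiag(-1, 2, -1), zero boundary values."""
    p = np.concatenate(([0.0], u, [0.0]))
    return (2 * p[1:-1] - p[:-2] - p[2:]) / h**2

def smooth(u, f, h, sweeps=2, omega=2 / 3):
    for _ in range(sweeps):                       # weighted Jacobi, D^{-1} = h^2 / 2
        u = u + omega * (h**2 / 2) * (f - apply_A(u, h))
    return u

def restrict(r):
    """Full weighting (1/4, 1/2, 1/4) onto every second fine grid point."""
    return 0.25 * r[0:-2:2] + 0.5 * r[1:-1:2] + 0.25 * r[2::2]

def interpolate(e):
    """Linear interpolation from the coarse grid to the fine grid."""
    p = np.concatenate(([0.0], e, [0.0]))
    fine = np.zeros(2 * len(e) + 1)
    fine[1::2] = e                                # coinciding grid points
    fine[0::2] = 0.5 * (p[:-1] + p[1:])           # new points: mean of neighbours
    return fine

def v_cycle(u, f, h):
    if len(u) == 1:                               # coarsest grid: solve exactly
        return f * h**2 / 2
    u = smooth(u, f, h)                           # 1. pre-smoothing
    r = f - apply_A(u, h)                         # 2. residual
    r_coarse = restrict(r)                        # 3. restriction
    e = v_cycle(np.zeros(len(r_coarse)), r_coarse, 2 * h)   # 4. recursive solve
    u = u + interpolate(e)                        # 5. + 6. interpolate and correct
    return smooth(u, f, h)                        # 7. post-smoothing

n = 64
h = 1.0 / n
x = np.arange(1, n) * h
f = np.pi**2 * np.sin(np.pi * x)                  # hypothetical right-hand side
u = np.zeros(n - 1)
r0 = np.linalg.norm(f - apply_A(u, h))
for _ in range(10):
    u = v_cycle(u, f, h)
r10 = np.linalg.norm(f - apply_A(u, h))
print(r10 / r0)
```

Each level only performs a constant number of vector operations on about half as many unknowns as the level above, which is where the O(n) cost per cycle comes from.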
Thus: runtime O(n) per iteration, but how many iterations necessary?
• fastest method around (if all components are chosen carefully), but:
best-possible convergence often hard to obtain
• “textbook multigrid efficiency”:
$\|e^{(m+1)}\| \le \gamma\,\|e^{(m)}\|$,
Interpolation
Restriction
Coarse Grid Operator
Smoothers
Multigrid Cycles
Multigrid W-Cycle
Full Multigrid V-Cycle
Notation: $I_{2h}^{h} x_{2h} = x_h$ or $P_{2h}^{h} x_{2h} = x_h$
Operator-dependent Interpolation:
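• consider homogeneous problem ($f = 0$)
with Dirichlet boundaries: $u(0) = 1$, $u(1) = 0$
• exact solution for this case:
$u(x) = \frac{1}{1 - e^{c/\epsilon}}\left(e^{cx/\epsilon} - e^{c/\epsilon}\right) = 1 - \frac{1 - e^{cx/\epsilon}}{1 - e^{c/\epsilon}}$
• interpolate at $x = \frac{1}{2}$: $u\left(\frac{1}{2}\right) = 1 - \frac{1 - e^{c/(2\epsilon)}}{1 - e^{c/\epsilon}} =: 1 - \frac{1 - z}{1 - z^2}$;
for large $z = e^{c/(2\epsilon)}$, we have $u\left(\frac{1}{2}\right) \approx 1 - \frac{1}{z} \approx 1$
• thus: linear interpolation inappropriate (and leads to slow convergence)
→ interpolation should be operator-dependent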
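A few numbers make the gap visible. The sketch below evaluates $u(1/2) = 1 - \frac{1-z}{1-z^2}$ for several ratios c/ε and compares with the value 0.5 that linear interpolation between u(0) = 1 and u(1) = 0 would give. The function name and the sample ratios are assumptions for this illustration.

```python
import math

def u_mid(c_over_eps):
    """Exact value u(1/2) for the homogeneous problem with u(0) = 1, u(1) = 0.

    Requires c_over_eps != 0 (otherwise the formula degenerates to 0/0).
    """
    z = math.exp(c_over_eps / 2)
    return 1 - (1 - z) / (1 - z * z)

linear_value = 0.5      # linear interpolation between the two boundary values
for ratio in (0.1, 1.0, 10.0, 50.0):
    print(f"c/eps = {ratio:5.1f}: u(1/2) = {u_mid(ratio):.6f}, linear = {linear_value}")
```

For small c/ε the two values nearly agree; for large c/ε the exact value approaches 1, so the linearly interpolated value is off by almost 0.5.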
Michael Bader | Scientific Computing II | Towards Multigrid Methods | Summer 2022 14
Restriction
For Poisson problem:
• “injection”: pick values at corresp. coarse grid points
• “full weighting” = transpose of bilinear interpolation (safer, more robust
convergence), see illustration below for the 1D case
[Figure: full weighting in 1D, with weights 1/2, 1, 1/2 drawn between fine and coarse grid points]
Exercise:
• Compute $A_{2h} := R_h^{2h} A_h P_{2h}^{h}$
for $A_h := \frac{1}{h^2}\,\mathrm{tridiag}(-1, 2, -1)$
• red entries become 0 ⇒ coarse-grid unknown no longer depends on fine-grid unknowns
• similar for multiplication $A_h P_{2h}^{h}$, but with column operations
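The product can be checked numerically for a small grid. This sketch assumes linear interpolation for $P_{2h}^{h}$, full weighting in the form $R_h^{2h} = \frac{1}{2}(P_{2h}^{h})^T$, and n = 8 fine-grid intervals; the helper names are hypothetical.

```python
import numpy as np

def poisson_matrix(n):
    """(1/h^2) tridiag(-1, 2, -1) with h = 1/n, size (n-1) x (n-1)."""
    h = 1.0 / n
    return (2 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)) / h**2

def interpolation_matrix(n):
    """Linear interpolation from n/2 - 1 coarse unknowns to n - 1 fine unknowns."""
    P = np.zeros((n - 1, n // 2 - 1))
    for i in range(n // 2 - 1):          # coarse unknown i sits at fine index 2i + 1
        P[2 * i, i] = 0.5
        P[2 * i + 1, i] = 1.0
        P[2 * i + 2, i] = 0.5
    return P

n = 8
P = interpolation_matrix(n)
R = 0.5 * P.T                            # full weighting: rows (1/4, 1/2, 1/4)
AP = poisson_matrix(n) @ P               # intermediate product A_h P
A2h = R @ AP
print(A2h * (2.0 / n) ** 2)              # scaled by (2h)^2 for readability
```

Under these assumptions the result coincides with the matrix obtained by discretising directly on the coarse grid, and the rows of `AP` that belong to the direct fine-grid neighbours of a coarse point carry a zero in that point's column.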
How about . . .
• Jacobi (non-weighted)?
→ does not work (zig-zag pattern prevents smoothing)
• SOR?
→ typically does not work well for Poisson model problem
(does not smooth high frequencies efficiently)
→ can help for other problems using a tailored ω
1 1
(ui+1,j − 2ui,j + ui−1,j ) + 2 (ui,j+1 − 2ui,j + ui,j−1 ) = fij
4hx2 hy
Line Smoothers:
• perform a column-wise (or row-wise) Jacobi/Gauss-Seidel relaxation
→ solve each column (or row) simultaneously:
$u_{i-1,j}^{(n+1)} - (2 + 2\epsilon)\,u_{ij}^{(n+1)} + u_{i+1,j}^{(n+1)} = h^2 f_{ij} - \epsilon\left(u_{i,j-1}^{(n)} + u_{i,j+1}^{(n)}\right)$
• use direct, tridiagonal solver for each “line” (i.e., row or column)
$-\epsilon u_{xx} + u_x = f$, $\epsilon \ll 1$
• “upwind discretisation”:
$-\frac{\epsilon}{h^2}\left(u_{n-1} - 2u_n + u_{n+1}\right) + \frac{1}{h}\left(u_n - u_{n-1}\right) = f_n$
• (weighted) Jacobi and red-black Gauss-Seidel?
→ no smoothing, basically updates one grid point per iteration
• Gauss-Seidel (relaxation from “left to right”)?
→ almost an exact solver
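Both statements can be tried out on the upwind system. The sketch below assumes ε = 1e-6, n = 64 intervals, a constant right-hand side and zero boundary values, then compares the error after a single Jacobi sweep with the error after a single left-to-right Gauss-Seidel sweep. All names and parameter values are illustrative.

```python
import numpy as np

n, eps = 64, 1e-6                        # assumed values with eps much smaller than h
h = 1.0 / n
diff, conv = eps / h**2, 1.0 / h

# upwind system: -(diff + conv) u_{n-1} + (2 diff + conv) u_n - diff u_{n+1} = f_n
A = ((2 * diff + conv) * np.eye(n - 1)
     - (diff + conv) * np.eye(n - 1, k=-1)
     - diff * np.eye(n - 1, k=1))
f = np.ones(n - 1)
u_exact = np.linalg.solve(A, f)

def jacobi(A, f, u):
    d = np.diag(A)
    return (f - (A @ u - d * u)) / d

def gauss_seidel(A, f, u):
    for i in range(len(f)):              # relaxation from left to right
        u[i] = (f[i] - A[i, :i] @ u[:i] - A[i, i + 1:] @ u[i + 1:]) / A[i, i]
    return u

scale = np.linalg.norm(u_exact)
err_jacobi = np.linalg.norm(jacobi(A, f, np.zeros(n - 1)) - u_exact) / scale
err_gs = np.linalg.norm(gauss_seidel(A, f, np.zeros(n - 1)) - u_exact) / scale
print(err_jacobi, err_gs)
```

For tiny ε the matrix is almost lower bidiagonal, so one sweep in flow direction is close to a forward substitution, whereas a Jacobi sweep passes information on by only one grid point.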
Problems:
• anisotropic Poisson with space-dependent $\epsilon = \epsilon(x, y)$,
or more general:
$-\nabla \cdot \left(D(x, y)\,\nabla u(x, y)\right) = f(x, y)$
• convection-diffusion with variable convection:
[Figure: V-cycle and W-cycle, drawn as paths through the grid levels Ωh, Ω2h, Ω4h, Ω8h]
• more expensive
• useful in situations where the coarse grid correction is not very accurate
[Figure: multigrid cycle drawn as a path through the grid levels Ωh, Ω2h, Ω4h, Ω8h]
Michael Bader
Technical University of Munich
Summer 2022
Part I
Michael Bader | Scientific Computing II | Multigrid and Finite Elements | Summer 2022 2
Remember: Finite Elements – Main Ingredients
1. solve weak form of PDE to reduce regularity properties
$-u'' = f \;\longrightarrow\; \int v' u' \,dx = \int v f \,dx$
Test and Shape Functions
• consider a general PDE Lu = f on some domain Ω
• search for solution functions $u_h$ of the form
$u_h = \sum_j u_j \varphi_j(x)$
with $\mathrm{span}\{\varphi_1, \dots, \varphi_J\} = W_h$
• insert into weak formulation
$\int v\, L\Big(\sum_j u_j \varphi_j(x)\Big)\,dx = \int v f \,dx \quad \forall v \in V$
Test and Shape Functions (2)
• choose a basis {ψi } of the test space Vh
typically defined on some discretisation grid Ωh
• then: if all basis functions $\psi_i$ satisfy
$\int \psi_i(x)\, L\Big(\sum_j u_j \varphi_j(x)\Big)\,dx = \int \psi_i(x) f(x)\,dx \quad \forall \psi_i$
Example Problem: 1D Poisson
• in 1D: $-u''(x) = f(x)$ on $\Omega = (0, 1)$,
hom. Dirichlet boundary cond.: $u(0) = u(1) = 0$
• weak form:
$\int_0^1 v'(x) \cdot u'(x)\,dx = \int_0^1 v(x) f(x)\,dx \quad \forall v$
• grid points $x_i = ih$ (for $i = 1, \dots, n-1$); mesh size $h = 1/n$
• $V_h = W_h$: piecewise linear functions (on intervals $[x_i, x_{i+1}]$)
• leads to stiffness matrix:
$\frac{1}{h}\begin{pmatrix} 2 & -1 & & \\ -1 & 2 & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & 2 \end{pmatrix}$
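The stiffness matrix can be assembled element by element. The sketch below assumes piecewise linear hat functions on a uniform grid and uses the element matrix of $\int \varphi_a' \varphi_b' \,dx$ on one interval of length h; the function name is hypothetical.

```python
import numpy as np

def assemble_stiffness(n):
    """Stiffness matrix for hat functions on (0, 1) with mesh size h = 1/n."""
    h = 1.0 / n
    # on one element the two hat functions have slopes -1/h and +1/h
    K_elem = np.array([[1.0, -1.0],
                       [-1.0, 1.0]]) / h
    K = np.zeros((n + 1, n + 1))
    for e in range(n):                   # element e covers [x_e, x_{e+1}]
        K[e:e + 2, e:e + 2] += K_elem
    return K[1:-1, 1:-1]                 # drop boundary nodes (Dirichlet conditions)

print(assemble_stiffness(5) / 5)         # divided by 1/h = 5: tridiag(-1, 2, -1)
```

Each interior node receives 1/h from the element on its left and 1/h from the element on its right, which gives the diagonal entry 2/h.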
Hierarchical Basis
• hat functions with multi-level resolution
[Figure: hat functions on [0, 1] at several resolution levels; further panels show a basis function at grid point x_i and a function u(x) for mesh size h_3 = 2^{-3}]
Hierarchical Basis and Multigrid
• What happens, if we use FEM on hat function bases with different
resolutions?
• Define “mother of all hat functions”
φ(x) := max{1 − |x|, 0}
• consider mesh size $h_n = 2^{-n}$ and grid points $x_{n,i} = i \cdot h_n$
• nodal basis then $\Phi_n := \{\phi_{n,i},\ 0 \le i \le 2^n\}$ with
$\phi_{n,i}(x) := \phi\left(\frac{x - x_{n,i}}{h_n}\right)$
• hierarchical basis combines $\hat\Phi_n := \{\phi_{n,i},\ i = 1, 3, \dots, 2^n - 1\}$ (only odd
indices) and defines basis as
$\Psi_n := \bigcup_{l=1}^{n} \hat\Phi_l$
Hierarchical Basis Transformation
(or: How to represent functions on a coarser grid?)
[Figure: two plots of basis functions over x in [0, 1], linked by an arrow]
• represent hat functions $\phi_{n-1,i}(x)$ via fine-level functions $\phi_{n,j}(x)$
Hierarchical Basis Transformation (2)
Level-by-level algorithm for hierarchical transform:
[Figure: basis functions at grid points x1, ..., x7, transformed in two steps]
Step 1: Φ1 Φ2 Φ3 Φ4 Φ5 Φ6 Φ7 → Ψ1 Ψ2 Ψ3 Ψ5 Ψ6 Ψ7
Step 2: Ψ1 Ψ2 Ψ3 Ψ5 Ψ6 Ψ7 → Ψ1 Ψ2 Ψ3 Ψ4 Ψ5 Ψ6 Ψ7
Hierarchical Basis Transformation (3)
• hierarchical basis transformation: $\psi_{n,i}(x) = \sum_j H_{i,j}\,\phi_{n,j}(x)$
FEM and Hierarchical Basis Transform
• FEM discretisation with hierarchical test and shape functions:
$\int \psi_i(x)\, L\Big(\sum_j u_j \psi_j(x)\Big)\,dx = \int \psi_i(x) f(x)\,dx \quad \forall \psi_i$
• Note: $\sum_j u_j A^{HB}_{i,j}$ and $\sum_j v_j A^{*}_{i,j}$ are both equal to $\int \psi_i(x) f(x)\,dx$
• Therefore: $(A^{HB} u)_i = \sum_j u_j A^{HB}_{i,j} = \sum_j v_j A^{*}_{i,j} = (A^{*} v)_i$ and $v = H^T u$
FEM and Hierarchical Basis Transform (2)
• status: FEM with hierarchical test and nodal shape functions
$\int \psi_i(x)\, L\Big(\sum_j v_j \phi_j(x)\Big)\,dx = \int \psi_i(x) f(x)\,dx$
Part II
Hierarchical Generating System
[Figure: hierarchical generating system on levels l = 1, 2, 3: functions Φ_{l,i} at the odd grid points x_{l,i} (spaces W_l) next to the full grids of level l (spaces V_l)]
• test functions $\phi_{2h,i}(x) = \frac{1}{2}\phi_{h,2i-1}(x) + \phi_{h,2i}(x) + \frac{1}{2}\phi_{h,2i+1}(x)$
• not a hierarchical transform, but a restriction operation $R_h^{2h}$:
$\vec\phi_{2h} = R_h^{2h}\,\vec\phi_h$, thus: $A_h^{2h} = R_h^{2h} A_h$
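The representation of a coarse hat function by three fine hat functions can be verified pointwise. The sketch assumes h = 1/8, the coarse index i = 2 and a sampling grid of 257 points, all chosen only for this illustration.

```python
import numpy as np

def hat(x, center, width):
    """Hat function with peak 1 at `center` and support [center - width, center + width]."""
    return np.maximum(1.0 - np.abs(x - center) / width, 0.0)

h = 1.0 / 8                              # assumed fine mesh size
x = np.linspace(0.0, 1.0, 257)           # sampling points
i = 2                                    # assumed coarse index

coarse = hat(x, i * 2 * h, 2 * h)        # phi_{2h,i}, centred at x = 2ih
combined = (0.5 * hat(x, (2 * i - 1) * h, h)
            + hat(x, 2 * i * h, h)
            + 0.5 * hat(x, (2 * i + 1) * h, h))

print(np.abs(coarse - combined).max())
```

The weights 1/2, 1, 1/2 are the values of the coarse hat function at the three fine grid points inside its support.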
Multi-Level System of Equations (2)
Symmetric Gauss-Seidel on $A_{GS} v_{GS} = b_{GS}$
• pick out second block row of the system $A_{GS} v_{GS} = b_{GS}$:
• observation #1: relaxation works on restricted fine-grid residual
$R_h^{2h}\left(b_h - A_h v_h^{(new)}\right)$
• observation #2: relaxation considers prolongated coarse-grid correction
from previous iteration: $A_{2h} P_{4h}^{2h} v_{4h}^{(old)}$
⇒ FEM on hierarchical generating systems matches V-Cycle Multigrid
with Galerkin coarsening
Literature/References – Multigrid
Scientific Computing II
Conjugate Gradient Methods
Michael Bader
Technical University of Munich
Summer 2022
Families of Iterative Solvers
• relaxation methods:
• Jacobi-, Gauss-Seidel-Relaxation, . . .
• Over-Relaxation-Methods
• Krylov methods:
• Steepest Descent, Conjugate Gradient, . . .
• GMRES, . . .
• Multilevel/Multigrid methods,
Domain Decomposition, . . .
$r^{(i)} = b - A x^{(i)}$
$A e^{(i)} = r^{(i)}$
[Figure: surface plot and contour plot of f over the (x, y) plane]
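$\frac{\partial}{\partial\alpha} f\left(x^{(1)}\right) = 0$
• use chain rule:
$\frac{\partial}{\partial\alpha} f\left(x^{(1)}\right) = f'\left(x^{(1)}\right)^T \frac{\partial}{\partial\alpha} x^{(1)} = f'\left(x^{(1)}\right)^T r^{(0)}$
• remember $f'\left(x^{(1)}\right) = -r^{(1)}$, thus:
$-\left(r^{(1)}\right)^T r^{(0)} \overset{!}{=} 0$
$\left(r^{(1)}\right)^T r^{(0)} = \left(b - A x^{(1)}\right)^T r^{(0)} = 0$
$\left(b - A\left(x^{(0)} + \alpha r^{(0)}\right)\right)^T r^{(0)} = 0$
$\left(b - A x^{(0)}\right)^T r^{(0)} - \alpha \left(A r^{(0)}\right)^T r^{(0)} = 0$
$\left(r^{(0)}\right)^T r^{(0)} - \alpha \left(r^{(0)}\right)^T A r^{(0)} = 0$
Solve for α:
$\alpha = \frac{\left(r^{(0)}\right)^T r^{(0)}}{\left(r^{(0)}\right)^T A r^{(0)}}$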
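With this step size, a steepest descent iteration can be sketched as below. The 2×2 symmetric positive definite system, the start vector, the tolerance and the function name are assumptions made for this illustration.

```python
import numpy as np

def steepest_descent(A, b, x, tol=1e-10, max_iter=10_000):
    """Steepest descent for symmetric positive definite A; returns (x, iterations)."""
    for it in range(max_iter):
        r = b - A @ x                        # residual = direction of steepest descent
        if np.linalg.norm(r) <= tol:
            return x, it
        alpha = (r @ r) / (r @ (A @ r))      # step size derived above
        x = x + alpha * r
    return x, max_iter

# hypothetical example system with solution (2, -2)
A = np.array([[3.0, 2.0],
              [2.0, 6.0]])
b = np.array([2.0, -8.0])

x, iterations = steepest_descent(A, b, np.array([-2.0, -2.0]))
print(x, iterations)
```

Even for this tiny system the method needs many more than two steps, because consecutive search directions are orthogonal and earlier directions are revisited again and again.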
1. $r^{(i)} = b - A x^{(i)}$
2. $\alpha_i = \frac{\left(r^{(i)}\right)^T r^{(i)}}{\left(r^{(i)}\right)^T A r^{(i)}}$
3. $x^{(i+1)} = x^{(i)} + \alpha_i r^{(i)}$
[Figure: plot over the (x, y) plane]
Observations:
Conjugate Gradients
Conjugate Directions
A-Orthogonality
Conjugate Gradients
A Miracle Occurs . . .
CG Algorithm
• observation:
Steepest Descent takes repeated steps in the same direction
• obvious idea:
try to do only one step in each direction
• possible approach:
choose orthogonal search directions $d^{(0)} \perp d^{(1)} \perp d^{(2)} \perp \dots$
• notice:
errors then orthogonal to previous directions:
• formula for α:
$\alpha = \frac{\left(d^{(0)}\right)^T e^{(0)}}{\left(d^{(0)}\right)^T d^{(0)}}$
• but: we don’t know $e^{(0)}$
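$\frac{\partial}{\partial\alpha} f\left(x^{(i+1)}\right) = f'\left(x^{(i+1)}\right)^T \frac{\partial}{\partial\alpha} x^{(i+1)} = 0$
$\Leftrightarrow\; -\left(r^{(i+1)}\right)^T d^{(i)} = 0$
$\Leftrightarrow\; -\left(d^{(i)}\right)^T A e^{(i+1)} = 0$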
$d^{(i)} = u^{(i)} + \sum_{k=0}^{i-1} \beta_{ik}\, d^{(k)}$
$\beta_{ik} = -\frac{\left(u^{(i)}\right)^T A d^{(k)}}{\left(d^{(k)}\right)^T A d^{(k)}}$
• needs to keep all old search vectors in memory
• O(n3 ) computational complexity ⇒ infeasible (much too expensive!)
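The conjugation process behind these formulas can be sketched directly. The code keeps every previous direction in a list, which is exactly the memory and cost problem named in the two bullets. The function name, the choice of unit vectors as $u^{(i)}$ and the 3×3 test matrix are assumptions for this example.

```python
import numpy as np

def conjugate_directions(A, U):
    """Turn the columns u^(i) of U into A-orthogonal directions d^(i)."""
    directions = []
    for u in U.T:
        d = u.astype(float)                          # fresh copy of u^(i)
        for d_k in directions:                       # all old directions are needed
            beta = -(u @ A @ d_k) / (d_k @ A @ d_k)
            d = d + beta * d_k
        directions.append(d)
    return np.column_stack(directions)

# hypothetical symmetric positive definite test matrix
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
D = conjugate_directions(A, np.eye(3))
G = D.T @ A @ D                                      # should be diagonal
print(np.round(G, 12))
```

The matrix `G` collects all products $(d^{(i)})^T A d^{(k)}$; its off-diagonal entries vanish up to rounding.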
2. propagation of residuals
$r^{(i+1)} = A e^{(i+1)} = A\left(e^{(i)} - \alpha_i d^{(i)}\right)$
$\Rightarrow\; r^{(i+1)} = r^{(i)} - \alpha_i A d^{(i)}$
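• $\left(r^{(i)}\right)^T r^{(j)} = 0$, if $i \neq j$:
$\left(r^{(i)}\right)^T A d^{(j)} = \begin{cases} \frac{1}{\alpha_i}\,\left(r^{(i)}\right)^T r^{(i)}, & i = j \\ -\frac{1}{\alpha_{i-1}}\,\left(r^{(i)}\right)^T r^{(i)}, & i = j + 1 \\ 0 & \text{otherwise.} \end{cases}$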
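• remember: $\alpha_i = \frac{\left(d^{(i)}\right)^T r^{(i)}}{\left(d^{(i)}\right)^T A d^{(i)}}$
• thus: $\alpha_i \left(d^{(i)}\right)^T A d^{(i)} = \left(d^{(i)}\right)^T r^{(i)}$
$\Rightarrow\; \beta_{i+1} = \frac{\left(r^{(i+1)}\right)^T r^{(i+1)}}{\left(d^{(i)}\right)^T r^{(i)}} = \frac{\left(r^{(i+1)}\right)^T r^{(i+1)}}{\left(r^{(i)}\right)^T r^{(i)}}$
• last step: $\left(d^{(i)}\right)^T r^{(i)} = \left(r^{(i)} + \beta_i d^{(i-1)}\right)^T r^{(i)} = \left(r^{(i)}\right)^T r^{(i)} + \beta_i \left(d^{(i-1)}\right)^T r^{(i)} = \left(r^{(i)}\right)^T r^{(i)}$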
1. $\alpha_i = \frac{\left(r^{(i)}\right)^T r^{(i)}}{\left(d^{(i)}\right)^T A d^{(i)}}$
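A conjugate gradient iteration built around this step size can be sketched as follows. The remaining steps (solution update, residual propagation, β as a ratio of residual norms, new search direction) follow the formulas derived above in their standard order; the function name, the tolerance and the 2×2 example are assumptions.

```python
import numpy as np

def conjugate_gradients(A, b, x, tol=1e-10, max_iter=None):
    """CG for symmetric positive definite A; at most len(b) steps by default."""
    r = b - A @ x
    d = r.copy()                         # first search direction: the residual
    rr = r @ r
    for _ in range(max_iter or len(b)):
        if np.sqrt(rr) <= tol:
            break
        Ad = A @ d
        alpha = rr / (d @ Ad)            # step size (uses d^T r = r^T r)
        x = x + alpha * d
        r = r - alpha * Ad               # propagation of residuals
        rr_new = r @ r
        d = r + (rr_new / rr) * d        # new direction with beta = rr_new / rr
        rr = rr_new
    return x

# hypothetical example system with solution (2, -2)
A = np.array([[3.0, 2.0],
              [2.0, 6.0]])
b = np.array([2.0, -8.0])
x = conjugate_gradients(A, b, np.array([-2.0, -2.0]))
print(x)
```

Only one matrix-vector product and a few vector operations are needed per step, and no old search directions have to be stored.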
Michael Bader
Technical University of Munich
Summer 2022
Part I
Preconditioning
Michael Bader | Scientific Computing II | Conjugate Gradients and Preconditioning | Summer 2022 2
Conjugate Gradients – Convergence
Convergence Analysis:
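• uses Krylov subspace:
$\mathrm{span}\left\{ r^{(0)}, A r^{(0)}, A^2 r^{(0)}, \dots, A^{i-1} r^{(0)} \right\}$
Convergence Results:
• in principle: direct method (n steps)
(however: orthogonality lost due to round-off errors → exact solution not found)
• in practice: iterative scheme
$\left\| e^{(i)} \right\|_A \le 2 \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^i \left\| e^{(0)} \right\|_A, \quad \kappa = \lambda_{max} / \lambda_{min}$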
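The bound is easy to evaluate for concrete condition numbers. The sketch below computes the bound and the smallest iteration count for which it drops below a requested reduction; the function names and sample values are assumptions for this illustration.

```python
import math

def cg_error_bound(kappa, i):
    """Upper bound on ||e^(i)||_A / ||e^(0)||_A after i CG steps."""
    s = math.sqrt(kappa)
    return 2.0 * ((s - 1.0) / (s + 1.0)) ** i

def iterations_for(kappa, reduction):
    """Smallest i with cg_error_bound(kappa, i) <= reduction (requires kappa > 1)."""
    s = math.sqrt(kappa)
    rate = (s - 1.0) / (s + 1.0)
    return math.ceil(math.log(reduction / 2.0) / math.log(rate))

for kappa in (9.0, 100.0, 10_000.0):
    print(kappa, iterations_for(kappa, 1e-6))
```

The iteration count predicted by the bound grows roughly like the square root of κ, which is the motivation for reducing κ by preconditioning.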
Preconditioning
• convergence depends on matrix A
• idea: modify linear system
$Ax = b \;\to\; M^{-1}Ax = M^{-1}b$,
CG and Preconditioning
• just replace A by $M^{-1}A$ in the algorithm??
• problem: $M^{-1}A$ not necessarily symmetric
(even if M and A both are)
• we will try an alternative first: symmetric preconditioning
$Ax = b \;\to\; L^T A L \hat x = L^T b, \quad x = L\hat x$
• guarantees symmetry: $\left(L^T A L\right)^T = L^T A^T \left(L^T\right)^T = L^T A L$
“Change-of-Basis” Preconditioning
• preconditioned system of equations:
$Ax = b \;\to\; \underbrace{L^T A L}_{=:\,\hat A}\,\hat x = \underbrace{L^T b}_{=:\,\hat b}, \quad x = L\hat x$
• computation of residual:
$\hat r = \hat b - \hat A \hat x = L^T b - L^T A L \hat x = L^T (b - Ax) = L^T r$
• computation of α (for preconditioned system):
$\alpha_i := \frac{\left(\hat r^{(i)}\right)^T \hat r^{(i)}}{\left(\hat d^{(i)}\right)^T \hat A\, \hat d^{(i)}} = \frac{\left(\hat r^{(i)}\right)^T \hat r^{(i)}}{\left(\hat d^{(i)}\right)^T L^T A L\, \hat d^{(i)}} = \frac{\left(\hat r^{(i)}\right)^T \hat r^{(i)}}{\left(\tilde d^{(i)}\right)^T A\, \tilde d^{(i)}}$
where we defined $L \hat d^{(i)} =: \tilde d^{(i)}$
• update of solution:
$\hat x^{(i+1)} = \hat x^{(i)} + \alpha_i \hat d^{(i)}$
$\Rightarrow\; x^{(i+1)} = L\hat x^{(i+1)} = L\hat x^{(i)} + L\alpha_i \hat d^{(i)} = x^{(i)} + \alpha_i \tilde d^{(i)}$
“Change-of-Basis” Preconditioning (2)
• update of residual r̂ :
CG with “Change-of-Basis” Preconditioning
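Start with $\hat r^{(0)} = L^T \left(b - A x^{(0)}\right)$ and $\tilde d^{(0)} = L\, \hat r^{(0)}$;
While $\|\hat r^{(i)}\| > \epsilon$ iterate over:
1. $\alpha_i = \frac{\left(\hat r^{(i)}\right)^T \hat r^{(i)}}{\left(\tilde d^{(i)}\right)^T A\, \tilde d^{(i)}}$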
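One possible completion of this loop is sketched below. Only the start values and the step size are taken from the text; the remaining updates are derived here from the standard CG steps applied to $\hat A = L^T A L$ with $\tilde d = L \hat d$, so they are an assumption rather than a transcript of the lecture algorithm. The diagonal scaling used as L and the test matrix are hypothetical.

```python
import numpy as np

def cg_change_of_basis(A, b, L, x, tol=1e-10, max_iter=200):
    """CG on L^T A L x_hat = L^T b, written in terms of x, r_hat and d_tilde."""
    r_hat = L.T @ (b - A @ x)
    d_til = L @ r_hat
    rr = r_hat @ r_hat
    for _ in range(max_iter):
        if np.sqrt(rr) <= tol:
            break
        Ad = A @ d_til
        alpha = rr / (d_til @ Ad)                    # step size as given above
        x = x + alpha * d_til                        # x = L x_hat
        r_hat = r_hat - alpha * (L.T @ Ad)           # r_hat - alpha * A_hat d_hat
        rr_new = r_hat @ r_hat
        d_til = L @ r_hat + (rr_new / rr) * d_til    # L (r_hat + beta d_hat)
        rr = rr_new
    return x

# hypothetical SPD test matrix with a strongly varying diagonal
n = 6
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1) + np.diag(np.arange(n) ** 2.0)
b = np.ones(n)
L = np.diag(1.0 / np.sqrt(np.diag(A)))               # simple diagonal scaling as L
x = cg_change_of_basis(A, b, L, np.zeros(n))
print(x)
```

The loop never forms $L^T A L$ explicitly; it only needs the effect of L and $L^T$ on a vector.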
Part II
Hierarchical Basis Preconditioning
Specifics for CG implementation, if $L$/$L^T$ stems from hierarchical transform:
• $L$ transforms coefficient vector from hierarchical basis to nodal basis,
for example $x = L\hat x$ or $\tilde d = L\hat d$
• $L^T$ transforms the vector of basis functions from nodal basis to
hierarchical basis (cmp. FEM), thus $\hat r = L^T r$
• effect of $L$ and $L^T$ can be computed in O(N) operations
“Change-of-Basis” Preconditioning
Hierarchical vs. “non-hierarchical” vectors
CG with Hierarchical Generating Systems
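Recall: system of linear equations $A_{GS} v_{GS} = b_{GS}$ given as
$\begin{pmatrix} A_h & A_h P_{2h}^{h} & A_h P_{4h}^{h} \\ R_h^{2h} A_h & A_{2h} & A_{2h} P_{4h}^{2h} \\ R_h^{4h} A_h & R_{2h}^{4h} A_{2h} & A_{4h} \end{pmatrix} \begin{pmatrix} v_h \\ v_{2h} \\ v_{4h} \end{pmatrix} = \begin{pmatrix} b_h \\ R_h^{2h} b_h \\ R_h^{4h} b_h \end{pmatrix}$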
[Figure: Φ1 Φ2 Φ3 Φ4 Φ5 Φ6 Φ7 → Ψ1 Ψ2 Ψ3 Ψ5 Ψ6 Ψ7 at grid points x1, ..., x7]
Matrices for change of basis are then: ($H_3$ to transform to hierarchical basis)
$H_3^{(1)} = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ \frac{1}{2} & 1 & \frac{1}{2} & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{1}{2} & 1 & \frac{1}{2} & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{2} & 1 & \frac{1}{2} \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$
$H_3^{(2)} = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{2} & 0 & 1 & 0 & \frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$
Hierarchical Basis Transformation
Level-wise hierarchical transform:
• hierarchical basis transformation: $\psi_{n,i}(x) = \sum_j H_{i,j}\,\phi_{n,j}(x)$
• written as matrix-vector product: $\vec\psi_n = H_n \vec\phi_n$
• $H_n \vec\phi_n$ can be performed as a sequence of level-wise transforms:
Hierarchical Coordinate Transformation
• transform $b = H_n^T a$ turns “hierarchical” coefficients a into “nodal”
coefficients b:
$\sum_j b_j \phi_{n,j}(x) = \sum_i a_i \psi_{n,i}(x) \approx f(x)$
• $H_n = H_n^{(n-1)} H_n^{(n-2)} \dots H_n^{(2)} H_n^{(1)}$
has a level-wise representation, thus:
$H_n^T = \left(H_n^{(1)}\right)^T \left(H_n^{(2)}\right)^T \dots \left(H_n^{(n-2)}\right)^T \left(H_n^{(n-1)}\right)^T$
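The level-wise structure can be sketched in two ways: by building the matrices $H_n^{(k)}$ explicitly, and by applying $H_n^T$ to a coefficient vector in place without any matrix. The index convention (1-based grid indices i = 1, ..., 2^n - 1, with level k modifying the rows whose index is a multiple of 2^k) is inferred from the structure of the level matrices and should be read as an assumption; all function names are hypothetical.

```python
import numpy as np

def level_matrix(n, k):
    """H_n^(k): rows i with i % 2**k == 0 collect 1/2 from the neighbours i -/+ 2**(k-1)."""
    N = 2**n - 1
    H = np.eye(N)
    step = 2 ** (k - 1)
    for i in range(2**k, N + 1, 2**k):       # 1-based grid indices
        H[i - 1, i - 1 - step] = 0.5
        H[i - 1, i - 1 + step] = 0.5
    return H

def hierarchical_matrix(n):
    H = np.eye(2**n - 1)
    for k in range(1, n):
        H = level_matrix(n, k) @ H           # H_n = H^(n-1) ... H^(2) H^(1)
    return H

def to_nodal(a, n):
    """b = H_n^T a, applied level by level in O(N) operations."""
    b = np.array(a, dtype=float)
    for k in range(n - 1, 0, -1):            # (H^(n-1))^T first, (H^(1))^T last
        step = 2 ** (k - 1)
        for i in range(2**k, 2**n, 2**k):
            b[i - 1 - step] += 0.5 * b[i - 1]
            b[i - 1 + step] += 0.5 * b[i - 1]
    return b

# coefficient 1 for the coarsest hat function (grid index 4 of 7), 0 elsewhere
a = np.zeros(7)
a[3] = 1.0
print(to_nodal(a, 3))
```

For this input the nodal coefficients are the values of the coarsest hat function at the seven grid points, and the number of additions per level halves from level to level.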
Part III
Matrix-based Preconditioning
CG and Preconditioning (revisited)
• preconditioning: replace A by $M^{-1}A$
• problem: $M^{-1}A$ not necessarily symmetric
• compare symmetric preconditioning
$Ax = b \;\to\; E^{-T} A E^{-1} \hat x = E^{-T} b, \quad \hat x = Ex$
• what if E cannot be computed (efficiently)?
(neither $M$ nor $M^{-1}$ might be known explicitly!)
• $E$, $E^{-T}$, $E^{-1}$ can be eliminated from algorithm
(again requires some re-computations):
set $\hat d = Ed$ and use $\hat r = E^{-T} r$, $\hat x = Ex$, $E^{-1}E^{-T} = M^{-1}$
CG with Preconditioner
Start: $r^{(0)} = b - A x^{(0)}$; $d^{(0)} = M^{-1} r^{(0)}$
1. $\alpha_i = \frac{\left(r^{(i)}\right)^T M^{-1} r^{(i)}}{\left(d^{(i)}\right)^T A d^{(i)}}$
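A preconditioned CG loop with these start values can be sketched as follows. Only the start and the step size are given in the text; the other steps are the standard preconditioned CG updates and are filled in here as an assumption. The preconditioner is passed as a function `apply_M_inv` (a hypothetical name) so that only the effect of $M^{-1}$ on a vector is needed; Jacobi preconditioning serves as the example.

```python
import numpy as np

def preconditioned_cg(A, b, apply_M_inv, x, tol=1e-10, max_iter=200):
    """Preconditioned CG; apply_M_inv(r) returns M^{-1} r."""
    r = b - A @ x
    z = apply_M_inv(r)
    d = z.copy()                             # d^(0) = M^{-1} r^(0)
    rz = r @ z                               # r^T M^{-1} r
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:
            break
        Ad = A @ d
        alpha = rz / (d @ Ad)                # step size as given above
        x = x + alpha * d
        r = r - alpha * Ad
        z = apply_M_inv(r)
        rz_new = r @ z
        d = z + (rz_new / rz) * d
        rz = rz_new
    return x

# hypothetical SPD test matrix with a strongly varying diagonal
n = 6
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1) + np.diag(np.arange(n) ** 2.0)
b = np.ones(n)
diag_A = np.diag(A)
x = preconditioned_cg(A, b, lambda r: r / diag_A, np.zeros(n))   # Jacobi: M = D_A
print(x)
```

Any routine that returns an approximate solution of $Mz = r$ could be plugged in for `apply_M_inv`, provided the corresponding M is symmetric positive definite.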
Implementation
Preconditioners for CG – Examples
Find $M \approx A$ and compute effect of $M^{-1}$:
• Jacobi preconditioning: $M := D_A$
• (Symmetric) Gauss-Seidel preconditioning: $M := L_A$ or
$M = (D_A + L'_A)\,D_A^{-1}\,(D_A + (L'_A)^T)$, etc.
Just compute effect of M −1 :
• any approximate solver might do → incl. multigrid methods
• incomplete Cholesky factorization
→ i.e., incomplete LU-decomp. (ILU) for symm. positive definite matrix
• use a multigrid method as preconditioner(?)
→ worthwhile (only) in situations where multigrid does not work (well) as
stand-alone solver
Find an $M^{-1}$ similar to $A^{-1}$
• “sparse approximate inverse” (SPAI)
• tries to minimise $\|I - MA\|_F$, where M is a matrix with (given) sparse
non-zero pattern
Preconditioners – ILU and Incomplete Cholesky
$A = LD^{-1}U$ or $A = LD^{-1}L^T$,
Cholesky Factorization
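\[
\begin{pmatrix} L_{11} & & \\ L_{21} & L_{22} & \\ L_{31} & L_{32} & L_{33} \end{pmatrix}
\begin{pmatrix} D_{11}^{-1} & & \\ & D_{22}^{-1} & \\ & & D_{33}^{-1} \end{pmatrix}
\begin{pmatrix} L_{11}^T & L_{21}^T & L_{31}^T \\ & L_{22}^T & L_{32}^T \\ & & L_{33}^T \end{pmatrix}
=
\begin{pmatrix} A_{11} & A_{21}^T & A_{31}^T \\ A_{21} & A_{22} & A_{32}^T \\ A_{31} & A_{32} & A_{33} \end{pmatrix}
\]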
Incomplete Cholesky Factorization
Algorithm: ( $A \rightarrow L D^{-1} L^T$ )
• initialize $D := 0$, $L := 0$
• for $i = 1, \dots, n$:
  1. for $k = 1, \dots, i-1$:
     if $(i,k) \in S$ then set $L_{ik} := A_{ik} - \sum'_{j<k} L_{ij} D_{jj}^{-1} L_{kj}$
  2. set $L_{ii} = D_{ii} := A_{ii} - \sum'_{j<i} L_{ij} D_{jj}^{-1} L_{ij}$
• note: the sums $\sum'_{j<k}$ and $\sum'_{j<i}$ only consider non-zero elements $\in S$
• uses given pattern S of non-zero elements in the factorization
(frequent choice: use non-zeros of A for S)
• the incomplete Cholesky factorization is computed in O(n) operations for sparse matrices
(with c · n non-zeros)
• frequently used for preconditioning
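The algorithm above can be sketched in a few lines. This is a minimal dense-matrix illustration (a real implementation would use sparse storage); the function name `incomplete_ldlt` and the boolean `pattern` argument standing for $S$ are assumptions made for the example. Entries of $L$ outside $S$ stay zero, so the plain sums in the code automatically skip them.

```python
import numpy as np

def incomplete_ldlt(A, pattern=None):
    """Sketch of the incomplete factorization A ~ L D^{-1} L^T on pattern S."""
    n = A.shape[0]
    if pattern is None:
        pattern = A != 0                 # frequent choice: non-zeros of A
    L = np.zeros((n, n))
    D = np.zeros(n)
    for i in range(n):
        for k in range(i):
            if pattern[i, k]:
                # L_ik := A_ik - sum_{j<k} L_ij D_jj^{-1} L_kj
                L[i, k] = A[i, k] - sum(L[i, j] * L[k, j] / D[j] for j in range(k))
        # L_ii = D_ii := A_ii - sum_{j<i} L_ij D_jj^{-1} L_ij
        D[i] = A[i, i] - sum(L[i, j] ** 2 / D[j] for j in range(i))
        L[i, i] = D[i]
    return L, D

# Usage: for a tridiagonal matrix no fill-in occurs, so the result is exact
n = 6
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L, D = incomplete_ldlt(A)
```

For matrices with fill-in (e.g. the 2D Poisson matrix), the product $L D^{-1} L^T$ only approximates $A$, which is what makes it usable as a preconditioner $M$.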
Literature/References
Conjugate Gradients:
• Shewchuk: An Introduction to the Conjugate Gradient Method Without
the Agonizing Pain.
• Hackbusch: Iterative Solution of Large Sparse Systems of Equations,
Springer 1993.
• M. Griebel: Multilevelmethoden als Iterationsverfahren über
Erzeugendensystemen, Teubner Skripten zur Numerik, 1994
M. Griebel: Multilevel algorithms considered as iterative methods on
semidefinite systems, SIAM Int. J. Sci. Stat. Comput. 15(3), 1994.
Scientific Computing II
Molecular Dynamics Simulation – Introduction
Michael Bader – SCCS | Scientific Computing II | Molecular Dynamics – Intro | Summer 2022 3
Overview: Particle-Oriented Simulation Methods
General Modelling Approach:
• “N-body problem”
→ compute motion paths of many individual particles
• requires modelling and computation of inter-particle forces
• typically leads to ODE for particle positions and velocities
Numerical Aspects:
• how to discretize the resulting modelling equations?
• efficient time stepping algorithms?
Implementation Aspects:
• suitable data structures?
• efficient algorithms to compute short- and long-range forces?
• parallelisation?
Some types of N-Body Simulations
Smoothed Particle Hydrodynamics (SPH):
• Approximate gas/fluid via many particles, smoothed over gaps
• Cosmology simulations
Discrete Element Methods (DEM):
• Particles have geometry
• Complex contact potentials (e.g. twisting friction)
Molecular Dynamics (MD):
• Particles are single- or multi-atom molecules
• Protein folding, free energy calculations, . . .
Molecular Dynamics – Large Complex Molecules
• Short- and long-range interaction potentials
• E.g. life-science applications
Molecular Dynamics – Small Rigid Molecules
• Large number of particles
• E.g. thermodynamics applications
ls1 Mardyn
[Figure: simulation snapshot; slide label: Thermo, Prof. Dr.-Ing. habil. Jadran Vrabec, Fachgebiet Thermodynamik und Thermische Verfahrenstechnik. Annotation: delete particles of forward flux $j^+$. Colouring: boundary conditions, forward moving particles, backward moving particles (at instance of time).]
HPC Example – ACM Gordon Bell Prize 2014
Anton 2 – special-purpose MD supercomputer:
[Figure: (a) An Anton 2 ASIC. (b) An Anton 2 node board, with one ASIC underneath a phase-change heat sink. (c) A 512-node Anton 2 machine, with four racks.]
(Shaw et al.: Anton 2: Raising the Bar for Performance and Programmability in a
Special-Purpose Molecular Dynamics Supercomputer, DOI: 10.1109/SC.2014.9)
• special-purpose ASIC to support event-driven computation,
“pairwise point interaction pipelines” etc.
• nodes/ASICs directly connected to form a 3D torus topology
• dedicated to COVID-19 research in 2019
HPC Example – Millennium-XXL Project
Simulation Figures:
• $N = 3 \cdot 10^{11}$ particles
• 10 TB RAM required only to store positions and velocities (32-bit floats)
• entire memory requirements: 29 TB
• JuRoPa Supercomputer (Jülich)
• computation on 1536 nodes (each 2x QuadCore, i.e. 12 288 cores)
• hybrid parallelisation: MPI plus OpenMP/Posix threads
• execution time: 9.3 days (ca. 300 CPU years)
[Figure: Development of N-body problem sizes for cosmology simulations (source: www.magneticum.org)]
Scales – an Important Issue
Laws of Motion
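\[
\ddot{\vec r}_i = \frac{\vec F_i}{m_i} = \frac{\sum_{j \ne i} \vec F_{ij}}{m_i}
= -\frac{\sum_{j \ne i} \frac{\partial U(\vec r_i, \vec r_j)}{\partial |r_{ij}|}}{m_i} \qquad (1)
\]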
• system of dN ODE (2nd order)
(N: number of molecules, d: dimension),
• reformulated into a system of 2dN 1st-order ODEs:
Example: Hooke’s Law
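[Figure: two particles i and j at distance $r_{ij}$]
• “harmonic potential”: $U_{harm}(r_{ij}) = \frac{1}{2} k \left( r_{ij} - r_0 \right)^2$
  1D: $F_{ij} = -\operatorname{grad} U(r_{ij}) = -\dfrac{\partial U}{\partial r_{ij}} = -k \left( r_{ij} - r_0 \right)$
  2D, 3D: $\vec F_{ij} = -k \left( \vec r_{ij} - \vec r_0 \right)$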
Example: Gravity
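Example: Coulomb Potential
[Figure: two charges $q_1$ (+) and $q_2$ (−) at distance $r_{12}$]
• attractive or repulsive force between charged particles
• Coulomb potential: $U_{col}(r_{ij}) = \dfrac{1}{4\pi\varepsilon_0} \dfrac{q_i q_j}{r_{ij}}$
• resulting force:
  1D: $F_{ij} = -\operatorname{grad} U(r_{ij}) = \dfrac{1}{4\pi\varepsilon_0} \dfrac{q_i q_j}{r_{ij}^2}$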
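The relation "force = negative gradient of the potential" can be checked numerically. The sketch below compares the analytic 1D forces of the harmonic and the Coulomb potential with a central finite difference of the potential; all function names and the parameter values in the usage lines are illustrative assumptions.

```python
import numpy as np

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def harmonic_potential(r, k, r0):
    return 0.5 * k * (r - r0) ** 2

def harmonic_force(r, k, r0):
    return -k * (r - r0)

def coulomb_potential(r, qi, qj):
    return qi * qj / (4.0 * np.pi * EPS0 * r)

def coulomb_force(r, qi, qj):
    return qi * qj / (4.0 * np.pi * EPS0 * r ** 2)

def numerical_force(potential, r, h=1e-6, **params):
    """F = -dU/dr, approximated by a central difference."""
    return -(potential(r + h, **params) - potential(r - h, **params)) / (2.0 * h)

# Usage: analytic and numerical forces should agree
f_harm = harmonic_force(1.5, k=2.0, r0=1.0)
f_harm_num = numerical_force(harmonic_potential, 1.5, k=2.0, r0=1.0)
f_coul = coulomb_force(1.0, qi=1e-6, qj=1e-6)
f_coul_num = numerical_force(coulomb_potential, 1.0, qi=1e-6, qj=1e-6)
```

With equal signs of the charges the Coulomb force is positive (repulsive); the harmonic force pulls the distance back towards $r_0$.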
Example – Smoothed Particle Hydrodynamics
Example – Smoothed Particle Hydrodynamics (2)
• approximate integrals at particle positions:
  $f(r_i) \approx \sum_{j=1}^{N} \dfrac{m_j}{\rho(r_j)} \, f(r_j) \, W(|r_i - r_j|, h)$
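The summation formula can be tried out in 1D. The sketch below assumes a normalised Gaussian kernel for $W$ (the slide does not fix a kernel), equally spaced particles, and the helper names `gaussian_kernel`, `sph_density`, and `sph_interpolate`; the density is estimated with the same formula by setting $f = \rho$.

```python
import numpy as np

def gaussian_kernel(r, h):
    """Assumed 1D smoothing kernel W(r, h), normalised to integral 1."""
    return np.exp(-(r / h) ** 2) / (h * np.sqrt(np.pi))

def sph_density(x, m, h):
    """rho(x_i) ~ sum_j m_j W(|x_i - x_j|, h)."""
    dist = np.abs(x[:, None] - x[None, :])
    return (m[None, :] * gaussian_kernel(dist, h)).sum(axis=1)

def sph_interpolate(f_values, x, m, rho, h):
    """f(x_i) ~ sum_j m_j / rho_j * f(x_j) * W(|x_i - x_j|, h)."""
    dist = np.abs(x[:, None] - x[None, :])
    return ((m / rho * f_values)[None, :] * gaussian_kernel(dist, h)).sum(axis=1)

# Usage: 101 particles on [0, 1], each carrying mass dx (density close to 1)
x = np.linspace(0.0, 1.0, 101)
m = np.full_like(x, 0.01)
rho = sph_density(x, m, h=0.05)
f_approx = sph_interpolate(x, x, m, rho, h=0.05)   # interpolate f(x) = x
```

Away from the domain boundaries the estimates are accurate; near $x = 0$ and $x = 1$ roughly half of the kernel support is empty, so the density is underestimated there.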
Michael Bader – SCCS | Scientific Computing II | Molecular Dynamics – Modelling | Summer 2022 2
Continuum Mechanics for Fluids
Fluid:
• term “fluid” covers liquids and gases
• liquids: hardly compressible
• gases: volume depends on pressure
• both: small resistance to changes of form
Continuum:
• “continuum” = space, continuously filled with mass
• homogeneous
• subdivision into small fluid voxels with constant physical properties is
possible
• idea valid on micro scale upward (where we consider continuous masses
and not discrete particles)
Description of State
Molecular Dynamics for Fluids?
N-Body Problem – Newton’s Laws of Motion:
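• force on a molecule: $\vec F_i = \sum_{j \ne i} \vec F_{ij}$
• leads to acceleration (Newton’s 2nd Law):
\[
\ddot{\vec r}_i = \frac{\vec F_i}{m_i} = \frac{\sum_{j \ne i} \vec F_{ij}}{m_i}
= -\frac{\sum_{j \ne i} \frac{\partial U(\vec r_i, \vec r_j)}{\partial |r_{ij}|}}{m_i} \qquad (1)
\]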
• system of dN ODE (2nd order)
(N: number of molecules, d: dimension),
• reformulated into a system of 2dN 1st-order ODEs:
Continuum vs. Molecular Dynamics
Compare Simulation Results for a (Micro-/Nano-)Channel Flow
For various Knudsen numbers: $Kn = \dfrac{\text{mean free path}}{\text{characteristic length}}$
[Figure: two panels with normalised velocity profiles $u_x(y/H)/u$ over $y/H$, for Kn = 0.1128 and Kn = 4.5135; curves: LBM Li et al.; Present LBM: 2nd Order, VA; 2nd Order Slip NS; Ohwada et al.]
$2.687 \cdot 10^{19}\,\mathrm{cm^{-3}} \cdot 22.413996 \cdot 10^{3}\,\mathrm{cm^{3}\,mol^{-1}} = 6.0221415 \cdot 10^{23}\,\mathrm{mol^{-1}}$
Pour Me A Glass . . .
• assume we want to simulate all molecules in one glass (0.5 l) of water
• assume a simulation over 1 second with a time step size of 1 fs
• assume we only need one floating point operation per molecule in each
time step
→ ≈ $1.67 \cdot 10^{25}$ molecules (0.5 l ≈ 500 g ≈ 27.8 mol of water; biggest MD simulation: ≈ $10^{13}$ molecules)
→ $10^{15}$ time steps
→ ≈ $1.67 \cdot 10^{40}$ operations
Classical Molecular Dynamics
Fundamental Interactions
• Classification of the fundamental interactions:
  – strong nuclear force
  – electromagnetic force
  – weak nuclear force
  – gravity
[Figure: particles at positions $r_i$, $r_j$, $r_k$ relative to the origin O]
there are $\binom{N}{n} = \frac{N!}{n!(N-n)!} \in O(N^n)$ $n$-body potentials $U_n$,
particularly $N$ one-body and $\frac{1}{2} N(N-1)$ two-body potentials
• force $\vec F = -\operatorname{grad} U$
Forces vs. Potentials
[Figure: two particles i and j at distance $r_{ij}$]
• some potentials from mechanics:
  – harmonic potential (Hooke’s law): $U_{harm}(r_{ij}) = \frac{1}{2} k \left( r_{ij} - r_0 \right)^2$
Intermolecular Two-Body Potentials
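• hard sphere potential: $U_{HS}(r_{ij}) = \begin{cases} \infty & \forall\, r_{ij} \le d \\ 0 & \forall\, r_{ij} > d \end{cases}$ (force: Dirac function)
• soft sphere potential: $U_{SS}(r_{ij}) = \varepsilon \left( \dfrac{\sigma}{r_{ij}} \right)^n$
• Square-well potential: $U_{SW}(r_{ij}) = \begin{cases} \infty & \forall\, r_{ij} \le d_1 \\ -\varepsilon & \forall\, d_1 < r_{ij} < d_2 \\ 0 & \forall\, r_{ij} \ge d_2 \end{cases}$
• Sutherland potential: $U_{Su}(r_{ij}) = \begin{cases} \infty & \forall\, r_{ij} \le d \\ -\dfrac{\varepsilon}{r_{ij}^6} & \forall\, r_{ij} > d \end{cases}$
• Lennard-Jones potential
• van der Waals potential: $U_W(r_{ij}) = -4 \varepsilon \sigma^6 \left( \dfrac{1}{r_{ij}} \right)^6$
  $\varepsilon$ = energy parameter
  $\sigma$ = length parameter (rel. to atom diameter, cmp. van der Waals radius)
• Coulomb potential: $U_{col}(r_{ij}) = \dfrac{1}{4\pi\varepsilon_0} \dfrac{q_i q_j}{r_{ij}}$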
Two-Body Potentials: Hard vs. Soft Spheres
[Figure: two panels, potential U over distance r. Left panel “hard sphere potentials”: hard sphere, Square-well, Sutherland. Right panel “soft sphere potentials”: soft sphere, Lennard-Jones, van der Waals. The length parameter σ is marked on the r-axis.]
Van der Waals Attraction
• intermolecular, electrostatic interactions
• electron motion in the electron shell of an atom may result in a temporary asymmetric
charge distribution in the atom; i.e. more electrons (or negative charge,
resp.) on one side of the atom than on the opposite one
• charge displacement ⇒ temporary dipole
• a temporary dipole
– attracts another temporary dipole
– induces an opposite dipole moment in an adjacent non-dipole atom
and attracts it
• dipole moments are very small; the resulting electric attraction forces
are weak and act over a short range only
• atoms have to be very close to attract each other; at larger distances the
two partial charges of a dipole cancel each other
• high temperature (kinetic energy) breaks van der Waals bonds
Lennard-Jones Potential
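• general Lennard-Jones potential:
  $U_{LJ}(r_{ij}) = \alpha \varepsilon \left[ \left( \dfrac{\sigma}{r_{ij}} \right)^n - \left( \dfrac{\sigma}{r_{ij}} \right)^m \right]$
  with $n > m$ and $\alpha = \dfrac{1}{n-m} \left( \dfrac{n^n}{m^m} \right)^{\frac{1}{n-m}}$
• LJ-12-6 potential:
  $U_{LJ}(r_{ij}) = 4 \varepsilon \left[ \left( \dfrac{\sigma}{r_{ij}} \right)^{12} - \left( \dfrac{\sigma}{r_{ij}} \right)^{6} \right]$
[Figure: two particles at positions $r_i$, $r_j$ (origin O) at distance $r_{ij}$, with parameters ε, σ]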
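A short sketch of the LJ-12-6 potential and the corresponding radial force $F = -\partial U / \partial r$ may help to check the formulas: the prefactor of the general potential gives $\alpha = 4$ for $n = 12$, $m = 6$, the potential vanishes at $r = \sigma$, and its minimum $-\varepsilon$ lies at $r = 2^{1/6} \sigma$. Function names and default parameters are illustrative assumptions.

```python
import numpy as np

def lj_alpha(n, m):
    """Prefactor alpha of the general Lennard-Jones potential (n > m)."""
    return (n ** n / m ** m) ** (1.0 / (n - m)) / (n - m)

def lj_potential(r, eps=1.0, sigma=1.0):
    """LJ-12-6 potential U(r) = 4 eps [(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 ** 2 - sr6)

def lj_force(r, eps=1.0, sigma=1.0):
    """Radial force F = -dU/dr (positive values are repulsive)."""
    sr6 = (sigma / r) ** 6
    return 24.0 * eps * (2.0 * sr6 ** 2 - sr6) / r

# Usage: position of the potential minimum
r_min = 2.0 ** (1.0 / 6.0)
```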
Dimensionless Formulation
“Dimensionless”:
use reference values such as $\sigma$, $\varepsilon$, . . . to derive equations in which quantities
no longer carry any dimensional units (m/s, kg, etc.)
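As an illustration, a commonly used set of reduced (dimensionless) quantities for Lennard-Jones systems is given below. This is a sketch of one usual convention, with the molecule mass $m$ as third reference value; the lecture's own choice of reference values may differ in detail.

\[
r^* = \frac{r}{\sigma}, \qquad
U^* = \frac{U}{\varepsilon}, \qquad
F^* = \frac{F \sigma}{\varepsilon}, \qquad
t^* = t \sqrt{\frac{\varepsilon}{m \sigma^2}}, \qquad
T^* = \frac{k_B T}{\varepsilon}
\]

With these definitions the LJ-12-6 potential no longer contains any parameters:

\[
U^*_{LJ}(r^*) = 4 \left[ (r^*)^{-12} - (r^*)^{-6} \right]
\]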
Dimensionless Formulation (2)
Advanced Modelling: Multi-Centered Molecules
• molecules composed of multiple LJ-centers
(rigid bodies without internal degrees of freedom)
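• additionally: orientation (quaternions), angular velocity
• additionally: moment of inertia (principal axes transformation)
• calculation of the interactions between each center of one molecule to each center of the other
• resulting force (sum) acts at the center of gravity,
additional calculation of torque
[Figure: two molecules A and B with centers $C_{A1}$, $C_{A2}$ and $C_{B1}$, $C_{B2}$; pairwise center forces $F_{A1B1}$, $F_{A1B2}$, $F_{A2B1}$, $F_{A2B2}$ and $F_{B1A1}$, $F_{B1A2}$, $F_{B2A1}$, $F_{B2A2}$; resulting forces $F_{AB}$, $F_{BA}$ at the centers of gravity $C_A$, $C_B$]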
Molecular Dynamics – the Mathematical Model
System of ODEs
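• resulting force acting on a molecule: $\vec F_i = \sum_{j \ne i} \vec F_{ij}$
• acceleration of a molecule (Newton’s 2nd law):
\[
\ddot{\vec r}_i = \frac{\vec F_i}{m_i} = \frac{\sum_{j \ne i} \vec F_{ij}}{m_i}
= -\frac{\sum_{j \ne i} \frac{\partial U(\vec r_i, \vec r_j)}{\partial |r_{ij}|}}{m_i} \qquad (5)
\]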
• system of dN coupled ordinary differential equations of 2nd order
(N: number of molecules, d: dimension)
• transferable to 2dN coupled ordinary differential equations of 1st order,
e.g. by introducing velocity ~v := ~r˙ (“derivative” variable),
or (even better) momentum ~p:
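Written out, the two first-order formulations mentioned in the last bullet read as follows (a sketch using the symbols introduced above, for each molecule $i = 1, \dots, N$).

With velocities $\vec v_i := \dot{\vec r}_i$:

\[
\dot{\vec r}_i = \vec v_i, \qquad \dot{\vec v}_i = \frac{\vec F_i}{m_i}
\]

With momenta $\vec p_i := m_i \vec v_i$:

\[
\dot{\vec r}_i = \frac{\vec p_i}{m_i}, \qquad \dot{\vec p}_i = \vec F_i
\]

Each molecule contributes $d$ position and $d$ velocity (or momentum) components, which gives the $2dN$ coupled first-order equations.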
Initial Conditions
Initial Value Problem:
Molecule positions and velocities have to be given:
• place molecules as in a crystal lattice
(body-/face-centered cell)
• choose initial velocity to match temperature:
  $\dfrac{d}{2} N k_B T = \dfrac{1}{2} \sum_{i=1}^{N} m v_i^2$ with $v_i := v_0$
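Solving the temperature relation for the common speed gives $v_0 = \sqrt{d\, k_B T / m}$. The sketch below assigns this speed to every molecule; the random choice of directions, the function names, and the use of $k_B = 1$ (reduced units) are assumptions for illustration.

```python
import numpy as np

def initial_speed(T, m, d=3, kB=1.0):
    """v0 from d/2 N kB T = 1/2 sum_i m v0^2."""
    return np.sqrt(d * kB * T / m)

def initial_velocities(N, T, m, d=3, kB=1.0, seed=0):
    """All molecules get speed v0; directions are random (an assumption)."""
    rng = np.random.default_rng(seed)
    directions = rng.normal(size=(N, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    return initial_speed(T, m, d, kB) * directions

# Usage: 1000 molecules of mass 2 at temperature 1.5 (reduced units)
v = initial_velocities(N=1000, T=1.5, m=2.0)
e_kin = 0.5 * 2.0 * np.sum(v ** 2)
```

The kinetic energy of the generated velocities matches $\frac{d}{2} N k_B T$ by construction.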
NVT Ensemble, Thermostat
• statistical (thermodynamics) “ensemble”: system in equilibrium,
described by a set of macroscopically observable variables
• for the simulation of a (canonical) NVT ensemble, the following values
have to be kept constant:
– N: number of molecules
– V : volume
– T : temperature
• thermostat regulates and controls the temperature (the kinetic energy),
which is fluctuating in a simulation
• kinetic energy specified by the velocity of the molecules: $E_{kin} = \frac{1}{2} \sum_i m_i \vec v_i^{\,2}$
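One simple way to control the temperature is to rescale all velocities by a common factor. The sketch below is an illustrative velocity-scaling thermostat, not necessarily the scheme used in the lecture: it measures the current temperature from $E_{kin} = \frac{d}{2} N k_B T$ and scales with $\beta = \sqrt{T_{target} / T_{act}}$. Function names and $k_B = 1$ are assumptions.

```python
import numpy as np

def kinetic_energy(m, v):
    """E_kin = 1/2 sum_i m_i |v_i|^2 for masses m (N,) and velocities v (N, d)."""
    return 0.5 * np.sum(m[:, None] * v ** 2)

def temperature(m, v, kB=1.0):
    """Current temperature from E_kin = d/2 N kB T."""
    N, d = v.shape
    return 2.0 * kinetic_energy(m, v) / (d * N * kB)

def rescale_velocities(m, v, T_target, kB=1.0):
    """Scale all velocities by beta = sqrt(T_target / T_act)."""
    beta = np.sqrt(T_target / temperature(m, v, kB))
    return beta * v

# Usage: bring a random velocity field to the target temperature 0.8
rng = np.random.default_rng(0)
m = np.ones(200)
v = rng.normal(size=(200, 3))
v_scaled = rescale_velocities(m, v, T_target=0.8)
```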
Periodic Boundary Conditions
Michael Bader
Technical University of Munich
Summer 2022
Computational Effort for N-Body Problems
\[
\frac{d^2}{dt^2} \vec r_i = \frac{\vec F_i}{m_i} = \frac{1}{m_i} \sum_{j \ne i} \vec F_{ij}
\]
Short-Range Potentials
[Figure: dimensionless potential $U^*$ and force $F^*$ over distance $r^*$; force matrix $F_{ij}$ / interaction graph]
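\[
U^*_{LJ,r_c,\mathrm{shifted}}(r^*_{ij}) =
\begin{cases}
U^*_{LJ}(r^*_{ij}) - U^*_{LJ}(r^*_c) & \text{for } r^*_{ij} \le r^*_c \\
0 & \text{for } r^*_{ij} > r^*_c
\end{cases}
\]
\[
\vec F^*_{ij,r_c}(\vec r^*_{ij}) =
\begin{cases}
\vec F^*_{ij}(\vec r^*_{ij}) & \text{for } r^*_{ij} \le r^*_c \\
0 & \text{for } r^*_{ij} > r^*_c
\end{cases}
\]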
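The truncated and shifted potential can be sketched directly in reduced units. The code below is a minimal illustration with the cut-off radius $r_c^* = 2$ as default; the function names are assumptions. Shifting by $U^*_{LJ}(r_c^*)$ makes the potential continuous at the cut-off, while the truncated force still has a small jump there.

```python
import numpy as np

def lj_reduced(r):
    """Dimensionless LJ-12-6 potential U*(r*) = 4 (r*^-12 - r*^-6)."""
    inv6 = r ** -6.0
    return 4.0 * (inv6 ** 2 - inv6)

def lj_truncated_shifted(r, rc=2.0):
    """U*(r*) - U*(rc*) inside the cut-off radius, 0 outside."""
    r = np.asarray(r, dtype=float)
    return np.where(r <= rc, lj_reduced(r) - lj_reduced(rc), 0.0)

def lj_force_truncated(r, rc=2.0):
    """Radial force -dU*/dr* inside the cut-off radius, 0 outside."""
    r = np.asarray(r, dtype=float)
    inv6 = r ** -6.0
    return np.where(r <= rc, 24.0 * (2.0 * inv6 ** 2 - inv6) / r, 0.0)

# Usage: evaluate on a few distances around the cut-off
r = np.array([1.0, 1.5, 2.0, 2.5])
u = lj_truncated_shifted(r)
f = lj_force_truncated(r)
```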
[Figure: two panels over distance $r^*$: “shifted dim. red. finite Lennard-Jones 12-6 Potential (rc=2)” ($U^*$) and “dim. red. finite Lennard-Jones 12-6 Force (rc=2)” ($F^*$)]
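\[
U^*_{LJ,r_c,\mathrm{shifted}}(r^*_{ij}) =
\begin{cases}
U^*_{LJ}(r^*_{ij}) - U^*_{LJ}(r^*_c) - F^*_{LJ}(r^*_c) \left( r^*_{ij} - r^*_c \right) & \text{for } r^*_{ij} \le r^*_c \\
0 & \text{for } r^*_{ij} > r^*_c
\end{cases}
\]
\[
\vec F^*_{ij,r_c,\mathrm{shifted}}(\vec r^*_{ij}) =
\begin{cases}
\vec F^*_{ij}(\vec r^*_{ij}) - F^*_{LJ}(r^*_c) \, \dfrac{\vec r^*_{ij}}{r^*_{ij}} & \text{for } r^*_{ij} \le r^*_c \\
0 & \text{for } r^*_{ij} > r^*_c
\end{cases}
\]
[Figure: potential $U^*$ and force $F^*$ over distance $r^*$ for the shifted variant]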
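[Figure: nine particles (numbered 1 to 9) distributed over the grid cells, together with the cell-wise ordered particle list (1 3 5 8 2 4 6 7 9) and the index list (1 2 3 4 5 6 7 8 9)]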
• runtime: O(n)
• only half (point symmetry) of the
neighbour cells are explicitly traversed
(Newton’s 3rd law)
orange vs. yellow cells in the picture
• erase and generate the data structure
in each time step
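The points above can be illustrated with a small 2D sketch of a cell-based neighbour search. It rebuilds the cell structure from scratch, visits only half of the eight neighbour cells per cell, and collects every close pair exactly once. The sketch ignores periodic boundaries; the function names and the dictionary-based cell storage are assumptions for illustration.

```python
import numpy as np
from collections import defaultdict

def build_cells(pos, box, rc):
    """Sort particle indices into square cells of width >= rc."""
    ncell = max(1, int(box // rc))
    width = box / ncell
    cells = defaultdict(list)
    for i, (x, y) in enumerate(pos):
        cx = min(int(x // width), ncell - 1)
        cy = min(int(y // width), ncell - 1)
        cells[(cx, cy)].append(i)
    return cells

def close(pos, i, j, rc):
    return np.linalg.norm(pos[i] - pos[j]) < rc

def linked_cell_pairs(pos, box, rc):
    """All pairs (i, j), i < j, with distance < rc (no periodic images)."""
    cells = build_cells(pos, box, rc)   # erased and rebuilt on every call
    # half of the 8 neighbour cells (point symmetry): each cell pair visited once
    half_shell = [(1, 0), (-1, 1), (0, 1), (1, 1)]
    pairs = set()
    for (cx, cy), members in cells.items():
        for a, i in enumerate(members):          # pairs within the same cell
            for j in members[a + 1:]:
                if close(pos, i, j, rc):
                    pairs.add((min(i, j), max(i, j)))
        for dx, dy in half_shell:                # pairs with half of the neighbours
            for j in cells.get((cx + dx, cy + dy), []):
                for i in members:
                    if close(pos, i, j, rc):
                        pairs.add((min(i, j), max(i, j)))
    return pairs

# Usage: 150 random particles in a 10 x 10 box, cut-off radius 1.5
rng = np.random.default_rng(1)
pos = rng.random((150, 2)) * 10.0
pairs = linked_cell_pairs(pos, box=10.0, rc=1.5)
```

In a force computation, each collected pair would contribute one force evaluation that is applied to both partners with opposite sign (Newton's 3rd law).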
[Figure: cell grids for different divisors $t \in \mathbb{N}$; panels for t = 1, t = 2, t = 4, t = 3]
Michael Bader | Scientific Computing II | Molecular Dynamics – Forces | Summer 2022 18
Outlook: Parallelisation and Actio=Reactio
“Actio = Reactio”:
• symmetrically acting force between two
molecules
• straightforward optimisation: compute
force once and apply to both molecules
• can lead to race conditions for
parallelisation in shared memory
Mitigation: Colouring Schemes
• graph colouring of linked cells:
adjacent cells have different colours
• only parallelise within cells of the
same colour
• sequential processing of colours.
Scientific Computing II
Molecular Dynamics – Numerics
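• resulting force acting on a molecule: $\vec F_i = \sum_{j \ne i} \vec F_{ij}$
• acceleration of a molecule (Newton’s 2nd law):
\[
\ddot{\vec r}_i = \frac{\vec F_i}{m_i} = \frac{\sum_{j \ne i} \vec F_{ij}}{m_i}
= -\frac{\sum_{j \ne i} \frac{\partial U(\vec r_i, \vec r_j)}{\partial |r_{ij}|}}{m_i}
\]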
Michael Bader – SCCS | Scientific Computing II | Molecular Dynamics – Numerics | Summer 2022 2
Euler Time Stepping for MD
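\[
\vec r(t + \Delta t) = \vec r(t) + \Delta t \, \dot{\vec r}(t) + \frac{1}{2} \Delta t^2 \, \ddot{\vec r}(t) + \dots + \frac{\Delta t^i}{i!} \vec r^{(i)}(t) + \dots \qquad (1)
\]
($\dot r$, $\ddot r$, $r^{(i)}$: derivatives)
• neglecting terms of higher order of $\Delta t$, and analogous formulation of
$\vec v(t) := \dot{\vec r}(t)$ with $\vec a(t) := \dot{\vec v}(t) = \ddot{\vec r}(t) = \frac{\vec F(t)}{m}$ leads to the explicit Euler
method:
  $\vec v(t + \Delta t) \doteq \vec v(t) + \Delta t \, \vec a(t)$
  $\vec r(t + \Delta t) \doteq \vec r(t) + \Delta t \, \vec v(t)$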
Euler Time Stepping for MD (cont.)
Classical Störmer Verlet Method
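• the Taylor series expansion in (1) can also be performed for $-\Delta t$:
\[
\vec r(t - \Delta t) = \vec r(t) - \Delta t \, \dot{\vec r}(t) + \frac{1}{2} \Delta t^2 \, \ddot{\vec r}(t) + \dots + \frac{(-\Delta t)^i}{i!} \vec r^{(i)}(t) + \dots \qquad (4)
\]
• from (1) and (4) the classical Verlet algorithm can be derived (adding both expansions cancels all odd-order terms):
\[
\vec r(t + \Delta t) \doteq 2 \vec r(t) - \vec r(t - \Delta t) + \Delta t^2 \, \ddot{\vec r}(t), \qquad \ddot{\vec r}(t) = \frac{\vec F(t)}{m} \qquad (5)
\]
note: direct calculation of $\vec r(t + \Delta t)$ from $\vec r(t)$ and $\vec F(t)$
• velocity can be estimated via
\[
\vec v(t) = \dot{\vec r}(t) \doteq \frac{\vec r(t + \Delta t) - \vec r(t - \Delta t)}{2 \Delta t} \qquad (6)
\]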
• disadvantage: needs to store two previous time steps
Crank Nicolson Method
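\[
\vec v(t + \tfrac{\Delta t}{2}) = \vec v(t) + \frac{\Delta t}{2} \vec a(t) \qquad (7a)
\]
\[
\vec v(t + \Delta t) = \vec v(t + \tfrac{\Delta t}{2}) + \frac{\Delta t}{2} \vec a(t + \Delta t) \qquad (7b)
\]
• leads to Crank-Nicolson scheme for $v$:
\[
\vec v(t + \Delta t) = \vec v(t) + \frac{\Delta t}{2} \left( \vec a(t) + \vec a(t + \Delta t) \right) \qquad (8)
\]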
• key disadvantage: implicit scheme, as ~a(t + ∆t) depends on ~r (t + ∆t);
needs to solve non-linear system of equations
Velocity Störmer Verlet Method
The Velocity Störmer Verlet method is a composition of a
• Taylor series expansion of 2nd order for the positions, as in Eq. (1)
• and a Crank Nicolson method for the velocities, as in Eq. (8)
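\[
\vec r(t + \Delta t) = \vec r(t) + \Delta t \, \vec v(t) + \frac{\Delta t^2}{2} \vec a(t) \qquad (9a)
\]
\[
\vec v(t + \Delta t) = \vec v(t) + \frac{\Delta t}{2} \left( \vec a(t) + \vec a(t + \Delta t) \right) \qquad (9b)
\]
[Figure: stencil diagrams showing which values of r, v, F at the time levels t−Δt, t, t+Δt are used in each stage]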
Velocity Störmer Verlet – Implementation
• reformulate equation for positions ~r :
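\[
\vec r(t + \Delta t) = \vec r(t) + \Delta t \, \vec v(t) + \frac{\Delta t^2}{2} \vec a(t)
= \vec r(t) + \Delta t \left( \vec v(t) + \frac{\Delta t}{2} \vec a(t) \right)
\]
contains “half an Euler time step” (i.e., Euler time step of size $\frac{\Delta t}{2}$) for $\vec v$
• similar for the velocities $\vec v$:
\[
\vec v(t + \Delta t) = \vec v(t) + \frac{\Delta t}{2} \left( \vec a(t) + \vec a(t + \Delta t) \right)
= \left( \vec v(t) + \frac{\Delta t}{2} \vec a(t) \right) + \frac{\Delta t}{2} \vec a(t + \Delta t)
\]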
Velocity Störmer Verlet – Implementation (2)
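1. half Euler step for the velocities $\vec v$:
   $\vec v(t + \frac{\Delta t}{2}) = \vec v(t) + \frac{\Delta t}{2} \vec a(t)$
2. update positions $\vec r$:
   $\vec r(t + \Delta t) = \vec r(t) + \Delta t \, \vec v(t + \frac{\Delta t}{2})$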
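The steps can be put together in a short sketch. Steps 3 and 4 in the code (new accelerations from the new positions, then the second half step for the velocities) are taken from the reformulated velocity equation (9b); the function name `velocity_verlet`, the callback `accel`, and the harmonic-oscillator test problem are illustrative assumptions.

```python
import numpy as np

def velocity_verlet(r, v, accel, dt, n_steps):
    """Velocity Stoermer-Verlet; accel(r) returns the acceleration F(r)/m."""
    a = accel(r)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * a     # 1. half Euler step for the velocities
        r = r + dt * v_half           # 2. update positions
        a = accel(r)                  # 3. new forces/accelerations (expensive part)
        v = v_half + 0.5 * dt * a     # 4. second half step for the velocities
    return r, v

# Usage: harmonic oscillator with a(r) = -r, exact solution r(t) = cos(t)
r0 = np.array([1.0])
v0 = np.array([0.0])
r_end, v_end = velocity_verlet(r0, v0, lambda r: -r, dt=0.01, n_steps=1000)
energy = 0.5 * v_end[0] ** 2 + 0.5 * r_end[0] ** 2
```

Only one force evaluation per time step is needed, since the acceleration computed in step 3 is reused in step 1 of the next time step.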
Leapfrog Method
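\[
\vec v(t + \tfrac{\Delta t}{2}) = \vec v(t - \tfrac{\Delta t}{2}) + \Delta t \, \vec a(t) \qquad (10a)
\]
\[
\vec r(t + \Delta t) = \vec r(t) + \Delta t \, \vec v(t + \tfrac{\Delta t}{2}) \qquad (10b)
\]
[Figure: stencil diagrams for the leapfrog stages; r and F at the time levels t−Δt, t, t+Δt, v at the half levels t−Δt/2, t+Δt/2]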
Dimensionless Velocity Störmer Verlet
[Figure: stencil diagrams of r, v, F at the time levels t−Δt, t, t+Δt for the individual stages]
Dimensionless Velocity Störmer Verlet (2)
[Figure: stencil diagrams of r, v, F at the time levels t−Δt, t, t+Δt for the stages: Forward Euler r; ½ Forward Euler v; Force calculation; ½ Backward Euler v]
Procedure:
1. calculate new positions (11a),
   partial velocity update: $+ \frac{\Delta t^*}{2} \vec F^*(t)$ in (11b)
2. calculate new forces, accelerations (computationally intensive!)
3. calculate new velocities: $+ \frac{\Delta t^*}{2} \vec F^*(t + \Delta t)$ in (11b)
→ memory requirements: 3 · 3N
Outlook: Leapfrog Method with Thermostat
• Leapfrog method:
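  $\vec v(t + \frac{\Delta t}{2}) = \vec v(t - \frac{\Delta t}{2}) + \Delta t \, \vec a(t)$
  $\vec r(t + \Delta t) = \vec r(t) + \Delta t \, \vec v(t + \frac{\Delta t}{2})$
[Figure: stencil diagrams of r, v, F over the time levels, with an additional thermostat stage]
\[
\vec v_{act}(t) = \vec v(t - \tfrac{\Delta t}{2}) + \frac{\Delta t}{2} \vec a(t) \qquad (13a)
\]
\[
\vec v(t + \tfrac{\Delta t}{2}) = (2\beta - 1) \, \vec v_{act}(t) + \frac{\Delta t}{2} \vec a(t) \qquad (13b)
\]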
Evaluation of Time Integration Methods
Evaluation criteria:
• accuracy (often not of great importance for exact particle positions)
• stability
• conservation
→ of phase space density (symplectic)
→ of energy
→ of momentum
(especially with PBC → Periodic Boundary Conditions)
• reversibility of time
• use of resources:
– computational effort (number of force evaluations)
– maximum time step size
– memory usage
Reversibility of Time
• time reversal for a closed system means
  – a turnaround of the velocities and also of the momenta;
    positions at the inversion point stay constant
  – traversing the trajectory back in the direction of the origin
• demand for symmetry for time integration methods
  + satisfied by Verlet method, e.g.
  − not satisfied by, e.g., Euler method, Predictor Corrector methods
    (also not by standard Runge-Kutta methods)
• contradiction with
  – the H-theorem (increase of entropy, irreversible processes)?
    (Loschmidt’s paradox)
  – the second law of thermodynamics?
• reversibility in theory only for a very short time
• Lyapunov instability ⇒ Kolmogorov entropy
Lyapunov Instability
• Basic question: how does a model behave with a slightly disturbed initial
condition?
• Example of a simple system:
  – stable case:
    bouncing ball on a plane with slightly disturbed initial horizontal
    velocity ⇒ linear increase of the disturbance
  – unstable case:
    bouncing ball on a sphere with slightly disturbed initial horizontal
    velocity ⇒ exponential increase of the disturbance (Lyapunov
    exponent)
• for the unstable case, small disturbances result in large changes:
chaotic behaviour (butterfly leading to a hurricane?)
• non-linear differential equations are often dynamically unstable
Lyapunov Instability: A Numerical Experiment
• Calculation of the trajectories → badly conditioned problem:
a small change of the initial position of a molecule may, after some time,
lead to a deviation from the original trajectory that is of the
magnitude of the whole domain!
• Thus: do not aim at the simulation of individual trajectories
→ numerical simulation of the behaviour of the system is wanted!
[Figure: two panels. Left, “tracing a Molecule (with initial displacement)”: 3D trajectories of Molecule 25 in run1 and run2. Right, “Molecule deviation (with initial displacement)”: deviation between the two runs, growing from 0 to about 0.5.]
Scientific Computing II
Molecular Dynamics – Barnes-Hut and Fast Multipole
Michael Bader
Technical University of Munich
Summer 2022
Computational Effort for N-Body Problems
• to solve: system of ODE
\[
\frac{d^2}{dt^2} \vec r_i = \frac{\vec F_i}{m_i} = \frac{1}{m_i} \sum_{j \ne i} \vec F_{ij}
\]
Michael Bader | Scientific Computing II | Barnes-Hut and Fast Multipole | Summer 2022 3
Barnes Hut Method – Key Ideas
Consider Astrophysics:
• force w.r.t. a far-away individual star might be neglected
• but not the force w.r.t. a far-away(?) galaxy
• thus: approximate forces on an individual star by grouping far-away
stars, galaxies, etc. into clusters
• represent each cluster by its accumulated mass located at its
centre-of-mass
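The effect of replacing a far-away cluster by a single pseudo particle can be tried out in a few lines. The sketch below compares the exact gravitational force of 100 cluster stars on one distant star with the force of the accumulated mass at the centre of mass; the function names, $\gamma = 1$, and the random test configuration are assumptions for illustration.

```python
import numpy as np

def gravity_force(xi, mi, xj, mj, gamma=1.0):
    """Force on particle i caused by particle j (attractive)."""
    rij = xi - xj
    return -gamma * mi * mj * rij / np.linalg.norm(rij) ** 3

def cluster_pseudo_particle(x, m):
    """Accumulated mass M and centre of mass X of a cluster."""
    M = m.sum()
    X = (m[:, None] * x).sum(axis=0) / M
    return X, M

# Usage: cluster of 100 stars of diameter ~2 at distance ~100 from a single star
rng = np.random.default_rng(0)
cluster_x = rng.uniform(-1.0, 1.0, size=(100, 3)) + np.array([100.0, 0.0, 0.0])
cluster_m = rng.uniform(0.5, 1.5, size=100)
star = np.zeros(3)

exact = sum(gravity_force(star, 1.0, xj, mj) for xj, mj in zip(cluster_x, cluster_m))
X, M = cluster_pseudo_particle(cluster_x, cluster_m)
approx = gravity_force(star, 1.0, X, M)
rel_error = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
```

The relative error shrinks as the ratio of cluster size to distance decreases, which is the motivation for approximating only clusters that are far away.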
Domain Decomposition
• distribute long-range region into subdomains: $\Omega^{far} = \bigcup_i \Omega^{far}_i$
• to be done for every particle's position
(in practice via hierarchical domain decomposition)
• assign a point $y_0^i$ to each $\Omega^{far}_i$
• decomposition depending on size of subdomains:
[Figure: decompositions of the far field around a particle position $x_i$]
Barnes-Hut Algorithm
• developed 1986 for applications in Astrophysics
• for gravity potential/force:
\[
U(r_{ij}) = -\gamma_{grav} \frac{m_i m_j}{r_{ij}}, \qquad
\vec F_{ij} = \vec F(\vec r_i, \vec r_j) = -\gamma_{grav} \, m_i m_j \, \frac{\vec r_i - \vec r_j}{\| \vec r_i - \vec r_j \|^3}
\]
Barnes-Hut: Computation of Forces
Barnes-Hut: Computation of Forces (2)
Tree traversal:
Barnes-Hut: Accuracy and Complexity
Accuracy of Barnes-Hut:
• depends on choice of θ
• the smaller θ, the more accurate the long-range forces
• the smaller θ, the larger the short-range (i.e., the costs)
• slow convergence w.r.t. θ (low-order method)
Complexity:
• grows for small θ
• for $\theta \to 0$: algorithm degenerates to “all-to-all” → $O(N^2)$
• for more or less homogeneously distributed particles:
  – number of active cells: $O(\log N / \theta^3)$
  – total effort therefore $O(\theta^{-3} N \log N)$
Barnes-Hut: Implementation
Part II
Barnes-Hut vs. Fast Multipole
Barnes-Hut: compute forces of pseudo particles to particles:
[Figure: interaction sketches with particle $x$, pseudo particles $y_0^i$ and $x_0^l$]
Revisited: Force, Potential Energy and Potential
For Coulomb and gravity interaction:
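• force on a particle with mass $m_i$ located at position $\vec x_i$ caused by many particles
with masses $m_j$ at positions $\vec x_j$:
\[
\vec F_i^{(grav)} = \sum_j -\gamma \frac{m_i m_j}{r_{ij}^3} \vec r_{ij}, \qquad
\vec F_i^{(Coul)} = \sum_j \frac{1}{4 \pi \varepsilon_0} \frac{q_i q_j}{r_{ij}^3} \vec r_{ij}
\]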
What’s missing?
“Box–Box Interactions” in a “Linked Cell” Fashion
Findings:
• consider a “Linked Cell”-type grid (in 1D)
  → each grid cell $C_k$ contains a list of particles: $x_i^{(k)}$, $m_i^{(k)}$
• define a pseudo particle (as in Barnes-Hut) for each linked cell
  → accumulated mass $M_l := \sum_j m_j^{(l)}$
  → located in centre of mass $X_l := \frac{1}{M_l} \sum_j m_j^{(l)} x_j^{(l)}$
• add up particle–particle forces between particles in adjacent cells $C_k$ and $C_l$:
  for all particles $x_i^{(k)}$, $m_i^{(k)}$ in cell $C_k$:
  $F_i^{(k)} = F_i^{(k)} \pm \sum_{j \in C_l} \gamma \, m_i^{(k)} m_j^{(l)} \big/ \left( x_i^{(k)} - x_j^{(l)} \right)^2$
• for all separated cells $C_l$ → add potential caused by pseudo particles of $C_l$
  (sum over all cells $C_l$ with pseudo particle of mass $M_l$ at position $X_l$):
  $\Psi_k = -\gamma \sum_{l \ne k,\, l \ne k \pm 1} M_l \big/ \left| X_k - X_l \right|$
  (or, accumulate forces, considering correct sign:)
  $F_i^{(k)} = F_i^{(k)} \pm \sum_{j \in C_l} \gamma \, m_i^{(k)} M_l \big/ \left( X_k - X_l \right)^2$
⇒ idea: accumulate potentials $\Psi_k$, then multiply with factors $m_i^{(k)}$
“Box–Box Interactions” in a “Multigrid” Fashion
Less simplified setting:
• add a hierarchy of grids as in multigrid methods
→ finest grid contains a list of particles for each cell
→ all grids contain a pseudo particle (as in Barnes-Hut)
• force computation on the finest level:
identical to “Linked Cell” Fashion on previous slide
• force computation between pseudo particles:
1. between pseudo particles in “nearby” cells:
add pseudo-particle–pseudo-particle force
2. between pseudo particles in “far away” cells:
add force between corresp. pseudo-particles on next-coarser level
Do we catch all interactions? How to define “nearby”/“far away”?
What’s (still) missing?
“Box–Box Interactions” in a “Multigrid” Fashion
Findings:
• box–box interactions occur at multiple levels → as particles are part of all
parent/grand-parent pseudo particles, the interaction between two particles might
be captured by box–box interactions on multiple levels
⇒ make sure that each particle–particle interaction is considered exactly once!
• different concepts for “far away” boxes:
Barnes-Hut-type: θ-criterion
Fast-Multipole-type: not in an adjacent cell ( “well separated”)
• force computation between pseudo particles occurs, if:
1. pseudo particles are not in cells that are direct neighbours (requires
particle–particle interaction, no approximation via pseudo particles allowed)
2. interaction between the boxes that contain those pseudo particles is not
considered on coarser levels
3. hence, considers comparably few interactions on each level that are
“nearby” but neither direct neighbours nor too far away
“Box–Box Interactions” in Hierarchical Methods
Comparison of Barnes-Hut and Fast Multipole
Approximate “Box potentials”
Approximate the potential of a set/cluster of particles by:
• multipole expansion (Greengard & Rokhlin, 1987)
→ similar concept as Taylor series
→ complicated to derive, esp. in 3D (spherical harmonics)
→ complicated formula for hierarchical assembly
• inner/outer ring approximations (Anderson, 1992)
→ derived via numerical integration of an integral formula
→ uniform interaction with child and remote boxes
→ hierarchical assembly via evaluation of potentials
at integration points
• both approaches apply principle of “well-separated boxes”
→ box–box interaction allowed between boxes that are
separated by one box of the same hierarchical level
Outer Ring Approximations
• fundamental idea: represent potential via a surface integral
\[
\Psi_a(\vec x) = \int_{S^2} g(a \vec s) \left( \sum_{n=0}^{\infty} \left( \frac{a}{r} \right)^{n+1} Q_n(\vec s \cdot \vec x_p) \right) ds
\]
Outer Ring Approximations – Illustrations
Outer Ring Approximations – Procedures
Inner Ring Approximations
Hierarchical Computation of Potentials and Forces
Hierarchical Computation of Potentials and Forces
(continued)
Hierarchical Force Computation – Illustrations
Forces on particles:
particle-to-particle (P2P) and local-to-particle (L2P)
Hierarchical Force Computation – Illustrations
Forces on ring approximations:
multipole-to-local (M2L) and local-to-local (L2L)
Kernels in Fast Multipole Methods – Illustration
Complexity:
• computation of box-approximations, i.e., all $g(a \vec s_i)$
→ constant effort per box (leaf and inner boxes)
→ thus $O(N_B)$ effort ($N_B$ boxes); if the max. number of particles per
box is constant, then $O(N)$ ($N$ particles)
• computation of forces
→ the multilevel algorithm leads to $O(N)$ effort