
Fem1d F PDF

This document describes a 1D finite element method (FEM) code written in Fortran to solve steady state heat conduction problems. It uses linear elements and the Galerkin method. The code employs the preconditioned conjugate gradient method as a sparse linear solver to solve the system of equations. It provides analytical solutions for verification of the 1D FEM results. Key aspects covered include the 1D linear elements, shape functions, Galerkin formulation, and sparse matrix storage and solution methods.

1D-FEM in Fortran: Steady State Heat Conduction

Kengo Nakajima
Information Technology Center

Programming for Parallel Computing (616-2057)
Seminar on Advanced Computing (616-4009)
FEM1D 2

• 1D-code for Steady-State Heat Conduction Problems by Galerkin FEM
• Sparse Linear Solver
– Conjugate Gradient Method
– Preconditioning
• Storage of Sparse Matrices
• Program

Keywords
• 1D Steady State Heat Conduction Problems
• Galerkin Method
• Linear Element
• Preconditioned Conjugate Gradient Method

1D Steady State Heat Conduction

$$\frac{\partial}{\partial x}\left(\lambda \frac{\partial T}{\partial x}\right) + \dot{Q} = 0$$

[Figure: 1D bar from x=0 (xmin) to x=xmax with uniform heat generation per volume $\dot{Q}$]

• Uniform: Sectional Area: A, Thermal Conductivity: λ
• Heat Generation Rate/Volume/Time [QL^-3 T^-1]: $\dot{Q}$
• Boundary Conditions
– x=0: T=0 (Fixed Temperature)
– x=xmax: $\partial T/\partial x = 0$ (Insulated)


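Analytical Solution

$$\frac{\partial}{\partial x}\left(\lambda \frac{\partial T}{\partial x}\right) + \dot{Q} = 0,\qquad T = 0 \text{ at } x = 0,\qquad \frac{\partial T}{\partial x} = 0 \text{ at } x = x_{max}$$

$$\lambda T'' = -\dot{Q}$$
$$\lambda T' = -\dot{Q}x + C_1 \;\Rightarrow\; C_1 = \dot{Q}x_{max} \quad (T' = 0 \text{ at } x = x_{max})$$
$$\lambda T = -\frac{1}{2}\dot{Q}x^2 + C_1 x + C_2 \;\Rightarrow\; C_2 = 0 \quad (T = 0 \text{ at } x = 0)$$
$$\therefore\; T = -\frac{1}{2\lambda}\dot{Q}x^2 + \frac{\dot{Q}x_{max}}{\lambda}x$$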
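The closed-form solution above is easy to evaluate when checking FEM output. The following is a minimal sketch in C; the function name `analytic_temperature` and its argument order are hypothetical and do not come from 1d.f or 1d.c.

```c
#include <assert.h>
#include <math.h>

/* Analytical temperature for T(0) = 0 and dT/dx(xmax) = 0:
 *   T(x) = -(Q / (2*lambda)) * x^2 + (Q * xmax / lambda) * x
 * qv   : heat generation rate per volume (Q)
 * cond : thermal conductivity (lambda), must be positive
 * The name and signature are illustrative assumptions. */
double analytic_temperature(double x, double qv, double cond, double xmax)
{
    assert(cond > 0.0);
    return -0.5 * qv * x * x / cond + qv * xmax * x / cond;
}
```

With Q = λ = 1 and xmax = 4, this gives 3.5 at x = 1 and 8.0 at x = 4.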

1D Linear Element (1/4)

• 1D Linear Element
– Length = L
• Node (Vertex)
• Element
– $T_i$: Temperature at node i
– $T_j$: Temperature at node j
– Temperature T on each element is a linear function of x (Piecewise Linear):

$$T = \alpha_1 + \alpha_2 x$$

[Figure: element of length L between nodes at $X_i$ and $X_j$, with nodal temperatures $T_i$ and $T_j$]


Piecewise Linear

Global Node ID:  1     2     3     4     5
Element ID:         1     2     3     4
Local Node ID:     1 2   1 2   1 2   1 2   (for each element)

Gradient of temperature is constant in each element (and may be discontinuous at each "node").

1D Linear Elem.: Shape Function (2/4)

• Coefficients are calculated based on the information at each node:
$$T = T_i \text{ at } x = X_i,\qquad T = T_j \text{ at } x = X_j$$
$$T_i = \alpha_1 + \alpha_2 X_i,\qquad T_j = \alpha_1 + \alpha_2 X_j$$
• Coefficients:
$$\alpha_1 = \frac{T_i X_j - T_j X_i}{L},\qquad \alpha_2 = \frac{T_j - T_i}{L}$$
• T can be written as follows, according to $T_i$ and $T_j$:
$$T = \left(\frac{X_j - x}{L}\right)T_i + \left(\frac{x - X_i}{L}\right)T_j = N_i T_i + N_j T_j$$
• $N_i$, $N_j$: Shape Function or Interpolation Function, a function of x (only)

1D Linear Elem.: Shape Function (3/4)

• Number of Shape Functions = Number of Vertices of Each Element
– $N_i$: Function of Position
– A kind of Test/Trial Function
$$N_i = \frac{X_j - x}{L},\qquad N_j = \frac{x - X_i}{L}$$
• Linear combination of shape functions provides the temperature "in" each element
– Coefficients (unknowns): Temperature at each node
$$T = N_i T_i + N_j T_j,\qquad u_M = \sum_{i=1}^{M} a_i \Psi_i$$
– $\Psi_i$: Trial/Test Function (known function of position, defined in the domain and at the boundary); a "basis" in linear algebra
– $a_i$: Coefficients (unknown)

1D Linear Elem.: Shape Function (4/4)

• Value of $N_i$
– = 1 at one of the nodes in the element
– = 0 on the other nodes
$$N_i = \frac{X_j - x}{L},\qquad N_j = \frac{x - X_i}{L}$$

[Figure: $N_i$ falls linearly from 1.0 at $X_i$ to 0 at $X_j$; $N_j$ rises linearly from 0 at $X_i$ to 1.0 at $X_j$]
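The linear shape functions can be evaluated directly from the node coordinates. The sketch below is an illustration under assumed names (`shape_functions`, `interpolate`); neither function appears in the course code.

```c
#include <assert.h>
#include <math.h>

/* Evaluate the two linear shape functions of an element [xi, xj] at x:
 *   Ni = (Xj - x) / L,  Nj = (x - Xi) / L,  L = Xj - Xi
 * Names are illustrative assumptions. */
void shape_functions(double x, double xi, double xj, double *ni, double *nj)
{
    double len = xj - xi;
    assert(len > 0.0);
    *ni = (xj - x) / len;
    *nj = (x - xi) / len;
}

/* Temperature inside the element: T = Ni*Ti + Nj*Tj */
double interpolate(double x, double xi, double xj, double ti, double tj)
{
    double ni, nj;
    shape_functions(x, xi, xj, &ni, &nj);
    return ni * ti + nj * tj;
}
```

At x = Xi the pair (Ni, Nj) is (1, 0), at x = Xj it is (0, 1), and Ni + Nj = 1 everywhere in between.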

Galerkin Method (1/4)

• Governing Equation for 1D Steady State Heat Conduction Problems (Uniform λ):
$$\lambda\left(\frac{d^2T}{dx^2}\right) + \dot{Q} = 0$$
$$T = [N]\{\phi\}$$
Distribution of temperature in each element (matrix form), $\{\phi\}$: Temperature at each node
• The following integral equation is obtained for each element by the Galerkin method, where the [N]'s are also the weighting functions:
$$\int_V [N]^T\left\{\lambda\left(\frac{d^2T}{dx^2}\right) + \dot{Q}\right\}dV = 0$$

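Galerkin Method (2/4)

• Green's Theorem (1D)
$$\int_V A\left(\frac{d^2B}{dx^2}\right)dV = \int_S A\frac{dB}{dx}\,dS - \int_V\left(\frac{dA}{dx}\frac{dB}{dx}\right)dV$$
• Apply this to the 1st part of the equation, which has the 2nd-order derivative:
$$\int_V \lambda[N]^T\left(\frac{d^2T}{dx^2}\right)dV = -\int_V \lambda\left(\frac{d[N]^T}{dx}\frac{dT}{dx}\right)dV + \int_S \lambda[N]^T\frac{dT}{dx}\,dS$$
• Consider the following terms:
$$T = [N]\{\phi\},\qquad \frac{dT}{dx} = \frac{d[N]}{dx}\{\phi\}$$
$$\bar{q} = -\lambda\frac{dT}{dx}$$
$\bar{q}$: Heat flux at element surface [QL^-2 T^-1]

[Figure: bar from x=0 (xmin) to x=xmax with uniform heat generation per volume $\dot{Q}$; volume V, end surfaces S]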

Galerkin Method (3/4)

• Finally, the following equation is obtained by considering the heat generation term $\dot{Q}$:
$$-\int_V \lambda\left(\frac{d[N]^T}{dx}\frac{d[N]}{dx}\right)dV\,\{\phi\} - \int_S \bar{q}[N]^T dS + \int_V \dot{Q}[N]^T dV = 0$$
• This is called the "weak form". The original PDE consists of terms with 2nd-order derivatives, but this "weak form" only includes 1st-order derivatives, thanks to Green's theorem.
– Requirements for shape functions are "weaker" in the "weak form". Linear functions can describe effects of 2nd-order differentiation.

Galerkin Method (4/4)

$$-\int_V \lambda\left(\frac{d[N]^T}{dx}\frac{d[N]}{dx}\right)dV\,\{\phi\} - \int_S \bar{q}[N]^T dS + \int_V \dot{Q}[N]^T dV = 0$$

• The surface terms $\int_S \bar{q}[N]^T dS$ of neighboring elements coincide at element boundaries and disappear. Finally, only the terms on the domain boundaries remain.

Weak Form and Boundary Conditions

• Value of dependent variable is defined (Dirichlet)
– Weighting Function = 0
– Principal B.C. (boundary condition of the first kind)
– Essential B.C.
• Derivatives of Unknowns (Neumann)
– Naturally satisfied in weak form
– Secondary B.C. (boundary condition of the second kind)
– Natural B.C.

$$-\int_V \lambda\left(\frac{d[N]^T}{dx}\frac{d[N]}{dx}\right)dV\,\{\phi\} - \int_S \bar{q}[N]^T dS + \int_V \dot{Q}[N]^T dV = 0,\qquad \text{where } \bar{q} = -\lambda\frac{dT}{dx}$$

Weak Form with B.C.: on each elem.

$$[k]^{(e)}\{\phi\}^{(e)} = \{f\}^{(e)}$$
$$[k]^{(e)} = \int_V \lambda\left(\frac{d[N]^T}{dx}\frac{d[N]}{dx}\right)dV$$
$$\{f\}^{(e)} = \int_V \dot{Q}[N]^T dV - \int_S \bar{q}[N]^T dS$$

Integration over Each Element: [k]

$$N_i = \frac{X_j - x}{L},\quad N_j = \frac{x - X_i}{L},\quad \frac{dN_i}{dx} = -\frac{1}{L},\quad \frac{dN_j}{dx} = \frac{1}{L}$$

With local coordinates $X_i = 0$, $X_j = L$: $N_i = 1 - x/L$, $N_j = x/L$.

$$\int_V \lambda\left(\frac{d[N]^T}{dx}\frac{d[N]}{dx}\right)dV = \lambda\int_0^L \begin{bmatrix}-1/L\\ 1/L\end{bmatrix}\begin{bmatrix}-1/L & 1/L\end{bmatrix}A\,dx = \frac{\lambda A}{L^2}\int_0^L\begin{bmatrix}+1 & -1\\ -1 & +1\end{bmatrix}dx = \frac{\lambda A}{L}\begin{bmatrix}+1 & -1\\ -1 & +1\end{bmatrix}$$

(2x1 matrix times 1x2 matrix gives the 2x2 element matrix.)
A: Sectional Area
L: Length

Integration over Each Element: {f} (1/2)

$$N_i = 1 - \frac{x}{L},\quad N_j = \frac{x}{L},\quad \frac{dN_i}{dx} = -\frac{1}{L},\quad \frac{dN_j}{dx} = \frac{1}{L}$$

Heat Generation (Volume):
$$\int_V \dot{Q}[N]^T dV = \dot{Q}A\int_0^L\begin{bmatrix}1 - x/L\\ x/L\end{bmatrix}dx = \frac{\dot{Q}AL}{2}\begin{Bmatrix}1\\ 1\end{Bmatrix}$$

The heat generated in the element is shared 1 : 1 between its two nodes.
A: Sectional Area
L: Length

Integration over Each Element: {f} (2/2)

Heat Generation (Volume):
$$\int_V \dot{Q}[N]^T dV = \dot{Q}A\int_0^L\begin{bmatrix}1 - x/L\\ x/L\end{bmatrix}dx = \frac{\dot{Q}AL}{2}\begin{Bmatrix}1\\ 1\end{Bmatrix}$$

Surface Heat Flux:
$$\int_S \bar{q}[N]^T dS = \bar{q}A\Big|_{x=L} = \bar{q}A\begin{Bmatrix}0\\ 1\end{Bmatrix},\qquad \bar{q} = -\lambda\frac{dT}{dx}$$
when the surface heat flux acts only on this surface (x = L).
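The element integrals above reduce to two small closed-form arrays. The following C sketch fills them for one element; `element_matrix` and `element_load` are hypothetical helper names, and the surface flux term is left out (it is assumed to be zero here).

```c
#include <assert.h>
#include <math.h>

/* Element matrix of a 1D linear element:
 *   [k] = (lambda * A / L) * [ +1 -1 ; -1 +1 ]
 * Helper names are illustrative assumptions. */
void element_matrix(double cond, double area, double len, double k[2][2])
{
    assert(len > 0.0);
    const double c = cond * area / len;
    k[0][0] =  c;  k[0][1] = -c;
    k[1][0] = -c;  k[1][1] =  c;
}

/* Element load vector from volumetric heat generation only:
 *   {f} = (Q * A * L / 2) * { 1, 1 }
 * (no surface heat flux contribution in this sketch) */
void element_load(double qv, double area, double len, double f[2])
{
    f[0] = 0.5 * qv * area * len;
    f[1] = f[0];
}
```

Each row of [k] sums to zero, which reflects that a uniform temperature field produces no conductive heat flow.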

Global Equations
• Accumulate Element Equations:

$$[k]^{(e)}\{\phi\}^{(e)} = \{f\}^{(e)}$$   Element Matrix, Element Equations
$$[K]\{\Phi\} = \{F\}$$   Global Matrix, Global Equations
$$[K] = \sum [k],\qquad \{F\} = \sum \{f\}$$
$\{\Phi\}$: global vector of $\{\phi\}$

This is the final set of linear equations (global equations) to be solved.

ECCS2012 System

Creating Directory
>$ cd Documents
>$ mkdir 2013summer (your favorite name)
>$ cd 2013summer

This is your "top" directory, and is called <$E-TOP> in this class.

1D Code for Steady-State Heat Conduction Problems

>$ cd <$E-TOP>
>$ cp /home03/skengon/Documents/class_eps/F/1d.tar .
>$ cp /home03/skengon/Documents/class_eps/C/1d.tar .
>$ tar xvf 1d.tar
>$ cd 1d

Compile & GO !
>$ cd <$E-TOP>/1d
>$ cc -O 1d.c (or g95 -O 1d.f)
>$ ./a.out

Control Data input.dat

4                  NE (Number of Elements)
1.0 1.0 1.0 1.0    Δx (Length of Each Elem.: L), Q, A, λ
100                Number of MAX. Iterations for CG Solver
1.e-8              Convergence Criteria for CG Solver

Element ID:          1     2     3     4      (Δx=1)
Node ID (Global):  1     2     3     4     5
                   x=0   x=1   x=2   x=3   x=4

Results
>$ ./a.out

4 iters, RESID= 4.154074e-17

### TEMPERATURE
Node  Computational   Analytical
1     0.000000E+00    0.000000E+00
2     3.500000E+00    3.500000E+00
3     6.000000E+00    6.000000E+00
4     7.500000E+00    7.500000E+00
5     8.000000E+00    8.000000E+00

Element ID:          1     2     3     4      (Δx=1)
Node ID (Global):  1     2     3     4     5
                   x=0   x=1   x=2   x=3   x=4

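Element Eqn's/Accumulation (1/3)

• 4 elements, 5 nodes

Element ID:          1     2     3     4
Node ID (Global):  1     2     3     4     5
                   x=0   x=1   x=2   x=3   x=4

• [k] and {f} of Element-1:
$$[k]^{(1)} = \frac{\lambda A}{L}\begin{bmatrix}+1 & -1\\ -1 & +1\end{bmatrix},\qquad \{f\}^{(1)} = \frac{\dot{Q}AL}{2}\begin{Bmatrix}1\\ 1\end{Bmatrix}$$
• As for Element-4:
$$[k]^{(4)} = \frac{\lambda A}{L}\begin{bmatrix}+1 & -1\\ -1 & +1\end{bmatrix},\qquad \{f\}^{(4)} = \frac{\dot{Q}AL}{2}\begin{Bmatrix}1\\ 1\end{Bmatrix}$$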

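Element Eqn's/Accumulation (2/3)

• Element-by-Element Accumulation:
$$[K] = \sum_{e=1}^{4}[k^{(e)}],\qquad \{F\} = \sum_{e=1}^{4}\{f^{(e)}\}$$

[Figure: the four 2x2 element matrices and four 2x1 element vectors are added into overlapping positions of the 5x5 global matrix and the 5x1 global vector]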

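Element Eqn's/Accumulation (3/3)

• Relations to FDM
$$[k]^{(e)} = \frac{\lambda A}{L}\begin{bmatrix}+1 & -1\\ -1 & +1\end{bmatrix}$$
$$[K] = \sum_{e=1}^{4}[k^{(e)}] = \frac{\lambda A}{L}\begin{bmatrix}+1 & -1 & & & \\ -1 & +2 & -1 & & \\ & -1 & +2 & -1 & \\ & & -1 & +2 & -1\\ & & & -1 & +1\end{bmatrix}$$

Finite-difference form of the conduction term, integrated over the volume AL around node i:
$$\int_V \lambda\left(\frac{d^2T}{dx^2}\right)dV \approx \int_V \lambda\left(\frac{T_{i+1} - 2T_i + T_{i-1}}{L^2}\right)dV = \lambda AL\,\frac{T_{i+1} - 2T_i + T_{i-1}}{L^2} = \frac{\lambda A}{L}\left(T_{i-1} - 2T_i + T_{i+1}\right)$$

Apart from the overall sign, this is the interior row (-1, +2, -1) of [K]. Something familiar …
FEM: Coefficient matrices are generally sparse (many ZEROs)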
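The accumulation of element matrices into the global system can be sketched in a few lines of C. This illustration uses a dense 5x5 array purely for readability; the course code stores the matrix in compressed row form instead, and the name `assemble_global` and the constants `NUM_ELEM`/`NUM_NODE` are assumptions made for this example.

```c
#include <assert.h>
#include <math.h>

enum { NUM_ELEM = 4, NUM_NODE = NUM_ELEM + 1 };

/* Accumulate NUM_ELEM identical linear elements into a dense global
 * matrix K and vector F. Element e connects global nodes e and e+1
 * (0-based). Dense storage is an assumption for illustration only. */
void assemble_global(double cond, double area, double len, double qv,
                     double K[NUM_NODE][NUM_NODE], double F[NUM_NODE])
{
    assert(len > 0.0);
    for (int i = 0; i < NUM_NODE; i++) {
        F[i] = 0.0;
        for (int j = 0; j < NUM_NODE; j++) K[i][j] = 0.0;
    }
    const double ck = cond * area / len;       /* lambda*A/L */
    const double cf = 0.5 * qv * area * len;   /* Q*A*L/2    */
    const double kmat[2][2] = { { ck, -ck }, { -ck, ck } };

    for (int e = 0; e < NUM_ELEM; e++) {
        const int node[2] = { e, e + 1 };      /* local -> global map */
        for (int a = 0; a < 2; a++) {
            F[node[a]] += cf;
            for (int b = 0; b < 2; b++)
                K[node[a]][node[b]] += kmat[a][b];
        }
    }
}
```

For unit λ, A, L and Q the interior rows of K come out as (-1, +2, -1), the end rows as (+1, -1) and (-1, +1), and F is (0.5, 1, 1, 1, 0.5) before boundary conditions are applied.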

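2nd-Order Differentiation in FDM

• Approximate the derivative at × (the center of i and i+1), with grid points i-1, i, i+1 spaced Δx apart:
$$\left(\frac{d\phi}{dx}\right)_{i+1/2} \approx \frac{\phi_{i+1} - \phi_i}{\Delta x}$$
Δx→0: Real Derivative

• 2nd-Order Diff. at i
$$\left(\frac{d^2\phi}{dx^2}\right)_i \approx \frac{\left(\dfrac{d\phi}{dx}\right)_{i+1/2} - \left(\dfrac{d\phi}{dx}\right)_{i-1/2}}{\Delta x} = \frac{\dfrac{\phi_{i+1} - \phi_i}{\Delta x} - \dfrac{\phi_i - \phi_{i-1}}{\Delta x}}{\Delta x} = \frac{\phi_{i+1} - 2\phi_i + \phi_{i-1}}{\Delta x^2}$$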
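As a small illustration of the central-difference formula, the sketch below evaluates the second difference from three neighboring values. The function name `second_difference` is hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Central-difference approximation of d2(phi)/dx2 at grid point i:
 *   (phi[i+1] - 2*phi[i] + phi[i-1]) / dx^2
 * The name is an illustrative assumption. */
double second_difference(double phi_im1, double phi_i, double phi_ip1,
                         double dx)
{
    assert(dx > 0.0);
    return (phi_ip1 - 2.0 * phi_i + phi_im1) / (dx * dx);
}
```

For a quadratic such as φ = x², the formula reproduces the exact second derivative (2) for any Δx, because the truncation error involves only fourth and higher derivatives.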

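Element-by-Element Operation
Very flexible if each element has a different material property, size, etc.

$$[k]^{(e)} = \frac{\lambda^{(e)}A^{(e)}}{L^{(e)}}\begin{bmatrix}+1 & -1\\ -1 & +1\end{bmatrix}$$
$$[K] = \sum_{e=1}^{4}[k^{(e)}] = \frac{\lambda^{(1)}A^{(1)}}{L^{(1)}}\begin{bmatrix}+1 & -1\\ -1 & +1\end{bmatrix}_{(1,2)} + \frac{\lambda^{(2)}A^{(2)}}{L^{(2)}}\begin{bmatrix}+1 & -1\\ -1 & +1\end{bmatrix}_{(2,3)} + \frac{\lambda^{(3)}A^{(3)}}{L^{(3)}}\begin{bmatrix}+1 & -1\\ -1 & +1\end{bmatrix}_{(3,4)} + \frac{\lambda^{(4)}A^{(4)}}{L^{(4)}}\begin{bmatrix}+1 & -1\\ -1 & +1\end{bmatrix}_{(4,5)}$$

The subscript (m, n) marks the global rows and columns into which each 2x2 block is added.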

Element/Global Operations

Element ID:          1     2     3     4
Global Node ID:    1     2     3     4     5
Local Node ID:     1 2   1 2   1 2   1 2

Each element e = 1, …, 4 yields its own weak form and therefore its own element equations:
$$[k]^{(e)}\{\phi\}^{(e)} = \{f\}^{(e)}:\qquad \begin{bmatrix}k_{11}^{(e)} & k_{12}^{(e)}\\ k_{21}^{(e)} & k_{22}^{(e)}\end{bmatrix}\begin{Bmatrix}\phi_1^{(e)}\\ \phi_2^{(e)}\end{Bmatrix} = \begin{Bmatrix}f_1^{(e)}\\ f_2^{(e)}\end{Bmatrix}$$

Global equations:
$$[K]\{\Phi\} = \{F\}:\qquad \begin{bmatrix}D_1 & AU_{11} & & & \\ AL_{21} & D_2 & AU_{21} & & \\ & AL_{31} & D_3 & AU_{31} & \\ & & AL_{41} & D_4 & AU_{41}\\ & & & AL_{51} & D_5\end{bmatrix}\begin{Bmatrix}\Phi_1\\ \Phi_2\\ \Phi_3\\ \Phi_4\\ \Phi_5\end{Bmatrix} = \begin{Bmatrix}B_1\\ B_2\\ B_3\\ B_4\\ B_5\end{Bmatrix}$$

Mapping information is needed from the element matrix to the global matrix.

Around the FEM in 3 minutes

Accumulation to Global/overall Matrix
$$[K]\{\Phi\} = \{F\}$$

[Figure: 2D mesh of 9 elements (3x3) with 16 nodes, and the resulting 16x16 global system. Each row has a diagonal entry (D) and non-zero off-diagonal entries (X) only for nodes that share an element with that row's node.]


Large-Scale Linear Equations in Scientific Applications
• Solving large-scale linear equations Ax=b is the most important and expensive part of various types of scientific computing.
– for both linear and nonlinear applications
• Various types of methods have been proposed and developed.
– for dense and sparse matrices
– classified into direct and iterative methods
• Dense Matrices: Globally Coupled Problems
– BEM, Spectral Methods, MO/MD (gas, liquid)
• Sparse Matrices: Locally Defined Problems
– FEM, FDM, DEM, MD (solid), BEM w/FMM

Direct Method
• Gaussian Elimination/LU Factorization
– compute $A^{-1}$ directly
Good
• Robust for a wide range of applications
• Good for both dense and sparse matrices
Bad
• More expensive than iterative methods (memory, CPU)
• Not scalable

Iterative Method
• Stationary Method
– SOR, Gauss-Seidel, Jacobi
– Generally slow, impractical

• Non-Stationary Method
– With restriction/optimization conditions
– Krylov-Subspace
– CG: Conjugate Gradient
– BiCGSTAB: Bi-Conjugate Gradient Stabilized
– GMRES: Generalized Minimal Residual

Iterative Method (cont.)

Good
• Less expensive than direct methods, especially in memory.
• Suitable for parallel and vector computing.

Bad
• Convergence strongly depends on problems and boundary conditions (condition number etc.)
• Preconditioning is required: Key Technology for Parallel FEM

Conjugate Gradient Method

• Conjugate Gradient: CG
– Most popular "non-stationary" iterative method
• for Symmetric Positive Definite (SPD) Matrices
– $\{x\}^T[A]\{x\} > 0$ for arbitrary non-zero $\{x\}$
– All of the diagonal components, eigenvalues and leading principal minors are > 0:
$$\det\begin{bmatrix}a_{11} & \cdots & a_{1k}\\ \vdots & \ddots & \vdots\\ a_{k1} & \cdots & a_{kk}\end{bmatrix} > 0,\qquad k = 1, \dots, n$$
– Matrices of Galerkin-based FEM: heat conduction, Poisson, static linear elastic problems
• Algorithm
– "Steepest Descent Method"
– $x^{(i)} = x^{(i-1)} + \alpha_i p^{(i)}$
• $x^{(i)}$: solution, $p^{(i)}$: search direction, $\alpha_i$: coefficient
– Solution {x} minimizes $\{x-y\}^T[A]\{x-y\}$, where {y} is the exact solution.

Procedures of Conjugate Gradient

Compute r(0) = b - [A]x(0)
for i = 1, 2, …
    z(i-1) = r(i-1)
    ρ(i-1) = r(i-1) · z(i-1)
    if i = 1
        p(1) = z(0)
    else
        β(i-1) = ρ(i-1) / ρ(i-2)
        p(i) = z(i-1) + β(i-1) p(i-1)
    endif
    q(i) = [A]p(i)
    α(i) = ρ(i-1) / (p(i) · q(i))
    x(i) = x(i-1) + α(i) p(i)
    r(i) = r(i-1) - α(i) q(i)
    check convergence |r|
end

x(i): Vector, α(i): Scalar

Operations used:
• Mat-Vec. Multiplication
• Dot Products
• DAXPY (Double Precision: {y} = a{x} + {y})


Algorithm of CG Method (1/5)

Solution x minimizes the following expression if y is the exact solution (Ay=b):
$$(x-y)^T[A](x-y)$$
$$(x-y)^T[A](x-y) = (x,Ax) - (y,Ax) - (x,Ay) + (y,Ay) = (x,Ax) - 2(x,Ay) + (y,Ay) = (x,Ax) - 2(x,b) + (y,b)$$
where the last term $(y,b)$ is constant.

Therefore, the solution x minimizes the following f(x):
$$f(x) = \frac{1}{2}(x,Ax) - (x,b)$$
For an arbitrary vector h:
$$f(x+h) = f(x) + (h, Ax-b) + \frac{1}{2}(h,Ah)$$

Derivation:
$$f(x+h) = \frac{1}{2}(x+h, A(x+h)) - (x+h, b)$$
$$= \frac{1}{2}(x+h,Ax) + \frac{1}{2}(x+h,Ah) - (x,b) - (h,b)$$
$$= \frac{1}{2}(x,Ax) + \frac{1}{2}(h,Ax) + \frac{1}{2}(x,Ah) + \frac{1}{2}(h,Ah) - (x,b) - (h,b)$$
$$= \frac{1}{2}(x,Ax) - (x,b) + (h,Ax) - (h,b) + \frac{1}{2}(h,Ah)$$
$$= f(x) + (h, Ax-b) + \frac{1}{2}(h,Ah)$$

Algorithm of CG Method (2/5)

CG method minimizes f(x) at each iteration.
Assume that an initial approximate solution $x^{(0)}$ is given, and that the search direction vector $p^{(k)}$ is defined at the k-th iteration:
$$x^{(k+1)} = x^{(k)} + \alpha_k p^{(k)}$$

Minimization of $f(x^{(k+1)})$ is done as follows:
$$f(x^{(k)} + \alpha_k p^{(k)}) = \frac{1}{2}\alpha_k^2\,(p^{(k)}, Ap^{(k)}) - \alpha_k\,(p^{(k)}, b - Ax^{(k)}) + f(x^{(k)})$$
$$\frac{\partial f(x^{(k)} + \alpha_k p^{(k)})}{\partial\alpha_k} = 0 \;\Rightarrow\; \alpha_k = \frac{(p^{(k)}, b - Ax^{(k)})}{(p^{(k)}, Ap^{(k)})} = \frac{(p^{(k)}, r^{(k)})}{(p^{(k)}, Ap^{(k)})}$$
$r^{(k)} = b - Ax^{(k)}$: residual vector

Algorithm of CG Method (3/5)

Residual vector at the (k+1)-th iteration, with $r^{(k+1)} = b - Ax^{(k+1)}$ and $r^{(k)} = b - Ax^{(k)}$:
$$r^{(k+1)} - r^{(k)} = -Ax^{(k+1)} + Ax^{(k)} = -\alpha_k Ap^{(k)} \;\Rightarrow\; r^{(k+1)} = r^{(k)} - \alpha_k Ap^{(k)}$$

Search direction vector p is defined by the following recurrence formula:
$$p^{(k+1)} = r^{(k+1)} + \beta_k p^{(k)},\qquad r^{(0)} = p^{(0)}$$

We are lucky if we can get the exact solution y at the (k+1)-th iteration:
$$y = x^{(k+1)} + \alpha_{k+1}p^{(k+1)}$$

Algorithm of CG Method (4/5)

BTW, we have the following (convenient) orthogonality relation:
$$(Ap^{(k)}, y - x^{(k+1)}) = 0$$
because
$$(Ap^{(k)}, y - x^{(k+1)}) = (p^{(k)}, Ay - Ax^{(k+1)}) = (p^{(k)}, b - Ax^{(k+1)})$$
$$= (p^{(k)}, b - A[x^{(k)} + \alpha_k p^{(k)}]) = (p^{(k)}, b - Ax^{(k)} - \alpha_k Ap^{(k)})$$
$$= (p^{(k)}, r^{(k)} - \alpha_k Ap^{(k)}) = (p^{(k)}, r^{(k)}) - \alpha_k(p^{(k)}, Ap^{(k)}) = 0$$
$$\therefore\; \alpha_k = \frac{(p^{(k)}, r^{(k)})}{(p^{(k)}, Ap^{(k)})}$$

Thus, the following relation is obtained:
$$(Ap^{(k)}, y - x^{(k+1)}) = (Ap^{(k)}, \alpha_{k+1}p^{(k+1)}) = 0 \;\Rightarrow\; (p^{(k+1)}, Ap^{(k)}) = 0$$

Algorithm of CG Method (5/5)

$$(p^{(k+1)}, Ap^{(k)}) = (r^{(k+1)} + \beta_k p^{(k)}, Ap^{(k)}) = (r^{(k+1)}, Ap^{(k)}) + \beta_k(p^{(k)}, Ap^{(k)}) = 0$$
$$\Rightarrow\; \beta_k = \frac{-(r^{(k+1)}, Ap^{(k)})}{(p^{(k)}, Ap^{(k)})}$$

$(p^{(k+1)}, Ap^{(k)}) = 0$: $p^{(k)}$ is "conjugate" for matrix A.

The following "conjugate" relationship is obtained for arbitrary (i,j):
$$(p^{(i)}, Ap^{(j)}) = 0 \quad (i \ne j)$$
The following relationships are also obtained for $p^{(k)}$ and $r^{(k)}$:
$$(r^{(i)}, r^{(j)}) = 0 \quad (i \ne j),\qquad (p^{(k)}, r^{(k)}) = (r^{(k)}, r^{(k)})$$

In N-dimensional space, there can be only N mutually orthogonal, linearly independent residual vectors $r^{(k)}$. This means the CG method converges after at most N iterations if the number of unknowns is N. In practice, round-off error sometimes affects convergence.

Proof (1/2)

Claim: $(p^{(i)}, r^{(k+1)}) = 0,\quad i = 0, 1, \dots, k$

$$x^{(k+1)} = x^{(i+1)} + \sum_{j=i+1}^{k}\alpha_j p^{(j)}$$
$$r^{(k+1)} = b - Ax^{(k+1)} = b - A\left[x^{(i+1)} + \sum_{j=i+1}^{k}\alpha_j p^{(j)}\right] = b - Ax^{(i+1)} - \sum_{j=i+1}^{k}\alpha_j Ap^{(j)} = r^{(i+1)} - \sum_{j=i+1}^{k}\alpha_j Ap^{(j)}$$
$$(p^{(i)}, r^{(k+1)}) = \left(p^{(i)}, r^{(i+1)} - \sum_{j=i+1}^{k}\alpha_j Ap^{(j)}\right) = (p^{(i)}, r^{(i+1)}) - \left(p^{(i)}, \sum_{j=i+1}^{k}\alpha_j Ap^{(j)}\right) = 0$$

Both terms are zero. The first vanishes because
$$(Ap^{(k)}, y - x^{(k+1)}) = 0 \;\text{ and }\; (Ap^{(k)}, y - x^{(k+1)}) = (p^{(k)}, Ay - Ax^{(k+1)}) = (p^{(k)}, b - Ax^{(k+1)}) = (p^{(k)}, r^{(k+1)})$$
so that $(p^{(k)}, r^{(k+1)}) = 0$. The second vanishes by the conjugate relationship $(p^{(i)}, Ap^{(j)}) = 0$ for $i \ne j$.

Proof (2/2)

Claim: $(r^{(i)}, r^{(j)}) = 0 \quad (i \ne j)$
$$0 = (p^{(i)}, r^{(k+1)}) = (r^{(i)} + \beta_{i-1}p^{(i-1)}, r^{(k+1)}) = \beta_{i-1}(p^{(i-1)}, r^{(k+1)}) + (r^{(i)}, r^{(k+1)}) = (r^{(i)}, r^{(k+1)})$$

Claim: $(p^{(k)}, r^{(k)}) = (r^{(k)}, r^{(k)})$
$$(p^{(k)}, r^{(k)}) = (r^{(k)} + \beta_{k-1}p^{(k-1)}, r^{(k)}) = \beta_{k-1}(p^{(k-1)}, r^{(k)}) + (r^{(k)}, r^{(k)}) = (r^{(k)}, r^{(k)})$$

$\alpha_k$, $\beta_k$
Usually, we use simpler definitions of $\alpha_k$, $\beta_k$ as follows:

$$\alpha_k = \frac{(p^{(k)}, b - Ax^{(k)})}{(p^{(k)}, Ap^{(k)})} = \frac{(p^{(k)}, r^{(k)})}{(p^{(k)}, Ap^{(k)})} = \frac{(r^{(k)}, r^{(k)})}{(p^{(k)}, Ap^{(k)})}\qquad \because\; (p^{(k)}, r^{(k)}) = (r^{(k)}, r^{(k)})$$

$$\beta_k = \frac{-(r^{(k+1)}, Ap^{(k)})}{(p^{(k)}, Ap^{(k)})} = \frac{(r^{(k+1)}, r^{(k+1)})}{(r^{(k)}, r^{(k)})}\qquad \because\; (r^{(k+1)}, Ap^{(k)}) = \frac{(r^{(k+1)}, r^{(k)} - r^{(k+1)})}{\alpha_k} = -\frac{(r^{(k+1)}, r^{(k+1)})}{\alpha_k}$$


Preconditioning for Iterative Solvers

• Convergence rate of iterative solvers strongly depends on the spectral properties (eigenvalue distribution) of the coefficient matrix A.
– Convergence is fast if the spread of the eigenvalues is small and the eigenvalues are close to 1.
– In "ill-conditioned" problems, the "condition number" (ratio of max/min eigenvalue if A is symmetric) is large.
• A preconditioner M (whose properties are similar to those of A) transforms the linear system into one with more favorable spectral properties.
• M transforms the original equation Ax=b into A'x=b', where $A' = M^{-1}A$, $b' = M^{-1}b$
– If M~A, $M^{-1}A$ is close to the identity matrix.
– If $M^{-1} = A^{-1}$, this is the best preconditioner (a.k.a. Gaussian Elimination)

Preconditioned CG Solver

Compute r(0) = b - [A]x(0)
for i = 1, 2, …
    solve [M]z(i-1) = r(i-1)
    ρ(i-1) = r(i-1) · z(i-1)
    if i = 1
        p(1) = z(0)
    else
        β(i-1) = ρ(i-1) / ρ(i-2)
        p(i) = z(i-1) + β(i-1) p(i-1)
    endif
    q(i) = [A]p(i)
    α(i) = ρ(i-1) / (p(i) · q(i))
    x(i) = x(i-1) + α(i) p(i)
    r(i) = r(i-1) - α(i) q(i)
    check convergence |r|
end

ILU(0), IC(0)
• Widely used Preconditioners for Sparse Matrices
– Incomplete LU Factorization
– Incomplete Cholesky Factorization (for Symmetric Matrices)

• Incomplete Direct Method
– Even if the original matrix is sparse, the inverse matrix is not necessarily sparse.
– fill-in
– ILU(0)/IC(0) without fill-in have the same non-zero pattern as the original (sparse) matrices

Diagonal Scaling, Point-Jacobi

$$[M] = \begin{bmatrix}D_1 & 0 & \cdots & 0 & 0\\ 0 & D_2 & & 0 & 0\\ \vdots & & \ddots & & \vdots\\ 0 & 0 & & D_{N-1} & 0\\ 0 & 0 & \cdots & 0 & D_N\end{bmatrix}$$

• Solving [M]z(i-1) = r(i-1) is very easy.
• Provides fast convergence for simple problems.
• Used in 1d.f, 1d.c

• More detailed discussions on preconditioning will be provided in "Multicore Programming".

Coef. Matrix derived from FEM

• Sparse Matrix
– Many "0"s
• Storing all components (e.g. A(i,j)) is not efficient for sparse matrices
– A(i,j) is suitable for dense matrices
• Number of non-zero off-diagonal components per row is O(100) in FEM
– If the number of unknowns is 10^8:
• A(i,j): O(10^16) words
• Actual Non-zero Components: O(10^10) words
• Only (really) non-zero off-diag. components should be stored in memory

[Figure: 16x16 global system [K]{Φ}={F}; D marks diagonal components and X marks non-zero off-diagonal components]

Variables/Arrays in 1d.f, 1d.c related to coefficient matrix

name      type  size              description
N         I     -                 # Unknowns
NPLU      I     -                 # Non-Zero Off-Diagonal Components
Diag(:)   R     N                 Diagonal Components
U(:)      R     N                 Unknown Vector
Rhs(:)    R     N                 RHS Vector
Index(:)  I     0:N (N+1 entries) Off-Diagonal Components (Number of Non-Zero Off-Diagonals at Each ROW, cumulative)
Item(:)   I     NPLU              Off-Diagonal Components (Corresponding Column ID)
AMat(:)   R     NPLU              Off-Diagonal Components (Value)

Only non-zero components are stored according to "Compressed Row Storage".

Mat-Vec. Multiplication for Sparse Matrix

Compressed Row Storage (CRS)
Diag(i)   Diagonal Components (REAL, i=1~N)
Index(i)  Number of Non-Zero Off-Diagonals at Each ROW (INT, i=0~N)
Item(k)   Off-Diagonal Components (Corresponding Column ID) (INT, k=1, index(N))
AMat(k)   Off-Diagonal Components (Value) (REAL, k=1, index(N))

{Y}= [A]{X}

do i= 1, N
  Y(i)= Diag(i)*X(i)
  do k= Index(i-1)+1, Index(i)
    Y(i)= Y(i) + Amat(k)*X(Item(k))
  enddo
enddo

CRS or CSR ?
for Compressed Row Storage
• In Japan and the USA, "CRS" is the common abbreviation of "Compressed Row Storage", but "CSR" is usually used in Europe (especially in France).

• "CRS" in France
– Compagnie Républicaine de Sécurité
• Republican Security Company of France
• French scientists may feel uncomfortable when we use "CRS" in technical papers and/or presentations.

Mat-Vec. Multiplication for Sparse Matrix

Compressed Row Storage (CRS)

{Q}=[A]{P}

for(i=0;i<N;i++){
  W[Q][i] = Diag[i] * W[P][i];
  for(k=Index[i];k<Index[i+1];k++){
    W[Q][i] += AMat[k]*W[P][Item[k]];
  }
}

Mat-Vec. Multiplication for Dense Matrix

Very Easy, Straightforward
$$\begin{bmatrix}a_{11} & a_{12} & \cdots & a_{1,N-1} & a_{1,N}\\ a_{21} & a_{22} & & a_{2,N-1} & a_{2,N}\\ \vdots & & \ddots & & \vdots\\ a_{N-1,1} & a_{N-1,2} & & a_{N-1,N-1} & a_{N-1,N}\\ a_{N,1} & a_{N,2} & \cdots & a_{N,N-1} & a_{N,N}\end{bmatrix}\begin{Bmatrix}x_1\\ x_2\\ \vdots\\ x_{N-1}\\ x_N\end{Bmatrix} = \begin{Bmatrix}y_1\\ y_2\\ \vdots\\ y_{N-1}\\ y_N\end{Bmatrix}$$

{Y}= [A]{X}

do i= 1, N
  Y(i)= 0.d0
  do j= 1, N
    Y(i)= Y(i) + A(i,j)*X(j)
  enddo
enddo

Compressed Row Storage (CRS)

Example: 8x8 sparse matrix

        1     2     3     4     5     6     7     8
1     1.1   2.4   0     0     3.2   0     0     0
2     4.3   3.6   0     2.5   0     3.7   0     9.1
3     0     0     5.7   0     1.5   0     3.1   0
4     0     4.1   0     9.8   2.5   2.7   0     0
5     3.1   9.5   10.4  0     11.5  0     4.3   0
6     0     0     6.5   0     0     12.4  9.5   0
7     0     6.4   2.5   0     0     1.4   23.1  13.1
8     0     9.5   1.3   9.6   0     3.1   0     51.3

Compressed Row Storage (CRS): Fortran

N= 8. The diagonal component of each row is stored in Diag(i). The non-zero off-diagonal components are numbered consecutively by k, row by row, and stored as a column ID item(k) and a value AMAT(k). index(i) is the cumulative number of non-zero off-diagonal components up to row i, with index(0)= 0.

Row  Diag(i)  Non-zero off-diag., shown as k:(item(k), AMAT(k))        # Non-Zero  index(i)
                                                                       Off-Diag.
1    1.1      1:(2, 2.4)   2:(5, 3.2)                                   2           2
2    3.6      3:(1, 4.3)   4:(4, 2.5)   5:(6, 3.7)   6:(8, 9.1)         4           6
3    5.7      7:(5, 1.5)   8:(7, 3.1)                                   2           8
4    9.8      9:(2, 4.1)   10:(5, 2.5)  11:(6, 2.7)                     3           11
5    11.5     12:(1, 3.1)  13:(2, 9.5)  14:(3, 10.4) 15:(7, 4.3)        4           15
6    12.4     16:(3, 6.5)  17:(7, 9.5)                                  2           17
7    23.1     18:(2, 6.4)  19:(3, 2.5)  20:(6, 1.4)  21:(8, 13.1)       4           21
8    51.3     22:(2, 9.5)  23:(3, 1.3)  24:(4, 9.6)  25:(6, 3.1)        4           25

NPLU= 25 (= index(N))
The non-zero off-diagonal components of the i-th row are the (index(i-1)+1)-th to index(i)-th entries of item and AMAT.

Example:
item( 7)= 5, AMAT( 7)= 1.5
item(19)= 3, AMAT(19)= 2.5

Diag(i)   Diagonal Components (REAL, i=1~N)
Index(i)  Number of Non-Zero Off-Diagonals at Each ROW (INT, i=0~N)
Item(k)   Off-Diagonal Components (Corresponding Column ID) (INT, k=1, index(N))
AMat(k)   Off-Diagonal Components (Value) (REAL, k=1, index(N))

{Y}= [A]{X}

do i= 1, N
  Y(i)= Diag(i)*X(i)
  do k= index(i-1)+1, index(i)
    Y(i)= Y(i) + AMAT(k)*X(item(k))
  enddo
enddo
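To make the storage scheme concrete, the following C sketch builds the diagonal and compressed-row arrays from a small dense matrix and then multiplies with them. It uses 0-based indices as in the C mat-vec loop of this lecture; the function names `dense_to_crs` and `crs_matvec` are hypothetical, and the dense input is only a convenience for the example (a real FEM code would build the arrays from element connectivity).

```c
#include <assert.h>
#include <math.h>

/* Split a dense n*n row-major matrix into Diag plus CRS off-diagonals.
 * index needs n+1 entries; item and amat must hold all non-zero
 * off-diagonal entries. Returns NPLU (= index[n]). 0-based numbering. */
int dense_to_crs(int n, const double *a, double *diag,
                 int *index, int *item, double *amat)
{
    assert(n > 0);
    int nplu = 0;
    index[0] = 0;
    for (int i = 0; i < n; i++) {
        diag[i] = a[i * n + i];
        for (int j = 0; j < n; j++) {
            if (j != i && a[i * n + j] != 0.0) {
                item[nplu] = j;             /* column ID */
                amat[nplu] = a[i * n + j];  /* value     */
                nplu++;
            }
        }
        index[i + 1] = nplu;                /* cumulative count */
    }
    return nplu;
}

/* {y} = [A]{x} using Diag + CRS off-diagonal storage */
void crs_matvec(int n, const double *diag, const int *index,
                const int *item, const double *amat,
                const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        y[i] = diag[i] * x[i];
        for (int k = index[i]; k < index[i + 1]; k++)
            y[i] += amat[k] * x[item[k]];
    }
}
```

For the 8x8 example matrix this yields NPLU = 25 and the same cumulative index values as the Fortran table; entries are shifted by one because of 0-based numbering, so Fortran's item(7)= 5 appears here as item[6] == 4.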


Finite Element Procedures


• Initialization
– Control Data
– Node, Connectivity of Elements (N: Node#, NE: Elem#)
– Initialization of Arrays (Global/Element Matrices)
– Element-Global Matrix Mapping (Index, Item)
• Generation of Matrix
– Element-by-Element Operations (do icel= 1, NE)
• Element matrices
• Accumulation to global matrix
– Boundary Conditions
• Linear Solver
– Conjugate Gradient Method

Program: 1d.f (1/6)
Variables and Arrays
!C
!C 1D Steady-State Heat Transfer
!C FEM with Piece-wise Linear Elements
!C CG (Conjugate Gradient) Method
!C
!C d/dx(CdT/dx) + Q = 0
!C T=0@x=0
!C
program heat1D
implicit REAL*8 (A-H,O-Z)
integer :: N, NPLU, ITERmax
integer :: R, Z, P, Q, DD
real(kind=8) :: dX, RESID, EPS
real(kind=8) :: AREA, QV, COND
real(kind=8), dimension(:), allocatable :: PHI, RHS, X
real(kind=8), dimension(: ), allocatable :: DIAG, AMAT
real(kind=8), dimension(:,:), allocatable :: W
real(kind=8), dimension(2,2) :: KMAT, EMAT
integer, dimension(:), allocatable :: ICELNOD
integer, dimension(:), allocatable :: INDEX, ITEM

Variable/Arrays (1/2)

Name            Type  Size  I/O  Definition
NE              I     -     I    # Element
N               I     -     O    # Node
NPLU            I     -     O    # Non-Zero Off-Diag. Components
IterMax         I     -     I    MAX Iteration Number for CG
errno           I     -     O    ERROR flag
R, Z, Q, P, DD  I     -     O    Name of Vectors in CG
dX              R     -     I    Length of Each Element
Resid           R     -     O    Residual for CG
Eps             R     -     I    Convergence Criteria for CG
Area            R     -     I    Sectional Area of Element
QV              R     -     I    Heat Generation Rate/Volume/Time Q
COND            R     -     I    Thermal Conductivity λ

Type: I= integer, R= real.

Variable/Arrays (2/2)

Name     Type  Size   I/O  Definition
X        R     N      O    Location of Each Node
PHI      R     N      O    Temperature of Each Node
Rhs      R     N      O    RHS Vector
Diag     R     N      O    Diagonal Components
W        R     (N,4)  O    Work Array for CG
Amat     R     NPLU   O    Off-Diagonal Components (Value)
Index    I     0:N    O    Cumulative Number of Non-Zero Off-Diagonals up to Each ROW
Item     I     NPLU   O    Off-Diagonal Components (Corresponding Column ID)
Icelnod  I     2*NE   O    Node ID for Each Element
Kmat     R     (2,2)  O    Element Matrix [k]
Emat     R     (2,2)  O    Element Matrix

Program: 1d.f (2/6)
Initialization, Allocation of Arrays

!C
!C +-------+
!C | INIT. |
!C +-------+
!C===
      open (11, file='input.dat', status='unknown')
        read (11,*) NE
        read (11,*) dX, QV, AREA, COND
        read (11,*) ITERmax
        read (11,*) EPS
      close (11)

      N= NE + 1
      allocate (PHI(N), DIAG(N), AMAT(2*N-2), RHS(N))
      allocate (ICELNOD(2*NE), X(N))
      allocate (INDEX(0:N), ITEM(2*N-2), W(N,4))
      PHI = 0.d0
      AMAT= 0.d0
      DIAG= 0.d0
      RHS = 0.d0
      X   = 0.d0

Control Data input.dat
4                  NE (Number of Elements)
1.0 1.0 1.0 1.0    Δx (Length of Each Elem.: L), Q, A, λ
100                Number of MAX. Iterations for CG Solver
1.e-8              Convergence Criteria for CG Solver

NE: # Element
N : # Node (NE+1)
With NE= 4, the elements are numbered 1~4 and the nodes 1~5.

Program: 1d.f (2/6)
Initialization, Allocation of Arrays

In the allocation above:
Amat: Non-Zero Off-Diag. Components (AMAT(2*N-2))
Item: Corresponding Column ID (ITEM(2*N-2))

Element/Global Operations

Mesh: elements 1~4, global node IDs 1~5. Each element e has local node IDs 1 and 2.

The Galerkin weak form (conduction term, surface flux term and volume heat generation term) is evaluated element by element and gives one 2x2 system per element:

[k]^{(e)}\{\phi\}^{(e)} = \{f\}^{(e)}, \qquad
\begin{bmatrix} k_{11}^{(e)} & k_{12}^{(e)} \\ k_{21}^{(e)} & k_{22}^{(e)} \end{bmatrix}
\begin{Bmatrix} \phi_1^{(e)} \\ \phi_2^{(e)} \end{Bmatrix} =
\begin{Bmatrix} f_1^{(e)} \\ f_2^{(e)} \end{Bmatrix}, \qquad e = 1, \dots, 4

Accumulating the element systems gives the global system:

[K]\{\Phi\} = \{F\}, \qquad
\begin{bmatrix}
D_1 & AU_{11} & & & \\
AL_{21} & D_2 & AU_{21} & & \\
 & AL_{31} & D_3 & AU_{31} & \\
 & & AL_{41} & D_4 & AU_{41} \\
 & & & AL_{51} & D_5
\end{bmatrix}
\begin{Bmatrix} \Phi_1 \\ \Phi_2 \\ \Phi_3 \\ \Phi_4 \\ \Phi_5 \end{Bmatrix} =
\begin{Bmatrix} B_1 \\ B_2 \\ B_3 \\ B_4 \\ B_5 \end{Bmatrix}

The number of non-zero off-diagonal components is 2 for each interior node and 1 for each boundary node.

Attention: In the C program, node and element IDs start from 0.

Fortran: elements 1~4, nodes 1~5, local node IDs 1 and 2 in each element.
C:       elements 0~3, nodes 0~4, local node IDs 0 and 1 in each element.

Program: 1d.f (2/6)
Initialization, Allocation of Arrays

Element icel connects nodes icel and icel+1. In the C version (0-based) these are Icelnod[2*icel]= icel and Icelnod[2*icel+1]= icel+1. The Fortran version (1-based) uses ICELNOD(2*icel-1) and ICELNOD(2*icel).

Amat: Non-Zero Off-Diag. Components
Item: Corresponding Column ID

The number of non-zero off-diagonal components is 2 for each interior node and 1 for each boundary node.

Total Number of Non-Zero Off-Diag. Components:
2*(N-2)+1+1= 2*N-2

Program: 1d.f (3/6)
Initialization, Allocation of Arrays (cont.)

!C-- X: X-coordinate component of each node
      do i= 1, N
        X(i)= dfloat(i-1)*dX
      enddo

      do icel= 1, NE
        ICELNOD(2*icel-1)= icel
        ICELNOD(2*icel  )= icel + 1
      enddo

      KMAT(1,1)= +1.d0
      KMAT(1,2)= -1.d0
      KMAT(2,1)= -1.d0
      KMAT(2,2)= +1.d0

Program: 1d.f (3/6)
Initialization, Allocation of Arrays (cont.)

Element icel has two nodes:
Icelnod(2*icel-1)= icel   (first node)
Icelnod(2*icel  )= icel+1 (second node)

Program: 1d.f (3/6)
Initialization, Allocation of Arrays (cont.)

[k]^{(e)} = \int_V \lambda \frac{d[N]^T}{dx} \frac{d[N]}{dx} \, dV
          = \frac{\lambda A}{L} \begin{bmatrix} +1 & -1 \\ -1 & +1 \end{bmatrix}

The constant 2x2 matrix on the right is stored as [Kmat]:

      KMAT(1,1)= +1.d0
      KMAT(1,2)= -1.d0
      KMAT(2,1)= -1.d0
      KMAT(2,2)= +1.d0
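
As a worked check of the element matrix, the integral can be evaluated by hand. This sketch assumes the linear shape functions N_1 = 1 - x/L and N_2 = x/L in a local coordinate x running from 0 to L, which is consistent with the heat generation integral used for the RHS in part (5/6), and a uniform sectional area A and conductivity \lambda.

```latex
[N] = \begin{bmatrix} 1 - \dfrac{x}{L} & \dfrac{x}{L} \end{bmatrix},
\qquad
\frac{d[N]}{dx} = \begin{bmatrix} -\dfrac{1}{L} & \dfrac{1}{L} \end{bmatrix}

\frac{d[N]^T}{dx}\,\frac{d[N]}{dx}
  = \frac{1}{L^2} \begin{bmatrix} +1 & -1 \\ -1 & +1 \end{bmatrix}

[k]^{(e)} = \int_0^L \lambda \, \frac{1}{L^2}
  \begin{bmatrix} +1 & -1 \\ -1 & +1 \end{bmatrix} A \, dx
  = \frac{\lambda A}{L} \begin{bmatrix} +1 & -1 \\ -1 & +1 \end{bmatrix}
```

Because the integrand is constant over the element, the integration only multiplies it by the element volume A L.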

Program: 1d.f (4/6)
Global Matrix: Column ID for Non-Zero Off-Diag's

!C
!C +--------------+
!C | CONNECTIVITY |
!C +--------------+
!C===
      INDEX   = 2
      INDEX(0)= 0
      INDEX(1)= 1
      INDEX(N)= 1

      do i= 1, N
        INDEX(i)= INDEX(i) + INDEX(i-1)
      enddo

      NPLU= INDEX(N)

      do i= 1, N
        jS= INDEX(i-1)
        if (i.eq.1) then
          ITEM(jS+1)= i+1
         else if (i.eq.N) then
          ITEM(jS+1)= i-1
         else
          ITEM(jS+1)= i-1
          ITEM(jS+2)= i+1
        endif
      enddo
!C===

The number of non-zero off-diagonal components is 2 for each interior node and 1 for each boundary node.

Total Number of Non-Zero Off-Diag. Components:
2*(N-2)+1+1= 2*N-2= NPLU= INDEX(N)

For reference, in the 8x8 CRS example INDEX holds the running total of non-zero off-diagonal components:

Row  # Non-Zero Off-Diag.  index
 0          -              index(0)=  0
 1          2              index(1)=  2
 2          4              index(2)=  6
 3          2              index(3)=  8
 4          3              index(4)= 11
 5          4              index(5)= 15
 6          2              index(6)= 17
 7          4              index(7)= 21
 8          4              index(8)= 25

The non-zero off-diagonal components of the i-th row are stored at positions index(i-1)+1 through index(i).

Program: 1d.f (4/6)
Global Matrix: Column ID for Non-Zero Off-Diag's

An interior node i is shared by elements i-1 and i, so it is connected to nodes i-1 and i+1. These two column IDs are stored in this order:
ITEM(INDEX(i-1)+1)= i-1
ITEM(INDEX(i-1)+2)= i+1
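
The connectivity step can be run on its own to inspect the resulting arrays. The following is a minimal sketch for N= 5 nodes (NE= 4); the program name is illustrative and the free-form layout is a choice made here, while the logic follows the CONNECTIVITY block.

```fortran
program crs_connectivity_demo
  implicit none
  integer, parameter :: n = 5
  integer :: index(0:n), item(2*n-2)
  integer :: i, js, nplu

  ! 2 off-diagonals per interior row, 1 per boundary row
  index    = 2
  index(0) = 0
  index(1) = 1
  index(n) = 1

  ! Convert per-row counts into running totals
  do i = 1, n
    index(i) = index(i) + index(i-1)
  enddo
  nplu = index(n)

  ! Column IDs: left neighbor first, then right neighbor
  do i = 1, n
    js = index(i-1)
    if (i == 1) then
      item(js+1) = i + 1
    else if (i == n) then
      item(js+1) = i - 1
    else
      item(js+1) = i - 1
      item(js+2) = i + 1
    endif
  enddo

  write (*,'(a,i3)')   'NPLU =', nplu    ! 2*N-2 = 8
  write (*,'(a,6i3)')  'INDEX=', index   ! 0 1 3 5 7 8
  write (*,'(a,8i3)')  'ITEM =', item    ! 2 1 3 2 4 3 5 4
end program crs_connectivity_demo
```

The fixed ordering (left neighbor before right neighbor) is what later allows the assembly loop to compute the positions k1 and k2 without searching ITEM.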

Program: 1d.f (5/6)
Element Matrix ~ Global Matrix

!C +-----------------+
!C | MATRIX ASSEMBLE |
!C +-----------------+
!C===
!C-- element icel connects nodes in1 and in2
      do icel= 1, NE
        in1= ICELNOD(2*icel-1)
        in2= ICELNOD(2*icel  )
        X1 = X(in1)
        X2 = X(in2)
        DL = dabs(X2-X1)

        cK= AREA*COND/DL
        EMAT(1,1)= cK*KMAT(1,1)
        EMAT(1,2)= cK*KMAT(1,2)
        EMAT(2,1)= cK*KMAT(2,1)
        EMAT(2,2)= cK*KMAT(2,2)

        DIAG(in1)= DIAG(in1) + EMAT(1,1)
        DIAG(in2)= DIAG(in2) + EMAT(2,2)

        if (icel.eq.1) then
          k1= INDEX(in1-1) + 1
         else
          k1= INDEX(in1-1) + 2
        endif
        k2= INDEX(in2-1) + 1

        AMAT(k1)= AMAT(k1) + EMAT(1,2)
        AMAT(k2)= AMAT(k2) + EMAT(2,1)

        QN= 0.50d0*QV*AREA*DL
        RHS(in1)= RHS(in1) + QN
        RHS(in2)= RHS(in2) + QN
      enddo
!C===

Program: 1d.f (5/6)
Element Matrix ~ Global Matrix

The element matrix is the constant matrix [Kmat] scaled by cK= AREA*COND/DL:

[Emat] = [k]^{(e)} = \frac{\lambda A}{L} \begin{bmatrix} +1 & -1 \\ -1 & +1 \end{bmatrix}
       = \frac{\lambda A}{L} [Kmat]

Its diagonal components EMAT(1,1) and EMAT(2,2) are accumulated into DIAG(in1) and DIAG(in2).

Program: 1d.f (5/6)
Element Matrix ~ Global Matrix: Off-Diagonal Components

Non-zero off-diagonal components at the i-th row (interior node i, connected to i-1 and i+1):
INDEX(i-1)+1  column i-1
INDEX(i-1)+2  column i+1

EMAT(1,2) is accumulated at position k1 and EMAT(2,1) at position k2, where

[Emat] = [k]^{(e)} = \frac{\lambda A}{L} \begin{bmatrix} +1 & -1 \\ -1 & +1 \end{bmatrix}

General Elements: k1
"in2" as an off-diagonal component of "in1". Row in1 is connected to in1-1 and in2, so in2 is its second off-diagonal component:
      k1= INDEX(in1-1) + 2

General Elements: k2
"in1" as an off-diagonal component of "in2". Row in2 is connected to in1 and in2+1, so in1 is its first off-diagonal component:
      k2= INDEX(in2-1) + 1

First Element (icel=1, the 0-th element in the C version): k1
"in2" as an off-diagonal component of "in1". Row in1= 1 is a boundary node with only one non-zero off-diagonal component, at INDEX(in1-1)+1:
      k1= INDEX(in1-1) + 1

Program: 1d.f (5/6)
RHS: Heat Generation Term

\int_V \dot{Q} \, [N]^T dV
  = \dot{Q} A \int_0^L \begin{Bmatrix} 1 - x/L \\ x/L \end{Bmatrix} dx
  = \frac{\dot{Q} A L}{2} \begin{Bmatrix} 1 \\ 1 \end{Bmatrix}

In the assembly loop each element therefore adds QN to both of its nodes:

        QN= 0.50d0*QV*AREA*DL
        RHS(in1)= RHS(in1) + QN
        RHS(in2)= RHS(in2) + QN
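
The assembly can be checked on the sample control data (NE= 4, dX= QV= AREA= COND= 1.0). The following is a minimal sketch: the values are hard-coded instead of being read from input.dat, ITEM is omitted because assembly only needs INDEX, and the program name is illustrative.

```fortran
program assemble_demo
  implicit none
  integer, parameter :: ne = 4, n = ne + 1
  real(kind=8), parameter :: dx = 1.d0, qv = 1.d0
  real(kind=8), parameter :: area = 1.d0, cond = 1.d0
  integer :: index(0:n), icelnod(2*ne)
  real(kind=8) :: x(n), diag(n), amat(2*n-2), rhs(n)
  real(kind=8) :: kmat(2,2), emat(2,2), dl, ck, qn
  integer :: i, icel, in1, in2, k1, k2

  ! Mesh: node coordinates and element connectivity
  do i = 1, n
    x(i) = dble(i-1)*dx
  enddo
  do icel = 1, ne
    icelnod(2*icel-1) = icel
    icelnod(2*icel)   = icel + 1
  enddo

  ! CRS running totals: 1 off-diagonal at the ends, 2 elsewhere
  index = 2
  index(0) = 0
  index(1) = 1
  index(n) = 1
  do i = 1, n
    index(i) = index(i) + index(i-1)
  enddo

  kmat = reshape((/ 1.d0, -1.d0, -1.d0, 1.d0 /), (/ 2, 2 /))
  diag = 0.d0
  amat = 0.d0
  rhs  = 0.d0

  do icel = 1, ne
    in1 = icelnod(2*icel-1)
    in2 = icelnod(2*icel)
    dl  = abs(x(in2) - x(in1))
    ck  = area*cond/dl
    emat = ck*kmat

    diag(in1) = diag(in1) + emat(1,1)
    diag(in2) = diag(in2) + emat(2,2)

    ! in2 is the 1st off-diagonal of row in1 only for the first element
    if (icel == 1) then
      k1 = index(in1-1) + 1
    else
      k1 = index(in1-1) + 2
    endif
    k2 = index(in2-1) + 1
    amat(k1) = amat(k1) + emat(1,2)
    amat(k2) = amat(k2) + emat(2,1)

    ! Heat generation is split equally between the two nodes
    qn = 0.5d0*qv*area*dl
    rhs(in1) = rhs(in1) + qn
    rhs(in2) = rhs(in2) + qn
  enddo

  write (*,'(a,5f6.2)') 'DIAG=', diag   ! 1 2 2 2 1
  write (*,'(a,8f6.2)') 'AMAT=', amat   ! all -1
  write (*,'(a,5f6.2)') 'RHS =', rhs    ! 0.5 1 1 1 0.5
end program assemble_demo
```

The result is the familiar tridiagonal matrix before boundary conditions: interior rows receive contributions from two elements, boundary rows from one.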

Program: 1d.f (6/6)
Dirichlet B.C. @ X=0

!C
!C +---------------------+
!C | BOUNDARY CONDITIONS |
!C +---------------------+
!C===

!C
!C-- X=Xmin
      i= 1
      jS= INDEX(i-1)

      AMAT(jS+1)= 0.d0
      DIAG(i)= 1.d0
      RHS (i)= 0.d0

      do k= 1, NPLU
        if (ITEM(k).eq.1) AMAT(k)= 0.d0
      enddo
!C===

1D Steady State Heat Conduction

\frac{\partial}{\partial x}\left( \lambda \frac{\partial T}{\partial x} \right) + \dot{Q} = 0

Domain: x=0 (xmin) to x= xmax, with heat generation per unit volume \dot{Q}

• Uniform: Sectional Area: A, Thermal Conductivity: λ
• Heat Generation Rate/Volume/Time [Q L^-3 T^-1]: \dot{Q}
• Boundary Conditions
– x=0 : T= 0 (Fixed Temperature)
– x=xmax: \frac{\partial T}{\partial x}= 0 (Insulated)

(Linear) Equation at x=0

T1= 0 (or T0= 0 in the 0-based numbering of the C version)

Program: 1d.f (6/6)
Dirichlet B.C. @ X=0

The boundary equation T1= 0 replaces row 1 of the linear system:
• Diagonal Component= 1: DIAG(1)= 1.d0
• RHS= 0: RHS(1)= 0.d0
• Off-Diagonal Components= 0: AMAT(jS+1)= 0.d0 erases the off-diagonal component of row 1

Elimination and Erase: the loop over k= 1, NPLU sets AMAT(k)= 0.d0 wherever ITEM(k).eq.1, which erases column 1 in all other rows.

Column components of boundary nodes (Dirichlet B.C.) are moved to the RHS and eliminated in order to keep the matrix symmetric. In this case T1= 0, so the RHS is unchanged and the off-diagonal components are simply erased.

if T1≠ 0

!C
!C +---------------------+
!C | BOUNDARY CONDITIONS |
!C +---------------------+
!C===

!C
!C-- X=Xmin
      i= 1
      jS= INDEX(i-1)

      AMAT(jS+1)= 0.d0
      DIAG(i)= 1.d0
      RHS (i)= PHImin

      do i= 1, N
        do k= INDEX(i-1)+1, INDEX(i)
          if (ITEM(k).eq.1) then
            RHS (i)= RHS(i) - AMAT(k)*PHImin
            AMAT(k)= 0.d0
          endif
        enddo
      enddo
!C===

Column components of boundary nodes (Dirichlet B.C.) are moved to the RHS and eliminated in order to keep the matrix symmetric.

Row j of the linear system (written here with the 1-based Fortran indexing; the slide figure uses the 0-based C form):

Diag_j \, \phi_j + \sum_{k = Index(j-1)+1}^{Index(j)} Amat_k \, \phi_{Item(k)} = Rhs_j

If k_s is the position in row j whose column is the boundary node, Item(k_s) = 1 (0 in the C version), then \phi_{Item(k_s)} = \phi_{min} is known and the term is moved to the RHS:

Diag_j \, \phi_j + \sum_{k = Index(j-1)+1,\; k \ne k_s}^{Index(j)} Amat_k \, \phi_{Item(k)}
  = Rhs_j - Amat_{k_s} \, \phi_{Item(k_s)}
  = Rhs_j - Amat_{k_s} \, \phi_{min}
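
The elimination for a non-zero boundary value can be traced on a very small system. The following is a minimal sketch, assuming a hypothetical 2-element mesh (N= 3) with dX= QV= AREA= COND= 1.0 and an illustrative boundary value PHImin= 2.0; neither the mesh size nor the value comes from the slides.

```fortran
program dirichlet_demo
  implicit none
  integer, parameter :: n = 3
  real(kind=8), parameter :: phimin = 2.d0   ! illustrative value
  integer :: index(0:n), item(4)
  real(kind=8) :: diag(n), amat(4), rhs(n)
  integer :: i, k

  ! Assembled system for 2 linear elements (before B.C.)
  index = (/ 0, 1, 3, 4 /)
  item  = (/ 2, 1, 3, 2 /)
  diag  = (/ 1.d0, 2.d0, 1.d0 /)
  amat  = -1.d0
  rhs   = (/ 0.5d0, 1.d0, 0.5d0 /)

  ! Row 1 becomes the equation phi(1)= phimin
  amat(index(0)+1) = 0.d0
  diag(1) = 1.d0
  rhs(1)  = phimin

  ! Move column 1 of every other row to the RHS, then erase it
  do i = 1, n
    do k = index(i-1)+1, index(i)
      if (item(k) == 1) then
        rhs(i)  = rhs(i) - amat(k)*phimin
        amat(k) = 0.d0
      endif
    enddo
  enddo

  write (*,'(a,3f6.2)') 'DIAG=', diag   ! 1 2 1
  write (*,'(a,4f6.2)') 'AMAT=', amat   ! 0 0 -1 -1
  write (*,'(a,3f6.2)') 'RHS =', rhs    ! 2 3 0.5
end program dirichlet_demo
```

The modified matrix is still symmetric, which is required by the CG method. Solving it by hand gives phi= (2, 3.5, 4).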
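
Secondary B.C. (Insulated)

Heat generation rate: \dot{Q}

\frac{\partial}{\partial x}\left( \lambda \frac{\partial T}{\partial x} \right) + \dot{Q} = 0

T = 0 @ x = 0 (xmin), \qquad \frac{\partial T}{\partial x} = 0 @ x = x_{max}

Surface flux term:

\int_S \bar{q} \, [N]^T dS = \bar{q} A \big|_{x=L} \begin{Bmatrix} 0 \\ 1 \end{Bmatrix},
\qquad \bar{q} = -\lambda \frac{dT}{dx} \quad \text{(surface flux)}

According to the insulated B.C., \bar{q} = 0 is satisfied, so this term makes no contribution.

The insulated B.C. \frac{\partial T}{\partial x} = 0 @ x = x_{max} is automatically satisfied without explicit operations -> Natural B.C.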
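
For verification of the FEM results, the problem above can be integrated directly. This is a short worked derivation, assuming constant \lambda and \dot{Q} as stated for the uniform bar.

```latex
\lambda \frac{d^2 T}{dx^2} = -\dot{Q}
\;\Rightarrow\;
\frac{dT}{dx} = -\frac{\dot{Q}}{\lambda} x + C_1

\left. \frac{dT}{dx} \right|_{x = x_{max}} = 0
\;\Rightarrow\;
C_1 = \frac{\dot{Q}\, x_{max}}{\lambda}

T(x) = \frac{\dot{Q}}{\lambda} \left( x_{max}\, x - \frac{x^2}{2} \right) + C_2,
\qquad T(0) = 0 \;\Rightarrow\; C_2 = 0

T(x) = \frac{\dot{Q}}{\lambda} \left( x_{max}\, x - \frac{x^2}{2} \right)
```

With the sample control data (NE= 4, dX= 1.0, Q= 1.0, λ= 1.0) this gives x_max= 4 and nodal values 0, 3.5, 6, 7.5 and 8.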

Preconditioned CG Solver

Compute r(0)= b-[A]x(0)
for i= 1, 2, …
  solve [M]z(i-1)= r(i-1)
  ρ(i-1)= r(i-1)·z(i-1)
  if i=1
    p(1)= z(0)
  else
    β(i-1)= ρ(i-1)/ρ(i-2)
    p(i)= z(i-1) + β(i-1) p(i-1)
  endif
  q(i)= [A]p(i)
  α(i)= ρ(i-1)/(p(i)·q(i))
  x(i)= x(i-1) + α(i) p(i)
  r(i)= r(i-1) - α(i) q(i)
  check convergence |r|
end

Preconditioning matrix (diagonal components of [A]):

[M] = \begin{bmatrix}
D_1 & 0 & \cdots & 0 & 0 \\
0 & D_2 & & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & & D_{N-1} & 0 \\
0 & 0 & \cdots & 0 & D_N
\end{bmatrix}

Diagonal Scaling, Point-Jacobi

[M] = \begin{bmatrix}
D_1 & 0 & \cdots & 0 & 0 \\
0 & D_2 & & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & & D_{N-1} & 0 \\
0 & 0 & \cdots & 0 & D_N
\end{bmatrix}

• Solving [M]z(i-1)= r(i-1) is very easy: z_j= r_j/D_j.
• Provides fast convergence for simple problems.
• Used in 1d.f and 1d.c.

CG Solver (1/6)

!C
!C +---------------+
!C | CG iterations |
!C +---------------+
!C===
      R = 1
      Z = 2
      Q = 2
      P = 3
      DD= 4

      do i= 1, N
        W(i,DD)= 1.0D0 / DIAG(i)
      enddo

Reciprocals of the diagonal components are stored in W(i,DD), because division is usually computationally expensive.

CG Solver (1/6)

Columns of the work array W:
W(i,1)= W(i,R)  ⇒ {r}
W(i,2)= W(i,Z)  ⇒ {z}
W(i,2)= W(i,Q)  ⇒ {q}
W(i,3)= W(i,P)  ⇒ {p}
W(i,4)= W(i,DD) ⇒ 1/{D}

{z} and {q} share column 2: in each iteration {z} is no longer needed once {p} has been updated, and {q} is computed only after that.

CG Solver (2/6)

!C
!C-- {r0}= {b} - [A]{xini}
!C   (initial residual)
      do i= 1, N
        W(i,R) = DIAG(i)*PHI(i)
        do j= INDEX(i-1)+1, INDEX(i)
          W(i,R) = W(i,R) + AMAT(j)*PHI(ITEM(j))
        enddo
      enddo

      BNRM2= 0.0D0
      do i= 1, N
        BNRM2 = BNRM2 + RHS(i) **2
        W(i,R)= RHS(i) - W(i,R)
      enddo

BNRM2= |b|^2 is used in the convergence criterion of the CG solver.

CG Solver (3/6)

      do iter= 1, ITERmax

!C
!C-- {z}= [Minv]{r}
        do i= 1, N
          W(i,Z)= W(i,DD) * W(i,R)
        enddo

!C
!C-- RHO= {r}{z}
        RHO= 0.d0
        do i= 1, N
          RHO= RHO + W(i,R)*W(i,Z)
        enddo

CG Solver (4/6)

!C
!C-- {p} = {z} if      ITER=1
!C   BETA= RHO / RHO1  otherwise
        if ( iter.eq.1 ) then
          do i= 1, N
            W(i,P)= W(i,Z)
          enddo
         else
          BETA= RHO / RHO1
          do i= 1, N
            W(i,P)= W(i,Z) + BETA*W(i,P)
          enddo
        endif

!C
!C-- {q}= [A]{p}
        do i= 1, N
          W(i,Q) = DIAG(i)*W(i,P)
          do j= INDEX(i-1)+1, INDEX(i)
            W(i,Q) = W(i,Q) + AMAT(j)*W(ITEM(j),P)
          enddo
        enddo

CG Solver (5/6)

!C
!C-- ALPHA= RHO / {p}{q}
        C1= 0.d0
        do i= 1, N
          C1= C1 + W(i,P)*W(i,Q)
        enddo
        ALPHA= RHO / C1

!C
!C-- {x}= {x} + ALPHA*{p}
!C   {r}= {r} - ALPHA*{q}
        do i= 1, N
          PHI(i)= PHI(i) + ALPHA * W(i,P)
          W(i,R)= W(i,R) - ALPHA * W(i,Q)
        enddo

CG Solver (6/6)

        DNRM2 = 0.d0
        do i= 1, N
          DNRM2= DNRM2 + W(i,R)**2
        enddo

        RESID= dsqrt(DNRM2/BNRM2)
        if ( RESID.le.EPS) goto 900

!C-- RHO1 keeps ρ(i-2) for the next iteration
        RHO1 = RHO
      enddo
 900  continue

Convergence criterion:

Resid = \frac{\sqrt{DNorm2}}{\sqrt{BNorm2}} = \frac{|r|}{|b|} = \frac{|Ax - b|}{|b|} \le Eps

ITERmax and EPS are read from the control data input.dat:
4                  NE (Number of Elements)
1.0 1.0 1.0 1.0    Δx (Length of Each Elem.: L), Q, A, λ
100                Number of MAX. Iterations for CG Solver
1.e-8              Convergence Criteria for CG Solver
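
The six parts of the solver can be put together in one short program. The following is a minimal sketch, not the original 1d.f: it hard-codes the system obtained from the sample control data (NE= 4, all parameters 1.0) after the Dirichlet B.C. at node 1, uses separate vectors instead of the work array W, and uses array intrinsics where 1d.f uses explicit loops. The program name is illustrative.

```fortran
program pcg_demo
  implicit none
  integer, parameter :: n = 5, nplu = 8, itermax = 100
  real(kind=8), parameter :: eps = 1.d-8
  integer :: index(0:n), item(nplu), i, k, iter
  real(kind=8) :: diag(n), amat(nplu), rhs(n), phi(n)
  real(kind=8) :: r(n), z(n), p(n), q(n), dinv(n)
  real(kind=8) :: rho, rho1, alpha, beta, bnrm2, resid

  ! CRS data for the 5-node problem after T1= 0 has been applied
  index = (/ 0, 1, 3, 5, 7, 8 /)
  item  = (/ 2, 1, 3, 2, 4, 3, 5, 4 /)
  diag  = (/ 1.d0, 2.d0, 2.d0, 2.d0, 1.d0 /)
  amat  = (/ 0.d0, 0.d0, -1.d0, -1.d0, -1.d0, -1.d0, -1.d0, -1.d0 /)
  rhs   = (/ 0.d0, 1.d0, 1.d0, 1.d0, 0.5d0 /)

  phi  = 0.d0
  dinv = 1.d0/diag          ! reciprocals of the diagonal, computed once
  rho1 = 0.d0
  resid = 0.d0

  ! {r}= {b} - [A]{x}
  do i = 1, n
    r(i) = diag(i)*phi(i)
    do k = index(i-1)+1, index(i)
      r(i) = r(i) + amat(k)*phi(item(k))
    enddo
    r(i) = rhs(i) - r(i)
  enddo
  bnrm2 = sum(rhs**2)

  do iter = 1, itermax
    z   = dinv*r                       ! {z}= [Minv]{r}
    rho = dot_product(r, z)
    if (iter == 1) then
      p = z
    else
      beta = rho/rho1
      p = z + beta*p
    endif
    do i = 1, n                        ! {q}= [A]{p}
      q(i) = diag(i)*p(i)
      do k = index(i-1)+1, index(i)
        q(i) = q(i) + amat(k)*p(item(k))
      enddo
    enddo
    alpha = rho/dot_product(p, q)
    phi = phi + alpha*p
    r   = r - alpha*q
    resid = sqrt(sum(r**2)/bnrm2)
    if (resid <= eps) exit
    rho1 = rho
  enddo

  write (*,'(a,i4,a,es12.4)') 'iters=', iter, '  RESID=', resid
  write (*,'(5f10.5)') phi
end program pcg_demo
```

The computed temperatures can be compared with the values 0, 3.5, 6, 7.5 and 8, which satisfy each row of this linear system exactly.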

Remedies for Higher Accuracy

• Finer Meshes

Analytical solution used for comparison:

u = \frac{F}{E A_1} \left[ \log(A_1 x + A_2) - \log(A_2) \right]

NE=8, dX=12.5 (the figure shows the NE=4 numbering: nodes 1~5, elements 1~4)
8 iters, RESID= 2.822910E-16 U(N)= 1.953586E-01

### DISPLACEMENT
Node  FEM            Analytical
  1   0.000000E+00  -0.000000E+00
  2   1.101928E-02   1.103160E-02
  3   2.348034E-02   2.351048E-02
  4   3.781726E-02   3.787457E-02
  5   5.469490E-02   5.479659E-02
  6   7.520772E-02   7.538926E-02
  7   1.013515E-01   1.016991E-01
  8   1.373875E-01   1.381746E-01
  9   1.953586E-01   1.980421E-01

NE=20, dX=5
20 iters, RESID= 5.707508E-15 U(N)= 1.975734E-01

### DISPLACEMENT
Node  FEM            Analytical
  1   0.000000E+00  -0.000000E+00
  2   4.259851E-03   4.260561E-03
  3   8.719160E-03   8.720685E-03
  4   1.339752E-02   1.339999E-02
……
 17   1.145876E-01   1.146641E-01
 18   1.295689E-01   1.296764E-01
 19   1.473466E-01   1.475060E-01
 20   1.692046E-01   1.694607E-01
 21   1.975734E-01   1.980421E-01

Remedies for Higher Accuracy

• Finer Meshes
• Higher Order Shape/Interpolation Function
– Higher-Order Element
– Linear-Element, 1st-Order Element: Lower Order
• Formulation which assures continuity of n-th order derivatives
– Cn Continuity
• Linear Elements
– Piecewise Linear
– C0 Continuity
• Only dependent variables are continuous at element boundaries

Example: 1D Heat Transfer (1/2)

• Temperature of Thermal Fins
• Circular Sectional Area, r= 1 cm
• Material: Copper, λ= 4.00 W/(cm·K)
• Length: 7.50 cm
• Boundary Condition
– x=0 : Fixed Temperature, TS= 150 °C
– x=7.5 : Insulated
• Convective Heat Transfer on Cylindrical Surface
– q= h (T-T0), with T0= 0 °C and h= 0.10 W/(cm^2·K)
– q: Heat Flux (Heat Flow/Unit Surface Area/sec.)

Example: 1D Heat Transfer (2/2)

Exact Solution:

T(x) = (T_S - T_0) \, \frac{\cosh\left[ \sqrt{\dfrac{hP}{\lambda A}} \, (X_{max} - x) \right]}
                            {\cosh\left[ \sqrt{\dfrac{hP}{\lambda A}} \, X_{max} \right]} + T_0

[Figure: temperature (deg-C, 0 to 150) versus X (0.00 to 8.00), FEM results and exact solution]

### RESULTS (linear interpolation)
ID  X        FEM.       ANALYTICAL  ERR(%)
 1  0.00000  150.00000  150.00000   0.00000
 2  1.87500  102.62226  103.00165   0.25292
 3  3.75000   73.82803   74.37583   0.36520
 4  5.62500   58.40306   59.01653   0.40898
 5  7.50000   53.55410   54.18409   0.41999

### RESULTS (quadratic interpolation)
ID  X        FEM.       ANALYTICAL  ERR(%)
 1  0.00000  150.00000  150.00000   0.00000
 2  1.87500  102.98743  103.00165   0.00948
 3  3.75000   74.40203   74.37583   0.01747
 4  5.62500   59.02737   59.01653   0.00722
 5  7.50000   54.21426   54.18409   0.02011

### RESULTS (linear interpolation)
ID  X        FEM.       ANALYTICAL  ERR(%)
 1  0.00000  150.00000  150.00000   0.00000
 2  0.93750  123.71561  123.77127   0.03711
 3  1.87500  102.90805  103.00165   0.06240
 4  2.81250   86.65618   86.77507   0.07926
 5  3.75000   74.24055   74.37583   0.09019
 6  4.68750   65.11151   65.25705   0.09703
 7  5.62500   58.86492   59.01653   0.10107
 8  6.56250   55.22426   55.37903   0.10317
 9  7.50000   54.02836   54.18409   0.10382

Quadratic interpolation provides a more accurate solution, especially where X is close to 7.50 cm.
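
The ANALYTICAL column can be evaluated directly from the exact solution. The following is a minimal sketch, assuming that P is the perimeter of the circular section (P= 2πr), A its area (A= πr^2) and x the distance from the fixed-temperature end; these interpretations are assumptions made here, and the program and variable names are illustrative.

```fortran
program fin_exact
  implicit none
  real(kind=8), parameter :: pi = 3.141592653589793d0
  real(kind=8), parameter :: ts = 150.d0, t0 = 0.d0      ! deg-C
  real(kind=8), parameter :: h = 0.10d0, cond = 4.00d0   ! W/(cm^2 K), W/(cm K)
  real(kind=8), parameter :: radius = 1.d0, xmax = 7.5d0 ! cm
  real(kind=8) :: area, perim, m, x, t
  integer :: i

  area  = pi*radius**2        ! sectional area A (assumed circular)
  perim = 2.d0*pi*radius      ! perimeter P (assumed circular)
  m     = sqrt(h*perim/(cond*area))

  ! Evaluate at the 5 nodes of the 4-element mesh
  do i = 0, 4
    x = xmax*dble(i)/4.d0
    t = (ts - t0)*cosh(m*(xmax - x))/cosh(m*xmax) + t0
    write (*,'(f10.5,f12.5)') x, t
  enddo
end program fin_exact
```

Under these assumptions the printed temperatures should agree with the ANALYTICAL column of the 5-node tables, for example about 54.18 at x= 7.5.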
