Ionic Optimisation
Georg KRESSE
VASP: Vienna Ab-initio Simulation Package
for simplicity we will consider a simple quadratic function
$$ f(x) = a + b\,x + \tfrac{1}{2}\,x\,B\,x = \bar a + \tfrac{1}{2}\,(x - x_0)\,B\,(x - x_0) $$
where B is the Hessian matrix
$$ B_{ij} = \frac{\partial^2 f}{\partial x_i\,\partial x_j} $$
for a stationary point, one requires
$$ g(x) = \frac{\partial f}{\partial x} = B\,(x - x_0) = 0 $$
$$ g_i(x) = \frac{\partial f}{\partial x_i} = \sum_j B_{ij}\,(x_j - x_{0,j}) $$
at the minimum, the Hessian matrix must additionally be positive definite
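a minimal Python sketch of this quadratic model (the Hessian B, the minimum x0 and the offset ā are made-up test quantities); it simply checks the two conditions just stated:

```python
import numpy as np

# made-up test quantities for f(x) = a_bar + 1/2 (x - x0) . B . (x - x0)
rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n))
B = A @ A.T + n * np.eye(n)        # Hessian B_ij, positive definite by construction
x0 = rng.normal(size=n)            # position of the minimum
a_bar = -1.0                       # value of f at the minimum

def f(x):
    d = x - x0
    return a_bar + 0.5 * d @ B @ d

def g(x):
    # gradient g(x) = B (x - x0)
    return B @ (x - x0)

print(np.allclose(g(x0), 0.0))              # True: the gradient vanishes at x0
print(np.all(np.linalg.eigvalsh(B) > 0))    # True: B is positive definite, so x0 is a minimum
```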
educational example
calculate the gradient $g(x_1)$, multiply it with the inverse of the Hessian matrix, and perform a step
$$ x_2 = x_1 - B^{-1}\, g(x_1) $$
by inserting $g(x_1) = \frac{\partial f}{\partial x} = B\,(x_1 - x_0)$, one immediately recognises that $x_2 = x_0$
hence one can find the minimum in a single step
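a minimal Python sketch of this one-step argument (B, x0 and the start point are made-up test values, not anything VASP-specific):

```python
import numpy as np

B = np.array([[4.0, 1.0],
              [1.0, 3.0]])                 # test Hessian (positive definite)
x0 = np.array([1.0, -2.0])                 # minimum of the quadratic

def g(x):
    return B @ (x - x0)                    # gradient g(x) = B (x - x0)

x1 = np.array([10.0, 10.0])                # arbitrary start point
x2 = x1 - np.linalg.solve(B, g(x1))        # multiply the gradient with B^-1 and step

print(np.allclose(x2, x0))                 # True: the minimum is found in one step
```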
1. start from an initial guess $x_1$
2. calculate the gradient $g(x_1)$
3. make a step in the direction of the steepest descent
$$ x_2 = x_1 - \frac{1}{\Gamma_{\max}}\, g(x_1) $$
4. repeat steps 2 and 3 until convergence is reached
for functions with long steep valleys convergence can be very slow
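a minimal Python sketch of these steps on a made-up ill-conditioned quadratic (a "long steep valley"); the step length $1/\Gamma_{\max}$ keeps the stiffest mode stable but makes the softest mode crawl:

```python
import numpy as np

B = np.diag([0.05, 1.0, 10.0])        # test Hessian: Gamma_min = 0.05, Gamma_max = 10
x0 = np.zeros(3)                      # minimum of the quadratic model
g = lambda x: B @ (x - x0)            # gradient

gamma_max = np.max(np.linalg.eigvalsh(B))
x = np.array([1.0, 1.0, 1.0])         # step 1: initial guess

steps = 0
while np.linalg.norm(g(x)) > 1e-3:    # step 4: repeat until converged
    x = x - (1.0 / gamma_max) * g(x)  # steps 2+3: gradient and steepest-descent step
    steps += 1

print(steps)    # hundreds of steps; the count grows like Gamma_max / Gamma_min
```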
assume for simplicity that the Hessian matrix is diagonal, with eigenvalues ordered as $\Gamma_1 \le \Gamma_2 \le \Gamma_3 \le \dots \le \Gamma_n$ (so that $\Gamma_{\min} = \Gamma_1$ and $\Gamma_{\max} = \Gamma_n$):
$$ B = \begin{pmatrix} \Gamma_1 & & 0 \\ & \ddots & \\ 0 & & \Gamma_n \end{pmatrix} $$
the gradient is $g(x_1) = B\,(x_1 - x_0)$, and the steepest-descent step is $x_2 = x_1 - \frac{1}{\Gamma_n}\, g(x_1)$
Convergence
inserting this step gives
$$ x_2 - x_0 = \begin{pmatrix} 1-\Gamma_1/\Gamma_n & & 0 \\ & \ddots & \\ 0 & & 1-\Gamma_n/\Gamma_n \end{pmatrix} (x_1 - x_0) $$
[Figure: error-reduction factor $1-\Gamma/\Gamma_{\max}$ (and $1-2\Gamma/\Gamma_{\max}$ for a step twice as large, which reaches $-1$ at $\Gamma_{\max}$) plotted over the eigenvalues $\Gamma_1 \dots \Gamma_5$]
each error component is reduced by a factor $1-\Gamma_i/\Gamma_{\max}$ per step, so the softest mode $\Gamma_{\min}$ converges slowest; to reduce its error to a fraction ε one needs k steps with
$$ k\,\ln\!\Big(1-\frac{\Gamma_{\min}}{\Gamma_{\max}}\Big) = \ln\varepsilon \qquad\Rightarrow\qquad k \approx -\ln\varepsilon\;\frac{\Gamma_{\max}}{\Gamma_{\min}} \qquad\Rightarrow\qquad k \propto \frac{\Gamma_{\max}}{\Gamma_{\min}} $$
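a quick numerical check of this estimate (the eigenvalues and ε are made-up numbers):

```python
import math

gamma_min, gamma_max, eps = 0.05, 10.0, 1e-3    # illustrative values

k_exact  = math.log(eps) / math.log(1.0 - gamma_min / gamma_max)
k_approx = -math.log(eps) * gamma_max / gamma_min

print(round(k_exact), round(k_approx))   # ~1378 vs ~1382 steps: k scales with Gamma_max/Gamma_min
```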
the convergence speed can be much improved by preconditioning, i.e. by multiplying the gradient with a matrix P before making the step
$$ x_{N+1} = x_N - \lambda\, P\, g(x_N) $$
in this case the convergence speed depends on the eigenvalue spectrum of $P\,B$
search directions are given by
$$ B_{\text{approx}}^{-1}\, g(x) $$
the asymptotic convergence rate is given by
$$ \text{number of iterations} \propto \frac{\Gamma_{\max}}{\Gamma_{\min}} $$
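a minimal Python sketch of such a preconditioned step; here P is simply the inverse diagonal of a made-up Hessian (an assumption chosen for illustration, not the preconditioner VASP uses):

```python
import numpy as np

B = np.array([[10.0, 0.5, 0.0 ],
              [ 0.5, 1.0, 0.1 ],
              [ 0.0, 0.1, 0.05]])       # ill-conditioned test Hessian
x0 = np.zeros(3)
g = lambda x: B @ (x - x0)

P = np.diag(1.0 / np.diag(B))           # B_approx^-1: a simple diagonal preconditioner
lam = 0.9                               # step length lambda

x = np.ones(3)
for steps in range(1, 1000):
    x = x - lam * P @ g(x)              # x_{N+1} = x_N - lambda P g(x_N)
    if np.linalg.norm(g(x)) < 1e-8:
        break

print(steps)   # a few dozen steps: convergence is now governed by the spectrum of P B
```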
set of points
$$ x_i,\; i=1\ldots N \qquad\text{and}\qquad g_i,\; i=1\ldots N $$
search for a linear combination of $x_i$ which minimises the gradient, under the constraint
$$ \sum_i \alpha_i = 1 $$
because the gradient is linear in $x$, the gradient of the combination is the same combination of the stored gradients (using $\sum_i \alpha_i = 1$):
$$ g\Big(\sum_i \alpha_i x_i\Big) = B\Big(\sum_i \alpha_i x_i - x_0\Big) = B \sum_i \alpha_i\,(x_i - x_0) = \sum_i \alpha_i\, B\,(x_i - x_0) = \sum_i \alpha_i\, g_i $$
1. start with a single initial point $x_1$
2. gradient $g_1 = g(x_1)$, move along the gradient (steepest descent)
$$ x_2 = x_1 - \lambda\, g_1 $$
3. calculate the new gradient $g_2 = g(x_2)$
4. search in the space spanned by $g_i,\; i=1\ldots N$ for the minimal gradient
$$ g_{\text{opt}} = \sum_i \alpha_i\, g_i $$
and calculate the corresponding position
$$ x_{\text{opt}} = \sum_i \alpha_i\, x_i $$
then perform another steepest-descent step from this position:
$$ x_3 = x_{\text{opt}} - \lambda\, g_{\text{opt}} $$
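a minimal Python sketch in the spirit of this scheme (a made-up quadratic test problem, a fixed small λ, and a simple Lagrange-multiplier solve for the $\alpha_i$; an illustration, not the VASP implementation):

```python
import numpy as np

B = np.diag([0.1, 1.0, 8.0])              # test Hessian of the quadratic model
x0 = np.array([1.0, -1.0, 0.5])           # its minimum
g = lambda x: B @ (x - x0)

lam = 0.1                                 # steepest-descent step length lambda
xs, gs = [np.zeros(3)], [g(np.zeros(3))]  # history starts with x_1 and g_1

for _ in range(10):
    if np.linalg.norm(gs[-1]) < 1e-6:     # converged
        break
    n = len(gs)
    G = np.array(gs)                      # stored gradients as rows
    # minimise |sum_i alpha_i g_i|^2 subject to sum_i alpha_i = 1 (Lagrange multiplier)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = G @ G.T
    A[:n, n] = A[n, :n] = 1.0
    rhs = np.zeros(n + 1)
    rhs[n] = 1.0
    alpha = np.linalg.lstsq(A, rhs, rcond=None)[0][:n]

    x_opt = np.sum(alpha[:, None] * np.array(xs), axis=0)    # optimal position
    g_opt = np.sum(alpha[:, None] * G, axis=0)               # and its gradient
    x_new = x_opt - lam * g_opt           # steepest-descent step from x_opt
    xs.append(x_new)
    gs.append(g(x_new))

print(len(xs), np.linalg.norm(gs[-1]))    # a handful of points; the gradient has collapsed
```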
1. steepest-descent step from $x_0$ along $g(x_0)$ $\rightarrow$ $x_1$
2. the gradient along the indicated red line is now known, determine the optimal position $x_{1\text{opt}}$
3. another steepest-descent step from $x_{1\text{opt}}$ along $g_{\text{opt}} = g(x_{1\text{opt}})$
4. calculate the gradient at $x_2$; now the gradient is known in the entire 2-dimensional space
(linearity condition) and the function can be minimised exactly
[Figure: DIIS in two dimensions — points $x_0$, $x_1$, $x_2$; the optimal point $x_{1\text{opt}}$ lies on the line $a_0 x_0 + a_1 x_1$ with $a_0 + a_1 = 1$]
1. calculate the gradient $g(x^N)$
2. conjugate this gradient to the previous search direction using:
$$ s^N = g(x^N) + \gamma\, s^{N-1}, \qquad \gamma = \frac{\big(g(x^N) - g(x^{N-1})\big)\cdot g(x^N)}{g(x^{N-1})\cdot g(x^{N-1})} $$
3. line minimisation along this search direction $s^N$
4. continue with step 1), if the gradient is not sufficiently small.
the search directions satisfy:
$$ s^N\, B\, s^M \propto \delta_{NM} $$
the conjugate gradient algorithm finds the minimum of a quadratic function with k
degrees of freedom in k+1 steps exactly
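a minimal Python sketch of these steps on a made-up quadratic, writing the search direction as a descent direction $s = -g + \gamma\, s_{\text{old}}$ and using the exact line minimum that is available analytically for a quadratic:

```python
import numpy as np

B = np.diag([0.05, 1.0, 4.0, 9.0])        # test Hessian, k = 4 degrees of freedom
x0 = np.array([1.0, 2.0, -1.0, 0.5])      # minimum
g = lambda x: B @ (x - x0)

x = np.zeros(4)
g_old = g(x)
s = -g_old                                 # first search direction: steepest descent
for step in range(1, 10):
    lam = -(g_old @ s) / (s @ B @ s)       # exact line minimisation along s
    x = x + lam * s
    g_new = g(x)
    if np.linalg.norm(g_new) < 1e-10:
        break
    # gamma = (g(x^N) - g(x^{N-1})) . g(x^N) / (g(x^{N-1}) . g(x^{N-1}))
    gamma = (g_new - g_old) @ g_new / (g_old @ g_old)
    s = -g_new + gamma * s                 # conjugate the new gradient to the old direction
    g_old = g_new

print(step)    # the 4-dimensional quadratic is minimised after 4 line minimisations
```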
1. steepest-descent step from $x_0$; the line minimisation requires several steps (crosses, at least one trial step is required) $\rightarrow$ $x_1$
2. determine the new gradient $g_1 = g(x_1)$ and conjugate it to get $s_1$ (green arrow);
for 2d-functions the gradient now points directly to the minimum
3. minimisation along the search direction $s_1$
[Figure: conjugate gradient in two dimensions — start $x_0$, line minimisation to $x_1$, conjugate direction $s_1$ leads directly to the minimum $x_2$]
Asymptotic convergence rate
asymptotic convergence rate is the convergence behaviour for the case that the
number of degrees of freedom is much larger than the number of steps
e.g. 100 degrees of freedom but you perform only 10-20 steps
– steepest descent: $\Gamma_{\max}/\Gamma_{\min}$ steps are required to reduce the forces to a
fraction ε
– DIIS, CG, damped MD: $\sqrt{\Gamma_{\max}/\Gamma_{\min}}$ steps are required to reduce the
forces to a fraction ε
$\Gamma_{\max}$ and $\Gamma_{\min}$ are the maximal and minimal eigenvalues of the Hessian matrix
$$ \ddot{x} = -2\alpha\, g(x) - \mu\, \dot{x} $$
using a velocity Verlet algorithm this becomes
$$ v_{N+1/2} = \frac{(1-\mu/2)\, v_{N-1/2} + 2\alpha\, F_N}{1+\mu/2}, \qquad x_{N+1} = x_N + v_{N+1/2} $$
for µ = 2, this is equivalent to a simple steepest-descent step
if the optimal friction is chosen, the ball glides right away into the minimum
for a too small friction it will overshoot the minimum and accelerate back
for a too large friction the relaxation also slows down (it behaves like steepest descent)
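a minimal Python sketch of this update on a made-up quadratic; α plays the role of POTIM and µ of SMASS, with µ chosen close to the optimal friction discussed later:

```python
import numpy as np

B = np.diag([0.1, 1.0, 4.0])               # test Hessian
x0 = np.array([1.0, -1.0, 2.0])
force = lambda x: -B @ (x - x0)            # F = -g(x)

gamma_min, gamma_max = 0.1, 4.0
alpha = 1.0 / gamma_max                    # stable step size (POTIM-like)
mu = 2.0 * np.sqrt(gamma_min / gamma_max)  # near-optimal friction (SMASS-like)

x = np.zeros(3)
v = np.zeros(3)
for step in range(1, 1000):
    v = ((1 - mu / 2) * v + 2 * alpha * force(x)) / (1 + mu / 2)   # velocity update
    x = x + v                                                      # x_{N+1} = x_N + v_{N+1/2}
    if np.linalg.norm(force(x)) < 1e-8:
        break

print(step)    # roughly sqrt(Gamma_max/Gamma_min) times fewer steps than steepest descent (mu = 2)
```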
the algorithm searches, in the space spanned by the previous points $x_i$, for the linear combination that minimises the gradient
NFREE is the maximum N, i.e. the number of previous steps kept in the iteration history
[Figure: line minimisation along the search direction — starting point $x_0$, trial steps $x_{\text{trial}\,1}$, $x_{\text{trial}\,2}$, resulting point $x_1$]
this is done using a variant of Brent's algorithm
– trial step along the search direction (conjugate gradient scaled by POTIM)
– quadratic or cubic interpolation using the energies and forces at $x_0$ and $x_1$ allows
one to determine the approximate minimum
– continue the minimisation as long as the approximate minimum is not yet accurate
enough
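a small Python sketch of the interpolation part (not Brent's algorithm itself and not the VASP routine): the energies and projected forces at the start point and at one trial step define a cubic model of E along the line, whose minimum is the next estimate; `cubic_line_minimum` and the 1D test function are made up for illustration:

```python
import numpy as np

def cubic_line_minimum(e0, d0, e1, d1):
    """Fit E(t) = e0 + d0*t + c*t^2 + d*t^3 to the energies E(0)=e0, E(1)=e1 and
    slopes E'(0)=d0, E'(1)=d1 (the slope is minus the force projected on the
    search direction) and return the t of the cubic's local minimum."""
    c = 3.0 * (e1 - e0) - 2.0 * d0 - d1
    d = -2.0 * (e1 - e0) + d0 + d1
    # E'(t) = d0 + 2 c t + 3 d t^2 = 0; keep the root where E''(t) > 0
    candidates = [t.real for t in np.roots([3.0 * d, 2.0 * c, d0])
                  if abs(t.imag) < 1e-12 and 6.0 * d * t.real + 2.0 * c > 0.0]
    return min(candidates, key=lambda t: e0 + d0 * t + c * t**2 + d * t**3)

# 1D test: E(t) = (t - 0.7)^2 along the search direction, trial step at t = 1
E = lambda t: (t - 0.7) ** 2
dE = lambda t: 2.0 * (t - 0.7)
print(cubic_line_minimum(E(0.0), dE(0.0), E(1.0), dE(1.0)))   # ~0.7: interpolated minimum
```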
$$ v_{N+1/2} = \frac{(1-\mu/2)\, v_{N-1/2} + 2\alpha\, F_N}{1+\mu/2}, \qquad x_{N+1} = x_N + v_{N+1/2} $$
α ∝ POTIM and µ ∝ SMASS
POTIM must be as large as possible, but without leading to divergence,
and SMASS must be set to $\mu = 2\sqrt{\Gamma_{\min}/\Gamma_{\max}}$, where $\Gamma_{\min}$ and $\Gamma_{\max}$ are the
minimal and maximal eigenvalues of the Hessian matrix
a practical optimisation procedure:
– set SMASS=0.5-1 and use a small POTIM of 0.05-0.1
– increase POTIM by 20 % until the relaxation runs diverge
– fix POTIM to the largest value for which convergence was achieved
– try a set of different SMASS until convergence is fastest (or stick to
SMASS=0.5-1.0)
if SMASS is not specified (SMASS < 0), this selects an algorithm sometimes called QUICKMIN
QUICKMIN
$$ v_{\text{new}} = \begin{cases} F\,\dfrac{(F\cdot v)}{|F|^{2}} + \alpha F & \text{if } F\cdot v > 0 \\[6pt] \alpha F & \text{else} \end{cases} $$
– if the forces are antiparallel to the velocities, quench the velocities to zero and
restart
– otherwise increase the “speed” and make the velocities parallel to the present
forces
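a minimal Python sketch of this velocity quenching on a made-up quadratic (the prefactor α and the test quantities are illustrative, and the exact scaling of the update may differ from the VASP routine):

```python
import numpy as np

B = np.diag([0.1, 1.0, 4.0])              # test Hessian
x0 = np.array([1.0, -1.0, 2.0])
force = lambda x: -B @ (x - x0)

alpha = 0.2                               # time-step-like parameter
x = np.zeros(3)
v = np.zeros(3)
for step in range(1, 5000):
    F = force(x)
    if np.linalg.norm(F) < 1e-8:
        break
    f_hat = F / np.linalg.norm(F)
    if F @ v > 0:
        v = f_hat * (f_hat @ v) + alpha * F   # speed up along the present force direction
    else:
        v = alpha * F                         # quench the velocity and restart from the force
    x = x + v

print(step)    # converged; the velocity is always kept parallel to the current force
```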
I have not often used this algorithm, but it is supposed to be very efficient
[Figure: log(E−E$_0$) versus number of relaxation steps (0-80), comparing damped MD with SMASS=0.4 and QUICKMIN; test case: a defective ZnO surface with 96 atoms allowed to move, relaxation after a finite-temperature MD at 1000 K]
[Flowchart for selecting the algorithm, with decision points such as "really, this is too complicated?" (yes → CG), "close to the minimum?", and "1-3 degrees of freedom?", with DIIS among the recommended outcomes]
the convergence speed depends on the eigenvalue spectrum of the Hessian matrix
– larger systems (thicker slabs) are more problematic (acoustic modes are very
soft)
– molecular systems are terrible (weak intermolecular and strong intramolecular
forces)
– rigid unit modes and rotational modes can be exceedingly soft
the spectrum can vary over three orders of magnitude; 100 or even more steps
might be required, and ionic relaxation can become painful
to model the behaviour of the soft modes, you need very accurate forces since
otherwise the soft modes are hidden by the noise in the forces
EDIFF must be set to very small values ($10^{-6}$) if soft modes exist