
Ionic optimisation

Georg KRESSE

Institut für Materialphysik and Center for Computational Material Science


Universität Wien, Sensengasse 8, A-1090 Wien, Austria

Vienna Ab-initio Simulation Package (VASP)

G. KRESSE, IONIC OPTIMISATION    Page 1


Overview

the mathematical problem


– minimisation of functions
– role of the Hessian matrix
– how to overcome slow convergence

the three implemented algorithms


– Quasi-Newton (DIIS)
– conjugate gradient (CG)
– damped MD
strengths and weaknesses

a little bit on molecular dynamics

G. KRESSE, IONIC OPTIMISATION    Page 2


The mathematical problem

search for the local minimum of a function f(x)

for simplicity we will consider a simple quadratic function

    f(x) = a + b·x + 1/2 x·B·x = ā + 1/2 (x − x0)·B·(x − x0)

where B is the Hessian matrix

    B_ij = ∂²f / ∂x_i ∂x_j


for a stationary point, one requires

    g(x) = ∂f/∂x = B (x − x0)

    g_i(x) = ∂f/∂x_i = ∑_j B_ij (x_j − x0_j)
at the minimum the Hessian matrix must be additionally positive definite

G. KRESSE, IONIC OPTIMISATION    Page 3


The Newton algorithm

educational example

start with an arbitrary start point x1


calculate the gradient g(x1)

multiply with the inverse of the Hessian matrix and perform a step

    x2 = x1 − B⁻¹ g(x1)

by inserting g(x1) = ∂f/∂x = B (x1 − x0), one immediately recognises that x2 = x0
hence one can find the minimum in one step

in practice, the calculation of B is not possible in a reasonable time span, and one
needs to replace B by some reasonable approximation
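
To make this concrete, here is a minimal numerical sketch (plain NumPy, not VASP code; the Hessian B, the minimum x0 and the starting point are made-up numbers): for an exactly quadratic function a single Newton step lands on the minimum.

```python
import numpy as np

# hypothetical quadratic model f(x) = 1/2 (x - x0).B.(x - x0)
B  = np.array([[4.0, 1.0],
               [1.0, 2.0]])                 # Hessian (symmetric, positive definite)
x0 = np.array([1.0, -0.5])                  # location of the minimum

def gradient(x):
    return B @ (x - x0)                     # g(x) = B (x - x0)

x1 = np.array([3.0, 2.0])                   # arbitrary starting point
x2 = x1 - np.linalg.solve(B, gradient(x1))  # Newton step: x2 = x1 - B^-1 g(x1)
print(x2)                                   # -> [ 1.  -0.5], i.e. x2 = x0 in one step
```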

G. KRESSE, IONIC OPTIMISATION    Page 4


Steepest descent

approximate B by the largest eigenvalue of the Hessian matrix → steepest descent
algorithm (Jacobi algorithm for linear equations)
1. initial guess x1

2. calculate the gradient g(x1)
3. make a step in the direction of the steepest descent

    x2 = x1 − 1/Γmax(B) · g(x1)
4. repeat steps 2 and 3 until convergence is reached
for functions with long steep valleys convergence can be very slow
[Figure: long narrow valley of f(x); the curvature is Γmax across the valley and Γmin along it]
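
A sketch of the loop on the same hypothetical quadratic model as in the Newton sketch above (in a real relaxation Γmax is of course not known in advance; the step size is an input parameter):

```python
import numpy as np

B  = np.array([[4.0, 1.0], [1.0, 2.0]])     # hypothetical Hessian
x0 = np.array([1.0, -0.5])                  # minimum of the model function
gamma_max = np.max(np.linalg.eigvalsh(B))   # largest eigenvalue of B

x = np.array([3.0, 2.0])                    # 1. initial guess
for step in range(200):
    g = B @ (x - x0)                        # 2. gradient
    if np.linalg.norm(g) < 1e-8:
        break
    x = x - g / gamma_max                   # 3. steepest descent step with 1/Gamma_max
print(step, x)                              # slow if Gamma_max >> Gamma_min
```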

G. KRESSE, IONIC OPTIMISATION    Page 5


Speed of convergence

how many steps are required to converge to a predefined accuracy


assume that B is diagonal, and start from x1 − x0 = (1, …, 1)ᵀ

    B = diag(Γ1, …, Γn)   with   Γ1 ≤ Γ2 ≤ … ≤ Γn

gradient g(x1) and x2 after the steepest descent step are:

    g(x1) = B (x1 − x0) = (Γ1, …, Γn)ᵀ

    x2 = x1 − 1/Γn · g(x1)   ⇒   x2 − x0 = (1 − Γ1/Γn, …, 1 − Γn/Γn)ᵀ

G. KRESSE, IONIC OPTIMISATION    Page 6
Convergence

the error reduction is given by

    x2 − x0 = (1 − Γ1/Γn, …, 1 − Γn/Γn)ᵀ

[Figure: reduction factor 1 − Γ/Γmax (and 1 − 2Γ/Γmax for a twice larger step) plotted against the eigenvalues Γ1 … Γ5; the factor lies between 1 and −1]

– the error is reduced for each component


– in the high frequency component the error vanishes after one step
– for the low frequency component the reduction is smallest

G. KRESSE, IONIC OPTIMISATION    Page 7


the derivation is also true for non-diagonal matrices
in this case, the eigenvalues of the Hessian matrix are relevant
for ionic relaxation, the eigenvalues of the Hessian matrix correspond to the
vibrational frequencies of the system
the highest frequency mode determines the maximum stable step-width (“hard
modes limit the step-size”)
but the soft modes converge slowest
to reduce the error in all components to a predefined fraction ε,
k iterations are required
    (1 − Γmin/Γmax)^k = ε

    k ln(1 − Γmin/Γmax) = ln ε

    −k Γmin/Γmax ≈ ln ε   ⇒   k ≈ −Γmax/Γmin · ln ε   ⇒   k ∝ Γmax/Γmin

for example, for ε = 0.01 and Γmax/Γmin = 100, about k ≈ 100 · ln 100 ≈ 460 steps are required








G. KRESSE, IONIC OPTIMISATION    Page 8


Pre-conditioning

if an approximation P ≈ B⁻¹ of the inverse Hessian matrix is known,
the convergence speed can be much improved

    x_{N+1} = x_N − λ P g(x_N)

in this case the convergence speed depends on the eigenvalue spectrum of P·B

for P = B⁻¹, the Newton algorithm is obtained
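
As an illustration (same made-up quadratic model, here with a stiffer hypothetical Hessian): even a crude diagonal approximation of B⁻¹ used as preconditioner P compresses the eigenvalue spectrum of P·B and speeds up the iteration considerably.

```python
import numpy as np

B  = np.array([[40.0, 1.0], [1.0, 2.0]])    # hypothetical stiff Hessian (broad spectrum)
x0 = np.array([1.0, -0.5])
P  = np.diag(1.0 / np.diag(B))              # cheap approximation of B^-1 (diagonal only)

x, lam = np.array([3.0, 2.0]), 1.0
for step in range(200):
    g = B @ (x - x0)
    if np.linalg.norm(g) < 1e-8:
        break
    x = x - lam * (P @ g)                   # x_{N+1} = x_N - lambda P g(x_N)
print(step, np.linalg.eigvals(P @ B))       # spectrum of P.B is much narrower than that of B
```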




G. KRESSE, IONIC OPTIMISATION    Page 9


Variable-metric schemes, Quasi-Newton scheme

variable-metric schemes maintain an iteration history


they construct an implicit or explicit approximation B⁻¹_approx of the inverse Hessian matrix

search directions are given by

    −B⁻¹_approx · g(x)

the asymptotic convergence rate is given by

    number of iterations ∝ √(Γmax/Γmin)

G. KRESSE, IONIC OPTIMISATION    Page 10


Simple Quasi-Newton scheme, DIIS

direct inversion in the iterative subspace (DIIS)

set of points x_i, i = 1, …, N and gradients g_i = g(x_i), i = 1, …, N

search for a linear combination of x_i which minimises the gradient, under the
constraint

    ∑_i α_i = 1

    g(∑_i α_i x_i) = B (∑_i α_i x_i − x0) = B ∑_i α_i (x_i − x0)

                   = ∑_i α_i B (x_i − x0) = ∑_i α_i g_i

the gradient is linear in its argument for a quadratic function

G. KRESSE, IONIC OPTIMISATION    Page 11


Full DIIS algorithm

1. single initial point x1


2. gradient g1 = g(x1); move along the gradient (steepest descent)

    x2 = x1 − λ g1




3. calculate the new gradient g2 = g(x2)

4. search in the space spanned by the g_i, i = 1, …, N for the minimal gradient

    g_opt = ∑_i α_i g_i

and calculate the corresponding position

    x_opt = ∑_i α_i x_i

5. Construct a new point x3 by moving from x_opt along g_opt

    x3 = x_opt − λ g_opt
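
A compact sketch of the idea (NumPy on the hypothetical quadratic model used above, not the actual VASP implementation): the coefficients α_i are obtained by minimising |∑_i α_i g_i|² under the constraint ∑_i α_i = 1, which leads to a small bordered linear system; lam and nfree play roughly the roles of POTIM and NFREE.

```python
import numpy as np

B  = np.array([[4.0, 1.0], [1.0, 2.0]])              # hypothetical Hessian
x0 = np.array([1.0, -0.5])                            # minimum of the model function
grad = lambda x: B @ (x - x0)

lam, nfree = 0.1, 4                                   # trial-step size and history length
xs = [np.array([3.0, 2.0])]                           # history of positions ...
gs = [grad(xs[0])]                                    # ... and gradients

for it in range(20):
    if np.linalg.norm(gs[-1]) < 1e-10:                # converged
        break
    n = len(xs)
    A = np.zeros((n + 1, n + 1))                      # minimise |sum_i a_i g_i|^2 with sum_i a_i = 1
    A[:n, :n] = [[gi @ gj for gj in gs] for gi in gs]
    A[:n, n] = A[n, :n] = 1.0
    rhs = np.zeros(n + 1); rhs[n] = 1.0
    a = np.linalg.solve(A, rhs)[:n]                   # mixing coefficients alpha_i
    x_opt = sum(ai * xi for ai, xi in zip(a, xs))
    g_opt = sum(ai * gi for ai, gi in zip(a, gs))
    x_new = x_opt - lam * g_opt                       # steepest descent step from the optimal point
    xs.append(x_new); gs.append(grad(x_new))
    xs, gs = xs[-nfree:], gs[-nfree:]                 # keep only the last nfree points
    print(it, np.linalg.norm(gs[-1]))                 # gradient norm drops rapidly
```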






G. KRESSE, IONIC OPTIMISATION    Page 12


1. steepest descent step from x0 to x1 (arrows correspond to gradients g0 and g1 )


2. the gradient along the indicated red line is now known; determine the optimal position x1_opt


3. another steepest descent step from x1_opt along g_opt = g(x1_opt)

4. calculate the gradient at x2; now the gradient is known in the entire 2-dimensional space
(linearity condition) and the function can be minimised exactly

[Figure: points x0, x1, x2 and x1_opt; the red line is the set of points α0 x0 + α1 x1 with α0 + α1 = 1]

G. KRESSE, IONIC OPTIMISATION    Page 13


Conjugate gradient

first step is a steepest descent step with line minimisation


search directions are “conjugated” to the previous search directions
1. gradient at the current position: g(x_N)

2. conjugate this gradient to the previous search direction using:

    s_N = g(x_N) + γ s_{N−1},   γ = (g(x_N) − g(x_{N−1})) · g(x_N) / (g(x_{N−1}) · g(x_{N−1}))

3. line minimisation along this search direction s_N


4. continue with step 1), if the gradient is not sufficiently small.
the search directions satisfy:

    s_N · B · s_M ∝ δ_NM   (i.e. s_N · B · s_M = 0 for N ≠ M)

the conjugate gradient algorithm finds the minimum of a quadratic function with k
degrees of freedom in k + 1 steps exactly
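
A sketch of the scheme (NumPy, hypothetical quadratic model, written with the descent direction s = −g + γs; the line minimisation is done exactly here, which is only possible because the model Hessian is known, whereas VASP uses trial steps and interpolation):

```python
import numpy as np

B  = np.array([[4.0, 1.0], [1.0, 2.0]])              # hypothetical Hessian
x0 = np.array([1.0, -0.5])
grad = lambda x: B @ (x - x0)

x, g_old, s = np.array([3.0, 2.0]), None, None
for it in range(10):
    g = grad(x)                                       # 1. gradient at the current position
    if np.linalg.norm(g) < 1e-10:
        break
    if s is None:
        s = -g                                        # first step: steepest descent
    else:
        gamma = (g @ (g - g_old)) / (g_old @ g_old)   # 2. conjugation (Polak-Ribiere)
        s = -g + gamma * s
    alpha = -(g @ s) / (s @ B @ s)                    # 3. exact line minimisation along s
    x = x + alpha * s
    g_old = g
print(it, x)                                          # a 2d quadratic is minimised after 2 line minimisations
```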


G. KRESSE, IONIC OPTIMISATION    Page 14


1. steepest descent step from x0; search for the minimum along g0 by performing several trial
steps (crosses; at least one trial step is required) → x1



2. determine the new gradient g1 = g(x1) and conjugate it to get s1 (green arrow);
for 2d functions the conjugated gradient now points directly to the minimum
3. minimisation along search direction s1


[Figure: points x0, x1, x2, the trial points and the conjugated search direction s1]
G. KRESSE, IONIC OPTIMISATION    Page 15
Asymptotic convergence rate

asymptotic convergence rate is the convergence behaviour for the case that the number of
degrees of freedom is much larger than the number of steps
e.g. 100 degrees of freedom but you perform only 10-20 steps

how quickly do the forces decrease?


this depends entirely on the eigenvalue spectrum of the Hessian matrix:

– steepest descent: ∝ Γmax/Γmin steps are required to reduce the forces to a
fraction ε

– DIIS, CG, damped MD: ∝ √(Γmax/Γmin) steps are required to reduce the
forces to a fraction ε

Γmax and Γmin are the maximal and minimal eigenvalues of the Hessian matrix


G. KRESSE, IONIC OPTIMISATION    Page 16


Damped molecular dynamics

instead of using a fancy minimisation algorithm, it is possible to treat the
minimisation problem using a simple “simulated annealing algorithm”

regard the positions as dynamic degrees of freedom


the forces serve as accelerations and an additional friction term is introduced

equation of motion (x are the positions)

    ẍ = −2α g(x) − µ ẋ

using a velocity Verlet algorithm this becomes

    v_{N+1/2} = ( (1 − µ/2) v_{N−1/2} + 2α F_N ) / (1 + µ/2)

    x_{N+1} = x_N + v_{N+1/2}

for µ = 2, this is equivalent to a simple steepest descent step
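
A sketch of the update on the hypothetical quadratic model (F = −g; alpha and mu are made-up values that play roughly the roles of POTIM and SMASS):

```python
import numpy as np

B  = np.array([[4.0, 1.0], [1.0, 2.0]])              # hypothetical Hessian
x0 = np.array([1.0, -0.5])
force = lambda x: -B @ (x - x0)                       # F = -g

alpha, mu = 0.4, 1.0                                  # step size and friction (illustrative values)
x, v = np.array([3.0, 2.0]), np.zeros(2)

for step in range(500):
    F = force(x)
    if np.linalg.norm(F) < 1e-8:
        break
    v = ((1 - mu / 2) * v + 2 * alpha * F) / (1 + mu / 2)   # velocity update
    x = x + v                                                # position update
print(step, x)
```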


G. KRESSE, IONIC OPTIMISATION    Page 17


behaves like a rolling ball with friction
it will accelerate initially, and then decelerate when close to the minimum

if the optimal friction is chosen, the ball glides right away into the minimum
for a too small friction it will overshoot the minimum and accelerate back

for a too large friction the relaxation also slows down (behaves like steepest
descent)


G. KRESSE, IONIC OPTIMISATION    Page 18


Algorithms implemented in VASP

algorithm      flag         additional flags    termination
DIIS           IBRION=1     POTIM, NFREE        EDIFFG
CG             IBRION=2     POTIM               EDIFFG
damped MD      IBRION=3     POTIM, SMASS        EDIFFG
POTIM generally determines the step size
for the conjugate gradient algorithm, where line minimisations are performed, this is the size of
the very first trial step
EDIFFG determines when to terminate the relaxation
positive values: the energy change between two steps must be less than EDIFFG

negative values: all forces must be smaller than |EDIFFG|, i.e. |F_i| < |EDIFFG| for i = 1, …, N_ions
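
For orientation, a typical INCAR fragment for an ionic relaxation could look like the following (the numerical values are only illustrative starting points, not recommendations from these slides):

```
IBRION = 2        ! conjugate gradient relaxation
POTIM  = 0.5      ! size of the very first trial step
EDIFFG = -0.01    ! stop when all forces are below 0.01 eV/A
NSW    = 100      ! maximum number of ionic steps
```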






G. KRESSE, IONIC OPTIMISATION    Page 19


DIIS

POTIM determines the step size in the steepest descent steps

no line minimisations are performed !!
NFREE determines how many ionic steps are stored in the iteration history
from the set of points x_i, i = 1, …, N and gradients g_i, i = 1, …, N, the algorithm searches
for the linear combination of the x_i that minimises the gradient
NFREE is the maximum N

for complex problems NFREE can be large (e.g. 10-20)


for small problems, it is advisable to count the degrees of freedom carefully
(symmetry inequivalent degrees of freedom)
if NFREE is not specified, VASP will try to determine a reasonable value, but
usually the convergence is then slower
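
An illustrative INCAR fragment for the DIIS algorithm (the values are hypothetical and should be adapted to the system at hand):

```
IBRION = 1        ! quasi-Newton / DIIS
POTIM  = 0.4      ! step size of the steepest descent steps
NFREE  = 10       ! number of ionic steps kept in the iteration history
EDIFFG = -0.02    ! stop when all forces are below 0.02 eV/A
```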

G. KRESSE, IONIC OPTIMISATION    Page 20


CG
the only required parameter is POTIM
this parameter parameterises how large the trial steps are
CG requires a line minimisation along the search direction

[Figure: line minimisation along the search direction, with trial steps x_trial1 and x_trial2 between x0 and x1]
this is done using a variant of Brent’s algorithm
– trial step along the search direction (conjugate gradient scaled by POTIM)
– quadratic or cubic interpolation using the energies and forces at x0 and x1 allows the
approximate minimum to be determined
– the minimisation is continued as long as the approximate minimum is not accurate
enough

G. KRESSE, IONIC OPTIMISATION    Page 21


Damped MD

two parameters POTIM and SMASS

    v_{N+1/2} = ( (1 − µ/2) v_{N−1/2} + 2α F_N ) / (1 + µ/2),   x_{N+1} = x_N + v_{N+1/2}
α ∝ POTIM and µ ∝ SMASS
POTIM must be as large as possible, but without leading to divergence,
and SMASS must be set to µ = 2 √(Γmin/Γmax), where Γmin and Γmax are the
minimal and maximal eigenvalues of the Hessian matrix
a practical optimisation procedure:
– set SMASS=0.5-1 and use a small POTIM of 0.05-0.1
– increase POTIM by 20 % until the relaxation runs diverge
– fix POTIM to the largest value for which convergence was achieved
– try a set of different SMASS until convergence is fastest (or stick to
SMASS=0.5-1.0)
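
An illustrative INCAR fragment following this procedure (hypothetical starting values):

```
IBRION = 3        ! damped molecular dynamics
POTIM  = 0.1      ! start small, then increase by ~20% until the run diverges
SMASS  = 0.5      ! friction; try values between 0.5 and 1.0
EDIFFG = -0.02    ! stop when all forces are below 0.02 eV/A
```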

G. KRESSE, IONIC OPTIMISATION    Page 22


Damped MD — QUICKMIN

alternatively, do not specify SMASS (or set SMASS < 0)

this selects an algorithm sometimes called QUICKMIN

QUICKMIN:

    v_new = α F + F (v_old · F) / |F|²   if v_old · F > 0
    v_new = α F                          otherwise

– if the forces are antiparallel to the velocities, quench the velocities to zero and
restart
– otherwise increase the “speed” and make the velocities parallel to the present
forces
I have not often used this algorithm, but it is supposed to be very efficient
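
A sketch of this velocity projection on the hypothetical quadratic model (textbook QUICKMIN recipe, not necessarily VASP's exact implementation; the step size is a made-up value):

```python
import numpy as np

B  = np.array([[4.0, 1.0], [1.0, 2.0]])              # hypothetical Hessian
x0 = np.array([1.0, -0.5])
force = lambda x: -B @ (x - x0)

alpha = 0.2                                           # illustrative step size
x, v = np.array([3.0, 2.0]), np.zeros(2)

for step in range(1000):
    F = force(x)
    if np.linalg.norm(F) < 1e-8:
        break
    if v @ F > 0:                                     # velocity has a component along the force:
        v = alpha * F + F * (v @ F) / (F @ F)         #   keep it and accelerate along F
    else:                                             # velocity points uphill:
        v = alpha * F                                 #   quench it and restart along the force
    x = x + v
print(step, x)
```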

G. KRESSE, IONIC OPTIMISATION    Page 23


Damped MD — QUICKMIN

my experience is that damped MD (as implemented in VASP) is faster than
QUICKMIN, but QUICKMIN requires less playing around

[Figure: log(E − E0) versus number of ionic steps (0-80) for a defective ZnO surface with 96 atoms allowed to move, relaxed after a finite temperature MD at 1000 K; damped MD with SMASS=0.4 converges faster than QUICKMIN]

G. KRESSE, IONIC OPTIMISATION    Page 24


Why so many algorithms :-(... decision chart

[Decision chart, roughly: “Really, this is too complicated?” - yes → CG; otherwise the choice depends on whether one is close to the minimum, on the number of degrees of freedom (1-3 versus >20) and on whether the vibrational spectrum is very broad, leading to CG, damped MD or QUICKMIN, or DIIS]

G. KRESSE, IONIC OPTIMISATION    Page 25


Two cases where the DIIS algorithm has huge troubles:

– rigid unit modes, e.g. in perovskites (rotation) and in molecular systems (rotation)

– the force increases along the search direction

DIIS is dead, since it considers only the forces; in cartesian coordinates the Hessian matrix
changes, and it will move uphill instead of down when the octahedron rotates!

G. KRESSE, IONIC OPTIMISATION    Page 26


How bad can it get

the convergence speed depends on the eigenvalue spectrum of the Hessian matrix
– larger systems (thicker slabs) are more problematic (acoustic modes are very
soft)
– molecular systems are terrible (weak intermolecular and strong intramolecular
forces)
– rigid unit modes and rotational modes can be exceedingly soft
the spectrum can vary over three orders of magnitude; 100 or even more steps
might be required, and ionic relaxation can be painful

to model the behaviour of the soft modes, you need very accurate forces since
otherwise the soft modes are hidden by the noise in the forces
EDIFF must be set to very small values (10⁻⁶) if soft modes exist

G. KRESSE, IONIC OPTIMISATION    Page 27
