Notes On The Ab Initio Theory of Molecules and Solids: Density Functional Theory (DFT)
Tomás A. Arias
January 26, 2004
Cornell University
Department of Physics
Contents
1 Introduction
4 Kohn-Sham equations
4.1 Basics of the calculus of variations
4.2 Derivative of a real function of a complex variable and its conjugate
4.3 Kohn-Sham Equations
4.4 Solution of the equations
5 Atomic units
6 References
1 Introduction
Dating back to at least the time of the ancient philosopher Empedocles (born ∼450 BCE) and his theory of
the elements earth, water, air and fire, and their basic interactions of love and strife, humanity has striven
to understand the behavior of the material world “from the beginning,” ab initio.
It has taken twenty-three centuries to bring this dream to fruition. C. Coulomb gave us the modern
understanding of what would prove the basic interaction, electrostatics, in the late 1780s. It would
take another hundred years to identify the basic constituents of matter as electrons (J.J. Thomson in 1897
and R.A. Millikan in 1909) and nuclei (E. Rutherford in 1911), and yet another two decades to formulate the
final ingredient needed for a predictive theory of these tiny objects, quantum mechanics. The rapid, heady
developments of the early twentieth century prompted P.A.M. Dirac in 1929 to make the following statement of
optimism tempered with disappointment:
The general theory of quantum mechanics is now almost complete. The underlying physical laws
. . . for . . . a large part of physics and the whole of chemistry are thus completely known, and the
difficulty is only that . . . these laws lead to equations much too difficult to be solvable.
Despite the amazingly rapid progress of the early twentieth century, it would take nearly another fifty
years to surmount the difficulties which Dirac foresaw. However, it is now possible for us to write software
running on a personal computer to solve these equations. A combination of three developments makes this
possible: (1) the development of density functional theory (DFT), for which Walter Kohn shared the 1998
Nobel prize in Chemistry and which is the subject of these notes; (2) the development of powerful new
numerical methods, which are the main subject of this course; and (3) the exponential progress in computer
power. In this course you will exploit these three developments to fulfill Empedocles’ dream for yourself.
Figure 1: Modern view of a molecular or solid system: nuclei (large dots, at positions X1, X2, X3 measured
from the origin), electron clouds (grey shaded regions).
3.1 Electrostatics
There are three groups of electrostatic interactions which we must consider: interactions of nuclei with
nuclei, of electrons with nuclei, and of electrons with electrons. Coulomb’s law states that the potential
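energy between two charges $q_1$ and $q_2$ at separation $r_{12}$ is
$$U = [k_c]\,\frac{q_1 q_2}{r_{12}},$$
where $k_c$ is Coulomb’s constant. Summing this over all pairs of nuclei gives the nuclear-nuclear energy,
$$U_{\text{nuc-nuc}} = \frac{1}{2}\,[k_c]\,e^2 \sum_{I \neq J} \frac{Z_I Z_J}{R_{IJ}}, \tag{1}$$
where $e$ is the charge of the electron, $I$ and $J$ index different nuclei, $R_{IJ}$ is the separation between nuclei $I$
and $J$, $Z_I$ is the atomic number of nucleus $I$, and the factor of 1/2 is the famous double-counting correction
to ensure that we count each pair-wise interaction only once.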
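As a minimal sketch of the pairwise nuclear sum and its double-counting factor of 1/2, the following Python function loops over all ordered pairs of nuclei and halves the result. The function name, the default `kc_e2 = 1.0` (that is, working in units where $[k_c]e^2 = 1$), and the two-proton example are illustrative assumptions, not part of these notes.

```python
import math

def nuclear_repulsion(charges, positions, kc_e2=1.0):
    """Return (1/2) * sum over I != J of kc_e2 * Z_I * Z_J / R_IJ."""
    total = 0.0
    for i, (z_i, r_i) in enumerate(zip(charges, positions)):
        for j, (z_j, r_j) in enumerate(zip(charges, positions)):
            if i == j:
                continue  # a nucleus does not interact with itself
            total += kc_e2 * z_i * z_j / math.dist(r_i, r_j)
    return 0.5 * total  # every pair was visited twice, once as (I, J) and once as (J, I)

# Hypothetical example: two protons separated by 1.4 length units.
u_pair = nuclear_repulsion([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)])
print(u_pair)
```

Looping over ordered pairs and multiplying by 1/2 mirrors the formula directly; looping over `i < j` without the 1/2 would give the same number.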
Similarly, the potential energy of a single electron at position $\vec{x}$ due to the nuclei is
$$V_{\text{nuc}}(\vec{x}) = -[k_c]\,e^2 \sum_I \frac{Z_I}{R_I}, \tag{2}$$
where $R_I$ is the distance from point $\vec{x}$ to nucleus $I$. If the volume density (number per unit volume) of
electrons is $n(\vec{x})$, then the number of electrons in the volume element $dV$ near point $\vec{x}$ is $n(\vec{x})\,dV$ and the
total potential energy of the electrons interacting with the nuclei is
$$U_{\text{el-nuc}} = \int V_{\text{nuc}}(\vec{x})\,n(\vec{x})\,dV, \tag{3}$$
where the integral is over all of space. (There is no 1/2 double-counting correction here because the interaction
between electron #1 and nucleus #2 is not counted again when we do the interaction between electron #2
and nucleus #1!)

1 Note that in the cgs system of units $k_c \equiv 1$. If you are more comfortable working in such units, then simply ignore any
factors which appear below in square brackets.
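To make Eq. (3) concrete, here is a small sketch that evaluates the electron-nuclear energy for a spherically symmetric density around a single nucleus, using a midpoint rule on thin spherical shells. The hydrogen 1s density $n(r) = e^{-2r}/\pi$, the grid parameters, and the choice of units with $[k_c]e^2 = 1$ and lengths in bohr are assumptions made for illustration; for that density the integral is known analytically to be $-1$.

```python
import math

def electron_nuclear_energy(density, z_nuc=1.0, r_max=30.0, n_points=30000):
    """Midpoint-rule estimate of the integral of V_nuc(r) n(r) dV, nucleus at the origin."""
    h = r_max / n_points
    total = 0.0
    for k in range(n_points):
        r = (k + 0.5) * h                         # midpoint of the k-th radial shell
        v_nuc = -z_nuc / r                        # Eq. (2) with kc e^2 = 1, one nucleus
        shell_volume = 4.0 * math.pi * r * r * h  # dV of a thin spherical shell
        total += v_nuc * density(r) * shell_volume
    return total

def hydrogen_1s_density(r):
    """Assumed test density: one electron in a hydrogen 1s orbital."""
    return math.exp(-2.0 * r) / math.pi

u_el_nuc = electron_nuclear_energy(hydrogen_1s_density)
print(u_el_nuc)  # close to -1 in these units
```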
Finally, the electrons interact not only with the nuclei, but also with themselves. From Coulomb’s law, the
potential energy for a single electron at point $\vec{x}$ coming from the electrons at point $\vec{x}'$ is
$[k_c]e^2\,n(\vec{x}')\,dV'/|\vec{x}-\vec{x}'|$, where $|\vec{x}-\vec{x}'|$ is the distance between points $\vec{x}$ and $\vec{x}'$. The total
potential for a single electron at point $\vec{x}$ is then
$$\phi(\vec{x}) = [k_c]\,e^2 \int \frac{n(\vec{x}')\,dV'}{|\vec{x}-\vec{x}'|}.$$
Standard electrostatics tells us that doing this integral is equivalent to solving Poisson’s equation,
$$\nabla^2 \phi(\vec{x}) = -4\pi[k_c]\,e^2\,n(\vec{x}). \tag{4}$$
Once we have $\phi(\vec{x})$, the potential energy for the electrons interacting with themselves follows the same
logic as (3) but with the double-counting correction of (1) because we are dealing with the total interaction
of a group of particles with itself,
$$U_{\text{el-el}} = \frac{1}{2}\int \phi(\vec{x})\,n(\vec{x})\,dV. \tag{5}$$
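The construction of $\phi$ from the density, followed by the factor-of-1/2 energy, can be sketched for a spherically symmetric density. Instead of a general Poisson solver, this sketch uses the shell theorem: shells inside radius $r$ act like a point charge at the origin, and shells outside contribute a constant potential. The Gaussian test density, the grid parameters, and units with $[k_c]e^2 = 1$ are assumptions for illustration. For this density the potential is known to be $\mathrm{erf}(r)/r$ and the energy $\tfrac{1}{2}\sqrt{2/\pi}$.

```python
import math

def gaussian_density(r):
    """Assumed test density: unit-norm Gaussian, n(r) = pi^(-3/2) exp(-r^2)."""
    return math.pi ** -1.5 * math.exp(-r * r)

def hartree_energy_spherical(density, r_max=10.0, n_points=4000):
    """Return radii, phi(r) and (1/2) * integral of phi n dV for a spherical density."""
    h = r_max / n_points
    radii = [(k + 0.5) * h for k in range(n_points)]
    # Number of electrons in each thin shell: n(r) * 4 pi r^2 dr
    shell = [4.0 * math.pi * r * r * density(r) * h for r in radii]

    # outer[k] = sum over shells j > k of shell[j] / radii[j]
    outer = [0.0] * n_points
    running = 0.0
    for k in range(n_points - 1, -1, -1):
        outer[k] = running
        running += shell[k] / radii[k]

    phi = []
    enclosed = 0.0
    for k in range(n_points):
        # The shell at r_k itself is split half inside, half outside.
        inside = enclosed + 0.5 * shell[k]
        outside = outer[k] + 0.5 * shell[k] / radii[k]
        phi.append(inside / radii[k] + outside)
        enclosed += shell[k]

    energy = 0.5 * sum(p * q for p, q in zip(phi, shell))  # double-counting factor 1/2
    return radii, phi, energy

radii, phi, u_el_el = hartree_energy_spherical(gaussian_density)
print(u_el_el, 0.5 * math.sqrt(2.0 / math.pi))
```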
In addition to this normality constraint for each orbital $i$, the orbitals must be orthogonal to each other,
$$0 = \int \psi_i^*(\vec{x})\,\psi_j(\vec{x})\,dV \quad \text{for } i \neq j, \tag{7}$$
the condition by which density functional theory encodes the Pauli exclusion principle from elementary
chemistry courses. Apart from these constraints, the orbitals are completely free. Thus, we may combine all
relevant constraints into the orthonormality constraint,
$$\int \psi_i^*(\vec{x})\,\psi_j(\vec{x})\,dV = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}. \tag{8}$$
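A quick way to see the orthonormality constraint at work is to tabulate the overlap integrals for a known set of orbitals and check that they form the identity matrix. The sketch below does this in one dimension for particle-in-a-box orbitals; the orbitals, the grid, and the function names are assumptions chosen for illustration and stand in for the three-dimensional integral over $dV$.

```python
import math

def box_orbital(k, x, length=1.0):
    """k-th particle-in-a-box orbital on [0, length] (real-valued)."""
    return math.sqrt(2.0 / length) * math.sin(k * math.pi * x / length)

def overlap_matrix(n_orbitals, length=1.0, n_points=200):
    """Midpoint-rule estimate of the overlaps between orbitals i and j."""
    h = length / n_points
    grid = [(m + 0.5) * h for m in range(n_points)]
    overlaps = []
    for i in range(1, n_orbitals + 1):
        row = []
        for j in range(1, n_orbitals + 1):
            # The orbitals are real, so complex conjugation has no effect here.
            s = sum(box_orbital(i, x, length) * box_orbital(j, x, length) for x in grid) * h
            row.append(s)
        overlaps.append(row)
    return overlaps

S = overlap_matrix(3)
for row in S:
    print(["%.6f" % value for value in row])
```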
The electron density and total kinetic energy come directly from the orbitals. Because the square of each
orbital gives the distribution of the electrons in that orbital, the total electron density will be the sum of
squares of the orbitals times the number of electrons $f_i$ in or “filling” each orbital,
$$n(\vec{x}) = \sum_i f_i\,|\psi_i(\vec{x})|^2. \tag{9}$$
(As mentioned above, usually there are $f_i = 2$ electrons in each orbital.) The total kinetic energy of the
electrons $T_{el}$ is similarly just the sum over orbitals of the number of electrons in each orbital times the
elementary quantum mechanical expression for the kinetic energy of each orbital,
$$T_{el} = \sum_i f_i \int \psi_i^*(\vec{x}) \left(-\frac{\hbar^2}{2m}\nabla^2\right) \psi_i(\vec{x})\,dV. \tag{10}$$
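The density and kinetic-energy sums over orbitals can be sketched numerically. The example below uses one-dimensional particle-in-a-box orbitals with fillings $f_i = 2$, a finite-difference second derivative in place of $\nabla^2$, and units in which $\hbar^2/m = 1$; all of these choices, and the function names, are illustrative assumptions. For a box of unit length the exact kinetic energy of the $k$-th orbital is $(k\pi)^2/2$, so two doubly filled orbitals give $5\pi^2$.

```python
import math

def box_orbital(k, x, length=1.0):
    """k-th particle-in-a-box orbital on [0, length] (real-valued)."""
    return math.sqrt(2.0 / length) * math.sin(k * math.pi * x / length)

def density_and_kinetic(fillings, length=1.0, n_points=400):
    """Return n(x) on a midpoint grid, the electron count, and T_el (hbar^2/m = 1)."""
    h = length / n_points
    grid = [(m + 0.5) * h for m in range(n_points)]
    density = [0.0] * n_points
    kinetic = 0.0
    for index, f in enumerate(fillings):
        k = index + 1  # orbital number in the box
        for m, x in enumerate(grid):
            psi = box_orbital(k, x, length)
            density[m] += f * psi * psi  # filling times |psi|^2
            # Three-point finite-difference estimate of the second derivative
            second = (box_orbital(k, x + h, length) - 2.0 * psi
                      + box_orbital(k, x - h, length)) / (h * h)
            kinetic += f * psi * (-0.5 * second) * h
    n_electrons = sum(density) * h
    return density, n_electrons, kinetic

density, n_electrons, t_el = density_and_kinetic([2, 2])
print(n_electrons, t_el, 5.0 * math.pi ** 2)
```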
There arises from advanced quantum mechanics one final subtle point. The electron density defined in
(9) is only an average. The actual density fluctuates, resulting in relatively small but important errors in
Eqs. (5,10) due to correlations in these fluctuations. In theory, we may correct for these errors exactly,
but this turns out to be quite difficult in practice. A very good approximation to this exchange-correlation
correction, sufficient in practice to compute most properties to within a few percent, is the local density
approximation,
$$E_{xc} = \int f_{xc}(n(\vec{x}))\,dV, \tag{11}$$
where $f_{xc}(\ldots)$ is a relatively simple function which we will provide later in the course (see footnote 2).
That’s it – this is all the quantum mechanics we need to predict accurately the behavior of matter!
$$\begin{aligned}
E[\{\psi_i(\vec{x})\}] = {} & \sum_i f_i \int \psi_i^*(\vec{x}) \left(-\frac{\hbar^2}{2m}\nabla^2\right) \psi_i(\vec{x})\,dV + \int V_{\text{nuc}}(\vec{x})\,n(\vec{x})\,dV \\
& + \frac{1}{2}\int \phi(\vec{x})\,n(\vec{x})\,dV + \int f_{xc}(n(\vec{x}))\,dV + U_{\text{nuc-nuc}},
\end{aligned} \tag{12}$$
where $f_{xc}(\ldots)$ is some known function, $V_{\text{nuc}}(\vec{x})$ is the potential energy field created by the nuclei, $U_{\text{nuc-nuc}}$
is the simple electrostatic interaction among the nuclei,
$$n(\vec{x}) = \sum_i f_i\,|\psi_i(\vec{x})|^2,$$
and $f_i$ (usually equal to two) is the number of electrons in orbital $i$. Note that the expression (12) maps
each possible choice of the set of electronic orbitals $\{\psi_i(\vec{x})\}$ to a unique value for the energy of the system
and thereby gives the total energy $E$ as a function of the orbital functions $\psi_i(\vec{x})$. Such an expression which
returns a number as a function of other functions is called a functional and denoted with square brackets as
we do in Eq. (12).
We now have a functional for the energy in terms of the orbitals, but which orbitals are the right ones
to use? The answer is quite sensible: the correct orbitals are those which minimize the total energy $E$ in
(12) while obeying the orthonormality constraints (8). Combined with this variational principle, Eq. (12)
now gives a complete prescription for computing total energies, and thereby all of the properties mentioned
in Sec. 2.
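As a toy illustration of the variational principle (not of the full functional (12)), consider a single electron in a hydrogen atom with a normalized trial orbital proportional to $e^{-\alpha r}$. In units where $\hbar^2/m = [k_c]e^2 = 1$, its kinetic energy is $\alpha^2/2$ and its potential energy is $-\alpha$, so minimizing over the single parameter $\alpha$ should recover $\alpha = 1$ and an energy of $-1/2$. The golden-section search below is just one simple minimizer chosen for this sketch; the function names and search interval are assumptions.

```python
import math

def trial_energy(alpha):
    """Energy of a hydrogen atom with trial orbital ~ exp(-alpha r), units hbar^2/m = kc e^2 = 1."""
    kinetic = 0.5 * alpha * alpha
    potential = -alpha
    return kinetic + potential

def golden_section_minimize(func, lo, hi, tol=1e-8):
    """Shrink the bracket [lo, hi] around a minimum of a unimodal function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if func(c) < func(d):
            b, d = d, c                # minimum lies in [a, d]
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d                # minimum lies in [c, b]
            d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

best_alpha = golden_section_minimize(trial_energy, 0.1, 3.0)
best_energy = trial_energy(best_alpha)
print(best_alpha, best_energy)
```

Any other trial value of $\alpha$ gives a higher energy, which is the sense in which minimization selects the correct orbital.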
2 Improving the approximations for $E_{xc}$ is one of the “holy grails” of electronic structure. If this interests you, please let me
know . . .
4 Kohn-Sham equations
There are two schools of thought on how to achieve the minimization of the total energy. The more prevalent
approach in the physics community is to view the calculation directly as a problem in numerical minimization
and to apply modern techniques for constrained numerical minimization. We shall return to this approach
in the second half of this course. The second school of thought, more prevalent in the chemistry
community, is to derive the Lagrange-multiplier equations for constrained minimization and to then use
numerical methods to solve the resulting equations. As we shall see, each approach has its advantages and
disadvantages. In the end, though, both must lead to the same result.
We now derive Lagrange-multiplier equations for density functional theory, known as the Kohn-Sham
equations.
the variation of the functional $F[g(x)]$ is given by a “sum” over the index $x$,
$$\delta F = \int \frac{\delta F}{\delta g(x)}\,\delta g(x)\,dx. \tag{13}$$
Note that, because $x$ is now a continuous variable, the “sum” becomes an integral. Also, $(\delta F/\delta g(x))$ is the
standard notation for the functional derivative, which we see amounts to taking the partial derivative of
F with respect to the value g(x). With this understood, we can take functional derivatives as easily as
differentiating a multi-variable function. All of the usual rules still apply, such as the product and chain
rules!
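As an example, let us consider the functional derivative of $E_{xc}[n(\vec{x})]$ with respect to $n(\vec{x})$. First, we shall
carry out the variation formally, and then we shall show how quickly we arrive at the same result by analogy
with multi-variable calculus. Applying the formal definition (13) of the functional derivative to $E_{xc}$ in (11),
we find, to first order in $\delta n$,
$$\delta E_{xc} = \int \left[ f_{xc}\big(n(\vec{x}) + \delta n(\vec{x})\big) - f_{xc}\big(n(\vec{x})\big) \right] dV = \int f'_{xc}(n(\vec{x}))\,\delta n(\vec{x})\,dV,$$
from which we may read off the result
$$\frac{\delta E_{xc}}{\delta n(\vec{x})} = f'_{xc}(n(\vec{x})). \tag{14}$$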
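Result (14) can be checked numerically by treating the density values on a grid as ordinary variables. In the sketch below, the functional derivative at grid point $k$ is estimated as the partial derivative of the discretized $E_{xc}$ with respect to $n_k$, divided by the grid spacing. Because $f_{xc}$ is not specified in this part of the notes, the sketch substitutes the standard exchange-only local density form $f(n) = -\tfrac{3}{4}(3/\pi)^{1/3} n^{4/3}$ (atomic units) purely as a stand-in; the one-dimensional grid, the test density, and all names are likewise assumptions.

```python
import math

C_X = -0.75 * (3.0 / math.pi) ** (1.0 / 3.0)  # exchange-only prefactor, used as a stand-in

def f_xc(n):
    """Stand-in for f_xc: local exchange energy per unit volume."""
    return C_X * n ** (4.0 / 3.0)

def f_xc_prime(n):
    """Ordinary derivative of the stand-in f_xc with respect to n."""
    return (4.0 / 3.0) * C_X * n ** (1.0 / 3.0)

def e_xc(density, h):
    """Discretized version of Eq. (11): sum of f_xc(n_k) times the grid spacing."""
    return sum(f_xc(n) for n in density) * h

def functional_derivative(density, h, k, eps=1e-6):
    """Central-difference estimate of (1/h) dE_xc/dn_k."""
    up = list(density)
    up[k] += eps
    down = list(density)
    down[k] -= eps
    return (e_xc(up, h) - e_xc(down, h)) / (2.0 * eps * h)

h = 0.1
grid = [(m + 0.5) * h for m in range(50)]
density = [math.exp(-(x - 2.5) ** 2) for x in grid]  # assumed smooth test density
k = 20
numeric = functional_derivative(density, h, k)
analytic = f_xc_prime(density[k])
print(numeric, analytic)
```

The division by `h` is what turns a partial derivative with respect to one grid value into a functional derivative, since each grid value carries a weight `h` in the integral.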
Alternatively, we note that Eq. (11) is just the integral of the result of applying the function $f_{xc}(\ldots)$ to
each component of $n(\vec{x})$ separately. If this were a multi-variable problem, the analogous function would be
a sum over the values of a function evaluated separately on each component,
$$e_{xc}(\vec{q}) = \sum_i f_{xc}(q_i),$$
Thus, the real and imaginary components of $\partial f(z, z^*)/\partial z^*|_z$ give us both derivatives $\partial f(z_r, z_i)/\partial z_r$ and
$\partial f(z_r, z_i)/\partial z_i$ simultaneously. In particular, to minimize over all possible values of $z = z_r + i z_i$, we need
just one equation!
$$0 = \left.\frac{\partial f}{\partial z^*}\right|_z.$$
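A short numerical check shows how one complex number carries both real derivatives. The sketch below assumes the common convention $\partial f/\partial z^* = \tfrac{1}{2}(\partial f/\partial z_r + i\,\partial f/\partial z_i)$, which may differ by a constant factor from the definition used in the part of the notes not shown here. It uses the hypothetical test function $f(z) = |z - a|^2$, for which this derivative equals $z - a$ and vanishes exactly at the minimum $z = a$.

```python
def real_objective(z, a=2.0 - 1.0j):
    """Real-valued test function of a complex variable: |z - a|^2."""
    return abs(z - a) ** 2

def conjugate_derivative(func, z, eps=1e-6):
    """Estimate df/dz* = (df/dz_r + i df/dz_i) / 2 by central differences."""
    d_real = (func(z + eps) - func(z - eps)) / (2.0 * eps)
    d_imag = (func(z + 1j * eps) - func(z - 1j * eps)) / (2.0 * eps)
    return 0.5 * (d_real + 1j * d_imag)

grad_away = conjugate_derivative(real_objective, 0.5 + 0.5j)
grad_at_min = conjugate_derivative(real_objective, 2.0 - 1.0j)
print(grad_away)    # close to z - a = -1.5 + 1.5j
print(grad_at_min)  # close to zero at the minimum
```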
$$\frac{\delta}{\delta\psi_i^*(\vec{x})}(T_{el}) = -f_i\,\frac{\hbar^2}{2m}\nabla^2\psi_i(\vec{x}).$$
For the electron-nuclear potential energy (3), the only term which depends on $\psi_i^*(\vec{x})$ is the charge density,
where $\psi_i^*(\vec{x})$ multiplies $f_i\psi_i(\vec{x})$. The nuclear potential $V_{\text{nuc}}(\vec{x})$ is unchanged as $\psi_i^*(\vec{x})$ varies, so the final
term is just
$$\frac{\delta}{\delta\psi_i^*(\vec{x})}(U_{\text{el-nuc}}) = f_i\,V_{\text{nuc}}(\vec{x})\,\psi_i(\vec{x}).$$
The electron-electron energy has a very similar structure. The only difference is that, here, when we
change $\psi_i^*(\vec{x})$, the potential function $\phi(\vec{x})$ also changes because it depends on $n(\vec{x})$. The net effect of the
change in $\phi(\vec{x})$ is the same as that of the direct change in $n(\vec{x})$. To see this we note that Poisson’s equation
$\nabla^2\phi = -4\pi[k_c]e^2 n$ implies also that $\nabla^2(\delta\phi) = -4\pi[k_c]e^2(\delta n)$. Thus,
$$\int (\delta\phi)\,n\,dV = \int (\delta\phi)\,\frac{\nabla^2\phi}{-4\pi[k_c]e^2}\,dV = \int \nabla^2\!\left(\frac{\delta\phi}{-4\pi[k_c]e^2}\right)\phi\,dV = \int (\delta n)\,\phi\,dV,$$
where we have moved the $\nabla^2$ from acting on $\phi$ to acting on $\delta\phi$ by integrating by parts twice. Since both
terms are equal, we can take just twice the $(1/2)\int \phi\,\delta n\,dV$ term,
$$\frac{\delta}{\delta\psi_i^*(\vec{x})}(U_{\text{el-el}}) = f_i\,\phi(\vec{x})\,\psi_i(\vec{x}).$$
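For the exchange-correlation term, we have already derived in (14) that $\delta E_{xc}/\delta n(\vec{x}) = f'_{xc}(n(\vec{x}))$. By the
chain rule we just need to multiply this by $\delta n(\vec{x})/\delta\psi_i^*(\vec{x}) = f_i\psi_i(\vec{x})$ for the result
$$\frac{\delta}{\delta\psi_i^*(\vec{x})}(E_{xc}) = f_i\,f'_{xc}(n(\vec{x}))\,\psi_i(\vec{x}).$$
Fortunately, $U_{\text{nuc-nuc}}$ depends only on the nuclear positions and does not change with $\psi_i^*(\vec{x})$, so
$$\frac{\delta}{\delta\psi_i^*(\vec{x})}(U_{\text{nuc-nuc}}) = 0.$$
And, finally, $\psi_i^*(\vec{x})$ only appears once in the constraint term, making the derivative
$$\frac{\delta}{\delta\psi_i^*(\vec{x})}\left(-\sum_i \lambda_i \int \psi_i^*(\vec{x})\,\psi_i(\vec{x})\,dV\right) = -\lambda_i\,\psi_i(\vec{x}).$$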
Summing all of these contributions, setting the resulting equation to zero, moving the “$\lambda_i\psi_i(\vec{x})$” term to
the right-hand side, and dividing through by $f_i$, we get the final result,
$$-\frac{\hbar^2}{2m}\nabla^2\psi_i(\vec{x}) + \left[V_{\text{nuc}}(\vec{x}) + \phi(\vec{x}) + f'_{xc}(n(\vec{x}))\right]\psi_i(\vec{x}) = \frac{\lambda_i}{f_i}\,\psi_i(\vec{x}).$$
Fortunately for us, this has the form of a very well-known equation for which there are standard techniques,
the Schrödinger equation,
$$-\frac{\hbar^2}{2m}\nabla^2\psi_i(\vec{x}) + V(\vec{x})\,\psi_i(\vec{x}) = \epsilon_i\,\psi_i(\vec{x}), \tag{16}$$
where we define the potential term as
$$V(\vec{x}) \equiv V_{\text{nuc}}(\vec{x}) + \phi(\vec{x}) + f'_{xc}(n(\vec{x})), \tag{17}$$
and we define $\epsilon_i \equiv \lambda_i/f_i$. We interpret the potential $V(\vec{x})$ as just the sum of the nuclear potential, the
electrostatic potential $\phi(\vec{x})$ created by the electrons, and an extra, “exchange-correlation” potential correction,
$V_{xc}(\vec{x}) \equiv f'_{xc}(n(\vec{x}))$. Since the Lagrange multipliers are unknown constants at the start, we may as well think
in terms of the $\epsilon_i \equiv \lambda_i/f_i$ instead, which have the interpretation of the Schrödinger energies for each orbital.
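To give a feel for solving an equation of the form (16), here is a minimal one-dimensional sketch for a fixed, given potential. It does not perform the self-consistent update of $\phi$ and $f'_{xc}$ from the density. It discretizes $-\tfrac{1}{2}\,d^2/dx^2 + V(x)$ by finite differences (units with $\hbar^2/m = 1$) and finds the lowest orbital by power iteration on a shifted operator. The power-iteration method, the harmonic test potential $V = x^2/2$ (exact lowest energy $1/2$), the grid, and all names are assumptions for illustration, not the method of these notes.

```python
import math

def apply_hamiltonian(psi, potential, h):
    """Apply -1/2 d^2/dx^2 + V(x) on a uniform grid, with psi = 0 outside the grid."""
    n = len(psi)
    out = []
    for k in range(n):
        left = psi[k - 1] if k > 0 else 0.0
        right = psi[k + 1] if k < n - 1 else 0.0
        second = (left - 2.0 * psi[k] + right) / (h * h)
        out.append(-0.5 * second + potential[k] * psi[k])
    return out

def lowest_orbital(potential, h, n_iter=2000):
    """Power iteration with (shift - H), whose dominant vector is the lowest orbital of H."""
    shift = 2.0 / (h * h) + max(potential)  # upper bound on the eigenvalues of H
    psi = [1.0] * len(potential)
    for _ in range(n_iter):
        h_psi = apply_hamiltonian(psi, potential, h)
        psi = [shift * p - hp for p, hp in zip(psi, h_psi)]
        norm = math.sqrt(sum(p * p for p in psi) * h)  # keep the orbital normalized
        psi = [p / norm for p in psi]
    h_psi = apply_hamiltonian(psi, potential, h)
    energy = sum(p * hp for p, hp in zip(psi, h_psi)) * h  # Rayleigh quotient
    return energy, psi

h = 0.1
grid = [-5.0 + k * h for k in range(101)]
potential = [0.5 * x * x for x in grid]  # assumed fixed test potential
energy, psi = lowest_orbital(potential, h)
print(energy)  # close to 0.5, up to finite-difference error
```

In a full calculation the potential would be rebuilt from the resulting density and the solve repeated until the density stops changing.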
[Figure: flow diagram with the elements $n_{in}(x)$, “Poisson solver”, “Evaluate $f'_{xc}(n(x))$”, $\psi(x)$, $\sum|\psi(x)|^2$, $n_{out}(x)$, and $F[n(x)]$.]
5 Atomic units
Because even the simplest mistake can result in hours of debugging, it is critical to do everything possible
to make software clean and simple. One thing which we can do in the physical sciences toward this end is
to use dimensional analysis.
The simplest form of dimensional analysis is to change to a new system of units tailored specifically to
the problem at hand. This is relatively straightforward because it involves no change in our equations and
changes only the numerical values of the physical constants which appear. It requires only that, once the
calculations are complete, we convert the results back to standard units using the familiar rules for unit
conversion.
A tailored system of units can be very useful when the relevant physical constants have large or small
values in terms of standard units. In a quantum mechanics calculation, for instance, $\hbar \approx 10^{-34}$ kg m$^2$/s and,
as $\hbar^2$ appears in many of our expressions, numerical underflow is a significant risk. On the other hand,
if we worked not in meters but Angstroms (1 Å = $10^{-10}$ m), which are much more relevant for quantum
mechanical problems, then we have $\hbar \approx 10^{-14}$ kg Å$^2$/s, a much more manageable number.
Oftentimes, we can do much better and arrange so that all of the relevant physical constants have a
numerical value of unity. This has the tremendous advantage that we do not have to type the values of the
physical constants into each subroutine or try to set up a repository of global variables, both of which are
a frequent source of hard-to-track bugs. One should definitely seek such an appropriate set of units before
beginning a scientific application.
In the case of density functional theory, inspection of the expressions above reveals four physical constants:
Planck’s constant $\hbar$, the electron mass $m$, Coulomb’s constant $[k_c]$, and the electron charge $e$. Rather than
use the standard units of meter, kilogram and second for the three fundamental dimensions of length, mass
and time, we can define three new units, which we shall call L, M and T, respectively. With the ability
to choose three unknowns, in general we can hope to reduce only three physical constants to the value
unity. We are fortunate, however, because our physical constants always appear in only one of two different
combinations, $\hbar^2/m$ or $[k_c]e^2$, which we can simultaneously reduce to unity with an appropriate choice of
units.
For our two combinations, we have
$$\frac{\hbar^2}{m} = 1.2208540 \times 10^{-38}\ \text{J m}^2$$
$$[k_c]\,e^2 = 2.30707955 \times 10^{-28}\ \text{J m},$$
where J represents the SI unit of energy, the Joule, which has units 1 J = 1 kg m$^2$/s$^2$. In our new system of
units, we would like these combinations of constants to appear as
$$\frac{\hbar^2}{m} = 1\ \text{E L}^2 \tag{18}$$
$$[k_c]\,e^2 = 1\ \text{E L},$$
where E is the unit of energy in our units, 1 E = 1 M L$^2$/T$^2$. We may easily solve (18) for L and E, finding
the standard atomic units of the Bohr and the Hartree as our units of length and energy, respectively,
$$\text{L} \equiv 1\ \text{bohr} = \frac{\hbar^2/m}{[k_c]e^2} = 0.52917725 \times 10^{-10}\ \text{m} = 0.52917725\ \text{Å} \tag{19}$$
$$\text{E} \equiv 1\ \text{hartree} = \frac{([k_c]e^2)^2}{\hbar^2/m} = 4.3597482 \times 10^{-18}\ \text{J} = 27.211396\ \text{eV} \tag{20}$$
From now on, so long as we interpret all distances in our calculations as expressed in Bohrs (about 1/2
Angstrom) and all energies in Hartrees (about 27 electron Volts), we can take the factors $\hbar^2/m$ and $[k_c]e^2$
to be unity and, in effect, ignore all physical constants appearing in our expressions. Note that we have in
reserve the ability to set yet one other constant to unity in the future, if necessary.
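The unit conversion above is easy to reproduce. The following sketch computes the two combinations $\hbar^2/m$ and $[k_c]e^2$ from SI values of the constants and then forms the bohr and the hartree from them. The numerical constants are recent recommended SI values typed in for this sketch, not values taken from these notes, so the last digits may differ slightly from those quoted above.

```python
# SI values of the four constants (assumed inputs for this sketch)
HBAR = 1.054571817e-34      # J s
M_E = 9.1093837015e-31      # kg
K_C = 8.9875517923e9        # N m^2 / C^2
E_CHARGE = 1.602176634e-19  # C

hbar2_over_m = HBAR ** 2 / M_E  # J m^2
kc_e2 = K_C * E_CHARGE ** 2     # J m

bohr_in_m = hbar2_over_m / kc_e2          # unit of length L
hartree_in_j = kc_e2 ** 2 / hbar2_over_m  # unit of energy E
hartree_in_ev = hartree_in_j / E_CHARGE   # 1 eV = E_CHARGE joules

print(bohr_in_m, hartree_in_j, hartree_in_ev)
```

To convert a result back to standard units, multiply lengths by `bohr_in_m` and energies by `hartree_in_j` (or `hartree_in_ev`).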
6 References